Fundamentals of the digital humanities
Computing and life
This is not an idle question. It has two sorts of answers.
The first sort has to do with how individuals happen to use the standard machines they have or with the broad tendencies of practical use. Answers of this kind have faded away as use of computing has become more diversified and more a part of what people do in their daily lives.
"A sophisticated calculator" or "a kind of typewriter". Both of these answers were common some years ago. Hardly anyone would seriously maintain them now, although some people do in fact use computers almost exclusively to crunch numbers while others only write with them. Both answers arise from confusing the application with the device -- historically an easy error to fall into, because previously mechanical devices were more or less single-purpose.
The other sort of answer attempts to identify the essence of computing or the single form toward which it is progressing.
Where all of the above run into trouble is in asserting that computing is essentially like this or that, or destined to become essentially or only this or that. Computing is based on a fundamental scheme (known as the Universal Turing Machine, named after the mathematician Alan Turing, who devised it) for the design of indefinitely many devices. The basic question we need to ask is, what kind of computing does this or that situation require?
Apart from the mostly invisible uses of computing in household appliances, mobile phones, automobile regulatory systems, bank machines and so forth, computing may be found in the following general-purpose devices.
The "desktop" computer was formerly more often called a "PC" (for "personal computer"), and before that a "microcomputer" (because of its then remarkably small size). The desktop comprises a system unit, monitor, keyboard and mouse. In addition to a hard disk, the system unit usually contains a floppy disk drive and a CD (and/or DVD) drive, which may be a read-write device. Common peripherals are a printer and scanner. Peripherals are usually attached by means of cables to plug-in USB ports.
The laptop is a highly portable version of the desktop in a single package, sometimes with individual components (such as a floppy disk or CD drive) supplied separately to minimise weight. Inevitably compromises have to be made in order to achieve minimum weight; chief among these is the difficulty or impossibility of changing components and the relatively high price of these (and of the system as a whole) in comparison to a desktop. The lightest laptops currently weigh ca. 1.2kg; one with a 10.5 inch screen (measured diagonally) will be about 8.5 by 10 by 1 inches. Current laptops are usually not quite as powerful as current desktops but can be more than adequate as a person's only machine.
The so-called personal assistant (such as the Palm) is a very small computer (about 3 x 4.5 x 3/8 inches) useful chiefly for keeping notes, memos, a diary, lists of addresses and telephone numbers and so forth. Text and figures are entered by means of a special alphabet, or on a fold-out keyboard. Data files are transferable to a PC or laptop by cable or infrared port. Although not capable of replacing a PC or laptop under most circumstances, such devices can stand in for the larger machines quite successfully for simple word processing or spreadsheet work while the user is travelling. Some mobile phones have PA functions, as does the iPod, but whether all handheld devices will merge into a single device is an open question.
Formerly a distinction was made on the basis of size, between the then "microcomputer", the "minicomputer" and the "mainframe". Now the latter two terms are mostly of historical importance only. Today the more powerful machines, which the ordinary person tends not to see, may not in fact be much larger than a desktop machine. Some of these, often known as "workstations", are used as desktop machines for computationally intensive applications, e.g. to analyze scientific data or for virtual reality simulations. Others support many users and/or tasks simultaneously, for example many users of ATM terminals.
The so-called "supercomputer" is a very specialised, highly expensive machine built for maximum speed and great storage capacity. It consists of multiple parallel and/or specialised processors together with multiple other units, including disk storage and other peripherals (for printing etc.). It will support many users and perhaps multiple connections to other computing systems. It is used for highly computationally intensive applications, such as meteorology.
Many computers that function together in a coordinated way through network connections comprise a "distributed" system. The largest and best known example is the international telephone network, which can be said to be the world's largest computer. Through networking many relatively small computers, such as ordinary desktop machines, can constitute a quite powerful system. Current schemes allow individuals to volunteer the computing power of their machines when these are not otherwise in use.
To understand the role of software in relation to computing hardware, picture a computing system as a layer-cake, with hardware at the bottom and our particular uses of the machine at the top.
As Winograd and Flores 1986 point out, the architectural design of computing systems as a series of layers is one of the principal achievements of computer science. In this design, each layer is roughly speaking oblivious to how the layer below does what it does; all the upper layer needs to know is the nature of acceptable inputs and outputs. In other words, each lower layer is characterized by "opacity of implementation".
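The idea of "opacity of implementation" can be illustrated with a minimal sketch in Python. Everything named here (Storage, InMemoryStorage, ListOfLinesStorage, append_note) is hypothetical and invented for illustration; none of it comes from Winograd and Flores or from any real system. The upper layer, append_note, knows only which inputs and outputs are acceptable, so two lower layers that work quite differently inside can be swapped without its noticing.

```python
from typing import Protocol


class Storage(Protocol):
    """All that the upper layer may know: acceptable inputs and outputs."""

    def read(self, name: str) -> str: ...

    def write(self, name: str, text: str) -> None: ...


class InMemoryStorage:
    """One hypothetical lower layer: keeps each text as a single string."""

    def __init__(self):
        self._files = {}

    def read(self, name):
        return self._files[name]

    def write(self, name, text):
        self._files[name] = text


class ListOfLinesStorage:
    """Another lower layer: same interface, but stores a list of lines."""

    def __init__(self):
        self._files = {}

    def read(self, name):
        return "\n".join(self._files[name])

    def write(self, name, text):
        self._files[name] = text.split("\n")


def append_note(storage: Storage, name: str, note: str) -> str:
    """Upper layer: adds a note without knowing how storage is done."""
    try:
        existing = storage.read(name)
    except KeyError:
        # Nothing stored under this name yet, so start from empty text.
        existing = ""
    updated = existing + note + "\n"
    storage.write(name, updated)
    return updated
```

Calling append_note with either kind of storage yields the same text, which is the point of the layered design: the lower layer can be replaced without any change to the layer above it.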
For us an important consequence of this design is that we need not care about the details of how our computers do what they do. In a sense the interface, provided by the operating system in conjunction with some hardware devices (principally keyboard, mouse, screen), is all we need be concerned with. For practical reasons we may still need occasionally to know about the capacity of the hard disk, the amount of RAM and so on, but increasingly these technical details are becoming unimportant.
What is very important, however, is that we understand the nature of this design, its basic structure and what it means.
The layers are roughly of the following kinds:
Hardware comprises the bottom-most layer. As the term suggests, whatever is in hardware is inflexible, i.e. difficult to change. If the computer were only hardware, as once was the case, all of our dealings with it would have to be highly technically specific, closer to the machine conceptually than most of us would be able to handle. (In the early days of computing, for example, one had to know how to set instructions in physical switches, wire programming boards and time instructions according to the speed of rotation of magnetic-drum storage.) In consequence, as once was the case, there would be very few applications of computing possible in the humanities.
Software in effect intermediates between us and the hardware, presenting us with a more humane interface and rendering our operations in terms that the hardware can process, down through numerous, increasingly lower-level layers. Since software is soft, i.e. changeable, a single configuration of hardware can serve many purposes far more easily than if each use had to be "hardwired".
As computer hardware becomes faster and more capacious, it can support thicker layers of software; these in turn push the development of hardware. The result is increasing sophistication of our systems. With each cycle of improvements they tend thus to become more humane, more like us. Hence technological progress drives and is driven by what it is that we want our machines to be.
The underlying intentionality to this progress is powerfully suggested by a microphotograph of a human brain cell growing on a Motorola 68000 chip, taken by John Stevens and Judy Trogadis (Toronto Western Research Institute, Canada) and reproduced here with permission [X].
Apart from the nature of the problem in hand and the characteristics of the applications software, how a computer is applied depends on two models or ideas of computing: (1) the model on which the interface of the operating system is based, which constrains how we may use the programs we have, and (2) within the constraints thus imposed by the operating system, the model we bring to the task. These may be quite different, although the former will tend to influence the latter.
The model of computing suggested by early operating systems was the isolated or disconnected process, in which each "run" or execution of a specific program accomplished a particular task and yielded an end-result, as in the diagram to the right. You gave the computer a command, it did what it was asked to do, then returned to you for more instructions and new input. A complex task might involve use of more than one program, and the output of one program (in the form of a data file) might be used as input to the next, but the interface presented the programs one after the other. In many of the early systems, such as DOS for the PC, only one program could be in memory at a time, so that using a program repeatedly meant loading and executing it afresh each time -- a slow process. Thus the isolation of each application from every other one before or after was reinforced by the fact that once a program had finished, you were returned to the OS with neither a trace of what had just happened nor any sign of the program that had just finished running.
A screen-image showing the "prompt" of the PC operating system DOS illustrates the problem: we cannot see what has just happened nor do we have any sign of what might happen next.
Although in its user interface UNIX remains largely DOS-like, the toolkit approach made the essential point that tasks come before tools, and that complex ones are most efficiently served by assembling a number of tools each of which accomplishes some part of the job. Furthermore, the entire process could be represented to the computer, on screen, in a single command by using symbols to "pipe" the output of one tool into the next as input.
Thus the UNIX toolkit approach encourages us to think of our intellectual work more as a series of transformations of the data, and less as monolithic operations governed by single programs.
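The notion of work as a series of transformations can be sketched in Python rather than in UNIX shell commands. This is only an analogy under stated assumptions: the tool names (words, lowercase, count), the helper pipe and the two sample sentences are all invented for illustration and do not correspond to any particular UNIX utilities. Each tool does one small part of the job, and pipe passes the output of one to the next as input.

```python
import re
from collections import Counter
from functools import reduce


def words(lines):
    """Tool 1: split lines of text into word tokens."""
    for line in lines:
        yield from re.findall(r"[A-Za-z']+", line)


def lowercase(tokens):
    """Tool 2: fold every token to lower case."""
    for token in tokens:
        yield token.lower()


def count(tokens):
    """Tool 3: tally how often each token occurs."""
    return Counter(tokens)


def pipe(data, *tools):
    """Feed data through the tools in order, much as '|' does in a shell."""
    return reduce(lambda stream, tool: tool(stream), tools, data)


# Invented sample text, used only to exercise the pipeline.
SAMPLE_LINES = ["The cat saw the dog.", "The dog ran."]
frequencies = pipe(SAMPLE_LINES, words, lowercase, count)
```

Leaving out one tool, as in pipe(SAMPLE_LINES, words, count), changes the result ("The" and "the" are then counted separately) without any change to the other tools, which is what makes the toolkit approach flexible.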
In some more advanced software environments that run under UNIX and are used in the sciences, an entire set of processes can be represented visually as icons that the user assembles, arranges and interconnects at will, even with meters and samplers to measure or otherwise represent intermediate results. Might such a representation be useful in the humanities?
Yet another approach is that of "Object Linking and Embedding" (OLE), an OS mechanism implemented in some Windows programs (e.g. MS Word, Excel, Access) that allows data files, treated as objects, to be dynamically shared between programs. Thus a spreadsheet, made in Excel, can be embedded in a Word document and will be updated automatically in this document if changed by Excel.
A further step in the evolution of the computing environment, partially accomplished and currently in development, is the provision of common formats so that processes can exchange data with a minimum of manual intervention. As you have seen with the data provided by the Air Quality Information Archive, the so-called "comma-delimited format" in which the data is given there is almost immediately comprehensible to Excel, which can also accept other delimiters and fixed-width format. For some time, various programs have recognised the proprietary formats of others, for example the many word-processing formats accepted by MS Word. The situation remains frustratingly patchwork, however.
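What a program must do to accept comma-delimited, otherwise delimited and fixed-width data can be sketched as follows, using Python's standard csv module. The sample records are invented for illustration and are not taken from the Air Quality Information Archive; the function names and the column widths (6, 12, 5) are likewise assumptions. The same table is given in three formats, and each reader recovers the same rows.

```python
import csv
import io


def read_delimited(text, delimiter=","):
    """Parse delimited text (comma by default) into a list of rows."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


def read_fixed_width(text, widths):
    """Parse fixed-width text, given the width of each column in characters."""
    rows = []
    for line in text.splitlines():
        row, start = [], 0
        for width in widths:
            # Cut out one column and drop the padding spaces around it.
            row.append(line[start:start + width].strip())
            start += width
        rows.append(row)
    return rows


# The same invented table in three exchange formats.
COMMA_TEXT = "site,date,value\nA,2001-01-01,12\n"
TAB_TEXT = "site\tdate\tvalue\nA\t2001-01-01\t12\n"
FIXED_TEXT = "site  date        value\nA     2001-01-01  12\n"

from_commas = read_delimited(COMMA_TEXT)
from_tabs = read_delimited(TAB_TEXT, delimiter="\t")
from_fixed = read_fixed_width(FIXED_TEXT, widths=(6, 12, 5))
```

Note that the delimited formats describe their own column boundaries, whereas the fixed-width format can be read only if the widths are known in advance; this is one reason why exchange remains a matter of agreed conventions.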
For the textual data of interest in the humanities, the problem of exchange is compounded by issues of markup. Because text, particularly literary, linguistic and historical text, works in ways that remain impervious to fully automatic analysis, in general it must be prepared in a way that renders the phenomena of interest visible to the computer. This is done by inserting meta-linguistic codes or tags, e.g. to mark a series of words as a title, or a paragraph, as you have seen in HTML, or even as a proper name or metaphor. Without a common way of devising such markup, the data becomes quite difficult to exchange, even for the user to understand.
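How tags render phenomena of interest visible to the computer can be shown with a small Python sketch that uses the standard xml.etree.ElementTree module. The tag names (p, title, name) and the sample sentence are chosen purely for illustration and are not presented as any standard scheme of markup. Once the text is tagged, proper names or titles can be retrieved mechanically, and the unmarked text can still be recovered.

```python
import xml.etree.ElementTree as ET

# An invented sentence with illustrative tags for a title and proper names.
SAMPLE = (
    "<p>In <title>Middlemarch</title>, <name>Dorothea</name> "
    "marries <name>Casaubon</name>.</p>"
)


def tagged(xml_text, tag):
    """Return the text of every element carrying the given tag."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter(tag)]


def plain_text(xml_text):
    """Return the text with all markup removed."""
    return "".join(ET.fromstring(xml_text).itertext())


names = tagged(SAMPLE, "name")
titles = tagged(SAMPLE, "title")
unmarked = plain_text(SAMPLE)
```

Without the tags, a program would have no reliable way to tell that "Middlemarch" is a title rather than a place or that "Casaubon" is a person; with them, the question becomes a simple lookup. The difficulty described above arises when two scholars choose different tags for the same phenomenon.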
The Text Encoding Initiative (TEI), an international collaborative project, has developed a form of the widely-used Standard Generalized Markup Language (SGML) to provide just such a common exchange format for textual data and has been a primary influence in the design of the eXtensible Markup Language (XML). You will learn about TEI-SGML and about XML later in the Programme.
The argument set forth above is based on the assumption that the more closely our computing systems mirror how we work the better. The technology is clearly developing toward that closer mirroring, though in practical day-to-day terms, much remains to be done. Meanwhile, one frequently faces the problem of piecing together the software and intermediate operations required by a particular task.
The most important idea to keep in mind is the systemic view of both scholarship and the concatenation of computing tools we put together to assist it. Computing is a complex process that serves scholarly work by isolating its mechanical components and providing for each of these one or more programs that work together with as little human intervention as possible. Humanities computing begins with this systemic and mechanical view of scholarly problems, then goes on to question the consequences, and in particular where the mechanical analogue of the scholarly process fails and what this tells us about our ways of knowing. The systemic view is thus a fundamental prerequisite.
To compute most effectively, then, one needs (1) a collection of commonly useful programs, such as are covered in our Programme, and (2) the understanding that they are to be used as components in a process.