[Index]
 

Fundamentals of the digital humanities

Computing and life

  1. What is a computer?
  2. Types of computers
    1. Desktop
    2. Laptop
    3. Personal assistant
    4. Workstations & supercomputers
    5. Distributed systems
  3. Design of computing systems
    1. Hardware
    2. Software
    3. Interrelation of hardware and software
  4. Interface models
    1. Operating system
    2. Common data formats
  5. Models of use

  1. What is a computer?
  2. This is not an idle question. It has two sorts of answers.

  3. Types of computers
  4. Apart from the mostly invisible uses of computing in household appliances, mobile phones, automobile regulatory systems, bank machines and so forth, computing may be found in the following general-purpose devices.

    1. Desktop computer
    2. The "desktop" computer was formerly more often called a "PC" (for "personal computer"), and before that a "microcomputer" (because of its then remarkably small size). The desktop comprises a system unit, monitor, keyboard and mouse. In addition to a hard disk, the system unit usually contains a floppy disk drive and a CD and/or DVD drive, which may be a read-write device. Common peripherals are a printer and a scanner. Peripherals are usually attached by cable to USB ports.

    3. Laptop
    4. The laptop is a highly portable version of the desktop in a single package, sometimes with individual components (such as a floppy disk or CD drive) supplied separately to minimise weight. Inevitably compromises have to be made in order to achieve minimum weight; chief among these are the difficulty or impossibility of changing components and the relatively high price of these (and of the system as a whole) in comparison to a desktop. The lightest laptops currently weigh ca. 1.2 kg; one with a 10.5 inch screen (measured diagonally) will be about 8.5 by 10 by 1 inches. Current laptops are usually not quite as powerful as current desktops but can be more than adequate as a person's only machine.

    5. Personal assistant (PA)
    6. The so-called personal assistant (such as the Palm) is a very small computer (about 3 x 4.5 x 3/8 inches) useful chiefly for keeping notes, memos, a diary, lists of addresses and telephone numbers and so forth. Text and figures are entered by means of a special alphabet, or on a fold-out keyboard. Data files are transferable to a PC or laptop by cable or infrared port. Although not capable of replacing a PC or laptop under most circumstances, such devices can stand in for the larger machines quite successfully for simple wordprocessing or spreadsheet work while the user is travelling. Some mobile phones have PA functions, as does the iPod, but whether all handheld devices will merge into a single device is an open question.

    7. Workstations & supercomputers
    8. Formerly a distinction was made on the basis of size, between the then "microcomputer", the "minicomputer" and the "mainframe". Now the latter two terms are mostly of historical importance only. Today the more powerful machines, which the ordinary person tends not to see, may not in fact be much larger than a desktop machine. Some of these, often known as "workstations", are used as desktop machines for computationally intensive applications, e.g. to analyze scientific data or for virtual reality simulations. Others support many users and/or tasks simultaneously, for example many users of ATM terminals.

      The so-called "supercomputer" is a very specialised, highly expensive machine built for maximum speed and great storage capacity. It consists of multiple parallel and/or specialised processors together with multiple other units, including disk storage and other peripherals (for printing &c.). It will support many users and perhaps multiple connections to other computing systems. It is used for highly computation-intensive applications, such as meteorology.

    9. Distributed systems
    10. Many computers that function together in a coordinated way through network connections make up a "distributed" system. The largest and best known example is the international telephone network, which can be said to be the world's largest computer. Through networking many relatively small computers, such as ordinary desktop machines, can constitute a quite powerful system. Current schemes allow individuals to volunteer the computing power of their machines when these are not otherwise in use.

  5. Design of computing systems
  6. To understand the role of software in relation to computing hardware, picture a computing system as a layer-cake, with hardware at the bottom and our particular uses of the machine at the top.

    As Winograd and Flores 1986 point out, the architectural design of computing systems as a series of layers is one of the principal achievements of computer science. In this design, each layer is roughly speaking oblivious to how the layer below does what it does; all the upper layer needs to know is the nature of acceptable inputs and outputs. In other words, each lower layer is characterized by "opacity of implementation".
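
    "Opacity of implementation" can be illustrated with a minimal Python sketch. Everything named in it (Storage, DictStorage, ListStorage, word_count) is invented for the illustration and is not drawn from Winograd and Flores or from any real system. The upper layer, word_count, knows only which inputs and outputs the lower layer accepts; the two lower layers do their work in quite different ways, and the upper layer cannot tell them apart.

```python
from typing import Protocol


class Storage(Protocol):
    """What the upper layer may rely on: acceptable inputs and outputs only."""

    def save(self, name: str, text: str) -> None: ...
    def load(self, name: str) -> str: ...


class DictStorage:
    """One hypothetical lower layer: keeps texts in a dictionary."""

    def __init__(self) -> None:
        self._items: dict[str, str] = {}

    def save(self, name: str, text: str) -> None:
        self._items[name] = text

    def load(self, name: str) -> str:
        return self._items[name]


class ListStorage:
    """A different hypothetical lower layer: keeps (name, text) pairs in a list."""

    def __init__(self) -> None:
        self._pairs: list[tuple[str, str]] = []

    def save(self, name: str, text: str) -> None:
        # Discard any earlier version stored under the same name.
        self._pairs = [(n, t) for n, t in self._pairs if n != name]
        self._pairs.append((name, text))

    def load(self, name: str) -> str:
        for n, t in self._pairs:
            if n == name:
                return t
        raise KeyError(name)


def word_count(storage: Storage, name: str) -> int:
    """Upper layer: counts words without knowing how the storage works."""
    return len(storage.load(name).split())


results = []
for layer in (DictStorage(), ListStorage()):
    layer.save("note", "each layer is oblivious to the layer below")
    results.append(word_count(layer, name="note"))
```

    Swapping one lower layer for the other requires no change to word_count, which is the practical benefit of the layered design.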

    For us an important consequence of this design is that we need not care about the details of how our computers do what they do. In a sense the interface, provided by the operating system in conjunction with some hardware devices (principally keyboard, mouse, screen), is all we need be concerned with. For practical reasons we may still need occasionally to know about the capacity of the hard disk, the amount of RAM and so on, but increasingly these technical details are becoming unimportant.

    What is very important, however, is that we understand the nature of this design, its basic structure and what it means.

    The layers are roughly of the following kinds:

    1. Hardware
    2. Hardware comprises the bottom-most layer. As the term suggests, whatever is in hardware is inflexible, i.e. difficult to change. If the computer were only hardware, as once was the case, all of our dealings with it would have to be highly technically specific, closer to the machine conceptually than most of us would be able to handle. (In the early days of computing, for example, one had to know how to set instructions in physical switches, wire programming boards and time instructions according to the speed of rotation of magnetic-drum storage.) In consequence, as once was the case, there would be very few applications of computing possible in the humanities.

    3. Software
    4. Software in effect intermediates between us and the hardware, presenting us with a more humane interface and rendering our operations in terms that the hardware can process, down through numerous, increasingly lower-level layers. Since software is soft, i.e. changeable, a single configuration of hardware can serve many purposes far more easily than if each use had to be "hardwired".

      1. ROM. The lowest layer of software (known usually as "firmware") is in "read-only memory" (ROM). ROM is a fixed memory device (usually a single chip) that contains instructions for the computer when it is first switched on. Among other things, the ROM program tells the computer to go to its fixed (hard) disk to receive further instructions. Because this program is in ROM, on a chip, it can be upgraded relatively easily, although only by a technician.
      2. Operating system. The next layer of software is the operating system (OS) as such. It actually consists of numerous individual layers, beginning with the "hardware abstraction layer", and ending with the interface we see on screen. Thus the OS intermediates between specific applications, such as MS Word, and the hardware. Without the OS each program would individually have to handle all hardware operations itself. Thus, for example, instead of being able to say in effect, "write these data to the file sample.doc", a program would have to look up where the file was stored physically on the disk; command the disk drive to go to the stated location; be responsible for managing the write-operation, including error-recovery; and so forth. What is equally bad, each program would (as once was the case) tend to have its own unique set of commands, thus placing a significant burden on the user to learn mutually incompatible ways of working. Think of the OS, then, as performing two vital functions: housekeeping and interfacing.
      3. Application programs. The next layer of software is for application programs, such as Word or Excel. These are task-specific for broad areas of work, such as writing or calculating respectively. Modern operating systems are designed for multiple applications to be running simultaneously; they allow for some degree of data-interchange among running programs. Thus this layer may be occupied by several programs simultaneously.
      4. Macros. The next layer consists of macros and other programmable operations supported by individual software applications.
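
      The housekeeping function of the operating system can be seen from a program's point of view in the following minimal Python sketch. The file name sample.txt and its contents are assumptions made for the illustration, and a temporary folder is used so that nothing is left behind. The program states only what is to be written and under what name; where the data go on the disk, and how the drive is commanded, are left entirely to the layers below.

```python
import tempfile
from pathlib import Path

# A temporary directory stands in for the user's documents folder.
with tempfile.TemporaryDirectory() as folder:
    sample = Path(folder) / "sample.txt"

    # The whole request: "write these data to the file sample.txt".
    # Finding free space on the disk, commanding the drive and detecting
    # hardware errors are all left to the operating system.
    sample.write_text("Computing and life", encoding="utf-8")

    # Reading back is equally indifferent to where the data physically live.
    recovered = sample.read_text(encoding="utf-8")
    size_in_bytes = sample.stat().st_size
```

      The same few lines work unchanged on different disks and different operating systems, because the physical details stay below the interface.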

    5. Interrelation of hardware and software
    6. As computer hardware becomes faster and more capacious, it can support thicker layers of software; these in turn push the development of hardware. The result is increasing sophistication of our systems. With each cycle of improvements they tend thus to become more humane, more like us. Hence technological progress drives and is driven by what it is that we want our machines to be.

      The underlying intentionality to this progress is powerfully suggested by a microphotograph of a human brain cell growing on a Motorola 68000 chip, taken by John Stevens and Judy Trogadis (Toronto Western Research Institute, Canada) and reproduced here with permission [X].

  7. Interface models
  8. Apart from the nature of the problem in hand and the characteristics of the applications software, how a computer is applied depends on two models or ideas of computing: (1) the model on which the interface of the operating system is based, which constrains how we may use the programs we have, and (2) within the constraints thus imposed by the operating system, the model we bring to the task. These may be quite different, although the former will tend to influence the latter.

    1. Operating systems
      1. The isolated process. The model of computing suggested by early operating systems was the isolated or disconnected process, in which each "run" or execution of a specific program accomplished a particular task and yielded an end-result, as in the diagram to the right. You gave the computer a command, it did what it was asked to do, then returned to you for more instructions and new input. A complex task might involve use of more than one program, and the output of one program (in the form of a data file) might be used as input to the next, but the interface presented the programs one after the other. In many of the early systems, such as DOS for the PC, only one program could be in memory at a time, so that using a program repeatedly meant loading and executing it repeatedly -- a slow process. Thus the isolation of each application from every other one before or after was reinforced by the fact that once a program had finished, you were returned to the OS with no trace of what had just happened nor any sign of the program that had just finished running.

        A screen-image showing the "prompt" of the PC operating system DOS illustrates the problem: we cannot see what has just happened nor do we have any sign of what might happen next.


      2. The toolkit. A significant step forward was taken in the multi-user operating system UNIX, where considerable thought was put into the design of a basic set of data-manipulation tools and a mechanism for directing the output of one process into another as input. Thus a complex task, such as generation of a concordance, could be accomplished by assembling the relevant tools into a string of operations, each delivering its output as the input of the next tool, as in the diagram shown here. Although in its user interface UNIX remains largely DOS-like, the toolkit approach made the essential point that tasks come before tools, and that complex ones are most efficiently served by assembling a number of tools each of which accomplishes some part of the job. Furthermore, the entire process could be represented to the computer, on screen, in a single command by using symbols to "pipe" the output of one tool into the next as input.

        Thus the UNIX toolkit approach encourages us to think of our intellectual work more as series of transformations in the data, not so much as monolithic operations governed by single programs.

        In some more advanced software environments that run under UNIX and are used in the sciences, an entire set of processes can be represented visually as icons that the user assembles, arranges and interconnects at will, even with meters and samplers to measure or otherwise represent intermediate results. Might such a representation be useful in the humanities?


      3. The desktop. The operating system interface first devised at the Palo Alto Research Center (PARC) of Xerox, then taken up by Macintosh, then by Microsoft Windows, carries the development further by allowing for more than a single program to be in memory and even run simultaneously and by representing all immediately available programs as objects on a "desktop", to be taken up for use, put down or discarded at will. The combination of multi-tasking with simultaneous desktop representation comes closer yet to modeling how scholars and others actually work, with a variety of tools applied as the need arises, and often spontaneously.

      4. Mechanisms of exchange. The UNIX pipe, and the interconnections provided in the advanced systems of process visualisation, are examples of automatic mechanisms for exchange of data: instead of having to write out an intermediate data file, one simply directs output into the next process. Another example, though requiring more manual intervention, is the "clipboard" in Macintosh and PC Windows -- a usually hidden temporary space to which on-screen data may be copied or cut and from which they may be pasted.

        Yet another approach is that of "Object Linking and Embedding" (OLE), an OS mechanism implemented in some Windows programs (e.g. MS Word, Excel, Access) that allows data files, treated as objects, to be dynamically shared between programs. Thus a spreadsheet, made in Excel, can be embedded in a Word document and will be altered automatically in this document if changed by Excel.
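
      The toolkit idea and the pipe can be imitated in a few lines of Python. This is a sketch only: the three small tools (words, contexts, in_order) and the sample sentences are invented for the illustration and do not correspond to actual UNIX commands. Each tool does one part of the job of making a rudimentary concordance and hands its output directly to the next, with no intermediate data file.

```python
from typing import Iterable, Iterator


def words(lines: Iterable[str]) -> Iterator[str]:
    """Tool 1: split lines of text into lower-case words without punctuation."""
    for line in lines:
        for raw in line.split():
            word = raw.strip(".,;:!?\"'()").lower()
            if word:
                yield word


def contexts(tokens: Iterable[str], keyword: str, width: int = 2) -> Iterator[str]:
    """Tool 2: emit each occurrence of keyword with its neighbouring words."""
    seq = list(tokens)  # neighbours on both sides are needed, so collect first
    for i, tok in enumerate(seq):
        if tok == keyword:
            left = " ".join(seq[max(0, i - width):i])
            right = " ".join(seq[i + 1:i + 1 + width])
            yield f"{left} [{tok}] {right}".strip()


def in_order(entries: Iterable[str]) -> Iterator[str]:
    """Tool 3: put the entries into alphabetical order."""
    yield from sorted(entries)


text = [
    "The sea was calm.",
    "He looked at the sea and the sky.",
]

# The output of each tool is the input of the next, as with a pipe.
concordance = list(in_order(contexts(words(text), "sea")))
```

      Here concordance holds two entries, "at the [sea] and the" and "the [sea] was calm". Replacing any one tool, for example with a different rule for ordering the entries, leaves the others untouched.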

    2. Common data formats
    3. A further step in the evolution of the computing environment, partially accomplished and currently in development, is the provision of common formats so that processes can exchange data with a minimum of manual intervention. As you have seen with the data provided by the Air Quality Information Archive, the so-called "comma-delimited format" in which the data is given there is almost immediately comprehensible to Excel, which can also accept other delimiters and fixed-width formats. For some time, various programs have recognised the proprietary formats of others, for example the many word-processing formats accepted by MS Word. The situation remains frustratingly patchwork, however.
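
      Why a comma-delimited file is so readily exchanged can be seen in a short Python sketch using the standard csv module. The three lines of data and the column names (site, date, no2) are invented for the illustration; they are not taken from the Air Quality Information Archive.

```python
import csv
import io

# Invented sample in comma-delimited form: a header row, then one row per reading.
raw = "site,date,no2\nLondon,2005-08-01,41\nLeeds,2005-08-01,29\n"

# csv.DictReader uses the header row to name the fields of every later row.
rows = list(csv.DictReader(io.StringIO(raw)))

# The same reader accepts other delimiters, here a tab.
tabbed = raw.replace(",", "\t")
rows_from_tabs = list(csv.DictReader(io.StringIO(tabbed), delimiter="\t"))

# Once read, the data can be handled without regard to the original delimiter.
highest = max(rows, key=lambda row: int(row["no2"]))
```

      Because the format is nothing more than plain text with an agreed separator, any program that knows the convention can read what another has written.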

      For the textual data of interest in the humanities, the problem of exchange is compounded by issues of markup. Because text, particularly literary, linguistic and historical text, works in ways that remain impervious to fully automatic analysis, in general it must be prepared in a way that renders the phenomena of interest visible to the computer. This is done by inserting meta-linguistic codes or tags, e.g. to mark a series of words as a title, or a paragraph, as you have seen in HTML, or even as a proper name or metaphor. Without a common way of devising such markup, the data becomes quite difficult to exchange, even for the user to understand.

      The Text Encoding Initiative (TEI), an international collaborative project, has developed a form of the widely-used Standard Generalized Markup Language (SGML) to provide just such a common exchange format for textual data and has been a primary influence in the design of the Extensible Markup Language (XML). You will learn about TEI-SGML and about XML later in the Programme.
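
      How markup renders phenomena of interest visible to the computer can be suggested by a small Python sketch using the standard xml.etree.ElementTree module. The sentence and the element names (text, title, p, name) are invented for the illustration and are not offered as valid TEI.

```python
import xml.etree.ElementTree as ET

# Invented sample; the element names are illustrative only.
marked_up = (
    "<text>"
    "<title>A Short Journey</title>"
    "<p><name>Mary</name> wrote to <name>John</name> from "
    "<name>London</name>.</p>"
    "</text>"
)

root = ET.fromstring(marked_up)

# Because the title and the proper names are tagged, they can be retrieved directly.
title = root.findtext("title")
names = [element.text for element in root.iter("name")]

# With the tags stripped, the same words no longer say which of them are names.
plain = "".join(root.find("p").itertext())
```

      The program finds the three names only because someone has marked them; nothing in the plain sentence distinguishes "London" as a name from any other word.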

  9. Models of use
    1. The basic idea
    2. The argument set forth above is based on the assumption that the more closely our computing systems mirror how we work the better. The technology is clearly developing toward that closer mirroring, though in practical day-to-day terms, much remains to be done. Meanwhile, one frequently faces the problem of piecing together the software and intermediate operations required by a particular task.

      The most important idea to keep in mind is the systemic view of both scholarship and the concatenation of computing tools we put together to assist it. Computing is a complex process that serves scholarly work by isolating its mechanical components and providing for each of these one or more programs that work together with as little human intervention as possible. Humanities computing begins with this systemic and mechanical view of scholarly problems, then goes on to question the consequences, and in particular where the mechanical analogue of the scholarly process fails and what this tells us about our ways of knowing. The systemic view is thus a fundamental prerequisite.

    3. The basic kit
    4. To compute most effectively, then, one needs (1) a collection of commonly useful programs, such as are covered in our Programme, and (2) the understanding that they are to be used as components in a process.

rev. 8/05