

AV1000
Fundamentals of the digital humanities
Keywords and context

I. Overview

This exercise uses a simple concordancer to discover the meaning of words in context. The text for the exercise has been taken from Monday Night Class, a loosely edited collection of 8 talks, in the authentic American hippie vernacular of the late 1960s, by Stephen Gaskin. (A second edition is in print; the text for this exercise was taken from the first edition, 1969.) This text was chosen because the syntax is that of colloquial modern English, but the sense given to many words in this vernacular is highly unlikely to be familiar to you.

Gaskin was an assistant professor in San Francisco in the mid-1960s “when the psychedelic revolution, the hippie phenomenon and the politics of the radical left exploded on the scene. Dropping out from his academic position to lead an alternative lifestyle, he eventually returned to the campus to hold weekly public meetings as part of its ‘Free School’ curriculum. What came to be called Monday Night Class grew over the course of five years from a group of six to a gathering of thousands, while moving eventually from the college campus to increasingly spacious rock and roll halls. The community that formed from those meetings decided finally to leave San Francisco in 1971 and migrated in a bus caravan to Tennessee, establishing a large intentional community called The Farm—but not before a book was published that was meant to convey the essence of the Monday Night Class experience…. [The book was] printed entirely in purple, with no page numbers. The cover title was on the back, and the front cover was a wordless image of a pulsating mandala…. It eventually sold over 100,000 copies” (William Meyers, Monday Night Class, 2 December 2003; now in the Internet Archive).


II. Acquiring the data

The textual data from Monday Night Class are provided as a single plain-text file, stephen.txt (270K). Download this file and save it on the hard disk of your computer.

III. Exploring the basic functions of Monoconc

A. Loading and unloading the corpus

  1. Select Load Corpus File(s) under the File menu. This allows you to select a single-file corpus or to build one from several files so that, for example, you can compare several parts of a corpus by concording them separately, then combine them for an overall view.
  2. Contents of the corpus file will appear in a window. This window can be kept for reading purposes or dismissed without unloading the corpus. To unload the corpus, select the Unload Corpus item under the File menu.

B. Inspecting the word-list

  1. To find out what words are in the corpus and so can be concorded, you have two choices:
    1. To view all words in order of their frequency, from most to least, select Frequency, then Corpus Frequency Data, then Frequency Order.
    2. To view all words in alphabetical order, select Frequency, then Corpus Frequency Data, then Alphabetical Order.
  2. Try both the above but pay particular attention to the Corpus Frequency List. Starting at the top, look down the list until you come across the first word that you regard as having some meaning in relation to the subject of the corpus. As an example, use the word “know”.
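To see what lies behind the two word-list views, here is a minimal Python sketch of a frequency list. It is not how Monoconc itself is implemented; the function names, the tokenising rule (lowercase letters with internal apostrophes) and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and pull out words, keeping internal apostrophes
    (so "don't" stays one word). This rule is an assumption of the sketch."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())

def frequency_order(tokens):
    """(word, count) pairs from most to least frequent; ties go alphabetically."""
    counts = Counter(tokens)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

def alphabetical_order(tokens):
    """(word, count) pairs in alphabetical order of the word."""
    return sorted(Counter(tokens).items())

# Invented sample sentence, not taken from the Stephen corpus.
sample = "You know what I know, and I know what you don't know."
tokens = tokenize(sample)

print("Frequency order:")
for word, count in frequency_order(tokens):
    print(f"  {count:3d}  {word}")

print("Alphabetical order:")
for word, count in alphabetical_order(tokens):
    print(f"  {word}  {count}")
```

To try it on the real text, replace `sample` with the contents of stephen.txt read from disk.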

C. Generating and manipulating a concordance

  1. First select Concordance, then Search Options. For the moment, in the dialogue box pay attention only to Max Search Hits, which sets the maximum number of occurrences to be found. The default value is 500; set it to a very large number, say 50000, to ensure that all occurrences will be displayed. Click on OK.
  2. Select Concordance, then Search. In the dialogue box type the sample word “know” from the frequency list. Click on OK. The resulting KWIC (keyword-in-context) display centres and highlights each occurrence of the word, making an inspection of words before and after easy. The lines are given in the order they occur in the corpus.
  3. Click on any line in the KWIC; several lines of context will appear in the window above. The relative space devoted to context and KWIC can be adjusted by moving the intervening bar.
  4. Scan down the KWIC display. Look for a possibly significant collocation of a word before the target word “know”, for example “don't” or “didn't”. Now select Sort from the menu bar, then 1st Left, then No Second Sort. The concordance lines are now rearranged in the order of the word preceding the target word, so all occurrences of “don't know” are found together.
  5. Try the same operation, but this time sort for the word immediately to the right of the target word. Try other options as well.
  6. Note how by using these sorting tools, you can begin to figure out how a particular word is actually being used in context.
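The KWIC display and the sorting operations described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Monoconc's own method: the names `kwic`, `sort_first_left` and `sort_first_right`, the context width of four words, and the sample text are all hypothetical.

```python
import re

def tokenize(text):
    """Lowercase words, keeping internal apostrophes (an assumed rule)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())

def kwic(tokens, target, width=4):
    """Return (left context, keyword, right context) for each hit, in corpus order."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            lines.append((left, tok, right))
    return lines

def sort_first_left(lines):
    """Order the lines by the word immediately before the keyword."""
    return sorted(lines, key=lambda line: line[0][-1] if line[0] else "")

def sort_first_right(lines):
    """Order the lines by the word immediately after the keyword."""
    return sorted(lines, key=lambda line: line[2][0] if line[2] else "")

def show(lines):
    """Print the lines with the keyword aligned in a central column."""
    for left, key, right in lines:
        print(f"{' '.join(left):>30}  [{key}]  {' '.join(right)}")

# Invented sample text, not taken from the Stephen corpus.
sample = "I don't know why. You know it. They don't know me. We know you."
lines = kwic(tokenize(sample), "know")

print("Corpus order:")
show(lines)
print("Sorted on 1st left:")
show(sort_first_left(lines))
```

Because Python's sort is stable, lines with the same preceding word stay in corpus order, which corresponds to choosing no second sort.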

D. Using wildcards

  1. Although Monoconc cannot account for grammatical variations in a word, such as “know”, “knew”, “known”, “knowing” and so forth, special symbols called wildcards can be used to compensate. For example, in Monoconc, ? can be used to stand for any single letter, % for one optional letter, * for any number of letters.
  2. As an example, try generating a concordance for “kn?w*”. Look at the results.
  3. The symbol @ can be used to request a concordance of all places where one word occurs within a specified number of words of another word—what is called a co-occurrence or collocation. Try generating a concordance for “know @ you”, then inspect the results. Try generating another such concordance, this time for “you @ know”. Do you see the difference?
  4. The number of words defining a collocation is set by choosing Search Options from the Concordance menu or by choosing Options from the Generate concordance dialogue box. The default value is from 2 to 5 words. Try changing it to a minimum of 1 and a maximum of 10, then generate another concordance.
  5. Once you've created a concordance, another way to get an idea of the collocates is to get a frequency list of them: go to Frequency, then Collocate Frequency.
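The wildcard and co-occurrence searches above can be approximated in Python as follows. Several points are assumptions of this sketch rather than documented Monoconc behaviour: that * may match zero letters, that in a search such as “know @ you” the second word must follow the first, that the distance is counted as the difference between the two word positions, and that collocates are counted one word either side by default. All function names are hypothetical.

```python
import re
from collections import Counter

def wildcard_to_regex(pattern):
    """Translate ? (one letter), % (one optional letter) and * (any number
    of letters, assumed to include none) into a regular expression."""
    rules = {"?": "[a-z]", "%": "[a-z]?", "*": "[a-z]*"}
    return re.compile("".join(rules.get(ch, re.escape(ch)) for ch in pattern.lower()))

def matches(pattern, word):
    """True if the whole word fits the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(word.lower()) is not None

def cooccurrences(tokens, first, second, min_gap=2, max_gap=5):
    """Positions (i, j) where `second` follows `first` by min_gap..max_gap words.
    Counting the gap as j - i is an assumption of this sketch."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok != first:
            continue
        last = min(i + max_gap, len(tokens) - 1)
        for j in range(i + min_gap, last + 1):
            if tokens[j] == second:
                hits.append((i, j))
    return hits

def collocate_frequency(tokens, target, span=1):
    """Count the words found within `span` positions either side of the target."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            counts.update(tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span])
    return counts

# Invented sample, not taken from the Stephen corpus.
tokens = "you know that i know you".split()

print([w for w in ["know", "knew", "known", "knowing", "knot"] if matches("kn?w*", w)])
print(cooccurrences(tokens, "know", "you", min_gap=1, max_gap=5))
print(cooccurrences(tokens, "you", "know", min_gap=1, max_gap=5))
print(collocate_frequency(tokens, "know").most_common())
```

Swapping the two words changes which hits are found, which is one plausible reading of the difference you are asked to look for between “know @ you” and “you @ know”.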

Now you should be ready to use Monoconc for elementary probing of the Stephen text.

IV. The exercises

Here you are asked to pursue two questions: (1) how our intuition of the meaning of a word compares with actual usage; and (2) how various contexts show varieties of meaning. This will involve using the KWIC format and sorting the results of particular searches to the left and right of the target words for an understanding of their context.

A. Intuition vs. usage

The first question to pursue is how our intuition of the meaning of a word differs from actual daily usage. The Stephen corpus hardly represents our daily usage for many words, but here we will use a word that has changed little, if at all, since the 1970s: “back”. For the purposes of the exercise it is important that you do the following steps in the order given. No cheating!

  1. First construct a dictionary entry for the word back without consulting a dictionary. In its basic form it should look like the following entry for front:

    front n. 1. The forward part or surface: a shirt with buttons down the front; a desk at the front of the room. 2. The area, location, or position directly before or ahead: We had hoped to be at the front of the long line. The Jamaican runner was in front. 3. A person's outward manner, behaviour or appearance: keeping up a brave front despite his bad luck. 4. Land bordering a lake, river or street: A house on the lakefront. 5. In warfare, an area where a battle is taking place. 6. The boundary between two air masses having different temperatures: a cold front. 7. A field of activity: Conditions on the economic front are poor. 8. A group or movement uniting persons or organisations that seek a common goal: Several unions formed a labour front. 9. A person or business that seems respectable but serves as a cover for secret or illegal activity: That business is a front for selling drugs.

    Your definition of back need not have as many sub-senses as are given above for front, only as many as you can think of. These sub-senses should be arranged in the order in which you would expect to find them in common usage, with the most frequent given first. Imagine that you are making a dictionary for a learner of English, who has a strong practical need to find out in every case the most common meaning for each word he or she looks up.

  2. Now run a concordance on the Stephen corpus for the word back, which should yield 77 hits, including the following. Scan these to get a sense of the various contexts. What do you notice about the senses of back illustrated here?

  3. Explore any ideas you have about the usage of “back” by running the appropriate sorts on the concordance. Note down the function of the word as this is affected by the immediate context.

  4. Now classify what you have found according to the senses you defined in your dictionary entry. Which of these senses are most frequently attested, which least?

  5. The important point here is that the close examination of the linguistic context of a word from a substantial corpus tends to reveal surprising subtleties not anticipated by one's intuition about language, even one's own native language. We tend to see new things by applying an apparently trivial technology, whose essence consists merely of formatting and sorting words on screen. Again, format matters.

B. Definition from a corpus

  1. Consider the following definition of the word energy:
    energy n. 1. The capacity for work or vigorous activity; power: They lacked the energy to finish the job. 2. Use of power or vigour; effort: She's devoting all her energies to caring for her family. 3. Usable heat or power: Do we have enough energy to run the computer? 4. The capacity for doing work, as turning, pushing or raising something: chemical energy; solar energy.

    Generate a KWIC concordance for the same word from the Stephen corpus. Using the sort options, group the occurrences to show the different meanings and shades of meaning. Look for any definitions that Stephen may supply. Do these match his usage?

    Don't worry about being correct—there is no right answer to a problem such as this one—rather about constructing a reasonable (and reasonably small) set of senses, what you regard as a practical guide for someone who wants to know how someone like Stephen used this common English word.

    Now write out a definition for energy based solely on the examples you have assembled, again grouping them by their apparent importance. Try to keep the number of different senses to three or four.

    Compare your definition to the standard one given above. Taking Stephen's usage as typical for the American English of his subculture, how would you characterize the special meaning hippie speech gave to this word? What does this meaning tell you about the particular concerns of his subculture?

  2. For a second case consider the word heavy:
    heavy adj. I. 1. a. Of great weight; weighty, ponderous. The opposite of light. 2. a. Possessing great weight in proportion to bulk; of great specific gravity. b. Of bread, pastry, etc.: That has not properly ‘risen’, and is consequently dense and compact. b. Applied to elements whose specific gravity is relatively great; heavy metal, a metal of high specific gravity. 3. Great with young; gravid, pregnant. Also fig. 4. Increased in weight by the addition of something; laden with. 5. a. Applied technically to classes of goods, manufactured articles, breeds of animals, etc. of more than a defined or usual weight. II. Expressing the action or operation of things physically weighty. 8. Having great momentum; striking or falling with force or violence. 9. Of ground, a road, etc.: That clings or hangs heavily to the spade, feet, wheels, etc., and thus impedes motion or manipulation; soft and tenacious. 10. That weighs upon the stomach; difficult of digestion. III. Weighty in import, grave, serious. 12. Of great import; weighty, important; serious, grave. 13. Grave, severe, deep, profound, intense. IV. Having the aspect, effect, sound, etc. of heaviness. 14. a. Of the sky, clouds, etc.: Overcast with dark clouds; lowering, gloomy. 15. Having comparatively much thickness or substance; thick, coarse; also, massive in conformation or outline; wanting in gracefulness, lightness, elegance, or delicacy. V. Having the slow or dull action of what is weighty. VI. That weighs or presses hardly or sorely on the senses or feelings. VII. Weighed down mentally or physically.

    Generate a concordance from the Stephen corpus for this word. What is his prevailing sense of the word? To which sense in the above standard dictionary definition is his closest?


revised November 2007