AV1000 Fundamentals of the digital humanities
Introduction to HTML
I. HTML
The HyperText Markup Language (HTML) is the
markup language used to construct documents on the World Wide Web. In
other words, HTML comprises a set of instructions that describe how
the words of an online document are to be handled. It is called
hypertextual because one of its major features is to provide
for automatic linking between any two labelled places within the same
document or in two different documents.
Each HTML document may be displayed by the computer in two forms:
(1) what the author composes, and (2) what the reader sees. The former
is a mixture of HTML instructions ("markup") with the text that will
appear on the page; the latter is a formatted
page. If you click on the link in the previous sentence, you will
see the formatted page (2) in the normal way. From the View or Page
menu of Internet Explorer or Firefox, you can choose View Source or
Page Source and see the original HTML (1).
See also the course's template for
writing new HTML documents, which saves on typing parts that are
always necessary. In this course, the normal way of creating HTML
documents will be to start from this template and use a text editor to
fill in the rest. On Windows the Notepad program will work for this;
there is an enhanced version of this program, Notepad2, which you can
download here
and which may be run on PAWS machines. Copy it to your memory stick to
save the trouble of downloading it each time.
Note that:
- HTML is a simplified derivative of the Standard
Generalised Markup Language (SGML), a widely used
metalanguage for describing the elements and logical organisation of
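electronic documents. The version of HTML we teach in this course is
XHTML 1.0, a variety of HTML based on the Extensible Markup Language
(XML). It is stricter than other versions of HTML: element and
attribute names must be in lower case, attribute values must be
quoted, and every element must be explicitly closed.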
- HTML consists of instructions applied to portions of a
document; together these are called elements. The elements are
divided into those that define how the body of a document is to be
displayed by the WWW browser and those that define information about
the document, such as its title, keywords, and relationship to other
online documents.
- Most elements affect a block of text by specifying a start-tag
at the beginning of the block and an end-tag immediately
following it. All tags are enclosed in angle-brackets and consist of
an element name optionally followed by one or more
attributes. The end-tag repeats the element name preceded by a
forward slash (virgule). Thus a paragraph is indicated by a
<p> at its beginning and a </p> at its end,
a centred paragraph by a <p align="center"> and a
</p>.
- Some elements are empty, i.e. contain no affected text. An example
is the element to produce a horizontal "rule" or line,
<hr />. In these cases the forward slash appears within
the start-tag, just before the closing angle bracket.
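The tag syntax described in the points above can be illustrated with a short fragment. This is only a minimal sketch and is not taken from the course's sample page; the wording of the text inside the elements is invented for illustration.

```html
<!-- A non-empty element: start-tag, affected text, end-tag. -->
<p>This sentence is the content of a paragraph element.</p>

<!-- A start-tag carrying an attribute. XHTML requires the value to be quoted. -->
<p align="center">This paragraph is centred.</p>

<!-- An empty element: no affected text, so the slash sits inside the tag. -->
<hr />
```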
II. Structure of an HTML document
See again the attached sample Web
page and use View Source for an illustration of the following.
- The whole document is enclosed by the <html> … </html>
element. Within it, first comes
- A <head> … </head> element, which contains
a <title> … </title> element (required in XHTML 1.0). The
title is whatever words you want to appear on the title-bar of the
browser window. It will also appear as the title in listings produced
by search engines such as Google.
- The body of the document follows and encloses all other
contents. It is denoted by the <body> …
</body> element, which (as here) may contain attributes
determining the background colour or pattern, the colour of the text
and of text affected by linking.
- The body is free-form, i.e. may contain any mixture of HTML
elements and text.
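The structure just described might look like the skeleton below. It is a sketch rather than the course's own template: it assumes the XHTML 1.0 Transitional document type (which permits colour attributes on the body element), and the title text and colour values are placeholders chosen for illustration.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <!-- Shown on the browser's title-bar and in search-engine listings. -->
    <title>A sample page</title>
  </head>
  <!-- bgcolor, text and link set the background, text and link colours. -->
  <body bgcolor="#ffffff" text="#000000" link="#0000cc">
    <p>The visible content of the page goes here.</p>
  </body>
</html>
```

Saving this as a file ending in .html and opening it in a browser should show the single paragraph, with "A sample page" on the title-bar.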
III. Common HTML elements
Again, see the attached sample Web
page and use View Source for illustration of the following.
- Headings or titles. These are denoted by a set of elements,
<hx> … </hx>, where x = an
Arabic numeral from 1 to 6. The lower the number the larger the
enclosed text is rendered. How big the actual text is on screen is a
function of settings in the browser. Thus <h2>London: a
late 12th-century opinion</h2> renders that title in the
second-largest size.
- Paragraphs, denoted as such by being enclosed with the
<p> … </p> element. Note that "hard returns"
that you type in the text-editor are not represented as such by the
browser but as spaces. The text is reformatted to fit the browser
window rather than following the line breaks within your HTML
file. Two or more returns or spaces will be represented as a single
space. A common attribute for the paragraph element, illustrated in
the sample, is align, which may be set to display text centred
(<p align="center">), pushed to the right-hand margin
(<p align="right">) or against the left-hand margin (the
default).
- Lists. There are two kinds of lists: "ordered" lists, denoted by
the <ol> … </ol> element, which produce an
enumerated series such as you see here; and "unordered" lists, denoted
by the <ul> … </ul> element, which produce a
series marked by bullets. Each item in the series for both
<ol> … </ol> and <ul> …
</ul> is defined by the list-item element, <li>
… </li>.
- Links. Any segment of text may be made into a hypertextual link,
which when activated by the user will fetch another document. Text is
denoted as a link by enclosing it within an anchor element,
<a> … </a>, whose href attribute specifies the destination address. See the examples in the
attached sample.
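The elements listed above can be combined in the body of a document roughly as follows. This fragment is a sketch and not the attached sample: apart from the heading, the text is invented, the id value "london" is a hypothetical label, and the address www.example.org is a placeholder destination.

```html
<!-- Second-largest heading size; the id attribute labels this place. -->
<h2 id="london">London: a late 12th-century opinion</h2>

<!-- The line breaks and extra spaces below are displayed as single spaces. -->
<p>This paragraph is typed
on several   lines, but the browser
reflows it to fit the window.</p>

<p align="right">This paragraph is pushed to the right-hand margin.</p>

<!-- Ordered list: the browser numbers the items. -->
<ol>
  <li>First item</li>
  <li>Second item</li>
</ol>

<!-- Unordered list: the browser marks the items with bullets. -->
<ul>
  <li>One item</li>
  <li>Another item</li>
</ul>

<!-- The href attribute of the anchor element gives the destination address. -->
<p>See <a href="http://www.example.org/">an example site</a> for more.</p>

<!-- A link to a labelled place within the same document. -->
<p><a href="#london">Return to the heading</a></p>
```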
IV. Browser tricks
- The best way to learn HTML is by example. Extensive
browsing on the Web with an eye to effective design features, followed
in each case by investigation of the HTML that causes them, is highly
recommended.
- To see the HTML behind any Web page, either choose the item "view
source" or "page source" (under the View or Page menu) or save the
page, using the Save As… item under the File menu, then use a
text-editor to view it.
- You can capture any image or graphic you see, including animated
GIFs. On the PC, place the mouse pointer on the image, right-click the
mouse and choose the Save As… option.
V. Design
- Effective design is a major consideration in publishing on the Web.
Observe what you think works, then ask yourself why. Copy the best examples
before you try to be creative.
- Since HTML gives you great freedom in how your pages
are designed, you have to be
particularly thoughtful about what you are providing for your
reader.
- Readers need to be told, explicitly or implicitly, what kind of a
thing your page is, what genre it belongs to (e.g., personal homepage,
academic c.v., essay, collection of links). It should be quickly
recognisable as something familiar; otherwise be sure of what you are
doing.
- Segmentation of a document into smaller, interlinked pages is
often better than a single page containing a large document.
- The mechanisms and path you provide to help your reader navigate
through your document will have much to do with its success. Two
common mechanisms are (a) a table of contents at the top of the first
page in a set, and (b) navigational "buttons" at the top and bottom of
each page. Consider carefully how you think your document should be
read and how it may be accessed. Note that a Web search-engine may
deliver to a potential reader a page somewhere in the middle of a
document; how is that reader to know what he or she has landed in the
middle of?
- Always remember: your page should be designed to be read (or
viewed), not to be admired. Cleverness is sometimes a vice.
- See the style guides in the Course Bibliography.
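The two navigation mechanisms mentioned above could be marked up along these lines. This is one possible sketch, assuming a document split into three pages; the file names (index.html, part1.html and so on) and the link wording are hypothetical.

```html
<!-- (a) Table of contents at the top of the first page, index.html. -->
<h1>A guide in three parts</h1>
<ul>
  <li><a href="part1.html">Part 1</a></li>
  <li><a href="part2.html">Part 2</a></li>
  <li><a href="part3.html">Part 3</a></li>
</ul>

<!-- (b) Navigation links, repeated at the top and bottom of part2.html. -->
<p align="center">
  <a href="part1.html">Previous</a> |
  <a href="index.html">Contents</a> |
  <a href="part3.html">Next</a>
</p>
```

A "Contents" link on every page gives a reader who arrives from a search engine a way to find out what the page is part of.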
VI. Pragmatics of displaying & ethics of copying
- Anything you put on a Web page may be copied. You are in effect
giving away whatever you publish there. (This point is of special
concern to artists, but note the technology for imposing a "digital
watermark".)
- Information about yourself that you put on a Web page may be used
by others for purposes you may not welcome, e.g. sending you unwanted
e-mail.
- Your Internet service provider (ISP), King's College London, holds
you responsible for the contents of your pages. Nothing naughty!
- You are expected to acknowledge your sources explicitly. Ideas
cannot be owned, but a significant implementation of them can be. Whatever you borrow
should be documented as carefully as you would a source for a
conventional academic paper. Failure to observe the ethics of copying
could get you into trouble (i.e. a charge of plagiarism, or
worse).
revised October 2007