Document Properties

Doctype, Namespace, and Content Type

All new pages should use the XHTML 1.0 Strict doctype. This doctype allows us to maintain a measurable coding standard while remaining flexible enough to be practical for a large organization's website such as ours. Also be sure to set the default language attributes on the <html> tag, and the default character set using a <meta> tag in the <head> of the document.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

Language

It is important to declare a default language for web documents. Doing so helps screen readers render (read) screen content properly; if you have ever heard Spanish read as English, you will understand the importance. The language is set using the lang (HTML) and xml:lang (XML) attributes of the <html> tag. Example: <html lang="en" xml:lang="en">. A complete list of language codes can be found at http://www.w3.org/WAI/ER/IG/ert/iso639.htm#2letter.
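
The section above only covers the document-wide default. XHTML 1.0 also accepts the same two attributes on individual elements, so a passage in another language can be marked where it occurs. The fragment below is a minimal sketch of that idea; the Spanish phrase and the surrounding sentence are illustrative assumptions, not text from an actual page.

```html
<!-- Page default is English (set on <html>); this span overrides it for one phrase -->
<p>Our welcome banner reads
  <span lang="es" xml:lang="es">Bienvenidos a PCC</span>
  on the Spanish-language pages.</p>
```

Setting both lang and xml:lang keeps the fragment consistent with the <html> tag example above.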

Title

Document titles should be clear, concise, and descriptive. On the public website it is more important for the title to represent the page than the site or organization. The title should be short enough to fit comfortably in the browser's title bar and to serve as a bookmark label, but not so short that it becomes cryptic. In most cases the title should match the H2 on the page, which tends to give better search results. The general format is the page title, a separator, and the site abbreviation:

Example: <title>Page Title | PCC</title>
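
To illustrate the guideline that the title should match the H2, here is a short sketch of a page fragment. The page name "Financial Aid" is a hypothetical placeholder, and the rest of the head and body is omitted.

```html
<head>
<title>Financial Aid | PCC</title>
</head>
<body>
<!-- The H2 repeats the page portion of the title, without the "| PCC" suffix -->
<h2>Financial Aid</h2>
</body>
```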

Author

The document's author can be declared using a <meta> tag in the <head> of the document. We can make use of this information for update/correction forms.

Example: <meta name="author" content="webteam@pcc.edu" />

Contact

With a site this large, it can be difficult to determine who "owns" the content on any individual page. During the migration, pages will be given an additional meta tag named "contact". It will hold a comma-delimited list of the following basic information about the contact for each document:

  • Name (first last)
  • Department (For reference if the individual leaves PCC)
  • Date (the last major contact, in universal date format YYYY-MM-DD)

Example: <meta name="contact" content="Gabriel McGovern, TSS, 2006-02-06" />

Robots

For most pages, being listed in a search engine is a good thing. However, there are certain pages (such as an index of meeting notes) where this is not the case. For these pages, the "robots" meta tag can be used to let spiders (web search robots) know that a page should not be indexed and that its links should not be followed. Note that the tag is only a request: some search engines may ignore it and index the page anyway.

Example: <meta name="robots" content="noindex,nofollow" />
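
As a sketch of where the tag sits, the fragment below shows the <head> of a page that should stay out of search results. The "Meeting Notes" title is a hypothetical example based on the case mentioned above; the other tags follow the conventions described earlier in this section.

```html
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<!-- Ask compliant crawlers not to index this page or follow its links -->
<meta name="robots" content="noindex,nofollow" />
<title>Meeting Notes | PCC</title>
</head>
```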

Sample Document

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="author" content="webteam@pcc.edu" />
<meta name="contact" content="Gabriel McGovern, TSS, 2006-02-06" />
<title>Welcome | PCC</title>

</head>
<body>
[…page contents…]
</body>
</html>