<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<title>
ht://Dig: Recognized META information in HTML documents
</title>
</head>
<body bgcolor="#eef7ff">
<h1>
Recognized META information in HTML documents
</h1>
<p>
ht://Dig Copyright © 1995-2004 <a href="THANKS.html">The ht://Dig Group</a><br>
Please see the file <a href="COPYING">COPYING</a> for
license information.
</p>
<hr size="4" noshade>
<h2>
Introduction
</h2>
<p>
Because the <a href="index.html">ht://Dig</a> system indexes
all HTML pages on a system, individual page authors may
want to control some aspects of the indexing
operation. To this end, ht://Dig recognizes some special
&lt;META&gt; tag attributes. The following things can be
controlled in this manner:
</p>
<ul>
<li>
Do not index the document
</li>
<li>
Notify a user that the document has expired
</li>
<li>
Set keywords for the document
</li>
</ul>
<hr>
<h2>
General &lt;META&gt; tag use
</h2>
<p>
In HTML, any number of &lt;META&gt; tags can be used between
the &lt;HEAD&gt; and &lt;/HEAD&gt; tags of a document. There
are three possible attributes in this tag, two of which are
recognized by ht://Dig:
</p>
<dl>
<dt>
NAME
</dt>
<dd>
Used to name a specific property.
</dd>
<dt>
CONTENT
</dt>
<dd>
Used to supply the value for a named property.
</dd>
</dl>
<p>
A document could start with something like the following:
</p>
<blockquote>
&lt;HTML&gt;<br>
&lt;HEAD&gt;<br>
&lt;META NAME="htdig-keywords" CONTENT="phone telephone
online electronic directory"&gt;<br>
&lt;META NAME="htdig-email"
CONTENT="pat.user@nowhere.net"&gt;<br>
&lt;TITLE&gt;Some document title&lt;/TITLE&gt;<br>
&lt;/HEAD&gt;<br>
&lt;BODY&gt;
<blockquote>
<em>Body of document</em>
</blockquote>
&lt;/BODY&gt;<br>
&lt;/HTML&gt;
</blockquote>
<hr>
<h2>
Recognized properties
</h2>
<p>
The following properties are recognized by ht://Dig:
</p>
<ul>
<li>
htdig-keywords
</li>
<li>
htdig-noindex
</li>
<li>
htdig-email
</li>
<li>
htdig-notification-date
</li>
<li>
htdig-email-subject
</li>
<li>
robots
</li>
<li>
keywords
</li>
<li>
description
</li>
<li>
author
</li>
</ul>
<p>
Detailed information about the <em>htdig-email</em>,
<em>htdig-notification-date</em>, and
<em>htdig-email-subject</em> properties can be found in the
<a href="notification.html">Email notification service</a>
document.
</p>
<p>
Descriptions of the properties and their values:
</p>
<dl>
<dt>
<strong>htdig-keywords</strong>
</dt>
<dd>
The value of this property should be a blank-separated list
of keywords, which are given a very high weight when
searching. This can be used to get around some problems
with common synonyms for words in the document. For
example, if a document is a telephone directory, possible
keywords could be "telephone phone directory book list".
The document can then be found whenever these keywords are
used in a search, regardless of what text it actually contains.
The weight that words in the content string will have in
search results is controlled by the
<a href="attrs.html#keywords_factor">keywords_factor</a>
attribute in your configuration.
</dd>
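<dd>
As a minimal sketch of the telephone directory case described
above, such a tag might look like the following. The keyword
list is only illustrative:
<blockquote>
&lt;!-- illustrative keyword list for a telephone directory page --&gt;<br>
&lt;META NAME="htdig-keywords" CONTENT="telephone phone
directory book list"&gt;
</blockquote>
</dd>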
<dt>
<strong>htdig-noindex</strong>
</dt>
<dd>
This property has no value associated with it. If it is
used, the document will NOT be included in any searches.
Example uses of this could be:
<ul>
<li>
A dynamic document, i.e., one whose contents change
continually.
</li>
<li>
A temporary document that is not yet officially available.
</li>
<li>
A document you just don't want to be found.
</li>
</ul>
</dd>
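<dd>
For instance, a page that should stay out of the index might
carry a tag along these lines in its &lt;HEAD&gt; section. Because
the property takes no value, this sketch assumes that the CONTENT
attribute can simply be left out:
<blockquote>
&lt;!-- illustrative: keeps this page out of ht://Dig searches --&gt;<br>
&lt;META NAME="htdig-noindex"&gt;
</blockquote>
</dd>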
<dt>
<strong>htdig-email</strong>
</dt>
<dd>
The value is the email address to which a notification message
should be sent. Multiple email addresses can be given,
separated by commas. If no email address is given, no
notification will be sent.<br>
(Please check the <a href="notification.html">Email
notification service</a> documentation for more details on
this.)
</dd>
<dt>
<strong>htdig-notification-date</strong>
</dt>
<dd>
The value is the date on or after which the notification
should be sent. The format is <em>month / day /
year</em> or, if the <a href="attrs.html#iso_8601">iso_8601</a>
attribute is set, <em>year - month - day</em>.
The year must include the century: use <em>1995</em>
rather than <em>95</em>.<br>
If no date is given, no notification will be sent. (Please
check the <a href="notification.html">Email notification
service</a> documentation for more details on this.)
</dd>
<dt>
<strong>htdig-email-subject</strong>
</dt>
<dd>
The value specifies the subject of the notification message.
This is an optional property. (Please check the
<a href="notification.html">Email notification service</a>
documentation for more details on this.)
</dd>
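<dd>
Taken together, the three notification properties might be
combined as in the following sketch. The address is the
placeholder used earlier in this document; the date and subject
are purely illustrative, and the date assumes the default
month / day / year format:
<blockquote>
&lt;!-- illustrative values; the year is written with its century --&gt;<br>
&lt;META NAME="htdig-email" CONTENT="pat.user@nowhere.net"&gt;<br>
&lt;META NAME="htdig-notification-date" CONTENT="12/31/2004"&gt;<br>
&lt;META NAME="htdig-email-subject" CONTENT="Page needs review"&gt;
</blockquote>
</dd>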
<dt>
<a name="robots"><strong>robots</strong></a>
</dt>
<dd>
The value specifies restrictions on robots (including ht://Dig)
for the current page. These restrictions can be "noindex" to
prevent indexing the document while still allowing the robot to
follow links from the page, "nofollow" to allow indexing but
prevent links from being followed, or "none" to prevent
both. Additionally, ht://Dig supports the values "index",
"follow", and "all", which are the opposites of the other
values and describe the default behavior. For more information on
META robots tags, check out the
<a href="http://www.robotstxt.org/wc/meta-user.html">HTML
Author's Guide to the Robots META tag</a>.
</dd>
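<dd>
As an illustration, the two tags below each use one of the
values described above. They are alternatives, so a page would
normally carry only one of them:
<blockquote>
&lt;!-- do not index this page, but follow its links --&gt;<br>
&lt;META NAME="robots" CONTENT="noindex"&gt;<br>
&lt;!-- index this page, but do not follow its links --&gt;<br>
&lt;META NAME="robots" CONTENT="nofollow"&gt;
</blockquote>
</dd>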
<dt>
<strong>keywords</strong>
</dt>
<dd>
The value of this property should be a blank-separated list
of keywords, just as for the htdig-keywords property.
htdig treats the two properties as equivalent. The reason for
having two different properties is that the keywords property
is used by other search engines as well, while the
htdig-keywords property can be used for words you want
indexed only by htdig. You can get htdig to treat other
property names as equivalent to htdig-keywords, or disable
the htdig-keywords or keywords properties, by changing the
<a href="attrs.html#keywords_meta_tag_names">keywords_meta_tag_names</a>
attribute in your configuration.
</dd>
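<dd>
A page might use both properties side by side, as in this
sketch. The word lists are illustrative: the first tag can be
read by any search engine that honors it, while the second is
meant for htdig alone.
<blockquote>
&lt;!-- illustrative: general keywords, then htdig-only keywords --&gt;<br>
&lt;META NAME="keywords" CONTENT="telephone directory"&gt;<br>
&lt;META NAME="htdig-keywords" CONTENT="phone book list"&gt;
</blockquote>
</dd>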
<dt>
<strong>description</strong>
</dt>
<dd>
The value allows you to specify an alternate excerpt
(description) of a page. If the config-file attribute
<a href="attrs.html#use_meta_description">use_meta_description</a>
is set, then any documents with
descriptions will use them instead of the automatically
generated excerpts.
The weight that words in the content string will have in
search results is controlled by the
<a href="attrs.html#meta_description_factor">meta_description_factor</a>
attribute in your configuration.
</dd>
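<dd>
A description tag might look like the following sketch. The
wording of the excerpt is illustrative only:
<blockquote>
&lt;!-- illustrative excerpt text --&gt;<br>
&lt;META NAME="description" CONTENT="Online telephone
directory for staff and departments"&gt;
</blockquote>
</dd>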
<dt>
<strong>author</strong>
</dt>
<dd>
The value specifies the name, email address and/or affiliation
of the creator or authoriser of a page.
The weight that words in the content string will have in
search results is controlled by the
<a href="attrs.html#author_factor">author_factor</a>
attribute in your configuration.
A search for "author:<em>name</em>" will
look only in these fields for the word <em>name</em>.
</dd>
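<dd>
As a small sketch, an author tag could be written as follows.
The name is a made-up placeholder:
<blockquote>
&lt;!-- illustrative author name --&gt;<br>
&lt;META NAME="author" CONTENT="Pat User"&gt;
</blockquote>
With a tag like this in place, a search for "author:User" would
look for the word <em>User</em> only in fields of this kind.
</dd>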
</dl>
<hr size="4" noshade>
Last modified: $Date: 2004/05/28 13:15:19 $
</body>
</html>