Thu Jan 31 17:32:33 2002 Geoff Hutchison <ghutchis@wso.williams.edu>

	* Release of 3.1.6.

	* htdoc/confindex.html, htdoc/htsearch.html, htdoc/index.html,
	htdoc/mailarchive.html: Remove CSS link, not needed in these
	frameset pages.

	* htdoc/howto-mirror.html: Update with Jesse's latest version.

Thu Jan 31 15:13:07 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* Makefile.in: Fixed install-strip target to properly handle relative
	paths in INSTALL_PROGRAM when passing it to subdirectories.

Thu Jan 31 11:41:39 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdoc/FAQ.html: Updated questions 4.8 & 4.9 to emphasize use of
	doc2html over parse_doc.pl. Further clarified question 2.1.

Thu Jan 31 10:14:23 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* contrib/parse_doc.pl: Added comments explaining why you should
	not be using this script.

Wed Jan 30 17:20:51 2002 Geoff Hutchison <ghutchis@wso.williams.edu>

	* htdoc/FAQ.html: Updated to mention 3.1.6 as the newest version
	and --with-rx as a fix for regex problems on BSDI.

Wed Jan 30 17:15:49 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* installdir/synonyms: Updated with the version contributed by
	David Adams, with minor changes. Kept old one as synonyms.original.

	* installdir/english.0: Changed lots more dubious uses of suffixes to
	get more appropriate and correct fuzzy endings expansions.

Wed Jan 30 12:30:16 2002 Geoff Hutchison <ghutchis@wso.williams.edu>

	* htlib/Connection.cc (connect): Fixed bug with allow_EINTR and
	added support for looping when the connection returns EAGAIN (no
	more free local ports). Thanks to Ahmon Dancy <dancy@franz.com>
	for pointing out the EAGAIN issue.

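The retry behavior described in the Connection.cc entry above can be sketched roughly as follows. This is an illustration only, not the code in htlib/Connection.cc: the helper name connect_with_retry, the attempt callback and the max_retries parameter are all assumptions made for the example.

```cpp
#include <cerrno>
#include <functional>

// Hypothetical helper, not ht://Dig's Connection::connect(). `attempt`
// makes one connection attempt and returns 0 on success, or -1 with
// errno set on failure.
int connect_with_retry(const std::function<int()>& attempt,
                       bool allow_eintr, int max_retries)
{
    for (int tries = 0; tries <= max_retries; ++tries) {
        errno = 0;
        if (attempt() == 0)
            return 0;                  // connected
        if (errno == EINTR && allow_eintr)
            continue;                  // interrupted by a signal: try again
        if (errno == EAGAIN)
            continue;                  // no free local port right now: try again
        return -1;                     // any other error is treated as fatal
    }
    return -1;                         // retry budget exhausted
}
```

A real implementation would probably pause between EAGAIN retries so that local ports have time to be released; the sketch leaves that out to stay short.
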
Tue Jan 29 09:59:58 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdoc/FAQ.html: Updated with today's changes to maindocs FAQ.

Mon Jan 28 16:54:15 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* contrib/README: Added mentions of examples & xmlsearch, fixed typo.

Sun Jan 27 23:13:11 2002 Geoff Hutchison <ghutchis@wso.williams.edu>

	* htdoc/*.html: Final batch of documentation updates.

Sat Jan 26 23:28:25 2002 Geoff Hutchison <ghutchis@wso.williams.edu>

	* htdoc/*: More documentation updates from merging with the
	current maindocs CVS.

Fri Jan 25 21:36:21 2002 Geoff Hutchison <ghutchis@wso.williams.edu>

	* acconfig.h, include/htconfig.h.in: Add USE_RX to potential
	configure #include macros.

	* htlib/gregex.h: Rename regex.h to prevent conflicts with system
	version.

	* htlib/regex.c, htlib/HtRegex.h: Ditto.

	* htfuzzy/EndingsDB.cc: Use same tests as HtRegex.h for rxposix.h,
	gregex.h or regex.h depending on configure results.

	* configure.in: Implement more flexible test for rx/regex, which
	will check for rxposix.h if --with-rx is supplied, will "fall
	back" to regex test if rxposix.h isn't available and will only use
	the htlib/ code and header for regex compile.

	* configure: Update using autoconf.

Fri Jan 25 12:14:26 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* contrib/whatsnew/README, contrib/whatsnew/whatsnew.html: Added
	an example of how to get a what's new listing from the new features
	in htsearch.

Thu Jan 24 22:43:28 2002 Geoff Hutchison <ghutchis@wso.williams.edu>

	* htcommon/defaults.cc: Add ignore_dead_servers attribute to
	control whether indexing will continue to try to contact a dead
	server.

	* htdig/Retriever.cc: Only mark a server as dead if the
	ignore_dead_servers attribute is set.

	* htdoc/cf_byname.html, htdoc/cf_byprog.html, htdoc/attrs.html:
	Documentation updates.

Thu Jan 24 15:32:59 2002 Geoff Hutchison <ghutchis@wso.williams.edu>

	* configure, configure.in: Add --with-rx option to switch to
	system rx code (e.g. on BSDI). Needs some touchups still,
	including checking that rxposix.h exists and if --without-rx was
	supplied for some reason.

	* htlib/HtRegex.h: Add conditional <rxposix.h> header for systems
	where rx is better than regex.

	* htlib/Makefile.in: Make sure regex.o is only compiled if it
	works on a given system via LIBOBJS as supplied by the configure
	script.

Mon Jan 21 22:33:30 2002 Geoff Hutchison <ghutchis@wso.williams.edu>

	* htdoc/RELEASE.html: Add first shot at the release notes for
	3.1.6. Still need to finish some of the htdoc/ merges, including
	the SF icons and such.

	* htdoc/*.html: First stab at many of the htdoc/ merges including
	the new Copyright line. (It is 2002, after all.)

Fri Jan 18 18:17:34 2002 Geoff Hutchison <ghutchis@wso.williams.edu>

	* htmerge/docs.cc: Add a test if the DB database has no URLs
	before proceeding.

	* htmerge/words.cc: Add a slightly more user-friendly error
	message if the word list file doesn't exist. Remove exit()
	statements since reportError does this for us.

Fri Jan 18 16:47:50 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdoc/attrs.html: Rewrote description of prefix_match_character
	to make it more clear, with crosslinks to related attributes, and
	described new wildcard matching feature. Added more explanations
	for relative days & months in startday et al. to make it clearer.
	Added more notes about to-strings in the url_part_aliases description
	and explained the example even more, as well as adding crosslinks
	to the new *_rewrite_rules.

Fri Jan 18 15:56:11 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htsearch/htsearch.cc (setupWords), htsearch/parser.cc (perform_push):
	Added support for a wildcard word of "*" (or prefix_match_character
	if set and not empty) which returns all documents.

Wed Jan 16 17:21:26 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdoc/attrs.html, htdoc/hts_form.html: Described how to use
	relative dates for startyear et al.

Wed Jan 16 16:58:05 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htsearch/Display.cc (buildMatchList): Fixed startday et al. to
	allow relative days, month & years if values are negative.

Fri Jan 11 20:57:51 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdoc/attrs.html: Updated descriptions for translate_* attributes
	to match the new default behavior.

Fri Jan 11 17:48:54 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdig/SGMLEntities.cc (translateAndUpdate): Added support for
	translate_latin1 attribute, to turn off ISO-8859-1-specific entities.
	* htcommon/defaults.cc: Added translate_latin1 attribute.
	* htdoc/attrs.html, htdoc/cf_by{name,prog}.html: Documented it.

Fri Jan 11 17:14:54 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* contrib/xmlsearch.{README,tar.gz}: Removed older xmlsearch package.

Fri Jan 11 17:06:09 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* contrib/xmlsearch/*: Added files contributed by Nathan Hand and
	me to implement XML output from htsearch, including DTD, templates
	and config file.

Wed Jan 9 22:08:21 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* CONFIG.in: Fixed to allow setting BIN_DIR by configure option.
	* contrib/htdig-3.1.6.spec: Fixed to make use of new ./configure
	options for pathnames, do away with patch file. Used variables for
	many pathnames to allow easy changes.

Wed Jan 9 16:22:32 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdig/ExternalParser.cc (parse): Added support for max_keywords
	attribute.

Wed Jan 9 16:10:44 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdig/HTML.cc (HTML, do_tag), htdig/ExternalParser.cc (parse):
	Added support for description_meta_tag_names attribute.
	Ensure external parser interface accepts META descriptions even if
	'description' is added to the keyword list.
	* htcommon/defaults.cc: Added description_meta_tag_names attribute.
	* htdoc/attrs.html, htdoc/cf_by{name,prog}.html: Documented it.

Tue Jan 8 17:39:24 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdig/ExternalParser.cc (parse): Added support for use_doc_date
	attribute.

Thu Jan 3 17:10:50 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htlib/Makefile.in, htlib/lib.h: Removed references to timegm,
	mytimegm and strptime functions. Removed C source for these.

Thu Jan 3 16:43:31 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdoc/htmerge.html: Added extra description for -m option to clear
	up common points of confusion, added note about LC_COLLATE environment
	variable.

Fri Dec 21 18:52:32 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdig/Retriever.cc: Added parsedcdate function, used by got_time,
	to parse DC date meta tags without requiring strptime or timegm.

Thu Dec 20 12:25:47 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdig/Document.cc: Added parsedate function, used by getdate, to
	parse date headers without requiring strptime or timegm, which have
	caused problems on some systems.

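One way to turn already-parsed UTC date fields into a Unix timestamp without timegm is the well-known "days from civil" arithmetic shown below. This is a sketch of that general technique, not the parsedate function in htdig/Document.cc; the names days_from_civil and utc_seconds are made up for the example, and parsing of the header text itself is not shown.

```cpp
#include <cstdint>

// Days since 1970-01-01 for a proleptic Gregorian date
// (month in 1..12, day in 1..31). Illustrative helper only.
std::int64_t days_from_civil(std::int64_t y, unsigned m, unsigned d)
{
    y -= (m <= 2) ? 1 : 0;                  // count Jan/Feb as part of the previous year
    const std::int64_t era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = static_cast<unsigned>(y - era * 400);   // year of era, 0..399
    const unsigned mp  = (m + 9) % 12;                           // March = 0 ... February = 11
    const unsigned doy = (153 * mp + 2) / 5 + d - 1;             // day of year, 0..365
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;  // day of era, 0..146096
    return era * 146097 + static_cast<std::int64_t>(doe) - 719468;
}

// Seconds since the epoch for a UTC date and time, with no timegm() call.
std::int64_t utc_seconds(std::int64_t y, unsigned m, unsigned d,
                         unsigned hh, unsigned mm, unsigned ss)
{
    return days_from_civil(y, m, d) * 86400 + hh * 3600 + mm * 60 + ss;
}
```

Because the arithmetic never consults the process time zone or locale, it avoids the portability problems that timegm and strptime can cause.
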
Thu Dec 20 11:51:26 CET 2001 Gabriele Bartolini <angusgb@users.sourceforge.net>

	* configure.in: reviewed directory settings
	* Makefile.in: ditto (for 'make install' of htdig.conf and rundig)

Wed Dec 19 23:05:09 2001 Geoff Hutchison <ghutchis@wso.williams.edu>

	* configure.in: Add tests for ostream.h and iostream.h.

	* htlib/htString.h: Add HAVE_OSTREAM_H and HAVE_IOSTREAM_H
	preprocessor statements to deal with portability issues around the
	C++ header files.

Wed Dec 19 13:33:55 2001 Gabriele Bartolini <angusgb@users.sourceforge.net>

	* configure.in: fixed bug in customisation of configure parameters
	* CONFIG.in: ditto
	* configure: re-generated with autoconf

Tue Dec 18 16:12:17 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htsearch/Display.cc (displayMatch): Fixed to clear out old values
	of ANCHOR template variable for each result.

Thu Dec 6 13:14:22 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* contrib/examples/rundig.sh: Fixed to make use of DBDIR variable.

Wed Nov 21 12:54:42 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdoc/rundig.html: Added note about effect of changing database_base.

	* htmerge/docs.cc (convertDocs): Changed confusing message about
	total doc db size in stats.

Wed Nov 21 11:37:52 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htsearch/TemplateList.cc (createFromString), htdoc/attrs.html:
	Treat template_map as a _quoted_ string list. Change <i> tags to
	the HTML-4.0 compliant <em> tags in builtin-long template.

Tue Nov 20 17:13:27 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htlib/String.cc (String, append, sub): Added checks for negative
	lengths or start position to make code more fault-tolerant.

Tue Nov 20 16:37:26 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htfuzzy/Synonym.cc (createDB): Check for lines with less than
	2 words, to avoid segfault caused by calling Database::Put() with
	negative length for data field.

Sat Nov 3 23:55:00 2001 Geoff Hutchison <ghutchis@wso.williams.edu>

	* htlib/htString.h: Add #include for ostream.h to solve compile
	problems with gcc3.

	* htlib/Connection.h, htlib/Connection.cc: Backport Connection
	class from 3.2 code--installs alarm() call to timeout connections
	and will retry connections a few times before giving up.

Fri Nov 2 12:28:35 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdig/HTML.cc, htdoc/attrs.html: Added support for dc.date,
	dc.date.created and dc.date.modified to use_doc_date handling.

Fri Nov 2 12:12:59 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* contrib/xmlsearch.README, contrib/xmlsearch.tar.gz: Added files
	contributed by Nathan Hand and me to implement XML output from
	htsearch, including DTD, templates and config file.

Fri Nov 2 12:05:49 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdig/HTML.cc (do_tag), htcommon/defaults.cc: Added ignore_alt_text
	attribute to avoid indexing alt text in img tags.
	* htdoc/attrs.html, htdoc/cf_by{name,prog}.html: Documented it.

Thu Nov 1 14:43:13 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htsearch/htsearch.cc (main): Fixed to only show file names in
	error messages when REQUEST_METHOD not set and -v option given,
	for security.

Thu Nov 1 10:19:27 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htsearch/Display.cc, htsearch/Display.h: Added a localized
	method for outputting HTTP headers, added support for a new
	search_results_contenttype attribute to control that header.
	* htcommon/defaults.cc: Added default for it.
	* htdoc/attrs.html, htdoc/cf_by{name,prog}.html: Documented it.

Wed Oct 31 13:31:18 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* installdir/english.0: Changed lots of dubious uses of suffixes to
	get more appropriate and correct fuzzy endings expansions.

Tue Oct 23 14:06:37 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdig/Retriever.cc (RetrievedDocument): Fixed handling of null
	return from getParsable(), to avoid segfault problem introduced
	by text/css conditional added Jul 25.

Fri Oct 19 17:24:19 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htsearch/Display.cc (hilight): Added Stefan Nehlsen's idea for
	anchor_target attribute.
	* htcommon/defaults.cc: Added default for it.
	* htdoc/attrs.html, htdoc/cf_by{name,prog}.html: Documented it.

Sun Oct 14 22:05:30 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdoc/attrs.html (external_parsers): Documented external converter
	chaining to same content-type, e.g. text/html->text/html-internal.

Sun Oct 14 21:54:24 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdoc/attrs.html, htdoc/cf_byprog.html, htdoc/cf_byname.html,
	htcommon/defaults.cc: Documented and declared startyear, etc.
	attributes used by htsearch.

Sun Oct 14 21:16:19 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdoc/htdump.html, htdoc/htload.html, htdoc/attrs.html,
	htdoc/cf_byprog.html, htdoc/contents.html: Documented htdump and
	htload, indicating which attributes are used by them.

Fri Oct 12 14:58:15 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htlib/URL.cc (removeIndex): Fixed to make sure the matched file
	name is at the end of the URL.

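The point of the removeIndex fix above is that a default document name should be stripped only when it is the last path component. A minimal sketch of that check follows; it assumes a free function remove_index and a caller-supplied list of index names, neither of which is taken from htlib/URL.cc.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for URL::removeIndex(): strip a default document
// name only when "/<name>" is the very end of the URL.
std::string remove_index(const std::string& url,
                         const std::vector<std::string>& index_names)
{
    for (const std::string& name : index_names) {
        const std::string tail = "/" + name;
        if (url.size() >= tail.size() &&
            url.compare(url.size() - tail.size(), tail.size(), tail) == 0)
            return url.substr(0, url.size() - name.size());  // keep the trailing '/'
    }
    return url;  // no index name at the end: leave the URL untouched
}
```

Requiring the leading slash in the comparison keeps names such as myindex.html from matching, and anchoring at the end keeps a directory that happens to be called index.html from being truncated.
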
Tue Oct 2 09:34:43 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdoc/attrs.html (start_url): Added a reference and link to
	limit_urls_to, explaining how the two are tied together.

Fri Sep 28 17:19:45 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* contrib/htdig-3.1.6.spec: Fixed %install to make symlinks for
	htdump & htload, added these to %files list.

Fri Sep 28 15:38:00 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htsearch/Display.cc (displayMatch): Save rewritten URL in DocumentRef
	so it'll be used for star_patterns and template_patterns matching.

Fri Sep 28 14:25:29 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htsearch/Display.cc (buildMatchList, displayMatch),
	htsearch/htsearch.cc (main): Added calls to pass search_rewrite_rules
	to HtURLRewriter class and use it to rewrite URLs in results.

	* htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html,
	htcommon/defaults.cc: Added search_rewrite_rules attribute.

Thu Sep 27 16:34:51 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htlib/Makefile.in, htlib/HtRegex.cc, htlib/HtRegex.h,
	htlib/HtRegexReplace.cc, htlib/HtRegexReplace.h,
	htlib/HtRegexReplaceList.cc, htlib/HtRegexReplaceList.h,
	htlib/HtURLRewriter.cc, htlib/HtURLRewriter.h: Added new classes to
	support regular expressions and implement url_rewrite_rules attribute,
	using Geoff's variation of Andy Armstrong's implementation of this.

	* htlib/URL.h, htlib/URL.cc: Added URL::rewrite() method.

	* htlib/htString.h: Added Nth() method for HtRegex class.

	* htdig/Retriever.cc (got_href, got_redirect): Added calls to
	url.rewrite(), and debugging output for this.

	* htdig/htdig.cc (main): Added calls to make instance of
	HtURLRewriter class.

	* htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html,
	htcommon/defaults.cc: Added url_rewrite_rules attribute.

Mon Sep 17 16:52:07 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdoc/running.html: New documentation on how to run after configuring.
	* htdoc/rundig.html: New manual page for rundig script.
	* htdoc/install.html: Added link to running.html.
	* htdoc/contents.html: Added link to running.html, rundig.html, related
	projects. Updated links to contrib and developer site. Got rid of
	link to web site stats.

Fri Sep 14 09:18:38 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdig/Document.cc (RetrieveHTTP): Add port to Host: header when
	port is not default, as per RFC2616(14.23). Fixes bug #459969.

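As a small illustration of the Host: header rule cited above (RFC 2616, section 14.23), the port is appended only when it differs from the default for the scheme. The function name host_header and its default_port parameter are assumptions for this sketch, not code from htdig/Document.cc.

```cpp
#include <string>

// Build a Host: request header line. The port is included only when it
// is not the default for the scheme (80 is assumed here for plain HTTP).
std::string host_header(const std::string& host, int port, int default_port = 80)
{
    std::string line = "Host: " + host;
    if (port != default_port)
        line += ":" + std::to_string(port);
    return line + "\r\n";
}
```

For example, host_header("www.example.com", 8080) yields "Host: www.example.com:8080" followed by CRLF, while port 80 yields the bare host name.
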
Sat Sep 8 22:04:47 2001 Geoff Hutchison <ghutchis@wso.williams.edu>

	* acconfig.h, include/htconfig.h.in: Add undef for
	ALLOW_INSECURE_CGI_CONFIG, which if defined does about what you'd
	expect. (This is for any wrapper authors who don't want to rewrite
	but are willing to run insecure.)

	* htsearch/htsearch.cc: Only allow the -c flag to work when
	REQUEST_METHOD is undefined. Fixes PR#458013.

Fri Aug 31 16:00:37 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htlib/URL.cc (URL): Fixed to call normalizePath() even if URL
	is relative but with absolute path. Should fix bug #408586.

Fri Aug 31 15:21:49 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdig/HTML.h, htdig/HTML.cc (HTML, parse, do_tag): Fixed buggy
	handling of nested tags that independently turn off indexing, so
	</script> doesn't cancel <meta name=robots ...> tag. Add handling
	of <noindex follow> tag.

Fri Aug 31 14:33:41 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	[ Backport some 3.2.0b4 HTML parser changes. ]
	* htdig/HTML.cc (do_tag): Rewrite using Configuration class to
	separate tag attributes. Parse <object> tags properly, looking
	for data= attribute rather than src=. Add support for TITLE
	attributes in anchor and related tags. Treat <script></script>
	tags as noindex tags, much like <style></style> as suggested
	by Torsten.
	* htdig/HTML.cc (parse): Fix to prevent closing ">" from being passed
	to do_tag().

Wed Aug 29 10:20:55 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdoc/attrs.html (allow_in_form, build_select_lists,
	limit_normalized, server_aliases, server_max_docs, server_wait_time,
	url_part_aliases): Added clarifications to allow_in_form,
	server_aliases and url_part_aliases descriptions. Changed word
	"directive" to "attribute" where appropriate. Added cross-link to
	server_aliases from limit_normalized, and to allow_in_form from
	build_select_lists.

Mon Aug 27 17:22:56 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdig/HTML.cc (do_tag): Improve handling of whitespace in META
	refresh handling. Fixes bug #406244.

Mon Aug 27 16:38:43 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdig/HTML.cc (parse): Fixed delete [] text (was missing []), added
	simple optimizations for comment & noindex_start skipping, handle
	decoded < entity correctly.

Mon Aug 27 15:31:01 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	[ Backport 3.2.0b4 config files. ]
	* installdir/htdig.conf: Added .css to bad_extensions default,
	added missing closing ">", added mentions of accents & substring,
	fixed a couple typos in comments.
	* installdir/search.html: Add DTD tag for HTML 4 compliance.
	* installdir/{long, syntax, header, footer, wrapper, nomatch}.html:
	Add DTD tags, ALT attributes and remove bogus </select> tags to
	fix invalid HTML pointed out in PR#901. Change all <b> and <i> tags
	to the HTML-4.0 compliant <strong> and <em> tags.
	* htdoc/config.html: Updated with sample of latest htdig.conf and
	installdir/*.html, added blurb on wrapper.html.

Thu Jul 26 15:05:29 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htcommon/defaults.cc, htsearch/parser.cc (perform_or),
	htdoc/attrs.html, htdoc/cf_by{name,prog}.html: Added new attribute
	multimatch_method and used it to boost score on 'or' method with
	multiple matches.

Thu Jul 26 14:25:01 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htcommon/defaults.cc, htsearch/parser.cc, htdoc/attrs.html,
	htdoc/cf_by{name,prog}.html: Added new attribute boolean_syntax_errors
	and used it to generate syntax error messages for boolean method.

Wed Jul 25 23:39:00 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htnotify/htnotify.cc: Changed calls to EmailNotification class
	to avoid compiler warnings.

Wed Jul 25 23:15:24 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htcommon/defaults.cc, htsearch/htsearch.cc, htdoc/attrs.html,
	htdoc/cf_by{name,prog}.html: Added new attribute boolean_keywords
	and used it to make LOGICAL_WORDS and parse "words" using boolean
	method.

Wed Jul 25 22:31:19 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htlib/Dictionary.cc (Remove): Fixed so it doesn't clobber rest of
	chain when removing an entry, as suggested by Yariv Tal.

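The kind of bug fixed in Dictionary::Remove can be illustrated with a generic hash-bucket chain: the removed node's predecessor has to be re-linked to its successor, otherwise everything after the removed entry is lost. The Entry struct and chain_remove function below are hypothetical and are not taken from htlib/Dictionary.cc.

```cpp
#include <string>

// Hypothetical chain node for one hash bucket.
struct Entry {
    std::string key;
    int value;
    Entry* next;
};

// Unlink and delete the first node whose key matches.
// Returns true if a node was removed; the rest of the chain is preserved.
bool chain_remove(Entry*& head, const std::string& key)
{
    for (Entry** link = &head; *link != nullptr; link = &(*link)->next) {
        if ((*link)->key == key) {
            Entry* doomed = *link;
            *link = doomed->next;   // predecessor (or head) now points at the successor
            delete doomed;
            return true;
        }
    }
    return false;
}
```

Walking a pointer to the link, rather than a pointer to the node, handles removal of the head and of interior nodes with the same code path.
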
Wed Jul 25 22:06:08 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htcommon/defaults.cc: Add new attributes htnotify_replyto,
	htnotify_webmaster, htnotify_prefix_file, htnotify_suffix_file.

	* htdoc/attrs.html, htdoc/cf_by{name,prog}.html: Document them.

	* htnotify/htnotify.cc, htnotify/EmailNotification.{h,cc},
	htnotify/Makefile.in: Added in code from Richard Beton
	<richard.beton@roke.co.uk> to collect multiple URLs per e-mail
	address and allow customization of notification messages by
	reading in header/footer text as designated by the new attributes
	above.

	* htdoc/THANKS.html: Credit where due.

Wed Jul 25 21:38:21 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htcommon/defaults.cc: Added .css to bad_extensions, for consistency
	with 3.2.

	* htdoc/attrs.html: Ditto for default value. Also set examples for
	translate_* and modification_time_is_now to false so the example is
	different than default.

Wed Jul 25 17:26:07 2001 Geoff Hutchison <ghutchis@wso.williams.edu>

	* htdig/Document.cc (getParsable): Add conditional to catch
	text/css files to prevent these from being parsed as Plaintext.

	* htdig/htdig.cc: Quick fix to make the logging -l flag the
	default behavior. (Set to Retriever_logUrl from the start.)

	* htcommon/defaults.cc: Set modification_time_is_now to default to
	true (now that it works correctly). Also set translate_*
	attributes to true.

	* htdoc/htdig.html: Remove documentation for -l flag--now no
	longer used.

	* htdoc/attrs.html: Correct new default values for
	modification_time_is_now and translate_* attributes.

Tue Jul 24 16:12:45 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdoc/attrs.html: Added reference to maximum_page_buttons in the
	section on maximum_pages.

Tue Jul 24 15:38:39 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htsearch/Display.cc (generateStars): Add NSTARS variable for
	template output as suggested by Caleb Crome
	<ccrome@users.sourceforge.net> (except here precision is 0). Fixes
	feature request #405787.

	* htdoc/hts_templates.html: Add description of NSTARS variable
	above. (Actually copied hts_templates.html from 3.2.0b4.)

Tue Jul 24 14:21:53 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htsearch/Display.cc (expandVariables, outputVariable),
	htdoc/hts_templates.html: Add support for $=(var) template variable
	references, as suggested by Quim Sanmarti.

Tue Jul 24 14:12:06 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htsearch/Display.cc (readFile): Added missing fclose() call, and
	debugging message for when file can't be opened.

	* htsearch/Display.cc (displayParsedFile): Added debugging message
	for when file can't be opened.

Tue Jul 24 14:03:12 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htsearch/Display.cc (setVariables), htcommon/defaults.cc: Added
	maximum_page_buttons attribute, to limit buttons to less than
	maximum_pages. Fixes PR#731 & PR#781.
	* htdoc/attrs.html, htdoc/cf_by{name,prog}.html: Documented it.

Tue Jul 24 13:42:56 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdoc/hts_templates.html, htsearch/Display.cc (displayMatch):
	Add METADESCRIPTION variable.

Tue Jul 24 13:20:24 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htcommon/DocumentDB.{h,cc}: Added FindCoded() method to lookup
	docdb record with URL that's still encoded.

	* htsearch/Display.cc (display, displayMatch, buildMatchList):
	Use new method to avoid problems with URLs that are decoded and
	reencoded with another, more ambiguous url_part_aliases setting.
	Also fixed a problem with date range checking looking at ref before
	checking if it's null.

Thu Jul 12 11:45:05 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* contrib/conv_doc.pl, contrib/parse_doc.pl: Fixed EOF handling in
	dehyphenation, fixed to handle %xx codes in title made from URL.

	* contrib/doc2html/doc2html.pl, contrib/doc2html/pdf2html.pl,
	contrib/doc2html/swf2html.pl: Fixed to handle %xx codes in URL title.

Thu Jul 5 11:23:40 2001 Geoff Hutchison <ghutchis@wso.williams.edu>

	* db/dist/config.guess: Update with more recent GNU version that
	recognizes various flavors of Mac OS X automatically.

	* htlib/DB2_db.cc: Only #include <malloc.h> if we have it. Fixes
	compilation problems on Mac OS X.

	* htlib/String.cc: Include <iostream.h> instead of deprecated
	<stream.h>. Fixes compilation problems with Mac OS X.

	* htlib/Configuration.cc: Make sure we never try to operate on
	strings of no length--accessing string[-1] is a bug--exposed on
	Mac OS X.

Fri Jun 29 11:56:25 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdig/Retriever.cc (got_redirect): Allow the redirect to accept
	relative redirects instead of just full URLs.

Fri Jun 22 16:25:21 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdoc/THANKS.html: Credit Marc Pohl and Robert Marchand.

	* htsearch/Display.cc (buildMatchList): Fix date_factor calculation
	to avoid 32-bit int overflow after multiplication by 1000, and avoid
	repetitive time(0) call, as contributed by Marc Pohl. Also move the
	localtime() call up before gmtime() call, to avoid clobbering gmtime's
	returned static structure (my thinko).

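Both pitfalls named in the Display.cc entry above are general C/C++ issues, sketched here under stated assumptions. The scaled_age formula is invented for illustration and is not the date_factor computation in htsearch/Display.cc; utc_copy simply shows one way to keep a gmtime() result from being overwritten by a later localtime() call.

```cpp
#include <cstdint>
#include <ctime>

// Hypothetical scoring term: an age in seconds, scaled by 1000, divided by
// a span. Widening to 64 bits before the multiply avoids overflow; in
// 32-bit arithmetic, ages above roughly 24.8 days overflow once multiplied.
double scaled_age(std::int32_t now, std::int32_t doc_time, std::int32_t span)
{
    const std::int64_t age = static_cast<std::int64_t>(now) - doc_time;
    return static_cast<double>(age * 1000) / span;
}

// gmtime() and localtime() may return pointers to the same static buffer,
// so copy the result by value before calling the other function.
// (Assumes gmtime() succeeds for the given time.)
std::tm utc_copy(std::time_t t)
{
    return *std::gmtime(&t);
}
```

Copying the struct is an alternative to reordering the calls as the entry describes; either approach keeps the second call from clobbering data that is still needed.
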
Tue Jun 19 17:07:01 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htsearch/Display.cc (setVariables): Fixed handling of
	build_select_lists attribute, to deal with new restrict & exclude
	attributes.

Fri Jun 15 17:45:40 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdoc/require.html: Added mentions of accents, prefix & substring,
	taken from 3.2.0b4.
	* htdoc/htfuzzy: Added blurb on accents algorithm, taken from 3.2.0b4.
	* htdoc/attrs.html, htdoc/cf_by{name,prog}.html: Added entry for
	accents_db attribute for htfuzzy and htsearch. Mentioned accents
	algorithm in description of search_algorithm. Noted effect of
	locale setting on floating point numbers in search_algorithm
	and locale descriptions.

Fri Jun 15 16:47:09 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htfuzzy/Accents.{h,cc}, htfuzzy/Fuzzy.cc (getFuzzyByName),
	htfuzzy/htfuzzy.cc (main, usage), htfuzzy/Makefile.in: Added
	latest version of Robert Marchand's accents fuzzy match algorithm.
	* htcommon/defaults.cc: Added accents_db attribute for this.
	* htsearch/htsearch.cc: Fixed parsing of search_algorithm not to
	use comma as separator, because it may be needed as decimal point
	in some locales.

Fri Jun 15 16:30:19 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htfuzzy/Endings.cc (getWords): Undid change introduced in 3.1.3,
	in part. It now gets permutations of word whether or not it has
	a root, but it also gets permutations of one or more roots that
	the word has, based on a suggestion by Alexander Lebedev.
	* htfuzzy/EndingsDB.cc (createRoot): Fixed to handle words that have
	more than one root.
	* installdir/english.0: Removed P flag from wit, like and high, so
	they're not treated as roots of witness, likeness and highness, which
	are already in the dictionary.

Thu Jun 7 17:09:46 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htcommon/defaults.cc: Add new attribute use_doc_date to use
	document meta information for the DocTime() field.
	* htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
	Document it.
	* htdig/HTML.cc (do_tag): Call Retriever::got_time if use_doc_date
	is set and we run across a META date tag.
	* htdig/Retriever.h, htdig/Retriever.cc: Add new got_time
	function. When called, sets the DocTime field of the DocumentRef
	after parsing is completed. Currently assumes ISO 8601 format for
	the date tag.

Thu Jun 7 16:48:13 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htcommon/defaults.cc: Add new attribute any_keywords to allow
	ORing of keywords input parameter.
	* htsearch/htsearch.cc (addRequiredWords): Use it. Fix handling
	of empty search word list.
	* htsearch/Display.cc (excerpt, highlight): Fix handling of case
	where "words" is empty but "keywords" isn't.
	* htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
	Document any_keywords.

Thu Jun 7 16:34:41 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htcommon/defaults.cc: Add new attribute plural_suffix to set the
	language-dependent suffix for PLURAL_MATCHES contributed by Jesse.
	* htsearch/Display.cc (setVariables): Use it.
	* htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
	Document it.

Thu Jun 7 16:03:17 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htsearch/Display.{h,cc}, htcommon/defaults.cc: Added multi-excerpt
	feature and max_excerpts attribute, as contributed by Jim Cole.
	* htdoc/THANKS.html, htdoc/attrs.html, htdoc/cf_byname.html,
	htdoc/cf_byprog.html: Credit where due, and document attribute.

Thu Jun 7 15:27:33 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdig/ExternalParser.cc: Backported from 3.2.0b3, fixing these
	problems: no longer confused by "; charset=..." in Content-Type,
	avoids security problems with popen() and shell parsing untrusted URL
	(PR#542, PR#951), avoids predictable temporary file name if mkstemp()
	exists, binary output from external converter no longer mangled,
	less ambiguous error messages, opens temp. file in binary mode on
	non-Unix systems.

Thu Jun 7 15:10:14 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htcommon/DocumentDB.{h,cc}: Replace CreateSearchDB() with DumpDB(),
	add LoadDB(), both backported from 3.2.0b3.
	* htdig/htdig.cc (main, usage), htdig/Makefile.in, htdoc/htdig.html:
	Add handling of -m (minimal) option, file input for URLs, and arg 0
	handling for htdump & htload.
	* htdig/HTML.cc (do_tag): Change all white space to blanks in meta
	description tag, for proper ASCII record dumps by htdump, and to fix
	bug #405771.
	* htlib/String.cc (= operator), htlib/htString.cc: change handling
	of 0 length strings. Add readLine() for htload support.

Thu Jun 7 14:41:42 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htdig/Retriever.cc (got_href): Fix hop count mishandling.

Thu Jun 7 14:23:47 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htmerge/db.cc (mergeDB), htmerge/words.cc (mergeWords),
	installdir/rundig: Fix various htmerge bugs. Quotes the temp.
	directory name and word_list name (PR#872). Correctly handles
	words beginning with +, - and ! when in extra_word_characters
	(PR#952). Corrects problems with bad wordlists generated by
	htmerge -m causing it to lose entries in words.db and problems
	with the sort program using non-ASCII collating having a similar
	effect.

Thu Jun 7 14:13:56 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

	* htsearch/htsearch.cc (main), htsearch/Display.cc (setVariables,
	createURL, buildMatchList), htdoc/THANKS.html, htdoc/hts_form.html,
	htdoc/hts_templates.html: Add Mike Grommet's date range search
	feature.

Thu Jun 7 13:57:06 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdig/Retriever.cc (GetLocal, GetLocalUser): Fix to allow compiling
|
|
on AIX & other non-GNU compilers.
|
|
|
|
Thu Jun 7 13:52:20 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htsearch/Display.cc (setVariables): Extend the handling of
|
|
build_select_lists to handle select multiple, radio buttons and
|
|
checkboxes.
|
|
* htdoc/attrs.html, htdoc/hts_selectors.html: Describe this.
|
|
|
|
Thu Jun 7 13:40:13 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htfuzzy/Exact.cc (Exact), htfuzzy/Prefix.cc (Prefix): Set the
|
|
name field to the class name, as suggested by Jesse.
|
|
|
|
Thu Jun 7 13:27:35 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* contrib/htdig-3.1.6.spec, contrib/htdig-3.1.6-conf.patch,
|
|
htdoc/where.html, .version, README: Bump to version 3.1.6.
|
|
|
|
Thu Jun 7 11:58:28 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* contrib/multidig/*: Backport from 3.2.0b3, including fixes below.
|
|
|
|
* contrib/multidig/Makefile, gen-collect, db.conf, multidig.conf:
|
|
Add missing trailing newlines as pointed out by Doug Moran
|
|
<dmoran@dougmoran.com>.
|
|
|
|
* contrib/multidig/Makefile (install): Make sure scripts have a+x
|
|
permissions. Pointed out by Doug Moran.
|
|
|
|
* contrib/multidig/new-collect: Fix typo to ensure MULTIDIG_CONF
|
|
is set correctly.
|
|
|
|
Thu Jun 7 11:37:52 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* contrib/README: Add in descriptions for web site contrib directory,
|
|
acroconv.pl & conv_doc.pl.
|
|
* contrib/examples/rundig.sh: Update to most recent version for 3.1.x.
|
|
* contrib/htparsedoc/htparsedoc: Add in contributed bug fixes from
|
|
Andrew Bishop to work on SunOS 4.x machines.
|
|
* contrib/acroconv.pl: Added external converter script to convert
|
|
PDFs with acroread.
|
|
|
|
Thu Jun 7 10:41:05 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htlib/ParsedString.cc (get), htsearch/Display.cc (expandVariables):
|
|
Use isalnum() instead of isalpha() to allow digits in attribute and
|
|
variable names, allow '-' in variable names too for consistency.
|
|
|
|
Wed Jun 6 17:13:49 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdig/HTML.cc (do_tag): Make parsing of meta robots tag case
|
|
insensitive.
|
|
|
|
Wed Jun 6 15:31:00 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* contrib/doc2html/DETAILS, contrib/doc2html/README,
|
|
contrib/doc2html/doc2html.cfg, contrib/doc2html/doc2html.sty,
|
|
contrib/doc2html/doc2html.pl, contrib/doc2html/pdf2html.pl,
|
|
contrib/doc2html/swf2html.pl: Added version 3.0 of doc2html,
|
|
contributed by David Adams <D.J.Adams@soton.ac.uk>.
|
|
|
|
Mon Jun 4 10:31:45 CEST 2001 Gabriele Bartolini <angusgb@users.sourceforge.net>
|
|
|
|
* htdoc/cf_byname.html: I forgot to insert the 'restrict' attribute.
|
|
|
|
Wed May 30 11:30:43 2001 Gabriele Bartolini <angusgb@users.sourceforge.net>
|
|
|
|
* htsearch/htsearch.cc: two new attributes, used by htsearch, have
|
|
been added: restrict and exclude. They can now give more control
|
|
to template customisation through configuration files, allowing
|
|
to restrict or exclude URLs from search without passing
|
|
any CGI variables (although this specification overrides the
|
|
configuration one).
|
|
* htcommon/defaults.cc: ditto
|
|
* htdoc/attrs.html: ditto
|
|
* htdoc/cf_byname.html: ditto
|
|
* htdoc/cf_byprog.html: ditto
|
|
* htdoc/hts_form.html: ditto
|
|
|
|
Sat May 5 21:43:32 2001 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* configure.in, configure: Add tests for wait.h, sys/wait.h,
|
|
mkstemp() and malloc.h.
|
|
|
|
* acconfig.h, include/htconfig.h.in: Update with autoheader for
|
|
new tests.
|
|
|
|
* htlib/regex.[h,c]: Update with backports from 3.2.0b4 development.
|
|
|
|
Tue Feb 29 23:04:04 2000 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htlib/DB2_db.cc (Error): Simply fprint the error message on
|
|
stderr. This is not a method since the db.h interface expects a C
|
|
function.
|
|
(db_init): Don't set db_errfile, instead set errcall to point to
|
|
the new Error function.
|
|
|
|
Fri Feb 25 10:11:50 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdoc/attrs.html (maximum_pages): Describe new behaviour (as of
|
|
3.1.4), where this limits total matches shown.
|
|
|
|
Thu Feb 24 20:24:24 2000 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdoc/FAQ.html: Update to refer to 3.1.5 and edit comments about 3.2.
|
|
|
|
Thu Feb 24 15:20:08 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdoc/RELEASE.html, htdoc/main.html: Updated notes for 3.1.5 release.
|
|
|
|
Thu Feb 24 10:37:45 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdoc/attrs.html (external_parsers): Add references to FAQ 4.8 & 4.9.
|
|
(local_default_doc): Give an expanded example.
|
|
(logging): Explain log entry format.
|
|
(star_blank): Fix some old typos (incorrect references to other attrs.)
|
|
|
|
Wed Feb 23 13:58:24 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htcommon/cgi.cc(init): Fixed bug: array must be freed by
|
|
delete [] buf, not just delete buf; (from Vadim).
|
|
* installdir/syntax.html: Fixed a $(WORDS) I'd missed earlier.
|
|
|
|
Tue Feb 22 12:40:22 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdoc/RELEASE.html, htdoc/main.html: Updated notes for 3.1.5 release.
|
|
* htlib/URL.cc (URL, normalizePath): Fix PR#779, to handle relative
|
|
URLs correctly when there's a trailing ".." or leading "//".
|
|
|
|
Thu Feb 17 15:58:53 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdoc/RELEASE.html, htdoc/main.html: Add notes for 3.1.5 release.
|
|
* htdoc/TODO.html, htdoc/author.html, htdoc/bugs.html,
|
|
htdoc/cf_general.html, htdoc/cf_types.html, htdoc/cf_variables.html,
|
|
htdoc/config.html, htdoc/howitworks.html, htdoc/htdig.html,
|
|
htdoc/htfuzzy.html, htdoc/htmerge.html, htdoc/htnotify.html,
|
|
htdoc/hts_form.html, htdoc/hts_general.html, htdoc/hts_method.html,
|
|
htdoc/install.html, htdoc/isp.html, htdoc/mailing.html,
|
|
htdoc/meta.html, htdoc/notification.html, htdoc/require.html,
|
|
htdoc/uses.html, htdoc/where.html: Update copyright date and fix
|
|
last modified date for automatic CVS update.
|
|
|
|
Thu Feb 17 14:37:18 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* installdir/htdig.conf: quote all HTML tag parameters.
|
|
* htsearch/TemplateList.cc (createFromString), installdir/long.html,
|
|
installdir/short.html: Use $&(URL) in templates.
|
|
|
|
Thu Feb 17 14:01:34 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* contrib/htdig-3.1.5.spec: Fix silly typos in %post script,
|
|
make cron script a %config file.
|
|
|
|
Thu Feb 17 10:34:05 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
[ Improve htsearch's HTML 4.0 compliance ]
|
|
* htsearch/TemplateList.cc (createFromString): Use file name rather
|
|
than internal name to select builtin-* templates, use $&(TITLE) in
|
|
templates and quote HTML tag parameters.
|
|
* installdir/long.html, installdir/short.html: Use $&(TITLE) in
|
|
templates and quote HTML tag parameters.
|
|
* htsearch/Display.cc (setVariables): quote all HTML tag parameters
|
|
in generated select lists.
|
|
* installdir/footer.html, installdir/header.html,
|
|
installdir/nomatch.html, installdir/search.html,
|
|
installdir/syntax.html, installdir/wrapper.html:
|
|
Use $&(var) where appropriate, and quote HTML tag parameters.
|
|
|
|
Thu Feb 17 10:00:26 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* contrib/htdig-3.1.5.spec: Fix %post script to add more descriptive
|
|
htdig.conf entries.
|
|
|
|
Wed Feb 16 16:26:05 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* contrib/htdig-3.1.5.spec, contrib/htdig-3.1.5-conf.patch,
|
|
htdoc/where.html, .version, README: Bump to version 3.1.5.
|
|
* htdoc/THANKS.html: Added new contributors.
|
|
* htdoc/FAQ.html, htdoc/main.html: Updated to versions from web site.
|
|
|
|
Wed Feb 16 15:49:28 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htlib/Configuration.h, htlib/Configuration.cc: split Add() method
|
|
into Add() and AddParsed(), so that only config attributes get parsed.
|
|
Use AddParsed() only in Read() and Defaults().
|
|
|
|
Wed Feb 16 15:02:47 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htlib/URL.h (encodeURL): Change list of valid characters to
|
|
include only unreserved ones.
|
|
* htlib/cgi.cc (init): Allow "&" and ";" as input parameter separators.
|
|
* htsearch/Display.cc (createURL): Encode each parameter separately,
|
|
using new unreserved list, before piecing together query string, to
|
|
allow characters like "?=&" within parameters to be encoded.
|
|
|
|
Wed Feb 16 14:42:02 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htsearch/Display.cc (encodeSGML, excerpt): Add encoding for
|
|
characters that could pose problems in HTML output.
|
|
* htsearch/Display.cc (expandVariables, outputVariables): Add support
|
|
for $&(var) and $%(var) template variable references. This should
|
|
fix PR#750, once we use this in common/*.html.
|
|
|
|
Tue Feb 15 17:21:08 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
[ Applied a whole collection of patches and fixes from the archives ]
|
|
* htdig/Server.cc (robotstxt): apply more rigorous parsing of
|
|
multiple user-agent fields, and use only the first one.
|
|
|
|
* htdig/Retriever.cc(GetLocal, GetLocalUser): Add URL-decoding
|
|
enhancements to local_urls, local_user_urls & local_default_doc,
|
|
to allow hex encoding of special characters.
|
|
* htdoc/attrs.html: Document these.
|
|
|
|
* htdig/Retriever.cc (IsValidURL): Fix problem with
|
|
valid_extensions when an "extension" would include part of a
|
|
directory path or server name, as contributed by Warren Jones.
|
|
Also fix problem with valid_extensions matching failure when URL
|
|
parameters follow extension, as reported by fxbois@cybercable.fr.
|
|
|
|
* htdig/Document.cc (RetrieveLocal), htdig/Document.h,
|
|
htdig/Retriever.cc(Initial, parse_url, GetLocal, GetLocalUser,
|
|
IsLocalURL, got_href, got_redirect), htdig/Retriever.h,
|
|
htdig/Server.cc(Server), htdig/Server.h: Apply Paul B. Henson's
|
|
enhancements to local_urls, local_user_urls & local_default_doc.
|
|
* htdoc/attrs.html: Document these.
|
|
|
|
* htsearch/htsearch.cc (setupWords): Fix problem reported by
|
|
D.J. Adams, in which bad_words removal failed on upper-case
|
|
search words.
|
|
|
|
* htsearch/Display.cc(setVariables), htcommon/defaults.cc: Added
|
|
build_select_lists attribute, to generate selector menus in forms.
|
|
* htdoc/hts_selectors.html: Added this page to explain this new
|
|
feature, plus other details on select lists in general.
|
|
* htdoc/hts_templates.html: Added relevant links to related attributes
|
|
and selectors documentation.
|
|
* htdoc/attrs.html, htdoc/cf_by{name,prog}.html: Added relevant
|
|
explanations and links to selectors documentation.
|
|
|
|
* htlib/QuotedStringList.cc (Create): fix PR#743, where quoted string
|
|
lists didn't allow embedded quotes of opposite sort in strings
|
|
(e.g. "'" or '"'), and fix to avoid overrunning end of string
|
|
if it ends with backslash.
|
|
|
|
* htcommon/WordList.cc (valid_word): Applied Marc Pohl's fix to make
|
|
this 8-bit clean on Solaris.
|
|
|
|
* contrib/conv_doc.pl, contrib/parse_doc.pl: Applied Warren Jones's
|
|
changes to these scripts.
|
|
|
|
* htdig/PDF.cc (parseNonTextLine): Fix bogus escape sequences
|
|
around Title parsing. (Fixes PR#740)
|
|
|
|
* htsearch/Display.cc (display, displaySyntaxError),
|
|
htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html,
|
|
htcommon/defaults.cc: Add new attribute "nph" to send out
|
|
non-parsed headers for servers that do not supply HTTP headers on
|
|
CGI output (e.g. IIS). If nph is set, send out HTTP OK header,
|
|
as suggested by Matthew Daniel <mdaniel@scdi.com> (PR#727)
|
|
|
|
* htdig/Document.cc (getdate): avoid strftime() altogether on
|
|
filled-in tm structure, to avoid recurring segfault problems. (PR#734)
|
|
|
|
* htlib/strptime.cc (mystrptime): Use Warren Jones's fix to deal
|
|
with a web server that returns dates with a two digit year field.
|
|
(Fixes PR#770)
|
|
|
|
* htdig/HTML.cc (HTML, parse, do_tag), htcommon/defaults.cc,
|
|
htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
|
|
Add max_keywords attribute to limit meta keyword spamming.
|
|
|
|
Wed Dec 8 18:19:32 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdoc/FAQ.html, htdoc/bugs.html: Update to refer to latest versions.
|
|
(Update for 3.1.4 release.)
|
|
|
|
Wed Dec 8 18:10:27 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htlib/QuotedStringList.cc (Create): Make sure that an empty
|
|
token isn't ignored.
|
|
|
|
Tue Dec 7 10:26:58 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htsearch/Display.cc (setVariables): Fix a compilation error by
|
|
making a statement with '?' an explicit if-else statement.
|
|
|
|
* htdoc/RELEASE.html: Change case_sensitive fix to a bug-fix,
|
|
update release date for 12/9/99. (We certainly didn't release yesterday!)
|
|
|
|
Mon Dec 6 22:17:21 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htsearch/Display.cc(Display): Add missing call to setupTemplates(),
|
|
for handling template_patterns. Oops!
|
|
* htdoc/attrs.html: Fixed a couple typos in new attributes.
|
|
* htdoc/ChangeLog: Update to latest version.
|
|
|
|
Mon Dec 6 16:41:04 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdoc/main.html: Update news with latest version.
|
|
* htdig/htdig.cc(main), htdig/Document.cc(Document),
|
|
htcommon/defaults.cc, htdoc/attrs.html, htdoc/cf_byname.html,
|
|
htdoc/cf_byprog.html: Add authorization attribute, settable by
|
|
htdig -u. Also fixes PR#490, by setting authentication before
|
|
robots.txt is fetched.
|
|
* htdoc/RELEASE.html: Update with latest fix.
|
|
|
|
Fri Dec 3 17:31:47 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htcommon/DocumentRef.cc(Clear): Set docHopCount & docSig to 0,
|
|
and clear docEmail, docNotification & docSubject strings to have
|
|
a clean slate for Deserialize(), which assumes 0/empty for these.
|
|
Fixes problem with hop counts getting clobbered.
|
|
* htdoc/RELEASE.html: Update with latest fix.
|
|
* htdoc/ChangeLog: Update to latest version.
|
|
|
|
Fri Dec 3 12:12:19 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdig/Document.cc: removed vestiges of internal Postscript
|
|
support that never worked, and removed test for application/msword,
|
|
which is handled only by external parser.
|
|
* htdig/Makefile.in: removed Postscript.o from list.
|
|
* htdig/Retriever.cc(parse_url): Fix compilation error;
|
|
(Initial, got_href, got_redirect): Try to get the local filename
|
|
for a server's robots.txt file and pass it along to the newly
|
|
generated server.
|
|
* htdig/Server.cc(Server): Retrieve the robots.txt file from the
|
|
filesystem when possible; fix compilation error.
|
|
* htdig/Server.h(Server): Add local_robots_file parameter to Server().
|
|
* htlib/HtWordType.h, htlib/HtWordType.cc: fix compilation errors.
|
|
|
|
Fri Dec 3 10:52:57 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdig/HTML.cc(parse, do_tag): Add handling of <img alt=...> text,
|
|
fix parsing of words in meta tags, disable indexing of meta tags
|
|
when "noindex" state in effect, fix calculations of word positions
|
|
to more accurately reflect relative positions.
|
|
* htlib/HtWordType.h, htlib/HtWordType.cc: Add HtWordToken() function,
|
|
to replace strtok() in HTML parser.
|
|
* htdoc/RELEASE.html: Update with latest fixes.
|
|
|
|
Fri Dec 3 09:02:55 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htlib/Configuration(Add): handle strings in single quotes, as in
|
|
parm='value'.
|
|
|
|
Thu Dec 2 16:14:28 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdoc/attrs.html: Add Tom Metro's suggested revisions for pdf_parser
|
|
and external_parsers.
|
|
|
|
Thu Dec 2 15:15:03 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdoc/mailing.html: Updated to version from htdig.org web site.
|
|
* htcommon/defaults.cc: Add missing no_page_number_text and
|
|
page_number_text attribute definitions.
|
|
* htdoc/attrs.html(modification_time_is_now): Make the description
|
|
a bit clearer as to how it may cut down on reindexing.
|
|
|
|
Thu Dec 2 13:46:11 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdig/Retriever.cc(parse_url), htdig/Server.cc(Server),
|
|
htcommon/defaults.cc, htdoc/attrs.html, htdoc/cf_byname.html,
|
|
htdoc/cf_byprog.html: Add support for local_urls_only attribute.
|
|
* htdoc/RELEASE.html: Update with latest feature.
|
|
|
|
Thu Dec 2 11:02:07 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htlib/URL.cc(ServerAlias): Fix server_aliases processing to prevent
|
|
infinite loop (as for local_urls in PR#688).
|
|
|
|
Wed Dec 1 17:23:24 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdig/Retriever.cc(parse_url), htdig/Server.h: add IsDead() methods
|
|
to query and set server status, use them in Retriever to avoid repeated
|
|
HTTP requests to a dead server. (Needed for persistent local stuff.)
|
|
|
|
Wed Dec 1 16:56:28 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdig/Retriever.cc(GetLocal): Fix error in GetLocalUser() return
|
|
value check, as suggested by Vadim.
|
|
* contrib/conv_doc.pl: Added a sample external converter script.
|
|
* htdoc/THANKS.html: A couple more additions.
|
|
|
|
Tue Nov 30 15:02:25 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdig/Retriever.cc(IsValidURL): Fix compilation error in
|
|
valid_extensions list handling.
|
|
* contrib/htdig-3.1.4.spec, contrib/htdig-3.1.4-conf.patch:
|
|
Added sample RPM spec file and config patch for it.
|
|
|
|
Tue Nov 30 14:01:51 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdoc/where.html: Bump to version 3.1.4.
|
|
* htdoc/THANKS.html: Added new contributors.
|
|
* htdoc/isp.html, htdoc/uses.html, htdoc/main.html, htdoc/mailing.html:
|
|
Updated to versions from htdig.org web site.
|
|
|
|
Tue Nov 30 13:01:20 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdoc/RELEASE.html: Add release notes for 3.1.4 release.
|
|
* .version, README: Bump for 3.1.4.
|
|
|
|
Tue Nov 30 11:03:34 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdoc/attrs.html(backlink_factor): Added Geoff's clarification of
|
|
what this attribute does.
|
|
|
|
Tue Nov 30 09:47:05 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdig/Document.cc(RetrieveLocal): Handle common extensions for
|
|
text/plain, application/pdf & application/postscript.
|
|
* htdig/Retriever.cc(IsValidURL): Add valid_extensions list handling,
|
|
make it and bad_extensions case insensitive.
|
|
* htcommon/defaults.cc: Add config attribute valid_extensions,
|
|
with default as empty.
|
|
* htdoc/attrs.html, htdoc/cf_by{name,prog}.html: Document it.
|
|
|
|
Tue Nov 30 09:02:02 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdig/Retriever.cc(got_href & got_redirect): remove all of Patrick's
|
|
case insensitive server code, to replace it with Geoff's fix to URL.cc.
|
|
* htlib/URL.cc(normalizePath, path): If not case_sensitive,
|
|
lowercase the URL. Should ensure that all URLs are appropriately
|
|
lowercased, regardless of where they're generated.
|
|
|
|
Mon Nov 29 20:25:01 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdig/Retriever.cc, htdig/Retriever.h, htdig/Server.cc(push),
|
|
htdig/Server.h: added Alexis's patch for persistent local digging
|
|
even if HTTP server is down. Also made new GetLocal() method
|
|
call GetLocalUser() itself, to simplify its use, and made it
|
|
non-private, for eventual use by Server code.
|
|
|
|
Mon Nov 29 19:18:20 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdig/Retriever.cc(got_href & got_redirect): corrections to case
|
|
insensitive server fix, to handle redirects, to make more thorough
|
|
use of mapped URL, and to update it after normalization.
|
|
|
|
Fri Nov 26 17:14:46 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdig/Document.cc(RetrieveHTTP): always c.close() the connection
|
|
when returning.
|
|
* htdig/HTML.cc(HTML & do_tag): add code to turn off indexing between
|
|
<style> and </style> tags.
|
|
|
|
Fri Nov 26 16:31:06 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htlib/Configuration.cc(Read): fixed to allow final line without
|
|
terminating newline character, rather than ignoring it.
|
|
* htlib/String.cc(write): added Alexis Mikhailov's fix to bump up
|
|
pointer after writing a block.
|
|
* htsearch/Display.cc(setVariables): added Alexis Mikhailov's fix
|
|
to check the number of pages against maximum_pages at the right time.
|
|
(Put it even earlier, to make sure nPages is at least 1.)
|
|
* htsearch/Display.cc(generateStars): Remove extra newline after
|
|
STARSRIGHT and STARSLEFT variables, noted by Torsten Neuer
|
|
<tneuer@inwise.de>.
|
|
|
|
Wed Nov 24 20:33:13 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* installdir/htdig.conf: Add bad_extensions to make it
|
|
more obvious to users how to exclude certain document types.
|
|
Fix the comments for search_algorithm to refer to all the current
|
|
possibilities. Add example of no_excerpt_show_top attribute in
|
|
line with most users' expectations. (Geoff's changes)
|
|
|
|
Wed Nov 24 20:02:32 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* installdir/search.html (Match): Add Boolean to default search
|
|
form, as suggested by PR#561.
|
|
|
|
Tue Nov 23 23:03:45 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htsearch/Display.cc(setupTemplates), htsearch/Display.h: fixed a
|
|
couple of compilation errors in template_patterns code.
|
|
|
|
Tue Nov 23 22:16:31 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdig/Retriever.cc(got_href): Applied Patrick's case insensitive
|
|
server fix, to lowercase all URLs if case_sensitive is false.
|
|
|
|
Tue Nov 23 22:08:22 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htlib/StringList.cc(Join): Applied Loic's patch to fix memory leak.
|
|
|
|
Tue Nov 23 21:52:18 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
[Applied patch from Hanno Mueller <kontakt@hanno.de>, which includes...]
|
|
* contrib/README: Add scriptname directory.
|
|
* contrib/scriptname/*: An example of using htsearch within
|
|
dynamic SSI pages.
|
|
* htcommon/defaults.cc: Add script_name attribute to override
|
|
SCRIPT_NAME CGI environment variable.
|
|
* htdoc/FAQ.html: Update question 4.7 based on including htsearch
|
|
as a CGI in SSI markup.
|
|
* htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html,
|
|
htdoc/hts_templates.html: Update based on behavior of script_name
|
|
attribute.
|
|
* htsearch/Display.cc: Set SCRIPT_NAME variable to attribute
|
|
script_name if set and CGI environment variable if undefined.
|
|
|
|
Tue Nov 23 21:29:03 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdoc/FAQ.html: Added the past few months' updates to the FAQ.
|
|
|
|
Tue Nov 23 21:20:35 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htcommon/defaults.cc, htsearch/Display.h, htsearch/Display.cc,
|
|
htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html,
|
|
htdoc/hts_templates.html: add template_patterns attribute, to select
|
|
result templates based on URL patterns.
|
|
|
|
Tue Nov 23 20:52:38 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htlib/cgi.h, htlib/cgi.cc(cgi & init), htsearch/htsearch.cc
|
|
(main & usage): allow a query string to be passed as an argument.
|
|
|
|
Tue Nov 23 20:35:05 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htsearch/Display.cc(setVariables & createURL),
|
|
htsearch/htsearch.cc(main), htdoc/hts_templates.html: handle keywords
|
|
input parameter like others, and make it propagate to followups.
|
|
|
|
Tue Nov 23 20:25:45 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdoc/attrs.html: removed vestigial references to MAX_MATCHES
|
|
template variables in search_results_{header,footer}.
|
|
|
|
* htdoc/hts_form.html: add disclaimer about keywords parameter not
|
|
being limited to meta keywords.
|
|
|
|
* htdoc/meta.html: add description of "keywords" meta tag property.
|
|
add links to keywords_factor & meta_description_factor attributes.
|
|
|
|
Tue Nov 23 20:07:20 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htsearch/Display.cc(setVariables & hilight): added Sergey's idea
|
|
for start_highlight, end_highlight & page_number_separator attributes.
|
|
* htcommon/defaults.cc: added defaults for these.
|
|
* htdoc/attrs.html, htdoc/cf_by{name,prog}.html: documented them.
|
|
|
|
Tue Nov 23 19:58:28 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdig/ExternalParser.cc: added support for external converters
|
|
as extension to external_parsers attribute.
|
|
* htdoc/attrs.html: Updated external_parsers with new description
|
|
and examples of external converters.
|
|
|
|
Tue Nov 23 19:52:27 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdig/HTML.cc(transSGML), htdig/SGMLEntities.cc(translateAndUpdate):
|
|
Fix the infamous problem in htdig 3.1.3 of mangling URL parameters that
|
|
contain bare ampersands (&), and not converting &amp; entities in URLs.
|
|
|
|
* htdig/Retriever.cc(IsLocal & IsLocalUser): Fix PR#688, where
|
|
htdig goes into an infinite loop if an entry in local_urls
|
|
(or local_user_urls) is missing a '=' (or a ',').
|
|
|
|
* htcommon/cgi.cc(cgi): Fix bug in reading long queries via POST
|
|
method (PR#668).
|
|
|
|
* htnotify/htnotify.cc(send_notification): apply Jason Haar's fix
|
|
to quote the sender name "ht://Dig Notification Service".
|
|
|
|
Wed Sep 22 11:12:38 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdoc/ChangeLog, htdoc/isp.html, htdoc/FAQ.html,
|
|
htdoc/RELEASE.html, htdoc/THANKS.html, htdoc/attrs.html,
|
|
htdoc/bugs.html, htdoc/contents.html, htdoc/main.html,
|
|
htdoc/require.html, htdoc/uses.html, htdoc/where.html: Update for
|
|
3.1.3 release and synch with latest versions from the website.
|
|
|
|
Wed Sep 15 17:54:31 1999 Alexander Bergolth <leo@leo.wu-wien.ac.at>
|
|
|
|
A few changes to satisfy the AIX xlC compiler:
|
|
|
|
* htdig/htdig.cc: Moved variable declaration out of case block.
|
|
|
|
* configure.in, htconfig.in: Add check for sys/select.h.
|
|
Add "long unsigned int" to the possible getpeername_length types.
|
|
|
|
* htlib/Connection.cc: Include sys/select.h.
|
|
|
|
Sun Sep 12 15:02:19 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* .version: Bump for 3.1.3.
|
|
|
|
* README: Bump first line for 3.1.3 release, remove mention of rx
|
|
directory.
|
|
|
|
* htdoc/ChangeLog: Update with latest version.
|
|
|
|
* htdoc/RELEASE.html: Add release notes for 3.1.3 release.
|
|
|
|
Thu Sep 9 14:52:19 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* contrib/parse_doc.pl: fix bug in pdf title extraction.
|
|
|
|
Wed Sep 1 15:58:14 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdig/Retriever.cc(got_word): add code to check for compound words
|
|
and add their component parts to the word database.
|
|
|
|
* htdig/PDF.cc(parseString), htdig/Plaintext.cc(parse): Don't strip
|
|
punctuation or lowercase the word before calling got_word. That
|
|
should be left up to got_word & Word methods.
|
|
|
|
* htlib/StringMatch.h, htlib/StringMatch.cc(Pattern, IgnoreCase):
|
|
Add an IgnorePunct() method, which allows matches to skip over valid
|
|
punctuation, change Pattern() and IgnoreCase() to accommodate this.
|
|
|
|
* htsearch/htsearch.cc(main, createLogicalWords): use IgnorePunct()
|
|
to highlight matching words in excerpts regardless of punctuation,
|
|
toss out old origPattern, and don't add short or bad words to
|
|
logicalPattern.
|
|
|
|
* htlib/HtWordType.h, htlib/HtWordType.cc(Initialize): set up and
|
|
use a lookup table to speed up HtIsWordChar() and HtIsStrictWordChar().
|
|
|
|
Wed Sep 1 15:48:13 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
* htdig/PDF.cc(parse), htcommon/defaults.cc, htdoc/attrs.html:
|
|
Fix PDF.cc to handle acroread in Acrobat 4, which has a bug with
|
|
the -pairs option. It turns out that even without the -pairs
|
|
option, acroread 4 is still prone to segmentation violations when
|
|
generating PostScript, so acroread 3 is a better choice anyway.
|
|
|
|
* htdoc/FAQ.html: Added the past few months' updates to the FAQ.
|
|
|
|
* contrib/parse_doc.pl: Updated to latest version, adapted for
|
|
xpdf 0.90.
|
|
|
|
Wed Sep 1 15:39:41 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
|
|
|
|
Applied "bugfixes" patch collection, which I had posted to
|
|
htdig@htdig.org mailing list in August. Changes include...
|
|
|
|
* htsearch/Display.cc(expandVariables): Fix problem with $(VAR)
|
|
at end of template string not being expanded.
|
|
|
|
* htlib/URL.cc(URL): Fix PR#566 by setting the correct length of the
|
|
string being matched. 'http://' is 7 characters. Submitted by
|
|
<wolfgang.pichler@creditanstalt.co.at>.
|
|
|
|
* htdig/HTML.h, htdig/HTML.cc(do_tag, transSGML): Fix the HTML parser
|
|
to decode SGML entities within tag attributes.
|
|
|
|
* htlib/URL.cc(ServerAlias): Fix server_aliases entries so port
|
|
defaults to 80 if omitted.
|
|
|
|
* htlib/URL.cc(removeIndex): Fix the infamous problem with files
|
|
like left_index.html not getting indexed. PR#543 & PR#585.
|
|
|
|
* htdig/PDF.cc(parseNonTextLine): Fixed a bug in the PDF parser:
|
|
when the Title header was just the temporary file name, it
|
|
wouldn't be used, but it also wouldn't be cleared from the
|
|
_parsedString variable, so it ended up polluting the document
|
|
excerpt.
|
|
|
|
* htdig/Document.cc(RetrieveHTTP): Added error messages for unknown
|
|
hosts.
|
|
|
|
* htlib/cgi.cc(cgi): Fix PR#572, where htsearch crashed if
|
|
CONTENT_LENGTH was not set but REQUEST_METHOD was.
|
|
|
|
* htdig/HTML.cc(do_tag): Fix <meta> robots parsing to allow
|
|
multiple directives to work correctly. Fixes PR#578, as provided
|
|
by Chris Liddiard <c.h.liddiard@qmw.ac.uk>.
|
|
|
|
* htsearch/htsearch.cc(main): Allow multiple keywords input
|
|
parameters in search forms.
|
|
|
|
* htdig/Document.cc(Reset, readHeader): Fix the bug in the handling
|
|
of modification_time_is_now.
|
|
|
|
* htfuzzy/Fuzzy.cc(getWords), htfuzzy/Metaphone.cc(vscode,generateKey):
|
|
Should fix PR#514 in the bug database. It's Geoff's first attempt,
|
|
with a minor correction, plus an added test in the vscode macro,
|
|
which is where the problem seemed to be happening. This won't
|
|
map accented vowels to their unaccented counterparts, but
|
|
it should hopefully put an end to the segmentation faults.
|
|
|
|
* include/htconfig.h.in, htcommon/WordReference.h,
|
|
htcommon/WordList.cc(Word, Flush, BadWordFile),
htcommon/DocumentRef.cc(AddDescription), htcommon/defaults.cc,
htsearch/parser.cc(perform_push), htdoc/attrs.html,
htdoc/cf_byname.html, htdoc/cf_byprog.html: Change the maximum word
length into a run-time option, rather than compile-time.

* htsearch/Display.cc(displayMatch): Applied Torsten Neuer's
<tneuer@inwise.de> fix for PR#554.

* htdig/HTML.cc(HTML, do_tag): Added support for <embed>, <object>
and <link> tags.

* htdig/htdig.cc(main): Applied Geoff's patch to hide the
username/password in the command line arguments.

* htdig/Document.cc(readHeader): Fixed a few problems with header
parsing, including PR#535 & PR#557.

* htdig/Document.cc(getdate): This should help with PR#81 & PR#472,
where strftime() would crash on some systems. Idea submitted
by benoit.sibaud@cnet.francetelecom.fr.

* COPYING, htdoc/COPYING, Makefile.in: Updated the FSF address
in COPYING & Makefile.in. PR#595.

* htdig/Retriever.cc(IsValidURL): Fix PR#493, to avoid rejecting
a valid URL with ".." in it.

* htlib/URL.cc(parse): Fix PR#348, to make sure a missing
or invalid port number will get set correctly.

* htsearch/Display.h, htsearch/Display.cc(excerpt): Fix declaration
to refer to "first" as reference--ensures ANCHOR is properly set.
Fixes PR#541 as suggested by <pmb1@york.ac.uk>.

* htdig/ExternalParser.cc(parse): Quote the filename before passing
it to the command-line to prevent shell escapes. Fixes PR#542.
Also make error messages more useful.

* htfuzzy/Endings.cc(getWords): Suffix-handling improvement (PR#560),
to prevent inappropriate suffix stripping in endings fuzzy matches.

* htlib/URLTrans.cc(encodeURL): Fix encoding so all non-ascii
characters get hex-encoded. I think this is what PR#339 was all about.

* htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
Added descriptions for attributes that were missing, added
a few clarifications, and corrected a few defaults and typos.
Covers PR#558, PR#626, and then some.

* configure.in, configure, include/htconfig.h.in, htlib/regex.c:
Fix PR#545, to test for presence of alloca.h

Wed Apr 21 22:45:16 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* .version: Bump for final 3.1.2 release.

* htdoc/where.html, htdoc/FAQ.html: Update to mention the new release.

Tue Apr 20 13:34:22 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdoc/RELEASE.html: Fixed a few typos, updated modification date.

Tue Apr 20 10:54:59 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htdoc/RELEASE.html: Add notes on changes in the 3.1.2 release.

* htdoc/contents.html, htdoc/mailarchive.html, htdoc/where.html,
htdoc/uses.html: Update with versions from maindocs.

* installdir/htdig.conf: Add example max_doc_size attribute to cut
down on FAQ, also add comment on including a file for start_url.

Mon Apr 19 15:40:24 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htcommon/WordList.cc(valid_word): fixed to avoid having the new
HtIsStrictWordChar() test circumvent the allow_numbers option by
allowing numbers all the time. Also fixed to allow HtIsStrictWordChar()
to override iscntrl(), so extra_word_characters can define characters
that a broken locale would define as control characters.

Mon Apr 19 15:17:12 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htcommon/WordList.cc(valid_word): fixed bug introduced Jan 9,
where it stopped scanning for control characters prematurely.
Now also use iscntrl() to detect all control characters.

Fri Apr 16 10:30:42 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdoc/FAQ.html: fixed typo - use_meta_description was plural.

Wed Apr 14 20:22:31 1999 Alexander Bergolth <leo@leo.wu-wien.ac.at>

* htlib/regex.h: fixed compile problem with AIX xlc compiler

Tue Apr 13 13:01:04 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htsearch/Display.cc(generateStars): Set status to -1 if
URLimage.hasPattern() fails, to avoid empty URLimageList.
(Fix to Mar 31 change.)

Tue Apr 13 11:27:45 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htsearch/Display.h(class Display): move enum SortType up to public
section, to avoid problem compiling on IBM AIX C++ compiler.

Mon Apr 12 17:36:20 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdoc/FAQ.html: added sections on indexing docs in other languages,
practical & theoretical limits of ht://Dig.

Fri Apr 9 16:47:34 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdoc/FAQ.html: Fixed a few typos.

Fri Apr 9 16:24:21 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdig/Document.cc(RetrieveHTTP): Show "Unable to build connection"
message at lower debug level.

Fri Apr 9 15:17:53 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdoc/FAQ.html: Added changes in maindocs from Mar 18, a few
clarifications, and four new questions.

Wed Apr 7 19:41:12 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htsearch/htsearch.cc (usage): Remove bogus -w flag.

Thu Apr 1 11:58:20 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htsearch/htsearch.cc(main): Apply Gabriele's patch to avoid using an
invalid matchesperpage CGI input variable.

* htsearch/Display.cc(display) & (setVariables): Correct any invalid
values for matches_per_page attribute to avoid division-by-zero error.

Wed Mar 31 18:21:21 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htdig/htdig.cc: Undo March 30 change.

* htdig/Retriever.cc: Use excludes.hasPattern before using the
exclude list. (More elegant solution to problem, as pointed out by
Gilles.)

* htsearch/Display.cc: Remove code setting URLimage to a bogus
pattern. Instead, check URLimage.hasPattern() before using
it.

Wed Mar 31 15:16:36 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htfuzzy/Synonym.cc: Fix previous fix of minor memory leak.
(db pointer wasn't properly set)

Tue Mar 30 20:08:18 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htdig/htdig.cc: If exclude_urls attribute is set to empty, set
it to something that will never match a URL to ensure nothing is
excluded.

* Makefile.config.in: Fix typo leading to HTLIBS referring to itself.

Mon Mar 29 16:47:48 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htsearch/Display.cc(excerpt): Added patch from Gabriele to
improve display of excerpts--show top of description always,
otherwise try to find the excerpt.

Mon Mar 29 15:57:06 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htdig/htdig.cc: Rename main.cc for consistency with other
directories.

* htdig/Makefile.in: Use it.

Mon Mar 29 12:53:17 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htlib/HtWordType.h (HtIsWordChar): Avoid matching 0 when using
strchr.
(HtIsStrictWordChar): Ditto. (Patch from Hans-Peter Nilsson)
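
Note on the strchr pitfall: the standard strchr() treats the terminating
NUL as part of the string, so searching a character set for '\0' succeeds.
The sketch below illustrates the problem and a guard against it; in_set_naive
and in_set are hypothetical helpers, not the functions changed by the patch.

```cpp
#include <cassert>
#include <cstring>

// Naive membership test: wrongly reports true for c == '\0', because
// strchr() returns a pointer to the string's terminator in that case.
inline bool in_set_naive(const char *set, char c)
{
    return std::strchr(set, c) != nullptr;
}

// Guarded version: reject the NUL byte before calling strchr().
inline bool in_set(const char *set, char c)
{
    return c != '\0' && std::strchr(set, c) != nullptr;
}
```

Without the guard, a scanning loop that tests each character this way can
treat the end of the buffer as a word character and run past it.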

Mon Mar 29 10:51:54 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htlib/regex.h, htlib/regex.c: Include glibc versions of the
regex functions to override possibly buggy system versions.

* htlib/Makefile.in: Use them.

* htfuzzy/EndingsDB.cc: Use glibc regex functions instead of rx
for massive speedups on non-English affix files.

* configure, configure.in: Use the system timegm function if present.
Don't configure rx since we don't use it any more. Don't worry
about tsort since that was only needed for rx.

* Makefile.in, Makefile.config.in: Ignore the rx directory if present.

Thu Mar 25 12:24:18 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* installdir/long.html, installdir/short.html: Remove backslashes
before quotes in HTML versions of the builtin templates.

* Makefile.in: Add long.html & short.html to COMMONHTML list, so
they get installed in common_dir.

Thu Mar 25 11:45:59 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htsearch/Display.cc(displayMatch), htcommon/defaults.cc,
htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
Add date_format attribute suggested by Marc Pohl.

Thu Mar 25 09:49:33 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htsearch/Display.cc(displayMatch): Avoid segfault when DocAnchors
list has too few entries for current anchor number.

Wed Mar 24 12:20:02 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdig/main.cc (main): Call HtWordType::Initialize. (Missed this
one yesterday. Oops!)

Tue Mar 23 17:11:46 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* backport Hans-Peter Nilsson's suite of changes for HtWordType
and extra_word_characters support, to 3.1.2...

* htlib/HtWordType.h (class HtWordType): New.
* htlib/HtWordType.cc: New.
* htlib/Makefile.in (OBJS): Add HtWordType.o

* htdoc/attrs.html: Document attribute extra_word_characters.
* htdoc/cf_byprog.html: Ditto.
* htdoc/cf_byname.html: Ditto.

* htcommon/defaults.cc (defaults): Add extra_word_characters.

* htsearch/htsearch.h: Lose spurious extern declaration of unused
variable valid_punctuation.
* htsearch/htsearch.cc (main): Call HtWordType::Initialize.
(setupWords): Use HtIsWordChar, HtIsStrictWordChar and
HtStripPunctuation. Do not read valid_punctuation.

* htsearch/Display.cc (excerpt): Use HtIsStrictWordChar.

* htlib/StringMatch.cc (FindFirstWord): Ditto.
(CompareWord): Ditto.

* htdig/Retriever.h (class Retriever): Lose member
valid_punctuation.
* htdig/Retriever.cc (Retriever): Lose its initialization.

* htdig/Postscript.h (class Postscript): Lose member
valid_punctuation.
* htdig/Postscript.cc (Postscript): Lose its initialization.
(flush_word): Use HtStripPunctuation.
(parse_string): Use HtIsWordChar,
HtIsStrictWordChar and HtStripPunctuation.

* htdig/Parsable.h (class Parsable): Lose member
valid_punctuation.
* htdig/Parsable.cc (Parsable): Lose its initialization.

* htcommon/WordList.cc (valid_word): Use HtIsStrictWordChar.
(BadWordFile): Use HtStripPunctuation. Do not read
valid_punctuation.

* htcommon/DocumentRef.cc (AddDescription): Use HtIsWordChar,
HtIsStrictWordChar and HtStripPunctuation. Do not read
valid_punctuation.

* htdig/PDF.cc (parseString): Similar.

* htdig/HTML.cc (parse): Similar.

* htdig/Plaintext.cc (parse): Similar.

Tue Mar 23 15:52:33 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* .version: Bump to 3.1.2-dev.

Tue Mar 23 14:50:37 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htlib/String.cc: Fix up code to be cleaner with memory
allocation, inline next_power_of_2, fix some memory leaks.
(Geoff's changes of Feb 22-25)

Tue Mar 23 14:35:37 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htlib/HtWordCodec.cc(HtWordCodec): Fix bug with constructing from
uninitialized variables!

* htlib/HtURLCodec.cc (~HtURLCodec): Add missing deletion of
myWordCodec.

Tue Mar 23 14:18:16 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdig/PDF.cc(parseString): Use minimum_word_length instead of
hardcoded constant.

Tue Mar 23 12:02:00 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htsearch/Display.cc(generateStars): Add in support for use_star_image
which was lost when template support was put in way back when.

Tue Mar 23 11:47:52 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* Makefile.in: add missing ';' in for loops, between fi & done

Mon Mar 22 19:26:56 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htcommon/DocumentRef.cc(AddDescription): Check to see that
description isn't a null string and doesn't contain only whitespace
before doing anything.

Mon Mar 22 19:21:16 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htcommon/DocumentRef.h, htcommon/DocumentRef.cc: Fix #ifdef
problems with zlib.

Mon Mar 22 19:14:40 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdoc/attrs.html (template_name): Typo; used by htsearch, not htdig.

Mon Mar 22 19:10:56 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdig/Retriever.cc (got_href): Check if the ref is for the
current document before adding it to the db. (From H-P Nilsson, Mar 8)

Mon Mar 22 19:03:23 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdoc/attrs.html: Rephrase and clarify entry for url_part_aliases.
(From Hans-Peter Nilsson, Mar 2)

Mon Mar 22 18:48:10 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htfuzzy/Synonym.cc: Fix minor memory leak.

* htlib/Dictionary.h, htlib/Dictionary.cc(hashCode): Check if key
can be converted to an integer using strtol. If so, use the
integer as the hash code. (Geoff's patch)
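
Note on integer keys: a minimal sketch of the idea, assuming the key must
parse completely as a base-10 integer to qualify. The function name
hash_code and the fallback string hash are hypothetical and are not taken
from Dictionary.cc.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>

// Hypothetical hash function: if the whole key parses as a base-10 integer,
// use that integer; otherwise fall back to a simple string hash.
unsigned int hash_code(const char *key)
{
    char *end = nullptr;
    errno = 0;
    long value = std::strtol(key, &end, 10);
    // end != key: at least one digit was consumed.
    // *end == '\0': nothing was left over after the number.
    if (errno == 0 && end != key && *end == '\0')
        return static_cast<unsigned int>(value);

    // Fallback: multiplicative string hash (an arbitrary choice for this sketch).
    unsigned int h = 0;
    for (const unsigned char *p = reinterpret_cast<const unsigned char *>(key); *p; ++p)
        h = h * 31 + *p;
    return h;
}
```

Checking the end pointer matters: strtol() stops at the first non-digit, so
a key such as "12ab" would otherwise hash the same as "12".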

Mon Mar 22 18:23:11 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htlib/List.cc(Nth): Check for out-of-bounds requests before
doing anything.

Mon Mar 22 17:50:47 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htsearch/Display.cc(display): Free DocumentRef memory after
displaying them.
(displayMatch): Fix memory leak when documents did not have anchors,
fix problems when documents did not have descriptions.

Mon Mar 22 17:32:14 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htmerge/docs.cc(convertDocs): Replace previous verbose patch
with H-P Nilsson's.

Mon Mar 22 17:13:35 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdig/Plaintext.cc, htmerge/words.cc: removed Log lines.

Mon Mar 22 16:11:31 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htsearch/htsearch.cc: Add patch from Jerome Alet <alet@unice.fr>
to allow '.' in config field but NOT './' for security reasons.

Mon Mar 22 15:56:55 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* installdir/long.html, installdir/short.html: Write out HTML
versions of the builtin templates. (committed to 3.1.2 by Gilles)

* installdir/htdig.conf: Add commented-out template_map and
template_name attributes to use the on-disk versions.

Mon Mar 22 15:13:33 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htcommon/defaults.cc, htdoc/attrs.html: Change default locale
to "C", as H-P Nilsson recommended.

* htlib/Configuration.cc(Add): Fix small memory leak in locale code,
as Geoff discovered.

Mon Mar 22 15:03:10 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* contrib/parse_doc.pl: uses pdftotext to handle PDF files,
generates a head record with punctuation intact, extra checks
for file "wrappers" & check for MS Word signature (no longer
defaults to catdoc), strip extra punct. from start & end of words,
rehyphenate text from PDFs, fix handling of minimum word length.

Mon Mar 22 14:38:01 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdig/Plaintext.cc(parse): Use minimum_word_length instead of
hardcoded constant.

Mon Mar 22 14:33:45 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htlib/Configuration.cc(Add): Fix function to avoid infinite loop
on some systems, which don't allow all the letters in isalnum() that
isalpha() does, e.g. accented ones.

* htdig/HTML.cc: Fix three reported bugs about inconsistent
handling of space and punctuation in title, href description & head.
Now makes a distinction between tags that cause word breaks and those
that don't, and which of the latter add space.

Mon Mar 22 14:25:34 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htmerge/docs.cc: Make htmerge -vv report reasons for deleting docs.

* htmerge/words.cc(mergeWords): Fix to prevent description text
words from clobbering anchor number of merged anchor text words.

Fri Mar 19 17:09:21 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdig/HTML.cc: Fix bug where noindex_start was empty, allow case
insensitive matching of noindex_start & noindex_end.

* htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
Fix inconsistencies in documentation for noindex_start & noindex_end.

Fri Mar 19 17:05:16 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdig/HTML.cc: Add check for <a href=...> tag that is missing a
closing </a> tag, terminating it at next href.

Fri Mar 19 17:00:18 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdig/Document.cc: Fix check of Content-type header in readHeader(),
correcting bug introduced Jan 10 (for PR#91), and check against
allowed external parsers.

* htdig/HTML.cc: More lenient comment parsing, allows extra dashes.

Fri Mar 19 16:52:51 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdig/HTML.cc: Check for presence of more than one <title> tag.

* htlib/mytimegm.cc: Fix Y2K problems.

Fri Mar 19 16:43:28 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdig/HTML.cc: Add patch from Gabriele to ensure META
descriptions are parsed, even if 'description' is added to the
keyword list.

Fri Mar 19 16:37:08 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htsearch/parser.h, htsearch/parser.cc: Clean up patch made for
error messages, made on Feb 16.

Tue Feb 16 23:48:09 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* configure.in, configure: Default to 'int' when we cannot
establish type used by getpeername.

* htdoc/RELEASE.html: Additional notes on everything fixed in 3.1.1.

Tue Feb 16 23:45:26 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* contrib/parse_doc.pl: Add replacement for less-capable (and
buggy) parse_word_doc.pl script. Handles Word, PS, RTF, and
WordPerfect files, with appropriate file->text converters.

* htsearch/parser.cc, htsearch/parser.h: Add more error messages
when the boolean expression is invalid.

Mon Feb 15 21:02:24 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htdig/Document.cc(RetrieveLocal): Fix to ensure we report
reading only max_doc_size bytes, even when the document is larger.

* configure.in, configure: Add 'socklen_t' to getpeername check to
prevent problems configuring on Solaris 7.

* htdoc/RELEASE.html: Minor changes for 3.1.1 release.

Sun Feb 14 16:29:48 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htdig/Document.cc(retrieveHTTP, retrieveLocal): Fix document
size when the document is larger than max_doc_size. Size should be
that sent by the server or as given by stat().

* htdoc/*.html: More cleanups from Marjolein.

Sat Feb 13 20:53:34 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htdig/Retriever.cc(got_word): Ensure heading is in a normal range.

* htdoc/RELEASE.html: Added information on the bugs fixed in 3.1.1.

* htdoc/attrs.html: Added info on the changed syntax of the pdf_parser
attribute in 3.1.0 and later.

Sat Feb 13 20:29:26 1999 Marjolein Katsma <webmaster@javawoman.com>

* htdoc/*.html: Cleaned up HTML, fixed typos, added appropriate
HTML 4.0 syntax, added DTDs to files, other minor fixes.

Fri Feb 12 19:58:28 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* .version: Bump for version 3.1.1.

* configure.in, configure: Fix problems determining getpeername
syntax under IRIX.

* db/os/os_map.c: Fixed problems on AlphaLinux pointed out by Paul
J. Meyer.

Fri Feb 12 12:00:25 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdig/ExternalParser.cc: Fix crashes noted by Frank Richter.

* contrib/htparsedoc/parse_word_doc.pl: Use updated version (with
fixed line breaks).

* htnotify/htnotify.cc: Add patch mentioned in Feb 8 documentation
change.

Thu Feb 11 00:29:42 1999 Hans-Peter Nilsson <hp@axis.se>

* htcommon/DocumentRef.cc (NUM_ASSIGN): Expand from unsigned types.
(getnum): Use temporary for "unsigned short", and memcpy data into
it instead of assignment.

Tue Feb 9 19:21:55 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htdoc/FAQ.html, htdoc/where.html: Update for 3.1.0 release.

* htdoc/uses.html: Added remaining backlog.

* htdoc/RELEASE.html: Finish up release notes for 3.1.0.

Tue Feb 9 19:19:13 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdig/ExternalParser.cc: Ensure we remove the temporary file.

Mon Feb 8 20:28:07 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htdoc/ma_menu: Change relative URLs to absolute URLs to
www.htdig.org to reflect the changing mail archive.

* htdoc/install.html: Add notes on new configure flags to set
CONFIG variables.

* htdoc/*.html: Ensure Last Modified date stamps are up-to-date.

Mon Feb 8 20:26:40 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdoc/meta.html, htdoc/notification.html: Add info on date
formats for the htnotify-date tag, esp. in relation to ISO 8601.

Sat Feb 6 23:24:19 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htcommon/DocumentRef.cc: Fixed compile problem when zlib is disabled.

* htdoc/cf_byname, htdoc/cf_byprog.html, htdoc/attrs.html: Added
entries for url_log, compression_level, noindex_start, noindex_end,
allow_in_form, bad_querystr, no_title_text.

* htdoc/THANKS.html: Added Gabriele Bartolini.

* htdoc/uses.html, htdoc/FAQ.html, htdoc/bugs.html: Synch with the
latest versions from the website tree.

Fri Feb 5 19:57:39 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htnotify/htnotify.cc: Add function parse_date() to parse date
strings from htnotify-date tags. It tries to be as flexible as
possible about formatting and will report invalid dates. Based in
part on code contributed by Gabriele Bartolini.

Fri Feb 5 19:28:24 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* configure, configure.in: Add a test to ensure the zlib.h header
file exists.

* include/htconfig.h.in: Added definition for HAVE_ZLIB_H.

* htcommon/DocumentRef.h, htcommon/DocumentRef.cc: Add checks for
HAVE_ZLIB_H in addition to HAVE_LIBZ. Ensures the library is
actually accessible, not just present.

* htfuzzy/Soundex.cc: Fix typo.

Thu Feb 4 22:51:37 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* Makefile.in: Clean up previous patch and tidy up HTML and
dictionary installation.

Thu Feb 4 22:31:35 1999 Ric Klaren <klaren@telin.nl>

* Makefile.in, */Makefile.in: Add support for
$INSTALL_ROOT, making it easier to build packages (e.g. RPMs) into
directories for later processing.

* htsearch/Display.cc: Tiny patch to silence a compiler warning.

Thu Feb 4 13:03:44 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htfuzzy/Soundex.cc(generateKey): Skip initial non-alphabetic
characters and explicitly skip characters without values.

* htfuzzy/Metaphone.cc(generateKey): General bug-fixing, fixing a
bug that corrupted the string to be processed, fixing typos, and
ensuring keys generated fit the metaphone algorithm.

* htfuzzy/Fuzzy.cc(getWords): Add debugging output of the fuzzy
key used.

* contrib/doclist/doclist.pl, contrib/doclist/listafter.pl,
contrib/whatsnew/whatsnew.pl, contrib/urlindex.pl: Change to
support additions to ht://Dig database format.

Thu Feb 4 02:09:22 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htsearch/htsearch.cc: Add debugging information on words
returned from fuzzy matching.

* htfuzzy/Metaphone.cc(addWord): Fix bug where only one word would be
stored per key in the database.

* htfuzzy/Soundex.cc(addWord): Ditto.
(generateKey): Rewrite to generate keys correctly.

Wed Feb 3 19:24:36 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htdoc/htdig.html: Added documentation on the -l log and restart
feature.

* htdoc/htmerge.html: Added documentation on the -m merge database
feature.

* htdig/main.cc: Added documentation on the -l flag to the usage
message.

* .version: Bump to 3.1.0.

Wed Feb 3 19:09:31 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htsearch/Display.cc: Add check for URLs with no / in the
no_title code.

* htdig/Document.cc: Fix problems with dates returned from servers
with incorrect formats. Those simply missing the day of week are
parsed correctly, otherwise output an error, use the current date,
and keep going.

Wed Feb 3 09:57:14 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* installdir/nomatch.html: Fix small typo.

* htdoc/RELEASE.html: Finish up 3.1.0 release notes.

* htdoc/TODO.html: Update with status and new directions.

Wed Feb 3 14:22:11 1999 Alexander Bergolth <leo@leo.wu-wien.ac.at>

* htsearch/Display.cc(setVariables): Removed some of yesterday's
changes. Thanks to Gilles!

Tue Feb 2 17:26:06 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdig/PDF.h, htdig/PDF.cc: Fix problems with PDFs generated by
CorelDraw.

* htdoc/attrs.html: Fixed small typo.

Tue Feb 2 21:02:25 1999 Alexander Bergolth <leo@leo.wu-wien.ac.at>

* htsearch/Display.cc(setVariables,createURL): As pointed out by
Gilles, append allow_in_form variables to the query strings only
if they are given as input parameters.

Tue Feb 2 10:29:09 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* configure, configure.in: Rewrite getpeername_length_t detection
to use prototypes to eliminate type conversion.

* htsearch/Display.cc(buildMatchList): Ensure scores are always
positive or zero.

Mon Feb 1 22:54:02 1999 Hans-Peter Nilsson <hp@axis.se>

* htdoc/attrs.html: Correct "default" for "nothing_found_file".

Mon Feb 1 14:44:32 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htsearch/Display.cc(displayMatch): Remove compiler warnings.

* */Makefile.in: Define INSTALL_PROGRAM from configure script.

Mon Feb 1 14:04:18 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdig/ExternalParser.cc: Add checks to prevent wayward parsers
from bringing down the dig.

Sun Jan 31 23:15:36 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htsearch/WeightWord.cc(set): Ensure word is lowercased for
accurate fuzzy comparisons.

* htfuzzy/Fuzzy.cc(openIndex): Destroy the database reference if
we cannot open the database. Fixes a coredump in classes that
inherit this method.

* Makefile.config.in: Remove bogus definitions of INSTALL.

* Makefile.in: Define INSTALL, INSTALL_PROGRAM, INSTALL_SCRIPT,
and INSTALL_DATA as defined by configure. Use them.

* htdoc/RELEASE.html: Started release notes for version 3.1.0.

Mon Feb 1 04:36:29 1999 Hans-Peter Nilsson <hp@axis.se>

* htsearch/Display.cc (displayMatch): Fix leaking user of
String(String *).
* htfuzzy/Prefix.cc (getWords): Ditto.

* htlib/htString.h, htlib/String.cc (String(const String &)): New.
* htlib/htString.h, htlib/String.cc (String(const String &, int)):
No default argument.
* htlib/htString.cc, htlib/String.cc (String(String *)): Removed.

Sun Jan 31 21:46:52 1999 Alexander Bergolth <leo@leo.wu-wien.ac.at>

* htlib/Connection.cc: Include sys/time.h needed by select, fixes
PR #322.

Sun Jan 31 20:50:38 1999 Hans-Peter Nilsson <hp@axis.se>

* htdig/Retriever.cc (Initial, GetRef, Need2Get, IsValidURL,
got_href, got_redirect): Do not lowercase URLs.

* htlib/HtURLCodec.h (class HtURLCodec): Fake a friend function.

Sat Jan 30 22:29:50 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* configure, configure.in: Add support for program name
transformations.

* */Makefile.in: Do it.

Sat Jan 30 21:16:50 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htmerge/docs.cc: Added translation of Dutch comment for us ignorant
Americans. ;-)

* installdir/rundig: As mentioned by Gilles, use sed with ls -t
test. Add more comments for FAQs.

* configure.in, configure: Add --disable-zlib to turn off compiling
compression entirely. Add --with-cgi-bin-dir,
--with-image-dir and --with-search-dir flags to set CONFIG
variables.

* CONFIG.in: Use them.

Sat Jan 30 21:05:35 1999 Randy Winch <gumby@cafes.net>

* htcommon/DocumentRef.h: If using compressed document databases,
declare compress and decompress functions and the current state of
the head (excerpt).

* htcommon/DocumentRef.cc: Change document compression to only
compress the DocHead field and only decompress when necessary.

Sat Jan 30 03:49:21 1999 Hans-Peter Nilsson <hp@axis.se>

* htcommon/DocumentRef.h: Add #ifdef around declaration of
c_buffer.

* htcommon/DocumentRef.cc: Remove spurious extra "static" from
c_buffer definition. Add #ifdef HAVE_LIBZ around it.

Fri Jan 29 13:30:11 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htsearch/htsearch.cc: Construct the StringMatch used for finding
excerpts in two pieces--user input and post-fuzzy matching. Fixes
|
|
problems with matching searches with punctuation.
|
|
|
|
* htlib/StringMatch.cc(IgnoreCase): Fix small memory leak pointed
|
|
out by Gilles.
|
|
|
|
Thu Jan 28 21:36:03 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdoc/*.html: Changed copyright information to mention the
|
|
ht://Dig group, removing Andrew's name.
|
|
|
|
* README, configure.in, Makefile.in: Ditto.
|
|
|
|
* configure: Change mention of libg++ -> libstdc++.
|
|
|
|
Thu Jan 28 12:53:40 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
Document new remove_default_doc attribute.

* Makefile.in: Make sure we put the wrapper file in the right place.
Make sure dictionaries are installed with the correct permissions.

* installdir/rundig: Use a portable test for testing the endings
and synonym databases. Also enhanced support for flags (-a, -s,
-vvv, -c config).

* htsearch/Display.cc: Fix bug when sorting results would cause a
coredump.

Wed Jan 27 20:00:40 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdig/HTML.cc, htdig/SGMLEntities.cc, htdig/ExternalParser.cc,
htcommon/WordList.cc, htcommon/DocumentRef.cc: Speedup by
converting many config lookups into static variables.

* htdoc/attrs.html, htdoc/hts_templates.cc, htdoc/cf_byname.html,
htdoc/cf_byprog.html: Various minor fixes.

* htsearch/Display.cc: Fix problems with star_patterns attribute.

Wed Jan 27 13:02:39 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htdig/SGMLEntities.cc: Use StringMatch class for matching
&quot; &amp; &lt; and &gt; as defined by config options. Should
speed up translation.

* htdoc/THANKS.html: Minor updates for contributions towards 3.1.0.

Tue Jan 26 19:29:08 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* include/htconfig.h.in: Define TRUE and FALSE if not
defined. Change default of NO_WORD_COUNT (now undefined) for
compatibility.

* htdig/htdig.h: Remove definition of TRUE and FALSE (for consistency).

* htcommon/DocumentDB.cc(Add, Delete, Exists, []): Do not
lowercase the URL before storing it. URLs can be case-sensitive.

Tue Jan 26 19:07:03 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htcommon/defaults.cc: Define remove_default_doc as option of
default document to strip off URLs (e.g. /index.html -> /).

* htlib/URL.cc(removeIndex): Use it.
(normalizePath): Fix bug with stripping double slashes and the
like from a query string.

* htdig/Document.h, htdig/Document.cc: Add new variable
contentLength and consider content-length headers when reading in
documents.

* htdig/PDF.cc: Fix broken code calling acroread.

* htsearch/Display.cc: Allow braces in wrapper file.

* htdoc/hts_general.html, htdoc/hts_templates.html: Add info on
the wrapper alternative to separate header and footer files.

* htdoc/config.html, installdir/header.html,
installdir/nomatch.html, installdir/wrapper.html,
installdir/search.html: Change sort option to be more grammatically
correct.

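The remove_default_doc and normalizePath entries above both hinge on one
rule: rewrite the path, never the query string. A minimal Python sketch
of that rule follows. The function name remove_index and its default_docs
parameter are illustrative assumptions, not the actual URL.cc code.

```python
from urllib.parse import urlsplit, urlunsplit


def remove_index(url, default_docs=("index.html",)):
    """Strip a trailing default document from the path only.

    The query string is split off first, so "/index.html" inside a
    query is left alone. Hypothetical helper, not the ht://Dig code.
    """
    parts = urlsplit(url)
    head, _, last = parts.path.rpartition("/")
    if last in default_docs:
        parts = parts._replace(path=head + "/")
    return urlunsplit(parts)
```

For example, remove_index("http://example.com/docs/index.html") gives
"http://example.com/docs/", while a URL whose query merely ends in
/index.html comes back unchanged.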
Tue Jan 26 21:19:02 1999 Hans-Peter Nilsson <hp@axis.se>

* htmerge/docs.cc (convertDocs): Use HtURLCodec to encode URLs
going into the doc_index database.

* htsearch/Display.cc (buildMatchList): Use HtURLCodec to decode
URLs from docIndex.

* htcommon/defaults.cc (defaults): Fix typo with "case_sensitive".

Tue Jan 26 18:08:19 1999 Alexander Bergolth <leo@leo.wu-wien.ac.at>

* include/htconfig.h.in: Added HAVE_STRINGS_H. (I forgot that when
I added the configure check.)
* htdig/Retriever.h: Fix small compiler error. Removed Log-lines.

Tue Jan 26 02:22:45 1999 Hans-Peter Nilsson <hp@axis.se>

* htdig/main.cc (main): Fix typo "uncoded_db_compatbile".

Mon Jan 25 19:38:31 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htlib/Configuration(Find): Make error message for missing
entries conditional on the DEBUG symbol. Removes odd error messages
under normal use.

Sun Jan 24 23:55:57 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htmerge/db.cc, htmerge/docs.cc: Fix compiler errors.
* htnotify/htnotify.cc: Similar.

Sun Jan 24 14:13:37 1999 Hans-Peter Nilsson <hp@axis.se>

* htcommon/WordRecord.h (struct WordRecord): Remove member count
if NO_WORD_COUNT defined.
* htmerge/db.cc (mergeDB): Remove handling.
* htmerge/words.cc (mergeWords): Similar.

* include/htconfig.h.in: Define NO_WORD_COUNT by default.

Sun Jan 24 14:13:37 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htsearch/Display.cc(logSearch): Added fix from Gilles in case
REMOTE_ADDR is NULL as well.

* htnotify/htnotify.cc: Fix compiler warnings.

* htlib/String.cc(indexOf): Use autoconf check for strstr, fix
compiler warnings.

* htlib/Configuration.cc(Find): Complain when option is not in the
list.

* htdig/HTML.cc(parse): Move declarations out of the loop.
(parse): Don't add non-word characters to the excerpt if they're
in the title. Fixes PR #80.

Mon Jan 25 02:17:58 1999 Hans-Peter Nilsson <hp@axis.se>

* htcommon/defaults.cc (defaults): New option
"uncoded_db_compatible", default true.

* htcommon/DocumentDB.h (DocumentDB::SetCompatibility): New
function.
(DocumentDB::myTryUncoded): New member.

* htcommon/DocumentDB.cc (Constructor, Add(), operator[],
Exists(), Delete()): Handle uncoded URL in database if
myTryUncoded.

* htdig/main.cc (main): Call (DocumentDB::)SetCompatibility() with
option "uncoded_db_compatible".
* htsearch/Display.cc (Display): Likewise.
* htnotify/htnotify.cc (main): Likewise.
* htmerge/docs.cc (convertDocs): Likewise.
* htmerge/db.cc (mergeDB): Likewise.

* htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
Document option "uncoded_db_compatible".

Sun Jan 24 15:21:02 1999 Hans-Peter Nilsson <hp@axis.se>

* htlib/HtWordCodec.cc (HtWordCodec(StringList &, etc)): Check
limits separately for "to" and "from". Do not calculate
string-lengths separately for limit-checking; use methods Count()
and length() on data near the final result.

* htlib/HtWordCodec.cc (HtWordCodec constructors): Do not
explicitly add '\0' to the pattern strings.

* htlib/HtWordCodec.cc (code): Check for zero-length replacement
list.

Sat Jan 23 22:18:18 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htdig/Retriever.cc(parse_url): If a server ignores the
If-Modified-Since request, still compare the retrieved date to the
stored date to see if it has been modified.

Sat Jan 23 13:09:03 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htmerge/htmerge.cc: Unlink the db.docs.index file before we
build it again. This ensures we have a clean copy and don't
duplicate URLs.

Fri Jan 22 23:12:12 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* include/htconfig.h.in: Cleaned up preprocessor definitions.

* configure.in, configure: Fix NEED_PROTO_GETHOSTNAME check and
make check for GETPEERNAME_LENGTH_T more flexible.

* htlib/Connection.cc: Change __sun__ to NEED_PROTO_GETHOSTNAME
since we prefer feature tests.

Sat Jan 23 02:38:08 1999 Hans-Peter Nilsson <hp@axis.se>

* htsearch/Display.cc (logSearch): Fix simple typo in last change.

Sat Jan 23 01:18:05 1999 Hans-Peter Nilsson <hp@axis.se>

* htlib/String.cc (operator =): Add const modifier: const String &.
* htlib/htString.h (String::operator=(const String &)): Ditto.

* htlib/DB2_db.h (class DB2_db): Make Put(), Get(), Exists() and
Delete() use const modifiers on appropriate parameters.
* htlib/DB2_db.cc: Ditto.
* htlib/GDBM_db.h (class GDBM_db): Ditto.
* htlib/GDBM_db.cc: Ditto.
* htlib/Database.h (class Database): Ditto.
* htlib/Database.cc (Put): Similar.

* htlib/BTree.h (class BTree): Make Put(), Get() and Exists() use
const modifiers on appropriate parameters.
* htlib/BTree.cc: Ditto.

* htcommon/DocumentDB.cc (Add, operator[], Exists, Delete): Remove
needless temporary String.
* htcommon/DocumentRef.cc (Deserialize): Ditto.

Fri Jan 22 21:10:12 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htlib/Configuration.cc: Add support for keyword "include" to
include other config files.

* htdoc/cf_general.html: Document it.

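One way an "include" keyword in a config reader can behave is sketched
below in Python. The "name: value" line syntax, the "#" comment rule, the
load_config name, and the read_file callback are all assumptions made for
illustration; the real Configuration.cc parser may differ in every one of
them.

```python
def load_config(name, read_file, seen=None):
    """Parse "key: value" lines, following "include: other" lines.

    read_file is a callable mapping a file name to its text, which
    keeps the sketch testable without touching the disk. A file that
    is pulled in twice is rejected, which also stops include loops.
    """
    seen = set() if seen is None else seen
    if name in seen:
        raise ValueError("file included twice: " + name)
    seen.add(name)
    config = {}
    for line in read_file(name).splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "include":
            # Included settings land at the point of inclusion, so
            # later lines in the including file can still override.
            config.update(load_config(value, read_file, seen))
        else:
            config[key] = value
    return config
```

With this ordering, a setting made before the include line is overridden
by the included file, and a setting made after it wins.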
Thu Jan 21 23:25:37 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htsearch/Display.cc(logSearch): Check if HTTP_REFERER is NULL,
if so, use a dash. (Otherwise we'll kill some syslog() services).

Thu Jan 21 05:30:40 1999 Hans-Peter Nilsson <hp@axis.se>

* htlib/HtURLCodec.h, htlib/HtURLCodec.cc, htlib/HtWordCodec.cc,
htlib/HtWordCodec.h, htlib/HtCodec.cc, htlib/HtCodec.h: New files.

* htlib/Makefile.in (OBJS): Add the corresponding *.o files

* htcommon/DocumentDB.cc (Open, Read, Add, operator[], Exists,
Delete, CreateSearchDB, URLs): Use HtURLCodec; ::encode() and
::decode() the URL used as a key.

* htcommon/DocumentRef.cc (Serialize): Encode the URL using
HtURLCodec.
(Deserialize): Decode it.

* htmerge/htmerge.h: #include <HtURLCodec.h>
* htmerge/htmerge.cc (main): Check HtURLCodec for errors.
* htnotify/htnotify.cc (main): Ditto.
* htsearch/htsearch.cc (main): Ditto.
* htdig/main.cc (main): Ditto.

* htcommon/defaults.cc (defaults): Add common_url_parts and
url_part_aliases.

* htdoc/cf_byprog.html, htdoc/cf_byname.html,
htdoc/attrs.html: Document url_part_aliases and
common_url_parts.

* htlib/StringMatch.h (StringMatch::Pattern): Add default
parameter sep = '|'.

* htlib/StringMatch.cc (Pattern): Similar.

Wed Jan 20 20:20:35 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htsearch/Display.cc(logSearch): Use REMOTE_ADDR when REMOTE_HOST
is unavailable (otherwise we silently dump core). Fixes PR #138.

* htcommon/WordList.cc(valid_word): Words cannot be valid if
they're shorter than minimum_word_length! Fixes PR #139.

* htsearch/Display.cc(expandVariables): Allow variables of the
form ${VAR}, fixes PR #121.

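The ${VAR} form matters when a variable is followed directly by word
characters. The Python sketch below illustrates that with a regular
expression. It assumes a bare $VAR form also exists and that unknown
names expand to an empty string; neither is stated in the entry above,
and expand_variables is a hypothetical name.

```python
import re

# Either ${NAME} (braced) or $NAME (bare); group 1 or group 2 is set.
_VAR = re.compile(r"\$(?:\{(\w+)\}|(\w+))")


def expand_variables(template, variables):
    """Replace $NAME and ${NAME} with values from the variables dict."""
    def repl(match):
        name = match.group(1) or match.group(2)
        return variables.get(name, "")
    return _VAR.sub(repl, template)
```

Here "${TITLE}s" expands TITLE and keeps the trailing "s", whereas
"$TITLEs" would look up a variable called TITLEs.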
Wed Jan 20 17:21:33 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htmerge/docs.cc: Fix logic to remove documents--missing else
statements allow some "deleted" documents to not be removed.

Wed Jan 20 11:52:18 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htlib/good_strtok.h, htlib/good_strtok.cc: Added fixes and speed
improvements contributed by Andrew Bishop.

* htdig/ExternalParser.cc, htdig/Server.cc, htlib/cgi.cc,
htmerge/db.cc, htmerge/words.cc: Call good_strtok with appropriate
parameters (explicitly include NULL first parameter, second param
is char, not char *).

* htcommon/WordList.cc(Word): Added check for adding words with
weight zero.

* htsearch/Display.h, htsearch/Display.cc: Revised setting ANCHOR
variable: it will be empty if there is no excerpt which matches
the search formula. Fixes problems with META descriptions. Based
on a patch contributed by Marjolein.

Wed Jan 20 00:30:12 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htdig/SGMLEntities.cc: Declare extern config, since we now use
config options.

* htsearch/Display.cc: Fix typo causing compile problems.

Tue Jan 19 23:51:38 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htcommon/defaults.cc: Added options translate_amp, _lt_gt, _quot as
suggested by Marjolein to control SGML translation of these
entities.

* htdig/SGMLEntities.cc: Use them as contributed by Marjolein.

Tue Jan 19 12:55:36 1999 Hans-Peter Nilsson <hp@axis.se>

* htlib/StringMatch.cc (Pattern): Always set PreviousState before
checking PreviousValue.

* htlib/StringMatch.cc (FindFirst): Be "greedy"; match longest.
(Compare): Ditto.

* htcommon/DocumentRef.cc (MEMCPY_ASSIGN, NUM_ASSIGN): New macros
for assigning portably to some possibly-enum numeric type.
(getnum): Use them.

* htlib/StringMatch.cc (FINAL): Remove.
(MATCH_INDEX_MASK): Include highest bit.
(Pattern, FindFirst, Compare, FindFirstWord, CompareWord): Do not
use FINAL.
(FindFirst, Compare, FindFirstWord, CompareWord): When shifting by
INDEX_SHIFT, cast to unsigned.

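"Greedy" matching in the FindFirst entry above means that when several
patterns match at the same starting position, the longest one is
reported. The Python sketch below shows that behavior with a naive scan.
StringMatch itself is a state machine, so this illustrates the result
only, not the implementation; find_first is a hypothetical name.

```python
def find_first(text, patterns):
    """Return (position, pattern) for the earliest match in text.

    Among patterns matching at that position the longest is chosen.
    Returns (-1, None) when nothing matches.
    """
    for start in range(len(text)):
        matches = [p for p in patterns if p and text.startswith(p, start)]
        if matches:
            return start, max(matches, key=len)
    return -1, None
```

With the patterns "data" and "database", the text "the database"
reports "database" at position 4 rather than stopping at "data".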
Mon Jan 18 17:43:29 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htcommon/defaults.cc: Added no_title_text option to allow
configuration of the text when no title is available. Default is
the filename.

* htsearch/Display.cc: Use no_title_text to set the title
appropriately, as contributed by Marjolein.

* htsearch/Display.cc: Ensure PERCENT variable has a minimum of 1.

Mon Jan 18 17:41:44 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdig/Server.cc: Use max_doc_size when retrieving robots.txt
files instead of a hard-coded 10k limit.

* htdig/Document.cc: When reading chunks of document, if a chunk
puts us over the max_doc_size limit, take everything up to that
limit (rather than discarding the entire chunk).

* htcommon/DocumentRef.cc: Fix thinko with compression_level.

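The Document.cc entry above describes truncating at the limit instead of
dropping the chunk that crosses it. A minimal Python sketch of that
reading loop follows; read_limited and its arguments are illustrative
names, and the real code reads from a network connection rather than a
list of byte strings.

```python
def read_limited(chunks, max_doc_size):
    """Accumulate chunks, keeping at most max_doc_size bytes in total.

    The chunk that would exceed the limit is cut at the limit and
    reading stops, rather than the whole chunk being discarded.
    """
    buf = bytearray()
    for chunk in chunks:
        room = max_doc_size - len(buf)
        if len(chunk) >= room:
            buf += chunk[:room]
            break
        buf += chunk
    return bytes(buf)
```

With 4-byte chunks and a limit of 6, the result holds the first chunk
plus the first two bytes of the second.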
Sun Jan 17 21:48:05 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdoc/(attrs.html, cf_byname.html, cf_byprog.html, config.html,
hts_form.html, hts_templates.html): Add documentation for "sort"
config and form input.

* htcommon/defaults.cc: Added options "sort" and "sort_names" to
pick result sorting order and text names for sort options.

* htsearch/Display.cc: Added variable SORT to render a form menu
for sort options, based on "sort" and "sort_names" options.

* installdir/(wrapper.html, header.html, nomatch.html,
footer.html, search.html, syntax.html): Add in sort option to form.

Sun Jan 17 14:03:54 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htsearch/TemplateList.h
htsearch/TemplateList.cc(createFromString): Ensure
template_map config has three members for each template we add,
contributed by Gabriele Bartolini <tlm@mbox.comune.prato.it>.

* htsearch/Display.cc(Display): Take advantage of createFromString
returning an error value to bail out of poorly-constructed
template_maps, based on code contributed by <tlm@mbox.comune.prato.it>.

* htdig/PDF.cc: Add debugging output of URLs causing
problems. Also, switch system call to make it easier to call xpdf
instead of acroread.

* htcommon/defaults.cc: Change default pdf_parser attribute to
include acrobat-specific flags. Fix mismatched naming of
compression_level (was compression_factor).

* htdig/Retriever.cc: Fix compiler warnings.

* contrib/examples/updatedig: Added contributed rundig-type script
from David Robley <webmaster@www.nisu.flinders.edu.au>.

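The template_map check above amounts to requiring that the list length
be a multiple of three and reporting failure otherwise. A rough Python
sketch follows, assuming the value has already been split into words.
The name parse_template_map and the use of None as the error value are
assumptions; the entry only says createFromString returns an error
value.

```python
def parse_template_map(words):
    """Group a flat word list into triples, one per template.

    Returns None (standing in for the error value) when the list is
    empty or its length is not a multiple of three.
    """
    if not words or len(words) % 3 != 0:
        return None
    return [tuple(words[i:i + 3]) for i in range(0, len(words), 3)]
```

A caller can then bail out early on a None result instead of indexing
past the end of a short list.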
Sun Jan 17 13:42:43 1999 didier Gautheron <dgautheron@magic.fr>

* htcommon/defaults.cc: add url_log parameter for save and restart
function.

* htdig/Retriever.cc, htdig/Retriever.h: Add save and restart
function.

* htdig/main.cc: Add option -l for save and restart
function.

* htdig/PDF.cc: Check to see if we have acroread before copying
the pdf into TMPDIR!

Fri Jan 15 07:23:30 1999 Hans-Peter Nilsson <hp@axis.se>

* htcommon/DocumentRef.cc(Serialize): Save
space when lengths can fit in an unsigned char or unsigned short.

* htcommon/DocumentRef.cc(Deserialize): Handle expansion.

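The space saving above comes from storing a length in the narrowest
integer that can hold it. The Python sketch below shows the general
technique with a one-byte width tag. That tag, the little-endian layout,
and both function names are hypothetical: the entry does not describe
the actual DocumentRef record format.

```python
import struct


def pack_length(n):
    """Encode n in 1, 2 or 4 bytes, prefixed by a one-byte width tag."""
    if n <= 0xFF:
        return b"\x01" + struct.pack("<B", n)
    if n <= 0xFFFF:
        return b"\x02" + struct.pack("<H", n)
    return b"\x04" + struct.pack("<I", n)


def unpack_length(data):
    """Decode a length; return (value, number of bytes consumed)."""
    width = data[0]
    fmt = {1: "<B", 2: "<H", 4: "<I"}[width]
    (n,) = struct.unpack_from(fmt, data, 1)
    return n, 1 + width
```

Short fields then cost two bytes instead of five, and the reader
expands them back to a full-width integer.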
Thu Jan 14 23:37:29 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htcommon/defaults.cc: Added options noindex_start and
noindex_end to enable NOT indexing some sections of
HTML. Contributed by Marjolein.

* htdig/HTML.cc: Use them.

* contrib/examples/rundig.sh: Add rundig example from Colin
Viebrock with a few modifications for using less disk space.

Thu Jan 14 23:27:24 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htlib/URL.cc: Fix parent path logic to ignore slashes in query
string. Noted by Adam Coyne <adam@criticalmass.com>.

Thu Jan 14 00:04:03 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* README: Fix for upcoming 3.1.0 release.

* htcommon/defaults.cc: Set compression_factor to 0 for default
(no compression).

Thu Jan 14 03:16:15 1999 Hans-Peter Nilsson <hp@axis.se>

* htdig/ExternalParser.cc (parse): Added support for 'm': meta element.

* htdoc/attrs.html: Document it.

Wed Jan 13 21:31:38 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* Makefile.in(install): Add wrapper.html to the common directory
when installing.

* contrib/examples: Added directory for example common files
(e.g. badwords, dictionaries, templates, etc.)

* contrib/examples/badwords: Added example bad_words file by Marjolein.

* .version: Bump to 3.1.0dev.

* htdig/HTML.cc(parse): Added slight fixes to the comment parsing
code, contributed by Marjolein.

Wed Jan 13 20:11:26 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdoc/attrs.html: Fix typo with META example.

* htdig/Document.cc: Use new StringList::Join function for
http_proxy_exclude.

* htnotify/htnotify.cc: Bring latest security patch from 3.1.0b4
onto the mainline source.

* installdir/wrapper.html: New file to merge header and footer files.

* htcommon/defaults.cc: Added search_results_wrapper for the
location of the wrapper file, if used. (The default is empty,
which uses header.html and footer.html)

* htsearch/Display.cc: Added support for using the wrapper instead
of header and footer if search_results_wrapper is set.

* htsearch/htsearch.cc: Added check for sort config.

* htsearch/Display.cc, htsearch/Display.h: Added support for
sorting and reverse sorting by date, time, and score.

Wed Jan 13 18:45:17 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htcommon/defaults.cc: Removed use_document_compression
(redundant) and fixed problem with missing comma. Setting
compression_factor to 0 is the equivalent of turning off
use_document_compression.

* htcommon/DocumentRef.cc(Serialize, Deserialize): Update from
Randy Winch to eliminate use_document_compression and fix
compilation problems noted by Hans-Peter.

* htmerge/db.cc: Fixed problem with db.NextDocID() being set
incorrectly, reported by Roman Dimov <roman@mark-itt.ru>.

* htcommon/DocumentDB.h: Added IncNextDocID to allow big changes
in db.NextDocID(), such as those above.

* htdoc/THANKS.html: Added Akos Domotor.

Wed Jan 13 07:07:35 1999 Hans-Peter Nilsson <hp@axis.se>

* htsearch/htsearch.cc (setupWords): Remove parsedWords parameter
with associated processing of original words - deletion of
bad_words, spacing and on-the-fly modifiers.
(main): Create originalWords from input, not via setupWords().

Tue Jan 12 09:16:49 1999 didier Gautheron <dgautheron@magic.fr>

* htcommon/WordList.cc, htmerge/words.cc: Changed field order
in db.wordlist. With the old order, words from HTML body and words
from links to that url weren't merged sometimes.

* htdig/Document.cc, htmerge/words.cc: Small speed improvements.

* htdig/HTML.cc: Fixed small memory leak with bogus HTML and small
speedups.

* htdig/Retriever.cc(got_href) : if ref exists we have to call
AddDescription even if max_hop_count is reached. It's important
for wwwoffle (urls in the cache are restricted by max_hop_count)

* htcommon/DocumentDB.cc, htcommon/DocumentDB.h, htdig/Retriever.cc,
htlib/Dictionary.cc, htlib/Dictionary.h, htlib/Object.cc,
htlib/Object.h, htlib/String.cc, htlib/htString.h,
htcommon/WordList.cc: Speedups after gprof data.

Tue Jan 12 07:23:35 1999 didier Gautheron <dgautheron@magic.fr>

* htlib/Configuration.cc: Fixed time format to standard to avoid
sending If-Modified-Since http headers in native format (which
would be incorrect behavior). Use C locale.

* htlib/Dictionary.h, htlib/Dictionary.cc: Add new method
GetNextElement to directly return next object when iterating.

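The If-Modified-Since entry above is about emitting the date in the
fixed English HTTP format regardless of the user's locale. In Python the
standard library's email.utils.formatdate does this without consulting
the locale, as sketched below. The wrapper name if_modified_since is
illustrative only.

```python
from email.utils import formatdate


def if_modified_since(epoch_seconds):
    """Build the header with an RFC 1123 date in GMT.

    formatdate uses fixed English day and month names, so the result
    does not change with the process locale.
    """
    return "If-Modified-Since: " + formatdate(epoch_seconds, usegmt=True)
```

For the Unix epoch this yields
"If-Modified-Since: Thu, 01 Jan 1970 00:00:00 GMT".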
Tue Jan 12 12:56:26 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htcommon/DocumentRef.h, htcommon/DocumentRef.cc(serialize,
deserialize): Added support for compressing data using zlib if
available, contributed by Randy Winch <gumby@cafes.net>.

* htcommon/defaults.cc: Added config options
use_document_compression and compression_factor for zlib support.

* configure.in, include/htconfig.h.in: Added autoconf check for
libz and deflate function.

* configure: Generated from above change.

Mon Jan 11 22:48:17 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htmerge/db.cc: Fixed thinko with setting the docIDs of new words
in the destination wordlist.

* htdoc/FAQ.html, htdoc/THANKS.html, htdoc/contents.html: Minor
cleanups.

* htdoc/RELEASE.html: Added release info from 3.1.0b4.

* htdoc/uses.html: Alphabetized, added a form for requests, and
added in lots of new sites.

Mon Jan 11 02:42:51 1999 Hans-Peter Nilsson <hp@axis.se>

* htsearch/htsearch.cc (setupWords): Do not skip words if
"boolean" search.

Mon Jan 11 00:42:51 1999 Hans-Peter Nilsson <hp@axis.se>

* htdoc/hts_method.html: Add explanation of operator "not".

* installdir/syntax.html: Added examples of correct logical
expressions.

Mon Jan 11 00:23:58 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htdoc/attrs.html(search_algorithm): Added prefix and substring
matching--somehow slipped through the cracks!

* htdoc/THANKS.html: Update to be more accurate as far as recent
contributions.

Sun Jan 10 00:06:59 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htdig/Document.cc(readHeader): Added check for header status
when considering content-types. Fixed PR #91.

Sat Jan 9 20:52:49 1999 didier Gautheron <dgautheron@magic.fr>

* htcommon/WordList.cc(valid_word): Break out of looping once
we're sure the word is invalid.

* htlib/Dictionary.cc(Remove, Exists): Remember special case of an
empty dictionary.

Sat Jan 9 20:16:25 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htdig/HTML.cc(parse): Don't capitalize headers--this creates
problems with non-ASCII values, since String::uppercase doesn't
know how to capitalize them. Fixes PR #100.

Sat Jan 9 14:47:17 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htdig/Document.cc(getdate): Strip off weekday before calling
strptime since some servers return invalid weekdays. Fixes PR #79.

* htmerge/htmerge.h: Declare new mergeDB code.

* htmerge/htmerge.cc: Set up merge_config file and add options for
mergeDB code.

* htmerge/db.cc: New file. Implements merging of two database sets
specified by the merge_config and config variables.

* htmerge/Makefile.in: Add db.o as an object to be compiled.

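The getdate fix above avoids parse failures by dropping the weekday,
which carries no information the rest of the date lacks. A Python sketch
of the same idea follows. It handles only the "Day, DD Mon YYYY HH:MM:SS
GMT" layout and assumes English month names (the default C locale);
parse_http_date is a hypothetical name, not the Document.cc function.

```python
from datetime import datetime


def parse_http_date(value):
    """Parse an HTTP date, ignoring whatever precedes the first comma.

    A bogus weekday such as "Xyz" therefore cannot break the parse.
    """
    _, sep, rest = value.partition(",")
    if sep:
        value = rest
    return datetime.strptime(value.strip(), "%d %b %Y %H:%M:%S GMT")
```

Both "Xyz, 09 Jan 1999 14:47:17 GMT" and the same string without any
weekday parse to the same timestamp.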
Fri Jan 8 20:11:56 1999 Alexander Bergolth <bergolth@ariel.wu-wien.ac.at>

* htdig/Plaintext.cc: fixed bug that inhibited compressing of
whitespace

* htlib/URL.cc: fixed problem in stripping anchors from URLs

Thu Jan 7 23:29:32 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htdig/HTML.cc(parse): Corrected problems with parsing comments,
as contributed by Marjolein Katsma <webmaster@javawoman.com> and
Gilles.

* htsearch/Display.cc, htsearch/Display.h: Implement
add_anchors_to_excerpt option and new variable ANCHOR as
contributed by Marjolein.

* htdoc/THANKS.html: Added new contributors.

* README: Update for 1999 copyright, version, etc.

Thu Jan 7 17:29:52 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htdoc/(attrs.html, cf_byname.html, cf_byprog.html): Fix typo
noted by Joe Jah: keyword_factor -> keywords_factor.

Thu Jan 7 14:32:34 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htsearch/Display.cc (display): The start template, if provided,
should come out after the header, not before.

* htcommon/defaults.cc, installdir/footer.html: Use the
no_page_list_header stuff.

Thu Jan 7 11:09:08 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* installdir/*.png: Add PNG versions of the default GIF graphics.

Wed Jan 6 22:03:54 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htfuzzy/Synonym.cc, htfuzzy/htfuzzy.cc, htmerge/docs.cc,
htmerge/words.cc, htdig/SGMLEntities.cc: Fix minor memory leaks.

* htcommon/defaults.cc: Add .bin, .tgz, .rpm, .mov, .mpg, .avi to
bad_extensions.

* htdoc/attrs.html: Update documentation on default.

* installdir/rundig: Removed check for age of synonym and endings
DB. Nice feature, but it broke under too many shells.

* htlib/DB2_db.cc: Change allocation of database cursors to match
API in new version.

* htdig/Retriever.cc(got_word): Skip changing to lowercase, we do
it in WordList::Word.

Wed Jan 6 14:49:47 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>

* htdoc/attrs.html: Added four new attributes, fixed defaults & typos.

* htdoc/cf_byname.html: Added four new attributes.

* htdoc/cf_byprog.html: Added four new attributes.

Wed Jan 6 14:37:06 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* configure.in: Changed to require Autoconf 2.13 to eliminate bugs
observed by users with older autoconf versions.

* configure: Regenerated using Autoconf 2.13.

Wed Jan 6 13:08:26 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htcommon/DocumentRef.cc: Applied fix from Dave Alden
<alden@math.ohio-state.edu> to compile under SunPRO compilers
by eliminating trailing comma in enum.

Wed Jan 6 17:50:55 1999 Alexander Bergolth <bergolth@ariel.wu-wien.ac.at>

* {.,htcommon,htdig,htfuzzy,htlib,htmerge,htnotify,htsearch}/
Makefile.in, Makefile.config.in: fixed relative path problem if
install-sh is used.

Wed Jan 6 17:12:04 1999 Alexander Bergolth <bergolth@ariel.wu-wien.ac.at>

* htlib/StringList.cc: fixed bug in StringList::Join (oops!)

Wed Jan 6 10:34:45 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htcommon/DocumentRef.cc(AddDescription): Remove delete
instruction that fouls up everything (it was removing descriptions
as we add them!).

Wed Jan 6 14:52:11 1999 Hans-Peter Nilsson <hp@axis.se>

* htlib/String.cc (allocate_space): Add missing [] to delete.

Wed Jan 6 05:53:02 1999 Hans-Peter Nilsson <hp@axis.se>

* htcommon/DocumentRef.cc(AddDescription): Do not add non-word
characters to the wordlist.

Wed Jan 6 00:28:19 1999 Hans-Peter Nilsson <hp@axis.se>

* htdoc/cf_byname.html: Fixed html syntax "<br" and "/a>".

Tue Jan 5 22:40:58 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htsearch/Display.cc: Check if we need to do backlink and date
factoring (e.g. we don't if they're zero!), from a patch by Gilles.

Tue Jan 5 20:57:02 1999 Alexander Bergolth <bergolth@ariel.wu-wien.ac.at>

* configure.in, htlib/Connection.cc: Check for strings.h for those
platforms that don't have it.

Tue Jan 5 14:24:52 1999 Geoff Hutchison <ghutchis@wso.williams.edu>

* htcommon/DocumentRef.h: Added comments on the members (fields)
of DocumentRef objects.

* htcommon/defaults.cc: Added new option max_descriptions for
limit on the number of descriptions to store (default 5, matches
behavior pre 3.1.0b3).

* htcommon/DocumentRef.cc: Support restriction of max_descriptions.

* .version: Bump to 3.1.0b5dev.

Tue Jan 5 20:07:05 1999 Alexander Bergolth <bergolth@ariel.wu-wien.ac.at>

* htdig/Retriever.cc: fixed bug in bad_querystring detection

Sat Jan 2 16:39:34 1999 Alexander Bergolth <leo@strike.wu-wien.ac.at>

* htdig/main.cc, htlib/Configuration.cc: Added warning message if
the locale selection was not successful. (e.g. because the locale
definition is not installed) config["locale"] is now set to the
return string of setlocale.

* {.,htcommon,htdig,htfuzzy,htlib,htmerge,htnotify,htsearch}/
Makefile.in, Makefile.config.in, configure.in: Changed to allow
compiling in separate build directories.

Fri Jan 1 05:49:19 1999 Hans-Peter Nilsson <hp@axis.se>

* htdoc/attrs.html: Describe more thoroughly how "pdf_parser"
is used.

* htdoc/attrs.html: Fix typo for anchor/attribute
"allow_virtual_hosts".

* htdoc/attrs.html: Correct and add more verbose description of
external parser program parameters and fields.

Sun Dec 27 14:52:45 1998 Alexander Bergolth <leo@strike.wu-wien.ac.at>

* htlib/URL.cc: Small change in URL::removeIndex so that URLs are not
stripped if a query string ends with /index.html

* htsearch/Display.cc, htnotify/htnotify.cc: Added patches from
Gilles Detillieux <grdetil@scrc.umanitoba.ca> to fix memory leaks.

Sat Dec 19 17:53:44 1998 Alexander Bergolth <leo@strike.wu-wien.ac.at>

* htdig/main.cc, htdig/htdig.h, htdig/Retriever.cc: Added new option
bad_querystr. Allows exclusion when digging CGI-Scripts.

* htsearch/htsearch.cc, htsearch/Display.cc: Added new option
allow_in_form. Currently does not work with some special variable
names!

* htcommon/defaults.cc: Added the two new options.

Sat Dec 19 11:21:38 1998 Geoff Hutchison <ghutchis@wso.williams.edu>

* contrib/htparsedoc/parse_word_doc.pl: Update from Jesse.

* .version: Bump for 3.1.0b4.

* README: Ditto.

* Makefile.in: Remove references to version number.

* htnotify/htnotify.cc: Fix nasty security hole found by Werner
Hett <hett@isbiel.ch>.

Sat Dec 19 15:22:38 1998 Alexander Bergolth <leo@strike.wu-wien.ac.at>

* htlib/StringList.cc, htlib/StringList.h: Added StringList::Join
to simplify the creation of patterns for StringMatch.

* htlib/String.cc: lastIndexOf(char ch) added

* htlib/URL.cc: Changed URL::removeIndex to use local_default_doc.
(index.html was hardcoded) local_default_doc can be a list.

* htdig/main.cc, htlib/URL.cc: Use StringList::Join.

Sun Dec 13 23:06:35 1998 Geoff Hutchison <ghutchis@wso.williams.edu>

* htsearch/Display.cc: Fix potential coredump when calculating
date_factor and backlink_factor on docs that aren't in the
database.

Sat Dec 12 23:17:56 1998 Geoff Hutchison <ghutchis@wso.williams.edu>

* htdoc/cf_byname.html, htdoc/cf_byprog.html, htdoc/attrs.html:
Added docs for new options since version 3.1.0b2.

* htdoc/RELEASE.html: Added notes on changes since 3.1.0b2 (we
should keep this up rather than all-at-once).

* htdoc/hts_templates: Include documentation on using CGI
environment variables in templates with this version.

* htdig/Retriever.cc(got_href): Added check to prevent
currenthopcount from becoming -1.

* htcommon/WordList.cc: Change undefined minimumWordLength to
config("minimum_word_length").

Sat Dec 12 12:01:55 1998 Geoff Hutchison <ghutchis@wso.williams.edu>

* Makefile.in, Makefile.config.in, */Makefile.in: Added target
mostlyclean to clean up, but leave compile-intensive targets
(e.g. db, rx code). General cleanup too.

* htdoc/where.html: Updated for eventual 3.1.0b3 release.

* htcommon/WordList.cc: Added additional cleanups for the words in
the bad word file, in case they have invalid punctuation, etc.

Sat Dec 12 18:41:29 1998 Alexander Bergolth <leo@strike.wu-wien.ac.at>

* htmerge/words.cc: Fix last update so that it compiles on AIX.

Fri Dec 11 10:40:48 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdig/Retriever.cc: Added additional debugging info on the
|
|
reason for excluding a URL, based on a patch by Benoit Majeau
|
|
<Benoit.Majeau@nrc.ca>.
|
|
|
|
* htmerge/words.cc: Fixed a bug where pointers, rather than strings,
|
|
were assigned. Silly references...
|
|
|
|
* htsearch/Display.cc, htsearch/Display.h: Added patch from Gilles
|
|
to allow CGI environment variables in templates.
|
|
|
|
* htdig/HTML.cc: Fix core dump when META refresh tags don't have
|
|
content portions.
|
|
|
|
Thu Dec 10 22:28:44 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdig/Retriever.cc, htdig/Server.cc, htdig/Server.h:
|
|
Changed support for server_wait_time to use delay() method in
|
|
Server. Delay is from beginning of last connection to this
|
|
one. Currently this also delays local digging, which may not be ideal.
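The delay described in the entry above can be sketched roughly as follows. This is an illustration in Python, not the project's C++ code: the names Server, delay and server_wait_time come from this log, while the constructor arguments and the injectable clock and sleep hooks are assumptions made here so the logic is easy to try out.

```python
import time


class Server:
    """Hypothetical stand-in for htdig's Server class (illustrative only)."""

    def __init__(self, wait_time, clock=time.monotonic, sleep=time.sleep):
        self.wait_time = wait_time   # seconds, playing the role of server_wait_time
        self._clock = clock          # assumed hook: returns the current time in seconds
        self._sleep = sleep          # assumed hook: blocks for the given number of seconds
        self._last_start = None      # start time of the previous connection, if any

    def delay(self):
        """Wait until wait_time seconds have passed since the last connection began.

        Returns the number of seconds slept (0 if no wait was needed).
        """
        now = self._clock()
        slept = 0.0
        if self._last_start is not None:
            remaining = self.wait_time - (now - self._last_start)
            if remaining > 0:
                self._sleep(remaining)
                slept = remaining
                now = self._clock()
        # The interval is measured from the beginning of this connection.
        self._last_start = now
        return slept
```

With a 10 second wait time, a request that starts 4 seconds after the previous one began sleeps for the remaining 6 seconds; one that starts more than 10 seconds later does not sleep at all.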
|
|
|
|
* htcommon/defaults.cc: Added option for server_max_docs as a
|
|
limit on the number of docs returned from a server.
|
|
|
|
* contrib/htparsedoc/parse_word_doc.pl: New version from
|
|
Jesse. New code speedups and better matching of punctuation.
|
|
|
|
* htdig/Document.cc: Check http_proxy_exclude to see if it's
|
|
empty. If so, use the proxy.
|
|
|
|
Mon Dec 7 21:46:34 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htsearch/htsearch.cc: Fix thinko with multiple excludes and
|
|
restricts. Pointed out by Gilles.
|
|
|
|
* htcommon/defaults.cc: Add new option server_wait_time for the
|
|
number of seconds to wait between requests.
|
|
|
|
* htdig/Retriever.cc: Use server_wait_time to call sleep() before
|
|
requests. Should help prevent server abuse. :-)
|
|
|
|
* htcommon/WordList.cc(valid_word): Remove unnecessary code.
|
|
|
|
* htcommon/DocumentRef.cc: Fix typo that added description text
|
|
that contained punctuation or was too short.
|
|
|
|
Sun Dec 6 13:12:55 1998 Geoff Hutchison <ghutchis@ethel.williams.edu>
|
|
|
|
* htsearch/parser.cc: Check for empty boolean searches and report
|
|
an error. Fixes bug reported by Chuck O'Donnell <cao@bus.net>.
|
|
|
|
* install-sh, mkinstalldirs: Import latest version from autoconf.
|
|
|
|
* htcommon/DocumentRef.cc: Add the text of descriptions to the
|
|
word database with weight description_factor.
|
|
|
|
* htcommon/WordList.cc: Ensure duplicate words have minimum
|
|
location and anchor attributes.
|
|
|
|
* htcommon/WordRecord.h: Ensure blank WordRecords have a default
|
|
count of 1 since a word has to exist to have a WordRecord!
|
|
|
|
* htdig/ExternalParser.cc, htdig/PDF.cc, htfuzzy/EndingsDB.cc:
|
|
Ensure temporary files are placed in TMPDIR if it's set.
|
|
|
|
* htdig/Retriever.cc: Don't add the text of descriptions to the
|
|
word db here, it's better to do it in the DocumentRef itself.
|
|
|
|
* htmerge/words.cc: Check for word entries that are essentially
|
|
duplicates and compact them.
|
|
|
|
Sat Dec 5 01:10:46 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdoc/THANKS.html: Updated for recent submissions.
|
|
|
|
* htdoc/FAQ.html: Cleaned up title.
|
|
|
|
* htdoc/uses.html: Added more sites and cleaned up the HTML.
|
|
|
|
Fri Dec 4 20:15:41 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* db/os/os_fsync.c, db/mutex/mutex.c: Patch from Klaus Mueller
|
|
<K.Mueller@intershop.de> to compile under CygWinB20.
|
|
|
|
* htdig/HTML.cc: Fix mistake in last update--file was included
|
|
twice.
|
|
|
|
* htdig/Retriever.cc: Do a check for blank URLs before adding them
|
|
to the list to be retrieved.
|
|
|
|
Fri Dec 4 19:21:17 1998 Didier Gautheron <dgautheron@magic.fr>
|
|
|
|
* htdig/HTML.cc: Fix parser bug with < becoming a tag.
|
|
|
|
* htlib/Dictionary.cc: Added check for empty dictionaries.
|
|
|
|
* htlib/URL.cc: Allow server_aliases to work under virtual hosts.
|
|
|
|
* htmerge/htmerge.cc: Remove previous db.words.db file before
|
|
doing a word merging. Fixes bug with deleted documents keeping
|
|
entries.
|
|
|
|
* htdig/main.cc, htdig/Retriever.h, htdig/Retriever.cc: Added
|
|
parameter to Initial function to prevent URLs from being checked
|
|
twice during an update dig.
|
|
|
|
* htcommon/WordList.cc, htmerge/words.cc: Don't store c:1 and a:0
|
|
entries in db.wordlist to save space.
|
|
|
|
Fri Dec 4 19:08:28 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* configure.in, Makefile.in, Makefile.config.in: Remove DB_DIR and
|
|
RX_DIR.
|
|
|
|
* configure: Regenerated for configure.in changes.
|
|
|
|
* htsearch/htsearch.cc: Added usage message for the command line.
|
|
|
|
Fri Dec 4 18:52:55 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdoc/FAQ.html: Added question about phrase matching.
|
|
|
|
Fri Dec 4 21:21:00 1998 Alexander Bergolth <leo@leo.wu-wien.ac.at>
|
|
|
|
* configure.in: Check if the third argument of getpeername is a
|
|
size_t* or an unsigned int*.
|
|
|
|
* include/htconfig.h.in: Define GETPEERNAME_LENGTH_T.
|
|
|
|
* htlib/Connection.cc: Use GETPEERNAME_LENGTH_T as the type of the
|
|
third getpeername argument. Included strings.h which is needed for
|
|
FD_ZERO on AIX.
|
|
|
|
Thu Dec 3 23:03:15 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* configure.in: Check for getopt.h for those platforms that don't
|
|
have it. Fix checks for db and rx dirs since these names won't
|
|
change.
|
|
|
|
* include/htconfig.h.in: Define HAVE_GETOPT_H.
|
|
|
|
* configure: Generate from configure.in with latest autoconf
|
|
(2.12.2).
|
|
|
|
* htdig/Plaintext.cc: Removed compiler warnings.
|
|
|
|
* htdig/main.cc, htfuzzy/htfuzzy.cc, htmerge/htmerge.cc,
|
|
htnotify/htnotify.cc, htsearch/htsearch.cc: Use configure check to
|
|
only include getopt.h when it exists.
|
|
|
|
* htcommon/defaults.cc: Add new option http_proxy_exclude for
|
|
servers that shouldn't use the proxy, from a patch by Gilles
|
|
Detillieux.
|
|
|
|
* htdig/Document.h, htdig/Document.cc: Use it, from a patch by Gilles.
|
|
|
|
Tue Dec 1 21:36:37 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* Makefile.in: Fixed bug with "make depend," noted by Morgan Davis
|
|
<mdavis@cts.com>.
|
|
|
|
* htdig/main.cc, htfuzzy/htfuzzy.cc, htmerge/htmerge.cc,
|
|
htnotify/htnotify.cc, htsearch/htsearch.cc: Add include <getopt.h>
|
|
to help compiling under Win32 with CygWinB20.
|
|
|
|
* htdig/Retriever.cc: Update hopcount correctly by taking the
|
|
shortest paths to documents.
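One way to read "taking the shortest paths" is sketched below in Python. It is a guess at the idea, not the code in Retriever.cc: the function name update_hopcount and the dictionary of hop counts are hypothetical, and the start URL is assumed to have a hop count of 0.

```python
def update_hopcount(hopcounts, url, referrer_hops):
    """Record that url is reachable in referrer_hops + 1 hops.

    hopcounts maps URL -> smallest hop count seen so far (hypothetical structure).
    The stored value only ever decreases, so each document ends up with the
    length of the shortest link path found to it.
    """
    candidate = referrer_hops + 1
    current = hopcounts.get(url)
    if current is None or candidate < current:
        hopcounts[url] = candidate
    return hopcounts[url]
```

If a page is first found 4 hops deep and later linked directly from the start page, its stored hop count drops from 4 to 1; a later, longer path leaves it unchanged.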
|
|
|
|
* htlib/DB2_db.cc: Added fix from Alexander Bergolth for Berkeley
|
|
DB under AIX.
|
|
|
|
* htlib/StringMatch.cc: Added fix from Christian Schneider
|
|
<cschneid@relog.ch>, discovered from behavior with limit_urls_to.
|
|
|
|
Tue Dec 1 18:06:33 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdoc/hts_form.html: Explained why config fields reject periods.
|
|
|
|
* htdoc/FAQ.html: Added information about Internal Server Errors.
|
|
|
|
* htdoc/uses.html: Updated with more sites, change e-mail to Geoff.
|
|
|
|
Sun Nov 29 21:26:56 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htsearch/htsearch.cc: Fix last update so it compiles (oops!).
|
|
|
|
* htdig/Document.cc: As above!
|
|
|
|
Sun Nov 29 20:06:58 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htsearch/htsearch.cc: Improved support for multiple restrict and
|
|
exclude patterns, based on code from Gilles Detillieux
|
|
and William Rhee <willrhee@umich.edu>.
|
|
|
|
* htdig/Document.cc, htdig/PDF.cc: Fixed problems under FreeBSD
|
|
where <sys/types.h> needed to be before <sys/stat.h>, noted by
|
|
Gilles.
|
|
|
|
* htdig/Server.cc: Fixed bug with robots.txt files containing
|
|
tabs, based on patch from Christian Schneider <cschneid@relog.ch>.
|
|
|
|
* htdig/Document.cc: Fixed core dumps caused by mystrptime
|
|
returning NULL. Instead, we'll use the current timestamp. Noted by
|
|
Michael Hauber <mhauber@datacore.ch> and
|
|
<MARK_ALLEYNE@Non-HP-UnitedKingdom-om8.om.hp.com>.
|
|
|
|
Fri Nov 27 19:09:33 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* db/*: Import of Sleepycat's Berkeley DB 2.5.9
|
|
|
|
* rx/*: Import of FSF rx 1.5
|
|
|
|
* configure, configure.in: Updated to deal with changes in db, rx
|
|
directories.
|
|
|
|
* Attic/db-2.4.14.tar.gz: Removed old db package for update.
|
|
|
|
* htsearch/parser.cc: Removed bogus code with "%01" -> "|"
|
|
|
|
* htlib/URL.cc: Considers URLs with "%7E" to be equivalent to "~"
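The equivalence in the entry above amounts to a small normalization step, sketched here in Python rather than the project's C++. The function name normalize_tilde is hypothetical, and matching the lowercase form %7e as well is an assumption based on percent-encoding hex digits being case-insensitive; the log only mentions "%7E".

```python
import re

# Matches the percent-encoded tilde in either case (%7E or %7e).
_ENCODED_TILDE = re.compile(r"%7e", re.IGNORECASE)


def normalize_tilde(url):
    """Return url with every percent-encoded tilde replaced by a literal '~'."""
    return _ENCODED_TILDE.sub("~", url)
```

After this step, two spellings of the same user directory compare equal as strings, so the document is not indexed twice.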
|
|
|
|
* htlib/String.cc: Changed MinimumAllocationSize to cut down on
|
|
memory usage on small strings.
|
|
|
|
* htdig/Retriever.h, htdig/Retriever.cc, htdig/HTML.cc: Changed
|
|
Retriever::got_word to check for small words, valid_punctuation to
|
|
remove bugs in HTML.cc.
|
|
|
|
* htcommon/defaults.cc: Changed backlink_factor to 1000,
|
|
description_factor to 150, match_method to and, and
|
|
meta_description_factor to 50. Should produce more accurate search
|
|
results.
|
|
|
|
* htcommon/WordList.cc: Fixed bug with bad_words and
|
|
MAX_WORD_LENGTH, noted by Jeff Breidenbach <jeff@alum.mit.edu>.
|
|
|
|
* README: Updated to reflect bug-tracking system.
|
|
|
|
Tue Nov 24 15:57:28 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdig/Retriever.cc: Added patch to use local_default_doc with
|
|
local_user_urls from Gilles Detillieux
|
|
<grdetil@scrc.umanitoba.ca>.
|
|
|
|
Mon Nov 23 18:57:16 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdoc/RELEASE.html, htdoc/bugs.html, htdoc/contents.html,
|
|
htdoc/where.html: Updated for new bug reporting system.
|
|
|
|
* htdoc/TODO.html: Updated To Do w/ current status.
|
|
|
|
Sun Nov 22 14:03:06 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* installdir/rundig: Added checks for synonym databases older than
|
|
the synonym files.
|
|
|
|
* htcommon/defaults.cc: New config options "description_factor"
|
|
for weighting words added as link descriptions, and
|
|
"no_excerpt_show_top" to show the top of an excerpt instead of the
|
|
"no_excerpt_text".
|
|
|
|
* htdig/Retriever.cc: Use "description_factor" to weight link
|
|
descriptions with the documents at the end of the link.
|
|
|
|
* htsearch/Display.cc: Adjust date_factor and backlink_factor
|
|
rankings to produce better results.
|
|
|
|
* htsearch/Display.cc: Use "no_excerpt_show_top."
|
|
|
|
* htsearch/htsearch.cc: Don't remove boolean operators from
|
|
boolean search strings!
|
|
|
|
Thu Nov 19 01:31:37 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdoc/FAQ.html: Update for -ldb problem on Digital UNIX.
|
|
|
|
Wed Nov 18 05:14:53 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdoc/FAQ.html: Update FAQ w/ new questions, better responses.
|
|
|
|
* htdoc/mailing.html: Mention additional archive at
|
|
www.mail-archive.com.
|
|
|
|
* htdoc/require.html: Update requirements (libstdc++ instead of libg++).
|
|
|
|
Tue Nov 17 23:13:04 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* contrib/wordfreq/wordfreq.pl: Added changes by Isoif.
|
|
|
|
* htsearch/Display.cc: Added HTTP_REFERER to htsearch logging
|
|
|
|
* htdig/Document.cc: Fixed memory leak as a result of thinko.
|
|
|
|
* htcommon/DocumentRef.cc: Removed limit on number of link
|
|
descriptions.
|
|
|
|
Mon Nov 16 22:30:07 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htcommon/defaults.cc: Declare new config options backlink_factor
|
|
and date_factor for counting document backlink counts and modified
|
|
dates in rankings.
|
|
|
|
* htsearch/Display.cc: Use above factors.
|
|
|
|
* htsearch/ResultMatch.cc: Clarify getScore() comments.
|
|
|
|
* htlib/mktime.c: Import new version.
|
|
|
|
* installdir/htdig.conf: Add max_doc_size example (to help w/FAQ).
|
|
|
|
Mon Nov 16 10:46:15 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdig/ExternalParser.cc: Add checks for null tokens, adapted
|
|
from patch by Vadim Chekan.
|
|
|
|
* htdig/Retriever.cc: Count docBackLinks accurately (previously
|
|
all docs had count of 2!).
|
|
|
|
Sun Nov 15 17:04:34 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdig/HTML.cc(do_tag): Fix for refresh tags w/o URLs.
|
|
|
|
* htmerge/docs.cc, htmerge/words.cc: Change \r to \n, as mentioned
|
|
by Andrew Bishop.
|
|
|
|
* htcommon/DocumentRef.h, htcommon/DocumentRef.cc: Define new fields
|
|
docBackLinks (backlink count) and docSig (document signature).
|
|
|
|
* htdig/Retriever.cc: Keep track of docBackLinks.
|
|
|
|
* htsearch/Display.cc: Add variable BACKLINKS to display the count.
|
|
|
|
Sat Nov 14 20:30:18 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdig/HTML.cc(parse, do_tag): Ensure links respect META robot
|
|
settings. Patch contributed by Michael Spann
|
|
<mikes@mail.sv.dialogic.com>.
|
|
|
|
* htdig/HTML.cc(do_tag): Eliminate bug that ignores "?" in URLs
|
|
|
|
* htdig/HTML.cc(do_tag): Add support for META refresh tags as
|
|
"redirects", submitted by Aidas Kasparas
|
|
<kaspar@dobilas.infosistema.lt>.
|
|
|
|
Thu Nov 12 04:13:26 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdoc/contents.html: Added link to jitterbug bug db.
|
|
|
|
Sun Nov 8 21:10:19 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdoc/ChangeLog, htdoc/RELEASE.html, htdoc/THANKS.html:
|
|
Correct spelling error with Rene' Seindal's name.
|
|
|
|
* htdoc/hts_templates.html: Update to improve clarity.
|
|
|
|
Sun Nov 8 20:33:22 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdig/Document.cc: Changed reset to keep proxy settings--fixes
|
|
bug noted by Didier Gautheron <dgautheron@magic.fr>
|
|
|
|
Fri Nov 6 17:07:00 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* contrib/wordfreq/wordfreq.pl: Updated with patch from Isoif
|
|
Fettich <ifettich@netsoft.ro> to use Berkeley DB.
|
|
|
|
* contrib/whatsnew/whatsnew.pl: Fixed mistake from Oct 26 change.
|
|
|
|
* contrib/htparsedoc/parse_word_doc.pl: Added file contributed by
|
|
Jesse.
|
|
|
|
* contrib/README: Updated to include short descriptions of the scripts.
|
|
|
|
* contrib/multidig/*: New scripts to make working with multiple DB
|
|
a little easier.
|
|
|
|
* configure, configure.in: Added changes to support snapshots.
|
|
|
|
* .version: Resurrected to automate snapshot versions.
|
|
|
|
Wed Nov 4 20:13:10 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdoc/contents.html: Added "Contributors" for THANKS.html
|
|
|
|
* htdoc/THANKS.html: Added acknowledgement to contributors.
|
|
|
|
Wed Nov 4 15:02:43 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htnotify/htnotify.cc: Fixed buglet with -F flag to sendmail.
|
|
|
|
* htdig/Plaintext.cc: Added patch from Vadim Chekan to change char
|
|
to unsigned char to fix reading Cyrillic plaintext files.
|
|
|
|
Mon Nov 2 15:34:53 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htnotify/htnotify.cc, Makefile.config.in, README:
|
|
Changed "HTDig" to "ht://Dig."
|
|
|
|
Sun Nov 1 20:34:14 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* Makefile.in: Fixed buglet with dist target.
|
|
|
|
* htdig/Makefile.in: Fixed buglet with distclean target.
|
|
|
|
* htdoc/FAQ.html, htdoc/RELEASE.html, htdoc/attrs.html,
|
|
htdoc/cf_byname.html, htdoc/cf_byprog.html, htdoc/htdig.html,
|
|
htdoc/hts_templates.html: Updated documentation for new features,
|
|
bug-fixes in ht://Dig 3.1.0b2.
|
|
|
|
* htlib/Makefile.in, htlib/lib.h: Call mytimegm.cc instead of timegm.c.
|
|
|
|
* Attic/makedp: Remove file generated by configure
|
|
|
|
* htdig/Document.cc: Remove const from *ext to fix compiler warning.
|
|
|
|
Sun Nov 1 00:17:08 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htsearch/Display.cc: Added template var DESCRIPTION as first
|
|
item in DESCRIPTIONS, as requested by Ryan Scott
|
|
<test@netcreations.com>.
|
|
|
|
* htlib/mytimegm.cc: Resurrected mytimegm() until problems with
|
|
glibc version can be solved.
|
|
|
|
* htdig/Document.cc, htdig/Retriever.cc, htfuzzy/Prefix.cc,
|
|
htsearch/WeightWord.cc, htsearch/htsearch.cc: Replaced system
|
|
calls with htlib/my* functions.
|
|
|
|
Sat Oct 31 23:58:22 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htlib/URL.cc: Fixed compiler warning.
|
|
|
|
* rx-1.5/Attic/Makefile, rx-1.5/Attic/config.log:
|
|
Removed useless Makefile and config.log file.
|
|
|
|
Tue Oct 27 22:53:03 1998 Andrew Scherpbier <andrew@contigo.com>
|
|
|
|
* */Makefile.in (depend): Fixed so that 'make depend' works
|
|
again. (Not sure exactly how long it was broken!)
|
|
|
|
Tue Oct 27 20:00:16 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* Makefile.in: Fix buglet with distclean target
|
|
|
|
* configure, configure.in: Added check for LOCALTIME_R, removed
|
|
test for timegm replacement, changed compiler for most tests to
|
|
$CC.
|
|
|
|
* include/htconfig.in: Added option for LOCALTIME_R.
|
|
|
|
* htlib/timegm.c, htlib/mktime.c: Fixed some compilation problems.
|
|
|
|
* htlib/Makefile.in: Remove mktime.o since source is included in
|
|
timegm.o.
|
|
|
|
Tue Oct 27 13:31:25 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htlib/mktime.c: Imported new version from glibc-2.0.99.
|
|
|
|
* htcommon/DocumentDB.cc: Fixed bug noted by Vadim Chekan with
|
|
CreateSearchDB.
|
|
|
|
Mon Oct 26 15:27:28 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* Makefile.config.in, configure.in, configure: Fixed problem with
|
|
-ldb, -lrx, etc. not being declared in $LIBS
|
|
|
|
* htdoc/install.html: Added remarks about using ./configure
|
|
--prefix=
|
|
|
|
* README: Cleaned up for new URLs, version numbers, etc.
|
|
|
|
* htsearch/htsearch.cc: Added patch by Esa Ahola fixing bug with
|
|
not ignoring bad_words properly.
|
|
|
|
* contrib/whatsnew/whatsnew.pl: Added fix from Jacques Reynes
|
|
<Jacques.Reynes@cict.fr> to get whatsnew to work with Berkeley DB.
|
|
|
|
* htdig/Retriever.cc, htdig/Document.cc: Fixed bug introduced by
|
|
Oct 18 change. Authorization will not be cleared.
|
|
|
|
* htlib/URL.cc: Fixed new -Wall warnings.
|
|
|
|
Wed Oct 21 13:30:05 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htlib/timegm.c: Corrected Oct 17 change. Should now work. :-)
|
|
|
|
* htcommon/defaults.cc: Added defaults for new directives
|
|
server_aliases and limit_normalized.
|
|
|
|
* htdig/HTML.cc: Cleaned up HTML parsing based on patch by Rene'
|
|
Seindal.
|
|
|
|
Wed Oct 21 18:31:00 1998 Alexander Bergolth <leo@leo.wu-wien.ac.at>
|
|
|
|
* htlib/URL.cc, htlib/URL.h: Added patch to support translation of
|
|
server names. (Configuration directive: server_aliases)
|
|
|
|
* htdig/Retriever.cc, htdig/htdig.h, htdig/main.cc:
|
|
Additional limiting after normalization of the URL.
|
|
(Configuration directive: limit_normalized)
|
|
|
|
Sun Oct 18 17:19:51 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htlib/Connection.h, htlib/Connection.cc: Define new function
|
|
timeout() as adapted from a patch by Rene' Seindal.
|
|
|
|
* htdig/Document.cc: Use it as adapted from a patch by Rene' Seindal.
|
|
|
|
Sun Oct 18 16:33:58 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htcommon/DocumentDB.cc: Changed deserialize function to
|
|
explicitly delete DocumentRef.
|
|
|
|
* htcommon/DocumentRef.cc: Added trap for DOC_STRING value.
|
|
|
|
* htdig/Retriever.cc: Delete and reallocate Document variable
|
|
before retrieving. (Fixes database corruption bug) Removed code to
|
|
add a "/" to every URL with a 404--servers should send a redirect
|
|
in this case.
|
|
|
|
Sat Oct 17 20:15:44 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htlib/timegm.c: Declare __gmtime_r if not defined
|
|
|
|
Sat Oct 17 10:15:57 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* configure.in: Fixed problem with configuring DB_DIR introduced
|
|
by Oct 11 change.
|
|
|
|
* configure: Regenerated by autoconf for above fix.
|
|
|
|
* htlib/Connection.h, htlib/Connection.cc: Included fixes sent by
|
|
Paul J. Meyer <pmeyer@rimeice.msfc.nasa.gov> to fix connections on
|
|
Dec Alpha environments.
|
|
|
|
* htsearch/Display.cc, htsearch/Display.h,
|
|
htdoc/hts_templates.html: Added variable CURRENT as the number of
|
|
the current match, adapted from a patch by Rene' Seindal
|
|
<seindal@webadm.kb.dk>
|
|
|
|
* htcommon/defaults.cc: Changed htdig.sdsu.edu to www.htdig.org in
|
|
start_urls
|
|
|
|
Wed Oct 14 03:43:22 1998 turtle <turtle@kiwi>
|
|
|
|
* installdir/htdig.conf: fixed broken link pointed out by
|
|
chris@impulsedata.net, moved maintainer stuff up in the file
|
|
|
|
Sun Oct 11 22:16:27 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htlib/DB2_db.cc: Added fix suggested by Domotor Akos
|
|
<dome@impulzus.sch.bme.hu> with (char *)NULL cast.
|
|
|
|
* htlib/Attic/mytimegm.cc: Removed old mytimegm function.
|
|
|
|
* installdir/syntax.html: Improved boolean method error
|
|
message. It now gives examples of boolean expressions.
|
|
|
|
* htcommon/defaults.cc, htsearch/Display.cc, htsearch/Display.h,
|
|
htsearch/parser.cc: Added htsearch logging patch from Alexander
|
|
Bergolth.
|
|
|
|
* */Makefile.in, include/htconfig.h.in, htdig/Document.cc,
|
|
htdig/Images.cc, Attic/.version, Makefile.config.in, Makefile.in,
|
|
configure, configure.in, mkinstalldirs: Updated Makefiles and
|
|
configure variables.
|
|
|
|
* htfuzzy/Endings.cc, htfuzzy/Fuzzy.cc, htfuzzy/Prefix.cc,
|
|
htfuzzy/htfuzzy.cc, htlib/DB2_db.cc, htcommon/DocumentDB.cc:
|
|
Removed more -Wall warnings.
|
|
|
|
Fri Oct 9 00:29:18 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdig/Retriever.cc: Fixed typo with "meta_desription_factor".
|
|
|
|
* htdig/Images.cc: Use user_agent config in GET request.
|
|
|
|
Thu Oct 8 09:05:41 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* installdir/syntax.html: Improved Boolean search description.
|
|
|
|
Mon Oct 5 11:30:16 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* contrib/ewswrap/ewswrap.cgi, contrib/ewswrap/htwrap.cgi,
|
|
contrib/ewswrap/README: New scripts, contributed by John Grohol
|
|
PsyD <johngr@cmhcsys.com>.
|
|
|
|
Fri Oct 2 13:11:24 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdig/Retriever.cc: Added check for docs removed with
|
|
noindex. Now words in these docs should be ignored for the word
|
|
db.
|
|
|
|
Fri Oct 2 13:09:04 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* CONFIG, Makefile.config.in, Makefile.in, */Makefile.in,
|
|
htcommon/defaults.cc, htdig/main.cc, htfuzzy/htfuzzy.cc,
|
|
htmerge/htmerge.cc, htnotify/htnotify.cc, include/htconfig.h.in:
|
|
More configure improvements--use top_srcdir instead of
|
|
HTDIG_TOP, use PACKAGE, VERSION, etc.
|
|
|
|
Fri Oct 2 11:32:59 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htlib/StringList.cc: Added patch by Alexander Bergolth for bug
|
|
with multiple delimiter characters
|
|
|
|
Fri Oct 2 15:22:06 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* installdir/rundig, configure.in, CONFIG, CONFIG.in, aclocal.m4,
|
|
configure: Improvements in configure.in, notably using --prefix=
|
|
and --exec-prefix=
|
|
|
|
Tue Sep 29 19:26:11 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdig/HTML.cc: Added patch from Tim Frost <tim@nz.eds.com> for
|
|
single quotes around URLs.
|
|
|
|
* htfuzzy/Prefix.cc: Added patch from Esa to fix Prefix matching
|
|
for capitalization.
|
|
|
|
* htcommon/defaults.cc: Added modification_time_is_now config
|
|
|
|
* htdig/Document.cc, htdig/Retriever.cc: Added patch from Andrew
|
|
Bishop <amb@gedanken.demon.co.uk> for above to use modification
|
|
times when servers do not supply them.
|
|
|
|
* htsearch/htsearch.cc: Added patch from Andrew Bishop for -c switch.
|
|
|
|
Wed Sep 23 14:46:34 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htcommon/defaults.cc, htdig/Server.cc: Added case_sensitive
|
|
attribute to work on case insensitive servers.
|
|
|
|
Wed Sep 23 11:58:22 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htsearch/Display.cc: re-fixed bug noted by Alexander Bergolth
|
|
|
|
* htlib/Attic/timegm.cc, htlib/Makefile.in, htlib/mktime.c,
|
|
htlib/mytimegm.cc, htlib/timegm.c: Switched to using glibc timegm
|
|
replacement.
|
|
|
|
* configure, configure.in, Makefile.config.in: Add configure
|
|
searches for acroread and sendmail programs.
|
|
|
|
* htnotify/Makefile.in, htnotify/htnotify.cc,
|
|
htcommon/Makefile.in, htcommon/defaults.cc: Use them.
|
|
|
|
* htdig/HTML.cc: Fix thinko in META robots tag.
|
|
|
|
* htcommon/defaults.cc: Define iso_8601 date formatting option
|
|
|
|
* htsearch/Display.cc, htnotify/htnotify.cc: Use it as suggested
|
|
by Knut A. Syed <Knut.Syed@nhh.no>
|
|
|
|
Fri Sep 18 14:35:02 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htsearch/Display.cc: Fixed bug noted by Alexander Bergolth
|
|
<leo@strike.wu-wien.ac.at> in exclude logic
|
|
|
|
* htdig/HTML.cc: Fixed bug in comma-separated keywords noted by
|
|
<C.H.Liddiard@qmw.ac.uk>
|
|
|
|
* installdir/synonyms: New version contributed by John Banbury
|
|
<lijab@flinders.edu.au>
|
|
|
|
Fri Sep 18 00:38:09 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* .version: Bump to 3.1.0b2
|
|
|
|
* htsearch/Makefile.in, htdig/Makefile.in, htfuzzy/Makefile.in,
|
|
htlib/Makefile.in, htmerge/Makefile.in,
|
|
htnotify/Makefile.in, htcommon/Makefile.in: Remove include
|
|
.sniffdir directive.
|
|
|
|
* htdig/HTML.cc: Fix horrible META description coding.
|
|
|
|
* htfuzzy/EndingsDB.cc, htfuzzy/Fuzzy.cc, htfuzzy/Synonym.cc,
|
|
htfuzzy/htfuzzy.cc: Change "\r" to "\n" in statistics on
|
|
suggestion of Andrew M. Bishop <amb@gedanken.demon.co.uk>
|
|
|
|
* Makefile.config.in: Remove -ggdb from LDFLAGS.
|
|
|
|
Tue Sep 15 22:31:48 1998 turtle <turtle@kiwi>
|
|
|
|
* Makefile.in: add substitution for @DATABASE_DIR@
|
|
|
|
Thu Sep 10 00:06:58 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdig/HTML.cc: Change debug level of META tags.
|
|
|
|
* htsearch/TemplateList.cc, htsearch/htsearch.cc, htsearch/Display.cc,
|
|
htsearch/Display.h: Backed out builtin-long default from Monday, now
|
|
use error handler
|
|
|
|
Mon Sep 7 23:19:12 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* contrib/htparsedoc: Added contributed external parser for MS
|
|
Word documents by Richard Jones <rjones@imcl.com>.
|
|
|
|
* htdig/Document.cc: Added fix to use htparsedoc.
|
|
|
|
* htdoc/*.html: Merged in new documentation for htdig-3.1.0b1.
|
|
|
|
* htdig/HTML.cc: Extended "noindex" behavior in previous patch.
|
|
|
|
* htcommon/defaults.cc: Added user_agent config option.
|
|
|
|
* htdig/Document.cc: Use it.
|
|
|
|
Mon Sep 7 00:34:19 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htcommon/DocumentRef.h: Added DocState for documents marked as
|
|
"noindex".
|
|
|
|
* htdig/HTML.cc, htdig/Retriever.h, htdig/Retriever.cc,
|
|
htmerge/docs.cc: Use it to remove them.
|
|
|
|
* htsearch/TemplateList.cc: Add default template of builtin-long
|
|
to slot 0 in case of an error.
|
|
|
|
* htsearch/Display.cc: Use it.
|
|
|
|
Sun Sep 6 21:36:16 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htcommon/defaults.cc: Sorted the current list of defaults, added
|
|
"pdf_parser" for the program to use in PDF.cc.
|
|
|
|
* htdig/PDF.cc: Use it, checking for the file before calling
|
|
system to fail gracefully.
|
|
|
|
* htlib/URL.cc: Bug fix for http:/ v. http://
|
|
|
|
Sat Sep 5 23:11:48 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htlib/String.cc: Added patch by Zvi Har'El
|
|
<rl@math.technion.ac.il> to indexOf function to prevent "false
|
|
positive" matches.
|
|
|
|
* installdir/nomatch.html, installdir/syntax.html: Fixed reference
|
|
to ht://Dig 3.0.
|
|
|
|
* htdig/Document.cc: Use robotstxt_name as user-agent as a more
|
|
consistent approach.
|
|
|
|
* htsearch/parser.cc: Convert "%01" to "|" to support <SELECT
|
|
... MULTIPLE> tags.
|
|
|
|
Thu Sep 3 20:53:51 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htsearch/Makefile.in: Remove reference to -lgdbm
|
|
|
|
* htsearch/Display.cc: Send Content-type header after all variable
|
|
expansion is completed.
|
|
|
|
* htcommon/WordList.cc: Removed warning under egcs-1.1
|
|
|
|
Tue Aug 11 08:58:34 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htsearch/Display.cc, htdig/Retriever.h,
|
|
htdig/Retriever.cc, htdig/Parsable.h, htdig/Parsable.cc,
|
|
htdig/HTML.h, htdig/HTML.cc, htcommon/defaults.cc,
|
|
htcommon/DocumentRef.h, htcommon/DocumentRef.cc,
|
|
htcommon/DocumentDB.cc:
|
|
Second patch for META description tags. New field in DocDB for the
|
|
desc., space in word DB w/ proper factor.
|
|
|
|
* htmerge/docs.cc: Added statistic for total size of docs in DB.
|
|
|
|
Thu Aug 6 10:15:22 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdig/Retriever.cc: Added "local_dir_doc" config option,
|
|
the default filename in a directory.
|
|
|
|
* htcommon/defaults.cc: Fixed "elipses" spelling mistake,
|
|
local_dir_doc as above
|
|
|
|
Tue Aug 4 11:34:46 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htlib/Configuration.cc: Added fix by Philippe Rochat
|
|
<prochat@lbdsun.epfl.ch> to remove whitespace after config
|
|
options.
|
|
|
|
* htdig/HTML.cc, htdig/HTML.h: Added support for META robots tags.
|
|
|
|
Mon Aug 3 16:50:46 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htsearch/ResultList.cc, htnotify/htnotify.cc,
|
|
htmerge/htmerge.cc, htmerge/docs.cc, htlib/String.cc,
|
|
htlib/ParsedString.cc, htfuzzy/Substring.cc,
|
|
htfuzzy/Prefix.cc, htfuzzy/Exact.cc,
|
|
htdig/SGMLEntities.cc, htdig/Retriever.cc, htdig/PDF.cc,
|
|
htdig/HTML.cc, htdig/Document.cc:
|
|
Fixed compiler warnings under -Wall
|
|
|
|
Mon Aug 3 05:56:23 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htsearch/Display.cc: Spelling correction for "ellipses"
|
|
|
|
Thu Jul 23 12:14:34 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdig/PDF.cc, htdig/PDF.h, htdig/Document.cc: Added files (and
|
|
patch) from Sylvain Wallez for PDF parsing. Incorporates fix for
|
|
non-Adobe PDFs.
|
|
|
|
* htcommon/defaults.cc: Removed .pdf extension from bad_extensions.
|
|
|
|
Wed Jul 22 10:04:31 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htsearch/Display.cc: Added patch from Sylvain Wallez
|
|
<s.wallez.alcatel@e-mail.com> to use the filename if no title is found.
|
|
|
|
* htnotify/htnotify.cc: Added patch from Chris Jason
|
|
Richards <richards@cs.tamu.edu> to fix problems with sendmail.
|
|
|
|
Tue Jul 21 09:56:58 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htsearch/Display.cc: Added patch by Rob Stone
|
|
<rob@psych.york.ac.uk> to create new environment variables to
|
|
htsearch: SELECTED_FORMAT and SELECTED_METHOD.
|
|
|
|
Sun Jul 19 09:51:47 1998 Andrew Scherpbier <andrew@contigo.com>
|
|
|
|
* configure.in (berkeley db stuff): Added the berkeley db .tar.gz
|
|
to the distribution and modified configure.in to extract it if it
|
|
needs to.
|
|
|
|
Thu Jul 9 09:39:01 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
|
|
|
|
* htdig/Server.cc, htdig/Retriever.h, htdig/Retriever.cc,
|
|
htdig/Document.h, htdig/Document.cc, htcommon/defaults.cc: Added
|
|
support for local file digging using patches by Pasi Eronen
|
|
<pe@iki.fi>. Patches include support for local user (~username)
|
|
digging.
|
|
|
|
* htdig/HTML.h, htdig/HTML.cc, htcommon/defaults.cc:
|
|
Added support for META name=description tags. Uses new config-file
|
|
option "use_meta_description" which is off by default.
|
|
|
|
Mon Jun 22 05:02:01 1998 turtle <turtle@kiwi>
|
|
|
|
* configure.in:
|
|
Added test to make sure that the berkeley db library is present
|
|
|
|
* .cvsignore: Ignore the berkeley db library
|
|
|
|
* configure: changed
|
|
|
|
* Makefile.config.in: Removed GDBM references
|
|
|
|
* Makefile.in: Removed GDBM references
|
|
|
|
* .version: updated version to 3.1.0b1
|
|
|
|
* README: Updated version # and website location
|
|
|
|
* htdig/HTML.cc: Applied patch that prevented SGML entities that
|
|
translate to valid_punctuation characters from becoming part of
|
|
words
|
|
|
|
* configure.in: Removed references to GDBM
|
|
|
|
* htcommon/defaults.cc: Got rid of my email address as the default
|
|
maintainer
|
|
|
|
* htdig/htdig.conf: simple config file for development
|
|
|
|
* htlib/String.cc, htlib/Attic/SDSU.h, htlib/Attic/SDSU.cc,
|
|
htlib/DB2_db.cc, htlib/Connection.cc, htlib/Configuration.cc,
|
|
htlib/BTree.cc: New Berkeley database stuff
|
|
|
|
* htlib/.sniffdir/ofiles.incl: removed SDSU.*
|
|
|
|
* installdir/syntax.html, installdir/search.html,
|
|
installdir/rundig, installdir/nomatch.html, installdir/htdig.conf,
|
|
installdir/footer.html: Changed to use the new
|
|
http://www.htdig.org/ instead of the sdsu site
|
|
|
|
Sun Jun 21 23:20:14 1998 turtle <turtle@kiwi>
|
|
|
|
* rx-1.5/rx/Attic/config.log, htsearch/htsearch.cc,
|
|
htsearch/Attic/display.cc, htsearch/Display.cc, htmerge/docs.cc,
|
|
htlib/.sniffdir/ofiles.incl, htlib/Database.h, htlib/DB2_db.cc,
|
|
htlib/DB2_db.h, htlib/Database.cc, htfuzzy/.sniffdir/ofiles.incl,
|
|
htfuzzy/Prefix.cc, htfuzzy/Prefix.h, htfuzzy/Makefile.in,
|
|
htfuzzy/Fuzzy.cc, htcommon/defaults.cc, configure.in, Makefile.in,
|
|
Makefile.config.in: patches by Esa and Jesse to add BerkeleyDB and
|
|
Prefix searching
|
|
|
|
Mon Jun 15 18:15:50 1998 turtle <turtle@kiwi>
|
|
|
|
* htdig/HTML.cc: Added suggestion by Chris Liddiard to add ',' to
|
|
the list of separator characters for meta keyword parsing
|
|
|
|
Tue May 26 03:58:14 1998 turtle <turtle@kiwi>
|
|
|
|
* rx-1.5/rx/Attic/config.log, htlib/htString.h, htlib/cgi.cc,
|
|
htlib/URL.cc, htlib/String.cc, htlib/ParsedString.cc,
|
|
htlib/Database.cc, htlib/Connection.cc: Got rid of compiler
|
|
warnings.
|
|
|
|
* rx-1.5/rx/.cvsignore: added config.log
|
|
|
|
Fri Apr 3 17:10:44 1998 turtle <turtle@kiwi>
|
|
|
|
* htsearch/Display.cc: Patch to make excludes work
|
|
|
|
Tue Mar 10 16:02:32 1998 turtle <turtle@kiwi>
|
|
|
|
* htlib/strcasecmp.cc: Applied patch by Bernhard Griener to add
|
|
argument checks in the mystrncasecmp() function
|
|
|
|
Sun Feb 22 17:43:49 1998 turtle <turtle@kiwi>
|
|
|
|
* htdoc/mailing.html: New mailing list archive location
|
|
|
|
Tue Feb 17 18:05:40 1998 turtle <turtle@kiwi>
|
|
|
|
* htdoc/uses.html: added new one
|
|
|
|
Thu Feb 12 22:22:15 1998 turtle <turtle@kiwi>
|
|
|
|
* htdoc/uses.html: Added more sites
|
|
|
|
Mon Jan 5 06:14:11 1998 turtle <turtle@kiwi>
|
|
|
|
* configure, configure.in: Added check for fstream.h to get rid of
|
|
the annoying emails about ht://Dig not compiling...
|
|
|
|
* Makefile.config.in: Added include of the GDBM library back
|
|
|
|
* .version: Now at version 3.0.9
|
|
|
|
* include/htconfig.h.in: Changed refs to time related stuff
|
|
|
|
* htmerge/htmerge.cc, htmerge/docs.cc: format changes
|
|
|
|
* htdig/Document.cc: Changed tm from pointer to real structure
|
|
|
|
* htlib/.sniffdir/ofiles.incl, htlib/timegm.cc: Our own timegm
|
|
function
|
|
|
|
* rx-1.5/rx/.cvsignore, rx-1.5/rx/Attic/Makefile: cvs cleanup
|
|
|
|
* htmerge/docs.cc: Fixed memory leak
|
|
|
|
* htlib/lib.h: Added own replacement of timegm()
|
|
|
|
* htlib/Dictionary.cc: Fixed memory leaks
|
|
|
|
* htlib/Connection.cc: Fix by Pontus Borg for AIX. Changed
|
|
'size_t' to 'unsigned long' for the length parameter for
|
|
getpeername()
|
|
|
|
* htfuzzy/Metaphone.cc: formatting changes
|
|
|
|
* htdig/Retriever.cc: fixed memory leak
|
|
|
|
* htdig/Document.cc: Alarm was not cancelled if readHeader
returned anything but OK. Use our own timegm() replacement if
necessary.
|
|
|
|
* htcommon/DocumentRef.h, htcommon/DocumentRef.cc: format changes
|
|
|
|
* htcommon/DocumentDB.h: reformatting
|
|
|
|
* htcommon/DocumentDB.cc: Fixed major memory leak
|
|
|
|
* include/.cvsignore, include/Attic/htconfig.h, rx-1.5/.cvsignore,
|
|
rx-1.5/Attic/config.cache, rx-1.5/Attic/config.status,
|
|
rx-1.5/rx/.cvsignore, rx-1.5/rx/Attic/config.status,
|
|
htlib/Attic/htlib.proj, htmerge/.cvsignore,
|
|
htmerge/Attic/htmerge.proj, htnotify/.cvsignore,
|
|
htnotify/Attic/htnotify.proj, htsearch/.cvsignore,
|
|
htsearch/Attic/htsearch.proj, Attic/config.cache,
|
|
htcommon/Attic/htcommon.proj, htfuzzy/.cvsignore,
|
|
htfuzzy/Attic/htfuzzy.proj, lookfor: General cleanup of archived
|
|
stuff
|
|
|
|
* .cvsignore: config.cache added
|
|
|
|
* htdig/.cvsignore: Added htdig
|
|
|
|
Tue Dec 16 15:57:22 1997 turtle <turtle@kiwi>
|
|
|
|
* htdig/Document.cc: Added little patch by Tobias Oetiker
|
|
<oetiker@ee.ethz.ch> that should fix problems with timeouts.
|
|
|
|
Thu Dec 11 00:28:59 1997 turtle <turtle@kiwi>
|
|
|
|
* htlib/URL.h, htlib/URL.cc: Added double slash removal code.
|
|
Double slashes in URLs were causing loops.
|
|
|
|
Thu Oct 23 18:01:10 1997 turtle <turtle@kiwi>
|
|
|
|
* htlib/Connection.cc: Fix by Pontus Borg for AIX. Changed
|
|
'size_t' to 'unsigned long' for the length parameter for
|
|
getpeername()
|
|
|
|
Mon Oct 13 02:13:52 1997 turtle <turtle@kiwi>
|
|
|
|
* htdig/Attic/Makefile, htdig/Attic/htdig.proj: remove files that
|
|
shouldn't be in the repository
|
|
|
|
* htdig/.cvsignore: Ignore Makefile
|
|
|
|
* htdoc/cf_byname.html, htdoc/cf_byprog.html, htdoc/attrs.html,
|
|
htdoc/ChangeLog: Added documentation for the external_parsers
|
|
attribute.
|
|
|
|
Mon Jul 14 15:32:22 1997 turtle <turtle@kiwi>
|
|
|
|
* htdoc/uses.html: Added Cambridge
|
|
|
|
Wed Jul 9 15:57:30 1997 turtle <turtle@kiwi>
|
|
|
|
* htdoc/uses.html: added the rhodos project
|
|
|
|
Mon Jul 7 22:15:45 1997 turtle <turtle@kiwi>
|
|
|
|
* htdig/Document.cc: Removed old getdate() code that replaced '-'
|
|
with ' '.
|
|
|
|
* htlib/URL.cc: Sequences of "/./" are now replaced with "/" to
|
|
reduce the chance of infinite loops
|
|
|
|
* htdig/Document.cc: Added better date parsing. Now also supports
|
|
the old RFC 850 format
|
|
|
|
Thu Jul 3 17:44:39 1997 turtle <turtle@kiwi>
|
|
|
|
* htdoc/cf_byname.html, htdoc/cf_byprog.html,
|
|
htcommon/defaults.cc, htdig/htdig.h, htdoc/attrs.html,
|
|
htlib/Configuration.h, htlib/URL.cc, htdig/Attic/Makefile,
|
|
htdig/Document.cc: Added support for virtual hosts
|
|
|
|
Mon Jun 30 17:07:49 1997 turtle <turtle@kiwi>
|
|
|
|
* htdoc/uses.html: Added DePaul University
|
|
|
|
Tue Jun 24 14:59:45 1997 turtle <turtle@kiwi>
|
|
|
|
* Makefile.in: Fixed syntax error in the installation target.
|
|
|
|
Mon Jun 23 17:33:14 1997 turtle <turtle@kiwi>
|
|
|
|
* htdig/Attic/teamball.conf, htdig/Attic/tsdsu.conf,
|
|
htdig/Attic/rohan.conf, htdig/Attic/sdsu.conf, htdig/Attic/t.conf,
|
|
htdig/Attic/nsdsu.conf, htdig/Attic/daztec.conf,
|
|
htdig/Attic/max.conf, htdig/htdig.conf, htdig/Attic/Makefile,
|
|
htdig/Attic/catalog.conf: Removed old config files
|
|
|
|
* htdoc/FAQ.html: Initial FAQ
|
|
|
|
* htdoc/contents.html: Added link to the new FAQ
|
|
|
|
* htdoc/FAQ.html: *** empty log message ***
|
|
|
|
* htnotify/htnotify.cc: Added version info to the usage output
|
|
|
|
* htfuzzy/htfuzzy.cc: Added version info to the usage output
|
|
|
|
* htmerge/htmerge.cc: Added version info to usage message
|
|
|
|
* htdig/main.cc: Added version info to the usage message
|
|
|
|
Mon Jun 16 15:35:56 1997 turtle <turtle@kiwi>
|
|
|
|
* installdir/footer.html: Changed the hardcoded version number to
|
|
the new VERSION variable
|
|
|
|
* htdoc/hts_templates.html: Added docs for the VERSION and PERCENT
|
|
variables
|
|
|
|
* htsearch/Display.cc: Added PERCENT and VERSION variables for the
|
|
output templates
|
|
|
|
Sat Jun 14 18:52:42 1997 turtle <turtle@kiwi>
|
|
|
|
* htdig/Document.cc: Made redirect detection code more general
|
|
|
|
Fri Jun 13 05:31:17 1997 turtle <turtle@kiwi>
|
|
|
|
* htdoc/cf_general.html: Fixed typo
|
|
|
|
Thu Jun 5 15:00:53 1997 turtle <turtle@kiwi>
|
|
|
|
* htdoc/uses.html: added VG Gas Analysis Systems
|
|
|
|
Tue Jun 3 17:49:05 1997 turtle <turtle@kiwi>
|
|
|
|
* installdir/english.0.original, installdir/english.0: Added new
|
|
english dictionary for the endings algorithm
|
|
|
|
Thu May 29 14:56:40 1997 turtle <turtle@kiwi>
|
|
|
|
* htdoc/uses.html: Added Indiana University Computer Security
|
|
Office
|
|
|
|
Wed May 28 14:47:25 1997 turtle <turtle@kiwi>
|
|
|
|
* htdoc/main.html: Fixed typo
|
|
|
|
Mon May 19 15:23:18 1997 turtle <turtle@kiwi>
|
|
|
|
* htdoc/uses.html: Added Daily Californian Online
|
|
|
|
Tue May 13 19:28:32 1997 turtle <turtle@kiwi>
|
|
|
|
* htdoc/uses.html: Added The Reohr Group
|
|
|
|
* htdoc/uses.html: Added the Linux Documentation Project
|
|
|
|
Sun May 11 17:52:05 1997 turtle <turtle@kiwi>
|
|
|
|
* htdoc/index.html: Made the contents frame a little wider so that
|
|
text doesn't wrap
|
|
|
|
* htdoc/uses.html: Added NOVA and Gajo & Associati
|
|
|
|
Fri May 2 23:35:56 1997 turtle <turtle@kiwi>
|
|
|
|
* htdoc/uses.html: added www.bajan.org
|
|
|
|
Wed Apr 30 22:28:28 1997 turtle <turtle@kiwi>
|
|
|
|
* htdoc/uses.html: Added Caldera, Inc.
|
|
|
|
Sun Apr 27 14:43:31 1997 turtle <turtle@kiwi>
|
|
|
|
* htsearch/parser.cc, htsearch/parser.h, include/Attic/htconfig.h,
|
|
htdoc/RELEASE.html, htdoc/uses.html, htdoc/where.html,
|
|
htlib/URL.cc, htlib/strcasecmp.cc, htsearch/htsearch.cc, .version,
|
|
README, htdig/Attic/Makefile, htdoc/ChangeLog: changes
|
|
|
|
Mon Apr 21 15:44:39 1997 turtle <turtle@kiwi>
|
|
|
|
* htsearch/htsearch.cc: Added code to check the search words
|
|
against the minimum_word_length attribute
|
|
|
|
Sun Apr 20 15:27:37 1997 turtle <turtle@kiwi>
|
|
|
|
* CONFIG: Made paths more generic
|
|
|
|
* htdig/Document.cc: Added include for ctype.h
|
|
|
|
* htdig/Plaintext.cc: Fixed bug
|
|
|
|
Tue Apr 1 17:56:57 1997 turtle <turtle@kiwi>
|
|
|
|
* htdoc/uses.html: added ukc
|
|
|
|
Sun Mar 30 01:18:16 1997 turtle <turtle@kiwi>
|
|
|
|
* htdig/Attic/Makefile, htdoc/uses.html, Attic/Makefile.config,
|
|
Attic/config.log, Attic/config.status, .cvsignore, Attic/Makefile,
|
|
htsearch/Attic/Makefile, htsearch/.cvsignore,
|
|
htnotify/Attic/Makefile, htnotify/.cvsignore, htmerge/.cvsignore,
|
|
htmerge/Attic/Makefile, htlib/.cvsignore, htlib/Attic/Makefile,
|
|
htfuzzy/.cvsignore, htfuzzy/Attic/Makefile, htcommon/.cvsignore,
|
|
htcommon/Attic/Makefile: update
|
|
|
|
Thu Mar 27 00:06:05 1997 turtle <turtle@kiwi>
|
|
|
|
* htdig/Plaintext.cc: Applied patch supplied by Peter Enderborg
|
|
<pme@ufh.se> to fix a problem with a pointer running off the end
|
|
of a string.
|
|
|
|
Mon Mar 24 04:33:26 1997 turtle <turtle@kiwi>
|
|
|
|
* rx-1.5/rx/Attic/config.log, rx-1.5/rx/Attic/config.status,
|
|
htsearch/htsearch.h, htsearch/parser.h, include/Attic/htconfig.h,
|
|
rx-1.5/Attic/config.status, htsearch/Attic/Makefile,
|
|
htsearch/ResultList.cc, htsearch/ResultMatch.h,
|
|
htsearch/Template.h, htsearch/WeightWord.h, htlib/cgi.cc,
|
|
htlib/htString.h, htlib/io.cc, htmerge/Attic/Makefile,
|
|
htmerge/htmerge.h, htnotify/Attic/Makefile, htlib/StringList.cc,
|
|
htlib/StringList.h, htlib/String_fmt.cc, htlib/URL.h,
|
|
htlib/URLTrans.cc, htlib/Attic/SDSU.cc, htlib/Attic/String.h,
|
|
htlib/ParsedString.h, htlib/String.cc, htfuzzy/htfuzzy.cc,
|
|
htlib/Attic/Makefile, htlib/Configuration.cc, htlib/Connection.cc,
|
|
htlib/Database.h, htdig/URLRef.h, htfuzzy/Attic/Makefile,
|
|
htfuzzy/Exact.cc, htfuzzy/Fuzzy.h, htfuzzy/Substring.cc,
|
|
htfuzzy/SuffixEntry.h, htdig/Plaintext.cc, htdig/Postscript.cc,
|
|
htdig/SGMLEntities.cc, htdig/Server.cc, htdig/Server.h,
|
|
htdig/Attic/Makefile, htdig/ExternalParser.cc,
|
|
htdig/ExternalParser.h, htdig/Parsable.h, htcommon/Attic/Makefile,
|
|
htcommon/DocumentRef.h, htcommon/WordList.cc, htcommon/WordList.h,
|
|
htcommon/WordReference.h, htdig/Document.h, Attic/config.status,
|
|
configure, configure.in, Attic/Makefile, Attic/Makefile.config,
|
|
Attic/config.cache, Attic/config.log, Makefile.config.in: Renamed
|
|
the String.h file to htString.h to help compiling under win32
|
|
|
|
* Makefile.in: Updated "make dist" to remove CVS stuff
|
|
|
|
Fri Mar 14 17:15:32 1997 turtle <turtle@kiwi>
|
|
|
|
* htcommon/defaults.cc: Changed default value for remove_bad_urls
|
|
to true
|
|
|
|
Thu Mar 13 18:37:50 1997 turtle <turtle@kiwi>
|
|
|
|
* htnotify/htnotify.cc, Attic/Makefile.config,
|
|
htdig/SGMLEntities.cc, htdoc/uses.html: Changes
|
|
|
|
Thu Feb 27 00:52:52 1997 turtle <turtle@kiwi>
|
|
|
|
* htdoc/uses.html: new uses
|
|
|
|
Mon Feb 24 17:52:55 1997 turtle <turtle@kiwi>
|
|
|
|
* htsearch/htsearch.cc, htnotify/Attic/Makefile,
|
|
htsearch/Attic/Makefile, htlib/strcasecmp.cc,
|
|
htmerge/Attic/Makefile, htlib/Attic/Makefile, htlib/String.cc,
|
|
htlib/StringMatch.cc, htdig/SGMLEntities.cc,
|
|
htfuzzy/Attic/Makefile, htdig/Attic/Makefile,
|
|
htcommon/Attic/Makefile, htcommon/WordList.cc: Applied patches
|
|
supplied by "Jan P. Sorensen" <japs@garm.adm.ku.dk> to make
|
|
ht://Dig run on 8-bit text without the global unsigned-char option
|
|
to gcc.
|
|
|
|
Sun Feb 23 17:29:38 1997 turtle <turtle@kiwi>
|
|
|
|
* htdoc/uses.html: *** empty log message ***
|
|
|
|
Tue Feb 18 15:03:03 1997 turtle <turtle@kiwi>
|
|
|
|
* htdoc/uses.html: New uses of ht://Dig
|
|
|
|
Tue Feb 11 00:38:48 1997 turtle <turtle@kiwi>
|
|
|
|
* htsearch/htsearch.cc: Renamed the very bad wordlist variable to
|
|
badWords
|
|
|
|
Mon Feb 10 17:32:47 1997 turtle <turtle@kiwi>
|
|
|
|
* htlib/Connection.cc, htdig/Document.h, htdig/Document.cc,
|
|
htcommon/DocumentRef.cc, htcommon/DocumentRef.h: Applied AIX
|
|
specific patches supplied by Lars-Owe Ivarsson
|
|
<lars-owe.ivarsson@its.uu.se>
|
|
|
|
Fri Feb 7 18:04:13 1997 turtle <turtle@kiwi>
|
|
|
|
* htlib/URL.cc: Fixed problem with anchors without a URL
|
|
|
|
Mon Feb 3 17:37:59 1997 turtle <turtle@kiwi>
|
|
|
|
* .version, README: updated stuff to 3.0.8
|
|
|
|
* Many files: Initial CVS
|
|
|
|
Local Variables:
|
|
add-log-time-format: current-time-string
|
|
End:
|