Introduction
GlimpseHTTP is a collection of tools that lets you use
Glimpse
to search your files through an HTTP interface. In other words,
it is a gateway between the Glimpse search engine and HTTP. Glimpse indices
are much smaller than, for example, WAIS indices
(2-7% of the size of the text, versus more than 100% for WAIS).
Glimpse also returns the line containing the match
(like grep) and lets you search even with misspellings.
Glimpse, however, can be slower than WAIS, and it does not rank the matches
(although there is a custom modification that does that).
Furthermore, GlimpseHTTP allows you to integrate search with
browsing. If you have several nested directories that the user may
browse, you can include the Glimpse interface in each document so that
each search covers only the relevant directories. More
details are given below.
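
To illustrate the idea of restricting a search to the directory being browsed, here is a minimal Python sketch. It is not taken from GlimpseHTTP: the function names and the example paths are hypothetical, and it simply assumes that hits are available as file paths and that the current directory is known.

```python
from pathlib import PurePosixPath


def in_scope(file_path: str, current_dir: str) -> bool:
    """Return True if file_path lies inside current_dir (at any depth)."""
    file_p = PurePosixPath(file_path)
    dir_p = PurePosixPath(current_dir)
    # Compare whole path components, so /archive/unixx is not inside /archive/unix.
    return dir_p in file_p.parents


def scope_hits(hits, current_dir):
    """Keep only the hits that belong to the directory being browsed."""
    return [h for h in hits if in_scope(h, current_dir)]


if __name__ == "__main__":
    hits = [
        "/archive/unix/tools/agrep.txt",
        "/archive/unix/readme.txt",
        "/archive/mac/readme.txt",
    ]
    for hit in scope_hits(hits, "/archive/unix"):
        print(hit)
```

A real installation could equally achieve this by keeping a separate index per directory; the sketch only shows the scoping rule itself.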
The current version of GlimpseHTTP was
tested with the NCSA httpd 1.2 HTTP server.
Glimpse itself currently works on many Unix platforms.
Any HTML browser can be used to search and browse the information,
including NCSA Mosaic for X-Windows, MS-Windows, and
Macintosh, Lynx, and others. For maximum convenience
your browser should support forms, although minimal
functionality can be achieved with any browser.
Because GlimpseHTTP uses Glimpse, it inherits some unique features:
- A very small index (3-5% of the total text).
 - Reasonably fast search.
 - Search for approximate match allowing errors.
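
As an illustration of what "approximate match allowing errors" means, the sketch below finds lines that contain the pattern with at most a given number of single-character edits (insertions, deletions, or substitutions). It uses a plain dynamic-programming scan and is not Glimpse's own algorithm; the function names are hypothetical.

```python
def approx_find(pattern: str, text: str, max_errors: int) -> bool:
    """Return True if some substring of text is within max_errors edits of pattern."""
    m = len(pattern)
    # prev[i] = fewest edits needed to match pattern[:i] against some
    # substring of the text that ends at the current position.
    prev = list(range(m + 1))
    if prev[m] <= max_errors:
        return True
    for ch in text:
        curr = [0] * (m + 1)  # curr[0] == 0: a match may start anywhere
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            curr[i] = min(
                prev[i - 1] + cost,  # match or substitute
                prev[i] + 1,         # extra character in the text
                curr[i - 1] + 1,     # pattern character missing from the text
            )
        if curr[m] <= max_errors:
            return True
        prev = curr
    return False


def matching_lines(pattern, lines, max_errors):
    """Return (line number, line) pairs for lines that approximately match."""
    return [
        (number, line)
        for number, line in enumerate(lines, start=1)
        if approx_find(pattern, line, max_errors)
    ]


if __name__ == "__main__":
    sample = ["The serch engine", "nothing here", "search works"]
    for number, line in matching_lines("search", sample, 1):
        print(number, line)
```

With one error allowed, the misspelled "serch" still matches "search"; with zero errors allowed, only the exact occurrence does.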
 
In addition, GlimpseHTTP provides you with the following
capabilities:
- You can use a combination of browsing and searching:
    first locate the directory where the relevant
    information is likely to be stored, then use search
    to locate specific files.
 - The result of the search is a nicely formatted hypertext with
    hyperlinks to matching documents.
 - Following the hyperlink leads you not only to a particular
    file, but also to the exact place where the match occurred.
 - URLs that appear in the documents are converted on the fly to
    actual hyperlinks, which you can follow immediately. This
    makes GlimpseHTTP particularly suitable for searching
    meta-information (Internet directories etc.).
 - Similar tools are provided for archiving and searching
    USENET newsgroups. You can maintain an archive of news articles
    and allow people to search it using the
    same interface. Supported features include a kill file for articles
    and fast search for particular posters. Since the news archiver uses
    the NNTP interface, you can archive news articles from remote
    news servers. (Combined browsing and searching for news is not yet
    implemented: browsing in this case means selecting the pertinent
    newsgroup(s), and currently only search within
    one newsgroup at a time is supported.)
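
The on-the-fly conversion of URLs into hyperlinks mentioned in the list above can be sketched as follows. This is a hedged illustration in Python, not GlimpseHTTP's code: the regular expression, the set of recognized schemes, and the function name are assumptions made for the example.

```python
import html
import re

# Assumed URL pattern: a scheme followed by any run of characters that
# cannot end a URL in plain text (whitespace, angle brackets, quotes).
URL_RE = re.compile(r"(?:https?|ftp|gopher)://[^\s<>\"']+")


def linkify(line: str) -> str:
    """Escape a line of plain text for HTML and wrap each URL in an anchor."""
    parts = []
    last = 0
    for match in URL_RE.finditer(line):
        # Text between URLs is escaped so it cannot inject markup.
        parts.append(html.escape(line[last:match.start()]))
        safe_url = html.escape(match.group(0), quote=True)
        parts.append(f'<a href="{safe_url}">{safe_url}</a>')
        last = match.end()
    parts.append(html.escape(line[last:]))
    return "".join(parts)


if __name__ == "__main__":
    print(linkify("See ftp://ftp.example.org/pub for <files>"))
```

A production version would also need to handle trailing punctuation, such as a period directly after a URL, which this simple pattern would treat as part of the link.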
 
Among the possible applications of GlimpseHTTP we envision:
- FTP sites with search possibilities;
 - news archiving sites;
 - any search application that is accessed over a local
    or global network and for which approximate matching and/or
    saving disk space for indices matters.
 
GlimpseHTTP components
-  aglimpse - "Archive Glimpse" - a tool for searching file
     hierarchies indexed for Glimpse. aglimpse is a CGI-compliant
     program which performs the search and formats the output as
     an HTML document with hyperlinks to the matches.
 -  Archive Manager - facilitates maintaining and
     indexing Glimpse archives. One of its options is
     HTML indexing, which prepares hypertext indices for
     each searchable directory; this supports the concept
     of combined browsing and searching.
 -  GlimpseNews - a collection of tools for archiving and
     searching newsgroup archives.
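
To make the role of a gateway like aglimpse more concrete, the following Python sketch turns grep-style search output into an HTML list with one hyperlink per matching file. It is not aglimpse itself. The `path: matched text` input format, the `base_url` parameter, and the function names are assumptions made for this example.

```python
import html
from urllib.parse import quote


def parse_hit(raw: str):
    """Split an assumed 'path: matched text' line; return None if it does not fit."""
    path, sep, text = raw.partition(": ")
    if not sep:
        return None
    return path, text


def render_results(raw_lines, base_url="/archive"):
    """Group hits by file and format them as nested HTML lists with links."""
    by_file = {}
    for raw in raw_lines:
        hit = parse_hit(raw.rstrip("\n"))
        if hit is None:
            continue  # skip lines that are not in the assumed format
        path, text = hit
        by_file.setdefault(path, []).append(text)

    out = ["<ul>"]
    for path, texts in by_file.items():
        # Percent-encode the path for the URL, HTML-escape it for display.
        href = base_url.rstrip("/") + "/" + quote(path.lstrip("/"))
        out.append(f'<li><a href="{html.escape(href)}">{html.escape(path)}</a>')
        out.append("<ul>")
        for text in texts:
            out.append(f"<li>{html.escape(text)}</li>")
        out.append("</ul></li>")
    out.append("</ul>")
    return "\n".join(out)


if __name__ == "__main__":
    print(render_results(["docs/a.txt: first hit", "docs/a.txt: second hit"]))
```

A real CGI program would additionally read the query from the request, run the search, and emit the HTTP headers before the document; those steps are omitted here.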
 
Documentation
Software
See also
Authors
Paul Klark (GlimpseHTTP)
Udi Manber, Sun Wu, and Burra Gopal (Glimpse)
University of Arizona, Department of Computer Science
To be added to the Glimpse mailing list, send mail to
glimpse-request@cs.arizona.edu
Paul Klark
paul@cs.arizona.edu