STARTS
Stanford Protocol
Proposal for Internet Search and Retrieval
Reference Implementation
Installing and running the implementation
This page describes how to install the components of the
reference implementation on your workstation and run them. The
entire set of components has been tested and run on Solaris 2.5.
Note that even though the Java parts of the implementation (most
notably the StartsServer application) will run on any Java
virtual machine, the freeWAIS parts of the implementation
will run only on UNIX. Although not recommended, you can run
these components on separate machines (e.g., an NT machine
and a UNIX machine). Consult the testing notes for more
information.
- Installing the freeWAIS search engine
- download the tar'ed freeWAIS
binaries that have been compiled for
Solaris 2.5.
- create a new directory (which we'll call <freewais>
here) and extract the tar'ed files into that
directory.
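The create-a-directory-and-extract step recurs for every archive on this page (freeWAIS, the index files, the document sources, and StartsServer). A minimal sketch of a reusable helper follows. The function name extract_into and the archive and directory names in the example calls are hypothetical placeholders, not the actual download names.

```shell
# Hypothetical helper: create a directory and unpack a tar archive into it.
# usage: extract_into <archive.tar> <target-dir>
extract_into() {
    archive=$1
    target=$2
    mkdir -p "$target" || return 1
    # The archive is fed on stdin, so a relative archive path still
    # works after the subshell changes into the target directory.
    (cd "$target" && tar xf -) < "$archive"
}

# Example calls (archive names are placeholders):
#   extract_into freewais.tar "$HOME/starts/freewais"   # <freewais>
#   extract_into indexes.tar  "$HOME/starts/indexes"    # <indexes>
```

Reading the archive from standard input avoids relying on tar options that differ between UNIX variants.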
- Installing the sample database
- The index files
- download the tar'ed source index files.
- create a new directory (which we'll call <indexes>
here) and extract the tar'ed files
into that directory.
- NOTE: the <indexes>
directory will now contain the inverted
index, catalog, dictionary, and ancillary
files for each source (cstr and linux).
The two files with the "fmt"
suffix are the format files for each
source, which allow the wais indexer to
extract field information from the
document source files (see the freeWAIS
documentation for more information).
- The document files
- download the tar'ed cstr document
sources.
- create a new directory (which we'll call <cstrdocs>
here) and extract the tar'ed files
into that directory.
- download the tar'ed linux document
sources.
- create a new directory (which we'll call <linuxdocs>
here) and extract the tar'ed files
into that directory.
- Modify the index files to point to the document
files
- open the two files cstr.cat and linux.cat
in the <indexes> directory
using your favorite text editor.
- both files consist of headlines and
DocIDs for the set of documents in the
source. Globally change the pathname for
the documents to point to the respective
file in either <cstrdocs>
or <linuxdocs>. For
example, Document #1 in the cstr
collection has the pathname
/home/lagoze/projects/starts/freewais/cstrdb/92-1260
Change this to:
<cstrdocs>/92-1260
- NOTE: you could reindex the files to
accomplish the same task, but this method
is easier.
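The global pathname change described above can also be done with sed instead of a text editor. The following is a sketch only: the function name repoint_catalog and the CSTRDOCS and INDEXES variables are hypothetical, and it assumes the old directory prefix appears literally in the catalog file. Only the cstr prefix is shown on this page; look inside linux.cat to find the prefix used there before running the same command on it.

```shell
# Hypothetical helper: replace an old directory prefix with a new one
# throughout a catalog file, keeping a backup copy with an .orig suffix.
# usage: repoint_catalog <catalog-file> <old-prefix> <new-prefix>
repoint_catalog() {
    cat_file=$1
    old=$2
    new=$3
    cp "$cat_file" "$cat_file.orig" || return 1
    # '|' is the sed delimiter, so neither prefix may contain '|'.
    # The old prefix is treated as a regular expression.
    sed "s|$old|$new|g" "$cat_file.orig" > "$cat_file"
}

# Example call (CSTRDOCS and INDEXES stand for <cstrdocs> and <indexes>):
#   repoint_catalog "$INDEXES/cstr.cat" \
#       /home/lagoze/projects/starts/freewais/cstrdb "$CSTRDOCS"
```

If the result is wrong, the untouched catalog is still available in the .orig file.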
- Installing the StartsServer application
- download the tar'ed StartsServer
application.
- create a new directory (which we'll call <server>
here) and extract the tar'ed files into that
directory.
- modify StartsServer for your host.
- open the file <server>/STARTSConfiguratin/ServerConfiguration.java
in your favorite text editor.
- change the initial value of the variable
"hostName" to correspond to the
host that your server will run on.
- Installing the CGI script for interfacing between an HTTP
server and StartsServer.
- download the Perl
CGI script
- place the script in a directory that your HTTP
server, from which you will be accessing StartsServer,
has execute access to.
- Installing the HTML input form for
constructing STARTS requests.
- download the HTML input
form.
- place the HTML file in a directory accessible
from your HTTP server.
- Running the components
- Running freeWAIS
- in the <freewais>
directory run the executable waisserver
with the arguments "-p 5000"
(specifying port 5000) and "-d <indexes>"
(specifying that the indexes are in the
directory <indexes>)
- Running StartsServer
- in the <server> directory
run the command "java Server".
- StartsServer will then start up,
issue messages as it pre-loads the document
sources for both cstr and linux,
and then listen on port 6789.
- Make the CGI script accessible from your HTTP
server
- edit the "config" file for your
HTTP server to execute the Perl CGI
script (called nph-starts.pl)
when it receives URLs that look like /STARTS/*.
In the NCSA server this means adding the
following line to the httpd.conf
file:
Exec /STARTS/* <dir>/nph-starts.pl
where <dir> is the directory in
which the Perl CGI script is resident.
- You can now access the reference implementation
through your HTTP server. The easiest way to do
this is to use the input form that you installed above.
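The two start-up commands above might be wrapped as follows. This is a sketch under stated assumptions: the function names and the FREEWAIS, INDEXES, and SERVER variables (standing for <freewais>, <indexes>, and <server>) are hypothetical, and it assumes java is on your PATH. The commands and arguments themselves are the ones given in the steps above.

```shell
# Placeholders for the directories chosen during installation.
FREEWAIS=${FREEWAIS:-$HOME/starts/freewais}
INDEXES=${INDEXES:-$HOME/starts/indexes}
SERVER=${SERVER:-$HOME/starts/server}

# Run waisserver on port 5000 against the index directory.
run_freewais() {
    (cd "$FREEWAIS" && ./waisserver -p 5000 -d "$INDEXES")
}

# Run StartsServer, which listens on port 6789 once it has started.
run_starts_server() {
    (cd "$SERVER" && java Server)
}

# Typical use: start each one in its own terminal, or in the background:
#   run_freewais &
#   run_starts_server &
```

Each function runs in a subshell, so the caller's working directory is left unchanged.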
Carl
Lagoze
lagoze@cs.cornell.edu