Class wais.WAISInvertedIndex
java.lang.Object
|
+----wais.WAISInvertedIndex
- public class WAISInvertedIndex
- extends Object
Maps to a WAIS inverted index file. The sole constructor opens an
inverted index file given a database name (source) and a field. From this
file we can extract data about terms and data about terms with respect to
documents.
-
WAISInvertedIndex(String, String, String)
- Create (open) a WAIS inverted index that corresponds to the
given source/field.
-
Close()
- Close the inverted index file.
-
GetWAISTermDocInfo(long, int)
- Given an offset into the inverted index and an entry number of
a document, return data about the term within that document.
-
GetWAISTermInfo(String, long)
- Get the term info (not document specific) for a given term at a given
offset in the inverted index.
WAISInvertedIndex
public WAISInvertedIndex(String indexLocation,
String dataBase,
String field) throws SourceException
- Create (open) a WAIS inverted index that corresponds to the
given source/field.
- Parameters:
- indexLocation - Source specific location of index files.
- dataBase - ID of the database.
- field - Field identifier for the database (may be null).
Close
public void Close()
- Close the inverted index file. Garbage collection (a finalize
method) could also do this, but only at an unpredictable time.
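Because finalization timing is unreliable, a caller would typically call Close() explicitly from a finally block. The sketch below illustrates that pattern only. The wais package is not reproduced on this page, so StubIndex is a hypothetical stand-in that mirrors just the documented Close() signature; isOpen() and useAndClose() are helper names invented for the example.

```java
// Hypothetical stand-in for wais.WAISInvertedIndex; only Close() is mirrored.
class StubIndex {
    private boolean open = true;

    // Helper for this sketch only; not part of the documented API.
    boolean isOpen() { return open; }

    // Same name and signature as the documented method.
    public void Close() { open = false; }
}

class CloseSketch {
    // Runs the work, then closes the index in finally so the file is
    // released even if the work throws. Returns whether it is still open.
    static boolean useAndClose(StubIndex index, Runnable work) {
        try {
            work.run();
        } finally {
            index.Close();
        }
        return index.isOpen();
    }

    public static void main(String[] args) {
        StubIndex index = new StubIndex();
        boolean stillOpen = useAndClose(index, () -> { /* lookups would go here */ });
        System.out.println("open after use: " + stillOpen);
    }
}
```

With the real class, the constructor and the lookup methods declare SourceException, so the try block would also need to handle or propagate it.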
GetWAISTermInfo
public WAISTermInfo GetWAISTermInfo(String term,
long offset) throws SourceException
- Get the term info (not document specific) for a given term at a given
offset in the inverted index. The pieces of information we can get
at this level are:
- the total number of occurrences of the term in the source
(multiple occurrences within a document each count)
- the number of postings (documents) for the term in the inverted
index file; this should match the information in the dictionary
file.
- Parameters:
- term - The term to extract information for.
- offset - The offset in the inverted index file at which the
term resides.
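To illustrate how the two term-level values relate, here is a minimal sketch. The accessors of WAISTermInfo are not shown on this page, so TermStats and its field and method names are hypothetical, and the numbers are made up.

```java
// Hypothetical holder for the two values listed above; the real
// WAISTermInfo accessors are not documented on this page.
final class TermStats {
    final long totalOccurrences; // occurrences of the term across the whole source
    final int postings;          // number of documents that contain the term

    TermStats(long totalOccurrences, int postings) {
        this.totalOccurrences = totalOccurrences;
        this.postings = postings;
    }

    // Mean occurrences per document containing the term; 0 when there are no postings.
    double meanOccurrencesPerPosting() {
        return postings == 0 ? 0.0 : (double) totalOccurrences / postings;
    }

    // The posting count is expected to agree with the dictionary file's count.
    boolean matchesDictionary(int dictionaryPostings) {
        return postings == dictionaryPostings;
    }
}

class TermStatsDemo {
    public static void main(String[] args) {
        TermStats stats = new TermStats(12, 4); // made-up values for illustration
        System.out.println(stats.meanOccurrencesPerPosting()); // prints 3.0
        System.out.println(stats.matchesDictionary(4));        // prints true
    }
}
```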
GetWAISTermDocInfo
public WAISTermDocInfo GetWAISTermDocInfo(long offset,
int docEntryNum) throws SourceException
- Given an offset into the inverted index and an entry number of
a document, return data about the term within that document. The
information available here is:
- a float value which gives the weight of the term with respect
to the document.
- the number of the occurrences of the term in the document.
- Parameters:
- offset - The offset into the inverted index file of the
term.
- docEntryNum - The entry number of the document for which
we wish to extract term specific information.
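One plausible use is to walk every posting of a term and total the per-document counts. The sketch below assumes that entry numbers run from 0 to postings - 1, which this page does not state. DocStats, DocInfoLookup and sumOccurrences are hypothetical names; the interface only mirrors the documented parameter list and omits the SourceException that the real method declares.

```java
// Hypothetical holder for the two per-document values; the real
// WAISTermDocInfo accessors are not documented on this page.
final class DocStats {
    final float weight;     // weight of the term with respect to the document
    final int occurrences;  // occurrences of the term in the document

    DocStats(float weight, int occurrences) {
        this.weight = weight;
        this.occurrences = occurrences;
    }
}

// Mirrors the documented parameter list (long offset, int docEntryNum).
interface DocInfoLookup {
    DocStats get(long offset, int docEntryNum);
}

class PostingsWalk {
    // Assumption: entry numbers are 0-based and consecutive.
    static long sumOccurrences(DocInfoLookup lookup, long offset, int postings) {
        long total = 0;
        for (int entry = 0; entry < postings; entry++) {
            total += lookup.get(offset, entry).occurrences;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] counts = {5, 1, 6}; // made-up per-document counts
        DocInfoLookup fake = (offset, entry) -> new DocStats(0.5f, counts[entry]);
        System.out.println(sumOccurrences(fake, 0L, counts.length)); // prints 12
    }
}
```

If the assumption holds, the total should equal the source-wide occurrence count reported by GetWAISTermInfo for the same term.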