In Electronic Publishing and the Information Superhighway: Proceedings of the Dartmouth Institute for Advanced Graduate Studies (DAGS '95), pages 124-133, Boston, May 1995.
Abstract
The automated discovery of logical structure in text documents
is an important problem that has recently received a good deal of
attention; such discovery can enable the creation of
flexible and sophisticated document manipulation tools that will
greatly increase the impact of electronic documents. This paper
addresses aspects of the nature of these logical structures,
in order to develop categories of structures that reflect
the variation in requirements for discovery and the variation
in significance for applications. A complete taxonomy is not
developed, but relevant attributes are identified in three
forms of categorization: fundamental, based on structure definitions;
discovery, based on required observables to find structures; and
usage, based on roles structures play in applications.
The attributes themselves are independent of the choice
of particular logical structures to consider in a given
application, and their direct
implications are discussed.