Open Archives Software 2000-03-07 13:54:51 -0500
Open Archives Reference Software
This document describes the installation and use of the Open Archives Software (OA Software) subset of the Dienst software. This software provides a front end that is simple to install and use for archives that choose to support the Open Archives Subset of the Dienst Protocol. This protocol provides a mechanism for harvesting common metadata - the Open Archives Metadata Set - and archive-specific metadata from records (e.g., documents) in participating archives. The Open Archives Initiative home page provides complete information on participation in the initiative.
The OA Software is a small set of Perl files that manage dispatching of protocol requests defined by the Open Archives Subset. The OA Software is intended for use in conjunction with site-specific software that manages the individual archive. Use of the OA Software will require programming to establish the actual functional interface between the dispatched protocol requests and the individual archive.
Organizations wishing to participate in the Open Archives Initiative that do not already have archive software should look at the full Dienst software release.
The Open Archives software is designed and written to be run in conjunction with an HTTP server. (In fact, the protocol is designed to be embedded in URLs carried in HTTP requests). The installation instructions support two mechanisms for linking an HTTP server to the Open Archives Software:
Using standard CGI, which is supported by virtually all HTTP servers (although the OA Software is intended to be run with the Apache HTTP server).
Using mod_perl, which embeds a persistent Perl interpreter and the Open Archives Software in an Apache HTTP server. This significantly speeds up the handling of protocol requests by avoiding the overhead of starting the Perl interpreter at each request. mod_perl is only supported for various flavors of UNIX (e.g., Linux, Solaris, HP-UX).
All of the Perl code in the Open Archives Software will run on any computer system that supports Perl (many flavors of UNIX, many flavors of Windows, MacOS). However, use of the OA Software in conjunction with an HTTP server requires URL rewriting (in order to redirect Open Archives protocol requests to the OA Software). To the best of our knowledge, URL rewriting is available only through the mod_rewrite module in Apache. While Apache is supported on both UNIX and WIN32 platforms, the following caveat for WIN32 exists (lifted from the Apache Windows Web Page):
Warning: Apache on NT has not yet been optimized for performance. Apache still performs best, and is most reliable on Unix platforms. Over time we will improve NT performance. Folks doing comparative reviews of webserver performance are asked to compare against Apache on a Unix platform such as Solaris, FreeBSD, or Linux.
Furthermore, installers of the software who wish to exploit the performance gains offered by mod_perl can only do so on UNIX systems.
Any system that is capable of hosting a Web server should be capable of running the Open Archives Software. That includes a standard desktop workstation (e.g., Sun or IBM) or a garden-variety Pentium-class PC running Linux (e.g., 400 MHz processor, 128 MB of memory, Ethernet connection, multi-gigabyte hard disk). The capacity of the system, in terms of both processor and memory, will determine its ability to handle a very high volume of HTTP and Open Archives protocol requests. Disk space consumption by the actual Dienst software is minimal, in the range of several megabytes. We expect that actual hardware requirements for running an archive site will depend on the archive-specific software rather than the Open Archives Software itself.
The following software is required for installation and execution of the Open Archives Software:
Perl - minimum version 5.005_xx. Available from http://www.perl.com.
Perl modules from CPAN (these may already be installed in your Perl configuration):
XML::Writer
POSIX
IO::File
CGI
Apache - minimum version 1.3.x. Available from http://www.apache.org.
mod_perl - minimum version 1.2.x. Available from http://perl.apache.org.
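Whether the listed Perl modules are already present can be checked before starting the installation. The following is a minimal sketch in Python, not part of the OA Software; the function names perl_can_load and missing_modules are illustrative assumptions. It relies on the fact that perl -MSome::Module -e 1 exits with a non-zero status when the module cannot be loaded.

```python
import subprocess

# Module names are taken from the requirements list above.
REQUIRED_MODULES = ["XML::Writer", "POSIX", "IO::File", "CGI"]


def perl_can_load(module, perl="perl"):
    """Return True if `perl -M<module> -e 1` exits with status 0."""
    try:
        result = subprocess.run(
            [perl, "-M" + module, "-e", "1"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        # No such perl executable could be found.
        return False
    return result.returncode == 0


def missing_modules(modules=REQUIRED_MODULES, can_load=perl_can_load):
    """Return the modules from `modules` that could not be loaded."""
    return [m for m in modules if not can_load(m)]
```

Calling missing_modules() with no arguments checks the four modules above against the perl found on the PATH; an empty list means nothing needs to be fetched from CPAN. The check does not verify module or Perl versions.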
The physical organization of the OA Software is as follows. The code is organized into five directories:
Main - These are the source files that are associated with the main entry point of the software. This code manages receiving the hand-off of Open Archives protocol requests from the HTTP server (through mod_perl or CGI), parsing of the requests, and dispatching them.
Common - These are source files that contain common utilities for the software.
Config - These are source files that contain information for localizing the software. Some of the variable settings in this directory need to be changed at installation time.
Services/Repository - The source files containing protocol definitions and the stub functions that provide the interface to local archive functionality.
Services/Info - The source files containing protocol definitions and the stub functions that provide the basic service information.
Instructions on changes to these files at installation time are provided in the installation section.
Follow the steps below to install the Open Archives Software. Note that the installation process assumes that your site has not yet installed, and is not already running, an Apache HTTP server. If you are already running Apache, you will need to modify the configuration file for that server as described below:
Download the latest version of the Apache source into a
temporary directory.
(Skip this step if you already have Apache installed and running at
your site). The Apache source is located here.
Untar the Apache source file and create the Apache source directory (the
full path of that directory will be called apache_src
in the following steps).
Download the latest version of mod_perl source into the
same temporary directory.
(Skip this step if you already have an Apache server built with
mod_perl, or if you wish to use standard CGI - which will degrade
performance of your Open Archives Server). The mod_perl source is
located here. Untar the
mod_perl source file and create the mod_perl source directory (the full path
of that directory will be called mod_perl_src
in the following steps).
Build mod_perl.
In the mod_perl_src directory run the
following commands:
perl Makefile.PL \
    APACHE_SRC=apache_src \
    DO_HTTPD=1 \
    USE_APACI=1 \
    PREP_HTTPD=1 \
    EVERYTHING=1
make
make install
Note that to run these commands you will need write access to your Perl installation (in most cases this means that you have root access to your machine). Note that detailed information on installing mod_perl is available in the installation files in the mod_perl source directory.
Build Apache.
Choose the directory into which you wish to install Apache (this will be
called apache_run for the remainder of this
document). In the apache_src directory
run the following commands:
./configure \
    --prefix=apache_run \
    --activate-module=src/modules/perl/libperl.a \
    --enable-module=rewrite \
    --enable-shared=rewrite
make
make install
Note that detailed information on installing Apache is available in the installation files in the Apache source directory.
Test Apache.
Go to apache_run/conf and edit the httpd.conf
file. Find the line that says:
Port xxxx
where xxxx is a number like 8090
and either leave it or change it to the port on which you want to run your
Apache HTTP server. Now go to apache_run/bin (where the
Apache install places the apachectl script) and run the command:
./apachectl start
Your Apache server should start. If it doesn't, refer to the
Apache documentation for help.
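To confirm which port the server is configured to use before starting it, the Port line can be read back from httpd.conf. The following is a minimal sketch in Python, not part of the OA Software; the function name find_port is an illustrative assumption, and the sketch only looks for a simple Port directive of the form shown above.

```python
import re

# Matches an uncommented line such as "Port 8090". Apache directive
# names are case-insensitive; lines starting with "#" are comments
# and are not matched.
PORT_RE = re.compile(r"^[ \t]*Port[ \t]+(\d+)[ \t]*$",
                     re.MULTILINE | re.IGNORECASE)


def find_port(conf_text):
    """Return the first configured port as an int, or None if absent."""
    match = PORT_RE.search(conf_text)
    return int(match.group(1)) if match else None
```

A typical use would be find_port(open("apache_run/conf/httpd.conf").read()), with apache_run replaced by the actual installation directory.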
Download the Open Archives Software.
The OA Software is available here. Untar the
source into a directory that is readable by the Apache Web server installed
above. This directory will be called OA_src
for the remainder of this document. The directory tree below OA_src
should look like that described above.
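The unpacked tree can be compared against the five directories listed in the description of the physical organization. The following is a minimal sketch in Python, not part of the OA Software; the function name missing_dirs is an illustrative assumption, and only the directory names come from this document.

```python
import os

# The five code directories described earlier in this document,
# relative to OA_src.
EXPECTED_DIRS = [
    "Main",
    "Common",
    "Config",
    os.path.join("Services", "Repository"),
    os.path.join("Services", "Info"),
]


def missing_dirs(oa_src):
    """Return the expected directories that do not exist under oa_src."""
    return [
        d for d in EXPECTED_DIRS
        if not os.path.isdir(os.path.join(oa_src, d))
    ]
```

An empty result means all five directories are present. The sketch does not check file permissions, so readability by the Apache server still has to be verified separately.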
Configure the Open Archives Software.
You must modify two files in order to configure the Open Archives
Software:
Edit the file at OA_src/Main/dienst.pl and locate the setting of the variable $dienst::source_dir. Modify the path setting of this variable as indicated in the comment that accompanies it.
Edit the file at OA_src/Config/config_constants.pl and follow the instructions in the comments that show which variables should be modified. Set those variables appropriately.
Configure the Apache Server to use the Open Archives
Software
Go to apache_run/conf and edit the httpd.conf
file. Add the following lines at the end of the file.
RewriteEngine on
RewriteRule ^/Dienst(.*) OA_src/Main/dienst.pl
<Directory OA_src/Main>
SetHandler perl-script
PerlHandler Apache::Registry
Options ExecCGI
allow from all
PerlSendHeader Off
</Directory>
Note that the line allow from all
specifies that all clients can execute the Open Archives protocol
requests. If you want more constrained access consult the Apache
documentation.
Now go to apache_run/bin and
run the command:
./apachectl restart
to restart your Apache Server with the configuration
changes.
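The effect of the RewriteRule pattern above can be illustrated outside Apache. The following is a minimal sketch in Python that applies the same regular expression, ^/Dienst(.*), to a few URL paths; the function name is_rewritten and the example paths are illustrative assumptions, not real protocol requests.

```python
import re

# The pattern from the RewriteRule line. The captured group is not
# used by the rule's substitution, which always names
# OA_src/Main/dienst.pl.
DIENST_RULE = re.compile(r"^/Dienst(.*)")


def is_rewritten(url_path):
    """Return True if the rule would hand this URL path to dienst.pl."""
    return DIENST_RULE.match(url_path) is not None
```

Under this pattern is_rewritten("/Dienst/Example") is true and is_rewritten("/index.html") is false. Because the pattern is anchored only at the start, any path that merely begins with /Dienst (for example /Dienstfoo) is also rewritten; sites that serve other content under such paths may want a stricter pattern.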
Perform Basic Installation Tests
Edit the file OA_src/Tests/InstallTest.htm
and change all occurrences of the string host:port
to the actual host and port of your Apache Server. Open this file in a
Web browser and check that each link successfully returns an XML
document whose root element has the same name as the verb of the
respective request. The root element should have a single attribute
named version, whose value is the version of the verb of the respective
request. For example, a Disseminate verb with version 1.0 will produce
text/xml content wrapped in a tag like
<Disseminate version="1.0">
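The same check can be scripted rather than done by eye in a browser. The following is a minimal sketch in Python, not part of the OA Software; the function name check_response is an illustrative assumption. It parses a response body and reports whether the root element matches the verb and carries a single version attribute with the expected value.

```python
import xml.etree.ElementTree as ET


def check_response(xml_text, verb, version):
    """Return a list of problems found; an empty list means the check passed.

    Raises xml.etree.ElementTree.ParseError if xml_text is not
    well-formed XML.
    """
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != verb:
        problems.append("root element is %r, expected %r" % (root.tag, verb))
    if list(root.attrib) != ["version"]:
        problems.append("root element should have exactly one attribute, "
                        "named version")
    elif root.attrib["version"] != version:
        problems.append("version is %r, expected %r"
                        % (root.attrib["version"], version))
    return problems
```

For the example above, check_response('<Disseminate version="1.0"></Disseminate>', "Disseminate", "1.0") returns an empty list. If a response declares an XML namespace on the root element, ElementTree reports the tag with the namespace prefixed in braces, and the comparison would need to be adjusted.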
You now need to do the actual programming task of linking the Open Archives Software to the archive software that you are running at your site. All Open Archives Protocol requests are dispatched to the set of subroutines in the file OA_src/Services/Repository/Repository_stubs.pl. This file is commented to show the points at which each protocol request is handled and at which point you should insert the linkages to your own archive code. When customizing the code for your individual site you should refer to: