This page consists of my general ideas on electronic notebooks and on the software tools and research that will be involved in the Cornell proposal to the ASCI program. At the end of this note is a list of references to related work that I have run into.
Data is the basic content of the electronic notebook. Agreements are required to ensure that the data is useful as widely as possible. In a grammatical sense, this data can be viewed as the nouns of our environment. Some issues and alternatives are discussed here. The notebook will also serve as the medium by which researchers collaborate. The individual interactions can be viewed as the verbs of our environment. Agreement is also needed here to ensure that the commands (e.g., solve linear equations, mesh a geometry, compute crack propagation) issued by one research group are understandable by others. Some issues related to the verbs are discussed here.
A common interchange structure is needed to allow tools to exchange information about geometry, simulation results, mathematical models, experimental set-ups, etc. Among the goals for such a representation are:
It must also be possible to make these representations persistent and to store them in databases.
The representation should be language independent.
The representation should be able to cover a very wide range of mathematical and physical concepts.
It should provide a plausible framework for support of search engines. How do you index experimental/simulation data?
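The persistence and search goals above can be read together. The following is only a minimal sketch in Python, assuming each record is stored as JSON text in a SQLite database with a few metadata fields pulled out into indexed columns. The table layout, the field names (`kind`, `material`) and the sample records are all hypothetical, chosen for illustration; they are not part of any proposed standard.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a small record store; the schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records ("
        "id INTEGER PRIMARY KEY, kind TEXT NOT NULL, "
        "material TEXT, payload TEXT NOT NULL)"
    )
    # Index the metadata columns so searches do not have to parse every payload.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_kind_material ON records (kind, material)"
    )
    return conn


def save_record(conn, record):
    """Store the full record as JSON text, plus its searchable metadata."""
    cur = conn.execute(
        "INSERT INTO records (kind, material, payload) VALUES (?, ?, ?)",
        (record["kind"], record.get("material"), json.dumps(record)),
    )
    conn.commit()
    return cur.lastrowid


def find_records(conn, kind, material):
    """Return the decoded records that match the given metadata."""
    rows = conn.execute(
        "SELECT payload FROM records WHERE kind = ? AND material = ? ORDER BY id",
        (kind, material),
    )
    return [json.loads(payload) for (payload,) in rows]


# Hypothetical sample data: one simulation result and one experimental result.
store = open_store()
save_record(store, {"kind": "simulation", "material": "aluminum", "max_stress_mpa": 212.5})
save_record(store, {"kind": "experiment", "material": "aluminum", "crack_length_mm": 3.1})
hits = find_records(store, "simulation", "aluminum")
```

Because the payload is plain JSON text, any language with a JSON parser and a SQLite binding could read the same store. Which metadata fields deserve their own indexed columns is exactly the open question raised above.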
There are two basic approaches that could be pursued. The first is to represent all concepts as objects, and then rely on the interfaces of the objects for all operations. Frameworks for this approach include CORBA, DCOM and JavaBeans. CORBA and DCOM provide object sharing schemes that are independent of languages and provide mechanisms for making objects persistent. JavaBeans addresses many of these issues but is only for Java. ILU addresses many of the object sharing issues other than persistence, and is language independent. All of these schemes essentially deal with both "nouns" and "verbs": the data structures that need to be shared and the methods that operate on those data structures. This is the basic object oriented programming model. Ultimately all of these schemes are quite complex, due to the inter-language differences they must bridge and the increased semantic complexity introduced by having both nouns and verbs in the standard.
An alternative approach is to exchange universal data structures that only represent the nouns of the interchange. This is the approach being taken by the MathBus term structure.
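The contrast between the two approaches can be sketched as follows. This is a minimal illustration in Python and not the MathBus format itself: the `Mesh` class, the `make_term` helper and the tag names are hypothetical, and JSON merely stands in for some language-independent encoding.

```python
import json


# Approach 1: an object carries both the noun (its data) and the verbs (its methods).
class Mesh:
    def __init__(self, vertices, triangles):
        self.vertices = vertices
        self.triangles = triangles

    def triangle_count(self):
        return len(self.triangles)


# Approach 2: a nested term of the form [tag, arg, arg, ...] carries only the noun.
def make_term(tag, *args):
    return [tag, *args]


def mesh_to_term(mesh):
    """Flatten a Mesh into a term containing data only, no methods."""
    return make_term(
        "mesh",
        make_term("vertices", *[make_term("point", *v) for v in mesh.vertices]),
        make_term("triangles", *[make_term("tri", *t) for t in mesh.triangles]),
    )


def term_to_mesh(term):
    """Rebuild a Mesh on the receiving side; the verbs are supplied locally."""
    tag, vertices, triangles = term
    if tag != "mesh":
        raise ValueError("expected a 'mesh' term, got %r" % (tag,))
    return Mesh(
        [tuple(p[1:]) for p in vertices[1:]],   # drop the 'vertices' and 'point' tags
        [tuple(t[1:]) for t in triangles[1:]],  # drop the 'triangles' and 'tri' tags
    )


mesh = Mesh([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)], [(0, 1, 2)])
wire = json.dumps(mesh_to_term(mesh))    # what would travel between tools
restored = term_to_mesh(json.loads(wire))
```

Only the term crosses the boundary between tools. Each receiving tool attaches its own operations to the data, which is why the interchange standard itself needs to describe nouns only.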
The activities that will be funded under ASCI will involve researchers from different research groups and different departments, who may not even be at or associated with Cornell at the same time. Nevertheless, the software that is developed should be used as widely among the Cornell ASCI community as possible to gain the greatest leverage for our efforts. Continuing independently in our software development will ensure that our tools and data (both experimental and computational) are incompatible. On the other hand, developing a single overarching software architecture and set of standards will ensure that our effort will be frozen for several years. What is needed is a planning process that, while keeping in mind an evolving software architecture vision, will develop useful tools that facilitate adherence to that vision and encourage, where responsible, the development of standards and policies that are consistent with the vision. Issues that should be addressed include:
Parallelization techniques: Along with the standard need for parallelization, tools need to address the fact that the platforms the code will need to run on will vary widely, from ASCI-level supercomputers through networks of workstation clusters to individual workstations and laptops. It shouldn't be a major effort to move from one platform to another. What networking hardware should be recommended?
Programming languages and development platforms: What programming languages should be encouraged? What development environments, what database frameworks, what CAD tools, what hardware platforms, etc.?
Software integration: How should modules from different groups be documented, organized, etc., to ensure that researchers in other groups will be able to use them?
Computational tools: Which mesh generators, linear algebra packages, ODE solvers, finite element packages, geometric modeling tools, etc. best meet the needs of our research community and should be used?
The following list consists of a number of items that I found on the Web. Although they are all related to our interests in electronic notebooks and share some ideas, they diverge considerably in scale and flexibility. We probably want to adopt none of these systems as a whole, but to borrow ideas from all of them.
A document produced at Cornell some years ago that touches on these ideas is given here.