A Semantic Web for Science and Technology: Communicating the Content of Mathematics in the Large

Saturday, December 9, 2006 - 12:00pm - 12:30pm
EE/CS 3-180
Michael Kohlhase (International University Bremen)
The distribution of information and services over the Internet has changed all aspects of
life. The process of developing and deploying science and technology is no exception to this. Its
individual aspects are already supported by a variety of software systems, but the systems are,
by and large, not able to inter-operate since they use differing data formats, make differing model
assumptions, and are bound to an implicitly given context that is only documented in publications
about the systems.

We anticipate a new quality to emerge if humans and systems can inter-operate to cover the
whole work-flow of research, education, and application. To further this vision we need to develop,
implement, and provide semantic-based and context-aware techniques for acquiring, organizing,
processing, sharing, and using knowledge in Science and Technology.

In this talk I will present an alternative vision of a 'Semantic Web' for Science and Technology.
Like Tim Berners-Lee's vision we aim to make the Web (here scientific knowledge) machine-understandable
instead of merely machine-readable. However, instead of a top-down metadata-driven
approach, which tries to approximate the content of documents by linking them to web
ontologies (expressed in terminological logics), we explore a bottom-up approach and focus on
making explicit the intrinsic structure of the underlying scientific knowledge. A connection of
documents to web ontologies is still possible, but it is a secondary effect.

I will make these ideas concrete with the XML-based content/context format for mathematical
discourse (OMDoc: Open Mathematical Documents) that supports novel web services by blending
formal and natural elements. The core purpose of the OMDoc format is to enable communication
of mathematics in the large. Most current representation formats for mathematics concentrate
on representing mathematical formulae and give the representations meaning by providing a fixed
context (in the specification). As these contexts only cover specific mathematical areas, we can
only cover mathematics in the small with this approach.
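
As a rough illustration of a formula whose meaning comes from a fixed context, the sketch below parses a small OpenMath object for x + 1 and lists which content dictionary each symbol points to. The example formula and the helper name symbols_with_context are assumptions made for illustration and are not taken from the talk.

```python
import xml.etree.ElementTree as ET

OM_NS = "http://www.openmath.org/OpenMath"

# x + 1 as an OpenMath object. The symbol "plus" has no meaning of its own
# here; it refers to the content dictionary "arith1", a fixed context that
# is specified outside the formula.
OPENMATH_SRC = f"""
<OMOBJ xmlns="{OM_NS}">
  <OMA>
    <OMS cd="arith1" name="plus"/>
    <OMV name="x"/>
    <OMI>1</OMI>
  </OMA>
</OMOBJ>
""".strip()


def symbols_with_context(xml_text):
    """Return (content dictionary, symbol name) pairs used in the formula.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text)
    return [(s.get("cd"), s.get("name")) for s in root.iter(f"{{{OM_NS}}}OMS")]


print(symbols_with_context(OPENMATH_SRC))
```

The formula itself only carries a pointer to its context; everything that gives "plus" a meaning lives in the referenced dictionary, which is why such formats remain tied to the mathematical areas their dictionaries cover.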

OMDoc extends MathML and OpenMath by a rich markup language for (meaning-providing)
contexts, which even provides constructs to inter-relate contexts. Thus it can emulate other
representation languages by marking up their inscribed contexts, and can act as an interoperability
format between languages and even software systems. Taken to the extreme, it can even act as
an interoperability format between scientific disciplines; I will discuss this using the ongoing
extension efforts towards STMML2 (Science, Technology & Medical Markup Language).
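
The idea of inter-related contexts can be pictured with a toy model: each context (called a theory here) declares some symbols and may import other theories, and a symbol is resolved by following the imports. This is a minimal sketch under that assumption; the class Theory, the method home_of, and the example theories are hypothetical and do not reproduce OMDoc syntax or any existing API.

```python
from dataclasses import dataclass, field


@dataclass
class Theory:
    """A named context that declares symbols and may import other theories."""
    name: str
    symbols: set = field(default_factory=set)
    imports: list = field(default_factory=list)

    def home_of(self, symbol):
        """Return the name of the theory declaring `symbol`, or None.

        Searches this theory first, then its imports depth-first.
        """
        if symbol in self.symbols:
            return self.name
        for parent in self.imports:
            found = parent.home_of(symbol)
            if found is not None:
                return found
        return None


# Hypothetical example: three algebraic contexts linked by imports.
semigroup = Theory("semigroup", {"op"})
monoid = Theory("monoid", {"unit"}, [semigroup])
group = Theory("group", {"inverse"}, [monoid])

print(group.home_of("op"))       # resolved through two imports
print(group.home_of("inverse"))  # declared locally
print(group.home_of("zero"))     # not declared anywhere
```

Because the links between contexts are explicit data rather than prose in a specification, a program can trace where each symbol gets its meaning, which is the property an interoperability format needs.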