Research
Accomplishments in Mathematics in High Performance Computing
1.
ANNUAL PROGRAM ORGANIZERS
Petter
E. Björstad
of
University of Bergen (Informatikk) was one of the main organizers
for the 1996-1997 annual program on "Mathematics in High
Performance Computing." He was also a co-organizer in the
IMA workshop on "Parallel Solution of Partial Differential
Equations," June 9-13, 1997. Below is is report:
Summary
This
brief note written 18 months after the conclusion of the IMA year
on HPC is intended to focus on some of the aspects and events
that took place. With the distance in time, one can judge the
impact and perhaps clearer see the many small factors that all
contribute to IMA being such a unique meeting place for applied
mathematics, each year setting an agenda, probing into a focused
area of importance to industry and society at large.
I had the great privilege to participate in the HPC year from
the early planning phase until its conclusion in the summer
of 97. This note is not intended to report on the complete year,
many events of considerable significance have been left out.
Rather, I will try to say just a few words about some of the
aspects that characterize IMA and that, in my opinion, contribute
to its unique atmosphere and stimulating environment.
Planning
The
year was carefully planned, starting almost two full years ahead
of time. An external committee of known experts with broad experience
from academia and industry was given responsibility. Several meetings
were conducted and guided by the IMA director workshops and specific
themes as well as tentative lists of invitees started to emerge.
It shall also be noted that it was properly recognized that
an upgrade of the IMA local computer facilities as well as the
local network would be required. Similarly, agreements on joint
sessions with the supercomputer center and proper access to
supercomputer cycles were made.
Organization
The
year was organized in three parts with a total of nine workshops.
A very large number of participants was invited. They represented
universities, research laboratories and industry, not only from
the US, but also a significant number of applied mathematicians
from abroad.
Each workshop was still small enough to foster a free flowing
exchange of ideas, a typical size being about 30 people. The
format is right, clearly derived from years of experience at
the IMA.
Practical
Matters
This
part should not be underestimated. The IMA staff is professional.
Every little detail to make visitors feel comfortable is attended
to. A welcome letter, efficient setup of keys and computer accounts
before arrival, help with housing etc. Anybody trying this without
similar experience knows how easily such matters can distract
attention. When many stays last from 5 to 10 days, the IMA makes
a big win by not wasting visitor time at all.
Workshops
and Tutorials
I
will limit this note to say a few words about the last tutorial
on software for PDEs. Half a dozen of the worlds best software
packages and their authors were invited to present and talk about
their software. Practical computer demonstrations were mandatory.
In fact, the entire tutorial (lasting a full week) was mainly
based on using WEB technology for presentation, talks and demos.
People had hands on exercises and the tutorial quickly turned
into a large software laboratory where authors tried each others
problems etc. In fact, the week can be characterized as a hybrid
between a tutorial (mainly targeting the post doctoral program)
and a workshop. Everybody worked hard, everybody learned something
and we all came away feeling that we had a fun week.
Post
Doctoral Programme
The
year had nine post doctor positions, some from the IMA industrial
programme, the rest directly under the HPC year. In addition,
I had some of my own Ph.D. students as well as a couple of postdocs
visiting from Europe for shorter periods of time. Today, 18 months
later I still correspond with about half of the postdocs from
time to time. They are now scattered around the US and in Europe,
in universities but also working for industry.
Science
The
amount of science, say in the form of publications produced as
a direct result from the interactions at the IMA is hard to measure
due to the considerable time lag that always will occur. An example
should suffice: My student Talal Rahman met Xiaobing Feng from
Tennessee, several papers have been written and they still work
closely together. An example (among many) of how the IMA makes
people meet and mix, new ideas result and new long term scientific
collaborations are started.
Life after the IMA year
Most
of the IMA visitors and participants return home with new friends
and contacts. A considerable subset met again in the spring of
1998 at the Mittag Leffler Institute in Stockholm, Sweden. Then
partly again at the Oberwolfach institute in Germany. There is
no doubt that most visitors remember their stay at the IMA as
a very stimulating time where new ideas came about and lifelong
scientific relationships got started.
Mitchell
Luskin
from the University of Minnesota, School of Mathematics was
one of the organizers for the year on "Mathematics in High Performance
Computing." He was also one of the organizers for the April
28-May 2, 1997 workshop on "Grid Generation and Adaptive Algorithms."
and the June 9-13, 1997 workshop on "Parallel Solution of PDE."
He reports:
During the academic year 1996-97, I was an organizer for the
IMA program on High Performance Computing. I also was an organizer
for two workshops; Grid Generation and Adaptive Algorithms,
April 28-May 2, 1997, and Parallel Solution of PDE, June 9-13,
1997. In addition, I organized the weekly Numerical Analysis
Seminar in which many of the IMA visitors participated.
I was involved in continuous informal interactions with the
IMA visitors. Several of the IMA post-docs worked on research
projects that followed my research program in microstructure
which will be described below. Matthias Gobbert and Andreas
Prohl developed and analyzed a discontinuous finite element
for microstructure computations, and Martin Kruzik worked on
a project to compute microstructure by relaxation.
Advances in the understanding of material microstructure are
playing an important role in the development of active materials:
shape memory, magnetostrictive, and ferroelectric materials.
During the past several years computational methods for the
equilibria of martensitic crystals based on elastic energy minimization
has been developed by Luskin in collaboration with students,
post-docs, and co-workers. The invariance of the energy density
with respect to symmetry-related states implies that the elastic
energy density is non-convex and must have multiple energy wells.
Microstructure then occurs as the fine-scale spatial oscillation
between these symmetry-related states to attain the lowest possible
energy state. During the 1996-97 IMA program on High Performance
Computing, Luskin developed and analyzed computational strategies
that cope with the complexity of microstructure in situations
that were formerly far beyond computational capability, and
he and his co-workers showed how these concepts apply also to
theories of micromagnetics and magnetostriction.
During the 1996-97 year, Luskin (jointly with Bo Li) considered
multi-well energy minimization problems modeling martensitic
crystals that can undergo either an orthorhombic to monoclinic
or a cubic to tetragonal transformation. They obtained results
for the approximation of a laminate with varying volume fractions.
They constructed energy minimizing sequences of deformations
which satisfy the corresponding boundary condition, and they
established a series of error bounds in terms of the elastic
energy for the approximation of the limiting macroscopic deformation
and the simply laminated microstructure. Finally, they gave
results for the corresponding finite element approximation of
the laminate with varying volume fractions.
In another project, Luskin (with Li and Riordan) analyzed a
class of piecewise linear, nonconforming finite element approximations
of a simply laminated microstructure which minimizes the nonconvex
variational problem for the deformation of martensitic crystals
which can undergo either an orthorhombic to monoclinic (double
well) or a cubic to tetragonal (triple well) transformation.
Discontinuous, nonconforming elements are appealing for problems
with microstructure because the admissible deformations have
more flexibility to approximate oscillatory functions. This
project extended the stability analysis and numerical analysis
to a class of nonconforming elements that have been used successfully
to compute microstructure.
The IMA program on the Mathematics of High Performance Computing
got off to an active and hand-on start with tutorials on The
Message Passing Interface Standard (MPI) on September 9--11,
1996 and High-Performance Fortran (HPF), September 11--13, 1996.
MPI and HPF are software tools which enable parallel computing.
The IMA was fortunate to have some of the developers of these
tools present the tutorial (Gropp and Lusk for MPI and Koelbel
and Schreiber for HPF). The post-docs were active participants
in the tutorial and several utilized these tools extensively
in their research.
In addition to the workshops, the post-docs and visitors were
actively involved in a weekly Post-Doc seminar (which was run
by the post-docs) and the weekly Numerical Analysis Seminar
which is run by the School of Mathematics in collaboration with
the IMA. Two of the program organizers (Bjørstad and
Luskin) were in residence during the entire year and ensured
continuity in organization.
Among the highlights of the program was the tutorial on PDE
Software held during April 21--25, 1997. This tutorial featured
presentations by eight leading developers of software for PDEs.
The presentations included an overview of the packages, a demonstration
of the software, carefully prepared exercises, and a general
discussion on the last day. The visitors and post-docs actively
learned to use the software on the prepared exercises, as well
as tried to utilize the software on problems of research interest.
The final day featured a stimulating and energetic discussion
among the developers and the participants on the different approaches
and future directions.
It was very exciting to see at this tutorial that many of the
concepts on adaptive error control, grid generation, and iterative
solvers that have been developed on a theoretical basis during
the past decade are now being used in general purpose software.
The following workshops on Grid Generation and Adaptive Algorithms,
April 28 -- May 2, 1997, and Parallel Solution of PDE, June
9--13, 1997, continued the discussion of grid generation, adaptive
methods, and iterative solvers and offered a view of future
directions for PDE software.
Many of the long-term visitors made significant contributions
to the post-doctoral program. Rolf Rannacher (in residence during
the spring program) gave a series of four of lectures on ``Computational
Methods for the incompressible Navier-Stokes Equations.'' the
numerical solution of the Navier-Stokes equations. Petter Bjørstad
assisted many of the visitors and post-docs in the development
of parallel versions of their codes. For instance, he guided
post-doctoral scientist Brian Suchomel in the development of
a parallel version of his code using MPI for message passing
and the additive Schwarz Domain Decomposition algorithm to solve
an elliptic pressure equation. Mitchell Luskin introduced several
post-docs (Gobbert and Kruzik) to computational problems in
materials science.
The workshop on Grid Generation and Adaptive Algorithms during
April 28--May 2, 1997 was one of the two major workshops of
the Spring focus on Parallel Computational Mechanics. Grid generation
is a common feature of many computational tasks which require
the discretization and representation of space and surfaces.
The approximation of the equilibrium states and dynamics of
continua require that the continua be represented by a grid.
The workshop speakers discussed how the geometric complexity
of the physical object or the non-uniform nature of the solution
variable make an unstructured grid desirable. Since an efficient
grid requires knowledge of the computed solution, many of the
workshop speakers spoke on how to construct grids that are adaptively
computed with the solution.
Some of the themes discussed were the development of a posteriori
error estimators (Bank, Berzins, Dupont, Le Veque, Nochetto,
Rannacher, mesh generation schemes including refinement, coarsening,
and mesh movement (Bank, Bern, Mukherjee, applications to degenerate,
singular, or heterogeneous problems (Baker, Le Veque, Nochetto,
Oden, Shephard), and software (Bank, Le Veque).
Parallel computers are only easily programmed with regular data
structures, so the development of adaptive grid generation algorithms
which can utilize parallel computers effectively has posed a
great challenge. Approaches to repartioning and load balancing
were discussed by several workshop speakers (Biswas, Flaherty).
The workshop on Parallel Solution of PDE held during June 9--13,
1997, was the other major workshop of the program on Parallel
Computational Mechanics. Parallel computers offers the promise
of greatly increased performance and the routine calculation
of previously intractable problems. This period of concentration
and the workshop promoted the development and assessment of
new approximation and solution techniques which can take advantage
of parallel computers. Techniques discussed included domain
decomposition methods (Brenner, Cai, Chan, Dryja, Fischer, Hoppe,
Keyes, Pavarino, Widlund) and multigrid methods (Bastian, Rannacher,
Xu). Applications discussed included fluid dynamics, solid mechanics,
and semiconductor simulation).
Top
of page
2.
WORKSHOP ORGANIZERS
Robert Schreiber from Hewlett-Packard,
Inc. (Hewlett-Packard Labs) was a member of the organizing committee
on "Mathematics in High Performance Computing." He also co-organized
the workshop on "Algorithms for Parallel Processing, September
16-20, 1996" along with the special IMA Short Course: High-Performance
Fortran (HPF) held on September 11-13, 1996.
He writes the following preface for the proceedings where he
is one of the editors:
The Workshop on Algorithms for Parallel Processing was held
at the IMA September 16 - 20, 1996; it was the first workshop
of the IMA year dedicated to the mathematics of high performance
computing. The workshop organizers were Abhiram Ranade of The
Indian Institute of Technology, Bombay, Michael Heath of the
University of Illinois, and Robert Schreiber of Hewlett Packard
Laboratories. Our idea was to bring together researchers who
do innovative, exciting, parallel algorithms research on a wide
range of topics, and by sharing insights, problems, tools, and
methods to learn something of value from one another. As the
remarkable ensemble of papers contained herein should show,
we seem to have succeeded in creating ample opportunity for
such exchanges. The papers offer a wide-ranging tour of recent
developments in the very rapidly growing and changing field
of parallel algorithms. They cover the following general areas:
*
models and mechanisms for parallel machines (Chapters 1-4),
*
discrete and combinatorial algorithms (Chapters 5-7),
*
mathematical issues in parallelizing compilers (Chapter 8),
*
parallel algorithms for matrix computation, differential equations,
random number generation, and Fourier methods (Chapters 9-14),
*
new parallel computer systems and software (Chapters 15-16).
We
hope that readers will find this collection as enjoyable and
informative as we have.
Raymond
Johnson
of the University of Maryland (Mathematics), Fletcher
Jones of (IBM (Research Division), and James
Turner of Rensselaer Polytechnic Institute (Computer
Science) have the following joint report:
Preparing for opportunities was the theme of a workshop on Minorities
and Applied Mathematics-Connections to Industry, held October
4-6, 1996 at the Institute for Mathematics and its Applications
(IMA), University of Minnesota.
Approximately sixty invited minorities in mathematical sciences
attended the workshop. Of these, forty were Ph.D. students from
mathematical sciences departments in North America; the other
twenty participants represented a range of professional experience
from postdoc to senior scientist. Also attending were Avner
Friedman, Director of the IMA; Robert Gulliver, Associate Director
of the IMA; and Barry Cipra, a science writer.
The workshop was arranged by the IMA Director and the organizers
to provide an atmosphere in which minority graduate students
could hear about the research and careers associated with applying
mathematics to real- world problems. The real-world problems
presented to students involved mathematics at all levels from
elementary to technical, and showed students the need to communicate
across disciplines with scientists and engineers having sophisticated
mathematical training. Students began that communication process
(listening and speaking) at this workshop. Their careers will
be enhanced by this type of exposure; they were encouraged to
seek similar opportunities at their home institutions and in
their home regions. Each graduate student attending the workshop
agreed to return to their home institution and make a presentation
about the workshop, so that its benefits would not be limited
to those who attended.
The composition of the workshop-- a relatively small group,
mostly graduate students-- was based on the model used at the
workshop for women at the IMA in February, 1996 and was intended
to create a comfortable and relaxed environment. The workshop
contained four components:
- Overview
talks by senior participants about their technical work and
career experiences;
-
Technical talks about the applications of mathematics associated
with various real-world problems;
-
Focused small-group discussions charged to produce action
items for colleges and universities, government laboratories,
funding agencies and professional organizations;
-
An after-dinner talk by Earl Barnes of Georgia Tech describing
how the discovery of Karmarkar's algorithm affected IBM's
business strategy and the relationship between further developments
in the mathematics and changes in strategy by IBM and its
competitors.
The minority mathematics community is small, and the workshop
was the first opportunity for many of the students to network
with minority professionals sharing their interest in mathematics.
The technical talks were uniformly of high quality, and covered
a range of applications, including manufacturing of semi-conductors,
microstructure of materials, design of a chemical vapor deposition
reactor, mathematical problems arising in biology including
freezing of tissues for biomedical engineering, dynamics of
proteins in aqueous solutions, transport of solutes across cell
membranes and reconstruction of images in tomography. The mathematics
involved included wavelets, Markov processes, optimization,
partial differential equations and computer models. (Abstracts
of all the talks are in Section IV below.)
The small-group discussion sessions were also modelled on the
program held for women in February. Each group included about
twelve people, typically eight graduate students and four senior
mathematicians. One or two people served as coordinators to
assure that everyone had a chance to speak and to assure that
the group covered all relevant topics. One person was designated
as recorder to prepare notes of each group's discussions. Another
member of each group was asked to present the group's recommendations
at the final assembly of all workshop participants. Student
volunteers introduced speakers after the morning session, providing
another chance for them to practice their communications skills.
The organizing committee was extremely pleased with the workshop.
Participants were so enthusiastic that one of their primary
suggestions was a request to meet again to see how people had
carried out the suggestions made to them. They wanted to use
another meeting to practice skills suggested at this workshop,
where graduate students would give more of the talks, and would
receive advance help in order to make maximum use of the conference.
The primary value of workshops like this is the students' exposure
to people like themselves with interests like theirs, who have
accomplished what they are striving to accomplish. All mathematicians
are members of many communities--minorities, women, men, analysts,
geometers, topologists, applied mathematicians. Workshops like
this do not substitute for the specialized meetings of those
communities; they serve to demonstrate the existence of a minority
mathematics community which is not visible to students isolated
in their graduate programs.
The meeting was valuable because minority mathematicians have
an unparalleled opportunity. Mathematics research and education
are rapidly changing. The minority mathematics community did
not prosper under the old model; there is a willingness in our
community to consider other models of preparation for a career
in research. This workshop showed that minority students are
eager to prepare themselves for twenty-first century opportunities.
Mathematical problems arising in industrial applications typically
embody complicated, interdisciplinary issues of formulation,
analysis and solution. Minorities in mathematical careers are
often attracted to areas in which their results can have a societal
impact. There are many opportunities provided by real-world
problems for high-quality research, contributions to practical
results, and rewarding scientific careers. The purpose of the
weekend workshop is to show examples of people and problems
from industrial settings and to develop a set of concrete action
items that individuals and agencies can carry out and help minority
scientists at all levels and in varied environments become involved
with industrial problems.
The first goal will be achieved through technical talks by selected
participants chosen based on their success with real-world problems.
The collection of action items will build on suggestions received
at earlier workshops.
Breakout
Group Recommendations
Each breakout group was asked to discuss the following topics:
-
Undergraduate and graduate education in mathematical sciences
-
Transition from undergraduate to graduate work;
- Curricular
issues;
- Uses
of technology.
-
Preparing for opportunities
- Bridging
the gap between academia and industry;
- Breadth
of training;
- Role
of interdisciplinary work;
- Role
of internships;
- Entrepreneurs
and the global economy.
Although each group took a slightly different perspective on
the main issues, many common elements were cited. Means of overcoming
difficulties faced by students at the transition points (undergraduate
to graduate, graduate to work) were subjects of numerous suggestions.
Since members of the minority community frequently work in isolation,
most of the recommendations were actions for individuals to
undertake to prepare themselves better. The key to increasing
the number of minority mathematicians is individual initiative
on the items discussed below.
The main recommendations from all breakout groups are listed
here in four groups; recommendations for faculty and students,
recommendations for students, recommendations for academic mathematical
sciences departments and recommendations for the professional
societies. (Some recommendations are listed under more than
one heading.)
A.
Actions for everyone
-
Get connected; have and use e-mail and internet access
-
Make departmental presentations about this workshop; invite
students from other departments
-
Take advantage of computer center resources (C, C++, software
packages, LATEX, UNIX, EXCEL)
-
Encourage the Department to invite speakers who can give talks
about applications of mathematics (and to make contacts with
local industry)
-
Attend seminars in other departments
-
Contribute information to Internet sites for minorities (information
about internships, co-ops, programs, etc.)
-
Become aware of other minority/professional organizations
-
Look for internships and summer appointments in industrial
settings
-
Keep in contact with mentors
-
Set up a Web site on the Internet containing:
-
Profiles of minority industrial mathematicians
-
Names and research areas of minority graduate students
who are working toward the Ph. D. degree; thesis topics
when completed
-
Profiles of minority businesses
-
Listing of available internships
-
Profile of tools needed for successful graduate experience
(C, C++, Hypertext, GAMS, presentation skills, LATEX)
-
Profile of conference abstracts, speakers, e-mail addresses,
as in AARMS at http://www.wam.umd.edu/~rlj/aarms.html
; contribute to such sites
-
Information on how to subscribe to minority e-mail lists
-
Current sources of minority scholarships
-
List of industrial and academic mentors
-
Support for budding entrepreneurs, such as information
on how to get started, information about others interested
in starting businesses and information about writing proposals
and reviewing proposals
-
Information about getting involved with "virtual" companies
-
Use the Web to foster Applied Math team projects
-
Identify hot areas in applied mathematics
-
Recruit students
-
Encourage students to form teams around these areas
-
Support students planning and executing their chosen project
-
Encourage student presentations on their projects at conferences
B.
Actions for students
-
Set a goal and remain focused on it
-
Spend some time learning how to learn mathematics; take responsibility
to prepare yourself
-
Explore academic offerings of other departments to broaden
research opportunities: take a computer course; perhaps minor
in some area of engineering, science, business, etc.
-
Develop facility with written/spoken language
-
Get computational experience; learn a computer language and
how to apply it to your problem
-
Request a cross-departmental math modeling class with strong
industry involvement
-
Start an interdisciplinary journal club (students getting
together to read articles from journals) or a graduate student
seminar
-
Get involved with a project involving applications or integrating
math with other disciplines
-
Make contact with other (minority) graduate students for possible
collaboration on research-seek out a "like-minded" group
-
Discuss with advisor "what lies ahead"
-
Always keep your resume in mind
-
Go to conferences (for example, SIAM conferences including
SIAM's Diversity Day during the SIAM meeting at Stanford,
July 14-18, 1997) and take a leadership role
-
Prepare for conferences by reading abstracts, deciding
on talks you will attend and contacting authors of articles
in which you are interested
-
Do things inside and outside of school to make yourself
more marketable
-
When working on a project, always think about what part
or extensions will be publishable
- Make
an all-out effort before graduating and looking for a job
-
Network at every opportunity-- attend seminars, attend
conferences, e-mail authors of articles
-
Contact all your mentors and professors as you near completion
of your M.S. or Ph.D. degree, asking them to get the word
out that you are close to graduating
-
Ask professors and mentors to send recommendations; those
based on personal contact are particularly important
-
Send letters/resumes "out of cycle" when the majority
of letters/resumes are least likely to come (this is less
effective in academe than in industry)
-
Always follow up contacts
-
Continually update your resume
-
Stay aware of current events to facilitate conversations
during job interviews
-
Call ahead to determine which areas of research are of
interest to the company with which you are interviewing--
meet industry halfway by showing them you are a good match
with their needs
C. Actions for academic mathematical sciences departments
- Organize
student-to-student forums conducted by graduate students for
undergraduate student math majors to talk about the transition
to graduate school
-
Have a "strategies to get a job" seminar (for undergraduates
and/or for graduate students). Invite employers of all types--
community colleges, four-year colleges, industry and government
representatives
-
Recognize and support students who plan to enter the job market
with a B.S. or a M.S. degree
-
Forward all job listings to all graduate students at all levels
- Offer
a math modeling class where students can work on problems
from industry-- expose students to working in teams and learning
how to approach problems
-
Make the modeling class interdisciplinary by cross-listing
it with other departments
- Encourage
students who want to take courses outside the Mathematics
Department
-
Invite speakers from industry to talk about real-world problems
-
Contact graduates who work in industry
-
Set up an Advisory Committee with invited representatives
from local industry to provide another source of speakers
-
Improve advising for graduate students; some groups even suggested
development and use of a placement exam
-
Offer support to students other than teaching assistantships;
research internships in industry would prepare students to
begin industrial careers as teaching assistantships encourage
them to pursue teaching
-
Be aware of students in other disciplines, such as EE, who
take lots of mathematics, as sources of double majors and
graduate students
-
In industry, mathematics departments should explain their
usefulness to the company; in academe, mathematical sciences
departments should explain their usefulness to allied departments
D.
Actions for professional societies
-
Encourage student participation at meetings
-
Organize events for students
-
Support students' attendance at society meetings (as is
done by the Society for Mathematical Biology)
Lawrence
David Davis
from Tica Technologies, Inc. was one of the organizers for the
workshop on "Evolutionary Algorithms" held on October 21-25,
1996. He reports:
In my estimation, the 1996 workshop on "evolutionary algorithms"
yielded a number of important benefits. It brought together
representatives of the superconductor community, applied mathematics
community, and evolutionary algorithms community in a congenial
and constructive interchange. I believe a number of positive
results will follow from this. I describe several below.
Many of the interactions between representatives of different
fields are leading and will lead to beneficial results. For
example, some of the applied mathematicians became better acquainted
with techniques for using evolutionary algorithms for real-world
applications and intend to add the evolutionary techniques to
their problem-solving machinery. Several of the parallel computing
speakers discussed techniques for parallelizing algorithms at
a meta-level. (This approach is not one typically followed in
the evolutionary algorithm field, where hybridization of algorithms
tends to occur in the operator set.) The result of this is likely
to be renewed experimentation with hybridization techniques
in the evolutionary algorithm field. Finally, some of the insights
gained by members of the evolutionary algorithm field with regard
to minimizing evaluation time appeared to be of benefit to members
of the supercomputing field.
In addition, the workshop brought together representatives of
different branches of the evolutionary algorithm field. Again,
the result was a cross-fertilization of ideas that I expect
will result in applied and theoretical work for several years
to come.
The IMA did a wonderful job in creating the environment for
the workshop, in promoting it, and in structuring it so that
there was extensive time for offline interaction. I greatly
enjoyed attending the workshop, and benefited tremendously from
it.
Michael
Vose
from the University of Tennessee (Computer Science) was one
of the organizers for the workshop on "Evolutionary Algorithms"
held on October 21-25, 1996. His report follows:
The Evolutionary Algorithms workshop assembled a variety of
people with differing viewpoints. The field is not mature, still
enjoying youthful exuberance (and occasional excesses). I believe
the workshop facilitated communication between many of the various
camps in the evolutionary computation field, encouraging mutual
cooperation, and challenging each to translate and integrate
within their own perspectives the principles and conceptions
of the others.
The workshop also provided opportunity for people who infrequently
have the opportunity to meet with each other - but who do belong
to the same subgroup of the field, and who do have similar outlooks
or techniques - to share and discuss ideas.
As for specific contributions made by myself or by by other
visitors to the IMA, I feel comfortable about commenting on
my own part.
My point of view is that it makes sense, within the vast collection
of evolutionary algorithms and variants, to focus attention
on a simple, commonly used type, to then formalize it as a mathematical
object, and to study it with the aim of proving theorems about
its behavior.
That puts me in the lunatic fringe of the Evolutionary Algorithms
crowd for various reasons ... not the least of which is that
the vast majority are interested in solving some specific problem
or class of problems (which are usually NP complete), and their
goal is to find a usable heuristic. Any heuristic - however
obtained - that seems to work, at least better than what was
tried before, is the answer; proof, to most, is at best an irrelevant
concept.
Although the paper I submitted for the IMA proceedings focuses
more on explaining the mathematical framework than anything
else, the message I tried to get across in my talk contained
the following basic ideas:
-
Standard GA theory, from the perspective of logical rigor,
leaves much to be desired.
-
There is an alternative, which, though more mathematically
demanding, has more solid foundations, more predictive power,
and is capable of supporting nontrivial and provably correct
analysis.
-
That alternative points towards fixed points of an underlying
dynamical system, among other things, as important objects
in the theory.
-
Besides asymptotic results (i.e., proven theorems), there
is empirical evidence gathered from extensive computer runs
verifying the claim that the alternative theory can make qualitatively
correct assertions about the behavior of genetic search as
used in practice.
I hope you will be able to paraphrase parts of my comments to
serve your purposes. The workshop was quite enjoyable, and I
thought it a tremendous success. Best Regards.
Darrell
Whitley
of Colorado State University (Computer Science) was one of the
organizers for the October 21-25, 1996 workshop on "Evolutionary
Algorithms." He writes:
The support of the IMA for the Workshop on Evolutionary Algorithms
made it possible to bring together for the first time a significant
number of the top researchers working in the area of Evolutionary
Algorithms, including the fields of Genetic Algorithms, which
developed in the United States in the mid-70's, and Evolution
Strategies, which developed in Germany in the mid-70's. These
two communities worked in almost total isolation until 1990;
in the last 5 years or so the Evolution Strategies community
has been fairly active in integrating Genetic Algorithm based
concepts into their work. But knowledge of Evolution Strategies
on the part of the larger US community and even other European
communities has been slower to develop. By bringing together
key individuals in these communities in one place in a relative
small forum (50 individuals) commonalities and differences in
the two approaches were made much clearer. Even though I have
been one of perhaps a half dozen individuals in the Genetic
Algorithms community to interact with the Evolution Strategies
community since 1990, I learned new things at the workshop that
surprised me and deepen my understanding of the two approaches.
It is somewhat symbolic that the "Fathers" of these two fields,
John Holland and Hans Paul Schwefel, had only been briefly introduced
once before and were able to interact for the first time at
this workshop.
The workshop also did a great deal to clarify the current state
of the theory in Evolutionary Algorithms; the existing theory
might be characterized as belonging in at least two different
major camps. There is a high level macro-theory that looks at
the processing of "building blocks" and "schemata" that are
shared by many good solutions when searching a problem space.
There is also a low level micro-theory that builds exact Markov
models of the search process. The macro-level theory of course
ignores certain detail; the micro-level theory involves working
with detailed models that unfortunately growth exponentially
as a function of problem size. (The size of the models are larger
than the search space itself.) I think that the IMA workshop
helped both sides in this debate to present their cases and
to move toward common ground.
A concept that was discussed at the workshop includes the so-called
"No Free Lunch" result which shows that all search algorithms
are the same when their performance is averaged over all possible
discrete functions. Debate on this topic has motivated me to
do new theoretically work proving that there are interesting
and potentially useful ways to partition the set of all possible
discrete functions such that some search algorithms are indeed
better than others. The results in fact have practical implications
for problem representations and improving search methods.
Another area where there was some real progress was in the area
of communication between theorist and practitioners. A wide
range of applications were presented. In some cases, theoretically
motivated methods worked quite well. In other cases, practitioners
used ad hoc methods to obtain better performance than could
be achieved with theoretically motivated methods; but at least
some individuals on both sides went away with a better appreciation
of the successes and failures of current theory. I believe that
the workshop will help to change what practitioners say about
the current state of theory in the field.
Some very surprising and impressive applications were presented
at the workshop, including data decomposition, deriving control
parameters for modeling systems using partial differential equations
and even new best know solutions to bin packing problems.
Overall, I found the workshop not only useful for bring people
together and strengthening the community, but also as a learning
experience.
Dianne
P. O'Leary
from the University of Maryland (Computer Science) was a member
of the organizing committee on "Mathematics in High Performance
Computing." She also co-organized the workshop on "The Mathematics
of Information Coding" held on November 11-15, 1996. She writes:
A workshop on Information Coding, Extraction, and Distribution
was held at IMA November 11-15. There were approximately 30
attendees. The speakers included Michael Orchard, Robert Gray,
Ahmed Tewfik, Jorma Rissanen, Julia Abrahams, Paul Siegel, Gil
Strang, Cynthia Dwork, George Cybenko, Chris Atkeson, Dianne
O'Leary, Geoff Davis, Eric Metois, Duncan Buell, and Manfred
Opper. Topics ranging from the mathematics of coding theory
to the practicalities of copyright preservation for Internet
resources drew spirited discussion and interaction among experts
in diverse but related fields. George Cybenko and Dianne O'Leary
made plans to collaborate on the study of iterative methods
for matrix approximation.
Tamar
Schlick
from New York University (Courant Institute, Mathematical Sciences
and Chemistry) is one of the organizers for the January 20-23
workshop on "Molecular Structure: Dynamics, Geometry, and Topology."
She writes:
The workshop on macromolecular structure brought together, at
the heart of winter, applied mathematicians and biophysicists.
The talks covered many exciting current topics in molecular
mechanics, molecular dynamics, global minimization of potential
energy functions, fast electrostatics, parallelization of large-scale
molecular codes, and much more.
Most exciting about the workshop was the cross-fertilization
of ideas, as reflected by the many discussions throughout the
lectures and during the meals and long breaks. The mathematical
and biological scientists learned and exposed their colleagues
to new information, new ideas, and future challenges. We all
left the workshop feeling that the informal setting and small
group size really worked to make the conference what it should
be -- scientific exchanges of ideas -- rather than a continuous
series of expositions by people who had been on the lecture
circuit for a long time...
Scott
Baden
from the University California-San Diego (Computer Science and
Engineering) was one of the organizers for the March 12-13,
1997 workshop on "Structured Adaptive Mesh Refinement Grid Methods."
He writes:
After the formal talks had ended, a number of us (SAMR workshop
participants) decided that there was enough interest to look
into building standardized software support for structured adaptive
mesh refinements. A web page will also be maintained.
Christoph
Borgers
from Tufts University (Mathematics) was one of the organizers
for the workshop on "Computational Radiology and Imaging: Therapy
and Diagnostics" held on March 17-21, 1997. He shares the following:
Margaret Cheney told Todd Quinto about a problem concerning
singularity detection from sonar data. Todd was able to solve
the problem, and even proposed a singularity detection algorithm
based on his solution.
Robert Levine learned about the ongoing work on diffusion tomography.
He talked to me about his interest in this subject, and I introduced
him to David Boas, a researcher in diffusion tomography at Tufts
University. Robert, David, Todd Quinto, and I have started a
working seminar on the subject together since then.
I can think of a lot of points and ideas and references that
I learned about at the workshop, or made other people aware
of, but that is too detailed for this report. Several participants
have made enthusiastic comments about the workshop.
Some
examples:
- Robert
Levine made me much more aware of the importance of the beam
selection (as opposed to beam weight optimization) problem
in radiation therapy planning.
-
I provided Meldon Lodwick with a list of references on biological
response models that have been proposed for the purpose of
formulating the radiation therapy planning problem as a constrained
optimization problem.
-
I learned about various approaches to the diffusion tomography
problem: Michael Klibanov's and Simon Arridge's from their
talks, and Sarah Patch's through conversations with her.
-
I pointed out to Yair Censor that constraints used in the
radiation therapy planning problem often have strong "volume
dependence:" It may be acceptable to give a large dose to
a small piece of a healthy organ, but not to a large piece.
It is not immediately obvious how such constraints could be
incorporated into his approach to the radiation therapy planning
problem. He thought about it a bit and came up with a suggestion
that ought to be explored further.
-
I learned about some work by Lech Papiez on dose calculations
in radiation therapy planning, and gave him references to
the "Boltzmann/ Fokker-Planck splitting" method, which has
been proposed in the Nuclear Engineering literature and appears
to be closely related to Lech's ideas.
-
George Majda told me about his work on the Vlasov-Poisson
equation. Some aspects of this work appear related to the
dose calculation problem in radiation therapy planning. I
visited George at Ohio State University in early May, and
we discussed this subject further.
Frank
Natterer
from the Universitaet Muenster (Fachbereich Mathematik) was
one of the organizers for the March 17-21, 1997 workshop on
"Computational Radiology and Imaging: Therapy and Diagnostics."
He reports:
The meeting brought together specialists from quite different
fields. This resulted in many lively and often controversial
discussions.
One example is regularization. While everybody agreed that some
kind of regularization is necessary for virtually all medical
imaging problems, some people contended that Tikhonov regularization
is not sufficient. It does not permit to take into account properly
the properties of the image/object to be reconstructed, nor
does it reflect the role of the observer. These questions led
to very lively discussions after the talks of Barrett, Herman
and Johnson. A related controversial issue was assessment of
image quality. Barrett and Herman pointed out quite clearly
that naive distortion measures for images, such as norms, are
insufficient. Statistical and psychological assessment methods
such as ROC studies are required.
Another subject which was discussed quite controversially was
the usefulness of seriously ill-posed problems in medical imaging,
such as impedance and near infrared imaging. Some participants
frankly questioned the practical value of such techniques, describing
situations in which these methods are bound to fail. On the
other hand, Arridge presented simulations and also reconstructions
of measured data which demonstrate that near infrared imaging
can produce useful pictures for neonatal heads. Isaacson impressed
the audience with quite realistic patient studies. His talk
demonstrated the potential and the limitations of imaging techniques
based on seriously ill-posed problems.
A criticism which often came up was oversimplification and naive
modeling, in particular in connection with ultrasonic and optical
imaging. The modeling of noise and its propagation through the
algorithm has almost never been discussed.
Jaeger stressed in his talk that image reconstruction and image
enhancement should be considered as a single task. An example
is the removal of artifacts from CT pictures which should be
done on the data rather than on the image.
Sparr brought up Doppler imaging and the relevant mathematical
apparatus which has been developed recently by Sharafutdinov.
This talk impressively demonstrated how new mathematics leads
to new imaging techniques. The role of mathematics in 3D imaging
was the theme of the talk of Clack. A few years ago the underlying
mathematics was understandable only to a few experts, but it
is now almost as simple and elegant as in the 2D case. However,
there is still need for algorithms which can handle the case
of truncated projections.
Several approaches to the radiation treatment planning problem
were discussed in the talks by Censor, Levine, and Lodwick.
Levine stressed the importance of selecting a clinically manageable
number of beams, an aspect of the treatment planning problem
that is neglected in algorithms for optimizing the weights of
pre-selected beams.
One of the mathematical tools which came up in many talks was
the transport equation. It serves as a model for different modalities
such as treatment planning, optical imaging, tomography including
scatter, and emission tomography. It seems that a large part
of medical imaging can be viewed as inverse problems for the
transport equation.
Jeff
Blaney
was one of the organizers of the April 7-11, 1997 workshop on
Mathematical and Computational Issues in Drug Design. Following
is the panel discussion where he served as moderator:
PANEL
DISCUSSION: NEW PROBLEMS THAT SHOULD BE ADDRESSED IN THE NEXT
TEN YEARS
PANELISTS: GORDON CRIPPEN, SIMON KEARSLEY, GARLAND MARSHALL
and PHIL PORTOGHESE
This panel's challenge was to identify important problems and
challenges in drug discovery that should be addressed within
the next decade given steady, predictable improvement in computational
power and perhaps less predictable improvement in algorithms
and methods. What are the pressing problems and bottlenecks
that may succumb to new computational and theoretical approaches?
Garland Marshall discussed the need for improved methods to
deal with error analysis and propagation of errors in the drug
discovery and optimization process. We routinely face many sources
of error, experimental and theoretical, that complicate our
efforts to interpret and predict structure-activity relationships.
He also noted that we need to improve our ability to predict
the structure of drug receptors. Despite the impressive increases
in the ability of Xray crystallography and NMR to solve the
structures of biologically relevant macromolecules, many critical
receptors remain intractable. For example, 7-helical transmembrane,
G protein-coupled receptors (7TM GPCRs) make up the largest
single class of receptor targeted by today's drugs, yet no experimental
structures are available due to the extreme difficulties in
crystallizing transmembrane proteins. He also noted the immense
informatics problem posed by the explosion of data available
on the internet. We're "drowning in a sea of data," with only
primitive tools to integrate, assimilate, and interpret the
data. The explosion of genomics data is only beginning, due
to the massive efforts behind the human genome project (the
goal is to completely sequence the 3 billion bases in human
DNA by 2005) and other sequencing efforts targeting a variety
of microorganisms and pathogens.
Phil Portoghese also discussed the importance of 7TM GPCRs,
noting that our only structural information comes from low-resolution
experimental data and approximate homology models. The mechanism
by which 7TM GPCRs transduce an extracellular signal across
the membrane is poorly understood. He described experimental
methods that can provide additional data to help modeling, such
as recent work in making chimeric receptors that combine features
of two different receptors in a single molecule. For example,
an extracellular loop of the kappa-opiate receptor was transferred
to the mu-opiate receptor, resulting in a new, kappa-selective,
chimeric receptor. Current structures and models are not able
to explain the difference between 7TM GPCR agonists (molecules
that turn on an intracellular response) and antagonists (molecules
that inhibit the intracellular response). For example, morphine-type
molecules containing an N-CH3 group are agonists
at the mu-opiate receptor, but converting the N-CH3
group to N-CH2-cyclopropyl produces an antagonist.
This is a very small structural change, yet it produces a dramatically
different response.
Gordon Crippen noted that structure-activity relationships and
computer-assisted drug design rely on the rather vague notion
of molecular similarity as a central theme: similar molecules
tend to have similar biological activity. However, we have many
different ways of defining, measuring, and calculating similarity.
He suggested that molecular similarity should be defined relative
to the environment in which it is being measured, rather than
as an independent property. He also addressed the critical problem
of modeling oral bioavailability and drug distribution within
the body. We have extremely limited control and understanding
of how to alter chemical structure to achieve the desired properties
(for example, a pill that is stable in the gut, reaches the
target organ and receptors intact, has a long enough half-life
to require only 1-2 doses per day, and does not produce toxic
metabolites). Drug design and discovery is a complex, interdisciplinary
problem: rather than focus on isolated problems, we should consider
the entire system as a large graph depicting how drugs interact
with the body. The nodes are organs, tissues, fluids, membranes,
and other structures within the body. The edges are known or
hypothetical pathways; sometimes there may be more than one
path between a pair of nodes. Nodes are labeled by various capacities,
edges by rates and permissions. Can we integrate known data
into such a model?
Simon Kearsley provided a brief historical overview and a map
to the future of modeling challenges, noting that modeling has
already had a large impact in lead identification and optimization.
Due to the advent of high-throughput screening, genomics, and
bioinformatics, data mining and modeling are being used to help
biologists get needed chemistry support (for example, by computationally
identifying existing molecules in databases that validate biological
hypotheses, which in turn creates demand for additional, optimized
molecules). Current approaches have focused on the early stages
of drug discovery by improving the odds through statistical
and qualitative methods, for example, by helping to prioritize
screening and assisting in the identification of new leads.
Such methods include two-dimensional chemical substructure and
similarity searching, and three-dimensional superposition and
docking calculations. He suggested that the next challenge is
in information science: extrapolating from data mining to "concept"
mining and mining relationships in the data, going beyond improving
the odds by studying case histories and anticipating decision
points. He also mentioned that modeling approaches will need
to head toward modeling cellular processes and relationships
(similar to Gordon Crippen's ideas). This will require a canonicalization
of text, and the building of regular vocabularies and thesauri
to deal with the huge number of imprecise names and synonyms
in biology. A formalized mathematical language will need to
be developed for biology which can deal with a limited number
of observable properties and data. He mentioned the need for
technology to deal with very large databases and "data warehousing":
for example, data integration, data "scrubbing," and multidimensional
trend analysis. He also discussed the need for rule-generating
algorithms that can deal with fuzzy logic and ambiguity. These
methods will need to define and measure information content,
including the subjectivity content, deal with noise and sparse
data, and capture human expertise and knowledge.
There are clearly huge problems, challenges, and opportunities
for drug discovery in the next decade. Most of today's methods
focus on the early phases of drug discovery, which consume only
a small fraction of the time and money (less than 20%) of the
total path (10-12 years, $300-500 million) on the way to a marketed
drug. Integrated approaches such as those suggested by Kearsley
and Crippen have the potential to improve drug discovery further
down the expensive and time-consuming development process, but
will require increasingly sophisticated computational methods
and information science technology.
A.J.
Hopfinger
and
W.J. Howe were co-organizers of the April 7-11,
1997 workshop on Mathematical and Computational Issues in Drug
Design. Following is the panel discussion where they served
as moderators:
PANEL
DISCUSSION: IMPORTANT CURRENT PROBLEMS IN DRUG DESIGN THAT MAY
BE COMPUTATIONALLY TRACTABLE
PANELISTS: DAVE DOHERTY, BILL DUNN, GRAHAM and RICHARDS DOUG
ROHRER
The intent of the first panel discussion was to identify areas,
within the broad range of fields represented at the workshop,
in which there are particularly important problems that may
be amenable to theoretical, mathematical, or computational advancement.
The focus was on areas that are currently under study, as opposed
to the second panel discussion which covered areas that one
may anticipate will begin to yield to computational methods
within 5-10 years or so. Given the breadth of the subject matter
covered at the workshop, the issues presented during the panel
discussions were necessarily only a small subset of a potential
list of "important problems," and they were obviously biased
by the particular areas of research interest of the panelists.
Nonetheless, each of the issues presented by a panelist engendered
considerable discussion among the workshop participants, and
the issues discussed in the first panel session can be considered
to be representative of problems currently under study by numerous
scientists in the field.
The panel discussion was structured around brief presentations
by each of the panelists on one or more issues viewed as particularly
difficult or important for the furtherance of the field. Each
presentation was then followed by discussion among the larger
audience. After the four presentations were complete, additional
topics and issues were put forth for discussion by workshop
participants. Following is a brief summary of the issues presented
by the panelists and other participants.
The approach taken by Graham Richards was to present the drug
discovery process as a continuum -- an essentially linear process
that begins with genes, and moves through proteins of therapeutic
interest that are encoded by the genes, to small molecules (potential
drugs) that are designed to modulate the function of the target
proteins. At the genetic end, the problem most suited to computing
(and particularly, supercomputing) is hunting for sequence similarities
between known genes and the entire genome database. The vast
amount of data from sequencing the human genome and those of
pathogens means that this is an enormous computational task.
A recent example of the types of discoveries that can be made
from such data mining is the identification of the similarity
between smallpox viral genes and some of the genes in the human
immunodefense system. It is clear at this point that the rapidly
increasing amount of genetic data that are becoming available,
coupled with the increasing reliance of pharmaceutical companies
on the selection of drug targets based on information contained
in these databases, will translate to substantial opportunities
for improvements in bioinformatics algorithms.
Moving along the continuum, the greatest scientific challenge
remains the `protein folding problem' -- going from gene sequence
(and therefore protein sequence) to protein structure. Sequence
identification far outpaces experimental structure solution,
so there is a need to predict structure from sequence. Folding
must depend on sequence and physics, but new insights and methods
are needed. This remains the "Nobel Prize" problem.
Moving from protein targets to the small molecules that are
being designed to bind to them, it is noted that such molecules
are increasingly being synthesized by combinatorial or highly
parallel synthesis techniques. While such methods are able to
generate molecules on a scale of tens to hundreds of thousands,
there is a need (and a recent trend) to reduce the structure
space somewhat by clever application of computational methods
to achieve greater focus and to improve the odds of identifying
active compounds. This also carries over to the need for more
efficient methods for comparing and selecting molecules (with
specified properties) from databases containing hundreds of
thousands of "real" compounds, to billions of "virtual" compounds
(currently existing databases), each of which can have many
conformationally-accessible structures. It was noted in the
workshop that organic structure-space, of molecular weight less
than 500, has been estimated variously as from 1060
to 10400 unique compounds.
Doug Rohrer's focus was in the "small molecule" region of the
continuum. He noted that many of the current challenges in making
sense of structure-activity relation (SAR) data relate to choices,
and that a large number of choices are generally required. To
name a few, these choices range from selection of compounds,
generating 3D structures for each compound (the conformational
flexibility problem), determining the most appropriate overlay
of the compounds (alignment rules), defining a consensus multi-molecule
binding model, and determining how the consensus features of
the model can best be used in drug discovery and optimization
(that is, now that we have a model, what do we do with it?).
The initial choice of compounds used to develop an SAR model
typically encompasses a small number of highest-affinity compounds.
However, as the project matures, the selection of additional
compounds must include as much diversity as possible, in both
the 2D (connectivity) and 3D (spatial) sense. The challenge
is to evaluate this diversity quickly. It is important, as was
mentioned numerous times during the workshop, to maintain a
fast turn-around time on such studies in order to remain in
the decision and design loops. Otherwise the chemistry will
move ahead without it. Doug suggested that perhaps improved
graph theoretic approaches could be applied at this stage.
Usually the compounds of interest can have numerous low energy
conformations. The challenges here involve methods for exploring
conformational space, followed by analysis to select a representative
subset of diverse conformations. Several novel methods were
presented at the workshop which may address this challenge.
One involved reducing the 3D structures to a 2D format that
retains more information than does a simple projection technique.
Modifications of alphanumeric character recognition algorithms
were then used to identify similar shapes. Another technique
that was presented involved mapping the structures into pre-defined
arrangements of spheres to classify the shape of the molecules.
Such classification of the molecular shape could provide a starting
point for the production of multiple molecule overlays to generate
the SAR model.
Field-based similarity matching provides an excellent means
of optimizing structural overlays. However, the speed of this
process needs to be improved in order to be useful with larger
numbers of molecules (greater than ten). In addition, information
about the relative biological potencies somehow needs to be
factored into this process, so that the good, the bad, and the
ugly features of the overlay model can be identified.
Finally, methods to use the overlay features of the SAR model
to derive new molecules algorithmically, to test hand-generated
molecules against the model, or to optimize existing leads via
suggestions from the model, would be the ultimate goals. Automated
de novo design programs like GROWMOL might be a start,
but the solution is still elusive.
Bill Dunn continued the discussion of challenges related to
small-molecule design, and again, for the common situation where
the 3D structure of the target (protein) is not known, leading
to the application of indirect methods in the delineation of
3D quantitative structure-activity relations (QSAR). One difficulty
with these methodologies (e.g. comparative molecular field analysis,
molecular shape analysis, etc.) is the assignment of a receptor-bound
conformation (conformational flexibility again) and an alignment
for the ligands (overlay problem). This is generally done by
trial and error, with some guidance provided by ligands with
restricted degrees of conformational freedom. This is an extremely
costly exercise that limits the utility of 3D-QSAR.
A general solution to this problem may be found by relaxing
the conformation and alignment constraints of existing approaches,
and by describing the active ligands with features that vary
with conformation and alignment. The resulting arrays are third-order
tensors that can be decomposed to yield the descriptor vector(s)
most highly correlated with biological activity.
Dunn's group is exploring a general formalism for 3D-QSAR that
includes various methods useful for the decomposition of 3-way
arrays of conformation- and alignment-dependent feature data.
They are also exploring novel methods of aligning sets of bioactive
compounds, in such a way that leads to an unconstrained 3D-QSAR
procedure.
Dave Doherty focused on problems in biological chemistry that
are characterized by structural transitions that involve large-scale
motions of portions of the structure. One obvious example of
this is the folding of proteins. But there are other, less global,
changes that also occur via large scale motions, and Doherty's
statement of the problem of identifying these transitions was
aimed at discovering cooperative or nonlinear effects that cause
such large scale motions. Definitive answers to these questions
would contribute greatly to our understanding of protein structure,
and particularly to our ability to predict tertiary structure
and its longer time-scale behavior.
The issues that he raised were intended to provoke thought among
the mathematicians and drug designers about how further improvements
in solutions to nonlinear mathematical systems might shed some
light on current problems in drug design. To motivate such thought,
an example was given of some of his studies that used direct
molecular dynamics simulation to search for nonlinear motions
that have long been postulated to be characteristic of the so-called
"rotator" pre-melting phases in n-paraffin crystals
(Doherty, D. C., Hopfinger, A. J., Phys. Rev. Lett.
(1994) 72 (5), 661). These "sine-Gordon" soliton waves
arise from the solutions to a set of nonlinear differential
equations. The fact that these equations have an analytical
solution (a traveling arctangent waveform) provides a unique
opportunity to search for these cooperative motions using direct
simulation. This work appears to have definitively identified
these high amplitude, localized waves in the molecular dynamics
simulations, even though there is no explicit term in the molecular
mechanics force field that specifically gives rise to them --
which is indicative of the cooperative effects of a collection
of multiple chains.
This is an exciting development in our understanding of cooperative
effects in molecular phase transitions. So, to the attending
mathematicians one might ask, "What are the current developments
in solutions (analytical or simulated) to molecular dynamics
equations for complex systems that may be of some use in our
understanding of cooperative processes in biochemical systems?"
And one might ask the computational chemists, "What are the
important questions that might move us along to a better understanding
of cooperative effects, through some combination of mathematics
and direct simulation?"
After the panelists' presentations and the ensuing discussions,
additional challenges were put forth by other workshop participants.
It was noted that few of the issues discussed to that point
related to what has become known as `direct methods' -- techniques
such as docking, de novo design, ligand-binding scoring,
and so on, that can be applied when the target protein is known
in structural detail. Such detailed knowledge is becoming much
more common, in part because of the rapid emergence of genomics-based
targets. In many pharmaceutical companies, all new discovery
programs are molecularly-based, and they follow the continuum
outlined by Graham Richards. Genomics is used to select the
target, which is then further developed through cloning, expression,
folding, and purification of the relevant protein(s) to produce
molecular (and generally, high throughput) assays of small-molecule
activity. With protein on hand, structural biology (X-ray crystallography,
NMR structure determination, antibody epitope mapping, and so
on) can provide the structural detail needed for application
of direct methods to the task of lead discovery, or at least
to lead optimization. (An "epitope" is the surface region of
another molecule that is recognized and bound by an antibody.
A "lead" is a compound with the sought for bioactivity, but
which requires further optimization, for example to improve
its bioavailability or eliminate certain side reactions, in
order to become a useful drug.)
A number of the important issues in this area were discussed
in the main workshop presentations during the week. One of the
most pressing issues was improved methods for dealing with the
physics and physical chemistry of water, sometimes through explicit
consideration of solvation effects during simulations or binding
energy calculations, but more typically through continuum solvation
models. It was stated by one of the participants that we can
generally do quite well now in predicting geometries of potential
drug molecules at a binding site, but not so well in accounting
for their solvation effects on the binding energy or even for
the intrinsic components of the binding energy. Accurate calculation
of ligand/protein binding energies (including entropic considerations)
was generally viewed as one of the most important current needs,
as it pervades all of the direct-methods techniques for studying
and predicting the activity of small-molecule analogs. It was
also noted that, in order to remain within the design cycle,
molecular modelers must choose their methods, in part, on the
basis of speed, to provide quick turn-around of results. This
places a greater burden on finding energy-based methods or scoring
functions that are accurate and fast. The need for
speed is most evident in the effective application of automated
de novo design methods, where tens to hundreds of thousands
of candidate structures must be generated, evaluated (by some
form of scoring function which assesses goodness of fit to the
target), and assigned a priority ranking. Of course, when opting
for methods that are fast, one must always question whether
the reliability of the results warrants the use of the faster
methods.
Another participant went further and pointed out that many of
the currently available methods typically produce huge lists
of seemingly reasonable molecules. However, only a few can be
chosen for synthesis. Ideally a free energy scoring function
would provide a basis for ranking and selecting the structures.
Despite much progress, this remains an unsolved problem. Although
the list can be reduced, one cannot with confidence pick only
the top ten or so compounds for further study. Until the perfect
scoring function is developed one must sample the reduced list
in some rational manner. Experimental design methods are needed
to address this. This is an opportunity for closer collaboration
between the computational chemistry community and the statistical
and applied mathematics community.
Donald
G. Truhlar
from
the University of Minnesota (Chemistry and Supercomputer Institute)
was one of the organizers for the workshops on "Evolutionary
Algorithms" and "Mathematical and Computational Issues in Drug
Design." He together with W. Jeffrey Howe, Anthony J. Hopfinger,
Jeff Blaney, and Richard A. Dammkoehler write the following
foreword for the IMA Volume on "Rational Drug Design."
Drug research and discovery are of critical importance in human
health care and are becoming increasingly expensive, while the
need for new drugs is also increasing. Computational approaches
for discovery and optimization of drug leads have proven successful
in many recent research programs. (A "lead" compound is one
with sought for bioactivity but which requires further optimization,
for example to improve its bioavailability or reduce certain
side reactions, in order to become a useful drug.) Methods for
drug lead discovery and optimization have grown in their effectiveness
not only because of improved understanding of the basic science
-- the biological events and molecular interactions that define
a target for therapeutic intervention -- but also because of
advances in algorithms, representations, and mathematical procedures
for studying drug processes. In order to promote the interaction
of mathematicians, computer scientists, and chemists to further
the progress in the field and alert researchers to the opportunities
for interdisciplinary research, the University of Minnesota's
Institute for Mathematics and Its Applications and the University
of Minnesota Supercomputer Institute sponsored a Workshop on
Rational Drug Design on the Minneapolis campus, April 7-11,
1997. The workshop was devoted primarily to mathematical and
computational issues in drug design. This volume contains the
proceedings of that Workshop.
The workshop brought together top researchers in computer-aided
drug discovery, computational chemistry, mathematics, and computer
science to present state-of-the-art research in both the science
and the underlying mathematics and to identify new problems
for possible collaborations. General subject areas of the workshop
included receptor-based applications such as binding energy
approximations, molecular docking, and de novo design; non-receptor-based
applications such as molecular similarity, conformational analysis,
and structural diversity; molecular dynamics simulations; and
solvation issues related to partitioning of a solute between
aqueous and nonpolar media. The workshop focused on the mathematical
procedures and algorithms upon which the scientific applications
are based. |