with partial support by the Office of Naval Research
Organizers:
Roni Rosenfeld, Carnegie Mellon University, roni@cmu.edu
Sanjeev Khudanpur, Center for Language and Speech Processing, Johns Hopkins University, khudanpur@jhu.edu
Mark Johnson, Department of Cognitive and Linguistic Sciences, Brown University, mj@cs.brown.edu
Registration for this workshop is now closed.
Language modeling is crucial to all applications that process human language with less than complete knowledge. These include speech recognition, machine translation, optical character recognition, handwriting recognition, spelling and grammar correction, and others. Formal theories of grammar have so far failed to account adequately for actual natural usage of language. Stochastic versions of formal grammars are still less successful (as measured by the cross entropy of their predictions) than simple Markovian models (n-grams), which are estimated from larger amounts of data. With the advent of huge textual corpora, a breakthrough in language modeling will come when we successfully integrate linguistic knowledge with statistical estimation techniques.
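As an illustration of the yardstick mentioned above, the following is a minimal sketch of how the cross entropy of an n-gram model's predictions can be computed. It assumes a bigram model with add-one smoothing and a toy corpus; the corpus, the smoothing choice, and all function names are illustrative assumptions and do not come from the workshop materials.

```python
import math
from collections import Counter


def train_bigram(tokens):
    """Count bigram contexts and bigrams from a list of tokens."""
    contexts = Counter(tokens[:-1])
    bigrams = Counter(zip(tokens[:-1], tokens[1:]))
    return contexts, bigrams


def bigram_prob(prev, word, contexts, bigrams, vocab_size):
    """Add-one (Laplace) smoothed estimate of P(word | prev)."""
    return (bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size)


def cross_entropy(test_tokens, contexts, bigrams, vocab_size):
    """Average negative log2 probability per predicted word, in bits."""
    pairs = list(zip(test_tokens[:-1], test_tokens[1:]))
    total = -sum(
        math.log2(bigram_prob(p, w, contexts, bigrams, vocab_size))
        for p, w in pairs
    )
    return total / len(pairs)


# Toy data (hypothetical, for illustration only).
train = "the cat sat on the mat the dog sat on the rug".split()
test = "the cat sat on the rug".split()
vocab = set(train) | set(test)

contexts, bigrams = train_bigram(train)
h = cross_entropy(test, contexts, bigrams, len(vocab))
uniform = math.log2(len(vocab))  # cross entropy of a uniform guess

print(f"bigram cross entropy: {h:.3f} bits/word")
print(f"uniform baseline:     {uniform:.3f} bits/word")
```

A lower cross entropy on held-out text means the model assigns higher probability to what was actually said, which is the sense in which one model is "more successful" than another here.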
This workshop will bring together researchers who are working on various aspects of language modeling (stochastic grammars, clustering, maximum entropy models) and mathematicians interested in these and related problems (Bayesian methods, clustering, information theory). The first two days will consist of an overview of the field and existing techniques, followed by presentations of ongoing research. The overall objective is to encourage interaction and collaboration between mathematicians and practitioners to pursue the next generation of solutions to language modeling problems. The specific goals are:
| MONDAY, OCTOBER 30: STATISTICAL LANGUAGE MODELING. All talks are in Lecture Hall EE/CS 3-180 unless otherwise noted. | | |
|---|---|---|
| 8:30 am | Coffee and Registration | Reception Room EE/CS 3-176 |
| 9:10 am | Willard Miller, Fred Dulles, and Sanjeev Khudanpur | Introduction |
| 9:30 am | Harry Printz | Overview of SLM Applications |
| 10:30 am | Break | Reception Room EE/CS 3-176 |
| 11:00 am-12:00 pm | Roni Rosenfeld, Carnegie Mellon University | An Accelerated Introduction to Statistical Language Modeling |
| 2:00 pm | Jerome Bellegarda, Apple Computer, Inc. | Data-Driven Semantic Language Modeling. Associated paper: "Exploiting Latent Semantic Information in Statistical Language Modeling," Proceedings of the IEEE, Vol. 88, No. 8, pp. 1279-1296, August 2000. |
| 3:00 pm | Break | Reception Room EE/CS 3-176 |
| 3:30-4:30 pm | James Baker (Dragon Systems), Dietrich Klakow (Philips), Harry Printz (IBM), and Alejandro Murua (U. of Washington) | Panel Discussion on "The State of the Art in SLM". Harry Printz talk: Acoustic Confusability |
| 4:30 pm | IMA Tea. A variety of appetizers and beverages will be served. | IMA East, 400 Lind Hall |
| TUESDAY, OCTOBER 31: STATISTICAL COMPUTATIONAL LINGUISTICS. All talks are in Lecture Hall EE/CS 3-180 unless otherwise noted. | | |
| 9:15 am | Coffee | Reception Room EE/CS 3-176 |
| 9:30 am | Christopher Manning, Stanford University | Probabilistic Models in Computational Linguistics |
| 10:30 am | Break | Reception Room EE/CS 3-176 |
| 11:00 am-12:00 pm | Mark Johnson, Brown University | An Introduction to Probabilistic Grammars and their Applications |
| 2:00 pm | Michael Collins, AT&T Labs-Research | Statistical Models for Natural Language Parsing |
| 3:00 pm | Break | Reception Room EE/CS 3-176 |
| 3:30-4:30 pm | Fernando Pereira (AT&T Labs), Jason Eisner (University of Rochester), and Chuanshu Ji (University of North Carolina) | Panel Discussion on "From Parsing to Text Understanding; What are the Real Challenges?" |
| WEDNESDAY, NOVEMBER 1: MAXIMUM ENTROPY AND EM TECHNIQUES. All talks are in Lecture Hall EE/CS 3-180 unless otherwise noted. | | |
| 9:15 am | Coffee | Reception Room EE/CS 3-176 |
| 9:30 am | Sanjeev Khudanpur, Johns Hopkins University | Maximum Entropy Techniques and Exponential Models in SLM/SCL |
| 10:30 am | Break | Reception Room EE/CS 3-176 |
| 11:00 am-12:00 pm | Andreas Stolcke (SRI International), Stefan Riezler (Universitat Stuttgart), D. Klakow (Philips), and Zhiyi Chi (University of Chicago) | Panel Discussion on "Modeling Techniques for Combining Multiple Information Sources" |
| 2:00 pm | Frederick Jelinek, Johns Hopkins University | |
| 3:00 pm | Break | Reception Room EE/CS 3-176 |
| 3:30-4:30 pm | Eugene Charniak (Brown University), Lillian Lee (Cornell University), Larry Gillick (Dragon Systems), and Peter Bickel (UC-Berkeley) | Panel Discussion on "Applications of EM Techniques" |
| THURSDAY, NOVEMBER 2: BAYESIAN METHODS AND MCMC. All talks are in Lecture Hall EE/CS 3-180 unless otherwise noted. | | |
| 9:15 am | Coffee | Reception Room EE/CS 3-176 |
| 9:30 am | Julian E. Besag, University of Washington | Markov Chain Monte Carlo and Bayesian Computation |
| 10:30 am | Break | Reception Room EE/CS 3-176 |
| 11:00 am-12:00 pm | Roni Rosenfeld (Carnegie Mellon University), Olivier Catoni (CNRS), and Jean-Philippe Vert (Ecole Normale Superieure) | Panel Discussion on "Bayesian Techniques in Computational Models of Natural Language" |
| 2:00 pm | Mehryar Mohri, AT&T Labs - Research | Finite-State Language Modeling |
| 3:00 pm | Break | Reception Room EE/CS 3-176 |
| 3:30-4:30 pm | Steven Abney (AT&T), Ya'acov Ritov (Hebrew University of Jerusalem), and Stu Geman (Brown) | Panel Discussion on "Connections between Weighted Finite State Techniques and More Traditional Statistical Models" |
| 6:00 pm | Workshop Dinner | Caspian Restaurant |
| FRIDAY, NOVEMBER 3: FUTURE DIRECTIONS. All talks are in Lecture Hall EE/CS 3-180 unless otherwise noted. | | |
| 9:15 am | Coffee | Reception Room EE/CS 3-176 |
| 9:30-10:30 am | New Multidisciplinary Research Proposals from Workshop Participants | |
| 10:30 am | Break | Reception Room EE/CS 3-176 |
| 11:00 am | New Multidisciplinary Research Proposals from Workshop Participants (Cont'd.) | |
| 12:00-12:15 pm | Roni Rosenfeld (Carnegie Mellon University), Sanjeev Khudanpur (Johns Hopkins University), and Mark Johnson (Brown) | Closing Remarks |
| Name | Department | Affiliation |
|---|---|---|
| Steven Abney | AT&T | |
| Joan Bachenko | Linguistech Technologies | |
| James Baker | | |
| Marian Barry | | |
| Jerome Bellegarda | Advanced Technology Group | Apple Computer, Inc. |
| Julian E. Besag | Statistics | University of Washington |
| Peter Bickel | Statistics | University of California-Berkeley |
| Dan Boley | Computer Science | University of Minnesota |
| Jamylle Carter | Institute for Mathematics and its Applications | |
| Olivier Catoni | Probabilites et Modeles Aleatoires | C.N.R.S. |
| Eugene Charniak | Computer Science | Brown University |
| Li-Tien Cheng | Institute for Mathematics and its Applications | |
| Zhiyi Chi | Statistics | University of Chicago |
| Michael Collins | AT&T Labs-Research | |
| Akira Date | Advanced Brain Signal Processing Lab. | Riken Brain Science Institute |
| Mukund Deshpande | Computer Science | University of Minnesota |
| Fred Dulles | Institute for Mathematics & its Applications | |
| Jason Eisner | | |
| Selim Esedoglu | Institute for Mathematics and its Applications | |
| Paul Garrett | Mathematics | University of Minnesota |
| Stu Geman | Applied Mathematics | Brown University |
| Larry Gillick | Vice President of Research | Dragon Systems, Inc. |
| Marcia Gini | CSCI | University of Minnesota |
| Frederick Jelinek | Electrical and Computer Engineering | Johns Hopkins University |
| Chuanshu Ji | Statistics | University of North Carolina - Chapel Hill |
| Mark Johnson | Cognitive & Linguistic Sciences | Brown University |
| Sanjeev Khudanpur | Center for Language and Speech Processing | Johns Hopkins University |
| Dietrich Klakow | Philips Research | |
| Christopher Lang | Indiana University Southeast | |
| Lillian Lee | Computer Science | Cornell University |
| Elizabeth Lovance | Linguistic Technology | |
| Martin Maiers | Computer Science | University of Minnesota |
| Christopher Manning | Computer Science | Stanford University |
| David McKoskey | Research/Development | Linguistic Technologies, Inc. |
| Dan Melamed | Westgroup | |
| Willard Miller | Institute for Mathematics & its Applications | |
| Mehryar Mohri | AT&T Labs - Research | |
| Alejandro Murua | Statistics | University of Washington |
| Sergey Pakhomov | ILLASL (Linguistics Program) | University of Minnesota |
| Fernando Pereira | Whiz Bang! Labs | |
| Harry Printz | IBM T.J. Watson Research Center | |
| Kashif Riaz | Strategic Development | West Group |
| Stefan Riezler | Institut fur Maschinelle Sprachverarbeitung | Universitat Stuttgart |
| Ya'acov Ritov | Statistics | The Hebrew University of Jerusalem |
| Roni Rosenfeld | Computer Science | Carnegie Mellon University |
| Judith Schlesinger | IDA/Center for Computing Sciences | |
| Michael Schonwetter | | |
| Andreas Stolcke | SRI International | |
| Paul Thompson | West Group | |
| Jean-Philippe Vert | DMI-LMENS | Ecole Normale Superieure |
| Shaojun Wang | CALD | Carnegie Mellon University |
| Ken Williams | | University of Minnesota |