nl.ics

BEGIN:VCALENDAR
CALSTYLE:GREGORIAN
PRODID:-//NL//Seminar Calendar//EN
VERSION:2.0
X-WR-CALNAME:NL
BEGIN:VTIMEZONE
TZID:America/Los_Angeles
X-LIC-LOCATION:America/Los_Angeles
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DESCRIPTION: TBD

DTEND;TZID=America/Los_Angeles:20181109T160000
DTSTART;TZID=America/Los_Angeles:20181109T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:TBD
UID:20181109T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: TBD

DTEND;TZID=America/Los_Angeles:20181012T160000
DTSTART;TZID=America/Los_Angeles:20181012T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:TBD
UID:20181012T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: TBD

DTEND;TZID=America/Los_Angeles:20180907T160000
DTSTART;TZID=America/Los_Angeles:20180907T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:TBD
UID:20180907T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Can we detect the parts responsible for a generic behavior in a neural model to transfer it to another? In this talk, we first see why this might be a good idea, especially for low-resource machine translation. Then we focus on our approach to isolating a behavior. In our case, we specifically focus on coverage during machine translation. We present our results across different languages that show how neural models try to ensure coverage.Bio: Mozhdeh Gheini is a last-semester Computer Science master's student at USC Viterbi School of Engineering. At ISI NLP Group, she works on improving neural low-resource machine translation under the supervision of Jonathan May. She will be applying for Ph.D. programs this Fall.Abstract: In improvised comedy, saying "yes, and.. " is a rule-of-thumb that suggests that one person should accept the other person's offer (yes), and then add related information on top of that (and). Collecting a "yes, and.." corpus is not only helpful for building an improv agent, but can also be used for building conversational skill training tool, improving a dialogue system, etc. I will discuss the methods we have used for building such a dataset, data we have got so far and future considerations.Bio: Xinyu is a 2018 summer intern working with Dr. Jonathan May and Dr. Nanyun Peng on computerized improvised comedy. She will be joinging the Language Technologies Institute at Carnegie Mellon University in 2018 fall.

DTEND;TZID=America/Los_Angeles:20180824T160000
DTSTART;TZID=America/Los_Angeles:20180824T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:T1. Constraints for Transfer Learning for Neural Machine Translation T2. Say Yes-and: Building a Specialized Corpus for Digital Improvised Comedy
UID:20180824T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: To generate language, we model what to say, why not also model how listeners will react? We show how pragmatic inference can be used to both generate and interpret natural language instructions for complex, sequential tasks. Our pragmatics-enabled models reason about how listeners will react upon hearing instructions, and reason counterfactually about why speakers produced the instructions they did. We find that this inference procedure improves state-of-the-art listener models (at correctly interpreting human instructions) and speaker models (at generating instructions correctly interpreted by humans) in diverse settings, including navigating through real-world indoor environments.Bio:Daniel Fried is a PhD student at UC Berkeley, working with Dan Klein on grounded semantics and structured prediction in natural language processing. Previously, he received a BS from the University of Arizona and an MPhil from the University of Cambridge. His work has been supported by a Churchill Scholarship, NDSEG Fellowship, Huawei / Berkeley AI Fellowship, and Tencent Fellowship.

DTEND;TZID=America/Los_Angeles:20180914T160000
DTSTART;TZID=America/Los_Angeles:20180914T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Pragmatic Models for Generating and Following Grounded Instructions
UID:20180914T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Machine learning is at the forefront of many recent advances in natural language processing, enabled in part by the sophisticated models and algorithms that have been recently introduced. However, as a consequence of this complexity, machine learning essentially acts as a black-box as far as users are concerned. It is incredibly difficult to understand, predict, or "fix" the behavior of NLP models that have been deployed. In this talk, I propose interpretable representations that allow users and machine learning models to interact with each other: enabling machine learning models to provided explanations as to why a specific prediction was made and enabling users to inject domain knowledge into machine learning. The first part of the talk introduces an approach to estimate local, interpretable explanations for black-box classifiers and describes an approach to summarize the behavior of the classifier by selecting which explanations to show to the user. I will also briefly describe work on "closing the loop", i.e. allowing users to provide feedback on the explanations to improve the model, for the task of relation extraction, an important subtask of natural language processing. In particular, we introduce approaches to both explain the relation extractor using logical statements and to inject symbolic domain knowledge into relational embeddings to improve the predictions. I present experiments to demonstrate that an interactive interface is effective in providing users an understanding of, and an ability to improve, complex black-box machine learning systems.Bio: Sameer Singh is an Assistant Professor of Computer Science at the University of California, Irvine. He is working on large-scale and interactive machine learning applied to information extraction and natural language processing. Till recently, Sameer was a Postdoctoral Research Associate at the University of Washington. He received his PhD from the University of Massachusetts, Amherst in 2014, during which he also interned at Microsoft Research, Google Research, and Yahoo! Labs on massive-scale machine learning. He was selected as a DARPA Riser, was awarded the Adobe Research Data Science Award, won the grand prize in the Yelp dataset challenge, has been awarded the Yahoo! Key Scientific Challenges fellowship, and was a finalist for the Facebook PhD fellowship. Sameer has published more than 30 peer-reviewed papers at top-tier machine learning and natural language processing conferences and workshops.

DTEND;TZID=America/Los_Angeles:20170324T160000
DTSTART;TZID=America/Los_Angeles:20170324T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Intuitive Interactions with Black-box Machine Learning
UID:20170324T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I am going to be talking about stuff that I have been working over thepast 6-9 months. This includes randomized algorithms and its applicationto 2 NLP problems: noun clustering and noun-pair clustering. I will alsobe commenting on my experience of working with very very large amounts ofreal Natural Language text (This includes processing and working with dataavailable from the web. This corpus is not the standard newspaper textthat we are so used to in the NLP community.) This talk will also cover alarge part of my thesis work.

DTEND;TZID=America/Los_Angeles:20050422T163000
DTSTART;TZID=America/Los_Angeles:20050422T150000
LOCATION:11 Large
SUMMARY:Working with Large Corpus, High speed clustering and its applications
UID:20050422T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 

DTEND;TZID=America/Los_Angeles:20030729T160000
DTSTART;TZID=America/Los_Angeles:20030729T150000
LOCATION:11 Small
SUMMARY:A Model of Word Movement for Machine Translation
UID:20030729T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Many datasets for natural language processing problems lack linguistic variation, which hurts generalization of models trained on them. Recent research has shown that it is possible to break many learned models by evaluating them on adversarial examples, which are generated by manually introducing lexical, pragmatic, and syntactic variation to existing held-out examples from the data. Automating this process is challenging, as input semantics must be preserved in the face of potentially large sentence modifications. In this talk, I will focus specifically on syntactic variation in discussing our recent work on syntactically controlled paraphrase networks (SCPN) for adversarial example generation.Given a sentence and a target syntactic form (e.g., a constituency parse), an SCPN is trained to produce a paraphrase of the sentence with the desired syntax. We show it is possible to create training data for this task by first doing backtranslation at a very large scale, and then using a parser to label the syntactic transformations that naturally occur during this process. Such data allows us to train a neural encoder-decoder model with extra inputs to specify the target syntax. A combination of automated and human evaluations show that SCPNs generate paraphrases that almost always follow their target specifications without decreasing paraphrase quality when compared to baseline (uncontrolled) paraphrase systems. Furthermore, they are more capable of generating syntactically adversarial examples that both (1) "fool" pretrained models and (2) improve the robustness of these models to syntactic variation when used for data augmentation.Bio: Mohit Iyyer will be joining UMass Amherst as an assistant professor in Fall 2018. Currently, he is a Young Investigator at the Allen Institute of Artificial Intelligence; prior to that, he received a Ph.D. from the Department of Computer Science at the University of Maryland, College Park, advised by Jordan Boyd-Graber and Hal DaumÃ© III. His research interests lie at the intersection of natural language processing and machine learning. More specifically, he focuses on designing deep neural networks for both traditional NLP tasks (e.g., question answering, language generation) and new problems that involve creative language (e.g., understanding narratives in novels). He has interned at MetaMind and Microsoft Research, and his research has won a best paper award at NAACL 2016 and a best demonstration award at NIPS 2015.

DTEND;TZID=America/Los_Angeles:20180330T160000
DTSTART;TZID=America/Los_Angeles:20180330T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Generating Adversarial Examples with Syntactically Controlled Paraphrase Networks
UID:20180330T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Philipp Koehn and I will do a machine translation tutorial at ACL.Instead of an introductory tutorial, we'll do short 15-minute segmentson various hot topics in MT research.  For the ISI NL seminar, I'llpresent 3 or 4 of those topics, determined by audience vote.

DTEND;TZID=America/Los_Angeles:20090710T160000
DTSTART;TZID=America/Los_Angeles:20090710T150000
LOCATION:11 Large
SUMMARY:Excerpts from ACL-09 Tutorial on "Topics in Machine Translation"
UID:20090710T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In the 1990s, researchers applied their new developments in transducertheory using widely available easy-to-use toolkits for string transducers,and made well-known advances in parsing, machine translation, and otherareas. Rapid prototyping via software such as the AT&T toolkit and carmelwas useful for proofs of concept and in many cases led to unforseendevelopments in novel areas. In the current nlp research environment treebased strategies and new models have shown promising results in advancingthe state of the art, and recent developments in weighted tree automatatheory are enriching the bedrock created 40 years ago, but as of yet thereis no toolkit available with the necessary capabilities to turn promiseinto solution.Tiburon is the first probablistic tree transducer toolkit. Similar in formand function to the string-based toolkits of yesteryear, it is designed tobe easy to use, with simple but expressive definitions of tree automataand a concise set of vital operations that can be used to construct manyuseful tree-based nlp projects. Although a work in progress, Tiburon isalready a usable tool with active users between the ages of 6 and 41. Iwill describe the current status of the system, demonstrate ease of useand potential power, and discuss the challenges ahead.

DTEND;TZID=America/Los_Angeles:20060317T163000
DTSTART;TZID=America/Los_Angeles:20060317T150000
LOCATION:4th Floor
SUMMARY:Tiburon: A Finite State Tree Automata Toolkit
UID:20060317T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Consider Donald Norman's quote, "The power of the unaided mind ishighly overrated. Without external aids, memory, thought, andreasoning are all constrained. But human intelligence is highlyflexible and adaptive, superb at inventing procedures and objects thatovercome its own limits. The real powers come from devising externalaids that enhance cognitive abilities." (Norman, 1993) Common methodsfor externalization include making sketches on whatever happens to behandy -- paper napkins, program margins, etc. -- and/or finding acolleague or two to discuss the problem with. It would seem then, thatvisualization and collaboration are natural possibilities for creatingpositive cognitive aids. I will discuss our approach to developinginteractive information visualizations both to support individuals andsmall groups of collaborators and briefly describe some of our recentresults.About the speaker:Sheelagh Carpendale holds a Canada Research Chair in InformationVisualization at the University of Calgary. Her research focuses onthe visualization, exploration and manipulation of information;visualizing such topics as ecological dynamics, uncertainty ininformation, social and communication information and investigatingthe development of information visualization environments that supportcollaboration. Dr. Carpendale's research in information visualizationand interaction design draws on her dual background in ComputerScience (BSc. and Ph.D. Simon Fraser University) and Visual Arts(Sheridan College, School of Design and Emily Carr, College of Art).

DTEND;TZID=America/Los_Angeles:20070504T163000
DTSTART;TZID=America/Los_Angeles:20070504T150000
LOCATION:11 Large
SUMMARY:Information Visualization and Collaboration
UID:20070504T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This is two practice talks.-----------------------------------------------------------------------------FIRST TALK:The traditional approach to diagnosing learner speech errors in ComputerAided Language Learning is to create a linguistic profile of thelearner/user. We, however, propose that work must also be done to modelthe linguistic profile of a typcial native listener.Not all errors in second langage learner speech are created equal.Different errors sound more "severe" or "harsh" to native speaker ears andshould therefore be treated with more emphasis in pedagogical interaction.The Tactical Language Training System (TLTS) is a speech-enabledvirtual-reality based computer learning environment designed to teachArabic spoken communication to American English speakers. This talkaddresses the ways the TLTS contextualizes non-native speech errors, andhow this contextualization fits in the corrective exchanges between anon-native learner and a pedagogical agent built to model a nativelistener.The pedagogical system used in TLTS includes:Â  * Automatic Speech Recognition (ASR) models which are built on acombination of both annnotated and unannotated non-native speech withnative speech data.Â  * A stochastic generative model for errors in learner speech thatcreates mispronunciation grammars for the ASRÂ  * Reweighting of system-perceived mispronunciation severity based onaggregate native speaker judgements of quality pronunciation andintelligiblity.Â  * Contextualization of feedback based on lexical and phoneticinventories of the native and non-native languages.-----------------------------------------------------------------------------SECOND TALK:We present a novel feature-enriched approach that learns to detect theconversation focus of threaded discussions by combining NLP analysis andIR techniques. Using the graph-based algorithm HITS, we integratedifferent features such as lexical similarity, poster trustworthiness, andspeech act analysis of human conversations with featureoriented linkgeneration functions. It is the first quantitative study to analyze humanconversation focus in the context of online discussions that takes intoaccount heterogeneous sources of evidence. Experimental results using athreaded discussion corpus from an undergraduate class show that itachieves significant performance improvements compared with the baselinesystem.

DTEND;TZID=America/Los_Angeles:20060512T163000
DTSTART;TZID=America/Los_Angeles:20060512T150000
LOCATION:11 Large
SUMMARY:Pedagogical Contextualization of Language Learner Speech Errors AND Learning to Detect Conversation Focus of Threaded Discussions
UID:20060512T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Social networks have many counter-intuitive properties, including the âfriendship paradoxâ that states, on average, your friends have more friends than you do. Recently, a variety of other paradoxes were demonstrated in online social networks. This paper explores the origins of these network paradoxes. Specifically, we ask whether they arise from mathematical properties of the networks or whether they have a behavioral origin. We show that sampling from fat-tailed distributions always gives rise to a paradox in the mean, but not the median. We propose a strong form of network paradoxes, based on utilizing the median, and validate it empirically using data from two online social networks. Specifically, we show that for any user the majority of userâs friends and followers have more friends, followers, etc. than the user, and that this cannot be explained by statistical properties of sampling. Next, we explore the behavioral origins of the paradoxes by using the shuffle test to remove correlations between node degrees and attributes. We find that paradoxes for the mean persist in the shuffled network, but not for the median. We demonstrate that strong paradoxes arise due to the assortativity of user attributes, including degree, and correlation between degree and attribute.

DTEND;TZID=America/Los_Angeles:20140411T160000
DTSTART;TZID=America/Los_Angeles:20140411T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Network Weirdness: Exploring the Origins of Network Paradoxes
UID:20140411T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The dialogue policy of a dialogue system decides on what dialogue move (also called âactionâ) the system should make given the dialogue context (also called âdialogue stateâ). Building hand-crafted dialogue policies is a hard task, and there is no guarantee that the resulting policies will be optimal. This issue has motivated the dialogue community to use statistical methods for automatically learning dialogue policies, the most popular of which is reinforcement learning (RL). However, to date, RL has mainly been used to learn dialogue policies in slot-filling applications (e.g., restaurant recommendation, flight reservation, etc.) largely ignoring other more complex genres of dialogue such as negotiation. This talk presents challenges in reinforcement learning of negotiation dialogue policies. The first part of the talk focuses on applying RL to a two-party multi-issue negotiation domain. Here the main challenges are the very large state and action space, and learning negotiation dialogue policies that can perform well for a variety of negotiation settings, including against interlocutors whose behavior has not been observed before. Good negotiators try to adapt their behaviors based on their interlocutorsâ behaviors. However, current approaches to using RL for dialogue management assume that the userâs behavior does not change over time. In the second part of the talk, I will present an experiment that deals with this problem in a resource allocation negotiation scenario.Kallirroi Georgila is a Research Assistant Professor at the Institute for Creative Technologies (ICT) at the University of Southern California (USC) and at USCâs Computer Science Department. Before joining USC/ICT in 2009 she was a Research Scientist at the Educational Testing Service (ETS) and before that a Research Fellow at the School of Informatics at the University of Edinburgh. Her research interests include all aspects of spoken dialogue processing with a focus on reinforcement learning of dialogue policies, expressive conversational speech synthesis, and speech recognition. She has served on the organizing, senior, and program committees of many conferences and workshops. Her research work is funded by the National Science Foundation and the Army Research Office.

DTEND;TZID=America/Los_Angeles:20170421T160000
DTSTART;TZID=America/Los_Angeles:20170421T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Reinforcement learning of negotiation dialogue policies
UID:20170421T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Automatic word alignment is the problem of automatically annotatingparallel text with translational correspondence. Previous generativeword alignment models have made structural assumptions such as the1-to-1, 1-to-N, or phrase-based consecutive word assumptions, whileprevious discriminative models have either made one of theseassumptions directly or used features derived from a generative modelusing one of these assumptions. We present a new generative alignmentmodel which avoids these structural limitations, and show that it iseffective when trained using both unsupervised and semi-supervisedtraining methods. Experiments show strong improvements in wordalignment accuracy and usage of the generated alignments inhierarchical and phrasal SMT systems improves the BLEU score.

DTEND;TZID=America/Los_Angeles:20070615T110000
DTSTART;TZID=America/Los_Angeles:20070615T103000
LOCATION:11 Large
SUMMARY:Getting the structure right for word alignment: LEAF
UID:20070615T103000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Tree-based probability models of translation have been proposed to takeadvantage of parse trees on one, both, or neither sides of a parallelcorpus.  I will present comparative results for these three approaches forthe task of word alignment on Chinese-English and French-English data, aswell as some analysis of what is going on behind the numbers.

DTEND;TZID=America/Los_Angeles:20040625T160000
DTSTART;TZID=America/Los_Angeles:20040625T150000
LOCATION:11 Large
SUMMARY:Syntactic Supervision and Tree-Based Alignment
UID:20040625T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Various types of how-to-knowledge are encoded in natural language instructions: from setting up a tent, to preparing a dish for dinner, and to executing biology lab experiments. These types of instructions are based on procedural language, which poses unique challenges. For example, verbal arguments are commonly elided when they can be inferred from context, e.g., ``bake for 30 minutes'', not specifying bake what and where. Entities frequently merge and split, e.g., ``vinegarââ and ``oilââ merging into ``dressingââ, creating challenges to reference resolution. And disambiguation often requires world knowledge, e.g., the implicit location argument of ``stir frying'' is on ``stove''. In this talk, I will present our recent approaches to interpreting and composing cooking recipes that aim to address these challenges.In the first part of the talk, I will present an unsupervised approach to interpreting recipes as action graphs, which define what actions should be performed on which objects and in what order. Our work demonstrates that it is possible to recover action graphs without having access to gold labels, virtual environments or simulations. The key insight is to rely on the redundancy across different variations of similar instructions that provides the learning bias to infer various types of background knowledge, such as the typical sequence of actions applied to an ingredient, or how a combination of ingredients (e.g., ``flour'', ``milk'', ``eggs'') becomes a new entity (e.g, ``wet mixture'').In the second part of the talk, I will present an approach to composing new recipes given a target dish name and a set of ingredients. The key challenge is to maintain global coherence while generating a goal-oriented text. We propose a Neural Checklist Model that attains global coherence by storing and updating a checklist of the agenda (e.g., an ingredient list) with paired attention mechanisms for tracking what has been already mentioned and what needs to be yet introduced. This model also achieves strong performance on dialogue system response generation. I will conclude the talk by discussing the challenges in modeling procedural language and acquiring the necessary background knowledge, pointing to avenues for future research.Bio:Yejin Choi is an assistant professor at the Computer Science & Engineering Department of University of Washington. Her recent research focuses on language grounding, integrating language and vision, and modeling nonliteral meaning in text. She was among the IEEEâs AI Top 10 to Watch in 2015 and a co-recipient of the Marr Prize at ICCV 2013. Her work on detecting deceptive reviews, predicting the literary success, and learning to interpret connotation has been featured by numerous media outlets including NBC News for New York, NPR Radio, New York Times, and Bloomberg Business Week. She received her Ph.D. in Computer Science at Cornell University.

DTEND;TZID=America/Los_Angeles:20161202T160000
DTSTART;TZID=America/Los_Angeles:20161202T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Procedural Language and Knowledge
UID:20161202T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The topics & approximate start times:(3:00 sharp) My 7-10 min bit for panel discussion on "Manual vs. AutomatedKnowledge Acquisition"Will touch on web extraction vs. learning from volunteers -- strengths andweaknesses, new thoughts on synergies(3:15) Designing Intelligent Acquisition Interfaces for Collecting WorldKnowledge from Web Contributors(paper by Timothy Chklovski, Yolanda Gil)(3:55) Collecting Paraphrase Corpora from Volunteer Contributors (paper byTimothy Chklovski)

DTEND;TZID=America/Los_Angeles:20050929T163000
DTSTART;TZID=America/Los_Angeles:20050929T150000
LOCATION:11 Large
SUMMARY:Previews of my talks for K-CAP
UID:20050929T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We develop a system that lets people overcome language barriers by letting them speak a language they do not know.  Our system accepts text entered by a user, translates the text, then converts the translation into a phonetic spelling in the userâs own orthography. We trained the system on phonetic spellings in travel phrasebooks.Xing Shi is a PhD student at USC, advised by Professor Kevin Knight.

DTEND;TZID=America/Los_Angeles:20140523T160000
DTSTART;TZID=America/Los_Angeles:20140523T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:How to Speak a Language Without Knowing It
UID:20140523T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We present a grand challenge to build a corpus that will include all of the world's languages, in a consistent structure that permits large-scale cross-linguistic processing, enabling the study of universal linguistics.  The focal data types, bilingual texts and lexicons, relate each language to one of a set of reference languages. We propose that the ability to train systems to translate into and out of a given language be the yardstick for determining when we have successfully captured a language.  We call on the computational linguistics community to begin work on this Universal Corpus, pursuing the many strands of activity described here, as their contribution to the global effort to document the world's linguistic heritage before more languages fall silent.(This talk will present joint work with Steve Abney.)Brief Bio:Steven Bird is Associate Professor in the Department of ComputerScience and Software Engineering at the University of Melbourne, andalso Senior Research Associate at the Linguistic Data Consortium.  In2009 he served as president of the Association for ComputationalLinguistics, and he completed a textbook on Natural LanguageProcessing, published by O'Reilly.  Steven studies scalable,semi-automatic methods for analyzing spoken and written language, andfor preserving endangered languages. This involves a mixture ofcomputational modelling and linguistic fieldwork.  For further detailsand online publications, please visit http://stevenbird.me/

DTEND;TZID=America/Los_Angeles:20100609T163000
DTSTART;TZID=America/Los_Angeles:20100609T153000
LOCATION:10th Floor Conference Room
SUMMARY:The Human Language Project: Building a Universal Corpus of the World's Languages
UID:20100609T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: What is in common, and what is different, between translating from English to Chinese and compiling C++ into machine code?In this talk I will first introduce a tree-based (aka syntax-directed) paradigm for machine translation, inspired by both human translators and compilers. In this paradigm, a source language sentence is first parsed into a syntactic tree, which is then recursively converted into a target language sentence via tree-to-string transformation rules. Since the translation process is driven by the syntax, this approach resembles the classical "syntax-directed translation" method in compiling theory.However, natural languages are crucially different from programming languages in that they are fundamentally ambiguous. So we don't (and will probably never) have perfect parsers, and parsing errors adversely affect translation quality. To alleviate this problem, an obvious idea is to use the top-k parses, rather than a single 1-best, but this only helps a little bit due to the limited scope of the k-best list. We instead propose a "forest-based approach", which translates a packed forest encoding *exponentially* many parses in a compact (polynomial) space by sharing common subtrees. Large-scale experiments showed very significant improvements (over the 1-best baseline) in terms of translation quality, which outperforms the best reported systems to date. More interestingly, translating a forest of millions of trees is even faster than translating on top-30 individual trees thanks to dynamic programming.This talk includes joint work with Kevin Knight and Aravind Joshi (first part), and with Haitao Mi and Qun Liu (second/third parts).Short Bio:Liang Huang recently completed his PhD study at the University of Pennsylvania, co-supervised by Aravind Joshi and Kevin Knight (USC/ISI). He is mainly interested in the theoretical aspects of computational linguistics, in particular, efficient algorithms in parsing and machine translation, generic dynamic programming, and formal properties of synchronous grammars. His thesis develops a set of "forest-based methods" that have been applied to many problems in NLP including k-best parsing, forest rescoring and reranking, and forest-based translation. His awards include an Outstanding Paper Award at ACL 2008, and a University Teaching Award at Penn in 2005.http://www.cis.upenn.edu/~lhuang3/

DTEND;TZID=America/Los_Angeles:20081217T160000
DTSTART;TZID=America/Los_Angeles:20081217T150000
LOCATION:4th Floor CR
SUMMARY:Tree-based and Forest-based Translation
UID:20081217T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Automatic Natural Language applications often require the processing ofstructured data. Traditional machine learning approaches attempt torepresent structured syntactic/semantic objects by means of flat featurerepresentations, i.e. attribute-value vectors. This raises two problems:1. There is no well defined theoretical motivation for such feature model.Structural properties may not fit in any flat feature representation.2. To define effective flat features, a deep knowledge about thelinguistic phenomenon is required.Kernel methods for Natural Language Processing aim to solve both the aboveproblems as kernel functions can be used to define similarities betweenlinguistic objects without explicitly defining the target feature space.In this way, a linguistic phenomenon can be modeled at a more abstractlevel where the modeling is easier. Such property is extremely useful whenthe representation of linguistic phenomena is still not well understood.For example, the feature design of semantic role labeling appear to bequite complex since several and non-definitive feature sets have beenproposed.As a viable alternative to manual feature design, kernel methods proposetwo steps: (1) they generate all substructures of the targetsyntactic/semantic structures and (2) they let the learning algorithm(e.g. Support Vector Machines) to select the most relevant substructures.In this talk, we (1) introduce the PropBank and FrameNet predicateargument structures, (2) present the standard approaches to the automaticlabeling of semantic roles and (3) show advanced semantic role labelingmodels based on kernel methods.About the speaker:Alessandro Moschitti is a researcher at the Computer Science Department ofthe University of Rome ^ÃTor Vergata^Ã. In 1998 he took his master degreein Computer Science at the University of Rome ^ÃLa Sapienza^Ã. In 2003 hefinished his PhD in Computer Science at ^ÃTor Vergata^Ã University.Between 2002 and 2004 he worked as an associate researcher in theUniversity of Texas at Dallas. His research interests concern machinelearning approaches for Natural Language Processing and InformationRetrieval. His deep expertise relates to automated text categorization andsemantic role labeling.  Recently, he has devised new kernels which enableSupport Vector and other kernel-based machines to carry out advancedsemantic processing.

DTEND;TZID=America/Los_Angeles:20050706T153000
DTSTART;TZID=America/Los_Angeles:20050706T140000
LOCATION:11 Large
SUMMARY:Kernel Methods for Semantic Role Labeling
UID:20050706T140000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We propose a simple generative syntactic language model that conditions on overlapping tree contexts in the same way that n-gram language models condition on overlapping sentence context. We estimate the parameters of our model by collecting counts from automatically parsed text using standard n-gram language model estimation techniques, allowing us to train a model on over one billion tokens of data using a single machine in a mater of hours. We evaluate on a range of grammaticality tasks, and find that we consistently outperform n-gram models and other generative baselines, and even compete with state-of-the-art discriminative models hand-designed for each task, despite training on positive data alone. We also show some improvements in preliminary machine translation experiments.

DTEND;TZID=America/Los_Angeles:20120217T160000
DTSTART;TZID=America/Los_Angeles:20120217T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Large Scale Syntactic Language Modeling with Treelets
UID:20120217T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract: Designing 3D scenes is currently a creative task that requires significant expertise and effort in using complex 3D design interfaces.  This design process starts in contrast to the easiness with which people can use language to describe real and imaginary environments.  We present an interactive text to 3D scene generation system that allows a user to design 3D scenes using natural language.  A user provides input text from which we extract explicit constraints on the objects that should appear in the scene.  Given these explicit constraints, the system then uses a spatial knowledge base learned from an existing database of 3D scenes and 3D object models to infer an arrangement of the objects forming a natural scene matching the input description.  Using textual commands the user can then iteratively refine the created scene by adding, removing, replacing, and manipulating objects.Bio: Angel Chang recently received her PhD after working in the Stanford NLP group where she was advised by Chris Manning.  Her research focuses on the intersection of natural language understanding, computer graphics, and AI.  She is currently a visiting expert at Tableau Research.  More details at http://stanford.edu/~angelx/Webcast link: http://webcasterms1.isi.edu/mediasite/Viewer/?peid=735bfbb4ba1a4b749fe591958f837ccb1d

DTEND;TZID=America/Los_Angeles:20160226T160000
DTSTART;TZID=America/Los_Angeles:20160226T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Interactive scene design using natural language
UID:20160226T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this talk I will examine problems encountered in coming to somekind of understanding of one sonnet by Shakespeare (his 64th), askwhat it would take to solve these problems computationally, andsuggests routes to the solution.  The general conclusion is that weare closer to this goal as one might think.  Or are we?Bio:Jerry Hobbs is famous primarily for having an office next to KevinKnight's and a parking space next to Ed Hovy's.  He has readeverything of Shakespeare's that survives, including his will andplays of dubious authorship.  But that was all a long time ago.

DTEND;TZID=America/Los_Angeles:20061215T163000
DTSTART;TZID=America/Los_Angeles:20061215T150000
LOCATION:11 Large
SUMMARY:When Will Computers Understand Shakespeare?
UID:20061215T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this talk, I will focus on the importance of integrating knowledge of human speech production and speech perception mechanisms, and language-specific information with statistically-based, data-driven approaches to develop robust and scalable speech processing algorithms. The need for such hybrid systems is especially critical when dealing with data corrupted by background acoustic noise, when training data are limited, and when dealing with accents.

DTEND;TZID=America/Los_Angeles:20130201T160000
DTSTART;TZID=America/Los_Angeles:20130201T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Dealing with Limited and Noisy Data in Speech Processing: A Hybrid Knowledge-Based and Statistical Approach
UID:20130201T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Test collections for information retrieval tasks have traditionallyassumed that what we are searching for are documents (e.g., Web pages,news stories, or academic documents).  Most information that is generatedis, however, not in originally generated as part of a document, but ratheras what we might refer to as "conversational media" (e.g., email, speech,or instant messaging).  In this talk, I'll describe the creation of twotest collections for conversational media, an email collection beingcreated in the TREC Enterprise Search track and a spoken word testcollection for the the Cross-Language Evaluation Forum (CLEF).  I'll spendmost of the talk describing the details of the CLEF test collection,illustrating the issues with some of the results that we have obtainedfrom our experiments with that collection.  I'll conclude with a fewremarks about the implications of what we are learning for DARPA's newGALE program.  This is joint work with Charles University, the IBM TJWatson Research Center, the Johns Hopkins University, the Survivors of theShoah Visual History Foundation, and the University of West Bohemia.About the speaker:Douglas Oard is an Associate Professor at the University of Maryland,College Park, with a joint appointment in the College of InformationStudies and the Institute for Advanced Computer Studies.  He holds a Ph.D.in Electrical Engineering from the University of Maryland, and hisresearch interests center around the use of emerging technologies tosupport information seeking by end users.  In 2002 and 2003, Doug spent ayear in paradise here at USC-ISI.  His recent work has focused oninteractive techniques for cross-language information retrieval and onsearching conversational text and speech.  Additional information isavailable at http://www.glue.umd.edu/~oard/.

DTEND;TZID=America/Los_Angeles:20050805T163000
DTSTART;TZID=America/Los_Angeles:20050805T150000
LOCATION:11 Large
SUMMARY:The CLEF Cross-Language Speech Retrieval Test Collection
UID:20050805T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The last decade has seen a plethora of papers in NLP devoted to MachineLearning algorithms. However, most of these papers have devoted theireffort exclusively to improving the system performance on the accuracyaxis. Most of the sophisticated NLP algorithms are extremely slow and donot scale up easily when applied to large amounts of data.I will talk about the importance of randomized algorithms and theirpotential in speeding up some NLP algorithms. This talk will be a surveyof some recent advances in Theoretical Computer Science/Math seen with anNLP point-of-view. I am not going to present any results. But I am hopingthat this talk will clarify my thinking process, get feedback from peopleand help me colloborate with others.

DTEND;TZID=America/Los_Angeles:20040813T163000
DTSTART;TZID=America/Los_Angeles:20040813T150000
LOCATION:11 Large
SUMMARY:Randomized algorithms and its application to NLP
UID:20040813T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this talk, I'll present the investigation I'm carrying out in ISIlately under Daniel Marcu's supervision.  Following the noisy-channelframework, we propose a statistical model for learning the argumentstructures of verbs automatically.  We show that we are able to learn bothlexicalized and generalized structures and achieve good results, relyingonly on basic NLP tools like a POS tagger and named-entity recognizer. Wealso present a comparison of the structures we learn with the predictedones in PropBank.

DTEND;TZID=America/Los_Angeles:20041115T163000
DTSTART;TZID=America/Los_Angeles:20041115T150000
LOCATION:8th floor multipurpose room (#849) -- NOT the conference room
SUMMARY:Unsupervised learning of verb argument structures
UID:20041115T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Composition of Tree TransducersSince finite state (string) transducers are not expressive enough for many NLPapplications, computational linguistics started to investigate treetransducers for the task of machine translation, for example. Quite somesuccessful work has been done on generalizing results from string transducersto tree transducers. But when it comes to composition results are notsatisfying because generally tree transducers are not closed undercomposition. Still we think that most of the tree transducers used in NLP arecomposable and that is why we defined the problem of the composition for twoindividual transducers instead of the whole class. During the summer westarted with linear nondeleting tree transducers with epsilon rules andapproached an algorithm to decide for two such transducers whether theircomposition is again in the same class.Using the Perceptron Algorithm to Tune Large Numbers of Feature Weights for Syntax-Based Statistical Machine TranslationCurrent state-of-the-art syntax-based statistical machine translationsystems produce many candidate translations out of which the output translationis selected by taking the argmax over all candidates i of &lt;w,f_i&gt; where w is aweight vector and f_i is a vector of the feature values for candidate i. Thefeatures used by the system and their corresponding weights have a major impacton a system's performance.  Currently, Minimum Error Rate Training (MERT) is used totune the weights of the features.  A drawback of this is that it isn't tractableto tune large numbers of feature weights.  I will discuss using the perceptronalgorithm to tune feature weights for statistical machine translation.  If I get interestingresults before my talk, I may also dicsuss new classes of features (potentially very largenumbers of features) that can be used for improving MT performance.

DTEND;TZID=America/Los_Angeles:20070829T163000
DTSTART;TZID=America/Los_Angeles:20070829T150000
LOCATION:11 Large
SUMMARY:Summer Intern Presentations: Composition of Tree Transducers AND Using the Perceptron Algorithm to Tune Large Numbers of Feature Weights for Syntax-Based Statistical Machine Translation
UID:20070829T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Automatic word alignment plays a critical role in statistical machinetranslation. Unfortunately the relationship between alignment quality andstatistical machine translation performance has not been well understood.In the recent literature the alignment task has frequently been decoupledfrom the translation task, and assumptions have been made about measuringalignment quality for machine translation which, it turns out, are notjustified. In particular, none of the tens of papers published over thelast five years has shown that significant decreases in Alignment ErrorRate (AER) result in significant increases in translation quality. I willexplain this state of affairs and present steps towards measuringalignment quality in a way which is predictive of statistical machinetranslation quality.I will also provide a brief overview of some of my other work on trainingand search for word alignment.

DTEND;TZID=America/Los_Angeles:20060203T163000
DTSTART;TZID=America/Los_Angeles:20060203T150000
LOCATION:11 Large
SUMMARY:Measuring Word Alignment Quality for Statistical Machine Translation
UID:20060203T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: My presentation will overview recent activities on Chinese-English SMTcarried out at ITC-irst (Trento, Italy).  After an overview of thecomplete architecture of our system, I will focus on progress made inChinese word-segmentation, phrase-based modeling and decoding, log-linearmodeling and minimum error training, and language model adaptation.Experimental results will be provided in terms of Bleu and Nist scores ontwo translation tasks:  basic traveling expressions and news reports,respectively adopted by the C-STAR consortium and for the 2002 and 2003NIST MT evaluation campaigns.Bio:Marcello Federico has been a permanent researcher at ITC-irst since 1991.During 1998-2003, he led the "Multilingual natural speech technologies"(MUNST)  research line at ITC-irst.  Since 2004, he is head of the"Cross-language information processing" (Hermes) research line. Hisinterests include automatic speech recognition, statistical languagemodeling, information retrieval, and machine translation.

DTEND;TZID=America/Los_Angeles:20040617T163000
DTSTART;TZID=America/Los_Angeles:20040617T150000
LOCATION:4th Floor
SUMMARY:Statistical Machine Translation at ITC-irst
UID:20040617T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Jointly parsing two languages has been shown to improve accuracies oneither or both sides. However, its search space is much bigger thanthe monolingual case, forcing existing approaches to employcomplicated modeling and crude approximations. Here we propose a muchsimpler alternative, bilingually-constrained monolingual parsing,where a source-language parser learns to exploit reorderings asadditional observation, but not bothering to build the target-sidetree as well. We show specifically how to enhance a shift-reducedependency parser to use alignment features to resolve shift-reduceconflicts. Experiments on the bilingual portion of Chinese Treebankshow that, with just 3 bilingual features, we can improve parsingaccuracies by 0.6% for both English and Chinese, with negligible (~6%)efficiency overhead, thus much faster than biparsing.http://www.cis.upenn.edu/~lhuang3/biparsing.pdf

DTEND;TZID=America/Los_Angeles:20090821T161500
DTSTART;TZID=America/Los_Angeles:20090821T150000
LOCATION:4th Floor Conference Room
SUMMARY:Bilingually-Constrained (Monolingual) Shift-Reduce Parsing
UID:20090821T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This talk summarizes our experience with searching for small modelsfor syntax-based machine translation.  I will first present casessuggesting that smaller models are desirable, and present someevidence that minimizing model size is a reasonable objectivefunction.  I will then show cases where this objective may be tooaggressive.

DTEND;TZID=America/Los_Angeles:20100827T160000
DTSTART;TZID=America/Los_Angeles:20100827T153000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Intern Final Talk: Small is beautiful. Is it any good?
UID:20100827T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Neural Machine Translation is powerful but we know little about the black box. We conduct the following two investigations to gain a better understanding: First, we investigate how neural, encoder-decoder translation systems output target strings of appropriate lengths, finding that a collection of hidden units learns to explicitly implement this functionality. Second, we investigate whether a neural, encoderdecoder translation system learns syntactic information on the source side as a by-product of training. We propose two methods to detect whether the encoder has learned local and global source syntax. A fine-grained analysis of the syntactic structure learned by the encoder reveals which kinds of syntax are learned and which are missing.Bio: Xing Shi is a PhD student at ISI working with Prof. Kevin Knight.

DTEND;TZID=America/Los_Angeles:20161014T160000
DTSTART;TZID=America/Los_Angeles:20161014T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:EMNLP practice talk: Understanding Neural Machine Translation: length control and syntactic structure
UID:20161014T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I present my summer project  - writing rule-based software forsimplifying texts. Task definition and motivations will bediscussed, as well as human and automatic evaluation, thelatter using a question answering system.This is joint work with Daniel Marcu and Kevin Knight.

DTEND;TZID=America/Los_Angeles:20030915T160000
DTSTART;TZID=America/Los_Angeles:20030915T143000
LOCATION:11 Large
SUMMARY:Analyzing Sentences into Facts: Simple is Beautiful
UID:20030915T143000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Large corpora of parsed sentences with semantic role labels (e.g. PropBank)provide training data for use in the creation of high-performance automaticsemantic role labeling systems. Despite the size of these corpora,individual verbs (or rolesets) often have only a handful of instances inthese corpora, and only a fraction of English verbs have even a singleannotation. In this paper, we describe an approach for dealing with thissparse data problem, enabling accurate semantic role labeling for novelverbs (rolesets) with only a single training example. Our approach involvesthe identification of syntactically similar verbs found in PropBank, thealignment of arguments in their corresponding rolesets, and the use of theircorresponding annotations in PropBank as surrogate training data.

DTEND;TZID=America/Los_Angeles:20070601T160000
DTSTART;TZID=America/Los_Angeles:20070601T153000
LOCATION:11 Large
SUMMARY:Generalizing Semantic Role Annotations Across Syntactically Similar Verbs
UID:20070601T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Previous research has indicated that when a polysemous word appears twoor more times in a discourse, it is extremely likely that they will allshare the same sense (Gale et al. 92). However, those results werebased on a coarse-grained distinction between senses (e.g, {\emsentence} in the sense of a `prison sentence' vs. a `grammaticalsentence'). I conducted an analysis of multiple senses within twosense-tagged corpora, Semcor and DSO. These corpora used WordNet fortheir sense inventory. I found significantly more occurrences ofmultiple-senses per discourse than reported in (Gale et al. 92) (33\%instead of 4\%). I also found classes of ambiguous words in which asmany as 45\% of the senses in the class co-occur within a document. Iwill discuss the implications of these results for the task ofword-sense tagging and for the way in which senses should berepresented.

DTEND;TZID=America/Los_Angeles:20031219T163000
DTSTART;TZID=America/Los_Angeles:20031219T150000
LOCATION:11 Large
SUMMARY:More than One Sense Per Discourse
UID:20031219T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The Mouse Genome Informatics database (MGI) has participated extensively in shared NLP challenges focussed on developing infrastructure for their use. This collaboration has advanced the field of applying NLP to biomedical text but has not yet generated workable technology for use in the lab. In advance of a workshop (Monday August 19, 2013 at ISI) dedicated to this subject, I will describe the SciKnowMine project to introduce the domain of biomedical NLP and to showcase how we can collaboratively accelerate the process of biocuration, making these important databases far more effective.Students, colleagues! You are very welcome to the workshop: <a href=http://www.isi.edu/projects/sciknowmine/sciknowmine_release_workshop_-_bridging_bionlp_and_biocuration>http://www.isi.edu/projects/sciknowmine/sciknowmine_release_workshop_-_bridging_bionlp_and_biocuration</a>

DTEND;TZID=America/Los_Angeles:20130816T160000
DTSTART;TZID=America/Los_Angeles:20130816T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Bridging Between Bioinformatics and Natural Language Processing
UID:20130816T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: NLP applications such as Question Answering (QA), Information Extraction (IE), or Machine Translation (MT) are incorporating increasing amounts of semantic information. A fundamental building block of semantic information is the relation between a predicate and its arguments, e.g. eat(John,burger). In order to reason at higher levels of abstraction, it is useful to group relation instances according to the types of their predicates and the types of their arguments. For example, while eat(Mary,burger) and devour(John,tofu) are two distinct relation instances, they share the underlying predicate and argument types INGEST(PERSON,FOOD).A central question is: where do the types and relations come from?The subfield of NLP concerned with this is relation extraction, which comprises two main tasks:1. identifying and extracting relation instances from text2. determining the types of their predicates and argumentsThe first task is difficult for several reasons. Relations can express their predicate explicitly or implicitly. Furthermore, their elements can be far part, with unrelated words intervening. In this thesis, we restrict ourselves to relations that are explicitly expressed between syntactically related words. We harvest the relation instances from dependency parses.The second task is the central focus of this thesis. Specifically, we will address these three problems: 1) determining argument types 2) determining predicate types 3) determining argument and predicate types. For each task, we model predicate and argument types as latent variables in a hidden Markov models. Depending on the type system available for each of these tasks, our approaches range from unsupervised to semi-supervised to fully supervised training methods.The central contributions of this thesis are as follows:1. Learning argument types (unsupervised): We present a novel approach that learns the type system along with the relation candidates when neither is given. In contrast to previous work on unsupervised relation extraction, it produces human-interpretable types rather than clusters. We also investigate its applicability to downstream tasks such as knowledge base population and construction of ontological structures. An auxiliary contribution, born from the necessity to evaluate the quality of human subjects, is MACE (Multi-Annotator Competence Estimation), a tool that helps estimate both annotator competence and the most likely answer.2. Learning predicate types (unsupervised and supervised): Relations are ubiquitous in language, and many problems can be modeled as relation problems. We demonstrate this on a common NLP task, word sense disambiguation (WSD) for prepositions (PSD). We use selectional constraints between the preposition and its argument in order to determine the sense of the preposition. In contrast, previous approaches to PSD used n-gram context windows that do not capture the relation structure. We improve supervised state-of-the-art for two type systems.3. Argument types and predicates types (semi-supervised): Previously, there was no work in jointly learning argument and predicate types because (as with many joint learning tasks) there is no jointly annotated data available. Instead, we have two partially annotated data sets, using two disjoint type systems: one with type annotations for the predicates, and one with type annotations for the arguments. We present a semisupervised approach to jointly learn argument types and predicate types, and demonstrate it for jointly solving PSD and supersense-tagging of their arguments. To the best of our knowledge, we are the first to address this joint learning task.Our work opens up interesting avenues for both the typing of existing large collections of triple stores, using all available information, and for WSD of various word classes.

DTEND;TZID=America/Los_Angeles:20130503T160000
DTSTART;TZID=America/Los_Angeles:20130503T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Learning Semantic Types and Relations from Text (Defense Practice Talk)
UID:20130503T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Narratology analyzes the discursive structure of narratives as finalizedproducts of human invention, such as novels, short-stories, orfairy-tales. Those narratives are rendered in a given surface form;Narratology focuses on narratives in natural language. Narratologistsassume that each narrative surface representation is associated with aneutral, abstract event sequence, the "Story" (histoire, sjuzhet). Theabstractness of Story is illustrated by the fact that the same Story canbe realized in different surface texts. By discursive structure or"Discourse" (discours, fabula), narralogists mean the relation between anabstract Story and its concrete expression in a sequential text. Forexample, if the chronological order of the Story is not respected in itstextual recount, we are dealing with the Discourse parameter of order.Other Discourse parameters include the frequency with which Story eventsare evoked, the point of view from which they are narrated (perceived,evaluated,...), or framed narratives with several narrative levels.The Story Generator Algorithms project at the University of Hamburgevaluated several existing Story Generators with respect to theirdiscursive abilities. It became obvious that most Story Generatorsconcentrate on creating a coherent and chronological abstract Story,which is directly mapped onto natural language. This results in apredominance of 1:1 relations between Story and surface, and in mostcases corresponds to a default or zero instantiation of Discourseparameters. As a consequence, Story Generator outputs tend to be veryexplicit and straightforward, and are likely to be perceived as uniformand boring.Narratological expert knowledge might be useful to future enhanced StoryGenerators and to Natural Language Generation systems dealing withnarrative. One of the aims of Computational Narratology is to model thatexpert knowledge. Ideally, narratological knowledge will be integratedinto a Narratological Structurer, as a processing component of anadvanced system that creates narratives. In such a system, theNarratological Structurer will be the interface between a Story Generatorand subsequent Natural Language Generation modules. The talk alsopresents examples of the knowledge that is being modelled.About the Speaker:Birte LÃ¶nneker graduated from the University of Hamburg, Germany, with adegree in French with Finno-Ugristics (Finnish) and BusinessAdministration. Since then, her main fields of publication are CognitiveLinguistics and electronic resources for Natural Language Processing,with special focus on frames and metaphors, as well as electronicdictionaries, corpora, and recently part-of-speech tagging. Her PhD onConcept Frames and Relations, also published as a book in 2003, wasco-supervised at the Institute for Romance Languages and at theDepartment of Informatics in Hamburg. For her Slovenian-German onlinedictionary, Birte LÃ¶nneker was twice awarded the EURALEX Laurence UrdangAward. From 2002 to 2004, she received various research grants forSlovenia, where she was working in the Corpus Laboratory of the Instituteof Slovenian Language.Since 2004, Birte LÃ¶nneker carries out research on Story GeneratorAlgorithms within the Narratology Research Group Hamburg. She is also aboard member of the German Cognitive Linguistics Association.

DTEND;TZID=America/Los_Angeles:20050620T113000
DTSTART;TZID=America/Los_Angeles:20050620T100000
LOCATION:11 Small
SUMMARY:Between Story Generation and Natural Language Generation
UID:20050620T100000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 11,001 New Features for Statistical Machine Translation (David Chiang)- Winner of Best Paper Award at NAACL/HLT 2009We use the Margin Infused Relaxed Algorithm of Crammer et al. to add alarge number of new features to two machine translation systems: theHiero hierarchical phrase based translation system and oursyntax-based translation system. On a large-scale Chinese-Englishtranslation task, we obtain statistically significant improvements of+1.5 BLEU and +1.1 BLEU, respectively. We analyze the impact of the new features and the performance of the learning algorithm.

DTEND;TZID=America/Los_Angeles:20090515T160000
DTSTART;TZID=America/Los_Angeles:20090515T150000
LOCATION:4th flr CR
SUMMARY:Practice talks for NAACL HLT
UID:20090515T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Composing, revising, and editing are highly demanding tasks. Even in polishedand published texts from professional writers we can observe errors and mistakes.For many errors, we can infer how they came to be: Word processors offercharacter-based functions only. These functions do not take into accountelements and structures of the language the author is using. Authors are thusforced to translate their high-level goals into long and complex sequencesof low-level character-based functions. Both the translation process and theexecution of such sequences of functions are error-prone.However, in text editors for programmers ww find so-called language-awareediting functions. These functions operate on the elements and structures of aprogramming or mark-up language and help to avoid errors, as language-awarefunctions make revising and editing less tedious and error-prone.We argue that the concept of language awareness can be transferred to writingnatural language texts using word processors. We propose functions that take thestructures of natural languages into consideration. We distinguish informationfunctions, movement functions, and operations to support revising and editing.The design is based on current findings from writing research.Language-aware editing functions rely on the recognition and categorizationof relevant elements and structures with respect to a certain language. Weuse methods and resources from computational linguistics for morphologicalanalysis and generation, and for part-of-speech tagging. When evaluatingrespective resources we face a rather disappointing situation: NLP resourcesfor German are less suitable than assumed and less applicable for real-worldapplications than usually claimed in the literature.Our prototypical implementation of language-aware functions for revising andediting of German texts serves as a proof of concept. The implementationillustrates opportunities and limits of current NLP resources for German.

DTEND;TZID=America/Los_Angeles:20110916T160000
DTSTART;TZID=America/Los_Angeles:20110916T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Linguistically supported editing and revising: concept and prototypical implementation based on interactive NLP resources
UID:20110916T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 3:30pm  Mark Hopkins (UCLA)Tree Sequence Automata: A Unifying Framework for Tree Relation FormalismsThere exist a wide variety of competing formalisms for representing alanguage of ordered tree pairs.  These include (bottom-up and top-down)tree transducers, synchronous tree-substitution grammars (STSGs),synchronous tree-adjoining grammars (STAGs), and inversion transductiongrammars (ITGs).  Since these formalisms have all developed independentlyof one another, it is difficult to compare their respectiverepresentational power.  This work seeks to make this task simpler byviewing these formalisms as instances of a general unifying formalism,which we call tree sequence automata (TSA).  By casting these differentformalisms in a single framework, we can compare them directly by studyingthe specific subclass of TSA that they fall into.4:00pm  Jason Riesa (Johns Hopkins)A case study in building a cost-effective speech-to-speech machine translation system with sparse resources: English - Iraqi ArabicThe Arabic spoken dialect of Iraq is a language deprived of the vastresources that researchers enjoy when working with its writtencounterpart, Modern Standard Arabic (MSA). The Iraqi Arabic lexicon andgrammar are also sufficiently distinct so that the use of existing toolsor corpora for MSA yield little or no positive effect on machinetranslation output quality.  One can see that building a machinetranslation system normally dependent on a large parallel corpus is aparticularly difficult task when given just a 37,000 line translatedparallel text based on transcribed speech. This talk will explore theconstraints involved in working with this type of data, how we endeavoredto mitigate such problems as a non-standard orthography and a highlyinflected grammar, and propose a cost- effective way for dealing with suchprojects in the future.4:30pm  Preslav Nakov (UC Berkeley)Multilingual Word AlignmentRecently there has been a growing number of available multilingualparallel texts. One such source is the European Union, which publishes itsofficial documents in the official languages of all member states(sometimes also in the languages of the candidates). Another source arethe United Nations. These corpora are a great source of training data formachine translation between new language pairs. But they also offer theopportunity to obtain better pairwise word alignments by looking atmultiple languages in parallel. In this talk I will present my research asa summer intern at ISI on getting better French (Fr) to English (En) wordalignments using an additional language (Xx). First, I will introduce twoheuristics which start with pairwise alignments between Fr-Xx, En-Xx andFr-En and then combine them probabilistically (in a linear model) orgraph-theoretically (by looking at in- and out-degrees for each word).Then I will present two Model1 inspired alignment models: (a) from "Fr andXx" to En; and (b) from Fr to "En and Xx".

DTEND;TZID=America/Los_Angeles:20050824T170000
DTSTART;TZID=America/Los_Angeles:20050824T153000
LOCATION:11 Large
SUMMARY:Summer Student Presentations
UID:20050824T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Models that align phrases instead of words offer anappealing alternative to the standard relative frequency estimates ofphrase translation probabilities.  But, while some effective wordalignment models (Model 1, Model 2 & HMM) can be estimated tractablywith EM, phrase alignment models cannot.  I'll talk about how to showthat estimation and inference under these models is intractable.Then, I'll present two useful approximation techniques.First, I'll talk about how to cast phrase alignment search as aninteger linear programming (ILP) problem and find the optimalalignment reliably and quickly with off-the-shelf ILP software.  Someapplications of this technique include training phrase alignmentmodels and interpreting the output of word alignment models.Second, we'll look at how to estimate translation probabilities undera phrase alignment model using a Gibbs sampling procedure.  Thesampler has some nice asymptotic convergence properties and also seemsto produce good results in practice. I'll walk through the differentmodels we've trained and how they performed.Time permitting, I'll also talk about some of the ways in which wecould potentially extend this work to syntactic MT.

DTEND;TZID=America/Los_Angeles:20080509T160000
DTSTART;TZID=America/Los_Angeles:20080509T150000
LOCATION:11 Large
SUMMARY:Inference in phrase alignment models
UID:20080509T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Two decades after their invention, the IBM word-based translation models, widely available in the GIZA++ toolkit, remain the dominant approach to word alignment and an integral part of many statistical translation systems. Although many models have surpassed them in accuracy, none have supplanted them in practice.We propose a simple extension to the IBM models: an l0 prior to encourage sparsity in the word-to-word translation model. This extension has been implemented in GIZA++ and scales to large-scale data . We achieve significant improvements over IBM Model 4 in both word alignment and translation quality.This is a practice talk for ACL.Bio:Ashish Vaswani is a PhD student at ISI.

DTEND;TZID=America/Los_Angeles:20120703T160000
DTSTART;TZID=America/Los_Angeles:20120703T150000
LOCATION:4th Floor Conference Room
SUMMARY:Smaller Alignment Models for Better Translations: Unsupervised Word Alignment with the l0-norm
UID:20120703T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: TBA

DTEND;TZID=America/Los_Angeles:20050218T163000
DTSTART;TZID=America/Los_Angeles:20050218T150000
LOCATION:11 Large
SUMMARY:TBA
UID:20050218T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This talk will be a continuation of topics from Monday's talk.

DTEND;TZID=America/Los_Angeles:20100610T170000
DTSTART;TZID=America/Los_Angeles:20100610T160000
LOCATION:10th Floor Conference Room
SUMMARY:"Bayesian models of language acquisition" or "Where do the rules come from?" (continued from 7 Jun 2010)
UID:20100610T160000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Due to the availability of large amounts of training data andcomputational resources, building more complex models with sentencelevel knowledge and longer dependencies has been an active area ofresearch in automatic speech recognition (ASR). Yet, due to thecomplexity of the speech recognition task, integration of many ofthese complex and sophisticated knowledge sources into the firstdecoding pass is not feasible. Many of these long-span models cannotbe represented as weighted finite-state automata (WFSA), making itdifficult even to incorporate them in a lattice rescoring pass.First, we motivate our work by providing compelling empirical evidencethat n-gram LMs are not sufficient for ASR task and why we need toincorporate non-local features such as syntax.  The development oflanguage models with such long-span (non-local) features is underway,but is not addressed in this talk.  We instead address how such modelsshould be trained discriminatively and applied effectively.Specifically, we describe a new approach for rescoring speech latticeswith such models (acoustic or language) that does not entailcomputationally intensive lattice expansion or limited rescoring ofonly an N -best list.We view the set of word-sequences in a lattice as a discrete space anddevelop a hill climbing technique to start with, say, the 1-besthypothesis under the lattice-generating model(s) and iterativelyimprove it using the new model. We demonstrate empirically that toachieve the same reduction in error rate using a better estimated,higher order LM, our technique evaluates fewer hypotheses thanconventional N-best rescoring by up to two orders of magnitude.We also propose to integrate the idea of hill climbing into thetraining of discriminative language models with non-local sentencelevel features. Discriminative models provide the flexibility toinclude both local n-gram features and arbitrary sentence levelfeatures. However, unlike generative LMs with long-span dependencieswhere one has to resort to N-best lists only during decoding(rescoring), discriminative models force the use of N-best lists evenfor LM training. We demonstrate significant computational saving during training as well as error-rate reduction over N-best training methods.Bio:Ariya Rastrow is a Ph.D. candidate at Johns Hopkins University,working with Sanjeev Khudanpur and Mark Dredze. He was initiallyadvised by Fred Jelinek. The focus of his PhD research is to advancespeech recognition systems to efficiently incorporate linguisticallymotivated non-local features into language models. In his recent work,he has developed an efficient hill-climbing algorithm to applynon-local complex models for the speech recognition task. He has alsoworked on out-of-vocabulary (OOV) detection, spoken term detection andsemi-supervised adaptation techniques for speech recognition.

DTEND;TZID=America/Los_Angeles:20111104T160000
DTSTART;TZID=America/Los_Angeles:20111104T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Going beyond n-grams: Incorporating non-local dependencies for Speech Recognition
UID:20111104T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: (This is a practice talk for a paper by Giorgio Satta and Enoch Peserico)This paper investigates some computational problems associated withprobabilistic translation models that have recently been adopted in theliterature on machine translation. These models can be viewed as pairs ofprobabilistic context-free grammars working in a `synchronous' way. Twohardness results for the class NP are reported, along with an exponentialtime lower-bound for certain classes of algorithms that are currently usedin the literature.

DTEND;TZID=America/Los_Angeles:20050930T163000
DTSTART;TZID=America/Los_Angeles:20050930T150000
LOCATION:4 Large
SUMMARY:Some Computational Complexity Results for Synchronous Context-Free Grammars
UID:20050930T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This dissertation studies how people describe emotions with languageand how computers can simulate this descriptive behavior.  Althoughmany non-human animals can express their current emotions as socialsignals, only humans can communicate about emotions symbolically.This symbolic communication of emotion allows us to talk aboutemotions that we may not currently be feeling, for example describingemotions that occurred in the past, gossiping about the emotions ofothers, and reasoning about emotions hypothetically.  Another feature of thisdescriptive behavior is that we talk about emotions as if they werediscrete entities, even though we may not always have necessary andsufficient observational cues to distinguish one emotion from another,or even to say what is and is not an emotion. This motivates us tofocus on aspects of meaning that are learned primarily throughlanguage interaction rather than by observations through the senses.To capture these intuitions about how people describe emotions, wepropose the following thesis: natural language descriptions of emotionare definite descriptions that refer to intersubjective theoretical entities.We support our thesis using theoretical, experimental, computationalresults. The theoretical arguments use Russell's notion of definitedescriptions, Carnap's notion of theoretical entities, and thequestion-asking period in child language acquisition. The experimentaldata we collected include dialogs between humans and computers andweb-based surveys, both using crowd-sourcing on Amazon MechanicalTurk. The computational models include a dialog agent based onsequential Bayesian belief update within a generalized pushdown automaton,as well as a fuzzy logic model of similarity and subsethood between emotion terms.For future work, we propose a research agenda that includes acontinuation of work on the emotion domain as well as new work onother domains where subjective descriptions are established throughnatural language communication.Short Bio:Abe Kazemzadeh is a PhD candidate at the USC Computer Science Dept anda research assistant at the the Signal Analysis and InterpretationLaboratory (SAIL). His interests include natural language, logic,emotions, games, and algebra. He is currently the chief technologyofficer at the USC Annenberg Innovation Laboratory (AIL).

DTEND;TZID=America/Los_Angeles:20130111T160000
DTSTART;TZID=America/Los_Angeles:20130111T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Natural Language Description of Emotion (Ph.D. Thesis Defense Practice Talk)
UID:20130111T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I'll give a survey of trees and grammars, at least the parts that seemmost relevant to ongoing work at ISI.  This will be a theory talk.  I'llstart with context-free grammars, which were developed in the 1950s, andcover other tree-generating systems.  I'll also talk abouttree-transforming systems.

DTEND;TZID=America/Los_Angeles:20040709T163000
DTSTART;TZID=America/Los_Angeles:20040709T150000
LOCATION:11 Large
SUMMARY:Survey of Trees and Grammars
UID:20040709T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Bake-offs, shared tasks, evaluations: these are names for short,high-stress periods in many CS researchers' lives where theiralgorithms and models are exposed to unseen data, often withreputations and funding on the line. Evaluations are sometimesperceived to be the bane of much of our work lives. Wegrouse about metrics, procedures, glitches, and all thetime "wasted" chasing scores, rather than doing RealScience (TM).  In this talk I will argue that despite valid criticismsof the approach, coordinated evaluation is a net benefit to NLPresearch and has led to accomplishments that might not have otherwisearisen. This argument will frame a more in-depth discussion of severalpieces of recent evaluation-grounded work: rapid generation oftranslation and information extraction for low-resource surpriselanguages (DARPA LORELEI) and organization of SemEval sharedtasks in semantic parsing and generation.Jonathan May is a Research Assistant Professor at the University ofSouthern California's Information Sciences Institute(USC/ISI). Previously, he was a research scientist at SDL Research(formerly Language Weaver) and a scientist at Raytheon BBNTechnologies. He received a Ph.D. in Computer Science from theUniversity of Southern California in 2010 and a BSE and MSE inComputer Science Engineering and Computer and Information Science,respectively, from the University of Pennsylvania in 2001. Jon'sresearch interests include automata theory, natural languageprocessing, machine translation, and machine learning.

DTEND;TZID=America/Los_Angeles:20170120T160000
DTSTART;TZID=America/Los_Angeles:20170120T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:How I Learned to Stop Worrying and Love Evaluations (and Keep Worrying)
UID:20170120T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Natural language is riddled with many ambiguities, greatly complicatingnatural language processing tasks. Current parsers reconstruct thesyntax of sentences without addressing the numerous ambiguities oflanguage. This talk discusses a proposed solution forsemantically-enriched parsing that consists of ontological resources,datasets, and tools that can be used to produce more informative parsesof English sentences. The resulting parses consist not only of syntacticstructure, but also semantic interpretations for noun compounds,preposition senses, and possessive constructions.

DTEND;TZID=America/Los_Angeles:20101112T160000
DTSTART;TZID=America/Los_Angeles:20101112T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Semantically-enriched Parsing for Natural Language Understanding (Ph.D. Proposal practice talk)
UID:20101112T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: There is abundant knowledge out there carried in the form of natural language texts, such as social media posts, scientific research literature, medical records, etc., which grows at an astonishing rate. Yet this knowledge is mostly inaccessible to computers and overwhelming for human experts to absorb. Information extraction (IE) processes raw texts to produce machine understandable structured information, thus dramatically increasing the accessibility of knowledge through search engines, interactive AI agents, and medical research tools. However, traditional IE systems assume abundant human annotations for training high quality machine learning models, which is impractical when trying to deploy IE systems to a broad range of domains, settings and languages. In this talk, I will present how to leverage the distributional statistics of characters and words, the annotations for other tasks and other domains, and the linguistics and problem structures, to combat the problem of inadequate supervision, and conduct information extraction with scarce human annotations.Nanyun Peng is a PhD candidate in the Department of Computer Science at Johns Hopkins University, affiliated with the Center for Language and Speech Processing and advised by Dr. Mark Dredze. She is broadly interested in Natural Language Processing, Machine Learning, and Information Extraction. Her research focuses on using deep learning for information extraction with scarce human annotations. Nanyun is the recipient of the Johns Hopkins University 2016 Fred Jelinek Fellowship. She has completed two research internships at IBM T.J. Watson Research Center, and Microsoft Research Redmond. She holds a master's degree in Computer Science and BAs in Computational Linguistics and Economics, all from Peking University.

DTEND;TZID=America/Los_Angeles:20170223T160000
DTSTART;TZID=America/Los_Angeles:20170223T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Representation Learning with Joint Models for Information Extraction
UID:20170223T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Most state-of-the-art techniques used in natural language processing (NLP) are supervised and require labeled training data. For example, statistical language translation requires huge amounts of bilingual data for training translation systems. But such data does not exist for all language pairs and domains. Using human annotation to create new bilingual resources is not a scalable solution. This raises a key research challenge: How can we circumvent the problem of limited labeled resources for NLP applications? Interestingly, cryptanalysts and archaeologists have tackled similar challenges in solving "decipherment problems".This thesis work aims to bring together techniques from classical cryptography, NLP and machine learning. We introduce a novel approach called "natural language decipherment" that can solve natural language problems without labeled (parallel) data. In this talk, we show how a wide variety of NLP problems can be formulated as decipherment tasks---for example, in statistical language translation one can view the foreign-language text as a cipher for English. Instead of relying on parallel training data, decipherment uses knowledge of the target language (e.g., English) and large quantities of readily available monolingual source (cipher) data to induce bilingual connections between the source and target languages. Using decipherment techniques, we make headway in attacking a hierarchy of problems ranging from letter substitution decipherment to sequence labeling problems (such as part-of-speech tagging) to language translation. Along the way, we make several key contributions---novel unsupervised algorithms that search for minimized models during decipherment and achieve state-of-the-art results on a number of important natural language tasks. Unlike conventional approaches, these decipherment methods can be easily extended to multiple domains and languages (especially resource-poor languages), thereby helping to spread the impact and benefits of NLP research.

DTEND;TZID=America/Los_Angeles:20110318T160000
DTSTART;TZID=America/Los_Angeles:20110318T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Deciphering Natural Language
UID:20110318T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract: Computational creativity is an emerging field of AI, with linguistic creativity being an interesting test-bed for developing and evaluating machines with reasoning capabilities. A concrete example is story generation and understanding, a task which unlike the vast majority of traditional NLP that treats sentences in isolation,  requires deep understanding of the general context and discourse of stories.In this talk, I will present some preliminary steps towards this goal and show how sequence-to-sequence models can be applied to this task. Overall, our results on story understanding are on par with current state-of-the-art  (that nevertheless have no generative capabilities), while at the same time producing sometimes rather amusing story endings.Bio: Angeliki is a final year PhD student at the Center for Mind/Brain Sciences of the University of Trento. She received her MSc from the Saarland University, where she worked with Ivan Titov and Caroline Sporleder on Bayesian models for sentiment and discourse. She is currently working at the intersection between language and vision under the supervision of Marco Baroni.Webcast: http://webcastermshd.isi.edu/Mediasite/Play/6f51b67c1a304a0c83297dd2f9b453921d

DTEND;TZID=America/Los_Angeles:20160803T115900
DTSTART;TZID=America/Los_Angeles:20160803T110000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Can machines understand and generate stories?
UID:20160803T110000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: (Yarowsky et al., 2001) present an algorithm for bootstrapping a POStagger for an arbitrary target language, using an existing POS tagger fora source language and a parallel corpus in the source and targetlanguages.  The source text is annotated with the POS tagger; the parallelcorpus is word-aligned; the POS tags are "projected" from source to targetlanguage; and finally smoothing is performed before training a POS taggerfor the target language on the projected annotations.I will talk about my work (jointly with my advisor, Steve Abney, at U. ofMichigan) in which we extend this algorithm by projecting from multiplesource languages onto a target language, then combining the outputs tocompute a consensus POS tagger.  Our hypothesis is that systematictransfer errors from different source-target pairs can be reduced by usingmultiple source languages.  I will present experimental results for threedifferent source languages (English, German, and Spanish), and twodifferent target languages (French and Czech).  Our results indicate thatusing multiple source languages improves performance.

DTEND;TZID=America/Los_Angeles:20050715T163000
DTSTART;TZID=America/Los_Angeles:20050715T150000
LOCATION:11 Large
SUMMARY:Inducing POS Taggers by Projecting from Multiple Source Languages
UID:20050715T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: How are concepts represented in the brain? When we hear the ringing of a bell, or watch a bell swinging back and forth, is there a shared "BELL" pattern of neural activity in our brains? Philosophers have debated the nature of concepts for centuries, but recent technical advances have allowed neuroscientists to make contributions to this topic. The combination of functional neuroimaging and machine learning has allowed us to examine distributed patterns of activity in the human brain to decode what they represent about the world, and to what level of abstraction. I describe our recent findings that revealed a hierarchical organization of multisensory information integration, leading to representations that generalize across different sensory modalities. I will also discuss our work on the social function of concepts, which enables the communication of similar thoughts and associations between individuals.Bio:I am a research associate at the Brain and Creativity Institute of the University of Southern California. I earned my Ph.D. at USC, mentored by Antonio Damasio. I am interested in the general problem of consciousness, and in particular how different sensations are bound together by the brain into a unified experience of the world.

DTEND;TZID=America/Los_Angeles:20141205T160000
DTSTART;TZID=America/Los_Angeles:20141205T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Multisensory integration in a neural framework for concepts
UID:20141205T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Lexical cohesion refers to structure created in a text by use of words withrelated meanings. Apart from its importance in theoretical and appliedlinguistics, lexical cohesion detection is used in NLP tasks like topicsegmentation, extractive summarization, spelling correction, etc.  However, theintuitive potential of lexical cohesion for such tasks is often not realized inpractice, possibly due to shortcomings of detection algorithms.I will briefly describe an experiment with readers aimed at providing reliabledata for a computational investigation of lexical cohesion. We then discuss anumber of informative features for cohesion detection, drawing on sources likeWordNet, distributional information, free associations, and the structure ofinformation in  the text itself.  Finally, I report experimentswith supervised learning of lexical cohesion.About the speaker:Beata Beigman Klebanov is a PhD candidate at the Hebrew University of Jerusalem,Israel, currently a visiting scholar at Northwestern University. Beata'sinterests are in experimental, computational and applied research in textpragmatics.

DTEND;TZID=America/Los_Angeles:20070105T163000
DTSTART;TZID=America/Los_Angeles:20070105T150000
LOCATION:11 Large
SUMMARY:Experimental and Computational Investigation of Lexical Cohesion in Texts
UID:20070105T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The Arabic language exhibits diglossia, i.e., the coexistence of two formsof language, a variety with standard orthography and sociopolitical cloutwhich is not natively spoken by anyone (Modern Standard Arabic, MSA) andvarieties that are primarily spoken and lack writing standards (Arabicdialects). There are important resources currently available for MSA withmuch on-going NLP work; for example, there is an Arabic Treebank andseveral syntactic parsers for MSA.  However, Arabic dialect resources andNLP research are still at an infancy stage. I will present work done atthe Johns Hopkins CLSP Summer Workshop on parsing of Arabic dialects, inparticular, Levantine Arabic.  We have experimented with three approachesto leveraging MSA resources to create a parser for Levantine Arabic, aswell as methods for induction of MSA-Levantine translation lexicons and aLevantine part-of-speech tagger. Using these methods we obtain errorreductions of up to 15% compared with applying an MSA parser directly toLevantine text.Rambow et al. Parsing Arabic Dialects: Final Report. Johns HopkinsUniversity Center for Language and Speech Processing Workshop 2005.http://www.clsp.jhu.edu/ws2005/groups/arabic/documents/finalreport.pdfChiang et al. Parsing Arabic Dialects. To appear in Proc. EACL 2006.This is joint work with O. Rambow, M. Diab, N. Habash, R. Hwa, K. Sima'an,V.  Lacey, R. Levy, C. Nichols and S. Shareef.

DTEND;TZID=America/Los_Angeles:20060210T160000
DTSTART;TZID=America/Los_Angeles:20060210T150000
LOCATION:11 Large
SUMMARY:Parsing Arabic Dialects
UID:20060210T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We show that phrase structures in Penn Treebank style parsesare not optimal for syntax-based machine translation. Weexploit a series of binarization methods to restructure thePeen Treebank style trees such that syntactified phrasessmaller than Penn Treebank constituents can be acquired andexploited in translation. We find that by employing the EMalgorithm for determining the binarization of a parse treeamong a set of alternative binarizations gives us the besttranslation result.

DTEND;TZID=America/Los_Angeles:20070525T153000
DTSTART;TZID=America/Los_Angeles:20070525T150000
LOCATION:11 Large
SUMMARY:Binarizing Syntax Trees to Improve Syntax-Based Machine Translation Accuracy
UID:20070525T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 

DTEND;TZID=America/Los_Angeles:20030801T160000
DTSTART;TZID=America/Los_Angeles:20030801T150000
LOCATION:11 Large
SUMMARY:Toward deciphering the 2-dimensional ancient Luwian script by discovering its writing order
UID:20030801T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The modeling of discourse has been a major topic of research in the linguistics and AI communities for decades. With respect to language, discourse phenomena refer to the use of linguistic indicators that reflect the functional organization of utterances, relationships between different utterances, with the interlocutors' state of mind and with the situational surrounding.The development of models of discourse that are operationalizable (as a part of NLP applications) is essential, for example, in machine translation:* to interpret, to translate and to generate pronouns, definite and indefinite NPs correctly,* to translate non-canonical constructions (e.g., passive),* to generate the correct word order (e.g., when translating into a free-word order language),* to insert or to drop discourse markers and conjunctions, or* to choose the appropriate type of syntactic embedding in complex sentences.In other branches of NLP, different aspects of discourse are important, e.g., relations between utterances (machine reading), the hierarchical organization of discourse (text summarization) and the sequential organization of utterances in a text (text structuring/natural language generation).Numerous models of different aspects of discourse have been proposed, including discourse structure (the hierarchical organization of utterances in discourse), discourse relations (relations between independent utterances in discourse), information structure (the functional structure of utterances in context), and information status (accessibility of antecedents of pronouns, definite descriptions and elliptic constructions). These approaches range from relatively abstract models from cognitive and functional linguistics (e.g.,  Givon 1983), over elaborate formal models developed in formal semantics (e.g., Asher 1993), to "parameterized", rule-based models in AI (e.g., Grosz et al. 1995).Since the mid-1990s, this traditional, "theory-centered" line of research has been complemented with an "annotation-centered" methodology, i.e., the development and the use of annotated corpora to test predictions and to develop statistical classifiers. In the first part of the talk, I describe selected activities of the applied computational linguistics group at the University of Potsdam/Germany in this direction, which include* the annotation of discourse structure, coreference, information structure and information status (Stede 2004, Krasavina and Chiarcos 2007, Ritz et al. 2008)* the development of generic multi-layer architectures capable to represent and to access these annotations along with other types of annotation  applied to the same stretch of data (Chiarcos et al. 2008), e.g., annotations for constituent syntax, dependency syntax, or frame semantics, and* the application of machine learning techniques to predict discourse features from less abstract annotation layers (Ritz 2007, Chiarcos 2011).The primary drawback of annotation-centered models are the immense cognitive (and thus, financial) efforts necessary to produce reliable discourse annotations. One way to address this problem is to make use of corpora without discourse annotations to test predictions of candidate models, and to develop unsupervised or weakly supervised approaches to support or to replace manual annotation.In the second part of my talk, this "data-centered" approach on discourse will be illustrated for the example of discourse relations, one of the main topics of my work at ISI. I describe a pilot study that shows that significant, reproducible and interpretable insights about the discourse relation (that is likely to be) connecting a pair of events can be achieved from a sufficiently large corpus with syntax annotations only. Further, possible lines for subsequent research will be sketched.Nicholas Asher (1993). Reference to Abstract Objects in Discourse. Kluwer, Dordrecht, 1993.Christian Chiarcos (2011). Evaluating salience metrics for the context-adequate realization of discourse referents. In: Proceedings of the 13th European Workshop on Natural Language Generation (ENLG 2011). Association of Computational Linguistics, Nancy, France, Sep 2011, 32-43.Christian Chiarcos, Stefanie Dipper, Michael Gotze, Ulf Leser, Anke LÃ¼deling, Julia Ritz, and Manfred Stede (2008). A Flexible Framework for Integrating Annotations from Different Tools and Tagsets. TAL (Traitement automatique des langues) 49 (2): 218-248.Talmy Givon (ed., 1983). Topic Continuity in Discourse: A Quantitative Cross-Language Study. John Benjamins, Amsterdam and Philadelphia.Barbara J. Grosz, Aravind K. Joshi, and Scott Weinstein (1995). Centering: A framework for modelling the local coherence of discourse. Computational Linguistics, 21(2):203â225.Olga Krasavina and Christian Chiarcos (2007). PoCoS - Potsdam Coreference Scheme. In Proceedings of the Linguistic Annotation Workshop. Held in Conjunction with the ACL-2007, Prague, Czech Republic, pages 156â163.Julia Ritz, Svetlana Petrova, Michael GÃ¶tze, and Stefanie Dipper (2007). Automatic Identification of Information Structure in Small Corpora of Modern and Old High German. GLDV-Fruhjahrstagung 2007, Tubingen, Germany.Julia Ritz, Stefanie Dipper, und Michael GÃ¶tze (2008). Annotation of Information Structure: An Evaluation Across Different Types of Texts. In Proceedings of the the 6th LREC conference. Marrakech, Morocco.Manfred Stede (2004). The Potsdam Commentary Corpus. In Bonnie Webber and Donna K. Byron, editors, Proceedings of the ACL-2004 Workshop on Discourse Annotation, Barcelona, pages 96â102.Biography:Christian Chiarcos, born 1977, studied Computer Science (MSc, 2002) and General Linguistics (MA, 2004) at the Technical University Berlin, Germany. From 2002 to 2003, he received a scholarship in the context of the project "Collocations in Dictionary" at the Berlin-Brandenburg Academy of Science under the auspicion of Christiane Fellbaum (Princeton). From 2003 to 2005, he participated in the graduate school "Economy and Complexity in Language" at the Humboldt-Unversity at Berlin and the University of Potsdam, Germany, where he developed a corpus-based approach to predict syntactic alternations for Natural Language Generation. This research formed the basis for his PhD thesis "Mental Salience and Grammatical Form" (University of Potsdam, 2010).Since 2006, he worked in the Applied Computational Linguistics group at the University of Potsdam, Germany, where he participated in different research projects dedicated to the development of interoperable infrastructures for NLP and multi-layer corpora. Since 2007, this research was carried out in the context of the Collaborative Research Center "Information Structure", a multidisciplinary network of projects at the University of Potsdam and the Humboldt-University Berlin, dedicated to the study of discourse phenomena.

DTEND;TZID=America/Los_Angeles:20120427T160000
DTSTART;TZID=America/Los_Angeles:20120427T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Towards operationalizable models of discourse phenomena: Addressing discourse relations
UID:20120427T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I'll talk about some unsupervised learning experiments -- how I was satisfied with the initial results, how I became very dissatisfied, and how I became (somewhat) satisified again.

DTEND;TZID=America/Los_Angeles:20080111T163000
DTSTART;TZID=America/Los_Angeles:20080111T150000
LOCATION:11 Large
SUMMARY:How to Make EM Do What You Want
UID:20080111T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We discuss preliminary work on a possible approach to exploitingsyntax in an effective way for machine translation. The drivingguideline is to devise a machine translation system that can performeffectively, given a very limited quantity of parsed training data.

DTEND;TZID=America/Los_Angeles:20061127T163000
DTSTART;TZID=America/Los_Angeles:20061127T150000
LOCATION:11 Large
SUMMARY:Towards the Effective Exploitation of Syntax in Machine Translation
UID:20061127T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: For a large number of natural language processing (NLP) problems, we are concerned with finding semantic patterns from input sequences. In recurrent neural network (RNN) based approach, such pattern is âencodedâ in a vector called hidden state. Since Elmanâs âFinding structure in timeâ published in 1990, it has long been believed that the âmagic powerâ of RNNâs memory, which is enclosed inside the hidden state, can handle very long sequences. Yet besides some experimental observations, there is no formal definition of RNNâs memory, let alone a rigid mathematical analysis of how RNNâs memory forms.This talk will focus on understanding memory from two viewpoints. The first viewpoint is that memory is a function that maps certain elements in the input sequences to the current output. Such definition, for the first time in literature, allows us to do detailed analysis of the memory of simple RNN (SRN), long short-term memory (ELSTM), and gated recurrent unit (GRU). It also opens the door for further improving the existing RNN basic models. The end results are the proposal of a new basic RNN model called extended LSTM (ELSTM) with outstanding performance for complex language tasks, and a new macro RNN model called dependent bidirectional RNN (DBRNN) with smaller cross entropy than bidirectional RNN (BRNN) and encoder-decoder (enc-dec) models.The second viewpoint is that memory is a compact representation of sparse sequential data. From this perspective, the process of generating hidden state of RNN is simply dimension reduction. Thus, method like principal component analysis (PCA) which does not require labels for training becomes attractive. However, there are two known problems in implementing PCA for NLP problems: the first is computational complexity; the second is vectorization of sentence data for PCA. To deal with this problem, an efficient dimension reduction algorithm called tree -structured multi-linear PCA is proposed.Bio: Yuanhang Su received the dual B.S. degree in Electrical Engineering & Automation and Electronic & Electrical Engineering from University of Strathclyde, Glasgow, U.K. and Shanghai University of Electric Power, Shanghai, China, respectively in 2009, and the M.S. degree in Electrical Engineering from the University of Southern California, Los Angeles, CA, in 2010. From 2011 to 2015, he worked as image/video/camera software and algorithm engineer for a Los Angeles startup named Exaimage, Shanghai Aerospace Electronics Technology Institute in China and Huawei Technology in China consecutively. He joined MCL lab in 2016 spring, and is currently pursing his Ph.D. in computer vision, natural language processing and machine learning.

DTEND;TZID=America/Los_Angeles:20180413T160000
DTSTART;TZID=America/Los_Angeles:20180413T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Finding memory in time
UID:20180413T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Randomized data structures can help us scale discrete models encountered in NLP. This talk will describe their use in language modeling and present some more general related results.N-gram language models are fundamental to speech recognition and machine translation. Unfortunately, the n-gram parameter space grows exponentially with the dimension of the feature vector. I will describe how randomization can be used to remove the space-dependency of such models on the a priori parameter space.The novel extensions of the Bloom filter that I will present are able to take advantage of the entropy of the distribution of values assigned to feature vectors to save space in a discrete statistical model. I will review some results applying these models to language modeling in machine translation and relate their space-requirements to a novel lower bound on the general problem of querying a map of key/value pairs.No prior knowledge of randomized data structures will be assumed.

DTEND;TZID=America/Los_Angeles:20071012T163000
DTSTART;TZID=America/Los_Angeles:20071012T150000
LOCATION:11 Large
SUMMARY:Scalable Language Modeling: Breaking the Curse of Dimensionality
UID:20071012T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: As part of an effort to encode the commonsense knowledge we need innatural language understanding, I have been looking at several very commonwords and their uses in diverse corpora, and asking what we have to knowto understand this word in this context.  In this talk, I will describethe investigations of the uses of two words -- the adverb "now" and thepreposition "like".One might think that "now" simply expresses a temporal property of anevent.  But in fact in almost every instance, it is used to point up acontrast -- "This is true now.  Something else was true then."  It is thusmore of a relation than a property.  I will describe several categories ofsuch relations.  Another question of interest about "now" is "How long aperiod is the word "now" describing in its various uses?": "I'm typing anabstract now" vs. "We travel by automobile now."  I suggest somecategories of knowledge that need to be encoded to answer this question.When we successfully understand "A is like B", we have figured out someproperty that A and B have in common.  How can we find that propertycomputationally?  In the data I looked at, in 80% of the instances, theproperty is explicit in the nearby text, and I will talk about how we canidentify it.  For the remainder I examine the knowledge we would need inorder to infer the common property.

DTEND;TZID=America/Los_Angeles:20041022T163000
DTSTART;TZID=America/Los_Angeles:20041022T150000
LOCATION:11 Large
SUMMARY:Like Now:  Two Explorations in Deep Lexical Semantics
UID:20041022T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Inflected languages in a low-resource setting present a data sparsity problem forstatistical machine translation. In this work, we present a minimallysupervised algorithm for morpheme segmentation on Arabic dialectswhich reduces unknown words at translation time by over 50%, totalvocabulary size by over 40%, and yields a significant increase inBLEU score over a previous state-of-the-art phrase-based statistical MT system.

DTEND;TZID=America/Los_Angeles:20060825T160000
DTSTART;TZID=America/Los_Angeles:20060825T153000
LOCATION:11 Large
SUMMARY:Minimally Supervised Morphological Segmentation with Applications to Machine Translation
UID:20060825T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The syntax and semantics of human language can illuminate many individual psychological differences and important dimensions of social interaction. Thus, analysis of language provides important insights into the underlying psychological properties of individuals and groups. Accordingly, psychological and psycholinguistic research has begun incorporating sophisticated representations of semantic content to better understand the connection between word choice and psychological processes. While the majority of language analysis work in psychology has focused on semantics, psychological information is encoded not just in what people say, but how they say it. We introduce ConversAtion level Syntax SImilarity Metric (CASSIM), a novel method for calculating conversation-level syntax similarity. CASSIM estimates the syntax similarity between conversations by automatically generating syntactical representations of the sentences in conversations, estimating the structural differences between them, and calculating an optimized estimate of the conversation-level syntax similarity. Also, we conduct a series of analyses with CASSIM to investigate syntax accommodation in social media discourse. Further, building off of CASSIM, we propose ConversAtion level Syntax SImilarity Metric-Group Representations (CASSIM-GR). This extension builds generalized representations of syntactic structures of documents, thus allowing researchers to distinguish between people and groups based on syntactic differences.Bio: Reihane is a forth year Ph.D student at USC, working with Morteza Dehghani in Computational Social Science Laboratory. She is interested in introducing new methods and computational models to psychology, and more broadly to social sciences. Her work spans the boundary between natural language processing and psychology, as does her intellectual curiosity.

DTEND;TZID=America/Los_Angeles:20170407T160000
DTSTART;TZID=America/Los_Angeles:20170407T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:ConversAtion level Syntax SImilarity Metric (CASSIM)
UID:20170407T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Human incremental sentence processing is the process by which we reada sentence, word-by-word, and ultimately comprehend its meaning.  Acentral question in sentence processing research is to understand theprecise nature of the linguistic representations that we constructwhile comprehending a sentence.  Experimental evidence demonstratesthat syntactic structure plays a role in these representations.  Butopen questions remain about the type of syntactic structure that ismost relevant to the human sentence processing mechanism: is thissyntactic structure sequential or hierarchical?  Does it includelexical information (in which case it is "lexicalized"), or is lexicalinformation processed independently from the syntactic structure (inwhich case the syntactic structure is "unlexicalized")?A previous study (Frank and Bod, 2011) compared unlexicalizedsequential and hierarchical models of human sentence processing, andfound that sequential models explain observed human behavior (e.g. eyemovements) during sentence processing better than hierarchical models.The authors concluded that the human sentence processing mechanism isinsensitive to hierarchical syntactic structure.We investigate this claim, and find a picture that is more complicatedthan the one presented by the previous study.  First, we show thatlexicalized syntactic models explain observed human behavior duringsentence processing better than unlexicalized syntactic models.Second, we consider a broader set of sequential and hierarchicalmodels, and show that the findings of (Frank and Bod, 2011) do notgeneralize to this broader set.  Finally, we show why, even within theset of models considered by (Frank and Bod, 2011), their findings arenot entirely conclusive.  Our results indicate that the claim that thehuman sentence processing mechanism is insensitive to hierarchicalsyntactic structure is premature.

DTEND;TZID=America/Los_Angeles:20121010T150000
DTSTART;TZID=America/Los_Angeles:20121010T140000
LOCATION:6th Floor Conference Room [689]
SUMMARY:Sequential vs. hierarchical syntactic models of human sentence processing
UID:20121010T140000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Natural language generation (NLG) is a well studied and still very challenging field in natural language processing. One of the less studied NLG tasks is the generation of creative texts such as jokes, puns, or poems. Multiple reasons contribute to the difficulty of research in this area. First, no immediate application exists for creative language generation. This has made the research on creative NLG extremely diverse, having different goals, assumptions, and constraints. Second, no quantitative measure exists for creative NLG tasks. Consequently, it is often difficult to tune the parameters of creative generation models and drive improvements to these systems. Finally, rule based systems for creative language generation are not yet combined with deep learning methods.In this work, we address these challenges for poetry generation which is one of the main areas of creative language generation. We introduce password poems as a novel application for poetry generation.  Furthermore, we combine finite-state machinery with deep learning models in a system for generating poems for any given topic. We introduce a quantitative metric for evaluating the generated poems and build the first interactive poetry generation system that enables users to revise system generated poems by adjusting style configuration settings like alliteration, concreteness and the sentiment of the poem.In order to improve the poetry generation system, we decide to borrow ideas from human literature and develop a poetry translation system. We propose to study human poetry translation and measure the language variation in this process. we will study how human poetry translation is different from human translation in general and whether a translator translates poetry more freely. Then we will use our findings to develop a machine translation system specifically for translating poetry and proposing metrics for evaluating the quality of poetry translation.Bio: Marjan Ghazvininejad is a PhD student at ISI working  with Professor Kevin Knight.

DTEND;TZID=America/Los_Angeles:20170818T160000
DTSTART;TZID=America/Los_Angeles:20170818T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Neural Creative Language Generation
UID:20170818T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 1) In this talk I describe Hafez, a program that generates any number of distinct poems on a user-supplied topic.  Poems obey rhythmic and rhyme constraints.  I describe the poetry-generation algorithm, give experimental data concerning its parameters, and show its generality with respect to language and poetic form.2) In this work, we present the first results for neuralizing an Unsupervised Hidden Markov Model. We evaluate our approach on tag induction. Our approach outperforms existing generative models and is competitive with the state-of-the-art though with a simpler model easily extended to include additional context.Marjan Ghazvininejad is a PhD student at ISI working with Prof. Kevin Knight.Yonatan Bisk is a Postdoc at ISI working with Prof. Daniel Marcu.

DTEND;TZID=America/Los_Angeles:20161021T160000
DTSTART;TZID=America/Los_Angeles:20161021T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:EMNLP practice talk: 1) Generating Topical Poetry & 2) Unsupervised Neural Hidden Markov Models
UID:20161021T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract: Information Extraction (IE) or the algorithmic extraction of named entities, relations and attributes of interest from text-rich data is an important natural language processing task. In this talk, I will discuss the relationship of IE to fine-grained Information Retrieval (IR), especially when the domain of interest is unusual i.e. computationally under-studied, socially consequential and difficult to analyze. In particular, such domains exhibit a significant long-tail effect, and their language models are obfuscated. Using real-world examples and results obtained in recent DARPA MEMEX evaluations, I will discuss how our search system uses semantic strategies to usefully facilitate complex information needs of investigative users in the human trafficking domain, even when IE outputs are extremely noisy. I briefly report recent results obtained from a user study conducted by DARPA, and the lessons learned thereof for both IE and IR research.Bio: Mayank Kejriwal is a computer scientist in the Information integration group at ISI. He received his Ph.D. from the University of Texas at Austin under Daniel P. Miranker. His dissertation involved domain-independent linking and resolving of structured Web entities at scale, and was published as a book in the Studies in the Semantic Web series. At ISI, he is involved in the DARPA MEMEX, LORELEI and D3M projects. His current research sits at the intersection of knowledge graph construction, search, inference and analytics, especially over Web corpora in unusual social domains.

DTEND;TZID=America/Los_Angeles:20170616T160000
DTSTART;TZID=America/Los_Angeles:20170616T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:From Noisy Information Extraction to Rich Information Retrieval in Unusual Domains
UID:20170616T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Many classic problems in natural language processing can be cast as building mapping from a complex input (e.g., a sequence of words) to a complex output (e.g., a syntax tree or semantic graph).  This task is challenging both because language is ambiguous (learning difficulties) and represented with discrete combinatorial structures (computational difficulties). Often these are at odds: the features you want to add to decrease learning difficulties cause nontrivial additional structure yielding worse computational difficulties.I will begin by discussing algorithms that side-step the issue of combinatorial blowup and aim to predict an output structure directly. I will then present approaches that explicitly learn to trade-off accuracy and efficiency, applied to a variety of linguistic phenomena. Moreover, I will show that in some cases, we can actually obtain a model that is faster and more accurate by exploiting smarter learning algorithms.Hal's homepage: http://www.umiacs.umd.edu/~hal/

DTEND;TZID=America/Los_Angeles:20140214T160000
DTSTART;TZID=America/Los_Angeles:20140214T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Predicting Linguistic Structures Accurately and Efficiently
UID:20140214T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Bilingual alignment serves as an integral step and the foundation inthe building of any state-of-the-art statistical machine translationsystem. It enables us to automatically learn and extract translationrules from hundreds of millions of words of bilingual text.Twenty years ago, the research area of machine translation wasbeginning to make use of the increasing availability and speed ofcomputing resources demanded by the ideas of a previous generation,notably Weaver (1949). The IBM translation models -- statisticalmodels for automatic word-to-word translation (Brown et al., 1990;Brown et al., 1993) - spurred a flurry of new statistical andempirical research in this area. They have become ubiquitous in thefield and are easy to train in an unsupervised fashion; Al-Onaizan etal. (1999) and Och and Ney (2003) have given us open-source toolkitsfor this purpose.However, there are many problems that still exist. The work presentedin this thesis proposal will eliminate many of the problems withalignment systems that have persisted for two decades, significantly improving machine translationquality and decidedly advancing the state-of-the-art. In achievingthis goal, we develop new models of bilingual alignment and efficientsearch algorithms for working with such models.

DTEND;TZID=America/Los_Angeles:20101115T170000
DTSTART;TZID=America/Los_Angeles:20101115T160000
LOCATION:4th Floor Conference Room [460]
SUMMARY:Structured Models for Bilingual Alignment (Ph.D. Proposal practice talk)
UID:20101115T160000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Semantic models of data sources and services provide support to automate many tasks such as source discovery, data integration, and service composition, but writing these semantic descriptions by hand is a tedious and time-consuming task. Most of the related work focuses on automatic annotation with classes or properties of source attributes or input and output parameters. However, constructing a source model that includes the relationships between the attributes in addition to their semantic types remains a largely unsolved problem. In this talk, we present a graph-based approach to hypothesize a rich semantic description of a new target source from a set of known sources that have been modeled over the same domain ontology. We exploit the domain ontology and the known source models to build a graph that represents the space of plausible source descriptions. Then, we compute the top k candidates and suggest to the user a ranked list of the semantic models for the new source. The approach takes into account user corrections to learn more accurate semantic descriptions of future data sources. Our evaluation shows that our method produces models that are twice as accurate than the models produced using a state of the art system that does not learn from prior models.Mohsen's webpage: http://www-scf.usc.edu/~taheriya/

DTEND;TZID=America/Los_Angeles:20140117T160000
DTSTART;TZID=America/Los_Angeles:20140117T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:A Graph-based Approach to Learn Semantic Descriptions of Data Sources
UID:20140117T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This talk is about an improved approach for learning dependency parsersfrom treebank data. Our technique is based on two ideas for improvinglarge margin training in the context of dependency parsing.  First, weincorporate local constraints that enforce the correctness of eachindividual link, rather than just scoring the global parse tree. Second,to cope with sparse data, we smooth the lexical parameters according totheir underlying word similarities using Laplacian Regularization.  Todemonstrate the benefits of our approach, we consider the problem ofparsing Chinese treebank data using only lexical features, that is,without part-of-speech tags or grammatical categories.  We achieve stateof the art performance, improving upon current large margin approaches.Here is the link for the paper:http://www.cs.ualberta.ca/~wqin/papers/depar_margin_conll06.pdfAbout the speaker:Qin Iris Wang is a Ph.D. student from the University of Alberta,working with Dekang Lin and Dale Schuurmans. Her research interestsare in natural language processing and machine learning. Specifically,she has been working on dependency parsing using both generative anddiscriminative methods.

DTEND;TZID=America/Los_Angeles:20060728T163000
DTSTART;TZID=America/Los_Angeles:20060728T150000
LOCATION:11 Large
SUMMARY:Improved Large Margin Dependency Parsing via Local Constraints and Laplacian Regularization
UID:20060728T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: What do we want to learn from a translation competition and how do we learn it with confidence? We argue that a disproportionate focus on ranking competition participants has led to lots of different rankings, but little insight about which rankings we should trust. In response, we provide the first framework that allows an empirical comparison of different analyses of competition results. We then use this framework to compare several analytical models on data from the Workshop on Machine Translation (WMT).

DTEND;TZID=America/Los_Angeles:20130823T160000
DTSTART;TZID=America/Los_Angeles:20130823T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Models of Translation Competitions (long paper at ACL2013)
UID:20130823T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this talk, I will describe the research we are undertaking at the Naval Research Laboratory which revolves around chat (such as Internet Relay Chat) and the problems it causes in the military domain. Chat has become a primary means for command and control communications in the US Navy. Unfortunately, its popularity has contributed to the classic problem of information overload. For example, Navy watchstanders monitor multiple chat rooms while simultaneously performing their other monitoring duties (e.g.,  tactical situation screens and radio communications). Some researchers have proposed how automated techniques can help to alleviate these problems, but very little research has addressed this problem.I will give an overview of the three primary tasks that are the current focus of our research. The first is urgency detection, which involves detecting important chat messages within a dynamic chat stream. The second is summarization, which involves summarizing chat conversations and temporally summarizing sets of chat messages. The third is human-subject studies, which involves simulating a watchstander environment and testing whether our urgency detection and summarization ideas, along with 3D-audio cueing, can aid a watchstander in conducting their duties.Short Bio: David Uthus is a National Research Council Postdoctoral Fellow hosted at the Naval Research Laboratory, where he is currently undertaking research focusing on analyzing multiparticipant chat. He received his PhD (2010) and MSc (2006) from the University of Auckland in New Zealand and his BSc (2004) from the University of California, Davis. His research interests include microtext analysis, machine learning, metaheuristics, heuristic search, and sport scheduling.

DTEND;TZID=America/Los_Angeles:20110805T160000
DTSTART;TZID=America/Los_Angeles:20110805T150000
LOCATION:4th Floor Large Conference Room [460]
SUMMARY:Overcoming Information Overload in Navy Chat
UID:20110805T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This talk will introduce languageFractal, an online system for human-augmented machine translation (MT) that aims to incorporate monolingual speakers into the translation pipeline in a cost-effective manner. The essential principle is to take a middle ground between pure MT and a fully crowdsourced approach by augmenting MT results with human corrections in an iterative cycle. To efficiently emit phrases and sentences to users and to effectively explore the space of possible translation options, we propose the use of determinantal point processes (DPPs), which can be used to model subset selection problems in which diversity of the subset is a desirable characteristic.I will provide a brief tutorial on DPPs (including L-ensembles and the structured variant), and I will present an overview of our formulation of DPPs for dynamic programming problems in the context of the human-augmented machine translation pipeline. I will also introduce the languageFractal pilot and pipeline, the full trials of which will run through the 2014-2015 academic year at Harvard University.Bio: Allen Schmaltz is a Ph.D. student in Computer Science in the School of Engineering and Applied Sciences at Harvard University (2013-present; S.M. 2014), working with Stuart Shieber. He is interested in formal, statistical, and human-augmented machine learning approaches for computational linguistics. Before starting his Ph.D. in Computer Science, he completed the better part of an additional Ph.D. in the (quantitative) social sciences at Harvard University (2010-2013), received a M.A. from Stanford University (2010), and received a B.A. from Northwestern University (2006). Earlier in his academic career he also studied at Cornell University and in Yokohama, Japan, among other places.

DTEND;TZID=America/Los_Angeles:20140822T160000
DTSTART;TZID=America/Los_Angeles:20140822T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Determinantal Point Processes for Human-Augmented Machine Translation [Intern talk]
UID:20140822T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Many research efforts are addressing the problem of enabling automaticsummarization of opinions and assessments stated on the web in productreviews, discussion forums, and blogs. One key difficulty is that relevantassessments scattered throughout web pages are obscured by variations innatural language. In this paper, we focus on a novel aspect of enablingaggregations of assessments of degree to which a given property holds fora given entity (for instance, how touristy is Boston). We presentGrainPile, a user interface for extracting from the web, aggregating andquantifying degree assessments of unconstrained topics. The interfaceprovides a variety of functions: a) identification of dimensions ofcomparison (properties) relevant to a particular entity or set ofentities, b) comparisons of like entities on user-specified properties(for example, which university is more prestigious, Yale or Cornell), c)tracing the derived opinions back to their sources (so that the reasonsfor the opinions can be found). A central contribution in GrainPile is theevaluated demonstration of feasibility of mapping the recognizedexpressions (such as fairly, very, extremely, and so on) to a common scaleof numerical values and aggregating across all the extracted assessmentsto derive an overall assessment of degree. GrainPile&#8217;s novelassessment and aggregation of degree expressions is shown to stronglyoutperform an interpretation-free, co-occurrence based method.Full paper:http://www.isi.edu/~timc/papers/IUI06-grainpile-chkl.pdf

DTEND;TZID=America/Los_Angeles:20060126T140000
DTSTART;TZID=America/Los_Angeles:20060126T130000
LOCATION:4th floor
SUMMARY:GrainPile: Deriving Quantitative Overviews of Free Text Assessments on the Web
UID:20060126T130000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract: Exponential growth in electronic health care data has resulted in new opportunities and urgent needs to discover meaningful data-driven representations and patterns of diseases. Recent rise of this research field with more available data and new applications also has introduced several challenges. In this talk, we will present our deep learning solutions to address some of the challenges. First, health care data is inherently heterogeneous, with a variety of missing values and from multiple data sources. We propose variations of Gated Recurrent Unit (GRU) to explore and utilize the informative missingness in health care data, and hierarchical multimodal deep models to utilize the relations between different data sources. Second, model interpretability is not only important but necessary for care providers and clinical experts. We introduce a simple yet effective knowledge distillation approach called interpretable mimic learning to learn interpretable gradient boosting tree models while mimicking the performance of deep learning models.Bio: Zhengping Che is a third year PhD candidate in the Computer Science Department at the University of Southern California, advised by Professor Yan Liu. Before that, he received his bachelor degree in Computer Science from Pilot CS Class (Yao Class) at Tsinghua University, China. His primary research interest lies in the area of deep learning and its applications in health care domain, especially on multivariate time series data.

DTEND;TZID=America/Los_Angeles:20160429T160000
DTSTART;TZID=America/Los_Angeles:20160429T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Deep learning solutions to computational phenotyping in health care
UID:20160429T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this paper, we analyze the effect of resampling techniques,including under-sampling and over-sampling used in active learning forword sense disambiguation (WSD). Experimental results show thatunder-sampling causes negative effects on active learning, butover-sampling is a relatively good choice. To alleviate thewithin-class imbalance problem of over-sampling, we propose abootstrap-based over-sampling (BootOS) method that works better thanordinary over-sampling in active learning for WSD. Finally, weinvestigate when to stop active learning, and adopt two strategies,max-confidence and min-error, as stopping conditions for activelearning. According to experimental results, we sug-gest a predictionsolution by considering max-confidence as the upper bound andmin-error as the lower bound for stopping conditions.

DTEND;TZID=America/Los_Angeles:20070601T153000
DTSTART;TZID=America/Los_Angeles:20070601T150000
LOCATION:11 Large
SUMMARY:Active Learning for Word Sense Disambiguation with Methods for Addressing the Class Imbalance Problem
UID:20070601T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The Scamseek project aims to build a surveillance tool for identifyingfinancial scams on the Internet by performing document classification ofInternet pages. There are three principle types of documents of concern:those that give financial advice by unregistered advisors, unlawfulinvestment schemes, and share ramping.The first phase of the project has been completed and a working system,known as ScamAlert installed at the Australian Securities and InvestmentCommission (ASIC). The independent audit of the performance of the systemproved satisfactory with a result for precision of .75, recall .43, andF=. 54, along with identification of 4 scams misclassified by the client.Significant improvement in recall is foreshadowed in the 2nd phase of theproject.  The results are satisfying in the context of the structure ofthe data where the density of scam documents is about 1.8% of the totalcorpus.The good performance of the operational system is ascribed to thecombination of using a strong linguistic model of language (SystemicFunctional Linguistics) to define the scam documents in parallel with arich statistical analysis of the structure of non-scam documents and scamlook-alikes. A large amount of the experimental program has concentratedon understanding and exploiting the interaction between the linguisticallydescribed aspects of the documents and the statistical properties. Eachtype of data has been used to inform and modify the usage of the other.The operational aspects of the project have proven to be as challenging asthe research objectives. The project has a budget of $2.2M over 15 months.It has been managed so as to create a balance in resources between theneeds of both the research objectives and the engineering objectives.Software development has concentrated on three aspects. Firstly, toproduce an environment for the strong directive management ofcomputational linguistics experiments, secondly, in the aid of thelinguists to create tools to support their manual analysis, and thirdlythe best practice of software engineering principles to ensure a cleanautomated rollout of the production system for ASIC.The contributing partners in the Scamseek project are The Capital MarketsCo-operative Research Centre (CMCRC), ASIC, the University of Sydney andMacquarie University.

DTEND;TZID=America/Los_Angeles:20040325T120000
DTSTART;TZID=America/Los_Angeles:20040325T103000
LOCATION:11 Large
SUMMARY:ScamSeek: Capturing Financial Scams at the Coalface by Language Technology
UID:20040325T103000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I will present work on the interpretation of descriptions of visual scenes such as 'A man is sitting on a chair and using the computer'. One application of this research is the automatic generation of 3D scenes which provides a way for non-artists to create graphical content and have wide-ranging applications in entertainment and education.The core task of text-to-scene generation involves understanding the high-level content of a description and translating it into a low-level representation representing a 3D scene as a set of relations between pre-existing 3D models. Linguistic, spatial, and world-knowledge inference is required in this process on different levels.My talk will present VigNet, a repository of lexical- and world knowledge needed for text-to-scene generation, which is based on FrameNet. I will also describe how visual scenes can be represented as directed graphs  and how information in VigNet can be encoded in Synchronous Hyperedge Replacement Grammars to enable semantic parsing and generation of a scene.Bio:Daniel Bauer is a PhD candidate at Columbia University. His research interests include lexical and computational semantics, semantic parsing, and formal grammars in syntax and semantics. He is a co-founder of WordsEye Inc, a company that aims to make text-to-3D-scene generation available to everyone on social media. Daniel is currently an intern at ISI for the second summer in a row. He received his undergrad degree in Cognitive Science from the University of OsnabrÃ¼ck, Germany, and a MSc in Language Science and Technology from Saarland University.

DTEND;TZID=America/Los_Angeles:20130712T160000
DTSTART;TZID=America/Los_Angeles:20130712T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Understanding Descriptions of Visual Scenes Using Graph Grammars
UID:20130712T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We present a preliminary study on unsupervised preposition sense disambiguation (PSD), comparing different models and training techniques (EM, MAP-EM with L0 norm, Bayesian inference using Gibbs sampling). To our knowledge, this is the ï¬rst attempt at unsupervised preposition sense disambiguation. Ultimately, we want to disambiguate prepositions not by and for themselves, but in the context of sequential semantic labeling. This should also improve disambiguation of the words linked by the prepositions (here, morning, shopped, and Rome). We propose using unsupervised methods in order to leverage unlabeled data, since, to our knowledge, there are no annotated data sets. Our best accuracy for PSD reaches 56%, a signiï¬cant improvement (at p < .001) of 16% over the most-frequent-sense baseline.This is a joint work with Ashish Vaswani, Stephen Tratz, David Chiang, and Eduard Hovy

DTEND;TZID=America/Los_Angeles:20110422T160000
DTSTART;TZID=America/Los_Angeles:20110422T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Models and Training for Unsupervised Preposition Sense Disambiguation
UID:20110422T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: (note: this is a very tentative title -- comments welcome!)We present a novel extension of syntax-directed translation forstatistical MT. Formally speaking, our model is based on tree-to- stringtransducers that recursively convert a parse-tree in the source-languageinto a string in the target-language. These transduction rules havemulti-level trees on the source-side, giving this system moretransformational power due to the extended domain of locality. We alsopresent efficient algorithms for decoding based on dynamic programming.Initial experiments on English-to-Chinese translation show promisingresults in both speed and the translation quality.Joint work with Kevin Knight and Aravind Joshi.Bio:Liang Huang is a 3rd-year PhD student from the University of Pennsylvania.He is mainly interested in algorithms and formalisms for parsing andsyntax-based machine translation. His recent work has been on k-bestparsing algorithms (with David Chiang) and synchronous binarization for MT(with Hao Zhang, Dan Gildea, and Kevin Knight).

DTEND;TZID=America/Los_Angeles:20060303T163000
DTSTART;TZID=America/Los_Angeles:20060303T150000
LOCATION:11th Floor (Large)
SUMMARY:Syntax-Directed Translation with Extended Domain of Locality
UID:20060303T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This talk will cover two topics. The first part will be a brief overview of Manuel's recent project in abbreviation disambiguation. Following, Manuel will give a brief overview of how various NLP methods are used in an industrial setting in a danish company that provides text analytics services for publishers such as Springer-Nature.Bio: Manuel is a 3rd year PhD student at Aarhus University in Denmark. His PhD is focused on applying Data Mining and Machine Learning on large collections of unstructured text documents with the goal of extracting and representing knowledge embedded in the documents.

DTEND;TZID=America/Los_Angeles:20180208T120000
DTSTART;TZID=America/Los_Angeles:20180208T110000
LOCATION:Conference Room [689]
SUMMARY:Abbreviation Disambiguation and NLP Deployment in Industrial Settings
UID:20180208T110000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We present a method to transliterate names in the framework ofend-to-end statistical machine translation. The system is trained tolearn when to transliterate.For Arabic to English MT, we developed and trained a transliterator on abitext of 7 million sentences and Google's English terabyte ngrams andachieved better name translation accuracy than 3 out of 4 professionaltranslators. The talk also includes a discussion of challenges in nametranslation evaluation.

DTEND;TZID=America/Los_Angeles:20080404T160000
DTSTART;TZID=America/Los_Angeles:20080404T150000
LOCATION:11 Large
SUMMARY:Name Translation in Statistical Machine Translation: Learning When to Transliterate
UID:20080404T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Online activity is characterized by diurnal and weekly patterns, reflecting human circadian rhythms, sleep cycles, and social patterns of work and leisure. Using data from online social networking site Facebook, we uncover temporal patterns that take place at far shorter time scales. Specifically, we demonstrate fine-grained, within-session behavioral changes, where a session is defined as a period of time a user engages with Facebook before choosing to take a break. We show that over the course of a session, users spend less time consuming some types of content, such as textual posts, and preferentially consume more photos and videos. Moreover, users who spend more time engaging with Facebook have different patterns of session activity than the less-engaged users, a distinction that is already visible at the start of the session. We study activity patterns with respect to usersâ demographic characteristics, such as age and gender, and show that age has a strong impact on within-session behavioral changes. Finally, we show that the temporal patterns we uncover help us more accurately predict the length of sessions on Facebook.Bio.  I am a third-year Computer Science PhD student at the University of Southern California (USC), Information Sciences Institute (ISI) working under the supervision of Kristina Lerman. My main research interest is the study of large and complex datasets, especially data from online social networks, which includes the measurement and analysis of users' behavior in OSNs. I'm currently a Data Science intern at Facebook in Menlo Park.Before joining USC, I got my master's from Max Planck Institute for Software Systems (MPI-SWS), Germany. I worked with Krishna Gummadi as my advisor and also with Meeyoung Cha (KAIST) and Winter Mason (Facebook) during my master's. Before MPI, I got my bachelor's in Computer Engineering (Software) from University of Tehran, Iran.

DTEND;TZID=America/Los_Angeles:20151023T160000
DTSTART;TZID=America/Los_Angeles:20151023T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Fine-grained Temporal Patterns of Online Content Consumption
UID:20151023T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We present a novel method to detect parallel fragments within noisy parallel corpora. Isolating these parallel fragments from the noisy data in which they are contained frees us from noisy alignments and stray links that can severely constrain translation-rule extraction. We do this with existing machinery, making use of an existing word alignment model for this task. We evaluate the quality and utility of the extracted data on large-scale Chinese-English and Arabic-English translation tasks and show significant improvements over a state-of-the-art baseline.

DTEND;TZID=America/Los_Angeles:20120518T153000
DTSTART;TZID=America/Los_Angeles:20120518T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Automatic Parallel Fragment Extraction From Noisy Data (NAACL HLT Practice Talk)
UID:20120518T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Human behavior is exceedingly complex. Its expression and experience are inherently multimodal, and are characterized by individual and contextual heterogeneity.  The confluence of sensing, communication and computing is however allowing access to data, in diverse forms and modalities, that is enabling us understand and model human behavior in ways that were unimaginable even a few years ago. No domain exemplifies these opportunities more than that related to human health and wellbeing. Consider for example the domain of Autism where crucial diagnostic information comes from manually-analyzed audiovisual data of verbal and nonverbal behavior. Behavioral signal processing advances can enable not only new possibilities for gathering data in a variety of settings--from laboratory and clinics to free living conditions--but in offering computational models to advance evidence-driven theory and practice.This talk will describe our ongoing efforts on Behavioral Signal Processing (BSP)--technology and algorithms for quantitatively and objectively understanding typical, atypical and distressed human behavior--with a specific focus on communicative, affective and social behavior. Using examples drawn from different application domains, the talk will also illustrate Behavioral Informatics applications of these processing techniques that contribute to quantifying higher-level, often subjectively described, human behavior in a domain-sensitive fashion.[Work supported by NIH, NSF, DARPA, and ONR].Biography of the Speaker:Shrikanth (Shri) Narayanan is Andrew J. Viterbi Professor of Engineering at USC, where he is Professor of Electrical Engineering, and, jointly in, Computer Science, Linguistics and Psychology. Prior to USC he was with AT&T Bell Labs and AT&T Research. His research focuses on human-centered information processing and communication technologies. He is a Fellow of the Acoustical Society of America, IEEE, and the American Association for the Advancement of Science (AAAS). Shri Narayanan is an Editor for the Computer, Speech and Language Journal and an Associate Editor for the IEEE Transactions on Multimedia, the IEEE Transactions on Affective Computing and the Journal of Acoustical Society of America having previously served an Associate Editor for the IEEE Transactions of Speech and Audio Processing (2000-2004) and the IEEE Signal Processing Magazine (2005-2008). He is a recipient of several honors including the 2005 and 2009 Best Paper awards from the IEEE Signal Processing Society and serving as its Distinguished Lecturer for 2010-11. With his students, he has received a number of best paper awards including winning the Interspeech Challenges in 2009 (Emotion classification), 2011 (Speaker state classification) and in 2012 (Speaker trait classification).  He has published over 500 papers and has 13 U.S. patents.

DTEND;TZID=America/Los_Angeles:20130124T160000
DTSTART;TZID=America/Los_Angeles:20130124T150000
LOCATION:6th Floor Conference Room [689]
SUMMARY:Behavioral Signal Processing: Deriving Human Behavioral Informatics from Multimodal Signals
UID:20130124T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this talk, I look at how the notion of discourse coherence can bemodeled computationally. I begin with the following idea: if you takea text and shuffle its sentences into a random order, that text willno longer make sense. In other words, the text will be "incoherent".Our task is to learn how to reassemble a shuffled text into an orderthat humans would consider to be coherent.I discuss practical and theoretical motivations for the task,evaluations of our model, increases in performance achieved over thesummer, and directions for future research.This work was done in collaboration with Kevin Knight, Daniel Marcu,Jonathan Graehl and Nick Mote.

DTEND;TZID=America/Los_Angeles:20030912T160000
DTSTART;TZID=America/Los_Angeles:20030912T143000
LOCATION:11 Large
SUMMARY:Discourse Coherence for Ordering Information
UID:20030912T143000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Statistical machine translation (SMT) has witnessed promising progress in recent years. Typically, an SMT system is characterized as a single-best pipeline, whose modules are independent to each other and only take as input single-best results from the previous module. With this assumption, each module will inevitably introduce errors in single-best outputs, which will propagate and accumulate along the pipeline, and eventually hurt the translation quality.In order to alleviate this problem, we use compact structures such as lattices and forests instead of single-best results in each module, and then integrate both lattice and forest into a single tree-to-string system. We explore the algorithms of lattice parsing, lattice-forest-based rule extraction and decoding. Experiments show a statistically significant improvement over a start-of-the-art forest-based baseline. More interestingly, we observe a significant reduction in rule-set size when extracting with a lattice, which implies better generalizability (with a smaller model).About the speaker:Haitao Mi is an Assistant Researcher in the Institute of Computing Technology, Chinese Academy of Sciences (CAS/ICT). He received his Ph.D. from CAS/ICT in 2009. His main research interests include syntax-based machine translation and statistical parsing. Additional information about him and his group can be found at http://nlp.ict.ac.cn/~mihaitao/

DTEND;TZID=America/Los_Angeles:20100331T160000
DTSTART;TZID=America/Los_Angeles:20100331T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Lattice and Forest for SMT
UID:20100331T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We introduce dependency relations into deciphering foreign languages and show that dependency relations help improve the state-of-the-art deciphering accuracy by over 500%. We learn a translation lexicon from large amounts of genuinely non parallel data with decipherment to improve a phrase-based machine translation system trained with limited parallel data. In experiments, we observe BLEU gains of 1.2 to 1.8 across three different test sets.

DTEND;TZID=America/Los_Angeles:20131016T120000
DTSTART;TZID=America/Los_Angeles:20131016T110000
LOCATION:6th Floor Large Conference Room [Rm # 689]
SUMMARY:Dependency Based Decipherment for Resource-Limited Machine Translation (EMNLP2013 practice talk)
UID:20131016T110000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I will be talking about deciphering letter-substitution ciphers *optimally* using only minimal knowledge (bigrams, trigrams, etc.) of the source language, instead of relying on large look-up dictionaries. We also plan to show how our empirical results compare with Shannon's predictions on the equivocation curves and unicity distance measure.

DTEND;TZID=America/Los_Angeles:20080718T160000
DTSTART;TZID=America/Los_Angeles:20080718T150000
LOCATION:11 Large
SUMMARY:Deciphering Ciphers Optimally Using Only Minimal Knowledge of the Source Language
UID:20080718T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 

DTEND;TZID=America/Los_Angeles:20030815T160000
DTSTART;TZID=America/Los_Angeles:20030815T150000
LOCATION:11 Large
SUMMARY:On Her Masters Research
UID:20030815T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I wanted to understand why it's so hard to build working robots, so I programmed one to play tic-tac-toe.  Now I understand a lot better!  I thought I'd relate my experience right now, just in case I later become more knowledgeable and impossible to understand.Kevin Knight is a Research Director at the Information Sciences Institute (ISI) of the University of Southern California (USC), and a Professor in the USC Computer Science Department.  He received a PhD in computer science from Carnegie Mellon University and a bachelor's degree from Harvard University.  Dr. Knightâs research interests include statistical machine translation, natural language generation, automata theory, and decipherment of historical manuscripts.

DTEND;TZID=America/Los_Angeles:20170414T160000
DTSTART;TZID=America/Los_Angeles:20170414T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Why is it harder to build a tic-tac-toe playing robot than a tic-tac-toe playing program?
UID:20170414T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this talk I will present Ensemble Semantics (ES), a new general framework for information extraction developed at Yahoo!, that combines multiple sources of information and extractors. The ES framework is based on the hypothesis that although distributional and pattern-based extraction algorithms are complementary, they do not exhaust the semantic space; other sources of evidence can be leveraged to better combine them.  In this presentation, I will focus on a specific implementation of ES for the task of entity extraction. I will report experimental results showing large gains in performance, by combining state-of-the-art distributional and pattern-based systems with a large set of features from a document webcrawl, one year of query logs, and a snapshot of Wikipedia. I will also propose an analysis of feature correlations and interactions showing the value of the different feature sets. I will conclude discussing some issues that can impact on the overall performance of entity extraction algorithms.

DTEND;TZID=America/Los_Angeles:20091120T160000
DTSTART;TZID=America/Los_Angeles:20091120T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Entity Extraction via Ensemble Semantics
UID:20091120T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We take a novel approach to zero pronoun resolution in Chinese: our model explicitly tracks the flow of focus in a discourse. Our approach, which generalizes to deictic references, is not reliant on the presence of overt noun phrase antecedents to resolve to, and allows us to address the large percentage of ânon-anaphoricâ pronouns filtered out in other approaches. We furthermore train our model using readily available parallel Chinese/English corpora, allowing for training without hand-annotated data. Our results demonstrate improvements on two test sets, as well as the usefulness of linguistically motivated features.Bio. I am a PhD student from University of Maryland, College Park working under Prof. Hal Daume III and Prof. Philip Resnik. My recent project on "Dialogue focus tracking for zero pronoun resolution" appeared at NAACL 2015. At ISI, I am working with Prof. Daniel Marcu and Prof. Kevin Knight on application of Abstract Meaning Representation (AMR) to biology literature. Specifically we will be developing techniques for constructing text level AMRs from sentence level AMRs and then assess its impact on reading-against-a-model molecular biology tasks. In my spare time, I enjoy singing, dancing and watching movies.

DTEND;TZID=America/Los_Angeles:20150724T160000
DTSTART;TZID=America/Los_Angeles:20150724T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Dialogue focus tracking for zero pronoun resolution
UID:20150724T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Noisy Language ModelsThe language models used in statistical machine translation are oftenquite large, requiring significant memory and sometimes pre-processingin order to be utilized effectively. It would be desirable to have amore compact representations of language models while minimizing theimpact on translation quality. Various quantization methods and lossystorage of language models will be presented.Context for Syntax-Based Translation RulesThe rules that a translation system employs should be applicable inmany contexts.  This ensures that a rich language is expressible witha minimum number of rules.  However, when rules that are applicable intoo many contexts are combined, they result in nonsensicaltranslations.  How can we keep rules general but constrain the contextof their use?  This summer we explored the approach of constrainingthe context by conditioning on various neighboring elements of eachrule.

DTEND;TZID=America/Los_Angeles:20070824T170000
DTSTART;TZID=America/Los_Angeles:20070824T153000
LOCATION:11 Large
SUMMARY:Summer Intern Presentations: Noisy Language Models AND Context for Syntax-Based Translation Rules
UID:20070824T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We offer a simple, effective, and scalable method for statisticalmachine translation parameter tuning based on the pairwise approach toranking. Unlike the popular MERT algorithm, our pairwise rankingoptimization (PRO) method is not limited to a handful of parametersand can easily handle systems with thousands of features. Moreover,unlike recent approaches built upon the MIRA algorithm of Crammer andSinger, PRO is easy to implement. It uses off-the-shelf linear binaryclassifier software and can be built on top of an existing MERTframework in a matter of hours. We establish PRO's scalability andeffectiveness by comparing it to MERT and MIRA and demonstrate parityon both phrase-based and syntax-based systems in a variety of languagepairs, using large scale data scenarios.

DTEND;TZID=America/Los_Angeles:20110715T153000
DTSTART;TZID=America/Los_Angeles:20110715T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Tuning as Ranking (EMNLP 2011 practice talk)
UID:20110715T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract: The increasing growth of network data such as linked documents on the Web and social networks, has imposed great challenges on automatic data analysis. We study the problem of learning representations of network data, which is of critical for applications including data classification, ranking and link prediction. We present neural network embedding algorithms to learn distributed representations of network data that capture the deep context of each data point, and human cognition in navigation data. To improve the scalability of our algorithms, we use efficient optimization and sampling methods.Bio: Hao Wu is a PhD student at USC/ISI, advised by Kristina Lerman.

DTEND;TZID=America/Los_Angeles:20160408T160000
DTSTART;TZID=America/Los_Angeles:20160408T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Learning Distributed Representations from Network Data and Human Navigation
UID:20160408T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Textual data is everywhere, in email and scientific papers, in onlinenewspapers and e-commerce sites. The Web contains more than 200 terabytesof text not even counting the contents of dynamic textual databases. Thisenormous source of knowledge is seriously underexploited. Textualdocuments on the Web are very hard to model computationally: they aremostly unstructured, time-dependent, collectively authored, multilingual,and of uneven importance.  Traditional grammar-based techniques don'tscale up to address such problems. Novel representations and analyticaltools are needed.I will discuss several current projects at Michigan related to text miningfrom a variety of genres. Depending on the amount of time, I will talkabout (a) lexical centrality for multidocument summarization, (b)syntax-based sentence alignment, (c) graph-based classification,(d)lexical models of Web growth, and (e) mining protein interactions fromscientific papers. As it turns out, the right representations, whencomplemented with traditional NLP and IR techniques, turn many of theseinto instances of better studied problems in areas such as socialnetworks, statistical mechanics, sequence analysis, and computationalphylogenetics.About the Speaker:Dragomir R. Radev is Assistant Professor of Information, ElectricalEngineering and Computer Science, and Linguistics at the University ofMichigan, Ann Arbor.  He leads the CLAIR (Computational LingusiticsAnd Information Retrieval) group which currently includes 12undergraduate and graduate students.  Dragomir holds a Ph.D. inComputer Science from Columbia University.  Before joining Michigan,he was a Research Staff Member at IBM's TJ Watson Research Center inHawthorne, NY.  He is the author of more than 45 papers on informationretrieval, text summarization, graph models of the Web, questionanswering, machine translation, text generation, and informationextraction.  Dr. Radev's current research on probabilistic andlink-based methods for exploiting very large textual repositories,representing and acquiring knowledge of genome regulation, andsemantic entity and relation extraction from Web-scale text documentcollections is supported by NSF and NIH.  Dragomir serves on theHLT-NAACL advisory committee, was recently reelected as treasurer ofNAACL, is a member of the editorial boards of JAIR and InformationRetrieval, and is a four-time finalist at the ACM internationalprogramming finals (as contestant in 1993 and as coach in1995-1997). Dragomir received a graduate teaching award at Columbiaand recently, the U. of Michigan award for Outstanding ResearchMentorship (UROP).

DTEND;TZID=America/Los_Angeles:20041112T163000
DTSTART;TZID=America/Los_Angeles:20041112T150000
LOCATION:11 Large
SUMMARY:Words, links, and patterns: novel representations for Web-scale text mining
UID:20041112T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This talk is the second in three tutorial lectures on ontologies.  Itfirst shows some details of various Upper Ontologies-ResearchCYC, SUMO,DOLCE, and the Penman Upper Model.  It then discusses the problem ofcreating content for the 'Middle Model' zone of ontologies, and outlines amethodology for moving from words to word senses to concepts.  Itconcludes by describing ISI's Omega ontology and showing how Omega hasbeen used in annotation projects to support semantic labeling of texts.Please bring a pen or pencil and some paper; there is a small exercise!

DTEND;TZID=America/Los_Angeles:20050318T163000
DTSTART;TZID=America/Los_Angeles:20050318T150000
LOCATION:11 Large
SUMMARY:Methodologies of ontology content construction
UID:20050318T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Machine Reading relies to a large extent on information about entities and events. While the definition of events is controversial, most people agree that they have certain properties like a time and a place.We exploit this by trying to establish relations between events (such as ``bombing''  or ``election'') and temporal expressions that can be resolved to a timestamp, i.e., an expression like ``last Tuesday'' to an absolute value like 20110802.This enables a number of interesting applications, such as generation of absolute timelines, cross-document event coreference, and resolution of logical discrepancies.We define a baseline approach and improve upon it by identifying important subproblems (within-sentence vs. across-sentence), casting them as a relation extraction problem and showing that classification with kernel methods works well in capturing the information. Our results are competitive with previous approaches and reach a F-score of 76.6.We also show that resolution across sentences is a lot harder and cannot be approached with the same techniques used for the within-sentence. We outline some promising findings and suggest further research.

DTEND;TZID=America/Los_Angeles:20110930T170000
DTSTART;TZID=America/Los_Angeles:20110930T160000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Aligning Events and Time Stamps
UID:20110930T160000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Natural Language Decipherment: Solving Problems in Natural Language Processing without Labeled Data (Thesis Proposal practice talk)A wide variety of problems in NLP require parallel data to train supervised models to perform different tasks. For example, in machine translation (where the task is to translate between two languages automatically) parallel data containing source/target language sentence pairs is required to train various models which can then be used to translate new sentences or documents. The dependency on parallel data for many of these NLP tasks limits their applications to specific domains, or language pairs for which a lot of training data is readily available. On the other hand, collecting parallel data for new domains, language pairs, etc. is a costly as well as time-intensive operation. For such tasks, the development of novel unsupervised approaches which require only {\em non-parallel} data for training can enable their application to new domains and potentially broaden the impact and benefits of NLP research to wider areas.A similar problem has been tackled by cryptographers and archaeologists in a different context---for "decipherment" purposes. During the 1940's and 1950's, mathematicians and scientists worked on code-breaking operations, which spurred the development of many research ideas for modern computer science. For such problems, it is highly unlikely to assume the availability of parallel data relating the ciphertext and plaintext, yet cryptographers and archaeologists have attempted to solve such tasks using various decipherment techniques along with other non-parallel sources of information.In this thesis proposal practice talk, I will show how we combine the two ideas (decipherment and unsupervised learning for NLP problems) together and present a unified decipherment-based approach for modeling a wide range of problems in NLP. Instead of relying on parallel data, I propose to use alternate sources of linguistic knowledge and large quantities of readily available monolingual data to induce strong bilingual connections in problems such as machine transliteration and translation. The talk will describe how various NLP problems such as unsupervised part-of-speech tagging, word alignment, transliteration, and machine translation can be formulated as decipherment tasks. I will present decipherment algorithms for tackling many of these problems and show that it is possible to achieve good results for many problems of interest in NLP without using any parallel data at all.

DTEND;TZID=America/Los_Angeles:20090826T160000
DTSTART;TZID=America/Los_Angeles:20090826T150000
LOCATION:11 Large
SUMMARY:Natural Language Decipherment: Solving Problems in Natural Language Processing without Labeled Data (Thesis Proposal practice talk)
UID:20090826T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We present a general framework for automatically extracting social networks and biographical facts from conversationalspeech. Our approach relies on fusing the output produced by multiple information extraction modules, including entityrecognition and detection, relation detection, and event detection modules. We describe the specific features and algorithmicrefinements effective for conversational speech. These cumulatively increase the performance of social networkextraction from 0.06 to 0.30 for the development set, and from 0.06 to 0.28 for the test set, as measured by f-measure on theties within a network. The same framework can be applied to other genres of text -- we have built an automatic biographygeneration system for general domain text using the same approach.--Brief Bio:Nanda Kambhatla has nearly 17 years of research experience in the areas ofNatural Language Processing (NLP), text mining, information extraction, dialog systems, andmachine learning. He holds 6 U.S patents and has authored over 30 publications in books,journals, and conferences in these areas. Nanda holds a B.Tech in Computer Science and Engineeringfrom the Institute of Technology, Benaras Hindu University, India, and a Ph.D in ComputerScience and Engineering from the Oregon Graduate Institute of Science & Technology, Oregon, USA.Currently, Nanda is the manager of the Data Analytics Group at IBM's India Research Lab (IRL), Bangalore. The group is focused on research on machine translation, Natural Language Processing, text analysis and machine learning techniques for developing analyticssolutions to help IBM's services divisions. Most recently, Nanda was the manager of the StatisticalText Analytics Group at IBM's T.J. Watson Research Center, the Watson co-chair of the NaturalLanguage Processing PIC, and the task PI for the Language  Exploitation Environment (LEE) subtaskfor the DARPA GALE project. He has been leading the development of information extractiontools/products and his team has achieved top tier results in successive Automatic Content Extraction(ACE) evaluations conducted by NIST for extracting entities, events and relations from text frommultiple sources, in multiple languages and genres.Earlier in his career, Nanda has worked on natural language web-based and spoken dialog systems at IBM. Before joining IBM, he has worked on information retrieval and filtering algorithms as a senior research scientist at WiseWire Corporation, Pittsburgh and on image compression algorithms while working as a postdoctoral fellow under Prof. Simon Haykin at McMaster University, Canada.Nanda's research interests are focused on NLP and technology solutions for creating, storing, searching, and processing large volumes of unstructured data (text, audio, video, etc.) and specifically on applications of statistical learning algorithms to these tasks.

DTEND;TZID=America/Los_Angeles:20091009T160000
DTSTART;TZID=America/Los_Angeles:20091009T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Extracting Social Networks and Biographical Facts from Conversational Speech Transcripts
UID:20091009T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Multiple Kernel Learning (MKL) has been a subject of intensive research over the past decade.Instead of searching for a good kernel function (implicitly, feature transformation of our data), the idea is to learn a combination of kernels that optimizes our objective.This formulation has found usage in feature selection and interpretability as well as (sometimes) leading to increased classification accuracy.In the talk, I will provide an introduction to MKL as well as present and compare a few MKL formulations for SVM classification.Given time, I will present our own non-linear (yet still convex) MKL formulation that linearly combines kernels that are first multiplied by low-rank matrices.

DTEND;TZID=America/Los_Angeles:20130830T160000
DTSTART;TZID=America/Los_Angeles:20130830T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:MKL and Low Rank Multiplicative Shaping
UID:20130830T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Digital humanities is a field that uses digital tools to explore humanities questions. That work can take many different forms, from maps to data visualization to video-based projects. In this talk, weâll discuss humanities approaches to large-scale text analysis, with a focus on corpora that may be of interest to computer scientists. Weâll also talk about the distinctive ways that humanists approach text analysis, and some of the âliveâ questions in the field that might interest NLP researchers.Bio: Miriam Posner is an assistant professor at the UCLA School of Information. Sheâs also a digital humanist with interests in labor, race, feminism, and the history and philosophy of data. As a digital humanist, she is particularly interested in the visualization of large bodies of data from cultural heritage institutions, and the application of digital methods to the analysis of images and video. She is at work on two projects: the first on what âdataâ might mean for humanistic research; and the second on how multinational corporations are making use of data in their supply chains.Bio: David Shepard (UCLA) is Lead Academic Developer at UCLAâs Center for Digital Humanities. After receiving his PhD in English from UCLA in 2012, he coauthored the book HyperCities: Thick Mapping in the Digital Humanities and has worked on social media and text mining. His work focuses on large-scale analysis of social media in disasters.Bio: Andrew Wallace is a software developer in the UCLA digital library. He received his PhD in Cognitive Science from Brown University in 2011.

DTEND;TZID=America/Los_Angeles:20180223T160000
DTSTART;TZID=America/Los_Angeles:20180223T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Digital Humanities: Lots of Text-Based Corpora, Lots of Questions
UID:20180223T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This pilot study explores the hypothesis that customer reviews of cars can be used to create and/or fine tune a recommendation system that offers a list of ranked top-N matches for a given vehicle.  Our main premise is that positive or negative reviews invariably focus on the features relevant to the car being reviewed and hence can be used to uncover subtle similarities among various car models, as well as discover macro-types of cars (e.g. family cars, luxury, high performance sports etc).  To discover similar models based on reviews we propose a Weighted Dice Coefficient which weighs each shared or non-shared word token by its tf-idf score.  Closest top five cars are then discovered for each of the 226 reviewed car models.   We also show that integrating tf-idf scores into the similarity metric improves the accuracy of the top five picks, as compared to the standard Dice Coefficient.Bio:I graduated from Rutgers in 2005 with a PhD in Linguistics. Having taught linguistics at Pomona College and Simon Fraser University between 2006 and 2008, I moved into industry in 2008. I currently work as a Computational Linguist at Disney Interactive Media Group. My work primarily concerns developing natural language processing techniques to ensure that the content of Disney's online chat is safe for kids. My work involves developing various NLP methods that filter online chat for inappropriate content, while taking into account the vast informality, sparsity, and noise of the on-line child chat language. In addition, I conduct independent research on Twitter data, specifically clustering one-line micro-tweets by topic. My additional research includes mining online car reviews to identify common car-types based on the features people rate as positive or negative.

DTEND;TZID=America/Los_Angeles:20120622T160000
DTSTART;TZID=America/Los_Angeles:20120622T150000
LOCATION:11th Floor Conference Room [1135]
SUMMARY:Discovering Latent Similarities in Car Models Based On Customer Reviews: Towards a Consumer-Driven Product Recommendation System
UID:20120622T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: What is in common between translating from English into Chinese andcompiling C++ into machine code? And yet what are the differences thatmake the former so much harder for computers? How can computers learnfrom human translators?This talk sketches an efficient (linear-time) "understanding +rewriting" paradigm for machine translation inspired by both humantranslators as well as compilers. In this paradigm, a source languagesentence is first parsed into a syntactic tree, which is thenrecursively converted into a target language sentence viatree-to-string rewriting rules. In both "understanding" and"rewriting" stages, this paradigm closely resembles the efficiency andincrementality of both human processing and compiling. We will discussthese two stages in turn.First, for the "understanding" part, we present a linear-timeapproximate dynamic programming algorithm for incremental parsing thatis as accurate as those much slower (cubic-time) chart parsers, whilebeing as fast as those fast but lossy greedy parsers, thus getting theadvantages of both worlds for the first time, achievingstate-of-the-art speed and accuracy. But how do we efficiently learnsuch a parsing model with approximate inference from huge amounts ofdata? We propose a general framework for structured prediction basedon the structured perceptron that is guaranteed to succeed withinexact search and works well in practice.Next, the "rewriting" stage translates these source-language parsetrees into the target language. But parsing errors from the previousstage adversely affect translation quality. An obvious solution is touse the top-k parses, rather than the 1-best tree, but this only helpsa little bit due to the limited scope of the k-best list. We insteadpropose a "forest-based approach", which translates a packed forestencoding *exponentially* many parses in a polynomial space by sharingcommon subtrees. Large-scale experiments showed very significantimprovements in terms of translation quality, which outperforms theleading systems in literature. Like the "understanding" part, thetranslation algorithm here is also linear-time and incremental, thusresembles human translation.

DTEND;TZID=America/Los_Angeles:20120210T160000
DTSTART;TZID=America/Los_Angeles:20120210T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Efficient Search and Learning for Language Understanding and Translation
UID:20120210T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The increasing volume of electronic health records (EHR) data has spurred significant interest in the development of algorithmic phenotyping, used to identify patient cohorts in massive databases. Data-driven phenotyping, which formulates phenotyping as a statistical learning problem, offers superior scalability and generalization. Building upon previous work at Stanford, we propose a deep multi-phenotyping model: we train a single multi-task neural network to recognize multiple phenotypes, trained on noisy labels generated via an automatic process. We present preliminary results on classifying over 30 different phenotypes on a data set of over one million patients from the Stanford clinical system. This is joint work with Nigam Shah at Stanford University Center for Biomedical Informatics Research.BIO: Dave Kale is a fourth year PhD student in Computer Science and an Alfred E. Mann Innovation in Engineering Fellow at the University of Southern California. He is advised by Greg Ver Steeg. Before joining USC and ISI, he worked in the Whittier VPICU at Children's Hospital LA and co-founded the Meaningful Use of Complex Medical Data (MUCMD) Symposium. Dave holds a BS and MS from Stanford University.

DTEND;TZID=America/Los_Angeles:20151002T160000
DTSTART;TZID=America/Los_Angeles:20151002T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Automated Deep Multi-Phenotyping with Noisy Labels
UID:20151002T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Today a large number of Electronic Patient Records (EPRs) are produced for legal reasons but they are very seldom reused, neither for clinical research nor for business (hospital) intelligence reasons. Moreover, the clinician's daily work in documenting the patient status is not always supported in a proper way. Hospital management needs key and real time information of the health care processes. Simultaneously, patients have become more demanding customers that want to be involved in their own health care process. We are aiming to support these demands.Clinical documentation forms an abundant source to extract valuable information that can be used for this purpose, however clinical corpora contain protected health information and must be kept in a safe way. Today only in Sweden (with a population of 10 million) 4-10 million pages of patient records are produced each year.We have studied the Stockholm EPR Corpus, a huge clinical document collection written in Swedish, containing over one million patient records. The document collection is distributed over 900 clinics from the Stockholm area encompassing three years 2006-2008. We have used this clinical corpus as a knowledge base to develop a set of tools that can work as basic building blocks for the future tools for health engineering. We have been assisted by physicians that have interpreted the content in the clinical text to us, they have annotated the clinical text and they have also set requirements on these tools together with their colleagues. We have identified four groups of users in the health domain: physicians, clinical researchers, hospital management and patients. We will show examples on these tools and the benefits they will give to health care.1) For physicians: Automatic ICD-10 assignment 2) For clinical researchers: Comorbidity networks 3) For hospital management: ICD-10 validation and adverse event detection, and finally 4) For patients: automatic text summarization.Brief Bio:Dr. Hercules Dalianis, Professor, born 20 July 1959Dalianis is a professor in Computer and Systems Sciences at Stockholm University. Dalianis received his Ph.D in 1996. Dalianis was a postdoc researcher at University of Southern California/ISI in Los Angeles in 1997. Dalianis was also postdoc researcher (forskarassistent) at KTH-Royal Institute of Technology in Stockholm, 1999-2003. Dalianis held a three year guest professorship at CST, University of Copenhagen during 2002-2005, founded by the Norfa, the Nordic council. Dalianis works in the interface between industry and university and with the aim to make research results useful for society. Dalianis has specialized in the area of human language technology, to make computers understand and process human language text, but also to make computers produce text automatically. Currently Dalianis is working in the area of clinical text mining with the aim to improve health care in form of better electronic patient record systems, presentation of the patient records and extraction of valuable information for clinical researchers as well as for the patients.

DTEND;TZID=America/Los_Angeles:20120113T160000
DTSTART;TZID=America/Los_Angeles:20120113T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Reusing clinical documentation for better health
UID:20120113T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Randomized algorithms are those which use randomness to achieve efficient performance with a bounded probability of error; typically, the bound is adjustable and the performance depends on the bound. Randomized data structures, likewise, use randomness to achieve efficient storage with a bounded probability of error. I will give an overview of the use of such data structures, namely, Bloom filters and "Bloomier" filters, for storing very large n-gram language models, and will discuss possibilities for using randomized data structures for other purposes as well.

DTEND;TZID=America/Los_Angeles:20080425T160000
DTSTART;TZID=America/Los_Angeles:20080425T150000
LOCATION:11 Large
SUMMARY:Tutorial: Randomized data structures for large statistical NLP models
UID:20080425T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The overwhelming popularity of online social media creates an unprecedented opportunity to  display aspects of oneself.  Inferring information about these users has the potential to benefit manydownstream applications such as recommendation engines and targeted advertising. In this talk I will show how to extract important personal information such as major life events  and personal attributes (e.g., gender, education, job)  from social evidence  such as the text produced by users and their friends and from properties of their social network. I will describe algorithms making use of a variety of frameworks, including distant supervision, and a deep learning architecture that learns user representations by integrating many heterogeneous social signals.Bio. Jiwei Li is a PH.D. student in the computer science department at Stanford University, working with Prof. Dan Jurafsky. His research interests include discourse, language generation, and social networks, with a focus on deep learning methods. Jiwei receives his B.S. from Peking University in 2012.  He was rewarded the Facebook Fellowship in 2015.Webcast link: http://webcasterms1.isi.edu/mediasite/Viewer/?peid=6b5348f2f8dc4a4dbb595eca444410d51d

DTEND;TZID=America/Los_Angeles:20160122T160000
DTSTART;TZID=America/Los_Angeles:20160122T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Extracting User Information from Online Social Media
UID:20160122T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The phenomenal amount of text on social media has recently spawned endeavors on computational methods to study language variation and change. However, we also have access to an unprecedented quantity of speech -- from Youtube video blogs to podcasts to recordings of radio and television shows, spanning several different accents and dialects. This data is a boon to sociophoneticians, who have traditionally relied on small-scale interviews to study systematic variation in speech. At the same time, it presents a challenge: the usual manual speech analysis methods do not scale.I will present ongoing work on an application that allows sociophoneticians to identify dialect features from potentially noisy speech data without the need for manual transcription.

DTEND;TZID=America/Los_Angeles:20150623T160000
DTSTART;TZID=America/Los_Angeles:20150623T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Automated tools for analyzing sociophonetic variation
UID:20150623T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We revisit the idea of history-based parsing, and present a history-basedparsing framework that strives to be simple, general, and flexible.Â  Wealso provide a decoder for this probability model that is linear-space,optimal, and anytime.Â  A parser based on this framework, when evaluated onSection 23 of the Penn Treebank, compares favorably with otherstate-of-the-art approaches, in terms of both accuracy and speed.

DTEND;TZID=America/Los_Angeles:20060310T163000
DTSTART;TZID=America/Los_Angeles:20060310T150000
LOCATION:10th Floor
SUMMARY:Exploring the Potential of Intractable Parsers
UID:20060310T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: European libraries are filled with undeciphered historical manuscripts from the 16th-18th centuries. These documents are enciphered with classical methods, which puts their contents out of the reach of historians who are interested in the history of that era.In this talk, we show how we automatically cracked a 400-page book from the 17th century. We also describe a system aimed at deciphering from camera-phone images. We show initial results for different ciphers.Bio:Nada is a graduate student at USC, working on her thesis under the supervision of Prof. Kevin Knight. She is currently working on the decipherment of historical documents (joint project with Uppsala University, Sweden). Her research interests include natural language processing, machine learning, decipherment and machine translation.1

DTEND;TZID=America/Los_Angeles:20160909T160000
DTSTART;TZID=America/Los_Angeles:20160909T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:How we Cracked the âBorgâ Cipher + First Steps Towards Deciphering from Images
UID:20160909T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: There is a hierarchy of generative devices that generate trees:starting with regular tree languages (RTLs), which are containedwithin context-free tree languages (CFTLs), and so on.  The stringyield of the RTLs is exactly the set of Context-Free Languages,while the yield of the CFTLs is exactly the set of Indexed Languages.In this talk we introduce Adjoining Tree Languages (ATLs) which sitin between RTLs and CFTLs.The yield of ATGs is exactly the set of Tree-Adjoining Languages.Just like RTGs are stronger than CFGs, ATGs are stronger than TAGs.In addition we will show that the ATG notation simplifies many ofthe foundational proofs for TAGs including proofs of the closureproperties. In particular, ATLs do not use adjunction constraints,and thus are much easier to understand than TAGs.We compare ATGs with previously proposed simplifications of CFTGs,called monadic simple CFTGs, which also have been shown to be weaklyequivalent to TAG (i.e. they generate the same set of stringlanguages). We consider the question of whether these two weaklyequivalent formalisms are strongly equivalent (i.e. generate exactlythe same set of tree languages).Finally, we will show that the standard definition used forprobabilistic TAG is (surprisingly) very different from the naturaldefinition of probabilistic ATL. Using an example of PP-attachmentambiguity we show that the two probabilistic models are differentfrom each other.About the speaker:Anoop Sarkar is an assistant professor in the Department of ComputingScience at Simon Fraser University. He received his PhD in 2002from the Department of Computer and Information Science at theUniversity of Pennsylvania, with Prof. Aravind Joshi as his advisor.His research work is on machine learning, especially semi-supervisedlearning, applied to the processing of natural language and stochasticformal grammars.Anoop Sarkar's web-page: <a href=http://www.cs.sfu.ca/~anoop/> http://www.cs.sfu.ca/~anoop</a>

DTEND;TZID=America/Los_Angeles:20070816T163000
DTSTART;TZID=America/Los_Angeles:20070816T150000
LOCATION:11 Large
SUMMARY:Extensions of Regular Tree Grammars and their relation to Tree Adjoining Grammars
UID:20070816T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This talk investigates the use of domain knowledge to constrain and improve the unsupervised learning of a classifier, by placing limits or biases on the possible hypotheses for each input. Theoretically, we view the contribution of the knowledge source as a reduction in the uncertainty of the model's decisions, quantified by the resulting conditional entropy of the label distribution given the input corpus. Evaluating on the simple case of an unsupervised HMM tagger, we find surprising levels of improvement from little knowledge, with more stable and efficient training convergence and label assignment, and a high degree of correlation between classification entropy and model performance. We conclude that, while we should always seek better generic models and techniques, for applications in an unsupervised setting, knowledge may still be key.

DTEND;TZID=America/Los_Angeles:20080606T160000
DTSTART;TZID=America/Los_Angeles:20080606T150000
LOCATION:11 Large
SUMMARY:Knowledge as a Constraint on Uncertainty for Unsupervised Classification
UID:20080606T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 

DTEND;TZID=America/Los_Angeles:20030827T160000
DTSTART;TZID=America/Los_Angeles:20030827T150000
LOCATION:11 Large
SUMMARY:Syntax for Statistical MT
UID:20030827T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: One of the leading unsolved mysteries in Australia, is the case of the Somerton Man.  This was a very athletically fit man found in a nice suit lying deceased on a beach in Australia in 1948.  The mystery is that there was no mark on him and there was nothing to identify him. No one came forward to identify him either. Over 65 years later we still do not know his name or how he died.  He had no ID, but his pocket had a piece of paper with the words "Tamam Shud" on it. It was subsequently found that the piece of paper had been torn out of a copy of a poetry book called the Rubaiyat of Omar Khayyam. Penciled in the back of the book were letters that appeared to be some sort of code.  Is this a clue? This talk will outline the key facts of mystery and show how forensic skills in engineering and computing are being used to attempt to both identify the man and shed light on the mysterious letters.Derek Abbott received a B.Sc. (Hons) in physics from Loughborough University, U.K. in 1982 and completed his Ph.D. in electrical and electronic engineering from the University of Adelaide, Adelaide, Australia, in 1995. From 1978 to 1986, he was a research engineer at the GEC Hirst Research Centre, London, U.K. From 1986â1987, he was a VLSI design engineer at Austek Microsystems, Australia. Since 1987, he has been with the University of Adelaide, where he is presently a full Professor with the School of Electrical and Electronic Engineering.  Prof. Abbott is a Fellow of the Institute of Physics (IOP) and a Fellow of the IEEE. He has won a number of awards including a Tall Poppy Award for Science (2004), a Premierâs Award in Science and Technology for outstanding contributions to South Australia (2004), and an Australian Research Council (ARC) Future Fellowship (2012). He is on the editorial board of Proceedings of the IEEE. His interests are in complex systems and multidisciplinary applications of physics and engineering.

DTEND;TZID=America/Los_Angeles:20140416T160000
DTSTART;TZID=America/Los_Angeles:20140416T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:The Mystery of the Tamam Shud Code
UID:20140416T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 

DTEND;TZID=America/Los_Angeles:20031002T170000
DTSTART;TZID=America/Los_Angeles:20031002T160000
LOCATION:11 Large
SUMMARY:TBA
UID:20031002T160000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: State of the art machine translation systems learn translation rules from large amounts of parallel data (pairs of sentences that are translation of each other). Unfortunately, the amount of parallel data is very limited for many languages and domains. In general, it is easier to obtain monolingual data. Is it possible to learn useful translations from large amounts of monolingual data to improve machine translation when the amount of parallel data is limited? In this talk, I will present my ongoing work that applies decipherment techniques to decipher hundreds of millions Spanish news texts into English and learns a translation lexicon from the decipherment to improve a translation model learned from limited parallel data.

DTEND;TZID=America/Los_Angeles:20130517T160000
DTSTART;TZID=America/Los_Angeles:20130517T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Deciphering Gigaword
UID:20130517T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Extracting well-defined entities and relations that hold between them fromunstructured text is an important prerequisite for a variety of tasks such asknowledge base population, question answering, data analytics, visualization,etc.  The difficulty of this problem is evidenced by the annual TAC-KBPevaluations organized by NIST, where the best-performing systems in theslot-filling task still only achieve an f-value in the high 30's.  These higherror rates on individual relations get further compounded once relationshave to be joined to answer a question.State-of-the art statistical information extraction techniques focus primarilyon the phrase and sentence level to extract entities and relations betweenthem, and are generally ignorant of the greater context around them.  Wepresent a new approach which aggregates locally extracted information into alarger story context and uses abductive reasoning to generate the beststory-level interpretation.  We demonstrate that this approach cansignificantly improve relation extraction and question answering performanceon complex questions.  We will also describe ongoing work to apply this typeof inference to the TAC Knowledge Base Population task in order to improverelation extraction and coreference resolution.Bio:Hans Chalupsky is a project leader at the Information Sciences Institute ofthe University of Southern California, where he leads the Loom KnowledgeRepresentation and Reasoning Group.  He holds a Master's degree in computerscience from the Vienna University of Technology, Austria and a Ph.D. incomputer science from the State University of New York at Buffalo.Dr. Chalupsky has over 25 years of experience in the design, development andapplication of knowledge representation and reasoning systems such asPowerLoom, and he is the principal architect of the KOJAK Link DiscoverySystem.  His research interests include knowledge representation and reasoningsystems, natural language processing, knowledge and link discovery, anomalydetection and semantic interoperability.

DTEND;TZID=America/Los_Angeles:20140516T160000
DTSTART;TZID=America/Los_Angeles:20140516T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Story-Level Inference to Improve Machine Reading
UID:20140516T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Traditionally, computers communicate with humans by converting computer-readable input to human-interpretable output, for example via graphical user interfaces. My research focuses on building programs that automatically generate textual output from computer-readable input. The majority of existing Natural Language Generation (NLG) systems use hard-wired rules or templates in order to capture the input for every different application and rely on small manually annotated corpora.  In this talk, I will present a framework for building NLG systems using Neural Network architectures. The approach makes no domain-specific modifications to the input and benefits from training on very large unannotated corpora.  It achieves state-of-the-art performance on a number of tasks, including generating text from meaning representations and source code.  Such a system can have direct applications to intelligent conversation agents, source code assistant tools, and semantic-based Machine Translation.Bio: Ioannis Konstas is a postdoctoral researcher at the University of Washington, Seattle, collaborating with Prof. Luke Zettlemoyer since 2015. His main research interest focuses on the area of Natural Language Generation (NLG) with an emphasis on data-driven deep learning methods.He has received BSc in Computer Science from AUEB (Greece) in 2007, and MSc in Artificial Intelligence from the University of Edinburgh (2008). He continued his study at the University of Edinburgh and received his Ph.D. degree in 2014. He has previously worked as a Research Assistant at the University of Glasgow (2008), and as a postdoctoral researcher at the University of Edinburgh (2014).

DTEND;TZID=America/Los_Angeles:20170526T160000
DTSTART;TZID=America/Los_Angeles:20170526T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Building Adaptable and Scalable Natural Language Generation Systems
UID:20170526T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: One fundamental assumption in machine translation is that sentences are translated independently of each other. We attack this assumption by trying to achieve lexical translation consistence among sentences within the same document. An additional lexicon reuse feature is introduced to help the decoder select a more consistent translation. In this talk we will discuss the design of the reuse feature and show experimental results.

DTEND;TZID=America/Los_Angeles:20110824T143000
DTSTART;TZID=America/Los_Angeles:20110824T140000
LOCATION:4th Floor Large Conference Room [460]
SUMMARY:Introducing context-dependent features into machine translation (Interns Final Talk)
UID:20110824T140000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The 2014-2015 measles outbreak in California was a serious public health crisis. Health officials attributed the outbreak to the increasing number of children whose parents had secured exemptions from vaccination for various vaccine-preventable diseases (VPDs). We believe that exemption seeking is part of a broader culture of distrust driven in large part by stories circulating in social media. An under- standing of the dynamics of this broader culture is necessary if we are to develop health policies that do not simply address outcomes but rather the cultural basis for decisions leading to those outcomes. We reveal the dynamics of exemption seeking and the greater culture of distrust endemic to these sites by developing a generative statistical-mechanical model where stories are represented as net- works with actants such as parents, medical professionals, and religious institutions as nodes, and their various relationships as edges. We estimate the latent but unknown stories circulating on these sites by modeling the posts as a sampling of the hidden story graph. Working with a data set of â2 million posts crawled from parent- ing sites over a â5 year period, we uncover a strong, persistent story signal in which parents, driven by a distrust of government and medical institutions, devise strategies to secure exemptions for their children from required vaccinations. In these stories, it is the vaccines and not the VPDs that pose a threat to the children. Our method of analyzing social media conversations and the exchange of stories at scale can provide an alert mechanism to health officials, help lay the groundwork for devising community-specific messaging interventions, and inform policy making.Bio. Ehsan Ebrahimzadeh is a PhD candidate in the Electrical Engineering Department of UCLA, where he is simultaneously working towards my his degree in Applied Mathematics. Broadly speaking, he is interested in Statistics, Applied probability, and Data Analytics. Before joining UCLA in 2013, he received his MASc degree in Electrical Engineering from University of Waterloo, and BSc degrees in Mathematics and Electrical Engineering from Isfahan University of Technology.

DTEND;TZID=America/Los_Angeles:20160219T160000
DTSTART;TZID=America/Los_Angeles:20160219T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Chasing vaccination in social media: Narrative discovery from an unstructured corpus of text
UID:20160219T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Dealing with creative language and in particular with affective, persuasive and even humorous language has often been considered outside the scope of computational linguistics. Nonetheless it is possible to exploit current NLP techniques starting some explorations about it. We briefly review some computational experiences about these typical creative genres. We will start introducing techniques for dealing with emotional and witty language. Then we will talk about the exploitation of some extra-linguistic features: for example music and lyrics in emotion detection, and an audience-reaction tagged corpus of political speeches for the analysis of persuasive language. As examples of practical applications, we will present a system for automatized memory techniques for vocabulary acquisition in a second language, and an application for automatizing creative naming (branding).Bio: Carlo Strapparava is a senior researcher at FBK-irst (Fondazione Bruno Kessler - Istituto per la ricerca scientifica e Tecnologica) in the Human Language Technologies Unit. His research activity covers artificial intelligence, natural language processing, intelligent interfaces, human-computer interaction, cognitive science, knowledge-based systems, user models, adaptive hypermedia, lexical knowledge bases, word-sense disambiguation, affective computing and computational humour. He is the author of over 150 papers, published in scientific journals, book chapters and in conference proceedings. He also played a key role in the definition and the development of many projects funded  by European research programmes.He regularly serves in the program committees of the major NLP conferences (ACL, EMNLP, etc.). He was executive board member of SIGLEX, a Special Interest Group on the Lexicon of the Association for Computational Linguistics (2007-2010), Senseval (Evaluation Exercises for the Semantic Analysis of Text) organisation committee (2005-2010).On June 2011, he was awarded with a Google Research Award on Natural Language Processing, specifically on the computational treatment of creative language.

DTEND;TZID=America/Los_Angeles:20130318T160000
DTSTART;TZID=America/Los_Angeles:20130318T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Computational explorations of creative language
UID:20130318T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We will given an overview of  the challenges and early results of an EC-funded project,  named MateCat,whose goal is  developing an enhanced web-based CAT tool integrating new MT functionalities. In particular,MateCat will investigate the integration of MT into the CAT working process along three main directions:self-tuning MT, user adaptive MT, and informative MT.  In this seminar, we will report on recent activitiesconcerning domain and on-line MT adaptation and will introduce the first version of the MateCat tool,that will be officially released in open source by the end of the year.

DTEND;TZID=America/Los_Angeles:20121031T160000
DTSTART;TZID=America/Los_Angeles:20121031T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Towards  the integration of human and machine translation
UID:20121031T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In recent years, deep learning has had a huge impact on natural language processing surpassing the performance of many other statistical and machine learning methods. One of the many promises of deep learning is that features are learned implicitly and that there is no need to manually engineer features for good performance. However, neural network performance is highly dependent on network architecture and selection of hyper-parameters. In many ways, architecture engineering has supplanted feature engineering in NLP tasks. In this talk, I will focus on two ways neural network structures can be learned while concurrently training models. First, I'll present a regularization scheme for learning the number of neurons in a neural language model during training (Murray and Chiang 2015) and show how it can be used in a Machine Translation task. Then, I'll move onto a Visual Question Answering task where denotations are selected by executing a probabilistic program that models non-determinism with neural networks (Murray and Krishnamurthy 2016).Kenton Murray is a PhD student in the Natural Language Processing Lab at the University of Notre Dame's Computer Science and Engineering Department working with David Chiang. His research is on neural methods for human languages, particularly machine translation and question answering. Prior to Notre Dame, he was a Research Associate at the Qatar Computing Research Institute (QCRI) and received a Master's in Language Technologies from Carnegie Mellon University and a Bachelor's in Computer Science from Princeton University.

DTEND;TZID=America/Los_Angeles:20170106T160000
DTSTART;TZID=America/Los_Angeles:20170106T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Learning Neural Network Structures for Natural Language
UID:20170106T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this talk, I will present the idea of the constraint composite graph (CCG) associated with any combinatorial problem modeled as a weighted constraint satisfaction problem (WCSP). The CCG constitutes the first mathematical framework for simultaneously exploiting the numerical structure of the weighted constraints as well as the graphical structure of the variable-interactions in a WCSP. I will discuss a number of important applications of the CCG including its role in: (a) identification of tractable classes of WCSPs; (b) kernelization techniques for combinatorial problems; and (c) understanding the scope of incremental computation for hard combinatorial problems.Bio. Dr. Satish Kumar Thittamaranahalli (T. K. Satish Kumar) is a Research Scientist at the University of Southern California. He has published extensively on numerous topics in Artificial Intelligence spanning such diverse areas as Constraint Reasoning, Planning and Scheduling, Probabilistic Reasoning, Combinatorial Optimization, Approximation and Randomization, Heuristic Search, Model-Based Reasoning, Knowledge Representation and Spatio-Temporal Reasoning. He has served on the Program Committees of many international conferences in Artificial Intelligence and is a co-winner of the Best Student Paper Award from the 2005 International Conference on Automated Planning and Scheduling. Dr. Kumar received his PhD in Computer Science from Stanford University in March 2005. In the past, he has also been a Visiting Student at the NASA Ames Research Center, a Postdoctoral Research Scholar at the University of California, Berkeley, a Research Scientist at the Institute for Human and Machine Cognition, a Visiting Assistant Professor at the University of West Florida, and a Senior Research and Development Scientist at Mission Critical Technologies.

DTEND;TZID=America/Los_Angeles:20151113T160000
DTSTART;TZID=America/Los_Angeles:20151113T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Notes on the Constraint Composite Graph
UID:20151113T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This talk will be divided into two parts. In the first part I willtalk about using Transfer Learning techniques to improve the task ofWord Sense Disambiguation (WSD).Usually in supervised WSD, we suffer due to paucity of labeled data asthere are some words that occur less frequently in the data and itsvery difficult to get enough labeled data for these words. In suchcases it is very difficult to build high accuracy supervised learningmodels for these words. So, we propose an approach called TransFeat(based on the MDL principle) which ``transfers information", fromsimilar words in the form of a feature relevance prior to get improvedaccuracies on these rare words. Besides this, our experiments showthat we also get decent improvement in accuracy for words that havemore amount of labeled data available. TransFeat gives accuracies thatare in the worst case comparable to state-of-the-art on ONTONOTES andSENSEVAL-2 datasets.In the second part of the talk I will talk about incorporatingnon-local constraints in Named Entity Recognition (NER) systems. Themain idea is that some linguistic constraints (e.g. every occurrenceof the word ``Einstein" in the data should have the tag PERi.e. person ) are very useful and can give improved performance butthey are non - local and hence are intractable and can not beefficiently modeled using state-of-the-art sequence modeling methodslike CRFs. Though people have used Skip-chain CRFs (with LoopyBP)(Sutton and McCallum '04) and Gibbs Sampling (Finkel and Manning'05) to enforce these non-local constraints, but they turn out to bereally inefficient and custom-tailored to one particular kind ofconstraints (say) consistency constraints of the type mentionedabove. We propose a constrained version of EM in which a general setof constraints (not limited to consistency constraints!) can beincorporated into the model. In the end I will show some results ofthis approach on CoNLL 03 English and CoNLL 02 Spanish NER shared tasks.

DTEND;TZID=America/Los_Angeles:20090717T160000
DTSTART;TZID=America/Los_Angeles:20090717T150000
LOCATION:11 Large
SUMMARY:Transfer Learning for WSD & Non-local constraints for Named Entity Recognition
UID:20090717T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 

DTEND;TZID=America/Los_Angeles:20030627T160000
DTSTART;TZID=America/Los_Angeles:20030627T150000
LOCATION:10 Large
SUMMARY:Offline Strategies for Online Question Answering: Answering Questions Before They Are Asked and Maximum Entropy Models for FrameNet Classification
UID:20030627T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Most modern machine translation systems use phrase pairs as translation units, allowing for accurate modeling of phrase-internal translation and reordering. However phrase-based approaches are much less able to model sentence level effects between different phrase-pairs. We propose a new model to address this imbalance, based on a word-based Markov model of translation which generates target translations left-to-right. Our model encodes word and phrase level phenomena by conditioning translation decisions on previous decisions and uses a hierarchical Pitman-Yor Process prior to provide dynamic adaptive smoothing. This mechanism implicitly supports not only traditional phrase pairs, but also gapping phrases which are non-consecutive in the source.Yang Feng is a posdoc of the natural language group in USC/ISI. She got her ph.D degree from Institute of Computing Technology, Chinese Academy of Sciences. Her research interests are in all aspects of machine translation and machine learning focusing on graphical models and Bayesian inference.

DTEND;TZID=America/Los_Angeles:20130920T160000
DTSTART;TZID=America/Los_Angeles:20130920T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:A Markov Model of Machine Translation using Non-parametric Bayesian Inference (ACL 2013)
UID:20130920T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Visual question answering (Visual QA) requires comprehending and reasoning with both visual and language information, a characteristic ability that AI should strive to achieve. Merely in the past three years, over a dozen datasets have been released, together with many learning-based models that have been narrowing the gap between the humansâ performance and the machinesâ. On one popular dataset VQA, the state-of-the-art model achieves 71.4% accuracy, just 17% shy of that by humans.While seemingly remarkable, it needs a deeper investigation on what knowledge the machine actually learnsâdoes it understand the multi-modal information? Or it relies on and over-fits to the incidental dataset statistics. Moreover, current experimental setups mainly focus on training and testing within the same dataset. It is unclear how the learned model can be applied to the real environment where both the visual and language data might have mismatch.In this talk, I will present our recent studies to answer these questions. We show that the dataset design has a significant impact on what a model learns. Specifically, the resulting model can ignore the visual information, the question, or both while still doing well on the task. We thus propose automatic procedures to remedy such design deficiencies. We then show that the mismatch in language hinders transferring a learned model across datasets. To this end, we develop a domain adaptation algorithm for Visual QA to facilitate knowledge transfer. Finally, I will present a probabilistic framework of Visual QA algorithms to effectively leverage the answer semantics, drastically increasing the transferability. I will conclude the talk with future directions to advance Visual QA.Bio: Wei-Lun (Harry) Chao is a Computer Science PhD candidate at University of Southern California, working with Fei Sha. His research interests are in machine learning and its applications to computer vision, artificial intelligence, and health care. His recent work has focused on transfer learning toward vision and language understanding in the wild. His earlier research includes work on probabilistic inference, structured prediction for video summarization, and face understanding. He will be joining The Ohio State University as an assistant professor in 2019 Fall, following a one-year postdoc at Cornell University.

DTEND;TZID=America/Los_Angeles:20180720T160000
DTSTART;TZID=America/Los_Angeles:20180720T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Visual Question Answering: the Good, the Bad, and the Ugly
UID:20180720T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 

DTEND;TZID=America/Los_Angeles:20030829T160000
DTSTART;TZID=America/Los_Angeles:20030829T150000
LOCATION:11 Large
SUMMARY:Deepening Representations
UID:20030829T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this talk, I will introduce some of the technologies whichwe have developed in the project on an English reading assistant systemcalled English Reading Wizard. The technologies include a method formining translations from web (unparallel corpora), a method for wordtranslation disambiguation based on bootstrapping, which is calledBilingual Bootstrapping, and a general method of bootstrapping, which iscalled Collaborative Bootstrapping. First, I will introduce the mainfeatures of English Reading Wizard. Next, I will introduce each of themethods. The translation mining method is based on a naÃ¯ve Bayesianensemble and the EM algorithm. Bilingual Bootstrapping uses theasymmetric translation relationship between words in the two languagesin translation and can construct reliable classifiers for wordtranslation disambiguation. Collaborative Bootstrapping contains theco-training algorithm as its special case, and it uses the strategy ofuncertainty reduction in training of the two classifiers.Bio:Hang Li is a researcher at the Natural Language Computing Groupof Microsoft Research in Beijing, China. He is also adjunct professor ofXian Jiaotong University. Hang Li obtained a B.S. in ElectricalEngineering from Kyoto University (Japan) in 1988 and a M.S. in ComputerScience from Kyoto University in 1990. He earned his Ph.D. in ComputerScience from the University of Tokyo in 1998. >From 1990 to 2001, HangLi worked at the Research Laboratories of NEC Corporation in Kawasaki,Japan. He joined Microsoft Research in 2001. Â His research interestincludes statistical learning, natural language processing, data mining,and information retrieval. Hang Li's web site:http://research.microsoft.com/users/hangli/

DTEND;TZID=America/Los_Angeles:20031125T120000
DTSTART;TZID=America/Los_Angeles:20031125T223000
LOCATION:11th Floor Large
SUMMARY:Using Bilingual Data to Mine and Rank Translations
UID:20031125T223000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: How can we effectively explore the space of automated dialog systems? In this talk, I introduce WebPPL, a probabilistic programming language that provides a wide range of inference and optimization algorithms out of the box. This language makes it easy to express and combine probabilistic models, including regression and categorization models, highly structured cognitive models, models of agents that make sequential plans, and deep neural nets. I show that this also includes recent sequence-to-sequence architectures for dialog. I then use this framework to implement *dialog automation using workspaces*, a variation on these architectures that is aimed at dialogs that require sufficiently deep reasoning between utterances that it is difficult to learn how to automate them from transcripts alone.Bio: Andreas StuhlmÃ¼ller is a post-doctoral researcher at Stanford, working in Prof. Noah Goodman's Computation & Cognition lab, and founder of Ought Inc. Previously, he received his Ph.D. in Brain and Cognitive Sciences from MIT, where he was part of Prof. Josh Tenenbaum's Computational Cognitive Science group. He has worked on the design and implementation of probabilistic programming languages, on their application to cognitive modeling, and recently on dialog systems. He is broadly interested in leveraging machine learning to help people think.

DTEND;TZID=America/Los_Angeles:20170428T160000
DTSTART;TZID=America/Los_Angeles:20170428T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Modeling Dialog using Probabilistic Programs
UID:20170428T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This is a practice talk for my Ph.D. defense, whichwill be held on Aug 24th 3-5pm, SAL 322.An important problem in the area of homeland security and frauddetection is to identify abnormal entities in large datasets.Although there are methods from knowledge discovery and data miningfocusing on finding anomalies in numerical datasets, there has beenlittle work aimed at discovering abnormal or suspicious instances inlarge and complex semantic graphs whose nodes are richly connectedwith many different types of links. In this talk, I will describe anovel, domain-independent and unsupervised framework to identify suchinstances.  Besides discovering suspicious instances, we believe thatto complete the discovery process and to deal with the "curse offalse positives", a system has to convince the users by providingexplanations for its findings. Therefore, in the second part of thetalk I will describe an explanation mechanism to automaticallygenerate human-understandable explanations for the discoveredresults. Experimental results show that our discovery systemoutperforms state-of-the-art unsupervised network algorithms used toanalyze the 9/11 terrorist network by a large margin. Additionally, ahuman study we conducted demonstrates that our explanation system,which provides natural language explanations for its findings,allowed human subjects to perform complex data analysis in a muchmore efficient and accurate manner

DTEND;TZID=America/Los_Angeles:20060804T163000
DTSTART;TZID=America/Los_Angeles:20060804T153000
LOCATION:11 Large
SUMMARY:Ph.D. defense practice talk
UID:20060804T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract: We present ongoing research into OCR for both machine print and handwriting recognition. We utilize a neural network along with LSTM's to perform OCR directly from pixel intensity. We are exploring a few novel improvements, including using a CNN for feature extraction prior to the LSTM, and combining reinforcement learning into our training to directly optimize word error rate in our test-time decoding procedure, which utilizes a (non-differentiable) language-model based decoding of the LSTM output. Finally, we present the design of the OCR system we used to win a pilot project with the US Census for recognizing handwritten first and last names.Bio: Stephen Rawls is a research programmer and a PhD student at USC/ISI advised by Dr. Prem Natarajan. He works in the Computer Vision group at ISI on face recognition and OCR, among other projects.Huaigu Cao is a computer scientist at USC ISI. His interest of research includes image processing and pattern recognition.

DTEND;TZID=America/Los_Angeles:20160722T160000
DTSTART;TZID=America/Los_Angeles:20160722T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Title: LSTM's for OCR
UID:20160722T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: As machine learning algorithms and their application for NLP becomebetter understood, attention turns toward the production of annotatedcorpora to which they can be applied.  Numerous phenomena presentthemselves for annotation, including aspects in lexical semantics,discourse, pragmatics, and dialogue.  But several questionsimmediately must be answered:1. How does one obtain a balanced corpus to annotate?  What is abalanced corpus?2. How does one decide which aspects to annotate? How does oneadequately express the theory behind the phenomena in simple annotation steps?3. Which annotators does one hire?  How does one ensure that they are adequately trained?4. How does one establish a simple, fast, and trustworthy annotationprocedure?  What interfaces does one build?  How does one ensure thatthe interfaces do not affect the annotation results?5. How does evaluate the results? What are the appropriate agreementmeasures?  At which cutoff points should one re-do the annotations?How does one ensure improvement?6. How should one formulate and store the results?  How does oneensure compatibility with other existing resources?  How does one makeresults available for best impact?7. How does one report the annotation effort and results?  How doesone actually get a paper on this work published at an importantconference?  What should the paper contain?Despite their being so basic, there is almost no established procedureor standard set of answers to these questions today.  In this talk Idiscuss some of these aspects, pointing to the lessons learned in theongoing OntoNotes project (joint with BBN, the University of Colorado(PropBank), the University of Pennsylvania (Treebank), and ISI).

DTEND;TZID=America/Los_Angeles:20060922T163000
DTSTART;TZID=America/Los_Angeles:20060922T150000
LOCATION:11 Large
SUMMARY:Toward a 'Science' of Annotation: Experiences from OntoNotes
UID:20060922T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I will talk about recent progress in implementing an efficient method for doing a type of inferencing called abduction, or inference to the best explanation.  I will illustrate its wide applicability to a variety of language interpretation problems.  I'll describe our recent work on implementing ontologies, or logical theories of commonsense domains.  Then I will show how we are applying all this to the interpretation of metaphors.

DTEND;TZID=America/Los_Angeles:20121116T160000
DTSTART;TZID=America/Los_Angeles:20121116T150000
LOCATION:11th Floor Conference Room [1135]
SUMMARY:Abduction and Metaphor
UID:20121116T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: A major hurdle in building automated information retrieval systems forHindi text is the lack of an uniform encoding for text representation.Standards do exist, but noone seems interested. Every web contentpublisher seems to have their encoding system, making informationextraction a nightmare. We explore an unsupervised approach toconvert any given "unknown" encoding to UTF-8, by treating it as adecipherment problem. We also study how a little amount of supervisioncan improve decoding accuracy.

DTEND;TZID=America/Los_Angeles:20030905T160000
DTSTART;TZID=America/Los_Angeles:20030905T150000
LOCATION:11 Large
SUMMARY:Deciphering Hindi Scripts
UID:20030905T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this talk I will discuss some recent results from my lab on the relationship between reference resolution and coherence relations. Previous work found that pronoun interpretation is guided by the coherence relations between clauses (e.g., 'as a result', 'and then', 'and similarly'), e.g. Hobbs (1979), Kehler et al. (2008). For example, consider "Phil tickled Stan, and similarly Liz poked him" (preference to interpret 'him' as Stan) and "Phil tickled Stan, and as a result Liz poked him" (more consideration of Phil as the antecedent of 'him'). However, the linguistic and cognitive properties of these coherence representations are not yet fully understood, and it is also not yet clear whether this kind of coherence sensitivity extends straightforwardly to other kinds of reduced referring expressions in addition to pronouns (e.g. anaphoric demonstratives, which can in many languages be used to refer to humans as well). I will discuss experiments -- conducted using a visual-world eye-tracking paradigm as well as other methods -- that investigate the nature and generality of these coherence representations. In addition to investigating whether coherence effects extend to other reduced referring expressions, I have also explored the domain-generality of coherence representations, for example whether non-linguistic, visuo-spatial input (video clips of moving shapes) can prime (bias) subsequent reference resolution in a seemingly unrelated task. Time permitting, I will also discuss issues related to data analysis and the annotation of data collected through psycholinguistic experiments.Brief bio:Elsi Kaiser is an Assistant Professor of Linguistics at the University ofSouthern California, with a specialization in Psycholinguistics. Shereceived her Ph.D. from the University of Pennsylvania in 2003, and was apost-doc at the University of Rochester for two years before moving toUSC.  Her current research focuses on the comprehension of variousreferential forms (including pronouns, reflexives and demonstratives) indifferent languages, which she investigates using a range of tools,including eye-tracking.

DTEND;TZID=America/Los_Angeles:20100326T160000
DTSTART;TZID=America/Los_Angeles:20100326T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Discourse coherence effects in language processing: A psycholinguistic approach
UID:20100326T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We investigate the predictive power behind the language offood on social media. We collect a corpus of over three millionfood-related posts from Twitter and demonstrate that many latentpopulation characteristics can be directly predicted from this data:overweight rate, diabetes rate, political leaning, and homegeographical location of authors. For all tasks, our language-basedmodels significantly outperform the majority- class baselines.Performance is further improved with more complex natural languageprocessing, such as topic modeling. We analyze which textual featureshave most predictive power for these datasets, providing insight intothe connections between the language of food, geographic locale, andcommunity characteristics. Lastly, we design and implement an onlinesystem for real-time query and visualization of the dataset.Visualization tools, such as geo-referenced heatmaps,semantics-preserving wordclouds and temporal histograms, allow us todiscover more complex, global patterns mirrored in the language offood.Stephen Kobourov is a Professor of Computer Science at the Universityof Arizona. He completed BS degrees in Mathematics and ComputerScience at Dartmouth College in 1995, and a PhD in Computer Science atJohns Hopkins University in 2000.  He has worked as a ResearchScientist at AT&T Research Labs, a Hulmboldt Fellow at the Universityof TÃ¼bingen in Germany, and a Distinguished Fulbright Chair at CharlesUniversity in Prague.

DTEND;TZID=America/Los_Angeles:20170327T160000
DTSTART;TZID=America/Los_Angeles:20170327T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Analyzing the Language of Food on Social Media
UID:20170327T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: An Overview of Question Answering ChallengeJun'ichi Fukumoto and Tsuneaki KatoIn this talk, we will present an overview of Question AnsweringChallenge(QAC), which is the question answering task of the NTCIRWorkshop.  QAC-1 (the first evaluation of QAC) was carried outat NTCIR Workshop 3 in October 2002, and QAC-2 will be atNTCIR Workshop 4 in December 2003.  In the QAC, systems to beevaluated are expected to return exact answers consisting of a nounor noun compound denoting, for example, the names of persons,organizations, or various artifacts or numerical expressions suchas money, size, or date.  Those basically range over the NamedEntity (NE) elements of MUC and IREX but is not limited to them.QAC consists of three kinds of subtasks: Task 1, where the systemsare allowed to return ranked five possible answers; Task 2, wherethe systems are required to return a complete list of answers; andTask 3, the systems are required to answer series of questions, thathave anaphora and zero-anaphora.  We will present the results ofQAC-1, and vision and prospect of QAC-2.NTCIR -- the Way AheadNoriko KandoDr. Noriko Kando is the leader of NTCIR(Test Collections and Evaluationof IR, Text Summarization, Q&A, etc) project, and an associate professorof National Institute of Informatics (NII).  She got her Ph. D in 1995from Keio University.  Her research interest includes evaluation ofinformation retrieval systems, technologies to "Make Information Usablefor Users", cross-lingual information retrieval, and analysis of textstructure, genre, citation & link  She is a member of editorial boards ofInternational Journal on Information Processing and Management,ACM-Transaction on Asian Language Information Processing, etc.Jun'ichi Fukumoto and Tsuneaki Kato are task organizers of QAC.Dr. Jun'ichi Fukumoto is an associate professor of RitsumeikanUniversity.  He got his Ph. D in 1999 from University of ManchesterInstitute of Science and Technology.  His research interest includesQ&A, automatic summarization, and dialogue processing.Dr. Tsuneaki Kato is an associate professor of the University of Tokyo.He got his Dr. of Engineering in 1995 from Tokyo Institute ofTechnology.  His research interests includes multimodal dialogueprocessing, multimodal presentation generation and domain independentquestion and answering.  He is a member of editorial committee oftransaction on information and systems of The Institute of Electronics,Information and Communication Engineers.

DTEND;TZID=America/Los_Angeles:20031117T120000
DTSTART;TZID=America/Los_Angeles:20031117T103000
LOCATION:4th Floor
SUMMARY:An Overview of the QA Challenge + NTCIR -- The Way Ahead
UID:20031117T103000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: TBA(Note: This is part of the Summer Intern Series)

DTEND;TZID=America/Los_Angeles:20060823T160000
DTSTART;TZID=America/Los_Angeles:20060823T153000
LOCATION:11 Large
SUMMARY:Speeding-up Syntax-based Decoding
UID:20060823T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: There are many tools available to the NLP community for Natural Language Parsing, (i.e converting a raw sentence in to a parse-tree). NLP researchers usually use some "off-the-shelf" parser which has been trained on the Wall Street Journal (WSJ) corpora and then apply the WSJ-trained parser to their data. This works in many cases, especially for systems which use data from WSJ or similar corpora. However, in real life applications, the data may be compiled from many different sources and span different genres, and may not be similar to the WSJ corpora in terms of sentence structure, etc . A particular parser might parse well on some corpora and not so well on others. Choosing the right parser for your data may have an impact on the performance of the NLP system as a whole. But in order to measure the accuracy of any parser for a given corpus, we require a set of gold-standard parse trees corresponding to the sentences within the corpus. Generating gold-standard set takes a lot of manual work and in many real-life applications, it is not a feasible  task to generate gold-standard parses for large corpora.We attempted to build a system which can predict the accuracy (in terms of f-measure value) of the Charniak parser (a popular parsing tool) on any given sentence corpus. Without using any additional information (i.e gold std. parses), our system predicts "how accurately the Charniak parser could parse the given corpus". In order to evaluate our system's predictions on a particular corpus, we compute the "Correlation" measure between the "actual accuracies (using Gold-standard)" vs. "predicted accuracies (from our system)" for the given corpus. We tested our system on different corpora and using different methods and will present these results.

DTEND;TZID=America/Los_Angeles:20071005T163000
DTSTART;TZID=America/Los_Angeles:20071005T150000
LOCATION:11 Large
SUMMARY:Will this parser work with my data? - Predicting Parser Accuracy without Gold-Standard information
UID:20071005T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Since its inception more than 30 years ago, electronic mail (email)has developed into a powerful communication medium with applicationsthat extend well beyond simple asynchronous message exchange betweenindividuals. Automated tools to support the use of email inindividual, organizational and social contexts have receivedincreasing attention in recent years. Among the tasks that are nowsupported are filtering (e.g., spam detection), aggregation (e.g.,mailing list digests), workflow management (e.g., help desk routing),and reuse (e.g., retrospective search). We are interested in howtoday's email will be used in the future -- some will certainly bepreserved (indeed, some MUST be preserved!), and those records willserve as powerful evidence of how we lived our lives and organized oursocieties. The challenges of managing many types of electronic recordcollections are receiving increasing attention, but we are not awareof any work yet on supporting access to electronic mail archives.That will be the focus of this talk.We will introduce the Open Archival Information Systems (OAIS) model,and then focus on two key processes: ingestion and access. Our focusin ingestion is on support for review and redaction, which we believewill be key enablers to acquisition and near-term access. For access,we will address both browsing based on provenance (original order) anduser-guided reorganization based on search and visualization. Alongthe way, we will identify potentially productive opportunities toapply natural language processing technologies such as topicsegmentation, link detection, and summarization. We will thendescribe two test collections, and demonstrate a system that we havedeveloped to explore user-guided reorganization through visualizationfor one of those collections. We will conclude the talk by sketchingout a research agenda. At that point, we will expect suggestions andcomments from the audience. Knowing this audience, it is unlikelythat we will need to wait that long :-).

DTEND;TZID=America/Los_Angeles:20030124T160000
DTSTART;TZID=America/Los_Angeles:20030124T150000
LOCATION:11 Large
SUMMARY:Access to Archival Collections of Electronic Mail
UID:20030124T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Tree Adjoining Grammars have well-known advantages, but are typicallyconsidered too difficult for practical systems.  We demonstrate that,when done right, adjoining improves translation quality withoutbecoming computationally intractable.  Using adjoining to modeloptionality allows general translation patterns to be learned withoutthe clutter of endless variations of optional material, with extrainformation spliced in as needed.In this paper, we describe a novel method for learning a type ofSynchronous Tree Adjoining Grammar and associated probabilities fromaligned tree/string training data.  We introduce a method ofconverting these grammars to a weakly equivalent tree transducer forefficient decoding.  Finally, we show that adjoining results in anend-to-end improvement of +0.8 BLEU over a baseline statisticalsyntax-based MT model on a large-scale Arabic/English MT task.

DTEND;TZID=America/Los_Angeles:20090626T153000
DTSTART;TZID=America/Los_Angeles:20090626T150000
LOCATION:11 Large
SUMMARY:Synchronous Tree Adjoining Machine Translation (Practice talk for EMNLP)
UID:20090626T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The Inversion Transduction Grammar (ITG) of \cite{DekaiCL} generates asynchronous parse tree for a given pair of sentences in two languages. Byallowing inversion of the order of children at any level of thesynchronous parse tree, ITG can do recursive, systematic word reordering.We made a version of ITG where the nonterminals are lexicalized by wordpairs and the inversions are dependent on the so-lexicalized nonterminals.We found out that after lexicalization, the Alignment Error Rate (AER)against gold standard is reduced for short sentences. ITG parsingcomplexity is high polynomial. We proposed a pruning techique thatutilizes IBM Model 1 to estimate the inside and outside probability of abitext cell. Taking a step further, we applied the A* parsing having beenused for monolingual parsing to ITG.  I will talk about the heuristicestimates we used for A* parsing for Viterbi alignment selection anddecoding.

DTEND;TZID=America/Los_Angeles:20050608T163000
DTSTART;TZID=America/Los_Angeles:20050608T150000
LOCATION:4th floor
SUMMARY:Lexicalization and A* Searching for Inversion Transduction Grammar
UID:20050608T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We have recently proposed Recognizing Textual Entailment (RTE) as ageneric task that captures major semantic inferences across differentnatural language processing applications. The talk will first reviewthe motivation and definition of the textual entailment task and thePASCAL RTE-1,2&3 Challenges benchmarks. Then we will demonstratedirections for building textual entailment systems, based on knowledgeacquisition and inference, and for utilizing them within concreteapplications. Furthermore, we suggest that textual entailment modelingmay become a comprehensive framework for applied semanticsresearch. Such framework introduces useful variants of known semanticproblems and highlights important tasks which were hardly investigatedso far at an applied computational level. The semantic modelingperspective will be illustrated in more detail by a case study for anentailment-based variant of word sense disambiguation.About the speaker:Ido Dagan is a Senior Lecturer at the Department of Computer Scienceat Bar Ilan University, Israel. His areas of interest are largelywithin empirical NLP, particularly empirical approaches for appliedsemantic processing. In the last few years Ido and his colleaguesintroduced <i>textual entailment</i> as a generic framework for appliedsemantic inference and have organized the first three rounds of thePASCAL Recognizing Textual Entailment Challenges. Ido received hisPh.D. from the Technion. He has been a research fellow at the IBMHaifa Scientific Center and a Member of Technical Staff at AT&T BellLaboratories. During 1998-2003 he was co-founder and CTO ofFocusEngine and VP of Technology of LingoMotors.

DTEND;TZID=America/Los_Angeles:20070330T163000
DTSTART;TZID=America/Los_Angeles:20070330T150000
LOCATION:11 Large
SUMMARY:Textual entailment as a framework for applied semantics
UID:20070330T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Tree-to-String Alignment ModelsMachine translation systems typically rely on some form alignment as apreprocessing step. Typically, these alignments take the form ofword-to-word alignments. In this talk, we will introduce severalmodels aimed at aligning foreign words to either English words ornodes in the English parse tree. Such word-to-node alignments offerseveral potential advantages over traditional word-to-wordalignments. Firstly, since the extraction process for some syntacticsystems explicitly considers the English trees, we expect that alsoconsidering the trees at alignment time will produce alignments thatwill better suit the extraction process. Secondly, aligning foreignfunction words to English tree nodes can admits highly desirablesyntactic transfer rules which cannot be directly as word-to-wordalignments. Finally, word-to-node alignments can effectively modelmany-to-one alignments.  We present four models of increasingcomplexity and show preliminary results for each model.

DTEND;TZID=America/Los_Angeles:20090828T160000
DTSTART;TZID=America/Los_Angeles:20090828T150000
LOCATION:11 Large
SUMMARY:Intern Final Talks
UID:20090828T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I will describe some recent work on "natural logics", logics for languagesthat are more similar to human languages than traditional first orderpredicate logic, giving particular attention to questions about what thesyntax encodes about semantic relations among sentences. On everyone'sview, some but not all entailments are syntactically encoded (in a sensethat I will define precisely), but, beyond this starting point,controversy starts almost immediately. Considering some particularexamples, I will sketch methods for addressing some of the basicquestions.

DTEND;TZID=America/Los_Angeles:20050513T163000
DTSTART;TZID=America/Los_Angeles:20050513T150000
LOCATION:11 Large
SUMMARY:Natural Logic
UID:20050513T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Weighted hypergraphs arise naturally in parsing, syntax-based machinetranslation and other tree-based NLP models, as well as in weightedlogic programming.We present an open-source toolkit for the representation andmanipulation of weighted hypergraphs. It provides hypergraph datastructures and algorithms, such as the shortest path andinside-outside algorithms, composition, projection, and more. Inaddition, it provides functionality to optimize hypergraph featureweights from training data. We model finite-state machines as aspecial case. We give a tutorial on hypergraphs and the hypergraphtoolkit and explain how you can use these tools in your research.This is joint work with Jonathan Graehl.Bio: Markus Dreyer is a Senior Research Scientist at SDL LanguageWeaver. His research focuses on algorithms and machine learningtechniques for large-scale machine translation and NLP. He receivedhis PhD in Computer Science from Johns Hopkins University, advised byJason Eisner. For more information, see http://goo.gl/d6mHUi.

DTEND;TZID=America/Los_Angeles:20140919T160000
DTSTART;TZID=America/Los_Angeles:20140919T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:An open-source toolkit for the representation, manipulation and optimization of weighted hypergraphs
UID:20140919T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract: Natural language generation (NLG) is a well-studied and still very challenging field in natural language processing. One of the less studied NLG tasks is the generation of creative texts such as jokes, puns, or poems. Multiple reasons contribute to the difficulty of research in this area. First, no immediate application exists for creative language generation. This has made the research on creative NLG extremely diverse, having different goals, assumptions, and constraints. Second, no quantitative measure exists for creative NLG tasks. Consequently, it is often difficult to tune the parameters of creative generation models and drive improvements to these systems. Lack of a quantitative metric and the absence of a well-defined immediate application makes comparing different methods and finding the state of the art an almost impossible task in this area. Finally, rule-based systems for creative language generation are not yet combined with deep learning methods. Rule-based systems are powerful in capturing human knowledge, but it is often too time-consuming to present all the required knowledge in rules. On the other hand, deep learning models can automatically extract knowledge from the data, but they often miss out some essential knowledge that can be easily captured in rule-based systems.In this work, we address these challenges for poetry generation, which is one of the main areas of creative language generation. We introduce password poems as a new application for poetry generation. These passwords are highly secure, and we show that they are easier to recall and preferable compared to passwords created by other methods that guarantee the same level of security. Furthermore, we combine finite-state machinery with deep learning models in a system for generating poems for any given topic. We introduce a quantitative metric for evaluating the generated poems and build the first interactive poetry generation system that enables users to revise system generated poems by adjusting style configuration settings like alliteration, concreteness and the sentiment of the poem. The system interface also allows users to rate the quality of the poem. We collect usersâ rating for poems with various style settings and use them to automatically tune the system style parameters. In order to improve the coherence of generated poems, we introduce a method to borrow ideas from existing human literature and build a poetry translation system. We study how poetry translation is different from translation of non-creative texts by measuring the language variation added during the translation process. We show that humans translate poems much more freely compared to general texts. Based on this observation, we build a machine translation system specifically for translating poetry which uses language variation in the translation process to generate rhythmic and rhyming translations.Bio: Marjan Ghazvininejad is a Ph.D. student at ISI working with Professor Kevin Knight.

DTEND;TZID=America/Los_Angeles:20180504T160000
DTSTART;TZID=America/Los_Angeles:20180504T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Neural Creative Language Generation (PhD Defense Practice Talk)
UID:20180504T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Brain MRI offers tremendous opportunity to learn about cortical anatomy, function and connectivity. In this talk I will go over several standard techniques for image understanding used in brain imaging. These include image registration, segmentation, tractography and graph-based connectivity analyses. Among these algorithms, we routinely encounter both continuous and discrete types of analysis. Non-linear image registration, typically formalized as a diffeomorphism on the image domain, is an example of the former:  we may ask for instance how much volume change the brain is experiencing locally over time, clearly a continuous measure. In another example, we may trace continuous curves in space that best fit a Diffusion Tensor MR image to approximate fibers in the brainâs white matter. One the other hand, connectivity between distinct units within the nervous system is an example of discrete analysis: for instance, the brainâs functionally distinct regions are thought of as nodes in a graph, whose edges are defined by the connecting fiber models.After a brief description of the standard methods at hand, I will suggest an approach for combining the two types of analysis. By assuming the continuous paradigm for connectivity, we can push our connectome model from being a discrete graph to being a linear operator. Using some well-known results from operator theory, we can decompose the operator into its resident âeigen-networks,â and apply continuous methods directly. As an example, we can spatially register connectivity matrices with spatially distributed nodes. Finally, I will show two simple examples of continuous analogues for standard graph theory measures, and their potential application for an Alzheimer âs disease study.Bio: Boris Gutman received his B.S. in Applied Mathematics and PhD in Biomedical Engineering from UCLA before joining USCâs Imaging Genetics Center (IGC). He is currently a post-doctoral scholar at the IGC, under the supervision of Professor Paul M. Thompson.

DTEND;TZID=America/Los_Angeles:20141010T160000
DTSTART;TZID=America/Los_Angeles:20141010T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Interplay between Continuous and Discrete Aspects of Brain Image Analysis
UID:20141010T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Many machine learning problems involve making jointpredictions over a set of mutually dependent output variables. Thedependencies between output variables can be represented by astructure, such as a sequence, a tree, a clustering of nodes, or agraph. Structured prediction models have been proposed for problems ofthis type. In this talk, I will describe a collectionof results that improve several aspects of these approaches. Ourresults lead to efficient and effective algorithms for learning structuredprediction models, which, in turn, support weak supervision signals and improve training and evaluation speed.I will also discuss potential risks and challenges when using structured prediction modelsBio: Kai-Wei Chang is an assistant professor in the Department ofComputer Science at the University of California, Los Angeles. Hehas published broadly in machine learning and natural language processing. Hisresearch has mainly focused on designing machine learning methods forhandling large and complex data. He has been involved in developingseveral machine learning libraries, including LIBLINEAR, VowpalWabbit, and Illinois-SL. He was an assistant professor at the Universityof Virginia in 2016-2017. He obtained his Ph.D. from the University ofIllinois at Urbana-Champaign in 2015 and was a post-doctoral researcher at Microsoft Research in 2016.Kai-Wei was awarded the EMNLP Best Long Paper Award (2017),  KDDBest Paper Award (2010), and the Yahoo! Key Scientific Challenges Award(2011). Additional information is available at http://kwchang.net.

DTEND;TZID=America/Los_Angeles:20171103T160000
DTSTART;TZID=America/Los_Angeles:20171103T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Structured Predictions: Practical Advancements and Applications in Natural Language Processing
UID:20171103T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Understanding what people mean when they use metaphoric language is a central problem in natural language understanding. Metaphors give a partial understanding of one kind of experience in terms of another, highlighting similarities and hiding differences. In this talk, I give an overview of the problems posed by metaphoric language. I then describe ongoing crosslinguistic work on the knowledge-based interpretation of metaphors by abductive inference. This work moves us toward a better understanding not only of what people are saying with metaphors but also how the metaphors used by groups of people (e.g., the supporters and opponents of gun control) expose their different world views.Bio:Jonathan Gordon is a postdoctoral researcher at the USC Information Sciences Institute, where he is advised by Jerry Hobbs. His 2014 doctoral dissertation, 'Inferential Commonsense Knowledge from Text', was supervised by Lenhart Schubert at the University of Rochester. Jonathan's research interests include natural language understanding, semantics, and knowledge extraction.

DTEND;TZID=America/Los_Angeles:20150116T160000
DTSTART;TZID=America/Los_Angeles:20150116T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Towards the Interpretation of Metaphoric Language
UID:20150116T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 

DTEND;TZID=America/Los_Angeles:20030718T160000
DTSTART;TZID=America/Los_Angeles:20030718T150000
LOCATION:11 Large
SUMMARY:A Maryland Yankee in King Eduard's Court: Some Remarks on a Year in Paradise
UID:20030718T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract: How to use NLP techniques to help medical researchers is crucial now. And making use of millions of medical passages is a good starting point. By doing this, we can extract useful information from these papers and help medical researchers a lot.Iâll introduce a simple method to extract relations between proteins using AMR. By using this rule-base system, we can get AMR representation to simplified AMR(SMR) which only contains protein relation information.Bio: Xiang Li(Lorraine) is a 2016 summer intern under the supervision of Prof Kevin Knight and Prof Daniel Marcu. She is also going to be a PhD student at the University of Massachusetts Amherst in Andrew McCallumâs research group in this coming Fall. She got her B.S at the East China Normal University, Shanghai, China and got her M.S at the University of Chicago. Her research interest mainly focused on natural language processing and machine learning.

DTEND;TZID=America/Los_Angeles:20160819T160000
DTSTART;TZID=America/Los_Angeles:20160819T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Event extraction from AMR representations
UID:20160819T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract: This work is geared towards developing pedagogically-tuned information retrieval systems to help learners select the most informative documents as a reading list for a given query over a given technical corpus. This work will enable learners to understand complex subjects more quickly. I will discuss our overall methodology, our efforts to study dependency between topics within a technical corpus and improvements to evaluating topic quality. I will describe ongoing efforts to study a document's pedagogical value to the end user and future directions for this enterprise.Bio: Gully Burns' focus is to develop pragmatic knowledge engineering systems for scientists in collaboration with experts from the field of AI. He was originally trained as a physicist at Imperial College in London before switching to do a Ph.D. in neuroscience at Oxford. He came to work at USC in 1997, developing the 'NeuroScholar' project in Larry Swanson's lab before joining the Information Sciences Institute in 2006. He is as Research Lead at ISI.

DTEND;TZID=America/Los_Angeles:20160506T160000
DTSTART;TZID=America/Los_Angeles:20160506T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Title: The TechKnAcq Project: Building Pedagogically Tuned Reading Lists from Technical Corpora
UID:20160506T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Ensembles of machine experts, from simple linear classifiers to complex hidden Markov models, have out-performed single experts across many applications. Likewise, ensembles have been central to computing with human experts, e.g. for data annotation. This widespread use of ensembles, albeit largely heuristic, is motivated by their better generalization and robustness to ambiguity in the production, representation, and processing of information.This talk will focus on three important problems which contribute towards a unified computational framework for ensembles of diverse experts. The first problem deals with "modeling" a diverse ensemble. I will present our proposed Globally-Variant Locally-Constant (GVLC) model as a statistical framework for answering this question. The second question is about "analysis", where I will address the link between ensemble diversity and performance using statistical learning theory. The final segment of my talk will focus on "designing" an ensemble of diverse linear classifiers, specifically conditional maximum entropy models. Practical applications throughout the talk will include emotion classification from speech, text classification, and crowd-sourcing for automatic speech recognition.Speaker bio: Kartik Audhkhasi received B.Tech. in Electrical Engineering and M.Tech. in Information and Communication Technology from Indian Institute of Technology, Delhi in 2008. He is currently pursuing the Ph.D. degree in Electrical Engineering from University of Southern California, Los Angeles. His thesis research focuses on modeling, analysis, and design of ensembles of multiple human or machine experts. He is also interested in crowd-sourcing for speech and language processing. His broad interests include machine learning and signal processing. Kartik is the recipient of the Annenberg, IBM, and Ming Hsieh Institute PhD fellowships, and best teaching assistant awards of the EE department at USC.

DTEND;TZID=America/Los_Angeles:20130208T160000
DTSTART;TZID=America/Los_Angeles:20130208T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:A Computational Framework for Ensembles of Diverse Experts
UID:20130208T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this talk, I will present my current work on language understandingin the project, Mission Rehearsal Exercise(MRE). One of the challengesin a dialogure system is to provide a robust understanding/parsingcompoment. We applied both Finte State Model and Statistical LearningModel for the parsing of separate sentences of dialogue utterances.Their performances are evaluated and compared with a new blind set.And we hope to incorporate them to make a better solution in thisspecific application.

DTEND;TZID=America/Los_Angeles:20030404T160000
DTSTART;TZID=America/Los_Angeles:20030404T150000
LOCATION:11 Large
SUMMARY:Natural Language Understanding in MRE
UID:20030404T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In previous work on "Learning by Reading" we successfully extracted entities, states and events from technical natural language descriptions of processes. The research described here is aimed at the automatic discovery of causal and temporal ordering relations among states and events, specifically, among molecular and other events in biomedical articles. We have annotated causal and temporal relations in articles on the cell cycle, and we discuss our annotation guidelines and the issue of inter-annotator agreement. We then describe the natural language parsing and the inference system we use to extract these relations. We have created axioms manually for this system, focusing on temporal, causal and aspectual information and we have used semi-automatic means to augment these axioms. We have evaluated the performance of this system, and the results are promising.

DTEND;TZID=America/Los_Angeles:20090319T143000
DTSTART;TZID=America/Los_Angeles:20090319T140000
LOCATION:4th floor CR
SUMMARY:Discovering Causal and Temporal Relations in Biomedical Texts (practice talk for AAAI Spring Symposium)
UID:20090319T140000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Large-scale labeled corpora are always critical for Natural Language Processing tasks using statistical machine learning methods, but at the great expense of human labor of annotation. While we have various labeled corpora in different languages at hand, such as English and Chinese, either resource-rich or resource-poor, can these corpora be taken full advantage of for NLP tasks in different languages to help each other? The difficult lies in the fact that parallel corpora with aligned NLP entities are hard to acquire. In this talk, I shall first discuss how to generate pseudo-parallel corpora for relation extraction via machine translation and entity alignment techniques, and then I will proceed to apply this corpora to statistical ML-based relation extraction in terms of the degree of supervision: (1) supervised learning; (2) bilingual co-training; (3) bilingual active learning. This talk is chiefly based on the ACL-2014 paper âBilingual Active Learning for Relation Classification via Pseudo Parallel Corporaâ.Bio: Longhua Qian is a visiting researcher from the School of Computer Science and Technology, Soochow University, China. He joined the Natural Language Group and will work with Professor Kevin Knight and his team members for one year. He will participate in on-going projects on Abstract Meaning Representation (AMR) and Machine Reading etc. His mainly focuses on information extraction, relation extraction, and entity linking etc. He is also interested in extracting information from clinical medical records and building social networks from free text.

DTEND;TZID=America/Los_Angeles:20150417T160000
DTSTART;TZID=America/Los_Angeles:20150417T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Exploiting Bilingual Corpora for Relation Extraction
UID:20150417T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In the last years a standard model in statistical machinetranslation has emerged, which is based on the translationof sequences of words (so-called "phrases") at a time.I will describe this model, how to train and decode with it,but the focus of this talk will be how to address thechallenges to advance and move beyond the model: my thesiswork on noun phrase translation, making use of syntax, andbetter modeling, such as discriminative training.Bio: Philipp Koehn is the author of papers on natural languageprocessing, machine translation, and machine learning. Hereceived his PhD from the University of Southern Californiain 2003 (advisor: Kevin Knight), and is currently employed asa postdoc at the Massachusetts Institute of Technology, workingwith Michael Collins. He has worked at AT&T Laboratories ontext-to-speech systems, and at WhizBang! Labs on textcategorization.

DTEND;TZID=America/Los_Angeles:20040524T170000
DTSTART;TZID=America/Los_Angeles:20040524T160000
LOCATION:11 Large
SUMMARY:Challenges in Statistical Machine Translation
UID:20040524T160000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Computational stylometry is the automatic assignment ofauthor properties (e.g., identity, gender, personality,region, age, period, ideology, ...) to a text. Applicationsrange from forensic use to literary scholarship. Themethodology currently most successful is based on the wellknown approach to  text categorization using trainingdata in the form of texts with known classes. The approachworks by extracting text features, selecting the best onesusing statistical methods, representing the text as a vectorof these features, and applying machine learning methods tothe resulting vectors with associated classes. The maindifference with the original text categorization approach isthat the extracted text features may be complex andlinguistically motivated (e.g. syntactic features).I will describe some recent applications at the Universityof Antwerp using this methodology: personality detection,author assignment with many authors and short texts, scribedetection  in medieval texts, provenance and ideology detectionin Kenyan news articles, etc.I will then focus on an empirical comparison of therobustness of different feature types in differentsituations.Bio:Walter Daelemans (PhD in Computational Linguistics, University of Leuven, 1987). Trained as a linguist and psycholinguist at the Universities of Antwerpen and Leuven, he specialised in computational linguistics and held research posts at the University of Nijmegen and the AI Lab of the University of Brussels before becoming a lecturer in Computational Linguistics and Artificial Intelligence at Tilburg University where he founded an early research group on machine learning of language (ILK). Since 1999 he is full-time professor at the University of Antwerp where he also heads the computational linguistics group within the CLiPS research centre. His mainresearch interests are in machine learning of language (especially memory-based learning), text analytics, and computational psycholinguistics. He co-founded ACL's Special Interest Group on Natural Language Learning (SIGNLL) and its associated conference and shared task series (CoNLL).

DTEND;TZID=America/Los_Angeles:20100430T160000
DTSTART;TZID=America/Los_Angeles:20100430T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Robust features for Computational Stylometry
UID:20100430T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 1)Statistical models have outperformed neural models in machinetranslation, until recently, with the introduction of the sequence tosequence neural model. However, this model's performance suffers greatlywhen starved of bilingual parallel data. This talk will discuss severalstrategies that try to overcome this low-resource challenge, includingmodifications to the sequence to sequence model, transfer learning, dataaugmentation, and the use of monolingual data.2)Neural machine translation is effective for language pairs with large datasets, but falls short to traditional methods (e.g. phrase or syntax-based machine translation) in the low-resource setting. However, these classic approaches struggle to translate out-of-vocabulary tokens, a limitation that is amplified when there is little training data. In this work, we augment a syntax-based machine translation system with a module that provides translations of out-of-vocabulary tokens. We present several language-independent strategies for translation of unknown tokens, and benchmark their accuracy on an intrinsic out-of-vocabulary translation task across a typologically diverse dataset of sixteen languages. Lastly, we explore the effects of using the module to add rules to a syntax-based machine translation system on overall translation quality.Bio:Leon Cheung is a second year undergraduate from UC San Diego. Thissummer he has been working with Jon May and Kevin Knight to improveneural machine translation for low resource languages.Nelson Liu is an undergraduate at the University of Washington, where he works with Professor Noah Smith. His research interests lie at the intersection of machine learning and natural language processing. Previously, he worked at the Allen Institute for Artificial Intelligence on machine comprehension---he is currently a summer intern at ISI working with Professors Kevin Knight and Jonathan May.

DTEND;TZID=America/Los_Angeles:20170908T160000
DTSTART;TZID=America/Los_Angeles:20170908T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:1)Improving Low Resource Neural Machine Translation 2)Language-Independent Translation of Out-of-Vocabulary Words
UID:20170908T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Extracting knowledge from unstructured text has been a long-standinggoal of NLP. The advent of the Web further increases its urgency bymaking available billions of online documents. To represent theacquired knowledge that is complex and heterogeneous, we needfirst-order logic. To handle the inherent uncertainty and ambiguity inextracting and reasoning with knowledge, we need probability.Combining the two has led to rapid progress in the emerging field ofstatistical relational learning. In this talk, I will show thatstatistical relational learning offers promising solutions forconquering the knowledge-extraction quest. I will present Markovlogic, which is the leading unifying framework for representing andreasoning with complex and uncertain knowledge, and has spawned anumber of successful applications for knowledge extraction from theWeb. In particular, I will present OntoUSP, an end-to-end knowledgeextraction system that can read text and answer questions. OntoUSP iscompletely unsupervised and benefits from jointly conducting ontologyinduction, population, and knowledge extraction. Experiments show thatOntoUSP extracted five times as many correct answers compared tostate-of-the-art systems, with a precision of 91%.

DTEND;TZID=America/Los_Angeles:20100726T160000
DTSTART;TZID=America/Los_Angeles:20100726T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Statistical Relational Learning for Knowledge Extraction from the Web
UID:20100726T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Natural language interfaces designed for agents that interact with usersin shared environments (e.g. training simulators, videogames) mustincorporate knowledge about the users' context in order to address themany ambiguities of situated language use. We introduce a model ofsituated language acquisition that operates in two phases.  First,intentional context is represented and inferred from user actions usingprobabilistic context free grammars.  Then, utterances are mapped ontothis representation in a noisy channel framework.  The acquisition modelis trained on unconstrained speech collected from subjects playing aninteractive game, and tested using an understanding task.  Discussion ofresults focuses both on the implications for theoretical models ofcognition, as well as, for natural language applications in sharedenvironments.

DTEND;TZID=America/Los_Angeles:20050623T120000
DTSTART;TZID=America/Los_Angeles:20050623T103000
LOCATION:11 Small
SUMMARY:Intentional Context in Situated Language Learning
UID:20050623T103000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 

DTEND;TZID=America/Los_Angeles:20050211T163000
DTSTART;TZID=America/Los_Angeles:20050211T150000
LOCATION:11 Large
SUMMARY:Unsupervised Word Sense Disambiguation Using Wordnet Relatives
UID:20050211T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Most data-driven dependency parsing approaches assume that thestructure of sentences is represented as trees. Although trees haveseveral desirable properties from a computational perspective, thestructure of linguistic phenomena that go beyond shallow syntax oftencannot be fully captured by tree representations. I will describedata-driven dependency parsing approaches that produce more generalgraphs as output, and present results obtained with these approacheson predicate-argument structures extracted from CCG and HPSG datasets.Kenji Sagae is a Research Scientist in the Institute for Creative Technolgies at the University of Southern California, and a Research Assistant Professor in the USC Computer Science Department.  He received his PhD from Carnegie Mellon University in 2006.  Prior to joining USC in 2008, he was a research associate at the University of Tokyo. His main area of research is Natural Language Processing, focusing on data-driven approaches for syntactic parsing, predicate-argument analysis and discourse processing. His current work includes the application of these techniques in analysis of personal narratives in blog posts, the study of child language, spoken dialogue systems, and multimodal processing.

DTEND;TZID=America/Los_Angeles:20140228T160000
DTSTART;TZID=America/Los_Angeles:20140228T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Dependency parsing with directed graph output
UID:20140228T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Error analysis on grammars extracted for Machine Translation showsthat bad and useless translation rules are usually caused by badalignments. In this work, we improve previous work on hierarchicaldiscriminative alignment by incorporating knowledge of foreign sideparse trees, output from other aligners, and a look-ahead to grammarextraction. We give examples and results on Chinese to Englishtranslation.

DTEND;TZID=America/Los_Angeles:20100825T143000
DTSTART;TZID=America/Los_Angeles:20100825T140000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Intern Final Talk: Making Discriminative Alignment Smarter
UID:20100825T140000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Traditional statistical MT systems mostly work on the word-andphrase-level. For different language pairs, the performance of suchsystems vary from some 15% to 35%. These systems suffer from problemssuch as sparse data, with huge vocabulary sizes leading to lessreliable probability estimates. In our current research, we aim tocome up with a better MT system by looking inside the words. Almost inevery language, a root (stem) can have many different forms(inflectional, derivational, etc.). If we can identify the roots, thesize of the vocabulary will quite small, and we can have betterprobability estimates, reducing the sparse data problem andpotentially leading to higher accuracy. We are trying to come up witha model that induces morphology automatically from a bilingual corpusand achieves this improvement.

DTEND;TZID=America/Los_Angeles:20030425T160000
DTSTART;TZID=America/Los_Angeles:20030425T150000
LOCATION:11 Large
SUMMARY:Statistical MT with Bilingual Morphology
UID:20030425T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Despite all the recent successes of machine translation, when itcomes to high quality publishable translation, human translatorsare still unchallenged. Since we can't beat them, can we helpthem to become more productive? I will talk about some recentwork on developing assistance tools for human translators.You can also check out a prototype at http://www.caitra.org/

DTEND;TZID=America/Los_Angeles:20121026T160000
DTSTART;TZID=America/Los_Angeles:20121026T30000
LOCATION:10th Floor Conference Room [1026]
SUMMARY:Computer Aided Translation
UID:20121026T30000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract numerical representations of word and term content are very popular in NLP applications of behavioral analysis, like sentiment analysis, where the low dimensional representation allows for the use of complicated machine learning techniques, despite the lack of annotated in-domain data. In this presentation we will discuss our experiments on automatically expanding manually annotated lexica of linguistic norms, starting from word emotion norms and generalizing to include higher order terms, norms beyond emotion (like concreteness and age of acquisition) as well as languages other than English. We will present our attempts at domain adaptation of these norms, as well as the composition of norms for larger lexical units via their constituents by utilizing distributional semantic representations. As examples of actual applications we will present a highly ranked system of sentiment analysis submitted to SemEval 2014 and a multi-modal depression diagnosis system for German submitted to AVEC 2014.Bio: Nikolaos Malandrakis is a third year PhD student at the USC Computer Science Department and a research assistant at the Signal Analysis and Interpretation Laboratory (SAIL). He is originally from Chania, Greece, where he completed a BSc and MSc in Computer Engineering at the Technical University of Crete.

DTEND;TZID=America/Los_Angeles:20141031T160000
DTSTART;TZID=America/Los_Angeles:20141031T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Generating Psycholinguistic Norms
UID:20141031T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We describe existing forward and backward bisimulation minimisationalgorithms for nondeterministic automata and extend these algorithmsto weighted tree automata. The extended algorithms, which work for allsemirings, retain the time complexity of their counterparts forunweighted tree automata for additively cancellative semirings, andare only slightly higher (linear instead of logarithmic in the numberof states) on other semirings. We describe the effectiveness of animplementation of these algorithms on a typical task in naturallanguage processing.This is joint work with Johanna Hogberg, Umea University and AndreasMaletti, Technische Universitat Dresden.

DTEND;TZID=America/Los_Angeles:20070608T160000
DTSTART;TZID=America/Los_Angeles:20070608T153000
LOCATION:11 Large
SUMMARY:Bisimulation Minimisation for Weighted Tree Automata
UID:20070608T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This summer we held a three-month workshop on syntax-driven machinetranslation, in which we learned syntactic transformations automaticallyfrom Chinese/English translated corpora and applied them to translate newtext.  We'll give a progress report!

DTEND;TZID=America/Los_Angeles:20040917T163000
DTSTART;TZID=America/Los_Angeles:20040917T150000
LOCATION:11 Large
SUMMARY:About Syntax Fest 2004 (Part II)
UID:20040917T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Machine Reading (MR) aims at bridging the gap between texts and a formal representation that a reasoning system can use to make inferences about the text. In the MR Program (MRP), the target ontology is given and the inferences are oriented to answer queries about a set of textual documents. Traditionally, this setting is approached by Information Extraction engines that use annotated texts to learn the mapping between the text and the entity classes and relations of the target ontology. However, in the current MRP setting, almost no annotated data is given, and the systems are expected to adapt to a new domain in a very short time. This setting introduces the need to develop new architectures able to learn from previous readings (of unannotated texts) and to leverage as much as possible the small amount of annotated data. The talk will report the current development of a system with these features.

DTEND;TZID=America/Los_Angeles:20101007T160000
DTSTART;TZID=America/Los_Angeles:20101007T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Toward a Reading Machine
UID:20101007T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The talk gives an overview of  Multilayered Extended Semantic Networks(abbreviated MultiNet), which is one of the most comprehensivelydescribed knowledge representation paradigms used as a semanticinterlingua in large-scale NLP applications and for linguisticinvestigations into the semantics and pragmatics of naturallanguage. As with other semantic networks, concepts are represented inMultiNet by nodes, and relations between concepts are represented asarcs between these nodes. Additionally to that, every node isclassified according to a predefined conceptual ontology forming ahierarchy of sorts, and the nodes are embedded in a multidimensionalspace of layer attributes and their values. MultiNet provides a set ofabout 150 standardized relations and functions which are described ina very concise way including an axiomatic apparatus, where the axiomsare classified according to predefined types. The representationalmeans of MultiNet claim to fulfill the criteria of universality,homogeneity, and cognitive adequacy. In the talk, it is also shown,how MultiNet can be used for the semantic representation of differentsemantic phenomena. To overcome the quantitative barrier in buildinglarge knowledge bases and semantically oriented computational lexica,MultiNet is associated with a set of tools including a semanticinterpreter NatLink for automatically translating natural languageexpressions into MultiNet networks, a workbench LIA for the computerlexicographer, and a workbench MWR for the knowledge engineer formanaging and graphically manipulating semantic networks. Theapplications of MultiNet as a semantic interlingua range from naturallanguage interfaces to the Internet and to dedicated databases, overquestion-answering systems, to systems for automatic knowledgeacquisition.About the speaker:Prof. Helbig is head of the chair Intelligent Information and CommunicationSystems at the University of Hagen, Germany. His main research areas areKnowledge Representation, Semantic Natural Language Processing, andQuestion-Answering.A CV can be found <a href="slides/CV-En-HH.pdf"> here</a>.

DTEND;TZID=America/Los_Angeles:20070323T163000
DTSTART;TZID=America/Los_Angeles:20070323T150000
LOCATION:4 CR
SUMMARY:Multilayered Extended Semantic Networks as a Knowledge Representation Paradigm and Interlingua for Meaning Representation
UID:20070323T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Noisy word alignments negatively affect the quality of the translationrules extracted by the ISI syntax-based MT system.  In the literature,alignment is typically treated as a separate process from subsequentstages in the MT pipeline.  By contrast, we allow rule extraction toguide the alignment process.We present an unsupervised algorithm for identifying and removing "bad"links using GHKM syntax-based rule extraction.  We show thatwe can improve upon the precision of GIZA union (measured against a goldstandard set of manually aligned Chinese-English sentence pairs),while only decreasing recall slightly.(Note: This is part of the Summer Intern Series)

DTEND;TZID=America/Los_Angeles:20060825T153000
DTSTART;TZID=America/Los_Angeles:20060825T150000
LOCATION:11 Large
SUMMARY:Improving Precision of Word Alignments Using GHKM Syntax-Based Rule Extraction
UID:20060825T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Research on networks has already revealed much about the structure of real-world networks. Network dynamics such as navigation or exploration, however, are something less well-researched. Yet, we constantly design and use networked systems meant for navigation and exploration. In this talk, I will present a short overview of what we know about navigability, followed by the our work on exploring dynamics occurring on recommendation networks - networks formed implicitly by recommender systems. Navigability can serve as an evaluation criterion for recommender systems and reveal to what extent a system supports navigation and exploration. Based on analysis of topology and dynamical processes, we find that current systems do not support navigation very well, and propose techniques to overcome this.Bio: Daniel Lamprecht is a PhD student at Graz University of Technology and is interning at ISI this summer. His research explores network science, web science and recommender systems and especially focuses on network navigability. This summer, he's working with Kristina Lerman on navigation dynamics and click biases in Wikigames. In the past, he has also studied navigation dynamics in information networks with the aid of biomedical ontologies.

DTEND;TZID=America/Los_Angeles:20140725T160000
DTSTART;TZID=America/Los_Angeles:20140725T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Navigation Dynamics in Networks
UID:20140725T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Incremental parsing techniques such as shift-reduce have gained popularity thanks to their efficiency, but there remains a major problem: the search is greedy and only explores a tiny fraction of the whole space (even with beam search) as opposed to dynamic programming. We show that, surprisingly, dynamic programming is in fact possible for many shift-reduce parsers, by merging "equivalent" stacks based on feature values. Empirically, our algorithm yields up to a five-fold speedup over a state-of-the-art shift-reduce dependency parser with no loss in accuracy. Better search also leads to better learning, and our final parser outperforms all previously reported dependency parsers for English and Chinese, yet is much faster.

DTEND;TZID=America/Los_Angeles:20100707T163000
DTSTART;TZID=America/Los_Angeles:20100707T153000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Dynamic Programming for Linear-time Incremental Parsing (ACL 2010 Practice Talk)
UID:20100707T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract: Understanding how conceptual knowledge is represented and organized in the human brain is one of the core problems of cognitive science, and many studies have aimed at exploring and understanding the similarities of neuro-semantic representations of concepts. A general approach that has been particularly fruitful in this domain is the investigation of the relationship between various corpus statistics of words and neural activity during exposure to those words. In this work, we examine the neuro-semantic representations of stories across three different languages. We demonstrate that using new advances in vector-based representation of text and paragraphs, fMRI signals can be reliably mapped to story representations. We also show that such representations can capture common neuro-semantic representation of stories across different languages. Finally, performing search-light analysis using over a billion regressions, we show that activation patterns in the default mode network of the brain are the most reliable features for decoding stories.Bio: Morteza is an Assistant Professor of psychology, computer science and the Brain and Creativity Institute at University of Southern California. His research spans the boundary between psychology and artificial intelligence, as does his education. His work investigates properties of cognition by using documents of the social discourse, such as narratives, social media, transcriptions of speeches and news articles, in conjunction to behavioral studies.

DTEND;TZID=America/Los_Angeles:20160415T160000
DTSTART;TZID=America/Los_Angeles:20160415T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Decoding Neuro-Semantic Representation of Stories across Languages
UID:20160415T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: (This talk occurs in the morning on the same day as the Bayesian tutorial.)The goal of our research is to support cooperative work performed bystakeholders sitting around a table. To support such cooperation, varioustable-based systems with a shared electronic display on the tabletop havebeen developed. These systems, however, suffer the common problem of notrecognizing shared information such as text and images equally because theorientation of their view angle is not favorable. To solve this problem,we propose the Lumisight Table. This is a system capable of displayingpersonalized information to each required direction on one horizontalscreen simultaneously by multiplexing them and of capturing stakeholders'gestures to manipulate the information.About the Speaker:Mitsunori Matsushita is a research scientist of NTT Communication ScienceLabs., Nippon Telegraph and Telephone Corporation (NTT). He received B.E.,M.E., and Dr.E. degrees from Osaka University, in 1993, 1995 and 2003respectively. In 1995, he joined NTT, and has been engaged in researcheson natural language understanding, information visualization, andinteraction design.

DTEND;TZID=America/Los_Angeles:20050622T120000
DTSTART;TZID=America/Los_Angeles:20050622T110000
LOCATION:11 Large
SUMMARY:Lumisight Table: A Face-to-face Collaboration Support System That Optimizes Direction of Projected Information to Each Stakeholder
UID:20050622T110000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Despite a lot of recent attention, corpus annotation remains somewhat of an art.  This talk is the main part of a tutorial intended to provide the attendee with an in-depth look at the procedures, issues, and problems in corpus annotation.  After describing some currently available resources, services, and frameworks (including the QDAP annotation center, Amazon's Mechanical Turk, annotation facilities in GATE, and UIMA), it addresses the open questions, pitfalls, and problems that the annotation manager should avoid, highlighting the seven major issues at the heart of annotation for which there are as yet no standard and fully satisfactory answers or methods.  For each of these it provides suggestions and a possibly helpful list of references.Your participation in order to critique the tutorial is appreciated!

DTEND;TZID=America/Los_Angeles:20100402T163000
DTSTART;TZID=America/Los_Angeles:20100402T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Annotation
UID:20100402T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract: This talk will discuss a technique to create figurative relationships using Mikolov et al.âs word vectors. Drawing on existing work on figurative language, we start with a pair of words and use the intersection of word vector similarity sets to blend the distinct semantic spaces of the two words. We conduct preliminary quantitative and qualitative observations to compare the use of this novel intersection method with the standard word vector addition method for the purpose of supporting the generation of figurative language. To showcase this technique, we use it to write computer generated sonnets.BioAndrea Gagliano is a masters student at UC Berkeley's School of Information and the Berkeley Center for New Media. Her research explores the use of computation for creativity - both tools to support creative practices and generation of creative works. Recently, she has been focusing in the field of natural language processing by working on poetry and metaphor generation.Previously, Andrea received her BS in Mathematics and BA in Business Administration from the University of Washington in 2013. During her studies, she spent time with the Creative Writing department studying poetry.

DTEND;TZID=America/Los_Angeles:20160926T160000
DTSTART;TZID=America/Los_Angeles:20160926T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Poetry at the Metaphorical Intersection
UID:20160926T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Justin Busch:Weight and Semantic Class Issues in Japanese Noun Phrase OrderingMany current designs for automatic parsers learn probabilities for therelative frequencies of parts-of-speech and syntactic rules, and this hasproven to be generally reliable. In spite of the ubiquity of probabilistictechniques for parsing, however, little attention has been given to thelinguistic significance of the probabilistic data and what it might sayabout human performance.Hawkins proposes a general theory of grammaticalization based on theminimization of syntactic domains. Given that a sentence of any languagewill contain at least one noun phrase, one verb, and possibly additionalnoun phrases and prepositional phrases, "minimize domains" suggests thatthese phrases will order themselves according to whichever patternrequires the least effort to recognize the higher syntactic structure ofthe sentence. These effects are directly measurable through corpusstatistics, and can be interpreted as potential heuristics forprobabilistic parsers.  In this study, we examine Japanese data from theKyoto Treebank and test Hawkins' predictions for noun phrase ordering bynoun phrase weight as well as by generic semantic types. The discussionwill focus primarily on how accurately Hawkins' predictions are reflectedin the corpus statistics, and will conclude with observations about howthey might be applied to the decision mechanisms of probabilistic parsers.--------------------------------------------------------------------------Hai Huang:TBA--------------------------------------------------------------------------Jens Stephan:Evaluation and Visualization of a Dialogue SystemEvaluations have become a necessary standard to almost any type ofresearch. However, there are many areas where there is no common agreementon how to evaluate, which is the case for complex problem of evaluatingdialogue systems. The evaluation of the multi party multi modal dialoguesystem MRE(1) provides a good example of what questions are important forsuch an evaluation, how to actually do the evaluation and finally how tohow make special problems of the system visible to use the evaluationresults to improve the systems performance.After a brief introduction of the MRE domain and architecture, I willbreak the task town to a set of general evaluation questions. From there Iwill explain what kinds of metrics and visualizations are suited to answerthose questions and what kind of data is needed, as well as how that datawas obtained. Along the road, examples of actual system problems andperformances will be presented. The topics of data formatting andvisualization will receive some special attention by introducing the MREEvaluation Toolkit as well as the corpus it operates on.--------------------------------------------------------------------------Chen-kang Yang:Using the Omega Ontology to Determine Selectional Restrictions for Word Sense DisambiguationWord sense disambiguation is fundamental for language processing. Thoughpurely statistical methods are effective for this task, they neglect thesyntactic and semantic aspects. In this study, we adopt a hybrid approachby applying an unsupervised machine learning method to learn verbsselectional restrictions on their subjects/objects. The system then usesthese learned selectional restrictions for word sense disambiguation ofthe subjects/objects. Instead of words, the training data containsontological taxonomy hierarchies that are retrieved from the Omegaontology. Unlike other similar systems, we are able to automatically findthe best match among classes from different levels of the ontology. Thisprovides us more flexibility and is closer to human instinct. Our systemperforms better than other similar systems, though it still needscooperating methods for better results.

DTEND;TZID=America/Los_Angeles:20040809T163000
DTSTART;TZID=America/Los_Angeles:20040809T150000
LOCATION:11 Large
SUMMARY:CL Student Presentations
UID:20040809T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The ABC (Assess by Computer) system has been developed and used in theSchool of Computer Science at the University of Manchester for formativeand (principally) summative assessment at undergraduate and postgraduatelevel. We believe that fully automatic marking of constructed answers -especially free text answers - is not a sensible aim. Instead - drawing onparallels in the history of machine translation - we take a"human-computer collaborative" approach, in which the system does what itcan to support the efficiency and consistency of the human marker, whokeeps the final judgement.Our current work focuses on what are generally referred to as "short textanswers" as contrasted to "essays". However we prefer to contrast"factual" with "discursive" answers, and speculate that the former may beamenable to simple statistical techniques, while the latter require moresophisticated natural language analysis. I will show some examples of realexam data and the techniques we are using and developing to handle them.

DTEND;TZID=America/Los_Angeles:20041105T163000
DTSTART;TZID=America/Los_Angeles:20041105T150000
LOCATION:11 Large
SUMMARY:A Human-Computer Collaborative Approach to Computer Aided Assessment
UID:20041105T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Automatically word-aligning a parallel bitext in the source and target languages constitutes the first stage of most statistical machine translation pipelines.  Automatic word alignment is error-prone, and produces many incorrect links.  Incorrect links that violate syntactic correspondences interfere with the extraction of string-to-tree transducer rules for syntactic machine translation.  We present an algorithm for identifying and deleting incorrect word alignment links, using features of the extracted rules.  We obtain gains in both alignment quality and translation quality in Chinese-English and Arabic-English translation experiments, relative to a GIZA++ union baseline.

DTEND;TZID=America/Los_Angeles:20080118T163000
DTSTART;TZID=America/Los_Angeles:20080118T150000
LOCATION:11 Large
SUMMARY:Using Syntax to Improve Word Alignment Precision for Syntactic Machine Translation
UID:20080118T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Probabilistic parsing methods have in recent years transformed our ability torobustly find correct parses for open domain sentences.  Much of this work hasbeen within a common architecture of heuristic search for good pares inlexicalized probabilistic context-free grammars, with many layers of back-offto avoid problems of sparse data.In this talk, I will outline some different ideas that we have been pursuing.I will connect stochastic parsing with finding shortest paths in hypergraphs,and show how this approach naturally provides a chart parser for arbitraryprobabilistic context-free grammars (finding shortest paths in a hypergraph iseasy; the central problem of parsing is that the hypergraph has to beconstructed on the fly). From this viewpoint, a natural approach is to use theA* algorithm to cut down the work in finding the best parse. On unlexicalizedgrammars, this can reduce the parsing work done dramatically, by at least 97%.This approach is competitive with methods standardly used in statisticalparsers, while ensuring optimality, unlike most heuristic approaches tobest-first parsing.Finally, I will present a novel modular generative model in which semantic(lexical dependency) and syntactic structures are scored separately. Thisfactored model is conceptually simple, linguistically interesting, admits exactinferenence with an extremely effective A* algorithm, and providesstraightforward opportunities for separately improving the component models. Inparticular, I will mention some of the work we have done focusing on the PCFGcomponent to produce a very high accuracy unlexicalized grammar.This is joint work with Dan Klein.About the Speaker:Christopher Manning is an Assistant Professor of Computer Science andLinguistics at Stanford University. He received his Ph.D. from StanfordUniversity in 1995, and served on the faculty of the Computational LinguisticsProgram at Carnegie Mellon University (1994-1996) and the University of SydneyLinguistics Department (1996-1999) before returning to Stanford. His researchinterests include probabilistic models of language, natural language parsing,constraint-based linguistic theories, syntactic typology, informationextraction and text mining, and computational lexicography. He is the author ofthree books, including Foundations of Statistical Natural Language Processing(MIT Press, 1999, with Hinrich Schuetze).Chris' schedule is available in <a href="manning.ps">Postscript</a> or<a href="manning.pdf">PDF</a> format.

DTEND;TZID=America/Los_Angeles:20031027T110000
DTSTART;TZID=America/Los_Angeles:20031027T100000
LOCATION:11 Large
SUMMARY:Natural Language Parsing: Graphs, the A* Algorithm, and Modularity
UID:20031027T100000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Suppose we translate a word from English to French and back. Should we get the original English word? That is, is translation invertible?Alternatively, suppose we translate an English word e to Spanish and then from Spanish to French, obtaining a word f.Should e-f be a valid entry in an English-French dictionary? That is, is translation transitive?Intuitively, if translation is done carefully, we expect to answer both these questions with "Yes, with high probability".In this talk, I will discuss how to formulate our intuition about invertibility/transitivity with random-walks, using translation probability distributions.I will then present two random-walk based regularization techniques that we recently used in a multitask word alignment setting:(1) Model Invertibility Regularization (MIR) - a concave regularizer for bi-directional models which can be applied even without parallel data.(2) Triangulation based Dirichlet prior - a method that capitalizes on parallel data with a pivot language, to construct and learn better translation priors.This talk is based on joint work with Prof. David Chiang (ND) and Dr. Ashish Vaswani (ISI).Bio:Tomer Levinboim is a PhD student at the University of Notre Dame, working with Prof. David Chiang on developing machine learning techniques for improving machine translation and NLP of low resource languages.He is generously hosted by Kevin Knight at USC/ISI.

DTEND;TZID=America/Los_Angeles:20150320T160000
DTSTART;TZID=America/Los_Angeles:20150320T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Multitask Word Alignment with Random-Walk Regularizers
UID:20150320T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: During the last decade, Question Answering (QA) was redefined inside TREC as a kind of highly-precision-oriented Information Retrieval task where the introduction of NLP was necessary, specially for Answer Extraction purposes. The same general approach was activated at the Cross-Language Evaluation Forum (CLEF) in 2003, but for other European languages different than English, and with some different settings and subtasks. The talk will report the last 4-year cycle of the QA evaluation at CLEF, starting with the general methodology for long term QA evaluation at CLEF and the motivation for the Answer Validation task, continuing with the development of AVE in the three year campaign, and concluding with the goals, evaluation measure and results of the current QA evaluation setting after the AVE experience.

DTEND;TZID=America/Los_Angeles:20091211T160000
DTSTART;TZID=America/Los_Angeles:20091211T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Evaluating Question Answering Validation
UID:20091211T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: For many NLP tasks, EM-trained HMMs are the common models. However, in order to escape local maxima and find the best model, we need to start with a good initial model. Researchers suggested repeated random restarts or constraints that guide the model evolution. Neither approach is ideal. Restarts are time-intensive, and most constraint-based approaches require serious re-engineering or external solvers. In this paper we measure the effectiveness of very limited initial constraints: specifically, annotations of a small number of words in the training data. We vary the amount and distribution of initial partial annotations, and compare the results to unsupervised and supervised approaches. We find that partial annotations improve accuracy and reduce the need for random restarts, which speeds up training time considerably.

DTEND;TZID=America/Los_Angeles:20120518T160000
DTSTART;TZID=America/Los_Angeles:20120518T153000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Exploiting Partial Annotations with EM Training (NAACL HLT Practice Talk)
UID:20120518T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this talk, I will present how a new class of unsupervised,nonparametric Bayesian models can be effectively applied to solve real data applications that involve clustering feature-rich linguistic objects.First, I will describe a generalization of the hierarchical Dirichletprocess model to account for additional properties associated withobservable objects. In addition, to overcome some of the limitations ofthis new model, I will then describe a new hybrid model which combinesan infinite latent class model with a discrete time series model. Themain advantages of this hybrid model are the abilities to representa potentially infinite number of features associated with observableobjects and to perform an automatic selection of the most salientfeatures.  Furthermore, all the models described in this talk aredesigned to account for a potential number of categorical outcomes.The evaluation performed for solving both within- and cross documentevent coreference shows significant improvements of the models whencompared against three baselines for this task.Short Bio:Cosmin Adrian Bejan is a postdoctoral researcher at the USC Institutefor Creative Technologies, where he is currently working on applicationsthat involve extraction and analysis of commonsense knowledge from largecollections of text documents. His research interests include eventsemantics, semantic parsing, commonsense causal reasoning, unsupervised learning, and nonparametric Bayesian methods.

DTEND;TZID=America/Los_Angeles:20110311T160000
DTSTART;TZID=America/Los_Angeles:20110311T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Nonparametric Bayesian Models for Clustering Feature-Rich Linguistic Objects
UID:20110311T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Although we live in a predominantly statistical world, there are stillmany language processing applications that long for accuraterepresentations of text meaning. Even applications that found partialsolutions in statistical modeling, including information retrieval,machine translation, or automatic summarization, are likely to get asignificant boost from deeper text understanding.In this talk, I will present an innovative method for automatic extractionof conceptual graphs as a means to represent text meaning.Â  The methodrelies on a novel adaptation of graph-based ranking algorithms -traditionally (and successfully) used in citation analysis, Web pageranking, and social networks. I will show how such algorithms can beadapted to semantic networks, resulting in an efficient unsupervisedmethod for resolving the semantic ambiguity of all words in open text, andidentifying relations between entities in the text. I will also outline anumber of applications that are enabled by this representation, includingkeyphrase extraction, domain classification, and extractive summarization.BIO: Rada Mihalcea is an Assistant Professor of Computer Science atUniversity of North Texas. Her research interests are in lexicalsemantics, minimally supervised natural language learning, andmultilingual natural language processing. She is currently involved in anumber of research projects, including word sense disambiguation, shallowsemantic parsing, (non-traditional) methods for building annotated corporawith volunteer contributions over the Web, word alignment for languagepairs with scarce resources, and graph-based ranking algorithms forlanguage processing. Her research is supported by NSF and the state ofTexas.

DTEND;TZID=America/Los_Angeles:20040416T120000
DTSTART;TZID=America/Los_Angeles:20040416T103000
LOCATION:11 Large
SUMMARY:Graph-based Ranking Algorithms for Language Processing
UID:20040416T103000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: NLP research has been fluctuating between two extreme models of computation:finite computers and universal computers.  Often a practical solution combinesboth of these two extremes because formally powerful models are simulated byphysical machines that approximate them.  This is especially true for recurrentneural networks whose activation vector is the key to deeper understanding oftheir emergent finite-state behavior.  However, we currently have only a veryloose characterization for the finite-state property in neural networks.In order to construct a hypothesis for a possible bottom-up organization of thestate-space of activation vectors of RNNs, I compare neural networks withbounded Turing machines and finite-state machines, and quote recent results onfinite state models for semantic graphs.  These models enjoy the nice closureproperties of weighted finite-state machines.In the end of the talk, I sketch my vision for neural networks that performfinite-state graph transductions in real time. Such transductions would have avast variety of applications in machine translation and semantic informationretrieval involving big data.Anssi Yli-JyrÃ¤ has the titles of Adjunct Professor (Docent) in LanguageTechnology at the University of Helsinki and Life Member of Clare Hall College at the University of Cambridge.  He is currently a PI and a ResearchFellow of the Academy of Finland in a project concerning universality offinite-state syntax.  He has published a handbook on Hebrew and Greek morphemealignments in the Finnish Bible translation together with a group of DigitalHumanists, and then served the Finnish Electronic Library at CSC - IT Centre ofScience where he built an internet harvester and a search engine for the FinnishWWW.   In 2005, he earned his PhD from the University of Helsinki and then worked as a coordinator for the Language Bank of Finland at CSC.  There he contributed to pushing his employer to what is now known as the CLARIN EuropeanResearch Infrastructure Consortium.  He became the first President of SIGFSM in 2009, after fostering and organizing FSMNLP conferences for several years.  In 2012-2013, he served as a Subject Head of Language Technology in his home university before visiting the Speech Group at the Department of Engineering, Cambridge University.  He has supervised theses and contributed to thetheoretical basis of Helsinki Finite-State Transducer (HFST) library.  In hisown research, Yli-JyrÃ¤ constantly pursues unexplored areas, applyingfinite-state transducers to graphical language processing tasks such asautosegmental phonology, constraint interaction, and dependency syntax and neural semantics.  He is a qualified teacher and interested in the occurrence offlow in agile programming and simultaneous translation.

DTEND;TZID=America/Los_Angeles:20171110T160000
DTSTART;TZID=America/Los_Angeles:20171110T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:On Real-Time Graph Transducers
UID:20171110T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Speakers of more than 5,000 languages have access to internet and communication technologies. The majority of phones, tablets and computers now ship with language-enabled capabilities like speech-recognition and intelligent auto-correction, and people increasingly interact with data-intensive cloud-based language technologies like search-engines and spam-filters. For both personal and large-scale technologies, the service quality drops or disappears entirely outside of a handful of languages. Speakers of low-resource languages correlate with lower access to healthcare, education and higher vulnerability to disasters. Serving the broadest possible range of languages is crucial to ensuring equitable participation in the global information economy.I will present examples of how natural language processing and distributed human computing are improving the lives of speakers of all the world's languages, in areas including education, disaster-response, health and access to employment. When applying natural language processing to the full diversity of the world's communications, we need to go beyond simple keyword analysis and implement complex technologies that require human-in-the-loop processing to ensure usable accuracy. I will share results that show how for-profit technologies are improving people's lives by providing sustainable economic growth opportunities when they support more languages, aligning business objectives with global diversity.Bio:Robert Munro is the CEO of Idibon, a company with the objective of providing language technologies for all the world's languages. In past work he has served as Chief Information Officer for the largest solar energy company in Sierra Leone; was the Chief Technology Officer for the largest use of big data technologies to track disease outbreaks globally; worked for the UN High Commission for Refugees in Liberia; lead the crowdsourced response to the 2010 earthquake Haiti; and has helped information processing in disaster response and election monitoring in more than a dozen countries. In current work, Idibon helps everyone from Fortune 500s to disaster response organizations process language data at scale. Outside of work, he has learned about the world's diversity by cycling more than 20,000 kilometers across 20 countries. Robert has a PhD from Stanford University.

DTEND;TZID=America/Los_Angeles:20141120T160000
DTSTART;TZID=America/Los_Angeles:20141120T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Technologies for every language: how machine learning can reach everyone in the world
UID:20141120T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Statistical translation models that try to capture the recursive structure of language have been widely adopted over the last few years. These models make use of varying amounts of information from linguistic theory: some use none at all, some use information about the grammar of the target language, some use information about the grammar of the source language. But progress has been slower on tree-to-tree translation models: models that are able to learn the relationship between the grammars of both the source and target language. I will discuss the reasons why tree-to-tree translation has been a challenge, review existing attempts at tree-to-tree models, and present some of our own work-in-progress on robustly modeling source and target language syntax for significant improvements in translation quality.

DTEND;TZID=America/Los_Angeles:20100122T160000
DTSTART;TZID=America/Los_Angeles:20100122T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Towards Tree-to-Tree Translation
UID:20100122T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: If you do good research, you'll find that many doors open. I'll offer some suggest for how to make that happen. This should be an interactive session.Bio: Kevin Knight is the director of the ISI Natural Language group, a professor of Computer Science at USC, and an ISI Fellow.

DTEND;TZID=America/Los_Angeles:20141003T160000
DTSTART;TZID=America/Los_Angeles:20141003T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Getting Good at Research
UID:20141003T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Metaphors are used to replace complicated or unfamiliar ideas with familiar, yet unrelated concepts that share an important attribute with the intended idea. The result is a conceptual mapping between metaphoric source and literal target meaning.Computational metaphor processing is divided into detection and interpretation.To detect metaphors, most existing approaches attempt to identify these conceptual mappings. They require resources for the source (metaphor) as well as the target domain, and a set of defined mappings between the two. Creating these resources is expensive and limits the scope of these systemsThey are also usually restricted to well-observed, conventionalized metaphors, and can not deal with neologisms. Since metaphors are a productive area of language, this is a major shortfall.We propose a statistical approach to metaphor detection that utilizes the uncommonness of novel metaphors. Words that do not match a text's typical vocabulary are highlighted as metaphor candidates. No knowledge of semantic concepts or the metaphor's source domain is required for this. We analyze the performance of this approach as an unsupervised standalone classifier and as a feature in a supervised graphical model.

DTEND;TZID=America/Los_Angeles:20121019T160000
DTSTART;TZID=America/Los_Angeles:20121019T30000
LOCATION:11th Floor Conference Room [1135]
SUMMARY:Metaphor Detection through Term Frequency
UID:20121019T30000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Statistical machine translation (MT) often needs a large corpus of parallel translated sentences in order to achieve good performance. This limits the use of current MT technologies to a few resource-rich languages. Assume an incident happens in an area with a low-resource language. For a quick response, we need to build an MT system with available data, as finding or translating new parallel data is expensive and time consuming. For many languages this means that we only have a small amount of often out-of-domain parallel data (e.g. a Bible or Ubuntu manual). This talk is about ways to improve machine translation in low resource scenarios. I'll talk about use of monolingual data and parallel data from related languages to improve machine translation from the low resource language into English.Bio: Nima Pourdamghani is a fourth year Ph.D. student at ISI. He works with Professor Kevin Knight on machine translation from low resource languages.

DTEND;TZID=America/Los_Angeles:20170811T160000
DTSTART;TZID=America/Los_Angeles:20170811T150000
LOCATION:6th Floor Conference Room [689]
SUMMARY:Improving machine translation from low resource languages
UID:20170811T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Semantic parsing, which parses natural language into formal languages, has been applied to a wide range of structured data like relation databases, knowledge bases, and web tables. To learn a semantic parser for a new domain, the first challenge is always how to collect training data. While data collection using crowdsourcing has become a common practice in NLP, it's a particularly challenging and interesting problem when it comes to semantic parsing, and is still in its early stages. Given a domain and a formal language, how can we generate meaningful logical forms in a configurable way? How to design the annotation task so that crowdsourcing workers, who do not understand formal languages, can handle with ease? How can we exploit the compositional nature of formal languages to optimize the crowdsourcing process? In this talk I will introduce some recent advances in this direction, and present some preliminary answers to the above questions. The covered works mainly concern knowledge bases, but we will also cover some ongoing work concerning web APIs.Yu Su is a fifth year PhD candidate in the Computer Science Department at UCSB, advised by Professor Xifeng Yan. Before that, He received a bachelor degree from Tsinghua University in 2012, with a major in Computer Science. He is interested in the interplay between language and formal meaning representations, including problems like semantic parsing, continuous knowledge representation, and natural language generation. He also enjoys applying deep learning on these problems.

DTEND;TZID=America/Los_Angeles:20161028T160000
DTSTART;TZID=America/Los_Angeles:20161028T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Learning from Zero: Recent Advances in Bootstrapping Semantic Parsers using Crowdsourcing
UID:20161028T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: User-generated passwords tend to be memorable,  but  not  secure.   A  random,  computer-generated 60-bit string is much more secure.However, users cannot memorize random 60-bit strings. In this paper, we investigate methods for converting arbitrary bit strings into English word sequences (both prose and poetry), and  we  study  their  memorability  and  other properties.Bio: Marjan Ghazvininejad is a second year PhD student in Computer Science at University of Southern California (USC). She is working with Professor Kevin Knight at the Information Sciences Institute (ISI). She is interested in natural language processing, especially the application of machine learning techniques in this area.

DTEND;TZID=America/Los_Angeles:20150522T160000
DTSTART;TZID=America/Los_Angeles:20150522T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:How to Memorize a Random 60-Bit String
UID:20150522T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Naturalis, the Dutch National Museum of Natural History, harbours one of the largest treasures of the world: the key specimens of millions of animals found throughout the world through centuries of biological expeditions. While the depot where the animals are stored is a technical marvel, Noah's ark of the 21st century, it is hard to search through it. Research in taxonomy, the evolution of life and biodiversity revolves around the specimens in the depot. The main key to accessing the depot are(mostly) handwritten expedition logs and registration books, which are currently being photographed and keyed in to be stored in searchable digital archives. Such digital logs already enable a kind of "Biogoogle" search, but actual research questions are more complicated ("how did this kind of frog develop over the last century in the Amazon rainforests?"), and demand more intelligent handling. This is where the MITCH project comes in.The goal of MITCH is to turn the field logs and registration books into a populated semantic network, in which concepts such as animal specimens are related to all other concepts that define them: where, when, under which circumstances and by whom were they found, who described them first in the academic literature, who prepared them for storage in the Naturalis depot, which registration number was assigned to them, etc. This means that all textual descriptions of a specimen need to be parsed into exactly these concepts and their relations. All of this needs to be  done at a scale that goes far beyond the human capacity, as tens of thousands of digitized but unanalysed textual records are waiting for semantic analysis. This necessitates the use of state-of-the-art machine learning methods that learn from examples automatically.The project addresses its goals on three levels. The basic level is the development and application of automatic data cleaning and markup tools. On top of this, semi-structured textual material such as fieldbook logs and scientific papers, are semi-automatically converted to a searchable knowledge base. Search results are visualised by displaying maps and specimen photos. The conversion phase assumes the active intervention of domain experts, such as collection managers, to correct and steer the automatic extraction  procedure. At the top level, information resources are cross-linked using a domain ontology, populating a semantic network that can be hooked up to any other standardised cultural heritage knowledge base or to a search engine.

DTEND;TZID=America/Los_Angeles:20071214T163000
DTSTART;TZID=America/Los_Angeles:20071214T150000
LOCATION:11 Large
SUMMARY:MITCH: Mining for Information in Texts from the Cultural Heritage
UID:20071214T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract: In this talk, I will focus on two important aspects in language modeling: memorization and exploration.First, I will present Recurrent Memory Network, a recurrent language model augmented with an external memory block. I will show that by explicitly addressing the memory, RMN not only amplifies the power of recurrent neural network but also facilitate our understanding of its internal functioning and allows us to discover underlying patterns in data. Furthermore, our experiments demonstrate that using external memory allows RMN capturing sentence coherence better than previous models on sentence completion task.In context of language generation (e.g. using conditional recurrent language models), memorization might hurt the performance of the whole system especially when recurrent models start hallucinating. In the second part, I will present preliminary findings in training neural machine translation (NMT) to avoid this pitfall. Particularly, we allow NMT to explore during training using REINFORCE/deep Q-network/minimum risk training.Bio: Ke is a third year PhD candidate at University of Amsterdam, advised by Christof Monz and Arianna Bisazza. Before that, he received Msc degree from University of Groningen and Charles University in Prague. He is interested in neural machine translation.

DTEND;TZID=America/Los_Angeles:20160603T160000
DTSTART;TZID=America/Los_Angeles:20160603T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Title: Memorization and Exploration in Recurrent Neural Language Models
UID:20160603T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In recent years, a dramatic increase in the availability of digital text has created challenges and opportunities for learning for both humans and machines. My talk will describe research on learning commonsense knowledge from text -- despite our Gricean imperative to write down only what other people wouldn't know -- and using this for reasoning about language and the world. It will also address helping people to learn scientific knowledge by using implicit structure in a proliferation of articles, books, online courses, and other educational resources.Bio: Jonathan Gordon is a postdoctoral researcher at the USC Information Sciences Institute, where he works with Jerry Hobbs and colleagues on the problems of learning and organizing knowledge from text. He completed a bachelor's degree in computer science at Vassar College and a Ph.D. in artificial intelligence at the University of Rochester, supervised by Lenhart Schubert.

DTEND;TZID=America/Los_Angeles:20171117T160000
DTSTART;TZID=America/Los_Angeles:20171117T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Learning and Reading
UID:20171117T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The task of coreference resolution identifies those expressions in a text that point to the same discourse entity. Natural language applications such as information extraction, question answering and machine translation can greatly benefit from its output (the different pieces of information in connection with the same entity are linked, pronouns are disambiguated, etc.). The task is extremely complex since a number of knowledge sources come into play, from morphology to discourse structure and world knowledge. In this talk I present the results of my PhD research up to now, including the development of two 400k-word corpora for Spanish and Catalan (AnCora) annotated at various levels (morphology, syntax, semantics, pragmatics), a 100k-word corpus for English, and a series of experiments towards building a learning-based coreference resolution system. More specifically, I'll discuss issues concerning the definition of the annotation scheme, the selection of features for machine learning, the effect of sample selection, and I'll introduce CISTELL, the new learning-approach we propose for coreference resolution.

DTEND;TZID=America/Los_Angeles:20090529T160000
DTSTART;TZID=America/Los_Angeles:20090529T150000
LOCATION:11 Large
SUMMARY:Learning-based Coreference Resolution for Spanish and Catalan
UID:20090529T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: At present, adapting an Information Extraction system to new topics isan expensive and slow process, requiring some knowledge engineering foreach new topic. We propose a new paradigm of Information Extractionwhich operates 'on demand' in response to a user's query. On-demandInformation Extraction (ODIE) aims to completely eliminate thecustomization effort. Given a user's query, the system willautomatically create patterns to extract salient relations in the textof the topic, and build tables from the extracted information usingparaphrase discovery technology. It relies on recent advances in patterndiscovery, paraphrase discovery, and extended named entity tagging.I will show you a demo system, which produce a table in less than aminute for any give queries.I will also explain the need of linguistic knowledge and introduce someweakly supervised learning methods. I will show a demo of the ngramsearch engine, which extracts ngrams and sentences which match to aquery with arbitrary wildcard.Also, I will give a brief introduction about the Web People Search.It is a task to disambiguate search results of people name and peopleattribute extraction task. We organized WePS1 and 2, and currentlystarted the third evaluation, which includes 2 tasks: 1) the combinedtask of people disambiguation and attribute extraction and 2)organization disambiguation from twitter messages.Brief Bio:Satoshi Sekine is an Research Associate Professor at New York University.He received his MSc at UMIST, UK in 1992 and his PhD in 1998 at NYU. Hehas been working on various topics, including parsing, NE, InformationExtraction and minimally supervised knowledge discovery. He edited a bookabout NE from John Benjamins, organized a JHU summer workshop 2009,WePS task, NSF symposium on Semantic Knowledge Discovery, Organizationand Use in 2008, workshop on Textual Entailment and Parsing 2007 and so on.

DTEND;TZID=America/Los_Angeles:20100405T113000
DTSTART;TZID=America/Los_Angeles:20100405T103000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:On-Demand Information Extraction and Knowledge Discovery
UID:20100405T103000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The fundamental task in statistical machine translation (SMT) is tocharacterize the probability of a target sentence given its sourcetranslation; for translating French as English, P(f | e). By applyingBayes Rule, we derive the fundamental theorem of SMT: e maximizingP(e) P(f | e). Advances in SMT come from improving estimations ofthese two terms, or from more efficient ways of searching for optimalsolutions (Brown et al. 1993).In the case of language modeling, Shannon (1949) and Brown et al.(1992) identified upper and lower bounds for the per-character entropyof English, H(e), for humans and machines, respectively. We ask thesame question for SMT, H(e | f), comparing the results for humantranslators and a simple machine baseline based on IBM Model 1. Thesenumbers are the upper and lower bounds for SMT systems trained onparallel data.

DTEND;TZID=America/Los_Angeles:20080820T153000
DTSTART;TZID=America/Los_Angeles:20080820T150000
LOCATION:11 Small
SUMMARY:Intern Final Talk:  The Entropy of English given French
UID:20080820T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Probabilistic models can learn usersâ preferences from the history of their item adoptions on a social media site, and in turn, recommend new items to users based on learned preferences. However, current models ignore psychological factors that play an important role in shaping online social behavior. One such factor is attention, the mechanism that integrates perceptual and cognitive features to select the items the user will consciously process and may eventually adopt. Recent research has shown that people have finite attention, which constrains their online interactions, and that they divide their limited attention non-uniformly over other people. We propose a collaborative topic regression model that incorporates limited, non-uniformly divided attention. We show that the proposed model is able to learn more accurate user preferences than state-of-art models, which do not take human cognitive factors into account. Specifically we analyze voting on news items on the social news aggregator and show that our model is better able to predict held out votes than alternate models. Our study demonstrates that psycho-socially motivated models are better able to describe and predict observed behavior than models which only consider latent social structure and content.

DTEND;TZID=America/Los_Angeles:20130906T160000
DTSTART;TZID=America/Los_Angeles:20130906T150000
LOCATION:6th Floor Conference Room [RM # 689]
SUMMARY:LA-CTR: A Limited Attention Collaborative Topic Regression for Social Media
UID:20130906T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Recurrent neural networks are effective tools for processing natural language. RNNs can be effectively trained to perform sequence processing tasks such as translation, classification, language modeling, and paraphrase detection. However despite major gains in fields related to these power hungry artificial neural networks, it remains difficult to construct functional models of cognition inspired by biological nervous systems. In this talk I'll describe how RNNs can be trained to excel at language understanding tasks and then adapted to run on ultra-low power neuromorphic hardware which simulates the spiking of individual neurons. The result is an interactive embedded system that uses recurrent neural networks to process language while consuming an estimated .000048 watts (48 microwatts).Bio: Guido Zarrella is a Principal Artificial Intelligence Engineer at the MITRE Corporation in Denver, Colorado. He leads a R&D effort pursuing advances in deep learning for language understanding. He is a former President of the Association for Computational Linguistics, having served in this role on December 5th, 2011.

DTEND;TZID=America/Los_Angeles:20150911T160000
DTSTART;TZID=America/Los_Angeles:20150911T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Neuromorphic Language Understanding
UID:20150911T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I'll describe our entry into the DUC 2004 automatic document summarizationcompetition.  We competed only in the single document, headline generationtask.  Our system is based on a novel kernel dubbed the tree positionkernel, combined with two other well-known kernels.  Our system performswell on white-box evaluations, but does very poorly in the overall DUCevaluation.  C'est la vie.

DTEND;TZID=America/Los_Angeles:20040423T160000
DTSTART;TZID=America/Los_Angeles:20040423T150000
LOCATION:10 Large
SUMMARY:A Tree-Position Kernel for Document Compression
UID:20040423T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract: We propose a framework for devising testable algorithms for bridging the communication gap between humans and robots.  We begin with a setting in which humans give instructions to robots using unrestricted language commands, with instruction sequences aimed at building complex goal configurations in a blocks world.  I will present details of our data-collection effort, and preliminary results on action understanding.  Time permitting, I will present new baseline results for flipping the semantic parsing paradigm to address the problem of language generation, where a human performs commands produced by a machine to demonstrate basic two-way communication.Bio: Yonatan Bisk received his PhD from UIUC in 2015 under Professor Julia Hockenmaier and is now a Postdoc with Daniel Marcu at ISI.

DTEND;TZID=America/Los_Angeles:20160520T160000
DTSTART;TZID=America/Los_Angeles:20160520T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Title: Natural Language Communication with Computers
UID:20160520T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Compositional topics ("Swiss passport", "German grammar") ofinterest to Web users may be available as entries within structuredknowledge resources. But such topics are not necessarily connected to,let alone represented in relation to, entries of the constituenttopics ("Switzerland" and "Passport", or "German language" and"Grammar") from which their approximate meaning could be aggregated.Web documents - more precisely, encyclopedic articles - and Web searchqueries are shown to be useful in complementary tasks relevant tounderstanding compositional topics. The tasks are the decomposition ofpotentially compositional topics into zero, one or more constituenttopics; and the interpretation of the role ("issued by", "of") playedby constituents ("Swiss", "German") within ambiguous compositionalphrases that might refer to compositional topics.Bio: Marius Pasca is a research scientist at Google in Mountain View,California. Current research interests include factual informationextraction from unstructured text within documents and queries, andits applications to Web search.

DTEND;TZID=America/Los_Angeles:20150807T160000
DTSTART;TZID=America/Los_Angeles:20150807T150000
LOCATION:11th Floor Conference Room [1135]
SUMMARY:Understanding the World's Compositional Concepts
UID:20150807T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract: The correct interpretation of any sentence is obscured by a vast array of alternatives. Previous work on disambiguating meaning has focused on representations of syntax using tree structures. Simplifying syntax in this way often means leaving out long-distance relations between words, providing less information to downstream tasks such as dialog and question answering. We propose a new algorithm that is able to efficiently search over graph structures, fully capturing argument structures as a directed acyclic graph.  Our dynamic program uniquely decomposes structures, and is sound and complete with respect to the class of one-endpoint crossing graphs.Bio: Jonathan is a Ph.D. candidate at UC Berkeley working on natural language processing with Dan Klein. His research focuses on new algorithms for interpreting text and analyzing system behavior. In particular, he has built search-based error analysis tools for syntactic parsing and coreference resolution, and a graph-based syntactic parser.

DTEND;TZID=America/Los_Angeles:20160325T160000
DTSTART;TZID=America/Los_Angeles:20160325T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Capturing More Linguistic Structure with Graph-Structured Parsing
UID:20160325T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In the last 15 years, the interoperability of language resources hasbeen recognized as a major problem in the development of NLPinfrastructures -- partly due to an increased focus on novel,underresourced languages and efforts to bootstrap language resourcesby annotation projection -- partly due to the increased interest inmore abstract levels of linguistic analysis beyond morphosyntax andsyntax, namely semantics, reference and discourse.This talk describes the application of Semantic Web formalisms, RDF,OWL/DL and SPARQL, to facilitate the interoperability of linguisticcorpora and linguistic annotations. Interoperability of linguisticcorpora involves two aspects: Structural interoperability (annotationsof different origin are represented using the same formalism) andconceptual interoperability (annotations of different origin arelinked to a common vocabulary). I will describe ontology-basedapproaches for both aspects, the POWLA ontology that defines a datamodel for annotated corpora, and the Ontologies of LinguisticAnnotation (OLiA) that provide definitions for linguistic categoriesand properties (Chiarcos 2012). As compared to state-of-the-artapproaches based on standoff XML, e.g., the recently published ISOstandard for an Linguistic Annotation Framework, key advantages ofthis approach include the existence of a rich technological ecosystemdeveloped around RDF and OWL, including standardized query languagesfor directed acyclic (multi-) graphs (SPARQL), APIs, data baseimplementations, as well as the availability of OWL reasoners that canbe applied to validate the consistency of linguistic corpora and theirannotations and to infer additional information that is relevant, forexample, for their appropriate visualization.Naturally, representing corpora in OWL and RDF also allows tointerlink resources freely, e.g., different annotation layers of amulti-layer corpus, translated texts in parallel corpora, orlinguistic corpora and lexical-semantic resources. Modeled in thisway, corpora can be fully integrated in a Linked Open Data (sub-)cloudof linguistic resources, along with lexical-semantic resources andknowledge bases of information about languages and linguisticterminology. The second part of my talk will introduce recent effortsto create a Linked Open Data sub-cloud of linguistic resources, theLinguistic Linked Open Data cloud (Chiarcos et al. 2012, cf.http://linguistics.okfn.org).ReferencesChristian Chiarcos, Sebastian Hellmann, Sebastian Nordhoff, et al.(2012), The Open Linguistics Working Group, Proceedings of the 8thInternational Conference on Language Resources and Evaluation(LREC-2012). Istanbul, Turkey, May 2012.[http://www.lrec-conf.org/proceedings/lrec2012/pdf/912_Paper.pdf]Christian Chiarcos (2012), Interoperability of Corpora andAnnotations, In: Christian Chiarcos, Sebastian Nordhoff, and SebastianHellmann (eds.) Linked Data in Linguistics. Representing andConnecting Language Data and Language Metadata. Springer, Heidelberg.[http://www.springer.com/computer/ai/book/978-3-642-28248-5]BioChristian Chiarcos studied Computer Science and General Linguistics atthe Technical University Berlin, Germany, and received his PhD inComputational Linguistics from the University of Potsdam, Germany in2010. He is currently affiliated with the University of Frankfurt/M.,Germany. Since April 2012, he is visiting scholar at the ISI. Hisprimary areas of expertese include the study and modeling of discoursesemantics, as well as the development of infrastructures for rich andheterogeneous linguistic annotations.

DTEND;TZID=America/Los_Angeles:20121102T160000
DTSTART;TZID=America/Los_Angeles:20121102T150000
LOCATION:11th Floor Conference Room [1135]
SUMMARY:Linguistic Linked Open Data. Linking Corpora
UID:20121102T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Solutions for many natural language processing problems such as speech recognition, transliteration, and  translation have been described as weighted finite-state transducer cascades. The transducer formalism is very useful for researchers, not only for its ability to expose the deep similarities between seemingly disparate models, but also because expressing models in this formalism allows for rapid implementation of real, data-driven systems. Finite-state toolkits can interpret and process transducer chains using generic algorithms and many real-world systems have been built using these toolkits. Current research in NLP makes use of syntax-rich models that are poorly suited to extant transducer toolkits, which process linear input and output. Tree transducers can handle these models, and a weighted tree transducer toolkit with appropriate generic algorithms will lead to the sort of gains in syntax-based modeling that were achieved with string transducer toolkits. In this thesis proposal practice talk I will briefly trace the history of finite-state transducers and automata as they relate to natural language processing and the evolution of formalisms and the toolkits that support them, leading up to motivation for the design and creation of Tiburon, the toolkit referenced in this talk's title. I will describe previous, current, and future work on Tiburon's algorithms and the effectiveness of both algorithms and  software at cleanly representing syntax-based NLP models from the literature and at constructing and evaluating novel models.

DTEND;TZID=America/Los_Angeles:20080711T160000
DTSTART;TZID=America/Los_Angeles:20080711T150000
LOCATION:11 Large
SUMMARY:Thesis Proposal Practice Talk:  A Weighted Tree Transducer Toolkit for Syntactic Natural Language Processing Models
UID:20080711T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The highly complex structure of the human brain is strongly shaped by genetic influences. Subcortical brain regions act jointly with cortical areas to coordinate movement, memory, motivation, reinforcement and learning. To investigate how common genetic variants affect the structure of these brain regions, we conducted genome-wide association studies (GWAS) of the volumes of seven subcortical regions and intracranial volume, derived from magnetic resonance images (MRIs) of 30,717 individuals. By identifying genetic influences on brain structure, we can begin to map the genetic architecture underlying variability in human brain development and function, a process that will help elucidate the dysfunctions that lie at the core of neuropsychiatric disorders.Bio: Derrek Hibar is an assistant professor in the Department of Neurology in the Keck School of Medicine of USC where he studies common genetic influences on brain structure and susceptibility to psychiatric disorders. He is currently coordinating one of the largest studies of brain structure to date as part of the ENIGMA Consortium (http://enigma.ini.usc.edu).

DTEND;TZID=America/Los_Angeles:20150130T160000
DTSTART;TZID=America/Los_Angeles:20150130T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Neuroimaging Genetics in the ENIGMA Consortium
UID:20150130T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract: Natural language processing (NLP) tools have become ubiquitous for data analysis in digital environments such as the Web and social media. Popular applications include tools for clustering, sequence labeling, machine translation, to name a few. But unfortunately, majority of the existing toolkits rely on supervised learning to train models using labeled data. This poses several challenges---labeled data is not readily available in all languages or domains and building an NLP system from scratch for a new domain (or language, user, etc.) requires significant human effort which is both time-consuming and expensive. Moreover, scaling this strategy on the Web is infeasible.Recent advances in unsupervised algorithms have demonstrated promising results on several NLP tasks without using any labeled data. But despite their utility, scalable unsupervised algorithms rarely provide probabilistic representations of the data which can be useful for predicting on unseen data or integrated as components with a larger model or pipeline. In addition, these methods often favor simple model descriptions (e.g., k-means algorithm for clustering) in exchange for rich statistical models. This leads to the problem of rapidly diminishing returns when applying these methods on increasing amounts of data. Instead, we need to design algorithms that can scale elegantly to large data as well as complex models.In this work, I will present our recent work on scalable probabilistic learning with Bayesian inference. We show a novel algorithm for fitting mixtures of exponential families, which generalizes several models that are typically used in NLP and other areas. A major contribution of our work is a novel sampling method that uses locality sensitive hashing to achieve high throughput in generating proposals during sampling. Using "clustering" as an example application, I will describe our approach and show that it scales elegantly to large numbers of clusters achieving a speedup of several orders of magnitude over existing toolkits, while maintaining high clustering quality. In addition, we also prove probabilistic error guarantees for the new sampling algorithm. This is joint work with Amr Ahmed and Alex Smola. Lastly, I will briefly mention some ongoing work on large-scale unsupervised learning for other NLP applications such as machine translation.Bio: Sujith Ravi is a Research Scientist at Google. He completed his PhD at University of Southern California/Information Sciences Institute and joined Yahoo! Research, Santa Clara as a Research Scientist before joining Google, Mountain View in 2012. His main research interests span various problems and theory related to the fields of Natural Language Processing (NLP) and Machine Learning. He is specifically interested in large-scale unsupervised and semi-supervised methods and their applications to structured prediction problems in NLP, information extraction, user modeling in social media, graph optimization algorithms for summarizing noisy data, computational decipherment and computational advertising. His work has been reported in several magazines such as New Scientist, ACM TechNews, etc. For more information, you can visit his personal page (http://www.sravi.org).

DTEND;TZID=America/Los_Angeles:20130308T160000
DTSTART;TZID=America/Los_Angeles:20130308T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Scalable Unsupervised Learning for Natural Language Processing
UID:20130308T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This summer we held a three-month workshop on syntax-driven machinetranslation, in which we learned syntactic transformations automaticallyfrom Chinese/English translated corpora and applied them to translate newtext.  We'll give a progress report!

DTEND;TZID=America/Los_Angeles:20040910T163000
DTSTART;TZID=America/Los_Angeles:20040910T150000
LOCATION:11 Large
SUMMARY:About Syntax Fest 2004 (Part I)
UID:20040910T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: NLP Technologies have developed a technology for automated analysis of social media data. TRANSLI Social Media Analytics and monitoring, is an online visual analytics system designed to provide social intelligence from news and other events from Twitter. During this seminar, Dr. Atefeh Farzindar will give a presentation on TRANSLI-SM where the system features an intuitive user interface and is designed to browse and visualise the results of the semantic analysis of social discussion on specific events from Twitter. The user can obtain the information not only limited to the main event of interest but also to the intelligence for the sub events.NLP Technologies Inc. is a Canadian company founded in 2005 and that expanded to California in 2014. The company specialises in natural language processing, NLP-based search engines, translation technologies and services, social media analytics, and automatic summarization.http://www.nlptechnologies.ca/Bio: Dr. Atefeh Farzindar, CEO NLP Technologies Inc.and Adjunct professor at University of MontrealDr. Atefeh Farzindar is the co-founder and CEO of NLP Technologies. She received her PhD in Computer Science from the University of Montreal and her Doctorate in automatic summarization of legal documents from Paris-Sorbonne University in 2005. She has been an Adjunct professor at the Department of Computer Science at the University of Montreal since 2010, and she was Honorary Research Fellowship at the Research Group in Computational Linguistics at the University of Wolverhampton, UK (2010-2012).Dr. Farzindar has been Action Editor in the international journal of Computational Intelligence since 2011. She co-edited two special issues on social media analysis for the International Journal of Computational Intelligence (CI) and Journal TAL, an international journal on natural language processing.She co-authored an upcoming book on Natural Language Processing for Social Media [Morgan & Claypool Publishers, 2014], and authored a book chapter in Social Network Integration in Document Summarization, Innovative Document Summarization Techniques: Revolutionizing Knowledge Understanding, IGI Global publisher January 2014.In 2013, Dr. Farzindar won  Femmessor-MontrÃ©alâs contest,  Succeeding with a balanced lifestyle, in the  Innovative Technology and Information and Communications Technology category because of her involvement in the arts. Her paintings have recently been published in a book titled One Thousand and One Nights, in which the palette of vivid colours and her unique contemporary style revolved around on the place of women in modern society (Vernissage & Artist Book Launch April, MontrÃ©al, Galerie 203 https://www.youtube.com/watch?v=TLCghx1mvzY)

DTEND;TZID=America/Los_Angeles:20150410T160000
DTSTART;TZID=America/Los_Angeles:20150410T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:TRANSLI, NLP-based social media analytics and monitoring
UID:20150410T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Duolingo is a language education platform with more than 150 million students worldwide. Our flagship learning app is the #1 way to learn a language online, and is the most-downloaded education app for both Android and iOS devices. It is also completely free. In this talk, I will describe the Duolingo system and several empirical projects, which mix machine learning with computational linguistics and psychometrics to improve learning, engagement, and even language proficiency assessment through our products.Burr Settles is a scientist, engineer, and head of research at Duolingo: the most widely used education application in the world, teaching 20 languages to more than 150 million users worldwide. He is also the principal developer of the Duolingo English Test: a computer-adaptive proficiency exam that aims to disrupt and democratize the global certification marketplace through highly accessible mobile technology. Before joining Duolingo, he earned a PhD in computer sciences at University of Wisconsin-Madison, and then worked as a postdoctoral research scientist at Carnegie Mellon University, where his work has spanned machine learning, natural language processing, and computational social science. His 2012 book Active Learning is now the standard text on learning algorithms that are adaptive, curious, or exploratory (if you will). Burr gets around by bike and (among other things) plays guitar in the pop band delicious pastries.

DTEND;TZID=America/Los_Angeles:20160919T160000
DTSTART;TZID=America/Los_Angeles:20160919T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Duolingo: Improving Language Learning and Assessment with Data
UID:20160919T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I present our approach to identify an argument structure defined as asimple hierarchical structure of claim and reasons.  The claim is alsoclassified into "in favor of" or "against" the topic. The experiment isperformed on the comments from the general public sent to governmentofficials in response to proposed regulations.

DTEND;TZID=America/Los_Angeles:20060505T163000
DTSTART;TZID=America/Los_Angeles:20060505T150000
LOCATION:11 Large
SUMMARY:Recognizing Argument Structures in Texts
UID:20060505T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Psychiatric document retrieval attempts to help people to efficientlyand effectively locate the consultation documents relevant to theirdepressive problems. Individuals can understand how to alleviate theirsymptoms according to recommendations in the relevant documents. Thiswork proposes the use of high-level topic information extracted fromconsultation documents to improve the precision of retrievalresults. The topic information adopted herein includes negative lifeevents, depressive symptoms and semantic relations between symptoms,which are beneficial for better understanding of users'queries. Experimental results show that the proposed approach achieveshigher precision than the word-based retrieval models, namely thevector space model (VSM) and Okapi model, adopting word-levelinformation alone.About the speaker:Liang-Chih Yu (<a href=http://www.isi.edu/~liangchi>http://www.isi.edu/~liangchi</a>) isnow a visiting student in the Information Sciences Institute (ISI) ofUniversity of Southern California (USC). My host advisor is Dr. EduardHovy. I am also a PhD candidate in the Department of Computer Scienceand Information Engineering, National Cheng Kung University, Tainan,Taiwan. My advisor is Dr. Chung-Hsien Wu. My research interestsinclude natural language processing, text mining, informationretrieval, ontology construction, spoken dialogue system.

DTEND;TZID=America/Los_Angeles:20070608T153000
DTSTART;TZID=America/Los_Angeles:20070608T150000
LOCATION:11 Large
SUMMARY:Topic Analysis for Psychiatric Document Retrieval (Practice Talk for ACL)
UID:20070608T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This talk will survey results of several recent projects we have beenundertaking in automated text categorization based upon the style,rather than the topic, of the documents.  I will describe a generaltext-categorization framework using machine learning along with generalprinciples for choosing stylistically relevant sets of features forlearning effective classification models.  Applications of these methodsinclude determining author gender and text genre in published books andarticles, authorship attribution of email messages, and analysis oflanguage use in different scientific fields.  In many cases, the modelsthat are learned also give some insight into the respective styles beingdistinguished, which I will also discuss.Shlomo Argamon is an associate professor at the Illinois Institute ofTechnology Chicago.

DTEND;TZID=America/Los_Angeles:20040326T150000
DTSTART;TZID=America/Los_Angeles:20040326T133000
LOCATION:11 Large
SUMMARY:On Writing, Our Selves: Explorations in Stylistic Text Categorization
UID:20040326T133000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Automated techniques that can support the human activities of search andsense-making in large email collections are of increasing importance for abroad range of uses, including historical scholarship, law enforcement andintelligence applications, and lawyers involved in "e-discovery" incidentto civil litigation.  In this talk, I'll briefly describe some of the workto date on searching large email collections, and then for most of thetalk I will focus on the more challenging task of support forsense-making.  Specifically, I'll describe joint work with Tamer Elsayedto automatically resolve the identity of people who are mentionedambiguously (e.g., just by first name) in a collection of email from afailed corporation (Enron).  Our results indicate that for people who arewell represented in the collection we can use a generative model to guessthe right identity about 80% of the time, and for others we are rightabout half the time.  I'll conclude the talk with a few remarks on ournext directions for techniques, evaluation, and additional types ofcollections to which similar ideas might be applied.About the Speaker:Douglas Oard is an Associate Professor at the University of Maryland,College Park, with joint appointments in the College of InformationStudies and the Institute for Advanced Computer Studies; he is onsabbatical at Berkeley's iSchool for the Fall 2009 semester.  Dr. Oardearned his Ph.D. in Electrical Engineering from the University ofMaryland, and his research interests center around the use of emergingtechnologies to support information seeking by end users.  His recent workhas focused on interactive techniques for cross-language informationretrieval and techniques for search and sense-making in conversationalmedia.  Additional information is available athttp://www.glue.umd.edu/~oard/.

DTEND;TZID=America/Los_Angeles:20091021T160000
DTSTART;TZID=America/Los_Angeles:20091021T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Who 'Dat? Identity resolution in large email collections
UID:20091021T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Each human language contains an astronomically large (if not unbounded) number of different sentences.Â  How can something so large and complex possibly be learnt?Â  Over the past decade and a half we've figured out how to define probability distributions over grammars and the linguistic structures they generate, opening up the possibility of Bayesian models of language acquisition.Â  Bayesian approaches are particularly attractive because they can exploit "prior" (e.g., innate) knowledge as well as statistical generalizations from the input.Â  This opens the possibility of an empirical evaluation of the utility of various kinds of innate knowledge.Â  Structured statistical learners have two major advantages over other approaches.Â  First, because the generalizations they learn and the prior knowledge they utilize are both expressed in terms of explicit linguistic representations, it is clear what is learnt and what information is exploited during learning.Â  Second, because of the "curse of dimensionality", learners that identify and exploit structural properties of their input seem to be the only ones that have a chance of "scaling up" to learn real languages.Â  This talk describes Bayesian methods for learning Context-Free Grammars and a generalization of them that we call Adaptor Grammars, and applies them to problems of morphological acquisition and word segmentation.Joint work with Tom Griffiths (Berkeley) and Sharon Goldwater (Edinburgh)Speaker Bio:Mark Johnson is a Professor of Language Science (CORE) in the Department of Computing at Macquarie University. He was awarded a BSc (Hons) in 1979 from the University of Sydney, an MA in 1984 from the University of California, San Diego and a PhD in 1987 from Stanford University. He held a postdoctoral fellowship at MIT from 1987 until 1988, and has been a visiting researcher at the University of Stuttgart, the Xerox Research Centre in Grenoble, CSAIL at MIT and the Natural Language group at Microsoft Research. He has worked on a wide range of topics in computational linguistics, but his main research area is parsing and its applications to text and speech processing. He was President of the Association for Computational Linguistics in 2003, and was a professor from 1989 until 2009 in the Departments of Cognitive and Linguistic Sciences and Computer Science at Brown University.Professor Johnson's research area is computational linguistics, i.e., explicit computational models of language acquisition, comprehension and production. His recent work has focused on probabilistic models for syntactic parsing (identifying the way words combine to form phrases and sentences) and semantic interpretation, and on Bayesian models of the acquisition of phonology, morphology and the lexicon.

DTEND;TZID=America/Los_Angeles:20100607T150000
DTSTART;TZID=America/Los_Angeles:20100607T140000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:"Bayesian models of language acquisition" or "Where do the rules come from?"
UID:20100607T140000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Inspired by previous work, where decipherment is used to improve machine translation, we propose a new idea to combine word alignment and decipherment into a single learning process. We use EM to estimate the model parameters, not only to maximize the probability of parallel corpus, but also the monolingual corpus. We apply our approach to im- prove Malagasy-English machine transla- tion, where only a small amount of paral- lel data is available. In our experiments, we observe gains of 0.9 to 2.1 Bleu over a strong baseline.Qing Dou is a fifth year Ph.D. student at ISI. He works with Professor Kevin Knight on various decipherment problems and its application to different Natural Language Processing tasks.

DTEND;TZID=America/Los_Angeles:20141017T160000
DTSTART;TZID=America/Los_Angeles:20141017T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Beyond Parallel Data: Joint Word Alignment and Decipherment Improves Machine Translation [EMNLP Practice Talk]
UID:20141017T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract:  While deep learning methods in NLP are arguably overhyped, recurrent neural networks (RNNs), and in particular LSTM networks, emerge as very capable learners for sequential data. Thus, my group started using them everywhere. After briefly explaining what they are and why they are cool, I will describe some recent work in which we use LSTMs as a building block: learning a shared representation in a multi-task setting; learning feature representations for syntactic parsing; and learning to detect hypernyms in a large corpus. Most work achieve state of the art results.  I will also describe a work which reviewers seem to hate but I really like in which we try to shed some light on what's being captured by LSTM-based sentence representations.Bio: Yoav Goldberg is a senior lecturer in Computer Science at Bar Ilan University, Israel, working on natural language processing. Prior to that he was a research scientist at Google. Before deep learning took over he used to work on syntactic parsing and structured prediction. He still does, but now he uses some new shiny tools which he is trying to understand and refine.Live here: http://webcasterms1.isi.edu/mediasite/Viewer/?peid=3d82a6274df44b89a94f376c0c9630f71d

DTEND;TZID=America/Los_Angeles:20160610T160000
DTSTART;TZID=America/Los_Angeles:20160610T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Title: Doing stuff with LSTMs
UID:20160610T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this talk, I will be speaking about our ongoing effort in thedevelopment of Eve, state-of-the-art incremental reference resolutionin images based spoken dialogue agent. Incrementality is central todeveloping a naturally conversing spoken dialogue systems.Incrementality makes the conversations more natural and efficientcompared to non-incremental alternatives. The performance of the Evewas found to be comparable to human performance and she convenientlyoutperforms alternative non-incremental architectures. However,building such a system is not trivial. It needs high-performancearchitectures and dialogue components (ASR, dialogue policies,language understanding etc.). I will also speak about future plans forenhancing Eve's capability. I also take a slight deviation and explorea different word level natural language understanding model forreference resolution in images in a dialogue setting.Bio: Ramesh Manuvinakurike is a Ph.D. student at USC Institute forCreative Technologies working with Prof. David DeVault and Prof.Kallirroi Georgila. He is interested in developing conversationalsystems and has developed various such systems. His work with hiscolleagues on agent Eve won 'Best paper' award at Sigdial 2015.

DTEND;TZID=America/Los_Angeles:20161118T160000
DTSTART;TZID=America/Los_Angeles:20161118T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Incremental spoken dialogue system for reference resolution in images
UID:20161118T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The presentation will give an overview of the SMT activities at theLanguage Technologies Institute, Carnegie Mellon University, in largevocabulary text translation tasks, esp. the Chinese-English andArabic-English, as well as in limited domain speech-to-speech translationtasks.  The CMU SMT system is, like most modern statistical MT systems,based on phrase translation.  Several approaches have been developed toextract the phrase pairs from parallel corpora and current researchinvestigates different scoring approaches for these translation pairs.Details of the decoder, esp. on hypothesis recombination, pruning, andefficient n-best list generation will be given.  Recently, the SMT systemhas been extended to use partial translations generated from example basedand grammar based translation system, thereby performing multi-enginemachine translation.Bio:Stephan Vogel is a researcher at the Language Technologies Institute,Carnegie Mellon University, where he heads the statistical machinetranslation team.  He received a Diploma in Physics from PhilipsUniversity Marburg, Germany, and a Masters of Philosophy from theUniversity of Cambridge, England.  After working for a number of years onthe history of science, he turned to computer science, especially naturallanguage processing.  Before coming to CMU, he worked for several years atthe Technical Univerity of Aachen on statistical machine translation, andalso in the Interactive Systems Lab at the University of Karlsruhe.

DTEND;TZID=America/Los_Angeles:20040402T160000
DTSTART;TZID=America/Los_Angeles:20040402T150000
LOCATION:11 Large
SUMMARY:The CMU Statistical Machine Translation System
UID:20040402T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract: Building systems that understand human language often requires access to large amounts of text annotated with all the features and nuances of human communication.  However, building these annotated corpora is often prohibitive due to the time, cost, and expertise required to annotate.  While crowdsourcing the work can help, untrained workers still incur costs and the workers may not be as motivated to answer correctly.  In this talk, I will describe how to solve this annotation bottleneck using video games in which traditional annotation tasks are transformed into core video game mechanics and embedded in the kinds of games you might play on your mobile phone.  Our video games are not only fun to play but are capable of annotating a wide variety of linguistic phenomena at costs lower that crowdsourcing and have quality equal to that of experts.  Using four games, I will demonstrate how their creation process can be distilled into reusable design patterns to create new games for different types of tasks in linguistics and beyond.Bio: David Jurgens is postdoctoral scholar in the department of Computer Science at Stanford University.  He received his PhD in Computer Science from UCLA in 2014 and has been a visiting researcher at HRL Laboratories, research scientist at Sapienza University of Rome and postdoctoral scholar at McGill University.  His research focuses on two areas: natural language processing, where he works on new methods for understanding the meaning of text, and computational social science where he investigates population dynamics through peoples' language and demographics.  He is currently a co-chair of the International Workshops on Semantic Evaluation (SemEval) and of the workshop on Natural Language Processing and Computational Social Science.  His research has been featured in Forbes, MIT Technology Review, Business Insider, and Schneier on Security.

DTEND;TZID=America/Los_Angeles:20160304T160000
DTSTART;TZID=America/Los_Angeles:20160304T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Linguistic Annotation Using Video Games with a Purpose
UID:20160304T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This talk presents dual decomposition as a general technique for NLP.The first part introduces dual decomposition as a framework forderiving inference algorithms for NLP problems. The approach relies onstandard dynamic-programming algorithms as oracle solvers forsub-problems, together with a simple method for forcing agreementbetween the different oracles. The approach provably solves a linearprogramming (LP) relaxation of the global inference problem. It leadsto algorithms that are simple, in that they use existing decodingalgorithms; efficient, in that they avoid exact algorithms for thefull model; and often exact, in that empirically they often recoverthe correct solution in spite of using an LP relaxation.The second part presents an application of dual decomposition tonon-projective parsing . We focus on parsing algorithms fornon-projective head automata, a generalization of head-automata modelsto non-projective structures. The dual decomposition algorithms aresimple and efficient, relying on standard dynamic programming andminimum spanning tree algorithms. They provably solve an LP relaxationof the non-projective parsing problem. Empirically the LP relaxationis very often tight: for many languages, exact solutions are achievedon over 98% of test sentences.The accuracy of our models is higherthan previous work on a broad range of datasets.

DTEND;TZID=America/Los_Angeles:20100806T160000
DTSTART;TZID=America/Los_Angeles:20100806T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Dual Decomposition for Natural Language Inference
UID:20100806T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: For ten days in March, nine research teams worked together to buildCebuano language resources and systems for a "dry run" the TIDES SupriseLanguage experiment. Cebuano is spoken widely in the southernPhillipines, but there had previously been little work on computationallinguistics for that language. As we prepare for the actual SupriseLanguage experiment this June, we will use this talk to look back on whatworked, what didn't, and what lessons there are to be learned from ourexperience in March. Come prepared to share the excitement, offer yourideas, and understand why we have tried to ask Ed to cancel all vacationsduring the month of June (just kidding...).

DTEND;TZID=America/Los_Angeles:20030509T160000
DTSTART;TZID=America/Los_Angeles:20030509T150000
LOCATION:11 Large
SUMMARY:Coping with Surprise: The Case of Cebuano
UID:20030509T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Knowledge representation is hard.  As natural language scientists andengineers, we'd like something that- is expressive enough to capture how natural language works- permits tractable inference- admits learning algorithms for automatic knowledge acquisition- leads to modular system constructionThis talk will look at knowledge representation for capturing naturallanguage transformations.  A lot of what we do falls into thiscategory.  Examples of transformations include language translation(French to English), question answering (Question to Answer),transliteration (foreign script to Roman alphabet), summarization(long text to short text), parsing (string to tree), languagegeneration (meaning to string), etc.I'll show various knowledge formats (starting with simple finite-statetransducers) and show how they stack up on the 4 criteria above, usingtheorems and examples.  We'll see that different types of tree andstring automata lead to good behavior on various subsets of the 4criteria, but getting 4 out of 4 is still elusive.This is a Krazy Theory talk -- since this kind of talk should not goon and on, I promise to finish within 50 minutes.

DTEND;TZID=America/Los_Angeles:20070112T153000
DTSTART;TZID=America/Los_Angeles:20070112T140000
LOCATION:11 Large
SUMMARY:Capturing Natural Language Transformations
UID:20070112T140000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this talk I will discuss an agent that can play a simple word-guessing game with a user.   The fast-paced, multi-modal, and interactive nature of the dialogue that takes place in word-guessing games are challenging for todayâs dialogue systems to emulate.   The agent serves as a research testbed to explore issues of fast-paced incremental interaction and user satisfaction in such a setting.  I will trace how the agent's design was motivated by a human-human corpus as well as discuss two empirical studies involving the agent.  The first study was designed to learn an algorithm to automatically select effective clues (clues likely to elicit a correct guess from a human).  The second study was an evaluation of several synthetic voices and 1 human voice which showed how participant's subjective perceptions and objective task performances fluctuated based on the voice used and the duration of the participant's exposure to the voice.Bio. Eli Pincus is a 3rd year USC PhD student and a graduate research assistant in the Natural Dialogue Group at USC Institute for Creative Technologies.  He is advised by Professor David Traum. Eli's main research is in human-computer dialogue.  Since joining USC he has been working on improving virtual human dialogue.  He won the best computer science department TA award in spring 2015.   He was a research intern in the NLP and AI group at Nuance Communications in summer 2015.

DTEND;TZID=America/Los_Angeles:20151204T160000
DTSTART;TZID=America/Los_Angeles:20151204T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:What Can We Learn From An Agent that Plays Word-Guessing Games?
UID:20151204T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Bilingual term lists have proven to be a useful basis fordictionary-based Cross-Language Information Retrieval (CLIR), butthere is ample anecdotal evidence that differences in vocabularycoverage can have a substantial impact on retrieval effectiveness.This issue has recently been explored using ablation studies in whichprogressively smaller term lists were synthesized using samplingtechniques. The ablation techniques used in those studies have not,however, been validated using real terms lists. In this talk I willreport the results of what we believe is the first large coveragestudy use naturally occurring term lists. Thirty-five bilingual termslists were obtained from a variety of sources, each with English asone of the two paired languages. From these, we created 35English-to-English term lists by taking each term that was present inthe English side of the list as its own translation. When used withan English information retreval test collection, this allowed us tomeasure the reduction in retrieval effectivenss that could beattributed to deficiencies in the coverage of English terms. Eighttypes of untranslatable terms were identified in a collection of newsstories, of which named entitles were found to have the greatestimpact on retrieval effectiveness. Differences in named entitycoverage were found to produce large differences in retrievaleffectiveness for term lists of similar sizes. Controlling for namedentity effects yielded a clear relationship between retrievaleffectiveness and the size of the translatable English vocabulary.The functional dependence that we observed is consistent with onepreviously applied ablation technique and inconsistent with another.Our results indicate that the outcome of a widely cited landmark studyof query expansion effects for CLIR was likely affected by a flawedablation model. We conclude our talk with a suggestion for furtherwork on that topic, and a simple prescription for avoiding suchproblems in the future.

DTEND;TZID=America/Los_Angeles:20030612T120000
DTSTART;TZID=America/Los_Angeles:20030612T110000
LOCATION:11 Large
SUMMARY:Measuring the Effect of Dictionary Coverage on Cross-Language Retrieval
UID:20030612T110000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: People make explicit subjective judgments of speech when doing things like tutoring students in a foreign language, or testing a child's reading skills.  On what do we base these judgments, and how can they be made automatically?  The "quality" of speech does not exist on any one scale alone, and is not limited strictly to pronunciation - it is manifested through a multiplicity of simultaneous and interacting cues of various sizes.  In this talk I'll discuss modeling strategies for categorical pronunciation on several scales, cognitive models for estimating student knowledge demonstrated through speech, and applications in the fields of education and speech synthesis.

DTEND;TZID=America/Los_Angeles:20090213T160000
DTSTART;TZID=America/Los_Angeles:20090213T150000
LOCATION:11 Large
SUMMARY:Estimating Subjective Judgments of Speech on Multiple Levels
UID:20090213T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Over the last two decades, the inter-disciplinary fields of visual attention and saliency have attracteda lot of interest in cognitive sciences, computer vision, robotics, and machine learning. The high complexity of natural environments requires the primate visual system to combine, in a highly dynamic and adaptive manner, sensory signals that originate from the environment (bottom-up) with behavioral goals and priorities dictated by the task at hand (top-down). I will talk about my recent research in two directions: 1) Bottom-up attention: I will give a snapshot of biological findings on visual attention (e.g., how gaze direction of people in a scene influences eye movements of an external observer), theoretical background on saliency concepts, our model benchmark and saliency models, and 2) Top-down attention: I will describe our neuromorphic algorithms to predict, in a task-independent manner, which elements in a video scene might more strongly attract the gaze of a human. Multi-modal data including bottom-up saliency, "gist" or global context, physical actions and object properties (using example recorded eye movements and videos of humans engaged in various 3D video games, including flight combat, driving, first-person shooting, running a hot-dog stand that serves hungry customers) are utilized to associate particular scenes with particular locations of interest, given the task (e.g., when the task is to drive, if the scene depicts a road turning left, the system learns to look at that left turn). Finally, I will present some successful engineering and clinical applications of our models.Bio:Ali Borji received the BS and MS degrees in computer engineering from the Petroleum University of Technology, Tehran, Iran, 2001 and Shiraz University, Shiraz, Iran, 2004, respectively.He received the PhD degree in computational neurosciences from the Institute for Studiesin Fundamental Sciences (IPM) in Tehran, 2009. He then spent a year at University of Bonn as a postdoc. He has been a postdoctoral scholar at iLab, University of Southern California, Los Angeles since March 2010.His research interests include computer vision, machine learning, and neurosciences with particular emphasis on visual attention, visual search, active learning, scene and object recognition, and biologically plausible vision models.

DTEND;TZID=America/Los_Angeles:20140731T120000
DTSTART;TZID=America/Los_Angeles:20140731T110000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Computational Modeling of Bottom-up and Top-down Visual Attention
UID:20140731T110000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract: Online reviews are often our first port of call when considering products and purchases online. Yet navigating huge volumes of reviews (many of which we might disagree with) is laborious, especially when we are interested in some niche aspect of a product. This suggests a need to build models that are capable of capturing the complex and idiosyncratic semantics of reviews, in order to build richer and more personalized recommender systems. In this talk I'll discuss three such directions: First, how can reviews be harnessed to better understand the dimensions (or facets) of people's opinions? Second, how can reviews be used to answer targeted questions, that may be subjective or require personalized responses? And third, how can reviews themselves be synthesized, so as to predict what a reviewer would say, even for products they haven't seen yet?Bio: Dr. McAuley has been an Assistant Professer in the Computer Science Department at the University of California, San Diego since 2014. Previously he was a postdoctoral scholar at Stanford University after receiving his PhD from the Australian National University in 2011. His research is concerned with developing predictive models of human behavior using large volumes of online activity data.

DTEND;TZID=America/Los_Angeles:20160401T160000
DTSTART;TZID=America/Los_Angeles:20160401T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Harnessing reviews to build richer models of opinions
UID:20160401T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Granularity is the task of breaking down a complex description into simpler concepts of finer detail, such that each of the simpler concepts can be collectively describe the main description. It can be thought of as a hierarchy of varying levels of information, with fine grained and specific information i.e. information with more detail at lower levels, and coarse grained and generic information i.e. information with less detail, at higher levels. Shifting in granularity from lower to higher levels leads to information loss or abstraction of certain fine details which become irrelevant at that level. Similarly, shifting granularity from a coarse level to a fine level involves more specific details as compared to the level above this.Humans can seamlessly shift between various granularity levels when interpreting discourse. Textual descriptions are usually written such that the reader gets to know the key features of fine-grained events, and then theoverall picture from the coarse-grained description of a process. This thesis proposal is towards identification and extraction of such structures from Natural Language Discourse.

DTEND;TZID=America/Los_Angeles:20100416T160000
DTSTART;TZID=America/Los_Angeles:20100416T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Understanding Granularity in Natural Language Discourse (Ph.D. Proposal practice talk)
UID:20100416T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This is a practice tutorial for one I am giving at HLT/NAACL one weeklater.  Comments/feedback are very welcome.----------------------------------------------------------------------Expectation Maximization (EM) has proved to be a great and usefultechnique for unsupervised learning problems in speech and languageprocessing.  Unfortunately, its range of applications is limited either byintractable E- or M-steps, or by its reliance on the maximum likelihoodestimator.  The natural language processing community typically resorts toad-hoc approximation methods to get (some reduced form of) EM to apply toNLP tasks.  However, many of the problems that plague EM can be solvedwith Bayesian methods, which are theoretically more well justified.  Inthis tutorial, I discuss Bayesian methods as they can be used in naturallanguage processing.  The two primary foci of this tutorial are specifyingprior distributions and performing the necessary computations to performinference in Bayesian models.  I focus on unsupervised techniques (forwhich EM is the obvious choice), but discuss supervised and discriminativetechniques at the conclusion with pointers to relevant literature.Depending on one's inference technique of choice, the math required tobuild Bayesian learning models can be difficult.  Compounding this problemis the fact that current written tutorials on Bayesian techniques tend tofocus on continuous-valued problems, a poor match for the high-dimensiondiscrete world of text.  This combination makes the cost of entrance tothe Bayesian learning literature often too high.  The goal of thistutorial is to provide sufficient motivation, intuition and vocabularymapping so that one can easily understand recent papers in Bayesianlearning that are published at conferences like NIPS, and increasingly atACL.  In addition to the standard tutorial materials (slides), thistutorial is accompanied by a technical report that spells out all themathematic derivations in great detail, for those who wish to startresearch projects in this fields.This tutorial should be accessible to anyone with a basic understanding ofstatistics.  I use a query-focused summarization task as a motivatingrunning example for the tutorial, which should be of interest toresearchers in natural language processing and in information retrieval.Additionally, though the tutorial does not focus on speech problems, thoseattendees interested in graphical modeling techniques for automatic speechrecognition might also find the tutorial of interest.

DTEND;TZID=America/Los_Angeles:20060524T120000
DTSTART;TZID=America/Los_Angeles:20060524T90000
LOCATION:4th Floor
SUMMARY:Beyond EM: Bayesian Techniques for Human Language Technology Researchers
UID:20060524T90000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I will describe a framework for cross-lingual transfer ofprobabilistic models that uses posterior regularization.  As a longaside, I will describe several methods for learning with sideinformation: constraint driven learning, posterior regularization,generalized expectation, augmented loss as well as how they relate toeach other and to Bayesian measurements.   I will conclude with someapplications from my work and from the literature, including sequenceand tree models.Biography:I was born in Sofia, Bulgaria where I lived until February 1989. Myfamily moved to Zimbabwe and then in 1995 to New Zealand where I wentto high school. I came to the US in 1999 to study at SwarthmoreCollege. I spent the 2001-2002 academic year studying abroad in Paris.After graduating with a Bachelor of Arts in Computer Science in 2003 Iworked at StreamSage Inc. in Washington DC until starting at theUniversity of Pennsylvania in Fall 2004. During the summer of 2007 Iwas an intern at TrialPay in Mountain View, CA and during the summerof 2008 I was an intern at Bank of America in New York. I graduatedfrom UPenn in 2010 and have since been working at Google Inc. in NewYork.

DTEND;TZID=America/Los_Angeles:20131024T160000
DTSTART;TZID=America/Los_Angeles:20131024T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Cross lingual transfer and learning with side information
UID:20131024T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Synchronous grammars have been shown to be effective as models of translation, and the performance of such systems depends heavily on the quality of the grammar induced from the training data. The standard method for induction of synchronous grammars uses automatic word alignments to constrain possible derivations, which makes them prey to alignment errors. In this work, we propose a method for joint word alignment and grammar induction. Our experiments show that our method significantly outperforms the standard method, while reducing the size of the grammar by more than half.

DTEND;TZID=America/Los_Angeles:20110211T160000
DTSTART;TZID=America/Los_Angeles:20110211T150000
LOCATION:4th Floor Large Conference Room [460]
SUMMARY:Joint Word Alignment and Synchronous Grammar Induction
UID:20110211T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Minimum Bayes risk (MBR) decoding improves the output ofmachine translation systems by selecting a translation that matches alarge proportion of the k-best hypotheses of a system.  We extend thisidea to apply to packed forests by selecting an output sentence thatmatches a large proportion of all hypotheses in the pruned forest ofderivations from a syntax-based translation system.

DTEND;TZID=America/Los_Angeles:20080820T161500
DTSTART;TZID=America/Los_Angeles:20080820T154500
LOCATION:11 Small
SUMMARY:Intern Final Talk: Minimum Risk Decoding over Forests
UID:20080820T154500@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Analogies allow us to make connections between different domains of knowledge and to apply what we already know to new situations. For this reason, they're important to developing new theories and new understandings of the social and natural world, and have often been seen as an important task for machine learning. In my talk, I'll explore how different theories of how analogy works shape the different approaches that research teams take when modeling analogical thinking. Specifically, I'll contrast what I term "formal" or "top-down" theories of analogy with a "serial" or "bottom-up" approach. Finally, I'll describe a syntactic and semantic method for searching out analogies within corpora. I'm convinced that understanding analogies better, and being able to find locate new analogies in historical documents, can help us understand where new ideas come from.Bio: Devin Griffiths is an assistant professor in the English Department at USC, where he studies nineteenth-century British literature and scientific history. His current book project, titled "Between the Darwins," explores how analogies were used in the nineteenth century to create new theories of evolution and social progress. His areas of research include science and literature, poetics, book history, and the digital humanities.Webcast Link: http://webcasterms1.isi.edu/mediasite/Viewer/?peid=56439af4a5cb41f49a2c5faef5683cd11d

DTEND;TZID=America/Los_Angeles:20150123T160000
DTSTART;TZID=America/Los_Angeles:20150123T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Understanding Analogies: Theory and Method
UID:20150123T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Non-expert annotation services like Amazon's Mechanical Turk (AMT) are cheap and fast ways to evaluate systems and provide categorical annotations for training data. Unfortunately, some annotators choose bad labels in order to maximize their pay. Manual identification is tedious, so we experiment with an item-response model. It learns in an unsupervised fashion to a) identify which annotators are trustworthy and b) predict the correct underlying labels. We match performance of more complex state-of-the-art systems and perform well even under adversarial conditions. We show considerable improvements over standard baselines, both for predicted label accuracy and trustworthiness estimates. We show that the latter can be further improved by introducing a prior on model parameters and using Variational Bayes inference. Additionally, we present a method for trading precision and recall, achieving even higher performance by focusing on the instances our model is most confident in. We provide an implementation of MACE (Multi- Annotator Competence Estimation) for download at (http://www.isi.edu/publications/licensed-sw/mace/).Bio:Dirk Hovy is a recent PhD graduate from USC's Information Sciences Institute, working with Jerry Hobbs and Ed Hovy. He has a background in socio-linguistics. His current work includes unsupervised and semi-supervised sequential models of relation extraction and WSD, as well as annotator assessment. He has also worked on temporal relations, metaphors, and prosody. A full list of his publications can be found at(http://www.dirkhovy.com/portfolio/papers/index.php). His other interests include cooking, picking up heavy things (and putting them back down), and medieval art and literature.

DTEND;TZID=America/Los_Angeles:20130605T163000
DTSTART;TZID=America/Los_Angeles:20130605T153000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Learning Whom to Trust with MACE(NAACL Practice Talk)
UID:20130605T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We cast the generation of semantic graphs from natural language text as a machine translation problem, where the source language is English and the target language is a labeled graph representing a semantic interpretation, known as an Abstract Meaning Representation (AMR). Via a series of data transformations we create a training set that is amenable to a string-to-tree syntax mt decoder. Previous work in SBMT and AMR parsing is combined to yield a trainable system that achieves state-of-the-art parsing results.Bio: Jonathan May is a computer scientist at USC-ISI, where he also received a PhD in 2010. His current focus areas are in machine translation, machine learning, and natural language understanding. Jonathan co-developed and patented a highly portable method for optimizing thousands of features in machine translation systems that has since been incorporated into all leading open source MT systems. He has previously worked in automata theory and information extraction and at SDL Language Weaver and BBN Technologies.

DTEND;TZID=America/Los_Angeles:20150220T160000
DTSTART;TZID=America/Los_Angeles:20150220T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Semantic Parsing as Machine Translation
UID:20150220T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Traditional anomaly detection on social media mostly focuses on individual point anomalies while anomalous phenomena usually occur in groups. Therefore it is valuable to study the collective behavior of individuals and detect group anomalies. Existing group anomaly detection approaches rely on the assumption that the groups are known, which can hardly be true in real world social media applications. In this paper, we take a generative approach by proposing a hierarchical Bayes model: Group Latent Anomaly Detection (GLAD) model. GLAD takes both pair-wise and point-wise data as input, automatically infers the groups and detects group anomalies simultaneously. To account for the dynamic properties of the social media data, we further generalize GLAD to its dynamic extension d-GLAD. We conduct extensive experiments to evaluate our models on both synthetic and real world datasets. The empirical results demonstrate that our approach is effective and robust in discovering latent groups and detecting group anomalies.Bio: Yan Liu is an assistant professor in Computer Science Department at University of Southern California from 2010. Before that, she was a Research Staff Member at IBM Research. She received her M.Sc and Ph.D. degree from Carnegie Mellon University in 2004 and 2007. Her research interest includes developing scalable machine learning and data mining algorithms with applications to social media analysis, computational biology, climate modeling and healthcare analytics. She has received several awards, including NSF CAREER Award, Okawa Foundation Research Award, ACM Dissertation Award Honorable Mention, Best Paper Award in SIAM Data Mining Conference, Yahoo! Faculty Award and the winner of several data mining competitions, such as KDD Cup and INFORMS data mining competition.

DTEND;TZID=America/Los_Angeles:20150612T160000
DTSTART;TZID=America/Los_Angeles:20150612T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Group Anomaly Detection in Social Media Analysis
UID:20150612T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The functioning regions of the brain are connected through a complex network of fibers, described by the brainâs white matter. Non-invasive imaging using MRI-based diffusion imaging can help capture important characteristics of the connections by describing the strength and directionality profile of water diffusion along white matter fibers. Variability in these connections have been noted in many neurological, degenerative, and psychiatric disorders where ultimately information transfer from on brain region to the other may be weakened or completely compromised. To discover genetic risk factors for altered connectivity and common genetic variants which put the brain at subtle risk for weakened connections, we find power in sample size and pool multiple datasets from around the world to determine common effects in all populations. However, there is no standard method for acquiring diffusion images and standardizing measures across datasets is an ongoing challenge. The Enhancing Neuro Imaging Genetics through Meta Analysis group on Diffusion Tensor Imaging has established a set of basic protocols to overcome a portion of these challenges, which I will describe, along with works-in-progress to tackle additional obstacles to reveal critical details of the brains network.Bio. Neda Jahanshad is an assistant professor of Neurology at USC in the Imaging Genetics Center at ISI.  She received her PhD in Biomedical Engineering at UCLA in 2012 where she worked on optimizing diffusion imaging protocols to map structural brain connections in large populations. She has since extended the work to explore methods of pooling such imaging data from across the world and determine genetic and environmental contributions to the connectivity of the brain and determine how these effects vary across the lifespan. She is coordinating one of the largest studies of the brain's white matter through the ENIGMA Consortium http://enigma.ini.usc.edu.

DTEND;TZID=America/Los_Angeles:20150306T160000
DTSTART;TZID=America/Los_Angeles:20150306T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Multi-site genetic analysis of the brainâs white matter: ENIGMA-DTI
UID:20150306T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract: We present a general framework for unsupervised learning that combines probalistic graphical models with the power of deep nets. We employ a neuralized expectation miminization algorithm for learning. We apply this framework for unsupervised sequential tagging and show some interesting results.Bio: Ke is a PhD candidate at University of Amsterdam. He is interning at ISI, working with Yonatan Bisk, Ashish Vaswani, Kevin Knight, and Daniel Marcu. His research focuses on deep learning and machine translation.

DTEND;TZID=America/Los_Angeles:20160826T160000
DTSTART;TZID=America/Los_Angeles:20160826T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Unsupervised learning linguistic structures with deep neural networks
UID:20160826T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Summarization requires one to identify the internal structure ofinformation and to bring that to the surface both operationally andorganizationally.How does one put this theory to practice and build real summarizationsystems? How do the systems built based on this idea perform?

DTEND;TZID=America/Los_Angeles:20040430T163000
DTSTART;TZID=America/Los_Angeles:20040430T150000
LOCATION:11 Large
SUMMARY:Automating the Building of Summarization Systems
UID:20040430T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Thanks to massive training data, and powerful machine translation techniques, machine translation quality has reached acceptable levels for a handful of languages. However, for hundreds of other languages, translation quality decreases quickly as the size of the available training data becomes smaller. For languages with a few millions or less tokens of translation data (called low-resource languages in this dissertation) traditional machine translation technologies fail to produce understandable translations into English. In this work, I explore various non-traditional sources for improving low-resource machine translation.Bio: Nima Pourdamghani is a phd student at USC/ISI working with professor Kevin Knight. Nima's interests are natural language processing, and applications of machine learning in general. His phd thesis is on building tools to improve machine translation for hundreds of low-resource languages.

DTEND;TZID=America/Los_Angeles:20180212T160000
DTSTART;TZID=America/Los_Angeles:20180212T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Non-traditional resources and improved tools for low-resource machine translation
UID:20180212T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The path to natural language understanding goes through increasingly challenging question answering tasks.  I will present research that significantly improves performance on two such tasks: answering complex questions over tables, and open-domain factoid question answering.  For answering complex questions, I will present a type-constrained encoder-decoder neural semantic parser that learns to map natural language questions to programs.  For open-domain factoid QA, I will show that training paragraph-level QA systems to give calibrated confidence scores across paragraphs is crucial when the correct answer-containing paragraph is unknown.  I will conclude with some thoughts about how to combine these two disparate QA paradigms, towards the goal of answering complex questions over open-domain text.Bio: Matt Gardner is a research scientist at the Allen Institute for Artificial Intelligence (AI2), where he has been exploring various kinds of question answering systems.  He is the lead designer and maintainer of the AllenNLP toolkit, a platform for doing NLP research on top of pytorch.  Matt is also the co-host of the NLP Highlights podcast, where, with Waleed Ammar, he gets to interview the authors of interesting NLP papers about their work.  Prior to joining AI2, Matt earned a PhD from Carnegie Mellon University, working with Tom Mitchell on the Never Ending Language Learning project.

DTEND;TZID=America/Los_Angeles:20180727T160000
DTSTART;TZID=America/Los_Angeles:20180727T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:A Tale of Two Question Answering Systems
UID:20180727T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In a famous 1944 paper, psychologist Fritz Heider and his student Marianne Simmel described an experiment where undergraduates were shown a short animated film depicting the movement of geometric shapes. Asked to describe what happened in the film, these students produced narratives that described the behavior of these shapes in anthropomorphic terms, ascribing to them plans, goals, emotions, and social roles that accounted for their behavior. Fritz Heider later wrote his seminal book, The Psychology of Interpersonal Relations, which articulated the role of Commonsense Psychology in the interpretation of the behavior of other people.In this talk I'll discuss our recent efforts to model the reasoning of the students in Heider and Simmel's original experiment. I'll describe our vision of a "Heider-Simmel Interactive Theater," a software application where people can create their own short movies involving geometric shapes in the style of Heider and Simmel's original film, which are then interpreted by the computer to generate a textual narrative of the author's creation. Then I'll lay out the technical plan, which involves the integration of probabilistic graphical models, weighted abduction, data-driven text generation, logical formalizations of commonsense psychology, and game-based data collection from the public at large.Before coming to the talk, please sign up and play "Triangle Charades" at the following website: http://charades.ict.usc.edu

DTEND;TZID=America/Los_Angeles:20130927T160000
DTSTART;TZID=America/Los_Angeles:20130927T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Heider-Simmel Interactive Theater
UID:20130927T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Both coarse-to-fine and A* parsing use simple grammars to guide search incomplex ones. We compare the two approaches in a common, agenda-basedframework, demonstrating the tradeoffs and relative strengths of eachmethod. Overall, coarse-to-fine is much faster for moderate levels of searcherrors, but below a certain threshold A* is superior. In addition,we present the first experiments on hierarchical A* parsing, inwhich computation of heuristics is itself guided bymeta-heuristics. Multi-level hierarchies are helpful in bothapproaches, but are more effective in the coarse-to-fine case becauseof accumulated slack in A* heuristics.

DTEND;TZID=America/Los_Angeles:20090619T160000
DTSTART;TZID=America/Los_Angeles:20090619T150000
LOCATION:11 Large
SUMMARY:Hierarchical Search for Parsing (and Machine Translation)
UID:20090619T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This talk will be about automatic speech-to-speech translation.  In oursystem, a doctor speaks one language, the patient speaks another language,and the machine translates their utterances from one language to theother.  The talk will be followed by a demo of our system.One approach we have been successful with is phrase classification, i.e.,classifying a noisy speech-recognized utterance into one of many meaningcategories.  Phrase classification is computationally cheap and canprovide high quality translations for in domain utterances almostinstantaneously. Speed is important for speech translation, whereprocessing delay is a great concern.In this talk, different aspects of building a classification-based speechtranslator are discussed. Following an overview of automaticspeech-to-speech translation and its challenges, a comparison of differentclassification methods is presented and data collection techniques forthat application are introduced.

DTEND;TZID=America/Los_Angeles:20040621T160000
DTSTART;TZID=America/Los_Angeles:20040621T150000
LOCATION:11 Large
SUMMARY:Speech-to-Speech Translation: A Phrase Classification Approach
UID:20040621T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: With the shift from desktop to mobile, and the rise of voice-driven UIs, a growing proportion of the Google query stream is not well-served by conventional keyword-based information retrieval.  More and more queries use natural language ("when does walgreens close"), seek answers not found on any web page ("how do i get to work from here"), or demand action rather than information ("text my wife i'm 10 minutes late").  Satisfying such queries requires semantic parsing, that is, mapping the query into a structured, machine-readable representation of meaning.  In this talk, I will give an overview of the techniques Google has developed to address the problem of semantic parsing, and discuss some of the challenges that remain.  I'll also highlight differences between academia and industry in how the problem is conceived.Bio: Bill MacCartney is a Senior Research Scientist at Google, working primarily on semantic parsing. He is also a Consulting Assistant Professor of Computer Science at Stanford.  For more info: http://nlp.stanford.edu/~wcmac/

DTEND;TZID=America/Los_Angeles:20140926T160000
DTSTART;TZID=America/Los_Angeles:20140926T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Semantic Parsing at Google
UID:20140926T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: More than three decades of research has sought to uncover theprinciples that determine how hearers interpret pronouns in context.This work has focused predominantly on identifying so-called'preferences' or 'heuristics' that hearers utilize based on linguisticproperties of antecedent expressions.  This focus is a departure fromthe type of approach outlined in Hobbs (1979), which argues that themechanisms that drive pronoun interpretation are driven predominantlyby semantics, world knowledge, and inference, with particularreference to how these are used to establish the coherence ofdiscourses.In this talk, I report on new experimental evidence in support of acoherence-driven analysis, and describe how the analysis canaccommodate a range of previous findings suggestive of conflictingpreferences and biases.  Case studies of four commonly-citedpreferences are described, specifically (i) the parallel grammaticalrole preference (e.g., Smyth 1994), (ii) thematic role preferences(e.g., Stevenson et al. 1994), (iii) implicit causality biases (e.g.,Caramazza et al. 1977), and (iv) the subject assignment strategy(e.g., Crawley et al. 1990).  In each case, the experimental resultsoffer an explanation of what the underlying source of the bias is, andpredicts in what contexts evidence for it will surface.These results suggest that pronoun interpretation is incrementallyinfluenced in part by the probabilistic expectations that hearers haveabout how the discourse will be coherently continued.  They are alsoargued to leave various myths by the roadside, e.g., that pronouninterpretation can be profitably thought of as a 'search and match'procedure, and that coherence relations need not be controlled for inexperimental stimuli.This talk includes joint work with Laura Kertz, Hannah Rohde, andJeffrey Elman.

DTEND;TZID=America/Los_Angeles:20090508T160000
DTSTART;TZID=America/Los_Angeles:20090508T150000
LOCATION:11 Large
SUMMARY:Coherence and the (Psycho-) Linguistics of Pronoun Interpretation
UID:20090508T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Topic models are statistical models that can be used to infer the most likely topics that some piece of text is about. Such models are useful for applications that rely on semantic representations of text, such as query classification, document understanding, and measuring semantic similarity. These models are widely used within Google. In this talk, I will first describe the details of one of these models -- one that learns over a million topics covering just about every language. I will then describe a number of technical and practical challenges involved in keeping such a model fresh and up-to-date within real-world applications.Bio: Donald Metzler is a Staff Software Engineer at Google Inc. Prior to that, he was a Research Assistant Professor at the University of Southern California (USC) and a Senior Research Scientist at Yahoo!. He has served as the Program Chair of the WSDM, ICTIR, and OAIR conferences and sat on the editorial boards of the major journals. He has published over 40 research papers, has been awarded 4 patents, and co-authored the textbook Search Engines: Information Retrieval in Practice.

DTEND;TZID=America/Los_Angeles:20150403T160000
DTSTART;TZID=America/Los_Angeles:20150403T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Keeping Topic Models Fresh: Technical and Practical Challenges
UID:20150403T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: A group of 60 researchers have been asked to comment on what they perceive to be- the most important contributions in the fields of speech recognition, language modeling, and machine translation;- past ideas that failed to lead to substantial improvements;- and contributions that are most likely to have a material impact in the future.This talk summarizes the perceptions and trends identified in the collection of answers provided by the researchers.

DTEND;TZID=America/Los_Angeles:20081107T160000
DTSTART;TZID=America/Los_Angeles:20081107T150000
LOCATION:11 Large
SUMMARY:The best/worst Speech Recognition, Language Modeling, and Machine Translation ideas
UID:20081107T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Feature structures are useful for capturing logical semantic relationships. In this talk, we present smatch, a metric that determines semantic overlap between two semantic feature structures. We give an ef.cient algorithm to compute the metric, and we show the results of an inter-annotator agreement study.

DTEND;TZID=America/Los_Angeles:20121207T160000
DTSTART;TZID=America/Los_Angeles:20121207T150000
LOCATION:11th Floor Conference Room [1135]
SUMMARY:Smatch: an Evaluation Metric for Semantic Feature Structures
UID:20121207T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: TALK 1: Erica GreeneTitle: A Statistical Foray into PoetryAbstract: Although the analysis and generation of poetry is often considered anexclusively human task, we have taken some initial steps to automatethe process.  We build a series of finite state transducers to analyzepoetic meter and train them on a handmade corpus of poetry. We thenuse these trained transducers to generate poetry.  Specifically, wefocus on generating sonnets and limericks.------------------------------------------TALK 2: Paramveer DhillonTitle: Learning to simplify target language for MT + Unsupervised log-linearmodels for Word AlignmentAbstract: We consider the Machine Translation task for the language pair(Chinese and English), where English is the target language. There arelots of redundancies in English language, e.g. It has capitalization,i.e. the first word of each sentence is capitalized, and it hasdifferent morphology i.e. it has noun and verb endings; none of whichare present in Chinese. In a way, due to these redundancies, we arelearning that a single Chinese word "tamen" translates to "They" and"they" and another Chinese word translates to "run", "runs" and"running". We present generative models which learn to "cluster" thetarget language vocabulary, by removing the above redundancies, namely(Capitalization and Different morphology). We show results on how this"clustering" affects the translation quality in end-to-end MTexperiments.In the last part of the talk, I would talk about using unsupervisedlog-linear(discriminative) models for improving word alignments. Thereare very few precedents of using discriminative models for wordalignment in totally unsupervised settings. (Taskar et. al. '05) and(Lacoste-Julien et. al. '06) used maximum weight bipartite matching in"nearly" unsupervised setting and (Blunsom et. al. '06) used CRFs forsupervised word alignment. We use log-linear models in totallyunsupervised settings to do word alignments. Speicifically we useContrastive Estimation (Smith et. al. '05) to shift the probabilitymass to the correct set of alignments from a well-chosen"neighborhood" of those alignments. In the end I will show somepreliminary word alignment results using our approach.

DTEND;TZID=America/Los_Angeles:20090827T160000
DTSTART;TZID=America/Los_Angeles:20090827T150000
LOCATION:11 Large
SUMMARY:Intern Final Talks
UID:20090827T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Weighted finite-state string transducer cascades are a powerful formalism for models of solutions to many natural language processing problems such as speech recognition, transliteration, and translation. Researchers often directly employ these formalisms to build their systems by using toolkits that provide fundamental algorithms for transducer cascade manipulation, combination, and inference. However, extant transducer toolkits are poorly suited to current research in NLP that makes use of syntax-rich models. More advanced toolkits, particularly those that allow the manipulation, combination, and inference of weighted extended top-down tree transducers, do not exist. In large part, this is because the analogous algorithms needed to perform these operations have not been defined. This thesis solves both these problems, by describing and developing algorithms, by producing an implementation of a functional weighted tree transducer toolkit that uses these algorithms, and by demonstrating the performance and utility of these algorithms in multiple empirical experiments on machine translation data.

DTEND;TZID=America/Los_Angeles:20100414T170000
DTSTART;TZID=America/Los_Angeles:20100414T160000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Weighted Tree Automata and Transducers for Syntactic Natural Language Processing (Ph.D. Defense practice talk)
UID:20100414T160000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: As DARPA's TIDES (Translingual Information Detection, Extraction, andSummarization) program coming to an end, I will give a summary of what wehave learned from TIDES in summarization and a brief overview of ourcurrent effort in developing automatic evaluation methods that go beyondsurface n-gram matching. Topics to be covered:(1) Summary of DUCs 2001 - 2004(2) Automatic Evaluations in Summarization and MT(3) Basic Elements - New Efforts in Summarization at ISI

DTEND;TZID=America/Los_Angeles:20041119T163000
DTSTART;TZID=America/Los_Angeles:20041119T150000
LOCATION:11 Large
SUMMARY:After TIDES, What's Left? - Finding Basic Elements
UID:20041119T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We introduce two probabilistic models that can be used to identifyelementary discourse units and build sentence-level discourse parsetrees. The models use syntactic and lexical features. A discourse parsingalgorithm that implements these models derives discourse parse trees withan error reduction of 18.8\% over a state-of-the-art decision-baseddiscourse parser. A set of empirical evaluations shows that our discourseparsing model is sophisticated enough to yield discourse trees at anaccuracy level that matches near-human levels of performance.

DTEND;TZID=America/Los_Angeles:20030228T160000
DTSTART;TZID=America/Los_Angeles:20030228T150000
LOCATION:11 Large
SUMMARY:Sentence Level Discourse Parsing using Syntactic and Lexical Information
UID:20030228T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Using Bilingual Chinese-English Word Alignments to Resolve PP-Attachment Ambiguity in English (practice talk for AMTA)Errors in English parse trees impact the quality of syntax-based MT systems trained using those parses. Frequent sources of error for English parsers include PP-attachment ambiguity, NP-bracketing ambiguity, and coordination ambiguity. Not all ambiguities are preserved across languages. We examine a common type of ambiguity in English that is not preserved in Chinese: given a sequenc "VP NP PP", should the PP be attached to the main verb, or to the object noun phrase? We present a discriminative method for exploiting bilingual Chinese-English word alignments to resolve this ambiguity in English. On a heldout test set of Chinese-English parallel sentences, our method achieves 86.3% accuracy on this PP-attachment disambiguation task, an improvement of 4% over the accuracy of the baseline Collins parser (82.3%).Online Large-Margin Training of Syntactic and Structural Translation Features (practice talk for EMNLP)Minimum-error-rate training (MERT) is a bottleneck for current development in statistical machine translation because it is limited in the number of weights it can reliably optimize. Building on the work of Watanabe et al., we explore the use of the MIRA algorithm of Crammer et al. as an alternative to MERT. We first show that by parallel processing and exploiting more of the parse forest, we can obtain results using MIRA that match or surpass MERT in terms of both translation quality and computational cost. We then test the method on two classes of features that address deficiencies in the Hiero hierarchical phrase based model: first, we simultaneously train a large number of Marton and ResnikÂs soft syntactic constraints, and, second, we introduce a novel structural distortion model. In both cases we obtain significant improvements in translation performance. Optimizing them in combination, for a total of 56 feature weights, we improve performance by 2.6 Bleu on a subset of the NIST 2006 Arabic-English evaluation data.(Joint work with Yuval Marton and Philip Resnik)

DTEND;TZID=America/Los_Angeles:20081014T161500
DTSTART;TZID=America/Los_Angeles:20081014T150000
LOCATION:11 Large
SUMMARY:Practice talks for AMTA/EMNLP
UID:20081014T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Real-world networks are often organized as modules or communities of similar nodes that serve as functional units. These networks are also rich in content, with nodes having distinguishing features or attributes. In order to discover a network's modular structure, it is necessary to take into account not only its links but also node attributes.We describe an information-theoretic method that identifies modules by compressing descriptions of information flow on a network. Our formulation introduces node content into the description of information flow, which we then minimize to discover groups of nodes with similar attributes that also tend to trap the flow of information.The method has several advantages: it is conceptually simple and does not require ad-hoc parameters to specify the number of modules or to control the relative contribution of links and node attributes to network structure.We apply the proposed method to partition real-world networks with known community structure. We demonstrate that adding node attributes helps recover the underlying community structure in content-rich networks more effectively than using links alone. In addition, we show that our method is faster and more accurate than alternative state-of-the-art algorithms.Linhong Zhu is currently a Postdoctoral  Research Associate at Information Sciences Institute, University of Southern California, under the supervision of Dr. Kristina Lerman and Dr. Aram Galstyan.  Before that, she worked as a scientist-I at Institute for Infocomm Research Singapore from Oct 2010 to Jan 2013. She got her B Eng. Degree in Computer Science from University of Science and Technology of China in 2006 (2002-2006) and received her Ph.D. Degree in Computer Engineering from Nanyang Technological University (2006-2011). Her research interests focus on large-scale social network analysis and sentiment analysis.

DTEND;TZID=America/Los_Angeles:20140425T160000
DTSTART;TZID=America/Los_Angeles:20140425T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Partitioning Networks with Node Attributes by Compressing Information Flow
UID:20140425T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: These are two practice talks for our upcoming thesis defenses.  The titlesand abstracts are:--------------------------------------------------------------------------NATURAL LANGUAGE GENERATION FOR TEXT-TO-TEXT APPLICATIONS USING AN INFORMATION-SLIM REPRESENTATIONRadu SoricutIn this talk, I describe a new natural language generation paradigm, basedon direct transformation of textual information into well-formed textualoutput.  I support this language generation paradigm with theoreticalcontributions in the field of formal languages, new algorithms, empiricalresults, and software implementations. At the core of this work is a novelrepresentation formalism for probability distributions over finitelanguages. Due to its convenient representation and computationalproperties, this formalism supports a wide range of language generationneeds, from sentence realization to text planning.Based on this formalism, I describe, implement, and analyze theoreticallya family of algorithms that perform language generation using directtransformations of text. These algorithms use stochastic models oflanguage to drive the generation process. I perform extensive empiricalevaluations using my implementation of these algorithms. These evaluationsshow state-of-the-art performance in automatic translation, andsignificant improvements in state-of-the-art performance in abstractiveheadline generation and coherent discourse generation.--------------------------------------------------------------------------PRACTICAL STRUCTURED LEARNING FOR NATURAL LANGUAGE PROCESSINGHal Daume IIINatural language processing is replete with problems whose outputs arehighly complex and structured.  The current state-of-the-art in machinelearning is not yet sufficiently general to be applied to general problemsin NLP.  In this thesis, I present Searn (for "search" + "learn"), anapproach to learning for structured outputs that is applicable to the widevariety of problems encountered in natural language.  Searn operates bytransforming structured prediction problems into a collection ofclassification problems, to which any standard binary classifier may beapplied.  From a theoretical perspective, Searn satisfies a strongfundamental performance guarantee: given a good classification algorithm,Searn yields a good structured prediction algorithm.  To demonstrateSearn's general applicability, I present applications in such diverseareas as automatic document summarization and entity detection andtracking.  In these applications, Searn is empirically shown to achievestate-of-the-art performance.

DTEND;TZID=America/Los_Angeles:20060526T170000
DTSTART;TZID=America/Los_Angeles:20060526T150000
LOCATION:11 Large
SUMMARY:Defense Practice Talks: Generation and Learning
UID:20060526T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The Minimum Description Length (MDL) principle is a method for modelselection that trades off between the explanation of the data by the modeland the complexity of the model itself. Inspired by the MDL principle, wedevelop an objective function for generative models that captures thedescription of the data by the model (log-likelihood) and the description ofthe model (model size). We also develop a efficient general search algorithmbased on the MAP-EM framework to optimize this function. Since recent workhas shown that minimizing the model size in a Hidden Markov Model forpart-of-speech (POS) tagging leads to higher accuracies, we test ourapproach by applying it to this problem. The search algorithm involves asimple change to EM and achieves high POS tagging accuracies on both Englishand Italian data sets.

DTEND;TZID=America/Los_Angeles:20100702T153000
DTSTART;TZID=America/Los_Angeles:20100702T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:An MDL-Inspired Objective Function for Unsupervised Training of Generative Models (ACL 2010 Practice Talk)
UID:20100702T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract: We enrich a curated resource of commonsense knowledge by formulating the problem as one of knowledge base completion (KBC). Most work in KBC focuses on knowledge bases like Freebase that relate entities drawn from a fixed set. However, the tuples in ConceptNet (Speer and Havasi, 2012) define relations between an unbounded set of phrases. We develop neural network models for scoring tuples on arbitrary phrases and evaluate them by their ability to distinguish true held-out tuples from false ones. We find strong performance from a bilinear model using a simple additive architecture to model phrases. We manually evaluate our trained modelâs ability to assign quality scores to novel tuples, finding that it can propose tuples at the same quality level as medium- confidence tuples from ConceptNet.Bio: Xiang Li is a 2016 summer intern under the supervision of Prof Kevin Knight and Prof Daniel Marcu. She is also going to be a PhD student at the University of Massachusetts Amherst in Andrew McCallumâs research group in this coming Fall. She got her B.S at the East China Normal University, Shanghai, China and got her M.S at the University of Chicago. Her research interest mainly focused on natural language processing and machine learning. This work is done when she was in Chicago working with Prof Kevin Gimpel at TTIC(Toyota Technological Institute at Chicago)

DTEND;TZID=America/Los_Angeles:20160715T160000
DTSTART;TZID=America/Los_Angeles:20160715T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Title: Commonsense Knowledge Base Completion
UID:20160715T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Bagging [Breiman, 96] and its variants is one of the most popular methods in aggregating classifiers and regressors. Its original analysis assumes that the bootstraps are built from an unlimited, independent source of samples. In the real world this analysis fails because there is a limited number of training samples.We analyze the effect of intersections between bootstraps to train different base predictors, which shows that the real-world bagging behaves very differently than its ideal analog [Breiman, 96]. Most importantly, we provide an alternative subsampling method called design-bagging based on a new construction of combinatorial designs. We prove that this is universally better than bagging.  Our analytical results are backed up by experiments on general classification and regression settings, and significantly improved all machine translation systems we used in the NIST-15 C-E competition.Bio:  Jia Xu is an associate professor at ICT/CAS, after being an assistant professor in Tsinghua University and a senior researcher at DFKI lecturing at Saarland University in Germany. She worked at IBM Watson and MSR Redmond during her Ph.D. advised by Hermann Ney at RWTH-Aachen University. Her current research interests are in Machine Learning with a focus towards highly competitive machine translation systems, where she led and participated in teams winning first place in WMT-11, TC-Star -05-07 and NIST-08. In NIST-15 she led one more team that won 4th place, which is the 1st among academic institutions.

DTEND;TZID=America/Los_Angeles:20151120T160000
DTSTART;TZID=America/Los_Angeles:20151120T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Better Bootstraps, Better Translation.
UID:20151120T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: K-Best A* Parsing (Adam Pauls)A* parsing makes 1-best search efficient bysuppressing unlikely 1-best items. Existing k-best extraction methods can efficiently searchfor top derivations, but only after an exhaus-tive 1-best pass. We present a unified algo-rithm for k-best A* parsing which preservesthe efficiency of k-best extraction while giv-ing the speed-ups of A* methods. Our algo-rithm produces optimal k-best parses under thesame conditions required for optimality in a1-best A* parser. Empirically, optimal k-bestlists can be extracted significantly faster thanwith other approaches, over a range of gram-mar types.------------------------------------------Improved Word Alignment with Statistics and Linguistic Heuristics (Ulf Hermjakob)We present a method to align words in a bitext that combines elementsof a traditional statistical approach with linguistic knowledge.We demonstrate this approach for Arabic-English, using an alignmentlexicon produced by a statistical word aligner, as well as linguisticresources ranging from an English parser to heuristic alignment rulesfor function words. These linguistic heuristics have been generalizedfrom a development corpus of 100 parallel sentences.Our aligner, UALIGN, outperforms both the commonly used GIZA++ alignerand the state-of-the-art LEAF aligner on F-measure and producessuperior scores in end-to-end statistical machine translation,+1.3 BLEU points over GIZA++, and +0.7 over LEAF.

DTEND;TZID=America/Los_Angeles:20090724T161500
DTSTART;TZID=America/Los_Angeles:20090724T150000
LOCATION:11 Large
SUMMARY:Practice talks for EMNLP
UID:20090724T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Many semantic parsing models use tree transformations to map between natural language and meaning representation. However, while tree transformations are central to several state-of-the-art approaches, little use has been made of the literature on tree automata, which could both clarify the relationships between different approaches and increase the generality of new contributions. We attempt to clarify the relationship by presenting a tree transducer model that is closely related to previous work made without appealing to automata theory. We then describe a variational Bayesian inference algorithm that is applicable to a wide class of tree transducers, producing state-of-the-art semantic parsing results when coupled with our model while remaining applicable to any domain employing probabilistic tree transducers (not just semantic parsing).This is joint work with Mark Johnson and Sharon Goldwater to be presented at this yearâs ACLBio:I research computational models of language acquisition, exploring questions of how linguistic structure and meaning might interact during learning. For instance, I have worked on Bayesian models of unsupervised word segmentation, exploring how simultaneous word meaning acquisition influences the identification of lexical boundaries. Currently, I work on semantic parsing, using a combination of Bayesian techniques and automata theory to model more complex structural relationships between compositional meaning and syntactic structure. My PhD began at the department of Cognitive, Linguistic and Psychological Sciences at Brown University but has since moved to the School of Informatics at the University of Edinburgh and the Computing Department of Macquarie University.

DTEND;TZID=America/Los_Angeles:20120629T160000
DTSTART;TZID=America/Los_Angeles:20120629T150000
LOCATION:4th Floor Conference Room
SUMMARY:Semantic Parsing with Bayesian Tree Transducers
UID:20120629T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Statistical natural language processing can be difficult formorphologically rich languages. The observed vocabularies of suchlanguages are very large, since each word may have been inflected formorphological properties like person, number, gender, tense, orothers. This unfortunately masks important generalizations, leads toproblems with data sparseness and makes it hard to generate correctlyinflected text.The presented dissertation work tackles the problem of inflectional morphology with a novel, unified statistical approach.  We present a generative probabilitymodel that can be used to learn from plain text how the words of alanguage are inflected, given some minimal supervision. In otherwords, we discover the inflectional paradigms that are implicit, orhidden, in a large unannotated text corpus.This model consists of several components: a hierarchical Dirichletprocess clusters word tokens of the corpus into lexemes and theirinflections, and graphical models over strings -- a novelgraphical-model variant -- model the interactions of multiplemorphologically related type spellings, using weighted finite-statetransducers as potential functions.We present the components of this model, from (1) weightedfinite-state transducers parameterized as log-linear models, to (2)graphical models over multiple strings, to (3) the final Bayesiannon-parametric model over a corpus, its lexemes, inflections, andparadigms. These three components of the model correspond to thecombined use of (1) dynamic programming, (2) belief propagation and(3) MCMC for inference.We show experimental results for several tasks along the way,including a lemmatization task in multiple languages and, todemonstrate that parts of our model are applicable outside ofmorphology as well, a transliteration task. Finally, we show thatlearning from large unannotated text corpora under our non-parametricmodel significantly improves the quality of predicted wordinflections.

DTEND;TZID=America/Los_Angeles:20110128T160000
DTSTART;TZID=America/Los_Angeles:20110128T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:A Non-Parametric Model for the Discovery of Inflectional Paradigms from Plain Text using Graphical Models over Strings
UID:20110128T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Multi-agent path-finding (MAPF) is important for applications such as the kind of warehousing done by Kiva systems. Solving the problem optimally is NP-hard, yet finding low-cost solutions is important. Bounded-suboptimal MAPF algorithms, such as enhanced conflict-based search (ECBS), often do not perform well in warehousing domains with many agents. We therefore develop bounded-suboptimal MAPF algorithms, called CBS+HWY and ECBS+HWY, that exploit the problem structure of a given MAPF instance by finding paths for the agents that include edges from user-provided highways, which encourages a global behavior of the agents that avoids collisions. On the theoretical side, we develop a simple approach that uses highways for MAPF and provides suboptimality guarantees. On the experimental side, we demonstrate that ECBS+HWY can decrease the runtimes and solution costs of ECBS in Kiva-like domains with many agents if the highways capture the problem structures well.Bio: Liron received a B.S. in Computer Engineering in 2007 and an M.S. in Computer Science in 2012, both from the Hebrew University of Jerusalem. Liron is interested in combinatorial problems related to constraint-based reasoning and symbolic planning. Specifically, he is looking at novel algorithmic techniques for exploiting structure in such combinatorial problems.

DTEND;TZID=America/Los_Angeles:20151009T160000
DTSTART;TZID=America/Los_Angeles:20151009T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Using Highways for Bounded-Suboptimal Multi-Agent Path Finding
UID:20151009T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The future of virtual assistants, self-driving cars, and smart homes require intelligent agents that work intimately with users. Instead of passively following orders given by users, an interactive agent must actively collaborate with people through communication, coordination, and user-adaptation. In this talk, I will present our recent work towards building agents that interact with humans. First, we propose a symmetric collaborative dialogue setting in which two agents, each with some private knowledge, must communicate in natural language to achieve a common goal. We present a human-human dialogue dataset that poses new challenges to existing models, and propose a neural model with dynamic knowledge graph embedding. Second, we study the user-adaptation problem in quizbowl - a competitive, incremental question-answering game. We show that explicitly modeling of different human behavior leads to more effective policies that exploits sub-optimal players. I will conclude by discussing opportunities and open questions in learning interactive agents.He He is a post-doc at Stanford University, working with Percy Liang. Prior to Stanford, she earned her Ph.D. in Computer Science at the University of Maryland, College Park, advised by Hal DaumÃ© III and Jordan Boyd-Graber. Her interests are at the interface of machine learning and natural language processing. She develops algorithms that acquire information dynamically and do inference incrementally, with an emphasis on problems in natural language processing. She has worked on dependency parsing, simultaneous machine translation, question answering, and more recently dialogue systems.

DTEND;TZID=America/Los_Angeles:20170310T160000
DTSTART;TZID=America/Los_Angeles:20170310T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Learning agents that interact with humans
UID:20170310T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Storytelling is an integral part of human interaction and critical to nearly all forms of entertainment. Since the introduction of TALE-SPIN over thirty years ago, automating the process of storytelling has been an active area of research. However, despite the incredible advances in other areas of computer science, such as 3D graphics and computational physics, that have enabled dazzling immersive interactive environments, there has been little progress in delivering automated *stories* that have the richness and complexity we expect in this genre of discourse.In this talk I will primarily discuss work done during my thesis that leverages the vast amounts of knowledge hidden implicitly in the social web in order to enable a text-based open-domain interactive storytelling system. In this system the human and computer take turns writing sentences of an emerging fictional story on any topic the author chooses. The system uses an architecture inspired by case-based reasoning with a knowledge base of over a million personal stories about the daily lives and experiences of ordinary people. At each turn the system selects a sentence from the corpus that tries to maximize the semantic and discourse coherence given the text of the story so far.I will also describe how crowd-sourcing communities were used to collect thousands of collaborative stories with the system and tens of thousands of ratings from hundreds of participants on several subjective evaluation criteria. The best models show significant improvements over the baseline and are judged to be indistinguishable from entirely human written weblog stories from a held out part of the collection.I will conclude with some more recent and ongoing research that examines additional methods of evaluation and new models of narrative generation based on Recurrent Neural Networks.Bio. Reid Swanson received his PhD in Computer Science from the University of Southern California in 2010 where he focused on a large-scale text-based interactive storytelling system. His primary research interest is in large-scale open-domain interpretation and generation of interactive narratives.After graduating he spent a year at the Walt Disney Imagineering Research & Development lab in Glendale, CA. At Disney he worked with an interdisciplinary team of industry engineers, academics, artists and performers to develop technologies for bringing persistent interactive storytelling to select groups of guests at their theme parks and resorts.From 2011 until 2015, Reid worked as a postdoc at UC Santa Cruz where he participated in a range of different projects. As part of the SIREN project, with Arnav Jhala, he investigated games for teaching conflict resolution management. On the SSIM project, with Michael Mateas, he helped research and develop virtual training environments targeting the military and law enforcement agencies to help prevent conflict escalation in unknown social environments. With Marilyn Walker, he also investigated automated methods for analyzing and mining prototypical arguments on internet debate forums about controversial topics such as gun control, gay marriage and evolution.In August of 2015 he rejoined the Institute of Technologies as a Research Scientist where he is researching the role of narrative structure in the persuasiveness of an intended message embedded in the story across different cultures.

DTEND;TZID=America/Los_Angeles:20160129T160000
DTSTART;TZID=America/Los_Angeles:20160129T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Leveraging the Social Web to Enable Open-Domain Interactive Storytelling
UID:20160129T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Machine translation can potentially benefit from the guidance of alanguage model that evaluates translation candidates based on syntacticstructures. In this talk we are going to describe the summer project tobuild such an incremental structured language model that can be used inmachine translation systems that generate the target language in aleft-to-right manner. We will describe in detail our work in modelling,search, and parameter smoothing.

DTEND;TZID=America/Los_Angeles:20110817T143000
DTSTART;TZID=America/Los_Angeles:20110817T140000
LOCATION:4th Floor Large Conference Room [460]
SUMMARY:Structured Language Modelling for Machine Translation
UID:20110817T140000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: TBA

DTEND;TZID=America/Los_Angeles:20040716T163000
DTSTART;TZID=America/Los_Angeles:20040716T150000
LOCATION:11 Large
SUMMARY:Practice Talks for ACL (+workshops)
UID:20040716T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Topic models, a class of Bayesian probabilistic models for discretedata, have recently gained popularity in applications ranging fromdocument modeling to computer vision.  Since the introduction ofLatent Dirichlet Allocation (LDA) in 2003, there have been numerousextensions to this archetype.  I will review the theory behind LDA,and discuss subsequent models, including (some of): Correlated TopicModel, Dynamic Topic Model, Hierarchical Topic Model, Special WordsTopic Model, Hierarchical Dirichlet Process Model, Pachinko AllocationMachine, Topics and Syntax Model, Bi-LDA, Author-Topic Model,Supervised Topic Model, Spatial LDA, etc.

DTEND;TZID=America/Los_Angeles:20080516T160000
DTSTART;TZID=America/Los_Angeles:20080516T150000
LOCATION:11 Large
SUMMARY:Theory and Applications of Topic Modeling
UID:20080516T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In the first part of this talk, I will discuss our work on deciphering running-key ciphers, which are produced by encrypting the plaintext with a natural language string of the same length as the plaintext (the 'running key'). These ciphers are harder to crack than simple substitution ciphers, and no previous work has succeeded in decoding them.The second part of the talk will address the problem of speech recognition without access to word pronunciations or annotated training data. The problem's motivations arise from languages and domains where pronunciation lexicons and transcribed speech are not available. Given a representation of the speech as a sequence of phonemes, and a language model from non-parallel text, we present methods to find the sequence of words correspoding to the speech input.

DTEND;TZID=America/Los_Angeles:20110824T150000
DTSTART;TZID=America/Los_Angeles:20110824T143000
LOCATION:4th Floor Large Conference Room [460]
SUMMARY:Cracking Running-Key Ciphers and Deciphering Speech (Interns Final Talk)
UID:20110824T143000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I would like to talk about some of the things I did during the lastyear. I will discuss and demonstrate CuSTaRD, a cross-lingualinformation retrieval, organization, summarization, and visualizationsystem that was built for the Surprise Language exercise. I will focusin more details on iNeATS, the interactive multi-document summarizationpart of CuSTaRD. The other project I plan to present is eArchivarius, asystem for accessing collections of electronic mail.

DTEND;TZID=America/Los_Angeles:20031003T160000
DTSTART;TZID=America/Los_Angeles:20031003T150000
LOCATION:11 Large
SUMMARY:A Year in Paradise
UID:20031003T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Discriminating vandalism edits from non-vandalism edits in Wikipedia is a challenging task, as ill-intentioned edits can include a variety of content and be expressed in many different forms and styles. Previous studies are limited to rule-based methods and learning based on lexical features, lacking in deep linguistic analysis. In this talk, I will discuss a novel Web-based syntactic-semantic modeling method, which utilizes Web search results as resource and trains topic-specific n-tag and syntactic n-gram language models to detect vandalism. By combining basic task-specific and lexical features, we have achieved high F-measures using logistic boosting and logistic model trees classifiers, surpassing the results reported by major Wikipedia vandalism detection systems. This is a joint work with Prof. Kathleen McKeown at Columbia University and will appear in the oral session at COLING 2010.Bio:William Yang Wang is a graduate student at Columbia University, and heis currently visiting the NL Dialog Group at USC/ICT, working on phoneticallyaware natural language understanding and speech synthesis. In 2008-2009, he waswith the Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences.

DTEND;TZID=America/Los_Angeles:20100730T160000
DTSTART;TZID=America/Los_Angeles:20100730T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Automatic Vandalism Detection in Wikipedia (COLING 2010 Practice Talk)
UID:20100730T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: CCG is able to derive typed dependency structures, providing a usefulapproximation to the underlying predicate-argument relations of âwhodid what to whomâ and dependency structures form an integral part ofCCG. In this talk, I will first cover some essential background onCCG, its dependency structures and CCG parsing; I will then discuss arecent dependency model we developed for shift-reduce CCG parsing.  Achallenge arises in this model from the fact that the oracle needs tokeep track of exponentially many gold-standard derivations, which areall hidden. And we solve this by integrating a packed parse forestwith the beam-search decoder and introduce a novel technique forquerying an exponentially-sized oracle on-the-fly during beam-searchdecoding.Bio. Wenduan Xu is a graduate student in Cambridge advised by Stephen Clark,working on CCG parsing.

DTEND;TZID=America/Los_Angeles:20150717T160000
DTSTART;TZID=America/Los_Angeles:20150717T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Shift-Reduce CCG Parsing with a Dependency Model
UID:20150717T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Hierarchical and recursive structure is commonly found in differentmodalities, including natural language sentences and scene images.  Iwill present some of our recent work on three recursive neural networkarchitectures that learn meaning representations for such hierarchicalstructure. These models obtain state-of-the-art performance on severallanguage and vision tasks.The meaning of phrases and sentences is determined by the meanings ofits words and the rules of compositionality. We introduce a recursiveneural network (RNN) for syntactic parsing which can learn vectorrepresentations that capture both syntactic and semantic informationof phrases and sentences. For instance, the phrases "declined tocomment" and "would not disclose" have similar representations.Since our RNN does not depend on specific assumptions for language, itcan also be used to find hierarchical structure in complex sceneimages. This algorithm obtains state-of-the-art performance forsemantic scene segmentation on the Stanford Background and the MSRCdatasets and outperforms Gist descriptors for scene classification by4%.The ability to identify sentiments about personal experiences,products, movies etc. is crucial to understand user generated contentin social networks, blogs or product reviews. The second architectureI will talk about is based on semi-supervised recursive autoencoders (RAE).RAEs learn vector representations for phrases sufficiently well as tooutperform other traditional supervised sentiment classification methodson several standard datasets.Lastly, I describe an alternative unsupervised RAE model that can learnfeatures which outperform previous approaches for paraphrasedetection on the Microsoft Research Paraphrase corpus.This talk presents joint work with Andrew Ng and Chris Manning.Bio:Richard Socher is a Computer Science PhD student at Stanford,co-advised by Chris Manning and Andrew Ng.Most recently, he won the Yahoo! Key Scientific Challenges ProgramAward and the Distinguished Application Paper Award at ICML, 2011for his work on recursive deep learning.

DTEND;TZID=America/Los_Angeles:20110909T160000
DTSTART;TZID=America/Los_Angeles:20110909T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Recursive Deep Learning in Natural Language Processing and Computer Vision
UID:20110909T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.Bio: Ashish Vaswani is Research Scientist at Google Brain, where he works with fun people on non-sequential generative models that seem to translate well and generate reasonable images of cars and faces. He's also interested in non-autoregressive models for generating structured outputs. Before Brain, he spent many wonderful years at ISI, first as a PhD student, working on fast training of neural language models and MDL inspired training of latent-variable models with David Chiang and Liang Huang, and later as a scientist. He misses his colleagues in LA but he prefers the weather in San Francisco.Bio: Jakob Uszkoreit is currently a member of the Google Brain research team. There, he works on neural networks generating text, images and other modalities in tasks such as machine translation or image super-resolution. Before joining Brain, Jakob led teams in Google Research and Search, developing neural network models of language that learn from weak supervision at very large scale and designing the semantic parser of the Google Assistant. Prior to that, he worked on Google Translate in its early years. Jakob received his MSc in Computer Science and Mathematics from the Technical University of Berlin in 2008.Bio: Niki Parmar is currently a Research Engineer in Google Brain, where she works on generative modeling for tasks across different modalities like Machine Translation, conditional Image generation and super-resolution. Previous to Brain, she worked within Google Research to experiment and evaluate models for Query Similarity and Question Answering used within Search. Niki recieved her Masters in Computer Science from USC before joining Google.

DTEND;TZID=America/Los_Angeles:20180119T160000
DTSTART;TZID=America/Los_Angeles:20180119T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Attention Is All You Need
UID:20180119T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Current statistical machine translation systems usuallyextract rules from bilingualcorpora annotated with 1-best alignments. They are prone to learnnoisy rules due to alignment mistakes. We propose a new structurecalled weighted alignment matrixto encode all possible alignments for a parallel text compactly. Thekey idea is to assign a probability to each word pair to indicate howwell they are aligned. We design new algorithms for extracting phrasepairs from weighted alignment matrices and estimating theirprobabilities. Our experiments on multiple language pairs show thatusing weighted matrices achieves consistent improvements over usingn-best lists in significant less extraction time.About the speaker:Yang Liu is an Assistant Researcher at Institute of ComputingTechnology (ICT), Chinese Academy of Sciences. He received his PhDdegree in Computer Science from ICT in 2007. His major researchinterests include statistical machine translation and Chineseinformation processing. He has been working on syntax-based modeling,word alignment, and system combination. His paper on tree-to-stringtranslation won the Meritorious Asian NLP Paper Award of COLING/ACL2006. He served as Reviewers for TALIP, TSLP, JNLE, ACL, EMNLP, AMTA, and SSST.

DTEND;TZID=America/Los_Angeles:20090716T113000
DTSTART;TZID=America/Los_Angeles:20090716T103000
LOCATION:11 Large
SUMMARY:Weighted Alignment Matrices for Statistical Machine Translation
UID:20090716T103000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract:People from different cultures and backgrounds tend to make different decisions faced with the same set of choices. Cultural background influences people's decisions in social interactions. Computational agents that are intended to simulate human behavior or engage in interpersonal interactions such as negotiation with humans need decision making models that are sensitive to culture. In this talk, we show how agents can learn to behave like people from specific cultures in the context of a negotiation game.Bio: Elnaz Nouri is a PhD student in the Natural Language group at USC's Institute for creative Technologies (ICT).

DTEND;TZID=America/Los_Angeles:20140606T160000
DTSTART;TZID=America/Los_Angeles:20140606T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Cultural Negotiating Agents
UID:20140606T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Automatic speech recognition systems exist only for a small fraction of the more than 7,100 languages in the world since the development of such systems is usually expensive and time-consuming. Therefore, porting speech technology rapidly to new languages with little effort and cost is an important part of research and development.Pronunciation dictionaries are a central component for both automatic speech recognition and speech synthesis. They provide the mapping from the orthographic form of a word to its pronunciation, typically expressed as a sequence of phonemes.I will present innovative strategies and methods for the rapid generation of pronunciation dictionaries for new application domains and languages. Depending on various conditions, solutions are developed and proposed â starting from the simple scenario in which the target language can be found in written form on the Internet and we have a simple mapping between speech and written language â up to the difficult scenario in which no written form for the target language exists. We embedded many of the tools implemented in this work in the Rapid Language Adaptation Toolkit. Its web interface is publicly accessible and allows people to build first speech recognition systems with little technical background.Bio: Since 2008 Tim Schlippe is a research assistent and PhD student at Karlsruhe Institute of Technology (KIT), Institute for Anthropomatics, in Germany.At KIT he is involved in teaching and several projects. He has published multiple publications in the field of multilingual speech recognition.For his master's thesis he was as a visiting researcher at Carnegie Mellon University, doing research in the field of statistical machine translation.Tim Schlippe will finish his PhD in November 2014. His current research interests are:Multilingual speech recognition with a focus on rapid adaptation of speech recognition systems to new domains and languages, pronunciation modeling, and language modeling.

DTEND;TZID=America/Los_Angeles:20140808T160000
DTSTART;TZID=America/Los_Angeles:20140808T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Rapid Generation of Pronunciation Dictionaries for New Domains and Languages
UID:20140808T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The aim of this presentation is to describe a program of research in the area of automatic knowledge acquisition which has been submitted in response to the European Information and Communication Technologies FP7 Call 5, Objective 4.3: Intelligent Information Management. The objective of this research program is to develop data-driven techniques and tools for extracting common sense knowledge from unstructured text and applying it for making the approximate inferences needed in order to interpret the ambiguities of human language communication.The central activities include developing techniques and tools for:- converting texts into representations of the particular events and entities they refer to,- identifying relations between these entity and event instances such as shared participants, temporal and spatial juxtapositions, causal connections, entailments, and so on, thereby constructing representations of complex scenarios,- inducing from sets of like entity, event and scenario instances, representations of entity, event and scenario types,- using these entity, event and scenario types as background knowledge to support approximate inferencing (e.g., statistical inference rules such as poisoning probably entails death) within important interactive tasks such as Information Retrieval and web search.The technologies developed will be validated by applying them to two broad NLU tasks: faceted search for Information Retrieval in the domain of health information and open-domain web search for web browsing and UI improvements.

DTEND;TZID=America/Los_Angeles:20100205T160000
DTSTART;TZID=America/Los_Angeles:20100205T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Knowledge Acquisition and Textual Entailment: a proposed research program
UID:20100205T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Is it possible to learn useful translations from large amounts ofmonolingual data to improve machine translation? The intuitivefeeling is that learning a language without bilingual data is atleast "more difficult than learning from example translations". Inthis talk, I will present recent results on decipherment: I will showthat the decipherment problem is indeed difficult (NP-hard) and whatapproximations to the original problem can be made without hurtingdecipherment accuracy much.Bio:Having studied Physics and Computer Science at RWTH Aachen University, I'm currently a PhD student at Prof. Ney's Human Language Technology and Pattern Recognition Group in Aachen. I'm particularly interested in applying decipherment techniques to improve machine translation.

DTEND;TZID=America/Los_Angeles:20130607T160000
DTSTART;TZID=America/Los_Angeles:20130607T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Is Decipherment Difficult?
UID:20130607T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Neural Machine Translation (NMT) is a simple new architecture for getting machines to learn to translate. At its core, NMT is a single big recurrent neural network that is trained end-to-end with several advantages such as simplicity and generalization. Despite being relatively new, NMT has already been showing promising results in various translation tasks. In this talk, I will give an overview of NMT and highlight my recent work on (a) how to address the rare word problem in NMT, (b) how to improve the attention (alignment) mechanism, and (c) how to leverage data from other modalities to improve translation.Bio. Thang Luong is currently a 5th-year PhD student in the Stanford NLP group under Prof. Chris Manning. In the past, he has published papers on various different NLP-related areas such as digital library, machine translation, speech recognition, parsing, psycholinguistics, and word embedding learning. Recently, his main interest shifts towards the area of deep learning using sequence to sequence models to tackle various NLP problems, especially neural machine translation. He has built state-of-the-art (academically) neural machine translation systems both at Google and at Stanford.

DTEND;TZID=America/Los_Angeles:20160212T160000
DTSTART;TZID=America/Los_Angeles:20160212T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Recent Advances in Neural Machine Translation
UID:20160212T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Although a considerable number of generic Natural Language Generation(NLG) systems has been produced over the years, none of them is usuallyemployed in end-to-end, text-to-text applications such as MachineTranslation, Summarization, Question Answering, etc. In this talk, weidentify the likely reasons for this state of affairs, and proposeWIDL-expressions as a flexible formalism that facilitates the integrationof a generic NLG engine within end-to-end language processingapplications.WIDL-expressions represent compactly probability distributions over finitesets of candidate realizations, and have optimal algorithms for textrealization via interpolation with language model probabilitydistributions. We show the effectiveness of our WIDL-based NLG engine forboth sentence realization and document realization tasks. By employinglanguage models that capture sentence-level properties, we perform MachineTranslation and Headline Generation at state-of-the-art levels or better.By employing language models that capture document-level properties suchas text coherence, we synthesize output for Multi-document Summarizationthat displays both high content selection performance and increasedcoherence.

DTEND;TZID=America/Los_Angeles:20060414T163000
DTSTART;TZID=America/Los_Angeles:20060414T150000
LOCATION:11 Large
SUMMARY:Natural Language Generation for Text-to-Text Applications using an Information-Slim Representation
UID:20060414T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Automatic human-computer conversational systems have attracted great attention from both industry and academia. Intelligent products such as XiaoIce (by Microsoft) have been released, while tons of Artificial Intelligence companies have been established. We see that the technology behind the conversational systems is accumulating and now open to the public gradually. With the investigation of researchers, conversational systems are more than scientific fictions: they become real. I would review the recent development of human-computer conversational systems, especially the significant changes brought by deep learning techniques. In the meanwhile, I would share some work conducted by our group.Bio: Dr. Rui Yan is an assistant professor at Peking University, an adjunct professor in Central China Normal University and Central University of Finance and Economics, and he was a Senior Researcher at Baidu Inc. He has investigated several open-domain conversational systems and dialog systems in vertical domains. Till now he has published more than 50 highly competitive peer-reviewed papers. He serves as a (senior) program committee member of several top-tier venues (such as KDD, SIGIR, ACL, WWW, IJCAI, AAAI, CIKM, EMNLP).

DTEND;TZID=America/Los_Angeles:20180622T160000
DTSTART;TZID=America/Los_Angeles:20180622T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Recent Advances and Challenges on Human-Computer Conversational Systems
UID:20180622T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 

DTEND;TZID=America/Los_Angeles:20030725T160000
DTSTART;TZID=America/Los_Angeles:20030725T150000
LOCATION:11 Large
SUMMARY:Super-Carmel for Trees
UID:20030725T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Thanks to the availability of parallel data and advances in machinelearning techniques, we have seen tremendous improvement in the fieldof machine translation over the past 20 years. However, due to lack ofparallel data, the quality of machine translation is still far fromsatisfying for many language pairs and domains. In general, it iseasier to obtain non-parallel data, and much work has tried to learntranslations from non-parallel data. Nonetheless, improvements tomachine translation have been limited. In this work, I follow adecipherment approach to learn translations from non parallel data andachieve significant gains in machine translation.I apply slice sampling to Bayesian decipherment. Compared with thestate- of-the-art algorithm, the new approach is highly scalable andaccurate, making it possible to decipher billions of tokens withhundreds of thousands of word types at high accuracy for the firsttime. When it comes to deciphering foreign languages, I introducedependency relations to address the problems of word reordering,insertion, and deletion. Experiments show that dependency relationshelp improve Spanish/English deciphering accuracy by over 5-fold.Moreover, this accuracy is further doubled when word embeddings areused to incorporate more contextual information.Moreover, I decipher large amounts of monolingual data to improve thestate- of-the-art machine translation systems in the scenario ofdomain adaptation and low density languages. Through experiments, Ishow that decipherment find high quality translations forout-of-vocabulary words in the task of domain adaptation, and helpimprove word alignment when the amount of parallel data is limited. Iobserve up to 3.8 point and 1.9 point BlEU gain in Spanish/French andMalagasy/English machine translation experiments respectively.Bio. Qing is a PhD candidate at USC. His research interests focus onapplication of machine learning techniques to help computer betterunderstand human languages. He is working with Kevin Knight on variousproblems related to Machine Translation and Decipherment. Prior tothat, he has worked on computational phonology, including stressprediction and transliteration. He is interested in continuing hisresearch in industrial settings to solve exciting large scale problems.

DTEND;TZID=America/Los_Angeles:20150814T160000
DTSTART;TZID=America/Los_Angeles:20150814T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Beyond Parallel Data - A Decipherment Approach for Better Quality Machine Translation
UID:20150814T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Information retrieval using word senses is emerging as a good researchchallenge on semantic information retrieval. In this presentation, I amgoing to propose a new method using word senses in information retrieval:root sense tagging method. This method assigns coarse-grained word sensesdefined in WordNet to query terms and document terms by unsupervised wayusing co-occurrence information constructed automatically. The sensetagger is crude, but performs consistent disambiguation by consideringonly the single most informative word as evidence to disambiguate thetarget word. We also allow multiple-sense assignment to alleviate theproblem caused by incorrect disambiguation.Experimental results on a large-scale TREC collection show that theproposed approach to improve retrieval effectiveness is successful, whilemost of the previous work failed to improve performances even on smalltext collection. The proposed method also shows promising results when iscombined with pseudo relevance feedback and state-of-the-art retrievalfunction such as BM25.

DTEND;TZID=America/Los_Angeles:20040806T163000
DTSTART;TZID=America/Los_Angeles:20040806T150000
LOCATION:11 Large
SUMMARY:Information Retrieval using Word Senses: Root Sense Tagging Approach
UID:20040806T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Treebank parsing can be seen as the search for an optimally refinedgrammar consistent with a coarse training treebank. We describe amethod in which a minimal grammar is hierarchically refined using EMto give accurate, compact grammars. The resulting grammars areextremely compact compared to other high-performance parsers, yet theparser gives the best published accuracies on several languages, aswell as the best generative parsing numbers in English. In addition,we give an associated coarse-to-fine inference scheme which vastlyimproves inference time with no loss in test set accuracy.

DTEND;TZID=America/Los_Angeles:20071019T113000
DTSTART;TZID=America/Los_Angeles:20071019T103000
LOCATION:11 Large
SUMMARY:Learning and Inference for Hierarchically Split PCFGs
UID:20071019T103000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We introduce a generative probabilistic model, the noisychannel model, for unsupervised word sense disambiguation. In ourmodel, each context C is modeled as a distinct channel through whichthe speaker intends to transmit a particular meaning S using apossibly ambiguous word W. To reconstruct the intended meaning thehearer uses the distribution of possible meanings in the given contextP(S|C) and possible words that can express each meaning P(W|S). Weassume P(W|S) is independent of the context and estimate it usingWordNet sense frequencies. The main problem of unsupervised WSD isestimating context dependent P(S|C) without access to any sense taggedtext. We show one way to solve this problem using a statisticallanguage model based on large amounts of untagged text. Our model usescoarse-grained semantic classes for S internally and we explore theeffect of using different levels of granularity on WSD performance.The system outputs fine grained senses for evaluation and itsperformance on noun disambiguation is better than most previouslyreported unsupervised systems and close to the best supervisedsystems.Short Bio: Deniz Yuret is an assistant professor in Computer Engineering at Koc University in Istanbul. Previously he was at the MIT AI Lab and laterco-founded Inquira, Inc. His research is on lexical semantics andunsupervised approaches to parsing and disambiguation. Currently he isone of the organizers of the SemEval3 semantic evaluation exercise,co-chair for the ACL 2011 semantics area, and an editor for theComputational Linguistics Journal.

DTEND;TZID=America/Los_Angeles:20110707T160000
DTSTART;TZID=America/Los_Angeles:20110707T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:The Noisy Channel Model for Unsupervised Word Sense Disambiguation
UID:20110707T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Probabilistic context-free grammars can describe probability distributions over strings, i.e., the sum of probabilities of all generated strings is 1.This condition is often  called consistency. It has applications in fields of natural language processing such as probabilistic parsing (disambiguate by picking the parse with the highest score), or speech recognition (rank hypotheses returned by a speech recognizer).The talk is a survey of some of the previous results. We investigate how we can determine if a probabilistic context-free grammar is consistent, and if such a test can always be done. Also, we study a method, namely normalization, which guarantees consistent probabilistic context-free grammars. Moreover, we mention briefly some techniques that train probabilistic context-free grammars and guarantee consistency.

DTEND;TZID=America/Los_Angeles:20080822T153000
DTSTART;TZID=America/Los_Angeles:20080822T150000
LOCATION:11 Large
SUMMARY:Intern Final Talk:  On the Consistency of Probabilistic Context-Free Grammars
UID:20080822T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I will present work defining an Abstract Meaning Represention (AMR)(joint work with Kevin Knight et al.) that serves as an intermediatesemantic structure when translating between languages such as Chineseand English as well as automatic and manual annotation efforts tobuild corpora of AMRs.I will give a demo of our web-based AMR Editor, which is used by dozensof annotators at LDC, SDL/LanguageWeaver (Cluj) and other places.Finally, I will give an overview of our initial end-to-end prototype,with rule extraction (own work), decoding from source language to AMR(work by Yinggong Zhao) and AMR to target language generation (Yang Gao).

DTEND;TZID=America/Los_Angeles:20121214T160000
DTSTART;TZID=America/Los_Angeles:20121214T150000
LOCATION:11th Floor Conference Room [1135]
SUMMARY:Launching Semantics-Based Machine Translation
UID:20121214T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: English light verb constructions (LVCs), such as have a drink, make an offer, take a bath, do an investigation, and give a groan, represent a powerfully expressive type of Multi-Word Expression (MWE) in English; however, the precise definition, semantic function and productivity of English LVCs remain unclear, hampering efforts to treat LVCs appropriately in Natural Language Processing (NLP) resources.  This research focuses on exploring these three issues.  A definition for LVCs that combines syntactic and semantic criteria is developed, initially based on existing research on delimiting and defining LVCs, and iteratively refined during the development of an LVC annotation schema for the PropBank project (Palmer et al., 2005).  Existing theories on the linguistic function of LVCs both cross-linguistically and in English are discussed, and a corpus study of LVCs provides evidence that the primary function of LVCs in English is to enable speakers to describe events in a manner that can take advantage of rich nominal modification; for example, The inspector general did a rather controversial investigation...  Finally, a dominant hypothesis concerning the productivity of certain verbal constructions is investigated in relation to LVCs, using large-scale Mechanical Turk surveys.  This firstly probes the question of why âfamiliesâ of semantically similar LVCs occur (e.g. make a statement/speech/declaration/proposal), but other arguably similar LVCs are odd to speakers (e.g. ?make a yell/hint).  Secondly, this provides the groundwork for better detection of very low frequency LVCs that arise from a speakerâs ability to shift and extend verbal meanings within novel constructions.  The contributions of these findings on both NLP and linguistic theory are presented.Bio:Claire Bonial is in the final weeks of her academic adventure at the University of Colorado, Boulder, and will soon be graduating with a joint PhD in Linguistics and Cognitive Science.  Claire has had a rich academic and professional experience at CU, including collaborative research with Martha Palmer on a variety of Natural Language Processing (NLP) resources, such as PropBank and VerbNet.  Although Claire has served in many roles on these projects, her primary contribution has been theoretical research on English Light Verb Constructions (LVCs, e.g. take a walk, make a mistake).  Specifically, she has worked to improve the coverage and treatment of this type of multi-word expression.  Two years ago, Claire began working with Kevin Knight and ISI on the Abstract Meaning Representation (AMR) project, which has since afforded her several lovely trips to Marina del Rey, as well as the opportunity to apply her linguistic expertise in the development of this exciting and challenging project.Webcast Link:http://webcasterms1.isi.edu/mediasite/Viewer/?peid=c389f52cfb16424facb6386ff180de771d

DTEND;TZID=America/Los_Angeles:20140905T160000
DTSTART;TZID=America/Los_Angeles:20140905T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Take a look at this!  Form, Function and Productivity of English Light Verb Constructions
UID:20140905T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Automated essay scoring was initially motivated by its potential costsavings for large-scale writing assessments.  However, as automated essayscoring became more widely available and accepted, teachers and assessmentexperts realized that the potential of the technology could go way beyondjust essay scoring.  Over the past five years or so, there has been rapiddevelopment, and commercial deployment of automated essay evaluation forboth large-scale assessment and classroom instruction.  A number offactors contribute to an essay score, including varying sentencestructure, grammatical correctness, appropriate word choice, errors inspelling and punctuation, use of transitional words/phrases, andorganization and development. Instructional software capabilities existthat provide essay scores and evaluations of student essay writing in allof these domains.  The foundation of automated essay evaluation softwareis rooted in NLP research.  This talk will walk through the development ofCriterionSM, e-rater, and Critique writing analysis tools, automated essayevaluation software developed at Educational Testing Service - from NLPresearch through deployment as a business.(Preview of an HLT/NAACL-2004 Invited Speaker Presentation)Jill BursteinEducational Testing ServicePrinceton, NJ

DTEND;TZID=America/Los_Angeles:20040413T163000
DTSTART;TZID=America/Los_Angeles:20040413T150000
LOCATION:4 Large
SUMMARY:Automated Essay Evaluation: From NLP research through deployment as a business
UID:20040413T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this talk, Lili will introduce his work "Coupling distributed and symbolic execution for natural language queries," which was done during his internship at Huawei Technologies (Hong Kong), supervised by Dr. Zhengdong Lu. The study proposes a unified perspective of neural and symbolic execution for semantic parsing, and shows how we can make use of both neural and symbolic worlds.Lili Mou received his BS degree in computer science from Peking University in 2012. He is now a Ph.D. student, supervised by Profs. Zhi Jin, Ge Li, and Lu Zhang. His recent research interests include deep learning applied to natural language processing as well as programming language processing. He has publications at top conferences like AAAI, ACL, CIKM, COLING, EMNLP, IJCAI, and INTERSPEECH.

DTEND;TZID=America/Los_Angeles:20170303T160000
DTSTART;TZID=America/Los_Angeles:20170303T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Coupling distributed and symbolic execution for natural language queries
UID:20170303T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Semantic parsing has potential applications in a number of areas, including machine translation and machine reading, among many others. In this talk we will present our initial work on the parsing task for the semantic representation language known as Abstract Meaning Representation (AMR). The task is to take an English sentence and transform it into its semantic representation.We will present a series of approaches and associated results, providing guidelines for future work in this area. We will show approaches using heuristics, tree transducers, and probabilistic context free grammars. We will also present approaches for AMR rule extraction for the applicable formalisms. In doing so, we will also highlight challenges relative to syntactic parsing.Additionally, we will provide a map for the future directions in AMR parsing that we plan to pursue in the fall.Bios:Allen Schmaltz is a Ph.D. student in Computer Science in the School of Engineering and Applied Sciences at Harvard University (2013-present; S.M. 2014), working with Stuart Shieber. He is interested in formal, statistical, and human-augmented machine learning approaches for computational linguistics. Before starting his Ph.D. in Computer Science, he completed the better part of an additional Ph.D. in the (quantitative) social sciences at Harvard University (2010-2013), received a M.A. from Stanford University (2010), and received a B.A. from Northwestern University (2006). Earlier in his academic career he also studied at Cornell University and in Yokohama, Japan, among other places.Julian Schamper studies computer science at RWTH Aachen University. He did his bachelor thesis in the field of deciphering foreign language and works as a student research assistant at Prof. Hermann Ney's Human Language Technology and Pattern Recognition Group.

DTEND;TZID=America/Los_Angeles:20140829T160000
DTSTART;TZID=America/Los_Angeles:20140829T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Toward Semantic Parsing [Intern final talk]
UID:20140829T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We compare and contrast the strengths and weaknesses of a syntax-basedmachine translation model with a phrase-based machine translationmodel on several levels.  We briefly describe each model, highlightingpoints where they differ.  We include a quantitative comparison of thephrase pairs that each model has to work with, as well as the reasonswhy some phrase pairs are not learned by the syntax-based model.  Wethen propose improvements to the syntax-based extraction techniques tocapture more phrases.  We also compare the translation accuracy forall variations.

DTEND;TZID=America/Los_Angeles:20070511T163000
DTSTART;TZID=America/Los_Angeles:20070511T150000
LOCATION:11 Large
SUMMARY:What Can Syntax-based MT Learn from Phrase-based MT?
UID:20070511T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: <b>Natural Language Understanding: A fast and accurate Statistical Learning Approach for Dialogue Systems</b>Natural Language Understanding (NLU) is an essential module of a gooddialogue system. To achieve satisfactory performance levels, real timedialogue systems need the NLU module to be both fast and accurate. FiniteState Model (FSM) based systems are fast and accurate but lack robustnessand flexibility. The Statistical Learning Model (SLM) based systems arerobust and flexible but lack accuracy and are at most times slow.In this talk, I am going to talk about an SLM based NLU approach fordialogue utterances that is both accurate and fast. The system has highaccuracy and produces frames in real time.<b>A Community of Words: Understanding Social Relationships from E-mail</b>A corpus of e-mail messages presents a number of challenges for NLPtechniques, with its nearly unconstrained structure and vocabulary,mistyped words and ungrammatical sentences, and extensive contextualinformation that is never explicitly stated. Yet, the intrinsically socialnature of such communication provides an opportunity to study not just abag of words, but also the relationships, competencies, and activitiesbehind them.This talk presents work with Eduard Hovy as part of the MKIDS project.

DTEND;TZID=America/Los_Angeles:20040521T163000
DTSTART;TZID=America/Los_Angeles:20040521T150000
LOCATION:11 Large
SUMMARY:Statistical Learning for Dialogue System <b>and</b> A Community of Words
UID:20040521T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Political discourse is challenging from a sentiment analysis point of view because political issues are subjective and highly dynamic.  Political language may contain neologisms that do not occur frequently in general purpose lexical sentiment models. Also, the presence of humor, sarcasm, and comparatives may introduce errors in sentiment analysis. In Twitter, these issues are amplified by the use of Twitter-specific features and constrained message lengths. In this presentation, we will present a collaborative project between the University of Southern California (USC) Signal Analysis and Interpretation Laboratory, USC Annenberg Innovation Laboratory, and IBM. Our system is relies on manual curation of keywords and hashtags, crowd-sourced annotation, statistical machine learned sentiment models, and a real-time visualization that is ideal for display during live events.  We describe our corpus and several experiments using different settings of our sentiment models.  Among our findings are that sentiment in politics is skewed towards negative, annotation agreement tend to be low, and that sarcasm is a factor that explains some of the annotator disagreement.We have also studied bigger picture questions such as how much weight tweets by Big Bird (or someone pretending to be Big Bird) should be allocated in reporting the results of sentiment analysis.  Question about the role of humor and sarcasm in social media lead to some skepticism of naive applications of sentiment analysis but present interesting examples of content that influences social media user behavior and spills over into traditional media.This is joint work with Dogan Can, Nikos Malandrakis, Hao Wang, Alex Leavitt, Kevin Driscoll, Kristen Guth, Theo Mazumdar, Varun Lingaraju, Sagar Jhobalia, Mellisa Loudon, Shrikanth Narayanan, FranÃ§s Bar, Kjerstin Thorson, Mike Ananny, Sam Thomson, Ed Elze, Graham Mackintosh, Robert Uleman, Leon Katsnelson, and Chris Gruber.

DTEND;TZID=America/Los_Angeles:20130405T160000
DTSTART;TZID=America/Los_Angeles:20130405T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Sentiment and Sarcasm in the 2012 US Presidential Election
UID:20130405T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Tree Adjoining Grammars have well-known advantages but are typically considered too difficult for practical systems.  We propose that, when done right, adjoining improves translation quality without becoming computationally intractable.  Using adjoining to model optionality allows general translation patterns to be learned without the clutter of endless variations of optional material.  The appropriate modifiers can later be spliced in as needed to translate details.In this proposal, we describe challenges encountered by phrase-based and syntax-based machine translation (MT) systems today, and present an in-depth, quantitative comparison of both models. Then, we describe a novel model for statistical MT which addresses these challenges using a Synchronous Tree Adjoining Grammar.  We introduce a method of converting these grammars to a weakly equivalent tree transducer for decoding.   And we present a method for learning the rules and associated probabilities of this grammar from aligned tree/string training data.Finally, our initial results show that adjoining already delivers an end-to-end improvement of +0.8 BLEU over a baseline statistical syntax-based MT model on a medium-scale Arabic/English MT task.  Furthermore, we demonstrate it is a competitive entry in the Urdu-English track of the 2009 NIST MT evaluation.  We then propose improvements to the model, decoding, and extraction that promise to allow this new, linguistically-motivated MT model to surpass its syntax-based and phrase-based cousins in a wide range of scenarios and language pairs.

DTEND;TZID=America/Los_Angeles:20091023T160000
DTSTART;TZID=America/Los_Angeles:20091023T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Tree Adjoining Machine Translation (thesis proposal practice talk)
UID:20091023T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Analyzing and generating creative language (stories, poems, jokes, etc) is a growing field within computational linguistics.  We will give three short talks on the topic -- Yoav on Haiku generation, Sravana on understanding eggcorns, and Kevin on poetry translation.

DTEND;TZID=America/Los_Angeles:20100723T160000
DTSTART;TZID=America/Los_Angeles:20100723T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Three Mini-Talks on Creative Language
UID:20100723T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 

DTEND;TZID=America/Los_Angeles:20030520T160000
DTSTART;TZID=America/Los_Angeles:20030520T150000
LOCATION:11 Large
SUMMARY:Discourse Segmentation of Multi-Party Conversation
UID:20030520T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this talk we propose to use natural language as a guide for what people can perceive about the world from images and what ultimately machines should aim to see as well. We discuss two recent structured prediction efforts in this vein: scene graph parsing in Visual Genome, a framework derived from captions, and visual semantic role labeling in imSitu, a formalism built on FrameNet and WordNet. In scene graph parsing, we examine the problem of modeling higher order repeating structure (motifs) and present new state-of-the-art baselines and methods. We then look at the problem semantic sparsity in visual semantic role labeling: infrequent combinations of output semantics are frequent. We present new compositional and data-augmentation methods for dealing with this challenge, significantly improving on prior work.Bio: Mark Yatskar is a post-doc at the Allen Institute for Artificial Intelligence and recipient of their Young Investigator Award. His primary research is in the intersection of language and vision, natural language generation, and ethical computing. He received his Ph.D. from the University of Washington with Luke Zettlemoyer and Ali Farhadi and in 2016 received the EMNLP best paper award and his work has been featured in Wired and the New York Times.

DTEND;TZID=America/Los_Angeles:20180420T160000
DTSTART;TZID=America/Los_Angeles:20180420T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Language as a Scaffold for Visual Recognition
UID:20180420T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In the first part of the talk, we investigate the problem of acoustic modeling in which prior language-specific knowledge and transcribed data are unavailable. We present an unsupervised model that simultaneously segments the speech, discovers a proper set of sub-word units (e.g., phones) and learns a Hidden Markov Model (HMM) for each induced acoustic unit. Our approach is formulated as a Dirichlet process mixture model in which each mixture is an HMM that represents a sub-word unit. We apply our model to the TIMIT corpus, and the results demonstrate that our model discovers phone units that are highly correlated with English phones as well as produces better segmentation than the state-of-the-art baselines. We test the quality of the learned acoustic models on a spoken term detection task. Compared to the baseline, our model is able to improve the detection precision of top hits by a large margin.The creation of a pronunciation lexicon remains the most inefficient process in developing an automatic speech recognizer. In the second part of the talk, we discuss an unsupervised alternative to the conventional manual approach for creating pronunciation dictionaries. We present a hierarchical Bayesian model, which jointly discovers the phonetic inventory and the Letter-to-Sound (L2S) mapping rules in a language using only transcribed data. When tested on a corpus of spontaneous queries, our results demonstrate the superiority of the proposed joint learning scheme over its sequential counterpart, in which the latent phonetic inventory and L2S mappings are learned separately. Furthermore, the recognizers built with the automatically induced lexicon consistently outperform grapheme-based recognizers and even approach the performance of recognition systems trained using conventional supervised procedures.

DTEND;TZID=America/Los_Angeles:20130719T160000
DTSTART;TZID=America/Los_Angeles:20130719T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Bayesian Approaches to Acoustic Model and Pronunciation Lexicon Discovery
UID:20130719T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: (This is a practice run for I talk I will give a few times over the nextweeks when interviewing for job positions.)I will review the state of the art in statistical machine translation(SMT), present my dissertation work, and sketch out the researchchallenges of syntactically structured statistical machine translation.The currently best methods in SMT build on the translation of phrases (anysequences of words) instead of single words. Phrase translation pairs areautomatically learned from parallel corpora. While SMT systems generatetranslation output that often conveys a lot of the meaning of the originaltext, it is frequently ungrammatical and incoherent.The research challenge at this point is to introduce syntactic knowledgeto the state of the art in order to improve translation quality. Myapproach breaks up the translation process along linguistic lines. I willpresent my thesis work on noun phrase translation and ideas about clausestructure.

DTEND;TZID=America/Los_Angeles:20031010T160000
DTSTART;TZID=America/Los_Angeles:20031010T150000
LOCATION:11 Large
SUMMARY:Advances in Statistical MT: Phrases, Noun Phrases and Beyond
UID:20031010T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I discuss translation as an optimization problem subject tothree kinds of constraints: lexical, configurational, and constraintsenforcing target-language wellformedness. Lexical constraints ensurethat the lexical choices in the output are meaning-preserving;configurational constraints ensure that the relationships betweensource words and phrases (e.g., semantic roles and modifier-headrelationships) are properly transformed in translation; andtarget-language wellformedness constraints ensure the grammaticalityof the output. In terms of the traditional source-channel model ofBrown et al. (1993), the "translation model" encodes lexical andconfigurational constraints and the "language model" encodes targetlanguage wellformedness constraints. On the other hand, theconstraint-based framework suggests a generate-and-test(discriminative) model of translation in which features sensitive toinput and output structures, and the feature weights are trained tomaximize the (conditional) likelihood of a corpus of exampletranslations. The specified features represent empirical hypothesesabout what variables correlate (but not why) and thus encodedomain-specific knowledge that is useful for the problem at hand; thelearned weights indicate to what extent these hypotheses are confirmedor refuted.To verify the usefulness of the feature-based approach, I discuss theperformance two models: first, a lexical translation model evaluatedby the word alignments it learns. Unlike previous unsupervisedalignment models, the new model utilizes features that capture diverselexical and alignment relationships, including morphologicalrelatedness, orthographic similarity, and conventional co-occurrencestatistics. Results from typologically diverse language pairsdemonstrate that the generate-and-test model provides substantialperformance benefits compared to state-of-the-art generativebaselines.  Second, I discuss the results of an end-to-end translationmodel in which lexical, configurational, and wellformednessconstraints are modeled independently. Because of the independenceassumptions, the model is substantially more compact thanstate-of-the-art translation models, but still performs significantlybetter on languages where source-target word order differences aresubstantial.Bio: Chris Dyer is a postdoctoral researcher in Noah Smith's lab inthe Language Technologies Institute at Carnegie Mellon University. Hecompleted his PhD on statistical machine translation with PhilipResnik at the University of Maryland in 2010. Together with Jimmy Lin,he is author of "Data-Intensive Text Processing with MapReduce",published by Morgan & Claypool in 2010. Current research interestsinclude machine translation, unsupervised learning, Bayesiantechniques, and "big data" problems in NLP.

DTEND;TZID=America/Los_Angeles:20111216T160000
DTSTART;TZID=America/Los_Angeles:20111216T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Generate-and-Test Models for Alignment and Machine Translation
UID:20111216T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this summer project, we investigate a scalable method to extractChinese-English name transliterations from large comparable corpora,which consist of two languages discussing same or similar topics. We showthat bigram Jaccard coefficient is a good similarity method to compare Englishand Chinese names, at Chinese pronunciation (Pinyin) level. Based on this phoneticsimilarity score, an efficient randomized algorithm is then used tofind name pair candidates from English and Chinese lists. Finally, contextinformation, such as dates, frequency, place and titles are combined with thephonetic similarity to improve the accuracy of the name pairs list.(Note: This is part of the Summer Intern Series)

DTEND;TZID=America/Los_Angeles:20060818T153000
DTSTART;TZID=America/Los_Angeles:20060818T150000
LOCATION:11 Large
SUMMARY:Name Entity Transliteration Discovery from Large Bilingual Comparable Corpora
UID:20060818T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Much of the meaning conveyed in language use goes beyond the literal meaning of the words. Suppose someone asks whether I want to go for lunch, and I reply: "I had a very large breakfast". The utterance does not convey only what it literally means, my interlocutor is probably going to infer that I am not hungry and do not want to go for lunch now. Computational systems today understand at most the literal meaning of human language utterances. I aim at capturing aspects of utterance meaning, the kind of information that a reader will reliably extract from an utterance within text.The first part of the talk concentrates on interpreting answers to yes/no questions which do not straightforwardly convey a 'yes' or 'no' answer. I focus on questions involving scalar modifiers (Was it acceptable? It was unprecedented.) and numerical answers (Are you kids little? I have a 10 year-old and a 7 year-old.). I exploit the availability of large amount of text to learn meanings from words and sentences in real context. I show that we can ground scalar modifier meaning based on large unstructured databases, and that such meanings can drive pragmatic inference.The second part of the talk targets veridicality -- whether a speaker intends to convey that the events described are actual, non-actual or uncertain -- which is central to language understanding, but little used in relation and event extraction systems. What do people infer from a sentence such as FBI agents alleged in court documents today that Zazi had admitted receiving weapons and explosives training from al Qaeda operatives? Did Zazi received weapons and explosives training? I show that not only lexical semantic properties but context and world knowledge shape veridicality judgments. Since such judgments are not always categorical, I suggest they should be modeled as distributions, and propose a classifier to do so. The classifier features provide a nuanced picture of the diverse factors that affect veridicality.Short Bio:Marie-Catherine de Marneffe is a fifth-year PhD student in Linguistics at Stanford University. Prior to herdoctoral studies, she visited the Stanford NLP research group for 2 years, working with Christopher D. Manning.In 2000, she received her master degree in Classical Languages, and a master in Computer Science in 2002,both from the UniversitÃ© catholique de Louvain (Belgium). Her work in computational semantics focuses onon detecting entailment and contradiction in texts, grounding meaning from large unstructured databases, and assessing the information status of events from a reader's perspective. She is also interested in language acquisition, studying howchildren acquire verb forms in French.

DTEND;TZID=America/Los_Angeles:20110429T160000
DTSTART;TZID=America/Los_Angeles:20110429T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Computational models of utterance meaning
UID:20110429T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We present a machine translation engine that can translate romanized Arabic, often known as Arabizi, into English. With such a system we can, for the first time, translate the massive amounts of Arabizi that are generated every day in the social media sphere but until now have been uninterpretable by automated means. We accomplish our task by leveraging a machine translation system trained on non-Arabizi social media data and a weighted finite-state transducer-based Arabizi-to-Arabic conversion module, equipped with an Arabic character-based n-gram language model. The resulting system allowshigh capacity on-the-fly translation from Arabizi to English. We demonstrate via several experiments that our performance is quite close to the theoretical maximum attained by perfect deromanization of Arabizi input. This constitutes the first presentation of an end-to-end social media Arabizi-to-English translation system.bio:Jonathan May is a computer scientist at USC-ISI, where he also received a PhD in 2010. His current focus areas are in machine translation, machine learning, and natural language understanding. Jonathan co-developed and patented a highly portable method for optimizing thousands of features in machine translation systems that has since been incorporated into all leading open source MT systems. He has previously worked in automata theory and information extraction and at SDL Language Weaver and BBN Technologies.

DTEND;TZID=America/Los_Angeles:20140718T160000
DTSTART;TZID=America/Los_Angeles:20140718T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:An Arabizi-English Social Media Statistical Machine Translation System
UID:20140718T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The value of mathematical formalisms for speech recognition, language generation, and machine translation has long been recognized. Not so much work, though, has been spent reconciling these formalisms with linguistic theories. In this talk I'll propose a theoretical descriptive mechanism based on feature logic, which is central to construction and constraint-based linguistic theories like construction grammar and HPSG, and which  can be used to view tree transducers and tree-adjoining grammars as giving rise to a construction-based framework.

DTEND;TZID=America/Los_Angeles:20071102T163000
DTSTART;TZID=America/Los_Angeles:20071102T150000
LOCATION:11 Large
SUMMARY:Constructions, Constraints, Transducers, and TAGs: A unifying view through Feature Logic
UID:20071102T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: A multi-document summary gives the "gist" of what is contained in acollection of related documents. But how can we define a "gist?" Weexplore this question by analyzing human written summaries for clusters ofdocument sets. In particular, we estimate the probability that word willbe chosen by a human to be included in a summary. We demonstrate that ifthis probability model were given by an oracle, then a simple automaticmethod of summarization can produce extract summaries which arestatistically indistinguishable from the human summaries.About the Speaker:John M. Conroy received a B.S. in Mathematics from Saint Joseph'sUniversity in 1980 and a Ph.D. in Applied Mathematics from the Universityof Maryland in 1986. Since then he has been a research staff member forthe IDA Center for Computing Sciences in Bowie, MD. His research interestis applications of numerical linear algebra and statistics. He is a memberof the Society for Industrial and Applied Mathematics, Institute ofElectrical and Electronics Engineers (IEEE), and the Association forComputational Linguistics.

DTEND;TZID=America/Los_Angeles:20060127T163000
DTSTART;TZID=America/Los_Angeles:20060127T150000
LOCATION:11 Large
SUMMARY:Multi-Document Summary Space:What do People Agree is Important?
UID:20060127T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: (Note that this is a MONDAY!)

DTEND;TZID=America/Los_Angeles:20050214T163000
DTSTART;TZID=America/Los_Angeles:20050214T150000
LOCATION:11 Large
SUMMARY:Collecting Broad-Coverage Knowledge Bases from Volunteers
UID:20050214T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I present an algorithm, Searn (for "search-learn") that is designed tosolve structured prediction problem: problems whose goal is to learn topredict complex objects such as parts-of-speech, parse trees,translations, etc...  Searn functions by "breaking apart" structuredprediction problems into classification problems in the process of search.I analyze Searn in the framework of learning reductions and show that goodperformance on the underlying classification problems implies good searchperformance.  Moreover, Searn is computationally efficient in a supersetof the settings where previous algorithms are efficient and is not limitedby conditional independence assumptions (as in CRFs).  This excessivelysimple and general algorithm turns out to have excellent state-of-the-artperformance.This is joint work with John Langford (TTI-C) and Daniel Marcu; and, to alesser extent, with Drew Bagnell (CMU) and Bianca Zadrozny (IBM TJWatson).

DTEND;TZID=America/Los_Angeles:20060224T163000
DTSTART;TZID=America/Los_Angeles:20060224T150000
LOCATION:11 Large
SUMMARY:Search-based Structured Prediction
UID:20060224T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Word alignment, the process of inferring the implicit links between words across two languages, serves as an integral piece of the puzzle of learning linguistic translation knowledge. It enables us to acquire automatically from data the rules that govern the transformation of words, phrases, and syntactic structures from one language to another. Word alignment is used in many tasks in Natural Language Processing, such as bilingual dictionary induction, cross-lingual information retrieval, and distilling parallel text from within noisy data. In this talk, we focus on word alignment for statistical machine translation.We advance the state-of-the-art in search, modeling, and learning of alignments and show empirically that, when taken together, these contributions significantly improve the output quality of large-scale statistical machine translation, outperforming existing methods. The work we describe may be used for any language-pair, supporting arbitrary and overlapping features from varied sources.

DTEND;TZID=America/Los_Angeles:20120316T160000
DTSTART;TZID=America/Los_Angeles:20120316T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Syntactic Alignment Models for Large-Scale Translation (PhD Defense Practice Talk)
UID:20120316T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Extracting Parallel Sub-Sentential Fragments from Non-Parallel CorporaDragos MunteanuWe present a novel method for extracting parallel sub-sentential fragmentsfrom comparable bilingual corpora. Currently, the state of the art incomparable corpus mining is only able to extract full sentence pairs whichare judged to be parallel. We advance the state of the art by showing howto obtain useful data even from not-fully-parallel sentences. By analyzingsentence pairs using a signal-processing-inspired approach, we detectwhich segments of the source sentence are translated into segments of thetarget sentence, and which are not. We evaluate the quality of theextracted data by showing that it improves the performance of astate-of-othe-art machine translation system.Advances in Discriminative ParsingJoseph TurianThe present work advances the accuracy and training speed ofdiscriminative parsing. Our discriminative parsing method has nogenerative component, yet surpasses a generative baseline on constituentparsing, and does so with minimal linguistic cleverness. Our model canincorporate arbitrary features of the input and parse state, and performsfeature selection incrementally over an exponential feature space duringtraining. We demonstrate the flexibility of our approach by testing itwith several parsing strategies and various feature sets.

DTEND;TZID=America/Los_Angeles:20060711T160000
DTSTART;TZID=America/Los_Angeles:20060711T143000
LOCATION:11 Large
SUMMARY:Practice Talks for ACL
UID:20060711T143000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Enabling a computer to understand a document so that it can answer comprehension questions is a central, yet unsolved goal of NLP.  The task of reading comprehension (i.e., question answering over unstructured text) has received vast attention recently, and some progress has been made thanks to the creation of large-scale datasets and development of attention-based neural networks.In this talk, Iâll first present how we advance this line of research. Iâll show how simple models can achieve (nearly) state-of-the-art performance on recent benchmarks, including the CNN/Daily Mail datasets and the Stanford Question Answering Dataset. Iâll focus on explaining the logical structure behind these neural architectures and discussing advantage as well as limits of current approaches.Lastly Iâll talk about our recent work on scaling up machine comprehension systems, which attempt to answer open-domain questions at the full Wikipedia scale. We demonstrate the promise of our system, as well as set up new benchmarks by evaluating on multiple existing QA datasets.Bio:Danqi Chen is a Ph.D. candidate in Computer Science at Stanford University, advised by Prof. Christopher Manning. Her main research interests lie in deep learning for natural language processing and understanding, and she is particularly interested in the intersection between text understanding and knowledge reasoning. She has been working on machine comprehension, question answering, knowledge base population and dependency parsing. She is a recipient of Facebook fellowship and Microsoft Research Womenâs Fellowship and an outstanding paper award in ACL'16. Prior to Stanford, she received her B.S. from Tsinghua University in 2012.

DTEND;TZID=America/Los_Angeles:20170331T160000
DTSTART;TZID=America/Los_Angeles:20170331T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Towards the Machine Comprehension of Text
UID:20170331T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Training a string-to-tree syntax-based statistical machine translationsystem to translate from a source language (e.g. Chinese or Arabic)into a target language (e.g. English) requires the followingresources: a parallel corpus (a large set of example sentences in thesource language that have been translated into the target language bya human); a word alignment (a word-to-word correspondence between eachsource-target sentence pair); and a parse tree (a syntacticrepresentation) of each sentence in the target language.  From thesetraining examples, the system learns to translate source-languagesequences of words into target-language trees.  In order to ensurebroad coverage, the parallel corpus of training examples must besufficiently large (on the order of millions of sentence pairs).Manually annotating such large corpora would be prohibitivelytime-consuming.  Instead, these corpora must be word-aligned andparsed automatically.There are two problems with existing approaches to automatic wordalignment and parsing for syntax-based machine translation.  First,these processes are noisy and introduce errors which impacttranslation quality.  Second, these processes are typically performedindependently of one another.  Since each process produces constraintsthat can be used to guide the other, by more closely integrating them,we can expect to improve the accuracy of each process.  In thisthesis, we address these two problems as follows: first, we improveupon the accuracy of a state-of-the-art parser; second, we use wordalignments to improve parse accuracy; third, we use parses to improveword alignment accuracy; and fourth, we optimize parses and wordalignments simultaneously.  We examine the impact of each of thesemethods upon parse quality, alignment quality, and translation qualityin a downstream syntax-based machine translation system.  Our resultsdemonstrate that more closely integrating word alignment and syntacticparsing can indeed improve the accuracy of each process, and in somecases leads to an improvement in translation quality relative to astate-of-the-art syntax-based statistical machine translation system.

DTEND;TZID=America/Los_Angeles:20100330T170000
DTSTART;TZID=America/Los_Angeles:20100330T160000
LOCATION:11th Floor Conference Room [1135]
SUMMARY:Integrating Parsing and Word Alignment in Syntax-Based Machine Translation (Ph.D. Defense practice talk)
UID:20100330T160000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Civil litigation in this country relies on each side making relevant evidence available to the other, a process known as "discovery." The explosive growth of information in digital form has led to an increasing focus on how search technology can best be applied to balance costs and responsiveness in what has come to be known as "e-discovery".This is now a multi-billion dollar business, one in which new vendors are entering the market frequently, usually with impressive claims about the efficacy of their products or services.  Courts, attorneys, and companies are actively looking to understand what should constitute best practice, both in the design of search technology and in how that technology is employed.  In this talk I will provide an overview of the e-discovery process, and then I will use that background to motivate a discussion of which aspects of that process the TREC Legal Track is seeking to model.  I will then spend most of the talk describing two novel aspects of evaluation design: (1) recall-focused evaluation in large collections, and (2) modeling an interactive process for "responsive review" with fairly high fidelity.  Although I will draw on the results of participating teams to illustrate what we have learned, my principal focus will be on discussing what we presently understand to be the strengths and weaknesses of our evaluation designs.About the Speaker:Douglas Oard is a Professor at the University of Maryland, College Park, with joint appointments in the College of Information Studies and the Institute for Advanced Computer Studies, where he currently serves as director of the Computational Linguistics and Information Processing lab.  Dr. Oard earned his Ph.D. in Electrical Engineering from the University of Maryland.  His research interests center around the use of emerging technologies to support information seeking by end users.  His recent work has focused on interactive techniques for cross-language information retrieval, searching conversational media such as speech and email, evaluation design for e-discovery in the TREC Legal Track, and support for sense-making in large digital archival collections.  Additional information is available at http://terpconnect.umd.edu/~oard/.

DTEND;TZID=America/Los_Angeles:20121029T150000
DTSTART;TZID=America/Los_Angeles:20121029T140000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Evaluating E-Discovery Search: The TREC Legal Track
UID:20121029T140000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Scholarly digital libraries (DLs) have managed to scale upto handle millions of documents and now feature tools to trackcitations and references between articles.  However, users of digitallibraries typically often access the DL merely to check references or todownload the PDF of the document.  What features will thenext-generation DL need to inspire scholars to use digital library formore than accessing the document?  In ForeCite, our digital libraryproject at NUS, we believe part of the answer lies in integratingcommon end user's concerns: annotation, sharing, off-and-online usageand focusing on the intra-document processing.  I will describe anddemonstrate some of the preliminary components of the ForeCite system:including its web based front end, ParsCit (a backend open-sourcecitation segmentation system), and ForeCiteNote (TiddyWiki basedresearch notetaking system) and ForeCiteReader (Google Books-likeinterface for annotation and collaboration on notetaking, and FireCite(browser extension for recognizing citations on webpages).Speaker Bio:Min-Yen Kan (BS;MS;PhD Columbia Univ.) is an associate professor atthe National University of Singapore.  His research interests includedigital libraries and applied natural language processing.  Specificprojects include work in the areas of scientific discourse analysis,multiword expression extraction and understanding, machine translationand applied text summarization.  Currently, he is an associate editorfor "Information Retrieval" and is the Editor for the ACL Anthology,the computational linguistics community's largest archive of publishedresearch. More information about him and his group can be found at theWING homepage: http://wing.comp.nus.edu.sg/

DTEND;TZID=America/Los_Angeles:20100115T160000
DTSTART;TZID=America/Los_Angeles:20100115T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:ForeCite: towards a more integrated scholarly digital library
UID:20100115T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: When asked, as a PhD student, what I wanted to do when I grow up, I had one and only one answer: academic-oriented, natural language processing research.  During the last decade, I have learned though to also appreciate the research opportunities in the commercial world. In this talk, I will compare several academic and commercial research models and ground the comparison in examples derived from my own experience while working as a researcher for USC, Language Weaver, and SDL

DTEND;TZID=America/Los_Angeles:20130125T160000
DTSTART;TZID=America/Los_Angeles:20130125T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:The Things I Learned While Doing Research in the Commercial World
UID:20130125T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Contextual bandit algorithms provide principled online learning solutions to find optimal trade-offsbetween exploration and exploitation with companion side-information. They have been extensivelyused in various important practical scenarios, such as display advertising and content recommendation.A common practice estimates the unknown bandit parameters pertaining to each user independently.This unfortunately ignores dependency among users and thus leads to suboptimal solutions, especiallyfor the applications that have strong social components.In this talk, I will introduce our newly developed collaborative contextual bandit algorithm, in which theadjacency graph of users is leveraged to share context and payoffs among neighboring users duringonline updating. We rigorously prove an improved upper regret bound of the proposed collaborativebandit algorithm comparing to conventional independent bandit algorithms. More importantly, we alsoprove that user dependency relation is only needed to be time-invariant, such that a sublinear upperregret bound is still achievable in such an algorithm. This enables online user dependency estimation.Extensive experiments on both synthetic and three large-scale real-world datasets verified theimprovement of our proposed algorithm against several state-of-the-art contextual bandit algorithms. Inaddition, I will also cover our recent progress in online matrix factorization, optimizing user long-term engagement, and bandit learning in a non-stationary environment.Bio: Dr. Hongning Wang is now an Assistant Professor in the Department of Computer Science at theUniversity of Virginia. He received his Ph.D. degree in computer science at the University of Illinois atChampaign-Urbana in 2014. His research generally lies in the intersection among machine learning, datamining and information retrieval, with a special focus on computational user intent modeling. His workhas generated over 40 research papers in top venues in data mining and information retrieval areas. Heis a recipient of 2016 National Science Foundation CAREER Award and 2014 Yahoo Academic CareerEnhancement Award.

DTEND;TZID=America/Los_Angeles:20180209T160000
DTSTART;TZID=America/Los_Angeles:20180209T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Contextual Bandits in a Collaborative Environment
UID:20180209T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Latent Dirichlet allocation (LDA) and its Bayesian nonparametric generalization hierarchical Dirichlet processes (HDP) have been proven successful in modeling large, complex, real-world domains. However, inference on LDA/HDP is challenging and it has received notable attention from the researchers. In this talk, we present two algorithmic advances for LDA/HDP inference by examining their mathematical properties. We will first present an effective parallel Gibbs sampling algorithm for LDA/HDP by exploring the equivalency between the Dirichlet-multinomial hierarchy and the Gamma-Poisson hierarchy. Secondly, we will show how to provably select the number of topics for LDA by studying the spectral space of its second order moments (bi-gram statistics).Bio: Dehua Cheng is a third year Ph.D. student in the CS department at USC, advised by Professor Yan Liu. Prior to that, he received his B.S. degree in Mathematics and Physics from Tsinghua University, China. His research interests include randomized numerical algorithm in machine learning and parallel inference for probabilistic graphical model.

DTEND;TZID=America/Los_Angeles:20150515T160000
DTSTART;TZID=America/Los_Angeles:20150515T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Exploring LDA: Parallel Inference and Model Selection
UID:20150515T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This talk summarizes my summer work on scaling a machine translationsystem to train on a large data set. Similar system are tuned withMERT on 1k sentences, we train a CRF on 100k sentences. I will discusstechniques for training, features, distributed scaling,regularization, and tuning, and give preliminary results.

DTEND;TZID=America/Los_Angeles:20100827T153000
DTSTART;TZID=America/Los_Angeles:20100827T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Intern Final Talk: Large-scale, High-dimensional, Discriminative Machine Translation
UID:20100827T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Studies of social systems have traditionally focused on analyzing various structural properties of networks induced by social communication, while ignoring the content of communication. Despite recent advances, language-based analysis of social processes is still a challenging problem due to the lack of sound mathematical frameworks and adequate computational methods for extracting and analyzing useful social signals from unstructured text.Here I will describe our recent work on content-based analysis of social interactions, which involves two main steps: (a) Embedding communication content in an abstract content space, so that a sequence of textual exchanges is represented as trajectories in this space; and (b) Applying tools from information theory and dynamical systems to discover and characterize directional correlations among those trajectories. I will briefly describe the main elements of the technical approach, and demonstrate the usefulness of the proposed framework on two case studies: content-based characterization of social influence, and stylistic coordination in dialogues.Bio:Aram Galstyan is a Project Leader at the USC Information Sciences Institute and a Research Assistant Professor at the USC Computer Science Department. His current research focuses on characterizing and predicting behavior of dynamic networks using informationâtheoretic concepts. His other research interests include developing statisticalâphysics based approaches for understanding fundamental limits of various inference algorithms and characterizing the performance of those algorithms with respect to stability and robustness.

DTEND;TZID=America/Los_Angeles:20140509T160000
DTSTART;TZID=America/Los_Angeles:20140509T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Deciphering Social Interactions from Text
UID:20140509T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The success of machine learning has surged, with similar algorithmic approaches effectively solving a variety of human-defined tasks.  Tasks testing how well machines can perceive images and communicate about them have exposed strong effects of different types of bias, such as selection bias and dataset bias.  In this talk, I will unpack some of these biases, and how they affect machine perception today.Bio: Margaret Mitchell is a Senior Research Scientist in Google's Research & Machine Intelligence group, working on artificial intelligence. Her research generally involves vision-language and grounded language generation, focusing on how to evolve artificial intelligence towards positive goals. This includes research on helping computers to communicate based on what they can process, as well as projects to create assistive and clinical technology from the state of the art in AI.

DTEND;TZID=America/Los_Angeles:20171120T160000
DTSTART;TZID=America/Los_Angeles:20171120T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Algorithmic Bias in Artificial Intelligence: The Seen and Unseen Factors Influencing Machine Perception of Images and Language
UID:20171120T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Recent years have seen a resurgence of Neural Networks  in Natural Language Processing. Much of this success can be attributed to learning compact representations (or embeddings) of words, which are used as input to train standard Neural Network architectures. In the first part of the talk I will describe two approaches for learning word embeddings for large vocabularies. In the second part, I will talk about successful applications of Neural Networks in NLP tasks like Part-Of-Speech tagging, Chunking, Parsing etc. without any feature engineering. I will also describe some preliminary work on Neural Networks for unsupervised Part-Of-Speech tagging.

DTEND;TZID=America/Los_Angeles:20121107T160000
DTSTART;TZID=America/Los_Angeles:20121107T150000
LOCATION:11th Floor Conference Room [1135]
SUMMARY:Neural Networks for NLP
UID:20121107T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abduction is an inference method often used to formalize what the process of interpretation is. In this talk i'll describe a system that generates a textual description of an abductive proof and its evaluation when applied to the interpretations generated for a set of 100 movies from the Heider-Simmel Interactive Theater project. The goal of the system is to generate text that explains the system's interpretation fluently without having to read or understand a proof graph and first order logic.

DTEND;TZID=America/Los_Angeles:20151106T160000
DTSTART;TZID=America/Los_Angeles:20151106T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Text generation from abductive interpretations
UID:20151106T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The future of self-driving cars, personal robots, smart homes, and intelligent assistants hinges on our ability to communicate with computers. The failures and miscommunications of Siri-style systems are untenable and become more problematic as machines become more pervasive and are given more control over our lives. Despite the creation of massive proprietary datasets to train dialogue systems, these systems still fail at the most basic tasks. Further, their reliance on big data is problematic. First, successes in English cannot be replicated in most of the 6,000+ languages of the world. Second, while big data has been a boon for supervised training methods, many of the most interesting tasks will never have enough labeled data to actually achieve our goals. It is, therefore, important that we build systems which can learn from naturally occurring data and grounded, situated interactions.In this talk, I will discuss work from my thesis on the unsupervised acquisition of syntax which harnesses unlabeled text in over a dozen languages.  This exploration leads us to novel insights into the limits of semantics-free language learning.  Having isolated these  stumbling blocks, Iâll then present my recent work on language grounding where we attempt to learn the meaning of several linguistic constructions via interaction with the world.Yonatan Biskâs research focuses on Natural Language Processing from naturally occurring data (unsupervised and weakly supervised data).  He is a postdoc researcher with Daniel Marcu at USCâs Information Sciences Institute.  Previously, he received his PhD from the University of Illinois at Urbana-Champaign under Julia Hockenmaier and his BS from the University of Texas at Austin.

DTEND;TZID=America/Los_Angeles:20170210T160000
DTSTART;TZID=America/Los_Angeles:20170210T150000
LOCATION:6th Floor Conference Room [689]
SUMMARY:The Limits of Unsupervised Syntax and the Importance of Grounding in Language Acquisition
UID:20170210T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I will briefly overview the landscape of event-oriented informationextraction (IE) systems and explain why it is especially challengingto learn IE systems without annotated training data. Then I willdescribe one attempt to do so by decoupling the tasks of findingrelevant text regions and applying extraction patterns. First, aself-trained relevant sentence classifier identifies relevant regionsin documents. Second, a "semantic affinity" measure identifiesdomain-relevant extraction patterns.  We further distinguish between"primary" patterns and "secondary" patterns and apply the patternsselectively in the relevant regions.  This approach is weaklysupervised, requiring only a few seed patterns plus relevant andirrelevant (but unannotated) documents for training.  The resulting IEsystem achieves reasonably good performance, despite the fact that therelevant region classifier leaves a lot to be desired.

DTEND;TZID=America/Los_Angeles:20080613T160000
DTSTART;TZID=America/Los_Angeles:20080613T150000
LOCATION:11 Large
SUMMARY:Effective Information Extraction with Relevant Regions and Semantic Affinity Patterns
UID:20080613T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Talk-1: Learning Phoneme Mappings for Transliteration without Parallel DataWe present a method for performing machine transliteration without any parallel resources. We frame the transliteration task as a decipherment problem and show that it is possible to learn cross-language phoneme mapping tables using only monolingual resources.  We compare various methods and evaluate their accuracies on a standard name transliteration task.This is joint work with Kevin Knight.----------------------------------------------------Talk-2: A New Objective Function for Word AlignmentWe develop a new objective function for word alignment that measures the size of the bilingual dictionary induced by an alignment. A word alignment that results in a small dictionary is preferred over one that results in a large dictionary.  In order to search for the alignment that minimizes this objective, we cast the problem as one of integer linear programming.  We then extend our objective function to align corpora at the sub-word level, which we demonstrate on a small Turkish-English corpus.This is joint work with Tugba Bodrumlu and Kevin Knight.

DTEND;TZID=America/Los_Angeles:20090514T160000
DTSTART;TZID=America/Los_Angeles:20090514T150000
LOCATION:4th flr CR
SUMMARY:Practice talks for NAACL HLT
UID:20090514T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Quantitative data, such as time series and numerical attribute data, often play a crucial role in understanding the world and validating factual statements. Unfortunately, quantitative datasets are often expressed in diverse formats that exhibit significant variation, posing difficulties to machine reading approaches. Furthermore, the scant context that accompanies these data often makes it difficult to relate the quantitative data with broader ideas. Finally, the vast amount of quantitative data make it difficult for humans to find, understand, or access. In this talk, I highlight my recent work, which focuses on developing general approaches to extracting quantitative data from structured sources, creating high-level descriptions of these sources, and aligning quantitative data with textual and ontological labels.Bio: Jay Pujara is a research scientist at the University of Southern California's Information Sciences Institute whose principal areas of research are machine learning, artificial intelligence, and data science. He completed a postdoc at UC Santa Cruz, earned his PhD at the University of Maryland, College Park and received his MS and BS at Carnegie Mellon University. Prior to his PhD, Jay spent six years at Yahoo! working on mail spam detection, and he has also worked at Google, LinkedIn and Oracle. Jay is the author of over thirty peer-reviewed publications and has received three best paper awards for his work. He is a recognized authority on knowledge graphs, and has organized the Automatic Knowledge Base Construction (AKBC) and Statistical Relational AI (StaRAI) workshops, presented tutorials on knowledge graph construction at AAAI and WSDM, and had his work featured in AI Magazine. For more information, visit https://www.jaypujara.org

DTEND;TZID=America/Los_Angeles:20180427T160000
DTSTART;TZID=America/Los_Angeles:20180427T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Extracting and Aligning Quantitative Data with Text
UID:20180427T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The annual Computational Linguistics Open House will be held at USC's InformationSciences Institute from 3:00-4:30pm in the 11th floor Conference Room. Researchers fromISI, including Eduard Hovy, Daniel Marcu, and Kevin Knight will present overviews oftheir latest research.  We will also hear about the research activities of Dani Byrd ofthe Linguistics Department, Shri Narayanan's group in EE, and David Traum and AndrewGordon of USC's Institute for Creative Technologies.

DTEND;TZID=America/Los_Angeles:20031017T163000
DTSTART;TZID=America/Los_Angeles:20031017T150000
LOCATION:11 Large
SUMMARY:Introduction to CL Research
UID:20031017T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We present a novel approach to weakly supervised semantic class learning fromthe web, using a single powerful hyponym pattern combined with graphstructures, which capture two properties associated with pattern-basedextractions: popularity and productivity. Intuitively, a candidate is popularif it was discovered many times by other instances in the hyponym pattern. Acandidate is productive if it frequently leads to the discovery of otherinstances. Together, these two measures capture not only frequency ofoccurrence, but also cross-checking that the candidate occurs both near theclass name and near other class members. We developed two algorithms that beginwith just a class name and one seed instance and then automatically generate aranked list of new class instances. We conducted experiments on four semanticclasses and consistently achieved high accuracies.

DTEND;TZID=America/Los_Angeles:20080502T160000
DTSTART;TZID=America/Los_Angeles:20080502T150000
LOCATION:11 Large
SUMMARY:Semantic Class Learning from the Web with Hyponym Pattern Linkage Graphs
UID:20080502T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: When rewards are sparse and efficient exploration essential, deep Q-learning with Ïµ-greedy exploration tends to fail. This poses problems for otherwise promising domains such as task-oriented dialog systems, where the primary reward signal, indicating successful completion, typically occurs only at the end of each episode but depends on the entire sequence of utterances. A poor agent encounters such successful dialogs rarely, and a random agent may never stumble upon a successful outcome in reasonable time. We present two techniques that significantly improve the efficiency of exploration for deep Q-learning agents in dialog systems. First, we demonstrate that exploration by Thompson sampling, using Monte Carlo samples from a Bayes-by-Backprop neural network, yields marked improvement over standard DQNs with Boltzmann or Ïµ-greedy exploration. Second, we show that spiking the replay buffer with a small number of successes, as are easy to harvest for dialog tasks, can make Q-learning feasible when it might otherwise fail catastrophically.Bio:I am a graduate student in the Artificial Intelligence Group at the University of California, San Diego on leave for two quarters at Microsoft Research Redmond. I work on machine learning, focusing on deep learning methods and applications. In particular, I work on modeling sequential data with recurrent neural networks and sequential decision-making processes with deep reinforcement learning. I'm especially interested in research impacting medicine and natural language processing. Recently, in Learning to Diagnose with LSTM RNNs, we trained LSTM RNNs to accurately predict patient diagnoses using only lightly processed time series of sensor readings in the pediatric ICU. Before coming to UCSD, I completed a Bachelor of Arts with a joint major in Mathematics and Economics at Columbia University. Then, I worked in New York City as a jazz musician. I have interned with Amazon's Core Machine Learning team and Microsoft Research's Deep Learning Team.

DTEND;TZID=America/Los_Angeles:20160916T143000
DTSTART;TZID=America/Los_Angeles:20160916T133000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Efficient Exploration for Dialog Policy Learning with BBQ Networks & Replay Buffer Spiking
UID:20160916T133000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I will present work that extends the standard hidden Markov model to aversion that can emit multiple symbols in a single time step.  Using thismodel, we are able to automatically create phrase-to-phrase mappings in analignment process.  I've applied this model to the task of creatingalignments between documents and their human-written abstracts, yieldingan overall alignment F-score of 0.548, a significant improvement on thebest results to date of 0.363.  These results are published in an EMNLPpaper this year, but the talk will be an extended version of the talk Iwill give there (namely, I will discuss the mechanics of the extended HMMin more detail in this seminar).

DTEND;TZID=America/Los_Angeles:20040702T150000
DTSTART;TZID=America/Los_Angeles:20040702T133000
LOCATION:11 Large
SUMMARY:A Phrase-Based HMM Approach to Document/Abstract Alignment
UID:20040702T133000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: (a 20-minute version of this talk was given at the ISD retreat, with no technical details.)Parsing is the task of finding the most probable interpretation for a given sentence, and is a central problem in NLP because it serves as the basis of many downstream applications such as machine translation, summarization, paraphrasing, and question answering. Improving parsing efficiency and accuracy will greatly improve the applicability of those applications.However, unlike human parsing which is amazingly efficient by scanning the sentence incrementally, current state-of-the-art parsers are either extremely slow (standard algorithms like CKY scale cubically with sentence length), or purely greedy in the search algorithm that only touches a tiny fraction of the (exponentially) large search space. We instead propose a dynamic programming algorithm that does incremental parsing and ambiguity packing along the way, such that the running time is (almost) linear, and yet searches over exponentially many trees. Empirical results are very good, but further details withheld -- come to the talk!This is a joint work with Kenji Sagae, USC/ICT.

DTEND;TZID=America/Los_Angeles:20100305T160000
DTSTART;TZID=America/Los_Angeles:20100305T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Incremental Parsing
UID:20100305T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: A challenging problem in open information extraction and text mining is the learning of the selectional restrictions of semantic relations. We propose a minimally supervised bootstrapping algorithm that uses a single seed and a recursive lexico-syntactic pattern to learn the arguments and the supertypes of a diverse set of semantic relations from the Web. We evaluate the performance of our algorithm on multiple semantic relations expressed using "verb", "noun" and "verb prep" lexico-syntactic patterns. We embark on human based evaluation to assess the quality of the harvested information and find out that the overall accuracy of our algorithm is 90%. We also compare our results with existing knowledge base outlining the similarity and differences of the granularity and diversity of the harvested knowledge.

DTEND;TZID=America/Los_Angeles:20100702T160000
DTSTART;TZID=America/Los_Angeles:20100702T153000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Learning Arguments and Supertypes of Semantic Relations using Recursive Patterns (ACL 2010 Practice Talk)
UID:20100702T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I will present a new kind of dependency parsing algorithm: easy first,non directional dependency parsing.  This is a greedy, bottom upparser, admitting an efficient O(nlogn) implementation.  Unlikeshift-reduce based greedy parsers, it does not analyze the sentence ina fixed sequential order, but instead tries to make easier attachmentdecisions between harder ones.  The parser performs well on bothHebrew and English.  I also present evidence that the parsers producesqualitatively different parses than either the Malt or the MSTparsers.  This observation give rise to an intriguing questions: whydo different parsers produce different parses? can we quantify thiskind of difference?  In the second part of the talk I will present myattempts to answer these kinds of questions.

DTEND;TZID=America/Los_Angeles:20100611T163000
DTSTART;TZID=America/Los_Angeles:20100611T153000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Easy First Dependency Parsing and How Different Parsers Behave Differently
UID:20100611T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: TBA

DTEND;TZID=America/Los_Angeles:20050408T163000
DTSTART;TZID=America/Los_Angeles:20050408T150000
LOCATION:11 Large
SUMMARY:Search Engines for HLT Applications
UID:20050408T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Semantics has been the object of deep study for many years. Yet representation of contentâthe actual meaning of the symbols used in semantic propositionsâis curiously absent from most of this work. This talk argues that this is so because the most useful way of conceptualizing content is not in the form of symbols but as statistical word(sense) distributions, suitably organized.  Over the past few years, NLP research has increasingly treated topic signature word distributions (also called 'context vectors', 'topic models', 'language models', etc.) as a de facto replacement for semantics at various levels of granularity. Whether the task is wordsense disambiguation, certain forms of textual entailment, information extraction, paraphrase learning, and so on, it turns out to be very useful to consider a semantic unit as being defined by the distribution of word(senses) that regularly accompany it (in the classic words of Firth, "you shall know a word by the company it keeps"). This is true for semantic units of all sizes, from individual word(sense)s to sentences to text collections; the information learned and used by WSD engines closely resembles that learned by LDA and similar topic characterization engines.In this talk I argue for a new kind of semantics, which is combines traditional symbolic logic-based proposition-style semantics of the kind used in older NLP with (computation-based) statistical word distribution information (what is being called Distributional Semantics in modern NLP). The core resource is a single lexico-semantic 'lexicon' that can be used for a variety of tasks provided it is reformulated appropriately. I show how to define such a lexicon, how to build and format it, and how to use it for various tasks. The talk pulls together a wide range of related topics, including Pantel-style resources like DIRT, inferences / expectations such as those used in Schank-style expectation-based parsing and expectation-driven NLU, PropBank-style word valence lexical items, and the treatment of negation and modalities.Combining the two views of semantics seems promising but opens many questions that need study, including the operation of logical operators such as negation and modalities over word(sense) distributions, the nature of ontological facets required to define concepts, and the action of compositionality over statistical concepts.

DTEND;TZID=America/Los_Angeles:20101005T173000
DTSTART;TZID=America/Los_Angeles:20101005T160000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Toward a Computational Theory of Semantic Content
UID:20101005T160000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: After a brief overview of applications of tree transducers in statistical machine translation, we introduce multi bottom-up tree transducers (XMBOT).We then present a complete translation system integrating XMBOT. The two main components of our pipeline are (a) rule extraction and (b) decoding. We begin by presenting the extraction of XMBOT rules from an aligned and bi-parsed parallel corpus. In a second step, we introduce our XMBOT decoder which is an adaptation of the syntax-based component of the Moses open-source MT toolkit to handle XMBOT rules. We end this talk with an evaluation of our system on the WMT 2009 English-to-German translation task.

DTEND;TZID=America/Los_Angeles:20130726T160000
DTSTART;TZID=America/Los_Angeles:20130726T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Multi bottom-up tree transducers in statistical machine translation
UID:20130726T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This talk has two parts. In the first part, I will introduce researchactivities in Speech-to-Speech Translation at ATR, including on-goingresearch on statistical machine translation. In the second part, I willpresent a new approach to QA named Question-Biased Term Extraction (QBTE).The QBTE directly extracts answers as terms biased by the question. Toconfirm the feasibility of our QBTE approach, we conducted experiments onthe CRL QA Data based on 10-fold cross validation, using Maximum EntropyModels as an ML technique. Experimental results showed that the trainedsystem achieved approximately 0.35 in MRR and 50% in TOP5 accuracy. Thispart is an English version of my presentation given in IPSJ SIGNL-163 in2004 in Japanese. If time allows, I would like to introduce the NTCIR-5(2004/2005) Cross-Lingual QA task (CLQA) that I am going to organize.About the speaker:Yutaka Sasaki received his Ph.D. in Engineering from the University ofTsukuba, Japan in 2000 for his work on generating Information Extractionrules with hierarchically sored Inductive Logic Programming. He joined NTTLaboratories in 1988. Since then, he was involved in research inrule-based CAI, inductive logic programming, Information Extraction, andQuestion Answering. From 1995 to 1996, he spent one year at Simon FraserUniversity, Canada as a visiting researcher. From 1999, he led a subgroupto develop the first practical Japanese Question Answering System SAIQA.Then, he applied SVMs to automatically construct the QA system SAIQA-IIfrom QA and NE data. In June 2004, he moved to ATR Spoken LanguageTranslation Research Laboratories. Currently, he is the head of Departmentof Natural Language Processing. He is also an organizer of the NTCIR 5Cross-Lingual Question Answering Task.

DTEND;TZID=America/Los_Angeles:20050128T163000
DTSTART;TZID=America/Los_Angeles:20050128T150000
LOCATION:11 Large
SUMMARY:Research Activities in Speech Translation at ATR/QA as Question-Biased Term Extraction
UID:20050128T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Parsing and translating natural languages can be viewed asstructured-prediction problems. We outline the crucial designdecisions that must be made to build a machine to solve structuredprediction problems, and explain our particular choices for these twolarge-scale NLP problems.  Our approach uses a purely discriminativelearning method that scales up well to problems of this size.  Unlikecurrently popular methods, this one does not require a great deal offeature engineering a priori, because it performs feature selectionover a compound feature space as it learns.  Accuracy on constituentparsing was at least as good as other comparable methods.  To ourknowledge, it is the first purely discriminative learning algorithmfor translation with tree-structured models.  Experiments demonstratethe method's versatility, accuracy, and efficiency.

DTEND;TZID=America/Los_Angeles:20060623T163000
DTSTART;TZID=America/Los_Angeles:20060623T150000
LOCATION:11 Large
SUMMARY:Discriminative Training for Large-Scale NLP
UID:20060623T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I'm going to talk about what I've been working on recently.  My thesisproposal is something having to do with the interaction of search,learning and features in supervised natural language problems.  I will befocusing on the task of coreference, since it is a well-studied problem,yet nevertheless not really solved and quite difficult.  It is also agreat pedagogical example for why we should care about something *other*than standard Markov random fields for structured prediction, since, forthe coreference problem (and pretty much every other "real" naturallanguage problem) inference in such models is intractable.The contents of this talk will be roughly 40% from a paper I have at ICMLthis year on efficient, accurate supervised learning techniques forstructured prediction (and why I feel inclined to make the verycontroversial statement that supervised learning for NLP problems issolved); it will be roughly 40% about an application of this technique tothe coreference resolution problem and an exploration of the feature spacefor solving this problem (submitted to HLT); and it will be roughly 20%about looking forward to what I want to accomplish in the remainder of mythesis, not covered by the first 80%.

DTEND;TZID=America/Los_Angeles:20050613T120000
DTSTART;TZID=America/Los_Angeles:20050613T103000
LOCATION:11 Small
SUMMARY:Search, Learning and Features (my thesis proposal proposal)
UID:20050613T103000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: BLEU the de facto standard for evaluation and development of statistical machine translation systems.  We describe three real-world situations involving comparisons between different versions of the same systems where one can obtain improvements in BLEU scores that are questionable or even absurd. We propose a very conservative modification to BLEU that addresses these issues while improving correlation with human judgements, then explore some deeper modifications that alleviate the problems further.

DTEND;TZID=America/Los_Angeles:20080530T153000
DTSTART;TZID=America/Los_Angeles:20080530T150000
LOCATION:11 Large
SUMMARY:BLEU Sway Issues: one way to get statistical significance, two ways to get a better score, and three ways to thwart them
UID:20080530T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We present a method for improving word alignment for statistical syntax-based machine translation that employs a syntactically informed alignment model closer to the translation model than commonly-used word alignment models. This leads to extraction of more useful linguistic patterns and improved BLEU scores on translation experiments in Chinese and Arabic.

DTEND;TZID=America/Los_Angeles:20080411T160000
DTSTART;TZID=America/Los_Angeles:20080411T150000
LOCATION:11 Large
SUMMARY:Syntactic Re-Alignment Models for Machine Translation
UID:20080411T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Narrative prediction involves predicting âwhat happens nextâ in a story. This task has a long history in AI research but is now getting more recognition in the NLP community. In this talk Iâll describe three different evaluation schemes for narrative prediction, one of which (the Story Cloze Test) is the shared task for this yearâs LSDSem workshop at EACL. Iâll present my ongoing efforts to develop Recurrent Neural Network-based models that succeed on these evaluation frameworks, and discuss the particular challenges posed by each of them.Bio: Iâm a PhD candidate at USCâs Institute for Creative Technologies advised by Andrew Gordon in the Narrative Group. My thesis research explores machine learning approaches to automatically generating text-based stories. Iâm interested in using this research to stimulate peopleâs creativity in writing. More broadly, Iâm excited by any opportunity to use automated analysis of text data to give people new insights and ideas.

DTEND;TZID=America/Los_Angeles:20170203T160000
DTSTART;TZID=America/Los_Angeles:20170203T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Recurrent Neural Networks for Narrative Prediction
UID:20170203T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this talk I will present recent work on two topics: syntactically structured representations of word meaning in context and semi-supervised semantic role labeling.  These will be presented as two instances of a general theme: acquiring structured meaning representations with little or no manual annotation.Vector space models have become a standard way of representing word meaning that can be learned in an unsupervised way.  The problem of polysemy, however, has only recently been addressed within this framework.  Several approaches to derive vector representations of words in specific sentential contexts have been proposed.  I will present recent work on extending such contextualization operations to vector models incorporating rich syntactic structure, achieving significant improvements in context-dependent lexical substitution tasks.Going beyond the meaning of single words, I will then turn to work on semantic role labeling.  Here, a key obstacle is the annotation effort required for the training of high quality role labeling systems.  I will present a semi-supervised approach to semantic role labeling, based on generalizing semantic annotations from manually labeled seed sentences to unlabeled sentences via structural alignments, yielding significant improvements in role labeling performance.I will conclude my talk with an outlook onto how the search for adequate models of semantics may profit from formulation in task-specific ways. In particular, I will sketch some ideas on structured semantic models for statistical machine translation.Bio: Hagen FÃ¼rstenau is a researcher at Saarland University, Germany.  Hereceived an M.Sc. in Mathematics from Bonn University and is about tofinish his Ph.D. in Computational Linguistics.  His research interestsinclude data-driven methods in computational semantics and weaklysupervised machine learning.

DTEND;TZID=America/Los_Angeles:20110214T120000
DTSTART;TZID=America/Los_Angeles:20110214T110000
LOCATION:4th Floor Large Conference Room [460]
SUMMARY:Learning Structured Semantics under Weak Supervision
UID:20110214T110000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Ranked lists of output trees from syntactic statistical NLP applicationsfrequently contain multiple repeated entries. This redundancy leads tomisrepresentation of tree weight and reduced information for debugging andtuning purposes. It is chiefly due to nondeterminism in the weightedautomata that produce the results. I will introduce an algorithm thatdeterminizes such automata while preserving proper weights, returning thesum of the weight of all multiply derived trees. I will also reportresults of the application of the algorithm to machine translation andData Oriented Parsing.

DTEND;TZID=America/Los_Angeles:20051216T163000
DTSTART;TZID=America/Los_Angeles:20051216T150000
LOCATION:11 Large
SUMMARY:A Better N-Best List - Practical Determinization of Weighted Finite Tree Automata
UID:20051216T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We present a simple yet powerful hierarchical search algorithm for automatic word alignment. Essentially, we treat word alignment as a parsing problem, and induce a forest of alignments from which we can efficiently extract a ranked k-best list. We score a given alignment within the forest with a flexible, linear discriminative model incorporating hundreds of local and nonlocal features features, trained on a relatively small amount of annotated data. We report results on Arabic-English word alignment and translation tasks. Our model outperforms a GIZA++ Model-4 baseline by 6.3 points in F-measure, yielding a 1.1 BLEU score increase over a state-of-the-art syntax-based machine translation system.

DTEND;TZID=America/Los_Angeles:20100630T160000
DTSTART;TZID=America/Los_Angeles:20100630T153000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Hierarchical Search for Word Alignment (ACL 2010 Practice Talk)
UID:20100630T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Traditional inference-based natural language understanding (NLU) in acomputational framework suffered mainly from a lack of a sufficientlylarge knowledge base of commonsense knowledge. Recent advances havechanged this situation: A large amount of machine-readable knowledgeis now freely available to the community. This talk focuses onexploiting these developments to model large-scale NLU in aninference-based framework.The three main types of the existing knowledge sources arelexical-semantic dictionaries, distributional resources, andontologies. After comparing these types of resources and outliningtheir differences, I will present an integrative knowledge basecombining lexical-semantic, ontological, and distributional knowledgein a modular way.I will then talk about reasoning procedures able to make use of thelarge scale knowledge base. In particular, I will compare two mainforms of logical inferences applied to NLU: deduction and abduction.In the last part of the talk, I will present experiments on thefollowing knowledge-intensive NLU tasks: recognizing textualentailment, semantic role labeling, and paraphrasing of noun-noundependencies.

DTEND;TZID=America/Los_Angeles:20111007T160000
DTSTART;TZID=America/Los_Angeles:20111007T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Integration of World Knowledge for Natural Language Understanding
UID:20111007T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Open-class semantic lexicon induction is of great interest for the current knowledge harvesting algorithms. We propose a general framework that uses patterns in bootstrapping fashion to learn open-class semantic lexicons for different kinds of relations. These patterns require seeds. To estimate the /goodness/ (the potential yield) of new seeds, we introduce a regression model that considers the connectivity behavior of the seed during bootstrapping. The generalized regression model is evaluated on six different kinds of relations with over 10000 different seeds for English and Spanish patterns. Our approach reaches robust performance of 90% correlation coefficient with 15% error rate for any of the patterns when predicting the /goodness/ of seeds.

DTEND;TZID=America/Los_Angeles:20100521T153000
DTSTART;TZID=America/Los_Angeles:20100521T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Not All Seeds Are Equal: Measuring the Quality of Text Mining Seeds
UID:20100521T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Due to the Internet revolution, human conversational data--in written forms--are accumulating at a phenomenal rate, as more and more people engage  in email exchanges, blogging, texting and other social media activities. In this talk, we will present automatic methods for analyzing conversational text generated in asynchronous conversations, i.e., where participants communicate with each other at different times (e.g., email, blog, forum). Our focus will be on novel techniques to detect the topics covered in the conversation, to identify whether an utterance in the conversation is expressing an opinion, as well as to determine the discourse structure of each message. In our work, we apply both graph-based methods and probabilistic graphical models.Giuseppe is an Associate Professor in Computer Science at the University of British Columbia (BC, Canada). Giuseppe has broad interdisciplinary interests. His work on natural language processing and information visualization to support decision making has been published in over 90 peer-reviewed papers. Dr. Carenini was the area chair for âSentiment Analysis, Opinion Mining, and Text Classificationâ of ACL 2009 and the area chair for âSummarization and Generationâ of NAACL 2012. He has recently co-edited an ACM-TISTSpecial Issue on âIntelligent Visual Interfaces for Text Analysisâ. In July 2011, he has published a co-authored book on âMethods for Mining and Summarizing Text Conversationsâ. In his work, Dr. Carenini has also extensively collaborated with industrial partners, including Microsoft and IBM. Giuseppe was awarded a Google Research Award and anIBM CASCON Best Exhibit Award in 2007 and 2010 respectively.

DTEND;TZID=America/Los_Angeles:20131108T160000
DTSTART;TZID=America/Los_Angeles:20131108T150000
LOCATION:6th Floor Large Conference Room [Rm 689]
SUMMARY:Modeling Topics, Opinions and Discourse Structure in Asynchronous Conversations
UID:20131108T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In the past decade, researchers have explored many approaches toautomatically extract large collections of knowledge from text. In thistalk, we present Espresso, a weakly-supervised, general-purpose, andbroad-coverage algorithm for harvesting binary semantic relations. Themain contributions are: i) a method for exploiting generic patterns byfiltering incorrect instances using the Web; and ii) a principled measureof pattern and instance reliability enabling the filtering algorithm. Wepresent an empirical comparison of Espresso with various state of the artsystems, on different size and genre corpora, on extracting variousgeneral and specific relations. Experimental results show that ourexploitation of generic patterns substantially increases system recallwith small effect on overall precision.

DTEND;TZID=America/Los_Angeles:20060519T163000
DTSTART;TZID=America/Los_Angeles:20060519T150000
LOCATION:11 Large
SUMMARY:Espresso: Making Use of Generic Patterns for Mining Relations from Small and Large Corpora
UID:20060519T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: If you speak a little perl, are an occasional perl-scripter, andwould like to know more about how to use it as a (p)ortable, (e)fficient, and (r)eadible (l)anguage, you may be interested in mybrown bag (read: bring your own) lunch seminar:I will talk about using Perl in a portable fashion, the environmentit is run in, and how avoid common mistakes and misconceptions. Perloffers more than a thousand ways to solve a problem, but some aremore portable or more efficient than others. If time permits, simplehands-on examples can be tried out during the talk, so power forlaptops will be provided.

DTEND;TZID=America/Los_Angeles:20061023T133000
DTSTART;TZID=America/Los_Angeles:20061023T120000
LOCATION:11 Large
SUMMARY:perl - how to use it, not abuse it
UID:20061023T120000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We investigate the use of distributed word representations instead of word forms and parts of speech in syntactic parsing.  Distributed representations are dense, low-dimensional, and real valued vector representations (embeddings) for words.  Instead of ad-hoc feature conjunctions, we use kernels and neural networks for non-linearity, greatly simplifying feature engineering.  We show that dense representations offer both computational and learning advantages compared to sparse one-hot vector representations.  We introduce context vectors, distributed representations for word contexts, and show that they can replace or complement parts of speech in parsing models.  We show that distributed representations give accuracies comparable to the state-of-the-art word form and part-of-speech based feature sets.Bio. Deniz Yuret is an associate professor of Computer Engineering at KoÃ§ University in Istanbul working at the Artificial Intelligence Laboratory since 2002. Previously he was at the MIT AI Lab (1988-1999) and later co-founded Inquira, Inc., a company commercializing question answering technology (2000-2002). He has worked on supervised and unsupervised approaches to syntax, morphology, lexical semantics and lexical categories.  His most recent work is on creation and applications of continuous word embeddings.

DTEND;TZID=America/Los_Angeles:20150710T160000
DTSTART;TZID=America/Los_Angeles:20150710T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Parsing with word vectors
UID:20150710T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The ability of computer models to achieve genuine understanding of information as presented to humans (text, images, etc) is a long-standing goal of Artificial Intelligence. Along the way towards this goal, the research community has proposed solving tasks such as machine reading comprehension and computer image understanding. In this talk, we introduce two new tasks that can help us move closer to the goal. First, we present a multi-choice reading comprehension task, for which the goal is to understand a text passage and choose the correct summarizing sentence from among several options. Second, we present a multi-modal understanding task, posed as a combined vision-language comprehension challenge: identifying the most suitable text describing a visual scene, given several similar options. We present several baseline and competitive learning approaches based on neural network architectures, illustrating the utility of the proposed tasks in advancing both image and language comprehension. We also present human evaluation results, which inform a performance upper-bound on these tasks, and quantify the remaining gap between computer systems and human performance (spoiler alert: we are not there yet).Radu Soricut is a Staff Research Scientist in the Research and Machine Intelligence group at Google. Radu has a PhD in Computer Science from University of Southern California, and has been with Google since 2012. His main areas of interest are natural language understanding, multilingual processing, natural language generation (from multimodal inputs), and general machine learning techniques for solving these problems. Radu has published extensively in these areas in top-tier peer-reviewed conferences and journals, and has won the Best Paper Award at the North American Association for Computational Linguistics Conference (NAACL) in 2015. Radu's current project looks at bridging natural language understanding and generation using neural techniques, in the context of Google's focus on making natural language an effective way of interacting with the world and the technology around us.

DTEND;TZID=America/Los_Angeles:20161209T160000
DTSTART;TZID=America/Los_Angeles:20161209T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Multimodal Machine Comprehension: Tasks and Approaches
UID:20161209T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: A major obstacle in syntax-based machine translation is theprohibitively large search space for decoding with an integratedlanguage model. We develop faster approaches for this problem basedon lazy algorithms for k-best parsing. When comparing againstChiang's technique of cube pruning, our method runs up to twice asfast without making more search errors or decreasing translationaccuracy as measured by BLEU. We demonstrate the effectiveness of thealgorithm on a large-scale translation system.Interestingly, these techniques can be applied to speed up bilexicalparsing as well, where the (bi-) lexical probabilities can be viewedas n-gram probabilities that causes non-monotonicity. This methodfits naturally into the coarse-to-fine grained multi-pass parsingschemes.To push this direction even further, we can generalize cube and lazycube pruning as generic tools for reducing complicated search spaces,as alternatives to the well-known A* and annealing techniques.This is joint work with David Chiang (ISI).

DTEND;TZID=America/Los_Angeles:20061214T150000
DTSTART;TZID=America/Los_Angeles:20061214T133000
LOCATION:11 Large
SUMMARY:Faster Decoding with Synchronous Grammars and n-gram Language Models
UID:20061214T133000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 

DTEND;TZID=America/Los_Angeles:20121005T160000
DTSTART;TZID=America/Los_Angeles:20121005T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Learning Whom to Trust with MACE
UID:20121005T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this paper, we introduce a way of improving the parsing accuracy of a transition-based dependency parsing model by using k-best ranking. Our approach uses a broader search space than beam search, yet keeps the parsing complexity near a quadratic average running time. In addition, we take a simple post-processing step to ensure the parsing output is a connected dependency tree. As an oracle, we use a high-performing but relatively under-explored machine learning algorithm, Robust Risk Minimization, which gives a higher parsing accuracy than the Perceptron algorithm in the experiments. We also use an automatic feature reduction technique that reduces the feature space by about 49% without compromising the parsing accuracy. We evaluate our approach on the CoNLL '09 shared task English data and improve the transition-based dependency parsing accuracy, showing a 0.64% higher accuracy than the best transition-based CoNLL '09 system.

DTEND;TZID=America/Los_Angeles:20100519T163000
DTSTART;TZID=America/Los_Angeles:20100519T153000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:K-best, Transition-based Dependency Parsing using Robust Risk Minimization and Automatic Feature Reduction
UID:20100519T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Cube pruning is a fast inexact method for generating the items of abeam decoder.  Here we show that cube pruning is essentiallyequivalent to A* search on a specific search space with specificheuristics.  We use this insight to develop faster and exact variantsof cube pruning.

DTEND;TZID=America/Los_Angeles:20090723T154500
DTSTART;TZID=America/Los_Angeles:20090723T150000
LOCATION:11 Large
SUMMARY:Cube Pruning as Heuristic Search (Practice talk for EMNLP)
UID:20090723T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The Voynich manuscript is a medieval illustrated book writtenin an undeciphered script. I will present some questions and answersabout the linguistic and statistical properties of the text.

DTEND;TZID=America/Los_Angeles:20100825T150000
DTSTART;TZID=America/Los_Angeles:20100825T143000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Intern Final Talk: Towards deciphering the Voynich manuscript
UID:20100825T143000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: During the past few years, graph signal processing has been extending the fieldof signal processing on Euclidean spaces to irregular spaces represented bygraphs. We have seen successes ranging from the Fourier transform, towavelets, vertex-frequency (time-frequency) decomposition, sampling theory,uncertainty principle, or convolutive filtering. This presentation introducesthe field, the type of signals involved, and how harmonic analysis isperformed.Bio: Benjamin Girault received his License (B.Sc.) and his Master (M.Sc.) in Francefrom Ãcole Normale SupÃ©rieure de Cachan, France, in 2009 and 2012 respectivelyin the field of theoretical computer science. He then received his PhD incomputer science from Ãcole Normale SupÃ©rieure de Lyon, France, in December2015. His dissertation entitled "Signal Processing on Graphs - Contributionsto an Emerging Field" focuses on extending the classical definition ofstationary temporal signals to stationary graph signal. Currently, he is apostdoctoral scholar with Professors Antonio Ortega and Shri Narayanan at theUniversity of Southern California continuing his work on graph signalprocessing with a focus on applying these tools to understanding humanbehavior.

DTEND;TZID=America/Los_Angeles:20170609T160000
DTSTART;TZID=America/Los_Angeles:20170609T150000
LOCATION:6th Floor  Conference Room [689]
SUMMARY:Introduction to Graph Signal Processing: Tools for Harmonic Analysis on Irregular Structures.
UID:20170609T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Text-to-text applications -- Machine Translation, Summarization, QuestionAnswering -- do not usually involve generic Natural Language Generation(NLG) systems in their generation components, but rather useapplication-specific algorithms. The main reason for this state of affairsis that virtually all the formalisms used by current generic NLG systemsrequire information that cannot be reliably extracted from unrestrictedtext.This thesis proposal is about meeting the demand for natural languagegeneration in the context of text-to-text applications. I introduce a newrepresentation formalism (WIDL-expressions), propose generation algorithmsthat operate on representations specific to this formalism, and discuss ageneric sentence realization framework for text-to-text applications. Thegeneration mechanism is based on algorithms for intersectingWIDL-expressions with probabilistic language models. I present boththeoretical and empirical results concerning the correctness andefficiency of these algorithms. I also discuss the practical aspectsarising from implementing this generation mechanism.In a concrete application of the proposed generation mechanisms, I presentan end-to-end Machine Translation application. I also discuss anotherpossible application for Automated Summarization, namely automatedheadline generation.

DTEND;TZID=America/Los_Angeles:20050707T163000
DTSTART;TZID=America/Los_Angeles:20050707T150000
LOCATION:11 Small
SUMMARY:Natural Language Generation for Text-to-Text Applications Using an Information-Slim Representation
UID:20050707T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Incorporating implicit feedback into a recommender system is a challenging problem due to sparse and noisy observations. I will present our approaches that exploit heterogeneous attributes and sequence properties within the observations. We build a neural network framework to embed heterogeneous attributes in an end-to-end fashion, and apply the framework to three sequence-based models. Our methods achieve significant improvements on four large-scale datasets compared to state-of-the-art baseline models (30% to 90% relative increase in NDCG). Experimental results show that attribute embedding and sequence modeling both lead to improvements and, further, that our novel output attribute layer plays a crucial role. I will conclude with our exploratory studies that investigate why sequence modeling works well in recommendation systems and advocate its use for large scale recommendation tasks.Bio:Kuan Liu is a fifth year Ph.D. student at ISI/USC working with Prof. Prem Natarajan. Before that, He received a bachelor degree from Tsinghua University with a major in Computer Science. His research interests include machine learning, large scale optimization, deep learning, and applications to recommender systems, network analysis.

DTEND;TZID=America/Los_Angeles:20170317T160000
DTSTART;TZID=America/Los_Angeles:20170317T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Heterogeneous Attribute Embedding and Sequence Modeling for Recommendation with Implicit Feedback
UID:20170317T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this talk, we introduce a methodology for analyzing judgment opinions.We define a judgment opinion as consisting of a valence, a holder, and atopic. We decompose the task of opinion analysis into four parts: 1)recognizing the opinion; 2) identifying the valence; 3)  identifying theholder; and 4) identifying the topic. We evaluate our methodology usingboth intrinsic and extrinsic measures.

DTEND;TZID=America/Los_Angeles:20060421T163000
DTSTART;TZID=America/Los_Angeles:20060421T150000
LOCATION:11 Large
SUMMARY:Identifying and Analyzing Judgment Opinions
UID:20060421T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: It has been a prominent empirical fact in the last decade that languages which have properties that are different from those of English, for instance, languages with free word-order and rich morphological structure, do not lend themselves naturally to the application of statistical models developed for processing English. In this talk I focus on the parsing task and based on the kind of correspondence patterns between form and function that characterize richly inflected languages, I aim to identify the properties of models that can successfully cope with parsing such structures. I start by demonstrating complex many-to-many correspondence patterns in Natural Language using data from the Semitic language Modern Hebrew. I review properties of prominent models for morphological analysis (Stump 2001), and isolate the ones that are appropriate for modeling such complex patterns. I then propose to apply the same strategy to the syntactic domain, arguing that this provides not only for a streamlined interface to morphology, but also better yields a better framework for capturing morphosyntactic interactions on the whole. I illustrate this approach via a particular instantiation, the relational-realizational model of (Tsarfaty 2010), applied to parsing Modern Hebrew. I report significant improvements on various measures over competing alternatives and previously reported results. I finally suggest that other modeling frameworks may often be enhanced to cope better with rich morphosyntactic phenomena, by similarly analyzing their underlying properties and enhancing their relational, or realizational, component, accordingly.Speaker website:http://stp.lingfil.uu.se/~tsarfaty/

DTEND;TZID=America/Los_Angeles:20100608T113000
DTSTART;TZID=America/Los_Angeles:20100608T103000
LOCATION:10th Floor Conference Room [1026]
SUMMARY:Morphology in Parsing: A Taxonomy-Based Approach
UID:20100608T103000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Many areas of science have experienced rapid growth in the amount of scientific literature published. For example, there are approximately 400 new papers published each year in the area of Machine Translation. As such amount of new data is virtually impossible to processes by a single researcher, a new tool is needed that would help researchers explore existing and discover new MT literature. To address this problem we built an approach for automatic extraction of experimental data from scientific papers that populates a database enabling structured queries.Bios:Eunsol Choi is a PhD student at the University of Washington, advised by Prof. Luke Zettlemoyer. Prior to UW, she studied mathematics and computer science at Cornell University.Matic Horvat is a PhD student at University of Cambridge researching integration of semantics and Statistical Machine Translation. He is originally from Ljubljana, Slovenia, where he completed a BSc in Computer Science in 2012. He continued with a masters in Advanced Computer Science at University of Cambridge, graduating in 2013.

DTEND;TZID=America/Los_Angeles:20140911T163000
DTSTART;TZID=America/Los_Angeles:20140911T153000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Towards automatic extraction of experimental data from scientific papers [Intern final talk]
UID:20140911T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: (Practice tutorial for ACL/COLING 2006)Once upon a time, synchronous grammars and tree transducers were esoterictopics in formal language theory, far removed from the practice ofbuilding real, large-scale natural language systems. However, these toolsare now rapidly becoming essential for modeling machine translation andother complex language transformations. It has therefore become practicaland important to understand the basic properties of tree transformationsystems, which we cover in this tutorial.

DTEND;TZID=America/Los_Angeles:20060630T170000
DTSTART;TZID=America/Los_Angeles:20060630T140000
LOCATION:11 Large
SUMMARY:Synchronous Grammars and Tree Transducers
UID:20060630T140000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Parallel texts -- texts that are translations of each other -- are animportant resource in many cross-lingual NLP applications, such as lexicalacquisition, cross-language IR, and annotation projection. However, theirimportance is paramount for Statistical Machine Translation (SMT), as theyprovide the training data from which all the translation knowledge islearned. The state of the art in SMT is advanced enough that, givensufficient parallel data (i.e. a few million words) for any language pairin a given domain, a generic SMT system trained on it will achieve areasonable translation performance in that domain. The main reason why SMTsystems exist only for a handful of languages is that, for most languagepairs, parallel training data is simply not available.One way to alleviate this lack of parallel data is to exploit a muchricher and more diverse resource: comparable corpora, texts which are notstrictly parallel but related. The prototypical example of comparabletexts are two news articles in different languages which report on thesame event. I will present methods for automatic extraction of paralleldata from such corpora. I will show how to detect parallel data at variouslevels of granularity: parallel documents, parallel sentences, and evenparallel sub-sentence fragments. The parallel corpora obtained using thesemethods help improve translation performance for both resource-scarcelanguage pairs (such as Romanian-English) and resource-rich ones (such asArabic-English).

DTEND;TZID=America/Los_Angeles:20060324T163000
DTSTART;TZID=America/Los_Angeles:20060324T150000
LOCATION:11 Large
SUMMARY:Automatic creation of parallel corpora
UID:20060324T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Modeling query concepts through term dependencies has been shown to have a significant positive effect on retrieval performance, especially for tasks such as Web search, where relevance at high ranks is particularly critical. Most previous work, however, treats all concepts as equally important, an assumption that often does not hold, especially for longer, more complex queries. In this talk, I will describe the state-of-the-art practices for modeling query term dependencies for information retrieval using Markov random fields. Within this context I will discuss why many NLP-inspired approaches to the problem, such as query segmentation, have failed to show consistent improvements when applied to information retrieval tasks. Experimental results carried out on a number of TREC and Yahoo! Web search test collections will be presented showing the effectiveness of various types of term (in)dependence models.Brief bio:Donald Metzler is a Research Scientist in the Search and Computational Advertising group at Yahoo! Research. He obtained his Ph.D. from the University of Massachusetts in 2007. He is an active member of the information retrieval and web search communities, having served on the program committees of SIGIR, ECIR, HLT, EMNLP, WSDM, WWW, and ICML. He has published over 35 research papers, has 13 patents pending, and is the co-author of Search Engines: Information Retrieval in Practice. His research interests include information retrieval, web search, computational advertising, and applications of machine learning to large-scale text problems.

DTEND;TZID=America/Los_Angeles:20091204T160000
DTSTART;TZID=America/Los_Angeles:20091204T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Learning Query Concept Importance Using a Weighted Dependence Model
UID:20091204T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Tree adjoining grammars (TAGs) have greater linguistic expressiveness than the tree substitution grammars used in many natural language tasks, but are typically considered too complex or computationally expensive for practical systems.  Many current statistical machine translation (MT) models use tree substitution to memorize sequences of words or constituents, specifying exactly what phrases to use or exactly what trees are grammatical.  Adding the operation of tree adjoining provides the freedom to splice additional information into an existing grammatical tree.  An adjoining translation model allows general, linguistically-motivated translation patterns to be learned without the clutter of endless variations of optional material.  The appropriate modifiers, such as adjectives, adverbs, and prepositional phrases, can later be grafted in as needed to translate details.  We show that the increased generalization power provided by adjoining, when used carefully, improves MT quality without becoming computationally intractable.In this talk, we describe challenges encountered by phrase-based and syntax-based MT systems today, and present an in-depth, quantitative comparison of both models.  Then, we describe a novel model for statistical MT which addresses these challenges using a synchronous tree adjoining grammar.  We introduce a method of converting these grammars to a weakly equivalent tree transducer for decoding.   Then we present a method for learning the rules and associated probabilities of this grammar from aligned tree/string training data.Finally, our results show that adjoining delivers a consistent improvement over a baseline statistical syntax-based MT model on both medium and large-scale MT tasks using several language pairs.

DTEND;TZID=America/Los_Angeles:20110304T173000
DTSTART;TZID=America/Los_Angeles:20110304T163000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Tree Adjoining Machine Translation
UID:20110304T163000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This talk summarizes my work so far on investigating the usefulness ofHyTER networks for short-answer scoring. I will first introduce thetask and the approach we take in this project. And finally I will showsome initial results we have.Bio. Wenduan Xu is a graduate student in Cambridge advised by Stephen Clark, working on CCG parsing.

DTEND;TZID=America/Los_Angeles:20150825T160000
DTSTART;TZID=America/Los_Angeles:20150825T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Using HyTER networks for short-answer scoring
UID:20150825T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Traditionally, automated triage of papers is performed using lexical (unigram, bigram, and sometimes trigram) features.  This talkexplores the use of information extraction (IE) techniques to create richer linguistic features than traditional bag-of-words models. Ourclassifier includes lexico-syntactic patterns and more-complex features that represent a pattern coupled with its extracted noun,represented both as a lexical term and as a semantic category. Our experimental results show that the IE-based features can improve performance over unigram and bigram features alone. We present intrinsic evaluation results of full-text document classification experiments to determine automatically whether a paper should be considered of interest to biologists at the Mouse Genome Informatics (MGI) system at the Jackson Laboratories. We also further discuss issues relating to design and deployment of our classifiers as an application to support scientific knowledge curation at MGI.

DTEND;TZID=America/Los_Angeles:20110610T160000
DTSTART;TZID=America/Los_Angeles:20110610T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:The Role of Information Extraction in the Design of a Document Triage Application for Biocuration
UID:20110610T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: An interesting (disturbing?) new trend is beginning to manifest itself inNLP, one that is focused on performance and hence very attractive in thecontext of inter-system competitive evaluations such as TREC and DUC, butone that does not provide much insight about language or NLP methods tothe researcher interested in these topics.  This addition of a newparadigm to NLP has implications for all of us.

DTEND;TZID=America/Los_Angeles:20040409T163000
DTSTART;TZID=America/Los_Angeles:20040409T150000
LOCATION:11 Large
SUMMARY:Three (and a half?) Trends: The Future of NLP
UID:20040409T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Unsupervised Part of Speech induction can be viewed as a two-steps task.The first step infers a sequence of states, while the second step maps this sequence to anactual Part-of-Speech sequence at training or testing time.Hence, this last step requires reference tagged data, a luxury low-resource target languages might not have.In this talk, we present an alternative approach to the second step, modeling it as a decipherment problemin which the ciphered text is the sequence of states and the original text we want to recover is the POS sequence.This approach requires no reference data in the target language and allows to leverage POS sequences inmuch richer languages.Our experiments show that our approach benefits the most from simple strategies for inferring state sequences, such as Brown clustering.This allow our method to obtain reasonable performance in low-resource and limited-time scenarios.Bio: Ronald Cardenas is a Master's student in the Language and Communication Technologies programme at Charles University in Prague. His research interests span morphological analysis and parsing of low-resource languages.At ISI, he works with Jonatan May on developing universal language tools.

DTEND;TZID=America/Los_Angeles:20180817T160000
DTSTART;TZID=America/Los_Angeles:20180817T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Decipherment for Universal Language Tools: a case study for Unsupervised Part of Speech Induction
UID:20180817T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Efficient computation of substring posteriors from lattices has applications in the estimation of document frequencies in spoken corpora and lattice-based minimum Bayes-risk decoding in statistical machine translation. In this talk, we present a new algorithm for exact substring posterior computation that leverages the following observations to speed up computation: i) the set of substrings for which the posteriors will be computed typically comprises all n-grams in the lattice up to a certain length, ii) posterior probability is equivalent to expected count for substrings that do not repeat on any path of the input lattice, iii) there are efficient algorithms for computing expected counts from lattices. We present experimental results comparing our algorithm with the best known algorithm in literature as well as a baseline algorithm based on finite state automata operations.Bio: Dogan Can is a fifth year Ph.D. student at USC SAIL (Signal Analysis and Interpretation Lab). He works with Professor Shrikanth Narayanan on a range of topics including lattice indexing for spoken information retrieval, concurrent/online speech processing architectures and statistical modeling of psychotherapy sessions. His research interests include weighted finite state automata, automatic speech recognition, information retrieval, dialogue modeling and behavioral informatics.

DTEND;TZID=America/Los_Angeles:20150213T160000
DTSTART;TZID=America/Los_Angeles:20150213T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Efficient Computation of Substring Posteriors from Lattices using Weighted Factor Automata
UID:20150213T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We present an approach to automatically extracting paraphrase templatesfrom document/abstract pairs. This methodology relies on word-basedalignments created by off-the-shelf software. Our paraphrases areevaluated by human evaluators for precision and automatically forapplicability. We find that 77% of the extracted paraphrases are judgedto be always correct and that the generalized templates of 60% arejudged to be applicable most of the time and 87% are judged to beapplicable sometimes.

DTEND;TZID=America/Los_Angeles:20030502T160000
DTSTART;TZID=America/Los_Angeles:20030502T150000
LOCATION:11 Large
SUMMARY:Acquiring Paraphrase Templates from Document/Abstract Pairs
UID:20030502T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Our contextual inquiry into the practices of oralhistorians uneartheda curious incongruity. While oral historians consider interviewrecordings a central historical artifact, these recordingssit unusedafter a written transcript is produced. We hypothesizedthat this islargely because books are more usable than recordings.Therefore, wecreated Books with Voices: bar-code augmented paper transcriptsenabling fast, random access to digital video interviews ona PDA. Wepresent quantitative results of an evaluation of this tangibleinterface with 13 participants. They found this lightweight,structured access to original recordings to offersubstantial benefitswith minimal overhead. Oral historians found a level ofemotion in thevideo not available in the printed transcript. The videoalso helpedreaders clarify the text and observe nonverbal cues.<ahref="http://guir.berkeley.edu/oral-history/">http://guir.berkeley.edu/oral-history/

DTEND;TZID=America/Los_Angeles:20030307T160000
DTSTART;TZID=America/Los_Angeles:20030307T150000
LOCATION:11 Large
SUMMARY:Books with Voices: Paper Transcripts as a Tangible Interface to Oral Histories
UID:20030307T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: If you ever needed to extract information, e.g. LHS, RHS words, features, etc., from an XRS rules, this talk is for you. Over the years, a variety of regular expressions have been used to obtain data from XRS rules. However, in light of recent pipeline efforts, the copy-n-paste culture lead to expressions that were sometimes too complex for the task at hand, unnecessarily slowing down processing steps, or too trivial to work correctly on boundary cases. A unified effort by Steve, David, Wei, Michael and Jens culminated in the NLPRules module for Perl. While the talk centers on the Perl module, and some surprising benchmark results, any language supporting libpcre (perl compatible regular expression) will benefit from the insights, and from knowing the right regular expression for the task at hand.

DTEND;TZID=America/Los_Angeles:20081017T160000
DTSTART;TZID=America/Los_Angeles:20081017T150000
LOCATION:11 Large
SUMMARY:Parsing XRS with(out) regular expressions
UID:20081017T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Broad-coverage repositories of semantic relations between verbs couldbenefit many NLP tasks. We present a semi-automatic method for extractingfine-grained semantic relations between verbs. We detect similarity,strength, antonymy, enablement, and temporal happens-before relationsbetween pairs of strongly associated verbs using lexico-syntactic patternsover the Web. On a set of 29,165 strongly associated verb pairs, ourextraction algorithm yielded 65.5% accuracy. We provide the resource,called VerbOcean, for download at http://semantics.isi.edu/ocean/. We willalso discuss current work on disambiguating the verbs in the network aswell as refining the semantic relations using path analysis.

DTEND;TZID=America/Los_Angeles:20040816T153000
DTSTART;TZID=America/Los_Angeles:20040816T140000
LOCATION:11 Large
SUMMARY:VerbOcean: Mining the Web for Fine-Grained Semantic Verb Relations
UID:20040816T140000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The large corpora of written text that is available to the languagecommunity has largely been utilized for language understanding; it hassomewhat been ignored in the context of language generation. Recentdevelopments in stochastic generation have allowed such systems to shiftthe burden from hand crafted databases (lexicons, grammars, ontologies) tothe knowledge implicitly found in written text. However, when building adialogue system, generation is largely interactive, very different fromthe written structure of most corpora.In this talk, I will discuss my recent work at applying a stochasticgenerator, HALogen, and its newswire language model to a dialogue system,TRIPS. I'll describe the difficulties in mapping the TRIPS semantic forminto HALogen's representation, the critical differences between newswireand dialogue, and the possibility of using HALogen and a large newswiremodel as a domain independent generator.

DTEND;TZID=America/Los_Angeles:20030221T160000
DTSTART;TZID=America/Los_Angeles:20030221T150000
LOCATION:11 Large
SUMMARY:Statistical Language Generation in a Dialogue System
UID:20030221T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I will be presenting some recent results of mine regarding the possibilityof automatic evaluation in summarization.  I will discuss both my ownfindings, as well of those of people here and at Columbia, and attempt toexplain in a principled fashion why there are disparate opinions on theplausibility of performing automatic evaluation in this task.  I willdiscuss my (perhaps pessimistic) views on the plausibility of doing anysort of evaluation of summarization, automatic or otherwise.The results and experimental setups developed in connection withsummarization will be extended to the machine translation.  I will reviewpossible reasons why metrics such a bleu have experienced significantlymore success in machine translation than in summarization.  I will alsoconnect the evaluation criterea developed in the context of summarizationto machine translation, and discuss the automation of these methods.In short: I'll talk about why I've been doing so much data elicitaitonrecently.This will be a highly informal seminar and participation is highlyencouraged.

DTEND;TZID=America/Los_Angeles:20040220T160000
DTSTART;TZID=America/Los_Angeles:20040220T150000
LOCATION:4 Large
SUMMARY:Some Results in Automatic Evaluation for Summarization and MT
UID:20040220T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Phrase-based statistical machine translation typically uses no syntactic information during translation, but while this information intuitively seems useful, including it has not necessarily helped translation performance. My PhD project is looking at this problem in the context of a syntactically-informed reordering preprocessing step prior to phrase-based translation. My work so far has shown that this preprocessing step does not necessarily improve performance when applied to every sentence; in my project I aim to develop a lattice-based system, armed with a number of syntax-based confidence features, that can choose on a sentence-by-sentence basis whether to use the reordering. In this presentation I will outline my progress so far, and welcome feedback and suggestions, particularly with respect to features to consider.Short Bio:Suzy Howlett is a PhD student at the Centre for Language Technology at Macquarie University, Australia, under the supervision of Mark Dras. She studied computer science and linguistics as an undergraduate at the University of Sydney, finishing in 2008 with an Honours year with James Curran, looking at automatically annotating additional training data for the C&C statistical CCG parser.

DTEND;TZID=America/Los_Angeles:20110628T160000
DTSTART;TZID=America/Los_Angeles:20110628T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Confidence in Syntax for Statistical Machine Translation
UID:20110628T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: EM (the Expectation Maximization Algorithm) is a well known technique for unsupervised learning (where one does not have any hand labeled solutions available, but instead one must learn from the raw text). Unfortunately EM is known to fail to find good solutions in many (most?) applications on which it is tried.  In this talk we present some recent work on using EM to learn how to resolve pronoun-anaphora: determining that "the dog" is the antecedent of "he" and "his" in "When Sally fed the dog he wagged his tail". For this application EM works strikingly well, determining tens of thousands of parameters and resulting in a program that probably produces state of the art results, although because this is preliminary work, and pronoun-anaphora has no standard evaluation metrics, this is just a guess.About the Speaker:Eugene Charniak is Professor of  Computer Science. and Cognitive Science at Brown University. He received an A.B. degree in Physics from University of Chicago and a Ph.D. from M.I.T. in Computer Science. He has published four books: Computational Semantics, with Yorick Wilks (1976); Artificial Intelligence Programming (now in a second edition) with Chris Riesbeck, Drew McDermott, and James Meehan (1980, 1987); Introduction to Artificial Intelligence with Drew McDermott (1985); and Statistical Language Learning (1993). He is a Fellow of the American Association of Artificial Intelligence and was previously a Councilor of the organization. His research has always been in the area of language understanding or technologies which relate to it, such as knowledge representation, reasoning under uncertainty, and learning. Over the last few years he has been interested in statistical techniques for language understanding. His research in this area has included work in the subareas of part-of-speech tagging, probabilistic context-free grammar induction, and, more recently, syntactic disambiguation through word statistics, efficient syntactic parsing, and lexical resource acquisition through statistical means.

DTEND;TZID=America/Los_Angeles:20080926T160000
DTSTART;TZID=America/Los_Angeles:20080926T150000
LOCATION:11 Large
SUMMARY:EM Works for Pronoun-Anaphora Resolution
UID:20080926T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Thanks to the availability of parallel data and advances in machine learning techniques, we have seen tremendous improvement in the field of machine translation over the past 20 years. However, due to lack of parallel data, the quality of machine translation is still far from satisfying for many language pairs and domains. In general, it is easier to obtain non-parallel data, and much work has tried to learn translations from non-parallel data. Nonetheless, improvements to machine translation have been limited. In this work, I follow a decipherment approach to learn translations from non parallel data and achieve significant gains in machine translation.I apply slice sampling to Bayesian decipherment. Compared with the state-of-the-art algorithm, the new approach is highly scalable and accurate, making it possible to decipher billions of tokens with hundreds of thousands of word types at high accuracy for the first time. Furthermore, I introduce dependency relations to address the problems of word reordering, insertion, and deletion when deciphering foreign languages, and show that dependency relations help improve deciphering accuracy by over 5-fold.I decipher large amounts of monolingual data to learn translations for out-of-vocabulary words and observe significant gains of up to 3.8 BLEU points in domain-adaptation. Moreover, I show that a translation lexicon learned from large amounts of non-parallel data with decipherment can improve a phrase-based machine translation system trained with limited parallel data. In experiments, I observe BLEU gains of 1.2 to 1.8 across three different test sets.Given the above success, I propose to work on advancing machine translation of real world low density languages, and to explore using non-parallel data to improve word alignment and discovery of phrase translations.Bio:Qing Dou is a fourth year PhD student at USC/ISI, advised by Professor Kevin Knight.

DTEND;TZID=America/Los_Angeles:20140514T120000
DTSTART;TZID=America/Los_Angeles:20140514T110000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Beyond Parallel Data [Qualification practice talk]
UID:20140514T110000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 1)We investigate properties of a simple recurrent neural network (RNN) as a formal device for recognizing weighted languages. We focus on the single-layer, ReLU-activation, rational-weight RNN with softmax, a standard form of RNN used in language processing applications. We prove that many questions one may ask about such RNNs are undecidable, including consistency, equivalence, minimization, and finding the highest-weighted string. For consistent RNNs, finding the highest-weighted string is decidable, although the solution can be exponentially long in the length of the input RNN encoded in binary. Limiting to solutions of polynomial length, we prove that finding the highest-weighted string for a consistent RNN is NP-complete and APX-hard.2) Neural Machine Translation has gained popularity in recent years and has been able to achieve impressive results. The only caveat is that millions of parallel sentences are needed in order to train the system properly, and in a low-resource scenario that amount of data simply may not be available. This talk will discuss strategies for addressing the data scarcity problem, particularly using alignment tables to make use of parallel data from higher-resource language pairs and creating synthetic in-domain data.Bio: Yining Chen is an third year undergraduate student at Dartmouth College. She is a summer intern at ISI working with Professor Kevin Knight and Professor Jonathan May.Sasha Mayn is a summer intern at ISIâs Natural Language Group. She is particularly interested in machine translation and language generation. Last summer Sasha interned at the PanLex Project in Berkeley, where she was responsible for pre-processing digital dictionaries and entering them into PanLex's multilingual database. This summer she has been working on improving neural machine translation strategies for low-resource languages under the supervision of Jon May and Kevin Knight.

DTEND;TZID=America/Los_Angeles:20170831T160000
DTSTART;TZID=America/Los_Angeles:20170831T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:THURSDAY TALK:  1)Recurrent Neural Networks as Weighted Language Recognizers 2)Gloss-to-English: Improving Low Resource Language Translation Using Alignment Tables
UID:20170831T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Question Answering (QA) is a longstanding goal in Natural Language Processing (NLP). In its simplest form, QA relies on keyword matching to find single-word answers (e.g., search engines).But single words taken out of context are ambiguous -- only context disambiguates them. This meaningful context comes in the form of syntactic and/or semantic relations between predicates and arguments. Relations are thus at the core of meaning and information. Systems like Siri or Watson have put QA in more widespread use, and users move away from single-word questions to more complex ones. Finding and classifying relations to answer those questions will thus become the central challenge for future QA systems.The large number of relations makes relation extraction challenging; given a sentence, many possible relations can be extracted. If we can specify the relations we are interested in beforehand, we can annotate data to train supervised systems. Often though, definition beforehand is impossible, and we have to find all possible relations that hold in a text. In those cases, we must rely on unsupervised approaches.A second problem is rapid adaptation to new domains and topics. Relations extracted from one domain may not be relevant to another.A third problem is variation in the ways relations are expressed in text. Often, intervening words and phrases between predicates and arguments cause fixed-window pattern matching approaches to fail.Most previous relation extraction approaches have either relied on annotated data or (semi-) structured sources of information. These approaches require pre-defined relations and manually annotated data. Furthermore, many of these approaches rely on pattern matching over surface strings, which is not robust to variations. If previous approaches used unsupervised training methods, they largely focused on clustering, effectively ignoring sequential structure in the data.The future of QA will require us to quickly adapt to new domains and topics with little annotated data. Only if we can discover and disambiguate relations automatically can we build systems capable of open-ended QA.I present several techniques for discovering relations from text. I show how to use unsupervised sequential models to discover relations from raw text. These methods do not require any existing resources, manual annotation, or pre-defined relations, and can be applied to any domain. I use dependency parse structures as inputs to these methods, making these approaches more robust to surface variations. I show improvements over state-of-the-art systems as well as novel approaches to fully exploit the structure contained in the data.

DTEND;TZID=America/Los_Angeles:20120503T170000
DTSTART;TZID=America/Los_Angeles:20120503T160000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Using Syntactic Information for Unsupervised Relation Extraction and Typing (Thesis Proposal Practice Talk)
UID:20120503T160000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We align pairs of English sentences and corresponding Abstract Meaning Representations (AMR), at the token level.  Such alignments will be useful for downstream extraction of semantic interpretation and generation rules.  Our method involves linearizing AMR structures and performing symmetrized EM training.  We obtain 86.5% and 83.1% alignment F score on development and test sets.Bio:Nima Pourdamghani is a second year Ph.D. student at ISI. He works with Professor Kevin Knight on Abstract Meaning Representation and its application to machine translation.

DTEND;TZID=America/Los_Angeles:20141107T160000
DTSTART;TZID=America/Los_Angeles:20141107T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Aligning English Strings with Abstract Meaning Representation Graphs
UID:20141107T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We will present the results of the 2003 Johns Hopkins UniversitySummer Workshop on "Syntax for Statistical Machine Translation".We will describe a large effort to extend a high-performingphrase-based MT system as baseline by adding new features representingsyntactic knowledge that deal with specific problems of the underlyingbaseline. We investigate a broad range of possible feature functions,from very simple binary features to sophisticated tree-to-treetranslation models. Simple feature functions test if a certainconstituent occurs in the source and the target language parsetree. More sophisticated features will be derived from an alignmentmodel where whole sub-trees in source and target can be aligned nodeby node. We present results on the Chinese-English large data track ofthe recent TIDES MT evaluations.This is joint work with the other workshop team members: DanielGildea, Anoop Sarkar, Sanjeev Khudanpur, Kenji Yamada, Libin Shen,Shankar Kumar, David Smith, Viran Jain, Katherine Eng, Jin Zhen andDragomir Radev.See <ahref="http://www.clsp.jhu.edu/ws03/groups/translate/">http://www.clsp.jhu.edu/ws03/groups/translate/</a>for more.

DTEND;TZID=America/Los_Angeles:20030903T160000
DTSTART;TZID=America/Los_Angeles:20030903T150000
LOCATION:11 Large
SUMMARY:JHU MT Workshop
UID:20030903T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract: Transition-based methods leverage non-local features for structured tasks. When combined with beam search and global structure learning, they give high accuracies for a number of NLP tasks. We investigate the effectiveness of neural network models for transition-based parsing and Chinese word segmentation. Results show that automatic features induced by neural models give higher accuracies than carefully designed manual features. The beam search and perceptron learning framework of Zhang and Clark (2011) can be used with neural network models. However, large margin training does not always work. When the number of labels are many, a maximum likelihood training objective with contrastive estimation learning gives better accuracies.Bio: Yue Zhang is currently an assistant professor at Singapore University of Technology and Design. Before joining SUTD in July 2012, he worked as a postdoctoral research associate in University of Cambridge, UK. Yue Zhang received his DPhil and MSc degrees from University of Oxford, UK, and his BEng degree from Tsinghua University, China. His research interests include natural language processing, machine learning and artificial Intelligence. He has been working on statistical parsing, parsing, text synthesis, machine translation, sentiment analysis and stock market analysis intensively. Yue Zhang serves as the reviewer for top journals such as Computational Linguistics, Transaction of Association of Computational Linguistics and Journal of Artificial Intelligence Research.  He is also PC member for conferences such as ACL, COLING, EMNLP, NAACL, EACL, AAAI and IJCAI. Recently, he was the area chairs of COLING 2014, NAACL 2015 and EMNLP 2015.

DTEND;TZID=America/Los_Angeles:20160623T160000
DTSTART;TZID=America/Los_Angeles:20160623T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Title: Neural network models for structured prediction
UID:20160623T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Ever since their introduction in 1992, hand-crafted lexico-semantic patterns have been shown to be useful for many semantic tasks. In recent years, an automatic, fully unsupervised method to generate patterns was developed ("flexible patterns"). I will demonstrate that flexible patterns are useful for extracting semantic information on words, word relations and sentences. I will present in detail the latest results in the field â applying flexible patterns on the task of authorship attribution on tweets (Schwartz et al., EMNLP2013).

DTEND;TZID=America/Los_Angeles:20131025T160000
DTSTART;TZID=America/Los_Angeles:20131025T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Semantic Representation using Flexible Patterns
UID:20131025T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Recent research on statistical machine translation has lead to the rapid development of syntax-based translation models, whichexploit syntactic information to direct translation. In this talk, Iwill give an overview of tree-to-string translation models, one of thestate-of-the-art syntax-based models. In a tree-to-string model, the source side is a phrase structure parse tree and the target side is astring. This talk includes the following topics: (1) tree-based tree-to-string model, (2) tree-sequence based tree-to-string model,(3) forest-based tree-to-string model, and (4) context-awaretree-to-string model. Experimental results show that the forest-basedtree-to-string system outperforms Hiero significantly on Chinese-to-English translation.About the speaker:Yang Liu is an Assistant Researcher at Institute of ComputingTechnology (ICT), Chinese Academy of Sciences. He received his PhDdegree in Computer Science from ICT in 2007. His major researchinterests include statistical machine translation and Chineseinformation processing. He has been working on syntax-based modeling,word alignment, and system combination. His paper on tree-to-stringtranslation won the Meritorious Asian NLP Paper Award of COLING/ACL2006. He served as Reviewers for TALIP, TSLP, JNLE, ACL, EMNLP, AMTA, and SSST.

DTEND;TZID=America/Los_Angeles:20090715T170000
DTSTART;TZID=America/Los_Angeles:20090715T160000
LOCATION:11 Large
SUMMARY:An Overview of Tree-to-String Translation Models
UID:20090715T160000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We present an inference algorithm that organizes observedwords (tokens) into structured inflectional paradigms (types). Italso naturally predicts the spelling of unobserved forms that aremissing from these paradigms, and discovers inflectionalprinciples (grammar) that generalize to wholly unobserved words.Our Bayesian generative model of the data explicitly representstokens, types, inflections, paradigms, and locally conditionedstring edits. It assumes that inflected word tokens are generatedfrom an infinite mixture of inflectional paradigms (stringtuples). Each paradigm is sampled all at once from a graphicalmodel, whose potential functions are weighted infinite-statetransducers with language-specific parameters to be learned. Theseassumptions naturally lead to an elegant empirical Bayesinference procedure that exploits Monte Carlo EM, beliefpropagation, and dynamic programming. Given 50-100 seedparadigms, adding a 10-million-word corpus reduces predictionerror for morphological inflections by up to 10%.This is joint work with Jason Eisner, JHU.

DTEND;TZID=America/Los_Angeles:20110715T160000
DTSTART;TZID=America/Los_Angeles:20110715T153000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Discovering Morphological Paradigms from Plain Text Using a Dirichlet Process Mixture Model (EMNLP 2011 practice talk)
UID:20110715T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Considering the adversity of the conditions under which linguistic communication takes place in everyday life---ambiguity of the signal, environmental competition for our attention, speaker error, our limited memory, and so forth---it is perhaps remarkable that we are as successful at it as we are.  Perhaps the leading explanation of this success is that (a) the linguistic signal is redundant, (b) diverse information sources are generally available that can help us obtain infer the intended message (or something close enough) when comprehending an utterance, and (c) we use these diverse information sources very quickly and to the fullest extent possible.  This explanation can be thought of as treating language comprehension as a rational, evidential process.  Nevertheless, there are number of prominent phenomena reported in the sentence processing literature that remain clear puzzles for the rational approach.  In this talk I address three such phenomena, whose common underlying thread is an apparent failure to use information available in a sentence appropriately in global or incremental inferences about the correct interpretation of a sentence.  I argue that the apparent puzzle posed by these phenomena for models of rational sentence comprehension may derive from the failure of existing models to appropriately account for the environmental and cognitive constraints---namely, noisy input and limited memory---under which comprehension takes place.  I present two new probabilistic models of language comprehension under noisy input and limited memory, and show that these models lead to solutions to the above puzzles.  More generally, these models suggest how appropriately accounting for environmental and cognitive constraints can lead to a more nuanced and ultimately more satisfactory picture of key aspects of human cognition.

DTEND;TZID=America/Los_Angeles:20090123T160000
DTSTART;TZID=America/Los_Angeles:20090123T150000
LOCATION:11 Large
SUMMARY:Noise and memory in rational human language comprehension
UID:20090123T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I'll talk about a graph based approach to the string regeneration problem, published at 2014 EACL Student Research Workshop. I will conclude my talk by briefly talking about my PhD research direction of including semantics (MRS) into a state-of-the-art SMT system.String regeneration is the problem of generating a fluent sentence from an unordered list of words. The purpose of investigating and developing approaches to solving the string regeneration problem is grammaticality and fluency improvement of machine generated text. I investigated a graph-based approach to the string regeneration problem that finds a permutation of words with the highest probability under an n-gram language model.Bio: I am a PhD student at University of Cambridge researching integration of semantics and Statistical Machine Translation. I am originally from Ljubljana, Slovenia, where I completed a BSc in Computer Science in 2012. I continued with a masters in Advanced Computer Science at University of Cambridge, graduating in 2013.

DTEND;TZID=America/Los_Angeles:20140702T150000
DTSTART;TZID=America/Los_Angeles:20140702T143000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:A Graph-Based Approach to String Regeneration [Intern talk]
UID:20140702T143000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We describe a new sentence realization framework for text-to-textapplications. This framework uses IDL-expressions as a representationformalism, and a generation mechanism based on algorithms for intersectingIDL-expressions with probabilistic language models. We present boththeoretical and empirical results concerning the correctness andefficiency of these algorithms.

DTEND;TZID=America/Los_Angeles:20050527T163000
DTSTART;TZID=America/Los_Angeles:20050527T150000
LOCATION:11 Small
SUMMARY:Towards Developing Generation Algorithms for Text-to-Text
UID:20050527T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Weighted tree automata are equivalent to weighted tree grammars, which can be used, for example, to easily model weighted context-free grammars. In constrast to context-free grammars, tree automata work directly on a tree representation and not on strings. We will introduce weighted tree automata and review the important results on minimization of them. For example, it is known that deterministic devices over commutative semifields (commutative semirings with multiplicative inverses) can be effectively minimized. In the main part of the talk, we present the first efficient algorithm for this minimization. If the operations can be performed in constant time, then our algorithm constructs an equivalent minimal (with respect to the number of states) deterministic automaton in time linear in the maximal rank of the input symbols, the number of (useful) transitions, and the number of states of the input automaton.

DTEND;TZID=America/Los_Angeles:20090306T160000
DTSTART;TZID=America/Los_Angeles:20090306T150000
LOCATION:11 Large
SUMMARY:Minimizing Deterministic Weighted Tree Automata
UID:20090306T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: To enable naturalistic, context-aware language generation, the underlying models must be flexible but controllable. They must be flexible enough to account for the rich linguistic diversity of data that the model generates and conditions on. On the other hand, generation must be controlled, to lexicalize the same meaning differently, depending upon the social and the situational context. I'll present model-based approaches to multilingual language modeling and open-vocabulary machine translation, aiming at making language generation more flexible by relaxing the (unreasonable but prevalent in the literature) assumption that a model's vocabulary is constrained to a particular set of most frequent words in a particular language. Then, I'll present an approach to controllable text generation that modulates social variables in generated text. Iâll conclude with an overview of ongoing research projects.Bio: Yulia Tsvetkov is an assistant professor in the Language Technologies Institute at Carnegie Mellon University. Her research interests lie at or near the intersection of natural language processing, machine learning, linguistics, and social science. Her current research projects focus on multilinguality (e.g., open-vocabulary machine translation, polyglot models, entrainment in code-switching), controllable text generation, automated negotiation, and NLP for social good (e.g., identification of microaggressions and dehumanization in online interactions, identification of misinformation and agenda-setting in news, predicting scientific misconduct). Prior to joining LTI, Yulia was a postdoc in the department of Computer Science at Stanford University; she received her PhD from Carnegie Mellon University.

DTEND;TZID=America/Los_Angeles:20180511T160000
DTSTART;TZID=America/Los_Angeles:20180511T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Towards Flexible but Controllable Language Generation
UID:20180511T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: String-to-tree machine translation decoders are effective but very slow, especially compared to other decoding approaches.  We explore various methods to identify constraints on the search space, with the aim of improving the efficiency of the syntax-based decoder.

DTEND;TZID=America/Los_Angeles:20080822T161500
DTSTART;TZID=America/Los_Angeles:20080822T154500
LOCATION:11 Large
SUMMARY:Intern Final Talk: Structural constraints for efficient decoding.
UID:20080822T154500@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The language barrier in many of the multilingual natural language processing (NLP) tasks, such as name transliteration, mining bilingual word translations, etc., can be overcome by mapping objects (names and words in the respective tasks) from different languages (or íviewsí) into a common low-dimensional subspace. Multi-view models learn such a low-dimensional subspace using a training corpus of paired objects, e.g. name pairs written in different languages.The central idea of my dissertation is to learn low-dimensional subspaces (or interlingual representations) that are effective for various multilingual and monolingual NLP tasks. First, I demonstrate the effectiveness of interlingual representations in mining bilingual word translations for machine translation, and then proceed to developing models for diverse situations that often arise in NLP tasks. In particular, I design models for 1) bridge setting -- when there are more than two views but we only have training data from a single pivot view into each of the remaining views 2) reranking setting -- when an object from one view is  associated with a ranked list of objects from another view, and finally 3) when the underlying objects have rich structure, such as a tree.These problem settings arise frequently in real world applications. I choose a canonical task for each of the settings and compare my model with existing state-of-the-art baseline systems. I provide empirical evidence for the first two models on multilingual name transliteration and the part-of-speech tagging tasks, respectively. For the third problem setting, I discuss my ongoing work on vector based compositionality learning task. This task aims to find the meaning, represented as a vector in d-dimensional space, of a sentence or a phrase based on the meaning of its constituent words.

DTEND;TZID=America/Los_Angeles:20121012T120000
DTSTART;TZID=America/Los_Angeles:20121012T110000
LOCATION:11th Floor Conference Room [1135]
SUMMARY:Discriminative Interlingual Representations for NLP
UID:20121012T110000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: A portmanteau is a type of compound word that fuses the sounds and meanings of two component words; for example, âfrenemyâ (friend + enemy) or âsmogâ (smoke + fog). We develop a system, including a novel multitape FST, that takes an input of two words and outputs possible portmanteaux. Our system is trained on a list of known portmanteaux and their component words, and achieves 45% exact matches in cross-validated experiments.Bio.Aliya Deri is a PhD candidate at USC/ISI.

DTEND;TZID=America/Los_Angeles:20150529T160000
DTSTART;TZID=America/Los_Angeles:20150529T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:How to Make a Frenemy: Multitape FSTs for Portmanteau Generation
UID:20150529T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We present a survey of resent research into using information visualizationto reveal new insights about linguistic data.  Our recent work includesusing WordNet hyponymy as a basis for document visualization and visualizingthe uncertainty in machine translation in an instant messaging chatcontext.  We will present our preliminary findings and prototypevisualization for machine translation data resulting from a week ofcollaboration with ISI researchers.About the speaker:Christopher Collins is a PhD candidate in information visualization andcomputational linguistics at the University of Toronto.  He works with Prof.Gerald Penn and Prof. Sheelagh Carpendale (University of Calgary).

DTEND;TZID=America/Los_Angeles:20070420T163000
DTSTART;TZID=America/Los_Angeles:20070420T150000
LOCATION:11 Large
SUMMARY:Information Visualization to Support Computational Linguistics
UID:20070420T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract: One of the most ambitious goals of AI is to develop intelligent conversational agents able to communicate with humans and assist them in their tasks. Thus, communication and interaction should be at the core of the learning process of these agents; failure to integrate communication as their main building block raises concerns regarding their usability.In this talk, I will propose an interactive multimodal framework for language learning. Instead of being passively exposed to large amounts of natural text, our learners (implemented as feed-forward neural networks) engage in cooperative referential games starting from a tabula rasa setup, and thus develop their own language from the need to communicate in order to succeed at the game. Preliminary experiments provide promising results, but also suggest that it is important to ensure that agents trained in this way do not develop an ad-hoc communication code only effective for the game they are playing.Bio: Angeliki is a final year PhD student at the Center for Mind/Brain Sciences of the University of Trento. She received her MSc from the Saarland University, where she worked with Ivan Titov and Caroline Sporleder on Bayesian models for sentiment and discourse. She is currently working at the intersection between language and vision under the supervision of Marco Baroni.

DTEND;TZID=America/Los_Angeles:20160513T160000
DTSTART;TZID=America/Los_Angeles:20160513T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Towards Multi-Agent Communication-Based Language Learning
UID:20160513T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This tutorial will be a short introduction to using the Linux cluster atUSC's High-Performance Computing (HPC) facility. Topics will include:(1) basics of starting jobs on the cluster using Torque/PBS,(2) dealing with common problems like jobs not starting orspontaneously dying,(3) maximizing the performance of your jobs (both yours and otherpeople's), e.g., using the correct filesystem and tuning it for better speed,(4) embarrassingly parallel processing and poor-man's workflows.It will NOT coverHadoop,MPI,real workflow management tools like Condor.

DTEND;TZID=America/Los_Angeles:20090911T160000
DTSTART;TZID=America/Los_Angeles:20090911T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Tutorial on HPC
UID:20090911T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Recent advances in end-to-end representation learning have made impressive strides in achieving state-of-the-art results in perception problems on speech, image and natural language. However, the area of affect understanding has mostly relied on off-the-shelf features to solve problems in emotion recognition, multi-modal fusion and generative modeling of affective speech and language. The potential impact of representation learning approaches to this area remains ripe for exploration. My thesis proposal is an important step in this direction. Firstly, I present an overview of my work on AU (Action Unit) detection, speech emotion recognition and glottal inverse filtering through speech modeling. Secondly, I introduce Affect-LM, a novel neural language model for affective text generation which exploits prior knowledge through a dictionary of emotionally colored words (such as the LIWC tool). Finally, I state some upcoming problems in representation learning for affect from speech and multi-modal language modeling which I plan to work on for the remainder of my degree.Sayan is a fourth-year PhD student at the University of Southern California, working at the Behavior Analytics and Machine Learning Group at the ICT(Institute for Creative Technologies) with Prof. Stefan Scherer. He is working on research towards building learning systems for better sensing of human behavior and emotion, and integrating deep learning techniques with human affect. His areas of interest include, but are not limited to deep learning, machine perception, affective computing, speech/signal processing, and generative modeling.

DTEND;TZID=America/Los_Angeles:20170505T160000
DTSTART;TZID=America/Los_Angeles:20170505T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Representation Learning for Human Affect Recognition  (PhD Proposal Practice Talk)
UID:20170505T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Combining Constituent Parsers (Victoria Fossum: 3:00pm -- 3:30pm)Combining the 1-best output of multiple parsers via parse selection orparse hybridization improves f-score over the best individual parser(Henderson and Brill, 1999; Sagae and Lavie, 2006). We propose three ways to improve upon existing methods for parser combination.---------------------------------------------------------Disambiguation of Preposition Sense Using Linguistically MotivatedFeatures (Dirk Hovy: 3:30pm -- 4:00pm)Classifying polysemous words into their proper sense classes ispotentially useful to any NLP application that needs to extractinformation from text or build a semantic representation of thetextual information. Like instances of other word classes, manyprepositions are ambiguous, carrying different semantic meanings(including notions of instrumental, accompaniment, location, etc.)In this paper, we present a supervised classification approach fordisambiguation of preposition senses. We use the SemEval 2007Preposition Sense Disambiguation datasets to evaluate our system andcompare its results to those of the systems participating in theworkshop. We derived linguistically motivated features from both sidesof the preposition. Instead of restricting     these to a fixed windowsize, we utilized the phrase structure. Testing with five differentclassifiers, we can report an increased accuracy (76.4%) thatoutperforms the best system in the SemEval task.

DTEND;TZID=America/Los_Angeles:20090522T160000
DTSTART;TZID=America/Los_Angeles:20090522T150000
LOCATION:11th flr CR
SUMMARY:Practice talks for NAACL HLT
UID:20090522T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: PropBank: the next stage of TreebankNatural-language engineers the world over are coming to a consensus that adegree of semantic knowledge is a necessary addition to purely structuralrepresentations of language.  This talk describes the Propbank project atPenn, which provides a complete shallow semantic parse of the Treebank IIcorpus.Inducing a Chronology of the Pali Canon:Works such as Kroch (1989), Taylor (1994) and Han (2000) have demonstratedthat syntactic change can be described mathematically as the competitionbetween innovating and archaic formations.  This paper demonstrates howthis same mathematical description can be turned around to predict thedate of a historical text.  The Middle Indic period showed dramatic changein the morphological system, such as the collapse of the past-tense verbalsystem.  Whereas Sanskrit had three competing formations, each withmultiple possible morphological realizations, Pali (a Middle Indo-Aryanlanguage) had only a single formation, based mostly on the sigmatic aoristalthough many archaic nonsigmatic aorists are also attested.  Theproportions of the archaic and innovative forms can be easily calculatedfor each text in the Pali Canon and these proportions used to assign anapproximate date for each text.  The accuracy of the method can beassessed qualitatively by comparing the derived chronology to chronologiesbased on various non-linguistic criteria, or quantitatively by comparingthe derived chronology to a known dating scheme.  For the latter it isnecessary to turn to a different dataset, such as that describing the riseof do-support in Early Modern English, as described in Ellegard (1953) andKroch (1989).Bio:Paul Kingsbury graduated summa cum laude in linguistics from Ohio StateUniversity in 1993 with a thesis on "Some sources for L-words inSanskrit".  He subsequently entered the University of Pennsylvania tostudy historical linguistics and Sanskrit, but (like most historicalstudents) was diverted to computational issues.  He joined the Propbankproject in 2000 and soon thereafter engineered a major rethinking of themethods and goals of the project, in order to make the annotationlinguistically meaningful.  He completed his doctorate in 2002 with athesis entitled 'The Chronology of the Pali Canon: the case of theaorist'.

DTEND;TZID=America/Los_Angeles:20040130T163000
DTSTART;TZID=America/Los_Angeles:20040130T150000
LOCATION:11 Large
SUMMARY:PropBank: the next stage of Treebank <b>and</b><br>Inducing a Chronology of the Pali Canon
UID:20040130T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Hadoop is an open-source implementation of the Map/Reduce framework introduced by Google Research. It is a simple abstraction for describing parallelizable algorithms that admits very efficient execution: in one case, one of my (poorly implemented) algorithms was improved from a typical runtime of 72 hours to 3 hours. I will give a short introduction to Hadoop that is highly colored by my experiences with it and the likely experiences of other natural language processing researchers at ISI. I will show how to run Hadoop on HPC, how to use Hadoop Streaming (which allows implementation in any language you choose), and how to define Map/Reduce algorithms for a few incarnations of a typical NLP task, relative-frequency estimation of a large probability distribution. Input from others who are more experienced with Hadoop than I am is welcome!

DTEND;TZID=America/Los_Angeles:20090327T160000
DTSTART;TZID=America/Los_Angeles:20090327T150000
LOCATION:11 Large
SUMMARY:Tutorial on Hadoop
UID:20090327T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: As a discipline of biology, the field of neuroscience suffers greatly frominformation overload, non-standardization and complexity. In the absenceof a mathematical theoretical structure for the subject, scientists usetheir own ad-hoc methods of collating and synthesizing information fromboth the primary literature and their own data. In order to eventuallyformalize and accelerate the development of theoretical approaches in thesubject, we are combining an Electronic Laboratory Notebook (ELN) withasset management of the primary research literature to construct aknowledge engineering framework based around the organizational unit of aneuroscience laboratory. This project, called Â¡NeuroScholarÂ¢(http://www.neuroscholar.org/) is open-source, and is being tested andused in the laboratories of Prof. Larry Swanson and Prof. Alan Watts atUSC. In each laboratory, the system will operate on top of a Â¡laboratorycorpusÂ¢ of knowledge resources (data files, full-text pdf files , etc.)that summarizes the relevant knowledge for that laboratory. Not only willthis collection provide a valuable resource for the members of thelaboratory, it provides a platform for natural language processing andknowledge engineering to answer formally-defined research questions. TheSociety for NeuroscienceÂ¢s annual meeting attracts over 30,000 attendees,who collectively form potential user-base of this software.I will talk about the ideas underlying the project, the currentimplementation of NeuroScholar, developments from collaboration with thenatural language group at ISI and possible collaborations for the future.

DTEND;TZID=America/Los_Angeles:20050617T120000
DTSTART;TZID=America/Los_Angeles:20050617T103000
LOCATION:11 Large
SUMMARY:The neuroscience laboratory as a knowledge factory: challenges, approaches and tools
UID:20050617T103000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I will give a status report on my current thesis work onnoun phrase translation. The motivation of this work isto break up the machine translation problem into smaller,more manageable units. The treatment of noun phrase translationas a subtask of machine translation is both linguisticallyand empirically motivated. My approach is to generatea n-best list of candidate translations with a statisticalmachine translation system and rerank the candidates withadditional features. For about 90% of all noun phrases wecan find an acceptable translation in the 100-best list, whilean acceptable translation comes out on the very top for onlyabout 60% of the noun phrases. I will discuss a variety oflinguistic and empirical features that (may) help to movethe acceptable translations higher in the list. I will alsopresent results modeling issues such as phrase basedtranslation and compound splitting. This talk is alsointended as a fishing expedition for feature suggestions bythe audience.

DTEND;TZID=America/Los_Angeles:20030131T160000
DTSTART;TZID=America/Los_Angeles:20030131T150000
LOCATION:11 Large
SUMMARY:Noun Phrase Translation
UID:20030131T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: One of the key challenges in retrieval is what to do when a query termneeds to be replaced with more than one term. This problem arises inapplications such as cross language information retrieval andthesaurus expansion. One solution is to use structured query methods,which treat all the possible replacements as if they were one queryterm by computing a joint document frequency and a joint termfrequency. This presentation will review prior work on structuredquery techniques and then introduce three new variants that aim toimprove computational efficiency and to leverage estimates ofreplacement probabilities to improve retrieval effectiveness. Themethods have now been tested in cross-language retrieval andOCR-degraded text retrieval applications in which replacementprobability estimates could be estimated. In both applications, thenew structured query methods showed statistically significantimprovements in retrieval effectiveness over previously knownstructured query methods.

DTEND;TZID=America/Los_Angeles:20030314T160000
DTSTART;TZID=America/Los_Angeles:20030314T150000
LOCATION:11 Large
SUMMARY:Improving the Efficiency and Effectiveness of Structured Query Methods
UID:20030314T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We employ statistical methods to analyze, generate, and translate rhythmic poetry. We first apply unsupervised learning to reveal word-stress patterns in a corpus of raw poetry. We then use these word-stress patterns, in addition to rhyme and discourse models, to generate English love poetry. Finally, we translate Italian poetry into English, choosing target realizations that conform to desired rhythmic patterns.

DTEND;TZID=America/Los_Angeles:20101001T153000
DTSTART;TZID=America/Los_Angeles:20101001T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Automatic Analysis of Rhythmic Poetry with Applications to Generation and Translation (EMNLP 2010 Practice Talk)
UID:20101001T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This talk is about modeling the Syntax-Based Machine Translation(SBMT) problem within the Searn (Search & Learn) framework developed by Hal Daume inhis PhD thesis. I will present the way we define the states, actionsand the search space and how to implement the cost function.(Note: This is part of the Summer Intern Series)

DTEND;TZID=America/Los_Angeles:20060823T153000
DTSTART;TZID=America/Los_Angeles:20060823T150000
LOCATION:11 Large
SUMMARY:Towards combining Searn and Syntax-Based Machine Translation (SBMT)
UID:20060823T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract: In NLP, we rely on annotated data to train models. This implicitly assumes that the annotations represent the truth. However, this basic assumption can be violated in two ways: either because the annotators exhibit a certain bias (consciously or subconsciously), or because there simply is not one single truth. In this talk, I will present approaches to deal with both problems.In the case of biased annotators, we can collect multiple annotations and use an unsupervised item-response model to infer the underlying truth and the reliability of the individual annotators. We present a software package, MACE (Multi-Annotator Competence Estimation) with considerable improvements over standard baselines both in terms of predicted label accuracy and estimates of trustworthiness, even under adversarial conditions. Additionally, we can trade precision for recall, achieving even higher performance by focusing on the instances our model is most confident in.In the second case, where not a single truth exists, we can collect information about easily confused categories and incorporate this knowledge into the training process. We use small samples of doubly annotated POS data for Twitter to estimate annotation reliability and show how those metrics of likely inter-annotator agreement can be implemented in the loss functions of structured perceptron. We find that these cost-sensitive algorithms perform better across annotation projects and, more surprisingly, even on data annotated according to the same guidelines. Finally, we show that these models perform better on the downstream task of chunking.Bio:Dirk Hovy is a postdoc in the Center for Language Technology at the University of Copenhagen, working with Anders SÃ¸gaard on improving analysis of low-resource languages. Their recent paper on POS tagging with inter-annotator agreement won the best paper award at EACL 2014.Dirk received his PhD from the University of Southern California (USC), where he was working at the Information Sciences Institute (ISI) on unsupervised relation extraction. He has a background in socio-linguistics and worked on unsupervised and semi-supervised models for relation extraction, temporal links, and WSD, as well as annotator assessment. He is interested in the "human" aspects of NLP, i.e., the individual bias people have when producing or annotating language, and how it affects NLP applications.His other interests include cooking, cross-fit, and medieval art and literature.

DTEND;TZID=America/Los_Angeles:20140616T163000
DTSTART;TZID=America/Los_Angeles:20140616T153000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Two ways to deal with annotation bias
UID:20140616T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: TBA

DTEND;TZID=America/Los_Angeles:20040428T170000
DTSTART;TZID=America/Los_Angeles:20040428T150000
LOCATION:11 Large
SUMMARY:Practice Talks for HLT/NAACL
UID:20040428T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Modeling the ways in which humans produce and perceive various forms of behavioral communication, such as speech, pose many diverse challenges. For instance, from a controls perspective, it is important to understand and model how control and coordination of various biological actuators in human body is achieved order to produce motor actions. From a signal processing perspective, we would like to discover novel representations or system architectures that are used in order to effect this coordination.We present a computational, data-driven approach to derive interpretable movement primitives from speech articulation data in a bottom-up manner. It puts forth a convolutive Nonnegative Matrix Factorization algorithm with sparseness constraints (cNMFsc) to decompose a given data matrix into a set of spatio-temporal basis sequences and an activation matrix. The algorithm optimizes a cost function that trades off the mismatch between the proposed model and the input data against the number of primitives that are active at any given instant. We further argue that such primitives can be modeled using nonlinear dynamical systems in a control-theoretic framework for speech motor control. Specifically, we extend our approach to extract a spatio-temporal dictionary of control primitives (sequences of control parameters), which can then be used to control a dynamical systems model of the vocal tract to produce any desired sequence of movements. Although the method is particularly applied to measured and synthesized articulatory data in our case, the framework is general and can be applied to any multivariate timeseries. The results suggest that the proposed algorithm extracts movement primitives from human speech production data that are linguistically interpretable.

DTEND;TZID=America/Los_Angeles:20131115T160000
DTSTART;TZID=America/Los_Angeles:20131115T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Data-Driven Techniques for Modeling Speech Motor Control
UID:20131115T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This talk will address the problem of assessing the correctness of MToutput on the word level. I will give an overview on word confidencemeasures for SMT.  Different variants of word posterior probabilities thatcan be directly used as confidence measure will be presented. Theirconnection with the Bayes decision rule and the underlying error measurewill be shown. Experimental comparison of different word confidencemeasures will be presented on a translation task consisting of technicalmanuals.Additionally, I will show how word confidence measures can be applied inan interactive SMT system. This system predicts translations, taking partsof the sentence into account that have already been accepted or typed bythe user. Through the use of confidence measures, the performance of theprediction engine can be improved.About the Speaker:Nicola Ueffing is a graduate research assistant at the group for "HumanLanguage Technology and Pattern Recognition" (Lehrstuhl fuer InformatikVI) at RWTH Aachen University. She received her diploma in mathematicsfrom RWTH Aachen University in 2000. Her research topic is statisticalmachine translation, focusing on confidence measures for SMT. In 2003, shewas a member of the team working on "Confidence Estimation for SMT" at theCLSP workshop at JHU.

DTEND;TZID=America/Los_Angeles:20041217T163000
DTSTART;TZID=America/Los_Angeles:20041217T150000
LOCATION:11 Large
SUMMARY:Word-Level Confidence Measures for SMT
UID:20041217T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Like most scientific disciplines, cancer biology involves performing experiments and interpreting them. At present, most modeling efforts center on trying to bring together collections of interpretations as 'pathway diagrams' but do not attempt to capture the semantics of supporting experimental data. Here, I will describe a new strategic approach for machine reading of scientific articles based on a generic representation of experimental data with explicit examples within the field of cancer biology. I will also discuss this effort in the context of the Abstract Meaning Representation (AMR) and present an informal generative story for your consideration and feedback.Bio: Gully Burns develops pragmatic biomedical knowledge engineering systems for scientists that provide directly useful functionality in their everyday use and is based on innovative, cutting edge computer science. He was originally trained as a physicist at Imperial College in London before switching to do a Ph.D. in neuroscience at Oxford. He came to work at USC in 1997, developing the 'NeuroScholar' project in Larry Swanson's lab before joining the Information Sciences Institute in 2006. He is now works as project leader in ISI's Information Integration Group.

DTEND;TZID=America/Los_Angeles:20141114T160000
DTSTART;TZID=America/Los_Angeles:20141114T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Machine Reading of the Biomedical Literature: It's All About Data
UID:20141114T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Recent years have seen a resurgence of Neural Networks  in Natural Language Processing. Much of this success can be attributed to learning compact representations (or embeddings) of words, which are used as input to train standard Neural Network architectures. In the first part of the talk I will describe two approaches for learning word embeddings for large vocabularies. In the second part, I will talk about successful applications of Neural Networks in NLP tasks like Part-Of-Speech tagging, Chunking, Parsing etc. without any feature engineering. I will also describe some preliminary work on Neural Networks for unsupervised Part-Of-Speech tagging.

DTEND;TZID=America/Los_Angeles:20121109T160000
DTSTART;TZID=America/Los_Angeles:20121109T150000
LOCATION:11th Floor Conference Room [1135]
SUMMARY:Neural Networks for NLP
UID:20121109T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: (This will be an extended version of the talk for NAACL-HLT 2007. It'sbased on my summer internship work at IBM T.J. Watson Research Centerlast year.)The project aimed to address the problems encountered when trying tomatch available employees to open job positions, based on skillmatches. Currently, job search applications, like IBM's ProfessionalMarketplace, only find exact matches. A skill affinity computation isdesired to allow searches to be expanded to related/similar skills,and return more potential matches.In this talk, I will explore the problem of computing text similaritybetween verb phrases describing skilled human behavior for the purposeof finding approximate matches. Four parsers (Charniak's parser,Stanford's parser, IBM XSG slot grammar parser, and Lin's MINIPAR) areevaluated on a corpus of skill statements extracted from anenterprise-wide expertise taxonomy. A similarity measure utilizingcommon semantic role features extracted from parse trees was foundsuperior to an information-theoretic measure of similarity andcomparable to the level of human agreement.

DTEND;TZID=America/Los_Angeles:20070518T163000
DTSTART;TZID=America/Los_Angeles:20070518T150000
LOCATION:11 Large
SUMMARY:Computing Semantic Similarity between Skill Statements for Approximate Matching
UID:20070518T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Practical dialogue systems must exploit context to interpret userutterances correctly.  Received views of context and coordination inpragmatic theory equate utterance context with the occurrentsubjective states of interlocutors using notions like common knowledgeor mutual belief.  We argue that these views are not well suited forpractical modeling due to the uncertainty and robustness of contextdependence in human-human dialogue.  We present an alternativecharacterization of utterance context as objective and normative.  Onthis view, an interlocutor's representation of context reflectsprivate uncertainty about the true objective context as determined byprior speaker meanings.  As conversation moves forward, new utterancesprovide interlocutors with retrospective insight about each other'sprior meanings and therefore about what the true context really is.This view reconciles the need for uncertainty with received intuitionsabout coordination, and can directly inform computational approachesto dialogue.Joint work with Matthew Stone, Rutgers and Rich Thomason, MichiganAbout the Speaker:David DeVault is a Ph.D. candidate in the Department of ComputerScience at Rutgers University.  He holds a B.S. in Engineering andApplied Science from the California Institute of Technology and anM.A. in Philosophy from Rutgers University.  David's research aims todevelop techniques to allow computational agents to participate inflexible task-oriented conversations with human beings.  His recentwork has drawn on design challenges encountered in building such anagent to try to articulate practical, learnable, and theoreticallysatisfying representations of context, utterance meaning, and speakerintention for implemented conversational systems.

DTEND;TZID=America/Los_Angeles:20061117T163000
DTSTART;TZID=America/Los_Angeles:20061117T150000
LOCATION:11 Large
SUMMARY:Scorekeeping in an Uncertain Language Game
UID:20061117T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We describe an unsupervised, language-independent model for finding rhyme schemes in poetry, using no prior knowledge about rhyme or pronunciation.

DTEND;TZID=America/Los_Angeles:20110617T160000
DTSTART;TZID=America/Los_Angeles:20110617T154000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Unsupervised Discovery of Rhyme Schemes (ACL practice talk)
UID:20110617T154000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Widely used in speech and language processing, Kneser-Ney (KN) smoothing has consistently been shown to be one of the best-performing smoothing methods. However, KN smoothing assumes integer counts, limiting its potential usesâfor example, inside Expectation-Maximization. In this paper, we propose a generalization of KN smoothing that operates on fractional counts, or, more precisely, on distributions over counts. We rederive all the steps of KN smoothing to operate on count distributions instead of integral counts, and apply it to two tasks where KN smoothing was not applicable before: one in language model adaptation, and the other in word alignment. In both cases, our method improves performance significantly.Hui Zhang is a fourth year PhD student working with Professor David Chiang at the USC Information Sciences Institute. His main research interests are in statistical machine translation and machine learning. He has focused on domain adaptation and smoothing techniques.

DTEND;TZID=America/Los_Angeles:20140425T113000
DTSTART;TZID=America/Los_Angeles:20140425T103000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:[ACL2014 practice talk] Kneser-Ney Smoothing on Expected Counts
UID:20140425T103000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Textual Entailment has been proposed recently as a generic frameworkfor modeling semantic variability in many Natural Language Processingapplications, such as Question Answering, Information Extraction,Information Retrieval and Document Summarization. The TextualEntailment relationship holds between two text fragments, termed textand hypothesis, if the truth of the hypothesis can be inferred fromthe text.In this talk, the Textual Entailment framework will be introduced.I'll then present an algorithm for large-scale Web-based acquisitionof entailment rules, a type of knowledge needed for robust inference.Finally, I will present an unsupervised Relation Extraction approachbased on the Textual Entailment framework.About the speaker:Idan Szpektor is a PhD student under the supervision of Dr. Ido Daganat Bar Ilan University, Israel. His current research activity is inacquisition of knowledge for textual entailment.

DTEND;TZID=America/Los_Angeles:20060811T163000
DTSTART;TZID=America/Los_Angeles:20060811T150000
LOCATION:11 Large
SUMMARY:Textual Entailment: Framework, Learning and Applications
UID:20060811T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 1) A serious bottleneck in the development of trainable text summarizationsystems is the shortage of training data. Constructing such data is a verytedious task, especially because there are in general many differentcorrect ways to summarize a text. Fortunately we can utilize the Internetas a source of suitable training data. In this paper, we present asummarization system that uses the web as the source of training data. Theprocedure involves structuring the articles downloaded from variouswebsites, building adequate corpora of (summary, text) and (extract,text) pairs, training on positive and negative data, and automaticallylearning to perform the task of extraction-based summarization systems.2) Headlines are useful for users who only need information on the maintopics of a story. We present a headline summarization system that isbuilt at ISI for this purpose and is a top performer for DUC2003's task 1,generating very short summaries (10 words or less).

DTEND;TZID=America/Los_Angeles:20030523T160000
DTSTART;TZID=America/Los_Angeles:20030523T150000
LOCATION:11 Large
SUMMARY:A Web-Trained Extraction Summarization System and Headline Summarization at ISI
UID:20030523T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Building AI systems that can process user input, understand it, and generate an engaging and contextually-relevant output in response, has been one of the longest-running goals in AI. Humans use a variety of modalities, such as language and visual cues, to communicate. A major trigger to our meaningful communications are "events" and how they cause/enable future events. In this talk, I will present my research about language comprehension and language generation around events, with a major focus on commonsense reasoning, world knowledge, and context modeling. I will focus on multiple context modalities such as narrative, conversational, and visual. Finally, I will highlight my recent work on language comprehension in the biomedical domain for finding cures for major diseases.Bio: Nasrin Mostafazadeh is a senior research scientist at BenevolentAI labs. She recently got her PhD at the University of Rochester working with James Allen in conversational interaction and dialogue research group. During her PhD, she spent about a year at Microsoft and a summer at Google doing research on various NLP problems. Nasrinâs research focuses on language comprehension, mainly studying events to predict what happens next. She has developed models for tackling various research tasks for pushing AI toward deeper language understanding with applications ranging from story generation to vision & language. Recently, she has been working on language comprehension in the biomedical domain, with the goal of finding cures for major diseases such as cancer by leveraging millions of unstructured data.

DTEND;TZID=America/Los_Angeles:20171208T160000
DTSTART;TZID=America/Los_Angeles:20171208T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:[Canceled] Language Comprehension & Language Generation in Eventful Contexts
UID:20171208T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Translation into morphologically rich languages is an important but recalcitrant problem in machine translation. When confronted with the large vocabulary sizes resulting from various morphological phenomena, the independence assumptions made by standard translation models mean that vast amounts of parallel training data (which do not generally exist) would be necessary to reliably estimate the numerous required parameters. On the other hand, previous attempts to remedy this situation have been unsatisfying either because they were highly language-dependent, or because they failed from a modeling perspective (e.g., they improved performance on long-tail types at the expense of frequent types).We present a simple and effective approach that deals with the problem in two phases. First, a discriminative model is learned to predict inflections of target words from rich source-side annotations. Then, this model is used to create additional sentence-specific phrases that are added to a standard translation model prior to decoding. Our approach relies on morphological analysis of the target language but we show that an unsupervised Bayesian model can also be used in place of a standard supervised analyzer. We report significant improvements in translation quality when translating from English to Russian, Hebrew and Swahili.

DTEND;TZID=America/Los_Angeles:20130710T160000
DTSTART;TZID=America/Los_Angeles:20130710T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Translating into Morphologically Rich Languages with Synthetic Phrases
UID:20130710T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Recurrent neural networks (RNN) have been successfully applied to various Natural Language Processing tasks, including language modeling, machine translation, text generation, etc. However, several obstacles still stand in the way: First, due to the RNN's distributional nature, few interpretations of its internal mechanism are obtained, and it remains a black box. Second, because of the large vocabulary sets involved, the text generation is very time-consuming. Third, there is no flexible way to constrain the generation of the sequence model with  external knowledge. Last, huge training data must be collected to guarantee the performance of these neural models, whereas annotated data such as parallel data used in machine translation are expensive to obtain. This work aims to address the four challenges mentioned above.To further understand the internal mechanism of the RNN, I choose neural machine translation (NMT) systems as a testbed. I first investigate how NMT outputs target strings of appropriate lengths, locating a collection of hidden units that learns to explicitly implement this functionality. Then I investigate whether NMT systems learn source language syntax as a by-product of training on string pairs. I find that both local and global syntactic information about source sentences is captured by the encoder. Different types of syntax are stored in different layers, with different concentration degrees.To speed up text generation, I proposed two novel GPU-based algorithms: 1) Utilize the source/target words alignment information to shrink the target side run-time vocabulary; 2) Apply locality sensitive hashing to find nearest word embeddings. Both methods lead to a 2-3x speedup on four translation tasks without hurting machine translation accuracy as measured by BLEU. Furthermore, I integrate a finite state acceptor into the neural sequence model during generation, providing a flexible way to constrain the output, and I successfully  apply this to poem generation, in order to control the pentameter and rhyme.Based on above success, I propose to work on the following: 1) Go one further step towards interpretation: find unit/feature mappings, learn the unit temporal behavior, and understand different hyper-parameter settings. 2) Improve NMT performance on low-resource language pairs by fusing an external language model, feeding explicit target-side syntax and utilizing better word embeddings.Bio: Xing Shi is a PhD student at ISI working with Prof. Kevin Knight.

DTEND;TZID=America/Los_Angeles:20170721T160000
DTSTART;TZID=America/Los_Angeles:20170721T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Neural Sequence Models: Interpretation and Augmentation
UID:20170721T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Phrase-based translation models usually memorize local translation literally and make independent assumption between phrases which makes it neither generalize well on unseen data nor model sentence-level effects between phrases. We present a new method to model correlations between phrases as a Markov model and meanwhile employ a robust smoothing strategy to provide better generalization. This method defines a recursive estimation process and backs off in parallel paths to infer richer structures. Our evaluation shows an 1.1â3.2% BLEU improvement over competitive baselines for Chinese-English and Arabic-English translation.Bio: Yang Feng is a postdoctoral scholar in Kevin Knight's NLP group in USC/ISI. She got her Ph.D. degree in 2011 from Institute of Computing Technology, Chinese Academy of Sciences. Her interests are machine translation and machine learning, focusing on Bayesian inference and Gaussian process. Now her main work is to improve ISI syntax-based system.

DTEND;TZID=America/Los_Angeles:20140711T160000
DTSTART;TZID=America/Los_Angeles:20140711T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Factored Markov Translation with Robust Modeling
UID:20140711T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Over the last few years, neural network-based deep-learning models achieved good results in various NLP tasks, such as  language modelling, POS tagging, parsing, chunking, and NER. In contrast to discrete models like HMMs, neural models operate by jointly learning continuous input representations (embeddings), and the model to interpret them. These embeddings represent words and/or phrases in a lower-dimensional, latent, syntactic-semantic space and can often be learned in an unsupervised manner.We aim to exploit this property of deep learning to transfer knowledge from resource-rich to resource-poor domains. We facilitate the transfer of knowledge by constraining the learned embeddings of both domains to share as much structural similarity as possible. I will discuss preliminary results for noisy text normalization in Twitter, where the task is to transfer the correct clean words from English to the noisy Twitter domain, and review the main deep learning models for NLP (Bengio et al. (2003, Mnih and Hinton (2007), Collobert and Weston (2008), Mikolov et al. (2010), and Socher et al. (2011)).Bio:Stephan Gouws is a PhD student at Stellenbosch University in South Africa. He is currently on a short-term visit at the ISI. His main research focus is on developing robust, semi-supervised techniques for processing language in and across noisy domains. In 2011 he was also on a 6-month visit to the ISI during which he worked on orthographic normalization of non-standard Twitter text.

DTEND;TZID=America/Los_Angeles:20120706T160000
DTSTART;TZID=America/Los_Angeles:20120706T150000
LOCATION:11th Floor Conference Room [1135]
SUMMARY:Projecting features across domains using deep learning
UID:20120706T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This high-level survey will describe the results of statistical machine translation (SMT) research since 1948. Part of the survey will cover the explosion of work in the past few years that has resulted from intense interest on the part of scientists, funders, and industry. We will also examine the roots of SMT in World War II decipherment activities. Some of the concepts from that era have become core to the field, while others still remain to be picked up.

DTEND;TZID=America/Los_Angeles:20090130T160000
DTSTART;TZID=America/Los_Angeles:20090130T150000
LOCATION:11 Large
SUMMARY:Sixty Years of Statistical Machine Translation
UID:20090130T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: EM has proved to be a great and useful technique for unsupervised learningproblems in natural language.  Unfortunately, it cannot solve everyproblem out there, either because the E-step is intractable, the M-step isintractable or both.  Typically our community resorts to a Viterbiapproximation in this case, which really isn't very justified and caneasily diverge from our expectations (no pun intended). Moreover, EM --like all maximum likelihood methods -- suffers from a need for ad-hoc andundesirable smoothing.  All of these problems -- intractable E- orM-steps, the Viterbi approximation, and the annoyance of smoothing -- aresolved by using Bayesian methods. Moreover, from a theoretic point ofview, the Bayesian paradigm is much more foundationally well justifiedthan the frequentist use of estimators (such as the maximum likelihoodestimator), at some cost in computation (though not as much as you mightbelieve).In this tutorial, I will discuss Bayesian methods as they can be used innatural language processing.  The first half will be background (some ofwhich you probably won't have seen, some of which you probably will haveseen, but which will probably be presented in a different way that you'reused to) including graphical models, EM, priors and pro- (and con-)Bayesian arguments.  The second half of the tutorial will focus on solvingcomplex inference problems, essentially building on what we've seen fromEM.  I'll cover MAP (*not* Bayesian -- if you can't tell me why, then youshould come to the tutorial!), summing, Monte Carlo, MCMC, Laplace,variational and expectation propagation.  Time permitting, I will brieflydiscuss Bayesian discriminative models (basically what a Bayesian usesinstead of SVMs), non-parametric (infinite) models and Bayesian decisiontheory, all of which make use of the inference techniques we will havealready covered.This tutorial is intended to be largely self contained, though I willexpect that you know what probabilities are, what distributions are andthe standard manipulations of conditional/joint distributions. Familiaritywith EM would be helpful, but I'll cover this topic in some depth since itwill be important for understanding the rest of the tutorial.  I hope --though this never really seems to come to fruition -- that this will be asemi-interactive talk and I will attempt to adjust according to whatpeople are interested in and what is putting people to sleep.(see http://www.isi.edu/~hdaume/bayesnlp/ for more information)

DTEND;TZID=America/Los_Angeles:20050622T163000
DTSTART;TZID=America/Los_Angeles:20050622T130000
LOCATION:11 Large
SUMMARY:Beyond EM: Bayesian Techniques for NLP Researchers
UID:20050622T130000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I will give a status report work on information extraction during last10 months. The motivation of this work is to learn extractionpatterns automatically using seed template and web search engine. Myapproach is to generate linguistics patterns and surface patterns andcombine them to compenstate for the respective weaknesses of twopatterns. On the DUC01-test-disasters (67 documents),DUC01-training-disasters (54 documents) I got a 0.34/0.26 f-measurerespectively. In this talk, I will give a status report on ReADproject (with Dr. Chin-Yew Lin).

DTEND;TZID=America/Los_Angeles:20030207T160000
DTSTART;TZID=America/Los_Angeles:20030207T150000
LOCATION:11 Large
SUMMARY:Automatic Pattern Learning for Information Extraction using Web Data
UID:20030207T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We propose a theory that gives formal semantics to word-levelalignments defined over parallel corpora. We use our theory tointroduce a linear algorithm that can be used to derive fromword-aligned, parallel corpora the minimal set of syntacticallymotivated transformation rules that explain human translation data.(joint work with Michel Galley, Kevin Knight, and Daniel Marcu)

DTEND;TZID=America/Los_Angeles:20040206T160000
DTSTART;TZID=America/Los_Angeles:20040206T150000
LOCATION:11 Large
SUMMARY:What's in a Translation Rule?
UID:20040206T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this talk, we will present our research on learning modelswith high cross domain accuracy for subjectivity classification. After asmall introduction about related works and challenges of sentimentanalysis, we will start by presenting new features for subjectivityanalysis. Then, we will present two different paradigms of multi-viewlearning strategies to learn transfer models: multi-view learning withagreement and guided multi-view learning. Then, we will present anexhaustive evaluation based on both paradigms including twostates-of-the-art algorithms and show that accuracy over 91% can beobtained using three views. In our concluding remarks, we will talkabout future extensions of the presented methodology. Then, we willbriefly present the Human Language Technology team of the GREYCLaboratory of the University of Caen Basse-Normandie (France) andpresent projects that are being studied ans further prospects.Biography: Gael Dias is full professor at the University of CaenBasse-Normandie (France). His research interests include unsupervisedmethodologies for text mining, information retrieval and textsummarization. His recent research focuses on Sentiment Analysis,Ontology Learning, Lexical Semantics, Web Personalization andCollaboration, Temporal Information Retrieval, and Paraphrase Extractionand Identification. He has served on program committees of internationalconferences and workshops such as ACL/HLT 2011, COLING 2010, IJCNLP/ACL2009, ACL 2007, HLT-NAACL 2007, COLING/ACL 2006 as well as is/was areviewer for Information Processing and Management, IEEE Transactions onAudio, Speech and Language Processing, Natural Language EngineeringJournal, Journal of Language Resources and Evaluation, Journal ofComputer Speech and Language and  ACM Transactions on Speech andLanguage Processing.

DTEND;TZID=America/Los_Angeles:20111212T170000
DTSTART;TZID=America/Los_Angeles:20111212T160000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Cross Domain Subjectivity Classification using Multi-View Learning
UID:20111212T160000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 3:00pm  Victoria Fossum (Michigan)Exploring the Continuum between Phrase-based and Syntax-based Machine TranslationState-of-the-art statistical machine translation systems use lexicalphrases as the basic unit of translation.  Phrase-based systems cancapture those aspects of translation that are sensitive to local context.Syntax-based systems, on the other hand, make use of linguisticallymotivated syntactic structure, can capture long-distance dependencies andreorderings, and offer greater generalization in translation rules.However, their performance lags that of phrase-based systems.Hierarchical phrase-based translation, introduced by [Chiang 05], providesan elegant framework for exploring the continuum between phrase-based andsyntax-based translation.  This system combines the "formal machinery" ofsyntax-based systems without any "linguistic commitment" to a particularsyntactic structure [Chiang 05].I will present results from my re-implementation of Chiang's hierarchicalphrase-based system, and (if time permits) compare those results with thefollowing systems on Chinese-English translation: ISI's phrase-basedsystem, and ISI's syntax-based system.  Between now and December 2005, Iplan to incrementally explore the space between phrase-based andsyntax-based systems by augmenting these hierarchical phrase-based ruleswith richer syntactic annotation.3:30pm  Liang Huang (Penn) and Hao Zhang (Rochester)Efficient Integration of n-gram Language Models with Syntax-based DecodingWe first give an overview of the ISI syntax-based MT system which is basedon tree-to-string (xRs) translation rules. The biggest problem at thisstage is the inefficiency of the integration of n-gram models.  Withoutn-gram models, the xRs translation rules can be easily binarized withrespect to the foreign language to ensure cubic-time decoding. With n-grammodels, however, binarization without considering both languages will leadto exponential complexity.Inspired by Inversion Transduction Grammar (ITG) (Wu, 97), we will focuson the so-called ITG binarizable rules which count for over 99% of thewhole rule set. A simple linear-time algorithm will be presented to do thebinarization. Decoding with ITG-like rules is of low polynomial complexityin both time and space. We will discuss experimental results on bothefficiency and accuracy of decoding with the new binarization.  If timepermits, we will also present the "hook trick" (inspired by (Eisner andSatta, 99)) to even further reduce the polynomial complexity of thedecoding process.

DTEND;TZID=America/Los_Angeles:20050826T163000
DTSTART;TZID=America/Los_Angeles:20050826T150000
LOCATION:11 Large
SUMMARY:Summer Student Presentations
UID:20050826T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Term weighting methods have been shown to give significant increasesin information retrieval performance. Term weights are typicallycalculated using frequency counts across the whole retrievalcollection, frequency of each term within individual documents andcompensation for varying document length. The presence of pronomialreferences in documents effectively reduces the within document termfrequency of associated words with a consequent effect on term weightsand information retrieval behaviour. This presentation will describean experimental investigation into the impact on information retrievalperformance of broad coverage automatic pronoun resolution. Resultsusing a standard information retieval test collection indicate thatcalculating term weights using a pronoun resolved version of thedocument test collection can improve both fixed cutoff and averageretrieval precision.

DTEND;TZID=America/Los_Angeles:20030321T160000
DTSTART;TZID=America/Los_Angeles:20030321T150000
LOCATION:11 Large
SUMMARY:An Investigation of the Application of Broad Coverage Automatic Pronoun Resolution in Information Retrieval
UID:20030321T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The World Wide Web brought with it an unprecedented level ofinformation overload. Computers are very effective at processing andclustering numerical and binary data, however, the automated conceptualclustering of natural-language data is considerably harder to automate.Many techniques rely on relatively simple keyword-matching techniques orprobabilistic methods to measure semantic relatedness between words anddocuments. However, these approaches do not always accurately captureconceptual relatedness as measured by humans.In this talk I'll briefly discuss a novel use of spreading activation(SA) techniques (primarily from cognitive science) for computingsemantic relatedness between words and/or documents. This is done bymodelling the article hyperlink structure of Wikipedia as an associativenetwork structure for knowledge representation. The SA technique isadapted and several problems are addressed for it to function over thederived Wikipedia hyperlink graph. We evaluate these approaches overstandard document similarity datasets and by user evaluationexperiments, and achieve results which compare favourably with state ofthe art methods.By making use of the collaboratively-created resource Wikipedia, wehereby also overcome a significant problem in making use of spreadingactivation based techniques for information retrieval up to now, asnoted by Crestani (1997): "The problem of building a network whicheffectively represents the useful relations [between concepts] hasalways been the critical point of many of the attempts to use SA in IR.These networks are very difficult to build, to maintain and keep up todate.

DTEND;TZID=America/Los_Angeles:20110204T160000
DTSTART;TZID=America/Los_Angeles:20110204T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Measuring Conceptual Similarity by Spreading Activation over Wikipedia's Hyperlink Graph
UID:20110204T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I will give an overview of this project, focusing on the pieces that my student, Antonios Anastasopoulos, and I have been most involved in. Our work is based on the premise that spoken language resources are more readily annotated with translations than with transcriptions. A first step towards making such data interpretable would be to automatically align spoken words with their translations. I'll present a neural attentional model (Duong et al., NAACL 2016) and a latent-variable generative model (Anastasopoulos and Chiang, EMNLP 2016) for this task.David Chiang (PhD, University of Pennsylvania, 2004) is an associate professor in the Department of Computer Science and Engineering at the University of Notre Dame. His research is on computational models for learning human languages, particularly how to translate from one language to another. His work on applying formal grammars and machine learning to translation has been recognized with two best paper awards (at ACL 2005 and NAACL HLT 2009). He has received research grants from DARPA, CIA, NSF, and Google, has served on the executive board of NAACL and the editorial board of Computational Linguistics and JAIR, and is currently on the editorial board of Transactions of the ACL.

DTEND;TZID=America/Los_Angeles:20170110T160000
DTSTART;TZID=America/Los_Angeles:20170110T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Speech-to-Translation Alignment for Documentation of Endangered Languages
UID:20170110T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: ISI's Tactical Language Project is a system designed to teach Americanshow to speak Arabic through a video game environment. We've taken a FPSengine (Unreal 2003) and re-did the graphics so it looks like you're in atypical Lebanese village. We took away the guns, added speech recognition,and set the player in the middle of it all. The theory is that if youlearn well in a classroom, you'll perform well in a classroom, but if youlearn well in a pseudo-naturalistic environment, you'll perform better inreal life.In a pedagogical context, speech recognition is a hard thing we're tryingto recover signal from noisy language-learner speech--with all of itsmispronunciations, disfluencies, and grammatical errors . Languageunderstanding is hopeless unless you have a good approximation of whatkinds of mistakes learners make, and you can build a system to anticipatethem.Suppose an English language learner says "Water". Is he asking you forwater? Is he telling you there's a puddle in front of you? Is he sayinghis name is "Walter", but with horrible pronunciation? There's a lot ofambiguity involved. In order to disambiguate, we need to look at thespeech signal itself, the utterance's context, the learner's past languageperformance, and details about the learner's mother language as it relatesto English, etc., etc... Only then can we hope to guess what the learneris actually trying to say.And then, of course, once we've made a good guess at the learner's speechintentions, what do we do about it? How do we correct him? How do webalance the consideration of inherent qualities of learner motivation,language errors, learning objectives, and possibly low-confidence speechrecognition, as we generate good pedagogical feedback?This is NLP (primarily statistical) with a bit of pedagogy theory andlinguistic (SLA and phonology) theory sprinkled in.

DTEND;TZID=America/Los_Angeles:20041210T163000
DTSTART;TZID=America/Los_Angeles:20041210T150000
LOCATION:11 Large
SUMMARY:Developing a Language Model for Second Language Learner Speech
UID:20041210T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Performing machine translation with monolingual data instead of parallel data is an interesting problem. Because of the lack of parallel data for many language pairs, solving the problem would arise interesting new use cases.On the road towards this we look at similar but easier problems. In the past improvements on simple substitution ciphers (1:1) were made - Even word substitutions ciphers with large vocabularies were solved for example by a beamsearch approach. This talk concentrates on the more complicated cipher class of homophonic substitution ciphers (1:m) like the famous Z408 of the Zodiac killer or the second page of the Beale cipher. We preset a method based on beamsearch. Covered aspects are an improved heuristic, the order the beamsearch should explore the search space, pruning, and the impact of the cipher lengths and cipher alphabet size on the deciphering accuracy.Bio:Julian Schamper studies computer science at RWTH Aachen University. He did its bachelor thesis in the field of deciphering foreign language and works as a student research assistant at Prof. Hermann Ney's Human Language Technology and Pattern Recognition Group.

DTEND;TZID=America/Los_Angeles:20140611T160000
DTSTART;TZID=America/Los_Angeles:20140611T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Solving Homophonic Sustitution Ciphers [Intern talk]
UID:20140611T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Human face-to-face communication is a little like a dance, in that participants continuously adjust their behaviors based on verbal and nonverbal displays and signals. Human interpersonal behaviors have long been studied in linguistic, communication, sociology and psychology. The recent advances in machine learning, pattern recognition and signal processing enabled a new generation of computational tools to analyze, recognize and predict human communication behaviors during social interactions. This new research direction have broad applicability, including the improvement of human behavior recognition, the synthesis of natural animations for robots and virtual humans, the development of intelligent tutoring systems, and the diagnoses of social disorders (e.g., autism spectrum disorder).In this talk, I will present some of our recent work modeling multiple aspects of human communication dynamics, including behavioral dynamic, multimodal dynamic and interpersonal dynamic. I will describe the different computational models specifically designed model these dynamics, including the Latent-Dynamic Conditional Random Fields, Multi-view Hidden Conditional Random Fields and the Latent Mixture of Discriminative Experts. I will show how these technologies can be applied to real-world problems such as negotiation outcome prediction, YouTube opinion mining, group learning analytics and psychological distress indicators. Finally, I will summarize our recent progress in integrating these sensing technologies with a virtual human for healthcare application.Bio:Louis-Philippe Morency is a Research Assistant Professor in the Department of Computer Science at the University of Southern California (USC) and Research Scientist at the USC Institute for Creative Technologies where he leads the Multimodal Communication and Machine Learning Laboratory (MultiComp Lab). He received his Ph.D. and Master degrees from MIT Computer Science and Artificial Intelligence Laboratory. His research interests are in computational study of nonverbal social communication, a multi-disciplinary research topic that overlays the fields of multimodal interaction, computer vision, machine learning, social psychology and artificial intelligence. Dr. Morency was selected in 2008 by IEEE Intelligent Systems as one of the Ten to Watch for the future of AI research. He received 6 best paper awards in multiple ACM- and IEEE-sponsored conferences for his work on context-based gesture recognition, multimodal probabilistic fusion and computational modeling of human communication dynamics. His work was reported in The Economist, New Scientist and Fast Company magazines.

DTEND;TZID=America/Los_Angeles:20130222T160000
DTSTART;TZID=America/Los_Angeles:20130222T150000
LOCATION:6th Floor Conference Room [689]
SUMMARY:Modeling Human Communication Dynamics: From Depression Assessment to Multimodal Sentiment Analysis
UID:20130222T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Facts and their relations extracted from web are commonly modeled as graphs with different types of vertices. In this work, we focus on the problem of revealing latent entities from a $k$-partite graph, by co-clustering $k$ types of different vertices. We propose a CoSum approach, which creates a summary graph, where each super node (a cluster of original vertices) represents a hidden entity and the weighted edges encode important relations among extracted entities. The resulted summary graph also allows for investigation and interpretation of hidden entities. Evaluation verifies that CoSum outperforms several baselines in terms of entity coherence, query supporting and recovering hidden victims in the applied human trafficking domain.Bio: Linhong Zhu is currently a computer scientist at Information Sciences Institute, University of Southern California, where she also received training as a Postdoctoral Research Associate. Before that, she worked as a Scientist-I in data analytics department at Institute for Infocomm Research, Singapore. She obtained her Ph.D. degree in computer engineering from Nanyang Technological University, Singapore in 2011. Her research interests are large-scale graph analytics with applications to social network analysis, social media analysis, and predictive modeling. She has been awarded with University of Southern California Postdoctoral travel and training award in 2014 and her paper has been selected as two of the best papers in SIGMOD 2010.

DTEND;TZID=America/Los_Angeles:20160205T160000
DTSTART;TZID=America/Los_Angeles:20160205T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Deciphering Dark Web through k-partite Graph Summarization
UID:20160205T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Paraphrases are textual expressions that convey the same meaning using different surface forms. Capturing the variability of language, they play an important role in many natural language applications including question answering, machine translation, and multi-document summarization. In linguistics, paraphrases are characterized by approximate conceptual equivalence. Since no automated semantic interpretation systems available today can identify conceptual equivalence, paraphrases are difficult to acquire without human effort. The aim of this thesis is to develop methods for automatically acquiring and filtering phrase-level paraphrases using a monolingual corpus.Noting that the real world uses far more quasi-paraphrases than the logically equivalent ones, we first present a general typology of quasi-paraphrases together with their relative frequencies. To our knowledge the first one ever. We then present a method for automatically learning the contexts in which quasi-paraphrases obtained from a corpus are mutually replaceable. Knowing that quasi-paraphrases are often inexact because they contain semantic implications which can be directional, we present an algorithm called LEDIR to learn the directionality of quasi-paraphrases. Since semantic classes play a crucial role in our work, we also investigate the use of a semi-supervised clustering algorithm for learning semantic classes.We next investigate the task of learning surface paraphrases, i.e., paraphrases that do not require the use of any syntactic interpretation. Since one would need a very large corpus to find enough surface variations, we use a really large but unprocessed corpus of 150GB (25 billion words) obtained from Google News to do this learning. We show that these paraphrases can be used to learn surface patterns for relation extraction. Finally, we use paraphrases to learn patterns for domain-specific information extraction.Thus, in this thesis we define quasi-paraphrases, present methods to learn them from a corpus, and show that quasi-paraphrases are useful for information extraction.

DTEND;TZID=America/Los_Angeles:20090417T160000
DTSTART;TZID=America/Los_Angeles:20090417T150000
LOCATION:11 Large
SUMMARY:Learning Paraphrases from Text (Ph.D. Defense practice talk)
UID:20090417T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We present a simple, language-independent method for integrating recovery of empty elements into syntactic parsing. This method outperforms the best published method we are aware of on English and a recently published method on Chinese.This is a joint work with David Chiang and Yoav Goldberg

DTEND;TZID=America/Los_Angeles:20110527T160000
DTSTART;TZID=America/Los_Angeles:20110527T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Language-Independent Parsing with Empty Elements
UID:20110527T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract: Grapheme-to-phoneme (g2p) models are rarely available in low-resource languages, as the creation of training and evaluation data is expensive and time-consuming. We use Wiktionary to obtain more than 650k word-pronunciation pairs in more than 500 languages. We then develop phoneme and language distance metrics based on phonological and linguistic knowledge; applying those, we adapt g2p models for high-resource languages to create models for related low-resource languages. We provide results for models for 229 adapted languages.Bio: Aliya Deri is a PhD candidate in Computer Science at USC, advised by Professor Kevin Knight.

DTEND;TZID=America/Los_Angeles:20160708T160000
DTSTART;TZID=America/Los_Angeles:20160708T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Title: Grapheme-to-Phoneme Models for (Almost) Any Language
UID:20160708T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We discuss the relevance of k-best parsing to recent applications innatural language parsing, and develop algorithms that substantiallyimprove on previously-used algorithms with respect to efficiency,scalability, and accuracy. We demonstrate these algorithms in experimentson Bikel's implementation of Collins' lexicalized PCFG model, and on asynchronous CFG based decoder for statistical machine translation. We showin particular how the improved output of our algorithms has the potentialto improve results from parse reranking systems and other applications.In this talk, I will demonstrate the convergence of several popularparsing formalisms (weighted deduction, shared forest, semiring) under thepowerful hypergraph formalism. If time permits, I will also show howgeneric Dynamic Programming can be formalised as hypergraph searching.Joint work with David Chiang (University of Maryland)

DTEND;TZID=America/Los_Angeles:20050610T163000
DTSTART;TZID=America/Los_Angeles:20050610T150000
LOCATION:11 Large
SUMMARY:Better k-best Parsing, Hypergraphs and Dynamic Programming
UID:20050610T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The School of Information Technologies at the University of Sydney hashad a 3 year partnership with the Intensive Care Unit at the RoyalPrince Alfred Hospital, Sydney. In that time they have managed 8 jointprojects aimed at producing software solutions that enhanceproductivity in the Unit and in some cases enabled entirely newfunctionalities in their information systems. The principle motivationfor the research is the processing of the narratives in clinical notesbut concomitant problems in information systems have also been tackledand the combination of the two disciplines have led to the two relatedprocessing systems to be described in this presentation.- Ward Rounds Information Systems (WRIS) & Handovers -The WRIS is designed to support the work of all clinical staff intheir ward rounds activities. The system, when activated,automatically populates from the resident clinical database a proforma report with the most recent relevant data about the patient,such as vital signs, pathology reports, and other diagnosticmeasurements, presented as a web page. The clinical staff then writetheir progress notes into the web page which converts the text toSNOMED CT codes and other relevant concepts and entities. Theclinician is given the opportunity to change any analyses done by theprocessor. This clinician approved data is loaded to the patientrecord. The essential elements of this system, that is computing anextract of the patient record, accepting narrative input, andanalysing the text for coding, is a productivity gain of itself, butmore importantly, also constitutes the beginning of a hospital wideHandovers System for use throughout each step in the patientjourney. This system is being tested at the RPAH ICU in readiness forward usage. The impact of this system in improving the quality andsafety of handovers has the potential to be very significant.- Clinical Data Analytics Language (CDAL) -General purpose access to data from clinical information systems,beyond retrieval for point of care work, is needed for many aspects ofthe hospital's work particularly for clinical research, logistics &operational planning, and auditing patient safety. Most currentclinical systems only provide access to data identified in standardreports with no flexibility to make ad hoc enquiries or to pursue newdirections of enquiry. The clinical data analytics language developedenables the expression of any question that can be answered from thedata in the database in a restricted natural language. A prototype ofthe language has been developed for the CareVue information systemused in the ICU at the Royal Prince Alfred Hospital. It provides forthe use of local medical dialects, SNOMED CT terminology including allforms of collective expressions in SNOMED (e.g. infectious diseases),specification of patient groups, a variety of statistical functions,and constraints over any medical variable, Time, and Location. CDAL isgeneral in that it can be bolted on to any clinical information systemand is applicable to any clinical specialisation.

DTEND;TZID=America/Los_Angeles:20071017T163000
DTSTART;TZID=America/Los_Angeles:20071017T153000
LOCATION:11 Large
SUMMARY:Enhancement Technologies for ICU Information Systems
UID:20071017T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Our research aims at building interactive robots and agent that can expand their knowledge by interacting with human users. In this talk, I will give an overview of our ongoing work on learning novel tasks from linguistic, mixed-initiative instructions. The first part of the talk will address the problems of situated language comprehension for cognitive agents in real-world environments. The second part will focus on task learning. I will discuss the knowledge representations we employ to represent hierarchical, goal-oriented tasks and how this knowledge can be learned from interactions using an explanation-based learning framework.Bio: Shiwali Mohan is a Ph.D. candidate in the department of Computer Science and Engineering at the University of Michigan, Ann Arbor. Her research interests include situated language, interactive learning, and cognitive systems.

DTEND;TZID=America/Los_Angeles:20131206T160000
DTSTART;TZID=America/Los_Angeles:20131206T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Learning Hierarchical Tasks from Situated Interactive Instruction
UID:20131206T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Recently, several datasets have become available which represent natural language phenomena as graphs. Hyperedge Replacement Languages (HRL) have been the focus of much attention as a formalism to represent the graphs in these datasets. Chiang et al. (2013) prove that HRL graphs can be parsed in polynomial time with respect to the size of the input graph. We believe that HRL may be more expressive than is necessary to represent semantic graphs and we propose looking at Regular Graph Languages (RGL; Courcelle, 1991), which is a subfamily of HRL, as a possible alternative. We provide a top-down parsing algorithm for RGL that runs in time linear in the size of the input graph.Bio:Sorcha is a 2nd year PhD student at the University of Edinburgh and is a student in the Center for Doctoral Training in Data Science. Her PhD is focused on formal languages of graphs for NLP and her supervisors are Adam Lopez and Sebastian Maneth. She completed her undergraduate degree in mathematical sciences at University College Cork and her masters degree in data science at the University of Edinburgh. She is at ISI as an intern in the NLP group.Live here: http://webcastermshd.isi.edu/Mediasite/Play/c523b7ef95b443e8b29cfac3092e00081d

DTEND;TZID=America/Los_Angeles:20170714T160000
DTSTART;TZID=America/Los_Angeles:20170714T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Parsing Graphs with Regular Graph Grammars
UID:20170714T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Information can now be found on almost any topic ranging from news to do-it-yourself guides to health-related articles.  Unfortunately for readers, the complexity and readability of these texts can vary widely.  Even if the concepts of an article are accessible, the language and structure of the text can prohibit a person from understanding these concepts.Text simplification techniques are aimed at reducing the reading and grammatical complexity of text while retaining the meaning and are one approach to increasing information accessibility.  Motivated by both corpus analyses and human experiments, I will introduce a number of recent text simplification techniques ranging from semi-automated approaches, that require a human in the loop, to automated approaches, including word-level, phrase-level and syntax-level models.Bio: David Kauchak is currently an assistant professor in the Computer Science Department at Pomona College.  Previously, he was at Middlebury College and has worked at Google, ISI, PARC and Adchemy.  He received his Ph.D. in Computer Science from University of California, San Diego.

DTEND;TZID=America/Los_Angeles:20150424T160000
DTSTART;TZID=America/Los_Angeles:20150424T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Learning To Simplify Text One Sentence at a Time
UID:20150424T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: TBA

DTEND;TZID=America/Los_Angeles:20040312T163000
DTSTART;TZID=America/Los_Angeles:20040312T150000
LOCATION:11 Large
SUMMARY:About My Thesis Proposal
UID:20040312T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Following the recent adoption by the machine translation community ofautomatic evaluation using the BLEU/NIST scoring process, we conduct anin-depth study of a similar idea for evaluating summaries. The resultsshow that automatic evaluation using unigram co-occurrences betweensummary pairs correlates surprising well with human evaluations, basedon various statistical metrics; while direct application of the BLEUevaluation procedure does not always give good results.

DTEND;TZID=America/Los_Angeles:20030516T160000
DTSTART;TZID=America/Los_Angeles:20030516T150000
LOCATION:11 Large
SUMMARY:Automatic Evaluation of Summaries Using N-gram Co-Occurrence Statistics
UID:20030516T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This talk is about an optimization problem that was shown to beNP-hard: computing optimal alignments for the IBM-3 translationmodel. I will show that in practice it can be solved quite efficientlyvia Integer Linear Programming. In addition to using a standard solverI will also show problem-specific preprocessing techniques: byderiving upper and lower bounds, a large number of variables can beremoved from the start.Short Bio: Thomas Schoenemann was born and grew up in Germany. He studiedComputer Science at RWTH Aachen, Germany, where he got a diploma in2005, having written his diploma thesis on the topic of confidencemeasures in machine translation in the group of HermannNey. Afterwards he went to the University of Bonn, Germany, to do hisPh.D. thesis in computer vision in the years 2006-2008. Up to a monthago he was a postdoc in the vision group at Lund University, Sweden,where he also resumed his work on translation. Currently he is takinga time off to explore other fields and broaden his scope.

DTEND;TZID=America/Los_Angeles:20110415T160000
DTSTART;TZID=America/Los_Angeles:20110415T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Computing Viterbi Alignments via Integer Linear Programming
UID:20110415T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this paper, we address the problem of extracting data records andtheir attributes from unstructured biomedical full text. There hasbeen little effort reported on this in the research community. Weargue that semantics is important for record extraction orfiner-grained language processing tasks. We derive a data recordtemplate including semantic language models from unstruc-tured textand represent them with a dis-course level Conditional Random Fields(CRF) model. We evaluate the approach from the perspective ofInformation Extrac-tion and achieve significant improvements on systemperformance compared with other baseline systems.

DTEND;TZID=America/Los_Angeles:20070615T113000
DTSTART;TZID=America/Los_Angeles:20070615T110000
LOCATION:11 Large
SUMMARY:Extracting Data Records from Unstructured Biomedical Full Text
UID:20070615T110000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I will present some preliminary results on the problem of domainadaptation in maximum entropy models, specifically in the case when thereis a large amount of "out of domain" data, and only a very small amount of"in domain" data.  The model and algorithms I present are based on thetechnique of conditional Expectation Maximization (CEM) and allow forrelatively fast optimization of these models.  Preliminary results on sometasks are quite promising.

DTEND;TZID=America/Los_Angeles:20040924T163000
DTSTART;TZID=America/Los_Angeles:20040924T150000
LOCATION:11 Large
SUMMARY:Domain Adaptation in Maximum Extropy Models
UID:20040924T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The cross-entropy loss commonly used in deep learning is closely related to the defining properties of optimal representations, but does not enforce some of the key properties. We show that this can be solved by adding a regularization term, which is in turn related to injecting multiplicative noise in the activations of a Deep Neural Network, a special case of which is the common practice of dropout. We show that our regularized loss function can be efficiently minimized using Information Dropout, a generalization of dropout rooted in information theoretic principles that automatically adapts to the data and can better exploit architectures of limited capacity. When the task is the reconstruction of the input, we show that our loss function yields a Variational Autoencoder as a special case, thus providing a link between representation learning, information theory and variational inference. Finally, we prove that we can promote the creation of disentangled representations simply by enforcing a factorized prior, a fact that has been observed empirically in recent work. Our experiments validate the theoretical intuitions behind our method, and we find that information dropout achieves a comparable or better generalization performance than binary dropout, especially on smaller models, since it can automatically adapt the noise to the structure of the network, as well as to the test sample.arXiv: https://arxiv.org/abs/1611.01353Bio: Alessandro Achille is a PhD student in Computer Science at UCLA, working with Prof. Stefano Soatto. He focuses on variational inference, representation learning, and their applications to deep learning and computer vision. Before coming to UCLA, he obtained a Master's degree in Pure Math at the Scuola Normale Superiore in Pisa, where he studied model theory and algebraic topology with Prof. Alessandro Berarducci.

DTEND;TZID=America/Los_Angeles:20170307T120000
DTSTART;TZID=America/Los_Angeles:20170307T110000
LOCATION:6th Floor Conference Room [689]
SUMMARY:Information Dropout: Learning Optimal Representations Through Noisy Computation
UID:20170307T110000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Syntax-based translation models should in principle be efficient with polynomially-sized search space, but in practice they are often embarassingly slow, partly due to the cost of language model integration. In this paper we borrow from phrase-based decoding the idea to generate a translation incrementally left-to-right, and show that for tree-to-string models, with a clever encoding of derivation history, this method runs in average case polynomial-time in theory, and linear-time with beam search in practice (whereas phrase-based decoding is exponential-time in theory and quadratic-time in practice). Experiments show that, with comparable translation quality, our tree-to-string system (in Python) can run more than 30 times faster than the phrase-based system Moses (in C++).

DTEND;TZID=America/Los_Angeles:20101001T160000
DTSTART;TZID=America/Los_Angeles:20101001T153000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Efficient Incremental Decoding for Tree-to-String Translation (EMNLP 2010 Practice Talk)
UID:20101001T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract: We advance the state of the art in biomolecular interaction extraction with three contributions: (i) We show that deep, Abstract Meaning Representations (AMR) significantly improve the accuracy of a biomolecular interaction extraction system when compared to a baseline that relies solely on surface- and syntax-based features; (ii) In contrast with previous approaches that infer relations on a sentence-by-sentence basis, we expand our framework to enable consistent predictions over sets of sentences (documents); (iii) We further modify and expand a graph kernel learning framework to enable concurrent exploitation of automatically induced AMR (semantic) and dependency structure (syntactic) representations. Our experiments show that our approach yields interaction extraction systems that are more robust in environments where there is a significant mismatch between training and test conditions.Bio: Sahil Garg is a PhD student, advised by Prof. Aram Galstyan, in computer science department of Viterbi school of engineering at University of Southern California. He is interested in problem oriented research. In the past, he developed machine learning, information theoretic algorithms for real world problems such as sensing environmental dynamics using mobile robotic sensors. In this talk, he is going to discuss his recent work on extracting bio-molecular interactions from bio-medical text using semantic parsing, especially in relevance to Cancer disease.

DTEND;TZID=America/Los_Angeles:20160311T160000
DTSTART;TZID=America/Los_Angeles:20160311T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Extracting Biomolecular Interactions Using Semantic Parsing of Biomedical Text
UID:20160311T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Most existing theory of structured prediction assumes exact inference, which is often intractable in many practical problems. This leads to the routine use of approximate inference such as beam search but there is not much theory behind it. Based on the structured perceptron, we propose a general framework of "violation-fixing" perceptrons for inexact search with a theoretical guarantee for convergence under new separability conditions. This framework subsumes and justifies the popular heuristic "early-update" for perceptron with beam search (Collins and Roark, 2004). We also propose several new update methods within this framework, among which the "max-violation" method dramatically reduces training time (by 3 fold as compared to early-update) on state-of-the-art part-of-speech tagging and incremental parsing systems.

DTEND;TZID=America/Los_Angeles:20120525T160000
DTSTART;TZID=America/Los_Angeles:20120525T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Structured Perceptron with Inexact Search (NAACL HLT Practice Talk)
UID:20120525T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Standard phrase-based translation models do not explicitly model context dependence between translation units. As a result, they rely on large phrase pairs and target language models to recover contextual effects in translation. In this work, we explore language models over Minimal Translation Units (MTUs) to explicitly capture contextual dependencies across phrase boundaries in the channel model. As there is no single best direction in which contextual information should flow, we explore multiple decomposition structures as well as  dynamic bidirectional decomposition. The resulting models are evaluated  in an intrinsic task of lexical selection for MT as well as a full MT system, through n-best re-ranking. These experiments demonstrate that additional contextual modeling does indeed benefit a phrase-based system(up to 2.8 BLEU score) and that the direction of conditioning is important. Integrating multiple conditioning orders provides consistent benefit, and the most important directions differ by language pair.

DTEND;TZID=America/Los_Angeles:20130412T160000
DTSTART;TZID=America/Los_Angeles:20130412T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Beyond Left-to-Right: Multiple Decomposition Structures for SMT
UID:20130412T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Advances in technologies to capture and process multimedia signals are enabling new opportunities for understanding and modeling human behavior, and designing new human-centered applications. Intelligent environments equipped with a range of audio-visual sensors provide suitable means for automatically monitoring and tracking the behavior, strategies and engagement of the participants in multiperson interactions such as meetings, at various levels of interest. We describe a case study of a "Smartroom" being developed at USC in which high-level features are calculated from active speaker segmentations, automatically annotated by our system, to infer the interaction dynamics between the participants. The results show that it is possible to accurately estimate in real-time not only the flow of the interaction, but also how dominant and engaged each participant was during the discussion.Additionally, we describe analysis of human expressive behavior that can be afforded by such audio-visual data. We describe an analysis of the interrelation between facial gestures and speech using a multimodal approach. Using a controlled setting, motion capture technology was used to simultaneously acquire speech and detailed facial information. Our results indicate that the verbal and non-verbal channels of human communication are internally and intricately connected. The interplay is observed across the different communication channels such as various aspects of speech, facial expressions, and movements of the hands, head and body, and is greatly affected by the linguistic and emotional content of the message being communicated. As a result of the analysis, applications in automatic emotion recognition and synthesis of expressive communication are presented.[This research was supported in part by funds from the NSF, NIH, and the Department of the Army]

DTEND;TZID=America/Los_Angeles:20090227T160000
DTSTART;TZID=America/Los_Angeles:20090227T150000
LOCATION:11 Large
SUMMARY:Multimodal Processing of Human Behavior in Intelligent Instrumented Spaces: A Focus on Expressive Human Communication
UID:20090227T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Speech is a crucial component in human computer interaction. Whiletremendous progress has been made in automatic speech recognition, speechtranscription -- which is the output of automatic speech recognition -- isfar from providing all the information that one could retrieve fromspeech. For example, prominence, pause, rhythm, and rate of speech allcarry important information in speech and are crucial in speechperception. Inclusion of such information can facilitate better machinerecognition and understanding of speech.In this talk, we will introduce the research effort and result in speechrate, prominence, disfluency and utterance boundary detection. We willalso show some interesting applications utilizing these features innatural language understanding and dialog management.

DTEND;TZID=America/Los_Angeles:20050325T163000
DTSTART;TZID=America/Los_Angeles:20050325T150000
LOCATION:11 Large (THIS HAS CHANGED!!!)
SUMMARY:Metalinguistic feature study for spontaneous speech in human computer interaction
UID:20050325T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: 

DTEND;TZID=America/Los_Angeles:20030822T160000
DTSTART;TZID=America/Los_Angeles:20030822T150000
LOCATION:11 Large
SUMMARY:Information Extraction, IR and QA
UID:20030822T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Since part 1 of the Perl tutorial didn't cover the juicy bits (like aunique function in Perl), based on feedback from participants, I amoffering a part 2 "Perl - Advanced Magick" covering:o the slides from roughly page 40- The Schwartzian Transform- Dissecting a programo What to do, if you do need popen or backticks?o OO Perl - a starto C embedding - definitely only a "start here"o Useful recipes, e.g. interpolating variables in configurationscripts from Perl values.If there is something you are especially interested in seeing, pleasesend me an email

DTEND;TZID=America/Los_Angeles:20061103T170000
DTSTART;TZID=America/Los_Angeles:20061103T153000
LOCATION:11 Large
SUMMARY:perl part 2 - advanced magick
UID:20061103T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Leading Question-Answering systems employ a variety of means to boost theaccuracy of their answers.  Such methods include redundancy (getting thesame answer from multiple documents/sources), deeper parsing of questionsand texts (hence improving the accuracy of confidence measures),inferencing (proving the answer from information in texts plus backgroundknowledge) and sanity-checking (verifying that answers are consistent withknown facts).  To our knowledge, however, no QA system deliberately asksadditional questions in order to derive constraints on the answers to theoriginal questions.We present in this talk the method of QA-by-Dossier-with-Constraints (QDC).This is an extension of the simpler method of QA-by-Dossier, in whichdefinitional questions ("Who/what is X") are addressed by asking a set ofquestions about anticipated properties of X.  In QDC, the collection ofDossier candidate answers, along with possibly other answers to questionsasked expressly for this purpose, are subjected to satisfying a set ofnaturally-arising constraints.  For example, for a "Who is X" question, thesystem will ask about birth, accomplishment and death dates, which, if theyexist, must occur in that order, and also obey other constraints such aslifespan.  Temporal, spatial and kinship relationships seem to beparticularly amenable to this treatment, but it would seem that almost any"factoid" question can benefit from QDC.  We will discuss the setting-upand application of constraint networks, and talk about how (and whether) todevelop the constraint sets automatically.  We will demonstrate severalapplications of QDC, and present one evaluation in which the F-measure fora set of questions improved with QDC from .39 to .69.

DTEND;TZID=America/Los_Angeles:20040116T150000
DTSTART;TZID=America/Los_Angeles:20040116T140000
LOCATION:11 Large
SUMMARY:Using Constraints to Improve Question-Answering Accuracy
UID:20040116T140000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: If you understood all of the world's languages, you would still not beable to read many of the texts that you find on the world wide web,because they are written in non-Roman scripts -- often ones that havebeen arbitrarily encoded for electronic transmission in the absence ofan accepted standard.  This very modern nuisance reflects a dilemma asancient as writing itself: the association between a language as it isspoken and its written form has a sort of internal logic to it that wecan comprehend, but the conventions are different in every individualcase --- even among languages that use the same script, or betweenscripts used by the same language.  This conventional associationbetween language and script, called a <i>writing system</i>, is indeedreminiscent of the Saussurean conception of language itself, aconventional association of meaning and sound, upon which modernlinguistic theory is based.  Despite linguists' reliance upon writingto present and preserve linguistic data, however, writing systems werea largely forgotten corner of linguistics until the 1960s, when Gelbpresented their first classification.This talk will describe recent work that aims to place the study ofwriting systems upon a sound computational and statistical foundation.While archaeological decipherment may eternally remain the holy grailof this area of research, it also has applications to speechsynthesis, machine translation, and multilingual document retrieval.

DTEND;TZID=America/Los_Angeles:20070126T163000
DTSTART;TZID=America/Los_Angeles:20070126T150000
LOCATION:11 Large
SUMMARY:The Quantitative Study of Writing Systems
UID:20070126T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract: Today's machine translation system are highly complex and extending them often means leaving highly sophisticated solutions and established algorithms behind. Therefore it is attractive to try to extend the process outside of the translation system: in pre- and post-processing steps.I will show a pre-processing step for helping to translate tweets and a post-processing step that helps "guess" the translations of unknown and thus untranslated words in arbitrary sentences using dictionaries and other resources.Bio: Sebastian is currently pursuing a CS masters degree in Dresden, Germany with Prof. Heiko Vogler, taking a break from studying to work on low-resource machine translation with Prof. Kevin Knight and Prof. Daniel Marcu as an ISI intern in 2016.

DTEND;TZID=America/Los_Angeles:20160729T160000
DTSTART;TZID=America/Los_Angeles:20160729T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Let's not be clever: simple pre- and post-processing tricks in machine translation
UID:20160729T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The medieval Voynich Manuscript has been called "the mostmysterious document in the world".  Its pages contain bizarre drawingsof strange plants and astrological diagrams, as well as an undecipheredscript of 20,000 running words, written in a character set that has neverbeen seen elsewhere.  Its origin is also controversial, with many theoriesabounding.  I will describe the document, show samples, explain where itmay have come from, and present some properties of the text.This will more of a history/mystery talk thana computer science talk.

DTEND;TZID=America/Los_Angeles:20070309T163000
DTSTART;TZID=America/Los_Angeles:20070309T150000
LOCATION:11 Large
SUMMARY:The Voynich Manuscript
UID:20070309T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The Prague Dependency Treebank project is aimed at a linguisticallycomplex, multi-tier annotation of relatively large amounts of naturallyoccuring sentences of natural language. There are four tiers at present:the basic token tier (level 0), and the morphological, surface-syntacic,and semantic (called "tectogrammatics") tiers. The syntactic andtectogrammatic tiers are based on a richly labelled dependencyrepresentation principle. So far, the project produced three corpora: theCzech-language-only Prague Dependency Treebank, the Prague Czech-EnglishDependency Treebank and the Prague Arabic Dependency Treebank. In thetalk, the principles of the Prague Dependency Treebank linguisticannotation scheme will be presented. Some technical details will also bediscussed, as well as some of the tools developed both for the manualannotation itself and for corpus-based NLP of Czech, English and Arabic.

DTEND;TZID=America/Los_Angeles:20050805T120000
DTSTART;TZID=America/Los_Angeles:20050805T103000
LOCATION:11 Large
SUMMARY:The Family of Prague Dependency Treebanks
UID:20050805T103000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Paraphrases are textual expressions that convey the same meaning using different words. They capture variability, which is a common phenomenon in language. Given this, paraphrases have been shown to be useful in many natural language applications like Question-Answering, Machine Translation, Summarization and Information Retrieval. In this talk, I'll discuss the phenomenon paraphrasing and focus on methods for automatically acquiring paraphrases from text.

DTEND;TZID=America/Los_Angeles:20080418T160000
DTSTART;TZID=America/Los_Angeles:20080418T150000
LOCATION:11 Large
SUMMARY:Learning Paraphrases from Text
UID:20080418T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We begin by describing a set of pruning constraints that are appliedin the literature to effectively restrict the search space ofsynchronous PCFGs intersected with target language model contexts. Weapply these constraints to non-binarized grammars with a large numberof non-terminals and demonstrate effective parsing within theframework of Wu, 97.We then present a novel parsing approach that avoids language modelcontext intersection during parsing in favor of language model drivenn-best list extraction. The parsing step produces a sentencespanning parse forest which is explored in left-to-right target orderby the N-Best extraction method.This method avoids lossy pruning during the parsing process, searchinga much larger effective parse space than practically possible in thefull intersection scenario, and has the important benefit of allowingintegration of a high order language within the N-Best search process,rather than only in parse re-scoring.We demonstrate the impact of this parsing approach using the SPCFGapproach described in Zollmann, Venugopal, Vogel 06, which is similarto Galley et al., 04 and compare performance against fullintersection.This is joint work with Andreas ZollmannAbout the Speaker:Ashish Venugopal is a Ph.D candidate at the Language TechnologiesInstitute at Carnegie Mellon University, and holds B.S (SCS,Univ. Honors), M.S degrees from the same institution. He is a SeibelScholar and has received the annual Graduate Student Teaching Award atCarnegie Mellon. His research focus is on syntax augmented machinetranslation.

DTEND;TZID=America/Los_Angeles:20060929T163000
DTSTART;TZID=America/Los_Angeles:20060929T150000
LOCATION:11 Large
SUMMARY:Delayed LM Intersection and Left-to-Right N-Best Extraction for Syntax-Based MT
UID:20060929T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Contextual information is critical for language processing and generation. Particularly for large texts consisting of multiple sentences or paragraphs, how to capture the contextual information beyond sentence boundaries is important for building better language processing systems. This talk will discuss our recent effort on incorporating contextual information to language modeling and generation. It presents three models with each of them corresponds a specific linguistic phenomenon of context shared in written texts: (i) local context from preceding sentences; (ii) semantic and pragmatic relations between adjacent sentences; and (iii) evolving of entities (e.g., characters in novels) through coreference links in texts. The starting point of our model design is sentence-level recurrent neural network language models (RNNLMs). To capture these aspects of contextual information, we extend RNNLMs by either adding extra connections among existing network components, or adding dedicated components particularly to encode specific linguistic information. Evaluation results show that these models outperforms strong baselines and prior work language modeling tasks. Their ability of capturing contextual information is also verified by the quantitative evaluation on each corresponding task, such as identifying the relation between sentences, and resolving coreference ambiguity. Qualitative analysis is also included to demonstrate the ability of these models for text generation.Bio: Yangfeng Ji is a postdoc researcher at University of Washington working with Noah Smith. His research interests lie in the interaction of natural language processing and machine learning. He is interested in designing machine learning models and algorithms for language processing, and also fascinated by how linguistic knowledge helps build better learning models. He completed his Ph.D. in Computer Science at Georgia Institute of Technology in 2016, advised by Jacob Eisenstein. He was one of the area co-chairs on Discourse and Pragmatics in ACL 2017.

DTEND;TZID=America/Los_Angeles:20171013T160000
DTSTART;TZID=America/Los_Angeles:20171013T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Context is Everything: From language modeling to language generation
UID:20171013T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The increasing availability of large text corpora holds the promise of acquiring an unprecedented amount of knowledge from this text. However, current techniques are either specialized to particular domains or do not scale to large corpora. This dissertation develops a new technique for learning open-domain knowledge from unstructured web-scale text corpora.    A first application aims to capture common sense facts: given a candidate statement about the world and a large corpus of known facts, is the statement likely to be true? We appeal to a probabilistic relaxation of natural logic -- a logic which uses the syntax of natural language as its logical formalism -- to define a search problem from the query statement to its appropriate support in the knowledge base over valid (or approximately valid) logical inference steps. We show a 4x improvement at retrieval recall compared to lemmatized lookup, maintaining above 90% precision.    This approach is extended to handle longer, more complex premises by segmenting these utterance into a set of atomic statements entailed through natural logic. We evaluate this system in isolation by using it as the main component in an Open Information Extraction system, and show that it achieves a 3% absolute improvement in F1 compared to prior work on a competitive knowledge base population task.    A remaining challenge is elegantly handling cases where we could not find a supporting premise for our query. To address this, we create an analogue of an evaluation function in gameplaying search: a shallow lexical classifier is folded into the search program to serve as a heuristic function to assess how likely we would have been to find a premise. Results on answering 4th grade science questions show that this method improves over both the classifier in isolation and a strong IR baseline, and achieves the best published results on the task.Bio. Gabor is a new graduate from Chris Manning's natural language processing lab. He graduated with a BS in electrical engineering/computer science from UC Berkeley in 2010, and defended his Ph.D. in the fall of 2015. His research focuses on natural language understanding, ranging from relation extraction and knowledge base population, textual entailment, common-sense reasoning, and question answering. He has led the Stanford knowledge base population project for the past three years, with Stanford ranking 5th, 1st, and 1st (tied) among teams participating in the TAC-KBP competition over those three years. In addition to publications at ACL, EMNLP and NAACL, he co-authored an EMNLP best dataset paper on collecting a large dataset for textual entailment. Outside of academia, he was the NLP architect for Baarzo in 2014 (acquired by Google), and is currently a fellow at XSeed Capital. In his free time, Gabor enjoys hiking, board games, and binge-watching Netflix shows.

DTEND;TZID=America/Los_Angeles:20160115T160000
DTSTART;TZID=America/Los_Angeles:20160115T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Learning Open Domain Knowledge From Text
UID:20160115T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Machine Translation (MT) is the task of translating a document from asource language (e.g., Chinese) into a target language (e.g., English)via computer.  State-of-the-art statistical approaches to MT use largecollections of human-translated documents as training material,gathering statistics on the patterns of correspondence betweenlanguages according to the features specified by the translationmodel.  Using this bilingual translation model in conjunction with atarget language model, created by gathering statistics from a largemonolingual corpus, a new document in the source language can beautomatically translated into its target-language equivalent withsurprising accuracy.Much  MT research focuses on types of the patterns and features toinclude in a translation model.Recent statistical MT models have used syntax trees to enforcegrammaticality, but the currently popular tree substitution modelsonly memorize sequences of words or constituents, specifying exactlywhat phrases to use and exactly what trees are grammatical, which doesnot generalize well.  Adding the operation of tree-adjoining providesthe freedom to splice additional information into an existinggrammatical tree.  An adjoining translation model allows general,linguistically-motivated translation patterns to be learned withoutthe clutter of endless variations of optional material.  Theappropriate modifiers, such as adjectives, adverbs, and prepositionalphrases, can be grafted into these core patterns as needed totranslate details.  We show that the increased generalization powerprovided by adjoining, when used carefully, improves MT qualitywithout becoming computationally intractable.In this thesis, we describe challenges encountered by both word-sequence-basedand syntax-tree-based MT systems today, and present anin-depth, quantitative comparison of both models.  Then we describe anovel model for statistical MT which addresses these challenges usinga synchronous tree-adjoining grammar.  We introduce a method ofconverting these grammars to a weakly equivalent tree transducer fordecoding.   Then we present a method for learning the rules andassociated probabilities of this grammar from aligned tree/stringtraining data, and empirically analyze important characteristics ofthe resulting model, considering and evaluating many variations.Finally, our results show that adjoining delivers a consistentimprovement over a baseline statistical syntax-based MT model on bothmedium and large-scale MT tasks using several language pairs.

DTEND;TZID=America/Los_Angeles:20111004T170000
DTSTART;TZID=America/Los_Angeles:20111004T160000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Tree-adjoining Machine Translation (Ph.D. Defense Practice Talk)
UID:20111004T160000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Match-making systems refer to systems where users want to meet other individuals to satisfy some underlying need. Examples of match-making systems include dating services, resume/job bulletin boards, community based question answering, and consumer-to-consumer marketplaces. One fundamental component of a match-making system is the retrieval and ranking of candidate matches for a given user.We present the first in-depth study of information retrieval approaches applied to match-making systems. Specifically, we focus on retrieval for a dating service. This domain offers several unique problems not found in traditional information retrieval tasks. These include two-sided relevance, very subjective relevance, extremely few relevant matches, and structured queries. We propose a machine learned ranking function that makes use of features extracted from the uniquely rich user profiles that consist of both structured and unstructured attributes. An extensive evaluation carried out using data gathered from a real online dating service shows the benefits of our proposed methodology with respect to traditional match-making baseline systems. Our analysis also provides deep insights into the aspects of match-making that are particularly important for producing highly relevant matches.

DTEND;TZID=America/Los_Angeles:20110114T160000
DTSTART;TZID=America/Los_Angeles:20110114T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Relevance and Ranking in Online Dating Systems
UID:20110114T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Because natural language is complex, researchers in many domains look for lower-dimensional representations of text to suit their purposes. Different methods attempt to single out intuitive aspects of language like content, sentiment, or style. I will discuss a new, unsupervised approach to learning abstract representations of text (or other high-dimensional signals). The motivating principle is to use information theory to construct higher-order features that explain correlations between lower-order features. I will present preliminary results using this framework.Greg Ver Steeg is a research professor at ISI. His research explores practical methods for inferring meaningful structure in complex systems like social networks. He did his PhD in quantum physics at Caltech.

DTEND;TZID=America/Los_Angeles:20131101T160000
DTSTART;TZID=America/Los_Angeles:20131101T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Coarse-graining Text
UID:20131101T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Recently there has been an explosion in the number of users posting short status messages on Social Media websites such as Facebook and Twitter.  Although noisy and informal, this new style of text represents a valuable source of information not available elsewhere: it provides the most up-to-date information on current events, in addition to a massive publicly available corpus of naturally occurring human conversations.  In this talk I will present ongoing work which explores both of these aspects.First, I will describe efforts towards Information Extraction from status messages.  Because statuses can be posted quickly and are widely disseminated, they often provide the most up-to-date source of information on current events around the world and locally.  This dynamically changing source of realtime information is already being processed using keyword extraction techniques, for example the "trends" displayed on Twitter's website provide a list of phrases which are frequent in the current stream of messages.  In order to move beyond a flat list of phrases, we have been investigating the feasibility of applying Information Extraction techniques to produce more structured representations of events.  A key challenge is the noisy nature of this data; unlike newswire, or biomedical text, status messages contain frequent misspellings and abbreviations, inconsistent capitalization, unique grammar, etc...  To deal with these issues, we have been annotating a corpus of Twitter Posts with POS tags and Named Entities, then using these annotations to train Twitter-specific NLP tools.  As a demonstration of their utility, the resulting tools are combined to produce a calendar of popular events occurring in the future.In addition, I will discuss work which exploits a corpus of roughly 1.3 million naturally occurring conversations collected from Twitter for building models of human conversation.  Three data-driven approaches to generating responses to Twitter status posts are considered, based on either information retrieval or phrase-based statistical machine translation.  Although there are many challenges to overcome in adapting phrase-based SMT to dialogue, we show that it is a promising approach to this problem.  We compare these approaches in a human evaluation, using annotators from Amazon's Mechanical Turk service.  Furthermore, we measure agreement between human evaluators and the BLEU automatic MT evaluation metric.  As far as we are aware, this is the first work to investigate the application of phrase-based SMT to dialogue generation.Short Bio: Alan Ritter is a graduate student at the University of Washington advised by Oren Etzioni.  His research interests are in Information Extraction, Computational Lexical Semantics, and Language Processing in Social Media.

DTEND;TZID=America/Los_Angeles:20110217T160000
DTSTART;TZID=America/Los_Angeles:20110217T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Status Messages: A Unique Textual Source of Realtime and Social Information
UID:20110217T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Automatic Prediction of Parser Accuracy (practice talk for EMNLP)Statistical parsers have become increasingly accurate, to the point where they are useful in many natural language applications. However, estimating parsing accuracy on a wide variety of domains and genres is still a challenge in the absence of gold-standard parse trees.We propose a technique that automatically takes into account certain characteristics of the domains of interest, and accurately predicts parser performance on data from these new domains. As a result, we have a cheap (no annotation involved) and effective recipe for measuring the performance of a statistical parser on any given domain.(Joint work with Kevin Knight and Radu Soricut)Overcoming Vocabulary Sparsity in MT Using Lattices  (practice talk for AMTA)Source languages with complex word formation rules present a challenge for statistical machine translation (SMT). In this paper, we take on three facets of this challenge: (1) common stems are fragmented into many different forms in training data, (2) rare and unknown words are frequent in test data, and (3) spelling variation creates additional sparseness problems. We present a novel, lightweight technique for dealing with this fragmentation, based on bilingual data, and we also present a combination of linguistic and statistical techniques for dealing with rare and unknown words. Taking these techniques together, we demonstrate +1.3 and +1.6 BLEU increases on top of strong baselines for Arabic-English machine translation.(Joint work with Ulf Hermjakob and Kevin Knight)

DTEND;TZID=America/Los_Angeles:20081010T161500
DTSTART;TZID=America/Los_Angeles:20081010T150000
LOCATION:11 Large
SUMMARY:Practice talks for AMTA/EMNLP
UID:20081010T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: With the rise of social media, millions of people express their moods, feelings and daily struggles with mental health issues routinely on social media platforms like Twitter. Un- like traditional observational cohort studies conducted through questionnaires and self-reported surveys, we explore the reliable detection of clinical depression from tweets obtained unobtrusively. Based on the analysis of tweets crawled from users with self-reported depressive symptoms in their Twitter profiles, we demonstrate the potential of detecting clinical depression symptoms which emulate the PHQ-9 questionnaire clinicians use today. Our study uses a semi-supervised statistical model to evaluate how the duration of these symptoms and their expressionon Twitter (in terms of word usage patterns and topical preferences) align with the medical findings reported via the PHQ-9. Our proactive and automatic screening tool is able to identify clinical depressive symptoms with an accuracy of 68% and precision of 72%.Bio: Amir is a 2nd year Ph.D. Researcher at Kno.e.sis Center Wright State University, OH under the guidance of  Prof. Amit P. Sheth, the founder and executive director of Kno.e.sis Center. He is broadly interested in machine learning (incl. deep learning) and semantic web (incl. creation and use of knowledge graphs) and their applications to NLP/NLU and social media analytics. He has a particular interest in the extraction of subjective information with applications to search, social and biomedical/health applications. At Kno.e.sis Center â He is working on several real world projects mainly focused on studying human behavior on the web via Natural Language Understanding, Social Media Analytics utilizing Machine learning (Deep learning) and Knowledge Graph techniques. In particular, his focus is to enhance statistical models via domain semantics and guidance from offline behavioral knowledge to understand userâs behavior from unstructured and large-scale Social data.

DTEND;TZID=America/Los_Angeles:20170707T160000
DTSTART;TZID=America/Los_Angeles:20170707T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Semi-Supervised Approach to Monitoring Clinical Depressive Symptoms in Social Media
UID:20170707T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I'll present a dozen interesting, potentially high-impact NLP research projects. I'd like to make this a very interactive session.

DTEND;TZID=America/Los_Angeles:20130913T160000
DTSTART;TZID=America/Los_Angeles:20130913T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Some Potential NLP Thesis Topics and Other Fun Research Projects
UID:20130913T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We ask how much information a human translator adds to an original text, and weprovide a bound. We address this question in the context of bilingual text com-pression: given a source text, how many bits of additional information are required to specify the target text produced by a human translator? We develop new compression algorithms and establish a benchmark task.

DTEND;TZID=America/Los_Angeles:20150904T160000
DTSTART;TZID=America/Los_Angeles:20150904T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:How Much Information Does a Human Translator Add to the Original?
UID:20150904T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Weighted tree transducers have been proposed as useful formal models for representing syntactic natural language pro- cessing applications, but there has been little description of inference algorithms for these automata beyond formal foundations. We give a detailed description of algorithms for application of cascades of weighted tree transducers to weighted tree acceptors, connecting formal theory with actual practice. Additionally, we present novel on-the-fly variants of these algorithms, and compare their performance on a syntax machine translation cascade based on (Yamada and Knight, 2001).

DTEND;TZID=America/Los_Angeles:20100630T163000
DTSTART;TZID=America/Los_Angeles:20100630T160000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Efficient Inference Through Cascades of Weighted Tree Transducers (ACL 2010 Practice Talk)
UID:20100630T160000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This tutorial is intended to provide attendees with working knowledge of the Arabic writing system. No previous experience with Arabic is required. At the end of this tutorial you should be able to read and segment individual Arabic characters, read common ligatures, identify possible affixes on stems, and understand the various lexical normalizations used in Arabic text preprocessing. The focus will be on the formal writing system in printed text for Modern Standard Arabic, although handwriting will be briefly discussed.

DTEND;TZID=America/Los_Angeles:20080325T113000
DTSTART;TZID=America/Los_Angeles:20080325T103000
LOCATION:11 Large
SUMMARY:Tutorial on Arabic Orthography
UID:20080325T103000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This proposal seeks to explain how humans describe emotions usingnatural language. Thefocus of the proposal is on words and phrases that refer to emotions,rather than the more general phenomena of emotional language. The mainproblem I address is that if natural language descriptions of emotionsrefer to abstract concepts that are local to a particular human (oragent), then how do these concepts vary from person to person and howcan shared meaning be established between people. The thesis of theproposal is that naturallanguage emotion descriptions are definite descriptions that refer totheoretical objects, which provide a logical framework for dealingwith this phenomenon in scientific experimentsand engineering solutions. An experiment, Emotion Twenty Questions(EMO20Q), was devised to study the social natural language behavior ofhumans, who must use descriptions of emotions to play the familiargame of twenty questions when the unknown word is an emotion.The idea of a theory based on natural language propositions isdeveloped and used to formalize the knowledge of a sign-using agent.Based on this pilot data, it was seen that approximately 25% of theemotion descriptions referred to emotions as objects with dimensionalattributes, similarity, or subsethood. This motivated the author touse interval type-2 fuzzy sets as a computational model for theconceptual meaning of emotion descriptions. This model introduces adefinition of a variable that ranges over emotions and allows for bothinter- and intra-subject variability. A second experiment usedinterval surveys and translation tasks to assess this model. Finally,the author proposes the use of spectral graph theory to representemotional knowledge as a network of proposition nodes that areconnected  to emotion nodes based on data from EMO20Q.Short Bio: Abe Kazemzadeh is a PhD student at the USC Computer Science Dept and aresearch assistant at the the Signal Analysis and InterpretationLaboratory (SAIL). His interests include natural language, logic,emotions, games, and algebra.

DTEND;TZID=America/Los_Angeles:20110506T160000
DTSTART;TZID=America/Los_Angeles:20110506T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Natural Language Descriptions of Emotions
UID:20110506T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Information extraction is concerned with the retrieval of structured information from unstructured sources. Knowledge extraction/acquisition will need to go a step further by testing whether the extracted information is actually true. Since none of the extraction systems in current use can guarantee a perfect precision, it is necessary to incorporate manual verification steps into the information extraction pipeline in order to use extracted facts in further reasoning. My talk will present a framework that adopts a cyclic approach to advancing the state of factual knowledge within a system, taking advantage of available formal/structured knowledge sources, information extraction and human/social computing to verify the extracted information. For the fact extraction part, the system uses LoD as training data, a domain hierarchy extractor to delineate domain boundaries and non-NLP surface-pattern-based open IE techniques to connect concepts within the hierarchy. To combat the low recall that most IE approaches face, the system deploys generalization techniques and pertinence computation to increase the number of patterns. Verification is done by means of information use under the assumption that correct information will be utilized more often than incorrect one.Bio:Christopher Thomas is a PhD candidate in the Kno.e.sis Center at Wright State University. His research is in epistemological aspects of Computer Science and Artificial intelligence, namely knowledge extraction, representation, verification and dissemination. To build a coherent framework for this kind of systems epistemology, his publications span technical work on ontology design, ontology learning, information quality and information extraction as well as conceptual work on knowledge representation and social computing methods for knowledge verification.

DTEND;TZID=America/Los_Angeles:20110303T160000
DTSTART;TZID=America/Los_Angeles:20110303T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:What Goes Around Comes Around -- Improving the State of Knowledge on the Web through On-Demand Model Creation
UID:20110303T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In many application domains, we face the task of characterizing the distribution of continuous random variables.  For instance, in automatic speech recognition (ASR), these variables are acoustic properties of speech signals.  For such tasks, Gaussian mixture models (GMMs) are widely used as an very effective density estimator. Particularly, in the context of ASR, they are embedded in continuous-density hidden Markov models (CD-HMMs) to yield emission probabilities, i.e., the likelihoods of acoustic observations conditioned on hidden states such as phonemes. Meanwhile, the transition probabilities in CD-HMMs attempt to capture temporal properties of speech signals. Similar modeling choices arise in other applications, for instance, in activity recognition.Various techniques have been developed to estimate the parameters of CD-HMMs. In particular, discriminative techniques such as conditional maximum likelihood and minimum classification error have attracted significant research attention. When carefully and skillfully implemented, they often lead to lower error rates (in speech recognition) than traditional techniques of maximum likelihood estimation.In this talk, I will describe a new discriminative technique that is based on the principle of large margin, a key framework in many machine learning algorithms including support vector machines and boosting. The new technique differs from previous discriminative methods for ASR in the goal of margin maximization. In particular, in our large margin training of CD-HMMs, model parameters are optimized to maximize the gap (or the margin)  between correct and incorrect classifications.  I will present an extensive empirical evaluation of our approach on two benchmark problems in speech recognition: phonetic classification and recognition on the TIMIT speech database.  In both tasks, large margin systems obtain significantly better performance than systems trained by maximum likelihood estimation or competing discriminative frameworks.  An in-depth analysis also reveals someinteresting features of our approach, which contribute to the superior performance.Towards the end of the talk, I will discuss briefly the connection of our work to the structured prediction problems in the machine learning community. I will also discuss the future direction of this line of work and other application potentials.

DTEND;TZID=America/Los_Angeles:20080919T160000
DTSTART;TZID=America/Los_Angeles:20080919T150000
LOCATION:11 Large
SUMMARY:Large margin based parameter estimation for hidden Markov models
UID:20080919T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Abstract: We consider the challenge of learning semantic parsers that scale to large, open-domain problems, such as question answering or knowledge base completion with Freebase. In such settings, the sentences cover a wide variety of topics and include many phrases whose meaning is difficult to represent in a fixed target ontology. For example, even simple phrases such as `daughter' and `number of people living in' cannot be directly represented in Freebase, whose ontology instead encodes facts about gender, parenthood, and population. Here, we introduce a semantic parsing approach that learns to resolve such ontological mismatches. The parser uses a probabilistic CCG to build linguistically motivated logical-form meaning representations, and includes an ontology matching model that adapts the output logical forms for each target ontology.Bio: Eunsol Choi is a Ph.D student at the University of Washington, advised by Prof. Luke Zettlemoyer. Prior to UW, she studied mathematics and computer science at Cornell University.

DTEND;TZID=America/Los_Angeles:20140630T160000
DTSTART;TZID=America/Los_Angeles:20140630T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Open Domain Semantic Parser for QA / Information Extraction [Intern talk]
UID:20140630T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: We propose the use of a nonparametric Bayesian model, the Hierarchical Dirichlet Process (HDP), for the task of Word Sense Induction. Results areshown through comparison against Latent Dirichlet Allocation (LDA), a parametric Bayesian model employed by Brody and Lapata (2009) for this task.We find that the two models achieve similar levels of induction quality, while the HDP confers the advantage of automatically inducing a variable number of senses per word, as compared to manually fixing the number of senses a priori, as in LDA. This flexibility allows for the model to adapt to terms with greater or lesser polysemy, when evidenced by corpus distributional statistics.

DTEND;TZID=America/Los_Angeles:20110617T154000
DTSTART;TZID=America/Los_Angeles:20110617T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Nonparametric Bayesian Word Sense Induction (ACL practice talk)
UID:20110617T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: In this talk I will present the work I did with Prof Daniel Marcu and Prof Kevin Knight at ISI over the summer. In this work, we show how we can improve relation extraction for biomedical text using distant supervision from existing knowledge sources like BioPax. We label the data using heuristics from AMR which obviates the need for expensive manual annotation and allows us to make use of large amounts of data for training. I will also talk about some ongoing work on training a simpler model that exploits linguistic information stored in the path via the least common ancestor in an AMR.Bio. I am a PhD student from University of Maryland, College Park working under Prof. Hal Daume III and Prof. Philip Resnik. My recent project on "Dialogue focus tracking for zero pronoun resolution" appeared at NAACL 2015. At ISI, I am working with Prof. Daniel Marcu and Prof. Kevin Knight on application of Abstract Meaning Representation (AMR) to biology literature. Specifically we will be developing techniques for constructing text level AMRs from sentence level AMRs and then assess its impact on reading-against-a-model molecular biology tasks. In my spare time, I enjoy singing, dancing and watching movies.

DTEND;TZID=America/Los_Angeles:20150828T160000
DTSTART;TZID=America/Los_Angeles:20150828T150000
LOCATION:6th Floor Large Conference Room [689]
SUMMARY:Distant supervision for relation extraction using AMR
UID:20150828T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: How do adjectives project from a noun to its parts and other aspects? If a motorcycle is red, are its wheels red? Is a sharp knifeâs handle sharp? Questions like this are common sense for humans, using our understanding of the world, but difficult for computers. I will describe our process for curating and annotating a large dataset consisting of related object pairs and adjectives, and a set of experiments that aim to discover the extent to which modern approaches can learn these relationships from purely textual sources.Bio: James is a Masterâs Student in Computer Science at the Georgia Institute of Technology, where he works on machine learning for healthcare using written electronic health record notes. At ISI, he is working with Jonathan May and Nanyun Peng on building a dataset and models for textual commonsense reasoning. He aims to work on NLP and ML in industry for a year or so before applying for PhD programs.

DTEND;TZID=America/Los_Angeles:20180810T160000
DTSTART;TZID=America/Los_Angeles:20180810T150000
LOCATION:6th Floor Conference Room [689]
SUMMARY:Reasoning about objects, their components, and their descriptors
UID:20180810T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: I am Tomohide Shibata, an assistant professor at Kyoto University,Japan. I am working with Prof. Kurohashi. I have been visitingProf. Hovy for three weeks. In this talk, I introduce ourresearch. Our research roughly consists of three fields: basic textanalysis, information retrieval and machine translation. Among them,basic text analysis and information retrieval, which I am engaged in,are introduced.In basic text analysis, we have been developed Japanese morphologicalanalyzer and parser, which are widely used in research community. Caseframes, which describe the relation between a verb and its casecomponents, are automatically constructed from a large Webcorpus. Synonym and is-a relations are automatically extracted from adictionary and Web corpus.In Information Retrieval, we are running a search engineinfrastructure called TSUBAKI. The features of TSUBAKI are (i) thesentence structure (dependency relation) is considered in the documentranking, and (ii) the expression divergence between a query and adocument is assimilated. We are also running a search resultclustering system based on TSUBAKI.

DTEND;TZID=America/Los_Angeles:20091209T163000
DTSTART;TZID=America/Los_Angeles:20091209T153000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Introduction of Our Research (text analysis and IR)
UID:20091209T153000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: Metonymic words stand-in for concepts closely related to the words' literal interpretation. Resolving metonymies would then require identifying potentially metonymic words, finding closely related concepts, and determining which one fits the local(grammatically-related) and global context best. Each of these tasks can be resolved best by using different types of resources: a network of concepts for finding related concepts; a grammatically analyzed corpus (and, ideally, an ontology) for computing selectional preferences for the local context; a large corpus for computing co-occurrence probabilities, to factor in the global context. Within NLP we do have all these types of resources, but because of their different requirements -- e.g. relational models of meaning rely on differentiating word senses, while distributional representations do not (cannot) make such distinctions -- they are separate from one another. By using Wikipedia and exploiting its various structured/semi-structured sources of information, we can build a resource that combines the three types of meaning representations mentioned above. I will discuss the task of metonymy resolution and show how the combination of representations extracted from Wikipedia makes possible an unsupervised approach to this task.Bio: Vivi Nastase is a researcher at the Fondazione Bruno Kessler in Trento, working mainly on lexicalsemantics, semantic relations, knowledge acquisition and language evolution. She holds a Ph.D.from the University of Ottawa, Canada, and has previously worked at the Heidelberg Institute of Theoretical Studies (HITS) and the University of Heidelberg.

DTEND;TZID=America/Los_Angeles:20150626T160000
DTSTART;TZID=America/Los_Angeles:20150626T150000
LOCATION:10th Floor Classroom [1016]
SUMMARY:Metonymy resolution with multi-faceted knowledge from Wikipedia
UID:20150626T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: This thesis details three contributions to the advancement ofsemantic-enriched parsing for English sentences: inventories of semanticrelations covering three semantically ambiguous linguistic phenomena,large datasets annotated according to the inventories, and, finally, asuite of tools for semantically-enriched parsing built using the data.For the purposes of this thesis, semantically-enriched parsing isdefined as the reconstruction of the underlying grammatical structure oftext along with shallow semantic annotation of semantically-ambiguousstructures. Ultimately, semantically-enriched parsing is one of the mostcritical steps in natural language understanding---the initial step inwhich the text is read by the machine into a knowledge representationfor further processing and reasoning.The first contribution of this thesis is to advance the theoreticalfoundations for the interpretation of three ambiguous linguisticphenomena in English that have significant overlap in terms of therelations expressed: noun compounds, possessive constructions, andprepositions. For these, I define inventories of relations based uponextensive annotation by myself, previous work by others, andinter-annotator agreement studies. In the case of prepositions, therelations are created by refining an existing resource whereas the othertwo are created from scratch. In addition to mappings to prior work,mappings are provided across the different inventories in order tocreate a unified set of relations.Second, I produce large datasets annotated according to theaforementioned sense inventories. Such data is vital for training mostautomatic tools and also provides exemplars for the theory embodied inthe inventories. Some of these datasets are created from scratch,including a collection of over 17,500 noun compounds and a collection ofover 21,900 possessive construction examples. In the case ofprepositions, an existing resource including over 24,000 annotatedexamples is refined.The final contribution is a suite of tools that can constructsemantically-enriched parse trees. The suite is designed to work in asequential, pipeline-like fashion and can be thought of as consisting oftwo subsections. The first part reconstructs the grammatical structureof the text using a dependency parser that extends the non-directionaleasy-first algorithm developed by Goldberg and Elhadad (2010) in orderto support non-projective trees and is trained using my improveddependency tree conversion of the Penn Treebank. Second is a semanticannotation module that adds shallow semantic annotation for nouncompounds, preposition senses, and possessives. Combined, these toolsproduce semantically-enriched parse trees that include both grammaticalstructure and shallow semantics. The core parser itself achievesstate-of-the-art accuracy and can process over 75 sentences per second,which is substantially faster than most of the accurate parsersavailable today.In conclusion, this thesis work provides significant contributions tocomputational linguistics, both in terms of theory and resources. Itadvances our understanding of the relations expressed by threesemantically-ambiguous linguistic phenomena, creates large annotateddatasets useful for machine learning, and produces a fast, accurate, andinformative system for semantically-enriched parsing.

DTEND;TZID=America/Los_Angeles:20110819T160000
DTSTART;TZID=America/Los_Angeles:20110819T150000
LOCATION:11th Floor Large Conference Room [1135]
SUMMARY:Semantically-Enriched Parsing for Natural Language Understanding
UID:20110819T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
BEGIN:VEVENT
DESCRIPTION: The research of extracting event duration information from texts ispotentially very important in applications in which the time course ofevents is to be extracted from news.  For example, whether two eventsoverlap or are in sequence often depends very much on their durations.  Ifa war started yesterday, we can be pretty sure it is still going on today.If a hurricane started last year, we can be sure it is over by now.In the talk, I will first present our work on constructing an annotatedcorpus for extracting information about the typical durations of eventsfrom texts, including the annotation guidelines, the event classes wecategorized, the way we use normal distributions to model such vague andimplicit temporal information, and how we evaluate inter-annotatoragreement. I will then show that machine learning techniques applied tothis data yield coarse-grained event duration information, considerablyoutperforming a baseline and approaching human performance.At the beginning of the talk, I will also give a brief overview of thetime ontology (OWL-Time, formerly DAML-Time) we have developed, which isrepresented in both first-order logic and the OWL web ontology language.

DTEND;TZID=America/Los_Angeles:20060428T163000
DTSTART;TZID=America/Los_Angeles:20060428T150000
LOCATION:11 Large
SUMMARY:Learning Event Durations from Event Descriptions
UID:20060428T150000@NL
URL:http://www.isi.edu/natural-language/nl-seminar
END:VEVENT
END:VCALENDAR