Natural Language Understanding


11-20-2013

Natural Language Understanding
Read: AIMA Chapters 22 & 23

HW#8, due Monday, 11/25

What are some of the most impressive technologies in futuristic science fiction? Consider Star Trek, for example. One is the “Universal Translator.”

■ Science behind Watson
■ Three key capabilities
  ■ Natural Language Understanding
  ■ Hypothesis Generation
  ■ Evidence-based Learning

■ NLP is a discipline that aims to build computer systems that will be able to analyze, understand and generate human speech.
■ NLP subareas of research are:
  ■ Speech Recognition (speech analysis),
  ■ Speech Synthesis (speech generation), and
  ■ Natural Language Understanding (NLU).

■ Putting meaning to the words
■ Input might be speech or could be typed in
■ Holy grail of Artificial Intelligence problems

■ Georgetown University:
  ■ English to Russian: “The spirit is willing but the flesh is weak.”
  ■ Russian to English: “The vodka is good but the meat has spoiled.”

Consider the following conversation between Mary and Tom:
Tom: “Who do you like tonight, Boston or LA?”
Mary: “Lakers. You?”
Tom: “Come on Mary, LA can’t handle Bird.”
Mary: “I’ve got a five that says Magic will shut him down.”

■ Problem: English sentences are incomplete descriptions of the info they are intended to convey.
  ■ I called Linda to ask her to the movies. She said she’d love to go.

but… speakers can be vague or precise; can leave out details that the hearer is expected to know

■ Problem: The same expression means different things in different contexts.
  ■ Where’s the water?

but… can communicate about an infinite world with a finite number of symbols

■ Problem: New words, expressions and meanings evolve.
  ■ I’ll fax it to you.
  ■ In the 1600s, St. Paul’s cathedral was said to be “amusing, awful and artificial.”
  ■ “Selfie” named by Oxford dictionaries as word of the year 2013.

but… languages can evolve as experiences change

■ Problem: There are a lot of ways of saying the same thing.
  ■ Mary was born on March 27th.
  ■ Mary’s birthday is March 27th.

but… when you know a lot, facts imply each other

■ Speech recognition is the process of converting spoken language to written text or some similar form.
■ Speech synthesis is the process of converting text into spoken language.
■ Natural Language Understanding (NLU) is the process of analyzing recognized words and transforming them into data meaningful to a computer.
■ In other words, NLU is a computer-based system that “understands” human language.
■ NLU is used in combination with speech recognition.

■ Three major issues involved in understanding language
  ■ A large amount of human knowledge is assumed
  ■ Language is pattern based: phonemes are components of words and words make phrases and sentences.
  ■ Language acts are the product of agents, either human or computer

■ Terry Winograd’s SHRDLU (Winograd 1972)
  ■ Early AI programs made progress by restricting the focus to a microworld
  ■ SHRDLU could respond to English queries
    ■ What is sitting on the red block?
    ■ What shape is the blue block on the table?
    ■ Place the green pyramid on the red brick.
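The appeal of a microworld is that its whole state fits in a small data structure, so queries like the ones above reduce to lookups and updates. The following is a minimal sketch of that idea only. It is not SHRDLU's design: the object ids, the `world` dictionary, and the helper names are all hypothetical, the English parsing step is skipped entirely, and “brick” is treated as a synonym for “block”.

```python
# Hypothetical blocks world: object id -> properties. "on" names what the
# object rests on, either another object id or "table".
world = {
    "b1": {"shape": "block", "color": "red", "on": "table"},
    "b2": {"shape": "block", "color": "blue", "on": "table"},
    "p1": {"shape": "pyramid", "color": "green", "on": "table"},
}

def find(shape, color):
    """Return the id of the unique object with this shape and color."""
    matches = [oid for oid, o in world.items()
               if o["shape"] == shape and o["color"] == color]
    if len(matches) != 1:
        raise LookupError(f"expected one {color} {shape}, found {len(matches)}")
    return matches[0]

def sitting_on(support_id):
    """Ids of the objects resting directly on the given object."""
    return [oid for oid, o in world.items() if o["on"] == support_id]

def place(obj_id, target_id):
    """Move an object onto a target, refusing moves the toy world forbids."""
    if obj_id == target_id:
        raise ValueError("an object cannot rest on itself")
    if world[target_id]["shape"] == "pyramid":
        raise ValueError("nothing can rest on a pyramid")
    if sitting_on(obj_id):
        raise ValueError("clear the object before moving it")
    world[obj_id]["on"] = target_id

# "Place the green pyramid on the red brick."
place(find("pyramid", "green"), find("block", "red"))
# "What is sitting on the red block?"
answer = sitting_on(find("block", "red"))
# "What shape is the blue block on the table?"
shape = world[find("block", "blue")]["shape"]
```

Everything hard about the original task, namely turning the English sentence into the call `place(find(...), find(...))`, is exactly what this sketch leaves out.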

■ Language is a complicated phenomenon, involving processes as varied as the recognition of sounds or printed letters, syntactic parsing, high-level semantic inferences, and even the communication of emotional content through rhythm and inflection.
■ To manage this complexity, linguists have defined different levels of analysis for natural language.

NLP Pipeline
[Figure: speech enters through Phonetic/Phonological Analysis and text through OCR/Tokenization; the stages that follow are Morphological analysis, Syntactic analysis, Semantic Interpretation, and Discourse Processing. Side labels: Phonology, Syntax, Semantics, Pragmatics & World Knowledge.]

■ Prosody: dealing with inflection, stress, pitch, timing
■ Phonology: examining sounds combined to form language; important for speech recognition and generation
■ Morphology: concerned with morphemes making up words. These include rules governing the formation of words. Important in determining the role of a word in a sentence in most languages in the world.
  ■ Morphological anomaly: “The computer eated an apple.”
■ Syntax: dealing with rules for combining words into legal phrases and sentences
  ■ Syntactic anomaly: “The computer ate apple.” “An the ate apple computer.”
■ Semantics: considers the meaning of words, phrases, and sentences, and also the ways in which meaning is conveyed in natural language
  ■ Semantic anomaly: “The computer ate an apple.”
■ Pragmatics: dealing with ways in which language is used and its effects on the listener
  ■ “Do you know the time?”
  ■ Pragmatic anomaly: “Next year, all taxes will disappear.”
■ World knowledge: includes knowledge of the physical world; essential to understand the full meaning of a text
  ■ “The pen is in the box.” versus “The box is in the pen.”
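The morphological anomaly “The computer eated an apple.” can be caught with nothing more than a suffix rule and a list of exceptions, which is a useful way to see what the morphology level contributes on its own. The sketch below assumes a tiny hand-made lexicon (`IRREGULAR_PAST` and `REGULAR_STEMS` are hypothetical) and handles only the plain “-ed” suffix. A real morphological analyzer covers far more.

```python
# Hypothetical mini-lexicon: verb stem -> irregular past-tense form.
IRREGULAR_PAST = {"eat": "ate", "see": "saw", "go": "went"}
# Hypothetical list of stems that form the past tense by adding "-ed".
REGULAR_STEMS = {"walk", "talk", "call"}

def analyze_past(word):
    """Return (stem, verdict) for a candidate past-tense verb form."""
    # Case 1: the word is a known irregular past form.
    for stem, past in IRREGULAR_PAST.items():
        if word == past:
            return stem, "ok: irregular past"
    # Case 2: the word looks like stem + "ed".
    if word.endswith("ed"):
        stem = word[:-2]
        if stem in IRREGULAR_PAST:
            # The regular rule was applied to an irregular verb.
            return stem, f"anomaly: expected '{IRREGULAR_PAST[stem]}'"
        if stem in REGULAR_STEMS:
            return stem, "ok: regular past"
    return None, "unknown"

print(analyze_past("eated"))   # ('eat', "anomaly: expected 'ate'")
print(analyze_past("ate"))     # ('eat', 'ok: irregular past')
```

The other anomalies in the list cannot be detected this way: “The computer ate an apple.” is morphologically and syntactically fine, and rejecting it needs knowledge about what computers can do.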

Lazy contented cats sleep peacefully.
Colorless green ideas sleep furiously.

■ Squad helps dog bite victim.
■ Helicopter powered by human flies.
■ I ate spaghetti with meatballs.
  … with salad.
  … with abandon.
  … with a fork.
  … with a friend.

Ambiguity can be lexical, syntactic, semantic, or referential

John saw a boy in a park with a telescope.
[Figure: alternative parse trees (S, NP, VP, V, Art, N, PP) for this sentence. The trees differ in whether each prepositional phrase, “in a park” and “with a telescope”, attaches to the verb phrase or to a preceding noun phrase. Further trees show the variants ending in “with a statue” and “with a dog”.]
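The attachment ambiguity in “John saw a boy in a park with a telescope.” can be counted mechanically. The sketch below uses a CKY-style chart to count the distinct parse trees a small grammar assigns to a sentence. The grammar and lexicon are assumptions written for this one example (they use the category names S, NP, VP, V, Art, N, and PP, and add P for prepositions); they are not taken from the slides.

```python
from collections import defaultdict

# Assumed toy grammar in binary form: (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP attaches to the verb phrase
    ("NP", "Art", "N"),
    ("NP", "NP", "PP"),   # PP attaches to a noun phrase
    ("PP", "P", "NP"),
]
# Assumed lexicon: word -> possible categories.
LEXICON = {
    "John": ["NP"], "saw": ["V"], "a": ["Art"],
    "boy": ["N"], "park": ["N"], "telescope": ["N"],
    "in": ["P"], "with": ["P"],
}

def count_parses(sentence, root="S"):
    """Count the distinct parse trees rooted in `root` for the sentence."""
    words = sentence.split()
    n = len(words)
    # chart[i][j][X] = number of X-rooted trees spanning words[i:j]
    chart = [[defaultdict(int) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        for category in LEXICON.get(word, []):
            chart[i][i + 1][category] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):          # split point
                for parent, left, right in BINARY_RULES:
                    chart[i][j][parent] += chart[i][k][left] * chart[k][j][right]
    return chart[0][n][root]

print(count_parses("John saw a boy"))                               # 1
print(count_parses("John saw a boy in a park"))                     # 2
print(count_parses("John saw a boy in a park with a telescope"))    # 5
```

Under this grammar every added prepositional phrase multiplies the options, which is why syntax alone cannot pick the intended reading: choosing between “with a telescope”, “with a statue”, and “with a dog” takes semantics and world knowledge.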

Identify all noun phrases that refer to the same entity:

John Simon, Chief Financial Officer of Prime Corp. since 1986, saw his pay jump 20%, to $1.3 million, as the 37-year-old also became the financial-services company’s president...

Best results: F-measure of 70.4 (MUC-6) and 63.4 (MUC-7) [Ng & Cardie, 2002]
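For readers unfamiliar with the metric, the F-measure combines precision and recall into one number. The sketch below assumes the balanced form (the harmonic mean, beta = 1) and reports values on a 0 to 1 scale, so 70.4 corresponds to 0.704. The precision and recall values in the usage lines are made up for illustration and are not the figures from Ng & Cardie. The sketch also does not reproduce how the MUC scorer derives precision and recall from coreference links.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0 and recall == 0:
        return 0.0  # avoid dividing by zero when nothing was found
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Illustrative values only.
print(round(f_measure(0.8, 0.6), 3))   # 0.686
print(f_measure(0.5, 0.5))             # 0.5
```

Because the harmonic mean is pulled toward the smaller of the two inputs, a system cannot score well by trading away almost all recall for precision, or the reverse.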

■ Advances in software and hardware create NLP needs for information retrieval (web), machine translation, spelling and grammar checking, speech recognition and synthesis.
■ Stochastic and symbolic methods combine for real world applications.

Speech Processing

A Voice Interface

Some Applications
■ Information Retrieval: Web search (uni- or multi-lingual)
■ Query Answering/Dialogue, e.g., Chat-80
■ Report Generation: English/French weather report
■ Foreign Language Training: Spanish/Arabic tutorial systems for military linguists
■ Machine Translation: Babelfish on Yahoo

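[Figure: a voice interface. The spoken request “I would like to fly to Seattle tomorrow.” passes through a Speech Recognizer and Natural Language Understanding to a Dialog Manager that draws on Domain Knowledge. The reply “When would you like to leave?” is produced by a Natural Language Generator and a Speech Synthesizer.]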
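The voice-interface components named above (Speech Recognizer, Natural Language Understanding, Dialog Manager with Domain Knowledge, Natural Language Generator, Speech Synthesizer) can be read as a chain of functions. The sketch below is one hypothetical way to wire them together for the flight request. The recognizer and synthesizer are pass-through stubs because audio is out of scope, the understanding step is a few regular expressions, and the slot names and the `QUESTIONS` table (standing in for domain knowledge) are assumptions.

```python
import re

def recognize(audio):
    # Stub: a real recognizer maps audio to text; here the "audio" is text.
    return audio

def understand(text):
    """Toy NLU: fill a flight-booking frame from the recognized text."""
    frame = {"destination": None, "day": None, "time": None}
    m = re.search(r"\bfly to ([A-Z][a-z]+)", text)
    if m:
        frame["destination"] = m.group(1)
    m = re.search(r"\b(today|tomorrow)\b", text)
    if m:
        frame["day"] = m.group(1)
    m = re.search(r"\bat (\d{1,2}(?::\d{2})?\s?(?:am|pm))\b", text)
    if m:
        frame["time"] = m.group(1)
    return frame

# Hypothetical domain knowledge: required slots and how to ask for each.
QUESTIONS = [
    ("destination", "Where would you like to go?"),
    ("day", "What day would you like to travel?"),
    ("time", "When would you like to leave?"),
]

def next_move(frame):
    """Dialog manager: ask for the first missing slot, else confirm."""
    for slot, question in QUESTIONS:
        if frame[slot] is None:
            return ("ask", question)
    return ("confirm", frame)

def generate(move):
    """Natural language generator: turn the chosen move into a sentence."""
    kind, payload = move
    if kind == "ask":
        return payload
    return "Booking a flight to {destination} {day} at {time}.".format(**payload)

def synthesize(text):
    # Stub: a real synthesizer maps text to audio.
    return text

def respond(audio):
    return synthesize(generate(next_move(understand(recognize(audio)))))

print(respond("I would like to fly to Seattle tomorrow."))
# When would you like to leave?
```

The request supplies a destination and a day but no departure time, so the dialog manager's only sensible move is the follow-up question from the figure.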

■ What is speech?
  ■ Vibrations of vocal cords create sound “ahh”
  ■ Mouth, throat, tongue, lips shape sound
■ English speech
  ■ 40 phonemes; 24 consonants, 16 vowels
■ Sounds transmit “language”
■ Speech does not equal written language

"I told him to go back where he came from, but he wouldn't listen."

■ Tell which person it is (voice print)
■ Could also be important for monitoring meetings, determining speaker

■ Primarily identifying words
■ Improving all the time
■ Commercial systems:
  ■ IBM ViaVoice, Dragon Dictate, ...
■ Speaker dependent/independent
  ■ Parametric patterns are sensitive to speaker
  ■ With training (dependent) can get better
■ Vocabulary
  ■ Some have 50,000+ words
■ Isolated word vs. continuous speech
  ■ Did you vs. Didja
  ■ Continuous: where words stop & begin
  ■ Typically a pattern match, no context used

■ Java Speech SDK
  ■ FreeTTS 1.1.1 http://freetts.sourceforge.net/docs/index.php
■ IBM JavaBeans for speech
■ Visual/Real Basic speech SDK
■ OS capabilities (speech recognition and synthesis built in to OS) (TextEdit)
■ VoiceXML

■ tool
  ■ automate the construction of NLP systems
  ■ avoid the need for large linguistic knowledge bases
■ portability
  ■ move to new domain quickly
  ■ reduce the need for expertise in computational linguistics
■ robustness
  ■ handle ungrammatical or unexpected text
  ■ missing domain knowledge

■ Statistical methods have transformed the field of NLP
■ Very good performance on increasing numbers/types of problems in NLP
■ Thus far, the most successful statistical and ML algorithms are supervised learning algorithms
  ■ Require large amounts of training data that has been annotated with the “correct” answers
  ■ Corpus annotation bottleneck
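To make “supervised” concrete: the learner sees inputs paired with the correct answers and generalizes from them. The sketch below is about the simplest possible instance, a part-of-speech tagger that assigns each word the tag it carried most often in the annotated data. The three-sentence corpus and its tag set (`Pro`, `V`, `Art`, `N`) are invented for illustration. They also show the annotation bottleneck in miniature, since any word absent from the corpus can only get a default guess.

```python
from collections import Counter, defaultdict

# Hypothetical hand-annotated corpus: sentences of (word, tag) pairs.
ANNOTATED = [
    [("I", "Pro"), ("saw", "V"), ("a", "Art"), ("boy", "N")],
    [("John", "N"), ("saw", "V"), ("the", "Art"), ("park", "N")],
    [("the", "Art"), ("saw", "N"), ("cuts", "V")],
]

def train(corpus):
    """Learn each word's most frequent tag, plus the overall most frequent tag."""
    per_word = defaultdict(Counter)
    overall = Counter()
    for sentence in corpus:
        for word, tag in sentence:
            per_word[word][tag] += 1
            overall[tag] += 1
    model = {word: tags.most_common(1)[0][0] for word, tags in per_word.items()}
    default_tag = overall.most_common(1)[0][0]
    return model, default_tag

def tag(words, model, default_tag):
    """Tag each word, falling back to the default tag for unseen words."""
    return [model.get(word, default_tag) for word in words]

model, default_tag = train(ANNOTATED)
print(tag("the boy saw a park".split(), model, default_tag))
# ['Art', 'N', 'V', 'Art', 'N']
print(tag(["telescope"], model, default_tag))
# ['N']
```

The tagger labels “saw” as a verb everywhere because that is its majority tag in the training data, so it would get “the saw cuts” wrong. Using context to fix that requires a richer model and, typically, much more annotated text.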

Japanese, Chinese, Thai, ...: no spaces between words

Combining simple statistics from unsegmented Japanese newswire yields results rivaling grammar-based approaches.

[Ando & Lee 2000, 2003]

Translating from one language to another is challenging even to human translators. e.g. signs translated into English by a person:
■ Utmost of chicken with smashed pot. (restaurant in Greece)
■ Nervous meatballs (restaurant in Bulgaria)
■ The nuns harbor all diseases and have no respect for religion. (Swiss nunnery hospital)
■ All the water has been passed by the manager. (German hotel)

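[Figure: machine translation levels. On the analysis side, the input passes through Morphological analysis, Syntactic analysis, and Semantic Interpretation. On the generation side, the output is produced through Lexical selection, Syntactic realization, and Morphological synthesis. The two sides meet at an Interlingua.]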

■ Doesn’t work well enough yet
■ ACL 2013 8th Workshop on Statistical Machine Translation
■ MT Summit 2013
■ Machine Translation without the Translation, Chronicle of Higher Education
■ NLPCS 2013: 10th International Workshop on Natural Language Processing and Cognitive Science
■ Workshop on Natural Language Processing and Automated Reasoning
