A Brief NLP + AI
Erin M. Buchanan
06/03/2019
What are we going to talk about?
Natural Language Processing
Computational Linguistics
Dealing with language (which is messy)
Artificial Intelligence
What will you learn?
How key concepts from text mining and linguistics are used to describe and analyze language
How data structures and algorithms are used in text mining and NLP
How we can expand these techniques and apply them to artificial intelligence
What is NLP?
Natural language processing
Roots in computer science, artificial intelligence, and linguistics
Focuses on human language and how to analyze language data
What is language? How do we deal with such a messy construct?
Origins of NLP
Turing Test – Intelligence (1950)
Chinese Room Thought Experiment by Searle (1980)
Georgetown Experiment – Machine Translation (1954)
NLP Systems (1960s)
SHRDLU
ELIZA
Explosion in research driven by increases in computational power, corpus linguistics, and machine learning
Why Study NLP?
An often-cited estimate is that roughly 80% of “big data” is unstructured data
Images
Videos
Human language (text, recordings)
Text Mining (text analytics, sentiment analysis, etc.)
Linguistic, statistical, and machine learning techniques used to derive high-quality information from text
Traditional Approaches to Text Analytics
Semantics
Readability
Student interest indices
Vocabulary
Frequency, frequency, frequency
Factor/cluster analysis
Word clouds
Pages, chapters, etc.
Terms to Know
Corpus: a body of linguistic data
Corpus of Contemporary American English
Terms to Know
Corpora have changed the face of NLP
The availability of data on the internet (and sharing!) has given us a world of possibilities for analyzing language
There has also been a large increase in corpora that AREN’T in English!
Terms to Know
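Tokens: the individual word occurrences in a text (the token count is the total number of words)
Types: the distinct words in a text (the type count is the number of distinct words)
Frequency distribution: a list of all the types (unique tokens) and a count of how many times each appears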
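Tokens, types, and a frequency distribution can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the `tokenize` helper, its simple regular expression, and the sample sentence are assumptions made for the example, not part of the original slides.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and pull out runs of letters/apostrophes as word tokens."""
    return re.findall(r"[a-z']+", text.lower())

# Sample text invented for this sketch
text = "The thieves stole the paintings. The paintings were sold."

tokens = tokenize(text)        # every word occurrence
types = set(tokens)            # distinct words only
freq_dist = Counter(tokens)    # type -> number of occurrences

print(len(tokens))               # number of tokens
print(len(types))                # number of types
print(freq_dist.most_common(2))  # the two most frequent types
```

A real tokenizer has to handle hyphens, numbers, contractions, and punctuation more carefully; the regular expression here is only a stand-in.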
Terms to Know
Dispersion plot: a graphical representation of the location of tokens in a text
Terms to Know
Collocation: a sequence of words that occur together often
n-Gram: n words that occur together
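As a rough sketch, n-grams can be built by sliding a window of n words across the token list, and repeated bigrams can be flagged as collocation candidates. The `ngrams` helper, the sample sentence, and the "appears at least twice" threshold are assumptions for illustration; practical collocation finders usually score pairs with an association measure rather than raw counts.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Sample tokens invented for this sketch
tokens = "new york is big and new york is busy".split()

bigrams = ngrams(tokens, 2)
bigram_counts = Counter(bigrams)

# Crude collocation candidates: bigrams seen at least twice
collocation_candidates = [bg for bg, count in bigram_counts.items() if count >= 2]
print(collocation_candidates)
```

Note that raw counts also surface uninteresting pairs such as ("york", "is"), which is one reason association measures are preferred on real corpora.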
How to Compute Language
Basic Statistics
Frequency: Counts of characters, words, sentences
Lexical Diversity: proportion of word tokens that are unique (types divided by tokens)
Lexical Dispersion: position of word tokens in the text
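Two of these statistics are simple enough to sketch directly. The function names and the sample sentence below are assumptions for this example; the token positions returned by `dispersion` are the raw data that a dispersion plot would draw.

```python
def lexical_diversity(tokens):
    """Types divided by tokens; 0.0 for an empty text."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def dispersion(tokens, word):
    """Positions (0-based token offsets) at which `word` occurs."""
    return [i for i, tok in enumerate(tokens) if tok == word]

# Sample tokens invented for this sketch
tokens = "the lost children were found by the searchers".split()

print(lexical_diversity(tokens))   # 7 types / 8 tokens
print(dispersion(tokens, "the"))   # offsets where "the" appears
```

Lexical diversity depends heavily on text length, so values are best compared across texts of similar size.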
How to Compute Language
Word Sense Disambiguation
Determine which word was intended in a given context
serve: help with food or drink; hold an office; put ball into play
dish: plate; course of a meal; communications device
Contextual clues:
The lost children were found by the searchers (agentive)
The lost children were found by the mountain (locative)
The lost children were found by the afternoon (temporal)
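One classic way to use contextual clues is to pick the sense whose dictionary definition shares the most words with the surrounding sentence (a simplified version of the Lesk approach). The sketch below applies it to the three senses of "dish" listed above. The glosses, the stopword list, and the test sentences are all invented for illustration; a real system would draw definitions from a lexical resource.

```python
# Hypothetical one-line glosses for the three senses of "dish"
SENSES = {
    "plate": "flat container used to hold or serve food at the table",
    "course of a meal": "a particular food prepared and cooked as part of a meal",
    "communications device": "antenna that receives satellite television signals",
}

# Small hand-picked stopword list (an assumption for this sketch)
STOPWORDS = {"a", "an", "the", "of", "to", "or", "as", "at", "that", "and", "on", "in"}

def content_words(text):
    """Lowercased words of `text` with stopwords removed."""
    return {w for w in text.lower().split() if w not in STOPWORDS}

def simplified_lesk(context, senses):
    """Return the sense whose gloss overlaps most with the context words."""
    ctx = content_words(context)
    return max(senses, key=lambda s: len(ctx & content_words(senses[s])))

print(simplified_lesk("the chef cooked a spicy dish for the meal", SENSES))
print(simplified_lesk("we mounted the dish on the roof to get satellite television", SENSES))
```

Word overlap alone cannot separate the agentive, locative, and temporal readings of "by" in the examples above; those need knowledge about what searchers, mountains, and afternoons are.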
How to Compute Language
Pronoun Resolution
A pronoun refers to a noun, such as I/you/this
The noun it refers to is called the antecedent
Examples
The thieves stole the paintings. They were subsequently sold.
The thieves stole the paintings. They were subsequently caught.
The thieves stole the paintings. They were subsequently found.
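The three examples differ only in the final verb, which suggests one toy strategy: keep the candidate antecedents that could plausibly undergo that verb. The sketch below hard-codes that knowledge in a hand-written table, which is purely an assumption for illustration; real systems have to learn or look up such preferences, and the third sentence stays ambiguous either way.

```python
# Candidate antecedents for "They" in the example sentences
CANDIDATES = ["thieves", "paintings"]

# Hypothetical, hand-written table: which candidates can plausibly undergo each verb
PLAUSIBLE = {
    "sold": {"paintings"},
    "caught": {"thieves"},
    "found": {"thieves", "paintings"},
}

def resolve_they(verb, candidates=CANDIDATES):
    """Return the candidates compatible with the verb; more than one means ambiguous."""
    allowed = PLAUSIBLE.get(verb, set(candidates))
    return [c for c in candidates if c in allowed]

for verb in ("sold", "caught", "found"):
    print(verb, resolve_they(verb))
```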
How to Compute Language
Generating Language Output
Question Answering
For example, who sold the paintings?
Machine Translation
Being able to translate from one language to another
Search for Google Translate fails
Spoken Dialog Systems
Siri, Ok Google, etc.
How to Compute Language
What can I do with NLP?
What is next?
Artificial intelligence is implemented all around you now, and you certainly use it on a day-to-day basis:
Ok Google, Siri, any automated phone system (human speech processing)
Gaming
Watson and intelligent searching
How can this be applied to health?
What is next?
The healthcare industry has finally reached a point of understanding how to more effectively use its data.
Image processing has shown vast possibilities for reading scans quickly and efficiently to develop detection algorithms
What is next?
Paired with Apple's ResearchKit, researchers are developing algorithms to look for markers of autism and to help track chronic illness, epilepsy, and more.
Paired with NLP and behavioral economics, we might be able to “nudge” patients into better outcomes
Questions?