Understanding Language Through AI
Natural Language Processing and the GPT Algorithm
Image Source: Murrstock/Stock.adobe.com
By Alex Pluemer for Mouser Electronics
Published February 14, 2023
Introduction
Computers that understand human speech and can respond to verbal commands used to exist purely in science fiction
(for example, HAL in 2001: A Space Odyssey). Attempts to program machines to digest human language more easily,
part of a field known as computational linguistics, date back to the 1930s, but only recent advances in
artificial intelligence (AI) have brought that fiction closer to reality.
The different methods and evolving technologies employed in this endeavor fall under the heading of natural
language processing (NLP), the field of computer science concerned with implementing AI to better understand human
language. Whether you're asking Siri for directions or Alexa to turn up the thermostat, NLP is helping to facilitate
that request. NLP includes both natural language understanding (NLU) and natural language generation (NLG), allowing
digital assistants, chatbots, etc. to comprehend human speech and produce a response.
Through the 1980s, most NLP took the form of handwritten rules that were painstakingly programmed into a
computer word by word, almost like teaching a language to a complete neophyte. Later advancements in NLP were made
possible by incorporating machine learning algorithms, which evolved beyond the strict “if/then”
framework established by handwritten rules (now referred to as symbolic NLP), to develop a more flexible,
probability-based structure for determining which word and/or syntax choices to make (known as statistical NLP).
These algorithms "learned" the rules of grammar and syntax by analyzing documents and other written material on the
internet and mimicking what they read.
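To make the contrast with handwritten rules concrete, here is a minimal sketch of the probability-based idea, assuming a tiny made-up corpus and a simple bigram (word-pair) model. The corpus, function names, and the choice of bigrams are illustrative assumptions, not a description of how any production system works.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus standing in for the written material a
# statistical system would learn from.
CORPUS = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "the dog sat on the rug",
]

def train_bigrams(sentences):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def next_word_probabilities(counts, prev):
    """Turn the raw follow-counts for `prev` into probabilities."""
    total = sum(counts[prev].values())
    return {word: n / total for word, n in counts[prev].items()}

model = train_bigrams(CORPUS)
probs = next_word_probabilities(model, "the")
best = max(probs, key=probs.get)
print(best, round(probs[best], 2))  # prints: cat 0.33
```

No rule in this sketch says that "cat" may follow "the"; the preference comes entirely from counting what appears in the corpus, which is the essence of the statistical approach.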
Another big step in NLP’s evolution was the implementation of neural networks. Neural machine translation
(NMT) takes even more of the preliminary programming and rules-based instruction out of the NLP process, allowing AI
and deep-learning algorithms to teach the rules of human language to themselves. NLP is almost ubiquitous now, from
speech-to-speech (like digital assistants) to text-to-text (like customer service chatbots) and even speech-to-text
implementations (like transcription and translation programs).
This article will offer a deeper look at the operations and tools AI uses to process human language and how NLP can
be implemented both now and in the future.
NLP Tasks and Operations
Now that we know what NLP is and what it does, the question becomes: how does it do it? How do AI and machine
learning algorithms teach themselves to understand human language? The first step is to determine a language's rules
through a process called syntactic analysis, which examines when, where, and how sentences start and stop and how
they are punctuated. For example, understanding that a period can end a sentence but can also punctuate an abbreviation is
crucial to grasping a sentence’s meaning. A similar operation is performed with words in a process called
part-of-speech tagging (for example, determining whether a word is being used as a noun or a verb in context). NLP
must differentiate between the noun form (“I’m going for a run”) and the verb form (“I
can’t get this program to run correctly”) of the same word to process its meaning. NLP also must distinguish
between the multiple meanings of a single word (known as word sense disambiguation), as in the case of the word
“bank.” A bank can be a place you invest your savings, the side of a river, or the tilting movement of a plane. AI
has made significant improvements in this area of NLP, as it's far simpler to let machine learning tools absorb the
different meanings of words by analyzing source material than to program them one by one. Distinguishing proper nouns
from common words (sometimes referred to as named entity recognition) is another critical task: for example,
differentiating between a member of the Chicago Bulls NBA team and a male cow.
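The "bank" example can be sketched in a few lines. The code below is a simplified, dictionary-overlap approach in the spirit of the classic Lesk method: it picks the sense whose description shares the most words with the sentence. The sense inventory, stopword list, and function names are hypothetical and written for this illustration only; modern systems learn these distinctions from data rather than from hand-built glosses.

```python
# Hypothetical mini sense inventory for "bank" (illustrative glosses only).
SENSES = {
    "financial institution": "place where people deposit invest or borrow money savings account",
    "river edge": "sloping land along the side of a river or stream water",
    "aircraft turn": "tilt of a plane or aircraft wings during a turn in flight",
}

# Small assumed stopword list so that filler words do not count as evidence.
STOPWORDS = {"a", "an", "the", "of", "or", "in", "to", "at", "on",
             "and", "is", "was", "i", "my", "into"}

def tokenize(text):
    """Lowercase, strip simple punctuation, and drop stopwords."""
    words = (w.strip(".,!?\"'").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}

def disambiguate(sentence, senses=SENSES):
    """Return the sense whose gloss overlaps most with the sentence."""
    context = tokenize(sentence)
    scores = {sense: len(context & tokenize(gloss))
              for sense, gloss in senses.items()}
    # On a tie, max() keeps the first sense listed in the inventory.
    return max(scores, key=scores.get)

for example in [
    "I deposited my savings at the bank.",
    "We had a picnic on the bank of the river.",
    "The pilot put the plane into a steep bank.",
]:
    print(disambiguate(example))
```

The three sentences resolve to the financial, river, and aircraft senses respectively, each on the strength of a single overlapping word ("savings," "river," "plane"). That fragility is a fair picture of why hand-built approaches scale poorly compared with learned ones.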
Determining when two words refer to the same entity (as is often the case with pronouns) requires a process called
co-reference resolution. For example, in the sentence “The press secretary wanted to respond, but he thought better
of it,” NLP would determine that the word “he” refers to the press secretary. The most challenging task NLP
performs, and the area in which it still stands to make the most improvement, is sentiment analysis. Sentiment
analysis refers to determining what attitudes and feelings a writer or speaker is trying to communicate and what the
underlying subtext may be. This is particularly critical in social media and marketing applications, where it is vital
to understand not just the words and phrases themselves but also the inclinations and convictions behind them. NLP
still has a long way to go, but AI and machine learning have made considerable strides in recent years.
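A rough sketch helps show why sentiment analysis is so difficult. The scorer below assumes a tiny hand-picked word list and a one-word negation rule; both are hypothetical simplifications chosen for illustration, not a representation of how commercial tools work.

```python
# Hypothetical sentiment lexicon; real lexicons hold thousands of entries.
POSITIVE = {"good", "great", "love", "excellent", "happy", "helpful"}
NEGATIVE = {"bad", "terrible", "hate", "awful", "slow", "broken"}
NEGATORS = {"not", "never", "no"}

def sentiment_score(text):
    """Sum word polarities, flipping the word right after a negator."""
    score = 0
    negate = False
    for raw in text.lower().split():
        word = raw.strip(".,!?\"'")
        if word in NEGATORS:
            negate = True
            continue
        # +1 for a positive word, -1 for a negative word, 0 otherwise.
        polarity = (word in POSITIVE) - (word in NEGATIVE)
        if polarity:
            score += -polarity if negate else polarity
        negate = False  # negation only reaches the very next word
    return score

print(sentiment_score("I love this product"))           # prints: 1
print(sentiment_score("The service was not bad"))       # prints: 1
print(sentiment_score("Oh great, it is broken again"))  # prints: 0
```

The last example is the telling one. A human reader hears sarcasm and frustration, while the word-counting approach sees one positive and one negative word and calls the sentence neutral. Capturing that kind of subtext is exactly where the field still has room to improve.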
NLP Use Cases
NLP is implemented in many ways, but nearly everyone is familiar with its most common applications: virtual
assistant technologies. As previously mentioned, digital voice-activated assistants utilize NLP to comprehend human
requests and generate coherent and topical responses.
Chatbots and virtual customer service representatives are another nearly ubiquitous implementation of NLP in our
everyday lives. AI-powered algorithms can learn from previous interactions with customers to improve their
conversational capabilities in the future. While chatbots aren't yet great at finding new solutions to problems,
they can remember previous solutions. A chatbot is more likely to know the answer to a customer's question if that
question has been asked in the past.
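The idea of a chatbot "remembering" previous solutions can be sketched as a lookup against past questions. This example assumes a simple word-overlap (Jaccard) similarity and a fixed threshold; the class name, threshold value, and sample questions are all hypothetical, and real chatbots use far richer language models.

```python
def tokens(text):
    """Lowercase a question and split it into a set of bare words."""
    return {w.strip(".,!?\"'").lower() for w in text.split()} - {""}

def jaccard(a, b):
    """Shared words divided by total distinct words (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

class MemoryBot:
    """Toy bot that reuses the answer to the most similar past question."""

    def __init__(self, threshold=0.5):
        self.history = []  # list of (question_tokens, answer) pairs
        self.threshold = threshold

    def remember(self, question, answer):
        self.history.append((tokens(question), answer))

    def reply(self, question):
        asked = tokens(question)
        best_score, best_answer = 0.0, None
        for past, answer in self.history:
            score = jaccard(asked, past)
            if score > best_score:
                best_score, best_answer = score, answer
        if best_answer is not None and best_score >= self.threshold:
            return best_answer
        return "Let me connect you with a human agent."

bot = MemoryBot()
bot.remember("How do I reset my password?",
             "Use the 'Forgot password' link on the sign-in page.")
bot.remember("Where is my order?",
             "Check the tracking link in your confirmation email.")

print(bot.reply("How can I reset my password?"))
print(bot.reply("Can I pay with a gift card?"))
```

The first question closely matches one the bot has seen, so it reuses the stored answer. The second is new, so the bot falls back to a handoff, which mirrors the limitation described above: recalling old solutions is easier than inventing new ones.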
In the last decade, NLP programs have made huge strides in text summarization applications, in which the programs
read and process enormous amounts of data, usually in the form of documentation, and present clear and concise
recaps. Text summarization applications are especially useful when the documents they're summarizing are highly
technical or jargon-heavy. NLP programs can interpret the more complex terminology and rephrase it in more
straightforward language that's easier for a layperson to understand. Machine translation tools are similarly
improving through machine learning and AI.
While NLP translation applications have long been able to translate individual words from one language into
another, they are getting better at deciphering the meaning of more complex thoughts and ideas. For example, asking
Google Translate to convert the title of Marcel Proust’s magnum opus À la recherche du temps
perdu into English (an infamously tricky translation) elicits the response “In Search of Lost
Time,” its commonly recognized English title, though not an exact translation.
NLP has also greatly improved efficiency in spam detection. Machine learning algorithms can detect patterns of
speech and common phrases frequently found in mass advertising more effectively than older spam detection programs.
Sentiment analysis is also a big factor in spam detection and social media harvesting, as determining the emotional
responses certain advertisements or social media posts elicit is a large part of developing more targeted
advertising campaigns designed for individual consumers. AI-powered NLP applications can process enormous amounts of
data and synthesize it into digestible information, a capability that is especially valuable in online and digital
advertising.
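To illustrate how a machine learning algorithm can pick up on phrases common in mass advertising, here is a minimal sketch of a naive Bayes text classifier, one long-established technique for spam filtering. The training messages, labels, and function names are made up for this example, and the article does not say which algorithm any particular spam filter uses.

```python
import math
from collections import Counter

# Hypothetical labeled messages ("ham" is the usual term for non-spam).
TRAINING = [
    ("win a free prize now", "spam"),
    ("limited offer click now", "spam"),
    ("free money click here", "spam"),
    ("meeting moved to monday", "ham"),
    ("are we still on for lunch", "ham"),
    ("please review the attached report", "ham"),
]

def train(examples):
    """Count words per label and messages per label."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, label_counts, vocab

def classify(text, model):
    """Return the label with the highest log-probability score."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        # Start from the log prior: how common this label is overall.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

model = train(TRAINING)
print(classify("click now for a free prize", model))  # prints: spam
print(classify("lunch meeting on monday", model))     # prints: ham
```

Nobody told the classifier that "free" or "click" signals advertising. It inferred that from how often those words appeared in messages labeled as spam, which is why this kind of filter can adapt as new examples arrive.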
NLP Tools
NLP software programs have been in development since the 1950s, but the relatively recent introduction of
statistical NLP and machine learning/neural networks has contributed to significant advancements in the quality of
NLP tools. As previously mentioned, digital assistants like Siri and Alexa have advanced NLU and NLG capabilities,
and applications like Google Translate have dramatically improved their understanding of context and intent. Options
for working with NLP-related software have expanded as well. Programmers working with Python have various tools and
libraries available in the Natural Language Toolkit (NLTK), a suite of open-source programs and libraries for working
with NLP applications. Resources for primary NLP-related tasks, smaller operations (e.g., sentence parsing,
stemming), and larger macro-level capabilities like semantic reasoning or sentiment analysis are free and accessible
to all.
The hottest new NLP tool on the scene is ChatGPT from OpenAI, an AI research and development laboratory based in
San Francisco. ChatGPT is powered by a model from the GPT-3.5 series, a refinement of GPT-3, the third generation
of OpenAI’s language prediction algorithm. Its NLG responses are sometimes indistinguishable from human writing,
arguably approaching the criteria for Alan Turing’s eponymous test for artificial intelligence. While ChatGPT’s responses aren’t
always factually sound (for example, when asked if Nigeria is in North America, it has responded “yes”),
its grammar and syntax are almost always unassailable. ChatGPT is particularly effective at mimicking standard forms
of documentation like rental agreements or simple contracts.
As a simple exercise, ChatGPT formulated the following response to the question, "What is natural language
processing?":
Natural Language Processing (NLP) is a field of artificial intelligence and computational linguistics that focuses
on the interaction between computers and human language, specifically in speech recognition, natural language
understanding, and natural language generation. NLP algorithms and techniques are used to process and analyze large
amounts of natural language data, such as speech and text, to perform tasks such as language translation, sentiment
analysis, text summarization, named entity recognition, and many more. NLP is an interdisciplinary field, drawing on
areas such as computer science, linguistics, and cognitive psychology, to develop algorithms and models that enable
computers to understand, interpret, and generate human language.
It would be hard for a human being to summarize NLP any more clearly or concisely than an NLP-driven algorithm
already can.
Conclusion
NLP has the potential to revolutionize several professions and industries. Translating legal and technical
documents from one language to another is a common and well-paid occupation, but translation tools could soon make
it obsolete. Writing boilerplate legal documents or drawing up standard contracts is another task for which NLP might
make humans unnecessary; ChatGPT has already been prompted to create such documents with great success.
Customer service might be the most transformed field: call centers staffed with actual human beings
may cease to exist because AI speech-to-speech and text-to-text programs can answer customers' questions without taking
breaks or going on vacations.
NLP has obvious potential in education as well. NLP programs can help language learners improve their writing and
grammar by demonstrating how to construct sentences and paragraphs more coherently.
Computers that can read, write, and communicate in human languages might seem like something out of science fiction,
but they're already here. You may have interacted with one today without even knowing it.
Author Bio
Alex is a senior technical writer for Wavefront Marketing specializing in advanced electronics, emerging
technologies and responsible technology development.