Natural language processing definition
Natural language processing (NLP) is the branch of artificial intelligence (AI) that deals with communication: How can a computer be programmed to understand, process, and generate language just like a person?
While the term originally referred to a system’s ability to read, it has since become shorthand for all of computational linguistics. Subcategories include natural language generation (NLG), a computer’s ability to create communication of its own, and natural language understanding (NLU), the ability to understand slang, mispronunciations, misspellings, and other variants in language.
How natural language processing works
Natural language processing works through machine learning (ML). Machine learning systems store words and the ways they come together just like any other form of data. Phrases, sentences, and sometimes entire books are fed into ML engines where they’re processed based on grammatical rules, people’s real-life linguistic habits, or both. The computer then uses this data to find patterns and extrapolate what comes next. Take translation software, for example: In French, “I’m going to the park” is “Je vais au parc,” so machine learning predicts that “I’m going to the store” will also begin with “Je vais au.” All the computer needs after that is the word for “store.”
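To make the "find patterns and extrapolate what comes next" idea concrete, here is a minimal sketch of a bigram model, one of the simplest pattern-counting approaches. It is an illustration only, not how any particular translation product works; the three-sentence corpus and the function names `train_bigrams` and `predict_next` are assumptions made up for this example.

```python
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count, for every word, which words follow it in the training text."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following

def predict_next(following, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    candidates = following.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]

# Toy corpus (hypothetical); a real system would learn from millions of sentences.
corpus = [
    "je vais au parc",
    "je vais au magasin",
    "je vais au parc demain",
]
model = train_bigrams(corpus)

print(predict_next(model, "vais"))  # the word seen most often after "vais"
print(predict_next(model, "au"))    # the word seen most often after "au"
```

Because "au" follows "vais" in every training sentence, the model predicts it with no knowledge of French grammar. Modern systems use far richer statistical models, but the principle of learning from observed word sequences is the same.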
Natural language processing applications
Machine translation is one of the better-known NLP applications, but it’s not the most commonly used. Search is. Every time you look something up in Google or Bing, you’re feeding data into the system. When you click on a search result, the system treats this as confirmation that the results it found are right and uses this information to improve future searches.
Chatbots work the same way: They integrate with Slack, Microsoft Messenger, and other chat programs where they read the language you use, then turn on when you type in a trigger phrase. Voice assistants such as Siri and Alexa also kick into gear when they hear phrases like “Hey, Alexa.” That’s why critics say these programs are always listening: If they weren’t, they’d never know when you need them. Unless you turn an app on manually, natural language processing programs must operate in the background, waiting for that phrase.
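The trigger-phrase idea can be sketched in a few lines. This is a deliberately simplified illustration that works on text that has already been transcribed; real assistants detect the wake phrase in audio. The function name `extract_command` and its default wake phrase are assumptions for this example.

```python
def extract_command(utterance, wake_phrase="hey, alexa"):
    """Return the text after the wake phrase, or None if the phrase is absent."""
    normalized = utterance.strip().lower()
    if not normalized.startswith(wake_phrase):
        # No trigger phrase: the program stays idle and ignores the input.
        return None
    # Drop the wake phrase plus any surrounding punctuation and spaces.
    return normalized[len(wake_phrase):].strip(" ,.!?")

print(extract_command("Hey, Alexa, what's the weather?"))
print(extract_command("What's the weather?"))
```

Everything that does not start with the trigger phrase is discarded, which is why the program has to inspect each utterance in the first place.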
Even if these programs are always there, NLP isn’t Big Brother. Natural language processing does more good for the world than bad. Just imagine your life without Google search. Or spellcheck, which uses NLP to compare the words you type to ones in the dictionary. Comparing the two data sets allows spellcheckers to identify what’s wrong and to offer suggestions.
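The dictionary comparison behind spellcheck can be approximated with Python's standard `difflib` module. This is a rough sketch under stated assumptions: the five-word `DICTIONARY` and the `suggest` helper are hypothetical, and production spellcheckers rely on much larger word lists and more sophisticated ranking.

```python
import difflib

# Tiny stand-in dictionary (hypothetical); real spellcheckers use full word lists.
DICTIONARY = ["language", "natural", "processing", "computer", "translate"]

def suggest(word, dictionary=DICTIONARY, max_suggestions=3):
    """Return [] if the word is in the dictionary, else its closest matches."""
    word = word.lower()
    if word in dictionary:
        return []
    # get_close_matches ranks candidates by string similarity (0.0 to 1.0)
    # and keeps only those scoring at or above the cutoff.
    return difflib.get_close_matches(word, dictionary, n=max_suggestions, cutoff=0.6)

print(suggest("langauge"))  # misspelled: prints the close dictionary matches
print(suggest("computer"))  # spelled correctly: prints an empty list
```

The `cutoff` parameter controls how similar a dictionary word must be before it is offered as a suggestion; lowering it produces more, but less relevant, suggestions.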
Natural language processing examples
Search and spellcheck are so commonplace, we often take them for granted, especially at work where NLP offers radical productivity gains. Want to know how many vacation days you have left? Don’t call HR. Save time and ask Talla, a chatbot that searches company policies for an answer. On the phone and need last quarter’s numbers? Mention them during your conversation and audio search startup SecondMind will show the answer on your screen. The company boasts its integrated search tool makes accounting and customer resource calls up to ten times shorter.
Natural language processing also helps job recruiters sort through resumes, attract diverse candidates, and hire more qualified workers. Spam detection uses NLP to keep unwanted email out of your inbox; programs such as Outlook and Gmail use it to sort messages from certain people into folders you create.
Tools like sentiment analysis help companies quickly discern whether tweets about them are good or bad so they can triage customer concerns. Sentiment analysis doesn’t just process words on social media; it breaks down the context in which they appear. Only 30 percent of English words are positive, says Skye Morét, data visualizer at analysis firm Periscopic; the rest are neutral or negative. So NLP helps businesses more fully understand a post: What’s the consumer emotion behind those neutral words?
Traditionally, corporations used natural language processing to classify feedback as positive or negative. But Ryan Smith, senior vice president of social and innovation at FleishmanHillard, says today’s tools identify more precise emotions, like sadness, anger, and fear.
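The traditional positive-or-negative classification can be sketched with a simple word-list approach. This is a minimal illustration, not the method used by any tool named in this article: the `POSITIVE` and `NEGATIVE` word sets and the `classify` function are made up for the example, and counting words this way ignores the context that real sentiment analysis takes into account.

```python
import re

# Hypothetical word lists; commercial tools use far larger, weighted lexicons
# or trained models.
POSITIVE = {"love", "great", "happy", "excellent", "thanks"}
NEGATIVE = {"hate", "terrible", "angry", "broken", "refund"}

def classify(text):
    """Label text positive, negative, or neutral by counting lexicon words."""
    words = re.findall(r"[a-z']+", text.lower())
    # Each positive word adds 1 to the score; each negative word subtracts 1.
    score = sum((w in POSITIVE) - (w in NEGATIVE) for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

for post in [
    "I love this phone, great battery!",
    "Terrible service and my order arrived broken",
    "The package arrived on Tuesday",
]:
    print(classify(post), "-", post)
```

The third post contains no words from either list, so it lands in the neutral bucket. Telling apart the emotions behind such neutral wording is what more advanced tools attempt.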
Natural language processing software
Whether you’re building a chatbot, voice assistant, predictive text application, or other application with natural language processing at its core, you’ll need tools to help you do it. According to Technology Evaluation Centers, the most popular natural language processing software includes:
- Natural Language Toolkit (NLTK). NLTK is an open source framework for building Python programs to work with human language data. It was developed in the Department of Computer and Information Science at the University of Pennsylvania and provides interfaces to more than 50 corpora and lexical resources, a suite of text processing libraries, wrappers for natural language processing libraries, and a discussion forum. NLTK is offered under the Apache 2.0 license.
- SpaCy. SpaCy is an open source library for advanced natural language processing explicitly designed for production use rather than research. SpaCy was made with high-level data science in mind and allows deep data mining. It’s released under the MIT license.
- Gensim. Gensim is an open source Python library for natural language processing. The platform-independent library supports scalable statistical semantics, analysis of plain-text documents for semantic structure, and the ability to retrieve semantically similar documents. It’s intended to handle large amounts of text without human supervision.
- Amazon Comprehend. This Amazon service doesn’t require machine learning experience. It’s intended to help organizations find insights from email, customer reviews, social media, support tickets, and other text. It uses sentiment analysis, part-of-speech extraction, and tokenization to parse the intention behind the words.
- IBM Watson Tone Analyzer. This cloud-based solution is intended for social listening, chatbot integration, and customer service monitoring. It can analyze emotion and tone in customer posts and monitor customer service calls and chat conversations.
- Google Cloud Translation. This API uses natural language processing to examine a source text and determine its language, then uses neural machine translation to dynamically translate the text into another language. The API allows users to integrate the functionality into their own programs.
Natural language processing courses
There are many resources available for learning to create and maintain natural language processing applications, and a number of them are free. They include:
- Introduction to Natural Language Processing in Python from DataCamp. This free course, offered as 15 videos and 51 exercises, covers the basics of natural language processing using Python. It covers how to identify and separate words, how to extract topics in a text, and how to build your own fake news classifier.
- Introduction to Natural Language Processing (NLP) from Udemy. This introductory course provides hands-on experience working with and analyzing text using Python and the Natural Language Toolkit. It consists of three hours of on-demand video, three articles, and 16 downloadable resources. The course costs $19.99, which includes a certificate of completion.
- Hands On Natural Language Processing (NLP) using Python from Udemy. This course is for individuals with basic programming experience in any language, an understanding of object-oriented programming concepts, knowledge of basic to intermediate mathematics, and knowledge of matrix operations. It is completely project-based and involves building a text classifier for predicting sentiment of tweets in real time, and an article summarizer that can fetch articles and find the summary. The course consists of 10.5 hours of on-demand video and eight articles. The course costs $19.99, which includes a certificate of completion.
- Natural Language Processing (NLP) from edX. This six-week course, offered by Microsoft through edX, provides an overview of natural language processing and the use of classic machine learning methods. It covers statistical machine translation and deep semantic similarity models (DSSM) and their applications. It also covers deep reinforcement learning techniques applied in natural language processing and vision-language multimodal intelligence. It’s an advanced-level course and those who complete it can pursue a Verified Certificate for $99.
- Natural Language Processing from Coursera. Part of Coursera’s Advanced Machine Learning Specialization, this course covers natural language processing tasks including sentiment analysis, summarization, dialogue state tracking, and more. Coursera says it is an advanced level course and estimates it will take five weeks of study at four to five hours per week to complete.
- Natural Language Processing in TensorFlow by Coursera. This course is part of Coursera’s TensorFlow in Practice Specialization, and it covers using TensorFlow to build natural language processing systems that can process text and input sentences into a neural network. Coursera says it is an intermediate-level course and estimates it will take four weeks of study at four to five hours per week to complete.
Natural language processing for social good
In addition to helping companies process data, sentiment analysis also helps us understand society. Periscopic, for example, has paired NLP with visual recognition to create the Trump-Emoticoaster, a data engine that processes language and facial expressions in order to monitor President Donald Trump’s emotional state.
Similar tech could also prevent school shootings: At Columbia University, researchers have processed 2 million tweets posted by 9,000 at-risk youth, looking for the answer to one question: How does language change as a teen comes closer and closer to getting violent?
“Problematic content can evolve over time,” says program director Dr. Desmond Patton. As at-risk youth grow closer to the brink, they reach out for help, using language. Natural language processing then flags problematic emotional states so that social workers can intervene.
Like Periscopic, Columbia pairs sentiment analysis with image recognition to improve accuracy. Patton says computer vision breaks down pictures attached to the tweets, then machine learning processes them together with the language to tell “the actual emotionality of an image. Is this image about grief? Is this image about threats? … What else is happening in an image that helps us understand more complexly?” In addition to school shootings, the Columbia program also hopes to prevent gang violence.
Natural language processing for personal improvement
Natural language processing can also help you monitor your own emotional state. Woebot is an electronic therapist that connects with users via a Facebook Messenger chatbot or through a stand-alone app. There’s no high-level sentiment analysis here yet, though. Woebot essentially tracks only depression and anxiety, looking for words that may indicate users face an emergency situation.
This story, “What is natural language processing? The business benefits of NLP explained” was originally published by