NLP Landscape from the 1960s to 2022 - An Introduction to NLP & Its History
Hello, and welcome! Today I will give a brief introduction to a very interesting and trending topic: Natural Language Processing (NLP). Let's get started.
Table of Contents
- What is NLP?
- Some Real-World Applications
- Common NLP Tasks
- Approaches Used to solve NLP use cases
- Challenges in NLP
- Conclusion
What is NLP?
Natural Language Processing (NLP) is about teaching machines to understand human languages and extract appropriate meaning from text. Language, as a structured medium of communication, is what separates us human beings from animals. According to Wikipedia, natural language processing is a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, particularly how to program computers to process and analyze large amounts of natural language data. So it's all about giving computers the ability to listen and talk like humans.
The main aim is to make machines able to understand and respond to natural language.
Some Real-World Applications:
- Contextual Advertisement
- Email Clients - Spam Filtering, Smart Replies
- Social Media - Removing Adult Content, Opinion mining
- Search Engines
- Chat Bots
Common NLP Tasks:
- Text/Document Classification
- Sentiment Analysis
- Information Retrieval
- Part-of-Speech tagging
- Language Detection & Machine Translation
- Conversational Agents
- Knowledge Graphs & Question Answering (QA) systems
- Text Summarization
- Topic Modelling
- Text Generation
- Spell Checking & Grammar correction
- Text Parsing
- Speech to text & Text to Speech
Approaches to NLP:
- Heuristic Methods
- Machine Learning (ML) based models
- Deep Learning (DL) based models
1. Heuristic Methods:
A heuristic, or heuristic technique, is any approach to problem-solving that uses a practical method or shortcut to produce solutions that may not be optimal but are sufficient given a limited timeframe or deadline. Examples include:
- Regular Expressions → Used to find text that matches a given pattern.
- WordNet (Lexical Dictionary) → Unlike a common dictionary, WordNet stores words in an organized manner based on their relations with other words, such as synonyms.
- Open Mind Common Sense → A knowledge base of common-sense facts collected from human contributors.
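As a rough illustration of the regular-expression heuristic above, the sketch below pulls email addresses out of free text using Python's built-in re module. The pattern, the function name extract_emails, and the sample sentence are assumptions made for this example; the pattern is deliberately simplified and will not cover every valid address.

```python
import re

# Simplified email pattern (an assumption for illustration,
# not a complete matcher for every valid address).
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text):
    """Return every substring of `text` that looks like an email address."""
    return EMAIL_PATTERN.findall(text)


# Hypothetical sample text, invented for this example.
sample = "Contact support@example.com or sales@example.org for details."
print(extract_emails(sample))
# → ['support@example.com', 'sales@example.org']
```

A hand-written rule like this is quick to write and easy to inspect, but it only finds what the pattern author anticipated.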
Advantages:
- It can be accurate because the rules are written by humans.
- It is a quick approach to get started with.
Disadvantages:
- It needs good knowledge and experience to apply heuristics effectively.
- You may have to consult multiple experts and aggregate their results.
- It is sometimes time-consuming, for example when many rules have to be written and maintained by hand.
2. Machine Learning Methods:
Heuristic methods struggle with open-ended problems, where the possible inputs cannot all be anticipated by hand-written rules. The major advantage of ML models over heuristic methods is that they can handle such open-ended problems by learning patterns from data.
Algorithms:
- Naïve Bayes Classification
- Logistic Regression
- Support Vector Machine
- LDA (Latent Dirichlet Allocation, for Topic Modelling)
- Hidden Markov Models
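To make the first algorithm in the list a little more concrete, here is a minimal sketch of a Naïve Bayes text classifier written in plain Python. The function names and the toy spam/ham training data are hypothetical and invented for this example; a real project would normally use a library implementation and far more data.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Count documents and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        words = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return label_counts, word_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log-probability for `text`."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        # Log prior: how common this label is in the training data.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one (Laplace) smoothing avoids zero probability
            # for words never seen with this label.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data, invented for this example.
training_data = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch with the project team", "ham"),
]
model = train_naive_bayes(training_data)
print(predict(model, "free prize money"))           # → spam
print(predict(model, "project meeting on monday"))  # → ham
```

Unlike a hand-written rule, nothing here says which words indicate spam; the model infers that from the labelled examples.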
3. Deep Learning Methods:
One of the main issues with the classical Machine Learning approach is that most ML models do not read text sequentially, so word order is largely lost. Deep Learning models can read text data in a sequential manner and, unlike the ML approach, can also automate feature generation.
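The short sketch below illustrates what losing word order means in practice. It assumes a bag-of-words representation, which is a common (though not the only) way of turning text into features for classical ML models; the helper name bag_of_words and the two sentences are made up for this example.

```python
from collections import Counter


def bag_of_words(text):
    """Represent a sentence only by its word counts, discarding word order."""
    return Counter(text.lower().split())


first = bag_of_words("the dog bit the man")
second = bag_of_words("the man bit the dog")

# Both sentences produce identical features even though their meanings differ.
print(first == second)  # → True
```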
Algorithms:
- RNN (Recurrent Neural Network)
- LSTM (Long Short-Term Memory)
- GRU (Gated Recurrent Unit)
- CNN (Convolutional Neural Network)
- Transformers
- Autoencoders
The main issue with RNNs is that they cannot process long sentences well. This problem is largely resolved by LSTMs. GRUs are mainly used for text generation. Transformers revolutionized NLP: through the attention mechanism, they can pay more attention to certain words. Autoencoders are built from two neural networks (LSTM-based in this setting), one of which acts as an encoder and the other as a decoder.
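The idea of paying more attention to certain words can be sketched with scaled dot-product attention for a single query vector. This is only a toy version in plain Python: the two-dimensional vectors and the words are made up, and real Transformers learn separate query, key, and value projections and use several attention heads in parallel.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # The output is the weighted average of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output


# Hypothetical 2-dimensional vectors for three words; the numbers are made up.
words = ["the", "bank", "river"]
keys = [[0.1, 0.0], [1.0, 0.2], [0.9, 0.8]]
values = keys  # reusing the keys as values keeps the example small
query = [1.0, 1.0]  # stand-in for the word currently being interpreted

weights, output = attention(query, keys, values)
for word, weight in zip(words, weights):
    print(f"{word}: {weight:.2f}")
```

With these made-up numbers, "river" receives the largest weight, so it contributes most to the output vector.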
Challenges in NLP:
- Ambiguity - Human language is so advanced that the same sentence can have completely different meanings depending on context.
For example -
1) I saw the boy on the beach with my binoculars.
2) I've never tasted a cake quite like that one before!
The first sentence has two possible meanings: either I used my binoculars to see the boy, or the boy had my binoculars. The second sentence could be either praise or criticism. We can tell which meaning is intended from the context of the paragraph, but making any NLP software understand this is a very difficult task.
- Contextual Words - The same word can have a different meaning in a different context.
For example -
I ran to the store because we ran out of milk.
Here the first "ran" means moving quickly, while "ran out" means the supply was used up.
- Colloquialisms & Slang - In conversation, we often say one thing that means something else. Humans understand what is meant, but a machine cannot easily interpret it. For example,
This task is a piece of cake for me.
Playing football is not your cup of tea.
- Synonyms - Humans use different words to express the same idea, and a machine may be confused about whether they mean the same thing.
- Irony, Sarcasm & Tonal Difference - With sarcasm, the same words can be meant literally or can imply the opposite, depending on the tone in which they are said.
- Spelling Errors - When writing emails or any other form of text, we make typos that machines cannot easily interpret.
- Creativity
- Diversity - There are so many languages around the globe, often several within a single country. Software has to go deep into each particular language to understand it, so this task is still an open research area in NLP. I would say we have reached only around 5% of what NLP can give us.
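Of the challenges listed above, spelling errors are among the easier ones to illustrate in code. The sketch below uses get_close_matches from Python's built-in difflib module to map a misspelled word to the most similar word in a small vocabulary. The vocabulary, the function names, and the 0.8 similarity cutoff are assumptions chosen for this example; a real spell checker would use a full dictionary and take the surrounding context into account.

```python
import difflib

# Hypothetical mini-vocabulary; a real system would use a full dictionary.
VOCABULARY = ["language", "machine", "natural", "processing", "translation"]


def correct_word(word, vocabulary=VOCABULARY):
    """Return the closest vocabulary word, or `word` unchanged if none is close."""
    matches = difflib.get_close_matches(word.lower(), vocabulary, n=1, cutoff=0.8)
    return matches[0] if matches else word


def correct_text(text):
    """Correct each whitespace-separated word independently."""
    return " ".join(correct_word(word) for word in text.split())


print(correct_text("natral langauge procesing"))
# → natural language processing
```

Because each word is corrected in isolation, this approach cannot fix a typo that happens to be another valid word, which is one reason spelling remains a challenge.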
Conclusion
NLP is a very vast field that revolves around human language and around solving real-world problems smartly, helping systems communicate easily with humans. In this article, we have learned what has made NLP such a popular and widely researched topic since the 20th century. We have also seen the different challenges faced when dealing with an NLP problem statement and how current deep learning architectures are evolving to address these challenges. There are many more challenges that you will discover when you work on any NLP use case.