Introduction to Natural Language Processing (NLP)

What is NLP?

NLP is a subfield of artificial intelligence (AI) and computational linguistics that focuses on the interaction between computers and human language. It enables computers to understand, interpret, and generate natural language in a way that is meaningful and useful.

  • NLP is a set of techniques used to teach computers to understand words, sentences, and even whole paragraphs, and it helps computers read, write, and even hold conversations with people.


Knowing the past is important because it helps us appreciate the efforts made and the challenges overcome to get here:

History of NLP

1950s:

Researchers asked, "Why can't we create a tool that helps computers understand and process human language?" They started with simple rules and grammars to analyze language, but progress was limited.

1960s-1980s:

NLP continued to evolve based on the heuristic approach. Heuristics are simply the rules, guidelines, or instructions given to machines so that they can understand and work with human language; models built with this approach are called "rule-based models".

1980s-1990s:

With the use of statistical and machine learning techniques, NLP gained popularity. Using these algorithms and techniques, researchers started improving language processing tasks such as speech recognition and machine translation.

2000s:

Machine learning algorithms such as Support Vector Machines (SVM) and Hidden Markov Models (HMM) helped computers understand and process language better, and they were applied to applications such as sentiment analysis and information retrieval.


2010s:

The era of deep learning came into existence and brought significant advancements in NLP. Researchers developed models such as recurrent neural networks (RNNs) and, later, transformers, which revolutionized tasks such as language translation, language understanding, and text generation.

2018-present:

Large language models came into existence, and models such as ChatGPT and Google Bard were developed. These models are capable of generating human-like language, making NLP more accessible and applicable across many domains.



Approaches for NLP:

1) Heuristic Approach: 

Heuristics are simply the rules, guidelines, or instructions given to machines so that they can understand and work with human language. Models built using this approach are called "rule-based models".

Examples: 

1. Regular Expressions: 

    • These are patterns or rules that help a computer search for and find things in a body of text. 
    • For example, if I ask a search engine, "What is the temperature in Khammam?", the engine first finds the keywords in the text, such as "temperature" and "Khammam" (the location), and then looks up information related to the temperature. Why is it done this way? Because the engine follows a simple rule: for any given text, first find the keywords in it, then fetch the related information (for example, from a lexical resource such as WordNet). A small sketch of this kind of keyword matching follows this list.
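As a rough, purely illustrative sketch (the query, keyword list, and patterns below are made up and are not how a real search engine works), here is how regular expressions can pull keywords and a location out of a question in Python:

    import re

    # Hypothetical user query, only for illustration
    query = "What is the temperature in Khammam?"

    # Keywords the rule-based system looks for
    keywords = ["temperature", "weather", "humidity"]
    found = [w for w in keywords if re.search(rf"\b{w}\b", query, re.IGNORECASE)]

    # A simple hand-written rule: the word right after "in" is treated as the location
    match = re.search(r"\bin\s+([A-Za-z]+)", query, re.IGNORECASE)
    location = match.group(1) if match else None

    print(found)     # ['temperature']
    print(location)  # 'Khammam'

Every pattern here is written by hand, which is exactly what makes rule-based systems brittle as the language becomes more varied.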

2. WordNet:

    • It is a big dictionary in which every word is linked to related words. It helps programs extract relevant information quickly (a small sketch follows below).
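As a minimal sketch (assuming the NLTK library and its WordNet data have been installed), here is how a program can look up a word's senses and its relations to other words:

    # pip install nltk
    # python -c "import nltk; nltk.download('wordnet')"
    from nltk.corpus import wordnet as wn

    # Every sense of "dog" is a "synset" with its own definition
    for synset in wn.synsets("dog")[:3]:
        print(synset.name(), "-", synset.definition())

    # Relations link words together, e.g. hypernyms ("a dog is a kind of ...")
    dog = wn.synsets("dog")[0]
    print(dog.hypernyms())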

2) Machine Learning Approach:

Instead of writing rules by hand after manually studying the text, ML algorithms learn the rules themselves from the data they are trained on; a small sketch using one of the models listed below follows after the list.

Models used in ML:

  1. Naive Bayes 
  2. Support Vector Machines (SVM)
  3. Logistic Regression
  4. Latent Dirichlet Allocation (LDA)
  5. Hidden Markov Models (HMM)
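As a minimal sketch of this idea (using scikit-learn and a tiny made-up dataset), a Naive Bayes text classifier learns word/label statistics from examples instead of relying on hand-written rules:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    # Tiny, made-up training set: 1 = positive review, 0 = negative review
    texts = ["I loved this movie", "what a great film", "terrible acting", "I hated it"]
    labels = [1, 1, 0, 0]

    # Turn text into word counts, then let Naive Bayes learn which words signal which label
    model = make_pipeline(CountVectorizer(), MultinomialNB())
    model.fit(texts, labels)

    print(model.predict(["great movie", "I hated the acting"]))  # e.g. [1 0]

No rule here was written by hand; the model inferred the word/label relationships from the training examples.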

3) Deep Learning Approach:


Big Advantages over ML: 

First, classical machine learning algorithms ignore the sequential order of the words in a text, whereas deep learning architectures can model it. Second, to train a classical ML model we have to extract features from the data by hand, while deep learning models learn useful features on their own. The sketch after the list below illustrates both points.


Architectures used:
  1. Recurrent Neural Networks (RNNs)
  2. Long Short-Term Memory networks (LSTMs)
  3. Gated Recurrent Units (GRUs) / Convolutional Neural Networks (CNNs)
  4. Transformers
  5. Autoencoders
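As a minimal sketch (written in PyTorch, with illustrative class name and sizes), an LSTM-based classifier reads the tokens in order and learns its own features from raw word indices, which is exactly the advantage described above:

    import torch
    import torch.nn as nn

    class LSTMClassifier(nn.Module):
        def __init__(self, vocab_size=10_000, embed_dim=64, hidden_dim=128, num_classes=2):
            super().__init__()
            self.embedding = nn.Embedding(vocab_size, embed_dim)           # learned word vectors
            self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)   # reads tokens in order
            self.fc = nn.Linear(hidden_dim, num_classes)                   # classify from the final state

        def forward(self, token_ids):                 # token_ids: (batch, seq_len) word indices
            embedded = self.embedding(token_ids)
            _, (hidden, _) = self.lstm(embedded)      # hidden: (1, batch, hidden_dim)
            return self.fc(hidden[-1])                # one score per class

    # One fake batch: 3 "sentences" of 7 random token ids each
    logits = LSTMClassifier()(torch.randint(0, 10_000, (3, 7)))
    print(logits.shape)   # torch.Size([3, 2])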
 











