
Natural Language Processing in Python: Part 5


Stemming and Lemmatization in Python NLP

This is Part 5 of this series. Four earlier blogs have already been published, and if you are new to the series I suggest reading them first so that this one is easier to follow.


In this blog we will learn the basic to advanced concepts of stemming and lemmatization.



What are Stemming and Lemmatization?


Stemming - It is the process of reducing a word to its root form, even if that root has no dictionary meaning. For example, beautiful and beautifully are both stemmed to beauti, which has no meaning in an English dictionary.


In other words, stemming converts a word into its stem (root form) by removing a suffix such as “es”, “ing” or “ed”.
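To make the idea of suffix removal concrete, here is a minimal sketch of a toy stemmer. It is not the algorithm used by NLTK or any other library; the suffix list, the `naive_stem` name and the `min_stem_len` parameter are all assumptions made up for this illustration.

```python
# Toy suffix stripper: an illustration only, NOT the Porter algorithm.
# The suffix list below is a made-up assumption and is checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word: str, min_stem_len: int = 3) -> str:
    """Strip the first matching suffix, keeping at least min_stem_len characters."""
    lowered = word.lower()
    for suffix in SUFFIXES:
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= min_stem_len:
            return lowered[: -len(suffix)]
    return lowered  # no suffix matched, or the word is too short to cut


words = ["playing", "played", "boxes", "cats", "making", "is"]
stems = [naive_stem(w) for w in words]
print(stems)  # ['play', 'play', 'box', 'cat', 'mak', 'is']
```

Notice that "making" becomes "mak", which is not a real word. This is exactly the behaviour described above: a stem does not have to be in the dictionary.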


Lemmatization - It is the process of reducing a word to its dictionary form (its lemma). It takes into account the meaning of the word in the sentence.


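For example, beautiful and beautifully are lemmatised to beautiful and beautifully respectively, without changing the meaning of the words. An irregular form such as better, however, is lemmatised to good when it is treated as an adjective, since both words share the same base meaning. Whether best is also mapped to good depends on the lemmatizer.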


Let's get started. First, we need to install the libraries required to run the code properly.


Install these libraries:


First, install the NLTK library:


pip install nltk

Then import it using:


import nltk

Types of Lemmatizers:


There are many types of lemmatizers. Some of them are listed below; in this blog we will work with a few of them, such as the WordNet lemmatizer:

  • Wordnet Lemmatizer

  • spaCy Lemmatization

  • TextBlob Lemmatizer

  • Pattern Lemmatizer

  • Stanford CoreNLP Lemmatization

  • Gensim Lemmatize

  • TreeTagger



"Wordnet" Lemmatizer with NLTK

 

Next, download "wordnet". WordNet is a free lexical database for the English language that aims to establish structured semantic relationships between words.


nltk.download('wordnet')


Now start lemmatizing by importing the lemmatizer:


from nltk.stem import WordNetLemmatizer




To lemmatize a simple sentence, first tokenize it and then lemmatize each token.






"TextBlob" Lemmatizer

 

First, install TextBlob using:


pip install textblob

TextBlob is a powerful NLP package.

Use Word for a single word, and TextBlob for a group of words or sentences.


Examples:

With complete sentences:


Stemming

 

Stemming is easiest to understand with the help of an example.
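As a minimal sketch, the example below uses NLTK's PorterStemmer. Choosing the Porter stemmer is my assumption, since NLTK ships several stemmers; the word list is also my own, although beautiful and beautifully are the words discussed earlier in this blog.

```python
from nltk.stem import PorterStemmer

# The Porter stemmer is rule based, so no data download is needed.
stemmer = PorterStemmer()

words = ["running", "studies", "beautiful", "beautifully"]
stems = {word: stemmer.stem(word) for word in words}

for word, stem in stems.items():
    print(word, "->", stem)
```

Here "running" is cut down to "run", but "studies" becomes "studi", which is not a dictionary word.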






Why is Lemmatization better than Stemming?


A stemming algorithm works by cutting the suffix off a word, which can change the meaning of the word or produce a string that is not a real word. Lemmatization returns a dictionary form instead, so the meaning of the word is not changed.


Thanks for reading this blog. In the next blog we will cover the next topic: finding unusual words using Python NLP.


If you like the Codersarts blog and are looking for assignment help, project help, or programming tutor help and suggestions, you can send mail to contact@codersarts.com.

Please write your suggestions in the comment section below if you find anything incorrect in this blog post.
