
Sentiment Analysis in Machine Learning for Text Datasets

Updated: Mar 23, 2021

Sentiment analysis is a type of data mining used to recognize people's inclination toward something as positive, negative, or neutral. It is mainly applied to text datasets, and sometimes to image datasets, where feature extraction is used to detect positive, negative, and neutral emotions.


How to perform sentiment analysis on a given dataset:


Find a dataset, then follow the process below:


First, import the libraries and read the CSV dataset file:


>>> import numpy as np

>>> import pandas as pd

>>> import re

>>> import nltk

>>> import matplotlib.pyplot as plt

>>> # Jupyter/IPython magic; leave this line out in a plain Python script

>>> %matplotlib inline


>>> data = pd.read_csv("data.csv")


If training and testing data are not provided separately, divide the data into training and test sets using the train_test_split() function.


If the data already comes as separate train and test files, there is no need to divide it; you can use the files directly and fit the model on the training set.
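The choice between these two cases can be sketched as a small helper. The function name get_train_test and the toy DataFrame are hypothetical and exist only for illustration; the 80/20 ratio mirrors the test_size=0.2 used later in this post.

```python
import pandas as pd
from sklearn.model_selection import train_test_split


def get_train_test(train_df, test_df=None, test_size=0.2, random_state=0):
    """Return (train, test) DataFrames.

    If a separate test set is supplied, use both as they are;
    otherwise hold out a fraction of train_df as the test set.
    """
    if test_df is not None:
        return train_df, test_df
    return train_test_split(train_df, test_size=test_size, random_state=random_state)


# Toy data standing in for a real CSV file (hypothetical values).
toy = pd.DataFrame({
    "text": ["good", "bad", "fine", "awful", "great",
             "poor", "nice", "sad", "okay", "love"],
    "label": [1, 0, 1, 0, 1, 0, 1, 0, 1, 1],
})

# No separate test set is given, so the helper splits the toy data.
train_part, test_part = get_train_test(toy)
print(len(train_part), len(test_part))
```

With real files you would pass both DataFrames, for example get_train_test(train_data, test_data), and the helper would return them unchanged.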


Here we work with Twitter data, which you can easily find through a Google search.


>>> train_data = pd.read_csv('data/train.csv')

>>> test_data = pd.read_csv('data/test.csv')


Now split the processed features and labels into training and test sets, which can then be fitted to different ML algorithms.


>>> from sklearn.model_selection import train_test_split

>>> X_train, X_test, y_train, y_test = train_test_split(processed_features, labels, test_size=0.2, random_state=0)
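The call above uses processed_features and labels, which this post does not define. One common way to build them is to clean the tweet text with re and convert it to TF-IDF vectors. The sketch below is an assumption rather than the original preprocessing: the column names "tweet" and "label", the toy rows, and the clean_text helper are all hypothetical.

```python
import re

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy stand-in for train_data; the column names are assumptions.
train_data = pd.DataFrame({
    "tweet": [
        "@user I LOVE this phone!!!",
        "Worst service ever :(",
        "It is okay, nothing special.",
    ],
    "label": ["positive", "negative", "neutral"],
})


def clean_text(text):
    """Lowercase, drop @mentions and URLs, keep letters only, squeeze spaces."""
    text = text.lower()
    text = re.sub(r"@\w+|https?://\S+", " ", text)
    text = re.sub(r"[^a-z\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


cleaned = train_data["tweet"].apply(clean_text)

# Turn the cleaned text into a sparse TF-IDF matrix (one row per tweet).
vectorizer = TfidfVectorizer()
processed_features = vectorizer.fit_transform(cleaned)
labels = train_data["label"].values

print(cleaned.tolist())
print(processed_features.shape)
```

Fitting the vectorizer before the split follows the order used in this post. A stricter setup would fit it on the training portion only and then call transform() on the test portion.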



After this, use the following commands to take a first look at the data:


>>> train_data.head()


>>> train_data.info()


>>> test_data.head()
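Beyond head() and info(), a common first check for sentiment data is the class balance. The sketch below counts the labels and draws a bar chart. The toy DataFrame and the column name "label" are assumptions, since the post does not show the dataset's columns.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; omit this line in Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Toy stand-in for train_data; the "label" column name is an assumption.
train_data = pd.DataFrame({
    "label": ["positive", "negative", "positive",
              "neutral", "negative", "positive"],
})

# Number of rows per sentiment class.
counts = train_data["label"].value_counts()
print(counts)

# Bar chart of the class distribution.
ax = counts.plot(kind="bar")
ax.set_xlabel("Sentiment")
ax.set_ylabel("Number of tweets")
ax.set_title("Class distribution in the training data")
plt.tight_layout()
```

A strongly unbalanced chart is a hint that accuracy alone may be misleading and that the per-class figures in the classification report deserve a closer look.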


Now fit the data to the model.


Training the Model


>>> from sklearn.ensemble import RandomForestClassifier

>>> text_classifier = RandomForestClassifier(n_estimators=200, random_state=0)

>>> text_classifier.fit(X_train, y_train)


Making Predictions and Evaluating the Model


>>> predictions = text_classifier.predict(X_test)



Confusion Matrix


The confusion matrix shows how the predictions compare with the true labels, while the accuracy score and classification report summarize how well the model recognizes positive, negative, and neutral emotions.


>>> from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

>>> print(confusion_matrix(y_test, predictions))

>>> print(classification_report(y_test, predictions))

>>> print(accuracy_score(y_test, predictions))
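To make the output of these calls easier to read, here is a small worked example on six hand-written labels. The y_true and y_pred lists are made up for illustration and are not results from the Twitter data.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical true and predicted labels for six tweets.
y_true = ["negative", "negative", "neutral", "positive", "positive", "positive"]
y_pred = ["negative", "neutral", "neutral", "positive", "positive", "negative"]

# Fix the class order so rows and columns are easy to interpret.
class_order = ["negative", "neutral", "positive"]

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred, labels=class_order)
print(cm)
# [[1 1 0]
#  [0 1 0]
#  [1 0 2]]

# Accuracy is the diagonal sum divided by the total: (1 + 1 + 2) / 6.
acc = accuracy_score(y_true, y_pred)
print(round(acc, 3))  # 0.667
```

Each row adds up to the number of true examples of that class, so the last row says that of three positive tweets, two were predicted correctly and one was mistaken for negative.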


Conclusion


Sentiment analysis is an NLP task used to identify people's opinions from a given dataset.


Thanks for reading this blog.


I always try to provide the best solution for any type of machine learning project or assignment. If you find anything missing, please feel free to comment below so we can provide the best for you.

If you need any type of ML-related help, please visit here or contact us here.


