Replacing strings with numbers in Python for Data Analysis with pandas


In this blog, we will learn how to handle string-format data in machine learning (ML) and how to fit it into ML models.


Sometimes data is given in string format, which ML models cannot use directly. To solve this, first change each string value into a numeric value, and then fit the training data into the model.


Suppose we want to fit a dataset into ML models and it contains a sex column stored as strings. We then need to change the sex column values into numbers, like this:


F(Female) - 1

M(Male) - 2



Here we replace F with 1 and M with 2. The steps below show how to do this in Python:


Step 1:


Read the CSV file using:


>>> import pandas as pd

>>> df = pd.read_csv('mydata.csv')

>>> df.head()


After this, remove all rows that contain NaN values from the dataset:


>>> data = df.dropna()
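The following is a minimal sketch of Step 1 that can be run without a file on disk. The CSV text and its columns (name, sex, age, target) are assumptions made up for illustration; they stand in for the real 'mydata.csv'.

```python
import io

import pandas as pd

# Hypothetical stand-in for 'mydata.csv'; the columns and values are assumptions.
csv_text = """name,sex,age,target
Asha,F,34,1
Ravi,M,,0
Meena,F,29,0
John,,41,1
Karan,M,52,1
"""

df = pd.read_csv(io.StringIO(csv_text))
print(df.isna().sum())      # number of missing values in each column

data = df.dropna()          # drop every row that has at least one NaN
print(len(df), len(data))   # row counts before and after dropping
```

Note that dropna() removes a whole row even if only one of its columns is missing, so it is worth checking how many rows remain before training.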



Step 2:


Divide the data into features and target. We will use one column as the target for prediction.


# separate the features (x) from the target (y)

>>> x = data.drop('target column', axis=1)

>>> y = data['target column']


Now we will split the data into training and testing sets:


>>> from sklearn.model_selection import train_test_split

>>> x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.5)
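As a rough illustration of the split, the sketch below uses a small made-up DataFrame. The column names and values are assumptions, and random_state is added here only to make the split repeatable; it is not part of the original call.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical data; the column names and values are assumptions.
data = pd.DataFrame({
    "name": ["a", "b", "c", "d", "e", "f", "g", "h"],
    "sex": ["F", "M", "F", "M", "F", "M", "F", "M"],
    "age": [23, 35, 41, 29, 52, 47, 38, 60],
    "target": [0, 1, 0, 1, 1, 0, 1, 0],
})

x = data.drop("target", axis=1)   # features
y = data["target"]                # column to predict

# test_size=0.5 puts half of the rows into the test set.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.5, random_state=0
)
print(x_train.shape, x_test.shape)
```

With test_size=0.5 the model sees only half of the rows during training; a smaller test share such as 0.2 or 0.3 is a common alternative when data is limited.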


Now we will change the string values to numeric values so that the data can be fitted into ML models for prediction.


Drop the unnecessary column from the data using drop:


>>> x_train_data = x_train.drop(['name'],axis=1)


Step 3:


# map each string to its number in one step; this avoids chained assignment

>>> x_train_data['sex'] = x_train_data['sex'].map({'F': 1, 'M': 2})


After this, every value in the sex column is updated to a numeric value.
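The sketch below shows one way to apply the F to 1, M to 2 encoding with Series.map, on both the training and the testing features, since the test data needs the same encoding before it can be passed to the model. The small DataFrames and the name sex_codes are assumptions for illustration.

```python
import pandas as pd

# Hypothetical feature frames; the values are assumptions.
x_train_data = pd.DataFrame({"sex": ["F", "M", "M", "F"], "age": [34, 45, 29, 52]})
x_test_data = pd.DataFrame({"sex": ["M", "F"], "age": [41, 38]})

# One mapping, reused for both frames so the encoding stays consistent.
sex_codes = {"F": 1, "M": 2}
x_train_data["sex"] = x_train_data["sex"].map(sex_codes)
x_test_data["sex"] = x_test_data["sex"].map(sex_codes)

# Any value not present in sex_codes would become NaN, so check for that.
print(x_train_data["sex"].isna().sum(), x_test_data["sex"].isna().sum())
print(x_train_data.dtypes)
```

Values that are missing from the mapping (for example a lowercase 'f') turn into NaN rather than raising an error, which is why the check after mapping is useful.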


Step 4:


After this, we will fit the data into a model.


Fit it into Logistic Regression:


>>> from sklearn.linear_model import LogisticRegression

>>> model = LogisticRegression()

>>> fit = model.fit(x_train_data, y_train)
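Putting the steps together, here is a minimal end-to-end sketch on made-up data. The columns, the helper name prepare, and the random_state, stratify and max_iter settings are all assumptions added for illustration; they are not part of the original steps.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical data; the column names and values are assumptions.
data = pd.DataFrame({
    "name": [f"person_{i}" for i in range(12)],
    "sex": ["F", "M"] * 6,
    "age": [22, 25, 31, 35, 38, 41, 44, 47, 52, 58, 63, 67],
    "target": [0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1],
})

x = data.drop("target", axis=1)
y = data["target"]

# stratify=y keeps both classes present in the training and testing halves.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.5, random_state=0, stratify=y
)


def prepare(frame):
    """Drop the name column and encode sex as F -> 1, M -> 2 (hypothetical helper)."""
    out = frame.drop(["name"], axis=1)
    out["sex"] = out["sex"].map({"F": 1, "M": 2})
    return out


x_train_data = prepare(x_train)
x_test_data = prepare(x_test)   # the test set gets the same preparation

model = LogisticRegression(max_iter=1000)
model.fit(x_train_data, y_train)

predictions = model.predict(x_test_data)
accuracy = model.score(x_test_data, y_test)   # share of correct test predictions
print(predictions, accuracy)
```

Wrapping the preparation in one function is a design choice that makes it harder to forget the test set, which must go through the same column drop and encoding as the training set.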


I hope you find this helpful. If you need help with predictions using other ML models, you can contact us.


CodersArts is a Product by Sofstack Technology Solutions Pvt. Ltd.
