Machine Learning: Logistic Regression Using K-Fold Cross-Validation

Requirements:

Find 2 datasets, one for regression and the other for classification:

  • Regression: linear regression, polynomial regression (up to degree 3), random forest, SVM

  • Classification: logistic regression, KNN, random forest, SVM

Project Requirements:

  • No. of rows >= 1000

  • No. of variables > 2

  • No. of classes for the dependent variable must be more than 2 for classification

Do K-fold cross-validation for both.
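As an illustration of the K-fold step for the classification side, here is a minimal sketch with scikit-learn. The data is synthetic (generated with make_classification as a stand-in for a real dataset), and the choice of 5 folds, the scaling step, and max_iter=1000 are assumptions made for the example, not part of the assignment.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 1000 rows, 6 features, 3 classes.
X, y = make_classification(
    n_samples=1000, n_features=6, n_informative=4, n_redundant=0,
    n_classes=3, random_state=0,
)

# Scaling lives inside the pipeline, so it is fitted on the training part
# of each fold only and never sees that fold's held-out rows.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# 5 shuffled folds that keep the class proportions in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")

print("fold accuracies:", scores.round(3))
print("mean accuracy:", scores.mean().round(3))
```

For the regression dataset the same pattern should apply with KFold in place of StratifiedKFold and a regression scorer such as "r2".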

For regression, show: R², adjusted R², RMSE, correlation matrix, p-values of independent variables (codes 10)
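To make three of these regression metrics concrete, the sketch below computes R², adjusted R², and RMSE from their definitions in plain Python. The function name regression_metrics, its n_features parameter, and the sample values are hypothetical, chosen only for illustration. The correlation matrix and p-values are not covered here; they are usually obtained from other tools, for example pandas DataFrame.corr() and the pvalues attribute of a fitted statsmodels OLS model.

```python
import math

def regression_metrics(y_true, y_pred, n_features):
    """Return R2, adjusted R2 and RMSE for one set of predictions."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    r2 = 1 - ss_res / ss_tot
    # Adjusted R2 penalises R2 for the number of independent variables.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_features - 1)
    rmse = math.sqrt(ss_res / n)
    return {"r2": r2, "adj_r2": adj_r2, "rmse": rmse}

# Made-up values for illustration only.
y_true = [3.0, 5.0, 7.0, 9.0, 11.0]
y_pred = [2.5, 5.5, 7.0, 9.5, 10.5]
metrics = regression_metrics(y_true, y_pred, n_features=2)
print(metrics)
```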

For classification, show: accuracy, confusion matrix, and macro recall and precision (for multiclass classification) (codes 10)
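The classification metrics can be sketched in the same way. The example below builds a confusion matrix by hand (rows are true classes, columns are predicted classes) and derives accuracy, macro precision, and macro recall from it. The helper names and the three-class labels are hypothetical; in practice scikit-learn's metrics module provides equivalents.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

def macro_scores(matrix):
    """Return (accuracy, macro precision, macro recall)."""
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    accuracy = sum(matrix[i][i] for i in range(k)) / total
    precisions, recalls = [], []
    for i in range(k):
        predicted_i = sum(matrix[r][i] for r in range(k))  # column sum
        actual_i = sum(matrix[i])                          # row sum
        precisions.append(matrix[i][i] / predicted_i if predicted_i else 0.0)
        recalls.append(matrix[i][i] / actual_i if actual_i else 0.0)
    # Macro averaging: every class counts equally, whatever its size.
    return accuracy, sum(precisions) / k, sum(recalls) / k

# Made-up labels for illustration only.
y_true = ["a", "a", "a", "b", "b", "c", "c", "c"]
y_pred = ["a", "a", "b", "b", "b", "c", "a", "c"]
cm = confusion_matrix(y_true, y_pred, labels=["a", "b", "c"])
accuracy, macro_precision, macro_recall = macro_scores(cm)
print(cm)
print(accuracy, macro_precision, macro_recall)
```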

Do hyperparameter tuning using grid search.
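A minimal sketch of grid search with scikit-learn's GridSearchCV, shown for KNN on synthetic data. The parameter grid, the 5-fold setting, and the dataset are assumptions made for the example; the grids that suit a real dataset may differ.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (assumption): 600 rows, 6 features, 3 classes.
X, y = make_classification(
    n_samples=600, n_features=6, n_informative=4, n_redundant=0,
    n_classes=3, random_state=0,
)

# Every combination in the grid is evaluated: 3 x 2 = 6 candidates.
param_grid = {
    "n_neighbors": [3, 5, 7],
    "weights": ["uniform", "distance"],
}

# Each candidate is scored with 5-fold cross-validation.
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print("best parameters:", search.best_params_)
print("best mean CV accuracy:", round(search.best_score_, 3))
```

The same object works for the other models by swapping in the estimator and a grid of its own parameters, for example C and kernel for an SVM.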

The report should discuss the properties of the datasets, your results, and model performance comparisons, and inferences/conclusions. (10)

Prepare a report to discuss the properties of the datasets, your results, and inferences. (10)

Here is a solution that fulfills the above requirements:

Import Libraries

>>> import pandas as pd
>>> import numpy as np
>>> import matplotlib.pyplot as plt  # data visualization libraries
>>> import seaborn as sns
>>> # Jupyter/IPython magic to render plots inline (not valid in a plain Python shell)
>>> %matplotlib inline

Load Data
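The loading code itself is not shown here. As a minimal sketch, assuming the dataset is a CSV file, it could be read with pandas as below. The column names and values are made up, and an in-memory string stands in for the real file so the example runs on its own.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the real dataset file.
csv_text = """age,income,segment
25,32000,low
41,58000,mid
37,91000,high
"""

# With a real file this would be pd.read_csv("your_dataset.csv") (hypothetical path).
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)                       # number of rows and columns
print(df.dtypes)                      # type of each column
print(df["segment"].value_counts())   # class distribution of the target
```

Checking the shape and the class counts right after loading is a quick way to confirm the row, variable, and class requirements listed above.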

Create functions to update column values

Apply these functions to the pandas DataFrame to update the values
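The two steps above might look like the following sketch, which defines small encoding functions and applies them to DataFrame columns with Series.apply. The column names, categories, and numeric codes are hypothetical; the actual dataset and mappings are not given in this post.

```python
import pandas as pd

# Made-up data for illustration only.
df = pd.DataFrame({
    "gender": ["M", "F", "F", "M"],
    "grade": ["low", "high", "mid", "low"],
})

def encode_gender(value):
    """Map a gender label to a numeric code (hypothetical mapping)."""
    return {"M": 0, "F": 1}[value]

def encode_grade(value):
    """Map an ordered grade label to an integer rank (hypothetical mapping)."""
    return {"low": 0, "mid": 1, "high": 2}[value]

# Apply each function element-wise and overwrite the original column.
df["gender"] = df["gender"].apply(encode_gender)
df["grade"] = df["grade"].apply(encode_grade)

print(df)
```

For simple lookups like these, Series.map with a dictionary gives the same result without a helper function.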

