Linear regression is used to show the linear relationship between a dependent variable and one or more independent variables.
What is Regression?
Regression analysis is a predictive modeling technique that investigates the relationship between a dependent variable and one or more independent variables.
Types of Regression
Linear Regression vs Logistic Regression
Linear Regression: The data is modeled using a straight line. The output is the value of a continuous variable. The fit is measured by loss, R squared, adjusted R squared, etc.
Logistic Regression: The data is modeled using a sigmoid curve. The output is the probability of occurrence of an event. The fit is measured by accuracy, precision, recall, F1 score, ROC curve, confusion matrix, etc.
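To make the contrast concrete, here is a minimal sketch of the two kinds of output. The function names and the values of m and c are chosen only for illustration and do not come from any dataset.

```python
import numpy as np

def linear_output(x, m, c):
    """Linear regression: the prediction is an unbounded value on a straight line."""
    return m * x + c

def sigmoid_output(x, m, c):
    """Logistic regression: the same linear score squashed by a sigmoid into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-(m * x + c)))

# Illustrative inputs and coefficients (assumed, not fitted)
x_demo = np.array([-10.0, 0.0, 10.0])
line_values = linear_output(x_demo, m=2.0, c=1.0)
probabilities = sigmoid_output(x_demo, m=2.0, c=1.0)

print(line_values)     # values of the variable, can be any real number
print(probabilities)   # probabilities, always between 0 and 1
```

The straight line returns values of any size, while the sigmoid always returns something that can be read as a probability.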
How does the Linear Regression Algorithm work?
Least squares is a statistical method used to determine the best-fit line, or regression line, by minimizing the sum of squared residuals. The "square" here refers to squaring the vertical distance between each data point and the regression line. The line with the minimum sum of squared distances is the best-fit regression line.
Regression Line: y = mx + c, where
y = Dependent Variable
x = Independent Variable
m = Slope of the line
c = y-Intercept
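The idea of "minimum sum of squares" can be sketched on a tiny made-up dataset. The data points and the helper name sum_of_squared_residuals below are assumptions for illustration only.

```python
import numpy as np

def sum_of_squared_residuals(x, y, m, c):
    """Sum of squared vertical distances between the points and the line y = m*x + c."""
    residuals = y - (m * x + c)
    return float(np.sum(residuals ** 2))

# Made-up points that lie exactly on y = 2x + 1
x_demo = np.array([1.0, 2.0, 3.0, 4.0])
y_demo = np.array([3.0, 5.0, 7.0, 9.0])

sse_exact = sum_of_squared_residuals(x_demo, y_demo, m=2.0, c=1.0)
sse_off = sum_of_squared_residuals(x_demo, y_demo, m=1.5, c=1.0)

print(sse_exact, sse_off)
```

The line that passes through every point has a sum of squares of zero, and any other candidate line scores higher. Least squares picks the candidate with the smallest score.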
Least Square Method – Implementation using Python
# Importing necessary libraries
%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

plt.rcParams['figure.figsize'] = (20.0, 10.0)

# Reading data
data = pd.read_csv('headbrain.csv')
print(data.shape)
data.head()

# Collecting X and Y
X = data['Head Size(cm^3)'].values
Y = data['Brain Wei
To find the value of m and c, you first need to calculate the mean of X and Y
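For reference, the standard least squares estimates for a single independent variable can be written in terms of those means. The symbols x̄ and ȳ (the means of X and Y) and n (the number of points) are notation introduced here.

```latex
\[
m = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
c = \bar{y} - m\,\bar{x}
\]
```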
# Mean of X and Y
mean_x = np.mean(X)
mean_y = np.mean(Y)

# Total number of values
n = len(X)

# Using the formula to calculate m and c
numer = 0
denom = 0
for i in range(n):
    numer += (X[i] - mean_x) * (Y[i] - mean_y)
    denom += (X[i] - mean_x) ** 2
m = numer / denom
c = mean_y - (m * mean_x)

# Print coefficients
print(m, c)
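One way to sanity-check this calculation is to compare it with NumPy's own degree-1 polynomial fit. The sketch below runs on synthetic data generated for the purpose, not on the head size data, and the function name least_squares_fit is an assumption.

```python
import numpy as np

def least_squares_fit(x, y):
    """Slope and intercept from the mean-based least squares formula (vectorized)."""
    mean_x, mean_y = np.mean(x), np.mean(y)
    numer = np.sum((x - mean_x) * (y - mean_y))
    denom = np.sum((x - mean_x) ** 2)
    slope = numer / denom
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic data: a noisy line with true slope 3 and intercept 4 (assumed values)
rng = np.random.default_rng(0)
x_demo = rng.uniform(0.0, 10.0, size=50)
y_demo = 3.0 * x_demo + 4.0 + rng.normal(0.0, 1.0, size=50)

slope, intercept = least_squares_fit(x_demo, y_demo)

# np.polyfit returns coefficients from the highest degree down: [slope, intercept]
poly_slope, poly_intercept = np.polyfit(x_demo, y_demo, deg=1)

print(slope, intercept)
print(poly_slope, poly_intercept)
```

Both approaches should agree to within floating-point error, since a degree-1 polynomial fit is the same least squares problem.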
The values of m and c computed above are substituted into this equation:
BrainWeight = c + m ∗ HeadSize
Plotting Linear Regression Line
# Plotting values and regression line
max_x = np.max(X) + 100
min_x = np.min(X) - 100

# Calculating line values x and y
x = np.linspace(min_x, max_x, 1000)
y = c + m * x

# Plotting the line
plt.plot(x, y, color='#52b920', label='Regression Line')
# Plotting the scatter points
plt.scatter(X, Y, c='#ef4423', label='Scatter Plot')

plt.xlabel('Head Size in cm3')
plt.ylabel('Brain Weight in grams')
plt.legend()
plt.show()
R Square Method – Goodness of Fit
The R-squared value is a statistical measure of how close the data are to the fitted regression line. The closer it is to 1, the better the line fits the data.
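The usual definition of R-squared can be written as follows, where ŷ_i denotes the value predicted by the regression line for the i-th point and ȳ is the mean of the observed values. The names SS_r and SS_t are notation for the sum of squared residuals and the total sum of squares.

```latex
\[
R^2 = 1 - \frac{SS_r}{SS_t},
\qquad
SS_r = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2,
\qquad
SS_t = \sum_{i=1}^{n} (y_i - \bar{y})^2
\]
```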
R square – Implementation using Python
# ss_t is the total sum of squares and ss_r is the sum of squared residuals
# (relate them to the formula)
ss_t = 0
ss_r = 0
# Loop over all n data points (n = len(X)), not over the slope m
for i in range(n):
    y_pred = c + m * X[i]
    ss_t += (Y[i] - mean_y) ** 2
    ss_r += (Y[i] - y_pred) ** 2
r2 = 1 - (ss_r / ss_t)
print(r2)
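The same calculation can also be written without a loop. The sketch below is a vectorized version checked on two simple cases. The helper name r_squared and the sample values are assumptions for illustration.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """R squared = 1 - (sum of squared residuals / total sum of squares)."""
    ss_t = np.sum((y_true - np.mean(y_true)) ** 2)
    ss_r = np.sum((y_true - y_pred) ** 2)
    return 1.0 - ss_r / ss_t

# Made-up observed values
y_true = np.array([3.0, 5.0, 7.0, 9.0])

# Case 1: predictions equal to the observations (perfect fit)
perfect = r_squared(y_true, y_true)

# Case 2: always predicting the mean of the observations
baseline = r_squared(y_true, np.full_like(y_true, np.mean(y_true)))

print(perfect, baseline)
```

A perfect fit scores 1, and a model that only ever predicts the mean scores 0, which gives a feel for the range in which a useful regression line should land.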
I hope the least squares method is now clear. Next, we do the same thing using scikit-learn.
Linear Regression – Implementation using scikit learn
from sklearn.linear_model import LinearRegression

# scikit-learn expects a 2-D feature array, so reshape X to (n_samples, 1)
X = X.reshape((-1, 1))

# Creating the model
reg = LinearRegression()
# Fitting the training data
reg = reg.fit(X, Y)
# Y prediction
Y_pred = reg.predict(X)

# Calculating the R2 score
r2_score = reg.score(X, Y)
print(r2_score)
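As a rough check that scikit-learn arrives at the same line as the manual method, the fitted coef_ and intercept_ attributes can be compared with the mean-based formula. This sketch uses synthetic data and illustrative variable names, not the head size dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: a noisy line with true slope 3 and intercept 4 (assumed values)
rng = np.random.default_rng(0)
x_demo = rng.uniform(0.0, 10.0, size=50)
y_demo = 3.0 * x_demo + 4.0 + rng.normal(0.0, 1.0, size=50)

# Manual least squares estimates
mean_x, mean_y = x_demo.mean(), y_demo.mean()
m_manual = np.sum((x_demo - mean_x) * (y_demo - mean_y)) / np.sum((x_demo - mean_x) ** 2)
c_manual = mean_y - m_manual * mean_x

# scikit-learn fit; the single feature is reshaped into a column
reg_demo = LinearRegression().fit(x_demo.reshape(-1, 1), y_demo)

print(m_manual, c_manual)
print(reg_demo.coef_[0], reg_demo.intercept_)
```

The two printed pairs should match to within floating-point error, because LinearRegression with default settings solves the same ordinary least squares problem.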