
Chronic kidney disease | Machine learning project help

Abstract:

This dataset can be used to predict chronic kidney disease. It was collected at a hospital over a period of nearly two months.


Problem Statement

  • Identify the factors causing chronic kidney disease

  • Build a model that can help determine whether or not a patient is suffering from chronic kidney disease

Data Understanding

The data was gathered over two months from patients at a hospital. You need to use the features present in the data set for your task. The data set and a document describing the attributes are attached to the assignment problem statement.

Familiarize yourself with these attributes, as they may help you identify patients with chronic kidney disease.


Data preparation and Exploratory Data Analysis

You are expected to apply all appropriate data pre-processing techniques to the given data set. If required, make appropriate assumptions and state them explicitly wherever you use them in the code or in the presentation. You are required to identify the key factors that influence the presence of chronic kidney disease in a patient. Select the attributes appropriately and give a sound justification for the selection. The data set allows for several new combinations of attributes, attribute exclusions, or changes to an attribute's type (categorical, integer, or real), depending on the purpose of the research.
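The pre-processing steps described above (fixing attribute types, handling missing values, encoding categorical attributes) could be sketched as follows. This is a minimal illustration, not the expected solution: the toy rows, the column names, the `"?"` missing-value marker, and the `ckd` / `notckd` labels are all assumptions made for the example and should be replaced by whatever the attached attribute document specifies.

```python
import numpy as np
import pandas as pd

# Toy rows standing in for the real data set; every column name and value
# here is hypothetical and only illustrates the cleaning steps.
raw = pd.DataFrame({
    "age": ["48", "62", "?", "51"],
    "blood_pressure": ["80", "?", "70", "90"],
    "red_blood_cells": ["normal", "abnormal", "?", "normal"],
    "class": ["ckd", "ckd", "notckd", "notckd"],
})


def preprocess(df, numeric_cols, categorical_cols, target_col):
    """Return (X, y) with fixed types, imputed gaps, and encoded categories."""
    df = df.replace("?", np.nan)  # assumption: "?" marks a missing value
    X = pd.DataFrame(index=df.index)
    for col in numeric_cols:
        # Convert text to real numbers, then fill gaps with the median.
        values = pd.to_numeric(df[col], errors="coerce")
        X[col] = values.fillna(values.median())
    for col in categorical_cols:
        # Fill gaps with the most frequent level, then one-hot encode.
        values = df[col].fillna(df[col].mode().iloc[0])
        X = X.join(pd.get_dummies(values, prefix=col, dtype=int))
    y = (df[target_col] == "ckd").astype(int)  # 1 = disease present
    return X, y


X, y = preprocess(raw, ["age", "blood_pressure"], ["red_blood_cells"], "class")
print(X)
print(y.tolist())
```

Median and most-frequent imputation are only one possible choice and should be stated as an assumption in the presentation. In a full analysis the fill values would be computed on the training split only, so that no information from the test rows leaks into the model.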

You are expected to use the Python programming language and its libraries for this analysis.


Model building and Evaluation

You are expected to build a model that predicts whether or not a patient is suffering from chronic kidney disease, given the features associated with that patient as input.

Apply appropriate evaluation techniques to determine the accuracy of the predictions made by the model. Consider techniques that improve the accuracy of the model while keeping the number of factors it includes limited.
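One way to evaluate a model while limiting the number of factors is to cross-validate a pipeline that performs feature selection inside each fold. The sketch below assumes synthetic data from `make_classification` as a stand-in for the pre-processed patient table, and it picks `SelectKBest` with `LogisticRegression` purely for illustration. The assignment does not prescribe any of these choices.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the pre-processed patient table (not real data).
X, y = make_classification(n_samples=400, n_features=20, n_informative=5,
                           n_redundant=2, random_state=0)


def evaluate(k):
    """Cross-validate a classifier that sees only the k best-scoring features."""
    model = make_pipeline(
        StandardScaler(),
        SelectKBest(f_classif, k=k),  # refit inside each fold, so no leakage
        LogisticRegression(max_iter=1000),
    )
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    scores = cross_validate(model, X, y, cv=cv,
                            scoring=["accuracy", "recall", "f1"])
    return {m: scores[f"test_{m}"].mean() for m in ["accuracy", "recall", "f1"]}


# Compare models that use progressively more features.
results = {k: evaluate(k) for k in (3, 5, 10, 20)}
for k, metrics in results.items():
    print(k, {name: round(value, 3) for name, value in metrics.items()})
```

Reporting recall and F1 alongside accuracy is useful for a medical screening task, where missing a patient who has the disease is usually more costly than a false alarm. If a small `k` scores about as well as the full feature set, the smaller model is the easier one to justify.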


Try to obtain a model that can be easily understood and explained, but not at the cost of accuracy.
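A shallow decision tree is one example of a model whose decisions can be read directly as rules. The sketch below again uses synthetic data and hypothetical feature names (`feature_0`, `feature_1`, ...), so it shows the approach and nothing about the real attributes. The depth limit of 3 is an arbitrary assumption to keep the printed rules short.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in data; the feature names below are hypothetical.
X, y = make_classification(n_samples=400, n_features=6, n_informative=3,
                           n_redundant=0, random_state=0)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# A small depth keeps the tree readable; deeper trees may score higher
# but are harder to explain.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X_train, y_train)

rules = export_text(tree, feature_names=feature_names)
test_accuracy = tree.score(X_test, y_test)

print(rules)
print("held-out accuracy:", round(test_accuracy, 3))
```

Comparing this held-out accuracy with that of a more complex model shows how much accuracy, if any, is given up for the readable rules.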


You are expected to use Python’s scikit-learn library for this step. You are also free to write your own custom algorithm, provided it helps achieve the objective of the use case.
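If you do write a custom algorithm, following scikit-learn's estimator conventions (`fit`, `predict`, learned attributes ending in an underscore) lets it work with the library's evaluation tools. The sketch below is a deliberately simple nearest-class-mean classifier. The class name `NearestMeanClassifier` and the toy points are hypothetical, chosen only to show the interface.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.utils.validation import check_array, check_is_fitted, check_X_y


class NearestMeanClassifier(ClassifierMixin, BaseEstimator):
    """Assign each sample to the class whose training mean is closest."""

    def fit(self, X, y):
        X, y = check_X_y(X, y)
        self.classes_ = np.unique(y)
        # One mean vector per class, in the same order as classes_.
        self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        check_is_fitted(self)
        X = check_array(X)
        # Squared Euclidean distance from every sample to every class mean.
        dists = ((X[:, None, :] - self.means_[None, :, :]) ** 2).sum(axis=2)
        return self.classes_[dists.argmin(axis=1)]


# Toy, clearly separated points (not patient data).
X_toy = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
y_toy = np.array([0, 0, 1, 1])

clf = NearestMeanClassifier().fit(X_toy, y_toy)
predictions = clf.predict(np.array([[1.0, 1.0], [9.0, 9.0]]))
print(predictions.tolist())
print(clf.score(X_toy, y_toy))  # accuracy, provided by ClassifierMixin
```

Because the class inherits from `BaseEstimator` and `ClassifierMixin`, it gets `get_params`, `set_params`, and `score` without any extra code.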


Expected Outcomes

The results should consist of


a) The Python script file or Jupyter notebook containing all the code for the proposed solution. Write all the code in a single file, with proper comments and with outputs shown at the relevant places.


b) A presentation that describes the following:

  • Problem

  • Your understanding of data

  • Pre-processing techniques you have applied

  • Intuition behind the algorithm selected for building the model

  • Discussion of results

  • Your observations


