Data analysis is the process of cleaning, transforming, and modeling data to extract relevant and useful information. Many tools are used to analyze data, including:

Xplenty,

Microsoft HDInsight,

Skytree,

Talend,

Splice Machine,

Spark,

Plotly,

Apache SAMOA,

Lumify,

Elasticsearch,

R-Programming,

IBM SPSS Modeler,

and many others.

Different techniques are used for data analysis, as listed below:

Data Exploration

Summary Statistics

Distribution analysis

One-Way Frequencies

Correlation Analysis

Table Analysis

t-Tests

Predictive Analysis

Prescriptive Analysis

Statistical Analysis

Text Analysis

__Data Exploration__

Data exploration is the initial step in data analysis. It relies on data visualization, which can be done manually or with the help of data visualization tools.

Data Exploration is about describing the data by means of statistical and visualization techniques. We explore data in order to bring important aspects of that data into focus for further analysis.

Univariate analysis explores variables (attributes) one by one. Variables could be either *categorical* or *numerical*.

There are two types of Univariate Analysis:

Categorical Variables

Numerical Variables

Bivariate Analysis

Bivariate analysis is the simultaneous analysis of two variables (attributes). It explores the relationship between the two variables.

There are three types of bivariate analysis.

Numerical & Numerical

Categorical & Categorical

Numerical & Categorical
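The univariate and bivariate cases above can be sketched with pandas. This is only a minimal illustration: the DataFrame `df`, its column names, and its values are hypothetical and invented for this example, and each case is shown with one common summary rather than the only possible one.

```python
import pandas as pd

# Hypothetical toy data, invented purely for illustration.
df = pd.DataFrame({
    "gender": ["F", "M", "F", "M", "F", "M"],
    "smoker": ["no", "yes", "no", "no", "yes", "yes"],
    "age": [23, 35, 31, 44, 52, 29],
    "income": [32, 48, 45, 61, 70, 40],
})

# Univariate, categorical variable: frequency of each category.
gender_counts = df["gender"].value_counts()

# Univariate, numerical variable: count, mean, std, min, quartiles, max.
age_summary = df["age"].describe()

# Bivariate, numerical & numerical: Pearson correlation coefficient.
age_income_corr = df["age"].corr(df["income"])

# Bivariate, categorical & categorical: two-way frequency table.
gender_by_smoker = pd.crosstab(df["gender"], df["smoker"])

# Bivariate, numerical & categorical: mean of the number within each category.
mean_age_by_gender = df.groupby("gender")["age"].mean()

print(gender_counts, age_summary, age_income_corr,
      gender_by_smoker, mean_age_by_gender, sep="\n\n")
```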

__Summary Statistics or Descriptive Statistics__

This technique is used to summarize or describe the data. It uses two approaches:

Quantitative Approach

Visual Approach

Descriptive statistics can be used on one or many datasets or variables.
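As a rough sketch of the quantitative approach, pandas can summarize several variables at once. The `scores` DataFrame and its values are hypothetical and exist only for this example.

```python
import pandas as pd

# Hypothetical exam scores, invented purely for illustration.
scores = pd.DataFrame({
    "math": [62, 75, 75, 81, 90],
    "physics": [55, 68, 72, 79, 95],
})

# Count, mean, standard deviation, min, quartiles, and max per column.
summary = scores.describe()

# Median of each column and the most frequent math score(s).
medians = scores.median()
math_mode = scores["math"].mode()

print(summary)
print(medians)
print(math_mode)
```

For the visual approach, the same data could be plotted, for example with `scores.hist()`, which assumes matplotlib is installed.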

__Distribution analysis__

A distribution analysis helps us understand the distribution of the various attributes of our data.

Common types of distributions used in machine learning:

Bernoulli Distribution

Uniform Distribution

Binomial Distribution

Normal Distribution

Poisson Distribution
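To get a feel for these distributions, one option is to draw random samples from each and compare their means and variances. The sketch below uses NumPy's random generator; the parameter values (`p=0.3`, `lam=4.0`, and so on), the sample size, and the seed are arbitrary choices made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the run is repeatable
n = 10_000                           # number of draws per distribution

samples = {
    # A Bernoulli trial is a binomial with a single trial (n=1).
    "bernoulli": rng.binomial(n=1, p=0.3, size=n),
    "uniform": rng.uniform(low=0.0, high=1.0, size=n),
    "binomial": rng.binomial(n=10, p=0.3, size=n),
    "normal": rng.normal(loc=0.0, scale=1.0, size=n),
    "poisson": rng.poisson(lam=4.0, size=n),
}

# Print the sample mean and variance of each distribution.
for name, draws in samples.items():
    print(f"{name:>9}: mean={draws.mean():.3f}  var={draws.var():.3f}")
```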

__One-Way Frequencies__

The One-Way Frequencies task generates frequency tables from your data. You can also use this task to perform binomial and chi-square tests.

__One-Way Tables__

Create frequency tables (also known as crosstabs) in pandas using the pd.crosstab() function.

__Example:__

One_way_table_data = pd.crosstab(index=titanic_train["Survived"], columns="count")  # make a crosstab; columns="count" names the count column

One_way_table_data  # display the table

You can use __value_counts()__ to cross-check these counts:

titanic_train["Survived"].value_counts()

This returns the same counts.
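A self-contained version of this comparison might look like the sketch below. The small `titanic_train` DataFrame here is a hypothetical stand-in with made-up values, not the real Titanic training data.

```python
import pandas as pd

# Hypothetical stand-in for the Titanic training data (values are made up).
titanic_train = pd.DataFrame({"Survived": [0, 1, 1, 0, 0, 1, 0, 0]})

# One-way frequency table; columns="count" names the single count column.
one_way = pd.crosstab(index=titanic_train["Survived"], columns="count")

# The same counts via value_counts(), sorted by category for comparison.
counts = titanic_train["Survived"].value_counts().sort_index()

# Both approaches should agree on the count per category.
same = one_way["count"].tolist() == counts.tolist()

print(one_way)
print(counts)
print("Counts match:", same)
```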

__Correlation Analysis__

Data correlation is the way in which one set of data may correspond to another set.

Correlation is a bivariate analysis that measures the strength of association between two variables and the direction of the relationship. In terms of the strength of relationship, the value of the correlation coefficient varies between +1 and -1.

Usually, in statistics, we measure four types of correlations: Pearson correlation, Kendall rank correlation, Spearman correlation, and the Point-Biserial correlation. In pandas, the DataFrame.corr() method computes pairwise correlations between the columns of a DataFrame.

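Syntax used to find a correlation:

dataframe.corr(method='pearson', min_periods=1)

Where,

method: one of {'pearson', 'kendall', 'spearman'}; the default is 'pearson'

min_periods: the minimum number of observations required per pair of columns to have a valid result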
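As a small illustration of this syntax, the sketch below computes Pearson and Spearman correlation matrices. The DataFrame `df` and its columns are hypothetical and invented for this example.

```python
import pandas as pd

# Hypothetical data, invented purely for illustration.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5, 6],
    "exam_score": [52, 55, 61, 70, 74, 90],
    "hours_gaming": [9, 7, 8, 4, 3, 1],
})

# Pairwise correlation matrices; every entry lies between -1 and +1.
pearson = df.corr(method="pearson")    # linear association
spearman = df.corr(method="spearman")  # rank-based (monotonic) association

print(pearson.round(3))
print(spearman.round(3))
```

A value close to +1 suggests the two columns rise together, a value close to -1 suggests one falls as the other rises, and a value near 0 suggests little linear (or monotonic) association.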

__Table Analysis__

Often you need to analyze the information in a table, sometimes called a contingency table or a cross-classification table. You may analyze a single table, or you may analyze a set of tables.

Using the Table Analysis task, not only can you analyze a single table, but you can also analyze sets of tables. This provides a way to control, or adjust for, a covariate while assessing the association of the rows and columns of the tables.
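In Python, a comparable analysis can be sketched with `pd.crosstab` and SciPy's `chi2_contingency`. This is an assumption about tooling, not the Table Analysis task itself, and the `counts` data (clinics, treatments, outcomes) is hypothetical. Splitting the table by `clinic` is shown as one simple way to look at the association within each level of a covariate.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical aggregated counts, invented purely for illustration.
counts = pd.DataFrame({
    "clinic": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "treatment": ["drug", "drug", "placebo", "placebo"] * 2,
    "outcome": ["improved", "not improved"] * 4,
    "n": [20, 10, 12, 18, 16, 14, 9, 21],
})

# Single contingency table: treatment (rows) by outcome (columns).
table = pd.crosstab(counts["treatment"], counts["outcome"],
                    values=counts["n"], aggfunc="sum")

# Chi-square test of independence between rows and columns.
chi2, p_value, dof, expected = chi2_contingency(table)

# Set of tables: the same cross-classification within each clinic.
by_clinic = pd.crosstab([counts["clinic"], counts["treatment"]], counts["outcome"],
                        values=counts["n"], aggfunc="sum")

print(table)
print(f"chi2={chi2:.3f}, dof={dof}, p={p_value:.4f}")
print(by_clinic)
```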

__t-Tests__

The t-test (also called Student’s t-test) compares two averages (means) and tells you whether they are significantly different from each other.

The Student’s t-test is a statistical hypothesis test for testing whether two samples are expected to have been drawn from the same population.
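A two-sample t-test can be sketched with SciPy's `ttest_ind`. The two groups below are hypothetical measurements invented for this example, and the 0.05 threshold is a conventional choice rather than a fixed rule.

```python
from scipy import stats

# Hypothetical measurements from two independent groups (made-up values).
group_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5]
group_b = [6.2, 6.8, 6.5, 7.1, 6.4, 6.9, 6.6]

# Independent two-sample t-test (assumes equal variances by default).
result = stats.ttest_ind(group_a, group_b)
t_stat, p_value = result.statistic, result.pvalue

print(f"t = {t_stat:.3f}, p = {p_value:.5f}")
if p_value < 0.05:
    print("The difference between the two means is statistically significant.")
else:
    print("No significant difference between the two means was detected.")
```

If the two groups cannot be assumed to have equal variances, passing `equal_var=False` runs Welch's t-test instead.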

__Predictive Analysis__

Predictive analysis extracts information from existing data in order to determine patterns and predict future outcomes. It does not tell you what will happen in the future. Instead, it forecasts what might happen with an acceptable level of reliability, and includes what-if scenarios and risk assessment.
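As a very simple sketch of the idea, a straight-line trend can be fitted to past values and extended one step ahead. The monthly sales figures are hypothetical, and a linear trend is just one assumed model among many that predictive analysis could use.

```python
import numpy as np

# Hypothetical monthly sales for months 1..6 (made-up values).
months = np.array([1, 2, 3, 4, 5, 6])
sales = np.array([100, 112, 119, 131, 138, 152])

# Fit a degree-1 polynomial (straight line): sales ~ slope * month + intercept.
slope, intercept = np.polyfit(months, sales, deg=1)

# Forecast month 7 by extending the fitted line.
forecast = slope * 7 + intercept

print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"Forecast for month 7: {forecast:.1f}")
```

The forecast is an estimate of what might happen if the past trend continues, not a guarantee of the actual outcome.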

__Prescriptive Analysis__

Prescriptive analysis is another type of data analytics: the use of technology to help businesses make better decisions through the analysis of raw data. Specifically, prescriptive analytics factors in information about possible situations or scenarios, available resources, past performance, and current performance, and suggests a course of action or strategy.

__Statistical Analysis__

It’s the science of collecting, exploring and presenting large amounts of data to discover underlying patterns and trends. Statistics are applied every day – in research, industry and government – to become more scientific about decisions that need to be made.

Some other important techniques:

Linear models

Survival Analysis

Multivariate Analysis
