1) __DESCRIPTION__

80% of people who purchase car insurance are men. If 9 car insurance owners are randomly selected, use the binomial distribution to find the probability that exactly X of them are men.

Read a number X from a line of input

Print the output rounded to 4 decimal places

Example:

Sample Input:

6

Sample Output:

0.1762
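The binomial probability for this problem is P(X = x) = C(n, x) · p^x · (1 − p)^(n − x) with n = 9 and p = 0.8. A minimal sketch, assuming a helper name `binomial_pmf` (hypothetical, not part of the assignment) and using only the standard library:

```python
from math import comb

def binomial_pmf(x: int, n: int = 9, p: float = 0.8) -> float:
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# In the actual task X would be read with int(input());
# the sample value 6 is hard-coded here for illustration.
print(round(binomial_pmf(6), 4))  # 0.1762
```

`scipy.stats.binom.pmf(6, 9, 0.8)` should give the same value if SciPy is available.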

2) __DESCRIPTION__

If the probability of a profit or a loss on an investment is equal, use the geometric distribution to find the probability that an investor's k-th investment is their first profit

Read k from the user's input

Print the output rounded to three decimal places

Example:

Sample Input:

4

Sample Output:

0.062
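For a geometric distribution, the probability that the first success occurs on trial k is (1 − p)^(k − 1) · p, here with p = 0.5. A minimal sketch, assuming a helper name `geometric_pmf` (hypothetical):

```python
def geometric_pmf(k: int, p: float = 0.5) -> float:
    """Probability that the first success happens on the k-th trial."""
    return (1 - p) ** (k - 1) * p

# In the actual task k would be read with int(input());
# the sample value 4 is hard-coded here for illustration.
print(round(geometric_pmf(4), 3))  # 0.062
```

The exact value is 0.0625; Python's `round` resolves this tie to the even digit, which is why the sample output shows 0.062 rather than 0.063.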

3) __DESCRIPTION__

Conditional Probability

The probability of an event which is conditioned or dependent on another event is a Conditional Probability

Conditional Probability = P(A|B) = P(A and B)/P(B)

P(A|B) is the probability of event A occurring, given that event B occurs
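The formula can be applied to tabular data with boolean masks. The sketch below uses a small made-up table (these rows are an assumption for illustration and are not the contents of Members.csv) and a hypothetical helper `conditional_probability`:

```python
import pandas as pd

# Hypothetical rows for illustration only; NOT the real Members.csv data.
members = pd.DataFrame({
    "Gender": ["male", "female", "female", "male", "female", "female"],
    "Height": [6.0, 5.5, 4.9, 5.8, 5.2, 5.0],
})

def conditional_probability(event: pd.Series, given: pd.Series) -> float:
    """P(event | given) = P(event and given) / P(given), from boolean masks."""
    p_given = given.mean()            # share of rows where the condition holds
    p_both = (event & given).mean()   # share of rows where both hold
    return float(p_both / p_given)

is_female = members["Gender"] == "female"
is_tall = members["Height"] > 5
print(round(conditional_probability(is_tall, is_female), 2))  # 0.5
```

Two of the four female rows in this made-up table have a height above 5, so the result is 0.5.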

You have the Members dataset, provided as the input data file Members.csv at the location /data/training/blackfriday.csv

This dataset contains information about people. Here’s a brief description of the columns in the sample dataset

Dataset Description:

The dataset contains 8 rows and 4 columns. The columns are:

Gender: whether the particular person is male or female

Height: Height of the person

Weight: Weight of the person

Foot-size: Foot-size of the person

This is a preview of the data under consideration:

Question:

Calculate the probability of a member's height being more than 5 inches, given that the member is female

Input Format:

The file to be read will be Members.csv, which contains the data as mentioned above. This file is in .csv format.

Example:

Sample Input:

https://media-doselect.com/Members.csv

Sample Output:

0.52

4) __DESCRIPTION__

Write a program to perform the following operations:

1. Read a number X from a line of input, where X must be a float value

2. Input X represents the probability of a person being hit by a falling meteorite

3. Calculate the odds of a person being hit by a falling meteorite

4. Print the output rounded to 3 decimal places

Example:

Sample Input:

0.07

Sample Output:

7.527
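Odds are conventionally defined as p / (1 − p). For p = 0.07 that gives about 0.0753, whereas the sample output 7.527 equals that value multiplied by 100. The sketch below prints both; treating the expected answer as a percentage is an assumption inferred from the sample, not something the problem states.

```python
def odds(p: float) -> float:
    """Odds in favour of an event with probability p."""
    return p / (1 - p)

# In the actual task X would be read with float(input()).
p = 0.07
print(round(odds(p), 3))        # 0.075
# Assumption: the sample output corresponds to the odds scaled by 100.
print(round(odds(p) * 100, 3))  # 7.527
```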

5) __DESCRIPTION__

The average number of apples in a carton is 25, with a variance of 36. Using the normal distribution, calculate the probability that the number of apples is less than X.

Read a number X from a line of input

Print the output rounded to four decimal places

Example:

Sample Input:

28

Sample Output:

0.6915
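A variance of 36 means a standard deviation of 6, so X = 28 lies 0.5 standard deviations above the mean. A minimal sketch using the standard library's `statistics.NormalDist` (the variable name `apples` is just illustrative):

```python
from statistics import NormalDist

# Mean 25 and variance 36, so the standard deviation is sqrt(36) = 6.
apples = NormalDist(mu=25, sigma=36 ** 0.5)

# In the actual task X would be read with float(input()).
print(round(apples.cdf(28), 4))  # 0.6915
```

`scipy.stats.norm.cdf(28, loc=25, scale=6)` is an equivalent call if SciPy is preferred.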

6) __DESCRIPTION__

Black Friday falls on the Friday following the ‘Thanksgiving Day’ and is used as an occasion by many stores to offer highly promoted Sales.

You have the Black Friday dataset, which is an input data file blackfriday.csv present at the location /data/training/blackfriday.csv

This dataset contains information about purchases made in a retail store on Black Friday sale. Here’s a brief description of the columns in the sample dataset:

USER_ID: ID of the user

Gender: F or M

Age: Age group to which the customer belongs

Occupation: ID of occupation of the customer

City_Category: A or B or C

Stay_In_Current_City_Years: 0 to 4+

Marital_Status: 0: Unmarried, 1: Married

Purchase: Purchase amount in dollars

This is a preview of the data under consideration:

The retailer wants to analyse this data and improve its future sales based on the analysis. In all the questions of this Assignment, we have to perform analysis on this data.

Purchases made by customers on the Black Friday sale are stored in the column named **Purchase**

**Age** represents the age group the customer belongs to, out of 0-17, 18-25, 26-35, 36-45, 46-50, 51-55 and 55+

**Gender** represents the gender of the customer as F or M

**City_Category** represents the category of city the customer belongs to as A, B or C

In this question, we have to perform calculations on the above data as explained below.

Question:

Given that the age group is 18-25, calculate the probability that a customer has purchased above 10000

Input Format:

The file to be read will be **blackfriday.csv**, which contains the data as mentioned above. This file is in **.csv** format.

Hint:

Avoid counting the same customer more than once

Example:

Sample Input:

https://media-doselect.s3.amazonaws.com/generic/3M8qkrpOgMEwqevMR5kPon3v/blackfriday.csv

Sample Output:

0.3276

7) __DESCRIPTION__

Write a Python code to perform the following operations:

1. Create a list having **10 elements that are positive integer values**

Read the 10 input values from a line of input

2. Convert the list into a series

3. Find the **population mean** and **population standard deviation** of the series using pandas

**Print** on the *first* output line: the population mean and population standard deviation values rounded to **3 decimal places** and separated by a space

4. Draw a **sample** of 5 from the series

Use **pandas.DataFrame.sample** with the following parameters: **n**=sample_size, **random_state**=1

5. Find the **sample mean** and **sample standard deviation** of the sample using pandas

**Print** on the *second* output line: the sample mean and sample standard deviation values rounded to **3 decimal places** and separated by a space

Example:

Sample Input:

98 63 23 697 136 35 09 343 23 1

**Sample Output:**

142.8 219.589

53.4 60.111
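The steps above can be sketched as follows. One detail worth noting: `Series.std()` in pandas defaults to `ddof=1` (the sample formula), and the value 219.589 in the sample output matches that default. A strict population standard deviation would use `ddof=0` and give a smaller number. Using the default here is therefore an assumption made to match the sample output.

```python
import pandas as pd

# In the actual task the line would come from input(); the sample line is used here.
values = [int(v) for v in "98 63 23 697 136 35 09 343 23 1".split()]
series = pd.Series(values)

# Default ddof=1 reproduces the first sample output line (142.8 219.589).
print(round(float(series.mean()), 3), round(float(series.std()), 3))

# Series.sample accepts the same n and random_state parameters as DataFrame.sample.
sample = series.sample(n=5, random_state=1)
print(round(float(sample.mean()), 3), round(float(sample.std()), 3))
```

The second line depends on the rows that pandas selects for `random_state=1`, so it is not annotated with an expected value here.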

8) __DESCRIPTION__

**Dataset:** mpg.csv

**Dataset Description:**

The data set contains 398 observations of 8 variables.

Here’s a preview of the data under consideration:

Problem Statement

Based on this data set, write a Python code to perform the following operations:

1. **Load the data set** from the location of the file provided as input using pandas

2. Read a string on the second input line which specifies a **quantitative data column name** in the data set

3. Find the **population mean** and **population standard deviation** of the specified quantitative data column using pandas

4. Draw a **sample** of 200 from the specified quantitative data column

Use **pandas.DataFrame.sample** with the following parameters: **n**=sample_size, **random_state**=1

5. Find the **sample mean** and **sample standard deviation** of the specified quantitative data column using pandas

6. Find the **difference** between the sample mean & population mean as well as sample standard deviation & population standard deviation

**Print** on the *first* output line: the difference **<sample mean> - <population mean>** rounded to **3 decimal places**

**Print** on the *second* output line: the difference **<sample std deviation> - <population std deviation>** rounded to **3 decimal places**

Example:

Sample Input:

https://media-doselect.com/mpg.csv

weight

Sample Output:

22.07

34.815

9) __DESCRIPTION__

A food delivery company gets cancellations on **x** orders in a day out of 900 total orders. Each customer can make only one cancellation in a day. The company assumes that all customers are independent of each other.

Write a Python code to perform the following operations:

1. Read an integer input which specifies the **number of cancelled orders**

2. Find out the **margin of error** using **scipy.stats.norm.ppf**

**Print** on the *first* output line: the margin of error value rounded to **5 decimal places**

3. Determine an approximate **95% confidence interval** for the proportion of orders cancelled in a day

**Print** on the *second* output line: the confidence interval values rounded to **5 decimal places** and separated by a space

**Note:**

Margin of Error = Critical Value*Standard Error of Statistic

Confidence Interval = Sample Statistic **±** Margin of Error

**Example:** Let's say 300 out of 900 orders were cancelled

Sample Input:

300

The margin of error & confidence interval values should be printed as -

Sample Output:

0.02585

0.30749 0.35918
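The Note's formulas can be sketched as below, with the standard error of a proportion taken as sqrt(p(1 − p) / n). The sample output matches a critical value of `norm.ppf(0.95)` (about 1.645). A conventional two-sided 95% interval would instead use `norm.ppf(0.975)` (about 1.96), so the default quantile here is an assumption chosen to reproduce the sample. The function name `margin_and_interval` and its parameters are hypothetical.

```python
from math import sqrt
from scipy.stats import norm

def margin_and_interval(cancelled: int, total: int = 900, quantile: float = 0.95):
    """Return (margin of error, lower bound, upper bound) for a proportion."""
    p_hat = cancelled / total                           # sample statistic
    standard_error = sqrt(p_hat * (1 - p_hat) / total)  # SE of a proportion
    critical_value = float(norm.ppf(quantile))          # assumption: matches the sample
    margin = critical_value * standard_error
    return margin, p_hat - margin, p_hat + margin

# In the actual task the count would be read with int(input()).
margin, lower, upper = margin_and_interval(300)
print(round(margin, 5))                  # 0.02585
print(round(lower, 5), round(upper, 5))  # 0.30749 0.35918
```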

10) __DESCRIPTION__

Dataset: Property.csv

Dataset Description:

The data set contains 21613 observations of 21 variables.

Here’s a preview of the data under consideration:

**Problem Statement**

Based on this data set, write a Python code to perform the following operations:

1. **Load the data set** from the location of the file provided as input using pandas

2. Read a string input which specifies a **quantitative data column name** in the data set

3. Find the **population mean** and **population standard deviation** of the specified quantitative data column using pandas

**Print** on the *first* output line: the (1) population mean and (2) population standard deviation values rounded to **3 decimal places** and separated by a space

4. Draw a **sample** of 100 from the specified quantitative data column

Use **pandas.DataFrame.sample** with the following parameters: **n**=sample_size, **random_state**=4

5. Find the **sample mean** and **sample standard deviation** of the specified quantitative data column using pandas

**Print** on the *second* output line: the (1) sample mean and (2) sample standard deviation values rounded to **3 decimal places** and separated by a space

6. Check if the sample mean differs from the population mean using **Hypothesis Testing**

a) The hypothesis is stated as follows:

**Null hypothesis** = sample mean does not differ from the population mean

**Alternate hypothesis** = sample mean differs from the population mean

b) Perform a test at 95% confidence level and find out the **z-statistic** and **critical value**

**Print** on the *third* output line: the (1) z-statistic and (2) critical value rounded to **3 decimal places** and separated by a space

c) Conclude the relationship between the sample mean and the population mean

**Print** on the *fourth* output line: the hypothesis that holds true as per Point 1 in the **Note** given below

Note:

Point 1:

The z-statistic is

Less than the critical value: fail to reject the null hypothesis

Greater than the critical value: reject the null hypothesis

Point 2:

Make sure your code prints the hypothesis exactly as given above (i.e., lowercase letters and space between words)

Example:

Sample Input:

https://media-doselect.s3.amazonaws.com/generic/RkzkY87b8Y1QNRwG3QKwe94v/Property.csv price

Sample Output:

540088.142 367127.196

515254.41 280175.923

-0.676 1.645

fail to reject the null hypothesis
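The test in step 6 can be sketched from the summary statistics alone. The z-statistic in the sample output is consistent with z = (sample mean − population mean) / (population std / sqrt(n)), and the critical value 1.645 corresponds to the 0.95 quantile of the standard normal. Both are assumptions inferred from the sample output. The helper `z_test` is hypothetical, and it follows the comparison rule in Point 1 of the Note rather than a two-sided comparison of |z|.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, pop_mean, pop_std, n, confidence=0.95):
    """Return (z-statistic, critical value, conclusion) per the Note's rule."""
    z_statistic = (sample_mean - pop_mean) / (pop_std / sqrt(n))
    critical_value = NormalDist().inv_cdf(confidence)  # about 1.645 for 0.95
    if z_statistic < critical_value:
        conclusion = "fail to reject the null hypothesis"
    else:
        conclusion = "reject the null hypothesis"
    return z_statistic, critical_value, conclusion

# Summary values copied from the first two lines of the sample output.
z, critical, conclusion = z_test(515254.41, 540088.142, 367127.196, 100)
print(round(z, 3), round(critical, 3))  # -0.676 1.645
print(conclusion)                       # fail to reject the null hypothesis
```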

11) __DESCRIPTION__

Write a Python code to perform the following operations:

1. Read the following list defined below:

763, 667, 593, 402, 348, 278, 123

2. Create another list having **7 elements that are positive integer values**

Read the 7 input values from a line of input

3. Check if there exists a relationship between means of the two lists using **Hypothesis Testing**

a) The hypothesis is stated as follows:

**Null hypothesis** = there is no relationship (independent)

**Alternate hypothesis** = there is a relationship

b) Perform a t-test using **stats.ttest_ind** and find out the **p-value**

**Print** on the *first* output line: the p-value rounded to **5 decimal places**

c) Conclude the relationship between means of the two lists

**Print** on the *second* output line: the hypothesis that holds true as per Point 1 in the **Note** given below

Note:

Point 1:

The p-value is

Less than the significance level (0.05): there is a relationship

Greater than the significance level (0.05): there is no relationship (independent)

Point 2:

Make sure your code prints the hypothesis exactly as given above (i.e., lowercase letters and space between words)

Example:

Sample Input:

23 56 86 99 116 294 366

Sample Output:

0.00976

there is a relationship
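A minimal sketch of the steps above, assuming `scipy.stats.ttest_ind` is called with its default settings (a two-sided test with equal variances assumed). The variable names are illustrative.

```python
from scipy import stats

first = [763, 667, 593, 402, 348, 278, 123]
# In the actual task this line would come from input(); the sample line is used here.
second = [int(v) for v in "23 56 86 99 116 294 366".split()]

# Independent two-sample t-test on the means of the two lists.
result = stats.ttest_ind(first, second)
p_value = float(result.pvalue)

print(round(p_value, 5))
if p_value < 0.05:
    print("there is a relationship")
else:
    print("there is no relationship (independent)")
```

For the sample lists the p-value falls below 0.05, so the second line reads "there is a relationship".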