Probability and Statistics Question and Answers


Q1.A)i) The marks obtained by 9 students in STATISTICS and SOFTWARE ENGINEERING are given below. Find the coefficient of correlation between the two courses.
Roll No.: 1 2 3 4 5 6 7 8 9
Marks in Statistics: 85 45 54 91 58 65 56 48 50
Marks in Software Engineering: 75 53 65 58 15 80 55 30 60
B) A cubical die was thrown 9000 times and 2 or 3 was obtained 3240 times. On the assumption of random throwing, do the data indicate an unbiased die? Use 1% level of significance.

Ans:
B)

We can test this with a chi-squared goodness-of-fit test. The data only record whether a throw showed "2 or 3" or some other face, so there are two categories.

Let’s set up the hypotheses:

H0: The die is unbiased (i.e., the probability of getting a 2 or 3 is 2/6 = 1/3). Ha: The die is biased (i.e., the probability of getting a 2 or 3 is not 1/3).

The formula for the chi-squared statistic for goodness-of-fit is:

X^2 = Σ (O-E)^2/E

Where:

X^2 is the chi-squared statistic.
O is the observed frequency in each category (3240 throws showing 2 or 3, and 9000 – 3240 = 5760 throws showing another face).
E is the expected frequency in each category if the die is unbiased.

If the die is unbiased, each number (1 through 6) has a probability of 1/6 because a fair die has six equally likely outcomes, so a 2 or 3 has probability 2/6 = 1/3.

So, the expected frequency for getting a 2 or 3 in 9000 throws of the die is:

E = 9000 × 1/3 = 3000

and the expected frequency for any other face is 9000 – 3000 = 6000.

Now, we can calculate the chi-squared statistic:

$X^2=(3240−3000)^2/3000+(5760−6000)^2/6000=19.2+9.6=28.8$

Next, we need to find the critical chi-squared value at a 1% level of significance with degrees of freedom equal to the number of categories minus 1 (in this case, 2 – 1 = 1). You can use a chi-squared table or calculator for this. At a 1% significance level and 1 degree of freedom, the critical chi-squared value is approximately 6.635.

Since 28.8 is greater than 6.635, we reject H0. At the 1% level of significance, the data do not indicate an unbiased die.
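As a rough check of this test, here is a minimal Python sketch. It assumes the outcome is split into two categories ("2 or 3" and "any other face"), so a fair die gives expected counts of 3000 and 6000 and the test has 1 degree of freedom. The variable names are illustrative, and the critical value 6.635 is taken from a standard chi-squared table rather than computed.

```python
import math

# Observed counts from the question: 9000 throws, 3240 showed a 2 or a 3.
throws = 9000
observed = [3240, throws - 3240]            # ["2 or 3", "any other face"]

# Expected counts for a fair die: P(2 or 3) = 2/6, P(other) = 4/6.
expected = [throws * 2 / 6, throws * 4 / 6]

# Chi-squared statistic: sum of (O - E)^2 / E over the categories.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# For 1 degree of freedom, P(chi-squared > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_sq / 2))

# Assumed table value: 1% level of significance, 1 degree of freedom.
critical_1pct = 6.635
reject_h0 = chi_sq > critical_1pct

print(f"chi-squared = {chi_sq:.1f}, p-value = {p_value:.2e}, reject H0: {reject_h0}")
```

The closed-form p-value only holds for 1 degree of freedom; with more categories a chi-squared table or a statistics library would be needed.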

Q2.A) Suppose the standard deviation of the heights of university college male students is 6√2 inches. Two hundred male students of a university are measured. If the true population mean height is 71 inches, find the probability that the sample mean will be between 69 and 72 inches.
B) An automotive engineer wants to estimate the cost of repairing a Sports Bike that experiences a 200 KMPH head-on collision. He crashes 16 bikes, and the average repair cost is 5400. The standard deviation of the 16 repair costs is 316. Find the 95% confidence interval for the mean cost of repair, if the true mean cost of repair is 5700. Take 5% level of significance.

Ans: A)

To find the probability of the sample mean height being between 69 and 72 inches, you can use the z-score formula and the properties of the normal distribution, assuming the heights follow a normal distribution.

The formula for the z-score is: $Z = \frac{X - μ}{σ / \sqrt{n}}$

Where:

$X$ is the sample mean height you’re interested in (69 and 72 inches in your case).
$μ$ is the population mean height (given as 71 inches).
$σ$ is the population standard deviation (given as 6√2 inches).
$n$ is the sample size (given as 200 students).

First, let’s calculate the standard error and the z-scores for both 69 and 72 inches:

σ/√n = 6√2/√200 = 0.6

For 69 inches: Z = (69 – 71)/0.6 ≈ -3.33

For 72 inches: Z = (72 – 71)/0.6 ≈ 1.67

Next, you’ll need to find the corresponding probabilities using a standard normal distribution table or a calculator.

P(Z < -3.33) ≈ 0.0004 is the probability that the sample mean is less than 69 inches, and P(Z < 1.67) ≈ 0.9525 is the probability that the sample mean is less than 72 inches.

Now, you can find the probability that the sample mean is between 69 and 72 inches by subtracting these two probabilities:

P(69 < X < 72) = P(Z < 1.67) – P(Z < -3.33) ≈ 0.9525 – 0.0004 = 0.9521

So the probability that the sample mean lies between 69 and 72 inches is approximately 0.95.
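The table lookup can be cross-checked with a short Python sketch. It assumes the sample mean is normally distributed with mean 71 and standard error σ/√n = 6√2/√200 = 0.6, and uses `statistics.NormalDist` from the standard library. The variable names are illustrative.

```python
import math
from statistics import NormalDist

mu = 71.0                      # population mean height (inches)
sigma = 6 * math.sqrt(2)       # population standard deviation (inches)
n = 200                        # sample size

# Standard error of the sample mean.
standard_error = sigma / math.sqrt(n)

# z-scores for the two bounds.
z_low = (69 - mu) / standard_error
z_high = (72 - mu) / standard_error

# Probability that the sample mean falls between 69 and 72 inches.
sampling_dist = NormalDist(mu, standard_error)
prob = sampling_dist.cdf(72) - sampling_dist.cdf(69)

print(f"SE = {standard_error:.3f}, z = ({z_low:.2f}, {z_high:.2f}), P = {prob:.4f}")
```

The result differs slightly from a table-based answer because the z-scores are not rounded to two decimals before the lookup.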

B)

To find the 95% confidence interval for the mean cost of repair, note that the standard deviation of 316 comes from the 16 crashed bikes themselves, so it is a sample standard deviation, and the sample size is small (n < 30). The interval is therefore based on the t-distribution with n – 1 degrees of freedom. The formula is:

$\text{Confidence Interval} = \bar{X} \pm t \, (s/\sqrt{n})$

Where:

$\bar{X}$ is the sample mean repair cost (given as 5400).
$s$ is the sample standard deviation (given as 316).
$n$ is the sample size (given as 16).
$t$ is the critical value for a 95% confidence interval (5% level of significance) with 15 degrees of freedom.

To find the critical value $t$, you can use a t-distribution table or a calculator. For a two-tailed 5% level of significance and 15 degrees of freedom, the critical value is approximately 2.131.

Now, plug the values into the formula:

$\text{Confidence Interval} = 5400 \pm 2.131 \, (316/\sqrt{16})$

Calculate the value within the parentheses first:

$\text{Confidence Interval} = 5400 \pm 2.131 \, (316/4)$

$\text{Confidence Interval} = 5400 \pm 2.131 \cdot 79$

Now, calculate the confidence interval:

Lower Limit = 5400 – (2.131 * 79) Upper Limit = 5400 + (2.131 * 79)

Lower Limit ≈ 5231.65 Upper Limit ≈ 5568.35

So, the 95% confidence interval for the mean cost of repair is approximately 5231.65 to 5568.35. The stated true mean cost of 5700 lies outside this interval, so at the 5% level of significance the sample is not consistent with a mean repair cost of 5700.
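The interval arithmetic can be sketched in Python as follows. This sketch treats 316 as the sample standard deviation of the 16 bikes, so it uses a t critical value with 15 degrees of freedom. The value 2.131 is an assumption read from a standard t-table, not computed, and the variable names are illustrative.

```python
import math

sample_mean = 5400.0    # average repair cost of the crashed bikes
sample_sd = 316.0       # standard deviation of the 16 repair costs
n = 16                  # number of bikes crashed
claimed_mean = 5700.0   # the "true mean" stated in the question

# Assumed t-table value: two-tailed 5% level, n - 1 = 15 degrees of freedom.
t_crit = 2.131

# Margin of error and interval limits.
margin = t_crit * sample_sd / math.sqrt(n)
lower = sample_mean - margin
upper = sample_mean + margin

# Does the claimed mean fall inside the interval?
claimed_inside = lower <= claimed_mean <= upper

print(f"95% CI: ({lower:.2f}, {upper:.2f}); contains 5700: {claimed_inside}")
```

Replacing `t_crit` with 1.96 reproduces the large-sample z-interval, which is narrower because it ignores the extra uncertainty from estimating the standard deviation.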

Q3) A) A college admissions director wishes to estimate the mean age of all students currently enrolled. In a random sample of 20 students, the mean age is found to be 22.9 years. From past studies, the standard deviation is known to be 1.5 years, and the population is normally distributed. Construct a 90% confidence interval of the population mean age.

Ans: To construct a 90% confidence interval for the population mean age, you can use the formula for a confidence interval when the population standard deviation is known:

Confidence Interval = x̄ ± Z * (σ / √n)

Where:

x̄ is the sample mean (22.9 years).
σ is the population standard deviation (1.5 years).
n is the sample size (20).
Z is the Z-score corresponding to the desired confidence level (90% confidence corresponds to a Z-score of 1.645 for a two-tailed interval).

Now, let’s plug these values into the formula:

Confidence Interval = 22.9 ± 1.645 * (1.5 / √20)

Calculate the values:

Confidence Interval ≈ 22.9 ± 1.645 * 0.3354

Confidence Interval ≈ 22.9 ± 0.552

Now, calculate the lower and upper bounds of the confidence interval:

Lower Bound = 22.9 – 0.552 ≈ 22.348 Upper Bound = 22.9 + 0.552 ≈ 23.452

So, the 90% confidence interval for the population mean age is approximately 22.348 years to 23.452 years. This means that we can be 90% confident that the true population mean age falls within this range based on the sample data.
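A minimal Python sketch of this z-interval follows, assuming a known population standard deviation of 1.5 and a normal population as stated in the question. The helper name `z_interval` is hypothetical. The critical value comes from `statistics.NormalDist().inv_cdf`, so it is about 1.6449 rather than the rounded table value 1.645.

```python
import math
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, confidence):
    """Two-sided confidence interval for a mean when sigma is known."""
    # Critical value leaving (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

lower, upper = z_interval(sample_mean=22.9, sigma=1.5, n=20, confidence=0.90)
print(f"90% CI: ({lower:.3f}, {upper:.3f})")
```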

Q4)A) The joint probability distribution of X and Y is given by f(x,y) = c(x² + y²) for x = -1, 0, 1, 3 and y = 1, 2, 3. Find 1) the value of c 2) P(x ≤ 1, y > 2) 3) P(x ≥ 2 – y)
B) Determine K such that the joint probability density function of a pair (x,y) of continuous random variables is f(x,y) = K (xy+2x+3y+6), 0 ≤ x, y ≤ 1. Examine whether X and Y are independent.

Ans: A)To find the values requested, we need to follow these steps:

Find the value of c.
Calculate P(x ≤ 1, y > 2).
Calculate P(x ≥ 2 – y).

Let’s start with finding the value of c. In a joint probability distribution, the sum of probabilities over all possible values must equal 1. Therefore:

The value of c can be found by summing the probabilities over all possible values of x and y:

∑∑[f(x, y)] = 1

∑∑[c(x² + y²)] = 1

Now, we need to calculate the sum over all x and y values given:

c((-1)² + 1²) + c(0² + 1²) + c(1² + 1²) + c(3² + 1²) + c((-1)² + 2²) + c(0² + 2²) + c(1² + 2²) + c(3² + 2²) + c((-1)² + 3²) + c(0² + 3²) + c(1² + 3²) + c(3² + 3²) = 1

Now, plug in the values for x and y and solve for c.

c(2) + c(1) + c(2) + c(10) + c(5) + c(4) + c(5) + c(13) + c(10) + c(9) + c(10) + c(18) = 1

89c = 1

c = 1/89

Now that we have found the value of c, let’s move on to the next parts:

P(x ≤ 1, y > 2): This probability is the sum of probabilities over the points with x ≤ 1 and y > 2, that is, x = -1, 0, 1 and y = 3.

P(x ≤ 1, y > 2) = ∑∑[f(x, y)] for (x, y) in {(-1,3), (0,3), (1,3)}

P(x ≤ 1, y > 2) = c((-1)² + 3²) + c(0² + 3²) + c(1² + 3²)

P(x ≤ 1, y > 2) = (1/89) * (10 + 9 + 10) = (1/89) * 29 = 29/89

P(x ≥ 2 – y): This probability is the sum of probabilities over the points with x ≥ 2 – y, that is, x + y ≥ 2.

P(x ≥ 2 – y) = ∑∑[f(x, y)] for (x, y) in {(1,1), (3,1), (0,2), (1,2), (3,2), (-1,3), (0,3), (1,3), (3,3)}

P(x ≥ 2 – y) = c(1² + 1²) + c(3² + 1²) + c(0² + 2²) + c(1² + 2²) + c(3² + 2²) + c((-1)² + 3²) + c(0² + 3²) + c(1² + 3²) + c(3² + 3²)

P(x ≥ 2 – y) = (1/89) * (2 + 10 + 4 + 5 + 13 + 10 + 9 + 10 + 18) = (1/89) * 81 = 81/89

So, the answers are:

c = 1/89
P(x ≤ 1, y > 2) = 29/89
P(x ≥ 2 – y) = 81/89
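Because the distribution has only 12 points, the three answers can be checked by direct enumeration. The sketch below uses exact fractions. The helper name `prob` and the variable names are illustrative, not part of the question.

```python
from fractions import Fraction
from itertools import product

xs = [-1, 0, 1, 3]
ys = [1, 2, 3]

# Unnormalised weights x^2 + y^2 at every (x, y) point.
weights = {(x, y): x**2 + y**2 for x, y in product(xs, ys)}

# The probabilities must sum to 1, so c is 1 over the total weight.
c = Fraction(1, sum(weights.values()))

def prob(event):
    """Sum f(x, y) = c * (x^2 + y^2) over the points where event(x, y) holds."""
    return sum(c * w for (x, y), w in weights.items() if event(x, y))

p_a = prob(lambda x, y: x <= 1 and y > 2)
p_b = prob(lambda x, y: x >= 2 - y)

print(f"c = {c}, P(x <= 1, y > 2) = {p_a}, P(x >= 2 - y) = {p_b}")
```

Both probabilities come out below 1, which is a quick sanity check that c has been normalised correctly.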

B)To determine the value of K and examine whether X and Y are independent, we need to follow these steps:

Find the value of K.
Check if X and Y are independent.

Let’s start with finding the value of K:

The joint probability density function (PDF) must satisfy the condition that the total probability over the entire range equals 1. In this case, we have:

∫∫[f(x, y)] dx dy = 1

Integrate f(x, y) over the specified range 0 ≤ x, y ≤ 1:

∫[∫[K(xy + 2x + 3y + 6)] dx] dy = 1

Now, perform the inner integration with respect to x:

K ∫[xy + 2x + 3y + 6] dx

K [∫(xy dx) + ∫(2x dx) + ∫(3y dx) + ∫(6 dx)]

K [(1/2)x²y + x² + 3xy + 6x] from 0 to 1

K [(1/2)y + 1 + 3y + 6 – (0)]

K [(1/2)y + 1 + 3y + 6] = K [(7/2)y + 7]

Now, integrate this expression with respect to y from 0 to 1:

K ∫[(7/2)y + 7] dy from 0 to 1

K [(7/4)y² + 7y] from 0 to 1 = K (7/4 + 7) = K (7/4 + 28/4) = K (35/4)

Now, we know that the integral of f(x, y) over the specified range equals 1, so:

K (35/4) = 1

Solving for K:

K = 4/35

Now that we have found the value of K, let’s examine whether X and Y are independent. Two random variables, X and Y, are independent if and only if their joint PDF can be expressed as the product of their individual PDFs. In this case, we have:

f(x, y) = (4/35)(xy + 2x + 3y + 6)

Now, let’s find the individual PDFs of X and Y:

f_X(x) = ∫[f(x, y)] dy from 0 to 1 f_Y(y) = ∫[f(x, y)] dx from 0 to 1

Calculate these integrals, using the factorisation xy + 2x + 3y + 6 = (x + 2)(y + 3):

f_X(x) = (4/35)(x + 2) ∫[(y + 3)] dy from 0 to 1 = (4/35)(x + 2)(7/2) = (2/5)(x + 2)
f_Y(y) = (4/35)(y + 3) ∫[(x + 2)] dx from 0 to 1 = (4/35)(y + 3)(5/2) = (2/7)(y + 3)

The product of the individual PDFs is f_X(x) f_Y(y) = (2/5)(2/7)(x + 2)(y + 3) = (4/35)(x + 2)(y + 3) = f(x, y).

Since the joint PDF equals the product of the individual PDFs for all 0 ≤ x, y ≤ 1, X and Y are independent.
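The normalising constant and the independence check can be verified numerically with the sketch below. It uses a midpoint rule, which integrates functions that are linear in each variable exactly up to rounding, so no external library is needed. The helper names `integrate01` and `shape` are hypothetical.

```python
def integrate01(g, steps=100):
    """Midpoint-rule integral of g over [0, 1] (exact for linear g, up to rounding)."""
    h = 1.0 / steps
    return h * sum(g((i + 0.5) * h) for i in range(steps))

def shape(x, y):
    """The density without its normalising constant K."""
    return x * y + 2 * x + 3 * y + 6

# K is the reciprocal of the double integral of the unnormalised density.
total = integrate01(lambda y: integrate01(lambda x: shape(x, y)))
K = 1.0 / total

def f(x, y):
    return K * shape(x, y)

def f_X(x):
    """Marginal density of X: integrate the joint density over y."""
    return integrate01(lambda y: f(x, y))

def f_Y(y):
    """Marginal density of Y: integrate the joint density over x."""
    return integrate01(lambda x: f(x, y))

# Largest gap between f(x, y) and f_X(x) * f_Y(y) on a small grid of test points.
grid = [i / 4 for i in range(5)]
max_gap = max(abs(f(x, y) - f_X(x) * f_Y(y)) for x in grid for y in grid)

print(f"K = {K:.6f} (4/35 = {4 / 35:.6f}), max gap = {max_gap:.2e}")
```

A gap of essentially zero at every test point is consistent with independence; the algebraic factorisation (x + 2)(y + 3) is what actually proves it.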

Q5.A)Determine the mean and S. D of sampling distributions of variances for the population 6, 12, 18, 24, 30, 36 with n=2 and with sampling (a) with replacement (b) without replacement.
B) Chief Executive Officer (CEO) of the life insurance company wants to undertake a survey of the huge number of insurance policies that the company has underwritten. The company makes a yearly profit on each policy that is distributed with mean Rs.8000 and standard deviation Rs.300. It is desired that the survey must be large enough to reduce the standard error to not more than 1.5% of the population mean. How large should sample be?

Ans: A)

To determine the mean and standard deviation (SD) of sampling distributions of variances for a given population, you can follow these steps:

Population: 6, 12, 18, 24, 30, 36

Sample Size (n): 2

(a) Sampling with Replacement:

1. List all possible samples of size 2 with replacement from the population. There are 6 × 6 = 36 ordered samples:
- Sample 1: (6, 6)
- Sample 2: (6, 12)
- Sample 3: (6, 18)
- …
- Sample 36: (36, 36)
2. For each sample, calculate the sample variance (s^2). The sample variance formula is: s^2 = [(x1 – x̄)^2 + (x2 – x̄)^2] / (n – 1)
3. Find the mean (average) of all sample variances calculated in step 2. This will be the mean of the sampling distribution of variances.
4. Find the standard deviation (SD) of the sample variances calculated in step 2. This will be the SD of the sampling distribution of variances.

(b) Sampling without Replacement:

When sampling without replacement, no population value can be drawn twice, so samples such as (6, 6) are excluded.

You would list all possible combinations of samples without replacement, calculate the sample variance for each, and then find the mean and SD of these sample variances. For a population of 6 elements with a sample size of 2, there are 6C2 = 15 possible combinations to consider.

For larger populations and sample sizes this enumeration becomes time-consuming by hand, and it may be practical to use statistical software or tools to perform these calculations.
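The enumeration described for both cases can be sketched in Python as follows. This assumes the sample variance uses the n – 1 divisor, as in the formula given for s^2, and that the SD of the sampling distribution is taken over all listed samples with equal weight. The helper name `summarize` is hypothetical.

```python
from itertools import combinations, product
from statistics import mean, pstdev, variance

population = [6, 12, 18, 24, 30, 36]
n = 2

def summarize(samples):
    """Mean and SD of the sample variances (n - 1 divisor) over all samples."""
    variances = [variance(sample) for sample in samples]
    # pstdev: every listed sample is equally likely, so no further correction.
    return mean(variances), pstdev(variances)

# (a) With replacement: all 6 * 6 = 36 ordered samples.
with_repl = list(product(population, repeat=n))
mean_wr, sd_wr = summarize(with_repl)

# (b) Without replacement: all 6C2 = 15 unordered samples.
without_repl = list(combinations(population, n))
mean_wor, sd_wor = summarize(without_repl)

print(f"with replacement:    mean = {mean_wr}, SD = {sd_wr:.2f}")
print(f"without replacement: mean = {mean_wor}, SD = {sd_wor:.2f}")
```

If your course defines the sample variance with divisor n instead of n – 1, replace `variance` with `statistics.pvariance`; the results change accordingly.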

B)

To determine the sample size needed for the CEO’s survey of insurance policies to ensure that the standard error is not more than 1.5% of the population mean, you can use the formula for the standard error of the mean:

Standard Error = σ / √n

The standard error must not exceed 1.5% of the population mean, that is, 0.015 × 8000 = 120. With σ = 300:

300 / √n ≤ 120

√n ≥ 300 / 120 = 2.5

n ≥ 6.25

Since you cannot have a fraction of a policy, you should round up to the nearest whole number to ensure the sample size is sufficient. Therefore, the CEO should survey at least 7 insurance policies to reduce the standard error to not more than 1.5% of the population mean.
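The sample-size calculation can be sketched in a few lines of Python, assuming the standard error of the mean is σ/√n and the cap is 1.5% of the population mean of 8000. The variable names are illustrative.

```python
import math

population_mean = 8000.0   # mean yearly profit per policy (Rs.)
sigma = 300.0              # standard deviation of yearly profit (Rs.)
max_se = 0.015 * population_mean   # largest standard error allowed

# sigma / sqrt(n) <= max_se  implies  n >= (sigma / max_se) ** 2
n_exact = (sigma / max_se) ** 2
n_required = math.ceil(n_exact)

print(f"n >= {n_exact}, so survey at least {n_required} policies")
```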
