A scientist for a company that manufactures fabric wants to assess the percentage of polyester in the fabric.
The advertised percentage is 15%. The scientist measures the percentage of polyester in 20 random samples.
Previous measurements found that the population standard deviation is 2.6%.
| Sample Id | Percent Polyester |
| --- | --- |
| 1 | 15.2 |
| 2 | 12.4 |
| 3 | 15.4 |
| 4 | 16.5 |
| 5 | 15.9 |
| 6 | 17.1 |
| 7 | 16.9 |
| 8 | 14.3 |
| 9 | 19.1 |
| 10 | 18.2 |
| 11 | 18.5 |
| 12 | 16.3 |
| 13 | 20.0 |
| 14 | 19.2 |
| 15 | 12.3 |
| 16 | 12.8 |
| 17 | 17.9 |
| 18 | 16.3 |
| 19 | 18.7 |
| 20 | 16.2 |
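from scipy.stats import norm
import pandas as pd
import math

# Measured percentage of polyester in the 20 samples
dataset = pd.Series([15.2, 12.4, 15.4, 16.5, 15.9, 17.1, 16.9, 14.3, 19.1, 18.2, 18.5, 16.3, 20, 19.2, 12.3, 12.8, 17.9, 16.3, 18.7, 16.2])
alpha = 0.05  # significance level
mean = dataset.mean()  # sample mean
mu = 15  # advertised (hypothesized) percentage
sigma = 2.6  # known population standard deviation
n = dataset.size
# One-sample z statistic: distance of the sample mean from mu in standard errors
z = (mean - mu) / (sigma / math.sqrt(n))
# Two-sided p-value from the standard normal survival function
p = norm.sf(abs(z)) * 2
print('z =', z)
print('p =', p)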
z = 2.5112763439613035
p = 0.012029548638074252
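The cell above defines `alpha = 0.05` but never compares it with the p-value. The sketch below shows one way the decision step could be written, assuming a two-sided test of H0: mu = 15 against H1: mu ≠ 15 (the exercise does not state the hypotheses explicitly). It uses only the standard library (`statistics.NormalDist`) instead of SciPy, and the names `z_crit`, `ci_low`, `ci_high`, and `reject_h0` are illustrative, not part of the original notebook.

```python
import math
from statistics import NormalDist, mean

# Data and parameters taken from the exercise
samples = [15.2, 12.4, 15.4, 16.5, 15.9, 17.1, 16.9, 14.3, 19.1, 18.2,
           18.5, 16.3, 20.0, 19.2, 12.3, 12.8, 17.9, 16.3, 18.7, 16.2]
mu0 = 15.0     # advertised percentage (value under H0)
sigma = 2.6    # known population standard deviation
alpha = 0.05   # significance level

n = len(samples)
x_bar = mean(samples)
std_err = sigma / math.sqrt(n)

# z statistic and two-sided p-value (assumed two-sided alternative)
z = (x_bar - mu0) / std_err
std_normal = NormalDist()
p_value = 2 * (1 - std_normal.cdf(abs(z)))

# Equivalent views of the same decision: critical value and confidence interval
z_crit = std_normal.inv_cdf(1 - alpha / 2)
ci_low = x_bar - z_crit * std_err
ci_high = x_bar + z_crit * std_err

reject_h0 = p_value < alpha
print('z =', z, 'critical value =', z_crit)
print('p =', p_value)
print('confidence interval =', (ci_low, ci_high))
print('reject H0:', reject_h0)
```

With p ≈ 0.012 below alpha = 0.05, this sketch rejects H0, which suggests that the mean polyester percentage differs from the advertised 15%. The confidence interval for the mean excludes 15, which is the same conclusion stated another way.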
The table shows the monthly number of defective products at two jeans factories, one column per factory.
The data for factory 1 cover 10 months, and the data for factory 2 cover 12 months.
Is there a difference in the average number of defective products between the two factories?
| Factory 1 | Factory 2 |
| --- | --- |
| 80 | 79 |
| 76 | 73 |
| 70 | 72 |
| 80 | 62 |
| 66 | 76 |
| 85 | 68 |
| 79 | 70 |
| 71 | 86 |
| 81 | 75 |
| 76 | 68 |
|  | 73 |
|  | 66 |
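from scipy import stats
import pandas as pd

# Monthly defect counts: 10 months for factory 1, 12 months for factory 2
factory1 = pd.Series([80, 76, 70, 80, 66, 85, 79, 71, 81, 76])
factory2 = pd.Series([79, 73, 72, 62, 76, 68, 70, 86, 75, 68, 73, 66])
# Two-sided independent two-sample t-test; by default equal_var=True (pooled variance)
stats.ttest_ind(factory1, factory2)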
Ttest_indResult(statistic=1.5519317588776553, pvalue=0.1363596260157604)
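To make the reported statistic easier to interpret, the sketch below recomputes it by hand with the pooled-variance formula, which is what `ttest_ind` uses when `equal_var` is left at its default of `True`. It relies only on the standard library. The significance level of 0.05 is carried over from the first exercise as an assumption, and `T_CRIT = 2.086` is the two-sided 5% critical value for 20 degrees of freedom from a standard t-table. The names `pooled_var`, `t_stat`, and `reject_h0` are illustrative.

```python
import math
from statistics import mean, variance

factory1 = [80, 76, 70, 80, 66, 85, 79, 71, 81, 76]
factory2 = [79, 73, 72, 62, 76, 68, 70, 86, 75, 68, 73, 66]

n1, n2 = len(factory1), len(factory2)
df = n1 + n2 - 2  # degrees of freedom for the pooled test

# Pooled variance: sample variances weighted by their degrees of freedom
pooled_var = ((n1 - 1) * variance(factory1)
              + (n2 - 1) * variance(factory2)) / df

# Standard error of the difference in means, then the t statistic
std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = (mean(factory1) - mean(factory2)) / std_err

# Assumption: alpha = 0.05, two-sided; critical value from a t-table for df = 20
T_CRIT = 2.086
reject_h0 = abs(t_stat) > T_CRIT

print('t =', t_stat, 'df =', df)
print('reject H0:', reject_h0)
```

Here |t| ≈ 1.55 is below the critical value, in line with the reported p-value of about 0.136. At the 5% level, the data do not show a significant difference in the average number of defective products between the two factories. If the equal-variance assumption were in doubt, passing `equal_var=False` to `ttest_ind` would run Welch's test instead.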