Saturday, July 9, 2011

ASU B.Stats Assignment for Monday, July 11, 2011

Albany State University

Assignment

Date: July 11, 2011

1. The data represent the annual chocolate sales for a sample of eleven countries in the world. Find the mean. Please round your answer to two decimal places.
2, 6, 4, 8, 12, 15, 11, 21, 34, 21, 18

2.  Given the distribution below, do the following. (5 points each)
    Complete the table.
    Class Limits   Frequency
       0 - 10           2
      10 - 20           7
      20 - 30          10
      30 - 40           2
      40 - 50          15
Mean ___________________           Standard Deviation___________________
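One way to check a hand calculation for a grouped frequency distribution is sketched below. It assumes each class is represented by its midpoint and uses the sample (n - 1) form of the standard deviation; both are assumptions, and the variable names are illustrative only.

```python
from math import sqrt

# Class limits and frequencies from the table above.
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [2, 7, 10, 2, 15]

# Assumption: each class is represented by its midpoint.
midpoints = [(lo + hi) / 2 for lo, hi in classes]

n = sum(freqs)
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))

mean = sum_fx / n
# Assumption: sample standard deviation (divide by n - 1).
sample_sd = sqrt((sum_fx2 - sum_fx ** 2 / n) / (n - 1))

print(round(mean, 2), round(sample_sd, 2))
```

If the course uses the population form instead, replace the divisor n - 1 with n.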

3. To qualify for a police academy, candidates must score in the top 5% (upper-tail area = 0.0500) on a general abilities test. The test has a mean of 200 and a standard deviation of 30. Find the lowest possible score to qualify. Assume the test scores are normally distributed.


4. Find the area under the normal distribution curve between z1 = 1.00 and z2 = 1.58. Round your answer to four decimal places.
Find the area under the normal distribution curve between z1 = +2.67 and z2 = -1.27. Give your answer in decimal form.

The desired area is shown below.
2.     Compute the value of the correlation coefficient for the data obtained in the study of the number of absences and the final grade of seven students in the statistics class given.


Subject   Number of absences x   Final grade y (%)   xy   x²   y²
A         1
B         8
C         5
D         2
E         4
F         6
G         8
Substitute in the formula and solve for r.


$$r = \frac{n(\sum xy) - (\sum x)(\sum y)}{\sqrt{[n(\sum x^2) - (\sum x)^2]\,[n(\sum y^2) - (\sum y)^2]}}$$
3.             Test the significance of the correlation coefficient for the data obtained in a study of age and systolic blood pressure of six randomly selected subjects. The data are shown in the table.

Subject   Age x   Pressure y
A         43      125
B         48      126
C         56      132
D         61      144
E         67      141
F         70      151
4.  Find the area under the normal distribution curve between z1 = 1.27 and z2 = 1.98. Round your answer to four decimal places.

5.  Find the area under the normal distribution curve between z1 = 0 and z2 = 27. Round your answer to four decimal places.

6.  Find the area under the normal distribution curve between z1 = 1.77 and z2 = 2.21. Round your answer to four decimal places.

7.  Find the area under the normal distribution curve between z1 = 1.27 and z2 = 1.98. Round your answer to four decimal places.

8.  Find the area under the normal distribution curve to the right of z = 0.34.

9.  Find the area under the normal distribution curve to the right of z = 1.97.

10. Find the area under the normal distribution curve to the left of z = -0.27.

11. Find the area under the normal distribution curve to the left of z = -0.37.

12. Find the area under the normal distribution curve to the right of z = 0.94.

13. A normal distribution has a mean of 25 and a standard deviation of 0.4. Find the area under the normal curve between 27 and 34.

14. A normal distribution has a mean of 10 and a standard deviation of 0.14. Find the area under the normal curve between 34 and 54.

15. A normal distribution has a mean of 9 and a standard deviation of 0.2. Find the area under the normal curve between 19 and 34.

16. A normal distribution has a mean of 5 and a standard deviation of 0.1. Find the area under the normal curve between 27 and 34.

17. A normal distribution has a mean of 150 and a standard deviation of 25.4. Find the area under the normal curve between 160 and 175.

18. A normal distribution has a mean of 13 and a standard deviation of 0.14. Find the area under the normal curve between 10 and 20.





   


Wednesday, July 6, 2011

ASU Probability, Normal Distribution, Correlation, Regression

ALBANY STATE UNIVERSITY
PROBABILITY
1.    For a card drawn from an ordinary deck, find the probability of getting an ace.



1
13
13
1
52
none of these
2.    A card is drawn from an ordinary deck. Find the probability of getting the 3 of spades.



1
4
1
13
52
1
none of these
3.    A card is drawn from an ordinary deck. Find the probability of getting a 4 or a diamond.



1/52
4/52
4/13
none of these
4.    When a single die is rolled, what is the probability of getting a 9?


5.    To qualify for a police academy, candidates must score in the top 5% on a general abilities test. The test has a mean of 200 and a standard deviation of 30. Find the lowest possible score to qualify. Assume the test scores are normally distributed.


Since the test scores are normally distributed, the test value (X) that cuts off the upper 5% of the area under the normal distribution curve is desired. This area is shown in the figure below.


Work backward to solve this problem.
Subtract 0.0500 from 0.5000 to get the area under the normal distribution between 200 and X: 0.5000 - 0.0500 = 0.4500
Find the z value that corresponds to an area of 0.4500 by looking up 0.4500 in the area portion of a standard normal distribution table. Since the area falls exactly halfway between two z values, use the larger of the two z values. In this case, the area 0.4500 falls halfway between 0.4495 and 0.4505, so use 1.65 for the z value.
Substitute in the formula z = (X - μ)/σ and solve for X.

$$1.65 = \frac{X - 200}{30}$$

$$X = (1.65)(30) + 200 = 49.50 + 200 = 249.50$$
Rounding up, we obtain X = 250.
A score of 250 should be used as a cutoff. Anybody scoring 250 or higher qualifies.

6.           A single card is drawn from a deck. Find the probability that it is a ten or a heart.


Since the ten of hearts is counted twice, one of the two probabilities must be subtracted.
$$P(\text{ten or heart}) = P(\text{ten}) + P(\text{heart}) - P(\text{ten of hearts}) = \frac{4}{52} + \frac{13}{52} - \frac{1}{52} = \frac{16}{52} = \frac{4}{13}$$
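The addition rule can be cross-checked by listing the deck and counting directly. The sketch below is illustrative only; the rank and suit labels are arbitrary names chosen for the example.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

# Count cards that are a ten, a heart, or both; each card is counted once,
# so the ten of hearts is not double-counted.
favorable = [card for card in deck if card[0] == "10" or card[1] == "hearts"]
p_ten_or_heart = Fraction(len(favorable), len(deck))

print(p_ten_or_heart)  # 4/13
```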
8.    In a hospital unit there are 10 nurses and 6 physicians; 7 nurses and 3 physicians are females. If a staff person is selected, find the probability that the subject is a nurse or a male.


The sample space is shown here.

Staff        Females   Males   Total
Nurses       7         3       10
Physicians   3         3       6
Total        10        6       16
The probability is

$$P(\text{nurse or male}) = P(\text{nurse}) + P(\text{male}) - P(\text{male nurse}) = \frac{10}{16} + \frac{6}{16} - \frac{3}{16} = \frac{13}{16}$$
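The same result can be reproduced from the cell counts of the table. This is a minimal sketch; the dictionary layout and key names are assumptions made for the example.

```python
from fractions import Fraction

# Counts from the table: (role, sex) -> number of staff.
staff = {("nurse", "female"): 7, ("nurse", "male"): 3,
         ("physician", "female"): 3, ("physician", "male"): 3}
total = sum(staff.values())

p_nurse = Fraction(sum(c for (role, _), c in staff.items() if role == "nurse"), total)
p_male = Fraction(sum(c for (_, sex), c in staff.items() if sex == "male"), total)
p_male_nurse = Fraction(staff[("nurse", "male")], total)

# Addition rule for events that are not mutually exclusive.
p_nurse_or_male = p_nurse + p_male - p_male_nurse
print(p_nurse_or_male)  # 13/16
```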
9.    When a single die is rolled are these two events mutually exclusive?
Getting a number greater than 4 and getting a number less than 4.


The events are mutually exclusive, since the first event can be 5 or 6 and the second event can be 1, 2, or 3.
The answer is yes
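Two events are mutually exclusive when they share no outcomes, which can be checked with sets. The sketch below is an illustration with made-up variable names.

```python
sample_space = set(range(1, 7))  # faces of a single die

greater_than_4 = {n for n in sample_space if n > 4}  # {5, 6}
less_than_4 = {n for n in sample_space if n < 4}     # {1, 2, 3}

# Mutually exclusive means the two events have no outcome in common.
mutually_exclusive = greater_than_4.isdisjoint(less_than_4)
print(mutually_exclusive)  # True
```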
10. A box contains 5 glazed doughnuts, 6 jelly doughnuts, and 7 chocolate doughnuts. If a person selects one doughnut at random, find the probability that it is either a glazed doughnut or a chocolate doughnut.


Since the box contains 5 glazed doughnuts, 7 chocolate doughnuts, and a total of 18 doughnuts,

$$P(\text{glazed or chocolate}) = P(\text{glazed}) + P(\text{chocolate}) = \frac{5}{18} + \frac{7}{18} = \frac{12}{18} = \frac{2}{3}$$
The events are mutually exclusive.



NORMAL DISTRIBUTION



1.    Find the area under the normal distribution curve between z = 0 and z = 2.37. Round to four decimal places.


Draw the figure and represent the area as shown.
Since the table for the standard normal distribution gives the area between 0 and any z value to the right of 0, one need only look up the z value in the table. Find 2.3 in the left column and 0.07 in the top row. The value where the row and column meet is the answer, 0.4911. Hence, the area is 0.4911, or 49.11%.
2.    Find the area under the normal distribution curve between z = 0 and z = 1.9. Round to four decimal places.



Draw the figure and represent the area as shown.
Find the area in the table by finding 1.9 in the left column and 0.00 in the top row. The area is 0.4713, or 47.13%.
3.    Find the area under the normal distribution curve between z = 0 and z = -1.76. Give your answer in decimal form.



Draw the figure and represent the area as shown.
The table does not give the area for negative values of z. But since the normal distribution is symmetric about the mean, the area to the left of the mean (in this case, the mean is 0) is the same as the area to the right of the mean. Hence one need only look up the area for z = +1.76, which is 0.4608, or 46.08%.



4.    Find the area under the normal distribution curve between z1 = 1.00 and z2 = 2.58. Round your answer to four decimal places.



The desired area is shown below.
For this situation, look up the area from z = 0 to z = 2.58 and the area from z = 0 to z = 1.00. Then subtract the smaller area from the larger, as shown below.
The area between z = 0 and z = 2.58 is 0.4951.
The area between z = 0 and z = 1.00 is 0.3413.
Hence, the desired area is 0.4951 - 0.3413 = 0.1538, or 15.38%.
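The table lookups can be cross-checked with Python's standard library. The helper below is a sketch (the name `area_between` is made up for this example); because it uses unrounded values rather than four-place table entries, the result can differ from the table answer in the fourth decimal place.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def area_between(z1, z2):
    """Area under the standard normal curve between two z values."""
    lo, hi = sorted((z1, z2))
    return std_normal.cdf(hi) - std_normal.cdf(lo)

print(round(area_between(1.00, 2.58), 4))
```

The same helper handles z values on opposite sides of 0, for example `area_between(-1.3, 1.67)`.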


5.    Find the area under the normal distribution curve between z1 = +1.67 and z2 = -1.3. Give your answer in decimal form.



The desired area is shown below.
Now, since the two areas are on the opposite sides of z = 0, one must find both areas and add them.
The area between z = 0 and z = 1.67 is 0.4525.
The area between z = 0 and z = -1.3 is 0.4032.
Hence, the total area between z = -1.3 and z = +1.67 is 0.4525 + 0.4032 = 0.8557, or 85.57%.
6.    Find the area under the normal distribution curve to the right of z1 = +1.14 and to the left of z2 = -2.89. Round to four decimal places.



The desired area is shown.
The area to the right of 1.14 is 0.5000 - 0.3729 = 0.1271.
The area to the left of -2.89 is 0.5000 - 0.4981 = 0.0019.
The total area, then, is 0.1271 + 0.0019 = 0.1290, or 12.90%.
7.    Find the z value such that the area under the standard normal distribution curve between 0 and the z value is 0.195. Give your answer correct to two decimal places.



Draw the figure. The area is shown below.
Next, find the area 0.1950 in the body of the standard normal distribution table. Then read the corresponding z value: 0.5 in the left column and 0.01 in the top row. Add these two values to get z = 0.51.
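Working backward from an area to a z value can also be done with the inverse of the cumulative distribution function. This is a sketch using the standard library; the variable names are illustrative.

```python
from statistics import NormalDist

area_from_zero = 0.195

# The area between 0 and z equals cdf(z) - 0.5,
# so invert the cdf at 0.5 + area.
z = NormalDist().inv_cdf(0.5 + area_from_zero)
print(round(z, 2))  # 0.51
```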
8. The mean number of hours an American worker spends on the computer is 3.1 hours per workday. Assume the standard deviation is 0.5 hour. Find the percentage of workers who spend less than 4.4 hours on the computer. Assume the variable is normally distributed. Round to the nearest hundredth of a percent.



Draw the picture and represent the area as shown below (X = 4.4).

Find the z value corresponding to 4.4.

$$z = \frac{X - \mu}{\sigma} = \frac{4.4 - 3.1}{0.5} = 2.60$$
Hence, 4.4 is 2.6 standard deviations above the mean of 3.1, as shown below (z = 2.6).

Find the area using a table of values for the standard normal distribution. The area between z = 0 and z = 2.6 is 0.4953. Since the area under the curve to the left of z = 2.6 is desired, add 0.5000 to 0.4953 (0.5000 + 0.4953 = 0.9953).
Therefore, 99.53% of the workers spend less than 4.4 hours per workday on the computer.
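A short sketch of the same calculation, assuming the mean and standard deviation stated in the problem; the variable names are made up for the example.

```python
from statistics import NormalDist

hours = NormalDist(mu=3.1, sigma=0.5)

z = (4.4 - hours.mean) / hours.stdev  # standardize X = 4.4
p_less = hours.cdf(4.4)               # area to the left of X = 4.4

print(round(z, 2), round(100 * p_less, 2))
```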
9. Each month, an American household generates an average of 28 pounds of newspaper for garbage or recycling. Assume the standard deviation is 2 pounds. If a household is selected at random, find the probability of its generating between 26 and 29 pounds per month. Assume the variable is approximately normally distributed. Express your answer correct to the nearest hundredth of a percent.



Draw the figure and represent the area as shown below.

Find the two z values.

$$z = \frac{X - \mu}{\sigma} = \frac{26 - 28}{2} = -1$$

$$z = \frac{X - \mu}{\sigma} = \frac{29 - 28}{2} = 0.5$$
Find the appropriate area, using a table of values for the standard normal distribution. The area between z = 0 and z = -1 is 0.3413. The area between z = 0 and z = 0.5 is 0.1915. Add 0.3413 to 0.1915 (0.3413 + 0.1915 = 0.5328).
Hence, the probability that a randomly selected household generates between 26 and 29 pounds of newspapers per month is 53.28%.
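The area between two X values is the difference of two cumulative probabilities. A minimal sketch, with illustrative variable names:

```python
from statistics import NormalDist

newspaper = NormalDist(mu=28, sigma=2)

# P(26 < X < 29) = P(X < 29) - P(X < 26)
p_between = newspaper.cdf(29) - newspaper.cdf(26)
print(round(100 * p_between, 2))  # 53.28
```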
10. Each month, an American household generates an average of 28 pounds of newspaper for garbage or recycling. Assume the standard deviation is 2 pounds. If a household is selected at random, find the probability of its generating more than 30.7 pounds per month. Assume the variable is approximately normally distributed. Express your answer correct to the nearest hundredth of a percent.



Draw the figure and represent the area as shown below.

Find the z value for 30.7.

$$z = \frac{X - \mu}{\sigma} = \frac{30.7 - 28}{2} = 1.35$$
Find the appropriate area, using a table of values for the standard normal distribution. The area between z = 0 and z = 1.35 is 0.4115. Since the desired area is in the right tail, subtract 0.4115 from 0.5000.

0.5000 - 0.4115 = 0.0885
Hence, the probability that a randomly selected household will generate more than 30.7 pounds of newspapers per month is 0.0885, or 8.85%.
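For a right-tail probability, subtract the cumulative probability from 1. The sketch below uses illustrative names and the parameters given in the problem.

```python
from statistics import NormalDist

newspaper = NormalDist(mu=28, sigma=2)

z = (30.7 - newspaper.mean) / newspaper.stdev
p_more = 1 - newspaper.cdf(30.7)  # right-tail area

print(round(z, 2), round(p_more, 4))  # 1.35 0.0885
```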
11. The American Automobile Association reports that the average time it takes to respond to an emergency call is 25 minutes. Assume the variable is approximately normally distributed and the standard deviation is 4.5 minutes. If 99 calls are randomly selected, approximately how many will be responded to in less than 24 minutes?



Draw the figure and represent the area as shown below.

Find the z value for 24.

$$z = \frac{X - \mu}{\sigma} = \frac{24 - 25}{4.5} = -0.22$$
Find the appropriate area, using a table of values for the standard normal distribution. By symmetry, the area between z = 0 and z = -0.22 equals the area between z = 0 and z = +0.22, which is 0.0871.
Subtract 0.0871 from 0.5000 to get 0.4129, the area to the left of z = -0.22.
To find how many calls will be responded to in less than 24 minutes, multiply the sample size 99 by 0.4129 to get 40.877. Hence, approximately 41 calls will be responded to in under 24 minutes.
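The expected count is the sample size times the left-tail probability. The sketch below uses the unrounded z value, so the probability differs slightly from the table-based 0.4129, but the rounded count is the same. Variable names are illustrative.

```python
from statistics import NormalDist

response = NormalDist(mu=25, sigma=4.5)
n_calls = 99

p_under_24 = response.cdf(24)          # uses the unrounded z value
expected_calls = n_calls * p_under_24  # expected number out of 99 calls

print(round(expected_calls))  # 41
```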


12. To qualify for a police academy, candidates must score in the top 5% on a general abilities test. The test has a mean of 200 and a standard deviation of 30. Find the lowest possible score to qualify. Assume the test scores are normally distributed.



Since the test scores are normally distributed, the test value (X) that cuts off the upper 5% of the area under the normal distribution curve is desired. This area is shown in the figure below.


Work backward to solve this problem.
Subtract 0.0500 from 0.5000 to get the area under the normal distribution between 200 and X: 0.5000 - 0.0500 = 0.4500
Find the z value that corresponds to an area of 0.4500 by looking up 0.4500 in the area portion of a standard normal distribution table. Since the area falls exactly halfway between two z values, use the larger of the two z values. In this case, the area 0.4500 falls halfway between 0.4495 and 0.4505, so use 1.65 for the z value.
Substitute in the formula z = (X - μ)/σ and solve for X.

$$1.65 = \frac{X - 200}{30}$$

$$X = (1.65)(30) + 200 = 49.50 + 200 = 249.50$$
Rounding up, we obtain X = 250.
A score of 250 should be used as a cutoff. Anybody scoring 250 or higher qualifies.
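The cutoff can also be found without a table by inverting the cumulative distribution function. This sketch gives the unrounded z value (about 1.645), so the score comes out slightly below the table-based 249.50; either way it rounds up to 250. The variable names are assumptions for the example.

```python
from statistics import NormalDist

scores = NormalDist(mu=200, sigma=30)

# Score with 95% of the area to its left, i.e. 5% in the upper tail.
cutoff = scores.inv_cdf(0.95)
z = (cutoff - scores.mean) / scores.stdev

print(round(z, 3), round(cutoff, 2))
```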

CORRELATION

13. Construct a scatter plot for the data obtained in a study on the number of absences and the final grades of seven randomly selected students from a statistics class. The data are shown here.

Student   Number of absences x   Final grade y (%)
A         6                      87
B         2                      91
C         15                     48
D         9                      79
E         12                     63
F         5                      95
G         8                      83


Draw and label the x and y axes.
Plot each point on the graph, as shown below.


14. Compute the value of the correlation coefficient for the data obtained in the study of age and blood pressure.

Subject   Age x   Pressure y
A         43      122
B         41      120
C         52      131
D         66      145
E         65      148
F         76      155


Make a table.

Subject   Age x   Pressure y   xy   x²   y²
A         43      122
B         41      120
C         52      131
D         66      145
E         65      148
F         76      155
Find the values of xy, x², and y² and place these values in the corresponding columns of the table. The completed table is shown here.

Subject   Age x   Pressure y   xy      x²     y²
A         43      122          5246    1849   14884
B         41      120          4920    1681   14400
C         52      131          6812    2704   17161
D         66      145          9570    4356   21025
E         65      148          9620    4225   21904
F         76      155          11780   5776   24025

Σx = 343, Σy = 821, Σxy = 47948, Σx² = 20591, Σy² = 113399
Substitute in the formula and solve for r.
$$r = \frac{n(\sum xy) - (\sum x)(\sum y)}{\sqrt{[n(\sum x^2) - (\sum x)^2]\,[n(\sum y^2) - (\sum y)^2]}}$$

$$r = \frac{(6)(47948) - (343)(821)}{\sqrt{[(6)(20591) - (343)^2]\,[(6)(113399) - (821)^2]}} = 0.994$$

(rounded to three decimal places)
The correlation coefficient suggests a strong positive relationship between age and blood pressure.
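The raw-sums formula translates directly into code, which is handy for checking the table arithmetic. This is a sketch; the function name `pearson_r` is made up for the example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient from the raw-sums formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

ages = [43, 41, 52, 66, 65, 76]
pressures = [122, 120, 131, 145, 148, 155]
print(round(pearson_r(ages, pressures), 3))  # 0.994
```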
15. Compute the value of the correlation coefficient for the data obtained in the study of the number of absences and the final grade of seven students in the statistics class given.

Subject   Number of absences x   Final grade y (%)
A         1                      72
B         8                      84
C         5                      31
D         2                      70
E         4                      52
F         6                      99
G         8                      87



Make a table.

Find the values of xy, x², and y² and place these values in the corresponding columns of the table.

Subject   Number of absences x   Final grade y (%)   xy    x²   y²
A         1                      72                  72    1    5184
B         8                      84                  672   64   7056
C         5                      31                  155   25   961
D         2                      70                  140   4    4900
E         4                      52                  208   16   2704
F         6                      99                  594   36   9801
G         8                      87                  696   64   7569

Σx = 34, Σy = 495, Σxy = 2537, Σx² = 210, Σy² = 38175
Substitute in the formula and solve for r.
$$r = \frac{n(\sum xy) - (\sum x)(\sum y)}{\sqrt{[n(\sum x^2) - (\sum x)^2]\,[n(\sum y^2) - (\sum y)^2]}}$$

$$r = \frac{(7)(2537) - (34)(495)}{\sqrt{[(7)(210) - (34)^2]\,[(7)(38175) - (495)^2]}} = 0.352$$
The value of r suggests a weak positive linear relationship between a student's final grade and the number of absences a student has. That is, in this sample, students with more absences tended to have somewhat higher grades, but the relationship is not strong.
16. Test the significance of the correlation coefficient for the data obtained in a study of age and systolic blood pressure of six randomly selected subjects. The data are shown in the table.

Subject   Age x   Pressure y
A         43      125
B         48      126
C         56      132
D         61      144
E         67      141
F         70      151

Use α = 0.02 and r = 0.942.



There is not a significant relationship between the variables of age and blood pressure.
There is a significant relationship between the variables of age and blood pressure.
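The source does not say which test is intended here. One common approach, shown as a sketch below, is the t test for the correlation coefficient, t = r·sqrt((n - 2)/(1 - r²)) with n - 2 degrees of freedom. The critical value 3.747 is the two-tailed t table entry for α = 0.02 and 4 degrees of freedom; the variable names are illustrative.

```python
from math import sqrt

r, n = 0.942, 6
df = n - 2

# Test statistic for H0: rho = 0.
t = r * sqrt(df / (1 - r ** 2))

# Two-tailed critical value for alpha = 0.02 with 4 degrees of freedom,
# read from a standard t table.
t_critical = 3.747

significant = abs(t) > t_critical
print(round(t, 3), significant)
```

Under these assumptions the test statistic exceeds the critical value, which corresponds to the second choice above.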
17. The following data were obtained in a study on the number of hours that nine people exercise each week and the amount of milk (in ounces) each person consumes each week.

Subject   Hours x   Amount y
A         3         35
B         0         8
C         2         43
D         5         47
E         8         41
F         5         43
G         10        52
H         2         63
I         1         35

Using the table for the critical values for the PPMC, test the significance of the correlation coefficient r = 0.456 at α = 0.05.




There is enough evidence to say that there is a significant linear relationship between the variables.
There is not enough evidence to say that there is a significant linear relationship between the variables.
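A critical value for r can be derived from the t distribution, which is how PPMC tables are typically built. The sketch below assumes a two-tailed test with n - 2 = 7 degrees of freedom and takes the t critical value 2.365 from a standard t table; PPMC tables list about 0.666 for this case. Variable names are illustrative.

```python
from math import sqrt

r, n = 0.456, 9
df = n - 2

# Two-tailed t critical value for alpha = 0.05 with 7 degrees of freedom,
# read from a standard t table.
t_critical = 2.365

# Equivalent critical value for r.
r_critical = t_critical / sqrt(t_critical ** 2 + df)

significant = abs(r) > r_critical
print(round(r_critical, 3), significant)
```

Since 0.456 is below the critical value, this comparison corresponds to the second choice above.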

Regression

18. The data in the following table were obtained in a study of age and systolic blood pressure of six randomly selected subjects. Find the equation of the regression line. Please round to three decimal places.

Subject   Age x   Pressure y
A         43      127
B         48      126
C         56      130
D         61      143
E         67      140
F         70      153


The values needed for the equation are n = 6, Σx = 345, Σy = 819, Σxy = 47602, and Σx² = 20399. Substituting in the formulas, one gets

$$a = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n(\sum x^2) - (\sum x)^2} = \frac{(819)(20399) - (345)(47602)}{(6)(20399) - (345)^2} = 84.325$$

$$b = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2} = \frac{(6)(47602) - (345)(819)}{(6)(20399) - (345)^2} = 0.907$$
Hence, the equation of the regression line y = a + bx is

y = 84.325 + 0.907x
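The two formulas for a and b can be wrapped in a small function to check the sums and the final coefficients. This is a sketch; `regression_line` is a name invented for the example.

```python
def regression_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denom = n * sum_x2 - sum_x ** 2
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denom  # intercept
    b = (n * sum_xy - sum_x * sum_y) / denom       # slope
    return a, b

ages = [43, 48, 56, 61, 67, 70]
pressures = [127, 126, 130, 143, 140, 153]
a, b = regression_line(ages, pressures)
print(round(a, 3), round(b, 3))  # 84.325 0.907
```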



18. The data in the following table were obtained in a study on the number of absences and the final grades of seven randomly selected students from a statistics class. Find the equation of the regression line. Please round to three decimal places.

Student   Number of absences x   Final grade y (%)
A         6                      88
B         2                      97
C         15                     43
D         9                      65
E         12                     63
F         5                      87
G         8                      73



The values needed for the equation are n = 7, Σx = 57, Σy = 516, Σxy = 3727, and Σx² = 579. Substituting in the formulas, one gets

$$a = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n(\sum x^2) - (\sum x)^2} = \frac{(516)(579) - (57)(3727)}{(7)(579) - (57)^2} = 107.369$$

$$b = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2} = \frac{(7)(3727) - (57)(516)}{(7)(579) - (57)^2} = -4.133$$
Hence, the equation of the regression line y = a + bx is

y = 107.369 - 4.133x
19. The data in the following table were obtained in a study of age and systolic blood pressure of six randomly selected subjects. Use the equation of the regression line to predict the blood pressure for a person who is 64 years old. Please round to the nearest whole number.

Subject   Age x   Pressure y
A         43      126
B         48      124
C         56      135
D         61      140
E         67      140
F         70      150



The values needed for the equation are n = 6, Σx = 345, Σy = 815, Σxy = 47350, and Σx² = 20399. Substituting in the formulas, one gets

$$a = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n(\sum x^2) - (\sum x)^2} = \frac{(815)(20399) - (345)(47350)}{(6)(20399) - (345)^2} = 85.911$$

$$b = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2} = \frac{(6)(47350) - (345)(815)}{(6)(20399) - (345)^2} = 0.868$$
Hence, the equation of the regression line y = a + bx is

y = 85.911 + 0.868x
Substituting 64 for x in the equation of the regression line gives

y = 85.911 + (0.868)(64) = 141.463 (rounded to 141)
In other words, the predicted systolic blood pressure for a 64-year-old person is 141.
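The fit and the prediction can be checked together. The sketch below keeps the unrounded coefficients, so the intermediate value differs slightly from 141.463, but it rounds to the same whole number. Variable names are illustrative.

```python
ages = [43, 48, 56, 61, 67, 70]
pressures = [126, 124, 135, 140, 140, 150]

n = len(ages)
sum_x, sum_y = sum(ages), sum(pressures)
sum_xy = sum(x * y for x, y in zip(ages, pressures))
sum_x2 = sum(x * x for x in ages)

denom = n * sum_x2 - sum_x ** 2
a = (sum_y * sum_x2 - sum_x * sum_xy) / denom  # intercept
b = (n * sum_xy - sum_x * sum_y) / denom       # slope

predicted = a + b * 64  # predicted pressure at age 64
print(round(predicted))  # 141
```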