Wednesday, June 8, 2011

Basic Statistics 6-8-2011

1. Using the frequency distribution given, construct a frequency polygon.

Class Boundaries    Frequency
95.5 - 100.5        2
100.5 - 105.5       8
105.5 - 110.5       18
110.5 - 115.5       13
115.5 - 120.5       7
120.5 - 125.5       1
125.5 - 130.5       1




Find the midpoints of each class. Recall that midpoints are found by adding the upper and lower boundaries and dividing by 2:
(95.5 + 100.5) / 2 = 98

(100.5 + 105.5) / 2 = 103
 
and so on. The midpoints are
Class Boundaries    Midpoints    Frequency
95.5 - 100.5        98           2
100.5 - 105.5       103          8
105.5 - 110.5       108          18
110.5 - 115.5       113          13
115.5 - 120.5       118          7
120.5 - 125.5       123          1
125.5 - 130.5       128          1

Draw the x and y axes. Label the x axis with the midpoint of each class, and then use a suitable scale on the y axis for the frequencies.
Using the midpoints for the x values and the frequencies as the y values, plot the points.
Connect adjacent points with line segments. Draw a line back to the x axis at the beginning and end of the graph, at the same distance that the previous and next midpoints would be located.
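
The steps above can be sketched in Python. This is only an illustration, assuming equal class widths; the function name polygon_points is made up for this example and is not part of the exercise. It computes the midpoints and adds the two zero-frequency points that bring the polygon back to the x axis.

```python
# Class boundaries and frequencies taken from the table in this problem.
boundaries = [(95.5, 100.5), (100.5, 105.5), (105.5, 110.5), (110.5, 115.5),
              (115.5, 120.5), (120.5, 125.5), (125.5, 130.5)]
frequencies = [2, 8, 18, 13, 7, 1, 1]


def polygon_points(boundaries, frequencies):
    """Return the (x, y) vertices of a frequency polygon.

    Hypothetical helper: assumes every class has the same width.
    """
    midpoints = [(lower + upper) / 2 for lower, upper in boundaries]
    width = boundaries[0][1] - boundaries[0][0]
    # Closing point one class width before the first midpoint.
    points = [(midpoints[0] - width, 0)]
    points += list(zip(midpoints, frequencies))
    # Closing point one class width after the last midpoint.
    points.append((midpoints[-1] + width, 0))
    return points


points = polygon_points(boundaries, frequencies)
for x, y in points:
    print(x, y)
```

The first and last points sit where the previous and next midpoints would be located, which is what the last step of the procedure asks for.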



Construct a histogram to represent the data shown below for the record high temperatures for each of the 50 states.

Class boundaries    Frequency
99.5 - 104.5        1
104.5 - 109.5       4
109.5 - 114.5       8
114.5 - 119.5       8
119.5 - 124.5       19
124.5 - 129.5       5
129.5 - 134.5       4
134.5 - 139.5       1


Is the given histogram correct?




Draw and label the x and y axes. The x axis is always the horizontal axis, and the y axis is always the vertical axis.
Represent the frequency on the y axis and the class boundaries on the x axis.
Using the frequencies as the heights, draw vertical bars for each class.
No, the given graph does not match the graph we constructed.
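
As a rough check on the bar heights, the histogram can be sketched as text. This is a minimal sketch, not a plotting routine; the helper name histogram_bars is hypothetical, and each bar is simply described by its left boundary, right boundary, and height.

```python
# Boundaries and frequencies from the table in this problem.
boundaries = [99.5, 104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5, 139.5]
frequencies = [1, 4, 8, 8, 19, 5, 4, 1]


def histogram_bars(boundaries, frequencies):
    """Return one (left, right, height) tuple per class (hypothetical helper)."""
    if len(boundaries) != len(frequencies) + 1:
        raise ValueError("need exactly one more boundary than frequencies")
    return [(boundaries[i], boundaries[i + 1], f)
            for i, f in enumerate(frequencies)]


bars = histogram_bars(boundaries, frequencies)
for left, right, height in bars:
    # One '#' per observation, so bar length equals the class frequency.
    print(f"{left:5.1f} - {right:5.1f} | {'#' * height} ({height})")
```

Comparing these heights with a drawn histogram, bar by bar, is one way to decide whether a given graph matches the table.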


These data represent the record high temperatures for each of the 50 states. Construct a grouped frequency distribution for the data using 7 classes.

112  100  127  120  134  118  105  110  109  112
110  118  117  116  118  122  114  114  105  109
107  112  114  115  118  117  118  122  106  110
116  108  110  121  113  120  119  111  104  111
120  113  120  117  105  110  118  112  114  114

What is the cumulative frequency of the class 124.5 - 129.5?



The procedure for constructing a grouped frequency distribution for numerical data follows.
Determine the classes.

Find the highest value and the lowest value: H = 134 and L = 100.

Find the range: R = highest value - lowest value = H - L, so

R = 134 - 100 = 34

Select the number of classes desired (usually between 5 and 20). In this case, 7 is arbitrarily chosen.
Find the class width by dividing the range by the number of classes.

Width = R / number of classes = 34 / 7 = 4.9 (rounded to one decimal place)
Round the answer up to the nearest whole number if there is a remainder: 4.9 rounds up to 5. (Rounding up is different from rounding off. A number is rounded up if there is any remainder when dividing. For example, 85 / 6 = 14.167 and is rounded up to 15. Also, 53 / 4 = 13.25 and is rounded up to 14.)
Select a starting point for the lowest class limit. This can be the smallest data value or any convenient number less than the smallest data value. In this case 100 is used. Add the width to the lowest score taken as the starting point to get the lower limit of the next class. Keep adding until there are 7 classes, 100, 105, 110, etc.

Subtract one unit from the lower limit of the second class to get the upper limit of the first class. Then add the width to each upper limit to get all the upper limits.

105 - 1 = 104

The first class is 100 - 104, the second class is 105 - 109, etc. Find the class boundaries by subtracting 0.5 from each lower class limit and adding 0.5 to each upper class limit:

99.5 - 104.5, 104.5 - 109.5, etc.
Tally the data.
Find the numerical frequencies from the tallies.

A cumulative frequency column can be added to the distribution by adding the frequency in each class to the total of the frequencies of the classes preceding that class, such as 0 + 2 = 2, 2 + 8 = 10, 10 + 18 = 28, and 28 + 13 = 41. The completed frequency distribution is

Class Limits    Class boundaries    Frequency    Cumulative Frequency
100 - 104       99.5 - 104.5        2            2
105 - 109       104.5 - 109.5       8            10
110 - 114       109.5 - 114.5       18           28
115 - 119       114.5 - 119.5       13           41
120 - 124       119.5 - 124.5       7            48
125 - 129       124.5 - 129.5       1            49
130 - 134       129.5 - 134.5       1            50
The frequency distribution shows that the cumulative frequency of the class 124.5 - 129.5 is 49.
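
The whole procedure can be sketched in Python as a check on the hand tally. This is only a sketch under stated assumptions: the data are whole numbers, the lowest value is used as the starting point, and the width is rounded up with math.ceil. The function name grouped_distribution is made up for this example. When the range divides evenly by the number of classes, this simple version would leave the highest value outside the last class, so it is not a general routine.

```python
import math

# Record high temperatures for the 50 states, as listed in this problem.
temps = [112, 100, 127, 120, 134, 118, 105, 110, 109, 112,
         110, 118, 117, 116, 118, 122, 114, 114, 105, 109,
         107, 112, 114, 115, 118, 117, 118, 122, 106, 110,
         116, 108, 110, 121, 113, 120, 119, 111, 104, 111,
         120, 113, 120, 117, 105, 110, 118, 112, 114, 114]


def grouped_distribution(data, num_classes):
    """Build rows of (lower limit, upper limit, lower boundary,
    upper boundary, frequency, cumulative frequency).

    Hypothetical helper; assumes integer data and starts at the minimum.
    """
    low, high = min(data), max(data)
    # Round up whenever there is a remainder: 34 / 7 = 4.857... becomes 5.
    width = math.ceil((high - low) / num_classes)
    rows = []
    cumulative = 0
    for k in range(num_classes):
        lower = low + k * width
        upper = lower + width - 1  # one unit below the next lower limit
        freq = sum(1 for x in data if lower <= x <= upper)
        cumulative += freq
        rows.append((lower, upper, lower - 0.5, upper + 0.5, freq, cumulative))
    return rows


rows = grouped_distribution(temps, 7)
for lower, upper, lb, ub, freq, cum in rows:
    print(f"{lower} - {upper}   {lb} - {ub}   {freq:2d}   {cum:2d}")
```

The sixth row printed corresponds to the class 124.5 - 129.5, and its last column is the cumulative frequency asked for in the question.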
These data represent the record high temperatures for each of the 50 states. Construct a grouped frequency distribution for the data using 7 classes.

112  100  127  120  134  118  105  110  109  112
110  118  117  116  118  122  114  114  105  109
107  112  114  115  118  117  118  122  106  110
116  108  110  121  113  120  119  111  104  111
120  113  120  117  105  110  118  112  114  114

What is the frequency of the class 119.5 - 124.5?

Twenty-five army inductees were given a blood test to determine their blood type. The data set is

A   O   AB  AB  O
AB  O   B   AB  B
B   B   O   A   O
A   O   O   O   AB
AB  A   O   B   A

Construct a frequency distribution for the data.

What is the frequency in the class of type O?



Since the data are categorical, discrete classes can be used. There are four blood types: A, B, O and AB. These types will be used as the classes for the distribution. The procedure for constructing a frequency distribution for categorical data is given next.
Make a table as shown.

A        B        C            D
Class    Tally    Frequency    Percent
A
B
O
AB
Tally the data and place the results in column B.
Count the tallies and place the results in column C.
Find the percentage of values in each class by using the formula

% = (f / n) × 100%

where f = frequency of the class and n = total number of values. For example, in the class of type A blood, the percentage is

% = (5 / 25) × 100% = 20%

Percentages are not normally a part of a frequency distribution, but they can be added since they are used in certain types of graphical presentations, such as pie graphs.
Find the totals for the columns C and D (see completed table).

A        B        C            D
Class    Tally    Frequency    Percent
A                 5            20
B                 5            20
O                 9            36
AB                6            24
Total             25           100
For the sample, we have 9 people with type O blood.
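
The tally and percentage steps can be checked with a short Python sketch. It uses collections.Counter from the standard library; the function name categorical_distribution is made up for this example.

```python
from collections import Counter

# Blood types of the 25 inductees, in the order listed in this problem.
blood_types = ["A", "O", "AB", "AB", "O",
               "AB", "O", "B", "AB", "B",
               "B", "B", "O", "A", "O",
               "A", "O", "O", "O", "AB",
               "AB", "A", "O", "B", "A"]


def categorical_distribution(values, classes):
    """Return (class, frequency, percent) rows (hypothetical helper)."""
    counts = Counter(values)
    n = len(values)
    # percent = f / n * 100, written as 100 * f / n to keep the floats exact here
    return [(c, counts[c], 100 * counts[c] / n) for c in classes]


table = categorical_distribution(blood_types, ["A", "B", "O", "AB"])
for name, freq, percent in table:
    print(f"{name:<3} {freq:2d} {percent:5.1f}%")
print("Total", sum(freq for _, freq, _ in table))
```

The row for type O gives the frequency asked for in the question.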



The data shown here represent the number of miles per gallon that 30 selected four-wheel-drive sports utility vehicles obtained in city driving.



Construct a frequency distribution.



12  17  12  14  16  18  16  18  12  16
17  15  15  16  12  15  16  16  12  14
15  12  15  15  19  13  16  18  16  14



What is the frequency of the class 11.5 - 12.5?
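
One way to approach this, sketched in Python, is to treat every whole-number value as its own class with boundaries 0.5 below and above it. That reading is an assumption suggested by the class 11.5 - 12.5 in the question, and the function name ungrouped_distribution is made up for this example.

```python
from collections import Counter

# Miles per gallon for the 30 vehicles, as listed in this problem.
mpg = [12, 17, 12, 14, 16, 18, 16, 18, 12, 16,
       17, 15, 15, 16, 12, 15, 16, 16, 12, 14,
       15, 12, 15, 15, 19, 13, 16, 18, 16, 14]


def ungrouped_distribution(data):
    """Return (value, lower boundary, upper boundary, frequency,
    cumulative frequency) rows, one per whole-number value.

    Hypothetical helper; assumes integer data.
    """
    counts = Counter(data)
    rows = []
    cumulative = 0
    for value in range(min(data), max(data) + 1):
        cumulative += counts[value]
        rows.append((value, value - 0.5, value + 0.5, counts[value], cumulative))
    return rows


mpg_rows = ungrouped_distribution(mpg)
for value, lb, ub, freq, cum in mpg_rows:
    print(f"{lb} - {ub}   {freq}   {cum}")
```

The first row printed is the class 11.5 - 12.5.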
The number of stories in two selected samples of tall buildings in Los Angeles and Philadelphia are shown. Construct a back-to-back stem and leaf plot, and compare the distributions.

Los Angeles
Philadelphia
23
42
29
37
54
56
22
34
21
46
38
36
44
40
34
42
52
30
43
44

Does Los Angeles have more 40 story buildings than Philadelphia?



Arrange the data for both sets in order.
Construct a stem and leaf plot using the same digits as stems. Place the digits for the leaves for Los Angeles on the left side of the stem and the digits for the leaves for Philadelphia on the right side.
Los Angeles  |   |  Philadelphia
    9 3 2 1  | 2 |
    8 7 6 4  | 3 |  0 4
      4 2 0  | 4 |  2 3 4 6
          4  | 5 |  2 6

Compare the distributions. The buildings in Los Angeles have a large variation in the number of stories per building. Philadelphia has more 40 story buildings than Los Angeles. Therefore the answer is no.
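
The plot can be reproduced with a short Python sketch. The two lists below are read back from the stem and leaf plot in the solution, which is an assumption about how the raw values split between the cities; the function name back_to_back is made up for this example.

```python
# Values per city, read off the stem and leaf plot (assumed split).
los_angeles = [21, 22, 23, 29, 34, 36, 37, 38, 40, 42, 44, 54]
philadelphia = [30, 34, 42, 43, 44, 46, 52, 56]


def back_to_back(left, right):
    """Return (left leaves, stem, right leaves) rows for two-digit data.

    Hypothetical helper: left leaves are listed largest first so they
    read outward from the stem, right leaves smallest first.
    """
    combined = left + right
    rows = []
    for stem in range(min(combined) // 10, max(combined) // 10 + 1):
        left_leaves = sorted((x % 10 for x in left if x // 10 == stem),
                             reverse=True)
        right_leaves = sorted(x % 10 for x in right if x // 10 == stem)
        rows.append((left_leaves, stem, right_leaves))
    return rows


plot = back_to_back(los_angeles, philadelphia)
for left_leaves, stem, right_leaves in plot:
    left_text = " ".join(map(str, left_leaves))
    right_text = " ".join(map(str, right_leaves))
    print(f"{left_text:>8} | {stem} | {right_text}")
```

Counting the leaves on each side of stem 4 gives the comparison used in the answer.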
Construct an ogive for the frequency distribution below.

Class Boundaries    Frequency
99.5 - 104.5        2
104.5 - 109.5       6
109.5 - 114.5       16
114.5 - 119.5       12
119.5 - 124.5       9
124.5 - 129.5       4
129.5 - 134.5       1




Find the cumulative frequency for each class.
Class Boundaries    Cumulative Frequency
99.5 - 104.5        2
104.5 - 109.5       8
109.5 - 114.5       24
114.5 - 119.5       36
119.5 - 124.5       45
124.5 - 129.5       49
129.5 - 134.5       50

Draw the x and y axes. Label the x axis with the class boundaries. Use an appropriate scale for the y axis to represent the cumulative frequencies. (Depending on the numbers in the cumulative frequency columns, scales such as 0, 1, 2, 3, . . ., or 5, 10, 15, 20, . . ., or 1000, 2000, 3000, . . . can be used. Do not label the y axis with the numbers in the cumulative frequency column.) In this example, a scale of 0, 5, 10, 15, . . . will be used.
Plot the cumulative frequency at each upper class boundary. Upper boundaries are used since the cumulative frequencies represent the number of data values accumulated up to the upper boundary of each class.
Starting with the first upper class boundary, 104.5, connect adjacent points with line segments. Then extend the graph to the first lower class boundary, 99.5 on the x axis.
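
The points of the ogive can be sketched in Python. This is a minimal sketch using itertools.accumulate from the standard library for the running totals; the function name ogive_points is made up for this example.

```python
from itertools import accumulate

# Class boundaries and frequencies from the table in this problem.
boundaries = [99.5, 104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5]
frequencies = [2, 6, 16, 12, 9, 4, 1]


def ogive_points(boundaries, frequencies):
    """Return the (x, y) points of the ogive (hypothetical helper).

    The first point sits on the x axis at the first lower boundary;
    each later point pairs an upper boundary with its cumulative frequency.
    """
    cumulative = list(accumulate(frequencies))
    return [(boundaries[0], 0)] + list(zip(boundaries[1:], cumulative))


ogive = ogive_points(boundaries, frequencies)
for x, y in ogive:
    print(x, y)
```

Connecting these points in order with line segments gives the graph described in the steps above.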

ALBANY STATE UNIVERSITY
BASIC STATISTICS    TEST-1 (CHAPTER-1, 2)
Date:
1.   Define Population, Sample.
2.    Name and define two areas of statistics.
3.   Define basic sampling methods.
4.   What are the types of sampling?
5.   The average grade of a class is 90. Define    Descriptive / Inferential
6.   Allergy therapy makes bees go away. Define    Descriptive / Inferential

7.   Classify the variable
1.   Wal-Mart sold 100 TV sets in a week in the Albany area. Qualitative/ Quantitative
2.   All Basic Statistics students at ASU are hard workers. Qualitative/ Quantitative
3.   Number of toys sold in a store.  Discrete/ continuous
4.   The average age of a class. Discrete/ continuous

8.   What are the boundaries of 25.6 ounces?
a)   25-26  b)25.55-25.65   c)25.5-25.7  d)20-39

9.   A researcher divided subjects into two groups according to gender and then selected members from each group for her sample. What sampling method was the researcher using?
a)   Cluster  b) Random  c) Systematic  d) stratified 
   
10.   Classify each sample as random, systematic, stratified, or cluster.
a)   In ASU, all teachers from two buildings are interviewed to determine whether they believe the students have less homework to do now than in previous years.
b)   Every 5th player is selected for a soccer game at ASU.
c)   Statistics students are selected to determine annual grades.
d)   Every 100th manufactured item is checked to determine its quality.
e)   USPS mail carriers in the city of Albany are divided into four groups according to gender and according to whether they walk or ride on their routes. Then 10 are selected from each group and interviewed to determine whether they have been bitten by a dog in the last year.

11.  List five reasons for organizing data into a frequency distribution.


12.  Find the class limits (upper and lower), the class boundaries (upper and lower), and the midpoints.
Class Limits    Class boundaries    Midpoints    Frequency    Cumulative Frequency
100 - 104                                        2
105 - 109                                        8
110 - 114                                        18
115 - 119                                        13
120 - 124                                        7
125 - 129                                        1
130 - 134                                        1


13.
Twenty-five army inductees were given a blood test to determine their blood type. The data set is

A   O   AB  AB  O
AB  O   B   AB  B
B   B   O   A   O
A   O   O   O   AB
AB  A   O   B   A

Construct a frequency distribution for the data.

What is the frequency in the class of type O?
Since the data are categorical, discrete classes can be used. There are four blood types: A, B, O and AB. These types will be used as the classes for the distribution. The procedure for constructing a frequency distribution for categorical data is given next.
14.
The data shown here represent the number of miles per gallon that 30 selected four-wheel-drive sports utility vehicles obtained in city driving.

Construct a frequency distribution.

12  17  12  14  16  18  16  18  12  16
17  15  15  16  12  15  16  16  12  14
15  12  15  15  19  13  16  18  16  14

What is the frequency of the class 11.5 - 12.5?