Activity 5: MATH 250

Elements of Statistics, Summer 2018

DUE DATE: 07/05/2018

NAME:

MATH 250: Elements of Statistics

Class Data, Summer 2018: CLEANED Student Data

Individual ID#, Gender, Foot Length, Height, Age, Armspan, Number in Family, Hair Color

1 Female 24.5 162.5 27 156.5 3 Blonde

2 Male 25.5 175.5 31 187 4 Black

General Instructions: Please place your name above, then complete the following questions. NOTE: Read the entire document below to get a feel for the activity before continuing. Make sure to save this Excel file often using the filename “yournameActivity5(Summer 2018)”. Once complete, submit your answers to this activity by attaching your Excel file through the completion link in the Unit 2 Activity 5 assignment description in Blackboard. Use the area to the near right in this Excel worksheet when calculating any statistics/parameters. Methods/work to calculate values must be shown in the spreadsheet in order to receive full credit.

3 Female 23 160 30 162 5 Brown

4 Male 25.5 175 29 175 3 Brown

5 Female 24 158 25 157 7 Brown

6 Female 25 163 22 163 3 Black

7 Male 31 186 30 190 3 Brown

8 Female 24.5 165 18 166 9 Brown

9 Female 21 155 27 150.5 4 Brown

10 Female 23 165 24 150 3 Black

11 Female 25.5 172 37 172 2 Black

Overview:

As discussed in Chapters 9 and 10 of your text, one must recognize the difficulty of, and the issues related to, predicting population parameters from sample data (e.g., determining the mean height of all FHSU students from only a sample). To do this (as covered in Chapter 9), one must understand sampling distributions and the related issue of probability as found through the study of such distributions (which is also the key to truly understanding the material in the remaining chapters). With knowledge of sampling distributions, one can then predict population parameters from sample statistics and have some indication of the strength of the prediction. Do note, however, that we can never be 100% sure of the true population value based on a sample; we can only be relatively confident that the population value lies within some interval we build around a collected sample statistic. This activity is designed to have you use the cleaned class data from Activity 4 to predict some population parameters and to discuss issues regarding such predictions. The cleaned data set of sixty individuals is given to the right (recall that one individual’s data was eliminated in Activity 4). Although this is not normally done, the Individual ID#s were also reassigned to eliminate student confusion about the number of data points in the data set.
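To make the idea of a sampling distribution concrete, here is a minimal Python sketch that repeatedly draws random samples from a simulated population and records each sample mean. The population mean of 170 cm, the number of repetitions, and the function name `simulate_sample_means` are assumptions chosen for illustration; only the standard deviation of 10.2 cm and the sample size of sixty echo values used in this activity.

```python
import random
import statistics

def simulate_sample_means(pop_mean, pop_sd, n, repetitions, seed=0):
    """Draw `repetitions` random samples of size n and return each sample's mean."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    means = []
    for _ in range(repetitions):
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        means.append(statistics.mean(sample))
    return means

# Hypothetical normal population: mean 170 cm (assumed), standard deviation 10.2 cm.
sample_means = simulate_sample_means(pop_mean=170.0, pop_sd=10.2, n=60, repetitions=2000)

center = statistics.mean(sample_means)   # should sit close to the population mean
spread = statistics.stdev(sample_means)  # should sit close to 10.2 / sqrt(60)
print(f"mean of sample means: {center:.2f}")
print(f"sd of sample means:   {spread:.2f}")
```

The sample means cluster around the population mean with a spread near sigma divided by the square root of n, which is why an interval built around a single sample mean can capture the population mean most of the time but never with certainty.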

12 Male 27 168 28 169 4 Black

13 Male 27 175.5 26 184.5 5 Brown

14 Female 22.5 158 32 162.5 4 Black

15 Male 27 188.5 22 187 4 Brown

16 Female 24.5 162.5 46 165 6 Brown

17 Male 31 183 26 188 4 Brown

18 Male 26 162.5 47 163 3 Brown

19 Male 29 180.5 50 182 4 Brown

20 Female 24 171 27 164 4 Blonde

21 Male 28 180.5 36 183 3 Black

22 Male 26 180.5 36 183 3 Brown

23 Male 26.5 176 21 176.5 9 Brown

24 Male 24 175 28 172.5 4 Blonde

25 Male 27 173 31 175 3 Black

26 Female 25.5 165.5 20 173.5 3 Blonde

1. Let’s assume that the hair color factor for the class was collected from sixty randomly selected students taking Elements of Statistics from FHSU, even though we know the data was not truly randomly collected but instead comes from a convenience sample of this semester’s classes.

27 Female 24 160 47 182.5 2 Black

28 Female 24 161.5 24 160 7 Blonde

29 Female 25 174 22 169 3 Brown

30 Female 24.5 165 29 167.5 5 Brown

a. First, determine from this sample data the proportion (percentage) of students who have brown hair.
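As a rough illustration of how a sample proportion is computed, the Python sketch below counts one category and divides by the number of observations. It uses only the hair colors of individuals 1 through 10 from the class data, so its result is not the answer for the full sample; the helper name `sample_proportion` is an assumption made for this example.

```python
# Hair colors of individuals 1-10 from the class data (illustration only).
hair_colors = ["Blonde", "Black", "Brown", "Brown", "Brown",
               "Black", "Brown", "Brown", "Brown", "Black"]

def sample_proportion(values, category):
    """Return the fraction of entries in `values` equal to `category`."""
    return sum(1 for v in values if v == category) / len(values)

p_hat = sample_proportion(hair_colors, "Brown")
print(f"{p_hat:.2f} ({p_hat:.0%})")  # 6 of these 10 rows are Brown
```

In the Excel worksheet the same calculation could be done by dividing a COUNTIF over the Hair Color column by the number of rows.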

31 Male 27.5 178 48 178 4 Brown

32 Female 20 164 43 161.5 4 Blonde

33 Female 20.5 153 9 151.5 5 Brown

34 Male 24 183 22 188 1 Brown

35 Female 21.5 162.5 31 158 3 Blonde

36 Male 32 178 37 165 3 Brown

b. Explain why it would be wrong to use the proportion calculated above (even if the data came from a random sample) as the proportion value for ALL FHSU Elements of Statistics students who have brown hair color.

37 Female 25 178 34 173.5 4 Brown

38 Female 25 168.5 23 168 4 Brown

39 Female 23.5 162.5 35 161.5 5 Red

40 Male 27.5 185.5 31 181 4 Brown

41 Male 27.5 176.5 29 170.5 4 Brown

42 Male 27 188 37 188 3 Black

43 Female 23 167 20 154 4 Red

c. Next, based upon a 90% level of confidence, determine the margin of error in estimating the true population proportion of FHSU Elements of Statistics students with brown hair from this sample data. Give this margin of error value below.
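The margin of error for a proportion is usually taken as the critical z-value times the standard error sqrt(p_hat(1 - p_hat)/n). The sketch below shows that calculation in Python under the usual large-sample normal approximation. The inputs (p_hat = 0.50, n = 100) are hypothetical and are not the class results, and the function name is an assumption for this example.

```python
from math import sqrt
from statistics import NormalDist

def proportion_margin_of_error(p_hat, n, confidence):
    """Margin of error z* * sqrt(p_hat * (1 - p_hat) / n) for a sample proportion."""
    # Two-sided critical value: for 90% confidence this is about 1.645.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * sqrt(p_hat * (1 - p_hat) / n)

# Hypothetical inputs, not the class results: p_hat = 0.50 from n = 100 students.
margin = proportion_margin_of_error(p_hat=0.50, n=100, confidence=0.90)
low, high = 0.50 - margin, 0.50 + margin
print(f"margin = {margin:.4f}, interval = ({low:.4f}, {high:.4f})")
```

Adding and subtracting the margin from the sample proportion gives the matching confidence interval; in Excel, NORM.S.INV can supply the critical z-value.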

44 Female 23 155 28 157.5 4 Brown

45 Female 22.5 160.5 27 163 5 Brown

46 Male 27 173 34 173 3 Brown

47 Female 28 177.5 41 175 3 Brown

48 Male 28 180 33 189 3 Blonde

49 Male 26.5 178 32 172.5 2 Red

50 Female 25 171.5 21 168.5 3 Blonde

51 Female 24 168 14 168 4 Blonde

52 Female 23 164.5 23 166 4 Brown

53 Female 25 170.5 25 174 2 Blonde

54 Female 23 168 22 160 4 Brown

55 Female 22 160.5 32 162.5 6 Brown

56 Female 23.5 165 30 157 5 Brown

57 Female 21 157 21 158 3 Brown

d. Next, use your margin of error value to produce the 90% confidence interval for the population proportion as predicted from the sample results. State this interval below within an interpretive sentence tied to the given context.

58 Female 24 161.5 24 160 7 Blonde

59 Female 23.5 157.5 22 160 2 Brown

60 Male 28 173 39 160 6 Brown

2. Clearly explain the intent/meaning of the “90%” number in the confidence interval problem above.

3. As above, let’s assume that the sample data for the height variable was collected from randomly selected students taking Elements of Statistics from FHSU.

a. Determine the mean, median, standard deviation, and variance of the “height” data.
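For comparison with the spreadsheet work, here is a small Python sketch of the four summary statistics. It uses only the heights of individuals 1 through 10 from the class data, so the values it prints are not the answers for the full sample. The variable names are assumptions for this example.

```python
import statistics

# Heights (cm) of individuals 1-10 from the class data (illustration only).
heights = [162.5, 175.5, 160, 175, 158, 163, 186, 165, 155, 165]

mean_height = statistics.mean(heights)
median_height = statistics.median(heights)
sd_height = statistics.stdev(heights)      # sample standard deviation (n - 1 divisor)
var_height = statistics.variance(heights)  # sample variance (n - 1 divisor)

print(mean_height, median_height, round(sd_height, 2), round(var_height, 2))
```

Because the data are a sample, the n - 1 versions are used here; the corresponding Excel functions are AVERAGE, MEDIAN, STDEV.S, and VAR.S.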

b. Determine the margin of error in estimating the population mean height from this sample’s mean based upon a 95% level of confidence. (You may assume that previous research has demonstrated that the population standard deviation of height measures for similar groups of people is 10.2 cm and that the distribution of height measures is roughly normal.)
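When the population standard deviation is treated as known, the margin of error for a mean is the critical z-value times sigma / sqrt(n). The Python sketch below shows the formula; sigma = 10.2 cm comes from the problem, while the sample size of 100 and the sample mean of 170 cm are hypothetical stand-ins, and the function name is an assumption for this example.

```python
from math import sqrt
from statistics import NormalDist

def mean_margin_of_error_known_sigma(sigma, n, confidence):
    """Margin of error z* * sigma / sqrt(n) when the population sd is known."""
    # Two-sided critical value: for 95% confidence this is about 1.96.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * sigma / sqrt(n)

# sigma = 10.2 cm comes from the problem; n = 100 and x_bar = 170 cm are hypothetical.
margin = mean_margin_of_error_known_sigma(sigma=10.2, n=100, confidence=0.95)
x_bar = 170.0
print(f"margin = {margin:.2f} cm, interval = ({x_bar - margin:.2f}, {x_bar + margin:.2f})")
```

Excel's CONFIDENCE.NORM(alpha, standard_dev, size) returns the same kind of margin, with alpha = 0.05 for 95% confidence.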

c. Using your margin of error value from (3b), give the 95% confidence interval for the population mean height measure as predicted from this sample’s results. State this interval below within an interpretive sentence.

d. Although similar, what makes the situation in problem #3 (and hence some of the resulting statistical formulas used) different from the situation in problem #1 above?

4. Notice (from the margin of error determined in 3b) that we used a larger sample than needed to be sufficiently confident that the sample mean would estimate the population mean height within 4 cm; that is, our margin of error of about 2.58 cm was much smaller than 4 cm. Therefore, determine the smaller sample size required to predict the mean height of all FHSU stats students at a 95% confidence level with an error of no more than 4 cm. (Again assume sigma = 10.2 cm.)
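Solving the margin-of-error formula E = z* sigma / sqrt(n) for n gives n >= (z* sigma / E)^2, rounded up to the next whole number. The Python sketch below applies that rearrangement; sigma = 10.2 cm comes from the problem, but the 2 cm target is a hypothetical example rather than the 4 cm asked for here, and the function name is an assumption.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(sigma, max_error, confidence):
    """Smallest n with z* * sigma / sqrt(n) <= max_error, i.e. n >= (z* * sigma / E)^2."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z_star * sigma / max_error) ** 2)  # always round up, never down

# sigma = 10.2 cm comes from the problem; the 2 cm target is a hypothetical example.
n_needed = required_sample_size(sigma=10.2, max_error=2.0, confidence=0.95)
print(n_needed)
```

Rounding up matters: rounding down would leave the margin of error slightly larger than the target.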

Does it seem reasonable that collecting a random sample of the size you determined can produce the desired accuracy? Are there any issues with going to such a small sample size? Explain your position.

5. Lastly, although the procedure in #3 above was convenient, in most cases when we are trying to predict the population mean from a sample mean, we will not know the population standard deviation, sigma. Thus, our next best statistic to use is the sample standard deviation in its place. However, when using the sample standard deviation, we can no longer use the normal distribution and must use the t-distribution instead.

a. Determine the margin of error in estimating the population mean height from this one sample’s mean based upon a 95% level of confidence. Do NOT use our earlier assumption of sigma = 10.2 cm; instead, use the sample standard deviation value you calculated in 3a, along with the appropriate t-score in place of a z-score.
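With the sample standard deviation s in place of sigma, the margin of error becomes t* s / sqrt(n), where t* comes from the t-distribution with n - 1 degrees of freedom. The Python sketch below takes t* as an input so that it needs only the standard library. The inputs (s = 9.5 cm, n = 25, t* = 2.064) are hypothetical and are not the class results, and the function name is an assumption for this example.

```python
from math import sqrt

def mean_margin_of_error_t(sample_sd, n, t_star):
    """Margin of error t* * s / sqrt(n), with t* looked up for n - 1 degrees of freedom."""
    return t_star * sample_sd / sqrt(n)

# Hypothetical inputs: s = 9.5 cm from n = 25 students, so df = 24.
# t* = 2.064 is the usual table value for 95% confidence with 24 df
# (scipy.stats.t.ppf(0.975, 24) gives the same value if SciPy is available).
margin = mean_margin_of_error_t(sample_sd=9.5, n=25, t_star=2.064)
print(f"margin = {margin:.2f} cm")
```

In Excel, T.INV.2T(0.05, n - 1) supplies the critical t-value for 95% confidence, and CONFIDENCE.T(alpha, standard_dev, size) returns the margin directly.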

b. Using your margin of error value from (5a), give the 95% confidence interval for the population mean height measure as predicted from this one sample’s results. State this interval below within an interpretive sentence.