Part One • Name: ____________________________

A three-person team counted the new orange centerline reflectors from the Sokeh's Island junction to the college in Palikir. Each value below is the number of reflectors in one half kilometer segment of road.

Number of centerline reflectors per half kilometer
84
74
78
61
67
55
54
62
61
83
81
75
70
65
70
76
  1. For the number of centerline reflectors per half kilometer:

    _________ Determine the sample size n.
  2. _________ Calculate the sample mean x̄.
  3. _________ Determine the median.
  4. _________ Determine the mode.
  5. _________ Determine the minimum.
  6. _________ Determine the maximum.
  7. _________ Calculate the range.
  8. _________ Calculate the sample standard deviation sx.
  9. _________ Calculate the sample Coefficient of Variation.
  10. _________ Determine the class width. Use 5 bins (classes or intervals).
  11. Fill in the following table with the class upper limits in the first column, the frequencies in the second column, and the relative frequencies in the third column.
    Bins Frequency Relative Frequency f/n
    _________ _________ _________
    _________ _________ _________
    _________ _________ _________
    _________ _________ _________
    _________ _________ _________
    Sums: _________ _________
  12. Sketch a histogram of the relative frequency data on the back of the paper.
  13. _________ What is the shape of the distribution?
  14. _________ Using the relative frequencies in the table above, what is the probability that a half kilometer of road will have 67 to 72 centerline reflectors?
  15. Construct a 95% confidence interval for the population mean µ centerline reflectors per half kilometer using the above data. Note that n is less than 30. Use the sample mean and sample standard deviation to generate your error tolerance E.
    1. __________ How many degrees of freedom?
    2. __________ What is tc?
    3. The error tolerance E = _______________
    4. The 95% confidence interval for µ is ____________ < µ < ____________
  16. __________ Based on your confidence interval calculations above, if the international road safety standard is µ = 74.6 centerline reflectors per half kilometer, is the Pohnpei average (calculated in question two above) for the 16 segments of road statistically significantly different from the international standard at an alpha of 0.05?
  17. __________ Calculate the t-statistic based on the sample size in question one, the sample mean in question two, the sample standard deviation in question eight, and the above µ = 74.6 population mean.
  18. __________ Calculate the two-tail p-value using the t-statistic above.
  19. __________ Use the p-value in the above question to calculate and report the largest confidence level for which the difference would be significant.
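
The class width and frequency table steps in questions 10 and 11 can be sketched in Python. This is only an illustration under stated assumptions: the data list is hypothetical (not the reflector counts), the class width is taken as range divided by the number of bins, and each class is assumed to include its upper limit.

```python
# Hypothetical sample, NOT the exam data.
data = [2, 4, 4, 4, 5, 5, 7, 9, 10, 12]
bins = 5

low, high = min(data), max(data)
width = (high - low) / bins                  # class width = range / number of bins

# Class upper limits: low + width, low + 2*width, ..., high
upper_limits = [low + width * (i + 1) for i in range(bins)]
upper_limits[-1] = high                      # guard against floating-point drift

# Count each value in the first class whose upper limit is >= the value
# (assumption: a class includes its upper limit).
frequencies = [0] * bins
for x in data:
    for i, limit in enumerate(upper_limits):
        if x <= limit:
            frequencies[i] += 1
            break

relative = [f / len(data) for f in frequencies]   # relative frequency f/n

for limit, f, rf in zip(upper_limits, frequencies, relative):
    print(limit, f, rf)
print("Sums:", sum(frequencies), sum(relative))
```

The frequencies should sum to n and the relative frequencies to 1, which is a useful check on a hand-built table.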

Part Two

The data below is a running cumulative count of the number of centerline reflectors. The first column of the table is the distance from the junction with Sokeh's Island. The second column is the cumulative count of the reflectors. Use this data to form an x-y scatter graph with kilometers on the x-axis and the cumulative count of the centerline reflectors on the y-axis.

Kilometers  Cumulative count of reflectors
0           0
1           158
2           297
3           419
4           534
5           678
6           834
7           969
8           1049
8.7         1115
  1. _________ Calculate the slope of the best fit (least squares) line for the data.
  2. _________ Calculate the y-intercept of the best fit (least squares) line.
  3. _________ Is the correlation positive, negative, or neutral?
  4. _________ Use the equation of the best fit line to calculate the projected number of reflectors needed to put reflectors all the way around the 80 kilometers of the circumferential road. This projects beyond the end of the data, which is appropriate here if you are trying, for example, to determine how many reflectors to buy.
  5. _________ Use the inverse of the best fit line equation to calculate the number of kilometers that 5000 reflectors could be expected to cover. Once again, work beyond the end of the data.
  6. _________ Calculate the linear correlation coefficient r for the data.
  7. _________ Is the correlation none, low, moderate, high, or perfect?
  8. _________ Calculate the coefficient of determination.
  9. _________ What percent of the variation in the reflector data is explained by the variation in the distance data?
  10. _________ Is there a relationship between the kilometers and the cumulative count of reflectors?
Basic Statistics
Statistic or Parameter Symbol Equations Excel
Square root     =SQRT(number)
Sample size n   =COUNT(data)
Sample mean x̄ Σx/n =AVERAGE(data)
Sample standard deviation sx or s   =STDEV(data)
Sample Coefficient of Variation CV 100(sx/x̄) =100*STDEV(data)/AVERAGE(data)
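
For readers working outside Excel, the basic statistics above can be sketched with Python's standard `statistics` module. The data list is a hypothetical sample, not the exam data, and the comments map each line to the Excel function in the table.

```python
import math
import statistics

# Hypothetical sample, NOT the exam data.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)                    # =COUNT(data)
mean = statistics.mean(data)     # =AVERAGE(data)
sx = statistics.stdev(data)      # =STDEV(data), sample (n - 1) standard deviation
cv = 100 * sx / mean             # =100*STDEV(data)/AVERAGE(data)

print(n, mean, round(sx, 3), round(cv, 2))
```

Note that `statistics.stdev` divides by n - 1, matching STDEV; `statistics.pstdev` is the population version and would give a different value.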
Linear Regression Statistics
Statistic or Parameter Symbol Equations Excel
Slope b   =SLOPE(y data, x data)
Intercept a   =INTERCEPT(y data, x data)
Correlation r   =CORREL(y data, x data)
Coefficient of Determination r²   =(CORREL(y data, x data))^2
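
The regression statistics above, along with projecting forward and inverting the best fit line, can be sketched in plain Python. The x and y values and the helper names `predict_y` and `predict_x` are hypothetical, chosen only for illustration; the formulas are the usual least squares ones.

```python
import math

# Hypothetical (x, y) data, NOT the exam data.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 10]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Sums of squares and cross products about the means
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

slope = sxy / sxx                      # =SLOPE(y data, x data)
intercept = mean_y - slope * mean_x    # =INTERCEPT(y data, x data)
r = sxy / math.sqrt(sxx * syy)         # =CORREL(y data, x data)
r_squared = r ** 2                     # =(CORREL(y data, x data))^2

def predict_y(x):
    """Projected y for a given x using y = slope*x + intercept."""
    return slope * x + intercept

def predict_x(y):
    """Inverse of the best fit line: the x expected to produce a given y."""
    return (y - intercept) / slope

print(slope, intercept, r, r_squared)
print(predict_y(10), predict_x(predict_y(10)))
```

`predict_y` and `predict_x` are inverses of each other, so feeding the output of one into the other returns the starting value.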
Normal Statistics
Statistic or Parameter Symbol Equations Excel
Calculate a z value from an x z z = (x - µ)/σ =STANDARDIZE(x, µ, σ)
Calculate an x value from a z x x = σ z + µ =σ*z+µ
Calculate a t-statistic (t-stat) t t = (x̄ - µ)/(sx/√n) =(x̄ - µ)/(sx/SQRT(n))
Calculate an x̄ from a z x̄ x̄ = µ + zc·sx/√n =µ + zc*sx/SQRT(n)
Find a probability p from a z value     =NORMSDIST(z)
Find a z value from a probability p     =NORMSINV(p)
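
As a rough Python counterpart to the normal statistics above, the standard library's `statistics.NormalDist` offers a cumulative probability and its inverse. The numbers (x = 80, µ = 70, σ = 5) and the helper name `standardize` are hypothetical, used only to show the round trip.

```python
from statistics import NormalDist

std_normal = NormalDist()            # standard normal: mean 0, standard deviation 1

def standardize(x, mu, sigma):
    """z = (x - mu)/sigma, the same calculation as =STANDARDIZE(x, mu, sigma)."""
    return (x - mu) / sigma

# Hypothetical values, NOT from the exam.
mu, sigma = 70, 5
z = standardize(80, mu, sigma)       # z value from an x
x_back = sigma * z + mu              # x value from a z

p = std_normal.cdf(z)                # area to the left of z, like =NORMSDIST(z)
z_back = std_normal.inv_cdf(p)       # z from a left-tail area, like =NORMSINV(p)

print(z, x_back, round(p, 4), round(z_back, 4))
```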
Confidence interval statistics
Degrees of freedom df = n-1 =COUNT(data)-1
Find a zc value from a confidence level c zc   =ABS(NORMSINV((1-c)/2))
Find a tc value from a confidence level c tc   =TINV(1-c,df)
Calculate an error tolerance E of a mean for n >= 30 using sx E E = zc·sx/√n =zc*sx/SQRT(n)
Calculate an error tolerance E of a mean for n < 30 using sx. Can also be used for n >= 30. E E = tc·sx/√n =tc*sx/SQRT(n)
Calculate a confidence interval for a population mean µ from a sample mean x̄ and an error tolerance E   x̄ - E <= µ <= x̄ + E  
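
The confidence interval steps above can be sketched in Python as follows. The sample is hypothetical (not the exam data). Python's standard library has no t-distribution, so tc is entered by hand here as an assumption: 2.365 is the usual table value for 95% confidence with 7 degrees of freedom, which is what =TINV(0.05,7) should return. The helper name `is_significantly_different` is also hypothetical.

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical sample, NOT the exam data.
data = [2, 4, 4, 4, 5, 5, 7, 9]
c = 0.95                                   # confidence level

n = len(data)
df = n - 1                                 # =COUNT(data)-1
xbar = statistics.mean(data)
sx = statistics.stdev(data)

# zc from the confidence level, like =ABS(NORMSINV((1-c)/2)); shown for comparison.
zc = abs(NormalDist().inv_cdf((1 - c) / 2))

# tc looked up from a t-table for c = 0.95 and df = 7 (assumed value, see lead-in).
tc = 2.365

E = tc * sx / math.sqrt(n)                 # error tolerance for n < 30
lower, upper = xbar - E, xbar + E          # xbar - E <= mu <= xbar + E

def is_significantly_different(mu):
    """True when a hypothesized mu falls outside the confidence interval."""
    return not (lower <= mu <= upper)

print(df, round(zc, 3), round(E, 3), round(lower, 3), round(upper, 3))
print(is_significantly_different(6.5), is_significantly_different(7.5))
```

For small samples tc is larger than zc, so the t-based interval is wider than the z-based one would be.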
Hypothesis Testing
Calculate t-critical for a two-tailed test tc   =TINV(α,df)
Calculate a p-value from a t-statistic p   =TDIST(ABS(tstat),df,#tails)
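
Outside Excel, a two-tailed p-value can be approximated without any extra libraries. The sketch below is one possible approach, not the method behind TDIST: it integrates the Student's t density numerically with Simpson's rule. The function names and the summary values (x̄ = 5, µ = 6.5, sx = 2.138, n = 8) are hypothetical.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_tail_p(tstat, df, steps=2000):
    """Approximate two-tailed p-value; steps must be even for Simpson's rule."""
    b = abs(tstat)
    if b == 0:
        return 1.0
    h = b / steps
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3         # area under the density from 0 to |t|
    return 1 - 2 * area_zero_to_t          # what is left over in both tails

def t_statistic(xbar, mu, sx, n):
    """t = (xbar - mu)/(sx/sqrt(n))"""
    return (xbar - mu) / (sx / math.sqrt(n))

# Hypothetical summary values, NOT the exam data.
t = t_statistic(5, 6.5, 2.138, 8)
p = two_tail_p(t, 8 - 1)
max_confidence = 1 - p                     # largest confidence level still significant

print(round(t, 3), round(p, 4), round(max_confidence, 4))
```

A quick sanity check is that a t-statistic equal to the table value of tc should give a p-value close to the matching alpha, for example about 0.05 for t = 2.365 with 7 degrees of freedom.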