MS 150 Statistics Summer 2001 FX

Part I

Use this female adult literacy data to answer the following questions in part I. The female adult literacy rate is the percentage of adult women who can read.

Location Adult literacy rate (female)
American Samoa 97
CNMI 96
Fiji 89
French Polynesia 98
FSM 88
Guam 99
Kiribati 90
Marshall Islands 88
Nauru 99
Palau 90
Samoa 97
Tonga 99
Tuvalu 45
Vanuatu 48
Wallis Futuna 50
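As a cross-check for the descriptive questions that follow, here is a minimal Python sketch that computes the summary statistics and the binned frequencies from the table above. The variable names and the helper `bin_frequencies` are illustrative assumptions, not part of the course materials; the course itself uses the Excel functions listed in the formula table.

```python
import statistics

# Female adult literacy rates (percent), copied from the Part I table above.
literacy = [97, 96, 89, 98, 88, 99, 90, 88, 99, 90, 97, 99, 45, 48, 50]

n = len(literacy)                           # sample size, like =COUNT(data)
mean = statistics.mean(literacy)            # sample mean, like =AVERAGE(data)
median = statistics.median(literacy)        # like =MEDIAN(data)
mode = statistics.mode(literacy)            # like =MODE(data)
data_range = max(literacy) - min(literacy)  # =MAX(data) - =MIN(data)
sx = statistics.stdev(literacy)             # sample standard deviation, like =STDEV(data)
cv = 100 * sx / mean                        # coefficient of variation, in percent


def bin_frequencies(data, upper_limits):
    """Count values per class; each class includes its own upper limit.

    Hypothetical helper: any value at or below the first upper limit is
    counted in the first class, as question 8 requires.
    """
    counts = [0] * len(upper_limits)
    for value in data:
        for i, limit in enumerate(upper_limits):
            if value <= limit:
                counts[i] += 1
                break
    return counts


bins = [70, 75, 80, 85, 90, 95, 100]
freq = bin_frequencies(literacy, bins)
rel_freq = [f / n for f in freq]

print(f"n={n} mean={mean:.2f} median={median} mode={mode} "
      f"range={data_range} sx={sx:.2f} CV={cv:.1f}%")
for limit, f, rf in zip(bins, freq, rel_freq):
    print(f"{limit:>4} {f:>3} {rf:.3f}")
print("Sums:", sum(freq), round(sum(rel_freq), 3))
```

The relative frequency of a class can be read as the probability asked for in question 11, under the assumption that the sample frequencies stand in for probabilities.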
  1. Determine the sample size n.

  2. Calculate the sample mean x̄.

  3. Determine the median.

  4. Determine the mode.

  5. Calculate the range.

  6. Calculate the sample standard deviation sx.

  7. Calculate the Coefficient of Variation.

  8. Using the intervals specified in the bins column, fill in the frequency and relative frequency columns of the following table. The number in the bins column is the class upper limit.  Include the class upper limit in each interval. Include any data below 70 in the first interval.

    Bins Frequency Relative Frequency f/n
    70 _________ _________
    75 _________ _________
    80 _________ _________
    85 _________ _________
    90 _________ _________
    95 _________ _________
    100 _________ _________
    Sums: _________ _________


  9. Draw a histogram of the Relative Frequency data using the following chart:
    [Figure: blank grid for the histogram]
  10. What, if any, is the shape of the distribution?

  11. What is the probability that the female adult literacy rate will be greater than 85 but less than or equal to 90?

  12. Construct a 95% confidence interval for the population mean μ of the female adult literacy rate using the above data. Presume for this example that the data distribution is sufficiently normal. Note that n is less than 30. Use the sample mean x̄ (question #2 above), the sample standard deviation sx (question #6 above), and tc to generate your error tolerance E (maximal error of estimate).






    1. Degrees of freedom = __________

    2. Error tolerance E (maximal error of estimate) = _______________

    3. The 95% confidence interval for μ is ____________ < μ < ____________

  13. In the following exercise use your sample mean and sample standard deviation as population parameters. That is, use the sample mean x̄ you calculated in question #2 above for the population mean μ, and use the sample standard deviation sx you calculated in question #6 above for the population standard deviation σ.

    Let the null hypothesis H0 be that the average female adult literacy rate is μ as calculated by you in question number two above. Suppose in the year 2000 the actual average female adult literacy rate for these same countries is x̄ = 93. At a significance level α of 0.05, is this rate statistically significantly higher than the present rate?

    Note that n is less than 30.  

    1. μ = _____________ (from question #2 above)

    2. x̄ = 93

    3. sx = ____________ (from question #6 above)

    4. n = ____________

    5. What is H0?

    6. What is H1?

    7. What is α?

    8. Is this a one-tail or two-tail test?

    9. How many degrees of freedom?

    10. Calculate the t-statistic.

    11. Calculate t-critical tc.

    12. Make a sketch of the Student's t-distribution curve including the critical value for t, the critical area, and the t-statistic. Use the back if necessary to make a clean, clear, concise, legible, and accurate sketch.


    13. Do we reject H0 or do we fail to reject H0?

    14. Is the change in the female literacy rate significant?
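The confidence interval in question 12 and the hypothesis test in question 13 both rest on the standard error sx/√n and a critical t value. The following Python sketch works through them under stated assumptions: the two critical values are read from a standard Student's t table for 14 degrees of freedom rather than computed, and the test is treated as a one-tail (right-tail) test because the question asks about an increase. The variable names are illustrative only.

```python
import math
import statistics

# Female adult literacy rates (percent) from the Part I table.
literacy = [97, 96, 89, 98, 88, 99, 90, 88, 99, 90, 97, 99, 45, 48, 50]

n = len(literacy)
xbar = statistics.mean(literacy)
sx = statistics.stdev(literacy)
df = n - 1                       # degrees of freedom
se = sx / math.sqrt(n)           # standard error of the mean

# Assumption: t table values for df = 14, comparable to
# =TINV(0.05, 14) and =TINV(2*0.05, 14) in Excel.
T_CRIT_TWO_TAIL = 2.145
T_CRIT_ONE_TAIL = 1.761

# Question 12: xbar - E < mu < xbar + E
E = T_CRIT_TWO_TAIL * se
ci_low, ci_high = xbar - E, xbar + E

# Question 13: mu is taken from question 2, the new mean is 93.
mu = xbar
new_mean = 93
t_stat = (new_mean - mu) / se
reject_h0 = t_stat > T_CRIT_ONE_TAIL   # right-tail test at alpha = 0.05

print(f"df={df}  E={E:.2f}  CI=({ci_low:.2f}, {ci_high:.2f})")
print(f"t={t_stat:.3f}  tc={T_CRIT_ONE_TAIL}  reject H0: {reject_h0}")
```

Hard-coding the table values keeps the sketch within the standard library; a statistics package with a t-distribution would let the critical values be computed instead.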


Part II

For this part use the following data. The adult female literacy rate is the percentage of adult women in that country who can read. The infant mortality rate is the number of babies per one thousand live births who die before they reach one year of age. So, for example, in the FSM 88% of the adult women in the country can read, and of 1000 babies born in the FSM in the year 2000, 33 will die before they reach their first birthday (the infant mortality estimates are for the year 2000). Note that the table continues across the page break.

Location Adult literacy rate (female) Infant mortality per 1000 live births
American Samoa 97 11
CNMI 99 6
Fiji 89 17
French Polynesia 98 9
FSM 88 33
Guam 99 7
Kiribati 90 55
Marshall Islands 88 41
Nauru 99 11
Palau 90 17
Samoa 97 33
Tonga 99 14
Tuvalu 45 40
Vanuatu 48 63
Wallis Futuna 50 25
  1. Determine the slope of the least squares line (best fit) for the adult female literacy rate and infant mortality data above.


  2. What does the sign of the slope tell us about this data? That is, what type of correlation is this, positive or negative?

  3. Determine the y-intercept of the least squares line for the adult female literacy rate and infant mortality data above.

  4. Write the equation of the least squares line for the adult female literacy rate and infant mortality data above.


  5. Calculate the linear correlation coefficient r for the adult female literacy and infant mortality columns.

  6. Is the correlation none, low, moderate, high, or perfect?

  7. Calculate the coefficient of determination.

  8. What does the coefficient of determination tell us about the relationship between adult female literacy rate and infant mortality rates?



  9. Is there a relationship between adult female literacy (the ability of women to read) and infant mortality?


  10. Explain what policies or programs a country might want to promote to reduce infant mortality rates based on our data.  Use the back as necessary.
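The slope, intercept, and correlation asked for in Part II can be sketched in Python from the sums of squares and cross-products. This is an illustrative sketch, assuming literacy is the x variable and infant mortality is the y variable; the function name `least_squares` is hypothetical, and the course's own method is the Excel functions =SLOPE, =INTERCEPT, and =CORREL.

```python
import math

# Columns copied from the Part II table, in table order.
literacy = [97, 99, 89, 98, 88, 99, 90, 88, 99, 90, 97, 99, 45, 48, 50]
mortality = [11, 6, 17, 9, 33, 7, 55, 41, 11, 17, 33, 14, 40, 63, 25]


def least_squares(x, y):
    """Return (slope, intercept, r) for the least squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


slope, intercept, r = least_squares(literacy, mortality)
r_squared = r ** 2   # coefficient of determination

print(f"y = {intercept:.2f} + ({slope:.3f}) x")
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
```

The sign of `slope` always matches the sign of `r`, since both share the numerator `sxy`.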

 

Statistic or Parameter Symbol Equations Excel
Square root =SQRT(number)
Sample size n =COUNT(data)
Minimum =MIN(data)
Maximum =MAX(data)
Median =MEDIAN(data)
Mode =MODE(data)
Sample mean x̄ Σx/n =AVERAGE(data)
Population mean μ ΣX/N or Σ x P(x) or n p (binomial) =AVERAGE(data)
Sample standard deviation sx √(Σ(x - x̄)²/(n - 1)) =STDEV(data)
Population standard deviation σ √(Σ(x - μ)² P(x)) or √(n p q) (binomial) =STDEVP(data)
Sample variance (sx)² =VAR(data)
Population variance σ² =VARP(data)
Sample Coefficient of Variation CV 100(sx/x̄) =100*STDEV(data)/AVERAGE(data)
Slope b =SLOPE(y data, x data)
Intercept a =INTERCEPT(y data, x data)
Correlation r =CORREL(y data, x data)
Coefficient of Determination =(CORREL(y data, x data))^2
Binomial probability nCr p^r q^(n-r) =COMBIN(n,r)*p^r*q^(n-r)
Calculate a z value from an x z = (x - μ)/σ =STANDARDIZE(x, μ, σ)
Calculate an x value from a z x = σ z + μ =σ*z+μ
Calculate a z-statistic from an x̄ value given μ and σ z = (x̄ - μ)/(σ/√n) =STANDARDIZE(x̄, μ, σ/SQRT(n))
Calculate a t-statistic or t-ratio or tdata t = (x̄ - μ)/(sx/√n) =STANDARDIZE(x̄, μ, sx/SQRT(n))
Find a probability p from a z value =NORMSDIST(z)
Find a z value from a probability p =NORMSINV(p)
Standard error of the population mean SE σ/√n
Standard error of the sample mean SE sx/√n
Determining z critical from α for confidence intervals. zc =NORMSINV(1-α/2)
Error tolerance E of a mean for n ≥ 30 using σ E = zc σ/√n =CONFIDENCE(α,σ,n)
Error tolerance E of a mean for n ≥ 30 using sx E = zc sx/√n =CONFIDENCE(α,sx,n)
Error tolerance E of a mean for n < 30. Can also be used for n ≥ 30. E = tc sx/√n [no Excel function; determine tc and then multiply by the standard error of the mean as shown in the equation]
Determining tc from α and the degrees of freedom df for a confidence interval. tc =TINV(α,df)
Calculate an x̄ value from a tc, sx, n, and μ x̄ = tc sx/√n + μ
Calculate a confidence interval for a population mean μ from a sample mean x̄ and an error tolerance E x̄ - E < μ < x̄ + E
Determining zc from α for a TWO-tail hypothesis test. =NORMSINV(α/2) [returns only the negative value for zc]
Determining zc from α for a ONE-tail hypothesis test. =NORMSINV(α) [returns only the negative value for zc]
Determining tc from α and degrees of freedom df for a TWO-tail hypothesis test. =TINV(α, df) [returns only the positive value for tc]
Determining tc from α and degrees of freedom df for a ONE-tail hypothesis test. =TINV(2*α, df) [returns only the positive value for tc]
Determining the p-value from a z-statistic, ONE tail =1-NORMSDIST(ABS(z))
Determining the p-value from a z-statistic, TWO tail =2*(1-NORMSDIST(ABS(z)))
Determining the p-value from a t-statistic, ONE tail =TDIST(ABS(t),df,1)
Determining the p-value from a t-statistic, TWO tail =TDIST(ABS(t),df,2)
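For readers working outside Excel, a few rows of the table above can be sketched with the Python standard library. This is an illustrative sketch only: the function names are hypothetical, the example numbers are made up for demonstration, and the t-based rows are left out because the standard library has no Student's t distribution.

```python
import math
from statistics import NormalDist


def binomial_probability(n, r, p):
    """P(exactly r successes in n trials), like =COMBIN(n,r)*p^r*q^(n-r)."""
    q = 1 - p
    return math.comb(n, r) * p**r * q**(n - r)


def standardize(x, mu, sigma):
    """z value for x, like =STANDARDIZE(x, mu, sigma)."""
    return (x - mu) / sigma


def p_value_from_z(z, tails=1):
    """p-value from a z-statistic, like =1-NORMSDIST(ABS(z)), doubled for two tails."""
    upper_tail = 1 - NormalDist().cdf(abs(z))
    return tails * upper_tail


# Made-up example values, not taken from the exam data.
print(binomial_probability(10, 3, 0.5))   # 120/1024
print(standardize(93, 85, 4))             # z = 2.0
print(round(p_value_from_z(1.96, tails=2), 4))
```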

Standard normal distribution information:

[Figure: standard normal curve]

The standard normal Excel functions such as NORMSDIST and NORMSINV use the area to the "left" of z, as shown below:

Standard normal cumulative distribution left to z: Excel functions
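The left-area convention can be illustrated with Python's `statistics.NormalDist`, whose `cdf` and `inv_cdf` methods also work with the area to the left of z. This is a small sketch with illustrative variable names, assuming α = 0.05 as in the exam questions.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

# Area to the left of z = 1, comparable to =NORMSDIST(1).
left_area = std_normal.cdf(1.0)
# The area to the right is whatever remains under the curve.
right_area = 1 - left_area

alpha = 0.05
# Positive critical z for a 95% confidence interval, like =NORMSINV(1-alpha/2).
z_c = std_normal.inv_cdf(1 - alpha / 2)
# Left-tail critical z for a one-tail test, like =NORMSINV(alpha); it is negative.
z_left = std_normal.inv_cdf(alpha)

print(f"left={left_area:.4f} right={right_area:.4f}")
print(f"zc={z_c:.3f} one-tail z={z_left:.3f}")
```

Because the functions return left areas, a right-tail probability always has to be obtained by subtracting from 1.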

Sources:

http://www.odci.gov/cia/publications/factbook/fields/infant_mortality_rate.html
http://www.overpopulation.com/267
http://www.cia.gov/cia/publications/factbook/indexgeo.html
http://www.adb.org/Statistics/Poverty/TUV.asp
http://www.mrdowling.com/800growth.html
http://www.unescap.org/stat/statdata/kiribati.htm
http://www.immigration-usa.com/wfb/wallis_and_futuna_people.html
http://www.overpopulation.com/1507
http://www.library.uu.nl/wesp/populstat/Oceania/naurug.htm
http://www.unicef.org/statis/Country_1Page179.html
http://www.unctad.org/en/docs/ldc99stat_tuv.en.pdf
http://www.abc.net.au/ra/pacific/places/infant_mortality.htm
http://www.adb.org/Statistics/Poverty/KIR.asp
