Steps • Name:
Data
Student | Steps |
Gabriel | 7095 |
Layoleen | 6593 |
MJ Marcus | 4151 |
Sabias | 6793 |
Alex | 5036 |
Elsihmer | 5404 |
Misty | 3922 |
Mairalinda | 4361 |
Ernest Junior | 5091 |
Lyllone | 3042 |
AllyJean | 4236 |
Felix Junior | 1424 |
Mark | 8966 |
Mersein | 7402 |
Mayoleen | 7618 |
Part I: Basic Statistics
During this term many of you have had a chance to wear a pedometer and to count your steps.
Focusing the final examination on statistics from pedometers seems like a natural way to end this term.
The data in the table comes from pedometers worn by second grade students at Rural Elementary School (RES).
The data is the daily average steps for each of the students.
Data sheet
- _________ What level of measurement is the data?
- _________ Determine the sample size n.
- _________ Determine the minimum.
- _________ Determine the maximum.
- _________ Calculate the range.
- _________ Calculate the midrange.
- _________ Determine the mode.
- _________ Determine the median.
- _________ Calculate the sample mean x̄.
- _________ Calculate the sample standard deviation sx.
- _________ Calculate the sample Coefficient of Variation.
- _________ Determine the bin width. Use five classes (bins or intervals).
- Fill in the following table with the class upper limits in the first column,
the frequencies in the second column, and the relative frequencies in the third column.
Bins (x) | Frequency f | RF p(x) |
| | |
| | |
| | |
| | |
| | |
Sums: | | |
- Sketch a histogram of the relative frequency data.
- __________________ What is the shape of the distribution?
- __________________ Use the sample mean x̄ and sample standard deviation sx above
to calculate the z-score for first grader Shana who averages 8840 steps per day.
- _________ Is the z-score for Shana an ordinary or extraordinary value?
- __________________ Use the sample mean x̄ and sample standard deviation sx above
to calculate the z-score for second grader Marlin who averages 11811 steps per day.
- _________ Is the z-score for Marlin an ordinary or extraordinary value?
- _________ Calculate the standard error of the sample mean x̄ for the number of steps.
- _________ Find t-critical for a confidence level c of 95% for the number of steps.
- _________ Determine the margin of error E for the sample mean x̄.
- Write out the 95% confidence interval for the population mean μ number of steps:
p(_____________ < μ < ___________) = 0.95
- _________ The population mean number of steps μ for the
Urban Elementary School (UES) second graders is 6246 steps per day.
Is this a possible population mean for the
Rural Elementary School (RES) students?
- _________ Are the UES students statistically significantly more
active than the RES students as measured by steps per day?
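The Part I calculations can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not the course's required method (the course uses a spreadsheet). The names `steps`, `z_score`, and `T_CRITICAL` are hypothetical, and the value 2.145 is an assumed table value standing in for =TINV(0.05;14). The data have no repeated value, so the mode is left out.

```python
import math
import statistics

# Daily average steps for the 15 RES second graders, copied from the data table.
steps = [7095, 6593, 4151, 6793, 5036, 5404, 3922, 4361,
         5091, 3042, 4236, 1424, 8966, 7402, 7618]

n = len(steps)
lo, hi = min(steps), max(steps)
data_range = hi - lo
midrange = (lo + hi) / 2
median = statistics.median(steps)
mean = statistics.mean(steps)
sx = statistics.stdev(steps)   # sample standard deviation (divides by n - 1)
cv = sx / mean                 # sample coefficient of variation

# Five equal-width classes, each labelled by its class upper limit.
# The last limit is set to the maximum to avoid floating-point drift.
width = data_range / 5
upper_limits = [lo + width * k for k in range(1, 5)] + [hi]
freq = []
lower = float("-inf")
for upper in upper_limits:
    freq.append(sum(1 for x in steps if lower < x <= upper))
    lower = upper
rel_freq = [f / n for f in freq]


def z_score(x, mean, sd):
    """Distance of x from the mean, in units of standard deviations."""
    return (x - mean) / sd


# Assumed table value, equivalent to =TINV(0.05;14) for n - 1 = 14.
T_CRITICAL = 2.145
se = sx / math.sqrt(n)         # standard error of the sample mean
E = T_CRITICAL * se            # margin of error
ci = (mean - E, mean + E)      # 95% confidence interval for mu
```

A value such as 6246 is a possible population mean when it lies between `ci[0]` and `ci[1]`. Whether a z-score counts as extraordinary depends on the cutoff used in the course; an absolute value above 2 is a common convention and is assumed here.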
Part II: Hypothesis Testing using the t-test
A study was conducted to determine differences in the number of steps
between the Rural Elementary School (RES)
students and the Urban Elementary School (UES) students.
The data in the table is the average daily steps for the students.
Steps
RES students | Steps (x) | UES students | Steps (y) |
Gabriel | 7095 | Frankie | 16269 |
Layoleen | 6593 | Shawna | 3712 |
MJ Marcus | 4151 | Alan | 9713 |
Sabias | 6793 | Danielle | 6293 |
Alex | 5036 | Kevin | 9529 |
Elsihmer | 5404 | Rejazy | 3747 |
Misty | 3922 | Alexandra | 1019 |
Mairalinda | 4361 | Marlin | 9669 |
Ernest Junior | 5091 | Anjelu | 2914 |
Lyllone | 3042 | AJ | 3798 |
AllyJean | 4236 | Arianne | 1823 |
Felix Junior | 1424 | Burt | 4482 |
Mark | 8966 | Aiyumi | 7095 |
Mersein | 7402 | Trumaine | 4071 |
Mayoleen | 7618 | Vincent | 7616 |
| | Faith | 4123 |
| | Hart | 10068 |
| | Kelsey | 3309 |
| | Hagadauroma | 8828 |
| | Beatrice | 6840 |
- _________ Calculate the sample mean x̄
average daily steps for Rural Elementary School (RES).
- _________ Calculate the sample mean ȳ
average daily steps for Urban Elementary School (UES).
- _________ Are the sample means for the two samples mathematically different?
- _________ Calculate the degrees of freedom using the smaller of the two sample sizes.
- _________ Calculate t-critical using alpha α = 0.05 and the degrees of freedom.
- _________ Calculate the standard error SE for two independent samples.
- _________ Calculate the margin of error E for two independent samples.
- Determine the 95% confidence interval for the population mean difference μd:
____________ < μd < _____________
- _________ Does the confidence interval include a mean difference of zero?
- __________________ What is the p-value? Use the difference of means for independent samples TTEST function
=TTEST(data_range_x;data_range_y;2;3) to determine the p-value for this two sample data.
- __________________ Is the difference in the mean daily steps statistically significant
at a risk of a type I error alpha α = 0.05?
- __________________ Would we "fail to reject" or "reject" a null hypothesis of no difference
in the steps per day between the two schools?
- __________________ What is the maximum level of confidence we can have that the
difference is statistically significant?
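The two-sample steps in Part II can be sketched in Python as follows. This is a rough illustration, not the course's prescribed procedure. It assumes the standard error for independent samples is sqrt(sx²/nx + sy²/ny), takes the degrees of freedom from the smaller sample as the question describes, and uses 2.145 as an assumed table value for =TINV(0.05;14). The function name `two_sample_interval` is hypothetical. The p-value itself is left to the spreadsheet TTEST function.

```python
import math
import statistics

# Average daily steps, copied from the Part II table.
res = [7095, 6593, 4151, 6793, 5036, 5404, 3922, 4361,
       5091, 3042, 4236, 1424, 8966, 7402, 7618]
ues = [16269, 3712, 9713, 6293, 9529, 3747, 1019, 9669, 2914, 3798,
       1823, 4482, 7095, 4071, 7616, 4123, 10068, 3309, 8828, 6840]


def two_sample_interval(x, y, t_critical):
    """Return (difference of means, standard error, margin of error)."""
    diff = statistics.mean(x) - statistics.mean(y)
    # Standard error for two independent samples with unequal variances.
    se = math.sqrt(statistics.variance(x) / len(x)
                   + statistics.variance(y) / len(y))
    return diff, se, t_critical * se


# Degrees of freedom from the smaller of the two sample sizes.
df = min(len(res), len(ues)) - 1

# Assumed table value, equivalent to =TINV(0.05;14).
T_CRITICAL = 2.145

diff, se, E = two_sample_interval(res, ues, T_CRITICAL)
interval = (diff - E, diff + E)   # 95% confidence interval for the difference

# If zero lies inside the interval, the data do not show a
# statistically significant difference at alpha = 0.05.
includes_zero = interval[0] < 0 < interval[1]
```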
Part III: Linear Regression (best fit or least squares line)
Cadence versus speed
Cadence (steps/min) | Speed (cm/s) |
148.0 | 234 |
150.0 | 250 |
151.3 | 239 |
151.4 | 239 |
152.4 | 250 |
153.2 | 224 |
153.6 | 256 |
154.7 | 250 |
154.8 | 235 |
155.4 | 290 |
155.4 | 240 |
155.5 | 276 |
159.3 | 245 |
159.7 | 248 |
The data in this section explores the relationship between cadence
(steps per minute) and speed (centimeters per second).
In distance running theory, the optimal cadence is around 180
steps per minute. This should produce the highest sustainable
speed for distance running.
As a joggler, my cadence is synchronized to my juggling.
This section of the examination explores whether my cadence as
measured by a pedometer is related to my speed as measured by a
Global Positioning System (GPS) receiver.
- _________ Calculate the slope of the linear regression (best fit line).
- _________ Calculate the y-intercept of the linear regression (best fit line).
- _________ Is the relation between cadence and speed positive, negative, or neutral?
- _________ Calculate the linear correlation coefficient r for the data.
- ______________ Is the correlation none, weak/low, moderate, strong/high, or perfect?
- _________ Use the slope and intercept to predict the speed
for a cadence of 149 steps per minute.
- _________ Use the slope and intercept to determine the cadence
produced by a speed of 200 cm/s.
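The least squares calculations above can be sketched in plain Python. This is a minimal illustration using the textbook formulas slope = Sxy/Sxx, intercept = ȳ − slope·x̄, and r = Sxy/√(Sxx·Syy). The function name `linear_fit` is hypothetical, and a spreadsheet's SLOPE, INTERCEPT, and CORREL functions would serve the same purpose.

```python
import math

# Cadence (steps/min) and speed (cm/s), copied from the Part III table.
cadence = [148.0, 150.0, 151.3, 151.4, 152.4, 153.2, 153.6,
           154.7, 154.8, 155.4, 155.4, 155.5, 159.3, 159.7]
speed = [234, 250, 239, 239, 250, 224, 256,
         250, 235, 290, 240, 276, 245, 248]


def linear_fit(x, y):
    """Return (slope, intercept, r) for the least squares line y = slope*x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


slope, intercept, r = linear_fit(cadence, speed)

# Predict a speed from a cadence, then invert the line to get a cadence from a speed.
predicted_speed = slope * 149 + intercept
cadence_for_200 = (200 - intercept) / slope
```

Inverting the line to find the cadence for 200 cm/s extrapolates below the slowest measured speed, so that result deserves more caution than the prediction at 149 steps per minute.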
One intention of any course is that a student should be able to
learn and employ new concepts in the field even after the course is over.
In a linear regression analysis a correlation coefficient near
zero means little or no linear relation exists between the variables.
You can run a statistical test to determine whether the
correlation coefficient r is
statistically significantly different from zero.
If the difference of r from zero is statistically significant,
then the data provide evidence that a linear relationship exists.
If you fail to reject a null hypothesis of r equals zero,
then there is no evidence in the data that cadence and speed are linked.
To run the hypothesis test, you will calculate a
t-critical (tc), a t-statistic (t), and then a p-value
using the t-statistic and the TDIST function.
For this test:
sample size n is the number of data pairs
t-critical: =TINV(α;n−2) where α = 0.05
t-statistic: t = r·√(n−2) / √(1−r²)
p-value: =TDIST(ABS(t-statistic);n−2;2)
Note that n−2 is used in these formulas. This is the degrees
of freedom for a correlation hypothesis test.
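The correlation test described above can be sketched in Python with the standard library alone. This is an illustration under assumptions, not the spreadsheet procedure itself. It uses the standard t-statistic for a correlation coefficient, t = r·√(n−2)/√(1−r²), and approximates the two-tailed p-value by integrating the Student t density with Simpson's rule in place of the TDIST function. The function names and the value r = 0.5 are made up for the example.

```python
import math


def t_statistic(r, n):
    """t-statistic for testing r against zero with n data pairs (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1 - r * r))


def t_pdf(t, df):
    """Probability density of the Student t distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, intervals=2000):
    """Approximate the two-tailed p-value, like =TDIST(ABS(t);df;2)."""
    t = abs(t)
    if t == 0:
        return 1.0
    h = t / intervals                      # intervals must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                   # P(0 < T < t)
    return 1 - 2 * area


# Made-up correlation coefficient, used only to show the mechanics.
r_example = 0.5
n_example = 14
df_example = n_example - 2

t_example = t_statistic(r_example, n_example)
p_example = two_tailed_p(t_example, df_example)

# Reject the null hypothesis of r = 0 only when the p-value is below alpha.
significant = p_example < 0.05
```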
- _________ Determine the sample size n
by counting the number of data pairs.
- _________ Determine t-critical using an alpha of α = 0.05
and n − 2 degrees of freedom.
- _________ Determine the t-statistic using the formula noted further above,
remembering to use n − 2 for the degrees of freedom.
- _________ Determine the p-value using the TDIST function,
remembering to use n − 2 for the degrees of freedom.
- ________ Is the correlation between my cadence and
my speed statistically significant?
Tables of Formulas and OpenOffice Calc functions
Hypothesis testing for paired data samples
Statistic or Parameter | Symbol | Equations | OpenOffice |
Calculate a p-value for the difference of the means from two samples of paired data | | | =TTEST(data_range_x;data_range_y;2;1) |