Due to the gravity of the errors made, I am herewith
forwarding all of my critical decision path communications from just before the
construction of the entrance function to Paul's note that launched the first set
of second looks at the data.
Friday 04 April
Saturday 05 April
At 7:00 I sent out the attached "A Saturday morning
romp..." which looked at correlations between the sections of the entrance
test. This report would have no direct impact on the incident. The
report would, however, later be repeated to look at whether the actual PICS
essay scores were different from those predicted by a correlation
function. This was sent from my house.
At 8:49 I would send home from the College the first
broken tile. The message was short:
Structure = 10*(1.158*raw+21.47)
Reading = 10*(0.9150*raw+21.92)
The attached toeflconversion.xls spreadsheet reveals the
fatal flaw: I had been given a TOEFL reading conversion table for a fifty
question test. The spreadsheet confirms this: I used the bottom of
parallel range bins. The raw scores end at 48: the top of the bin was
50.
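As a minimal sketch, the two conversion functions from that message can be written out and evaluated at the ends of the raw-score range. The function names are illustrative and do not come from the spreadsheet. The coefficients are the ones quoted above, and the raw values 48 and 50 are the end of the raw scores and the top of the bin mentioned in the text.

```python
def structure_to_toefl(raw: float) -> float:
    """Structure conversion as quoted in the 8:49 message."""
    return 10 * (1.158 * raw + 21.47)


def reading_to_toefl(raw: float) -> float:
    """Reading conversion as quoted in the 8:49 message.

    The coefficients come from a table for a fifty question test, so the
    result is only meaningful if the administered test matches that table.
    """
    return 10 * (0.9150 * raw + 21.92)


# Each additional correct reading answer is worth 9.15 scaled points,
# so a mismatch in the assumed test length shifts every converted score.
for raw in (0, 48, 50):
    print(raw, round(reading_to_toefl(raw), 1), round(structure_to_toefl(raw), 1))
```

Because both functions are linear, any error in the table they were fitted to carries straight through to every projected score.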
I have never seen nor administered the structure
section. I had absolutely no way of knowing that the total possible was
incorrect. At this point the whole process was
doomed to an incorrect conclusion.
That morning and on into the afternoon would be spent
marking essays.
At home that afternoon and late into the night I was
entering essay scores.
Sunday 06 April
I would spend Sunday morning working on the entrance
function. I used the flawed reading function to convert reading scores
into projected TOEFL scores. The structure function, while possibly not from the
correct table for the structure section administered, was at least on the
correct number of questions.
I was working with the number 600 in my head - the number
the admissions board had indicated we traditionally admitted to the national
campus. When I got down to the 600th student I noticed we were near
470.
For some time now the admissions board has used an Excel
spreadsheet function designed by me - one that made cuts at 400 and 470 to
divide up state, IEP, and national admission. I realized that I could use
the same basic function and thereby use our previously approved cut
points. This seemed to me to be a logical approach: by using the cut
points that we have used for the last few years we would be consistent in our
standards.
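The cut-point logic can be sketched as below. This is not the Excel function itself: the function name is hypothetical, the destination codes 1, 2, and 3 follow the ProjDest codes reported later in this note, and the treatment of a score that lands exactly on a cut point is an assumption, since that is precisely the kind of boundary case the real function had to settle.

```python
STATE, IEP, NATIONAL = 1, 2, 3  # destination codes as used in the ProjDest yields


def projected_destination(score: float,
                          iep_cut: float = 400,
                          national_cut: float = 470) -> int:
    """Map a projected TOEFL-scale score to a destination code.

    Assumption: a score exactly equal to a cut point is placed in the
    higher destination. The original spreadsheet may have done otherwise.
    """
    if score >= national_cut:
        return NATIONAL
    if score >= iep_cut:
        return IEP
    return STATE


for score in (385.0, 400.0, 455.5, 470.0, 512.85):
    print(score, projected_destination(score))
```

Keeping the cuts as parameters makes it easy to rerun the same placement with the same standards once the input scores are corrected.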
At lunch time Sunday I had been at the computer every
waking moment since Saturday afternoon. My family insisted that I needed
to get some sunlight and took me to the Village.
By mid-Sunday afternoon I felt I understood and had
solved the "boundary value problems" generated by the equation. Sometime
in the middle of Sunday afternoon I sat back in my chair and stared at the
equation for a long time. I realized as I looked at it that few would
comprehend the equation, its cut points, and its boundary value problem
solutions. I surmised that in the upcoming admissions board meeting the
committee would look at the equation, trust my work, and approve it.
As Jonathan noted in an earlier email, what other
option did the time-pressed committee have?
I turned to my family and said words along the lines of,
"I have just decided who gets to go to Palikir and who does not. I do not
want this responsibility. No one will spend the time I've spent to make sure
this is right. I will be alone on this one." I then said a prayer
hoping I'd got it right.
Unfortunately I had neglected the adage of the computer
age: garbage in equals garbage out. Although my work was solid, it was
based on a flawed conversion function, the incorrect tables I had been given
Saturday morning. I had trusted the tables to be correct.
I knew I had to send out a thorough report with
statistics and data so that others could vet my work thoroughly. At 17:27
on Sunday I sent out "Predicting the future with an equation..." That
email contained the following critical information:
Yields:
Projected destination (ProjDest) 3: National Campus:
694
2: IEP: 225
1: State Campus: 440
I did not know the historic values for these
numbers. I was counting on someone responding if those numbers were out of
line. Apparently they were, and badly so. The IEP numbers are
apparently way low against historic norms, or so I am led to believe.
I think the complication was that no one person carries
the total IEP count in their head: Pohnpei knows their number, Kosrae knows
theirs, Yap knows theirs, and so forth. No one, however, walks
around with the sum in their head.
I knew the email was important and I sent a copy to both
Ringlen's college and Hotmail address in the hopes that he was remaining in
contact while on the road. The email also went to admissions board
personnel and others who had worked on the entrance test.
I do note one omission from the email list: as far as I
can tell from the To: list the director of the Kosrae campus did not get a copy
of that email. In addition the Pohnpei campus copy went to Patty
Grandos. Joakim and Lourdes did, however, receive copies of that
note.
My apologies to Director Kephas: I was hand typing the
address list and I was at that point dizzy and tired from having spent 20
of the prior 26 hours sitting in front of a computer working.
Paul would know, but I think I might have bcc'ed my
division and him. I have always looked to Paul as I know he catches stuff
I miss.
Sunday evening
I would spend Sunday evening looking
at the high school results in the email "A look
at the high schools". Here a second error, completely mine, would be
made. The critical omission was the following table:
| HS | 2003 Count | 2003 Average | 2003 StDev | 2003 Grand Average | 2002 Count | 2002 Average | 2002 StDev | 2002 Grand Average | 2001 Count | 2001 Average | 2001 Grand Avg | 2003 Rank | 2002 Rank | 2001 Rank | 2002 0.95 Conf |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CCA | 9 | 619.04 | 50.4 | 475.44 | 10 | 592 | 55 | 432 | 7 | 536 | 437 | | 1 | 2 | 39 |
| Xavier | 26 | 628.21 | 29.63 | 475.44 | 25 | 536 | 44 | 432 | 26 | 548 | 437 | | 2 | 1 | 18 |
| PSDA | 22 | 566.02 | 67.18 | 475.44 | 40 | 519 | 60 | 432 | 30 | 528 | 437 | | 3 | 4 | 19 |
| YSDA | 5 | 461.52 | 64.25 | 475.44 | 15 | 505 | 64 | 432 | 5 | 533 | 437 | | 4 | 3 | 36 |
| PATS | 24 | 544.97 | 63.94 | 475.44 | 40 | 486 | 59 | 432 | 37 | 493 | 437 | | 5 | 5 | 19 |
| SCA | 42 | 514.56 | 60.4 | 475.44 | 46 | 478 | 51 | 432 | 42 | 470 | 437 | | 6 | 12 | 15 |
| PICS | 325 | 523.18 | 62.1 | 475.44 | 242 | 467 | 56 | 432 | 259 | 477 | 437 | | 7 | 7 | 7 |
| YHS | 103 | 513.28 | 88.21 | 475.44 | 109 | 458 | 74 | 432 | 117 | 474 | 437 | | 8 | 9 | 14 |
| KHS | 126 | 512.85 | 72.49 | 475.44 | 140 | 452 | 66 | 432 | 96 | 460 | 437 | | 9 | 14 | 11 |
| Mizpah | 22 | 471.98 | 44.7 | 475.44 | 18 | 449 | 46 | 432 | | | | | 10 | | 23 |
| OIHS | 28 | 509.41 | 68.24 | 475.44 | 19 | 444 | 54 | 432 | 24 | 384 | 437 | | 11 | 20 | 26 |
| OHWA | 10 | 494.6 | 79.72 | 475.44 | 8 | 433 | 77 | 432 | 9 | 433 | 437 | | 12 | 16 | 64 |
| Nukuno (NCHS) | 11 | 392.51 | 49.29 | 475.44 | 6 | 373 | 69 | 432 | | | | | 13 | | 73 |
| SNHS | 89 | 402.47 | 46.39 | 475.44 | 72 | 366 | 38 | 432 | 80 | 418 | 437 | | 14 | 17 | 9 |
| CHS | 268 | 411.94 | 58.81 | 475.44 | 209 | 362 | 49 | 432 | 219 | 369 | 437 | | 15 | 22 | 7 |
| Weno | 113 | 386.4 | 46.83 | 475.44 | 103 | 357 | 62 | 432 | 97 | 366 | 437 | | 16 | 23 | 12 |
| Berea | 24 | 446.21 | 60.28 | 475.44 | | | | | 24 | 481 | 437 | | | 6 | |
| CSDA | 13 | 510.38 | 70.87 | 475.44 | | | | | 13 | 439 | 437 | | | 15 | |
| Woleai (NICHS) | 29 | 458.56 | 64.24 | 475.44 | | | | | 30 | 402 | 437 | | | 18 | |
| PentLHA | 38 | 421.22 | 56.45 | 475.44 | | | | | 29 | 356 | 437 | | | 24 | |
The problem can be seen in the "Grand Average"
columns. I had a 43 point rise in the TOEFL average, an average for which
I knew the historic amounts of movement to be on the order of five points.
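The size of that jump can be checked with a few lines of arithmetic. The grand averages below are the ones in the table. The five point figure is the historic movement mentioned above, while the `factor` multiplier and the function names are hypothetical, chosen only to illustrate the kind of sanity check that would have flagged the result.

```python
# Grand averages as reported in the high school table.
GRAND_AVERAGE = {2001: 437.0, 2002: 432.0, 2003: 475.44}


def year_over_year(averages: dict) -> dict:
    """Return the change in the grand average for each year after the first."""
    years = sorted(averages)
    return {later: round(averages[later] - averages[earlier], 2)
            for earlier, later in zip(years, years[1:])}


def looks_suspicious(change: float,
                     typical_move: float = 5.0,
                     factor: float = 3.0) -> bool:
    """Flag a change far larger than the historic movement.

    typical_move comes from the text; factor is an arbitrary illustrative choice.
    """
    return abs(change) > factor * typical_move


changes = year_over_year(GRAND_AVERAGE)
for year, change in changes.items():
    print(year, change, looks_suspicious(change))
```

The 2001 to 2002 change is the usual five points in size; the 2002 to 2003 change is more than eight times that.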
That night I looked back over my work, including the
conversion functions, but could find no flaw in the data, functions, or
statistical analyses. I was hesitant to report a 43 point rise. As
Paul would later note, this would be trumpeted as proof of the greatness
of the high schools.
I, however, was aware we had dumped one section of the
TOEFL (listening) and I knew that Jonathan had reworked one of the other
sections. Whenever a test is changed, the new data can be very different
from the old data. The rule in statistics is that if one makes changes
that are significant, new results cannot be compared to old results.
Tired, and unable to find another explanation, I opted to omit
the above table and said:
Many things changed on the entrance test this year.
Too many. As a result I have to re-engineer most of the statistics I
report. I begin with mathematics as I will present a new way of measuring
high school performance. The result will be NO HISTORICAL comparison data
this year. That will come next year.
In English the change in the structure of the test created
similar problems with historic comparisons. For now I will report simply a
pseudo-z value rank order. The z-value is the school average on the
structure and reading sections minus the averages of the schools (not of all the
students) and then that difference is divided by the standard deviation of
the schools. These values then provide the capacity to generate a rank
order.
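The pseudo-z calculation described in that passage can be sketched as follows. The function names are illustrative, the six schools are only a subset of the 2003 averages from the high school table, and the use of the sample standard deviation (as Excel's STDEV would give) is an assumption, since the original email does not say which form was used.

```python
from statistics import mean, stdev


def pseudo_z(school_averages: dict) -> dict:
    """Center each school average on the mean of the school averages
    (not the mean of all students) and scale by the spread of the schools."""
    values = list(school_averages.values())
    center = mean(values)
    spread = stdev(values)  # assumption: sample standard deviation
    return {school: (avg - center) / spread
            for school, avg in school_averages.items()}


def rank_order(z_values: dict) -> list:
    """Schools ordered from highest to lowest pseudo-z value."""
    return sorted(z_values, key=z_values.get, reverse=True)


# Illustrative subset of the 2003 school averages.
SAMPLE_2003 = {
    "Xavier": 628.21, "CCA": 619.04, "PICS": 523.18,
    "KHS": 512.85, "CHS": 411.94, "Weno": 386.4,
}

z = pseudo_z(SAMPLE_2003)
for school in rank_order(z):
    print(f"{school:8s} {z[school]:+.2f}")
```

Because every school counts once regardless of size, a small school moves the center as much as PICS or CHS does, which is what separates this from a student-level z-score.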
Unfortunately the 43 point jump was not a result of test
changes. It was a result of the factors cited by Jonathan, including the use
of an incorrect table. Correcting the table, however, would be expected to
only drop the jump by 10 points. This leaves a 33 point jump, still a
large jump. The wrong conversion table is not the only
thing that altered the placement. Either other changes made the test
easier or the high schools really are getting better.
As we rush to impale ourselves on our swords and seek
ways to financially save the IEP programs, let us not unduly deride the high
schools. It may well be that PICS and KHS are producing fewer IEP level
students. The table only explains 10 of the 43 point nationwide
jump. And KHS experienced a 61 point rise in their TOEFL score. All of
the errors put together cannot explain this away.
Bear in mind that the essays, an independent variable
for KHS, correlate highly with the reading and structure. And 59% of the
KHS students wrote a 4, 5, or 6 essay. These are, against the rest of the
nation, good essays.
So when Paul later wrote questioning the high admissions,
my look back at my data suggested the boost was REAL. Even now, correcting
the data is still likely to leave far higher than usual admissions from
KHS.
I have not studied PICS results in detail, but maybe,
just maybe, the high schools are starting to get the job done in English.
Where this happens, IEP is dead. Sorry all, but if the high schools
produce stronger students then the IEPs are likely to die. And I still
cannot rule out underlying improvement in the high schools as a contributing
factor to the decrease in IEP and increase in national campus
admission.
For those concerned about my having a national campus
bias, I would ask that my record be reviewed. I have been a strong
supporter of the state campuses. I have taught in the state campuses both
here in Pohnpei and on Kosrae since 1993, with my only hiatus being the five
years I was with Title III. I wrote a long tome last July arguing for
stronger state campus offerings and in support of state campuses.
As I worked on the admissions work I was concerned about
the state campuses; they are near on always in my mind when I do curriculum or
admissions work. As noted, however, I did not have the historical numbers
and hoped someone else would holler if something was amiss.
Monday 07 April
Monday was absorbed with the fallout from the PICS
essays. VPIA Spensin James met with the principal, but the meeting did not end
with an agreement. I visited the principal in the evening and then drove
to Nett to talk to Jonathan. This is the dogs, children, and underwear meeting
he mentioned in one of his notes.
Tuesday 08 April
I continued to work on the PICS data.
Wednesday 09 April
I produced "Admissions: What percentage were admitted
where..." which went out at 10:30 that night. This contained the critical
table:
| HS | SC | IEP | N | Sums | SC share | IEP share | N share | Avg |
|---|---|---|---|---|---|---|---|---|
| Xavier | | | 26 | 26 | 0 | 0 | 1 | 3 |
| CCA | | | 9 | 9 | 0 | 0 | 1 | 3 |
| PPSD | | 1 | 22 | 23 | 0 | 0.04 | 0.96 | 2.96 |
| PSDA | | 1 | 21 | 22 | 0 | 0.05 | 0.95 | 2.95 |
| PATS | 1 | 3 | 20 | 24 | 0.04 | 0.13 | 0.83 | 2.79 |
| PICS | 19 | 40 | 266 | 325 | 0.06 | 0.12 | 0.82 | 2.76 |
| SCA | 3 | 6 | 33 | 42 | 0.07 | 0.14 | 0.79 | 2.71 |
| OIHS | 3 | 4 | 21 | 28 | 0.11 | 0.14 | 0.75 | 2.64 |
| KHS | 10 | 27 | 89 | 126 | 0.08 | 0.21 | 0.71 | 2.63 |
| YHS | 16 | 20 | 67 | 103 | 0.16 | 0.19 | 0.65 | 2.5 |
| CSDA | 3 | 1 | 9 | 13 | 0.23 | 0.08 | 0.69 | 2.46 |
| KSC | 2 | 1 | 6 | 9 | 0.22 | 0.11 | 0.67 | 2.44 |
| Mizpah | 3 | 8 | 11 | 22 | 0.14 | 0.36 | 0.5 | 2.36 |
| Ohwa | 2 | 3 | 5 | 10 | 0.2 | 0.3 | 0.5 | 2.3 |
| NICHS | 7 | 8 | 14 | 29 | 0.24 | 0.28 | 0.48 | 2.24 |
| YSDA | 2 | | 3 | 5 | 0.4 | 0 | 0.6 | 2.2 |
| Berea | 10 | 5 | 9 | 24 | 0.42 | 0.21 | 0.38 | 1.96 |
| PLHA | 19 | 12 | 7 | 38 | 0.5 | 0.32 | 0.18 | 1.68 |
| CHS | 178 | 50 | 40 | 268 | 0.66 | 0.19 | 0.15 | 1.49 |
| SNHS | 66 | 16 | 7 | 89 | 0.74 | 0.18 | 0.08 | 1.34 |
| Weno | 87 | 18 | 8 | 113 | 0.77 | 0.16 | 0.07 | 1.3 |
| NCHS | 9 | 1 | 1 | 11 | 0.82 | 0.09 | 0.09 | 1.27 |
| Sums: | 440 | 225 | 694 | 1359 | 0.32 | 0.17 | 0.51 | 2.19 |
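For anyone vetting the table, the derived columns can be recomputed from the three counts. The sketch below assumes the Avg column is the count-weighted mean of the destination codes (1 for state campus, 2 for IEP, 3 for national). The email does not state this, but the assumption reproduces the PICS and Sums rows. The function name is illustrative.

```python
def admission_profile(sc: int, iep: int, national: int):
    """Return (total, shares, avg) for one row of the admissions table.

    shares are the fractions admitted to state campus, IEP, and national.
    avg is the mean destination code, weighting 1, 2, 3 by the counts
    (an assumption about how the Avg column was built).
    """
    total = sc + iep + national
    shares = tuple(round(n / total, 2) for n in (sc, iep, national))
    avg = round((1 * sc + 2 * iep + 3 * national) / total, 2)
    return total, shares, avg


print("PICS", admission_profile(19, 40, 266))
print("KHS ", admission_profile(10, 27, 89))
print("All ", admission_profile(440, 225, 694))
```

An Avg near 3 means nearly everyone from that school was placed at the national campus; an Avg near 1 means nearly everyone was placed at a state campus.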
At the time this email went out the admissions letters
had not been prepared. The state campuses were sent this information, with
the email going specifically to: --deleted from web page version --
At each step I was hoping others were looking at the data
I was generating and would let me know if something looked amiss. Of
course this is the period in which the time crunch hurt us: we were trying to move
to letter production as fast as humanly possible.
Friday 11 April
On Friday I sent out the "Admissions letter wording..."
at 1:29 P.M. Tropical Storm 02W had shut the campus down for two days, and the
lull provided the time I needed to work on the admissions letters. Due to
the non-work day on Friday, nothing would be done with this letter until the
following week at the earliest.
At this point, having heard nary otherwise, I presumed
the numbers that were being generated were acceptable.
Saturday 19 April
Eight days later, in the middle of Easter recess, at 8:20
A.M. on a Saturday morning Paul compliments the national campus geniuses and
begins the ball rolling on a second look at the numbers. At this point I
am convinced I am running on solid ground and offer a rebuttal sharing the
following cross-table:
| | MS 090 MS 065 | MS 095 | MS 098 | MS 100 | MS 101 MS 150 | Sum |
|---|---|---|---|---|---|---|
| Cert | 10 | | | | | 10 |
| IEP | 23 | 2 | | 2 | | 27 |
| Natl | 50 | 12 | 10 | 15 | 2 | 89 |
| Sum | 83 | 14 | 10 | 17 | 2 | 126 |
The cross-table gets sent on Sunday.
Monday 21 April
Paul gives me the first historic numbers, the first
indication that something went wrong.
Over the last 6 years I have kept detailed stats of my own
regarding the performance of KHS students. I try to keep track of how many
seniors can get 60% of the questions right on each section, the standard
pass/fail requirement. On the old TOEFL this translated to a 500 average. When
removing Upward Bound from the mix KHS had somewhere between 5-10 students each
year who could attain this level.
I realized the import of his data: I deeply understand
the sluggishness of statistical change. This was an indication something
was not valid. I remain grateful to Paul for providing me numbers -
up until then I was in the dark as to the historic values. His note later
that same day would confirm both the usual sluggishness and the sudden
change:
1999- 39 Regular Students
2000- 29 Regular Students
2001- 27 Regular Students
2002- 32 Regular Students
(trumpets please)
2003- EIGHTY-NINE REGULAR STUDENTS!!!
I then went back over everything and could not find an
explanation. At this point, however, the idea that something was
amiss was forming. A rolling ongoing discussion began.
Monday 05 May
Jonathan shared information from last year's results that
suggested we admitted roughly 100 more than the previous year, 694 versus 591.
Various reasons why this might have happened were cited.
Tuesday 06 May
I was talking to Bastora when she noted that our cut-offs
appeared to be wrong. Bastora has a deep and intuitive sense with regard to
admissions from her years of working with the data. She was the first
person to mention to me the possibility that the number of questions was
amiss.
Jonathan wrote another note listing all of the now known
factors that led to the problems. This then led to an avalanche of email on the
topic.
Thursday 08 May
The drop-dead critical as a heart attack admissions board
meeting is postponed so curriculum committee can meet. Depressed and angry
that no resolution will be in the offing until Monday, I could not muster myself
for a curriculum meeting.
In the evening I went and visited Deep Kool-Aid (DK), my
student informant at PICS in regard to the essays. Holding pride of place on
the wall over her bed was her admissions letter to the national campus.
Her older siblings had placed into the IEP program; she was excited that she had
successfully gained admission to the national campus. Were she to now get
a letter changing her admission, she would either be crushed or angry, possibly
both.
Because I work in the cold and harsh world of statistics
I always force myself to know the stories behind the anonymous piles of numbers
I crunch.
Extrapolate from DK to all of the students impacted and
we will have found one more way to tick off and alienate our
students.
Do not get me wrong, I do not oppose any of the
ideas in circulation. In fact at this point I neither oppose nor support any
idea. Since I was in the chain that led to the incident I am recusing
myself from backing any particular solution.
I would suggest that we apply our corrections based on
the errors made. The error was in the reading scores: someone get me the
CORRECT table and I can rerun the scores with the correct function. If
someone would get me the CORRECT structure conversion I would do the same. Then I
could rerun my analysis using our traditional cuts at 400 and 470 and see what
change occurs.
Bear in mind that these changes may not have a big impact
in KHS. Paul notes that usually only 5 to 10 non-UB students gain national
campus admission. The UB students, by and large, gain national campus
admission. I think two last year (2002) did not. Use Paul's cut-off
of 500 but add 10 for the faulty reading conversion, so look only at students
above 510.
There are 63 students above 510, only 15 of whom are
UB. 13 of the 17 UB students are clustered at or above a 563
average. Yet there are 18 non-UB students at or above this 563
number. KHS, all by itself and without the massive resources of UB, lifted
18 non-UB students to a UB level. Sure, out of 126 this is only 14%, but
apparently the capacity is there in some classes. In the world of the FSM
this is a solid high school. Whoever handled those 18 did a great
job.
This genius continues to believe that there was real
improvement in some high schools and that this also contributed to the lowered
IEP numbers. What I am saying is that our efforts to correct the errors
should not be a zero-sum game: we should not shoot to remove all statistical
improvement in the numbers who achieved national campus admission.
If I am going to run tomorrow morning at 7:00 in the fun
run then I must go to bed, it is after one in the morning now.
I know I have miswritten some things, left out negatives,
etc. Please forgive me.
Thank-you for reading and studying this!
Dana