Haladyna, Thomas M.
(1994). Developing and validating multiple-choice test
items. Hillsdale, N.J.: Lawrence Erlbaum Associates.
Parts of a Multiple
Choice Question
- Stem
- Correct Answer
- Distractors (Foils)
Types of Conventional
Multiple Choice Question Formats
- Question
- Positive (“Which
is…”)
- Negative (“Which
of the following is NOT…”)
- Incomplete Stem
(partial sentence)
- Best Answer
Variations of Conventional
Multiple Choice Question Formats
1) conventional matching (two-column)
2) unconventional matching (three-column): adds a third column. The second
and third columns’ answer choices are staggered (i.e., “1” in first column,
“2” in second, “3” in first, “4” in second; or “a” in first, “b” in second,
etc.)
3) pick the closest: used for numeric questions. A standard set of numbers
is presented and students pick the one closest to their answer. This avoids
the tendency in math questions to use the supplied options to pick the
correct answer.
4) uncued: uses a large set of distractors (sometimes in the hundreds) to
virtually eliminate guessing.
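The “pick the closest” format can be illustrated with a minimal sketch. The function name `pick_closest` and the standard set of numbers are hypothetical, chosen only for illustration; the source does not specify either.

```python
def pick_closest(computed: float, options: list[float]) -> float:
    """Return the listed option nearest to the student's computed answer."""
    if not options:
        raise ValueError("options must not be empty")
    return min(options, key=lambda option: abs(option - computed))


# Hypothetical standard set of numbers shared by every numeric item.
STANDARD_OPTIONS = [0, 1, 2, 5, 10, 20, 50, 100]

# A student who works the problem and gets 17.5 marks the nearest option.
print(pick_closest(17.5, STANDARD_OPTIONS))  # 20
```

Because the same options accompany every item, they give no hint about any particular answer; the student has to work the problem first.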
Alternate-Choice
Items
These have only two
options, but they are NOT true-false items (because unlike
T/F, they offer a comparison between two choices whereas T/F
says whether one choice is right or wrong). A good choice
for items for which it is difficult to write more than one
plausible distractor. Provides for higher reliability
because you can ask more items in a given time period.
Especially good for high-achieving students who are skilled
at eliminating less plausible distractors.
The biggest problem with alternate choice is that there are only two
choices, which means even a guesser is likely to score about 50% (vs.
25% for four-choice MC tests). Thus the effective range of test scores
is truncated to 50%-100%.
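The guessing arithmetic above can be checked with a short sketch. It assumes blind guessing, equally likely options, and exactly one correct option per item; these assumptions and the name `chance_score` are illustrative and do not come from the source.

```python
from fractions import Fraction


def chance_score(num_options: int) -> Fraction:
    """Expected proportion correct for a test-taker who guesses blindly.

    Assumes one correct option per item and that every option is
    equally likely to be chosen (an assumption, not from the source).
    """
    if num_options < 2:
        raise ValueError("an item needs at least two options")
    return Fraction(1, num_options)


for k in (2, 4):
    p = chance_score(k)
    # Scores at or below the chance level say little about knowledge,
    # so the informative range runs from p up to 100%.
    print(f"{k} options: chance {float(p):.0%}, informative range {float(1 - p):.0%}")
```

Under these assumptions a two-option item leaves only the upper half of the score scale to separate test-takers, while a four-option item leaves three quarters of it.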
True-False Items
Advantages:
- Easy to write
- Short: a larger number of items can be given in a given amount of time.
Research suggests
T/F items have a number of problems:
- Like alternate
choice items, they have a truncated range of scores (50%-100%)
that may encourage guessing.
- They can be confusing
(errors for negatively-worded items are higher)
- They are less reliable
because many test-takers approach them with a bias toward
true or false answers.
- They have a tendency
to test trivial knowledge (though this is the fault of the
item writer, not the item).
Modified True-False
Format
This format uses multiple
column format:
Which statement is
true? Select “A” if the item in the first column is
true, “B” if the item in the second column is true, “C” if
the item in the third column is true, or “D” if NONE of the
items on that row are true.
   |                                  | Root | Stem | Leaf |
 1 | Growing point protected by a cap |      |      |      |
 2 | May possess a pithy center       |      |      |      |
 3 | Epidermal cell hair-like         |      |      |      |
 4 | Growing region at tip            |      |      |      |
Complex Multiple-Choice
This is also called
“ETS” (Educational Testing Service) format and is common on
the SAT, MCAT, etc.
Problems
- Difficult to construct.
- Takes up a lot of space.
- Requires more reading time, reducing the number of items per unit of
time (and thus reliability).
- Students find them confusing.
- Inflates importance of test-taking skills (e.g., knowing that one item is
correct or incorrect can eliminate multiple distractors).
Multiple True False
This alternative to
ETS-style format is an item set such as the following:
Below are references to creatures. Mark “A” if absurd and “B” if realistic:
1. Aquatic mammal
2. Fish with a lung
3. Fern gametophyte with spores
4. Algae with no nucleus
5. Chordate without
a notochord
6. Single-celled metazoa
7. Featherless, flying
mammal
8. Flatworm with a
skeleton
9. Amoeba with a fixed
mouth
10. Warm-blooded reptile
Advantages
- Students prefer
them over multiple choice questions.
- A large number
of questions can be asked in a given time period, improving
reliability.
Disadvantages
- Truncated range
of scores per item (50%-100%) may encourage guessing.
- Preliminary research indicates that this format may be better suited to
basic knowledge than to complex competence. For example, it is best at
testing for examples vs. non-examples or characteristics vs.
non-characteristics.
Multiple Mark (Multiple-Multiple-Choice)
In this variation of the Multiple True False format, students mark the
choice if it is true and do NOT mark it if it is false. Whereas with
standard MTF format there is a bias toward guessing “True”, here there is
a bias toward omission (i.e., guessing “False”).
Context-Dependent
Item Sets (aka bundles, scenarios, problem sets, testlets)
Here, one presents
introductory information followed by a series of questions
that depend on that information.
This format offers
the potential of measuring higher-order processes and is increasingly
popular, but has not been well-studied. It has several
variants:
a) comprehension:
presents a literary excerpt and asks several questions about
it.
b) problem-solving:
here, each item builds on the preceding one or asks about
the next step in the problem-solving process.
c) interlinear: good
for measuring writing skills, this variant embeds several
items within a paragraph, asking students to choose the best
alternative for each word or phrase.
d) graphical:
e.g., presenting a table or graph and asking several questions
about its interpretation.
Dangerous Answers
Although dangerous answers are rarely the primary focus of a question,
there is a case for including distractors that describe dangerous or
incompetent behaviors. This approach shows promise on certification tests
where practitioners deal with the public and errors could cause harm
(e.g., in the medical professions).
Item Shells
Shells are “hollow”
items containing the syntax for the general class of items.
“Which is an example
of (any concept)?”
a. Example
b. Plausible non-example
c. Plausible non-example
Steps in Developing
Item Shells
1. Start with the
stem of a successful item.
“What is the distinguishing
characteristic of hydrogen?”
2. Underline key words or phrases representing the content of the item.
“What is the distinguishing
characteristic of hydrogen?”
3. Identify alternate
words or phrases for each key word or phrase.
(any other gases studied in this unit, e.g., oxygen, argon, carbon dioxide,
carbon monoxide)
4. Select a variable from the range.
“Oxygen”
5. Write the item
stem.
“What is the distinguishing
characteristic of oxygen?”
6. Write the correct
answer.
“It is the secondary
element in water”
7. Write up to four
plausible distractors.
“It has a lower density
than hydrogen”
“It can be fractionally
distilled”
“It has a lower boiling
point than hydrogen”
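The steps above can be sketched as a simple template fill. This is a minimal illustration, not a method given in the source: the slot name `{gas}` and the helper names `stems_from_shell` and `assemble_item` are hypothetical, and the correct answer and distractors still have to be written by hand for each stem.

```python
import random

# Steps 1-2: the successful stem with its key word replaced by a named slot.
SHELL = "What is the distinguishing characteristic of {gas}?"

# Step 3: the alternates the text lists for the key word.
GASES = ["oxygen", "argon", "carbon dioxide", "carbon monoxide"]


def stems_from_shell(shell: str, values: list[str]) -> list[str]:
    """Steps 4-5: fill the shell once for each alternate value."""
    return [shell.format(gas=value) for value in values]


def assemble_item(stem: str, correct: str, distractors: list[str],
                  rng: random.Random) -> dict:
    """Steps 6-7: mix the hand-written answer in with the distractors."""
    options = [correct, *distractors]
    rng.shuffle(options)
    return {"stem": stem, "options": options, "key": options.index(correct)}


stems = stems_from_shell(SHELL, GASES)

# Answer and distractors are copied from the worked example in the text.
item = assemble_item(
    stems[0],
    "It is the secondary element in water",
    [
        "It has a lower density than hydrogen",
        "It can be fractionally distilled",
        "It has a lower boiling point than hydrogen",
    ],
    random.Random(0),  # fixed seed so the option order is repeatable
)
print(item["stem"])
```

The shell only automates the stem; the quality of each generated item still depends on the answer and distractors supplied for it.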
*************************************************
Item Modeling
Item modeling is derived
from scenarios constructed by experts and is used to test
the same types of higher-level thinking that would actually
be used by professionals in the domain. To do item modeling,
one constructs sets of alternatives for different “facets”
or variables. For example:
Facet One: Setting
a) unscheduled visits
b) scheduled appointments
c) rounds
d) emergencies
e) other
Facet Two: Physician
Tasks
a) history
b) physical exam
c) lab and diagnostic
studies
d) formulation of
likely diagnoses
e) selection of most
likely diagnosis
Facet Three: Case
Cluster
1a) initial workup,
new patient
1b) initial workup,
known patient
2a) continued
care of known patient, old problem
2b) continued care
of known patient, worsening problem
3) emergency
Once the facet sets are created, many items can be generated by
“rotating” the facets.
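One way to read “rotating” is as taking every combination of one value per facet. The sketch below makes that assumption explicit; the source does not define the procedure, and the function name `rotate_facets` is hypothetical. The facet values are copied from the lists above.

```python
from itertools import product

SETTINGS = [
    "unscheduled visits", "scheduled appointments", "rounds",
    "emergencies", "other",
]
PHYSICIAN_TASKS = [
    "history", "physical exam", "lab and diagnostic studies",
    "formulation of likely diagnoses", "selection of most likely diagnosis",
]
CASE_CLUSTERS = [
    "initial workup, new patient",
    "initial workup, known patient",
    "continued care of known patient, old problem",
    "continued care of known patient, worsening problem",
    "emergency",
]


def rotate_facets(*facets: list[str]) -> list[tuple[str, ...]]:
    """Return every combination that takes one value from each facet.

    Assumption: "rotating" means a full Cartesian product of the facets.
    """
    return list(product(*facets))


specs = rotate_facets(SETTINGS, PHYSICIAN_TASKS, CASE_CLUSTERS)
print(len(specs))  # 125
print(specs[0])
```

Each tuple is only a specification for an item, not the item itself. A writer would still need to draft the scenario and options, and might discard combinations that do not describe a plausible case.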