1 Sep 2010
Posted by Steven Just. No Comments
Question: If I make my distractors “plausible,” doesn’t that increase the possibility that one of them will also be correct?
Answer: Yes, with “plausible” distractors you have to be careful about not inadvertently supplying an additional correct answer in your multiple choice questions. One way to protect yourself against the “but Choice B could also be correct” argument is to use the following phrase in the stem:
Select the single best answer from the choices below.
Assuming that your correct answer is “more correct” than the “plausible” distractor, you are protected.
Have a question only an expert can answer? Let us know! If you want more detail, disagree, or want my rationale for an answer, feel free to comment.
18 Aug 2010
Posted by Steven Just. 2 Comments
Question: I know that for performance evaluations most experts say to use rating scales with an even number of choices rather than an odd number, but isn’t it sometimes useful to have a “neutral” category?
Answer: Yes, I suppose so, but in my opinion the temptation to “take the easy way out” in an evaluation is too great. Having an even number of rating choices forces the evaluator to make a decision, positive or negative. A number of years ago we created a sales representative role play evaluation system for one of our clients. Though the system was flexible enough to accommodate scales of any type, they chose a five-point scale. After a year or so of use we tried to interest them in an improved report writer that would let them analyze their data more fully: role play performance vs. actual field performance, inter-rater reliability, performance ratings vs. knowledge-based test scores, etc. Their response? “That won’t do us any good – everyone gets rated a three.”
3 Aug 2010
Posted by Steven Just. No Comments
Question: What’s a rubric?
Answer: A rubric is a scoring/evaluation tool that helps a rater define how a behavior is to be scored. For standard objective-type questions, scoring is simple: the student’s answer is usually correct or incorrect. But what about judging an essay or an observed behavior? If we have multiple raters, how do we ensure rater consistency? That’s where a rubric comes in. It provides a scoring scale based upon specific observed behaviors. For example, if we have a standard rating scale of 1 to 5, with 1 being poor and 5 being excellent, a rubric helps the rater decide on a proper score by associating specific observable behaviors with each of the five possible ratings.
21 Jul 2010
Posted by Steven Just. No Comments
Question: Since ultimately I am testing the employee’s ability to do a job, shouldn’t I be doing more performance testing?
Answer: Yes, absolutely. You can do performance testing in three ways:
- Through standard objective questions that pose a situation and ask the employee how he/she would act.
Caveat: There are significant limitations to how well you can model a live, open-ended interaction in a closed, objective question.
- Through a computer-based simulation that enables the employee to interact with a simulated environment as he/she would with the “live” environment.
Caveat: Computer simulations vary greatly in quality. Those that most closely model actual job performance can be expensive to develop.
- By having a person, usually a manager or trainer, evaluate the employee in a simulated role play.
Caveat: Evaluators need to be trained and have validated rubrics for scoring.
Most jobs require a combination of knowledge and skills. You need to evaluate both.
14 Jul 2010
Posted by Steven Just. No Comments
Question: If I have a test with retries, when I compute item statistics (difficulty levels, choice distributions, etc.) should I include all tries or just the first one?
Answer: Just the first one. The second and subsequent tries are usually “contaminated” by the students’ knowledge of what they got right and what they got wrong on the first attempt. You may even have given them the correct answers in the feedback after the first attempt.
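As a minimal sketch of the idea, the snippet below computes item difficulty (proportion correct) and the choice distribution from first attempts only. The response log layout, the student names, and the answer key are all hypothetical; a real testing system will store this data differently.

```python
from collections import Counter, defaultdict

# Hypothetical response log: (student, attempt number, item id, choice selected)
responses = [
    ("ann", 1, "Q1", "A"),
    ("ann", 2, "Q1", "B"),  # retry, excluded from the statistics
    ("bob", 1, "Q1", "B"),
    ("cat", 1, "Q1", "C"),
    ("cat", 2, "Q1", "B"),  # retry, excluded from the statistics
    ("dan", 1, "Q1", "B"),
]
answer_key = {"Q1": "B"}  # hypothetical key


def first_attempt_stats(responses, answer_key):
    """Return {item: (difficulty, choice counts)} using attempt 1 only."""
    choices = defaultdict(Counter)
    for _student, attempt, item, choice in responses:
        if attempt == 1:  # ignore second and subsequent tries
            choices[item][choice] += 1

    stats = {}
    for item, counts in choices.items():
        total = sum(counts.values())
        difficulty = counts[answer_key[item]] / total  # proportion correct
        stats[item] = (difficulty, dict(counts))
    return stats


stats = first_attempt_stats(responses, answer_key)
print(stats)
```

In this made-up data, two of the four first attempts are correct. Counting the retries as well would make it four correct out of six responses, which makes the item look easier than it was on first exposure.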
9 Jul 2010
Posted by Steven Just. No Comments
Question: Is it better to set a “traditional” passing score (e.g., 90%) or test for mastery by learning objective?
Answer: Most test creators opt for the former because that is what they are familiar with from school, but for criterion-referenced testing the latter is preferable. Why? A criterion-referenced test is, by definition, about mastery of a topic. Assuming a topic can be broken into independent learning objectives (and any topic can be), the student should be able to demonstrate mastery of ALL learning objectives. In a traditionally scored test a student may demonstrate mastery of the topic as a whole but not necessarily of each learning objective. Let’s take an extreme case: You create a 100-question test composed of 10 learning objectives, where each learning objective has 10 questions (this is just to make the math simple). If passing is set at 90% it is possible (not likely, but possible) for a student to pass the test but get a 0% on one learning objective. Bottom line: If you want to guarantee mastery of ALL learning objectives, test by the learning objective, not simply by passing score.
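The extreme case can be checked with a few lines of code. This is only an illustration: the objective names are invented, and applying the same 90% threshold to each objective is an assumption made for the example, not a rule stated above.

```python
QUESTIONS_PER_OBJECTIVE = 10

# Hypothetical student: perfect on nine objectives, zero on the tenth.
correct_by_objective = {f"LO{i}": 10 for i in range(1, 10)}
correct_by_objective["LO10"] = 0


def passes_overall(scores, pass_percent=90):
    """Traditional scoring: compare the total score with one passing score."""
    total = sum(scores.values())
    possible = QUESTIONS_PER_OBJECTIVE * len(scores)
    return total * 100 >= pass_percent * possible


def objectives_not_mastered(scores, mastery_percent=90):
    """Mastery scoring: list every objective scored below the threshold.

    Reusing 90% per objective is an assumption made for this sketch.
    """
    return [
        objective
        for objective, correct in scores.items()
        if correct * 100 < mastery_percent * QUESTIONS_PER_OBJECTIVE
    ]


print(passes_overall(correct_by_objective))           # True
print(objectives_not_mastered(correct_by_objective))  # ['LO10']
```

The student passes on the overall score with 90 of 100 correct, yet the per-objective check flags the objective on which every question was missed.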
30 Jun 2010
Posted by Steven Just. No Comments
Question: We have gone through a validated (Angoff) process to set our passing score, which came out to be 87%. But the test only has 20 questions so test takers can’t get a score between 85% and 90%. Is it OK to set the passing score to 85% or 90%?
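Answer: Yes. I would go with 90%. On a 20-question test, 87% works out to 17.4 correct answers, so 18 correct (90%) is the lowest attainable score that meets the standard; in that sense 90% is equivalent to 87%. Setting it to 85% (17 correct) artificially lowers the required competency level.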
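The arithmetic behind this answer can be sketched as follows. The function name is made up for illustration; the only inputs are the 87% cut score and the 20-question test length from the question.

```python
import math
from fractions import Fraction


def attainable_cut(cut_percent, num_questions):
    """Return (questions needed, resulting percent) for a validated cut score.

    Rounds UP to the next whole question, because rounding down would
    let a test taker pass below the validated standard.
    """
    needed = math.ceil(Fraction(cut_percent, 100) * num_questions)
    return needed, 100 * needed / num_questions


needed, percent = attainable_cut(87, 20)
print(needed, percent)  # 18 90.0
```

Exact fractions are used so that the rounding step is not affected by floating-point error.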
16 Jun 2010
Posted by Steven Just. No Comments
Question: Is it OK to test a learning objective with only one question?
Answer: Yes and no. Yes, in the sense that if a learning objective is narrowly written, perhaps a single question covers the entire content of the objective. But this puts you in a somewhat awkward position when it comes to setting a passing score. Let’s assume for a moment that you have many single-question objectives on a test. Then it is quite probable that a student passing the test with a score of less than 100% will still entirely fail a number of learning objectives. In fact, his score on each of those single-question learning objectives that he got wrong will be 0%. On a 50-question test, if passing is 90%, then the student could entirely fail five single-question learning objectives and still pass the test. What to do? Try recasting some of the narrow learning objectives into broader evaluation objectives that can be tested with multiple questions, giving the student some leeway to get a question wrong but still demonstrate competency on the objective.
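To make the numbers concrete, here is a small sketch that counts how many objectives a passing student could score 0% on. The function name is invented, and the five-questions-per-objective case is an assumed example of a "broader" objective, not a recommendation from the answer above.

```python
import math
from fractions import Fraction


def max_objectives_failed_outright(num_questions, pass_percent,
                                   questions_per_objective):
    """Most objectives a passing student could score 0% on.

    Assumes every objective has the same number of questions.
    """
    needed = math.ceil(Fraction(pass_percent, 100) * num_questions)
    misses_allowed = num_questions - needed
    # Scoring 0% on an objective costs all of that objective's questions.
    return misses_allowed // questions_per_objective


print(max_objectives_failed_outright(50, 90, 1))  # 5
print(max_objectives_failed_outright(50, 90, 5))  # 1
```

With single-question objectives, the five allowed misses can wipe out five objectives. With five questions per objective, the same five misses can wipe out at most one.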
20 May 2010
Posted by Steven Just. No Comments
Question: What’s wrong with setting the passing score to be 100%?
Answer: I’ll give you three reasons.
Reason 1: It’s not valid. I can create a test that is so hard that no one will get 100% or I can create a test that is so easy everyone will get 100%. What does this prove? Nothing. Create an exam that measures the competencies required for the job and then have a panel of experts judge minimal competency (the Angoff process).
Reason 2: It won’t last anyway. Even if everyone did receive 100% (and the test was not trivially easy) people forget, especially if they have crammed to get a 100% on a test. I can guarantee you that a week or two later they will have forgotten more than half of what they learned (see the Ebbinghaus Curve of Forgetting).
Reason 3: Self-interest. If this is a high stakes test affecting hiring, promotion, or dismissal decisions, be prepared to defend your decision in court. We live in a litigious society. The people who will litigate are those who failed the test, not those who passed, and if your passing score is 100% you may have many failures.
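For readers unfamiliar with the Angoff process mentioned in Reason 1, here is a rough sketch of the basic calculation. Each judge estimates, for every question, the probability that a minimally competent test taker would answer it correctly; the passing score is the average of the judges' totals. The judges, the four-question test, and every rating below are invented for illustration, and real Angoff studies usually add discussion rounds and other refinements not shown here.

```python
from statistics import mean

# Hypothetical ratings: for each judge, the estimated probability that a
# minimally competent test taker answers each of four questions correctly.
judge_ratings = {
    "judge_a": [0.9, 0.8, 0.7, 0.6],
    "judge_b": [0.8, 0.8, 0.8, 0.6],
    "judge_c": [1.0, 0.9, 0.7, 0.7],
}


def angoff_cut_score(ratings):
    """Average the judges' expected scores and express as a proportion."""
    num_items = len(next(iter(ratings.values())))
    expected_scores = [sum(item_probs) for item_probs in ratings.values()]
    return mean(expected_scores) / num_items


cut = angoff_cut_score(judge_ratings)
print(f"Passing score: {cut:.1%}")
```

The result comes from expert judgment about the job's requirements rather than from an arbitrary target such as 100%.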
13 May 2010
Posted by Steven Just. No Comments
Question: How are test reliability and validity related?
Answer: Test reliability refers to consistency over time; test validity depends on several factors, one of which is reliability. For example: Let’s say I have a class of second graders who are studying addition. To test their understanding I give them an advanced calculus test. How will they do on the test? Not very well, of course. A week later I give them the same test. How will they do? Poorly, again. Is this test reliable? Yes. It gives consistent results over time. Is this test valid? No. So a test can be reliable but not valid, but for a test to be valid it must also be reliable.
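One common way to put a number on "consistency over time" is the correlation between scores from two administrations of the same test (test-retest reliability). The sketch below uses invented scores for five students and computes a plain Pearson correlation by hand. It only illustrates the reliability half of the answer; a high value says nothing about whether the test is valid.

```python
import math

# Hypothetical scores for the same five students, one week apart.
first_administration = [10, 20, 30, 40, 50]
second_administration = [12, 18, 33, 41, 48]


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


r = pearson_r(first_administration, second_administration)
print(f"Test-retest correlation: {r:.2f}")
```

A value close to 1 means students kept roughly the same relative standing on both occasions, which is what a reliable test should show.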