Testing is a fundamental part of the education process, but the act of coming up with the questions in those tests has largely remained unchanged for decades.
Most of the time, this simply involves thinking of challenging questions and putting them onto the test. Sometimes the process is made more rigorous by actually trialling the questions on a sample of students, but this is costly in both time and money, and is therefore rarely done.
Turning to the crowd
A team of Harvard researchers suggests in a recent paper that recruiting the crowd can make this process more efficient, allowing tests to be more rigorous without breaking the bank.
“Crowdsourcing opens up a whole new possibility for people creating tests,” they say. “And instead of taking a semester or a year, you can do it in a weekend.”
The researchers propose a two-stage process. First, a pilot study tests a large set of expert-written questions on a broad pool of respondents. Then a field test is performed on 1,000-2,000 students, with statistical analysis used to pick out the best questions.
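The paper does not spell out the statistics involved, but test developers typically screen items by two classic measures: difficulty (the proportion of respondents who answer correctly) and discrimination (how well getting the item right tracks doing well on the rest of the test). A minimal sketch of that kind of item analysis, with all names and structure being illustrative assumptions rather than anything from the paper:

```python
# Hypothetical item-analysis sketch: for each question, compute
# difficulty (proportion correct) and discrimination (point-biserial
# correlation between the item score and the rest-of-test score).
# The function name and input layout are assumptions for illustration.

def item_stats(responses):
    """responses: list of per-student lists of 0/1 item scores.

    Returns a list of (difficulty, discrimination) tuples, one per item.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    stats = []
    for j in range(n_items):
        item = [r[j] for r in responses]
        # Rest-of-test score: total correct excluding this item,
        # so the item isn't correlated with itself.
        rest = [sum(r) - r[j] for r in responses]
        difficulty = sum(item) / n_students
        mean_i = sum(item) / n_students
        mean_r = sum(rest) / n_students
        cov = sum((i - mean_i) * (r - mean_r)
                  for i, r in zip(item, rest)) / n_students
        var_i = sum((i - mean_i) ** 2 for i in item) / n_students
        var_r = sum((r - mean_r) ** 2 for r in rest) / n_students
        disc = cov / (var_i * var_r) ** 0.5 if var_i > 0 and var_r > 0 else 0.0
        stats.append((difficulty, disc))
    return stats
```

Items with moderate difficulty and high positive discrimination are the ones a screen like this would keep; items that strong test-takers get wrong as often as weak ones would be discarded.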
Where things get interesting, however, is that the first step is undertaken using a crowd recruited via Mechanical Turk. Each participant was asked to answer 25 multiple-choice questions developed for middle-school science students.
In total, 110 questions were examined using both the traditional method and the crowdsourced alternative, with the results then compared to test the effectiveness of each approach.
Passing the test
The results revealed that the crowd and the sample of real students produced a similar pool of 'best of breed' questions. While the researchers point out that the crowd should not be considered a replacement for testing with actual students, it could nonetheless serve as a useful early step, allowing a quick and effective analysis of candidate questions.
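One simple way to quantify "a similar pool of best questions" is to rank the items by discrimination within each group and measure how much the top-k sets overlap. This is a hypothetical sketch of such a comparison, not the paper's actual methodology:

```python
# Illustrative comparison of the question pools surfaced by two groups
# (e.g. crowd workers vs. real students). Each input maps an item id to
# that group's discrimination score for the item; the function returns
# the fraction of the top-k items the two rankings share.

def top_k_overlap(scores_a, scores_b, k):
    top_a = set(sorted(scores_a, key=scores_a.get, reverse=True)[:k])
    top_b = set(sorted(scores_b, key=scores_b.get, reverse=True)[:k])
    return len(top_a & top_b) / k

# Hypothetical scores: the two groups agree on q1 but split on q2/q3.
crowd = {"q1": 0.6, "q2": 0.5, "q3": 0.1}
students = {"q1": 0.55, "q2": 0.1, "q3": 0.5}
agreement = top_k_overlap(crowd, students, k=2)  # 0.5: one of two shared
```

A high overlap would support the paper's finding that the cheap crowd screen surfaces much the same set of strong items as a field test with students.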
“The key to creating good standardized tests isn’t the expert crafting of every test question at the outset, but uncovering the gems hidden in a much larger pile of ordinary rocks,” the researchers say. “Crowdsourcing, coupled with using commercially available test-analysis software, can now easily identify promising candidates for those needle-in-a-haystack items.”
Suffice it to say, this isn't a common approach at the moment, but the researchers are confident it could have a number of applications.