Assessing Knowledge in Medical School: Implementation of an Item Review Committee to Increase Assessment Item Quality

June 23, 2022

Lori M. DeShetler, PhD, Bindu Menon, PhD, and Jolene M. Miller, MLS

Medical schools must continuously refine the structure, content, and delivery of medical education to address advancements in medicine and develop innovative approaches aligned with their specific missions. Often overshadowed by larger curricular review is the evaluation of items used on internal assessments. Items developed in-house have been found to measure aspects other than medical knowledge, such as testwiseness (Downing, 2002, 2005; Jozefowicz et al., 2002). Recent research suggests that more medical schools assign item review to individual faculty and unit directors than to a committee (Menon et al., 2021), even though review by interdisciplinary committees, combined with pre-established guidelines, has been found to improve item quality (Wallach et al., 2006). This paper describes the development and early outcomes of an interdisciplinary review committee and style guide to improve items on internal assessments at a medical school in the Midwestern United States.


In 2017-18, our medical school introduced a new systems-based integrated curriculum for the first two years. With this new curriculum came the need for new internal assessments, which meant expanding our test bank by having faculty write hundreds of new assessment questions. During review of the courses the following year, the Curriculum Evaluation Committee noticed a trend across course evaluations in which students repeatedly reported errors on assessments, including spelling and grammatical errors and duplicated questions, among other issues. For example, on one course survey, a student commented, "I wish there was more spellcheck/review of the questions for the final since there were numerous errors." In another, a student stated, "It is a little disappointing that there continues to be errors in writing formative and summative assessment questions." From another course survey, "There were multiple spelling and grammar errors that made the questions hard to understand." Another respondent wrote, "There were once again numerous errors in formatting/repeat questions."


Based on the above feedback, it was clear that action was needed to improve the quality of our internal assessments and to support the individual course faculty responsible for writing questions and developing these assessments. In the fall of 2019, an item review committee (IRC) was established. Committee membership included pre-clerkship and clinical medical educators as well as faculty with assessment and editing expertise. Because not all members can attend every meeting, it was determined that every review session must include at least one pre-clerkship educator, one clinical educator, and one assessment/editorial expert.

The committee was charged with reviewing all questions in the internal assessment system for grammar, question structure, understandability, and biases (stereotypes, outdated and offensive language), and with ensuring that vignettes use language consistent with clinical settings. Given the volume of questions to be addressed, the committee prioritized the review of exam questions in scheduled exam order. The IRC developed a style guide for writing assessment questions, both to support consistent reviews and to guide the faculty writing the questions.

While the National Board of Medical Examiners presented an item-writing workshop to faculty in 2017, formal training was not ongoing. Supplementing the newly implemented orientation of new faculty by system directors, the style guide serves as a resource to help faculty develop quality assessment questions. As needed, the IRC also consulted experts on campus, such as the college’s Office of Diversity, Equity, and Inclusion, for guidance on the style guide. The style guide was revised organically as the committee worked through numerous questions and monitored student feedback on assessments.


Based on responses to open-ended questions on student surveys, at least three of the 13 pre-clerkship courses in 2018-19 received complaints about quiz and exam question construction, errors, and duplicate questions. The exam questions for these courses were reviewed the following year by the IRC, and the same error issues did not resurface in course feedback. On average, there are 100 items per exam, and IRC review takes four hours for one exam with all new questions. This time decreases substantially once questions have been reviewed.

In 2019-20, three different courses received complaints about assessment errors, but upon review, we found that two of the courses’ exams had not yet been reviewed by the IRC. In the third course, a portion of the exam questions was added late and thus was not reviewed by the IRC. This prompted the committee to establish deadlines with directors to ensure timely review before an assessment is administered. As IRC reviews proceeded in 2019-20, student feedback highlighted new concerns that the IRC added as components of its review, such as readability and accessibility of images attached to questions. Despite these new concerns, students no longer reported grammatical errors. By the end of the 2020-21 year, all existing exam questions had been reviewed by the IRC. The committee then turned its attention to quiz questions in 2021-22 along with new or modified exam questions.

The committee continues to revise the style guide for writing effective assessment questions, which has increased consistency in question formatting and reduced the number of complaints about errors on assessments. To date, student comments this academic year have not mentioned errors in spelling, grammar, or question formatting on exams and quizzes, as were frequently reported in prior years. This suggests that the careful review by the IRC is effective and improves the quality of assessment items.


A subset of IRC members conducted a survey in 2020 to determine what processes other institutions have for reviewing assessments. Findings showed that most responding institutions had a process in place, but responsibility for assessment review varied from faculty to administrators and committees. In fact, it was more common for individual faculty and unit directors to hold responsibility for assessment question review than an independent committee.

Results of our study confirmed that a review process is important. Consistent with previous research (Jones & New, 2021; Wallach et al., 2006), we determined that for our school, a committee structure for peer review of items is more beneficial than a single reviewer. An item review committee puts multiple eyes on the questions, drawing on individuals with different skill sets. Most committee members are seeing the items for the first time, allowing them to identify errors more easily than those who wrote the questions. Combining the strengths of experts in content, editing, and assessment enables holistic review. When reviewing assessment items, we have found it more efficient to invite the relevant course director to participate in the meeting; corrections and clarifications can be made on the spot rather than over email.

We realize the time commitment of members on the IRC is significant, but we cannot ignore the positive impact it has had, both on the quality of assessment questions and on the reduction of student complaints about errors in assessments. Future plans for the IRC are to continue reviewing all new items added to the question bank and to re-review vetted questions every two or three years to check for updates and new guidelines per the style guide. Despite the large time investment, the IRC has proven to be a valuable committee that has had a positive impact on the quality of assessments in our medical school.


Downing, S. M. (2002). Construct-irrelevant variance and flawed test questions: Do multiple-choice item-writing principles make any difference? Academic Medicine, 77(10), S103-S104. https://doi.org/10.1097/00001888-200210001-00032.
Downing, S. M. (2005). The effects of violating standard item writing principles on tests and students: The consequences of using flawed test items on achievement examinations in medical education. Advances in Health Sciences Education, 10(2), 133-143. https://doi.org/10.1007/s10459-004-4019-5.
Jones, L., & New, K. (2021). Establishing an item review committee: Our school's journey. Nurse Educator, 46(1), 8-9. https://doi.org/10.1097/NNE.0000000000000846.
Jozefowicz, R. F., Koeppen, B. M., Case, S., Galbraith, R., Swanson, D., & Glew, R. H. (2002). The quality of in-house medical school examinations. Academic Medicine, 77(2), 156-161. https://doi.org/10.1097/00001888-200202000-00016.
Menon, B., Miller, J., & DeShetler, L. (2021). Questioning the questions: Methods used by medical schools to review internal assessment items. MedEdPublish, 10(1), 37. https://doi.org/10.15694/mep.2021.000037.1.
Wallach, P. M., Crespo, L. M., Holtzman, K. Z., Galbraith, R. M., & Swanson, D. B. (2006). Use of a committee review process to improve the quality of course examinations. Advances in Health Sciences Education, 11(1), 61-68. https://doi.org/10.1007/s10459-004-7515-8.