Posted by: Keean Schupke
The ‘boom’ in popularity of artificial intelligence (AI) since ChatGPT broke onto the scene in November 2022 has encouraged many to be more open-minded about how AI can augment their work. In assessment, it has advanced the conversation about how we can collectively improve traditional teaching, learning and examination practices.
To date, much of the discussion on the topic of AI in assessment has broadly focused on preventing its use as a cheating aid. However, it’s important that we don’t blinker ourselves and instead realise that the integration of AI into healthcare assessment actually offers numerous opportunities for enhancing various aspects of the assessment process.
Firstly, AI has the potential to help us dive into the taxonomy of subjects, aiding in data tagging and more efficient categorisation of learning materials. With more effective tagging of these materials, assessments could be created much more quickly.
If we can group and break down knowledge to ensure we’ve covered all the necessary items from a specific area, then we can be confident that students are examined fairly across the entire breadth of required knowledge.
For example, a tutor creating a formative ‘pop quiz’ style assessment might select five random AI-generated questions drawn from the learning outcomes of lectures 1-3 of their course on ‘Anatomy 101’. This makes far better use of the tutor’s limited time and ensures a random, yet comprehensive, assessment of a student’s knowledge.
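The pop-quiz idea above can be sketched as a simple draw from a tagged question bank. The bank, its tags and the course structure below are invented for illustration; a real system would hold far richer metadata.

```python
import random

# Hypothetical question bank: each item is tagged with the lecture and
# learning outcome it covers. All entries here are illustrative only.
QUESTION_BANK = [
    {"id": 1, "lecture": 1, "outcome": "name the bones of the forearm"},
    {"id": 2, "lecture": 1, "outcome": "describe the elbow joint"},
    {"id": 3, "lecture": 2, "outcome": "identify the major arm muscles"},
    {"id": 4, "lecture": 2, "outcome": "explain muscle attachment points"},
    {"id": 5, "lecture": 3, "outcome": "trace the brachial artery"},
    {"id": 6, "lecture": 3, "outcome": "describe arm innervation"},
    {"id": 7, "lecture": 4, "outcome": "outline shoulder movements"},
]

def pop_quiz(lectures, n_questions):
    """Draw n random questions limited to the tagged lectures."""
    pool = [q for q in QUESTION_BANK if q["lecture"] in lectures]
    return random.sample(pool, min(n_questions, len(pool)))

# Five random questions covering only lectures 1-3.
quiz = pop_quiz(lectures={1, 2, 3}, n_questions=5)
```

Because the filtering happens on tags rather than on the questions themselves, the same bank can serve any combination of lectures or outcomes without manual curation.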
What’s more, the recent introduction of the Medical Licensing Assessment (MLA), which seeks to improve fairness and consistency in how prospective doctors are tested prior to joining the medical register, will bring with it more stringent curriculum requirements. This underpins the need for more efficient ‘tagging’ and clearly mapped curricula to aid the creation of more clearly defined assessment objectives and learning outcomes – an area where AI can excel.
Marking represents another area ripe for an infusion of AI. While the evaluation of longer-form answers might currently lie beyond the capabilities of this generation of AI, its potential in marking short-form assessments against its knowledge bank of course materials is notable. However, its implementation will require a parallel process where human markers are on hand to validate its accuracy.
For example, if AI markers and human markers can demonstrate a 90% correlation in the marks given, then the use of AI to grade short-form answers could significantly decrease the workload of assessors by limiting human review to only the borderline cases – i.e., those on the cusp of pass/fail. As a result, a human marker might then only have to review 100 out of 1,000 papers – saving them huge amounts of time.
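The borderline-review idea can be sketched as routing only scripts within a band of the pass mark to a human. The pass mark, band width and simulated score distribution below are all invented for illustration, not figures from any real assessment.

```python
import random

random.seed(42)  # reproducible illustration

PASS_MARK = 50
BORDERLINE_BAND = 5  # scripts within +/- 5 marks of the pass mark

# Simulate provisional AI marks for 1,000 scripts (illustrative only).
ai_marks = [random.gauss(60, 15) for _ in range(1000)]

# Only the borderline scripts are routed to a human marker;
# the rest keep their AI-assigned mark, subject to audit sampling.
needs_human_review = [
    m for m in ai_marks if abs(m - PASS_MARK) <= BORDERLINE_BAND
]

print(f"Human review needed: {len(needs_human_review)} of {len(ai_marks)}")
```

The size of the band is a policy choice: widening it trades marker time for confidence, and it could be tuned against the observed AI-human correlation.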
Critically, in a clinical subject like medicine, a student is being assessed on their capacity to practise – so it’s important that if AI is to be used, its use is supported by an extensive body of evidence confirming that A) the correct decisions are being made and B) it’s acting effectively as a ‘co-pilot’.
Its potential can also be seen in the provision of informal feedback throughout an academic year. AI can be used to read and understand the feedback and performance of a student or trainee, meaning it can also recognise patterns indicative of someone who is unlikely to pass an end-of-year exam. Given the intense resource burden on those running a course or supervising workplace training, a tutor or mentor might not have time to pay close attention to each candidate or colleague and to notice these patterns themselves.
So, AI can act as a safety net of sorts. It might not be 100% accurate in its predictions of performance, but if it can help humans understand which features indicate that someone is struggling, then flagging that individual for intervention would more than likely reduce their chance of failure.
Once flagged, AI can suggest personalised intervention plans, based on historic trends. For example, if a candidate’s progress trajectory indicates a need for increased study hours, AI can recommend specific changes based on its ability to recognise patterns.
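A minimal sketch of the flagging step might look like the rule-of-thumb checks below. The feature names, thresholds and cohort data are all invented for illustration – a real system would use a validated predictive model, not hand-picked cut-offs.

```python
# Hypothetical early-warning signals; thresholds are illustrative only.
def flag_at_risk(candidate):
    """Return True if any simple early-warning signal fires."""
    signals = [
        candidate["avg_quiz_score"] < 50,   # consistently low marks
        candidate["missed_sessions"] > 3,   # poor attendance
        candidate["score_trend"] < -5,      # marks declining over time
    ]
    return any(signals)

# Invented cohort data for demonstration.
cohort = [
    {"name": "A", "avg_quiz_score": 72, "missed_sessions": 1, "score_trend": 2},
    {"name": "B", "avg_quiz_score": 48, "missed_sessions": 0, "score_trend": -1},
    {"name": "C", "avg_quiz_score": 65, "missed_sessions": 5, "score_trend": -8},
]

flagged = [c["name"] for c in cohort if flag_at_risk(c)]
# B is flagged for low marks; C for attendance and a declining trend.
```

The value of the flag is not the prediction itself but the prompt it gives a tutor to look more closely, which keeps the human firmly in the loop.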
Looking at OSCE assessment, where the assessed content is verbal, AI could be used in its speech-to-text capacity, meaning an OSCE recording could be transcribed and compared against an approved mark scheme for an immediate draft mark.
However, in an OSCE it’s not just an assessment of how much the candidate knows, it’s also how they deliver this information. So, there will remain an important role for a human examiner to play in validating a candidate’s performance.
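The transcript-matching step above could be sketched as follows. The mark-scheme items and transcript are invented, and the substring matching is deliberately naive – in practice the transcript would come from a speech-to-text service and each item would need far more robust matching before a human examiner validates the draft.

```python
# Invented mark-scheme items for an illustrative OSCE station.
MARK_SCHEME = [
    "washes hands",
    "introduces themselves",
    "checks patient identity",
    "gains consent",
]

def draft_mark(transcript, scheme):
    """Award one provisional mark per scheme item found in the transcript."""
    text = transcript.lower()
    return [item for item in scheme if item in text]

# Invented transcript, standing in for speech-to-text output.
transcript = (
    "The candidate washes hands, introduces themselves to the patient "
    "and gains consent before starting the examination."
)

awarded = draft_mark(transcript, MARK_SCHEME)
```

Here three of the four items are found, producing a provisional score for the examiner to confirm or adjust alongside their judgement of how the candidate delivered the information.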
It’s apparent that AI’s influence is now widespread. Much like the adoption of the ‘e’ into ‘e-assessment’ at the turn of the millennium, it’s only a matter of time before e-assessment associations and education providers transition to adopt AI in their practices. There’s already some great work being done across the sector.
For more information please visit risr.global