A trial in which trainee teachers who were being taught to identify pupils with potential learning difficulties had their work ‘marked’ by artificial intelligence has found the approach significantly improved their reasoning.
The study, with 178 trainee teachers in Germany, was carried out by a research team led by academics at the University of Cambridge and Ludwig-Maximilians-Universität München (LMU Munich). It provides some of the first evidence that artificial intelligence (AI) could enhance teachers’ ‘diagnostic reasoning’: the ability to collect and assess evidence about a pupil, and draw appropriate conclusions so they can be given tailored support.
During the trial, trainees were asked to assess six fictionalised ‘simulated’ pupils with potential learning difficulties. They were given examples of each pupil’s schoolwork, as well as other information such as behaviour records and transcriptions of conversations with parents. They then had to decide whether or not each pupil had learning difficulties such as dyslexia or Attention Deficit Hyperactivity Disorder (ADHD), and explain their reasoning.
Immediately after submitting their answers, half of the trainees received a prototype ‘expert solution’, written in advance by a qualified professional, to compare with their own. This is typical of the practice material student teachers usually receive outside taught classes. The others received AI-generated feedback, which highlighted the correct parts of their solution and flagged aspects they might have improved.
After completing the six preparatory exercises, the trainees then took two similar follow-up tests, this time without any feedback. The tests were scored by the researchers, who assessed both the trainees’ ‘diagnostic accuracy’ (whether they had correctly identified cases of dyslexia or ADHD), and their diagnostic reasoning: how well they had used the available evidence to make this judgement.
The average score for diagnostic reasoning among trainees who had received AI feedback during the six preliminary exercises was an estimated 10 percentage points higher than that of trainees who had worked with the pre-written expert solutions.
The reason for this may be the ‘adaptive’ nature of the AI. Because it analysed the trainee teachers’ own work, rather than asking them to compare it with an expert version, the researchers believe the feedback was clearer. There is no evidence, however, that AI of this type would improve on one-to-one feedback from a human tutor or high-quality mentor, but the researchers point out that such close support is not always readily available to trainee teachers for repeat practice, especially those on larger courses.
The study was part of a research project within the Cambridge LMU Strategic Partnership. The AI was developed with support from a team at the Technical University of Darmstadt.
Riikka Hofmann, Associate Professor at the Faculty of Education, University of Cambridge, said: “Teachers play a critical role in recognising the signs of disorders and learning difficulties in pupils and referring them to specialists. Unfortunately, many of them also feel that they have not had sufficient opportunity to practise these skills. The level of personalised guidance trainee teachers get on German courses is different to the UK, but in both cases it is possible that AI could provide an extra level of individualised feedback to help them develop these essential competencies.”
Dr Michael Sailer, from LMU Munich, said: “Obviously we are not arguing that AI should replace teacher-educators: new teachers still need expert guidance on how to recognise learning difficulties in the first place. It does seem, however, that AI-generated feedback helped these trainees to focus on what they really needed to learn. Where personal feedback is not readily available, it could be an effective substitute.”
The study used a natural language processing system: an artificial neural network capable of analysing human language and spotting certain phrases, ideas, hypotheses or evaluations in the trainees’ text.
It was created using the responses of an earlier cohort of pre-service teachers to a similar exercise. By segmenting and coding these responses, the team ‘trained’ the system to recognise the presence or absence of key points in the solutions provided by trainees during the trial. The system then selected pre-written blocks of text to give the participants appropriate feedback.
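The feedback step described above can be illustrated with a minimal sketch. It is not the researchers’ system: the study used a trained neural network, whereas this example substitutes simple phrase matching purely to show the flow from “key point present or absent” to “pre-written feedback block”. The key-point names, cue phrases and feedback texts are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class KeyPoint:
    name: str
    cues: tuple[str, ...]  # hypothetical cue phrases, standing in for a trained classifier
    praise: str            # pre-written block shown when the key point is present
    hint: str              # pre-written block shown when the key point is absent


# Hypothetical key points; the study's actual coding scheme is not given in the article.
KEY_POINTS = (
    KeyPoint("evidence", ("schoolwork", "behaviour record"),
             "You referred to concrete evidence about the pupil.",
             "Try citing specific evidence, such as schoolwork or behaviour records."),
    KeyPoint("hypothesis", ("dyslexia", "adhd"),
             "You stated a clear hypothesis.",
             "State which learning difficulty, if any, you suspect."),
    KeyPoint("evaluation", ("because", "suggests"),
             "You linked the evidence to your conclusion.",
             "Explain how the evidence supports your conclusion."),
)


def detect(text: str) -> dict[str, bool]:
    """Mark each key point as present or absent in the trainee's text."""
    lowered = text.lower()
    return {kp.name: any(cue in lowered for cue in kp.cues) for kp in KEY_POINTS}


def select_feedback(text: str) -> list[str]:
    """Pick one pre-written feedback block per key point."""
    present = detect(text)
    return [kp.praise if present[kp.name] else kp.hint for kp in KEY_POINTS]


if __name__ == "__main__":
    for line in select_feedback("The schoolwork suggests dyslexia."):
        print(line)
```

In a system like the one the article describes, the `detect` step would be replaced by the trained model, while the selection of pre-written blocks could remain a simple lookup of this kind.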
In both the preparatory exercises and the follow-up tasks, the trial participants were either asked to work individually or assigned to randomly selected pairs. Those who worked alone and received expert solutions during the preparatory exercises scored, on average, 33% for their diagnostic reasoning during the follow-up tasks. By contrast, those who worked alone and had received AI feedback scored 43%. Similarly, the average score of trainees working in pairs was 35% if they had received the expert solution, but 45% if they had received support from the AI.
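As a quick check on these figures, the short sketch below recomputes the gap between the two feedback conditions from the average scores reported above. The dictionary layout and function name are illustrative only, and the inputs are the rounded averages given in the article rather than the underlying data.

```python
# Average diagnostic-reasoning scores (%) as reported in the article.
scores = {
    ("individual", "expert_solution"): 33,
    ("individual", "ai_feedback"): 43,
    ("pairs", "expert_solution"): 35,
    ("pairs", "ai_feedback"): 45,
}


def ai_gain(setting: str) -> int:
    """Difference, in percentage points, between AI feedback and expert solutions."""
    return scores[(setting, "ai_feedback")] - scores[(setting, "expert_solution")]


if __name__ == "__main__":
    for setting in ("individual", "pairs"):
        print(f"{setting}: +{ai_gain(setting)} percentage points")
```

Both settings show the same 10-point gap, which matches the overall estimate quoted earlier in the article.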
Training with the AI appeared to have no major effect on the trainees’ ability to diagnose the simulated pupils correctly. Instead, it seems to have made a difference by helping them to cut through the various information sources that they were being asked to read, and provide specific evidence of potential learning difficulties. This is the main skill most teachers actually need in the classroom: the task of diagnosing pupils falls to special education teachers, school psychologists, and medical professionals. Teachers need to be able to communicate and evidence their observations to specialists where they have concerns, to help students access appropriate support.
How far AI could be used more widely to support teachers’ reasoning skills remains an open question, but the research team hope to undertake further studies to explore the mechanisms that made it effective in this case, and assess this wider potential.
Frank Fischer, Professor of Education and Educational Psychology at LMU Munich, said: “In large training programmes, which are fairly common in fields such as teacher training or medical education, using AI to support simulation-based learning could have real value. Developing and implementing complex natural language-processing tools for this purpose takes time and effort, but if it helps to improve the reasoning skills of future cohorts of professionals, it may well prove worth the investment.”