The aim of this Challenge is to develop an application (app) that uses artificial intelligence and machine learning to automatically detect changes in facial expression and/or body condition, to improve the monitoring of mouse welfare.
Sponsored by AstraZeneca, GSK, CRUK Manchester Institute - University of Manchester, The Sainsbury Wellcome Centre (UCL) and Agenda Vets, this Challenge aims to develop an application (app) that uses artificial intelligence and machine learning to automatically detect changes in facial expression and/or body condition, to improve the monitoring of mouse welfare.
Background and 3Rs benefits
Millions of mice are used worldwide in research each year (EU statistical report 2019). Ensuring any pain or suffering is kept to a minimum requires careful monitoring of the animals so that appropriate action can be taken and humane endpoints implemented. Welfare is monitored using a panel of indicators such as the assessment of pain, body and coat condition, and the weight of the animal. Traditional methods of assessment based on monitoring changes in behaviour or clinical signs (e.g. weight loss) are time-consuming and have other limitations: they may not be specific to pain, and they may not be a sensitive indicator of health under certain conditions. A number of scoring systems (e.g. the mouse grimace scale and body condition scoring) have been developed that provide simple, reliable and non-invasive measures for assessing welfare at the cage side (Langford et al., 2010; Leach et al., 2012; Ullman-Cullere and Foltz, 1999), but there has not been widespread adoption.
The mouse grimace scale
All mammals communicate emotions through facial expressions and changes in these can provide a reliable and rapid means of assessing pain. The mouse grimace scale (MGS) is a graded measure for changes in facial expression related to pain (Langford et al., 2010; Leach et al., 2012) that has been adopted by some research groups and institutions to assess signs of pain following procedures. The MGS was developed based on changes in five facial action units (FAUs) – orbital tightening, ear position, cheek bulge, nose bulge and whisker position – with each FAU scored on a three-point scale from not present (0), through moderate (1), to severe (2). These action units increase in intensity as a response to post-procedural pain and can be used as part of a clinical assessment. Depending on the score, interventions can then be taken to alleviate pain and/or distress.
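As an illustration of the scoring scheme described above, the following minimal Python sketch validates a set of FAU scores and combines them into a single value. The text does not say how the five FAU scores are combined, so taking their mean is an assumption made here for illustration, and the function and key names are hypothetical.

```python
# The five facial action units (FAUs) named in the text.
FAUS = (
    "orbital_tightening",
    "ear_position",
    "cheek_bulge",
    "nose_bulge",
    "whisker_position",
)

# Three-point scale from the text: 0 = not present, 1 = moderate, 2 = severe.
VALID_SCORES = {0, 1, 2}


def mgs_score(fau_scores: dict[str, int]) -> float:
    """Combine per-FAU scores into one value.

    Assumption: the combined score is the mean of the five FAU scores.
    The source text does not specify the aggregation.
    """
    missing = [name for name in FAUS if name not in fau_scores]
    if missing:
        raise ValueError(f"missing FAU scores: {missing}")
    for name in FAUS:
        if fau_scores[name] not in VALID_SCORES:
            raise ValueError(f"{name} must be 0, 1 or 2, got {fau_scores[name]!r}")
    return sum(fau_scores[name] for name in FAUS) / len(FAUS)


# Hypothetical observation of one animal.
observation = {
    "orbital_tightening": 1,
    "ear_position": 2,
    "cheek_bulge": 0,
    "nose_bulge": 1,
    "whisker_position": 1,
}
score = mgs_score(observation)
```

Keeping the per-FAU scores alongside the combined value preserves the graded information that a binary pain/no-pain read-out would discard.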
Grimace scores can be used to assess pain in real time at the cage side. Scoring can only be carried out on awake animals, and each animal has to be observed for a short period of time to avoid scoring brief changes in facial expression that are unrelated to pain. The MGS has been used for assessing both post-operative pain (Leach et al., 2012) and the effectiveness of analgesics (Matsumiya et al., 2012) and has the advantage that it does not require the mouse to be handled.
Automating the Mouse Grimace Scale
A number of projects have worked to automate the process of assessing pain using the MGS. In 2011, the Rodents Face Finder® software was published as a tool to automate the selection of images for scoring by detecting rodent eyes and ears in images, but the grimace scale scoring was still carried out manually (Sotocinal et al., 2011). Ernst et al. also successfully automated the pre-selection of the images most suitable for manual scoring using an algorithm (Ernst et al., 2020a and 2020b). However, images selected by the algorithm are still evaluated manually. Tuttle et al. developed an automated MGS using machine learning and deep neural networks, which was shown to be highly accurate (94%) when compared to manual scoring (Tuttle et al., 2018). However, the current model only detects grimacing in albino mice and only provides a binary read-out (pain or no pain) rather than a precise MGS score. Andresen et al. developed automated facial expression recognition software using deep learning neural networks to assess post-anaesthetic and/or post-surgical effects in mice (Andresen et al., 2020). Like the model of Tuttle et al., the software only provides a binary read-out (e.g. post-anaesthetic/surgical effect, no post-anaesthetic/surgical effect) and has only been used in black-furred mice (C57BL/6JRj).
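To make "accuracy when compared to manual scoring" concrete, the sketch below computes the fraction of images on which a binary automated read-out matches the manual label. It is a minimal illustration only: the function name and the example labels are hypothetical and are not taken from any of the cited studies.

```python
def agreement(automated: list[bool], manual: list[bool]) -> float:
    """Fraction of images where the automated binary read-out
    (True = pain, False = no pain) matches the manual label."""
    if len(automated) != len(manual):
        raise ValueError("both label lists must have the same length")
    if not automated:
        raise ValueError("at least one labelled image is required")
    matches = sum(a == m for a, m in zip(automated, manual))
    return matches / len(automated)


# Hypothetical labels for four images (not real study data).
automated_labels = [True, False, True, True]
manual_labels = [True, False, False, True]
accuracy = agreement(automated_labels, manual_labels)
```

A single agreement figure like this says nothing about graded severity, which is why a binary read-out is described above as a limitation compared with a full MGS score.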
Body condition scoring
Body condition scoring (BCS) is a method to assess the welfare of mice without relying solely on measuring weight. Weight loss, measured as a percentage decline from initial weight or compared with the weight of age-matched controls, is commonly used as a criterion for welfare assessment. However, weight loss may not always be a sensitive indicator of animal health. For example, studies that create physiological changes, such as intraperitoneal fluid retention or tumour growth, may mask weight loss. BCS grades the amount of flesh covering bony protuberances, on palpation or by visual assessment, and correlates with potential changes in the health of the mouse (Ullman-Cullere and Foltz, 1999). It uses clinical indicators that are scored as degree of deviation from normal, thereby allowing an animal to be monitored over time as health declines (Ullman-Cullere and Foltz, 1999). Body condition is scored on a scale of one (emaciated) to five (obese). BCS is particularly useful for mice with tumours or ascites, where changes in weight may be misleading. The BCS is typically carried out twice a day and requires the home cage to be taken off the rack and the lid to be removed so that the animal, including its movement and behaviour, can be fully assessed. The BCS can sometimes also require handling to palpate body condition, which can be stressful for the animal. There is currently no automated or semi-automated approach to BCS that would avoid the need to handle the mice.
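The relationship between the two measures described above can be sketched in a few lines of Python: weight loss as a percentage decline from initial weight, alongside a BCS on the one-to-five scale. The class and function names are hypothetical, and the two thresholds are placeholder values chosen only for illustration; they do not come from the source or from any guideline.

```python
from dataclasses import dataclass


@dataclass
class WelfareCheck:
    initial_weight_g: float
    current_weight_g: float
    bcs: int  # body condition score: 1 (emaciated) to 5 (obese)

    def __post_init__(self) -> None:
        if self.initial_weight_g <= 0:
            raise ValueError("initial weight must be positive")
        if not 1 <= self.bcs <= 5:
            raise ValueError("BCS must be between 1 and 5")

    def percent_weight_loss(self) -> float:
        """Percentage decline from initial weight (negative if weight was gained)."""
        lost = self.initial_weight_g - self.current_weight_g
        return lost / self.initial_weight_g * 100.0


def needs_review(check: WelfareCheck,
                 max_loss_pct: float = 15.0,  # placeholder threshold, an assumption
                 min_bcs: int = 2) -> bool:   # placeholder threshold, an assumption
    """Flag an animal if either indicator crosses its (hypothetical) threshold."""
    return check.percent_weight_loss() >= max_loss_pct or check.bcs <= min_bcs


# Hypothetical tumour-bearing mouse: tumour growth masks the weight loss,
# but the low BCS still flags the animal.
masked = WelfareCheck(initial_weight_g=25.0, current_weight_g=24.5, bcs=2)
flagged = needs_review(masked)
```

Checking both indicators reflects the point made in the text: weight alone can look stable while body condition declines.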
Both the MGS and BCS are subjective, can be labour-intensive and rely on the experience of the staff carrying out the observation, potentially leading to inconsistencies in scoring. They also require staff to be trained, and how this training is given may vary. As a result, the MGS and BCS have not been widely adopted. There is a need to automate the process of scoring to deliver increased consistency and accuracy in welfare assessments, and to facilitate their widespread use.
This Challenge aims to create a facial and/or body recognition app based on artificial intelligence and machine learning that automates detection of pain and body condition in mice and is simple and fast to use in an animal facility setting. The worldwide availability of the tool could help improve and harmonise decisions regarding animal welfare and humane endpoints, reducing individual suffering in large numbers of mice used in scientific studies and procedures, while at the same time improving the quality of scientific data when pain and discomfort can be mitigated. It could also reduce staff time and is likely to be adopted widely if it is quick, accurate and easy to use.
Review and Challenge Panel membership
| Name | Organisation |
|---|---|
| Professor Jon Timmis | University of Sunderland |
| Dr Sally Robinson (Sponsor) | AstraZeneca |
| Dr Elin Holmedal (Sponsor) | AstraZeneca |
| Mrs Sam Izzard (Sponsor) | GSK |
| Dr Joanna Moore (Sponsor) | GSK |
| Mr Stuart Pepper (Sponsor) | CRUK Manchester Institute - University of Manchester |
| Dr Eleni Amaniti (Sponsor) | Sainsbury Wellcome Centre - UCL |
| Dr Lucy Whitfield (Sponsor) | Agenda Vets |
| Professor Miroslaw Bober | University of Surrey |
| Dr James Brown | University of Lincoln |
| Dr Zahid Latif | Modify Medical Services Ltd |
| Professor Alexander Mathis | École polytechnique fédérale de Lausanne |
| Professor Ioannis Patras | Queen Mary University London |