MA932 Projects 2022-2023

PROJECT A. Understanding collective motion
PROJECT B. Using age data to improve targeted intervention campaigns against African sleeping sickness in Guinea
PROJECT C. Drift and bias in AI triage tools for GP consultations

PROJECT A. Understanding collective motion
Internal: Prof Matthew Turner & Dr Gareth Alexander
External: Dr Ananyo Maitra & Dr Fernando Peruani, Universite Paris Cergy-Pontoise

Motion involving many individual particles, bodies or agents is ubiquitous in nature generally and in human societies in particular. Examples of collective motion in nature include the motion of flocks of birds, swarms of insects or shoals of fish. In human societies we might be interested in the behaviour of crowds or the motion of many vehicles on a transport network. More recently, algorithms to control driverless cars or drone swarms have become industrially important.
This project will involve students undertaking the following steps:
1. A review of the literature on collective motion in some of these systems in order to gain an idea of the state of the art in modelling of collective motion. This will itself have two components. (i) Agent-based models in which an equation of motion for each of the N agents is developed. The agents are usually taken to be identical and hence these are leaderless or internally coordinated, rather than externally controlled, systems. (ii) Continuum models that treat the velocity and density of the particles as fields that are governed by appropriate PDEs. In some cases it is possible to take agent-based models and show formally how they can be coarse-grained into a continuum description.
2. A coding exercise: a simple agent-based collective motion algorithm is to be coded (e.g. the Vicsek model). This can be compared with its relevant continuum description, the Toner-Tu model.
3. An analysis of possible refinements of these models: Do collisions occur in these models? Would this be a desirable feature in autonomous robotics? Can simple algorithms be identified or developed to reduce collisions? How would this modify the continuum description? Can this problem be expressed in the language of control theory, and in what way might an individual’s optimal behaviour differ from the globally optimal behaviour?
4. A review of how such algorithms might be useful, how they could be refined further or extended and an analysis of future prospects.
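
As a starting point for the coding exercise in step 2, the following is a minimal sketch of a two-dimensional Vicsek-type update, written with the Python standard library only. It assumes a periodic square box, constant speed, alignment with all neighbours within a fixed radius (including the agent itself) and uniform angular noise. The function names and default parameter values are illustrative choices, not part of the project brief.

```python
import math
import random


def vicsek_step(pos, theta, box, radius, speed, noise, rng):
    """One synchronous update of a 2D Vicsek-type model in a periodic box."""
    n = len(pos)
    new_theta = []
    for i in range(n):
        sx = sy = 0.0
        for j in range(n):
            # Minimum-image displacement between agents i and j.
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            dx -= box * round(dx / box)
            dy -= box * round(dy / box)
            if dx * dx + dy * dy <= radius * radius:
                sx += math.cos(theta[j])
                sy += math.sin(theta[j])
        # Mean neighbour heading plus noise drawn uniformly from [-noise/2, noise/2).
        new_theta.append(math.atan2(sy, sx) + noise * (rng.random() - 0.5))
    # Move every agent at constant speed along its new heading.
    new_pos = [
        ((x + speed * math.cos(t)) % box, (y + speed * math.sin(t)) % box)
        for (x, y), t in zip(pos, new_theta)
    ]
    return new_pos, new_theta


def polar_order(theta):
    """Magnitude of the mean unit velocity: near 0 if disordered, 1 if aligned."""
    n = len(theta)
    cx = sum(math.cos(t) for t in theta) / n
    cy = sum(math.sin(t) for t in theta) / n
    return math.hypot(cx, cy)


def simulate(n=100, box=5.0, radius=1.0, speed=0.03, noise=0.1, steps=200, seed=0):
    """Run the model from random initial positions and headings."""
    rng = random.Random(seed)
    pos = [(rng.uniform(0, box), rng.uniform(0, box)) for _ in range(n)]
    theta = [rng.uniform(-math.pi, math.pi) for _ in range(n)]
    for _ in range(steps):
        pos, theta = vicsek_step(pos, theta, box, radius, speed, noise, rng)
    return pos, theta
```

Sweeping `noise` or the density `n / box**2` and recording `polar_order` is one way to look for the order-disorder transition that the continuum description also addresses. The double loop costs O(N^2) per step, so a cell list or a vectorised implementation would be needed for large N. Note that nothing in this update prevents agents from overlapping, which bears on the collision questions in step 3.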

PROJECT B. Using age data to improve targeted intervention campaigns against African sleeping sickness in Guinea
Internal: Dr Kat Rock
External: Dr Paul Bessell (TrypaNO! partnership)

Human African trypanosomiasis (HAT), otherwise known as sleeping sickness, is a parasitic vector-borne infection found in West and Central Africa which is typically fatal if left untreated. Over the last two decades, a large effort has been made to reduce case burden and deaths, mainly through diagnosis and treatment of cases. Earlier identification and treatment benefits individuals infected with the parasite, as it can reduce suffering and the probability of dying from infection, and it also helps at the population level by reducing the time spent infectious and hence onward transmission. Unfortunately, HAT is not vaccine-preventable, but interventions to reduce the vector (tsetse fly) population can help reduce levels of transmission.
With the current combination of tools (diagnostic tests, new and easier treatments, and targets to kill flies), intervention strategies have done well to reduce case numbers to a historically low level in countries like Guinea in West Africa. However, the World Health Organization has now set a target to eliminate transmission to humans by 2030, and with dwindling case numbers, new approaches might be needed to reach this goal or to reach it using fewer resources.
Two possible options to improve diagnosis and treatment are (i) to better target screening towards people believed to be at higher risk of infection, and (ii) to treat people with a new drug when they are positive on a single blood test (rather than requiring several extra tests to be performed). These options also present challenges: for example, it is usually people we believe to be at low risk who participate in testing events, and treating on the basis of a single blood test might lead us to “overtreat” a lot of people who aren’t really infected. In this RSG project you’ll use age-structured data for different regions of Guinea to address the question of how to optimally test and treat populations for HAT infection, taking into account factors such as the additional effort needed to screen high-risk people.
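
The overtreatment concern can be made concrete with the positive predictive value (PPV) of a single test, which follows from Bayes' rule. The sketch below is illustrative only: the age groups, prevalences, sensitivity and specificity are hypothetical placeholders chosen to show the arithmetic, not values from the Guinea data or from any HAT diagnostic.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability that a test-positive person is truly infected (Bayes' rule)."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    total_pos = true_pos + false_pos
    return true_pos / total_pos if total_pos > 0 else float("nan")


def expected_treatments(n_screened, prevalence, sensitivity, specificity):
    """Expected (correct, unnecessary) treatments if all test-positives are treated."""
    correct = n_screened * prevalence * sensitivity
    unnecessary = n_screened * (1.0 - prevalence) * (1.0 - specificity)
    return correct, unnecessary


# Hypothetical age groups: (label, number screened, assumed prevalence).
# These are placeholders for illustration, not data from Guinea.
AGE_GROUPS = [("0-14", 4000, 0.0002), ("15-49", 5000, 0.001), ("50+", 1000, 0.0005)]
SENSITIVITY, SPECIFICITY = 0.9, 0.99  # assumed test characteristics

for label, n, prev in AGE_GROUPS:
    ppv = positive_predictive_value(prev, SENSITIVITY, SPECIFICITY)
    correct, unnecessary = expected_treatments(n, prev, SENSITIVITY, SPECIFICITY)
    print(f"{label}: PPV={ppv:.3f}, correct={correct:.1f}, unnecessary={unnecessary:.1f}")
```

Because prevalence is low, even a highly specific test yields mostly false positives, and the balance shifts with the prevalence in the group screened. This is why targeting screening by age or risk group and the choice of treatment algorithm interact.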

PROJECT C. Drift and bias in AI triage tools for GP consultations
Internal: Prof Magnus Richardson
External: Dr Youssef Taleb, Spectra Analytics

There exists a well-known issue in machine learning, known as model drift. This is a phenomenon characterized by a degradation in the performance of a pre-trained AI model, driven by changes over time in the properties/meaning of the dependent/target variable (known as ‘concept drift’) and/or the statistical distributions of one or more independent/predictor variables (known as ‘data drift’). If left undetected, such performance decay could lead to the model losing too much predictive power to be reliable anymore.
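
One simple way to look for data drift in a single numeric predictor is to compare its distribution in a reference window (for example, the training period) with a recent window using the two-sample Kolmogorov-Smirnov statistic. The sketch below is a minimal standard-library illustration under that assumption. The function names are hypothetical, the threshold uses the usual large-sample approximation, and this check says nothing about concept drift, which requires labelled outcomes.

```python
import bisect
import math


def ks_statistic(reference, current):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    ref = sorted(reference)
    cur = sorted(current)
    gap = 0.0
    # The supremum of |F_ref - F_cur| is attained at one of the sample points.
    for x in ref + cur:
        f_ref = bisect.bisect_right(ref, x) / len(ref)
        f_cur = bisect.bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(f_ref - f_cur))
    return gap


def ks_drift_flag(reference, current, alpha=0.05):
    """Return (statistic, flagged) using the asymptotic critical value."""
    n, m = len(reference), len(current)
    critical = math.sqrt(-0.5 * math.log(alpha / 2.0)) * math.sqrt((n + m) / (n * m))
    d = ks_statistic(reference, current)
    return d, d > critical
```

In practice one would run such a check per feature and per time window, and correct for the resulting multiple comparisons before raising an alert.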
Another, more recent concern around AI is bias. When protected attributes such as sex or ethnicity are directly or indirectly used in predictive models to make decisions in areas such as advertising or healthcare, training a ‘best’ model from a pure performance point of view could lead to biases being introduced in the model’s learned behaviour. There have been initial attempts to define AI ‘fairness’ with mathematical/probabilistic formulations, mainly for models that perform classification (i.e. a finite, discrete outcome). Linking back to model drift, this raises a likely scenario where the ‘fairness’ of a model degrades over time because of changes in the underlying distributions of the target population’s attributes.
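
As one example of such a formulation, demographic parity for a binary classifier asks that the rate of positive predictions be the same across groups defined by a protected attribute. The sketch below computes the gap between the highest and lowest group rates, and tracks that gap over successive time windows to illustrate how fairness could drift. It is only one of several possible metrics, and the function names and data layout are assumptions made for illustration.

```python
def demographic_parity_difference(y_pred, group):
    """Return (gap, rates): the largest difference in positive-prediction rate
    between any two groups, and the per-group rates themselves."""
    by_group = {}
    for pred, g in zip(y_pred, group):
        by_group.setdefault(g, []).append(pred)
    rates = {g: sum(preds) / len(preds) for g, preds in by_group.items()}
    return max(rates.values()) - min(rates.values()), rates


def parity_gap_by_window(windows):
    """Gap for each time window; `windows` is a list of (y_pred, group) pairs."""
    return [demographic_parity_difference(y_pred, group)[0] for y_pred, group in windows]
```

For instance, `parity_gap_by_window` applied to monthly batches of predictions gives a time series whose upward trend would indicate degrading fairness under this particular metric, even if overall accuracy is stable.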
We are interested in exploring ways to detect and control for model drift with respect to one choice of fairness metric, without significantly sacrificing model performance. This would help keep AI models such as those used by PATCHS (Spectra Analytics' AI-based triage tool for online GP consultations) as robust and unbiased as possible over time.
Some references:
A Survey on Bias and Fairness in Machine Learning
A Clarification of the Nuances in the Fairness Metrics Landscape
Systematic Review of Approaches to Preserve Machine Learning Performance in the Presence of Temporal Dataset Shift in Clinical Medicine
Protect against model drift (IBM)
Model Drift in Machine Learning: How to Detect and Avoid It