27th September 2022


Welcome to the RAPPORT project blog!


RAPPORT stands for ‘Developing a Rapid AI-based Policy Probing and Observational Research Tool’.


The project is funded by Wellcome via their Mental Health Data Prize:

https://wellcome.org/grant-funding/schemes/wellcome-mental-health-data-prize

I am really excited about this project. It is an amazing opportunity to apply and publicise new, powerful data science methods for mental health research. Thank you, Wellcome! 😀

I am Paul Tiffin, the principal investigator (PI) for the project. Before I let the other team members introduce themselves, let me give you an overview of the project:

It’s important that treatments and health policies actually improve well-being and health. Sometimes well-intentioned healthcare interventions don’t work for some groups of people, or can even cause unintended harm.

Scientific research is vital if we are to have evidence about whether such interventions are likely to work or not, and for which groups of people. Traditionally, such evidence has often been provided by ‘randomised controlled trials’ (RCTs). RCTs are experiments in which one group of people receives a particular treatment, and another group is offered an alternative treatment, or no active treatment at all (e.g. a ‘placebo’, or sugar pill). Who gets which treatment is decided by chance, hence the term ‘randomised’ controlled trial. In this way the actual causal effect of the experimental treatment, compared with any alternative or placebo, can be worked out. This is because any other characteristics of the trial participants that may be associated with the health outcome of interest should, thanks to the randomisation, be similar between the two groups.

However, RCTs can take a long time to set up and to report their findings. They are also very expensive to run, often costing several million pounds.

We now have lots of information on patients, in the form of data from everyday practice and from research studies. However, unlike in RCTs, in the real world patients aren’t given treatments according to chance. This can make it very difficult to work out whether a particular health policy or treatment has actually caused an improvement (or even a worsening) in health.
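
To make this concrete, here is a toy simulation in Python (purely illustrative, with made-up numbers unrelated to any real data). When sicker people are more likely to receive a treatment, a naive comparison of treated and untreated people can make a genuinely helpful treatment look harmful.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
severity = rng.normal(size=n)                    # a confounder: how unwell someone is
# Sicker people are more likely to be treated (no randomisation here)
treated = rng.binomial(1, 1 / (1 + np.exp(-2 * severity)))
# The treatment genuinely helps (+0.5), but greater severity worsens the outcome
outcome = -severity + 0.5 * treated + rng.normal(size=n)

naive = outcome[treated == 1].mean() - outcome[treated == 0].mean()
print(f"Naive treated-vs-untreated difference: {naive:.2f} (true effect: +0.50)")
```

Run as written, the naive comparison comes out strongly negative, even though the simulated treatment genuinely helps, simply because the treated group started off more unwell.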

However, new statistical methods allow us to use routine health data, and data from scientific studies (even those that don’t involve randomisation), to understand whether new policies, practices or treatments cause improvements in health. Moreover, we can now use mathematical approaches to work out which interventions are likely to work for whom.

Machine learning is a branch of artificial intelligence (AI), in which decisions can be made automatically without a human being involved. In machine learning, computers learn to recognise patterns in data in order to make predictions.

In the RAPPORT project we will use machine learning in two main ways. Firstly, we will combine machine learning with the traditional statistical methods used to understand population health (‘epidemiology’). This approach will enable us to better understand the causal effects of a mental health intervention, not just whether it is associated with better outcomes. The approach is known as ‘targeted learning’ and appears to be more effective for this purpose than existing statistical methods. Targeted learning has started to be applied to the understanding of physical health issues, but there are very few examples of it being used in mental health research.
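
To give a flavour of what this looks like in practice, below is a minimal sketch of targeted maximum likelihood estimation (TMLE), the best-known targeted learning method, applied to simulated data. Everything here is an illustrative assumption rather than our analysis plan: the variables are invented, and simple logistic regressions stand in for the flexible machine learning models (‘Super Learners’) a real targeted learning analysis would use.

```python
import numpy as np
import statsmodels.api as sm
from scipy.special import expit, logit
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
W = rng.normal(size=(n, 2))                                  # baseline covariates
A = rng.binomial(1, expit(0.4 * W[:, 0] - 0.3 * W[:, 1]))    # non-random 'exposure'
Y = rng.binomial(1, expit(-0.5 + 0.8 * A + 0.5 * W[:, 0] + 0.3 * W[:, 1]))

# Step 1: initial outcome model Q(A, W) = P(Y = 1 | A, W)
Q_fit = LogisticRegression().fit(np.column_stack([A, W]), Y)
Q1 = Q_fit.predict_proba(np.column_stack([np.ones(n), W]))[:, 1]
Q0 = Q_fit.predict_proba(np.column_stack([np.zeros(n), W]))[:, 1]
QA = np.where(A == 1, Q1, Q0)

# Step 2: propensity model g(W) = P(A = 1 | W), clipped for stability
g = np.clip(LogisticRegression().fit(W, A).predict_proba(W)[:, 1], 0.025, 0.975)

# Step 3: the 'targeting' step: fluctuate Q along the 'clever covariate' H
H = A / g - (1 - A) / (1 - g)
eps = sm.GLM(Y, H.reshape(-1, 1), family=sm.families.Binomial(),
             offset=logit(QA)).fit().params[0]

# Step 4: updated (targeted) predictions give the average treatment effect
Q1_star = expit(logit(Q1) + eps / g)
Q0_star = expit(logit(Q0) - eps / (1 - g))
print("Estimated average treatment effect:", round((Q1_star - Q0_star).mean(), 3))
```

The ‘targeting’ step (Step 3) is what distinguishes TMLE from simply fitting a prediction model: it nudges the initial estimates so that the final average treatment effect estimate is less biased and has well-understood statistical properties.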

The second way we will use machine learning is to understand how different groups of people respond differently to an intervention or treatment. The methods we will use here are known as ‘causal forests’. They work by learning rules that predict who is likely to respond most positively to an intervention. This means we can identify the groups of people most likely to benefit from a certain policy or treatment.
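
The sketch below shows the shape of such an analysis, fitting a causal forest to simulated data with the open-source econml package. The package choice and the variables are assumptions made for this illustration, not a statement of what we will actually use.

```python
import numpy as np
from econml.dml import CausalForestDML
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier

rng = np.random.default_rng(0)
n = 4000
X = rng.normal(size=(n, 3))            # characteristics that may modify the effect
T = rng.binomial(1, 0.5, size=n)       # who received the 'intervention'
# The true benefit is larger when X[:, 0] > 0; the forest should detect this
Y = (0.5 + 1.0 * (X[:, 0] > 0)) * T + X[:, 1] + rng.normal(size=n)

forest = CausalForestDML(model_y=RandomForestRegressor(min_samples_leaf=20),
                         model_t=RandomForestClassifier(min_samples_leaf=20),
                         discrete_treatment=True, random_state=0)
forest.fit(Y, T, X=X)
effects = forest.effect(X)             # one estimated effect per person
print("Mean estimated effect, X0 <= 0:", round(effects[X[:, 0] <= 0].mean(), 2))
print("Mean estimated effect, X0 > 0:", round(effects[X[:, 0] > 0].mean(), 2))
```

The two printed group means should differ noticeably, recovering the built-in fact that people with X0 > 0 benefit more. On real data the forest searches over many characteristics at once to find such groups.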

The initial Discovery phase of the project will apply these new approaches to the Millennium Cohort Study data to assess the impact of childhood physical activity (an ‘active ingredient’) on depression. The Millennium Cohort Study (also known as ‘Child of the New Century’) is following the lives of around 19,000 young people born in the UK in 2000-02. It contains a lot of information on health and lifestyle.

For the ‘Prototyping’ phase, if funded, we plan to develop proof-of-concept tools that will show how we can help other mental health researchers access these new, powerful techniques through readily available, user-friendly software.

Our work will involve and engage experts with relevant lived experience from the start. Their knowledge and insights will help inform our approach to data analysis, as well as how we make sense of, report and communicate our findings.

Science in general, and machine learning research in particular, has recently been criticised for a lack of transparency and reproducibility. By this, people mean that researchers do not always give all the details required for other scientists to replicate their results and so show that they are likely to be true and accurate. Working with experts by experience and other stakeholders, we will establish a framework for the transparent and replicable implementation of this approach. This will involve ensuring we report all the relevant details of our methods alongside our findings. It will also involve clearly documenting all the computer code we use and making it publicly available. This will help other scientists in the field understand what we did and how we got the results we report. More generally, it will set high standards for reporting these kinds of studies, encouraging and supporting better practice in machine learning-based research.

Overall, our project is ambitious but feasible. Specifically, we aim to understand the causal impact of ‘key ingredients’ related to young people’s mental health. More generally, we intend to make new, accessible digital tools available to the research community and to set a new standard for transparency and reproducibility in reporting machine learning-based studies.

