6th January 2023


Happy New Year!

 

Over the Christmas period I looked again at the mathematics behind targeted learning. 


What is the 'magic' 🐇🎩 behind this technique?

 

I was already familiar with machine learning, so the first part was clear to me. This is where the propensity scores and the predicted probabilities for the outcomes are generated, usually through ‘superlearning’.
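
To make that first stage concrete, here is a minimal sketch in Python. It uses scikit-learn's StackingClassifier as a simple stand-in for a full super learner, and the simulated covariates W, treatment A and outcome Y are purely illustrative.

```python
# A sketch of stage one: fitting the propensity score model and the initial
# outcome model. StackingClassifier stands in for a full super learner
# (a cross-validated ensemble of candidate learners).
import numpy as np
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
W = rng.normal(size=(n, 3))                            # baseline covariates
A = rng.binomial(1, 1 / (1 + np.exp(-W[:, 0])))        # binary treatment
Y = rng.binomial(1, 1 / (1 + np.exp(-(W[:, 1] + A))))  # binary outcome

def super_learner():
    # A small library of candidate learners combined by a logistic meta-learner
    return StackingClassifier(
        estimators=[("lr", LogisticRegression(max_iter=1000)),
                    ("rf", RandomForestClassifier(n_estimators=200))],
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )

# Propensity scores: predicted P(A = 1 | W)
g = super_learner().fit(W, A).predict_proba(W)[:, 1]

# Initial outcome predictions: predicted P(Y = 1 | A, W)
AW = np.column_stack([A, W])
q0 = super_learner().fit(AW, Y).predict_proba(AW)[:, 1]
```

Any library of candidate learners could be swapped into the ensemble; the point is simply that cross-validated stacking produces the propensity scores and initial outcome predictions that the next stage updates.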

 

However, I had never previously been quite sure how the ‘updating’ step worked.

 

Thankfully I found this helpful video from Susan Gruber, which explains the maths behind the ‘updating’ stage in detail.

 

https://youtu.be/8Q9dfW3oOi4

 

There is also a helpful paper that explains the targeted learning process in detail here.

The key point is that, once the machine learning predictions have been derived, they can be further updated by essentially regressing the residuals onto the observed outcomes for a validation data set. This involves using the predictions as an ‘offset’ in a logistic regression. In this context an ‘offset’ is a term in the regression equation whose coefficient is fixed in advance rather than estimated from the data. This means that if there is any signal in the residuals then it will be used to update the predictions, whereas if the residuals are merely random noise then little or no updating will occur.
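
As an illustration, here is a minimal sketch of one common form of this updating step for a binary outcome. It regresses the observed outcomes on a ‘clever covariate’ built from the propensity scores, with the logit of the initial predictions entering as the offset; the toy data and the variable names (q0, g, h, eps) are mine, not taken from any particular package.

```python
# A sketch of the 'updating' (targeting) step for a binary outcome, using
# toy inputs in place of the stage-one output: q0 are the initial outcome
# predictions and g the propensity scores.
import numpy as np
import statsmodels.api as sm
from scipy.special import expit, logit

rng = np.random.default_rng(0)
n = 1000
a = rng.integers(0, 2, n)         # observed binary treatment
y = rng.integers(0, 2, n)         # observed binary outcome
q0 = rng.uniform(0.1, 0.9, n)     # initial predictions of P(Y = 1 | A, W)
g = rng.uniform(0.2, 0.8, n)      # propensity scores P(A = 1 | W)

# 'Clever covariate' built from the propensity scores
h = a / g - (1 - a) / (1 - g)

# Logistic regression of the observed outcome on the clever covariate,
# with logit(q0) supplied as an offset: its coefficient is fixed at one,
# so only the fluctuation parameter epsilon is estimated.
fit = sm.GLM(y, h.reshape(-1, 1),
             family=sm.families.Binomial(),
             offset=logit(q0)).fit()
eps = fit.params[0]

# Updated (targeted) predictions: if epsilon is near zero, the residuals
# carried no systematic signal and the predictions barely move.
q1 = expit(logit(q0) + eps * h)
print(f"epsilon = {eps:.4f}")
```

If eps comes out close to zero, the initial predictions are essentially left alone, which is exactly the behaviour described above.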

 

The elegance of the targeted learning approach is that one does not need to specify a multivariable model to obtain an estimate of the parameter of interest (usually the causal effect of the treatment or exposure). Also, even if the initial machine learning estimates are biased, there is a second chance to correct them via the updating process.
