This repository contains the Stata and R code used in methodological research by Clare et al.

Comparison of methods of adjusting for time-varying confounding under misspecification – A Monte-Carlo simulation study. The Stata code creates a series of quasi-random datasets using a pre-specified data structure. The analysis code runs all analyses on those datasets and saves the results. Note that the code is written to run on Google Compute clusters running Linux. To run it on a Windows-based machine, the way parallel processing is set up needs to be changed, because Windows is not compatible with ‘FORK’.
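As a hedged illustration of the Windows caveat above, the sketch below chooses the cluster type from the operating system using R's `parallel` package. `run_iteration` and `n_cores` are hypothetical stand-ins, and the repository's own scripts may set up parallel processing differently.

```r
library(parallel)

# Hypothetical stand-in for one simulation iteration (not the authors' code).
run_iteration <- function(i) {
  set.seed(i)
  mean(rnorm(100))
}

n_cores <- 2  # assumed value; adjust to the machine

if (.Platform$OS.type == "windows") {
  # PSOCK workers start as fresh R sessions, so objects they need
  # must be exported to them explicitly.
  cl <- makeCluster(n_cores, type = "PSOCK")
  clusterExport(cl, "run_iteration", envir = environment())
} else {
  # FORK workers inherit the parent session's workspace (Unix-alikes only).
  cl <- makeCluster(n_cores, type = "FORK")
}

results <- parLapply(cl, 1:4, run_iteration)
stopCluster(cl)
```

With a PSOCK cluster, any packages used inside the worker function also have to be loaded on each worker, for example with `clusterEvalQ(cl, library(...))`.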

Two types of standard error estimates were used, so two sets of analysis code are included. The first calculates standard errors using bootstrapping. The second calculates model-based standard errors, using influence curves for TMLE.
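The difference between the two approaches can be illustrated on a toy estimator. The sketch below is neither TMLE nor the study code: it assumes a simulated binary outcome and uses the sample mean, whose influence curve is each observation's deviation from that mean. All names (`y`, `B`, `se_boot`, `se_ic`) are illustrative.

```r
set.seed(2024)

n <- 500
y <- rbinom(n, 1, 0.3)  # simulated toy outcome, not the study data

estimate <- mean(y)

# Bootstrap SE: re-estimate on B resamples drawn with replacement,
# then take the standard deviation of the resampled estimates.
B <- 1000
boot_est <- replicate(B, mean(sample(y, size = n, replace = TRUE)))
se_boot <- sd(boot_est)

# Influence-curve SE: for the sample mean, IC_i = y_i - mean(y),
# and the SE is sqrt(var(IC) / n).
ic <- y - estimate
se_ic <- sqrt(var(ic) / n)

print(c(bootstrap = se_boot, influence_curve = se_ic))
```

For this simple estimator the two standard errors should be close. The bootstrap costs `B` re-estimations, whereas the influence-curve version needs only one pass over the data.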

Description | GitHub code | Download code |
---|---|---|
S1 - Data creation Stata Code | Data creation code | Download code |
S2 - Analysis with bootstrap SEs - R Code | Analysis code - Bootstrap | Download code |
S3 - Analysis with model-based/influence curve SEs - R Code | Analysis code - Alternative | Download code |

Comparison of methods of adjusting for time-varying confounding with missing data – A Monte-Carlo simulation study. The Stata code creates a series of quasi-random datasets (three different datasets were used in the simulation) using a pre-specified data structure. The analysis code runs all analyses on those datasets and saves the results. Note that the code is written to run on the UNSW Katana computing cluster, which uses a scheduler to sequentially call the R script and pass it the particular iterations of the data to be processed in each step. To run the code on a standard computer, edit it so that the parameters passed by the Katana scheduler are defined internally.
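One way to make a script run both under a scheduler and on a standard computer is sketched below. It assumes the scheduler passes a first and last iteration number as command-line arguments. The repository's scripts may receive their parameters differently, and `get_iteration_range` and its defaults are hypothetical.

```r
# Hypothetical helper: use scheduler-supplied arguments when present,
# otherwise fall back to values defined inside the script.
get_iteration_range <- function(args = commandArgs(trailingOnly = TRUE),
                                default_start = 1L,
                                default_end = 10L) {
  if (length(args) >= 2) {
    start <- as.integer(args[1])
    end <- as.integer(args[2])
  } else {
    start <- default_start
    end <- default_end
  }
  if (is.na(start) || is.na(end) || start > end) {
    stop("Iteration bounds must be integers with start <= end.")
  }
  seq.int(start, end)
}

# As if called by a scheduler: Rscript analysis.R 11 20
from_scheduler <- get_iteration_range(c("11", "20"))

# As if run interactively with no arguments: the internal defaults apply
local_run <- get_iteration_range(character(0))
```

Inside an analysis script, calling `get_iteration_range()` with no arguments would read whatever was supplied on the command line.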

Description | Code |
---|---|
S1 - Data creation of Dataset 1 - Stata Code | Data creation code |
S2 - Data creation of Dataset 2 - Stata Code | Data creation code |
S3 - Data creation of Dataset 3 - Stata Code | Data creation code |
S4 - Analysis - R Code | Analysis code |

Targeted Maximum Likelihood Estimation to adjust for time-varying confounding – a tutorial paper. This repository contains data and a number of snippets of R code, used in the TMLE tutorial by Clare, Dobbins, Bruno and Mattick.

Description | Markdown | GitHub R code | Download R code |
---|---|---|---|
Creating the longitudinal dataset used in example analyses. Data creation is done using the package ‘simcausal’ (1). | Data Creation Markdown | Data Creation Code | Download code |
Cross-sectional TMLE analysis, both manually and using the package ‘tmle’ (2). | Cross-sectional Analysis Markdown | Cross-sectional Analysis Code | Download code |
Longitudinal TMLE with a single outcome measurement, both manually and using the package ‘ltmle’ (3). | Single Outcome Longitudinal Markdown | Single Outcome Longitudinal Code | Download code |
Longitudinal TMLE with a repeated outcome measurement, both manually and using the package ‘ltmle’ (3). | Repeated Outcome Longitudinal Markdown | Repeated Outcome Longitudinal Code | Download code |

The longitudinal dataset, ldata.RData, is also included in the repository: https://github.com/philipclare/tmletutorial

1. Sofrygin O, van der Laan MJ, Neugebauer R. simcausal R Package: Conducting Transparent and Reproducible Simulation Studies of Causal Effect Estimation with Complex Longitudinal Data. Journal of Statistical Software. 2017;81(2):1-47.
2. Gruber S, van der Laan MJ. tmle: An R Package for Targeted Maximum Likelihood Estimation. Journal of Statistical Software. 2012;51(13):1-35.
3. Lendle SD, Schwab J, Petersen ML, van der Laan MJ. ltmle: An R Package Implementing Targeted Minimum Loss-Based Estimation for Longitudinal Data. Journal of Statistical Software. 2017;81(1):1-21.