Australian Longitudinal Study on Women’s Health (ALSWH) Analysis Code

R and Stata Analysis Code

This repository contains the R code used in articles that analyse ALSWH data.

Code for all analyses in the article examining the causal effects of meeting physical activity guidelines on health-related quality of life (Nguyen-duy et al 2024), published in PLOS Medicine: https://doi.org/10.1371/journal.pmed.1004384.

| Description | R Code |
| --- | --- |
| 1 - Data extraction - pull relevant variables from each wave | Data extraction |
| 2 - Merge data - merge waves and create derived variables | Merge data |
| 3 - Multiple imputation - impute intermittent missing data | Imputation |
| 4 - Final data creation - finalise imputed data and structure for analysis | Finalise data |
| 5 - LTMLE analysis - using dynamic regimes based on age, using the package ‘ltmle’ (1). | LTMLE analysis |
| 6 - Sensitivity analysis using lower physical activity cut-point. | Sensitivity 1 |
| 7 - Sensitivity analysis excluding variables wholly missing in some waves. | Sensitivity 1 |
| 8 - Pool results across imputations and create analysis figures | Pool results |
| 9 - Create plots of results using ggplot | Create plots |
| 10 - Generate ‘table 1’ of baseline descriptive statistics | Descriptives |
| 11 - E-value analysis to test sensitivity to unmeasured confounding | E-value analysis |
| 12 - Create summary of missing data | Missing data |
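
The pooling step combines the estimates obtained from each imputed dataset into a single result. The sketch below shows one common way to do this, Rubin's rules, in base R. It is an illustration only: the function name `pool_rubin`, its arguments, and the example numbers are hypothetical and are not taken from the repository's pooling script, which may use a different method.

```r
# Pool one estimate across m imputed datasets using Rubin's rules.
# `est` and `se` are hypothetical vectors holding the point estimate and
# standard error obtained from each imputed dataset.
pool_rubin <- function(est, se, conf_level = 0.95) {
  m <- length(est)
  stopifnot(m > 1, length(se) == m)

  q_bar <- mean(est)                 # pooled point estimate
  u_bar <- mean(se^2)                # within-imputation variance
  b     <- var(est)                  # between-imputation variance
  t_var <- u_bar + (1 + 1 / m) * b   # total variance

  # Rubin's degrees of freedom; infinite when the estimates do not vary
  df <- if (b > 0) (m - 1) * (1 + u_bar / ((1 + 1 / m) * b))^2 else Inf
  crit <- qt(1 - (1 - conf_level) / 2, df)

  c(estimate = q_bar,
    se       = sqrt(t_var),
    lower    = q_bar - crit * sqrt(t_var),
    upper    = q_bar + crit * sqrt(t_var))
}

# Made-up estimates from five imputations, for illustration only
pooled <- pool_rubin(est = c(1.8, 2.1, 1.9, 2.3, 2.0),
                     se  = c(0.50, 0.48, 0.52, 0.49, 0.51))
print(round(pooled, 3))
```

The pooled standard error is always at least as large as the average within-imputation standard error, because the between-imputation variance carries the extra uncertainty due to the missing data.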

Causal effects of physical activity on mortality

Code for all analyses in the article examining the causal effects of meeting physical activity guidelines on mortality (Nguyen-duy et al 2025, under review).

| Description | R Code |
| --- | --- |
| 1 - Data extraction - pull relevant variables from each wave | Data extraction |
| 2 - Merge data - merge waves and create derived variables | Merge data |
| 3 - Multiple imputation - impute intermittent missing data | Imputation |
| 4 - Final data creation - finalise imputed data and structure for analysis | Finalise data |
| 5 - Analysis of all-cause mortality - using dynamic regimes based on age, using the package ‘ltmle’ (1). | All-cause analysis |
| 6 - Analysis of CVD mortality - using dynamic regimes based on age, using the package ‘ltmle’ (1). | CVD analysis |
| 7 - Analysis of Cancer mortality - using dynamic regimes based on age, using the package ‘ltmle’ (1). | Cancer analysis |
| 8 - Pool results across imputations and create analysis figures | Pool results |
| 9 - Create plots to graphically report the analysis findings | Create plots |
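
The analyses above use dynamic regimes based on age. As a rough illustration of what such a regime can look like in code, the base R sketch below builds a counterfactual treatment matrix for the rule "meet the guideline at every wave from a given age onward". The rule itself, the function `make_regime`, the age matrix, and the threshold of 55 are all assumptions made for this example; the regimes actually used in the articles may be defined differently.

```r
# Build a counterfactual treatment matrix for an age-based dynamic regime:
# "meet the physical activity guideline at every wave from a given age onward".
# `age` is a hypothetical n x K matrix of each participant's age at each wave.
make_regime <- function(age, start_age) {
  stopifnot(is.matrix(age), is.numeric(start_age), length(start_age) == 1)
  regime <- (age >= start_age) * 1L   # 1 = treated at that wave, 0 = not
  colnames(regime) <- paste0("A", seq_len(ncol(age)))
  regime
}

# Three hypothetical participants followed over four waves, three years apart
baseline_age <- c(47, 50, 52)
age <- outer(baseline_age, 3 * (0:3), `+`)

# Regime: start meeting the guideline from age 55 (threshold is illustrative)
abar_55 <- make_regime(age, start_age = 55)
print(abar_55)
```

A matrix of this shape, with one row per participant and one column per treatment node, is the form that the `abar` argument of `ltmle()` accepts for participant-specific regimes.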

Causal effects of loneliness on all-cause mortality

Code for all analyses in the article examining the causal effects of loneliness on mortality (HaGani et al 2025, under review).

| Description | R Code |
| --- | --- |
| 1 - Data extraction - pull relevant variables from each wave | Data extraction |
| 2 - Merge data - merge waves and create derived variables | Merge data |
| 3 - Multiple imputation - impute intermittent missing data | Imputation |
| 4 - Final data creation - finalise imputed data and structure for analysis | Finalise data |
| 5 - Analysis of all-cause mortality - using dynamic regimes based on age, using the package ‘ltmle’ (1). | All-cause analysis |
| 6 - Post-hoc sensitivity analysis adjusting for baseline conditions rather than excluding. | Sensitivity |
| 7 - Pool results across imputations and create analysis figures | Pool results |
| 8 - Create plots to graphically report the analysis findings | Create plots |
| 9 - E-value analysis of unmeasured confounding | EValue analysis |
| 10 - Missing data summary for appendix | Missing data |
| 11 - Descriptive statistics on unadjusted mortality incidence. | Mortality descriptives |
| 12 - Socio-demographic descriptives for Table 1. | Sociodemographic descriptives |
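
An E-value expresses how strong unmeasured confounding would need to be to explain away an observed association. The sketch below implements the standard formula for a risk ratio (VanderWeele and Ding, 2017) in base R. It assumes the effect is on the risk ratio scale; the function names `evalue_rr` and `evalue_ci` and the example values are hypothetical, and the repository's own script may compute E-values differently, for example with a dedicated package.

```r
# E-value for a risk ratio.
# Ratios below 1 are inverted first so the formula applies to RR >= 1.
evalue_rr <- function(rr) {
  stopifnot(is.numeric(rr), all(rr > 0))
  rr <- ifelse(rr < 1, 1 / rr, rr)
  rr + sqrt(rr * (rr - 1))
}

# E-value for the confidence limit closest to the null.
# It equals 1 when the interval includes the null value of 1.
evalue_ci <- function(lower, upper) {
  stopifnot(lower <= upper, lower > 0)
  if (lower <= 1 && upper >= 1) return(1)
  evalue_rr(if (lower > 1) lower else upper)
}

# Illustrative values only, not results from the article
e_point <- evalue_rr(0.5)        # protective effect, inverted to RR = 2
e_limit <- evalue_ci(0.4, 0.8)   # uses the upper limit, 0.8
print(c(point = e_point, limit = e_limit))
```

Here a risk ratio of 0.5 is treated the same as a risk ratio of 2, giving an E-value of 2 + sqrt(2), or about 3.41.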

Physical activity and incident obesity: causal inference analysis in Australian women aged 45 years and older

Code for all analyses in the article examining the causal effects of physical activity on incident obesity (Tarp et al 2025, in progress).

| Description | R Code |
| --- | --- |
| 1 - Data extraction - pull relevant variables from each wave | Data extraction |
| 2 - Merge data - merge waves and create derived variables | Merge data |
| 3a - Multiple imputation - impute intermittent missing data | Imputation |
| 3b - Multiple imputation for sensitivity analysis | Imputation |
| 4 - Final data creation - finalise imputed data and structure for analysis | Finalise data |
| 5a - Linearity tests for functional form of PA in primary analysis | Primary linearity |
| 5b - Linearity tests for functional form of PA in categorical analysis | Categorical linearity |
| 5c - Linearity tests for functional form of PA in analysis of ‘severe’ obesity | Severe linearity |
| 5d - Linearity tests for functional form of PA in analysis of five-percent weight gain | Five-percent linearity |
| 5e - Linearity tests for functional form of PA in analysis of ten-percent weight gain | Ten-percent linearity |
| 5f - Combine and summarise linearity tests | Combine linearity |
| 6a - Primary analysis | Primary analysis |
| 6b - Categorical analysis | Categorical analysis |
| 6c - Analysis of ‘severe’ obesity | Severe analysis |
| 6d - Analysis of five-percent weight gain | Five-percent analysis |
| 6e - Analysis of ten-percent weight gain | Ten-percent analysis |
| 6f - Secondary analysis stratified by level of education | Stratified analysis |
| 6g - Sensitivity analysis controlling for descendants of possible unmeasured confounder | Sensitivity analysis 1 |
| 6h - Sensitivity analysis excluding variables that were wholly imputed in some waves | Sensitivity analysis 2 |
| 7 - Pool results across imputations and create analysis figures | Pool results |
| 8 - Create plots to graphically report the analysis findings | Create plots |
| 9 - Socio-demographic descriptives for Table 1 and on unadjusted incidence of outcomes. | Sociodemographic descriptives |
| 10 - E-value analysis of unmeasured confounding | EValue analysis |
| 11 - Missing data summary for appendix | Missing data |
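
Steps 5a to 5f test whether a linear term is an adequate functional form for physical activity (PA). One generic way to run such a check is to compare a model with a linear PA term against one with a flexible spline term using a likelihood ratio test, as sketched below on simulated data. This is an assumption about the general approach, not a description of the repository's scripts: the variable names, the logistic model, and the three-degree-of-freedom natural spline are all illustrative.

```r
library(splines)  # ships with base R

set.seed(2025)
n  <- 2000
pa <- runif(n, 0, 20)                        # hypothetical PA dose
y  <- rbinom(n, 1, plogis(-1 - 0.05 * pa))   # simulated binary outcome

# Nested models: a linear term versus a natural cubic spline
fit_linear <- glm(y ~ pa, family = binomial())
fit_spline <- glm(y ~ ns(pa, df = 3), family = binomial())

# Likelihood ratio test; a small p-value suggests departure from linearity
lrt <- anova(fit_linear, fit_spline, test = "LRT")
print(lrt)

p_nonlinear <- lrt[["Pr(>Chi)"]][2]
```

With multiply imputed data, a test like this would need to be run within each imputed dataset and the results combined, which is the role a step such as 5f plays.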
  1. Lendle SD, Schwab J, Petersen ML, van der Laan MJ. ltmle: An R Package Implementing Targeted Minimum Loss-Based Estimation for Longitudinal Data. Journal of Statistical Software. 2017;81(1):1-21.
  2. Vermunt JK. Latent Class Modeling with Covariates: Two Improved Three-Step Approaches. Political Analysis. 2010;18(4):450-469.
Dr. Philip J Clare, PhD

Biostatistician at the Prevention Research Collaboration, University of Sydney.
