Analysing the Antiretroviral Therapy in HIV dataset

Key figures

  • 22 Students 👩‍🎓

  • Six Teams 🤝

  • Three Content experts 💊

  • Two Government partners 🏥

  • One winning team 🌟

Overview of the event

The first CBDRH Health Data Science (HDS) datathon took place on Friday 26th May 2023. Twenty-two HDS students took part, working in six teams: five on campus and one hybrid online/on-campus. The event focused on the Antiretroviral Therapy in HIV dataset, and teams were challenged to pose a research question and develop a solution using their health context expertise and analytic skills.

Health Data Science students in action 💛
Credit: Cassandra Hannagan

Five experts were on hand to guide the teams, helping them craft their research questions and carry out their proposed solutions. They included applied researchers with content expertise in HIV medications and machine learning, and Health Informaticians from NSW Health Sydney Local Health District.

Consulting with machine learning experts. 💛
Credit: Cassandra Hannagan

The datathon wasn’t just about coding and data; it was also about enjoyment and camaraderie. The teams enjoyed good food and had loads of fun. The presentations from all teams were highly engaging, with everyone putting into practice the technical coding skills and health context expertise from the HDS program.

Students presenting their work 💛
Credit: Cassandra Hannagan

The day culminated in a prize ceremony to acknowledge the hard work of all the teams, and of course announce the winners! The winning team’s breakthrough came from using neural networks to predict the success of current ART drug combinations for patients with HIV. The second-place team used survival analysis to answer the question, “What drug combination is most effective at achieving viral suppression?” The third team investigated the impact of Dolutegravir (DTG) as a third agent drug on time to viral suppression among active HIV patients under antiretroviral therapy (ART).

This event blended learning, collaboration, and fun, capturing the essence of data science in an enjoyable and rewarding environment.


The data

The Antiretroviral Therapy in HIV dataset comprises viral loads, CD4 counts, and drug regimen information for 8,916 patients with HIV. This is a synthetic dataset that has been developed using Generative Adversarial Networks. This approach provides realistically complex data, allowing users to prototype, evaluate, and compare machine learning algorithms without the usual constraints of patient privacy.

The ART HIV dataset included demographic details and longitudinal clinical data on drug combinations and CD4 counts for nearly 9,000 patients. Common baseline drug regimens included tenofovir disoproxil & emtricitabine (FTC+TDF) and abacavir & lamivudine (3TC+ABC). Several of the teams chose to implement machine learning models to predict future CD4 outcomes based on previous values, demographics and treatment variables.
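
For readers curious about what "predicting future CD4 outcomes based on previous values" can look like in practice, here is a minimal sketch in Python. It is not any team's actual method. The record layout, the field names (`prev_cd4`, `regimen`, `months_ahead`) and the toy values are all assumptions made up for illustration and are not taken from the dataset. The sketch reshapes longitudinal visits into (features, target) pairs and scores a naive "CD4 stays the same" baseline, the kind of reference point a neural network or other model would need to beat.

```python
from collections import defaultdict

# Hypothetical toy records: (patient_id, visit_month, cd4_count, regimen).
# These values are invented for illustration and are NOT from the dataset.
records = [
    ("p1", 0, 350, "FTC+TDF"),
    ("p1", 6, 420, "FTC+TDF"),
    ("p1", 12, 480, "FTC+TDF"),
    ("p2", 0, 200, "3TC+ABC"),
    ("p2", 6, 260, "3TC+ABC"),
    ("p2", 12, 310, "3TC+ABC"),
]


def make_lagged_examples(rows):
    """Turn longitudinal rows into (features, target) pairs.

    Each example uses the previous visit's CD4 count and regimen
    to predict the CD4 count at the patient's next visit.
    """
    by_patient = defaultdict(list)
    for pid, month, cd4, regimen in rows:
        by_patient[pid].append((month, cd4, regimen))

    examples = []
    for pid, visits in by_patient.items():
        visits.sort()  # order each patient's visits by month
        # Pair every visit with the one that follows it.
        for (m_prev, cd4_prev, reg_prev), (m_next, cd4_next, _) in zip(visits, visits[1:]):
            features = {
                "patient_id": pid,
                "prev_cd4": cd4_prev,
                "regimen": reg_prev,
                "months_ahead": m_next - m_prev,
            }
            examples.append((features, cd4_next))
    return examples


def persistence_baseline(features):
    """Naive baseline: assume CD4 stays at its previous value."""
    return features["prev_cd4"]


examples = make_lagged_examples(records)

# Mean absolute error of the baseline across all examples.
mae = sum(abs(persistence_baseline(f) - y) for f, y in examples) / len(examples)
print(f"{len(examples)} examples, baseline MAE = {mae:.1f} cells")
```

Any real model would be trained on the same kind of pairs, with patients split between training and test sets so that no individual appears in both.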

What our students said

Based on your experience at the datathon, would you recommend it to other students?
100%
DEFINITELY


I loved getting to work with really complicated data based on real situations, loved the info sharing sessions and the support given.

The open-ended nature of the competition meant that we got to see different uses for the dataset. I got to meet participants from other faculties.

The actual competition day itself was very rewarding and also very fun!