This is an overview of a personal project called antren.
When it comes to personal projects, I like to take the shortest path to get them done. It's usually about the destination and not the journey. This project was a bit different: I tried to make each step's output as un-optimized as possible for the step after it, so that each step actually did something meaningfully beneficial (and gave me something to write about).
Here is the project in 3 simple steps:
Using Docker and Google Cloud Run, I deployed a beautifully slow Python script (it uses BeautifulSoup) that gets TCX files from Garmin, turns them into Parquet files, and uploads them to Google Cloud Storage.
With Airflow, I orchestrated the step above, as well as loading these wonderfully compressed but very-hard-to-analyze Parquet files into BigQuery.
With dbt, I turned nested JSON-like columns into rows that became metrics to track my cycling training data. Finally, I no longer have to pay a subscription to get exactly the information I want about my progress.
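To make the shape of the data concrete, here is a minimal Python sketch of the idea behind steps one and three: a TCX file is parsed into a single nested record per activity (roughly what ends up in a Parquet file), and that record is later flattened into one row per trackpoint so a metric can be computed. This is an illustration under assumptions, not the project's actual code: it uses the standard library's xml.etree instead of BeautifulSoup, plain Python instead of dbt and SQL, and the function names, field names, and sample data are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by Garmin TCX (Training Center XML) files.
NS = {"tcx": "http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2"}

# Tiny made-up sample; real files hold thousands of trackpoints.
SAMPLE_TCX = """
<TrainingCenterDatabase xmlns="http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2">
  <Activities>
    <Activity Sport="Biking">
      <Id>2021-01-01T10:00:00Z</Id>
      <Lap StartTime="2021-01-01T10:00:00Z">
        <Track>
          <Trackpoint>
            <Time>2021-01-01T10:00:00Z</Time>
            <DistanceMeters>0.0</DistanceMeters>
            <HeartRateBpm><Value>100</Value></HeartRateBpm>
          </Trackpoint>
          <Trackpoint>
            <Time>2021-01-01T10:00:05Z</Time>
            <DistanceMeters>40.0</DistanceMeters>
            <HeartRateBpm><Value>120</Value></HeartRateBpm>
          </Trackpoint>
        </Track>
      </Lap>
    </Activity>
  </Activities>
</TrainingCenterDatabase>
"""


def parse_activity(tcx_text):
    """Parse the first activity into one nested record (hypothetical schema)."""
    root = ET.fromstring(tcx_text)
    activity = root.find("tcx:Activities/tcx:Activity", NS)
    trackpoints = []
    for tp in activity.iterfind(".//tcx:Trackpoint", NS):
        hr = tp.findtext("tcx:HeartRateBpm/tcx:Value", namespaces=NS)
        dist = tp.findtext("tcx:DistanceMeters", namespaces=NS)
        trackpoints.append({
            "time": tp.findtext("tcx:Time", namespaces=NS),
            "distance_m": float(dist) if dist is not None else None,
            "heart_rate": int(hr) if hr is not None else None,
        })
    return {
        "activity_id": activity.findtext("tcx:Id", namespaces=NS),
        "sport": activity.get("Sport"),
        "trackpoints": trackpoints,  # the nested, hard-to-analyze column
    }


def flatten(activity):
    """Turn the nested trackpoints column into one row per trackpoint."""
    return [
        {"activity_id": activity["activity_id"], "sport": activity["sport"], **tp}
        for tp in activity["trackpoints"]
    ]


def average_heart_rate(rows):
    """Example metric: mean heart rate over rows that have a reading."""
    values = [r["heart_rate"] for r in rows if r["heart_rate"] is not None]
    return sum(values) / len(values) if values else None


rows = flatten(parse_activity(SAMPLE_TCX))
print(average_heart_rate(rows))
```

In the project itself, the flattening and the metrics live in dbt models running on BigQuery; the sketch only mirrors the nested-record-to-rows transformation in a form that runs locally.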