The dbd database prototyping tool now supports loading Kaggle dataset files (e.g. CSV, JSON, XLS, Parquet) directly into databases.
You don’t have to create any tables or run any SQL INSERT or COPY statements. Everything is automated. Just reference the datasets and files with a URL and execute the dbd run command.
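To give a feel for the manual work this replaces, here is a minimal sketch of loading a CSV file into SQLite by hand with only the Python standard library. This is not how dbd is implemented; the function name, table name, and sample data are all made up for illustration, and every column is simply stored as TEXT.

```python
import csv
import io
import sqlite3


def load_csv_to_sqlite(conn, table_name, csv_text):
    """Create a table from the CSV header and insert every data row.

    Hypothetical helper for illustration only; not part of dbd.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Quote identifiers so column names with spaces still work.
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    conn.execute(f'CREATE TABLE "{table_name}" ({columns})')
    placeholders = ", ".join("?" for _ in header)
    rows = list(reader)
    conn.executemany(
        f'INSERT INTO "{table_name}" VALUES ({placeholders})', rows
    )
    conn.commit()
    return len(rows)


# Made-up sample data standing in for a downloaded Kaggle file.
SAMPLE_CSV = "date,state,cases\n2022-01-01,NY,100\n2022-01-02,NY,150\n"

conn = sqlite3.connect(":memory:")
inserted = load_csv_to_sqlite(conn, "covid_cases", SAMPLE_CSV)
total = conn.execute('SELECT COUNT(*) FROM "covid_cases"').fetchone()[0]
print(inserted, total)
```

Even this toy version has to read the header, build the CREATE TABLE statement, and batch the inserts, and it still ignores data types, other file formats, and other database engines.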
There are two example datasets:
- Kaggle Omicron dataset that loads worldwide COVID-19 Omicron variant data. This example’s sources are here.
- Kaggle NYT COVID-19 dataset that loads New York Times COVID-19 US data. This example’s sources are here.
Examples are tested with Postgres, MySQL, and SQLite databases.
The GitHub repository with detailed instructions is here.
Hopefully, you find it useful.
Let me know what you think!