Can you imagine creating a new database schema without a single CREATE
statement? Just load data from your files and transform it using SELECT
statements. No unnecessary boilerplate code. Fast and productive.
This is exactly what dbd does. You can use it to create your data schemas by just placing data files (CSV, JSON, XLS(X), or Parquet) in a directory on your computer and calling dbd run.
dbd introspects your data files, determines data types, creates tables, and populates them.
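To make the introspection idea concrete, here is a minimal Python sketch of inferring column types from a CSV file and creating a SQLite table from it. This is not dbd's actual implementation; the helper names infer_type and load_csv, the sample data, and the three-type mapping (INTEGER, REAL, TEXT) are all assumptions made for illustration.

```python
import csv
import io
import sqlite3


def infer_type(values):
    """Return the narrowest SQLite type that fits every non-empty value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "TEXT"
    # Try the strictest type first and fall back to the next one on failure.
    for sql_type, cast in (("INTEGER", int), ("REAL", float)):
        try:
            for v in non_empty:
                cast(v)
            return sql_type
        except ValueError:
            continue
    return "TEXT"


def load_csv(conn, table, text):
    """Create `table` from CSV text, populate it, and return the inferred types."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    types = {col: infer_type([r[i] for r in data]) for i, col in enumerate(header)}
    columns = ", ".join(f'"{c}" {t}' for c, t in types.items())
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', data)
    return types


# Hypothetical sample data standing in for a file on disk.
conn = sqlite3.connect(":memory:")
sample = "state,cases,rate\nCA,120,0.5\nNY,95,1.25\n"
inferred = load_csv(conn, "covid", sample)
print(inferred)
```

A real tool has to handle much more (dates, booleans, nulls, very large files), but the core loop is the same: sample the values, pick a type, generate the CREATE statement, and load the rows.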
If you don’t like the default data types, you can write simple YAML annotations that override just the ones you want to change. You can also mark a column as a primary key or foreign key, or create an index. You write less than 30% of the code that you’d otherwise have to write with SQL.
Is your data file remote? No problem, just give dbd its URL. It will take care of the rest.
Do you need to derive a new table from those that you’ve created from the data files? Create a SQL file with a SELECT statement and the tool will materialize it for you. You just decide if you want a table or view.
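The table-or-view choice can be illustrated with plain SQLite from Python. This is only a sketch of the general technique (wrapping a SELECT in CREATE TABLE ... AS or CREATE VIEW ... AS), not a description of how dbd does it internally; the materialize helper, the table names, and the sample rows are hypothetical.

```python
import sqlite3


def materialize(conn, name, select_sql, as_view=False):
    """Wrap a SELECT in CREATE TABLE/VIEW ... AS so its result becomes a named object."""
    kind = "VIEW" if as_view else "TABLE"
    conn.execute(f'DROP {kind} IF EXISTS "{name}"')
    conn.execute(f'CREATE {kind} "{name}" AS {select_sql}')


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cases (state TEXT, day TEXT, new_cases INTEGER)")
conn.executemany(
    "INSERT INTO cases VALUES (?, ?, ?)",
    [("CA", "2022-01-01", 10), ("CA", "2022-01-02", 15), ("NY", "2022-01-01", 7)],
)

query = "SELECT state, SUM(new_cases) AS total FROM cases GROUP BY state"
materialize(conn, "cases_by_state", query)                   # snapshot of the result
materialize(conn, "cases_by_state_v", query, as_view=True)   # re-evaluated on every read

# A row added later shows up in the view but not in the table.
conn.execute("INSERT INTO cases VALUES ('NY', '2022-01-02', 3)")
table_total = conn.execute(
    "SELECT total FROM cases_by_state WHERE state = 'NY'"
).fetchone()[0]
view_total = conn.execute(
    "SELECT total FROM cases_by_state_v WHERE state = 'NY'"
).fetchone()[0]
print(table_total, view_total)
```

The trade-off is the usual one: a table stores the result once and is fast to query, while a view always reflects the current source data at the cost of re-running the SELECT.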
Also, you don’t need to care about the sequence in which you create or drop your tables, indexes, and keys. dbd takes care of it because it understands dependencies between your tables.
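Dependency-aware ordering is typically a topological sort. The sketch below uses Python's standard graphlib module (Python 3.9 or newer) on a made-up dependency map; the table names are hypothetical, and this shows the general technique rather than dbd's own code.

```python
from graphlib import TopologicalSorter

# Hypothetical schema: each table maps to the tables it reads from or references.
dependencies = {
    "state": set(),
    "county": {"state"},
    "covid_cases": {"county"},
    "cases_by_state": {"covid_cases", "state"},
}

# Create tables only after everything they depend on exists...
create_order = list(TopologicalSorter(dependencies).static_order())
# ...and drop them in the opposite order so no reference is left dangling.
drop_order = list(reversed(create_order))

print(create_order)
print(drop_order)
```

With an ordering like this, foreign keys and derived tables are always created after the tables they point to, which is why the order of the files themselves stops mattering.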
I have been using dbd to create my data schemas for some time now. I’ve optimized its performance and fixed bugs. Last weekend, I created this COVID-19 data schema with it.
You can try it yourself. dbd is free and open source under the BSD license. You can check out the code in its GitHub repo.
This simple shell script installs dbd and creates a simple SQLite data schema.
python3 -m venv dbd-env
source dbd-env/bin/activate
pip3 install dbd
git clone https://github.com/zsvoboda/dbd.git
cd dbd/examples/sqlite/covid_us
dbd run .
Liked it? Take a look at more examples here.
Let me know what you think!