Iceberg 14 + Spark 3.3 = FAST!
I upgraded the ngods data stack to Apache Spark 3.3.0 and Iceberg 14.0, which is now visibly faster!
Aug 5, 2022
Trino & dbt: excellent fit for cross-database ELT and data connectors
Trino (aka Presto or Starburst) is an open-source component for querying multiple databases simultaneously. It can execute distributed…
Published in Towards Dev, Jul 13, 2022
Are you throwing money out of the window by using Snowflake?
I recently came across this excellent blog post in which Kris argues that the hosted data stacks are becoming quite expensive. I agree with…
Jul 11, 2022
I agree. The cost is one of the reasons why I run my own Spark/Trino/Iceberg stack…
Jul 10, 2022
Orchestrating dbt and Pyspark
I use dbt for my data projects to implement the medallion data pipeline architecture that processes data in three: bronze, silver, and gold…
Jul 10, 2022
Hi, I noticed that Snowpark allows you to define a stored procedure in Scala, Java, or Python and…
Jul 8, 2022
Iceberg + Spark + Trino + Dagster: modern, open-source data stack demo
I assembled the ngods (new generation open-source data stack) two months back and have used it for two projects since then.
Published in Dev Genius, Jul 4, 2022
ngods: new generation open-source data stack
I wanted to quickly share my attempt to assemble a new generation data stack that is composed of open-source technologies.
May 20, 2022
Headless BI: metrics vs SQL
Headless BI’s metrics are much better than queries for your “non-SQL speaking” users.
May 16, 2022