Iceberg 14 + Spark 3.3 = FAST!
I upgraded the ngods data stack to Apache Spark 3.3.0 and Iceberg 14.0, which is now visibly faster!
1 min read · Aug 5, 2022

Trino & dbt: excellent fit for cross-database ELT and data connectors (in Towards Dev)
Trino (aka Presto or Starburst) is an open-source component for querying multiple databases simultaneously. It can execute distributed…
2 min read · Jul 13, 2022

Are you throwing money out of the window by using Snowflake?
I recently came across this excellent blog post in which Kris argues that the hosted data stacks are becoming quite expensive. I agree with…
2 min read · Jul 11, 2022

I agree. The cost is one of the reasons why I run my own Spark/Trino/Iceberg stack…
1 min read · Jul 10, 2022

Orchestrating dbt and Pyspark
I use dbt for my data projects to implement the medallion data pipeline architecture that processes data in three: bronze, silver, and gold…
3 min read · Jul 10, 2022

Hi, I noticed that Snowpark allows you to define a stored procedure in Scala, Java, or Python and…
1 min read · Jul 8, 2022

Iceberg + Spark + Trino + Dagster: modern, open-source data stack demo (in Dev Genius)
I assembled the ngods (new generation open-source data stack) two months back and have used it for two projects since then.
3 min read · Jul 4, 2022

ngods: new generation open-source data stack
I wanted to quickly share my attempt to assemble a new generation data stack that is composed of open-source technologies.
1 min read · May 20, 2022

Headless BI: metrics vs SQL
Headless BI’s metrics are much better than queries for your “non-SQL speaking” users.
1 min read · May 16, 2022