DataOps & Headless BI: the perfect fit

GoodData Developers
3 min read · Jun 18, 2021


The data managed by DataOps teams is often consumed as analytics insights, dashboards, or machine learning models. We at GoodData believe that exposing a SQL database and letting your consumers analyze the data in their Excels, Tableaus, or QlikViews is a road to hell. You can read more about the unpleasant consequences of this approach in the article that describes the concept of Headless BI.

In this article, I want to describe how a Headless BI stack supports a DataOps team that continuously delivers an analytical solution. Two key capabilities of the stack make this easier:

  • Every analytics artifact is defined and configured via human-readable declarative definitions. Human readability is very important because the declarative definitions are frequently merged and potential conflicts must be resolved by hand.
  • Open APIs allow automation of the DataOps continuous-delivery process described below (e.g., automated testing).
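To make the first point concrete, here is a hypothetical sketch of what a human-readable declarative metric definition could look like. The field names and structure are illustrative only; the real GoodData.CN declarative format differs, so consult its documentation for the actual schema:

```yaml
# Illustrative sketch only -- not the real GoodData.CN schema.
metrics:
  - id: revenue
    title: Revenue
    maql: "SELECT SUM({fact/price})"   # MAQL is GoodData's metric language
  - id: revenue_per_customer
    title: Revenue per Customer
    maql: "SELECT {metric/revenue} / COUNT({label/customer_id})"
```

Because the definition is plain text, diffs are readable in code review and merge conflicts can be resolved the same way as in any other source file.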

DataOps process for analytics

The image below outlines the high-level steps for delivering analytics by a DataOps team. This is a rinse-and-repeat process that happens for every analytics solution iteration (set of features). Companies that reach a high level of DataOps maturity can complete these steps many times per day (or even per hour).

High-level continuous integration process scheme

Using Headless BI stack for delivering analytics

Now we’ll take a look at how the declarative and API-first approaches help to implement the DataOps process outlined in the image above. I’ll reference real-life declarative formats and APIs of GoodData.CN for better illustration.

Development

The Headless BI analytical model and metrics can be developed by multiple data engineers, each with a private development workspace. They can use visual drag-and-drop tools or edit the declarative definitions as code.

Visual semantic model editor

No matter which tool a developer uses, the result is a declarative definition that can be committed to a version control system (e.g., GitHub). The declarative definition can be retrieved from the Headless BI stack via a simple GET REST request.

Other developers can pull the declarative definition from the repository and load it into their development workspaces via a single PUT REST request.
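This GET/PUT round-trip can be scripted in a few lines. The sketch below assumes a hypothetical `/api/v1/layout/workspaces/{id}` endpoint and bearer-token authentication; check the GoodData.CN API reference for the real paths and auth scheme:

```python
import json
import urllib.request

def layout_url(host: str, workspace_id: str) -> str:
    # Hypothetical endpoint path; the real GoodData.CN path may differ.
    return f"{host}/api/v1/layout/workspaces/{workspace_id}"

def pull_definition(host: str, workspace_id: str, token: str) -> dict:
    """GET the declarative definition of a workspace (e.g., to commit it to git)."""
    req = urllib.request.Request(
        layout_url(host, workspace_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def push_definition(host: str, workspace_id: str, token: str, layout: dict) -> int:
    """PUT a declarative definition into a (development) workspace."""
    req = urllib.request.Request(
        layout_url(host, workspace_id),
        data=json.dumps(layout).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

In a CI pipeline, `pull_definition` would run after each change to snapshot the workspace into the repository, and `push_definition` would hydrate a teammate's development workspace from a checked-out branch.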

Each developer works on their own branch and merges their finished work into the shared master branch.

Quality assurance

At release time, a quality engineer forks a new release branch from master, creates a temporary quality workspace, and loads the new content into it from the versioned declarative definition. They then run the automated quality tests against the quality workspace and report bugs back to the developers.

Developers fix the bugs in their branches and merge the fixes into master. The quality engineer then merges the fixes from master into the release branch and re-runs the automated tests. Rinse and repeat. When all bugs are fixed, the quality engineer tags the final content with the release version.
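The automated tests in this loop can be plain scripts that lint the pulled declarative definition before it ever reaches production. A toy sketch, assuming the layout is a dict with a `metrics` list (the field names are illustrative, not GoodData.CN's real schema):

```python
def check_layout(layout: dict) -> list[str]:
    """Return a list of quality findings for a declarative layout.
    Field names here are illustrative, not the real GoodData.CN schema."""
    findings = []
    seen_ids = set()
    for metric in layout.get("metrics", []):
        mid = metric.get("id")
        if not mid:
            findings.append("metric without an id")
            continue
        if mid in seen_ids:
            findings.append(f"duplicate metric id: {mid}")
        seen_ids.add(mid)
        if not metric.get("title"):
            findings.append(f"metric {mid} has no human-readable title")
    return findings

# Example: run against a tiny layout pulled from the quality workspace.
layout = {"metrics": [
    {"id": "revenue", "title": "Revenue"},
    {"id": "revenue"},                      # duplicate id, and missing a title
]}
findings = check_layout(layout)
```

A CI job would fail the build when `findings` is non-empty, so broken definitions never get tagged for release.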

Production

The final analytics content is published to the production workspace. GoodData.CN provides a handy inheritance mechanism that allows the released content to be customized for different contexts (external or internal audiences, different departments, branches, franchises, customers, etc.).

Try it yourself

You can use the free GoodData.CN with its demo data and models to quickly try these steps yourself. It is easy to install (a single consolidated Docker image). If it works for you, you can even use the software in production (Kubernetes deployment) for free for an unlimited number of users.

Thank you for reading and let me know if you have any comments or questions!
