Containers open a whole new set of options for data analytics. Deploying and configuring a complete analytical stack can take less time than executing a query in your database.

GoodData engineers work on a modular analytical stack (codename Tiger) that can be deployed as a single Docker image or as an elastic k8s application. The deployment and configuration of this stack can be easily automated. Let me share a quick example of how this stack works.

NOTE: You need a username and password for the GoodData registry to be able to execute the steps below. …

GoodData.UI is a new open-source data visualization library. It is based on JSX components. For example, this <ColumnChart/> tag displays a column (aka bar) chart:

<ColumnChart 
measures={[Ldm.Revenue]}
viewBy={[Ldm.DateMonthYear.Long]} />
Bar chart data visualization displayed by the one-line code above

Try it yourself in this sandbox

You can play with the tags in this sandbox. The supported tags are documented here. By the way, the framework is open-sourced.

Data visualization with GoodData.UI is as easy as writing HTML tags. For example, this <ColumnChart/> tag displays a column (aka bar) chart:

<ColumnChart 
measures={[Ldm.Revenue]}
viewBy={[Ldm.DateMonthYear.Long]} />
Bar chart data visualization displayed by the one-line code above

You can play with the tags in this sandbox. The supported tags are documented here. By the way, the framework is open-sourced.

Multitenant analytics is about delivering analytics to users in multiple organizations (tenants). The most common use case for multitenant analytics is customer-facing reports and dashboards embedded in a SaaS application.

Another frequent use case is an organization that provides analytics to its business partners: suppliers, distributors, resellers, franchises, etc.

Multitenant analytics is often delivered as a product. This involves the following high-level steps:

  • Deliver an initial analytical experience (data visualizations, reports, dashboards, etc.) to new tenants (e.g. organizations, customers, business partners)
  • Organizations customize their analytics with self-service tools
  • You release a new version of the analytics without breaking the customizations

This example shows how to create a Spark Dataset on top of a GoodData workspace. Once you have data in a Spark Dataset, you can use all the data processing power of Spark, including data transformation and machine learning.

The dataset uses the workspace created in this tutorial.

Setup

The example requires a .gooddata configuration file located in your home directory. The file has this structure:

{
  "host": "<your-gd-domain>.na.gooddata.com",
  "username": "<your-gd-username>",
  "password": "<your-gd-password>",
  "workspace": "<your-gd-workspace-id>"
}
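
As a rough illustration of how such a file could be consumed, here is a minimal Python sketch that reads the JSON file and checks that the four keys shown above are present. The helper name load_gooddata_config is hypothetical and is not part of the example code or of any GoodData library; the real example may read the file differently.

```python
import json
from pathlib import Path

# Keys taken from the file structure shown above.
REQUIRED_KEYS = ("host", "username", "password", "workspace")


def load_gooddata_config(path=None):
    """Hypothetical helper: read the .gooddata JSON file and validate its keys."""
    # Default to ~/.gooddata, the location the example expects.
    config_path = Path(path) if path is not None else Path.home() / ".gooddata"
    with config_path.open(encoding="utf-8") as handle:
        config = json.load(handle)
    # Fail early with a readable message if a key is missing.
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"missing keys in {config_path}: {', '.join(missing)}")
    return config
```

Calling load_gooddata_config() with no argument reads the file from your home directory; passing an explicit path is convenient for testing.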

Code

Check out the example code

Tutorials and articles

Originally published at https://github.com.

This short article describes how to access a GoodData workspace from an Apache Zeppelin notebook.

Zeppelin installation

Download the latest version of the GoodData JDBC Driver (check the assets list).

I use the official Docker image and follow the setup described in the Zeppelin documentation.

Create this directory structure in your home directory:

zeppelin 
+- lib
+- logs
+- notebook
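
If you prefer to script this step, the directory structure above could be created with a few lines of Python. This is only a convenience sketch; the function name create_zeppelin_dirs is hypothetical, and creating the directories by hand works just as well.

```python
from pathlib import Path


def create_zeppelin_dirs(base=None):
    """Create zeppelin/lib, zeppelin/logs and zeppelin/notebook under the base directory."""
    # Default to the home directory, as described in the text.
    root = (Path(base) if base is not None else Path.home()) / "zeppelin"
    for name in ("lib", "logs", "notebook"):
        # exist_ok makes the call safe to repeat.
        (root / name).mkdir(parents=True, exist_ok=True)
    return root
```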

Copy the GoodData JDBC Driver JAR (e.g. gooddata-jdbc-0.74.jar) to the lib directory.

docker run -p 8080:8080 --rm \
  -v ~/zeppelin/logs:/logs \
  -v ~/zeppelin/notebook:/notebook \
  -v ~/zeppelin/lib:/java-lib \
  -e ZEPPELIN_LOG_DIR='/logs' \
  -e ZEPPELIN_NOTEBOOK_DIR='/notebook' \
  --name zeppelin apache/zeppelin:0.9.0

Create a new JDBC interpreter.

SQL is great for working with raw data. SELECTs, INSERTs, DELETEs, and UPDATEs work great for the CRUD (Create, Read, Update, Delete) operations. However, SQL sucks when you start working with aggregated data. Let me demonstrate this with a simple sales orders example. You’ll be able to try it yourself with a simple SQLite setup described below.

There are two tutorials that show how to set up DBeaver with SQLite to run all the SQL queries and GoodData XAE (Extensible Analytics Engine) to execute the metric queries.

Working with aggregated data in SQL

I have the following data model:

SQL uses the GROUP BY statement for data…
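
To give a feel for what a GROUP BY aggregation looks like in SQLite, here is a minimal sketch using Python's built-in sqlite3 module. The data model in the article is shown only as an image, so the orders table, its columns, and the sample rows below are hypothetical stand-ins, not the article's actual schema.

```python
import sqlite3


def revenue_by_customer(rows):
    """Sum order amounts per customer with GROUP BY, using an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table; the article's real data model is not reproduced here.
        conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
        conn.executemany("INSERT INTO orders (customer, amount) VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT customer, SUM(amount) FROM orders "
            "GROUP BY customer ORDER BY customer"
        )
        return cursor.fetchall()
    finally:
        conn.close()


# Made-up sample data for illustration only.
sample_orders = [("Alice", 100), ("Alice", 50), ("Bob", 75)]
totals = revenue_by_customer(sample_orders)
```

Each distinct customer value becomes one output row, and SUM collapses that customer's orders into a single total.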

Today, I am going to share more details about the very heart of the GoodData platform, the Extensible Analytics Engine that we familiarly call the XAE. This core component performs every insight execution on our platform.

Let me demonstrate the power of XAE through one of our top customers, who provides analytics to more than 29 thousand companies. This large-scale analytics solution involves 22.7 TB of data analyzed by 880k users. The data and user loads are not evenly spread: the largest customer serves almost 750 GB to 13.5k users.

The analytic insights are embedded inside of a responsive application that…

ZD