Containers open a whole new set of options for data analytics. Deploying and configuring a complete analytical stack can take less time than executing a query in your database.
GoodData engineers work on a modular analytical stack (codename Tiger) that can be deployed as a single Docker image or as an elastic k8s application. The deployment and configuration of this stack can be easily automated. Let me share a quick example of how this stack works.
GoodData.UI is a new open-source data visualization library. It is based on JSX components.
Data visualization with GoodData.UI is as easy as writing HTML tags. For example, this <ColumnChart/> tag displays a column (aka bar) chart:
Multitenant analytics is about delivering analytics to users in multiple organizations (tenants). The most common use case for multitenant analytics is customer-facing reports and dashboards embedded in a SaaS application.
Another frequent use case is an organization that provides analytics to its business partners: suppliers, distributors, resellers, franchises, etc.
Multitenant analytics is often delivered as a product. This involves the following high-level steps:
This example shows how to create a Spark Dataset on top of a GoodData workspace. Once you have data in a Spark Dataset, you can use all the data processing power of Spark, including data transformation, machine learning, etc.
The dataset uses the workspace created in this tutorial.
The example requires a .gooddata configuration file located in your home directory. The file has this structure:
Check out the example code
This short article describes how to access a GoodData workspace from an Apache Zeppelin notebook.
Download the latest version of the GoodData JDBC Driver (check the assets list).
I use the official Docker image and follow the setup described in the Zeppelin documentation.
Create this directory structure in your home directory:
Copy the GoodData JDBC Driver JAR (e.g. gooddata-jdbc-0.74.jar) to the lib directory.
docker run -p 8080:8080 --rm -v ~/zeppelin/logs:/logs -v ~/zeppelin/notebook:/notebook -v ~/zeppelin/lib:/java-lib -e ZEPPELIN_LOG_DIR='/logs' -e ZEPPELIN_NOTEBOOK_DIR='/notebook' --name zeppelin apache/zeppelin:0.9.0
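The volume mounts in the command above expect three directories under ~/zeppelin: logs, notebook, and lib. As a minimal sketch, assuming that layout is all the setup needs, the directories can be created before starting the container. The function name create_zeppelin_dirs is hypothetical and used only for illustration.

```python
from pathlib import Path


def create_zeppelin_dirs(home: Path) -> list[Path]:
    """Create <home>/zeppelin/{logs,notebook,lib} and return the paths.

    The three names mirror the -v volume mounts of the docker run command;
    create_zeppelin_dirs itself is a hypothetical helper.
    """
    base = home / "zeppelin"
    created = []
    for name in ("logs", "notebook", "lib"):
        path = base / name
        # parents=True also creates ~/zeppelin; exist_ok=True makes reruns safe.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling create_zeppelin_dirs(Path.home()) prepares the layout in your home directory; the JDBC driver JAR then goes into the lib directory.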
SQL is great for working with raw data. SELECTs, INSERTs, DELETEs, and UPDATEs work great for the CRUD (Create, Read, Update, Delete) operations. However, SQL sucks when you start working with aggregated data. Let me demonstrate this with a simple sales order example. You’ll be able to try it yourself with a simple SQLite setup described below.
There are two tutorials that show how to set up DBeaver with SQLite to run all the SQL queries and GoodData XAE (Extensible Analytics Engine) to execute the metric queries.
I have the following data model:
SQL uses the GROUP BY statement for data…
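To make the GROUP BY aggregation concrete, here is a minimal sketch using an in-memory SQLite database from Python. The article's actual data model is not reproduced here, so the order_lines table, its columns, and the sample rows are assumptions made purely for illustration.

```python
import sqlite3

# Hypothetical sample data: (customer, product, quantity, price).
SAMPLE_ROWS = [
    ("Acme", "widget", 2, 10.0),
    ("Acme", "gadget", 1, 25.0),
    ("Globex", "widget", 5, 10.0),
]


def revenue_by_customer(rows):
    """Return (customer, revenue) pairs aggregated with GROUP BY."""
    conn = sqlite3.connect(":memory:")
    try:
        # Assumed schema; not the data model from the original article.
        conn.execute(
            "CREATE TABLE order_lines ("
            "customer TEXT, product TEXT, quantity INTEGER, price REAL)"
        )
        conn.executemany("INSERT INTO order_lines VALUES (?, ?, ?, ?)", rows)
        # One output row per customer; SUM collapses that customer's lines.
        cursor = conn.execute(
            "SELECT customer, SUM(quantity * price) AS revenue "
            "FROM order_lines GROUP BY customer ORDER BY customer"
        )
        return cursor.fetchall()
    finally:
        conn.close()
```

With the sample rows, revenue_by_customer(SAMPLE_ROWS) returns one row per customer. Note that the aggregation level is fixed by the GROUP BY clause, so a different breakdown (for example by product) requires a different query.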
Today, I am going to share more details about the very heart of the GoodData platform, the Extensible Analytics Engine that we familiarly call the XAE. This core component performs every insight execution on our platform.
Let me demonstrate the power of XAE through one of our top customers, who provides analytics to more than 29 thousand companies. This large-scale analytics solution involves 22.7 TB of data analyzed by 880k users. The data and user loads are not evenly spread: the largest customer serves almost 750 GB to 13.5k users.
The analytic insights are embedded inside of a responsive application that…