BI over Big Data

The next generation of BI tools is emerging quietly: the Elastic Stack as a killer of traditional BI tools such as Tableau, Qlik, MicroStrategy and others.

BI for Big Data is a challenging task even for proprietary vendors. The Hadoop ecosystem, driven by the open-source community, has a strong toolset for Big Data ingestion, transformation, storage and processing. However, BI use cases have traditionally been better handled by paid software with attractive visualizations and short development cycles. In this article I discuss one example of using the Elastic Stack, specifically Elasticsearch and Kibana, to solve various Big Data analytics tasks.

BI Trends in 2019

So, what are the top expectations of modern BI systems? Recent research by the Business Application Research Center (BARC) gives some insight:

Pic.1. Top Trends in BI for 2019. Source: Business Application Research Center

Let's look at the top trends and see how effectively we can match them with open-source components. Our target architecture employs the Hortonworks Data Platform (HDP) and Elasticsearch with Kibana.

#1 Data Quality Management: acquiring the data, implementing advanced data processes, distributing the data effectively and managing data oversight.

HDP is packaged with several industry-leading components for large-scale Big Data cases. NiFi for stream acquisition and processing, Kafka for publish/subscribe messaging, Sqoop for batch data acquisition and Storm for data processing are just a few examples of how strong open-source tooling is becoming.
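In the target architecture, records acquired by these components eventually have to land in Elasticsearch. As a rough illustration of that last step, here is a minimal sketch that serializes records into the newline-delimited JSON body accepted by Elasticsearch's bulk API. The function name, the index name "sales" and the sample fields are hypothetical and chosen only for this example; the exact action metadata is version-dependent (6.x releases may also expect a mapping type, typically supplied in the request URL).

```python
import json


def build_bulk_payload(index_name, records, id_field="id"):
    """Serialize records into a newline-delimited JSON bulk body.

    index_name and id_field are illustrative assumptions, not part of
    the architecture described in the article.
    """
    lines = []
    for record in records:
        # Each document is preceded by an action line naming the target index.
        action = {"index": {"_index": index_name, "_id": str(record[id_field])}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(record))
    # The bulk body must be terminated by a newline.
    return "\n".join(lines) + "\n"


# Hypothetical sample records standing in for data delivered by the pipeline.
sample_records = [
    {"id": 1, "region": "EMEA", "amount": 120.5},
    {"id": 2, "region": "APAC", "amount": 80.0},
]

payload = build_bulk_payload("sales", sample_records)
print(payload)
```

The resulting string could be sent as the body of a POST request to the `_bulk` endpoint with the `Content-Type: application/x-ndjson` header, for example from a NiFi processor or a small consumer reading from Kafka.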

#2 Data Discovery / Visualization: understanding the relationship between data in the form of data preparation, visual analysis and guided advanced analytics.

#3 Self-service BI: the ability for users to create their own reports, store them and reuse them in the future.

Kibana, which runs on top of Elasticsearch, provides excellent NoSQL data discovery and visualization tools. It comes bundled with several sample data sets and over 270 reports.
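Behind a typical Kibana chart sits an Elasticsearch aggregation query. The sketch below builds one such request body: a date histogram with a terms sub-aggregation, roughly what a stacked "count over time by category" chart would need. The function name and the field names `order_date` and `category.keyword` are assumptions for illustration only. The `interval` parameter matches the 6.x releases discussed here; later releases replace it with `calendar_interval` and `fixed_interval`.

```python
import json


def build_histogram_query(date_field, category_field, interval="1d", top_n=5):
    """Build a search body that buckets documents by time, then by category.

    All field names passed in are hypothetical; adapt them to your index mapping.
    """
    return {
        # size 0: return only aggregation buckets, no individual documents.
        "size": 0,
        "aggs": {
            "over_time": {
                "date_histogram": {"field": date_field, "interval": interval},
                "aggs": {
                    "by_category": {
                        "terms": {"field": category_field, "size": top_n}
                    }
                },
            }
        },
    }


query = build_histogram_query("order_date", "category.keyword")
print(json.dumps(query, indent=2))
```

A body like this would be posted to an index's `_search` endpoint. In self-service use, Kibana generates comparable requests from the visualization editor, which is why no coding is required from the report author.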

The following are a few examples of dashboards running on Kibana with no coding or scripting.

The latest version of Kibana (ELK 6.5) comes with one more option: Canvas, a powerful presentation tool built into the system. You can build pixel-perfect presentations on live data.

Summary

Open-source tools have been around for a while, but recent trends show they are becoming more user-friendly and enterprise-ready, with quick delivery cycles. The Elastic Stack is probably at the top of that trend. Combined with one of the leading Hadoop distributions, it lets us create truly brilliant solutions from a cost and scalability perspective.

References

Top Business Intelligence Trends 2019: What 2,679 BI Professionals Really Think

Survey of nearly 2,700 BI professionals (2,130 BI users, 337 BI consultants and 212 BI vendors) on the most important BI trends.

Hortonworks Data Platform (HDP)
A nicely packaged open-source distribution of Hadoop with enterprise-level support. I was personally inspired by, and drawn into, the complex Hadoop ecosystem thanks to its professionally prepared documentation. Many thanks to the Hortonworks team!

Open Source Search and Analytics Platform - Elastic Stack
I've been working with different BI and reporting packages for many years, and using open-source software as a front end was always a great challenge. With Elasticsearch and Kibana this challenge seems to be ending. Elasticsearch provides fast NoSQL search capabilities and Kibana a rich BI/reporting GUI on top of that, so these two components can be used for a wide range of tasks. I'd highly recommend this stack for near-real-time BI/reporting tasks.

Canvas: Showcase Your Data, Live & Pixel-Perfect

Examples, videos and links to infographic-style presentations and reports with Canvas