EXTBI

Abstract

In today's data-driven world, analytical querying, typically based on the data cube concept, is the cornerstone of answering important business questions and making data-driven decisions. Traditionally, the underlying analytical data was mostly internal to the organization and stored in relational data warehouses and data cubes. Today, external data sources are essential for analytics and, as the Semantic Web gains popularity, more and more external sources are available in native RDF. With the recent SPARQL 1.1 standard, performing analytical queries over RDF data sources has finally become feasible. However, unlike their relational counterparts, RDF data cube stores lack optimizations that enable fast querying. In this paper, we present an approach to optimizing RDF data cubes that is based on three novel cube patterns, together with associated algorithms for transforming the RDF data cube. An extensive experimental evaluation shows that the approach allows trading additional storage and/or load time for significantly increased query performance. We further provide guidelines on which patterns to apply for specific scenarios and systems.

Authors: Kim A. Jakobsen, Alex B. Andersen, Katja Hose, and Torben Bach Pedersen

Dataset

Schema	Scale 0.1	Scale 0.2	Scale 0.3	Scale 0.5
Snowflake	Download	Download	Download	Download
Star	Download	Download	Download	Download
Denormalized	Download	Download	Download	Download

Aalborg University

Optimizing RDF Data Cubes for Efficient Processing of Analytical Queries

TPC-H relational diagram
