Optimizing RDF Data Cubes for Efficient Processing of Analytical Queries

We create the snowflake pattern, star pattern, and fully denormalized pattern, and show how these patterns can be used to improve query times over the RDF version of the TPC-H dataset.

Repositories

The implementation and the code for running the experiments can be accessed at the two GitHub projects linked below.

  • SWOD algorithm implementation (GitHub)

  • Tools for running the experiments (GitHub)

SWOD Implementation

This program generates a series of SPARQL CONSTRUCT queries that create the snowflake pattern and fully denormalized pattern cubes.

This Java program uses Apache Maven to manage its dependencies.
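A typical Maven build and run might look like the sketch below; the JAR name is an assumption, so check the project's pom.xml and documentation for the actual artifact name and any required arguments.

    # Build the SWOD implementation from the project root
    mvn clean package

    # Run the query generator; the JAR name below is a placeholder,
    # use the artifact actually produced in target/
    java -jar target/swod.jar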

The SWOD Tools project already contains the generated SPARQL queries, so it is not necessary to run the SWOD program in order to run the experiments.

SWOD Tools

These tools allow you to generate the TPC-H data as triples (generate.sh), load the data into Virtuoso and Apache Jena (load.sh), run the TPC-H queries on the triple stores (query.sh), and analyse the results by comparing the query runs (extractQueryTimes.py, compareResults.py).

All scripts are written in Bash and Python, which might cause problems on Windows systems.

The Bash scripts take a series of "sources" as input; these modular configuration files are located in the "source" folder. Be aware that these configuration files need to be set up manually before running any of the programs.
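As a rough illustration, a machine-specific source file might define a handful of shell variables like the hypothetical sketch below; the variable names are placeholders, so use the existing files in the source folder as a template for the actual keys.

    # source/machine/mymachine.source  (hypothetical example)
    # The actual variable names are defined by the scripts in this repository.
    TPCH_SCALE_FACTOR=1                 # scale factor used when generating data
    DATA_DIR=/data/tpch-rdf             # where the generated triples are stored
    VIRTUOSO_HOME=/usr/local/virtuoso   # Virtuoso installation directory
    JENA_HOME=/opt/apache-jena          # Apache Jena installation directory

The scripts are then pointed at one or more of these source files when they are run, as illustrated in the workflow sketch further below.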

The Python scripts have a help flag (--help) that displays the allowed parameters.
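For example, assuming a Python interpreter is available on your PATH:

    python extractQueryTimes.py --help
    python compareResults.py --help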

Workflow

  1. Download and install the following programs

  2. Create configuration files (source files) that match your system (source/machine/) and desired configuration (scale factor, etc.)

  3. Generate or download the dataset

    • Generating the data requires Virtuoso for running the CONSTRUCT queries

  4. Install Virtuoso or Apache Jena

  5. Load the data into Jena TDB or Virtuoso using the appropriate configuration files

  6. Change the query-mix configuration (source/) to match the queries you want to execute, then run the querymix.sh script to propagate these settings

  7. Run the query.sh script with the appropriate configuration files to start the experiments

  8. Use extractQueryTimes.py on the generated log files (logs/) to extract and aggregate the query times

  9. The experiments can now be compared using the compareResults.py script (a rough end-to-end sketch of these steps follows below)
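Assuming the source files have been set up as described above, the workflow might look roughly like the following on the command line; the source file names and argument lists are placeholders, and the exact arguments each script expects may differ.

    # 3. Generate the TPC-H triples (Virtuoso is needed for the CONSTRUCT queries)
    ./generate.sh source/machine/mymachine.source

    # 5. Load the generated data into Virtuoso or Jena TDB
    ./load.sh source/machine/mymachine.source

    # 6. Propagate the query-mix settings after editing the files in source/
    ./querymix.sh

    # 7. Run the experiments
    ./query.sh source/machine/mymachine.source

    # 8.-9. Extract the query times from logs/ and compare the experiments;
    #       both scripts document their expected parameters via --help
    python extractQueryTimes.py --help
    python compareResults.py --help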

Feel free to post bug reports and ask questions.

Copyright © 2014 - All Rights Reserved - EXTBI