
SPARQL server setup

Patrick Neises edited this page Sep 11, 2024 · 9 revisions

Running a local SPARQL server with the DBLP dataset using QLever

This page provides a step-by-step tutorial on how to run your own SPARQL server on a Linux machine. We assume that you have root access to the system. The dataset used in this tutorial is the one served at https://sparql.dblp.org/, which can be downloaded from https://sparql.dblp.org/download/.

System requirements

In order to host your own version of the dataset using QLever, we recommend:

  • 32 GB of free RAM
  • 200 GB of free disk space

This is sufficient to build a QLever index and run the QLever server afterwards.

Installing dependencies

In order to run your own SPARQL endpoint, you need the following software:

  • python3
  • pip
  • docker
  • curl
  • tar

On Ubuntu, you can install these by running the following command:

apt install python3 python3-pip docker.io curl tar
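After installation, you can sanity-check that all the tools are on your PATH. This is only a quick sketch; on a fresh install, docker may additionally require starting its service or adding your user to the docker group.

```shell
# Check that each required tool is on PATH; prints "ok" or "missing".
for tool in python3 pip3 docker curl tar; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: ok"
  else
    echo "$tool: missing"
  fi
done
```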

Installing the qlever control script

Next, install the qlever control script by installing the qlever package via pip:

pip3 install qlever
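If the qlever command is not found afterwards, pip may have installed its scripts into ~/.local/bin, which is not always on PATH. This is an assumption about a typical per-user pip install; when installing as root, the scripts usually land in a directory that is already on PATH.

```shell
# Make per-user pip script installs visible to the shell
# (add this to your shell profile to make it permanent).
export PATH="$HOME/.local/bin:$PATH"
# Verify that the directory is now on PATH.
echo "$PATH" | grep -q "$HOME/.local/bin" && echo "PATH includes ~/.local/bin"
```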

Using the qlever control script to run a SPARQL endpoint with the DBLP dataset

In order to run an endpoint, we first need to create a directory that will hold the data for the endpoint and cd into it. In this example we use ~/sparql-dblp.

mkdir ~/sparql-dblp
cd ~/sparql-dblp

Then we need to create a configuration file for qlever. Please copy the following code into a file named Qleverfile in the current directory and change ACCESS_TOKEN to some random string of your choice:

[data]
NAME              = dblp
GET_DATA_URL      = https://sparql.dblp.org/download/dblp_KG_with_associated_data.tar
GET_DATA_CMD      = curl -LO -C - ${GET_DATA_URL}; tar -xf dblp_KG_with_associated_data.tar; for F in *.ttl.gz; do zcat -f "$$F" | grep ^@prefix; done | sort -u > dblp.prefix-definitions;
DESCRIPTION       = DBLP computer science bibliography, data from ${GET_DATA_URL}
TEXT_DESCRIPTION  = All literals, search with FILTER KEYWORDS(?text, "...")

[index]
INPUT_FILES     = *.ttl.gz
CAT_INPUT_FILES = zcat -f dblp.prefix-definitions ${INPUT_FILES}
SETTINGS_JSON   = { "ascii-prefixes-only": false, "num-triples-per-batch": 1000000 }
TEXT_INDEX      = from_literals

[server]
PORT               = 7015
ACCESS_TOKEN       = SomeAccessToken
MEMORY_FOR_QUERIES = 30G
CACHE_MAX_SIZE     = 5G

[runtime]
SYSTEM = docker
IMAGE  = docker.io/adfreiburg/qlever:latest

[ui]
UI_CONFIG = dblp
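The GET_DATA_CMD above collects the distinct @prefix declarations from all .ttl.gz files into dblp.prefix-definitions (the loop variable is written $$F there only to escape it for the Qleverfile parser; in plain shell it is $F). As an illustration, here is that pipeline run in a scratch directory against a tiny made-up Turtle file:

```shell
# Demonstrate the prefix-extraction pipeline from GET_DATA_CMD
# on a small hand-written sample instead of the real DBLP dump.
mkdir -p /tmp/qlever-prefix-demo && cd /tmp/qlever-prefix-demo
printf '@prefix dblp: <https://dblp.org/rdf/schema#> .\n<#r1> dblp:title "Example" .\n' \
  | gzip > sample.ttl.gz
# Decompress every .ttl.gz file, keep only @prefix lines, deduplicate.
for F in *.ttl.gz; do zcat -f "$F" | grep ^@prefix; done | sort -u > dblp.prefix-definitions
cat dblp.prefix-definitions
# → @prefix dblp: <https://dblp.org/rdf/schema#> .
```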

This Qleverfile tells QLever where to obtain the dataset and how to configure the index, server, and UI. Next, fetch the data and build the index by running the following commands:

qlever get-data
qlever index

This may take a while. Once the index has been built, start the QLever server and the UI by running the following commands:

qlever start
qlever ui

Once those services are up, you can access the query UI at http://localhost:8176 and use the dataset.
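Besides the UI, the server itself speaks the standard SPARQL HTTP protocol on the port from the Qleverfile, so you can also query it directly. The following is a sketch that assumes the server from this tutorial is running on localhost:7015 and that dblp:title is the predicate you want to inspect; adjust the port and query as needed.

```shell
# Send a SPARQL query to the endpoint via a URL-encoded POST request.
QUERY='PREFIX dblp: <https://dblp.org/rdf/schema#>
SELECT ?title WHERE { ?paper dblp:title ?title } LIMIT 3'
curl -s --max-time 10 http://localhost:7015 \
  -H 'Accept: application/sparql-results+json' \
  --data-urlencode "query=${QUERY}" \
  || echo 'Server not reachable - is "qlever start" running?'
```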
