Chapter 7. Sesame Console

Table of Contents

7.1. Getting started
7.2. Connecting to a set of repositories
7.3. Repository list
7.4. Creating a repository
7.5. Other commands
7.6. Repository configuration
7.6.1. Memory store configuration
7.6.2. Native store configuration
7.6.3. RDBMS store configuration
7.6.4. HTTP repository configuration
7.7. Repository configuration templates (advanced)

This chapter describes Sesame Console, a command-line application for interacting with Sesame. For now, the best way to create and manage repositories in a SYSTEM repository is to use the Sesame Console.

7.1. Getting started

Sesame Console can be started using the console.bat/.sh scripts that can be found in the bin directory of the Sesame SDK. By default, the console will connect to the "default data directory", which contains the console's own set of repositories. See Chapter 5, Application directory configuration for more info on data directories.

The console can be operated by typing commands. Commands can span multiple lines and end with a '.' at the end of a line. For example, to get an overview of the available commands, type:

help.

To get help for a specific command, type 'help' followed by the command name, e.g.:

help connect.

7.2. Connecting to a set of repositories

As indicated in the previous section, the console connects to its own set of repositories by default. Using the connect command, you can make the console connect to a Sesame Server or to a set of repositories on your file system. For example, to connect to a Sesame Server that is listening on port 8080 on localhost, enter the following command:

connect http://localhost:8080/openrdf-sesame.

7.3. Repository list

To get an overview of the repositories that are available in the set that your console is connected to, use the 'show' command:

show repositories.

7.4. Creating a repository

The 'create' command can be used to add new repositories to the set that the console is connected to. This command expects the name of a template that describes the repository's configuration. Currently, there are nine templates that are included with the console by default:

  • memory -- a memory-based RDF repository
  • memory-rdfs -- a main-memory repository with RDF Schema inferencing
  • memory-rdfs-dt -- a main-memory repository with RDF Schema and direct type hierarchy inferencing
  • native -- a repository that uses an on-disk data structure
  • native-rdfs -- a native repository with RDF Schema inferencing
  • native-rdfs-dt -- a native repository with RDF Schema and direct type hierarchy inferencing
  • pgsql -- a repository that stores data in a PostgreSQL database
  • mysql -- a repository that stores data in a MySQL database
  • remote -- a repository that serves as a proxy for a repository on a Sesame Server

When the 'create' command is executed, the console will ask you to fill in a number of parameters for the type of repository that you chose. For example, to create a native repository, you execute the following command:

create native.

The console will then ask you to provide an ID and title for the repository, as well as the triple indexes that need to be created for this kind of store. The values between square brackets indicate default values which you can select by simply hitting enter. The output of this dialogue looks something like this:

Please specify values for the following variables:
Repository ID [native]: myRepo
Repository title [Native store]: My repository
Triple indexes [spoc,posc]: 
Repository created

Please see Section 7.6, “Repository configuration” for more info on the repository configuration options.

7.5. Other commands

Please check the documentation that is provided by the console itself for help on how to use the other commands. Most commands should be self-explanatory.

7.6. Repository configuration

7.6.1. Memory store configuration

A memory store is an RDF repository that stores its data in main memory. Apart from the standard ID and title parameters, this type of repository has two further parameters: Persist and Sync delay.

7.6.1.1. Memory Store persistence

The Persist parameter controls whether the memory store will use a data file for persistence over sessions. Persistent memory stores write their data to disk before being shut down and read this data back in the next time they are initialized. Non-persistent memory stores are always empty upon initialization.

7.6.1.2. Synchronization delay

By default, the memory store persistence mechanism synchronizes the disk backup directly upon any change to the contents of the store. That means that directly after an update operation (upload, removal) completes, the disk backup is updated. However, it is possible to configure a synchronization delay. This can be useful if your application performs several transactions in sequence and you want to prevent disk synchronization in the middle of this sequence to improve update performance.

The synchronization delay is specified by a number, indicating the time in milliseconds that the store will wait before it synchronizes changes to disk. The value 0 indicates that there should be no delay. Negative values can be used to postpone the synchronization indefinitely, i.e. until the store is shut down.
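
The three cases can be restated as a short Python sketch. The function name describe_sync_delay is hypothetical; the code only mirrors the rules given above and is not taken from Sesame.

```python
def describe_sync_delay(delay_ms: int) -> str:
    """Restate the Sync delay rules from the text (illustrative only)."""
    if delay_ms == 0:
        # No delay: the disk backup is updated right after each change.
        return "synchronize immediately after each change"
    if delay_ms < 0:
        # Negative values postpone synchronization until shutdown.
        return "synchronize only when the store is shut down"
    # Positive values are a waiting time in milliseconds.
    return f"wait {delay_ms} ms before synchronizing changes"


print(describe_sync_delay(0))
print(describe_sync_delay(-1))
print(describe_sync_delay(1000))
```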

7.6.2. Native store configuration

A native store stores and retrieves its data directly to/from disk. The advantage of this over the memory store is that it scales much better as it isn't limited to the size of available memory. Of course, since it has to access the disk, it is also slower than the in-memory store, but it is a good solution for larger data sets.

7.6.2.1. Native store indexes

The native store uses on-disk indexes to speed up querying. It uses B-Trees for indexing statements, where the index key consists of four fields: subject (s), predicate (p), object (o) and context (c). The order in which these fields are used in the key determines how useful an index is for a specific statement query pattern: searching statements with a specific subject in an index that has the subject as the first field is significantly faster than searching these same statements in an index where the subject field is second or third. In the worst case, the 'wrong' statement pattern will result in a sequential scan over the entire set of statements.

By default, the native repository only uses two indexes, one with a subject-predicate-object-context (spoc) key pattern and one with a predicate-object-subject-context (posc) key pattern. However, it is possible to define more or other indexes for the native repository, using the Triple indexes parameter. This can be used to optimize performance for query patterns that occur frequently.
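
To illustrate how key order affects lookups with the two default indexes, the following Python sketch scores an index by the number of leading key fields that are bound in a query pattern. This is a simplified model, not the native store's actual selection logic; the function names usable_prefix_length and best_index are hypothetical.

```python
def usable_prefix_length(index: str, bound_fields: set[str]) -> int:
    """Count how many leading key fields of `index` are bound in the pattern."""
    count = 0
    for field in index:
        if field not in bound_fields:
            break  # the first unbound field ends the usable key prefix
        count += 1
    return count


def best_index(indexes: list[str], bound_fields: set[str]) -> str:
    """Pick the index whose key starts with the most bound fields."""
    return max(indexes, key=lambda index: usable_prefix_length(index, bound_fields))


# Only the subject is bound: "spoc" has a usable prefix of 1, "posc" of 0.
print(best_index(["spoc", "posc"], {"s"}))  # spoc
# Predicate and object are bound: "posc" has a usable prefix of 2.
print(best_index(["spoc", "posc"], {"p", "o"}))  # posc
```

In this model a prefix length of 0 corresponds to the sequential scan mentioned above. A pattern with only the object bound scores 0 on both default indexes, which is the kind of case an additional index such as "ospc" would address.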

The subject, predicate, object and context fields are represented by the characters 's', 'p', 'o' and 'c' respectively. Indexes can be specified by creating 4-letter words from these four characters. Multiple indexes can be specified by separating these words with commas, spaces and/or tabs. For example, the string "spoc, posc" specifies two indexes: a subject-predicate-object-context index and a predicate-object-subject-context index.
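
The format can be checked with a few lines of Python. This is a minimal sketch, not the parser used by Sesame: the function name parse_index_spec is hypothetical, and the sketch assumes that each index uses every one of the four characters exactly once, which the text implies but does not state outright.

```python
import re

FIELDS = set("spoc")


def parse_index_spec(spec: str) -> list[str]:
    """Split an index specification on commas, spaces and tabs, then validate it."""
    indexes = [word for word in re.split(r"[, \t]+", spec.strip()) if word]
    for word in indexes:
        # Assumption: a valid index is a permutation of 's', 'p', 'o' and 'c'.
        if len(word) != 4 or set(word) != FIELDS:
            raise ValueError(f"invalid index specification: {word!r}")
    return indexes


print(parse_index_spec("spoc, posc"))  # ['spoc', 'posc']
```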

Creating more indexes potentially speeds up querying (a lot), but also adds overhead for maintaining the indexes. Also, every added index takes up additional disk space.

The native store automatically creates and drops indexes upon (re)initialization. The parameter can therefore be adjusted at any time: upon the first refresh of the configuration, the native store changes its indexing strategy without loss of data.

7.6.3. RDBMS store configuration

An RDBMS store is an RDF repository that stores its data in a relational database. Currently, PostgreSQL and MySQL are supported. Each RDBMS has its own configuration template, "pgsql" and "mysql" respectively, but the two templates have the same set of parameters.

7.6.3.1. JDBC driver

The RDBMS store communicates with a database via the JDBC driver for the RDBMS in question. These JDBC drivers are not included in the Sesame SDK; you will need to add the driver jar-files to the Console and/or Sesame server before they can run the RDBMS store. Note that you don't need to add the driver to the Console if you only use it to configure an RDBMS store on a Sesame server.

To add the JDBC driver to the Sesame Console, just put the JDBC jar-file in the SDK's lib directory with all the other jar-files. To add it to a Sesame server, add the jar-file to the web application's WEB-INF/lib directory.

7.6.3.2. JDBC parameters

The database that the RDBMS store should use is defined using the following set of parameters:

  • JDBC driver -- specifies which JDBC driver an RDBMS store should use. The default value specified by the configuration templates should be used in most cases.
  • Host -- specifies the name of the machine that is running the database.
  • Port -- specifies the port to use for communication with the host machine. The configuration templates specify the default port numbers for their RDBMS's.
  • Database -- specifies the name of the database that should be used.
  • Connection properties -- can optionally be used to specify additional properties for the JDBC driver. Please consult the documentation of the RDBMS's JDBC driver for more info.
  • User name -- the user name or role that should be used to authenticate with the RDBMS.
  • Password -- the password for the specified user name or role.

7.6.3.3. Table layout parameters

The database's table layout can be tweaked using the Max number of triple tables parameter. The RDBMS store supports both a "monolithic" schema with a single table that stores all statements and a vertical schema that stores statements in a per-predicate table.

The vertical layout has better query evaluation performance on most data sets, but potentially leads to a huge number of tables, depending on the number of unique predicates in your data. If the number of tables becomes too large, the database's performance can start to decrease or it can even fail completely. To prevent these problems, you can specify the maximum number of triple tables that should be created. Setting this parameter to 1 results in a monolithic schema; setting it to 0 or a negative value disables the limit.
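
The effect of the parameter can be summarized in a short Python sketch. The function name describe_table_layout is hypothetical; the code only restates the rules above and says nothing about how the store distributes statements once the limit is reached.

```python
def describe_table_layout(max_triple_tables: int) -> str:
    """Restate the Max number of triple tables rules (illustrative only)."""
    if max_triple_tables == 1:
        # A single table holds all statements.
        return "monolithic schema"
    if max_triple_tables <= 0:
        # Zero or a negative value disables the limit.
        return "vertical schema, no limit on triple tables"
    return f"vertical schema, at most {max_triple_tables} triple tables"


print(describe_table_layout(1))
print(describe_table_layout(0))
print(describe_table_layout(256))
```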

7.6.4. HTTP repository configuration

An HTTP repository isn't an actual store, but serves as a proxy for a store on a (remote) Sesame server. Apart from the standard ID and title parameters, this type of repository has a Sesame server location and a Remote repository ID parameter.

7.6.4.1. Sesame server location

This parameter specifies the URL of the Sesame Server that the repository should communicate with. The default value is http://localhost:8080/openrdf-sesame, which corresponds to a Sesame Server that is running on your own machine.

7.6.4.2. Remote repository ID

This is the ID of the remote repository that the HTTP repository should communicate with. Please note that an HTTP repository has two repository ID parameters: one identifying the remote repository and one that specifies the HTTP repository's own ID.

7.7. Repository configuration templates (advanced)

In Sesame, repository configurations with all their parameters are modeled in RDF and stored in the SYSTEM repository. So, in order to create a new repository, the Console needs to create such an RDF document and submit it to the SYSTEM repository. The Console uses so-called repository configuration templates to accomplish this.

Repository configuration templates are simple Turtle RDF files that describe a repository configuration, where some of the parameters are replaced with variables. The Console parses these templates and asks the user to supply values for the variables. The variables are then substituted with the specified values, which produces the required configuration data.

The Sesame Console comes with a number of default templates, which are listed in Section 7.4, “Creating a repository”. The Console tries to resolve the parameter specified with the 'create' command (e.g. "memory") to a template file with the same name (e.g. "memory.ttl"). The default templates are included in the Console library, but the Console also looks in the templates subdirectory of [ADUNA_DATA]. You can define your own templates by placing template files in this directory.

To create your own templates, it's easiest to start with an existing template and modify that to your needs. The default "memory.ttl" template looks like this:

#
# Sesame configuration template for a main-memory repository
#
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix rep: <http://www.openrdf.org/config/repository#>.
@prefix sr: <http://www.openrdf.org/config/repository/sail#>.
@prefix sail: <http://www.openrdf.org/config/sail#>.
@prefix ms: <http://www.openrdf.org/config/sail/memory#>.

[] a rep:Repository ;
   rep:repositoryID "{%Repository ID|memory%}" ;
   rdfs:label "{%Repository title|Memory store%}" ;
   rep:repositoryImpl [
      rep:repositoryType "openrdf:SailRepository" ;
      sr:sailImpl [
         sail:sailType "openrdf:MemoryStore" ;
         ms:persist {%Persist|true|false%} ;
         ms:syncDelay {%Sync delay|0%}
      ]
   ].

Template variables are written down as {%var name%} and can specify zero or more values, separated by vertical bars ("|"). If one value is specified, then this value is interpreted as the default value for the variable. The Console will use this default value when the user simply hits the Enter key. If multiple variable values are specified, e.g. {%Persist|true|false%}, then this is interpreted as the set of all possible values. If the user enters an unspecified value, then that is considered to be an error. The value that is specified first is used as the default value.
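
The substitution rules can be sketched in Python as follows. This is an illustration of the rules described above, not the Console's implementation: the names parse_variable, resolve and fill_template are hypothetical, and returning an empty string for a variable without values and without user input is an assumption, since the text does not cover that case.

```python
import re

# Matches {%name%}, {%name|default%} and {%name|value1|value2%}.
VARIABLE = re.compile(r"\{%(.+?)%\}")


def parse_variable(body: str) -> tuple[str, list[str]]:
    """Split 'name|v1|v2' into the variable name and its listed values."""
    name, *values = body.split("|")
    return name, values


def resolve(name: str, values: list[str], answer: str) -> str:
    """Apply the default and allowed-value rules to one user answer."""
    if answer == "":
        # Enter pressed: fall back to the first listed value.
        # Assumption: no listed values means an empty result.
        return values[0] if values else ""
    if len(values) > 1 and answer not in values:
        # Multiple listed values form the set of all possible values.
        raise ValueError(f"{answer!r} is not a valid value for {name!r}")
    return answer


def fill_template(template: str, answers: dict[str, str]) -> str:
    """Replace every template variable with the resolved user answer."""
    def substitute(match: re.Match) -> str:
        name, values = parse_variable(match.group(1))
        return resolve(name, values, answers.get(name, ""))

    return VARIABLE.sub(substitute, template)


template = 'rep:repositoryID "{%Repository ID|memory%}" ; ms:persist {%Persist|true|false%}'
print(fill_template(template, {"Repository ID": "myRepo"}))
# rep:repositoryID "myRepo" ; ms:persist true
```

Here the missing answer for Persist stands in for the user hitting Enter, so the first listed value is used.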

The URIs that are used in the templates are the URIs that are specified by the RepositoryConfig and SailConfig classes of Sesame's repository configuration mechanism. The relevant namespaces and URIs can be found in the javadoc or source of these classes.