As Semantic Web technologies mature, there is a growing need for RDF applications to access the content of huge, live, non-RDF, legacy databases without having to replicate the whole database into RDF. This document describes the D2RQ mapping language for treating non-RDF relational databases as virtual RDF graphs, and the D2RQ Platform that enables applications to access these graphs through the Jena and Sesame APIs, as well as over the Web via the SPARQL Protocol and as Linked Data.
This document describes the D2RQ Platform for accessing non-RDF, relational databases as virtual, read-only RDF graphs. D2RQ offers a variety of different RDF-based access mechanisms to the content of huge, non-RDF databases without having to replicate the database into RDF.
Using D2RQ you can:
The D2RQ Platform consists of:
The figure below depicts the architecture of the D2RQ Platform:
The D2RQ Engine is implemented as a Jena graph, the basic information representation object within the Jena framework. A D2RQ graph wraps one or more local relational databases into a virtual, read-only RDF graph. It rewrites Jena and Sesame API calls, find() calls and SPARQL queries into SQL queries that are specific to the application's data model. The result sets of these SQL queries are transformed into RDF triples or SPARQL result sets that are passed up to the higher layers of the framework. The D2RQ Sesame interface wraps the D2RQ Jena graph implementation behind a Sesame RDF source interface. It provides a read-only Sesame repository interface for querying and reasoning with RDF and RDF Schema.
D2R Server is a tool for publishing relational databases on the Semantic Web. It enables RDF and HTML browsers to navigate the content of the database, and allows applications to query the database using the SPARQL query language. D2R Server builds on the D2RQ Engine. For detailed information on how to set up D2R Server please refer to the separate D2R Server website.
Example
Throughout this manual, we use an example database that stores information about conferences, papers, authors and topics. The database is mapped to the International Semantic Web Community (ISWC) Ontology.
The D2RQ Platform has been tested with these database engines:
The D2RQ Platform comes with two command line tools: a mapping generator that creates a default mapping file by analyzing the schema of an existing database, and a dump script that writes the complete contents of a database into a single RDF file. The scripts work on Windows and Unix systems.
The generate-mapping script creates a default mapping file by analyzing the schema of an existing database. This mapping file can be used as-is or can be customized.
generate-mapping [-u username] [-p password] [-d driverclass] [-o outfile.n3] jdbcURL
JDBC connection URL for the database. Refer to your JDBC driver documentation for the format for your database engine. Examples:
MySQL: jdbc:mysql://servername/databasename
PostgreSQL: jdbc:postgresql://servername/databasename
Oracle: jdbc:oracle:thin:@servername:1521:databasename
The fully qualified Java class name of the database driver. The jar file containing the JDBC driver has to be in D2RQ's /lib/db-drivers/ directory. Drivers for MySQL, PostgreSQL and Oracle are provided with the download, for other databases a driver has to be downloaded from the vendor or a third party. To find the driver class name, consult the driver documentation. Examples:
MySQL: com.mysql.jdbc.Driver
PostgreSQL: org.postgresql.Driver
Oracle: oracle.jdbc.OracleDriver
Example invocation for a local MySQL database:
generate-mapping -d com.mysql.jdbc.Driver -u root jdbc:mysql://127.0.0.1/iswc
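Before running the script, it can help to confirm that the driver class name is spelled correctly and that the driver can be loaded. The following is a minimal sketch in plain Java. The class DriverCheck and its method isAvailable are hypothetical names and not part of D2RQ, and the check only inspects the classpath of the JVM that runs it, so it assumes the driver jar has been added to that classpath.

```java
// Sketch: check whether a JDBC driver class can be loaded.
// DriverCheck and isAvailable are hypothetical names, not part of D2RQ.
class DriverCheck {

    /** Returns true if the named class can be loaded by the current class loader. */
    static boolean isAvailable(String driverClassName) {
        try {
            Class.forName(driverClassName);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Driver class names taken from the examples above
        String[] drivers = {
            "com.mysql.jdbc.Driver",
            "org.postgresql.Driver",
            "oracle.jdbc.OracleDriver"
        };
        for (String driver : drivers) {
            System.out.println(driver + " -> " + (isAvailable(driver) ? "found" : "missing"));
        }
    }
}
```

A "missing" result here only means the class is not visible to this JVM; it says nothing about the contents of D2RQ's own driver directory.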
The dump-rdf script provides a way of dumping the contents of the whole database into a single RDF file. This can be done with or without a mapping file. If a mapping file is specified, then the script will use it to translate the database contents to RDF. If no mapping file is specified, then the script will invoke generate-mapping and use its default mapping for the translation.
dump-rdf -m mapping.n3 [output parameters]
If no mapping file is provided, then the database connection must be specified on the command line. The meaning of all parameters is the same as for the generate-mapping script.
dump-rdf -u username [-p password] -d driverclass -j jdbcURL [output parameters]
Several optional parameters control the RDF output:
Example invocation using a mapping file:
dump-rdf -m mapping-iswc.n3 -f N-TRIPLE -b http://localhost:2020/ > iswc.nt
This section describes how the D2RQ Engine is used within the Jena 2 Semantic Web framework.
Download
D2RQ can be downloaded from http://sourceforge.net/projects/d2rq-map/
Jena Versions
At the time of writing, the latest Jena release is version 2.4. D2RQ requires a more recent custom-built version of Jena, the version that ships with ARQ 1.4. All required jar files are included in the D2RQ distribution. (Jena 2.4 and 2.3 may work to some extent.)
Installation
Debugging
D2RQ uses the Apache Commons - Logging API for logging. To enable D2RQ debug messages, set the log level for logger de.fuberlin.wiwiss.d2rq to ALL. An easy way to do this is:
org.apache.log4j.Logger.getLogger("de.fuberlin.wiwiss.d2rq").setLevel(org.apache.log4j.Level.ALL);
The ModelD2RQ class provides a Jena Model view on the data in a D2RQ-mapped database. The example shows how a ModelD2RQ is set up using a mapping file, and how Jena API calls are used to extract information about papers and their authors from the model.
The ISWC and FOAF classes have been created with Jena's schemagen tool. The DC and RDF classes are part of Jena.
// Set up the ModelD2RQ using a mapping file
Model m = new ModelD2RQ("file:doc/example/mapping-iswc.n3");

// Find anything with an rdf:type of iswc:InProceedings
StmtIterator paperIt = m.listStatements(null, RDF.type, ISWC.InProceedings);

// List found papers and print their titles
while (paperIt.hasNext()) {
    Resource paper = paperIt.nextStatement().getSubject();
    System.out.println("Paper: " + paper.getProperty(DC.title).getString());

    // List authors of the paper and print their names
    StmtIterator authorIt = paper.listProperties(DC.creator);
    while (authorIt.hasNext()) {
        Resource author = authorIt.nextStatement().getResource();
        System.out.println("Author: " + author.getProperty(FOAF.name).getString());
    }
    System.out.println();
}
In some situations, it is better to use Jena's low-level Graph API instead of the Model API. D2RQ provides an implementation of the Graph interface, the GraphD2RQ.
The following example shows how the Graph API is used to find all papers that have been published in 2003.
// Load mapping file
Model mapping = FileManager.get().loadModel("doc/example/mapping-iswc.n3");

// Set up the GraphD2RQ
GraphD2RQ g = new GraphD2RQ(mapping, "http://localhost:2020/");

// Create a find(spo) pattern
Node subject = Node.ANY;
Node predicate = DC.date.asNode();
Node object = Node.createLiteral("2003", null, XSDDatatype.XSDgYear);
Triple pattern = new Triple(subject, predicate, object);

// Query the graph
Iterator it = g.find(pattern);

// Output query results
while (it.hasNext()) {
    Triple t = (Triple) it.next();
    System.out.println("Published in 2003: " + t.getSubject());
}
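The semantics of such a find(spo) pattern can be illustrated without Jena or a database. The sketch below is plain Java over an in-memory list and uses null where the example above uses Node.ANY. The names FindPatternSketch, Triple, matches and find are hypothetical and chosen for illustration only. D2RQ itself does not scan triples in memory; it rewrites the pattern into SQL.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of triple pattern matching; all names here are hypothetical.
class FindPatternSketch {

    /** A triple of plain strings. In a pattern, null stands for "any value". */
    static final class Triple {
        final String s, p, o;
        Triple(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }
    }

    /** A triple matches if every non-null pattern position is equal to the triple's value. */
    static boolean matches(Triple pattern, Triple t) {
        return (pattern.s == null || pattern.s.equals(t.s))
            && (pattern.p == null || pattern.p.equals(t.p))
            && (pattern.o == null || pattern.o.equals(t.o));
    }

    /** Returns all triples of the graph that match the pattern. */
    static List<Triple> find(List<Triple> graph, Triple pattern) {
        List<Triple> result = new ArrayList<>();
        for (Triple t : graph) {
            if (matches(pattern, t)) result.add(t);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Triple> graph = new ArrayList<>();
        graph.add(new Triple("paper1", "dc:date", "2003"));
        graph.add(new Triple("paper2", "dc:date", "2004"));
        // Pattern (ANY, dc:date, "2003")
        for (Triple t : find(graph, new Triple(null, "dc:date", "2003"))) {
            System.out.println("Published in 2003: " + t.s);
        }
    }
}
```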
In addition to the GraphD2RQ, there is a CachingGraphD2RQ which supports the same API and uses an LRU cache to remember a number of recent query results. This will improve performance for repeated queries, but will report inconsistent results if the database is updated during the lifetime of the CachingGraphD2RQ.
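To make the trade-off concrete, here is a minimal sketch of an LRU cache built on java.util.LinkedHashMap. It is not the CachingGraphD2RQ implementation; the class name LruCacheSketch and the capacity of 2 are assumptions made for illustration. The sketch shows the eviction policy only: the least recently used entry is dropped when the capacity is exceeded, and nothing invalidates an entry when the underlying data changes, which is the source of the stale results mentioned above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of an LRU cache; LruCacheSketch is a hypothetical name, not a D2RQ class.
class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCacheSketch(int capacity) {
        // accessOrder = true keeps entries ordered from least to most recently used
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; evict once the capacity is exceeded
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCacheSketch<String, String> cache = new LruCacheSketch<>(2);
        cache.put("pattern A", "results A");
        cache.put("pattern B", "results B");
        cache.get("pattern A");              // A becomes the most recently used entry
        cache.put("pattern C", "results C"); // evicts B, the least recently used entry
        System.out.println(cache.keySet());
    }
}
```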
D2RQ can answer SPARQL queries against a D2RQ model. The SPARQL queries are processed by Jena's ARQ query engine. The example shows how a D2RQ model is set up, how a SPARQL query is executed, and how the results are written to the console.
ModelD2RQ m = new ModelD2RQ("file:doc/example/mapping-iswc.n3");
String sparql =
    "PREFIX dc: <http://purl.org/dc/elements/1.1/>" +
    "PREFIX foaf: <http://xmlns.com/foaf/0.1/>" +
    "SELECT ?paperTitle ?authorName WHERE {" +
    " ?paper dc:title ?paperTitle . " +
    " ?paper dc:creator ?author ." +
    " ?author foaf:name ?authorName ." +
    "}";
Query q = QueryFactory.create(sparql);
ResultSet rs = QueryExecutionFactory.create(q, m).execSelect();
while (rs.hasNext()) {
    QuerySolution row = rs.nextSolution();
    System.out.println("Title: " + row.getLiteral("paperTitle").getString());
    System.out.println("Author: " + row.getLiteral("authorName").getString());
}
D2RQ comes with a Jena assembler. Jena assembler specifications are RDF configuration files that describe how to construct a Jena model. For more information on Jena assemblers, see the Jena Assembler quickstart page.
The following example shows an assembler specification for a D2RQ model:
@prefix : <#> .
@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#> .

<> ja:imports d2rq: .

:myModel a d2rq:D2RQModel;
    d2rq:mappingFile <mapping-iswc.n3>;
    d2rq:resourceBaseURI <http://localhost:2020/>;
    .
D2RQ model specifications support these two properties:
This usage example will create a D2RQ model from a model specification, and write it to the console:
// Load assembler specification from file
Model assemblerSpec = FileManager.get().loadModel("doc/example/assembler.n3");

// Get the model resource
Resource modelSpec = assemblerSpec.createResource(assemblerSpec.expandPrefix(":myModel"));

// Assemble a model
Model m = Assembler.general.openModel(modelSpec);

// Write it to System.out
m.write(System.out);
This section describes how the D2RQ Engine is used within the Sesame 1.2 RDF API.
Download
You have to download the following packages:
Installation
You have to add the "d2rq.jar" and "d2rq-to-sesame.jar" files from the "bin" directory of the D2RQ distribution together with the Jena 2 and Sesame 1.2 jar files to your classpath. To run D2RQ only the jar files
The following example shows how RDQL is used to get all information about the paper with the URI "http://www.conference.org/conf02004/paper#Paper1" out of a D2RQRepository.
import de.fuberlin.wiwiss.d2rq.sesame.D2RQRepository;
import de.fuberlin.wiwiss.d2rq.sesame.D2RQSource;
import org.openrdf.model.Value;
import org.openrdf.sesame.Sesame;
import org.openrdf.sesame.constants.QueryLanguage;
import org.openrdf.sesame.query.QueryResultsTable;
import org.openrdf.sesame.repository.SesameRepository;
...
try {
    // Initialize repository
    D2RQSource source = new D2RQSource("file:///where/you/stored/the/d2rq-mapping.n3", "N3");
    SesameRepository repos = new D2RQRepository("urn:youRepository", source, Sesame.getService());

    // Query the repository
    String query = "SELECT ?x, ?y WHERE (<http://www.conference.org/conf02004/paper#Paper1>, ?x, ?y)";
    QueryResultsTable result = repos.performTableQuery(QueryLanguage.RDQL, query);

    // Print the result, one row per line
    int rows = result.getRowCount();
    int cols = result.getColumnCount();
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            Value v = result.getValue(i, j);
            System.out.print(v.toString() + " ");
        }
        System.out.println();
    }
} catch (Exception e) {
    // Catches D2RQException from the D2RQSource constructor, and
    // java.io.IOException,
    // org.openrdf.sesame.query.MalformedQueryException,
    // org.openrdf.sesame.query.QueryEvaluationException,
    // org.openrdf.sesame.config.AccessDeniedException
    // from performTableQuery
    e.printStackTrace();
}
The D2RQ mapping language is a declarative language for describing the relation between relational database schemata and RDFS vocabularies or OWL ontologies. A D2RQ map is an RDF document.
The language is formally defined by the D2RQ RDFS Schema.
The D2RQ namespace is http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#
An ontology is mapped to a database schema using d2rq:ClassMaps and d2rq:PropertyBridges. The central object within D2RQ and also the object to start with when writing a new D2RQ map is the ClassMap. A ClassMap represents a class or a group of similar classes of the ontology. A ClassMap specifies how instances of the class are identified. It has a set of PropertyBridges, which specify how the properties of an instance are created.
The figure below shows the structure of an example D2RQ map:
The following example D2RQ map relates the table conferences in a database to the class conference in an ontology. You can use the map as a template for writing your own maps.
# D2RQ Namespace
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#> .

# Namespace of the ontology
@prefix : <http://annotation.semanticweb.org/iswc/iswc.daml#> .

# Namespace of the mapping file; does not appear in mapped data
@prefix map: <file:///Users/d2r/example.n3#> .

# Other namespaces
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

map:Database1 a d2rq:Database;
    d2rq:jdbcDSN "jdbc:mysql://localhost/iswc";
    d2rq:jdbcDriver "com.mysql.jdbc.Driver";
    d2rq:username "user";
    d2rq:password "password";
    .

# -----------------------------------------------
# CREATE TABLE Conferences (ConfID int, Name text, Location text);

map:Conference a d2rq:ClassMap;
    d2rq:dataStorage map:Database1;
    d2rq:class :Conference;
    d2rq:uriPattern "http://conferences.org/comp/confno@@Conferences.ConfID@@";
    .

map:eventTitle a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:Conference;
    d2rq:property :eventTitle;
    d2rq:column "Conferences.Name";
    d2rq:datatype xsd:string;
    .

map:location a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:Conference;
    d2rq:property :location;
    d2rq:column "Conferences.Location";
    d2rq:datatype xsd:string;
    .
The constructs of the D2RQ mapping language are described in detail below.
A d2rq:Database defines a JDBC or ODBC connection to a local relational database and specifies the type of the database columns used by D2RQ. A D2RQ map can contain several d2rq:Databases for accessing different local databases.
Properties
d2rq:jdbcDSN | The JDBC database URL. This is a string of the form jdbc:subprotocol:subname. For a MySQL database, this is something like jdbc:mysql://hostname:port/dbname. Examples for other databases |
d2rq:jdbcDriver | The JDBC driver class name for the database. Used together with d2rq:jdbcDSN. Example: com.mysql.jdbc.Driver for MySQL. |
d2rq:odbcDSN | The ODBC data source name of the database. |
d2rq:username | A username if required by the database. |
d2rq:password | A password if required by the database. |
d2rq:resultSizeLimit | An integer value that will be added as a LIMIT clause to all generated SQL queries. This sets an upper bound for the number of results returned from large databases. Note that this effectively “cripples” the server and can cause unpredictable results. |
d2rq:allowDistinct | Specifies the database's ability to handle DISTINCT correctly. Value: "true" or "false". For example, MS Access cuts fields longer than 256 characters. |
d2rq:expressionTranslator | Specifies a Java class that can translate Jena RDQL expression objects into SQL expressions that are understood by the database. "null" means: no expression translation. Otherwise full SQL 92 is assumed. |
d2rq:textColumn, d2rq:numericColumn, d2rq:dateColumn | These properties are used to declare the column type of database columns. Values are column names in Table_name.column_name notation. These properties do not need to be specified unless the engine is for some reason unable to determine the correct column type by itself. |
Example
map:Database1 a d2rq:Database;
    d2rq:jdbcDSN "jdbc:mysql://localhost/iswc";
    d2rq:jdbcDriver "com.mysql.jdbc.Driver";
    d2rq:username "user";
    d2rq:password "password";
    d2rq:numericColumn "Conferences.ConfID";
    d2rq:textColumn "Conferences.URI";
    d2rq:textColumn "Conferences.Name";
    d2rq:textColumn "Conferences.Location";
    d2rq:dateColumn "Conferences.Date".
A note on using multiple databases within a single D2RQ map: You cannot link from one database to another by having a property bridge with d2rq:belongsToClassMap in one database and d2rq:refersToClassMap in the other. Instead, define class maps in both databases that create the same URIs or blank node IDs.
A d2rq:ClassMap represents a class or a group of similar classes of an OWL ontology or RDFS schema. A class map defines how instances of the class are identified. It is connected to a d2rq:Database and has a set of d2rq:PropertyBridges which attach properties to the instances.
D2RQ provides four different mechanisms of assigning identifiers to the instances in the database:
A URI pattern is instantiated by inserting values of certain database columns into a pattern. Examples:
http://example.org/persons/@@Persons.ID@@
http://example.org/lineItems/item@@Orders.orderID@@-@@LineItems.itemID@@
urn:isbn:@@Books.isbn@@
mailto:@@Persons.email@@
The parts between @@'s mark database columns in Table.Column notation. URI patterns are used with the d2rq:uriPattern property.
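The substitution itself can be sketched in a few lines of plain Java. This is an illustration of the @@Table.Column@@ notation and not D2RQ's implementation: the class UriPatternSketch, the method expand, and the representation of a database row as a Map from "Table.Column" names to string values are all assumptions. The sketch inserts values verbatim and performs no encoding of special characters.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of URI pattern instantiation; all names here are hypothetical.
class UriPatternSketch {
    // Matches one @@Table.Column@@ placeholder and captures the column name
    private static final Pattern PLACEHOLDER = Pattern.compile("@@([^@]+)@@");

    /** Replaces every placeholder in the pattern with the row's value for that column. */
    static String expand(String pattern, Map<String, String> row) {
        Matcher m = PLACEHOLDER.matcher(pattern);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = row.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("No value for column " + m.group(1));
            }
            // quoteReplacement keeps '$' and '\' in values from being interpreted
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> row = new HashMap<>();
        row.put("Orders.orderID", "12");
        row.put("LineItems.itemID", "3");
        System.out.println(expand(
            "http://example.org/lineItems/item@@Orders.orderID@@-@@LineItems.itemID@@", row));
    }
}
```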
Certain characters, like spaces or the hash sign, are not allowed in URIs or have special meaning. Columns that contain such characters need to be encoded before their values can be inserted into a URI pattern:
A relative URI pattern is a URI pattern that generates relative URIs:
persons/@@Persons.ID@@
They will be combined with a base URI provided by the processing environment to form full URIs. Relative URI patterns allow the creation of portable mappings that can be used for multiple instances of the same database schema. Relative URI patterns are generated with the d2rq:uriPattern property.
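How a relative URI combines with a base URI can be illustrated with the standard java.net.URI class, which implements the usual URI reference resolution rules. The class and method names below are hypothetical, and whether D2RQ resolves relative patterns in exactly this way is an assumption; the base URI http://localhost:2020/ is taken from the examples elsewhere in this manual.

```java
import java.net.URI;

// Sketch of resolving a relative URI against a base URI; names are hypothetical.
class RelativeUriSketch {

    /** Resolves a relative URI reference against the given base URI. */
    static String resolve(String baseUri, String relativeUri) {
        return URI.create(baseUri).resolve(relativeUri).toString();
    }

    public static void main(String[] args) {
        // "persons/7" stands for the relative pattern persons/@@Persons.ID@@
        // after the ID value 7 has been inserted
        System.out.println(resolve("http://localhost:2020/", "persons/7"));
        System.out.println(resolve("http://example.org/data/", "persons/7"));
    }
}
```

Because only the base URI changes between deployments, the same mapping file can serve several instances of the same schema.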
In some cases, the database may already contain URIs that can be used as resource identifiers, such as web page and document URLs. URIs are generated from columns with the d2rq:uriColumn property.
RDF also has the concept of blank nodes, existential quantifiers that denote some resource that exists and has certain properties, but is not named. In D2RQ, blank nodes can be generated from one or more columns. A distinct blank node will be generated for each distinct set of values of these columns. The columns are specified using the d2rq:bNodeIdColumns property.
d2rq:dataStorage | Reference to a d2rq:Database where the instance data is stored. |
d2rq:class | An RDF-S or OWL class. All resources generated by this ClassMap are instances of this class. |
d2rq:uriPattern | Specifies a URI pattern that will be used to identify instances of this class map. |
d2rq:uriColumn | A database column containing URIrefs for identifying instances of this class map. The column name has to be in the form "TableName.ColumnName". |
d2rq:bNodeIdColumns | A comma-separated list of column names in "TableName.ColumnName" notation. The instances of this class map will be blank nodes, one distinct blank node per distinct tuple of these columns. |
d2rq:translateWith | Assigns a d2rq:TranslationTable to the class map. Values from the d2rq:uriColumn or d2rq:uriPattern will be translated by the table before a resource is generated. See below for details. |
d2rq:containsDuplicates | Must be specified if a class map uses information from tables that are not fully normalized. If the d2rq:containsDuplicates property value is set to "true", then D2RQ adds a DISTINCT clause to all queries using this classMap. "False" is the default value, which doesn't have to be explicitly declared. Adding this property to class maps based on normalized database tables degrades query performance, but doesn't affect query results. |
d2rq:additionalProperty | Adds an AdditionalProperty to all instances of this class. This might be useful for adding rdfs:seeAlso properties or other fixed statements to all instances of the class. |
d2rq:condition | Specifies an SQL WHERE condition. An instance of this class will only be generated for database rows that satisfy the condition. Conditions can be used to hide parts of the database from D2RQ, e.g. deny access to data which is older or newer than a certain date. See section Conditional Mappings for details. |
ClassMap property:
d2rq:classMap | Inverse of d2rq:class and unnecessary if d2rq:class is used. Specifies that a d2rq:classMap is used to create instances of an OWL or RDF-S class. |
Example: ClassMap where instances are identified using a URI pattern
map:PaperClassMap a d2rq:ClassMap;
    d2rq:uriPattern "http://www.conference.org/conf02004/paper#Paper@@Papers.PaperID@@";
    d2rq:class :Paper;
    d2rq:dataStorage map:Database1.
The d2rq:class property is used to state that all resources generated by the d2rq:ClassMap are instances of an RDFS or OWL class. D2RQ automatically creates the necessary rdf:type triples.
Example: ClassMap where instances are identified using blank nodes
map:Topic a d2rq:ClassMap ;
    d2rq:bNodeIdColumns "Topics.TopicID" ;
    d2rq:class :Topic ;
    d2rq:dataStorage map:Database1 .
In order to recognize bNodes across several find() calls and to be able to map bNodes to instance data in the database, D2RQ encodes the classMap name together with the primary key values in the bNode label. The map above could produce the bNode label "http://www.example.org/dbserver01/db01#Topic@@6", where the number "6" is a primary key value and "http://www.example.org/dbserver01/db01#Topic" is the ClassMap name.
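The label scheme described above can be sketched as a pair of encode and decode helpers. The sketch follows the single-key example label given in the text. How D2RQ separates several key values is not stated here, so joining them with the same "@@" separator is an assumption, as are the class and method names. The sketch also assumes that key values never contain "@@".

```java
import java.util.Arrays;
import java.util.List;

// Sketch of bNode label encoding; names and the multi-key format are assumptions.
class BNodeLabelSketch {
    static final String SEP = "@@";

    /** Builds a label from the ClassMap name and the key values of one row. */
    static String encode(String classMapName, List<String> keyValues) {
        return classMapName + SEP + String.join(SEP, keyValues);
    }

    /** Extracts the ClassMap name, i.e. everything before the first separator. */
    static String classMapOf(String label) {
        return label.substring(0, label.indexOf(SEP));
    }

    /** Extracts the key values, i.e. everything after the first separator. */
    static List<String> keyValuesOf(String label) {
        String keys = label.substring(label.indexOf(SEP) + SEP.length());
        return Arrays.asList(keys.split(SEP));
    }

    public static void main(String[] args) {
        String label = encode("http://www.example.org/dbserver01/db01#Topic", Arrays.asList("6"));
        System.out.println(label);
        System.out.println(classMapOf(label) + " with key " + keyValuesOf(label));
    }
}
```

Because the label carries both the ClassMap name and the key, a blank node returned by one find() call can be mapped back to its database row in a later call.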
Example: ClassMap for a group of classes with the same properties
If you want to use one ClassMap for a group of classes with the same properties (like Person, Professor, Researcher, Student) that all come from the same table, you must create the rdf:type statements with an object property bridge instead of using d2rq:class.
map:PersonsClassMap a d2rq:ClassMap ;
    d2rq:uriColumn "Persons.URI" ;
    d2rq:dataStorage map:Database1 .

map:PersonsType a d2rq:PropertyBridge ;
    d2rq:property rdf:type ;
    d2rq:uriPattern "http://annotation.semanticweb.org/iswc/iswc.daml#@@Persons.Type@@" ;
    d2rq:belongsToClassMap map:PersonsClassMap .
Here, the class of each person is obtained by prefixing the values of the Persons.Type column with an ontology namespace. If the class names within the ontology can't be constructed directly from values of the Persons.Type column, then a TranslationTable could be used for aligning class names and database values.
Property Bridges relate database table columns to RDF properties. They are used to attach properties to the RDF resources created by a class map. The values of these properties are often literals, but can also be URIs or blank nodes that relate the resource to other resources, e.g. the value of a paper's :author property could be a URI representing a person.
If one of the columns used in a property bridge is NULL for some database rows, then no property is created for the resources corresponding to these rows.
Properties
d2rq:belongsToClassMap | Specifies that the property bridge belongs to a d2rq:ClassMap. Must be specified for every property bridge. |
d2rq:property | The RDF property that connects the ClassMap with the object or literal created by the bridge. Must be specified for every property bridge. If multiple d2rq:properties are specified, then one triple with each property is generated per resource. |
d2rq:column | For properties with literal values. The database column that contains the literal values. Column names have to be given in the form "TableName.ColumnName". |
d2rq:pattern | For properties with literal values. Can be used to extend and combine column values before they are used as a literal property value. If a pattern contains more than one column, then a separating string that cannot occur in the column values has to be placed between the column names, so that D2RQ can reverse given literals into column values. |
d2rq:datatype | For properties with literal values. Specifies the RDF datatype of the literals. |
d2rq:lang | For properties with literal values. Specifies the language tag of the literals. |
d2rq:uriColumn | For properties with URI values. Database column that contains URIs. Column names have to be given in the form "TableName.ColumnName". |
d2rq:uriPattern | For properties with URI values. Can be used to extend and combine column values before they are used as URI property values. If a pattern contains more than one column, then a separating string that cannot occur in the column values has to be placed between the column names, so that D2RQ can reverse given URIs into column values. See example below. |
d2rq:refersToClassMap | For properties that correspond to a foreign key. References another d2rq:ClassMap that creates the instances which are used as the values of this bridge. One or more d2rq:join properties must be specified to select the correct instances. See example below. |
d2rq:join | If the columns used to create the literal value or object are not from the database table(s) that contains the ClassMap's columns, then the tables have to be joined together using one or more d2rq:join properties. See example below. |
d2rq:alias | Aliases take the form "Table AS Alias" and are used when a table needs to be joined to itself. The table can be referred to using the alias within the property bridge. See example below. |
d2rq:condition | Specifies an SQL WHERE condition. The bridge will only generate a statement if the condition holds. A common usage is to suppress triples with empty literal values: d2rq:condition "Table.Column <> ''". See section Conditional Mappings for details. |
d2rq:translateWith | Assigns a d2rq:TranslationTable to the property bridge. Values from the d2rq:column or d2rq:pattern will be translated by the table. See section TranslationTables for details. |
d2rq:valueMaxLength | Asserts that all values of this bridge are not longer than a number of characters. This allows D2RQ to speed up queries. See section Performance Optimization for details. |
d2rq:valueContains | Asserts that all values of this bridge always contain a given string. This allows D2RQ to speed up queries. Most useful in conjunction with d2rq:column. See section Performance Optimization for details. |
d2rq:valueRegex | Asserts that all values of this bridge match a given regular expression. This allows D2RQ to speed up queries. Most useful in conjunction with d2rq:column on columns whose values are very different from other columns in the database. See section Performance Optimization for details. |
PropertyBridge property:
d2rq:propertyBridge | Inverse of d2rq:property and not needed if d2rq:property is used. The d2rq:propertyBridge property specifies which property bridge is used for an RDF property. If the same RDF property is used by several RDF classes, then several property bridges are used to relate the RDF property to the different class maps. |
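The d2rq:pattern and d2rq:uriPattern entries above require a separator between columns so that a value can be reversed into column values. The following plain Java sketch shows one way such a reversal could work, by turning the pattern into a regular expression. It is not D2RQ's algorithm; the class PatternReversalSketch and the method reverse are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of reversing a pattern value into column values; names are hypothetical.
class PatternReversalSketch {
    private static final Pattern PLACEHOLDER = Pattern.compile("@@([^@]+)@@");

    /** Returns column name to value, or null if the value cannot stem from the pattern. */
    static Map<String, String> reverse(String pattern, String value) {
        List<String> columns = new ArrayList<>();
        StringBuilder regex = new StringBuilder();
        Matcher m = PLACEHOLDER.matcher(pattern);
        int last = 0;
        while (m.find()) {
            // Literal text between placeholders must match exactly
            regex.append(Pattern.quote(pattern.substring(last, m.start())));
            // Each placeholder becomes a capturing group
            regex.append("(.*?)");
            columns.add(m.group(1));
            last = m.end();
        }
        regex.append(Pattern.quote(pattern.substring(last)));

        Matcher vm = Pattern.compile(regex.toString(), Pattern.DOTALL).matcher(value);
        if (!vm.matches()) return null;
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < columns.size(); i++) {
            result.put(columns.get(i), vm.group(i + 1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(reverse(
            "http://example.org/lineItems/item@@Orders.orderID@@-@@LineItems.itemID@@",
            "http://example.org/lineItems/item12-3"));
    }
}
```

If the separator could also occur inside a column value, more than one split of the same string would be consistent with the pattern, and the reversal would no longer be reliable. This is why the separator has to be a string that cannot occur in the column values.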
Example: A simple property bridge
map:PaperTitle a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:Paper;
    d2rq:property :title;
    d2rq:column "Papers.Title";
    d2rq:lang "en";
    .
This attaches a :title property to all resources generated by the map:Paper class map. The property values are taken from the Papers.Title column. The generated literals will have a language tag of "en".
Example: Property bridge using information from another database table
map:authorName a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:Papers;
    d2rq:property :authorName;
    d2rq:column "Persons.Name";
    d2rq:join "Papers.PaperID = Rel_Person_Paper.PaperID";
    d2rq:join "Rel_Person_Paper.PersonID = Persons.PerID";
    d2rq:datatype xsd:string;
    .
This property bridge adds the names of authors to papers. If a paper has several authors, then several :authorName properties are added. The tables Papers and Persons are in an n:m relation. The d2rq:join is used to join the tables over the Rel_Person_Paper table.
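To see what the two d2rq:join conditions amount to, the sketch below assembles an SQL statement from the selected columns and the join conditions. The SQL that D2RQ actually generates may look different; the class JoinSqlSketch, its methods, and the choice to select Papers.PaperID alongside Persons.Name are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Sketch of building a join query from d2rq:join conditions; names are hypothetical.
class JoinSqlSketch {

    /** Extracts the table name from "Table.Column" notation. */
    static String tableOf(String column) {
        return column.substring(0, column.indexOf('.'));
    }

    /** Builds a SELECT over all tables mentioned in the columns and join conditions. */
    static String select(List<String> columns, List<String> joins) {
        Set<String> tables = new LinkedHashSet<>();
        for (String column : columns) {
            tables.add(tableOf(column));
        }
        for (String join : joins) {
            // A join condition has the form "Table1.Col1 = Table2.Col2"
            for (String side : join.split("=")) {
                tables.add(tableOf(side.trim()));
            }
        }
        String sql = "SELECT " + String.join(", ", columns)
                   + " FROM " + String.join(", ", tables);
        if (!joins.isEmpty()) {
            sql += " WHERE " + String.join(" AND ", joins);
        }
        return sql;
    }

    public static void main(String[] args) {
        System.out.println(select(
            Arrays.asList("Papers.PaperID", "Persons.Name"),
            Arrays.asList("Papers.PaperID = Rel_Person_Paper.PaperID",
                          "Rel_Person_Paper.PersonID = Persons.PerID")));
    }
}
```

Each result row pairs one paper with one author name, so a paper with several authors yields several rows and therefore several :authorName properties.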
Example: A property bridge with mailto: URIs
map:PersonsClassEmail a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:PersonsClassMap;
    d2rq:property :email;
    d2rq:uriPattern "mailto:@@Persons.Email@@";
    .
The pattern mailto:@@Persons.Email@@ is used to attach a mailto: prefix to the values of the "Persons.Email" column. The example uses d2rq:uriPattern instead of d2rq:pattern because the bridge should produce URIs, not literals.
Example: Linking instances from two database tables
map:PaperConference a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:Paper;
    d2rq:property :conference;
    d2rq:refersToClassMap map:Conference;
    d2rq:join "Papers.Conference = Conferences.ConfID";
    .
The example attaches a :conference property to papers. The values of the property are generated by the map:Conference class map, not shown here. It may use a d2rq:uriPattern, d2rq:uriColumn or blank nodes to identify the conferences. The appropriate instance is found using the d2rq:join on the foreign key Papers.Conference.
Example: Joining a table to itself using d2rq:alias
map:ParentTopic a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:Topic;
    d2rq:property :parentTopic;
    d2rq:refersToClassMap map:Topic;
    d2rq:join "Topics.ParentID = ParentTopics.ID";
    d2rq:alias "Topics AS ParentTopics";
    .
Here, a topic may have a parent topic whose ID is found in the Topics.ParentID column. This foreign key refers back to the Topics.ID column. The table has to be joined to itself. A d2rq:alias is declared, and the join is established between the original table and the aliased table. This pattern is typical for hierarchical or graph-style relationships.
A d2rq:AdditionalProperty can be used to add a fixed statement to all instances generated by a class map. The statement is added to the result sets, if patterns like (ANY, ANY, ANY), (URI, ANY, ANY) or (URI, additionalPropertyName, ANY) are used. The d2rq:additionalProperty property is used to link from the class map to the d2rq:AdditionalProperty definition.
Properties
d2rq:propertyName | The RDF property to be used as the predicate of all fixed statements. |
d2rq:propertyValue | The value to be used as the object of all fixed statements. |
Example:
map:PersonsClassMap a d2rq:ClassMap;
    d2rq:class :Person;
    d2rq:additionalProperty map:SeeAlsoStatement.

map:SeeAlsoStatement a d2rq:AdditionalProperty;
    d2rq:propertyName rdfs:seeAlso;
    d2rq:propertyValue <http://annotation.semanticweb.org/iswc2003/>.
This adds an rdfs:seeAlso statement with a fixed URL object to every instance of the class map.
A d2rq:TranslationTable is an additional layer between the database and the RDF world. It translates back and forth between values taken from the database and RDF URIs or literals. A translation table can be attached to a class map or a property bridge using the d2rq:translateWith property. TranslationTables can be used only for mappings that are unique in both directions (1:1).
Properties
d2rq:translation | Adds a d2rq:Translation to the table. |
d2rq:href | Links to a CSV file containing translations. Each line of the file is a translation and contains two strings separated by a comma. The first one is the DB value, the second the RDF value. |
d2rq:javaClass | The qualified name of a Java class that performs the mapping. The class must implement the Translator interface. Custom Translators might be useful for encoding and decoding values, but are limited to 1:1 translations. Further details can be found in the D2RQ javadocs. |
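The CSV format described for d2rq:href is simple enough to sketch a reader for it. The code below is an illustration and not D2RQ's parser: the class TranslationCsvSketch and its parse method are hypothetical, and the handling of details that the description leaves open (splitting at the first comma, trimming whitespace, skipping lines without a comma, no support for quoted values) is an assumption.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of reading "dbValue,rdfValue" lines; names and edge-case handling are assumptions.
class TranslationCsvSketch {

    /** Parses one translation per line: database value first, RDF value second. */
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> translations = new LinkedHashMap<>();
        for (String line : lines) {
            int comma = line.indexOf(',');
            if (comma < 0) {
                continue; // assumption: lines without a comma are ignored
            }
            String dbValue = line.substring(0, comma).trim();
            String rdfValue = line.substring(comma + 1).trim();
            translations.put(dbValue, rdfValue);
        }
        return translations;
    }

    public static void main(String[] args) {
        // Hypothetical file content: color codes and the values they translate to
        List<String> lines = Arrays.asList("R,red", "G,green", "B,blue");
        System.out.println(parse(lines));
    }
}
```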
A d2rq:Translation is a single entry in a d2rq:TranslationTable.
Properties
d2rq:databaseValue | A value that might appear in a database column or might be generated by a d2rq:pattern. |
d2rq:rdfValue | A translation of that value to be used in RDF constructs. |
Example: Translating color codes
A typical application is a database column containing type codes or similar enumerated values. A translation table can be used to turn them into RDF resources. In this example, the column ShinyObject.Color contains a color code: "R" for red, "G" for green etc. These codes must be translated into RDF resources: :red, :green etc.
:red a :Color .
:green a :Color .
# ... more colors omitted ...
:blue a :Color .

map:ColorBridge a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:ShinyObjectMap;
    d2rq:property :color;
    d2rq:uriColumn "ShinyObject.Color";
    d2rq:translateWith map:ColorTable.

map:ColorTable a d2rq:TranslationTable;
    d2rq:translation [ d2rq:databaseValue "R"; d2rq:rdfValue :red; ];
    d2rq:translation [ d2rq:databaseValue "G"; d2rq:rdfValue :green; ];
    # ... more translations omitted ...
    d2rq:translation [ d2rq:databaseValue "B"; d2rq:rdfValue :blue; ].
The d2rq:translateWith statement tells D2RQ to look up database values in the map:ColorTable. There, a translation must be given for each possible value. If the database contains values which are not in the translation table, D2RQ will not generate a :color statement for that :ShinyObject instance.
Note that the type of the resulting RDF node is determined by the bridge and not by the node type of the rdfValues. map:ColorBridge uses d2rq:uriColumn. Thus, the translation will create URI nodes. If it used d2rq:column, then literals would be created.
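For longer code lists, the same translations could be kept outside the mapping file and loaded with d2rq:href. The sketch below assumes a hypothetical file location, hypothetical color URIs, and that d2rq:href is given as a URI reference. Because map:ColorBridge uses d2rq:uriColumn, the second CSV field would hold the full URI as a string.

```turtle
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#> .
@prefix map:  <http://example.org/mapping#> .   # hypothetical namespace

# Assumed contents of colors.csv (DB value, then RDF value):
#   R,http://example.org/colors#red
#   G,http://example.org/colors#green
#   B,http://example.org/colors#blue

map:ColorTable a d2rq:TranslationTable;
    d2rq:href <file:///data/colors.csv>.   # hypothetical location
```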
Sometimes only certain information from a database should be accessible, because parts of the database might be confidential or outdated. Using d2rq:condition you can specify conditions that must be satisfied by all accessible data.
You can use d2rq:condition at the class map level and at the property bridge level. The d2rq:condition value is added as an additional SQL WHERE clause to all queries generated using the class map or property bridge. If the condition evaluates to FALSE for a SQL result set row, then no triples will be generated from that row.
Example: Using d2rq:condition on a d2rq:ClassMap
map:Paper a d2rq:ClassMap;
    d2rq:class :Paper;
    d2rq:uriPattern "http://www.conference.org/conf02004/paper#Paper@@Papers.PaperID@@";
    d2rq:condition "Papers.Publish = 1";
    d2rq:dataStorage map:Database1.
Only those papers with a value of 1 in the Papers.Publish column will be accessible. All other papers are ignored.
Example: Filtering zero-length strings
Usually, the special value NULL is used in a database to indicate that a field has no value or that the value is unknown. Some databases, however, use a zero-length string ("") instead. D2RQ doesn't generate RDF statements from NULL values, but it doesn't recognize zero-length strings and will generate a statement like :Person123 :firstName "" if the person's first name is a zero-length string. In order to suppress these statements, a d2rq:condition can be added to the property bridge:
map:PersonsClassFirstName a d2rq:PropertyBridge;
    d2rq:property :firstName;
    d2rq:column "Persons.FirstName";
    d2rq:belongsToClassMap map:PersonsClassMap;
    d2rq:condition "Persons.FirstName <> ''".
Example: Relationship type codes
Imagine a table Rel_Paper_Topic that associates rows from a Papers table with rows from a Topics table. The Rel_Paper_Topic table contains a PaperID column to reference the papers, a TopicID to reference the topics, and a RelationshipType column which contains 1 if the topic is a primary topic of the paper, and 2 if it is a secondary topic.
For primary topic relationships, the :primaryTopic property shall be used, and for others the :secondaryTopic property.
We can build a map for this scenario by creating two property bridges, one for :primaryTopic and one for :secondaryTopic. We add a d2rq:condition to both bridges to suppress those statements where the RelationshipType column doesn't have the correct value.
map:primaryTopic a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:Paper;
    d2rq:property :primaryTopic;
    d2rq:refersToClassMap map:Topic;
    d2rq:join "Papers.PaperID = Rel_Paper_Topic.PaperID";
    d2rq:join "Rel_Paper_Topic.TopicID = Topics.TopicID";
    d2rq:condition "Rel_Paper_Topic.RelationshipType = 1".

map:secondaryTopic a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:Paper;
    d2rq:property :secondaryTopic;
    d2rq:refersToClassMap map:Topic;
    d2rq:join "Papers.PaperID = Rel_Paper_Topic.PaperID";
    d2rq:join "Rel_Paper_Topic.TopicID = Topics.TopicID";
    d2rq:condition "Rel_Paper_Topic.RelationshipType = 2".
This section covers hint properties that can be added to property bridges in order to speed up queries: d2rq:valueMaxLength, d2rq:valueRegex and d2rq:valueContains.
Example: Providing a maximum length
map:PersonsClassFirstName a d2rq:PropertyBridge;
    d2rq:property :firstName;
    d2rq:column "Persons.FirstName";
    d2rq:belongsToClassMap map:PersonsClassMap;
    d2rq:valueMaxLength "15".
The d2rq:valueMaxLength property tells D2RQ that Persons.FirstName values are at most 15 characters long. With this information, D2RQ no longer has to query the database to find out that a given first name longer than 15 characters cannot match.
Example: Providing a regular expression
map:PaperYear a d2rq:PropertyBridge;
    d2rq:property :year;
    d2rq:column "Papers.Year";
    d2rq:belongsToClassMap map:Paper;
    d2rq:datatype xsd:gYear;
    d2rq:valueRegex "^[0-9]{4}$".
Here, the d2rq:valueRegex property is used to provide a regular expression for the Papers.Year column. The statement asserts that all values match the regular expression (or are NULL). The expression ^[0-9]{4}$ matches every four-digit number. If you don't want to use the full regular expression machinery, you can use d2rq:valueContains to assert that all values generated by the property bridge contain a certain phrase.
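A d2rq:valueContains hint might look like the following sketch. The Persons.Email column, the :email property and both example.org namespaces are hypothetical; the hint asserts that every value generated by this bridge contains an "@" character.

```turtle
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#> .
@prefix map:  <http://example.org/mapping#> .   # hypothetical namespace
@prefix :     <http://example.org/vocab#> .     # hypothetical namespace

# Hypothetical bridge: a lookup for a value without "@" can be answered
# as "no match" without consulting the database.
map:PersonsClassEmail a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:PersonsClassMap;
    d2rq:property :email;
    d2rq:column "Persons.Email";
    d2rq:valueContains "@".
```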
You get the largest performance gain by providing hints for property bridges that use d2rq:column. Define hints on columns of large tables and on columns that are not indexed by the database; these are the cases where a well-placed optimization hint can result in an order-of-magnitude improvement for some queries. Don't bother to provide hints for property bridges based on d2rq:pattern, as these can be optimized very well without hints. If your database has a few very large tables with non-indexed columns, that's where you should focus your efforts.
Please keep in mind that hint properties are not intended for filtering unwanted database values. They are only performance hints. Values that do not fulfill the criteria will still appear in query results if find patterns like (URI, ANY, ANY) are used. In order to filter values, use d2rq:condition or a translation table with a custom Java class that returns null for unwanted database values.
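To make the distinction concrete, the sketch below puts a hint and a real filter on the same Papers.Year bridge. The cutoff year 2000 and the example.org namespaces are arbitrary illustrations, and the condition assumes that Papers.Year is a numeric column. The d2rq:valueRegex statement only describes the values, while d2rq:condition is what actually excludes rows.

```turtle
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix map:  <http://example.org/mapping#> .   # hypothetical namespace
@prefix :     <http://example.org/vocab#> .     # hypothetical namespace

map:PaperYear a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:Paper;
    d2rq:property :year;
    d2rq:column "Papers.Year";
    d2rq:datatype xsd:gYear;
    # Hint: describes the shape of the values, does not filter anything.
    d2rq:valueRegex "^[0-9]{4}$";
    # Filter: rows that fail this SQL condition produce no triples.
    d2rq:condition "Papers.Year >= 2000".
```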
You can configure D2RQ by providing d2rq:ProcessingInstructions. As of D2RQ V0.4, only the queryHandler processing instruction is supported. The d2rq:queryHandler processing instruction specifies a query handler to use instead of the default query handler, D2RQQueryHandler.
Example: Switch back to Jena's standard query handler
:ProcessingInstructions1 a d2rq:ProcessingInstructions;
    d2rq:queryHandler "com.hp.hpl.jena.graph.query.SimpleQueryHandler".
This section lists several language constructs from older versions of the D2RQ mapping language that have been replaced by better alternatives and should no longer be used.
Older versions of the language used two different classes to distinguish between property bridges that produce literals, and bridges that produce resources.
In the current version, both cases are handled by the d2rq:PropertyBridge class. The distinction is made by using an appropriate property on the bridge declaration: d2rq:column and d2rq:pattern for literals, d2rq:uriColumn, d2rq:uriPattern and d2rq:bNodeIdColumns for resources.
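As an illustration of the current style, both bridges in the sketch below are plain d2rq:PropertyBridge instances, and only the value property differs. The Persons.Homepage column, the :homepage property and the example.org namespaces are hypothetical.

```turtle
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#> .
@prefix map:  <http://example.org/mapping#> .   # hypothetical namespace
@prefix :     <http://example.org/vocab#> .     # hypothetical namespace

# Produces literals because it uses d2rq:column.
map:PersonsClassFirstName a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:PersonsClassMap;
    d2rq:property :firstName;
    d2rq:column "Persons.FirstName".

# Produces URI resources because it uses d2rq:uriColumn.
map:PersonsClassHomepage a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:PersonsClassMap;
    d2rq:property :homepage;
    d2rq:uriColumn "Persons.Homepage".
```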
$Id: index.htm,v 1.30 2007/10/23 15:33:19 cyganiak Exp $