LsidVocsUsage
Introduction
This page explains how to use LSID Vocabularies and is aimed at developers. For background and other resources, read the LsidVocs page. If you are using a wrapper application such as PyWrapper v3 or TAPIRLink, this page may be of interest only as background information, because those applications are intended to support the vocabularies as a configuration option.
If you want to explore the vocabularies to get a technical feel for what they are, have a look at LsidVocsExploring.
LSID Vocabularies are a hybrid technology: instance documents built against them may be treated as either RDF or plain XML, both when they are generated and when they are consumed. This gives freedom on the server side, as the most convenient technique can be employed. The client side is more complex, as illustrated by the table below.
| | Consume as RDF | Consume as XML |
| Produce as RDF | No limitations. Data should appear as any other Semantic Web data source. | Problematic. The client cannot be sure which serialization the server used, so it is difficult to write XSL or use other techniques that depend on document structure. RDF parsing libraries are mature and widely available, though. |
| Produce as XML | No limitations. Data should appear as any other Semantic Web data source. | Minor limitations. The client needs to know that this is an XML document of a certain form, but no schema location is present. If validation is required, the schema location needs to be defined elsewhere. |
Publishing Data
Using Semantic Web Technologies
Standard techniques can be used to construct RDF data in memory and serialize it to the output stream. If you use model-driven APIs rather than resource-focused APIs, it should be possible to import the LSID vocabulary itself as a starting point. Be aware, though, that these vocabularies are currently managed for the stability of instance data, so changes to the inheritance structure of the vocabulary may break your code.
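As a rough illustration of building statements in memory and serializing them to a stream, the sketch below writes N-Triples by hand using only the Python standard library. It is not a substitute for a real RDF library. The vocabulary namespace, the LSID and the property name `nameComplete` are hypothetical placeholders, not taken from an actual LSID vocabulary.

```python
import io

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
VOC = "http://example.org/voc/TaxonName#"  # hypothetical vocabulary namespace


def uri(value):
    """Format a URI reference as an N-Triples term."""
    return "<" + value + ">"


def literal(text):
    """Format a plain string literal with basic N-Triples escaping."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return '"' + escaped + '"'


def write_ntriples(triples, stream):
    """Write (subject, predicate, object) terms, one statement per line."""
    for s, p, o in triples:
        stream.write(f"{s} {p} {o} .\n")


# Build the statements in memory first.
subject = uri("urn:lsid:example.org:names:1234")  # hypothetical LSID
triples = [
    (subject, uri(RDF_TYPE), uri(VOC + "TaxonName")),
    (subject, uri(VOC + "nameComplete"), literal('Aus bus "L."')),
]

# Then serialize them to any file-like output stream.
out = io.StringIO()
write_ntriples(triples, out)
print(out.getvalue(), end="")
```

In practice an RDF library would handle term types, escaping and alternative serializations; the point here is only the separation between the in-memory statements and the serialization step.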
As XML
Any of the standard methods for producing XML can be employed, for example:
- A templating language such as JSP, PHP or ASP
- A DOM-based in-memory approach
- A SAX-based streaming approach
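The in-memory approach from the list above might look like the following sketch, which uses Python's `xml.etree.ElementTree`. The vocabulary namespace, the `tn` prefix, the element names and the LSID are hypothetical placeholders; a real document would follow the example instance document supplied with the vocabulary in use.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
VOC = "http://example.org/voc/TaxonName#"  # hypothetical vocabulary namespace

# Register readable prefixes so the serializer does not invent ns0, ns1, ...
ET.register_namespace("rdf", RDF)
ET.register_namespace("tn", VOC)


def build_document(lsid, name, rank_uri):
    """Build a small RDF/XML tree in memory for one record."""
    root = ET.Element(f"{{{RDF}}}RDF")
    record = ET.SubElement(root, f"{{{VOC}}}TaxonName", {f"{{{RDF}}}about": lsid})
    # A literal value is carried as element content ...
    ET.SubElement(record, f"{{{VOC}}}nameComplete").text = name
    # ... while a resource is carried as a URI in rdf:resource, with no content.
    ET.SubElement(record, f"{{{VOC}}}rank", {f"{{{RDF}}}resource": rank_uri})
    return root


doc = build_document(
    "urn:lsid:example.org:names:1234",      # hypothetical LSID
    "Aus bus",
    "http://example.org/voc/Rank#Species",  # hypothetical rank URI
)
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

No schema location attribute is written, and the namespace declarations end up once on the root element rather than scattered through the document.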
Several factors should be borne in mind:
- The XML schema location should be removed from the root element of the instance document to make it valid RDF.
- It is not possible in XML Schema to specify that an element must be empty when a certain attribute is present. It is therefore not possible to fully validate an RDF instance document with XML Schema, because the basic choice between representing a resource as a URI in an rdf:resource attribute or as the content of the element cannot be enforced. Your code has to enforce this instead. (Geography Markup Language suffers from the same problem.)
- At a minimum, example instance documents generated by your code should be checked against an RDF validator such as the one hosted by W3C. Ideally, this check should form part of a regression test suite.
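The first two factors above can be checked in code. The sketch below, again using `xml.etree.ElementTree`, removes the standard `xsi:schemaLocation` and `xsi:noNamespaceSchemaLocation` attributes from the root element and reports any element that carries an rdf:resource attribute and also has content. The sample document, its namespace and its element names are hypothetical, and the check complements an RDF validator rather than replacing it.

```python
import xml.etree.ElementTree as ET

RDF_RESOURCE = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource"
XSI = "http://www.w3.org/2001/XMLSchema-instance"
SCHEMA_ATTRS = (f"{{{XSI}}}schemaLocation", f"{{{XSI}}}noNamespaceSchemaLocation")


def strip_schema_location(root):
    """Remove XML Schema location hints from the root element, if present."""
    for name in SCHEMA_ATTRS:
        root.attrib.pop(name, None)


def find_resource_conflicts(root):
    """Return tags of elements that have rdf:resource and also have content."""
    conflicts = []
    for element in root.iter():
        if RDF_RESOURCE not in element.attrib:
            continue
        has_text = bool(element.text and element.text.strip())
        if has_text or len(element) > 0:
            conflicts.append(element.tag)
    return conflicts


# Hypothetical instance document containing one deliberate mistake.
sample = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:tn="http://example.org/voc/TaxonName#"
    xsi:schemaLocation="http://example.org/voc/TaxonName# TaxonName.xsd">
  <tn:TaxonName rdf:about="urn:lsid:example.org:names:1234">
    <tn:rank rdf:resource="http://example.org/voc/Rank#Species">Species</tn:rank>
    <tn:nameComplete>Aus bus</tn:nameComplete>
  </tn:TaxonName>
</rdf:RDF>"""

root = ET.fromstring(sample)
strip_schema_location(root)
conflicts = find_resource_conflicts(root)
print(conflicts)
```

A function like `find_resource_conflicts` could run as an assertion in a regression test suite, alongside a check of sample output against an RDF validator.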
For your convenience, an example instance document is supplied with each vocabulary. If you are hand-crafting the XML using a templating technology, this is a good starting point because it already has the namespace prefixes defined. Creating a document from scratch using current versions of OxygenXML, XML Spy or another IDE will tend to produce ugly, hard-to-read XML with namespace declarations all over the place.
Remember that you don't have to follow the schema. The schema is intended to help you understand and produce RDF instance documents but is not intended to be normative at this point. If your business goals can be met with a different schema that still produces valid RDF, then go ahead and use it. This may be particularly true if you want to include your own properties from another namespace. Be aware, though, that some clients may rely on you following the published schema; see the Consuming Data section below.
Consuming Data
As RDF
Clients can make use of one of the available RDF libraries, such as Raptor for C, Jena for Java, RAP for PHP, even JavaScript, and others.
If XML data of a known and validated structure is required, it can be written out from data parsed with one of these libraries.
As XML
If the client knows that the server produces data with a specific document structure, it can consume the data as a regular XML document. A client that consumes data from all data sources would be safer using an RDF parsing library. It is theoretically possible to write XSL stylesheets that convert arbitrary XML serializations of RDF into known document structures, but this is likely to be complex and error-prone compared with using existing libraries. Using existing code may also allow consumption of RDF in other, non-XML serializations.
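The trade-off can be seen in a small sketch. The reader below assumes a fixed document structure and works on a document of that form, but finds nothing in an equivalent RDF/XML serialization that uses rdf:Description with an rdf:type statement. The namespace, element names and LSID are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "tn": "http://example.org/voc/TaxonName#",  # hypothetical namespace
}


def read_names(xml_text):
    """Read records, assuming typed tn:TaxonName elements directly under the root."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.findall("tn:TaxonName", NS):
        records.append({
            "id": record.get(f"{{{NS['rdf']}}}about"),
            "name": record.findtext("tn:nameComplete", namespaces=NS),
        })
    return records


# The form the client expects.
expected_form = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:tn="http://example.org/voc/TaxonName#">
  <tn:TaxonName rdf:about="urn:lsid:example.org:names:1234">
    <tn:nameComplete>Aus bus</tn:nameComplete>
  </tn:TaxonName>
</rdf:RDF>"""

# The same statements in a different, equally valid RDF/XML serialization.
other_form = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:tn="http://example.org/voc/TaxonName#">
  <rdf:Description rdf:about="urn:lsid:example.org:names:1234">
    <rdf:type rdf:resource="http://example.org/voc/TaxonName#TaxonName"/>
    <tn:nameComplete>Aus bus</tn:nameComplete>
  </rdf:Description>
</rdf:RDF>"""

known = read_names(expected_form)
unknown = read_names(other_form)
print(known)
print(unknown)
```

An RDF parser would yield the same statements from both documents, which is why a client that has to handle arbitrary data sources is safer with an RDF library.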