TagDiscussionRoadMap
This page maps out where the discussions on the mailing list and elsewhere are going. It gives a short summary of what has been concluded and a to-do list of what remains to be discussed. (The mailing list is hosted here:
http://lists.tdwg.org/mailman/listinfo/tdwg-tag_lists.tdwg.org).
Active Issues
- Core Ontology Management - How do we build the core ontology in a collaborative fashion?
(
http://lists.tdwg.org/pipermail/tdwg-tag_lists.tdwg.org/2006-February/000025.html)
Agreed Issues
The common architecture developed and promoted by TDWG must be independent of the GBIF data portal.
Although GBIF is involved in the TDWG Infrastructure Project (Donald Hobern - 20%), Lee Belbin is the Manager of the Project and is independent of GBIF. The common architecture proposed by the TAG will assess and attempt to address all community needs. Any proposals made by the TAG will be accompanied by technical justifications. (
http://lists.tdwg.org/pipermail/tdwg-tag_lists.tdwg.org/2006-March/000036.html)
Centralization of services
The TDWG architecture should not rely on any third party providing a particular piece of infrastructure indefinitely. This effectively rules out the architecture being limited to or reliant on any centralized data warehouse or indexing service beyond the hosting of standards files by TDWG itself - possibly as part of its collaborative infrastructure. (
http://lists.tdwg.org/pipermail/tdwg-tag_lists.tdwg.org/2006-March/000036.html)
There should be commonality between all TDWG objects.
There must be something all high level TDWG object definitions have in common that enables client applications (and application developers) to identify them and devise ways of handling them. This is likely to be a link of some kind to a core ontology.
There should be a core TDWG ontology consisting of a small vocabulary of shared terms or base classes.
- It should be representation independent (i.e. we should be able to move it between 'languages' UML, OWL, BNF etc).
- It should be dynamic (i.e. capable of evolving through time).
- It should be polymorphic. This is a result of it being dynamic. There will, at a minimum, be multiple versions of any one part of the model when new versions are introduced.
- It should NOT attempt to be omniscient i.e. it will not cover everything in our domain, only the parts that need to be communicated.
- It will be managed in a distributed fashion. Different teams will take responsibility for different parts of it.
(
http://lists.tdwg.org/pipermail/tdwg-tag_lists.tdwg.org/2006-February/000017.html)
There should be multiple ways to define the same 'objects'.
e.g. It should be possible to have several encodings running in parallel. This enables:
- Upgrades of object definitions where we have to run two types of definition together during the roll out of the new one.
- Not having to derive a 'perfect' definition of an object before data exchange can begin.
- Different subdomains to use different definitions for their own purposes.
(
http://lists.tdwg.org/pipermail/tdwg-tag_lists.tdwg.org/2006-February/000011.html)
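As a rough illustration of encodings running in parallel, the sketch below keeps one decoder per definition version and dispatches on the definition a record declares for itself. Everything in it is an assumption made for illustration: the `specimen/1.0` and `specimen/2.0` labels, the field names and the registry are hypothetical and are not taken from any TDWG standard.

```python
from typing import Callable, Dict

# Hypothetical registry: one decoder per definition version of the same object.
Decoder = Callable[[dict], dict]
DECODERS: Dict[str, Decoder] = {}


def register(version: str) -> Callable[[Decoder], Decoder]:
    """Return a decorator that files a decoder under the given version label."""
    def wrap(fn: Decoder) -> Decoder:
        DECODERS[version] = fn
        return fn
    return wrap


@register("specimen/1.0")
def decode_v1(raw: dict) -> dict:
    # Older (hypothetical) definition: a single free-text name field.
    return {"scientific_name": raw["name"]}


@register("specimen/2.0")
def decode_v2(raw: dict) -> dict:
    # Newer (hypothetical) definition: genus and epithet held separately.
    return {"scientific_name": f"{raw['genus']} {raw['epithet']}"}


def decode(raw: dict) -> dict:
    """Dispatch on the definition label carried by the record itself."""
    version = raw["definition"]
    if version not in DECODERS:
        raise ValueError(f"unknown definition: {version}")
    return DECODERS[version](raw)


# Both definitions stay usable while the new one is rolled out.
old = decode({"definition": "specimen/1.0", "name": "Bellis perennis"})
new = decode({"definition": "specimen/2.0", "genus": "Bellis", "epithet": "perennis"})
```

A client written this way does not need a 'perfect' definition up front: adding a third definition only means registering one more decoder.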
Different vocabularies or definitions of the same object should be interoperable.
Object definitions should take the form of interfaces rather than restrictions or extensions, so that a single instance of an object might implement more than one interface (cf. different versions of
DarwinCore).
(
http://lists.tdwg.org/pipermail/tdwg-tag_lists.tdwg.org/2006-February/000018.html)
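The idea of one instance implementing more than one interface might look like the following minimal sketch, which uses Python's structural `Protocol` types. The interface names `NamedV1` and `NamedV2`, their methods and the `SpecimenRecord` class are hypothetical and stand in for two versions of some object definition; they are not DarwinCore terms.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@runtime_checkable
class NamedV1(Protocol):
    """Hypothetical first version of a definition: one combined name."""
    def scientific_name(self) -> str: ...


@runtime_checkable
class NamedV2(Protocol):
    """Hypothetical second version: the name split into its parts."""
    def genus(self) -> str: ...
    def specific_epithet(self) -> str: ...


@dataclass
class SpecimenRecord:
    """A single object that satisfies both interfaces at once."""
    _genus: str
    _epithet: str

    def scientific_name(self) -> str:
        return f"{self._genus} {self._epithet}"

    def genus(self) -> str:
        return self._genus

    def specific_epithet(self) -> str:
        return self._epithet


record = SpecimenRecord("Bellis", "perennis")

# isinstance() against a runtime-checkable Protocol only checks that the
# required methods are present, not that the class inherits from anything.
supports_v1 = isinstance(record, NamedV1)
supports_v2 = isinstance(record, NamedV2)
```

Because neither interface restricts or extends the other, a client that only knows the older definition and one that only knows the newer definition can both handle the same record.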
Resolve, Search & Query Definitions.
- Resolving. This means converting a pointer into a data object. Examples would be resolving an LSID to get back either data or metadata, or resolving a URL to get back a web page in HTML.
- Searching. This means to select a set of objects (or their proxies) on the basis of the values of their properties. The objects are predefined (implicitly part of the call) and we are simply looking for them. An example would be finding pages on Google.
- Querying. This means to ask a question of a data source that returns objects that are defined as part of the call. An example would be an SQL query where the response objects (tuples) are defined in the query.
(
http://lists.tdwg.org/pipermail/tdwg-tag_lists.tdwg.org/2006-February/000009.html)
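The three definitions above can be contrasted with a small sketch over an in-memory data source. The store, the `id:` pointers and the function signatures are all assumptions made for illustration; a real resolver would deal with LSIDs or URLs and a real query would go to a database.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical in-memory data source keyed by made-up pointers.
STORE: Dict[str, dict] = {
    "id:1": {"genus": "Bellis", "epithet": "perennis"},
    "id:2": {"genus": "Bellis", "epithet": "annua"},
    "id:3": {"genus": "Quercus", "epithet": "robur"},
}


def resolve(pointer: str) -> dict:
    """Resolving: convert a pointer into the data object it identifies."""
    return STORE[pointer]


def search(**criteria: str) -> List[str]:
    """Searching: select predefined objects by their property values.

    Pointers are returned as proxies for the matching objects.
    """
    return [
        key
        for key, obj in STORE.items()
        if all(obj.get(name) == value for name, value in criteria.items())
    ]


def query(fields: Tuple[str, ...], where: Callable[[dict], bool]) -> List[tuple]:
    """Querying: the caller defines the shape of each returned tuple."""
    return [tuple(obj[f] for f in fields) for obj in STORE.values() if where(obj)]


one_object = resolve("id:3")
matches = search(genus="Bellis")
rows = query(("epithet",), lambda obj: obj["genus"] == "Bellis")
```

Here `search` can only hand back objects the store already defines, whereas `query` builds response tuples whose fields are chosen in the call, which is the distinction the definitions draw.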
General notes
Gregor: When I read the above, it seems to me that it has to a large extent been developed exclusively under the GBIF use case of an indexer/aggregator. This clearly is a very urgent use case for us, but I am afraid we will develop too limited an architecture if we do not consider the use case of generating, revising and consuming scientific data where exact representations matter. See
ForwardCompatibilityAndScientificDataExchange.
To Be Discussed
Linking Topics