The ECAT World Taxonomies dataset compiles and contrasts information found in a number of available metadatabases. The data can be analysed to pinpoint gaps in taxonomic coverage and so help set priorities for future initiatives such as Seed Money.
The hierarchy used in the metadatabase is the ITIS management hierarchy, extracted from the ITIS website. Ranks down to the level of Order (Family for Vascular Plants) are implemented.
The ECAT World Taxonomies dataset is compiled from four principal sources, each of which should be duly acknowledged:
The ITIS Data Status Summary Table, found at http://www.itis.usda.gov/status.html , summarises in broad terms the indexing status of the taxonomic groups found in the ITIS database by indicating which groups have been completed and which are in progress. The scope of the data (e.g. World, U.S. and Canada) is given. A taxon designated 'complete' and 'World' would constitute a proper GSD. In a number of cases ITIS reports completion for particular cross-cutting groups such as 'Marine lobsters of economic importance'.
The Species 2000 Metadatabase, kindly provided by Yuri Roskov of the Species 2000 secretariat, Reading, UK, contains information on any taxonomy initiative brought to the attention of Species 2000. These range from comprehensive Species 2000 sector GSDs available online to database initiatives characterized only by their name (e.g. Spooner's Elaphomycetes), sometimes with a named contact person, sometimes without any metadata at all.
At a workshop in Kew, 28-30 June 2004, the GSPC initiated a compilation of a metadatabase on the status of the families of vascular plants. This list utilizes a series of categories describing the status of availability of taxonomies for the families.
The list also suggests, in considerable detail, a road map and obstacles for each of the families, and identifies potential collaborators. These data are not included in the ECAT dataset because they are to a certain degree personally sensitive. The GSPC data includes many families that are not present in the ITIS taxonomy; this will have to be sorted out. At present, these families are listed in a separate group as direct descendants of the Vascular Plants.
The ECAT Seed Money programme has yielded some insight into taxonomy initiatives around the world. For the moment, only projects that have received Seed Money funding are entered into the database. In the future it may be advantageous also to include projects that applied for Seed Money but were rejected. This cannot, of course, be done without contacting the applicants, which in many cases could lead to complicated situations. Information on the taxonomic nomenclators is also included.
The metadatabase is delivered as an MS Access database file and an HTML output that displays the data in the context of a taxonomic hierarchy tree.
The database stores the entire ITIS hierarchy, with parent-child linking, internally in a single table. Taxonomic resource information is inserted at the appropriate levels. The database provides a management form in which the tree can be walked by clicking the names of parents and children. Data can also be edited directly in the table.
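The single-table, parent-child layout described above can be illustrated with a minimal sketch. It uses SQLite as a stand-in for MS Access, and the table name, column names and sample rows are assumptions made for illustration only; they are not taken from the actual database.

```python
import sqlite3

# Hypothetical schema: one row per taxon, linked to its parent by id.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE taxon ("
    " id INTEGER PRIMARY KEY,"
    " parent_id INTEGER REFERENCES taxon(id),"
    " rank TEXT, name TEXT, resource TEXT)"
)
# Illustrative rows only; 'resource' stands for taxonomic resource information.
conn.executemany(
    "INSERT INTO taxon VALUES (?, ?, ?, ?, ?)",
    [
        (1, None, "Kingdom", "Animalia", None),
        (2, 1, "Phylum", "Arthropoda", None),
        (3, 2, "Class", "Insecta", None),
        (4, 3, "Order", "Odonata", "example resource note"),
        (5, 3, "Order", "Diptera", None),
    ],
)

def parent_of(taxon_id):
    """Return (id, name) of the parent, or None for the root."""
    return conn.execute(
        "SELECT p.id, p.name FROM taxon t"
        " JOIN taxon p ON p.id = t.parent_id WHERE t.id = ?",
        (taxon_id,),
    ).fetchone()

def children_of(taxon_id):
    """Return a list of (id, name) for the direct children, sorted by name."""
    return conn.execute(
        "SELECT id, name FROM taxon WHERE parent_id = ? ORDER BY name",
        (taxon_id,),
    ).fetchall()
```

Walking the tree then amounts to calling parent_of or children_of repeatedly, which is roughly what clicking a parent or child name in a management form would do.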
The HTML output visualizes a tree representation of the entire taxonomy, with taxonomic resource information affixed at appropriate nodes and terminals. In general, if a node carries information, this information applies to all descendants of this node. As not all information applies to natural taxonomic groups, a number of informal Groups have been introduced.
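The rule that information on a node applies to all of its descendants can be sketched as follows. The taxon names, the notes and the function name effective_notes are hypothetical and serve only to illustrate the principle.

```python
# Hypothetical data: child -> parent, and notes attached to some nodes.
PARENT = {
    "Arthropoda": "Animalia",
    "Insecta": "Arthropoda",
    "Odonata": "Insecta",
    "Diptera": "Insecta",
}
NOTES = {
    "Insecta": ["class-level resource"],
    "Odonata": ["order-level resource"],
}

def effective_notes(name):
    """Collect notes from the node itself and every ancestor, root first."""
    chain = []
    while name is not None:
        chain.append(name)
        name = PARENT.get(name)  # None once the root has been reached
    notes = []
    for node in reversed(chain):
        notes.extend(NOTES.get(node, []))
    return notes
```

In this sketch Diptera carries no note of its own but still reports the note attached to Insecta, whereas Odonata reports both the inherited note and its own.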
Data is presented in six containers: ITIS, Species 2000, GSPC, Seed Money, Nomenclators and 'Others'.
As the loading time for the rather complex tree document can be long (seemingly shorter in Mozilla Firefox than in MS Internet Explorer), the HTML output comes in two versions: one with the full taxonomy and one, the Light Edition, where all terminal Orders (or Families) without data have been pruned.
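The pruning behind a Light Edition could look roughly like the sketch below. The Node class, its field names and the decision to leave higher ranks in place even when they lose all their children are assumptions; the actual procedure is not documented here.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    rank: str
    data: list = field(default_factory=list)      # attached resource information
    children: list = field(default_factory=list)  # child Node objects

# Ranks at which the hierarchy ends (Family for Vascular Plants).
TERMINAL_RANKS = {"Order", "Family"}

def prune(node):
    """Return a copy of the tree without data-less terminal Orders/Families."""
    kept = []
    for child in node.children:
        is_empty_terminal = (
            child.rank in TERMINAL_RANKS
            and not child.children
            and not child.data
        )
        if not is_empty_terminal:
            kept.append(prune(child))
    return Node(node.name, node.rank, list(node.data), kept)

# Illustrative tree: one Order with data, one without.
full = Node("Insecta", "Class", children=[
    Node("Odonata", "Order", data=["example resource note"]),
    Node("Diptera", "Order"),
])
light = prune(full)
```

Because prune builds a new tree, the full version stays intact and both editions can be written out from the same source.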
The data set has been analysed at Phylum/Division level, assessing the state of taxonomic treatment. These findings are listed and summarised in another document.
The data is brought from the database into HTML through an ASP procedure that writes a file with the tree node information embedded. A comprehensive JavaScript (thanks to Cip for finding it) transforms the node information into a graphical tree that can be collapsed and expanded by clicking on nodes. As GBIF does not at present serve ASP, the output is, as an interim solution, saved as .htm files and stored in the CIRCA Information service, from where they can be further disseminated.
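The idea of writing a static file with the node information embedded can be sketched in Python rather than ASP. The variable name TREE_NODES, the node fields and the page layout are hypothetical; the format expected by the actual tree script is not known here.

```python
import json

def render_page(nodes):
    """Return an HTML page with the node list embedded as a JavaScript array."""
    # Escape "</" so that node text cannot close the script element early.
    payload = json.dumps(nodes).replace("</", "<\\/")
    return (
        "<html><body>\n"
        '<div id="tree"></div>\n'
        "<script>\n"
        f"var TREE_NODES = {payload};\n"
        "</script>\n"
        "</body></html>\n"
    )

# Illustrative nodes: each carries its own id and the id of its parent.
nodes = [
    {"id": 1, "parent": None, "name": "Animalia"},
    {"id": 2, "parent": 1, "name": "Arthropoda"},
]
page = render_page(nodes)
```

The resulting string can be saved as an .htm file and served statically; a client-side tree script would then read the embedded array and build the collapsible tree.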