Specify 6.4

The Specify Software Project is delighted to deliver Specify 6.4. The new release extends Specify’s research functions for georeferencing, distribution mapping, and new specimen digitization. These enhancements are accomplished through data integration with GBIF and web service connections to the GEOLocate and Lifemapper Projects.

Specify’s new subsystem “Scatter Gather Reconcile” (SGR) discovers and facilitates the re-use of computerized information from duplicate specimens. SGR delivers on the promise of data publishing for biological collections; it completes the circle of enterprise-level integration by enabling write-back from GBIF to Specify databases.

Specify 6.4 also includes a new Java implementation of GEOLocate, a georeferencing application from Tulane University. The embedded module communicates with GEOLocate data servers to parse text locality descriptions, calculate specimen latitude and longitude coordinates, and assign best-practice spatial metadata, all within a seamless Specify interface.

Finally, 6.4 makes network connections to Lifemapper’s visualization services for mapping collection data and comparing the geographical distribution of species vouchers in your database with the spread of those in GBIF. With NASA’s World Wind plug-in, Specify users explore the localities of their holdings in an interactive globe display. In a future release, Specify/Lifemapper integration will enable Specify users to invoke Lifemapper’s species distribution modeling and multispecies diversity analysis services.

These three new network integration modules (SGR, GEOLocate, and Lifemapper) are the first wave of broader computational capability and network integration coming to the Specify Software platform. Additional information on how to use these new modules is available in Specify’s help system.

 

Specify 6.4 New Modules

 

Scatter Gather Reconcile (SGR)

Botanists commonly collect multiple samples of a plant in order to distribute duplicates for determination, or for ‘exchange’ to share unique specimens with other institutions. “Scatter Gather Reconcile” alludes to this practice whereby specimens are scattered among multiple institutions, and then virtually gathered when their associated data are electronically aggregated. Reconcile refers to the resolution of divergent data that occurs when records of duplicates are compared and updated.

For the first time, researchers can discover within their data management environment if another collection has already digitized duplicates of their specimens. The Specify SGR module takes your partial or complete existing collection object records and compares them to those held by GBIF. SGR then shows the GBIF records which best match your own and enables re-use of pre-existing data. With SGR, GBIF occurrence data are no longer just HTML in a browser window; they are an integral part of your data management domain, easily discovered, inspected, and harvested for local use.

With SGR, GBIF data can be used for new specimen record creation or for validating existing records. For digitizing specimens, SGR can utilize preliminary records with data in as few as three or four fields to determine if duplicate specimen records exist in GBIF. SGR will enable collaborative cataloging within a collection (e.g. looking for duplicate collecting events), among several collections in a project network, and at a global level with biodiversity data published to GBIF. SGR will improve local data quality, consistency and project sharing.
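The matching idea described above, comparing a sparse local record against candidate records and ranking the best matches, can be illustrated with a minimal sketch. This is not SGR's actual algorithm or data model: the field names, the exact-match comparison, and the `min_score` threshold are all assumptions chosen only to show how a record with three or four populated fields could be scored against candidates.

```python
# Illustrative only: field names and scoring rule are hypothetical,
# not taken from Specify or GBIF.
FIELDS = ("collector", "collector_number", "date_collected", "taxon", "locality")


def normalize(value):
    """Lower-case and collapse whitespace so trivial differences do not block a match."""
    return " ".join(str(value).lower().split())


def match_score(local, candidate, fields=FIELDS):
    """Return the fraction of the local record's populated fields that agree with the candidate."""
    populated = [f for f in fields if local.get(f)]
    if not populated:
        return 0.0
    agree = sum(
        1
        for f in populated
        if candidate.get(f) and normalize(local[f]) == normalize(candidate[f])
    )
    return agree / len(populated)


def best_matches(local, candidates, min_score=0.5, limit=3):
    """Rank candidates by score, keeping those at or above min_score."""
    scored = [(match_score(local, c), c) for c in candidates]
    scored = [(s, c) for s, c in scored if s >= min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


# A preliminary record with only three fields filled in.
local_record = {
    "collector": "J. Smith",
    "collector_number": "1234",
    "date_collected": "1998-05-17",
}
candidate_records = [
    {"collector": "j. smith", "collector_number": "1234",
     "date_collected": "1998-05-17", "taxon": "Quercus alba"},
    {"collector": "J. Smith", "collector_number": "999",
     "date_collected": "1998-05-17"},
    {"collector": "A. Jones", "collector_number": "1",
     "date_collected": "2001-01-01"},
]
matches = best_matches(local_record, candidate_records)
```

In this sketch the first candidate agrees on every populated field, so its extra data (here, the taxon) would be the natural one to offer for re-use in the local record. A production matcher would more plausibly use fuzzy text similarity rather than exact comparison.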

SGR was developed in collaboration with Íñigo Granzow-de la Cerda at the Universitat Autònoma de Barcelona with funding from U.S. NSF/DBI grant 0646301.

 

GEOLocate

The GEOLocate Project of Tulane University produces software applications and network services for georeferencing specimen localities. GEOLocate’s network services include a natural language processing algorithm capable of interpreting textual locality descriptions to produce geographic coordinates with associated uncertainty values. New Specify/GEOLocate features include: enhanced interactivity and ease of use, improved locality description analysis, multilingual georeferencing, and support for uncertainty radii and polygons. Now Specify users have state-of-the-art georeferencing accessible instantly to complete specimen locality descriptions, making specimen records suitable for mapping and geospatial modeling. GEOLocate embedded within Specify eliminates the need for accessing external georeferencing web sites, cutting and pasting of locality coordinates, and multi-step data exporting and importing to update georeferences.
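To make the idea of coordinates with an associated uncertainty radius concrete, here is a small sketch. It does not call GEOLocate or reproduce its response format: the `Georeference` class and both functions are hypothetical names, and the great-circle distance uses the standard haversine formula on a spherical Earth. It shows how a downstream mapping step might test whether a point falls inside a georeference's uncertainty radius.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (spherical approximation)


@dataclass
class Georeference:
    """Hypothetical container for a georeferenced locality."""
    latitude: float       # decimal degrees
    longitude: float      # decimal degrees
    uncertainty_m: float  # uncertainty radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot near antipodal points.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def within_uncertainty(geo, lat, lon):
    """Return True if (lat, lon) lies inside the georeference's uncertainty radius."""
    return haversine_m(geo.latitude, geo.longitude, lat, lon) <= geo.uncertainty_m


# Example values are invented for illustration.
geo = Georeference(latitude=29.9511, longitude=-90.0715, uncertainty_m=5000.0)
inside = within_uncertainty(geo, 29.9600, -90.0700)
outside = within_uncertainty(geo, 30.9511, -90.0715)
```

An uncertainty polygon would need a point-in-polygon test instead of a distance check; the radius case is shown because it is the simpler of the two.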

The GEOLocate plug-in was designed and implemented by Djihbrihou Abibou, Nelson Rios and Hank Bart of Tulane University with funding from NSF BIO/DBI grant 085214.

 

Lifemapper

The Lifemapper Project produces software tools and computational services which ingest specimen data from biological collections along with environmental data layers to produce geographical analyses of species ranges. Lifemapper is organized in two core components: (1) a map archive of GBIF species distribution data and of predicted current and future, climate-influenced species distribution models, and (2) a suite of web services and software integration tools which enable researchers to predict and analyze multi-species, multi-scale patterns of species diversity. Analysis tools in Lifemapper include Species Distribution Modeling Services based on modeling algorithms in openModeller as well as Range and Diversity Services which enable research on the patterns and processes of species diversity in plant and animal communities.

We connected Specify 6.4 to Lifemapper mapping services as the first step toward deeper integration between the collections and biogeographic and phylogeographic research communities. In the future, Specify and Lifemapper will be more richly integrated through web services, to enable Specify users to launch and monitor species distribution model runs as well as manage climate and other custom occurrence data sets associated with biogeographic modeling experiments. Specify will become an increasingly versatile, open source platform for research workflows which start with specimen information.

Lifemapper is supported by US NSF grants: EPS 0919443, EHR/DRL 0918590, BIO/EF 0851290, and OCI 1135510.

 

Specify 6.4 Broader Impacts

 

Integration

These three new network connections in Specify bridge computational divides for biodiversity data providers. Specify integration with Lifemapper connects legacy biological collections data processing with the research workflows of broader biogeographical and environmental computing communities. SGR integration with data from GBIF completes the circle of integration for Specify institutions publishing their specimen records; it allows them to discover and harvest data from specimens which duplicate their own. Specify’s web service connection to GEOLocate efficiently delivers geospatial processing. The gazetteer lookups, natural language processing, and interactive georeferencing services of GEOLocate increase the broader research utility of specimen records by adding georeferences.

Expect more cross-cutting research capabilities and network integration in future releases of Specify. We are partnering to bring Specify collections into the proposed annotation architecture of the Filtered Push Project and to the image repository services of Morphbank, both Specify Collaboration Partners. Specify 7, which is being concurrently developed by U.S. and international labs, will include web clients for remote data entry and processing.

Core funding for the Specify Software Project is provided by NSF/DBI grant 0968352.