NCRI Informatics Initiative and data standards

The NCRI data sharing policy is raising questions about how data should be represented to facilitate sharing, making the need for data exchange standards critical and immediate.

The NCRI Informatics Initiative supports the development of standards for describing, formatting, submitting, and exchanging both data and metadata.

The NCRI Informatics Initiative is working with the relevant communities to identify, evaluate and promote the adoption of common standards within the domains of the Cancer InfoMatrix.

If you know of any other relevant data standards that should appear on the Cancer InfoMatrix, or if you need to find out more about standards, please contact us.

What are data standards?

Data standards are consensus specifications for representing data from different sources or settings.

Categories of data standards

It is important to distinguish between standards that specify how to actually do experiments and standards that specify how to describe experiments. This section focuses on standards that specify how to describe and communicate data and information including checklists (e.g. minimum reporting guidelines for metadata descriptions), syntax (data exchange languages) and semantics (data models, ontologies and controlled vocabularies).
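As a rough illustration of how these three categories fit together, the sketch below checks a metadata record against a checklist and a controlled vocabulary, then serialises it in an exchange syntax. Everything in it is an assumption made for illustration: the field names, the vocabulary terms and the choice of JSON are hypothetical and are not taken from any standard listed on this page.

```python
import json

# Hypothetical checklist: fields a record must report (not from any real guideline).
REQUIRED_FIELDS = ("sample_id", "tissue_type", "collection_date")

# Hypothetical controlled vocabulary for one field.
TISSUE_VOCABULARY = {"breast", "lung", "colon"}


def check_record(record):
    """Return a list of problems found in a metadata record; empty means it passes."""
    problems = []
    # Checklist: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            problems.append(f"missing field: {field}")
    # Semantics: the tissue type must be a term from the controlled vocabulary.
    tissue = record.get("tissue_type")
    if tissue and tissue not in TISSUE_VOCABULARY:
        problems.append(f"unknown tissue_type: {tissue}")
    return problems


def to_exchange_format(record):
    """Syntax: serialise the record as JSON with sorted keys for a stable layout."""
    return json.dumps(record, sort_keys=True)


good = {"sample_id": "S1", "tissue_type": "lung", "collection_date": "2008-08-05"}
bad = {"sample_id": "S2", "tissue_type": "skin"}

print(check_record(good))        # no problems reported
print(check_record(bad))         # one missing field, one unknown term
print(to_exchange_format(good))
```

In practice a community would publish the checklist and vocabulary as shared, versioned resources rather than hard-coding them, so that every group validates against the same definitions.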

Standards can be informal or formal. Informal standards are used by a wide community but have not gone through a certification process at a recognised institution. Formal standards are also used by a wide community but, unlike informal standards, they are defined and maintained within a recognised institution.

Why use data standards?

In this high-throughput, open-source era, access to data is taken for granted. However, data alone is of little use unless it is made available in a usable form through the development and global uptake of data standards.

The adoption of common standards by any community provides a robust foundation for data portability, sharing, integration, interoperability and reuse.

Useful resources

Several synergistic activities have begun that aim to foster the harmonisation and consolidation of data standards, including:

BRIDG – The Biomedical Research Integrated Domain Group Model

The BRIDG Model is a collaborative effort of stakeholders from the Clinical Data Interchange Standards Consortium (CDISC), the HL7 Regulated Clinical Research Information Management Technical Committee (RCRIM TC), the National Cancer Institute (NCI), and the US Food and Drug Administration (FDA) to produce a shared view of the dynamic and static semantics that collectively define a shared domain-of-interest, i.e. the domain of clinical and pre-clinical protocol-driven research and its associated regulatory artefacts.

EQUATOR Network – Enhancing the Quality and Transparency of Health Research

The EQUATOR Network is a new initiative that seeks to improve the quality of scientific publications by promoting transparent and accurate reporting of health research.

MIBBI – Minimum Information for Biological and Biomedical Investigations

The MIBBI project maintains a web-based communal resource designed to act as a one-stop shop for exploring the range of extant checklist projects and to foster collaborative, integrative development of checklists.

OBO Foundry – Open Biological Ontology Foundry

The OBO Foundry is a collaborative experiment involving developers of science-based ontologies who are establishing a set of principles for ontology development with the goal of creating a suite of orthogonal interoperable reference ontologies in the biomedical domain.

International and national bodies that formally approve standards or provide a framework for standards development include:

ANSI – American National Standards Institute

ASTM International – American Society for Testing and Materials International

BSI – British Standards Institution

CEN – European Committee for Standardisation

GSC – Genomic Standards Consortium

HL7 – Health Level 7

HUPO PSI – Human Proteome Organisation Proteomics Standards Initiative

ISO – International Organisation for Standardisation

MGED – Microarray and Gene Expression Data Society

Last updated 05.08.2008. © Copyright NCRI Informatics Initiative 2008.