NCRI Informatics Initiative

 

The NCRI Informatics Initiative was established to address how the cancer research community can more effectively discover, share and utilise the vast amounts of information and data generated in this field each year. We are therefore exploring ways of revolutionising how information is accessed, exploiting advances in Information Technology to create a more open, integrated research environment in which research and clinical data and information can be shared. We believe that wider access to information, data and other cancer-related resources will accelerate the translation of new discoveries in the lab into effective therapies in the clinic.

The NCRI Oncology Information Exchange

Our strategy to enable wider access to cancer-related information has been to work in collaboration with the cancer research community to develop an IT infrastructure, the NCRI Oncology Information Exchange (ONIX), which will form an integrated, web-based research environment where data and information can be easily shared amongst scientists.
ONIX’s main function is to hold a central register of resources useful to cancer researchers and to allow queries to be run simultaneously across connected databases, enabling users to find and retrieve information in its diverse forms quickly and easily.
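The idea of running one query simultaneously across connected databases can be illustrated with a minimal Python sketch. It is not a description of how ONIX is implemented: the resource names, their contents and the functions `search_resource` and `federated_search` are all hypothetical, and small in-memory lists stand in for what would in practice be remote services.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for connected databases. In a real
# deployment each of these would be a remote service, not a local list.
RESOURCES = {
    "genes": ["BRCA1 gene record", "TP53 gene record"],
    "publications": ["BRCA1 and breast cancer risk", "Imaging in oncology"],
    "tissue_banks": ["Breast tissue collection"],
}


def search_resource(name, term):
    """Return (resource name, matching entries) for a single resource."""
    needle = term.lower()
    hits = [entry for entry in RESOURCES[name] if needle in entry.lower()]
    return name, hits


def federated_search(term):
    """Send the same query to every registered resource concurrently."""
    with ThreadPoolExecutor(max_workers=len(RESOURCES)) as pool:
        results = list(
            pool.map(lambda name: search_resource(name, term), RESOURCES)
        )
    # Keep only the resources that returned at least one hit.
    return {name: hits for name, hits in results if hits}
```

Calling `federated_search("brca1")` in this sketch returns hits grouped by the resource they came from, so a user can see at a glance which sources hold relevant information.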

The Development of ONIX

After a number of initial review releases, ONIX (V1.2) was launched publicly in July 2009 and has since undergone review by an expanded user group drawn from researchers nominated by our main funding bodies. Improvements based on their feedback will be built into future versions of ONIX.

ONIX, in its initial form, focuses on building up a knowledge base, known as the Resource Catalogue, of what exists and is available for re-use. “A resource” can be a data source, a research group, a dataset, an analytical tool or a data standard of some description; in fact, anything that a researcher might find of interest. The Unit sees the creation of a federated, international set of resource entries as the ultimate goal. The initial source being trialled is the National Cancer Institute’s caBIG® grid services catalogue, but work is underway to enable information gathering from many parts of the NHS, from the research community itself (via ventures such as the BioCatalogue) and from the funders of research (such as CR-UK’s tissue collection information). The major release of ONIX (V1.4) in Q1 2010 changed the user interface so that the Resource Catalogue can be searched quickly across the potentially vast array of information on offer, which could run into thousands of resource entries.
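As a rough illustration of what a catalogue entry of this kind might look like, the Python sketch below models the resource types listed above. The class names, the fields (`origin`, `description`) and the `find` helper are assumptions made for the example and do not reflect the actual ONIX data model.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """Resource types named in the text above."""
    DATA_SOURCE = "data source"
    RESEARCH_GROUP = "research group"
    DATASET = "dataset"
    ANALYTICAL_TOOL = "analytical tool"
    DATA_STANDARD = "data standard"


@dataclass(frozen=True)
class ResourceEntry:
    name: str
    kind: Kind
    origin: str            # hypothetical: the catalogue the entry came from
    description: str = ""  # hypothetical free-text summary


def find(catalogue, text, kind=None):
    """Return entries whose name or description contains `text`,
    optionally restricted to one resource kind."""
    needle = text.lower()
    return [
        entry for entry in catalogue
        if (kind is None or entry.kind is kind)
        and (needle in entry.name.lower()
             or needle in entry.description.lower())
    ]


# Made-up entries, for illustration only.
catalogue = [
    ResourceEntry("Example tissue collection", Kind.DATASET, "funder"),
    ResourceEntry("Example alignment service", Kind.ANALYTICAL_TOOL,
                  "grid catalogue", "Sequence alignment for tissue samples"),
]
```

Here `find(catalogue, "tissue")` matches both entries, while `find(catalogue, "tissue", Kind.DATASET)` narrows the result to the dataset alone.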
On the search capability side, the initial work has again focused on information discovery, through our Quick Search tool. This is now connected to some 45 data resources globally, including major data sets held at the NCBI, EMBL-EBI and KEGG. Aside from simultaneous searching across multiple resources, the launch of ONIX V1.3 in November 2009 introduced a unified set of search operators and wildcards. Where a resource supports operators, these enable the researcher to target information retrieval more specifically. A key finding has been the great variety of interface approaches, and the inconsistency of operator support, seen across the research community; this is something the Unit will publicise in the coming months. In addition, further resources will be connected to ONIX beyond the 45 currently available.
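A unified operator set implies some translation step between the query a researcher types and the syntax each resource understands. The Python sketch below shows one way this could work. The unified syntax used here (`AND` between terms, `*` as a wildcard), the three resource profiles and the `translate` function are all assumptions for illustration, not the operators ONIX actually defines.

```python
# Hypothetical per-resource syntax profiles. None means the resource does
# not support that feature at all.
RESOURCE_SYNTAX = {
    "resource_a": {"wildcard": "*", "and": " AND "},
    "resource_b": {"wildcard": "%", "and": " & "},
    "resource_c": {"wildcard": None, "and": None},  # no operator support
}


def translate(query, resource):
    """Rewrite a unified query (terms joined by ' AND ', '*' as wildcard)
    into the native form expected by one resource."""
    syntax = RESOURCE_SYNTAX[resource]
    terms = [term.strip() for term in query.split(" AND ")]
    native_terms = []
    for term in terms:
        if syntax["wildcard"] is None:
            # Degrade gracefully: drop wildcards the resource cannot handle.
            term = term.replace("*", "")
        else:
            term = term.replace("*", syntax["wildcard"])
        native_terms.append(term)
    # Resources without an AND operator receive plain space-separated terms.
    joiner = syntax["and"] if syntax["and"] is not None else " "
    return joiner.join(native_terms)
```

With these assumed profiles, `translate("kinase* AND breast", "resource_b")` gives `kinase% & breast`, while the same query for `resource_c` degrades to `kinase breast`, which is one way of handling resources that lack operator support.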
 
The use of ONIX as a training tool has also been pioneered with UCL Medical School. Students were successfully trained, using the Quick Search facility, in how to use basic research assets such as genomic and publication repositories. In return, the Unit gained access to a large feedback group to review some of our initial screen designs.

Over the coming year, the focus will switch to the development of search tools that allow not only discovery but also the joining and extraction of information in order to create new data sets. A successful prototype has been developed, and the challenge now is to deploy it in a production environment.
While ONIX provides a simple vocabulary support service, work will also go into expanding the level of information about vocabularies relevant to cancer researchers and into providing better access to them. The consistent annotation of new data with vocabularies is a key component in allowing easier discovery and re-use of information.

Getting Involved

One of our highest priorities is to develop ONIX’s electronic resource catalogue. This catalogue will not be confined to biomedical data generated from research projects, but will also cover data analysis tools, new research technology and projects involved in standards development. We are always interested in hearing from owners of such resources who are willing to log a small amount of information with us. If you are interested in registering your resource in the ONIX catalogue, please contact us.

Conceptual Diagram of ONIX

[Figure: conceptual diagram of ONIX]


Last updated 29.03.2010. © Copyright NCRI Informatics Initiative 2010