Data knowledge base prototype for modern scientific collaborations
Article; research article; electronic publication

Citation information for this article was obtained from Scopus.
Date of the last search for this article in external sources: 31 July 2019.

[1] Data knowledge base prototype for modern scientific collaborations / M. A. Grigorieva, V. A. Aulov, M. V. Golosova et al. // CEUR Workshop Proceedings. Vol. 1787 of Selected Papers of the 7th International Conference Distributed Computing and Grid-technologies in Science and Education. Aachen, Germany, 2016. P. 26–33.

Large-scale modern scientific experiments typically share a long lifetime, a complex experimental infrastructure, sophisticated data analysis and processing tools, and peta- to exascale data volumes. Every stage of an experiment's life cycle is accompanied by auxiliary metadata required for monitoring, control, and the replicability and reproducibility of scientific results. A pressing issue for most scientific communities is the very loose coupling between the metadata describing the data processing cycle and the metadata representing the annotation, indexing and publication of the experimental results. Moreover, to reproduce and verify a previous data analysis, scientists need to conduct studies under the same conditions, or to process a data collection with new software releases and/or algorithms. For this reason, all information about the data analysis process must be preserved, from the initial hypothesis through the description of the processing chain, data collection and the initial presentation of results to the final publication. A knowledge-based infrastructure (Data Knowledge Base, DKB) makes this possible and provides fast access to relevant scientific and accompanying information. DKB operates on the basis of a HEP data analysis ontology. The architecture has two data storage layers: Hadoop storage, where data from many metadata sources are integrated and processed to obtain knowledge-based characteristics of all stages of the experiment, and Virtuoso RDF storage, where all extracted data are registered.
DKB agents process and aggregate metadata from data management and data processing systems, the metadata interface, archives of conference notes, workshop and meeting agendas, and publications. In addition, these data are linked to the documentation pages for scientific topics (such as TWiki pages and Google documents) and to information extracted from the full texts of the experiment's supporting documentation. In this way, rather than requiring physicists to annotate all meta-information in detail, DKB agents will extract, aggregate and integrate all necessary metadata automatically.
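
The agent-based integration described above can be illustrated with a minimal sketch. It is not the DKB implementation: the agent functions, the record fields and the predicate names (`usesDataset`, `producedDataset`, `usedSoftware`) are hypothetical, and a plain Python list stands in for the RDF storage. It only shows how triples produced independently from two metadata sources could be joined to trace a publication back to the processing tasks behind it.

```python
from typing import Iterable, List, NamedTuple


class Triple(NamedTuple):
    """A subject-predicate-object statement, as an RDF store would hold it."""
    subject: str
    predicate: str
    obj: str


def paper_agent(record: dict) -> Iterable[Triple]:
    """Hypothetical agent: turns one publication record into triples."""
    paper = f"paper:{record['id']}"
    yield Triple(paper, "hasTitle", record["title"])
    for dataset in record.get("datasets", []):
        yield Triple(paper, "usesDataset", f"dataset:{dataset}")


def processing_agent(record: dict) -> Iterable[Triple]:
    """Hypothetical agent: turns one processing-task record into triples."""
    task = f"task:{record['task_id']}"
    yield Triple(task, "producedDataset", f"dataset:{record['output']}")
    yield Triple(task, "usedSoftware", record["release"])


def tasks_for_paper(triples: List[Triple], paper: str) -> List[str]:
    """Follow paper -> dataset -> task links across the two sources."""
    used = {t.obj for t in triples
            if t.subject == paper and t.predicate == "usesDataset"}
    return sorted(t.subject for t in triples
                  if t.predicate == "producedDataset" and t.obj in used)


# Illustrative input records (invented for this sketch).
papers = [{"id": "P1", "title": "Example analysis", "datasets": ["D1", "D2"]}]
tasks = [
    {"task_id": "T1", "output": "D1", "release": "release-A"},
    {"task_id": "T2", "output": "D3", "release": "release-B"},
]

# Stand-in for the RDF storage layer: all extracted triples in one list.
store: List[Triple] = []
for p in papers:
    store.extend(paper_agent(p))
for t in tasks:
    store.extend(processing_agent(t))

print(tasks_for_paper(store, "paper:P1"))
```

In this sketch neither source is annotated by hand: the link between the paper and task T1 emerges only because both agents refer to the same dataset identifier.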