Bioassay Classification Study via LC-MS and Machine Learning in Conjunction with Dimensionality Reduction - conference talk | ISTINA (Intelligent System for Thematic Research of Scientometric Data)

Authors: Plyushchenko I.V., Rodin I.A.
International conference: MSACL 2019 EU
Conference dates: 22-26 September 2019
Talk date: 25 September 2019
Talk type: Poster
Presenter: Plyushchenko I.V.
Venue: Salzburg, Austria
Abstract:
INTRODUCTION: Metabolomics data often contain thousands of features, but only some of them carry useful information about clinical status or other kinds of systems biology data. One of the first steps toward realizing global concepts such as personalized medicine and systems biology is to compile a list of the most stable and robust approaches for extracting informative metabolites.
OBJECTIVES: The primary aim of our research is to apply machine learning to the selection of important features from metabolomics data without heavy and unstable preprocessing stages (such as QC-based correction, scaling, transformation, decomposition, etc.). We applied only creatinine normalization and half-minimum missing value imputation to the raw data.
METHODS: LC-MS analysis of 40 urine samples was performed on a C18 column (Waters) coupled to an IT-TOF instrument (Shimadzu). The metabolite data table was obtained from iMet-Q software after integration and alignment. All calculations for model training, resampling, hyperparameter tuning, variable importance ranking, computing feature frequency across resampling stages, and recursive feature elimination were done in the R environment (mainly with the caret package). All other computations were also performed in R. The resulting data processing pipeline was tested on one metabolomics dataset from an open repository.
RESULTS: In all datasets (experimental and from the open repository), clinical groups were clearly and correctly separated by hierarchical cluster analysis and principal component analysis. Correct pattern recognition was achieved for the reduced datasets after feature selection based on a combination of machine learning training and the results of univariate analysis.
CONCLUSION: This report briefly demonstrates the potential for creating and validating useful approaches to marker discovery in high-dimensional data. The combined efforts of many researchers can lead to the adoption of more rational and robust techniques than the classical methods (ANOVA, FCA, PCA, VIP scores from (s,o)-PLSDA), especially for non-linear and complex problems. This work was funded by the Russian Foundation for Basic Research (RFBR), according to the research project No. 19-33-90071.
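Illustrative note: the two preprocessing steps named in the abstract (half-minimum missing value imputation and creatinine normalization) can be sketched as follows. This is a minimal Python sketch, not the authors' R code. The function names, the samples-by-features table layout, the use of `None` for missing values, and the order of the two steps are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def half_min_impute(table: Sequence[Sequence[Optional[float]]]) -> List[List[float]]:
    """Replace missing values (None) in each feature column with half of
    the minimum value observed in that column.

    Assumed layout: rows are samples, columns are features.
    """
    result = [list(row) for row in table]
    n_features = len(result[0])
    for j in range(n_features):
        observed = [row[j] for row in result if row[j] is not None]
        if not observed:
            raise ValueError(f"feature {j} has no observed values")
        fill = min(observed) / 2.0
        for row in result:
            if row[j] is None:
                row[j] = fill
    return result


def creatinine_normalize(table: Sequence[Sequence[float]],
                         creatinine: Sequence[float]) -> List[List[float]]:
    """Divide every feature intensity of a sample by that sample's
    creatinine signal (one creatinine value per row)."""
    if len(table) != len(creatinine):
        raise ValueError("one creatinine value is required per sample")
    return [[value / c for value in row] for row, c in zip(table, creatinine)]


# Hypothetical toy data: 2 samples x 2 features, one missing intensity.
raw = [[10.0, None],
       [4.0, 8.0]]
imputed = half_min_impute(raw)
normalized = creatinine_normalize(imputed, creatinine=[2.0, 4.0])
```

Imputing before normalizing is only one possible ordering; the abstract does not state which order was used.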
Added to the system by: Plyushchenko Ivan Viktorovich
