Keywords: colorectal cancer, pattern recognition, metabolomics workflow, data processing methods
Introduction: Recent research reports about 50,000 deaths from colorectal cancer (CRC) in the USA, and the authors postulate the need for a method to screen for CRC at early stages. Here we report a comprehensive UPLC-MS-based workflow for pattern recognition among three study groups (controls and CRC patients before and after surgery), together with an algorithm for optimizing the preprocessing and data acquisition methods.
Methods: Morning fasting urine samples (collected on an empty stomach, immediately after sleep) were obtained from control group volunteers (n = 8) and from colorectal cancer patients before (n = 20) and after (n = 12) surgery. HPLC separation was performed on a C18 column in gradient elution mode, with MS detection in positive ion mode (TIC). Samples were prepared using the "dilute and shoot" technique.
Results: Three signal drift correction methods were used to reduce unwanted variation [2]: total ion current, quantile, and median; uncorrected data were included for comparison. ANOVA was applied to test the correction methods. Two methods for normalizing metabolite concentrations were examined, mass spectrometry total useful signal and creatinine concentration, alongside no normalization. Log transformation and three scaling methods (auto, mean, and Pareto, plus no scaling) were tested. The dataset was then used to assess all 48 possible combinations of normalization, correction, transformation, and scaling, each followed by PLS-DA and sPLS-DA. In the sPLS-DA results, whenever a log transformation was used, the 10 features with the largest loadings were the same across all combinations, regardless of the other data processing methods applied.
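The transformation and scaling options named above can be illustrated with a minimal sketch for a single feature (one column of intensities across samples). This is not the authors' code: the function names are hypothetical, "mean" is assumed here to mean mean-centering only, and the sample standard deviation is assumed as the scaling factor.

```python
import math
from statistics import mean, stdev


def log_transform(values):
    """Natural-log transform; assumes strictly positive intensities."""
    return [math.log(v) for v in values]


def scale(values, method="none"):
    """Scale one feature column.

    method: "none", "mean" (assumed: mean-centering), "auto", or "pareto".
    """
    if method == "none":
        return list(values)
    m = mean(values)
    if method == "mean":
        return [v - m for v in values]
    s = stdev(values)  # sample standard deviation; zero for a constant column
    if s == 0:
        raise ValueError("cannot scale a constant feature")
    if method == "auto":
        # Auto scaling: centre, then divide by the standard deviation.
        return [(v - m) / s for v in values]
    if method == "pareto":
        # Pareto scaling: centre, then divide by the square root of the SD.
        return [(v - m) / math.sqrt(s) for v in values]
    raise ValueError(f"unknown scaling method: {method}")


# Hypothetical intensities for one feature in three samples.
feature = [2.0, 4.0, 6.0]
auto_scaled = scale(feature, "auto")
pareto_scaled = scale(feature, "pareto")
```

Pareto scaling shrinks large-variance features less aggressively than auto scaling, which is one reason the two can rank features differently before PLS-DA.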
After only the top features were extracted from the full data set, the sample groups were fully resolved from each other in all cases by PCA, sPLS-DA, PLS-DA, RF, dendrograms, and heat maps.
Conclusions: We propose a pragmatic decision procedure for selecting and evaluating preprocessing methods in metabolomics studies. The algorithm is based on estimating the influence of each stage using multivariate and univariate statistical analysis. Extracting the selected features from the raw data set leads to appropriate pattern recognition, provided the model is built correctly.
Novel Aspect: This LC-MS workflow, with its algorithm for evaluation and optimization and its scheme for performing statistical analysis operations, can be applied in other relatively short studies.
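The feature-extraction step described in the Results (keeping only the features with the largest loadings and rebuilding the data set from them) might look roughly like the sketch below. The loadings, feature names, and intensities are invented for illustration; in practice the loadings would come from a fitted sPLS-DA model, which is not shown here.

```python
def top_features(loadings, k=10):
    """Return the k feature names with the largest absolute loadings."""
    ranked = sorted(loadings, key=lambda name: abs(loadings[name]), reverse=True)
    return ranked[:k]


def extract(table, features):
    """Keep only the chosen feature columns in each sample record."""
    return {
        sample: {name: row[name] for name in features}
        for sample, row in table.items()
    }


# Hypothetical loadings from one preprocessing combination.
loadings = {"mz_101": 0.05, "mz_202": -0.62, "mz_303": 0.41, "mz_404": 0.10}

# Hypothetical intensity table: sample -> feature -> intensity.
table = {
    "control_1": {"mz_101": 10.0, "mz_202": 3.0, "mz_303": 7.5, "mz_404": 1.2},
    "patient_1": {"mz_101": 11.0, "mz_202": 9.0, "mz_303": 2.5, "mz_404": 1.1},
}

selected = top_features(loadings, k=2)
reduced = extract(table, selected)
```

Comparing the `selected` lists obtained under different preprocessing combinations is one simple way to check whether the top features stay the same, as the abstract reports for the log-transformed data.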
No. | Name | Description | File name | Size | Added |
---|---|---|---|---|---|
1. | IMSC_18_Plyushchenko.pdf | | IMSC_18_Plyushchenko.pdf | 598.8 KB | 9 September 2018 [Plyush1993] |