This work addresses inverse problems of optical spectroscopy: determining the concentrations of constituent ions in multicomponent aqueous solutions from their spectra. Because the problem is nonlinear and no adequate mathematical model exists for the spectra of multicomponent solutions, we solve it with machine learning methods trained on data from physical experiments. Inverse spectroscopy problems have high input dimensionality, with many features, both relevant and irrelevant. Moreover, some of the relevant features are redundant because of multicollinearity: the characteristic lines of the solution components have a finite width and span several spectral channels at once. This degrades the quality of the machine learning solution, so a feature selection procedure is needed that accounts for both the relevance and the redundancy of features, as well as for their nonlinear relationship with the parameters being determined. We consider a feature selection procedure that iteratively selects the features most relevant to the target variable and eliminates features that are strongly related to one another. In this procedure, weight analysis of a trained neural network serves as a nonlinear measure of relevance, and the Pearson correlation coefficient serves as the measure of the relationship between features. Finally, we compare the quality of the neural network solution for determining component concentrations from spectroscopic data on the full set of input features and on subsets obtained with the procedure under consideration and with traditional methods for selecting significant input features.
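The selection loop described above can be illustrated with a minimal sketch. It is not the authors' implementation. The function name `select_features`, the threshold `corr_threshold`, and the optional `max_features` limit are hypothetical. The relevance scores are assumed to be precomputed (in the study they come from weight analysis of a trained neural network; here they are simply passed in as an array), and the synthetic data stands in for real spectra.

```python
import numpy as np

def select_features(X, relevance, corr_threshold=0.9, max_features=None):
    """Greedy relevance/redundancy feature selection (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    relevance = np.asarray(relevance, dtype=float)
    # Absolute Pearson correlation between every pair of features (columns).
    corr = np.abs(np.corrcoef(X, rowvar=False))
    candidates = set(range(X.shape[1]))
    selected = []
    while candidates and (max_features is None or len(selected) < max_features):
        # Pick the remaining feature with the highest relevance score.
        best = max(candidates, key=lambda j: relevance[j])
        selected.append(best)
        candidates.discard(best)
        # Eliminate remaining features strongly correlated with the chosen one.
        candidates = {j for j in candidates if corr[best, j] < corr_threshold}
    return selected

# Synthetic stand-in for spectral channels: columns 0 and 1 are near-duplicates,
# mimicking adjacent channels covered by one wide characteristic line.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 3))
X = np.column_stack([
    base[:, 0],
    base[:, 0] + 0.01 * rng.normal(size=200),
    base[:, 1],
    base[:, 2],
])
relevance = np.array([0.9, 0.8, 0.5, 0.1])  # assumed, not measured
selected = select_features(X, relevance, corr_threshold=0.9)
print(selected)
```

In this toy setup the second column is discarded as redundant with the first, even though its relevance score is high, which is the behavior the procedure is meant to provide for overlapping spectral channels.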
This study was supported by the Russian Science Foundation, grant No. 24-11-00266, https://rscf.ru/en/project/24-11-00266/.