Microbial rhodopsins are a superfamily of photoactive retinal-binding proteins that are widespread among microorganisms and lower eukaryotes. These proteins share the same seven-transmembrane-helix structure, but their functions are diverse. The study of microbial rhodopsins has contributed to the understanding of membrane protein structure and function, photochemistry, sensory signaling, protein evolution, and the mechanisms by which organisms interact with light. Several microbial rhodopsin families (bacteriorhodopsins, halorhodopsins, etc.) are reported in the literature, and assigning an amino acid sequence to a family requires comparing it with all homologous sequences. Machine learning algorithms can classify sequences without such a comparison, but training them correctly requires large data sets, which are not currently available. This problem can be addressed by generating artificial sequences that have the properties of natural sequences. The objective of this work is to generate artificial sequences of microbial rhodopsins. We divided the microbial rhodopsin superfamily into 14 classes using clustering and generated artificial sequences for each class. Our method takes into account the amino acid composition specific to transmembrane and non-membrane fragments, which makes it possible to bring the artificial sequences as close as possible in structure to natural ones. Expanding the set of natural microbial rhodopsin sequences with artificial ones can therefore help in studying the structural and functional characteristics of various rhodopsins. In the future, we plan to use the extended dataset to create and optimize a neural-network-based algorithm that classifies microbial rhodopsin sequences into the 14 classes.
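The abstract does not specify the generation procedure, so the following is only a minimal sketch of one way that region-specific amino acid composition could be used, not the authors' method. It assumes that every residue of a natural sequence is labelled as transmembrane (`T`) or non-membrane (`N`), estimates residue frequencies separately for each label, and samples a new sequence that follows the same region layout. The function names `region_frequencies` and `generate_sequence`, the label scheme, and the toy sequences are all hypothetical.

```python
import random
from collections import Counter


def region_frequencies(sequences, masks):
    """Count residues separately for each region label (e.g. 'T' or 'N')."""
    counts = {}
    for seq, mask in zip(sequences, masks):
        if len(seq) != len(mask):
            raise ValueError("sequence and mask must have the same length")
        for residue, label in zip(seq, mask):
            counts.setdefault(label, Counter())[residue] += 1
    return counts


def generate_sequence(mask, counts, rng):
    """Sample one residue per position from the distribution of its region."""
    out = []
    for label in mask:
        residues = list(counts[label].keys())
        weights = list(counts[label].values())
        out.append(rng.choices(residues, weights=weights, k=1)[0])
    return "".join(out)


# Toy, made-up data: two short "natural" sequences of one class and their
# region masks (T = transmembrane, N = non-membrane). Not real rhodopsins.
natural = ["MKLLAVIGLSDEK", "MRLIAVLGFTDDR"]
masks = ["NNTTTTTTTNNNN", "NNTTTTTTTNNNN"]

counts = region_frequencies(natural, masks)
rng = random.Random(0)  # fixed seed so the sketch is reproducible
artificial = generate_sequence(masks[0], counts, rng)
print(artificial)
```

Because residues are drawn independently at each position, this sketch preserves only the per-region composition; it does not capture conserved motifs or correlations between positions, which a realistic generator would need to handle.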