The human genome is populated by multiple copies of mobile genetic elements (MGEs). Most of them are inactivated, but a few MGE families still encode functional proteins and actively expand their copy number in the host genome. In recent years, interest has grown in studying MGE activity, which causes various diseases and plays a role in aging. However, to date, no systematic analysis of MGE translation has been performed. Analysis of high-throughput sequencing data for MGEs is complicated by their polymorphism, repetitive nature, and interspersion. Here, we developed a new approach to identify translated open reading frames (ORFs) within MGE consensus sequences. Using ribosome profiling data from human cultured and blood cells and mass spectrometry data from multiple human tissues and cell types, we identified 533 ORFs with protein-coding capacity across 19 subfamilies of human MGEs. Among them, we found a number of ORFs in non-autonomous MGEs, which were previously considered non-coding. Our results suggest that transcripts derived from many MGE families are efficiently translated in human cells. Based on these results, we created an online database of MGE ORFs with protein-coding capacity to direct future research.