Restriction-modification (R-M) systems are bacterial and archaeal defense systems that protect the prokaryotic cell from invading foreign DNA. The distribution of R-M systems in natural multi-species communities has not been sufficiently studied. We studied the R-M system gene composition of the microbial community of the hypersaline Antarctic Deep Lake, a stable and closed community. Complete genomes of three dominant species and one less abundant species from this lake are available, as well as a number of metagenomes. We performed a bioinformatic analysis of R-M system genes in five metagenomes and in the complete genomes of six strains of four archaeal species from Deep Lake. The analysis is based on a homology search of metagenomic contigs and genomic sequences against REBASE proteins and on the detection of R-M system-related Pfam domains with HMM profiles. We predicted 4789 R-M system genes. These genes can be grouped into 1389 clusters of homologous genes (>50% identity of translations) and 2306 clusters of nearly identical genes (>98% identity of translations). Only a small part of this diversity is encoded in the genomes of the dominant archaeal species: of the 2306 clusters of nearly identical genes, only 97 include genes from the complete genomes, and 93 of these 97 clusters are also represented in the metagenomes. The lists of genes we obtained for the complete genomes differ significantly from the lists annotated in REBASE for the same genomes: we predicted 61 putative R-M system genes not present in REBASE, while REBASE contains 18 genes from the four archaeal genomes that do not meet our criteria. One of the archaeal species, Halorubrum lacusprofundi, shows inter-strain heterogeneity in R-M system composition, probably as a result of intense gene exchange. Most R-M system genes shared by different species are located within rather large highly identical regions (HIRs) of the genomes.
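The cluster counts reported above (clusters containing genes from complete genomes, and those also represented in metagenomes) amount to a tally over cluster membership. A minimal sketch of such a tally follows. It assumes the clustering has already been done by some external tool; the function name, the input dictionaries and the source labels "genome" and "metagenome" are hypothetical and do not come from the study.

```python
from collections import defaultdict


def tally_clusters(gene_to_cluster, gene_source):
    """Count clusters by the origin of their member genes.

    gene_to_cluster: gene id -> cluster id (hypothetical input format)
    gene_source:     gene id -> "genome" or "metagenome" (hypothetical labels)
    """
    sources_by_cluster = defaultdict(set)
    for gene, cluster in gene_to_cluster.items():
        sources_by_cluster[cluster].add(gene_source[gene])

    # Clusters with at least one gene from a complete genome.
    with_genome = [c for c, s in sources_by_cluster.items() if "genome" in s]
    # Of those, clusters that also contain a metagenomic gene.
    shared = [c for c in with_genome if "metagenome" in sources_by_cluster[c]]

    return {
        "total": len(sources_by_cluster),
        "with_genome": len(with_genome),
        "genome_and_metagenome": len(shared),
    }
```

Applied to real data, the three returned numbers would correspond to the figures quoted in the text (2306, 97 and 93 for the nearly identical clusters).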
We divided all R-M system genes into three classes (rare, moderate and abundant) according to their coverage by metagenomic reads. R-M system Types differ substantially in read depth: among genes of R-M proteins of Types II, IIG and IV, the fractions of highly covered genes are significantly higher than for Types I and III. Some R-M system genes in the complete genomes have metagenomic read coverage that differs significantly from that of housekeeping genes of the same species, which can be interpreted as a sign of high mobility of those genes. We also studied the avoidance of recognition sites of Type II R-M systems in the genomes and in large metagenomic contigs. The sites most often found to be under-represented are AGCT, CTAG, GATC and CCWGG. The work was supported by the Russian Science Foundation, grant no. 21-14-00135.
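Under-representation of a recognition site can be illustrated with an observed-to-expected ratio. The sketch below is not the statistic used in the study, which the abstract does not specify. It assumes a simple zero-order background model based on mononucleotide frequencies, and the function names are hypothetical. Degenerate positions such as W (A or T) are expanded through a small IUPAC table. Since all four sites named above are palindromic, counting on one strand is sufficient.

```python
import re
from collections import Counter

# Subset of IUPAC nucleotide codes, enough for sites such as CCWGG.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "S": "CG", "R": "AG", "Y": "CT", "N": "ACGT",
}


def site_regex(site):
    """Compile a lookahead pattern so overlapping occurrences are counted."""
    body = "".join("[" + IUPAC[base] + "]" for base in site)
    return re.compile("(?=" + body + ")")


def observed_expected(seq, site):
    """Observed/expected ratio of a site under a zero-order model (assumption).

    Returns None if the sequence is too short or the expectation is zero.
    """
    seq = seq.upper()
    n, k = len(seq), len(site)
    if n < k:
        return None

    observed = len(site_regex(site).findall(seq))

    counts = Counter(seq)
    total = sum(counts[b] for b in "ACGT")
    if total == 0:
        return None
    freq = {b: counts[b] / total for b in "ACGT"}

    # Probability of the site at one position: product over site positions
    # of the summed frequencies of the bases allowed there.
    p = 1.0
    for base in site:
        p *= sum(freq[x] for x in IUPAC[base])

    expected = p * (n - k + 1)
    return observed / expected if expected > 0 else None
```

A ratio well below 1 for a site in a genome or contig would suggest avoidance under this simple model; a rigorous analysis would use a higher-order background and a significance test.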