Abstract: Nucleosomes are the basic units of eukaryotic chromatin. Each contains 147 bp of DNA wrapped in 1.65 superhelical turns around a histone octamer. Nucleosome positioning influences the regulation of gene expression, DNA replication, repair, and other processes in eukaryotic genomes. Moreover, nucleosomes provide the first level of DNA packaging and control the accessibility of the genome to DNA-binding proteins. Consequently, understanding the mechanisms of nucleosome positioning is crucially important. In this work we propose a model that uses a Random Forest classifier to predict nucleosome positioning from mechanical, physico-chemical, and structural characteristics of the nucleotide sequence. The most generally accepted view is that nucleosome positioning is determined by the mechanical features of DNA and its ability to wrap around a histone octamer [1]. The other opinion is that electrostatic potential plays a decisive role in DNA-histone interaction [2]. The main idea of the improved method presented in this work is to merge these two popular approaches and predict nucleosome positioning from mechanical and thermodynamic features of DNA together with electrostatic profiles calculated using the Poisson-Boltzmann equation. The analyzed dataset contains about 10,000 nucleosomal and linker sequences of 147 bp length from the Saccharomyces cerevisiae genome, obtained using a new chemical cleavage method, H3Q85C [3]. The Random Forest classifier was trained on these data and showed a precision of 86% on the test set. According to the feature selection procedure, the most important features are the electrostatic potential and the bendability of DNA; however, other thermodynamic and structural properties also contribute to model performance. The constructed joint model showed that taking into account different factors (electrostatic, mechanical, thermodynamic, and structural) at the level of dinucleotides greatly improves the predictive power of the model.
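To illustrate what a dinucleotide-level description of a 147 bp sequence might look like, the following is a minimal sketch and not the authors' pipeline. The function names, the feature layout, and the `GC_FRACTION` table are assumptions made for this example. `GC_FRACTION` is only a stand-in computed from base composition; it takes the place of the mechanical, thermodynamic, or electrostatic dinucleotide tables, whose values are not given in the abstract.

```python
from itertools import product

BASES = "ACGT"
# All 16 dinucleotide steps in a fixed order: AA, AC, ..., TT.
DINUCLEOTIDES = ["".join(pair) for pair in product(BASES, repeat=2)]


def dinucleotide_profile(seq, table):
    """Map every overlapping dinucleotide step of `seq` to a property value.

    `table` is a dict from dinucleotide to a number. The sequence is assumed
    to contain only A, C, G, and T (an 'N' would raise KeyError).
    """
    seq = seq.upper()
    return [table[seq[i:i + 2]] for i in range(len(seq) - 1)]


def dinucleotide_frequencies(seq):
    """Return the relative frequency of each of the 16 dinucleotide steps."""
    seq = seq.upper()
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    for i in range(len(seq) - 1):
        counts[seq[i:i + 2]] += 1
    total = len(seq) - 1
    return [counts[d] / total for d in DINUCLEOTIDES]


# Stand-in property table (an assumption for this sketch): the fraction of
# G/C bases in each dinucleotide. A real table would hold measured or
# computed physical values per dinucleotide.
GC_FRACTION = {d: sum(base in "GC" for base in d) / 2 for d in DINUCLEOTIDES}


def feature_vector(seq, tables):
    """Concatenate dinucleotide frequencies with the mean of each profile."""
    features = dinucleotide_frequencies(seq)
    for table in tables:
        profile = dinucleotide_profile(seq, table)
        features.append(sum(profile) / len(profile))
    return features


if __name__ == "__main__":
    example = "ACGT" * 36 + "ACG"  # hypothetical 147 bp sequence
    vector = feature_vector(example, [GC_FRACTION])
    print(len(example), len(vector))
```

Vectors of this kind, one per nucleosomal or linker sequence, could then be passed to a Random Forest implementation such as scikit-learn's `RandomForestClassifier`. Averaging each profile is a simplification chosen here; keeping the full position-resolved profile is an equally plausible design.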