A successful hybrid deep learning model aiming at promoter identification
Sat Jan 01 2022
Ying Wang et al. BMC Bioinformatics. 2022.
Abstract
Background: The promoter, the region adjacent to a transcription start site (TSS), is primarily involved in the initiation and regulation of DNA transcription. Accurate promoter identification is therefore critical for understanding the networks that control genomic regulation. A number of methods for identifying promoters have been proposed, but because promoters are highly heterogeneous, their results remain unsatisfactory. To extract more discriminative features and recognize promoters accurately, we developed the hybrid model for promoter identification (HMPI), a hybrid deep learning model that characterizes both the native sequences of promoters and their structural profiles at the same time. The HMPI combines two components: the PSFN (promoter sequence features network), which characterizes native promoter sequences and deduces sequence features, and the DSPN (deep structural profiles network), which is specifically designed to model promoters in terms of their structural profiles and to deduce their structural attributes.
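As a rough illustration of the two-view idea described above, the sketch below prepares one input per branch for a single sequence. It is not the authors' implementation: the abstract does not state how sequences are encoded or which structural properties the DSPN uses, so the one-hot encoding, the GC-fraction stand-in profile, and the function names (`one_hot`, `gc_profile`, `hybrid_inputs`) are all assumptions made for illustration.

```python
# Illustrative sketch only. The encodings and names here are assumptions,
# not taken from the paper.
BASES = "ACGT"


def one_hot(seq):
    """Sequence view: one 4-element indicator row per nucleotide.

    Characters outside A/C/G/T (for example N) become an all-zero row.
    """
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def gc_profile(seq):
    """Profile view: one value per dinucleotide step along the sequence.

    GC fraction is used purely as a stand-in for a real structural
    property, which the abstract does not specify.
    """
    seq = seq.upper()
    return [sum(c in "GC" for c in seq[i:i + 2]) / 2 for i in range(len(seq) - 1)]


def hybrid_inputs(seq):
    """Return both views, as a two-branch model would consume them together."""
    return {"sequence": one_hot(seq), "profile": gc_profile(seq)}


example = hybrid_inputs("GCAT")
```

In a hybrid model of this kind, the first view would feed a sequence branch (such as a CNN) and the second a profile branch, with the learned features from both combined before classification.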
Results: The HMPI was applied to human, plant and Escherichia coli K-12 strain datasets. It extracted promoter features successfully and greatly improved promoter identification performance. In addition, with synthetic sampling, transfer learning and label smoothing regularization added, the improved HMPI models achieved good results in identifying promoter subtypes on prokaryotic promoter datasets.
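Label smoothing regularization, one of the improvements named above, can be sketched in its standard textbook form: the one-hot target is mixed with a uniform distribution over the classes. The smoothing factor `epsilon=0.1` and the four-class example are assumptions for illustration; the abstract does not report the values or the exact variant the authors used.

```python
import math


def smooth_labels(class_index, num_classes, epsilon=0.1):
    """Smoothed target: (1 - epsilon) * one_hot + epsilon / num_classes.

    epsilon is a hypothetical setting, not a value from the paper.
    """
    uniform = epsilon / num_classes
    return [
        (1.0 - epsilon) * (1.0 if k == class_index else 0.0) + uniform
        for k in range(num_classes)
    ]


def cross_entropy(target, probs):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


# Hypothetical four-class subtype problem with true class 2.
target = smooth_labels(class_index=2, num_classes=4, epsilon=0.1)
loss = cross_entropy(target, [0.1, 0.1, 0.7, 0.1])
```

Because every class keeps a small share of the target mass, the loss penalizes overconfident predictions, which is the usual motivation for using this regularizer on small or imbalanced datasets.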
Conclusions: The HMPI extracts promoter features successfully and greatly improves promoter identification on both eukaryotic and prokaryotic datasets, and the improved HMPI models perform well at identifying promoter subtypes on prokaryotic promoter datasets. The HMPI can also be adapted to other biological functional sequences and allows new features or models to be added.
Keywords: Convolutional neural networks (CNNs); Fully connected networks; Promoter identification; Structural profiles.
© 2022. The Author(s).
Conflict of interest statement
The authors declare that they have no competing interests.
Figures

a Features derived from the training sets of plants through the PSFNcce. b Features derived from the training sets of plants through the PSFN. c Features derived from the test sets of plants through the PSFNcce. d Features derived from the test sets of plants through the PSFN

a The training set features derived by the second layer of the PSFN within the HMPI. b The training set features derived by the second layer of the PSFN within the HMPIat

The framework of the HMPI (hybrid model for promoter identification)

The framework of the PSFN (promoter sequence features network)

The framework of the DSPN (deep structural profiles network)