A homology-guided, genome-based proteome for improved proteomics in the alloploid Nicotiana benthamiana - PubMed
- Tue Jan 01 2019
Jiorgos Kourelis et al. BMC Genomics. 2019.
Abstract
Background: Nicotiana benthamiana is an important model organism of the Solanaceae (Nightshade) family. Several draft assemblies of the N. benthamiana genome have been generated, but many of the gene-models in these draft assemblies appear incorrect.
Results: Here we present an improved proteome based on the Niben1.0.1 draft genome assembly, guided by gene models from other Nicotiana species. Because the Niben1.0.1 draft genome is fragmented, many protein-encoding genes are missing or partial. We complement these missing proteins by similarly annotating other draft genome assemblies. This approach overcomes problems caused by mis-annotated exon-intron boundaries and by short-read transcripts mis-assigned to homeologs in polyploid genomes. With an estimated completeness of 98.1%, only 53,411 protein-encoding genes, and improved protein lengths and functional annotations, this new predicted proteome is better at assigning spectra than the preceding proteome annotations. This dataset is more sensitive and accurate in proteomics applications, clarifying the detection by activity-based proteomics of proteins that were previously predicted to be inactive. Phylogenetic analysis of the subtilase family of hydrolases reveals inactivation of likely homeologs, associated with a contraction of the functional genome in this alloploid plant species. Finally, we use this new proteome annotation to characterize the extracellular proteome as compared to a total leaf proteome, which highlights the enrichment of hydrolases in the apoplast.
Conclusions: This proteome annotation provides the community working with Nicotiana benthamiana with an important new resource for functional proteomics.
Keywords: Genome annotation; Nicotiana benthamiana; Proteomics; Solanaceae; Subtilases.
Conflict of interest statement
The authors declare that they have no competing interests.
Figures

Bioinformatics pipeline for improved Nicotiana benthamiana proteome annotation. The predicted proteins of Nicotiana species generated by the NCBI Eukaryotic Genome Annotation Pipeline were retrieved from Genbank, clustered at a 95% identity threshold to reduce redundancy (Step 1), and used to annotate the Niben1.0.1 genome assembly (Step 2). Only those proteins with an alignment coverage ≥60% to the Nicotiana predicted proteins as determined by BLASTP were retained (Step 3) to produce the NbD core dataset. Similarly, the other draft genome assemblies were annotated (Step 4), and only those proteins with an alignment coverage ≥90% to the Nicotiana predicted proteins as determined by BLASTP were retained (Step 5). CD-HIT-2D was used at 100% sequence identity to retain proteins missing in the NbD dataset (Step 6), resulting in the supplemental dataset NbE. NbD and NbE can be combined (NbDE) to maximise the spectra annotation for proteomics experiments.
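The coverage filters in Steps 3 and 5 can be illustrated with a minimal sketch. The caption does not say how alignment coverage was calculated, so the definition used here (aligned span divided by the length of the reference Nicotiana protein, taking the best hit per candidate) is an assumption, as are the `Hit` record, the function names, and the example identifiers.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One BLASTP alignment of a candidate protein against a reference protein.

    Hypothetical record: the field names are illustrative, not a BLAST output format.
    """
    query_id: str        # candidate N. benthamiana protein
    subject_start: int   # 1-based, inclusive position on the reference protein
    subject_end: int
    subject_length: int  # full length of the reference protein


def alignment_coverage(hit):
    """Fraction of the reference protein spanned by the alignment (assumed definition)."""
    aligned = abs(hit.subject_end - hit.subject_start) + 1
    return aligned / hit.subject_length


def retain_by_coverage(hits, min_coverage):
    """Return the candidate IDs whose best hit reaches the coverage threshold."""
    best = {}
    for hit in hits:
        cov = alignment_coverage(hit)
        if cov > best.get(hit.query_id, 0.0):
            best[hit.query_id] = cov
    return {qid for qid, cov in best.items() if cov >= min_coverage}


# Made-up example hits, for illustration only.
example_hits = [
    Hit("candidate_A", 1, 95, 100),
    Hit("candidate_B", 1, 50, 100),
    Hit("candidate_B", 10, 75, 100),
]
core = retain_by_coverage(example_hits, 0.60)          # threshold used for NbD (Step 3)
supplemental = retain_by_coverage(example_hits, 0.90)  # threshold used in Step 5
```

Under this definition, the stricter 90% threshold applied to the other assemblies keeps only near-full-length models, which matches the role of NbE as a supplement to the core set.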

Increased lengths, coverage and annotation of N. benthamiana proteins. a NbD/NbDE datasets have relatively few entries when compared to preceding datasets. b NbD/NbDE datasets contain nearly all benchmark genes as full-length genes, according to Benchmarking Universal Single-Copy Orthologs (BUSCO) of embryophyta. c The NbD/NbDE datasets have a higher number of annotated PFAM domains. d NbD/NbDE datasets have relatively longer protein lengths. Violin and boxplot graph of log10 protein length distribution of each dataset. Jittered dots show the raw underlying data. e NbD/NbDE annotated proteins have a higher percentage coverage to the tomato proteins as determined by BLASTP.

NbD/NbDE datasets outperform preceding datasets in the annotation of spectra in proteomics. a Percentage of annotated MS/MS spectra in total leaf extract samples. b Average number of unique peptides assigned per protein in the different datasets. a and b Means and standard error of the mean are shown for four biological replicates of total leaf extracts. c Mis-annotations of papain-like Cys proteases (PLCPs) detected by activity-based protein profiling [54]. Leaf extracts were labelled with activity-based probes for PLCPs and labelled proteins were purified and analysed by MS. Shown are the protein annotations found in the NbDE (top), Niben1.0.1 (middle) and curated datasets (bottom), highlighting mis-annotations (red) caused by partial transcripts, mis-annotation of exon-intron boundaries, and mis-assemblies.
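As a small worked illustration of the summary statistics in panels a and b, the sketch below computes the percentage of annotated spectra and the mean with standard error of the mean (SEM) over replicates. The function names and the replicate values are hypothetical and are not taken from the paper.

```python
import math
import statistics


def percent_annotated(n_annotated, n_total):
    """Percentage of MS/MS spectra that received a peptide assignment."""
    return 100.0 * n_annotated / n_total


def mean_and_sem(values):
    """Mean and standard error of the mean (sample standard deviation / sqrt(n))."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Made-up percentages for four biological replicates, for illustration only.
replicate_percentages = [10.0, 12.0, 14.0, 16.0]
mean, sem = mean_and_sem(replicate_percentages)
```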

Examples of subtilase mis-annotations in the different genome assemblies. a Gene-models corresponding to subtilase NbE05066806 and the corresponding annotations in the various datasets. This subtilase gene is fragmented in Niben1.0.1; truncated in Nbv0.5; and carries two SNPs and an extra sequence in Niben0.4.4. b Gene-models corresponding to subtilase NbE03059263 and the corresponding annotations in the various datasets. This subtilase has an inactivated homeolog (dark grey) that was not retained in the NbDE dataset as it encodes a protein with < 60% coverage because it contains premature stop codons. The truncated proteins caused mis-assembly in the Niben1.0.1 dataset, resulting in a hybrid sequence. Mis-annotated exon-intron boundaries also affected gene models in Niben1.0.1, Niben0.4.4 and Nbv5.1. Peptides matched to the different gene models are indicated below the gene models.

Birth and death of subtilase paralogs in N. benthamiana. The evolutionary history of the subtilase gene family was inferred by using the Maximum Likelihood method based on the Whelan and Goldman model. The bootstrap consensus tree inferred from 500 replicates is taken to represent the evolutionary history of the taxa analysed. Non-functional subtilases are indicated in grey. Subtilases identified in apoplastic fluid (AF) and/or total extract (TE) are indicated with yellow and green dots, respectively. Naming of subtilase clades is according to [51]. Additional file 1: Figure S2 includes the individual names

Annotation of the N. benthamiana apoplastic proteome. a Correlation matrix heat map of the log2 transformed LFQ intensity of protein groups in the four biological replicates of apoplastic fluid (AF) and total extract (TE) samples. Biological replicates are clustered on similarity. b A volcano plot is shown plotting log2 fold difference (log2FC) of AF/TE over –log10 BH-adjusted moderated p-values. Proteins with log2FC ≥ 1.5 and p ≤ 0.01 were considered apoplastic, as well as those only found in AF. Conversely, proteins with a log2FC ≤ −1.5 and p ≤ 0.01 were considered intracellular, as well as those found only in TE. c Percentage of proteins in each fraction annotated with biological process-associated GO-SLIM terms. d Percentage of proteins in each fraction annotated with molecular function-associated GO-SLIM terms. c and d GO-SLIM annotations are shown when significantly enriched or depleted (BH-adjusted hypergeometric test, p < 0.05) in at least one of the fractions (AF, TE, or both). Each bubble indicates the percentage of genes containing that specific GO-SLIM annotation in that compartment. Colours indicate whether the GO-SLIM annotations are enriched or depleted in that fraction (p < 0.05, n.s., non-significant).
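The classification rule in panel b can be sketched as follows. This is an illustration only: it takes p-values as given rather than computing moderated statistics, it assumes the intracellular cutoff mirrors the apoplastic one (log2FC ≤ −1.5), and the function names and the "both" label for unclassified proteins are assumptions rather than the authors' code.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the adjusted values.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def classify(log2fc, adj_p, in_af=True, in_te=True, fc_cutoff=1.5, p_cutoff=0.01):
    """Assign a protein to a fraction from its AF/TE log2 fold change and adjusted p-value."""
    if in_af and not in_te:
        return "apoplastic"      # detected only in apoplastic fluid
    if in_te and not in_af:
        return "intracellular"   # detected only in total extract
    if adj_p <= p_cutoff and log2fc >= fc_cutoff:
        return "apoplastic"
    if adj_p <= p_cutoff and log2fc <= -fc_cutoff:
        return "intracellular"
    return "both"                # hypothetical label for proteins meeting neither rule


# Made-up p-values, for illustration only.
adjusted = bh_adjust([0.01, 0.04, 0.03, 0.005])
```

Proteins detected in only one fraction have no fold change to test, which is why the presence flags are checked before the thresholds.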
References
- Bombarely A, Moser M, Amrad A, Bao M, Bapaume L, Barry CS, Bliek M, Boersma MR, Borghi L, Bruggmann R, Bucher M, D’Agostino N, Davies K, Druege U, Dudareva N, Egea-Cortines M, Delledonne M, Fernandez-Pozo N, Franken P, Grandont L, Heslop-Harrison JS, Hintzsche J, Johns M, Koes R, Lv X, Lyons E, Malla D, Martinoia E, Mattson NS, Morel P, Mueller LA, Muhlemann J, Nouri E, Passeri V, Pezzotti M, Qi Q, Reinhardt D, Rich M, Richert-Pöggeler KR, Robbins TP, Schatz MC, Schranz ME, Schuurink RC, Schwarzacher T, Spelt K, Tang H, Urbanus SL, Vandenbussche M, Vijverberg K, Villarino GH, Warner RM, Weiss J, Yue Z, Zethof J, Quattrocchio F, Sims TL, Kuhlemeier C. Insight into the evolution of the Solanaceae from the parental genomes of Petunia hybrida. Nat Plants. 2016;2:16074. doi: 10.1038/nplants.2016.74.
- Casimiro-Soriguer CS, Muñoz-Mérida A, Pérez-Pulido AJ. Sma3s: a universal tool for easy functional annotation of proteomes and transcriptomes. Proteomics. 2017;17. doi: 10.1002/pmic.201700071.
- Clarkson JJ, Dodsworth S, Chase MW. Time-calibrated phylogenetic trees establish a lag between polyploidisation and diversification in Nicotiana (Solanaceae). Plant Syst Evol. 2017. doi: 10.1007/s00606-017-1416-9.