VAN Quoc Giang, HUYNH Ky, NGUYEN Chau Thanh Tung, NGUYEN Loc Hien, NGUYEN van Manh, NGUYEN Nhut Thanh, VO Cong Thanh, SWEE Keong Yeap
(1 College of Agriculture, Can Tho University, Can Tho 92000, Vietnam; 2 Department of Sciences and Technology of Long An Province, Long An 850000, Vietnam; 3 China-Association of Southeast Asian Nations College of Marine Sciences, Xiamen University Malaysia, Sepang 49000, Selangor, Malaysia)
Abstract: The fragrance of rice is one of the premium characteristics that breeders want to include in rice varieties because of its higher market value. Nucleotide deletions in exons 2 (7 bp) and 7 (8 bp) of Betaine Aldehyde Dehydrogenase 2 (BADH2) are associated with fragrance in rice. In this study, a new 13 bp deletion in exon 7 of the BADH2 gene was discovered in the Nang Thom Cho Dao (NTCD) variety, and Bayesian phylogenetic and haplotype network analyses of sequences from the 3 000 Rice Genomes Project showed that the mutation is closely related to the genetic background of the indica subspecies. In addition, a set of functional markers (EX07-13F, EX07-13RN, and EX07-13RM) identified the 13 bp deletion only in NTCD (no amplified band), in contrast to both non-aromatic and other aromatic rice varieties (110 bp band). The deletion of 13 bases instead of 8 bases in exon 7 of BADH2 caused a premature stop codon, which was associated with down-regulation of the BADH2 transcript, up-regulation of OsP5CS and a high amount of 2-acetyl-1-pyrroline. The deletion in exon 7 of the BADH2 gene has potential as a novel marker for detecting adulteration and for breeding fragrant rice varieties, particularly NTCD.
Key words: novel deletion; BADH2; fragrant rice; functional marker
Rice is a significant staple food that provides nourishment to millions of Asians (Muthayya et al, 2014). The distinctive fragrance of some rice varieties is recognized as a premium quality because these varieties are commonly prepared as high-priced rice dishes in traditional rice-eating countries (Prodhan and Shu, 2020). Fragrant rice can even attract consumers who do not eat rice as a staple food (Mahajan et al, 2018). The scent of fragrant rice varieties is therefore an economically significant quality that commands premium prices in both domestic and export markets in many rice-growing countries (Giraud, 2013).
As rice is an excellent model for monocot crop genomics (Rensink and Buell, 2004; Tyagi et al, 2004), variations in the genome associated with important agronomic premium traits, including the aroma of rice, have been intensively studied (Huang et al, 2010, 2012; Begum et al, 2015). With the advance of rice-breeding technology, there are many aromatic rice varieties with different grain sizes and fragrances (Kraithong et al, 2018; Hoffmann et al, 2019). Some aromatic rice varieties are cultivated by traditional methods in certain small and unique areas and are recognized as 'premium rice' owing to their distinctive fragrance and limited availability (Roy et al, 2015; Bindusree et al, 2017; Maleki et al, 2020).
The fgr gene, a recessive gene encoding betaine aldehyde dehydrogenase 2 (BADH2), is reported as the candidate gene responsible for rice fragrance (Ahn et al, 1992; Bradbury et al, 2005). The BADH2 enzyme converts γ-aminobutyraldehyde (AB-ald) to γ-aminobutyric acid (GABA) in non-fragrant rice. Loss of function of BADH2 due to an insertion or deletion contributes to the production of 2-acetyl-1-pyrroline (2AP), which is the principal aroma compound of fragrant rice (Chen et al, 2008; Shi et al, 2014; Luo et al, 2022). To date, at least 15 BADH2 alleles showing divergent fragrances in the rice germplasm have been reported (Chen et al, 2008; Kovach et al, 2009; Ganopoulos et al, 2011; Shi et al, 2014). The loss of function of BADH2 can be associated with the loss of 8 bp in exon 7, the insertion of 7 bp in exon 8 (Amarawathi et al, 2008), the deletion of 7 bp in exon 2 (Shi et al, 2008), and many other deletions in non-coding regions such as introns (Chen et al, 2008; Sun et al, 2008) and the promoter (Bourgis et al, 2008).
Based on the BADH2 sequence in the GenBank database, BADH2 functional markers have been widely developed for the detection of fragrant rice varieties (Bligh, 2000; Bradbury et al, 2005; Chen et al, 2008; Srivong et al, 2008; Shao et al, 2011; Okpala et al, 2019). The Kompetitive Allele Specific PCR (KASP) assay based on nine informative single nucleotide polymorphisms (SNPs) across the BADH2 gene has been successfully used to identify fragrant rice varieties from Thailand, China and other countries (Addison et al, 2020). Based on this information, SNP molecular markers have been developed to support marker-assisted selection in the breeding of the major types of fragrant rice (Li et al, 2020). However, to date, the fragrant varieties of Vietnamese rice remain under-reported.
Fig. 1. Grains of aromatic rice (NTCD) and normal rice (TN).
Nang Thom Cho Dao (NTCD) (Fig. 1) is one of the famous traditional fragrant rice varieties in Vietnam, and is grown specifically in Can Duoc district, Long An Province, Vietnam. NTCD has been listed as an important genetic resource that is prohibited from export in Vietnam. However, the Institute of Agricultural Science for Southern Vietnam has reported that NTCD faces several problems, such as poor genetic purity (12.1%-14.3%) after continuous cultivation for several years (Khanh et al, 2021). In this study, using NTCD conserved at the Mekong Delta Development Research Institute (MDI) since 1986, the genetic sequence of fragrant NTCD was compared with that of a non-fragrant rice, Tai Nguyen (TN) (Fig. 1), which is commonly used to imitate NTCD, in order to identify potential markers for the authentication of NTCD.
Fig. 2. Validation of sequencing data for exon 7 of the BADH2 gene.
Fig. 3. Haplotype networks inferred from 30 sequences from 12 countries, representing the relationships among nucleotide deletions in exon 7 of the BADH2 gene.
Aroma analysis of the two main materials yielded a 580-bp band for the positive control, amplified by the External Antisense Primer (EAP) and the External Sense Primer (ESP) (Fig. S1). Sanger sequencing was then used to validate a 13-bp deletion in exon 7 of NTCD (Data S1 and S2), and the matching results are presented in Fig. 2-A. Furthermore, a set of primers, named EX07-13F (forward), EX07-13RN (normal) and EX07-13RM (mutant) (Fig. 2-B), was designed to cross-check this result. Both the positive control and TN produced a 110-bp band, while no band was recorded for NTCD (Fig. 2-C). In addition, aromatic and non-aromatic rice genotypes were tested, and a 580-bp band was amplified from all individuals (positive control from the ESP and EAP external primers). As shown in Fig. 2-D, NTCD and the positive control were aromatic genotypes, producing a band of 257 bp amplified by the ESP and IFAP primers, while TN produced a band of 355 bp, indicating that TN carries a non-fragrant allele of the fgr gene.
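The band-size logic described in this paragraph can be summarized in a short sketch. The band sizes (580, 257, 355 and 110 bp) come from the text; the function names, the 10-bp size tolerance and the 'heterozygous' call are illustrative assumptions and are not part of the published protocol.

```python
# Illustrative sketch only: band sizes are taken from the text, while the
# function names and the size tolerance (tol) are hypothetical choices.

def has_band(bands, expected, tol=10):
    """Return True if any observed band (bp) lies within tol of expected."""
    return any(abs(b - expected) <= tol for b in bands)


def call_fragrance_allele(bands, tol=10):
    """Interpret the ESP/EAP/IFAP assay from a list of observed band sizes."""
    if not has_band(bands, 580, tol):
        return "PCR failed (no 580-bp control band)"
    fragrant = has_band(bands, 257, tol)
    non_fragrant = has_band(bands, 355, tol)
    if fragrant and non_fragrant:
        return "heterozygous"  # assumption: both allele bands present
    if fragrant:
        return "fragrant"
    if non_fragrant:
        return "non-fragrant"
    return "undetermined"


def lacks_110bp_band(bands, tol=10):
    """EX07-13 marker: True when the 110-bp band is absent, as reported for NTCD."""
    return not has_band(bands, 110, tol)


print(call_fragrance_allele([580, 257]))  # NTCD-like pattern
print(call_fragrance_allele([580, 355]))  # TN-like pattern
print(lacks_110bp_band([]))               # no band, as reported for NTCD
```

Because the EX07-13 result for NTCD is the absence of a band, a failed reaction would look the same; in practice such a call would presumably be read alongside the 580-bp control band from the same DNA sample.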
The TCS (Templeton, Crandall, and Sing) haplotype network showed 4 haplotypes among 30 individuals from 12 countries (Fig. 3-A). Haplotypes are connected with a 95% confidence limit. Hap 1 (Japan), Hap 2 (Vietnam), and Hap 3 (Vietnam) contained one sample each (Nipponbare, TN and NTCD, respectively). Hap 4 contained 27 samples from 10 countries, including India (7 accessions), Bangladesh (4 accessions), Nepal (4 accessions), Pakistan (4 accessions), and Iran (3 accessions), and one accession each from the Philippines, Liberia, Thailand, Namibia, and Madagascar. The phylogenetic tree reconstructed by Bayesian analysis also suggested four haplotypes (Fig. 3-B).
Based on the BEAST run examined in Tracer, the effective sample size (ESS) was 5 286.1, indicating that the Markov chain was equivalent to a sufficient number of independent samples from the posterior distribution (Bouckaert et al, 2014; Rambaut et al, 2018). Furthermore, the current Bayesian phylogenetic analysis was statistically significant. The phylogenetic tree showed that three major clades, including one group originating from Vietnam, one from Japan, and the others from the 3 000 Rice Genomes Project, were clearly separated by Bayesian analysis (Fig. 4). Within the largest group, which comprised 27 samples from the 3 000 Rice Genomes Project, each clade consisted of several small sub-groups. The two rice varieties from Vietnam, TN and NTCD, were recognized as significantly different from the other groups.
Based on the preliminary analysis of RNA-Seq data, the expression level of BADH2 (Os08g0424500) in NTCD was significantly lower than that in TN (P = 0.0003), indicating that BADH2 transcription was affected, probably by the 13 bp deletion in exon 7 (Fig. 5-A). The usage of exons 7, 8 and 9 was significantly affected by alternative splicing (Fig. 5-B). Furthermore, qRT-PCR analysis showed that BADH2 expression levels were lower in NTCD and Jasmine than in TN. Both Jasmine and NTCD had higher OsP5CS transcript levels than TN (Fig. 6).
GC/MS analysis of NTCD, TN and Jasmine showed that the concentration of 2AP in NTCD (16.263 μg/kg) was higher than that in the positive control, Jasmine (13.250 μg/kg). In contrast, no 2AP was detected in TN (Table 1).
Table 1. Quantification of 2-acetyl-1-pyrroline (2AP) using gas chromatography-mass spectrometry.
High-performance liquid chromatography, hot water extraction, KOH extraction, and chewing are traditional methods for the qualitative or quantitative detection of 2AP, which contributes to the fragrance of rice. However, these methods are expensive, imprecise and inefficient, and thus are not suitable for use in breeding programs. To date, a number of molecular markers have been developed for marker-assisted selection of the fragrance gene, BADH2, to replace the conventional methods (Cordeiro et al, 2002; Jin et al, 2003). Nevertheless, loose linkage between the markers and the target gene reduces the accuracy of selection (Shao et al, 2011). Functional markers have therefore been demonstrated to be powerful tools for distinguishing fragrant from non-fragrant rice varieties (Bradbury et al, 2005; Shi et al, 2008). Polymorphisms including SNPs and nucleotide deletions in exons 2 and 7 are associated with rice fragrance (Bradbury et al, 2005; Chen et al, 2008; Kovach et al, 2009; Shao et al, 2013). Artificial selection is one of the causes of mutations in the BADH2 gene during the later stages of domestication (Shao et al, 2013). The 13 bp deletion in exon 7 of the BADH2 gene detected in NTCD is considered a novel variation compared with the mutations described in previous reports. The deletion can be captured by the external primers ESP and EAP, which yield a 580 bp amplicon (Fig. S1) that is sufficient for Sanger sequencing as well as for allele-specific amplification (Bradbury et al, 2005). Although the EX07 primers were designed inside the external primers, unlike Bradbury's primers they cannot be used to distinguish fragrant from non-fragrant alleles in general (Fig. 2-C and -D). Instead, the new primer set, EX07-13F, EX07-13RN and EX07-13RM, was designed to discriminate the 13 bp deletion found only in NTCD from other mutations in exon 7 of the BADH2 gene (Fig. 2-C).
Fig. 4. Bayesian phylogenetic tree based on 30 aligned sequences from the region surrounding exon 7 of BADH2.
Hap 4 contained the BADH2-E7 allele with an 8-bp deletion (Fig. 3) and was the most common haplotype, associated with 27 accessions from indica gene pools. This loss-of-function allele is thought to have originated in the genetic background of the indica varietal group and to have subsequently been transferred into japonica (Shao et al, 2013). Within Hap 3, the mutant BADH2-E7 allele with a 13-bp deletion from NTCD was well associated with Hap 4, indicating that the NTCD allele was closely related to the ancestral haplotype (Hap 1). Additionally, NTCD is well known as a local fragrant, long-grain rice variety that originated from indica gene pools (Khush, 1996). Therefore, this result was interpreted as a consequence of the strong artificial selection pressure during rice domestication (Fig. 3). The haplotype networks provided well-supported results for phylogenetic classification, because the information from both analyses agreed (Figs. 3 and 4). In the Bayesian phylogenetic tree (Fig. 4), compared with the indica types carrying an 8-bp deletion in exon 7, the local fragrant rice NTCD with a 13-bp deletion has its own genetic background within indica gene pools (Fig. 3). However, owing to the contribution of BADH2 to rice domestication (Wang et al, 2005; Lin et al, 2007; Shao et al, 2013), the BADH2-E7 allele with the 13 bp deletion from NTCD was classified into an entirely new group (Fig. 4).
Fig. 5. RNA-Seq analysis of TN (Tai Nguyen, non-aromatic rice) and NTCD (Nang Thom Cho Dao, aromatic rice) derived from Vietnam.
Down-regulation of BADH2 supports our initial assumption that differential gene expression was caused by alternative splicing in exon 7 (Figs. 5 and 6). Previous studies have shown a similar transcriptional down-regulation of BADH2 in aromatic rice compared with non-aromatic rice (Chen et al, 2008; Khandagale et al, 2020). Additionally, both Jasmine and NTCD had higher OsP5CS transcript levels than TN (Fig. 6). According to Huang et al (2008), 2AP accumulation is associated with a decrease in BADH2 transcription and an increase in OsP5CS gene expression. The negative correlation between OsP5CS and BADH2 expression levels was consistent with the finding that NTCD produced more 2AP than Jasmine or TN (Table 1), which served as the positive and negative controls, respectively. The non-functional alleles caused by mutations, including the 8 bp deletion in exon 7 and the 7 bp deletion in exon 2 of BADH2, encode a truncated BADH2 and therefore result in enhanced synthesis of 2AP and fragrance (Chen et al, 2008; Shi et al, 2008; Kovach et al, 2009). Hence, 2AP synthesis in NTCD was strongly enhanced by the loss-of-function allele caused by the 13 bp deletion (Fig. 2-B and Table S1).
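Why a 13 bp deletion can truncate the protein follows from codon arithmetic: a deletion whose length is not a multiple of three shifts the reading frame, and the shifted frame usually runs into a stop codon early. The sketch below illustrates this with a made-up toy sequence; it is not the BADH2 exon 7 sequence, and the function names are hypothetical.

```python
# Toy illustration only: the sequence is invented and is NOT BADH2 exon 7.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def shifts_reading_frame(deletion_length):
    """A deletion shifts the frame unless its length is a multiple of 3."""
    return deletion_length % 3 != 0


def first_stop_codon(seq):
    """Return the 0-based index of the first in-frame stop codon, or None."""
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


# Hypothetical open reading frame with no in-frame stop codon.
wild_type = "ATGGCTGCTGCTGCTGGCATGACCGGCA"
# Remove 13 nucleotides (positions 3..15) to mimic a 13 bp deletion.
mutant = wild_type[:3] + wild_type[16:]

for n in (7, 8, 13):
    print(n, "bp deletion shifts frame:", shifts_reading_frame(n))
print("wild type first stop:", first_stop_codon(wild_type))
print("mutant first stop:", first_stop_codon(mutant))
```

In this toy case the shifted frame exposes a TGA codon immediately after the deletion, whereas the intact frame reads through. The 7 bp and 8 bp deletions mentioned in the text also shift the frame by the same arithmetic.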
Fig. 6. Transcription levels of BADH2 and OsP5CS in leaves of Nang Thom Cho Dao (NTCD), Tai Nguyen (TN) and Jasmine.
The 13 bp deletion in exon 7 from NTCD is a novel deletion. This new BADH2-E7 allele is considered to be associated with indica gene pools, where the haplotype first arose. The new functional markers were effective and accurate in detecting the novel deletion in NTCD. In addition, the expression of BADH2 in NTCD was affected by alternative splicing, suggesting that the truncated BADH2 in NTCD may contribute to aroma production. The novel 13 bp deletion in exon 7 of BADH2 is therefore likely to underlie the development of the 2AP aroma in NTCD.
NTCD is an aromatic rice that is geographically restricted to the Can Duoc district, Long An Province, Vietnam (10°33′29″ N, 106°36′23″ E), while TN is a common non-aromatic rice in Vietnam (Fig. 1). These local rice germplasms have been collected and conserved by the Mekong Delta Development Research Institute (MDI) since 1986, and were stored in Can Tho University GenBank.
DNA was extracted from the leaves of two-week-old seedlings using the NEXprep™ Plant DNA Mini Kit (Genes Laboratories, Gyeonggi-do, Korea). DNA was purified according to the manufacturer's protocol. DNA purity was checked on 1% agarose gel, and concentration and quality were assessed using the Nanodrop (Thermofisher, Waltham, USA) and Bioanalyzer (Agilent, Santa Clara, USA). Primer sets for BADH2 InDel markers were designed and synthesized based on sequence data obtained from NTCD whole genome sequencing deposited in NCBI (SRX7885126), and were used for PCR amplification (Table 2). The PCR was performed using 50 ng of DNA extracted from the leaves in a final volume of 25 μL containing 0.2 μmol/L each primer, 0.2 mmol/L dNTPs, 1.5 mmol/L MgCl2 and 1.5 U Taq polymerase. The PCR cycling conditions for both markers were: initial denaturation at 94 °C for 2 min, followed by 35 cycles of 30 s denaturation at 94 °C, 30 s annealing at 55 °C and 1 min extension at 72 °C, with a final extension at 72 °C for 5 min. The amplification products were detected on 1.5% agarose gel or 8% polyacrylamide gel. The size of amplified fragments was calculated by GelAnalyzer V19.1.
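As a bench-side aid, the final concentrations above can be converted into pipetting volumes with the dilution relation C1·V1 = C2·V2, and the nominal cycling time can be summed from the stated steps. The stock concentrations in this sketch (10 μmol/L primers, 10 mmol/L dNTPs, 25 mmol/L MgCl2) are assumptions for illustration; the text does not state them.

```python
# Sketch under stated assumptions: the stock concentrations are hypothetical.

def stock_volume_ul(final_conc, stock_conc, final_volume_ul=25.0):
    """Volume of stock (uL) needed so that C_stock * V_stock = C_final * V_final.

    Both concentrations must be given in the same unit.
    """
    if stock_conc <= 0:
        raise ValueError("stock concentration must be positive")
    return final_conc * final_volume_ul / stock_conc


primer_ul = stock_volume_ul(0.2, 10.0)  # umol/L final, assumed 10 umol/L stock
dntp_ul = stock_volume_ul(0.2, 10.0)    # mmol/L final, assumed 10 mmol/L stock
mgcl2_ul = stock_volume_ul(1.5, 25.0)   # mmol/L final, assumed 25 mmol/L stock

# Nominal hold time of the cycling program in minutes (ramping not included):
# 2 min initial denaturation, 35 cycles of 30 s + 30 s + 1 min, 5 min final extension.
cycling_minutes = 2 + 35 * (0.5 + 0.5 + 1.0) + 5

print(f"each primer: {primer_ul:.2f} uL, dNTPs: {dntp_ul:.2f} uL, MgCl2: {mgcl2_ul:.2f} uL")
print(f"nominal cycling time: {cycling_minutes:.0f} min")
```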
Aromatic rice sequences (n = 27) containing the region from 20 379 794 to 20 386 061 bp of BADH2 were obtained in FASTA format, mapped against the Rice SNP-Seek Database (https://snp-seek.irri.org/) of the 3 000 Rice Genomes Project (Mansueto et al, 2017), and aligned using MEGA6 (Tamura et al, 2013). The initial best-fit model among 56 Bayesian phylogenetic models was selected using jModelTest 2 (Darriba et al, 2012). Based on the corrected Akaike Information Criterion (AICc), the best-fitting substitution model for Bayesian phylogenetic tree analysis was HKY+I, where HKY is the substitution model and I is the proportion of invariant sites (0.9580). The phylogenetic tree was then constructed using the BEAST2 program (Drummond and Rambaut, 2007; Drummond et al, 2012; Bouckaert et al, 2014, 2019). Subsequently, a revised tree (Table S1) was generated by TreeAnnotator V2.6.3 of BEAST2 to summarize and select the best-fit tree. Statistical analysis of the output files from the BEAST2 analysis was performed using Tracer V1.7.1 (Rambaut et al, 2018). Phylogenetic data inferred from FigTree v1.4.4 were also used to construct the haplotype network. In addition, the TCS program (Clement et al, 2000) was used to estimate cladograms for the 30 DNA sequences by maximum parsimony (Templeton et al, 1992). The network was finally redrawn with the tcsBU tool after various layout adjustments (Múrias et al, 2016).
Table 2. New functional markers for BADH2 developed to discriminate Nang Thom Cho Dao (NTCD) from other fragrant rice varieties.
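For readers unfamiliar with AICc-based model selection, the criterion penalizes the log-likelihood by the number of free parameters k with a small-sample correction: AICc = 2k - 2 ln L + 2k(k + 1)/(n - k - 1). The following minimal sketch ranks candidate models by AICc; the log-likelihoods and parameter counts are invented for illustration and are not jModelTest output, and using the alignment length as the sample size n is an assumption.

```python
# Minimal sketch: all log-likelihoods and parameter counts are hypothetical.

def aicc(log_likelihood, k, n):
    """Corrected Akaike Information Criterion for k parameters and sample size n."""
    if n - k - 1 <= 0:
        raise ValueError("sample size too small for the AICc correction")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def best_model(candidates, n):
    """Return the name of the candidate with the lowest AICc.

    candidates maps a model name to a (log_likelihood, k) tuple.
    """
    return min(candidates, key=lambda name: aicc(*candidates[name], n))


# Assumed sample size: length of the aligned region (20 379 794 to 20 386 061 bp).
n_sites = 20386061 - 20379794 + 1

candidates = {
    "JC": (-1012.0, 57),     # hypothetical values
    "HKY": (-1000.0, 61),    # hypothetical values
    "HKY+I": (-995.0, 62),   # hypothetical values
}

for name, (lnl, k) in candidates.items():
    print(name, round(aicc(lnl, k, n_sites), 2))
print("lowest AICc:", best_model(candidates, n_sites))
```

A lower AICc indicates a better trade-off between fit and complexity, which is the sense in which one model is preferred over the others.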
After panicle primordium initiation at the reproductive stage, the young panicles of NTCD and TN were collected for RNA isolation. All samples were immediately frozen in liquid nitrogen and stored at -80 °C. RNase Mini Kit (Invitrogen, MA, USA) was used to extract total RNA, which was quantified by a Qubit RNA Assay Kit (Applied Biosystems, CA, USA). An Agilent 2100 Bioanalyzer (Agilent Technologies, CA, USA) was used to check the RNA integrity. RNA of each sample (5 μg) was sent to Theragen (Theragen, Gyeonggi-do, Korea) for RNA-Seq sequencing. The procedure for RNA-Seq followed the Illumina HiSeq 2000 protocol.
Quality control and pre-processing of raw paired-end reads for RNA-Seq data were performed using the fastp tool V0.20.0, an ultra-fast FASTQ preprocessor (Chen et al, 2018). The number of reads mapped to each gene was counted by featureCounts using a GTF file (Liao et al, 2014), which provided the raw gene expression levels. DESeq2 was used to identify differences in gene expression (Love et al, 2014). Genes were subsequently filtered by significance (false discovery rate < 0.05) to screen for differentially expressed genes (DEGs) between normal and aromatic samples. To visualize the expression of the BADH2 gene more thoroughly, DEXSeq was used to test for differences in exon usage between normal and aromatic samples (Anders et al, 2012).
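The false discovery rate threshold can be illustrated with the Benjamini-Hochberg adjustment, which is the default multiple-testing correction reported by DESeq2. The sketch below applies it to a handful of made-up p-values; it is a plain-Python illustration of the adjustment, not the DESeq2 implementation, and the p-values are not from this study.

```python
# Illustration of the Benjamini-Hochberg adjustment on hypothetical p-values.

def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


pvalues = [0.0003, 0.04, 0.01, 0.2]  # hypothetical per-gene p-values
padj = benjamini_hochberg(pvalues)
significant = [i for i, q in enumerate(padj) if q < 0.05]

print([round(q, 4) for q in padj])
print("indices passing FDR < 0.05:", significant)
```

Note that the raw p-value of 0.04 no longer passes after adjustment, which is why filtering on the adjusted value is stricter than filtering on the raw p-value.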
BADH2 expression was further validated using qRT-PCR. Briefly, RNA was extracted from the leaves of NTCD, TN and Jasmine using TRIzol (Invitrogen, Carlsbad, CA, USA) following the manufacturer's protocol and treated with DNase I (Fermentas, Waltham, Germany). RNA concentration in samples was determined using a NanoDrop ND-1000 (NanoDrop Technologies, Wilmington, USA). Each sample (1 μg) was used to synthesize first-strand cDNA using the RevertAid First Strand cDNA Synthesis Kit (Thermo Scientific, Massachusetts, USA). The following primers were used (Hinge et al, 2016): BADH2 primers (forward: TGTGCTAAACATAGTGACTGGA; reverse: CTTAACCATAGGAGCAGCT) targeting exons 6 and 7, OsP5CS gene primers (forward: GAAGTGGTAATGGTCTTCTC; reverse: AGCAAATCTGCGATCTCATC), and EF1 gene (internal control) primers (forward: TTTCACTCTTGGTGTGAAGCAGAT; reverse: GACTTCCTTCACGATTTCATCGTAA). Each qRT-PCR was prepared in a total volume of 10 μL containing 5 μL of 2× Fast SYBR® Green qPCR Master Mix with 0.2 μL ROX dye II, 0.1 μL of each primer (20 μmol/L), 0.8 μL of cDNA, and 3.8 μL ddH2O. Thermal cycling consisted of a hold at 94 °C for 2 min, followed by 35 cycles of 94 °C for 30 s, 55 °C for 30 s, and 72 °C for 30 s. The PCR was performed in triplicate on the CFX96 Touch Real-Time PCR Detection System (Bio-Rad, Hercules, USA). Differential gene expression was normalized and calculated using the 2^(-ΔΔCt) method.
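The 2^(-ΔΔCt) calculation normalizes the target gene to the internal control within each sample (ΔCt) and then compares the test sample with the reference sample (ΔΔCt). A minimal sketch follows; the Ct values are invented for illustration, and treating TN as the reference sample is an assumption based on how the results are reported relative to TN.

```python
# Sketch of the 2^(-ddCt) calculation; all Ct values below are hypothetical.
from statistics import mean


def relative_expression(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl):
    """Fold change of the target gene in the test sample relative to the control.

    Each argument is a list of technical-replicate Ct values.
    """
    delta_test = mean(ct_target_test) - mean(ct_ref_test)  # dCt, test sample
    delta_ctrl = mean(ct_target_ctrl) - mean(ct_ref_ctrl)  # dCt, control sample
    delta_delta = delta_test - delta_ctrl                  # ddCt
    return 2 ** (-delta_delta)


# Hypothetical triplicates: BADH2 and EF1 in an aromatic sample versus TN.
fold = relative_expression(
    ct_target_test=[26.0, 26.5, 25.5],   # BADH2, test sample
    ct_ref_test=[18.0, 18.25, 17.75],    # EF1, test sample
    ct_target_ctrl=[24.0, 24.5, 23.5],   # BADH2, TN (assumed reference)
    ct_ref_ctrl=[18.0, 18.0, 18.0],      # EF1, TN
)
print(f"fold change relative to control: {fold:.2f}")
```

With these made-up numbers ΔΔCt is 2, so the target transcript is at one quarter of the reference level; values below 1 indicate down-regulation and values above 1 indicate up-regulation.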
NTCD, TN and Jasmine rice seeds were extracted with dichloromethane (Bergman et al, 2000). The extracts were then filtered through 0.45 μm membrane filters, diluted and subjected to 2AP analysis with a GCMS-QP2020 NX (Shimadzu, Kyoto, Japan) using an SH-Rxi-5 Sil MS capillary column (30 m × 0.25 mm × 0.25 μm). The GC oven was initially held at 40 °C for 2 min, then heated at a rate of 9 °C/min to 120 °C and kept at 120 °C for 2 min, then heated at 25 °C/min to 250 °C and kept at 250 °C for 5 min. Helium was used as the carrier gas at a constant flow of 1 mL/min. Samples were analyzed in triplicate, and the concentration of 2AP was calculated from the peak area relative to that of the standard peak. The internal standard was 2-acetyl-1-pyrroline (Toronto Research Chemicals Inc., North York, Canada).
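The oven program described above implies a fixed run time per injection, which can be checked with simple arithmetic: each hold contributes its duration, and each ramp contributes the temperature difference divided by the heating rate. The sketch below encodes the program from the text; the function name and step representation are illustrative choices.

```python
# Run-time arithmetic for the GC oven program described in the text.

def oven_program_minutes(start_temp_c, steps):
    """Sum hold times and ramp times (temperature change / rate) in minutes.

    steps is a list of ("hold", minutes) or ("ramp", rate_c_per_min, target_c).
    """
    total = 0.0
    temp = start_temp_c
    for step in steps:
        if step[0] == "hold":
            total += step[1]
        elif step[0] == "ramp":
            _, rate, target = step
            total += abs(target - temp) / rate
            temp = target
        else:
            raise ValueError(f"unknown step type: {step[0]}")
    return total


program = [
    ("hold", 2),        # 40 C for 2 min
    ("ramp", 9, 120),   # 9 C/min to 120 C
    ("hold", 2),        # 120 C for 2 min
    ("ramp", 25, 250),  # 25 C/min to 250 C
    ("hold", 5),        # 250 C for 5 min
]

total_minutes = oven_program_minutes(40, program)
print(f"oven program length: {total_minutes:.1f} min")
```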
ACKNOWLEDGEMENTS
This study was funded in part by the Can Tho University Improvement Project VN14-P6 supported by a Japanese Official Development Assistance loan. The authors thank Navsari Agricultural University for their whole genome sequencing data published on the National Center for Biotechnology Information (NCBI), and give special thanks to the Long An Science and Technology Department, Ministry of Science and Technology, Vietnam for providing NTCD seeds.
SUPPLEMENTAL DATA
The following materials are available in the online version of this article at http://www.sciencedirect.com/journal/rice-science; http://www.ricescience.org.
Fig. S1. PCR products amplified by EAP-ESP primers on 2% agarose gel.
Table S1. Differential exon usage of the Os08g0424500 gene.
Data S1. FASTA file of the 30 sequences.
Data S2.FASTA sequences of NTCD-S1 and TN-S2 from Sanger sequencing.