Genetic diversity of local rice varieties (Oryza sativa L.) in Vietnam’s Mekong Delta based on SSR markers and morphological characteristics

Exploiting the genetic diversity of rice for target traits is beneficial for crop improvement. In this study, 41 rice varieties local to Vietnam's Mekong Delta were evaluated on the basis of 11 quantitative morphological traits, and their genetic diversity was assessed with 50 SSR markers. Actual yield varied significantly among varieties at the 0.05 level, while plant height and panicles per square meter varied significantly at the 0.001 level. Cluster analysis based on the 11 quantitative traits revealed that two was the optimal number of clusters. The polymorphic information content (PIC) ranged from 0.00 to 0.49, with an average of 0.14; the highest value was obtained for RM286 (0.49). Both the population structure analysis and the phylogenetic tree inferred from the 50 SSR markers by the unweighted pair-group method with arithmetic mean (UPGMA) indicated that the 41 local rice varieties could be divided into two major groups. This study provides useful information on the varieties Mot bui do cao CM and Mot bui 5 for improving yield and intermediate amylose content in future local rice-breeding programs, especially in the Mekong Delta region.


Introduction
Rice (Oryza sativa L.) is one of the most important crops worldwide and is the main food for more than half of the world's population (Islam et al. 2018). Vietnam has a longstanding rice civilization and a large diversity of landscapes, which contribute high genetic diversity to its rice seed bank and provide potential materials for further research (Phung et al. 2014). The Mekong River Delta, however, is affected by climate change scenarios that lead to sea level fluctuations at the mouth of the delta (Keskinen et al. 2010). Sea level rise is forecast to have a significant impact on rice paddy fields in particular areas of the Mekong River basin through saline water intrusion (Yu et al. 2010). To address this problem, the government and scientists in Vietnam have established several concerted efforts and breeding programs to improve the productivity and quality of rice varieties. As a priority for breeding programs, the diversity of genotypes in the population should be studied, because greater genetic diversity makes breeding strategies more likely to succeed. The diversity of gene pools includes agronomic traits such as quantity and quality of seed production, tolerance to abiotic stress, and resistance to biotic stress (Litrico and Violle 2015).
By evaluating the agro-morphological properties of rice genetic diversity, plant breeders have obtained initial information on cultivated rice varieties developed through mutation and hybridization (Roy and Sharma 2014). Compared with traditional approaches such as morpho-physiological trait evaluation, modern breeding techniques and molecular markers are powerful tools that help breeders assess the genetic variation among rice germplasms effectively (Thi et al. 2015). Recent studies have used SSR markers to assess the genetic diversity of rice varieties because of their stability, effectiveness, and high polymorphism (Hue et al. 2018; Nguyen et al. 2012). In addition, amylose content (AC) is one of the most important determinants of rice quality, which is influenced primarily by starch, composed of amylose and amylopectin (Khoomtong and Noomhorm 2015). The percentage of amylose in starch indicates the cooking properties of the rice, so determining amylose content tells plant breeders which class a variety belongs to: waxy (0%-5%), very low (5%-12%), low (12%-20%), intermediate (20%-25%), or high (25%-33%). As can be seen from previous reviews, modern breeding techniques have advantages that help plant breeders overcome these problems. Recent studies have investigated the genetic diversity of cultivated rice varieties in Vietnam, such as upland rice (Nguyen et al. 2012), which mostly comes from the Northern Delta of Vietnam, and local colored rice landraces, which were collected in central and northern Vietnam (Hue et al. 2018). However, there is a lack of studies on the genetic diversity of local rice germplasms in the Mekong River Delta in the south of Vietnam, especially in the coastal region. By combining morphological and molecular analyses with statistical approaches, this study provides a comprehensive analysis of the genetic variability of 41 local rice varieties in Vietnam's Mekong River Delta. The study will contribute useful information.
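The amylose classes listed above can be expressed as a simple lookup. The following is a minimal sketch rather than part of the study's workflow: the function name is hypothetical, and treating each upper class limit as exclusive is an assumption, because the quoted ranges share their end points.

```python
def classify_amylose(ac_percent: float) -> str:
    """Return the amylose class for an amylose content given in percent."""
    # Only the 0-33% span quoted in the text is covered.
    if not 0.0 <= ac_percent <= 33.0:
        raise ValueError("amylose content outside the 0-33% range quoted in the text")
    # Assumption: upper limits are exclusive, so 20.0 is "intermediate", not "low".
    upper_limits = [
        (5.0, "waxy"),
        (12.0, "very low"),
        (20.0, "low"),
        (25.0, "intermediate"),
    ]
    for limit, label in upper_limits:
        if ac_percent < limit:
            return label
    return "high"


if __name__ == "__main__":
    # Hypothetical value, not a measurement from the study.
    print(classify_amylose(22.5))
```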

Materials and Methods
For this study, we used 41 rice varieties collected from local areas in Vietnam's Mekong River Delta (Table 1). All experiments were conducted in provinces of the Mekong River Delta (Ca Mau, Ben Tre, Bac Lieu, and Kien Giang). The 41 rice varieties were direct-seeded under paddy field conditions with three replicates. Eleven morphological traits were evaluated following the standard system of IRRI (IRRI, 2013), and young leaves at the seedling stage were collected for DNA extraction. The amylose content of the rice varieties was estimated using the iodine colorimetric method (Khoomtong and Noomhorm 2015). DNA was extracted from the young leaves of the 41 rice varieties using CTAB buffer (Kawata et al. 2003). For genetic diversity assessment, a panel of 50 standard SSR markers (Suppl. Table 1) was taken from rice genome databases (Youens-Clark et al. 2011). The results were scored as binary data and analyzed with a distance matrix method to determine the relationships among individuals. The PIC (polymorphism information content) of each SSR marker was the mean of the PIC of each of its alleles (Kempf et al. 2016), and the formula for codominant markers (Zargar et al. 2016) was used to calculate the PIC value as $\mathrm{PIC}_i = 2 f_i (1 - f_i)$, where $\mathrm{PIC}_i$ is the polymorphism information content of allele $i$, $f_i$ is the frequency of the amplified fragments, and $1 - f_i$ is the frequency of non-amplified fragments. Moreover, statistical analyses such as PCA, ANOVA, and heatmap3, an advanced heatmap for clustering (Zhao et al. 2014), were performed via RStudio version 3.5.1 (RStudio 2015). In addition, the population structure of the accessions of interest was inferred and visualized using Structure version 2.3.4 (Pritchard et al. 2000).
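The PIC calculation described above can be illustrated with a short sketch. This is not the authors' script (the analyses in the study were run in R); the function names and the band scores below are hypothetical, and the sketch only assumes that each allele is scored as 1 (fragment amplified) or 0 (not amplified) across varieties.

```python
from statistics import mean


def allele_pic(band_scores):
    """PIC_i = 2 * f_i * (1 - f_i) for one allele scored 1 (present) or 0 (absent)."""
    if not band_scores:
        raise ValueError("at least one score is required")
    f = sum(band_scores) / len(band_scores)  # frequency of amplified fragments
    return 2 * f * (1 - f)


def marker_pic(allele_columns):
    """PIC of a marker, taken as the mean PIC over its alleles."""
    return mean(allele_pic(column) for column in allele_columns)


if __name__ == "__main__":
    # Hypothetical scores for one marker with two alleles across eight varieties.
    allele_a = [1, 1, 1, 1, 0, 0, 0, 0]  # f = 0.5
    allele_b = [1, 0, 0, 0, 0, 0, 0, 0]  # f = 0.125
    print(marker_pic([allele_a, allele_b]))
    # A monomorphic allele (present in every variety) carries no information.
    print(allele_pic([1] * 8))
```

Because 2f(1 - f) peaks at f = 0.5, a value computed this way cannot exceed 0.5, which is consistent with the reported range of 0.00 to 0.49.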

Diversity of phenotypic traits
Using principal component analysis (Figure 1a and 1b), we observed that the first axis accounted for the largest share of the variation in the germplasm (49.1%), followed by the second axis (21.9%). This result indicated that plant height (group I) and panicles per square meter (group II), which are associated with these axes, strongly influence the phenotype of the population. Moreover, the first five components accounted for 99.30% of the total variation, with components PCA3 (chalkiness of endosperm), PCA4 (leaf length), and PCA5 (amylose content) contributing 12.7%, 9.7%, and 5.90%, respectively.
In Figure 1c, group I contains 13 rice varieties: Mong chim den, Nang cum 1, Tet ran, Ba bui 2, Nang quot bien, Bo liep 2, Thom lun mua, Nang thom, Doc Phung, Tai nguyen CL, Trang bo cau, Thom man, and Tai ... Group I represented the plant height group, while group II was characterized mainly by panicles per square meter. Analyses based on phenotypic traits have been widely applied to study genetic diversity (Liu et al. 2015; Veasey et al. 2008) and to identify germplasm in gene bank material (Islam et al. 2018). The findings of this study have potential for future use in molecular breeding programs for higher-yielding rice in the coastal region of Vietnam's Mekong Delta.
The frequency distributions of the 11 quantitative morphological traits were compared between the two groups obtained from cluster analysis (Figure 2). Only three agro-morphological traits (actual yield, panicles per square meter, and plant height) demonstrated significant variation among the 41 rice germplasms. Although group I showed a higher plant height distribution, group II showed higher values of both panicles per square meter and actual yield. Yield is typically the most important trait for breeding purposes. Overall, actual yield, panicles per square meter, and plant height ranged over 2.43-4.57 ton/ha, 203-280 panicles per m², and 87.47-156 cm, respectively. Thus, this finding will help improve rice yield in Vietnam's Mekong Delta by drawing on the group of rice varieties with high actual yield and a high number of panicles per square meter. In addition, low to intermediate amylose content is one of the characters that rice breeders select to meet market demand for high-quality rice (Islam et al. 2018).
As shown in Figure 3, the Nei and Li similarity coefficient (Nei 1973) among the varieties in this study ranged from 0.78 to 0.96. The 41 rice germplasms were divided into two main clusters. The first cluster contained 39 germplasms (Doc phung, Lun can do, Lun can trang, Bo liep 2, Mot bui do lun CM, Mot bui lun, Ba bong man, Lun cao san do, Lun cao san trang, Tai nguyen CL, Nang co do 2, Tra long 2, Ba bui 2, Tep hanh, Nam tai 1, Mot bui 5, Mot bui do cao CM, Nang cum 1, Lun phen hat nho, Lun phet, Thom lun mua, Lun phen, Lun hen, Lun vang, Lun sua, Mong chim den, Mong chim roi 3, Ba bui lun, Tai nguyen, Trang bo cau, Tet ran, Lun man, Soi lun 1, Ngoc nu, Thom man, Nang quot bien 1, Trang phieu, Lun do, and Nang quot bien), and the second cluster had two varieties (Mot bui trang and Nang Thom). In cluster I, only Mot bui do lun CM and Mot bui lun shared a similarity coefficient of 0.96 (96%), which indicates that these two varieties are closely related. They may share the same ancestor under different names, because both came from Ca Mau province (Table 1). Compared with Jaccard's coefficient (SJ), the Nei and Li coefficient differs only in the double weight given to bands present in both of the two studied genotypes (Duarte et al. 1999; Mohammadi and Prasanna 2003), so the Nei and Li coefficient is better matched to the type of analysis described in this study. Since phenotypic traits are based on genotypic traits, this analysis is highly desirable, valuable, and useful for selection.
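The difference between the two coefficients compared above can be made concrete with a small sketch. It uses the standard definitions, Nei and Li = 2a / (2a + b + c) and Jaccard = a / (a + b + c), where a is the number of bands shared by both genotypes and b and c are the bands found in only one of them. The function names and band profiles are hypothetical and are not data from this study.

```python
def band_counts(x, y):
    """Count shared bands (a) and bands unique to each genotype (b, c)."""
    if len(x) != len(y):
        raise ValueError("band profiles must have the same length")
    a = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)
    b = sum(1 for i, j in zip(x, y) if i == 1 and j == 0)
    c = sum(1 for i, j in zip(x, y) if i == 0 and j == 1)
    return a, b, c


def nei_li(x, y):
    """Nei and Li similarity: shared bands receive double weight."""
    a, b, c = band_counts(x, y)
    if a + b + c == 0:
        raise ValueError("similarity is undefined when neither genotype has a band")
    return 2 * a / (2 * a + b + c)


def jaccard(x, y):
    """Jaccard similarity: shared bands receive single weight."""
    a, b, c = band_counts(x, y)
    if a + b + c == 0:
        raise ValueError("similarity is undefined when neither genotype has a band")
    return a / (a + b + c)


if __name__ == "__main__":
    # Hypothetical binary band profiles for two genotypes (1 = band present).
    genotype_1 = [1, 1, 1, 0, 1, 0]
    genotype_2 = [1, 1, 0, 0, 1, 1]
    print(nei_li(genotype_1, genotype_2))   # 0.75
    print(jaccard(genotype_1, genotype_2))  # 0.6
```

For the same pair of profiles, the Nei and Li value is never lower than the Jaccard value, because only the shared-band count is doubled.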

Population structure model-based approach
A model-based structure analysis was also carried out to determine the number of populations that could be inferred from the 41 genotypes using 50 SSR markers (Figure 5). The LnP(D) values as well as Evanno's ΔK values identified two genetically distinct groups (K = 2) (Figure 4). Structure simulations were carried out by varying K from 1 to 4 with 10 runs for each K using all 41 genotypes, and the highest likelihood was obtained at K = 2. Therefore, two populations were obtained with slight admixture in some of the genotypes, as represented in Figure 4. Additionally, this population structure analysis (Figure 5) confirmed the grouping of genotypes observed by UPGMA cluster analysis (Figure 3).
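The choice of K from replicate Structure runs can be sketched as follows. This is an illustrative calculation, not the procedure used in the study: it assumes the common convention of computing ΔK from the mean LnP(D) of the replicate runs, ΔK(K) = |L(K+1) - 2L(K) + L(K-1)| / sd(L(K)), and the LnP(D) values and function name are hypothetical. Three runs per K are used here for brevity, whereas the study used 10.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Return {K: delta K} for every K that has both neighbouring K values.

    lnp_by_k maps each K to the list of LnP(D) values from its replicate runs.
    """
    means = {k: mean(runs) for k, runs in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # delta K is undefined at the smallest and largest K
        spread = stdev(lnp_by_k[k])
        if spread == 0:
            continue  # avoid division by zero when all runs are identical
        second_difference = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        delta[k] = second_difference / spread
    return delta


if __name__ == "__main__":
    # Hypothetical LnP(D) values, three replicate runs for each K from 1 to 4.
    lnp = {
        1: [-1000, -1002, -998],
        2: [-900, -901, -899],
        3: [-880, -882, -878],
        4: [-870, -875, -865],
    }
    delta_k = evanno_delta_k(lnp)
    best_k = max(delta_k, key=delta_k.get)
    print(delta_k)
    print(best_k)
```

With K varied from 1 to 4, ΔK can only be evaluated at K = 2 and K = 3, which is why the LnP(D) values are usually examined alongside it.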
The effectiveness of any crop breeding program depends on the amount of genetic diversity within the characteristics targeted for improvement and the degree to which these characteristics are inherited (Adjebeng-Danquah et al. 2020; Patel et al. 2014; Ravi et al. 2003). From this point of view, the study of genetic variation is very useful for pre-breeding programs in order to choose parental lines with different desirable characteristics. Molecular markers have been widely used to study genetic diversity; they are linked to genomic regions associated with agronomic traits, and these molecular characteristics may not be affected by the environment. In this study, analysis of agro-morphological characteristics such as amylose content and plant height divided all varieties into two classes (Figure 5). There were major variations in three agro-morphological characteristics, in particular panicles per square meter. This trait is one of the yield components, so the varieties with a high number of panicles per square meter may be used in breeding programs for high production. The molecular analysis using 50 SSR markers also produced two main clusters. At K = 2, all 41 varieties were divided into two populations, indicating genetic differentiation among the varieties. The findings of this study provide useful information for further work to increase the yields of the 41 local rice varieties in Vietnam's Mekong River Delta.

Conclusions
Variation in agro-morphological characteristics, in particular the number of panicles per square meter, was informative for distinguishing among the population. This is a valuable trait contributing to rice yield and was recorded in 28 varieties, especially the two varieties Mot bui do cao CM and Mot bui 5, which had a higher number of panicles per square meter and intermediate amylose content. In addition, the standard panel of SSR markers provided adequate detail to assess the genetic diversity of the Mekong Delta population. The Can Tho University rice variety collection is an important and rich genetic resource that will help improve the yield of rice production in the Mekong Delta.