COI-Based DNA Barcoding of Selais Fish from Arut River, Central Kalimantan, Indonesia

ABSTRACT
Selais fish belong to the family Siluridae, which consists of 12 genera with 104 validated species. Human demand for these fish has increased sharply because of the benefits they provide, especially for consumption. However, selais fish are difficult for nonspecialists to distinguish morphologically from other silurid fish. Therefore, a DNA barcoding approach using the mitochondrial COI gene as a molecular marker was applied in this study to clarify the taxonomic position and species classification of selais fish from the Arut River (Central Kalimantan, Indonesia) and to contribute to the COI database of fish from Indonesia. The COI fragment was amplified by PCR (Polymerase Chain Reaction) with a pair of universal barcoding primers, FishF2 and FishR2. Based on the partial COI fragment, all samples showed no sequence differences (only 1 haplotype) within the population, confirming that these fish consisted of a single species. Furthermore, phylogenetic analysis (NJ / ML / BI) revealed that the selais fish in this study had a closer genetic relationship with Ompok hypophthalmus than with other Ompok groups. This relationship was supported by a genetic distance value not exceeding 3.6%, and it resolves the naming of the selais fish from the Arut River, which was previously unclassifiable.

because they not only have a delicious taste but are also highly nutritious, so they can be used to meet animal protein needs.
Many kinds of research on selais fish have been carried out, but only a few studies have worked at the molecular level. The Arut River, one of the rivers in Central Kalimantan Province inhabited by selais fish, is the area of concern in this study. Overexploitation and overfishing have the potential to reduce the diversity and population size of selais fish. This is supported by IUCN data showing a population decrease in several species of selais fish. For conservation management of selais fish, especially in the Arut River, initial data collection such as molecular identification of species is necessary. Arisuryanti et al. (2020a) reported that the species boundaries and classification of selais fish identified using the 16S mitochondrial gene were still undetermined due to the low similarity and the genetic distance values obtained in comparison with the GenBank database. Although the COI mitochondrial gene is a highly conserved region and widely accepted for the identification of almost all animals (Mitani et al. 2009), limited (unregistered) data in the GenBank database result in low percentage similarity and query cover in comparisons, which constrains species determination.
Recently, a DNA barcoding approach using the mitochondrial COI gene as a molecular marker has been applied as a rapid alternative bioidentification method for clarifying taxa across the animal kingdom (Hebert et al. 2003), including ichthyofauna (Panprommin et al. 2019; Pandey et al. 2020). For example, Malakar et al. (2012) reported that three Ompok species from India were successfully identified as Ompok pabda, Ompok pabo, and Ompok bimaculatus. Arisuryanti et al. (2018) added that two cryptic fish species from Indonesia, Periophthalmus argentilineatus and Periophthalmus kalolo, were successfully confirmed. In addition, Chen et al. (2021), Chang et al. (2016), and Cline (2012) verified cases of mislabeled fish names on food products being sold. Therefore, we re-evaluated the uncertain taxonomic status and systematics of selais fish from the Arut River using the COI gene as a DNA barcode, examining genetic relationships with three different phylogenetic tree approaches. This work was also intended to add to the COI database library of fish in Indonesia.

MATERIALS AND METHODS

Sample Collection
A total of 10 wild selais fish were collected from the Arut River, Central Kalimantan (2°40'10.6"S and 111°38'08.3"E) (Figure 1) with the help of local fishermen using large fishing nets, and documented for this study (Figure 2). Approximately 50 mg of muscle tissue from these freshly caught fish was sampled and placed into labelled 1.5 ml tubes containing 99% ethanol. The samples were transferred to the Laboratory of Genetics and Breeding, Faculty of Biology, Universitas Gadjah Mada, and frozen at -20 °C for further molecular analysis.

DNA Isolation, Amplification, and Sequencing
Total genomic DNA (gDNA) of the preserved samples was isolated from muscle tissue near the ventral fin using the DNeasy Blood and Tissue Kit (QIAGEN, Valencia, USA) following the manufacturer's protocol. A single COI fragment was amplified in a PCR machine with the barcode primers FishF2 (5'-TCGACTAATCATAAAGATATCGGCAC-3') and FishR2 (5'-ACTTCAGGGTGACCGAAGAATCAGAA-3') (Ward et al. 2005). Each 50 μl PCR reaction contained 10-100 ng gDNA, 25 μl MyTaq HS Red Mix (Bioline), 2 mM MgCl2, 3 μl of the two COI primers, and 11 μl ddH2O. The PCR program consisted of pre-denaturation at 95 °C for 1 min, followed by 35 cycles of denaturation at 95 °C for 15 sec, primer annealing at 50 °C for 30 sec, and elongation at 72 °C for 30 sec, then a final elongation at 72 °C for 1 min and a hold at 4 °C.
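The cycling conditions can also be written down as a small data structure, which makes it easy to check the programmed run time. The sketch below is illustrative only: the constant names and the `total_hold_seconds` helper are hypothetical, and ramp times and the final 4 °C hold are ignored.

```python
# Thermal profile as reported in the text (temperature in deg C, time in s).
# Names are illustrative and do not correspond to any instrument API.
PRE_DENATURATION = ("pre-denaturation", 95, 60)
CYCLE_STEPS = [
    ("denaturation", 95, 15),
    ("annealing", 50, 30),
    ("elongation", 72, 30),
]
N_CYCLES = 35
FINAL_ELONGATION = ("final elongation", 72, 60)


def total_hold_seconds():
    """Sum of programmed hold times, ignoring ramping and the 4 deg C hold."""
    per_cycle = sum(seconds for _, _, seconds in CYCLE_STEPS)
    return PRE_DENATURATION[2] + N_CYCLES * per_cycle + FINAL_ELONGATION[2]


total = total_hold_seconds()
print(f"{total} s (~{total / 60:.1f} min) of programmed hold time")
```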
The PCR product was visualized on a 1% agarose gel stained with 2 μl of FloroSafe. The amplicon was then purified and sequenced at 1st BASE using the same pair of universal primers as in the amplification. Bi-directional sequencing of the COI gene by the Sanger dideoxynucleotide method was performed on an ABI 3730XL Genetic Analyzer (Applied Biosystems).

Data Analysis
The sequence data sets were processed and edited manually using the SeqMan and EditSeq programs (Lasergene, DNASTAR). The resulting consensus sequences were analyzed using the Identification Engine on the BOLD website and the BLAST program on the NCBI website to determine percentage identity. For both the intra-population analysis (the population of selais fish from the Arut River in this study) and the intra-species analysis (samples from this study combined with sequences from the GenBank database), sequence data were aligned in MESQUITE ver. 3.51 (Maddison & Maddison 2018). The nucleotide composition and genetic distance under the Kimura 2-Parameter (K2P) substitution model were then analyzed in MEGA X (Kumar et al. 2018). Next, genetic variation data (number of haplotypes, polymorphic sites, parsimony-informative sites, transitions and transversions, haplotype diversity, and nucleotide diversity) were processed in DnaSP ver. 6 (Rozas et al. 2017). The haplotype network based on the Median Joining Network method was visualized in NETWORK ver. 10.1 (https://www.fluxus-engineering.com). Furthermore, Principal Coordinate Analysis (PCoA) was performed in GenAlEx ver. 6.5 (Peakall & Smouse 2012) to obtain a simple separation model among haplotypes.
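For readers unfamiliar with the K2P model, the distance between two aligned sequences follows from the proportion of transitional differences (P) and transversional differences (Q): d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)]. The following is a minimal sketch of that formula, not the MEGA X implementation; the function name and the choice to skip gapped or ambiguous positions are assumptions made for illustration.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption: positions with gaps or ambiguity codes are skipped.
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        # Same base class (purine/purine or pyrimidine/pyrimidine) = transition.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For small divergences the K2P distance is only slightly larger than the raw proportion of differing sites, because the correction for multiple substitutions at the same site is still minor.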
The phylogenetic trees were constructed using three different approaches. The NJ (Neighbor-Joining) and ML (Maximum Likelihood) trees were built using the Kimura 2-Parameter (K2P) model with 1,000 bootstrap replications in MEGA X. For the Bayesian Inference (BI) tree, the most suitable nucleotide substitution model was selected using the Bayesian Information Criterion (BIC) in jModelTest ver. 2.1.10 (Darriba et al. 2012). The BI tree was then inferred in BEAST ver. 1.10 under the best-fitting model (Suchard et al. 2018). The MCMC (Markov Chain Monte Carlo) analysis was run for 10^7 generations, sampling every 10^3 generations. The first quarter of the samples was discarded as burn-in, and the remaining three quarters were used to construct the BI tree and estimate posterior probability values. Phylogeny tree characteristics were visualized in

RESULTS AND DISCUSSION
Nine of the selais fish samples from the Arut River were successfully amplified and sequenced using the FishF2 and FishR2 primers (codes: LSA-1, LSA-2, LSA-5, LSA-6, LSA-7, LSA-8, LSA-9, LSA-10, LSA-11) and produced about 678-705 bp (226-235 amino acids). The one remaining sample (code: LSA-4) yielded a poor sequence even though the DNA band was visible (Figure 3). The nine COI sequences of selais fish have been registered in GenBank with accession numbers MZ634366-MZ634374. The Identification Engine algorithm in BOLD and the BLAST algorithm in GenBank showed that these samples had a percentage identity with Ompok hypophthalmus of 96.71%-97.19% (BOLD) and 96.61%-96.77% (GenBank). In particular, the COI sequence of Ompok hypophthalmus in BOLD with a similarity value of 97.19% was still private data, meaning that it has not been released to the public. Therefore, only the released COI sequences of Ompok hypophthalmus were used as comparison data. For the intra-population analysis, the COI gene sequences were aligned, resulting in 672 bp (224 amino acids).
The three phylogenetic methods produced almost identical tree topologies, so only the NJ tree is displayed (Figure 4). The BI tree was constructed using the HKY (Hasegawa-Kishino-Yano) model with a gamma-distributed rate (+G) and invariant sites (+I), which was selected as the best substitution model under BIC in jModelTest. All samples from the Arut River were grouped into a single clade (clade A) with very strong bootstrap support of 100/100 (for NJ and ML) and a posterior probability of 1 (for BI). This indicates that all fish samples belonged to one species, which was supported by a genetic distance of 0% among samples. This is consistent with Roesma et al. (2020), who stated that samples with a genetic distance from 0% to 0.5% are still regarded as one species.
Furthermore, the samples (LSA) had a closer genetic relationship with Ompok hypophthalmus (MK473377, MK473378, and MK473379) in clade B than with other Ompok groups. This was supported by high bootstrap values (100/98) and a posterior probability of 1, and the mean genetic distance between clade A and clade B was 3.6%. A previous study by Arisuryanti et al. (2020a) described that, based on the 16S mitochondrial gene, selais fish were closer to the genus Kryptopterus than to the genus Ompok, with significant differences in genetic distance of 44.1% and 65.8%, respectively. The present finding resolves the ambiguous naming of the selais fish, which previously could not be classified accurately. In more detail, the intra-species analysis between selais fish and Ompok hypophthalmus (MK473377, MK473378, and MK473379) resulted in a 633 bp alignment. The nucleotide composition of each sequence is presented in Table 1. There were no significant divergences in nucleotide composition between LSA* and Ompok hypophthalmus from the GenBank database. Across the 12 individuals, differences in nucleotide composition were small: T = 0%-0.47%, C = 0%-0.63%, A = 0%-0.32%, and G = 0%-0.47%. The AT content of all samples was greater than the GC content, with the same difference of 0.79%.
The genetic distance for each sample is given in Table 2. The intra-species genetic distance ranged from 0% to 3.6%. The highest value was obtained between LSA* and MK473377, MK473378, and MK473379. The lowest genetic distance was between MK473377 and MK473379. Zemlak et al. (2009) stated that samples are still categorized as one species if the intra-species genetic distance does not exceed a threshold of 3.5%. Meanwhile, the genetic distance between the selais fish in this study and Ompok hypophthalmus from the Indragiri River, Riau was 3.6%. Nevertheless, the samples in this study were still classified as Ompok hypophthalmus because of their high similarity (97.19%) to a record in the BOLD database, which implies a genetic distance below 3.5%. However, that mitochondrial COI sequence is private data, meaning that the COI sequence of Ompok hypophthalmus has been registered in BOLD but has not been released to the public (Table 4). The haplotype grouping of selais fish from the Arut River and Ompok hypophthalmus from the GenBank database is presented in Table 3. The intra-species polymorphic sites among haplotypes are shown in Table 4. Among the three haplotypes, there were 23 variable nucleotide sites (3.61%), comprising 22 parsimony-informative sites (3.46%) and one singleton site (0.16%). Nucleotide diversity (π) and haplotype diversity (Hd) were 0.0142 ± 0.00468 and 0.439 ± 0.158, respectively (Hd from 0 to <0.5 indicates low haplotype diversity and Hd from >0.5 to 1 indicates high haplotype diversity). This indicates that both π and Hd were low. It is assumed that the samples in this study come from a small population. All nucleotide divergences in these sequences were substitutions, with 22 sites (3.46%) representing transitions and only one site (0.16%) representing a transversion. Almost all base divergences were in the third codon position (20 sites / 3.14%), followed by the first codon position (3 sites / 0.47%), with no divergences in the second codon position (0 sites / 0%). In addition, one nonsynonymous substitution (from isoleucine to valine) was detected in the 145th codon.
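As a check on the reported haplotype diversity, Nei's estimator Hd = n/(n - 1) × (1 - Σ p_i²) can be evaluated directly from haplotype counts. The sketch below assumes that the 12 sequences fall into haplotypes of 9, 2, and 1 individuals (the nine Arut River samples plus the three GenBank sequences, as implied by the haplotype grouping); the function name is hypothetical and this is not DnaSP's code.

```python
def haplotype_diversity(counts):
    """Nei's haplotype (gene) diversity from per-haplotype sample counts."""
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two sequences")
    sum_sq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1 - sum_sq)


# Assumed counts: HapA = 9, HapB1 = 2, HapB2 = 1.
hd = haplotype_diversity([9, 2, 1])
print(round(hd, 3))
```

With these counts the estimate rounds to 0.439, in line with the Hd value reported above.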
The inter-haplotype mutation model is shown in Figure 5. Figure 5 depicts a very clear separation between haplotype A (HapA) and haplotype B (HapB1 and HapB2) due to the many nucleotide substitutions. Between HapA and HapB1, and between HapA and HapB2, there were 22 point mutations. Within haplogroup B, HapB1 and HapB2 were separated by only 2 mutation points. In general, the intra-species genetic variation between selais fish from the Arut River and Ompok hypophthalmus (3 haplotypes) is summarized in the Principal Coordinate Analysis (PCoA) pattern in Figure 6.

Table 4. Intra-species polymorphic sites among haplotypes (site position numbers not shown; a dot indicates the same base as HapA)
Haplotype   Variable sites                                  Amino acid   n
HapA        T T T G C G A A A C T A T A G C T A A C G T G   I            9
HapB1       C C C A . A G G G T C G C G A T C G T T A C A   V            2
HapB2       C C C A T A G G G . C G C G A T C G T T A C A   V            1
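The mutation counts between haplotypes can be recovered from the variable sites alone. The sketch below hard-codes the 23 variable sites listed above for HapA, HapB1, and HapB2, reads "." as identity with HapA (an assumption about the table notation), and counts pairwise differences split into transitions and transversions. All names are illustrative.

```python
from itertools import combinations

# Variable sites as listed for the three haplotypes; '.' = same base as HapA.
HAP_A = "TTTGCGAAACTATAGCTAACGTG"
RAW = {
    "HapA": HAP_A,
    "HapB1": "CCCA.AGGGTCGCGATCGTTACA",
    "HapB2": "CCCATAGGG.CGCGATCGTTACA",
}
# Replace each '.' with the HapA base at the same position.
HAPS = {
    name: "".join(ref if base == "." else base for base, ref in zip(seq, HAP_A))
    for name, seq in RAW.items()
}

PURINES = {"A", "G"}


def count_differences(seq_a, seq_b):
    """Return (transitions, transversions) between two aligned sequences."""
    ts = tv = 0
    for a, b in zip(seq_a, seq_b):
        if a == b:
            continue
        # A transition stays within one base class (purine or pyrimidine);
        # a change between classes is a transversion.
        if (a in PURINES) == (b in PURINES):
            ts += 1
        else:
            tv += 1
    return ts, tv


for (name_a, seq_a), (name_b, seq_b) in combinations(HAPS.items(), 2):
    ts, tv = count_differences(seq_a, seq_b)
    print(name_a, name_b, "differences:", ts + tv, "ts:", ts, "tv:", tv)
```

Under these assumptions the script gives 22 differences between HapA and each HapB haplotype and 2 differences between HapB1 and HapB2, matching the mutation counts described for the haplotype network.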

CONCLUSION
In conclusion, DNA barcoding using the partial COI mitochondrial gene was effective and acceptable for molecular identification, especially for morphologically indistinguishable species. Based on the phylogenetic tree construction (NJ/ML/BI), the selais fish from the Arut River were confirmed as a single taxon, Ompok hypophthalmus, supported by a genetic distance value of 3.6%. The results of this study are also expected to serve as an entry point for formulating sustainable fisheries management and conservation strategies, considering that the enormous potential of this important fish can provide maximum benefits in a sustainable manner if managed properly and responsibly.

AUTHORS CONTRIBUTION
T.K. collected and analyzed the data and wrote the manuscript. T.A. designed the research and supervised all the processes.