Molecular Identification of Mudskipper Fish (Periophthalmus spp.) from Baros Beach, Bantul, Yogyakarta

ABSTRACT Mudskipper fish is amphibious fish belonging to the family Gobiidae. Coastal communities widely consume mudskipper to meet their animal protein needs. Mudskipper is primarily cryptic species that are morphologically difficult to identify and distinguish from other mudskipper fish species. Consequently, it can be confused with the naming of mudskipper fish species and can affect the conservation efforts of the fish in their habitat. One of the molecular approaches that can be used to identify the fish species quickly and accurately is DNA barcoding using the COI mitochondrial gene. However, the research on the identification of mudskipper fish in Indonesia is still very limited. Therefore, this study aimed to identify 26 mudskipper fish from Baros Beach, Bantul, Yogyakarta, using COI mitochondrial gene as a molecular marker for DNA barcoding. The method used in this study was a PCR method with universal primers, FishF2 and FishR2. The data obtained were then analyzed using GeneStudio, DNASTAR, BLAST, Identification Engine, Mesquite, MEGAX, and BEAST. The analysis was conducted to obtain similarity, genetic distance and reconstruct a phylogenetic tree. The result revealed that all 26 samples of mudskippers collected from Baros Beach were identified in one genus, namely Periophthalmus, and consisted of 3 species, namely P. kalolo (16 samples), P. argentilineatus (9 samples), and P. novemradiatus (1 sample). Furthermore, this study also discovered a suspected cryptic species in P. argentilineatus with a genetic distance of 5.46-5.96% between clade E, F compared with clade G. Further morphological studies are needed to confirm the species status of these three clades before solidly proclaim that they are cryptic species.

The mudskippers have few characteristics similar to those amphibious animals so that they are commonly recognized as amphibian fish. However, mudskipper's amphibious lifestyle is assisted by morphological and physiological adaptations such as aerial vision and olfaction, increased ammonia tolerance, terrestrial locomotion via pectoral fins, and enhanced immunological protection against pathogens (You et al. 2014;You et al. 2018). Coastal communities widely consume mudskippers because the fish contain high-quality sources of protein, minerals, and some vitamins (Andem & Ekpo 2014;Kanejiya et al. 2017;Mahadevan et al. 2021).
Cryptic species are commonly found in the family Gobiidae (Thacker 2003). Previous studies showed that cryptic species has occurred in genus Elacatinus, Tigrigobius, Trimma, Eviota using COI mitochondrial gene (Victor 2010;Victor 2014;Winterbottom et al. 2014;Tornabene et al. 2015), genus Ponticola using Cyt-b mitochondrial gene (Vasil'eva et al. 2016) and genus Mugilogobius using three mitochondrial markers (ND5, Cytb, and D-loop) (Huang et al. 2016). Moreover, cryptic species are also observed in mudskippers Boleophthalmus pectinirostris from the western Pacific coast of East Asia and the Strait of Malacca in Malaysia using Mitochondrial ND5 gene and nuclear RAG-1 gene (Chen et al. 2014), Periophthalmus argentilineatus from East-African and Indo-Malaya using two mtDNA markers (D-loop and 16S rDNA) and one nDNA region (RAG-1) (Polgar et al. 2014), and Periophthalmus argentilineatus from Bogowonto Lagoon, Yogyakarta, Indonesia using COI mitochondrial gene as a DNA barcoding marker (Arisuryanti et al. 2018). Since cryptic species are morphologically similar, distinguishing them solely on morphological characteristics is nearly impossible. Furthermore, the inaccurate identification of cryptic species can lead to taxonomic and conservation problems. In the last few years, molecular biology approaches have been applied for identifying fish species due to the limits of morphological identification methods. The most widely used method is DNA barcoding using the COI gene as a DNA barcoding marker (Hogg & Hebert 2004). DNA barcoding is a method for quickly identifying species by examining a short specific target gene. This method can identify hidden species that previously cannot be identified by the conventional identification method (Hebert et al. 2003). Sequences from the mitochondrial COI gene are now considered highly desirable for the precise identification of freshwater fish species (Dahruddin et al. 2017;Hutama et al. 2017;Arisuryanti et al. 2020). This is because accurate species database information can be used to develop conservation programs for cryptic species.
Baros Beach has natural barriers such as land and different newly planted mangrove area that separate sub-populations of mudskipper fish. Moreover, there is no genetic information on mudskipper fish in this area. Therefore, this study aimed to identify mudskippers from Baros Beach Yogyakarta, Indonesia, using the COI mitochondrial gene as a DNA barcoding marker.

Sample collection
A total of 26 mudskippers (code MBR) were collected from 4 different stations (A, B, C, and D) in Baros Beach (Figure 1) during August 2019 and December 2020. These stations were chosen by purposive sampling to represent the natural barriers in Baros Beach (Table 1). The sampling of mudskipper fish was performed using a hand net. Then, the sample was cleaned and documented ( Figure 2). Each mudskipper fish sample was then placed in a plastic clip, placed in a cooler (coolbox), and transported to the Laboratory of Genetics and Breeding (Faculty of Biology, Universitas Gadjah Mada) to be preserved with 99% ethanol and stored at -20°C until used in the following process.

DNA extraction
The genomic DNA of each specimen was extracted from tissue muscle on the back of the gills above the pectoral fins using DNeasy blood and tissue kit (QIAGEN, Valencia, CA, USA), according to the manufacturer's protocols.

DNA Amplification
The partial COI mitochondrial gene was amplified using primers FishF2 (5'-TCGACTAATCATAAAGATATCGGCAC-3') and FishR2 (  Cycler (Biorad). The MyTaq HS Red Mix PCR kit (Bioline) was used for the polymerase chain reaction (PCR), and PCR amplifications were performed in 50 µL reaction volumes containing 10-100 ng of genomic DNA, 25 µL MyTaq HS Red Mix PCR, 1 mM MgCl 2 , 0.6 µM of each primer and 11 µL double distilled water (ddH2O). Negative control was set up by omitting template DNA from the reaction mixture to assess the efficiency of the DNA amplification. PCR amplification conditions were 2 min predenaturation at 95°C, followed by 35 cycles of denaturation at 95°C for 15 sec, annealing at 50°C for 30 sec, and extension at 72°C for 30 sec. A final extension of 5 min at 72°C was performed (Arisuryanti et al. 2020).

Electrophoresis and Sequencing
The electrophoresis of PCR products was run on a 1% agarose gel stained with Florosafe (Bioline) and buffered with Tris-acetate EDTA (TAE) at 50 volts for 20 minutes. Visualization was conducted under U.V. light. All amplification samples were sent to First Base Sdn Bhd (Malaysia) through P.T. Genetika Science (Jakarta) for purification and sequencing in both forward and reverse directions using the Big Dye Terminator (Applied Biosystems) and the ABI 3730xl Genetic Analyzer (Applied Biosystems).

Sequence Editing
Data obtained from DNA sequencing results were edited in the GeneStudio program and validated with SeqMan and EditSeq on the DNASTAR program (DNASTAR Inc. Madison, USA). Sequencing reactions were carried out on each individual using both forward and reverse primers. Chromatograms were inspected manually to check ambiguous bases and stop codons.

Sequence Alignment
The mudskipper COI sequences were then aligned using Opal on Mesquite v.3.51 program (Maddison & Maddison 2018) and ClustalW on the MEGAX program (Kumar et al. 2018).

Nucleotide Composition & Genetic Distance
The composition of the COI nucleotides was calculated using the MEGAX program. Genetic distance was analyzed using the MEGAX program with the Kimura-2 Parameter (K2P) model and summarized in a Neighbor-Joining (NJ) tree, which is the standard methodology used in barcoding studies with bootstrap 1,000 replicates (Hebert et al. 2003).

Phylogenetic Relationship
The reconstruction of the phylogeny tree was analyzed using the Neighbor Joining and Maximum Likelihood methods with 1,000 bootstrap using the MEGAX program (Kumar et al. 2018) and Bayesian Inference using the BEAST program (Suchard et al. 2018

PCR amplification & sequence identification of mudskipper from Baros Beach
The result showed that the amplification of the COI mitochondrial gene of 26 mudskipper fish from Baros Beach produced fragment length around 700 bp ( Figure 3). The consensus sequence results from the chromatogram editing process were between 666-708 bp, which can be translated into 222-236 amino acids. According to the results (

Sequence Alignment
The alignment of sixteen P. kalolo sequence samples from Baros Beach yielded a clean sequence (a sequence that resulted after the alignment and cutting process) of 684 bp, meanwhile in nine P. argentilineatus sequence samples from Baros Beach resulted in a clean sequence of 648 bp. These clean COI gene sequences of each species were then used for intrapopulation analysis.
The COI sequences alignment of mudskippers from Baros Beach and other Indonesian regions recorded in GenBank and BOLD resulted in a fragment length of 579 bp for P. kalolo and P. argentilineatus, 651 bp for P. novemradiatus, respectively. This result was then subjected to intraspecies analysis (nucleotide composition and genetic distance). For phylogenetic tree analysis, the clean COI sequences (570 bp) of 44 samples representing 3 species, namely, P. kalolo, P. argentilineatus, and P. novemradiatus from Baros Beach and other Indonesian regions recorded on GenBank and BOLD were used. One COI sequence of Boleophthalmus boddarti (accession number KU692377) was used as an outgroup.

Intraspecies
The analysis showed that all P. kalolo samples had different T, C, A, and G nucleotide compositions. The differences of T, C, A, and G nucleotides between samples were 0-0.87%, 0-1.04%, 0-0.35%, and 0-0.34%, respectively. The composition of A+T was greater than G+C. The GC content from all P. kalolo samples ranged from 43.18% to 44.21%. The maximum GC content was observed in P. kalolo from Tulungagung (KU692748) and from Cilacap (KU692749). In addition, P. kalolo from Tulungagung (KU692747, KU692751, KU692755) had nucleotide compositions similar to one another. Furthermore, P. kalolo from Cilacap (KU692753) and Bogowonto (MT439602, MT439601) had the same nucleotide composition. This variation in nucleotide composition indicated variations between P. kalolo from the Baros Beach and P. kalolo from other Indonesian regions recorded in GenBank and BOLD based on the mitochondrial COI gene. All P. argentilineatus samples had different T, C, A, and G nucleotide compositions. The differences of T, C, A, and G nucleotide between samples were 0.17-0.69%, 0-0.52%, 0-0.19%, and 0.02-0.3%, respectively. The composition of A+T was greater than that of G+C. The GC content from all P. argentilineatus samples ranged from 42.14% to 42.66%. The maximum GC content was observed in P. argentilineatus from Pandeglang (KU692744). Furthermore, P. argentilineatus from Pandeglang (KU692745-46) and P. argentilineatus from Bogowonto (MT439599) had the same nucleotide composition. This variation of nucleotide composition indicated that there were intraspecies variations of P. argentilineatus from Baros Beach and P. argentilineatus from other Indonesian regions recorded in GenBank and BOLD based on the mitochondrial COI gene.
P. novemradiatus Baros Beach and P. novemradiatus from other Indonesian regions recorded in GenBank and BOLD had relatively similar nucleotide compositions of T, C, A, and G. The differences in T, C, A, and G nucleotides between samples were 0-0.30%, 0.15-0.46%, 0-0.15%, and 0%, respectively. The composition of A+T being greater than the composition of G+C. The GC content from all P. novemradiatus samples ranged from 46.24% to 46.70%. This variation in nucleotide composition indicated variations of P. novemradiatus from Baros Beach and P. novemradiatus from other Indonesian regions recorded in GenBank and BOLD based on the mitochondrial COI gene.

Phylogenetic analysis
The tree reconstruction results yielded identical tree topologies (Figure 4). The results of the jModelTest2 analysis revealed that the most optimal  sequence substitution model is HKY with invariant sites (HKY + I) on the Bayesian Information Criterion (BIC). The phylogenetic tree reconstruction of P. kalolo from Baros Beach and P. kalolo from other regions in Indonesia recorded on GenBank and BOLD formed 4 different clades, namely Clade A, B, C, and D. The formation of these 4 clades were supported by a bootstrap value of 100% in the Neighbor-Joining and Maximum Likelihood methods. In addition, the posterior probability value is 1.00 in Bayesian Inference. Bootstrap and posterior probability results showed that the formation of these 4 clades was powerful and rigid. The results of the grouping above were consistent with previous studies conducted by Arisuryanti et al. (2018), which showed that P. kalolo from Bogowonto formed a group with P. kalolo from Cilacap (KU692753) and separated from P. kalolo from Tulungagung (KU692747-48, KU692750-52, KU692755) and other P. kalolo from Cilacap (KU692753 and KU692754).
The phylogenetic tree reconstruction of P. argentilineatus from Baros Beach and P. argentilineatus from regions in Indonesia recorded on GenBank and BOLD formed 3 different clades, consisting of clade E, clade F, and clade G. The formation of these 3 clades was supported by a bootstrap value of 98% in the Neighbor-Joining method, 100% in Maximum Likelihood, and posterior probability value of 1.00 on Bayesian Inference. Bootstrap and posterior probability results showed that the formation of these 3 clades was powerful and rigid. The separation of clade E and clade F was supported by morphological data in which P. argentilineatus members of clade E had a pointed first dorsal fin shape, while members of clade F had a rounded first dorsal fin shape.
The reconstruction of the phylogenetic tree of P. novemradiatus from Baros Beach with P. novemradiatus from other Indonesian regions recorded on GenBank and BOLD formed one clade of P. novemradiatus, which was supported by a bootstrap value of 100% in the Neighbor-Joining and Maximum Likelihood methods, and a posterior probability of 1.00 on the Bayesian Inference.

Genetic Distance
The lowest genetic distance was between P. kalolo clade B and C with an average of 0.66% (0.35-1.05%), while the highest genetic distance was between clade A and D with an average of 1.90% (1.40-2.29%) (Table 3). According to Zemlak et al. (2009), the threshold for intraspecies genetic distance in fish species is 3.5%. If it exceeds 3.5%, it is considered a different species. Based on the criterion from Zemlak above, P. kalolo from Baros Beach and P. kalolo from the Indonesian regions recorded in GenBank were still classified as the same species (conspecific).
The average genetic distance of P. argentilineatus between clade E and clade F was 2.51% (1.94-2.84%), between clade E and clade G was 5.96% (5.62-6.19%), and then between clade F and clade G was 5.46% (5.24-5.62%) (Table 4). Therefore, according to Zemlak's criterion, clade E and clade F were still classified as the same species (conspecific) because their genetic distance was less than 3.5%, while clade E and clade F compared with clade G were suspected to be cryptic species because their genetic distance was more significant than 3.5%. This finding was consistent with the study conducted by Arisuryanti et al. (2018) and Rha'ifa et al. (2021), who discovered that P. argentilineatus in Indonesia was divided into two clades separated by a genetic distance > 3,5%. The results of the study further showed that P. argentilineatus in Indonesia are suspected to be cryptic species. Further studies are needed to examine whether P. argentilineatus in Indonesia should be categorized into different species or categorized into one species with several sub-species.
The average genetic distance of P. novemradiatus was 0.13% (0-0.30%). Based on the Zemlak criterion, P. novemradiatus from Baros Beach and P. novemradiatus from other Indonesian regions recorded in GenBank were still classified as the same species (conspecific). Table 3. Percentage of an intra-species genetic distance of P. kalolo from Baros Beach and P. kalolo from other regions in Indonesia recorded in GenBank and BOLD based on the COI mitochondrial gene.