HOW MANY SPECIES OF RATTLESNAKES ARE THERE IN THE CROTALUS DURISSUS SPECIES GROUP (SERPENTES: CROTALIDAE)?

Resumen.— Los miembros del grupo de especies de Crotalus durissus se encuentran entre las especies más grandes de serpientes de cascabel y tienen una gran importancia médica. La taxonomía del grupo es compleja


INTRODUCTION
In recent years, the general lineage concept (de Queiroz, 1998) has become widely used for defining species.Under the general lineage species concept, species can be defined as separately evolving metapopulation lineages.The debate has now shifted from "what is a species?", to "which lines of evidence are necessary to define those separately evolving lineages?"(de Queiroz, 2007).Morphological characters have traditionally been the standard when defining species, but these characters can evolve at different rates between taxa, and it can be difficult to determine what characters are diagnostic between species (Fujita et al., 2012).The use of molecular markers has become commonplace when delimiting and defining species, but the results of those studies that define species based only on molecular data alone (e.g.Card, 2016;Leaché & Fujita, 2010;Ruane et al., 2014) can be misleading when not analyzed carefully, as they might delimit genetic structure and not necessarily species (Chambers & Hillis, 2020;Sukumaran et al., 2021;Sukumaran & Knowles, 2017).Mitochondrial DNA (mtDNA) has become a standard tool in population genetics, as it can allow us to better understand inter and intra-population processes such as migration, gene flow, genetic structure, or speciation (Rubinoff & Holland, 2005).However, in recent years, the use of mitochondrial markers as the only evidence to define species has become commonplace (Hillis, 2019).A multitude of phenomena, such as rapid lineage sorting, selection, male-biased dispersal, isolation-by-distance, or introgression, could in many cases make the use of mtDNA alone unreliable for delimiting and defining species (Rubinoff & Holland, 2005).
The level of genetic variation that needs to exist between two lineages to be considered two different species is a matter of great debate, and probably there will not be a single correct answer for all cases (Sukumaran et al., 2021).However, because species are   C. d. cascavella Wagler, 1824, C. d. collilineatus Amaral, 1926, C. d. cumanensis Humboldt, 1811, C. d. marajoensis Hoge, 1966, C. d. maricelae García-Perez, 1995, C. d. ruruima Hoge, 1966, C. d. terrificus (Laurenti, 1768) and C. d. trigonicus Harris & Simmons, 1978.Two other species from South America Crotalus unicolor Van Lidth De Jeude, 1887 (restricted to Aruba) and Crotalus vegrandis Klauber, 1941 (restricted to Venezuela), have previously been considered subspecies of C. durissus, but are currently recognized by some taxonomist as full species (Wallach et al., 2014;Zaher et al., 2019).2020) study, their results do not necessarily show that some of these populations are indeed different species.Here we examine if the genetic patterns observed in previous studies could be explain by other biological processes, in particular, we test if isolation-by-distance (IDB) and a lack of thorough sampling might be the cause of an apparent barrier to geneflow between member of the C. durissus species group.Additionally, we test if a newly develop species delimitation method, DELINEATE (Sukumaran et al., 2021), supports the current taxonomy of the group.

MATERIALS AND METHODS
DNA sequencing and alignment.We sequenced the mitochondrial genes of Cytochrome B (Cytb) and NADH dehydrogenase subunit 4 (ND4) for five individuals of the Crotalus durissus group.These include one C. culminatus, one C. ehecatl and three C. mictlantecuhtli (Appendix 1).DNA extraction protocols and PCR conditions follow those of Reyes-Velasco et al., (2013) and were carried out at the Universidad Juárez del Estado de Durango.Additionally, we used the program MITObim (Hahn et al., 2013) to obtain mitochondrial sequences from the raw Illumina transcriptomic reads of Durban et al. (2017), which include two additional C. mictlantecuhtli, two C. tzabcan, and one C. culminatus (Appendix1).
The program MITObim can generate mitochondrial contigs from short reads with the use of an iterative baiting method.We first used a published sequence of the mitochondrial genome of Crotalus horridus (GenBank reference number NC 014400) as the reference mitogenome, with the default program settings, except for a k-mer length of 31.We then repeated the same procedure for each individual, but with the resulting contig of the first run as the reference mitogenome.For the resulting mitogenomes, we extracted the Cytochrome B and ND4 genes for all subsequent analyses.All new mitochondrial sequences were deposited into GenBank (MZ855472-MZ855493; Appendix 1) and the following analyses are explained below.
As a backbone of our analyses, we used the sequences of Carbajal-Márquez et al. (2020).We obtained additional sequences from GenBank for the mitochondrial genes Cytb and ND4, as well as the nuclear gene oocyte maturation factor (C-mos) for most members in the Crotalus durissus and C. molossus species group.We reduced our dataset as to include only a single individual for most of the South American subspecies of C. durissus, as well as C. unicolor and C. vegrandis, since the main goal of this study is the northern members of the group.In the case of Mesoamerican specimens related to C. simus, we included all available individuals.Additionally, we included multiple other species of rattlesnakes as outgroups.A list of all sequences used and their GenBank number is presented in Appendix 1, while the locality for each sample is shown in figure 1.We aligned each individual gene using the online version of Mafft (Katoh & Standley, 2013).We then translated all genes to their predicted amino acid sequences in Geneious v9.1.6(Biomatters Ltd., Auckland, NZ) to check for the presence of stop codons, which were not found.
Phylogenetic analyses.We estimated the best-fit models of nucleotide evolution for each gene and gene partition the program PartitionFinder.We then concatenated the individual genes using Sequence Matrix (Vaidya et al., 2011).We produced three datasets: one that included all genes (two mtDNA genes plus one nuclear gene), one that only included mtDNA, and the last one that only included the nuclear loci c-mos.We then used the Akaike Information Criterion (AIC) in the program PartitionFinder (Lanfear et al., 2012) to estimate partition schemes and models of molecular evolution, which are shown in table 1.
We estimated Bayesian phylogenetic inference (BI) for each dataset in Mr. Bayes v3.2.(Ronquist et al., 2012), in the CIPRES science gateway server (Miller et al., 2011).We used BI to estimate the posterior probabilities of phylogenetic trees based on a total of 10 million generations Metropolis-coupled Markov chain Monte Carlo (MCMC).These analyses consisted of four simultaneous runs, each with four chains (three heated and one cold), sampling every 1,000 generations.We verified that independent runs had converged by visualizing the output from BI in the program TRACER v. 1.5 (Drummond & Rambaut, 2007).Potential scale reduction factor (PSRF) estimates comparing chain likelihood values indicated convergence by one million generations, so we discarded the first 25% of BI samples as burnin.We estimated a majority-rule consensus phylogram from the combination of the post burn-in samples from the four BI runs, combined the runs using TreeAnnotator v. 1.7.4 (Drummond & Rambaut, 2007) and visualized the resulting trees in FigTree Reyes-Velasco et al. -Species of the Crotalus durissus group v1.4.2 (Rambaut, 2014).Additionally, we used the NeighborNet algorithm in SplitsTree 4 (Huson & Bryant, 2006) to construct a phylogenetic network based on our alignment of mtDNA.

Genetic distances & Isolation-by-distance.
In order to test for isolation-by-distance in multiple members of the Crotalus durissus group, first we georeferenced the locality data for all samples in our dataset with the use of Google Earth Pro.In a small number of cases, the locality information was incomplete or lacking and the locality information is tentative (for example, in the case of the single individual of Crotalus unicolor, the locality is "Aruba", so we georeferenced this locality in the center of the island).Locality information and coordinates are provided in Appendix 1.We then obtained Euclidian distances between samples with the use of the earth.distfunction in the R package fossil (Vavrek, 2011).We used the program MEGA X (Kumar et al., 2018) to estimate pairwise genetic distances (uncorrected P distances) of the mitochondrial data (Cytb + ND4), including all codon positions, both transitions and transversions, and Gamma distributed rates among sites.Genetic distances and geographic distances between samples are provided in Appendix 2 and 3. We assessed if isolation-by-distance (IBD) was present in the C. durissus group by testing for correlation between geographic distances and genetic distances, with the use of Mantel tests in the R package Vegan (Oksanen et al., 2013), as well as Multiple Matrix Regression with Randomization analysis, performed with the R function MMRR (Wang, 2013).
Additionally, we estimated pairwise genetic distances in ND4 for additional species of Crotalus which we obtained from GenBank.We only included ND4 because this has traditionally been the most common gene used in rattlesnake systematics and was the only gene with a more thorough taxonomic coverage of the group.A list of all taxa and samples included is provided in Appendix 4.

Species delimitation analysis.
To test previous species delimitations in the Crotalus durissus group, we employ a novel method DELINEATE (Sukumaran et al., 2021), which used a protracted speciation model (PSM).This method allows us to incorporate species limits obtain by other sources of data into the species delimitation analysis.However, this approach requires a good understanding of the current taxonomy of a particular group.The program's output is a topology in which each individual is assigned to either previously described species or to potentially new taxa.
First, we used BEAUTI, part of the BEAST software package (Drummond & Rambaut, 2007) to set up our StartBeast2 configuration file.For this analysis we used only the mitochondrial genes Cytb and ND4, with a single strict clock model of substitution across genes, and a HKY+G substitution model.Due to the computational burden of the analysis, we reduced our dataset to include only four individuals per species and additionally included multiple individuals of the Crotalus molossus species group, which are sister to the C. durissus group (Reyes-Velasco et al., 2013).We ran four replicates of the analysis, which consisted of 400 million generations, sampling every 400,000 generations, for a total of 1000 trees.
We then used the sumtrees.pyscript, part of the DendroPy library (Sukumaran & Holder, 2010) to discard the first 250 trees as burn-in, combine the runs and use the Maximum Clade Credibility Tree to summarize the resulting trees.We then prepared our population assignment input file for DELINEATE and constrained all currently recognized species, except for Crotalus ehecatl, C. mictlantecuhtli, C. simus, C. unicolor and C. vegrandis.We tested these last two species, as preliminary phylogenetic analysis showed that they were nested within C. durissus (see below), so we wanted to test their validity as species.In the case of the Crotalus molossus group, we constrained all species, except for the three subspecies of C. molossus (C.m. molossus, C. m. nigrescens & C. m. oaxacus) to test their species assignment.

Alignments.
Our resulting alignments for each gene were the following: 751bp (Cytb), 663bp (ND4) and 594bp (c-mos).In the Tabla 1. Modelos de sustitución para los genes utilizados en este estudio.mtDNA gene network & genetic distances.When we visualized the mtDNA data as a genetic network in SplitsTree (Fig 3), we recovered the same general patterns as in the Bayesian Inference analysis of mtDNA, however the degree of low genetic variation between many of populations in the Crotalus durissus group were case of the mtDNA genes, the number of parsimony-informative sites was 107 for Cytb and 162 for ND4, which represents 14% and 24% of all sites, respectively.In the case of c-mos, only 19 sites were parsimony-informative, which represents only 3% of all sites.
Phylogenetic analyses.Our phylogenetic analyses of the mtDNA and mtDNA + c-mos datasets resulted in identical topologies, which are in concordance with other studies of the same group, which is not surprising as the sequences used in the different studies are nearly the same, and any signal in the nuclear gene c-mos becomes overwritten by the mtDNA data, thus here we only present the mtDNA phylogeny (Fig. 2A).We recovered four main clades in the C. durissus group: The first clade included all individuals of Crotalus tzabcan from the Yucatan peninsula and Belize and received strong support (posterior probability, pp = 1).This clade was sister to a second well supported clade which  All the subspecies of Crotalus durissus, including C. unicolor and C. vegrandis had genetic distances (uncorrected p distances) of only 0.018 -0.036 when compared to populations of C. simus from Mexico and Central America (Table 1) Similarly, C. culminatus and C. ehecatl had very limited genetic differences between each other, ranging from 0.026 to 0.049.These genetic distances are similar to those shown by intraspecific populations of other rattlesnakes (Appendix 4).On the other hand, C. tzabcan, which is not clearly isolated from other members of the group in northern Central America or central Veracruz, showed rather significant genetic differences with all other members of the group, ranging from 0.06 to 0.9, which are on par with interspecific differences in other rattlesnake taxa (Appendix 4).If only individuals of C. culminatus were analyzed, the relationship was strong in the Mantel test (Mantel statistic (r) = 0.66), but not in the MMRR analysis (r = 0.4355), however, neither the Mantel test or the MMRR analysis was significant (p = 0.058; p = 0.057, respectively).When comparing Crotalus simus and C. durissus (including C. vegrandis and C. unicolor), the relationship was weak or non-existent (Mantel statistic (r) = 0.12; MMRR (r) = 0.01) and not significant (p = 0.132; p = 0.18, respectively).Similar weak and non-significant relationships were found when including only individuals of C. tzabcan (Mantel statistic (r) = 0.03; MMRR (r) = 0.0009) or C. durissus (including C. vegrandis and C. unicolor; Mantel statistic (r) = 0.246; MMRR (r) = 0.6).These results show that at least in some member of the C. durissus group, isolation-by-distance might be responsible of the genetic structure observed in the mitochondrial genes.

Species delimitation analysis.
The results of our species delimitation analysis in DELINEATE supported the recognition of Crotalus mictlantecuhtli as a distinct species, on the other hand, it included the samples of C. ehecatl as part of C. culminatus Figura 4. Gráfica de aislamiento por distancia (APD) que muestra la relación entre las distancias geográficas y genéticas de los miembros del grupo de especies de Crotalus durissus.La línea representa la correlación entre la distancia geográfica y la distancia genética (r).La significancia estadística se estimó con una prueba de Mantel.

DISCUSSION
In this study we evaluate if isolation-by-distance could be responsible for the genetic variation observed at the mitochondrial level in members of the Crotalus durissus species group.We do this by sequencing additional individuals,  Reyes-Velasco et al. -Species of the Crotalus durissus group combined with previously published mitochondrial and nuclear data.Our results show that the genetic differences between some of the species in the group (C.simus -C.durissus and C. culminatus -C.ehecatl) are lower than intra-specific genetic distances (Appendix 2).The only nuclear gene at hand for the C. durissus group (c-mos) is not informative for distinguishing between many species of rattlesnakes (Fig. 2B), and thus it is of little use when it comes to taxonomic decisions.Additionally, in the case of Crotalus ehecatl and C. culminatus, morphological differences were also minimal, and the characters analyzed overlapped almost completely (see Carbajal-Márquez et al., 2020, Fig. 4).
Isolation-by-distance (IBD) if often overlooked in population genetics (Meirmans, 2012).In the case of some members of the Crotalus durissus group, IBD might be responsible for the apparent genetic differences between populations, which have previously been inferred to be inter-specific differences (Fig. 4).The individuals of C. culminatus and C. ehecatl from which mtDNA sequences were analyzed are separated by more than 950km (coast of Michoacán vs Oaxaca-Chiapas border; Fig. 1).There are records for C. culminatus along the Pacific coast of Mexico, from the Colima-Michoacán border to eastern Oaxaca (Campbell & Lamar, 2004), followed by an apparent gap of approximately 300 km between Marquelia, Guerrero and Santa Maria Huatulco, Oaxaca (the western most record C. ehecatl).However, multiple research grade observations for Crotalus exist along the coast of Oaxaca in the Inaturalist database (Fig. 1), which shows that the distribution of C. culminatus is continuous and no gap exist between C. culminatus and C. ehecatl.Unfortunately, no DNA sequences exist for these samples.Our analyses of IBD show that there is a strong correlation between genetic and geographic distances in the C. culminatus group (Fig. 4 & Appendix 5), and thus could be the cause of the apparent genetic distinctiveness between C. culminatus and C. ehecatl.Despite the fact that we did not find evidence for IBD between C. simus and C. durissus, it is still possible that IBD exists, but it is not detectable with mitochondrial markers alone (Teske et al., 2018).Several recent studies have shown that if IBD is not taken into account, genetic differences can drastically affect the results of species delimitation methods, and over-split species (Gratton et al., 2016;Mason et al., 2020), which could be the case in some members of the C. durissus group.
We use the program DELINEATE as a way to test for current species hypothesis in the group.This analysis did not recognize Crotalus ehecatl as distinct from C. culminatus but did recover C. mictlantecuhtli as a separate lineage.DELINEATE also failed to recover C. simus, C. unicolor and C. vegrandis as distinct from C. durissus.However, all these results should be taken with caution, as our analysis included only two mitochondrial genes, and hybridization or mitochondrial introgression could be responsible for this patterns.Campbell and Lamar (2004) split Crotalus simus and C. durissus into two different species.These species are separated by ~900 km in Central America, and both appear to be absent in Panamá.However, the mtDNA differences between these species are minimal, and much smaller than population level differences in other species (see Appendix 4).Wüster et al. (2005) and Quijada-Mascareñas et al., (2007) suggested that the invasion of South America by the C. durissus occurred relatively recently (~1 mya), probably facilitated by a continuous corridor of dry savannahs.However, the divergence time estimated by Wüster et al. (2005) for the split between the C. durissus and the C. molossus groups is much older (10-14 mya) than that estimated by Reyes-Velasco et al., (2013) of ~3-6 mya, however, these divergence dates should be taken with caution.A much younger split between these groups could indicate that the invasion of South America by C. durissus might be younger than expected, and probably occurred faster than previously suggested, which would explain the almost non-existent differences in the mtDNA data and the results of the species delimitation analysis.

CONCLUSION
Species are, by definition, hypotheses (De Queiroz, 2007;Samadi & Barberousse, 2006), therefore, the more evidence in favor of a hypothesis, the better species is supported.Just as when testing any hypothesis, careful consideration should be taken into account when describing new species.It is relatively easy to describe new taxa, especially when only a few molecular markers are analyzed for diagnosing between species.However, revising the evidence for such taxonomic changes can be very time consuming, and many people are not willing, nor have the time for these taxonomic revisions.Therefore, before describing new taxa, is important to consider all available evidence in favor or against any taxonomic designations.We do not argue that differences in mtDNA or on a few markers should be used as a yardstick to define what species are valid and what species are not, on the contrary, we argue against this type of approach, and thus we understand the limitations of our own analyses.Therefore, our goal with this study is not to conclude how many species should be considered valid in the Crotalus durissus species group, but to show that the data at hand can be interpreted in multiple ways.We believe that additional sampling, combined with more thoroughly molecular analyses are necessary to better understand geneflow and species limits in the C. durissus group, especially in areas of possible contact such as the southern part

Figure 1 .
Figure 1.Distribution map of members of the Crotalus durissus species group.Samples with genetic data are represented by colored circles, while other museum records with no genetic data available are shown as black dots.Open circles indicate additional records for C. culminatus and C. ehecatl obtained from inaturalist.orgas research grade observations, available in the gbif.orgdatabase.There is uncertainty regarding the type localities of C. durissus and C. simus and are thus not indicated.
-Velasco et al. -Species of the Crotalus durissus group Recently, Carbajal-Márquez et al. (2020) revised the taxonomy of the C. durissus group with additional sampling in Mexico.Based on mtDNA data and some morphological evidence, Carbajal-Márquez et al. (2020) split C. culminatus into three species, the nominal species C. culminatus, as well two new species, Crotalus ehecatl from southern Oaxaca and Chiapas, and Crotalus mictlantecuhtli from central Veracruz.These species were based on populations previously referred to as C. simus culminatus (C.ehecatl) and C. simus simus (C.mictlantecuhtli) by Campbell and Lamar (2004).Despite the additional sampling in the Carbajal-Márquez et al. ( consisted of Crotalus durissus (all subspecies), C. vegrandis and C. unicolor (which were nested within C. durissus), as well as all C. simus from southern Chiapas and Central America.The third clade consisted of individuals of C. mictlantecuhtli from Veracruz and was sister to the last clade which was composed of C. culminatus plus C. ehecatl.When we analyzed the nuclear gene c-mos alone, we recover a polytomy which included most members of the C. durissus group, as well as other species of rattlesnakes (Fig 2B).These results show that c-mos lacks phylogenetic signal and thus is of little use for solving the phylogenetic relationships of the C. durissus group.

Figure 2 .
Figure 2. Phylogenetic inference of members of the Crotalus durissus species group.A) Bayesian phylogenetic inference based on the concatenated mtDNA genes Cytb and ND4.B) Bayesian phylogenetic inference based on the nuclear locus C-mos.Black circles represent nodes with a posterior support of 1.

Figure 3 .
Figure 3. Splits tree network of the concatenated mtDNA genes Cytb and ND4 of members of the Crotalus durissus species group.Images represent multiple species of the C. durissus group.

Figure 4 .
Figure 4. Isolation by distance (IBD) plot showing relationship between geographic and genetic distances for members of the Crotalus durissus species group.Line represents correlation between geographic and genetic distance (r).Statistical significance was estimated with a Mantel test.

Figure 5 .
Figure 5. Results of the species delimitation analysis performed in DELINEATE.Clades in blue represent constrained species, while clades in yellow represent those lineages whose species designations were tested.Tips with species assignation as "DelineatedSp" represent species that were tested which the program recognizes as distinct, while names in parenthesis represent their previous species assignation and locality.

Table 1 .
Best-fits models of substitution for the genes used in this study.