Syst Biol. 2018 Jul; 67(4): 616–632.
Information Criteria for Comparing Partition Schemes
Tae-Kun Seo
1 Department of Biological Sciences, Korea Polar Research Institute, 26 Songdomirae-ro, Yeonsu-gu, Incheon 406-840, Republic of Korea
Jeffrey L Thorne
2 Bioinformatics Research Center, Box 7566, North Carolina State University, Raleigh NC 27695-7566, USA
Stephen Smith, Associate Editor
Received 2017 Jun 16; Revised 2017 Dec 7; Accepted 2017 Dec 17.
Abstract
When inferring phylogenies, one important decision is whether and how nucleotide substitution parameters should be shared across different subsets or partitions of the data. One sort of partitioning error occurs when heterogeneous subsets are mistakenly lumped together and treated as if they share parameter values. The opposite kind of error is mistakenly treating homogeneous subsets as if they result from distinct sets of parameters. Lumping and splitting errors are not equally bad. Lumping errors can yield parameter estimates that do not accurately reflect any of the subsets that were combined whereas splitting errors yield estimates that did not benefit from sharing information across partitions. Phylogenetic partitioning decisions are often made by applying information criteria such as the Akaike information criterion (AIC). As with other information criteria, the AIC evaluates a model or partition scheme by combining the maximum log-likelihood value with a penalty that depends on the number of parameters being estimated. For the purpose of selecting an optimal partitioning scheme, we derive an adjustment to the AIC that we refer to as the $\mathrm{AIC}_p$ and that is motivated by the idea that splitting errors are less serious than lumping errors. We also introduce a similar adjustment to the Bayesian information criterion (BIC) that we refer to as the $\mathrm{BIC}_p$. Via simulation and empirical data analysis, we contrast AIC and BIC behavior to our suggested adjustments. We discuss these results and also emphasize why we expect the probability of lumping errors with the $\mathrm{AIC}_p$ and the $\mathrm{BIC}_p$ to be relatively robust to model parameterization.
Keywords: AIC, BIC, information criteria, model comparison, multilocus analysis, partition scheme comparison, phylogenomics
Probabilistic models of DNA and protein sequence change have central roles in phylogenetics and molecular evolution. Many models have been proposed, but it is not always obvious which are most appropriate. With multilocus data, model choice can become especially difficult because the best model for one subset of the data may be suboptimal for another. The challenge of how to organize data into subsets that should be analyzed with shared parameter values has become increasingly central as improvements in DNA sequencing technology have led to bigger data sets.
A simple and intuitive approach is to concatenate multilocus sequence data and regard the merged result as a single locus. Yet, drawbacks of this "concatenation" procedure exist and "separate analysis" has been suggested as an alternative (e.g. Adachi et al. 2000; Cao et al. 2000a,b; Nikaido et al. 2003; Nishihara et al. 2007). In separate analysis (also referred to as partitioned analysis), some model parameters may be shared among loci but other parameters may be distinct to individual loci. The concatenation approach has been criticized for ignoring heterogeneity among loci because different genes may support different tree topologies for reasons ranging from incomplete lineage sorting to horizontal gene transfer to introgression (Leigh et al. 2011; Anderson et al. 2012). With concatenation, locus-specific information is lost. In addition, evolutionary heterogeneity among loci is not limited to topology. Even when multiple loci support the same topology, natural selection and/or mutation can cause the nucleotide substitution process to vary among loci. For instance, when multiple loci are generated by an identical tree topology but according to different branch lengths, ignoring the resulting heterotachy (Lopez et al. 2002) can cause the inferred topology to differ from the one that is shared among the individual loci (e.g. Chang 1996; Kolaczkowski and Thornton 2004; Seo 2008). For various reasons, it would be preferable to handle variation among loci when variation exists.
Although separate analysis can have advantages, it also has disadvantages. Because evolutionary parameters are separately estimated for each locus, the total number of estimated parameters can be much greater for separate analysis than for concatenation. There is a trade-off between model fit and estimation uncertainty (Hastie et al. 2009). As the number of parameters increases, model fit tends to improve but the variances of the estimated parameters become bigger.
While they did not focus on partitioned analysis, Lemmon and Moriarty (2004) did a careful simulation study regarding underparameterization and overparameterization in phylogenetics. They found that both have negative consequences, but those from underparameterization were more severe. Thus, it is crucial to balance the number of parameters and estimation uncertainty. Another drawback of separate analyses is that they require more computation than concatenation. Therefore, for a given set of multilocus data, careful consideration is needed for whether all loci should be handled separately or whether some should be concatenated. One goal is to determine an optimal "partition scheme," in which data partitions with similar evolutionary properties are merged and in which partitions with dissimilar properties are not merged.
Previous studies (Li et al. 2008; Lanfear et al. 2012, 2014) have noted that partition schemes can be regarded as statistical models and compared via conventional model selection criteria such as the likelihood ratio test (LRT), the Akaike information criterion (AIC) (Akaike 1974), or the Bayesian information criterion (BIC) (Schwarz 1978). These criteria can then be employed as part of an exhaustive search that examines all possible partition schemes to find the best one. Alternatively, a heuristic search can be used along with the selection criteria. This is often desirable because the number of possible partition schemes grows rapidly as the number of loci increases. A Bayesian approach to simultaneously estimate the number of partitions and their model parameters (Wu et al. 2012) is another possible alternative.
Two sorts of errors can be made in phylogenetic partitioning. These errors are lumping partitions together when they should be separately treated (i.e. underparameterization) and splitting loci into distinct subsets when they should be grouped together (i.e. overparameterization). Lumping errors can yield bias such as systematic errors in topology estimation (e.g. Chang 1996; Kolaczkowski and Thornton 2004; Seo 2008) whereas splitting errors can cause additional parameter uncertainty. Here, we introduce and justify two model selection criteria that we term the $\mathrm{AIC}_p$ and $\mathrm{BIC}_p$ because they are modifications, designed for partitioning decisions, of the widely-used AIC and BIC. The $\mathrm{AIC}_p$ and the $\mathrm{BIC}_p$ are predicated on the assumption that lumping errors have more serious consequences than splitting errors.
Theory
We motivate the new information criteria by showing the connection between the $\mathrm{AIC}_p$ and a likelihood ratio test of the null hypothesis of partition homogeneity versus the alternative of partition heterogeneity. We then review the connection between the AIC and the Kullback–Leibler divergence. We follow this with an explanation of how the $\mathrm{AIC}_p$ relates to the Kullback–Leibler (KL) divergence between the truth and a particular partition scheme of interest when there is homogeneity between partitions. We complete this section by introducing the $\mathrm{BIC}_p$.
Introduction and the Likelihood Ratio Test as Motivation
Suppose that a $K$-partition scheme is to be compared with the "concatenated" 1-partition scheme. The parameter vector of model $M_k$ for partition $k$ of the $K$-partition scheme will be denoted $\theta_k$. The log-likelihood of the $k$th partition with model $M_k$ is defined as
$$\ell_k(\theta_k \mid M_k) = \sum_{i=1}^{n_k} \log f(x_{ki} \mid \theta_k, M_k),$$
where $x_{ki}$ is the $i$th aligned sequence column of the $k$th partition and $n_k$ is the sequence length of the $k$th partition. When the entire collection of possible partitions is being considered, we use the summation sign $\Sigma$ to represent this 1-partition concatenation scheme. The model for the 1-partition scheme is termed $M_\Sigma$ and its log-likelihood is
$$\ell_\Sigma(\theta_\Sigma \mid M_\Sigma) = \sum_{k=1}^{K} \sum_{i=1}^{n_k} \log f(x_{ki} \mid \theta_\Sigma, M_\Sigma).$$
We discuss generalizations later, but we begin by assuming that the concatenated model $M_\Sigma$ and the models $M_k$ for each partition $k$ share the same parameterization (i.e. topology and substitution model) but may differ in parameter values. For example, all models could share the same topology and all could have the HKY (Hasegawa et al. 1985) parameterization ($M_1 = \cdots = M_K = M_\Sigma$).
The maximum likelihood estimators (MLEs) of the $k$th partition and of the 1-partition scheme are:
$$\hat{\theta}_k = \arg\max_{\theta_k} \ell_k(\theta_k \mid M_k), \qquad \hat{\theta}_\Sigma = \arg\max_{\theta_\Sigma} \ell_\Sigma(\theta_\Sigma \mid M_\Sigma),$$
where $\arg\max_{\theta} \ell(\theta \mid M)$ implies "the values of the parameters $\theta$ of model $M$ that maximize $\ell(\theta \mid M)$". The $\hat{\theta}_k$ estimators stand for MLEs obtained from the $k$th partition with model $M_k$. Note that $\hat{\theta}_k$ and $\hat{\theta}_\Sigma$ are expected to be unequal because the former values are obtained from only the $k$th partition whereas the latter values are obtained from the entire data set.
A limitation of our notation is that it does not reflect the frequently arising situation where some parameters are forced to have values that are shared among all partitions and other parameters are allowed to have values that are unique to each partition. While this limitation could be corrected by expanding the notation, we choose to adopt our simpler notation and instead focus on the cases where either all parameter values are shared among partitions or all have distinct values for each partition. Still, as will be emphasized later, the concepts that we discuss and the theory that justifies the $\mathrm{AIC}_p$ and $\mathrm{BIC}_p$ also apply to the case where some parameters have values that are shared among partitions and others do not.
Consider the likelihood ratio test when the null hypothesis is a single partition (i.e. concatenation) and when the more general hypothesis is that each of the $K$ partitions shares the parameterization of the 1-partition model but that each has its own parameter values. Because the null hypothesis is a special case that is nested within the more general one, the likelihood ratio test statistic can be approximated as having a $\chi^2$ distribution when the null hypothesis is true. Specifically,
$$2\left[\sum_{k=1}^{K} \ell_k(\hat{\theta}_k \mid M_k) - \ell_\Sigma(\hat{\theta}_\Sigma \mid M_\Sigma)\right] \sim \chi^2_{(K-1)p} \tag{1}$$
where $p$ parameters are estimated according to the null hypothesis and where $p$ parameters are separately estimated for each partition in the $K$-partition scheme. When the null hypothesis of partition homogeneity is correct, Equation (1) implies
$$\ell_\Sigma(\hat{\theta}_\Sigma \mid M_\Sigma) \stackrel{E}{=} \sum_{k=1}^{K} \ell_k(\hat{\theta}_k \mid M_k) - \frac{(K-1)p}{2} \tag{2}$$
because $(K-1)p$ is the expected value of a $\chi^2_{(K-1)p}$ random variable. Here, the "$\stackrel{E}{=}$" sign indicates that the expectations are equal. Therefore,
$$\ell_\Sigma(\hat{\theta}_\Sigma \mid M_\Sigma) - p \stackrel{E}{=} \sum_{k=1}^{K} \ell_k(\hat{\theta}_k \mid M_k) - p - \frac{(K-1)p}{2} \tag{3}$$
$$= \sum_{k=1}^{K} \ell_k(\hat{\theta}_k \mid M_k) - \frac{(K+1)p}{2} \tag{4}$$
Equation (2) shows that the expected log-likelihood of a concatenated model can be used to approximate the expected log-likelihood of a $K$-partition model when partitions are homogeneous (or vice versa). The $p$ term on the left side of Equation (3) happens to be the $\mathrm{AIC}_p$ penalty for a concatenated model. It is the same penalty as would be assigned by the AIC. The $(K+1)p/2$ term on the right side of Equation (4) is the $\mathrm{AIC}_p$ penalty for a $K$-partition model and the $(K-1)p/2$ term on the right side of Equation (3) reflects the extra cost of the act of partitioning. In contrast, the AIC penalty for a $K$-partition model would be $Kp$ and this can be much heavier than the $\mathrm{AIC}_p$ penalty when $K$ is large. As a result, the AIC is more prone than the $\mathrm{AIC}_p$ to favoring concatenated models.
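As a small worked illustration of these penalties, the arithmetic below uses hypothetical values $K = 4$ and $p = 10$, chosen only to make the comparison concrete. All quantities are on the log-likelihood scale of Equations (3) and (4).

```latex
\begin{align*}
\text{penalty for the concatenated model (AIC and } \mathrm{AIC}_p\text{):} \quad & p = 10 \\
\text{extra cost of the act of partitioning:} \quad & \tfrac{(K-1)p}{2} = \tfrac{3 \cdot 10}{2} = 15 \\
\mathrm{AIC}_p \text{ penalty for the } K\text{-partition model:} \quad & \tfrac{(K+1)p}{2} = 10 + 15 = 25 \\
\text{AIC penalty for the } K\text{-partition model:} \quad & Kp = 40
\end{align*}
```

Under these assumed values, partitioning must improve the summed log-likelihood by more than 15 units to be preferred by the $\mathrm{AIC}_p$, but by more than 30 units to be preferred by the AIC.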
While the LRT approach to model comparison has some attractive features, one limitation is that the LRT approach assumes that the more restricted "nested" statistical model is true. The AIC merely aims to select the candidate that is closest to the truth and it relies on an approximation that is expected to be "excellent" when the assumed model is "good" (Burnham and Anderson 2002). Also, the $\mathrm{AIC}_p$ differs from the LRT that inspires Equation (4) because it is not predicated on the substitution model and tree topology being correct (see Part 1 of the Appendix). Yet, Equation (3) illustrates that the LRT and the $\mathrm{AIC}_p$ penalty are closely connected when the substitution model and topology do happen to be correct.
Kullback–Leibler Divergence (KLD) and the AIC
Suppose that $n$ homogeneous samples, $x_i$ ($i = 1, \ldots, n$), are obtained from the true but unknown distribution $g$, and we want to fit them with model $f(\cdot \mid \theta)$. The KLD between $g$ and $f(\cdot \mid \theta)$ is
$$\mathrm{KL}(g, f_\theta) = E_g[\log g(X)] - E_g[\log f(X \mid \theta)] \tag{5}$$
The model whose KLD is minimized is regarded as the best among model candidates. Because $E_g[\log g(X)]$ does not vary among models, Equation (5) shows that the best model must be the one that maximizes $E_g[\log f(X \mid \theta)]$. Akaike (1974) showed that $n E_g[\log f(X \mid \hat{\theta})]$ can be approximated using the MLE $\hat{\theta}$ and an adjustment for the number of parameters that are estimated. Written in our notation, Akaike (1974) showed
$$E\left[n E_g[\log f(X \mid \hat{\theta})]\right] \approx E\left[\sum_{i=1}^{n} \log f(x_i \mid \hat{\theta}) - p\right] \tag{6}$$
$$= E\left[\ell(\hat{\theta} \mid M) - p\right] \tag{7}$$
where $p$ is the dimension of $\theta$ and where the approximation requires $n$ to be large. In Equation (7), the bracketed term is the unbiased estimator of $n E_g[\log f(X \mid \hat{\theta})]$. The $p$ in this estimator corrects the bias and is necessary because the data set is used to get $\hat{\theta}$ and is then used again to approximate $E_g[\log f(X \mid \hat{\theta})]$ (Konishi and Kitagawa 2004). The AIC is defined as
$$\mathrm{AIC} = -2\,\ell(\hat{\theta} \mid M) + 2p \tag{8}$$
and is derived from this unbiased estimator. When there are competing models, the one that maximizes its unbiased estimator of $n E_g[\log f(X \mid \hat{\theta})]$ (or equivalently that minimizes its AIC) is regarded as the best model. Applying the definition of Equation (8), the AIC scores of concatenated data are written in our notation as
$$\mathrm{AIC}^{\Sigma} = -2\,\ell_\Sigma(\hat{\theta}_\Sigma \mid M_\Sigma) + 2p \tag{9}$$
where $p$ is the dimension of $\theta_\Sigma$ (see also Lanfear et al. 2012, 2014; Li et al. 2008). For phylogenetic applications, the number of free parameters can be separated as
$$p = b + q \tag{10}$$
where $b$ is the number of tree branches and $q$ is the number of the remaining parameters (e.g. parameters affecting nucleotide frequencies, transition/transversion ratios, rate heterogeneity among sites, etc).
Writing $M_P = (M_1, \ldots, M_K)$ to signal that each partition $k$ of the $K$ total partitions has its own model $M_k$, the AIC scores of partitioned data (Lanfear et al. 2012, 2014) are
$$\mathrm{AIC}^{P} = -2 \sum_{k=1}^{K} \ell_k(\hat{\theta}_k \mid M_k) + 2p_P \tag{11}$$
where $p_P$ is the number of freely estimated parameters in the partition scheme. In the exhaustive comparison of partition schemes, the scores of Equation (11) are calculated for all candidate schemes and the minimum score determines the best one. Widely-used heuristic algorithms for choosing partition schemes (Li et al. 2008; Lanfear et al. 2012, 2014) involve repeatedly comparing a 2-partition scheme with a 1-partition scheme. Different $b$ values in Equations (9) and (11) could be needed if the 1-partition and $K$-partition schemes adopted different tree topologies (e.g. a fully bifurcating topology in one scheme and a "star topology" in the other).
When analyzing multilocus data, a model in which corresponding branch lengths of different partitions differ only by a proportionality constraint (Yang 1996; Pupko et al. 2002) is often adopted. We will refer to this as the "linked branch length" (LBL) model (see also Lanfear et al. 2012).
With the LBL model, a set of branch lengths that will influence all partitions is estimated, as is a proportionality factor for each partition. The sum of proportionality factors is typically restricted to ensure uniqueness of MLEs and the degrees of freedom for the proportionality factors is $K - 1$ when there are $K$ partitions. A model that has branch lengths being independently and separately estimated for all partitions can be used as an alternative to a proportional model. This model will be referred to as the "unlinked branch length" (UBL) model (see also Lanfear et al. 2012).
With multiple partitions, the LBL model can substantially reduce the number of free parameters because the number of free parameters for branch lengths is $b + K - 1$ with the LBL model whereas it is $Kb$ for the UBL model. However, the more limited flexibility of the LBL model means it is less satisfactory than the UBL model for some data sets (e.g. Pupko et al. 2002). For the UBL and LBL treatments,
$$p_P = \begin{cases} K(b + q) & \text{(UBL)} \\ b + (K - 1) + Kq & \text{(LBL)} \end{cases} \tag{12}$$
Here, we focus on comparing a $K$-partition scheme with a 1-partition scheme. The aforementioned heuristic algorithms would have $K = 2$ but larger values of $K$ are also possible. Using AIC, the 1-partition scheme is selected when $\mathrm{AIC}^{\Sigma} < \mathrm{AIC}^{P}$ and the $K$-partition scheme is selected when the inequality is in the other direction.
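The AIC comparison described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the parameter counts assume the UBL count $K(b+q)$ and the LBL count $b + (K-1) + Kq$, every partition is assumed to share the parameterization of the concatenated model, and the log-likelihood values in the usage note are invented.

```python
def free_parameters(K, b, q, linked):
    """Free-parameter count of a K-partition scheme (assumed form of Eq. 12).

    b: number of branches, q: number of remaining substitution parameters.
    linked=True gives the LBL count, linked=False the UBL count.
    """
    if linked:
        return b + (K - 1) + K * q
    return K * (b + q)


def aic(loglik, n_params):
    """Standard AIC: -2 * maximized log-likelihood + 2 * number of parameters."""
    return -2.0 * loglik + 2.0 * n_params


def choose_by_aic(loglik_concat, logliks_by_partition, b, q, linked=False):
    """Compare the 1-partition and K-partition schemes with the AIC.

    Ties are resolved in favor of the K-partition scheme, which is an
    arbitrary choice made only for this sketch.
    """
    K = len(logliks_by_partition)
    aic_concat = aic(loglik_concat, b + q)
    aic_part = aic(sum(logliks_by_partition), free_parameters(K, b, q, linked))
    choice = "1-partition" if aic_concat < aic_part else "K-partition"
    return choice, aic_concat, aic_part
```

For example, with invented values `choose_by_aic(-1000.0, [-245.0] * 4, b=5, q=5)`, the UBL treatment keeps the concatenated scheme, whereas the same log-likelihoods with `linked=True` favor the 4-partition scheme because the LBL treatment adds far fewer parameters.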
Kullback–Leibler Divergence and the $\mathrm{AIC}_p$
Our $\mathrm{AIC}_p$ modification of the AIC has penalty terms that are intended to reflect how much improvement in model fit would be expected if partitions are homogeneous but sequence data are analyzed by separately estimating parameters from each partition. These penalties can therefore be interpreted as stemming from the act of partitioning. Our suggestion is that the best candidate partition scheme should be selected after accounting for the act of partitioning. The $\mathrm{AIC}_p$ penalty is not as heavy as the AIC penalty of $2p_P$ in Equation (11). Heavy penalties favor simple models (e.g. concatenation) and can lead to selection of a 1-partition model even when partitions are highly heterogeneous. The idea underlying the $\mathrm{AIC}_p$ is to have a model selection criterion that places all candidate partition schemes on an equal footing by using the appropriate bias correction term according to partition homogeneity. After this bias adjustment, we reason that the best partition scheme is the one with the optimal score.
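Consider the situation in which $n$ samples are separated into $K$ partitions. The size of each partition will be $n_k$ ($k = 1, \ldots, K$) and the MLE of the 1-partition scheme will be denoted $\hat{\theta}_\Sigma$. Writing $g_k$ for the true distribution of the $k$th partition, the KLD for the $k$th partition is
$$\mathrm{KL}_k = E_{g_k}[\log g_k(X)] - E_{g_k}[\log f(X \mid \theta_k)] \tag{13}$$
A weighted average of Equation (13) with weights $n_k / n$ yields the KLD when the model of interest has heterogeneous partitions,
$$\mathrm{KL}_P = \sum_{k=1}^{K} \frac{n_k}{n}\, \mathrm{KL}_k \tag{14}$$
Minimizing the AIC of the UBL parameterization in Equation (12) can therefore be seen to be equivalent to minimizing $\mathrm{KL}_P$.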
Now, consider the case where the UBL model is used but the partitions are really homogeneous. Part 2 of the Appendix outlines one reason why the drawbacks of incorrectly partitioning homogeneous data do not seem especially severe. From Equation (7) and the argument outlined in Part 3 of the Appendix, we can show that the proper bias correction for "partitioning homogeneous data" is $(K+1)p/2$:
$$E\left[n E_g[\log f(X \mid \hat{\theta}_\Sigma)]\right] \approx E\left[\sum_{k=1}^{K} \ell_k(\hat{\theta}_k \mid M_k) - \frac{(K+1)p}{2}\right] \tag{15}$$
This means that either a 1-partition scheme or a $K$-partition scheme can be employed when there is homogeneity among partitions, but the proper bias correction depends on the number of partitions.
The bias correction for partitioning homogeneous data can be contrasted to a bias correction when partitions are heterogeneous. Whereas the bias correction for partitioning homogeneous data is $(K+1)p/2$, the more extreme correction for partitioning heterogeneous data is $Kp$. The bias correction for partitioning homogeneous data closely corresponds to the likelihood ratio test (see Equation (3)).
Based on this idea and the argument outlined in Part 3 of the Appendix, we can define the $\mathrm{AIC}_p$ scores of a $K$-partition scheme and a concatenated scheme. As above, we assume that each of the $K$ partitions has the same model parameterization as the concatenated sequences (i.e. $M_1 = \cdots = M_K = M_\Sigma$). With this assumption, we have the $\mathrm{AIC}_p$ of the $K$-partition scheme being
$$\mathrm{AIC}_p^{P} = -2 \sum_{k=1}^{K} \ell_k(\hat{\theta}_k \mid M_k) + 2p + \Delta \tag{16}$$
where $p = b + q$ and $\Delta = p_P - p$ represents the difference between the total number of parameters in the partition scheme ($p_P$) and the total number in the concatenation scheme ($p$). When a UBL model is adopted, $\Delta = (K-1)(b+q)$. When an LBL model is adopted, $\Delta = (K-1)(q+1)$. In contrast, the $\mathrm{AIC}_p$ for the concatenated model is
$$\mathrm{AIC}_p^{\Sigma} = -2\,\ell_\Sigma(\hat{\theta}_\Sigma \mid M_\Sigma) + 2p \tag{17}$$
In Equation (16), the summed log-likelihoods represent the model fit of the data and the remaining portion ($2p + \Delta$) works as a penalty. Larger $\Delta$ values arise from complex partition schemes and therefore complex schemes are accompanied by higher penalties. The $\mathrm{AIC}_p$ therefore accounts for the trade-off between partition fit and partition complexity. The 1-partition scheme is selected when $\mathrm{AIC}_p^{\Sigma} < \mathrm{AIC}_p^{P}$ and the $K$-partition scheme is selected when the inequality is in the other direction.
With the UBL model, the $\mathrm{AIC}_p$ score for the $K$-partition scheme is
$$\mathrm{AIC}_p^{P} = -2 \sum_{k=1}^{K} \ell_k(\hat{\theta}_k \mid M_k) + (K+1)p \tag{18}$$
We see from Equation (3) that the $\mathrm{AIC}_p$ scores of the concatenated model (see Equation (17)) and the $K$-partition model (see Equation (18)) are expected to be about the same when the partitions are homogeneous. If the $K$ partitions are heterogeneous, the $\mathrm{AIC}_p$ score of the $K$-partition model is therefore expected to be lower. For this reason, we suggest selecting the $K$-partition model when its score is less than that of the concatenated model.
We note that the $\mathrm{AIC}_p$ score of Equation (16) can be converted to the AIC score by replacing the $2p + \Delta$ term with the heavier penalty of $2(p + \Delta) = 2p_P$. For the UBL parameterization, one way to interpret the AIC and $\mathrm{AIC}_p$ penalties is that the AIC has a penalty of $2$ each time a parameter is estimated whereas the $\mathrm{AIC}_p$ has a penalty of $2$ the first time a parameter is estimated but a penalty of just $1$ for additional partitions from which the parameter is estimated. The fact that the penalty of $2p + \Delta$ in Equation (16) is lighter than that of $2p_P$, with $p_P$ given by Equation (12), implies that the $\mathrm{AIC}_p$ can detect heterogeneity better than the AIC. This is because a moderate improvement of model fit represented by $2\left[\sum_{k} \ell_k(\hat{\theta}_k \mid M_k) - \ell_\Sigma(\hat{\theta}_\Sigma \mid M_\Sigma)\right]$ can dominate the penalty term of $\Delta$ more easily than that of $2\Delta$.
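To make the contrast between the two penalties concrete, the sketch below scores one invented data set with both criteria. It is only an illustration under stated assumptions: the function names are hypothetical, the $\mathrm{AIC}_p$ score of a $K$-partition scheme is taken to be $-2\sum_k \ell_k + 2p + \Delta$ with $\Delta$ the number of extra parameters relative to concatenation, the AIC score uses the penalty $2(p + \Delta)$, and the log-likelihoods are made up.

```python
def aic_p_concat(loglik_concat, p):
    """AIC_p (and AIC) score of the concatenated model: -2*loglik + 2p."""
    return -2.0 * loglik_concat + 2.0 * p


def aic_p_partitioned(logliks, p, delta):
    """AIC_p score of a K-partition scheme: -2*sum(loglik) + 2p + delta.

    delta is the number of extra free parameters relative to concatenation,
    e.g. (K - 1) * p for UBL when all partitions share one parameterization.
    """
    return -2.0 * sum(logliks) + 2.0 * p + delta


def aic_partitioned(logliks, p, delta):
    """Ordinary AIC score of the same scheme: penalty 2 * (p + delta)."""
    return -2.0 * sum(logliks) + 2.0 * (p + delta)


# Invented example: K = 4 partitions, p = 10 parameters, UBL treatment.
K, p = 4, 10
delta_ubl = (K - 1) * p
loglik_concat = -1000.0
logliks = [-245.0] * K  # summed log-likelihood of -980, a gain of 20 units

concat_score = aic_p_concat(loglik_concat, p)
aicp_score = aic_p_partitioned(logliks, p, delta_ubl)
aic_score = aic_partitioned(logliks, p, delta_ubl)
```

With these invented numbers the concatenated score is 2020, the $\mathrm{AIC}_p$ score of the 4-partition scheme is 2010, and its AIC score is 2040, so the $\mathrm{AIC}_p$ prefers partitioning while the AIC prefers concatenation.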
Results and Discussion
Simulation Studies
We performed simulations to compare information criteria. All simulations employed the topology and branch lengths illustrated in Figure 1 to randomly generate data sets consisting of four partitions. To simulate partition homogeneity, all branches in all partitions were given one common length. To simulate heterogeneous partitions, all branches in partition $k$ ($k = 1, \ldots, 4$) were given a partition-specific branch length. The number of simulated sites in each partition was fixed in advance, which also fixes the number of sites in the concatenated (1-partition) scheme. Six settings of the number of sites were explored. The Seq-Gen software (Rambaut and Grassly 1997) was employed to randomly generate the data sets via the HKY substitution model (Hasegawa et al. 1985) and a five-category discrete-gamma treatment of rate heterogeneity among sites (Yang 1994b). The frequencies of A, C, G, and T were respectively 0.3, 0.2, 0.3, and 0.2 while the values of parameters for rate heterogeneity among sites ($\alpha$) and transition/transversion ratio ($\kappa$) were both set to 5.0.

All simulated data sets were analyzed both with a 4-partition scheme and a concatenated (1-partition) scheme. For each of the two schemes, each simulated data set was analyzed with each of eight nucleotide substitution models and each of these cases was explored both with rate homogeneity among sites and with discrete-gamma rate heterogeneity (five rate categories). The eight substitution models that were used for inference are denoted: JC (Jukes and Cantor 1969), K2 (Kimura 1980), F81 (Felsenstein 1981), F84 (Felsenstein 1989), HKY (Hasegawa et al. 1985), T92 (Tamura 1992), TN93 (Tamura and Nei 1993), and GTR (Yang 1994a). For the analyses of the simulated data, we did not attempt to find the best substitution model and tree topology. Instead, we compared the 1-partition scheme and 4-partition schemes when both assumed the true tree topology and when both assumed the same substitution model. Also, the UBL parameterization was used for investigating the 4-partition schemes so that branch lengths of each partition were independently estimated. Maximum likelihood inference was conducted via Version 4.8 of the baseml program of the PAML software (Yang 2007).
Table 1 shows how often the 4-partition scheme was selected with the AIC and the $\mathrm{AIC}_p$ when the truth was that partitions are homogeneous. Most of the simulation results for the AIC show a low proportion (<0.1%) of incorrectly selecting the 4-partition scheme. In contrast, the $\mathrm{AIC}_p$ incorrectly selects the 4-partition scheme about 50% of the time. This higher proportion is expected because the $\mathrm{AIC}_p$ criterion is designed with the idea that "splitting" errors (incorrectly separating partitions) are less serious than "lumping" errors (incorrectly concatenating partitions). The observation of an approximately 50% error rate is reasonable in light of the fact that the $\mathrm{AIC}_p$ criterion is designed so that its expected value is the same for a 4-partition and 1-partition scheme when the truth is partition homogeneity.
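The roughly 50% splitting rate can be illustrated without any phylogenetic software by using the chi-square approximation to the likelihood ratio statistic under homogeneity. The following Monte Carlo sketch is not the simulation reported here: it assumes the UBL treatment with $p$ parameters per partition, draws the statistic directly from a $\chi^2$ distribution with $(K-1)p$ degrees of freedom, and uses a hypothetical value of $p$. Under these assumptions the $\mathrm{AIC}_p$ splits when the statistic exceeds $(K-1)p$ and the AIC splits when it exceeds $2(K-1)p$.

```python
import random


def splitting_rates(K, p, n_rep=20000, seed=1):
    """Estimate how often AIC and AIC_p split truly homogeneous partitions.

    The LRT statistic 2 * (sum of partition log-likelihoods - concatenated
    log-likelihood) is drawn from its asymptotic chi-square distribution
    with df = (K - 1) * p, rather than from simulated sequence data.
    """
    rng = random.Random(seed)
    df = (K - 1) * p
    split_aic = 0
    split_aicp = 0
    for _ in range(n_rep):
        # A gamma variate with shape df/2 and scale 2 is chi-square with df.
        lrt = rng.gammavariate(df / 2.0, 2.0)
        if lrt > df:        # AIC_p: fit gain exceeds the penalty difference
            split_aicp += 1
        if lrt > 2 * df:    # AIC: penalty difference is twice as large
            split_aic += 1
    return split_aic / n_rep, split_aicp / n_rep


# Hypothetical setting: K = 4 partitions, p = 10 parameters per partition.
rate_aic, rate_aicp = splitting_rates(K=4, p=10)
```

In this sketch the $\mathrm{AIC}_p$ rate lands a little below one half because the median of a chi-square variable is slightly smaller than its mean, while the AIC rate is close to zero.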
Table 1.
Proportion of simulations where the 4-partition scheme was selected rather than the 1-partition (concatenation) scheme by AIC and by $\mathrm{AIC}_p$ when partitions were actually homogeneous
Table 2 shows simulation results for the AIC and the $\mathrm{AIC}_p$ when partitions are heterogeneous. In this situation, the 4-partition scheme should be selected over the 1-partition scheme and failure to do so is a "lumping error." Both the AIC and $\mathrm{AIC}_p$ criteria perform well when partitions consist of relatively large numbers of sites. But Table 2 reveals a marked contrast when partitions have fewer sites: there, the AIC makes a lumping error for nearly all simulated data sets whereas the $\mathrm{AIC}_p$ is quite unlikely to make these errors. The results show that the $\mathrm{AIC}_p$ has higher "sensitivity" than the AIC. That is, the $\mathrm{AIC}_p$ detects heterogeneity better than the AIC when partitions are heterogeneous. With regard to making lumping errors, we also note that the $\mathrm{AIC}_p$ appears to be relatively robust to model misspecification in comparison with the AIC.
Table 2.
Proportion of simulations where the 4-partition scheme was selected rather than the 1-partition (concatenation) scheme by AIC and by $\mathrm{AIC}_p$ when partitions were actually heterogeneous
When the number of sites per partition is in a biologically plausible range, both the BIC and $\mathrm{BIC}_p$ are more prone than the AIC to favor models that are less parameterized. The practical effect of the increased penalties on more parameterized models is that both the BIC and the $\mathrm{BIC}_p$ are less likely than the AIC to make splitting errors. The results of Table 1 show that the AIC is very unlikely in our simulation settings to make a splitting error by choosing the 4-partition scheme over the 1-partition scheme when the truth is partition homogeneity. Therefore, it is not surprising that we do not observe splitting errors with the heavier BIC and $\mathrm{BIC}_p$ penalties for our simulation conditions. Because splitting errors are not observed with the BIC and the $\mathrm{BIC}_p$, we do not include tables of them for these criteria.
Table 3 shows the performance of the BIC and $\mathrm{BIC}_p$ when the truth is that the heterogeneous 4-partition scheme should be selected. As expected, the $\mathrm{BIC}_p$ detects partition heterogeneity better than the BIC. That is, the $\mathrm{BIC}_p$ makes fewer lumping errors than the BIC. However, from the comparison of Tables 2 and 3, we observe that the $\mathrm{BIC}_p$ is less sensitive than the $\mathrm{AIC}_p$ and even less than the AIC. This is because the $\mathrm{BIC}_p$ has heavier penalties than the $\mathrm{AIC}_p$ and the AIC. Also, the $\mathrm{BIC}_p$ displays a marked sensitivity to the assumed substitution model. For example, for one of the explored numbers of sites, the observed proportion of selecting the 4-partition scheme is 0.998 with the JC model and only 0.093 with the GTR+G model. In contrast, the $\mathrm{AIC}_p$ appears to be more robust than the $\mathrm{BIC}_p$ to model misspecification.
Table 3.
Proportion of simulations where the 4-partition scheme was selected rather than the 1-partition (concatenation) scheme by BIC and by $\mathrm{BIC}_p$ when partitions were actually heterogeneous
The choice of model selection criterion can also affect branch length inferences. Table 4 considers the sums of branch lengths (i.e. the tree lengths) that were inferred from the simulated data with the different information criteria. It shows the specific case where the number of sites was 100 and where both the true and assumed substitution models were HKY+G. Table 4 shows that lumping errors can have more serious impacts than splitting errors on branch length estimates. Also, it indicates the potential value of the $\mathrm{AIC}_p$ when downstream inferences that rely upon branch length estimates (e.g. divergence time and evolutionary rate estimates) are important (see Example 2 of the following section).
Table 4.
Tree length inferences from different information criteria when Partition 1 had 100 sites and when the true and assumed substitution models were both HKY+G. If the information criterion selected the 1-partition scheme for a simulated data set, then the inferred tree length for the concatenated model was recorded for all four partitions. If the 4-partition scheme was selected, then the tree lengths separately inferred for each partition were recorded. Standard deviations of tree length inferences are shown in parentheses below the sample means that were inferred from the 1000 simulated data sets.
Empirical Data Analysis
We analyzed two data sets to examine how partition scheme selection affects evolutionary inferences. Because some details of our ML estimation and partition selection procedures differ from those of the previous studies, our results have minor differences from the previous ones. However, our emphasis here is not on these differences but is instead on how partition scheme choice is affected by the choice of information criterion and how downstream evolutionary analyses are affected by partition selection.
Example 1: Li et al. (2008) analyzed 10 protein-coding genes of 56 ray-finned fish taxa. By separating each protein-coding gene into three codon positions, they started with a possible 30-partition scheme. They then performed hierarchical clustering to generate candidate schemes with 29, 28, ..., 2, 1 partitions. Although the number of possible ways to partition 30 items is the Bell number (Bell 1934) $B_{30} \approx 8.47 \times 10^{23}$ and this is far more than 30, we considered only the 30 schemes from hierarchical clustering that Li et al. (2008) reported.
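To give a sense of how quickly the number of possible partition schemes grows, Bell numbers can be computed exactly with the Bell triangle recurrence. The short function below is a generic sketch with a hypothetical name, not code from any of the cited software.

```python
def bell_number(n):
    """Return the n-th Bell number, the number of ways to partition n items.

    Uses the Bell triangle: each row starts with the last entry of the
    previous row, and each later entry is the sum of the entry to its left
    and the entry above that left neighbor.
    """
    row = [1]  # row 0; its first entry is B_0 = 1
    for _ in range(n):
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]


# Number of candidate schemes for 5, 10, and 30 initial data blocks.
counts = {n: bell_number(n) for n in (5, 10, 30)}
```

Even 10 initial blocks already allow more than one hundred thousand schemes, which is why exhaustive search is replaced by heuristic or clustering-based candidate lists for data sets such as this one.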
For each of these 30 schemes with the AIC and with the BIC, our analyses allowed different substitution models for different partitions by invoking the "models = all" option of PartitionFinder (Lanfear et al. 2012). This differs from the Li et al. (2008) analysis that assumed the GTR+G model for all partitions in all schemes. We also considered the $\mathrm{AIC}_p$ and the $\mathrm{BIC}_p$ for the 30 candidate partition schemes. All analyses with these two information criteria assumed the neighbor-joining tree topology reported by the MEGA software (Tamura et al. 2013) when the 30 possible partitions of the Li et al. (2008) data were concatenated and then analyzed with the TN93 substitution model.
Our $\mathrm{AIC}_p$ and $\mathrm{BIC}_p$ analyses assumed that all partitions in each candidate scheme evolved according to a GTR substitution model with four discrete-gamma categories of rate heterogeneity among sites. Whereas Li et al. (2008) considered only the LBL parameterization, we considered both LBL and UBL parameterizations for the four information criteria with all candidate partition schemes.
Detailed results from applying the four information criteria to the Li et al. (2008) candidate partitions are given in Tables 1 through 4 of Supplementary Material, Online Appendix available on Dryad at http://dx.doi.org/10.5061/dryad.qq586. Most of the results are not surprising. The adjusted AIC is more prone to splitting than the AIC. Considering only the UBL parameterization, the adjusted AIC selects the 30-partition scheme while the AIC chooses the 21-partition scheme. Likewise, the adjusted BIC is more prone to splitting than the BIC. When considering only the UBL parameterization, the adjusted BIC selects the 3-partition scheme while the BIC prefers the 2-partition scheme. These UBL results also confirm the expectation that the BIC and the adjusted BIC prefer fewer partitions than the AIC and the adjusted AIC. This is because both the BIC and the adjusted BIC penalties involve the amount of data as well as the number of estimated parameters.
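The role of the amount of data in the penalty can be seen from the standard definitions of the two unadjusted criteria. The following is a minimal sketch of the textbook AIC and BIC only; it does not implement the adjusted criteria, and treating the number of sequence sites as the sample size is an assumption made for illustration.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Textbook AIC: the penalty is 2 per free parameter."""
    return -2.0 * log_likelihood + 2.0 * n_params


def bic(log_likelihood: float, n_params: int, n_sites: int) -> float:
    """Textbook BIC: the penalty is ln(n_sites) per free parameter."""
    return -2.0 * log_likelihood + n_params * math.log(n_sites)


def penalty_per_parameter(n_sites: int) -> dict:
    """Per-parameter penalties of the two criteria for a given sample size."""
    return {"AIC": 2.0, "BIC": math.log(n_sites)}


if __name__ == "__main__":
    # With thousands of sites, each extra parameter costs far more under the BIC.
    print(penalty_per_parameter(10_000))
```

Because ln(n) exceeds 2 as soon as there are more than about 7 sites, every additional partition is charged more heavily by the BIC than by the AIC for any realistic alignment.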
The number of estimated parameters for the LBL parameterization is substantially less than for the UBL parameterization. This causes the LBL parameterization to tend to favor splitting more than the UBL parameterization. Among the LBL results, the BIC selects the 21-partition scheme whereas the other three information criteria all choose the 30-partition scheme.
When both UBL and LBL parameterizations are considered for each of the 30 candidate schemes, the BIC selects the LBL with 21 partitions and 288 free parameters while the adjusted BIC selects the LBL with 30 partitions and 408 free parameters. For the same set of possible models, the AIC chooses the UBL with 21 partitions and 2486 free parameters while the adjusted AIC favors the UBL with 30 partitions and 3540 free parameters. All of these model choices are consistent with the adjusted AIC favoring splitting more than the AIC, the adjusted BIC favoring splitting more than the BIC, and the BIC and the adjusted BIC both preferring fewer parameters than the AIC and the adjusted AIC.
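The 3540 and 408 free-parameter counts quoted above can be reproduced with simple bookkeeping. The sketch below rests on assumptions that we state explicitly: an unrooted binary tree with 2T - 3 branches for T taxa, 9 free parameters per GTR partition with discrete-gamma rate heterogeneity (5 exchangeabilities, 3 base frequencies, 1 shape), a separate set of branch lengths per partition under UBL, and one shared set of branch lengths plus one relative rate for all but one partition under LBL. The function names are ours.

```python
GTR_GAMMA_PARAMS = 9  # assumed: 5 exchangeabilities + 3 frequencies + 1 gamma shape


def n_branches(n_taxa: int) -> int:
    """Number of branches in an unrooted binary tree."""
    return 2 * n_taxa - 3


def ubl_params(n_taxa: int, n_partitions: int, per_model: int = GTR_GAMMA_PARAMS) -> int:
    """Every partition has its own branch lengths and its own model parameters."""
    return n_partitions * (n_branches(n_taxa) + per_model)


def lbl_params(n_taxa: int, n_partitions: int, per_model: int = GTR_GAMMA_PARAMS) -> int:
    """Shared branch lengths, one relative rate per extra partition, own model parameters."""
    return n_branches(n_taxa) + (n_partitions - 1) + n_partitions * per_model


if __name__ == "__main__":
    print(ubl_params(56, 30), lbl_params(56, 30))
```

The counts reported for schemes chosen by the AIC and the BIC need not follow this formula, because those analyses let PartitionFinder pick a possibly simpler substitution model for each partition.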
Because the 30 candidate schemes were generated by hierarchical clustering, all 30 have a nesting/nested relationship for the UBL parameterization. That is, the k-partition scheme is nested within the (k+1)-partition scheme (k = 1, ..., 29). The same applies for the LBL parameterization. These nesting relationships mean that the traditional (asymptotic) LRT can be applied. For both LBL and UBL parameterizations, the 30-partition scheme is significantly better than all others according to the likelihood ratio test. In contrast, the 21-partition UBL scheme is selected by the AIC over the 30-partition UBL scheme and over all other candidate schemes. This emphasizes that the AIC score can produce results that are contradictory to the LRT. The adjusted AIC selects the 30-partition UBL scheme as the best among all candidates and it selects the 30-partition LBL scheme as the best LBL candidate. The consistency between the adjusted AIC and the LRT is not surprising given their close relationship.
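One way to see how the AIC and the LRT can disagree for nested schemes is to compare their decision thresholds for the same statistic, twice the log-likelihood gain of the larger scheme. The sketch below is an illustration under our own assumptions: it uses the Wilson-Hilferty approximation to the chi-square quantile instead of an exact value, and the numbers in the example (1000 extra parameters, a log-likelihood gain of 750) are hypothetical rather than taken from the data.

```python
from statistics import NormalDist


def chi2_critical_value(df: int, alpha: float = 0.05) -> float:
    """Approximate upper-alpha chi-square quantile (Wilson-Hilferty)."""
    z = NormalDist().inv_cdf(1.0 - alpha)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * c ** 0.5) ** 3


def lrt_prefers_larger(delta_lnl: float, extra_params: int, alpha: float = 0.05) -> bool:
    """Asymptotic LRT: reject the smaller scheme if 2*delta_lnl exceeds the quantile."""
    return 2.0 * delta_lnl > chi2_critical_value(extra_params, alpha)


def aic_prefers_larger(delta_lnl: float, extra_params: int) -> bool:
    """AIC: the larger scheme wins only if 2*delta_lnl exceeds 2 per extra parameter."""
    return 2.0 * delta_lnl > 2.0 * extra_params


if __name__ == "__main__":
    # Hypothetical comparison: 1000 extra parameters, log-likelihood gain of 750.
    print(lrt_prefers_larger(750.0, 1000), aic_prefers_larger(750.0, 1000))
```

When the larger scheme adds many parameters, the chi-square quantile lies only slightly above the number of extra parameters while the AIC threshold is twice that number, so there is a wide range of likelihood gains for which the LRT favors splitting and the AIC does not.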
Ripplinger and Sullivan (2008) noted that model choice may affect phylogeny estimation, especially for regions of an evolutionary tree that have low bootstrap support. Partitioning decisions are a type of model choice and our experience with partition choice coincides with Ripplinger and Sullivan's observation. For example, we considered the 30-partition UBL scheme that was selected as optimal according to the adjusted AIC and the 21-partition UBL scheme that was preferred by the AIC. While computational considerations motivated our decision to avoid intensively searching topology space when computing information criteria scores for the 30 candidate partitioning schemes, we used the RAxML software (Stamatakis 2014) to more carefully search among topologies for both the 21-partition UBL scheme and the 30-partition UBL scheme. While the RAxML trees derived from these two schemes are topologically very similar, Figure 2 shows that the differences tend to occur in regions of the topologies with low bootstrap support.

Figure 2. Different tree topologies for different partition schemes. ML tree topologies of the 56 ray-finned fish taxa were reconstructed with the 30-partition (left) and 21-partition (right) schemes. These two partition schemes are the best with the adjusted AIC and the AIC, respectively. The bootstrap support levels shown near each node are based upon 500 bootstrap replicates. The trees were inferred with the GTR substitution model and four categories of discrete-gamma rate heterogeneity among sites.
Example 2: We analyzed four protein-coding mitochondrial genes and seven (six protein-coding and one noncoding) nuclear genes of 49 notothenioid fish taxa. Colombo et al. (2015) focussed on the (possibly adaptive) radiation of the Antarctic clade, but our focus here is on partition choice. As with Example 1, we used the "models = all" option and the tree topology provided by PartitionFinder for the AIC and the BIC calculations. For the adjusted AIC and the adjusted BIC calculations, we used the neighbor-joining tree topology estimated with the TN93 substitution model (Tamura and Nei 1993) and the MEGA software (Tamura et al. 2013). Fixing this topology, we obtained the adjusted AIC and the adjusted BIC scores for a GTR model with four discrete-gamma categories of rate heterogeneity.
We first ran PartitionFinder (Lanfear et al. 2012) with the BIC and the UBL parameterization. The partition space was searched with the greedy option. Starting with 31 possible partitions (one for the noncoding gene and one per codon position for each of the 10 protein-coding genes), partitions were hierarchically merged until the best partition scheme (a 4-partition scheme) was found according to the BIC. Because the BIC tends to favor concatenation relative to the other information criteria, we decided to focus on the 28 candidate partition schemes that led PartitionFinder from the starting 31-partition scheme to the 4-partition scheme. The number of partitions in these 28 candidate schemes therefore ranges from 4 to 31.
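The general idea of a greedy merging search, and of the path of schemes it leaves behind, can be sketched as follows. This is not PartitionFinder's implementation: the function names, the data structures, and especially the toy score (within-block squared deviation of a hypothetical per-subset rate plus a fixed penalty per block) are stand-ins that we introduce only to make the control flow concrete. In a real analysis the score would be an information criterion computed from fitted likelihoods.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Tuple

Scheme = Tuple[Tuple[int, ...], ...]  # a scheme is a tuple of blocks of subset indices


def greedy_merge(n_subsets: int, score: Callable[[Scheme], float]) -> List[Scheme]:
    """Merge the best-scoring pair of blocks until no merge lowers the score.

    Returns the path of accepted schemes, from fully split to the final scheme.
    """
    current: Scheme = tuple((i,) for i in range(n_subsets))
    best = score(current)
    path = [current]
    while len(current) > 1:
        candidates = []
        for a, b in combinations(range(len(current)), 2):
            merged = tuple(sorted(current[a] + current[b]))
            rest = tuple(blk for k, blk in enumerate(current) if k not in (a, b))
            scheme = tuple(sorted(rest + (merged,)))
            candidates.append((score(scheme), scheme))
        cand_score, cand = min(candidates)
        if cand_score >= best:  # no merge improves the criterion: stop
            break
        current, best = cand, cand_score
        path.append(current)
    return path


def toy_score(rates: Sequence[float], penalty: float) -> Callable[[Scheme], float]:
    """Hypothetical stand-in for an information criterion (lower is better)."""
    def score(scheme: Scheme) -> float:
        total = penalty * len(scheme)
        for block in scheme:
            mean = sum(rates[i] for i in block) / len(block)
            total += sum((rates[i] - mean) ** 2 for i in block)
        return total
    return score


if __name__ == "__main__":
    for scheme in greedy_merge(4, toy_score([1.0, 1.1, 5.0, 5.2], penalty=1.0)):
        print(len(scheme), scheme)
```

Each accepted merge reduces the number of blocks by one, so a search that runs from 31 blocks down to 4 visits 28 schemes, which matches the number of candidates considered here.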
Assuming the UBL parameterization and evaluating these 28 candidate schemes, the 9-, 4-, 20-, and 6-partition schemes are selected as the best according to the AIC, the BIC, the adjusted AIC, and the adjusted BIC, respectively. Assuming the LBL parameterization, the 31-, 8-, 31-, and 18-partition schemes are selected as the best for the AIC, the BIC, the adjusted AIC, and the adjusted BIC, respectively. When considering either the UBL or LBL parameterizations, the AIC selects the 933-parameter 9-partition UBL scheme and the adjusted AIC prefers the 2080-parameter 20-partition UBL scheme while the BIC prefers the 146-parameter 8-partition LBL scheme and the adjusted BIC prefers the 274-parameter 18-partition LBL scheme. In summary, the analyses again show that the adjusted BIC is more apt to split partitions than the BIC and the adjusted AIC is more apt to split than the AIC. The results also again show that the BIC and the adjusted BIC prefer models with fewer free parameters than the AIC and the adjusted AIC.
With this data set, we again performed maximum likelihood topology estimation according to a variety of partition schemes and model parameterizations that were favored according to one of the four information criteria. The results were again consistent with the observation of Ripplinger and Sullivan (2008). While we found minor variations in inferred maximum likelihood topology, the variations were associated with regions of the tree that have low or moderate bootstrap support (data not shown).
Using the 20-partition UBL scheme preferred by the adjusted AIC and also the 9-partition UBL scheme preferred by the AIC, we estimated evolutionary rates and divergence times with the MCMCtree software (Yang 2007) by using both the sequence data and calibration points of Colombo et al. (2015). We concentrate on the third codon positions of the CO1 and the ND4 genes that are heterogeneous in the 20-partition UBL scheme but that are homogeneous (i.e. in the same partition) in the 9-partition UBL scheme. Figure 3 shows divergence time estimates and the inferred trajectory of evolutionary rates of CO1 and ND4 third positions at nodes along the path connecting the most recent common ancestor of the Antarctic clade to the tip in this clade that represents Chionodraco rastrospinosus. While the divergence time estimates are very similar for the 20-partition and 9-partition schemes, the rate trajectories for the third positions of CO1 and ND4 are quite different. The rate trajectory of the merged data shows a kind of average of CO1 and ND4 and is located between the two trajectories of CO1 and ND4, but it loses information about the evolutionary properties of the individual genes. While this is only one example rather than evidence for a general tendency, we expect that the choice of partition scheme is less likely to affect the estimation of tree topology than divergence times and we expect that partition scheme choice is more likely to affect the estimation of evolutionary rates of individual genes than it is to affect divergence times that are assumed to be shared among the genes.

Figure 3. Different evolutionary rates for different partition schemes. Evolutionary rate trajectories from the most recent common ancestor of the Antarctic clade to Chionodraco rastrospinosus. Evolutionary rates of third sites of CO1 (circle), ND4 (rectangle) and merged CO1 and ND4 (cross) are plotted on the y-axis with estimated divergence times (millions of years ago) on the x-axis. The third sites of CO1 and ND4 are heterogeneous in the 20-partition scheme that is the best according to the adjusted AIC. However, they are homogeneous in the 9-partition scheme that is the best according to the AIC.
Concluding Remarks
The asymmetric consequences of splitting and lumping errors are our motivation for suggesting the adjusted AIC and the adjusted BIC to compare partition schemes. We view the possible bias resulting from lumping errors as more serious than the increased variance generated by splitting errors. Other partitioning techniques to account for the asymmetric consequences are also possible. For example, partitioning guided by Bayesian decision theory (e.g. see Berger 1985) could be attractive, but it would be difficult to convert the qualitative asymmetry of consequences from lumping and splitting errors into a quantitative loss function that would adequately summarize the relative severities of these two types of errors.
Of the four information criteria that we consider, the adjusted AIC is the least likely to make lumping errors and we conclude that the adjusted AIC is ordinarily a better choice than the other information criteria. Nonetheless, one of the notable features of the BIC is that it is consistent. Consistency in model selection implies that the probability of selecting the true model approaches 1 as sample size increases (Dziak et al. 2012). Likewise, consistency in partition selection guarantees selection of the true partition scheme when sample size (i.e. the number of sequence sites) is large. When the sample size goes to infinity, the BIC penalty divided by the sample size goes to zero whereas the penalty itself goes to infinity. This is often the situation when information criteria are consistent with respect to model selection (Bozdogan 1987; Dziak et al. 2012). The adjusted BIC is also consistent and, because the adjusted BIC is less prone to lumping errors than the BIC, the adjusted BIC might be the best alternative among the four information criteria when statistical consistency is especially valued.
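For the conventional BIC, the two limits mentioned in this argument can be written out explicitly. The symbols below (k for the number of free parameters and n for the number of sequence sites) are our own shorthand and the display covers only the standard BIC and AIC penalties, not the adjusted criteria.

```latex
\mathrm{penalty}_{\mathrm{BIC}}(n) = k \ln n \;\longrightarrow\; \infty ,
\qquad
\frac{\mathrm{penalty}_{\mathrm{BIC}}(n)}{n} = \frac{k \ln n}{n} \;\longrightarrow\; 0
\qquad (n \to \infty),
\qquad\text{whereas}\qquad
\mathrm{penalty}_{\mathrm{AIC}}(n) = 2k \ \text{for all } n .
```

The growing penalty eventually outweighs the bounded likelihood gain of an overly split scheme, while the vanishing ratio lets the likelihood gain of a truly heterogeneous scheme, which grows in proportion to n, dominate the penalty. The AIC penalty does not grow with n, which is why the AIC lacks this property.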
A conventional approach is to first make a quick and crude approximation of the phylogenetic tree topology and to then search for the optimal partition scheme by fixing the topology at this approximation (Lanfear et al. 2012). After settling upon and fixing the partition scheme, a thorough search of topologies can be carried out. This conventional approach is attractive because a joint search of all combinations of partition scheme and topology can be computationally prohibitive.
This conventional approach simplifies computation and seems to us to also be sensible when employing the adjusted AIC and the adjusted BIC, especially for doing phylogenetics with genome-scale data. If two partitions are heterogeneous according to one combination of tree topology and nucleotide substitution model, they are likely to be heterogeneous according to another combination. To be sure, the choice of a combination could affect the power to detect heterogeneity but this effect is often small. This is because even if the assumed tree topology and substitution model are incorrect for some or all partitions, the resulting bias would also be homogeneous (or heterogeneous) if evolutionary properties are really homogeneous (or heterogeneous) among partitions. However, we have not fully characterized this conventional approach here and doing so might be a good direction for future research.
The application of information criteria to partitioning sequence data has mainly received attention with regard to its impact on phylogeny inference, but various other kinds of evolutionary inferences (e.g. divergence time estimation, examination of how rates of molecular evolution have changed over time, and detection of diversifying positive selection) are also potentially influenced by partitioning. The potentially large effects of partitioning on inferred trajectories of evolutionary rates are illustrated by our findings with CO1 and ND4 third positions from the Colombo et al. (2015) notothenioid data.
The ability to detect and quantify shifts of evolutionary rates over time is especially pertinent to the study of adaptation. By studying a phylogeny of diverse terrestrial and marine mammals and then identifying genes having evolutionary rates on the tree that correlate with marine/terrestrial status, Chikina et al. (2016) identified a biologically plausible group of candidate genes that might be associated with adaptation to marine environments. This promising strategy for illuminating the genetic underpinnings of evolutionary adaptation can potentially be applied to various other sorts of adaptation, including adaptation to the extreme Antarctic environment that may have been associated with the notothenioid fish radiation studied by Colombo et al. (2015). However, the ability to use rate change to identify genes associated with adaptation to extreme environments or other sorts of adaptation will be influenced by partitioning decisions. As genomic data of non-model organisms become increasingly available, the ability to characterize trajectories of evolutionary rates should improve and so should the ability to identify interesting changes in evolutionary rates. Success of these studies is likely to be influenced by the availability of sound methods for partitioning.
Funding
This work was supported by the Korea Polar Research Institute (PE17090) and by NIH grant GM118508.
Acknowledgements
We thank Mark Holder, Stephen Smith, Xiang Ji, and two anonymous reviewers.
Appendix
1. Basic assumptions
Let us define operators ,
, and
as follows
,
and
are defined in a similar manner.
The adopted model may not be correct. When the MLE with an incorrect model converges to a certain value that we will denote , its asymptotic distribution is normal under suitable regularity conditions (White 1982). That is,
(A.1)
(A.2)
and the asymptotic distribution of is expressed in a similar manner. When the assumed model is correct,
in Equation (A.1) and the covariance matrix of Equation (A.1) is reduced to the inverse of
.
In our derivations, we assume that the adopted model may not be correct but it is close to the truth so that and the covariance matrix of Equation (A.1) is approximately the inverse of
. Also, we assume partition homogeneity when deriving the penalty of the adjusted AIC.
2. Partitioning homogeneous sequence data is not harmful when the number of sequence sites per partition is large
We define
If we assume partition homogeneity, then and
where and
is the identity matrix. We consider the following natural estimators
and note that is a consistent estimator of
.
Now, consider the following first derivative and its asymptotic expansion.
where implies that
is bounded in probability (Bishop et al. 2007). This leads to
(A.3)
where we use the fact for large
'south in the approximation of the last line. Nosotros denote
as
. Then, Equation (A.3) implies
for large
. On the other hand,
Therefore, even for large
. However,
Therefore, for large
.
Now consider the variances of and
. From Equation (A.3),
Taking the expectation of both sides, we get
which implies that the variances of and
are similar to each other when there are large amounts of data. The similarity of these variances suggests that partitioning homogeneous data is not harmful when the number of sequence sites per partition is large.
3. Proof of Equation (15)
For simplicity and convenience of mathematical notation, we omit the '' subscript below so that
and
respectively stand for
and
.
To further simplify Equation (A.2), we define
If we rewrite Equation (A.2), the MLE from the th partition asymptotically follows a normal distribution,
Now, let us define the vector of all partitions' MLEs as follows,
where and
are
dimensional column vectors. The vector
asymptotically follows a multivariate normal distribution,
(A.4)
where is a diagonal block matrix due to the independence of partitions,
Now, we investigate the relationship between and
for the
th partition. We define the (
)-dimensional matrix
as
where is the
-dimensional identity matrix. Then,
where the approximation is from Equation (A.3). Applying Equation (A.4), we find
The can be obtained with covariance matrices of individual partitions,
where the first approximation results from partition homogeneity and the second approximation results from the assumption of .
The quadratic summation of the elements of follows a
distribution,
(A.5)
Then, the log-likelihood function at the th partition has the following relationship,
(A.6)
where the last approximation holds in the sense of expectation. Therefore,
which proves Equation (15). While Equation (15) is a direct result of the asymptotic behavior of likelihood ratio tests when the null hypothesis is true, we note that Equations (A.5) and (A.6) are the critical steps in this proof and they are valid as long as the adopted model does not severely deviate from the truth.
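The asymptotic behavior of likelihood ratio statistics under a true null hypothesis, on which this proof relies, can be checked numerically in a deliberately simple setting. The simulation below is a toy illustration and not the phylogenetic model of this paper: it assumes unit-variance normal observations, one free mean per partition in the split model and one shared mean in the merged model, with all data generated under homogeneity. All names and default values are ours.

```python
import random


def split_vs_merged_statistic(n_partitions: int, sites_per_partition: int,
                              rng: random.Random) -> float:
    """2 * (lnL_split - lnL_merged) for homogeneous unit-variance normal data."""
    samples = [[rng.gauss(0.0, 1.0) for _ in range(sites_per_partition)]
               for _ in range(n_partitions)]
    all_values = [x for part in samples for x in part]
    grand_mean = sum(all_values) / len(all_values)
    stat = 0.0
    for part in samples:
        part_mean = sum(part) / len(part)
        # With known unit variance the statistic reduces to this weighted sum.
        stat += len(part) * (part_mean - grand_mean) ** 2
    return stat


def mean_statistic(n_partitions: int = 4, sites_per_partition: int = 50,
                   n_reps: int = 2000, seed: int = 1) -> float:
    """Average the statistic over replicates simulated under homogeneity."""
    rng = random.Random(seed)
    total = sum(split_vs_merged_statistic(n_partitions, sites_per_partition, rng)
                for _ in range(n_reps))
    return total / n_reps


if __name__ == "__main__":
    # The split model has 3 more free parameters than the merged model here.
    print(mean_statistic())
```

In this toy case the statistic follows a chi-square distribution whose degrees of freedom equal the number of extra parameters in the split model, so its average over replicates should be close to that number.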
4. Proof of Equation (20): derivation of the adjusted BIC
To begin, we overview an approximation of the posterior probability density to show the origin of in Equation (19) (e.g. see Robert 2007). As in Part 3 of the Appendix, we will omit the '
' subscript below when doing so does not affect clarity.
Define
Then, the probability of data for a given prior
is
(A.7)
We use the following results,
(A.8)
which can be derived from the probability density function of a multivariate normal distribution.
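A standard identity of this kind, written here in our own notation and not necessarily in the exact form of Equation (A.8), follows from the fact that a multivariate normal density integrates to one. For a symmetric positive definite k-by-k matrix A and a fixed vector \mu (both symbols are assumptions introduced for this illustration),

```latex
\int_{\mathbb{R}^{k}}
  \exp\!\left\{ -\tfrac{1}{2}\,(\theta-\mu)^{\top} A\,(\theta-\mu) \right\} d\theta
  \;=\; (2\pi)^{k/2}\,\lvert A \rvert^{-1/2} .
```

Integrals of this form are what turn a quadratic approximation of the log-likelihood around the MLE into a closed-form approximation of the marginal probability of the data.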
Thus,
(A.9)
Multiplying the right side of Equation (A.9) by , we obtain Equation (19),
In the conventional definition of BIC, is ignored. But, in our study, we take it into consideration for more accurate comparison (see Theory).
By using Equations (A.5) and (A.6), we can derive the following relationships.
(A.10)
where we used an approximation similar to Equation (A.8),
Taking the logarithm of both sides of Equation (A.10) followed by ignoring minor terms results in
Therefore, by recovering model index , we obtain the following approximation
(A.11)
From the definition of Equation (19) and the approximation of Equation (A.11), we can consider the following approximation and definition,
where '' implies that the right side of the equation is defined as the left side of the equation.
References
- Adachi J., Waddell P.J., Martin W., Hasegawa M. 2000. Plastid genome phylogeny and a model of amino acid substitution for proteins encoded by chloroplast. J. Mol. Evol. 50, 348–358.
- Akaike H. 1974. A new look at the statistical model identification. IEEE Trans. Autom. Contr. 19, 716–723.
- Anderson C.N., Liu L., Pearl D., Edwards S.V. 2012. Tangled trees: the challenge of inferring species trees from coalescent and noncoalescent genes. Methods Mol Biol. 856, 3–28.
- Bell E.T. 1934. Exponential numbers. Amer. Math. Monthly 41, 411–419.
- Berger J.O. 1985. Statistical decision theory and Bayesian analysis. New York: Springer-Verlag.
- Bishop Y.M., Fienberg S.E., Holland P.W. 2007. Discrete multivariate analysis. New York: Springer-Verlag; p. 475–484.
- Bozdogan H. 1987. Model selection and Akaike's Information Criterion (AIC): the general theory and its analytical extensions. Psychometrika 52, 345–370.
- Burnham K.P., Anderson D.R. 2002. Model selection and multimodel inference. 2nd ed. New York: Springer-Verlag; p. 64–66, 284–285.
- Cao Y., Sorenson M.D., Kumazawa Y., Mindell D.P., Hasegawa M. 2000a. Phylogenetic position of turtles among amniotes: evidence from mitochondrial and nuclear genes. Gene 259, 139–148.
- Cao Y., Fujiwara M., Nikaido N., Okada N., Hasegawa M. 2000b. Interordinal relationships and timescale of eutherian evolution as inferred from mitochondrial genome data. Gene 259, 149–158.
- Chang J.T. 1996. Inconsistency of evolutionary tree topology reconstruction methods when substitution rates vary across characters. Math. Biosci. 134, 189–215.
- Chikina M., Robinson J.D., Clark N.L. 2016. Hundreds of genes experienced convergent shifts in selective pressure in marine mammals. Mol. Biol. Evol. 33 (9), 2182–2192.
- Colombo M., Damerau M., Hanel R., Salzburger W., Matschiner M. 2015. Diversity and disparity through time in the adaptive radiation of Antarctic notothenioid fishes. J. Evol. Biol. 28 (2), 376–394.
- Draper D. 1995. Assessment and propagation of model uncertainty. J. R. Statist. Soc. B 57, 45–97.
- Dziak J.J., Coffman D.L., Lanza S.T., Li R. 2012. Sensitivity and specificity of information criteria. Technical Report Series #12-119. The Pennsylvania State University. State College, PA.
- Felsenstein J. 1981. Evolutionary trees from DNA sequences: a maximum likelihood approach. J. Mol. Evol. 17, 368–376.
- Felsenstein J. 1989. PHYLIP - phylogeny inference package (version 3.2). Cladistics 5, 164–166.
- Hasegawa M., Kishino H., Yano T. 1985. Dating the human-ape splitting by a molecular clock of mitochondrial DNA. J. Mol. Evol. 22, 160–174.
- Hastie T., Tibshirani R., Friedman J. 2009. The elements of statistical learning. Chapter 7. New York: Springer-Verlag.
- Jukes T.H., Cantor C.R. 1969. Evolution of protein molecules. In: Munro H.N., editor. Mammalian protein metabolism. New York: Academic Press, p. 21–132.
- Kimura M. 1980. A simple method for estimating evolutionary rate of base substitution through comparative studies of nucleotide sequences. J. Mol. Evol. 16, 111–120.
- Kolaczkowski B., Thornton J.W. 2004. Performance of maximum parsimony and likelihood phylogenetics when evolution is heterogeneous. Nature 431, 980–984.
- Konishi S., Kitagawa G. 2004. Information criteria (in Japanese). Tokyo: Asakura Publishing Co, p. 47–48.
- Lanfear R., Calcott B., Ho S.Y.W., Guindon S. 2012. PartitionFinder: combined selection of partitioning schemes and substitution models for phylogenetic analyses. Mol. Biol. Evol. 29, 1695–1701.
- Lanfear R., Calcott B., Kainer D., Mayer C., Stamatakis A. 2014. Selecting optimal partitioning schemes for phylogenomic datasets. BMC Evol. Biol. 14, 82–95.
- Leigh J.W., Lapointe F.J., Lopez P., Bapteste E. 2011. Evaluating phylogenetic congruence in the post-genomic era. Genome Biol Evol. 3, 571–587.
- Lemmon A.R., Moriarty E.C. 2004. The importance of proper model assumption in Bayesian phylogenetics. Syst. Biol. 53, 265–277.
- Li C., Lu G., Orti G. 2008. Optimal data partitioning and a test case for ray-finned fishes (Actinopterygii) based on ten nuclear loci. Syst. Biol. 57, 519–539.
- Lopez P., Casane D., Philippe H. 2002. Heterotachy, an important process of protein evolution. Mol. Biol. Evol. 19, 1–7.
- Nikaido M., Cao Y., Harada M., Okada N., Hasegawa M. 2003. Mitochondrial phylogeny of hedgehogs and monophyly of Eulipotyphla. Mol. Phylogenet. Evol. 28, 276–284.
- Nishihara H., Okada N., Hasegawa M. 2007. Rooting the eutherian tree: the power and pitfalls of phylogenomics. Genome Biol. 8, R199.1–R199.10.
- Pupko T., Huchon D., Cao Y., Okada N., Hasegawa M. 2002. Combining multiple data sets in a likelihood analysis: which models are the best? Mol. Biol. Evol. 19, 2294–2307.
- Rambaut A., Grassly N.C. 1997. Seq-Gen: an application for the Monte Carlo simulation of DNA sequence evolution along phylogenetic trees. Comput. Appl. Biosci. 13, 235–238.
- Ripplinger J., Sullivan J. 2008. Does choice in model selection affect maximum likelihood analysis? Syst. Biol. 57 (1), 76–85.
- Robert C.P. 2007. The Bayesian choice. 2nd ed. New York: Springer-Verlag; p. 352.
- Schwarz G. 1978. Estimating the dimension of a model. Ann. Stat. 6, 461–464.
- Seo T.-K. 2008. Calculating bootstrap probabilities of phylogeny using multilocus sequence data. Mol. Biol. Evol. 25, 960–971.
- Stamatakis A. 2014. RAxML Version 8: A tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30 (9), 1312–1313.
- Tamura K. 1992. Estimation of the number of nucleotide substitutions when there are strong transition-transversion and G+C content biases. Mol. Biol. Evol. 9, 678–687.
- Tamura K., Nei M. 1993. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol. Biol. Evol. 10, 512–526.
- Tamura K., Stecher G., Peterson D., Filipski A., Kumar S. 2013. MEGA6: molecular evolutionary genetics analysis version 6.0. Mol. Biol. Evol. 30 (12), 2725–2729.
- Wu C.H., Suchard M.A., Drummond A.J. 2012. Bayesian selection of nucleotide substitution models and their site assignments. Mol. Biol. Evol. 30 (3), 669–699.
- White H. 1982. Maximum likelihood estimation of misspecified models. Econometrica 50, 1–25.
- Yang Z. 1994a. Estimating the pattern of nucleotide substitution. J. Mol. Evol. 39, 105–111.
- Yang Z. 1994b. Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: approximate methods. J. Mol. Evol. 39, 306–314.
- Yang Z. 1996. Maximum-likelihood models for combined analyses of multiple sequence data. J. Mol. Evol. 42, 587–596.
- Yang Z. 2007. PAML 4: Phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586–1591.
Articles from Systematic Biology are provided here courtesy of Oxford University Press
Source: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6005138/