Phylogenetic inference with q2-phylogeny¶
Note
This tutorial assumes you’ve read through the QIIME 2 Overview documentation and have worked through at least some of the other tutorials.
Inferring phylogenies¶
Several downstream diversity metrics, available within QIIME 2, require that a phylogenetic tree be constructed using the Operational Taxonomic Units (OTUs) or Amplicon Sequence Variants (ASVs) being investigated.
But how do we proceed to construct a phylogeny from our sequence data?
Well, there are two phylogeny-based approaches we can use. Deciding upon which to use is largely dependent on your study questions:
1. A reference-based fragment insertion approach. This is likely the ideal choice, especially if your reference phylogeny (and associated representative sequences) encompasses close relatives of your sequences, so that they can be reliably inserted. Any sequences that do not match the reference well enough are not inserted. Consequently, this approach may not work well if your data contain sequences that are poorly represented within your reference phylogeny (e.g. missing clades). For more information, check out these great fragment insertion examples.
2. A de novo approach. Marker genes that can be globally aligned across divergent taxa are usually amenable to sequence alignment and phylogenetic investigation through this approach. Be mindful of the length of your sequences when constructing a de novo phylogeny: short reads may not carry enough phylogenetic information to yield a meaningful phylogeny. This community tutorial will focus on the de novo approaches.
Here, you will learn how to make use of de novo phylogenetic approaches to:
generate a sequence alignment within QIIME 2
mask the alignment if needed
construct a phylogenetic tree
root the phylogenetic tree
If you would like to substitute any of the steps outlined here by making use of tools external to QIIME 2, please see the import, export, and filtering documentation where appropriate.
Sequence Alignment¶
Prior to constructing a phylogeny, we must generate a multiple sequence alignment (MSA). When constructing an MSA, we are making a statement about the putative homology of the aligned residues (columns of the MSA) by virtue of their sequence similarity.
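To make the "columns of the MSA" idea concrete, here is a minimal Python sketch on a toy alignment. The sequences and the helper names `columns` and `gap_proportion` are made up for illustration and are not part of QIIME 2; the sketch only shows that each column groups the residues we are claiming to be homologous, and how a gap proportion can be computed.

```python
# Toy alignment: every row has the same length, "-" marks a gap.
# These sequences are invented for illustration only.
toy_msa = {
    "seq1": "ACG-TA",
    "seq2": "AC--TA",
    "seq3": "ACGGTA",
}


def columns(msa):
    """Return the alignment column by column (assumes equal-length rows)."""
    return ["".join(col) for col in zip(*msa.values())]


def gap_proportion(msa, gap="-"):
    """Fraction of all alignment cells that are gap characters."""
    total_cells = sum(len(seq) for seq in msa.values())
    gap_cells = sum(seq.count(gap) for seq in msa.values())
    return gap_cells / total_cells


for index, column in enumerate(columns(toy_msa)):
    # Each column is one homology statement across all sequences.
    print(index, column)

print(f"gap proportion: {gap_proportion(toy_msa):.2%}")
```

Columns such as `AAA` are unambiguous, whereas a column such as `--G` carries little information and is a candidate for the masking step discussed later.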
Algorithms for constructing an MSA are legion. We will make use of MAFFT (Multiple Alignment using Fast Fourier Transform) via the q2-alignment plugin. For more information, check out the MAFFT paper.
Let’s start by creating a directory to work in:
mkdir qiime2-phylogeny-tutorial
cd qiime2-phylogeny-tutorial
Next, download the data using any one of the following options:
Download URL: https://data.qiime2.org/2024.5/tutorials/phylogeny/rep-seqs.qza
Save as: rep-seqs.qza
Or, using wget:
wget \
-O "rep-seqs.qza" \
"https://data.qiime2.org/2024.5/tutorials/phylogeny/rep-seqs.qza"
Or, using curl:
curl -sL \
"https://data.qiime2.org/2024.5/tutorials/phylogeny/rep-seqs.qza" > \
"rep-seqs.qza"
Run MAFFT
qiime alignment mafft \
--i-sequences rep-seqs.qza \
--o-alignment aligned-rep-seqs.qza
Reducing alignment ambiguity: masking and reference alignments¶
Why mask an alignment?
Masking helps to eliminate alignment columns that are phylogenetically uninformative or misleading before phylogenetic analysis. Much of the time alignment errors can introduce noise and confound phylogenetic inference. It is common practice to mask (remove) these ambiguously aligned regions prior to performing phylogenetic inference. In particular, David Lane’s (1991) chapter 16S/23S rRNA sequencing proposed masking SSU data prior to phylogenetic analysis. However, knowing how to deal with ambiguously aligned regions and when to apply masks largely depends on the marker genes being analyzed and the question being asked of the data.
Note
Keep in mind that this is still an active area of discussion, as highlighted by the following non-exhaustive list of articles: Wu et al. 2012, Ashkenazy et al. 2018, Schloss 2010, Tan et al. 2015, Rajan 2015.
How to mask an alignment
For our purposes, we’ll assume that we have ambiguously aligned columns in the MAFFT alignment we produced above. The default setting of --p-min-conservation in alignment mask approximates the Lane mask filtering of QIIME 1. Keep an eye out for updates to the alignment plugin.
qiime alignment mask \
--i-alignment aligned-rep-seqs.qza \
--o-masked-alignment masked-aligned-rep-seqs.qza
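As a rough illustration of what masking does, the following Python sketch filters the columns of a toy alignment. It is not the q2-alignment implementation: the function `mask_columns`, the toy sequences, the threshold values, and the simple definitions of conservation and gap frequency used here are all assumptions made for this example, and the real method may define conservation differently.

```python
from collections import Counter


def mask_columns(alignment, min_conservation, max_gap_frequency, gap_chars="-."):
    """Keep only columns that pass both filters (illustrative definitions).

    alignment: dict mapping sequence id -> aligned sequence (equal lengths).
    Conservation here is the share of rows holding the most common
    non-gap character; gap frequency is the share of rows holding a gap.
    """
    seqs = list(alignment.values())
    width = len(seqs[0])
    if any(len(seq) != width for seq in seqs):
        raise ValueError("all aligned sequences must have the same length")

    kept = []
    for col in range(width):
        column = [seq[col] for seq in seqs]
        gap_frequency = sum(ch in gap_chars for ch in column) / len(column)
        residues = Counter(ch for ch in column if ch not in gap_chars)
        top_count = residues.most_common(1)[0][1] if residues else 0
        conservation = top_count / len(column)
        if gap_frequency <= max_gap_frequency and conservation >= min_conservation:
            kept.append(col)

    masked = {
        seq_id: "".join(seq[col] for col in kept)
        for seq_id, seq in alignment.items()
    }
    return masked, kept


# Invented toy data: columns 2 and 5 are poorly conserved, column 4 is mostly gaps.
toy = {
    "seq1": "ACGT-A",
    "seq2": "ACTT-C",
    "seq3": "ACAT-G",
    "seq4": "ACCTAT",
}

masked, kept = mask_columns(toy, min_conservation=0.5, max_gap_frequency=0.5)
print(kept)    # indices of the retained columns
print(masked)  # alignment restricted to those columns
```

With these toy thresholds only the three fully conserved columns survive. On real data the choice of thresholds is exactly the judgment call discussed above.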
Reference based alignments
There are several tools that attempt to reduce the number of ambiguously aligned regions by using curated reference alignments. Traditional de novo alignment methods mutually align a set of unaligned sequences to create a multiple sequence alignment (MSA) from scratch. Re-running these methods with additional sequences will create MSAs with varying numbers of columns and assignments of bases to each column. These alignments are therefore incompatible with one another and may not be joined through concatenation.
Reference based alignments, on the other hand, are meant to add sequences to an existing alignment. Alignments computed using reference based alignment tools always have widths identical to the reference alignment and maintain the meaning of each column. Therefore, these alignments may be concatenated.
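The width argument can be sketched in a few lines of Python. The helper names `alignment_width` and `combine_alignments` and the toy sequences are hypothetical. Note that equal width is only a necessary check: two alignments can safely be joined only when their columns also mean the same thing, which is what alignment against a shared reference provides.

```python
def alignment_width(alignment):
    """Return the single width shared by all rows, or raise if rows differ."""
    widths = {len(seq) for seq in alignment.values()}
    if len(widths) != 1:
        raise ValueError("sequences within an alignment must share one width")
    return widths.pop()


def combine_alignments(first, second):
    """Join the rows of two alignments that have identical widths.

    Equal width is necessary but not sufficient: the caller must know that
    both alignments were built against the same reference.
    """
    if alignment_width(first) != alignment_width(second):
        raise ValueError("alignments have different widths and cannot be joined")
    duplicated = set(first) & set(second)
    if duplicated:
        raise ValueError(f"duplicate sequence ids: {sorted(duplicated)}")
    combined = dict(first)
    combined.update(second)
    return combined


# Invented toy batches, imagined as aligned to the same 5-column reference.
reference_batch_1 = {"a": "AC-GT", "b": "ACTGT"}
reference_batch_2 = {"c": "A--GT"}
combined = combine_alignments(reference_batch_1, reference_batch_2)
print(sorted(combined), alignment_width(combined))

# A de novo alignment built separately may end up with a different width.
de_novo = {"d": "ACGT", "e": "A-GT"}
try:
    combine_alignments(reference_batch_1, de_novo)
except ValueError as error:
    print("refused:", error)
```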
QIIME 2 currently does not wrap any methods for reference-based alignments, but alignments created using these methods can be imported into QIIME 2 as FeatureData[AlignedSequence] artifacts, provided that the alignments are in standard FASTA format. Some examples of tools for reference-based alignment include PyNAST (using NAST), Infernal, and SINA. SILVA reference alignments are particularly powerful for rRNA gene sequence data, as knowledge of secondary structure is incorporated into the curation process, thus increasing alignment quality.
Note
Alignments constructed using reference based alignment approaches can be masked too, just like the above MAFFT example. Also, the reference alignment approach we are discussing here is distinct from the reference phylogeny approach (i.e. q2-fragment-insertion) we mentioned earlier. That is, we are not inserting our data into an existing tree, but simply trying to create a more robust alignment for making a better de novo phylogeny.
Construct a phylogeny¶
As with MSA algorithms, phylogenetic inference tools are also legion. Fortunately, there are many great resources to learn about phylogenetics. Below are just a few introductory resources to get you started:
There are several methods / pipelines available through the q2-phylogeny plugin of QIIME 2. These are based on the following tools:
Methods¶
fasttree¶
FastTree is able to construct phylogenies from large sequence alignments quite rapidly. It does this by using a CAT-like rate category approximation, which is also available through RAxML (discussed below). Check out the FastTree online manual for more information.
qiime phylogeny fasttree \
--i-alignment masked-aligned-rep-seqs.qza \
--o-tree fasttree-tree.qza
Tip
For an easy and direct way to view your tree.qza files, upload them to iTOL. Here, you can interactively view and manipulate your phylogeny. Even better, while viewing the tree topology in “Normal mode”, you can drag and drop your associated alignment.qza (the one you used to build the phylogeny) or a relevant taxonomy.qza file onto the iTOL tree visualization. This will allow you to directly view the sequence alignment or taxonomy alongside the phylogeny. 🕶️
raxml¶
Like fasttree, raxml will perform a single phylogenetic inference and return a tree. Note that the default model for raxml is --p-substitution-model GTRGAMMA. If you’d like to construct a tree using the CAT model like fasttree, simply replace GTRGAMMA with GTRCAT as shown below:
qiime phylogeny raxml \
--i-alignment masked-aligned-rep-seqs.qza \
--p-substitution-model GTRCAT \
--o-tree raxml-cat-tree.qza \
--verbose
stdout:
Warning, you specified a working directory via "-w"
Keep in mind that RAxML only accepts absolute path names, not relative ones!
RAxML can't, parse the alignment file as phylip file
it will now try to parse it as FASTA file
Using BFGS method to optimize GTR rate parameters, to disable this specify "--no-bfgs"
This is RAxML version 8.2.12 released by Alexandros Stamatakis on May 2018.
With greatly appreciated code contributions by:
Andre Aberer (HITS)
Simon Berger (HITS)
Alexey Kozlov (HITS)
Kassian Kobert (HITS)
David Dao (KIT and HITS)
Sarah Lutteropp (KIT and HITS)
Nick Pattengale (Sandia)
Wayne Pfeiffer (SDSC)
Akifumi S. Tanabe (NRIFS)
Charlie Taylor (UF)
Alignment has 157 distinct alignment patterns
Proportion of gaps and completely undetermined characters in this alignment: 39.77%
RAxML rapid hill-climbing mode
Using 1 distinct models/data partitions with joint branch length optimization
Executing 1 inferences on the original alignment using 1 distinct randomized MP trees
All free model parameters will be estimated by RAxML
ML estimate of 25 per site rate categories
Likelihood of final tree will be evaluated and optimized under GAMMA
GAMMA Model parameters will be estimated up to an accuracy of 0.1000000000 Log Likelihood units
Partition: 0
Alignment Patterns: 157
Name: No Name Provided
DataType: DNA
Substitution Matrix: GTR
RAxML was called as follows:
raxmlHPC -m GTRCAT -p 3846 -N 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta -w /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpit1km90u -n q2
Partition: 0 with name: No Name Provided
Base frequencies: 0.243 0.182 0.319 0.256
Inference[0]: Time 0.474104 CAT-based likelihood -1243.159855, best rearrangement setting 5
Conducting final model optimizations on all 1 trees under GAMMA-based models ....
Inference[0] final GAMMA-based Likelihood: -1387.917228 tree written to file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpit1km90u/RAxML_result.q2
Starting final GAMMA-based thorough Optimization on tree 0 likelihood -1387.917228 ....
Final GAMMA-based Score of best tree -1387.237425
Program execution info written to /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpit1km90u/RAxML_info.q2
Best-scoring ML tree written to: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpit1km90u/RAxML_bestTree.q2
Overall execution time: 0.966433 secs or 0.000268 hours or 0.000011 days
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: raxmlHPC -m GTRCAT -p 3846 -N 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta -w /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpit1km90u -n q2
Saved Phylogeny[Unrooted] to: raxml-cat-tree.qza
Perform multiple searches using raxml¶
If you’d like to perform a more thorough search of “tree space”, you can instruct raxml to perform multiple independent searches on the full alignment by using --p-n-searches 5. Once these 5 independent searches are completed, only the single best-scoring tree will be returned. Note that we are not bootstrapping here; we’ll do that in a later example. Let’s set --p-substitution-model GTRCAT. Finally, let’s also manually set a seed via --p-seed. By setting our seed, we allow other users to reproduce our phylogeny. That is, anyone using the same sequence alignment and substitution model will generate the same tree as long as they set the same seed value. Although --p-seed is not a required argument, it is generally a good idea to set this value.
qiime phylogeny raxml \
--i-alignment masked-aligned-rep-seqs.qza \
--p-substitution-model GTRCAT \
--p-seed 1723 \
--p-n-searches 5 \
--o-tree raxml-cat-searches-tree.qza \
--verbose
stdout:
Warning, you specified a working directory via "-w"
Keep in mind that RAxML only accepts absolute path names, not relative ones!
RAxML can't, parse the alignment file as phylip file
it will now try to parse it as FASTA file
Using BFGS method to optimize GTR rate parameters, to disable this specify "--no-bfgs"
This is RAxML version 8.2.12 released by Alexandros Stamatakis on May 2018.
With greatly appreciated code contributions by:
Andre Aberer (HITS)
Simon Berger (HITS)
Alexey Kozlov (HITS)
Kassian Kobert (HITS)
David Dao (KIT and HITS)
Sarah Lutteropp (KIT and HITS)
Nick Pattengale (Sandia)
Wayne Pfeiffer (SDSC)
Akifumi S. Tanabe (NRIFS)
Charlie Taylor (UF)
Alignment has 157 distinct alignment patterns
Proportion of gaps and completely undetermined characters in this alignment: 39.77%
RAxML rapid hill-climbing mode
Using 1 distinct models/data partitions with joint branch length optimization
Executing 5 inferences on the original alignment using 5 distinct randomized MP trees
All free model parameters will be estimated by RAxML
ML estimate of 25 per site rate categories
Likelihood of final tree will be evaluated and optimized under GAMMA
GAMMA Model parameters will be estimated up to an accuracy of 0.1000000000 Log Likelihood units
Partition: 0
Alignment Patterns: 157
Name: No Name Provided
DataType: DNA
Substitution Matrix: GTR
RAxML was called as follows:
raxmlHPC -m GTRCAT -p 1723 -N 5 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta -w /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpmpu58t7m -n q2
Partition: 0 with name: No Name Provided
Base frequencies: 0.243 0.182 0.319 0.256
Inference[0]: Time 0.424027 CAT-based likelihood -1238.242991, best rearrangement setting 5
Inference[1]: Time 0.351334 CAT-based likelihood -1249.502284, best rearrangement setting 5
Inference[2]: Time 0.360563 CAT-based likelihood -1242.978035, best rearrangement setting 5
Inference[3]: Time 0.469736 CAT-based likelihood -1243.159855, best rearrangement setting 5
Inference[4]: Time 0.354395 CAT-based likelihood -1261.321621, best rearrangement setting 5
Conducting final model optimizations on all 5 trees under GAMMA-based models ....
Inference[0] final GAMMA-based Likelihood: -1388.324037 tree written to file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpmpu58t7m/RAxML_result.q2.RUN.0
Inference[1] final GAMMA-based Likelihood: -1392.813982 tree written to file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpmpu58t7m/RAxML_result.q2.RUN.1
Inference[2] final GAMMA-based Likelihood: -1388.073642 tree written to file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpmpu58t7m/RAxML_result.q2.RUN.2
Inference[3] final GAMMA-based Likelihood: -1387.945266 tree written to file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpmpu58t7m/RAxML_result.q2.RUN.3
Inference[4] final GAMMA-based Likelihood: -1387.557031 tree written to file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpmpu58t7m/RAxML_result.q2.RUN.4
Starting final GAMMA-based thorough Optimization on tree 4 likelihood -1387.557031 ....
Final GAMMA-based Score of best tree -1387.385075
Program execution info written to /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpmpu58t7m/RAxML_info.q2
Best-scoring ML tree written to: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpmpu58t7m/RAxML_bestTree.q2
Overall execution time: 2.504776 secs or 0.000696 hours or 0.000029 days
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: raxmlHPC -m GTRCAT -p 1723 -N 5 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta -w /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpmpu58t7m -n q2
Saved Phylogeny[Unrooted] to: raxml-cat-searches-tree.qza
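To see which of the five searches produced the returned tree, you can compare the final GAMMA-based likelihoods reported in the verbose output. RAxML makes this selection itself; the Python sketch below only shows how one might read it back from the log. The function name `best_search` is hypothetical, and the log lines are abbreviated from the output above (the temporary file paths are left out).

```python
import re

# Lines abbreviated from the verbose output above (file paths omitted).
LOG_EXCERPT = """\
Inference[0] final GAMMA-based Likelihood: -1388.324037
Inference[1] final GAMMA-based Likelihood: -1392.813982
Inference[2] final GAMMA-based Likelihood: -1388.073642
Inference[3] final GAMMA-based Likelihood: -1387.945266
Inference[4] final GAMMA-based Likelihood: -1387.557031
"""

PATTERN = re.compile(
    r"Inference\[(\d+)\] final GAMMA-based Likelihood: (-?\d+\.\d+)"
)


def best_search(log_text):
    """Return (search index, log likelihood) of the highest-scoring search."""
    scores = {int(index): float(value) for index, value in PATTERN.findall(log_text)}
    if not scores:
        raise ValueError("no 'final GAMMA-based Likelihood' lines found")
    best_index = max(scores, key=scores.get)  # least negative log likelihood
    return best_index, scores[best_index]


best_index, best_score = best_search(LOG_EXCERPT)
print(best_index, best_score)  # prints: 4 -1387.557031
```

This agrees with the log line "Starting final GAMMA-based thorough Optimization on tree 4", which is the tree that RAxML goes on to optimize and return.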
raxml-rapid-bootstrap¶
In phylogenetics, it is good practice to check how well the splits / bipartitions in your phylogeny are supported. Often one is interested in which clades are robustly separated from other clades in the phylogeny. One way of doing this is via bootstrapping (see the Bootstrapping section of the first introductory link above). In QIIME 2, we’ve provided access to the RAxML rapid bootstrap feature. The only differences between this command and the previous one are the additional flags --p-bootstrap-replicates and --p-rapid-bootstrap-seed. It is quite common to perform anywhere from 100 to 1000 bootstrap replicates. The --p-rapid-bootstrap-seed works very much like the --p-seed argument from above, except that it allows anyone to reproduce the bootstrapping process and the associated supports for your splits.
As per the RAxML online documentation and the RAxML manual, the rapid bootstrapping command that we will execute below will do the following:
1. Bootstrap the input alignment 100 times and perform a Maximum Likelihood (ML) search on each.
2. Find the best-scoring ML tree through multiple independent searches using the original input alignment. The number of independent searches is determined by the number of bootstrap replicates set in the 1st step. That is, your search becomes more thorough with increasing bootstrap replicates. The ML optimization of RAxML uses every 5th bootstrap tree as the starting tree for an ML search on the original alignment.
3. Map the bipartitions (bootstrap supports, 1st step) onto the best-scoring ML tree (2nd step).
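The resampling idea behind the first step, and the role of a seed in making it reproducible, can be sketched in Python. This is a plain column bootstrap on invented toy data, not RAxML's rapid bootstrap algorithm. The function names are hypothetical, and Python's random number generator is unrelated to RAxML's, so the same seed value will not reproduce RAxML's replicates.

```python
import random


def bootstrap_columns(alignment, rng):
    """Build one replicate by sampling columns with replacement."""
    width = len(next(iter(alignment.values())))
    picks = [rng.randrange(width) for _ in range(width)]
    # The same column picks are applied to every row, so rows stay aligned.
    return {
        seq_id: "".join(seq[col] for col in picks)
        for seq_id, seq in alignment.items()
    }


def bootstrap_replicates(alignment, n_replicates, seed):
    """Generate replicates from a seeded generator so runs are repeatable."""
    rng = random.Random(seed)
    return [bootstrap_columns(alignment, rng) for _ in range(n_replicates)]


# Invented toy alignment; a real analysis would resample the masked alignment.
toy = {"seq1": "ACGTAC", "seq2": "ACGTTC", "seq3": "ATGTAC"}

run_a = bootstrap_replicates(toy, n_replicates=100, seed=9384)
run_b = bootstrap_replicates(toy, n_replicates=100, seed=9384)
run_c = bootstrap_replicates(toy, n_replicates=100, seed=1)

print(run_a == run_b)  # same seed, identical replicates
print(len(run_a), len(run_a[0]["seq1"]))  # replicate count and replicate width
```

A tree would then be inferred from each replicate, and the support for a split is the share of replicate trees that contain it. That step needs a tree inference tool and is left out of this sketch.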
qiime phylogeny raxml-rapid-bootstrap \
--i-alignment masked-aligned-rep-seqs.qza \
--p-seed 1723 \
--p-rapid-bootstrap-seed 9384 \
--p-bootstrap-replicates 100 \
--p-substitution-model GTRCAT \
--o-tree raxml-cat-bootstrap-tree.qza \
--verbose
stdout:
Warning, you specified a working directory via "-w"
Keep in mind that RAxML only accepts absolute path names, not relative ones!
RAxML can't, parse the alignment file as phylip file
it will now try to parse it as FASTA file
Using BFGS method to optimize GTR rate parameters, to disable this specify "--no-bfgs"
This is RAxML version 8.2.12 released by Alexandros Stamatakis on May 2018.
With greatly appreciated code contributions by:
Andre Aberer (HITS)
Simon Berger (HITS)
Alexey Kozlov (HITS)
Kassian Kobert (HITS)
David Dao (KIT and HITS)
Sarah Lutteropp (KIT and HITS)
Nick Pattengale (Sandia)
Wayne Pfeiffer (SDSC)
Akifumi S. Tanabe (NRIFS)
Charlie Taylor (UF)
Alignment has 157 distinct alignment patterns
Proportion of gaps and completely undetermined characters in this alignment: 39.77%
RAxML rapid bootstrapping and subsequent ML search
Using 1 distinct models/data partitions with joint branch length optimization
Executing 100 rapid bootstrap inferences and thereafter a thorough ML search
All free model parameters will be estimated by RAxML
ML estimate of 25 per site rate categories
Likelihood of final tree will be evaluated and optimized under GAMMA
GAMMA Model parameters will be estimated up to an accuracy of 0.1000000000 Log Likelihood units
Partition: 0
Alignment Patterns: 157
Name: No Name Provided
DataType: DNA
Substitution Matrix: GTR
RAxML was called as follows:
raxmlHPC -f a -m GTRCAT -p 1723 -x 9384 -N 100 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta -w /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpfix82ng9 -n q2bootstrap
Time for BS model parameter optimization 0.036146
Bootstrap[0]: Time 0.120850 seconds, bootstrap likelihood -1199.758796, best rearrangement setting 12
Bootstrap[1]: Time 0.085275 seconds, bootstrap likelihood -1344.229251, best rearrangement setting 6
Bootstrap[2]: Time 0.073894 seconds, bootstrap likelihood -1295.343000, best rearrangement setting 8
Bootstrap[3]: Time 0.064034 seconds, bootstrap likelihood -1273.768320, best rearrangement setting 8
Bootstrap[4]: Time 0.076273 seconds, bootstrap likelihood -1253.402952, best rearrangement setting 6
Bootstrap[5]: Time 0.079684 seconds, bootstrap likelihood -1260.866113, best rearrangement setting 10
Bootstrap[6]: Time 0.078379 seconds, bootstrap likelihood -1293.636299, best rearrangement setting 14
Bootstrap[7]: Time 0.071599 seconds, bootstrap likelihood -1227.178693, best rearrangement setting 6
Bootstrap[8]: Time 0.078243 seconds, bootstrap likelihood -1321.820787, best rearrangement setting 13
Bootstrap[9]: Time 0.082549 seconds, bootstrap likelihood -1147.233446, best rearrangement setting 6
Bootstrap[10]: Time 0.061155 seconds, bootstrap likelihood -1220.766493, best rearrangement setting 13
Bootstrap[11]: Time 0.083357 seconds, bootstrap likelihood -1200.006355, best rearrangement setting 8
Bootstrap[12]: Time 0.089375 seconds, bootstrap likelihood -1346.392834, best rearrangement setting 14
Bootstrap[13]: Time 0.074733 seconds, bootstrap likelihood -1301.111096, best rearrangement setting 14
Bootstrap[14]: Time 0.081328 seconds, bootstrap likelihood -1262.253559, best rearrangement setting 11
Bootstrap[15]: Time 0.079083 seconds, bootstrap likelihood -1215.017551, best rearrangement setting 14
Bootstrap[16]: Time 0.074963 seconds, bootstrap likelihood -1238.832009, best rearrangement setting 7
Bootstrap[17]: Time 0.069074 seconds, bootstrap likelihood -1393.989732, best rearrangement setting 12
Bootstrap[18]: Time 0.070846 seconds, bootstrap likelihood -1173.921002, best rearrangement setting 15
Bootstrap[19]: Time 0.076723 seconds, bootstrap likelihood -1185.726976, best rearrangement setting 11
Bootstrap[20]: Time 0.066895 seconds, bootstrap likelihood -1158.491940, best rearrangement setting 6
Bootstrap[21]: Time 0.064639 seconds, bootstrap likelihood -1154.664272, best rearrangement setting 11
Bootstrap[22]: Time 0.073533 seconds, bootstrap likelihood -1244.159837, best rearrangement setting 10
Bootstrap[23]: Time 0.088212 seconds, bootstrap likelihood -1211.171036, best rearrangement setting 15
Bootstrap[24]: Time 0.072250 seconds, bootstrap likelihood -1261.440677, best rearrangement setting 12
Bootstrap[25]: Time 0.073180 seconds, bootstrap likelihood -1331.836715, best rearrangement setting 15
Bootstrap[26]: Time 0.075229 seconds, bootstrap likelihood -1129.144509, best rearrangement setting 5
Bootstrap[27]: Time 0.091511 seconds, bootstrap likelihood -1226.624056, best rearrangement setting 7
Bootstrap[28]: Time 0.094367 seconds, bootstrap likelihood -1221.046176, best rearrangement setting 12
Bootstrap[29]: Time 0.059792 seconds, bootstrap likelihood -1211.791204, best rearrangement setting 14
Bootstrap[30]: Time 0.075760 seconds, bootstrap likelihood -1389.442380, best rearrangement setting 5
Bootstrap[31]: Time 0.072990 seconds, bootstrap likelihood -1303.638592, best rearrangement setting 12
Bootstrap[32]: Time 0.078196 seconds, bootstrap likelihood -1172.859456, best rearrangement setting 12
Bootstrap[33]: Time 0.069298 seconds, bootstrap likelihood -1244.617135, best rearrangement setting 9
Bootstrap[34]: Time 0.070307 seconds, bootstrap likelihood -1211.871717, best rearrangement setting 15
Bootstrap[35]: Time 0.083623 seconds, bootstrap likelihood -1299.862912, best rearrangement setting 5
Bootstrap[36]: Time 0.071626 seconds, bootstrap likelihood -1141.967505, best rearrangement setting 5
Bootstrap[37]: Time 0.090573 seconds, bootstrap likelihood -1283.923198, best rearrangement setting 12
Bootstrap[38]: Time 0.069441 seconds, bootstrap likelihood -1304.250946, best rearrangement setting 5
Bootstrap[39]: Time 0.062013 seconds, bootstrap likelihood -1407.084376, best rearrangement setting 15
Bootstrap[40]: Time 0.075738 seconds, bootstrap likelihood -1277.946299, best rearrangement setting 13
Bootstrap[41]: Time 0.075632 seconds, bootstrap likelihood -1279.006200, best rearrangement setting 7
Bootstrap[42]: Time 0.072630 seconds, bootstrap likelihood -1160.274606, best rearrangement setting 6
Bootstrap[43]: Time 0.088079 seconds, bootstrap likelihood -1216.079259, best rearrangement setting 14
Bootstrap[44]: Time 0.066855 seconds, bootstrap likelihood -1382.278311, best rearrangement setting 8
Bootstrap[45]: Time 0.074826 seconds, bootstrap likelihood -1099.004439, best rearrangement setting 11
Bootstrap[46]: Time 0.061724 seconds, bootstrap likelihood -1296.527478, best rearrangement setting 8
Bootstrap[47]: Time 0.104248 seconds, bootstrap likelihood -1291.322658, best rearrangement setting 9
Bootstrap[48]: Time 0.059758 seconds, bootstrap likelihood -1161.908080, best rearrangement setting 6
Bootstrap[49]: Time 0.081085 seconds, bootstrap likelihood -1257.348428, best rearrangement setting 13
Bootstrap[50]: Time 0.093757 seconds, bootstrap likelihood -1309.422533, best rearrangement setting 13
Bootstrap[51]: Time 0.067278 seconds, bootstrap likelihood -1197.633097, best rearrangement setting 11
Bootstrap[52]: Time 0.076540 seconds, bootstrap likelihood -1347.123005, best rearrangement setting 8
Bootstrap[53]: Time 0.066536 seconds, bootstrap likelihood -1234.934890, best rearrangement setting 14
Bootstrap[54]: Time 0.079883 seconds, bootstrap likelihood -1227.092434, best rearrangement setting 6
Bootstrap[55]: Time 0.081940 seconds, bootstrap likelihood -1280.635747, best rearrangement setting 7
Bootstrap[56]: Time 0.068312 seconds, bootstrap likelihood -1225.911449, best rearrangement setting 6
Bootstrap[57]: Time 0.062246 seconds, bootstrap likelihood -1236.213347, best rearrangement setting 11
Bootstrap[58]: Time 0.098091 seconds, bootstrap likelihood -1393.245723, best rearrangement setting 14
Bootstrap[59]: Time 0.075491 seconds, bootstrap likelihood -1212.039371, best rearrangement setting 6
Bootstrap[60]: Time 0.068581 seconds, bootstrap likelihood -1248.692011, best rearrangement setting 10
Bootstrap[61]: Time 0.077918 seconds, bootstrap likelihood -1172.820979, best rearrangement setting 13
Bootstrap[62]: Time 0.090778 seconds, bootstrap likelihood -1126.745788, best rearrangement setting 14
Bootstrap[63]: Time 0.068585 seconds, bootstrap likelihood -1267.434444, best rearrangement setting 12
Bootstrap[64]: Time 0.065728 seconds, bootstrap likelihood -1340.680748, best rearrangement setting 5
Bootstrap[65]: Time 0.065968 seconds, bootstrap likelihood -1072.671059, best rearrangement setting 5
Bootstrap[66]: Time 0.082415 seconds, bootstrap likelihood -1234.294838, best rearrangement setting 8
Bootstrap[67]: Time 0.083069 seconds, bootstrap likelihood -1109.249439, best rearrangement setting 15
Bootstrap[68]: Time 0.063245 seconds, bootstrap likelihood -1314.493588, best rearrangement setting 8
Bootstrap[69]: Time 0.062872 seconds, bootstrap likelihood -1173.850035, best rearrangement setting 13
Bootstrap[70]: Time 0.068895 seconds, bootstrap likelihood -1231.066465, best rearrangement setting 10
Bootstrap[71]: Time 0.069430 seconds, bootstrap likelihood -1146.861379, best rearrangement setting 9
Bootstrap[72]: Time 0.059495 seconds, bootstrap likelihood -1148.753369, best rearrangement setting 8
Bootstrap[73]: Time 0.070391 seconds, bootstrap likelihood -1333.374056, best rearrangement setting 9
Bootstrap[74]: Time 0.062014 seconds, bootstrap likelihood -1259.382378, best rearrangement setting 5
Bootstrap[75]: Time 0.067858 seconds, bootstrap likelihood -1319.944496, best rearrangement setting 6
Bootstrap[76]: Time 0.076973 seconds, bootstrap likelihood -1309.042165, best rearrangement setting 14
Bootstrap[77]: Time 0.093279 seconds, bootstrap likelihood -1232.061289, best rearrangement setting 8
Bootstrap[78]: Time 0.076198 seconds, bootstrap likelihood -1261.333984, best rearrangement setting 9
Bootstrap[79]: Time 0.078577 seconds, bootstrap likelihood -1194.644341, best rearrangement setting 13
Bootstrap[80]: Time 0.070229 seconds, bootstrap likelihood -1214.037389, best rearrangement setting 9
Bootstrap[81]: Time 0.075038 seconds, bootstrap likelihood -1224.527657, best rearrangement setting 8
Bootstrap[82]: Time 0.088783 seconds, bootstrap likelihood -1241.464826, best rearrangement setting 11
Bootstrap[83]: Time 0.067424 seconds, bootstrap likelihood -1230.730558, best rearrangement setting 6
Bootstrap[84]: Time 0.068474 seconds, bootstrap likelihood -1219.034592, best rearrangement setting 10
Bootstrap[85]: Time 0.074125 seconds, bootstrap likelihood -1280.071994, best rearrangement setting 8
Bootstrap[86]: Time 0.064095 seconds, bootstrap likelihood -1444.747777, best rearrangement setting 9
Bootstrap[87]: Time 0.067384 seconds, bootstrap likelihood -1245.890035, best rearrangement setting 14
Bootstrap[88]: Time 0.079186 seconds, bootstrap likelihood -1287.832766, best rearrangement setting 7
Bootstrap[89]: Time 0.069966 seconds, bootstrap likelihood -1325.245976, best rearrangement setting 5
Bootstrap[90]: Time 0.080752 seconds, bootstrap likelihood -1227.883697, best rearrangement setting 5
Bootstrap[91]: Time 0.077056 seconds, bootstrap likelihood -1273.489392, best rearrangement setting 8
Bootstrap[92]: Time 0.030082 seconds, bootstrap likelihood -1234.725870, best rearrangement setting 7
Bootstrap[93]: Time 0.083134 seconds, bootstrap likelihood -1235.733064, best rearrangement setting 11
Bootstrap[94]: Time 0.067809 seconds, bootstrap likelihood -1204.319488, best rearrangement setting 15
Bootstrap[95]: Time 0.065750 seconds, bootstrap likelihood -1183.328582, best rearrangement setting 11
Bootstrap[96]: Time 0.077528 seconds, bootstrap likelihood -1196.298898, best rearrangement setting 13
Bootstrap[97]: Time 0.081839 seconds, bootstrap likelihood -1339.251746, best rearrangement setting 12
Bootstrap[98]: Time 0.030590 seconds, bootstrap likelihood -1404.363552, best rearrangement setting 7
Bootstrap[99]: Time 0.039784 seconds, bootstrap likelihood -1270.157811, best rearrangement setting 7
Overall Time for 100 Rapid Bootstraps 7.425993 seconds
Average Time per Rapid Bootstrap 0.074260 seconds
Starting ML Search ...
Fast ML optimization finished
Fast ML search Time: 3.063765 seconds
Slow ML Search 0 Likelihood: -1387.994678
Slow ML Search 1 Likelihood: -1387.994678
Slow ML Search 2 Likelihood: -1387.994676
Slow ML Search 3 Likelihood: -1387.994650
Slow ML Search 4 Likelihood: -1387.994685
Slow ML Search 5 Likelihood: -1388.092954
Slow ML Search 6 Likelihood: -1388.182551
Slow ML Search 7 Likelihood: -1388.182563
Slow ML Search 8 Likelihood: -1388.182547
Slow ML Search 9 Likelihood: -1387.994723
Slow ML optimization finished
Slow ML search Time: 1.571358 seconds
Thorough ML search Time: 0.419602 seconds
Final ML Optimization Likelihood: -1387.204993
Model Information:
Model Parameters of Partition 0, Name: No Name Provided, Type of Data: DNA
alpha: 1.227800
Tree-Length: 7.823400
rate A <-> C: 0.332564
rate A <-> G: 2.312784
rate A <-> T: 2.215466
rate C <-> G: 1.243321
rate C <-> T: 3.278770
rate G <-> T: 1.000000
freq pi(A): 0.243216
freq pi(C): 0.181967
freq pi(G): 0.319196
freq pi(T): 0.255621
ML search took 5.058340 secs or 0.001405 hours
Combined Bootstrap and ML search took 12.484499 secs or 0.003468 hours
Drawing Bootstrap Support Values on best-scoring ML tree ...
Found 1 tree in File /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpfix82ng9/RAxML_bestTree.q2bootstrap
Found 1 tree in File /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpfix82ng9/RAxML_bestTree.q2bootstrap
Program execution info written to /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpfix82ng9/RAxML_info.q2bootstrap
All 100 bootstrapped trees written to: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpfix82ng9/RAxML_bootstrap.q2bootstrap
Best-scoring ML tree written to: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpfix82ng9/RAxML_bestTree.q2bootstrap
Best-scoring ML tree with support values written to: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpfix82ng9/RAxML_bipartitions.q2bootstrap
Best-scoring ML tree with support values as branch labels written to: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpfix82ng9/RAxML_bipartitionsBranchLabels.q2bootstrap
Overall execution time for full ML analysis: 12.493003 secs or 0.003470 hours or 0.000145 days
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: raxmlHPC -f a -m GTRCAT -p 1723 -x 9384 -N 100 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta -w /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpfix82ng9 -n q2bootstrap
Saved Phylogeny[Unrooted] to: raxml-cat-bootstrap-tree.qza
Tip
Optimizing RAxML Run Time.
You may have noticed that we haven’t added the --p-raxml-version flag to
the RAxML methods. This parameter provides a means to access versions of
RAxML that have optimized vector instructions for various modern x86
processor architectures. Paraphrased from the RAxML manual and help
documentation: Firstly, most recent processors will support SSE3 vector
instructions (and will likely also support the faster AVX2 vector
instructions). Secondly, these instructions will substantially accelerate
the likelihood and parsimony computations. In general, SSE3 versions will
run approximately 40% faster than the standard version. The AVX2 version
will run 10-30% faster than the SSE3 version. Additionally, keep in mind
that using more cores / threads will not necessarily decrease run time. The
RAxML manual suggests using 1 core per ~500 DNA alignment patterns.
Alignment pattern information is usually visible on screen when the
--verbose option is used. Additionally, try using a rate category (CAT)
model via --p-substitution-model, which results in trees that are as good
as those from the GAMMA models and is approximately 4 times faster. See the
CAT paper. The CAT approximation is also ideal for alignments containing
10,000 or more taxa, and is very similar to the CAT-like model of FastTree2.
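As a rough illustration of the "1 core per ~500 alignment patterns" guideline in the tip above, the sketch below turns a pattern count into a thread suggestion. The helper name suggest_threads and its round-up behaviour are assumptions made for this example; they are not part of RAxML or QIIME 2. The commented-out command shows how the result might be combined with --p-raxml-version and a CAT model. The --p-n-threads parameter, the SSE3 value, and the output file name are assumptions here, so confirm them with qiime phylogeny raxml --help for your installed release.

```shell
# Hypothetical helper: suggest a thread count from the number of distinct
# alignment patterns, using the ~500 patterns per core rule of thumb.
suggest_threads() {
    patterns=$1
    # Integer division rounded up, so 1-500 patterns -> 1, 501-1000 -> 2, ...
    threads=$(( (patterns + 499) / 500 ))
    # Never suggest fewer than one thread, even for an empty count.
    if [ "$threads" -lt 1 ]; then
        threads=1
    fi
    echo "$threads"
}

# Illustrative pattern counts: a small alignment and a larger one.
suggest_threads 157
suggest_threads 1200

# Possible use with RAxML (not run here; parameter values are assumptions):
# qiime phylogeny raxml \
#   --i-alignment masked-aligned-rep-seqs.qza \
#   --p-substitution-model GTRCAT \
#   --p-raxml-version SSE3 \
#   --p-n-threads "$(suggest_threads 157)" \
#   --o-tree raxml-cat-sse3-tree.qza \
#   --verbose
```

For a small alignment like the one used in this tutorial, the suggestion is a single thread, which is consistent with the note that more threads will not necessarily decrease run time.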
iqtree¶
Similar to the raxml and raxml-rapid-bootstrap methods above, we provide
similar functionality for IQ-TREE: iqtree and iqtree-ultrafast-bootstrap.
IQ-TREE is unique compared to the fasttree and raxml options, in that it
provides access to 286 models of nucleotide substitution! IQ-TREE can also
determine which of these models best fits your dataset prior to constructing
your tree via its built-in ModelFinder algorithm. This is the default in
QIIME 2, but do not worry, you can set any one of the 286 models of
nucleotide substitution via the --p-substitution-model flag, e.g. you can
set the model as HKY+I+G instead of the default MFP (a basic short-hand for:
“build a phylogeny after determining the best fit model as determined by
ModelFinder”). Keep in mind the additional computational time required for
model testing via ModelFinder.
The simplest way to run the iqtree command with default settings and
automatic model selection (MFP) is like so:
qiime phylogeny iqtree \
--i-alignment masked-aligned-rep-seqs.qza \
--o-tree iqt-tree.qza \
--verbose
stdout:
IQ-TREE multicore version 2.3.4 COVID-edition for MacOS Intel 64-bit built Jun 18 2024
Developed by Bui Quang Minh, James Barbetti, Nguyen Lam Tung, Olga Chernomor,
Heiko Schmidt, Dominik Schrempf, Michael Woodhams, Ly Trong Nhan, Thomas Wong
Host: Elizabeths-MacBook-Pro-7.local (AVX512, FMA3, 32 GB RAM)
Command: iqtree -st DNA --runs 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta -m MFP -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpcftdegmd/q2iqtree -nt 1
Seed: 994151 (Using SPRNG - Scalable Parallel Random Number Generator)
Time: Mon Jul 29 18:14:33 2024
Kernel: AVX+FMA - 1 threads (8 CPU cores detected)
HINT: Use -nt option to specify number of threads because your CPU has 8 cores!
HINT: -nt AUTO will automatically determine the best number of threads to use.
Reading alignment file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta ... Fasta format detected
Reading fasta file: done in 0.000115871 secs using 81.99% CPU
Alignment most likely contains DNA/RNA sequences
Alignment has 20 sequences with 214 columns, 157 distinct patterns
104 parsimony-informative, 33 singleton sites, 77 constant sites
Gap/Ambiguity Composition p-value
Analyzing sequences: done in 4.60148e-05 secs using 41.29% CPU
1 e84fcf85a6a4065231dcf343bb862f1cb32abae6 40.65% passed 90.91%
2 5525fb6dab7b6577960147574465990c6df070ad 42.99% passed 99.80%
3 eb3564a35320b53cef22a77288838c7446357327 42.99% passed 25.49%
4 418f1d469f08c99976b313028cf6d3f18f61dd55 43.93% passed 71.86%
5 2e3b2c075901640c4de739473f9246385430b1ed 31.31% passed 90.76%
6 0469f8d819bd45c7638d1c8b0895270a05f34267 38.79% passed 92.82%
7 d162ed685007f5adede58f14aece31dfa1b60c18 40.65% passed 97.17%
8 1d45b2bce36cd995c5dcb755babf512e612ce8b9 41.59% passed 39.04%
9 5aba6bd9debc23ded7041ffdcfe5d68a427e8ce8 31.31% passed 87.21%
10 206656bec2abdbc4aee37a661ef5f4a62b5dd6ae 42.99% passed 85.00%
11 606c23e79bb730ad74e3c6efd72004c36674c17a 47.20% passed 87.78%
12 682e91d7e510ab134d0625234ad224f647c14eb0 41.59% passed 31.01%
13 6a36152105590b1eb095b9503e8f1f226fc73e43 39.25% passed 86.29%
14 6ca685c39a33bfbcb3123129e7af88d573df7d6f 42.06% failed 0.02%
15 8a1c44eb462ed58b21f3fdd72dd22bb657db2980 31.78% passed 54.40%
16 9b220cae8d375ea38b8b481cb95949cda8722fcb 36.92% passed 88.78%
17 aa4698d2e2b1fa71d08e2934a923aad7374a18f6 37.85% passed 90.52%
18 b31aa3f04bc9d5e2498d45cf1983dfaf09faa258 31.78% passed 72.69%
19 d44b129a6181f052198bda3813f0802a91612441 41.59% passed 41.69%
20 ed1acad8a98e8579a44370733533ad7d3fed8006 48.13% passed 58.15%
**** TOTAL 39.77% 1 sequences failed composition chi2 test (p-value<5%; df=3)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.001 seconds
Perform fast likelihood tree search using GTR+I+G model...
Estimate model parameters (epsilon = 5.000)
Perform nearest neighbor interchange...
Estimate model parameters (epsilon = 1.000)
1. Initial log-likelihood: -1389.605
Optimal log-likelihood: -1388.793
Rate parameters: A-C: 0.37543 A-G: 2.37167 A-T: 2.15334 C-G: 1.24271 C-T: 3.32365 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.034
Gamma shape alpha: 1.400
Parameters optimization took 1 rounds (0.003 sec)
Time for fast ML tree search: 0.035 seconds
NOTE: ModelFinder requires 1 MB RAM!
ModelFinder will test up to 484 DNA models (sample size: 214) ...
No. Model -LnL df AIC AICc BIC
1 GTR+F 1402.600 45 2895.200 2919.843 3046.669
2 GTR+F+I 1401.121 46 2894.242 2920.135 3049.077
3 GTR+F+G4 1387.358 46 2866.716 2892.609 3021.551
4 GTR+F+I+G4 1387.726 47 2869.452 2896.633 3027.653
5 GTR+F+R2 1382.364 47 2858.729 2885.910 3016.930
6 GTR+F+R3 1382.420 49 2862.840 2892.718 3027.773
14 GTR+F+I+R2 1382.418 48 2860.837 2889.346 3022.403
15 GTR+F+I+R3 1382.449 50 2864.899 2896.187 3033.197
25 SYM+G4 1387.134 43 2860.269 2882.528 3005.006
27 SYM+R2 1383.095 44 2854.189 2877.621 3002.292
36 SYM+I+R2 1383.227 45 2856.454 2881.097 3007.923
47 TVM+F+G4 1388.357 45 2866.713 2891.356 3018.182
49 TVM+F+R2 1383.789 46 2859.578 2885.470 3014.413
58 TVM+F+I+R2 1383.812 47 2861.625 2888.805 3019.826
69 TVMe+G4 1387.122 42 2858.245 2879.368 2999.616
71 TVMe+R2 1383.079 43 2852.159 2874.418 2996.896
80 TVMe+I+R2 1383.224 44 2854.449 2877.881 3002.552
91 TIM3+F+G4 1391.377 44 2870.754 2894.186 3018.857
93 TIM3+F+R2 1385.912 45 2861.825 2886.468 3013.294
102 TIM3+F+I+R2 1386.041 46 2864.082 2889.975 3018.917
113 TIM3e+G4 1390.358 41 2862.715 2882.738 3000.720
115 TIM3e+R2 1385.918 42 2855.836 2876.959 2997.207
124 TIM3e+I+R2 1386.073 43 2858.145 2880.404 3002.882
135 TIM2+F+G4 1393.635 44 2875.270 2898.702 3023.373
137 TIM2+F+R2 1387.681 45 2865.362 2890.005 3016.831
146 TIM2+F+I+R2 1387.782 46 2867.564 2893.456 3022.399
157 TIM2e+G4 1396.795 41 2875.589 2895.613 3013.594
159 TIM2e+R2 1391.574 42 2867.148 2888.270 3008.519
168 TIM2e+I+R2 1391.651 43 2869.302 2891.561 3014.039
179 TIM+F+G4 1390.363 44 2868.726 2892.158 3016.829
181 TIM+F+R2 1384.933 45 2859.866 2884.509 3011.335
190 TIM+F+I+R2 1385.016 46 2862.032 2887.925 3016.867
201 TIMe+G4 1394.002 41 2870.005 2890.028 3008.010
203 TIMe+R2 1389.000 42 2862.000 2883.123 3003.371
212 TIMe+I+R2 1389.095 43 2864.190 2886.449 3008.927
223 TPM3u+F+G4 1392.306 43 2870.611 2892.870 3015.348
225 TPM3u+F+R2 1387.329 44 2862.659 2886.091 3010.762
234 TPM3u+F+I+R2 1387.462 45 2864.923 2889.566 3016.392
245 TPM3+G4 1390.374 40 2860.748 2879.708 2995.387
247 TPM3+R2 1385.925 41 2853.851 2873.874 2991.856
256 TPM3+I+R2 1386.070 42 2856.140 2877.263 2997.511
267 TPM2u+F+G4 1394.533 43 2875.067 2897.325 3019.804
269 TPM2u+F+R2 1389.057 44 2866.113 2889.545 3014.216
278 TPM2u+F+I+R2 1389.101 45 2868.201 2892.844 3019.670
289 TPM2+G4 1396.823 40 2873.646 2892.605 3008.285
291 TPM2+R2 1391.578 41 2865.155 2885.178 3003.160
300 TPM2+I+R2 1391.649 42 2867.297 2888.420 3008.668
311 K3Pu+F+G4 1391.381 43 2868.762 2891.021 3013.499
313 K3Pu+F+R2 1386.371 44 2860.742 2884.174 3008.845
322 K3Pu+F+I+R2 1386.425 45 2862.850 2887.493 3014.319
333 K3P+G4 1394.015 40 2868.030 2886.989 3002.669
335 K3P+R2 1389.002 41 2860.004 2880.028 2998.009
344 K3P+I+R2 1389.099 42 2862.197 2883.320 3003.568
355 TN+F+G4 1394.038 43 2874.077 2896.336 3018.814
357 TN+F+R2 1388.241 44 2864.483 2887.915 3012.586
366 TN+F+I+R2 1388.289 45 2866.578 2891.221 3018.047
377 TNe+G4 1396.791 40 2873.582 2892.542 3008.221
379 TNe+R2 1391.586 41 2865.172 2885.195 3003.177
388 TNe+I+R2 1391.666 42 2867.332 2888.454 3008.703
399 HKY+F+G4 1394.951 42 2873.902 2895.024 3015.273
401 HKY+F+R2 1389.609 43 2865.217 2887.476 3009.954
410 HKY+F+I+R2 1389.663 44 2867.327 2890.759 3015.430
421 K2P+G4 1396.825 39 2871.649 2889.580 3002.922
423 K2P+R2 1391.594 40 2863.189 2882.148 2997.828
432 K2P+I+R2 1391.664 41 2865.327 2885.351 3003.332
443 F81+F+G4 1405.743 41 2893.486 2913.509 3031.491
445 F81+F+R2 1400.805 42 2885.611 2906.733 3026.982
454 F81+F+I+R2 1400.908 43 2887.816 2910.075 3032.553
465 JC+G4 1407.650 38 2891.299 2908.236 3019.206
467 JC+R2 1402.858 39 2883.715 2901.646 3014.988
476 JC+I+R2 1402.926 40 2885.851 2904.811 3020.490
Akaike Information Criterion: TVMe+R2
Corrected Akaike Information Criterion: TPM3+R2
Bayesian Information Criterion: TPM3+R2
Best-fit model: TPM3+R2 chosen according to BIC
All model information printed to /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpcftdegmd/q2iqtree.model.gz
CPU time for ModelFinder: 0.764 seconds (0h:0m:0s)
Wall-clock time for ModelFinder: 0.775 seconds (0h:0m:0s)
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.100)
1. Initial log-likelihood: -1385.925
Optimal log-likelihood: -1385.924
Rate parameters: A-C: 0.40868 A-G: 1.56206 A-T: 1.00000 C-G: 0.40868 C-T: 1.56206 G-T: 1.00000
Base frequencies: A: 0.250 C: 0.250 G: 0.250 T: 0.250
Site proportion and rates: (0.716,0.409) (0.284,2.490)
Parameters optimization took 1 rounds (0.002 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000636101 secs using 97.63% CPU
Computing ML distances took 0.000684 sec (of wall-clock time) 0.000659 sec (of CPU time)
Setting up auxiliary I and S matrices: done in 2.90871e-05 secs using 75.63% CPU
Computing RapidNJ tree took 0.000109 sec (of wall-clock time) 0.000111 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1393.820
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Generating 98 parsimony trees... 0.061 second
Computing log-likelihood of 98 initial trees ... 0.049 seconds
Current best score: -1385.924
Do NNI search on 20 best initial trees
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 1: -1385.880
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 2: -1385.311
Iteration 10 / LogL: -1385.344 / Time: 0h:0m:0s
Iteration 20 / LogL: -1385.343 / Time: 0h:0m:1s
Finish initializing candidate tree set (2)
Current best tree score: -1385.311 / CPU time: 0.245
Number of iterations: 20
--------------------------------------------------------------------
| OPTIMIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
UPDATE BEST LOG-LIKELIHOOD: -1385.311
Iteration 30 / LogL: -1385.837 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 40 / LogL: -1385.863 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 50 / LogL: -1385.536 / Time: 0h:0m:1s (0h:0m:0s left)
BETTER TREE FOUND at iteration 56: -1385.310
Iteration 60 / LogL: -1385.316 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 70 / LogL: -1385.727 / Time: 0h:0m:1s (0h:0m:0s left)
UPDATE BEST LOG-LIKELIHOOD: -1385.310
Iteration 80 / LogL: -1385.311 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 90 / LogL: -1385.650 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 100 / LogL: -1385.741 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 110 / LogL: -1385.311 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 120 / LogL: -1385.370 / Time: 0h:0m:1s (0h:0m:0s left)
UPDATE BEST LOG-LIKELIHOOD: -1385.310
Iteration 130 / LogL: -1385.319 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 140 / LogL: -1385.929 / Time: 0h:0m:2s (0h:0m:0s left)
Iteration 150 / LogL: -1385.850 / Time: 0h:0m:2s (0h:0m:0s left)
TREE SEARCH COMPLETED AFTER 157 ITERATIONS / Time: 0h:0m:2s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.010)
1. Initial log-likelihood: -1385.310
Optimal log-likelihood: -1385.306
Rate parameters: A-C: 0.39447 A-G: 1.56668 A-T: 1.00000 C-G: 0.39447 C-T: 1.56668 G-T: 1.00000
Base frequencies: A: 0.250 C: 0.250 G: 0.250 T: 0.250
Site proportion and rates: (0.721,0.402) (0.279,2.542)
Parameters optimization took 1 rounds (0.002 sec)
BEST SCORE FOUND : -1385.306
Total tree length: 6.846
Total number of iterations: 157
CPU time used for tree search: 1.576 sec (0h:0m:1s)
Wall-clock time used for tree search: 1.395 sec (0h:0m:1s)
Total CPU time used: 2.357 sec (0h:0m:2s)
Total wall-clock time used: 2.184 sec (0h:0m:2s)
Analysis results written to:
IQ-TREE report: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpcftdegmd/q2iqtree.iqtree
Maximum-likelihood tree: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpcftdegmd/q2iqtree.treefile
Likelihood distances: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpcftdegmd/q2iqtree.mldist
Screen log file: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpcftdegmd/q2iqtree.log
Date and Time: Mon Jul 29 18:14:36 2024
n cores 1
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: iqtree -st DNA --runs 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta -m MFP -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpcftdegmd/q2iqtree -nt 1
Saved Phylogeny[Unrooted] to: iqt-tree.qza
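To make the ModelFinder table in the output above easier to read, the sketch below recomputes the three information criteria for a single row from its -LnL, df, and the sample size (214 alignment columns). It assumes the standard definitions AIC = 2k + 2(-LnL), AICc = AIC + 2k(k+1)/(n-k-1), and BIC = k ln(n) + 2(-LnL), where k is df and n is the sample size. The function name model_criteria is hypothetical and awk is used only for the floating-point arithmetic.

```shell
# Hypothetical helper: recompute AIC, AICc and BIC for one ModelFinder row.
# Arguments: negative log-likelihood (-LnL), free parameters (df), sample size (n).
model_criteria() {
    awk -v nll="$1" -v k="$2" -v n="$3" 'BEGIN {
        aic  = 2 * k + 2 * nll                         # penalty + fit
        aicc = aic + (2 * k * (k + 1)) / (n - k - 1)   # small-sample correction
        bic  = k * log(n) + 2 * nll                    # log(n) is the natural log
        printf "AIC=%.3f AICc=%.3f BIC=%.3f\n", aic, aicc, bic
    }'
}

# TPM3+R2 row from the table: -LnL 1385.925, df 41, sample size 214.
model_criteria 1385.925 41 214
```

The printed values should agree with the TPM3+R2 row (2853.851, 2873.874, 2991.856) to within the rounding of the displayed -LnL. Lower values indicate a better trade-off between fit and number of parameters, which is why TPM3+R2 is reported as the best-fit model according to BIC.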
Specifying a substitution model¶
We can also set a substitution model of our choosing. You may have noticed
while watching the onscreen output of the previous command that the best
fitting model selected by ModelFinder is noted. For the sake of argument, let’s
say the best selected model was shown as GTR+F+I+G4. The F is only a
notation to let us know that if a given model supports unequal base
frequencies, then the empirical base frequencies will be used by default.
Using empirical base frequencies (F), rather than estimating them, greatly
reduces computational time. The iqtree plugin will not accept F within
the model notation supplied at the command line, as this will always be implied
automatically for the appropriate model. Also, the iqtree plugin only
accepts G, not G4, to be specified within the model notation. The 4
is simply another explicit notation to remind us that four rate categories are
being assumed by default. The notation approach used by the plugin simply helps
to retain simplicity and familiarity when supplying model notations on the
command line. So, in brief, we only have to type GTR+I+G as our input
model:
qiime phylogeny iqtree \
--i-alignment masked-aligned-rep-seqs.qza \
--p-substitution-model 'GTR+I+G' \
--o-tree iqt-gtrig-tree.qza \
--verbose
stdout:
IQ-TREE multicore version 2.3.4 COVID-edition for MacOS Intel 64-bit built Jun 18 2024
Developed by Bui Quang Minh, James Barbetti, Nguyen Lam Tung, Olga Chernomor,
Heiko Schmidt, Dominik Schrempf, Michael Woodhams, Ly Trong Nhan, Thomas Wong
Host: Elizabeths-MacBook-Pro-7.local (AVX512, FMA3, 32 GB RAM)
Command: iqtree -st DNA --runs 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta -m GTR+I+G -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpm79lebdg/q2iqtree -nt 1
Seed: 818703 (Using SPRNG - Scalable Parallel Random Number Generator)
Time: Mon Jul 29 18:14:44 2024
Kernel: AVX+FMA - 1 threads (8 CPU cores detected)
HINT: Use -nt option to specify number of threads because your CPU has 8 cores!
HINT: -nt AUTO will automatically determine the best number of threads to use.
Reading alignment file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta ... Fasta format detected
Reading fasta file: done in 0.000100851 secs using 88.25% CPU
Alignment most likely contains DNA/RNA sequences
Alignment has 20 sequences with 214 columns, 157 distinct patterns
104 parsimony-informative, 33 singleton sites, 77 constant sites
Gap/Ambiguity Composition p-value
Analyzing sequences: done in 1.40667e-05 secs using 63.98% CPU
1 e84fcf85a6a4065231dcf343bb862f1cb32abae6 40.65% passed 90.91%
2 5525fb6dab7b6577960147574465990c6df070ad 42.99% passed 99.80%
3 eb3564a35320b53cef22a77288838c7446357327 42.99% passed 25.49%
4 418f1d469f08c99976b313028cf6d3f18f61dd55 43.93% passed 71.86%
5 2e3b2c075901640c4de739473f9246385430b1ed 31.31% passed 90.76%
6 0469f8d819bd45c7638d1c8b0895270a05f34267 38.79% passed 92.82%
7 d162ed685007f5adede58f14aece31dfa1b60c18 40.65% passed 97.17%
8 1d45b2bce36cd995c5dcb755babf512e612ce8b9 41.59% passed 39.04%
9 5aba6bd9debc23ded7041ffdcfe5d68a427e8ce8 31.31% passed 87.21%
10 206656bec2abdbc4aee37a661ef5f4a62b5dd6ae 42.99% passed 85.00%
11 606c23e79bb730ad74e3c6efd72004c36674c17a 47.20% passed 87.78%
12 682e91d7e510ab134d0625234ad224f647c14eb0 41.59% passed 31.01%
13 6a36152105590b1eb095b9503e8f1f226fc73e43 39.25% passed 86.29%
14 6ca685c39a33bfbcb3123129e7af88d573df7d6f 42.06% failed 0.02%
15 8a1c44eb462ed58b21f3fdd72dd22bb657db2980 31.78% passed 54.40%
16 9b220cae8d375ea38b8b481cb95949cda8722fcb 36.92% passed 88.78%
17 aa4698d2e2b1fa71d08e2934a923aad7374a18f6 37.85% passed 90.52%
18 b31aa3f04bc9d5e2498d45cf1983dfaf09faa258 31.78% passed 72.69%
19 d44b129a6181f052198bda3813f0802a91612441 41.59% passed 41.69%
20 ed1acad8a98e8579a44370733533ad7d3fed8006 48.13% passed 58.15%
**** TOTAL 39.77% 1 sequences failed composition chi2 test (p-value<5%; df=3)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.000 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.100)
Thoroughly optimizing +I+G parameters from 10 start values...
Init pinv, alpha: 0.000, 1.000 / Estimate: 0.000, 1.263 / LogL: -1392.811
Init pinv, alpha: 0.040, 1.000 / Estimate: 0.008, 1.354 / LogL: -1393.088
Init pinv, alpha: 0.080, 1.000 / Estimate: 0.009, 1.365 / LogL: -1393.160
Init pinv, alpha: 0.120, 1.000 / Estimate: 0.009, 1.361 / LogL: -1393.150
Init pinv, alpha: 0.160, 1.000 / Estimate: 0.008, 1.357 / LogL: -1393.113
Init pinv, alpha: 0.200, 1.000 / Estimate: 0.009, 1.364 / LogL: -1393.141
Init pinv, alpha: 0.240, 1.000 / Estimate: 0.007, 1.355 / LogL: -1393.088
Init pinv, alpha: 0.280, 1.000 / Estimate: 0.008, 1.355 / LogL: -1393.102
Init pinv, alpha: 0.320, 1.000 / Estimate: 0.008, 1.358 / LogL: -1393.108
Init pinv, alpha: 0.360, 1.000 / Estimate: 0.008, 1.360 / LogL: -1393.118
Optimal pinv,alpha: 0.000, 1.263 / LogL: -1392.811
Parameters optimization took 0.261 sec
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000907898 secs using 97.92% CPU
Computing ML distances took 0.000961 sec (of wall-clock time) 0.000930 sec (of CPU time)
Setting up auxiliary I and S matrices: done in 5.48363e-05 secs using 87.53% CPU
Computing RapidNJ tree took 0.000169 sec (of wall-clock time) 0.000170 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1392.723
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Generating 98 parsimony trees... 0.059 second
Computing log-likelihood of 98 initial trees ... 0.064 seconds
Current best score: -1392.723
Do NNI search on 20 best initial trees
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 1: -1387.358
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 2: -1387.268
Iteration 10 / LogL: -1387.281 / Time: 0h:0m:0s
Iteration 20 / LogL: -1387.281 / Time: 0h:0m:0s
Finish initializing candidate tree set (2)
Current best tree score: -1387.268 / CPU time: 0.336
Number of iterations: 20
--------------------------------------------------------------------
| OPTIMIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Estimate model parameters (epsilon = 0.100)
UPDATE BEST LOG-LIKELIHOOD: -1387.253
Iteration 30 / LogL: -1387.421 / Time: 0h:0m:0s (0h:0m:1s left)
Iteration 40 / LogL: -1387.404 / Time: 0h:0m:0s (0h:0m:1s left)
Iteration 50 / LogL: -1396.594 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 60 / LogL: -1387.642 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 70 / LogL: -1387.359 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 80 / LogL: -1387.296 / Time: 0h:0m:1s (0h:0m:0s left)
WARNING: NNI search needs unusual large number of steps (20) to converge!
Iteration 90 / LogL: -1396.040 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 100 / LogL: -1388.560 / Time: 0h:0m:1s (0h:0m:0s left)
TREE SEARCH COMPLETED AFTER 103 ITERATIONS / Time: 0h:0m:1s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.010)
1. Initial log-likelihood: -1387.253
Optimal log-likelihood: -1387.252
Rate parameters: A-C: 0.33969 A-G: 2.29899 A-T: 2.17981 C-G: 1.19507 C-T: 3.34689 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.000
Gamma shape alpha: 1.323
Parameters optimization took 1 rounds (0.002 sec)
BEST SCORE FOUND : -1387.252
Total tree length: 6.726
Total number of iterations: 103
CPU time used for tree search: 1.779 sec (0h:0m:1s)
Wall-clock time used for tree search: 1.600 sec (0h:0m:1s)
Total CPU time used: 2.052 sec (0h:0m:2s)
Total wall-clock time used: 1.876 sec (0h:0m:1s)
Analysis results written to:
IQ-TREE report: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpm79lebdg/q2iqtree.iqtree
Maximum-likelihood tree: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpm79lebdg/q2iqtree.treefile
Likelihood distances: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpm79lebdg/q2iqtree.mldist
Screen log file: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpm79lebdg/q2iqtree.log
Date and Time: Mon Jul 29 18:14:46 2024
n cores 1
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: iqtree -st DNA --runs 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta -m GTR+I+G -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpm79lebdg/q2iqtree -nt 1
Saved Phylogeny[Unrooted] to: iqt-gtrig-tree.qza
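The two notation rules described in this section (leave out F, and write G rather than G4) can be sketched as a small shell helper. The function name to_plugin_model is hypothetical, and the helper only rewrites the model string; it does not check whether the plugin accepts the resulting model, so treat it as a convenience for copying names out of the ModelFinder output rather than as part of q2-phylogeny.

```shell
# Hypothetical helper: rewrite a ModelFinder-style model name into the
# notation described above (no +F term, and +G instead of +G4).
to_plugin_model() {
    printf '%s\n' "$1" | sed \
        -e 's/+F+/+/' \
        -e 's/+F$//' \
        -e 's/+G4+/+G+/' \
        -e 's/+G4$/+G/'
}

# The example model name used in this section.
to_plugin_model 'GTR+F+I+G4'   # prints GTR+I+G
```

The +F and +G4 terms are only matched when they are complete terms (followed by another + or the end of the name), so other terms such as +R2 are left untouched.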
Let’s rerun the previous GTR+I+G command and add the --p-fast option. This
option, only compatible with the iqtree method, resembles the fast search
performed by fasttree. 🏎️ Let’s also perform multiple tree searches via
--p-n-runs and keep the best of those trees (as we did earlier with the
raxml --p-n-searches ... command):
qiime phylogeny iqtree \
--i-alignment masked-aligned-rep-seqs.qza \
--p-substitution-model 'GTR+I+G' \
--p-fast \
--p-n-runs 10 \
--o-tree iqt-gtrig-fast-ms-tree.qza \
--verbose
stdout:
IQ-TREE multicore version 2.3.4 COVID-edition for MacOS Intel 64-bit built Jun 18 2024
Developed by Bui Quang Minh, James Barbetti, Nguyen Lam Tung, Olga Chernomor,
Heiko Schmidt, Dominik Schrempf, Michael Woodhams, Ly Trong Nhan, Thomas Wong
Host: Elizabeths-MacBook-Pro-7.local (AVX512, FMA3, 32 GB RAM)
Command: iqtree -st DNA --runs 10 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta -m GTR+I+G -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpsi88b8xg/q2iqtree -nt 1 -fast
Seed: 318703 (Using SPRNG - Scalable Parallel Random Number Generator)
Time: Mon Jul 29 18:14:55 2024
Kernel: AVX+FMA - 1 threads (8 CPU cores detected)
HINT: Use -nt option to specify number of threads because your CPU has 8 cores!
HINT: -nt AUTO will automatically determine the best number of threads to use.
Reading alignment file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta ... Fasta format detected
Reading fasta file: done in 0.000112057 secs using 81.21% CPU
Alignment most likely contains DNA/RNA sequences
Alignment has 20 sequences with 214 columns, 157 distinct patterns
104 parsimony-informative, 33 singleton sites, 77 constant sites
Analyzing sequences: done in 1.00136e-05 secs using 89.88% CPU
---> START RUN NUMBER 1 (seed: 318703)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.00 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.50)
1. Initial log-likelihood: -1492.20
2. Current log-likelihood: -1404.59
3. Current log-likelihood: -1399.23
4. Current log-likelihood: -1397.83
5. Current log-likelihood: -1397.07
Optimal log-likelihood: -1396.49
Rate parameters: A-C: 0.24620 A-G: 2.08306 A-T: 1.99580 C-G: 1.06240 C-T: 2.85598 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.027
Gamma shape alpha: 1.432
Parameters optimization took 5 rounds (0.022 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000920057 secs using 98.58% CPU
Computing ML distances took 0.000978 sec (of wall-clock time) 0.000949 sec (of CPU time)
Setting up auxiliary I and S matrices: done in 3.00407e-05 secs using 93.21% CPU
Computing RapidNJ tree took 0.000172 sec (of wall-clock time) 0.000165 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1393.972
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1388.188
UPDATE BEST LOG-LIKELIHOOD: -1388.187
Finish initializing candidate tree set (3)
Current best tree score: -1388.187 / CPU time: 0.028
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1388.187
2. Current log-likelihood: -1387.966
3. Current log-likelihood: -1387.806
4. Current log-likelihood: -1387.687
5. Current log-likelihood: -1387.596
6. Current log-likelihood: -1387.525
7. Current log-likelihood: -1387.471
Optimal log-likelihood: -1387.426
Rate parameters: A-C: 0.33228 A-G: 2.23741 A-T: 2.11202 C-G: 1.16006 C-T: 3.23503 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.004
Gamma shape alpha: 1.356
Parameters optimization took 7 rounds (0.014 sec)
BEST SCORE FOUND : -1387.426
Total tree length: 6.737
Total number of iterations: 2
CPU time used for tree search: 0.055 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.028 sec (0h:0m:0s)
Total CPU time used: 0.117 sec (0h:0m:0s)
Total wall-clock time used: 0.077 sec (0h:0m:0s)
---> START RUN NUMBER 2 (seed: 319703)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.000 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1492.389
2. Current log-likelihood: -1401.890
3. Current log-likelihood: -1396.534
4. Current log-likelihood: -1395.117
5. Current log-likelihood: -1394.389
Optimal log-likelihood: -1393.814
Rate parameters: A-C: 0.27026 A-G: 2.39526 A-T: 2.16931 C-G: 1.24752 C-T: 3.29290 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.027
Gamma shape alpha: 1.415
Parameters optimization took 5 rounds (0.021 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000890017 secs using 199% CPU
Computing ML distances took 0.000991 sec (of wall-clock time) 0.001884 sec (of CPU time)
Setting up auxiliary I and S matrices: done in 2.81334e-05 secs using 78.2% CPU
Computing RapidNJ tree took 0.000140 sec (of wall-clock time) 0.000109 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1393.793
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1388.213
Finish initializing candidate tree set (3)
Current best tree score: -1388.213 / CPU time: 0.023
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1388.213
2. Current log-likelihood: -1388.014
3. Current log-likelihood: -1387.868
4. Current log-likelihood: -1387.759
5. Current log-likelihood: -1387.676
6. Current log-likelihood: -1387.611
7. Current log-likelihood: -1387.560
Optimal log-likelihood: -1387.519
Rate parameters: A-C: 0.35532 A-G: 2.35213 A-T: 2.13937 C-G: 1.20295 C-T: 3.37020 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.004
Gamma shape alpha: 1.361
Parameters optimization took 7 rounds (0.014 sec)
BEST SCORE FOUND : -1387.519
Total tree length: 6.816
Total number of iterations: 2
CPU time used for tree search: 0.045 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.023 sec (0h:0m:0s)
Total CPU time used: 0.252 sec (0h:0m:0s)
Total wall-clock time used: 0.148 sec (0h:0m:0s)
---> START RUN NUMBER 3 (seed: 320703)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.001 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1491.925
2. Current log-likelihood: -1402.064
3. Current log-likelihood: -1396.813
4. Current log-likelihood: -1395.392
5. Current log-likelihood: -1394.652
Optimal log-likelihood: -1394.078
Rate parameters: A-C: 0.27467 A-G: 2.39505 A-T: 2.12238 C-G: 1.21030 C-T: 3.30515 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.027
Gamma shape alpha: 1.388
Parameters optimization took 5 rounds (0.019 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000884056 secs using 197.7% CPU
Computing ML distances took 0.000960 sec (of wall-clock time) 0.001831 sec (of CPU time)
Setting up auxiliary I and S matrices: done in 3.29018e-05 secs using 82.06% CPU
Computing RapidNJ tree took 0.000217 sec (of wall-clock time) 0.000136 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1393.807
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1388.217
BETTER TREE FOUND at iteration 2: -1388.181
Finish initializing candidate tree set (4)
Current best tree score: -1388.181 / CPU time: 0.030
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1388.181
2. Current log-likelihood: -1387.974
3. Current log-likelihood: -1387.831
4. Current log-likelihood: -1387.725
5. Current log-likelihood: -1387.646
6. Current log-likelihood: -1387.584
Optimal log-likelihood: -1387.534
Rate parameters: A-C: 0.36872 A-G: 2.32249 A-T: 2.12947 C-G: 1.22911 C-T: 3.29731 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.006
Gamma shape alpha: 1.331
Parameters optimization took 6 rounds (0.013 sec)
BEST SCORE FOUND : -1387.534
Total tree length: 7.508
Total number of iterations: 2
CPU time used for tree search: 0.058 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.030 sec (0h:0m:0s)
Total CPU time used: 0.395 sec (0h:0m:0s)
Total wall-clock time used: 0.223 sec (0h:0m:0s)
---> START RUN NUMBER 4 (seed: 321703)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.000 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1493.167
2. Current log-likelihood: -1403.041
3. Current log-likelihood: -1398.313
4. Current log-likelihood: -1396.957
5. Current log-likelihood: -1396.228
Optimal log-likelihood: -1395.709
Rate parameters: A-C: 0.23146 A-G: 2.06957 A-T: 1.96268 C-G: 1.07937 C-T: 2.84174 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.027
Gamma shape alpha: 1.322
Parameters optimization took 5 rounds (0.021 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.00091815 secs using 199% CPU
Computing ML distances took 0.000970 sec (of wall-clock time) 0.001898 sec (of CPU time)
WARNING: Some pairwise ML distances are too long (saturated)
Setting up auxiliary I and S matrices: done in 2.59876e-05 secs using 76.96% CPU
Computing RapidNJ tree took 0.000128 sec (of wall-clock time) 0.000128 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1394.184
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1387.955
Finish initializing candidate tree set (4)
Current best tree score: -1387.955 / CPU time: 0.025
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1387.955
2. Current log-likelihood: -1387.798
3. Current log-likelihood: -1387.680
4. Current log-likelihood: -1387.590
5. Current log-likelihood: -1387.521
6. Current log-likelihood: -1387.467
Optimal log-likelihood: -1387.423
Rate parameters: A-C: 0.33566 A-G: 2.27095 A-T: 2.14605 C-G: 1.17829 C-T: 3.29012 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.004
Gamma shape alpha: 1.352
Parameters optimization took 6 rounds (0.012 sec)
BEST SCORE FOUND : -1387.423
Total tree length: 6.744
Total number of iterations: 2
CPU time used for tree search: 0.048 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.025 sec (0h:0m:0s)
Total CPU time used: 0.527 sec (0h:0m:0s)
Total wall-clock time used: 0.293 sec (0h:0m:0s)
---> START RUN NUMBER 5 (seed: 322703)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.000 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1493.459
2. Current log-likelihood: -1403.057
3. Current log-likelihood: -1398.315
4. Current log-likelihood: -1396.957
5. Current log-likelihood: -1396.224
Optimal log-likelihood: -1395.705
Rate parameters: A-C: 0.23667 A-G: 2.08551 A-T: 1.97696 C-G: 1.07651 C-T: 2.86648 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.027
Gamma shape alpha: 1.322
Parameters optimization took 5 rounds (0.019 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.00097394 secs using 187.3% CPU
Computing ML distances took 0.001039 sec (of wall-clock time) 0.001907 sec (of CPU time)
WARNING: Some pairwise ML distances are too long (saturated)
Setting up auxiliary I and S matrices: done in 4.41074e-05 secs using 70.28% CPU
Computing RapidNJ tree took 0.000151 sec (of wall-clock time) 0.000120 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1394.170
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1388.175
Finish initializing candidate tree set (3)
Current best tree score: -1388.175 / CPU time: 0.021
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1388.175
2. Current log-likelihood: -1387.959
3. Current log-likelihood: -1387.799
4. Current log-likelihood: -1387.681
5. Current log-likelihood: -1387.591
6. Current log-likelihood: -1387.521
7. Current log-likelihood: -1387.467
Optimal log-likelihood: -1387.423
Rate parameters: A-C: 0.33681 A-G: 2.27128 A-T: 2.14654 C-G: 1.17860 C-T: 3.29078 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.004
Gamma shape alpha: 1.351
Parameters optimization took 7 rounds (0.014 sec)
BEST SCORE FOUND : -1387.423
Total tree length: 6.745
Total number of iterations: 2
CPU time used for tree search: 0.042 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.022 sec (0h:0m:0s)
Total CPU time used: 0.654 sec (0h:0m:0s)
Total wall-clock time used: 0.360 sec (0h:0m:0s)
---> START RUN NUMBER 6 (seed: 323703)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.000 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1496.015
2. Current log-likelihood: -1403.630
3. Current log-likelihood: -1398.533
4. Current log-likelihood: -1397.077
5. Current log-likelihood: -1396.256
6. Current log-likelihood: -1395.746
Optimal log-likelihood: -1395.367
Rate parameters: A-C: 0.23665 A-G: 2.05005 A-T: 1.94885 C-G: 1.06762 C-T: 2.81216 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.021
Gamma shape alpha: 1.337
Parameters optimization took 6 rounds (0.024 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000882149 secs using 198.4% CPU
Computing ML distances took 0.000929 sec (of wall-clock time) 0.001806 sec (of CPU time)
Setting up auxiliary I and S matrices: done in 3.79086e-05 secs using 84.41% CPU
Computing RapidNJ tree took 0.000194 sec (of wall-clock time) 0.000161 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1393.724
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1387.981
Finish initializing candidate tree set (4)
Current best tree score: -1387.981 / CPU time: 0.027
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1387.981
2. Current log-likelihood: -1387.812
3. Current log-likelihood: -1387.686
4. Current log-likelihood: -1387.592
5. Current log-likelihood: -1387.521
6. Current log-likelihood: -1387.466
Optimal log-likelihood: -1387.423
Rate parameters: A-C: 0.32762 A-G: 2.25269 A-T: 2.12562 C-G: 1.16855 C-T: 3.25523 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.004
Gamma shape alpha: 1.358
Parameters optimization took 6 rounds (0.012 sec)
BEST SCORE FOUND : -1387.423
Total tree length: 6.701
Total number of iterations: 2
CPU time used for tree search: 0.054 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.028 sec (0h:0m:0s)
Total CPU time used: 0.799 sec (0h:0m:0s)
Total wall-clock time used: 0.436 sec (0h:0m:0s)
---> START RUN NUMBER 7 (seed: 324703)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.001 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1495.863
2. Current log-likelihood: -1402.072
3. Current log-likelihood: -1396.809
4. Current log-likelihood: -1395.391
5. Current log-likelihood: -1394.657
Optimal log-likelihood: -1394.080
Rate parameters: A-C: 0.27275 A-G: 2.35291 A-T: 2.09125 C-G: 1.19606 C-T: 3.26638 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.027
Gamma shape alpha: 1.387
Parameters optimization took 5 rounds (0.020 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.00113201 secs using 173.5% CPU
Computing ML distances took 0.001226 sec (of wall-clock time) 0.002051 sec (of CPU time)
Setting up auxiliary I and S matrices: done in 3.79086e-05 secs using 81.78% CPU
Computing RapidNJ tree took 0.000148 sec (of wall-clock time) 0.000112 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1393.809
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1388.217
BETTER TREE FOUND at iteration 2: -1388.189
Finish initializing candidate tree set (4)
Current best tree score: -1388.189 / CPU time: 0.030
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1388.189
2. Current log-likelihood: -1387.974
3. Current log-likelihood: -1387.831
4. Current log-likelihood: -1387.725
5. Current log-likelihood: -1387.645
6. Current log-likelihood: -1387.584
Optimal log-likelihood: -1387.534
Rate parameters: A-C: 0.36986 A-G: 2.31018 A-T: 2.11746 C-G: 1.22267 C-T: 3.27882 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.006
Gamma shape alpha: 1.332
Parameters optimization took 6 rounds (0.014 sec)
BEST SCORE FOUND : -1387.534
Total tree length: 7.502
Total number of iterations: 2
CPU time used for tree search: 0.059 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.030 sec (0h:0m:0s)
Total CPU time used: 0.943 sec (0h:0m:0s)
Total wall-clock time used: 0.512 sec (0h:0m:0s)
---> START RUN NUMBER 8 (seed: 325703)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.000 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1492.097
2. Current log-likelihood: -1401.816
3. Current log-likelihood: -1396.523
4. Current log-likelihood: -1395.122
5. Current log-likelihood: -1394.389
Optimal log-likelihood: -1393.818
Rate parameters: A-C: 0.27163 A-G: 2.41073 A-T: 2.17144 C-G: 1.24911 C-T: 3.27679 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.027
Gamma shape alpha: 1.416
Parameters optimization took 5 rounds (0.022 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000931978 secs using 198.3% CPU
Computing ML distances took 0.001009 sec (of wall-clock time) 0.001949 sec (of CPU time)
Setting up auxiliary I and S matrices: done in 2.90871e-05 secs using 79.07% CPU
Computing RapidNJ tree took 0.000144 sec (of wall-clock time) 0.000116 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1393.794
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1388.214
Finish initializing candidate tree set (3)
Current best tree score: -1388.214 / CPU time: 0.023
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1388.214
2. Current log-likelihood: -1388.015
3. Current log-likelihood: -1387.868
4. Current log-likelihood: -1387.760
5. Current log-likelihood: -1387.676
6. Current log-likelihood: -1387.611
7. Current log-likelihood: -1387.560
Optimal log-likelihood: -1387.519
Rate parameters: A-C: 0.35522 A-G: 2.35151 A-T: 2.13874 C-G: 1.20261 C-T: 3.36909 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.004
Gamma shape alpha: 1.362
Parameters optimization took 7 rounds (0.014 sec)
BEST SCORE FOUND : -1387.519
Total tree length: 6.815
Total number of iterations: 2
CPU time used for tree search: 0.044 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.023 sec (0h:0m:0s)
Total CPU time used: 1.077 sec (0h:0m:1s)
Total wall-clock time used: 0.582 sec (0h:0m:0s)
---> START RUN NUMBER 9 (seed: 326703)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.000 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1492.199
2. Current log-likelihood: -1404.591
3. Current log-likelihood: -1399.228
4. Current log-likelihood: -1397.831
5. Current log-likelihood: -1397.074
Optimal log-likelihood: -1396.495
Rate parameters: A-C: 0.24620 A-G: 2.08306 A-T: 1.99581 C-G: 1.06240 C-T: 2.85598 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.027
Gamma shape alpha: 1.432
Parameters optimization took 5 rounds (0.022 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.00087595 secs using 197% CPU
Computing ML distances took 0.000957 sec (of wall-clock time) 0.001812 sec (of CPU time)
Setting up auxiliary I and S matrices: done in 3.29018e-05 secs using 82.06% CPU
Computing RapidNJ tree took 0.000165 sec (of wall-clock time) 0.000135 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1393.972
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1388.188
UPDATE BEST LOG-LIKELIHOOD: -1388.187
Finish initializing candidate tree set (3)
Current best tree score: -1388.187 / CPU time: 0.027
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1388.187
2. Current log-likelihood: -1387.966
3. Current log-likelihood: -1387.806
4. Current log-likelihood: -1387.687
5. Current log-likelihood: -1387.596
6. Current log-likelihood: -1387.525
7. Current log-likelihood: -1387.471
Optimal log-likelihood: -1387.426
Rate parameters: A-C: 0.33228 A-G: 2.23741 A-T: 2.11202 C-G: 1.16006 C-T: 3.23503 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.004
Gamma shape alpha: 1.356
Parameters optimization took 7 rounds (0.014 sec)
BEST SCORE FOUND : -1387.426
Total tree length: 6.737
Total number of iterations: 2
CPU time used for tree search: 0.054 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.028 sec (0h:0m:0s)
Total CPU time used: 1.223 sec (0h:0m:1s)
Total wall-clock time used: 0.658 sec (0h:0m:0s)
---> START RUN NUMBER 10 (seed: 327703)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.001 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1495.571
2. Current log-likelihood: -1402.008
3. Current log-likelihood: -1396.794
4. Current log-likelihood: -1395.393
5. Current log-likelihood: -1394.655
Optimal log-likelihood: -1394.081
Rate parameters: A-C: 0.27755 A-G: 2.37595 A-T: 2.10647 C-G: 1.20302 C-T: 3.28731 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.027
Gamma shape alpha: 1.386
Parameters optimization took 5 rounds (0.022 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000878811 secs using 196.6% CPU
Computing ML distances took 0.000958 sec (of wall-clock time) 0.001813 sec (of CPU time)
Setting up auxiliary I and S matrices: done in 2.88486e-05 secs using 79.73% CPU
Computing RapidNJ tree took 0.000142 sec (of wall-clock time) 0.000112 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1393.809
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1388.217
BETTER TREE FOUND at iteration 2: -1388.188
Finish initializing candidate tree set (4)
Current best tree score: -1388.188 / CPU time: 0.031
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1388.188
2. Current log-likelihood: -1387.973
3. Current log-likelihood: -1387.830
4. Current log-likelihood: -1387.725
5. Current log-likelihood: -1387.645
6. Current log-likelihood: -1387.584
Optimal log-likelihood: -1387.534
Rate parameters: A-C: 0.36987 A-G: 2.31020 A-T: 2.11745 C-G: 1.22270 C-T: 3.27880 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.006
Gamma shape alpha: 1.332
Parameters optimization took 6 rounds (0.014 sec)
BEST SCORE FOUND : -1387.534
Total tree length: 7.502
Total number of iterations: 2
CPU time used for tree search: 0.059 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.031 sec (0h:0m:0s)
Total CPU time used: 1.373 sec (0h:0m:1s)
Total wall-clock time used: 0.737 sec (0h:0m:0s)
---> SUMMARIZE RESULTS FROM 10 RUNS
Run 5 gave best log-likelihood: -1387.423
Total CPU time for 10 runs: 1.384 seconds.
Total wall-clock time for 10 runs: 0.743 seconds.
Analysis results written to:
IQ-TREE report: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpsi88b8xg/q2iqtree.iqtree
Maximum-likelihood tree: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpsi88b8xg/q2iqtree.treefile
Trees from independent runs: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpsi88b8xg/q2iqtree.runtrees
Likelihood distances: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpsi88b8xg/q2iqtree.mldist
Screen log file: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpsi88b8xg/q2iqtree.log
Date and Time: Mon Jul 29 18:14:56 2024
n cores 1
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: iqtree -st DNA --runs 10 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta -m GTR+I+G -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpsi88b8xg/q2iqtree -nt 1 -fast
Saved Phylogeny[Unrooted] to: iqt-gtrig-fast-ms-tree.qza
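When several independent runs are requested, the verbose log prints one "BEST SCORE FOUND" line per run, which can be tedious to compare by eye. The following is a minimal Python sketch, not part of QIIME 2 or IQ-TREE, that collects those scores from saved log text. The function name best_scores_by_run and the idea of saving the log to a string are assumptions made for illustration; the sample lines are copied from the output above.

```python
import re

# Patterns for the two log lines of interest (formats taken from the log above).
RUN_START = re.compile(r"START RUN NUMBER (\d+)")
BEST_SCORE = re.compile(r"BEST SCORE FOUND\s*:\s*(-?\d+(?:\.\d+)?)")


def best_scores_by_run(log_text):
    """Map each run number to its reported best log-likelihood.

    Hypothetical helper: it only scans text, it does not call IQ-TREE.
    """
    scores = {}
    current_run = None
    for line in log_text.splitlines():
        start = RUN_START.search(line)
        if start:
            current_run = int(start.group(1))
            continue
        best = BEST_SCORE.search(line)
        if best and current_run is not None:
            scores[current_run] = float(best.group(1))
    return scores


# Two runs excerpted from the verbose output shown above.
sample_log = """\
---> START RUN NUMBER 3 (seed: 320703)
BEST SCORE FOUND : -1387.534
---> START RUN NUMBER 4 (seed: 321703)
BEST SCORE FOUND : -1387.423
"""

scores = best_scores_by_run(sample_log)
# Log-likelihoods are negative; the value closest to zero is the best.
best_run = max(scores, key=scores.get)
print(scores)
print(best_run)
```

The log prints scores to three decimal places, so runs that look tied here may differ at higher precision; the "SUMMARIZE RESULTS" line written by IQ-TREE itself remains the authoritative answer.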
Single branch tests
IQ-TREE provides access to a few single branch testing methods:
SH-aLRT via --p-alrt [INT >= 1000]
aBayes via --p-abayes [TRUE | FALSE]
local bootstrap test via --p-lbp [INT >= 1000]
Single branch tests are commonly used as an alternative to the bootstrapping
approach we’ve discussed above, as they are substantially faster and often
recommended when constructing large phylogenies (e.g. >10,000 taxa). All
three of these methods can be applied simultaneously and viewed within iTOL
as separate bootstrap support values. These values are always listed in the
following order: alrt / lbp / abayes. We’ll go ahead and apply all of the
branch tests in our next command, while specifying the same substitution model
as above. Feel free to combine this with the --p-fast option. 😉
qiime phylogeny iqtree \
--i-alignment masked-aligned-rep-seqs.qza \
--p-alrt 1000 \
--p-abayes \
--p-lbp 1000 \
--p-substitution-model 'GTR+I+G' \
--o-tree iqt-sbt-tree.qza \
--verbose
stdout:
IQ-TREE multicore version 2.3.4 COVID-edition for MacOS Intel 64-bit built Jun 18 2024
Developed by Bui Quang Minh, James Barbetti, Nguyen Lam Tung, Olga Chernomor,
Heiko Schmidt, Dominik Schrempf, Michael Woodhams, Ly Trong Nhan, Thomas Wong
Host: Elizabeths-MacBook-Pro-7.local (AVX512, FMA3, 32 GB RAM)
Command: iqtree -st DNA --runs 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta -m GTR+I+G -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmphbxtocal/q2iqtree -nt 1 -alrt 1000 -abayes -lbp 1000
Seed: 131143 (Using SPRNG - Scalable Parallel Random Number Generator)
Time: Mon Jul 29 18:15:05 2024
Kernel: AVX+FMA - 1 threads (8 CPU cores detected)
HINT: Use -nt option to specify number of threads because your CPU has 8 cores!
HINT: -nt AUTO will automatically determine the best number of threads to use.
Reading alignment file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta ... Fasta format detected
Reading fasta file: done in 0.000102043 secs using 87.22% CPU
Alignment most likely contains DNA/RNA sequences
Alignment has 20 sequences with 214 columns, 157 distinct patterns
104 parsimony-informative, 33 singleton sites, 77 constant sites
Gap/Ambiguity Composition p-value
Analyzing sequences: done in 1.00136e-05 secs using 79.89% CPU
1 e84fcf85a6a4065231dcf343bb862f1cb32abae6 40.65% passed 90.91%
2 5525fb6dab7b6577960147574465990c6df070ad 42.99% passed 99.80%
3 eb3564a35320b53cef22a77288838c7446357327 42.99% passed 25.49%
4 418f1d469f08c99976b313028cf6d3f18f61dd55 43.93% passed 71.86%
5 2e3b2c075901640c4de739473f9246385430b1ed 31.31% passed 90.76%
6 0469f8d819bd45c7638d1c8b0895270a05f34267 38.79% passed 92.82%
7 d162ed685007f5adede58f14aece31dfa1b60c18 40.65% passed 97.17%
8 1d45b2bce36cd995c5dcb755babf512e612ce8b9 41.59% passed 39.04%
9 5aba6bd9debc23ded7041ffdcfe5d68a427e8ce8 31.31% passed 87.21%
10 206656bec2abdbc4aee37a661ef5f4a62b5dd6ae 42.99% passed 85.00%
11 606c23e79bb730ad74e3c6efd72004c36674c17a 47.20% passed 87.78%
12 682e91d7e510ab134d0625234ad224f647c14eb0 41.59% passed 31.01%
13 6a36152105590b1eb095b9503e8f1f226fc73e43 39.25% passed 86.29%
14 6ca685c39a33bfbcb3123129e7af88d573df7d6f 42.06% failed 0.02%
15 8a1c44eb462ed58b21f3fdd72dd22bb657db2980 31.78% passed 54.40%
16 9b220cae8d375ea38b8b481cb95949cda8722fcb 36.92% passed 88.78%
17 aa4698d2e2b1fa71d08e2934a923aad7374a18f6 37.85% passed 90.52%
18 b31aa3f04bc9d5e2498d45cf1983dfaf09faa258 31.78% passed 72.69%
19 d44b129a6181f052198bda3813f0802a91612441 41.59% passed 41.69%
20 ed1acad8a98e8579a44370733533ad7d3fed8006 48.13% passed 58.15%
**** TOTAL 39.77% 1 sequences failed composition chi2 test (p-value<5%; df=3)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.001 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.100)
Thoroughly optimizing +I+G parameters from 10 start values...
Init pinv, alpha: 0.000, 1.000 / Estimate: 0.000, 1.239 / LogL: -1394.430
Init pinv, alpha: 0.040, 1.000 / Estimate: 0.008, 1.306 / LogL: -1394.720
Init pinv, alpha: 0.080, 1.000 / Estimate: 0.009, 1.315 / LogL: -1394.793
Init pinv, alpha: 0.120, 1.000 / Estimate: 0.009, 1.313 / LogL: -1394.791
Init pinv, alpha: 0.160, 1.000 / Estimate: 0.008, 1.307 / LogL: -1394.755
Init pinv, alpha: 0.200, 1.000 / Estimate: 0.009, 1.309 / LogL: -1394.783
Init pinv, alpha: 0.240, 1.000 / Estimate: 0.008, 1.305 / LogL: -1394.729
Init pinv, alpha: 0.280, 1.000 / Estimate: 0.008, 1.307 / LogL: -1394.742
Init pinv, alpha: 0.320, 1.000 / Estimate: 0.008, 1.308 / LogL: -1394.753
Init pinv, alpha: 0.360, 1.000 / Estimate: 0.008, 1.312 / LogL: -1394.757
Optimal pinv,alpha: 0.000, 1.239 / LogL: -1394.430
Parameters optimization took 0.262 sec
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000931978 secs using 94.96% CPU
Computing ML distances took 0.001020 sec (of wall-clock time) 0.000931 sec (of CPU time)
Setting up auxiliary I and S matrices: done in 2.69413e-05 secs using 81.66% CPU
Computing RapidNJ tree took 0.000171 sec (of wall-clock time) 0.000136 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1392.898
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Generating 98 parsimony trees... 0.061 second
Computing log-likelihood of 98 initial trees ... 0.063 seconds
Current best score: -1392.898
Do NNI search on 20 best initial trees
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 1: -1387.266
Iteration 10 / LogL: -1387.731 / Time: 0h:0m:0s
Iteration 20 / LogL: -1387.282 / Time: 0h:0m:0s
Finish initializing candidate tree set (2)
Current best tree score: -1387.266 / CPU time: 0.331
Number of iterations: 20
--------------------------------------------------------------------
| OPTIMIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Iteration 30 / LogL: -1387.305 / Time: 0h:0m:0s (0h:0m:1s left)
Iteration 40 / LogL: -1387.369 / Time: 0h:0m:0s (0h:0m:1s left)
Iteration 50 / LogL: -1387.347 / Time: 0h:0m:0s (0h:0m:1s left)
Iteration 60 / LogL: -1387.349 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 70 / LogL: -1387.552 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 80 / LogL: -1387.386 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 90 / LogL: -1387.349 / Time: 0h:0m:1s (0h:0m:0s left)
WARNING: NNI search needs unusual large number of steps (20) to converge!
Iteration 100 / LogL: -1387.350 / Time: 0h:0m:1s (0h:0m:0s left)
TREE SEARCH COMPLETED AFTER 102 ITERATIONS / Time: 0h:0m:1s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.010)
1. Initial log-likelihood: -1387.266
Optimal log-likelihood: -1387.256
Rate parameters: A-C: 0.32741 A-G: 2.25543 A-T: 2.13353 C-G: 1.17231 C-T: 3.27865 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.000
Gamma shape alpha: 1.318
Parameters optimization took 1 rounds (0.003 sec)
BEST SCORE FOUND : -1387.256
Testing tree branches by SH-like aLRT with 1000 replicates...
Testing tree branches by local-BP test with 1000 replicates...
Testing tree branches by aBayes parametric test...
0.045 sec.
Total tree length: 6.744
Total number of iterations: 102
CPU time used for tree search: 1.770 sec (0h:0m:1s)
Wall-clock time used for tree search: 1.589 sec (0h:0m:1s)
Total CPU time used: 2.091 sec (0h:0m:2s)
Total wall-clock time used: 1.911 sec (0h:0m:1s)
Analysis results written to:
IQ-TREE report: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmphbxtocal/q2iqtree.iqtree
Maximum-likelihood tree: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmphbxtocal/q2iqtree.treefile
Likelihood distances: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmphbxtocal/q2iqtree.mldist
Screen log file: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmphbxtocal/q2iqtree.log
Date and Time: Mon Jul 29 18:15:07 2024
n cores 1
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: iqtree -st DNA --runs 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta -m GTR+I+G -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmphbxtocal/q2iqtree -nt 1 -alrt 1000 -abayes -lbp 1000
Saved Phylogeny[Unrooted] to: iqt-sbt-tree.qza
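Because the three support values are written together in the order alrt / lbp / abayes, it can help to split them apart before filtering or plotting. Below is a small Python sketch under the assumption that you have already extracted a node label as a slash-separated string; the function name parse_branch_support and the example label are hypothetical and are not taken from the tree produced above.

```python
def parse_branch_support(label):
    """Split a combined support label into its three named values.

    Assumes the order described in this tutorial: alrt / lbp / abayes.
    Raises ValueError if the label does not hold exactly three numbers.
    """
    alrt, lbp, abayes = (float(part) for part in label.split("/"))
    return {"alrt": alrt, "lbp": lbp, "abayes": abayes}


# Hypothetical label, for illustration only.
support = parse_branch_support("87.5/82/0.95")
print(support["alrt"], support["lbp"], support["abayes"])
```

Keeping the values in a dictionary keyed by test name avoids relying on positional order later in an analysis.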
Tip
IQ-TREE search settings.
There are quite a few adjustable parameters available for iqtree that
can be modified to improve searches through “tree space” and prevent the search
algorithms from getting stuck in local optima. One particular best
practice to aid in this regard is to adjust the following parameters:
--p-perturb-nni-strength and --p-stop-iter (which map, respectively,
to the -pers and -nstop flags of iqtree). In brief, the larger
the value for NNI (nearest-neighbor interchange) perturbation, the larger
the jumps in “tree space”. This value should be set high enough to allow the
search algorithm to avoid being trapped in local optima, but not so high
that the search is haphazardly jumping around “tree space”. That is, like
Goldilocks and the three 🐻s you need to find a setting that is “just
right”, or at least within a set of reasonable bounds. One way of assessing
this is to do a few short trial runs using the --verbose flag. If you
see that the likelihood values are jumping around too much, then lowering the
value for --p-perturb-nni-strength may be warranted. As for the stopping
criterion, i.e. --p-stop-iter, the higher this value, the more thorough
your search in “tree space”. Be aware that increasing this value may also
increase the run time. That is, the search will continue until it has
sampled a number of trees, say 100 (default), without finding a better
scoring tree. If a better tree is found, then the counter resets, and the
search continues. These two parameters deserve special consideration when a
given data set contains many short sequences, quite common for microbiome
survey data. We can modify our original command to include these extra
parameters with the recommended modifications for short sequences, i.e. a
lower value for perturbation strength (shorter reads do not contain as much
phylogenetic information, thus we should limit how far we jump around in
“tree space”) and a larger number of stop iterations. See the IQ-TREE
command reference for more details about default parameter settings.
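Finally, we’ll let iqtree perform the model testing by leaving the substitution model at its default (MFP). Although iqtree can determine the optimal number of CPU cores automatically, the run shown here uses a single core (--p-n-cores 1).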
qiime phylogeny iqtree \
--i-alignment masked-aligned-rep-seqs.qza \
--p-perturb-nni-strength 0.2 \
--p-stop-iter 200 \
--p-n-cores 1 \
--o-tree iqt-nnisi-fast-tree.qza \
--verbose
stdout:
IQ-TREE multicore version 2.3.4 COVID-edition for MacOS Intel 64-bit built Jun 18 2024
Developed by Bui Quang Minh, James Barbetti, Nguyen Lam Tung, Olga Chernomor,
Heiko Schmidt, Dominik Schrempf, Michael Woodhams, Ly Trong Nhan, Thomas Wong
Host: Elizabeths-MacBook-Pro-7.local (AVX512, FMA3, 32 GB RAM)
Command: iqtree -st DNA --runs 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta -m MFP -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpmhyinwwp/q2iqtree -nt 1 -nstop 200 -pers 0.200000
Seed: 673011 (Using SPRNG - Scalable Parallel Random Number Generator)
Time: Mon Jul 29 18:15:15 2024
Kernel: AVX+FMA - 1 threads (8 CPU cores detected)
HINT: Use -nt option to specify number of threads because your CPU has 8 cores!
HINT: -nt AUTO will automatically determine the best number of threads to use.
Reading alignment file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta ... Fasta format detected
Reading fasta file: done in 0.000103951 secs using 87.54% CPU
Alignment most likely contains DNA/RNA sequences
Alignment has 20 sequences with 214 columns, 157 distinct patterns
104 parsimony-informative, 33 singleton sites, 77 constant sites
Gap/Ambiguity Composition p-value
Analyzing sequences: done in 1.00136e-05 secs using 89.88% CPU
1 e84fcf85a6a4065231dcf343bb862f1cb32abae6 40.65% passed 90.91%
2 5525fb6dab7b6577960147574465990c6df070ad 42.99% passed 99.80%
3 eb3564a35320b53cef22a77288838c7446357327 42.99% passed 25.49%
4 418f1d469f08c99976b313028cf6d3f18f61dd55 43.93% passed 71.86%
5 2e3b2c075901640c4de739473f9246385430b1ed 31.31% passed 90.76%
6 0469f8d819bd45c7638d1c8b0895270a05f34267 38.79% passed 92.82%
7 d162ed685007f5adede58f14aece31dfa1b60c18 40.65% passed 97.17%
8 1d45b2bce36cd995c5dcb755babf512e612ce8b9 41.59% passed 39.04%
9 5aba6bd9debc23ded7041ffdcfe5d68a427e8ce8 31.31% passed 87.21%
10 206656bec2abdbc4aee37a661ef5f4a62b5dd6ae 42.99% passed 85.00%
11 606c23e79bb730ad74e3c6efd72004c36674c17a 47.20% passed 87.78%
12 682e91d7e510ab134d0625234ad224f647c14eb0 41.59% passed 31.01%
13 6a36152105590b1eb095b9503e8f1f226fc73e43 39.25% passed 86.29%
14 6ca685c39a33bfbcb3123129e7af88d573df7d6f 42.06% failed 0.02%
15 8a1c44eb462ed58b21f3fdd72dd22bb657db2980 31.78% passed 54.40%
16 9b220cae8d375ea38b8b481cb95949cda8722fcb 36.92% passed 88.78%
17 aa4698d2e2b1fa71d08e2934a923aad7374a18f6 37.85% passed 90.52%
18 b31aa3f04bc9d5e2498d45cf1983dfaf09faa258 31.78% passed 72.69%
19 d44b129a6181f052198bda3813f0802a91612441 41.59% passed 41.69%
20 ed1acad8a98e8579a44370733533ad7d3fed8006 48.13% passed 58.15%
**** TOTAL 39.77% 1 sequences failed composition chi2 test (p-value<5%; df=3)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.000 seconds
Perform fast likelihood tree search using GTR+I+G model...
Estimate model parameters (epsilon = 5.000)
Perform nearest neighbor interchange...
Estimate model parameters (epsilon = 1.000)
1. Initial log-likelihood: -1396.594
2. Current log-likelihood: -1395.225
Optimal log-likelihood: -1394.470
Rate parameters: A-C: 0.22093 A-G: 2.05280 A-T: 1.94947 C-G: 1.06436 C-T: 2.58632 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.033
Gamma shape alpha: 1.320
Parameters optimization took 2 rounds (0.008 sec)
Time for fast ML tree search: 0.030 seconds
NOTE: ModelFinder requires 1 MB RAM!
ModelFinder will test up to 484 DNA models (sample size: 214) ...
No. Model -LnL df AIC AICc BIC
1 GTR+F 1410.735 45 2911.470 2936.113 3062.939
2 GTR+F+I 1408.913 46 2909.825 2935.717 3064.660
3 GTR+F+G4 1392.994 46 2877.988 2903.880 3032.823
4 GTR+F+I+G4 1393.284 47 2880.568 2907.749 3038.769
5 GTR+F+R2 1387.695 47 2869.389 2896.570 3027.590
6 GTR+F+R3 1387.734 49 2873.467 2903.346 3038.400
14 GTR+F+I+R2 1387.786 48 2871.571 2900.080 3033.138
15 GTR+F+I+R3 1387.749 50 2875.498 2906.786 3043.797
25 SYM+G4 1393.513 43 2873.027 2895.285 3017.764
27 SYM+R2 1389.896 44 2867.792 2891.224 3015.895
36 SYM+I+R2 1390.009 45 2870.018 2894.661 3021.487
47 TVM+F+G4 1393.482 45 2876.964 2901.607 3028.433
49 TVM+F+R2 1388.482 46 2868.965 2894.857 3023.800
58 TVM+F+I+R2 1388.512 47 2871.023 2898.204 3029.224
69 TVMe+G4 1393.649 42 2871.298 2892.421 3012.669
71 TVMe+R2 1389.915 43 2865.830 2888.089 3010.567
80 TVMe+I+R2 1390.045 44 2868.090 2891.522 3016.193
91 TIM3+F+G4 1396.896 44 2881.792 2905.224 3029.895
93 TIM3+F+R2 1391.444 45 2872.887 2897.530 3024.356
102 TIM3+F+I+R2 1391.573 46 2875.146 2901.039 3029.981
113 TIM3e+G4 1396.973 41 2875.945 2895.968 3013.950
115 TIM3e+R2 1393.201 42 2870.402 2891.525 3011.773
124 TIM3e+I+R2 1393.369 43 2872.738 2894.997 3017.475
135 TIM2+F+G4 1401.394 44 2890.789 2914.221 3038.892
137 TIM2+F+R2 1395.779 45 2881.558 2906.201 3033.027
146 TIM2+F+I+R2 1395.842 46 2883.684 2909.576 3038.519
157 TIM2e+G4 1406.338 41 2894.676 2914.699 3032.681
159 TIM2e+R2 1402.241 42 2888.482 2909.605 3029.853
168 TIM2e+I+R2 1402.355 43 2890.710 2912.969 3035.447
179 TIM+F+G4 1397.923 44 2883.846 2907.278 3031.949
181 TIM+F+R2 1392.152 45 2874.304 2898.946 3025.772
190 TIM+F+I+R2 1392.235 46 2876.470 2902.362 3031.305
201 TIMe+G4 1403.735 41 2889.469 2909.492 3027.474
203 TIMe+R2 1399.368 42 2882.736 2903.858 3024.107
212 TIMe+I+R2 1399.508 43 2885.016 2907.275 3029.753
223 TPM3u+F+G4 1397.362 43 2880.723 2902.982 3025.460
225 TPM3u+F+R2 1392.261 44 2872.521 2895.953 3020.624
234 TPM3u+F+I+R2 1392.403 45 2874.806 2899.449 3026.275
245 TPM3+G4 1397.112 40 2874.224 2893.183 3008.863
247 TPM3+R2 1393.234 41 2868.467 2888.491 3006.472
256 TPM3+I+R2 1393.402 42 2870.805 2891.928 3012.176
267 TPM2u+F+G4 1401.857 43 2889.714 2911.973 3034.451
269 TPM2u+F+R2 1396.533 44 2881.066 2904.498 3029.169
278 TPM2u+F+I+R2 1396.608 45 2883.216 2907.859 3034.685
289 TPM2+G4 1406.513 40 2893.026 2911.985 3027.665
291 TPM2+R2 1402.287 41 2886.575 2906.598 3024.580
300 TPM2+I+R2 1402.403 42 2888.805 2909.928 3030.176
311 K3Pu+F+G4 1398.518 43 2883.036 2905.295 3027.773
313 K3Pu+F+R2 1393.045 44 2874.089 2897.521 3022.192
322 K3Pu+F+I+R2 1393.136 45 2876.273 2900.916 3027.742
333 K3P+G4 1403.865 40 2887.731 2906.690 3022.370
335 K3P+R2 1399.381 41 2880.762 2900.786 3018.767
344 K3P+I+R2 1399.484 42 2882.968 2904.091 3024.339
355 TN+F+G4 1401.517 43 2889.033 2911.292 3033.770
357 TN+F+R2 1395.989 44 2879.978 2903.410 3028.081
366 TN+F+I+R2 1396.059 45 2882.117 2906.760 3033.586
377 TNe+G4 1406.404 40 2892.809 2911.768 3027.448
379 TNe+R2 1402.282 41 2886.564 2906.587 3024.569
388 TNe+I+R2 1402.388 42 2888.777 2909.900 3030.148
399 HKY+F+G4 1401.987 42 2887.975 2909.098 3029.346
401 HKY+F+R2 1396.740 43 2879.479 2901.738 3024.216
410 HKY+F+I+R2 1396.828 44 2881.656 2905.088 3029.759
421 K2P+G4 1406.580 39 2891.159 2909.090 3022.432
423 K2P+R2 1402.314 40 2884.629 2903.588 3019.268
432 K2P+I+R2 1402.436 41 2886.872 2906.895 3024.877
443 F81+F+G4 1410.185 41 2902.369 2922.392 3040.374
445 F81+F+R2 1405.823 42 2895.645 2916.768 3037.016
454 F81+F+I+R2 1405.965 43 2897.930 2920.189 3042.667
465 JC+G4 1414.852 38 2905.703 2922.640 3033.610
467 JC+R2 1411.425 39 2900.850 2918.781 3032.123
476 JC+I+R2 1411.533 40 2903.065 2922.025 3037.704
Akaike Information Criterion: TVMe+R2
Corrected Akaike Information Criterion: TVMe+R2
Bayesian Information Criterion: TPM3+R2
Best-fit model: TPM3+R2 chosen according to BIC
All model information printed to /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpmhyinwwp/q2iqtree.model.gz
CPU time for ModelFinder: 0.753 seconds (0h:0m:0s)
Wall-clock time for ModelFinder: 0.762 seconds (0h:0m:0s)
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.100)
1. Initial log-likelihood: -1393.234
Optimal log-likelihood: -1393.225
Rate parameters: A-C: 0.30905 A-G: 1.35389 A-T: 1.00000 C-G: 0.30905 C-T: 1.35389 G-T: 1.00000
Base frequencies: A: 0.250 C: 0.250 G: 0.250 T: 0.250
Site proportion and rates: (0.688,0.343) (0.312,2.452)
Parameters optimization took 1 rounds (0.002 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000622034 secs using 97.58% CPU
Computing ML distances took 0.000702 sec (of wall-clock time) 0.000675 sec (of CPU time)
WARNING: Some pairwise ML distances are too long (saturated)
Setting up auxiliary I and S matrices: done in 5.22137e-05 secs using 88.1% CPU
Computing RapidNJ tree took 0.000174 sec (of wall-clock time) 0.000189 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1394.436
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Generating 98 parsimony trees... 0.060 second
Computing log-likelihood of 98 initial trees ... 0.046 seconds
Current best score: -1393.225
Do NNI search on 20 best initial trees
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 1: -1392.049
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 2: -1385.324
BETTER TREE FOUND at iteration 3: -1385.319
Iteration 10 / LogL: -1385.354 / Time: 0h:0m:0s
Iteration 20 / LogL: -1385.323 / Time: 0h:0m:1s
Finish initializing candidate tree set (4)
Current best tree score: -1385.319 / CPU time: 0.253
Number of iterations: 20
--------------------------------------------------------------------
| OPTIMIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Iteration 30 / LogL: -1385.910 / Time: 0h:0m:1s (0h:0m:2s left)
Iteration 40 / LogL: -1385.955 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 50 / LogL: -1386.041 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 60 / LogL: -1385.711 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 70 / LogL: -1385.585 / Time: 0h:0m:1s (0h:0m:1s left)
UPDATE BEST LOG-LIKELIHOOD: -1385.319
Iteration 80 / LogL: -1385.858 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 90 / LogL: -1385.382 / Time: 0h:0m:1s (0h:0m:0s left)
UPDATE BEST LOG-LIKELIHOOD: -1385.319
Iteration 100 / LogL: -1385.545 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 110 / LogL: -1385.984 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 120 / LogL: -1385.949 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 130 / LogL: -1385.546 / Time: 0h:0m:1s (0h:0m:0s left)
UPDATE BEST LOG-LIKELIHOOD: -1385.319
Iteration 140 / LogL: -1385.320 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 150 / LogL: -1385.883 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 160 / LogL: -1385.322 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 170 / LogL: -1385.321 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 180 / LogL: -1385.889 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 190 / LogL: -1385.322 / Time: 0h:0m:2s (0h:0m:0s left)
Iteration 200 / LogL: -1385.320 / Time: 0h:0m:2s (0h:0m:0s left)
TREE SEARCH COMPLETED AFTER 204 ITERATIONS / Time: 0h:0m:2s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.010)
1. Initial log-likelihood: -1385.319
Optimal log-likelihood: -1385.309
Rate parameters: A-C: 0.39437 A-G: 1.57343 A-T: 1.00000 C-G: 0.39437 C-T: 1.57343 G-T: 1.00000
Base frequencies: A: 0.250 C: 0.250 G: 0.250 T: 0.250
Site proportion and rates: (0.717,0.394) (0.283,2.535)
Parameters optimization took 1 rounds (0.002 sec)
BEST SCORE FOUND : -1385.309
Total tree length: 6.954
Total number of iterations: 204
CPU time used for tree search: 1.531 sec (0h:0m:1s)
Wall-clock time used for tree search: 1.351 sec (0h:0m:1s)
Total CPU time used: 2.301 sec (0h:0m:2s)
Total wall-clock time used: 2.127 sec (0h:0m:2s)
Analysis results written to:
IQ-TREE report: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpmhyinwwp/q2iqtree.iqtree
Maximum-likelihood tree: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpmhyinwwp/q2iqtree.treefile
Likelihood distances: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpmhyinwwp/q2iqtree.mldist
Screen log file: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpmhyinwwp/q2iqtree.log
Date and Time: Mon Jul 29 18:15:17 2024
n cores 1
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: iqtree -st DNA --runs 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta -m MFP -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpmhyinwwp/q2iqtree -nt 1 -nstop 200 -pers 0.200000
Saved Phylogeny[Unrooted] to: iqt-nnisi-fast-tree.qza
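The stopping rule described in the tip above (keep searching until a set number of consecutive iterations fail to find a better tree, and reset the counter whenever one is found) can be sketched with a toy model. This is not IQ-TREE's algorithm: the function name toy_search, the hidden optimum, and the random candidate scores are all assumptions made purely for illustration, and IQ-TREE's exact iteration accounting may differ.

```python
import random


def toy_search(stop_iter, seed=0, optimum=-1385.0, max_iter=100_000):
    """Toy stand-in for a stochastic tree search (hypothetical, not IQ-TREE).

    Each iteration draws a candidate log-likelihood just below a hidden
    optimum. The search stops once `stop_iter` consecutive iterations have
    failed to improve on the best score seen so far.
    """
    rng = random.Random(seed)
    best = float("-inf")   # best log-likelihood found so far
    unsuccessful = 0       # consecutive iterations without a better tree
    iteration = 0
    while unsuccessful < stop_iter and iteration < max_iter:
        iteration += 1
        # Hypothetical candidate score: always at or below the optimum.
        candidate = optimum - rng.expovariate(1.0)
        if candidate > best:
            best = candidate
            unsuccessful = 0   # better tree found: the counter resets
        else:
            unsuccessful += 1
    return iteration, best


# Same random candidates, two different stopping thresholds.
iters_100, best_100 = toy_search(stop_iter=100)
iters_200, best_200 = toy_search(stop_iter=200)
print(f"stop_iter=100: {iters_100} iterations, best score {best_100:.3f}")
print(f"stop_iter=200: {iters_200} iterations, best score {best_200:.3f}")
```

Because both runs see the same sequence of candidates, the run with the larger threshold always performs more iterations and never ends with a worse score, which is the trade-off between thoroughness and run time that --p-stop-iter controls.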
iqtree-ultrafast-bootstrap
As per our discussion in the raxml-rapid-bootstrap section above, we can also use IQ-TREE to evaluate how well our splits / bipartitions are supported within our phylogeny via the ultrafast bootstrap algorithm. Below, we’ll apply the plugin’s ultrafast bootstrap command with automatic model selection (MFP), perform 1000 bootstrap replicates (the minimum required), use the same generally suggested parameters for constructing a phylogeny from short sequences, and run on a single CPU core (--p-n-cores 1):
qiime phylogeny iqtree-ultrafast-bootstrap \
--i-alignment masked-aligned-rep-seqs.qza \
--p-perturb-nni-strength 0.2 \
--p-stop-iter 200 \
--p-n-cores 1 \
--o-tree iqt-nnisi-bootstrap-tree.qza \
--verbose
stdout:
IQ-TREE multicore version 2.3.4 COVID-edition for MacOS Intel 64-bit built Jun 18 2024
Developed by Bui Quang Minh, James Barbetti, Nguyen Lam Tung, Olga Chernomor,
Heiko Schmidt, Dominik Schrempf, Michael Woodhams, Ly Trong Nhan, Thomas Wong
Host: Elizabeths-MacBook-Pro-7.local (AVX512, FMA3, 32 GB RAM)
Command: iqtree -bb 1000 -st DNA --runs 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta -m MFP -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp00_31s4u/q2iqtreeufboot -nt 1 -nstop 200 -pers 0.200000
Seed: 467906 (Using SPRNG - Scalable Parallel Random Number Generator)
Time: Mon Jul 29 18:15:26 2024
Kernel: AVX+FMA - 1 threads (8 CPU cores detected)
HINT: Use -nt option to specify number of threads because your CPU has 8 cores!
HINT: -nt AUTO will automatically determine the best number of threads to use.
Reading alignment file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta ... Fasta format detected
Reading fasta file: done in 0.00013113 secs using 83.12% CPU
Alignment most likely contains DNA/RNA sequences
Alignment has 20 sequences with 214 columns, 157 distinct patterns
104 parsimony-informative, 33 singleton sites, 77 constant sites
Gap/Ambiguity Composition p-value
Analyzing sequences: done in 1.5974e-05 secs using 93.9% CPU
1 e84fcf85a6a4065231dcf343bb862f1cb32abae6 40.65% passed 90.91%
2 5525fb6dab7b6577960147574465990c6df070ad 42.99% passed 99.80%
3 eb3564a35320b53cef22a77288838c7446357327 42.99% passed 25.49%
4 418f1d469f08c99976b313028cf6d3f18f61dd55 43.93% passed 71.86%
5 2e3b2c075901640c4de739473f9246385430b1ed 31.31% passed 90.76%
6 0469f8d819bd45c7638d1c8b0895270a05f34267 38.79% passed 92.82%
7 d162ed685007f5adede58f14aece31dfa1b60c18 40.65% passed 97.17%
8 1d45b2bce36cd995c5dcb755babf512e612ce8b9 41.59% passed 39.04%
9 5aba6bd9debc23ded7041ffdcfe5d68a427e8ce8 31.31% passed 87.21%
10 206656bec2abdbc4aee37a661ef5f4a62b5dd6ae 42.99% passed 85.00%
11 606c23e79bb730ad74e3c6efd72004c36674c17a 47.20% passed 87.78%
12 682e91d7e510ab134d0625234ad224f647c14eb0 41.59% passed 31.01%
13 6a36152105590b1eb095b9503e8f1f226fc73e43 39.25% passed 86.29%
14 6ca685c39a33bfbcb3123129e7af88d573df7d6f 42.06% failed 0.02%
15 8a1c44eb462ed58b21f3fdd72dd22bb657db2980 31.78% passed 54.40%
16 9b220cae8d375ea38b8b481cb95949cda8722fcb 36.92% passed 88.78%
17 aa4698d2e2b1fa71d08e2934a923aad7374a18f6 37.85% passed 90.52%
18 b31aa3f04bc9d5e2498d45cf1983dfaf09faa258 31.78% passed 72.69%
19 d44b129a6181f052198bda3813f0802a91612441 41.59% passed 41.69%
20 ed1acad8a98e8579a44370733533ad7d3fed8006 48.13% passed 58.15%
**** TOTAL 39.77% 1 sequences failed composition chi2 test (p-value<5%; df=3)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.001 seconds
Perform fast likelihood tree search using GTR+I+G model...
Estimate model parameters (epsilon = 5.000)
Perform nearest neighbor interchange...
Estimate model parameters (epsilon = 1.000)
1. Initial log-likelihood: -1390.629
Optimal log-likelihood: -1389.812
Rate parameters: A-C: 0.31763 A-G: 2.16791 A-T: 2.02305 C-G: 1.12985 C-T: 3.04155 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.035
Gamma shape alpha: 1.407
Parameters optimization took 1 rounds (0.005 sec)
Time for fast ML tree search: 0.038 seconds
NOTE: ModelFinder requires 1 MB RAM!
ModelFinder will test up to 484 DNA models (sample size: 214) ...
No. Model -LnL df AIC AICc BIC
1 GTR+F 1405.417 45 2900.833 2925.476 3052.302
2 GTR+F+I 1403.836 46 2899.673 2925.565 3054.508
3 GTR+F+G4 1388.332 46 2868.664 2894.556 3023.499
4 GTR+F+I+G4 1388.709 47 2871.418 2898.598 3029.619
5 GTR+F+R2 1382.567 47 2859.134 2886.314 3017.335
6 GTR+F+R3 1382.606 49 2863.212 2893.091 3028.145
14 GTR+F+I+R2 1382.709 48 2861.417 2889.926 3022.984
15 GTR+F+I+R3 1382.677 50 2865.355 2896.643 3033.654
25 SYM+G4 1388.450 43 2862.899 2885.158 3007.636
27 SYM+R2 1384.040 44 2856.081 2879.513 3004.184
36 SYM+I+R2 1384.179 45 2858.358 2883.001 3009.827
47 TVM+F+G4 1389.411 45 2868.823 2893.466 3020.292
49 TVM+F+R2 1384.290 46 2860.581 2886.473 3015.416
58 TVM+F+I+R2 1384.288 47 2862.576 2889.756 3020.777
69 TVMe+G4 1388.431 42 2860.861 2881.984 3002.232
71 TVMe+R2 1384.070 43 2854.141 2876.400 2998.878
80 TVMe+I+R2 1384.207 44 2856.414 2879.846 3004.517
91 TIM3+F+G4 1392.277 44 2872.555 2895.987 3020.658
93 TIM3+F+R2 1385.911 45 2861.822 2886.465 3013.291
102 TIM3+F+I+R2 1386.045 46 2864.090 2889.982 3018.925
113 TIM3e+G4 1391.664 41 2865.328 2885.351 3003.333
115 TIM3e+R2 1386.836 42 2857.673 2878.796 2999.044
124 TIM3e+I+R2 1386.991 43 2859.982 2882.241 3004.719
135 TIM2+F+G4 1395.142 44 2878.284 2901.716 3026.387
137 TIM2+F+R2 1388.245 45 2866.489 2891.132 3017.958
146 TIM2+F+I+R2 1388.352 46 2868.704 2894.596 3023.539
157 TIM2e+G4 1398.833 41 2879.665 2899.688 3017.670
159 TIM2e+R2 1393.046 42 2870.092 2891.214 3011.463
168 TIM2e+I+R2 1393.119 43 2872.238 2894.497 3016.975
179 TIM+F+G4 1391.818 44 2871.637 2895.069 3019.740
181 TIM+F+R2 1385.495 45 2860.989 2885.632 3012.458
190 TIM+F+I+R2 1385.586 46 2863.173 2889.065 3018.008
201 TIMe+G4 1396.053 41 2874.107 2894.130 3012.112
203 TIMe+R2 1390.516 42 2865.031 2886.154 3006.402
212 TIMe+I+R2 1390.601 43 2867.202 2889.461 3011.939
223 TPM3u+F+G4 1393.267 43 2872.534 2894.793 3017.271
225 TPM3u+F+R2 1387.637 44 2863.275 2886.707 3011.378
234 TPM3u+F+I+R2 1387.756 45 2865.513 2890.156 3016.982
245 TPM3+G4 1391.670 40 2863.341 2882.300 2997.980
247 TPM3+R2 1386.891 41 2855.783 2875.806 2993.788
256 TPM3+I+R2 1387.026 42 2858.051 2879.174 2999.422
267 TPM2u+F+G4 1396.124 43 2878.248 2900.506 3022.985
269 TPM2u+F+R2 1389.934 44 2867.868 2891.300 3015.971
278 TPM2u+F+I+R2 1389.966 45 2869.932 2894.575 3021.401
289 TPM2+G4 1398.849 40 2877.698 2896.657 3012.337
291 TPM2+R2 1393.099 41 2868.197 2888.220 3006.202
300 TPM2+I+R2 1393.154 42 2870.308 2891.431 3011.679
311 K3Pu+F+G4 1392.998 43 2871.995 2894.254 3016.732
313 K3Pu+F+R2 1387.256 44 2862.512 2885.943 3010.614
322 K3Pu+F+I+R2 1387.276 45 2864.551 2889.194 3016.020
333 K3P+G4 1396.053 40 2872.105 2891.065 3006.744
335 K3P+R2 1390.569 41 2863.137 2883.161 3001.142
344 K3P+I+R2 1390.686 42 2865.373 2886.496 3006.744
355 TN+F+G4 1395.494 43 2876.988 2899.247 3021.725
357 TN+F+R2 1388.629 44 2865.258 2888.689 3013.360
366 TN+F+I+R2 1388.742 45 2867.483 2892.126 3018.952
377 TNe+G4 1398.835 40 2877.670 2896.630 3012.309
379 TNe+R2 1393.043 41 2868.085 2888.109 3006.091
388 TNe+I+R2 1393.116 42 2870.233 2891.356 3011.604
399 HKY+F+G4 1396.493 42 2876.986 2898.109 3018.357
401 HKY+F+R2 1390.329 43 2866.658 2888.917 3011.395
410 HKY+F+I+R2 1390.362 44 2868.725 2892.157 3016.828
421 K2P+G4 1398.849 39 2875.699 2893.630 3006.972
423 K2P+R2 1393.099 40 2866.197 2885.157 3000.836
432 K2P+I+R2 1393.153 41 2868.306 2888.330 3006.311
443 F81+F+G4 1406.493 41 2894.987 2915.010 3032.992
445 F81+F+R2 1401.182 42 2886.363 2907.486 3027.734
454 F81+F+I+R2 1401.268 43 2888.536 2910.795 3033.273
465 JC+G4 1408.772 38 2893.544 2910.481 3021.451
467 JC+R2 1403.900 39 2885.801 2903.732 3017.074
476 JC+I+R2 1403.935 40 2887.870 2906.829 3022.509
Akaike Information Criterion: TVMe+R2
Corrected Akaike Information Criterion: TPM3+R2
Bayesian Information Criterion: TPM3+R2
Best-fit model: TPM3+R2 chosen according to BIC
All model information printed to /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp00_31s4u/q2iqtreeufboot.model.gz
CPU time for ModelFinder: 0.772 seconds (0h:0m:0s)
Wall-clock time for ModelFinder: 0.782 seconds (0h:0m:0s)
Generating 1000 samples for ultrafast bootstrap (seed: 467906)...
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.100)
1. Initial log-likelihood: -1386.891
Optimal log-likelihood: -1386.887
Rate parameters: A-C: 0.39145 A-G: 1.51426 A-T: 1.00000 C-G: 0.39145 C-T: 1.51426 G-T: 1.00000
Base frequencies: A: 0.250 C: 0.250 G: 0.250 T: 0.250
Site proportion and rates: (0.718,0.396) (0.282,2.538)
Parameters optimization took 1 rounds (0.002 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000609875 secs using 97.56% CPU
Computing ML distances took 0.000731 sec (of wall-clock time) 0.000645 sec (of CPU time)
Setting up auxiliary I and S matrices: done in 2.88486e-05 secs using 79.73% CPU
Computing RapidNJ tree took 0.000144 sec (of wall-clock time) 0.000164 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1393.863
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Generating 98 parsimony trees... 0.056 second
Computing log-likelihood of 98 initial trees ... 0.045 seconds
Current best score: -1386.887
Do NNI search on 20 best initial trees
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 1: -1385.310
UPDATE BEST LOG-LIKELIHOOD: -1385.308
UPDATE BEST LOG-LIKELIHOOD: -1385.305
Iteration 10 / LogL: -1385.340 / Time: 0h:0m:1s
Iteration 20 / LogL: -1385.340 / Time: 0h:0m:1s
Finish initializing candidate tree set (2)
Current best tree score: -1385.305 / CPU time: 0.338
Number of iterations: 20
--------------------------------------------------------------------
| OPTIMIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
BETTER TREE FOUND at iteration 23: -1385.305
Iteration 30 / LogL: -1385.312 / Time: 0h:0m:1s (0h:0m:2s left)
Iteration 40 / LogL: -1385.832 / Time: 0h:0m:1s (0h:0m:2s left)
UPDATE BEST LOG-LIKELIHOOD: -1385.305
Iteration 50 / LogL: -1385.305 / Time: 0h:0m:1s (0h:0m:2s left)
Log-likelihood cutoff on original alignment: -1416.968
Iteration 60 / LogL: -1385.531 / Time: 0h:0m:1s (0h:0m:2s left)
Iteration 70 / LogL: -1385.628 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 80 / LogL: -1385.307 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 90 / LogL: -1385.306 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 100 / LogL: -1385.306 / Time: 0h:0m:1s (0h:0m:1s left)
Log-likelihood cutoff on original alignment: -1417.494
NOTE: Bootstrap correlation coefficient of split occurrence frequencies: 0.995
Iteration 110 / LogL: -1385.962 / Time: 0h:0m:2s (0h:0m:1s left)
Iteration 120 / LogL: -1385.306 / Time: 0h:0m:2s (0h:0m:1s left)
Iteration 130 / LogL: -1385.307 / Time: 0h:0m:2s (0h:0m:1s left)
Iteration 140 / LogL: -1385.691 / Time: 0h:0m:2s (0h:0m:0s left)
UPDATE BEST LOG-LIKELIHOOD: -1385.305
Iteration 150 / LogL: -1385.628 / Time: 0h:0m:2s (0h:0m:0s left)
Log-likelihood cutoff on original alignment: -1418.317
Iteration 160 / LogL: -1385.306 / Time: 0h:0m:2s (0h:0m:0s left)
Iteration 170 / LogL: -1385.698 / Time: 0h:0m:2s (0h:0m:0s left)
Iteration 180 / LogL: -1385.319 / Time: 0h:0m:2s (0h:0m:0s left)
Iteration 190 / LogL: -1385.308 / Time: 0h:0m:2s (0h:0m:0s left)
UPDATE BEST LOG-LIKELIHOOD: -1385.305
UPDATE BEST LOG-LIKELIHOOD: -1385.305
Iteration 200 / LogL: -1385.839 / Time: 0h:0m:3s (0h:0m:0s left)
Log-likelihood cutoff on original alignment: -1418.317
NOTE: Bootstrap correlation coefficient of split occurrence frequencies: 0.998
Iteration 210 / LogL: -1385.306 / Time: 0h:0m:3s (0h:0m:1s left)
Iteration 220 / LogL: -1385.307 / Time: 0h:0m:3s (0h:0m:0s left)
TREE SEARCH COMPLETED AFTER 224 ITERATIONS / Time: 0h:0m:3s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.010)
1. Initial log-likelihood: -1385.305
Optimal log-likelihood: -1385.304
Rate parameters: A-C: 0.39601 A-G: 1.57585 A-T: 1.00000 C-G: 0.39601 C-T: 1.57585 G-T: 1.00000
Base frequencies: A: 0.250 C: 0.250 G: 0.250 T: 0.250
Site proportion and rates: (0.722,0.400) (0.278,2.554)
Parameters optimization took 1 rounds (0.002 sec)
BEST SCORE FOUND : -1385.304
Creating bootstrap support values...
Split supports printed to NEXUS file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp00_31s4u/q2iqtreeufboot.splits.nex
Total tree length: 6.876
Total number of iterations: 224
CPU time used for tree search: 2.775 sec (0h:0m:2s)
Wall-clock time used for tree search: 2.611 sec (0h:0m:2s)
Total CPU time used: 3.603 sec (0h:0m:3s)
Total wall-clock time used: 3.450 sec (0h:0m:3s)
Computing bootstrap consensus tree...
Reading input file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp00_31s4u/q2iqtreeufboot.splits.nex...
20 taxa and 153 splits.
Consensus tree written to /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp00_31s4u/q2iqtreeufboot.contree
Reading input trees file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp00_31s4u/q2iqtreeufboot.contree
Log-likelihood of consensus tree: -1385.305
Analysis results written to:
IQ-TREE report: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp00_31s4u/q2iqtreeufboot.iqtree
Maximum-likelihood tree: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp00_31s4u/q2iqtreeufboot.treefile
Likelihood distances: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp00_31s4u/q2iqtreeufboot.mldist
Ultrafast bootstrap approximation results written to:
Split support values: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp00_31s4u/q2iqtreeufboot.splits.nex
Consensus tree: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp00_31s4u/q2iqtreeufboot.contree
Screen log file: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp00_31s4u/q2iqtreeufboot.log
Date and Time: Mon Jul 29 18:15:29 2024
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: iqtree -bb 1000 -st DNA --runs 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta -m MFP -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp00_31s4u/q2iqtreeufboot -nt 1 -nstop 200 -pers 0.200000
Saved Phylogeny[Unrooted] to: iqt-nnisi-bootstrap-tree.qza
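The ModelFinder table in the log above reports AIC, AICc, and BIC for each candidate model, and the three criteria do not all pick the same winner. As a rough check, the sketch below recomputes the criteria from the -LnL and df columns using the standard textbook formulas, with the sample size of 214 reported in the log. Only four rows are copied from the table to keep the example short; the variable and function names are illustrative.

```python
import math

SAMPLE_SIZE = 214  # "sample size: 214" in the ModelFinder log

# (model, -LnL, df) rows copied from the ModelFinder table above.
ROWS = [
    ("GTR+F+R2", 1382.567, 47),
    ("SYM+R2", 1384.040, 44),
    ("TVMe+R2", 1384.070, 43),
    ("TPM3+R2", 1386.891, 41),
]


def criteria(neg_lnl, k, n=SAMPLE_SIZE):
    """Return AIC, AICc, and BIC for a model with k free parameters."""
    aic = 2 * k + 2 * neg_lnl
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)
    bic = k * math.log(n) + 2 * neg_lnl
    return {"AIC": aic, "AICc": aicc, "BIC": bic}


scores = {model: criteria(neg_lnl, k) for model, neg_lnl, k in ROWS}

# For each criterion, the preferred model is the one with the lowest score.
best = {
    name: min(scores, key=lambda m: scores[m][name])
    for name in ("AIC", "AICc", "BIC")
}

for name, model in best.items():
    print(f"{name}: {model} ({scores[model][name]:.3f})")
```

On these rows AIC prefers TVMe+R2 while AICc and BIC prefer TPM3+R2, in line with the summary lines in the log. BIC penalizes each extra parameter by ln(214), roughly 5.4, rather than 2, so it tends to favor the model with fewer parameters.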
Perform single branch tests alongside ufboot
We can also apply single branch test methods concurrently with ultrafast bootstrapping. The support values will always be represented in the following order: alrt / lbp / abayes / ufboot. Again, these values can be seen as separately listed bootstrap values in iTOL. We’ll also specify a model as we did earlier.
qiime phylogeny iqtree-ultrafast-bootstrap \
--i-alignment masked-aligned-rep-seqs.qza \
--p-perturb-nni-strength 0.2 \
--p-stop-iter 200 \
--p-n-cores 1 \
--p-alrt 1000 \
--p-abayes \
--p-lbp 1000 \
--p-substitution-model 'GTR+I+G' \
--o-tree iqt-nnisi-bootstrap-sbt-gtrig-tree.qza \
--verbose
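Once a tree built this way has been exported, each internal node carries four support values in the order alrt / lbp / abayes / ufboot. The sketch below shows one way to split such a label into named values. It assumes the values are joined with "/" in the node label; the helper name parse_support and the example label are hypothetical and are not taken from the run shown here.

```python
SUPPORT_ORDER = ("alrt", "lbp", "abayes", "ufboot")


def parse_support(label):
    """Split an 'alrt/lbp/abayes/ufboot' node label into named floats.

    Assumes the four supports are slash-separated and appear in the
    order stated in this tutorial.
    """
    parts = label.split("/")
    if len(parts) != len(SUPPORT_ORDER):
        raise ValueError(
            f"expected {len(SUPPORT_ORDER)} values, got {len(parts)}: {label!r}"
        )
    return dict(zip(SUPPORT_ORDER, (float(p) for p in parts)))


# Hypothetical node label, for illustration only.
example_label = "89.4/91/0.987/96"
supports = parse_support(example_label)
for name, value in supports.items():
    print(f"{name}: {value}")
```

Keeping the order in a single tuple means the mapping only has to be changed in one place if a different combination of tests is requested.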
stdout:
IQ-TREE multicore version 2.3.4 COVID-edition for MacOS Intel 64-bit built Jun 18 2024
Developed by Bui Quang Minh, James Barbetti, Nguyen Lam Tung, Olga Chernomor,
Heiko Schmidt, Dominik Schrempf, Michael Woodhams, Ly Trong Nhan, Thomas Wong
Host: Elizabeths-MacBook-Pro-7.local (AVX512, FMA3, 32 GB RAM)
Command: iqtree -bb 1000 -st DNA --runs 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta -m GTR+I+G -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpyv65l0e2/q2iqtreeufboot -nt 1 -alrt 1000 -abayes -lbp 1000 -nstop 200 -pers 0.200000
Seed: 641738 (Using SPRNG - Scalable Parallel Random Number Generator)
Time: Mon Jul 29 18:15:38 2024
Kernel: AVX+FMA - 1 threads (8 CPU cores detected)
HINT: Use -nt option to specify number of threads because your CPU has 8 cores!
HINT: -nt AUTO will automatically determine the best number of threads to use.
Reading alignment file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta ... Fasta format detected
Reading fasta file: done in 0.000112057 secs using 84.78% CPU
Alignment most likely contains DNA/RNA sequences
Alignment has 20 sequences with 214 columns, 157 distinct patterns
104 parsimony-informative, 33 singleton sites, 77 constant sites
Gap/Ambiguity Composition p-value
Analyzing sequences: done in 1.09673e-05 secs using 82.06% CPU
1 e84fcf85a6a4065231dcf343bb862f1cb32abae6 40.65% passed 90.91%
2 5525fb6dab7b6577960147574465990c6df070ad 42.99% passed 99.80%
3 eb3564a35320b53cef22a77288838c7446357327 42.99% passed 25.49%
4 418f1d469f08c99976b313028cf6d3f18f61dd55 43.93% passed 71.86%
5 2e3b2c075901640c4de739473f9246385430b1ed 31.31% passed 90.76%
6 0469f8d819bd45c7638d1c8b0895270a05f34267 38.79% passed 92.82%
7 d162ed685007f5adede58f14aece31dfa1b60c18 40.65% passed 97.17%
8 1d45b2bce36cd995c5dcb755babf512e612ce8b9 41.59% passed 39.04%
9 5aba6bd9debc23ded7041ffdcfe5d68a427e8ce8 31.31% passed 87.21%
10 206656bec2abdbc4aee37a661ef5f4a62b5dd6ae 42.99% passed 85.00%
11 606c23e79bb730ad74e3c6efd72004c36674c17a 47.20% passed 87.78%
12 682e91d7e510ab134d0625234ad224f647c14eb0 41.59% passed 31.01%
13 6a36152105590b1eb095b9503e8f1f226fc73e43 39.25% passed 86.29%
14 6ca685c39a33bfbcb3123129e7af88d573df7d6f 42.06% failed 0.02%
15 8a1c44eb462ed58b21f3fdd72dd22bb657db2980 31.78% passed 54.40%
16 9b220cae8d375ea38b8b481cb95949cda8722fcb 36.92% passed 88.78%
17 aa4698d2e2b1fa71d08e2934a923aad7374a18f6 37.85% passed 90.52%
18 b31aa3f04bc9d5e2498d45cf1983dfaf09faa258 31.78% passed 72.69%
19 d44b129a6181f052198bda3813f0802a91612441 41.59% passed 41.69%
20 ed1acad8a98e8579a44370733533ad7d3fed8006 48.13% passed 58.15%
**** TOTAL 39.77% 1 sequences failed composition chi2 test (p-value<5%; df=3)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.001 seconds
Generating 1000 samples for ultrafast bootstrap (seed: 641738)...
NOTE: 1 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.100)
Thoroughly optimizing +I+G parameters from 10 start values...
Init pinv, alpha: 0.000, 1.000 / Estimate: 0.000, 1.282 / LogL: -1392.553
Init pinv, alpha: 0.040, 1.000 / Estimate: 0.008, 1.377 / LogL: -1392.829
Init pinv, alpha: 0.080, 1.000 / Estimate: 0.009, 1.391 / LogL: -1392.898
Init pinv, alpha: 0.120, 1.000 / Estimate: 0.009, 1.388 / LogL: -1392.889
Init pinv, alpha: 0.160, 1.000 / Estimate: 0.008, 1.383 / LogL: -1392.853
Init pinv, alpha: 0.200, 1.000 / Estimate: 0.009, 1.384 / LogL: -1392.879
Init pinv, alpha: 0.240, 1.000 / Estimate: 0.007, 1.379 / LogL: -1392.828
Init pinv, alpha: 0.280, 1.000 / Estimate: 0.008, 1.382 / LogL: -1392.844
Init pinv, alpha: 0.320, 1.000 / Estimate: 0.008, 1.383 / LogL: -1392.849
Init pinv, alpha: 0.360, 1.000 / Estimate: 0.008, 1.384 / LogL: -1392.858
Optimal pinv,alpha: 0.000, 1.282 / LogL: -1392.553
Parameters optimization took 0.273 sec
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000942945 secs using 98.73% CPU
Computing ML distances took 0.000999 sec (of wall-clock time) 0.000973 sec (of CPU time)
Setting up auxiliary I and S matrices: done in 5.91278e-05 secs using 87.95% CPU
Computing RapidNJ tree took 0.000178 sec (of wall-clock time) 0.000169 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1392.710
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Generating 98 parsimony trees... 0.061 second
Computing log-likelihood of 98 initial trees ... 0.062 seconds
Current best score: -1392.553
Do NNI search on 20 best initial trees
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 1: -1387.258
Iteration 10 / LogL: -1387.279 / Time: 0h:0m:0s
Iteration 20 / LogL: -1387.264 / Time: 0h:0m:0s
Finish initializing candidate tree set (3)
Current best tree score: -1387.258 / CPU time: 0.433
Number of iterations: 20
--------------------------------------------------------------------
| OPTIMIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Iteration 30 / LogL: -1387.497 / Time: 0h:0m:0s (0h:0m:4s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.254
Iteration 40 / LogL: -1387.277 / Time: 0h:0m:1s (0h:0m:4s left)
Iteration 50 / LogL: -1387.352 / Time: 0h:0m:1s (0h:0m:3s left)
Log-likelihood cutoff on original alignment: -1413.107
Iteration 60 / LogL: -1387.387 / Time: 0h:0m:1s (0h:0m:3s left)
Iteration 70 / LogL: -1387.350 / Time: 0h:0m:1s (0h:0m:2s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.253
UPDATE BEST LOG-LIKELIHOOD: -1387.253
Iteration 80 / LogL: -1387.350 / Time: 0h:0m:1s (0h:0m:2s left)
Iteration 90 / LogL: -1396.644 / Time: 0h:0m:1s (0h:0m:2s left)
Iteration 100 / LogL: -1387.280 / Time: 0h:0m:1s (0h:0m:2s left)
Log-likelihood cutoff on original alignment: -1413.107
NOTE: Bootstrap correlation coefficient of split occurrence frequencies: 0.986
NOTE: UFBoot does not converge, continue at least 100 more iterations
UPDATE BEST LOG-LIKELIHOOD: -1387.253
Iteration 110 / LogL: -1387.284 / Time: 0h:0m:2s (0h:0m:1s left)
Iteration 120 / LogL: -1387.350 / Time: 0h:0m:2s (0h:0m:1s left)
Iteration 130 / LogL: -1387.352 / Time: 0h:0m:2s (0h:0m:1s left)
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 134: -1387.169
Iteration 140 / LogL: -1387.338 / Time: 0h:0m:2s (0h:0m:3s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.169
UPDATE BEST LOG-LIKELIHOOD: -1387.169
Iteration 150 / LogL: -1387.169 / Time: 0h:0m:2s (0h:0m:3s left)
Log-likelihood cutoff on original alignment: -1413.107
Iteration 160 / LogL: -1387.340 / Time: 0h:0m:2s (0h:0m:3s left)
Iteration 170 / LogL: -1387.336 / Time: 0h:0m:3s (0h:0m:3s left)
Iteration 180 / LogL: -1387.169 / Time: 0h:0m:3s (0h:0m:2s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.169
Iteration 190 / LogL: -1387.535 / Time: 0h:0m:3s (0h:0m:2s left)
Iteration 200 / LogL: -1387.341 / Time: 0h:0m:3s (0h:0m:2s left)
Log-likelihood cutoff on original alignment: -1413.107
NOTE: Bootstrap correlation coefficient of split occurrence frequencies: 0.991
Iteration 210 / LogL: -1387.169 / Time: 0h:0m:3s (0h:0m:2s left)
Iteration 220 / LogL: -1387.169 / Time: 0h:0m:3s (0h:0m:2s left)
Iteration 230 / LogL: -1387.364 / Time: 0h:0m:4s (0h:0m:1s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.169
Iteration 240 / LogL: -1387.169 / Time: 0h:0m:4s (0h:0m:1s left)
Iteration 250 / LogL: -1387.169 / Time: 0h:0m:4s (0h:0m:1s left)
Log-likelihood cutoff on original alignment: -1413.107
UPDATE BEST LOG-LIKELIHOOD: -1387.169
UPDATE BEST LOG-LIKELIHOOD: -1387.168
UPDATE BEST LOG-LIKELIHOOD: -1387.168
Iteration 260 / LogL: -1387.168 / Time: 0h:0m:4s (0h:0m:1s left)
Iteration 270 / LogL: -1387.169 / Time: 0h:0m:4s (0h:0m:1s left)
Iteration 280 / LogL: -1387.192 / Time: 0h:0m:4s (0h:0m:0s left)
Iteration 290 / LogL: -1387.169 / Time: 0h:0m:5s (0h:0m:0s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.168
Iteration 300 / LogL: -1387.169 / Time: 0h:0m:5s (0h:0m:0s left)
Log-likelihood cutoff on original alignment: -1413.107
NOTE: Bootstrap correlation coefficient of split occurrence frequencies: 0.986
NOTE: UFBoot does not converge, continue at least 100 more iterations
Iteration 310 / LogL: -1387.169 / Time: 0h:0m:5s (0h:0m:1s left)
Iteration 320 / LogL: -1387.169 / Time: 0h:0m:5s (0h:0m:1s left)
Iteration 330 / LogL: -1387.177 / Time: 0h:0m:5s (0h:0m:1s left)
BETTER TREE FOUND at iteration 333: -1387.168
Iteration 340 / LogL: -1387.169 / Time: 0h:0m:5s (0h:0m:3s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.168
Iteration 350 / LogL: -1387.169 / Time: 0h:0m:5s (0h:0m:3s left)
Log-likelihood cutoff on original alignment: -1413.107
UPDATE BEST LOG-LIKELIHOOD: -1387.168
Iteration 360 / LogL: -1387.169 / Time: 0h:0m:6s (0h:0m:2s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.168
Iteration 370 / LogL: -1387.178 / Time: 0h:0m:6s (0h:0m:2s left)
Iteration 380 / LogL: -1387.192 / Time: 0h:0m:6s (0h:0m:2s left)
Iteration 390 / LogL: -1387.169 / Time: 0h:0m:6s (0h:0m:2s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.168
Iteration 400 / LogL: -1387.176 / Time: 0h:0m:6s (0h:0m:2s left)
Log-likelihood cutoff on original alignment: -1413.107
NOTE: Bootstrap correlation coefficient of split occurrence frequencies: 0.986
NOTE: UFBoot does not converge, continue at least 100 more iterations
UPDATE BEST LOG-LIKELIHOOD: -1387.168
Iteration 410 / LogL: -1387.171 / Time: 0h:0m:7s (0h:0m:2s left)
Iteration 420 / LogL: -1387.168 / Time: 0h:0m:7s (0h:0m:1s left)
Iteration 430 / LogL: -1387.169 / Time: 0h:0m:7s (0h:0m:1s left)
Iteration 440 / LogL: -1387.206 / Time: 0h:0m:7s (0h:0m:1s left)
Iteration 450 / LogL: -1387.213 / Time: 0h:0m:7s (0h:0m:1s left)
Log-likelihood cutoff on original alignment: -1413.107
Iteration 460 / LogL: -1387.169 / Time: 0h:0m:7s (0h:0m:1s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.168
Iteration 470 / LogL: -1387.168 / Time: 0h:0m:8s (0h:0m:1s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.168
Iteration 480 / LogL: -1387.168 / Time: 0h:0m:8s (0h:0m:0s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.168
Iteration 490 / LogL: -1396.665 / Time: 0h:0m:8s (0h:0m:0s left)
Iteration 500 / LogL: -1396.269 / Time: 0h:0m:8s (0h:0m:0s left)
Log-likelihood cutoff on original alignment: -1413.107
NOTE: Bootstrap correlation coefficient of split occurrence frequencies: 0.989
NOTE: UFBoot does not converge, continue at least 100 more iterations
Iteration 510 / LogL: -1387.178 / Time: 0h:0m:8s (0h:0m:1s left)
Iteration 520 / LogL: -1387.169 / Time: 0h:0m:8s (0h:0m:1s left)
Iteration 530 / LogL: -1387.169 / Time: 0h:0m:9s (0h:0m:1s left)
Iteration 540 / LogL: -1387.187 / Time: 0h:0m:9s (0h:0m:1s left)
Iteration 550 / LogL: -1387.169 / Time: 0h:0m:9s (0h:0m:0s left)
Log-likelihood cutoff on original alignment: -1413.107
Iteration 560 / LogL: -1387.447 / Time: 0h:0m:9s (0h:0m:0s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.168
Iteration 570 / LogL: -1387.169 / Time: 0h:0m:9s (0h:0m:0s left)
Iteration 580 / LogL: -1387.169 / Time: 0h:0m:9s (0h:0m:0s left)
Iteration 590 / LogL: -1387.168 / Time: 0h:0m:10s (0h:0m:0s left)
Iteration 600 / LogL: -1387.169 / Time: 0h:0m:10s (0h:0m:0s left)
Log-likelihood cutoff on original alignment: -1413.107
NOTE: Bootstrap correlation coefficient of split occurrence frequencies: 0.991
TREE SEARCH COMPLETED AFTER 600 ITERATIONS / Time: 0h:0m:10s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.010)
1. Initial log-likelihood: -1387.168
Optimal log-likelihood: -1387.167
Rate parameters: A-C: 0.34675 A-G: 2.32822 A-T: 2.14838 C-G: 1.23772 C-T: 3.22576 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.000
Gamma shape alpha: 1.286
Parameters optimization took 1 rounds (0.003 sec)
BEST SCORE FOUND : -1387.167
Testing tree branches by SH-like aLRT with 1000 replicates...
Testing tree branches by local-BP test with 1000 replicates...
Testing tree branches by aBayes parametric test...
0.047 sec.
Creating bootstrap support values...
Split supports printed to NEXUS file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpyv65l0e2/q2iqtreeufboot.splits.nex
Total tree length: 7.597
Total number of iterations: 600
CPU time used for tree search: 10.104 sec (0h:0m:10s)
Wall-clock time used for tree search: 10.032 sec (0h:0m:10s)
Total CPU time used: 10.480 sec (0h:0m:10s)
Total wall-clock time used: 10.411 sec (0h:0m:10s)
Computing bootstrap consensus tree...
Reading input file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpyv65l0e2/q2iqtreeufboot.splits.nex...
20 taxa and 187 splits.
Consensus tree written to /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpyv65l0e2/q2iqtreeufboot.contree
Reading input trees file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpyv65l0e2/q2iqtreeufboot.contree
Log-likelihood of consensus tree: -1387.471
Analysis results written to:
IQ-TREE report: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpyv65l0e2/q2iqtreeufboot.iqtree
Maximum-likelihood tree: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpyv65l0e2/q2iqtreeufboot.treefile
Likelihood distances: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpyv65l0e2/q2iqtreeufboot.mldist
Ultrafast bootstrap approximation results written to:
Split support values: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpyv65l0e2/q2iqtreeufboot.splits.nex
Consensus tree: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpyv65l0e2/q2iqtreeufboot.contree
Screen log file: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpyv65l0e2/q2iqtreeufboot.log
Date and Time: Mon Jul 29 18:15:49 2024
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: iqtree -bb 1000 -st DNA --runs 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/ae281d9d-a0aa-4ff9-8989-4986a6acd700/data/aligned-dna-sequences.fasta -m GTR+I+G -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpyv65l0e2/q2iqtreeufboot -nt 1 -alrt 1000 -abayes -lbp 1000 -nstop 200 -pers 0.200000
Saved Phylogeny[Unrooted] to: iqt-nnisi-bootstrap-sbt-gtrig-tree.qza
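When working with the tree outside of iTOL, it can help to unpack the combined support label into named values. The following is a minimal Python sketch that relies only on the ordering stated above (alrt / lbp / abayes / ufboot). The helper name parse_support_label and the example label are hypothetical and are not taken from the tree produced by this command.

```python
from typing import Dict, Optional

# Order stated in the tutorial text: alrt / lbp / abayes / ufboot.
SUPPORT_ORDER = ("alrt", "lbp", "abayes", "ufboot")


def parse_support_label(label: str) -> Dict[str, Optional[float]]:
    """Split a slash-separated node label into named support values."""
    parts = [p.strip() for p in label.split("/")]
    if len(parts) != len(SUPPORT_ORDER):
        raise ValueError(
            f"expected {len(SUPPORT_ORDER)} values, got {len(parts)}: {label!r}"
        )
    # Empty fields are kept as None rather than guessed.
    return {
        name: (float(p) if p else None)
        for name, p in zip(SUPPORT_ORDER, parts)
    }


# Hypothetical label for illustration only.
example = parse_support_label("87.5/82/0.99/95")
print(example)
```

Keeping the order in a single tuple means the mapping only has to change in one place if a different combination of tests is run.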
Tip
If we need to reduce the impact of potential model violations that occur during a UFBoot search, or would simply like to be more rigorous, we can add the --p-bnni option to any of the iqtree-ultrafast-bootstrap commands above.
Root the phylogeny¶
To make proper use of diversity metrics such as UniFrac, the phylogeny must be rooted. Typically, an outgroup is chosen when rooting a tree. However, phylogenetic inference tools that use Maximum Likelihood often return an unrooted tree by default.
QIIME 2 provides a way to mid-point root our phylogeny. Other rooting options may be available in the future. For now, we’ll root our bootstrap tree from iqtree-ultrafast-bootstrap like so:
qiime phylogeny midpoint-root \
--i-tree iqt-nnisi-bootstrap-sbt-gtrig-tree.qza \
--o-rooted-tree iqt-nnisi-bootstrap-sbt-gtrig-tree-rooted.qza
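To build some intuition for what mid-point rooting does, here is a small conceptual sketch in Python. It is not the code that QIIME 2 runs. It assumes the tree is given as a plain dictionary of branch lengths with non-negative values, and the names Tree, _distances, and midpoint are made up for this illustration. The idea is to find the two most distant tips and place the root halfway along the path between them.

```python
from typing import Dict, Tuple

# Undirected tree: node -> {neighbor: branch length}. Illustrative format only.
Tree = Dict[str, Dict[str, float]]


def _distances(tree: Tree, start: str):
    """Return path lengths from start and each node's predecessor on that path."""
    dist = {start: 0.0}
    prev: Dict[str, str] = {}
    stack = [start]
    while stack:
        node = stack.pop()
        for nbr, length in tree[node].items():
            if nbr not in dist:
                dist[nbr] = dist[node] + length
                prev[nbr] = node
                stack.append(nbr)
    return dist, prev


def midpoint(tree: Tree) -> Tuple[str, str, float]:
    """Locate the midpoint of the longest tip-to-tip path.

    Returns (u, v, offset): the root would sit on edge u-v, offset away from u.
    """
    tips = [n for n, nbrs in tree.items() if len(nbrs) == 1]
    # Two sweeps find the most distant pair of tips in a tree.
    dist, _ = _distances(tree, tips[0])
    a = max(tips, key=lambda t: dist[t])
    dist, prev = _distances(tree, a)
    b = max(tips, key=lambda t: dist[t])
    half = dist[b] / 2.0
    # Walk back from b toward a until the next step would pass the halfway point.
    node = b
    while dist[prev[node]] > half:
        node = prev[node]
    parent = prev[node]
    return parent, node, half - dist[parent]


# Toy tree with three tips (A, B, C) joined at one internal node X.
toy = {
    "A": {"X": 1.0},
    "B": {"X": 3.0},
    "C": {"X": 2.0},
    "X": {"A": 1.0, "B": 3.0, "C": 2.0},
}
print(midpoint(toy))
```

In the toy tree the longest path runs from B to C (length 5.0), so the midpoint falls on the B-X branch, 2.5 away from B.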
Tip
iTOL viewing Reminder. We can view our tree and its associated alignment via iTOL. All you need to do is upload the iqt-nnisi-bootstrap-sbt-gtrig-tree-rooted.qza tree file. Display the tree in Normal mode. Then drag and drop the masked-aligned-rep-seqs.qza file onto the visualization. Now you can view the phylogeny alongside the alignment.
Pipelines¶
Here we will outline the use of the phylogeny pipeline align-to-tree-mafft-fasttree.
One advantage of pipelines is that they combine ordered sets of commonly used commands into one condensed, simple command. To keep these “convenience” pipelines easy to use, it is quite common to expose only a few options to the user. That is, most of the commands executed via pipelines are configured to use default option settings, while options deemed important enough for the user to consider setting are made available. The options exposed via a given pipeline will largely depend upon what it is doing. Pipelines are also a great way for new users to get started, as they help to lay a foundation of good practices in setting up standard operating procedures.
Rather than run one or more of the QIIME 2 commands listed below:
qiime alignment mafft ...
qiime alignment mask ...
qiime phylogeny fasttree ...
qiime phylogeny midpoint-root ...
We can make use of the pipeline align-to-tree-mafft-fasttree to automate the above four steps in one go. Here is the description taken from the pipeline help doc:
This pipeline will start by creating a sequence alignment using MAFFT, after which any alignment columns that are phylogenetically uninformative or ambiguously aligned will be removed (masked). The resulting masked alignment will be used to infer a phylogenetic tree and then subsequently rooted at its midpoint. Output files from each step of the pipeline will be saved. This includes both the unmasked and masked MAFFT alignment from q2-alignment methods, and both the rooted and unrooted phylogenies from q2-phylogeny methods.
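To make the chaining concrete, the sketch below writes out the four manual steps as argument lists in Python, with each step's output feeding the next step's input. The function name manual_steps and the intermediate file names are illustrative assumptions, and every option is left at its default; the pipeline itself may configure the individual steps differently.

```python
from typing import List


def manual_steps(rep_seqs: str = "rep-seqs.qza") -> List[List[str]]:
    """Build argument lists for the four commands the pipeline chains together."""
    # Intermediate file names are illustrative choices, not pipeline outputs.
    aligned = "aligned-rep-seqs.qza"
    masked = "masked-aligned-rep-seqs.qza"
    unrooted = "unrooted-tree.qza"
    rooted = "rooted-tree.qza"
    return [
        ["qiime", "alignment", "mafft",
         "--i-sequences", rep_seqs, "--o-alignment", aligned],
        ["qiime", "alignment", "mask",
         "--i-alignment", aligned, "--o-masked-alignment", masked],
        ["qiime", "phylogeny", "fasttree",
         "--i-alignment", masked, "--o-tree", unrooted],
        ["qiime", "phylogeny", "midpoint-root",
         "--i-tree", unrooted, "--o-rooted-tree", rooted],
    ]


# Dry run: print the commands instead of executing them.
steps = manual_steps()
for argv in steps:
    print(" ".join(argv))
```

In an environment where QIIME 2 is installed, each list could be passed to subprocess.run(argv, check=True) to execute the steps in order.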
This can all be accomplished by simply running the following:
qiime phylogeny align-to-tree-mafft-fasttree \
--i-sequences rep-seqs.qza \
--output-dir mafft-fasttree-output
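Output artifacts:
mafft-fasttree-output/alignment.qza
mafft-fasttree-output/masked_alignment.qza
mafft-fasttree-output/tree.qza
mafft-fasttree-output/rooted_tree.qza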
Congratulations! You now know how to construct a phylogeny in QIIME 2!