Phylogenetic inference with q2-phylogeny¶
Note
This tutorial assumes you’ve read through the QIIME 2 Overview documentation and have worked through at least some of the other Tutorials.
Inferring phylogenies¶
Several downstream diversity metrics, available within QIIME 2, require that a phylogenetic tree be constructed using the Operational Taxonomic Units (OTUs) or Amplicon Sequence Variants (ASVs) being investigated.
But how do we proceed to construct a phylogeny from our sequence data?
Well, there are two phylogeny-based approaches we can use. Deciding upon which to use is largely dependent on your study questions:
1. A reference-based fragment insertion approach. This is likely the ideal choice, especially if your reference phylogeny (and associated representative sequences) encompasses neighboring relatives among which your sequences can be reliably inserted. Any sequences that do not match the reference well enough are not inserted. For example, this approach may not work well if your data contain sequences that are not well represented within your reference phylogeny (e.g. missing clades). For more information, check out these great fragment insertion examples.
2. A de novo approach. Marker genes that can be globally aligned across divergent taxa are usually amenable to sequence alignment and phylogenetic investigation through this approach. Be mindful of the length of your sequences when constructing a de novo phylogeny: short reads may not contain enough phylogenetic information to capture a meaningful phylogeny. This community tutorial will focus on the de novo approaches.
Here, you will learn how to make use of de novo phylogenetic approaches to:
generate a sequence alignment within QIIME 2
mask the alignment if needed
construct a phylogenetic tree
root the phylogenetic tree
If you would like to substitute any of the steps outlined here by making use of tools external to QIIME 2, please see the import, export, and filtering documentation where appropriate.
Sequence Alignment¶
Prior to constructing a phylogeny we must generate a multiple sequence alignment (MSA). When constructing an MSA we are making a statement about the putative homology of the aligned residues (columns of the MSA) by virtue of their sequence similarity.
The algorithms available for constructing an MSA are legion. We will make use of MAFFT (Multiple Alignment using Fast Fourier Transform) via the q2-alignment plugin. For more information, check out the MAFFT paper.
Let’s start by creating a directory to work in:
mkdir qiime2-phylogeny-tutorial
cd qiime2-phylogeny-tutorial
Next, download the data:
Download URL: https://data.qiime2.org/2024.10/tutorials/phylogeny/rep-seqs.qza
Save as: rep-seqs.qza
wget \
-O "rep-seqs.qza" \
"https://data.qiime2.org/2024.10/tutorials/phylogeny/rep-seqs.qza"
curl -sL \
"https://data.qiime2.org/2024.10/tutorials/phylogeny/rep-seqs.qza" > \
"rep-seqs.qza"
Run MAFFT
qiime alignment mafft \
--i-sequences rep-seqs.qza \
--o-alignment aligned-rep-seqs.qza
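A QIIME 2 artifact (.qza) is a zip archive, so its payload can be listed without QIIME 2 itself. The following is a minimal sketch, assuming the usual archive layout of a top-level UUID directory containing a data/ subdirectory; the function name list_artifact_data is hypothetical and not part of any QIIME 2 API.

```python
import zipfile


def list_artifact_data(path):
    """Return payload file names stored under '<uuid>/data/' in a .qza archive."""
    with zipfile.ZipFile(path) as archive:
        names = archive.namelist()
    payload = []
    for name in names:
        parts = name.split("/")
        # Assumed layout: <uuid>/data/<file>; skip metadata and provenance entries.
        if len(parts) >= 3 and parts[1] == "data" and parts[-1]:
            payload.append("/".join(parts[2:]))
    return sorted(payload)
```

Calling list_artifact_data("aligned-rep-seqs.qza") after the MAFFT step should show the aligned FASTA file that downstream methods read.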
Reducing alignment ambiguity: masking and reference alignments¶
Why mask an alignment?
Masking helps to eliminate alignment columns that are phylogenetically uninformative or misleading before phylogenetic analysis. Alignment errors can introduce noise and confound phylogenetic inference, so it is common practice to mask (remove) ambiguously aligned regions prior to performing phylogenetic inference. In particular, David Lane’s (1991) chapter 16S/23S rRNA sequencing proposed masking SSU data prior to phylogenetic analysis. However, knowing how to deal with ambiguously aligned regions and when to apply masks largely depends on the marker genes being analyzed and the question being asked of the data.
Note
Keep in mind that this is still an active area of discussion, as highlighted by the following non-exhaustive list of articles: Wu et al. 2012, Ashkenazy et al. 2018, Schloss 2010, Tan et al. 2015, Rajan 2015.
How to mask an alignment
For our purposes, we’ll assume that we have ambiguously aligned columns in the MAFFT alignment we produced above. The default setting of the --p-min-conservation parameter of the alignment mask method approximates the Lane mask filtering of QIIME 1. Keep an eye out for updates to the alignment plugin.
qiime alignment mask \
--i-alignment aligned-rep-seqs.qza \
--o-masked-alignment masked-aligned-rep-seqs.qza
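To make the idea of masking concrete, here is a small sketch of conservation- and gap-based column filtering. It is not the plugin's implementation: conservation is approximated as the frequency of the most common non-gap character in a column, and the default thresholds (0.4 and 1.0) are assumed to mirror the plugin's defaults, so check qiime alignment mask --help before relying on them. The function name mask_alignment is hypothetical.

```python
from collections import Counter

GAP_CHARS = {"-", "."}


def mask_alignment(seqs, min_conservation=0.4, max_gap_frequency=1.0):
    """Drop columns that are too gappy or too poorly conserved.

    seqs maps sequence ID -> aligned sequence (all the same length).
    Conservation is the frequency of the most common non-gap character,
    a simplified stand-in for the metric used by the plugin.
    """
    rows = list(seqs.values())
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all aligned sequences must have the same length")

    keep = []
    for col in range(width):
        column = [row[col].upper() for row in rows]
        residues = [c for c in column if c not in GAP_CHARS]
        gap_frequency = 1 - len(residues) / len(column)
        if not residues or gap_frequency > max_gap_frequency:
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        if top_count / len(column) >= min_conservation:
            keep.append(col)

    return {sid: "".join(seq[c] for c in keep) for sid, seq in seqs.items()}


toy = {
    "s1": "ACGT-A",
    "s2": "ACTT-C",
    "s3": "ACAT-G",
    "s4": "ACCTTT",
}
masked = mask_alignment(toy)
print(masked)  # columns 3, 5 and 6 are dropped, leaving "ACT" for every sequence
```

In this toy alignment only the three fully conserved columns survive; the variable columns and the mostly-gap column fall below the 0.4 conservation threshold.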
Reference based alignments
There are several tools that attempt to reduce the number of ambiguously aligned regions by using curated reference alignments. Traditional de novo alignment methods mutually align a set of unaligned sequences to create a multiple sequence alignment (MSA) from scratch. Re-running these methods with additional sequences will create MSAs with varying numbers of columns and different assignments of bases to each column. These alignments are therefore incompatible with one another and may not be joined through concatenation.
Reference based alignments, on the other hand, are meant to add sequences to an existing alignment. Alignments computed using reference based alignment tools always have widths identical to the reference alignment and maintain the meaning of each column. Therefore, these alignments may be concatenated.
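The compatibility argument can be sketched in a few lines. The helper below (merge_alignments, a hypothetical name) joins two alignments row-wise and refuses when their widths differ. Note that equal width is only a necessary check: it cannot prove that two alignments share column meaning, which is what aligning against the same reference guarantees.

```python
def merge_alignments(first, second):
    """Combine two alignments (dicts of ID -> aligned sequence) row-wise.

    Only meaningful when both share the same column coordinates, e.g. both
    were aligned against the same reference alignment.
    """
    widths = {len(s) for s in first.values()} | {len(s) for s in second.values()}
    if len(widths) != 1:
        raise ValueError(f"incompatible alignment widths: {sorted(widths)}")
    duplicated = set(first) & set(second)
    if duplicated:
        raise ValueError(f"duplicate sequence IDs: {sorted(duplicated)}")
    return {**first, **second}


# Two toy batches aligned against the same (imaginary) 5-column reference.
batch_a = {"s1": "AC-GT", "s2": "ACTGT"}
batch_b = {"s3": "A--GT"}
combined = merge_alignments(batch_a, batch_b)
```

A de novo alignment of a different batch would typically have a different width, and merge_alignments would raise a ValueError instead of silently producing a misaligned result.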
QIIME 2 currently does not wrap any methods for reference-based alignments, but alignments created using these methods can be imported into QIIME 2 as FeatureData[AlignedSequence] artifacts, provided that the alignments are in standard FASTA format. Some examples of tools for reference-based alignment include PyNAST (using NAST), Infernal, and SINA. SILVA reference alignments are particularly powerful for rRNA gene sequence data, as knowledge of secondary structure is incorporated into the curation process, thus increasing alignment quality.
Note
Alignments constructed using reference based alignment approaches can be masked too, just like the above MAFFT example. Also, the reference alignment approach we are discussing here is distinct from the reference phylogeny approach (i.e. q2-fragment-insertion) we mentioned earlier. That is, we are not inserting our data into an existing tree, but simply trying to create a more robust alignment for making a better de novo phylogeny.
Construct a phylogeny¶
As with MSA algorithms, phylogenetic inference tools are also legion. Fortunately, there are many great resources to learn about phylogenetics. Below are just a few introductory resources to get you started:
There are several methods / pipelines available through the q2-phylogeny plugin of QIIME 2. These are based on the following tools:
Methods¶
fasttree¶
FastTree is able to construct phylogenies from large sequence alignments quite rapidly. It does this by using a CAT-like rate category approximation, which is also available through RAxML (discussed below). Check out the FastTree online manual for more information.
qiime phylogeny fasttree \
--i-alignment masked-aligned-rep-seqs.qza \
--o-tree fasttree-tree.qza
Tip
For an easy and direct way to view your tree.qza files, upload them to iTOL. Here, you can interactively view and manipulate your phylogeny. Even better, while viewing the tree topology in “Normal mode”, you can drag and drop your associated alignment.qza (the one you used to build the phylogeny) or a relevant taxonomy.qza file onto the iTOL tree visualization. This will allow you to directly view the sequence alignment or taxonomy alongside the phylogeny. 🕶️
raxml¶
Like fasttree, raxml will perform a single phylogenetic inference and return a tree. Note, the default model for raxml is --p-substitution-model GTRGAMMA. If you’d like to construct a tree using the CAT model like fasttree, simply replace GTRGAMMA with GTRCAT as shown below:
qiime phylogeny raxml \
--i-alignment masked-aligned-rep-seqs.qza \
--p-substitution-model GTRCAT \
--o-tree raxml-cat-tree.qza \
--verbose
stdout:
Warning, you specified a working directory via "-w"
Keep in mind that RAxML only accepts absolute path names, not relative ones!
RAxML can't, parse the alignment file as phylip file
it will now try to parse it as FASTA file
Using BFGS method to optimize GTR rate parameters, to disable this specify "--no-bfgs"
This is RAxML version 8.2.12 released by Alexandros Stamatakis on May 2018.
With greatly appreciated code contributions by:
Andre Aberer (HITS)
Simon Berger (HITS)
Alexey Kozlov (HITS)
Kassian Kobert (HITS)
David Dao (KIT and HITS)
Sarah Lutteropp (KIT and HITS)
Nick Pattengale (Sandia)
Wayne Pfeiffer (SDSC)
Akifumi S. Tanabe (NRIFS)
Charlie Taylor (UF)
Alignment has 157 distinct alignment patterns
Proportion of gaps and completely undetermined characters in this alignment: 39.77%
RAxML rapid hill-climbing mode
Using 1 distinct models/data partitions with joint branch length optimization
Executing 1 inferences on the original alignment using 1 distinct randomized MP trees
All free model parameters will be estimated by RAxML
ML estimate of 25 per site rate categories
Likelihood of final tree will be evaluated and optimized under GAMMA
GAMMA Model parameters will be estimated up to an accuracy of 0.1000000000 Log Likelihood units
Partition: 0
Alignment Patterns: 157
Name: No Name Provided
DataType: DNA
Substitution Matrix: GTR
RAxML was called as follows:
raxmlHPC -m GTRCAT -p 4176 -N 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta -w /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmptfk3dtz6 -n q2
Partition: 0 with name: No Name Provided
Base frequencies: 0.243 0.182 0.319 0.256
Inference[0]: Time 0.455133 CAT-based likelihood -1242.762404, best rearrangement setting 5
Conducting final model optimizations on all 1 trees under GAMMA-based models ....
Inference[0] final GAMMA-based Likelihood: -1387.757860 tree written to file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmptfk3dtz6/RAxML_result.q2
Starting final GAMMA-based thorough Optimization on tree 0 likelihood -1387.757860 ....
Final GAMMA-based Score of best tree -1387.352099
Program execution info written to /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmptfk3dtz6/RAxML_info.q2
Best-scoring ML tree written to: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmptfk3dtz6/RAxML_bestTree.q2
Overall execution time: 0.893563 secs or 0.000248 hours or 0.000010 days
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: raxmlHPC -m GTRCAT -p 4176 -N 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta -w /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmptfk3dtz6 -n q2
Saved Phylogeny[Unrooted] to: raxml-cat-tree.qza
Perform multiple searches using raxml¶
If you’d like to perform a more thorough search of “tree space” you can instruct raxml to perform multiple independent searches on the full alignment by using --p-n-searches 5. Once these 5 independent searches are completed, only the single best scoring tree will be returned. Note, we are not bootstrapping here; we’ll do that in a later example. Let’s set --p-substitution-model GTRCAT. Finally, let’s also manually set a seed via --p-seed. By setting our seed, we allow other users to reproduce our phylogeny. That is, anyone using the same sequence alignment and substitution model will generate the same tree as long as they set the same seed value. Although --p-seed is not a required argument, it is generally a good idea to set this value.
qiime phylogeny raxml \
--i-alignment masked-aligned-rep-seqs.qza \
--p-substitution-model GTRCAT \
--p-seed 1723 \
--p-n-searches 5 \
--o-tree raxml-cat-searches-tree.qza \
--verbose
stdout:
Warning, you specified a working directory via "-w"
Keep in mind that RAxML only accepts absolute path names, not relative ones!
RAxML can't, parse the alignment file as phylip file
it will now try to parse it as FASTA file
Using BFGS method to optimize GTR rate parameters, to disable this specify "--no-bfgs"
This is RAxML version 8.2.12 released by Alexandros Stamatakis on May 2018.
With greatly appreciated code contributions by:
Andre Aberer (HITS)
Simon Berger (HITS)
Alexey Kozlov (HITS)
Kassian Kobert (HITS)
David Dao (KIT and HITS)
Sarah Lutteropp (KIT and HITS)
Nick Pattengale (Sandia)
Wayne Pfeiffer (SDSC)
Akifumi S. Tanabe (NRIFS)
Charlie Taylor (UF)
Alignment has 157 distinct alignment patterns
Proportion of gaps and completely undetermined characters in this alignment: 39.77%
RAxML rapid hill-climbing mode
Using 1 distinct models/data partitions with joint branch length optimization
Executing 5 inferences on the original alignment using 5 distinct randomized MP trees
All free model parameters will be estimated by RAxML
ML estimate of 25 per site rate categories
Likelihood of final tree will be evaluated and optimized under GAMMA
GAMMA Model parameters will be estimated up to an accuracy of 0.1000000000 Log Likelihood units
Partition: 0
Alignment Patterns: 157
Name: No Name Provided
DataType: DNA
Substitution Matrix: GTR
RAxML was called as follows:
raxmlHPC -m GTRCAT -p 1723 -N 5 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta -w /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp4ph41pfg -n q2
Partition: 0 with name: No Name Provided
Base frequencies: 0.243 0.182 0.319 0.256
Inference[0]: Time 0.426103 CAT-based likelihood -1238.242991, best rearrangement setting 5
Inference[1]: Time 0.345866 CAT-based likelihood -1249.502284, best rearrangement setting 5
Inference[2]: Time 0.356230 CAT-based likelihood -1242.978035, best rearrangement setting 5
Inference[3]: Time 0.458562 CAT-based likelihood -1243.159855, best rearrangement setting 5
Inference[4]: Time 0.341197 CAT-based likelihood -1261.321621, best rearrangement setting 5
Conducting final model optimizations on all 5 trees under GAMMA-based models ....
Inference[0] final GAMMA-based Likelihood: -1388.324037 tree written to file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp4ph41pfg/RAxML_result.q2.RUN.0
Inference[1] final GAMMA-based Likelihood: -1392.813982 tree written to file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp4ph41pfg/RAxML_result.q2.RUN.1
Inference[2] final GAMMA-based Likelihood: -1388.073642 tree written to file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp4ph41pfg/RAxML_result.q2.RUN.2
Inference[3] final GAMMA-based Likelihood: -1387.945266 tree written to file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp4ph41pfg/RAxML_result.q2.RUN.3
Inference[4] final GAMMA-based Likelihood: -1387.557031 tree written to file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp4ph41pfg/RAxML_result.q2.RUN.4
Starting final GAMMA-based thorough Optimization on tree 4 likelihood -1387.557031 ....
Final GAMMA-based Score of best tree -1387.385075
Program execution info written to /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp4ph41pfg/RAxML_info.q2
Best-scoring ML tree written to: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp4ph41pfg/RAxML_bestTree.q2
Overall execution time: 2.461868 secs or 0.000684 hours or 0.000028 days
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: raxmlHPC -m GTRCAT -p 1723 -N 5 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta -w /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp4ph41pfg -n q2
Saved Phylogeny[Unrooted] to: raxml-cat-searches-tree.qza
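To see how “only the single best scoring tree” is chosen, the sketch below picks the search with the highest (least negative) final log-likelihood from the verbose output above. The log excerpt is copied from that output with the temporary file paths omitted; the helper name best_inference is hypothetical, and this mimics the selection rule rather than reproducing RAxML's own code.

```python
import re

# Final likelihood lines from the verbose output above (paths omitted).
LOG_EXCERPT = """\
Inference[0] final GAMMA-based Likelihood: -1388.324037
Inference[1] final GAMMA-based Likelihood: -1392.813982
Inference[2] final GAMMA-based Likelihood: -1388.073642
Inference[3] final GAMMA-based Likelihood: -1387.945266
Inference[4] final GAMMA-based Likelihood: -1387.557031
"""

PATTERN = re.compile(
    r"Inference\[(\d+)\] final GAMMA-based Likelihood: (-?\d+\.\d+)"
)


def best_inference(log_text):
    """Return (index, log-likelihood) of the highest-scoring search."""
    scores = [(int(i), float(ll)) for i, ll in PATTERN.findall(log_text)]
    if not scores:
        raise ValueError("no final likelihood lines found")
    return max(scores, key=lambda pair: pair[1])


print(best_inference(LOG_EXCERPT))  # (4, -1387.557031)
```

This agrees with the log line “Starting final GAMMA-based thorough Optimization on tree 4”: search 4 has the best likelihood and is the one that is further optimized and returned.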
raxml-rapid-bootstrap¶
In phylogenetics, it is good practice to check how well the splits / bipartitions in your phylogeny are supported. Often one is interested in which clades are robustly separated from other clades in the phylogeny. One way of doing this is via bootstrapping (see the Bootstrapping section of the first introductory link above). In QIIME 2, we’ve provided access to the RAxML rapid bootstrap feature. The only differences between this command and the previous one are the additional flags --p-bootstrap-replicates and --p-rapid-bootstrap-seed. It is quite common to perform anywhere from 100 to 1000 bootstrap replicates. The --p-rapid-bootstrap-seed works very much like the --p-seed argument from above, except that it allows anyone to reproduce the bootstrapping process and the associated supports for your splits.
As per the RAxML online documentation and the RAxML manual, the rapid bootstrapping command that we will execute below will do the following:
1. Bootstrap the input alignment 100 times and perform a Maximum Likelihood (ML) search on each.
2. Find the best scoring ML tree through multiple independent searches using the original input alignment. The number of independent searches is determined by the number of bootstrap replicates set in the 1st step. That is, your search becomes more thorough with increasing bootstrap replicates. The ML optimization of RAxML uses every 5th bootstrap tree as the starting tree for an ML search on the original alignment.
3. Map the bipartitions (bootstrap supports, 1st step) onto the best scoring ML tree (2nd step).
qiime phylogeny raxml-rapid-bootstrap \
--i-alignment masked-aligned-rep-seqs.qza \
--p-seed 1723 \
--p-rapid-bootstrap-seed 9384 \
--p-bootstrap-replicates 100 \
--p-substitution-model GTRCAT \
--o-tree raxml-cat-bootstrap-tree.qza \
--verbose
stdout:
Warning, you specified a working directory via "-w"
Keep in mind that RAxML only accepts absolute path names, not relative ones!
RAxML can't, parse the alignment file as phylip file
it will now try to parse it as FASTA file
Using BFGS method to optimize GTR rate parameters, to disable this specify "--no-bfgs"
This is RAxML version 8.2.12 released by Alexandros Stamatakis on May 2018.
With greatly appreciated code contributions by:
Andre Aberer (HITS)
Simon Berger (HITS)
Alexey Kozlov (HITS)
Kassian Kobert (HITS)
David Dao (KIT and HITS)
Sarah Lutteropp (KIT and HITS)
Nick Pattengale (Sandia)
Wayne Pfeiffer (SDSC)
Akifumi S. Tanabe (NRIFS)
Charlie Taylor (UF)
Alignment has 157 distinct alignment patterns
Proportion of gaps and completely undetermined characters in this alignment: 39.77%
RAxML rapid bootstrapping and subsequent ML search
Using 1 distinct models/data partitions with joint branch length optimization
Executing 100 rapid bootstrap inferences and thereafter a thorough ML search
All free model parameters will be estimated by RAxML
ML estimate of 25 per site rate categories
Likelihood of final tree will be evaluated and optimized under GAMMA
GAMMA Model parameters will be estimated up to an accuracy of 0.1000000000 Log Likelihood units
Partition: 0
Alignment Patterns: 157
Name: No Name Provided
DataType: DNA
Substitution Matrix: GTR
RAxML was called as follows:
raxmlHPC -f a -m GTRCAT -p 1723 -x 9384 -N 100 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta -w /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp5l4hqqlg -n q2bootstrap
Time for BS model parameter optimization 0.036000
Bootstrap[0]: Time 0.124248 seconds, bootstrap likelihood -1199.758796, best rearrangement setting 12
Bootstrap[1]: Time 0.086034 seconds, bootstrap likelihood -1344.229251, best rearrangement setting 6
Bootstrap[2]: Time 0.074889 seconds, bootstrap likelihood -1295.343000, best rearrangement setting 8
Bootstrap[3]: Time 0.064034 seconds, bootstrap likelihood -1273.768320, best rearrangement setting 8
Bootstrap[4]: Time 0.075816 seconds, bootstrap likelihood -1253.402952, best rearrangement setting 6
Bootstrap[5]: Time 0.079073 seconds, bootstrap likelihood -1260.866113, best rearrangement setting 10
Bootstrap[6]: Time 0.077416 seconds, bootstrap likelihood -1293.636299, best rearrangement setting 14
Bootstrap[7]: Time 0.070610 seconds, bootstrap likelihood -1227.178693, best rearrangement setting 6
Bootstrap[8]: Time 0.076632 seconds, bootstrap likelihood -1321.820787, best rearrangement setting 13
Bootstrap[9]: Time 0.082824 seconds, bootstrap likelihood -1147.233446, best rearrangement setting 6
Bootstrap[10]: Time 0.057784 seconds, bootstrap likelihood -1220.766493, best rearrangement setting 13
Bootstrap[11]: Time 0.083446 seconds, bootstrap likelihood -1200.006355, best rearrangement setting 8
Bootstrap[12]: Time 0.089287 seconds, bootstrap likelihood -1346.392834, best rearrangement setting 14
Bootstrap[13]: Time 0.073796 seconds, bootstrap likelihood -1301.111096, best rearrangement setting 14
Bootstrap[14]: Time 0.080128 seconds, bootstrap likelihood -1262.253559, best rearrangement setting 11
Bootstrap[15]: Time 0.079696 seconds, bootstrap likelihood -1215.017551, best rearrangement setting 14
Bootstrap[16]: Time 0.074540 seconds, bootstrap likelihood -1238.832009, best rearrangement setting 7
Bootstrap[17]: Time 0.069774 seconds, bootstrap likelihood -1393.989732, best rearrangement setting 12
Bootstrap[18]: Time 0.072080 seconds, bootstrap likelihood -1173.921002, best rearrangement setting 15
Bootstrap[19]: Time 0.073779 seconds, bootstrap likelihood -1185.726976, best rearrangement setting 11
Bootstrap[20]: Time 0.067586 seconds, bootstrap likelihood -1158.491940, best rearrangement setting 6
Bootstrap[21]: Time 0.065566 seconds, bootstrap likelihood -1154.664272, best rearrangement setting 11
Bootstrap[22]: Time 0.078228 seconds, bootstrap likelihood -1244.159837, best rearrangement setting 10
Bootstrap[23]: Time 0.094908 seconds, bootstrap likelihood -1211.171036, best rearrangement setting 15
Bootstrap[24]: Time 0.075164 seconds, bootstrap likelihood -1261.440677, best rearrangement setting 12
Bootstrap[25]: Time 0.075515 seconds, bootstrap likelihood -1331.836715, best rearrangement setting 15
Bootstrap[26]: Time 0.085044 seconds, bootstrap likelihood -1129.144509, best rearrangement setting 5
Bootstrap[27]: Time 0.098738 seconds, bootstrap likelihood -1226.624056, best rearrangement setting 7
Bootstrap[28]: Time 0.103661 seconds, bootstrap likelihood -1221.046176, best rearrangement setting 12
Bootstrap[29]: Time 0.063345 seconds, bootstrap likelihood -1211.791204, best rearrangement setting 14
Bootstrap[30]: Time 0.085974 seconds, bootstrap likelihood -1389.442380, best rearrangement setting 5
Bootstrap[31]: Time 0.085482 seconds, bootstrap likelihood -1303.638592, best rearrangement setting 12
Bootstrap[32]: Time 0.085765 seconds, bootstrap likelihood -1172.859456, best rearrangement setting 12
Bootstrap[33]: Time 0.075949 seconds, bootstrap likelihood -1244.617135, best rearrangement setting 9
Bootstrap[34]: Time 0.076542 seconds, bootstrap likelihood -1211.871717, best rearrangement setting 15
Bootstrap[35]: Time 0.084243 seconds, bootstrap likelihood -1299.862912, best rearrangement setting 5
Bootstrap[36]: Time 0.072658 seconds, bootstrap likelihood -1141.967505, best rearrangement setting 5
Bootstrap[37]: Time 0.087699 seconds, bootstrap likelihood -1283.923198, best rearrangement setting 12
Bootstrap[38]: Time 0.071515 seconds, bootstrap likelihood -1304.250946, best rearrangement setting 5
Bootstrap[39]: Time 0.069746 seconds, bootstrap likelihood -1407.084376, best rearrangement setting 15
Bootstrap[40]: Time 0.079093 seconds, bootstrap likelihood -1277.946299, best rearrangement setting 13
Bootstrap[41]: Time 0.080496 seconds, bootstrap likelihood -1279.006200, best rearrangement setting 7
Bootstrap[42]: Time 0.076316 seconds, bootstrap likelihood -1160.274606, best rearrangement setting 6
Bootstrap[43]: Time 0.090347 seconds, bootstrap likelihood -1216.079259, best rearrangement setting 14
Bootstrap[44]: Time 0.071404 seconds, bootstrap likelihood -1382.278311, best rearrangement setting 8
Bootstrap[45]: Time 0.079441 seconds, bootstrap likelihood -1099.004439, best rearrangement setting 11
Bootstrap[46]: Time 0.065488 seconds, bootstrap likelihood -1296.527478, best rearrangement setting 8
Bootstrap[47]: Time 0.095256 seconds, bootstrap likelihood -1291.322658, best rearrangement setting 9
Bootstrap[48]: Time 0.060516 seconds, bootstrap likelihood -1161.908080, best rearrangement setting 6
Bootstrap[49]: Time 0.083760 seconds, bootstrap likelihood -1257.348428, best rearrangement setting 13
Bootstrap[50]: Time 0.094071 seconds, bootstrap likelihood -1309.422533, best rearrangement setting 13
Bootstrap[51]: Time 0.070908 seconds, bootstrap likelihood -1197.633097, best rearrangement setting 11
Bootstrap[52]: Time 0.078994 seconds, bootstrap likelihood -1347.123005, best rearrangement setting 8
Bootstrap[53]: Time 0.074870 seconds, bootstrap likelihood -1234.934890, best rearrangement setting 14
Bootstrap[54]: Time 0.081086 seconds, bootstrap likelihood -1227.092434, best rearrangement setting 6
Bootstrap[55]: Time 0.082878 seconds, bootstrap likelihood -1280.635747, best rearrangement setting 7
Bootstrap[56]: Time 0.068967 seconds, bootstrap likelihood -1225.911449, best rearrangement setting 6
Bootstrap[57]: Time 0.064856 seconds, bootstrap likelihood -1236.213347, best rearrangement setting 11
Bootstrap[58]: Time 0.101236 seconds, bootstrap likelihood -1393.245723, best rearrangement setting 14
Bootstrap[59]: Time 0.099373 seconds, bootstrap likelihood -1212.039371, best rearrangement setting 6
Bootstrap[60]: Time 0.077350 seconds, bootstrap likelihood -1248.692011, best rearrangement setting 10
Bootstrap[61]: Time 0.080570 seconds, bootstrap likelihood -1172.820979, best rearrangement setting 13
Bootstrap[62]: Time 0.087827 seconds, bootstrap likelihood -1126.745788, best rearrangement setting 14
Bootstrap[63]: Time 0.081460 seconds, bootstrap likelihood -1267.434444, best rearrangement setting 12
Bootstrap[64]: Time 0.066738 seconds, bootstrap likelihood -1340.680748, best rearrangement setting 5
Bootstrap[65]: Time 0.067535 seconds, bootstrap likelihood -1072.671059, best rearrangement setting 5
Bootstrap[66]: Time 0.085011 seconds, bootstrap likelihood -1234.294838, best rearrangement setting 8
Bootstrap[67]: Time 0.087579 seconds, bootstrap likelihood -1109.249439, best rearrangement setting 15
Bootstrap[68]: Time 0.070007 seconds, bootstrap likelihood -1314.493588, best rearrangement setting 8
Bootstrap[69]: Time 0.067942 seconds, bootstrap likelihood -1173.850035, best rearrangement setting 13
Bootstrap[70]: Time 0.070865 seconds, bootstrap likelihood -1231.066465, best rearrangement setting 10
Bootstrap[71]: Time 0.072296 seconds, bootstrap likelihood -1146.861379, best rearrangement setting 9
Bootstrap[72]: Time 0.068086 seconds, bootstrap likelihood -1148.753369, best rearrangement setting 8
Bootstrap[73]: Time 0.076978 seconds, bootstrap likelihood -1333.374056, best rearrangement setting 9
Bootstrap[74]: Time 0.070934 seconds, bootstrap likelihood -1259.382378, best rearrangement setting 5
Bootstrap[75]: Time 0.075893 seconds, bootstrap likelihood -1319.944496, best rearrangement setting 6
Bootstrap[76]: Time 0.089450 seconds, bootstrap likelihood -1309.042165, best rearrangement setting 14
Bootstrap[77]: Time 0.109543 seconds, bootstrap likelihood -1232.061289, best rearrangement setting 8
Bootstrap[78]: Time 0.083890 seconds, bootstrap likelihood -1261.333984, best rearrangement setting 9
Bootstrap[79]: Time 0.083867 seconds, bootstrap likelihood -1194.644341, best rearrangement setting 13
Bootstrap[80]: Time 0.072284 seconds, bootstrap likelihood -1214.037389, best rearrangement setting 9
Bootstrap[81]: Time 0.079643 seconds, bootstrap likelihood -1224.527657, best rearrangement setting 8
Bootstrap[82]: Time 0.095080 seconds, bootstrap likelihood -1241.464826, best rearrangement setting 11
Bootstrap[83]: Time 0.070214 seconds, bootstrap likelihood -1230.730558, best rearrangement setting 6
Bootstrap[84]: Time 0.076171 seconds, bootstrap likelihood -1219.034592, best rearrangement setting 10
Bootstrap[85]: Time 0.081032 seconds, bootstrap likelihood -1280.071994, best rearrangement setting 8
Bootstrap[86]: Time 0.069241 seconds, bootstrap likelihood -1444.747777, best rearrangement setting 9
Bootstrap[87]: Time 0.068983 seconds, bootstrap likelihood -1245.890035, best rearrangement setting 14
Bootstrap[88]: Time 0.079667 seconds, bootstrap likelihood -1287.832766, best rearrangement setting 7
Bootstrap[89]: Time 0.072838 seconds, bootstrap likelihood -1325.245976, best rearrangement setting 5
Bootstrap[90]: Time 0.080740 seconds, bootstrap likelihood -1227.883697, best rearrangement setting 5
Bootstrap[91]: Time 0.077605 seconds, bootstrap likelihood -1273.489392, best rearrangement setting 8
Bootstrap[92]: Time 0.030966 seconds, bootstrap likelihood -1234.725870, best rearrangement setting 7
Bootstrap[93]: Time 0.083713 seconds, bootstrap likelihood -1235.733064, best rearrangement setting 11
Bootstrap[94]: Time 0.067604 seconds, bootstrap likelihood -1204.319488, best rearrangement setting 15
Bootstrap[95]: Time 0.065655 seconds, bootstrap likelihood -1183.328582, best rearrangement setting 11
Bootstrap[96]: Time 0.076191 seconds, bootstrap likelihood -1196.298898, best rearrangement setting 13
Bootstrap[97]: Time 0.082096 seconds, bootstrap likelihood -1339.251746, best rearrangement setting 12
Bootstrap[98]: Time 0.030678 seconds, bootstrap likelihood -1404.363552, best rearrangement setting 7
Bootstrap[99]: Time 0.042408 seconds, bootstrap likelihood -1270.157811, best rearrangement setting 7
Overall Time for 100 Rapid Bootstraps 7.755416 seconds
Average Time per Rapid Bootstrap 0.077554 seconds
Starting ML Search ...
Fast ML optimization finished
Fast ML search Time: 3.122360 seconds
Slow ML Search 0 Likelihood: -1387.994678
Slow ML Search 1 Likelihood: -1387.994678
Slow ML Search 2 Likelihood: -1387.994676
Slow ML Search 3 Likelihood: -1387.994650
Slow ML Search 4 Likelihood: -1387.994685
Slow ML Search 5 Likelihood: -1388.092954
Slow ML Search 6 Likelihood: -1388.182551
Slow ML Search 7 Likelihood: -1388.182563
Slow ML Search 8 Likelihood: -1388.182547
Slow ML Search 9 Likelihood: -1387.994723
Slow ML optimization finished
Slow ML search Time: 1.571044 seconds
Thorough ML search Time: 0.417004 seconds
Final ML Optimization Likelihood: -1387.204993
Model Information:
Model Parameters of Partition 0, Name: No Name Provided, Type of Data: DNA
alpha: 1.227800
Tree-Length: 7.823400
rate A <-> C: 0.332564
rate A <-> G: 2.312784
rate A <-> T: 2.215466
rate C <-> G: 1.243321
rate C <-> T: 3.278770
rate G <-> T: 1.000000
freq pi(A): 0.243216
freq pi(C): 0.181967
freq pi(G): 0.319196
freq pi(T): 0.255621
ML search took 5.114552 secs or 0.001421 hours
Combined Bootstrap and ML search took 12.870354 secs or 0.003575 hours
Drawing Bootstrap Support Values on best-scoring ML tree ...
Found 1 tree in File /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp5l4hqqlg/RAxML_bestTree.q2bootstrap
Found 1 tree in File /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp5l4hqqlg/RAxML_bestTree.q2bootstrap
Program execution info written to /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp5l4hqqlg/RAxML_info.q2bootstrap
All 100 bootstrapped trees written to: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp5l4hqqlg/RAxML_bootstrap.q2bootstrap
Best-scoring ML tree written to: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp5l4hqqlg/RAxML_bestTree.q2bootstrap
Best-scoring ML tree with support values written to: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp5l4hqqlg/RAxML_bipartitions.q2bootstrap
Best-scoring ML tree with support values as branch labels written to: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp5l4hqqlg/RAxML_bipartitionsBranchLabels.q2bootstrap
Overall execution time for full ML analysis: 12.878717 secs or 0.003577 hours or 0.000149 days
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: raxmlHPC -f a -m GTRCAT -p 1723 -x 9384 -N 100 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta -w /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp5l4hqqlg -n q2bootstrap
Saved Phylogeny[Unrooted] to: raxml-cat-bootstrap-tree.qza
Tip
Optimizing RAxML Run Time.
You may have noticed that we haven’t added the flag --p-raxml-version to the
RAxML methods. This parameter provides a means to access versions of RAxML
that have optimized vector instructions for various modern x86 processor
architectures. Paraphrased from the RAxML manual and help documentation:
Firstly, most recent processors support SSE3 vector instructions and will
likely also support the faster AVX2 vector instructions. Secondly, these
instructions will substantially accelerate the likelihood and parsimony
computations. In general, SSE3 versions will run approximately 40% faster
than the standard version. The AVX2 version will run 10-30% faster than the
SSE3 version. Additionally, keep in mind that using more cores / threads will
not necessarily decrease run time. The RAxML manual suggests using 1 core per
~500 DNA alignment patterns. Alignment pattern information is usually visible
on screen when the --verbose option is used. Finally, try using a rate
category approximation (CAT model; via --p-substitution-model), which yields
trees as good as those from the GAMMA models and is approximately 4 times
faster. See the CAT paper. The CAT approximation is also ideal for alignments
containing 10,000 or more taxa, and is very similar to the CAT-like model of
FastTree2.
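As a rough illustration of the thread guidance in this tip, the sketch below turns an alignment pattern count into a thread count using the "1 core per ~500 patterns" rule of thumb, then prints (without running) a raxml command that uses it. The helper names suggest_threads and print_raxml_cmd and the output file name are hypothetical. The --p-n-threads flag and the AVX2 value for --p-raxml-version are assumptions that should be confirmed with qiime phylogeny raxml --help and against what your processor supports.

```shell
#!/bin/sh
# Suggest a thread count from the number of distinct alignment patterns,
# following the "1 core per ~500 patterns" rule of thumb (never fewer than 1).
suggest_threads() {
  patterns=$1
  threads=$(( patterns / 500 ))
  if [ "$threads" -lt 1 ]; then
    threads=1
  fi
  echo "$threads"
}

# Print (do not run) a raxml command that uses the suggested thread count.
# Arguments: alignment artifact, pattern count, output tree artifact.
# Assumption: --p-n-threads and the AVX2 value are available in your install.
print_raxml_cmd() {
  printf 'qiime phylogeny raxml --i-alignment %s --p-substitution-model GTRCAT --p-raxml-version AVX2 --p-n-threads %s --o-tree %s\n' \
    "$1" "$(suggest_threads "$2")" "$3"
}

# The IQ-TREE log in this tutorial reports 157 distinct patterns for the
# example alignment, so a single thread is suggested here.
print_raxml_cmd masked-aligned-rep-seqs.qza 157 raxml-cat-avx2-tree.qza
```

For an alignment this small, extra threads are unlikely to help. The helper only becomes useful once the pattern count reaches the thousands.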
iqtree
As with the raxml and raxml-rapid-bootstrap methods above, we provide
comparable functionality for IQ-TREE: iqtree and iqtree-ultrafast-bootstrap.
IQ-TREE is unique compared to the fasttree and raxml options, in that it
provides access to 286 models of nucleotide substitution! IQ-TREE can also
determine which of these models best fits your dataset prior to constructing
your tree via its built-in ModelFinder algorithm. This is the default in
QIIME 2, but do not worry, you can set any one of the 286 models of
nucleotide substitution via the --p-substitution-model flag, e.g. you can set
the model as HKY+I+G instead of the default MFP (a basic short-hand for:
“build a phylogeny after determining the best fit model as determined by
ModelFinder”). Keep in mind the additional computational time required for
model testing via ModelFinder.
The simplest way to run the iqtree command with default settings and
automatic model selection (MFP) is like so:
qiime phylogeny iqtree \
--i-alignment masked-aligned-rep-seqs.qza \
--o-tree iqt-tree.qza \
--verbose
stdout:
IQ-TREE multicore version 2.3.6 for MacOS Intel 64-bit built Aug 4 2024
Developed by Bui Quang Minh, Nguyen Lam Tung, Olga Chernomor, Heiko Schmidt,
Dominik Schrempf, Michael Woodhams, Ly Trong Nhan, Thomas Wong
Host: Elizabeths-MacBook-Pro-7.local (AVX512, FMA3, 32 GB RAM)
Command: iqtree -st DNA --runs 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta -m MFP -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp5n80v6d3/q2iqtree -nt 1
Seed: 431052 (Using SPRNG - Scalable Parallel Random Number Generator)
Time: Tue Oct 29 14:47:52 2024
Kernel: AVX+FMA - 1 threads (8 CPU cores detected)
HINT: Use -nt option to specify number of threads because your CPU has 8 cores!
HINT: -nt AUTO will automatically determine the best number of threads to use.
Reading alignment file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta ... Fasta format detected
Reading fasta file: done in 0.00106001 secs using 10% CPU
Alignment most likely contains DNA/RNA sequences
Alignment has 20 sequences with 214 columns, 157 distinct patterns
104 parsimony-informative, 33 singleton sites, 77 constant sites
Gap/Ambiguity Composition p-value
Analyzing sequences: done in 1.09673e-05 secs using 91.18% CPU
1 e84fcf85a6a4065231dcf343bb862f1cb32abae6 40.65% passed 90.91%
2 5525fb6dab7b6577960147574465990c6df070ad 42.99% passed 99.80%
3 eb3564a35320b53cef22a77288838c7446357327 42.99% passed 25.49%
4 418f1d469f08c99976b313028cf6d3f18f61dd55 43.93% passed 71.86%
5 2e3b2c075901640c4de739473f9246385430b1ed 31.31% passed 90.76%
6 0469f8d819bd45c7638d1c8b0895270a05f34267 38.79% passed 92.82%
7 d162ed685007f5adede58f14aece31dfa1b60c18 40.65% passed 97.17%
8 1d45b2bce36cd995c5dcb755babf512e612ce8b9 41.59% passed 39.04%
9 5aba6bd9debc23ded7041ffdcfe5d68a427e8ce8 31.31% passed 87.21%
10 206656bec2abdbc4aee37a661ef5f4a62b5dd6ae 42.99% passed 85.00%
11 606c23e79bb730ad74e3c6efd72004c36674c17a 47.20% passed 87.78%
12 682e91d7e510ab134d0625234ad224f647c14eb0 41.59% passed 31.01%
13 6a36152105590b1eb095b9503e8f1f226fc73e43 39.25% passed 86.29%
14 6ca685c39a33bfbcb3123129e7af88d573df7d6f 42.06% failed 0.02%
15 8a1c44eb462ed58b21f3fdd72dd22bb657db2980 31.78% passed 54.40%
16 9b220cae8d375ea38b8b481cb95949cda8722fcb 36.92% passed 88.78%
17 aa4698d2e2b1fa71d08e2934a923aad7374a18f6 37.85% passed 90.52%
18 b31aa3f04bc9d5e2498d45cf1983dfaf09faa258 31.78% passed 72.69%
19 d44b129a6181f052198bda3813f0802a91612441 41.59% passed 41.69%
20 ed1acad8a98e8579a44370733533ad7d3fed8006 48.13% passed 58.15%
**** TOTAL 39.77% 1 sequences failed composition chi2 test (p-value<5%; df=3)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.001 seconds
Perform fast likelihood tree search using GTR+I+G model...
Estimate model parameters (epsilon = 5.000)
Perform nearest neighbor interchange...
Estimate model parameters (epsilon = 1.000)
1. Initial log-likelihood: -1389.605
Optimal log-likelihood: -1388.793
Rate parameters: A-C: 0.37543 A-G: 2.37167 A-T: 2.15335 C-G: 1.24272 C-T: 3.32366 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.034
Gamma shape alpha: 1.400
Parameters optimization took 1 rounds (0.003 sec)
Time for fast ML tree search: 0.036 seconds
NOTE: ModelFinder requires 1 MB RAM!
ModelFinder will test up to 484 DNA models (sample size: 214 epsilon: 0.100) ...
No. Model -LnL df AIC AICc BIC
1 GTR+F 1402.600 45 2895.200 2919.843 3046.669
2 GTR+F+I 1401.121 46 2894.242 2920.135 3049.077
3 GTR+F+G4 1387.369 46 2866.737 2892.629 3021.572
4 GTR+F+I+G4 1387.734 47 2869.468 2896.648 3027.669
5 GTR+F+R2 1382.380 47 2858.759 2885.940 3016.960
+R3 reinitialized from +R2 with factor 0.500
+R3 reinitialized from +R2 with factor 0.250
6 GTR+F+R3 1382.454 49 2862.909 2892.787 3027.842
14 GTR+F+I+R2 1382.411 48 2860.821 2889.331 3022.388
15 GTR+F+I+R3 1382.464 50 2864.928 2896.216 3033.227
25 SYM+G4 1387.163 43 2860.326 2882.585 3005.063
27 SYM+R2 1383.105 44 2854.209 2877.641 3002.312
36 SYM+I+R2 1383.186 45 2856.372 2881.015 3007.841
47 TVM+F+G4 1388.360 45 2866.721 2891.364 3018.190
49 TVM+F+R2 1383.725 46 2859.451 2885.343 3014.286
58 TVM+F+I+R2 1383.717 47 2861.433 2888.614 3019.634
69 TVMe+G4 1387.152 42 2858.304 2879.427 2999.675
71 TVMe+R2 1383.090 43 2852.179 2874.438 2996.916
80 TVMe+I+R2 1383.142 44 2854.285 2877.717 3002.388
91 TIM3+F+G4 1391.376 44 2870.752 2894.184 3018.855
93 TIM3+F+R2 1385.912 45 2861.823 2886.466 3013.292
102 TIM3+F+I+R2 1385.947 46 2863.895 2889.787 3018.730
113 TIM3e+G4 1390.370 41 2862.741 2882.764 3000.746
115 TIM3e+R2 1385.927 42 2855.854 2876.977 2997.225
124 TIM3e+I+R2 1385.955 43 2857.911 2880.170 3002.648
135 TIM2+F+G4 1393.632 44 2875.264 2898.696 3023.367
137 TIM2+F+R2 1387.689 45 2865.378 2890.021 3016.847
146 TIM2+F+I+R2 1387.679 46 2867.359 2893.251 3022.194
157 TIM2e+G4 1396.798 41 2875.596 2895.619 3013.601
159 TIM2e+R2 1391.568 42 2867.135 2888.258 3008.506
168 TIM2e+I+R2 1391.562 43 2869.123 2891.382 3013.860
179 TIM+F+G4 1390.337 44 2868.673 2892.105 3016.776
181 TIM+F+R2 1384.915 45 2859.831 2884.474 3011.300
190 TIM+F+I+R2 1384.886 46 2861.772 2887.664 3016.607
201 TIMe+G4 1394.028 41 2870.057 2890.080 3008.062
203 TIMe+R2 1388.990 42 2861.980 2883.103 3003.351
212 TIMe+I+R2 1388.990 43 2863.980 2886.239 3008.717
223 TPM3u+F+G4 1392.293 43 2870.585 2892.844 3015.322
225 TPM3u+F+R2 1387.325 44 2862.650 2886.082 3010.753
234 TPM3u+F+I+R2 1387.333 45 2864.665 2889.308 3016.134
245 TPM3+G4 1390.386 40 2860.772 2879.731 2995.411
247 TPM3+R2 1385.935 41 2853.869 2873.893 2991.874
256 TPM3+I+R2 1385.953 42 2855.905 2877.028 2997.276
267 TPM2u+F+G4 1394.529 43 2875.058 2897.316 3019.795
269 TPM2u+F+R2 1389.057 44 2866.115 2889.547 3014.218
278 TPM2u+F+I+R2 1389.038 45 2868.077 2892.719 3019.545
289 TPM2+G4 1396.829 40 2873.658 2892.617 3008.297
291 TPM2+R2 1391.574 41 2865.147 2885.171 3003.152
300 TPM2+I+R2 1391.570 42 2867.139 2888.262 3008.510
311 K3Pu+F+G4 1391.377 43 2868.753 2891.012 3013.490
313 K3Pu+F+R2 1386.370 44 2860.739 2884.171 3008.842
322 K3Pu+F+I+R2 1386.340 45 2862.680 2887.323 3014.149
333 K3P+G4 1394.023 40 2868.047 2887.006 3002.686
335 K3P+R2 1389.000 41 2859.999 2880.022 2998.004
344 K3P+I+R2 1389.006 42 2862.011 2883.134 3003.382
355 TN+F+G4 1394.028 43 2874.056 2896.314 3018.793
357 TN+F+R2 1388.213 44 2864.425 2887.857 3012.528
366 TN+F+I+R2 1388.214 45 2866.428 2891.071 3017.897
377 TNe+G4 1396.818 40 2873.635 2892.595 3008.274
379 TNe+R2 1391.579 41 2865.158 2885.182 3003.163
388 TNe+I+R2 1391.584 42 2867.169 2888.291 3008.540
399 HKY+F+G4 1394.938 42 2873.876 2894.999 3015.247
401 HKY+F+R2 1389.592 43 2865.185 2887.444 3009.922
410 HKY+F+I+R2 1389.579 44 2867.157 2890.589 3015.260
421 K2P+G4 1396.828 39 2871.656 2889.587 3002.929
423 K2P+R2 1391.583 40 2863.165 2882.125 2997.804
432 K2P+I+R2 1391.585 41 2865.170 2885.193 3003.175
443 F81+F+G4 1405.730 41 2893.461 2913.484 3031.466
445 F81+F+R2 1400.797 42 2885.594 2906.717 3026.965
454 F81+F+I+R2 1400.790 43 2887.581 2909.839 3032.318
465 JC+G4 1407.635 38 2891.270 2908.207 3019.177
467 JC+R2 1402.843 39 2883.685 2901.616 3014.958
476 JC+I+R2 1402.837 40 2885.674 2904.634 3020.313
Akaike Information Criterion: TVMe+R2
Corrected Akaike Information Criterion: TPM3+R2
Bayesian Information Criterion: TPM3+R2
Best-fit model: TPM3+R2 chosen according to BIC
All model information printed to /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp5n80v6d3/q2iqtree.model.gz
CPU time for ModelFinder: 0.556 seconds (0h:0m:0s)
Wall-clock time for ModelFinder: 0.567 seconds (0h:0m:0s)
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.100)
1. Initial log-likelihood: -1402.843
2. Current log-likelihood: -1386.465
3. Current log-likelihood: -1385.950
Optimal log-likelihood: -1385.940
Rate parameters: A-C: 0.41103 A-G: 1.56375 A-T: 1.00000 C-G: 0.41103 C-T: 1.56375 G-T: 1.00000
Base frequencies: A: 0.250 C: 0.250 G: 0.250 T: 0.250
Site proportion and rates: (0.722,0.414) (0.278,2.520)
Parameters optimization took 3 rounds (0.008 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000575066 secs using 97.38% CPU
Computing ML distances took 0.000677 sec (of wall-clock time) 0.000623 sec (of CPU time)
Setting up auxiliary I and S matrices: done in 2.69413e-05 secs using 77.95% CPU
Computing RapidNJ tree took 0.000104 sec (of wall-clock time) 0.000120 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1393.853
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Generating 98 parsimony trees... 0.063 second
Computing log-likelihood of 98 initial trees ... 0.049 seconds
Current best score: -1385.940
Do NNI search on 20 best initial trees
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 1: -1385.887
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 2: -1385.308
Iteration 10 / LogL: -1385.341 / Time: 0h:0m:0s
Iteration 20 / LogL: -1385.341 / Time: 0h:0m:0s
Finish initializing candidate tree set (2)
Current best tree score: -1385.308 / CPU time: 0.249
Number of iterations: 20
--------------------------------------------------------------------
| OPTIMIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
BETTER TREE FOUND at iteration 22: -1385.308
Iteration 30 / LogL: -1385.309 / Time: 0h:0m:0s (0h:0m:1s left)
UPDATE BEST LOG-LIKELIHOOD: -1385.308
Iteration 40 / LogL: -1385.878 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 50 / LogL: -1385.930 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 60 / LogL: -1385.665 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 70 / LogL: -1385.845 / Time: 0h:0m:1s (0h:0m:0s left)
UPDATE BEST LOG-LIKELIHOOD: -1385.308
Iteration 80 / LogL: -1385.697 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 90 / LogL: -1385.698 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 100 / LogL: -1385.971 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 110 / LogL: -1385.309 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 120 / LogL: -1385.311 / Time: 0h:0m:1s (0h:0m:0s left)
TREE SEARCH COMPLETED AFTER 123 ITERATIONS / Time: 0h:0m:1s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.010)
1. Initial log-likelihood: -1385.308
Optimal log-likelihood: -1385.305
Rate parameters: A-C: 0.39511 A-G: 1.56732 A-T: 1.00000 C-G: 0.39511 C-T: 1.56732 G-T: 1.00000
Base frequencies: A: 0.250 C: 0.250 G: 0.250 T: 0.250
Site proportion and rates: (0.722,0.403) (0.278,2.550)
Parameters optimization took 1 rounds (0.002 sec)
BEST SCORE FOUND : -1385.305
Total tree length: 6.837
Total number of iterations: 123
CPU time used for tree search: 1.341 sec (0h:0m:1s)
Wall-clock time used for tree search: 1.172 sec (0h:0m:1s)
Total CPU time used: 1.919 sec (0h:0m:1s)
Total wall-clock time used: 1.760 sec (0h:0m:1s)
Analysis results written to:
IQ-TREE report: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp5n80v6d3/q2iqtree.iqtree
Maximum-likelihood tree: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp5n80v6d3/q2iqtree.treefile
Likelihood distances: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp5n80v6d3/q2iqtree.mldist
Screen log file: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp5n80v6d3/q2iqtree.log
Date and Time: Tue Oct 29 14:47:54 2024
n cores 1
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: iqtree -st DNA --runs 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta -m MFP -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp5n80v6d3/q2iqtree -nt 1
Saved Phylogeny[Unrooted] to: iqt-tree.qza
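The verbose output above is long, and the model chosen by ModelFinder is easy to miss. As a minimal sketch, assuming you have saved the onscreen output to a text file (for example by redirecting the command's output), the hypothetical helper best_fit_model below pulls the model name out of the "Best-fit model:" line.

```shell
#!/bin/sh
# Read IQ-TREE verbose output on stdin and print the model named on the
# "Best-fit model:" line; prints nothing if no such line is present.
best_fit_model() {
  awk '/Best-fit model:/ { print $3; exit }'
}

# Example using the line from the run above. In practice, feed a saved log in,
# e.g. best_fit_model < iqtree-verbose.log (a hypothetical file name).
printf 'Best-fit model: TPM3+R2 chosen according to BIC\n' | best_fit_model
```

For the run shown above this prints TPM3+R2, the model selected according to BIC.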
Specifying a substitution model
We can also set a substitution model of our choosing. You may have noticed
while watching the onscreen output of the previous command that the best
fitting model selected by ModelFinder is noted. For the sake of argument,
let’s say the best selected model was shown as GTR+F+I+G4. The F is only a
notation to let us know that if a given model supports unequal base
frequencies, then the empirical base frequencies will be used by default.
Using empirical base frequencies (F), rather than estimating them, greatly
reduces computational time. The iqtree plugin will not accept F within the
model notation supplied at the command line, as this will always be implied
automatically for the appropriate model. Also, the iqtree plugin only accepts
G, not G4, within the model notation. The 4 is simply another explicit
notation to remind us that four rate categories are being assumed by default.
The notation approach used by the plugin simply helps to retain simplicity
and familiarity when supplying model notations on the command line. So, in
brief, we only have to type GTR+I+G as our input model:
qiime phylogeny iqtree \
--i-alignment masked-aligned-rep-seqs.qza \
--p-substitution-model 'GTR+I+G' \
--o-tree iqt-gtrig-tree.qza \
--verbose
stdout:
IQ-TREE multicore version 2.3.6 for MacOS Intel 64-bit built Aug 4 2024
Developed by Bui Quang Minh, Nguyen Lam Tung, Olga Chernomor, Heiko Schmidt,
Dominik Schrempf, Michael Woodhams, Ly Trong Nhan, Thomas Wong
Host: Elizabeths-MacBook-Pro-7.local (AVX512, FMA3, 32 GB RAM)
Command: iqtree -st DNA --runs 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta -m GTR+I+G -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpqtchocya/q2iqtree -nt 1
Seed: 331747 (Using SPRNG - Scalable Parallel Random Number Generator)
Time: Tue Oct 29 14:48:04 2024
Kernel: AVX+FMA - 1 threads (8 CPU cores detected)
HINT: Use -nt option to specify number of threads because your CPU has 8 cores!
HINT: -nt AUTO will automatically determine the best number of threads to use.
Reading alignment file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta ... Fasta format detected
Reading fasta file: done in 0.000105143 secs using 85.6% CPU
Alignment most likely contains DNA/RNA sequences
Alignment has 20 sequences with 214 columns, 157 distinct patterns
104 parsimony-informative, 33 singleton sites, 77 constant sites
Gap/Ambiguity Composition p-value
Analyzing sequences: done in 1.09673e-05 secs using 82.06% CPU
1 e84fcf85a6a4065231dcf343bb862f1cb32abae6 40.65% passed 90.91%
2 5525fb6dab7b6577960147574465990c6df070ad 42.99% passed 99.80%
3 eb3564a35320b53cef22a77288838c7446357327 42.99% passed 25.49%
4 418f1d469f08c99976b313028cf6d3f18f61dd55 43.93% passed 71.86%
5 2e3b2c075901640c4de739473f9246385430b1ed 31.31% passed 90.76%
6 0469f8d819bd45c7638d1c8b0895270a05f34267 38.79% passed 92.82%
7 d162ed685007f5adede58f14aece31dfa1b60c18 40.65% passed 97.17%
8 1d45b2bce36cd995c5dcb755babf512e612ce8b9 41.59% passed 39.04%
9 5aba6bd9debc23ded7041ffdcfe5d68a427e8ce8 31.31% passed 87.21%
10 206656bec2abdbc4aee37a661ef5f4a62b5dd6ae 42.99% passed 85.00%
11 606c23e79bb730ad74e3c6efd72004c36674c17a 47.20% passed 87.78%
12 682e91d7e510ab134d0625234ad224f647c14eb0 41.59% passed 31.01%
13 6a36152105590b1eb095b9503e8f1f226fc73e43 39.25% passed 86.29%
14 6ca685c39a33bfbcb3123129e7af88d573df7d6f 42.06% failed 0.02%
15 8a1c44eb462ed58b21f3fdd72dd22bb657db2980 31.78% passed 54.40%
16 9b220cae8d375ea38b8b481cb95949cda8722fcb 36.92% passed 88.78%
17 aa4698d2e2b1fa71d08e2934a923aad7374a18f6 37.85% passed 90.52%
18 b31aa3f04bc9d5e2498d45cf1983dfaf09faa258 31.78% passed 72.69%
19 d44b129a6181f052198bda3813f0802a91612441 41.59% passed 41.69%
20 ed1acad8a98e8579a44370733533ad7d3fed8006 48.13% passed 58.15%
**** TOTAL 39.77% 1 sequences failed composition chi2 test (p-value<5%; df=3)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.000 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.100)
Thoroughly optimizing +I+G parameters from 10 start values...
Init pinv, alpha: 0.000, 1.000 / Estimate: 0.000, 1.239 / LogL: -1394.543
Init pinv, alpha: 0.040, 1.000 / Estimate: 0.010, 1.340 / LogL: -1394.887
Init pinv, alpha: 0.080, 1.000 / Estimate: 0.010, 1.353 / LogL: -1394.887
Init pinv, alpha: 0.120, 1.000 / Estimate: 0.009, 1.352 / LogL: -1394.871
Init pinv, alpha: 0.160, 1.000 / Estimate: 0.009, 1.348 / LogL: -1394.836
Init pinv, alpha: 0.200, 1.000 / Estimate: 0.009, 1.351 / LogL: -1394.862
Init pinv, alpha: 0.240, 1.000 / Estimate: 0.010, 1.352 / LogL: -1394.884
Init pinv, alpha: 0.280, 1.000 / Estimate: 0.008, 1.346 / LogL: -1394.826
Init pinv, alpha: 0.320, 1.000 / Estimate: 0.009, 1.347 / LogL: -1394.838
Init pinv, alpha: 0.360, 1.000 / Estimate: 0.009, 1.348 / LogL: -1394.841
Optimal pinv,alpha: 0.000, 1.239 / LogL: -1394.543
Parameters optimization took 0.268 sec
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000872135 secs using 98.38% CPU
Computing ML distances took 0.000960 sec (of wall-clock time) 0.000902 sec (of CPU time)
Setting up auxiliary I and S matrices: done in 2.90871e-05 secs using 89.39% CPU
Computing RapidNJ tree took 0.000177 sec (of wall-clock time) 0.000203 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1392.870
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Generating 98 parsimony trees... 0.061 second
Computing log-likelihood of 98 initial trees ... 0.066 seconds
Current best score: -1392.870
Do NNI search on 20 best initial trees
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 1: -1387.265
Iteration 10 / LogL: -1387.282 / Time: 0h:0m:0s
Iteration 20 / LogL: -1387.267 / Time: 0h:0m:0s
Finish initializing candidate tree set (1)
Current best tree score: -1387.265 / CPU time: 0.340
Number of iterations: 20
--------------------------------------------------------------------
| OPTIMIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
UPDATE BEST LOG-LIKELIHOOD: -1387.264
Iteration 30 / LogL: -1387.411 / Time: 0h:0m:0s (0h:0m:1s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.259
Iteration 40 / LogL: -1387.290 / Time: 0h:0m:0s (0h:0m:1s left)
Iteration 50 / LogL: -1387.383 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 60 / LogL: -1387.298 / Time: 0h:0m:1s (0h:0m:0s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.257
Iteration 70 / LogL: -1387.346 / Time: 0h:0m:1s (0h:0m:0s left)
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 74: -1387.168
Iteration 80 / LogL: -1387.340 / Time: 0h:0m:1s (0h:0m:1s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.168
Iteration 90 / LogL: -1387.354 / Time: 0h:0m:1s (0h:0m:1s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.168
Iteration 100 / LogL: -1387.308 / Time: 0h:0m:1s (0h:0m:1s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.168
Iteration 110 / LogL: -1387.337 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 120 / LogL: -1387.183 / Time: 0h:0m:2s (0h:0m:0s left)
Iteration 130 / LogL: -1387.337 / Time: 0h:0m:2s (0h:0m:0s left)
Iteration 140 / LogL: -1387.172 / Time: 0h:0m:2s (0h:0m:0s left)
Iteration 150 / LogL: -1387.338 / Time: 0h:0m:2s (0h:0m:0s left)
Iteration 160 / LogL: -1387.583 / Time: 0h:0m:2s (0h:0m:0s left)
Iteration 170 / LogL: -1387.188 / Time: 0h:0m:2s (0h:0m:0s left)
TREE SEARCH COMPLETED AFTER 175 ITERATIONS / Time: 0h:0m:2s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.010)
1. Initial log-likelihood: -1387.168
Optimal log-likelihood: -1387.167
Rate parameters: A-C: 0.34628 A-G: 2.32104 A-T: 2.14180 C-G: 1.23358 C-T: 3.21639 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.000
Gamma shape alpha: 1.284
Parameters optimization took 1 rounds (0.002 sec)
BEST SCORE FOUND : -1387.167
Total tree length: 7.606
Total number of iterations: 175
CPU time used for tree search: 2.850 sec (0h:0m:2s)
Wall-clock time used for tree search: 2.689 sec (0h:0m:2s)
Total CPU time used: 3.130 sec (0h:0m:3s)
Total wall-clock time used: 2.971 sec (0h:0m:2s)
Analysis results written to:
IQ-TREE report: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpqtchocya/q2iqtree.iqtree
Maximum-likelihood tree: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpqtchocya/q2iqtree.treefile
Likelihood distances: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpqtchocya/q2iqtree.mldist
Screen log file: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpqtchocya/q2iqtree.log
Date and Time: Tue Oct 29 14:48:07 2024
n cores 1
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: iqtree -st DNA --runs 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta -m GTR+I+G -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpqtchocya/q2iqtree -nt 1
Saved Phylogeny[Unrooted] to: iqt-gtrig-tree.qza
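The notation rules described in this section (drop F, write G4 as G) can be applied mechanically. The sketch below is one way to do that with sed. The function name to_plugin_model is hypothetical, and it only rewrites those two terms. Whether the plugin accepts a given resulting name should still be checked against the choices listed by qiime phylogeny iqtree --help.

```shell
#!/bin/sh
# Convert a ModelFinder-style model name into the shorter notation described
# in this section: remove the "+F" term and write "+G4" as "+G".
# All other terms (e.g. "+I", "+R2") are left untouched.
to_plugin_model() {
  printf '%s\n' "$1" | sed -E 's/\+F(\+|$)/\1/; s/\+G4(\+|$)/+G\1/'
}

to_plugin_model 'GTR+F+I+G4'   # prints GTR+I+G
```

The converted name can then be supplied to --p-substitution-model, as in the GTR+I+G run shown in this section.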
Let’s rerun the previous iqtree command (with the GTR+I+G model) and add the
--p-fast option. This option, only compatible with the iqtree method,
resembles the fast search performed by fasttree. 🏎️ Let’s also use
--p-n-runs to perform multiple tree searches and keep the best of those trees
(as we did earlier with the raxml --p-n-searches ... command):
qiime phylogeny iqtree \
--i-alignment masked-aligned-rep-seqs.qza \
--p-substitution-model 'GTR+I+G' \
--p-fast \
--p-n-runs 10 \
--o-tree iqt-gtrig-fast-ms-tree.qza \
--verbose
stdout:
IQ-TREE multicore version 2.3.6 for MacOS Intel 64-bit built Aug 4 2024
Developed by Bui Quang Minh, Nguyen Lam Tung, Olga Chernomor, Heiko Schmidt,
Dominik Schrempf, Michael Woodhams, Ly Trong Nhan, Thomas Wong
Host: Elizabeths-MacBook-Pro-7.local (AVX512, FMA3, 32 GB RAM)
Command: iqtree -st DNA --runs 10 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta -m GTR+I+G -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp273jpes6/q2iqtree -nt 1 -fast
Seed: 66370 (Using SPRNG - Scalable Parallel Random Number Generator)
Time: Tue Oct 29 14:48:17 2024
Kernel: AVX+FMA - 1 threads (8 CPU cores detected)
HINT: Use -nt option to specify number of threads because your CPU has 8 cores!
HINT: -nt AUTO will automatically determine the best number of threads to use.
Reading alignment file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta ... Fasta format detected
Reading fasta file: done in 0.000178099 secs using 75.8% CPU
Alignment most likely contains DNA/RNA sequences
Alignment has 20 sequences with 214 columns, 157 distinct patterns
104 parsimony-informative, 33 singleton sites, 77 constant sites
Analyzing sequences: done in 1.50204e-05 secs using 86.55% CPU
---> START RUN NUMBER 1 (seed: 66370)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.00 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.50)
1. Initial log-likelihood: -1493.26
2. Current log-likelihood: -1403.08
3. Current log-likelihood: -1398.35
4. Current log-likelihood: -1396.98
5. Current log-likelihood: -1396.26
Optimal log-likelihood: -1395.75
Rate parameters: A-C: 0.24339 A-G: 2.10097 A-T: 1.98595 C-G: 1.09180 C-T: 2.82193 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.027
Gamma shape alpha: 1.355
Parameters optimization took 5 rounds (0.022 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000894785 secs using 98.24% CPU
Computing ML distances took 0.000952 sec (of wall-clock time) 0.000920 sec (of CPU time)
WARNING: Some pairwise ML distances are too long (saturated)
Setting up auxiliary I and S matrices: done in 2.90871e-05 secs using 79.07% CPU
Computing RapidNJ tree took 0.000192 sec (of wall-clock time) 0.000223 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1394.173
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1387.961
Finish initializing candidate tree set (4)
Current best tree score: -1387.961 / CPU time: 0.044
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1387.961
2. Current log-likelihood: -1387.803
3. Current log-likelihood: -1387.684
4. Current log-likelihood: -1387.593
5. Current log-likelihood: -1387.523
6. Current log-likelihood: -1387.468
Optimal log-likelihood: -1387.424
Rate parameters: A-C: 0.33414 A-G: 2.26635 A-T: 2.14117 C-G: 1.17550 C-T: 3.28158 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.004
Gamma shape alpha: 1.353
Parameters optimization took 6 rounds (0.012 sec)
BEST SCORE FOUND : -1387.424
Total tree length: 6.743
Total number of iterations: 2
CPU time used for tree search: 0.082 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.044 sec (0h:0m:0s)
Total CPU time used: 0.142 sec (0h:0m:0s)
Total wall-clock time used: 0.093 sec (0h:0m:0s)
---> START RUN NUMBER 2 (seed: 67370)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.000 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1496.306
2. Current log-likelihood: -1403.641
3. Current log-likelihood: -1398.531
4. Current log-likelihood: -1397.067
5. Current log-likelihood: -1396.244
6. Current log-likelihood: -1395.736
Optimal log-likelihood: -1395.357
Rate parameters: A-C: 0.22740 A-G: 2.00038 A-T: 1.90797 C-G: 1.02878 C-T: 2.75984 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.021
Gamma shape alpha: 1.340
Parameters optimization took 6 rounds (0.023 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000870943 secs using 198.6% CPU
Computing ML distances took 0.000941 sec (of wall-clock time) 0.001811 sec (of CPU time)
Setting up auxiliary I and S matrices: done in 2.88486e-05 secs using 83.19% CPU
Computing RapidNJ tree took 0.000140 sec (of wall-clock time) 0.000112 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1393.949
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1387.982
Finish initializing candidate tree set (4)
Current best tree score: -1387.982 / CPU time: 0.031
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1387.982
2. Current log-likelihood: -1387.819
3. Current log-likelihood: -1387.693
4. Current log-likelihood: -1387.600
5. Current log-likelihood: -1387.528
6. Current log-likelihood: -1387.473
Optimal log-likelihood: -1387.428
Rate parameters: A-C: 0.32622 A-G: 2.24716 A-T: 2.12097 C-G: 1.16448 C-T: 3.24964 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.004
Gamma shape alpha: 1.355
Parameters optimization took 6 rounds (0.012 sec)
BEST SCORE FOUND : -1387.428
Total tree length: 6.738
Total number of iterations: 2
CPU time used for tree search: 0.059 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.032 sec (0h:0m:0s)
Total CPU time used: 0.289 sec (0h:0m:0s)
Total wall-clock time used: 0.172 sec (0h:0m:0s)
---> START RUN NUMBER 3 (seed: 68370)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.000 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1493.550
2. Current log-likelihood: -1403.090
3. Current log-likelihood: -1398.358
4. Current log-likelihood: -1396.975
5. Current log-likelihood: -1396.257
Optimal log-likelihood: -1395.748
Rate parameters: A-C: 0.23785 A-G: 2.06889 A-T: 1.95307 C-G: 1.06292 C-T: 2.77329 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.027
Gamma shape alpha: 1.358
Parameters optimization took 5 rounds (0.023 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.00107002 secs using 197.4% CPU
Computing ML distances took 0.001200 sec (of wall-clock time) 0.002198 sec (of CPU time)
WARNING: Some pairwise ML distances are too long (saturated)
Setting up auxiliary I and S matrices: done in 4.19617e-05 secs using 85.79% CPU
Computing RapidNJ tree took 0.000151 sec (of wall-clock time) 0.000121 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1394.197
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1387.961
Finish initializing candidate tree set (4)
Current best tree score: -1387.961 / CPU time: 0.042
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1387.961
2. Current log-likelihood: -1387.803
3. Current log-likelihood: -1387.684
4. Current log-likelihood: -1387.593
5. Current log-likelihood: -1387.523
6. Current log-likelihood: -1387.469
Optimal log-likelihood: -1387.424
Rate parameters: A-C: 0.33442 A-G: 2.26003 A-T: 2.13495 C-G: 1.17227 C-T: 3.27212 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.004
Gamma shape alpha: 1.353
Parameters optimization took 6 rounds (0.013 sec)
BEST SCORE FOUND : -1387.424
Total tree length: 6.742
Total number of iterations: 2
CPU time used for tree search: 0.080 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.042 sec (0h:0m:0s)
Total CPU time used: 0.457 sec (0h:0m:0s)
Total wall-clock time used: 0.262 sec (0h:0m:0s)
---> START RUN NUMBER 4 (seed: 69370)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.001 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1495.863
2. Current log-likelihood: -1402.072
3. Current log-likelihood: -1396.809
4. Current log-likelihood: -1395.391
5. Current log-likelihood: -1394.657
Optimal log-likelihood: -1394.080
Rate parameters: A-C: 0.27275 A-G: 2.35291 A-T: 2.09125 C-G: 1.19606 C-T: 3.26639 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.027
Gamma shape alpha: 1.387
Parameters optimization took 5 rounds (0.023 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000864983 secs using 199% CPU
Computing ML distances took 0.000926 sec (of wall-clock time) 0.001838 sec (of CPU time)
Setting up auxiliary I and S matrices: done in 2.90871e-05 secs using 79.07% CPU
Computing RapidNJ tree took 0.000173 sec (of wall-clock time) 0.000121 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1393.809
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1388.217
BETTER TREE FOUND at iteration 2: -1388.189
Finish initializing candidate tree set (4)
Current best tree score: -1388.189 / CPU time: 0.033
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1388.189
2. Current log-likelihood: -1387.974
3. Current log-likelihood: -1387.831
4. Current log-likelihood: -1387.725
5. Current log-likelihood: -1387.645
6. Current log-likelihood: -1387.584
Optimal log-likelihood: -1387.534
Rate parameters: A-C: 0.36986 A-G: 2.31018 A-T: 2.11746 C-G: 1.22267 C-T: 3.27882 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.006
Gamma shape alpha: 1.332
Parameters optimization took 6 rounds (0.016 sec)
BEST SCORE FOUND : -1387.534
Total tree length: 7.502
Total number of iterations: 2
CPU time used for tree search: 0.061 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.033 sec (0h:0m:0s)
Total CPU time used: 0.610 sec (0h:0m:0s)
Total wall-clock time used: 0.346 sec (0h:0m:0s)
---> START RUN NUMBER 5 (seed: 70370)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.001 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1493.167
2. Current log-likelihood: -1403.041
3. Current log-likelihood: -1398.313
4. Current log-likelihood: -1396.957
5. Current log-likelihood: -1396.228
Optimal log-likelihood: -1395.709
Rate parameters: A-C: 0.23146 A-G: 2.06957 A-T: 1.96268 C-G: 1.07937 C-T: 2.84174 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.027
Gamma shape alpha: 1.322
Parameters optimization took 5 rounds (0.023 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000894785 secs using 198.4% CPU
Computing ML distances took 0.000948 sec (of wall-clock time) 0.001850 sec (of CPU time)
WARNING: Some pairwise ML distances are too long (saturated)
Setting up auxiliary I and S matrices: done in 6.10352e-05 secs using 88.47% CPU
Computing RapidNJ tree took 0.000193 sec (of wall-clock time) 0.000167 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1394.184
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1387.955
Finish initializing candidate tree set (4)
Current best tree score: -1387.955 / CPU time: 0.026
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1387.955
2. Current log-likelihood: -1387.798
3. Current log-likelihood: -1387.680
4. Current log-likelihood: -1387.590
5. Current log-likelihood: -1387.521
6. Current log-likelihood: -1387.467
Optimal log-likelihood: -1387.423
Rate parameters: A-C: 0.33566 A-G: 2.27095 A-T: 2.14605 C-G: 1.17829 C-T: 3.29012 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.004
Gamma shape alpha: 1.352
Parameters optimization took 6 rounds (0.012 sec)
BEST SCORE FOUND : -1387.423
Total tree length: 6.744
Total number of iterations: 2
CPU time used for tree search: 0.049 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.026 sec (0h:0m:0s)
Total CPU time used: 0.746 sec (0h:0m:0s)
Total wall-clock time used: 0.420 sec (0h:0m:0s)
---> START RUN NUMBER 6 (seed: 71370)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.000 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1495.863
2. Current log-likelihood: -1402.072
3. Current log-likelihood: -1396.809
4. Current log-likelihood: -1395.391
5. Current log-likelihood: -1394.657
Optimal log-likelihood: -1394.080
Rate parameters: A-C: 0.27275 A-G: 2.35291 A-T: 2.09125 C-G: 1.19606 C-T: 3.26638 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.027
Gamma shape alpha: 1.387
Parameters optimization took 5 rounds (0.021 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000863075 secs using 198.5% CPU
Computing ML distances took 0.000951 sec (of wall-clock time) 0.001826 sec (of CPU time)
Setting up auxiliary I and S matrices: done in 3.50475e-05 secs using 85.6% CPU
Computing RapidNJ tree took 0.000149 sec (of wall-clock time) 0.000123 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1393.809
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1388.217
BETTER TREE FOUND at iteration 2: -1388.189
Finish initializing candidate tree set (4)
Current best tree score: -1388.189 / CPU time: 0.033
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1388.189
2. Current log-likelihood: -1387.974
3. Current log-likelihood: -1387.831
4. Current log-likelihood: -1387.725
5. Current log-likelihood: -1387.645
6. Current log-likelihood: -1387.584
Optimal log-likelihood: -1387.534
Rate parameters: A-C: 0.36986 A-G: 2.31018 A-T: 2.11746 C-G: 1.22267 C-T: 3.27882 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.006
Gamma shape alpha: 1.332
Parameters optimization took 6 rounds (0.015 sec)
BEST SCORE FOUND : -1387.534
Total tree length: 7.502
Total number of iterations: 2
CPU time used for tree search: 0.061 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.033 sec (0h:0m:0s)
Total CPU time used: 0.895 sec (0h:0m:0s)
Total wall-clock time used: 0.501 sec (0h:0m:0s)
---> START RUN NUMBER 7 (seed: 72370)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.000 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1493.550
2. Current log-likelihood: -1403.090
3. Current log-likelihood: -1398.358
4. Current log-likelihood: -1396.975
5. Current log-likelihood: -1396.257
Optimal log-likelihood: -1395.748
Rate parameters: A-C: 0.23785 A-G: 2.06889 A-T: 1.95307 C-G: 1.06292 C-T: 2.77329 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.027
Gamma shape alpha: 1.358
Parameters optimization took 5 rounds (0.023 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000940084 secs using 201.4% CPU
Computing ML distances took 0.000997 sec (of wall-clock time) 0.001954 sec (of CPU time)
WARNING: Some pairwise ML distances are too long (saturated)
Setting up auxiliary I and S matrices: done in 2.5034e-05 secs using 79.89% CPU
Computing RapidNJ tree took 0.000192 sec (of wall-clock time) 0.000117 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1394.197
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1387.961
Finish initializing candidate tree set (4)
Current best tree score: -1387.961 / CPU time: 0.044
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1387.961
2. Current log-likelihood: -1387.803
3. Current log-likelihood: -1387.684
4. Current log-likelihood: -1387.593
5. Current log-likelihood: -1387.523
6. Current log-likelihood: -1387.469
Optimal log-likelihood: -1387.424
Rate parameters: A-C: 0.33442 A-G: 2.26003 A-T: 2.13495 C-G: 1.17227 C-T: 3.27212 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.004
Gamma shape alpha: 1.353
Parameters optimization took 6 rounds (0.013 sec)
BEST SCORE FOUND : -1387.424
Total tree length: 6.742
Total number of iterations: 2
CPU time used for tree search: 0.081 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.044 sec (0h:0m:0s)
Total CPU time used: 1.064 sec (0h:0m:1s)
Total wall-clock time used: 0.593 sec (0h:0m:0s)
---> START RUN NUMBER 8 (seed: 73370)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.000 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1491.633
2. Current log-likelihood: -1402.007
3. Current log-likelihood: -1396.792
4. Current log-likelihood: -1395.393
5. Current log-likelihood: -1394.654
Optimal log-likelihood: -1394.081
Rate parameters: A-C: 0.28077 A-G: 2.37447 A-T: 2.10134 C-G: 1.20130 C-T: 3.28121 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.027
Gamma shape alpha: 1.386
Parameters optimization took 5 rounds (0.023 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000868082 secs using 198.9% CPU
Computing ML distances took 0.000927 sec (of wall-clock time) 0.001809 sec (of CPU time)
Setting up auxiliary I and S matrices: done in 3.31402e-05 secs using 78.45% CPU
Computing RapidNJ tree took 0.000143 sec (of wall-clock time) 0.000114 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1393.810
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1388.217
BETTER TREE FOUND at iteration 2: -1388.188
Finish initializing candidate tree set (4)
Current best tree score: -1388.188 / CPU time: 0.032
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1388.188
2. Current log-likelihood: -1387.973
3. Current log-likelihood: -1387.830
4. Current log-likelihood: -1387.725
5. Current log-likelihood: -1387.645
6. Current log-likelihood: -1387.584
Optimal log-likelihood: -1387.534
Rate parameters: A-C: 0.36985 A-G: 2.31003 A-T: 2.11728 C-G: 1.22260 C-T: 3.27851 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.006
Gamma shape alpha: 1.332
Parameters optimization took 6 rounds (0.014 sec)
BEST SCORE FOUND : -1387.534
Total tree length: 7.502
Total number of iterations: 2
CPU time used for tree search: 0.060 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.032 sec (0h:0m:0s)
Total CPU time used: 1.214 sec (0h:0m:1s)
Total wall-clock time used: 0.675 sec (0h:0m:0s)
---> START RUN NUMBER 9 (seed: 74370)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.000 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1496.306
2. Current log-likelihood: -1403.641
3. Current log-likelihood: -1398.531
4. Current log-likelihood: -1397.067
5. Current log-likelihood: -1396.244
6. Current log-likelihood: -1395.736
Optimal log-likelihood: -1395.357
Rate parameters: A-C: 0.22740 A-G: 2.00038 A-T: 1.90797 C-G: 1.02878 C-T: 2.75984 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.021
Gamma shape alpha: 1.340
Parameters optimization took 6 rounds (0.024 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000869989 secs using 197.9% CPU
Computing ML distances took 0.000946 sec (of wall-clock time) 0.001805 sec (of CPU time)
Setting up auxiliary I and S matrices: done in 4.29153e-05 secs using 81.56% CPU
Computing RapidNJ tree took 0.000172 sec (of wall-clock time) 0.000139 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1393.949
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1387.982
Finish initializing candidate tree set (4)
Current best tree score: -1387.982 / CPU time: 0.030
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1387.982
2. Current log-likelihood: -1387.819
3. Current log-likelihood: -1387.693
4. Current log-likelihood: -1387.600
5. Current log-likelihood: -1387.528
6. Current log-likelihood: -1387.473
Optimal log-likelihood: -1387.428
Rate parameters: A-C: 0.32622 A-G: 2.24716 A-T: 2.12097 C-G: 1.16448 C-T: 3.24964 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.004
Gamma shape alpha: 1.355
Parameters optimization took 6 rounds (0.012 sec)
BEST SCORE FOUND : -1387.428
Total tree length: 6.738
Total number of iterations: 2
CPU time used for tree search: 0.057 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.030 sec (0h:0m:0s)
Total CPU time used: 1.361 sec (0h:0m:1s)
Total wall-clock time used: 0.753 sec (0h:0m:0s)
---> START RUN NUMBER 10 (seed: 75370)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.000 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1492.959
2. Current log-likelihood: -1404.580
3. Current log-likelihood: -1399.448
4. Current log-likelihood: -1398.120
5. Current log-likelihood: -1397.423
Optimal log-likelihood: -1396.888
Rate parameters: A-C: 0.24361 A-G: 2.11016 A-T: 2.05349 C-G: 1.06058 C-T: 2.78310 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.028
Gamma shape alpha: 1.410
Parameters optimization took 5 rounds (0.025 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000886917 secs using 199.3% CPU
Computing ML distances took 0.001073 sec (of wall-clock time) 0.001984 sec (of CPU time)
WARNING: Some pairwise ML distances are too long (saturated)
Setting up auxiliary I and S matrices: done in 5.10216e-05 secs using 90.16% CPU
Computing RapidNJ tree took 0.000193 sec (of wall-clock time) 0.000175 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1394.255
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1387.971
Finish initializing candidate tree set (3)
Current best tree score: -1387.971 / CPU time: 0.034
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1387.971
2. Current log-likelihood: -1387.809
3. Current log-likelihood: -1387.689
4. Current log-likelihood: -1387.597
5. Current log-likelihood: -1387.527
6. Current log-likelihood: -1387.472
Optimal log-likelihood: -1387.427
Rate parameters: A-C: 0.33312 A-G: 2.24401 A-T: 2.11841 C-G: 1.16369 C-T: 3.24427 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.004
Gamma shape alpha: 1.357
Parameters optimization took 6 rounds (0.014 sec)
BEST SCORE FOUND : -1387.427
Total tree length: 6.728
Total number of iterations: 2
CPU time used for tree search: 0.064 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.034 sec (0h:0m:0s)
Total CPU time used: 1.518 sec (0h:0m:1s)
Total wall-clock time used: 0.838 sec (0h:0m:0s)
---> SUMMARIZE RESULTS FROM 10 RUNS
Run 5 gave best log-likelihood: -1387.423
Total CPU time for 10 runs: 1.530 seconds.
Total wall-clock time for 10 runs: 0.845 seconds.
Analysis results written to:
IQ-TREE report: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp273jpes6/q2iqtree.iqtree
Maximum-likelihood tree: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp273jpes6/q2iqtree.treefile
Trees from independent runs: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp273jpes6/q2iqtree.runtrees
Likelihood distances: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp273jpes6/q2iqtree.mldist
Screen log file: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp273jpes6/q2iqtree.log
Date and Time: Tue Oct 29 14:48:17 2024
n cores 1
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: iqtree -st DNA --runs 10 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta -m GTR+I+G -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp273jpes6/q2iqtree -nt 1 -fast
Saved Phylogeny[Unrooted] to: iqt-gtrig-fast-ms-tree.qza
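With ten independent runs the verbose output gets long, and it can be handy to see the final score of each run side by side. The following is a minimal sketch, assuming you captured the output above in a text file (for example by redirecting it). The temporary file and the `summarize_runs` helper are hypothetical names used only for illustration; the sample lines are copied from runs 4 to 6 of the log above so that the snippet runs on its own.

```shell
# Stand-in for a file holding saved --verbose output (hypothetical; use your own path).
log=$(mktemp)

# Sample lines in the same format as the log above (runs 4-6).
cat > "$log" <<'EOF'
---> START RUN NUMBER 4 (seed: 69370)
BEST SCORE FOUND : -1387.534
---> START RUN NUMBER 5 (seed: 70370)
BEST SCORE FOUND : -1387.423
---> START RUN NUMBER 6 (seed: 71370)
BEST SCORE FOUND : -1387.534
EOF

# Remember the current run number, then print it next to that run's final score.
summarize_runs() {
  awk '/START RUN NUMBER/ { run = $5 }
       /BEST SCORE FOUND/ { print run, $NF }' "$1"
}

summarize_runs "$log"

# Log-likelihoods are negative, so the "best" run is the one with the largest value.
best_run=$(summarize_runs "$log" | sort -k2,2nr | head -n 1 | cut -d' ' -f1)
echo "best run: $best_run"
```

On the sample lines this picks run 5, which agrees with the "Run 5 gave best log-likelihood" summary line that IQ-TREE prints itself.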
Single branch tests
IQ-TREE provides access to a few single branch testing methods:
SH-aLRT via
--p-alrt [INT >= 1000]
aBayes via
--p-abayes [TRUE | FALSE]
local bootstrap test via
--p-lbp [INT >= 1000]
Single branch tests are commonly used as an alternative to the bootstrapping
approach we’ve discussed above, as they are substantially faster and often
recommended when constructing large phylogenies (e.g. >10,000 taxa). All
three of these methods can be applied simultaneously and viewed within iTOL
as separate bootstrap support values. These values are always listed in the
following order: alrt / lbp / abayes. We’ll go ahead and apply all of the
branch tests in our next command, while specifying the same substitution model
as above. Feel free to combine this with the --p-fast option. 😉
qiime phylogeny iqtree \
--i-alignment masked-aligned-rep-seqs.qza \
--p-alrt 1000 \
--p-abayes \
--p-lbp 1000 \
--p-substitution-model 'GTR+I+G' \
--o-tree iqt-sbt-tree.qza \
--verbose
stdout:
IQ-TREE multicore version 2.3.6 for MacOS Intel 64-bit built Aug 4 2024
Developed by Bui Quang Minh, Nguyen Lam Tung, Olga Chernomor, Heiko Schmidt,
Dominik Schrempf, Michael Woodhams, Ly Trong Nhan, Thomas Wong
Host: Elizabeths-MacBook-Pro-7.local (AVX512, FMA3, 32 GB RAM)
Command: iqtree -st DNA --runs 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta -m GTR+I+G -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp65orcnho/q2iqtree -nt 1 -alrt 1000 -abayes -lbp 1000
Seed: 354031 (Using SPRNG - Scalable Parallel Random Number Generator)
Time: Tue Oct 29 14:48:29 2024
Kernel: AVX+FMA - 1 threads (8 CPU cores detected)
HINT: Use -nt option to specify number of threads because your CPU has 8 cores!
HINT: -nt AUTO will automatically determine the best number of threads to use.
Reading alignment file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta ... Fasta format detected
Reading fasta file: done in 0.000132084 secs using 71.17% CPU
Alignment most likely contains DNA/RNA sequences
Alignment has 20 sequences with 214 columns, 157 distinct patterns
104 parsimony-informative, 33 singleton sites, 77 constant sites
Gap/Ambiguity Composition p-value
Analyzing sequences: done in 1.09673e-05 secs using 82.06% CPU
1 e84fcf85a6a4065231dcf343bb862f1cb32abae6 40.65% passed 90.91%
2 5525fb6dab7b6577960147574465990c6df070ad 42.99% passed 99.80%
3 eb3564a35320b53cef22a77288838c7446357327 42.99% passed 25.49%
4 418f1d469f08c99976b313028cf6d3f18f61dd55 43.93% passed 71.86%
5 2e3b2c075901640c4de739473f9246385430b1ed 31.31% passed 90.76%
6 0469f8d819bd45c7638d1c8b0895270a05f34267 38.79% passed 92.82%
7 d162ed685007f5adede58f14aece31dfa1b60c18 40.65% passed 97.17%
8 1d45b2bce36cd995c5dcb755babf512e612ce8b9 41.59% passed 39.04%
9 5aba6bd9debc23ded7041ffdcfe5d68a427e8ce8 31.31% passed 87.21%
10 206656bec2abdbc4aee37a661ef5f4a62b5dd6ae 42.99% passed 85.00%
11 606c23e79bb730ad74e3c6efd72004c36674c17a 47.20% passed 87.78%
12 682e91d7e510ab134d0625234ad224f647c14eb0 41.59% passed 31.01%
13 6a36152105590b1eb095b9503e8f1f226fc73e43 39.25% passed 86.29%
14 6ca685c39a33bfbcb3123129e7af88d573df7d6f 42.06% failed 0.02%
15 8a1c44eb462ed58b21f3fdd72dd22bb657db2980 31.78% passed 54.40%
16 9b220cae8d375ea38b8b481cb95949cda8722fcb 36.92% passed 88.78%
17 aa4698d2e2b1fa71d08e2934a923aad7374a18f6 37.85% passed 90.52%
18 b31aa3f04bc9d5e2498d45cf1983dfaf09faa258 31.78% passed 72.69%
19 d44b129a6181f052198bda3813f0802a91612441 41.59% passed 41.69%
20 ed1acad8a98e8579a44370733533ad7d3fed8006 48.13% passed 58.15%
**** TOTAL 39.77% 1 sequences failed composition chi2 test (p-value<5%; df=3)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.001 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.100)
Thoroughly optimizing +I+G parameters from 10 start values...
Init pinv, alpha: 0.000, 1.000 / Estimate: 0.000, 1.287 / LogL: -1395.194
Init pinv, alpha: 0.040, 1.000 / Estimate: 0.008, 1.376 / LogL: -1395.464
Init pinv, alpha: 0.080, 1.000 / Estimate: 0.009, 1.390 / LogL: -1395.530
Init pinv, alpha: 0.120, 1.000 / Estimate: 0.009, 1.384 / LogL: -1395.523
Init pinv, alpha: 0.160, 1.000 / Estimate: 0.008, 1.375 / LogL: -1395.492
Init pinv, alpha: 0.200, 1.000 / Estimate: 0.009, 1.374 / LogL: -1395.529
Init pinv, alpha: 0.240, 1.000 / Estimate: 0.008, 1.369 / LogL: -1395.476
Init pinv, alpha: 0.280, 1.000 / Estimate: 0.008, 1.365 / LogL: -1395.493
Init pinv, alpha: 0.320, 1.000 / Estimate: 0.008, 1.373 / LogL: -1395.500
Init pinv, alpha: 0.360, 1.000 / Estimate: 0.009, 1.374 / LogL: -1395.511
Optimal pinv,alpha: 0.000, 1.287 / LogL: -1395.194
Parameters optimization took 0.280 sec
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000880957 secs using 92.85% CPU
Computing ML distances took 0.000954 sec (of wall-clock time) 0.000857 sec (of CPU time)
Setting up auxiliary I and S matrices: done in 2.7895e-05 secs using 78.87% CPU
Computing RapidNJ tree took 0.000119 sec (of wall-clock time) 0.000083 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1392.836
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Generating 98 parsimony trees... 0.067 second
Computing log-likelihood of 98 initial trees ... 0.062 seconds
Current best score: -1392.836
Do NNI search on 20 best initial trees
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 1: -1387.260
Iteration 10 / LogL: -1387.280 / Time: 0h:0m:0s
Iteration 20 / LogL: -1387.265 / Time: 0h:0m:0s
Finish initializing candidate tree set (1)
Current best tree score: -1387.260 / CPU time: 0.352
Number of iterations: 20
--------------------------------------------------------------------
| OPTIMIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Iteration 30 / LogL: -1406.507 / Time: 0h:0m:0s (0h:0m:1s left)
Iteration 40 / LogL: -1387.395 / Time: 0h:0m:0s (0h:0m:1s left)
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 41: -1387.169
BETTER TREE FOUND at iteration 42: -1387.169
UPDATE BEST LOG-LIKELIHOOD: -1387.169
Iteration 50 / LogL: -1387.190 / Time: 0h:0m:1s (0h:0m:2s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.169
UPDATE BEST LOG-LIKELIHOOD: -1387.169
Iteration 60 / LogL: -1387.339 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 70 / LogL: -1387.227 / Time: 0h:0m:1s (0h:0m:1s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.169
Iteration 80 / LogL: -1387.169 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 90 / LogL: -1387.169 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 100 / LogL: -1387.188 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 110 / LogL: -1387.182 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 120 / LogL: -1387.178 / Time: 0h:0m:2s (0h:0m:0s left)
Iteration 130 / LogL: -1387.169 / Time: 0h:0m:2s (0h:0m:0s left)
Iteration 140 / LogL: -1396.416 / Time: 0h:0m:2s (0h:0m:0s left)
TREE SEARCH COMPLETED AFTER 143 ITERATIONS / Time: 0h:0m:2s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.010)
1. Initial log-likelihood: -1387.169
Optimal log-likelihood: -1387.167
Rate parameters: A-C: 0.34411 A-G: 2.30327 A-T: 2.12300 C-G: 1.22338 C-T: 3.18565 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.000
Gamma shape alpha: 1.289
Parameters optimization took 1 rounds (0.002 sec)
BEST SCORE FOUND : -1387.167
Testing tree branches by SH-like aLRT with 1000 replicates...
Testing tree branches by local-BP test with 1000 replicates...
Testing tree branches by aBayes parametric test...
0.049 sec.
Total tree length: 7.583
Total number of iterations: 143
CPU time used for tree search: 2.335 sec (0h:0m:2s)
Wall-clock time used for tree search: 2.174 sec (0h:0m:2s)
Total CPU time used: 2.675 sec (0h:0m:2s)
Total wall-clock time used: 2.519 sec (0h:0m:2s)
Analysis results written to:
IQ-TREE report: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp65orcnho/q2iqtree.iqtree
Maximum-likelihood tree: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp65orcnho/q2iqtree.treefile
Likelihood distances: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp65orcnho/q2iqtree.mldist
Screen log file: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp65orcnho/q2iqtree.log
Date and Time: Tue Oct 29 14:48:31 2024
n cores 1
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: iqtree -st DNA --runs 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta -m GTR+I+G -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp65orcnho/q2iqtree -nt 1 -alrt 1000 -abayes -lbp 1000
Saved Phylogeny[Unrooted] to: iqt-sbt-tree.qza
Tip
IQ-TREE search settings.
There are quite a few adjustable parameters available for iqtree that can be modified to improve searches through “tree space” and prevent the search algorithms from getting stuck in local optima. One particular best practice to aid in this regard is to adjust the following parameters: --p-perturb-nni-strength and --p-stop-iter (which map to the -pers and -nstop flags of iqtree, respectively).
In brief, the larger the value for NNI (nearest-neighbor interchange) perturbation, the larger the jumps in “tree space”. This value should be set high enough to allow the search algorithm to avoid being trapped in local optima, but not so high that the search is haphazardly jumping around “tree space”. That is, like Goldilocks and the three 🐻s, you need to find a setting that is “just right”, or at least within a set of reasonable bounds. One way of assessing this is to do a few short trial runs using the --verbose flag. If you see that the likelihood values are jumping around too much, then lowering the value for --p-perturb-nni-strength may be warranted.
As for the stopping criterion, i.e. --p-stop-iter, the higher this value, the more thorough your search in “tree space”. Be aware that increasing this value may also increase the run time. That is, the search will continue until it has sampled a number of trees, say 100 (default), without finding a better scoring tree. If a better tree is found, then the counter resets, and the search continues.
These two parameters deserve special consideration when a given data set contains many short sequences, which is quite common for microbiome survey data. We can modify our original command to include these extra parameters with the recommended modifications for short sequences, i.e. a lower value for perturbation strength (shorter reads do not contain as much phylogenetic information, thus we should limit how far we jump around in “tree space”) and a larger number of stop iterations. See the IQ-TREE command reference for more details about default parameter settings.
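The stopping rule described above can be made concrete with a toy model. The sketch below is only an illustration of the "reset the counter on improvement" idea, not IQ-TREE's actual implementation; the function name iterations_until_stop and the score list are hypothetical.

```python
def iterations_until_stop(scores, stop_iter=100):
    """Toy model of a stop-iter style rule (not IQ-TREE's real code).

    `scores` holds one log-likelihood per search iteration. The search
    halts once `stop_iter` consecutive iterations fail to beat the best
    score seen so far; any improvement resets the counter.
    Returns (iterations_run, best_score).
    """
    best = float("-inf")
    unsuccessful = 0
    for iteration, score in enumerate(scores, start=1):
        if score > best:
            best = score
            unsuccessful = 0  # a better tree resets the counter
        else:
            unsuccessful += 1
        if unsuccessful >= stop_iter:
            return iteration, best
    return len(scores), best


# Hypothetical search: three early improvements, then a long plateau.
scores = [-1400.0, -1390.0, -1385.3] + [-1385.5] * 500

print(iterations_until_stop(scores, stop_iter=100))
print(iterations_until_stop(scores, stop_iter=200))
```

In this toy run, doubling the stop value only lengthens the plateau that is searched before giving up, which is why a larger --p-stop-iter tends to increase run time.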
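Finally, we’ll let iqtree perform the model testing and run it on a single CPU core (--p-n-cores 1). To have IQ-TREE determine the number of cores itself, as the -nt AUTO hint in the output below suggests, set --p-n-cores to auto instead.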
qiime phylogeny iqtree \
--i-alignment masked-aligned-rep-seqs.qza \
--p-perturb-nni-strength 0.2 \
--p-stop-iter 200 \
--p-n-cores 1 \
--o-tree iqt-nnisi-fast-tree.qza \
--verbose
stdout:
IQ-TREE multicore version 2.3.6 for MacOS Intel 64-bit built Aug 4 2024
Developed by Bui Quang Minh, Nguyen Lam Tung, Olga Chernomor, Heiko Schmidt,
Dominik Schrempf, Michael Woodhams, Ly Trong Nhan, Thomas Wong
Host: Elizabeths-MacBook-Pro-7.local (AVX512, FMA3, 32 GB RAM)
Command: iqtree -st DNA --runs 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta -m MFP -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpc9abpnbi/q2iqtree -nt 1 -nstop 200 -pers 0.200000
Seed: 952131 (Using SPRNG - Scalable Parallel Random Number Generator)
Time: Tue Oct 29 14:48:41 2024
Kernel: AVX+FMA - 1 threads (8 CPU cores detected)
HINT: Use -nt option to specify number of threads because your CPU has 8 cores!
HINT: -nt AUTO will automatically determine the best number of threads to use.
Reading alignment file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta ... Fasta format detected
Reading fasta file: done in 0.000105143 secs using 86.55% CPU
Alignment most likely contains DNA/RNA sequences
Alignment has 20 sequences with 214 columns, 157 distinct patterns
104 parsimony-informative, 33 singleton sites, 77 constant sites
Gap/Ambiguity Composition p-value
Analyzing sequences: done in 1.09673e-05 secs using 82.06% CPU
1 e84fcf85a6a4065231dcf343bb862f1cb32abae6 40.65% passed 90.91%
2 5525fb6dab7b6577960147574465990c6df070ad 42.99% passed 99.80%
3 eb3564a35320b53cef22a77288838c7446357327 42.99% passed 25.49%
4 418f1d469f08c99976b313028cf6d3f18f61dd55 43.93% passed 71.86%
5 2e3b2c075901640c4de739473f9246385430b1ed 31.31% passed 90.76%
6 0469f8d819bd45c7638d1c8b0895270a05f34267 38.79% passed 92.82%
7 d162ed685007f5adede58f14aece31dfa1b60c18 40.65% passed 97.17%
8 1d45b2bce36cd995c5dcb755babf512e612ce8b9 41.59% passed 39.04%
9 5aba6bd9debc23ded7041ffdcfe5d68a427e8ce8 31.31% passed 87.21%
10 206656bec2abdbc4aee37a661ef5f4a62b5dd6ae 42.99% passed 85.00%
11 606c23e79bb730ad74e3c6efd72004c36674c17a 47.20% passed 87.78%
12 682e91d7e510ab134d0625234ad224f647c14eb0 41.59% passed 31.01%
13 6a36152105590b1eb095b9503e8f1f226fc73e43 39.25% passed 86.29%
14 6ca685c39a33bfbcb3123129e7af88d573df7d6f 42.06% failed 0.02%
15 8a1c44eb462ed58b21f3fdd72dd22bb657db2980 31.78% passed 54.40%
16 9b220cae8d375ea38b8b481cb95949cda8722fcb 36.92% passed 88.78%
17 aa4698d2e2b1fa71d08e2934a923aad7374a18f6 37.85% passed 90.52%
18 b31aa3f04bc9d5e2498d45cf1983dfaf09faa258 31.78% passed 72.69%
19 d44b129a6181f052198bda3813f0802a91612441 41.59% passed 41.69%
20 ed1acad8a98e8579a44370733533ad7d3fed8006 48.13% passed 58.15%
**** TOTAL 39.77% 1 sequences failed composition chi2 test (p-value<5%; df=3)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.001 seconds
Perform fast likelihood tree search using GTR+I+G model...
Estimate model parameters (epsilon = 5.000)
Perform nearest neighbor interchange...
Estimate model parameters (epsilon = 1.000)
1. Initial log-likelihood: -1396.575
2. Current log-likelihood: -1395.213
Optimal log-likelihood: -1394.464
Rate parameters: A-C: 0.21819 A-G: 2.03593 A-T: 1.93394 C-G: 1.05109 C-T: 2.56337 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.033
Gamma shape alpha: 1.322
Parameters optimization took 2 rounds (0.008 sec)
Time for fast ML tree search: 0.032 seconds
NOTE: ModelFinder requires 1 MB RAM!
ModelFinder will test up to 484 DNA models (sample size: 214 epsilon: 0.100) ...
No. Model -LnL df AIC AICc BIC
1 GTR+F 1411.054 45 2912.108 2936.751 3063.577
2 GTR+F+I 1409.135 46 2910.270 2936.162 3065.105
3 GTR+F+G4 1392.990 46 2877.979 2903.872 3032.814
4 GTR+F+I+G4 1393.280 47 2880.560 2907.741 3038.761
5 GTR+F+R2 1387.683 47 2869.367 2896.547 3027.567
6 GTR+F+R3 1387.783 49 2873.566 2903.444 3038.499
14 GTR+F+I+R2 1387.806 48 2871.612 2900.121 3033.179
15 GTR+F+I+R3 1387.792 50 2875.584 2906.873 3043.883
25 SYM+G4 1393.507 43 2873.014 2895.273 3017.751
27 SYM+R2 1389.903 44 2867.807 2891.239 3015.910
36 SYM+I+R2 1389.979 45 2869.959 2894.602 3021.428
47 TVM+F+G4 1393.475 45 2876.951 2901.594 3028.420
49 TVM+F+R2 1388.449 46 2868.898 2894.790 3023.733
58 TVM+F+I+R2 1388.463 47 2870.925 2898.106 3029.126
69 TVMe+G4 1393.626 42 2871.251 2892.374 3012.622
71 TVMe+R2 1389.912 43 2865.823 2888.082 3010.560
80 TVMe+I+R2 1389.955 44 2867.910 2891.342 3016.013
91 TIM3+F+G4 1397.018 44 2882.036 2905.468 3030.139
93 TIM3+F+R2 1391.446 45 2872.893 2897.535 3024.362
102 TIM3+F+I+R2 1391.495 46 2874.989 2900.882 3029.824
113 TIM3e+G4 1396.975 41 2875.949 2895.972 3013.954
115 TIM3e+R2 1393.203 42 2870.405 2891.528 3011.776
124 TIM3e+I+R2 1393.216 43 2872.431 2894.690 3017.168
135 TIM2+F+G4 1401.477 44 2890.953 2914.385 3039.056
137 TIM2+F+R2 1395.788 45 2881.575 2906.218 3033.044
146 TIM2+F+I+R2 1395.770 46 2883.540 2909.432 3038.375
157 TIM2e+G4 1406.341 41 2894.682 2914.705 3032.687
159 TIM2e+R2 1402.277 42 2888.553 2909.676 3029.924
168 TIM2e+I+R2 1402.296 43 2890.592 2912.851 3035.329
179 TIM+F+G4 1397.935 44 2883.870 2907.302 3031.973
181 TIM+F+R2 1392.172 45 2874.343 2898.986 3025.812
190 TIM+F+I+R2 1392.182 46 2876.364 2902.256 3031.199
201 TIMe+G4 1403.752 41 2889.505 2909.528 3027.510
203 TIMe+R2 1399.391 42 2882.783 2903.905 3024.154
212 TIMe+I+R2 1399.400 43 2884.799 2907.058 3029.536
223 TPM3u+F+G4 1397.356 43 2880.712 2902.971 3025.449
225 TPM3u+F+R2 1392.248 44 2872.495 2895.927 3020.598
234 TPM3u+F+I+R2 1392.253 45 2874.505 2899.148 3025.974
245 TPM3+G4 1397.121 40 2874.241 2893.201 3008.880
247 TPM3+R2 1393.229 41 2868.459 2888.482 3006.464
256 TPM3+I+R2 1393.237 42 2870.473 2891.596 3011.844
267 TPM2u+F+G4 1401.943 43 2889.887 2912.146 3034.624
269 TPM2u+F+R2 1396.528 44 2881.057 2904.489 3029.160
278 TPM2u+F+I+R2 1396.518 45 2883.036 2907.679 3034.505
289 TPM2+G4 1406.528 40 2893.056 2912.016 3027.696
291 TPM2+R2 1402.307 41 2886.613 2906.636 3024.618
300 TPM2+I+R2 1402.317 42 2888.633 2909.756 3030.004
311 K3Pu+F+G4 1398.533 43 2883.065 2905.324 3027.802
313 K3Pu+F+R2 1393.073 44 2874.146 2897.578 3022.249
322 K3Pu+F+I+R2 1393.047 45 2876.095 2900.738 3027.564
333 K3P+G4 1403.893 40 2887.786 2906.745 3022.425
335 K3P+R2 1399.412 41 2880.824 2900.848 3018.829
344 K3P+I+R2 1399.421 42 2882.841 2903.964 3024.212
355 TN+F+G4 1401.522 43 2889.044 2911.303 3033.781
357 TN+F+R2 1395.980 44 2879.961 2903.393 3028.064
366 TN+F+I+R2 1395.968 45 2881.937 2906.580 3033.406
377 TNe+G4 1406.408 40 2892.816 2911.775 3027.455
379 TNe+R2 1402.302 41 2886.605 2906.628 3024.610
388 TNe+I+R2 1402.317 42 2888.635 2909.758 3030.006
399 HKY+F+G4 1402.004 42 2888.008 2909.131 3029.379
401 HKY+F+R2 1396.737 43 2879.474 2901.732 3024.211
410 HKY+F+I+R2 1396.725 44 2881.451 2904.883 3029.554
421 K2P+G4 1406.585 39 2891.169 2909.100 3022.442
423 K2P+R2 1402.339 40 2884.678 2903.638 3019.317
432 K2P+I+R2 1402.348 41 2886.697 2906.720 3024.702
443 F81+F+G4 1410.210 41 2902.420 2922.444 3040.425
445 F81+F+R2 1405.831 42 2895.663 2916.786 3037.034
454 F81+F+I+R2 1405.837 43 2897.674 2919.933 3042.411
465 JC+G4 1414.850 38 2905.700 2922.637 3033.607
467 JC+R2 1411.456 39 2900.912 2918.843 3032.185
476 JC+I+R2 1411.464 40 2902.928 2921.888 3037.567
Akaike Information Criterion: TVMe+R2
Corrected Akaike Information Criterion: TVMe+R2
Bayesian Information Criterion: TPM3+R2
Best-fit model: TPM3+R2 chosen according to BIC
All model information printed to /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpc9abpnbi/q2iqtree.model.gz
CPU time for ModelFinder: 0.585 seconds (0h:0m:0s)
Wall-clock time for ModelFinder: 0.595 seconds (0h:0m:0s)
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.100)
1. Initial log-likelihood: -1411.467
2. Current log-likelihood: -1394.204
3. Current log-likelihood: -1393.350
Optimal log-likelihood: -1393.276
Rate parameters: A-C: 0.31514 A-G: 1.34673 A-T: 1.00000 C-G: 0.31514 C-T: 1.34673 G-T: 1.00000
Base frequencies: A: 0.250 C: 0.250 G: 0.250 T: 0.250
Site proportion and rates: (0.693,0.361) (0.307,2.440)
Parameters optimization took 3 rounds (0.008 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000608921 secs using 97.88% CPU
Computing ML distances took 0.000688 sec (of wall-clock time) 0.000663 sec (of CPU time)
WARNING: Some pairwise ML distances are too long (saturated)
Setting up auxiliary I and S matrices: done in 2.7895e-05 secs using 96.79% CPU
Computing RapidNJ tree took 0.000143 sec (of wall-clock time) 0.000211 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1394.353
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Generating 98 parsimony trees... 0.062 second
Computing log-likelihood of 98 initial trees ... 0.048 seconds
Current best score: -1393.276
Do NNI search on 20 best initial trees
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 1: -1392.102
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 2: -1385.319
BETTER TREE FOUND at iteration 4: -1385.317
Iteration 10 / LogL: -1385.344 / Time: 0h:0m:0s
Iteration 20 / LogL: -1385.341 / Time: 0h:0m:0s
Finish initializing candidate tree set (5)
Current best tree score: -1385.317 / CPU time: 0.266
Number of iterations: 20
--------------------------------------------------------------------
| OPTIMIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
UPDATE BEST LOG-LIKELIHOOD: -1385.315
Iteration 30 / LogL: -1387.700 / Time: 0h:0m:0s (0h:0m:2s left)
Iteration 40 / LogL: -1385.317 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 50 / LogL: -1385.318 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 60 / LogL: -1385.341 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 70 / LogL: -1385.530 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 80 / LogL: -1385.623 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 90 / LogL: -1385.623 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 100 / LogL: -1385.642 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 110 / LogL: -1385.531 / Time: 0h:0m:1s (0h:0m:0s left)
UPDATE BEST LOG-LIKELIHOOD: -1385.315
Iteration 120 / LogL: -1385.315 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 130 / LogL: -1385.315 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 140 / LogL: -1385.679 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 150 / LogL: -1390.550 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 160 / LogL: -1385.576 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 170 / LogL: -1385.368 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 180 / LogL: -1385.317 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 190 / LogL: -1385.320 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 200 / LogL: -1385.533 / Time: 0h:0m:1s (0h:0m:0s left)
UPDATE BEST LOG-LIKELIHOOD: -1385.315
TREE SEARCH COMPLETED AFTER 205 ITERATIONS / Time: 0h:0m:1s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.010)
1. Initial log-likelihood: -1385.315
Optimal log-likelihood: -1385.308
Rate parameters: A-C: 0.39437 A-G: 1.57060 A-T: 1.00000 C-G: 0.39437 C-T: 1.57060 G-T: 1.00000
Base frequencies: A: 0.250 C: 0.250 G: 0.250 T: 0.250
Site proportion and rates: (0.718,0.396) (0.282,2.537)
Parameters optimization took 1 rounds (0.002 sec)
BEST SCORE FOUND : -1385.308
Total tree length: 6.936
Total number of iterations: 205
CPU time used for tree search: 1.530 sec (0h:0m:1s)
Wall-clock time used for tree search: 1.361 sec (0h:0m:1s)
Total CPU time used: 2.137 sec (0h:0m:2s)
Total wall-clock time used: 1.977 sec (0h:0m:1s)
Analysis results written to:
IQ-TREE report: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpc9abpnbi/q2iqtree.iqtree
Maximum-likelihood tree: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpc9abpnbi/q2iqtree.treefile
Likelihood distances: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpc9abpnbi/q2iqtree.mldist
Screen log file: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpc9abpnbi/q2iqtree.log
Date and Time: Tue Oct 29 14:48:43 2024
n cores 1
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: iqtree -st DNA --runs 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta -m MFP -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpc9abpnbi/q2iqtree -nt 1 -nstop 200 -pers 0.200000
Saved Phylogeny[Unrooted] to: iqt-nnisi-fast-tree.qza
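To judge whether the likelihood values are "jumping around too much" in a trial run, it can help to pull the Iteration / LogL lines out of the --verbose output and summarize them. The following is a minimal sketch that assumes the output has been saved as text; the helper names logl_trace and summarize are hypothetical, and the sample lines are copied from the output above. What counts as "too much" spread is left to your judgment.

```python
import re
import statistics

# Matches lines such as:
# "Iteration 30 / LogL: -1387.700 / Time: 0h:0m:0s (0h:0m:2s left)"
ITERATION_RE = re.compile(r"^Iteration (\d+) / LogL: (-?\d+(?:\.\d+)?)")


def logl_trace(log_text):
    """Return [(iteration, log_likelihood), ...] found in the screen output."""
    trace = []
    for line in log_text.splitlines():
        match = ITERATION_RE.match(line.strip())
        if match:
            trace.append((int(match.group(1)), float(match.group(2))))
    return trace


def summarize(trace):
    """Summarize how widely the reported log-likelihoods vary."""
    values = [logl for _, logl in trace]
    return {
        "best": max(values),
        "worst": min(values),
        "range": max(values) - min(values),
        "stdev": statistics.pstdev(values),
    }


sample = """\
Iteration 30 / LogL: -1387.700 / Time: 0h:0m:0s (0h:0m:2s left)
Iteration 40 / LogL: -1385.317 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 50 / LogL: -1385.318 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 150 / LogL: -1390.550 / Time: 0h:0m:1s (0h:0m:0s left)
"""

trace = logl_trace(sample)
summary = summarize(trace)
print(trace)
print(summary)
```

Comparing the range or standard deviation across trial runs with different --p-perturb-nni-strength values gives a rough, informal way to compare settings.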
iqtree-ultrafast-bootstrap¶
As per our discussion in the raxml-rapid-bootstrap section above, we can also use IQ-TREE to evaluate how well our splits / bipartitions are supported within our phylogeny via the ultrafast bootstrap algorithm. Below, we’ll apply the plugin’s ultrafast bootstrap command with automatic model selection (MFP), 1000 bootstrap replicates (the minimum required), and the same generally suggested parameters for constructing a phylogeny from short sequences. As before, the example runs on a single CPU core (--p-n-cores 1):
qiime phylogeny iqtree-ultrafast-bootstrap \
--i-alignment masked-aligned-rep-seqs.qza \
--p-perturb-nni-strength 0.2 \
--p-stop-iter 200 \
--p-n-cores 1 \
--o-tree iqt-nnisi-bootstrap-tree.qza \
--verbose
stdout:
IQ-TREE multicore version 2.3.6 for MacOS Intel 64-bit built Aug 4 2024
Developed by Bui Quang Minh, Nguyen Lam Tung, Olga Chernomor, Heiko Schmidt,
Dominik Schrempf, Michael Woodhams, Ly Trong Nhan, Thomas Wong
Host: Elizabeths-MacBook-Pro-7.local (AVX512, FMA3, 32 GB RAM)
Command: iqtree -bb 1000 -st DNA --runs 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta -m MFP -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpw4gp1jrp/q2iqtreeufboot -nt 1 -nstop 200 -pers 0.200000
Seed: 767811 (Using SPRNG - Scalable Parallel Random Number Generator)
Time: Tue Oct 29 14:48:53 2024
Kernel: AVX+FMA - 1 threads (8 CPU cores detected)
HINT: Use -nt option to specify number of threads because your CPU has 8 cores!
HINT: -nt AUTO will automatically determine the best number of threads to use.
Reading alignment file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta ... Fasta format detected
Reading fasta file: done in 0.000147104 secs using 61.86% CPU
Alignment most likely contains DNA/RNA sequences
Alignment has 20 sequences with 214 columns, 157 distinct patterns
104 parsimony-informative, 33 singleton sites, 77 constant sites
Gap/Ambiguity Composition p-value
Analyzing sequences: done in 1.09673e-05 secs using 82.06% CPU
1 e84fcf85a6a4065231dcf343bb862f1cb32abae6 40.65% passed 90.91%
2 5525fb6dab7b6577960147574465990c6df070ad 42.99% passed 99.80%
3 eb3564a35320b53cef22a77288838c7446357327 42.99% passed 25.49%
4 418f1d469f08c99976b313028cf6d3f18f61dd55 43.93% passed 71.86%
5 2e3b2c075901640c4de739473f9246385430b1ed 31.31% passed 90.76%
6 0469f8d819bd45c7638d1c8b0895270a05f34267 38.79% passed 92.82%
7 d162ed685007f5adede58f14aece31dfa1b60c18 40.65% passed 97.17%
8 1d45b2bce36cd995c5dcb755babf512e612ce8b9 41.59% passed 39.04%
9 5aba6bd9debc23ded7041ffdcfe5d68a427e8ce8 31.31% passed 87.21%
10 206656bec2abdbc4aee37a661ef5f4a62b5dd6ae 42.99% passed 85.00%
11 606c23e79bb730ad74e3c6efd72004c36674c17a 47.20% passed 87.78%
12 682e91d7e510ab134d0625234ad224f647c14eb0 41.59% passed 31.01%
13 6a36152105590b1eb095b9503e8f1f226fc73e43 39.25% passed 86.29%
14 6ca685c39a33bfbcb3123129e7af88d573df7d6f 42.06% failed 0.02%
15 8a1c44eb462ed58b21f3fdd72dd22bb657db2980 31.78% passed 54.40%
16 9b220cae8d375ea38b8b481cb95949cda8722fcb 36.92% passed 88.78%
17 aa4698d2e2b1fa71d08e2934a923aad7374a18f6 37.85% passed 90.52%
18 b31aa3f04bc9d5e2498d45cf1983dfaf09faa258 31.78% passed 72.69%
19 d44b129a6181f052198bda3813f0802a91612441 41.59% passed 41.69%
20 ed1acad8a98e8579a44370733533ad7d3fed8006 48.13% passed 58.15%
**** TOTAL 39.77% 1 sequences failed composition chi2 test (p-value<5%; df=3)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.001 seconds
Perform fast likelihood tree search using GTR+I+G model...
Estimate model parameters (epsilon = 5.000)
Perform nearest neighbor interchange...
Estimate model parameters (epsilon = 1.000)
1. Initial log-likelihood: -1389.605
Optimal log-likelihood: -1388.791
Rate parameters: A-C: 0.37331 A-G: 2.35436 A-T: 2.13670 C-G: 1.23381 C-T: 3.29661 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.034
Gamma shape alpha: 1.400
Parameters optimization took 1 rounds (0.003 sec)
Time for fast ML tree search: 0.035 seconds
NOTE: ModelFinder requires 1 MB RAM!
ModelFinder will test up to 484 DNA models (sample size: 214 epsilon: 0.100) ...
No. Model -LnL df AIC AICc BIC
1 GTR+F 1402.600 45 2895.200 2919.842 3046.668
2 GTR+F+I 1401.121 46 2894.242 2920.134 3049.077
3 GTR+F+G4 1387.369 46 2866.737 2892.629 3021.572
4 GTR+F+I+G4 1387.734 47 2869.467 2896.648 3027.668
5 GTR+F+R2 1382.380 47 2858.759 2885.940 3016.960
+R3 reinitialized from +R2 with factor 0.500
+R3 reinitialized from +R2 with factor 0.250
6 GTR+F+R3 1382.454 49 2862.909 2892.787 3027.842
14 GTR+F+I+R2 1382.411 48 2860.821 2889.331 3022.388
15 GTR+F+I+R3 1382.464 50 2864.928 2896.216 3033.227
25 SYM+G4 1387.163 43 2860.326 2882.585 3005.063
27 SYM+R2 1383.105 44 2854.209 2877.641 3002.312
36 SYM+I+R2 1383.186 45 2856.372 2881.015 3007.841
47 TVM+F+G4 1388.360 45 2866.721 2891.364 3018.190
49 TVM+F+R2 1383.725 46 2859.451 2885.343 3014.286
58 TVM+F+I+R2 1383.717 47 2861.433 2888.614 3019.634
69 TVMe+G4 1387.152 42 2858.304 2879.427 2999.675
71 TVMe+R2 1383.090 43 2852.179 2874.438 2996.916
80 TVMe+I+R2 1383.142 44 2854.285 2877.717 3002.388
91 TIM3+F+G4 1391.376 44 2870.752 2894.184 3018.855
93 TIM3+F+R2 1385.912 45 2861.823 2886.466 3013.292
102 TIM3+F+I+R2 1385.947 46 2863.895 2889.787 3018.730
113 TIM3e+G4 1390.370 41 2862.741 2882.764 3000.746
115 TIM3e+R2 1385.927 42 2855.854 2876.977 2997.225
124 TIM3e+I+R2 1385.955 43 2857.911 2880.170 3002.648
135 TIM2+F+G4 1393.632 44 2875.264 2898.696 3023.367
137 TIM2+F+R2 1387.689 45 2865.378 2890.021 3016.847
146 TIM2+F+I+R2 1387.679 46 2867.359 2893.251 3022.194
157 TIM2e+G4 1396.798 41 2875.596 2895.619 3013.601
159 TIM2e+R2 1391.568 42 2867.135 2888.258 3008.506
168 TIM2e+I+R2 1391.562 43 2869.123 2891.382 3013.860
179 TIM+F+G4 1390.337 44 2868.673 2892.105 3016.776
181 TIM+F+R2 1384.915 45 2859.831 2884.474 3011.300
190 TIM+F+I+R2 1384.886 46 2861.772 2887.664 3016.607
201 TIMe+G4 1394.028 41 2870.057 2890.080 3008.062
203 TIMe+R2 1388.990 42 2861.980 2883.103 3003.351
212 TIMe+I+R2 1388.990 43 2863.980 2886.239 3008.717
223 TPM3u+F+G4 1392.293 43 2870.585 2892.844 3015.322
225 TPM3u+F+R2 1387.325 44 2862.650 2886.082 3010.753
234 TPM3u+F+I+R2 1387.333 45 2864.665 2889.308 3016.134
245 TPM3+G4 1390.386 40 2860.772 2879.731 2995.411
247 TPM3+R2 1385.935 41 2853.869 2873.893 2991.874
256 TPM3+I+R2 1385.953 42 2855.905 2877.028 2997.276
267 TPM2u+F+G4 1394.529 43 2875.058 2897.316 3019.795
269 TPM2u+F+R2 1389.057 44 2866.115 2889.547 3014.218
278 TPM2u+F+I+R2 1389.038 45 2868.077 2892.719 3019.545
289 TPM2+G4 1396.829 40 2873.658 2892.617 3008.297
291 TPM2+R2 1391.574 41 2865.147 2885.171 3003.152
300 TPM2+I+R2 1391.570 42 2867.139 2888.262 3008.510
311 K3Pu+F+G4 1391.377 43 2868.753 2891.012 3013.490
313 K3Pu+F+R2 1386.370 44 2860.739 2884.171 3008.842
322 K3Pu+F+I+R2 1386.340 45 2862.680 2887.323 3014.149
333 K3P+G4 1394.023 40 2868.047 2887.006 3002.686
335 K3P+R2 1389.000 41 2859.999 2880.022 2998.004
344 K3P+I+R2 1389.006 42 2862.011 2883.134 3003.382
355 TN+F+G4 1394.028 43 2874.056 2896.314 3018.793
357 TN+F+R2 1388.213 44 2864.425 2887.857 3012.528
366 TN+F+I+R2 1388.214 45 2866.428 2891.071 3017.897
377 TNe+G4 1396.818 40 2873.635 2892.595 3008.274
379 TNe+R2 1391.579 41 2865.158 2885.182 3003.163
388 TNe+I+R2 1391.584 42 2867.169 2888.291 3008.540
399 HKY+F+G4 1394.938 42 2873.876 2894.999 3015.247
401 HKY+F+R2 1389.592 43 2865.185 2887.444 3009.922
410 HKY+F+I+R2 1389.579 44 2867.157 2890.589 3015.260
421 K2P+G4 1396.828 39 2871.656 2889.587 3002.929
423 K2P+R2 1391.583 40 2863.165 2882.125 2997.804
432 K2P+I+R2 1391.585 41 2865.170 2885.193 3003.175
443 F81+F+G4 1405.730 41 2893.461 2913.484 3031.466
445 F81+F+R2 1400.797 42 2885.594 2906.717 3026.965
454 F81+F+I+R2 1400.790 43 2887.581 2909.839 3032.318
465 JC+G4 1407.635 38 2891.270 2908.207 3019.177
467 JC+R2 1402.843 39 2883.685 2901.616 3014.958
476 JC+I+R2 1402.837 40 2885.674 2904.634 3020.313
Akaike Information Criterion: TVMe+R2
Corrected Akaike Information Criterion: TPM3+R2
Bayesian Information Criterion: TPM3+R2
Best-fit model: TPM3+R2 chosen according to BIC
All model information printed to /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpw4gp1jrp/q2iqtreeufboot.model.gz
CPU time for ModelFinder: 0.553 seconds (0h:0m:0s)
Wall-clock time for ModelFinder: 0.563 seconds (0h:0m:0s)
Generating 1000 samples for ultrafast bootstrap (seed: 767811)...
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.100)
1. Initial log-likelihood: -1402.843
2. Current log-likelihood: -1386.465
3. Current log-likelihood: -1385.950
Optimal log-likelihood: -1385.940
Rate parameters: A-C: 0.41103 A-G: 1.56375 A-T: 1.00000 C-G: 0.41103 C-T: 1.56375 G-T: 1.00000
Base frequencies: A: 0.250 C: 0.250 G: 0.250 T: 0.250
Site proportion and rates: (0.722,0.414) (0.278,2.520)
Parameters optimization took 3 rounds (0.008 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000592947 secs using 97.82% CPU
Computing ML distances took 0.000647 sec (of wall-clock time) 0.000622 sec (of CPU time)
Setting up auxiliary I and S matrices: done in 2.59876e-05 secs using 80.81% CPU
Computing RapidNJ tree took 0.000150 sec (of wall-clock time) 0.000131 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1393.853
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Generating 98 parsimony trees... 0.063 second
Computing log-likelihood of 98 initial trees ... 0.046 seconds
Current best score: -1385.940
Do NNI search on 20 best initial trees
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 1: -1385.887
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 2: -1385.308
UPDATE BEST LOG-LIKELIHOOD: -1385.307
Iteration 10 / LogL: -1385.333 / Time: 0h:0m:0s
Iteration 20 / LogL: -1385.341 / Time: 0h:0m:0s
Finish initializing candidate tree set (2)
Current best tree score: -1385.307 / CPU time: 0.353
Number of iterations: 20
--------------------------------------------------------------------
| OPTIMIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Iteration 30 / LogL: -1385.910 / Time: 0h:0m:1s (0h:0m:2s left)
Iteration 40 / LogL: -1385.846 / Time: 0h:0m:1s (0h:0m:2s left)
UPDATE BEST LOG-LIKELIHOOD: -1385.307
UPDATE BEST LOG-LIKELIHOOD: -1385.307
Iteration 50 / LogL: -1385.550 / Time: 0h:0m:1s (0h:0m:2s left)
Log-likelihood cutoff on original alignment: -1406.814
Iteration 60 / LogL: -1385.634 / Time: 0h:0m:1s (0h:0m:2s left)
Iteration 70 / LogL: -1389.893 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 80 / LogL: -1385.308 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 90 / LogL: -1385.640 / Time: 0h:0m:1s (0h:0m:1s left)
UPDATE BEST LOG-LIKELIHOOD: -1385.307
Iteration 100 / LogL: -1391.530 / Time: 0h:0m:1s (0h:0m:1s left)
Log-likelihood cutoff on original alignment: -1406.059
NOTE: Bootstrap correlation coefficient of split occurrence frequencies: 0.997
Iteration 110 / LogL: -1385.649 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 120 / LogL: -1385.504 / Time: 0h:0m:2s (0h:0m:1s left)
UPDATE BEST LOG-LIKELIHOOD: -1385.307
Iteration 130 / LogL: -1385.653 / Time: 0h:0m:2s (0h:0m:0s left)
Iteration 140 / LogL: -1385.818 / Time: 0h:0m:2s (0h:0m:0s left)
Iteration 150 / LogL: -1385.309 / Time: 0h:0m:2s (0h:0m:0s left)
Log-likelihood cutoff on original alignment: -1406.783
Iteration 160 / LogL: -1385.532 / Time: 0h:0m:2s (0h:0m:0s left)
Iteration 170 / LogL: -1385.311 / Time: 0h:0m:2s (0h:0m:0s left)
Iteration 180 / LogL: -1385.534 / Time: 0h:0m:2s (0h:0m:0s left)
Iteration 190 / LogL: -1385.311 / Time: 0h:0m:2s (0h:0m:0s left)
Iteration 200 / LogL: -1385.307 / Time: 0h:0m:2s (0h:0m:0s left)
Log-likelihood cutoff on original alignment: -1406.783
NOTE: Bootstrap correlation coefficient of split occurrence frequencies: 0.999
TREE SEARCH COMPLETED AFTER 203 ITERATIONS / Time: 0h:0m:2s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.010)
1. Initial log-likelihood: -1385.307
Optimal log-likelihood: -1385.304
Rate parameters: A-C: 0.39511 A-G: 1.56732 A-T: 1.00000 C-G: 0.39511 C-T: 1.56732 G-T: 1.00000
Base frequencies: A: 0.250 C: 0.250 G: 0.250 T: 0.250
Site proportion and rates: (0.722,0.403) (0.278,2.550)
Parameters optimization took 1 rounds (0.002 sec)
BEST SCORE FOUND : -1385.304
Creating bootstrap support values...
Split supports printed to NEXUS file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpw4gp1jrp/q2iqtreeufboot.splits.nex
Total tree length: 6.837
Total number of iterations: 203
CPU time used for tree search: 2.554 sec (0h:0m:2s)
Wall-clock time used for tree search: 2.415 sec (0h:0m:2s)
Total CPU time used: 3.169 sec (0h:0m:3s)
Total wall-clock time used: 3.041 sec (0h:0m:3s)
Computing bootstrap consensus tree...
Reading input file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpw4gp1jrp/q2iqtreeufboot.splits.nex...
20 taxa and 147 splits.
Consensus tree written to /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpw4gp1jrp/q2iqtreeufboot.contree
Reading input trees file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpw4gp1jrp/q2iqtreeufboot.contree
Log-likelihood of consensus tree: -1385.305
Analysis results written to:
IQ-TREE report: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpw4gp1jrp/q2iqtreeufboot.iqtree
Maximum-likelihood tree: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpw4gp1jrp/q2iqtreeufboot.treefile
Likelihood distances: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpw4gp1jrp/q2iqtreeufboot.mldist
Ultrafast bootstrap approximation results written to:
Split support values: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpw4gp1jrp/q2iqtreeufboot.splits.nex
Consensus tree: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpw4gp1jrp/q2iqtreeufboot.contree
Screen log file: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpw4gp1jrp/q2iqtreeufboot.log
Date and Time: Tue Oct 29 14:48:56 2024
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: iqtree -bb 1000 -st DNA --runs 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta -m MFP -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmpw4gp1jrp/q2iqtreeufboot -nt 1 -nstop 200 -pers 0.200000
Saved Phylogeny[Unrooted] to: iqt-nnisi-bootstrap-tree.qza
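The ModelFinder table in the output above lists AIC, AICc and BIC for each candidate model, and the three criteria do not agree: AIC prefers TVMe+R2 while AICc and BIC prefer TPM3+R2. The sketch below recomputes the scores for those two rows, assuming the standard textbook definitions of the criteria and a sample size equal to the 214 alignment columns reported in the log. The function name information_criteria is hypothetical; this is a way to read the table, not IQ-TREE's own code.

```python
import math


def information_criteria(neg_lnl, df, n_sites):
    """Return (AIC, AICc, BIC) from a -LnL value, free parameters, sample size.

    Assumes the standard definitions:
      AIC  = 2 * (-LnL) + 2 * df
      AICc = AIC + 2 * df * (df + 1) / (n - df - 1)
      BIC  = 2 * (-LnL) + df * ln(n)
    """
    aic = 2 * neg_lnl + 2 * df
    aicc = aic + (2 * df * (df + 1)) / (n_sites - df - 1)
    bic = 2 * neg_lnl + df * math.log(n_sites)
    return aic, aicc, bic


N_SITES = 214  # "sample size: 214" in the ModelFinder output

# (-LnL, df) copied from the TVMe+R2 and TPM3+R2 rows of the table above.
candidates = {
    "TVMe+R2": (1383.090, 43),
    "TPM3+R2": (1385.935, 41),
}

scores = {
    name: information_criteria(neg_lnl, df, N_SITES)
    for name, (neg_lnl, df) in candidates.items()
}

for index, criterion in enumerate(("AIC", "AICc", "BIC")):
    best = min(scores, key=lambda name: scores[name][index])
    print(criterion, "prefers", best)
```

Lower is better for all three criteria. AICc and BIC penalize extra parameters more heavily than AIC at this sample size, so the model with two fewer parameters wins under both despite its slightly worse likelihood.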
Perform single branch tests alongside ufboot¶
We can also apply single branch test methods concurrently with ultrafast bootstrapping. The support values will always be represented in the following order: alrt / lbp / abayes / ufboot. Again, these values can be seen as separately listed bootstrap values in iTOL. We’ll also specify a model as we did earlier.
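Because the four support values share a single node label, it can be convenient to split them into named fields when inspecting a tree outside of iTOL. The sketch below assumes the label is a slash-separated string in the order given above; the example label and the names BranchSupport and parse_support are hypothetical and not part of QIIME 2 or IQ-TREE.

```python
from typing import NamedTuple


class BranchSupport(NamedTuple):
    """Support values in the order alrt / lbp / abayes / ufboot."""
    alrt: float
    lbp: float
    abayes: float
    ufboot: float


def parse_support(label):
    """Split a combined node label such as '87.5/80/0.95/98' into fields."""
    parts = label.split("/")
    if len(parts) != 4:
        raise ValueError(
            f"expected 4 slash-separated values, got {len(parts)}: {label!r}"
        )
    return BranchSupport(*(float(part) for part in parts))


# Hypothetical label, for illustration only.
support = parse_support("87.5/80/0.95/98")
print(support.alrt, support.lbp, support.abayes, support.ufboot)
```

The command that produces a tree with these combined support values is: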
qiime phylogeny iqtree-ultrafast-bootstrap \
--i-alignment masked-aligned-rep-seqs.qza \
--p-perturb-nni-strength 0.2 \
--p-stop-iter 200 \
--p-n-cores 1 \
--p-alrt 1000 \
--p-abayes \
--p-lbp 1000 \
--p-substitution-model 'GTR+I+G' \
--o-tree iqt-nnisi-bootstrap-sbt-gtrig-tree.qza \
--verbose
stdout:
IQ-TREE multicore version 2.3.6 for MacOS Intel 64-bit built Aug 4 2024
Developed by Bui Quang Minh, Nguyen Lam Tung, Olga Chernomor, Heiko Schmidt,
Dominik Schrempf, Michael Woodhams, Ly Trong Nhan, Thomas Wong
Host: Elizabeths-MacBook-Pro-7.local (AVX512, FMA3, 32 GB RAM)
Command: iqtree -bb 1000 -st DNA --runs 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta -m GTR+I+G -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp367t6ese/q2iqtreeufboot -nt 1 -alrt 1000 -abayes -lbp 1000 -nstop 200 -pers 0.200000
Seed: 172034 (Using SPRNG - Scalable Parallel Random Number Generator)
Time: Tue Oct 29 14:49:08 2024
Kernel: AVX+FMA - 1 threads (8 CPU cores detected)
HINT: Use -nt option to specify number of threads because your CPU has 8 cores!
HINT: -nt AUTO will automatically determine the best number of threads to use.
Reading alignment file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta ... Fasta format detected
Reading fasta file: done in 0.000130892 secs using 69.52% CPU
Alignment most likely contains DNA/RNA sequences
Alignment has 20 sequences with 214 columns, 157 distinct patterns
104 parsimony-informative, 33 singleton sites, 77 constant sites
Gap/Ambiguity Composition p-value
Analyzing sequences: done in 1.09673e-05 secs using 82.06% CPU
1 e84fcf85a6a4065231dcf343bb862f1cb32abae6 40.65% passed 90.91%
2 5525fb6dab7b6577960147574465990c6df070ad 42.99% passed 99.80%
3 eb3564a35320b53cef22a77288838c7446357327 42.99% passed 25.49%
4 418f1d469f08c99976b313028cf6d3f18f61dd55 43.93% passed 71.86%
5 2e3b2c075901640c4de739473f9246385430b1ed 31.31% passed 90.76%
6 0469f8d819bd45c7638d1c8b0895270a05f34267 38.79% passed 92.82%
7 d162ed685007f5adede58f14aece31dfa1b60c18 40.65% passed 97.17%
8 1d45b2bce36cd995c5dcb755babf512e612ce8b9 41.59% passed 39.04%
9 5aba6bd9debc23ded7041ffdcfe5d68a427e8ce8 31.31% passed 87.21%
10 206656bec2abdbc4aee37a661ef5f4a62b5dd6ae 42.99% passed 85.00%
11 606c23e79bb730ad74e3c6efd72004c36674c17a 47.20% passed 87.78%
12 682e91d7e510ab134d0625234ad224f647c14eb0 41.59% passed 31.01%
13 6a36152105590b1eb095b9503e8f1f226fc73e43 39.25% passed 86.29%
14 6ca685c39a33bfbcb3123129e7af88d573df7d6f 42.06% failed 0.02%
15 8a1c44eb462ed58b21f3fdd72dd22bb657db2980 31.78% passed 54.40%
16 9b220cae8d375ea38b8b481cb95949cda8722fcb 36.92% passed 88.78%
17 aa4698d2e2b1fa71d08e2934a923aad7374a18f6 37.85% passed 90.52%
18 b31aa3f04bc9d5e2498d45cf1983dfaf09faa258 31.78% passed 72.69%
19 d44b129a6181f052198bda3813f0802a91612441 41.59% passed 41.69%
20 ed1acad8a98e8579a44370733533ad7d3fed8006 48.13% passed 58.15%
**** TOTAL 39.77% 1 sequences failed composition chi2 test (p-value<5%; df=3)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.000 seconds
Generating 1000 samples for ultrafast bootstrap (seed: 172034)...
NOTE: 1 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.100)
Thoroughly optimizing +I+G parameters from 10 start values...
Init pinv, alpha: 0.000, 1.000 / Estimate: 0.000, 1.218 / LogL: -1394.499
Init pinv, alpha: 0.040, 1.000 / Estimate: 0.008, 1.293 / LogL: -1394.744
Init pinv, alpha: 0.080, 1.000 / Estimate: 0.008, 1.296 / LogL: -1394.739
Init pinv, alpha: 0.120, 1.000 / Estimate: 0.008, 1.294 / LogL: -1394.737
Init pinv, alpha: 0.160, 1.000 / Estimate: 0.008, 1.301 / LogL: -1394.766
Init pinv, alpha: 0.200, 1.000 / Estimate: 0.007, 1.297 / LogL: -1394.718
Init pinv, alpha: 0.240, 1.000 / Estimate: 0.008, 1.299 / LogL: -1394.735
Init pinv, alpha: 0.280, 1.000 / Estimate: 0.008, 1.301 / LogL: -1394.747
Init pinv, alpha: 0.320, 1.000 / Estimate: 0.008, 1.304 / LogL: -1394.750
Init pinv, alpha: 0.360, 1.000 / Estimate: 0.008, 1.307 / LogL: -1394.761
Optimal pinv,alpha: 0.000, 1.218 / LogL: -1394.499
Parameters optimization took 0.276 sec
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000896215 secs using 98.41% CPU
Computing ML distances took 0.000954 sec (of wall-clock time) 0.000926 sec (of CPU time)
Setting up auxiliary I and S matrices: done in 2.7895e-05 secs using 100.4% CPU
Computing RapidNJ tree took 0.000138 sec (of wall-clock time) 0.000137 sec (of CPU time)
Log-likelihood of RapidNJ tree: -1392.880
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Generating 98 parsimony trees... 0.056 second
Computing log-likelihood of 98 initial trees ... 0.062 seconds
Current best score: -1392.880
Do NNI search on 20 best initial trees
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 1: -1387.268
Iteration 10 / LogL: -1387.734 / Time: 0h:0m:0s
Iteration 20 / LogL: -1387.284 / Time: 0h:0m:0s
Finish initializing candidate tree set (2)
Current best tree score: -1387.268 / CPU time: 0.416
Number of iterations: 20
--------------------------------------------------------------------
| OPTIMIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
UPDATE BEST LOG-LIKELIHOOD: -1387.266
UPDATE BEST LOG-LIKELIHOOD: -1387.261
Iteration 30 / LogL: -1388.150 / Time: 0h:0m:0s (0h:0m:5s left)
Iteration 40 / LogL: -1388.150 / Time: 0h:0m:1s (0h:0m:4s left)
Iteration 50 / LogL: -1388.150 / Time: 0h:0m:1s (0h:0m:3s left)
Log-likelihood cutoff on original alignment: -1418.226
Iteration 60 / LogL: -1387.305 / Time: 0h:0m:1s (0h:0m:3s left)
Iteration 70 / LogL: -1388.151 / Time: 0h:0m:1s (0h:0m:2s left)
Iteration 80 / LogL: -1388.150 / Time: 0h:0m:1s (0h:0m:2s left)
Iteration 90 / LogL: -1388.150 / Time: 0h:0m:1s (0h:0m:2s left)
Iteration 100 / LogL: -1388.150 / Time: 0h:0m:1s (0h:0m:1s left)
Log-likelihood cutoff on original alignment: -1418.761
NOTE: Bootstrap correlation coefficient of split occurrence frequencies: 0.997
Iteration 110 / LogL: -1387.316 / Time: 0h:0m:2s (0h:0m:1s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.261
Iteration 120 / LogL: -1387.418 / Time: 0h:0m:2s (0h:0m:1s left)
Iteration 130 / LogL: -1387.373 / Time: 0h:0m:2s (0h:0m:1s left)
Iteration 140 / LogL: -1388.031 / Time: 0h:0m:2s (0h:0m:1s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.259
Iteration 150 / LogL: -1387.344 / Time: 0h:0m:2s (0h:0m:0s left)
Log-likelihood cutoff on original alignment: -1418.761
Iteration 160 / LogL: -1387.344 / Time: 0h:0m:2s (0h:0m:0s left)
Iteration 170 / LogL: -1387.346 / Time: 0h:0m:3s (0h:0m:0s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.258
Iteration 180 / LogL: -1389.562 / Time: 0h:0m:3s (0h:0m:0s left)
Iteration 190 / LogL: -1387.260 / Time: 0h:0m:3s (0h:0m:0s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.257
Iteration 200 / LogL: -1387.346 / Time: 0h:0m:3s (0h:0m:0s left)
Log-likelihood cutoff on original alignment: -1418.761
NOTE: Bootstrap correlation coefficient of split occurrence frequencies: 0.969
NOTE: UFBoot does not converge, continue at least 100 more iterations
Iteration 210 / LogL: -1387.533 / Time: 0h:0m:3s (0h:0m:1s left)
Iteration 220 / LogL: -1387.346 / Time: 0h:0m:3s (0h:0m:1s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.257
Iteration 230 / LogL: -1387.344 / Time: 0h:0m:4s (0h:0m:1s left)
Iteration 240 / LogL: -1387.347 / Time: 0h:0m:4s (0h:0m:1s left)
Iteration 250 / LogL: -1387.344 / Time: 0h:0m:4s (0h:0m:0s left)
Log-likelihood cutoff on original alignment: -1419.546
Iteration 260 / LogL: -1387.344 / Time: 0h:0m:4s (0h:0m:0s left)
Iteration 270 / LogL: -1387.345 / Time: 0h:0m:4s (0h:0m:0s left)
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 273: -1387.167
Iteration 280 / LogL: -1394.651 / Time: 0h:0m:5s (0h:0m:3s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.167
Iteration 290 / LogL: -1387.335 / Time: 0h:0m:5s (0h:0m:3s left)
Iteration 300 / LogL: -1387.335 / Time: 0h:0m:5s (0h:0m:3s left)
Log-likelihood cutoff on original alignment: -1420.463
NOTE: Bootstrap correlation coefficient of split occurrence frequencies: 0.996
Iteration 310 / LogL: -1389.338 / Time: 0h:0m:5s (0h:0m:2s left)
Iteration 320 / LogL: -1387.374 / Time: 0h:0m:5s (0h:0m:2s left)
Iteration 330 / LogL: -1387.167 / Time: 0h:0m:5s (0h:0m:2s left)
Iteration 340 / LogL: -1387.335 / Time: 0h:0m:6s (0h:0m:2s left)
Iteration 350 / LogL: -1387.336 / Time: 0h:0m:6s (0h:0m:2s left)
Log-likelihood cutoff on original alignment: -1420.463
UPDATE BEST LOG-LIKELIHOOD: -1387.167
Iteration 360 / LogL: -1387.384 / Time: 0h:0m:6s (0h:0m:2s left)
Iteration 370 / LogL: -1387.353 / Time: 0h:0m:6s (0h:0m:1s left)
Iteration 380 / LogL: -1387.372 / Time: 0h:0m:6s (0h:0m:1s left)
Iteration 390 / LogL: -1387.337 / Time: 0h:0m:6s (0h:0m:1s left)
Iteration 400 / LogL: -1387.334 / Time: 0h:0m:7s (0h:0m:1s left)
Log-likelihood cutoff on original alignment: -1420.463
NOTE: Bootstrap correlation coefficient of split occurrence frequencies: 0.996
Iteration 410 / LogL: -1387.773 / Time: 0h:0m:7s (0h:0m:1s left)
Iteration 420 / LogL: -1387.168 / Time: 0h:0m:7s (0h:0m:1s left)
Iteration 430 / LogL: -1387.323 / Time: 0h:0m:7s (0h:0m:1s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.167
Iteration 440 / LogL: -1387.351 / Time: 0h:0m:7s (0h:0m:1s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.167
Iteration 450 / LogL: -1387.177 / Time: 0h:0m:7s (0h:0m:0s left)
Log-likelihood cutoff on original alignment: -1420.463
Iteration 460 / LogL: -1387.167 / Time: 0h:0m:8s (0h:0m:0s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.167
UPDATE BEST LOG-LIKELIHOOD: -1387.167
Iteration 470 / LogL: -1387.167 / Time: 0h:0m:8s (0h:0m:0s left)
TREE SEARCH COMPLETED AFTER 474 ITERATIONS / Time: 0h:0m:8s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.010)
1. Initial log-likelihood: -1387.167
Optimal log-likelihood: -1387.167
Rate parameters: A-C: 0.34704 A-G: 2.32942 A-T: 2.15058 C-G: 1.23825 C-T: 3.23101 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.000
Gamma shape alpha: 1.282
Parameters optimization took 1 rounds (0.002 sec)
BEST SCORE FOUND : -1387.167
Testing tree branches by SH-like aLRT with 1000 replicates...
Testing tree branches by local-BP test with 1000 replicates...
Testing tree branches by aBayes parametric test...
0.044 sec.
Creating bootstrap support values...
Split supports printed to NEXUS file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp367t6ese/q2iqtreeufboot.splits.nex
Total tree length: 7.617
Total number of iterations: 474
CPU time used for tree search: 8.172 sec (0h:0m:8s)
Wall-clock time used for tree search: 8.076 sec (0h:0m:8s)
Total CPU time used: 8.543 sec (0h:0m:8s)
Total wall-clock time used: 8.452 sec (0h:0m:8s)
Computing bootstrap consensus tree...
Reading input file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp367t6ese/q2iqtreeufboot.splits.nex...
20 taxa and 205 splits.
Consensus tree written to /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp367t6ese/q2iqtreeufboot.contree
Reading input trees file /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp367t6ese/q2iqtreeufboot.contree
Log-likelihood of consensus tree: -1387.811
Analysis results written to:
IQ-TREE report: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp367t6ese/q2iqtreeufboot.iqtree
Maximum-likelihood tree: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp367t6ese/q2iqtreeufboot.treefile
Likelihood distances: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp367t6ese/q2iqtreeufboot.mldist
Ultrafast bootstrap approximation results written to:
Split support values: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp367t6ese/q2iqtreeufboot.splits.nex
Consensus tree: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp367t6ese/q2iqtreeufboot.contree
Screen log file: /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp367t6ese/q2iqtreeufboot.log
Date and Time: Tue Oct 29 14:49:16 2024
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: iqtree -bb 1000 -st DNA --runs 1 -s /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/qiime2/elizabethgehret/data/8e473e4a-44c3-4e68-967f-31eb660071e1/data/aligned-dna-sequences.fasta -m GTR+I+G -pre /var/folders/gt/s61zzgwx7gz_npzjm4gxfp3w0000gn/T/tmp367t6ese/q2iqtreeufboot -nt 1 -alrt 1000 -abayes -lbp 1000 -nstop 200 -pers 0.200000
Saved Phylogeny[Unrooted] to: iqt-nnisi-bootstrap-sbt-gtrig-tree.qza
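If you export the tree and want to work with the combined support values programmatically, the fixed order alrt / lbp / abayes / ufboot makes them easy to name. The following is a minimal sketch, assuming each internal node label is a single string with the four values separated by slashes; the helper name parse_support_label and the example label are hypothetical and are not taken from the run above.

```python
from typing import Dict

# Order stated in this tutorial: alrt / lbp / abayes / ufboot.
SUPPORT_ORDER = ("alrt", "lbp", "abayes", "ufboot")


def parse_support_label(label: str) -> Dict[str, float]:
    """Split a '/'-separated node label into named support values.

    Assumes the label holds exactly four numeric fields in SUPPORT_ORDER.
    """
    parts = label.strip().split("/")
    if len(parts) != len(SUPPORT_ORDER):
        raise ValueError(
            f"expected {len(SUPPORT_ORDER)} values, got {len(parts)}: {label!r}"
        )
    return {name: float(value) for name, value in zip(SUPPORT_ORDER, parts)}


# Hypothetical label for illustration only.
example = parse_support_label("87.3/85/0.998/96")
for name, value in example.items():
    print(f"{name}: {value}")
```

If you ran only a subset of the tests, the label will carry fewer fields, so adjust SUPPORT_ORDER to match the options you actually passed.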
Tip
If you need to reduce the impact of potential model violations that can occur during a UFBoot search, or would simply like to be more rigorous, you can add the --p-bnni option to any of the iqtree-ultrafast-bootstrap commands above.
Root the phylogeny
To make proper use of diversity metrics such as UniFrac, the phylogeny must be rooted. Typically, an outgroup is chosen when rooting a tree. However, phylogenetic inference tools that use Maximum Likelihood often return an unrooted tree by default.
QIIME 2 provides a way to midpoint root our phylogeny. Other rooting options may be available in the future. For now, we'll root our bootstrap tree from iqtree-ultrafast-bootstrap like so:
qiime phylogeny midpoint-root \
--i-tree iqt-nnisi-bootstrap-sbt-gtrig-tree.qza \
--o-rooted-tree iqt-nnisi-bootstrap-sbt-gtrig-tree-rooted.qza
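To build some intuition for what midpoint rooting does, here is a small, self-contained sketch of the idea: find the two tips that are farthest apart, then place the root halfway along the path between them. This is not the code QIIME 2 runs. The toy tree, its branch lengths, and the function names are all hypothetical, and the tree is stored as a plain dictionary of neighbors rather than as a QIIME 2 artifact.

```python
from typing import Dict, Tuple

# Unrooted tree as an adjacency map: node -> {neighbor: branch length}.
Tree = Dict[str, Dict[str, float]]


def distances_from(tree: Tree, start: str) -> Tuple[Dict[str, float], Dict[str, str]]:
    """Return path lengths from start to every node, plus each node's predecessor."""
    dist = {start: 0.0}
    prev: Dict[str, str] = {}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbor, length in tree[node].items():
            if neighbor not in dist:
                dist[neighbor] = dist[node] + length
                prev[neighbor] = node
                stack.append(neighbor)
    return dist, prev


def midpoint(tree: Tree) -> Tuple[str, str, float]:
    """Locate the midpoint of the longest tip-to-tip path.

    Returns (u, v, offset): the root sits on edge u-v, offset away from u.
    """
    tips = [node for node, nbrs in tree.items() if len(nbrs) == 1]
    # Two passes find the most distant pair of tips in a tree.
    dist, _ = distances_from(tree, tips[0])
    tip_a = max(tips, key=lambda t: dist[t])
    dist, prev = distances_from(tree, tip_a)
    tip_b = max(tips, key=lambda t: dist[t])
    half = dist[tip_b] / 2.0
    # Walk back from tip_b toward tip_a until the edge spanning the halfway point.
    node = tip_b
    while dist[prev[node]] > half:
        node = prev[node]
    parent = prev[node]
    return parent, node, half - dist[parent]


# Hypothetical four-tip tree with internal nodes X and Y.
toy_tree: Tree = {
    "A": {"X": 1.0},
    "B": {"X": 3.0},
    "C": {"Y": 1.0},
    "D": {"Y": 4.0},
    "X": {"A": 1.0, "B": 3.0, "Y": 2.0},
    "Y": {"C": 1.0, "D": 4.0, "X": 2.0},
}

u, v, offset = midpoint(toy_tree)
print(f"root on edge {u}-{v}, {offset} from {u}")
```

In the toy tree, tips B and D are the farthest apart (total path length 9.0), so the root lands on the internal edge between Y and X, at equal distance (4.5) from B and from D.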
Tip
iTOL viewing reminder. We can view our tree and its associated alignment via iTOL. Upload the iqt-nnisi-bootstrap-sbt-gtrig-tree-rooted.qza tree file and display the tree in Normal mode. Then drag and drop the masked-aligned-rep-seqs.qza file onto the visualization. Now you can view the phylogeny alongside the alignment.
Pipelines
Here we will outline the use of the phylogeny pipeline align-to-tree-mafft-fasttree.
One advantage of pipelines is that they combine ordered sets of commonly used commands into one condensed, simple command. To keep these “convenience” pipelines easy to use, it is quite common to expose only a few options to the user. That is, most of the commands executed via pipelines are configured to use their default option settings. However, options that are deemed important enough for the user to consider setting are made available. The options exposed via a given pipeline will largely depend upon what it is doing. Pipelines are also a great way for new users to get started, as they help to lay a foundation of good practices in setting up standard operating procedures.
Without a pipeline, we would need to run the following QIIME 2 commands one after another:
qiime alignment mafft ...
qiime alignment mask ...
qiime phylogeny fasttree ...
qiime phylogeny midpoint-root ...
We can make use of the pipeline align-to-tree-mafft-fasttree to automate the above four steps in one go. Here is the description taken from the pipeline help doc:
This pipeline will start by creating a sequence alignment using MAFFT, after which any alignment columns that are phylogenetically uninformative or ambiguously aligned will be removed (masked). The resulting masked alignment will be used to infer a phylogenetic tree and then subsequently rooted at its midpoint. Output files from each step of the pipeline will be saved. This includes both the unmasked and masked MAFFT alignment from q2-alignment methods, and both the rooted and unrooted phylogenies from q2-phylogeny methods.
This can all be accomplished by simply running the following:
qiime phylogeny align-to-tree-mafft-fasttree \
--i-sequences rep-seqs.qza \
--output-dir mafft-fasttree-output
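Output artifacts: the unmasked and masked MAFFT alignments, together with the unrooted and rooted phylogenies, are saved in the mafft-fasttree-output directory.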
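For comparison, the four individual steps that the pipeline automates can be laid out explicitly. The sketch below only builds and prints the equivalent commands; it does not run them. The function name pipeline_steps and the intermediate file names are hypothetical placeholders, and any options beyond the input and output artifacts are left at their defaults.

```python
import shlex
from typing import List


def pipeline_steps(sequences: str) -> List[List[str]]:
    """Build the four commands that align-to-tree-mafft-fasttree chains together.

    File names other than the input are placeholders chosen for this sketch.
    """
    aligned = "aligned-rep-seqs.qza"
    masked = "masked-aligned-rep-seqs.qza"
    tree = "unrooted-tree.qza"
    rooted = "rooted-tree.qza"
    return [
        # 1. Align the sequences with MAFFT.
        ["qiime", "alignment", "mafft",
         "--i-sequences", sequences, "--o-alignment", aligned],
        # 2. Mask uninformative or ambiguously aligned columns.
        ["qiime", "alignment", "mask",
         "--i-alignment", aligned, "--o-masked-alignment", masked],
        # 3. Infer an unrooted tree from the masked alignment.
        ["qiime", "phylogeny", "fasttree",
         "--i-alignment", masked, "--o-tree", tree],
        # 4. Root the tree at its midpoint.
        ["qiime", "phylogeny", "midpoint-root",
         "--i-tree", tree, "--o-rooted-tree", rooted],
    ]


# Dry run: print each command instead of executing it.
for command in pipeline_steps("rep-seqs.qza"):
    print(shlex.join(command))
```

To actually execute the steps in an environment where QIIME 2 is installed, each command list could be passed to subprocess.run(command, check=True). In practice the single pipeline command is the simpler choice.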
Congratulations! You now know how to construct a phylogeny in QIIME 2!