Phylogenetic inference with q2-phylogeny
Note
This tutorial assumes you’ve read through the QIIME 2 Overview documentation and have worked through at least some of the other Tutorials.
Inferring phylogenies
Several downstream diversity metrics, available within QIIME 2, require that a phylogenetic tree be constructed using the Operational Taxonomic Units (OTUs) or Amplicon Sequence Variants (ASVs) being investigated.
But how do we proceed to construct a phylogeny from our sequence data?
Well, there are two phylogeny-based approaches we can use. Which one to use depends largely on your study questions:
1. A reference-based fragment insertion approach. This is likely the ideal choice, especially if your reference phylogeny (and associated representative sequences) encompasses close relatives among which your sequences can be reliably inserted. Any sequences that do not match the reference well enough are not inserted. For example, this approach may not work well if your data contain sequences that are not well represented within your reference phylogeny (e.g. missing clades). For more information, check out these great fragment insertion examples.
2. A de novo approach. Marker genes that can be globally aligned across divergent taxa are usually amenable to sequence alignment and phylogenetic investigation through this approach. Be mindful of the length of your sequences when constructing a de novo phylogeny: short reads may not have enough phylogenetic information to capture a meaningful phylogeny. This community tutorial will focus on the de novo approaches.
Here, you will learn how to make use of de novo phylogenetic approaches to:
generate a sequence alignment within QIIME 2
mask the alignment if needed
construct a phylogenetic tree
root the phylogenetic tree
If you would like to substitute any of the steps outlined here by making use of tools external to QIIME 2, please see the import, export, and filtering documentation where appropriate.
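Because short reads may carry too little signal for a de novo phylogeny, it can be worth checking sequence lengths before going any further. The sketch below is only an illustration and is not part of q2-phylogeny: it assumes your representative sequences have already been exported to a plain FASTA file, and the function name fasta_length_summary is hypothetical. It relies only on awk.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a QIIME 2 command): summarize sequence lengths
# from FASTA text read on stdin. Multi-line records are supported.
fasta_length_summary() {
  awk '
    function record() {
      n++; total += cur
      if (n == 1 || cur < min) min = cur
      if (n == 1 || cur > max) max = cur
    }
    /^>/ { if (started) record(); started = 1; cur = 0; next }
         { gsub(/[ \t\r]/, ""); cur += length($0) }
    END  {
      if (started) record()
      if (n) printf "n=%d min=%d max=%d mean=%.1f\n", n, min, max, total / n
    }'
}

# Toy input: three sequences of length 6, 3 and 8.
printf '>a\nACGT\nAC\n>b\nACG\n>c\nACGTACGT\n' | fasta_length_summary
# prints: n=3 min=3 max=8 mean=5.7
```

With a real export you would redirect the file into the function instead, for example fasta_length_summary < dna-sequences.fasta (the file name here is an assumption about what your export produced).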
Sequence Alignment
Prior to constructing a phylogeny we must generate a multiple sequence alignment (MSA). When constructing an MSA we are making a statement about the putative homology of the aligned residues (columns of the MSA) by virtue of their sequence similarity.
Algorithms for constructing an MSA are legion. We will make use of MAFFT (Multiple Alignment using Fast Fourier Transform) via the q2-alignment plugin. For more information, check out the MAFFT paper.
Let’s start by creating a directory to work in:
mkdir qiime2-phylogeny-tutorial
cd qiime2-phylogeny-tutorial
Next, download the data using any one of the following options:
Download URL: https://data.qiime2.org/2020.11/tutorials/phylogeny/rep-seqs.qza
Save as: rep-seqs.qza
With wget:
wget \
  -O "rep-seqs.qza" \
  "https://data.qiime2.org/2020.11/tutorials/phylogeny/rep-seqs.qza"
With curl:
curl -sL \
  "https://data.qiime2.org/2020.11/tutorials/phylogeny/rep-seqs.qza" > \
  "rep-seqs.qza"
Run MAFFT
qiime alignment mafft \
--i-sequences rep-seqs.qza \
--o-alignment aligned-rep-seqs.qza
Reducing alignment ambiguity: masking and reference alignments
Why mask an alignment?
Masking helps to eliminate alignment columns that are phylogenetically uninformative or misleading before phylogenetic analysis. Alignment errors often introduce noise and confound phylogenetic inference, so it is common practice to mask (remove) these ambiguously aligned regions prior to performing phylogenetic inference. In particular, David Lane’s (1991) chapter 16S/23S rRNA sequencing proposed masking SSU data prior to phylogenetic analysis. However, knowing how to deal with ambiguously aligned regions and when to apply masks largely depends on the marker genes being analyzed and the question being asked of the data.
Note
Keep in mind that this is still an active area of discussion, as highlighted by the following non-exhaustive list of articles: Wu et al. 2012, Ashkenazy et al. 2018, Schloss 2010, Tan et al. 2015, Rajan 2015.
How to mask an alignment
For our purposes, we’ll assume that we have ambiguously aligned columns in the MAFFT alignment we produced above. The default setting for --p-min-conservation of the alignment mask method approximates the Lane mask filtering of QIIME 1. Keep an eye out for updates to the alignment plugin.
qiime alignment mask \
--i-alignment aligned-rep-seqs.qza \
--o-masked-alignment masked-aligned-rep-seqs.qza
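To build some intuition for what a conservation threshold does, here is a toy sketch of column masking in awk. It is not the q2-alignment implementation. It assumes that '-' marks a gap and that a column is kept when its most common non-gap character is present in at least the given fraction of all sequences; the function name mask_columns and the toy alignment are hypothetical.

```shell
#!/usr/bin/env bash
# Toy illustration only (NOT the q2-alignment code).
# Reads equal-length aligned sequences (one per line, no FASTA headers) on
# stdin and drops every column whose most common non-gap character occurs
# in fewer than MIN_CONSERVATION of the sequences.
mask_columns() {
  local min_conservation="$1"
  awk -v min="$min_conservation" '
    { seqs[NR] = $0; len = length($0) }
    END {
      for (col = 1; col <= len; col++) {
        split("", count)            # portable way to clear the array
        best = 0
        for (i = 1; i <= NR; i++) {
          ch = substr(seqs[i], col, 1)
          if (ch == "-") continue   # gaps never count as conserved
          count[ch]++
          if (count[ch] > best) best = count[ch]
        }
        keep[col] = (best / NR >= min)
      }
      for (i = 1; i <= NR; i++) {
        out = ""
        for (col = 1; col <= len; col++)
          if (keep[col]) out = out substr(seqs[i], col, 1)
        print out
      }
    }'
}

# Columns 4 (no character above 20%) and 5 (all gaps) are removed.
printf '%s\n' ACGT- ACGA- ACTC- A-GG- A-C-- | mask_columns 0.4
# prints:
# ACG
# ACG
# ACT
# A-G
# A-C
```

Raising the threshold removes more columns, which trades alignment noise against the amount of phylogenetic signal left for tree building.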
Reference-based alignments
There are a variety of tools, such as PyNAST (using NAST), Infernal, and SINA, that attempt to reduce the number of ambiguously aligned regions by using curated reference alignments (e.g. SILVA). Reference alignments are particularly powerful for rRNA gene sequence data, as knowledge of secondary structure is incorporated into the curation process, thus increasing alignment quality. For a more in-depth and eloquent overview of reference-based alignment approaches, check out the great SINA community tutorial.
Note
Alignments constructed using reference-based alignment approaches can be masked too, just like the above MAFFT example. Also, the reference alignment approach we are discussing here is distinct from the reference phylogeny approach (i.e. q2-fragment-insertion) we mentioned earlier. That is, we are not inserting our data into an existing tree, but simply trying to create a more robust alignment for making a better de novo phylogeny.
Construct a phylogeny
As with MSA algorithms, phylogenetic inference tools are also legion. Fortunately, there are many great resources to learn about phylogenetics. Below are just a few introductory resources to get you started:
There are several methods / pipelines available through the q2-phylogeny plugin of QIIME 2. These are based on the following tools: FastTree, RAxML, and IQ-TREE.
Methods
fasttree
FastTree is able to construct phylogenies from large sequence alignments quite rapidly. It does this by using a CAT-like rate category approximation, which is also available through RAxML (discussed below). Check out the FastTree online manual for more information.
qiime phylogeny fasttree \
--i-alignment masked-aligned-rep-seqs.qza \
--o-tree fasttree-tree.qza
Tip
For an easy and direct way to view your tree.qza files, upload them to iTOL. Here, you can interactively view and manipulate your phylogeny. Even better, while viewing the tree topology in “Normal mode”, you can drag and drop your associated alignment.qza (the one you used to build the phylogeny) or a relevant taxonomy.qza file onto the iTOL tree visualization. This will allow you to directly view the sequence alignment or taxonomy alongside the phylogeny. 🕶️
raxml
Like fasttree, raxml will perform a single phylogenetic inference and return a tree. Note, the default model for raxml is --p-substitution-model GTRGAMMA. If you’d like to construct a tree using the CAT model like fasttree, simply replace GTRGAMMA with GTRCAT as shown below:
qiime phylogeny raxml \
--i-alignment masked-aligned-rep-seqs.qza \
--p-substitution-model GTRCAT \
--o-tree raxml-cat-tree.qza \
--verbose
stdout:
Warning, you specified a working directory via "-w"
Keep in mind that RAxML only accepts absolute path names, not relative ones!
RAxML can't, parse the alignment file as phylip file
it will now try to parse it as FASTA file
Using BFGS method to optimize GTR rate parameters, to disable this specify "--no-bfgs"
This is RAxML version 8.2.12 released by Alexandros Stamatakis on May 2018.
With greatly appreciated code contributions by:
Andre Aberer (HITS)
Simon Berger (HITS)
Alexey Kozlov (HITS)
Kassian Kobert (HITS)
David Dao (KIT and HITS)
Sarah Lutteropp (KIT and HITS)
Nick Pattengale (Sandia)
Wayne Pfeiffer (SDSC)
Akifumi S. Tanabe (NRIFS)
Charlie Taylor (UF)
Alignment has 157 distinct alignment patterns
Proportion of gaps and completely undetermined characters in this alignment: 39.77%
RAxML rapid hill-climbing mode
Using 1 distinct models/data partitions with joint branch length optimization
Executing 1 inferences on the original alignment using 1 distinct randomized MP trees
All free model parameters will be estimated by RAxML
ML estimate of 25 per site rate categories
Likelihood of final tree will be evaluated and optimized under GAMMA
GAMMA Model parameters will be estimated up to an accuracy of 0.1000000000 Log Likelihood units
Partition: 0
Alignment Patterns: 157
Name: No Name Provided
DataType: DNA
Substitution Matrix: GTR
RAxML was called as follows:
raxmlHPC -m GTRCAT -p 6029 -N 1 -s /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-4xa4_9f9/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta -w /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpg9l5vne7 -n q2
Partition: 0 with name: No Name Provided
Base frequencies: 0.243 0.182 0.319 0.256
Inference[0]: Time 0.592496 CAT-based likelihood -1242.800535, best rearrangement setting 5
Conducting final model optimizations on all 1 trees under GAMMA-based models ....
Inference[0] final GAMMA-based Likelihood: -1387.542672 tree written to file /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpg9l5vne7/RAxML_result.q2
Starting final GAMMA-based thorough Optimization on tree 0 likelihood -1387.542672 ....
Final GAMMA-based Score of best tree -1387.231349
Program execution info written to /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpg9l5vne7/RAxML_info.q2
Best-scoring ML tree written to: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpg9l5vne7/RAxML_bestTree.q2
Overall execution time: 1.213617 secs or 0.000337 hours or 0.000014 days
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: raxmlHPC -m GTRCAT -p 6029 -N 1 -s /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-4xa4_9f9/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta -w /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpg9l5vne7 -n q2
Saved Phylogeny[Unrooted] to: raxml-cat-tree.qza
Perform multiple searches using raxml
If you’d like to perform a more thorough search of “tree space” you can instruct raxml to perform multiple independent searches on the full alignment by using --p-n-searches 5. Once these 5 independent searches are completed, only the single best-scoring tree will be returned. Note, we are not bootstrapping here; we’ll do that in a later example. Let’s set --p-substitution-model GTRCAT. Finally, let’s also manually set a seed via --p-seed. By setting our seed, we allow other users to reproduce our phylogeny. That is, anyone using the same sequence alignment and substitution model will generate the same tree as long as they set the same seed value. Although --p-seed is not a required argument, it is generally a good idea to set this value.
qiime phylogeny raxml \
--i-alignment masked-aligned-rep-seqs.qza \
--p-substitution-model GTRCAT \
--p-seed 1723 \
--p-n-searches 5 \
--o-tree raxml-cat-searches-tree.qza \
--verbose
stdout:
Warning, you specified a working directory via "-w"
Keep in mind that RAxML only accepts absolute path names, not relative ones!
RAxML can't, parse the alignment file as phylip file
it will now try to parse it as FASTA file
Using BFGS method to optimize GTR rate parameters, to disable this specify "--no-bfgs"
This is RAxML version 8.2.12 released by Alexandros Stamatakis on May 2018.
With greatly appreciated code contributions by:
Andre Aberer (HITS)
Simon Berger (HITS)
Alexey Kozlov (HITS)
Kassian Kobert (HITS)
David Dao (KIT and HITS)
Sarah Lutteropp (KIT and HITS)
Nick Pattengale (Sandia)
Wayne Pfeiffer (SDSC)
Akifumi S. Tanabe (NRIFS)
Charlie Taylor (UF)
Alignment has 157 distinct alignment patterns
Proportion of gaps and completely undetermined characters in this alignment: 39.77%
RAxML rapid hill-climbing mode
Using 1 distinct models/data partitions with joint branch length optimization
Executing 5 inferences on the original alignment using 5 distinct randomized MP trees
All free model parameters will be estimated by RAxML
ML estimate of 25 per site rate categories
Likelihood of final tree will be evaluated and optimized under GAMMA
GAMMA Model parameters will be estimated up to an accuracy of 0.1000000000 Log Likelihood units
Partition: 0
Alignment Patterns: 157
Name: No Name Provided
DataType: DNA
Substitution Matrix: GTR
RAxML was called as follows:
raxmlHPC -m GTRCAT -p 1723 -N 5 -s /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-83gf6_8_/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta -w /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpeu2u6wj9 -n q2
Partition: 0 with name: No Name Provided
Base frequencies: 0.243 0.182 0.319 0.256
Inference[0]: Time 0.622635 CAT-based likelihood -1238.242991, best rearrangement setting 5
Inference[1]: Time 0.494979 CAT-based likelihood -1249.502284, best rearrangement setting 5
Inference[2]: Time 0.501749 CAT-based likelihood -1242.978035, best rearrangement setting 5
Inference[3]: Time 0.647470 CAT-based likelihood -1243.159855, best rearrangement setting 5
Inference[4]: Time 0.487555 CAT-based likelihood -1261.321621, best rearrangement setting 5
Conducting final model optimizations on all 5 trees under GAMMA-based models ....
Inference[0] final GAMMA-based Likelihood: -1388.324037 tree written to file /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpeu2u6wj9/RAxML_result.q2.RUN.0
Inference[1] final GAMMA-based Likelihood: -1392.813982 tree written to file /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpeu2u6wj9/RAxML_result.q2.RUN.1
Inference[2] final GAMMA-based Likelihood: -1388.073642 tree written to file /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpeu2u6wj9/RAxML_result.q2.RUN.2
Inference[3] final GAMMA-based Likelihood: -1387.945266 tree written to file /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpeu2u6wj9/RAxML_result.q2.RUN.3
Inference[4] final GAMMA-based Likelihood: -1387.557031 tree written to file /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpeu2u6wj9/RAxML_result.q2.RUN.4
Starting final GAMMA-based thorough Optimization on tree 4 likelihood -1387.557031 ....
Final GAMMA-based Score of best tree -1387.385075
Program execution info written to /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpeu2u6wj9/RAxML_info.q2
Best-scoring ML tree written to: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpeu2u6wj9/RAxML_bestTree.q2
Overall execution time: 3.500996 secs or 0.000972 hours or 0.000041 days
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: raxmlHPC -m GTRCAT -p 1723 -N 5 -s /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-83gf6_8_/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta -w /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpeu2u6wj9 -n q2
Saved Phylogeny[Unrooted] to: raxml-cat-searches-tree.qza
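If you save the verbose output to a file, you can check for yourself which of the independent searches scored best. The following is a minimal sketch, assuming the log lines keep the "Inference[N] final GAMMA-based Likelihood:" format shown above; the function name best_search is hypothetical, and the sample lines are shortened copies of the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read RAxML verbose output on stdin and print the
# index and likelihood of the best (highest, i.e. least negative) search.
best_search() {
  awk '
    /^Inference\[[0-9]+\] final GAMMA-based Likelihood:/ {
      idx = $1
      gsub(/[^0-9]/, "", idx)     # "Inference[4]" -> "4"
      lik = $5 + 0                # fifth field holds the log likelihood
      if (!seen || lik > best) { best = lik; best_idx = idx; seen = 1 }
    }
    END { if (seen) printf "%s %.6f\n", best_idx, best }'
}

best_search <<'EOF'
Inference[0] final GAMMA-based Likelihood: -1388.324037 tree written to file RAxML_result.q2.RUN.0
Inference[1] final GAMMA-based Likelihood: -1392.813982 tree written to file RAxML_result.q2.RUN.1
Inference[2] final GAMMA-based Likelihood: -1388.073642 tree written to file RAxML_result.q2.RUN.2
Inference[3] final GAMMA-based Likelihood: -1387.945266 tree written to file RAxML_result.q2.RUN.3
Inference[4] final GAMMA-based Likelihood: -1387.557031 tree written to file RAxML_result.q2.RUN.4
EOF
# prints: 4 -1387.557031
```

This agrees with the log above, where RAxML starts its final thorough optimization on tree 4.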
raxml-rapid-bootstrap
In phylogenetics, it is good practice to check how well the splits / bipartitions in your phylogeny are supported. Often one is interested in which clades are robustly separated from other clades in the phylogeny. One way of doing this is via bootstrapping (see the Bootstrapping section of the first introductory link above). In QIIME 2, we’ve provided access to the RAxML rapid bootstrap feature. The only difference between this command and the previous one is the additional flags --p-bootstrap-replicates and --p-rapid-bootstrap-seed. It is quite common to perform anywhere from 100 to 1000 bootstrap replicates. The --p-rapid-bootstrap-seed works very much like the --p-seed argument from above, except that it allows anyone to reproduce the bootstrapping process and the associated supports for your splits.
As per the RAxML online documentation and the RAxML manual, the rapid bootstrapping command that we will execute below will do the following:
Bootstrap the input alignment 100 times and perform a Maximum Likelihood (ML) search on each.
Find best scoring ML tree through multiple independent searches using the original input alignment. The number of independent searches is determined by the number of bootstrap replicates set in the 1st step. That is, your search becomes more thorough with increasing bootstrap replicates. The ML optimization of RAxML uses every 5th bootstrap tree as the starting tree for an ML search on the original alignment.
Map the bipartitions (bootstrap supports, 1st step) onto the best scoring ML tree (2nd step).
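The first step rests on resampling alignment columns with replacement, and the seed is what makes that resampling repeatable. Here is a toy sketch of a single bootstrap replicate in awk. It is not RAxML's algorithm or its random number generator; the function name bootstrap_columns and the toy alignment are hypothetical. Different awk implementations may draw different columns for the same seed, but the same awk given the same seed should reproduce the same replicate.

```shell
#!/usr/bin/env bash
# Toy illustration only (NOT how RAxML is implemented).
# Reads equal-length aligned sequences (one per line) on stdin and prints one
# bootstrap replicate: the same number of columns, drawn with replacement.
bootstrap_columns() {
  local seed="$1"
  awk -v seed="$seed" '
    { seqs[NR] = $0; len = length($0) }
    END {
      srand(seed)                                  # fixed seed -> repeatable draw
      for (j = 1; j <= len; j++)
        pick[j] = int(rand() * len) + 1            # column index in 1..len
      for (i = 1; i <= NR; i++) {
        out = ""
        for (j = 1; j <= len; j++)
          out = out substr(seqs[i], pick[j], 1)    # same columns for every row
        print out
      }
    }'
}

# Running this twice with the same seed gives identical replicates.
printf '%s\n' ACGTACGT ACGAACGA ACTTACTT | bootstrap_columns 9384
```

Every sequence is resampled with the same column indices, so each replicate is still a valid alignment on which a tree can be inferred.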
qiime phylogeny raxml-rapid-bootstrap \
--i-alignment masked-aligned-rep-seqs.qza \
--p-seed 1723 \
--p-rapid-bootstrap-seed 9384 \
--p-bootstrap-replicates 100 \
--p-substitution-model GTRCAT \
--o-tree raxml-cat-bootstrap-tree.qza \
--verbose
stdout:
Warning, you specified a working directory via "-w"
Keep in mind that RAxML only accepts absolute path names, not relative ones!
RAxML can't, parse the alignment file as phylip file
it will now try to parse it as FASTA file
Using BFGS method to optimize GTR rate parameters, to disable this specify "--no-bfgs"
This is RAxML version 8.2.12 released by Alexandros Stamatakis on May 2018.
With greatly appreciated code contributions by:
Andre Aberer (HITS)
Simon Berger (HITS)
Alexey Kozlov (HITS)
Kassian Kobert (HITS)
David Dao (KIT and HITS)
Sarah Lutteropp (KIT and HITS)
Nick Pattengale (Sandia)
Wayne Pfeiffer (SDSC)
Akifumi S. Tanabe (NRIFS)
Charlie Taylor (UF)
Alignment has 157 distinct alignment patterns
Proportion of gaps and completely undetermined characters in this alignment: 39.77%
RAxML rapid bootstrapping and subsequent ML search
Using 1 distinct models/data partitions with joint branch length optimization
Executing 100 rapid bootstrap inferences and thereafter a thorough ML search
All free model parameters will be estimated by RAxML
ML estimate of 25 per site rate categories
Likelihood of final tree will be evaluated and optimized under GAMMA
GAMMA Model parameters will be estimated up to an accuracy of 0.1000000000 Log Likelihood units
Partition: 0
Alignment Patterns: 157
Name: No Name Provided
DataType: DNA
Substitution Matrix: GTR
RAxML was called as follows:
raxmlHPC -f a -m GTRCAT -p 1723 -x 9384 -N 100 -s /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-50xfe34m/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta -w /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmp4b5jq6d0 -n q2bootstrap
Time for BS model parameter optimization 0.051174
Bootstrap[0]: Time 0.172528 seconds, bootstrap likelihood -1199.758796, best rearrangement setting 12
Bootstrap[1]: Time 0.113470 seconds, bootstrap likelihood -1344.229251, best rearrangement setting 6
Bootstrap[2]: Time 0.105118 seconds, bootstrap likelihood -1295.343000, best rearrangement setting 8
Bootstrap[3]: Time 0.094437 seconds, bootstrap likelihood -1273.768320, best rearrangement setting 8
Bootstrap[4]: Time 0.106514 seconds, bootstrap likelihood -1253.402952, best rearrangement setting 6
Bootstrap[5]: Time 0.112570 seconds, bootstrap likelihood -1260.866113, best rearrangement setting 10
Bootstrap[6]: Time 0.109708 seconds, bootstrap likelihood -1293.636299, best rearrangement setting 14
Bootstrap[7]: Time 0.100803 seconds, bootstrap likelihood -1227.178693, best rearrangement setting 6
Bootstrap[8]: Time 0.108574 seconds, bootstrap likelihood -1321.820787, best rearrangement setting 13
Bootstrap[9]: Time 0.116927 seconds, bootstrap likelihood -1147.233446, best rearrangement setting 6
Bootstrap[10]: Time 0.084227 seconds, bootstrap likelihood -1220.766493, best rearrangement setting 13
Bootstrap[11]: Time 0.118650 seconds, bootstrap likelihood -1200.006355, best rearrangement setting 8
Bootstrap[12]: Time 0.126655 seconds, bootstrap likelihood -1346.392834, best rearrangement setting 14
Bootstrap[13]: Time 0.106150 seconds, bootstrap likelihood -1301.111096, best rearrangement setting 14
Bootstrap[14]: Time 0.112832 seconds, bootstrap likelihood -1262.253559, best rearrangement setting 11
Bootstrap[15]: Time 0.115825 seconds, bootstrap likelihood -1215.017551, best rearrangement setting 14
Bootstrap[16]: Time 0.107068 seconds, bootstrap likelihood -1238.832009, best rearrangement setting 7
Bootstrap[17]: Time 0.098862 seconds, bootstrap likelihood -1393.989732, best rearrangement setting 12
Bootstrap[18]: Time 0.102751 seconds, bootstrap likelihood -1173.921002, best rearrangement setting 15
Bootstrap[19]: Time 0.106741 seconds, bootstrap likelihood -1185.726976, best rearrangement setting 11
Bootstrap[20]: Time 0.095408 seconds, bootstrap likelihood -1158.491940, best rearrangement setting 6
Bootstrap[21]: Time 0.092270 seconds, bootstrap likelihood -1154.664272, best rearrangement setting 11
Bootstrap[22]: Time 0.103083 seconds, bootstrap likelihood -1244.159837, best rearrangement setting 10
Bootstrap[23]: Time 0.125781 seconds, bootstrap likelihood -1211.171036, best rearrangement setting 15
Bootstrap[24]: Time 0.103894 seconds, bootstrap likelihood -1261.440677, best rearrangement setting 12
Bootstrap[25]: Time 0.105215 seconds, bootstrap likelihood -1331.836715, best rearrangement setting 15
Bootstrap[26]: Time 0.107515 seconds, bootstrap likelihood -1129.144509, best rearrangement setting 5
Bootstrap[27]: Time 0.131299 seconds, bootstrap likelihood -1226.624056, best rearrangement setting 7
Bootstrap[28]: Time 0.128682 seconds, bootstrap likelihood -1221.046176, best rearrangement setting 12
Bootstrap[29]: Time 0.087077 seconds, bootstrap likelihood -1211.791204, best rearrangement setting 14
Bootstrap[30]: Time 0.110945 seconds, bootstrap likelihood -1389.442380, best rearrangement setting 5
Bootstrap[31]: Time 0.110546 seconds, bootstrap likelihood -1303.638592, best rearrangement setting 12
Bootstrap[32]: Time 0.121055 seconds, bootstrap likelihood -1172.859456, best rearrangement setting 12
Bootstrap[33]: Time 0.104294 seconds, bootstrap likelihood -1244.617135, best rearrangement setting 9
Bootstrap[34]: Time 0.100986 seconds, bootstrap likelihood -1211.871717, best rearrangement setting 15
Bootstrap[35]: Time 0.123980 seconds, bootstrap likelihood -1299.862912, best rearrangement setting 5
Bootstrap[36]: Time 0.099252 seconds, bootstrap likelihood -1141.967505, best rearrangement setting 5
Bootstrap[37]: Time 0.121541 seconds, bootstrap likelihood -1283.923198, best rearrangement setting 12
Bootstrap[38]: Time 0.096741 seconds, bootstrap likelihood -1304.250946, best rearrangement setting 5
Bootstrap[39]: Time 0.091542 seconds, bootstrap likelihood -1407.084376, best rearrangement setting 15
Bootstrap[40]: Time 0.109774 seconds, bootstrap likelihood -1277.946299, best rearrangement setting 13
Bootstrap[41]: Time 0.107773 seconds, bootstrap likelihood -1279.006200, best rearrangement setting 7
Bootstrap[42]: Time 0.102929 seconds, bootstrap likelihood -1160.274606, best rearrangement setting 6
Bootstrap[43]: Time 0.127988 seconds, bootstrap likelihood -1216.079259, best rearrangement setting 14
Bootstrap[44]: Time 0.097848 seconds, bootstrap likelihood -1382.278311, best rearrangement setting 8
Bootstrap[45]: Time 0.109597 seconds, bootstrap likelihood -1099.004439, best rearrangement setting 11
Bootstrap[46]: Time 0.088548 seconds, bootstrap likelihood -1296.527478, best rearrangement setting 8
Bootstrap[47]: Time 0.129318 seconds, bootstrap likelihood -1291.322658, best rearrangement setting 9
Bootstrap[48]: Time 0.086704 seconds, bootstrap likelihood -1161.908080, best rearrangement setting 6
Bootstrap[49]: Time 0.118459 seconds, bootstrap likelihood -1257.348428, best rearrangement setting 13
Bootstrap[50]: Time 0.134089 seconds, bootstrap likelihood -1309.422533, best rearrangement setting 13
Bootstrap[51]: Time 0.098051 seconds, bootstrap likelihood -1197.633097, best rearrangement setting 11
Bootstrap[52]: Time 0.110445 seconds, bootstrap likelihood -1347.123005, best rearrangement setting 8
Bootstrap[53]: Time 0.098283 seconds, bootstrap likelihood -1234.934890, best rearrangement setting 14
Bootstrap[54]: Time 0.114188 seconds, bootstrap likelihood -1227.092434, best rearrangement setting 6
Bootstrap[55]: Time 0.117310 seconds, bootstrap likelihood -1280.635747, best rearrangement setting 7
Bootstrap[56]: Time 0.094265 seconds, bootstrap likelihood -1225.911449, best rearrangement setting 6
Bootstrap[57]: Time 0.093729 seconds, bootstrap likelihood -1236.213347, best rearrangement setting 11
Bootstrap[58]: Time 0.133200 seconds, bootstrap likelihood -1393.245723, best rearrangement setting 14
Bootstrap[59]: Time 0.105201 seconds, bootstrap likelihood -1212.039371, best rearrangement setting 6
Bootstrap[60]: Time 0.092922 seconds, bootstrap likelihood -1248.692011, best rearrangement setting 10
Bootstrap[61]: Time 0.108327 seconds, bootstrap likelihood -1172.820979, best rearrangement setting 13
Bootstrap[62]: Time 0.123546 seconds, bootstrap likelihood -1126.745788, best rearrangement setting 14
Bootstrap[63]: Time 0.097677 seconds, bootstrap likelihood -1267.434444, best rearrangement setting 12
Bootstrap[64]: Time 0.090485 seconds, bootstrap likelihood -1340.680748, best rearrangement setting 5
Bootstrap[65]: Time 0.093805 seconds, bootstrap likelihood -1072.671059, best rearrangement setting 5
Bootstrap[66]: Time 0.114375 seconds, bootstrap likelihood -1234.294838, best rearrangement setting 8
Bootstrap[67]: Time 0.112810 seconds, bootstrap likelihood -1109.249439, best rearrangement setting 15
Bootstrap[68]: Time 0.087646 seconds, bootstrap likelihood -1314.493588, best rearrangement setting 8
Bootstrap[69]: Time 0.096061 seconds, bootstrap likelihood -1173.850035, best rearrangement setting 13
Bootstrap[70]: Time 0.102518 seconds, bootstrap likelihood -1231.066465, best rearrangement setting 10
Bootstrap[71]: Time 0.100853 seconds, bootstrap likelihood -1146.861379, best rearrangement setting 9
Bootstrap[72]: Time 0.086235 seconds, bootstrap likelihood -1148.753369, best rearrangement setting 8
Bootstrap[73]: Time 0.101517 seconds, bootstrap likelihood -1333.374056, best rearrangement setting 9
Bootstrap[74]: Time 0.088877 seconds, bootstrap likelihood -1259.382378, best rearrangement setting 5
Bootstrap[75]: Time 0.093051 seconds, bootstrap likelihood -1319.944496, best rearrangement setting 6
Bootstrap[76]: Time 0.110345 seconds, bootstrap likelihood -1309.042165, best rearrangement setting 14
Bootstrap[77]: Time 0.132068 seconds, bootstrap likelihood -1232.061289, best rearrangement setting 8
Bootstrap[78]: Time 0.110082 seconds, bootstrap likelihood -1261.333984, best rearrangement setting 9
Bootstrap[79]: Time 0.114067 seconds, bootstrap likelihood -1194.644341, best rearrangement setting 13
Bootstrap[80]: Time 0.100089 seconds, bootstrap likelihood -1214.037389, best rearrangement setting 9
Bootstrap[81]: Time 0.107106 seconds, bootstrap likelihood -1224.527657, best rearrangement setting 8
Bootstrap[82]: Time 0.127940 seconds, bootstrap likelihood -1241.464826, best rearrangement setting 11
Bootstrap[83]: Time 0.100983 seconds, bootstrap likelihood -1230.730558, best rearrangement setting 6
Bootstrap[84]: Time 0.103703 seconds, bootstrap likelihood -1219.034592, best rearrangement setting 10
Bootstrap[85]: Time 0.111023 seconds, bootstrap likelihood -1280.071994, best rearrangement setting 8
Bootstrap[86]: Time 0.095615 seconds, bootstrap likelihood -1444.747777, best rearrangement setting 9
Bootstrap[87]: Time 0.094808 seconds, bootstrap likelihood -1245.890035, best rearrangement setting 14
Bootstrap[88]: Time 0.109233 seconds, bootstrap likelihood -1287.832766, best rearrangement setting 7
Bootstrap[89]: Time 0.101623 seconds, bootstrap likelihood -1325.245976, best rearrangement setting 5
Bootstrap[90]: Time 0.113758 seconds, bootstrap likelihood -1227.883697, best rearrangement setting 5
Bootstrap[91]: Time 0.111353 seconds, bootstrap likelihood -1273.489392, best rearrangement setting 8
Bootstrap[92]: Time 0.045664 seconds, bootstrap likelihood -1234.725870, best rearrangement setting 7
Bootstrap[93]: Time 0.117640 seconds, bootstrap likelihood -1235.733064, best rearrangement setting 11
Bootstrap[94]: Time 0.097002 seconds, bootstrap likelihood -1204.319488, best rearrangement setting 15
Bootstrap[95]: Time 0.093885 seconds, bootstrap likelihood -1183.328582, best rearrangement setting 11
Bootstrap[96]: Time 0.108553 seconds, bootstrap likelihood -1196.298898, best rearrangement setting 13
Bootstrap[97]: Time 0.116876 seconds, bootstrap likelihood -1339.251746, best rearrangement setting 12
Bootstrap[98]: Time 0.044840 seconds, bootstrap likelihood -1404.363552, best rearrangement setting 7
Bootstrap[99]: Time 0.059058 seconds, bootstrap likelihood -1270.157811, best rearrangement setting 7
Overall Time for 100 Rapid Bootstraps 10.589718 seconds
Average Time per Rapid Bootstrap 0.105897 seconds
Starting ML Search ...
Fast ML optimization finished
Fast ML search Time: 4.193859 seconds
Slow ML Search 0 Likelihood: -1387.994678
Slow ML Search 1 Likelihood: -1387.994678
Slow ML Search 2 Likelihood: -1387.994676
Slow ML Search 3 Likelihood: -1387.994650
Slow ML Search 4 Likelihood: -1387.994685
Slow ML Search 5 Likelihood: -1388.092954
Slow ML Search 6 Likelihood: -1388.182551
Slow ML Search 7 Likelihood: -1388.182563
Slow ML Search 8 Likelihood: -1388.182547
Slow ML Search 9 Likelihood: -1387.994723
Slow ML optimization finished
Slow ML search Time: 2.147984 seconds
Thorough ML search Time: 0.562621 seconds
Final ML Optimization Likelihood: -1387.204993
Model Information:
Model Parameters of Partition 0, Name: No Name Provided, Type of Data: DNA
alpha: 1.227800
Tree-Length: 7.823400
rate A <-> C: 0.332564
rate A <-> G: 2.312784
rate A <-> T: 2.215466
rate C <-> G: 1.243321
rate C <-> T: 3.278770
rate G <-> T: 1.000000
freq pi(A): 0.243216
freq pi(C): 0.181967
freq pi(G): 0.319196
freq pi(T): 0.255621
ML search took 6.909638 secs or 0.001919 hours
Combined Bootstrap and ML search took 17.499828 secs or 0.004861 hours
Drawing Bootstrap Support Values on best-scoring ML tree ...
Found 1 tree in File /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmp4b5jq6d0/RAxML_bestTree.q2bootstrap
Found 1 tree in File /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmp4b5jq6d0/RAxML_bestTree.q2bootstrap
Program execution info written to /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmp4b5jq6d0/RAxML_info.q2bootstrap
All 100 bootstrapped trees written to: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmp4b5jq6d0/RAxML_bootstrap.q2bootstrap
Best-scoring ML tree written to: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmp4b5jq6d0/RAxML_bestTree.q2bootstrap
Best-scoring ML tree with support values written to: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmp4b5jq6d0/RAxML_bipartitions.q2bootstrap
Best-scoring ML tree with support values as branch labels written to: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmp4b5jq6d0/RAxML_bipartitionsBranchLabels.q2bootstrap
Overall execution time for full ML analysis: 17.509964 secs or 0.004864 hours or 0.000203 days
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: raxmlHPC -f a -m GTRCAT -p 1723 -x 9384 -N 100 -s /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-50xfe34m/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta -w /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmp4b5jq6d0 -n q2bootstrap
Saved Phylogeny[Unrooted] to: raxml-cat-bootstrap-tree.qza
Tip
Optimizing RAxML Run Time.
You may have noticed that we haven’t added the flag --p-raxml-version to the
RAxML methods. This parameter provides a means to access versions of RAxML
that have optimized vector instructions for various modern x86 processor
architectures. Paraphrased from the RAxML manual and help documentation:
Firstly, most recent processors support SSE3 vector instructions, and will
likely also support the faster AVX2 vector instructions. Secondly, these
instructions substantially accelerate the likelihood and parsimony
computations. In general, the SSE3 version runs approximately 40% faster than
the standard version, and the AVX2 version runs 10-30% faster than the SSE3
version. Additionally, keep in mind that using more cores / threads will not
necessarily decrease run time. The RAxML manual suggests using 1 core per
~500 DNA alignment patterns. Alignment pattern information is usually visible
on screen when the --verbose option is used. Finally, try using a rate
category model (CAT; set via --p-substitution-model), which yields trees as
good as those from the GAMMA models and is approximately 4 times faster. See
the CAT paper. The CAT approximation is also ideal for alignments containing
10,000 or more taxa, and is very similar to the CAT-like model of FastTree2.
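To make the "1 core per ~500 alignment patterns" rule of thumb concrete, here is a minimal shell sketch. The helper name suggest_threads is hypothetical (it is not part of QIIME 2 or RAxML), and rounding up and capping at the number of available cores are assumptions about how one might apply the guideline.

```shell
# suggest_threads PATTERNS MAX_CORES
# Hypothetical helper: applies the "~500 alignment patterns per core"
# guideline, rounding up and never exceeding the cores available.
suggest_threads() {
    st_threads=$(( ($1 + 499) / 500 ))   # ceiling of PATTERNS / 500
    if [ "$st_threads" -lt 1 ]; then
        st_threads=1                     # always use at least one thread
    fi
    if [ "$st_threads" -gt "$2" ]; then
        st_threads=$2                    # cap at the cores available
    fi
    echo "$st_threads"
}

# The tutorial alignment reports 157 distinct patterns; assume 8 cores.
suggest_threads 157 8     # prints 1
# A larger, made-up alignment with 2600 patterns on the same machine.
suggest_threads 2600 8    # prints 6
```

For the small tutorial alignment this suggests a single thread, which is consistent with the advice that extra threads will not necessarily help. The result could then be supplied to the raxml method's thread parameter (--p-n-threads in the plugin versions we are aware of; check qiime phylogeny raxml --help for your installation).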
iqtree
Similar to the raxml and raxml-rapid-bootstrap methods above, we provide
comparable functionality for IQ-TREE: iqtree and iqtree-ultrafast-bootstrap.
IQ-TREE is unique compared to the fasttree and raxml options in that it
provides access to 286 models of nucleotide substitution! IQ-TREE can also
determine which of these models best fits your dataset prior to constructing
your tree, via its built-in ModelFinder algorithm. This is the default in
QIIME 2, but do not worry, you can set any one of the 286 models of
nucleotide substitution via the --p-substitution-model flag, e.g. you can set
the model as HKY+I+G instead of the default MFP (a basic short-hand for:
“build a phylogeny after determining the best fit model as determined by
ModelFinder”). Keep in mind the additional computational time required for
model testing via ModelFinder.
The simplest way to run the iqtree command with default settings and
automatic model selection (MFP) is like so:
qiime phylogeny iqtree \
--i-alignment masked-aligned-rep-seqs.qza \
--o-tree iqt-tree.qza \
--verbose
stdout:
Plugin warning from phylogeny:
iqtree is deprecated and will be removed in a future version of this plugin.
IQ-TREE multicore version 2.0.3 for Mac OS X 64-bit built Apr 26 2020
Developed by Bui Quang Minh, Nguyen Lam Tung, Olga Chernomor,
Heiko Schmidt, Dominik Schrempf, Michael Woodhams.
Host: ghost.mggen.nau.edu (AVX, 16 GB RAM)
Command: iqtree -st DNA --runs 1 -s /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-7jytd2sk/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta -m MFP -pre /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmps7tqfmwp/q2iqtree -nt 1
Seed: 847309 (Using SPRNG - Scalable Parallel Random Number Generator)
Time: Sat Dec 5 17:56:27 2020
Kernel: AVX - 1 threads (8 CPU cores detected)
HINT: Use -nt option to specify number of threads because your CPU has 8 cores!
HINT: -nt AUTO will automatically determine the best number of threads to use.
Reading alignment file /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-7jytd2sk/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta ... Fasta format detected
Alignment most likely contains DNA/RNA sequences
Alignment has 20 sequences with 214 columns, 157 distinct patterns
104 parsimony-informative, 33 singleton sites, 77 constant sites
Gap/Ambiguity Composition p-value
1 e84fcf85a6a4065231dcf343bb862f1cb32abae6 40.65% passed 90.91%
2 5525fb6dab7b6577960147574465990c6df070ad 42.99% passed 99.80%
3 eb3564a35320b53cef22a77288838c7446357327 42.99% passed 25.49%
4 418f1d469f08c99976b313028cf6d3f18f61dd55 43.93% passed 71.86%
5 2e3b2c075901640c4de739473f9246385430b1ed 31.31% passed 90.76%
6 0469f8d819bd45c7638d1c8b0895270a05f34267 38.79% passed 92.82%
7 d162ed685007f5adede58f14aece31dfa1b60c18 40.65% passed 97.17%
8 1d45b2bce36cd995c5dcb755babf512e612ce8b9 41.59% passed 39.04%
9 5aba6bd9debc23ded7041ffdcfe5d68a427e8ce8 31.31% passed 87.21%
10 206656bec2abdbc4aee37a661ef5f4a62b5dd6ae 42.99% passed 85.00%
11 606c23e79bb730ad74e3c6efd72004c36674c17a 47.20% passed 87.78%
12 682e91d7e510ab134d0625234ad224f647c14eb0 41.59% passed 31.01%
13 6a36152105590b1eb095b9503e8f1f226fc73e43 39.25% passed 86.29%
14 6ca685c39a33bfbcb3123129e7af88d573df7d6f 42.06% failed 0.02%
15 8a1c44eb462ed58b21f3fdd72dd22bb657db2980 31.78% passed 54.40%
16 9b220cae8d375ea38b8b481cb95949cda8722fcb 36.92% passed 88.78%
17 aa4698d2e2b1fa71d08e2934a923aad7374a18f6 37.85% passed 90.52%
18 b31aa3f04bc9d5e2498d45cf1983dfaf09faa258 31.78% passed 72.69%
19 d44b129a6181f052198bda3813f0802a91612441 41.59% passed 41.69%
20 ed1acad8a98e8579a44370733533ad7d3fed8006 48.13% passed 58.15%
**** TOTAL 39.77% 1 sequences failed composition chi2 test (p-value<5%; df=3)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.001 seconds
Perform fast likelihood tree search using GTR+I+G model...
Estimate model parameters (epsilon = 5.000)
Perform nearest neighbor interchange...
Estimate model parameters (epsilon = 1.000)
1. Initial log-likelihood: -1391.311
2. Current log-likelihood: -1389.733
Optimal log-likelihood: -1388.881
Rate parameters: A-C: 0.33789 A-G: 2.29237 A-T: 2.14761 C-G: 1.19040 C-T: 3.28358 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.033
Gamma shape alpha: 1.424
Parameters optimization took 2 rounds (0.013 sec)
Time for fast ML tree search: 0.063 seconds
NOTE: ModelFinder requires 1 MB RAM!
ModelFinder will test up to 286 DNA models (sample size: 214) ...
No. Model -LnL df AIC AICc BIC
1 GTR+F 1402.497 45 2894.994 2919.637 3046.463
2 GTR+F+I 1401.403 46 2894.806 2920.698 3049.641
3 GTR+F+G4 1387.279 46 2866.558 2892.450 3021.393
4 GTR+F+I+G4 1387.559 47 2869.117 2896.298 3027.318
5 GTR+F+R2 1380.611 47 2855.222 2882.402 3013.423
6 GTR+F+R3 1380.659 49 2859.317 2889.195 3024.250
16 SYM+G4 1387.157 43 2860.314 2882.573 3005.051
18 SYM+R2 1382.244 44 2852.489 2875.920 3000.591
29 TVM+F+G4 1388.424 45 2866.848 2891.491 3018.317
31 TVM+F+R2 1382.481 46 2856.963 2882.855 3011.798
42 TVMe+G4 1387.122 42 2858.245 2879.367 2999.616
44 TVMe+R2 1382.298 43 2850.596 2872.854 2995.333
55 TIM3+F+G4 1391.457 44 2870.914 2894.346 3019.017
57 TIM3+F+R2 1384.431 45 2858.861 2883.504 3010.330
68 TIM3e+G4 1390.540 41 2863.080 2883.103 3001.085
70 TIM3e+R2 1385.228 42 2854.456 2875.578 2995.827
81 TIM2+F+G4 1394.180 44 2876.360 2899.792 3024.463
83 TIM2+F+R2 1386.234 45 2862.469 2887.112 3013.938
94 TIM2e+G4 1397.742 41 2877.483 2897.507 3015.488
96 TIM2e+R2 1391.117 42 2866.235 2887.357 3007.606
107 TIM+F+G4 1390.802 44 2869.603 2893.035 3017.706
109 TIM+F+R2 1383.206 45 2856.411 2881.054 3007.880
120 TIMe+G4 1394.796 41 2871.592 2891.616 3009.597
122 TIMe+R2 1388.300 42 2860.600 2881.723 3001.971
133 TPM3u+F+G4 1392.567 43 2871.134 2893.393 3015.871
135 TPM3u+F+R2 1386.368 44 2860.736 2884.168 3008.839
146 TPM3+F+G4 1392.567 43 2871.134 2893.393 3015.871
148 TPM3+F+R2 1386.368 44 2860.736 2884.168 3008.839
159 TPM2u+F+G4 1395.282 43 2876.564 2898.823 3021.301
161 TPM2u+F+R2 1388.115 44 2864.231 2887.663 3012.334
172 TPM2+F+G4 1395.282 43 2876.564 2898.823 3021.301
174 TPM2+F+R2 1388.115 44 2864.231 2887.663 3012.334
185 K3Pu+F+G4 1392.067 43 2870.133 2892.392 3014.870
187 K3Pu+F+R2 1385.124 44 2858.247 2881.679 3006.350
198 K3P+G4 1394.798 40 2869.597 2888.556 3004.236
200 K3P+R2 1388.380 41 2858.761 2878.784 2996.766
211 TN+F+G4 1394.627 43 2875.254 2897.513 3019.991
213 TN+F+R2 1386.824 44 2861.647 2885.079 3009.750
224 TNe+G4 1397.746 40 2875.492 2894.452 3010.131
226 TNe+R2 1391.135 41 2864.270 2884.293 3002.275
237 HKY+F+G4 1395.753 42 2875.505 2896.628 3016.876
239 HKY+F+R2 1388.692 43 2863.383 2885.642 3008.120
250 K2P+G4 1397.751 39 2873.502 2891.433 3004.775
252 K2P+R2 1391.217 40 2862.434 2881.394 2997.073
263 F81+F+G4 1406.484 41 2894.968 2914.991 3032.973
265 F81+F+R2 1400.605 42 2885.210 2906.333 3026.581
276 JC+G4 1408.433 38 2892.866 2909.803 3020.773
278 JC+R2 1403.022 39 2884.045 2901.976 3015.318
Akaike Information Criterion: TVMe+R2
Corrected Akaike Information Criterion: TVMe+R2
Bayesian Information Criterion: TVMe+R2
Best-fit model: TVMe+R2 chosen according to BIC
All model information printed to /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmps7tqfmwp/q2iqtree.model.gz
CPU time for ModelFinder: 0.669 seconds (0h:0m:0s)
Wall-clock time for ModelFinder: 0.692 seconds (0h:0m:0s)
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.100)
1. Initial log-likelihood: -1382.298
Optimal log-likelihood: -1382.296
Rate parameters: A-C: 0.21625 A-G: 2.01118 A-T: 1.56840 C-G: 0.77587 C-T: 2.01118 G-T: 1.00000
Base frequencies: A: 0.250 C: 0.250 G: 0.250 T: 0.250
Site proportion and rates: (0.722,0.406) (0.278,2.543)
Parameters optimization took 1 rounds (0.004 sec)
Computing ML distances based on estimated model parameters... 0.006 sec
Computing BIONJ tree...
0.001 seconds
Log-likelihood of BIONJ tree: -1389.305
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Generating 98 parsimony trees... 0.088 second
Computing log-likelihood of 98 initial trees ... 0.079 seconds
Current best score: -1382.296
Do NNI search on 20 best initial trees
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 1: -1382.296
Iteration 10 / LogL: -1382.305 / Time: 0h:0m:0s
Iteration 20 / LogL: -1382.308 / Time: 0h:0m:0s
Finish initializing candidate tree set (2)
Current best tree score: -1382.296 / CPU time: 0.418
Number of iterations: 20
--------------------------------------------------------------------
| OPTIMIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Iteration 30 / LogL: -1382.297 / Time: 0h:0m:0s (0h:0m:1s left)
Iteration 40 / LogL: -1382.868 / Time: 0h:0m:0s (0h:0m:1s left)
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 41: -1382.001
Iteration 50 / LogL: -1382.887 / Time: 0h:0m:0s (0h:0m:1s left)
Iteration 60 / LogL: -1382.094 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 70 / LogL: -1382.094 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 80 / LogL: -1382.099 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 90 / LogL: -1383.123 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 100 / LogL: -1382.242 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 110 / LogL: -1382.094 / Time: 0h:0m:1s (0h:0m:0s left)
Iteration 120 / LogL: -1382.097 / Time: 0h:0m:2s (0h:0m:0s left)
Iteration 130 / LogL: -1382.211 / Time: 0h:0m:2s (0h:0m:0s left)
UPDATE BEST LOG-LIKELIHOOD: -1382.001
Iteration 140 / LogL: -1382.098 / Time: 0h:0m:2s (0h:0m:0s left)
TREE SEARCH COMPLETED AFTER 142 ITERATIONS / Time: 0h:0m:2s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.010)
1. Initial log-likelihood: -1382.001
Optimal log-likelihood: -1382.001
Rate parameters: A-C: 0.19132 A-G: 1.83777 A-T: 1.52755 C-G: 0.77037 C-T: 1.83777 G-T: 1.00000
Base frequencies: A: 0.250 C: 0.250 G: 0.250 T: 0.250
Site proportion and rates: (0.725,0.409) (0.275,2.561)
Parameters optimization took 1 rounds (0.004 sec)
BEST SCORE FOUND : -1382.001
Total tree length: 7.109
Total number of iterations: 142
CPU time used for tree search: 2.417 sec (0h:0m:2s)
Wall-clock time used for tree search: 2.419 sec (0h:0m:2s)
Total CPU time used: 2.446 sec (0h:0m:2s)
Total wall-clock time used: 2.450 sec (0h:0m:2s)
Analysis results written to:
IQ-TREE report: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmps7tqfmwp/q2iqtree.iqtree
Maximum-likelihood tree: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmps7tqfmwp/q2iqtree.treefile
Likelihood distances: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmps7tqfmwp/q2iqtree.mldist
Screen log file: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmps7tqfmwp/q2iqtree.log
Date and Time: Sat Dec 5 17:56:30 2020
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: iqtree -st DNA --runs 1 -s /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-7jytd2sk/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta -m MFP -pre /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmps7tqfmwp/q2iqtree -nt 1
Saved Phylogeny[Unrooted] to: iqt-tree.qza
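The model chosen by ModelFinder only appears in the on-screen output, so it can be handy to save that output and pull the model name out of it. The sketch below assumes the verbose output was captured to a file; a temporary file holding a few lines copied from the run above stands in for that capture, and the variable names are illustrative only.

```shell
# Stand-in for a saved copy of the verbose output shown above.
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
Akaike Information Criterion: TVMe+R2
Corrected Akaike Information Criterion: TVMe+R2
Bayesian Information Criterion: TVMe+R2
Best-fit model: TVMe+R2 chosen according to BIC
EOF

# Keep only the model name from the "Best-fit model:" line.
best_model=$(sed -n 's/^Best-fit model: \([^ ]*\) chosen according to .*$/\1/p' "$log_file")
echo "$best_model"    # prints TVMe+R2

rm -f "$log_file"
```

In this run all three criteria (AIC, AICc and BIC) agree on TVMe+R2; when they disagree, the "Best-fit model" line reports the model chosen according to BIC.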
Specifying a substitution model
We can also set a substitution model of our choosing. You may have noticed
while watching the onscreen output of the previous command that the
best-fitting model selected by ModelFinder is noted. For the sake of argument,
let’s say the best selected model was shown as GTR+F+I+G4. The F is only a
notation to let us know that if a given model supports unequal base
frequencies, then the empirical base frequencies will be used by default.
Using empirical base frequencies (F), rather than estimating them, greatly
reduces computational time. The iqtree plugin will not accept F within the
model notation supplied at the command line, as this will always be implied
automatically for the appropriate model. Also, the iqtree plugin only accepts
G, not G4, within the model notation. The 4 is simply another explicit
notation to remind us that four rate categories are being assumed by default.
The notation approach used by the plugin simply helps to retain simplicity
and familiarity when supplying model notations on the command line. So, in
brief, we only have to type GTR+I+G as our input model:
qiime phylogeny iqtree \
--i-alignment masked-aligned-rep-seqs.qza \
--p-substitution-model 'GTR+I+G' \
--o-tree iqt-gtrig-tree.qza \
--verbose
stdout:
Plugin warning from phylogeny:
iqtree is deprecated and will be removed in a future version of this plugin.
IQ-TREE multicore version 2.0.3 for Mac OS X 64-bit built Apr 26 2020
Developed by Bui Quang Minh, Nguyen Lam Tung, Olga Chernomor,
Heiko Schmidt, Dominik Schrempf, Michael Woodhams.
Host: ghost.mggen.nau.edu (AVX, 16 GB RAM)
Command: iqtree -st DNA --runs 1 -s /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-4m8h5sho/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta -m GTR+I+G -pre /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpj8re7soa/q2iqtree -nt 1
Seed: 466955 (Using SPRNG - Scalable Parallel Random Number Generator)
Time: Sat Dec 5 17:56:35 2020
Kernel: AVX - 1 threads (8 CPU cores detected)
HINT: Use -nt option to specify number of threads because your CPU has 8 cores!
HINT: -nt AUTO will automatically determine the best number of threads to use.
Reading alignment file /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-4m8h5sho/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta ... Fasta format detected
Alignment most likely contains DNA/RNA sequences
Alignment has 20 sequences with 214 columns, 157 distinct patterns
104 parsimony-informative, 33 singleton sites, 77 constant sites
Gap/Ambiguity Composition p-value
1 e84fcf85a6a4065231dcf343bb862f1cb32abae6 40.65% passed 90.91%
2 5525fb6dab7b6577960147574465990c6df070ad 42.99% passed 99.80%
3 eb3564a35320b53cef22a77288838c7446357327 42.99% passed 25.49%
4 418f1d469f08c99976b313028cf6d3f18f61dd55 43.93% passed 71.86%
5 2e3b2c075901640c4de739473f9246385430b1ed 31.31% passed 90.76%
6 0469f8d819bd45c7638d1c8b0895270a05f34267 38.79% passed 92.82%
7 d162ed685007f5adede58f14aece31dfa1b60c18 40.65% passed 97.17%
8 1d45b2bce36cd995c5dcb755babf512e612ce8b9 41.59% passed 39.04%
9 5aba6bd9debc23ded7041ffdcfe5d68a427e8ce8 31.31% passed 87.21%
10 206656bec2abdbc4aee37a661ef5f4a62b5dd6ae 42.99% passed 85.00%
11 606c23e79bb730ad74e3c6efd72004c36674c17a 47.20% passed 87.78%
12 682e91d7e510ab134d0625234ad224f647c14eb0 41.59% passed 31.01%
13 6a36152105590b1eb095b9503e8f1f226fc73e43 39.25% passed 86.29%
14 6ca685c39a33bfbcb3123129e7af88d573df7d6f 42.06% failed 0.02%
15 8a1c44eb462ed58b21f3fdd72dd22bb657db2980 31.78% passed 54.40%
16 9b220cae8d375ea38b8b481cb95949cda8722fcb 36.92% passed 88.78%
17 aa4698d2e2b1fa71d08e2934a923aad7374a18f6 37.85% passed 90.52%
18 b31aa3f04bc9d5e2498d45cf1983dfaf09faa258 31.78% passed 72.69%
19 d44b129a6181f052198bda3813f0802a91612441 41.59% passed 41.69%
20 ed1acad8a98e8579a44370733533ad7d3fed8006 48.13% passed 58.15%
**** TOTAL 39.77% 1 sequences failed composition chi2 test (p-value<5%; df=3)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.001 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.100)
Thoroughly optimizing +I+G parameters from 10 start values...
Init pinv, alpha: 0.000, 1.000 / Estimate: 0.000, 1.242 / LogL: -1394.542
Init pinv, alpha: 0.040, 1.000 / Estimate: 0.010, 1.345 / LogL: -1394.882
Init pinv, alpha: 0.080, 1.000 / Estimate: 0.010, 1.351 / LogL: -1394.883
Init pinv, alpha: 0.120, 1.000 / Estimate: 0.009, 1.355 / LogL: -1394.867
Init pinv, alpha: 0.160, 1.000 / Estimate: 0.009, 1.351 / LogL: -1394.832
Init pinv, alpha: 0.200, 1.000 / Estimate: 0.009, 1.354 / LogL: -1394.859
Init pinv, alpha: 0.240, 1.000 / Estimate: 0.010, 1.355 / LogL: -1394.881
Init pinv, alpha: 0.280, 1.000 / Estimate: 0.008, 1.349 / LogL: -1394.822
Init pinv, alpha: 0.320, 1.000 / Estimate: 0.009, 1.348 / LogL: -1394.833
Init pinv, alpha: 0.360, 1.000 / Estimate: 0.009, 1.347 / LogL: -1394.843
Optimal pinv,alpha: 0.000, 1.242 / LogL: -1394.542
Parameters optimization took 0.454 sec
Computing ML distances based on estimated model parameters... 0.010 sec
Computing BIONJ tree...
0.001 seconds
Log-likelihood of BIONJ tree: -1392.914
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Generating 98 parsimony trees... 0.089 second
Computing log-likelihood of 98 initial trees ... 0.109 seconds
Current best score: -1392.914
Do NNI search on 20 best initial trees
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 1: -1387.266
Iteration 10 / LogL: -1393.565 / Time: 0h:0m:0s
Iteration 20 / LogL: -1387.282 / Time: 0h:0m:1s
Finish initializing candidate tree set (2)
Current best tree score: -1387.266 / CPU time: 0.619
Number of iterations: 20
--------------------------------------------------------------------
| OPTIMIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 22: -1387.167
UPDATE BEST LOG-LIKELIHOOD: -1387.167
Iteration 30 / LogL: -1387.167 / Time: 0h:0m:1s (0h:0m:4s left)
Iteration 40 / LogL: -1387.392 / Time: 0h:0m:1s (0h:0m:3s left)
Iteration 50 / LogL: -1387.168 / Time: 0h:0m:1s (0h:0m:2s left)
Iteration 60 / LogL: -1387.168 / Time: 0h:0m:2s (0h:0m:2s left)
Iteration 70 / LogL: -1387.210 / Time: 0h:0m:2s (0h:0m:1s left)
Iteration 80 / LogL: -1396.400 / Time: 0h:0m:2s (0h:0m:1s left)
Iteration 90 / LogL: -1387.182 / Time: 0h:0m:3s (0h:0m:1s left)
Iteration 100 / LogL: -1387.168 / Time: 0h:0m:3s (0h:0m:0s left)
Iteration 110 / LogL: -1387.177 / Time: 0h:0m:3s (0h:0m:0s left)
Iteration 120 / LogL: -1387.167 / Time: 0h:0m:3s (0h:0m:0s left)
TREE SEARCH COMPLETED AFTER 123 ITERATIONS / Time: 0h:0m:3s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.010)
1. Initial log-likelihood: -1387.167
Optimal log-likelihood: -1387.166
Rate parameters: A-C: 0.34484 A-G: 2.31013 A-T: 2.13040 C-G: 1.22723 C-T: 3.19829 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.000
Gamma shape alpha: 1.285
Parameters optimization took 1 rounds (0.004 sec)
BEST SCORE FOUND : -1387.166
Total tree length: 7.599
Total number of iterations: 123
CPU time used for tree search: 3.391 sec (0h:0m:3s)
Wall-clock time used for tree search: 3.394 sec (0h:0m:3s)
Total CPU time used: 3.873 sec (0h:0m:3s)
Total wall-clock time used: 3.878 sec (0h:0m:3s)
Analysis results written to:
IQ-TREE report: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpj8re7soa/q2iqtree.iqtree
Maximum-likelihood tree: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpj8re7soa/q2iqtree.treefile
Likelihood distances: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpj8re7soa/q2iqtree.mldist
Screen log file: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpj8re7soa/q2iqtree.log
Date and Time: Sat Dec 5 17:56:39 2020
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: iqtree -st DNA --runs 1 -s /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-4m8h5sho/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta -m GTR+I+G -pre /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpj8re7soa/q2iqtree -nt 1
Saved Phylogeny[Unrooted] to: iqt-gtrig-tree.qza
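The two notation rules from this section (drop the F term, write G4 as G) can be applied mechanically to a model name reported by ModelFinder. The following is a small sketch; to_plugin_model is a hypothetical helper, not something provided by the plugin, and it covers only the two rules described here.

```shell
# to_plugin_model MODEL
# Hypothetical helper illustrating the two rules described in this section:
#   1. remove the "+F" term (empirical base frequencies are implied), and
#   2. write "+G4" as "+G" (four rate categories are the default).
to_plugin_model() {
    printf '%s\n' "$1" | sed \
        -e 's/+F$//' \
        -e 's/+F+/+/' \
        -e 's/+G4$/+G/' \
        -e 's/+G4+/+G+/'
}

to_plugin_model 'GTR+F+I+G4'    # prints GTR+I+G
to_plugin_model 'HKY+F+G4'      # prints HKY+G
```

Whether the converted name is among the models the plugin accepts should still be confirmed against the choices listed for --p-substitution-model in qiime phylogeny iqtree --help.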
Let’s rerun the previous iqtree command and add the --p-fast option. This
option, only compatible with the iqtree method, resembles the fast search
performed by fasttree. 🏎️ Let’s also perform multiple tree searches and keep
the best of those trees (as we did earlier with the raxml --p-n-searches ...
command):
qiime phylogeny iqtree \
--i-alignment masked-aligned-rep-seqs.qza \
--p-substitution-model 'GTR+I+G' \
--p-fast \
--p-n-runs 10 \
--o-tree iqt-gtrig-fast-ms-tree.qza \
--verbose
stdout:
Plugin warning from phylogeny:
iqtree is deprecated and will be removed in a future version of this plugin.
IQ-TREE multicore version 2.0.3 for Mac OS X 64-bit built Apr 26 2020
Developed by Bui Quang Minh, Nguyen Lam Tung, Olga Chernomor,
Heiko Schmidt, Dominik Schrempf, Michael Woodhams.
Host: ghost.mggen.nau.edu (AVX, 16 GB RAM)
Command: iqtree -st DNA --runs 10 -s /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-8fnq9x92/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta -m GTR+I+G -pre /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmp7jkxv2bh/q2iqtree -nt 1 -fast
Seed: 809908 (Using SPRNG - Scalable Parallel Random Number Generator)
Time: Sat Dec 5 17:56:43 2020
Kernel: AVX - 1 threads (8 CPU cores detected)
HINT: Use -nt option to specify number of threads because your CPU has 8 cores!
HINT: -nt AUTO will automatically determine the best number of threads to use.
Reading alignment file /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-8fnq9x92/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta ... Fasta format detected
Alignment most likely contains DNA/RNA sequences
Alignment has 20 sequences with 214 columns, 157 distinct patterns
104 parsimony-informative, 33 singleton sites, 77 constant sites
Gap/Ambiguity Composition p-value
1 e84fcf85a6a4065231dcf343bb862f1cb32abae6 40.65% passed 90.91%
2 5525fb6dab7b6577960147574465990c6df070ad 42.99% passed 99.80%
3 eb3564a35320b53cef22a77288838c7446357327 42.99% passed 25.49%
4 418f1d469f08c99976b313028cf6d3f18f61dd55 43.93% passed 71.86%
5 2e3b2c075901640c4de739473f9246385430b1ed 31.31% passed 90.76%
6 0469f8d819bd45c7638d1c8b0895270a05f34267 38.79% passed 92.82%
7 d162ed685007f5adede58f14aece31dfa1b60c18 40.65% passed 97.17%
8 1d45b2bce36cd995c5dcb755babf512e612ce8b9 41.59% passed 39.04%
9 5aba6bd9debc23ded7041ffdcfe5d68a427e8ce8 31.31% passed 87.21%
10 206656bec2abdbc4aee37a661ef5f4a62b5dd6ae 42.99% passed 85.00%
11 606c23e79bb730ad74e3c6efd72004c36674c17a 47.20% passed 87.78%
12 682e91d7e510ab134d0625234ad224f647c14eb0 41.59% passed 31.01%
13 6a36152105590b1eb095b9503e8f1f226fc73e43 39.25% passed 86.29%
14 6ca685c39a33bfbcb3123129e7af88d573df7d6f 42.06% failed 0.02%
15 8a1c44eb462ed58b21f3fdd72dd22bb657db2980 31.78% passed 54.40%
16 9b220cae8d375ea38b8b481cb95949cda8722fcb 36.92% passed 88.78%
17 aa4698d2e2b1fa71d08e2934a923aad7374a18f6 37.85% passed 90.52%
18 b31aa3f04bc9d5e2498d45cf1983dfaf09faa258 31.78% passed 72.69%
19 d44b129a6181f052198bda3813f0802a91612441 41.59% passed 41.69%
20 ed1acad8a98e8579a44370733533ad7d3fed8006 48.13% passed 58.15%
**** TOTAL 39.77% 1 sequences failed composition chi2 test (p-value<5%; df=3)
---> START RUN NUMBER 1 (seed: 809908)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.001 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
Thoroughly optimizing +I+G parameters from 10 start values...
Init pinv, alpha: 0.000, 1.000 / Estimate: 0.000, 1.290 / LogL: -1395.208
Init pinv, alpha: 0.040, 1.000 / Estimate: 0.019, 1.361 / LogL: -1395.971
Init pinv, alpha: 0.080, 1.000 / Estimate: 0.023, 1.415 / LogL: -1396.188
Init pinv, alpha: 0.120, 1.000 / Estimate: 0.023, 1.430 / LogL: -1396.157
Init pinv, alpha: 0.160, 1.000 / Estimate: 0.026, 1.433 / LogL: -1396.340
Init pinv, alpha: 0.200, 1.000 / Estimate: 0.028, 1.436 / LogL: -1396.497
Init pinv, alpha: 0.240, 1.000 / Estimate: 0.023, 1.426 / LogL: -1396.261
Init pinv, alpha: 0.280, 1.000 / Estimate: 0.024, 1.430 / LogL: -1396.321
Init pinv, alpha: 0.320, 1.000 / Estimate: 0.025, 1.430 / LogL: -1396.366
Init pinv, alpha: 0.360, 1.000 / Estimate: 0.026, 1.434 / LogL: -1396.414
Optimal pinv,alpha: 0.000, 1.290 / LogL: -1395.208
Parameters optimization took 0.342 sec
Computing ML distances based on estimated model parameters... 0.009 sec
Computing BIONJ tree...
0.001 seconds
Log-likelihood of BIONJ tree: -1392.833
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1387.271
UPDATE BEST LOG-LIKELIHOOD: -1387.265
Finish initializing candidate tree set (3)
Current best tree score: -1387.265 / CPU time: 0.054
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1387.265
Optimal log-likelihood: -1387.255
Rate parameters: A-C: 0.33350 A-G: 2.25974 A-T: 2.13776 C-G: 1.16813 C-T: 3.30010 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.000
Gamma shape alpha: 1.319
Parameters optimization took 1 rounds (0.006 sec)
BEST SCORE FOUND : -1387.255
Total tree length: 6.745
Total number of iterations: 2
CPU time used for tree search: 0.054 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.055 sec (0h:0m:0s)
Total CPU time used: 0.424 sec (0h:0m:0s)
Total wall-clock time used: 0.428 sec (0h:0m:0s)
---> START RUN NUMBER 2 (seed: 810908)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.001 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1493.259
2. Current log-likelihood: -1403.078
3. Current log-likelihood: -1398.354
4. Current log-likelihood: -1396.979
5. Current log-likelihood: -1396.262
Optimal log-likelihood: -1395.753
Rate parameters: A-C: 0.24339 A-G: 2.10097 A-T: 1.98595 C-G: 1.09180 C-T: 2.82193 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.027
Gamma shape alpha: 1.355
Parameters optimization took 5 rounds (0.039 sec)
Computing ML distances based on estimated model parameters... 0.010 sec
WARNING: Some pairwise ML distances are too long (saturated)
Computing BIONJ tree...
0.001 seconds
Log-likelihood of BIONJ tree: -1393.985
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1387.952
Finish initializing candidate tree set (4)
Current best tree score: -1387.952 / CPU time: 0.080
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1387.952
2. Current log-likelihood: -1387.793
3. Current log-likelihood: -1387.673
4. Current log-likelihood: -1387.584
5. Current log-likelihood: -1387.515
6. Current log-likelihood: -1387.462
Optimal log-likelihood: -1387.420
Rate parameters: A-C: 0.33384 A-G: 2.25230 A-T: 2.12617 C-G: 1.16896 C-T: 3.25564 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.004
Gamma shape alpha: 1.358
Parameters optimization took 6 rounds (0.024 sec)
BEST SCORE FOUND : -1387.420
Total tree length: 6.702
Total number of iterations: 2
CPU time used for tree search: 0.080 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.080 sec (0h:0m:0s)
Total CPU time used: 0.171 sec (0h:0m:0s)
Total wall-clock time used: 0.173 sec (0h:0m:0s)
---> START RUN NUMBER 3 (seed: 811908)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.000 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1491.633
2. Current log-likelihood: -1402.007
3. Current log-likelihood: -1396.792
4. Current log-likelihood: -1395.393
5. Current log-likelihood: -1394.654
Optimal log-likelihood: -1394.081
Rate parameters: A-C: 0.28077 A-G: 2.37447 A-T: 2.10134 C-G: 1.20130 C-T: 3.28121 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.027
Gamma shape alpha: 1.386
Parameters optimization took 5 rounds (0.037 sec)
Computing ML distances based on estimated model parameters... 0.009 sec
Computing BIONJ tree...
0.001 seconds
Log-likelihood of BIONJ tree: -1393.810
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1388.217
BETTER TREE FOUND at iteration 2: -1388.188
Finish initializing candidate tree set (4)
Current best tree score: -1388.188 / CPU time: 0.056
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1388.188
2. Current log-likelihood: -1387.973
3. Current log-likelihood: -1387.830
4. Current log-likelihood: -1387.725
5. Current log-likelihood: -1387.645
6. Current log-likelihood: -1387.584
Optimal log-likelihood: -1387.534
Rate parameters: A-C: 0.36985 A-G: 2.31002 A-T: 2.11728 C-G: 1.22260 C-T: 3.27850 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.006
Gamma shape alpha: 1.332
Parameters optimization took 6 rounds (0.026 sec)
BEST SCORE FOUND : -1387.534
Total tree length: 7.502
Total number of iterations: 2
CPU time used for tree search: 0.056 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.056 sec (0h:0m:0s)
Total CPU time used: 0.144 sec (0h:0m:0s)
Total wall-clock time used: 0.147 sec (0h:0m:0s)
---> START RUN NUMBER 4 (seed: 812908)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.001 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1492.491
2. Current log-likelihood: -1404.606
3. Current log-likelihood: -1399.220
4. Current log-likelihood: -1397.821
5. Current log-likelihood: -1397.063
Optimal log-likelihood: -1396.484
Rate parameters: A-C: 0.24153 A-G: 2.03298 A-T: 1.94373 C-G: 1.02159 C-T: 2.79340 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.027
Gamma shape alpha: 1.438
Parameters optimization took 5 rounds (0.037 sec)
Computing ML distances based on estimated model parameters... 0.009 sec
Computing BIONJ tree...
0.001 seconds
Log-likelihood of BIONJ tree: -1394.014
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1388.197
UPDATE BEST LOG-LIKELIHOOD: -1388.190
Finish initializing candidate tree set (3)
Current best tree score: -1388.190 / CPU time: 0.055
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1388.190
2. Current log-likelihood: -1387.966
3. Current log-likelihood: -1387.806
4. Current log-likelihood: -1387.687
5. Current log-likelihood: -1387.596
6. Current log-likelihood: -1387.526
7. Current log-likelihood: -1387.471
Optimal log-likelihood: -1387.426
Rate parameters: A-C: 0.33230 A-G: 2.23569 A-T: 2.11033 C-G: 1.15918 C-T: 3.23266 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.004
Gamma shape alpha: 1.356
Parameters optimization took 7 rounds (0.029 sec)
BEST SCORE FOUND : -1387.426
Total tree length: 6.737
Total number of iterations: 2
CPU time used for tree search: 0.055 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.056 sec (0h:0m:0s)
Total CPU time used: 0.147 sec (0h:0m:0s)
Total wall-clock time used: 0.150 sec (0h:0m:0s)
---> START RUN NUMBER 5 (seed: 813908)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.000 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1495.863
2. Current log-likelihood: -1402.072
3. Current log-likelihood: -1396.809
4. Current log-likelihood: -1395.391
5. Current log-likelihood: -1394.657
Optimal log-likelihood: -1394.080
Rate parameters: A-C: 0.27275 A-G: 2.35291 A-T: 2.09125 C-G: 1.19606 C-T: 3.26639 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.027
Gamma shape alpha: 1.387
Parameters optimization took 5 rounds (0.039 sec)
Computing ML distances based on estimated model parameters... 0.009 sec
Computing BIONJ tree...
0.001 seconds
Log-likelihood of BIONJ tree: -1393.808
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1388.217
BETTER TREE FOUND at iteration 2: -1388.189
Finish initializing candidate tree set (4)
Current best tree score: -1388.189 / CPU time: 0.055
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1388.189
2. Current log-likelihood: -1387.974
3. Current log-likelihood: -1387.831
4. Current log-likelihood: -1387.725
5. Current log-likelihood: -1387.645
6. Current log-likelihood: -1387.584
Optimal log-likelihood: -1387.534
Rate parameters: A-C: 0.36986 A-G: 2.31017 A-T: 2.11745 C-G: 1.22267 C-T: 3.27881 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.006
Gamma shape alpha: 1.332
Parameters optimization took 6 rounds (0.028 sec)
BEST SCORE FOUND : -1387.534
Total tree length: 7.502
Total number of iterations: 2
CPU time used for tree search: 0.055 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.055 sec (0h:0m:0s)
Total CPU time used: 0.144 sec (0h:0m:0s)
Total wall-clock time used: 0.147 sec (0h:0m:0s)
---> START RUN NUMBER 6 (seed: 814908)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.000 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1496.306
2. Current log-likelihood: -1403.641
3. Current log-likelihood: -1398.531
4. Current log-likelihood: -1397.067
5. Current log-likelihood: -1396.244
6. Current log-likelihood: -1395.736
Optimal log-likelihood: -1395.357
Rate parameters: A-C: 0.22740 A-G: 2.00038 A-T: 1.90797 C-G: 1.02878 C-T: 2.75984 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.021
Gamma shape alpha: 1.340
Parameters optimization took 6 rounds (0.044 sec)
Computing ML distances based on estimated model parameters... 0.009 sec
Computing BIONJ tree...
0.001 seconds
Log-likelihood of BIONJ tree: -1393.769
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1387.988
Finish initializing candidate tree set (4)
Current best tree score: -1387.988 / CPU time: 0.053
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1387.988
2. Current log-likelihood: -1387.817
3. Current log-likelihood: -1387.688
4. Current log-likelihood: -1387.594
5. Current log-likelihood: -1387.523
6. Current log-likelihood: -1387.468
Optimal log-likelihood: -1387.424
Rate parameters: A-C: 0.32477 A-G: 2.23464 A-T: 2.10729 C-G: 1.15865 C-T: 3.22565 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.004
Gamma shape alpha: 1.360
Parameters optimization took 6 rounds (0.021 sec)
BEST SCORE FOUND : -1387.424
Total tree length: 6.697
Total number of iterations: 2
CPU time used for tree search: 0.053 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.053 sec (0h:0m:0s)
Total CPU time used: 0.143 sec (0h:0m:0s)
Total wall-clock time used: 0.145 sec (0h:0m:0s)
---> START RUN NUMBER 7 (seed: 815908)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.001 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1496.015
2. Current log-likelihood: -1403.630
3. Current log-likelihood: -1398.533
4. Current log-likelihood: -1397.077
5. Current log-likelihood: -1396.256
6. Current log-likelihood: -1395.746
Optimal log-likelihood: -1395.367
Rate parameters: A-C: 0.23677 A-G: 2.05007 A-T: 1.94889 C-G: 1.06764 C-T: 2.81222 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.021
Gamma shape alpha: 1.337
Parameters optimization took 6 rounds (0.042 sec)
Computing ML distances based on estimated model parameters... 0.009 sec
Computing BIONJ tree...
0.001 seconds
Log-likelihood of BIONJ tree: -1393.724
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1387.981
Finish initializing candidate tree set (4)
Current best tree score: -1387.981 / CPU time: 0.051
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1387.981
2. Current log-likelihood: -1387.812
3. Current log-likelihood: -1387.685
4. Current log-likelihood: -1387.592
5. Current log-likelihood: -1387.521
6. Current log-likelihood: -1387.466
Optimal log-likelihood: -1387.423
Rate parameters: A-C: 0.32763 A-G: 2.25273 A-T: 2.12567 C-G: 1.16858 C-T: 3.25529 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.004
Gamma shape alpha: 1.358
Parameters optimization took 6 rounds (0.026 sec)
BEST SCORE FOUND : -1387.423
Total tree length: 6.701
Total number of iterations: 2
CPU time used for tree search: 0.051 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.052 sec (0h:0m:0s)
Total CPU time used: 0.143 sec (0h:0m:0s)
Total wall-clock time used: 0.145 sec (0h:0m:0s)
---> START RUN NUMBER 8 (seed: 816908)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.001 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1493.259
2. Current log-likelihood: -1403.078
3. Current log-likelihood: -1398.354
4. Current log-likelihood: -1396.979
5. Current log-likelihood: -1396.262
Optimal log-likelihood: -1395.753
Rate parameters: A-C: 0.24339 A-G: 2.10097 A-T: 1.98596 C-G: 1.09180 C-T: 2.82193 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.027
Gamma shape alpha: 1.355
Parameters optimization took 5 rounds (0.040 sec)
Computing ML distances based on estimated model parameters... 0.010 sec
WARNING: Some pairwise ML distances are too long (saturated)
Computing BIONJ tree...
0.001 seconds
Log-likelihood of BIONJ tree: -1393.985
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1387.952
Finish initializing candidate tree set (4)
Current best tree score: -1387.952 / CPU time: 0.076
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1387.952
2. Current log-likelihood: -1387.793
3. Current log-likelihood: -1387.673
4. Current log-likelihood: -1387.584
5. Current log-likelihood: -1387.515
6. Current log-likelihood: -1387.462
Optimal log-likelihood: -1387.420
Rate parameters: A-C: 0.33384 A-G: 2.25230 A-T: 2.12617 C-G: 1.16896 C-T: 3.25564 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.004
Gamma shape alpha: 1.358
Parameters optimization took 6 rounds (0.024 sec)
BEST SCORE FOUND : -1387.420
Total tree length: 6.702
Total number of iterations: 2
CPU time used for tree search: 0.076 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.076 sec (0h:0m:0s)
Total CPU time used: 0.167 sec (0h:0m:0s)
Total wall-clock time used: 0.170 sec (0h:0m:0s)
---> START RUN NUMBER 9 (seed: 817908)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.001 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1491.925
2. Current log-likelihood: -1402.064
3. Current log-likelihood: -1396.813
4. Current log-likelihood: -1395.392
5. Current log-likelihood: -1394.652
Optimal log-likelihood: -1394.078
Rate parameters: A-C: 0.27467 A-G: 2.39505 A-T: 2.12238 C-G: 1.21030 C-T: 3.30514 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.027
Gamma shape alpha: 1.388
Parameters optimization took 5 rounds (0.037 sec)
Computing ML distances based on estimated model parameters... 0.010 sec
Computing BIONJ tree...
0.001 seconds
Log-likelihood of BIONJ tree: -1393.807
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1388.217
BETTER TREE FOUND at iteration 2: -1388.181
Finish initializing candidate tree set (4)
Current best tree score: -1388.181 / CPU time: 0.057
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1388.181
2. Current log-likelihood: -1387.974
3. Current log-likelihood: -1387.831
4. Current log-likelihood: -1387.725
5. Current log-likelihood: -1387.646
6. Current log-likelihood: -1387.584
Optimal log-likelihood: -1387.534
Rate parameters: A-C: 0.36872 A-G: 2.32249 A-T: 2.12948 C-G: 1.22911 C-T: 3.29732 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.006
Gamma shape alpha: 1.331
Parameters optimization took 6 rounds (0.026 sec)
BEST SCORE FOUND : -1387.534
Total tree length: 7.508
Total number of iterations: 2
CPU time used for tree search: 0.057 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.057 sec (0h:0m:0s)
Total CPU time used: 0.147 sec (0h:0m:0s)
Total wall-clock time used: 0.149 sec (0h:0m:0s)
---> START RUN NUMBER 10 (seed: 818908)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.001 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.500)
1. Initial log-likelihood: -1492.199
2. Current log-likelihood: -1404.591
3. Current log-likelihood: -1399.228
4. Current log-likelihood: -1397.831
5. Current log-likelihood: -1397.074
Optimal log-likelihood: -1396.495
Rate parameters: A-C: 0.24620 A-G: 2.08306 A-T: 1.99581 C-G: 1.06240 C-T: 2.85598 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.027
Gamma shape alpha: 1.432
Parameters optimization took 5 rounds (0.042 sec)
Computing ML distances based on estimated model parameters... 0.009 sec
Computing BIONJ tree...
0.001 seconds
Log-likelihood of BIONJ tree: -1393.972
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Do NNI search on 2 best initial trees
Estimate model parameters (epsilon = 0.500)
BETTER TREE FOUND at iteration 1: -1388.188
UPDATE BEST LOG-LIKELIHOOD: -1388.187
Finish initializing candidate tree set (3)
Current best tree score: -1388.187 / CPU time: 0.053
Number of iterations: 2
TREE SEARCH COMPLETED AFTER 2 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.050)
1. Initial log-likelihood: -1388.187
2. Current log-likelihood: -1387.966
3. Current log-likelihood: -1387.806
4. Current log-likelihood: -1387.687
5. Current log-likelihood: -1387.596
6. Current log-likelihood: -1387.525
7. Current log-likelihood: -1387.471
Optimal log-likelihood: -1387.426
Rate parameters: A-C: 0.33227 A-G: 2.23742 A-T: 2.11203 C-G: 1.16007 C-T: 3.23505 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.004
Gamma shape alpha: 1.356
Parameters optimization took 7 rounds (0.025 sec)
BEST SCORE FOUND : -1387.426
Total tree length: 6.737
Total number of iterations: 2
CPU time used for tree search: 0.053 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.053 sec (0h:0m:0s)
Total CPU time used: 0.146 sec (0h:0m:0s)
Total wall-clock time used: 0.148 sec (0h:0m:0s)
---> SUMMARIZE RESULTS FROM 10 RUNS
Run 1 gave best log-likelihood: -1387.255
Total CPU time for 10 runs: 1.788 seconds.
Total wall-clock time for 10 runs: 1.821 seconds.
Analysis results written to:
IQ-TREE report: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmp7jkxv2bh/q2iqtree.iqtree
Maximum-likelihood tree: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmp7jkxv2bh/q2iqtree.treefile
Trees from independent runs: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmp7jkxv2bh/q2iqtree.runtrees
Likelihood distances: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmp7jkxv2bh/q2iqtree.mldist
Screen log file: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmp7jkxv2bh/q2iqtree.log
Date and Time: Sat Dec 5 17:56:45 2020
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: iqtree -st DNA --runs 10 -s /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-8fnq9x92/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta -m GTR+I+G -pre /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmp7jkxv2bh/q2iqtree -nt 1 -fast
Saved Phylogeny[Unrooted] to: iqt-gtrig-fast-ms-tree.qza
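When a search is repeated over several independent runs, as in the output above, each run reports its own BEST SCORE FOUND line. The following is a minimal shell sketch for comparing those lines. It assumes you have captured the verbose output as text; three lines from the output above are embedded here as sample input so the snippet runs on its own. It reports the highest (least negative) log-likelihood.

```bash
#!/usr/bin/env bash
# Sample "BEST SCORE FOUND" lines copied from runs 3, 4 and 5 in the output above.
# In practice you would feed awk a saved log file instead (the file name is up to you).
log_lines='BEST SCORE FOUND : -1387.534
BEST SCORE FOUND : -1387.426
BEST SCORE FOUND : -1387.534'

# Compare numerically, but keep the original text of the best score so that
# no digits are lost to awk's default number formatting.
best_score=$(printf '%s\n' "$log_lines" | awk '
  /BEST SCORE FOUND/ {
    if (!seen || ($NF + 0) > max) { max = $NF + 0; best = $NF; seen = 1 }
  }
  END { print best }')

echo "Best log-likelihood among the sampled runs: $best_score"
```

IQ-TREE already prints a summary line naming the best run, so this is mainly useful when you want to compare scores across separately saved logs.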
Single branch tests
IQ-TREE provides access to a few single branch testing methods:
SH-aLRT via --p-alrt [INT >= 1000]
aBayes via --p-abayes [TRUE | FALSE]
local bootstrap test via --p-lbp [INT >= 1000]
Single branch tests are commonly used as an alternative to the bootstrapping approach we’ve discussed above, as they are substantially faster and often recommended when constructing large phylogenies (e.g. >10,000 taxa). All three of these methods can be applied simultaneously and viewed within iTOL as separate bootstrap support values. These values are always listed in the following order: alrt / lbp / abayes. We’ll go ahead and apply all of the branch tests in our next command, while specifying the same substitution model as above. Feel free to combine this with the --p-fast option. 😉
qiime phylogeny iqtree \
--i-alignment masked-aligned-rep-seqs.qza \
--p-alrt 1000 \
--p-abayes \
--p-lbp 1000 \
--p-substitution-model 'GTR+I+G' \
--o-tree iqt-sbt-tree.qza \
--verbose
stdout:
Plugin warning from phylogeny:
iqtree is deprecated and will be removed in a future version of this plugin.
IQ-TREE multicore version 2.0.3 for Mac OS X 64-bit built Apr 26 2020
Developed by Bui Quang Minh, Nguyen Lam Tung, Olga Chernomor,
Heiko Schmidt, Dominik Schrempf, Michael Woodhams.
Host: ghost.mggen.nau.edu (AVX, 16 GB RAM)
Command: iqtree -st DNA --runs 1 -s /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-_2gdfois/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta -m GTR+I+G -pre /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpg5l05a7_/q2iqtree -nt 1 -alrt 1000 -abayes -lbp 1000
Seed: 96193 (Using SPRNG - Scalable Parallel Random Number Generator)
Time: Sat Dec 5 17:56:50 2020
Kernel: AVX - 1 threads (8 CPU cores detected)
HINT: Use -nt option to specify number of threads because your CPU has 8 cores!
HINT: -nt AUTO will automatically determine the best number of threads to use.
Reading alignment file /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-_2gdfois/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta ... Fasta format detected
Alignment most likely contains DNA/RNA sequences
Alignment has 20 sequences with 214 columns, 157 distinct patterns
104 parsimony-informative, 33 singleton sites, 77 constant sites
Gap/Ambiguity Composition p-value
1 e84fcf85a6a4065231dcf343bb862f1cb32abae6 40.65% passed 90.91%
2 5525fb6dab7b6577960147574465990c6df070ad 42.99% passed 99.80%
3 eb3564a35320b53cef22a77288838c7446357327 42.99% passed 25.49%
4 418f1d469f08c99976b313028cf6d3f18f61dd55 43.93% passed 71.86%
5 2e3b2c075901640c4de739473f9246385430b1ed 31.31% passed 90.76%
6 0469f8d819bd45c7638d1c8b0895270a05f34267 38.79% passed 92.82%
7 d162ed685007f5adede58f14aece31dfa1b60c18 40.65% passed 97.17%
8 1d45b2bce36cd995c5dcb755babf512e612ce8b9 41.59% passed 39.04%
9 5aba6bd9debc23ded7041ffdcfe5d68a427e8ce8 31.31% passed 87.21%
10 206656bec2abdbc4aee37a661ef5f4a62b5dd6ae 42.99% passed 85.00%
11 606c23e79bb730ad74e3c6efd72004c36674c17a 47.20% passed 87.78%
12 682e91d7e510ab134d0625234ad224f647c14eb0 41.59% passed 31.01%
13 6a36152105590b1eb095b9503e8f1f226fc73e43 39.25% passed 86.29%
14 6ca685c39a33bfbcb3123129e7af88d573df7d6f 42.06% failed 0.02%
15 8a1c44eb462ed58b21f3fdd72dd22bb657db2980 31.78% passed 54.40%
16 9b220cae8d375ea38b8b481cb95949cda8722fcb 36.92% passed 88.78%
17 aa4698d2e2b1fa71d08e2934a923aad7374a18f6 37.85% passed 90.52%
18 b31aa3f04bc9d5e2498d45cf1983dfaf09faa258 31.78% passed 72.69%
19 d44b129a6181f052198bda3813f0802a91612441 41.59% passed 41.69%
20 ed1acad8a98e8579a44370733533ad7d3fed8006 48.13% passed 58.15%
**** TOTAL 39.77% 1 sequences failed composition chi2 test (p-value<5%; df=3)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.001 seconds
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.100)
Thoroughly optimizing +I+G parameters from 10 start values...
Init pinv, alpha: 0.000, 1.000 / Estimate: 0.000, 1.282 / LogL: -1392.553
Init pinv, alpha: 0.040, 1.000 / Estimate: 0.008, 1.377 / LogL: -1392.829
Init pinv, alpha: 0.080, 1.000 / Estimate: 0.009, 1.391 / LogL: -1392.898
Init pinv, alpha: 0.120, 1.000 / Estimate: 0.009, 1.388 / LogL: -1392.889
Init pinv, alpha: 0.160, 1.000 / Estimate: 0.008, 1.383 / LogL: -1392.853
Init pinv, alpha: 0.200, 1.000 / Estimate: 0.009, 1.384 / LogL: -1392.879
Init pinv, alpha: 0.240, 1.000 / Estimate: 0.007, 1.379 / LogL: -1392.828
Init pinv, alpha: 0.280, 1.000 / Estimate: 0.008, 1.382 / LogL: -1392.844
Init pinv, alpha: 0.320, 1.000 / Estimate: 0.008, 1.383 / LogL: -1392.849
Init pinv, alpha: 0.360, 1.000 / Estimate: 0.008, 1.384 / LogL: -1392.858
Optimal pinv,alpha: 0.000, 1.282 / LogL: -1392.553
Parameters optimization took 0.469 sec
Computing ML distances based on estimated model parameters... 0.009 sec
Computing BIONJ tree...
0.001 seconds
Log-likelihood of BIONJ tree: -1392.710
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Generating 98 parsimony trees... 0.090 second
Computing log-likelihood of 98 initial trees ... 0.110 seconds
Current best score: -1392.553
Do NNI search on 20 best initial trees
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 1: -1387.258
Iteration 10 / LogL: -1387.279 / Time: 0h:0m:0s
Iteration 20 / LogL: -1387.279 / Time: 0h:0m:1s
Finish initializing candidate tree set (2)
Current best tree score: -1387.258 / CPU time: 0.609
Number of iterations: 20
--------------------------------------------------------------------
| OPTIMIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Iteration 30 / LogL: -1396.564 / Time: 0h:0m:1s (0h:0m:3s left)
Iteration 40 / LogL: -1387.370 / Time: 0h:0m:1s (0h:0m:2s left)
Iteration 50 / LogL: -1387.350 / Time: 0h:0m:2s (0h:0m:2s left)
Iteration 60 / LogL: -1387.378 / Time: 0h:0m:2s (0h:0m:1s left)
Iteration 70 / LogL: -1387.351 / Time: 0h:0m:2s (0h:0m:1s left)
Iteration 80 / LogL: -1387.267 / Time: 0h:0m:2s (0h:0m:0s left)
Iteration 90 / LogL: -1387.350 / Time: 0h:0m:3s (0h:0m:0s left)
Iteration 100 / LogL: -1387.907 / Time: 0h:0m:3s (0h:0m:0s left)
TREE SEARCH COMPLETED AFTER 102 ITERATIONS / Time: 0h:0m:3s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.010)
1. Initial log-likelihood: -1387.258
Optimal log-likelihood: -1387.252
Rate parameters: A-C: 0.33095 A-G: 2.27170 A-T: 2.15058 C-G: 1.18098 C-T: 3.30169 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.000
Gamma shape alpha: 1.323
Parameters optimization took 1 rounds (0.006 sec)
BEST SCORE FOUND : -1387.252
Testing tree branches by SH-like aLRT with 1000 replicates...
Testing tree branches by local-BP test with 1000 replicates...
Testing tree branches by aBayes parametric test...
0.057 sec.
Total tree length: 6.716
Total number of iterations: 102
CPU time used for tree search: 3.092 sec (0h:0m:3s)
Wall-clock time used for tree search: 3.094 sec (0h:0m:3s)
Total CPU time used: 3.651 sec (0h:0m:3s)
Total wall-clock time used: 3.657 sec (0h:0m:3s)
Analysis results written to:
IQ-TREE report: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpg5l05a7_/q2iqtree.iqtree
Maximum-likelihood tree: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpg5l05a7_/q2iqtree.treefile
Likelihood distances: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpg5l05a7_/q2iqtree.mldist
Screen log file: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpg5l05a7_/q2iqtree.log
Date and Time: Sat Dec 5 17:56:53 2020
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: iqtree -st DNA --runs 1 -s /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-_2gdfois/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta -m GTR+I+G -pre /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpg5l05a7_/q2iqtree -nt 1 -alrt 1000 -abayes -lbp 1000
Saved Phylogeny[Unrooted] to: iqt-sbt-tree.qza
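As noted above, the three single branch test values are listed in the order alrt / lbp / abayes. The sketch below splits one such support label into named shell variables. The label value is made up for illustration, and the snippet assumes the values are separated by a forward slash, matching the order given above; inspect an exported tree to confirm the exact label format in your own results.

```bash
#!/usr/bin/env bash
# Hypothetical internal-node label; real values come from your own tree
# (for example after `qiime tools export --input-path iqt-sbt-tree.qza --output-path <dir>`).
label='98.7/97/0.999'

# Split on "/" in the order stated in the text: alrt / lbp / abayes.
IFS=/ read -r alrt lbp abayes <<< "$label"

printf 'SH-aLRT: %s\nlocal bootstrap: %s\naBayes: %s\n' "$alrt" "$lbp" "$abayes"
```

Naming the values this way makes it harder to misread one test's support as another's when comparing branches.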
Tip
IQ-TREE search settings.
There are quite a few adjustable parameters available for iqtree that can be modified to improve searches through “tree space” and prevent the search algorithms from getting stuck in local optima. One best practice in this regard is to adjust the following parameters: --p-perturb-nni-strength and --p-stop-iter (these map to the -pers and -nstop flags of iqtree, respectively). In brief, the larger the value for NNI (nearest-neighbor interchange) perturbation, the larger the jumps in “tree space”. This value should be set high enough to allow the search algorithm to avoid being trapped in local optima, but not so high that the search is haphazardly jumping around “tree space”. That is, like Goldilocks and the three 🐻s, you need to find a setting that is “just right”, or at least within a set of reasonable bounds. One way of assessing this is to do a few short trial runs using the --verbose flag. If you see that the likelihood values are jumping around too much, then lowering the value for --p-perturb-nni-strength may be warranted. As for the stopping criterion, i.e. --p-stop-iter, the higher this value, the more thorough your search in “tree space”. Be aware that increasing this value may also increase the run time. That is, the search will continue until it has sampled a number of trees, say 100 (default), without finding a better scoring tree. If a better tree is found, then the counter resets, and the search continues. These two parameters deserve special consideration when a given data set contains many short sequences, which is quite common for microbiome survey data. We can modify our original command to include these extra parameters with the recommended modifications for short sequences, i.e. a lower value for perturbation strength (shorter reads do not contain as much phylogenetic information, thus we should limit how far we jump around in “tree space”) and a larger number of stop iterations. See the IQ-TREE command reference for more details about default parameter settings.
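The stopping rule described above (keep searching until a set number of trees have been sampled without improvement, and reset the counter whenever a better tree is found) can be illustrated with a toy shell sketch. This is not IQ-TREE's actual implementation. The scores are invented, and stop_iter is set to 3 only to keep the demo short.

```bash
#!/usr/bin/env bash
# Toy illustration of a "stop after N iterations without improvement" rule.
stop_iter=3   # demo value only; plays the role of --p-stop-iter
# Made-up log-likelihoods, one per sampled tree (higher is better).
scores=(-1392.6 -1387.3 -1387.4 -1387.9 -1387.2 -1387.5 -1387.6 -1387.8 -1387.1)

result=$(printf '%s\n' "${scores[@]}" | awk -v stop="$stop_iter" '
  NR == 1 { best = $1; since = 0; next }
  {
    if (($1 + 0) > (best + 0)) { best = $1; since = 0 }  # better tree: reset counter
    else                       { since++ }                # no improvement: count it
    if (since >= stop) {
      print "stopped at iteration " NR ", best " best
      finished = 1
      exit
    }
  }
  END { if (!finished) print "ran out of candidates, best " best }')

echo "$result"
```

In this toy input, the improvement at the fifth tree resets the counter, so the search stops at the eighth tree and never reaches the better ninth score. Raising the stop value makes the search more thorough, at the cost of longer run times.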
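Finally, we’ll let iqtree perform the model testing by not specifying a substitution model. Note that the command below sets --p-n-cores 1, so it runs on a single core; as the hints in the log output point out, IQ-TREE can also determine the best number of threads automatically (-nt AUTO).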
qiime phylogeny iqtree \
--i-alignment masked-aligned-rep-seqs.qza \
--p-perturb-nni-strength 0.2 \
--p-stop-iter 200 \
--p-n-cores 1 \
--o-tree iqt-nnisi-fast-tree.qza \
--verbose
stdout:
Plugin warning from phylogeny:
iqtree is deprecated and will be removed in a future version of this plugin.
IQ-TREE multicore version 2.0.3 for Mac OS X 64-bit built Apr 26 2020
Developed by Bui Quang Minh, Nguyen Lam Tung, Olga Chernomor,
Heiko Schmidt, Dominik Schrempf, Michael Woodhams.
Host: ghost.mggen.nau.edu (AVX, 16 GB RAM)
Command: iqtree -st DNA --runs 1 -s /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-5b7g8smv/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta -m MFP -pre /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmp1x4214ja/q2iqtree -nt 1 -nstop 200 -pers 0.200000
Seed: 250130 (Using SPRNG - Scalable Parallel Random Number Generator)
Time: Sat Dec 5 17:56:58 2020
Kernel: AVX - 1 threads (8 CPU cores detected)
HINT: Use -nt option to specify number of threads because your CPU has 8 cores!
HINT: -nt AUTO will automatically determine the best number of threads to use.
Reading alignment file /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-5b7g8smv/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta ... Fasta format detected
Alignment most likely contains DNA/RNA sequences
Alignment has 20 sequences with 214 columns, 157 distinct patterns
104 parsimony-informative, 33 singleton sites, 77 constant sites
Gap/Ambiguity Composition p-value
1 e84fcf85a6a4065231dcf343bb862f1cb32abae6 40.65% passed 90.91%
2 5525fb6dab7b6577960147574465990c6df070ad 42.99% passed 99.80%
3 eb3564a35320b53cef22a77288838c7446357327 42.99% passed 25.49%
4 418f1d469f08c99976b313028cf6d3f18f61dd55 43.93% passed 71.86%
5 2e3b2c075901640c4de739473f9246385430b1ed 31.31% passed 90.76%
6 0469f8d819bd45c7638d1c8b0895270a05f34267 38.79% passed 92.82%
7 d162ed685007f5adede58f14aece31dfa1b60c18 40.65% passed 97.17%
8 1d45b2bce36cd995c5dcb755babf512e612ce8b9 41.59% passed 39.04%
9 5aba6bd9debc23ded7041ffdcfe5d68a427e8ce8 31.31% passed 87.21%
10 206656bec2abdbc4aee37a661ef5f4a62b5dd6ae 42.99% passed 85.00%
11 606c23e79bb730ad74e3c6efd72004c36674c17a 47.20% passed 87.78%
12 682e91d7e510ab134d0625234ad224f647c14eb0 41.59% passed 31.01%
13 6a36152105590b1eb095b9503e8f1f226fc73e43 39.25% passed 86.29%
14 6ca685c39a33bfbcb3123129e7af88d573df7d6f 42.06% failed 0.02%
15 8a1c44eb462ed58b21f3fdd72dd22bb657db2980 31.78% passed 54.40%
16 9b220cae8d375ea38b8b481cb95949cda8722fcb 36.92% passed 88.78%
17 aa4698d2e2b1fa71d08e2934a923aad7374a18f6 37.85% passed 90.52%
18 b31aa3f04bc9d5e2498d45cf1983dfaf09faa258 31.78% passed 72.69%
19 d44b129a6181f052198bda3813f0802a91612441 41.59% passed 41.69%
20 ed1acad8a98e8579a44370733533ad7d3fed8006 48.13% passed 58.15%
**** TOTAL 39.77% 1 sequences failed composition chi2 test (p-value<5%; df=3)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.001 seconds
Perform fast likelihood tree search using GTR+I+G model...
Estimate model parameters (epsilon = 5.000)
Perform nearest neighbor interchange...
Estimate model parameters (epsilon = 1.000)
1. Initial log-likelihood: -1391.885
2. Current log-likelihood: -1390.523
Optimal log-likelihood: -1389.769
Rate parameters: A-C: 0.33283 A-G: 2.23806 A-T: 2.09600 C-G: 1.18747 C-T: 3.18113 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.034
Gamma shape alpha: 1.359
Parameters optimization took 2 rounds (0.012 sec)
Time for fast ML tree search: 0.065 seconds
NOTE: ModelFinder requires 1 MB RAM!
ModelFinder will test up to 286 DNA models (sample size: 214) ...
No. Model -LnL df AIC AICc BIC
1 GTR+F 1405.418 45 2900.835 2925.478 3052.304
2 GTR+F+I 1403.836 46 2899.673 2925.565 3054.508
3 GTR+F+G4 1388.331 46 2868.662 2894.554 3023.497
4 GTR+F+I+G4 1388.704 47 2871.408 2898.589 3029.609
5 GTR+F+R2 1382.562 47 2859.124 2886.305 3017.325
6 GTR+F+R3 1382.602 49 2863.203 2893.081 3028.136
16 SYM+G4 1388.448 43 2862.897 2885.155 3007.634
18 SYM+R2 1384.041 44 2856.081 2879.513 3004.184
29 TVM+F+G4 1389.408 45 2868.815 2893.458 3020.284
31 TVM+F+R2 1384.278 46 2860.555 2886.448 3015.390
42 TVMe+G4 1388.431 42 2860.861 2881.984 3002.232
44 TVMe+R2 1384.070 43 2854.141 2876.399 2998.878
55 TIM3+F+G4 1392.277 44 2872.555 2895.987 3020.658
57 TIM3+F+R2 1385.911 45 2861.822 2886.465 3013.291
68 TIM3e+G4 1391.664 41 2865.328 2885.351 3003.333
70 TIM3e+R2 1386.836 42 2857.673 2878.795 2999.044
81 TIM2+F+G4 1395.130 44 2878.260 2901.692 3026.363
83 TIM2+F+R2 1388.182 45 2866.364 2891.007 3017.833
94 TIM2e+G4 1398.824 41 2879.647 2899.670 3017.652
96 TIM2e+R2 1393.016 42 2870.033 2891.156 3011.404
107 TIM+F+G4 1391.782 44 2871.563 2894.995 3019.666
109 TIM+F+R2 1385.369 45 2860.738 2885.381 3012.207
120 TIMe+G4 1396.021 41 2874.043 2894.066 3012.048
122 TIMe+R2 1390.461 42 2864.921 2886.044 3006.292
133 TPM3u+F+G4 1393.267 43 2872.534 2894.792 3017.271
135 TPM3u+F+R2 1387.637 44 2863.274 2886.706 3011.377
146 TPM3+F+G4 1393.267 43 2872.534 2894.792 3017.271
148 TPM3+F+R2 1387.637 44 2863.274 2886.706 3011.377
159 TPM2u+F+G4 1396.097 43 2878.193 2900.452 3022.930
161 TPM2u+F+R2 1389.794 44 2867.587 2891.019 3015.690
172 TPM2+F+G4 1396.097 43 2878.193 2900.452 3022.930
174 TPM2+F+R2 1389.794 44 2867.587 2891.019 3015.690
185 K3Pu+F+G4 1392.948 43 2871.895 2894.154 3016.632
187 K3Pu+F+R2 1387.034 44 2862.069 2885.501 3010.172
198 K3P+G4 1396.027 40 2872.053 2891.013 3006.692
200 K3P+R2 1390.510 41 2863.021 2883.044 3001.026
211 TN+F+G4 1395.483 43 2876.967 2899.225 3021.704
213 TN+F+R2 1388.576 44 2865.152 2888.584 3013.255
224 TNe+G4 1398.824 40 2877.647 2896.607 3012.286
226 TNe+R2 1393.016 41 2868.032 2888.056 3006.037
237 HKY+F+G4 1396.474 42 2876.947 2898.070 3018.318
239 HKY+F+R2 1390.188 43 2866.377 2888.636 3011.114
250 K2P+G4 1398.839 39 2875.678 2893.609 3006.951
252 K2P+R2 1393.063 40 2866.127 2885.086 3000.766
263 F81+F+G4 1406.476 41 2894.952 2914.975 3032.957
265 F81+F+R2 1401.149 42 2886.297 2907.420 3027.668
276 JC+G4 1408.763 38 2893.526 2910.464 3021.434
278 JC+R2 1403.898 39 2885.796 2903.727 3017.069
Akaike Information Criterion: TVMe+R2
Corrected Akaike Information Criterion: TVMe+R2
Bayesian Information Criterion: TVMe+R2
Best-fit model: TVMe+R2 chosen according to BIC
All model information printed to /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmp1x4214ja/q2iqtree.model.gz
CPU time for ModelFinder: 0.627 seconds (0h:0m:0s)
Wall-clock time for ModelFinder: 0.631 seconds (0h:0m:0s)
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.100)
1. Initial log-likelihood: -1384.070
Optimal log-likelihood: -1384.067
Rate parameters: A-C: 0.20495 A-G: 1.86949 A-T: 1.46520 C-G: 0.73846 C-T: 1.86949 G-T: 1.00000
Base frequencies: A: 0.250 C: 0.250 G: 0.250 T: 0.250
Site proportion and rates: (0.716,0.399) (0.284,2.516)
Parameters optimization took 1 rounds (0.004 sec)
Computing ML distances based on estimated model parameters... 0.006 sec
Computing BIONJ tree...
0.001 seconds
Log-likelihood of BIONJ tree: -1389.354
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Generating 98 parsimony trees... 0.087 second
Computing log-likelihood of 98 initial trees ... 0.079 seconds
Current best score: -1384.067
Do NNI search on 20 best initial trees
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 1: -1382.304
UPDATE BEST LOG-LIKELIHOOD: -1382.302
Iteration 10 / LogL: -1382.307 / Time: 0h:0m:0s
UPDATE BEST LOG-LIKELIHOOD: -1382.300
Iteration 20 / LogL: -1382.310 / Time: 0h:0m:0s
Finish initializing candidate tree set (2)
Current best tree score: -1382.300 / CPU time: 0.428
Number of iterations: 20
--------------------------------------------------------------------
| OPTIMIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 22: -1382.087
UPDATE BEST LOG-LIKELIHOOD: -1382.087
Iteration 30 / LogL: -1382.863 / Time: 0h:0m:0s (0h:0m:3s left)
Iteration 40 / LogL: -1383.091 / Time: 0h:0m:0s (0h:0m:3s left)
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 44: -1382.007
Iteration 50 / LogL: -1390.345 / Time: 0h:0m:0s (0h:0m:3s left)
UPDATE BEST LOG-LIKELIHOOD: -1382.006
UPDATE BEST LOG-LIKELIHOOD: -1382.006
Iteration 60 / LogL: -1382.482 / Time: 0h:0m:0s (0h:0m:2s left)
BETTER TREE FOUND at iteration 68: -1382.005
Iteration 70 / LogL: -1382.038 / Time: 0h:0m:1s (0h:0m:2s left)
Iteration 80 / LogL: -1382.021 / Time: 0h:0m:1s (0h:0m:2s left)
Iteration 90 / LogL: -1382.021 / Time: 0h:0m:1s (0h:0m:2s left)
UPDATE BEST LOG-LIKELIHOOD: -1382.005
UPDATE BEST LOG-LIKELIHOOD: -1382.005
Iteration 100 / LogL: -1382.090 / Time: 0h:0m:1s (0h:0m:2s left)
Iteration 110 / LogL: -1392.448 / Time: 0h:0m:1s (0h:0m:2s left)
Iteration 120 / LogL: -1382.017 / Time: 0h:0m:1s (0h:0m:1s left)
UPDATE BEST LOG-LIKELIHOOD: -1382.005
UPDATE BEST LOG-LIKELIHOOD: -1382.005
Iteration 130 / LogL: -1382.099 / Time: 0h:0m:1s (0h:0m:1s left)
UPDATE BEST LOG-LIKELIHOOD: -1382.005
Iteration 140 / LogL: -1382.006 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 150 / LogL: -1382.017 / Time: 0h:0m:1s (0h:0m:1s left)
Iteration 160 / LogL: -1382.005 / Time: 0h:0m:1s (0h:0m:1s left)
UPDATE BEST LOG-LIKELIHOOD: -1382.005
Iteration 170 / LogL: -1382.093 / Time: 0h:0m:2s (0h:0m:1s left)
Iteration 180 / LogL: -1382.013 / Time: 0h:0m:2s (0h:0m:1s left)
Iteration 190 / LogL: -1382.016 / Time: 0h:0m:2s (0h:0m:0s left)
Iteration 200 / LogL: -1382.063 / Time: 0h:0m:2s (0h:0m:0s left)
Iteration 210 / LogL: -1382.017 / Time: 0h:0m:2s (0h:0m:0s left)
Iteration 220 / LogL: -1382.079 / Time: 0h:0m:2s (0h:0m:0s left)
Iteration 230 / LogL: -1388.681 / Time: 0h:0m:2s (0h:0m:0s left)
Iteration 240 / LogL: -1382.018 / Time: 0h:0m:2s (0h:0m:0s left)
UPDATE BEST LOG-LIKELIHOOD: -1382.005
Iteration 250 / LogL: -1382.017 / Time: 0h:0m:2s (0h:0m:0s left)
UPDATE BEST LOG-LIKELIHOOD: -1382.005
Iteration 260 / LogL: -1382.005 / Time: 0h:0m:3s (0h:0m:0s left)
TREE SEARCH COMPLETED AFTER 269 ITERATIONS / Time: 0h:0m:3s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.010)
1. Initial log-likelihood: -1382.005
Optimal log-likelihood: -1382.003
Rate parameters: A-C: 0.19380 A-G: 1.85586 A-T: 1.54657 C-G: 0.77853 C-T: 1.85586 G-T: 1.00000
Base frequencies: A: 0.250 C: 0.250 G: 0.250 T: 0.250
Site proportion and rates: (0.724,0.408) (0.276,2.551)
Parameters optimization took 1 rounds (0.005 sec)
BEST SCORE FOUND : -1382.003
Total tree length: 7.118
Total number of iterations: 269
CPU time used for tree search: 3.065 sec (0h:0m:3s)
Wall-clock time used for tree search: 3.071 sec (0h:0m:3s)
Total CPU time used: 3.095 sec (0h:0m:3s)
Total wall-clock time used: 3.103 sec (0h:0m:3s)
Analysis results written to:
IQ-TREE report: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmp1x4214ja/q2iqtree.iqtree
Maximum-likelihood tree: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmp1x4214ja/q2iqtree.treefile
Likelihood distances: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmp1x4214ja/q2iqtree.mldist
Screen log file: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmp1x4214ja/q2iqtree.log
Date and Time: Sat Dec 5 17:57:01 2020
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: iqtree -st DNA --runs 1 -s /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-5b7g8smv/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta -m MFP -pre /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmp1x4214ja/q2iqtree -nt 1 -nstop 200 -pers 0.200000
Saved Phylogeny[Unrooted] to: iqt-nnisi-fast-tree.qza
iqtree-ultrafast-bootstrap¶
As discussed in the raxml-rapid-bootstrap section above, we can also use IQ-TREE to evaluate how well the splits / bipartitions within our phylogeny are supported, via the ultrafast bootstrap algorithm. Below, we’ll apply the plugin’s ultrafast bootstrap command with automatic model selection (MFP), 1000 bootstrap replicates (the minimum required), the same generally suggested parameters for constructing a phylogeny from short sequences, and a single CPU core (--p-n-cores 1):
qiime phylogeny iqtree-ultrafast-bootstrap \
--i-alignment masked-aligned-rep-seqs.qza \
--p-perturb-nni-strength 0.2 \
--p-stop-iter 200 \
--p-n-cores 1 \
--o-tree iqt-nnisi-bootstrap-tree.qza \
--verbose
stdout:
Plugin warning from phylogeny:
iqtree-ultrafast-bootstrap is deprecated and will be removed in a future version of this plugin.
IQ-TREE multicore version 2.0.3 for Mac OS X 64-bit built Apr 26 2020
Developed by Bui Quang Minh, Nguyen Lam Tung, Olga Chernomor,
Heiko Schmidt, Dominik Schrempf, Michael Woodhams.
Host: ghost.mggen.nau.edu (AVX, 16 GB RAM)
Command: iqtree -bb 1000 -st DNA --runs 1 -s /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-4lxozo3j/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta -m MFP -pre /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmptx_hwp5n/q2iqtreeufboot -nt 1 -nstop 200 -pers 0.200000
Seed: 459268 (Using SPRNG - Scalable Parallel Random Number Generator)
Time: Sat Dec 5 17:57:06 2020
Kernel: AVX - 1 threads (8 CPU cores detected)
HINT: Use -nt option to specify number of threads because your CPU has 8 cores!
HINT: -nt AUTO will automatically determine the best number of threads to use.
Reading alignment file /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-4lxozo3j/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta ... Fasta format detected
Alignment most likely contains DNA/RNA sequences
Alignment has 20 sequences with 214 columns, 157 distinct patterns
104 parsimony-informative, 33 singleton sites, 77 constant sites
Gap/Ambiguity Composition p-value
1 e84fcf85a6a4065231dcf343bb862f1cb32abae6 40.65% passed 90.91%
2 5525fb6dab7b6577960147574465990c6df070ad 42.99% passed 99.80%
3 eb3564a35320b53cef22a77288838c7446357327 42.99% passed 25.49%
4 418f1d469f08c99976b313028cf6d3f18f61dd55 43.93% passed 71.86%
5 2e3b2c075901640c4de739473f9246385430b1ed 31.31% passed 90.76%
6 0469f8d819bd45c7638d1c8b0895270a05f34267 38.79% passed 92.82%
7 d162ed685007f5adede58f14aece31dfa1b60c18 40.65% passed 97.17%
8 1d45b2bce36cd995c5dcb755babf512e612ce8b9 41.59% passed 39.04%
9 5aba6bd9debc23ded7041ffdcfe5d68a427e8ce8 31.31% passed 87.21%
10 206656bec2abdbc4aee37a661ef5f4a62b5dd6ae 42.99% passed 85.00%
11 606c23e79bb730ad74e3c6efd72004c36674c17a 47.20% passed 87.78%
12 682e91d7e510ab134d0625234ad224f647c14eb0 41.59% passed 31.01%
13 6a36152105590b1eb095b9503e8f1f226fc73e43 39.25% passed 86.29%
14 6ca685c39a33bfbcb3123129e7af88d573df7d6f 42.06% failed 0.02%
15 8a1c44eb462ed58b21f3fdd72dd22bb657db2980 31.78% passed 54.40%
16 9b220cae8d375ea38b8b481cb95949cda8722fcb 36.92% passed 88.78%
17 aa4698d2e2b1fa71d08e2934a923aad7374a18f6 37.85% passed 90.52%
18 b31aa3f04bc9d5e2498d45cf1983dfaf09faa258 31.78% passed 72.69%
19 d44b129a6181f052198bda3813f0802a91612441 41.59% passed 41.69%
20 ed1acad8a98e8579a44370733533ad7d3fed8006 48.13% passed 58.15%
**** TOTAL 39.77% 1 sequences failed composition chi2 test (p-value<5%; df=3)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.001 seconds
Perform fast likelihood tree search using GTR+I+G model...
Estimate model parameters (epsilon = 5.000)
Perform nearest neighbor interchange...
Estimate model parameters (epsilon = 1.000)
1. Initial log-likelihood: -1391.885
2. Current log-likelihood: -1390.523
Optimal log-likelihood: -1389.769
Rate parameters: A-C: 0.33284 A-G: 2.23808 A-T: 2.09601 C-G: 1.18749 C-T: 3.18112 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.034
Gamma shape alpha: 1.359
Parameters optimization took 2 rounds (0.012 sec)
Time for fast ML tree search: 0.063 seconds
NOTE: ModelFinder requires 1 MB RAM!
ModelFinder will test up to 286 DNA models (sample size: 214) ...
No. Model -LnL df AIC AICc BIC
1 GTR+F 1405.418 45 2900.835 2925.478 3052.304
2 GTR+F+I 1403.836 46 2899.673 2925.565 3054.508
3 GTR+F+G4 1388.331 46 2868.662 2894.554 3023.497
4 GTR+F+I+G4 1388.704 47 2871.408 2898.589 3029.609
5 GTR+F+R2 1382.562 47 2859.124 2886.305 3017.325
6 GTR+F+R3 1382.602 49 2863.203 2893.081 3028.136
16 SYM+G4 1388.448 43 2862.897 2885.155 3007.634
18 SYM+R2 1384.041 44 2856.081 2879.513 3004.184
29 TVM+F+G4 1389.408 45 2868.815 2893.458 3020.284
31 TVM+F+R2 1384.278 46 2860.555 2886.448 3015.390
42 TVMe+G4 1388.431 42 2860.861 2881.984 3002.232
44 TVMe+R2 1384.070 43 2854.141 2876.399 2998.878
55 TIM3+F+G4 1392.277 44 2872.555 2895.987 3020.658
57 TIM3+F+R2 1385.911 45 2861.822 2886.465 3013.291
68 TIM3e+G4 1391.664 41 2865.328 2885.351 3003.333
70 TIM3e+R2 1386.836 42 2857.673 2878.795 2999.044
81 TIM2+F+G4 1395.130 44 2878.260 2901.692 3026.363
83 TIM2+F+R2 1388.182 45 2866.364 2891.007 3017.833
94 TIM2e+G4 1398.824 41 2879.647 2899.670 3017.652
96 TIM2e+R2 1393.016 42 2870.033 2891.156 3011.404
107 TIM+F+G4 1391.782 44 2871.563 2894.995 3019.666
109 TIM+F+R2 1385.369 45 2860.738 2885.381 3012.207
120 TIMe+G4 1396.021 41 2874.043 2894.066 3012.048
122 TIMe+R2 1390.461 42 2864.921 2886.044 3006.292
133 TPM3u+F+G4 1393.267 43 2872.534 2894.792 3017.271
135 TPM3u+F+R2 1387.637 44 2863.274 2886.706 3011.377
146 TPM3+F+G4 1393.267 43 2872.534 2894.792 3017.271
148 TPM3+F+R2 1387.637 44 2863.274 2886.706 3011.377
159 TPM2u+F+G4 1396.097 43 2878.193 2900.452 3022.930
161 TPM2u+F+R2 1389.794 44 2867.587 2891.019 3015.690
172 TPM2+F+G4 1396.097 43 2878.193 2900.452 3022.930
174 TPM2+F+R2 1389.794 44 2867.587 2891.019 3015.690
185 K3Pu+F+G4 1392.948 43 2871.895 2894.154 3016.632
187 K3Pu+F+R2 1387.034 44 2862.069 2885.501 3010.172
198 K3P+G4 1396.027 40 2872.053 2891.013 3006.692
200 K3P+R2 1390.510 41 2863.021 2883.044 3001.026
211 TN+F+G4 1395.483 43 2876.967 2899.225 3021.704
213 TN+F+R2 1388.576 44 2865.152 2888.584 3013.255
224 TNe+G4 1398.824 40 2877.647 2896.607 3012.286
226 TNe+R2 1393.016 41 2868.032 2888.056 3006.037
237 HKY+F+G4 1396.474 42 2876.947 2898.070 3018.318
239 HKY+F+R2 1390.188 43 2866.377 2888.636 3011.114
250 K2P+G4 1398.839 39 2875.678 2893.609 3006.951
252 K2P+R2 1393.063 40 2866.127 2885.086 3000.766
263 F81+F+G4 1406.476 41 2894.952 2914.975 3032.957
265 F81+F+R2 1401.149 42 2886.297 2907.420 3027.668
276 JC+G4 1408.763 38 2893.526 2910.464 3021.434
278 JC+R2 1403.898 39 2885.796 2903.727 3017.069
Akaike Information Criterion: TVMe+R2
Corrected Akaike Information Criterion: TVMe+R2
Bayesian Information Criterion: TVMe+R2
Best-fit model: TVMe+R2 chosen according to BIC
All model information printed to /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmptx_hwp5n/q2iqtreeufboot.model.gz
CPU time for ModelFinder: 0.626 seconds (0h:0m:0s)
Wall-clock time for ModelFinder: 0.630 seconds (0h:0m:0s)
Generating 1000 samples for ultrafast bootstrap (seed: 459268)...
NOTE: 0 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.100)
1. Initial log-likelihood: -1384.070
Optimal log-likelihood: -1384.067
Rate parameters: A-C: 0.20495 A-G: 1.86949 A-T: 1.46520 C-G: 0.73846 C-T: 1.86949 G-T: 1.00000
Base frequencies: A: 0.250 C: 0.250 G: 0.250 T: 0.250
Site proportion and rates: (0.716,0.399) (0.284,2.516)
Parameters optimization took 1 rounds (0.004 sec)
Computing ML distances based on estimated model parameters... 0.006 sec
Computing BIONJ tree...
0.001 seconds
Log-likelihood of BIONJ tree: -1389.354
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Generating 98 parsimony trees... 0.086 second
Computing log-likelihood of 95 initial trees ... 0.076 seconds
Current best score: -1384.067
Do NNI search on 20 best initial trees
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 1: -1382.304
UPDATE BEST LOG-LIKELIHOOD: -1382.302
UPDATE BEST LOG-LIKELIHOOD: -1382.298
Iteration 10 / LogL: -1382.310 / Time: 0h:0m:0s
Iteration 20 / LogL: -1382.313 / Time: 0h:0m:0s
Finish initializing candidate tree set (3)
Current best tree score: -1382.298 / CPU time: 0.601
Number of iterations: 20
--------------------------------------------------------------------
| OPTIMIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Iteration 30 / LogL: -1384.533 / Time: 0h:0m:0s (0h:0m:4s left)
Estimate model parameters (epsilon = 0.100)
UPDATE BEST LOG-LIKELIHOOD: -1382.088
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 33: -1382.004
Iteration 40 / LogL: -1382.031 / Time: 0h:0m:1s (0h:0m:4s left)
Iteration 50 / LogL: -1382.042 / Time: 0h:0m:1s (0h:0m:4s left)
Log-likelihood cutoff on original alignment: -1404.354
Iteration 60 / LogL: -1382.904 / Time: 0h:0m:1s (0h:0m:4s left)
Iteration 70 / LogL: -1382.901 / Time: 0h:0m:1s (0h:0m:3s left)
Iteration 80 / LogL: -1382.004 / Time: 0h:0m:1s (0h:0m:3s left)
Iteration 90 / LogL: -1382.417 / Time: 0h:0m:2s (0h:0m:3s left)
UPDATE BEST LOG-LIKELIHOOD: -1382.003
Iteration 100 / LogL: -1382.049 / Time: 0h:0m:2s (0h:0m:2s left)
Log-likelihood cutoff on original alignment: -1404.537
NOTE: Bootstrap correlation coefficient of split occurrence frequencies: 0.999
Iteration 110 / LogL: -1382.017 / Time: 0h:0m:2s (0h:0m:2s left)
Iteration 120 / LogL: -1382.053 / Time: 0h:0m:2s (0h:0m:2s left)
UPDATE BEST LOG-LIKELIHOOD: -1382.003
Iteration 130 / LogL: -1382.015 / Time: 0h:0m:2s (0h:0m:2s left)
Iteration 140 / LogL: -1382.117 / Time: 0h:0m:2s (0h:0m:1s left)
Iteration 150 / LogL: -1382.024 / Time: 0h:0m:3s (0h:0m:1s left)
Log-likelihood cutoff on original alignment: -1404.354
Iteration 160 / LogL: -1382.007 / Time: 0h:0m:3s (0h:0m:1s left)
Iteration 170 / LogL: -1382.146 / Time: 0h:0m:3s (0h:0m:1s left)
Iteration 180 / LogL: -1382.879 / Time: 0h:0m:3s (0h:0m:1s left)
Iteration 190 / LogL: -1382.486 / Time: 0h:0m:3s (0h:0m:0s left)
Iteration 200 / LogL: -1382.174 / Time: 0h:0m:4s (0h:0m:0s left)
Log-likelihood cutoff on original alignment: -1404.354
NOTE: Bootstrap correlation coefficient of split occurrence frequencies: 0.998
Iteration 210 / LogL: -1382.091 / Time: 0h:0m:4s (0h:0m:1s left)
Iteration 220 / LogL: -1382.006 / Time: 0h:0m:4s (0h:0m:1s left)
UPDATE BEST LOG-LIKELIHOOD: -1382.003
Iteration 230 / LogL: -1382.091 / Time: 0h:0m:4s (0h:0m:1s left)
TREE SEARCH COMPLETED AFTER 234 ITERATIONS / Time: 0h:0m:4s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.010)
1. Initial log-likelihood: -1382.003
Optimal log-likelihood: -1382.002
Rate parameters: A-C: 0.19145 A-G: 1.84354 A-T: 1.53395 C-G: 0.77258 C-T: 1.84354 G-T: 1.00000
Base frequencies: A: 0.250 C: 0.250 G: 0.250 T: 0.250
Site proportion and rates: (0.724,0.409) (0.276,2.551)
Parameters optimization took 1 rounds (0.004 sec)
BEST SCORE FOUND : -1382.002
Creating bootstrap support values...
Split supports printed to NEXUS file /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmptx_hwp5n/q2iqtreeufboot.splits.nex
Total tree length: 7.115
Total number of iterations: 234
CPU time used for tree search: 4.803 sec (0h:0m:4s)
Wall-clock time used for tree search: 4.817 sec (0h:0m:4s)
Total CPU time used: 4.908 sec (0h:0m:4s)
Total wall-clock time used: 4.928 sec (0h:0m:4s)
Computing bootstrap consensus tree...
Reading input file /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmptx_hwp5n/q2iqtreeufboot.splits.nex...
20 taxa and 159 splits.
Consensus tree written to /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmptx_hwp5n/q2iqtreeufboot.contree
Reading input trees file /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmptx_hwp5n/q2iqtreeufboot.contree
Log-likelihood of consensus tree: -1382.002
Analysis results written to:
IQ-TREE report: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmptx_hwp5n/q2iqtreeufboot.iqtree
Maximum-likelihood tree: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmptx_hwp5n/q2iqtreeufboot.treefile
Likelihood distances: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmptx_hwp5n/q2iqtreeufboot.mldist
Ultrafast bootstrap approximation results written to:
Split support values: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmptx_hwp5n/q2iqtreeufboot.splits.nex
Consensus tree: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmptx_hwp5n/q2iqtreeufboot.contree
Screen log file: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmptx_hwp5n/q2iqtreeufboot.log
Date and Time: Sat Dec 5 17:57:12 2020
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: iqtree -bb 1000 -st DNA --runs 1 -s /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-4lxozo3j/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta -m MFP -pre /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmptx_hwp5n/q2iqtreeufboot -nt 1 -nstop 200 -pers 0.200000
Saved Phylogeny[Unrooted] to: iqt-nnisi-bootstrap-tree.qza
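The ModelFinder table in the log above ranks candidate models by AIC, AICc, and BIC. As a rough illustration of where those columns come from, the sketch below recomputes them from the -LnL and df columns using the standard textbook definitions, with the sample size of 214 reported in the log. The function name information_criteria is made up for this example, and this is not IQ-TREE's own code.

```python
import math

def information_criteria(neg_lnl, df, n_sites):
    """Return (AIC, AICc, BIC) from -LnL, the number of free parameters, and sample size."""
    aic = 2.0 * neg_lnl + 2.0 * df
    # Small-sample correction added on top of AIC.
    aicc = aic + (2.0 * df * (df + 1)) / (n_sites - df - 1)
    bic = 2.0 * neg_lnl + df * math.log(n_sites)
    return aic, aicc, bic

# -LnL and df values copied from two rows of the ModelFinder table above.
rows = {"TVMe+R2": (1384.070, 43), "GTR+F+R2": (1382.562, 47)}
for model, (neg_lnl, df) in rows.items():
    aic, aicc, bic = information_criteria(neg_lnl, df, n_sites=214)
    print(f"{model}: AIC={aic:.3f} AICc={aicc:.3f} BIC={bic:.3f}")
```

The recomputed values should agree with the table up to rounding of the printed -LnL. TVMe+R2 has the lowest BIC, which is consistent with it being reported as the best-fit model.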
Perform single branch tests alongside ufboot¶
We can also apply single branch test methods concurrently with ultrafast bootstrapping. The support values will always be represented in the following order: alrt / lbp / abayes / ufboot. Again, these values can be seen as separately listed bootstrap values in iTOL. We’ll also specify a substitution model (GTR+I+G), as we did earlier.
qiime phylogeny iqtree-ultrafast-bootstrap \
--i-alignment masked-aligned-rep-seqs.qza \
--p-perturb-nni-strength 0.2 \
--p-stop-iter 200 \
--p-n-cores 1 \
--p-alrt 1000 \
--p-abayes \
--p-lbp 1000 \
--p-substitution-model 'GTR+I+G' \
--o-tree iqt-nnisi-bootstrap-sbt-gtrig-tree.qza \
--verbose
stdout:
Plugin warning from phylogeny:
iqtree-ultrafast-bootstrap is deprecated and will be removed in a future version of this plugin.
IQ-TREE multicore version 2.0.3 for Mac OS X 64-bit built Apr 26 2020
Developed by Bui Quang Minh, Nguyen Lam Tung, Olga Chernomor,
Heiko Schmidt, Dominik Schrempf, Michael Woodhams.
Host: ghost.mggen.nau.edu (AVX, 16 GB RAM)
Command: iqtree -bb 1000 -st DNA --runs 1 -s /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-mh1mg7rh/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta -m GTR+I+G -pre /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpmxnxr66z/q2iqtreeufboot -nt 1 -alrt 1000 -abayes -lbp 1000 -nstop 200 -pers 0.200000
Seed: 498303 (Using SPRNG - Scalable Parallel Random Number Generator)
Time: Sat Dec 5 17:57:16 2020
Kernel: AVX - 1 threads (8 CPU cores detected)
HINT: Use -nt option to specify number of threads because your CPU has 8 cores!
HINT: -nt AUTO will automatically determine the best number of threads to use.
Reading alignment file /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-mh1mg7rh/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta ... Fasta format detected
Alignment most likely contains DNA/RNA sequences
Alignment has 20 sequences with 214 columns, 157 distinct patterns
104 parsimony-informative, 33 singleton sites, 77 constant sites
Gap/Ambiguity Composition p-value
1 e84fcf85a6a4065231dcf343bb862f1cb32abae6 40.65% passed 90.91%
2 5525fb6dab7b6577960147574465990c6df070ad 42.99% passed 99.80%
3 eb3564a35320b53cef22a77288838c7446357327 42.99% passed 25.49%
4 418f1d469f08c99976b313028cf6d3f18f61dd55 43.93% passed 71.86%
5 2e3b2c075901640c4de739473f9246385430b1ed 31.31% passed 90.76%
6 0469f8d819bd45c7638d1c8b0895270a05f34267 38.79% passed 92.82%
7 d162ed685007f5adede58f14aece31dfa1b60c18 40.65% passed 97.17%
8 1d45b2bce36cd995c5dcb755babf512e612ce8b9 41.59% passed 39.04%
9 5aba6bd9debc23ded7041ffdcfe5d68a427e8ce8 31.31% passed 87.21%
10 206656bec2abdbc4aee37a661ef5f4a62b5dd6ae 42.99% passed 85.00%
11 606c23e79bb730ad74e3c6efd72004c36674c17a 47.20% passed 87.78%
12 682e91d7e510ab134d0625234ad224f647c14eb0 41.59% passed 31.01%
13 6a36152105590b1eb095b9503e8f1f226fc73e43 39.25% passed 86.29%
14 6ca685c39a33bfbcb3123129e7af88d573df7d6f 42.06% failed 0.02%
15 8a1c44eb462ed58b21f3fdd72dd22bb657db2980 31.78% passed 54.40%
16 9b220cae8d375ea38b8b481cb95949cda8722fcb 36.92% passed 88.78%
17 aa4698d2e2b1fa71d08e2934a923aad7374a18f6 37.85% passed 90.52%
18 b31aa3f04bc9d5e2498d45cf1983dfaf09faa258 31.78% passed 72.69%
19 d44b129a6181f052198bda3813f0802a91612441 41.59% passed 41.69%
20 ed1acad8a98e8579a44370733533ad7d3fed8006 48.13% passed 58.15%
**** TOTAL 39.77% 1 sequences failed composition chi2 test (p-value<5%; df=3)
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.001 seconds
Generating 1000 samples for ultrafast bootstrap (seed: 498303)...
NOTE: 1 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.100)
Thoroughly optimizing +I+G parameters from 10 start values...
Init pinv, alpha: 0.000, 1.000 / Estimate: 0.000, 1.239 / LogL: -1394.430
Init pinv, alpha: 0.040, 1.000 / Estimate: 0.008, 1.306 / LogL: -1394.720
Init pinv, alpha: 0.080, 1.000 / Estimate: 0.009, 1.315 / LogL: -1394.793
Init pinv, alpha: 0.120, 1.000 / Estimate: 0.009, 1.313 / LogL: -1394.791
Init pinv, alpha: 0.160, 1.000 / Estimate: 0.008, 1.307 / LogL: -1394.755
Init pinv, alpha: 0.200, 1.000 / Estimate: 0.009, 1.309 / LogL: -1394.783
Init pinv, alpha: 0.240, 1.000 / Estimate: 0.008, 1.305 / LogL: -1394.729
Init pinv, alpha: 0.280, 1.000 / Estimate: 0.008, 1.307 / LogL: -1394.742
Init pinv, alpha: 0.320, 1.000 / Estimate: 0.008, 1.308 / LogL: -1394.753
Init pinv, alpha: 0.360, 1.000 / Estimate: 0.008, 1.312 / LogL: -1394.757
Optimal pinv,alpha: 0.000, 1.239 / LogL: -1394.430
Parameters optimization took 0.461 sec
Computing ML distances based on estimated model parameters... 0.009 sec
Computing BIONJ tree...
0.001 seconds
Log-likelihood of BIONJ tree: -1392.898
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Generating 98 parsimony trees... 0.087 second
Computing log-likelihood of 96 initial trees ... 0.106 seconds
Current best score: -1392.898
Do NNI search on 20 best initial trees
Estimate model parameters (epsilon = 0.100)
BETTER TREE FOUND at iteration 1: -1387.266
Iteration 10 / LogL: -1387.731 / Time: 0h:0m:0s
Iteration 20 / LogL: -1387.282 / Time: 0h:0m:1s
Finish initializing candidate tree set (2)
Current best tree score: -1387.266 / CPU time: 0.739
Number of iterations: 20
--------------------------------------------------------------------
| OPTIMIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Iteration 30 / LogL: -1387.308 / Time: 0h:0m:1s (0h:0m:8s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.262
Iteration 40 / LogL: -1387.372 / Time: 0h:0m:1s (0h:0m:6s left)
Iteration 50 / LogL: -1387.307 / Time: 0h:0m:1s (0h:0m:5s left)
Log-likelihood cutoff on original alignment: -1409.040
Iteration 60 / LogL: -1387.370 / Time: 0h:0m:2s (0h:0m:5s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.259
Iteration 70 / LogL: -1387.361 / Time: 0h:0m:2s (0h:0m:5s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.257
Iteration 80 / LogL: -1387.348 / Time: 0h:0m:3s (0h:0m:4s left)
Iteration 90 / LogL: -1387.368 / Time: 0h:0m:3s (0h:0m:4s left)
Iteration 100 / LogL: -1387.439 / Time: 0h:0m:3s (0h:0m:3s left)
Log-likelihood cutoff on original alignment: -1409.046
NOTE: Bootstrap correlation coefficient of split occurrence frequencies: 0.988
NOTE: UFBoot does not converge, continue at least 100 more iterations
Iteration 110 / LogL: -1387.258 / Time: 0h:0m:4s (0h:0m:3s left)
UPDATE BEST LOG-LIKELIHOOD: -1387.255
Iteration 120 / LogL: -1387.354 / Time: 0h:0m:4s (0h:0m:2s left)
Iteration 130 / LogL: -1387.373 / Time: 0h:0m:4s (0h:0m:2s left)
Iteration 140 / LogL: -1387.598 / Time: 0h:0m:5s (0h:0m:2s left)
Iteration 150 / LogL: -1387.405 / Time: 0h:0m:5s (0h:0m:1s left)
Log-likelihood cutoff on original alignment: -1408.710
UPDATE BEST LOG-LIKELIHOOD: -1387.255
Iteration 160 / LogL: -1387.347 / Time: 0h:0m:5s (0h:0m:1s left)
Iteration 170 / LogL: -1390.575 / Time: 0h:0m:6s (0h:0m:1s left)
Iteration 180 / LogL: -1387.268 / Time: 0h:0m:6s (0h:0m:0s left)
Iteration 190 / LogL: -1387.346 / Time: 0h:0m:6s (0h:0m:0s left)
Iteration 200 / LogL: -1387.346 / Time: 0h:0m:6s (0h:0m:0s left)
Log-likelihood cutoff on original alignment: -1408.710
NOTE: Bootstrap correlation coefficient of split occurrence frequencies: 0.997
TREE SEARCH COMPLETED AFTER 202 ITERATIONS / Time: 0h:0m:6s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.010)
1. Initial log-likelihood: -1387.255
Optimal log-likelihood: -1387.252
Rate parameters: A-C: 0.32799 A-G: 2.25616 A-T: 2.13404 C-G: 1.17055 C-T: 3.28011 G-T: 1.00000
Base frequencies: A: 0.243 C: 0.182 G: 0.319 T: 0.256
Proportion of invariable sites: 0.000
Gamma shape alpha: 1.318
Parameters optimization took 1 rounds (0.004 sec)
BEST SCORE FOUND : -1387.252
Testing tree branches by SH-like aLRT with 1000 replicates...
Testing tree branches by local-BP test with 1000 replicates...
Testing tree branches by aBayes parametric test...
0.055 sec.
Creating bootstrap support values...
Split supports printed to NEXUS file /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpmxnxr66z/q2iqtreeufboot.splits.nex
Total tree length: 6.745
Total number of iterations: 202
CPU time used for tree search: 6.430 sec (0h:0m:6s)
Wall-clock time used for tree search: 6.444 sec (0h:0m:6s)
Total CPU time used: 7.055 sec (0h:0m:7s)
Total wall-clock time used: 7.076 sec (0h:0m:7s)
Computing bootstrap consensus tree...
Reading input file /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpmxnxr66z/q2iqtreeufboot.splits.nex...
20 taxa and 161 splits.
Consensus tree written to /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpmxnxr66z/q2iqtreeufboot.contree
Reading input trees file /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpmxnxr66z/q2iqtreeufboot.contree
Log-likelihood of consensus tree: -1387.778
Analysis results written to:
IQ-TREE report: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpmxnxr66z/q2iqtreeufboot.iqtree
Maximum-likelihood tree: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpmxnxr66z/q2iqtreeufboot.treefile
Likelihood distances: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpmxnxr66z/q2iqtreeufboot.mldist
Ultrafast bootstrap approximation results written to:
Split support values: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpmxnxr66z/q2iqtreeufboot.splits.nex
Consensus tree: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpmxnxr66z/q2iqtreeufboot.contree
Screen log file: /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpmxnxr66z/q2iqtreeufboot.log
Date and Time: Sat Dec 5 17:57:23 2020
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: iqtree -bb 1000 -st DNA --runs 1 -s /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/qiime2-archive-mh1mg7rh/03a1849a-4bc2-4343-9a49-b17db5bf6f3d/data/aligned-dna-sequences.fasta -m GTR+I+G -pre /var/folders/9h/268zfwl56h37jwt5qv866jcr0000gp/T/tmpmxnxr66z/q2iqtreeufboot -nt 1 -alrt 1000 -abayes -lbp 1000 -nstop 200 -pers 0.200000
Saved Phylogeny[Unrooted] to: iqt-nnisi-bootstrap-sbt-gtrig-tree.qza
Tip
If we need to reduce the impact of potential model violations that occur during a UFBoot search, or simply want to be more rigorous, we can add the --p-bnni option to any of the iqtree-ultrafast-bootstrap commands above.
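As a rough sketch of what that looks like, the shell function below wraps a pared-down iqtree-ultrafast-bootstrap call with --p-bnni switched on. The function name run_ufboot_bnni and the output file name in the usage note are hypothetical, chosen only for illustration; the remaining options from the commands above can be added back unchanged.

```shell
# Hypothetical helper (not part of QIIME 2): run UFBoot with --p-bnni enabled.
#   $1 = input alignment artifact, $2 = output tree artifact
run_ufboot_bnni() {
    qiime phylogeny iqtree-ultrafast-bootstrap \
        --i-alignment "$1" \
        --p-substitution-model 'GTR+I+G' \
        --p-bnni \
        --o-tree "$2" \
        --verbose
}
```

For example, run_ufboot_bnni masked-aligned-rep-seqs.qza iqt-ufboot-bnni-tree.qza would write the tree to iqt-ufboot-bnni-tree.qza (an assumed name).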
Root the phylogeny
To make proper use of phylogenetic diversity metrics such as UniFrac, the phylogeny must be rooted. Typically an outgroup is chosen when rooting a tree. However, phylogenetic inference tools that use Maximum Likelihood often return an unrooted tree by default.
QIIME 2 provides a way to midpoint-root our phylogeny, which places the root halfway along the longest path between any two tips. Other rooting options may be available in the future. For now, we’ll root our bootstrap tree from iqtree-ultrafast-bootstrap like so:
qiime phylogeny midpoint-root \
--i-tree iqt-nnisi-bootstrap-sbt-gtrig-tree.qza \
--o-rooted-tree iqt-nnisi-bootstrap-sbt-gtrig-tree-rooted.qza
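If you want to check the rooted tree outside QIIME 2, one option is to export it to Newick right after rooting. The sketch below chains the two steps. The function name root_and_export and the export directory are hypothetical, and the sketch assumes that qiime tools export writes the tree as tree.nwk inside the chosen directory.

```shell
# Hypothetical helper (not part of QIIME 2): midpoint-root a tree, then export it.
#   $1 = unrooted tree artifact
#   $2 = rooted tree artifact to create
#   $3 = directory to receive the exported Newick file
root_and_export() {
    # The export only runs if rooting succeeded.
    qiime phylogeny midpoint-root \
        --i-tree "$1" \
        --o-rooted-tree "$2" &&
    qiime tools export \
        --input-path "$2" \
        --output-path "$3"
}
```

For example, root_and_export iqt-nnisi-bootstrap-sbt-gtrig-tree.qza iqt-nnisi-bootstrap-sbt-gtrig-tree-rooted.qza exported-rooted-tree, where exported-rooted-tree is an assumed directory name.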
Tip
iTOL viewing reminder. We can view our tree and its associated alignment via iTOL. Upload the iqt-nnisi-bootstrap-sbt-gtrig-tree-rooted.qza tree file and display the tree in Normal mode. Then drag and drop the masked-aligned-rep-seqs.qza file onto the visualization. Now you can view the phylogeny alongside the alignment.
Pipelines
Here we will outline the use of the phylogeny pipeline align-to-tree-mafft-fasttree.
One advantage of pipelines is that they combine ordered sets of commonly used commands into one condensed, simple command. To keep these “convenience” pipelines easy to use, it is quite common to expose only a few options to the user. That is, most of the commands executed via pipelines are configured to use their default option settings, while the options deemed important enough for the user to consider setting are made available. The options exposed by a given pipeline will largely depend upon what it is doing. Pipelines are also a great way for new users to get started, as they help to lay a foundation of good practices in setting up standard operating procedures.
Without a pipeline, we would run each of the following QIIME 2 commands in turn:
qiime alignment mafft ...
qiime alignment mask ...
qiime phylogeny fasttree ...
qiime phylogeny midpoint-root ...
With the pipeline align-to-tree-mafft-fasttree, we can automate the above four steps in one go. Here is the description taken from the pipeline help doc:
This pipeline will start by creating a sequence alignment using MAFFT, after which any alignment columns that are phylogenetically uninformative or ambiguously aligned will be removed (masked). The resulting masked alignment will be used to infer a phylogenetic tree and then subsequently rooted at its midpoint. Output files from each step of the pipeline will be saved. This includes both the unmasked and masked MAFFT alignment from q2-alignment methods, and both the rooted and unrooted phylogenies from q2-phylogeny methods.
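Written out by hand, those four steps might look like the sketch below. The function name manual_align_to_tree and the intermediate file names are assumptions made for illustration, and every command is left at its default settings.

```shell
# Hypothetical helper (not part of QIIME 2): the four pipeline steps, one by one.
#   $1 = representative sequences artifact, $2 = directory for the outputs
manual_align_to_tree() {
    seqs="$1"
    out="$2"
    # Each step only runs if the previous one succeeded.
    mkdir -p "$out" &&
    # 1. Align the sequences with MAFFT.
    qiime alignment mafft \
        --i-sequences "$seqs" \
        --o-alignment "$out/aligned.qza" &&
    # 2. Mask uninformative or ambiguously aligned columns.
    qiime alignment mask \
        --i-alignment "$out/aligned.qza" \
        --o-masked-alignment "$out/masked-aligned.qza" &&
    # 3. Infer an unrooted tree with FastTree.
    qiime phylogeny fasttree \
        --i-alignment "$out/masked-aligned.qza" \
        --o-tree "$out/unrooted-tree.qza" &&
    # 4. Root the tree at its midpoint.
    qiime phylogeny midpoint-root \
        --i-tree "$out/unrooted-tree.qza" \
        --o-rooted-tree "$out/rooted-tree.qza"
}
```

For example, manual_align_to_tree rep-seqs.qza manual-output, where manual-output is an assumed directory name.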
This can all be accomplished by simply running the following:
qiime phylogeny align-to-tree-mafft-fasttree \
--i-sequences rep-seqs.qza \
--output-dir mafft-fasttree-output
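Output artifacts: the unmasked and masked MAFFT alignments and the unrooted and rooted phylogenies, all saved in the mafft-fasttree-output directory.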
Congratulations! You now know how to construct a phylogeny in QIIME 2!