align-to-tree-mafft-iqtree: Build a phylogenetic tree using iqtree and mafft alignment.

Docstring:

Usage: qiime phylogeny align-to-tree-mafft-iqtree [OPTIONS]

  This pipeline will start by creating a sequence alignment using MAFFT,
  after which any alignment columns that are phylogenetically uninformative
  or ambiguously aligned will be removed (masked). The resulting masked
  alignment will be used to infer a phylogenetic tree using IQ-TREE. By
  default the best fit substitution model will be determined by ModelFinder
  prior to phylogenetic inference. The resulting tree will be subsequently
  rooted at its midpoint. Output files from each step of the pipeline will
  be saved. This includes both the unmasked and masked MAFFT alignment from
  q2-alignment methods, and both the rooted and unrooted phylogenies from
  q2-phylogeny methods.

Inputs:
  --i-sequences ARTIFACT FeatureData[Sequence]
                          The sequences to be used for creating an
                          IQ-TREE-based rooted phylogenetic tree.   [required]
Parameters:
  --p-n-threads VALUE Int % Range(1, None) | Str % Choices('auto')
                          The number of threads. (Use `auto` to automatically
                          use all available cores.) This value is used when
                          aligning the sequences and creating the tree with
                          iqtree.                                 [default: 1]
  --p-mask-max-gap-frequency PROPORTION Range(0, 1, inclusive_end=True)
                          The maximum relative frequency of gap characters in
                          a column for the column to be retained. This
                          relative frequency must be a number between 0.0 and
                          1.0 (inclusive), where 0.0 retains only those
                          columns without gap characters, and 1.0 retains all
                          columns regardless of gap character frequency. This
                          value is used when masking the aligned sequences.
                                                                [default: 1.0]
  --p-mask-min-conservation PROPORTION Range(0, 1, inclusive_end=True)
                          The minimum relative frequency of at least one
                          non-gap character in a column for that column to be
                          retained. This relative frequency must be a number
                          between 0.0 and 1.0 (inclusive). For example, if a
                          value of 0.4 is provided, a column will only be
                          retained if it contains at least one character that
                          is present in at least 40% of the sequences. This
                          value is used when masking the aligned sequences.
                                                                [default: 0.4]
  --p-substitution-model TEXT Choices('JC', 'JC+I', 'JC+G', 'JC+I+G',
    'JC+R2', 'JC+R3', 'JC+R4', 'JC+R5', 'JC+R6', 'JC+R7', 'JC+R8', 'JC+R9',
    'JC+R10', 'F81', 'F81+I', 'F81+G', 'F81+I+G', 'F81+R2', 'F81+R3',
    'F81+R4', 'F81+R5', 'F81+R6', 'F81+R7', 'F81+R8', 'F81+R9', 'F81+R10',
    'K80', 'K80+I', 'K80+G', 'K80+I+G', 'K80+R2', 'K80+R3', 'K80+R4',
    'K80+R5', 'K80+R6', 'K80+R7', 'K80+R8', 'K80+R9', 'K80+R10', 'HKY',
    'HKY+I', 'HKY+G', 'HKY+I+G', 'HKY+R2', 'HKY+R3', 'HKY+R4', 'HKY+R5',
    'HKY+R6', 'HKY+R7', 'HKY+R8', 'HKY+R9', 'HKY+R10', 'TNe', 'TNe+I',
    'TNe+G', 'TNe+I+G', 'TNe+R2', 'TNe+R3', 'TNe+R4', 'TNe+R5', 'TNe+R6',
    'TNe+R7', 'TNe+R8', 'TNe+R9', 'TNe+R10', 'TN', 'TN+I', 'TN+G', 'TN+I+G',
    'TN+R2', 'TN+R3', 'TN+R4', 'TN+R5', 'TN+R6', 'TN+R7', 'TN+R8', 'TN+R9',
    'TN+R10', 'K81', 'K81+I', 'K81+G', 'K81+I+G', 'K81+R2', 'K81+R3',
    'K81+R4', 'K81+R5', 'K81+R6', 'K81+R7', 'K81+R8', 'K81+R9', 'K81+R10',
    'K81u', 'K81u+I', 'K81u+G', 'K81u+I+G', 'K81u+R2', 'K81u+R3', 'K81u+R4',
    'K81u+R5', 'K81u+R6', 'K81u+R7', 'K81u+R8', 'K81u+R9', 'K81u+R10', 'TPM2',
    'TPM2+I', 'TPM2+G', 'TPM2+I+G', 'TPM2+R2', 'TPM2+R3', 'TPM2+R4',
    'TPM2+R5', 'TPM2+R6', 'TPM2+R7', 'TPM2+R8', 'TPM2+R9', 'TPM2+R10',
    'TPM2u', 'TPM2u+I', 'TPM2u+G', 'TPM2u+I+G', 'TPM2u+R2', 'TPM2u+R3',
    'TPM2u+R4', 'TPM2u+R5', 'TPM2u+R6', 'TPM2u+R7', 'TPM2u+R8', 'TPM2u+R9',
    'TPM2u+R10', 'TPM3', 'TPM3+I', 'TPM3+G', 'TPM3+I+G', 'TPM3+R2', 'TPM3+R3',
    'TPM3+R4', 'TPM3+R5', 'TPM3+R6', 'TPM3+R7', 'TPM3+R8', 'TPM3+R9',
    'TPM3+R10', 'TPM3u', 'TPM3u+I', 'TPM3u+G', 'TPM3u+I+G', 'TPM3u+R2',
    'TPM3u+R3', 'TPM3u+R4', 'TPM3u+R5', 'TPM3u+R6', 'TPM3u+R7', 'TPM3u+R8',
    'TPM3u+R9', 'TPM3u+R10', 'TIMe', 'TIMe+I', 'TIMe+G', 'TIMe+I+G',
    'TIMe+R2', 'TIMe+R3', 'TIMe+R4', 'TIMe+R5', 'TIMe+R6', 'TIMe+R7',
    'TIMe+R8', 'TIMe+R9', 'TIMe+R10', 'TIM', 'TIM+I', 'TIM+G', 'TIM+I+G',
    'TIM+R2', 'TIM+R3', 'TIM+R4', 'TIM+R5', 'TIM+R6', 'TIM+R7', 'TIM+R8',
    'TIM+R9', 'TIM+R10', 'TIM2e', 'TIM2e+I', 'TIM2e+G', 'TIM2e+I+G',
    'TIM2e+R2', 'TIM2e+R3', 'TIM2e+R4', 'TIM2e+R5', 'TIM2e+R6', 'TIM2e+R7',
    'TIM2e+R8', 'TIM2e+R9', 'TIM2e+R10', 'TIM2', 'TIM2+I', 'TIM2+G',
    'TIM2+I+G', 'TIM2+R2', 'TIM2+R3', 'TIM2+R4', 'TIM2+R5', 'TIM2+R6',
    'TIM2+R7', 'TIM2+R8', 'TIM2+R9', 'TIM2+R10', 'TIM3e', 'TIM3e+I',
    'TIM3e+G', 'TIM3e+I+G', 'TIM3e+R2', 'TIM3e+R3', 'TIM3e+R4', 'TIM3e+R5',
    'TIM3e+R6', 'TIM3e+R7', 'TIM3e+R8', 'TIM3e+R9', 'TIM3e+R10', 'TIM3',
    'TIM3+I', 'TIM3+G', 'TIM3+I+G', 'TIM3+R2', 'TIM3+R3', 'TIM3+R4',
    'TIM3+R5', 'TIM3+R6', 'TIM3+R7', 'TIM3+R8', 'TIM3+R9', 'TIM3+R10', 'TVMe',
    'TVMe+I', 'TVMe+G', 'TVMe+I+G', 'TVMe+R2', 'TVMe+R3', 'TVMe+R4',
    'TVMe+R5', 'TVMe+R6', 'TVMe+R7', 'TVMe+R8', 'TVMe+R9', 'TVMe+R10', 'TVM',
    'TVM+I', 'TVM+G', 'TVM+I+G', 'TVM+R2', 'TVM+R3', 'TVM+R4', 'TVM+R5',
    'TVM+R6', 'TVM+R7', 'TVM+R8', 'TVM+R9', 'TVM+R10', 'SYM', 'SYM+I',
    'SYM+G', 'SYM+I+G', 'SYM+R2', 'SYM+R3', 'SYM+R4', 'SYM+R5', 'SYM+R6',
    'SYM+R7', 'SYM+R8', 'SYM+R9', 'SYM+R10', 'GTR', 'GTR+I', 'GTR+G',
    'GTR+I+G', 'GTR+R2', 'GTR+R3', 'GTR+R4', 'GTR+R5', 'GTR+R6', 'GTR+R7',
    'GTR+R8', 'GTR+R9', 'GTR+R10', 'MFP', 'TEST')
                          Model of Nucleotide Substitution. If not provided,
                          IQ-TREE will determine the best fit substitution
                          model automatically.                [default: 'MFP']
  --p-fast / --p-no-fast  Fast search to resemble FastTree.   [default: False]
  --p-alrt INTEGER        Single branch test method. Number of bootstrap
    Range(1000, None)     replicates to perform an SH-like approximate
                          likelihood ratio test (SH-aLRT). Minimum of 1000
                          replicates is recommended.                [optional]
  --p-seed INTEGER        Random number seed for the iqtree parsimony
                          starting tree. This allows you to reproduce tree
                          results. If not supplied then one will be randomly
                          chosen.                                   [optional]
  --p-stop-iter INTEGER   Number of unsuccessful iterations to stop. If not
    Range(1, None)        set, program defaults will be used. See IQ-TREE
                          manual for details.                       [optional]
  --p-perturb-nni-strength NUMBER
    Range(0.01, 1.0)      Perturbation strength for randomized NNI. If not
                          set, program defaults will be used. See IQ-TREE
                          manual for details.                       [optional]
Outputs:
  --o-alignment ARTIFACT FeatureData[AlignedSequence]
                          The aligned sequences.                    [required]
  --o-masked-alignment ARTIFACT FeatureData[AlignedSequence]
                          The masked alignment.                     [required]
  --o-tree ARTIFACT       The unrooted phylogenetic tree.
    Phylogeny[Unrooted]                                             [required]
  --o-rooted-tree ARTIFACT
    Phylogeny[Rooted]     The rooted phylogenetic tree.             [required]
Miscellaneous:
  --output-dir PATH       Output unspecified results to a directory
  --verbose / --quiet     Display verbose output to stdout and/or stderr
                          during execution of this action. Or silence output
                          if execution is successful (silence is golden).
  --examples              Show usage examples and exit.
  --citations             Show citations and exit.
  --help                  Show this message and exit.
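
The options above can be combined into a single invocation. The following is a minimal sketch that only assembles the argument list in Python; the file names (`rep-seqs.qza`, the `iqtree-` output prefix) and the helper name `build_command` are hypothetical, and only flags listed in the help text above are used.

```python
import shlex


def build_command(sequences, out_prefix, n_threads=1,
                  substitution_model="MFP", fast=False, alrt=None, seed=None):
    """Assemble an argv list for `qiime phylogeny align-to-tree-mafft-iqtree`."""
    cmd = [
        "qiime", "phylogeny", "align-to-tree-mafft-iqtree",
        "--i-sequences", sequences,
        "--p-n-threads", str(n_threads),
        "--p-substitution-model", substitution_model,
        # Hypothetical output file names derived from the prefix.
        "--o-alignment", f"{out_prefix}aligned.qza",
        "--o-masked-alignment", f"{out_prefix}masked-aligned.qza",
        "--o-tree", f"{out_prefix}unrooted-tree.qza",
        "--o-rooted-tree", f"{out_prefix}rooted-tree.qza",
    ]
    if fast:
        cmd.append("--p-fast")
    if alrt is not None:
        # The help text gives Range(1000, None) for --p-alrt.
        if alrt < 1000:
            raise ValueError("--p-alrt must be at least 1000")
        cmd += ["--p-alrt", str(alrt)]
    if seed is not None:
        cmd += ["--p-seed", str(seed)]
    return cmd


command = build_command("rep-seqs.qza", "iqtree-", n_threads="auto", seed=1723)
print(shlex.join(command))
```

In an environment where QIIME 2 is installed, the list could be passed to `subprocess.run(command, check=True)`; setting a seed is what makes the tree search repeatable.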

Import:

from qiime2.plugins.phylogeny.pipelines import align_to_tree_mafft_iqtree

Docstring:

Build a phylogenetic tree using iqtree and mafft alignment.

This pipeline will start by creating a sequence alignment using MAFFT,
after which any alignment columns that are phylogenetically uninformative
or ambiguously aligned will be removed (masked). The resulting masked
alignment will be used to infer a phylogenetic tree using IQ-TREE. By
default the best fit substitution model will be determined by ModelFinder
prior to phylogenetic inference. The resulting tree will be subsequently
rooted at its midpoint. Output files from each step of the pipeline will be
saved. This includes both the unmasked and masked MAFFT alignment from
q2-alignment methods, and both the rooted and unrooted phylogenies from
q2-phylogeny methods.
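
To illustrate what "rooted at its midpoint" means, here is a small pure-Python sketch. It is not the code q2-phylogeny runs; it assumes a tree given as a list of `(node, node, branch_length)` edges, finds the two tips that are farthest apart, and reports where the halfway point of that path falls. The function names and the toy tree are made up for the example.

```python
from collections import defaultdict


def longest_tip_path(edges):
    """Return (distance, path) for the two most distant tips of the tree."""
    adjacency = defaultdict(list)
    for u, v, length in edges:
        adjacency[u].append((v, length))
        adjacency[v].append((u, length))
    tips = [node for node, nbrs in adjacency.items() if len(nbrs) == 1]

    def walk_from(start):
        # Depth-first traversal recording distance and path to every node.
        reached = {start: (0.0, [start])}
        stack = [start]
        while stack:
            node = stack.pop()
            dist, path = reached[node]
            for neighbor, length in adjacency[node]:
                if neighbor not in reached:
                    reached[neighbor] = (dist + length, path + [neighbor])
                    stack.append(neighbor)
        return reached

    best = (0.0, [])
    for tip in tips:
        reached = walk_from(tip)
        for other in tips:
            if reached[other][0] > best[0]:
                best = reached[other]
    return best


def midpoint(edges):
    """Return (a, b, offset): the root lies on branch a-b, `offset` away from a."""
    total, path = longest_tip_path(edges)
    lengths = {frozenset((u, v)): length for u, v, length in edges}
    half = total / 2
    walked = 0.0
    for a, b in zip(path, path[1:]):
        length = lengths[frozenset((a, b))]
        if walked + length >= half:
            return a, b, half - walked
        walked += length
    raise ValueError("tree needs two tips separated by a positive distance")


# Toy unrooted tree: tips A, B, C, D and internal nodes n1, n2.
toy_edges = [("A", "n1", 1.0), ("B", "n1", 2.0), ("n1", "n2", 1.0),
             ("C", "n2", 5.0), ("D", "n2", 1.0)]
print(midpoint(toy_edges))
```

In the toy tree the most distant tips are B and C (distance 8.0), so the midpoint falls on the long branch leading to C, 1.0 away from `n2`.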

Parameters
----------
sequences : FeatureData[Sequence]
    The sequences to be used for creating an IQ-TREE-based rooted
    phylogenetic tree.
n_threads : Int % Range(1, None) | Str % Choices('auto'), optional
    The number of threads. (Use `auto` to automatically use all available
    cores.) This value is used when aligning the sequences and creating the
    tree with iqtree.
mask_max_gap_frequency : Float % Range(0, 1, inclusive_end=True), optional
    The maximum relative frequency of gap characters in a column for the
    column to be retained. This relative frequency must be a number between
    0.0 and 1.0 (inclusive), where 0.0 retains only those columns without
    gap characters, and 1.0 retains all columns regardless of gap
    character frequency. This value is used when masking the aligned
    sequences.
mask_min_conservation : Float % Range(0, 1, inclusive_end=True), optional
    The minimum relative frequency of at least one non-gap character in a
    column for that column to be retained. This relative frequency must be
    a number between 0.0 and 1.0 (inclusive). For example, if a value of
    0.4 is provided, a column will only be retained if it contains at
    least one character that is present in at least 40% of the sequences.
    This value is used when masking the aligned sequences.
substitution_model : Str % Choices('JC', 'JC+I', 'JC+G', 'JC+I+G', 'JC+R2', 'JC+R3', 'JC+R4', 'JC+R5', 'JC+R6', 'JC+R7', 'JC+R8', 'JC+R9', 'JC+R10', 'F81', 'F81+I', 'F81+G', 'F81+I+G', 'F81+R2', 'F81+R3', 'F81+R4', 'F81+R5', 'F81+R6', 'F81+R7', 'F81+R8', 'F81+R9', 'F81+R10', 'K80', 'K80+I', 'K80+G', 'K80+I+G', 'K80+R2', 'K80+R3', 'K80+R4', 'K80+R5', 'K80+R6', 'K80+R7', 'K80+R8', 'K80+R9', 'K80+R10', 'HKY', 'HKY+I', 'HKY+G', 'HKY+I+G', 'HKY+R2', 'HKY+R3', 'HKY+R4', 'HKY+R5', 'HKY+R6', 'HKY+R7', 'HKY+R8', 'HKY+R9', 'HKY+R10', 'TNe', 'TNe+I', 'TNe+G', 'TNe+I+G', 'TNe+R2', 'TNe+R3', 'TNe+R4', 'TNe+R5', 'TNe+R6', 'TNe+R7', 'TNe+R8', 'TNe+R9', 'TNe+R10', 'TN', 'TN+I', 'TN+G', 'TN+I+G', 'TN+R2', 'TN+R3', 'TN+R4', 'TN+R5', 'TN+R6', 'TN+R7', 'TN+R8', 'TN+R9', 'TN+R10', 'K81', 'K81+I', 'K81+G', 'K81+I+G', 'K81+R2', 'K81+R3', 'K81+R4', 'K81+R5', 'K81+R6', 'K81+R7', 'K81+R8', 'K81+R9', 'K81+R10', 'K81u', 'K81u+I', 'K81u+G', 'K81u+I+G', 'K81u+R2', 'K81u+R3', 'K81u+R4', 'K81u+R5', 'K81u+R6', 'K81u+R7', 'K81u+R8', 'K81u+R9', 'K81u+R10', 'TPM2', 'TPM2+I', 'TPM2+G', 'TPM2+I+G', 'TPM2+R2', 'TPM2+R3', 'TPM2+R4', 'TPM2+R5', 'TPM2+R6', 'TPM2+R7', 'TPM2+R8', 'TPM2+R9', 'TPM2+R10', 'TPM2u', 'TPM2u+I', 'TPM2u+G', 'TPM2u+I+G', 'TPM2u+R2', 'TPM2u+R3', 'TPM2u+R4', 'TPM2u+R5', 'TPM2u+R6', 'TPM2u+R7', 'TPM2u+R8', 'TPM2u+R9', 'TPM2u+R10', 'TPM3', 'TPM3+I', 'TPM3+G', 'TPM3+I+G', 'TPM3+R2', 'TPM3+R3', 'TPM3+R4', 'TPM3+R5', 'TPM3+R6', 'TPM3+R7', 'TPM3+R8', 'TPM3+R9', 'TPM3+R10', 'TPM3u', 'TPM3u+I', 'TPM3u+G', 'TPM3u+I+G', 'TPM3u+R2', 'TPM3u+R3', 'TPM3u+R4', 'TPM3u+R5', 'TPM3u+R6', 'TPM3u+R7', 'TPM3u+R8', 'TPM3u+R9', 'TPM3u+R10', 'TIMe', 'TIMe+I', 'TIMe+G', 'TIMe+I+G', 'TIMe+R2', 'TIMe+R3', 'TIMe+R4', 'TIMe+R5', 'TIMe+R6', 'TIMe+R7', 'TIMe+R8', 'TIMe+R9', 'TIMe+R10', 'TIM', 'TIM+I', 'TIM+G', 'TIM+I+G', 'TIM+R2', 'TIM+R3', 'TIM+R4', 'TIM+R5', 'TIM+R6', 'TIM+R7', 'TIM+R8', 'TIM+R9', 'TIM+R10', 'TIM2e', 'TIM2e+I', 'TIM2e+G', 'TIM2e+I+G', 'TIM2e+R2', 'TIM2e+R3', 'TIM2e+R4', 'TIM2e+R5', 'TIM2e+R6', 
'TIM2e+R7', 'TIM2e+R8', 'TIM2e+R9', 'TIM2e+R10', 'TIM2', 'TIM2+I', 'TIM2+G', 'TIM2+I+G', 'TIM2+R2', 'TIM2+R3', 'TIM2+R4', 'TIM2+R5', 'TIM2+R6', 'TIM2+R7', 'TIM2+R8', 'TIM2+R9', 'TIM2+R10', 'TIM3e', 'TIM3e+I', 'TIM3e+G', 'TIM3e+I+G', 'TIM3e+R2', 'TIM3e+R3', 'TIM3e+R4', 'TIM3e+R5', 'TIM3e+R6', 'TIM3e+R7', 'TIM3e+R8', 'TIM3e+R9', 'TIM3e+R10', 'TIM3', 'TIM3+I', 'TIM3+G', 'TIM3+I+G', 'TIM3+R2', 'TIM3+R3', 'TIM3+R4', 'TIM3+R5', 'TIM3+R6', 'TIM3+R7', 'TIM3+R8', 'TIM3+R9', 'TIM3+R10', 'TVMe', 'TVMe+I', 'TVMe+G', 'TVMe+I+G', 'TVMe+R2', 'TVMe+R3', 'TVMe+R4', 'TVMe+R5', 'TVMe+R6', 'TVMe+R7', 'TVMe+R8', 'TVMe+R9', 'TVMe+R10', 'TVM', 'TVM+I', 'TVM+G', 'TVM+I+G', 'TVM+R2', 'TVM+R3', 'TVM+R4', 'TVM+R5', 'TVM+R6', 'TVM+R7', 'TVM+R8', 'TVM+R9', 'TVM+R10', 'SYM', 'SYM+I', 'SYM+G', 'SYM+I+G', 'SYM+R2', 'SYM+R3', 'SYM+R4', 'SYM+R5', 'SYM+R6', 'SYM+R7', 'SYM+R8', 'SYM+R9', 'SYM+R10', 'GTR', 'GTR+I', 'GTR+G', 'GTR+I+G', 'GTR+R2', 'GTR+R3', 'GTR+R4', 'GTR+R5', 'GTR+R6', 'GTR+R7', 'GTR+R8', 'GTR+R9', 'GTR+R10', 'MFP', 'TEST'), optional
    Model of Nucleotide Substitution. If not provided, IQ-TREE will
    determine the best fit substitution model automatically.
fast : Bool, optional
    Fast search to resemble FastTree.
alrt : Int % Range(1000, None), optional
    Single branch test method. Number of bootstrap replicates to perform an
    SH-like approximate likelihood ratio test (SH-aLRT). Minimum of 1000
    replicates is recommended.
seed : Int, optional
    Random number seed for the iqtree parsimony starting tree. This allows
    you to reproduce tree results. If not supplied then one will be
    randomly chosen.
stop_iter : Int % Range(1, None), optional
    Number of unsuccessful iterations to stop. If not set, program defaults
    will be used. See IQ-TREE manual for details.
perturb_nni_strength : Float % Range(0.01, 1.0), optional
    Perturbation strength for randomized NNI. If not set, program defaults
    will be used. See IQ-TREE manual for details.
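
The two masking parameters are easier to follow on a toy alignment. The sketch below is a plain-Python reading of the descriptions of `mask_max_gap_frequency` and `mask_min_conservation` above, not the q2-alignment implementation; the gap characters (`-` and `.`) and the function names are assumptions made for the example.

```python
from collections import Counter

GAP_CHARS = {"-", "."}  # assumed gap symbols for this illustration


def column_is_retained(column, max_gap_frequency=1.0, min_conservation=0.4):
    """Apply both masking criteria to one alignment column (a string)."""
    n = len(column)
    counts = Counter(column)
    # Relative frequency of gap characters in the column.
    gap_frequency = sum(counts[g] for g in GAP_CHARS) / n
    # Relative frequency of the most common non-gap character.
    non_gap = [c for ch, c in counts.items() if ch not in GAP_CHARS]
    top_frequency = max(non_gap) / n if non_gap else 0.0
    return gap_frequency <= max_gap_frequency and top_frequency >= min_conservation


def mask_alignment(sequences, max_gap_frequency=1.0, min_conservation=0.4):
    """Drop every column that fails either criterion."""
    columns = ["".join(col) for col in zip(*sequences)]
    keep = [column_is_retained(col, max_gap_frequency, min_conservation)
            for col in columns]
    return ["".join(ch for ch, k in zip(seq, keep) if k) for seq in sequences]


aligned = ["AC-T", "AG-T", "AT-A", "AA-C"]
print(mask_alignment(aligned))
```

With the default values, the second column is dropped because no character reaches 40% of the sequences, and the all-gap third column is dropped because it contains no non-gap character at all, even though a gap frequency of 1.0 is allowed.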

Returns
-------
alignment : FeatureData[AlignedSequence]
    The aligned sequences.
masked_alignment : FeatureData[AlignedSequence]
    The masked alignment.
tree : Phylogeny[Unrooted]
    The unrooted phylogenetic tree.
rooted_tree : Phylogeny[Rooted]
    The rooted phylogenetic tree.
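
A minimal sketch of calling the pipeline through the Artifact API and saving its four results follows. It assumes a working QIIME 2 environment; the wrapper name `build_iqtree_phylogeny`, the input path, and the output file naming are hypothetical. The imports sit inside the function so that the definition itself can be loaded without QIIME 2 installed.

```python
# Output names as listed under "Returns" above.
OUTPUT_NAMES = ("alignment", "masked_alignment", "tree", "rooted_tree")


def build_iqtree_phylogeny(sequences_path, output_prefix, n_threads=1, seed=None):
    """Run the pipeline on a FeatureData[Sequence] artifact and save each output."""
    # Deferred imports: these require a QIIME 2 installation.
    from qiime2 import Artifact
    from qiime2.plugins.phylogeny.pipelines import align_to_tree_mafft_iqtree

    sequences = Artifact.load(sequences_path)
    kwargs = {"sequences": sequences, "n_threads": n_threads}
    if seed is not None:
        # A fixed seed makes the tree search reproducible.
        kwargs["seed"] = seed
    results = align_to_tree_mafft_iqtree(**kwargs)

    saved = {}
    for name in OUTPUT_NAMES:
        path = f"{output_prefix}-{name}.qza"  # hypothetical naming scheme
        getattr(results, name).save(path)
        saved[name] = path
    return saved
```

A call such as `build_iqtree_phylogeny("rep-seqs.qza", "iqtree", n_threads="auto", seed=1723)` would then write one `.qza` file per output.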