
sepp: Insert fragment sequences using SEPP into reference phylogenies like Greengenes 13_8

Citations
  • Stefan Janssen, Daniel McDonald, Antonio Gonzalez, Jose A. Navas-Molina, Lingjing Jiang, Zhenjiang Zech Xu, Kevin Winker, Deborah M. Kado, Eric Orwoll, Mark Manary, Siavash Mirarab, and Rob Knight. Phylogenetic placement of exact amplicon sequences improves associations with clinical information. mSystems, 2018. doi:10.1128/mSystems.00021-18.

Docstring:

Usage: qiime fragment-insertion sepp [OPTIONS]

  Perform fragment insertion of 16S sequences using the SEPP algorithm
  against the Greengenes 13_8 99% tree.

Inputs:
  --i-representative-sequences ARTIFACT FeatureData[Sequence]
                       The sequences to insert                      [required]
  --i-reference-alignment ARTIFACT FeatureData[AlignedSequence]
                       The reference multiple nucleotide alignment used to
                       construct the reference phylogeny.           [optional]
  --i-reference-phylogeny ARTIFACT
    Phylogeny[Rooted]  The rooted reference phylogeny. Must be in sync with
                       reference-alignment, i.e. each tip name must have
                       exactly one corresponding record.            [optional]
Parameters:
  --p-threads INTEGER  The number of threads to use               [default: 1]
  --p-alignment-subset-size INTEGER
                       Each placement subset is further broken into subsets
                       of at most these many sequences and a separate HMM is
                       trained on each subset. The default alignment subset
                       size is set to balance the exhaustiveness of the
                       alignment step with the running time.   [default: 1000]
  --p-placement-subset-size INTEGER
                       The tree is divided into subsets such that each subset
                       includes at most these many sequences. The placement
                       step places the fragment on only one subset, determined
                       based on alignment scores. The default placement subset
                       size is set to make sure the memory requirement of the
                       pplacer step does not become prohibitively large.
                       Further reading:
                       https://github.com/smirarab/sepp/blob/master/tutorial/sepp-tutorial.md#sample-datasets-default-parameters
                                                               [default: 5000]
  --p-debug / --p-no-debug
                       Print additional run information to STDOUT for
                       debugging. Run together with --verbose to actually see
                       the information on STDOUT. Temporary directories will
                       not be removed if run fails.           [default: False]
Outputs:
  --o-tree ARTIFACT    The tree with inserted feature data
    Phylogeny[Rooted]                                               [required]
  --o-placements ARTIFACT
    Placements                                                      [required]
Miscellaneous:
  --output-dir PATH    Output unspecified results to a directory
  --verbose / --quiet  Display verbose output to stdout and/or stderr during
                       execution of this action. Or silence output if
                       execution is successful (silence is golden).
  --citations          Show citations and exit.
  --help               Show this message and exit.
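
The options above can be assembled programmatically, for example when the command is launched from a pipeline script. The following is a minimal sketch, not part of the plugin: the helper name build_sepp_command and the file names are hypothetical, and only flags listed in the help text above are used. The default values mirror the documented defaults.

```python
import shlex


def build_sepp_command(rep_seqs, tree_out, placements_out, threads=1,
                       alignment_subset_size=1000, placement_subset_size=5000,
                       reference_alignment=None, reference_phylogeny=None,
                       debug=False):
    """Return the argument list for `qiime fragment-insertion sepp`.

    Hypothetical helper for illustration; it only builds the command and
    does not run it.
    """
    cmd = [
        "qiime", "fragment-insertion", "sepp",
        "--i-representative-sequences", rep_seqs,
        "--p-threads", str(threads),
        "--p-alignment-subset-size", str(alignment_subset_size),
        "--p-placement-subset-size", str(placement_subset_size),
        "--o-tree", tree_out,
        "--o-placements", placements_out,
    ]
    # The reference inputs are optional; add them only when supplied.
    if reference_alignment is not None:
        cmd += ["--i-reference-alignment", reference_alignment]
    if reference_phylogeny is not None:
        cmd += ["--i-reference-phylogeny", reference_phylogeny]
    # --p-debug defaults to False, so the flag is added only on request.
    if debug:
        cmd.append("--p-debug")
    return cmd


# Hypothetical file names, chosen only for this example.
example = build_sepp_command("rep-seqs.qza", "insertion-tree.qza",
                             "insertion-placements.qza", threads=4)
print(shlex.join(example))
```

The returned list can be passed to subprocess.run(cmd, check=True) on a machine where QIIME 2 and the fragment-insertion plugin are installed.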

Import:

from qiime2.plugins.fragment_insertion.methods import sepp
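
A call through the Artifact API might look like the sketch below. The wrapper name insert_fragments and the file paths are hypothetical; the keyword arguments and the result fields tree and placements follow the docstring on this page. The QIIME 2 imports are placed inside the function so that the definition can be loaded even where QIIME 2 is not installed; calling it does require a working QIIME 2 environment.

```python
def insert_fragments(rep_seqs_path, tree_path, placements_path, threads=1):
    """Run SEPP fragment insertion and save both outputs.

    Hypothetical wrapper for illustration. Uses the default reference
    described in the docstring, since no reference inputs are passed.
    """
    # Imported lazily: these need an installed QIIME 2 environment.
    from qiime2 import Artifact
    from qiime2.plugins.fragment_insertion.methods import sepp

    # Load the FeatureData[Sequence] artifact holding the fragments.
    rep_seqs = Artifact.load(rep_seqs_path)

    # The action returns a results object with one field per output.
    results = sepp(representative_sequences=rep_seqs, threads=threads)

    # Persist the Phylogeny[Rooted] and Placements artifacts.
    results.tree.save(tree_path)
    results.placements.save(placements_path)
    return results
```

For example, insert_fragments("rep-seqs.qza", "insertion-tree.qza", "insertion-placements.qza", threads=4) would write both output artifacts next to the input.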

Docstring:

Insert fragment sequences using SEPP into reference phylogenies like
Greengenes 13_8

Perform fragment insertion of 16S sequences using the SEPP algorithm
against the Greengenes 13_8 99% tree.

Parameters
----------
representative_sequences : FeatureData[Sequence]
    The sequences to insert
threads : Int, optional
    The number of threads to use
alignment_subset_size : Int, optional
    Each placement subset is further broken into subsets of at most these
    many sequences and a separate HMM is trained on each subset. The
    default alignment subset size is set to balance the exhaustiveness of
    the alignment step with the running time.
placement_subset_size : Int, optional
    The tree is divided into subsets such that each subset includes at most
    these many sequences. The placement step places the fragment on only
    one subset, determined based on alignment scores. The default placement
    subset size is set to make sure the memory requirement of the pplacer
    step does not become prohibitively large. Further reading:
    https://github.com/smirarab/sepp/blob/master/tutorial/sepp-tutorial.md#sample-datasets-default-parameters
reference_alignment : FeatureData[AlignedSequence], optional
    The reference multiple nucleotide alignment used to construct the
    reference phylogeny.
reference_phylogeny : Phylogeny[Rooted], optional
    The rooted reference phylogeny. Must be in sync with
    reference_alignment, i.e. each tip name must have exactly one
    corresponding record in the alignment.
debug : Bool, optional
    Print additional run information to STDOUT for debugging. Run together
    with --verbose to actually see the information on STDOUT. Temporary
    directories will not be removed if run fails.

Returns
-------
tree : Phylogeny[Rooted]
    The tree with inserted feature data
placements : Placements
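
When a custom reference is supplied, the "in sync" requirement on reference_phylogeny can be checked before starting a long run. The sketch below works on plain lists of names and is an illustration only, not the plugin's own validation: the function name check_in_sync is hypothetical, and extracting tip names and record IDs from the artifacts is left out.

```python
from collections import Counter


def check_in_sync(tip_names, alignment_ids):
    """Report tips that do not have exactly one alignment record.

    Returns (missing, duplicated): tip names with no record, and tip
    names with more than one record, each as a sorted list.
    """
    counts = Counter(alignment_ids)
    tips = set(tip_names)
    # A Counter returns 0 for names it has never seen.
    missing = sorted(t for t in tips if counts[t] == 0)
    duplicated = sorted(t for t in tips if counts[t] > 1)
    return missing, duplicated


# Toy data: tip "c" has no record and tip "b" has two.
missing, duplicated = check_in_sync(["a", "b", "c"], ["a", "b", "b", "d"])
print("missing:", missing)
print("duplicated:", duplicated)
```

Both lists being empty means every tip name has exactly one corresponding record. Records without a matching tip (such as "d" above) are not flagged, because the documented requirement is stated only from the tips' side.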