sepp: Insert fragment sequences using SEPP into reference phylogenies like Greengenes 13_8

Citations

[fragment-insertion:sepp:JMG+18] Stefan Janssen, Daniel McDonald, Antonio Gonzalez, Jose A. Navas-Molina, Lingjing Jiang, Zhenjiang Zech Xu, Kevin Winker, Deborah M. Kado, Eric Orwoll, Mark Manary, Siavash Mirarab, and Rob Knight. Phylogenetic placement of exact amplicon sequences improves associations with clinical information. mSystems, 2018. doi:10.1128/mSystems.00021-18.

Docstring:

Usage: qiime fragment-insertion sepp [OPTIONS]

  Perform fragment insertion of 16S sequences using the SEPP algorithm
  against the Greengenes 13_8 99% tree.

Options:
  --i-representative-sequences ARTIFACT PATH FeatureData[Sequence]
                                  The sequences to insert  [required]
  --p-threads INTEGER             The number of threads to use  [default: 1]
  --p-alignment-subset-size INTEGER
                                  Each placement subset is further broken into
                                  subsets of at most these many sequences and
                                  a separate HMM is trained on each subset.
                                  The default alignment subset size is set to
                                  balance the exhaustiveness of the alignment
                                  step with the running time.  [default: 1000]
  --p-placement-subset-size INTEGER
                                  The tree is divided into subsets such that
                                  each subset includes at most these many
                                  sequences. The placement step places the
                                  fragment on only one subset, determined
                                  based on alignment scores. The default
                                  placement subset size is set to make sure
                                  the memory requirement of the pplacer step
                                  does not become prohibitively large.
                                  Further reading:
                                  https://github.com/smirarab/sepp/blob/master/tutorial/sepp-tutorial.md#sample-datasets-default-parameters
                                  [default: 5000]
  --i-reference-alignment ARTIFACT PATH FeatureData[AlignedSequence]
                                  The reference multiple nucleotide alignment
                                  used to construct the reference phylogeny.
                                  [optional]
  --i-reference-phylogeny ARTIFACT PATH Phylogeny[Rooted]
                                  The rooted reference phylogeny. Must be in
                                  sync with reference-alignment, i.e. each tip
                                  name must have exactly one corresponding
                                  record.  [optional]
  --p-debug / --p-no-debug        Print additional run information to STDOUT
                                  for debugging. Run together with --verbose
                                  to actually see the information on STDOUT.
                                  Temporary directories will not be removed if
                                  run fails.  [default: False]
  --o-tree ARTIFACT PATH Phylogeny[Rooted]
                                  The tree with inserted feature data
                                  [required if not passing --output-dir]
  --o-placements ARTIFACT PATH Placements
                                  [required if not passing --output-dir]
  --output-dir DIRECTORY          Output unspecified results to a directory
  --cmd-config FILE               Use config file for command options
  --verbose                       Display verbose output to stdout and/or
                                  stderr during execution of this action.
                                  [default: False]
  --quiet                         Silence output if execution is successful
                                  (silence is golden).  [default: False]
  --citations                     Show citations and exit.
  --help                          Show this message and exit.
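
The options above can also be assembled programmatically. The following is a minimal sketch, not part of the plugin: the helper name `build_sepp_command` and the `.qza` file names are hypothetical, and only flags listed in the help text above are used. It builds the argument list and prints it as a shell-quoted string without running anything.

```python
import shlex


def build_sepp_command(rep_seqs, tree_out, placements_out, threads=1,
                       alignment_subset_size=1000,
                       placement_subset_size=5000, debug=False):
    """Assemble the argument list for `qiime fragment-insertion sepp`.

    Hypothetical helper; defaults mirror those shown in the help text.
    """
    cmd = [
        "qiime", "fragment-insertion", "sepp",
        "--i-representative-sequences", rep_seqs,
        "--p-threads", str(threads),
        "--p-alignment-subset-size", str(alignment_subset_size),
        "--p-placement-subset-size", str(placement_subset_size),
        "--o-tree", tree_out,
        "--o-placements", placements_out,
    ]
    # --p-debug / --p-no-debug is a boolean switch pair.
    cmd.append("--p-debug" if debug else "--p-no-debug")
    return cmd


# Hypothetical input and output file names.
cmd = build_sepp_command("rep-seqs.qza", "insertion-tree.qza",
                         "insertion-placements.qza", threads=4)
print(shlex.join(cmd))
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would execute the command, assuming `qiime` and the fragment-insertion plugin are installed and on the PATH.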

Import:

from qiime2.plugins.fragment_insertion.methods import sepp
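
A minimal sketch of calling this method through the Artifact API, assuming QIIME 2 and the fragment-insertion plugin are installed. The wrapper name `insert_fragments` and its path arguments are hypothetical. The keyword arguments and the `tree` and `placements` result attributes follow the parameter and return names in the docstring on this page.

```python
def insert_fragments(rep_seqs_path, tree_path, placements_path, threads=1):
    """Run SEPP through the QIIME 2 Artifact API and save both outputs.

    Hypothetical wrapper for illustration only.
    """
    # Imported lazily so that defining this function does not require
    # QIIME 2; calling it does.
    from qiime2 import Artifact
    from qiime2.plugins.fragment_insertion.methods import sepp

    rep_seqs = Artifact.load(rep_seqs_path)  # FeatureData[Sequence]
    results = sepp(representative_sequences=rep_seqs, threads=threads)

    results.tree.save(tree_path)              # Phylogeny[Rooted]
    results.placements.save(placements_path)  # Placements
    return results
```

A call such as `insert_fragments("rep-seqs.qza", "insertion-tree.qza", "insertion-placements.qza", threads=4)` would use the default reference, since `reference_alignment` and `reference_phylogeny` are optional.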

Docstring:

Insert fragment sequences using SEPP into reference phylogenies like
Greengenes 13_8

Perform fragment insertion of 16S sequences using the SEPP algorithm
against the Greengenes 13_8 99% tree.

Parameters
----------
representative_sequences : FeatureData[Sequence]
    The sequences to insert
threads : Int, optional
    The number of threads to use
alignment_subset_size : Int, optional
    Each placement subset is further broken into subsets of at most these
    many sequences and a separate HMM is trained on each subset. The
    default alignment subset size is set to balance the exhaustiveness of
    the alignment step with the running time.
placement_subset_size : Int, optional
    The tree is divided into subsets such that each subset includes at most
    these many sequences. The placement step places the fragment on only one
    subset, determined based on alignment scores. The default placement
    subset size is set to make sure the memory requirement of the pplacer
    step does not become prohibitively large. Further reading:
    https://github.com/smirarab/sepp/blob/master/tutorial/sepp-tutorial.md#sample-datasets-default-parameters
reference_alignment : FeatureData[AlignedSequence], optional
    The reference multiple nucleotide alignment used to construct the
    reference phylogeny.
reference_phylogeny : Phylogeny[Rooted], optional
    The rooted reference phylogeny. Must be in sync with
    reference-alignment, i.e. each tip name must have exactly one
    corresponding record.
debug : Bool, optional
    Print additional run information to STDOUT for debugging. Run together
    with --verbose to actually see the information on STDOUT. Temporary
    directories will not be removed if run fails.

Returns
-------
tree : Phylogeny[Rooted]
    The tree with inserted feature data
placements : Placements
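
When a custom reference is supplied, the requirement that each tip name has exactly one corresponding alignment record can be checked before a run. The sketch below is an illustration under stated assumptions, not the plugin's own validation: the function name `check_in_sync` is hypothetical, and it takes plain lists of identifiers, leaving out how tip names and record IDs are extracted from the artifacts.

```python
from collections import Counter


def check_in_sync(tip_names, alignment_ids):
    """Report mismatches between tree tip names and alignment record IDs.

    Hypothetical helper. All three lists empty means every tip has
    exactly one record and no record lacks a tip.
    """
    counts = Counter(alignment_ids)
    tips = set(tip_names)
    return {
        # Tips with no alignment record at all.
        "tips_without_record": sorted(tips - set(counts)),
        # Records that match no tip. The docstring states the requirement
        # only from the tip side, so this is reported for information.
        "records_without_tip": sorted(set(counts) - tips),
        # Record IDs that occur more than once in the alignment.
        "duplicated_records": sorted(i for i, n in counts.items() if n > 1),
    }


# Hypothetical identifiers.
report = check_in_sync(["t1", "t2", "t3"], ["t1", "t2", "t2", "t4"])
print(report)
```

In this toy input, `t3` has no record, `t4` has no tip, and `t2` appears twice, so none of the three lists is empty.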