
decontam-identify-batches: Identify contaminants in Batch Mode

Docstring:

Usage: qiime quality-control decontam-identify-batches [OPTIONS]

  This method splits an OTU or ASV table into batches based on the given
  metadata, identifies contaminant sequences in each batch, and reports them
  to the user.

Inputs:
  --i-table ARTIFACT FeatureTable[Frequency]
                          Feature table from which contaminant sequences will
                          be identified                             [required]
  --i-rep-seqs ARTIFACT FeatureData[Sequence]
                          Representative sequences table from which
                          contaminant sequences will be removed     [optional]
Parameters:
  --m-metadata-file METADATA...
    (multiple arguments   Metadata file indicating which samples in the
     will be merged)      experiment are control samples. Sample names in the
                          file are assumed to match those in the `table`
                          input.                                    [required]
  --p-split-column TEXT   Input metadata columns by which to subset the ASV
                          table. Note: column names must be in quotes and
                          delimited by a space                      [required]
  --p-method TEXT Choices('combined', 'frequency', 'prevalence')
                          Method used to identify contaminants. Prevalence:
                          uses control ASVs/OTUs to identify contaminants;
                          Frequency: uses sample concentration information to
                          identify contaminants; Combined: uses both the
                          Prevalence and Frequency methods when identifying
                          contaminants                              [required]
  --p-filter-empty-features / --p-no-filter-empty-features
                          If true, features which are not present in a split
                          feature table are dropped.                [optional]
  --p-freq-concentration-column TEXT
                          Input column name that has concentration
                          information for the samples               [optional]
  --p-prev-control-column TEXT
                          Input column name containing experimental or
                          control sample metadata                   [optional]
  --p-prev-control-indicator TEXT
                          indicate the control sample identifier (e.g.
                          "control" or "blank")                     [optional]
  --p-threshold NUMBER    Select threshold cutoff for decontam algorithm
                          scores                                [default: 0.1]
  --p-weighted / --p-no-weighted
                          weight the decontam scores by their associated read
                          number                               [default: True]
  --p-bin-size NUMBER     Select bin size for the histogram    [default: 0.02]
Outputs:
  --o-batch-subset-tables ARTIFACTS... Collection[FeatureTable[Frequency]]
                          Directory where feature tables split based on
                          metadata and parameter split-column values should be
                          written.                                  [required]
  --o-decontam-scores ARTIFACTS... Collection[FeatureData[DecontamScore]]
                          The resulting table of scores from the decontam
                          algorithm, which scores each feature on how likely
                          it is to be a contaminant sequence        [required]
  --o-score-histograms VISUALIZATION
                          Histogram visualizations for all decontam score
                          objects generated by the pipeline         [required]
Miscellaneous:
  --output-dir PATH       Output unspecified results to a directory
  --verbose / --quiet     Display verbose output to stdout and/or stderr
                          during execution of this action. Or silence output
                          if execution is successful (silence is golden).
  --recycle-pool TEXT     Use a cache pool for pipeline resumption. QIIME 2
                          will cache your results in this pool for reuse by
                          future invocations. These pools are retained until
                          deleted by the user. If not provided, QIIME 2 will
                          create a pool which is automatically reused by
                          invocations of the same action and removed if the
                          action is successful. Note: these pools are local to
                          the cache you are using.
  --no-recycle            Do not recycle results from a previous failed
                          pipeline run or save the results from this run for
                          future recycling.
  --parallel              Execute your action in parallel. This flag will use
                          your default parallel config.
  --parallel-config FILE  Execute your action in parallel using a config at
                          the indicated path.
  --use-cache DIRECTORY   Specify the cache to be used for the intermediate
                          work of this pipeline. If not provided, the default
                          cache under $TMP/qiime2/ will be used.
                          IMPORTANT FOR HPC USERS: If you are on an HPC system
                          and are using parallel execution it is important to
                          set this to a location that is globally accessible
                          to all nodes in the cluster.
  --example-data PATH     Write example data and exit.
  --citations             Show citations and exit.
  --help                  Show this message and exit.
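
The options above can be combined into a single invocation. The following is a minimal sketch, not part of the plugin: `build_command` is a hypothetical helper, and the file names (`table.qza`, `metadata.tsv`), the column names (`batch`, `sample_type`), and the output directory are placeholders. It assumes the prevalence method needs the two `--p-prev-control-*` options, although the help text lists them as optional.

```python
import shlex

METHODS = {"combined", "frequency", "prevalence"}


def build_command(table, metadata, split_column, method, output_dir, **params):
    """Return an argv list for `qiime quality-control decontam-identify-batches`.

    Keyword arguments map onto the `--p-*` flags from the help text,
    e.g. prev_control_column="x" becomes `--p-prev-control-column x`.
    """
    if method not in METHODS:
        raise ValueError(f"method must be one of {sorted(METHODS)}")
    argv = [
        "qiime", "quality-control", "decontam-identify-batches",
        "--i-table", table,
        "--m-metadata-file", metadata,
        "--p-split-column", split_column,
        "--p-method", method,
    ]
    for name, value in params.items():
        flag = name.replace("_", "-")
        if isinstance(value, bool):
            # Boolean parameters are paired flags: --p-weighted / --p-no-weighted.
            argv.append(f"--p-{flag}" if value else f"--p-no-{flag}")
        else:
            argv += [f"--p-{flag}", str(value)]
    # With no --o-* options given, unspecified outputs go under --output-dir.
    argv += ["--output-dir", output_dir]
    return argv


# Hypothetical file and column names; substitute your own.
argv = build_command(
    "table.qza", "metadata.tsv", "batch", "prevalence", "decontam-batches",
    prev_control_column="sample_type",
    prev_control_indicator="control",
    threshold=0.1,
    weighted=False,
)
print(shlex.join(argv))
```

With QIIME 2 installed, the resulting list could be passed to `subprocess.run(argv, check=True)`; building it as a list avoids shell-quoting problems with column names that contain spaces.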

Import:

from qiime2.plugins.quality_control.pipelines import decontam_identify_batches

Docstring:

Identify contaminants in Batch Mode

This method splits an OTU or ASV table into batches based on the given
metadata, identifies contaminant sequences in each batch, and reports them
to the user.

Parameters
----------
table : FeatureTable[Frequency]
    Feature table from which contaminant sequences will be identified
metadata : Metadata
    Metadata file indicating which samples in the experiment are control
    samples. Sample names in the file are assumed to match those in the
    `table` input.
split_column : Str
    Input metadata columns by which to subset the ASV table. Note: column
    names must be in quotes and delimited by a space
method : Str % Choices('combined', 'frequency', 'prevalence')
    Method used to identify contaminants. Prevalence: uses control
    ASVs/OTUs to identify contaminants; Frequency: uses sample
    concentration information to identify contaminants; Combined: uses
    both the Prevalence and Frequency methods when identifying
    contaminants
rep_seqs : FeatureData[Sequence], optional
    Representative sequences table from which contaminant sequences will
    be removed
filter_empty_features : Bool, optional
    If true, features which are not present in a split feature table are
    dropped.
freq_concentration_column : Str, optional
    Input column name that has concentration information for the samples
prev_control_column : Str, optional
    Input column name containing experimental or control sample metadata
prev_control_indicator : Str, optional
    indicate the control sample identifier (e.g. "control" or "blank")
threshold : Float, optional
    Select threshold cutoff for decontam algorithm scores
weighted : Bool, optional
    weight the decontam scores by their associated read number
bin_size : Float, optional
    Select bin size for the histogram

Returns
-------
batch_subset_tables : Collection[FeatureTable[Frequency]]
    Directory where feature tables split based on metadata and parameter
    split_column values should be written.
decontam_scores : Collection[FeatureData[DecontamScore]]
    The resulting table of scores from the decontam algorithm, which scores
    each feature on how likely it is to be a contaminant sequence
score_histograms : Visualization
    Histogram visualizations for all decontam score objects generated by
    the pipeline
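
The parameters and return values listed above suggest the following call pattern. This is a sketch under stated assumptions rather than an official example: `identify_batches`, `run_from_files`, and `PREVALENCE_KWARGS` are hypothetical names, the paths and the column names (`batch`, `sample_type`) are placeholders, and the prevalence method is assumed to need `prev_control_column` and `prev_control_indicator` even though the docstring marks them optional.

```python
# Hypothetical settings for a prevalence-based run; the column name and
# indicator value must match your own metadata.
PREVALENCE_KWARGS = {
    "prev_control_column": "sample_type",
    "prev_control_indicator": "control",
    "threshold": 0.1,   # default from the docstring
    "weighted": True,   # default from the docstring
    "bin_size": 0.02,   # default from the docstring
}


def identify_batches(table, metadata, split_column, method, **kwargs):
    """Run the pipeline and return its three outputs as a tuple."""
    # Imported lazily so this sketch loads even where QIIME 2 is absent.
    from qiime2.plugins.quality_control.pipelines import (
        decontam_identify_batches,
    )

    results = decontam_identify_batches(
        table=table,
        metadata=metadata,
        split_column=split_column,
        method=method,
        **kwargs,
    )
    # Outputs are addressed by the names given in the Returns section.
    return (
        results.batch_subset_tables,
        results.decontam_scores,
        results.score_histograms,
    )


def run_from_files(table_path, metadata_path, split_column="batch"):
    """Load inputs from disk (placeholder paths) and run a prevalence analysis."""
    import qiime2

    table = qiime2.Artifact.load(table_path)
    metadata = qiime2.Metadata.load(metadata_path)
    return identify_batches(
        table, metadata, split_column, "prevalence", **PREVALENCE_KWARGS
    )
```

In a QIIME 2 environment, `run_from_files("table.qza", "metadata.tsv")` would return the split tables, the per-batch score collection, and the histogram visualization in that order.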