demux-single: Demultiplex single-end sequence data with barcodes in-sequence.

Docstring:

Usage: qiime cutadapt demux-single [OPTIONS]

  Demultiplex sequence data (i.e., map barcode reads to sample ids). Barcodes
  are expected to be located within the sequence data (versus the header, or a
  separate barcode file).

Inputs:
  --i-seqs ARTIFACT MultiplexedSingleEndBarcodeInSequence
                          The single-end sequences to be demultiplexed.
                                                                    [required]
Parameters:
  --m-barcodes-file METADATA
  --m-barcodes-column COLUMN  MetadataColumn[Categorical]
                          The sample metadata column listing the per-sample
                          barcodes.                                 [required]
  --p-cut INTEGER         Remove the specified number of bases from the
                          sequences. Bases are removed before demultiplexing.
                          If a positive value is provided, bases are removed
                          from the beginning of the sequences. If a negative
                          value is provided, bases are removed from the end of
                          the sequences.                          [default: 0]
  --p-anchor-barcode / --p-no-anchor-barcode
                          Anchor the barcode. The barcode is then expected to
                          occur in full length at the beginning (5' end) of
                          the sequence. Can speed up demultiplexing if used.
                                                              [default: False]
  --p-error-rate PROPORTION Range(0, 1, inclusive_end=True)
                          The level of error tolerance, specified as the
                          maximum allowable error rate. The cutadapt default
                          of 0.1 (=10%) is higher than the 0.0 (=0%) used by
                          `demux emp-*`.
                                                                [default: 0.1]
  --p-batch-size INTEGER  The number of samples cutadapt demultiplexes
    Range(0, None)        concurrently. Demultiplexing in smaller batches will
                          yield the same result with marginal speed loss, and
                          may solve "too many files" errors related to sample
                          quantity. Set to "0" to process all samples at once.
                                                                  [default: 0]
  --p-minimum-length INTEGER
    Range(1, None)        Discard reads shorter than the specified value. The
                          cutadapt default of 0 has been overridden because
                          that value produces empty sequence records.
                                                                  [default: 1]
  --p-cores NTHREADS      Number of CPU cores to use.             [default: 1]
Outputs:
  --o-per-sample-sequences ARTIFACT SampleData[SequencesWithQuality]
                          The resulting demultiplexed sequences.    [required]
  --o-untrimmed-sequences ARTIFACT MultiplexedSingleEndBarcodeInSequence
                          The sequences that were unmatched to barcodes.
                                                                    [required]
Miscellaneous:
  --output-dir PATH       Output unspecified results to a directory
  --verbose / --quiet     Display verbose output to stdout and/or stderr
                          during execution of this action. Or silence output
                          if execution is successful (silence is golden).
  --example-data PATH     Write example data and exit.
  --citations             Show citations and exit.
  --use-cache DIRECTORY   Specify the cache to be used for the intermediate
                          work of this action. If not provided, the default
                          cache under $TMP/qiime2/ will be used.
                          IMPORTANT FOR HPC USERS: If you are on an HPC system
                          and are using parallel execution it is important to
                          set this to a location that is globally accessible
                          to all nodes in the cluster.
  --help                  Show this message and exit.

Examples:
  # ### example: demux single
  qiime cutadapt demux-single \
    --i-seqs seqs.qza \
    --m-barcodes-file md.tsv \
    --m-barcodes-column BarcodeSequence \
    --o-per-sample-sequences per-sample-sequences.qza \
    --o-untrimmed-sequences untrimmed-sequences.qza

Import:

from qiime2.plugins.cutadapt.methods import demux_single

Docstring:

Demultiplex single-end sequence data with barcodes in-sequence.

Demultiplex sequence data (i.e., map barcode reads to sample ids). Barcodes
are expected to be located within the sequence data (versus the header, or
a separate barcode file).
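
The idea of in-sequence demultiplexing can be illustrated with a small
pure-Python sketch. It is not cutadapt's algorithm: it assumes anchored
barcodes, exact matches only (an error rate of 0), and that the barcode is
removed from each matched read. The function name `demux_anchored` and the
sample data are hypothetical.

```python
def demux_anchored(reads, barcodes):
    """Toy exact-match demultiplexer (illustration only, not cutadapt).

    reads    -- list of sequence strings with the barcode at the 5' end
    barcodes -- dict mapping sample id -> barcode string
    """
    per_sample = {sample_id: [] for sample_id in barcodes}
    unmatched = []
    for read in reads:
        for sample_id, barcode in barcodes.items():
            if read.startswith(barcode):
                # Assumption: the barcode is stripped from the matched read.
                per_sample[sample_id].append(read[len(barcode):])
                break
        else:
            # No barcode matched; the read stays "untrimmed".
            unmatched.append(read)
    return per_sample, unmatched


barcodes = {"sample_a": "ACGT", "sample_b": "TTAG"}
reads = ["ACGTGGGCCC", "TTAGAAATTT", "CCCCGGGGAA"]
per_sample, unmatched = demux_anchored(reads, barcodes)
```

Here `per_sample` plays the role of the per-sample sequences output and
`unmatched` the role of the untrimmed sequences output.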

Parameters
----------
seqs : MultiplexedSingleEndBarcodeInSequence
    The single-end sequences to be demultiplexed.
barcodes : MetadataColumn[Categorical]
    The sample metadata column listing the per-sample barcodes.
cut : Int, optional
    Remove the specified number of bases from the sequences. Bases are
    removed before demultiplexing. If a positive value is provided, bases
    are removed from the beginning of the sequences. If a negative value is
    provided, bases are removed from the end of the sequences.
anchor_barcode : Bool, optional
    Anchor the barcode. The barcode is then expected to occur in full
    length at the beginning (5' end) of the sequence. Can speed up
    demultiplexing if used.
error_rate : Float % Range(0, 1, inclusive_end=True), optional
    The level of error tolerance, specified as the maximum allowable error
    rate. The cutadapt default of 0.1 (=10%) is higher than the 0.0 (=0%)
    used by `demux emp-*`.
batch_size : Int % Range(0, None), optional
    The number of samples cutadapt demultiplexes concurrently.
    Demultiplexing in smaller batches will yield the same result with
    marginal speed loss, and may solve "too many files" errors related to
    sample quantity. Set to "0" to process all samples at once.
minimum_length : Int % Range(1, None), optional
    Discard reads shorter than the specified value. The cutadapt default
    of 0 has been overridden because that value produces empty sequence
    records.
cores : Threads, optional
    Number of CPU cores to use.
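
The semantics of `cut`, `error_rate`, and `minimum_length` described above
can be sketched in plain Python. The helper names are hypothetical and this
is not the plugin's code. The error calculation assumes the rule given in
cutadapt's documentation, where the number of allowed errors is the barcode
length multiplied by the error rate, rounded down.

```python
import math


def apply_cut(sequence, cut):
    """Remove bases as described for `cut` (applied before demultiplexing)."""
    if cut > 0:
        return sequence[cut:]   # positive: drop bases from the beginning
    if cut < 0:
        return sequence[:cut]   # negative: drop bases from the end
    return sequence             # zero (the default): leave unchanged


def max_allowed_errors(barcode, error_rate):
    """Assumed rule: floor(barcode length * maximum error rate)."""
    return math.floor(len(barcode) * error_rate)


def keep_read(sequence, minimum_length=1):
    """Reads shorter than `minimum_length` are discarded."""
    return len(sequence) >= minimum_length


trimmed_front = apply_cut("ACGTACGT", 2)    # "GTACGT"
trimmed_back = apply_cut("ACGTACGT", -3)    # "ACGTA"
```

Under this rule, an 8-base barcode at the default error rate of 0.1 allows
no mismatches, while a 10-base barcode allows one.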

Returns
-------
per_sample_sequences : SampleData[SequencesWithQuality]
    The resulting demultiplexed sequences.
untrimmed_sequences : MultiplexedSingleEndBarcodeInSequence
    The sequences that were unmatched to barcodes.
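
The claim that smaller batches yield the same result can be illustrated with
a toy sketch. It assumes exact, anchored matching and that reads left
unmatched by one batch of barcodes are passed on to the next batch; both are
assumptions made for illustration, and the function names are hypothetical.

```python
def demux_exact(reads, barcodes):
    """Toy exact-match, anchored demultiplexer (illustration only)."""
    per_sample, unmatched = {}, []
    for read in reads:
        for sample_id, barcode in barcodes.items():
            if read.startswith(barcode):
                per_sample.setdefault(sample_id, []).append(read[len(barcode):])
                break
        else:
            unmatched.append(read)
    return per_sample, unmatched


def demux_in_batches(reads, barcodes, batch_size=0):
    """Process `batch_size` samples at a time; 0 means all samples at once."""
    items = list(barcodes.items())
    if batch_size == 0:
        batch_size = len(items) or 1
    per_sample, remaining = {}, list(reads)
    for start in range(0, len(items), batch_size):
        batch = dict(items[start:start + batch_size])
        # Assumption: reads unmatched by this batch feed the next batch.
        matched, remaining = demux_exact(remaining, batch)
        per_sample.update(matched)
    return per_sample, remaining


barcodes = {"s1": "AAAA", "s2": "CCCC", "s3": "GGGG"}
reads = ["AAAATTT", "GGGGTTA", "CCCCTAT", "TTTTACG", "AAAACGC"]
all_at_once = demux_in_batches(reads, barcodes, batch_size=0)
one_by_one = demux_in_batches(reads, barcodes, batch_size=1)
```

In this toy setting both calls return the same per-sample reads and the same
unmatched reads. Each batch only needs output files for its own samples,
which is consistent with the note that batching may avoid "too many files"
errors.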