trim-single: Find and remove adapters in demultiplexed single-end sequences.

Docstring:

Usage: qiime cutadapt trim-single [OPTIONS]

  Search demultiplexed single-end sequences for adapters and remove them. The
  parameter descriptions in this method are adapted from the official cutadapt
  docs - please see those docs at https://cutadapt.readthedocs.io for complete
  details.

Inputs:
  --i-demultiplexed-sequences ARTIFACT SampleData[SequencesWithQuality]
                         The single-end sequences to be trimmed.    [required]
Parameters:
  --p-cores NTHREADS     Number of CPU cores to use.              [default: 1]
  --p-adapter TEXT...    Sequence of an adapter ligated to the 3' end. The
    List[Str]            adapter and any subsequent bases are trimmed. If a
                         `$` is appended, the adapter is only found if it is
                         at the end of the read. If your sequence of interest
                         is "framed" by a 5' and a 3' adapter, use this
                         parameter to define a "linked" primer - see
                         https://cutadapt.readthedocs.io for complete details.
                                                                    [optional]
  --p-front TEXT...      Sequence of an adapter ligated to the 5' end. The
    List[Str]            adapter and any preceding bases are trimmed. Partial
                         matches at the 5' end are allowed. If a `^` character
                         is prepended, the adapter is only found if it is at
                         the beginning of the read.                 [optional]
  --p-anywhere TEXT...   Sequence of an adapter that may be ligated to the 5'
    List[Str]            or 3' end. Both types of matches as described under
                         `adapter` and `front` are allowed. If the first base
                         of the read is part of the match, the behavior is as
                         with `front`, otherwise as with `adapter`. This
                         option is mostly for rescuing failed library
                         preparations - do not use if you know which end your
                         adapter was ligated to.                    [optional]
  --p-error-rate PROPORTION Range(0, 1, inclusive_end=True)
                         Maximum allowed error rate.            [default: 0.1]
  --p-indels / --p-no-indels
                         Allow insertions or deletions of bases when matching
                         adapters.                             [default: True]
  --p-times INTEGER      Remove multiple occurrences of an adapter if it is
    Range(1, None)       repeated, up to `times` times.           [default: 1]
  --p-overlap INTEGER    Require at least `overlap` bases of overlap between
    Range(1, None)       read and adapter for an adapter to be found.
                                                                  [default: 3]
  --p-match-read-wildcards / --p-no-match-read-wildcards
                         Interpret IUPAC wildcards (e.g., N) in reads.
                                                              [default: False]
  --p-match-adapter-wildcards / --p-no-match-adapter-wildcards
                         Interpret IUPAC wildcards (e.g., N) in adapters.
                                                               [default: True]
  --p-minimum-length INTEGER
    Range(1, None)       Discard reads shorter than specified value. Note,
                         the cutadapt default of 0 has been overridden,
                         because that value produces empty sequence records.
                                                                  [default: 1]
  --p-discard-untrimmed / --p-no-discard-untrimmed
                         Discard reads in which no adapter was found.
                                                              [default: False]
  --p-max-expected-errors NUMBER
    Range(0, None)       Discard reads that exceed maximum expected erroneous
                         nucleotides.                               [optional]
  --p-max-n NUMBER       Discard reads with more than NUMBER N bases. If
    Range(0, None)       NUMBER is between 0 and 1, it is interpreted as a
                         fraction of the read length.
                                                                    [optional]
  --p-quality-cutoff-5end INTEGER
    Range(0, None)       Trim nucleotides with Phred score quality lower than
                         threshold from 5 prime end.              [default: 0]
  --p-quality-cutoff-3end INTEGER
    Range(0, None)       Trim nucleotides with Phred score quality lower than
                         threshold from 3 prime end.              [default: 0]
  --p-quality-base INTEGER
    Range(0, None)       How the Phred score is encoded (33 or 64).
                                                                 [default: 33]
Outputs:
  --o-trimmed-sequences ARTIFACT SampleData[SequencesWithQuality]
                         The resulting trimmed sequences.           [required]
Miscellaneous:
  --output-dir PATH      Output unspecified results to a directory
  --verbose / --quiet    Display verbose output to stdout and/or stderr
                         during execution of this action. Or silence output if
                         execution is successful (silence is golden).
  --example-data PATH    Write example data and exit.
  --citations            Show citations and exit.
  --use-cache DIRECTORY  Specify the cache to be used for the intermediate
                         work of this action. If not provided, the default
                         cache under $TMP/qiime2/ will be used.
                         IMPORTANT FOR HPC USERS: If you are on an HPC system
                         and are using parallel execution it is important to
                         set this to a location that is globally accessible to
                         all nodes in the cluster.
  --help                 Show this message and exit.
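
As an illustration of how the options above combine, the following Python sketch assembles a `qiime cutadapt trim-single` command line without running it. The helper name `build_trim_single_command`, the file names `demux.qza` and `trimmed.qza`, and the adapter sequence are hypothetical placeholders. Passing several sequences after a single `--p-front` or `--p-adapter` flag follows the `TEXT...` notation in the help text above.

```python
import shlex


def build_trim_single_command(input_qza, output_qza, front=(), adapter=(),
                              error_rate=0.1, cores=1,
                              discard_untrimmed=False):
    """Return an argv list for `qiime cutadapt trim-single` (hypothetical helper)."""
    # The help text documents error_rate as Range(0, 1, inclusive_end=True).
    if not 0 <= error_rate <= 1:
        raise ValueError("error_rate must lie between 0 and 1")

    cmd = [
        "qiime", "cutadapt", "trim-single",
        "--i-demultiplexed-sequences", input_qza,
        "--p-cores", str(cores),
        "--p-error-rate", str(error_rate),
    ]
    # List[Str] parameters: all values follow one flag (the TEXT... notation).
    if front:
        cmd += ["--p-front", *front]
    if adapter:
        cmd += ["--p-adapter", *adapter]
    # Boolean parameters are exposed as paired on/off flags.
    cmd.append("--p-discard-untrimmed" if discard_untrimmed
               else "--p-no-discard-untrimmed")
    cmd += ["--o-trimmed-sequences", output_qza]
    return cmd


# Placeholder inputs; the leading `^` anchors the adapter at the read start.
command = build_trim_single_command(
    "demux.qza", "trimmed.qza", front=["^GTGCCAGCMGCCGCGGTAA"])
print(shlex.join(command))
```

The returned list could be handed to `subprocess.run(command, check=True)` in an environment where QIIME 2 is installed. `shlex.join` is used only to print a shell-safe version of the command.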

Import:

from qiime2.plugins.cutadapt.methods import trim_single
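
A minimal sketch of calling the imported method through the Artifact API, assuming QIIME 2 and q2-cutadapt are installed. The wrapper name `trim_front_adapter`, the file paths, and the adapter sequence in the usage comment are hypothetical. The imports are deferred into the function body so the definition itself loads without QIIME 2.

```python
def trim_front_adapter(demux_path, out_path, front_adapters, cores=1):
    """Trim 5' adapters from a demultiplexed single-end artifact (sketch)."""
    # Deferred imports: QIIME 2 is only needed when the function is called.
    from qiime2 import Artifact
    from qiime2.plugins.cutadapt.methods import trim_single

    # Load a SampleData[SequencesWithQuality] artifact from disk.
    demux = Artifact.load(demux_path)

    # List[Str] parameters are passed as Python lists.
    results = trim_single(
        demultiplexed_sequences=demux,
        front=list(front_adapters),
        cores=cores,
        discard_untrimmed=True,  # drop reads in which no adapter was found
    )

    # The output is exposed under the name given in the Returns section.
    trimmed = results.trimmed_sequences
    trimmed.save(out_path)
    return trimmed


# Example call with placeholder inputs (requires a QIIME 2 environment):
# trim_front_adapter("demux.qza", "trimmed.qza", ["^GTGCCAGCMGCCGCGGTAA"])
```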

Docstring:

Find and remove adapters in demultiplexed single-end sequences.

Search demultiplexed single-end sequences for adapters and remove them. The
parameter descriptions in this method are adapted from the official
cutadapt docs - please see those docs at https://cutadapt.readthedocs.io
for complete details.

Parameters
----------
demultiplexed_sequences : SampleData[SequencesWithQuality]
    The single-end sequences to be trimmed.
cores : Threads, optional
    Number of CPU cores to use.
adapter : List[Str], optional
    Sequence of an adapter ligated to the 3' end. The adapter and any
    subsequent bases are trimmed. If a `$` is appended, the adapter is only
    found if it is at the end of the read. If your sequence of interest is
    "framed" by a 5' and a 3' adapter, use this parameter to define a
    "linked" primer - see https://cutadapt.readthedocs.io for complete
    details.
front : List[Str], optional
    Sequence of an adapter ligated to the 5' end. The adapter and any
    preceding bases are trimmed. Partial matches at the 5' end are allowed.
    If a `^` character is prepended, the adapter is only found if it is at
    the beginning of the read.
anywhere : List[Str], optional
    Sequence of an adapter that may be ligated to the 5' or 3' end. Both
    types of matches as described under `adapter` and `front` are allowed.
    If the first base of the read is part of the match, the behavior is as
    with `front`, otherwise as with `adapter`. This option is mostly for
    rescuing failed library preparations - do not use if you know which end
    your adapter was ligated to.
error_rate : Float % Range(0, 1, inclusive_end=True), optional
    Maximum allowed error rate.
indels : Bool, optional
    Allow insertions or deletions of bases when matching adapters.
times : Int % Range(1, None), optional
    Remove multiple occurrences of an adapter if it is repeated, up to
    `times` times.
overlap : Int % Range(1, None), optional
    Require at least `overlap` bases of overlap between read and adapter
    for an adapter to be found.
match_read_wildcards : Bool, optional
    Interpret IUPAC wildcards (e.g., N) in reads.
match_adapter_wildcards : Bool, optional
    Interpret IUPAC wildcards (e.g., N) in adapters.
minimum_length : Int % Range(1, None), optional
    Discard reads shorter than specified value. Note, the cutadapt default
    of 0 has been overridden, because that value produces empty sequence
    records.
discard_untrimmed : Bool, optional
    Discard reads in which no adapter was found.
max_expected_errors : Float % Range(0, None), optional
    Discard reads that exceed maximum expected erroneous nucleotides.
max_n : Float % Range(0, None), optional
    Discard reads with more than `max_n` N bases. If `max_n` is a number
    between 0 and 1, it is interpreted as a fraction of the read length.
quality_cutoff_5end : Int % Range(0, None), optional
    Trim nucleotides with Phred score quality lower than threshold from 5
    prime end.
quality_cutoff_3end : Int % Range(0, None), optional
    Trim nucleotides with Phred score quality lower than threshold from 3
    prime end.
quality_base : Int % Range(0, None), optional
    How the Phred score is encoded (33 or 64).

Returns
-------
trimmed_sequences : SampleData[SequencesWithQuality]
    The resulting trimmed sequences.
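
The read-discarding parameters above (`minimum_length`, `max_expected_errors`, `max_n`) can be illustrated with a small pure-Python sketch. It is not the cutadapt implementation. It assumes that the expected number of errors is the sum of per-base error probabilities 10^(-Q/10), with Q decoded from the quality string using `quality_base`, and that a `max_n` value below 1 is treated as a fraction of the read length. All function names are hypothetical.

```python
def expected_errors(quality_string, quality_base=33):
    """Sum of per-base error probabilities for a FASTQ quality string."""
    # Phred score Q = ASCII code - quality_base; error probability = 10^(-Q/10).
    return sum(10 ** (-(ord(ch) - quality_base) / 10) for ch in quality_string)


def too_many_n(sequence, max_n):
    """True if the read has more N bases than `max_n` allows."""
    n_count = sequence.upper().count("N")
    if max_n < 1:
        # Assumption: values below 1 are a fraction of the read length.
        if not sequence:
            return False
        return n_count / len(sequence) > max_n
    return n_count > max_n


def keep_read(sequence, quality_string, minimum_length=1,
              max_expected_errors=None, max_n=None, quality_base=33):
    """Apply the three discard rules to one read; None disables a rule."""
    if len(sequence) < minimum_length:
        return False
    if (max_expected_errors is not None
            and expected_errors(quality_string, quality_base) > max_expected_errors):
        return False
    if max_n is not None and too_many_n(sequence, max_n):
        return False
    return True


# A four-base read with one N and quality 'I' (Phred 40 at base 33) everywhere.
print(keep_read("ACGN", "IIII", max_expected_errors=0.01, max_n=0.2))
```

In the example call, one N in four bases is a fraction of 0.25, which exceeds the 0.2 limit, so the read is rejected even though its expected error count is far below the threshold.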