Warning

This site has been replaced by the new QIIME 2 “amplicon distribution” documentation, as of the 2025.4 release of QIIME 2. You can still access the content from the “old docs” here for the QIIME 2 2024.10 and earlier releases, but we recommend that you transition to the new documentation at https://amplicon-docs.qiime2.org. Content on this site is no longer updated and may be out of date.


dereplicate-sequences: Dereplicate sequences.

Command line interface
Artifact API

Docstring:

Usage: qiime vsearch dereplicate-sequences [OPTIONS]

  Dereplicate sequence data and create a feature table and feature
  representative sequences. Feature identifiers in the resulting artifacts
  will be the sha1 hash of the sequence defining each feature. If clustering
  of features into OTUs is desired, the resulting artifacts can be passed to
  the cluster_features_* methods in this plugin.

Inputs:
  --i-sequences ARTIFACT SampleData[Sequences] |
    SampleData[SequencesWithQuality] | SampleData[JoinedSequencesWithQuality]
                          The sequences to be dereplicated.         [required]
Parameters:
  --p-derep-prefix / --p-no-derep-prefix
                          Merge sequences with identical prefixes. If a
                          sequence is identical to the prefix of two or more
                          longer sequences, it is clustered with the shortest
                          of them. If they are equally long, it is clustered
                          with the most abundant.             [default: False]
  --p-min-seq-length INTEGER
    Range(1, None)        Discard sequences shorter than this integer.
                                                                  [default: 1]
  --p-min-unique-size INTEGER
    Range(1, None)        Discard sequences with a post-dereplication
                          abundance value smaller than integer.   [default: 1]
Outputs:
  --o-dereplicated-table ARTIFACT FeatureTable[Frequency]
                          The table of dereplicated sequences.      [required]
  --o-dereplicated-sequences ARTIFACT FeatureData[Sequence]
                          The dereplicated sequences.               [required]
Miscellaneous:
  --output-dir PATH       Output unspecified results to a directory
  --verbose / --quiet     Display verbose output to stdout and/or stderr
                          during execution of this action. Or silence output
                          if execution is successful (silence is golden).
  --example-data PATH     Write example data and exit.
  --citations             Show citations and exit.
  --use-cache DIRECTORY   Specify the cache to be used for the intermediate
                          work of this action. If not provided, the default
                          cache under $TMP/qiime2/ will be used.
                          IMPORTANT FOR HPC USERS: If you are on an HPC system
                          and are using parallel execution it is important to
                          set this to a location that is globally accessible
                          to all nodes in the cluster.
  --help                  Show this message and exit.
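
To make the description above concrete, the following is a minimal Python sketch of full-length dereplication. It is not the plugin's implementation (the plugin delegates to vsearch); it only illustrates how identical sequences collapse into one feature whose identifier is the SHA-1 hash of the sequence, and how the two filtering parameters could act. The input layout (a dict of sample ID to sequences), the function name `dereplicate`, and the choice to compare `min_unique_size` against the total abundance across all samples are assumptions made for illustration.

```python
import hashlib
from collections import Counter, defaultdict


def dereplicate(samples, min_seq_length=1, min_unique_size=1):
    """Toy full-length dereplication (illustrative only).

    samples: dict mapping sample ID -> list of sequence strings. This is a
             hypothetical stand-in for a SampleData[Sequences] artifact.
    Returns (table, rep_seqs):
      table:    feature ID -> {sample ID: count}
      rep_seqs: feature ID -> sequence
    """
    table = defaultdict(Counter)
    rep_seqs = {}
    for sample_id, seqs in samples.items():
        for seq in seqs:
            # Discard sequences shorter than min_seq_length.
            if len(seq) < min_seq_length:
                continue
            # Feature ID is the SHA-1 hex digest of the sequence itself.
            feature_id = hashlib.sha1(seq.encode("ascii")).hexdigest()
            rep_seqs[feature_id] = seq
            table[feature_id][sample_id] += 1

    # Assumption: abundance is summed over all samples before filtering.
    kept = {
        fid: dict(counts)
        for fid, counts in table.items()
        if sum(counts.values()) >= min_unique_size
    }
    return kept, {fid: rep_seqs[fid] for fid in kept}


samples = {"s1": ["ACGT", "ACGT", "GG"], "s2": ["ACGT", "TTTT"]}
table, reps = dereplicate(samples, min_seq_length=3, min_unique_size=2)
for fid, counts in table.items():
    print(fid, reps[fid], counts)
```

In this toy input, "GG" is dropped by the length filter and "TTTT" by the abundance filter, leaving a single feature for "ACGT" with counts in both samples.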

Import:

from qiime2.plugins.vsearch.methods import dereplicate_sequences
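
A short usage sketch of the imported method follows. The parameter and output names are taken from the docstring on this page; the helper name `run_dereplication` and the three file paths it receives are hypothetical, and the QIIME 2 imports are placed inside the function only so the snippet can be loaded on a machine without QIIME 2 installed.

```python
def run_dereplication(input_path, table_path, seqs_path):
    """Dereplicate the sequences stored in a .qza file and save both outputs.

    All three paths are hypothetical and supplied by the caller.
    """
    # Lazy imports: these require a working QIIME 2 environment.
    from qiime2 import Artifact
    from qiime2.plugins.vsearch.methods import dereplicate_sequences

    # Expected type: SampleData[Sequences] or one of the other accepted types.
    sequences = Artifact.load(input_path)

    results = dereplicate_sequences(
        sequences=sequences,
        derep_prefix=False,   # defaults shown explicitly for clarity
        min_seq_length=1,
        min_unique_size=1,
    )

    # Outputs are accessible by the names listed under "Returns".
    results.dereplicated_table.save(table_path)
    results.dereplicated_sequences.save(seqs_path)
    return results
```

Calling, for example, `run_dereplication("seqs.qza", "table.qza", "rep-seqs.qza")` inside a QIIME 2 environment would write a FeatureTable[Frequency] and a FeatureData[Sequence] artifact to the given (made-up) file names.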

Docstring:

Dereplicate sequences.

Dereplicate sequence data and create a feature table and feature
representative sequences. Feature identifiers in the resulting artifacts
will be the sha1 hash of the sequence defining each feature. If clustering
of features into OTUs is desired, the resulting artifacts can be passed to
the cluster_features_* methods in this plugin.

Parameters
----------
sequences : SampleData[Sequences] | SampleData[SequencesWithQuality] | SampleData[JoinedSequencesWithQuality]
    The sequences to be dereplicated.
derep_prefix : Bool, optional
    Merge sequences with identical prefixes. If a sequence is identical to
    the prefix of two or more longer sequences, it is clustered with the
    shortest of them. If they are equally long, it is clustered with the
    most abundant.
min_seq_length : Int % Range(1, None), optional
    Discard sequences shorter than this integer.
min_unique_size : Int % Range(1, None), optional
    Discard sequences with a post-dereplication abundance value smaller
    than integer.

Returns
-------
dereplicated_table : FeatureTable[Frequency]
    The table of dereplicated sequences.
dereplicated_sequences : FeatureData[Sequence]
    The dereplicated sequences.
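
The tie-breaking rule described for derep_prefix can be sketched in a few lines of Python. This is an illustration of the rule as worded in the docstring, not vsearch's actual algorithm: the function name `derep_prefix_sketch`, the use of pre-merge abundances for the "most abundant" comparison, and the alphabetical final tie-break are all assumptions.

```python
from collections import Counter


def derep_prefix_sketch(unique_counts):
    """Merge each sequence into a longer sequence that it is a prefix of.

    unique_counts: dict mapping unique sequence -> abundance.
    Returns a dict mapping surviving sequence -> merged abundance.
    """
    merged = Counter(unique_counts)
    # Visit shortest sequences first so counts propagate along prefix chains.
    for seq in sorted(unique_counts, key=len):
        candidates = [
            other for other in unique_counts
            if len(other) > len(seq) and other.startswith(seq)
        ]
        if not candidates:
            continue  # not a prefix of anything longer; keep as is
        # Shortest candidate wins; among equal lengths, the most abundant.
        # Assumption: ties beyond that are broken alphabetically.
        target = min(candidates, key=lambda s: (len(s), -unique_counts[s], s))
        merged[target] += merged.pop(seq)
    return dict(merged)


counts = {"ACG": 5, "ACGT": 2, "ACGA": 7, "ACGTTT": 1, "GG": 3}
print(derep_prefix_sketch(counts))
```

Here "ACG" is a prefix of three longer sequences; the two shortest ("ACGT" and "ACGA") are equally long, so it joins the more abundant "ACGA". "ACGT" in turn merges into "ACGTTT", and "GG" is left untouched.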
