RDP vs IDTAXA? #55

@cheersjiang

Hi Daniel,

Thank you for incorporating the new training database (IDTAXA) into PR2. I really appreciate the PR2 database, a highly reputable and professional database for protist taxonomy.

I have a few questions regarding the RDP classifier (DADA2's default assignment method) vs IDTAXA (recently updated with PR2). I applied both classifiers to my 18S V4 eDNA dataset from marine samples. Compared with RDP, IDTAXA filtered out a large number of ASVs assigned to the genus Cladocopium, a very important group of marine symbionts. I checked these sequences with nucleotide BLAST, and the results indicated they are high-quality Illumina reads with ~98–100% identity.

  1. Have you encountered similar issues where IDTAXA may exclude many true ASVs at the genus level, potentially leading to missing information in 18S V4 datasets?

  2. Do you have any recommendations for species-level identification and analysis? The RDP paper (Wang et al., 2007) suggests that ~400 bp provides high confidence at the genus level, which corresponds well with the length of the 18S V4 region. On the other hand, the DADA2 tutorial mentions that species-level classification is only appropriate when there is an exact match between an ASV and the reference library, which is rarely feasible.

  3. So far, few papers discuss bioinformatic tools for the 18S V4 region. Is it safe to apply conclusions from 16S V4 to 18S V4 (as both are SSU markers), e.g. classifier performance, species threshold determination, etc.?

PS: RDP & IDTAXA threshold = 50
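
For reference, the comparison described above can be sketched roughly as follows in Python. This is only an illustration: it assumes the genus-level calls from each classifier have already been exported as ASV-to-genus mappings, with `None` where the call fell below the threshold of 50. The function name `lost_genus_calls` and the toy data are hypothetical and are not part of DADA2, DECIPHER, or PR2.

```python
from collections import Counter


def lost_genus_calls(rdp_genus, idtaxa_genus):
    """Count, per genus, the ASVs that RDP assigned but IDTAXA left unassigned.

    Both arguments map an ASV identifier to a genus name, or to None when
    the classifier gave no genus-level call at the chosen threshold.
    """
    lost = Counter()
    for asv_id, genus in rdp_genus.items():
        if genus is None:
            continue  # RDP made no genus call either, so nothing was lost
        if idtaxa_genus.get(asv_id) is None:
            lost[genus] += 1
    return lost


# Hypothetical toy data, not real results.
rdp_genus = {
    "ASV1": "Cladocopium",
    "ASV2": "Cladocopium",
    "ASV3": "Symbiodinium",
    "ASV4": None,
}
idtaxa_genus = {
    "ASV1": None,
    "ASV2": "Cladocopium",
    "ASV3": "Symbiodinium",
    "ASV4": None,
}

lost = lost_genus_calls(rdp_genus, idtaxa_genus)
print(lost.most_common())  # genera ranked by number of ASVs lost under IDTAXA
```

Ranking genera this way makes it easier to see whether the loss is concentrated in Cladocopium or spread across many groups.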

Many thanks in advance :)

Cheers
Yuxuan
