rnaQUAST has now been run on both the Trinity and rnaSPAdes assemblies.
Time to compare the two runs and see how they differ.
Here are the metrics that get measured:
SHORT SUMMARY REPORT
METRICS/TRANSCRIPTS
DATABASE METRICS
Genes
Avg. number of exons per isoform
BASIC TRANSCRIPTS METRICS
Transcripts
Transcripts > 500 bp
Transcripts > 1000 bp
ALIGNMENT METRICS
Aligned
Uniquely aligned
Multiply aligned
Unaligned
ALIGNMENT METRICS FOR NON-MISASSEMBLED TRANSCRIPTS
Avg. aligned fraction
Avg. alignment length
Avg. mismatches per transcript
ALIGNMENT METRICS FOR MISASSEMBLED (CHIMERIC) TRANSCRIPTS
Misassemblies
ASSEMBLY COMPLETENESS (SENSITIVITY)
Database coverage
Duplication ratio
50%-assembled genes
95%-assembled genes
50%-covered genes
95%-covered genes
50%-assembled isoforms
95%-assembled isoforms
50%-covered isoforms
95%-covered isoforms
Mean isoform coverage
Mean isoform assembly
ASSEMBLY SPECIFICITY
50%-matched
95%-matched
Unannotated
Mean fraction of transcript matched
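To put the two runs side by side, the short reports can be read into dictionaries and compared metric by metric. The sketch below is a minimal example, not part of rnaQUAST itself. It assumes each report is tab-separated, with the metric name in the first column and its value in the second, and that section headings have no value column. The function names and the sample numbers are made up for illustration.

```python
def parse_report(text):
    """Return {metric_name: value_string} from a tab-separated report."""
    metrics = {}
    for line in text.splitlines():
        parts = line.split("\t")
        # Skip blank lines and section headings (no value column).
        if len(parts) < 2 or not parts[1].strip():
            continue
        metrics[parts[0].strip()] = parts[1].strip()
    return metrics


def compare_reports(a, b):
    """Return (metric, value_a, value_b, b_minus_a) for every metric in either report."""
    rows = []
    names = list(a) + [name for name in b if name not in a]
    for name in names:
        va, vb = a.get(name), b.get(name)
        try:
            diff = float(vb) - float(va)
        except (TypeError, ValueError):
            diff = None  # missing or non-numeric value
        rows.append((name, va, vb, diff))
    return rows


# Placeholder report contents (assumed layout, invented values).
trinity_text = (
    "METRICS/TRANSCRIPTS\ttrinity\n"
    "BASIC TRANSCRIPTS METRICS\n"
    "Transcripts\t100\n"
    "Misassemblies\t4\n"
)
spades_text = (
    "METRICS/TRANSCRIPTS\tspades\n"
    "BASIC TRANSCRIPTS METRICS\n"
    "Transcripts\t80\n"
    "Misassemblies\t1\n"
)

rows = compare_reports(parse_report(trinity_text), parse_report(spades_text))
for name, va, vb, diff in rows:
    print(f"{name}\t{va}\t{vb}\t{diff}")
```

In practice the two strings would be read from the short report file of each run. The difference column is the second report minus the first, so a negative value means the second assembly scored lower on that metric.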
We can tally the number of predicted misassemblies in each run. Each one is evidence of an error either in the de novo assembly or in the reference genome.
If we see the same region of the reference flagged as a misassembly across more than one plant variety, it becomes more likely that the de novo assembly is correct and the reference is wrong.
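One way to find such shared regions is to bin the reference into fixed-size windows and count how many distinct varieties have a misassembly touching each window. The sketch below assumes the misassembly locations have already been extracted as (variety, chromosome, start, end) tuples with 0-based, half-open coordinates. That input format, the function name, the window size, and the example records are all hypothetical.

```python
from collections import defaultdict


def shared_misassembly_regions(records, window=1000, min_varieties=2):
    """Return windows flagged as misassembled in at least min_varieties varieties.

    records: iterable of (variety, chrom, start, end), 0-based half-open.
    Result: sorted list of (chrom, window_start, window_end, [varieties]).
    """
    seen = defaultdict(set)
    for variety, chrom, start, end in records:
        # Mark every window that the interval overlaps.
        for w in range(start // window, (end - 1) // window + 1):
            seen[(chrom, w)].add(variety)
    return sorted(
        (chrom, w * window, (w + 1) * window, sorted(varieties))
        for (chrom, w), varieties in seen.items()
        if len(varieties) >= min_varieties
    )


# Invented example records for illustration only.
records = [
    ("varietyA", "chr1", 1200, 1800),
    ("varietyB", "chr1", 1500, 2100),
    ("varietyA", "chr2", 500, 900),
]
shared = shared_misassembly_regions(records)
print(shared)
```

Windows reported here are only candidates for reference errors. A smaller window gives finer resolution but requires the flagged positions to agree more closely between varieties.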