Examples of running infer_experiment.py, with output:
Unstranded protocol:
$ infer_experiment.py -i VF36_C_R1.bam -r S_lycopersicum_Sep_2019.bed
Reading reference gene model S_lycopersicum_Sep_2019.bed ... Done
Loading SAM/BAM file ... Total 200000 usable reads were sampled
This is PairEnd Data
Fraction of reads failed to determine: 0.0087
Fraction of reads explained by "1++,1--,2+-,2-+": 0.4945
Fraction of reads explained by "1+-,1-+,2++,2--": 0.4969
Interpretation: About half of the time, read 1 maps to the same strand as its gene. The other half of the time, it maps to the opposite strand, as expected for an unstranded library.
Stranded synthesis protocol:
$ infer_experiment.py -i A_C_3S.bam -r O_sativa_japonica_Oct_2011.bed
[W::hts_idx_load3] The index file is older than the data file: A_C_3S.bam.bai
Reading reference gene model O_sativa_japonica_Oct_2011.bed ... Done
Loading SAM/BAM file ... Total 200000 usable reads were sampled
This is SingleEnd Data
Fraction of reads failed to determine: 0.0201
Fraction of reads explained by "++,--": 0.0095
Fraction of reads explained by "+-,-+": 0.9704
Interpretation: Nearly all reads (about 97%) map to the strand opposite their gene, so they match the gene's antisense strand. Fewer than 1% match the sense strand.
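The interpretations above can be turned into a small check. The following is a minimal sketch, not part of RSeQC: the function name classify_strandedness and the 0.8 / 0.4-0.6 cutoffs are assumptions chosen for illustration. It parses the "Fraction of reads explained by" lines and reports whether reads mostly agree with the gene strand ("sense"), mostly oppose it ("antisense"), or are split evenly ("unstranded").

```python
import re

# Matches lines such as: Fraction of reads explained by "1++,1--,2+-,2-+": 0.4945
FRACTION_LINE = re.compile(r'Fraction of reads explained by "([^"]+)":\s*([0-9.]+)')

# Patterns where the read (read 1 for paired-end data) is on the same strand as its gene.
SENSE_PATTERNS = {"++,--", "1++,1--,2+-,2-+"}
# Patterns where the read (read 1 for paired-end data) is on the opposite strand.
ANTISENSE_PATTERNS = {"+-,-+", "1+-,1-+,2++,2--"}


def classify_strandedness(report, stranded_cutoff=0.8, unstranded_band=(0.4, 0.6)):
    """Classify infer_experiment.py output text.

    The cutoffs are hypothetical defaults, not values defined by RSeQC.
    Returns "sense", "antisense", "unstranded", or "ambiguous".
    """
    sense = antisense = None
    for pattern, value in FRACTION_LINE.findall(report):
        if pattern in SENSE_PATTERNS:
            sense = float(value)
        elif pattern in ANTISENSE_PATTERNS:
            antisense = float(value)
    if sense is None or antisense is None:
        raise ValueError("could not find both orientation fractions in the report")
    if sense + antisense == 0:
        raise ValueError("no reads were assigned to either orientation")

    # Share of the determinable reads that agree with the gene strand.
    sense_share = sense / (sense + antisense)
    if sense_share >= stranded_cutoff:
        return "sense"
    if sense_share <= 1 - stranded_cutoff:
        return "antisense"
    if unstranded_band[0] <= sense_share <= unstranded_band[1]:
        return "unstranded"
    return "ambiguous"


# Usage with the single-end output shown above.
report = '''This is SingleEnd Data
Fraction of reads failed to determine: 0.0201
Fraction of reads explained by "++,--": 0.0095
Fraction of reads explained by "+-,-+": 0.9704
'''
print(classify_strandedness(report))  # prints: antisense
```

Dividing by the sum of the two fractions ignores the reads that failed to be determined, so a large "failed" fraction does not push a clearly stranded library into the ambiguous range.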
Older BAM files (from SRP252265), aligned with hisat2 using the wrong strandedness parameter:
$ infer_experiment.py -i N3-S.bam -r S_lycopersicum_Sep_2019.bed
[W::hts_idx_load3] The index file is older than the data file: N3-S.bam.bai
Reading reference gene model S_lycopersicum_Sep_2019.bed ... Done
Loading SAM/BAM file ... Total 200000 usable reads were sampled
This is PairEnd Data
Fraction of reads failed to determine: 0.0013
Fraction of reads explained by "1++,1--,2+-,2-+": 0.5170
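Fraction of reads explained by "1+-,1-+,2++,2--": 0.4817
Interpretation: As in the unstranded example above, read 1 is split about evenly between the two orientations (0.5170 vs. 0.4817), so this library also looks unstranded.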
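Because the strandedness parameter passed to hisat2 has to agree with what infer_experiment.py reports, it may help to derive that parameter from the result. The sketch below is an illustration under stated assumptions: the function name hisat2_strandness_args and the labels "sense", "antisense", and "unstranded" are hypothetical. "sense" means the read (read 1 for paired-end data) lies on the same strand as its gene, i.e. the "++,--" or "1++,1--,2+-,2-+" line dominates, and "antisense" means the other line dominates. The mapping to F/R and FR/RF follows my reading of the hisat2 --rna-strandness option and should be checked against the manual for the installed version.

```python
def hisat2_strandness_args(call, paired_end):
    """Return extra hisat2 command-line arguments for a strandedness call.

    call       -- "sense", "antisense", or "unstranded" (hypothetical labels)
    paired_end -- True if infer_experiment.py reported "This is PairEnd Data"
    """
    if call == "unstranded":
        # hisat2 assumes an unstranded library when the option is omitted.
        return []
    if call not in ("sense", "antisense"):
        raise ValueError(f"unexpected strandedness call: {call!r}")
    if paired_end:
        # FR: read 1 corresponds to the transcript; RF: read 1 is its reverse complement.
        value = "FR" if call == "sense" else "RF"
    else:
        # F: the read corresponds to the transcript; R: the read is its reverse complement.
        value = "F" if call == "sense" else "R"
    return ["--rna-strandness", value]


# Usage: the single-end antisense library shown earlier.
print(hisat2_strandness_args("antisense", paired_end=False))
```

For the paired-end outputs above, where the two fractions are nearly equal, the function returns an empty list, meaning hisat2 would be run without --rna-strandness.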
Moving to done.