On the Calculation Formula of TPM #16

@Hiroyuki24

Hello,

I greatly appreciate the creation of a TPM calculation tool specifically for prokaryotes.
I'm writing to raise a concern regarding the TPM values calculated by FADU.

I tested FADU with both the provided test data and my own RNA-seq data. The tool ran without issues, and I obtained a results table including TPM values. However, when I summed the TPM values for all genes, the total did not reach one million. TPM stands for transcripts per million, so the TPM values across all genes should sum to one million. Upon reviewing the FADU code, I found that the calculation formula for TPM is incorrect.

The correct calculation of TPM involves the following steps:

  1. Normalize the read count for each gene by its length to get counts per kilobase.
  2. Sum the length-normalized counts.
  3. Divide the length-normalized count by the total sum and multiply by one million.
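
The three steps above can be sketched in Julia as follows. This is only a minimal illustration: `tpm_from_counts` is a hypothetical helper name, not a function in FADU, and the counts and lengths are made-up toy values.

```julia
# Hypothetical helper illustrating the three steps; not part of FADU.
function tpm_from_counts(counts::Vector{Float64}, lengths::Vector{Int})
    # Step 1: length-normalize each gene's count (counts per kilobase).
    rpk = counts .* 1000 ./ lengths
    # Step 2: sum the length-normalized counts over all genes.
    total_rpk = sum(rpk)
    # Step 3: divide by that sum and scale to one million.
    return rpk ./ total_rpk .* 1_000_000
end

counts = [100.0, 200.0, 300.0]   # made-up read counts
lengths = [1000, 2000, 500]      # made-up gene lengths in bp
tpm = tpm_from_counts(counts, lengths)
println(sum(tpm))
```

Because every value is divided by the same sum of length-normalized counts, the results add up to one million regardless of the input.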

However, the current FADU code is as follows:
```julia
function calc_tpm(len::UInt, totalcounts::Float32, feat_counts::Float32)
    """Calculate TPM score for current feature."""
    return @fastmath(feat_counts * 1000 / len) * 1000000 / totalcounts
end
```

It appears that each count is scaled by the total read count, computed before any length normalization, rather than by the sum of the length-normalized counts.
This seems to be the formula for calculating FPKM, not TPM.
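
To illustrate the concern, here is a small sketch that applies the function as quoted in this issue to made-up toy data. It assumes `totalcounts` is the sum of the raw feature counts, which is my reading of the code and may not match how FADU actually computes it.

```julia
# The function as quoted in this issue (docstring omitted).
function calc_tpm(len::UInt, totalcounts::Float32, feat_counts::Float32)
    return @fastmath(feat_counts * 1000 / len) * 1000000 / totalcounts
end

# Made-up toy data, not taken from FADU's test set.
feat_counts = Float32[100, 200, 300]
lens = UInt[1000, 2000, 500]

# Assumption: totalcounts is the sum of the raw (not length-normalized) counts.
totalcounts = sum(feat_counts)

scores = [calc_tpm(l, totalcounts, c) for (l, c) in zip(lens, feat_counts)]
println(sum(scores))
```

With these toy values the total comes out at roughly 1.33 million instead of one million, which is consistent with the behavior I observed.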

Could you please verify if there is an error in the TPM calculation formula?
