1. Retrieval
Published datasets are downloaded from their respective repositories.
2. Noise reduction
Background noise is eliminated to ensure data accuracy.
3. Standardization
Author annotations, such as cell type and tissue, are mapped to a consistent, controlled vocabulary.
4. Expression threshold
A gene is considered expressed in a cell type or tissue if it has a non-zero value in at least 100 cells and in at least 15% of the total cells.
5. Study count
For each cell type, the number of studies reporting the gene as expressed is counted.
6. Cell type marker extraction
Genes expressed in a given cell type are ranked by the number of supporting studies. The top 30 genes are then selected and re-ordered, in descending order, by the number of other cell types in which they are expressed.
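
Steps 4 to 6 can be illustrated with a minimal Python sketch. It is not the pipeline's actual implementation. The record layout (study, cell type, gene, non-zero cells, total cells) and the function names are hypothetical. The sketch further assumes that the threshold is applied per study, that "total cells" means the cells of that cell type within the study, that ties in study count are broken alphabetically, and that a gene counts toward another cell type when it passes the threshold there in at least one study.

```python
from collections import defaultdict

MIN_CELLS = 100      # step 4: minimum number of cells with a non-zero value
MIN_FRACTION = 0.15  # step 4: minimum fraction of the total cells
TOP_N = 30           # step 6: number of genes kept per cell type


def is_expressed(nonzero_cells, total_cells):
    """Step 4: both the absolute and the relative threshold must hold."""
    if total_cells <= 0:
        return False
    return (nonzero_cells >= MIN_CELLS
            and nonzero_cells / total_cells >= MIN_FRACTION)


def count_studies(records):
    """Step 5: count supporting studies per (cell_type, gene) pair.

    `records` is a hypothetical iterable of tuples:
    (study, cell_type, gene, nonzero_cells, total_cells).
    """
    studies = defaultdict(set)
    for study, cell_type, gene, nonzero, total in records:
        if is_expressed(nonzero, total):
            studies[(cell_type, gene)].add(study)
    return {key: len(found) for key, found in studies.items()}


def extract_markers(study_counts, cell_type, top_n=TOP_N):
    """Step 6: pick the top genes by study count, then re-order them."""
    cell_types_by_gene = defaultdict(set)
    for (ct, gene) in study_counts:
        cell_types_by_gene[gene].add(ct)

    # Rank by number of supporting studies (ties alphabetical: an assumption).
    candidates = [(gene, n) for (ct, gene), n in study_counts.items()
                  if ct == cell_type]
    candidates.sort(key=lambda item: (-item[1], item[0]))
    top = [gene for gene, _ in candidates[:top_n]]

    # Re-order in descending order of the number of OTHER cell types in
    # which the gene is expressed, as the text states. The sort is stable,
    # so genes with equal counts keep their study-count order.
    top.sort(key=lambda g: len(cell_types_by_gene[g] - {cell_type}),
             reverse=True)
    return top


# Toy data with made-up names, for illustration only.
records = [
    ("study1", "type_A", "GENE_1", 200, 1000),
    ("study2", "type_A", "GENE_1", 150, 500),
    ("study1", "type_A", "GENE_2", 900, 1000),
    ("study1", "type_B", "GENE_2", 500, 600),
    ("study1", "type_B", "GENE_1", 50, 600),    # fails: fewer than 100 cells
    ("study2", "type_A", "GENE_3", 120, 1000),  # fails: below 15%
]
counts = count_studies(records)
markers = extract_markers(counts, "type_A")
print(markers)
```

Sets are used for the study tally so that a study contributing several records for the same gene and cell type is counted only once.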