In this markdown, we will look at the difference between the jet, Seurat default, and viridis color palettes in scRNA-seq data. Color palettes matter for single-cell analysis of all kinds, from flow cytometry to single-cell sequencing and imaging, because many of the plots use color as an extra dimension. In flow cytometry, manual gating often involves coloring a biaxial plot by density. When visualizing t-SNE or UMAP results, we often color by marker expression.
Here, we will make the case that viridis is the preferable color palette, both for its resolution and for being colorblind-friendly. First, let’s load the data.
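library(tidyverse)
library(Seurat)
library(SeuratData)
library(patchwork) # provides the |, / and & operators used to combine plots below
# The dataset must be installed once beforehand, e.g. SeuratData::InstallData("pbmc3k")
pbmc <- SeuratData::LoadData(ds = "pbmc3k")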
## Warning: Assay RNA changing from Assay to Assay
## Warning: Assay RNA changing from Assay to Assay5
We will next do the standard Seurat workup, which will allow us to test the color schemes on the UMAP visualizations of gene expression.
pbmc <- NormalizeData(pbmc, normalization.method = "LogNormalize", scale.factor = 10000)
pbmc <- FindVariableFeatures(pbmc, selection.method = "vst", nfeatures = 2000)
pbmc <- ScaleData(pbmc, features = rownames(pbmc))
pbmc <- RunPCA(pbmc, features = VariableFeatures(object = pbmc))
pbmc <- RunUMAP(pbmc, dims = 1:10)
## Warning: The default method for RunUMAP has changed from calling Python UMAP via reticulate to the R-native UWOT using the cosine metric
## To use Python UMAP via reticulate, set umap.method to 'umap-learn' and metric to 'correlation'
## This message will be shown once per session
Now we make our plots. We keep “genes” as a vector in case the reader wants to color by more than one marker. We start with jet, a typical color palette in flow cytometry, and show an example UMAP below.
genes <- c("LYZ")
# Simplified version
# jet_colors <- colorRampPalette(c("blue", "cyan", "yellow", "red"))
# Credit: https://stackoverflow.com/questions/18360196/how-can-i-get-a-certain-colorful-scale-in-r
jet_colors <-
  colorRampPalette(c("#00007F", "blue", "#007FFF", "cyan",
                     "#7FFF7F", "yellow", "#FF7F00", "red", "#7F0000"))
FeaturePlot(pbmc, features = genes) &
  scale_colour_gradientn(colours = jet_colors(100))
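First we examine the color spectra. We look at two types of red-green colorblindness (deuteranopia and protanopia) and one type of blue-yellow colorblindness (tritanopia), which are the three types the dichromat function simulates. The colorspace package, also loaded below, offers similar simulations with adjustable severity. So think of this as starter code for your own personal use.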
# Load necessary libraries
library(colorspace)
library(dichromat)
# Simulate red-green colorblindness (deuteranopia)
jet_deuteranopia <- dichromat(jet_colors(100), type = "deutan")
# Simulate red-green colorblindness (protanopia)
jet_protanopia <- dichromat(jet_colors(100), type = "protan")
# Simulate blue-yellow colorblindness (tritanopia)
jet_tritanopia <- dichromat(jet_colors(100), type = "tritan")
# Plot the original and all simulated color schemes
par(mfrow = c(4, 1), mar = c(1, 1, 1, 1))
image(matrix(1:100, ncol = 1), col = jet_colors(100), main = "Original Jet Color Scheme", axes = FALSE)
image(matrix(1:100, ncol = 1), col = jet_deuteranopia, main = "Jet Simulated for Deuteranopia", axes = FALSE)
image(matrix(1:100, ncol = 1), col = jet_protanopia, main = "Jet Simulated for Protanopia", axes = FALSE)
image(matrix(1:100, ncol = 1), col = jet_tritanopia, main = "Jet Simulated for Tritanopia", axes = FALSE)
We note that while this palette contains many colors, which makes it visually appealing, many of those colors collapse for colorblind viewers. Furthermore, we note sharp transitions between the colors, which may distort the perception of marker expression differences. This is especially visible in the colorblind simulations, where we see sharp breaks, for example between blue and yellow (or, for that matter, between cool and warm colors) in the red-green colorblind simulations. Now we make the UMAP plots, so we can see how colorblind people view our data.
# Create individual plots with relevant labels
p1 <- FeaturePlot(pbmc, features = genes) &
  scale_colour_gradientn(colours = jet_colors(100)) &
  ggtitle("Original Colors (Jet)")
p2 <- FeaturePlot(pbmc, features = genes) &
  scale_colour_gradientn(colours = jet_deuteranopia) &
  ggtitle("Deuteranopia")
p3 <- FeaturePlot(pbmc, features = genes) &
  scale_colour_gradientn(colours = jet_protanopia) &
  ggtitle("Protanopia")
p4 <- FeaturePlot(pbmc, features = genes) &
  scale_colour_gradientn(colours = jet_tritanopia) &
  ggtitle("Tritanopia")
# Combine plots into a 2x2 layout
combined_plot <- (p1 | p2) / (p3 | p4)
# Display the combined plot
combined_plot
We can see the abrupt difference in expression between the top and the bottom of the rightmost island. Subsequent color schemes will show that some of this might be artificial, caused by the abrupt transitions in the jet color scheme. Furthermore, we see that in the deuteranopia case, the differences between the top and bottom of the island do not look nearly as vast. Thus, colorblind viewers may interpret the data differently from non-colorblind viewers.
Now we will redo this with the standard Seurat color palette. We have two plots below. The first is the standard FeaturePlot, which uses the Seurat default, and the second is our attempt to replicate it, which can be confirmed in the image below.
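One way to check whether a break comes from the palette rather than from the data is to look at the palette's lightness profile. The sketch below is an illustrative addition, not part of the original analysis: it rebuilds the jet ramp defined earlier, converts it to CIELAB with base R's `convertColor()`, and tracks the lightness channel L*. The helper name `lightness_profile` is hypothetical. If lightness rises and then falls along the ramp, low and high expression values end up with similarly dark colors, which is one plausible contributor to the breaks discussed above.

```r
# Illustrative sketch (not from the original analysis): lightness profile of jet.
# Base R only (grDevices); `lightness_profile` is a hypothetical helper name.
jet_colors <-
  colorRampPalette(c("#00007F", "blue", "#007FFF", "cyan",
                     "#7FFF7F", "yellow", "#FF7F00", "red", "#7F0000"))

lightness_profile <- function(cols) {
  # col2rgb() returns a 3 x n matrix on a 0-255 scale; convertColor() expects
  # one color per row with channels scaled to [0, 1]
  srgb <- t(grDevices::col2rgb(cols)) / 255
  lab <- grDevices::convertColor(srgb, from = "sRGB", to = "Lab")
  lab[, 1] # first column is L* (perceived lightness)
}

jet_L <- lightness_profile(jet_colors(100))

# Position of the lightest color along the ramp
peak <- which.max(jet_L)

# How often the lightness trend switches between rising and falling
steps <- diff(jet_L)
direction_changes <- sum(diff(sign(steps[steps != 0])) != 0)

cat("Lightest color at position", peak, "of", length(jet_L), "\n")
cat("Direction changes in L*:", direction_changes, "\n")
```

The same helper can be applied to any vector of colors, so the other palettes in this markdown can be checked the same way.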
# Create the default Seurat FeaturePlot
p1 <- FeaturePlot(pbmc, features = genes) &
  ggtitle("Default Seurat Color Palette")
# Create the replication of the Seurat palette
p2 <- FeaturePlot(pbmc, features = genes) &
  scale_colour_gradientn(colours = c("lightgray", "blue")) &
  ggtitle("Replicated Seurat Color Palette")
# Combine the two plots side by side
combined_plot <- p1 | p2
# Display the combined plot
combined_plot
From here, we have to pull out the hex values of this color scheme and use them in the colorblindness simulations. This will then output the color scale bars as we did with the jet color scheme.
# Load required package
library(scales)
# Create the gradient scale
seurat_colors <- scale_colour_gradientn(colours = c("lightgray", "blue"))
# Generate a sequence of positions along the gradient (0 to 1)
positions <- seq(0, 1, length.out = 100) # Adjust '100' to get more or fewer colors
# Use the scale's palette to extract hex values
seurat_colors <- seurat_colors$palette(positions)
# Simulate colorblindness
seurat_deuteranopia <- dichromat(seurat_colors, type = "deutan")
seurat_protanopia <- dichromat(seurat_colors, type = "protan")
seurat_tritanopia <- dichromat(seurat_colors, type = "tritan")
# Plot the original and simulated color schemes
par(mfrow = c(4, 1), mar = c(1, 1, 1, 1))
image(matrix(1:100, ncol = 1), col = seurat_colors, main = "Original Seurat Default Color Scheme", axes = FALSE)
image(matrix(1:100, ncol = 1), col = seurat_deuteranopia, main = "Seurat Simulated for Deuteranopia", axes = FALSE)
image(matrix(1:100, ncol = 1), col = seurat_protanopia, main = "Seurat Simulated for Protanopia", axes = FALSE)
image(matrix(1:100, ncol = 1), col = seurat_tritanopia, main = "Seurat Simulated for Tritanopia", axes = FALSE)
Note that we do not see any of the abrupt transitions that we saw in the jet color scheme. However, much of the resolution that jet users are used to is gone. We note also that this color scheme looks much more balanced in the colorblind simulations.
And now we run the UMAPs accordingly.
# Create individual plots with relevant titles
p1 <- FeaturePlot(pbmc, features = genes) &
  scale_colour_gradientn(colours = seurat_colors) &
  ggtitle("Seurat Default")
p2 <- FeaturePlot(pbmc, features = genes) &
  scale_colour_gradientn(colours = seurat_deuteranopia) &
  ggtitle("Deuteranopia")
p3 <- FeaturePlot(pbmc, features = genes) &
  scale_colour_gradientn(colours = seurat_protanopia) &
  ggtitle("Protanopia")
p4 <- FeaturePlot(pbmc, features = genes) &
  scale_colour_gradientn(colours = seurat_tritanopia) &
  ggtitle("Tritanopia")
# Combine plots into a 2x2 layout
combined_plot <- (p1 | p2) / (p3 | p4)
# Display the combined plot
combined_plot
We note here the level of balance that we see, as opposed to jet. The rightmost island, which is most strongly colored, looks more like a spectrum here, rather than one color on the top and one color on the bottom. This is due to the lack of abrupt transitions in the color scale.
Now we will redo this with viridis, which was made with balance and colorblind friendliness in mind.
library(viridisLite)
vir_colors <- viridis(100) # Default Viridis palette
FeaturePlot(pbmc, features = genes) &
  scale_colour_gradientn(colours = vir_colors)
And now we make the colorblind spectra again for viridis.
# Load necessary libraries
library(colorspace)
library(dichromat)
# Simulate red-green colorblindness (deuteranopia)
vir_deuteranopia <- dichromat(viridis(100), type = "deutan")
# Simulate red-green colorblindness (protanopia)
vir_protanopia <- dichromat(viridis(100), type = "protan")
# Simulate blue-yellow colorblindness (tritanopia) for viridis
vir_tritanopia <- dichromat(viridis(100), type = "tritan")
# Plot the original and all simulated color schemes
par(mfrow = c(4, 1), mar = c(1, 1, 1, 1))
image(matrix(1:100, ncol = 1), col = vir_colors, main = "Original Viridis Color Scheme", axes = FALSE)
image(matrix(1:100, ncol = 1), col = vir_deuteranopia, main = "Viridis Simulated for Deuteranopia", axes = FALSE)
image(matrix(1:100, ncol = 1), col = vir_protanopia, main = "Viridis Simulated for Protanopia", axes = FALSE)
image(matrix(1:100, ncol = 1), col = vir_tritanopia, main = "Viridis Simulated for Tritanopia", axes = FALSE)
We note that there is balance across the spectrum, similar to the Seurat default, but there is more resolution due to there being more colors involved. This higher resolution persists in the colorblind simulations.
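The four-strip comparison has now been drawn three times with nearly identical code. As a convenience, the repeated `par()` and `image()` calls could be wrapped in a small helper. The sketch below is one possible way to do that, using base graphics only; the function name `plot_palette_strips` is hypothetical and not part of any package used here.

```r
# Illustrative helper (hypothetical name): draw one color strip per palette.
# `palettes` is a named list of color vectors; the names become strip titles.
plot_palette_strips <- function(palettes) {
  stopifnot(is.list(palettes), !is.null(names(palettes)))
  n <- length(palettes)
  # Stack the strips vertically and restore the graphics settings on exit
  old_par <- par(mfrow = c(n, 1), mar = c(1, 1, 2, 1))
  on.exit(par(old_par))
  for (label in names(palettes)) {
    cols <- palettes[[label]]
    image(matrix(seq_along(cols), ncol = 1), col = cols,
          main = label, axes = FALSE)
  }
  invisible(n)
}

# Usage with the objects created above:
# plot_palette_strips(list(
#   "Original Viridis Color Scheme"      = vir_colors,
#   "Viridis Simulated for Deuteranopia" = vir_deuteranopia,
#   "Viridis Simulated for Protanopia"   = vir_protanopia,
#   "Viridis Simulated for Tritanopia"   = vir_tritanopia
# ))
```

Because the helper takes plain color vectors, the jet and Seurat strips from earlier can be passed in the same way.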
And now we go to the UMAPs of our single-cell data.
# Create individual plots with relevant titles
p1 <- FeaturePlot(pbmc, features = genes) &
  scale_colour_gradientn(colours = vir_colors) &
  ggtitle("Original Viridis Color Scheme")
p2 <- FeaturePlot(pbmc, features = genes) &
  scale_colour_gradientn(colours = vir_deuteranopia) &
  ggtitle("Viridis Simulated for Deuteranopia")
p3 <- FeaturePlot(pbmc, features = genes) &
  scale_colour_gradientn(colours = vir_protanopia) &
  ggtitle("Viridis Simulated for Protanopia")
p4 <- FeaturePlot(pbmc, features = genes) &
  scale_colour_gradientn(colours = vir_tritanopia) &
  ggtitle("Viridis Simulated for Tritanopia")
# Combine plots into a 2x2 layout
combined_plot <- (p1 | p2) / (p3 | p4)
# Display the combined plot
combined_plot
So we can see that the color schemes here are a bit more colorblind friendly, but with more resolution than what we saw in the Seurat default.
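To put a rough number on "resolution", one option is the total distance a palette travels through CIELAB space, summed over neighbouring colors. The sketch below is an illustrative addition under several assumptions: path length is only one possible proxy and is not the measure used elsewhere in this markdown; the gray-to-blue ramp is built with `colorRampPalette()`, which approximates but does not exactly match the ggplot2 gradient; and `hcl.colors(n, palette = "viridis")` is base R's approximation of viridis, used here so the example runs without extra packages. The helper name `lab_path_length` is hypothetical.

```r
# Illustrative sketch: total CIELAB path length as a rough proxy for "resolution".
# Base R only; `lab_path_length` is a hypothetical helper name.
lab_path_length <- function(cols) {
  srgb <- t(grDevices::col2rgb(cols)) / 255
  lab <- grDevices::convertColor(srgb, from = "sRGB", to = "Lab")
  # Euclidean distance between each pair of neighbouring colors, summed
  sum(sqrt(rowSums(diff(lab)^2)))
}

n <- 100
palettes <- list(
  jet = colorRampPalette(c("#00007F", "blue", "#007FFF", "cyan",
                           "#7FFF7F", "yellow", "#FF7F00", "red", "#7F0000"))(n),
  # Assumption: RGB interpolation stands in for the ggplot2 gradient
  seurat_like = colorRampPalette(c("lightgray", "blue"))(n),
  # Assumption: base R's approximation of the viridisLite palette
  viridis_like = grDevices::hcl.colors(n, palette = "viridis")
)

path_lengths <- vapply(palettes, lab_path_length, numeric(1))
print(round(path_lengths, 1))
```

A longer path means more distinguishable color steps are available, but it says nothing about whether those steps are evenly spaced, so it is best read alongside the lightness and colorblind comparisons rather than on its own.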
We found above that the jet color scheme does not have balanced transitions between colors, which can exaggerate what might otherwise be small differences. We also found that this is particularly pronounced in colorblind simulations. We further found that while the Seurat default color scheme corrects for the lack of balanced transitions and is more colorblind friendly, it lacks the resolution (in terms of different colors) that jet users would otherwise want.
Accordingly, viridis is the best of both worlds, presenting a balanced color palette that also has both high resolution and colorblind friendliness. Taken together, the data suggest that viridis, or similar perceptually uniform palettes, should be strongly considered as defaults for continuous data visualization in scRNA-seq and flow cytometry, especially when accessibility and interpretability are priorities.
We note that this markdown was inspired by a SciPy talk from 2015, given by the developers of viridis. That talk, which is only 19 minutes long, can be found here, and I encourage the reader to watch it for a detailed first-principles look at color theory and how and why they came up with viridis.
The key takeaway for the reader is that we should be using viridis, or similar color palettes that have been developed since then, as defaults. This will lead to both better data interpretation and colorblind friendliness.
We note that colorblindness affects up to 8% of men and 0.5% of women (source). Thus, a meaningful fraction of readers will be colorblind and will have difficulty interpreting traditional color palettes like jet. And you never know if one of these viewers will be the person who is looking to invest in or buy your company, accept your paper for publication in a top-tier journal, or whatever else.
Accordingly, people in academia and industry should strongly consider using viridis in their publications, talks, social media posts, posters, VC pitch decks, and whatever else. By doing so, we can enhance interpretability, inclusivity, and data literacy across academic and industry settings.