First you pick the dataset (2017 or 2019) [2]. Then you can tweak the 'Genes' [3] and 'Tissues' [4] by clicking in them and starting to type (allowed values will auto-fill). You can also delete values by clicking on them and hitting the 'delete' key on your keyboard. You can tweak the display of the box plots a bit by changing the 'Number of columns' field [5]. A higher number will squeeze more plots in each column. When you are done tweaking those parameters, click the big blue '(Re)Draw Plot!' button [6] and wait a few seconds.
If you mouse over a data point [8], you will get metadata about that particular sample.
Each gene gets its own box. The y-axis is length scaled TPM (log2 transformed). The x axis is samples, colored by tissue. The right panels [9, 10] are tables with external links to gene info [9] and the absolute TPM values and the rank of the gene in the particular tissue (lower is more highly expressed) [10].
To give a rough sense of how highly expressed the gene is in the tissue, the decile of expression is given in [10]; 10 is the highest decile of expression and 1 is the lowest.
This is a very simple differential expression test. When you click this radio button [1] the view changes, where the expression is being shown *relative* to the reference tissues [2]. By default it is all of the tissues; in the screenshot Retina - Organoid and Retina - Stem Cell Line are the baseline expression samples. The data table on the right [3] will then display log2 fold change, average expression, and the p-value (from a t-test) of the differential expression.
This produces a 2D visualization, with each gene as a row and each tissue as a column. More yellow is more expressed. It is a efficient way to display the expression of many genes and tissues.
If you use the Pan-Tissue Boxplot feature a lot, you may find it frustrating to have to input in your favorite genes and tissues. We have added the ability to use a custom url to load in the genes and tissues of your choice. Previously you had to build this link youself - but now there's a handy button [7] you can click that will re-create the parameters. One downside is that the web app is continually using the URL dataset, which makes it impossible for you to change it. You can simply reload the web page with the custom bits.
This is short for differential expression. We have pre-calculated 55+ differential expression tests. All eye tissue - origin pairs were compared to each other. We also have a synthetic human body set, made up of equal numbers of GTEx tissues (see manuscript, above, for more details). The word cloud displayed shows as many as the top 75 terms used in enriched GO terms in the selected comparison. The table data shows the actual GO terms. You can search for the comparison of your choice.
These are the values taken from the limma differential expression topTable() summary table. The following has been taken from the limma manual and edited to match parameters we used (https://www.bioconductor.org/packages/devel/bioc/vignettes/limma/inst/doc/usersguide.pdf):
A number of summary statistics are presented by topTable() for the top genes and the selected contrast. The logFC column gives the value of the contrast. Usually this represents a log2-fold change between two or more experimental conditions although sometimes it represents a log2-expression level. The AveExpr column gives the average log2-expression level for that gene across all the arrays and channels in the experiment. Column t is the moderated t-statistic. Column P.Value is the associated p-value and adj.P.Value is the p-value adjusted for multiple testing (False Discovery Rate corrected).
The B-statistic (lods or B) is the log-odds that the gene is differentially expressed. Suppose for example that B = 1.5. The odds of differential expression is exp(1.5)=4.48, i.e, about four and a half to one. The probability that the gene is differentially expressed is 4.48/(1+4.48)=0.82, i.e., the probability is about 82% that this gene is differentially expressed. A B-statistic of zero corresponds to a 50-50 chance that the gene is differentially expressed. The B-statistic is automatically adjusted for multiple testing by assuming that 1% of the genes, or some other percentage specified by the user in the call to eBayes(), are expected to be differentially expressed. The p-values and B-statistics will normally rank genes in the same order. In fact, if the data contains no missing values or quality weights, then the order will be precisely the same.
The Macosko data is a single-cell (~45,000) retina RNA-seq mouse P14 C57BL/6 dataset from Mackosko and McCarroll's field defining
paper.
The cluster / cell type assignments are taken from
here.
The Clark data is a 100,000 cell plus mouse retina RNA-seq time series dataset. Their pre-publication manuscript is on
bioRxiv.
Data was pulled from
here.
To efficiently display a huge amount of information, expression across many individual cells is averaged by cell type, (if available) age, and gene. You can select the Macosko or Clark dataset [1], then one gene [2] to plot. The gene expression is displayed as a heatmap, with each row being a retina cell type (derived by the respective authors) and each column [4] is a time point, arranged from youngest to oldest. More yellow is higher expression [5].
You can add the rank of expression (or rank of percentage of cells with detectable expression of selected gene) with this radio