A group of researchers from Columbia and Stanford has created a method for turning complex cellular datasets into visualizations that map the similarities between tens of thousands of cells within a tissue sample.
The idea of representing large or complex data as a graph is nothing new, but it has taken on more prominence thanks to the rise of social media and those ubiquitous social graphs that map out who’s connected to whom. As we highlighted recently, however, graph analysis is becoming more popular outside the realm of social networks, and is being applied to problems that are more complex than just figuring out simple relationships within a network. In cases such as medical research, especially, graphs can provide a very effective way of seeing how potentially hundreds of thousands of data points spanning perhaps hundreds of variables are similar to each other.
That’s exactly what the team at Columbia and Stanford has done with a new algorithm that they’ve demonstrated within the realm of mass cytometry. According to a press release announcing the research (which is available via paid download at Nature Biotechnology):
“The method, called viSNE (visual interactive Stochastic Neighbor Embedding), is based on a sophisticated algorithm that translates high-dimensional data (e.g., a dataset that includes many different simultaneous measurements from single cells) into visual representations similar to two-dimensional ‘scatter plots’ ….
“The viSNE software can analyze measurements of dozens of molecular markers. In the two-dimensional maps that result, the distance between points represents the degree of similarity between single cells. The maps can reveal clearly defined groups of cells with distinct behaviors (e.g., drug resistance) even if they are only a tiny fraction of the total population. This should enable the design of ways to physically isolate and study these cell subpopulations in the laboratory.”
I assume they say “similar to” scatter plots because the algorithm analyzes data across more than two dimensions, even though the resulting chart reads essentially the same way (i.e., data points with similar characteristics form clusters).
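To make the "distance represents similarity" idea concrete, here is a minimal sketch of the first step that stochastic neighbor embedding methods generally take: converting distances in high-dimensional marker space into neighbor probabilities. This is not the viSNE code. The synthetic "cells," the five made-up markers, the fixed `sigma` and the function name `neighbor_probabilities` are all assumptions for illustration (real implementations typically tune the bandwidth per point and then go on to optimize a two-dimensional layout, which is omitted here).

```python
import math
import random


def neighbor_probabilities(cells, sigma=1.0):
    """Return P, where P[i][j] is the probability that cell i picks cell j
    as its neighbor under a Gaussian kernel on squared Euclidean distance.
    A single fixed sigma is an illustrative simplification."""
    n = len(cells)
    P = []
    for i in range(n):
        # Squared distance from cell i to every other cell (None for itself).
        sq = [
            None if i == j
            else sum((a - b) ** 2 for a, b in zip(cells[i], cells[j]))
            for j in range(n)
        ]
        # Subtract the smallest distance before exponentiating so the
        # normalizing sum can never underflow to zero.
        nearest = min(d for d in sq if d is not None)
        weights = [
            0.0 if d is None else math.exp(-(d - nearest) / (2 * sigma ** 2))
            for d in sq
        ]
        total = sum(weights)
        P.append([w / total for w in weights])
    return P


# Hypothetical data: 27 "common" cells and 3 "rare" cells, 5 markers each.
rng = random.Random(0)
N_MARKERS = 5
common = [[rng.gauss(0.0, 0.3) for _ in range(N_MARKERS)] for _ in range(27)]
rare = [[rng.gauss(3.0, 0.3) for _ in range(N_MARKERS)] for _ in range(3)]
cells = common + rare
rare_ids = {27, 28, 29}

P = neighbor_probabilities(cells)

# How much of each rare cell's neighbor probability lands on other rare cells?
for i in sorted(rare_ids):
    mass = sum(P[i][j] for j in rare_ids if j != i)
    print(f"rare cell {i}: probability mass on rare neighbors = {mass:.4f}")
```

Even though the rare cells make up only a tenth of this toy population, each one places nearly all of its neighbor probability on the other rare cells. A layout that preserves those probabilities in two dimensions would draw them as their own small cluster, which is the behavior the press release describes.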