Using an external tool for harmonization

In this example, we will take a neighborhood graph obtained with an external tool such as scVI and cluster the cells using northstar’s atlas-aware clustering algorithm.

The key class is ClusterWithAnnotations, which takes two required arguments:

- graph: the external graph, whose cells must follow a specific order. The first cells (starting from 0) must be the annotated atlas cells; the last ones must be the cells to be annotated.
- annotations: an array/list containing the cell types of the atlas cells. Its length is used to infer which cells in the graph are annotated atlas cells (indices 0 to len(annotations) - 1) and which ones are to be annotated (all remaining cells).
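If the external tool returns atlas cells and new cells interleaved, the graph indices have to be remapped so that atlas cells come first. The following is a minimal sketch of that bookkeeping in plain Python. The names `is_atlas`, `atlas_first_order` and `remap_edges` are hypothetical helpers for illustration and are not part of northstar.

```python
def atlas_first_order(is_atlas):
    """Map old cell indices to new ones, placing atlas cells first.

    is_atlas is a list of booleans in the external tool's cell order
    (hypothetical input). Relative order within each group is preserved.
    """
    atlas = [i for i, flag in enumerate(is_atlas) if flag]
    new = [i for i, flag in enumerate(is_atlas) if not flag]
    return {old: pos for pos, old in enumerate(atlas + new)}


def remap_edges(edges, old_to_new):
    """Rewrite each (i, j) edge using the new cell indices."""
    return [(old_to_new[i], old_to_new[j]) for i, j in edges]


# Example: the tool listed cells as new, atlas, atlas, new
is_atlas = [False, True, True, False]
edges = [(0, 1), (2, 3)]

old_to_new = atlas_first_order(is_atlas)
edges_sorted = remap_edges(edges, old_to_new)
print(edges_sorted)
```

After remapping, the annotations array must list the atlas cell types in the same new order (here, old cells 1 and 2 become cells 0 and 1).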

import anndata
import northstar

# Read in the GBM data to be annotated
# Here we assume it's a loom file, but
# of course it can be whatever format
newdata = anndata.read_loom('...')

# The variable graph contains a sparse
# adjacency matrix between cells. In
# other words, graph[i, j] is nonzero
# if cells i and j are neighbors. graph
# was computed using an external tool
print(graph)

# The variable annotations contains an
# array/list of cell types for the atlas
# cells, which are cells 0 to n_atlas - 1
# in the graph
print(annotations)
n_atlas = len(annotations)

# Prepare the clustering class
model = northstar.ClusterWithAnnotations(
    graph=graph,
    annotations=annotations,
    )

# Run the classification
model.fit(newdata)

# Get the inferred cell types
cell_types_newdata = model.membership

Although this example uses a sparse adjacency matrix, you can use an igraph graph instead. A typical way to create a graph from a list of edges is:

import igraph as ig

# edges is a list of pairs, e.g.
# edges = [(0, 1), (0, 3)]
# would indicate that cells 0 and 1
# are neighbors, 0 and 3 are also
# neighbors, while cell 2 has no
# neighbors
print(edges)

# n_newcells is the number of cells to
# be annotated; it must be defined
# beforehand
graph = ig.Graph(
    n=n_atlas+n_newcells,
    edges=edges,
    )

# Prepare the clustering class
model = northstar.ClusterWithAnnotations(
    graph=graph,
    annotations=annotations,
    )

The rest of the code stays the same.
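External tools often report neighbors as one list of neighbor indices per cell rather than as a list of pairs. A minimal sketch of converting that form into the edge list used above follows. The `neighbors` input and the `neighbors_to_edges` helper are hypothetical, and dropping self-loops and duplicate edges is an assumption made here to keep the graph simple, not a documented northstar requirement.

```python
def neighbors_to_edges(neighbors):
    """Convert per-cell neighbor lists into unique undirected edges.

    neighbors[i] holds the indices of the cells adjacent to cell i.
    Each pair is stored once as (smaller index, larger index).
    """
    edges = set()
    for i, neighbor_ids in enumerate(neighbors):
        for j in neighbor_ids:
            if i != j:  # skip self-loops
                edges.add((min(i, j), max(i, j)))
    return sorted(edges)


# Cell 0 neighbors cells 1 and 3; cell 2 has no neighbors
neighbors = [[1, 3], [0], [], [0]]
edges = neighbors_to_edges(neighbors)
print(edges)  # [(0, 1), (0, 3)]
```

The resulting list can be passed as the edges argument when building the igraph graph, provided the cell indices already follow the atlas-first order.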