Public Site Tutorial
This tutorial is for the public Cluster Tracker tool. Please see the access FAQ for the private CDPH California Big Tree Cluster Tracker tool.
Steps to investigate a sample in Cluster Tracker.
In this example, imagine your region has a growing number of cases. The cases you have found appear to share a novel mutation in the spike protein, S:S813G, within the successful XBB lineage. You want to investigate whether other samples in your community share this mutation. You know that some spike mutations can make the protein less recognizable by antibodies, and that mutations in the Receptor Binding Domain (RBD), which binds ACE2 receptors, are particularly significant. The Cluster Tracker tool can help you generate and prioritize different hypotheses about what is happening in your community with this new variant.
To start, you come to the public Cluster Tracker tool with a GenBank accession, OQ381704.1, for one of your samples.
Open the site: https://clustertracker.gi.ucsc.edu/
In the "Search:" box on the far right, paste the accession: OQ381704.1
Ensure you do not copy any leading whitespace.
This will filter the table to a cluster with this sample.
The result is a cluster with an identifier like California_node_#, containing 19 samples at the time this was written.
The # number will change every time the tree is rebuilt, as new nodes on the tree are created, and new samples are added.
Therefore, it is important to keep a sample accession, such as OQ381704.1, when returning to a tree to find how the cluster may have changed over time.
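If you keep accessions in a notes file or spreadsheet so you can return to a cluster later, stray whitespace picked up while copying is an easy way to get an empty search result. The following is a minimal sketch of a cleanup helper. The function name and the accession pattern are illustrative assumptions, not part of Cluster Tracker, and the pattern is deliberately loose.

```python
import re

# Loose, illustrative pattern for accessions such as OQ381704.1:
# one or two capital letters, 5 to 8 digits, and an optional ".version" suffix.
# This is an assumption for the sketch, not a rule enforced by Cluster Tracker.
ACCESSION_PATTERN = re.compile(r"[A-Z]{1,2}\d{5,8}(\.\d+)?")


def clean_accession(raw: str) -> str:
    """Strip surrounding whitespace and reject text that does not look like an accession."""
    cleaned = raw.strip()
    if not ACCESSION_PATTERN.fullmatch(cleaned):
        raise ValueError(f"not a recognizable accession: {raw!r}")
    return cleaned


if __name__ == "__main__":
    # A value copied with leading spaces and a trailing newline.
    print(clean_accession("  OQ381704.1\n"))
```

The helper only trims the ends of the string; an accession with a space in the middle is rejected rather than silently repaired.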
If the data loads incompletely, try Chrome; the site operates best in that browser.
The "Best Potential Origins" column shows indeterminate, suggesting there isn't a clear source for this cluster; it may be from overseas.
The XBB.1.9 indicates the lineage, and the date range suggests it has been active for about two months.
Double-clicking the "Samples" column will open a pop-up with the information needed to find the other regional samples to investigate.
The next step is to click the "View Cluster" link.
This opens the Taxonium view. Click OK on the pop-up, and the entire tree will load. Depending on your internet speed, this may take some time and you may need to refresh; if you do, be sure to click OK on the pop-up again.
Once the whole tree is loaded, click the tiny magnifying glass to zoom to the cluster that was selected.
This magnifying glass is next to the "19 results", the number of samples related to this OQ381704.1 sample found by the mutation analysis utilities behind Cluster Tracker.
You can zoom in and zoom out with the scroll wheel on your mouse, or the other magnifying glasses toward the bottom of the page.
Now, at the top of the page, right-click the Taxonium link in the phrase "powered by Taxonium" and open it in a new window.
In this new window, click the "SARS-CoV-2" link. Alternatively, skip the previous step and use this link to open a new window side by side with the first one: https://taxonium.org/?backend=https://api.cov2tree.org
On this separate page find the "Search" feature and change the "Name" to "Genbank accession".
Now paste the accession into the box (without whitespace): OQ381704.1
Click the magnifying glass next to the resulting "1 result" (note that the image shows a different accession than the one used earlier).
This is the California_node_# cluster in the Cov2Tree browser, without the cluster identifying information.
Once viewing the cluster in the new Cov2Tree page, you can add the Treenome Browser view of Bloom data.
On the Cov2Tree window, find and mark the checkbox next to the "Treenome Browser" on the top right.
You may need to drag the tree on the left back into view if it has moved once the Treenome genome view comes up.
These vertical lines represent the mutations of the variants displayed on the left in the phylogenetic tree.
With the Cov2Tree Treenome Browser open, you can see a settings "sandwich icon" under NC_045512v2 on the top left.
Click it to get a pop-up of "Available tracks".
This is the place where you can turn on the Bloom lab’s mutation mapping scores.
There are a lot of tracks, so to help reduce the number of choices, we will collapse the first group.
Click the down arrow by "UCSC Tracks (Composite)" to make the arrow go sideways and hide that group.
Look at the second "UCSC Tracks" group and, toward the top middle, just after the "ARTIC" tracks, click the three "Bloom Lab:" RBD tracks.
Once you have clicked to add the checkmarks, click anywhere outside the pop-up to return to the main view (the top "X" will not work).
With the Bloom tracks successfully displayed, you can see a blue signal on the far right of the genome, under the S protein.
Zoom in on the S protein. One way to do that is to click and drag within the bar right above the S Genes track, the one marked 20,000; a small "Zoom to region" option should then pop up.
Click the "Zoom to region" pop-up. Again, this only appears if you drag in the top base track, above the gene annotations.
Another way to zoom is to use the two magnifying glasses on the top right. You may need to resize the window to see these two icons if your screen is narrow.
Zoom in on the "blue mountain" of RBD scores, then zoom out a little on the right-hand side.
Or paste this sequence range in the central view input box: NC_045512v2:22,415..24,586
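The range string above follows a "reference:start..end" shape with comma-grouped digits. If you want to record or compare such ranges outside the browser, the sketch below splits one into its parts. The function name is hypothetical, and the format handling is inferred only from the single example in this tutorial, not from Treenome documentation.

```python
from typing import Tuple


def parse_region(text: str) -> Tuple[str, int, int]:
    """Split a 'reference:start..end' string into (reference, start, end).

    The format is inferred from the example in this tutorial; other
    variants the browser may accept are not handled here.
    """
    reference, colon, span = text.strip().partition(":")
    start_text, dots, end_text = span.partition("..")
    if not (reference and colon and dots):
        raise ValueError(f"expected 'reference:start..end', got {text!r}")
    # Remove the thousands separators before converting to integers.
    start = int(start_text.replace(",", ""))
    end = int(end_text.replace(",", ""))
    if start > end:
        raise ValueError(f"start {start} is after end {end}")
    return reference, start, end


if __name__ == "__main__":
    reference, start, end = parse_region("NC_045512v2:22,415..24,586")
    print(reference, start, end, "span:", end - start + 1, "bases")
```

For the range in this tutorial the parsed span covers 2,172 bases, from position 22,415 through 24,586 inclusive.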
Here's a view of the cluster region; with the S protein zoomed in, you should see something like the following.
The vertical line on the far right represents S:S813G for this group. It lies far outside the RBD, as shown by the Bloom track, which suggests this is probably not a variant of concern.
You may have to use the mouse scroll wheel to zoom out on the phylogenetic tree to condense the cluster. You can also use the bottom magnifying glasses to zoom in and out on the tree horizontally, either to see branches more clearly or to condense the tree.
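To check the position of S:S813G numerically rather than by eye, you can convert the spike residue number to genome coordinates. The sketch below assumes the S coding sequence starts at position 21,563 of the reference genome and that the RBD spans roughly spike residues 319 to 541. Neither figure comes from this tutorial, so verify both against the reference annotation before relying on them. The view range is the one given above.

```python
from typing import Tuple

# ASSUMPTION: first base of the S coding sequence in the reference genome.
S_GENE_START = 21563
# ASSUMPTION: approximate RBD boundaries in spike residue numbering.
RBD_FIRST_RESIDUE = 319
RBD_LAST_RESIDUE = 541
# View range taken from the tutorial text: NC_045512v2:22,415..24,586
VIEW_START, VIEW_END = 22415, 24586


def codon_coordinates(residue: int, gene_start: int = S_GENE_START) -> Tuple[int, int]:
    """Return the 1-based genome positions of the first and last base of a codon."""
    if residue < 1:
        raise ValueError("residue numbers start at 1")
    first = gene_start + 3 * (residue - 1)
    return first, first + 2


def in_rbd(residue: int) -> bool:
    """True if the residue falls inside the assumed RBD boundaries."""
    return RBD_FIRST_RESIDUE <= residue <= RBD_LAST_RESIDUE


if __name__ == "__main__":
    first, last = codon_coordinates(813)
    print("S:813 codon:", first, "to", last)
    print("inside view range:", VIEW_START <= first and last <= VIEW_END)
    print("inside assumed RBD:", in_rbd(813))
```

Under these assumptions the codon for residue 813 sits at positions 23,999 to 24,001, which is inside the suggested view range but several hundred bases past the end of the RBD, in line with what the Bloom track shows.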
Notice on the far left the circle around the sample OQ381704.1 in the phylogenetic tree helps us identify this cluster, whereas in the earlier first view from Cluster Tracker all the related samples were circled.
Also note that in this screenshot the "Color by:" option is set to "PANGO Lineage", but there are other options. For instance, you can change it to "Genotype", keep "Gene S", and change the "Residue" to "813". The tree will then emphasize the group where S:813 is G instead of S. This can be a way to zoom out on the entire phylogenetic tree and find other mutants with this change.
Another way to find variants elsewhere with this change is to click the "Add a new search" option. Change the "Name" option to "Mutation", then likewise set the "Mutation at residue" to 813 and the mutation to G. You can then click the magnifying glass to see these results in the context of the tree. The results show that this mutation appears very infrequently and at random, suggesting it may even be deleterious for the virus.
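The label S:S813G packs together the values entered separately in the "Color by" and "Mutation" search forms: the gene (S), the original residue (S), the position (813), and the new residue (G). The following sketch splits such a label into those fields. The class, function, and pattern are illustrative assumptions and cover only simple single-residue substitutions, not deletions or stop codons.

```python
import re
from typing import NamedTuple


class AminoAcidChange(NamedTuple):
    gene: str
    original: str
    residue: int
    replacement: str


# Illustrative pattern: gene name, colon, one-letter original residue,
# position, one-letter replacement residue (for example "S:S813G").
MUTATION_PATTERN = re.compile(r"([A-Za-z0-9]+):([A-Z])(\d+)([A-Z])")


def parse_mutation(label: str) -> AminoAcidChange:
    """Split a 'GENE:RefPosAlt' label into its four parts."""
    match = MUTATION_PATTERN.fullmatch(label.strip())
    if match is None:
        raise ValueError(f"not a simple substitution label: {label!r}")
    gene, original, residue, replacement = match.groups()
    return AminoAcidChange(gene, original, int(residue), replacement)


if __name__ == "__main__":
    change = parse_mutation("S:S813G")
    # These are the values to enter in the search form fields.
    print("Gene:", change.gene)
    print("Residue:", change.residue)
    print("New residue:", change.replacement)
```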
Try exploring other tracks in Treenome. There are even tracks to show evolutionary protein-coding potential as determined by PhyloCSF to help identify conserved, functional, protein-coding regions of the genome.