In this essay I attempt a top-down description of the information content in engrams. I focus on the structure that most plausibly mediates rapid electrochemical flow across brain regions to recall memories stored in engrams: the connectome. The connectome is a vague term. To make the discussion more precise, I consider different possible levels of detail that could be used to describe or measure the connectome, listed in order of increasing detail: the adjacency connectome, the cell membrane connectome, the ultrastructural feature connectome, the epigenetic-annotated connectome, and the biomolecule-annotated connectome. I consider evidence that each level could be sufficient to describe engrams, such that more precise levels of detail would not be necessary. While there is some possibility that a coarser level of detail could sufficiently describe engrams, I describe why I think that, most likely, the level of detail required will be the biomolecule-annotated connectome. A key area of uncertainty is what degree of biomolecule annotation will be required to sufficiently describe connectomes – on the spectrum from a sparse to a rich degree of biomolecule annotation. The amount of biomolecule annotation required depends in part on how much mutual information there is in the biomolecular networks that are relevant to engram maintenance.
I use the word “connectome” to refer to the structural microscale connectome, as opposed to the mesoscale or macroscale connectome (Sporns, 2016). One important aspect of the connectome to be cognizant of is that it is not a static structure. The connectome is constantly being remodeled as a result of one’s experiences and may drift over time. Every time one recalls a memory, there is evidence that the associated engram is re-consolidated in an altered structural form (Chen et al., 2020). The memory itself may also be altered during the retrieval process (as cited in (Josselyn et al., 2017)). Yet, the fact that the connectome changes all the time does not mean that a snapshot of the connectome at any given time is not sufficient for engrams. It just means that the information contained in an engram is not necessarily an accurate representation of what happened in the world when it was initially formed. This is merely a restatement of the well-known fallibility of memory. Slight changes in the connectome are perfectly consistent with ordinary survival.
A wealth of evidence suggests engrams are encoded in neural structures distributed across the brain (Ryan et al., 2021). More specifically, it seems to be the distributed activity of neuronal ensembles communicating through the connectome that instantiates long-term memory recall (Tonegawa et al., 2015) (Kim et al., 2020).
Beyond long-term memory recall alone, there is a lot of evidence that the class of memory systems that includes declarative memories is encoded via neural connectivity patterns. For example, see (Martin et al., 2000) and the studies described at https://aspirationalneuroscience.org/.
Based on this experimental evidence and the spatiotemporal criterion, it is a central claim of these essays that there is no set of structures in the brain other than electrochemical information flow through the connectome that could operate on a spatiotemporal scale sufficiently fast to allow for rapid long-term memory recall.
The Brain Preservation Foundation (BPF)’s prize was for a procedure that could be shown to preserve the connectome across an animal brain with a method that would allow long-term storage of at least 100 years. Unfortunately, the use of the word “connectome” in brain preservation has led to communication difficulties for the past decade. Because it is such an amorphous word, it’s easy to straw person the proposal for brain preservation as life extension by saying that the connectome is not enough. For example, Sam Gershman does so in a conversation with Live Science about the company Nectome, claiming that “You need to know the synaptic strengths, if they’re excitatory/inhibitory, various time constants, what neuromodulators are present, the dynamical state of dendritic spines. And that’s all assuming that memories are even stored at synapses!”
The core idea of the BPF’s prize was not just connectivity, but connectivity alongside sufficient molecular information. As Ken Hayworth’s 2011 prize proposal stated (Kenneth Hayworth, 2011):
We now understand that the true measure of success should be that a procedure preserves the structural connectivity of the neuronal circuits of the brain along with enough molecular level information necessary to infer the functional properties of the neurons and their synaptic connections.
But as Gershman’s quote shows, that has not always gotten across. Arguably, it may be better to speak of preserving something like “morphomolecular maps” that explicitly contain both the morphologic information of abstract structural features (e.g., the connectome) as well as their biomolecular constituents. I hope to rescue the term connectome from some of its linguistic imprecision by specifying different possible levels of detail to describe the connectome. This way, we can communicate more precisely about what we mean.
Here are the levels of detail that I’ve come up with to conceptualize the connectome:
Level | Description |
---|---|
Adjacency connectome | Adjacency matrix, with or without weights |
Cell membrane connectome | 3D map of cell membrane locations of brain cells |
Ultrastructural feature connectome | As above, + intracellular/extracellular features visible on ultrastructure |
Epigenetic-annotated connectome | As above, + nuclear gene expression/epigenetic information |
Biomolecule-annotated connectome | As above, + detailed biomolecule distribution information across the connectome |
For each level of the connectome, I will consider what it is, what kind of synaptic properties could be estimated based on it, and consider the evidence that mapping this level of detail might be sufficient to describe engrams.
As an example of this, in investigating the Drosophila hemibrain connectome, (Scheffer et al., 2020) analyzes different levels of the connectome: compartment maps, neuron skeletons, connectivity graphs, and adjacency matrices. Here they show some of the ways that they represent their connectome data:
As another example, here is a figure from (Morgan et al., 2017) describing different levels of connectome representation:
As you can see, the first figure from (Scheffer et al., 2020) describes how raw electron microscopy data can be processed in different ways, producing different types of connectome data that are useful to different types of users. The second figure from (Morgan et al., 2017) shows how connectome data can have its dimensionality reduced, for example, to a representation of the connectome between different cell types. Both of these can be thought of as different levels of detail to describe connectome data.
In one of two 2005 publications to independently coin the term “connectome,” Hagmann defined the connectome as a graph in which “each neuron is represented by a labeled vertex and its connections with a set of weighted and oriented edges” (Hagmann, 2005). Specifying connectivity data between neurons is therefore the first and perhaps the most literal definition of a connectome, in the -omics sense of capturing the totality of the connections within a nervous system. I would call Hagmann’s definition an adjacency connectome with weights and directions.
There are different possible ways to measure the adjacency connectome. Investigators could use a viral tracing approach that spreads across chemical synapses. This has been used to produce an adjacency connectome in a local brain region (Rossi et al., 2020). However, this viral tracing method would not necessarily allow counting the number of synaptic connections between neurons. One could also use electron microscopy and then use dimension reduction techniques to only focus on the synaptic connections (Lichtman et al., 2014). A true adjacency connectome would contain not only chemical synapses but also electrical ones, which are not as widely studied as chemical synapses but clearly play a major role in rapid electrochemical ion flow (Alcamí et al., 2019). Capturing electrical synapses would likely require electron microscopy measurements with a resolution of 2 nm or below (Marc et al., 2014).
It’s important to point out that Hagmann’s definition of the connectome, and indeed most definitions of the connectome, focus on connections between neurons. However, this focus is incomplete. Other cell types also participate in rapid electrochemical ion flow between cells, such as oligodendrocyte precursor cells and oligodendrocytes (Yamazaki et al., 2019). The strength of synaptic connections is likely also dependent on other cells nearby a synapse, such as astrocytes. In a complete adjacency connectome, non-neuronal cells would also need to be included.
Related to this, my personal feeling is that sometimes there is an excessive focus on neurons in the neuroscience literature. A lot of this may come from the bias of historically influential neuroscientists, such as Santiago Ramón y Cajal, who called non-neurons “glue” (thus: glia, which is Greek for glue) and considered them uninteresting. Regardless of the reason, a possible bias towards neurons is important to recognize and be cautious about.
Synaptic properties estimate: To distinguish the adjacency connectome from the next level of description, I will define synaptic weights in the adjacency connectome level solely based on counting the number of connections between cells.
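To make this definition concrete, here is a minimal sketch of how a weighted, directed adjacency connectome could be built by counting connections. The cell names and the synapse list are hypothetical placeholders invented for illustration, not data from any study cited here.

```python
from collections import Counter

# Hypothetical synapse list: each entry is (presynaptic cell, postsynaptic cell).
# The cell names "A", "B", "C" are made up for illustration.
synapses = [
    ("A", "B"), ("A", "B"), ("A", "C"),
    ("B", "C"), ("C", "A"), ("A", "B"),
]

def adjacency_from_synapses(synapse_list):
    """Return (cells, matrix), where matrix[i][j] is the number of
    synapses from cells[i] (presynaptic) onto cells[j] (postsynaptic)."""
    cells = sorted({cell for pair in synapse_list for cell in pair})
    index = {cell: i for i, cell in enumerate(cells)}
    matrix = [[0] * len(cells) for _ in cells]
    for (pre, post), count in Counter(synapse_list).items():
        matrix[index[pre]][index[post]] = count
    return cells, matrix

cells, weights = adjacency_from_synapses(synapses)
for cell, row in zip(cells, weights):
    print(cell, row)
```

Rows are presynaptic cells and columns are postsynaptic cells, so the matrix is not symmetric in general. Everything about synapse size, location, and molecular composition is discarded at this level, which is why the weights are only counts.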
While the adjacency connectome may seem like a coarse level of description, it’s important to point out that measuring the connections between cells is certainly not a trivial task. It has been estimated that there are 10^16 synapses in the brain, just considering neurons alone (DeFelipe et al., 2016). Assigning those connections to their origin cells also requires one to either identify a unique label for each cell or to trace back the origin cell of the process from which the synapse branches.
Sufficiency for describing engrams: It is hard to imagine being able to describe engrams in brain tissue without being able to infer the connections between cells. Communication between cells is the fundamental unit of information transmission in the brain. There is a reason that many neuroscientists such as Joseph LeDoux have pinned the majority of information about our sense of self on our set of synapses (LeDoux, 2003).
At the same time, the adjacency connectome alone seems not to be sufficient for engram storage. First, evidence from C. elegans suggests that simply counting the number of synaptic connections between two neurons is not a good predictor of the functional strength between those neurons (Yemini et al., 2021). This data point suggests that the adjacency connectome, even when connections are assigned weights via counting the number of synapses between neurons, is insufficient as a measure of connection strength.
Another form of information that the adjacency connectome will lose is the timing with which signals are sent and received by cells in the nervous system, which is dependent on the speed with which electrochemical signals propagate within and between cells. This, in turn, depends on the relative distances between connections, which is lost when reducing the connectome to an adjacency matrix level of description.
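As a rough illustration of what is lost, consider two connections that look identical in an adjacency matrix but differ in path length. The sketch below uses the simple relation delay = distance / velocity; the path lengths and conduction velocity are hypothetical values chosen only to show the arithmetic.

```python
def conduction_delay_ms(path_length_mm, velocity_m_per_s):
    """Approximate propagation delay in milliseconds.

    Units: mm / (m/s) = 1e-3 s = 1 ms, so the ratio is already in ms.
    This ignores synaptic delay and any variation in velocity along the path.
    """
    if velocity_m_per_s <= 0:
        raise ValueError("velocity must be positive")
    return path_length_mm / velocity_m_per_s

# Two hypothetical connections with the same adjacency weight (one synapse each)
# but very different path lengths. The numbers are illustrative assumptions.
short_path_delay = conduction_delay_ms(path_length_mm=2.0, velocity_m_per_s=1.0)
long_path_delay = conduction_delay_ms(path_length_mm=20.0, velocity_m_per_s=1.0)
print(short_path_delay, long_path_delay)
```

In the adjacency matrix both connections would be recorded as the same entry, while the estimated delays differ tenfold under these assumptions.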
In the 1980s, Sydney Brenner and colleagues collected numerous electron microscopy images of the nematode C. elegans. From these images, they worked out how the 302 neurons in this organism made the approximately 7600 connections to one another. This data set contains not only the presence or absence of connections between neurons, but also the relative location of their cell bodies and neuronal processes. This has been described as the first nervous system-wide connectome (DeWeerdt, 2019). Brenner’s connectome was combined as a mosaic of nervous systems from several different individual C. elegans (White et al., 1986). If we instead imagine that it came from just one specific organism and contained cells other than just neurons, then in our framework, I would call this a cell membrane connectome. I find it helpful to break down the cell membrane connectome into four components.
Another historical reason the cell membrane connectome level is relevant is that some electron microscopy stains allow for the detection of the extracellular space or cell surface, but not the detection of intracellular details (Helmstaedter et al., 2008).
The first component of the cell membrane connectome is an outline of the location of cell membranes in three-dimensional space. This map would capture many different types of abstract structural features, such as cell body volume distribution, neurite branching, the arrangement of cell processes into bundles, and the locations of chemical and electrical synapses.
Several non-neuronal cell types, such as astrocytes, may not be strictly required for approximating rapid ion flow across brain regions. However, astrocytes help to establish the baseline ion concentrations. So if the ion distributions were lost, as they almost certainly would be in a preserved brain, then the location of these non-neuronal cell membranes would also need to be preserved to be able to infer the baseline ion distribution. Given the contribution of numerous cell types to baseline ion and small molecule distributions, it is likely that it would be necessary to be able to infer at least the approximate location of nearly all the cells in the brain. However, as discussed in a previous essay, the baseline distribution of ions in the brain likely need not be exact to capture the information content of engrams.
How precisely the cell membrane shape would need to be preserved and measured is an open question. Cell membrane shapes are in constant fluctuation in vivo. Cell membranes undergo slow fluctuations (~10 seconds) that are relatively large in size (100 nm to 10 µm) and occur primarily as a result of actin cytoskeleton rearrangements (Biswas et al., 2017). Because of the presence of a substantial amount of fluctuation in the cell membrane shape, there is good reason to think that the precise nanometer cell membrane contour may be relatively unstable during life and therefore unlikely to be a unique store of long-term engram information, based on the longevity criterion.
The second component of the cell membrane connectome is the thickness of cell membranes. Cell membranes in the brain have an average thickness of around 4 nm (Ingólfsson et al., 2017). The thickness or width of plasma membranes tells us about the electrochemical resistance of the membrane. All things equal, cell membranes that are thicker have higher resistance to ion flow. For example, the diameter of the myelin sheath around an axon can clearly affect the speed of rapid ion flow through that axon, and in turn can play an important role in mediating plasticity of cognitive functions (Fields, 2015).
The third component of the cell membrane connectome is the extracellular space. Ions do not just flow inside of cells. It is the change in ion concentration between the extracellular space and intracellular space that leads to voltage-dependent alterations in ion channels and allows for the propagation of neural information flow.
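One standard way to quantify how a concentration difference across the membrane maps onto a voltage is the Nernst equation, E = (RT / zF) ln([ion]out / [ion]in). The sketch below is a textbook calculation rather than anything specific to the studies cited here, and the potassium concentrations used are approximate, illustrative values.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol

def nernst_potential_mv(conc_out_mm, conc_in_mm, valence=1, temp_c=37.0):
    """Equilibrium (Nernst) potential in millivolts for a single ion species."""
    if conc_out_mm <= 0 or conc_in_mm <= 0:
        raise ValueError("concentrations must be positive")
    temp_k = temp_c + 273.15
    volts = (R * temp_k) / (valence * F) * math.log(conc_out_mm / conc_in_mm)
    return 1000.0 * volts

# Illustrative, approximate potassium concentrations (assumed values):
# about 5 mM outside the cell and about 140 mM inside.
baseline = nernst_potential_mv(conc_out_mm=5.0, conc_in_mm=140.0)
# Doubling the extracellular concentration shifts the potential toward zero.
elevated = nernst_potential_mv(conc_out_mm=10.0, conc_in_mm=140.0)
print(round(baseline, 1), round(elevated, 1))
```

With these assumed values the baseline potential comes out at roughly -89 mV. The extracellular concentration enters the calculation directly, which is one reason the extracellular space cannot be ignored when reasoning about ion flow.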
One example of the importance of the extracellular space in rapid ion flow is ephaptic coupling. As a result of ephaptic coupling, an action potential in one cell process can alter the ion flow and electrical field properties in nearby cell processes via the extracellular space, even if there is no synapse between these processes. The more extracellular space there is between two cellular processes, the less likely ephaptic transmission is to occur between them (Anastassiou et al., 2011). It has been suggested that changing the properties of the extracellular space can therefore lead to changes in the electrical synchronization of neurons and the likelihood of epileptiform activity (Hochman, 2012).
The amount of space between the presynaptic and postsynaptic cells at a chemical synapse is a particularly important type of extracellular space. This synaptic cleft space is likely more stable than other forms of extracellular space because it is maintained in part by trans-synaptic adhesion proteins (Kinney et al., 2013).
However, the extracellular space likely does not need to be preserved very precisely to retain engram information. It is not very stable during life, as it is subject to changes due to osmolar concentration and local movements of motile cells such as microglia. It is also highly altered in perturbations of the brain that have otherwise been found to not destroy engrams. For example, sleep or induction of anesthesia both lead to significant changes in the volume of extracellular space (Ding et al., 2016), yet engrams are robust to both of these perturbations.
The fourth component of the cell membrane connectome is morphology-based estimates of cell types. Different neuron types, such as pyramidal cells, interneurons, Purkinje cells, and granule cells, each have different shapes of their cell processes, so it is feasible to classify cells into each of these types to a certain degree of accuracy on the basis of their morphology (Luczak, 2010). Cell types also have characteristic connectivity patterns to other cell types that can be used to aid in their classification (Jiang et al., 2015) (Motta et al., 2019). Classifying cells based on these types of morphologic information has been done in the Drosophila olfactory connectome, leveraging prior information about cell types collected from the literature (Schlegel et al., 2021).
The reason that estimating cell types is important is that much of the variation in electrophysiologic properties between cells is likely present at the level of the cell type. This is especially true for large mammalian nervous systems such as the human brain, which is generally thought to have many more cells than cell types, as opposed to some small invertebrate nervous systems, such as that of the nematode C. elegans, which have uniquely defined neurons (Yemini et al., 2021). Our goal with a cell or neuron typing system is to predict the properties of how ions flow through cells. These properties include the levels and locations of biomolecules that affect membrane electrical conductance properties, such as ion channels, neurotransmitters, and neurotransmitter receptors.
It is unclear whether a cell type definition based on membrane morphology alone could ever be sufficient to predict electrophysiology to a sufficient degree to describe engrams, although to me it feels unlikely. There are so many different patterns of biomolecule distribution that can occur in different cells that seem to mediate variability in electrophysiologic function, and I doubt that they are all adequately correlated with membrane morphology to make that alone a sufficient channel of inference.
Synaptic properties estimate: There are multiple types of synaptic properties – a synapse doesn’t just have one “weight.” For example, at chemical synapses, we can consider the probability of neurotransmitter release, the speed of neurotransmitter release, and the strength of effect on the electrochemical potential of the postsynaptic cell. It is unclear to what extent these properties might be inferable from the cell membrane connectome.
A key aspect of synaptic properties is the size of the pre- and postsynaptic membranes, which have been found to vary in size up to 40-fold (Jarrell et al., 2012). The size of these cell membranes is sometimes used as a proxy of the functional strength, or “weight,” of the synaptic connection. The original connectome of C. elegans mapped by Brenner and colleagues in the 1980s did not have detailed synaptic weight information. Synaptic sizes were only mapped for the first time in 2012, for both electrical and chemical synapses (Jarrell et al., 2012). The size of the cell membranes at chemical synapses has been found to correlate well with the number of neurotransmitter receptors (Kasai et al., 2003). However, to the extent that there is not a perfect correlation between the abstract structural features visible on imaging and the actual biomolecules present at the synapse, the cell membrane connectome level would be unable to detect the discordance. Another problem is that, without synaptic vesicles, it can be challenging to identify whether two opposed cell membranes contain a chemical synapse at all.
Sufficiency for describing engrams: It is likely that information derived from the cell membrane connectome can account for much of the variation in electrochemical signaling and synapse properties relevant to long-term memory recall – but almost certainly not all of the variation, as will be discussed in the more detailed levels of the connectome.
Sebastian Seung’s 2010 TED talk was provocatively titled “I am my connectome” (Seung, 2010). Seung’s subsequent 2012 book expanded upon this theme (Seung, 2012). In terms of our levels of connectome description, Seung primarily treated the cell membrane connectome as the key component of information necessary for personal identity. In a later chapter of the book, he clarified that he was referring to this level of connectome detail in addition to models of cell types, which he called the “connectome plus.” Seung’s book stirred up considerable debate within the neuroscience community, including from those who critiqued it for focusing on too narrow a level of detail. For example, it is unknown whether the resolution of cell typing achievable via information in the cell membrane connectome will be sufficient to adequately predict how ions flow through each of the cells.
One can imagine an argument such as “we have already had the connectome of C. elegans for decades, and yet we still cannot simulate it to accurately predict physiology and behavior, so clearly the connectome is not sufficient for engram information.” The obvious responses are that (a) Brenner et al.’s original connectome, even when considered as a type of cell membrane connectome, was coarse and did not include important information such as synapse sizes; (b) the cell membrane connectome alone is clearly insufficient for whole brain emulation, which would also require detailed models of how different cells actually operate; and (c) the cell membrane connectome may not be a sufficient level of detail to describe engrams. So, from this perspective, it is utterly unsurprising that we have not yet been able to model the properties of C. elegans in silico in a detailed way. This limitation of our current technology is certainly not dispositive of the amount of information present in the cell membrane connectome.
Clearly, cells are not monoliths. They contain cytoskeletal elements, organelles, and condensates that divide up the cytoplasm into different compartments. We can consider these components of cells to be abstract structural features, each made up of numerous different types of biomolecules. The extracellular space also contains abstract structural features, such as the extracellular matrix and perineuronal nets.
Because the connectome is currently typically measured with electron microscopy and nonspecific stains (Morgan et al., 2017), one important level of connectome detail that we can consider is the level that includes not only cell membrane information, but also information about all of the cellular details that can be visualized under the electron microscope, such as the endoplasmic reticulum, ribosomes, or microtubules (Heinrich et al., 2021). One can imagine this as the cell membrane connectome in addition to abstract ultrastructural features, which I will call the ultrastructural feature connectome.
There is good reason to think that much of the variation between cells can be accounted for by the ultrastructural features associated with rapid ion flow through cells. For example, intracellular calcium ions are stored in the smooth endoplasmic reticulum and their movement can be buffered by mitochondria (Britzolaki et al., 2018). Transporters for other ions, including chloride (Rahmati et al., 2018), are also found in intracellular organelles and this likely affects their rapid ion flow across cells and cell membranes. Synaptic vesicles at synapses, and their locations, are a clear proxy for how neurotransmitters will be released in response to an action potential. Therefore, preserving and measuring organelles will likely add information about how ions will flow through the connectome.
While different synaptic structures are important for mediating ion flow at synapses, evidence suggests that their dimensions are all strongly correlated (Meyer et al., 2014) (Cirelli et al., 2020). When considering the long-term, stable changes, ultrastructural features at the synapse, such as the volume of the presynapse, the pool of synaptic vesicles, the areas of active zones and postsynaptic densities, and the volume of postsynapse all seem to be balanced. As (Meyer et al., 2014) describe their methods and results:
Here, we used a combination of two-photon time-lapse imaging, two-photon glutamate uncaging, and ultrastructural reconstruction to examine whether and how—along with the spine—other subsynaptic structures, in particular the PSD and presynaptic bouton, change during synaptic potentiation. We found a close correlation between the enlargements of all synaptic components 3 hr after plasticity induction. Furthermore, we observed that the balanced enlargement of pre- and postsynaptic components was a good indicator for the stabilization and persistence of structural modifications.
This is plausible from an evolutionary perspective: there is little benefit in spending energy to build a larger presynapse if the postsynapse cannot efficiently transduce the increased neurotransmitter release into changes in membrane potential. From a brain preservation perspective, this balance is very helpful, because if some of the synaptic substructures are lost, their states are more likely to be inferable from other substructures that are still present.
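The inference step can be sketched with a simple least-squares fit: learn the relationship between two correlated synaptic dimensions from synapses where both are intact, then use it to estimate the missing one elsewhere. The measurements below are made-up numbers in arbitrary units, and the assumption that the relationship is linear is mine, for illustration only.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired measurements")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical paired measurements from intact synapses (arbitrary units):
# a postsynaptic dimension (x) and a presynaptic dimension (y).
postsynaptic_size = [1.0, 2.0, 3.0, 4.0]
presynaptic_size = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(postsynaptic_size, presynaptic_size)

# Estimate the presynaptic dimension at a synapse where only the
# postsynaptic side could be measured.
estimated_presynaptic = slope * 2.5 + intercept
print(slope, intercept, estimated_presynaptic)
```

Real synaptic measurements would be noisy, so any such estimate would carry uncertainty that grows as the correlation between substructures weakens.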
It’s worth pointing out that electron microscopy can also distinguish macromolecular structures and assemblies in addition to organelles (Tao et al., 2018) (Heinrich et al., 2021). This leaks into the biomolecule-annotated connectome and shows how it is difficult to draw clear dividing lines between any level of connectome detail. Moreover, the number of abstract ultrastructural features will depend on the resolution used for electron microscopy. The more detail with which the tissue is imaged, the more abstract structural features will be visible and the more variance this level of connectome detail will be able to explain. So to be clear, for our connectome hierarchy, I am stipulating that the level of detail will not include individual biomolecules or biomolecular complexes.
Synaptic properties estimate: One important contribution of ultrastructural features to connectome information is at synapses. Here are a few examples of this.
First, by measuring ultrastructural features at the synapse such as the appearance of vesicles in the presynapse, one can better estimate whether a chemical synapse is inhibitory or excitatory (Burette et al., 2015). The number and location of synaptic vesicles in the synapse, such as whether they are docked, also correlates with the strength of that synapse (Kaeser et al., 2017). Other organelles present in each of the presynapse and postsynapse, such as mitochondria, could also be measured. Mitochondria might be another useful proxy for synaptic properties because they play a role in energy functions (Perkins et al., 2015) and can also act as a buffer for ions such as calcium (Turner et al., 2020).
Second, active zones, which are the locations within a presynaptic cell where neurotransmitters are released, are a key mediator of synaptic strength. Ultrastructural features of active zones can differ in ways that affect their function (Kittel et al., 2016) (Perkins et al., 2015). Also, one synapse can have multiple active zones; in fact, some mammalian synapses can have hundreds to thousands of active zones (Clarke et al., 2012).
Our ability to predict synaptic properties based on ultrastructural feature data is already impressive. Prediction of neurotransmitter identity can be performed at 87-94% accuracy in Drosophila based on electron microscopy images alone (Eckstein et al., 2020). Electrophysiological properties are also often predictable in a linear fashion based on EM data (Holler et al., 2021).
Sufficiency for describing engrams: Ken Hayworth once claimed that enough physiology could potentially be predicted on the basis of electron microscopy data alone in order to perform whole brain emulation (i.e. mind uploading) (Kenneth J. Hayworth, 2012):
I will argue that the key technology that needs to be developed to make mind uploading a reality is the ability to create a 3D volume electron microscope image of an entire chemically-fixed and plastic-embedded brain at sufficient resolution to allow the tracing of synaptic connections between all neurons. As in the Apollo program, such a feat represents an enormous engineering challenge but the basic technology for 3D electron imaging of brain tissue at this resolution exists today, and I will lay out one possible path for how this imaging technique could be robustly scaled to the size of an entire human brain.
In this article I first review today’s most well supported cognitive and neuroscience model of the human cognitive architecture (ACT-R), and show how it implies that a successful mind uploading could be accomplished based only on a static map of all neural connections in a brain (along with sufficient structural information on synapse size, location, cell morphology, etc. to allow correlation with functional experiments performed on other brains).
To me, the way he describes this effectively refers to the ultrastructural feature level of connectome detail. A whole brain emulation includes long-term memories, which would therefore make this level of detail sufficient to describe engrams. In his proposal, it would require inferring cell type properties based on electron microscopy data and leveraging detailed models of how cells work that we will likely develop in the future. However, whether this will ever be possible based solely on ultrastructural-level detail remains an open question. My personal feeling is that directly measuring at least a subset of biomolecules, rather than inferring them, will be necessary.
As discussed above, one can estimate the types of cells based on their location, their morphology, and their connectivity. Ultrastructural features can also aid in the process of cell typing. But in contemporary neuroscience, cell typing is more commonly done via measuring biomolecules present in cell nuclei.
The main source of variation in cell function beyond somatic DNA is epigenetics, which we will define as stable changes in cell function due to alterations in the structure of chromosomes independent of the DNA sequence (Berger et al., 2009). Epigenetics is based in large part on chemical modifications of DNA and DNA-binding proteins that promote or inhibit the expression of various sequences of DNA in each cell. During the process of cell division and differentiation, the epigenetic structure of each cell lineage is altered to specify how it will function. Epigenetics is perhaps the fundamental property that defines a cell’s type and function.
As a result, epigenetic information, if it were to be preserved and measured in addition to the ultrastructural feature connectome, would likely allow for a cell typing classification that captures more variance relevant to rapid electrochemical ion flow than a cell typing classification based purely on data from the ultrastructural feature connectome. However, epigenetic information would provide more than just coarse cell type classifications. Recent advances in molecular biology have allowed us to further interrogate the idea of a brain cell type. Through this analysis, we have learned of sub-cell types within each cell type class, such as those at different stages of oligodendrocyte maturation (Marques et al., 2016).
Epigenetic data can also tell us about gene expression program information that can differ within sub-cell types (McKenzie et al., 2018). At any given time, each cell has a number of different expression programs that are active at different levels. From this perspective, a cell’s “type” can be thought of as a particular set of scalars along multiple expression program dimensions. In other words, while much has been made of “cell types” over the years, data from single cell RNA sequencing suggests that a cell’s set of ion channels and other biomolecules involved in rapid electrochemical information flow may not be fully predicted by its “type,” but instead by the levels of the gene expression programs that drive their synthesis.
One of the advantages of focusing on the expression program level is that multiple different aspects of the epigenome will redundantly encode it, so the information will likely be more robust to damage during the dying and/or preservation processes. But if all of the epigenetic information from each cell is preserved and measured, then one would be able to capture information about the expression levels of all RNA transcripts, which could be represented as a vector of scalars. In other words, theoretically, epigenetic information can also tell us the level to which individual transcripts are likely to be expressed in a given cell. In fact, epigenetic information like RNA levels can even be used to measure the recent history of a cell’s expression patterns, with the technique of RNA velocity.
To summarize, we can specify a few different levels of resolution of detail for each cell in the epigenetic connectome, using nuclear gene expression and chromatin state data:
1. Detailed cell type/sub-type classification for each cell.
2. Above + predicted levels of each expression program for each cell.
3. Above + predicted levels for each expressed RNA transcript in each cell.
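As a rough illustration of these three nested levels of resolution, here is a minimal Python sketch. The class name, field names, expression program names, and numbers are all hypothetical and chosen only for illustration. It shows how two cells with the same coarse type can still sit at different positions along expression program dimensions.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CellAnnotation:
    """Hypothetical per-cell record; each optional field adds one level of detail."""
    cell_type: str                                        # level 1: type/sub-type label
    program_levels: Optional[Dict[str, float]] = None     # level 2: expression program scalars
    transcript_levels: Optional[Dict[str, float]] = None  # level 3: per-transcript levels

    def resolution_level(self) -> int:
        """Return 1, 2, or 3 depending on the most detailed field present."""
        if self.transcript_levels is not None:
            return 3
        if self.program_levels is not None:
            return 2
        return 1


def program_distance(a: CellAnnotation, b: CellAnnotation) -> float:
    """Euclidean distance between two cells along their shared program axes."""
    shared = set(a.program_levels) & set(b.program_levels)
    return sum((a.program_levels[k] - b.program_levels[k]) ** 2 for k in shared) ** 0.5


# Two cells with the same level 1 label but different (made-up) program scalars.
cell_a = CellAnnotation("oligodendrocyte", {"myelination": 0.9, "stress_response": 0.1})
cell_b = CellAnnotation("oligodendrocyte", {"myelination": 0.3, "stress_response": 0.5})

print(cell_a.cell_type == cell_b.cell_type)  # same coarse type
print(program_distance(cell_a, cell_b))      # nonzero distance in program space
```

A level 1 description would treat these two cells as identical, whereas a level 2 description keeps the graded difference between them.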
Data from studies combining gene expression with electrophysiology measurements have suggested that some electrophysiologic functions can be predicted based on cell class, and some can be predicted based on graded differences within cell classes (Bomkamp et al., 2019). Some work has suggested that the set of ion channels and receptors present may be more important for predicting electrophysiologic function than the morphology of neurons (Otopalik et al., 2019).
Synaptic properties estimate: Epigenetic information in a cell dictates what ion channels it will express on its membranes and what kinds of neurotransmitter(s) it will create, if any. Studies have found that important forms of variability in ion channel and synaptic properties can be predicted on the basis of cell type (Alpizar et al., 2019).
Sufficiency for describing engrams: With advanced future technology, it might be possible to perform electron microscopy in preserved brains and then perform some kind of microdissection-based capture of nuclear regions to profile epigenetic data in each cell. (To be clear, this would involve advanced technology, and is presented here as a thought experiment.) This would likely be compatible with existing brain preservation techniques. For example, preservation of brain tissue by glutaraldehyde fixation is not only compatible with both electron microscopy and epigenetic measurements, but it is often even the state-of-the-art preservation method for these assays (Andrew McKenzie, 2019). In this case, one could imagine that it might be easier to measure these two rich aspects of structural data without having to measure other biomolecular distributions across the cell.
There is an argument to be made that with the addition of epigenetic data to the ultrastructural feature connectome, the epigenetic-annotated ultrastructural connectome might be sufficient to predict ion flow through the connectome to a degree of precision that can describe engrams. Changes in the epigenome have been found to correlate with long-term changes in the synaptic function of cells (Campbell et al., 2019). The more information about cells that turns out to be stored in epigenetic features, the easier the problem of preserving and measuring engram information will likely be.
A prediction based on epigenetic information would still be far from a perfect estimate of biomolecule distribution across a cell, which will be discussed below in the biomolecule-annotated connectome level. However, despite this limitation, the epigenetic-annotated ultrastructural connectome may still provide enough biomolecular information to be sufficient for describing engrams. Whether it is or not – in my opinion – is an open question.
“To solve the brain, you have to be able to simulate it in a computer. We’re working very hard on ways to map the brain, using connectomics. But I would argue that connections are not enough. To understand how information is being processed, what you really need are all the molecules in the brain. And I think a reasonable goal at this point would be to simulate a small organism, but to do this you need a way to map a 3D object such as the brain with nanoscale precision.” - Ed Boyden, quoted in (O’Connell, 2017), page 58
For this level of the connectome, one can imagine measuring the ultrastructural feature connectome and annotating it in 3D space with direct measurements of biomolecules that affect ion flow.
Detailed biomolecule mapping removes the need to infer cell types or the levels of expression programs. On a practical level, cell type information and epigenetic information will still be helpful when annotating the connectome with biomolecules, because these will help us to infer missing information due to damage. One can think of this as a hierarchical modeling problem, in which information flows in either direction of the connectome level hierarchy depending upon what is available to be accurately measured.
The biomolecule-annotated connectome level of detail has been described by Ken Hayworth as the “molecularly-annotated structural connectome” (Kenneth Hayworth, 2018). The main aspect of molecular annotations that Hayworth described was estimates of the membrane densities of certain ion channel types. But it’s not just ion channel proteins that are likely to be necessary to label or infer. We can generalize this conception of molecular annotation to any type of biomolecule annotation along the ultrastructural connectome that plays a role in electrochemical information flow.
One key question for the biomolecule-annotated connectome level is how much sub-cellular and extracellular biomolecule distribution is stably determined in ways that cannot be predicted by the ultrastructural feature and epigenetic-annotated connectome levels. There are a few such candidates:
1. A subset of biomolecules may be present for an extremely long time (Fornasiero et al., 2018). The levels and locations of these long-lived biomolecules may not be able to be predicted by epigenetic information, which may be altered since the biomolecules were originally synthesized.
2. Non-uniform, stable trafficking of biomolecules may be affected by variations in cytoskeleton architecture that cannot be predicted by ultrastructural features alone.
3. Biomolecules can be trafficked from other cells via intercellular transfer. This is a common phenomenon and can be mediated by multiple mechanisms, such as exosomes or tunneling nanotubes. For example, axons have been found to traffic biomolecules to nearby oligodendrocytes (Thomas et al., 2017). It is possible that these biomolecules could still be predicted by epigenetic information of the other cell. However, it might be difficult to do so, especially if there were a mixture of cells contributing to the biomolecule content of each cell.
4. What types of biomolecules are found in the brain at any given time, especially lipids and carbohydrates, is dependent upon one’s diet and environmental exposures. This might not be easily inferred on the basis of epigenetic data alone.
For these reasons, it is reasonable to expect that at least a subset of individual biomolecules may need to be measured to sufficiently describe engrams. Here are some examples:
1. Ion channels, ion transporters, ion pumps, and ionotropic receptors, which are the proteins that directly mediate electrochemical ion flow through the cell membranes of a nervous system. They affect the baseline ion distribution as well as the speed and fidelity of ion flow in response to a stimulus or excitatory postsynaptic potential (EPSP).
One reason that it may be important to directly preserve and measure ion-related proteins is that they are not evenly distributed across cell membranes. For example, ion channels can be more concentrated in certain regions of axons, such as regions with more presynaptic boutons and at axon branch points (Alpizar et al., 2019). The post-translational modification states of ion channel proteins can also differ along axons, which can affect their conformations and therefore how they mediate rapid ion flow.
However, other information in the cell, such as ultrastructural features and epigenetic markers, may be sufficient to predict these non-uniform distributions, given future advances in our understanding of the rules of how ion channels localize subcellularly (Trimmer, 2015). How easy it is to predict subcellular biomolecular localization patterns based on epigenetic information, and how important these subcellular biomolecular localization patterns are for determining electrochemical flow between cells, are important open questions for the information content of engrams.
2. The subcellular localization of kinases and other proteins that regulate the post-translational modifications of ion channels. These regulatory proteins may be important to measure for two reasons:
First, regulatory proteins can rapidly influence ion channel function and may play a direct role in memory retrieval through rapid neuromodulation. An example of this occurs following neurotransmitter binding to certain GPCRs that releases G protein beta subunits (Ostrovskaya et al., 2014). These subunits can lead to a change in the conformation of a type of potassium ion channel known as a GIRK channel, and thereby increase potassium flow rapidly, on the order of 100 milliseconds (Xie et al., 2010). Another example is dopamine modulation of potassium channel or calcium channel activity, which appears to rapidly affect firing properties at synapses on the timescale of milliseconds (Lahiri et al., 2020). Proteins that help to mediate rapid neuromodulation, such as RGS7 and Gbeta5, have been found to have non-uniform distribution within brain cells (Aguado et al., 2016).
Second, even when regulatory proteins operate too slowly to directly play a role in rapid memory retrieval, their levels and locations could be an important inference channel as to the most likely states of ion channels nearby if those are lost.
3. Electrical synapses, also known as gap junctions, mediate rapid electrochemical ion flow between cells. These connections can have diverse conductance properties not predicted purely by ultrastructure, such as asymmetry. The properties appear to be largely based on connexin type and the ion channels present on each side of the connection, which may need to be directly measured (Snipas et al., 2017).
4. Cytoskeletal proteins are able to directly alter electrochemical ion flow within a cell, such as the ion flow down a cell process (Priel et al., 2010).
5. The extracellular matrix, which is composed largely of structural carbohydrates such as hyaluronan and proteoglycans, can constrain small molecules and ion flow in the extracellular space of the brain (Nicholson et al., 2017). For example, the extracellular matrix likely plays a role in preventing the spread of neurotransmitters away from the synaptic cleft.
6. Proteins expressed by astrocytes can affect ion flow at synapses, which is why chemical synapses are often called tripartite: the presynaptic neuron, the postsynaptic neuron, and the associated astrocyte process all play an important role in synaptic function (Takano et al., 2020). A key role of astrocyte-associated biomolecules is to help establish baseline ion conditions and balance. While we know that momentary ion distributions can be lost without loss of engrams, as discussed above, it seems plausible that these baseline ion distributions, which can vary across the brain, may need to be inferred.
This is obviously not an exhaustive list. It’s just a representative sample, biased by what has so far been considered important in the field at large. It is clear that a comprehensive list of biomolecules involved in rapid electrochemical ion flow does not yet exist. For example, one 2017 publication points out that while we know that cell membrane permeability to ions is determined by ion-associated proteins, we don’t currently have a full list of these proteins, even if we just focus on the brain (Devor et al., 2017). And ion flow is not just dependent on proteins in cell membranes. For example, we know that internal movements of ions, such as those mediated by intracellular calcium stores, also can play a role in rapid ion-based communication between cells (Devor et al., 2017).
If we imagine progressively mapping more types of biomolecules, it is intuitive that each additional type of biomolecule mapped will increase the percentage of variance explained in predicting electrochemical ion flow through the connectome. However, it is also likely that prior to mapping all of the biomolecule types in the brain, we will reach a level of variance explained that would allow for the high-fidelity read-out of engram information. This would be a sufficient amount of biomolecular information for describing engrams, so further biomolecule mapping would not be necessary.
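To make the diminishing-returns intuition concrete, here is a toy Python sketch. It rests on a strong simplifying assumption that is mine and not an empirical finding: each newly mapped biomolecule type explains a fixed fraction of whatever variance the previously mapped types left unexplained. The function names and parameter values are hypothetical.

```python
def variance_explained(n_types: int, fraction_per_type: float) -> float:
    """Toy model (an assumption, not a measurement): each newly mapped type
    explains a fixed fraction of the variance left unexplained so far."""
    return 1.0 - (1.0 - fraction_per_type) ** n_types


def types_needed(fraction_per_type: float, target: float) -> int:
    """Smallest number of mapped types whose cumulative variance explained
    reaches the target under the toy model above."""
    if not 0.0 < fraction_per_type <= 1.0:
        raise ValueError("fraction_per_type must be in (0, 1]")
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    n = 0
    while variance_explained(n, fraction_per_type) < target:
        n += 1
    return n


# Illustrative values only: compare how many types different targets require.
for target in (0.90, 0.99, 0.999):
    print(target, types_needed(0.05, target))
```

Under this model the curve saturates, so the number of types needed depends heavily on the fidelity target and on how redundant the biomolecular networks are, neither of which is currently known.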
Currently, it is quite unclear how many types of biomolecules would need to be directly preserved and measured to be sufficient to describe engrams. The number of unique biomolecules mapped can be thought of as a level of resolution of the biomolecule-annotated connectome. Perhaps an arbitrary dividing line is that the preservation and mapping of fewer than 100 unique types of biomolecules would be a more “sparse” biomolecule-annotated connectome, whereas more than 100 unique types of biomolecules would be a more “rich” biomolecule-annotated connectome. If all of the biomolecules were preserved and measured, we could call this the “complete” biomolecule-annotated connectome.
Another level of resolution for the biomolecule-annotated connectome is the type of biomolecules that are preserved and measured. Small molecules could theoretically be a part of the biomolecule-annotated connectome, although they would likely be more difficult to preserve and measure. Lipids and carbohydrates might be lost during the brain preservation process, so a potentially important type of biomolecule-annotated connectome to consider is one that primarily or exclusively preserves and maps proteins and nucleic acids.
Synaptic properties estimate: If adequately preserved and measured at a sufficient level of resolution, the biomolecule-annotated connectome must contain all of the structural information about synaptic properties. For example, at this level, one would not need to infer the levels of AMPA receptors on the basis of ultrastructural features, which might be slightly inaccurate (Kasai et al., 2003). Instead, one could simply directly measure the AMPA receptors.
As O’Rourke and colleagues noted in their review paper about the heterogeneity of synaptic properties, “functional phenotypes must be tied to molecular-level mechanistic differences” (O’Rourke et al., 2012). In other words, at the rich biomolecule annotation level there is nothing else that could be necessary, because function must depend ultimately on structure.
We can imagine different types of biomolecule localization patterns at synapses, which would affect their need to be directly preserved and measured, inspired by the framework described by Craig and Boudin (Craig et al., 2001):
a. Some biomolecules might be ubiquitously expressed at similar levels in all synapses in all neurons. These biomolecules would not need to be measured. Their presence could simply be inferred.
b. Some biomolecules might be expressed by different cells at different levels, but found at all synapses within a cell at similar levels. These biomolecule levels could likely be inferred on the basis of cell type or epigenetic information alone, or by direct measurement at a small number of synapses for each cell.
c. Some biomolecules might be expressed by different cells at different levels, and found at different levels in different synapses, but in a way sufficiently predicted by cell membrane morphology and/or ultrastructural features. These biomolecules would not need to be directly measured at all synapses.
d. Some biomolecules might be expressed by different cells at different levels, and found at different levels in different synapses, in a way not sufficiently predicted by synapse membrane morphology and/or ultrastructural features. For example, this could come about as a result of communication between the presynapse and postsynapse in a way that depends on their history and involves long-lived proteins or creates stable trans-synaptic molecular cycles. These are the biomolecules that would need to be directly measured at each synapse.
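The four patterns above amount to a small decision rule, which can be sketched in Python as follows. The enum, function name, and boolean inputs are hypothetical simplifications of the list; in reality each property would be a matter of degree rather than a yes/no answer.

```python
from enum import Enum


class Pattern(Enum):
    A = "ubiquitous: presence can simply be inferred"
    B = "cell-specific: infer from cell type/epigenetics, or sample a few synapses"
    C = "synapse-specific but predicted by structure: infer from morphology/ultrastructure"
    D = "synapse-specific and not predicted by structure: measure at each synapse"


def classify(varies_between_cells: bool,
             varies_between_synapses: bool,
             predicted_by_structure: bool) -> Pattern:
    """Map a biomolecule's localization properties onto patterns a-d."""
    if not varies_between_cells and not varies_between_synapses:
        return Pattern.A
    if not varies_between_synapses:
        return Pattern.B
    # Variation between synapses: the deciding question is whether membrane
    # morphology and/or ultrastructural features predict that variation.
    # (A molecule that varies between synapses but not between cells is not
    # covered by the list above; this sketch treats it the same way.)
    if predicted_by_structure:
        return Pattern.C
    return Pattern.D


print(classify(True, True, False).value)
```

Only pattern d forces direct measurement at every synapse, so the practical question is how many biomolecules fall into that category.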
To give some context about biomolecular diversity at the synapse, at the protein level, proteomics studies have found that there are approximately 1500 unique proteins at the postsynaptic density, approximately 500 at the active zone, and approximately 6620 unique proteins in the whole synaptosome including the presynaptic, postsynaptic, astroglial, and extracellular matrix components (Dieterich et al., 2016).
At the ultrastructural feature level, the levels of subsynaptic structures are highly correlated. So, too, appear to be their biomolecular constituents. For example, evidence suggests that persistent synapse enlargement requires the presence of additional PSD-95 and Homer1c proteins in the postsynaptic density (Meyer et al., 2014).
While synaptic vesicles can be visualized by high-resolution electron microscopy, their precise biomolecular composition, such as the number and type of stored neurotransmitters, the actin cytoskeletal framework that facilitates their docking (Porat-Shliom et al., 2013), and the associated voltage-gated calcium channels, can also contribute to variation in rapid electrochemical ion flow (Chung et al., 2013). Consistent with this, the patterns of synaptic vesicle release at chemical synapses seem to be mediated by the levels of different proteins at synapses (Chung et al., 2013). It is unclear the extent to which the biomolecular machinery at each synapse that mediates synaptic vesicle release is a long-lived and unique store of information necessary to describe engrams.
One relevant data point is the longevity of biomolecules at synapses. The majority of synaptic proteins have been found to have rapid turnover, with half-lives of approximately 2-5 days, suggesting that individual proteins are not generally responsible for engram storage at synapses (Abraham et al., 2019). That said, there are some exceptional proteins at the synapse that have been found to be long-lived. One study found that proteins associated with microtubule binding and the extracellular matrix have almost no turnover during the measured time period of seven weeks (Heo et al., 2018). However, I am not aware of any long-lived protein at synapses that has been directly associated with rapid ion flow properties in a unique way. The long-lived synaptic proteins that have been associated with memory storage seem to do so indirectly. For example, PKM-zeta is associated with memory indirectly as a result of increasing the number of functional postsynaptic AMPA receptors, via decreasing AMPA receptor endocytic trafficking (Sacktor et al., 2017).
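To see why half-lives of a few days argue against individual protein molecules as the storage medium, here is a short Python calculation. It assumes simple first-order (exponential) turnover, which is my simplification and may not hold for every protein; the function name is hypothetical.

```python
def fraction_remaining(days: float, half_life_days: float) -> float:
    """Fraction of the original molecules still present after `days`,
    assuming simple exponential turnover with the given half-life."""
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    return 0.5 ** (days / half_life_days)


# Half-lives of 2 and 5 days, evaluated over a seven-week (49-day) window.
for half_life in (2.0, 5.0):
    print(half_life, fraction_remaining(49.0, half_life))
```

Under this assumption, even with a 5-day half-life well under 1 percent of the original molecules would remain after seven weeks, which is what makes proteins with almost no turnover over that period stand out.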
The topological location relationships between biomolecules at the synapse beyond composition may be important. For example, one location feature is the distance between synaptic vesicles and the voltage-gated calcium channels that activate their docking, fusion, and neurotransmitter release. This is called the coupling distance. The tighter the coupling distance, the higher the synaptic vesicle release probability, and the lower the number of synaptic vesicles available in the readily releasable pool if another action potential reaches the synapse shortly thereafter (Fekete et al., 2019). As a result, the coupling distance is clearly relevant to rapid ion flow between cells and might be important for rapid long-term memory recall.
However, there is good reason to expect that the precise nanometer-scale relationships between proteins at synapses, such as the coupling distance, are not necessary to directly preserve or measure. From a uniqueness criterion perspective, synaptic location relationships are maintained by synaptic organizer proteins such as scaffold proteins (Suzuki et al., 2020) (Fulterer et al., 2018). If the synaptic organizer proteins can be preserved and measured, then the organization patterns that they mediate can be inferred.
Sufficiency for describing engrams: If all of the biomolecules in the brain are used to annotate the connectome, then this level of connectome detail seemingly has to be sufficient for engram information. To argue that the complete biomolecule-annotated connectome would not have sufficient information for engrams, you either need to claim that the idea of inferring memory function from structure is fundamentally mistaken or that it will be impossible to ever preserve the biomolecule-annotated connectome. One could also argue that it will never be possible to measure the biomolecule composition at the level of resolution needed. It is likely that if the rich biomolecule-annotated connectome level of detail is necessary to sufficiently describe engrams, then it will be, at a minimum, many decades before this is actually plausible to perform in a brain as large as the human brain.
To get a sense of how biomolecule mapping technology might evolve in the future, let’s look at some contemporary brain mapping technologies.
There are methods to map the relative density of populations of biomolecules and methods to map single biomolecules. Plausibly, mapping populations of biomolecules might be sufficient to describe engrams, given that more precise single molecule localizations might not be very well maintained over the long run. But, as will be discussed in the next essay, single biomolecule mapping will still likely be necessary regardless, because it will allow for archaeological triangulation of the original states of damaged structures.
When it comes to mapping single RNA molecules, a major advantage is that it is relatively easy to design probes that bind specifically to individual RNA molecules. So one can design multiple different probes that will bind to the same RNA molecule at different locations along it. This allows for a dramatically improved signal to noise ratio, and, as a result, the ability to map single molecules using conventional optical microscopy. One of the technologies that leverages this is called single molecule fluorescence in situ hybridization, or smFISH for short.
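One way to see why multiple probes per molecule help is a toy binomial model, sketched below in Python. The assumptions are mine: probes bind independently, a spot is called only when at least some threshold number of probes co-localize, and the binding probabilities shown are illustrative rather than measured values.

```python
from math import comb


def prob_at_least_k_bound(n_probes: int, k_required: int, p_bind: float) -> float:
    """Probability that at least k of n independent probes bind at one site."""
    return sum(
        comb(n_probes, i) * p_bind ** i * (1.0 - p_bind) ** (n_probes - i)
        for i in range(k_required, n_probes + 1)
    )


# Hypothetical numbers: 48 probes, and a spot is called when 10 or more co-localize.
true_target = prob_at_least_k_bound(48, 10, 0.7)   # on-target binding per probe
off_target = prob_at_least_k_bound(48, 10, 0.01)   # nonspecific binding per probe
print(true_target, off_target)
```

In this model a real transcript is almost always detected, while a site with only nonspecific binding almost never reaches the threshold, which is the sense in which tiling probes improves the signal to noise ratio.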
Currently, most smFISH protocols require that the RNA molecules mapped have relatively low copy number. smFISH is also difficult to scale. Finally, smFISH cannot be done on formalin fixed, paraffin embedded brain tissue sections that were fixed for too long (> 2-3 weeks) prior to processing (Jolly et al., 2019). Most likely, brain tissue fixed for such a long period of time is too inaccessible to the probes after such extensive crosslinking. But it seems likely to me that with improvements in sample prep, imaging resolution, and computational analysis, these limitations could be routed around, either via improvements to smFISH or via a successor technology.
When it comes to proteins, a key way to map single molecules is via superresolution optical microscopy. When combined with short DNA oligonucleotides to tag antibodies that bind to individual proteins (DNA-PAINT), it is already possible to get close to single molecule mapping at the synapse (Narayanasamy et al., 2021).
Similar methods have also allowed for true single protein mapping (van Wee et al., 2021).
Another method that can be helpful for single molecule mapping is expansion microscopy. In expansion microscopy, a polymer network is introduced into a tissue sample and then physically expanded with chemical reactions, thereby increasing the size of the tissue. While current expansion methods can be imprecise and lose some spatial information, expansion allows microscopy techniques to more easily distinguish single molecules at different locations in the tissue.
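The basic arithmetic behind expansion can be sketched in Python as follows. This is an idealized calculation that assumes uniform, distortion-free expansion; the 250 nm optical resolution and 4-fold linear expansion factor are illustrative assumptions, and the function names are hypothetical.

```python
def effective_resolution_nm(optical_resolution_nm: float, expansion_factor: float) -> float:
    """Resolution in pre-expansion tissue coordinates, assuming ideal uniform expansion."""
    if expansion_factor <= 0:
        raise ValueError("expansion_factor must be positive")
    return optical_resolution_nm / expansion_factor


def resolvable(separation_nm: float, optical_resolution_nm: float, expansion_factor: float) -> bool:
    """Whether two molecules at the given original separation can be told apart."""
    return separation_nm >= effective_resolution_nm(optical_resolution_nm, expansion_factor)


# Two molecules 100 nm apart, imaged with an assumed 250 nm optical resolution.
print(resolvable(100.0, 250.0, 1.0))  # without expansion
print(resolvable(100.0, 250.0, 4.0))  # with 4x linear expansion
```

Any distortion introduced during expansion would reduce this idealized gain, which is the imprecision mentioned above.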
In addition, there are other methods that allow for direct correlative light and electron microscopy with population-level biomolecule mapping (Lane et al., 2022).
Some people are skeptical that we will ever be able to map biomolecules at a single molecule level throughout the brain. While it would be an impossibly large undertaking with today’s technology, my view is that it is a logical extension of what is possible today. There are already multiple possible avenues to approach the problem, including improvements in probe design, expansion technology, optical microscopy resolution, or correlative biomolecule mapping with electron microscopy. As a result, assuming that technological progress continues into the future, which is a requirement of the brain preservation project regardless, I doubt that single molecule mapping technology will be the bottleneck for revival.
Level | Adjacency connectome | Cell membrane connectome | Ultrastructural feature connectome | Epigenetic-annotated connectome | Biomolecule-annotated connectome |
---|---|---|---|---|---|
Description | Adjacency matrix, with or without weights | 3D map of cell membrane locations of brain cells | As previous + intracellular/extracellular features visible on ultrastructure | As previous, + nuclear gene expression/epigenetic information | As previous, + detailed biomolecule information across the connectome |
Features | Synaptic connectivity data | 3D map of membrane positions; shape information of cell membranes; size of extracellular space; cell type predictions based on morphology + connectivity | Organelles; extracellular matrix features; synaptic features such as active zones or the readily releasable pool | DNA posttranslational events; DNA structural differences between cells; RNA in nucleus; nucleosome information | Potentially all biomolecules |
Resolution Levels | Chemical synapses only vs chemical + electrical synapses; number of synapses between cells or just logical presence/absence | Level of detail of in vivo membrane morphology captured; detail of cell type estimates based on morphology + connectivity | Resolution of electron microscopy used will dictate the detail of ultrastructural features uncovered; at higher resolutions, such as with electron tomography, can examine supra-molecular multi-protein structures | Cell types/sub-cell types based on nuclear gene expression; expression program scalars; gene expression levels for all transcripts | Number of types of biomolecules mapped; how precisely locations are mapped; whether biomolecular conformations are mapped |
Synaptic property estimate | Number of synapses between cells (or their absence) | Improved – can estimate based on size of presynaptic/postsynaptic membranes; this likely captures a large percentage of the synaptic function variance | Can estimate based on ultrastructural synaptic features, such as the readily releasable pool; features for gap junctions | As previous, + synaptic properties that can differ across cells based on nuclear information for each cell | All biomolecular composition |
Cell Type Estimate | Location in the brain; pattern of connectivity | Detailed morphology and connectivity | As previous + ultrastructural features specific to cell types/sub-types | For some definitions of a cell “type,” this provides the most precise cell typing possible. | All cell type information |
Technology required to map | Many methods, for example doing DNA sequencing to capture the synaptic connection map | Electron microscopy of tissue stained to distinguish cell membranes | Electron microscopy + segmentation of organelles | Electron microscopy + nuclear biomolecule mapping for each cell | Electron microscopy + tissue-wide biomolecule mapping |
Sufficient for memories? | Seems like basically no way | Almost certainly not, though the majority of information that is most important to preserve is likely present here | Possibly, as suggested by some proponents of brain preservation such as Ken Hayworth (Kenneth J. Hayworth, 2012) | Sebastian Seung speculated that something like this might be sufficient (Seung, 2012) | If preserved well enough and mapped well enough, must be sufficient |
I believe that the most likely level of connectome detail needed to sufficiently describe engrams is the sparse biomolecule-annotated connectome. My feeling is that the majority of information is likely captured by the cell membrane connectome level and that the additional information needed could be mostly captured by ultrastructural features and epigenetic data. But I would also guess that there are certain biomolecules with additional information content, such as long-lived biomolecules, intracellularly trafficked biomolecules, connexins, or key synaptic organizer proteins, that might be a necessary part of a sufficient description of engrams as well. However, it is important to point out that while I think the biomolecule-annotated connectome level is needed to describe engrams, that doesn’t mean that any individual biomolecule is necessarily essential to preserve on its own, but rather that its information content may need to be preserved so that it could be adequately inferred. This will be discussed further in the next essay, which explores in more detail this critical idea of inference.