Heritability of Human Structural Connectomes

Jaewon Chung

(he/him) - NeuroData lab
Johns Hopkins University - Biomedical Engineering

j1c@jhu.edu
@j1c (Github)
@j1c (Twitter)

What is heritability?

  • Variations in phenotype caused by variations in genotype.
  • Potentially discover relationships between diseases and genetics.




Are brain connectivity patterns heritable?

Brain connectivity as connectomes

  • Vertex: region of interest
  • Edges: connectivity measure between a pair of vertices
  • Diffusion MRI: # of estimated neuronal fibers
  • Undirected: diffusion MRI cannot resolve the direction of fibers


Image from Gu, Zijin, et al. "Heritability and interindividual variability of regional structure-function coupling." (2021)

How do we get structural connectomes?



Heritability as a causal problem

  • Directed acyclic graph


Do genomes affect connectomes?

  • Hypothesis:
    $H_0: F(\text{Connectome} \mid \text{Genome}) = F(\text{Connectome})$
    $H_A: F(\text{Connectome} \mid \text{Genome}) \neq F(\text{Connectome})$

  • Alternatively:
    $H_0: F(\text{Connectome}, \text{Genome}) = F(\text{Connectome}) F(\text{Genome})$
    $H_A: F(\text{Connectome}, \text{Genome}) \neq F(\text{Connectome}) F(\text{Genome})$

  • Known as independence testing

  • Test statistic: distance correlation (dcorr)

  • Implication if $H_A$ holds: there exists an associational heritability.
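
The two ways of writing the hypothesis are equivalent. A short derivation, written with densities $f$ (assumed here to exist) rather than distribution functions $F$, and with $c$ and $g$ as shorthand for a connectome and a genome:

```latex
\begin{aligned}
f(c, g) &= f(c \mid g)\, f(g)
  && \text{(definition of the conditional density)} \\
f(c \mid g) = f(c) \ \ \text{for all } g \text{ with } f(g) > 0
  &\iff f(c, g) = f(c)\, f(g)
  && \text{(substitute into the line above)}
\end{aligned}
```

So rejecting the factorized form of $H_0$ is the same as rejecting the conditional form.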

What is distance correlation?

  • Measures dependence between two multivariate quantities.
    • For example: connectomes, genomes.
  • Can detect nonlinear associations.
  • Measures correlation between pairwise distances.
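
The idea of correlating pairwise distances can be sketched in a few lines of NumPy. This is a minimal illustration of the biased sample distance correlation, not the code used for the results in these slides. The permutation test for the p-value is an assumption (the slides do not say how p-values were computed), and the helper names and toy data are hypothetical.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance between every pair of rows of X (n x p)."""
    return np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)


def double_center(D):
    """Subtract row and column means, then add back the grand mean."""
    return D - D.mean(axis=0, keepdims=True) - D.mean(axis=1, keepdims=True) + D.mean()


def dcorr(Dx, Dy):
    """Biased sample distance correlation from two n x n distance matrices."""
    A, B = double_center(Dx), double_center(Dy)
    dcov2 = (A * B).mean()                        # squared distance covariance
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    if denom == 0:
        return 0.0
    return np.sqrt(max(dcov2, 0.0) / denom)


def permutation_pvalue(Dx, Dy, n_perm=200, seed=0):
    """Assumed inference step: permute subjects in Dy to simulate the null."""
    rng = np.random.default_rng(seed)
    observed = dcorr(Dx, Dy)
    n = Dx.shape[0]
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        if dcorr(Dx, Dy[np.ix_(perm, perm)]) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


# Toy data: y depends on x only through a nonlinear (quadratic) relationship.
rng = np.random.default_rng(0)
x = rng.normal(size=(60, 1))
y = x ** 2
Dx, Dy = pairwise_distances(x), pairwise_distances(y)
stat, pvalue = permutation_pvalue(Dx, Dy)
print(f"dcorr = {stat:.3f}, permutation p-value = {pvalue:.3f}")
```

Because the statistic only needs distance matrices, the same function applies to any pair of objects for which a distance is defined, such as connectomes and genomes.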


How to compare genomes?

  • Typical twin studies do not sequence genomes.
  • Coefficient of kinship ($\phi_{ij}$)
    • Probability that particular genes are identical between two subjects.
  • $d(\text{Genome}_i, \text{Genome}_j) = 1 - 2\phi_{ij}$.

| Relationship | $\phi_{ij}$ | $1-2\phi_{ij}$ |
|---|---|---|
| Monozygotic | $\frac{1}{2}$ | $0$ |
| Dizygotic | $\frac{1}{4}$ | $\frac{1}{2}$ |
| Non-twin siblings | $\frac{1}{4}$ | $\frac{1}{2}$ |
| Unrelated | $0$ | $1$ |
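
As a rough sketch, the table can be turned into a genome distance matrix as follows. The subject records, the `family` and `zygosity` field names, and the rule that each family contains at most one twin pair are hypothetical assumptions made for this example only.

```python
import numpy as np

# Kinship coefficients from the table above.
KINSHIP = {
    "monozygotic": 0.5,
    "dizygotic": 0.25,
    "sibling": 0.25,
    "unrelated": 0.0,
}


def relationship(a, b):
    """Classify a pair of subjects (hypothetical record layout)."""
    if a["family"] != b["family"]:
        return "unrelated"
    if a["zygosity"] == b["zygosity"] == "MZ":
        return "monozygotic"
    if a["zygosity"] == b["zygosity"] == "DZ":
        return "dizygotic"
    return "sibling"


def genome_distance_matrix(subjects):
    """d(Genome_i, Genome_j) = 1 - 2 * phi_ij, with zeros on the diagonal."""
    n = len(subjects)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            phi = KINSHIP[relationship(subjects[i], subjects[j])]
            D[i, j] = D[j, i] = 1.0 - 2.0 * phi
    return D


subjects = [
    {"family": "A", "zygosity": "MZ"},
    {"family": "A", "zygosity": "MZ"},
    {"family": "B", "zygosity": "DZ"},
    {"family": "B", "zygosity": "DZ"},
    {"family": "B", "zygosity": None},  # non-twin sibling
]
D_genome = genome_distance_matrix(subjects)
print(D_genome)
```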

How to compare connectomes?

  • Random dot product graph (RDPG)
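    • Each vertex (region of interest) has a low-dimensional latent vector in $\mathbb{R}^d$.
    • $P[i \rightarrow j] = \langle x_i, x_j \rangle$
  • Latent vectors =
  • $d(\text{Connectome}_k, \text{Connectome}_l) = \|X^{(k)} - X^{(l)}R\|_F$, where $X^{(k)}$ holds the latent vectors of connectome $k$ and $R$ is a matrix aligning the two embeddings.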
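
A minimal sketch of the connectome distance, assuming that $R$ is chosen as the orthogonal Procrustes solution (the slide does not state how $R$ is obtained). The function name and toy matrices are hypothetical.

```python
import numpy as np


def aligned_frobenius_distance(X_k, X_l):
    """||X_k - X_l R||_F with R the orthogonal matrix minimizing the norm.

    Assumption: R solves the orthogonal Procrustes problem, computed from
    the SVD of X_l^T X_k.
    """
    U, _, Vt = np.linalg.svd(X_l.T @ X_k)
    R = U @ Vt
    return np.linalg.norm(X_k - X_l @ R, ord="fro")


rng = np.random.default_rng(0)
X_k = rng.normal(size=(10, 3))               # 10 vertices, d = 3
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # random orthogonal matrix
X_rotated = X_k @ Q                           # same latent geometry, rotated
X_other = rng.normal(size=(10, 3))            # unrelated latent vectors

print(aligned_frobenius_distance(X_k, X_rotated))  # numerically zero
print(aligned_frobenius_distance(X_k, X_other))    # clearly positive
```

The alignment step matters because latent vectors that differ only by a rotation give identical inner products, and hence the same edge probabilities.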


Human Connectome Project

  • Brain scans from identical (monozygotic) twins, fraternal (dizygotic) twins, and non-twin siblings.
  • Regions defined using Glasser parcellation




Van Essen, David C., et al. "The WU-Minn human connectome project: an overview." (2013).

Glasser, Matthew F., et al. "A multi-modal parcellation of human cerebral cortex." Nature (2016).

Genome and connectomes are dependent





Sex All Females Males
p-value

Neuroanatomy (effect mediator)

  • The literature shows that neuroanatomy (e.g., brain volume) is highly heritable.

  • Want to test:
    $H_0: F(\text{Neuroanatomy}, \text{Genome}) = F(\text{Neuroanatomy}) F(\text{Genome})$
    $H_A: F(\text{Neuroanatomy}, \text{Genome}) \neq F(\text{Neuroanatomy}) F(\text{Genome})$

  • Implication if $H_A$ holds: the causal model should include neuroanatomy.

Genome and neuroanatomy are dependent





Sex All Females Males
p-value

DAG including interactions of neuroanatomy


Do genomes affect connectomes given neuroanatomy?

  • Want a conditional independence test!
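    $H_0: F(\text{Conn.}, \text{Genome} \mid \text{Neuro.}) = F(\text{Conn.} \mid \text{Neuro.}) F(\text{Genome} \mid \text{Neuro.})$
    $H_A: F(\text{Conn.}, \text{Genome} \mid \text{Neuro.}) \neq F(\text{Conn.} \mid \text{Neuro.}) F(\text{Genome} \mid \text{Neuro.})$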

  • Test statistic: Conditional distance correlation (cdcorr)

  • Implication if $H_A$ holds: there exists a causal dependence of connectomes on genomes.

What is conditional distance correlation?

  • Augments the distance correlation procedure with a third distance matrix.
  • $d(\text{Neuroanatomy}_i, \text{Neuroanatomy}_j) = \|\text{Neuroanatomy}_i - \text{Neuroanatomy}_j\|_F$
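
A small sketch of how the third distance matrix could be built, assuming each subject's neuroanatomy is summarized as a feature vector, in which case the Frobenius norm reduces to the Euclidean norm. The feature values below are made up for illustration.

```python
import numpy as np


def neuroanatomy_distance_matrix(features):
    """Pairwise Euclidean distances between rows of an n x p feature array."""
    diffs = features[:, None, :] - features[None, :, :]
    return np.linalg.norm(diffs, axis=-1)


# Hypothetical features: one row per subject, one column per measurement.
features = np.array([
    [0.0, 0.0],
    [3.0, 4.0],
    [6.0, 8.0],
])
D_neuro = neuroanatomy_distance_matrix(features)
print(D_neuro)
```

The conditional test then takes three $n \times n$ matrices with subjects in the same order: one for connectomes, one for genomes, and this one for neuroanatomy.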


Connectomes are still dependent on genome



Sex All Females Males
p-value

Summary


  • Presented a causal model for heritability of connectomes.
  • Leveraged recent advances:
    1. Statistical models for networks, allowing meaningful comparison of connectomes.
    2. Distance and conditional distance correlation as test statistics for causal analysis$^1$.
  • Connectomes are dependent on genome, suggesting heritability.

$^1$ Bridgeford, Eric W., et al. "Batch Effects are Causal Effects: Applications in Human Connectomics." (2021).

Acknowledgements

Team

Mike Powell

Eric Bridgeford

Carey Priebe

Joshua Vogelstein






Additional slides

Causal model

  • Let $X$ denote the exposure, $Y$ the outcome, $W$ the measured covariates, and $Z$ the unmeasured covariates.
  • We want to estimate the effect of different exposures on the outcome, which is quantified using the backdoor formula if $W$ and $Z$ close all backdoor paths.

$f_{w, z}(y \mid x) = \int_{\mathcal{W}\times\mathcal{Z}} f(y \mid x, w, z) f(w, z) \,\mathrm{d}(w, z)$

  • The formula above integrates over the marginal distribution of all measured and unmeasured covariates.

$f(y \mid x) = \int_{\mathcal{W}\times\mathcal{Z}} f(y \mid x, w, z) f(w, z \mid x) \,\mathrm{d}(w, z)$

  • This instead averages the true outcome distribution over the conditional distribution of the measured and unmeasured covariates given the exposure.

Causal model (cont.)

  • We observe the triples $(x_i, y_i, w_i)$ for $i \in [n]$.
  • We can therefore only estimate functions of $(X, Y, W)$.
  • The corresponding hypothesis test is:

$H_0: f(y \mid x, w) = f(y \mid w) \quad \text{vs} \quad H_A: f(y \mid x, w) \neq f(y \mid w).$
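
To see why this test speaks to the causal question, consider the simplified case in which the measured covariates $W$ alone close all backdoor paths (an assumption made only for this sketch, so $Z$ drops out of the backdoor formula). Under $H_0$ the adjusted outcome density no longer depends on the exposure:

```latex
\begin{aligned}
f_{w}(y \mid x)
  &= \int_{\mathcal{W}} f(y \mid x, w)\, f(w)\, \mathrm{d}w
  && \text{(backdoor formula with } W \text{ only)} \\
  &= \int_{\mathcal{W}} f(y \mid w)\, f(w)\, \mathrm{d}w
  && \text{(under } H_0\text{)} \\
  &= f(y).
\end{aligned}
```

When unmeasured covariates $Z$ remain, the same test only supports conclusions about the observed triples $(X, Y, W)$.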

Shortcomings - Network model

  • Problems with connectome estimation.
    • Inability to determine the precise origin/termination of connections in the cortex.
      • -> false negatives
    • Crossing fibers
      • -> false positives
  • The RDPG can only represent a subset of independent-edge networks.


Shortcomings - Model assumptions

  • No interaction between genome and environment
  • No epistasis
    • Epistasis: the effect of one gene depends on another
    • Ex: black hair and baldness
  • No dominance effects
  • Strong assumptions in genetic distances

What are environmental effects?

  • Shared
    • Common experiences of siblings living in the same household.
      • household income, the family’s living situation, the dynamics between the parents, food consumed
  • Non-shared
    • Everything else
    • Epigenetics
    • Luck
    • schools, peers

Random dot product graphs

  • Adjacency spectral embedding
  • Represents the vertices of a graph in $d$ dimensions via the eigendecomposition of its (symmetric) adjacency matrix, $A = USU^\top$, where $U \in \mathbb{R}^{n \times n}$ is the orthogonal matrix of eigenvectors and $S \in \mathbb{R}^{n \times n}$ is a diagonal matrix containing the eigenvalues of $A$ ordered by magnitude.
  • $\mathrm{ASE}(A) = \hat X = \hat U \hat S^{1/2}$, where $\hat U \in \mathbb{R}^{n \times d}$ contains the first $d$ columns of $U$, which correspond to the $d$ largest eigenvalues in magnitude, and $\hat S \in \mathbb{R}^{d \times d}$ is the submatrix of $S$ containing those eigenvalues.
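
The embedding can be sketched with NumPy as below. This is an illustrative version for a symmetric adjacency matrix, not the implementation used for the results. Taking the absolute value of the eigenvalues before the square root is an assumption added here so that negative eigenvalues do not produce NaNs. The function name and the toy graph are hypothetical.

```python
import numpy as np


def adjacency_spectral_embedding(A, d):
    """Return X_hat = U_hat |S_hat|^{1/2} for a symmetric matrix A."""
    eigvals, eigvecs = np.linalg.eigh(A)
    # Indices of the d eigenvalues largest in magnitude, in decreasing order.
    order = np.argsort(np.abs(eigvals))[::-1][:d]
    U_hat = eigvecs[:, order]
    S_hat = eigvals[order]
    # Assumption: use |S_hat| so the square root is always real.
    return U_hat * np.sqrt(np.abs(S_hat))


rng = np.random.default_rng(0)
n, d = 50, 2

# Latent vectors whose inner products are valid probabilities in [0, 1].
X_true = rng.uniform(0.1, 0.7, size=(n, d))
P = X_true @ X_true.T

# Sample an undirected graph without self-loops from P.
upper = np.triu(rng.random((n, n)) < P, k=1).astype(float)
A = upper + upper.T

X_hat = adjacency_spectral_embedding(A, d)       # estimate from the graph
X_exact = adjacency_spectral_embedding(P, d)     # noise-free sanity check
print(X_hat.shape)
print(np.allclose(X_exact @ X_exact.T, P))
```

Embedding the noise-free matrix $P$ recovers it exactly through $\hat X \hat X^\top$, while the embedding of the sampled graph $A$ is only an estimate of the latent vectors up to rotation.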

(aka networks or graphs)