- GLM to get beta for each voxel x condition
- e.g., OLS fitting
- vector of values for each voxel, for each condition
- each condition pattern is a point in n-dim voxel space (its squared distance from the origin is the sum of squares of its components)
- Difference = sum of squared differences between points in voxel space (squared Euclidean distance)
- Distance of a point from the origin = univariate effect, difference from baseline
- Could just subtract the mean from each vector, and move the voxels to plane
- Some voxels might be relatively less active compared to others
- z transform --> divisive normalization of voxel variance
- project to circle?
- 1 - cos(angle between A and B) = distance
- baseline shift can change angle between 2 points, but not Euclidean distance
- try subtracting out mean pattern (but makes patterns anticorrelated!)
- change the pattern scaling (make B slightly stronger)
- changes Euclidean distance, but not correlation
- split-half reliability analysis
- similar reliability between 2 metrics
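The metric properties above (a baseline shift changes the angle between two patterns but not their Euclidean distance, while rescaling one pattern changes the Euclidean distance but not the angle/correlation) can be checked numerically. A minimal numpy sketch; the notes themselves use MATLAB, and all names here are illustrative:

```python
import numpy as np

def euclid(a, b):
    # Euclidean distance between two voxel patterns
    return np.linalg.norm(a - b)

def cos_dist(a, b):
    # 1 - cos(angle between a and b)
    return 1 - a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

rng = np.random.default_rng(0)
a, b = rng.normal(size=50), rng.normal(size=50)

# Baseline shift: leaves Euclidean distance unchanged, but changes the angle
shift = 10.0
assert np.isclose(euclid(a + shift, b + shift), euclid(a, b))
assert not np.isclose(cos_dist(a + shift, b + shift), cos_dist(a, b))

# Rescaling one pattern: changes Euclidean distance, not the angle/correlation
assert not np.isclose(euclid(a, 1.5 * b), euclid(a, b))
assert np.isclose(cos_dist(a, 1.5 * b), cos_dist(a, b))
```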
- Some voxels are noisier than others
- Could weight voxels by their noise level (e.g., using residuals from the error covariance matrix) ~ t-statistic
- Voxel-by-voxel covariance matrix: take the variance from the matrix and weight: $\beta = \beta / \sigma_{voxel}$
- Could normalize by the whole covariance matrix $\Sigma$: $u = \beta \Sigma^{-1/2}$ --> huge increase in reliability
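A sketch of the two normalization schemes above (univariate: divide each voxel's beta by its noise SD; multivariate: post-multiply by $\Sigma^{-1/2}$), assuming a hypothetical residual matrix from which to estimate the noise covariance:

```python
import numpy as np

rng = np.random.default_rng(1)
n_cond, n_vox = 4, 30
beta = rng.normal(size=(n_cond, n_vox))

# Hypothetical residuals used to estimate the noise covariance
res = rng.normal(size=(200, n_vox))
Sigma = np.cov(res, rowvar=False)

# Univariate normalization: divide each voxel by its noise SD (~ t-statistic)
beta_uni = beta / np.sqrt(np.diag(Sigma))

# Multivariate normalization: post-multiply by Sigma^(-1/2)
evals, evecs = np.linalg.eigh(Sigma)
Sigma_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
beta_multi = beta @ Sigma_inv_sqrt

# After multivariate normalization the noise covariance is whitened:
# Sigma^(-1/2) Sigma Sigma^(-1/2) = I
whitened = Sigma_inv_sqrt @ Sigma @ Sigma_inv_sqrt
assert np.allclose(whitened, np.eye(n_vox))
```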
- Noise drives distances further apart (although the ranks of distances are preserved)
- Multiple runs in the experiment
- Each estimated voxel pattern = true pattern + noise
- run-specific noise that is independent between runs
- estimated $(u_a - u_b)$ from run 1 $\times$ $(u_a - u_b)^T$ from run 2
- since noise is independent between runs, the CV distance is an unbiased estimate of the true distance
- average across folds if you can have more folds (e.g., leave-one-out)
- e.g., with 4 runs: $\frac{1}{4}\sum_{i=1}^{4} d_i \left(\tfrac{1}{3}\sum_{j \neq i} d_j\right)^T$
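The bias argument above can be simulated: for two conditions with identical true patterns, the naive squared distance is inflated by the noise variance, while the cross-validated (fold-wise) estimate sits near zero. An illustrative numpy sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
n_runs, n_vox = 8, 100
true_delta = np.zeros(n_vox)           # conditions A and B truly identical
# Estimated pattern difference per run = true difference + independent noise
delta_hat = true_delta + rng.normal(size=(n_runs, n_vox))

# Naive distance: squared norm of each run's estimate (positively biased)
d_naive = np.mean(np.sum(delta_hat ** 2, axis=1)) / n_vox

# Cross-validated distance: inner product of each fold with the average
# of the remaining folds (noise is independent -> bias cancels)
d_cv = 0.0
for i in range(n_runs):
    others = np.delete(delta_hat, i, axis=0).mean(axis=0)
    d_cv += delta_hat[i] @ others
d_cv /= n_runs * n_vox

# With a true distance of zero, the CV estimate is near 0,
# while the naive estimate is inflated by the noise variance (~1)
assert abs(d_cv) < 0.2
assert d_naive > 0.5
```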
- CV Mahalanobis distance has better split-half reliability than multivariate noise normalization when SNR is low
- Multivariate noise normalization > CV Mahalanobis distance > plain distance!
- For each cell in the RDM: random-effects t-test (vs. 0) across subjects, then FDR-correct and threshold
- Fixed effects within subject: can get a linear discriminant t-value (LD-t) by normalizing the distance by its standard error
- ratio between values is no longer preserved, because SD might differ between diff contrasts
- Instead of RDM, do pairwise classification accuracies, and insert into matrix
- But pattern classifiers are quantized (broken into bins)
- If doing binary classification, then fine, but not for investigating representation
- 3 main steps:
- compute/visualize RDMs from different brain regions
- compare brain and model RDMs
- statistical inference
- a vectorized RDM is a way to convert a symmetric RDM to a vector of length N(N-1)/2
- squareRDMs() to convert back to full RDM
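A minimal numpy sketch of the vectorize/square round trip (same convention as the toolbox; the function names here are illustrative, not the toolbox's):

```python
import numpy as np

def vectorize_rdm(rdm):
    # Upper-triangle entries of a symmetric RDM -> vector of length N(N-1)/2
    i, j = np.triu_indices(rdm.shape[0], k=1)
    return rdm[i, j]

def square_rdm(vec, n):
    # Inverse: rebuild the full symmetric N x N RDM (zero diagonal)
    rdm = np.zeros((n, n))
    i, j = np.triu_indices(n, k=1)
    rdm[i, j] = vec
    rdm[j, i] = vec
    return rdm

n = 5
rng = np.random.default_rng(3)
d = rng.random(n * (n - 1) // 2)      # 10 unique dissimilarities for n = 5
assert np.allclose(vectorize_rdm(square_rdm(d, n)), d)
```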
- Naming convention for RDMs:
- structure = RDM
- field: RDM = dissimilarity matrix
- field: name = 'hIT | BE | Session: 1' (separate info w/vertical bars)
- field: color = [0 0 1]
- averageRDMs_subjectSession(RDMs, 'session') can average over the session id in the name field
- Multidimensional Scaling (MDS)
- show each condition with plot, and specify color of the plot
- project data into 2D space, and visualize on 2D circle
- Shepard plot: shows similarity vs. disparity
- Pearson & Spearman rank correlations
- MDSConditions(RDMs), where each RDM struct has name, color, and RDM fields
- Dendrogram: average RDMs across subjects, and visualize the categorical structure
- hierarchical clustering (exploratory!)
- plots small colored dots that match up with the MDS plot
- dendrogramConditions()
- Comparing RDMs (brain vs. model)
- Create RDM that is dissimilarity between separate RDMs
- How to quantify correlation
- Between values: Pearson, rank correlation (e.g., Spearman, Kendall's $\tau_a$)
- Between correlation matrices: Kendall's $\tau_a$ is the correlation measure for the 2nd-order correlation matrix
- Pearson assumes linearity, but should use rank correlation, e.g., how similar are the ranks
- Spearman correlation = correlation of the ranks (rank dissimilarities, and then compute pearson corr)
- doesn't penalize predictions of tied dissimilarities
- Kendall's $\tau_a$ rewards more specific predictions
- within one RDM, take 2 dissimilarities and see if one is larger than the other; repeat for the same pair in the other matrix
- If the 2 RDMs agree on the ordering, give a rank of 1, otherwise 0. Then take the proportion of 1s over all pairs of dissimilarities
- allows for detailed predictions (e.g., .7 in one condition, .8 in the other)
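The pairwise-ordering procedure above is closely related to Kendall's $\tau_a$. A small illustrative implementation (this sign-counting version scores ties as 0, matching the $\tau_a$ definition):

```python
import numpy as np
from itertools import combinations

def kendall_tau_a(x, y):
    # Kendall's tau_a: (concordant - discordant) / total pairs.
    # Unlike tau_b, ties are not corrected for, so a model that
    # predicts ties is penalized relative to one that makes a call.
    n = len(x)
    s = 0
    for i, j in combinations(range(n), 2):
        s += np.sign(x[i] - x[j]) * np.sign(y[i] - y[j])
    return s / (n * (n - 1) / 2)

# Monotonically related vectors -> tau_a = 1
assert kendall_tau_a([1, 2, 3, 4], [10, 20, 30, 40]) == 1.0
# Ties in the model predictions reduce tau_a below 1
assert kendall_tau_a([1, 1, 2, 2], [1, 2, 3, 4]) < 1.0
```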
- Correlation between RDMs:
pairwiseCorrelateRDMs(cell array {models})
- MDS for RDM-RDM similarities:
MDSRDMs()
- 'subjectRFXsignedRank' (random-effects signed-rank test across subjects)
- plotpValues = '=' or '*' (write the p-value, or * for significance)
- Height of bars is the average correlation across subjects
- Threshold for multiple comparisons
- FDR, or Bonferroni (over the $\binom{N}{2}$ comparisons)
- When comparing 2 RDMs, and not enough evidence for base RDM, could do bootstrapping
userOptions.candRDMdifferencesTest = 'conditionRFXbootstrap'
- Noise ceiling: Compute threshold (gray) for maximum correlations for this dataset
- Upper bound
- best you can do is at the mean of all the subjects (group mean of average RDMs is the best)
- for RDMs, rarely use Euclidean distance, but rather Pearson correlation (or Spearman, etc.)
- but $d = \sqrt{2}\sqrt{1-r}$ for normalized patterns, so correlation distance is monotonically related to Euclidean distance
- for Spearman, do the same thing as Pearson, but rank-transform first
- Kendall $\tau_a$ --> gradient descent (to maximize the correlation)
- Lower bound
- Leave 1-subject out, average RDMs for n-1, and repeat for all folds, then average
- Underfit (so lower bound on ceiling), but still pretty good
- If model is between lower and upper bound, doing pretty well.
- But, what if data noisy?
- Might be useful to report the value of the ceiling, since can't really do much better, given the noise.
- Upper bound statistic: stats_p_r.ceiling(2)
- Demo script: DEMO1_RSA_ROI_sim.m
- Edit betaCorrespondence.m
- Might want to use t statistics, rather than betas
- Example:
- finger representations in M1
- thumb, index... pinky
- patterns are unique(ish) for each finger, but within subject
- pattern is different in subject 2 (r between subjects is low)
- but the relationship in peaks is somewhat consistent across subjects
- cross-validated distance: $(u_A^{(1)} - u_B^{(1)})(u_A^{(2)} - u_B^{(2)})^T$ (run 1 x run 2)
- 1 ROI, 2 hemispheres, 2 hands, 6 subjects, 10 conditions, 8 runs
- Estimate betas, residuals, and multivariate noise normalization
- rsa_noiseNormalizeBeta(Y,SPM), where Y is the time series after preprocessing
- Estimate beta coefficients and residuals from the preprocessed time series
- ordinary least squares estimate of beta_hat = inv(X'*X)*X'*Y
- residuals: res = Y - X*beta
- Run-wise multivariate noise normalization
- For each run, estimate the residual variance for each voxel (Sw_hat)
covdiag(res(indices_run_i,:))
- ~100 time points per 1050 voxels; if just cov, then rank deficient, and need to be able to invert
- this applies shrinkage estimation, so covariance matrix of resids (voxel to voxel), and optimal estimation of shrinkage
- Multivariate noise normalization
beta_hat(indices_run_i,:)*Sw_hat(:,:,run_i)^(-1/2)
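covdiag() performs shrinkage toward the diagonal with an optimally chosen shrinkage amount; as a sketch of why this helps, here is a simplified version with a fixed, hypothetical shrinkage parameter:

```python
import numpy as np

def shrink_cov_diag(res, lam=0.4):
    # Regularized covariance: shrink the sample covariance toward its
    # diagonal. covdiag() in the toolbox chooses lam optimally
    # (Ledoit-Wolf style); here lam is a fixed, hypothetical value.
    S = np.cov(res, rowvar=False)
    return (1 - lam) * S + lam * np.diag(np.diag(S))

# Fewer time points than voxels: the raw covariance is rank-deficient,
# but the shrunk estimate is positive definite and invertible
rng = np.random.default_rng(4)
res = rng.normal(size=(100, 150))          # 100 time points, 150 voxels
S_raw = np.cov(res, rowvar=False)
S_shrunk = shrink_cov_diag(res)
assert np.linalg.matrix_rank(S_raw) < 150
assert np.all(np.linalg.eigvalsh(S_shrunk) > 0)
```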
- Calculate the CV squared Euclidean distances
- Mahalanobis distance (LDC)
- Compute pairwise distances
- Leave 1-run out, and cross validate
- Take the pseudoinverse of X, and apply it to each partition (10 x number_voxels)
A(:,:,i) = pinv(Xa)*Ya; B = pinv(Xb)*Yb;
- Compute the difference between these in each partition
- d(i,:) = sum((C*A(:,:,i)).*(C*B),2)';
- (1 x 45 vector --> 10 x 10 matrix)
- Take average across CV partitions
d = sum(d)./numPart;
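The fold loop above (pairwise contrast matrix C, per-partition pattern estimates, train-test inner products averaged over folds) can be sketched in numpy as follows; the pattern estimates are simulated here rather than computed from a design matrix:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(5)
n_cond, n_vox, n_part = 10, 60, 8

# Pairwise contrast matrix C: one row per condition pair (45 x 10)
pairs = list(combinations(range(n_cond), 2))
C = np.zeros((len(pairs), n_cond))
for r, (i, j) in enumerate(pairs):
    C[r, i], C[r, j] = 1.0, -1.0

# Hypothetical per-partition pattern estimates (in the demo these come
# from pinv(X) applied to each run's data)
true_U = rng.normal(size=(n_cond, n_vox))
U = true_U + 0.5 * rng.normal(size=(n_part, n_cond, n_vox))

# Leave-one-run-out cross-validated squared distances, averaged over folds
d = np.zeros(len(pairs))
for i in range(n_part):
    A = U[i]                                   # training partition
    B = np.delete(U, i, axis=0).mean(axis=0)   # independent test data
    d += np.sum((C @ A) * (C @ B), axis=1)
d /= n_part * n_vox

assert d.shape == (45,)                        # 10 choose 2 distances
```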
- Show dissimilarity matrix, rank-transformed, scaled (0,1)
- if rank transformed, then highlights small differences; analyses should be done on raw values
- t-value that tells us how likely patterns are truly dissimilar
- Normalize the contrast by $SE_{LDC}$
- Take the diagonal of the noise covariance matrix (variances), and multiply by the training weights $w = C A$ (i.e., $u_A - u_B$)
- A(:,:,i) = pinv(Xa)*Ya; B = pinv(Xb)*Yb;
w = (C*A(:,:,i));
d(i,:) = sum(w.*(C*B),2)';
BCovb = SPM.xX.Bcov(partition==part(i),partition==part(i));
sigma = sum(w*Sw(:,:,i).*w,2);
se = sqrt(sigma.*diag(C*BCovb*C'))';
d(i,:) = d(i,:)./se;
- Can easily threshold matrix w/p-value for visualization purposes
(Jorn Diedrichsen, UCL)
- How are different features of a stimulus (e.g., color, shape) observed in neural activity?
- How many voxels are activated by a given feature?
- Approach:
- weighting of different features
- for instance, red might not be coded for, but blue a lot
- $\beta$ matrix: k conditions X p voxels; each row, $u$, is a pattern of voxels
- Compute distances (d) between patterns: $d_{ij} = (u_i - u_j)(u_i - u_j)^T = u_i u_i^T + u_j u_j^T - 2 u_i u_j^T$
- Pattern "covariance": $G_{ij} = u_i u_j^T$, related to pattern distance: $d_{ij} = G_{ii} + G_{jj} - 2 G_{ij}$
- Can go from covariance/inner product matrix to the Mahalanobis-distance matrix, just need the origin
- Make sure sum of rows/columns = 0
- If you can prove something for covariance matrix, applies to distance matrix too
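The G-to-D relationship and the role of the origin can be verified numerically: with patterns centered so that G's rows and columns sum to 0, double-centering recovers G from D. An illustrative sketch:

```python
import numpy as np

rng = np.random.default_rng(6)
U = rng.normal(size=(5, 40))                 # 5 condition patterns
U -= U.mean(axis=0)                          # fix the origin at the mean pattern
G = U @ U.T                                  # second-moment ("covariance") matrix

# D_ij = G_ii + G_jj - 2 G_ij
g = np.diag(G)
D = g[:, None] + g[None, :] - 2 * G

# Going back requires an origin: double-centering (H = I - 1/n) recovers G
n = G.shape[0]
H = np.eye(n) - np.ones((n, n)) / n
G_back = -0.5 * H @ D @ H
assert np.allclose(G_back, G)
```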
- Each pattern ($u$) is a sum of different features ($f$) times a feature pattern ($w_h$): $u = \sum_h f_h w_h$
- As the feature weighting changes, how does the RDM change?
- $G = UU^T = \sum_h f_h w_h \sum_h w_h^T f_h^T$
- if feature patterns are independent, then $w_i w_j^T = 0$
- $G = \sum_h f_h w_h w_h^T f_h^T$, since the cross sums cancel if the weight inner products = 0
- $w_h w_h^T$ = strength with which a feature is represented
- $f_h f_h^T$ = how stimuli differ from each other
- $G = \sum_h \omega_h G_h$ (where $\omega_h = w_h w_h^T$ and $G_h = f_h f_h^T$)
- Distance matrix: $D = \sum_h \omega_h D_h$
- How well does a neural area represent part of the model, rather than estimating weight for each feature
- could group features into representational components
- sometimes we want to think of the features as dependent, loading onto a "component"
- beta matrix = sum of F_h (k conditions x q features) * w_h (q features x p voxels)
- Example:
- 5 different stimuli (conditions 1-5, linearly increasing brightness) = F_h
- Feature correlation: $V_h = 1$
- Covariance: $G_h = F_h V_h F_h^T$
- Distance component: $d_{ij} = G_{ii} + G_{jj} - 2 G_{ij}$
- Example 2:
- Faces, places, different features
- Correlation matrix = [1 0; 0 1]
- Covariance: faces related to each other, not places, vice versa
- Distance matrix = reverse of covariance matrix
- Features don't really matter; it's the component matrices (covariance & distance) that you get out
- The exact values of the features don't matter too much; many different feature weightings give rise to the same component matrices
- **Estimation**
- $D = \sum_h \omega_h D_h$ --> linear regression!
- Vectorize the distances, then build the component design matrix $X = [d_1\ d_2]$
- $\omega = (X^T X)^{-1} X^T d$
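A sketch of this regression with two hypothetical component matrices, checking that the true weights are recovered in the noiseless case:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 6

def random_component(rng, n):
    # A hypothetical component distance matrix (symmetric, zero diagonal)
    M = rng.random((n, n))
    M = (M + M.T) / 2
    np.fill_diagonal(M, 0)
    return M

D1, D2 = random_component(rng, n), random_component(rng, n)
true_omega = np.array([2.0, 0.5])
D = true_omega[0] * D1 + true_omega[1] * D2   # observed RDM (noiseless)

# Vectorize upper triangles and regress: omega = (X'X)^-1 X'd
iu = np.triu_indices(n, k=1)
X = np.column_stack([D1[iu], D2[iu]])
d = D[iu]
omega = np.linalg.solve(X.T @ X, X.T @ d)
assert np.allclose(omega, true_omega)
```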
- Integrated vs. independent encoding of features?
- How to test?
- Vary both factorially in design, and test for interaction
- Where is factor A encoded? where is B encoded?
- How to test?
- Example:
- A: grasp, pinch, grip; B: see, do
- Start with features:
- Factor A: 6 x 3 ([1 0 0; 0 1 0; 0 0 1; 1 0 0; 0 1 0; 0 0 1])
- Factor B: 6 x 2 ([1 0; 1 0; 1 0 ...])
- Interaction: A * B
- Representational components
- component matrix for each feature
- Determine weights for each component
- string representational components together into component design matrix (X)
- Get data
- Y --> betas --> pattern estimates --> D (RDM)
- Bring representational components and D together w/linear regression
$\omega = (X^T X)^{-1} X^T vec(\hat{D})$
- $(X^T X)^{-1} X^T$ yields interesting matrices
- Similar to cross-"decoding": train a classifier on see-grasp, test on do-pinch/grasp
- Get component weights for each factor (simulate over many iterations)
- if weights around 0, then doesn't code for that factor.
- Models for BOLD signals adding
- patterns engage independent neural subpopulations
- combine linearly to determine firing rate
- relationship between neural activity and BOLD is approximately linear
- NOT strictly true! (the relationship is closer to logarithmic)
- BUT, if within the range of a normal task (small % signal change), it's a small variation around the signal, and a linear approximation of the nonlinear function is fine
- Coding of action sequences (Yokoi et al., in prep)
- Set up
- sequences consisting of chunks
- where are sequences encoded in motor areas?
- what is the nature of chunks?
- finger presses, chunks, 2nd level chunks, entire sequence?
- From features, create representational component matrices
- feature 1: finger transitions for each sequence --> covariances --> distances
- feature 2: chunks (8), sequence # (11) x 8 chunks --> covariances --> distances
- feature 3: 2nd order chunks ...
- feature 4: sequences
- Note: component matrices are not orthogonal, so need to worry about that, but try to maximize orthogonality in the design
- OLS
- assume observations are independent (I)
- equal variance (I)
- normally distributed (D)
- If this is violated, then have an unbiased estimator, but not the best linear unbiased estimator (BLUE)
- not the one with lowest variance
- If distances are non-zero, variance/covariance go up. If add noise, variance/covariance goes up
- variance($\hat{d}$) = true variance of $dd^T$ + distance-dependent variance ($\propto \delta\delta^T$) + a constant that depends on the noise
- related to how you do cross-validation
- usually maximize & exhaust the CV folds
- If SNR is low, the independent noise term dominates; otherwise the distance-dependent term dominates
- Can do iteratively reweighted least squares (IRLS; since the variances depend on the true distances)
- start with some guess of the component weights ($\omega$)
- predict distances: $\hat{d} = X\omega$
- calculate the variance-covariance of d
- use this in the estimation (weighted least squares), get the best estimate
- repeat until convergence
- moderate improvements in SD, but how much it improves estimates over OLS depends on your model
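A toy IRLS loop under an assumed distance-dependent noise variance (all quantities here are simulated and the variance model is hypothetical; the real procedure uses the analytically computed variance-covariance of the distances):

```python
import numpy as np

rng = np.random.default_rng(8)
n_obs = 45
X = rng.random((n_obs, 2))                 # hypothetical component design matrix
true_omega = np.array([1.5, 0.8])
mu = X @ true_omega
# Simulated noise whose variance grows with the true distance
d = mu + rng.normal(size=n_obs) * np.sqrt(0.005 + 0.02 * mu)

# Step 0: plain OLS guess of the component weights
omega = np.linalg.lstsq(X, d, rcond=None)[0]
for _ in range(20):
    d_hat = X @ omega                                     # predict distances
    w = 1.0 / (0.005 + 0.02 * np.clip(d_hat, 0, None))   # inverse-variance weights
    Xw = X * w[:, None]
    omega_new = np.linalg.solve(Xw.T @ X, Xw.T @ d)      # weighted LS update
    if np.allclose(omega_new, omega):                    # stop at convergence
        break
    omega = omega_new
```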
- Sometimes model components are non-linear functions of parameters
- Example:
- area might have narrow tuning curve, or wider (tuning widths change)
- can express as feature vectors
- the mapping between feature matrices and distances is nonlinear
- How to solve:
- could have linear approximation (model 1st derivatives), like AR-estimation in 1st level SPMs
- estimate nonlinear parameters to optimize log-likelihood, e.g., log p(d|$\theta$)
- when to search for better model (not at noise ceiling yet), or better data (in noise ceiling)
- Correlation distance proportional to squared Euclidean distance (for normalized patterns)
- Comparing RDMs
- use correlations, not distances, mostly since want a bigger bar for better model
- Normalize the pattern vectors, so length = 1, with angle $\alpha$ between them
- d is the Euclidean distance between them; $\cos(\alpha) = r$; correlation distance is $1 - r$
- $d^2 = 2(1-r)$
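The identity above is easy to verify numerically for unit-length, mean-centered patterns:

```python
import numpy as np

rng = np.random.default_rng(9)
a, b = rng.normal(size=80), rng.normal(size=80)

def norm_pattern(x):
    # Normalize: subtract the mean, scale to unit length
    x = x - x.mean()
    return x / np.linalg.norm(x)

a_n, b_n = norm_pattern(a), norm_pattern(b)
r = a_n @ b_n                       # cos(alpha) = Pearson r for normalized patterns
d = np.linalg.norm(a_n - b_n)       # Euclidean distance between them
assert np.isclose(d ** 2, 2 * (1 - r))
```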
- Max (upper bound) = group-mean RDM
- Min (lower bound) = leave-one-out RDMs
- Important when you have categorical models that predict tied dissimilarities (e.g., if predict 1 for within and 0 for between categories)
- Have values for x and y
- compute 2 difference matrices: a (differences between all points on x), b (diff between all points on y)
- coeff = normalized inner product of a and b (just Pearson)
- Kendall -> signed differences (concordant or discordant pairs: 1 if both go up/down, 0 if one goes up and one goes down)
- Spearman --> ranked differences
- Using Pearson or Spearman, the true model can lose to a simplified step model, but using Kendall $\tau_a$, the true model can't lose
- Might not know how brain-activity measurements mix and weight representational dimensions
- Local averaging w/voxels in fMRI
- sparse, biased neuronal sample in cell recordings
- Toolboxes:
- rsatoolbox.v.2a
- DataFrame
- toolboxes from Jorn's lab for graphing
- MANOVA design
- Define how the conditions relate to the features
- Make representational model matrices
rsa_modelRDMsMANOVA(F_A, F_B)
- Visualize the pseudo-inverse matrices (to see what's going into the regression)
- Get prewhitened data from ROIs
- get $\beta$s: rsa_noiseNormalizeBetas()
- get CV distances: rsa_ldc()
- NOTE! Distances should be LDC. This might not hold for correlations, LD-t, etc.
- Plot for each subject for visualization
- don't want to do rank transform
- Can apply nonlinear function to data before plotting:
rsa_figureRDMs('transformFcn', 'fnc_name')
- e.g., 'ssqrt' -> sign preserving sqrt
- Plot average RDMs across subjects
rsa_flattenRDMs(RDMs)
- R-like function to average over subjects:
T=tapply(Data,{'regType','regSide'},{'RDM'})
- Put into RDM format:
avrgRDMs=rsa_foldRDMs(T);
- Plot:
rsa_figureRDMs(avrgRDMs,'rankTransform',0,'transformFcn','ssqrt');
- Estimate component weights w/normal OLS regression
- Basically multiply each RDM by the 3 model RDMs (M)
- rsa_fitRepModelRegress(M,RDMs,'method','OLS')
- Arguments: Model, and RDMs (subject x condition x ROI),
- Returns: omegas, 1 per Model (e.g., 448 x 3)
- Flatten Model if necessary (# models X 36)
- Y = (448 X 36)
- OLS
omega = Y*pinv(X);
- Hierarchical design/Linear model
- Making models from features
- Features:
- Rows = sequences, and columns = features from sequences
- Each sequence might be represented
- e.g., identity matrix (size 8, the # of conditions)
- Each chunk might be represented
- sequence #1 has chunks 2, 3, and shares chunk with 5
rsa_features2ModelRDMs()
- Features:
- Load in data in RDMs
- Calculate the variance-covariance of the distances
- given by rsa_LDCsigma; just save Sig into the data structure
- Use the variance-covariance in the estimation of $\omega$
- Repeat the update of $\omega$ until convergence
- Where in the brain is something represented?
- for each voxel in the brain, take a sphere of voxels around it, and calculate the RDM
- Issues
- If interested in one region, sphere around center voxel might go into other areas of non-interest
- Solutions
- For cortical regions, you could use an anatomically guided surface searchlight (surface-based)
- Could draw a mask for anatomical ROI and constrain searchlight to ROI
- Have spheres that cut off at boundaries (e.g., just within cerebellum)
- could specify a minimum number of voxels to include
- If want to look at cortex and other areas, can force it so that only one region can claim a voxel
- Example:
- Location of representations of skilled finger movements
- Generate searchlight indices
rsa_defineSearchlight({M},[],'sphere',[40 160]);
- M = functional mask, [] means just the functional mask, look everywhere
- at center node, give all voxels within 40 mm, but at least 160 voxels
- to get fixed radius, just [40]
- Returns: structure with fields LI, voxels, voxmin, etc.
- every row is one searchlight (linear indices to center voxels of searchlight)
- 80613 total in tutorial
- L.LI{1} --> 160 voxels
- L.structure --> all 1s, since only looking at 1 region
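A simplified volumetric searchlight definition, as a sketch of what rsa_defineSearchlight computes (an assumption for brevity: the toolbox grows the sphere until it reaches the voxel minimum, whereas this sketch just drops undersized centers):

```python
import numpy as np

def define_searchlights(mask, radius, min_vox):
    # For each voxel in a boolean 3-D mask, collect linear indices of
    # in-mask voxels within `radius` (in voxel units); centers with fewer
    # than `min_vox` neighbours are dropped rather than grown.
    coords = np.argwhere(mask)
    lin = np.ravel_multi_index(coords.T, mask.shape)
    searchlights = []
    for c in coords:
        dist = np.linalg.norm(coords - c, axis=1)
        neigh = lin[dist <= radius]
        if len(neigh) >= min_vox:
            searchlights.append(neigh)
    return searchlights

mask = np.ones((8, 8, 8), dtype=bool)
L = define_searchlights(mask, radius=2.0, min_vox=10)
# Every searchlight meets the minimum size and indexes in-mask voxels
assert all(len(s) >= 10 for s in L)
```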
- Run searchlight
- Define contrast vector for RDM:
D = load(fullfile(Opt.rootPath,Opt.spmDir,'SPM_info.mat'));
Opt.conditionVec = (D.hand == 1 & D.stimType==0) .* D.digit;
- Which trials belong to the right hand, active movements, coded by digit (5 conditions)
- Actually run the searchlight:
rsa_runSearchlightLDC(L,Opt)
- Returns:
- EPI size x 10 (5 x 5 RDM (10 unique distances))
- Take mean of distances at each voxel (collapse over all 10 values, so just mean distance)
- Define contrast vector for RDM:
- Generate searchlight indices from multiple regions (functional and anat)
L = rsa_defineSearchlight(M(2:3),M{1},'sphere',[40 160]);
- Smaller # of indices, just for 2 masks, within functional mask
- Surface-based searchlight
- define searchlight on the surface (e.g., lh.pial, lh.white)
- Gifti is like nifti, but for surface
- Convert FS files to .gii
rsa_readSurf({white},{pial});
rsa_defineSearchlight(S,mask,'sphere',[20 80]);