diff --git a/.gitignore b/.gitignore
index bbfbee8..8b0333f 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,4 +1,5 @@
 .DS_Store
+.vscode/
 ## Core latex/pdflatex auxiliary files:
 *.aux
diff --git a/README.md b/README.md
index 3f0b273..bfb18e3 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,24 @@
 # Robust time-varying functional connectivity estimation and its relevance for depression
+Thesis defended on December 1st, 2022.
+
+## Abstract
+
+This thesis investigates how to robustly estimate time-varying functional connectivity (TVFC), a construct in neuroimaging research that captures changes in functional coupling (correlation between time series) between brain regions during a functional magnetic resonance imaging (fMRI) scan, and how it can be used as a lens through which to study depression as a functional disorder.
+
+Unfortunately, the field of TVFC is still riddled with uncertainty, especially regarding its estimation. This is mainly due to the absence of a ground truth. Without resolving this first, the value of any study, including this depression study, is significantly undermined and the conclusions drawn therein are less trustworthy. Therefore, I propose a novel and principled method for estimating TVFC, based on the Wishart process (WP), a covariance matrix stochastic process that has recently become computationally tractable, and introduce a comprehensive benchmarking framework based on machine learning principles to verify that it performs better than existing methods in the field. These benchmarks include simulations, subject phenotype prediction, test-retest studies, brain state analyses, external task prediction, and a range of qualitative method comparisons. Furthermore, I introduce a benchmark based on cross-validation that can be run on any data set. The WP model is found to outperform other common estimation methods, such as sliding-windows (SW) approaches and dynamic conditional correlation (DCC).
+
+Returning to the depression study, several differences are found between depressed and healthy control cohorts. The study is run on thousands of participants from the UK Biobank, yielding unprecedented statistical power and robustness. I investigate connectivity between individual brain regions as well as functional networks (FNs). Depressed participants show decreased global connectivity and increased connectivity instability (as measured by the temporal characteristics of estimated TVFC). By defining multiple depression phenotypes, I find that brain dynamics are especially affected when patients have been professionally diagnosed with depression or indicated being depressed at the time of their fMRI scan, but less so (or not at all) for phenotypes based on self-reported past instances and genetic predisposition. I show that choosing a different TVFC estimation method would have changed our scientific conclusions. This sensitivity to seemingly arbitrary researcher choices highlights the need for robust method development and the importance of community-approved benchmarking.
+
+I wrap up this thesis with a discussion of results and how this style of work fits into the bigger picture of neuroscientific research, reflect on what has been learned about depression, and posit promising directions for future work.
+## Source code for experiments
+
+https://github.com/OnnoKampman/neuro-dynamic-covariance
+
+## Word count
+
+`ps2ascii main.pdf | wc -w`
+
 ## LaTeX
 This document is generated using `Latexmk` version 4.77 and Biber version 2.17.
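Assuming the top-level source file is `main.tex` (consistent with the `main.pdf` referenced in the word count above), a typical build and clean-up might look like:

```sh
latexmk -pdf main.tex   # compile with pdflatex; latexmk re-runs biber/bibtex as needed
latexmk -C              # remove all generated files, including the PDF
```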
@@ -9,5 +28,7 @@ You may need to clear the cache by running `rm -rf $(biber --cache)` if you enco ## Inspiration [1] https://github.com/kks32/phd-thesis-template + [1] https://github.com/duvenaud/phd-thesis + [1] https://jwalton.info/Embed-Publication-Matplotlib-Latex/ diff --git a/appendix/03_extra_benchmarking_results.tex b/appendix/03_extra_benchmarking_results.tex new file mode 100644 index 0000000..7d3dc61 --- /dev/null +++ b/appendix/03_extra_benchmarking_results.tex @@ -0,0 +1,333 @@ +\chapter{More benchmarking results}\label{appendix:more-benchmarking-results} +%%%%% + +%% +\section{Simulations: Optimal learned window lengths}\label{appendix:sim-optimal-window-lengths} +%% + + +\begin{figure}[ht] + \centering + \includegraphics[width=\textwidth]{fig/sim/d2/N0120_T0200/no_noise/SW_cross_validated_optimal_window_lengths} + \caption{ + Simulations benchmark optimal cross-validated window lengths learned from bivariate ($D = 2$) noiseless data for $N = 120$. + Each dot represents one of $T = 200$ trials. + }\label{fig:sim-optimal-window-lengths-N120} +\end{figure} + + +\begin{figure}[ht] + \centering + \includegraphics[width=\textwidth]{fig/sim/d2/N0200_T0200/no_noise/SW_cross_validated_optimal_window_lengths} + \caption{ + Simulations benchmark optimal cross-validated window lengths learned from bivariate ($D = 2$) noiseless data for $N = 200$. + Each dot represents one of $T = 200$ trials. + }\label{fig:sim-optimal-window-lengths-N200} +\end{figure} + + +\begin{figure}[ht] + \centering + \includegraphics[width=\textwidth]{fig/sim/d2/N1200_T0200/no_noise/SW_cross_validated_optimal_window_lengths} + \caption{ + Simulations benchmark optimal cross-validated window lengths learned from bivariate ($D = 2$) noiseless data for $N = 1200$. + Each dot represents one of $T = 200$ trials. + }\label{fig:sim-optimal-window-lengths-N1200} +\end{figure} + + +%% +\clearpage +\section{Simulations: Learned kernel lengthscales}\label{appendix:sim-kernel-lengthscales} +%% + + +\begin{figure}[ht] + \centering + \includegraphics[width=\textwidth]{fig/sim/d2/N0120_T0200/no_noise/SVWP_kernel_lengthscales} + \caption{ + Simulations benchmark SVWP kernel lengthscales $l$ learned from bivariate ($D = 2$) noiseless data for $N = 120$. + Each dot represents one of $T = 200$ trials. + }\label{fig:sim-learned-kernel-lengthscales-N120} +\end{figure} + + +\begin{figure}[ht] + \centering + \includegraphics[width=\textwidth]{fig/sim/d2/N0200_T0200/no_noise/SVWP_kernel_lengthscales} + \caption{ + Simulations benchmark SVWP kernel lengthscales $l$ learned from bivariate ($D = 2$) noiseless data for $N = 200$. + Each dot represents one of $T = 200$ trials. + }\label{fig:sim-learned-kernel-lengthscales-N200} +\end{figure} + + +\begin{figure}[ht] + \centering + \includegraphics[width=\textwidth]{fig/sim/d2/N1200_T0200/no_noise/SVWP_kernel_lengthscales} + \caption{ + Simulations benchmark SVWP kernel lengthscales $l$ learned from bivariate ($D = 2$) noiseless data for $N = 1200$. + Each dot represents one of $T = 200$ trials. + }\label{fig:sim-learned-kernel-lengthscales-N1200} +\end{figure} + + +%% +\clearpage +\section{Simulations: Impact of noise}\label{ch:appendix-impact-of-noise} +%% + +%% +\subsection{Bivariate TVFC estimates}\label{ch:appendix-d2-impact-of-noise} +%% + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/sim/d2/N0400_T0200/no_noise/all_covs_types_correlations} + \caption{ + Model TVFC predictions on bivariate data for $N = 400$ data points. + No noise added. 
+ }\label{fig:results-all-covariance-structures-tvfc-predictions} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/sim/d2/N0400_T0200/HCP_noise_snr_6/all_covs_types_correlations} + \caption{ + Model TVFC predictions on bivariate data for $N=400$ data points. + HCP noise with SNR of 6 added. + }\label{fig:results-all-covariance-structures-tvfc-predictions-snr-6} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/sim/d2/N0400_T0200/HCP_noise_snr_2/all_covs_types_correlations} + \caption{ + Model TVFC predictions on bivariate data for $N = 400$ data points. + HCP noise with SNR of 2 added. + }\label{fig:results-all-covariance-structures-tvfc-predictions-snr-2} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/sim/d2/N0400_T0200/HCP_noise_snr_1/all_covs_types_correlations} + \caption{ + Model TVFC predictions on bivariate data for $N = 400$ data points. + HCP noise with SNR of 1 added. + }\label{fig:results-all-covariance-structures-tvfc-predictions-snr-1} +\end{figure} + + +%% +\clearpage +\subsection{Bivariate quantitative results} +%% + + +\begin{figure}[ht] + \centering + \includegraphics[width=0.84\textwidth]{fig/sim/d2/N0400_T0200/no_noise/correlation_RMSE} + \caption{ + Performance of models on all bivariate synthetic covariance structures without noise for $N = 400$. + Means and standard deviations are shown across $T = 200$ trials. + }\label{fig:results-sim-d2-400-all-correlation-RMSE-no-noise} +\end{figure} + + +\begin{figure}[ht] + \centering + \includegraphics[width=0.84\textwidth]{fig/sim/d2/N0400_T0200/HCP_noise_snr_6/correlation_RMSE} + \caption{ + Performance of models on all bivariate synthetic covariance structures with HCP noise with SNR of 6 added for $N = 400$. + Means and standard deviations are shown across $T = 200$ trials. + }\label{fig:results-sim-d2-400-all-correlation-RMSE-snr-6} +\end{figure} + + +\begin{figure}[ht] + \centering + \includegraphics[width=0.84\textwidth]{fig/sim/d2/N0400_T0200/HCP_noise_snr_2/correlation_RMSE} + \caption{ + Performance of models on all bivariate synthetic covariance structures with HCP noise with SNR of 2 added for $N = 400$. + Means and standard deviations are shown across $T = 200$ trials. + }\label{fig:results-sim-d2-400-all-correlation-RMSE-snr-2} +\end{figure} + + +\begin{figure}[ht] + \centering + \includegraphics[width=0.84\textwidth]{fig/sim/d2/N0400_T0200/HCP_noise_snr_1/correlation_RMSE} + \caption{ + Performance of models on all bivariate synthetic covariance structures with HCP noise with SNR of 1 added for $N = 400$. + Means and standard deviations are shown across $T = 200$ trials. + }\label{fig:results-sim-d2-400-all-correlation-RMSE-snr-1} +\end{figure} + + +%% +\clearpage +\subsection{Trivariate TVFC estimates}\label{ch:appendix-d3d-impact-of-noise} +%% + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/sim/d3d/N0400_T0003/no_noise/periodic_1_correlations} + \caption{ + Model TVFC estimates on dense trivariate data for $N = 400$ data points. + No noise added. + }\label{fig:results-d3d-periodic-1-tvfc-predictions-no-noise} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/sim/d3d/N0400_T0003/HCP_noise_snr_6/periodic_1_correlations} + \caption{ + Model TVFC estimates on dense trivariate data for $N = 400$ data points. + HCP noise with SNR of 6 added. 
+ }\label{fig:results-d3d-periodic-1-tvfc-predictions-snr-6} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/sim/d3d/N0400_T0003/HCP_noise_snr_2/periodic_1_correlations} + \caption{ + Model TVFC estimates on dense trivariate data for $N = 400$ data points. + HCP noise with SNR of 2 added. + }\label{fig:results-d3d-periodic-1-tvfc-predictions-snr-2} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/sim/d3d/N0400_T0003/HCP_noise_snr_1/periodic_1_correlations} + \caption{ + Model TVFC estimates on dense trivariate data for $N = 400$ data points. + HCP noise with SNR of 1 added. + }\label{fig:results-d3d-periodic-1-tvfc-predictions-snr-1} +\end{figure} + + +%% +\clearpage +\section{Simulations: Trivariate TVFC estimates}\label{ch:appendix-d3-tvfc-estimates} +%% + + +\begin{figure}[ht] + \centering + \includegraphics[width=\textwidth]{fig/sim/d3s/N0400_T0003/no_noise/null_correlations} + \caption{ + Simulations benchmark single trial TVFC estimates for null covariance structure, for trivariate ($D = 3$) data for $N = 400$. + }\label{fig:results-d3-no-noise-null-covariance} +\end{figure} + + +\begin{figure}[ht] + \centering + \includegraphics[width=\textwidth]{fig/sim/d3d/N0400_T0003/no_noise/periodic_1_correlations} + \caption{ + Simulations benchmark single trial TVFC estimates for periodic (fast) covariance structure, for dense trivariate ($D = 3$) data for $N = 400$. + }\label{fig:results-d3s-no-noise-periodic-3-covariance} +\end{figure} + + +\begin{figure}[ht] + \centering + \includegraphics[width=\textwidth]{fig/sim/d3s/N0400_T0003/no_noise/periodic_3_correlations} + \caption{ + Simulations benchmark single trial TVFC estimates for periodic (fast) covariance structure, for dense trivariate ($D = 3$) data for $N = 400$. + }\label{fig:results-d3s-no-noise-stepwise-covariance} +\end{figure} + + +%% +\clearpage +\section{Simulations: More quantitative results}\label{appendix:sim-more-quantitative-results} +%% + +%% +\subsection{Bivariate} +%% + + +\begin{figure}[ht] + \centering + \includegraphics[width=0.84\textwidth]{fig/sim/d2/N0120_T0200/no_noise/correlation_RMSE} + \caption{ + Simulations benchmark RMSE between model TVFC estimates and ground truth on all bivariate covariance structures with added rs-fMRI noise (SNR of 2) for $N = 120$. + Means and standard deviations are shown across $T = 200$ trials. + }\label{fig:results-sim-d2-120-no-noise-all-correlation-RMSE} +\end{figure} + + +\begin{figure}[ht] + \centering + \includegraphics[width=0.84\textwidth]{fig/sim/d2/N0200_T0200/no_noise/correlation_RMSE} + \caption{ + Simulations benchmark RMSE between model TVFC estimates and ground truth on all bivariate covariance structures with added rs-fMRI noise (SNR of 2) for $N = 200$. + Means and standard deviations are shown across $T = 200$ trials. + }\label{fig:results-sim-d2-200-no-noise-all-correlation-RMSE} +\end{figure} + + +\begin{figure}[ht] + \centering + \includegraphics[width=0.84\textwidth]{fig/sim/d2/N1200_T0200/no_noise/correlation_RMSE} + \caption{ + Simulations benchmark RMSE between model TVFC estimates and ground truth on all bivariate covariance structures with added rs-fMRI noise (SNR of 2) for $N = 1200$. + Means and standard deviations are shown across $T = 200$ trials. 
+ }\label{fig:results-sim-d2-1200-no-noise-all-correlation-RMSE} +\end{figure} + + +%% +\clearpage +\subsection{Trivariate} +%% + + +\begin{figure}[ht] + \centering + \includegraphics[width=0.84\textwidth]{fig/sim/d3d/N0120_T0200/no_noise/correlation_matrix_RMSE} + \caption{ + Simulations benchmark RMSE between model TVFC estimates and ground truth on all dense trivariate covariance structures with added rs-fMRI noise (SNR of 2) for $N = 120$. + Means and standard deviations are shown across $T = 200$ trials. + }\label{fig:results-sim-d3d-120-no-noise-all-correlation-matrix-RMSE} +\end{figure} + + +\begin{figure}[ht] + \centering + \includegraphics[width=0.84\textwidth]{fig/sim/d3d/N0200_T0200/no_noise/correlation_matrix_RMSE} + \caption{ + Simulations benchmark RMSE between model TVFC estimates and ground truth on all dense trivariate covariance structures with added rs-fMRI noise (SNR of 2) for $N = 200$. + Means and standard deviations are shown across $T = 200$ trials. + }\label{fig:results-sim-d3d-200-no-noise-all-correlation-matrix-RMSE} +\end{figure} + + +\begin{figure}[ht] + \centering + \includegraphics[width=0.84\textwidth]{fig/sim/d3s/N0120_T0200/no_noise/correlation_matrix_RMSE} + \caption{ + Simulations benchmark RMSE between model TVFC estimates and ground truth on all sparse trivariate covariance structures with added rs-fMRI noise (SNR of 2) for $N = 120$. + Means and standard deviations are shown across $T = 200$ trials. + }\label{fig:results-sim-d3s-120-no-noise-all-correlation-matrix-RMSE} +\end{figure} + + +\begin{figure}[ht] + \centering + \includegraphics[width=0.84\textwidth]{fig/sim/d3s/N0200_T0200/no_noise/correlation_matrix_RMSE} + \caption{ + Simulations benchmark RMSE between model TVFC estimates and ground truth on all sparse trivariate covariance structures with added rs-fMRI noise (SNR of 2) for $N = 200$. + Means and standard deviations are shown across $T = 200$ trials. + }\label{fig:results-sim-d3s-200-no-noise-all-correlation-matrix-RMSE} +\end{figure} diff --git a/appendix/04_ukb_with_other_methods.tex b/appendix/04_ukb_with_other_methods.tex new file mode 100644 index 0000000..7fe6a5d --- /dev/null +++ b/appendix/04_ukb_with_other_methods.tex @@ -0,0 +1,554 @@ +\chapter{More UK Biobank results}\label{appendix:more-ukb-results} +%%%%% + +%% +\section{Results for other depression phenotypes}\label{appendix:more-ukb-results-other-phenotypes} +%% + + +\begin{figure}[ht] + \centering + \includegraphics[width=0.9\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/lifetime_occurrence/cohort_comparison/ROI/correlation_TVFC_estimates_SVWP_joint_joint} + \caption{ + Self-reported depression lifetime occurrence analysis - SVWP estimates. + Mean over 808 subjects per cohort for all ROI edges, for three TVFC summary measures. + }\label{fig:ukb-results-lo-roi-cohort-comparison-full-wp} +\end{figure} + + +\begin{figure}[ht] + \centering + \includegraphics[width=0.9\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/self_reported_depression_state/cohort_comparison/ROI/correlation_TVFC_estimates_SVWP_joint_joint} + \caption{ + Self-reported depressed state analysis - SVWP estimates. + Mean over 1,411 subjects per cohort for all ROI edges, for three TVFC summary measures. 
+ }\label{fig:ukb-results-srds-roi-cohort-comparison-full-wp} +\end{figure} + + +\begin{figure}[ht] + \centering + \subcaptionbox{High | TVFC mean\label{fig:ukb-results-pgs-cohort-comparison-full-wp-mean-high}}{\includegraphics[width=0.32\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/pgs/pgs_high/ROI/correlation_TVFC_mean_SVWP_joint}} + \subcaptionbox{Med | TVFC mean\label{fig:ukb-results-pgs-cohort-comparison-full-wp-mean-med}}{\includegraphics[width=0.32\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/pgs/pgs_medium/ROI/correlation_TVFC_mean_SVWP_joint}} + \subcaptionbox{Low | TVFC mean\label{fig:ukb-results-pgs-cohort-comparison-full-wp-mean-low}}{\includegraphics[width=0.32\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/pgs/pgs_low/ROI/correlation_TVFC_mean_SVWP_joint}} + \subcaptionbox{High | TVFC variance\label{fig:ukb-results-pgs-cohort-comparison-full-wp-var-high}}{\includegraphics[width=0.32\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/pgs/pgs_high/ROI/correlation_TVFC_variance_SVWP_joint}} + \subcaptionbox{Med | TVFC variance\label{fig:ukb-results-pgs-cohort-comparison-full-wp-var-med}}{\includegraphics[width=0.32\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/pgs/pgs_medium/ROI/correlation_TVFC_variance_SVWP_joint}} + \subcaptionbox{Low | TVFC variance\label{fig:ukb-results-pgs-cohort-comparison-full-wp-var-low}}{\includegraphics[width=0.32\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/pgs/pgs_low/ROI/correlation_TVFC_variance_SVWP_joint}} + \subcaptionbox{High | TVFC rate-of-change\label{fig:ukb-results-pgs-cohort-comparison-full-wp-roc-high}}{\includegraphics[width=0.32\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/pgs/pgs_high/ROI/correlation_TVFC_rate_of_change_SVWP_joint}} + \subcaptionbox{Med | TVFC rate-of-change\label{fig:ukb-results-pgs-cohort-comparison-full-wp-roc-med}}{\includegraphics[width=0.32\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/pgs/pgs_medium/ROI/correlation_TVFC_rate_of_change_SVWP_joint}} + \subcaptionbox{Low | TVFC rate-of-change\label{fig:ukb-results-pgs-cohort-comparison-full-wp-roc-low}}{\includegraphics[width=0.32\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/pgs/pgs_low/ROI/correlation_TVFC_rate_of_change_SVWP_joint}} + \caption{ + Polygenic risk scores analysis - SVWP estimates. + Mean over 3,775 subjects per cohort for all ROI edges, for three TVFC summary measures. + }\label{fig:ukb-results-pgs-roi-cohort-comparison-full-wp} +\end{figure} + + +%% +\clearpage +\section{Results with other TVFC estimation methods}\label{appendix:more-ukb-results-other-tvfc-methods} +%% + +%% +\clearpage +\subsection{Diagnosed lifetime occurrence - ROI analysis} +%% + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/diagnosed_lifetime_occurrence/cohort_comparison/ROI/correlation_all_TVFC_summary_measures_DCC_joint_edges_of_interest} + \caption{ + Diagnosed depression lifetime occurrence analysis - brain regions of interest - DCC (joint) estimates. + Mean and standard error over 620 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$, **: $p \leq .01$. 
+ }\label{fig:ukb-results-dlo-roi-cohort-comparison-edges-of-interest-dcc-j} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/diagnosed_lifetime_occurrence/cohort_comparison/ROI/correlation_all_TVFC_summary_measures_DCC_bivariate_loop_edges_of_interest} + \caption{ + Diagnosed depression lifetime occurrence analysis - brain regions of interest - DCC (bivariate loop) estimates. + Mean and standard error over 620 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$, **: $p \leq .01$. + }\label{fig:ukb-results-dlo-roi-cohort-comparison-edges-of-interest-dcc-bl} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/diagnosed_lifetime_occurrence/cohort_comparison/ROI/correlation_all_TVFC_summary_measures_SW_cross_validated_edges_of_interest} + \caption{ + Diagnosed depression lifetime occurrence analysis - brain regions of interest - SW-CV estimates. + Mean and standard error over 620 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$, **: $p \leq .01$. + }\label{fig:ukb-results-dlo-roi-cohort-comparison-edges-of-interest-sw-cv} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/diagnosed_lifetime_occurrence/cohort_comparison/ROI/correlation_all_TVFC_summary_measures_SW_30_edges_of_interest} + \caption{ + Diagnosed depression lifetime occurrence analysis - brain regions of interest - SW (30 seconds window) estimates. + Mean and standard error over 620 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$, **: $p \leq .01$. + }\label{fig:ukb-results-dlo-roi-cohort-comparison-edges-of-interest-sw-30} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/diagnosed_lifetime_occurrence/cohort_comparison/ROI/correlation_all_TVFC_summary_measures_SW_60_edges_of_interest} + \caption{ + Diagnosed depression lifetime occurrence analysis - brain regions of interest - SW (60 seconds window) estimates. + Mean and standard error over 620 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$, **: $p \leq .01$. + }\label{fig:ukb-results-dlo-roi-cohort-comparison-edges-of-interest-sw-60} +\end{figure} + + +%% +\clearpage +\subsection{Diagnosed lifetime occurrence - FN analysis} +%% + + +\begin{figure}[h] + \centering + \includegraphics[width=0.7\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/diagnosed_lifetime_occurrence/cohort_comparison/FN/correlation_all_TVFC_summary_measures_DCC_joint_edges_of_interest} + \caption{ + Diagnosed depression lifetime occurrence analysis - functional networks - DCC (joint) estimates. + Mean and standard error over 620 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$, **: $p \leq .01$, ***: $p \leq .001$. + }\label{fig:ukb-results-dlo-fn-cohort-comparison-edges-of-interest-dcc-j} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=0.7\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/diagnosed_lifetime_occurrence/cohort_comparison/FN/correlation_all_TVFC_summary_measures_DCC_bivariate_loop_edges_of_interest} + \caption{ + Diagnosed depression lifetime occurrence analysis - functional networks - DCC (bivariate loop) estimates. 
+ Mean and standard error over 620 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$, **: $p \leq .01$, ***: $p \leq .001$. + }\label{fig:ukb-results-dlo-fn-cohort-comparison-edges-of-interest-dcc-bl} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=0.7\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/diagnosed_lifetime_occurrence/cohort_comparison/FN/correlation_all_TVFC_summary_measures_SW_cross_validated_edges_of_interest} + \caption{ + Diagnosed depression lifetime occurrence analysis - functional networks - SW-CV estimates. + Mean and standard error over 620 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$, **: $p \leq .01$, ***: $p \leq .001$. + }\label{fig:ukb-results-dlo-fn-cohort-comparison-edges-of-interest-sw-cv} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=0.7\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/diagnosed_lifetime_occurrence/cohort_comparison/FN/correlation_all_TVFC_summary_measures_SW_30_edges_of_interest} + \caption{ + Diagnosed depression lifetime occurrence analysis - functional networks - SW (30 seconds window) estimates. + Mean and standard error over 620 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$, **: $p \leq .01$, ***: $p \leq .001$. + }\label{fig:ukb-results-dlo-fn-cohort-comparison-edges-of-interest-sw-30} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=0.7\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/diagnosed_lifetime_occurrence/cohort_comparison/FN/correlation_all_TVFC_summary_measures_SW_60_edges_of_interest} + \caption{ + Diagnosed depression lifetime occurrence analysis - functional networks - SW (60 seconds window) estimates. + Mean and standard error over 620 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$, **: $p \leq .01$, ***: $p \leq .001$. + }\label{fig:ukb-results-dlo-fn-cohort-comparison-edges-of-interest-sw-60} +\end{figure} + + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + + + +%% +\clearpage +\subsection{Self-reported lifetime occurrence - ROI analysis} +%% + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/lifetime_occurrence/cohort_comparison/ROI/correlation_all_TVFC_summary_measures_DCC_joint_edges_of_interest} + \caption{ + Self-reported depression lifetime occurrence analysis - brain regions of interest - DCC (joint) estimates. + Mean and standard error over 808 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$. + }\label{fig:ukb-results-lo-roi-cohort-comparison-edges-of-interest-dcc-j} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/lifetime_occurrence/cohort_comparison/ROI/correlation_all_TVFC_summary_measures_DCC_bivariate_loop_edges_of_interest} + \caption{ + Self-reported depression lifetime occurrence analysis - brain regions of interest - DCC (bivariate loop) estimates. + Mean and standard error over 808 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$. 
+ }\label{fig:ukb-results-lo-roi-cohort-comparison-edges-of-interest-dcc-bl} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/lifetime_occurrence/cohort_comparison/ROI/correlation_all_TVFC_summary_measures_SW_cross_validated_edges_of_interest} + \caption{ + Self-reported depression lifetime occurrence analysis - brain regions of interest - SW-CV estimates. + Mean and standard error over 808 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$. + }\label{fig:ukb-results-lo-roi-cohort-comparison-edges-of-interest-sw-cv} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/lifetime_occurrence/cohort_comparison/ROI/correlation_all_TVFC_summary_measures_SW_30_edges_of_interest} + \caption{ + Self-reported depression lifetime occurrence analysis - brain regions of interest - SW (30 seconds window) estimates. + Mean and standard error over 808 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$. + }\label{fig:ukb-results-lo-roi-cohort-comparison-edges-of-interest-sw-30} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/lifetime_occurrence/cohort_comparison/ROI/correlation_all_TVFC_summary_measures_SW_60_edges_of_interest} + \caption{ + Self-reported depression lifetime occurrence analysis - brain regions of interest - SW (60 seconds window) estimates. + Mean and standard error over 808 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$. + }\label{fig:ukb-results-lo-roi-cohort-comparison-edges-of-interest-sw-60} +\end{figure} + + +%% +\clearpage +\subsection{Self-reported lifetime occurrence - FN analysis} +%% + + +\begin{figure}[h] + \centering + \includegraphics[width=0.7\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/lifetime_occurrence/cohort_comparison/FN/correlation_all_TVFC_summary_measures_DCC_joint_edges_of_interest} + \caption{ + Self-reported depression lifetime occurrence analysis - functional networks - DCC (joint) estimates. + Mean and standard error over 808 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$. + }\label{fig:ukb-results-lo-fn-cohort-comparison-edges-of-interest-dcc-j} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=0.7\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/lifetime_occurrence/cohort_comparison/FN/correlation_all_TVFC_summary_measures_DCC_bivariate_loop_edges_of_interest} + \caption{ + Self-reported depression lifetime occurrence analysis - functional networks - DCC (bivariate loop) estimates. + Mean and standard error over 808 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$. + }\label{fig:ukb-results-lo-fn-cohort-comparison-edges-of-interest-dcc-bl} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=0.7\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/lifetime_occurrence/cohort_comparison/FN/correlation_all_TVFC_summary_measures_SW_cross_validated_edges_of_interest} + \caption{ + Self-reported depression lifetime occurrence analysis - functional networks - SW-CV estimates. + Mean and standard error over 808 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$. 
+ }\label{fig:ukb-results-lo-fn-cohort-comparison-edges-of-interest-sw-cv} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=0.7\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/lifetime_occurrence/cohort_comparison/FN/correlation_all_TVFC_summary_measures_SW_30_edges_of_interest} + \caption{ + Self-reported depression lifetime occurrence analysis - functional networks - SW (30 seconds window) estimates. + Mean and standard error over 808 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$. + }\label{fig:ukb-results-lo-fn-cohort-comparison-edges-of-interest-sw-30} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=0.7\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/lifetime_occurrence/cohort_comparison/FN/correlation_all_TVFC_summary_measures_SW_60_edges_of_interest} + \caption{ + Self-reported depression lifetime occurrence analysis - functional networks - SW (60 seconds window) estimates. + Mean and standard error over 808 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$. + }\label{fig:ukb-results-lo-fn-cohort-comparison-edges-of-interest-sw-60} +\end{figure} + + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + + + +%% +\clearpage +\subsection{Self-reported depressed state - ROI analysis} +%% + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/self_reported_depression_state/cohort_comparison/ROI/correlation_all_TVFC_summary_measures_DCC_joint_edges_of_interest} + \caption{ + Self-reported depressed state analysis - brain regions of interest - DCC (joint) estimates. + Mean and standard error over 1,411 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$, **: $p \leq .01$, ***: $p \leq .001$. + }\label{fig:ukb-results-srds-roi-cohort-comparison-edges-of-interest-dcc-j} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/self_reported_depression_state/cohort_comparison/ROI/correlation_all_TVFC_summary_measures_DCC_bivariate_loop_edges_of_interest} + \caption{ + Self-reported depressed state analysis - brain regions of interest - DCC (bivariate loop) estimates. + Mean and standard error over 1,411 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$, **: $p \leq .01$, ***: $p \leq .001$. + }\label{fig:ukb-results-srds-roi-cohort-comparison-edges-of-interest-dcc-bl} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/self_reported_depression_state/cohort_comparison/ROI/correlation_all_TVFC_summary_measures_SW_cross_validated_edges_of_interest} + \caption{ + Self-reported depressed state analysis - brain regions of interest - SW-CV estimates. + Mean and standard error over 1,411 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$, **: $p \leq .01$, ***: $p \leq .001$. 
+ }\label{fig:ukb-results-srds-roi-cohort-comparison-edges-of-interest-sw-cv} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/self_reported_depression_state/cohort_comparison/ROI/correlation_all_TVFC_summary_measures_SW_30_edges_of_interest} + \caption{ + Self-reported depressed state analysis - brain regions of interest - SW (30 seconds window) estimates. + Mean and standard error over 1,411 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$, **: $p \leq .01$, ***: $p \leq .001$. + }\label{fig:ukb-results-srds-roi-cohort-comparison-edges-of-interest-sw-30} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/self_reported_depression_state/cohort_comparison/ROI/correlation_all_TVFC_summary_measures_SW_60_edges_of_interest} + \caption{ + Self-reported depressed state analysis - brain regions of interest - SW (60 seconds window) estimates. + Mean and standard error over 1,411 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$, **: $p \leq .01$, ***: $p \leq .001$. + }\label{fig:ukb-results-srds-roi-cohort-comparison-edges-of-interest-sw-60} +\end{figure} + + +%% +\clearpage +\subsection{Self-reported depressed state - FN analysis} +%% + + +\begin{figure}[h] + \centering + \includegraphics[width=0.7\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/self_reported_depression_state/cohort_comparison/FN/correlation_all_TVFC_summary_measures_DCC_joint_edges_of_interest} + \caption{ + Self-reported depressed state analysis - functional networks - DCC (joint) estimates. + Mean and standard error over 1,411 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$, **: $p \leq .01$, ***: $p \leq .001$. + }\label{fig:ukb-results-srds-fn-cohort-comparison-edges-of-interest-dcc-j} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=0.7\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/self_reported_depression_state/cohort_comparison/FN/correlation_all_TVFC_summary_measures_DCC_bivariate_loop_edges_of_interest} + \caption{ + Self-reported depressed state analysis - functional networks - DCC (bivariate loop) estimates. + Mean and standard error over 1,411 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$, **: $p \leq .01$, ***: $p \leq .001$. + }\label{fig:ukb-results-srds-fn-cohort-comparison-edges-of-interest-dcc-bl} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=0.7\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/self_reported_depression_state/cohort_comparison/FN/correlation_all_TVFC_summary_measures_SW_cross_validated_edges_of_interest} + \caption{ + Self-reported depressed state analysis - functional networks - SW-CV estimates. + Mean and standard error over 1,411 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$, **: $p \leq .01$, ***: $p \leq .001$. + }\label{fig:ukb-results-srds-fn-cohort-comparison-edges-of-interest-sw-cv} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=0.7\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/self_reported_depression_state/cohort_comparison/FN/correlation_all_TVFC_summary_measures_SW_30_edges_of_interest} + \caption{ + Self-reported depressed state analysis - functional networks - SW (30 seconds window) estimates. 
+ Mean and standard error over 1,411 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$, **: $p \leq .01$, ***: $p \leq .001$. + }\label{fig:ukb-results-srds-fn-cohort-comparison-edges-of-interest-sw-30} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=0.7\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/self_reported_depression_state/cohort_comparison/FN/correlation_all_TVFC_summary_measures_SW_60_edges_of_interest} + \caption{ + Self-reported depressed state analysis - functional networks - SW (60 seconds window) estimates. + Mean and standard error over 1,411 subjects per cohort for edges of interest for three TVFC summary measures. + *: $p \leq .05$, **: $p \leq .01$, ***: $p \leq .001$. + }\label{fig:ukb-results-srds-fn-cohort-comparison-edges-of-interest-sw-60} +\end{figure} + + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + + + +%% +\clearpage +\subsection{Polygenic risk scores - ROI analysis} +%% + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/pgs/cohort_comparison/ROI/correlation_all_TVFC_summary_measures_DCC_joint_edges_of_interest} + \caption{ + Polygenic risk scores analysis - brain regions of interest - DCC (joint) estimates. + Mean and standard error over 3,775 subjects per cohort for edges of interest for three TVFC summary measures. + }\label{fig:ukb-results-pgs-roi-cohort-comparison-edges-of-interest-dcc-j} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/pgs/cohort_comparison/ROI/correlation_all_TVFC_summary_measures_DCC_bivariate_loop_edges_of_interest} + \caption{ + Polygenic risk scores analysis - brain regions of interest - DCC (bivariate loop) estimates. + Mean and standard error over 3,775 subjects per cohort for edges of interest for three TVFC summary measures. + }\label{fig:ukb-results-pgs-roi-cohort-comparison-edges-of-interest-dcc-bl} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/pgs/cohort_comparison/ROI/correlation_all_TVFC_summary_measures_SW_cross_validated_edges_of_interest} + \caption{ + Polygenic risk scores analysis - brain regions of interest - SW-CV estimates. + Mean and standard error over 3,775 subjects per cohort for edges of interest for three TVFC summary measures. + }\label{fig:ukb-results-pgs-roi-cohort-comparison-edges-of-interest-sw-cv} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/pgs/cohort_comparison/ROI/correlation_all_TVFC_summary_measures_SW_30_edges_of_interest} + \caption{ + Polygenic risk scores analysis - brain regions of interest - SW (30 seconds window) estimates. + Mean and standard error over 3,775 subjects per cohort for edges of interest for three TVFC summary measures. 
+ }\label{fig:ukb-results-pgs-roi-cohort-comparison-edges-of-interest-sw-30} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/pgs/cohort_comparison/ROI/correlation_all_TVFC_summary_measures_SW_60_edges_of_interest} + \caption{ + Polygenic risk scores analysis - brain regions of interest - SW (60 seconds window) estimates. + Mean and standard error over 3,775 subjects per cohort for edges of interest for three TVFC summary measures. + }\label{fig:ukb-results-pgs-roi-cohort-comparison-edges-of-interest-sw-60} +\end{figure} + + +%% +\clearpage +\subsection{Polygenic risk scores - FN analysis} +%% + + +\begin{figure}[h] + \centering + \includegraphics[width=0.7\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/pgs/cohort_comparison/FN/correlation_all_TVFC_summary_measures_DCC_joint_edges_of_interest} + \caption{ + Polygenic risk scores analysis - functional networks - DCC (joint) estimates. + Mean and standard error over 3,775 subjects per cohort for edges of interest for three TVFC summary measures. + }\label{fig:ukb-results-pgs-fn-cohort-comparison-edges-of-interest-dcc-j} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=0.7\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/pgs/cohort_comparison/FN/correlation_all_TVFC_summary_measures_DCC_bivariate_loop_edges_of_interest} + \caption{ + Polygenic risk scores analysis - functional networks - DCC (bivariate loop) estimates. + Mean and standard error over 3,775 subjects per cohort for edges of interest for three TVFC summary measures. + }\label{fig:ukb-results-pgs-fn-cohort-comparison-edges-of-interest-dcc-bl} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=0.7\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/pgs/cohort_comparison/FN/correlation_all_TVFC_summary_measures_SW_cross_validated_edges_of_interest} + \caption{ + Polygenic risk scores analysis - functional networks - SW-CV estimates. + Mean and standard error over 3,775 subjects per cohort for edges of interest for three TVFC summary measures. + }\label{fig:ukb-results-pgs-fn-cohort-comparison-edges-of-interest-sw-cv} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=0.7\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/pgs/cohort_comparison/FN/correlation_all_TVFC_summary_measures_SW_30_edges_of_interest} + \caption{ + Polygenic risk scores analysis - functional networks - SW (30 seconds window) estimates. + Mean and standard error over 3,775 subjects per cohort for edges of interest for three TVFC summary measures. + }\label{fig:ukb-results-pgs-fn-cohort-comparison-edges-of-interest-sw-30} +\end{figure} + + +\begin{figure}[h] + \centering + \includegraphics[width=0.7\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/pgs/cohort_comparison/FN/correlation_all_TVFC_summary_measures_SW_60_edges_of_interest} + \caption{ + Polygenic risk scores analysis - functional networks - SW (60 seconds window) estimates. + Mean and standard error over 3,775 subjects per cohort for edges of interest for three TVFC summary measures. 
+ }\label{fig:ukb-results-pgs-fn-cohort-comparison-edges-of-interest-sw-60}
+\end{figure}
diff --git a/ch/1_Introduction/0_Introduction.tex b/ch/1_Introduction/0_Introduction.tex
index 9b17338..fb59b6e 100644
--- a/ch/1_Introduction/0_Introduction.tex
+++ b/ch/1_Introduction/0_Introduction.tex
@@ -1,37 +1,41 @@
-\chapter{Introduction}
-\label{ch:introduction}
+\chapter{Introduction}\label{ch:introduction}
 %%%%%
 \info[inline]{Paragraph: Direct summary of what this thesis is about.}
-In this thesis we introduce novel approaches for robust estimation of \gls{tvfc} from \gls{fmri} neuroimaging data.
+This thesis introduces novel approaches for robust estimation of \gls{tvfc} from \gls{fmri} neuroimaging data.
 \Gls{tvfc} is a construct that studies the time-varying nature of interaction between brain regions.
-We compare and evaluate estimation approaches to existing ones through a proposed benchmarking framework.
+Estimation approaches are compared against existing ones and evaluated through a proposed benchmarking framework.
 %
 Afterwards, we use the best performing method to investigate how brain dynamics differ between depressed and healthy (control) subjects.
 We argue that the construct of \gls{tvfc} is particularly valuable in the study of \gls{mdd}, as this condition can be considered a functional or \emph{connectivity} disorder.
+That is, depression is believed to be associated with dysfunction in connectivity patterns between brain regions, rather than with dysfunction in a single region.
 %
-The rest of this introductory chapter will be used to \emph{frame} our work within the larger study of the brain, as well as to discuss relevant concepts related to \gls{fc}, motivate why robust estimation thereof is so important, and discuss the current state of knowledge on how this construct relates to depression and related mood disorders.
+The rest of this introductory chapter will be used to \emph{frame} our work within the larger study of the brain.
+It also discusses relevant concepts related to \gls{fc}, motivates why robust estimation thereof is so important, and reviews the current state of knowledge on how this construct relates to depression and related mood disorders.
 \info[inline]{Paragraph: Start from the top; describe broad landscape of neuroscientific research, and why studying the human brain can be overwhelming and requires a certain viewpoint.}
-Faced with the overwhelming prospect of studying the brain, we argue that it is unavoidable to limit the scope of any such investigation.
-We should pick a certain angle to approach the brain with, informed by the scientific question(s) at hand.
+Faced with the overwhelming prospect of studying the brain, it is unavoidable to limit the scope of any such investigation.
+We should pick a certain angle to approach the brain with, informed by the scientific question(s) at hand.\footnote{In fact, without constraining our hypothesis space, the conclusions that can be drawn from data are infinitely flexible~\parencite[see also][]{Gershman2021}.}
 In doing so, we make implicit (and sometimes explicit) assumptions.
-Each perspective comes with advantages and disadvantages, and imposes a limit on what we can and cannot learn.
+Each perspective has its advantages and disadvantages, and imposes a limit on what we can and cannot learn.
 To start our journey, we provide a brief overview here of the entire neuroscientific landscape.
-Throughout the remainder of this chapter we will then gradually narrow down our focus to provide the context for the experimental chapters (\cref{ch:benchmarking,ch:ukb}), and describe the particular lens through which we study the brain. -In \cref{ch:discussion} we will zoom out again and reflect on the limitations of our particular view of the brain and its inherent assumptions. +Throughout the remainder of this chapter we then gradually narrow down our focus to provide the context for the experimental \cref{ch:benchmarking,ch:ukb}, and describe the particular lens through which we study the brain. +In \cref{ch:discussion} we will zoom out again and reflect on the limitations of this view of the brain and its inherent assumptions. -Studying brains is hard due to the highly interdisciplinary nature of it. -The various levels of analysis involved have meant that no single scientist is able to study the full brain and all of its characteristics, properties, and interactions.\footnote{\textcite{Marr1982} famously talked about three levels of analysis: computational, algorithmic, and implementational. In fact, \textcite{Marr1976} already discussed similar ideas. More recently \textcite{Poggio2012} followed up and proposed another two levels; learning and evolution.} -Starting from the building blocks, some researchers study ion channels and chemical interactions within individual neurons, some look at small circuits of neurons, some look at whole-brain structures, some investigate emergent brain waves, and some try to model human brain function with animal or computational models and speculate how these relate to human brain function (sometimes referred to as comparative cognition). -Neuroscientists have considered for a long time that the brain is modular in its organization, and many would specialize in a particular brain \gls{roi} to study its function and relationship to other regions~\parencite{Prinz2006}. -Even though the view of the brain as a collection of entirely distinct modules has been abandoned, this tradition is ongoing to some degree, and is often still (approximately) valid~\parencite{Genon2018}. -Nowadays many argue for a more sophisticated view of \emph{hierarchy} in such modularity as well. +The study of the brain is hard due to its highly interdisciplinary nature. +The various levels of analysis involved have meant that no single scientist is able to study the entire brain and all its characteristics, properties, and interactions.\footnote{\textcite{Marr1982} famously talked about three levels of analysis: computational, algorithmic, and implementational. In fact, \textcite{Marr1976} already discussed similar ideas. More recently \textcite{Poggio2012} followed up and proposed another two levels; learning and evolution.} +% +Starting from the fundamental building blocks, some researchers study ion channels and chemical interactions within individual neurons and glial cells. +Some look at small circuits of neurons, some look at whole-brain structures, some investigate emergent brain waves, and some try to model human brain function with animal or computational models and speculate how these relate to human brain function (sometimes referred to as comparative cognition). +Neuroscientists have long considered that the brain is modular in its organization~\parencite{Prinz2006}. +Many would therefore specialize in a particular brain \gls{roi} to study its function and relationship to other regions. 
+Even though the view of the brain as a collection of entirely distinct modules has been abandoned, this tradition persists to some degree, and is often still (approximately) valid~\parencite{Genon2018}.
+Moreover, contemporary neuroscientists have argued for a more sophisticated view of \emph{hierarchy} in such modularity.
 %
 The division between psychology and neuroscience has blurred and merged in recent years.
 Investigators looking solely at behavior have started to be called neuroscientists as well~\parencite{Niv2021}.
 %
-Different time scales and life stages allow for various types of analysis as well.
+Aside from spatial characteristics, different time scales and life stages allow for various types of analysis.
 Whereas developmental neuroscientists study how the brain develops during pregnancy and childhood, others study the opposite side of the spectrum: how the brain slowly degenerates toward the end of someone's life.
 %
 Then there are scientists that have proposed neuroscientific ``theories of everything'', in hopeful analogy to the field of physics, trying to link up and synthesize all approaches into a unified account of brain function.
@@ -47,50 +51,53 @@ \chapter{Introduction}
 Scientific insights about the mind and mental disease are often misunderstood or misrepresented.
 Ideology tends to get in the way more so than in other scientific fields, as this topic is close to home and often relates to people's most personal experiences.
 %
-Obviously, all of the mentioned scientific approaches and angles are valuable and complementary.
-We live in a great time to study the brain, as improved tools and large data sets are becoming available at unprecedented scale, and advances in computing power and modeling open up new venues of inquiry~\parencite{Griffiths2015, Bzdok2017, Rutledge2019, Guest2021}.
+Obviously, all the mentioned scientific approaches and angles are valuable and even complementary.
+On a positive note, we live in an exciting time to study the brain.
+Improved tools and large data sets are becoming available at unprecedented scale, and advances in computing power and modeling facilitate new avenues of inquiry~\parencite{Griffiths2015, Bzdok2017, Rutledge2019, Guest2021}.
 %
-At the same time, the uncertain, overwhelming, and high-pressure environment that psychologists and neuroscientists operate within has led to questionnable research practices~\parencite{John2012} and many findings not replicating.
-This `reproducibility crisis' started gaining traction about a decade ago, and as a field we are still working through its ramifications.
+At the same time, the uncertain, overwhelming, and high-pressure environment that psychologists and neuroscientists operate within has led to questionable research practices~\parencite{John2012} and many findings failing to replicate.
+This `reproducibility crisis' started gaining traction about a decade ago, and the field is still working through its ramifications.
 \info[inline]{Paragraph: Explain our point-of-view of the human brain.}
-Our particular point-of-view of the human brain in this work is the following.
-We abstract away implementational level details, and consider the human brain as a complex, dynamic system organ that is divided into distinct yet interacting regions.\footnote{Brain regions (or `nodes') can be defined in several ways, which we will address later.}
-This means we will not go into depth on any neurobiological signatures of behavior and disorders.
+The particular point-of-view of the human brain in this work is the following.
+We abstract away implementational level details, and consider the human brain as a complex, dynamic system: an organ divided into distinct yet interacting regions.\footnote{Brain regions and components (or `nodes', borrowing from graph theoretic jargon) can be defined in several ways, which will be addressed later.}
+We will not go into depth on any neurobiological signatures of behavior and disorders.
 %
-Our characterization is based on a historical trend.
+This characterization is based on a historical trend.
 Early modern neuroscientists discovered that the brain can be \emph{segregated} into distinct cortical (and subcortical) regions with distinct functions.
 Naturally, it followed that neuroscientists became interested in how these regions connect and communicate with one another.
 This is often referred to as the `functional' architecture of the brain, in contrast with the better understood \emph{structural} or \emph{anatomical} brain architecture.
 Higher-order cognition and complex behavior is made possible by the spatiotemporal integration, re-organization, and segregation of brain regions~\parencite{Deco2011}.
-Over the years a plethora of studies has painted a picture of the brain as a complex network of anatomical and functional segregation and integration that adapts and re-organizes itself to address a given task.
-Brain nodes also exhibit complex system structures, such as modular and hierarchical topology~\parencite{Meunier2009, Deco2015}.
+Over the years, a plethora of studies has painted a picture of the brain as a complex, distributed, and adaptive network of anatomical and functional segregation and integration, one that re-organizes itself at different time scales to process information and address a given task.
+The brain's constituent parts also exhibit complex-system structure, such as modular and hierarchical topology~\parencite{Meunier2009, Deco2015}.
+Human brains have been found to rely on higher degrees of synergistic interactions than nonhuman primate brains, highlighting the importance of such interactions for complex cognition~\parencite{Luppi2022}.
 %
-Anatomical organization and connectivity in the brain has been studied in depth, by a range of imaging methods, including \gls{mri}, which uses powerful magnetic fields and radio waves to produce high resolution images of the inside of the body, and \gls{dti}, which images axons in white matter tracts.
+Anatomical organization and connectivity in the brain have been studied by a range of imaging methods, including \gls{mri}, which uses powerful magnetic fields and radio waves to produce high resolution images of the inside of the body, and \gls{dti}, which images axons in white matter tracts.
 Functional interactions, often referred to as `connectivity' too, can be characterized using neuroimaging methods such as \gls{fmri}~\parencite{Soares2016}, \gls{pet}, which requires an injection of positron emitting isotopes, \gls{nirs}, \gls{eeg}, and \gls{meg}~\parencite{Rossini2019}.
 Crucially, such connectivity is usually not directly observed, and needs to be estimated.
 \info[inline]{Paragraph: Discuss confusing terminology of `connectivity'.}
 The term `connectivity' here may be confusing, as there need not be any direct anatomical connection between two regions for them to be functionally coupled.
-Whereas \emph{connectomes} originally refered to maps of physical connections, and have been constructed for e.g.~fruitflies but not humans (yet), `functional' connectomes refer to functional interactions (which can be defined in many ways). +Whereas \emph{connectomes} originally referred to maps of physical connections, and have been constructed for e.g.~fruit flies but not humans (yet), `functional' connectomes refer to functional interactions (which can be defined in many ways). In the general sense, the term `connectome' has come to refer to any kind of wiring diagram of a brain~\parencite{Sporns2005}. \info[inline]{Paragraph: Introduce the study of brain disorders as a subset of neuroscience.} Many (if not \emph{most}) neuroscientists are motivated to study the brain to understand brain abnormalities and disorders, in the hope that better understanding will lead to better treatment. -And most research funding goes to those diseases that are most common and whose disease burden on society is largest, such as \gls{mdd}, \gls{ad}, Parkinson's. +And most research funding goes to those diseases that are most common and whose disease burden on society is largest, such as \gls{mdd}, \gls{ad}, and \gls{pd}. As such, many neuroscientists directly study how neural systems are disrupted in psychiatric and neurological disease. +This requires a sufficiently large collection of healthy brains and disrupted ones. % -In this thesis we focus on \gls{mdd}. +This thesis is focused on \gls{mdd}. This condition is often linked to changes in higher-level and complex affective and cognitive processing, instead of single brain region malfunction or biological pathology. Therefore, it is especially suited to be studied through the lens of \gls{fc} and the dynamic nature thereof. By studying how neural systems such as \glspl{fn} are affected by psychiatric illness, we gain a better understanding of these diseases. -However, while we aim to \emph{understand} \gls{mdd} as a disrupted neural system, we do not necessarily propose that \emph{treatment} should solely focus on the brain and/or pharmacological interventions. +However, while the aim here is to \emph{understand} \gls{mdd} as a disrupted neural system, we do not necessarily propose that \emph{treatment} should solely focus on the brain and/or pharmacological interventions. \info[inline]{Paragraph: Outline remainder of Introduction chapter.} -In this thesis, we study how we can (or \emph{should}) estimate such \gls{fc} as it evolves over time. -As we will see, this is a non-trivial problem, and careless estimation procedures can greatly distort downstream scientific findings. -In the remainder of this introductory chapter we shall discuss the key concepts and constructs studied. +This thesis studies how we can (or \emph{should}) estimate such \gls{fc} as it evolves over time. +As we will see, this is a non-trivial problem, and careless estimation procedures can distort downstream scientific findings. +In the remainder of this introductory chapter the key concepts and constructs studied will be discussed. We also expand on why estimating \gls{tvfc} is a hard problem, and how to address this. Furthermore, we briefly touch upon the link between \gls{fc} and neurological and psychiatric disorders. We primarily focus on depression, as it is the condition studied in \cref{ch:ukb}, but note that studies on related or adjacent disorders may also be relevant. -At the end of this chapter we provide an outline of the entire thesis. 
+At the end of this chapter an outline of the entire thesis is provided. diff --git a/ch/1_Introduction/1_Functional_connectivity.tex b/ch/1_Introduction/1_Functional_connectivity.tex index 7fb22dd..9f5c2d8 100644 --- a/ch/1_Introduction/1_Functional_connectivity.tex +++ b/ch/1_Introduction/1_Functional_connectivity.tex @@ -4,79 +4,83 @@ \section{Functional connectivity} \info[inline]{Paragraph: Define concept of functional connectivity.} Functional connectivity refers to the functional interplay, \emph{interaction}, \emph{coupling}, \emph{synchrony}, or \emph{co-activation} between brain regions (e.g.~voxels, parcels, and/or \gls{ica} components). -Its study has quickly become a cornerstone and key focus of modern neuroimaging research. -Such connectivity depends on statistical dependencies between activity or \emph{activation} in brain regions, which can be measured by various neuroimaging modalities, most commonly \gls{fmri}, \gls{eeg}~\parencite[e.g.][]{Tagliazucchi2012, Chang2013}, and \gls{meg}~\parencite[e.g.][]{Baker2014, Vidaurre2018}.\footnote{As we will see, modeling dependencies between random variables is an important problem in machine learning and statistics as well.} +Its study, mapping out the intrinsic organization of the brain, has quickly become a cornerstone and key focus of modern neuroimaging research. +Such connectivity depends on statistical dependencies between activity or \emph{activation} in brain regions, which can be measured by various neuroimaging modalities, most commonly \gls{fmri}, \gls{eeg}~\parencite[e.g.][]{Tagliazucchi2012, Chang2013}, and \gls{meg}~\parencite[e.g.][]{Baker2014, Vidaurre2018}.\footnote{As discussed later, modeling dependencies between random variables is also an important problem in machine learning and statistics.} Sometimes multiple concurrent modalities are used, such as \gls{fmri} and \gls{eeg} (but it is not possible to combine \gls{fmri} with \gls{meg}). % -Structural and \gls{fc} analyses are complementary in building a holistic understanding of the brain, as they capture complementary (and disparate) information~\parencite{Lang2012}. -Importantly, \gls{fc} analyses do not makes any statements about causality~\parencite{Mehler2018}, in contrast with \emph{effective} connectivity~\parencite{Friston2011, Smith2012b, Park2018, Zeidman2019, Zarghami2020}.\footnote{In neuroscience, causality can confusingly refer to different concepts~\parencite[see][]{Barack2022}.} +Neuroimaging methods can be divided into those that capture \emph{structural} and those that capture \emph{functional} signals. +The former aims to map out the location and molecular properties of brain tissue. +The latter aims to capture more dynamic signals, related to the activity of brain tissue. +Structural and functional (including \gls{fc}) analyses are complementary in building a holistic understanding of the brain, as they capture complementary (and disparate) information~\parencite{Lang2012}. +Importantly, \gls{fc} analyses do not make any statements about causality~\parencite{Mehler2018}, in contrast with \emph{effective} connectivity~\parencite{Friston2011, Smith2012b, Park2018, Zeidman2019, Zarghami2020}.\footnote{In neuroscience, causality can confusingly refer to different concepts~\parencite[see][]{Barack2022}.} % -In this thesis, we limit our focus to \gls{fmri} brain scans.
-This modality has proven to be a valuable method for studying \gls{fc}, due to its relatively high spatial resolution and its widespread availability in clinics around the world (these scanners are often used to study normal brain function by psychologists). +This thesis is limited to \gls{fmri} brain scans. +This modality has proven to be a valuable method for studying \gls{fc}, due to its high spatial resolution and its widespread availability in clinics around the world (these scanners are often used to study normal brain function by psychologists). \info[inline]{Paragraph: Introduce fMRI and BOLD signal and its limitations.} \gls{fmri} is a non-invasive and safe neuroimaging scanning method that infers brain activity based on measured changes in blood flow, pioneered in the 1990s by Seiji Ogawa and Kenneth Kwong and colleagues~\parencite[see][for a historical perspective]{Raichle1998}.\footnote{Even though it is one of the newest methods available, the idea that blood flow is related to neural activity dates back to the 19th century.} % Neurons that use more oxygen will cause surrounding blood vessels to dilate, causing a local increase in blood flow. -This is necessary because these cells have no internal reserves of required glucose and oxygen. +This is necessary because such cells have no internal reserves of required glucose and oxygen. \gls{fmri} measures the relative oxygenation of blood flow, where brain regions that are more active will see a spike in activity as glucose is converted into electrical energy required for both action potentials and managing continuous membrane potentials. \gls{mri} uses a property of atoms, where hydrogen protons in water align with each other when placed in a strong magnetic field (typically~1.5,~3,~or~7~Tesla) in its direction. -A radiofrequency pulse then knocks protons off this alignment, and the subsequent re-aligning has characteristics that can be used to determine what tissue or blood composition the proton is in. +A radio frequency pulse then knocks protons off this alignment, and the subsequent re-aligning has characteristics that can be used to determine what tissue or blood composition the proton is in. This signal depends on the surroundings of the hydrogen nuclei, thus allowing for differentiation between grey matter, white matter, and \gls{csf}. % -Increased neuronal activity leads to higher demand for oxygen, which is carried by haemoglobin in red blood cells. -This haemoglobin has different magnetic properties when oxygenated or not (diamagnetic vs.~paramagnetic), which has a (small) impact on the signal we measure. +Increased neuronal activity leads to higher demand for oxygen, which is carried by hemoglobin in red blood cells. +This hemoglobin has different magnetic properties when oxygenated or not (diamagnetic vs.~paramagnetic), which has a (small) impact on the measured signal. This is what \gls{fmri} picks up on, and this signal is known as the \gls{bold} signal. % It is important to stay mindful of the fact that it is not fully understood yet what we are actually measuring~\parencite{Logothetis2004, Cole2010}. -In fact, the use of \gls{fmri} and \gls{fc} research are sometimes considered controversial~\parencite{Mehler2018}. -If we want insights from \gls{fmri} scans to be useful in practice, it is important to get robust and reproducible results. +In fact, the use of \gls{fmri} and \gls{fc} research in particular are sometimes considered controversial~\parencite{Mehler2018}. 
+If insights from \gls{fmri} scans are to translate into practice, it is important to get robust and reproducible results. In general, \gls{fmri} data has come under scrutiny regarding the reproducibility of results. A lot has been written about the topic in recent years~\parencite[see e.g.][]{Kriegeskorte2009, Gilmore2017, Poldrack2017, Botvinik-Nezer2020, Lindquist2020, Elliott2021, Aquino2022}. These efforts can be considered as part of a larger reproducibility movement in psychology and other scientific fields. -A lot of progress has been made in this regard in recent years. +On the other hand, a lot of progress has been made in this regard in recent years. Many tools have become available to help researchers to organize and publish their work in a more transparent fashion~\parencite{Marcus2011, Kumar2022, Niso2022}. -Generally, it is important to know what we can and cannot do with the data~\parencite{Logothetis2008}. -We will return to this topic in \cref{ch:discussion} as well. +Generally, it is important to know what we can and cannot do with this class of data~\parencite{Logothetis2008}; see also \cref{ch:discussion}. % One of the key factors is that the blood flow dynamics are not fully understood yet. -For example, while naively one would assume blood oxygenation levels to drop when surrounding neurons become active, this is only part of it. -In response to a decrease in oxygenation (the initial `dip'), the haemodynamic response in fact \emph{increases} blood flow, \emph{overcompensating} for the increased demand of oxygen. +For example, while naively one would assume blood oxygenation levels to drop when surrounding neurons become active, this is only part of the full story. +In response to a decrease in oxygenation (the initial `dip'), the hemodynamic response in fact \emph{increases} blood flow, \emph{overcompensating} for the increased demand of oxygen. This complex function of neural activity, blood flow, and oxygenation is called the \gls{hrf}. +It is not fully understood yet, and has been shown to vary across brain regions~\parencite{Handwerker2004}. % Lastly, \gls{fmri} experiments can broadly be divided into those with a certain external stimulus paradigm or set of instructions and those that are unconstrained. These are referred to as \gls{tb-fmri} and \gls{rs-fmri}. -% -This \gls{bold} signal (a.k.a.~response or observation) constitutes the node time series we will study in this thesis (see e.g.~\cref{fig:rockland-time-series-mean-over-subjects,fig:ukb-example-time-series,fig:ukb-fn-example-time-series}). + +The \gls{bold} signal (a.k.a.~response or observation) constitutes the brain region characteristic time series studied in this thesis (see e.g.~\cref{fig:rockland-time-series-mean-over-subjects,fig:ukb-example-time-series,fig:ukb-fn-example-time-series}). More concretely, an \gls{fmri} scan essentially returns a video of \emph{voxels} (3D pixels). -The human brain contains hundreds of thousands of voxels (depending on voxel dimensions). +The entire human brain typically encompasses hundreds of thousands of voxels (depending on voxel dimensions). Each voxel is characterized by an activity time series, but can be very noisy. -It is common to characterize the activity of \glspl{roi} instead of working with raw voxels~\parencite{Korhonen2017}, by either averaging or taking the first eigenvariate of all voxels enclosed in it (using some parcellation atlas).
-Throughout this thesis, we shall denote these node time series as $\mathbf{y}_n \in \mathbb{R}^D$ for $n = 1, 2, \ldots , N$, where $D$ is the number of node time series (i.e.~brain regions) and $N$ is the number of observations across time. +It is therefore common to characterize the activity of \glspl{roi} instead of working with raw voxels~\parencite{Korhonen2017}, by either averaging or taking the first eigenvariate of all voxels enclosed in it (using some parcellation atlas). +Throughout this thesis, these time series shall be denoted as $\mathbf{y}_n \in \mathbb{R}^D$ for $n = 1, 2, \ldots , N$, where $D$ is the number of time series (i.e.~brain regions) and $N$ is the number of observations across time. -Preprocessing pipelines, which take raw \gls{fmri} data and output node time series of interest, greatly impact study results and conclusions~\parencite{Caballero-Gaudes2017}. -It is important to remember that preprocessing steps heavily influence subsequently extracted \gls{fc}~\parencite{Aquino2022}. -An example of an impactful preprocessing step is \gls{gsr}, which can improve spatial localisation of networks, but can introduce artificial anticorrelations into the data~\parencite{Murphy2009}. +Preprocessing pipelines, which take raw \gls{fmri} data and output node time series of interest, impact study results and conclusions~\parencite{Caballero-Gaudes2017}. +As such, preprocessing steps also heavily influence subsequently extracted \gls{fc}~\parencite{Aquino2022}. +An example of an impactful preprocessing step is \gls{gsr}, which can improve spatial localization of networks, but can introduce artificial anticorrelations into the data~\parencite{Murphy2009}. \info[inline]{Paragraph: Describe fMRI functional connectivity.} -In the context of \gls{fmri}, `connectivity' is typically characterized as the Pearson correlation coefficient between brain region \gls{bold} measurement component time series~\parencite{Zalesky2012}. -Note that this is a time-domain, non-directed quantitative measure of connectivity that assumes a linear relationship between node time series. -There are other types of connectivity, including covariance, (directed) coherence, partial coherence, mutual information~\parencite[see e.g.][chapter 2]{Cover2005}, Granger causality, transfer entropy, and many frequency-domain measures~\parencite[see][for reviews]{Wang2014, VanDiessen2015, Bastos2016, Foti2019}. +In the context of \gls{fmri}, `connectivity' is typically (i.e.~traditionally) characterized as the Pearson correlation coefficient between brain region \gls{bold} measurement component time series~\parencite{Zalesky2012}. +This is a time-domain, non-directed quantitative measure of connectivity that assumes a linear relationship between brain region time series. +There are other types of connectivity, including covariance, (directed) coherence, partial correlation, partial coherence, mutual information~\parencite[see e.g.][chapter 2]{Cover2005}, Granger causality, transfer entropy, and many frequency-domain measures~\parencite[see][for reviews]{Wang2014, VanDiessen2015, Bastos2016, Foti2019}. Some of these are directed whereas others are undirected. Choosing a connectivity measure constitutes a researcher degree of freedom~\parencite{Gelman2013} and each measure will extract different information from the raw signals. % -Covariance (and correlation) is perhaps the simplest measure of dependency. -This may explain its popularity. 
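To make the notation above concrete, the following is a minimal NumPy sketch of how node time series and a Pearson-correlation FC matrix could be obtained from voxel data. It is an illustration only (not the preprocessing or analysis code accompanying this thesis), and the voxel count, parcel count, and synthetic data are arbitrary.

\begin{verbatim}
import numpy as np

# Illustrative dimensions only: V voxels, N volumes (time points), D parcels.
rng = np.random.default_rng(0)
V, N, D = 5000, 200, 15

voxel_ts = rng.standard_normal((V, N))      # voxel-wise BOLD time series
parcel_labels = rng.integers(0, D, size=V)  # toy parcellation: parcel index per voxel

# Node time series y_n in R^D: average all voxels enclosed in each parcel.
roi_ts = np.stack(
    [voxel_ts[parcel_labels == d].mean(axis=0) for d in range(D)]
)  # shape (D, N)

# (Static) FC as the D x D Pearson correlation matrix between node time series.
fc = np.corrcoef(roi_ts)
assert fc.shape == (D, D)
\end{verbatim}

Taking the first eigenvariate instead of the mean would replace the averaging step with the leading principal component of the voxels within each parcel.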
-Throughout this thesis we refer to \gls{fc} as this connectivity measure based on \gls{fmri} data. -It is important to realize that such connectivity (i.e.~covariance) is an \emph{unobserved} property, and must be \emph{estimated} in the absence of a ground truth. +Covariance (and correlation) is perhaps the simplest measure of dependency, which may explain its popularity. +Throughout this thesis \gls{fc} refers to this connectivity measure based on \gls{fmri} data. +Such connectivity (i.e.~covariance) is an \emph{unobserved} property. +It must be \emph{estimated} in the absence of a ground truth. \info[inline]{Paragraph: Discuss network neuroscience and its relationship to functional connectivity.} -Work on \gls{fc} has greatly benefited from insights from graph theory and dynamical systems theory.\footnote{A dynamical system refers to any system of components (in our case brain regions) that interact and change across time according to some `dynamic' or `rule'. In state-space models, for example, system states are represented by a vector and evolve through a matrix multiplication.} -Brain \glspl{roi} are then represented as a node in a graph and connectivity strength as (weighted) edges~\parencite{Ryyppo2018}. -We will discuss this perspective in more detail later, in the context of using this viewpoint as a way to extract features from data. +Work on \gls{fc} and connectomics has benefited from insights from graph theory and dynamical systems theory~\parencite{Bassett2017, Betzel2022}.\footnote{A dynamical system refers to any system of components (brain regions in this case) that interact and change across time according to some `dynamic' or `rule'. In state-space models, for example, system states are represented by a vector and evolve through a matrix multiplication.} +Brain \glspl{roi} are then represented as \textbf{nodes} in a graph and connectivity strengths as (weighted) \textbf{edges}~\parencite[i.e.~node-pairs, see also][]{Ryyppo2018}. +This perspective will be discussed in more detail later, in the context of using it to extract features from data. +Moreover, the term `node' will be used throughout this thesis to refer to any brain region, and `edge' will refer to any connection between two such regions. \info[inline]{Paragraph: Summarize key scientific insights from functional connectivity.} -Despite these caveats, \gls{fc} has taught us a great deal about the spatiotemporal organization of the cortex at macro-scale. +Despite the caveats discussed, \gls{fc} has taught us a great deal about the spatiotemporal organization of the cortex at macro-scale. It can be viewed as an expression of network behavior required for high level complex cognitive function. % Many interindividual differences have been found~\parencite{Liegeois2019}. diff --git a/ch/1_Introduction/2_Functional_networks.tex b/ch/1_Introduction/2_Functional_networks.tex index 9e5976c..f6509d1 100644 --- a/ch/1_Introduction/2_Functional_networks.tex +++ b/ch/1_Introduction/2_Functional_networks.tex @@ -1,21 +1,20 @@ \clearpage -\section{Functional networks} -\label{sec:functional-brain-networks} +\section{Functional networks}\label{sec:functional-brain-networks} %%%%% \info[inline]{Paragraph: Introduce the concept of functional networks.} -Brain regions have empirically been found to cluster and interact to adapt to a certain task at hand, forming \emph{networks} (also referred to as `circuits` or `systems')~\parencite{Fox2007}.
+Brain regions have empirically been found to cluster and interact to adapt to a certain task at hand, forming \emph{networks} (also referred to as `circuits' or `systems')~\parencite{Fox2007}. Cognitive tasks are not just performed by isolated brain regions, but rather by such networks, i.e.~linked collections of brain regions~\parencite{Bressler2010}. % These networks have been identified from \gls{fmri} \gls{bold} signals from scanning subjects presented with external tasks. Moreover, it has been shown that such large-scale networks exist at rest, and that these strongly resemble those found in task paradigms~\parencite{Smith2009}.\footnote{In neuroimaging, `rest' usually refers to the absence of an external stimulus or task, leaving mental activity relatively unconstrained. Signals measured at rest are sometimes referred to as `intrinsic' or `spontaneous', but rather confusingly in neuroimaging this sometimes refers to an actual signal of interest and sometimes to just noise.} In such cases, these networks are referred to as \glspl{rsn} or \glspl{icn}. -However, this name can cause confusion, because the same networks can be found during task executions. +However, this name can cause confusion, because networks with similar extents can be found during task executions. Therefore, we opt to simply use the term `functional network'~\parencite[FN; see also][]{Finn2021}. Viewing the brain as a superposition of networks is yet another level up in abstraction, beyond looking at individual brain regions. \info[inline]{Paragraph: Describe several common functional networks that have been discovered.} -Depending on the analysis method, a number of such \glspl{fn} have been identified in humans, typically ranging from 2 to 20~\parencite{Yeo2011, Heine2012, Glomb2017}. +Depending on the analysis method, several such \glspl{fn} have been identified in humans, typically ranging from 2 to 20~\parencite{Yeo2011, Heine2012, Glomb2017}. One particularly impactful \gls{rsn} study found 20 networks~\parencite{Smith2009, Laird2011}, ten of which showed strong overlap between resting-state and task-based data networks. We will return to these ten networks in \cref{ch:benchmarking}. They are printed in \cref{fig:brainmap-functional-networks}. @@ -30,4 +29,4 @@ \section{Functional networks} \info[inline]{Paragraph: Discuss scientific insights gained from functional network studies.} Viewing cognitive function through the lens of \glspl{fn} has proven fruitful and valid. For example, \textcite{Vidaurre2017} showed that such large-scale brain networks are hierarchically organized and heritable. -Furthermore, they show that the switching between such networks is not random, and the time a subject spends in each state is predictive of cognitive traits. +Furthermore, they show that the switching between such networks is not random, and the time a subject spends in a certain state is predictive of cognitive traits. diff --git a/ch/1_Introduction/3_TVFC.tex b/ch/1_Introduction/3_TVFC.tex index 9642b2b..fd39683 100644 --- a/ch/1_Introduction/3_TVFC.tex +++ b/ch/1_Introduction/3_TVFC.tex @@ -1,11 +1,10 @@ \clearpage -\section{Time-varying functional connectivity (TVFC)} -\label{sec:tvfc} +\section{Time-varying functional connectivity (TVFC)}\label{sec:tvfc} %%%%% \info[inline]{Paragraph: Introduce concept of time-varying functional connectivity.} Is it fair to assume that \gls{fc} is temporally stable and stationary (i.e.~\emph{static}) across a measurement period? 
-In fact, many studies have shown that such connectivity varies across the length of a brain scan~\parencite{Chang2010, Sakoglu2010, Cribben2012, Lang2012, Hutchison2013, Hutchison2013b, Allen2014, Lindquist2014, Gonzalez-Castillo2015, Leonardi2015, Preti2017}. +In fact, many studies have shown that such connectivity varies across the length of a brain scan~\parencite{Chang2010, Sakoglu2010, Cribben2012, Lang2012, Hutchison2013, Hutchison2013b, Allen2014, Lindquist2014, Gonzalez-Castillo2015, Leonardi2015, Liegeois2017, Preti2017}. Therefore, there is a growing interest in studying the time-varying nature of the functional relationship between brain regions. Extending our estimation of \gls{fc} from \gls{sfc} to \gls{tvfc}, \textcite{Calhoun2014} proposed to call this functional connectome a `chronnectome', to avoid confusing it with the static (i.e.~scan average) analysis of \gls{bold} signal coupling. % @@ -21,40 +20,42 @@ \section{Time-varying functional connectivity (TVFC)} \begin{figure}[t] \centering - \includegraphics[width=\textwidth]{fig/tvfc_methods_Lurie2020} + \includegraphics[width=\textwidth]{fig/tvfc_methods_Lurie2020_compressed} \caption{ - Typical TVFC workflow(s). + Typical TVFC estimation and feature extraction workflow(s) in fMRI data. + Green arrows show a typical sliding-windows analysis, blue arrows indicate a range of other data-driven analyses, orange arrows represent fitting a biophysical model to time series data. Re-generated from \textcite{Lurie2020}. - } - \label{fig:tvfc-workflow} + }\label{fig:tvfc-workflow} \end{figure} \info[inline]{Paragraph: Clear up TVFC naming convention.} -Before moving on, we need address some housekeeping regarding naming conventions. -The terms \gls{tvfc} and \gls{dfc} (or dynFC) are sometimes used interchangably, but sometimes refer to different things. +Before moving on, we need to address some housekeeping regarding naming conventions. +The terms \gls{tvfc} and \gls{dfc} (or dynFC) are sometimes used interchangeably, but sometimes refer to different things. To avoid confusion, we will use the more general term \gls{tvfc} and a broad label, since `dynamic' functional connectivity is used in multiple ways and contexts across disciplines~\parencite[see][for more details]{Lurie2020}. \info[inline]{Paragraph: Discuss scientific insights from TVFC.} -Although \gls{sfc} analyses have taught us a lot, \gls{tvfc} can increase our understanding of the underlying cognitive processes that generate such covariance structures. +Although \gls{sfc} analyses have taught us a lot, \gls{tvfc} can increase our understanding of the underlying cognitive processes that generate such covariance structures~\parencite{Cohen2018}. The study of \gls{tvfc} can help the understanding of adaptation and shifts in behavior in the brain. % Since \gls{tvfc} is an extension of \gls{sfc}, all information captured by \gls{sfc} should be captured by \gls{tvfc} too. Therefore, we argue that \gls{tvfc} should always be compared to \gls{sfc} to find out what extra information is extracted from including the time-varying nature of the covariance structure. -And indeed, \gls{tvfc} has been shown to extract additional information in some cases compared to just \gls{sfc}~\parencite[see e.g.][]{Rashid2016, Jin2017, Liegeois2019, Vidaurre2021}. -These results also suggest that some mental disorders may benefit from a dynamic view of the brain. 
+And indeed, \gls{tvfc} has been shown to extract additional information in some cases compared to just \gls{sfc}~\parencite[see e.g.][]{Rashid2016, Jin2017, Liegeois2019, Luppi2019, Varley2020, Vidaurre2021, Coppola2022}. +These results also suggest that the study of consciousness and certain mental disorders may benefit from a dynamic view of the brain. % Despite these insights, there are still many open questions regarding the interpretation and validity of \gls{tvfc} estimates. The biological and physiological basis of \gls{tvfc}, both neural and nonneural in origin, is still elusive~\parencite{Lurie2020}. \textcite{Liu2013, Petridou2013} have suggested that \gls{tvfc} may originate from transient coactivation patterns (CAPs) and their dynamics. \textcite{Matsui2019} later confirmed this by simultaneously recording calcium imaging and optical hemodynamics~\parencite[see][for a review of multi-modal approaches]{Thompson2018b}. Furthermore, it has often been suggested that the brain exhibits a state-space structure, which may underlie the observed time-varying connectivity structure~\parencite{Hutchison2013}. +Such `brain state' characterizations entail the evolving dynamics and self-organization of brain networks~\parencite{Kringelbach2020}. +The particular definition in our \gls{fmri} \gls{fc} context will be discussed more in \cref{subsec:brain-states}. \info[inline]{Paragraph: Discuss open questions and limitations.} There are ongoing debates about the physiological origins and relevance (behaviorally as well as cognitively) of \gls{tvfc}. \textcite{Laumann2017} challenged the idea that \gls{tvfc} is related to ongoing cognition. -One potentially strong argument against any cognitive relevance of \gls{tvfc} is that \gls{fc} fluctuations have also been observed in anesthetized (unconscious) brains~\parencite{Hutchison2013b}. -However, \textcite{Demertzi2019} found anesthesia to change network complexity, validating its implication with consciousness. +One potentially compelling argument against any cognitive relevance of \gls{tvfc} is that \gls{fc} fluctuations have also been observed in anesthetized (unconscious) brains~\parencite{Hutchison2013b}. +However, \textcite{Demertzi2019} found anesthesia to change network complexity, supporting the involvement of \gls{tvfc} in consciousness~\parencite[see also][]{Varley2020b}. Furthermore, questions have been raised about the statistical validity of this construct. As \textcite{Lurie2020} discussed, fluctuations and correlations may well be explained by nonneural physiological factors such as head motion, cardiovascular, and respiratory effects. % diff --git a/ch/1_Introduction/4_The_trouble_with_estimating_TVFC.tex b/ch/1_Introduction/4_The_trouble_with_estimating_TVFC.tex index 91d767e..698367f 100644 --- a/ch/1_Introduction/4_The_trouble_with_estimating_TVFC.tex +++ b/ch/1_Introduction/4_The_trouble_with_estimating_TVFC.tex @@ -7,7 +7,10 @@ \section{The trouble with estimating TVFC} Many estimation methods are used in practice, and many more have been proposed. \gls{tvfc} estimations vary wildly across different estimation methods, and, as we will see, this results in different predictive power of subject measures (including clinical measures). Consequently, experimental and scientific conclusions are heavily influenced by the (seemingly arbitrary) choice of estimation method. -This compounds onto the already large number of reseacher `degrees of freedom' present in \gls{fmri} analyses~\parencite{Gelman2013, Dafflon2022}.
+% +This compounds onto the already substantial number of researcher `degrees of freedom' present in \gls{fmri} analyses~\parencite{Gelman2013, Dafflon2022}. +For example, \gls{tvfc} estimates also heavily depend on data preprocessing methods~\parencite{Luppi2021b}. +% This motivates the careful and robust development of \gls{tvfc} estimation methods. \info[inline]{Paragraph: Introduce common estimation methods.} @@ -26,14 +29,14 @@ \section{The trouble with estimating TVFC} In \cref{sec:established-methods} we will go into more technical detail on the methods considered in this thesis. % Although we only consider time domain methods, time-frequency parameters can still be extracted from learned model parameters, as we shall see. -Even though not every method is considered, when setting up the benchmarks it should be relatively straightforward to include another method to our comparison framework. +Even though not every method is considered, when setting up the benchmarks it should be straightforward to include another method in our comparison framework. \info[inline]{Paragraph: Explain why method selection is hard.} The lack of a ground truth correlation makes method selection a hard problem. This explains why the field has not settled on a single approach. % In their review, \textcite{Lurie2020} noted the pitfall of studying \gls{rs-fmri} \gls{tvfc} of lacking clear benchmarks. -They also notd that \gls{rs-fmri} has already gone through similar controversies in its early days. +They also noted that \gls{rs-fmri} has already gone through similar controversies in its early days. This invites us to learn from its respective journey as a field. \info[inline]{Paragraph: State how we address this problem.} diff --git a/ch/1_Introduction/5_Functional_connectivity_and_depression.tex b/ch/1_Introduction/5_Functional_connectivity_and_depression.tex index b6ae2f0..ea52f14 100644 --- a/ch/1_Introduction/5_Functional_connectivity_and_depression.tex +++ b/ch/1_Introduction/5_Functional_connectivity_and_depression.tex @@ -1,6 +1,5 @@ \clearpage -\section{Functional connectivity and depression} -\label{sec:fc-depression} +\section{Functional connectivity and depression}\label{sec:fc-depression} %%%%% \info[inline]{Paragraph: Introduce the general study of depression.} @@ -8,30 +7,28 @@ \section{Functional connectivity and depression} But what do we mean by depression? How does depression affect the brain? And more specifically, how does depression affect \gls{fc} in the brain? -Can we use \gls{fc} to assign credit or discredit to varies theories of depression? +Can \gls{fc} be used to assign credit or discredit to various theories of depression? In this thesis we argue that depression is a particularly good disease to study through the lens of \gls{tvfc}. \gls{fc} has the potential of offering new diagnostic value in neuropsychiatric disorders, where typical \gls{fmri} activations are often small~\parencite{Fornito2012}. The rest of this section reviews the current understanding of what depression is, why it is important to study it, what subtypes exist, what symptoms typically occur, how it affects the brain, and how it affects \gls{fc} in the brain. -Of course this will be a limited overview of all research and perspectives on depression, but will include the most relevant background information for the study in this work. +Of course, this will be a limited overview of all research and perspectives on depression, but will include the most relevant background information for the study in this work.
%% -\subsection{What is depression?} -\label{subsec:depression} +\subsection{What is depression?}\label{subsec:depression} %% \info[inline]{Paragraph: Overview of depression burden and motivation to study it.} Depression is a human tragedy: it is absolutely crippling, it is pervasive, and it is global. The most recent \gls{who} estimate (for 2021) puts the number of people worldwide living with a proverbial `black dog' at 280 million. The burden of depression (and other neuropsychiatric disorders) on societies and their healthcare systems barely needs elaboration. -Even more worrisome is that its disease burden and prevelance are growing. +Even more worrisome is that its disease burden and prevalence are growing. Stigma, heterogeneity of symptoms, and lack of understanding of causes and brain and social mechanisms have meant that treatment of this disorder (or umbrella of disorders) remains insufficient. -Furthermore, despite decades of intense research, the concept of depression remains elusive. -Depression is slightly different for everyone. +Depression is slightly different for everyone and remains difficult to conceptualize. We may call it a disease, illness, or disorder, but we may also view it as an \emph{experience} instead. \info[inline]{Paragraph: Introduce depressive disorders and describe MDD.} This thesis is mainly concerned with \gls{mdd}, the most common of all depressive disorders. -It is important to distinguish between three types of depression: the every day, coloquial use of the word depression; a longer period of sadness after a traumatic life event (a \emph{reactive} depression); and \textbf{major depression}, which is characterized by \emph{persistent} sadness over long periods of time~\parencite{Otte2016}. +It is important to distinguish between three types of depression: the everyday, colloquial use of the word depression; a longer period of sadness after a traumatic life event (a \emph{reactive} depression); and \textbf{major depression}, which is characterized by \emph{persistent} sadness over prolonged periods of time~\parencite{Otte2016}. Going forward we refer to the latter when we talk about depression. Two common ways of diagnosing (i.e.~categorizing or classifying) depression are based on standard diagnostic (category-based) frameworks: the \gls{dsm} and the \gls{icd}.\footnote{The \gls{icd} defines mental disorders as ``clinically recognizable set of symptoms or behaviors associated in most cases with distress and with interference with personal functions''~\parencite{WHO1992}.} The \gls{dsm} criteria for major depression are shown in Box~\ref{box:depression}. @@ -40,7 +37,7 @@ \subsection{What is depression?} More broadly, depression endophenotypes\footnote{Endophenotypes, or `intermediate phenotypes', refer to heritable traits used to more robustly define behavioral symptoms into phenotypes. Similar terms are `biological marker' or \emph{biomarker} and `subclinical trait', although these are typically not used to refer to genetic components.} and cardinal symptoms include anhedonia, anergia, anxiety, rumination, changes in appetite and sleep patterns, strong and persistent feelings of guilt and grief, and, most tragically, self-injury~\parencite{Goldstein2014, Pizzagalli2014}. Although core symptoms are typically present, depression is not a consistent syndrome with a fixed set of symptoms. In fact, \textcite{Fried2015} found over 1,000 unique symptoms in a cohort of about 3,700 patients~\parencite[see also][]{Fried2015b}. 
-\Gls{mdd} not only affects mood and affective processing, but is involved with a range of cognitive dysfunctions as well. +\Gls{mdd} not only affects mood and affective processing but is also involved with a range of cognitive dysfunctions. \begin{mybox}[floatplacement=t,fontupper=\footnotesize,fontlower=\footnotesize,label={box:depression},colback=White]{Depression and its symptoms} @@ -69,8 +66,8 @@ \subsection{What is depression?} What causes depression? As there are various subtypes of depression, this varies. However, commonly depressive episodes are predated by traumatic, adverse, and negative life events~\parencite{Kessler1997, Monroe2008}. -When such events happen at a developmental age, they can disproportionality impact neurobiological systems, and lead to higher probability of developing depression later in life. -Perhaps the right question is not what causes depressive episodes, but what makes some individuals seemingly more \emph{resilient} in the face of stressors to be able to cope and recover. +When such events happen at a developmental age, they can disproportionally impact neurobiological systems, and lead to a higher probability of developing depression later in life. +Perhaps the right question is not what causes depressive episodes, but what makes some individuals more \emph{resilient} in the face of stressors to be able to cope and recover. As such it is common to talk about `risk' or `contributing' factors (such as genetics, early life experiences, socioeconomic status, and environment), instead of `causes'. Most prevention efforts would focus on managing exactly these contributing factors. @@ -81,7 +78,7 @@ \subsection{What is depression?} Genetic risk for \gls{mdd} is polygenic, meaning a variety of genes are involved, and the exact mechanisms are yet to be uncovered~\parencite{Hyman2014}. This is likely due to the heterogeneity of depressive symptoms as well. Moreover, much of depression risk may be due to other genetic factors. -Generally higher overall cognitive function, for example, could lead to higher socioeconomic status, which in turn could lead to healthier diet and increased sense of safety and control in the world (which in turn have been linked to lower depression risk). +Higher overall cognitive function, for example, could lead to higher socioeconomic status, which in turn could lead to healthier diet and increased sense of safety and control in the world (which in turn have been linked to lower depression risk). Genetic risk has often been described as influencing cognitive biases and thus \emph{resilience} to stressors. The most important take-away from genetic studies is that genes are about vulnerability and resilience to depression and not about inevitability. % @@ -89,32 +86,31 @@ \subsection{What is depression?} \info[inline]{Paragraph: Describe cognitive effects of MDD.} Before looking at the brain and impacts of \gls{mdd} on \gls{fc}, we give an overview of changes in cognition and behavior. -These will be referred back to in \cref{sec:ukb-discussion}. +These will be referred to in \cref{sec:ukb-discussion}. Deficits in memory systems, attention, learning, processing speed, and decision-making are common among \gls{mdd} patients. \textcite{Rock2014} found especially executive function\footnote{In neuroscience, \textbf{executive function} generally refers to functions related to planning, focus, sticking with instructions, and multi-tasking~\parencite{Banich2009}.}, memory, and attention affected by \gls{mdd}. 
-Dysfunction is linked to a range of cogntive and affective biases. +Dysfunction is linked to a range of cognitive and affective biases. A core affective bias is toward paying attention to the negative, or only remembering the negative~\parencite{Pulcu2017}. For example, depressed individuals forget negative information at a slower rate~\parencite{Power2000, Joormann2010}. \info[inline]{Paragraph: Discuss integrated models of depression.} Key to all of this is to find ways to \emph{integrate} or \emph{unify} the various perspectives on depression. -Several proposal have been made to build integrated models of depression. -Most of these agree that we need a bridge between the psychological perspective (the one that `makes sense', but we can't do modern science on) and the biological perspective (the one that we can measure and work with, but is often too far removed from the human experience). +Several proposals have been made to build integrated models of depression. +Most of these agree that we need a bridge between the psychological perspective (the one that `makes sense' but we cannot do modern science on) and the biological perspective (the one that we can measure and work with but is often too far removed from the human experience). \textcite{Akiskal1973} discussed ways to integrate such psychological and biological views of depression. Their proposed framework integrates several depression characterizations; metapsychological (Freud's ``aggression-turned-inwards'' and the ``object-loss'' models), the ``reinforcement'' model, and the biological (``biogenic amine'') model into a common pathway of ``functional derangement of the mechanisms of reinforcement''. \textcite{Pizzagalli2014} proposed that anhedonia is the key feature of depression, and proposes an account of anhedonia, \gls{da} (reward systems), and the (internal) massive stress responses and heightened stress hormone levels found in depressed patients. More recently, \textcite{Beck2016} proposed that depression can be viewed as ``an adaptation to conserve energy after the \emph{perceived loss of an investment in a vital resource} such as a relationship, group identity, or personal asset.'' They highlight that these are mediated by brain regions involved in cognition and emotion regulation: the \gls{amg}, \gls{hpc}, and \gls{pfc}. -According to this proposal, depression can be viewed as an ``evolutionary program'' for conserving energy, that just so happens to have become maladaptive in contemporary life. -\footnote{Such unfortunate evolutionary left-overs have been used to attribute other maladies to as well. Instinctive hoarding of sugar and information has had evolutionary advantages, but wreaks havoc in modern life.} +According to this proposal, depression can be viewed as an ``evolutionary program'' for conserving energy, that just so happens to have become maladaptive in contemporary life.\footnote{Other maladies can also be attributed to such unfortunate evolutionary left-overs. Instinctive hoarding of sugar and information had evolutionary advantages, but wreaks havoc in modern life.} Overall, many of these existing grand theories share a lot of common ground. -Most descriptions generally gear toward pertubations in reinforcement processing, negative affective bias~\parencite{Pulcu2017}, negative feedback loops, stress, and associated neurochemical pathways. 
+Most descriptions gear toward perturbations in reinforcement processing, negative affective bias~\parencite{Pulcu2017}, negative feedback loops, stress, and associated neurochemical pathways. However, at the time of writing most of these are still quite general and fail to make concrete, falsifiable predictions, crucial for the development of strong theory. It also remains to be seen whether a single model will be able to describe all clinical cases. \info[inline]{Paragraph: Discuss treatment options for MDD.} That brings us to the treatment of depression. -Importantly, knowing how what works and what doesn't to treat depression can also shed light on what the condition actually entails. +Importantly, knowing what works to treat depression and what does not can also shed light on what the condition entails. Treatment options for \gls{mdd} generally are pharmacological intervention and/or one of the many types of (psycho)therapy~\parencite{Otte2016}. Antidepressant medication is usually meant to increase the concentration of a certain neurotransmitter in the brain, most commonly serotonin. In the case of serotonin these antidepressants are called \gls{ssri}. @@ -126,13 +122,12 @@ \subsection{What is depression?} For example, medication seems to work well for some but has no effect on others. The latter are sometimes called `treatment-resistant', but they may well suffer from a different subtype of depression, where neurobiologically distinct domains are collapsed into a simple diagnostic index. Overall, there are many things that can help those with depression. -However, one of the main issues is that not all of these things help everyone, and matching the right support to the right person is hard. +However, one of the crucial issues is that not all these things help everyone, and matching the right support to the right person is hard. Each patient is characterized by a unique mixture of medical history, personality, comorbidities, socioeconomic environment, and many other factors~\parencite{Trivedi2006}. -Here lies the challenge of treatment of depression in society: how do we provide care with the required level of personalization, yet to millions of people at the same time? +Here lies the challenge of the treatment of depression in society: how do we provide care with the required level of personalization, yet to millions of people at the same time? %% -\subsection{Depression and neuroimaging} -\label{subsec:fc-neuroimaging} +\subsection{Depression and neuroimaging}\label{subsec:fc-neuroimaging} %% Neuroimaging has the potential to offer unique insight into the mechanisms of depression. @@ -144,23 +139,22 @@ \subsection{Depression and neuroimaging} Multiple brain region volumes are either increased or decreased~\parencite{Sacher2012, Schmaal2020}. Grey matter volumes are \emph{reduced} in the \gls{amg}, \gls{pccx}, \gls{dmpfc}, and \gls{hpc}. However, there are conflicting findings, and \gls{amg} volume may be increased or decreased based on individual specifics. 
-Volumetric increases have been reported for the insula, middle frontal gyrus, superior frontal gyrus, and the thalamus.\footnote{In neuroanatomy, a \textbf{gyrus} refers to the ridges of the cortex surface, oppossed to a \textbf{sulcus}, which refers to the respective furrow of the folded cortex.} +Volumetric increases have been reported for the insula, middle frontal gyrus, superior frontal gyrus, and the thalamus.\footnote{In neuroanatomy, a \textbf{gyrus} refers to the ridges of the cortex surface, as opposed to a \textbf{sulcus}, which refers to the respective furrow of the folded cortex.} However, recent meta-analyses have come to dispute the reliability and clinical relevance of many such findings. A recent large (1809 participants) study found very modest predictive power of (univariate) neuroimaging modalities (\gls{mri}, \gls{dti}, \gls{rs-fmri}, and \gls{tb-fmri}) of \gls{mdd}~\parencite{Winter2022}. -They found environmental factors such as social support and childhoold maltreatment to have much more predictive power. +They found environmental factors such as social support and childhood maltreatment to have much more predictive power. Similar sentiments were echoed in \textcite{Nour2022}. At present, neuroimaging plays little to no role in clinical decision making~\parencite{Kapur2012}. %% -\subsection{Functional connectivity in psychiatric disorders} -\label{subsec:fc-depression} +\subsection{Functional connectivity in psychiatric disorders}\label{subsec:fc-depression} %% \info[inline]{Paragraph: How can we relate functional connectivity to disorders? What can this teach us about these disorders? What value do these analyses have in a clinical setting?} -Whereas some brain disorders and mental health conditions can be traced back to a dysfunction in a particular brain region (e.g. inflamation, neurodegeneration,\footnote{Neurodegenerative disorders refer to progressive loss of neural structure and function. Common disorders in this category include \gls{ad} and Parkinson's disease.} or physical trauma), others are better understood as dysfunction in brain region \emph{function} and/or \emph{interaction} between otherwise seemingly healthy individual brain regions. -Mood disorders\footnote{In psychiatry, a \textbf{mood} or \textbf{affective disorder} refers to any depressive or bipolar disorder.} especially have been suggested to be functional rather than structural disorders~\parencite{Piguet2021}. -\Gls{fc} is a particularly useful framework to study such aberrations in connectivity with, witnessed by the vast amount of studies studying depression through this lens. +Whereas some brain disorders and mental health conditions can be traced back to a dysfunction in a particular brain region (e.g.~inflammation, neurodegeneration,\footnote{Neurodegenerative disorders refer to progressive loss of neural structure and function. Common disorders in this category include \gls{ad} and \gls{pd}.} or physical trauma), others are better understood as dysfunction in brain region \emph{function} and/or \emph{interaction} between otherwise seemingly healthy individual brain regions. +Mood disorders\footnote{In psychiatry, a \textbf{mood} or \textbf{affective disorder} refers to any depressive or bipolar disorder.} especially have been suggested to be functional (related to dynamic connectivity patterns) rather than structural disorders~\parencite{Piguet2021}.
+\Gls{fc} is a particularly useful framework to study such aberrations in connectivity with, as witnessed by the vast number of studies examining depression through this lens. \info[inline]{Paragraph: Discuss sFC in neurological and psychiatric disorders.} Most of such psychiatric studies employ \gls{sfc}. @@ -172,23 +166,40 @@ \subsection{Functional connectivity in psychiatric disorders} \info[inline]{Paragraph: Discuss FNs in neurological and psychiatric disorders.} \Glspl{fn} are often affected with neuropsychiatric disorders, even if their individual brain region constituents appear normal. -Such disorders are therefore increasingly studied as \emph{network} disorders~\parencite{Mulders2015}. +Such disorders are therefore increasingly studied as \emph{network} disorders~\parencite[see][for a review on depression]{Mulders2015}. +Instead of a single brain region not functioning properly, there is an aberration in the integration and segregation of brain regions. These changes in \gls{fn} are believed to contribute or be caused by cognitive changes from mental illness. Many mental illness conditions have been postulated to occur with large-scale disruptions, driven by neurotransmitter dysfunction, of whole-brain systems. Even though whole-brain \glspl{fn} are found to be highly similar across groups with or without a range of mental illnesses, the subtle differences that \emph{do} occur are meaningful in the sense that they are predictive of diagnosis~\parencite{Spronk2020}. This makes intuitive sense as well: a piano does not need major disruption to ruin a classical piece, one key being out of tune is sufficient~\parencite[see also][for a discussion of small effect sizes]{Paulus2019}. Perhaps this is even encouraging. If small network alterations can result in mental disease, a small intervention can bring someone's functional architecture back on track. -\textcite{Mulders2015} found especially \gls{dmn}, \gls{cen}, and \gls{sn} neural circuits to be affected. -These are also generally the most-studied networks, and will be the ones we look at in this thesis. + +\textcite{Mulders2015} found the main networks involved and affected in depression to be the \gls{dmn}~\parencite{Berman2011, Demirtas2016, Wise2017, Yan2019, Zhao2019, Zhou2020}, \gls{cen}~\parencite{Zhao2019}, and \gls{sn}~\parencite{Manoliu2014}. +These are also generally the networks most studied and will be the ones considered in this thesis. +The exact makeup of these networks varies across studies. +A rough overview of each network is provided here (the more precise implementational details will be provided in \cref{subsec:ukb-fn-analysis}). + +The \gls{dmn} primarily consists of \gls{mpfc} and \gls{pcc}, as well as the (para)hippocampal areas, precuneus (cortex), and angular gyrus~\parencite{Andrews-Hanna2010}. +It is often described as the neurological basis for `the self', and is attributed functions like self-referential thinking~\parencite{Sheline2009}, cognitive flexibility~\parencite{Vatansever2016}, mind-wandering, memory processing and rumination, theory of mind, emotion regulation, and storage of autobiographical information. +It is connected to the \gls{amg} and \gls{hpc}~\parencite{Andrews-Hanna2014}. +\unsure{How should we define our DMN?} + +The \gls{cen} primarily consists of the lateral \gls{pfc}, posterior parietal cortex (PPC), \gls{dlpfc} (especially middle frontal gyrus), \gls{dmpfc}, and posterior parietal regions~\parencite{Rogers2004}.
+It is associated with cognitive processes and functions, like working memory and attention. +\unsure{How should we define our CEN?} + +The \gls{sn} primarily consists of the \gls{ai} and (dorsal) \gls{acc}, with some adding the \gls{amg}, frontoinsular cortex, temporal poles, and striatum~\parencite{Seeley2007, Menon2010, Beck2016}. +\unsure{How should we define our SN?} +The \gls{sn} is a key network in cognitive flexibility~\parencite{Dajani2015}. \info[inline]{Paragraph: Discuss TVFC in neurological and psychiatric disorders.} What about the dynamics of \gls{fc}? -The particular promise of \gls{tvfc} has been highlighted more recently as well in neurodegenerative conditions~\parencite{Filippi2019}. +The promise of \gls{tvfc} has been highlighted more recently as well in neurodegenerative conditions~\parencite{Filippi2019}. More relevant information is contained in \gls{tvfc} compared to \gls{sfc}. \gls{tvfc} may be especially relevant for dynamical brain disorders like schizophrenia~\parencite{Jin2017}. \info[inline]{Paragraph: Discuss graph topology in neurological and psychiatric disorders.} Graph topology and network neuroscience have also been suggested to shed more light on neurological and psychiatric conditions~\parencite{Fornito2013}. Connectomic graph theoretic approaches to depression have found smaller path lengths and higher global efficiency~\parencite{Zhang2011}. -This has been interpreted as shift toward brain network randomization~\parencite{Gong2015}. +This has been interpreted as a shift toward brain network randomization~\parencite{Gong2015}. diff --git a/ch/1_Introduction/6_Outline_and_contributions_of_thesis.tex b/ch/1_Introduction/6_Outline_and_contributions_of_thesis.tex index 377f61d..fdcf3a9 100644 --- a/ch/1_Introduction/6_Outline_and_contributions_of_thesis.tex +++ b/ch/1_Introduction/6_Outline_and_contributions_of_thesis.tex @@ -19,10 +19,10 @@ \section{Outline and contributions of thesis} Multiple depression phenotypes and \gls{tvfc} metrics are studied to provide a rich multiverse of scientific insight. \info[inline]{Paragraph: Provide thesis outline.} -In \cref{ch:methods} we go through established \gls{tvfc} estimation methods, our new approach, as well as the benchmarking framework used to compare estimation methods. +In \cref{ch:methods} we go through established \gls{tvfc} estimation methods, our novel approach, as well as the benchmarking framework used to compare estimation methods. The remaining (experimental) chapters are about applying these methods. They are structured in ascending order of complexity and practicality, starting with simple, synthetic data sets to real, large-sample resting-state and task-based \gls{fmri} data in \cref{ch:benchmarking}, to the application in a large population study in \cref{ch:ukb}. -In \cref{ch:discussion} we review and interpret our results, and set out directions for future work. +In \cref{ch:discussion} we review and interpret our results and set out directions for future work. \info[inline]{Paragraph: Final comments before wrapping up the introduction.} We hope this thesis and accompanying software package can help researchers make more robust \gls{tvfc} brain connectivity estimates and shed more light on what we can infer from this construct. 
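The graph-theoretic measures touched on above (the node and edge representation, path lengths, global efficiency) can be made concrete with a short sketch. It assumes the networkx library and a correlation matrix like the one from the earlier NumPy sketch; the 0.3 threshold is arbitrary, and this is not the feature-extraction code used in this thesis.

\begin{verbatim}
import networkx as nx
import numpy as np

# Assume `fc` is a D x D correlation (FC) matrix, e.g. from the earlier sketch.
rng = np.random.default_rng(1)
fc = np.corrcoef(rng.standard_normal((15, 200)))

# Nodes represent brain regions; weighted edges represent connectivity strength.
adjacency = np.abs(fc)
np.fill_diagonal(adjacency, 0.0)        # drop self-connections
graph = nx.from_numpy_array(adjacency)  # undirected, weighted graph

# Examples of graph-theoretic features extracted from such a graph.
node_strength = dict(graph.degree(weight="weight"))            # weighted degree
binary_graph = nx.from_numpy_array((adjacency > 0.3).astype(int))  # thresholded
global_efficiency = nx.global_efficiency(binary_graph)
\end{verbatim}

Whether to threshold, binarize, or use weighted variants of such measures is itself a researcher degree of freedom of the kind discussed earlier.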
diff --git a/ch/2_Robust_estimation_of_TVFC/0_Introduction.tex b/ch/2_Robust_estimation_of_TVFC/0_Introduction.tex index cd7422e..e81cb1f 100644 --- a/ch/2_Robust_estimation_of_TVFC/0_Introduction.tex +++ b/ch/2_Robust_estimation_of_TVFC/0_Introduction.tex @@ -1,5 +1,4 @@ -\chapter{Robust estimation of TVFC} -\label{ch:methods} +\chapter{Robust estimation of TVFC}\label{ch:methods} %%%%% \info[inline]{Paragraph: Overview of chapter.} @@ -8,12 +7,12 @@ \chapter{Robust estimation of TVFC} Additionally, we motivate the importance and describe several ways of extracting features or \emph{(bio)markers} from \gls{tvfc}. These will be used in further analyses in the following chapters. % -Furthermore, we discuss how to compare and evaluate these methods, in order to weigh which one we ought to use. +Furthermore, we discuss how to compare and evaluate these methods, to weigh which one we ought to use. Our aim is to develop \emph{robust} \gls{tvfc} estimation methods. Above all, we need to convince \emph{ourselves} that any method we opt to use is valid. In the words of the legendary physicist Richard~Feynman: ``The first principle is that you must not fool yourself, and you are the easiest person to fool''. % -This chapter closes with a short discussion on the nature of estimation methods. +This chapter closes with a brief discussion on the nature of estimation methods. \info[inline]{Paragraph: Frame TVFC estimation as covariance structure estimation problem.} The estimation of \gls{tvfc} (as we have narrowly defined it in the introduction to this thesis) is a particular form of the more general problem of covariance structure estimation. @@ -65,7 +64,7 @@ \chapter{Robust estimation of TVFC} Throughout this thesis, \gls{fc} refers to this correlation metric. All plots will show correlation estimates instead of covariance estimates. % -We will refer to a single \gls{fc} correlation matrix as a connectivity \emph{state}, and refer to this correlation matrix as a function of time (i.e.~\gls{tvfc}) as covariance or correlation \emph{structure} (used interchangably). +We will refer to a single \gls{fc} correlation matrix as a connectivity \emph{state}, and refer to this correlation matrix as a function of time (i.e.~\gls{tvfc}) as covariance or correlation \emph{structure} (used interchangeably). % On a cautionary note, using the Pearson correlation as connectivity measure may be too simple. It assumes that time series are homoscedastic, meaning the variance across a brain scan is homogenous. diff --git a/ch/2_Robust_estimation_of_TVFC/1_Established_methods_and_baselines.tex b/ch/2_Robust_estimation_of_TVFC/1_Established_methods_and_baselines.tex index 8688d00..cf3e276 100644 --- a/ch/2_Robust_estimation_of_TVFC/1_Established_methods_and_baselines.tex +++ b/ch/2_Robust_estimation_of_TVFC/1_Established_methods_and_baselines.tex @@ -1,6 +1,5 @@ \clearpage -\section{Established methods and baselines} -\label{sec:established-methods} +\section{Established methods and baselines}\label{sec:established-methods} %%%%% \info[inline]{Section: Introduce key and established TVFC estimation methods.} @@ -22,7 +21,7 @@ \subsection{Static functional connectivity} Furthermore, if a \gls{tvfc} estimation method cannot outperform the \gls{sfc} estimate on some task, this may either indicate that there is no (relevant) dynamic signal in the data set, or that the estimation method is flawed. 
\info[inline]{Paragraph: Describe our static functional connectivity estimation.} -A standard covariance \gls{sfc} approach simply computes the covariance between nodes (i.e.~time series) across the entire brain scan duration as in \cref{eq:covariance}: +A standard covariance \gls{sfc} approach simply computes the covariance between node time series (i.e.~regional activity) across the entire brain scan duration as in \cref{eq:covariance}: \begin{equation} \begin{aligned} \sigma_{ij} & = \mathbb{E}[(y_i - \mathbb{E}[y_i])(y_j - \mathbb{E}[y_j])] \\ & = \frac{1}{N} \sum_{n=1}^N (y_{i,n} - \bar y_i)(y_{j,n} - \bar y_j), @@ -48,12 +47,11 @@ \subsection{Static functional connectivity} For larger values of $N$ this difference becomes negligible. %% -\subsection{Sliding-windows functional connectivity} -\label{subsec:sliding-windows-fc} +\subsection{Sliding-windows functional connectivity}\label{subsec:sliding-windows-fc} %% \info[inline]{Paragraph: Introduce sliding-windows functional connectivity estimation.} -Albeit criticism, \gls{tvfc} estimation methods based on \gls{sw}~\parencite{Chang2010, Sakoglu2010, Allen2014, Shakil2016, Preti2017} are still the most commonly used throughout the neuroscience literature~\parencite{Lurie2020}. +Despite criticism, \gls{tvfc} estimation methods based on \gls{sw}~\parencite{Chang2010, Sakoglu2010, Allen2014, Shakil2016, Preti2017} are still the most used throughout the neuroscience literature~\parencite{Lurie2020}. % This approach slides (or \emph{rolls}) a time window of a certain size (length)~$w$ and shape (e.g.~square, Gaussian) across the observations (typically with a step size of a single volume), and estimates the covariance or correlation as for the \gls{sfc} case in \cref{eq:sfc-estimation} for each step. As such it can be considered a (weighted) moving average. @@ -69,19 +67,19 @@ \subsection{Sliding-windows functional connectivity} % The lack of consistency across studies in implementation of \gls{sw} is sub-optimal as it makes comparison across studies harder. Furthermore, stacking heuristics does not scale. -Typically this situation is when we should start applying machine learning techniques~\parencite[][built this case beautifully]{Zinkevich2015}. +Typically, this is the point at which we should start applying machine learning techniques~\parencite[][built this case beautifully]{Zinkevich2015}. \info[inline]{Paragraph: Discuss problems with sliding-windows functional connectivity.} The main problem with this method is that without knowing the underlying process (i.e.~covariance structure from brain dynamics), it is hard to pick the right window length. % -The \gls{sw} approach is also not a \emph{model-based}~\parencite{Foti2019} or \emph{data-driven} approach.\footnote{In neuroscientific context, `data-driven' refers to extracting insights directly from data in a relatively unbiased manner. It is often juxtaposed to `hypothesis-driven' or `theory-driven' approaches.} +The \gls{sw} approach is also not a \emph{model-based}~\parencite{Foti2019} or \emph{data-driven} approach.\footnote{In neuroscientific context, `data-driven' refers to extracting insights directly from data in a relatively unbiased manner. It is often juxtaposed to `hypothesis-driven' or `theory-driven'.} Furthermore, the desired behavior at the start and end of the time series is not clear. See \textcite{Lindquist2014, Leonardi2015, Hindriks2016} for more important nuances and pitfalls of \gls{sw} methods.
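+To make these estimators concrete, the following is a minimal Python sketch (illustrative only; the function names and array conventions are not taken from the accompanying code base, and the high-pass filtering step discussed below is omitted):
+\begin{verbatim}
+import numpy as np
+
+def static_fc(y):
+    """Static FC: correlation matrix over the full scan; y has shape (N, D)."""
+    return np.corrcoef(y, rowvar=False)
+
+def sliding_window_fc(y, w):
+    """Rectangular sliding-window TVFC estimate with a step size of one volume.
+
+    Node time series are zero-padded at both ends so that an estimate is
+    returned for every volume, giving an array of shape (N, D, D).
+    """
+    n_time_steps, n_nodes = y.shape
+    half = w // 2
+    padding = np.zeros((half, n_nodes))
+    y_padded = np.vstack([padding, y, padding])
+    estimates = np.empty((n_time_steps, n_nodes, n_nodes))
+    for n in range(n_time_steps):
+        estimates[n] = np.corrcoef(y_padded[n:n + w, :], rowvar=False)
+    return estimates
+\end{verbatim}
+The static estimate is recovered as the special case where the window spans the entire scan.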
\info[inline]{Paragraph: Discuss our particular implementation.} In all experiments and benchmarks that follow in this thesis we implement a standard \gls{sw} approach to mimic a typical (often non-technical) investigator interested in using the construct of \gls{tvfc} to study the brain. -Researchers are generally recommended to use a window length between 30 and 60 seconds~\parencite{Shirer2012}. -Therefore, we implement both of these window lengths to test the limit cases. +Researchers are recommended to use a window length between 30 and 60 seconds~\parencite{Shirer2012}. +Both of these window lengths are implemented to test the limit cases. We implement the rectangular (non-tapered) window, with a step size of a single volume. We follow the rule of thumb proposed by \textcite{Leonardi2015} and high-pass filter the data to remove frequency components below $\frac{1}{w}$ before running the \gls{sw} algorithm.\footnote{\textcite{Smith2012} and \textcite{Hutchison2013} made similar suggestions.} Zeros are padded to the start and end of node time series to allow for computing the correlation coefficients around those locations. @@ -105,12 +103,12 @@ \subsection{Multivariate GARCH} It should not come as a surprise then that methods based on \gls{sw} are not used in finance. % A commonly cited reason for the use of \gls{mgarch} in modelling financial time series is that they contain a lot of noise, requiring the use of stochastic methods. -We argue that neuroimaging time series share these data characteristics. +Neuroimaging time series share these data characteristics. \info[inline]{Paragraph: Describe MGARCH algorithm.} Many versions and implementations of \gls{mgarch} models exist, as they describe a general \emph{family} of models~\parencite[see][for an extensive overview]{Silvennoinen2009}. % -The general \gls{mgarch} framework considers a zero mean vector stochastic process $\mathbf{y}_n \in \mathbb{R}^D$ with time-varying covariance structure $\mathbf{\Sigma}_n$: +The general \gls{mgarch} framework considers a zero-mean vector stochastic process $\mathbf{y}_n \in \mathbb{R}^D$ with time-varying covariance structure $\mathbf{\Sigma}_n$: \begin{equation} \mathbf{y}_n = \mathbf{\Sigma}_n^{\frac12} \mathbf{\eta}_n, \end{equation} @@ -123,27 +121,26 @@ \subsection{Multivariate GARCH} With its introduction, \textcite{Engle2002} showed it to outperform other \gls{mgarch} variants. It is by far the most common variant used in \gls{rs-fmri} analyses, which allows for better comparison. Furthermore, \textcite{Heaukulani2019} found the \gls{dcc} variant to outperform the generalized orthogonal~(GO) \gls{garch} model, which is another popular \gls{mgarch} variant. -We implemented this latter variant and found it consistently outperformed by \gls{dcc}. -It is therefore omitted from this work. +This latter variant was implemented at an early stage of this work but was consistently outperformed by \gls{dcc} and less robust in its implementation. \info[inline]{Paragraph: Describe DCC implementation.} More specifically, throughout this thesis, we implement DCC(1,1)-GARCH(1,1) using the open-source \texttt{R} (version 4.1.0) package \texttt{rmgarch}~\parencite{Galanos2022}. -One of the benefits of this method is that it is well-understood and that convenient off-the-shelf implementations exist. -\Gls{mgarch} has a number of free parameters, although these are often hard to interpret and scale with data dimension to the fourth power~\parencite{Silvennoinen2009}. 
+One of the benefits of this method is that it is well understood and that convenient off-the-shelf implementations exist. +\Gls{mgarch} has a number of free parameters, although these are often hard to interpret and their number scales with data dimension to the fourth power~\parencite{Silvennoinen2009}. \Gls{mgarch} models are known to scale poorly to higher dimensions, although workarounds have been proposed~\parencite[see e.g.][]{Nakajima2017}. Therefore, \textcite{Gourieroux2009} claimed that these models are typically limited to studying $D < 6$ components. %% -\subsection{State-based models} -\label{subsec:state-based-models} +\subsection{State-based models}\label{subsec:state-based-models} %% For completeness, we briefly review state-based models. These models are sometimes referred to as \emph{switching} models. They assume a state-space structure of brain activity, consisting of recurring \gls{fc} patterns (see \cref{subsec:brain-states} for more details on this brain state construct). +Brain region dependencies only change when the brain switches between states. If such an assumption is made, it makes sense to incorporate this in a model. The dominant model used for this is the \gls{hmm}~\parencite[see e.g.][]{Vidaurre2017, Ahrends2022}. -It assumes the brain state stochastic process (often called a \emph{chain} in this context) to model is Markovian, meaning the probability of finding oneself in a given state in the sequence of all states only depends on the previous state. +It assumes that the brain state stochastic process to be modeled (often called a \emph{chain} in this context) is Markovian, meaning the probability of finding oneself in a given state in the sequence of all states depends only on the previous state. However, these states are modeled as `hidden' (i.e.~latent). The observable process $\mathbf{y}_n$, the \gls{bold} node time series, is then used to infer the underlying hidden brain state process. diff --git a/ch/2_Robust_estimation_of_TVFC/2_Cross-validated_sliding-windows.tex b/ch/2_Robust_estimation_of_TVFC/2_Cross-validated_sliding-windows.tex index 1e6cc34..18f4b54 100644 --- a/ch/2_Robust_estimation_of_TVFC/2_Cross-validated_sliding-windows.tex +++ b/ch/2_Robust_estimation_of_TVFC/2_Cross-validated_sliding-windows.tex @@ -1,18 +1,17 @@ \clearpage -\section{Cross-validated sliding-windows} -\label{sec:cross-validated-sw} +\section{Cross-validated sliding-windows}\label{sec:cross-validated-sw} %%%%% \info[inline]{Paragraph: Discuss other proposals for determining the optimal window length.} Attempts have been made to improve \gls{sw} estimates by automatically extracting the optimal window length for a given scan from the data itself or to circumvent this issue~\parencite[see e.g.][]{Wang2014, Xu2015, Yaesoubi2018}. -In fact, prior work has established that knowing the optimal window length $w$ a priori can make \gls{sw} a very effective method in the estimation of \gls{tvfc}~\parencite{Zalesky2015}. +In fact, prior work has established that knowing the optimal window length $w$ a priori can make \gls{sw} a remarkably effective method in the estimation of \gls{tvfc}~\parencite{Zalesky2015}. These authors therefore argued that the window length should be set based on a rule of thumb after analyzing the \gls{bold} signal. As such these can be considered \emph{data-driven} estimation methods as well. \info[inline]{Paragraph: Introduce our way of cross-validating the optimal window length.} Here we propose another data-driven way of determining the optimal window length: by using cross-validation.
Cross-validation is a simple yet effective technique to evaluate the generalizability performance of models~\parencite[see e.g.][section 8.2.4]{Deisenroth2019}. -In machine learning model development it is often used to determine model hyperparameters. +In machine learning model development, it is often used to determine model hyperparameters. % In our approach, which we call the \gls{sw-cv} method, evaluation data points are taken from the middle of node time series. For each of these data points individually, the likelihood of observing it under a zero-mean multivariate Gaussian for the full range of reasonable window lengths applied on all surrounding data points is taken (\emph{not} including the evaluation observation). @@ -31,16 +30,17 @@ \section{Cross-validated sliding-windows} \begin{figure}[t] \centering - \subcaptionbox{Static structure \label{fig:sw-cv-demo-static-structure}}{ + \subcaptionbox{Static structure\label{fig:sw-cv-demo-static-structure}}{ \includegraphics[width=0.47\textwidth]{fig/studies/cross_validating_sliding_windows/sw_cv_results_df_null} } - \subcaptionbox{Fast-changing structure \label{fig:sw-cv-demo-fast-changing-structure}}{ + \subcaptionbox{Fast-changing structure\label{fig:sw-cv-demo-fast-changing-structure}}{ \includegraphics[width=0.47\textwidth]{fig/studies/cross_validating_sliding_windows/sw_cv_results_df_periodic_3} } \caption{ - Cross-validated sliding-windows demonstration showing how optimal window length adapts to underlying covariance structure. - } - \label{fig:sw-cv-demo} + Cross-validated sliding-windows demonstration showing how the optimal window length adapts to the underlying covariance structure. + Heatmap colormaps indicate test location log likelihoods. + Line plots show the mean test log likelihood over all test locations. + }\label{fig:sw-cv-demo} \end{figure} @@ -51,7 +51,7 @@ \section{Cross-validated sliding-windows} % The minimum window length ensures that we do not filter out signal in (expected) relevant frequency bands. There is no actual constraint on maximum proposal window length limit. -If the signal is fundamentally static we would expect the window length to trend to infinity. +If the signal is fundamentally static, we would expect the window length to trend to infinity. However, doing so reduces the number of available evaluation points, so we are left with a trade-off. % After the optimal window length is determined, \gls{tvfc} estimates are generated in the same way as the regular \gls{sw} approach as discussed in \cref{subsec:sliding-windows-fc}. diff --git a/ch/2_Robust_estimation_of_TVFC/3_The_Wishart_process.tex b/ch/2_Robust_estimation_of_TVFC/3_The_Wishart_process.tex index 8ffc87c..7d1181d 100644 --- a/ch/2_Robust_estimation_of_TVFC/3_The_Wishart_process.tex +++ b/ch/2_Robust_estimation_of_TVFC/3_The_Wishart_process.tex @@ -1,6 +1,5 @@ \clearpage -\section{The Wishart process} -\label{sec:wishart-process} +\section{The Wishart process}\label{sec:wishart-process} %%%%% \info[inline]{Paragraph: Introduce Wishart process and its history and application.} @@ -16,8 +15,7 @@ \section{The Wishart process} As we enter the domain of Bayesian machine learning~\parencite{Ghahramani2015}, familiarity with concepts from excellent textbooks such as \textcite{MacKay2002, Bishop2006, Hastie2009, Murphy2012, Murphy2023} will prove helpful. 
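+For completeness, the window-length selection by cross-validation described in \cref{sec:cross-validated-sw} can be summarized in a simplified sketch (illustrative only; it assumes the evaluation block lies far enough from the scan boundaries and leaves out the practical details discussed in that section):
+\begin{verbatim}
+import numpy as np
+from scipy.stats import multivariate_normal
+
+def cross_validated_window_length(y, candidate_window_lengths, test_fraction=0.2):
+    """Pick the window length with the highest mean held-out log likelihood.
+
+    y has shape (N, D). Evaluation volumes are taken from the middle of the
+    scan; each is scored under a zero-mean multivariate Gaussian whose
+    covariance is estimated from the surrounding window (excluding the
+    evaluation volume itself).
+    """
+    n_time_steps, n_nodes = y.shape
+    n_test = int(test_fraction * n_time_steps)
+    start = (n_time_steps - n_test) // 2
+    test_locations = np.arange(start, start + n_test)
+
+    mean_test_log_likelihoods = {}
+    for w in candidate_window_lengths:
+        half = w // 2
+        log_likelihoods = []
+        for n in test_locations:
+            # The w volumes surrounding (but excluding) the evaluation volume.
+            window = np.vstack([y[n - half:n, :], y[n + 1:n + 1 + (w - half), :]])
+            covariance = np.cov(window, rowvar=False)
+            log_likelihoods.append(
+                multivariate_normal.logpdf(y[n], mean=np.zeros(n_nodes), cov=covariance)
+            )
+        mean_test_log_likelihoods[w] = float(np.mean(log_likelihoods))
+    # Return the window length with the highest mean test log likelihood.
+    return max(mean_test_log_likelihoods, key=mean_test_log_likelihoods.get)
+\end{verbatim}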
%% -\subsection{The Wishart distribution} -\label{subsec:wishart-distribution} +\subsection{The Wishart distribution}\label{subsec:wishart-distribution} %% \info[inline]{Paragraph: Introduce Wishart distribution.} @@ -55,7 +53,7 @@ \subsection{Wishart process model definition} In \gls{fmri} analyses, $N$ refers to the number of time steps or scan \emph{volumes}, and $D$ refers to the number of node time series (e.g.~number of brain regions for which a characteristic \gls{bold} signal time series is determined). We denote \textbf{input locations} as $X$ := ($x_n$, $1 \leq n \leq N$) in $\mathbb{R}$. -For \gls{fmri} analyses, the (univariate, 1-dimensional)~$x_n$ here is the time at which measurement $\mathbf{y}_n$ is taken (observed).\footnote{Our model construction does not \emph{require} the input locations to be univariate. One could, for example, add other (side) information of interest such as the decaying magnet strength during a scan, head motion~\parencite[often considered one of the most significant confounding factors, see e.g.][]{Laumann2017}, a design matrix for \gls{tb-fmri}, arousal (e.g. as measured by pupil diameter or eyelid closure), and/or physiological signals such as heart rate. Whether this is beneficial will require further validation studies.} +For \gls{fmri} analyses, the (univariate, 1-dimensional)~$x_n$ here is the time at which measurement $\mathbf{y}_n$ is taken (observed).\footnote{Our model construction does not \emph{require} the input locations to be univariate. One could, for example, add other (side) information of interest such as the decaying magnet strength during a scan, head motion~\parencite[often considered one of the most significant confounding factors, see e.g.][]{Laumann2017}, a design matrix for \gls{tb-fmri}, arousal (e.g.~as measured by pupil diameter or eyelid closure), and/or physiological signals such as heart rate. Whether this is beneficial will require further validation studies.} In our model definition the spacing between values of $x_n$ may be irregular and does not need to be constant. Even though we expect \gls{fmri} data to be organized in a grid-like fashion of regular time intervals of a single \gls{tr}, this model flexibility could still be useful. For example, it allows for naturally leaving out a measurement due to an artifact. @@ -108,7 +106,7 @@ \subsection{Wishart process model definition} We train matrix $\mathbf{A}$ as part of the inference routine. Intuitively, for a static covariance estimate, our \gls{wp} could simply learn these $\mathbf{A}$ covariance terms and `switch off' the \glspl{gp}. -Writing it out, our zero mean, multivariate Gaussian likelihood is given by +Writing it out, our zero-mean, multivariate Gaussian likelihood is given by \begin{equation} p(\mathbf{y}_n|\mathbf{\Sigma}_n) = \frac{1}{(2\pi)^{\frac{D}{2}} |\mathbf{\Sigma}_n|^{\frac{1}{2}}} e^{-\frac{1}{2} \mathbf{y}_n \mathbf{\Sigma}_n^{-1} \mathbf{y}_n}. \end{equation} @@ -132,7 +130,7 @@ \subsection{Wishart process model definition} We know that $Y_n$ only depends on $\mathbf{F}_n$, so $p(Y|F) = \prod_{n=1}^N p(\mathbf{y}_n|\mathbf{A},\mathbf{F}_n)$. Since all entries of $\mathbf{F}_n$ are~i.i.d., we can write \begin{equation} - p(Y,F) = p(Y|F)p(F) = \prod_{n=1}^N \left[ p(\mathbf{y}_n|\mathbf{A},\mathbf{F}_n)) \prod_{d=1}^D \prod_{k=1}^\nu f_{d,k}(X_n) \right]. + p(Y,F) = p(Y|F)p(F) = \prod_{n=1}^N \left[ p(\mathbf{y}_n|\mathbf{A},\mathbf{F}_n) \prod_{d=1}^D \prod_{k=1}^\nu f_{d,k}(X_n) \right]. 
\end{equation} Recall that we want correlation in $Y$. @@ -152,7 +150,7 @@ \subsection{Variational Wishart processes} \Gls{vi} is a technique that approximates a probability density through \emph{optimization}~\parencite{Jordan1999, Hoffman2015, Blei2017}. It is usually faster and more scalable than other inference methods, such as \gls{mcmc} sampling~\parencite[as used in e.g.][]{Fox2011}, especially with larger data sets. -In fact, the recent advances that made this style of inference possible explains the `why now' of introducing this model to the task of \gls{tvfc} estimation. +In fact, the recent advances that made this style of inference possible explain the `why now' of introducing this model to the task of \gls{tvfc} estimation. With \gls{vi}, we posit a family of distributions~$q(F)$ over the latent variables and then find the member of that family which is close to the target distribution (the true posterior)~$p(F|Y)$. Closeness here is measured by \gls{kl-divergence}. @@ -194,7 +192,7 @@ \subsection{Variational Wishart processes} \end{equation} We iteratively maximize this as our objective function, using gradient descent. -In order to be able to compute (approximate) gradients we use the `reparametrization trick' as discussed in \textcite{Salimans2013, Kingma2014}. +In order to be able to compute (approximate) gradients we use the `reparameterization trick' as discussed in \textcite{Salimans2013, Kingma2014}. This boils down to taking samples (Monte Carlo estimates) of our objective function and computing gradients based on these. %% @@ -214,8 +212,7 @@ \subsection{Additive white noise model} This modification may be interpreted as introducing white (or \emph{observational}) noise to the model. %% -\subsection{Sparse variational Wishart processes} -\label{subsec:svwp} +\subsection{Sparse variational Wishart processes}\label{subsec:svwp} %% The beauty of basing our \gls{wp} construction on underlying \glspl{gp}, is that we can take advantage of the rapid development and improvement of these models~\parencite[echoing sentiments from][]{Foti2019}. @@ -238,18 +235,17 @@ \subsection{Sparse variational Wishart processes} \begin{figure}[t] \centering - \subcaptionbox{VWP \label{fig:vwp-computational-complexity}}{ + \subcaptionbox{VWP\label{fig:vwp-computational-complexity}}{ \includegraphics[width=0.47\textwidth]{fig/studies/wp_computational_cost/VWP} } - \subcaptionbox{SVWP \label{fig:svwp-computational-complexity}}{ + \subcaptionbox{SVWP\label{fig:svwp-computational-complexity}}{ \includegraphics[width=0.47\textwidth]{fig/studies/wp_computational_cost/SVWP} } \caption{ - WP computational complexity as a function of $N$ and $D$. - Shown is time required (in seconds) to complete 4 epochs. + WP model training computational complexity as a function of number of time steps ($N$) and number of components ($D$). + Shown is time required (in seconds) to complete~4 epochs. Run on a 3 GHz Intel Core i5 CPU. - } - \label{fig:wp-computational-cost} + }\label{fig:wp-computational-cost} \end{figure} @@ -263,7 +259,7 @@ \subsection{Implementation details} \label{eq:matern} k(\textbf{x}, \textbf{x}') = \sigma^2 (1 + \sqrt{5} r + \frac53 r^2) \exp(-\sqrt{5}r), \end{equation} -with $r = \frac{||\textbf{x} - \textbf{x}'||}{l}$, and where $l$ and $\sigma$ are the kernel length scales and variance parameters, respectively, which are trained. +with $r = \frac{||\textbf{x} - \textbf{x}'||}{l}$, and where $l$ and $\sigma$ are the kernel lengthscales and variance parameters, respectively, which are trained. 
Their initial values are set to~0.3 and~1.0, respectively. These parameters are part of the total set of model parameters~$\theta$. This kernel is a twice differentiable covariance. @@ -272,13 +268,13 @@ \subsection{Implementation details} For example, all Matérn kernels assume data stationarity. Kernel functions can be considered as a specification of similarity between observations. Our kernel considers further away observations less similar. -But a periodic kernel, for example, could consider a further away points \emph{more} similar if it is in phase with another point. +But a periodic kernel, for example, could consider a further away point \emph{more} similar if it is in phase with another point. Or, as with the Gibbs kernel, kernel parameters could themselves be a function of input features $\mathbf{x}$ (e.g.~time). In fact, this is one of the exciting aspects of the \gls{wp}. We can characterize time series through these kernels. Smoothness of correlation structure can be expressed by such kernel functions for example~\parencite{Fyshe2012, Fox2015, Foti2019}. Kernels can be combined too; sums and products of kernels are also valid kernels. -As such more expressive kernels can be designed~\parencite{Gonen2011}. +As such more expressive kernels can be designed~\parencite{Gonen2011} and domain knowledge incorporated. Kernel choice requires trial-and-error, although some efforts have been made to automate this process~\parencite[see e.g.][]{Steinruecken2019}. @@ -287,8 +283,7 @@ \subsection{Implementation details} \includegraphics[width=\textwidth]{fig/studies/kernels} \caption{ Draws from common Gaussian process kernels on interval $\left[ {0,1} \right]$. - } - \label{fig:kernel-draws} + }\label{fig:kernel-draws} \end{figure} @@ -299,8 +294,8 @@ \subsection{Implementation details} We run both the \gls{vwp} and \gls{svwp} models in most experiments, although only the former for cases with $N \leq 200$ and only the latter for cases with $N \geq 400$. The number of inducing points is set to $M = 200$, irrespective of time series lengths~$N$. -Standing on the shoulders of giants, the model is implemented using the open-source Python library \texttt{GPflow}~\parencite[][version 2.5.2]{Matthews2017, Wilk2020}. -This \gls{gp} toolbox in turn is built on top of Google's \texttt{Tensorflow}~\parencite[][version 2.9.2]{Tensorflow2015}. +Standing on the shoulders of giants, the model is implemented using the open-source Python library \texttt{GPflow}~\parencite[][version 2.6.4]{Matthews2017, Wilk2020}. +This \gls{gp} toolbox in turn is built on top of Google's \texttt{Tensorflow}~\parencite[][version 2.11.0]{Tensorflow2015}. These packages take care of all underlying automatic differentiation. This means that the amount of code to write is minimal and consists mainly of implementing a (customized) likelihood function. For these reasons, this black-box implementation~\parencite{Ranganath2014} is simple and fast compared to proposed inference routines based on \gls{mcmc}. 
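+As a toy illustration (not the \texttt{GPflow}-based implementation itself), the following sketch evaluates the Matérn-5/2 kernel of \cref{eq:matern} and forward-simulates covariance matrices in the spirit of the \gls{wp} construction, with $\nu$ i.i.d.~\gls{gp} draws per dimension mixed through the scale matrix $\mathbf{A}$ (in the actual model these draws are latent and inferred variationally):
+\begin{verbatim}
+import numpy as np
+
+def matern52(x1, x2, variance=1.0, lengthscale=0.3):
+    """Matern-5/2 kernel matrix between two sets of univariate inputs."""
+    r = np.abs(x1[:, None] - x2[None, :]) / lengthscale
+    return variance * (1.0 + np.sqrt(5.0) * r + (5.0 / 3.0) * r**2) * np.exp(-np.sqrt(5.0) * r)
+
+rng = np.random.default_rng(seed=42)
+n_time_steps, n_nodes, nu = 200, 2, 2  # N, D, and degrees of freedom
+x = np.linspace(0.0, 1.0, n_time_steps)  # scaled scan time as input locations
+
+# Draw nu i.i.d. GP functions per node dimension: F has shape (N, D, nu).
+K = matern52(x, x) + 1e-6 * np.eye(n_time_steps)  # jitter for numerical stability
+F = rng.multivariate_normal(np.zeros(n_time_steps), K, size=(n_nodes, nu)).transpose(2, 0, 1)
+
+A = np.array([[1.0, 0.0], [0.5, 1.0]])  # scale matrix; trained during inference
+
+# One covariance matrix per time step: Sigma_n = A F_n F_n^T A^T.
+Sigma = np.einsum('ij,njk,nlk,ml->nim', A, F, F, A)
+
+# Normalize to correlation matrices, matching the FC convention used in this thesis.
+stddev = np.sqrt(np.einsum('nii->ni', Sigma))
+correlation = Sigma / (stddev[:, :, None] * stddev[:, None, :])
+\end{verbatim}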
diff --git a/ch/2_Robust_estimation_of_TVFC/4_Extracting_TVFC-based_features_and_biomarkers.tex b/ch/2_Robust_estimation_of_TVFC/4_Extracting_TVFC-based_features_and_biomarkers.tex index 0bf984f..7f6eb15 100644 --- a/ch/2_Robust_estimation_of_TVFC/4_Extracting_TVFC-based_features_and_biomarkers.tex +++ b/ch/2_Robust_estimation_of_TVFC/4_Extracting_TVFC-based_features_and_biomarkers.tex @@ -1,10 +1,9 @@ \clearpage -\section{Extracting TVFC-based features and biomarkers} -\label{sec:tvfc-feature-extraction} +\section{Extracting TVFC-based features and biomarkers}\label{sec:tvfc-feature-extraction} %%%%% \info[inline]{Paragraph: Introduce importance of feature extraction.} -For a given scan, we estimate \gls{tvfc} as a rather large $N \times D \times D$ tensor. +For a given scan, \gls{tvfc} is estimated as a rather large $N \times D \times D$ tensor. This contrasts \gls{sfc} analyses, where the $N \times D$ data is typically \emph{reduced} in dimensionality to~$D \times D$. % In most practical settings and applications, however, we wish to study more interpretable features. @@ -14,19 +13,19 @@ \section{Extracting TVFC-based features and biomarkers} \info[inline]{Paragraph: Discuss what we want from extracted features.} Any data processing step cannot add any information, but merely destroy it. -Therefore, feature extraction should encompass information preservation while removing redundant information (for the particular task at hand) and noise. -If we desire to use our estimated \gls{tvfc} in some practical application to make predictions, perhaps this estimation can be skipped and models can be trained directly on node time series data. +Therefore, feature extraction should encompass information preservation while removing redundant information (for the task at hand) and noise. +If we desire to use our estimated \gls{tvfc} in some practical application to make predictions, perhaps this estimation can be skipped. +Models can then be trained directly on node time series data. However, in our context we are interested in the covariance structure itself; as a wiring diagram of the functional interactions within the brain. % Extracting features is a crucial step when planning to run any supervised learning algorithm. -In fact, it has been argued that strong performance of artificial neural networks may largely be due to their ability to automatically extract meaningful features from data (in contrast to features hand-crafted by humans). +In fact, it has been argued that the strong performance of artificial neural networks may be due to their ability to automatically extract meaningful features from data (in contrast to features hand-crafted by humans). %% -\subsection{TVFC summary measures} -\label{subsec:tvfc-summary-measures} +\subsection{TVFC summary measures}\label{subsec:tvfc-summary-measures} %% -One common way to extract features in \gls{tvfc} neuroimaging is to take edge-wise summary measures (or \emph{statistics}) across time. +One common way to extract features in \gls{tvfc} neuroimaging is to take edgewise summary measures (or \emph{statistics}) across time. % Typical summary measures include the mean~\parencite[analogous to \gls{sfc} estimates; sometimes considered the connection `strength', see e.g.][]{Choe2017} and standard deviation or variance~\parencite[see e.g.][]{Chang2010, Hutchison2013b, Kucyi2013, Kucyi2014, Kaiser2015, Demirtas2016, Choe2017}. 
Variance (or standard deviation) is sometimes interpreted to represent connectivity `stability' or `flexibility'~\parencite[see e.g.][]{Allen2014}. @@ -48,21 +47,21 @@ \subsection{TVFC summary measures} \end{equation} for the edge between nodes $1 \leq i \leq D$ and $1 \leq j \leq D$. This summary measure captures how smooth a time series is over time (i.e.~the smoothness of the estimated \gls{fc} time series in our case). -It is more informative of \gls{fc} frequency amplitudes, and is akin to \gls{fc} `variability' as described in \textcite{Allen2014}. +It is more informative of \gls{fc} frequency amplitudes and is akin to \gls{fc} `variability' as described in \textcite{Allen2014}. To illustrate its relevance, this summary measure can distinguish two sine waves with identical mean and variance, yet oscillating at different frequencies (see e.g. \cref{fig:synthetic-covariance-structures}). As such it is complementary to the other two summary measures. %% -\subsection{Brain states and related metrics} -\label{subsec:brain-states} +\subsection{Brain states and related metrics}\label{subsec:brain-states} %% Another common way to reduce the dimensionality of \gls{tvfc} estimates is to assume that the estimated covariance structure at each time point can be characterized as a certain `brain state' (or `\gls{fc} state')~\parencite{Kringelbach2020}. Such states are short-term, recurring spatial activity \gls{fc} patterns across time and subjects. +In our context, they are characterized by an \gls{fc} correlation matrix. % The existence of such a state-space structure in the brain has been shown to be a valid view~\parencite{Deco2015}. % -These states can be insightful on their own, or can be used as extracted features. +These states can be insightful on their own or can be used as extracted features. For example, \textcite{Rashid2016} showed that schizophrenia patients spend more time in low-contrast states. One exciting aspect of the construct of brain states is that they can be synthesized across species and imaging modalities (e.g.~with microstates in \gls{eeg}, see \textcite{Allen2014} for further discussion). As such they can serve as a common language to bridge multiple levels of neuroscientific research. @@ -78,12 +77,12 @@ \subsection{Brain states and related metrics} Such a plot computes the summed distance between each correlation matrix and its closest (i.e.~assigned) basis state for a range of values of~$k$. The optimal value of~$k$ is then chosen as the one where this curve has an `elbow'; that is, after which the curve is relatively linear and after which adding another cluster does not result in a big decrease thereof. % -Perhaps surprisingly, most studies using brain states to summarize the brain's activity find a relatively small number of distinct states, for example 3 in \textcite{Choe2017, Dini2021} and 12 in \textcite{Vidaurre2017}. +Perhaps surprisingly, most studies using brain states to summarize the brain's activity find a relatively small number of distinct states, for example three in \textcite{Choe2017, Dini2021} and 12 in \textcite{Vidaurre2017}. Even though the \gls{wp} is not constrained in estimating \gls{tvfc} in grid-like fashion, we do this for extracting brain states to allow for better comparison to other methods. However, we could estimate as many covariance matrices at as high of a temporal resolution as we like for this task. 
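+As a simple sketch of the summary measure and brain state extraction described above (assuming \texttt{scikit-learn} for the clustering step, and using a mean absolute successive difference as a stand-in for the rate-of-change measure defined earlier):
+\begin{verbatim}
+import numpy as np
+from sklearn.cluster import KMeans
+
+def summary_measures(tvfc):
+    """Edgewise temporal summary measures of a TVFC estimate of shape (N, D, D)."""
+    mean = tvfc.mean(axis=0)
+    variance = tvfc.var(axis=0)
+    rate_of_change = np.abs(np.diff(tvfc, axis=0)).mean(axis=0)  # proxy definition
+    return mean, variance, rate_of_change
+
+def extract_brain_states(tvfc, n_states):
+    """Cluster per-volume correlation matrices into recurring brain states.
+
+    Each (D, D) matrix is flattened (upper triangle only) before clustering.
+    Returns basis states, the state assignment per volume, and the k-means
+    inertia used for the elbow criterion when choosing the number of states.
+    """
+    n_nodes = tvfc.shape[-1]
+    iu = np.triu_indices(n_nodes, k=1)
+    features = tvfc[:, iu[0], iu[1]]  # shape (N, number of edges)
+    km = KMeans(n_clusters=n_states, n_init=10, random_state=0).fit(features)
+    return km.cluster_centers_, km.labels_, km.inertia_
+\end{verbatim}
+The elbow plot described above then simply plots this inertia against a range of values of $k$.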
% -Furthermore, some studies have relaxed the assumption that participants are assigned to a single state, and learn a weight vector over all basis states instead for each time step~\parencite{Leonardi2014}. +Furthermore, some studies have relaxed the assumption that participants are assigned to a single state and instead learn a weight vector over all basis states for each time step~\parencite{Leonardi2014}. Such an approach again blurs the boundary between the continuous modeling of brain activity and viewing the brain as a state-space system. The latter view is still debated~\parencite[see][for an excellent discussion on the potential and shortcomings of the brain states framework]{Keilholz2017}. @@ -96,15 +95,14 @@ \subsubsection{Brain state metrics} and state-specific metrics such as occupancy rate and dwell time (the relative time participants spend in each state). %% -\subsection{Graph theoretic analysis} -\label{subsec:graph-theoretic-analysis} +\subsection{Graph theoretic analysis}\label{subsec:graph-theoretic-analysis} %% Yet another common feature set extracted from \gls{tvfc} is to (mathematically) represent brain network connectivity as a graph and then compute graph theoretic metrics~\parencite{Sporns2011}. Such approaches can be used to study topological properties of brain networks and the properties of connectivity in the brain. In graph theory, brain \glspl{roi} are represented as \textbf{nodes}. The \textbf{edges} between brain regions can take many forms. -Throughout this thesis we will use the terminology of nodes and edges to describe brain regions and their respective connections. +As mentioned before, throughout this thesis we will use the terminology of nodes and edges to describe brain regions and their respective connections. It is common to use \gls{rs-fmri} and define edge strength as a given \gls{fc} metric (often, and in our case as well, correlation). Edges can be thresholded and binarized, but weighted graphs are also common. @@ -117,7 +115,7 @@ \subsection{Graph theoretic analysis} Such dynamic changes have had a significant impact on basic neuroscience in understanding concepts such as the dynamics of integration and segregation in the brain, including in understanding disorders such as depression~\parencite{Gong2015}. Some controversy remains in graph theoretic studies. -First of all, some graph metrics may not be applicable to the data at hand. +First, some graph metrics may not be applicable to the data at hand. Graph \emph{efficiency}, for example, assumes causal connections between nodes, and may not be valid in (undirected) graphs defined by \gls{fc}~\parencite{Chen2017}. Secondly, the interpretation of graph metrics often leaves much to speculation. @@ -131,24 +129,23 @@ \subsubsection{Dynamic graph theoretic metrics} In the simplest case these could be summary measures of graph metrics across time again. \textcite{Zalesky2014} demonstrated the synthesis of \gls{tvfc} with graph theory. Some dynamic graph studies have been run~\parencite{Bassett2011, Bassett2013}. -This is a relatively newer approach, but we may take inspiration from the field of temporal network theory~\parencite{Holme2012, Yu2015, Thompson2017}. +This is a newer approach, but we may take inspiration from the field of temporal network theory~\parencite{Holme2012, Yu2015, Thompson2017}.
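+As an illustration of this graph representation (a sketch based on the \texttt{networkx} library; the threshold and the particular metrics are arbitrary examples rather than the choices made later in this thesis):
+\begin{verbatim}
+import networkx as nx
+import numpy as np
+
+def graph_metrics(correlation_matrix, threshold=0.3):
+    """Binarize an FC correlation matrix and compute simple graph metrics."""
+    adjacency = (np.abs(correlation_matrix) > threshold).astype(int)
+    np.fill_diagonal(adjacency, 0)  # no self-connections
+    graph = nx.from_numpy_array(adjacency)
+    return {
+        "mean_degree": float(np.mean([degree for _, degree in graph.degree()])),
+        "global_efficiency": nx.global_efficiency(graph),
+        "average_clustering": nx.average_clustering(graph),
+    }
+\end{verbatim}
+The same function can be applied to every time step of a \gls{tvfc} estimate to obtain dynamic graph metrics.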
%% -\subsection{Feature extraction from models} -\label{subsec:model-features} +\subsection{Feature extraction from models}\label{subsec:model-features} %% Instead of merely describing or estimating covariance, such as the \gls{sw} approach, the \gls{wp} is a generative model. Like \gls{dcc}, it is a model-based approach. Thus, we have a full model for each scan and subject. -Assuming we have good model of the underlying process, we may be able to extract relevant process or subject features. +Assuming we have a good model of the underlying process, we may be able to extract relevant process or subject features. -An example would be the learned \gls{wp} kernel length scale $l$, see \cref{eq:matern}, which could be extracted from the trained models and compared between subjects and scan times. +An example would be the learned \gls{wp} kernel lengthscale $l$, see \cref{eq:matern}, which could be extracted from the trained models and compared between subjects and scan times. The reverse may also be interesting. -For example, in a similar Bayesian model approach, \textcite{Li2019a} defined the prior for their \gls{gp} length scale based on their expectation to find frequency dynamics close to the theta range (4-12 Hz). +For example, in a similar Bayesian model approach, \textcite{Li2019a} defined the prior for their \gls{gp} lengthscale based on their expectation to find frequency dynamics close to the theta range (4--12 Hz). % With this in mind, we can make model choices that explicitly return interpretable and cognition-relevant features. -However, \glspl{gp} (and related models) are not generally considered to be good at feature extraction, since we encode a lot of the data structure into the kernel choice. +However, \glspl{gp} (and related models) are not considered to be good at feature extraction, since we encode a lot of the data structure into the kernel choice. This leaves only several hyperparameters that dictate the data description. -Instead, deep learning models might be better feature extractors, although the flipside is that these typically require a lot of data. +Instead, deep learning models might be better feature extractors~\parencite[and have proven useful in other neuroscientific fields, see e.g.][]{Richards2019}, although these typically require a lot of data. Overall, we consider model-based approaches a promising direction for feature extraction, but one that requires much work still. diff --git a/ch/2_Robust_estimation_of_TVFC/5_The_benchmarking_framework.tex b/ch/2_Robust_estimation_of_TVFC/5_The_benchmarking_framework.tex index 1aef5bc..9556ff3 100644 --- a/ch/2_Robust_estimation_of_TVFC/5_The_benchmarking_framework.tex +++ b/ch/2_Robust_estimation_of_TVFC/5_The_benchmarking_framework.tex @@ -1,22 +1,21 @@ \clearpage -\section{The benchmarking framework} -\label{sec:benchmark-framework} +\section{The benchmarking framework}\label{sec:benchmark-framework} %%%%% \info[inline]{Paragraph: Introduce main concept of benchmarking.} -How do we compare methods in order to decide which one to use? +How do we compare methods and decide which one to use? We propose to take inspiration from the field of machine learning, which has extensive experience with such problems. % When a single optimization target or `learning task' is missing, it is common practice to define a suite of \emph{benchmarks}. Each benchmark frames method selection as a prediction task and competition~\parencite{Breiman2001, Shmueli2010, Bzdok2018, Khosla2019, Poldrack2020, Tejavibulya2022}. 
-Such benchmarks need to be uniquely domain-specific. -A collection of benchmarks then paints a rich picture about which methods are more sensitive or specific, what the failure modes of each method are, and ultimately lead to practical guidelines on which method should be used in a given real-life situation.\footnote{In many machine learning sub-fields benchmarks are framed as clearly defined targets, such as as image classification accuracy. A more apt comparison may be natural language processing (NLP), where optimization targets are not straightforward, and a \emph{range} of desired targets are evaluated per model~\parencite[see e.g.][]{Bommasani2021}.} +Such benchmarks need to be uniquely domain specific. +A collection of benchmarks then paints a rich picture about which methods are more sensitive or specific, what the failure modes of each method are, and ultimately leads to practical guidelines on which method should be used in each real-life situation.\footnote{In many machine learning sub-fields benchmarks are framed as clearly defined targets, such as image classification accuracy. A more apt comparison may be \gls{nlp}, where optimization targets are not straightforward, and a \emph{range} of desired targets are evaluated per model~\parencite[see e.g.][]{Bommasani2021}.} This is a live process, and insights and approaches are updated as time goes by. % In fact, this shift in focus toward \emph{predictive} methods is increasingly argued for in neuroscience, especially for translational work to clinic practice~\parencite{Yarkoni2017, Leenings2022, Voytek2022}. While the focus on benchmarks is relatively new in neuroimaging and psychology, it has existed in computer science and biomedical contexts for a long time,\footnote{In machine learning research, for example, the first large benchmark (ImageNet) was launched back in 2010~\parencite{Deng2009}. Image classification performance has dramatically increased since.} and we can learn a lot from their experiences~\parencite{Leenings2022}. % -One such lessons is that benchmarks are not a silver bullet. +One such lesson is that benchmarks are not a silver bullet. There are risks involved with a hyper-focus on benchmarks, such as often discussed in the case of machine learning research~\parencite{Wagstaff2012, Sculley2018}. This field has also been considered to have its fair share of a reproducibility crisis~\parencite[see also][]{Bell2021}. @@ -31,8 +30,7 @@ \section{The benchmarking framework} Despite the heterogeneity of benchmarks, most can be categorized into one of the following buckets. %% -\subsection{Simulations benchmarks} -\label{subsec:simulation-benchmarks} +\subsection{Simulations benchmarks}\label{subsec:simulation-benchmarks} %% \info[inline]{Paragraph: Describe general idea behind simulation benchmarks.} @@ -43,7 +41,7 @@ \subsection{Simulations benchmarks} Performance is measured as the difference between estimated \gls{tvfc} and ground truth values. Methods with lower \emph{reconstruction error} are then considered superior. % -This benchmark is the most commonly seen in literature. +This benchmark is the most common in the literature. In fact, \textcite{Thompson2018} already proposed a common framework to benchmark \gls{tvfc} methods on simulated data. % Furthermore, we use the simulations as motivating examples of why it is important to benchmark estimation methods.
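+A sketch of such a reconstruction error computation (the precise error metric is specified in the experimental chapters; a root mean squared error over the edges is used here purely as an illustration):
+\begin{verbatim}
+import numpy as np
+
+def reconstruction_error(estimated_tvfc, ground_truth_tvfc):
+    """RMSE between estimated and ground-truth TVFC, both of shape (N, D, D).
+
+    Only the upper-triangular (edge) entries are compared, since the
+    diagonal is equal to one by construction for correlation matrices.
+    """
+    n_nodes = ground_truth_tvfc.shape[-1]
+    iu = np.triu_indices(n_nodes, k=1)
+    difference = estimated_tvfc[:, iu[0], iu[1]] - ground_truth_tvfc[:, iu[0], iu[1]]
+    return float(np.sqrt(np.mean(difference**2)))
+\end{verbatim}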
@@ -69,7 +67,7 @@ \subsection{Simulations benchmarks} Strongly correlated predictions from methods may indicate that these methods capture similar aspects of the signal. The second tests for \emph{sensitivity} to changing covariance structure. The third tests for \emph{robustness} when the mean of the time series changes. -The fourth tests how well the methods can pick up on sudden changes (see also \cref{subsec:sudden-changes}). +The fourth tests how well the methods can detect sudden changes (see also \cref{subsec:sudden-changes}). Their simulations are constructed with autocorrelation effects, based on assumed structure in real \gls{fmri} data. Here we propose to achieve such structure by imposing real data on the simulated data. While we underwrite the philosophy behind this benchmarking framework, it still has its limitations. @@ -88,11 +86,11 @@ \subsection{Resting-state fMRI benchmarks} In fact, in a so-called `fingerprint' analysis, \textcite{Finn2015} showed that an individual's covariance structure or `connectome' is unique. \info[inline]{Paragraph: Describe idea behind subject measure prediction benchmarks.} -Typically the estimated \gls{tvfc} (or features derived from it) are viewed as extracted \emph{biomarkers}. +Typically, the estimated \gls{tvfc} (or features derived from it) are viewed as extracted \emph{biomarkers}. These are fed to a regressor or classifier to predict either non-clinical~\parencite[see e.g.][]{Taghia2017, Li2019a} or clinical~\parencite[see e.g.][]{Filippi2019, Du2021} subject measures and phenotypes. Methods which do better at these prediction tasks are then said to have preserved more useful information. % -In similar data-driven spirit, \textcite{Li2019a} argued that the controversial data preprocessing step of \gls{gsr} \emph{should} be included, since it increases the predictive power of subsequently extracted networks. +In a similar data-driven spirit, \textcite{Li2019a} argued that the controversial data preprocessing step of \gls{gsr} \emph{should} be included, since it increases the predictive power of subsequently extracted networks. \info[inline]{Paragraph: Introduce test-retest robustness studies and frame as benchmark.} Test-retest robustness studies~\parencite{Noble2019}, although usually not explicitly described as predictive tasks, can be viewed as looking at the predictive power of a first \gls{rs-fmri} scan to predict which subsequent scan belongs to the same subject~\parencite{Fiecas2013, Choe2017, Abrol2017, Zhang2018, Elliott2020}. @@ -103,8 +101,10 @@ \subsection{Resting-state fMRI benchmarks} Most \gls{tvfc} studies have used \gls{rs-fmri} data. Therefore, results will generalize better to practical use cases. % -A major advantage of using \gls{rs-fmri} over \gls{tb-fmri} is that the acquisition and preprocessing pipelines are easier to standardize and replicate (and thus more robust). -This allows for large scale, multi-site data collection collaborations that provide large, publicly available data sets for more robust benchmarking~\parencite[see e.g.][]{VanEssen2012, Allen2014b}. +A major advantage of using \gls{rs-fmri} over \gls{tb-fmri} is that the data acquisition and preprocessing pipeline designs can easily be standardized and replicated (and are thus more robust). +Moreover, the data can be used for a variety of purposes, which makes it more attractive for large-scale, multi-site data collection collaborations. 
+And indeed, such large data sets are readily available for more robust benchmarking~\parencite[see e.g.][]{VanEssen2012, Allen2014b}. +% A disadvantage of using \gls{rs-fmri} data is that there is no controlled stimulus influencing brain activity. It is typically not known what subjects were thinking about during the scan~\parencite[see also][]{Finn2021}. @@ -119,12 +119,14 @@ \subsection{Task-based fMRI benchmarks} \info[inline]{Paragraph: Describe advantages and disadvantages of this class of benchmarks.} These benchmarks are less common, due to the difficulty of collecting large amounts of standardized data. Furthermore, test-retest reliability has been found to be especially low in \gls{tb-fmri}~\parencite{Elliott2020}. -However, a promising trend is to record subjects watching a movie, which combines benefits of \gls{rs-fmri} benchmarks in the sense that it is easy to reproduce and standardize, with those of \gls{tb-fmri} benchmarks because we have some controlled stimuli guidance at known times that influence what subjects are experiencing and processing~\parencite{Eickhoff2020, Finn2021}. +A promising idea is to record subjects watching a movie, which combines benefits of \gls{rs-fmri} benchmarks in the sense that it is easy to reproduce and standardize, with those of \gls{tb-fmri} benchmarks because we have some controlled stimuli guidance at known times that influence what subjects are experiencing and processing~\parencite{Eickhoff2020, Finn2021}. +However, movie watching may make the data less multi-purpose and favor visual processing studies. +Furthermore, subjects may pay varying degrees of attention to the stimuli. +Those that pay little attention may exhibit higher activity in visual areas but would otherwise be like participants in \gls{rs-fmri} setups. We argue that \gls{tb-fmri} are especially powerful for benchmarking specific method characteristics. %% -\subsection{The imputation benchmark} -\label{subsec:imputation-benchmark} +\subsection{The imputation benchmark}\label{subsec:imputation-benchmark} %% \info[inline]{Paragraph: Describe general idea behind the imputation benchmark.} @@ -148,7 +150,7 @@ \subsection{The imputation benchmark} % Determining test location estimates is non-trivial due to the difference in nature of the \gls{tvfc} estimation methods considered. Since the \gls{wp} is not tied to a certain lattice as its training input or test output, predicting at unobserved data points follows naturally. -For all other approaches, we linearly interpolate all values of the covariance matrix element-wise between the two enclosing training locations. +For all other approaches, we linearly interpolate all values of the covariance matrix elementwise between the two enclosing training locations. \info[inline]{Paragraph: Final thought on this benchmark.} We apply and study this benchmark on all data sets in this thesis, including the simulated data sets. diff --git a/ch/2_Robust_estimation_of_TVFC/6_Discussion.tex b/ch/2_Robust_estimation_of_TVFC/6_Discussion.tex index 9296605..24c0398 100644 --- a/ch/2_Robust_estimation_of_TVFC/6_Discussion.tex +++ b/ch/2_Robust_estimation_of_TVFC/6_Discussion.tex @@ -1,14 +1,12 @@ \clearpage -\section{Discussion} -\label{sec:methods-discussion} +\section{Discussion}\label{sec:methods-discussion} %%%%% \info[inline]{Paragraph: Discuss relevant topics before starting the benchmarking chapter.} Here we briefly discuss any remaining issues and considerations regarding methods development. 
%% -\subsection{Model-based and data-driven methods} -\label{subsec:model-based-data-driven-methods} +\subsection{Model-based and data-driven methods}\label{subsec:model-based-data-driven-methods} %% \info[inline]{Paragraph: Discuss benefits of model-based approaches.} @@ -28,7 +26,7 @@ \subsubsection{Uncertainty and interpretability} %% The \gls{wp} model may lead to improved performance, flexibility, and interpretability. -Additionally, we can sample our covariance matrix at any point in time, and get uncertainty in our estimates (unlike the other methods discussed). +Additionally, we can sample our covariance matrix at any point in time and get uncertainty in our estimates (unlike the other methods discussed). The importance of uncertainty in covariance estimates has been discussed by~\textcite{Kudela2017}. Under a fully probabilistic model, it is also easier to do hypothesis tests whether there is any \gls{tvfc} present at all. @@ -39,7 +37,7 @@ \subsubsection{Artifacts and asynchronous data} The \gls{wp} approach allows a practitioner to drop out certain data points. This can be useful for neuroimaging data, as measurement artifacts can elegantly be left out. All artifacts and limitations of \gls{fmri} data are directly relevant to any \gls{tvfc} analysis as well~\parencite{Nalci2019}. -For example, outliers due to head motion can have a large impact on the signal~\parencite{Power2014, Power2015}. +For example, outliers due to head motion can have a significant impact on the signal~\parencite{Power2014, Power2015}. Current popular methods do not allow for this, and would typically need to interpolate the missing values. % Although it seems that we still need all $D$ points to be present for $Y_n$, this can be mitigated. @@ -53,8 +51,7 @@ \subsubsection{Artifacts and asynchronous data} This highlights the benefits of viewing preprocessing and analysis as a joint, interweaved process instead of two separate steps. %% -\subsection{Higher dimensions: Pairwise or joint modeling?} -\label{subsec:higher-dimensions} +\subsection{Higher dimensions: Pairwise or joint modeling?}\label{subsec:higher-dimensions} %% \info[inline]{Paragraph: Discuss the issue of pairwise or joint modeling.} @@ -65,7 +62,7 @@ \subsection{Higher dimensions: Pairwise or joint modeling?} % What is the difference between these two options? And how does this relate to multiple hypothesis testing? -First of all, \gls{sw} approaches are always pairwise. +First, \gls{sw} approaches are always pairwise. However, the window length and other global parameters operate on all edges. As a fair comparison to this standard approach, training the \gls{wp} and \gls{dcc} models in multivariate (joint) fashion is the fairest comparison. Even if we determine the window length from the data using \gls{sw-cv}, we have to choose whether to use the same window length for all edges, or a different one for each edge. @@ -81,15 +78,15 @@ \subsection{Higher dimensions: Pairwise or joint modeling?} \centering \includegraphics[width=\textwidth]{fig/sim/d3s/N0400_T0003/no_noise/periodic_1_correlations} \caption{ - Demonstration and motivation for training models in pairwise fashion when edges are characterized by radically different covariance structures. - } - \label{fig:pairwise-vs-joint} + Demonstration and motivation for training DCC models in pairwise fashion when edges are characterized by radically different covariance structures. 
+ Pairwise training allows model parameters to adapt to the unique underlying generative characteristics of the time series (dynamic in the first edge and static in the other two edges in this case). + }\label{fig:pairwise-vs-joint} \end{figure} \info[inline]{Paragraph: Discuss our approach to this issue.} -In a theoretical approach, in general, a model should expect to always find some pair that seems correlated when presented with a large number of pairs. -A multivariate model would not extract spurious structure if it only sees one odd case out of many, or at least would need even more evidence. +In theory, when presented with a large number of pairs, a model should expect to always find some pair that seems correlated. +A multivariate model will not extract spurious structure if it only sees one odd case out of many, or will at least need even more evidence. However, the pairwise approach would not have this inherent protection against reporting false positives. Of course, such issues could be resolved through careful post-estimation multiple comparison correction. % @@ -98,7 +95,7 @@ \subsection{Higher dimensions: Pairwise or joint modeling?} For example, \cref{fig:pairwise-vs-joint} shows \gls{tvfc} estimates for a toy example of $D = 3$ simulated time series, where the first two nodes exhibit a periodic covariance structure and the other edges are uncorrelated. The pairwise \gls{dcc} model predicts better than the jointly trained \gls{dcc} model, arguably because the covariance structure characteristics are radically different for the different edges. This seems to motivate the benefit of training \gls{dcc} in a pairwise fashion. -However, as we shall see later, in realistic scenarios these differences will not be as pronounced, and the two ways of training yield very similar estimates. +However, as we shall see later, in realistic scenarios these differences will not be as pronounced, and the two ways of training yield similar estimates. % Throughout this thesis we train all models in joint fashion, except \gls{dcc} which will be trained in both manners. The `-J' suffix will indicate a multivariate jointly trained model, where `-BL' will refer to bivariate (pairwise) loop training. @@ -129,7 +126,7 @@ \subsection{Beyond simple performance metrics} These include computational complexity, uncertainty modeling and estimation, robustness, ease of implementation, explainability, and flexibility in general. % These will be considered secondary (or \emph{auxiliary}) factors in our model comparison. -For example, it seems reasonable to want to sacrifice on performance slightly if it would get you many other attractive characteristics in return. +For example, it seems reasonable to want to sacrifice performance slightly if it would get you many other attractive characteristics in return. % We will return to this in \cref{ch:discussion}. @@ -141,10 +138,10 @@ \subsection{Other benchmarks} The families of benchmarks discussed in this chapter are not exhaustive. % Other benchmarks outside the scope of this thesis include predicting a concurrent modality. -For example, by using a simultaneous \gls{fmri}-\gls{eeg} data set~\parencite{Laufs2003}. +For example, by using a simultaneous \gls{fmri}--\gls{eeg} data set~\parencite{Laufs2003}. \textcite{Tagliazucchi2014} predicted sleep state from \gls{fmri} using such a concurrent data set. Other concurrent modalities are possible as well.
For example, \textcite{Matsui2016} simultaneously monitored neuronal calcium signals and \gls{fmri}.
%
-The problem with such data sets is often that they are hard to process and understand, and they may not make for practically useful benchmarking data sets at the moment.
+The problem with such data sets is often that they are hard to process and understand, and they may not make for useful benchmarking data sets at the moment.
This typically leads to small sample sizes too.
diff --git a/ch/3_Benchmarking_TVFC_estimation/0_Introduction.tex b/ch/3_Benchmarking_TVFC_estimation/0_Introduction.tex
index f086292..86b562a 100644
--- a/ch/3_Benchmarking_TVFC_estimation/0_Introduction.tex
+++ b/ch/3_Benchmarking_TVFC_estimation/0_Introduction.tex
@@ -1,5 +1,4 @@
-\chapter{Benchmarking TVFC estimation}
-\label{ch:benchmarking}
+\chapter{Benchmarking TVFC estimation}\label{ch:benchmarking}
%%%%%

\info[inline]{Paragraph: Introduce benchmarking chapter.}
diff --git a/ch/3_Benchmarking_TVFC_estimation/1_Material_and_methods.tex b/ch/3_Benchmarking_TVFC_estimation/1_Material_and_methods.tex
index 1e3ca0a..c4d14b8 100644
--- a/ch/3_Benchmarking_TVFC_estimation/1_Material_and_methods.tex
+++ b/ch/3_Benchmarking_TVFC_estimation/1_Material_and_methods.tex
@@ -1,13 +1,11 @@
\clearpage
-\section{Material and methods}
-\label{sec:simulations-methodology}
+\section{Material and methods}\label{sec:simulations-methodology}
%%%%%

In this section we describe the respective benchmarks and related methodology in detail.

%%
-\subsection{Simulations}
-\label{subsec:simulations-methods}
+\subsection{Simulations}\label{subsec:simulations-methods}
%%

We consider a range of simulation studies with various deterministic synthetic covariance structures.
@@ -15,15 +13,14 @@ \subsection{Simulations}
The synthetic covariance structures are designed both to explore edge cases and to mimic structures that we may expect to drive actual processes in the human brain.

%%
-\subsubsection{Synthetic covariance structures}
-\label{subsec:synthetic-covariance-structures}
+\subsubsection{Synthetic covariance structures}\label{subsec:synthetic-covariance-structures}
%%

Perhaps surprisingly, the covariance structures to be expected in real data are rarely discussed.
The coupling of \gls{bold} time series is, in fact, still a black box to a large degree.
In lieu of known covariance structure we test models against a battery of possible and reasonably exhaustive (synthetic) structures that may be encountered in an \gls{fmri} scan.
The covariance structures studied are shown in \cref{fig:synthetic-covariance-structures};
-null covariance (node time series are uncorrelated during the entire scan),
+null covariance (node time series are uncorrelated during the entire scan)\improvement{Discuss importance of adding null model},
a constant (i.e.~static) covariance of $\sigma_{ij} = 0.8$,
periodic covariance structures (a slowly oscillating sine wave defined by one period, and a fast one with three periods) that model transient changes in coupling,
a stepwise covariance that models two sudden (large) change points in covariance,
@@ -37,9 +34,9 @@ \subsubsection{Synthetic covariance structures}
\centering
\includegraphics[width=\textwidth]{fig/sim/covariance_structures}
\caption{
- Synthetic covariance structures considered in the simulations benchmark, capturing a wide range of edge cases and potentially realistic covariance structures.
- }
- \label{fig:synthetic-covariance-structures}
+ Synthetic covariance structures as a function of time considered in the simulations benchmark.
+ These capture a wide range of edge cases and potentially realistic underlying covariance structures that generate the observed node time series.
+ }\label{fig:synthetic-covariance-structures}
\end{figure}


@@ -59,18 +56,19 @@ \subsubsection{Data generation}
The number of time series ($D$) we expect to see in practice depends on the experimental design and research question at hand.
In most applications more than two nodes or components are studied.
-Some studies even consider voxel time series directly, in which case $D$ can be in the hundreds of thousands or even millions.
+Some studies even consider voxel time series directly, in which case $D$ can be in the hundreds of thousands.
Here, we test on pairwise (i.e.~\emph{bivariate}; $D = 2$) data, as well as on trivariate ($D = 3$) data sets.
-Ideally we want to study \gls{tvfc} estimation performance per method \emph{as a function of} dimensionality.
+Ideally, we want to study \gls{tvfc} estimation performance per method \emph{as a function of} dimensionality.
The trivariate case serves as an intermediate step toward scaling up to higher dimensions.

-All simulation experiments are run $T = 200$ times, and model performance is averaged across these trials.
+All simulation experiments are repeated $T = 200$ times to ensure robustness while balancing computational cost.
+Model performance is averaged across these trials.
The bivariate benchmark analyses are similar to the ones conducted by~\textcite{Lindquist2014}.
Furthermore, they will serve as a blueprint for when we analyze connectivity between just two brain regions.
It is important to note that methods based on \gls{sw} are always pairwise as well.
Thus, if a method robustly and consistently outperforms \gls{sw} on bivariate data, it will do so on higher dimensional data as well if we were to simply loop over all pairs of time series.

-Our covariance structure for generating bivariate data (\cref{eq:data-generation}) is given by
+The covariance structure in \cref{eq:data-generation} for generating bivariate data is given by
\begin{equation}
\mathbf{\Sigma}_n = \begin{bmatrix}
1 & \sigma(n) \\
@@ -99,7 +97,8 @@ \subsubsection{Data generation}
\end{equation}
where the covariance terms $\sigma(n)$ again vary across time as depicted in \cref{fig:synthetic-covariance-structures}.
This sparse version can be considered akin to the bivariate case with an added uncorrelated (i.e.~control) region.
-Perhaps we would expect such structure (most edges being stationary, but some showing time-varying structure) in \gls{fmri} data, but yet again lacking a ground truth makes this unclear.
+Perhaps we would expect such structure (most edges being stationary, but some showing time-varying structure) in \gls{fmri} data.
+Yet, lacking a ground truth again makes this unclear.
To keep the covariance matrices positive semi-definite, the allowed range for $\sigma(n)$ in the dense trivariate case is~$[-0.5,1]$ while for the sparse version it remains~$[-1,1]$.

We want our results to be robust across data set lengths.
@@ -109,10 +108,10 @@ \subsubsection{Data generation}
Therefore, we study the cases of $N \in \{120, 200, 400, 1200\}$ data points per time series.
We report results for $N = 400$, because this is the closest to the values of $N$ in \cref{ch:ukb} and thus most representative.
Results for other values of $N$ are relegated to \cref{appendix:more-benchmarking-results}.
-It is important to be aware of the typical trade-off where scans with higher \glspl{tr} consist of fewer data points, although these are usually observed with less noise.
+It is important to be aware of the typical trade-off where scans with higher \glspl{tr} consist of fewer data points, although the increase in number of slices per scanning volume can result in higher spatial resolution and reduce autocorrelation effects and other sources of noise~\parencite{Amaro2006, Iranpour2015, Yoo2018, McDowell2019}.
We also note that Bayesian methods typically perform better on smaller data sets.
Even though we strive to find a robust method that can be used in any setting, we allow for the possibility that some methods may work better with smaller but less noisy data sets while other methods may perform better with larger, noisy data sets.
-This style of analysis is reminiscient of the multiverse analysis~\parencite{Steegen2016}, as discussed in more detail in \cref{subsec:robustness}.
+This style of analysis is reminiscent of the multiverse analysis~\parencite{Steegen2016}, as discussed in more detail in \cref{subsec:robustness}.

%%
\subsubsection{Noise addition and hybrid simulations}
@@ -124,17 +123,18 @@ \subsubsection{Noise addition and hybrid simulations}
Taken together this makes \gls{fmri} data (in)famously noisy~\parencite[see also][for analysis and biophysical simulations of impact of noise and delay]{Deco2009}.
The noise is both spatially and temporally correlated.

-To make our benchmark more robust, all experiments are repeated on the data sets described priorly with added noise.
+To make our benchmark more robust, all experiments are repeated on the data sets described above, but with added noise.
Adding noise helps ensure that our conclusions generalize.
Higher levels of noise can also be interpreted as decreasing the amplitude of any existing covariance structure, requiring methods to be increasingly sensitive.
In an extensive literature review, \textcite{Welvaert2013} found \gls{snr} values in \gls{fmri} data to range between 0.35 and 203.6.
Higher \glspl{snr} are preferred in practice.
%
Here, all experiments are repeated with added noise with an \gls{snr} $\in \{1, 2, 6\}$.
-Lower \glspl{snr} showed all \gls{tvfc} estimation methods breaking down, and higher \glspl{snr} yielded equivalent results to the noiseless case.
+Empirically, we found that any lower \gls{snr} causes all \gls{tvfc} estimation methods to break down, and any higher \gls{snr} yields results equivalent to the noiseless case.

-We study two types of added noise.
-The simplest case (white noise) can be considered similar to thermal noise in \gls{fmri} scanners.
+Two types of added noise are studied.
+The simplest case (white noise) can be considered similar to thermal noise in \gls{fmri} scanners.
+\unsure{Do we still want to report the white noise results?}
Secondly, in the \emph{hybrid} simulations, we use an \gls{rs-fmri} data set to add noise to the synthetically generated activation data.
We use data from the \gls{hcp} data set as described in more detail in \cref{subsec:data-hcp}.
If we take time series from brain regions (or \gls{ica} components) from different subjects and add them to the synthetic signal, we can assume that no additional covariance structure is added.
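To make the data-generation and noise-addition procedure above concrete, the following is a minimal illustrative sketch (Python/NumPy). It is not the thesis code: the slowly oscillating (one-period) covariance term, the independent sampling at each time step, and the definition of SNR as the ratio of signal variance to noise variance are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

N = 400  # number of time points per time series
D = 2    # bivariate case

# Slowly oscillating covariance term sigma(n): one sine period over the scan.
n = np.arange(N)
sigma_n = 0.8 * np.sin(2 * np.pi * n / N)

# Sample y_n ~ N(0, Sigma_n) independently at each time step.
y = np.zeros((N, D))
for i in range(N):
    Sigma_n = np.array([[1.0, sigma_n[i]],
                        [sigma_n[i], 1.0]])
    y[i] = rng.multivariate_normal(mean=np.zeros(D), cov=Sigma_n)

# Add white noise at a target SNR, here taken as signal variance over noise variance.
target_snr = 2.0
noise_std = np.sqrt(y.var(axis=0) / target_snr)
y_noisy = y + rng.normal(scale=noise_std, size=(N, D))
```

In the hybrid variant described above, the white-noise draw would instead be replaced by (appropriately scaled) resting-state time series taken from different HCP subjects.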
@@ -186,7 +186,7 @@ \subsubsection{Evaluation and performance metrics} %% Comparing model estimates visually provides an initial overview of method performance and qualitative characteristics. -To \emph{quantify} performance we compute the \gls{rmse} between estimated and ground truth Pearson correlation, following~\textcite{Wilson2010, Lindquist2014}. +To \emph{quantify} performance, we compute the \gls{rmse} between estimated and ground truth Pearson correlation, following~\textcite{Wilson2010, Lindquist2014}. This can be considered a \emph{reconstruction} loss. Estimators with lower \gls{rmse} will be considered superior estimators. For the trivariate cases, we compute the \gls{rmse} across all elements of the full correlation matrices. @@ -195,8 +195,8 @@ \subsubsection{Evaluation and performance metrics} \subsubsection{Imputation benchmark} %% -We run the imputation benchmark as described in \cref{subsec:imputation-benchmark}. -Of course, it is unnecessary to do so, because we have the actual underlying ground truth. +The imputation benchmark is run as described in \cref{subsec:imputation-benchmark}. +Of course, it is unnecessary to do so because we have the actual underlying ground truth. However, this allows us to investigate the relationship between method performance on this imputation benchmark and its ability to recover an actual ground truth. %% @@ -212,59 +212,63 @@ \subsection{Resting-state fMRI} This analysis is more suited as demonstration rather than benchmark. %% -\subsubsection{Data: Human Connectome Project} -\label{subsec:data-hcp} +\subsubsection{Data: Human Connectome Project}\label{subsec:data-hcp} %% -The \gls{hcp} has collected a large, publicly available data set. -It is one of the largest and most commonly used data sets in neuroimaging research~\parencite{VanEssen2012, VanEssen2013, Elam2021}. +The \gls{hcp} has collected a large, publicly available data set collected in 2012--2015.\footnote{Data: \url{https://www.humanconnectome.org}} +It is one of the largest and most used data sets in neuroimaging research~\parencite{VanEssen2012, VanEssen2013, Elam2021}. It is important to use publicly available and transparently collected data to benchmark methods. -Participants in this data set are relatively young adults, aged 22-37. +Participants in this data set are relatively young adults (twins and non-twin siblings), aged 22--35. The data collection and collaboration of this project were accompanied by fundamental technological advances underlying scan collection. This included advances in the domains of accelerated multiband and multiplexed \gls{epi} pulse sequences~\parencite{Moeller2010, Feinberg2010, Setsompop2012, Xu2012}. +The data is accompanied by a range of behavioral and non-behavioral subject measures (many of these collected through the NIH toolbox). More specifically, we use data from the \gls{hcp} S1200 public release (published in March 2017). +This (final) release contains 1,113 subjects with \gls{mri} data. +Of these, 1,003 subjects have complete (i.e.~four scans) \gls{fmri} data available. +However, only 812 subjects with improved \gls{fmri} image reconstruction (\texttt{recon2}) are available. +We use data from these 812 subjects only. All scans are acquired with a 3T Siemens connectome-Skyra using a multi-band sequence. +% The readily available $D = 15$ dimensional \gls{ica} based node time series are used. \Gls{ica} is a technique that decomposes observed (voxel) time series into a set of independent components. 
The original individual observed time series can then be reconstructed as a mix of these components. Each of these \gls{ica} time series thus represents a significant \emph{component} of whole-brain activity. -Other dimensionalities $D \in \{50,100,200,300\}$ are also readily available. -We opted to study $D = 15$ time series, representative of the number of time series studied in many cognitive studies. -For example, our depression study in \cref{ch:ukb} studies both nine and three time series. +Other dimensionalities $D \in \{25,50,100,200,300\}$ are also readily available. +However, we opted to study $D = 15$ time series, as this number of time series is most representative of the number studied in many cognitive studies, including the ones in this thesis. +For example, the depression study in \cref{ch:ukb} studies both nine and three time series. % For interpretability, each \gls{ica} component is mapped to a well-known \gls{fn}, following \textcite{Giorgio2018}. We asked three neuroimaging researchers to visually inspect and manually assign each \gls{ica} component to one of the networks from \cref{fig:brainmap-functional-networks}; networks that resemble task-driven \glspl{fn}~\parencite{Fox2007, Smith2009}. -% Remove a), b), etc. +% Remove (a), (b), etc. \captionsetup[subfigure]{labelformat=empty} \begin{figure}[t] \centering - \subcaptionbox{1: Visual (Medial) \label{fig:rsn-1}}{\includegraphics[width=0.38\textwidth]{fig/hcp/RSN_components/RSN_00}} + \subcaptionbox{1: Visual (Medial)\label{fig:rsn-1}}{\includegraphics[width=0.38\textwidth]{fig/hcp/RSN_components/RSN_00}} \hspace{0.08\textwidth} - \subcaptionbox{2: Visual (Occipital, Cognition–Language–Orthography) \label{fig:rsn-2}}{\includegraphics[width=0.38\textwidth]{fig/hcp/RSN_components/RSN_01}} - \subcaptionbox{3: Visual (Lateral, Cognition–Space) \label{fig:rsn-3}}{\includegraphics[width=0.38\textwidth]{fig/hcp/RSN_components/RSN_02}} + \subcaptionbox{2: Visual (Occipital, Cognition-Language-Orthography)\label{fig:rsn-2}}{\includegraphics[width=0.38\textwidth]{fig/hcp/RSN_components/RSN_01}} + \subcaptionbox{3: Visual (Lateral, Cognition-Space)\label{fig:rsn-3}}{\includegraphics[width=0.38\textwidth]{fig/hcp/RSN_components/RSN_02}} \hspace{0.08\textwidth} - \subcaptionbox{4: Default Mode Network (DMN) \label{fig:rsn-4}}{\includegraphics[width=0.38\textwidth]{fig/hcp/RSN_components/RSN_03}} - \subcaptionbox{5: Cerebellum (CBM) \label{fig:rsn-5}}{\includegraphics[width=0.38\textwidth]{fig/hcp/RSN_components/RSN_04}} + \subcaptionbox{4: Default Mode Network (DMN)\label{fig:rsn-4}}{\includegraphics[width=0.38\textwidth]{fig/hcp/RSN_components/RSN_03}} + \subcaptionbox{5: Cerebellum (CBM)\label{fig:rsn-5}}{\includegraphics[width=0.38\textwidth]{fig/hcp/RSN_components/RSN_04}} \hspace{0.08\textwidth} - \subcaptionbox{6: Sensorimotor (SM) \label{fig:rsn-6}}{\includegraphics[width=0.38\textwidth]{fig/hcp/RSN_components/RSN_05}} - \subcaptionbox{7: Auditory (AUD) \label{fig:rsn-7}}{\includegraphics[width=0.38\textwidth]{fig/hcp/RSN_components/RSN_06}} + \subcaptionbox{6: Sensorimotor (SM)\label{fig:rsn-6}}{\includegraphics[width=0.38\textwidth]{fig/hcp/RSN_components/RSN_05}} + \subcaptionbox{7: Auditory (AUD)\label{fig:rsn-7}}{\includegraphics[width=0.38\textwidth]{fig/hcp/RSN_components/RSN_06}} \hspace{0.08\textwidth} - \subcaptionbox{8: Executive Control (EC) \label{fig:rsn-8}}{\includegraphics[width=0.38\textwidth]{fig/hcp/RSN_components/RSN_07}} - \subcaptionbox{9: Frontoparietal (Perception-Somesthesis-Pain) 
\label{fig:rsn-9}}{\includegraphics[width=0.38\textwidth]{fig/hcp/RSN_components/RSN_08}} + \subcaptionbox{8: Executive Control (EC)\label{fig:rsn-8}}{\includegraphics[width=0.38\textwidth]{fig/hcp/RSN_components/RSN_07}} + \subcaptionbox{9: Frontoparietal (Perception-Somesthesis-Pain)\label{fig:rsn-9}}{\includegraphics[width=0.38\textwidth]{fig/hcp/RSN_components/RSN_08}} \hspace{0.08\textwidth} - \subcaptionbox{10: Frontoparietal (Cognition-Language) \label{fig:rsn-10}}{\includegraphics[width=0.38\textwidth]{fig/hcp/RSN_components/RSN_09}} + \subcaptionbox{10: Frontoparietal (Cognition-Language)\label{fig:rsn-10}}{\includegraphics[width=0.38\textwidth]{fig/hcp/RSN_components/RSN_09}} \caption{ Functional networks from \textcite{Smith2009}. - } - \label{fig:brainmap-functional-networks} + }\label{fig:brainmap-functional-networks} \end{figure} -% Add back a), b), etc. -\captionsetup[subfigure]{labelformat=simple, labelsep=colon} % or parens? +% Add back (a), (b), etc. +\captionsetup[subfigure]{labelformat=parens} Each subject undergoes four scans in total, divided into two consecutive scans on two separate days. @@ -273,13 +277,13 @@ \subsubsection{Data: Human Connectome Project} Scans were acquired with a \gls{tr} of 0.72 seconds and voxel size of 2~mm isotropic. Each scan contains $N = 1200$ images and is 15 minutes long. % -Data was preprocessed by the \gls{hcp} team according to the (minimal) preprocessing pipeline using \gls{fsl}~\parencite{Jenkinson2012} and FreeSurfer~\parencite{Fischl2012}. -The pipeline is described in more detail in \textcite{Jenkinson2002, Glasser2013, Smith2013a, Smith2013b}. -Importantly, noise components have been filtered out during preprocessing using ICA+FIX~\parencite{Salimi2014, Griffanti2014}. +Data was preprocessed by the \gls{hcp} team according to \textcite{Smith2013a} with a (minimal) preprocessing pipeline using \gls{fsl}~\parencite{Jenkinson2012} and FreeSurfer~\parencite{Fischl2012}. +The pipeline is described in more detail in \textcite{Jenkinson2002, Glasser2013, Smith2013b}. +Importantly, noise components have been filtered out during preprocessing using \gls{ica}+FIX~\parencite{Salimi2014, Griffanti2014}. This ensured that none of the components we use can be considered as a noise component. -All time series were also demeaned and variance normalized~\parencite{Beckmann2004}. +All time series were also demeaned, with variance normalized~\parencite{Beckmann2004}. All \gls{ica} components for each subject and for each scan were individually standardized to have zero mean and unit variance. -For more data preprocessing details we refer the reader to the \gls{hcp} release manual. +For more data preprocessing details, we refer the reader to the \gls{hcp} release manual. %% \subsubsection{Subject measure prediction benchmark} @@ -291,7 +295,7 @@ \subsubsection{Subject measure prediction benchmark} Furthermore, subject measure prediction is similar to many practical use cases of \gls{tvfc} estimates, for example when used in clinical contexts for disease diagnosis and prognosis. % The idea here is as follows. -If a method's estimated covariance structure has more predictive power than another's, it can be said it has extracted and preserved more relevant information (for that particular prediction task at hand). +If a method's estimated covariance structure has more predictive power than another's, it can be said it has extracted and preserved more relevant information (for the prediction task at hand). 
It is likely that some methods preserve more relevant information for some tasks, yet other methods preserve more for other tasks.
However, we can look at the average across many subject measures, or we may suggest which method to use after selecting a subject measure of interest.
As such, this benchmarking can be considered a `profiling' or `cataloging' exercise.
@@ -313,7 +317,7 @@ \subsubsection{Subject measure prediction benchmark}
\end{equation}
where $\mathbf{y}$ is a column vector of length equal to the number of subjects studied, containing some quantitative subject measure (e.g.~age), $\mathbf{X}$ is the design matrix of confounding variables (a.k.a.~covariates or nuisance variables) weighted by~$\mathbf{\beta}$, $\mathbf{a}\sim \mathcal{N}(\mathbf{0}, \sigma_a^2 \mathbf{K}_a)$ is a random effects vector, and $\mathbf{\epsilon} \sim \mathcal{N}(\mathbf{0}, \sigma_e^2)$ is a noise vector.
We call the symmetric matrix $\mathbf{K}_a$ the \gls{tvfc} similarity matrix (in contrast with the original paper, where it is called the \emph{anatomical} similarity matrix).
-Each value in this matrix encodes how globally `similar' two particular subjects are.
+Each value in this matrix encodes how globally `similar' two subjects are.
It is still our choice how to define similarity in this case based on \gls{tvfc} estimates.
More specifically then, morphometricity is computed as
\begin{equation}
@@ -338,10 +342,10 @@ \subsubsection{Subject measure prediction benchmark}
We repeat the analysis for a range of subject measures.
%
In fact, another study has run a similar analysis to benchmark \gls{fmri} data preprocessing.
-In this work, \textcite{Li2019a} studied the predictive power of \gls{sfc} estimates after both including or excluding a crucial preprocessing pipeline researcher choice: whether to include \gls{gsr} or not.\footnote{The respective authors also implemented a regression on top of this, and find that the predictive power and the amount of variance explained are highly correlated concepts.}
+In this work, \textcite{Li2019a} studied the predictive power of \gls{sfc} estimates after both including and excluding a crucial preprocessing pipeline researcher choice: whether to include \gls{gsr} or not.\footnote{The respective authors also implemented a regression on top of this. They found that the predictive power and the amount of variance explained are highly correlated concepts.}
They conclude that \gls{gsr} significantly increases the predictive power of \gls{sfc} estimates.
To allow for comparison and because they capture a wide range of individual differences, we study the exact same behavioral subject measures as they did: age, gender, and a range of cognitive task scores.
-These subjects measures were investigated in a related study as well~\parencite{Kong2019}, where it was found that not only \gls{sfc} but also network topography has predictive power of such measures.
+These subject measures were investigated in a related study as well~\parencite{Kong2019}, where it was found that not only \gls{sfc} but also network topography has predictive power for such measures.
%
Age and gender are included as nuisance variables in our model, that is, $\mathbf{X}$ in \cref{eq:lme}.
Some subject measures were missing from the raw data.
@@ -368,7 +372,7 @@ \subsubsection{Test-retest robustness benchmark}
Therefore, we judged it sufficient to only consider data from the \gls{hcp}.
Their main finding was that the \gls{dcc} more reliably predicts \gls{tvfc} variance across sessions compared to the \gls{sw} methods. -Our study differs in a number of ways. +Our study differs in several ways. We only replicate these experiments for an updated and larger version of the \gls{hcp} data. We include our additional models (i.e.~the \gls{svwp} and \gls{sw-cv}) as well as the additional rate-of-change \gls{tvfc} summary measure. Moreover, we study the $D = 15$ case instead of their $D = 50$ case. @@ -386,10 +390,10 @@ \subsubsection{Test-retest robustness benchmark} In fact, several versions of \gls{icc} computation exist, and some controversies persist. For example, \textcite{Noble2019} argued that \gls{icc}(3,1) should not be used in general, while \textcite{Chen2018} argued it can be. We have opted to implement the most commonly used method in neuroimaging: \gls{icc}(2,1). -Scores are computed using the open-source Python package \texttt{pingouin}~\parencite[][version 0.5.2]{Vallat2018}. -Empirically we did not find large differences between \gls{icc}(1,1), \gls{icc}(2,1), and \gls{icc}(3,1) scores. +Scores are computed using the open-source Python package \texttt{pingouin}~\parencite[][version 0.5.3]{Vallat2018}. +Empirically, we did not find significant differences between \gls{icc}(1,1), \gls{icc}(2,1), and \gls{icc}(3,1) scores. -Furthermore, again following \textcite{Choe2017}, we compute I2C2 scores, which can be seen as a single omnibus representation for the whole brain~\parencite{Shou2013}. +Furthermore, again following \textcite{Choe2017}, we compute image intraclass correlation coefficient (I2C2) scores, which can be seen as a single omnibus representation for the whole brain~\parencite{Shou2013}. This I2C2 score is computed as \begin{equation} I2C2 = \frac{tr(\mathbf{K}_X)}{tr(\mathbf{K}_X + \mathbf{K}_U)}, @@ -426,8 +430,7 @@ \subsubsection{Implementational details} In terms of \gls{wp} models, we only run the \gls{svwp} model here with $M = 200$ inducing points, due to the large number of volumes per scan ($N = 1200$). %% -\subsection{Task-based fMRI} -\label{subsec:rockland-methodology} +\subsection{Task-based fMRI}\label{subsec:rockland-methodology} %% \info[inline]{Paragraph: Introduce our tb-fMRI-based benchmark.} @@ -447,8 +450,7 @@ \subsection{Task-based fMRI} We treat them as if we are unaware of the task paradigm. %% -\subsubsection{Data: Rockland visual task} -\label{subsec:rockland-data} +\subsubsection{Data: Rockland visual task}\label{subsec:rockland-data} %% \info[inline]{Paragraph: Describe Rockland data.} @@ -463,7 +465,7 @@ \subsubsection{Data: Rockland visual task} \info[inline]{Paragraph: Describe Rockland data preprocessing.} Data preprocessing is run in \gls{fsl}~\parencite{Jenkinson2012}. -After brain extraction and field-of-view fixation, data was registered, motion corrected (i.e.~inter-slice corrected), smoothed with a Gaussian kernel with a \gls{fwhm} of 5~mm, and then detrended using a high-pass filter at 0.01~Hz. +After brain extraction and field-of-view fixation, data was registered, motion corrected (i.e.~inter-slice corrected), smoothed with a Gaussian kernel with a \gls{fwhm} of 5~mm, and then detrended using a high-pass filter at~0.01~Hz. No initial volumes are left out of the scan. Finally, we de-noised the data using ICA-AROMA~\parencite{Pruim2015a, Pruim2015b}. This is a method for Automatic Removal Of Motion Artifacts based on \gls{ica}. 
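As an aside relating back to the test-retest reliability scores described earlier in this section: ICC(2,1) values can be obtained from the `pingouin` package roughly as sketched below. The long-format table layout (subjects as targets, the four scans as raters, an edgewise TVFC summary measure as ratings), the column names, and the simulated values are illustrative assumptions, not the thesis implementation.

```python
import numpy as np
import pandas as pd
import pingouin as pg

rng = np.random.default_rng(0)

# Hypothetical long-format table: one TVFC summary value (e.g. the TVFC mean of a
# given edge) per subject and per scan (1A, 1B, 2A, 2B).
n_subjects = 50
scans = ["1A", "1B", "2A", "2B"]
subject_effect = rng.normal(size=n_subjects)                 # stable between-subject differences
scan_noise = 0.5 * rng.normal(size=n_subjects * len(scans))  # scan-to-scan variability
df = pd.DataFrame({
    "subject": np.repeat(np.arange(n_subjects), len(scans)),
    "scan": np.tile(scans, n_subjects),
    "tvfc_mean": np.repeat(subject_effect, len(scans)) + scan_noise,
})

# pingouin reports all common ICC variants; ICC(2,1) corresponds to the 'ICC2' row
# (two-way random effects, single rater/measurement).
icc = pg.intraclass_corr(data=df, targets="subject", raters="scan", ratings="tvfc_mean")
icc21 = icc.loc[icc["Type"] == "ICC2", "ICC"].item()
print(f"ICC(2,1) = {icc21:.3f}")
```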
@@ -479,7 +481,7 @@ \subsubsection{Data: Rockland visual task} Individual node time series are noisy. However, since we have an external cue that drives \gls{bold} signal measurements, we can (unlike with \gls{rs-fmri}) align all time series across subjects. These average node time series are shown in \cref{fig:rockland-time-series-mean-over-subjects}. -Generally the \gls{bold} measurements in the visual regions track the external task well. +Generally, the \gls{bold} measurements in the visual regions track the external task well. As expected, correspondence to the external stimulus decreases as we move up and away from the visual processing hierarchy. In \gls{mpfc} and \gls{m1} regions, we see slight bumps in activity with stimulus on- and offset. @@ -490,8 +492,8 @@ \subsubsection{Data: Rockland visual task} \caption{ Rockland data normalized BOLD time series averaged over all 286 subjects. External visual stimulus convolved with HRF is shown for reference. - } - \label{fig:rockland-time-series-mean-over-subjects} + As expected, correspondence to the external stimulus decreases as we move up and away from the visual processing hierarchy. +}\label{fig:rockland-time-series-mean-over-subjects} \end{figure} @@ -511,13 +513,13 @@ \subsubsection{External stimulus prediction} \includegraphics[width=0.6\textwidth]{fig/rockland/CHECKERBOARD645/prediction_benchmark/design_matrix_nilearn} \caption{ Rockland benchmark design matrix $\mathbf{X}$ used for predicting external stimulus. - } - \label{fig:rockland-design-matrix} + The matrix is used by a GLM to determine how well the stimulus can be reconstructed from TVFC estimates. + }\label{fig:rockland-design-matrix} \end{figure} -We frame our prediction task as predicting the presence of the external visual stimulus. -As we are in the domain of \gls{tb-fmri}, we can use a well-established technique based on the \gls{glm} and its \textbf{design matrix}. +The prediction task is framed as predicting the presence of the external visual stimulus. +We use a well-established technique in the domain of \gls{tb-fmri} based on the \gls{glm} and its \textbf{design matrix}. The \gls{glm} is defined as \begin{equation} \mathbf{Y} = \mathbf{X} \mathbf{\beta} + \mathbf{E}, @@ -526,12 +528,13 @@ \subsubsection{External stimulus prediction} % The regressors in $\mathbf{X}$ are hypothesized contributors to the experiment. They can be divided into `experimental' regressors (those of interest) and `nuisance' regressors (those we expect to affect the measurements but are not of interest, such as motion parameters or physiological signals as heart rate). -Our experimental regressors are the box-car (block design) model of the visual stimulus, convolved with the \gls{hrf}~\parencite{Glover1999}. +Our experimental regressors are the boxcar (block design) model of the visual stimulus, convolved with the \gls{hrf}~\parencite{Glover1999}. Our nuisance regressors are first-order polynomials. The polynomial (drift) order was determined as \gls{tr} multiplied by~$\frac{N}{150}$~\parencite[see][for rationale]{Worsley2002}. Effectively this reduces to allowing the model to \emph{detrend} the observed \gls{tvfc} estimates (this includes a constant value as well to remove the \emph{offset}). -Adding these can be considered similar to actively detrending the \gls{tvfc} estimates (e.g.~using bandpass filtering), but adding them to the design matrix allows the model to learn the relevance automatically. 
-This matrix is generated and plotted using the open-source Python library \texttt{Nilearn}~\parencite[][version~0.9.2]{Abraham2014}, which in turn is built upon \texttt{scikit-learn}~\parencite[][version~1.1.1]{Pedregosa2011} and \texttt{SciPy}~\parencite[][version~1.8.0]{SciPy2020}. +Adding these can be considered similar to actively detrending the \gls{tvfc} estimates (e.g.~using bandpass filtering). +However, adding them to the design matrix allows the model to learn the relevance automatically. +This matrix is generated and plotted using the open-source Python library \texttt{Nilearn}~\parencite[][version~0.9.2]{Abraham2014}, which in turn is built upon \texttt{scikit-learn}~\parencite[][version~1.2.1]{Pedregosa2011} and \texttt{SciPy}~\parencite[][version~1.10.0]{SciPy2020}. We infer $\mathbf{\beta}$ from the observations using \gls{ols} to minimize the error terms~$\mathbf{E}$~\parencite{Worsley1995}. These $\beta$ parameters can then be used to determine which \gls{tvfc} estimates have captured the presence of the external visual stimulus best. Entries in this regressor matrix indicate relative contributions of regressors in predicting $\mathbf{Y}$. @@ -540,4 +543,4 @@ \subsubsection{External stimulus prediction} \subsubsection{Implementational details} %% -We only train the \gls{vwp} here, as $N$ is reasonably small. +Only the \gls{vwp} model is trained here, as $N$ is reasonably small. diff --git a/ch/3_Benchmarking_TVFC_estimation/2_Results.tex b/ch/3_Benchmarking_TVFC_estimation/2_Results.tex index fa7dd87..f866f71 100644 --- a/ch/3_Benchmarking_TVFC_estimation/2_Results.tex +++ b/ch/3_Benchmarking_TVFC_estimation/2_Results.tex @@ -1,13 +1,11 @@ \clearpage -\section{Results} -\label{sec:benchmarking-results} +\section{Results}\label{sec:benchmarking-results} %%%%% Here we discuss the results of all benchmarking efforts. %% -\subsection{Simulations} -\label{sec:simulations-results} +\subsection{Simulations}\label{sec:simulations-results} %% %% @@ -26,8 +24,8 @@ \subsubsection{Data-driven hyperparameter optimization} \caption{ Simulations benchmark optimal cross-validated window lengths learned from bivariate ($D = 2$) noiseless data for $N = 400$. Each dot represents one of $T = 200$ trials. - } - \label{fig:sim-optimal-window-lengths} + Faster changing covariance structures result in shorter learned window lengths. + }\label{fig:sim-optimal-window-lengths} \end{figure} @@ -48,8 +46,8 @@ \subsubsection{Data-driven hyperparameter optimization} \caption{ Simulations benchmark SVWP kernel lengthscales $l$ learned from bivariate ($D = 2$) noiseless data for $N = 400$. Each dot represents one of $T = 200$ trials. - } - \label{fig:sim-learned-kernel-lengthscales} + Faster changing covariance structures result in shorter learned kernel lengthscales. 
+ }\label{fig:sim-learned-kernel-lengthscales}
\end{figure}


@@ -75,13 +73,13 @@ \subsubsection{Bivariate case}

\begin{figure}[t]
\centering
-\subcaptionbox{Without noise \label{fig:sim-results-bivariate-no-noise-all-covariance-structures-tvfc-predictions}}{\includegraphics[width=0.48\textwidth]{fig/sim/d2/N0400_T0200/no_noise/all_covs_types_correlations}}
-\subcaptionbox{With rs-fMRI noise (hybrid) \label{fig:sim-results-bivariate-HCP-noise-all-covariance-structures-tvfc-predictions}}{\includegraphics[width=0.48\textwidth]{fig/sim/d2/N0400_T0200/HCP_noise_snr_2/all_covs_types_correlations}}
+\subcaptionbox{Without noise\label{fig:sim-results-bivariate-no-noise-all-covariance-structures-tvfc-predictions}}{\includegraphics[width=0.48\textwidth]{fig/sim/d2/N0400_T0200/no_noise/all_covs_types_correlations}}
+\subcaptionbox{With rs-fMRI noise (hybrid)\label{fig:sim-results-bivariate-HCP-noise-all-covariance-structures-tvfc-predictions}}{\includegraphics[width=0.48\textwidth]{fig/sim/d2/N0400_T0200/HCP_noise_snr_2/all_covs_types_correlations}}
\caption{
-Simulations benchmark single trial TVFC estimates for all covariance structures, for \textbf{bivariate} data ($D = 2$) for $N = 400$.
+Simulations benchmark single trial TVFC estimates for all covariance structures, for bivariate data ($D = 2$) for $N = 400$.
Ground truth (GT) is included for reference.
-}
-\label{fig:sim-results-tvfc-estimates-example}
+Estimation methods have distinct failure modes.
+}\label{fig:sim-results-tvfc-estimates-example}
\end{figure}


@@ -92,25 +90,26 @@ \subsubsection{Bivariate case}
The \gls{dcc} and \gls{svwp} models remain close to the ground truth.

For the two periodic covariance structures, the \gls{sfc} estimate is not sensitive to this structure, unlike for the null and constant covariance structure data.
-\gls{sw-cv} performs well and is able to pick the correct window length in both cases.
+\gls{sw-cv} performs well and picks the correct window length in both cases.
The \gls{mgarch} model captures the structure as well, although spurious sudden large jumps are seen.
We see that the \gls{svwp} can model smooth changes in covariance structure well (indeed, this is perhaps where it excels).
From visual inspection it seems to return the best fit.
It is worth pointing out as well that the \gls{sw-cv} estimates at all time steps fall within the uncertainty bounds of the \gls{svwp} estimates.

For the stepwise data, we can see that the \gls{svwp} estimates are too smooth and perform poorly during sudden changes in covariance structure.
-This is an expected failure mode, as this model expect slowly-changing covariance.
-Moreover, the `static' part of the time series require a long lengthscales, whereas the sudden jumps would require a very short lengthscales.
+This is an expected failure mode, as this model expects slowly changing covariance.
+Moreover, the `static' part of the time series requires a longer lengthscale, whereas the sudden jumps would require a shorter one.
The \gls{sw-cv} estimates suffer from a similar problem, where it is not clear what window length would be optimal here.
-The \gls{dcc} method deals well with the change points, but shows a lot of jumps in the middle of the time series.
+The \gls{dcc} method deals well with the change points but shows a lot of jumps in the middle of the time series.
As they are autoregressive models, they do not work well for the beginning of the time series either.
-This could be mitigated, however, by removing the first several volumes from the analysis, which is common practice in \gls{fmri} studies (these volumes are often considered unreliable because the participant and scanner are `settling in' during these).
-Crucially, these plots show that estimates are very different in nature, and without knowing our eventual use case it is hard to say which estimates are superior here.
+This could be mitigated, however, by removing the first several volumes from the analysis, which is common practice in \gls{fmri} studies (usually no more than six).
+These first volumes are often considered unreliable because the participant and scanner are `settling in' during this initial period.
+Crucially, these plots show that estimates are vastly different in nature, and without knowing our eventual use case it is hard to say which estimates are superior here.

All methods fail to pick up the state transition covariance structure, although in different ways.
The \gls{svwp} predicts as if it were static (with a large uncertainty bound), and \gls{dcc} and \gls{sw} estimates jump around a little, sometimes seemingly capturing some of the structure.
-Interestingly, these estimates look similar to the ones made for the static covariance cases.
-So if we would have not known any ground truth, it would have been unclear if the underlying process was static or showed state transitions as simulated here.
+Interestingly, these estimates look like the ones made for the static covariance cases.
+So, had we not known any ground truth, it would have been unclear whether the underlying process was static or showed state transitions as simulated here.
The fact that none of the methods can pick up on this structure also raises concern about whether we need completely different approaches if such underlying structure is to be expected in real data.
How subtle do we expect changes in \gls{tvfc} to be with shifting cognitive state?
How many such changes do we expect per minute?
@@ -123,20 +122,19 @@ \subsubsection{Bivariate case}

\begin{figure}[ht]
\centering
-\subcaptionbox{Bivariate case ($D = 2$) \label{fig:sim-results-d2-HCP-noise-all-correlation-RMSE}}{\includegraphics[width=0.84\textwidth]{fig/sim/d2/N0400_T0200/HCP_noise_snr_2/correlation_RMSE}}
-\subcaptionbox{Trivariate dense case ($D = 3$) \label{fig:sim-results-d3d-HCP-noise-all-correlation-matrix-RMSE}}{\includegraphics[width=0.84\textwidth]{fig/sim/d3d/N0200_T0200/HCP_noise_snr_2/correlation_matrix_RMSE}}
-\subcaptionbox{Trivariate sparse case ($D = 3$) \label{fig:sim-results-d3s-HCP-noise-all-correlation-matrix-RMSE}}{\includegraphics[width=0.84\textwidth]{fig/sim/d3s/N0200_T0200/HCP_noise_snr_2/correlation_matrix_RMSE}}
+\subcaptionbox{Bivariate case ($D = 2$)\label{fig:sim-results-d2-HCP-noise-all-correlation-RMSE}}{\includegraphics[width=0.84\textwidth]{fig/sim/d2/N0400_T0200/HCP_noise_snr_2/correlation_RMSE}}
+\subcaptionbox{Trivariate dense case ($D = 3$)\label{fig:sim-results-d3d-HCP-noise-all-correlation-matrix-RMSE}}{\includegraphics[width=0.84\textwidth]{fig/sim/d3d/N0200_T0200/HCP_noise_snr_2/correlation_matrix_RMSE}}
+\subcaptionbox{Trivariate sparse case ($D = 3$)\label{fig:sim-results-d3s-HCP-noise-all-correlation-matrix-RMSE}}{\includegraphics[width=0.84\textwidth]{fig/sim/d3s/N0200_T0200/HCP_noise_snr_2/correlation_matrix_RMSE}}
\caption{
Simulations benchmark RMSE between model TVFC estimates and ground truth on all bivariate and trivariate covariance structures with added rs-fMRI noise (SNR of 2) for $N = 400$.
Means and standard deviations are shown across $T = 200$ trials. - } - \label{fig:sim-results-HCP-noise-all} + }\label{fig:sim-results-HCP-noise-all} \end{figure} -Although the visual inspection of this single trial gives us intuition for model performance and failure modes, in order to compare methods robustly, we need to \emph{quantify} model performance. +Although the visual inspection of this single trial gives us intuition for model performance and failure modes, to compare methods robustly we need to \emph{quantify} model performance. The computed \gls{rmse} between estimated and ground truth correlation terms (i.e.~the off-diagonal term) is shown in \cref{fig:sim-results-d2-HCP-noise-all-correlation-RMSE}. -The results are shown for the hybrid case, as a more noisy setup is considered more realistic. +The results are shown for the hybrid case, as a noisier setup is considered more realistic. % These quantitative results generally confirm our intuition from the visual inspection. The \gls{svwp} and \gls{dcc} models perform like a single window approach for the null and constant cases, with the \gls{sw-cv} performing slightly worse. @@ -147,7 +145,7 @@ \subsubsection{Bivariate case} % However, none of the methods can significantly outperform the \gls{sfc} estimate on the state transition covariance structure. This result can be considered a quantitative confirmation of our hypothesis from looking at the estimates. -This structure may be too hard to learn by the considered methods. +This structure may be too hard to learn by the methods considered. This point was made by \textcite{Lindquist2014} too, who claimed that \gls{dcc} cannot model abrupt changes (change points) well. Results for $N = 120$, $N = 200$, and $N = 1200$ show similar relative performance (see \cref{appendix:sim-more-quantitative-results}). % @@ -171,7 +169,7 @@ \subsubsection{Trivariate cases} However, the sparse case (\cref{fig:results-d3s-no-noise-stepwise-covariance}) reveals that it may be a good idea to train \gls{dcc} in a pairwise manner. The jointly trained \gls{dcc} model performs much worse here. -Quantitative results for the dense and sparse trivarate cases with added \gls{hcp} \gls{rs-fmri} noise are shown in \cref{fig:sim-results-d3d-HCP-noise-all-correlation-matrix-RMSE} and \cref{fig:sim-results-d3s-HCP-noise-all-correlation-matrix-RMSE}. +Quantitative results for the dense and sparse trivariate cases with added \gls{hcp} \gls{rs-fmri} noise are shown in \cref{fig:sim-results-d3d-HCP-noise-all-correlation-matrix-RMSE} and \cref{fig:sim-results-d3s-HCP-noise-all-correlation-matrix-RMSE}. We generally see the same trends in these results as for the bivariate case. The \gls{svwp} does relatively well on the sparse covariance structures, and the jointly trained \gls{dcc} does relatively poorly here. @@ -185,8 +183,7 @@ \subsubsection{Toward higher dimensions} However, more complicated covariance structures in higher dimensions are left for future work. %% -\subsubsection{Impact of noise analysis} -\label{subsec:impact-of-noise-analysis} +\subsubsection{Impact of noise analysis}\label{subsec:impact-of-noise-analysis} %% To understand the impact of noise on our results, the estimates for various noise levels are shown in \cref{ch:appendix-d2-impact-of-noise,ch:appendix-d3d-impact-of-noise}. @@ -203,12 +200,13 @@ \subsubsection{Imputation benchmark} % Let us look at two cardinal cases, both using bivariate, noiseless data. 
For the null covariance case (\cref{fig:sim-imputation-study-d2-null}), we find all methods to perform similarly well on the imputation benchmark. -This was to be expected, as all model can model this just fine, as witnessed in both the visual inspection and the quantified results. -Actually, \gls{sw-cv} performs slightly worse than the other method, which again was to be expected as it performed slightly worse on the other performance metrics as well. +This was to be expected. +All models can model this appropriately, as witnessed in both the visual inspection and the quantified results. +\gls{sw-cv} performs slightly worse than the other methods, which again was to be expected as it also performed slightly worse on the other performance metrics. However, for a non-static covariance structure such as the slowly oscillating periodic one, we find the \gls{sfc} estimates to perform much worse on the imputation benchmark than the other methods (see \cref{fig:sim-imputation-study-d2-periodic-1}). We find the \gls{svwp} to perform best on this benchmark. -Thus, we obtained results \emph{without} using a ground truth that correspond well to the performance metrics computed \emph{with} knowledge of a ground truth. -This is exciting, as we can run this imputation benchmark on any (real) data set. +Thus, we obtained results \emph{without} using a ground truth, that correspond well to the performance metrics computed \emph{with} knowledge of a ground truth. +This is exciting, as this imputation benchmark can be run on any (real) data set. \begin{figure}[t] @@ -218,8 +216,7 @@ \subsubsection{Imputation benchmark} Simulations imputation benchmark - null covariance. Test log likelihoods for bivariate ($D = 2$) noiseless data for $N = 400$. Each dot represents one of $T = 200$ trials. - } - \label{fig:sim-imputation-study-d2-null} + }\label{fig:sim-imputation-study-d2-null} \end{figure} @@ -230,15 +227,13 @@ \subsubsection{Imputation benchmark} Simulations imputation benchmark - periodic (slow) covariance. Test log likelihoods for bivariate ($D = 2$) noiseless data for $N = 400$. Each dot represents one of $T = 200$ trials. - } - \label{fig:sim-imputation-study-d2-periodic-1} + }\label{fig:sim-imputation-study-d2-periodic-1} \end{figure} %% \clearpage -\subsection{Resting-state fMRI} -\label{subsec:hcp-results} +\subsection{Resting-state fMRI}\label{subsec:hcp-results} %% Before diving into any quantitative results, we should again visually inspect the estimated \gls{tvfc} from different methods. @@ -252,61 +247,60 @@ \subsection{Resting-state fMRI} \caption{ HCP benchmark TVFC estimates for random selection of edges (from $D = 15$ time series) for a single scan of a single HCP subject. The y-axis scales from~-1~to~1; the black dashed line indicates zero correlation. - } - \label{fig:hcp-model-estimates-example} + Method estimates vary radically. +}\label{fig:hcp-model-estimates-example} \end{figure} Several observations can be made from visual inspection. % -First of all, there is a large variety in general correlation (connectivity strength) between the edges. +First, there is a large variety in general correlation (connectivity strength) between the edges. Albeit with fluctuations, some edges are consistently strongly correlated, whereas others are generally uncorrelated. Some edges show dynamics, whereas others appear more static. This was to be expected, as brain region interactions should be distinct from one another. 
% Secondly, perhaps worryingly, estimates vary radically among the methods considered. The mean of \gls{tvfc} estimates across time for both the \gls{svwp} and \gls{sw} methods is consistent with the \gls{sfc} estimate, in contrast to the \gls{dcc} estimates. -We also see large differences between \gls{tvfc} estimates in terms of how fast \gls{tvfc} changes. +We also see major differences between \gls{tvfc} estimates in terms of how fast \gls{tvfc} changes. The \gls{sw} method estimates constantly and rapidly changing \gls{tvfc}, with large variance across time. The \gls{svwp} method estimates time-varying but transiently changing \gls{tvfc}. -The \gls{dcc} method estimates rapidly changing \gls{tvfc}, but constricted to a very small range (i.e.~low variance). +The \gls{dcc} method estimates rapidly changing \gls{tvfc}, but constricted to a small range (i.e.~low variance). % -Lastly, we observe that the \gls{svwp} estimates look like a smoothed version of the \gls{sw-cv} estimates. +Lastly, we observe that the \gls{svwp} estimates look like a smooth version of the \gls{sw-cv} estimates. This could mean that either the \gls{svwp} model cannot pick up on the fast-changing nature of \gls{tvfc}, or that the \gls{sw-cv} method picks up on spurious fluctuations (yet still captures the general trends). -Training \gls{dcc} in a joint or pairwise manner does not seem to make a big difference here. +Training \gls{dcc} in a joint or pairwise manner does not seem to make a significant difference here. % These insights have been qualitatively verified by inspecting other scans. \begin{figure}[t] \centering - \subcaptionbox{SVWP-J \label{fig:HCP-model-estimates-summary-measures-mean-SVWP}}{\includegraphics[width=0.30\textwidth]{fig/hcp/d15/TVFC_predictions_summaries/scan_0/all/correlation_TVFC_mean_SVWP_joint}} - \subcaptionbox{DCC-J \label{fig:HCP-model-estimates-summary-measures-mean-DCC}}{\includegraphics[width=0.30\textwidth]{fig/hcp/d15/TVFC_predictions_summaries/scan_0/all/correlation_TVFC_mean_DCC_joint}} - \subcaptionbox{SW-CV \label{fig:HCP-model-estimates-summary-measures-mean-SW}}{\includegraphics[width=0.30\textwidth]{fig/hcp/d15/TVFC_predictions_summaries/scan_0/all/correlation_TVFC_mean_SW_cross_validated}} - \subcaptionbox{SVWP-J \label{fig:HCP-model-estimates-summary-measures-var-SVWP}}{\includegraphics[width=0.30\textwidth]{fig/hcp/d15/TVFC_predictions_summaries/scan_0/all/correlation_TVFC_variance_SVWP_joint}} - \subcaptionbox{DCC-J \label{fig:HCP-model-estimates-summary-measures-var-DCC}}{\includegraphics[width=0.30\textwidth]{fig/hcp/d15/TVFC_predictions_summaries/scan_0/all/correlation_TVFC_variance_DCC_joint}} - \subcaptionbox{SW-CV \label{fig:HCP-model-estimates-summary-measures-var-SW}}{\includegraphics[width=0.30\textwidth]{fig/hcp/d15/TVFC_predictions_summaries/scan_0/all/correlation_TVFC_variance_SW_cross_validated}} - \subcaptionbox{SVWP-J \label{fig:HCP-model-estimates-summary-measures-roc-SVWP}}{\includegraphics[width=0.30\textwidth]{fig/hcp/d15/TVFC_predictions_summaries/scan_0/all/correlation_TVFC_rate_of_change_SVWP_joint}} - \subcaptionbox{DCC-J \label{fig:HCP-model-estimates-summary-measures-roc-DCC}}{\includegraphics[width=0.30\textwidth]{fig/hcp/d15/TVFC_predictions_summaries/scan_0/all/correlation_TVFC_rate_of_change_DCC_joint}} - \subcaptionbox{SW-CV \label{fig:HCP-model-estimates-summary-measures-roc-SW}}{\includegraphics[width=0.30\textwidth]{fig/hcp/d15/TVFC_predictions_summaries/scan_0/all/correlation_TVFC_rate_of_change_SW_cross_validated}} + 
\subcaptionbox{SVWP-J\label{fig:HCP-model-estimates-summary-measures-mean-SVWP}}{\includegraphics[width=0.30\textwidth]{fig/hcp/d15/TVFC_predictions_summaries/scan_0/all/correlation_TVFC_mean_SVWP_joint}} + \subcaptionbox{DCC-J\label{fig:HCP-model-estimates-summary-measures-mean-DCC}}{\includegraphics[width=0.30\textwidth]{fig/hcp/d15/TVFC_predictions_summaries/scan_0/all/correlation_TVFC_mean_DCC_joint}} + \subcaptionbox{SW-CV\label{fig:HCP-model-estimates-summary-measures-mean-SW}}{\includegraphics[width=0.30\textwidth]{fig/hcp/d15/TVFC_predictions_summaries/scan_0/all/correlation_TVFC_mean_SW_cross_validated}} + \subcaptionbox{SVWP-J\label{fig:HCP-model-estimates-summary-measures-var-SVWP}}{\includegraphics[width=0.30\textwidth]{fig/hcp/d15/TVFC_predictions_summaries/scan_0/all/correlation_TVFC_variance_SVWP_joint}} + \subcaptionbox{DCC-J\label{fig:HCP-model-estimates-summary-measures-var-DCC}}{\includegraphics[width=0.30\textwidth]{fig/hcp/d15/TVFC_predictions_summaries/scan_0/all/correlation_TVFC_variance_DCC_joint}} + \subcaptionbox{SW-CV\label{fig:HCP-model-estimates-summary-measures-var-SW}}{\includegraphics[width=0.30\textwidth]{fig/hcp/d15/TVFC_predictions_summaries/scan_0/all/correlation_TVFC_variance_SW_cross_validated}} + \subcaptionbox{SVWP-J\label{fig:HCP-model-estimates-summary-measures-roc-SVWP}}{\includegraphics[width=0.30\textwidth]{fig/hcp/d15/TVFC_predictions_summaries/scan_0/all/correlation_TVFC_rate_of_change_SVWP_joint}} + \subcaptionbox{DCC-J\label{fig:HCP-model-estimates-summary-measures-roc-DCC}}{\includegraphics[width=0.30\textwidth]{fig/hcp/d15/TVFC_predictions_summaries/scan_0/all/correlation_TVFC_rate_of_change_DCC_joint}} + \subcaptionbox{SW-CV\label{fig:HCP-model-estimates-summary-measures-roc-SW}}{\includegraphics[width=0.30\textwidth]{fig/hcp/d15/TVFC_predictions_summaries/scan_0/all/correlation_TVFC_rate_of_change_SW_cross_validated}} \caption{ HCP benchmark edgewise TVFC summary measures of first scan (1A) averaged over all subjects. TVFC mean (top), variance (middle), and rate-of-change (bottom row) are shown. For interpretation, ICA components are mapped to FNs. Visual (V): medial (M), occipital (O), lateral (L); Default Mode Network (DMN); Cerebellum (CBM); Sensorimotor (SM); Auditory (AUD); Executive Control (EC); Frontoparietal (FP) with Cognition-Language (CL) subset. - } - \label{fig:HCP-model-estimates-summary-measures} + }\label{fig:HCP-model-estimates-summary-measures} \end{figure} -These distinct differences in dynamics can be captured by three \gls{tvfc} summary measures: mean, variance, and rate-of-change, see \cref{subsec:tvfc-summary-measures}) as shown in \cref{fig:HCP-model-estimates-summary-measures}. -We can see that our broad intuition from the visual inspection holds for these summary measures. +These distinct differences in dynamics can be captured by three \gls{tvfc} summary measures: mean, variance, and rate-of-change (see \cref{subsec:tvfc-summary-measures}), as shown in \cref{fig:HCP-model-estimates-summary-measures}. +Our broad intuition from the visual inspection holds for these summary measures. % -The mean across these three methods looks similar, although the \gls{dcc} means are slightly different. +The mean estimates across these three methods looks similar, although the \gls{dcc} means are slightly different. % For variance, we see small values for \gls{dcc}, large values for \gls{sw-cv}, and the values for \gls{svwp} estimates in between these. 
% -For rate-of-change, we see smaller values for both the \gls{svwp} and \gls{sw-cv}, and relatively large values for \gls{dcc}. +For rate-of-change, we see smaller values for both the \gls{svwp} and \gls{sw-cv}, and larger values for \gls{dcc}. % From this inspection we may conclude that the combined three summary measures paint a reasonably comprehensive summary of each method's \gls{tvfc} estimates. The summary measures of the pairwise \gls{dcc} estimates are similar to the joint ones. @@ -316,11 +310,10 @@ \subsection{Resting-state fMRI} \centering \includegraphics[width=0.45\textwidth]{fig/hcp/d15/lengthscale_optimal_window_length_relations} \caption{ - HCP benchmark relationship between learned SVWP kernel lengthscales and SW-CV optimal window length. + HCP benchmark relationship between learned SVWP kernel lengthscales (scaled to time series length) and SW-CV optimal window length. Each dot represents one of four scans (1A, 1B, 2A, 2B) for each subject. - Full time series is 864 seconds long. - } - \label{fig:sim-relationship-lengthscale-optimal-window-length} + Full time series are 864 seconds long. + }\label{fig:sim-relationship-lengthscale-optimal-window-length} \end{figure} @@ -342,8 +335,7 @@ \subsubsection{Subject measure prediction benchmark} HCP benchmark subject cognitive measures prediction morphometricity scores (with standard error). Run on TVFC summary measures of mean (top), variance (middle), and rate-of-change (bottom row). sFC is added for reference to the TVFC mean plot. - } - \label{fig:hcp-results-subject-measures-prediction} + }\label{fig:hcp-results-subject-measures-prediction} \end{figure} @@ -351,66 +343,65 @@ \subsubsection{Subject measure prediction benchmark} The pairwise \gls{dcc} scores are almost identical to the joint \gls{dcc}, which is unsurprising as the summary measures are almost identical. Hence, they are left out here. % -Since we study the same subject measures as \textcite{Li2019a}, we can compare the \gls{sfc} (as a reference or \emph{null} model; the alternative hypothesis that \gls{fc} is actually static) and \gls{tvfc} estimate means (top row) directly to theirs. +Since we study the same subject measures as \textcite{Li2019a}, we can compare the \gls{sfc} (as a reference or \emph{null} model; the alternative hypothesis that \gls{fc} is static) and \gls{tvfc} estimate means (top row) directly to theirs. Scores are generally replicated, for example being close to 1 (for all methods) for Gender. This provides a healthy sanity check. % We also see that none of the \gls{tvfc} estimation methods can outperform the \gls{sfc} method regarding mean \gls{fc} (connectivity strength). -This is not unexpected, and highlights that we should only look at \gls{tvfc} methods if we are interested in brain connectivity \emph{dynamics}. +This was expected and highlights that we should only look at \gls{tvfc} methods if we are interested in brain connectivity \emph{dynamics}. However, a good \gls{tvfc} estimation method should still model mean \gls{tvfc} well (i.e.~get as close to \gls{sfc} performance as possible). We observe several things from these results. % -First of all, for some subject measures none of the variance across all \gls{tvfc} summary measures and estimation methods can be explained (see e.g.~Sustained Attention Sens.). +First, for some subject measures none of the variance across all \gls{tvfc} summary measures and estimation methods can be explained (see e.g.~Sustained Attention Sens.). 
This could indicate that signatures of these respective tasks are not captured by \gls{fc} in general. % Second, we observe strong heterogeneity across methods. There is no clear-cut conclusion on picking a superior method here. -A method may outperform another in one occasion, but not in another. -However, overal we see the \gls{svwp} and \gls{sw-cv} methods perform best, whereas \gls{dcc} often fails to explain much variance across these measures.\footnote{As another validation of cross-validating window lengths, we ran the \gls{sw} approach with a window length of both 30 and 60 seconds. The \gls{sw-cv} estimates consistently outperformed both of these, even as this would be an unfair, post-hoc comparison.} +A method may outperform another on one occasion, but not on another. +However, overall the \gls{svwp} and \gls{sw-cv} methods perform best, whereas \gls{dcc} often fails to explain much variance across these measures.\footnote{As another validation of cross-validating window lengths, we ran the \gls{sw} approach with a window length of both 30 and 60 seconds. The \gls{sw-cv} estimates consistently outperformed both of these, even though this would be an unfair, post-hoc comparison.} Particularly noteworthy is the complete lack of predictive power of the \gls{dcc} \gls{tvfc} variance estimates. This can be interpreted as these estimates being meaningless (which we will later see to be relevant for the test-retest benchmark). % In general, it is also promising here that the time-varying summary measures (variance and rate-of-change) contain information about subject measures as well. -This highlights the value of studying \gls{fc} dynamics instead of mereley its properties over the full scan. +This highlights the value of studying \gls{fc} dynamics instead of merely its properties over the full scan. % Another observation can be made about how subject measures are expressed. -In fact, this helps understand what these summary measures capture. +In fact, this helps us understand what these summary measures capture. Subject age, for example, seems to affect \gls{tvfc} mean and variance more so than rate-of-change. Such insights could lead to (careful) biophysical interpretations as well. For example, \textcite{Hutchison2015} found that \gls{fc} variability over the length of a scan correlates positively with age. -Based on this finding, we would expect \gls{tvfc} variance to be predictive of age to some degree. -We do indeed find this for the \gls{wp} and \gls{sw-cv} methods, but fail to find this for the \gls{dcc} method. +Based on this finding, we would expect \gls{tvfc} variance to be predictive of age (to some degree). +We do indeed find this for the \gls{wp} and \gls{sw-cv} methods but fail to find this for the \gls{dcc} method. Such comparisons to literature can be used to assess the validity and general usefulness of a \gls{tvfc} estimation method. % -Finally, we see that the performance of \gls{svwp} and \gls{sw-cv} estimates are generally in sync (performing well and failing in similar contexts). +Finally, we see that the performance of \gls{svwp} and \gls{sw-cv} estimates is generally in agreement (performing well and failing in similar contexts). This aligns with our prior intuition that these methods may capture similar aspects of the data.
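To make these three summary measures concrete, the sketch below shows one way to compute them edgewise from an estimated TVFC array. This is only an illustration, not the thesis code: the array layout, the function name, and in particular the rate-of-change definition used here (mean absolute successive difference) are assumptions that may differ from the definitions given in \cref{subsec:tvfc-summary-measures}.

```python
import numpy as np


def tvfc_summary_measures(tvfc_estimates: np.ndarray) -> dict:
    """Edgewise summary measures of a TVFC estimate.

    tvfc_estimates: array of shape (N, D, D) holding the estimated
    correlation matrix at each of N time steps for D regions, e.g. the
    output of an SVWP, DCC, or SW-CV estimator.
    Returns a dict mapping each summary measure name to a (D, D) matrix.
    """
    mean = tvfc_estimates.mean(axis=0)     # average coupling strength per edge
    variance = tvfc_estimates.var(axis=0)  # variability of the estimate per edge
    # Rate-of-change taken here as the mean absolute successive difference;
    # the exact definition used in the thesis may differ.
    rate_of_change = np.abs(np.diff(tvfc_estimates, axis=0)).mean(axis=0)
    return {"mean": mean, "variance": variance, "rate_of_change": rate_of_change}
```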
\begin{figure}[t] \centering - \subcaptionbox{SVWP-J \label{fig:test-retest-mean-SVWP-ICCs}}{\includegraphics[width=0.28\textwidth]{fig/hcp/d15/test_retest/ICCs/correlation_mean_ICCs_SVWP_joint}} - \subcaptionbox{DCC-J \label{fig:test-retest-mean-DCC-ICCs}}{\includegraphics[width=0.28\textwidth]{fig/hcp/d15/test_retest/ICCs/correlation_mean_ICCs_DCC_joint}} - \subcaptionbox{SW-CV \label{fig:test-retest-mean-SW-ICCs}}{\includegraphics[width=0.28\textwidth]{fig/hcp/d15/test_retest/ICCs/correlation_mean_ICCs_SW_cross_validated}} - \subcaptionbox{SVWP-J \label{fig:test-retest-variance-SVWP-ICCs}}{\includegraphics[width=0.28\textwidth]{fig/hcp/d15/test_retest/ICCs/correlation_variance_ICCs_SVWP_joint}} - \subcaptionbox{DCC-J \label{fig:test-retest-variance-DCC-ICCs}}{\includegraphics[width=0.28\textwidth]{fig/hcp/d15/test_retest/ICCs/correlation_variance_ICCs_DCC_joint}} - \subcaptionbox{SW-CV \label{fig:test-retest-variance-SW-ICCs}}{\includegraphics[width=0.28\textwidth]{fig/hcp/d15/test_retest/ICCs/correlation_variance_ICCs_SW_cross_validated}} - \subcaptionbox{SVWP-J \label{fig:test-retest-roc-WP-ICCs}}{\includegraphics[width=0.28\textwidth]{fig/hcp/d15/test_retest/ICCs/correlation_rate_of_change_ICCs_SVWP_joint}} - \subcaptionbox{DCC-J \label{fig:test-retest-roc-DCC-ICCs}}{\includegraphics[width=0.28\textwidth]{fig/hcp/d15/test_retest/ICCs/correlation_rate_of_change_ICCs_DCC_joint}} - \subcaptionbox{SW-CV \label{fig:test-retest-roc-SW-ICCs}}{\includegraphics[width=0.28\textwidth]{fig/hcp/d15/test_retest/ICCs/correlation_rate_of_change_ICCs_SW_cross_validated}} + \subcaptionbox{SVWP-J\label{fig:test-retest-mean-SVWP-ICCs}}{\includegraphics[width=0.28\textwidth]{fig/hcp/d15/test_retest/ICCs/correlation_mean_ICCs_SVWP_joint}} + \subcaptionbox{DCC-J\label{fig:test-retest-mean-DCC-ICCs}}{\includegraphics[width=0.28\textwidth]{fig/hcp/d15/test_retest/ICCs/correlation_mean_ICCs_DCC_joint}} + \subcaptionbox{SW-CV\label{fig:test-retest-mean-SW-ICCs}}{\includegraphics[width=0.28\textwidth]{fig/hcp/d15/test_retest/ICCs/correlation_mean_ICCs_SW_cross_validated}} + \subcaptionbox{SVWP-J\label{fig:test-retest-variance-SVWP-ICCs}}{\includegraphics[width=0.28\textwidth]{fig/hcp/d15/test_retest/ICCs/correlation_variance_ICCs_SVWP_joint}} + \subcaptionbox{DCC-J\label{fig:test-retest-variance-DCC-ICCs}}{\includegraphics[width=0.28\textwidth]{fig/hcp/d15/test_retest/ICCs/correlation_variance_ICCs_DCC_joint}} + \subcaptionbox{SW-CV\label{fig:test-retest-variance-SW-ICCs}}{\includegraphics[width=0.28\textwidth]{fig/hcp/d15/test_retest/ICCs/correlation_variance_ICCs_SW_cross_validated}} + \subcaptionbox{SVWP-J\label{fig:test-retest-roc-WP-ICCs}}{\includegraphics[width=0.28\textwidth]{fig/hcp/d15/test_retest/ICCs/correlation_rate_of_change_ICCs_SVWP_joint}} + \subcaptionbox{DCC-J\label{fig:test-retest-roc-DCC-ICCs}}{\includegraphics[width=0.28\textwidth]{fig/hcp/d15/test_retest/ICCs/correlation_rate_of_change_ICCs_DCC_joint}} + \subcaptionbox{SW-CV\label{fig:test-retest-roc-SW-ICCs}}{\includegraphics[width=0.28\textwidth]{fig/hcp/d15/test_retest/ICCs/correlation_rate_of_change_ICCs_SW_cross_validated}} \caption{ HCP benchmark test-retest robustness edgewise ICC(2,1) scores across four scans of estimated TVFC on $D = 15$ ICA-based data. Reliability of TVFC means (top), variance (middle), and rate-of-change (bottom row) are shown. For interpretation, ICA components are mapped to FNs. 
Visual (V): medial (M), occipital (O), lateral (L); Default Mode Network (DMN); Cerebellum (CBM); Sensorimotor (SM); Auditory (AUD); Executive Control (EC); Frontoparietal (FP) with Cognition-Language (CL) subset. - } - \label{fig:hcp-results-test-retest-ICCs-d15} + }\label{fig:hcp-results-test-retest-ICCs-d15} \end{figure} \info[inline]{Paragraph: Discuss additional subject measures.} Morphometricity scores for additional subject measures (including social-emotional and other measures) can be found in \cref{appendix:hcp-more-results}. -Variance explained for these measures are generally much lower. +The variance explained for these measures is generally much lower. This shows that \gls{fc} may capture cognitive individual differences better than affective ones. Comparing to prior findings, we replicate \textcite{Dubois2018} and find that the Big Five personality trait of `openness to experience' is best explained by \gls{sfc}~\parencite[see also][]{Beaty2018}. @@ -438,7 +429,7 @@ \subsubsection{Test-retest robustness benchmark} We posit that using I2C2 scores is a good measure for whole-brain test-retest robustness. However, in some cases we may be interested in the robustness of particular edges. For comparison, \textcite{Choe2017} found respective values for the \gls{tvfc} mean between 0.44 and 0.48 for all methods. -We find similar values for all of our methods. +We find similar values for all our methods. They found respective values for the \gls{tvfc} variance between 0.16 and 0.30 for \gls{sw} methods (with different window lengths) and 0.49 for \gls{dcc}. That is, they also found more variety between methods in I2C2 values for \gls{tvfc} variances than for \gls{tvfc} means. @@ -448,19 +439,18 @@ \subsubsection{Test-retest robustness benchmark} \includegraphics[width=\textwidth]{fig/hcp/d15/test_retest/I2C2/correlation_I2C2_scores} \caption{ HCP benchmark test-retest robustness I2C2 omnibus scores for mean, variance, and rate-of-change TVFC summary measures. - } - \label{fig:hcp-results-test-retest-I2C2-scores-d15} + }\label{fig:hcp-results-test-retest-I2C2-scores-d15} \end{figure} Following up from \cref{subsec:model-features}, we can also compute the \gls{icc} score for the learned \gls{svwp} kernel parameters. We find a score of 0.29 for the kernel variance and 0.48 for the kernel lengthscales $l$. Taking the categories from \textcite{Cicchetti1994}, these can be considered `poor' and `fair', respectively. -These are actually high in comparison to the scores for the summary measures. +These are high in comparison to the scores for the summary measures. % How should we interpret these results? From the perspective of viewing this as a prediction problem, we can view this test-retest problem as the following question: ``Given the first scan, how well can we predict which scan is that subject's subsequent scan?''. -The issue here is not just to look at test-retest scores, but to use them as a way to justify using a certain method over another. +The issue here is not just to look at test-retest scores, but to use them to justify using one method over another. In that light, we just expect a method to generate \emph{any} feature that may help us do this. The method with the best score from any such feature can be considered stronger. % @@ -483,33 +473,31 @@ \subsubsection{Imputation benchmark} \caption{ HCP benchmark imputation results under LEOO train-test split. The boxplot shows median, quartiles, and outliers. 
- } - \label{fig:hcp-results-LEOO-multivariate} + }\label{fig:hcp-results-LEOO-multivariate} \end{figure} \begin{figure}[t] \centering - \subcaptionbox{SVWP-J \label{fig:edgewise-imputation-benchmark-SVWP}}{\includegraphics[width=0.45\textwidth]{fig/hcp/d15/imputation_study/LEOO_multivariate_test_log_likelihoods_edgewise_SVWP_joint}} - \subcaptionbox{DCC-J \label{fig:edgewise-imputation-benchmark-DCC}}{\includegraphics[width=0.45\textwidth]{fig/hcp/d15/imputation_study/LEOO_multivariate_test_log_likelihoods_edgewise_DCC_joint}} - \subcaptionbox{SW-CV \label{fig:edgewise-imputation-benchmark-SW-CV}}{\includegraphics[width=0.45\textwidth]{fig/hcp/d15/imputation_study/LEOO_multivariate_test_log_likelihoods_edgewise_SW_cross_validated}} - \subcaptionbox{sFC \label{fig:edgewise-imputation-benchmark-STATIC}}{\includegraphics[width=0.45\textwidth]{fig/hcp/d15/imputation_study/LEOO_multivariate_test_log_likelihoods_edgewise_sFC}} + \subcaptionbox{SVWP-J\label{fig:edgewise-imputation-benchmark-SVWP}}{\includegraphics[width=0.45\textwidth]{fig/hcp/d15/imputation_study/LEOO_multivariate_test_log_likelihoods_edgewise_SVWP_joint}} + \subcaptionbox{DCC-J\label{fig:edgewise-imputation-benchmark-DCC}}{\includegraphics[width=0.45\textwidth]{fig/hcp/d15/imputation_study/LEOO_multivariate_test_log_likelihoods_edgewise_DCC_joint}} + \subcaptionbox{SW-CV\label{fig:edgewise-imputation-benchmark-SW-CV}}{\includegraphics[width=0.45\textwidth]{fig/hcp/d15/imputation_study/LEOO_multivariate_test_log_likelihoods_edgewise_SW_cross_validated}} + \subcaptionbox{sFC\label{fig:edgewise-imputation-benchmark-STATIC}}{\includegraphics[width=0.45\textwidth]{fig/hcp/d15/imputation_study/LEOO_multivariate_test_log_likelihoods_edgewise_sFC}} \caption{ HCP benchmark edgewise imputation results for various methods on $D = 15$ ICA-based data. The equivalent of mean test log likelihoods from \cref{fig:hcp-results-LEOO-multivariate} are shown for each edge individually. For interpretation, ICA components are mapped to FNs. Visual (V): medial (M), occipital (O), lateral (L); Default Mode Network (DMN); Cerebellum (CBM); Sensorimotor (SM); Auditory (AUD); Executive Control (EC); Frontoparietal (FP) with Cognition-Language (CL) subset. - } - \label{fig:hcp-results-edgewise-imputation-benchmark} + }\label{fig:hcp-results-edgewise-imputation-benchmark} \end{figure} This is a whole-brain analysis. However, there may be certain edges where \gls{svwp} performance is similar to the static approach (i.e.~static edges) and some where it outperforms (i.e.~dynamic edges). -In fact, we posit that outperformance over static estimates can be considered a proxy for a statistical test whether there is any time-varying structure~\parencite[see also][]{Zalesky2014, Hindriks2016}. +In fact, we posit that outperformance over static estimates can be considered a proxy for a statistical test of whether there is any time-varying structure~\parencite[see also][]{Zalesky2014, Hindriks2016}. This point was made earlier based on \cref{fig:sim-imputation-study-d2-null,fig:sim-imputation-study-d2-periodic-1}. To further explore this proposal, the population-level edgewise imputation benchmark scores (averaged over all subjects) are plotted in \cref{fig:hcp-results-edgewise-imputation-benchmark}. -First of all, caution is advised when interpreting these. +First, caution is advised when interpreting these. 
Certain edges have high performance on this imputation benchmark, see e.g.~FP-FP(CL), but this may simply be due to this edge being very static (and thus easier to fit). Comparison between edges is less insightful than comparison for the same edge \emph{between} estimation methods. % @@ -523,7 +511,7 @@ \subsubsection{Brain state analysis} Extracted brain states for \gls{svwp}, \gls{dcc}, and \gls{sw-cv} are shown in \cref{fig:hcp-results-brain-states-svwp,fig:hcp-results-brain-states-dcc,fig:hcp-results-brain-states-sw-cv}, respectively. Interestingly, the extracted brain states for \gls{svwp} and \gls{sw-cv} estimates look similar. -The first brain state also looks similar to the \gls{sfc} estimates, as expected~\parencite{Allen2014}. +The first brain state also looks like the \gls{sfc} estimates, as expected~\parencite{Allen2014}. Apart from the extracted brain states, we are also interested in the dynamics and transitions of brain states. The number of brain state change points per method per subject is shown in \cref{fig:hcp-brain-state-change-point-counts}. @@ -539,19 +527,17 @@ Total number of time points is $N = 1200$. Run on multivariate (all $D = 15$ time series) HCP data. The boxplot shows median, quartiles, and outliers. - } - \label{fig:hcp-brain-state-change-point-counts} + }\label{fig:hcp-brain-state-change-point-counts} \end{figure} %% -\subsection{Task-based fMRI} -\label{subsec:rockland-results} +\subsection{Task-based fMRI}\label{subsec:rockland-results} %% -We firstly compare the \gls{vwp} kernel lengthscales to the optimal learned window length again, shown in \cref{fig:rockland-relationship-lengthscale-optimal-window-length}. +First, we compare the \gls{vwp} kernel lengthscales to the optimal learned window length again, shown in \cref{fig:rockland-relationship-lengthscale-optimal-window-length}. We do not find a relationship between the two. -This may indicate a failure of one (or both) of these methods to capture anything meaningful, or the approaches to focus on different aspects of the data. +This may indicate a failure of one (or both) of these methods to capture anything meaningful, or that the two approaches focus on distinct aspects of the data. Alternatively, these hyperparameters may in fact not be crucial to the estimates, which may instead be driven much more by the actual observations. @@ -559,16 +545,15 @@ \subsection{Task-based fMRI} \centering \includegraphics[width=0.45\textwidth]{fig/rockland/CHECKERBOARD645/lengthscale_optimal_window_length_relations} \caption{ - Rockland benchmark relationship between learned VWP kernel lengthscales and SW-CV optimal window length. + Rockland benchmark relationship between learned VWP kernel lengthscales (scaled to time series length) and SW-CV optimal window length. Each dot represents one of 286 Rockland subjects. Full time series is 154.8 seconds long. - } - \label{fig:rockland-relationship-lengthscale-optimal-window-length} + }\label{fig:rockland-relationship-lengthscale-optimal-window-length} \end{figure} We start with a visual inspection of model \gls{tvfc} estimates, shown in \cref{fig:rockland-results-tvfc-predictions}. -Just like the node time series plot, since we have an external task we can average estimates across all subjects. +Just as with the node time series plot, since we have an external task, we can average estimates across all subjects. \begin{figure}[ht] @@ -578,15 +563,14 @@ \subsection{Task-based fMRI} Rockland benchmark model TVFC estimates.
Shaded areas indicate presence of external visual stimulus. Average estimates over all 286 subjects. - } - \label{fig:rockland-results-tvfc-predictions} + }\label{fig:rockland-results-tvfc-predictions} \end{figure} As discussed, we expect that correlation between \gls{v1} and other regions should decrease as we move up and away from the visual cortex hierarchy. We do in fact observe this from all models: connectivity with V2 is highest, followed by V3, V4, \gls{mpfc}, and connectivity strength with \gls{m1} is lowest. However, the mean \gls{tvfc} is quite different across estimation methods, and often quite different from the \gls{sfc} estimate. -This also highlights how important choice of \gls{tvfc} estimation method can be. +This also highlights how important the choice of \gls{tvfc} estimation method can be. %% \subsubsection{External stimuli prediction} @@ -599,25 +583,23 @@ \subsubsection{External stimuli prediction} Instead, the full \gls{sfc} estimates are captured by the constant (offset) parameter of the design matrix. % In terms of other models, we see that the \gls{vwp} model has the largest estimated task-related $\boldsymbol{\beta}$ parameters. -Interestingly, the within-visual cortex edges have most predictive power. -As expected from the visual inspection, the \gls{glm} has removed negative trends for \gls{v1}-\gls{mpfc} and \gls{v1}-\gls{m1}. -We conclude that the \gls{vwp} estimates have relatively captured the most of the external task. +Interestingly, the within-visual cortex edges have the most predictive power. +As expected from the visual inspection, the \gls{glm} has removed negative trends for \gls{v1}--\gls{mpfc} and \gls{v1}--\gls{m1}. +We conclude that the \gls{vwp} estimates have captured the external task best. \begin{figure}[ht] \centering - \subcaptionbox{VWP-J - \label{fig:rockland-results-glm-betas-VWP}}{\includegraphics[width=0.48\textwidth]{fig/rockland/CHECKERBOARD645/prediction_benchmark/GLM_beta_VWP_joint}} - \subcaptionbox{DCC-J - \label{fig:rockland-results-glm-betas-DCC}}{\includegraphics[width=0.48\textwidth]{fig/rockland/CHECKERBOARD645/prediction_benchmark/GLM_beta_DCC_joint}} - \subcaptionbox{SW-CV - \label{fig:rockland-results-glm-betas-SW-CV}}{\includegraphics[width=0.48\textwidth]{fig/rockland/CHECKERBOARD645/prediction_benchmark/GLM_beta_SW_cross_validated}} - \subcaptionbox{sFC - \label{fig:rockland-results-glm-betas-sFC}}{\includegraphics[width=0.48\textwidth]{fig/rockland/CHECKERBOARD645/prediction_benchmark/GLM_beta_sFC}} + \subcaptionbox{VWP-J\label{fig:rockland-results-glm-betas-VWP}}{\includegraphics[width=0.48\textwidth]{fig/rockland/CHECKERBOARD645/prediction_benchmark/GLM_beta_VWP_joint}} + \subcaptionbox{DCC-J\label{fig:rockland-results-glm-betas-DCC}}{\includegraphics[width=0.48\textwidth]{fig/rockland/CHECKERBOARD645/prediction_benchmark/GLM_beta_DCC_joint}} + \subcaptionbox{SW-CV\label{fig:rockland-results-glm-betas-SW-CV}}{\includegraphics[width=0.48\textwidth]{fig/rockland/CHECKERBOARD645/prediction_benchmark/GLM_beta_SW_cross_validated}} + \subcaptionbox{sFC\label{fig:rockland-results-glm-betas-sFC}}{\includegraphics[width=0.48\textwidth]{fig/rockland/CHECKERBOARD645/prediction_benchmark/GLM_beta_sFC}} \caption{ Rockland benchmark GLM $\beta$ (beta) parameters per TVFC estimation method. + Values indicate learned weights. + Higher values indicate that the GLM uses the respective design matrix features more.
+ The VWP TVFC estimates are most useful for predicting the presence of the external stimulus (rest and stim columns). + }\label{fig:rockland-results-glm-betas} \end{figure} @@ -626,8 +608,8 @@ \subsubsection{Imputation benchmark} %% The imputation benchmark results (\cref{fig:rockland-results-imputation-benchmark}) show strong performance for the \gls{vwp} model and weak performance for \gls{dcc}. -\gls{sw-cv} estimate likelihoods are very similar to \gls{sfc} estimates. -This may have been expected based on the visual inspection of model estimates; the \gls{sw-cv} approach found the right mean, but failed to capture the dynamics. +\gls{sw-cv} estimate likelihoods are similar to \gls{sfc} estimates. +This may have been expected based on the visual inspection of model estimates; the \gls{sw-cv} approach found the right mean but failed to capture the dynamics. % As such, we show strong correspondence of performance on the imputation benchmark with the external task prediction benchmark. @@ -637,8 +619,7 @@ \subsubsection{Imputation benchmark} \includegraphics[width=\textwidth]{fig/rockland/CHECKERBOARD645/imputation_study/LEOO_test_log_likelihoods_raincloud} \caption{ Rockland benchmark imputation results under LEOO train-test split. - Run on 286 Rockland data subjects. + Run on~286~Rockland data subjects. The boxplot shows median, quartiles, and outliers. - } - \label{fig:rockland-results-imputation-benchmark} + }\label{fig:rockland-results-imputation-benchmark} \end{figure} diff --git a/ch/3_Benchmarking_TVFC_estimation/3_Discussion.tex b/ch/3_Benchmarking_TVFC_estimation/3_Discussion.tex index 63de2b4..35553de 100644 --- a/ch/3_Benchmarking_TVFC_estimation/3_Discussion.tex +++ b/ch/3_Benchmarking_TVFC_estimation/3_Discussion.tex @@ -10,17 +10,17 @@ \subsection{Simulations} %% In summary, we studied simulated time series defined by a range of edge case and (possibly) realistic covariance structures. -We took advantage of the flexibility simulations bring, and studied the impact of data dimensionality and noise configurations. +We took advantage of the flexibility simulations bring and studied the impact of data dimensionality and noise configurations. These simulation benchmarks have taught us the following. Firstly, we found that \gls{sw-cv} and its ability to find an appropriate window length works well. The \gls{wp} approaches learn kernel lengthscales $l$ as a hyperparameter, and it seems to have a similar function as the window length. % -Furthermore, we have replicated the frequent observation that standard \gls{sw} approaches can produce spurious correlation structures if the underlying covariance structure is actually static. +Furthermore, we have replicated the frequent observation that standard \gls{sw} approaches can produce spurious correlation structures when the underlying covariance structure is static. Caution is required to avoid false positive conclusions. We also saw that \gls{dcc} models can still yield these. % -The \gls{wp} approach generally works well, but may smooth out the estimates too much. +The \gls{wp} approach works well but may smooth out the estimates too much. We also see that all models perform poorly on the `state transition' data set. As such we learned that if we expect to see drastic sudden changes in covariance structure, the current approaches may all be insufficient. % @@ -35,12 +35,11 @@ \subsection{Simulations} % To dive deeper, we needed to investigate method estimates on real data. 
The closer our benchmarks are to actual practical applications, the more valuable they will be. -For example, we do not actually know if it matters if a method's estimates are more `noisy' (see \gls{dcc} estimates for periodic covariance structures). +For example, we do not actually know whether it matters that a method's estimates are `noisier' (see \gls{dcc} estimates for periodic covariance structures). Perhaps capturing general trends is good enough for most practical scenarios. %% -\subsection{Resting-state fMRI} -\label{subsec:benchmarking-discussion-rs-fmri} +\subsection{Resting-state fMRI}\label{subsec:benchmarking-discussion-rs-fmri} %% In summary, we studied several popular types of \gls{rs-fmri} benchmarks on a single, large, and publicly available data set. @@ -49,13 +48,13 @@ \subsection{Resting-state fMRI} Firstly, we find large qualitative differences in the \gls{tvfc} estimates between different estimation methods. But how relevant are these? % -The \gls{rs-fmri} benchmarks have taught us that choice of \gls{tvfc} estimation method greatly affects the utility of these estimates to predict subject measures. +The \gls{rs-fmri} benchmarks have taught us that the choice of \gls{tvfc} estimation method affects the utility of these estimates for predicting subject measures. In general, we find the \gls{svwp} and \gls{sw-cv} methods to do similarly well. However, each has different predictive power for different subject measures and different qualitative characteristics. % In terms of the test-retest studies, the results are harder to interpret. -All methods seem to do relatively similarly here, but \gls{dcc} estimate variances are much more robust across scanning sessions. -In fact, the utility of test-retest studies in general is questionnable. +All methods seem to perform relatively similarly here, but \gls{dcc} estimate variances are much more robust across scanning sessions. +In fact, the utility of test-retest studies in general is questionable. It is unclear if we want to pick up on reliable characteristics or those that are indicative of cognitive occupation during a scan. Test-retest reliability may be desired, but optimizing for it for its own sake misses the point. Underwriting this, in a recent opinion piece \textcite{Finn2021b} also argued that (behaviorally) \emph{predictive} connectomes are more important than reliable connectomes. @@ -79,17 +78,16 @@ \subsection{Task-based fMRI} For example, \textcite{Xie2019} studied classification accuracy in a multi-task setting data set, where the external stimulus again was used as a proxy for ground truth. %% -\subsection{Reflection on benchmarking} -\label{subsec:benchmarking-reflection} +\subsection{Reflection on benchmarking}\label{subsec:benchmarking-reflection} %% What have we learned from these benchmarks? % We have not only proposed novel ways of estimating \gls{tvfc}, but also motivated and demonstrated how to think about robustly estimating it: by framing and designing benchmarks as prediction problems. -Overall we consider the \gls{wp} to perform well, and best among these methods considered. +Overall, we consider the \gls{wp} to perform well, and best among the methods considered. % -Having the benchmarking framework in place also allows us to make model adjustments and rapidly check if these improved performance. +Having the benchmarking framework in place also allows us to make model adjustments and rapidly check whether these improve its performance. Ideas for model extensions will be discussed in \cref{subsec:model-extensions}.
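As a concrete illustration of the kind of benchmark this framework contains (and of the imputation benchmark picked up again below), the sketch that follows shows the core of a leave-every-other-out (LEOO) imputation score: train on every other time point and evaluate the held-out observations under the covariances an estimator predicts at those time points. This is a simplified sketch under stated assumptions (zero-mean standardized data, a Gaussian likelihood, illustrative names and toy data), not the thesis implementation.

```python
import numpy as np
from scipy.stats import multivariate_normal


def leoo_split(y: np.ndarray):
    """Split an (N, D) time series into train/test halves by leaving every other time point out."""
    train_idx = np.arange(0, len(y), 2)
    test_idx = np.arange(1, len(y), 2)
    return (train_idx, y[train_idx]), (test_idx, y[test_idx])


def mean_test_log_likelihood(y_test: np.ndarray, covs_test: np.ndarray) -> float:
    """Score held-out observations under the covariances predicted at those time points.

    y_test    : (N_test, D) held-out (standardized, zero-mean) observations.
    covs_test : (N_test, D, D) covariance matrices predicted by any TVFC estimator
                at the held-out time points (or a repeated sFC matrix as a baseline).
    """
    log_likelihoods = [
        multivariate_normal(mean=np.zeros(len(y_t)), cov=cov_t).logpdf(y_t)
        for y_t, cov_t in zip(y_test, covs_test)
    ]
    return float(np.mean(log_likelihoods))


# Usage sketch: fit an estimator on the training half, predict covariances at the
# held-out time points, and compare mean test log likelihoods across estimators.
rng = np.random.default_rng(0)
y = rng.standard_normal((490, 4))          # toy (N, D) scan
(_, y_train), (test_idx, y_test) = leoo_split(y)
sfc_cov = np.cov(y_train, rowvar=False)    # static (sFC) baseline fit on the training half
print(mean_test_log_likelihood(y_test, np.repeat(sfc_cov[None], len(test_idx), axis=0)))
```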
Equally insightful and encouraging is that the imputation benchmark generally corresponds to estimation method performance on the concurrent benchmark. @@ -97,17 +95,16 @@ \subsection{Reflection on benchmarking} This is the beauty of the imputation benchmark: it uses real data and thus a real ground truth, and it can be run on any data set without the need for expert labeling or concurrent information. %% -\subsubsection{Sudden changes and change points} -\label{subsec:sudden-changes} +\subsubsection{Sudden changes and change points}\label{subsec:sudden-changes} %% One of the open questions brought to the fore in this thesis concerns \gls{tvfc} change points. We must still consider the possibility that real covariance structures are organized around change points, for which additional benchmarks need to be designed. -This is one of the biggest gaps in the framework at the moment. +This is currently one of the biggest gaps in the framework. It is often assumed that covariance between two brain regions can change and switch rapidly. The prior of the \gls{wp} is that covariance changes slowly. -As we have seen, none of the proposed models are actually capable of picking up on sudden changes in covariance structure. +As we have seen, none of the proposed models are capable of detecting sudden changes in covariance structure. Therefore, the real question is whether such jumps can be expected in real data. As argued by~\textcite{Lindquist2014} as well, if we expect sudden changes in our structure, we may need a different class of models. Another probabilistic approach to modeling \gls{tvfc} was also not able to capture the jagged dynamics of discrete jumps and states~\parencite{Li2019b}. @@ -117,4 +114,4 @@ \subsubsection{Sudden changes and change points} We may take inspiration from \textcite{Saatci2010, Wilson2013} on change point kernels to include in the \gls{wp} models. Other types of neuroimaging data, such as local field potential time series data, often include outliers. -If such outliers cannot be discarded as anomalies, but are expected under the scientific framework to be biologically insightful, this limits methods such as the \gls{wp}. +If such outliers cannot be discarded as anomalies but are expected under the scientific framework to be biologically insightful, this limits methods such as the \gls{wp}. diff --git a/ch/4_TVFC_and_depression/0_Introduction.tex b/ch/4_TVFC_and_depression/0_Introduction.tex index b700087..4a18249 100644 --- a/ch/4_TVFC_and_depression/0_Introduction.tex +++ b/ch/4_TVFC_and_depression/0_Introduction.tex @@ -1,12 +1,11 @@ -\chapter{TVFC and depression} -\label{ch:ukb} +\chapter{TVFC and depression}\label{ch:ukb} %%%%% \info[inline]{Paragraph: Transition from methods development and benchmarking to application.} -In the prior two chapters we discussed and convinced ourselves of how to robustly estimate \gls{tvfc} through the proposed benchmarking framework. -Now we are ready to take the most robust method and apply it on a real world data set to answer a scientific question. +In the prior two chapters we discussed and convinced ourselves of how to robustly estimate \gls{tvfc} through a proposed benchmarking framework. +Now we are ready to take the most robust and best-performing method and apply it to a real-world data set to answer a scientific question. % -In this chapter we investigate whether \gls{tvfc} estimates have predictive power in a clinical setting.
-Specifically, in this study we are interested in how \gls{tvfc} estimates differ between depressed (professionally diagnosed as \gls{mdd} or self-reported) and \gls{hc} individuals. -Such contrasts may shed new light on how this condition affects brain dynamics, and which brain regions are involved (see also \cref{sec:fc-depression}). +This chapter investigates whether \gls{tvfc} estimates have predictive power in a clinical setting. +Specifically, in this study we are interested in how \gls{tvfc} estimates differ between depressed (both professionally diagnosed as \gls{mdd} and self-reported) and \gls{hc} individuals. +Such contrasts may shed new light on how this condition affects brain dynamics, and which brain regions are involved (see also \cref{sec:fc-depression} for background and motivation). In turn, this may inform treatment targets. diff --git a/ch/4_TVFC_and_depression/1_Data_cohorts_and_parcellations.tex b/ch/4_TVFC_and_depression/1_Data_cohorts_and_parcellations.tex index f8775ae..f31a7b2 100644 --- a/ch/4_TVFC_and_depression/1_Data_cohorts_and_parcellations.tex +++ b/ch/4_TVFC_and_depression/1_Data_cohorts_and_parcellations.tex @@ -1,88 +1,98 @@ \clearpage -\section{Data, cohorts, and parcellations} -\label{sec:ukb-data} +\section{Data, cohorts, and parcellations}\label{sec:ukb-data} %%%%% \info[inline]{Paragraph: Broad introduction of data and phenotyping.} -Our data and cohorts are taken from the UK Biobank, a large population study data bank. -We perform a four-by-two analysis, studying four depression phenotypes that define four sets of cohorts and two levels of brain abstractions. +All data and cohorts are taken from the UK Biobank, a large population study data bank. +A four-by-two analysis is performed, studying four depression phenotypes (that define four sets of cohorts) and two levels of brain abstractions. These four phenotypes are: (inpatient) diagnosed lifetime occurrence of \gls{mdd}, self-reported depression lifetime occurrence, self-reported depressive state (at the time of the brain scan), and depression genetic risk as measured by \gls{prs}. -The two abstractions (parcellations) are as a collection of individual (relevant) brain \glspl{roi} and as a superposition of (relevant) \glspl{fn}. -Carefully slicing cohorts in various ways may yield insights beyond a single study. +The two abstractions (i.e.~parcellations) are as a collection of individual (relevant) brain \glspl{roi} and as a superposition of (relevant) \glspl{fn}. +Carefully slicing cohorts in several ways may yield insights beyond a single study. %% \subsection{Data overview} %% \info[inline]{Paragraph: Introduce UK Biobank.} -UK Biobank is a large, actively growing, and publicly available data biobank of 502,486 unique individuals located across the United Kingdom, recruited voluntarily from the general populace~\parencite{Collins2012, Allen2014b}. +The UK Biobank is a large, actively growing, and publicly available data biobank of 502,486 unique individuals located across the United Kingdom, recruited voluntarily from the general populace~\parencite{Collins2012, Allen2014b}. Despite some `healthy volunteer' biases that affect how representative this data is of the entire populace~\parencite[see][]{Fry2017}, it is one of the largest of such kind in the world. -Originally the data was particularly focused on genetics and lifestyle analyses, so most of the mental health information was collected at a later stage or synced through \glspl{ehr}. 
-Questions regarding depressive symptoms (administered on touch-screens) were only added to the initial assessment protocol for the final two recruitment years (for 172,751 participants in total). +% +Originally the data was particularly focused on genetics and lifestyle analyses. +Most of the mental health information was collected at a later stage or synchronized through \glspl{ehr}. +Questions regarding depressive symptoms (administered on touchscreens) were only added to the initial assessment protocol for the final two recruitment years (for 172,751 participants in total). +More generally, given the sheer size and long collection duration, not all data fields are available for all participants. +% In an early descriptive epidemiological study, \textcite{Smith2013c} found \emph{probable} prevalence rates of 6.4\% for single lifetime episode of major depression, 12.2\% for recurrent major depression (moderate), and 7.2\% for recurrent major depression (severe). -They also note that this is in line with other large population studies, thus underscoring the validity and representativeness of this data set (for depression studies at least). +They noted that this is in line with other large population studies, thus underscoring the validity and representativeness of this data set (for depression studies at least). The richness in data included in this biobank presents an unprecedented opportunity to understand the interaction of mood disorders such as \gls{mdd} with genetic, lifestyle, and environmental risk factors and influences. -All data used in this work has been fetched on the 1st of March, 2021. +All data used in this work was fetched on the 1st of March 2021. \info[inline]{Paragraph: Provide overview of rs-fMRI data.} -We limit this study to \gls{rs-fmri} data. -Our data fetch contains 44,083 participants with \gls{rs-fmri} data available (out of a total of 502,486 unique UK Biobank individuals).\footnote{At the time of writing, plans are on the way to get to 100,000 scanned participants~\parencite{Littlejohns2020}. As we are working with a live, active data set, we plan to re-run all analyses when more data becomes available.} +This study is limited to \gls{rs-fmri} data. +The data fetch contains 44,083 participants with \gls{rs-fmri} data available (out of a total of 502,486 unique UK Biobank individuals).\footnote{At the time of writing, plans are on the way to get to 100,000 scanned participants~\parencite{Littlejohns2020}. As we are working with a live, active data set, we plan to re-run all analyses when more data becomes available.} Data collection was standardized across scanning facilities. All source images were acquired with a voxel resolution of $2.4 \times 2.4 \times 2.4$ mm and a \gls{te} of 39 ms, for a duration of 6 minutes and a \gls{tr} of 0.735 seconds, resulting in $N = 490$ volumes per scan (for the majority of participants). -Participants with fewer volumes than this were discarded, and for those with more volumes than this we truncated the time series to this length. +Participants with fewer volumes than this were discarded. +For those with more volumes than this, the time series were truncated to this length. +Data preprocessing was done by Richard Bethlehem and his team at the Department of Psychiatry. \info[inline]{Paragraph: Describe data collection timeline.} UK Biobank participants were recruited and attended an initial assessment visit between 2006 and 2010.
All participants were aged between 40 and 69 at the time of recruitment (note how this contrasts with the young adult participants from the \gls{hcp} data as studied in \cref{ch:benchmarking}). The initial baseline assessment (codified as Instance 0) included basic health data collection through a touchscreen questionnaire as well as a verbal interview~\parencite{Bycroft2018}. +% Some of these participants were invited several years later to repeat this assessment (codified as Instance 1). These visits happened in 2012 and 2013. -A subset of all original participants was then invited for a second follow-up visit, which included an \gls{rs-fmri} scan. +% +A subset of all original participants was then invited for a second follow-up visit (codified as Instance 2), which included an \gls{rs-fmri} scan. These visits started in 2014 and are still ongoing. -Some participants were asked to do a repeat imaging visit as well (starting in 2019 and still ongoing as well). -In this study we only use the scans from the first imaging visit (codified as Instance 2), resulting in a single \gls{rs-fmri} scan per participant in our data set. -A follow-up \gls{mhq} was sent out to participants as well to expand the potential of the UK Biobank data with mental health~\parencite{Davis2020, Glanville2021}. +% +Some participants were asked to do a repeat imaging visit (starting in 2019 and still ongoing). +% +This study only uses the scans from the first imaging visit (Instance 2), resulting in a single \gls{rs-fmri} scan per participant in our data set. +A follow-up \gls{mhq} was sent out to participants to expand the potential of the UK Biobank data for mental health research~\parencite{Davis2020, Glanville2021}. Participant responses from this online \gls{mhq} were collected in the second half of 2016. \info[inline]{Paragraph: Introduce ICD-10 diagnoses from electronic health records.} -Inpatient hospital \glspl{ehr}, including \gls{icd}~\parencite{WHO1992} diagnoses, have been linked to the biobank as well.\footnote{Throughout this thesis only the 10th revision of these codes is used.} -We note that this data was only available for 17,442 out of the 21,675 participants with available \gls{rs-fmri} and that met the other general prerequisites described below. +Inpatient hospital \glspl{ehr}, including \gls{icd}~\parencite{WHO1992} diagnoses, have been linked to the biobank.\footnote{Throughout this thesis only the 10th revision of these codes is used.} +However, this data was only available for 17,442 out of the 21,675 participants with available \gls{rs-fmri} data who also met the other general prerequisites described below. %% -\subsection{Cohort stratification} -\label{subsec:cohort-stratification} +\subsection{Cohort stratification}\label{subsec:cohort-stratification} %% \info[inline]{Paragraph: Provide overview of cohort stratification.} Here we describe how we define depression and how we construct our cohorts (i.e.~groups with a shared defining characteristic of interest) for each of the four depression phenotypes. \info[inline]{Paragraph: Describe general participant filters.} -Before going into specific depression phenotype definitions, we ran several general filters across all participants. -Firstly, we only select participants between 40 and 64 years old (at the time when the scan was taken), to avoid including co-morbidities to do with old age. -This reduced the number of (broadly) eligible participants to include from 44,083 to 21,877.
-Secondly, we follow \textcite{Howard2020} and filter out any participants that had been diagnosed with schizophrenia, a personality disorder, and/or bipolar disorder. -These diagnoses were taken from \gls{icd} data fields as well as Data-Field 20544. -This further reduced the number of eligible participants to 21,675. +Before going into specific depression phenotype definitions, several general filters were run across all participants. +% +Firstly, we only select participants between~40 and~64 years old (at the time when the scan was taken), to avoid including co-morbidities and changes in brain structure and function to do with old age.\footnote{There is a trade-off between sample size and sample homogeneity in this case.} +This reduced the number of (broadly) eligible participants to include from~44,083 to~21,877. +Secondly, following \textcite{Howard2020}, any participant that had been diagnosed with schizophrenia, a personality disorder, and/or bipolar disorder was filtered out. +These diagnoses were taken from \gls{icd} data fields as well as Data-Field~20544. +This further reduced the number of eligible participants to~21,675. Another factor to consider are cardiovascular disorders~\parencite{Whooley2013}, such as hypertension. \Gls{bold} signals are based on blood flow, so such conditions may bias our findings. -We do not use this information in our cohort stratification. -General clinical and demographic characterics of all such eligible participants are shown in \cref{tab:ukbiobank-cohorts}. +However, this information is not used in our cohort stratification. +General clinical and demographic characteristics of all such broadly eligible participants are shown in \cref{tab:ukbiobank-cohorts}. \info[inline]{Paragraph: Describe importance of careful stratification.} After applying these general filters, the next step toward our final data sets is to select and divide participants into cohorts to be contrasted in our analysis. As we shall see, this is a non-trivial task that requires several assumptions and heavily influences the scope of conclusions we can make about the relationship between depression and the brain. As discussed in \cref{sec:fc-depression}, depression is a clinically heterogeneous condition, and multiple definitions and subtypes exist~\parencite[see also][]{Fried2022}. -Common symptoms include negative bias, adhedonia, impaired social cognition, and reduced motivation and behavioral responses. +Common symptoms include negative bias, anhedonia, impaired social cognition, and reduced motivation and behavioral responses. Furthermore, it can be considered on a continuous scale of intensity, instead of just a binary classification. We argue that looking at a wider range of phenotypes paints a fuller picture. -Based on the available data, we look at four depression phenotypes: diagnosed lifetime occurrence, self-reported lifetime occurrence, -self-reported depressive episode/state while the participant was in the scanner, and \glspl{prs}. +Based on the available data, we look at four depression phenotypes: diagnosed lifetime occurrence, self-reported lifetime occurrence, self-reported depressive episode/state while the participant was in the scanner, and \glspl{prs}. All general cohort characteristics are summarized in \cref{tab:ukbiobank-cohorts}. -Neuroticism, a personality trait strongly correlated with mood disorders, scores were derived from a list of questions (on the touchscreen at the assessment centre). 
+% +Scores for neuroticism, a personality trait strongly correlated with mood disorders~\parencite{Goldstein2014}, have been derived from a list of questions (on the touchscreen at the assessment center).\footnote{Personality traits are generally considered to be stable across adulthood.} These scores were missing for about 1 out of 7 participants. -Depressed cohorts generally score much higher on neuroticism. -They have higher BMI as well, and are more educationally and materially deprived (although the spread in scores is large). -This is not the case for the \gls{prs} cohorts, which may be interpreted as genetics not playing a dominant role in these outcomes. +Our depressed cohorts score much higher on average on neuroticism ($p < .001$). +They also have a higher \gls{bmi} and are more educationally and materially deprived (although the spread in scores is large). +This is not the case for the \gls{prs} cohorts. +Genetics may not play a dominant role in these outcomes. \input{tab/ukb_cohorts} @@ -91,13 +101,15 @@ \subsubsection{Diagnosed lifetime occurrence (depressive trait analysis)} %% \info[inline]{Paragraph: Describe diagnosed lifetime occurrence phenotype.} -For this first lifetime occurrence (a.k.a.~lifetime \emph{history} or \emph{instance}) phenotype we select two cohorts from all eligible participants: an \gls{mdd} cohort and a \gls{hc} cohort. +For this first lifetime occurrence (a.k.a.~\emph{history} or \emph{instance}) phenotype two cohorts are selected from all eligible participants: an \gls{mdd} cohort and a \gls{hc} cohort. This cohort is based on medical diagnoses of \gls{mdd}, which can be found in Data-Field 41270. -We select participants that have at any point in their lives been diagnosed with \gls{icd} codes F320-F323, F328-F329 (single depressive episodes), F330-F334, F338, and/or F339 (recurrent depressive episodes). +We select participants that have at any point in their lives been diagnosed with \gls{icd} codes F320--F323, F328--F329 (single depressive episodes), F330--F334, F338, and/or F339 (recurrent depressive episodes). As such we do not distinguish between single or recurrent episodes (and thus depression severity). The control cohort is defined as having no such past diagnosis as well as not self-reporting any depression (both during visits and in the follow-up \gls{mhq}). -The male/female ratios of these cohorts show a large discrepancy.\footnote{Higher reported depression incidence for women was to be expected~\parencite{Albert2015, Bogren2018}. The prevalence of depression decreases after the age of 65, however, and becomes similar across sex~\parencite{Bebbington2003}. This is likely influenced by female prevelance of depression peaking around hormononal changes (puberty, prior to menstruation, following pregnancy, and at perimenopause).} -Therefore, the two cohorts are matched not only in size, but also in sex ratio. +Moreover, participants that were taking antidepressants were filtered out from the \gls{hc} cohort. +% +The male/female ratios of these cohorts show a large discrepancy.\footnote{Higher reported depression incidence for women was to be expected~\parencite{Albert2015, Bogren2018}. The prevalence of depression decreases after the age of 65, however, and becomes similar across sex~\parencite{Bebbington2003}. 
This is likely influenced by female prevalence of depression peaking around hormonal changes (puberty, prior to menstruation, following pregnancy, and perimenopause).} +Therefore, the control cohorts are subsampled to match the depressed cohort not only in size, but also in sex ratio. %% \subsubsection{Self-reported lifetime occurrence (depressive trait analysis)} @@ -107,38 +119,40 @@ \subsubsection{Self-reported lifetime occurrence (depressive trait analysis)} For this second lifetime occurrence phenotype we again select two cohorts from all eligible participants: a depressed cohort (we avoid the term \gls{mdd} here due to a lack of professional diagnosis) and a \gls{hc} cohort. We broadly follow \textcite{Howard2020} for this analysis and use the self-reported lifetime instance depression phenotype definition based on the \gls{cidi-sf} \parencite{Kessler1998} as described and defined by \textcite{Davis2020}.\footnote{The scoring criteria from \textcite{Davis2020} are equivalent to the \gls{dsm} criteria for \gls{mdd}. See \cref{sec:fc-depression} for more details.} This inventory was part of the follow-up \gls{mhq} sent out to participants. -We again note that this phenotype indicates a \emph{lifetime} instance measure of depression, and does not distinguish between a single or multiple past depressive episodes. +We again note that this phenotype indicates a \emph{lifetime} instance measure of depression. +It does not distinguish between a single or multiple past depressive episodes. After selecting eligible participants (those that completed this questionnaire), we were left with only 14,843 participants.\footnote{This highlights a core problem with self-reported phenotypes; they introduce selection bias.} \info[inline]{Paragraph: Discuss inherent data limitations.} -Note that the relevant online follow-up questionnaires were sent out to participants with valid email addresses and completed in 2016, whereas the scans were taken any time between 2014 and 2018. +The relevant online follow-up questionnaires were sent out to participants with valid email addresses and completed in 2016, whereas the scans were taken any time between 2014 and 2018. This means that some participants filled it out before the scan, whereas others did so afterwards. -Consequently, some people may have gotten depressed for the first time after filling out the questionnaire, but before or during their scan. +Consequently, some individuals may have gotten depressed for the first time after filling out the questionnaire, but before or during their scan. Moreover, for those who reported ever having been depressed, some may have been so while in the scanner, whereas for others it was a long time ago. -Unfortunately we have no surefire way of separating out these groups. -Here it is also important to note that we end up perhaps studying depression-like \emph{traits} (i.e.~individual susceptibility over longer periods of time as evidenced by history) instead of mental \emph{states} (i.e.~currently affected and experiencing depressive episode) of participants during the scan. +Unfortunately, we have no surefire way of separating out these groups. +Here it is also important to note that we end up perhaps studying depression-like \emph{traits} (i.e.~individual susceptibility over longer periods of time as evidenced by history) instead of mental \emph{states} (i.e.~currently affected and experiencing a depressive episode) of participants during the scan. 
\info[inline]{Paragraph: Describe CIDI-SF phenotype definition.} -The \gls{cidi-sf} definition of a lifetime instance of depression requires at least one positive answer in the two \emph{core} symptoms in Data-Fields 20441 and 20446 (see \cref{tab:CIDI-SF-Data-Fields}). -Additionally, it requires at least four out of six \emph{non-core} symptoms (some or a lot of impairment) from Data-Fields 20435, 20437, 20449, 20450, 20532, and/or 20536. +The \gls{cidi-sf} definition of a lifetime instance of depression requires at least one positive answer in the two \emph{core} symptoms in Data-Fields~20441 and~20446 (see \cref{tab:CIDI-SF-Data-Fields}). +Additionally, it requires at least four out of six \emph{non-core} symptoms (some or a lot of impairment) from Data-Fields~20435, 20437, 20449, 20450, 20532, and/or~20536. These non-core symptoms are based on follow-up questions that were only asked if participants indicated they had at least one core symptom. \input{tab/ukb_cidi_sf} \info[inline]{Paragraph: Describe our (stricter) phenotype definition.} -Our goal is to have a strong contrast between our two cohorts. +Our goal is to have a stark contrast between our two cohorts. Therefore, our definition is even stricter than the one used by \textcite{Howard2020}. We require participants for the depressed cohort to report \emph{both} core symptoms. Furthermore, we require them to have \emph{all} six non-core symptoms. We also use the second phenotype discussed in \textcite{Howard2020} to further narrow down our depressed cohort. This \emph{help-seeking} phenotype is based on whether a participant has ever sought help from a \gls{gpx} or a psychiatrist (Data-Fields 2090 and 2100, respectively) for nerves, anxiety, tension, or depression. -This was asked at each of the--up to four--participant visits to the assessment center. -To make our depression phenotype more strict, we also required depressed cohort participants to have visited a \gls{gpx} (but a psychiatrist visit was optional due to its rarity). +This was asked at each of the up to four participant visits to the assessment center. +To make our depression phenotype stricter, we also required depressed cohort participants to have visited a \gls{gpx}. +A psychiatrist visit was optional due to its rarity. \info[inline]{Paragraph: Describe our control cohort phenotype definition.} -\gls{hc} participants were selected on having neither core symptoms, as well as never having visited a \gls{gpx} nor psychiatrist for the above-mentioned reasons. -Following \textcite{Glanville2021}, any participant that endorsed any condition in Data-Field 20544 is also excluded from the \gls{hc} cohort. -Moreover, participants that were on anti-depressants during any of the assessment centre visits were excluded. +\gls{hc} participants were required to have neither of the core symptoms and to have never visited a \gls{gpx} or a psychiatrist for the above-mentioned reasons. +Following \textcite{Glanville2021}, any participant that endorsed any condition in Data-Field~20544 is also excluded from the \gls{hc} cohort. +Moreover, participants that were on anti-depressants during any of the assessment center visits were excluded. \info[inline]{Paragraph: Describe final cohorts and sex ratio matching.} In the end, we obtained 979 depressed participants and 4,944 \gls{hc} participants with these criteria.
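To make the selection logic of this stricter self-reported phenotype concrete, a minimal sketch is given below. The Data-Field numbers are those referenced in the text, but the boolean column encoding, the helper column names, and the function itself are illustrative assumptions; real UK Biobank fields use coded values across multiple instances.

```python
import pandas as pd

CORE = ["20441", "20446"]                                           # CIDI-SF core symptoms
NON_CORE = ["20435", "20437", "20449", "20450", "20532", "20536"]   # CIDI-SF non-core symptoms


def stratify_self_reported(df: pd.DataFrame):
    """Split eligible participants into depressed and HC cohorts.

    Assumes one row per participant and boolean columns named after the
    Data-Fields above, plus illustrative helper columns:
      'gp_help'         -- ever sought GP help (Data-Field 2090)
      'psych_help'      -- ever sought psychiatrist help (Data-Field 2100)
      'mhq_condition'   -- endorsed any condition in Data-Field 20544
      'antidepressants' -- on antidepressants at any assessment centre visit
    Real UK Biobank fields use coded values over multiple instances; this
    simplification only shows the logic of the stricter definition.
    """
    both_core = df[CORE].all(axis=1)
    all_non_core = df[NON_CORE].all(axis=1)
    depressed = df[both_core & all_non_core & df["gp_help"]]  # psychiatrist visit optional

    hc = df[
        ~df[CORE].any(axis=1)
        & ~df["gp_help"] & ~df["psych_help"]
        & ~df["mhq_condition"] & ~df["antidepressants"]
    ]
    return depressed, hc
```

Sex-ratio matching of the control cohort, as described in the surrounding paragraphs, would then amount to a subsampling step on the returned HC frame.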
@@ -153,10 +167,10 @@ \subsubsection{Self-reported depressive state analysis} %% \info[inline]{Paragraph: Describe self-reported depressive state phenotype.} -For the depressive state we use self-reported depressed states at the time of the \gls{rs-fmri} scan. +For the depressive state analysis, we use self-reported depressed states at the time of the \gls{rs-fmri} scan. This is found in Data-Field 20002. This phenotype differs from the previous two in the sense that we know participants reported being depressed while in the scanner. -As such the addition of this phenotype allows us to investigate the difference between depressive episodes or lifetime history (e.g.~through its lasting impact or general susceptibility differences). +As such, the addition of this phenotype allows us to investigate the difference between (current) depressive episodes and lifetime history (e.g.~through its lasting impact or general susceptibility differences). The respective \gls{hc} cohort was defined as not reporting said depressive state. % Sex ratios were again matched across cohorts. @@ -172,7 +186,7 @@ \subsubsection{Polygenic risk score (depression risk analysis)} The main underlying algorithm used was \texttt{PRSice2}~\parencite{Choi2019}.\footnote{Code: \url{https://choishingwan.github.io/PRSice}} Importantly, these scores were generated using a data set independent from the UK Biobank. These \gls{prs} are only available for a subset of our participants. -We divide these into three equally-sized groups: high, medium, and low risk (3,775 subjects per cohort). +We divide these into three equally sized groups: high, medium, and low risk (3,775 subjects per cohort). \info[inline]{Paragraph: Describe caveats of using the polygenic risk scores phenotype.} Subjects with high \glspl{prs} may well have never experienced any depressive episode, and vice versa. @@ -181,11 +195,11 @@ \subsubsection{Polygenic risk score (depression risk analysis)} As before, changing our depression phenotype inherently changes our study focus and scope of subsequent conclusions. \info[inline]{Paragraph: Discuss overlap between cohorts.} -As such, how much correspondence is there between this genetics-based phenotype and the three beforementioned (diagnosed and self-reported, based on actual life experiences and symptoms) depression phenotype definitions? +As such, how much correspondence is there between this genetics-based phenotype and the three aforementioned (diagnosed and self-reported, based on actual life experiences and symptoms) depression phenotype definitions? We verify (and validate) our cohorts by plotting the \glspl{prs} per cohort for both diagnosed and self-reported phenotypes (see \cref{fig:ukb-lifetime-occurrence-pgs}). -As expected, participants of the other three depression phenotypes generally have higher mean depression \glspl{prs} than \glspl{hc} (two-sample $t$-test: Cohen $d = 0.28$, $t(1059) = 4.56$, $p < .001$; Cohen $d = 0.35$, $t(1421) = 6.52$, $p < .001$; and Cohen $d = 0.26$, $t(2468) = 6.49$, $p < .001$, respectively). -\textcite{Glanville2021} also finds that there is a correlation between the \gls{cidi-sf} (self-reported lifetime occurrence) measure and \gls{prs}. -Moreover, we are in agreement with \textcite{Cai2020}, who found that the \gls{cidi-sf} depression phenotype had the \emph{strongest} genetic contribution.
+As expected, participants of the other three depression phenotypes have higher mean depression \glspl{prs} than \glspl{hc} (two-sample $t$-test: Cohen $d = 0.28$, $t(1059) = 4.56$, $p < .001$; Cohen $d = 0.35$, $t(1421) = 6.52$, $p < .001$; and Cohen $d = 0.26$, $t(2468) = 6.49$, $p < .001$, respectively). +\textcite{Glanville2021} also found that there is a correlation between the \gls{cidi-sf} (i.e.~self-reported lifetime occurrence) measure and \gls{prs}. +Moreover, we are in agreement with \textcite{Cai2020}, who found that this \gls{cidi-sf} depression phenotype has the \emph{strongest} genetic contribution. \begin{figure}[t] @@ -193,9 +207,8 @@ \subsubsection{Polygenic risk score (depression risk analysis)} \includegraphics[width=\textwidth]{fig/ukbiobank/PRS_all_analyses_per_cohort_joint} \caption{ UK Biobank depression study distribution of polygenic risk scores for three depression phenotype cohorts. - Scores are higher for the depressed cohorts compared to the HC cohorts. - } - \label{fig:ukb-lifetime-occurrence-pgs} + Scores are significantly higher for the depressed cohorts compared to the HC cohorts. + }\label{fig:ukb-lifetime-occurrence-pgs} \end{figure} @@ -206,14 +219,15 @@ \subsection{Brain regions of interest} \info[inline]{Paragraph: Frame brain region of interest selection.} After selecting our cohorts, we need to define and decide on how to characterize brain nodes. We have opted to study a selection of depression-relevant, anatomically defined brain \glspl{roi}.\improvement{Motivate more why we only look at a selection of brain regions} -We will base this selection on the current understanding of the neurobiology and neurological basis of depression, as well as pick brain regions that have been the subject of other \gls{fc} studies that looked at the \emph{interaction} between individual brain regions. +We will base this selection on the current understanding of the neurobiology and neurological basis of depression. +Moreover, we pick brain regions that have been the subject of other \gls{fc} studies that looked at the \emph{interaction} between individual brain regions. The latter will also allow us to validate and contrast our findings within the existing body of work, both \gls{sfc} and \gls{tvfc}. \info[inline]{Paragraph: Describe brain regions generally involved with depression.} As reviewed in \cref{sec:fc-depression}, depression is known to primarily affect brain regions involved with mood and reward processing~\parencite{Pandya2012}. Here we continue this review and motivate in more detail why we study the regions we do. % -An influential concept in neuroanatomy is that the human brain is in fact made up out of three brains. +An influential concept in neuroanatomy is that the human brain is in fact made up of three brains. The concept of this \emph{triune} brain was introduced by \textcite{Maclean1985}. It posits from an evolutionary perspective that the human brain consists of a primal (`reptilian'; including basal ganglia and brain stem structures that help with the `plumbing' and regulatory side of bodily homeostasis), limbic (`mammalian' or `emotional'; involved with critical emotional skills required for social animals), and neomammalian (`rational'; responsible for higher and more complex cognitive function and regulation of emotions) brain.\footnote{The terms `reptilian' and `mammalian' should, of course, not be taken literally. 
Reptiles were never ancestors to mammals; our evolutionary lines diverged over 300 million years ago~\parencite{Striedter2019}.} The primal and limbic systems are more ancient than the (evolutionarily speaking) newer neocortex, and their physiology and anatomy is therefore categorically different as well. @@ -221,65 +235,67 @@ \subsection{Brain regions of interest} Theories of depression often pertain to such functional descriptions, often suggesting aberrant and dysregulated emotional, limbic, and reward processing~\parencite{Akiskal1973}. It makes intuitive sense that depression would affect the regions and circuits involved with these functions, as opposed to the visual cortex, for example (although even such regions may very well be affected). Continuing this tradition, modern computational psychiatry approaches seeking more mechanistic explanations have imposed \gls{rl} concepts such as `utility', `reward', and `value' onto the brain and suggested implementational theories of mood disorders in the human brain~\parencite{Huys2013, Chen2015, Eldar2016, Juechems2019, Bennett2020, Bennett2021}.\footnote{\textbf{Computational psychiatry} broadly refers to computational, model-based approaches to understanding psychiatric illness~\parencite{Montague2012, Adams2016, Radulescu2019, Huys2021}.} -A large proportion of the brain is, in fact, involved in processing value functions. +A substantial proportion of the brain is, in fact, involved in processing value functions. However, key brain regions that have been proposed to be involved in such studies often include the basal ganglia and frontal areas such as the (medial) \gls{ofc}, \gls{vmpfc}, \gls{mpfc}, and (dorsal) \gls{acc}~\parencite{Lee2012}. +% Yet another source of inspiration comes from applications of the free-energy principle to depression~\parencite{Chekroud2015}. The idea here is that if the brain does indeed build a generative model of the world, then this will include beliefs about both the external (e.g. intuitive physics) and internal world (e.g. beliefs about agency and helplessness). Depressive beliefs can then be viewed as those negatively biasing predictions. Despite these advances, however, it is still unclear what brain regions and networks are exactly involved, and how they may be affected. \info[inline]{Paragraph: Describe brain regions generally involved with depression (continued).} -Generally, the \gls{pfc} (shown in \cref{fig:roi-mpfc}) has been found to be most consistently impaired with \gls{mdd}~\parencite{George1994, Pizzagalli2021}. +The \gls{pfc} has been found to be most consistently impaired with \gls{mdd}~\parencite{George1994, Pizzagalli2021}. Many subregions of the \gls{pfc} have been individually studied and implicated. Generally, the \gls{pfc} can be divided into a lateral and a ventromedial part. -Moreover, brain regions related to emotional (e.g. limbic system constituents) and reward processing are robustly found to be impacted. -Therefore, the particular brain \glspl{roi} involved with depression that are included in this work are the \gls{amg}~\parencite{Dannlowski2009, Kong2013, Connolly2017, Zhang2020}, \gls{hpc}, \gls{pha}, \gls{ai}~\parencite{Avery2014, Kandilarova2018}, \gls{ofc}~\parencite{Rolls2020}, \gls{pcc}, \gls{dlpfc}, and \gls{acc}~\parencite{Drevets2008} and \gls{mpfc}~\parencite{Pizzagalli2021}. +Moreover, brain regions related to emotional (e.g.~limbic system constituents) and reward processing are robustly found to be impacted. 
+Therefore, the brain \glspl{roi} involved with depression that are included in this work are the \gls{amg}~\parencite{Dannlowski2009, Kong2013, Connolly2017, Zhang2020}, \gls{hpc}, \gls{pha}, \gls{ai}~\parencite{Avery2014, Kandilarova2018}, \gls{ofc}~\parencite{Rolls2020}, \gls{pcc}, \gls{dlpfc}, \gls{acc}~\parencite{Drevets2008}, and \gls{mpfc}~\parencite{Pizzagalli2021}. We will briefly describe what we know about these brain regions before we move on. The \gls{amg} is an almond-shaped, complex subcortical brain region that is part of the limbic system and is critical in regulating motivation and responses to both rewarding and aversive stimuli~\parencite{Nestler2002}. It was first described by Karl Friedrich Burdach in 1822~\parencite{Burdach1826}. -It is located in the midbrain, next to the \gls{hpc}, and is connected to many other brain regions. +It lies deep in the medial temporal lobe, next to the \gls{hpc}, and is connected to many other brain regions. As a brain structure it is made up of about 13 nuclei.\footnote{In neuroanatomy, a \textbf{nucleus} refers to any cluster of neurons, where such neurons have similar functions and connections to other nuclei.} The \gls{amg} is especially involved in the processing of emotions and memories related to fear, threats, aggression, and pain~\parencite{Thompson2017b}. -It also assigns value and emotional meaning to particular memories and decisions. +It also assigns value and emotional meaning to memories and decisions. This makes it a prime candidate among the brain regions relevant to depression. It is theorized that the \gls{amg} is dysregulated and hyperactive in patients with \gls{mdd}. Similarly, the \gls{amg} has been found to be hyperactive in \gls{ptsd} patients, where emotional experiences from memories do not fade over time and keep their original (traumatic) impact. It has been found that the \gls{amg} has decreased connectivity with a range of other brain regions with \gls{mdd}~\parencite{Tang2013, Ramasubbu2014}. Even though the \gls{amg} is known to have three functionally distinct subdivisions, we consider it as one whole brain region here. The \gls{amg} is a relatively small brain region. -\textcite{Brabec2010} found the average \gls{amg} in their sample to be 1,240-1,630~$mm^3$ in size per hemisphere (depending on measurement method, with no significant interhemisphere or intersex differences). +\textcite{Brabec2010} found the average \gls{amg} in their sample to be 1,240--1,630~$mm^3$ in size per hemisphere (depending on measurement method, with no significant interhemisphere or intersex differences). The volume of the \gls{amg} has been shown to shrink with recurrent major depression~\parencite{Sheline1998}.
\begin{figure}[t] \centering - \subcaptionbox{Amygdala (AMG) \label{fig:roi-amg}}{\includegraphics[width=0.37\textwidth]{fig/brain_regions/Amygdala/ho_joint}} + \subcaptionbox{Amygdala (AMG)\label{fig:roi-amg}}{\includegraphics[width=0.37\textwidth]{fig/brain_regions/Amygdala/ho_joint}} \hspace{0.08\textwidth} - \subcaptionbox{Hippocampus (HPC) \label{fig:roi-hpc}}{\includegraphics[width=0.37\textwidth]{fig/brain_regions/Hippocampus/ho_joint}} - \subcaptionbox{Parahippocampal area (PHA) \label{fig:roi-pha}}{\includegraphics[width=0.37\textwidth]{fig/brain_regions/PHA}} + \subcaptionbox{Hippocampus (HPC)\label{fig:roi-hpc}}{\includegraphics[width=0.37\textwidth]{fig/brain_regions/Hippocampus/ho_joint}} + \subcaptionbox{Parahippocampal area (PHA)\label{fig:roi-pha}}{\includegraphics[width=0.37\textwidth]{fig/brain_regions/PHA}} \hspace{0.08\textwidth} - \subcaptionbox{Anterior insula (AI) \label{fig:roi-insula}}{\includegraphics[width=0.37\textwidth]{fig/brain_regions/AI}} - \subcaptionbox{Orbitofrontal cortex (OFC) \label{fig:roi-ofc}}{\includegraphics[width=0.37\textwidth]{fig/brain_regions/OFC}} + \subcaptionbox{Anterior insula (AI)\label{fig:roi-insula}}{\includegraphics[width=0.37\textwidth]{fig/brain_regions/AI}} + \subcaptionbox{Orbitofrontal cortex (OFC)\label{fig:roi-ofc}}{\includegraphics[width=0.37\textwidth]{fig/brain_regions/OFC}} \hspace{0.08\textwidth} - \subcaptionbox{Posterior cingulate cortex (PCC) \label{fig:roi-pcc}}{\includegraphics[width=0.37\textwidth]{fig/brain_regions/PCC}} - \subcaptionbox{Dorsolateral PFC (dlPFC) \label{fig:roi-dlpfc}}{\includegraphics[width=0.37\textwidth]{fig/brain_regions/dlPFC}} + \subcaptionbox{Posterior cingulate cortex (PCC)\label{fig:roi-pcc}}{\includegraphics[width=0.37\textwidth]{fig/brain_regions/PCC}} + \subcaptionbox{Dorsolateral PFC (dlPFC)\label{fig:roi-dlpfc}}{\includegraphics[width=0.37\textwidth]{fig/brain_regions/dlPFC}} \hspace{0.08\textwidth} - \subcaptionbox{Anterior cingulate cortex (ACC) \label{fig:roi-acc}}{\includegraphics[width=0.37\textwidth]{fig/brain_regions/ACC}} - \subcaptionbox{Medial PFC (mPFC) \label{fig:roi-mpfc}}{\includegraphics[width=0.37\textwidth]{fig/brain_regions/mPFC}} + \subcaptionbox{Anterior cingulate cortex (ACC)\label{fig:roi-acc}}{\includegraphics[width=0.37\textwidth]{fig/brain_regions/ACC}} + \subcaptionbox{Medial PFC (mPFC)\label{fig:roi-mpfc}}{\includegraphics[width=0.37\textwidth]{fig/brain_regions/mPFC}} + % \hspace{0.08\textwidth} + % \subcaptionbox{V1 \label{fig:roi-v1}}{\includegraphics[width=0.38\textwidth]{fig/hcp/RSN_components/RSN_01}} \caption{ Brain ROIs for UK Biobank depression study. AMG and HPC are extracted from Harvard-Oxford atlas. Other regions are extracted from HCP-MMP1.0 parcellation. - } - \label{fig:ukb-brain-regions} + }\label{fig:ukb-brain-regions} \end{figure} The \gls{hpc} is another subcortical region, located deep in the temporal lobe, and is also part of the limbic system. It has long been known to be involved with learning, memory, and replay of memories (consolidation). More recently it has also been implicated in emotional behavior and spatial navigation. -The \gls{hpc} is a relatively vulnerable brain region, and is one of the earlier and most severly affected brain areas with neurodegenerative disorders such as \gls{ad}. +The \gls{hpc} is a vulnerable brain region, and is one of the earliest and most severely affected brain areas in neurodegenerative disorders such as \gls{ad}. Hippocampal volumes differ across hemispheres.
\textcite{McHugh2007} found human hippocampal volumes of 3,480~$\pm$~430~$mm^3$ and 3,680~$\pm$~420~$mm^3$ for left and right \gls{hpc}, respectively. Macaque primate as well as human studies have found decreased hippocampal volume with depression~\parencite{Campbell2004, Malykhin2010, Brown2014, Schmaal2016}. @@ -292,7 +308,7 @@ \subsection{Brain regions of interest} \textcite{Aminoff2013} proposed that the \gls{pha} processes \emph{contextual associations}. It becomes more active in \gls{fmri} studies when individuals are shown images of `places' and `situations'~\parencite{Epstein1998}. \textcite{Megevand2014} showed that stimulation of this region triggered hallucinations of such visuals. -More recent work showed that is also processes \emph{social} context. +More recent work showed that it also processes \emph{social} context. The insula (also known as the insular cortex) is part of the \gls{sn} and the limbic system~\parencite{Uddin2017}. It is known to be involved with modulation of emotional processing. @@ -315,7 +331,7 @@ \subsection{Brain regions of interest} The \gls{dlpfc} is part of the lateral \gls{pfc} and is differentiated functionally rather than anatomically. The \gls{dlpfc} is affected with \gls{mdd} as well, showing lower metabolism~\parencite{Pandya2012}. -This region is mostly linked to executive function (e.g. planning, reasoning, cognitive flexibility~\parencite{Dajani2015}, and working memory). +This region is mostly linked to executive function (e.g.~planning, reasoning, cognitive flexibility~\parencite{Dajani2015}, and working memory). It also plays a role in mood regulation. The \gls{acc} is involved with salience and attention, as well as management of pain and emotions. @@ -328,13 +344,13 @@ \subsection{Brain regions of interest} The \gls{mpfc} is another central node in the \gls{dmn} and has been linked to self-referential thought~\parencite{Gusnard2001}. In our study, only the \gls{bold} time series from these nine brain regions are considered. -All brain regions of interest (as they are actually implemented in this study) are illustrated in \cref{fig:ukb-brain-regions}. +All brain regions of interest (as they are implemented in this study) are illustrated in \cref{fig:ukb-brain-regions}. Interactive three-dimensional brain region plots (in Jupyter Notebooks) are provided upon publication. \info[inline]{Paragraph: Describe brain region parcellation.} We use the \gls{hcp} Multi-Modal Parcellation (MMP) 1.0 brain region parcellation~\parencite{Glasser2016}. This atlas contains 180 regions per hemisphere. -We merge all brain regions across both hemispheres, assuming lateralization does not play a major role. +We merge all brain regions across both hemispheres, assuming lateralization does not play a significant role. Each of the brain regions of interest consists of a collection of subregions from this atlas. We take a weighted (by number of voxels per subregion) average to obtain our final time series. Note that parcellation choices are often controversial; see also \textcite{Arslan2018, Bryce2021} for a comparison of parcellation methods. @@ -342,7 +358,7 @@ \subsection{Brain regions of interest} \input{tab/ukb_roi} -An example of extracted node time series for a single subject in shown in \cref{fig:ukb-example-time-series}. +An example of extracted node time series for a single subject is shown in \cref{fig:ukb-example-time-series}.
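As a minimal sketch of the voxel-weighted averaging step described above (array shapes and variable names are assumptions for illustration; the actual extraction code is part of the accompanying repository):

```python
import numpy as np


def roi_time_series(subregion_ts: np.ndarray, n_voxels: np.ndarray) -> np.ndarray:
    """Collapse atlas subregion time series into a single ROI time series.

    subregion_ts : array of shape (n_subregions, n_time_steps), one row per HCP-MMP1.0 subregion.
    n_voxels     : array of shape (n_subregions,), voxel count of each subregion.
    Returns the voxel-count-weighted average, an array of shape (n_time_steps,).
    """
    weights = n_voxels / n_voxels.sum()
    return weights @ subregion_ts


# Illustrative usage with random data: 3 subregions, 490 time steps.
rng = np.random.default_rng(seed=0)
example = roi_time_series(rng.standard_normal((3, 490)), np.array([120, 80, 40]))
```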
\begin{figure}[t] @@ -350,8 +366,7 @@ \subsection{Brain regions of interest} \includegraphics[width=\textwidth]{fig/ukbiobank/node_timeseries/diagnosed_lifetime_occurrence/depressed/time_series_regions_of_interest} \caption{ UK Biobank depression study example time series for all brain regions studied for a single (diagnosed lifetime occurrence, depressed cohort) participant. - } - \label{fig:ukb-example-time-series} + }\label{fig:ukb-example-time-series} \end{figure} @@ -361,11 +376,11 @@ \subsection{Brain region edges of interest} \info[inline]{Paragraph: Describe brain region edges of interest decision.} To limit the scope of our study and to minimize risks related to \gls{mht}, we only look at certain brain region connections (edges). -Similarly to deciding which brain \glspl{roi} to include, from previous findings we also know of several \emph{edges} that are relevant and/or altered with depression. +As with deciding which brain \glspl{roi} to include, from previous findings we also know of several \emph{edges} that are relevant and/or altered with depression. Especially those previously studied through the lens of \gls{fc} are considered relevant. The rest of this section reviews such prior work. -However, we have to be careful with comparing our results to these prior findings. +However, we must be careful with comparing our results to these prior findings. The atlases used and brain region definition will be (slightly) different in each experimental setup. Connectivity measures or interaction indicators vary as well. Depression phenotypes are typically different across studies. @@ -384,10 +399,10 @@ \subsubsection{Static functional connectivity edges affected in depression} In contrast, \textcite{Dannlowski2009} showed a \emph{decrease} in \gls{fc} between the \gls{amg} and prefrontal areas with \gls{mdd}. Similarly, \textcite{Kong2013} found decreased \gls{sfc} between the \gls{amg} and left rostral \gls{pfc} with \gls{mdd}. \textcite{Connolly2017} found reduced \gls{sfc} between the \gls{amg} and both \gls{dlpfc} as well as \gls{vmpfc}. -\textcite{Willinger2022} further showed that prefrontal-\gls{amg} connectivity is affected in adolescents with \gls{mdd}, which they relate to aberrant emotional processing. -\textcite{Burghy2012} found the connection strength of \gls{amg}-\gls{mpfc} to be decreased in those with significant \gls{els}, which (as discussed in \cref{subsec:depression}) is a major confounding factor of developing \gls{mdd}. -\textcite{Rolls2020} finds aberrant connectivity between the \gls{amg} and medial \gls{ofc}. -\textcite{Tang2018} also finds the \gls{amg} to be involved and its connection to \gls{ofc} changed. +\textcite{Willinger2022} further showed that prefrontal--\gls{amg} connectivity is affected in adolescents with \gls{mdd}, which they relate to aberrant emotional processing. +\textcite{Burghy2012} found the connection strength of \gls{amg}--\gls{mpfc} to be decreased in those with significant \gls{els}, which (as discussed in \cref{subsec:depression}) is a major confounding factor in developing \gls{mdd}. +\textcite{Rolls2020} found aberrant connectivity between the \gls{amg} and medial \gls{ofc}. +\textcite{Tang2018} also found the \gls{amg} to be involved and its connection to \gls{ofc} changed. Connectivity between \gls{amg} and other regions has also been studied. 
In a seed-based analysis, \textcite{Ramasubbu2014} found decreased \gls{sfc} between the \gls{amg} and many other brain regions, including vlPFC, insula, caudate, precuneus, STG, occipital regions, and \gls{cbm}. @@ -397,7 +412,7 @@ \subsubsection{Static functional connectivity edges affected in depression} They also found increased connectivity between the \gls{dmpfc} and \gls{acc}/paracingulate gyrus and with the frontal pole. Network connectivity changes have also been reported. -The connectivity strength of \gls{pcc}-\gls{mpfc} (key nodes of the \gls{dmn}) was found by \textcite{Philip2013} to be decreased in individuals with \gls{els}. +The connectivity strength of \gls{pcc}--\gls{mpfc} (key nodes of the \gls{dmn}) was found by \textcite{Philip2013} to be decreased in individuals with \gls{els}. %% \subsubsection{Time-varying functional connectivity edges affected in depression} @@ -419,9 +434,9 @@ \subsubsection{Time-varying functional connectivity edges affected in depression Especially the precuneus was found to be implicated. They also demonstrated how important it is to take the temporal dynamics of \gls{fc} into account. -\textcite{Dini2021} used \gls{sw} and then $k$-means clustering to find 3 brain states. +\textcite{Dini2021} used \gls{sw} and then $k$-means clustering to find three brain states. They calculated the occupancy rate (OCR), the time that each subject spends in each state. -What they found was that depressed subjects spend less time in a state where connectivity between \gls{cen} and \gls{dmn} was relatively higher than in other states. +They found that depressed subjects spent less time in a state where connectivity between \gls{cen} and \gls{dmn} was higher than in other states. \Gls{ect} was seen to increase the amount of time subjects spend in that same state. \textcite{Ho2021} found lower within-network connectivity (which they refer to as network coherence) in ventral \gls{dmn}, lower within-network connectivity in anterior \gls{dmn} and insula-SN, and higher between-network connectivity between \gls{cen} and \gls{dmn}. @@ -438,55 +453,42 @@ \subsubsection{List of edges of interest} We study the edges between the \gls{amg} and four frontal regions: \gls{ofc}, \gls{dlpfc}, \gls{acc}, and \gls{mpfc}. Two \gls{hpc} edges are studied: those with the \gls{ai} and the \gls{ofc}. Two \gls{pha} edges are included: those with the \gls{acc} and the \gls{mpfc}. -The edges between the \gls{ai} and the \gls{acc} as well as the \gls{mpfc} are included. +The edges between the \gls{ai} and the \gls{acc} as well as the \gls{mpfc} are included. % Kaiser2015 The former of these connections represents \gls{sn} within-network connectivity. A single \gls{pcc} edge is studied: the one with the \gls{mpfc}. % Wise2017 This edge represents the \gls{dmn} within-network connectivity. Finally, we include two more \gls{dlpfc} edges: those with the \gls{acc} and the \gls{mpfc}. % Kaiser2015 %% -\subsection{Functional network analysis} +\subsection{Functional network analysis}\label{subsec:ukb-fn-analysis} %% \info[inline]{Paragraph: Introduce functional networks analysis.} In our second analysis type, we study brain \emph{networks} instead of individual brain \glspl{roi}. -Depression has also been considered as a network disorder~\parencite{Mulders2015}. -Instead of a single brain region not functioning properly, there is an aberration in the integration and segregation of brain regions.
-The main networks found to be involved and affected are the \gls{dmn}~\parencite{Berman2011, Demirtas2016, Wise2017, Yan2019, Zhao2019, Zhou2020}, \gls{cen}~\parencite{Zhao2019}, and \gls{sn}~\parencite{Manoliu2014}. - -The \gls{dmn} primarily consists of \gls{mpfc} and \gls{pcc}, as well as the (para)hippocampal areas, precuneus (cortex), and angular gyrus~\parencite{Andrews-Hanna2010}. -It is often described as the neurological basis for `the self', and is attributed functions like self-referential thinking~\parencite{Sheline2009}, cognitive flexibility~\parencite{Vatansever2016}, mind-wandering, memory processing and rumination, theory of mind, emotion regulation, and as storage of autobiographical information. -It it connected to the \gls{amg} and \gls{hpc}~\parencite{Andrews-Hanna2014}. - -The \gls{cen} primarily consists of the lateral \gls{pfc}, posterior parietal cortex (PPC), \gls{dlpfc} (especially middle frontal gyrus), \gls{dmpfc}, and posterior parietal regions~\parencite{Rogers2004}. -It is associated with cognitive processes and functions, like working memory and attention. - -The \gls{sn} primarily consists of the \gls{ai} and (dorsal) \gls{acc}, with some adding the \gls{amg}, frontoinsular cortex, temporal poles, and striatum~\parencite{Seeley2007, Menon2010, Beck2016}. -The \gls{sn} is a key network in cognitive flexibility~\parencite{Dajani2015}. - -They are visualized in \cref{fig:ukb-brain-functional-networks}. -The regions and respective sub-regions these networks are made up of are shown in \cref{tab:ukbiobank-functional-networks}. +As reviewed in \cref{subsec:fc-depression}, depression has also been considered as a network disorder. +% +The three networks considered (and described in more detail in \cref{subsec:fc-depression}) in this study are visualized in \cref{fig:ukb-brain-functional-networks}. +The regions and respective sub-regions of these networks are shown in \cref{tab:ukbiobank-functional-networks}. Which sub-regions (nodes) constitute each specific \gls{fn} is based on \textcite{Fan2016, Uddin2019, Oane2020}.\footnote{There are diverging views on this topic, and we have made an effort to balance out these views.} -Note that IFJa and IFJp could also be considered to be part of the middle frontal gyrus instead of the inferior frontal gyrus. +Note that IFJa and IFJp could also be part of the middle frontal gyrus instead of the inferior frontal gyrus. However, we want to make sure there is no overlap between these networks. \begin{figure}[t] \centering - \subcaptionbox{Central executive network (CEN) \label{fig:fn-cen}}{\includegraphics[width=0.50\textwidth]{fig/functional_networks/CEN}} - \subcaptionbox{Default mode network (DMN) \label{fig:fn-dmn}}{\includegraphics[width=0.50\textwidth]{fig/functional_networks/DMN}} - \subcaptionbox{Salience network (SN) \label{fig:fn-sn}}{\includegraphics[width=0.50\textwidth]{fig/functional_networks/SN}} + \subcaptionbox{Central executive network (CEN)\label{fig:fn-cen}}{\includegraphics[width=0.50\textwidth]{fig/functional_networks/CEN}} + \subcaptionbox{Default mode network (DMN)\label{fig:fn-dmn}}{\includegraphics[width=0.50\textwidth]{fig/functional_networks/DMN}} + \subcaptionbox{Salience network (SN)\label{fig:fn-sn}}{\includegraphics[width=0.50\textwidth]{fig/functional_networks/SN}} \caption{ UK Biobank depression study brain functional networks studied. Network subregions are extracted from HCP-MMP1.0 parcellation. 
- } - \label{fig:ukb-brain-functional-networks} + }\label{fig:ukb-brain-functional-networks} \end{figure} \input{tab/ukb_fn} -An example of extracted node time series for a single subject in shown in \cref{fig:ukb-fn-example-time-series}. +An example of extracted node time series for a single subject is shown in \cref{fig:ukb-fn-example-time-series}. \begin{figure}[ht] @@ -494,8 +496,7 @@ \subsection{Functional network analysis} \includegraphics[width=0.8\textwidth]{fig/ukbiobank/node_timeseries/diagnosed_lifetime_occurrence/depressed/time_series_functional_networks_of_interest} \caption{ UK Biobank depression study example time series for all three functional networks studied for a single (diagnosed lifetime occurrence, depressed cohort) participant. - } - \label{fig:ukb-fn-example-time-series} + }\label{fig:ukb-fn-example-time-series} \end{figure} diff --git a/ch/4_TVFC_and_depression/2_Material_and_methods.tex b/ch/4_TVFC_and_depression/2_Material_and_methods.tex index d99f3f1..2014744 100644 --- a/ch/4_TVFC_and_depression/2_Material_and_methods.tex +++ b/ch/4_TVFC_and_depression/2_Material_and_methods.tex @@ -1,10 +1,9 @@ \clearpage -\section{Material and methods} -\label{sec:ukb-methodology} +\section{Material and methods}\label{sec:ukb-methodology} %%%%% In this section we describe how we \emph{analyze} the data and cohorts described in the previous section. -We mainly apply techniques and frameworks discussed in the previous chapters. +We apply techniques and frameworks discussed in the previous chapters. %% \subsection{TVFC estimation method} @@ -33,8 +32,8 @@ \subsection{TVFC estimation method} \includegraphics[width=\textwidth]{fig/ukbiobank/imputation_study/ROI/LEOO_test_log_likelihoods_raincloud} \caption{ UK Biobank depression study imputation benchmark test log likelihoods. - } - \label{fig:ukb-imputation-benchmark} + The SVWP approach performs best. + }\label{fig:ukb-imputation-benchmark} \end{figure} @@ -67,7 +66,7 @@ \subsection{Cohort comparison} \subsection{Brain state analysis} %% -We also investigate cohort \gls{tvfc} characteristics and contrasts through the construct of brain states (see \cref{subsec:brain-states}). +We also investigate cohort \gls{tvfc} characteristics and contrasts through the construct of brain states (see \cref{subsec:brain-states} for an explicit definition). These are extracted separately for each of the four analysis types (i.e.~cohort stratifications). All nine \glspl{roi} are included. % @@ -80,8 +79,9 @@ \subsection{Brain state analysis} \includegraphics[width=0.8\textwidth]{fig/ukbiobank/brain_states/inertias_SVWP_joint} \caption{ UK Biobank brain state analysis inertia elbow plot - SVWP-J estimates. - } - \label{fig:ukb-results-brain-states-elbow-plot} + Used for determining the optimal number of clusters. + Lines overlap heavily for all four cohort stratifications. + }\label{fig:ukb-results-brain-states-elbow-plot} \end{figure} @@ -92,11 +92,11 @@ \subsection{Predictive power as explanation} One key insight from the benchmarking effort is that it not only allows us to pick the optimal \gls{tvfc} estimation method. It also paints a broader picture and a profile of predictive power. Such maps between edge connectivity and predictive power of subject measures, for example, can be useful when interpreting results. -This is especially relevant is an interdisciplinary field as this. -This is exactly the philosophy that led to the recent publication of open-source Python package \texttt{neuromaps}~\parencite{Markello2022}.
+This is especially relevant in an interdisciplinary field like this. +This is exactly the philosophy that led to the recent publication of the open-source Python package \texttt{neuromaps}~\parencite{Markello2022}. This mindset constitutes a very pure data science view of the brain and neuroscience as a field~\parencite[see also][]{Yarkoni2017}. -Machine learning as a field has fully embraced this mindset.\footnote{The fields of NLP and acoustic modeling have slowly shifted away from mechanistic language models to purely statistical approaches. In 1988, speech recognition pioneer Frederick Jelinek famously said something along the lines of ``Every time I fire a linguist, the performance of the speech recognizer goes up'' (exact wording is lost in time).} +Machine learning as a field has fully embraced this mindset.\footnote{The fields of \gls{nlp} and acoustic modeling have slowly shifted away from mechanistic language models to purely statistical approaches. In 1988, speech recognition pioneer Frederick Jelinek famously said something along the lines of ``Every time I fire a linguist, the performance of the speech recognizer goes up'' (exact wording is lost in time).} It may not be a silver bullet, and does not necessarily lead to mechanistic understanding and strong theory~\parencite[see e.g.][]{Jonas2017}. However, it is certainly a promising research direction that may help bridge brain dynamics and brain function, a key bottleneck often raised in the field~\parencite{Kopell2014}. Moreover, some psychiatrists have called for prioritizing predictive power and clinical utility in the short term over deeper understanding~\parencite{Paulus2015, Winter2022}. diff --git a/ch/4_TVFC_and_depression/3_Results.tex b/ch/4_TVFC_and_depression/3_Results.tex index 9a0b726..348b79b 100644 --- a/ch/4_TVFC_and_depression/3_Results.tex +++ b/ch/4_TVFC_and_depression/3_Results.tex @@ -1,6 +1,5 @@ \clearpage -\section{Results} -\label{sec:ukb-results} +\section{Results}\label{sec:ukb-results} %%%%% We report results for the four depression phenotypes (i.e.~cohort stratifications) as four separate studies. @@ -14,8 +13,8 @@ \subsection{TVFC estimates} \Cref{fig:ukbiobank-example-correlation-estimates-roi} shows \gls{svwp} \gls{tvfc} estimates for a random subject for the discussed edges of interest. % As with the estimates we saw in \cref{ch:benchmarking}, we see that some brain regions are consistently stronger coupled than others. -For example, the \gls{dlpfc}-\gls{acc} edge is highly correlated throughout the scan. -Most edges are positively correlated throughout, with the exception of \gls{hpc}-\gls{ai} (which is anti-correlated through most of the scan) and \gls{amg}-\gls{mpfc}, \gls{hpc}-\gls{ofc}, \gls{pha}-\gls{acc}, and \gls{ai}-\gls{mpfc} (which show an oscillation between correlation and anti-correlation). +For example, the \gls{dlpfc}--\gls{acc} edge is highly correlated throughout the scan. +Most edges are positively correlated throughout, except for \gls{hpc}--\gls{ai} (which is anti-correlated through most of the scan) and \gls{amg}--\gls{mpfc}, \gls{hpc}--\gls{ofc}, \gls{pha}--\gls{acc}, and \gls{ai}--\gls{mpfc} (which show an oscillation between correlation and anti-correlation). Furthermore, some edges appear more static than others. We also see a shared periodic structure in these estimates. This suggests the presence of a global oscillatory structure in this individual's functional architecture during this scan. 
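How static or dynamic an edge appears can be quantified with the per-edge TVFC summary measures used in the cohort comparisons that follow. A minimal sketch (the rate-of-change is taken here as the mean absolute first difference of the estimated correlation time series, which is an assumption standing in for the exact definition used in the thesis):

```python
import numpy as np


def tvfc_summary_measures(corr_ts: np.ndarray) -> dict:
    """Summarize an estimated TVFC (correlation) time series for a single edge.

    corr_ts : array of shape (n_time_steps,) with correlation estimates in [-1, 1].
    """
    return {
        "mean": float(np.mean(corr_ts)),          # connectivity 'strength'
        "variance": float(np.var(corr_ts)),       # how much the edge fluctuates
        "rate_of_change": float(np.mean(np.abs(np.diff(corr_ts)))),  # how fast it fluctuates
    }
```

A near-static edge would score close to zero on both the variance and the rate-of-change, while an edge oscillating between correlation and anti-correlation would score high on both.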
@@ -27,8 +26,7 @@ \subsection{TVFC estimates} \caption{ UK Biobank depression study TVFC estimates for a single (diagnosed lifetime occurrence depressed cohort) participant. Black dashed lines indicate uncorrelation. - } - \label{fig:ukbiobank-example-correlation-estimates-roi} + }\label{fig:ukbiobank-example-correlation-estimates-roi} \end{figure} @@ -36,7 +34,7 @@ \subsection{TVFC estimates} All \gls{fn} edges here show non-trivial time-varying structure and seem to capture the same global oscillatory structure. % However, interpretation remains difficult. -For example, what does the drop around the fourth minute mark in \gls{cen}-\gls{dmn} correlation mean? +For example, what does the drop around the fourth minute mark in \gls{cen}--\gls{dmn} correlation mean? Without further information, it is unclear whether this could indicate falling asleep, switching internal preoccupation, or any other speculation. @@ -46,8 +44,7 @@ \subsection{TVFC estimates} \caption{ UK Biobank depression study TVFC estimates for a single (diagnosed lifetime occurrence depressed cohort) participant for all interactions between the functional networks. Black dashed lines indicate uncorrelation. - } - \label{fig:ukbiobank-example-correlation-estimates-fn} + }\label{fig:ukbiobank-example-correlation-estimates-fn} \end{figure} @@ -72,19 +69,18 @@ \subsection{Diagnosed lifetime occurrence} \caption{ Diagnosed depression lifetime occurrence analysis - SVWP estimates. Mean over 620 subjects per cohort for all ROI edges, for three TVFC summary measures. - } - \label{fig:ukb-results-dlo-roi-cohort-comparison-full-wp} + }\label{fig:ukb-results-dlo-roi-cohort-comparison-full-wp} \end{figure} Next, we zoom in on the particular edges of interest. Firstly, \cref{fig:ukb-results-dlo-roi-cohort-comparison-edges-of-interest-sfc} shows the \gls{sfc} estimates for these for both cohorts. -These estimates will be used as reference later; \gls{tvfc} mean estimates should be similar. -As expected, strong coupling is found between prefrontal areas \gls{dlpfc}, \gls{acc}, and \gls{mpfc}, as well as \gls{pcc}-\gls{mpfc} (due to being the key components of the \gls{dmn}). +These estimates will be used as a reference later; \gls{tvfc} mean estimates should be similar. +As expected, strong coupling is found between prefrontal areas \gls{dlpfc}, \gls{acc}, and \gls{mpfc}, as well as \gls{pcc}--\gls{mpfc} (due to being the key components of the \gls{dmn}). % We find \emph{decreased} connectivity with \gls{mdd} across the board for all edges of interest. -This seems to suggest some global association effect of a lifetime instance of \gls{mdd} on all brain region connections. -The largest effect size is found for the \gls{hpc}-\gls{ai} edge (Cohen~$d = 0.19$, $t(1235) = -3.32$, $p = .0047$) and the smallest for \gls{dlpfc}-\gls{mpfc} (Cohen~$d = 0.12$, $t(1228) = -2.10$, $p = .0393$). +This suggests some global association effect of a lifetime instance of \gls{mdd} on all brain region connections. +The largest effect size is found for the \gls{hpc}--\gls{ai} edge (Cohen~$d = 0.19$, $t(1235) = -3.32$, $p = .0047$) and the smallest for \gls{dlpfc}--\gls{mpfc} (Cohen~$d = 0.12$, $t(1228) = -2.10$, $p = .0393$). \begin{figure}[t] @@ -93,9 +89,8 @@ \subsection{Diagnosed lifetime occurrence} \caption{ Diagnosed depression lifetime occurrence analysis - brain regions of interest - sFC estimates. Mean and standard error over 620 subjects per cohort for edges of interest. - *: $p \leq 0.05$, **: $p \leq 0.01$. 
- } - \label{fig:ukb-results-dlo-roi-cohort-comparison-edges-of-interest-sfc} + *: $p \leq .05$, **: $p \leq .01$. + }\label{fig:ukb-results-dlo-roi-cohort-comparison-edges-of-interest-sfc} \end{figure} @@ -104,18 +99,18 @@ \subsection{Diagnosed lifetime occurrence} We find the same global decrease with \gls{mdd} in connectivity strength for all edges. As expected, the mean \gls{tvfc} estimates are almost identical to the \gls{sfc} estimates. However, significance and effect sizes are slightly different, with generally smaller $p$ values and larger Cohen~$d$ values found. -Here we find the smallest effect size for \gls{amg}-\gls{acc} (Cohen~$d = 0.12$, $t(1234) = -2.13$, $p = .0352$) and largest again for \gls{hpc}-\gls{ai} (Cohen~$d = 0.21$, $t(1235) = -3.61$, $p = .0019$). +Here we find the smallest effect size for \gls{amg}--\gls{acc} (Cohen~$d = 0.12$, $t(1234) = -2.13$, $p = .0352$) and largest again for \gls{hpc}--\gls{ai} (Cohen~$d = 0.21$, $t(1235) = -3.61$, $p = .0019$). % For the variance summary measure, we find slight increases between prefrontal areas as well as decreases or no differences for all other edges. However, none differ significantly across the cohorts. -The only edge here with a pre-\gls{mht} significant difference is \gls{dlpfc}-\gls{acc}. +The only edge here with a pre-\gls{mht} significant difference is \gls{dlpfc}--\gls{acc}. % -For the rate-of-change summary measure, we see slightly increased values in the \gls{mdd} cohort for all edges except \gls{amg}-\gls{acc}. -We find significant increases for only two edges: \gls{hpc}-\gls{ofc} (Cohen~$d = 0.15$, $t(1020) = 2.57$, $p = .0458$) and \gls{dlpfc}-\gls{acc} (Cohen~$d = 0.16$, $t(1110) = 2.78$, $p = .0344$). +For the rate-of-change summary measure, we see slightly increased values in the \gls{mdd} cohort for all edges except \gls{amg}--\gls{acc}. +We find significant increases for only two edges: \gls{hpc}--\gls{ofc} (Cohen~$d = 0.15$, $t(1020) = 2.57$, $p = .0458$) and \gls{dlpfc}--\gls{acc} (Cohen~$d = 0.16$, $t(1110) = 2.78$, $p = .0344$). This finding is promising; finding cohort contrasts in \gls{tvfc} dynamics motivates the study thereof, beyond just \gls{sfc} analyses. How could we interpret this finding for these two edges? As we have learned from the subject measure prediction benchmark in the previous chapter (see \cref{fig:hcp-results-subject-measures-prediction}), the rate-of-change summary measure may be especially suitable for capturing subject characteristics related to language and memory. -Since we know the \gls{hpc} plays a key role in memory and cardinal depressive symptoms like rumination (obsessive and repetitive thoughts related to prior experiences), it is perhaps not surprising to find changes in this region with \gls{mdd}.\footnote{Such comparisons also demonstrate that benchmarking is not just about picking the best method. It helps us as a community to catalogue what features, representations, metrics, and/or constructs are useful, to what extent, and in what context~\parencite[see also][]{Voytek2022}.} +Since we know the \gls{hpc} plays a key role in memory and cardinal depressive symptoms like rumination (obsessive and repetitive thoughts related to prior experiences), it is not surprising to find changes in this region with \gls{mdd}.\footnote{Such comparisons also demonstrate that benchmarking is not just about picking the best method. 
It helps us as a community to catalogue what features, representations, metrics, and/or constructs are useful, to what extent, and in what context~\parencite[see also][]{Voytek2022}.} \begin{figure}[t] @@ -124,9 +119,8 @@ \subsection{Diagnosed lifetime occurrence} \caption{ Diagnosed depression lifetime occurrence analysis - brain regions of interest - SVWP estimates. Mean and standard error over 620 subjects per cohort for edges of interest for three TVFC summary measures. - *: $p \leq 0.05$, **: $p \leq 0.01$. - } - \label{fig:ukb-results-dlo-roi-cohort-comparison-edges-of-interest-wp} + *: $p \leq .05$, **: $p \leq .01$. + }\label{fig:ukb-results-dlo-roi-cohort-comparison-edges-of-interest-wp} \end{figure} @@ -136,8 +130,8 @@ \subsection{Diagnosed lifetime occurrence} This echoes the global decreased connectivity strength effect found with the \gls{roi} analysis. % All \gls{tvfc} variances and rates-of-change are \emph{increased} with \gls{mdd}, but these differences are only significant with the latter (Cohen~$d = 0.25, 0.23, 0.21$, respectively). -The pre-\gls{mht} variances of \gls{cen}-\gls{sn} and \gls{dmn}-\gls{sn} are significantly increased, however. -Both the effect sizes and the statistical significance of these cohort contrast is much larger than with the \gls{roi} analysis. +The pre-\gls{mht} variances of \gls{cen}--\gls{sn} and \gls{dmn}--\gls{sn} are significantly increased, however. +Both the effect sizes and the statistical significance of these cohort contrast are much larger than with the \gls{roi} analysis. \begin{figure}[ht] @@ -146,13 +140,12 @@ \subsection{Diagnosed lifetime occurrence} \caption{ Diagnosed depression lifetime occurrence analysis - functional networks - SVWP estimates. Mean and standard error over 620 subjects per cohort for edges of interest for three TVFC summary measures. - ***: $p \leq 0.001$. - } - \label{fig:ukb-results-dlo-fn-cohort-comparison-edges-of-interest-wp} + ***: $p \leq .001$. + }\label{fig:ukb-results-dlo-fn-cohort-comparison-edges-of-interest-wp} \end{figure} -Finally, comparing to previous studies, this particular depression phenotype does not replicate many \gls{sfc} findings. +Finally, compared to previous studies, this depression phenotype does not replicate many \gls{sfc} findings. Many prior findings found \emph{increases} or \emph{specific} decreases in connectivity strength with \gls{mdd}. We only find reduced connectivity strength across the board. @@ -162,9 +155,9 @@ \subsection{Self-reported lifetime occurrence} %% \Cref{fig:ukb-results-lo-roi-cohort-comparison-edges-of-interest-sfc} shows the \gls{sfc} estimates for the edges of interest for both cohorts. -These will be used as reference again. +These will be used as references again. % -Interestingly, none of the edges bar \gls{hpc}-\gls{ai} (Cohen~$d = 0.14$) show any cohort contrast here. +Interestingly, none of the edges bar \gls{hpc}--\gls{ai} (Cohen~$d = 0.14$) show any cohort contrast here. This edge may be robustly affected by depression, as it also accounted for the smallest $p$-value and largest effect size among all edges in the previously studied depression phenotype. @@ -174,9 +167,8 @@ \subsection{Self-reported lifetime occurrence} \caption{ Self-reported depression lifetime occurrence analysis - brain regions of interest - sFC estimates. Mean and standard error over 808 subjects per cohort for edges of interest. - *: $p \leq 0.05$. - } - \label{fig:ukb-results-lo-roi-cohort-comparison-edges-of-interest-sfc} + *: $p \leq .05$. 
+ }\label{fig:ukb-results-lo-roi-cohort-comparison-edges-of-interest-sfc} \end{figure} @@ -195,15 +187,14 @@ \subsection{Self-reported lifetime occurrence} Further interpretation of why these results are so different from the ones found in the \emph{diagnosed} lifetime occurrence analysis will be discussed in \cref{subsec:variation-depression-phenotypes}. -\begin{figure}[h] +\begin{figure}[t] \centering \includegraphics[width=\textwidth]{fig/ukbiobank/TVFC_predictions_summaries/lifetime_occurrence/cohort_comparison/ROI/correlation_all_TVFC_summary_measures_SVWP_joint_edges_of_interest} \caption{ Self-reported depression lifetime occurrence analysis - brain regions of interest - SVWP estimates. Mean and standard error over 808 subjects per cohort for edges of interest for three TVFC summary measures. - *: $p \leq 0.05$. - } - \label{fig:ukb-results-lo-roi-cohort-comparison-edges-of-interest-wp} + *: $p \leq .05$. + }\label{fig:ukb-results-lo-roi-cohort-comparison-edges-of-interest-wp} \end{figure} @@ -218,9 +209,8 @@ \subsection{Self-reported lifetime occurrence} \caption{ Self-reported depression lifetime occurrence analysis - functional networks - SVWP estimates. Mean and standard error over 808 subjects per cohort for edges of interest for three TVFC summary measures. - *: $p \leq 0.05$. - } - \label{fig:ukb-results-lo-fn-cohort-comparison-edges-of-interest-wp} + *: $p \leq .05$. + }\label{fig:ukb-results-lo-fn-cohort-comparison-edges-of-interest-wp} \end{figure} @@ -233,8 +223,8 @@ \subsection{Self-reported depressed state} The results are, again, distinct from the previous two depression phenotypes. % However, we do see reduced connectivity again across the board for all edges. -These decreases are significant for most edges: \gls{amg}-\gls{mpfc} (Cohen~$d = 0.10$), \gls{amg}-\gls{ofc} (Cohen~$d = 0.10$), \gls{hpc}-\gls{ofc} (Cohen~$d = 0.10$), \gls{hpc}-\gls{ai} (Cohen~$d = 0.15$), \gls{pha}-\gls{acc} (Cohen~$d = 0.11$), \gls{pha}-\gls{mpfc} (Cohen~$d = 0.12$), \gls{ai}-\gls{acc} (Cohen~$d = 0.10$), \gls{ai}-\gls{mpfc} (Cohen~$d = 0.09$), and \gls{dlpfc}-\gls{acc} (Cohen~$d = 0.10$). -Again, we find the largest effect size for the \gls{hpc}-\gls{ai} edge. +These decreases are significant for most edges: \gls{amg}--\gls{mpfc} (Cohen~$d = 0.10$), \gls{amg}--\gls{ofc} (Cohen~$d = 0.10$), \gls{hpc}--\gls{ofc} (Cohen~$d = 0.10$), \gls{hpc}--\gls{ai} (Cohen~$d = 0.15$), \gls{pha}--\gls{acc} (Cohen~$d = 0.11$), \gls{pha}--\gls{mpfc} (Cohen~$d = 0.12$), \gls{ai}--\gls{acc} (Cohen~$d = 0.10$), \gls{ai}--\gls{mpfc} (Cohen~$d = 0.09$), and \gls{dlpfc}--\gls{acc} (Cohen~$d = 0.10$). +Again, we find the largest effect size for the \gls{hpc}--\gls{ai} edge. \begin{figure}[h] @@ -243,22 +233,21 @@ \subsection{Self-reported depressed state} \caption{ Self-reported depressed state analysis - brain regions of interest - sFC estimates. Mean and standard error over 1,411 subjects per cohort for edges of interest. - *: $p \leq 0.05$, **: $p \leq 0.01$, ***: $p \leq 0.001$. - } - \label{fig:ukb-results-srds-roi-cohort-comparison-edges-of-interest-sfc} + *: $p \leq .05$, **: $p \leq .01$, ***: $p \leq .001$. + }\label{fig:ukb-results-srds-roi-cohort-comparison-edges-of-interest-sfc} \end{figure} \Cref{fig:ukb-results-srds-roi-cohort-comparison-edges-of-interest-wp} shows the \gls{svwp} \gls{tvfc} estimates for this depression phenotype. % The estimate means broadly correspond with the \gls{sfc}. 
-However, here we find additional significant decreases for three additional edges: \gls{amg}-\gls{dlpfc} (Cohen~$d = 0.08$), \gls{dlpfc}-\gls{mpfc} (Cohen~$d = 0.08$), and \gls{pcc}-\gls{mpfc} (Cohen~$d = 0.09$). +However, here we find significant decreases for three additional edges: \gls{amg}--\gls{dlpfc} (Cohen~$d = 0.08$), \gls{dlpfc}--\gls{mpfc} (Cohen~$d = 0.08$), and \gls{pcc}--\gls{mpfc} (Cohen~$d = 0.09$). % -In terms of \gls{tvfc} variance, we find increases with depression for the edges between prefrontal areas, and decreases for all other edges. -However, none of these are significant, even before \gls{mht}, except for \gls{dlpfc}-\gls{acc}. +In terms of \gls{tvfc} variance, we find increases with depression for the edges between prefrontal areas and decreases for all other edges. +However, none of these are significant, even before \gls{mht}, except for \gls{dlpfc}--\gls{acc}. % For \gls{tvfc} rate-of-change, we find increases for all edges, but again none are significant. -Three edges were significantly different across cohorts before \gls{mht}: \gls{amg}-\gls{mpfc}, \gls{amg}-\gls{ofc}, and \gls{dlpfc}-\gls{acc}. +Three edges were significantly different across cohorts before \gls{mht}: \gls{amg}--\gls{mpfc}, \gls{amg}--\gls{ofc}, and \gls{dlpfc}--\gls{acc}. \begin{figure}[h] @@ -267,9 +256,8 @@ \subsection{Self-reported depressed state} \caption{ Self-reported depressed state analysis - brain regions of interest - SVWP estimates. Mean and standard error over 1,411 subjects per cohort for edges of interest for three TVFC summary measures. - *: $p \leq 0.05$, **: $p \leq 0.01$, ***: $p \leq 0.001$. - } - \label{fig:ukb-results-srds-roi-cohort-comparison-edges-of-interest-wp} + *: $p \leq .05$, **: $p \leq .01$, ***: $p \leq .001$. + }\label{fig:ukb-results-srds-roi-cohort-comparison-edges-of-interest-wp} \end{figure} @@ -287,9 +275,8 @@ \subsection{Self-reported depressed state} \caption{ Self-reported depressed state analysis - functional networks - SVWP estimates. Mean and standard error over 1,411 subjects per cohort for edges of interest for three TVFC summary measures. - *: $p \leq 0.05$, **: $p \leq 0.01$, ***: $p \leq 0.001$. - } - \label{fig:ukb-results-srds-fn-cohort-comparison-edges-of-interest-wp} + *: $p \leq .05$, **: $p \leq .01$, ***: $p \leq .001$. + }\label{fig:ukb-results-srds-fn-cohort-comparison-edges-of-interest-wp} \end{figure} @@ -309,14 +296,13 @@ \subsection{Polygenic risk scores} \caption{ Polygenic risk scores analysis - brain regions of interest - sFC estimates. Mean and standard error over 3,775 subjects per cohort for edges of interest. - } - \label{fig:ukb-results-pgs-roi-cohort-comparison-edges-of-interest-sfc} + }\label{fig:ukb-results-pgs-roi-cohort-comparison-edges-of-interest-sfc} \end{figure} \Cref{fig:ukb-results-pgs-roi-cohort-comparison-edges-of-interest-wp} shows the \gls{roi} cohort contrasts for the \gls{svwp} estimates. None of the estimate means differ across cohorts. -None of the \gls{tvfc} variances are different across cohorts either (although \gls{amg}-\gls{dlpfc} variance is pre-\gls{mht} significantly higher in the high risk cohort). +None of the \gls{tvfc} variances are different across cohorts either (although \gls{amg}--\gls{dlpfc} variance is pre-\gls{mht} significantly higher in the high-risk cohort). None of the rate-of-change summary measures are different across cohorts either.
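The cohort contrasts reported in this chapter are two-sample comparisons of such per-subject summary measures, one test per edge. A minimal sketch of that comparison (a Welch $t$-test plus Cohen's $d$, with a simple Bonferroni correction standing in for whatever multiple hypothesis testing correction is actually applied; all names are illustrative):

```python
import numpy as np
from scipy import stats


def cohort_contrast(depressed: np.ndarray, controls: np.ndarray, n_tests: int = 1) -> dict:
    """Compare one TVFC summary measure between two cohorts for a single edge.

    depressed, controls : 1-D arrays of per-subject summary values (e.g. edge-wise TVFC means).
    n_tests             : number of edges tested, used for a Bonferroni-style correction.
    """
    t_stat, p_value = stats.ttest_ind(depressed, controls, equal_var=False)
    pooled_sd = np.sqrt((np.var(depressed, ddof=1) + np.var(controls, ddof=1)) / 2)
    cohen_d = (np.mean(depressed) - np.mean(controls)) / pooled_sd
    return {
        "t": float(t_stat),
        "p_uncorrected": float(p_value),
        "p_corrected": float(min(p_value * n_tests, 1.0)),
        "cohen_d": float(cohen_d),
    }
```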
@@ -326,8 +312,7 @@ \subsection{Polygenic risk scores} \caption{ Polygenic risk scores analysis - brain regions of interest - SVWP estimates. Mean and standard error over 3,775 subjects per cohort for edges of interest for three TVFC summary measures. - } - \label{fig:ukb-results-pgs-roi-cohort-comparison-edges-of-interest-wp} + }\label{fig:ukb-results-pgs-roi-cohort-comparison-edges-of-interest-wp} \end{figure} @@ -340,8 +325,7 @@ \subsection{Polygenic risk scores} \caption{ Polygenic risk scores analysis - functional networks - SVWP estimates. Mean and standard error over 3,775 subjects per cohort for edges of interest for three TVFC summary measures. - } - \label{fig:ukb-results-pgs-fn-cohort-comparison-edges-of-interest-wp} + }\label{fig:ukb-results-pgs-fn-cohort-comparison-edges-of-interest-wp} \end{figure} @@ -353,16 +337,16 @@ \subsection{Polygenic risk scores} \subsection{Brain states analysis} %% -The extracted $k = 4$ brain states are shown in \cref{fig:ukb-results-brain-states-dlo,fig:ukb-results-brain-states-srlo,fig:ukb-results-brain-states-srds} for the first three depression phenotypes. +The extracted $k = 4$ brain states (as explicitly defined in \cref{subsec:brain-states}) are shown in \cref{fig:ukb-results-brain-states-dlo,fig:ukb-results-brain-states-srlo,fig:ukb-results-brain-states-srds} for the first three depression phenotypes. The \gls{prs} phenotype is excluded here, as we found no cohort contrasts in the previous analysis. Interestingly, the brain states extracted across the analysis types are almost identical. All brain states have strong within-\gls{pfc} connectivity and connectivity between \gls{pha} and the \gls{pcc}. % -Brain state~1 is characterized by relatively weaker connectivity throughout, but especially within the \gls{pfc}. +Brain state~1 is characterized by weaker connectivity throughout, but especially within the \gls{pfc}. % -Brain state~2 is characterized by especially stronger connectivity between the \gls{hpc} and prefrontal areas, and between prefrontal areas themselves (compared to the first brain state, and including within-\gls{dmn} connectivity). -The connectivity between the \gls{ai} and prefrontal areas is relatively weak, even weaker than found in the first brain state. +Brain state~2 is characterized by especially stronger connectivity between the \gls{hpc} and prefrontal areas and between prefrontal areas themselves (compared to the first brain state and including within-\gls{dmn} connectivity). +The connectivity between the \gls{ai} and prefrontal areas is even weaker than found in the first brain state. % Brain state~3 is characterized by particularly \emph{strong} connectivity between the \gls{ai} and prefrontal areas. 
% @@ -371,13 +355,12 @@ \subsection{Brain states analysis} \begin{figure}[ht] \centering - \subcaptionbox{Diagnosed lifetime occurrence \label{fig:ukb-results-brain-states-dlo}}{\includegraphics[width=\textwidth]{fig/ukbiobank/brain_states/diagnosed_lifetime_occurrence_brain_states_SVWP_joint_k04}} - \subcaptionbox{Self-reported lifetime occurrence \label{fig:ukb-results-brain-states-srlo}}{\includegraphics[width=\textwidth]{fig/ukbiobank/brain_states/lifetime_occurrence_brain_states_SVWP_joint_k04}} - \subcaptionbox{Self-reported depressed state \label{fig:ukb-results-brain-states-srds}}{\includegraphics[width=\textwidth]{fig/ukbiobank/brain_states/self_reported_depression_state_brain_states_SVWP_joint_k04}} + \subcaptionbox{Diagnosed lifetime occurrence\label{fig:ukb-results-brain-states-dlo}}{\includegraphics[width=\textwidth]{fig/ukbiobank/brain_states/diagnosed_lifetime_occurrence_brain_states_SVWP_joint_k04}} + \subcaptionbox{Self-reported lifetime occurrence\label{fig:ukb-results-brain-states-srlo}}{\includegraphics[width=\textwidth]{fig/ukbiobank/brain_states/lifetime_occurrence_brain_states_SVWP_joint_k04}} + \subcaptionbox{Self-reported depressed state\label{fig:ukb-results-brain-states-srds}}{\includegraphics[width=\textwidth]{fig/ukbiobank/brain_states/self_reported_depression_state_brain_states_SVWP_joint_k04}} \caption{ - UK Biobank extracted brain states. - } - \label{fig:ukb-results-brain-states} + UK Biobank extracted brain states, separately for each cohort stratification. + }\label{fig:ukb-results-brain-states} \end{figure} @@ -395,13 +378,12 @@ \subsection{Brain states analysis} \begin{figure}[ht] \centering - \subcaptionbox{Diagnosed lifetime occurrence \label{fig:ukb-results-brain-states-dwell-times-dlo}}{\includegraphics[width=0.6\textwidth]{fig/ukbiobank/brain_states/diagnosed_lifetime_occurrence_brain_states_SVWP_joint_k04_dwell_times}} - \subcaptionbox{Self-reported lifetime occurrence \label{fig:ukb-results-brain-states-dwell-times-srlo}}{\includegraphics[width=0.6\textwidth]{fig/ukbiobank/brain_states/lifetime_occurrence_brain_states_SVWP_joint_k04_dwell_times}} - \subcaptionbox{Self-reported depressed state \label{fig:ukb-results-brain-states-dwell-times-srds}}{\includegraphics[width=0.6\textwidth]{fig/ukbiobank/brain_states/self_reported_depression_state_brain_states_SVWP_joint_k04_dwell_times}} + \subcaptionbox{Diagnosed lifetime occurrence\label{fig:ukb-results-brain-states-dwell-times-dlo}}{\includegraphics[width=0.6\textwidth]{fig/ukbiobank/brain_states/diagnosed_lifetime_occurrence_brain_states_SVWP_joint_k04_dwell_times}} + \subcaptionbox{Self-reported lifetime occurrence\label{fig:ukb-results-brain-states-dwell-times-srlo}}{\includegraphics[width=0.6\textwidth]{fig/ukbiobank/brain_states/lifetime_occurrence_brain_states_SVWP_joint_k04_dwell_times}} + \subcaptionbox{Self-reported depressed state\label{fig:ukb-results-brain-states-dwell-times-srds}}{\includegraphics[width=0.6\textwidth]{fig/ukbiobank/brain_states/self_reported_depression_state_brain_states_SVWP_joint_k04_dwell_times}} \caption{ - UK Biobank brain state dwell times. - } - \label{fig:ukb-results-brain-states-dwell-times} + UK Biobank brain state dwell times, separately extracted and computed for each cohort stratification. 
+ }\label{fig:ukb-results-brain-states-dwell-times} \end{figure} diff --git a/ch/4_TVFC_and_depression/4_Discussion.tex b/ch/4_TVFC_and_depression/4_Discussion.tex index 44e18f7..0e2d9e5 100644 --- a/ch/4_TVFC_and_depression/4_Discussion.tex +++ b/ch/4_TVFC_and_depression/4_Discussion.tex @@ -1,6 +1,5 @@ \clearpage -\section{Discussion} -\label{sec:ukb-discussion} +\section{Discussion}\label{sec:ukb-discussion} %%%%% Taking a step back, we review what these study results have taught us, reflect on the differences in results between the various depression phenotypes, and discuss the importance of \gls{tvfc} estimation method choice. @@ -13,48 +12,47 @@ \subsection{Interpretation of results} Functional connectivity can be hard to interpret. What does it really mean when a \gls{tvfc} summary measure varies across cohorts? -To answer this we can link up the results with prior findings and current understanding of neural activity in the brain. -This places our data and results within the broader context. +To answer this, we can link up the results with prior findings and current understanding of neural activity in the brain. +This places our data and results within a broader context. % In terms of the mean \gls{tvfc} estimate, this has often been considered connectivity `strength'. -Correlation between time series can be generally decreased for a number of reasons. -Brain activity in depressed participants could just be more noisy, for example, representing a less efficient and more chaotic functional architecture. +Correlation between time series can be decreased for several reasons. +Brain activity in depressed participants could just be noisier, for example, representing a less efficient and more chaotic functional architecture. Confounding factors could play a role as well. Perhaps we are merely picking up on a cohort contrast in general arousal or drowsiness. However, we do pick up more strongly on certain edges. -These particular edges should be studied in more detail. +These edges should be studied in more detail. % -In terms of the dynamic summary measures (variance and rate-of-change), interpretation is perhaps even harder. +In terms of the dynamic summary measures (variance and rate-of-change), interpretation is even harder. There is also less reference material. Higher values here could represent a functional architecture whose connections change more dramatically and more frequently, indicating a less stable organization. Whether this is maladaptive or not depends on context. -Some studies have shown that in the case of schizophrenia, patients are stuck in prefrontal networks more so than \glspl{hc}. -As such a higher degree of \gls{tvfc} variation across time may be the sign of a healthy, flexible brain. +Some studies have shown that in the case of schizophrenia, patients are stuck in prefrontal networks more so than \glspl{hc}~\parencite{Damaraju2014}. +As such a higher degree of \gls{tvfc} variation across time may be a sign of a healthy, flexible brain. However, in our study we consistently find dynamics to be \emph{increased} with depression. -In general we find certain edges and \glspl{roi} to be more implicated than others. +In general, we find certain edges and \glspl{roi} to be more implicated than others. Perhaps surprisingly we find the \gls{amg} to be less relevant compared to the \gls{hpc} and \gls{ai}. Many studies found the \gls{amg} to be affected, and show hypoconnectivity with \gls{pfc}~\parencite{Dannlowski2009, Burghy2012, Kong2013, Connolly2017}. 
Furthermore, we did not replicate one of the most reproduced findings: that of \emph{increased} \gls{dmn} connectivity~\parencite{Kaiser2015, Kaiser2015b, Mulders2015, Kaiser2016}. %% -\subsection{Varying results across depression phenotypes} -\label{subsec:variation-depression-phenotypes} +\subsection{Varying results across depression phenotypes}\label{subsec:variation-depression-phenotypes} %% One of the key findings of this study is that the way we define depression matters a lot. -We find global connectivity strength decreased and the dynamics of two edges (\gls{hpc}-\gls{ofc} and \gls{dlpfc}-\gls{acc}) affected for participants that have ever been professionally diagnosed with \gls{mdd}. -However, only one edge (\gls{hpc}-\gls{ai}) strength is affected for those self-reporting a lifetime instance. +We find global connectivity strength decreased and the dynamics of two edges (\gls{hpc}--\gls{ofc} and \gls{dlpfc}--\gls{acc}) to be affected for participants that have ever been professionally diagnosed with \gls{mdd}. +However, only the strength of one edge (\gls{hpc}--\gls{ai}) is affected for those self-reporting a lifetime instance. This could be due to the first phenotype being less subjective. These labels are \emph{inpatient}, so there are likely to be few false positives. This could increase the contrast between the cohorts. % -Another explanation for this contrast could be that participants in the former category would have likely received (more) treatment, including anti-depressants. -Such treatment may have had influence on these individuals' functional architecture. +Another explanation for this contrast could be that participants in the former category would have received (more) treatment, including antidepressants. +Such treatment may have had an influence on these individuals' functional architecture. To illustrate, treatment (with antidepressants) has been shown to revert changes in \gls{dmn} connectivity~\parencite{Liston2014}. The other difference we can examine is treating depression as a trait and/or lifetime history (participants that are prone to developing depressive symptoms or have had them throughout their lives) and as a state (participants that were depressed during the \gls{fmri} scan). -The main difference between the diagnosed lifetime occurrence and self-reported depressed state analyses is that the \gls{amg}-\gls{acc} connectivity strength is not significantly affected in the latter. +The main difference between the diagnosed lifetime occurrence and self-reported depressed state analyses is that the \gls{amg}--\gls{acc} connectivity strength is not significantly affected in the latter. Moreover, the connectivity rate-of-change edges are not affected for the depressed state cohorts. These differences are considered minor. However, the brain state analysis finds a more interesting contrast between these two paradigms. @@ -64,12 +62,12 @@ \subsection{Varying results across depression phenotypes} \subsection{Comparison to prior studies} %% -While we compare directly to prior work, not all studies are created equal. +While we compare our results directly to prior work, not all studies are created equal. Moreover, many previous findings may not replicate. -In fact, there are many reasons for why many neuroimaging depression study results may be false or inflated~\parencite{Flint2021}. -The large number of studies looking to use neuroimaging to separate \gls{mdd} patients from \glspl{hc} still lacks cohesion.
+In fact, there are many reasons why many neuroimaging depression study results may be false or inflated~\parencite{Flint2021}. +The considerable number of studies looking to use neuroimaging to separate \gls{mdd} patients from \glspl{hc} still lacks cohesion. Sample sizes are often considered a key issue~\parencite{Varoquaux2018, Szucs2020, Libedinsky2022, Marek2022}. -For \gls{mdd} estimation from structural \gls{mri} for example, \textcite{Flint2021} showed that small sample sizes can dramatically inflate predictive power of models. +For \gls{mdd} estimation from structural \gls{mri}, for example, \textcite{Flint2021} showed that small sample sizes can dramatically inflate the predictive power of models. We consider the large sample sizes in all benchmarks and experiments a major strength of the work in this thesis. Factors that impact the validity of direct comparison include: 1) choice of connectivity measure (e.g.~correlation or coherence), 2) choice of \gls{tvfc} estimation method, 3) source and version of data, 4) depression (phenotype) definition, and 5) parcellation procedure (e.g.~whether to study networks or regions and which atlas is used). @@ -86,40 +84,40 @@ \subsection{Choice of TVFC estimation method} Repeats of the exact same analysis using other \gls{tvfc} estimation methods are shown in \cref{appendix:more-ukb-results}. Do we obtain the same conclusions? -First of all, we see that the actual values of the estimate summary measures (apart from the mean) vary radically across estimation methods. +First, we see that the actual values of the estimate summary measures (apart from the mean) vary radically across estimation methods. If we were interested in these absolute values, this would require more attention. However, throughout this study we are primarily interested in cohort contrasts. -For the diagnosed lifetime occurrence analysis, we find the \gls{dcc} methods to miss several \gls{sfc} edges, and return a completely different selection of edges for the rate-of-change summary measure compared to the \gls{svwp} method. +For the diagnosed lifetime occurrence analysis, we find the \gls{dcc} methods to miss several \gls{sfc} edges and return a completely different selection of edges for the rate-of-change summary measure compared to the \gls{svwp} method. % The \gls{sw-cv} method finds all \gls{sfc} edges to be decreased in connectivity with \gls{mdd}, replicating the global decreased connectivity strength effect. -This is perhaps unsurprising; any reasonable estimation method would pick up the mean quite well. +This is unsurprising; any reasonable estimation method would pick up the mean quite well. However, it returns many more significant increases in connectivity for the two dynamic summary measures (and generally much higher values than the \gls{svwp} estimates as well). -Interestingly, it finds increased dynamic activity especially in the \gls{dlpfc}-\gls{acc} edge. +Interestingly, it finds increased dynamic activity especially in the \gls{dlpfc}--\gls{acc} edge. % The story is similar for the two naive \gls{sw} approaches with window lengths of 30 and 60 seconds. -Therefore, we carefully conclude that \gls{sw}-based methods would return more false positive rather than false negatives. +Therefore, we carefully conclude that \gls{sw}-based methods would return more false positives than false negatives. Based on our simulations benchmarking in \cref{ch:benchmarking}, this may come as no surprise.
% -However, all estimation methods find an increased rate-of-change for the \gls{dlpfc}-\gls{acc} edge, indicating that this finding is robust to \gls{tvfc} estimation method choice. +However, all estimation methods find an increased rate-of-change for the \gls{dlpfc}--\gls{acc} edge, indicating that this finding is robust to \gls{tvfc} estimation method choice. % Furthermore, all other methods do indeed find significantly increased \gls{tvfc} variance between all \glspl{fn}, in contrast to the \gls{svwp} estimates. -As we have seen before, \gls{svwp} estimates are rather smooth compared to the other methods' estimates. -If we are to believe our benchmarking has been executed properly, all of these findings could be interpreted as `spurious'. +As we have seen before, \gls{svwp} estimates are smoother than the other methods' estimates. +If we are to believe our benchmarking has been executed properly, all these findings could be interpreted as `spurious'. For the self-reported lifetime occurrence analysis, the \gls{dcc} methods do not find any significant alterations with depression across all edges and \gls{tvfc} summary measures. % -All \gls{sw} methods find exactly the same single affected edge (\gls{hpc}-\gls{ai}), except for the 30-seconds \gls{sw} estimate that finds the rate-of-change to also be significantly affected for this edge. +All \gls{sw} methods find exactly the same single affected edge (\gls{hpc}--\gls{ai}), except for the 30-second \gls{sw} estimate, which also finds the rate-of-change to be significantly affected for this edge. % -Overall, this shows that the reduced \gls{sfc} with depression for this edge is reasonably robust across \gls{tvfc} estimation method, and should be studied further as an affected brain region connection. +Overall, this shows that the reduced \gls{sfc} with depression for this edge is reasonably robust across \gls{tvfc} estimation methods and should be studied further as an affected brain region connection. % For the \gls{fn} analysis, the \gls{dcc} methods do not return any significant cohort contrasts, just like the 60-second \gls{sw} method. -The 30~seconds \gls{sw} returns the same differences as the \gls{svwp} model, except for missing the increased \gls{cen}-\gls{sn} rate-of-change. +The 30-second \gls{sw} returns the same differences as the \gls{svwp} model, except for missing the increased \gls{cen}--\gls{sn} rate-of-change. The \gls{sw-cv} approach only returns the two estimate means contrasts for the same two between-network connectivities as the \gls{svwp}. -For the self-reported depressed state analysis, the \gls{dcc} methods predict a different set of edges whose mean \gls{tvfc} is affected in depression compared to both the \gls{sfc} and \gls{svwp} estimates. +For the self-reported depressed state analysis, the \gls{dcc} methods predict a distinct set of edges whose mean \gls{tvfc} is affected in depression compared to both the \gls{sfc} and \gls{svwp} estimates. Furthermore, the joint approach returns many significantly different edges for the rate-of-change summary measure, where the pairwise implementation returns none (like the \gls{svwp} estimates). -The \gls{sw-cv} estimates are broadly similar to the \gls{svwp} estimates, except for a curious rate-of-change \gls{pha}-\gls{mpfc} edge. +The \gls{sw-cv} estimates are broadly similar to the \gls{svwp} estimates, except for a curious rate-of-change \gls{pha}--\gls{mpfc} edge. % For the \gls{fn} analysis, all other \gls{tvfc} estimation methods find the same cohort contrasts.
However, the \gls{sw} methods also find differences in \gls{tvfc} variance, which may be interpreted as false positives. @@ -127,5 +125,5 @@ \subsection{Choice of TVFC estimation method} For the \gls{prs} analysis, none of the other methods' estimates are significantly different for any of the edges, just like the \gls{sfc} and \gls{svwp} estimates. The null finding is robust across \gls{tvfc} estimation methods. -Lastly, we note that any estimation method may return spurious structure or fail to pick up on certain structures. +Lastly, we note that any estimation method may return spurious structure or fail to detect certain structures. However, since we average across participants in cohorts, some of these failure modes may be hidden from our results. diff --git a/ch/5_Discussion/0_Introduction.tex b/ch/5_Discussion/0_Introduction.tex index 0c859a3..1a860ef 100644 --- a/ch/5_Discussion/0_Introduction.tex +++ b/ch/5_Discussion/0_Introduction.tex @@ -1,7 +1,6 @@ -\chapter{Discussion} -\label{ch:discussion} +\chapter{Discussion}\label{ch:discussion} %%%%% In this chapter we summarize the findings from the previous chapters. -We discuss limitations and advantages of the \gls{tvfc} estimation methods discussed, and reflect on this general task. +We discuss limitations and advantages of the \gls{tvfc} estimation methods considered and reflect on this general task. Furthermore, we discuss promising directions for future work, in terms of both \gls{tvfc} estimation and depression research. diff --git a/ch/5_Discussion/1_Summary_of_presented_work.tex b/ch/5_Discussion/1_Summary_of_presented_work.tex index eda98f5..9f70c41 100644 --- a/ch/5_Discussion/1_Summary_of_presented_work.tex +++ b/ch/5_Discussion/1_Summary_of_presented_work.tex @@ -14,7 +14,7 @@ \section{Summary of presented work} We translated principles from the field of machine learning to design a comprehensive suite of data science tasks, i.e.~benchmarks. In \cref{ch:benchmarking} we discussed all these benchmarks and showed how our new method broadly outperformed competitive baselines: both \gls{dcc} and \gls{sw} approaches. -We also compared to \gls{sfc}, and were able to profile \gls{tvfc} beyond simply picking the optimal method. +We also compared these to \gls{sfc}, and were able to profile \gls{tvfc} beyond simply picking the optimal method. Benchmarks included simulations, subject phenotype prediction, test-retest studies, brain state analyses, external task prediction, and a range of qualitative method comparisons. A new benchmark based on cross-validation was introduced that can be run on any data set. @@ -23,6 +23,6 @@ \section{Summary of presented work} The study is run on thousands of participants from the UK Biobank, yielding unprecedented statistical power and robustness. Connectivity between individual brain regions as well as between \glspl{fn} was investigated. Depressed participants show decreased global connectivity and increased connectivity instability (as measured by the temporal characteristics of estimated \gls{tvfc}). -By defining multiple depression phenotypes, brain dynamics are found to be affected especially when patients have been professionally diagnosed or indicated to be depressed during their \gls{fmri} scan, but were less or not at all affected based on self-reported past instances and genetic predisposition.
+By defining multiple depression phenotypes, brain dynamics are found to be affected especially when patients have been professionally diagnosed or indicated to be depressed during their \gls{fmri} scan but were less or not at all affected based on self-reported past instances and genetic predisposition. It was demonstrated that choosing a different \gls{tvfc} estimation method would have changed our scientific conclusions. This sensitivity to seemingly arbitrary researcher choices highlights the need for robust method development and the importance of community-approved benchmarking. diff --git a/ch/5_Discussion/2_Future_work.tex b/ch/5_Discussion/2_Future_work.tex index 2549659..6340844 100644 --- a/ch/5_Discussion/2_Future_work.tex +++ b/ch/5_Discussion/2_Future_work.tex @@ -6,18 +6,16 @@ \section{Future work} Some of these will address the limitations as discussed throughout this thesis. %% -\subsection{Model extensions} -\label{subsec:model-extensions} +\subsection{Model extensions}\label{subsec:model-extensions} %% Here we describe several avenues of future work to improve the \gls{wp} for the task of \gls{tvfc} estimation. %% -\subsubsection{Decreasing Wishart process computational cost} -\label{subsubsec:decreasing-wp-computational-cost} +\subsubsection{Decreasing Wishart process computational cost}\label{subsubsec:decreasing-wp-computational-cost} %% -One area of concern for the practicality of the \gls{wp} is its relatively high computational cost (see \cref{subsec:svwp}). +One area of concern for the practicality of the \gls{wp} is its high computational cost (see \cref{subsec:svwp}). We propose several directions for future work to address this. Firstly, recent innovations in variational sparse \glspl{gp} that remove the need for inducing points altogether can be ported directly to \gls{wp} models~\parencite[see][]{Saatchi2011, Adam2020, Wilkinson2021}. @@ -31,7 +29,7 @@ \subsubsection{Decreasing Wishart process computational cost} Secondly, the low rank (or \emph{factored}) implementation as discussed in \textcite{Heaukulani2019} can help in situations when $D$ (number of time series) is large. \textcite{Heaukulani2019} proposed a model variant~\parencite[in turn built upon work by][]{Fox2015} that may be especially relevant for the field of neuroimaging. -It suggests to assign each node in a weighted manner to $K \ll D$ clusters, and only learn correlation structure between these clusters. +It suggests assigning each node in a weighted manner to $K \ll D$ clusters, and only learning correlation structure between these clusters. Hereby we reduce $\mathbf{F}_n$ in \cref{eq:sigma-definition} from a $D \times \nu$ to a $K \times \nu$ sized matrix (see the sketch at the end of this subsubsection). This way, apart from reducing the number of model parameters, it can be seen as a dimensionality reduction step. % @@ -46,18 +44,18 @@ \subsubsection{Decreasing Wishart process computational cost} This can greatly improve computation times, especially as such hardware is becoming increasingly available at a lower cost. This stands in contrast with using sampling methods to infer model parameters. % -It may be worthwile to try such sampling methods as inference routine, replacing our \gls{vi} routine. +It may be worthwhile to try such sampling methods as an inference routine, replacing our \gls{vi} routine. Especially promising may be the application of elliptical slice sampling~\parencite{Murray2010}. In this paradigm we may not need the mean-field assumption anymore.
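To make the dimensionality reduction of the factored variant discussed above explicit, the following is a minimal sketch, assuming the usual construction $\boldsymbol{\Sigma}_n = \mathbf{A} \mathbf{F}_n \mathbf{F}_n^\top \mathbf{A}^\top$ (the exact notation of \cref{eq:sigma-definition} may differ):
\begin{equation*}
    \underbrace{\boldsymbol{\Sigma}_n}_{D \times D}
    = \underbrace{\mathbf{A}}_{D \times K} \,
      \underbrace{\mathbf{F}_n \mathbf{F}_n^\top}_{K \times K} \,
      \underbrace{\mathbf{A}^\top}_{K \times D} ,
    \qquad K \ll D ,
\end{equation*}
so that the number of latent \gls{gp} functions to infer drops from $D \nu$ to $K \nu$; a diagonal term would typically be added to keep $\boldsymbol{\Sigma}_n$ full rank.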
%% -\subsubsection{Revisiting the zero mean assumption} +\subsubsection{Revisiting the zero-mean assumption} %% Recall that in \cref{sec:wishart-process} we assumed the mean function $\mathbf{\mu}_n$ to be zero. In fact, this assumption is fair if we subtract the empirical mean from the data. % -However, it is possible that models that include and estimate the mean function can yield better covariance estimates. +However, it is possible that models that include an estimate of the mean function can yield better covariance estimates. This is a point made by \textcite{Lan2017} too. A natural candidate would be Gaussian process regression networks (GPRN)~\parencite{Wilson2012}. These models may also be necessary to model autocorrelation in data better. @@ -69,7 +67,7 @@ \subsubsection{Revisiting the zero mean assumption} \subsubsection{Multi-subject fMRI} %% -Typical \gls{fmri} data sets include relatively small time series data sets for a number of subjects (typically 10-20 for small studies or thousands for large biobanks). +Typical \gls{fmri} data sets are composed of relatively short time series for a number of subjects (typically 10--20 for small studies or thousands for large biobanks). In larger studies, data is usually collected across multiple sites and scanners. A remaining challenge is to model this intersubject variability~\parencite{Allen2012}. @@ -80,12 +78,11 @@ \subsubsection{Multi-subject fMRI} For example, \textcite{Ebrahimi2020} proposed a probabilistic model that considers all subject data jointly. %% -\subsection{Benchmark framework extensions} -\label{subsec:benchmark-framework-extensions} +\subsection{Benchmark framework extensions}\label{subsec:benchmark-framework-extensions} %% Although we have looked at quite a \emph{qualitatively} exhaustive range of benchmarks, there are still opportunities to add more benchmarks to the framework to improve the robustness of conclusions and acceptance by the community. -On the other hand, in general it is not about maximizing the number of data sets included, but running the lowest number of experiments to validate a certain methodological decisions. +On the other hand, in general it is not about maximizing the number of data sets included but about running the smallest number of experiments required to validate certain methodological decisions. We argue that adding benchmark data sets for the \emph{same} prediction task increases robustness, whereas adding benchmarks with completely \emph{new} prediction tasks has an extra function. % @@ -105,14 +102,13 @@ \subsubsection{Adding more data sets to the benchmarking framework} Particularly exciting is the ABCD study, which is recruiting young adolescents and tracking them for a decade~\parencite{Karcher2021}. %% -\subsubsection{Adding more prediction tasks to the benchmarking framework} -\label{subsec:spurious-brain-states} +\subsubsection{Adding more prediction tasks to the benchmarking framework}\label{subsec:spurious-brain-states} %% Many prediction tasks can be added to the benchmarking. -Some of these would require new protocol designs and collection of new data. +Some of these would require new protocol designs and the collection of new data. -One key open question remains whether it makes sense to posit that \gls{fc} does not vary smoothly across time, but jumps between discrete states. +One key open question is whether it makes sense to posit that \gls{fc} does not vary smoothly across time but instead jumps between discrete states.
Future work should study whether this is a fair assumption. A carefully designed prediction task could in fact help to decide if brain states are artifacts of estimation method choice, or valid constructs that accurately model underlying brain dynamics. @@ -163,49 +159,48 @@ \subsubsection{Computational complexity, ease-of-use, software} The benefits of using a more sophisticated model come at the cost of (computational) complexity. Such complexity adds to the technical debt of a system~\parencite{Sculley2015} and the benefits must be carefully weighed. -The \gls{sw} approach, for instance, is very cheap, even for large cohort data sets. +The \gls{sw} approach, for instance, is cheap, even for large cohort data sets. When analyzing a large cohort data set, the \gls{wp} model may take days or weeks to model all covariance structures. This may well be worth it, and this could be alleviated by good software engineering practices such as including such estimation in acquisition pipelines to be run immediately after data collection. % As discussed in \cref{subsubsec:decreasing-wp-computational-cost}, there are possibilities to reduce computational complexity of the \gls{wp}. Moreover, it is crucial that advances in modeling get picked up in the community. -The availability of clean software with good documentation is essential for such uptake. +The availability of clean software with good documentation is essential for such an uptake. We have aimed to make implementation of the \gls{wp} in Python as straightforward as possible. All software will be made available upon publication. %% -\subsubsection{Robustness to researcher degrees of freedom} -\label{subsec:robustness} +\subsubsection{Robustness to researcher degrees of freedom}\label{subsec:robustness} %% -In light of the ongoing reproducibility crisis, academic disciplines can also learn best practices from each other~\textcite{Bell2021}. +Considering the ongoing reproducibility crisis, academic disciplines can also learn best practices from each other~\parencite{Bell2021}. A common criticism of studies in psychology and neuroscience is that researchers have the freedom to make many seemingly arbitrary choices. We have briefly touched upon the impact of such `researcher degrees of freedom'~\parencite{Gelman2013} throughout this thesis. These choices can happen at any stage, from data collection to neuroimaging data preprocessing to analysis and significance determination. -Often such choices are made unconcsciously. +Often such choices are made unconsciously. -We can take a step back and look at different types of researcher choices. +We can take a step back and look at distinct types of researcher choices. % The first set are trivial ones that do not need any justification. It does not make sense to mix human and mice brains in a study, for example. -The second are those where strong theoretical guarantees exist, or where a strong argument can be made on the basis of mental reasoning. +The second are those where strong theoretical guarantees exist, or where a compelling argument can be made on the basis of mental reasoning. % Then there are those we can benchmark. An excellent example is \textcite{Li2019a}, where it is shown that if you want to use \gls{fc} to characterize subject measures, it is best to apply \gls{gsr} in data preprocessing pipelines. -As such, in Bayesian jargon, benchmarking can be considered marginalizating out a single researcher choice. 
-Alternatively, or complementary, benchmarking or mapping exercises can also provide a rich profile to allow researcher to make more informed decisions. +As such, in Bayesian jargon, benchmarking can be seen as marginalizing out a single researcher choice. +Alternatively, or in a complementary fashion, benchmarking or mapping exercises can also provide a rich profile to allow researchers to make more informed decisions. For example, in an extensive study, \textcite{Wang2014} proposed a framework to compare different connectivity measures. % -However, no matter how rigorous we motivate our choices, some will always be left to make for the foreseeable future. +However, no matter how rigorously we motivate our choices, some will always remain to be made for the foreseeable future. For example, let us consider the study in \cref{ch:ukb}. Which data set do we study? How do we define depression? Do we use atlas A or B for parcellation~\parencite[see also][]{Dadashkarimi2021}? Do we consider time-domain or frequency-domain coupling as an indication of connectivity strength between two brain regions? -One way to increase robustness of our conclusions is to \emph{formalize} these choice using a multiverse analysis framework~\parencite{Steegen2016}. -This views each decision `universe' in parallel, and aims to run the analysis in each universe and then examine the full multiverse of results. +One way to increase the robustness of our conclusions is to \emph{formalize} these choices using a multiverse analysis framework~\parencite{Steegen2016}. +This views each decision `universe' in parallel and aims to run the analysis in each universe and then examine the full multiverse of results. If a scientific conclusion holds in most universes, it can be considered more robust. Of course, this approach is often infeasible or unrealistic. In cases where obtaining experimental results is expensive, methods using active learning can help. @@ -223,8 +218,7 @@ \subsubsection{Concrete advice for practitioners} If the decision is made to use \gls{sw} approaches, we strongly recommend cross-validating the window length. %% -\subsection{Outlook on TVFC in depression research} -\label{subsec:outlook-depression} +\subsection{Outlook on TVFC in depression research}\label{subsec:outlook-depression} %% Understanding depression requires different perspectives. @@ -248,7 +242,7 @@ \subsubsection{Bridging the levels} Key to making progress is to find common languages and bridge levels of understanding and scientific inquiry. As briefly touched upon, computational psychiatry and endophenotype research can act as conduits for mechanistic insight into depression. -Linking such studies to neuroimaging has great potential. +Linking such studies to neuroimaging has exciting potential. However, we do acknowledge this as a vastly complex approach, and one that requires collaboration between respective experts. Another bridge is neurochemistry, a relatively overlooked point of view. @@ -272,7 +266,7 @@ \subsubsection{Improved labels} For example, \textcite{Chekroud2017} found three clusters of depressed patients, each with different responses to treatment. In fact, \textcite{Drysdale2017} performed a similar clustering approach on \gls{rs-fmri} data, identifying four clusters based on \gls{tms} therapy response. -An interesting venue for collecting phenotypic data is through online testing. +An interesting avenue for collecting phenotypic data is online testing.
This could be done through online testing tools such as Amazon Mechanical Turk, or through smartphone and wearable data~\parencite[see e.g.][]{Shapiro2013, Brown2014b}. The latter is referred to as `digital phenotyping' and is more ecologically valid. @@ -284,10 +278,10 @@ \subsubsection{Specificity and applications} Moreover, broad brain networks are typically studied. These study characteristics may lead to a lack of specificity. Such specificity may be important in certain circumstances. -For example, a clinician would rather know a precise treatment target, instead of broadly know that the \gls{dmn} is affected. -One big drawback of more fine-grained studies~\parencite[e.g.][]{Klein-Flamp2022} is that \gls{fmri} signals become more noisy when we look at smaller brain region parcellations. -Another drawback of specificity in disorder characterization is that specific (sub)factor models of depression (e.g. collected with ad-hoc questionnaires) can typically only be collected for a much smaller number of participants than established measures as we investigate in this work. -Hence we are stuck in a trade-off situation. +For example, a clinician would rather know a precise treatment target than broadly know that the \gls{dmn} is affected. +One big drawback of more fine-grained studies~\parencite[e.g.][]{Klein-Flamp2022} is that \gls{fmri} signals become noisier when we look at smaller brain region parcellations. +Another drawback of specificity in disorder characterization is that specific (sub)factor models of depression (e.g.~collected with ad-hoc questionnaires) can typically only be collected for a much smaller number of participants than the established measures we investigate in this work. +Hence, we are stuck in a trade-off situation. %% \subsubsection{The importance of theory} %% @@ -303,4 +297,4 @@ \subsubsection{The importance of theory} Most importantly, these theory-driven and data-driven approaches are complementary~\parencite{Huys2016}. Insights from theoretical models can be added as inductive biases to machine learning models. -And computationl modeling can be used to falsify or assign credit to various theoretic models. +And computational modeling can be used to falsify or assign credit to various theoretical models. diff --git a/ch/5_Discussion/3_Concluding_remarks.tex b/ch/5_Discussion/3_Concluding_remarks.tex index 4ba7c52..0477156 100644 --- a/ch/5_Discussion/3_Concluding_remarks.tex +++ b/ch/5_Discussion/3_Concluding_remarks.tex @@ -1,12 +1,11 @@ \clearpage -\section{Concluding remarks} -\label{sec:concluding-remarks} +\section{Concluding remarks}\label{sec:concluding-remarks} %%%%% -When I started my PhD I did not have much of a background in neuroscience or psychology. +When I started my PhD, I did not have much of a background in neuroscience or psychology. At this early stage, I attended an excellent workshop on psychology as a robust science, taught by Amy Orben. -It had a great impact on me. -On some level I consider myself a child of the reproducibility crisis in psychology. +It had a profound impact on me. +On some level, I consider myself a child of the reproducibility crisis in psychology. Consider the following two stories. @@ -16,14 +15,14 @@ \section{Concluding remarks} In 2012 a TED talk about power poses was uploaded to the internet and it quickly became the website's most-watched talk ever. However, it turned out that the research it was based on did not replicate.
-The TED website now even has a disclaimer saying this research is debated, in the style of a social media platform adding disclaimers to stories from dodgy sources. +The TED website now even has a disclaimer saying this research is debated, in the style of a social media platform adding disclaimers to stories from unreliable sources. What is the moral of these stories? Commitment to being a scientist does not just mean that we should try to speak the truth as best as we can. -As elegantly put by Richard Feynman: it is about bending over backwards to provide as much information as possible as to why we may be wrong. +As elegantly put by Richard Feynman: it is about bending over backward to provide as much information as possible as to why we may be wrong. This thesis certainly does not provide any conclusive evidence. -However, it is my hope that it takes us a small yet \emph{robust} step into the right direction. +However, it is my hope that it takes us a small yet \emph{robust} step in the right direction. Depression is a topic that desperately needs better understanding. -I hope the overal style of this work contributes to a science that is trustworthy and cumulative in nature. +Above all, I hope the overall style of this work contributes to a science that is trustworthy and cumulative in nature. diff --git a/main.tex b/main.tex index c18d21e..f04a756 100644 --- a/main.tex +++ b/main.tex @@ -11,7 +11,6 @@ %%%%%%%%%%%%%%%%%%%%%%%%%%% \input{misc/headings} % uncomment for fancy layout \input{misc/preamble} -\input{misc/thesis-info} \input{misc/glossaries} %%%%%%%%%%%%%%%%%%%%%%%%%%% \newcommand\optionalindent{} % uncomment for fancy layout @@ -21,32 +20,30 @@ %%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%% \input{misc/title_page} -% \input{misc/dedication} \thispagestyle{empty} \input{misc/declaration} \clearpage \thispagestyle{empty} \input{misc/acknowledgements} -% \input{misc/authorization} \input{misc/abstract} %%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%% \tableofcontents %% \clearpage +% \addtocontents{lof}{\protect\addcontentsline{toc}{chapter}{List of Figures}} \addcontentsline{toc}{chapter}{List of Figures} \listoffigures %% \clearpage -\addcontentsline{toc}{chapter}{List of Tables} -\listoftables +\thispagestyle{empty} +\listoftables\addcontentsline{toc}{chapter}{List of Tables} %% \clearpage +\thispagestyle{empty} \addcontentsline{toc}{chapter}{Acronyms} % \printglossary[type=\acronymtype,title=List of Abbreviations] \printnoidxglossaries -%% -% \input{misc/notations} %%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%% \renewcommand\optionalindent{\tabto{1.2cm}} % uncomment for fancy layout @@ -99,15 +96,12 @@ \input{appendix/03_extra_benchmarking_results} \input{appendix/hcp_extra_results} \input{appendix/04_ukb_with_other_methods} -% \input{appendix/ukb_graph_analysis} -% \input{appendix/four_levels_of_computational_psychiatry} %% %%%%%%%%%%%%%%%%%%%%%%%%%%% \clearpage -\nolinenumbers +% \nolinenumbers \renewcommand\optionalindent{} % uncomment for fancy layout? 
\renewcommand{\bibname}{References} % changes the default name `Bibliography` -> `References' -\addcontentsline{toc}{chapter}{References} -\printbibliography +\addcontentsline{toc}{chapter}{References}\printbibliography %%%%%%%%%%%%%%%%%%%%%%%%%%% \end{document} diff --git a/misc/acknowledgements.tex b/misc/acknowledgements.tex index a72d3cb..350b77d 100644 --- a/misc/acknowledgements.tex +++ b/misc/acknowledgements.tex @@ -4,7 +4,7 @@ \chapter*{Acknowledgements} You managed to inspire me and gave me new insights in every conversation we had. Thank you for turning me into a neuroscientist, and thank you for believing in me. I also thank Andrew~Welchman for providing supervision and for his feedback. -Furthermore, I want to thank my advisor, Carola-Bibiane~Schönlieb, for her kindness and for being an inspiration, and together with Deborah~Vickers for providing valuable feedback on my first year report. +Furthermore, I want to thank my advisor, Carola-Bibiane~Schönlieb, for her kindness and for being an inspiration, and together with Deborah~Vickers for providing valuable feedback on my first-year report. I would like to thank Joana~Taylor~Tavares for creating an amazing and productive atmosphere in our lab, and for being patient with me. I also thank Máté~Lengyel for inviting me to attend his journal club at the CBL, where I got a lot of new perspectives. I warmly thank Mark~van~der~Wilk for his precise, insightful, and creative guidance on machine learning matters, not just on a technical level but also on a philosophical one. @@ -16,15 +16,15 @@ \chapter*{Acknowledgements} I would also like to thank lab members Michael~Burkhart, Lorena~Santamaria~Covarrubias, Poly~Frangou, Máiréad~Healy, Vasilis~Karlaftis, Liz~Lee, Rui~Li, Elizabeth~Michael, Avraam~Papadopoulos, Reuben~Rideaux, Cecilia~Steinwurzel, Chie~Takahashi, Mengxin~Wang, and Elisa~Zamboni for welcoming me, for being my friend, and for making me feel at home in the lab. I am grateful to Richard~Bethlehem and Varun~Warrier at the Department of Psychiatry for their incredibly professional and helpful support. -I thank Roland~Fleming and Kate~Storrs for welcoming me in Giessen and for providing a warm environment in the middle of winter. +I thank Roland~Fleming and Kate~Storrs for welcoming me to Giessen and for providing a warm environment in the middle of winter. I would also like to thank the entire Marie~Sk\l{}odowska-Curie DyViTo network for the cross-pollination and amazing conference in Cappadocia. -I express my gratitude to the European~Union's Horizon~2020 program and the respective tax payers that indirectly funded this work. +I express my gratitude to the European~Union's Horizon~2020 program and the respective taxpayers that indirectly funded this work. Furthermore, I would like to thank the organizers of the machine learning summer school in T\"{u}bingen for giving me the opportunity to present an early iteration of this work, which allowed me to get valuable feedback. I would also like to thank Queens' College, the Department of Psychology, the NHS, and the broader society in Cambridge for providing a community and being resilient in the face of the pandemic. I also want to acknowledge all volunteers and everyone involved in data collection for the UK Biobank, the Human Connectome Project, and the Rockland sample, for putting in so much time and work. -Your contribution to science and progress is of great value. +Your contribution to science and progress is of immense value. 
I thank my friends and family for being there for me. Thank you, Fitz; you do not know me yet, but wanting to meet you has helped me a lot in making that final push. diff --git a/misc/glossaries.tex b/misc/glossaries.tex index acc1d69..9b55572 100644 --- a/misc/glossaries.tex +++ b/misc/glossaries.tex @@ -9,8 +9,10 @@ \newacronym{amg}{AMG}{amygdala} \newacronym{ann}{ANN}{Artificial Neural Network} +\newacronym{bmi}{BMI}{body mass index} \newacronym{bold}{BOLD}{blood oxygenation level dependent} +\newacronym{cap}{CAP}{coactivation pattern} \newacronym{cbm}{CBM}{cerebellum} \newacronym{cen}{CEN}{central executive network} \newacronym{cidi-sf}{CIDI-SF}{Composite International Diagnostic Interview Short Form} @@ -85,13 +87,14 @@ \newacronym{na}{NA}{noradrenaline} \newacronym{nac}{NAc}{nucleus accumbens} \newacronym{nirs}{NIRS}{near infrared spectroscopy} -\newacronym{nlp}{NLP}{Natural Language Processing} +\newacronym{nlp}{NLP}{natural language processing} \newacronym{ofc}{OFC}{orbitofrontal cortex} \newacronym{ols}{OLS}{ordinary least squares} \newacronym{pcc}{PCC}{posterior cingulate cortex} \newacronym{pccx}{PCCx}{paracingulate cortex} +\newacronym{pd}{PD}{Parkinson's disease} \newacronym{pet}{PET}{positron emission tomography} \newacronym{pfc}{PFC}{prefrontal cortex} \newacronym{pha}{PHA}{parahippocampal area} @@ -110,6 +113,7 @@ \newacronym{rsn}{RSN}{resting-state network} \newacronym{sfc}{sFC}{static functional connectivity} +\newacronym{sm}{SM}{subject measure} \newacronym{sn}{SN}{salience network} \newacronym{snr}{SNR}{signal-to-noise ratio} \newacronym{ssri}{SSRI}{selective serotonin reuptake inhibitor} diff --git a/misc/preamble.tex b/misc/preamble.tex new file mode 100644 index 0000000..9308a83 --- /dev/null +++ b/misc/preamble.tex @@ -0,0 +1,91 @@ +% Returns the width of the current document in pts. +% This can be useful when generating figures, so aspect ratios are preserved. +% \showthe\textwidth +% Currently 360.0pt, independent from margin, but this could change still! + +% The inputenc package is ignored with utf8 based engines. +% \usepackage[utf8]{inputenc} + +\input{misc/references} + +%%%%%%%%% +% BOXES % +%%%%%%%%% + +% For adding e.g. Box 1. +\usepackage{tcolorbox} +\newtcolorbox[auto counter]{mybox}[2][]{ + float, + fontupper=\footnotesize, + fontlower=\footnotesize, + title={Box~\thetcbcounter: #2}, + #1 +} + +% https://en.wikibooks.org/wiki/LaTeX/Colors +% e.g. define a color, then add colback=my-blue +% \definecolor{my-blue}{cmyk}{0.80, 0.13, 0.14, 0.04, 1.00} + +%%%%%%%%%%% +% GENERAL % +%%%%%%%%%%% + +\usepackage{amsfonts} % blackboard math symbols +\usepackage{amsmath} +\usepackage{amsthm} +%% +\usepackage[toc,page]{appendix} +%% +\usepackage{booktabs} % professional-quality tables +\usepackage[font=small,labelfont=it]{caption} % small is 11pt when normalsize is 12pt +%% +\usepackage{cleveref} % must be loaded after amsmath +%% +% \usepackage{mathptmx} % Times New Roman +\usepackage[T1]{fontenc} % use 8-bit T1 fonts +%% +\usepackage[ + left=30mm, + right=30mm, + top=35mm, + bottom=30mm +]{geometry} +\usepackage{graphicx} +\usepackage{lettrine} % for dropped capital letters (2 lines high) +\usepackage{mathtools} +%% +\usepackage{microtype} % microtypography (improves visual appearance) +\usepackage{nicefrac} % compact symbols for 1/2, etc. 
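% Example usage of the `mybox' environment defined earlier in this preamble
% (an illustrative sketch, not taken from the thesis chapters): the optional
% argument passes extra tcolorbox options such as colback, and the mandatory
% argument sets the box title.
% \begin{mybox}[colback=white]{A short box title}
%   Box body text goes here.
% \end{mybox}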
+\usepackage{setspace} % define line spacing in paragraph +%% +\usepackage[labelfont=bf,textfont=normalfont]{subcaption} % allows for subplots +%% +\usepackage{lipsum} % Dummytext +\usepackage{xargs} % Use more than one optional parameter in a new commands +\usepackage{url} % simple URL typesetting + +%%%%%%%%%%%%%%%%%%%%% +% HEADERS & FOOTERS % +%%%%%%%%%%%%%%%%%%%%% + +\usepackage{fancyhdr} % Includes header on each page +\setlength{\headheight}{14.5pt} % Required to avoid overfull vbox warnings (13.6pt for 11pt, 14.5 for 12pt) +\fancyhead{} +\fancyfoot{} + +% \fancyhead[RE]{Chapter \thechapter} +% \fancyhead[RO]{\rightmark} + +% \fancyhead[LE]{\thepage} % uncomment for two-sided version +% \fancyhead[RO]{\thepage} % uncomment for two-sided version + +\fancyhead[R]{\thepage} % uncomment for one-sided version (draft) + +% \fancyfoot[LE,RO]{\thepage} + +\input{misc/todo} + +\usepackage{lineno} +% \doublespacing +\onehalfspacing +% \linespread{1.25} % this is equal to 1.5 linespacing in Microsoft Word diff --git a/misc/title_page.tex b/misc/title_page.tex index 6496196..ef37c5b 100644 --- a/misc/title_page.tex +++ b/misc/title_page.tex @@ -1,7 +1,7 @@ \begin{titlepage} \begin{center} - \vspace*{10mm} + \vspace*{6mm} \Huge \textbf{Robust time-varying functional connectivity estimation and its relevance for depression} @@ -25,7 +25,7 @@ This thesis is submitted for the degree of\\ \textit{Doctor of Philosophy} - \vspace{15mm} + \vspace{19mm} Queens' College \hspace*{\fill} September 2022 diff --git a/references.bib b/references.bib index 42adb4c..f344fd0 100644 --- a/references.bib +++ b/references.bib @@ -2,9 +2,11 @@ @string{aistats @string{apa = {American Psychological Association}} @string{biopsych = {Biological Psychiatry}} @string{bjo = {BJPsych Open}} +@string{braincomms = {Brain Communications}} @string{cdips = {Current Directions in Psychological Science}} @string{cercor = {Cerebral Cortex}} @string{clinph = {Clinical Neurophysiology}} +@string{cobs = {Current Opinion in Behavioral Sciences}} @string{conb = {Current Opinion in Neurobiology}} @string{conl = {Current Opinion in Neurology}} @string{elife = {eLife}} @@ -18,32 +20,36 @@ @string{fpsyg @string{fnsys = {Frontiers in Systems Neuroscience}} @string{hbm = {Human Brain Mapping}} @string{iclr = {International Conference on Learning Representations}} +@string{iclr2 = {2nd International Conference on Learning Representations, ICLR 2014 - Conference Track Proceedings}} @string{icml = {International Conference on Machine Learning}} @string{icml27 = {Proceedings of the 27th International Conference on Machine Learning, ICML 2010}} @string{icml29 = {Proceedings of the 29th International Conference on Machine Learning, ICML 2012}} @string{jad = {Journal of Affective Disorders}} @string{jasa = {Journal of the American Statistical Association}} +@string{jocn = {Journal of Cognitive Neuroscience}} @string{jeconom = {Journal of Econometrics}} @string{jmlr = {Journal of Machine Learning Research}} @string{jnnp = {Journal of Neurology, Neurosurgery and Psychiatry}} @string{jnp = {Journal of Neuropsychiatry and Clinical Neurosciences}} -@string{jocn = {Journal of Cognitive Neuroscience}} +@string{jnph = {Journal of Neurophysiology}} @string{jon = {Journal of Neuroscience}} @string{jpn = {Journal of Psychiatry and Neuroscience}} +@string{jrss = {Journal of the Royal Statistical Society}} @string{nas = {National Academy of Sciences}} -@string{nc = {Nature Communications}} +@string{natc = {Nature Communications}} +@string{natg = {Nature 
Genetics}} +@string{nathb = {Nature Human Behaviour}} @string{netn = {Network Neuroscience}} @string{neures = {Neuroscience Research}} @string{neurips = {Advances in Neural Information Processing Systems}} @string{ni = {NeuroImage}} @string{nicl = {NeuroImage: Clinical}} -@string{npg = {Nature Publishing Group}} @string{nm = {Nature Medicine}} +@string{nmet = {Nature Methods}} @string{nn = {Nature Neuroscience}} @string{npp = {Neuropsychopharmacology}} @string{nrdp = {Nature Reviews Disease Primers}} @string{nrn = {Nature Reviews Neuroscience}} -@string{oa = {Oxford Academic}} @string{oup = {Oxford University Press}} @string{pcbi = {PLoS Computational Biology}} @string{pnas = {Proceedings of the National Academy of Sciences of the United States of America}} @@ -59,10 +65,8 @@ @string{tp %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% @article{Abraham2014, - archivePrefix = {arXiv}, author = {Abraham, Alexandre and Pedregosa, Fabian and Eickenberg, Michael and Gervais, Philippe and Mueller, Andreas and Kossaifi, Jean and Gramfort, Alexandre and Thirion, Bertrand and Varoquaux, Ga{\"{e}}l}, doi = {10.3389/fninf.2014.00014}, - eprint = {1412.3919}, journal = fninf, number = {FEB}, title = {{Machine learning for neuroimaging with scikit-learn}}, @@ -88,11 +92,9 @@ @article{Abrol2017 year = {2017} } @inproceedings{Adam2020, - archivePrefix = {arXiv}, author = {Adam, Vincent and Eleftheriadis, Stefanos and Durrande, Nicolas and Artemev, Artem and Hensman, James}, booktitle = aistats, doi = {10.48550/arXiv.2001.05363}, - eprint = {2001.05363v1}, title = {{Doubly sparse variational Gaussian processes}}, year = {2020} } @@ -151,7 +153,6 @@ @article{Allen2014 number = {3}, pages = {663--676}, title = {{Tracking whole-brain connectivity dynamics in the resting state}}, - url = {https://pubmed.ncbi.nlm.nih.gov/23146964/}, volume = {24}, year = {2014} } @@ -173,6 +174,16 @@ @article{AlonsoMartinez2020 volume = {14}, year = {2020} } +@article{Amaro2006, + author = {Amaro, Edson and Barker, Gareth J.}, + doi = {10.1016/j.bandc.2005.11.009}, + journal = {Brain and Cognition}, + number = {3}, + pages = {220--232}, + title = {{Study design in fMRI: Basic principles}}, + volume = {60}, + year = {2006} +} @article{Aminoff2013, author = {Aminoff, Elissa M. and Kveraga, Kestutis and Bar, Moshe}, doi = {10.1016/j.tics.2013.06.009}, @@ -324,11 +335,9 @@ @article{Bastos2016 year = {2016} } @inproceedings{Bauer2016, - archivePrefix = {arXiv}, author = {Bauer, Matthias and {van der Wilk}, Mark and Rasmussen, Carl Edward}, booktitle = neurips, doi = {10.48550/arXiv.1606.04820}, - eprint = {1606.04820}, pages = {1533--1541}, title = {{Understanding probabilistic sparse Gaussian process approximations}}, year = {2016} @@ -374,11 +383,9 @@ @article{Beckmann2004 year = {2004} } @inproceedings{Bell2021, - archivePrefix = {arXiv}, author = {Bell, Samuel J. 
and Kampman, Onno P.}, booktitle = {International Conference on Learning Representations - Science and Engineering of Deep Learning Workshop}, doi = {10.48550/arXiv.2104.08878}, - eprint = {2104.08878}, title = {{Perspectives on machine learning from psychology's reproducibility crisis}}, year = {2021} } @@ -393,7 +400,7 @@ @inproceedings{Bell2022 } @article{Benjamini1995, author = {Benjamini, Yoav and Hochberg, Yosef}, - journal = {Journal of the Royal Statistical Society}, + journal = jrss, number = {1}, pages = {289--300}, title = {{Controlling the False Discovery Rate: A practical and powerful approach to multiple testing}}, @@ -448,6 +455,7 @@ @incollection{Betzel2022 author = {Betzel, Richard F.}, booktitle = {Connectomic Deep Brain Stimulation}, doi = {10.1016/b978-0-12-821861-7.00002-6}, + editor = {Horn, Andreas}, eprint = {2010.01591}, pages = {25--58}, title = {{Network neuroscience and the connectomics revolution}}, @@ -496,6 +504,7 @@ @article{Bommasani2021 archivePrefix = {arXiv}, author = {Bommasani, Rishi and Hudson, Drew A. and Adeli, Ehsan and Altman, Russ and Arora, Simran and von Arx, Sydney and Bernstein, Michael S. and Bohg, Jeannette and Bosselut, Antoine and Brunskill, Emma and Brynjolfsson, Erik and Buch, Shyamal and Card, Dallas and Castellon, Rodrigo and Chatterji, Niladri and Chen, Annie and Creel, Kathleen and Davis, Jared Quincy and Demszky, Dora and Donahue, Chris and Doumbouya, Moussa and Durmus, Esin and Ermon, Stefano and Etchemendy, John and Ethayarajh, Kawin and Fei-Fei, Li and Finn, Chelsea and Gale, Trevor and Gillespie, Lauren and Goel, Karan and Goodman, Noah and Grossman, Shelby and Guha, Neel and Hashimoto, Tatsunori and Henderson, Peter and Hewitt, John and Ho, Daniel E. and Hong, Jenny and Hsu, Kyle and Huang, Jing and Icard, Thomas and Jain, Saahil and Jurafsky, Dan and Kalluri, Pratyusha and Karamcheti, Siddharth and Keeling, Geoff and Khani, Fereshte and Khattab, Omar and Koh, Pang Wei and Krass, Mark and Krishna, Ranjay and Kuditipudi, Rohith and Kumar, Ananya and Ladhak, Faisal and Lee, Mina and Lee, Tony and Leskovec, Jure and Levent, Isabelle and Li, Xiang Lisa and Li, Xuechen and Ma, Tengyu and Malik, Ali and Manning, Christopher D. and Mirchandani, Suvir and Mitchell, Eric and Munyikwa, Zanele and Nair, Suraj and Narayan, Avanika and Narayanan, Deepak and Newman, Ben and Nie, Allen and Niebles, Juan Carlos and Nilforoshan, Hamed and Nyarko, Julian and Ogut, Giray and Orr, Laurel and Papadimitriou, Isabel and Park, Joon Sung and Piech, Chris and Portelance, Eva and Potts, Christopher and Raghunathan, Aditi and Reich, Rob and Ren, Hongyu and Rong, Frieda and Roohani, Yusuf and Ruiz, Camilo and Ryan, Jack and R{\'{e}}, Christopher and Sadigh, Dorsa and Sagawa, Shiori and Santhanam, Keshav and Shih, Andy and Srinivasan, Krishnan and Tamkin, Alex and Taori, Rohan and Thomas, Armin W. and Tram{\`{e}}r, Florian and Wang, Rose E. and Wang, William and Wu, Bohan and Wu, Jiajun and Wu, Yuhuai and Xie, Sang Michael and Yasunaga, Michihiro and You, Jiaxuan and Zaharia, Matei and Zhang, Michael and Zhang, Tianyi and Zhang, Xikun and Zhang, Yuhui and Zheng, Lucia and Zhou, Kaitlyn and Liang, Percy}, eprint = {2108.07258}, + journal = {arXiv}, pages = {1--214}, title = {{On the opportunities and risks of foundation models}}, year = {2021} @@ -651,7 +660,7 @@ @article{Caballero-Gaudes2017 @article{Cai2020, author = {Cai, Na and Revez, Joana A. and Adams, Mark J. and Andlauer, Till F. M. and Breen, Gerome and Byrne, Enda M. 
and Clarke, Toni Kim and Forstner, Andreas J. and Grabe, Hans J. and Hamilton, Steven P. and Levinson, Douglas F. and Lewis, Cathryn M. and Lewis, Glyn and Martin, Nicholas G. and Milaneschi, Yuri and Mors, Ole and M{\"{u}}ller-Myhsok, Bertram and Penninx, Brenda W.J.H. and Perlis, Roy H. and Pistis, Giorgio and Potash, James B. and Preisig, Martin and Shi, Jianxin and Smoller, Jordan W. and Streit, Fabien and Tiemeier, Henning and Uher, Rudolf and {Van der Auwera}, Sandra and Viktorin, Alexander and Weissman, Myrna M. and Kendler, Kenneth S. and Flint, Jonathan}, doi = {10.1038/s41588-020-0594-5}, - journal = {Nature Genetics}, + journal = natg, number = {4}, pages = {437--447}, title = {{Minimal phenotyping yields genome-wide association signals of low specificity for major depression}}, @@ -794,6 +803,15 @@ @book{Cohen1988 title = {Statistical Power Analysis for the Behavioral Sciences}, year = {1988} } +@article{Cohen2018, + author = {Cohen, Jessica R.}, + doi = {10.1016/j.neuroimage.2017.09.036}, + journal = ni, + pages = {515--525}, + title = {{The behavioral and cognitive relevance of time-varying, dynamic changes in functional connectivity}}, + volume = {180}, + year = {2018} +} @article{Cole2010, author = {Cole, David M. and Smith, Stephen M. and Beckmann, Christian F.}, doi = {10.3389/fnsys.2010.00008}, @@ -820,7 +838,6 @@ @article{Collins2012 number = {9822}, pages = {1173--1174}, title = {{What makes UK Biobank special?}}, - url = {https://pubmed.ncbi.nlm.nih.gov/22463865/}, volume = {379}, year = {2012} } @@ -830,10 +847,18 @@ @article{Connolly2017 journal = jad, pages = {86--94}, title = {{Resting-state functional connectivity of the amygdala and longitudinal changes in depression severity in adolescent depression}}, - url = {https://pubmed.ncbi.nlm.nih.gov/27716542/}, volume = {207}, year = {2017} } +@article{Coppola2022, + author = {Coppola, Peter and Spindler, Lennart R. B. and Luppi, Andrea I. and Adapa, Ram and Naci, Lorina and Allanson, Judith and Finoia, Paola and Williams, Guy B. and Pickard, John D. and Owen, Adrian M. and Menon, David K. and Stamatakis, Emmanuel A.}, + doi = {10.1016/j.neuroimage.2022.119128}, + journal = ni, + pages = {119--128}, + title = {{Network dynamics scale with levels of awareness}}, + volume = {254}, + year = {2022} +} @book{Cover2005, author = {Cover, Thomas M. and Thomas, Joy A.}, edition = {2}, @@ -886,7 +911,7 @@ @inproceedings{Dadashkarimi2021 @article{Dafflon2022, author = {Dafflon, Jessica and {F. Da Costa}, Pedro and V{\'{a}}{\v{s}}a, Franti{\v{s}}ek and Monti, Ricardo Pio and Bzdok, Danilo and Hellyer, Peter J. and Turkheimer, Federico and Smallwood, Jonathan and Jones, Emily and Leech, Robert}, doi = {10.1038/s41467-022-31347-8}, - journal = nc, + journal = natc, number = {1}, pages = {1--13}, title = {{A guided multiverse study of neuroimaging analyses}}, @@ -903,14 +928,22 @@ @article{Dajani2015 volume = {38}, year = {2015} } +@article{Damaraju2014, + author = {Damaraju, E. and Allen, E. A. and Belger, A. and Ford, J. M. and McEwen, S. and Mathalon, D. H. and Mueller, B. A. and Pearlson, G. D. and Potkin, S. G. and Preda, A. and Turner, J. A. and Vaidya, J. G. and {Van Erp}, T. G. and Calhoun, V. 
D.}, + doi = {10.1016/j.nicl.2014.07.003}, + journal = nicl, + pages = {298--308}, + title = {{Dynamic functional connectivity analysis reveals transient states of dysconnectivity in schizophrenia}}, + volume = {5}, + year = {2014} +} @article{Dannlowski2009, author = {Dannlowski, Udo and Ohrmann, Patricia and Konrad, Carsten and Domschke, Katharina and Bauer, Jochen and Kugel, Harald and Hohoff, Christa and Sch{\"{o}}ning, Sonja and Kersting, Anette and Baune, Bernhard T. and Mortensen, Lena S. and Arolt, Volker and Zwitserlood, Pienie and Deckert, Jrgen and Heindel, Walter and Suslow, Thomas}, doi = {10.1017/S1461145708008973}, journal = {International Journal of Neuropsychopharmacology}, number = {1}, pages = {11--22}, - title = {{Reduced amygdala–-prefrontal coupling in major depression: Association with MAOA genotype and illness severity}}, - url = {https://academic.oup.com/ijnp/article/12/1/11/627021}, + title = {{Reduced amygdala--prefrontal coupling in major depression: Association with MAOA genotype and illness severity}}, volume = {12}, year = {2009} } @@ -919,14 +952,14 @@ @article{Davis2020 doi = {10.1192/bjo.2019.100}, journal = bjo, number = {2}, - title = {{Mental health in UK Biobank – development, implementation and results from an online questionnaire completed by 157 366 participants: A reanalysis}}, + title = {{Mental health in UK Biobank - development, implementation and results from an online questionnaire completed by 157 366 participants: A reanalysis}}, volume = {6}, year = {2020} } @article{Daws2022, author = {Daws, Richard E. and Timmermann, Christopher and Giribaldi, Bruna and Sexton, James D. and Wall, Matthew B. and Erritzoe, David and Roseman, Leor and Nutt, David and Carhart-Harris, Robin}, doi = {10.1038/s41591-022-01744-z}, - journal = {Nature Medicine}, + journal = nm, number = {4}, pages = {844--851}, title = {{Increased global integration in the brain after psilocybin therapy for depression}}, @@ -995,7 +1028,6 @@ @article{Demirtas2016 number = {8}, pages = {2918--2930}, title = {{Dynamic functional connectivity reveals altered variability in functional connectivity among patients with major depressive disorder}}, - url = {https://onlinelibrary.wiley.com/doi/full/10.1002/hbm.23215 https://onlinelibrary.wiley.com/doi/abs/10.1002/hbm.23215 https://onlinelibrary.wiley.com/doi/10.1002/hbm.23215}, volume = {37}, year = {2016} } @@ -1055,7 +1087,7 @@ @article{Douw2022 } @inproceedings{Drevets2000, author = {Drevets, Wayne C.}, - booktitle = {Biological Psychiatry}, + booktitle = biopsych, doi = {10.1016/S0006-3223(00)01020-9}, number = {8}, pages = {813--829}, @@ -1076,7 +1108,7 @@ @article{Drevets2008 @article{Drysdale2017, author = {Drysdale, Andrew T. and Grosenick, Logan and Downar, Jonathan and Dunlop, Katharine and Mansouri, Farrokh and Meng, Yue and Fetcho, Robert N. and Zebley, Benjamin and Oathes, Desmond J. and Etkin, Amit and Schatzberg, Alan F. and Sudheimer, Keith and Keller, Jennifer and Mayberg, Helen S. and Gunning, Faith M. and Alexopoulos, George S. and Fox, Michael D. and Pascual-Leone, Alvaro and Voss, Henning U. and Casey, B. J. and Dubin, Marc J. 
and Liston, Conor}, doi = {10.1038/nm.4246}, - journal = {Nature Medicine}, + journal = nm, number = {1}, pages = {28--38}, title = {{Resting-state connectivity biomarkers define neurophysiological subtypes of depression}}, @@ -1089,7 +1121,6 @@ @article{Du2021 journal = jad, pages = {7--15}, title = {{Abnormal transitions of dynamic functional connectivity states in bipolar disorder: A whole-brain resting-state fMRI study}}, - url = {https://pubmed.ncbi.nlm.nih.gov/33906006/}, volume = {289}, year = {2021} } @@ -1107,6 +1138,7 @@ @thesis{Duvenaud2014 institution = {University of Cambridge}, title = {Automatic model construction with Gaussian processes}, type = {PhD thesis}, + url = {https://www.repository.cam.ac.uk/handle/1810/247281}, year = {2014} } @@ -1175,6 +1207,16 @@ @article{Elliott2021 title = {{Striving toward translation: Strategies for reliable fMRI measurement}}, year = {2021} } +@article{Emmery2019, + author = {Emmery, Chris and K{\'{a}}d{\'{a}}r, {\'{A}}kos and Wiltshire, Travis J. and Hendrickson, Andrew T.}, + doi = {10.1007/s42113-019-00055-w}, + journal = {Computational Brain \& Behavior}, + number = {3-4}, + pages = {242--246}, + title = {{Towards replication in computational cognitive modeling: A machine learning perspective}}, + volume = {2}, + year = {2019} +} @article{Engle1982, author = {Engle, Robert F.}, journal = {Econometrica: Journal of the Econometric Society}, @@ -1237,7 +1279,6 @@ @article{Fan2016 number = {8}, pages = {3508--3526}, title = {{The Human Brainnetome Atlas: A new brain atlas based on connectional architecture}}, - url = {https://academic.oup.com/cercor/article/26/8/3508/2429104}, volume = {26}, year = {2016} } @@ -1377,7 +1418,7 @@ @inproceedings{Fox2015 @article{Fried2015, author = {Fried, Eiko I. and Nesse, Randolph M.}, doi = {10.1016/j.jad.2014.10.010}, - journal = {Journal of Affective Disorders}, + journal = jad, pages = {96--102}, title = {{Depression is not a consistent syndrome: An investigation of unique symptom patterns in the STAR*D study}}, volume = {172}, @@ -1410,7 +1451,7 @@ @article{Fried2022b year = {2022} } @article{Friston1995, - author = {Friston, Karl J. and Holmes, A. P. and Worsley, Keith J. and Poline, J. ‐P and Frith, C. D. and Frackowiak, R. S. J.}, + author = {Friston, Karl J. and Holmes, A. P. and Worsley, Keith J. and Poline, J. and Frith, C. D. and Frackowiak, R. S. J.}, doi = {10.1002/hbm.460020402}, journal = hbm, number = {4}, @@ -1520,6 +1561,12 @@ @article{Gershman2019 title = {{What does the free energy principle tell us about the brain?}}, year = {2019} } +@book{Gershman2021, + author = {Gershman, Samuel J.}, + publisher = {Princeton University Press}, + title = {{What makes us smart: The computational logic of human cognition}}, + year = {2021} +} @article{Ghahramani2015, author = {Ghahramani, Zoubin}, doi = {10.1038/nature14541}, @@ -1532,7 +1579,7 @@ @article{Ghahramani2015 } @article{Gillan2017, author = {Gillan, Claire M.
and Whelan, Robert}, - journal = {Current Opinion in Behavioral Sciences}, + journal = cobs, pages = {34--42}, title = {What big data can do for treatment in psychiatry}, volume = {18}, @@ -1739,7 +1786,6 @@ @article{Hakimdavoodi2020 journal = {Journal of Neural Engineering}, number = {3}, title = {{Using autoregressive-dynamic conditional correlation model with residual analysis to extract dynamic functional connectivity}}, - url = {https://pubmed.ncbi.nlm.nih.gov/32454472/}, volume = {17}, year = {2020} } @@ -1760,14 +1806,13 @@ @article{Hao2020 number = {1}, pages = {1--11}, title = {{Abnormal resting-state functional connectivity of hippocampal subfields in patients with major depressive disorder}}, - url = {https://bmcpsychiatry.biomedcentral.com/articles/10.1186/s12888-020-02490-7}, volume = {20}, year = {2020} } @article{Harville1977, author = {Harville, D. A.}, doi = {10.2307/2286796}, - journal = {Journal of the American Statistical Association}, + journal = jasa, number = {358}, pages = {320--338}, title = {{Maximum likelihood approaches to variance component estimation and to related problems}}, @@ -1859,10 +1904,8 @@ @article{Honari2019 year = {2019} } @article{Honari2021, - archivePrefix = {arXiv}, author = {Honari, Hamed and Choe, Ann S. and Lindquist, Martin A.}, doi = {10.1016/j.neuroimage.2020.117704}, - eprint = {2009.10126}, journal = ni, pages = {117704}, title = {{Evaluating phase synchronization methods in fMRI: A comparison study and new approaches}}, @@ -1950,6 +1993,17 @@ @article{Hyman2014 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +@article{Iranpour2015, + author = {Iranpour, Juliana and Morrot, Gil and Claise, B{\'{e}}atrice and Jean, Betty and Bonny, Jean Marie}, + doi = {10.1371/journal.pone.0141358}, + journal = pone, + number = {11}, + pages = {1--15}, + title = {{Using high spatial resolution to improve BOLD fMRI detection at 3T}}, + volume = {10}, + year = {2015} +} + %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% @article{Jenkinson2002, @@ -2186,7 +2240,7 @@ @article{Kim2021 @inproceedings{Kingma2014, archivePrefix = {arXiv}, author = {Kingma, Diederik P. and Welling, Max}, - booktitle = {2nd International Conference on Learning Representations, ICLR 2014 - Conference Track Proceedings}, + booktitle = iclr2, eprint = {1312.6114}, title = {{Auto-encoding variational Bayes}}, year = {2014} } @@ -2212,7 +2266,7 @@ @article{Klaassens2017 @article{Klein-Flamp2022, author = {Klein-Fl{\"{u}}gge, Miriam C. and Jensen, Daria E. A. and Takagi, Yu and Priestley, Luke and Verhagen, Lennart and Smith, Stephen M. and Rushworth, Matthew F. S.}, doi = {10.1038/s41562-022-01434-3}, - journal = {Nature Human Behaviour}, + journal = nathb, title = {{Relationship between nuclei-specific amygdala connectivity and mental health dimensions in humans}}, year = {2022} } @@ -2505,7 +2559,7 @@ @article{Libedinsky2022 year = {2022} } @article{Liegeois2017, - author = {Li{\'{e}}geois, Rapha{\"{e}}l and Laumann, Timothy and Snyder, Abraham and Zhou, Juan and Yeo, B. T. Thomas}, + author = {Li{\'{e}}geois, Rapha{\"{e}}l and Laumann, Timothy O. and Snyder, Abraham and Zhou, Juan and Yeo, B. T. Thomas}, doi = {10.1016/j.neuroimage.2017.09.012}, journal = ni, pages = {437--455}, @@ -2516,7 +2570,7 @@ @article{Liegeois2019, author = {Li{\'{e}}geois, Rapha{\"{e}}l and Li, Jingwei and Kong, Ru and Orban, Csaba and {Van De Ville}, Dimitri and Ge, Tian and Sabuncu, Mert R. and Yeo, B. T.
Thomas}, doi = {10.1038/s41467-019-10317-7}, - journal = nc, + journal = natc, number = {1}, pages = {1--9}, title = {{Resting brain dynamics at different timescales capture distinct aspects of human behavior}}, @@ -2562,7 +2616,7 @@ @article{Liston2014 @article{Littlejohns2020, author = {Littlejohns, Thomas J. and Holliday, Jo and Gibson, Lorna M. and Garratt, Steve and Oesingmann, Niels and Alfaro-Almagro, Fidel and Bell, Jimmy D. and Boultwood, Chris and Collins, Rory and Conroy, Megan C. and Crabtree, Nicola and Doherty, Nicola and Frangi, Alejandro F. and Harvey, Nicholas C. and Leeson, Paul and Miller, Karla L. and Neubauer, Stefan and Petersen, Steffen E. and Sellors, Jonathan and Sheard, Simon and Smith, Stephen M. and Sudlow, Cathie L. M. and Matthews, Paul M. and Allen, Naomi E.}, doi = {10.1038/s41467-020-15948-9}, - journal = nc, + journal = natc, number = {1}, title = {{The UK Biobank imaging enhancement of 100,000 participants: rationale, data collection, management and future directions}}, volume = {11}, @@ -2607,6 +2661,16 @@ @article{Logothetis2008 volume = {453}, year = {2008} } +@article{Luppi2019, + author = {Luppi, Andrea I. and Craig, Michael M. and Pappas, Ioannis and Finoia, Paola and Williams, Guy B. and Allanson, Judith and Pickard, John D. and Owen, Adrian M. and Naci, Lorina and Menon, David K. and Stamatakis, Emmanuel A.}, + doi = {10.1038/s41467-019-12658-9}, + journal = natc, + number = {1}, + pages = {1--12}, + title = {Consciousness-specific dynamic interactions of brain integration and functional diversity}, + volume = {10}, + year = {2019} +} @article{Luppi2021, author = {Luppi, Andrea I. and Carhart-Harris, Robin L. and Roseman, Leor and Pappas, Ioannis and Menon, David K. and Stamatakis, Emmanuel A.}, doi = {10.1016/j.neuroimage.2020.117653}, @@ -2616,6 +2680,25 @@ @article{Luppi2021 volume = {227}, year = {2021} } +@article{Luppi2021b, + author = {Luppi, Andrea I. and Stamatakis, Emmanuel A.}, + doi = {10.1162/netn_a_00170}, + journal = netn, + number = {1}, + pages = {96--124}, + title = {Combining network topology and information theory to construct representative brain networks}, + volume = {5}, + year = {2021} +} +@article{Luppi2022, + author = {Luppi, Andrea I. and Mediano, Pedro A. M. and Rosas, Fernando E. and Holland, Negin and Fryer, Tim D. and O'Brien, John T. and Rowe, James B. and Menon, David K. and Bor, Daniel and Stamatakis, Emmanuel A.}, + doi = {10.1038/s41593-022-01070-0}, + journal = nn, + pages = {771--782}, + title = {{A synergistic core for human brain evolution and cognition}}, + volume = {25}, + year = {2022} +} @article{Lurie2020, author = {Lurie, Daniel J. and Kessler, Daniel and Bassett, Danielle S. and Betzel, Richard F. and Breakspear, Michael and Kheilholz, Shella and Kucyi, Aaron and Li{\'e}geois, Rapha{\"e}l and Lindquist, Martin A. and McIntosh, Anthony R. and others}, doi = {10.1162/netn_a_00116}, @@ -2657,7 +2740,7 @@ @article{Maclean1985 @article{Malykhin2010, author = {Malykhin, Nikolai V. and Carter, Rawle and Seres, Peter and Coupland, Nicholas J.}, doi = {10.1503/jpn.100002}, - journal = {Journal of Psychiatry and Neuroscience}, + journal = jpn, number = {5}, pages = {337--343}, title = {{Structural changes in the hippocampus in major depressive disorder: Contributions of disease and treatment}}, @@ -2672,9 +2755,9 @@ @article{Manoliu2014 year = {2014} } @article{Marcus2011, - author = {Marcus, Daniel S. and Harwell, John and Olsen, Timothy and Hodge, Michael and Glasser, Matthew F. 
and Prior, Fred and Jenkinson, Mark and Laumann, Timothy and Curtiss, Sandra W. and {Van Essen}, David C.}, + author = {Marcus, Daniel S. and Harwell, John and Olsen, Timothy and Hodge, Michael and Glasser, Matthew F. and Prior, Fred and Jenkinson, Mark and Laumann, Timothy O. and Curtiss, Sandra W. and {Van Essen}, David C.}, doi = {10.3389/fninf.2011.00004}, - journal = {Frontiers in Neuroinformatics}, + journal = fninf, title = {{Informatics and data mining tools and strategies for the Human Connectome Project}}, volume = {5}, year = {2011} @@ -2692,7 +2775,7 @@ @article{Marek2022 @article{Markello2022, author = {Markello, Ross D. and Hansen, Justine Y. and Liu, Zhen-Qi and Bazinet, Vincent and Shafiei, Golia and Su{\'{a}}rez, Laura E. and Blostein, Nadia and Seidlitz, Jakob and Baillet, Sylvain and Satterthwaite, Theodore D. and Chakravarty, M. Mallar and Raznahan, Armin and Misic, Bratislav}, doi = {10.1038/s41592-022-01625-w}, - journal = {Nature Methods}, + journal = nmet, title = {neuromaps: Structural and functional interpretation of brain maps}, year = {2022} } @@ -2724,7 +2807,6 @@ @article{Matsui2019 number = {4}, pages = {1496--1508}, title = {{Neuronal origin of the temporal dynamics of spontaneous BOLD activity correlation}}, - url = {https://academic.oup.com/cercor/article/29/4/1496/4924353}, volume = {29}, year = {2019} } @@ -2751,6 +2833,16 @@ @article{Matthews2017 volume = {18}, year = {2017} } +@article{McDowell2019, + author = {McDowell, Amy R. and Carmichael, David W.}, + doi = {10.1002/mrm.27498}, + journal = {Magnetic Resonance in Medicine}, + number = {3}, + pages = {1890--1897}, + title = {{Optimal repetition time reduction for single subject event-related functional magnetic resonance imaging}}, + volume = {81}, + year = {2019} +} @article{McHugh2007, author = {McHugh, Tara L. and Saykin, Andrew J. and Wishart, Heather A. and Flashman, Laura A. and Cleavinger, Howard B. and Rabin, Laura A. and Mamourian, Alexander C. and Shen, Li}, doi = {10.1080/13854040601064534}, @@ -2794,7 +2886,7 @@ @article{Meunier2009 author = {Meunier, David and Lambiotte, Renaud and Fornito, Alex and Ersche, Karen D. and Bullmore, Edward T.}, doi = {10.3389/neuro.11.037.2009}, eprint = {1004.3153}, - journal = {Frontiers in Neuroinformatics}, + journal = fninf, number = {OCT}, title = {{Hierarchical modularity in human brain functional networks}}, volume = {3}, @@ -2990,7 +3082,6 @@ @article{Noble2019 doi = {10.1016/j.neuroimage.2019.116157}, journal = ni, title = {{A decade of test-retest reliability of functional connectivity: A systematic review and meta-analysis}}, - url = {https://pubmed.ncbi.nlm.nih.gov/31494250/}, volume = {203}, year = {2019} } @@ -3014,11 +3105,9 @@ @article{Nour2022 year = {2022} } @article{Novelli2022, - archivePrefix = {arXiv}, author = {Novelli, Leonardo and Razi, Adeel}, doi = {10.1038/s41467-022-29775-7}, - eprint = {2106.10631}, - journal = nc, + journal = natc, number = {1}, title = {{A mathematical perspective on edge-centric brain functional connectivity}}, volume = {13}, @@ -3049,7 +3138,7 @@ @article{Oberauer2019 @article{Ormel2019, author = {Ormel, Johan and Hartman, Catharina A. 
and Snieder, Harold}, doi = {10.1038/s41398-019-0450-5}, - journal = {Translational Psychiatry}, + journal = tp, number = {1}, pages = {1--10}, title = {{The genetics of depression: Successful genome-wide association studies introduce new challenges}}, @@ -3207,7 +3296,6 @@ @article{Pizzagalli2014 journal = {Annual review of clinical psychology}, pages = {393}, title = {{Depression, stress, and anhedonia: Toward a synthesis and integrated model}}, - url = {/pmc/articles/PMC3972338/ /pmc/articles/PMC3972338/?report=abstract https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3972338/}, volume = {10}, year = {2014} } @@ -3311,7 +3399,6 @@ @article{Pruim2015a journal = ni, pages = {267--277}, title = {{ICA-AROMA: A robust ICA-based strategy for removing motion artifacts from fMRI data}}, - url = {https://pubmed.ncbi.nlm.nih.gov/25770991/}, volume = {112}, year = {2015} } @@ -3441,10 +3528,9 @@ @article{Rock2014 @article{Rolls2020, author = {Rolls, Edmund T. and Cheng, Wei and Feng, Jianfeng}, doi = {10.1093/braincomms/FCAA196}, - journal = {Brain Communications}, + journal = braincomms, number = {2}, title = {{The orbitofrontal cortex: Reward, emotion and depression}}, - url = {https://academic.oup.com/braincomms/article/2/2/fcaa196/5976759}, volume = {2}, year = {2020} } @@ -3479,12 +3565,11 @@ @article{Rutledge2019 } @article{Ryyppo2018, author = {Ryyppo, Elisa}, - doi = {10.1162/NETN}, - journal = {Network Neuroscience}, + doi = {10.1162/netn_a_00083}, + journal = netn, number = {3}, pages = {222--241}, title = {{Region of Interest as nodes of dynamic functional brain networks}}, - url = {http://dx.doi.org/10.1162/netn_a_00083}, volume = {1}, year = {2018} } @@ -3575,13 +3660,6 @@ @article{Salimi2014 volume = {90}, year = {2014} } -@misc{Sapolsky2009, - title = {On depression}, - organization = {Youtube}, - author = {Sapolsky, Robert}, - howpublished = {\url{https://www.youtube.com/watch?v=NOAgplgTxfc}}, - year = {2009} -} @article{Schmaal2016, author = {Schmaal, Lianne and Veltman, D. J. and {Van Erp}, T. G. M. and Smann, P. G. and Frodl, T. and Jahanshad, N. and Loehrer, E. and Tiemeier, H. and Hofman, A. and Niessen, W. J. and Vernooij, M. W. and Ikram, M. A. and Wittfeld, K. and Grabe, H. J. and Block, A. and Hegenscheid, K. and V{\"{o}}lzke, H. and Hoehn, D. and Czisch, M. and Lagopoulos, J. and Hatton, S. N. and Hickie, I. B. and Goya-Maldonado, R. and Krmer, B. and Gruber, O. and Couvy-Duchesne, B. and Rentera, M. E. and Strike, L. T. and Mills, N. T. and {De Zubicaray}, G. I. and McMahon, K. L. and Medland, S. E. and Martin, N. G. and Gillespie, N. A. and Wright, M. J. and Hall, G. B. and MacQueen, G. M. and Frey, E. M. and Carballedo, A. and {Van Velzen}, L. S. and {Van Tol}, M. J. and {Van der Wee}, N. J. and Veer, I. M. and Walter, H. and Schnell, K. and Schramm, E. and Normann, C. and Schoepf, D. and Konrad, C. and Zurowski, B. and Nickson, T. and McIntosh, A. M. and Papmeyer, M. and Whalley, H. C. and Sussmann, J. E. and Godlewska, B. R. and Cowen, P. J. and Fischer, F. H. and Rose, M. and Penninx, B. W. J. H. and Thompson, P. M. and Hibar, D. P.}, doi = {10.1038/mp.2015.69}, @@ -3595,7 +3673,7 @@ @article{Schmaal2016 @article{Schmaal2020, author = {Schmaal, Lianne and Pozzi, Elena and {C. Ho}, Tiffany and van Velzen, Laura S. and Veer, Ilya M. and Opel, Nils and {Van Someren}, Eus J.W. and Han, Laura K.M. and Aftanas, Lybomir and Aleman, Andr{\'{e}} and Baune, Bernhard T. and Berger, Klaus and Blanken, Tessa F. and Capit{\~{a}}o, Liliana and Couvy-Duchesne, Baptiste and {R. 
Cullen}, Kathryn and Dannlowski, Udo and Davey, Christopher and Erwin-Grabner, Tracy and Evans, Jennifer and Frodl, Thomas and Fu, Cynthia H.Y. and Godlewska, Beata and Gotlib, Ian H. and Goya-Maldonado, Roberto and Grabe, Hans J. and Groenewold, Nynke A. and Grotegerd, Dominik and Gruber, Oliver and Gutman, Boris A. and Hall, Geoffrey B. and Harrison, Ben J. and Hatton, Sean N. and Hermesdorf, Marco and Hickie, Ian B. and Hilland, Eva and Irungu, Benson and Jonassen, Rune and Kelly, Sinead and Kircher, Tilo and Klimes-Dougan, Bonnie and Krug, Axel and Landr{\o}, Nils Inge and Lagopoulos, Jim and Leerssen, Jeanne and Li, Meng and Linden, David E.J. and MacMaster, Frank P. and {M. McIntosh}, Andrew and Mehler, David M.A. and Nenadi{\'{c}}, Igor and Penninx, Brenda W.J.H. and Portella, Maria J. and Reneman, Liesbeth and Renter{\'{i}}a, Miguel E. and Sacchet, Matthew D. and {G. S{\"{a}}mann}, Philipp and Schrantee, Anouk and Sim, Kang and Soares, Jair C. and Stein, Dan J. and Tozzi, Leonardo and {van Der Wee}, Nic J.A. and van Tol, Marie Jos{\'{e}} and Vermeiren, Robert and Vives-Gilabert, Yolanda and Walter, Henrik and Walter, Martin and Whalley, Heather C. and Wittfeld, Katharina and Whittle, Sarah and Wright, Margaret J. and Yang, Tony T. and Zarate, Carlos and Thomopoulos, Sophia I. and Jahanshad, Neda and Thompson, Paul M. and Veltman, Dick J.}, doi = {10.1038/s41398-020-0842-6}, - journal = {Translational Psychiatry}, + journal = tp, number = {1}, title = {{ENIGMA MDD: Seven years of global neuroimaging studies of major depression through worldwide data sharing}}, volume = {10}, @@ -3616,7 +3694,7 @@ @article{SciPy2020 Ribeiro, Ant{\^o}nio H. and Pedregosa, Fabian and {van Mulbregt}, Paul and {SciPy 1.0 Contributors}}, doi = {10.1038/s41592-019-0686-2}, - journal = {Nature Methods}, + journal = nmet, pages = {261--272}, title = {{{SciPy} 1.0: Fundamental algorithms for scientific computing in Python}}, volume = {17}, @@ -3743,7 +3821,6 @@ @article{Shirer2012 number = {1}, pages = {158--165}, title = {{Decoding subject-driven cognitive states with whole-brain connectivity patterns}}, - url = {https://academic.oup.com/cercor/article/22/1/158/366732}, volume = {22}, year = {2012} } @@ -3766,7 +3843,6 @@ @article{Shou2013 number = {4}, pages = {714--724}, title = {{Quantifying the reliability of image replication studies: The image intraclass correlation coefficient (I2C2)}}, - url = {https://pubmed.ncbi.nlm.nih.gov/24022791/}, volume = {13}, year = {2013} } @@ -3793,7 +3869,7 @@ @incollection{Silvennoinen2009 @article{Singleton2022, author = {Singleton, S. Parker and Luppi, Andrea I. and Carhart-Harris, Robin L. and Cruzat, Josephine and Roseman, Leor and Nutt, David J. and Deco, Gustavo and Kringelbach, Morten L. and Stamatakis, Emmanuel A. and Kuceyeski, Amy}, doi = {10.1038/s41467-022-33578-1}, - journal = {Nature Communications}, + journal = natc, number = {5812}, pages = {1--13}, title = {{Receptor-informed network control theory links LSD and psilocybin to a flattening of the brain's control energy landscape}}, @@ -3831,7 +3907,7 @@ @article{Smith2012b year = {2012} } @article{Smith2013a, - author = {Smith, Stephen M. and Beckmann, Christian F. and Andersson, Jesper and Auerbach, Edward J. and Bijsterbosch, Janine and Douaud, Gwena{\"{e}}lle and Duff, Eugene and Feinberg, David A. and Griffanti, Ludovica and Harms, Michael P. and Kelly, Michael and Laumann, Timothy and Miller, Karla L. 
and Moeller, Steen and Petersen, Steve and Power, Jonathan and Salimi-Khorshidi, Gholamreza and Snyder, Abraham Z. and Vu, An T. and Woolrich, Mark W. and Xu, Junqian and Yacoub, Essa and Uǧurbil, Kamil and {Van Essen}, David C. and Glasser, Matthew F.}, + author = {Smith, Stephen M. and Beckmann, Christian F. and Andersson, Jesper and Auerbach, Edward J. and Bijsterbosch, Janine and Douaud, Gwena{\"{e}}lle and Duff, Eugene and Feinberg, David A. and Griffanti, Ludovica and Harms, Michael P. and Kelly, Michael and Laumann, Timothy O. and Miller, Karla L. and Moeller, Steen and Petersen, Steve and Power, Jonathan and Salimi-Khorshidi, Gholamreza and Snyder, Abraham Z. and Vu, An T. and Woolrich, Mark W. and Xu, Junqian and Yacoub, Essa and Uǧurbil, Kamil and {Van Essen}, David C. and Glasser, Matthew F.}, doi = {10.1016/j.neuroimage.2013.05.039}, journal = ni, pages = {144--168}, @@ -3855,7 +3931,6 @@ @article{Smith2013c journal = pone, number = {11}, title = {{Prevalence and characteristics of probable major depression and bipolar disorder within UK Biobank: Cross-sectional study of 172,751 participants}}, - url = {https://pubmed.ncbi.nlm.nih.gov/24282498/}, volume = {8}, year = {2013} } @@ -3885,7 +3960,6 @@ @article{Sporns2005 number = {4}, pages = {e42}, title = {{The human connectome: A structural description of the human brain}}, - url = {https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.0010042}, volume = {1}, year = {2005} } @@ -4016,7 +4090,6 @@ @article{Tang2013 number = {9}, pages = {1921--1927}, title = {{Decreased functional connectivity between the amygdala and the left ventral prefrontal cortex in treatment-naive patients with major depressive disorder: A resting-state functional magnetic resonance imaging study}}, - url = {https://pubmed.ncbi.nlm.nih.gov/23194671/}, volume = {43}, year = {2013} } @@ -4026,7 +4099,6 @@ @article{Tang2018 journal = {EBioMedicine}, pages = {436}, title = {{Abnormal amygdala resting-state functional connectivity in adults and adolescents with major depressive disorder: A comparative meta-analysis}}, - url = {/pmc/articles/PMC6197798/ /pmc/articles/PMC6197798/?report=abstract https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6197798/}, volume = {36}, year = {2018} } @@ -4241,6 +4313,26 @@ @article{Varela2001 volume = {2}, year = {2001} } +@article{Varley2020, + author = {Varley, Thomas F. and Craig, Michael M. and Adapa, Ram and Finoia, Paola and Williams, Guy and Allanson, Judith and Pickard, John and Menon, David K. and Stamatakis, Emmanuel A.}, + doi = {10.1371/journal.pone.0223812}, + journal = pone, + number = {2}, + pages = {e0223812}, + title = {Fractal dimension of cortical functional connectivity networks \& severity of disorders of consciousness}, + volume = {15}, + year = {2020} +} +@article{Varley2020b, + author = {Varley, Thomas F. and Luppi, Andrea I. and Pappas, Ioannis and Naci, Lorina and Adapa, Ram and Owen, Adrian M. and Menon, David K. 
and Stamatakis, Emmanuel A.}, + doi = {10.1038/s41598-020-57695-3}, + journal = {Scientific Reports}, + number = {1}, + pages = {1--13}, + title = {Consciousness \& brain functional complexity in propofol anaesthesia}, + volume = {10}, + year = {2020} +} @article{Varoquaux2018, author = {Varoquaux, Ga{\"{e}}l}, doi = {10.1016/j.neuroimage.2017.06.061}, @@ -4320,7 +4412,7 @@ @article{Vohryzek2022 @article{Voytek2022, author = {Voytek, Bradley}, doi = {10.1038/s41592-022-01630-z}, - journal = {Nature Methods}, + journal = nmet, title = {{The data science future of neuroscience theory}}, year = {2022} } @@ -4391,6 +4483,13 @@ @article{Whooley2013 volume = {9}, year = {2013} } +@phdthesis{Wilk2019, + author = {{van der Wilk}, Mark}, + doi = {10.17863/cam.35660}, + institution = {University of Cambridge}, + title = {Sparse Gaussian process approximations and applications}, + year = {2019} +} @article{Wilk2020, author = {{van der Wilk}, Mark and Dutordoir, Vincent and John, ST and Artemev, Artem and Adam, Vincent and Hensman, James}, journal = {arXiv preprint arXiv:2003.01115}, @@ -4408,7 +4507,7 @@ @article{Wilkinson2021 @article{Willinger2022, author = {Willinger, David and Karipidis, Iliana I. and H{\"{a}}berling, Isabelle and Berger, Gregor and Walitza, Susanne and Brem, Silvia}, doi = {10.1038/s41398-022-01955-5}, - journal = {Translational Psychiatry}, + journal = tp, number = {1}, pages = {1--10}, title = {{Deficient prefrontal-amygdalar connectivity underlies inefficient face processing in adolescent major depressive disorder}}, volume = {10}, year = {2022} } @@ -4464,7 +4563,6 @@ @article{Wise2017 number = {4}, pages = {e1105--e1105}, title = {{Instability of default mode network connectivity in major depression: A two-sample confirmation study}}, - url = {https://www.nature.com/articles/tp201740}, volume = {7}, year = {2017} } @@ -4564,14 +4662,23 @@ @article{Yarkoni2017 @article{Yeo2011, author = {Yeo, B. T. Thomas and Krienen, Fenna M. and Sepulcre, Jorge and Sabuncu, Mert R. and Lashkari, Danial and Hollinshead, Marisa and Roffman, Joshua L. and Smoller, Jordan W. and Z{\"{o}}llei, Lilla and Polimeni, Jonathan R. and Fischl, Bruce and Liu, Hesheng and Buckner, Randy L.}, doi = {10.1152/jn.00338.2011}, - journal = {Journal of Neurophysiology}, + journal = jnph, number = {3}, pages = {1125}, title = {{The organization of the human cerebral cortex estimated by intrinsic functional connectivity}}, - url = {/pmc/articles/PMC3174820/ /pmc/articles/PMC3174820/?report=abstract https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3174820/}, volume = {106}, year = {2011} } +@article{Yoo2018, + author = {Yoo, Peter E. and John, Sam E. and Farquharson, Shawna and Cleary, Jon O. and Wong, Yan T. and Ng, Amanda and Mulcahy, Claire B. and Grayden, David B. and Ordidge, Roger J. and Opie, Nicholas L. and O'Brien, Terence J. and Oxley, Thomas J. and Moffat, Bradford A.}, + doi = {10.1016/j.neuroimage.2017.03.002}, + journal = ni, + number = {March}, + pages = {214--229}, + title = {{7T-fMRI: Faster temporal resolution yields optimal BOLD sensitivity for functional network imaging specifically at high spatial resolution}}, + volume = {164}, + year = {2018} +} @article{Yu2015, author = {Yu, Qingbao and Erhardt, Erik B. and Sui, Jing and Du, Yuhui and He, Hao and Hjelm, Devon and Cetin, Mustafa S. and Rachakonda, Srinivas and Miller, Robyn L.
and Pearlson, Godfrey and Calhoun, Vince D.}, doi = {10.1016/j.neuroimage.2014.12.020}, diff --git a/tab/ukb_cidi_sf.tex b/tab/ukb_cidi_sf.tex index b5dd7ce..529efc4 100644 --- a/tab/ukb_cidi_sf.tex +++ b/tab/ukb_cidi_sf.tex @@ -21,8 +21,8 @@ \end{tabular} \caption{ UK Biobank core and non-core CIDI-SF depression-relevant Data-Fields and help-seeking Data-Fields from online follow-up mental health questionnaire. - } - \label{tab:CIDI-SF-Data-Fields} + These are used in stratifying participants into cohorts. + }\label{tab:CIDI-SF-Data-Fields} \end{center} \end{table*} @@ -55,7 +55,6 @@ % \end{tabular} % \caption{ % UK Biobank other depression-relevant Data-Fields. -% } -% \label{tab:Other-Data-Fields} +% }\label{tab:Other-Data-Fields} % \end{center} % \end{table*} diff --git a/tab/ukb_cohorts.tex b/tab/ukb_cohorts.tex index 1c8a3fc..7b4c64f 100644 --- a/tab/ukb_cohorts.tex +++ b/tab/ukb_cohorts.tex @@ -9,18 +9,18 @@ \midrule - Diagnosed & \gls{mdd} & 620 & 56.9 $\pm$ 4.9 & 31.3 & 17.0 $\pm$ 16.5 & 28.8 $\pm$ 6.0 & -1.1 $\pm$ 3.1 & 7.1 $\pm$ 3.2 \\ + Diagnosed & \gls{mdd} & 620 & 56.9 $\pm$ 4.9 & 31.3 & 17.0 $\pm$ 16.5 & 28.8 $\pm$ 6.0 & -1.1 $\pm$ 3.1 & 7.1 $\pm$ 3.2 *** \\ lifetime occurrence & \gls{hc} & 620 & 57.4 $\pm$ 4.6 & 31.3 & 12.3 $\pm$ 13.3 & 26.1 $\pm$ 4.4 & -2.0 $\pm$ 2.8 & 2.6 $\pm$ 2.5 \\ \midrule - Self-reported & Depressed & 808 & 56.9 $\pm$ 4.5 & 23.6 & 15.6 $\pm$ 16.0 & 27.8 $\pm$ 5.4 & -1.3 $\pm$ 3.0 & 6.8 $\pm$ 3.3 \\ + Self-reported & Depressed & 808 & 56.9 $\pm$ 4.5 & 23.6 & 15.6 $\pm$ 16.0 & 27.8 $\pm$ 5.4 & -1.3 $\pm$ 3.0 & 6.8 $\pm$ 3.3 *** \\ lifetime occurrence & \gls{hc} & 808 & 57.5 $\pm$ 4.5 & 23.6 & 12.2 $\pm$ 13.4 & 26.0 $\pm$ 4.4 & -2.0 $\pm$ 2.7 & 2.7 $\pm$ 2.5 \\ \midrule - Self-reported & Depressed & 1,411 & 57.1 $\pm$ 4.6 & 33.6 & 15.5 $\pm$ 15.8 & 27.6 $\pm$ 5.1 & -1.6 $\pm$ 2.9 & 6.5 $\pm$ 3.2 \\ - depressed state & \gls{hc} & 1,411 & 57.5 $\pm$ 4.6 & 33.6 & 12.0 $\pm$ 13.3 & 26.0 $\pm$ 4.4 & -2.0 $\pm$ 2.7 & 2.7 $\pm$ 2.6 \\ + Self-reported & Depressed & 1,411 & 57.1 $\pm$ 4.6 & 33.6 & 15.5 $\pm$ 15.8 & 27.6 $\pm$ 5.1 & -1.6 $\pm$ 2.9 & 6.5 $\pm$ 3.2 *** \\ + depressed state & \gls{hc} & 1,411 & 57.5 $\pm$ 4.6 & 33.6 & 12.0 $\pm$ 13.3 & 26.0 $\pm$ 4.4 & -2.0 $\pm$ 2.7 & 2.7 $\pm$ 2.6 \\ \midrule @@ -35,7 +35,7 @@ Means and standard deviations (where applicable) are shown. BMI, body mass index; SES, socioeconomic status (indicated by neighborhood-level Townsend Deprivation index~\parencite{Townsend1987}, where negative scores reflect less deprivation, and gives a general idea of material deprivation). Education scores are only based on participants in England (Data-Field 26414), and higher scores indicate more deprivation. - Neuroticism scores were derived at time of initial assessment. -} -\label{tab:ukbiobank-cohorts} + Neuroticism scores were derived at the initial assessment. + *: $p \leq .05$, **: $p \leq .01$, ***: $p \leq .001$. +}\label{tab:ukbiobank-cohorts} \end{table*}
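Note on the journal macros introduced throughout the bibliography hunks above (natc, nmet, nm, tp, jad, netn, and so on): these only resolve if matching BibTeX @string abbreviations are defined somewhere in the .bib sources, which this diff does not show. A minimal sketch of what such definitions are assumed to look like, with the expansions inferred from the journal names they replace in the hunks above:

@string{natc = {Nature Communications}}
@string{nmet = {Nature Methods}}
@string{nm   = {Nature Medicine}}
@string{tp   = {Translational Psychiatry}}
@string{jad  = {Journal of Affective Disorders}}
@string{netn = {Network Neuroscience}}

With such definitions in place, a field like journal = nmet expands to Nature Methods at build time, so every entry pointing at the same journal stays consistent and a rename requires only a single edit.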