parkinson article
awxuelong committed Nov 2, 2023
1 parent 7334a04 commit 5dfa375
Showing 7 changed files with 84 additions and 9 deletions.
2 changes: 1 addition & 1 deletion _posts/2023-01-12-clevr-hans-orig-videos.md
@@ -1,5 +1,5 @@
---
-layout: post
+layout: distill
title: "Are you Clever Hans or are you actually clever, Hans?" #a post with videos
date: 2023-01-12 21:42:00
description: my (mostly satirical) comments on ChatGPT and Gato, large, transformer-based models that shook the AI community in 2022 #this is what included videos could look like
2 changes: 1 addition & 1 deletion _posts/2023-09-01-nesy-summer-school.md
@@ -134,7 +134,7 @@ I interpreted this question as whether neural networks can perform propositional
<figure>
<img src="/assets/img/logicgate.png" alt="Sorry. Image couldn't load." width="100%" height="auto">
<figcaption id="sgd"> Illustration of the problem setting of simulating OR, AND and XOR gates. The notation A, B corresponds to the input to the neural network model, the truth table is the supplied dataset, and the challenge is for the model to learn parameters that, given those inputs, replicate the output of the logic gate. Single-layer networks could learn the AND and OR gates, but XOR could only be simulated by networks with at least one hidden layer. Illustration 'borrowed' from
-https://adrian-pardo.github.io/perceptron/ <d-cite key="ergin_2014_chapter"></d-cite> </figcaption>
+https://adrian-pardo.github.io/perceptron/</figcaption>
</figure>

Nonetheless, whether this entails that neural networks can emulate inference in propositional logic is debatable. For example, see the experiments and analysis in the paper by [Honghua et al., "On the Paradox of Learning to Reason"](https://arxiv.org/pdf/2205.11502.pdf). TL;DR: neural networks, no matter how deep they are, can't generalize to out-of-distribution test data in a simple propositional logic reasoning task.
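
To make the logic-gate point from the figure concrete, here is a minimal sketch (my own illustrative example, not code from the summer school) contrasting a single linear layer with a one-hidden-layer MLP on the XOR truth table:

```python
# A single linear layer cannot fit XOR (not linearly separable),
# while a network with one hidden layer can.
import torch
import torch.nn as nn

X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [1.], [1.], [0.]])  # XOR truth table

def fit(model, steps=5000):
    opt = torch.optim.Adam(model.parameters(), lr=0.05)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        opt.step()
    return loss.item()

single_layer = nn.Linear(2, 1)                                    # no hidden layer
mlp = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 1))  # one hidden layer

print("single layer final loss:", fit(single_layer))  # stays high (~0.69)
print("one hidden layer final loss:", fit(mlp))        # approaches zero
```
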
47 changes: 47 additions & 0 deletions _posts/2023-11-2-mechanistic-subtypes-parkinson.md
@@ -0,0 +1,47 @@
---
layout: distill
title: "Review of the paper Prediction of mechanistic subtypes of Parkinson’s using patient-derived stem cell models"
#a post with bibliography
date: 2023-11-02 14:42
description: Thoughts and comments on the paper "Prediction of mechanistic subtypes of Parkinson’s using patient-derived stem cell models" #an example of a blog post with bibliography
tags: research
categories: blog-post
disqus_comments: true
thumbnail: /assets/img/parkinson.png
authors:
- name: An Xuelong
toc:
  - name: Brief summary
  - name: My comments and future research directions
bibliography: deep-med.bib
---

# Brief summary

The paper by <d-cite key="dsa_2023_prediction"></d-cite> leverages deep learning architectures to solve a five-class classification task given either a cell's tabular features or its images. The five classes are one healthy control and four subtypes of Parkinson's Disease: familial proteinopathy (SNCA), environmental proteinopathy ($$\alpha$$-Syn oligomer), and two subtypes characterized by different mitochondrial dysfunction pathways. These pathologies were chemically induced in patient-derived stem cell models. Fifty-six phenotypic features of the cells were extracted automatically and recorded as tabular data, alongside microscopy images of the cells. Both data modalities were labeled with one of the five classes.

The research team separately trained a dense feedforward neural network (DNN) to classify the tabular data and a convolutional neural network (CNN) to classify the image data. The DNN reached a test classification accuracy of around 83%, while the CNN reached around 95%.

<figure>
<img src="/assets/img/parkinson.png" alt="Sorry. Image couldn't load." width="100%" height="auto">
<figcaption id="parkinson">Two separate models are trained on different data modalities for the same task of Parkinson's subtype classification. Figure extracted from the original research article at https://www.nature.com/articles/s42256-023-00702-9</figcaption>
</figure>
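
To make this setup concrete, here is a minimal sketch of what two separately trained classifiers could look like in PyTorch. The 56-feature input and the 5 output classes follow the description above; the layer widths, depths and class names are my own placeholder assumptions, not the authors' exact architectures.

```python
# A minimal sketch (not the authors' architectures) of the two separately trained
# models: a dense network for the 56 tabular features and a small CNN for
# single-channel microscopy images, both producing logits over the 5 classes.
import torch
import torch.nn as nn

NUM_CLASSES = 5           # healthy control + 4 Parkinson's subtypes
NUM_TABULAR_FEATURES = 56

class TabularDNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NUM_TABULAR_FEATURES, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, NUM_CLASSES),
        )

    def forward(self, x):
        return self.net(x)

class CellCNN(nn.Module):
    def __init__(self, in_channels=1):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, NUM_CLASSES)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

# Each model is trained independently on its own dataset with a standard
# cross-entropy loss, e.g. nn.CrossEntropyLoss()(model(batch), labels).
```
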

# My comments and future research directions

Generally, in the deep learning literature, it is acknowledged that the use of DNNs comes at the expense of poor explainability. Despite achieving high classification accuracy, these models are black boxes. Nonetheless, there are ways to identify which features a neural network pays the most attention to when deciding on a classification label, mainly by looking at its last layer's activations and tracing back which input features they are associated with. For the CNN, the technique employed by the research team is SHAP (SHapley Additive exPlanations).
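
SHAP can be applied to either model; below is a minimal sketch on the tabular side, since attributions over the 56 named features are the easiest to read. The model and data here are hypothetical stand-ins (the `TabularDNN` from the earlier sketch and random placeholder data), not the authors' analysis pipeline.

```python
# A minimal sketch of per-feature attributions with the shap library.
# Assumptions (not from the paper): `TabularDNN` stands in for the trained
# tabular classifier, and `X` for the [n_cells, 56] feature matrix.
import numpy as np
import shap
import torch

X = np.random.rand(200, 56).astype(np.float32)  # placeholder for the real tabular data
dnn = TabularDNN()                               # in practice, the trained model

background = torch.tensor(X[:100])               # reference samples for the explainer
explainer = shap.DeepExplainer(dnn, background)
shap_values = explainer.shap_values(torch.tensor(X[100:132]))  # attributions per class

# Summarise which of the 56 features contribute most to the model's decisions.
shap.summary_plot(shap_values, X[100:132])
```
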

The authors identified that, for both the DNN and the CNN, the mitochondria, the lysosomes, and the interaction between the two were the features contributing most to the models' classification decisions.

One future research direction I am interested in is exploring whether integrating both data sources can improve performance and yield explainability, since the original work trains separate models on different datasets.

One source of inspiration comes from <d-cite key="li_2023_v1t"></d-cite>, where image data is integrated with a mouse's behavioral features to predict its neural responses collected from neural recordings. Another source of inspiration is drawn from concept-bottleneck models <d-cite key="koh_2020_concept"></d-cite>. There, a CNN in charge of processing images doesn't learn to output a classification label directly, but instead to output features that are relevant to the image. These features, in turn, are annotations of the image stored in tabular form:

<figure>
<img src="/assets/img/bottleneck.png" alt="Sorry. Image couldn't load." width="100%" height="auto">
<figcaption id="bottleneck">A depiction of the pipeline of a concept-bottleneck model. The first half outputs a set of concepts given an image, which can be learnt from annotations, or metadata, of the image. The second half outputs a classification label. Figure extracted from the original paper <d-cite key="koh_2020_concept"></d-cite>.</figcaption>
</figure>

With regards to the work by <d-cite key="dsa_2023_prediction"></d-cite>, one interesting extension to their CNN is to have it not predict a Parkinson's subtype directly, but rather learn to predict the cell's physiological features stored as tabular data, given the image as input. Subsequently, those predicted features can be used to train a multi-class logistic regressor with a standard softmax output to produce the classification label. The prospect is that this hybrid model can leverage the high prediction accuracy of the CNN whilst remaining explainable thanks to the logistic regressor.
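
A rough sketch of this two-stage idea is below; the shapes follow the paper's data (56 features, 5 classes), but the backbone, layer sizes and training details are my own placeholder assumptions rather than a worked-out implementation.

```python
# Sketch of the proposed concept-bottleneck-style extension (my assumptions,
# not the authors' implementation): a CNN regresses the 56 tabular features
# from an image, and a multinomial logistic (softmax) classifier maps those
# predicted features to the 5 classes.
import torch
import torch.nn as nn

class ImageToFeatures(nn.Module):
    """Stage 1: predict the 56 phenotypic features from a microscopy image."""
    def __init__(self, in_channels=1, num_features=56):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(in_channels, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, num_features)

    def forward(self, x):
        return self.head(self.backbone(x).flatten(1))

# Stage 2: an interpretable linear classifier on top of the predicted features;
# its weights can be read off per feature, unlike the CNN's internal filters.
feature_predictor = ImageToFeatures()
concept_classifier = nn.Linear(56, 5)

def hybrid_forward(images):
    concepts = feature_predictor(images)   # trained with a regression loss (e.g. MSE) against the tabular data
    logits = concept_classifier(concepts)  # trained with cross-entropy on the subtype labels
    return concepts, logits
```
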

As a further improvement, we could use a [Slot Transformer](https://arxiv.org/abs/2210.11394) instead of the CNN, with the hope of learning a disentangled representation of the image and its annotations. However, this architecture would be more computationally expensive. A Slot Transformer pretrained to disentangle CLEVR scenes may also be more effective than one trained from scratch.
7 changes: 0 additions & 7 deletions _posts/mechanistic-subtypes-parkinson.md

This file was deleted.

35 changes: 35 additions & 0 deletions assets/bibliography/deep-med.bib
@@ -303,4 +303,39 @@ @article{takagi_2022_highresolution
url = {https://sites.google.com/view/stablediffusion-with-brain/},
year = {2022},
journal = {bioRxiv}
}
@article{dsa_2023_prediction,
author = {D’Sa, Karishma and Evans, James R. and Virdi, Gurvir S. and Vecchi, Giulia and Adam, Alexander and Bertolli, Ottavia and Fleming, James and Chang, Hojong and Leighton, Craig and Horrocks, Mathew H. and Athauda, Dilan and Choi, Minee L. and Gandhi, Sonia},
month = {08},
pages = {933--946},
title = {Prediction of mechanistic subtypes of Parkinson’s using patient-derived stem cell models},
doi = {10.1038/s42256-023-00702-9},
url = {https://www.nature.com/articles/s42256-023-00702-9},
volume = {5},
year = {2023},
journal = {Nature Machine Intelligence}
}
@article{li_2023_v1t,
author = {Li, Bryan M. and Cornacchia, Isabel Maria and Rochefort, Nathalie and Onken, Arno},
month = {05},
title = {V1T: large-scale mouse V1 response prediction using a Vision Transformer},
url = {https://openreview.net/forum?id=qHZs2p4ZD4},
urldate = {2023-11-02},
year = {2023},
journal = {Transactions on Machine Learning Research}
}
@article{koh_2020_concept,
author = {Koh, Pang Wei and Nguyen, Thao and Tang, Yew Siang and Mussmann, Stephen and Pierson, Emma and Kim, Been and Liang, Percy},
month = {11},
pages = {5338--5348},
publisher = {PMLR},
title = {Concept Bottleneck Models},
url = {https://proceedings.mlr.press/v119/koh20a.html},
urldate = {2023-11-02},
volume = {119},
year = {2020},
journal = {Proceedings of Machine Learning Research}
}
Binary file added assets/img/bottleneck.png
Binary file added assets/img/parkinson.png
