The repository is part of the so-called, Mega-Meta study on reviewing factors contributing to substance use, anxiety, and depressive disorders. The study protocol has been pre-registered at Prospero. The procedure for obtaining the search terms, the exact search query, and selecting key papers by expert consensus can be found on the Open Science Framework.
The screening was conducted in the software ASReview (Van de Schoot et al., 2020 using the protocol as described in Hofstee et al. (2021). The server installation is described in Melnikov (2021), and the post-processing is described by van de Brand et al., 2021. The data can be found on DANS [LINK NEEDED].
This repository stores the scripts and plugins that were used for the creation of the three final project files. These final project files were created using a classifier based on a convolutional neural network, optimized using Optuna.
The Plugins
folder contains the 2 used ASReview plugins.
asreview-cnn-hpo
was
used to find the optimal settings for each dataset, and
asreview-model-cnn-17-layer
was used to implement these settings.
The scripts
folder contains a Jupyter Notebook that can be run in Google
Colab. This script installs the plugins
found in the plugins
folder and then creates an ASReview lab instance that can
be used to employ these plugins, find the optimal hyperparameters, and create
the final project files.
- Create a project file from one of the input files.
- Select the HPO-CNN optimizer as classifier for the created project file.
- Train the project file to recieve the optimal CNN hyper parameters. The optimal parameters appear in the console used to start ASReview.
- Plug these parameters into the CNN model (not the HPO-CNN model).
- Train the CNN model.
- Start screening records with the optimized CNN model.
This guide details how the CNN was optimized and trained.
-
The process started with 3 different excel files containing ASReview output.
-
In Google Colab, upload and run the
cnn_training_script_for_use_in_google_colab.ipynb
file in found in the folder calledscripts
. This file is used to optimize the CNNs for the later processing.-
A custom-made version of the HPO-CNN was used to determine the optimal hyperparameters for each of the project files. These hyperparameters were then applied to the CNN classifier model.
-
To use this custom-made version of the HPO-CNN, upload the folder containing the HPO-CNN version to Colab. A quick way of doing this is by zipping-up the folder and uploading this zip file.
-
The notebook contains code for unzipping, and then installing the plugin automatically.
-
-
The notebook uses NGROK to access the ASReview frontend. A personal NGROK token is needed for the
NGROK_AUTH_TOKEN
variable. A link with instructions on where to get such a token can be found in the notebook.
!pip install pyngrok --quiet from pyngrok import ngrok # Terminate open tunnels if exist ngrok.kill() # Setting the authtoken (optional) # Get your authtoken from https://dashboard.ngrok.com/auth NGROK_AUTH_TOKEN = "fill token here" ngrok.set_auth_token(NGROK_AUTH_TOKEN)
- After running the notebook, there will be a link produced by NGROK. This link will open the ASReview frontend.
ngrok.connect(port="80", proto="http") <NgrokTunnel: "http://d3c5-35-197-26-146.ngrok.io" -> "http://localhost:80">
-
In this front end, create a new project file with one of the Excel files.
-
When the training of the new project file is finished (after a long time), the still processing colab cell running ASReview will print the optimal parameters in the output.
Hpo trail: 77/80 Hpo trail: 77/80 Hpo trail: 77/80 Hpo trail: 77/80 FOUND HYPERPARAMETERS: {'nlayers': 3, 'nfilters': 209}
-
-
These parameters are used to optimize the CNN used for the final project files.
- For the training of these project files, the parameters that resulted as
an output of the HPO are used for the CNN. The output is set by filling in
the results into the
nlayers
andnfilters
variables in theasreview-plugin-model-cnn-17-layer\asreviewcontrib\models\cnn.py
file.
def _create_dense_nn_model(_size): def model_wrapper(): backend.clear_session() nfilters = 209 model = Sequential()
- For the training of these project files, the parameters that resulted as
an output of the HPO are used for the CNN. The output is set by filling in
the results into the
-
Install this newly created CNN. Using this new optimized CNN, create the projects files for screening.
This project is funded by a grant from the Centre for Urban Mental Health, University of Amsterdam, The Netherlands
The content in this repository is published under the MIT license.
For any questions or remarks, please send an email to the ASReview-team.