From 29532e7b5c26138d89dc1d30b8774701732e206a Mon Sep 17 00:00:00 2001 From: DEGIACOMI Date: Fri, 6 Sep 2024 16:17:23 +0100 Subject: [PATCH] Updated Extra MD Analysis notebook - Formatted header - Added TOC and sections - added some explanatory text for each section - added reference to MDAnalysis manual --- .../5_Extra_p24_analysis.ipynb | 177 ++++++++---------- 1 file changed, 74 insertions(+), 103 deletions(-) diff --git a/5_Analysis_MDAnalysis/5_Extra_p24_analysis.ipynb b/5_Analysis_MDAnalysis/5_Extra_p24_analysis.ipynb index eee6c54..3b45701 100644 --- a/5_Analysis_MDAnalysis/5_Extra_p24_analysis.ipynb +++ b/5_Analysis_MDAnalysis/5_Extra_p24_analysis.ipynb @@ -16,12 +16,40 @@ "**Author**: Dr Matteo Degiacomi (matteo.t.degiacomi@durham.ac.uk)" ] }, + { + "cell_type": "markdown", + "id": "3d12646a-486f-43b8-aaee-76157acd66cf", + "metadata": {}, + "source": [ + "**Jupyter cheat sheet:**\n", + "- to run the currently highlighted cell, hold ⇧ Shift and press ⏎ Enter;\n", + "- to get help for a specific function, place the cursor within the function's brackets, hold ⇧ Shift, and press ⇥ Tab;\n", + "\n", + "
Remember: variables persist between cells \n", + " \n", + "Be aware that it is the order of execution of cells that is important in a Jupyter notebook, not the order in which they appear. Python will remember all the code that was run previously, including any variables you have defined, irrespective of the order in the notebook. Therefore if you define variables lower down the notebook and then (re)run cells further up, those defined further down will still be present.
" + ] + }, + { + "cell_type": "markdown", + "id": "8e969ae8-2e99-48d4-aa1d-c0c090c274e9", + "metadata": {}, + "source": [ + "## Table of Contents\n", + "\n", + "[1. Introduction](#intro) \n", + "[2. Root Mean Square Deviations (RMSDs)](#rmsd) \n", + "[3. Pairwise RMSD](#p_rmsd) \n", + "[4. Root Mean Square Fluctuation (RMSF)](#rmsf) \n", + "[5. Radius of gyration and end-to-end distance](#rgyr) " + ] + }, { "cell_type": "markdown", "id": "ea2fe146-eea8-46f7-8696-ff5fa5cb823d", "metadata": {}, "source": [ - "## Google Colab setup\n", + "## 0. Google Colab setup\n", "
\n", "Attention: Please only run the following cells if you are using Colab! These cells install necessary packages and download data.
" ] @@ -65,33 +93,13 @@ "os.chdir(f\"CCP5_Simulation_of_BioMolecules{os.sep}4_Analysis_MDAnalysis\")" ] }, - { - "cell_type": "markdown", - "id": "3d12646a-486f-43b8-aaee-76157acd66cf", - "metadata": {}, - "source": [ - "## Jupyter cheat sheet\n", - "\n", - "- to run the currently highlighted cell, hold ⇧ Shift and press ⏎ Enter;\n", - "- to get help for a specific function, place the cursor within the function's brackets, hold ⇧ Shift, and press ⇥ Tab;" - ] - }, - { - "cell_type": "markdown", - "id": "1d143168-1643-4caa-907e-15c87cdfb52d", - "metadata": {}, - "source": [ - "
REMEMBER: variables persist between cells \n", - " \n", - "Be aware that it is the order of execution of cells that is important in a Jupyter notebook, not the order in which they appear. Python will remember all the code that was run previously, including any variables you have defined, irrespective of the order in the notebook. Therefore if you define variables lower down the notebook and then (re)run cells further up, those defined further down will still be present.
" - ] - }, { "cell_type": "markdown", "id": "fe8671d2-93ec-4fb3-b733-a6d47f818a2b", "metadata": {}, "source": [ - "## Introduction" + "## 1. Introduction\n", + "" ] }, { @@ -141,7 +149,8 @@ "id": "30b29017-8118-4ea9-9476-b0471cfb10c0", "metadata": {}, "source": [ - "## Root Mean Square Deviations (RMSD)" + "## 2. Root Mean Square Deviations (RMSDs)\n", + "" ] }, { @@ -149,7 +158,7 @@ "id": "47d163d5-4f5c-4397-9610-e26a38e3bae5", "metadata": {}, "source": [ - "Let's demonstrate how the time evolution of RMSD with respect to the first frame changes with different alignments and atoms of interest." + "Let's demonstrate how the time evolution of [RMSD](https://docs.mdanalysis.org/1.1.1/documentation_pages/analysis/rms.html) with respect to the first frame changes with different alignments and atoms of interest." ] }, { @@ -172,6 +181,14 @@ "R_D2.run()" ] }, + { + "cell_type": "markdown", + "id": "2147a729-4b24-42e6-b289-4e51eda09c5f", + "metadata": {}, + "source": [ + "Now, let's plot everything!" + ] + }, { "cell_type": "code", "execution_count": null, @@ -229,7 +246,7 @@ "id": "eb99eb0f-fc8a-4a2a-b74b-f575b26bbc07", "metadata": {}, "source": [ - "For the first slide on RMSD, let's also plot only a single RMSD profile" + "For the first lecture slide featuring RMSD, let's also plot a single RMSD profile on its own." ] }, { @@ -257,7 +274,16 @@ "id": "408d030b-a5c6-446f-9840-bf67820e6443", "metadata": {}, "source": [ - "## pairwise RMSD" + "## 3. Pairwise RMSD\n", + "" + ] + }, + { + "cell_type": "markdown", + "id": "ee67f243-2f0c-495b-a310-ea141b7b6f3a", + "metadata": {}, + "source": [ + "Now, let's generate a [pairwise RMSD](https://userguide.mdanalysis.org/stable/examples/analysis/alignment_and_rms/pairwise_rmsd.html) plot, i.e., a surface plot reporting the RMSD between every pair of conformations in the trajectory." ] }, { @@ -292,7 +318,8 @@ "id": "9d30c163-f7a7-489a-b9fe-06608d9cd245", "metadata": {}, "source": [ - "## Root Mean Square Fluctuations (RMSF)" + "## 4. 
Root Mean Square Fluctuation (RMSF)\n", + "" ] }, { @@ -300,7 +327,7 @@ "id": "e4e4c72a-4255-4fd2-8ecf-502d212c697b", "metadata": {}, "source": [ - "We start by defining a function that aligns the trajectory and calculates the RMSF of a selection of interest." + "The Root Mean Square Fluctuation ([RMSF](https://userguide.mdanalysis.org/stable/examples/analysis/alignment_and_rms/rmsf.html)) measures how much an amino acid fluctuates around its mean position during the simulation. We start by defining a function that aligns the trajectory and calculates the RMSF of a selection of interest." ] }, { @@ -395,7 +422,16 @@ "id": "453510f6-2bd4-4856-a871-3c28bafd4f49", "metadata": {}, "source": [ - "## Radius of gyration and end-to-end distance" + "## 5. Radius of gyration and end-to-end distance\n", + "" + ] + }, + { + "cell_type": "markdown", + "id": "d9e0f916-9497-42e6-a782-cbd51e7e12ec", + "metadata": {}, + "source": [ + "To calculate the radius of gyration (Rg) and the end-to-end distance of a protein, we will create a few AtomGroups. The radius of gyration can be extracted directly from any AtomGroup (here, we will select the whole protein). The N-terminus and C-terminus coordinates, needed to calculate the end-to-end distance, can be obtained as the first atom of an AtomGroup containing all N atoms and the last atom of an AtomGroup containing all C atoms, respectively." ] }, { @@ -405,7 +441,6 @@ "metadata": {}, "outputs": [], "source": [ - "#u = mda.Universe(\"trajectory_formatted.pdb\")\n", "nterm = u.select_atoms('name N')[0]\n", "cterm = u.select_atoms('name C')[-1]\n", "bb = u.select_atoms('protein')\n", @@ -421,6 +456,14 @@ " rg.append(rgyr)" ] }, + { + "cell_type": "markdown", + "id": "c1dfd9f2-2d55-4ab1-a9d2-13685b853947", + "metadata": {}, + "source": [ + "Let's now plot the quantities we have extracted for each simulation snapshot!" 
+ ] + }, { "cell_type": "code", "execution_count": null, @@ -444,78 +487,6 @@ "fig.savefig(\"rg_dist_p24.png\")" ] }, - { - "cell_type": "markdown", - "id": "86865f4e-ea12-4fda-860d-99ba2f250daf", - "metadata": {}, - "source": [ - "## Hydrogen bonds" - ] - }, - { - "cell_type": "markdown", - "id": "ecae01ea-b687-43ac-ae3c-3b86d37a4859", - "metadata": {}, - "source": [ - "The function below can work the hydrogen bonds in your protein. Can you work out how to use it?" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1657a552-0442-4c57-a25a-f1bc019c30e1", - "metadata": {}, - "outputs": [], - "source": [ - "def hbonds(hydrogens, acceptors):\n", - " \n", - " \"\"\" this function calculates hydrogen bonds \"\"\"\n", - " \n", - " acc_idx, hyd_idx = idx.T\n", - " \n", - " idx, dists = mda.lib.distances.capped_distance(acceptors.positions, \n", - " hydrogens.positions, \n", - " max_cutoff=3.0,\n", - " box=acceptors.dimensions) \n", - "\n", - " \n", - " acc_idx, hyd_idx = idx.T\n", - "\n", - " # select potential hydrogen bonds to check angles\n", - " potential_hbond_acceptors = acceptors[acc_idx]\n", - " potential_hbond_hydrogens = hydrogens[hyd_idx]\n", - "\n", - " # select hydrogen bond donors by looping over hydrogens and selecting the bonded oxygens\n", - " potential_hbond_donors = sum(h.bonded_atoms[0] for h in potential_hbond_hydrogens)\n", - " \n", - " angles = mda.lib.distances.calc_angles(potential_hbond_acceptors.positions,\n", - " potential_hbond_hydrogens.positions,\n", - " potential_hbond_donors.positions, \n", - " box=u.dimensions)\n", - " #convert to degrees\n", - " angles = np.rad2deg(angles)\n", - " \n", - " #check angles are larger than 130 degrees\n", - " angle_idx = np.where(angles >= 130.0)\n", - " \n", - " hbond_acceptors = potential_hbond_acceptors[angle_idx]\n", - " hbond_hydrogens = potential_hbond_hydrogens[angle_idx]\n", - " hbond_donors = potential_hbond_donors[angle_idx]\n", - " \n", - " return hbond_acceptors, 
hbond_hydrogens, hbond_donors" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0f7002d1-05a0-473c-93eb-384676f12848", - "metadata": {}, - "outputs": [], - "source": [ - "# Try using the hbonds function here!\n", - "\n" - ] - }, { "cell_type": "markdown", "id": "960746cd-7b96-4f92-82ed-608e23dea2d7",