Stochastic Block Model Prior with Ordering Constraints for Gaussian Graphical Models

🏆 This was the highest ranking project in its year batch.

This project was developed for the Bayesian Statistics course of the MSc in Mathematical Engineering at Politecnico di Milano, A.Y. 2022/2023.

Installation

How to clone the repository

git clone https://github.com/teobucci/bayesian-statistics-project
cd bayesian-statistics-project
git submodule update --init --recursive
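
To confirm that the submodule was actually checked out, the output of `git submodule status` can be inspected: git prefixes each submodule that is not initialized with `-`. The following is a minimal sketch; `uninitialized_submodules` is a hypothetical helper name, not something shipped with this repository.

```shell
# Hypothetical helper: reads `git submodule status` output on stdin and
# prints the path of every submodule whose line starts with "-"
# (the marker git uses for a submodule that is not initialized).
uninitialized_submodules() {
    awk '/^-/ { print $2 }'
}
```

Run from the repository root, `git submodule status --recursive | uninitialized_submodules` should print nothing once the submodule commands above have completed.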

How to build the `FGM` package

Open ./FGM/FGM.Rproj in RStudio and press:

Ctrl+Shift+B on Windows
Cmd+Shift+B on macOS

On macOS with an M1 chip you may get an error involving gfortran, in which case proceed as follows:

Install gcc which includes gfortran with
```
brew install gcc
```
Create the file ~/.R/Makevars if it does not exist yet, for example by running the following in a terminal
```
mkdir -p ~/.R
touch ~/.R/Makevars
```
Add the following lines to ~/.R/Makevars
```
FC = /opt/homebrew/Cellar/gcc/11.3.0_2/bin/gfortran
F77 = /opt/homebrew/Cellar/gcc/11.3.0_2/bin/gfortran
FLIBS = -L/opt/homebrew/Cellar/gcc/11.3.0_2/lib/gcc/11
```
This can be done by opening it in a normal text editor such as VSCode (code ~/.R/Makevars) or SublimeText (subl ~/.R/Makevars).

Note that you might have to replace 11.3.0_2 (and the trailing 11 in FLIBS, which is the major version) with your installed gcc version.
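
Since the three Makevars lines differ only in the gcc directory, they can be generated instead of typed by hand. The sketch below assumes the Homebrew Cellar layout shown above (`bin/gfortran` and `lib/gcc/<major version>`); `make_makevars` is a hypothetical helper, not part of this repository.

```shell
# Hypothetical helper: print the three Makevars lines for a given
# Homebrew gcc directory, assuming the layout used in the example above.
make_makevars() {
    gcc_dir="$1"                 # e.g. /opt/homebrew/Cellar/gcc/11.3.0_2
    version="${gcc_dir##*/}"     # last path component, e.g. 11.3.0_2
    major="${version%%.*}"       # major version, e.g. 11
    printf 'FC = %s/bin/gfortran\n' "$gcc_dir"
    printf 'F77 = %s/bin/gfortran\n' "$gcc_dir"
    printf 'FLIBS = -L%s/lib/gcc/%s\n' "$gcc_dir" "$major"
}
```

For example, `make_makevars /opt/homebrew/Cellar/gcc/11.3.0_2 >> ~/.R/Makevars` appends the lines listed above. The installed version can be looked up with `brew list --versions gcc`.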

How to install the packages

Install the required packages from CRAN

packages_list <-
    c(
        "tidyverse",
        "mvtnorm",
        "salso",
        "logr",
        "gmp",
        "mcclust",
        "igraph",
        "ggraph",
        "tidygraph",
        "uuid",
        "dittodb",
        "latex2exp",
        "kableExtra",
        "doSNOW",
        "doParallel"
    )
install.packages(packages_list)

Then install Alessandro Colombi's custom utilities (ACutils) and mcclust.ext from GitHub; this requires the devtools package

devtools::install_github("alessandrocolombi/ACutils")
devtools::install_github("sarawade/mcclust.ext")

How to compile the PDF files

To compile the presentations, run the following in the root of the repo

make prese1
make prese2
make prese3

To compile the report, run

make report

To compile everything, run

make pdf

To remove temporary LaTeX files, run

make clean

To remove both temporary and PDF files, run

make distclean

Running the analysis

The repository contains different files to perform the analysis

01_simulations_basic.Rmd is a notebook containing a vanilla implementation for running a single simulation, meant to be used as a playground for on-the-go configurations.
The second block of files implements a grid-search approach to run different simulations varying parameters to see how well the MCMC behaves and how robust it is:
- 02a_simulation_grid_generation.R generates the grid of required configurations.
- 02b_simulation_grid_parallel_execution.R runs and saves all the simulations from the grid produced by the previous file, which is sourced here. The execution runs in parallel and takes about 10 minutes on a 14" MacBook Pro M1.
- 02c_simulation_grid_visualization_notebook.Rmd reads all the simulations generated by the previous files and, when knitted, produces a PDF with one section per simulation. It opens with a comprehensive table of all the relevant indexes across the grid.
- 02d_simulation_grid_visualization_export_files.R is a script that, when sourced, reads all the simulations generated by 02b_simulation_grid_parallel_execution.R and saves all the relevant plots and figures to file, which is useful for embedding them in the presentations and the report.
- 02e_simulation_grid_kl_comparison.Rmd reads all the simulations generated from the previous files and, when knitted, produces a PDF comparing the evolution of the KL distance across iterations for different configurations. For a better understanding, it is advised to run the simulations without burn-in in this case.
03_analysis_real_dataset.Rmd is a notebook where the algorithm is run on a real dataset, which is meant to be stored in the dataset folder and is not included in this repository for privacy reasons. It is essentially a copy of 01_simulations_basic.Rmd, but without knowledge of the true graph and partition.
04_execution_time_regression.R is a script that implements a polynomial linear regression of the execution time against the number of nodes, taken from the simulation grid results.
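
The grid workflow described above can also be driven from a terminal. The sketch below assumes that `Rscript` is on the PATH, that the `rmarkdown` package is installed, and that it is run from the repository root; none of this is stated in this README, and `run_grid` is a hypothetical helper. Passing `--dry-run` only prints the commands.

```shell
# Hypothetical helper: run the simulation grid, knit the visualization
# notebook, and export the figures, stopping at the first failure.
run_grid() {
    # With --dry-run, print each command instead of executing it.
    if [ "${1:-}" = "--dry-run" ]; then run="echo"; else run=""; fi
    # 02b sources 02a, so the grid does not need to be generated separately.
    $run Rscript 02b_simulation_grid_parallel_execution.R &&
    $run Rscript -e 'rmarkdown::render("02c_simulation_grid_visualization_notebook.Rmd")' &&
    $run Rscript 02d_simulation_grid_visualization_export_files.R
}
```

Calling `run_grid --dry-run` first is a cheap way to check the order of the steps before committing to the full run.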

Final results

The final presentations can be found here:

The final report can be found here:

Stochastic_Block_Model_Prior_with_Ordering_Constraints_for_Gaussian_Graphical_Models.pdf

The knitted simulation results can be found here:

Authors

Supervisor: Alessandro Colombi (@alessandrocolombi)

Teo Bucci (@teobucci)
Filippo Cipriani (@SmearyTundra)
Filippo Pagella (@effefpi2)
Flavia Petruso (@fl-hi1)
Andrea Puricelli (@apuri99)
Giulio Venturini (@Vinavil334)
