If you have any trouble please email us! holly.pacey/benjamin.hodkinson@physics.ox.ac.uk
You can use either the GitHub or the CERN GitLab repo.
Anyone can get a GitHub account, and anyone with a CERN account will already have a CERN GitLab account. We recommend setting up SSH keys on your machine for authentication. Instructions to generate a new key and upload it are here: github gitlab
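If you haven't made an SSH key before, the basic recipe looks like this (the file name and email comment below are just placeholders; the linked instructions have the full details):

```shell
# Make sure ~/.ssh exists with safe permissions
mkdir -p ~/.ssh && chmod 700 ~/.ssh
# Generate a new ed25519 key pair. -f sets the file name (placeholder here),
# -C adds an identifying comment. -N "" means no passphrase; for real use,
# drop -N and choose a passphrase when prompted.
ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519_git -N "" -C "your.name@physics.ox.ac.uk"
# Print the public half - this is what you paste into the GitHub/GitLab web UI
cat ~/.ssh/id_ed25519_git.pub
```

The private key (no `.pub`) stays on your machine; only the `.pub` file is ever uploaded.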
Go to the repo URL, open the 'Code' drop-down menu and copy the SSH path. Then, in the directory you want to work in:
git clone <ssh path>
The tutorials are all based in Jupyter notebooks and use virtual environments. Jupyter notebooks are good for tutorials: you can step through bits of code in subsections, edit and rerun small parts, and see worked examples. You can also still view the code itself on the git repo online. However, for your actual PhD work we'd recommend writing proper scripts in a modular package structure.
For the analysis tutorials in the course we provide two versions, one using ROOT and one using a pythonic workflow. Pick whichever you think will be most relevant to your PhD work.
Ideally work on pplxint: just set up a venv using the setup/setup_root_venv.sh script.
Elsewhere, assuming you don't have a local ROOT installation, use conda. Note that ROOT via conda doesn't work on Windows, so for the ROOT tutorial please ssh into pplxint instead.
- Download and install Miniconda: https://docs.anaconda.com/miniconda/ (a lighter-weight download than full conda, but fine for anything you need here). Install it in your data/ area rather than your home/ area.
- Switch it on via:
eval "$(/your/path/to/miniconda/bin/conda shell.bash hook)"
- Create and set up the environment via the setup/setup_conda_root.sh script.
Either set up the venvs using the scripts:
- setup/setup_venv_softwaretut.sh for the software+coding lecture tutorials.
- setup/setup_venv_ml.sh for the ML lecture tutorials.
- setup/setup_venv_fitting.sh for the statistics lecture tutorials.
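If you're curious what a script like that does under the hood, a minimal venv setup is just a few lines (the environment name below is illustrative, not the exact contents of the course scripts, which also install the required packages):

```shell
# Create a virtual environment in ./venv_tutorial (name is a placeholder)
python3 -m venv venv_tutorial
# Activate it in the current shell; 'deactivate' leaves it again
source venv_tutorial/bin/activate
# Inside the venv, python and pip point at the environment, not the system:
python -c "import sys; print(sys.prefix)"
# ...followed by 'pip install <packages>' for whatever the tutorial needs
```

You need to re-run the `source` line in every new terminal session before using the environment.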
Or use conda to set up the virtual environment:
- Download, install and activate Miniconda: https://docs.anaconda.com/miniconda/ (a lighter-weight download than full conda, but fine for anything you need here). Install it in your data/ area rather than your home/ area.
- create and set up the environment via
source setup/setup_conda.sh
If you are working on pplxint, just softlink the data:
ln -s /data/atlas/users/pacey/GamGam/GamGam /your/desired/folder/oxfordcmpp/data/
Otherwise you can copy it over ssh to your machine:
scp -r -o 'ProxyJump username@bastion.physics.ox.ac.uk' username@pplxint12.physics.ox.ac.uk:/data/atlas/users/pacey/GamGam/GamGam /your/desired/folder/oxfordcmpp/data/
Or you can download it directly from here: GamGam.zip has everything.
We want to run Jupyter remotely on pplxint but open the notebook in a browser on our laptop.
Ensure you are X11 forwarding, either by including the -X flag when running ssh (ssh -X {username}@pplxint12.physics.ox.ac.uk) or by adding the following to your ~/.ssh/config:
Host *.physics.ox.ac.uk
# This sets the username ssh will try to use by default for anything under the physics domain
User <my_physics_username>
ForwardX11Trusted yes
ForwardX11 yes
Host *.physics.ox.ac.uk !bastion.physics.ox.ac.uk
# This tells ssh to "jump" via another system, this is needed to get in from outside the network.
ProxyJump bastion.physics.ox.ac.uk
See IT support pages here and here for more info.
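You can sanity-check what ssh will actually do for a host with `ssh -G`, which prints the resolved options without connecting. Below is a self-contained demo against a throwaway config file (drop the `-F` flag to inspect your real `~/.ssh/config` instead):

```shell
# Write a minimal config to a temp file; ssh -G resolves it without connecting
cat > /tmp/ssh_config_demo <<'EOF'
Host *.physics.ox.ac.uk
    ForwardX11 yes
    ForwardX11Trusted yes
EOF
# Show the forwardx11 settings ssh has resolved for this host
ssh -G -F /tmp/ssh_config_demo pplxint12.physics.ox.ac.uk | grep -i forwardx11
```

This is a quick way to confirm a new `Host` stanza is actually being matched before you try to connect.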
It should now be sufficient to launch the notebook on pplxint in your terminal session via jupyter notebook {notebook}.ipynb, and a new browser window will open.
1. Install XQuartz. If you have a department MacBook you can do this via the Self Service application.
2. Restart.
3. Connect to pplxint:
ssh {username}@pplxint12.physics.ox.ac.uk
4. Activate your conda/virtual environment.
5. Run
jupyter notebook password
and enter a password (this will be used to access the Jupyter session).
6. Launch a Jupyter session on the remote server:
jupyter notebook --no-browser &
Make a note of the port used (you should see some printout like Jupyter Server is running at: http://localhost:{port}/).
7. In a local terminal window, run
ssh -N -f -L 8888:localhost:{port} {username}@pplxint12.physics.ox.ac.uk
8. Fire up your browser and type localhost:8888. Enter the password you set in step (5). You should now have the remote Jupyter session running in your browser.
(See https://www.blopig.com/blog/2018/03/running-jupyter-notebook-on-a-remote-server-via-ssh/ for explanation on these steps)
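If you want to grab the port number programmatically rather than reading it off the printout, something like this works (the log line below is mocked for illustration):

```shell
# Example Jupyter startup line (mocked here; yours comes from the real server)
line='Jupyter Server is running at: http://localhost:8890/'
# Pull out the port number with sed
port=$(echo "$line" | sed -n 's/.*localhost:\([0-9]*\).*/\1/p')
echo "$port"
# The local tunnel command would then be:
# ssh -N -f -L 8888:localhost:$port {username}@pplxint12.physics.ox.ac.uk
```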
We recommend trying VSCode, as it has a huge amount of functionality and extensions that allow you to develop code efficiently. You can ssh into pplxint with it directly.
- For Windows: install and set up Windows OpenSSH first info
- Then set up ssh in VSCode info; there is a helpful Stack Overflow answer about bastion jumping here. Follow the instructions to make ssh keys for bastion and pplxint12/13. This involves making a ~/.ssh/config that will look roughly like this:
Host bastion
HostName bastion.physics.ox.ac.uk
IdentityFile ~/.ssh/bastion_key
# MACs here lists allowed SSH message authentication code algorithms
# (not a hardware MAC address)
MACs <mac algorithms, e.g. hmac-sha2-512>
User <your username>
Host pplxint12
Hostname pplxint12.physics.ox.ac.uk
IdentityFile ~/.ssh/pplxint12_key
ProxyJump bastion
User <your username>
Some more info here.
- Set up an environment that has jupyter, ipython and ipykernel installed.
- Open a notebook via
jupyter-notebook file.ipynb
or via the explorer.
- In the top right click 'Select Kernel', click 'Python Environments...' in the top menu, then select the appropriate venv/conda path.
Please check you can download the git repository (repo) and set up conda/venv for either the ROOT or pythonic environment, depending on which tutorial you think you'd like to do (see below).
(Optional, for those new to Git.) You can also practice pushing to the repo. After you have cloned it and moved into the repo folder...
If any time has passed, check you have all the latest changes:
git pull origin master
(or main instead of master if using github)
Create a new branch from main/master that you can develop the code in:
git checkout -b <your_name>_branch
We can check that you can push code to the repo with a simple example...
mkdir ascii_art
Go find your favourite ascii art animal picture from here: here
and put it into a text file called <your_name>.txt in the ascii_art folder.
Let's put it on git:
git add ascii_art/<your_name>.txt
git commit -m "my animal"
git push origin <your_name>_branch
This creates a new branch in the online repo with your change. You can then, via the web browser, create a Merge/Pull Request with your branch as the source and main/master as the target. If you add Ben and Holly as reviewers we can check it and merge it in.
Then, back in your local directory, pull/fetch again to get the new remote changes into your local repo, and switch back to the master/main branch fresh for future development.
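If you'd like to rehearse this branch/commit/merge cycle without touching the shared repo, the whole round trip can be done in a throwaway local repository (all names and paths below are placeholders):

```shell
# Start from a clean throwaway repo - nothing here touches a real remote
rm -rf /tmp/git_practice && mkdir /tmp/git_practice && cd /tmp/git_practice
git init .
git checkout -b main
# An identity is required for commits; these values are placeholders
git config user.email "you@example.com"
git config user.name "Your Name"
# First commit on main
echo "practice repo" > README.md
git add README.md && git commit -m "initial commit"
# Same steps as the ascii_art example: branch, add a file, commit
git checkout -b your_name_branch
mkdir ascii_art
echo "(='.'=)" > ascii_art/your_name.txt
git add ascii_art/your_name.txt && git commit -m "my animal"
# On the real repo the merge happens via a Merge/Pull Request; locally:
git checkout main && git merge your_name_branch
git log --oneline
```

The only pieces missing relative to the real workflow are `git push` and `git pull`, which need the remote.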
There are too many topics to fit into one hour, and you all have a broad range of existing experience and needs. So, you have many choices of what to do now! All the materials will also remain available to you if later on in your PhD you want to come back and learn about something else.
If you want to look at an example of analysing/processing a ROOT ntuple (TTree) to extract histograms with a given selection, and then create plots, follow the notebooks in the AnalsisTutorials/ directory:
- part1_process_TTree_root.ipynb, then part1_plotter_root.ipynb, using pyROOT (needs the ROOT environment).
- part1_process_TTree_root.cpp, then part1_plotter_root.cpp, using C++ ROOT (needs the ROOT environment). These use the 'HistMaker' class in the same folder, which you should look through too; in setup/ there are bash scripts to compile and run the code.
- part1_process_TTree_pythonic.ipynb, then part1_plotter_pythonic.ipynb, using Uproot/Awkward/Pandas/Matplotlib (doesn't need the ROOT environment).
These use ATLAS open data for a Higgs -> GammaGamma analysis. Ideally you could look at both sets of scripts to compare how the different tools do the same thing.
If you want a different look at using Pandas DataFrames to process some MiniBooNE data, or shorter tutorials on Matplotlib and NumPy, look in the ToolTutorials/ directory (none need ROOT):
- MatplotlibExample.ipynb
- NumPyExamples.ipynb
- PandasTests.ipynb
There is also an example of how to submit a batch job on pplxint; instructions are in ToolTutorials/batchtest_readme.md
If you would prefer to learn more about any of the other concepts in the lecture you can look at the broad tutorial provided by the Oxford DTC:
- pre-recs: here
- course materials: here
You can also use this as a guide later if you want to refactor, document, set up CI or Git for, or otherwise improve any of your existing code.
Alternatively, the HEP Software Foundation has excellent tutorials on the topics covered and more: https://hsf-training.org/training-center/.
The tutorials are all Jupyter notebooks in the ML folder.
In the Fitting folder there are:
(1) a number of small standalone notebooks showing simple examples of different aspects of RooStats, Bayesian inference, MCMC and sPlot (taken from the 2023 course);
(2) a meaty Higgs -> diphoton script going through a full discovery/exclusion analysis with RooStats (new; kept as .py rather than .ipynb after some memory issues, sorry).
- It includes: functional-form fits; setting up a likelihood and a RooWorkspace; defining s+b and b-only models; profile likelihood fitting; discovery fits; exclusion fits; CLs upper-limit scans.
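For reference, the quantities behind those profile-likelihood and CLs steps are the standard ones (usual statistics notation, not copied from the script itself):

```latex
% Profile likelihood ratio for signal strength \mu; \theta are nuisance parameters
\lambda(\mu) = \frac{L\!\left(\mu,\,\hat{\hat{\theta}}(\mu)\right)}{L\!\left(\hat{\mu},\,\hat{\theta}\right)},
\qquad t_\mu = -2\ln\lambda(\mu)
% The CLs criterion used for the upper-limit scan
\mathrm{CL}_s = \frac{p_{s+b}}{1 - p_b}
```

The numerator profiles the nuisance parameters at fixed signal strength, while the denominator is the global maximum likelihood; a point is excluded at 95% CL when its CLs value falls below 0.05.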
- This builds on the results from part 1 (Lecture 3), namely the histograms made there. We have a slightly more complete version: compared to last week it just has a few more cuts added and runs over all the Higgs signal processes. The final version of the code can be found here if you are interested: part1_process_TTree_root_completed.ipynb. We've added the needed histograms to git.
- Please set up an environment with ROOT (if you are on pplxint you don't need to set up anything).
The RooStats script is run via:
python Fitting/part2_discoHiggs.py -i histograms/GamGam_root/ -o plots/ -f poly4
As the top of the script says, we suggest you work through it by commenting out everything in main() and slowly re-adding each step. For each step, read through the function and the terminal output to check your understanding bit by bit.