In this example you will make a grid submission that splits the list of input files in two; each of the two jobs will run on half of the files.
There is one subtlety. Our grid tools require that the output root file be named according to the Mu2e file name conventions. To satisfy this, the fcl file that we will actually run on the grid is named Tutorial/AllInOne/fcl/all01_grid.fcl. It uses #include to include the original file and then overrides the definition of the root file name. Look at the file to see this. We strongly recommend this style over the alternative of making a copy of the file and editing the copy; the preferred method automatically propagates changes in the first file to the second.
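For orientation, here is a minimal sketch of a check on the shape of an output file name. It assumes the Mu2e convention of six dot-separated fields (data tier, owner, description, configuration, sequencer, file format). The function name and the sample names are made up for illustration and are not part of any Mu2e tool.

```sh
# Hypothetical helper: succeeds if the name has exactly six non-empty,
# dot-separated fields, the shape assumed here for Mu2e file names:
#   data_tier.owner.description.configuration.sequencer.file_format
has_six_fields() {
    printf '%s\n' "$1" | awk -F. '
        { ok = (NF == 6); for (i = 1; i <= NF; i++) if ($i == "") ok = 0 }
        END { exit(ok ? 0 : 1) }'
}

# Illustrative names only; the owner and sequencer values are invented.
for name in nts.kutschke.all01.MDC2020v_perfect_v1_0.000001.root all01.root; do
    if has_six_fields "$name"; then
        echo "$name: has the six-field shape"
    else
        echo "$name: does not have the six-field shape"
    fi
done
```

A check like this only tests the number of fields; it does not validate the individual field values.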
- Log in and return to your working directory, then run:
mu2einit
muse setup
- Use muse to make a tar file of the contents of your working directory. It is smart enough to skip the out/ subdirectory. It also knows that the backing releases are in cvmfs and will be visible on the grid worker nodes at the same path, so they will just work.
muse tarball
- This will print a name like /mu2e/data/users/kutschke/museTarball/tmp.fNQI4rlDbd/Code.tar.bz2. Save the name; it is the value you will pass to --code when you submit the jobs.
- The next step is to generate the two fcl files that are needed, one for each job. This is done with a tool named generate_fcl. First bring generate_fcl into your environment:
setup mu2etools
setup mu2efiletools
You can then get help for generate_fcl with:
generate_fcl --help
- Issue the command.
generate_fcl --dsconf=dummy \
--inputs=Tutorial/AllInOne/filelist.txt \
--merge-factor=10 \
--auto-description \
--include Tutorial/AllInOne/fcl/all01_grid.fcl
This will make a subdirectory 000/ that contains two .fcl files and two .fcl.json files. We won't cover the use of the .json files today. The --dsconf argument is needed even though it is not actually used. The other arguments are well described in the output of generate_fcl --help. The fcl files need to be put onto /pnfs so that they will be visible to the grid nodes, where the jobs run.
tar cjf AllInOne.tar.bz2 000
mkdir -p /pnfs/mu2e/scratch/users/$USER/fcl
cp AllInOne.tar.bz2 /pnfs/mu2e/scratch/users/$USER/fcl/
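As a cross-check on the generate_fcl step above, the sketch below estimates how many jobs to expect and counts the fcl files actually produced. It assumes that --merge-factor is the maximum number of input files per job, so the expected count is the number of listed files divided by the merge factor, rounded up. The function names and the demonstration data are hypothetical.

```sh
# Hypothetical helper: expected number of jobs for a file list and a merge
# factor, assuming each job takes at most merge_factor input files.
expected_jobs() {
    n=$(awk 'NF { n++ } END { print n + 0 }' "$1")   # non-empty lines only
    echo $(( (n + $2 - 1) / $2 ))                    # integer round-up
}

# Hypothetical helper: count job fcl files in a directory such as 000/.
# The pattern *.fcl does not match the accompanying *.fcl.json files.
count_fcl() {
    c=0
    for f in "$1"/*.fcl; do
        if [ -e "$f" ]; then c=$((c + 1)); fi
    done
    echo "$c"
}

# Self-contained demonstration with a made-up list of 20 input files.
demo=$(mktemp -d)
awk 'BEGIN { for (i = 1; i <= 20; i++) print "input_file_" i }' > "$demo/filelist.txt"
mkdir "$demo/000"
touch "$demo/000/job1.fcl" "$demo/000/job1.fcl.json" \
      "$demo/000/job2.fcl" "$demo/000/job2.fcl.json"
echo "expected jobs: $(expected_jobs "$demo/filelist.txt" 10)"
echo "fcl files:     $(count_fcl "$demo/000")"
rm -rf "$demo"
```

In the tutorial working directory the corresponding calls would be expected_jobs Tutorial/AllInOne/filelist.txt 10 and count_fcl 000; if the assumption about --merge-factor holds, both should report 2.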
Now submit the jobs, replacing path_to_the_output_of_muse_tarball with the name printed by muse tarball:
setup mu2egrid
mu2eprodsys --code=path_to_the_output_of_muse_tarball \
--fcllist=/pnfs/mu2e/scratch/users/$USER/fcl/AllInOne.tar.bz2 \
--memory=1500MB \
--expected-lifetime=1h \
--xrootd \
--dsconf=MDC2020v_perfect_v1_0
Here dsconf is not ignored. The value was chosen to match the configuration field of the input dataset. If you have not submitted a grid job in the last 30 days, the submission will pause with the message:
Complete the authentication at:
https://cilogon.org/device/?user_code=9DM-K2X-6PK
No web open command defined, please copy/paste the above to any web browser
Waiting for response in web browser
Copy the URL into a browser and go to it. The browser can be running on your laptop. You will come to a screen with a login button. Click on it. This uses your Fermilab SSO to complete authentication. This will bring you to a screen that says "CILogon User Code Verification"; click on that and in a few seconds your submission will continue.
Save the output generated by this command. You need to save two things. Near the start of the output is a line:
Will use the outstage directory = /pnfs/mu2e/scratch/users/kutschke/workflow/default/outstage
Near the bottom is a line like:
Use job id 73013353.0@jobsub02.fnal.gov to retrieve output
When your jobs are complete you can find the output files in the two directories matched by:
/pnfs/mu2e/scratch/users/kutschke/workflow/default/outstage/73013353/*
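The two saved lines can be combined into the output path mechanically. The sketch below assumes the submission output was captured in a file (for example by piping the command through tee), here stood in for by a temporary file holding the two example lines above. It also assumes that the directory under outstage is named after the part of the job id before the first dot, as in the example.

```sh
# Stand-in for saved submission output; in practice this file would hold the
# real text printed by the submission command.
log=$(mktemp)
cat > "$log" <<'EOF'
Will use the outstage directory = /pnfs/mu2e/scratch/users/kutschke/workflow/default/outstage
Use job id 73013353.0@jobsub02.fnal.gov to retrieve output
EOF

# Extract the outstage directory and the job id from the two marker lines.
outstage=$(sed -n 's/^Will use the outstage directory = //p' "$log" | tail -n 1)
jobid=$(sed -n 's/^Use job id \([^ ]*\) to retrieve output.*$/\1/p' "$log" | tail -n 1)

# Assumption: the output subdirectory is named after the job id up to the
# first dot (73013353 in the example).
cluster=${jobid%%.*}
outdir="$outstage/$cluster"

echo "job id:     $jobid"
echo "output dir: $outdir"
rm -f "$log"
```

Once the jobs have finished, ls -d "$outdir"/* would then list the per-job directories.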
While your jobs are running you can monitor their progress with the command:
jobsub_q --user=$USER
Among other things this shows you each job's state. R means running. I means idle, which usually means the job is in the queue waiting to run. H means the job is on hold; that indicates trouble, and your options will be described elsewhere.
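As a small memory aid, the state codes described above can be written as a lookup. This is only a sketch: the function name is hypothetical, it covers just the three codes mentioned here, and it does not parse real jobsub_q output.

```sh
# Hypothetical helper: translate a job state code into the meaning given above.
describe_state() {
    case "$1" in
        R) echo "running" ;;
        I) echo "idle (usually queued, waiting to run)" ;;
        H) echo "on hold (needs attention)" ;;
        *) echo "not covered in this tutorial" ;;
    esac
}

for s in R I H; do
    echo "$s: $(describe_state "$s")"
done
```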