Using OAR - Basic steps
=======================
Visualising the cluster state
-----------------------------
Many tools are available to visualize the cluster state.
~~~~~~~~~~~~~~~
Shell commands:
~~~~~~~~~~~~~~~
* oarstat: shows information about running or planned jobs. The -f option
  shows full information.
* oarnodes: shows the state of each resource. Warning: in OAR, a resource is
  not necessarily a machine. It is generally a CPU, a core or a host, but it
  can also be something else, such as a licence token or a VLAN. The oarnodes
  command reports the network address where each resource is located, its
  type, its state and many other properties.
~~~~~~~~~~~~~~~~
Graphical tools:
~~~~~~~~~~~~~~~~
* Monika: this web page shows the current resource states and job information.
  From this page you can get more details about a particular resource or job.
* DrawGantt: this web page shows the Gantt diagram of the schedule. It
  represents past, current and future jobs.
Submitting a job in an interactive shell
----------------------------------------
~~~~~~~~~~
Submission
~~~~~~~~~~
To submit an interactive job, use the "oarsub" command with the "-I" option::

  frontend:~$> oarsub -I

OAR then returns a unique job ID that identifies your job in the system::

  OAR_JOB_ID=1234

Once the requested resources are available and the job is scheduled, OAR
connects you to the first allocated node. OAR sets environment variables
that describe your submission::

  node:~$> env | grep OAR

In particular, the list of allocated nodes is stored in the file named by
$OAR_NODEFILE::

  node:~$> cat $OAR_NODEFILE
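The node file can also be summarised from a script. The sketch below is only an illustration: the helper name ``summarise_nodefile`` is hypothetical, and it assumes the file holds one hostname per line, possibly repeated when several resources sit on the same host. Outside a job, it falls back to a small sample file so that it stays runnable.

```shell
#!/bin/sh
# Summarise a node file: number of entries and number of distinct hosts.
# Assumption: one hostname per line, possibly repeated.
summarise_nodefile() {
    nodefile=$1
    total=$(grep -c . "$nodefile")              # non-empty lines
    unique=$(sort -u "$nodefile" | grep -c .)   # distinct hostnames
    echo "entries=$total hosts=$unique"
}

# Inside a job, OAR sets OAR_NODEFILE. Elsewhere, use a sample file.
if [ -z "${OAR_NODEFILE:-}" ]; then
    OAR_NODEFILE=$(mktemp)
    printf 'node1\nnode1\nnode2\nnode2\n' > "$OAR_NODEFILE"
fi

summarise_nodefile "$OAR_NODEFILE"
```

The distinct host list (``sort -u``) is usually what a launcher needs, while the raw entry count reflects how many resources were allocated under the stated assumption.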
~~~~~~~~~~~~~
Visualisation
~~~~~~~~~~~~~
You can get information about your job by looking at the Monika or DrawGantt
interfaces, or by typing on the command line::

  frontend:~$> oarstat -fj <OAR_JOB_ID>
~~~~~~~~~~~~~~~
Exiting the job
~~~~~~~~~~~~~~~
To terminate an interactive job, simply disconnect from the resource::

  node:~$> exit

You can also kill the job by typing::

  frontend:~$> oardel <OAR_JOB_ID>

In this case, the session is killed ("kill -9").
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Interactive submission on many resources
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The "-l" option specifies the wanted resources. For example, to work in
interactive mode on 2 CPUs for a maximum duration of 30 minutes, ask for::

  frontend:~$> oarsub -I -l /cpu=2,walltime=00:30:00

The walltime is the job's maximum duration. If the job overruns its walltime,
the system kills it. Set the walltime carefully according to how long your job
will take: if it is too short, the job is killed before it finishes; if it is
too long, the job may be scheduled later.

Once the job is scheduled and started, OAR connects you to the first
reserved node. You can still access the list of the other resources through
the $OAR_NODEFILE environment variable.
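When the expected duration is known in minutes, the walltime string can be built rather than typed by hand. This is a minimal sketch: ``walltime_from_minutes`` is a hypothetical helper, not part of OAR, and the command is only printed, not executed.

```shell
#!/bin/sh
# Convert a duration in minutes to the HH:MM:SS form used by "walltime=".
walltime_from_minutes() {
    printf '%02d:%02d:00\n' $(( $1 / 60 )) $(( $1 % 60 ))
}

wt=$(walltime_from_minutes 30)
# Print the submission command instead of running it.
echo "oarsub -I -l /cpu=2,walltime=$wt"
```

Adding a modest margin to the estimated duration keeps the job from being killed at the limit without delaying its scheduling too much.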
Batch submission
----------------
OAR can execute scripts in "passive mode". In this mode, the user
specifies a script at submission time. This script is executed on the
first reserved node. It is within this script that the user defines how
to use the parallel resources. All the ``$OAR_*`` environment variables are
available within the script.

The script must be executable.
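As an illustration, a batch script might look like the sketch below. The file name ``hello.sh`` and the function ``report_job`` are hypothetical. It assumes OAR sets ``OAR_JOB_ID`` and ``OAR_NODEFILE`` inside the job; the defaults only keep the script runnable outside of one.

```shell
#!/bin/sh
# hello.sh: hypothetical batch script, run by OAR on the first reserved node.

report_job() {
    # OAR_JOB_ID and OAR_NODEFILE are assumed to be set by OAR inside a job.
    echo "job ${OAR_JOB_ID:-unknown} running on $(uname -n)"
    # List each distinct host of the allocation once.
    sort -u "${OAR_NODEFILE:-/dev/null}" | while read -r host; do
        echo "allocated host: $host"
    done
}

report_job
```

Remember to make the script executable before submitting it, for example with ``chmod +x hello.sh``.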
~~~~~~~~~~
Submission
~~~~~~~~~~
The principle is the same as for interactive submission: just
replace the "-I" option with the path of your script::

  frontend:~$> oarsub -l /cpu=2,walltime=00:30:00 ./hello_mpi.sh
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Getting the results of the submission
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In passive mode, OAR creates 2 files: OAR.<OAR_JOB_ID>.stdout for the
standard output and OAR.<OAR_JOB_ID>.stderr for the standard error.
The names of these 2 files can be changed (see "man oarsub").
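The default file names can be derived from the job ID printed at submission. The sketch below works on a sample of that output (the ``OAR_JOB_ID=1234`` line shown earlier) rather than calling oarsub, and assumes the default naming scheme has not been overridden.

```shell
#!/bin/sh
# Sample submission output; in practice this would be captured from oarsub.
submit_output='OAR_JOB_ID=1234'

# Keep only the value after "OAR_JOB_ID=".
job_id=$(printf '%s\n' "$submit_output" | sed -n 's/^OAR_JOB_ID=//p')

echo "stdout file: OAR.$job_id.stdout"
echo "stderr file: OAR.$job_id.stderr"
```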
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Connecting to a running job
~~~~~~~~~~~~~~~~~~~~~~~~~~~
You can connect to a running job with the "-C" option of oarsub::

  frontend:~$> oarsub -C <OAR_JOB_ID>

You are then connected to the first reserved node.
Reservations
------------
So far we have only asked for our submissions to start immediately.
It is also possible to plan a job in the future. This feature is
available through the "-r <date>" option::

  frontend:~$> oarsub -r '2008-03-07 16:45:00' -l nodes=2,walltime=0:10:00 ./hello_mpi.sh
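The date argument can be generated instead of typed. This sketch assumes GNU ``date`` (the ``-d '+1 hour'`` relative form is not portable to every system) and only prints the resulting command.

```shell
#!/bin/sh
# Build a start date one hour from now in the 'YYYY-MM-DD HH:MM:SS' form.
# Assumption: GNU date, which accepts relative expressions with -d.
start=$(date -d '+1 hour' '+%Y-%m-%d %H:%M:%S')

# Print the reservation command instead of running it.
echo "oarsub -r '$start' -l nodes=2,walltime=0:10:00 ./hello_mpi.sh"
```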