Read a Slurm accounting database to a sqlite3 file
==================================================

This contains a utility, ``slurm2sql``, which uses the `Slurm
<https://slurm.schedmd.com/overview>`__ workload manager's ``sacct``,
to export statistics from jobs and load them to a well-formed SQLite3
file (the database is also made so that it can be queried with
DuckDB). This file can then be queried for analytics much more easily
than the raw database or your own exports. The main features are:

- Parse ``sacct`` output (this tool was made before ``sacct`` had
  JSON output; you might want to look at that instead, though it's
  hard to use).
- Preprocess all the values into basic units, including values like
  GPU usage that currently have to be extracted from other fields.

Even if SQLite isn't what you need, it provides an easy intermediate
file on the way to convert to whatever format you want. In
particular, it defines the database so that it can be used with
DuckDB, which is a more efficient tool for analytics.

There are also some command line frontends, ``slurm2sql-sacct`` and
``slurm2sql-seff``, that use this parsing to print out data in better
forms than the built-in Slurm commands. This is especially useful for
``sacct``. You can design your own tools like this.


Installation
------------

Normal ``pip`` installation: the package name is ``slurm2sql``, or
``slurm2sql[cli]`` to also install the command line programs.

::

  pip install slurm2sql[cli]

There is only a single file with no dependencies for the core
``slurm2sql`` library (which could also be manually downloaded - HPC,
right?), though the command line programs require ``tabulate``. It's
made to support very old Python.
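
For example, a hypothetical manual download (the URL here is an
assumption - check the real repository location and branch):

.. code-block:: console

  $ wget https://raw.githubusercontent.com/NordicHPC/slurm2sql/master/slurm2sql.py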



Usage
-----


``slurm2sql``
~~~~~~~~~~~~~

Sample usage::

  slurm2sql.py OUTPUT_DB -- [SACCT_FILTER_OPTIONS]


For example, to get all data from July and August (``-S``) for all
users (``-a``)::

  slurm2sql.py sincejuly.sqlite3 -- -S 2019-07-1 -a

You can also run this every day, to first load old history and then
keep a database updated::
  slurm2sql.py --history-resume sincejuly.sqlite3 -- -a
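
For instance, a hypothetical cron entry for such a nightly update (a
sketch only; the path and sacct filters depend on your site):

.. code-block:: console

  # m h dom mon dow  command
  10 0 * * *  slurm2sql.py --history-resume /path/to/sincejuly.sqlite3 -- -a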


``slurm2sql-sacct``
~~~~~~~~~~~~~~~~~~~

This is probably less useful than ``slurm2sql-seff`` below; see the
command line options for what it can do.

.. code-block:: console

  $ slurm2sql-sacct SACCT_FILTER


``slurm2sql-seff``
~~~~~~~~~~~~~~~~~~

This is more useful: it prints ``seff``-like output in a tabular
format. MemReqGiB is per-node, to compare with MaxRSSGiB.

.. code-block:: console

  $ slurm2sql-seff -S now-3day
  JobID    User     hours  NCPUS  CPUeff  MemReqGiB  MaxRSSGiB  MemEff  NGpus  GPUeff  read_MiBps  write_MiBps
  -------  -------  -----  -----  ------  ---------  ---------  ------  -----  ------  ----------  -----------
  1860854  darstr1   0.28      1     87%         50       9.76     20%                     213.88        14.51
  1877467  darstr1      0      0      0%          0                 0%
  1884493  darstr1      0      1      0%       0.49          0      0%
  1884494  darstr1      0      1      0%       0.49          0      0%

From Python
~~~~~~~~~~~

It can also be used from Python as what is essentially a glorified
parser.

.. code-block:: python

  import sqlite3
  import slurm2sql

  db = sqlite3.connect(':memory:')
  slurm2sql.slurm2sql(db, ['-S', '2019-08-26'])
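
Continuing that example, a minimal sketch of querying the resulting
database (the ``allocations`` view is described below):

.. code-block:: python

  # Print a few job allocations from the freshly loaded database.
  for jobid, elapsed in db.execute(
          'SELECT JobID, Elapsed FROM allocations LIMIT 5'):
      print(jobid, elapsed)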
Using via DuckDB from Python (with the raw sqlite database):
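
A sketch of what that can look like, assuming the
``sincejuly.sqlite3`` file from the examples above and DuckDB's
sqlite extension:

.. code-block:: python

  import duckdb

  # Get the sqlite extension (downloaded on first INSTALL).
  duckdb.sql("INSTALL sqlite")
  duckdb.sql("LOAD sqlite")
  # Scan the slurm table straight out of the sqlite file.
  duckdb.sql(
      "SELECT count(*) FROM sqlite_scan('sincejuly.sqlite3', 'slurm')"
  ).show()
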
Database format
---------------

Tables and views:

* Table ``slurm``: the main table with all of the data. There is one
  row for each item returned by ``sacct``.
* View ``allocations``: has only the jobs, not job steps (``where
  JobStep is null``).
* View ``eff``: does a lot of processing of ``slurm`` to produce
  ``CPUEff``, ``MemEff``, and ``GPUeff`` values (0.0-1.0 usage
  fractions), in addition to a bit more.
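
For instance, a quick sketch of checking what's inside (assuming a
database file made by the examples above):

.. code-block:: python

  import sqlite3

  db = sqlite3.connect('sincejuly.sqlite3')
  # List the tables and views defined in the file.
  print(db.execute(
      "SELECT name, type FROM sqlite_master"
      " WHERE type IN ('table', 'view')").fetchall())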

In general, there is one column for each item returned by ``sacct``,
but some of them are converted into a more useful form. Some columns
are added by re-processing other columns. See ``COLUMNS`` in
``slurm2sql.py`` for details. Extra columns can easily be added.

Developer note: there are two types of converter functions to make
the columns: easy ones, which map one slurm column directly to a
database column via a function, and line functions, which take the
whole row and can do arbitrary remixing of the data (to compute
things like ``CPUEff``).
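
For illustration, a hypothetical sketch of the two styles (not the
actual ``slurm2sql`` internals - see ``COLUMNS`` in ``slurm2sql.py``
for the real ones):

.. code-block:: python

  def slurm_mem_to_bytes(value):
      """Easy converter: one sacct column in, one database value out."""
      if not value:
          return None
      units = {'K': 2**10, 'M': 2**20, 'G': 2**30, 'T': 2**40}
      if value[-1] in units:
          return int(float(value[:-1]) * units[value[-1]])
      return int(value)

  def line_cpu_eff(row):
      """Line function: sees the whole row, here computing a CPU
      efficiency from already-converted values."""
      try:
          return row['TotalCPU'] / (row['Elapsed'] * row['NCPUS'])
      except (KeyError, TypeError, ZeroDivisionError):
          return None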

Columns
~~~~~~~
All column values are converted to standard units: *bytes* (not MB,
KB, etc), *seconds*, *fraction 0.0-1.0* for things like
percentages, and *unixtime*.

Columns which are the same in raw ``sacct`` output aren't documented
specifically here (but note the default units above).

Below are some notable columns which do not exist in sacct (for the
rest, check out the `sacct manual page
<https://slurm.schedmd.com/sacct.html#lbAF>`_). It's good to verify
that any of our custom columns make sense before trusting them. For
other columns, check ``man sacct``.
The array job ID is the same for all jobs in an array. We split all
of these different IDs into the following fields:

* ``JobID``: The full raw value that Slurm gives. The same for each
  job in an array.

* ``JobIDnostep``: The part of JobID without anything after the
  ``.`` (no steps).

* ``JobIDonly``: The integer part of the JobID.

* ``JobIDRawonly``: The integer part of the raw JobID (so this is
  different for each job in an array).

* ``ArrayTaskID``: As used above. Integer or null.

* ``JobStep``: Job step - only. If you SQL filter for ``JobStep is
  null`` you get only the main allocations. String.

* Note: HetJob offsets are not currently handled; they are silently
  stripped out and give invalid data. File an issue and this will
  be added.

* ``ReqMem``: The raw slurm value in a format like "5Gn". Instead of
  parsing this, you probably want to use one of the other values below.

* ``ReqMemRaw``: The numeric value of the ``ReqMem``, whether it is
  ``c`` or ``n``.

* ``ReqGPU``: Number of GPUs requested. Extracted from ``ReqTRES``.

* GPU information. These use values from the ``TRESUsageInAve``
  fields in modern Slurm.

  * ``GpuMem``: ``gres/gpumem``.

  * ``GpuUtil``: ``gres/gpuutil`` (fraction 0.0-1.0).

  * ``NGpus``: Number of GPUs. Should be the same as ``ReqGPU``, but
    who knows.

  * ``GpuUtilTot``, ``GpuMemTot``: like above but using the
    ``TRESUsageInTot`` sacct field.

* ``MemEff``: This is null in the ``slurm`` table now, since Slurm
  gives ReqMem in allocations and memory used in steps. The ``eff``
  view calculates this now.

* ``CPUEff``: CPU efficiency (0.0-1.0). All the same caveats as above
  apply: test before trusting.

Quick reference of the other most important columns from the
accounting database that are hardest to remember:

* ``Elapsed``: Wall clock time


* ``TotalCPU``: SystemCPU + UserCPU, seconds of productive work.

The ``eff`` view adds the following:

* ``CPUEff``: like the per-row ``CPUEff`` above, but for the whole
  job.

* ``MemEff``: Memory efficiency for the whole job (max(MaxRSS) /
  ReqMem).

* And more, see the code for now.
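
For example, a sketch of finding low-efficiency jobs (assuming a
database file made by the examples above):

.. code-block:: python

  import sqlite3

  db = sqlite3.connect('sincejuly.sqlite3')
  query = '''
      SELECT JobID, CPUEff, MemEff FROM eff
      WHERE CPUEff IS NOT NULL AND CPUEff < 0.5
  '''
  for jobid, cpueff, memeff in db.execute(query):
      print(jobid, cpueff, memeff)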



Changelog
---------

Next

* This is the biggest column clean-up in a while.
* Add the ``slurm2sql-seff`` and ``slurm2sql-sacct`` commands.
* JobID columns adjusted: ``JobID`` is the raw value that Slurm gives,
  the ``*only`` columns are integer IDs without any trailing parts,
  and ``JobIDRawonly`` is the raw JobID without any trailing parts.
* ReqMem has been updated: it no longer parses ``n`` and ``c``
  suffixes for mem-per-node/cpu, and the respective column has been
  removed.
* MemEff has been removed from the ``slurm`` table, since it is always
  empty. The ``eff`` view has been added instead.

0.9.1

Development and maintenance
---------------------------

This could be considered beta right now, but it works and is in use by
people. If this is important for you, comment about your use case
in the GitHub issue tracker. Watch the repo if you want to give
comments on future data schema updates and backwards compatibility
when Slurm changes.

There are many different variations of Slurm; if it doesn't
work for you, send an issue or pull request to help us make it more
general - development is only done in response to feedback.

Development principles:

- All values are in the most basic (metric) units: bytes, seconds,
  seconds-since-epoch, etc.
- Try to use existing Slurm column names as much as possible (even if
  they are hard to remember).
- Try to support as many Slurm versions as possible, but if something
  becomes hard to support, don't worry too much about breaking
  compatibility. SchedMD supports Slurm for 18 months after release;
  try to support at least those versions. (Until someone asks for it,
  don't assume we can import data from very old Slurm versions.)
- Don't try to maintain database compatibility. It's expected that for
  all schema changes, you have to delete and re-import. But try to
  avoid this when not needed.

Release process::

  python setup.py sdist bdist_wheel
  twine upload [--repository-url https://test.pypi.org/legacy/] dist/*0.9.0*


Originally developed at Aalto University, Finland.
