forked from open-mpi/hwloc
-
Notifications
You must be signed in to change notification settings - Fork 0
/
README
697 lines (620 loc) · 25.3 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
This is a truncated and poorly-formatted version of the documentation main page.
See https://www.open-mpi.org/projects/hwloc/doc/ for more.
Installation
hwloc (https://www.open-mpi.org/projects/hwloc/) is available under the BSD
license. It is hosted as a sub-project of the overall Open MPI project (https:/
/www.open-mpi.org/). Note that hwloc does not require any functionality from
Open MPI -- it is a wholly separate (and much smaller!) project and code base.
It just happens to be hosted as part of the overall Open MPI project.
Basic Installation
Installation is the fairly common GNU-based process:
shell$ ./configure --prefix=...
shell$ make
shell$ make install
The hwloc command-line tool "lstopo" produces human-readable topology maps, as
mentioned above. Running the "lstopo" tool is a good way to check as a
graphical output whether hwloc properly detected the architecture of your node.
Optional Dependencies
lstopo may also export graphics to the SVG and "fig" file formats. Support for
PDF, Postscript, and PNG exporting is provided if the "Cairo" development
package (usually cairo-devel or libcairo2-dev) can be found in "lstopo" when
hwloc is configured and build.
The hwloc core may also benefit from the following development packages:
* libpciaccess for full I/O device discovery (libpciaccess-devel or
libpciaccess-dev package). On Linux, PCI discovery may still be performed
(without vendor/device names) even if libpciaccess cannot be used.
* AMD or NVIDIA OpenCL implementations for OpenCL device discovery.
* the NVIDIA CUDA Toolkit for CUDA device discovery. See How do I enable CUDA
and select which CUDA version to use?.
* the NVIDIA Management Library (NVML) for NVML device discovery. It is
included in CUDA since version 8.0. Older NVML releases were available
within the NVIDIA GPU Deployment Kit from https://developer.nvidia.com/
gpu-deployment-kit .
* the NV-CONTROL X extension library (NVCtrl) for NVIDIA display discovery.
The relevant development package is usually libXNVCtrl-devel or
libxnvctrl-dev. It is also available within nvidia-settings from ftp://
download.nvidia.com/XFree86/nvidia-settings/ and https://github.com/NVIDIA/
nvidia-settings/ .
* the AMD ROCm SMI library for RSMI device discovery. The relevant
development package is usually rocm-smi-lib64 or librocm-smi-dev. See How
do I enable ROCm SMI and select which version to use?.
* the oneAPI Level Zero library. The relevant development package is usually
level-zero-dev or level-zero-devel.
* libxml2 for full XML import/export support (otherwise, the internal
minimalistic parser will only be able to import XML files that were
exported by the same hwloc release). See Importing and exporting topologies
from/to XML files for details. The relevant development package is usually
libxml2-devel or libxml2-dev.
* libudev on Linux for easier discovery of OS device information (otherwise
hwloc will try to manually parse udev raw files). The relevant development
package is usually libudev-devel or libudev-dev.
* libtool's ltdl library for dynamic plugin loading if the native dlopen
cannot be used. The relevant development package is usually
libtool-ltdl-devel or libltdl-dev.
PCI and XML support may be statically built inside the main hwloc library, or
as separate dynamically-loaded plugins (see the Components and plugins
section).
Also note that if you install supplemental libraries in non-standard locations,
hwloc's configure script may not be able to find them without some help. You
may need to specify additional CPPFLAGS, LDFLAGS, or PKG_CONFIG_PATH values on
the configure command line.
For example, if libpciaccess was installed into /opt/pciaccess, hwloc's
configure script may not find it by default. Try adding PKG_CONFIG_PATH to the
./configure command line, like this:
./configure PKG_CONFIG_PATH=/opt/pciaccess/lib/pkgconfig ...
Note that because of the possibility of GPL taint, the pciutils library libpci
will not be used (remember that hwloc is BSD-licensed).
Installing from a Git clone
Additionally, the code can be directly cloned from Git:
shell$ git clone https://github.com/open-mpi/hwloc.git
shell$ cd hwloc
shell$ ./autogen.sh
Note that GNU Autoconf >=2.63, Automake >=1.11 and Libtool >=2.2.6 are required
when building from a Git clone.
Nightly development snapshots are available on the web site, they can be
configured and built without any need for Git or GNU Autotools.
Questions and Bugs
Bugs should be reported in the tracker (https://github.com/open-mpi/hwloc/
issues). Opening a new issue automatically displays lots of hints about how to
debug and report issues.
Questions may be sent to the users or developers mailing lists (https://
www.open-mpi.org/community/lists/hwloc.php).
There is also a #hwloc IRC channel on Libera Chat (irc.libera.chat).
Command-line Examples
On a 4-package 2-core machine with hyper-threading, the lstopo tool may show
the following graphical output:
[dudley]
Here's the equivalent output in textual form:
Machine
NUMANode L#0 (P#0)
Package L#0 + L3 L#0 (4096KB)
L2 L#0 (1024KB) + L1 L#0 (16KB) + Core L#0
PU L#0 (P#0)
PU L#1 (P#8)
L2 L#1 (1024KB) + L1 L#1 (16KB) + Core L#1
PU L#2 (P#4)
PU L#3 (P#12)
Package L#1 + L3 L#1 (4096KB)
L2 L#2 (1024KB) + L1 L#2 (16KB) + Core L#2
PU L#4 (P#1)
PU L#5 (P#9)
L2 L#3 (1024KB) + L1 L#3 (16KB) + Core L#3
PU L#6 (P#5)
PU L#7 (P#13)
Package L#2 + L3 L#2 (4096KB)
L2 L#4 (1024KB) + L1 L#4 (16KB) + Core L#4
PU L#8 (P#2)
PU L#9 (P#10)
L2 L#5 (1024KB) + L1 L#5 (16KB) + Core L#5
PU L#10 (P#6)
PU L#11 (P#14)
Package L#3 + L3 L#3 (4096KB)
L2 L#6 (1024KB) + L1 L#6 (16KB) + Core L#6
PU L#12 (P#3)
PU L#13 (P#11)
L2 L#7 (1024KB) + L1 L#7 (16KB) + Core L#7
PU L#14 (P#7)
PU L#15 (P#15)
Note that there is also an equivalent output in XML that is meant for exporting
/importing topologies but it is hardly readable to human-beings (see Importing
and exporting topologies from/to XML files for details).
On a 4-package 2-core Opteron NUMA machine (with two core cores disallowed by
the administrator), the lstopo tool may show the following graphical output
(with --disallowed for displaying disallowed objects):
[hagrid]
Here's the equivalent output in textual form:
Machine (32GB total)
Package L#0
NUMANode L#0 (P#0 8190MB)
L2 L#0 (1024KB) + L1 L#0 (64KB) + Core L#0 + PU L#0 (P#0)
L2 L#1 (1024KB) + L1 L#1 (64KB) + Core L#1 + PU L#1 (P#1)
Package L#1
NUMANode L#1 (P#1 8192MB)
L2 L#2 (1024KB) + L1 L#2 (64KB) + Core L#2 + PU L#2 (P#2)
L2 L#3 (1024KB) + L1 L#3 (64KB) + Core L#3 + PU L#3 (P#3)
Package L#2
NUMANode L#2 (P#2 8192MB)
L2 L#4 (1024KB) + L1 L#4 (64KB) + Core L#4 + PU L#4 (P#4)
L2 L#5 (1024KB) + L1 L#5 (64KB) + Core L#5 + PU L#5 (P#5)
Package L#3
NUMANode L#3 (P#3 8192MB)
L2 L#6 (1024KB) + L1 L#6 (64KB) + Core L#6 + PU L#6 (P#6)
L2 L#7 (1024KB) + L1 L#7 (64KB) + Core L#7 + PU L#7 (P#7)
On a 2-package quad-core Xeon (pre-Nehalem, with 2 dual-core dies into each
package):
[emmett]
Here's the same output in textual form:
Machine (total 16GB)
NUMANode L#0 (P#0 16GB)
Package L#0
L2 L#0 (4096KB)
L1 L#0 (32KB) + Core L#0 + PU L#0 (P#0)
L1 L#1 (32KB) + Core L#1 + PU L#1 (P#4)
L2 L#1 (4096KB)
L1 L#2 (32KB) + Core L#2 + PU L#2 (P#2)
L1 L#3 (32KB) + Core L#3 + PU L#3 (P#6)
Package L#1
L2 L#2 (4096KB)
L1 L#4 (32KB) + Core L#4 + PU L#4 (P#1)
L1 L#5 (32KB) + Core L#5 + PU L#5 (P#5)
L2 L#3 (4096KB)
L1 L#6 (32KB) + Core L#6 + PU L#6 (P#3)
L1 L#7 (32KB) + Core L#7 + PU L#7 (P#7)
Programming Interface
The basic interface is available in hwloc.h. Some higher-level functions are
available in hwloc/helper.h to reduce the need to manually manipulate objects
and follow links between them. Documentation for all these is provided later in
this document. Developers may also want to look at hwloc/inlines.h which
contains the actual inline code of some hwloc.h routines, and at this document,
which provides good higher-level topology traversal examples.
To precisely define the vocabulary used by hwloc, a Terms and Definitions
section is available and should probably be read first.
Each hwloc object contains a cpuset describing the list of processing units
that it contains. These bitmaps may be used for CPU binding and Memory binding.
hwloc offers an extensive bitmap manipulation interface in hwloc/bitmap.h.
Moreover, hwloc also comes with additional helpers for interoperability with
several commonly used environments. See the Interoperability With Other
Software section for details.
The complete API documentation is available in a full set of HTML pages, man
pages, and self-contained PDF files (formatted for both both US letter and A4
formats) in the source tarball in doc/doxygen-doc/.
NOTE: If you are building the documentation from a Git clone, you will need to
have Doxygen and pdflatex installed -- the documentation will be built during
the normal "make" process. The documentation is installed during "make install"
to $prefix/share/doc/hwloc/ and your systems default man page tree (under
$prefix, of course).
Portability
Operating System have varying support for CPU and memory binding, e.g. while
some Operating Systems provide interfaces for all kinds of CPU and memory
bindings, some others provide only interfaces for a limited number of kinds of
CPU and memory binding, and some do not provide any binding interface at all.
Hwloc's binding functions would then simply return the ENOSYS error (Function
not implemented), meaning that the underlying Operating System does not provide
any interface for them. CPU binding and Memory binding provide more information
on which hwloc binding functions should be preferred because interfaces for
them are usually available on the supported Operating Systems.
Similarly, the ability of reporting topology information varies from one
platform to another. As shown in Command-line Examples, hwloc can obtain
information on a wide variety of hardware topologies. However, some platforms
and/or operating system versions will only report a subset of this information.
For example, on an PPC64-based system with 8 cores (each with 2 hardware
threads) running a default 2.6.18-based kernel from RHEL 5.4, hwloc is only
able to glean information about NUMA nodes and processor units (PUs). No
information about caches, packages, or cores is available.
Here's the graphical output from lstopo on this platform when Simultaneous
Multi-Threading (SMT) is enabled:
[ppc64-with]
And here's the graphical output from lstopo on this platform when SMT is
disabled:
[ppc64-with]
Notice that hwloc only sees half the PUs when SMT is disabled. PU L#6, for
example, seems to change location from NUMA node #0 to #1. In reality, no PUs
"moved" -- they were simply re-numbered when hwloc only saw half as many (see
also Logical index in Indexes and Sets). Hence, PU L#6 in the SMT-disabled
picture probably corresponds to PU L#12 in the SMT-enabled picture.
This same "PUs have disappeared" effect can be seen on other platforms -- even
platforms / OSs that provide much more information than the above PPC64 system.
This is an unfortunate side-effect of how operating systems report information
to hwloc.
Note that upgrading the Linux kernel on the same PPC64 system mentioned above
to 2.6.34, hwloc is able to discover all the topology information. The
following picture shows the entire topology layout when SMT is enabled:
[ppc64-full]
Developers using the hwloc API or XML output for portable applications should
therefore be extremely careful to not make any assumptions about the structure
of data that is returned. For example, per the above reported PPC topology, it
is not safe to assume that PUs will always be descendants of cores.
Additionally, future hardware may insert new topology elements that are not
available in this version of hwloc. Long-lived applications that are meant to
span multiple different hardware platforms should also be careful about making
structure assumptions. For example, a new element may someday exist between a
core and a PU.
API Example
The following small C example (available in the source tree as ``doc/examples/
hwloc-hello.c'') prints the topology of the machine and performs some thread
and memory binding. More examples are available in the doc/examples/ directory
of the source tree.
/* Example hwloc API program.
*
* See other examples under doc/examples/ in the source tree
* for more details.
*
* Copyright (c) 2009-2016 Inria. All rights reserved.
* Copyright (c) 2009-2011 Universit?eacute; Bordeaux
* Copyright (c) 2009-2010 Cisco Systems, Inc. All rights reserved.
* See COPYING in top-level directory.
*
* hwloc-hello.c
*/
#include "hwloc.h"
#include <errno.h>
#include <stdio.h>
#include <string.h>
static void print_children(hwloc_topology_t topology, hwloc_obj_t obj,
int depth)
{
char type[32], attr[1024];
unsigned i;
hwloc_obj_type_snprintf(type, sizeof(type), obj, 0);
printf("%*s%s", 2*depth, "", type);
if (obj->os_index != (unsigned) -1)
printf("#%u", obj->os_index);
hwloc_obj_attr_snprintf(attr, sizeof(attr), obj, " ", 0);
if (*attr)
printf("(%s)", attr);
printf("\n");
for (i = 0; i < obj->arity; i++) {
print_children(topology, obj->children[i], depth + 1);
}
}
int main(void)
{
int depth;
unsigned i, n;
unsigned long size;
int levels;
char string[128];
int topodepth;
void *m;
hwloc_topology_t topology;
hwloc_cpuset_t cpuset;
hwloc_obj_t obj;
/* Allocate and initialize topology object. */
hwloc_topology_init(&topology);
/* ... Optionally, put detection configuration here to ignore
some objects types, define a synthetic topology, etc....
The default is to detect all the objects of the machine that
the caller is allowed to access. See Configure Topology
Detection. */
/* Perform the topology detection. */
hwloc_topology_load(topology);
/* Optionally, get some additional topology information
in case we need the topology depth later. */
topodepth = hwloc_topology_get_depth(topology);
/*****************************************************************
* First example:
* Walk the topology with an array style, from level 0 (always
* the system level) to the lowest level (always the proc level).
*****************************************************************/
for (depth = 0; depth < topodepth; depth++) {
printf("*** Objects at level %d\n", depth);
for (i = 0; i < hwloc_get_nbobjs_by_depth(topology, depth);
i++) {
hwloc_obj_type_snprintf(string, sizeof(string),
hwloc_get_obj_by_depth(topology, depth, i), 0);
printf("Index %u: %s\n", i, string);
}
}
/*****************************************************************
* Second example:
* Walk the topology with a tree style.
*****************************************************************/
printf("*** Printing overall tree\n");
print_children(topology, hwloc_get_root_obj(topology), 0);
/*****************************************************************
* Third example:
* Print the number of packages.
*****************************************************************/
depth = hwloc_get_type_depth(topology, HWLOC_OBJ_PACKAGE);
if (depth == HWLOC_TYPE_DEPTH_UNKNOWN) {
printf("*** The number of packages is unknown\n");
} else {
printf("*** %u package(s)\n",
hwloc_get_nbobjs_by_depth(topology, depth));
}
/*****************************************************************
* Fourth example:
* Compute the amount of cache that the first logical processor
* has above it.
*****************************************************************/
levels = 0;
size = 0;
for (obj = hwloc_get_obj_by_type(topology, HWLOC_OBJ_PU, 0);
obj;
obj = obj->parent)
if (hwloc_obj_type_is_cache(obj->type)) {
levels++;
size += obj->attr->cache.size;
}
printf("*** Logical processor 0 has %d caches totaling %luKB\n",
levels, size / 1024);
/*****************************************************************
* Fifth example:
* Bind to only one thread of the last core of the machine.
*
* First find out where cores are, or else smaller sets of CPUs if
* the OS doesn't have the notion of a "core".
*****************************************************************/
depth = hwloc_get_type_or_below_depth(topology, HWLOC_OBJ_CORE);
/* Get last core. */
obj = hwloc_get_obj_by_depth(topology, depth,
hwloc_get_nbobjs_by_depth(topology, depth) - 1);
if (obj) {
/* Get a copy of its cpuset that we may modify. */
cpuset = hwloc_bitmap_dup(obj->cpuset);
/* Get only one logical processor (in case the core is
SMT/hyper-threaded). */
hwloc_bitmap_singlify(cpuset);
/* And try to bind ourself there. */
if (hwloc_set_cpubind(topology, cpuset, 0)) {
char *str;
int error = errno;
hwloc_bitmap_asprintf(&str, obj->cpuset);
printf("Couldn't bind to cpuset %s: %s\n", str, strerror(error));
free(str);
}
/* Free our cpuset copy */
hwloc_bitmap_free(cpuset);
}
/*****************************************************************
* Sixth example:
* Allocate some memory on the last NUMA node, bind some existing
* memory to the last NUMA node.
*****************************************************************/
/* Get last node. There's always at least one. */
n = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_NUMANODE);
obj = hwloc_get_obj_by_type(topology, HWLOC_OBJ_NUMANODE, n - 1);
size = 1024*1024;
m = hwloc_alloc_membind(topology, size, obj->nodeset,
HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_BYNODESET);
hwloc_free(topology, m, size);
m = malloc(size);
hwloc_set_area_membind(topology, m, size, obj->nodeset,
HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_BYNODESET);
free(m);
/* Destroy topology object. */
hwloc_topology_destroy(topology);
return 0;
}
hwloc_cpuset_t
hwloc_bitmap_t hwloc_cpuset_t
A CPU set is a bitmap whose bits are set according to CPU physical OS indexes.
Definition: hwloc.h:161
HWLOC_OBJ_NUMANODE
@ HWLOC_OBJ_NUMANODE
NUMA node. An object that contains memory that is directly and byte-accessible
to the host processors...
Definition: hwloc.h:257
HWLOC_OBJ_PACKAGE
@ HWLOC_OBJ_PACKAGE
Physical package. The physical package that usually gets inserted into a socket
on the motherboard....
Definition: hwloc.h:212
HWLOC_OBJ_PU
@ HWLOC_OBJ_PU
Processing Unit, or (Logical) Processor. An execution unit (may share a core
with some other logical ...
Definition: hwloc.h:222
HWLOC_OBJ_CORE
@ HWLOC_OBJ_CORE
Core. A computation unit (may be shared by several PUs, aka logical
processors).
Definition: hwloc.h:218
hwloc_topology_init
int hwloc_topology_init(hwloc_topology_t *topologyp)
Allocate a topology context.
hwloc_topology_t
struct hwloc_topology * hwloc_topology_t
Topology context.
Definition: hwloc.h:712
hwloc_topology_destroy
void hwloc_topology_destroy(hwloc_topology_t topology)
Terminate and free a topology context.
hwloc_topology_load
int hwloc_topology_load(hwloc_topology_t topology)
Build the actual topology.
hwloc_get_nbobjs_by_depth
unsigned hwloc_get_nbobjs_by_depth(hwloc_topology_t topology, int depth)
Returns the width of level at depth depth.
hwloc_get_root_obj
static hwloc_obj_t hwloc_get_root_obj(hwloc_topology_t topology)
Returns the top-object of the topology-tree.
hwloc_get_obj_by_depth
hwloc_obj_t hwloc_get_obj_by_depth(hwloc_topology_t topology, int depth,
unsigned idx)
Returns the topology object at logical index idx from depth depth.
hwloc_get_obj_by_type
static hwloc_obj_t hwloc_get_obj_by_type(hwloc_topology_t topology,
hwloc_obj_type_t type, unsigned idx)
Returns the topology object at logical index idx with type type.
hwloc_get_nbobjs_by_type
static int hwloc_get_nbobjs_by_type(hwloc_topology_t topology, hwloc_obj_type_t
type)
Returns the width of level type type.
hwloc_get_type_or_below_depth
static int hwloc_get_type_or_below_depth(hwloc_topology_t topology,
hwloc_obj_type_t type)
Returns the depth of objects of type type or below.
hwloc_get_type_depth
int hwloc_get_type_depth(hwloc_topology_t topology, hwloc_obj_type_t type)
Returns the depth of objects of type type.
hwloc_topology_get_depth
int hwloc_topology_get_depth(hwloc_topology_t restrict topology)
Get the depth of the hierarchical tree of objects.
HWLOC_TYPE_DEPTH_UNKNOWN
@ HWLOC_TYPE_DEPTH_UNKNOWN
No object of given type exists in the topology.
Definition: hwloc.h:849
hwloc_obj_attr_snprintf
int hwloc_obj_attr_snprintf(char *restrict string, size_t size, hwloc_obj_t
obj, const char *restrict separator, unsigned long flags)
Stringify the attributes of a given topology object into a human-readable form.
hwloc_obj_type_snprintf
int hwloc_obj_type_snprintf(char *restrict string, size_t size, hwloc_obj_t
obj, unsigned long flags)
Stringify the type of a given topology object into a human-readable form.
hwloc_set_cpubind
int hwloc_set_cpubind(hwloc_topology_t topology, hwloc_const_cpuset_t set, int
flags)
Bind current process or thread on CPUs given in physical bitmap set.
hwloc_free
int hwloc_free(hwloc_topology_t topology, void *addr, size_t len)
Free memory that was previously allocated by hwloc_alloc() or
hwloc_alloc_membind().
hwloc_alloc_membind
void * hwloc_alloc_membind(hwloc_topology_t topology, size_t len,
hwloc_const_bitmap_t set, hwloc_membind_policy_t policy, int flags)
Allocate some memory on NUMA memory nodes specified by set.
hwloc_set_area_membind
int hwloc_set_area_membind(hwloc_topology_t topology, const void *addr, size_t
len, hwloc_const_bitmap_t set, hwloc_membind_policy_t policy, int flags)
Bind the already-allocated memory identified by (addr, len) to the NUMA node(s)
specified by set.
HWLOC_MEMBIND_BYNODESET
@ HWLOC_MEMBIND_BYNODESET
Consider the bitmap argument as a nodeset.
Definition: hwloc.h:1599
HWLOC_MEMBIND_BIND
@ HWLOC_MEMBIND_BIND
Allocate memory on the specified nodes.
Definition: hwloc.h:1511
hwloc_obj_type_is_cache
int hwloc_obj_type_is_cache(hwloc_obj_type_t type)
Check whether an object type is a CPU Cache (Data, Unified or Instruction).
hwloc_bitmap_asprintf
int hwloc_bitmap_asprintf(char **strp, hwloc_const_bitmap_t bitmap)
Stringify a bitmap into a newly allocated string.
hwloc_bitmap_free
void hwloc_bitmap_free(hwloc_bitmap_t bitmap)
Free bitmap bitmap.
hwloc_bitmap_singlify
int hwloc_bitmap_singlify(hwloc_bitmap_t bitmap)
Keep a single index among those set in bitmap bitmap.
hwloc_bitmap_dup
hwloc_bitmap_t hwloc_bitmap_dup(hwloc_const_bitmap_t bitmap)
Duplicate bitmap bitmap by allocating a new bitmap and copying bitmap contents.
hwloc_obj
Structure of a topology object.
Definition: hwloc.h:423
hwloc_obj::children
struct hwloc_obj ** children
Normal children, children[0 .. arity -1].
Definition: hwloc.h:483
hwloc_obj::nodeset
hwloc_nodeset_t nodeset
NUMA nodes covered by this object or containing this object.
Definition: hwloc.h:567
hwloc_obj::os_index
unsigned os_index
OS-provided physical index number. It is not guaranteed unique across the
entire machine,...
Definition: hwloc.h:428
hwloc_obj::cpuset
hwloc_cpuset_t cpuset
CPUs covered by this object.
Definition: hwloc.h:539
hwloc_obj::arity
unsigned arity
Number of normal children. Memory, Misc and I/O children are not listed here
but rather in their dedi...
Definition: hwloc.h:479
hwloc_obj::type
hwloc_obj_type_t type
Type of object.
Definition: hwloc.h:425
hwloc_obj::attr
union hwloc_obj_attr_u * attr
Object type-specific Attributes, may be NULL if no attribute value was found.
Definition: hwloc.h:442
hwloc_obj::parent
struct hwloc_obj * parent
Parent, NULL if root (Machine object)
Definition: hwloc.h:473
hwloc_obj_attr_u::cache
struct hwloc_obj_attr_u::hwloc_cache_attr_s cache
hwloc_obj_attr_u::hwloc_cache_attr_s::size
hwloc_uint64_t size
Size of cache in bytes.
Definition: hwloc.h:644
hwloc provides a pkg-config executable to obtain relevant compiler and linker
flags. For example, it can be used thusly to compile applications that utilize
the hwloc library (assuming GNU Make):
CFLAGS += $(shell pkg-config --cflags hwloc)
LDLIBS += $(shell pkg-config --libs hwloc)
hwloc-hello: hwloc-hello.c
$(CC) hwloc-hello.c $(CFLAGS) -o hwloc-hello $(LDLIBS)
On a machine 2 processor packages -- each package of which has two processing
cores -- the output from running hwloc-hello could be something like the
following:
shell$ ./hwloc-hello
*** Objects at level 0
Index 0: Machine
*** Objects at level 1
Index 0: Package#0
Index 1: Package#1
*** Objects at level 2
Index 0: Core#0
Index 1: Core#1
Index 2: Core#3
Index 3: Core#2
*** Objects at level 3
Index 0: PU#0
Index 1: PU#1
Index 2: PU#2
Index 3: PU#3
*** Printing overall tree
Machine
Package#0
Core#0
PU#0
Core#1
PU#1
Package#1
Core#3
PU#2
Core#2
PU#3
*** 2 package(s)
*** Logical processor 0 has 0 caches totaling 0KB
shell$
History / Credits
hwloc is the evolution and merger of the libtopology project and the Portable
Linux Processor Affinity (PLPA) (https://www.open-mpi.org/projects/plpa/)
project. Because of functional and ideological overlap, these two code bases
and ideas were merged and released under the name "hwloc" as an Open MPI
sub-project.
libtopology was initially developed by the Inria Runtime Team-Project. PLPA was
initially developed by the Open MPI development team as a sub-project. Both are
now deprecated in favor of hwloc, which is distributed as an Open MPI
sub-project.
Further Reading
The documentation chapters include
* Terms and Definitions
* Command-Line Tools
* Environment Variables
* CPU and Memory Binding Overview
* I/O Devices
* Miscellaneous objects
* Object attributes
* Topology Attributes: Distances, Memory Attributes and CPU Kinds
* Importing and exporting topologies from/to XML files
* Synthetic topologies
* Interoperability With Other Software
* Thread Safety
* Components and plugins
* Embedding hwloc in Other Software
* Frequently Asked Questions (FAQ)
* Upgrading to the hwloc 2.0 API
Make sure to have had a look at those too!
See https://www.open-mpi.org/projects/hwloc/doc/ for more hwloc documentation,
actual links to related pages, images, etc.