Debugging support

First ensure that debugfs is mounted.

# mount -t debugfs none /sys/kernel/debug

A list of available debugging functions can be found in /sys/kernel/debug/pib/pib_X/.
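A script can check whether debugfs is already mounted before running the mount command above. The following is a minimal sketch, assuming a Linux system where /proc/mounts lists the mounted file systems; the function name debugfs_mounted is illustrative and not part of pib.

```shell
# Succeeds if a debugfs file system is listed in the given mounts table.
# The optional argument (default /proc/mounts) exists only to ease testing.
debugfs_mounted() {
    # The third field of each /proc/mounts line is the file system type.
    awk '$3 == "debugfs" { found = 1 } END { exit !found }' "${1:-/proc/mounts}"
}

# Usage (as root):
#   debugfs_mounted || mount -t debugfs none /sys/kernel/debug
#   ls /sys/kernel/debug/pib/
```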

Object inspection

Object inspection displays IB objects: ucontext, cq, pd, mr, ah, srq and qp.

Each IB object except a QP is assigned a unique number (OID) at creation time. OIDs range from 1 to N; zero indicates an invalid object.

A QP's OID is the same as its QPN.

In addition, each IB object other than a ucontext that is created via uverbs is assigned a user handle ID (UHWD). User-land programs can read the user handle ID from the handle field of struct ibv_pd, struct ibv_mr, struct ibv_ah, struct ibv_srq and struct ibv_qp.

ucontext displays a list of ucontext(s).

OID  CREATIONTIME                      PID   COMM
0007 [2014-02-08 02:59:05.631,314,154] 13445 ibv_rc_pingpong
0008 [2014-02-08 02:59:10.044,493,257] 13447 ibv_srq_pingpon
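
Because the listing is whitespace-separated, awk can map a process ID to its ucontext OID. This is a rough sketch that assumes the column layout shown above; note that the bracketed timestamp contains one space and therefore occupies two fields. The function name ucontext_of_pid is hypothetical.

```shell
# Print the OID of each ucontext owned by the process ID given as $1.
# Reads a ucontext listing on standard input:
#   $1=OID  $2,$3=CREATIONTIME  $4=PID  $5=COMM
ucontext_of_pid() {
    awk -v pid="$1" '$1 != "OID" && $4 == pid { print $1 }'
}

# Usage, assuming the first pib device is pib_0:
#   ucontext_of_pid 13445 < /sys/kernel/debug/pib/pib_0/ucontext
```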

cq displays a list of completion queue(s).

OID  UCTX UHWD  CREATIONTIME                      S   MAX   CUR   TYPE NOTIFY
0001 KERN NOHWD [2014-02-08 02:46:03.057,814,080] OK   1280     0 NONE WAIT
0002 KERN NOHWD [2014-02-08 02:46:03.061,807,815] OK   1280     0 NONE WAIT
0003 KERN NOHWD [2014-02-08 02:46:03.070,253,748] OK    642     0 NONE WAIT
0004 KERN NOHWD [2014-02-08 02:46:03.070,370,634] OK    128    13 NONE WAIT
0005 KERN NOHWD [2014-02-08 02:46:03.076,738,203] OK    642     0 NONE WAIT
0006 KERN NOHWD [2014-02-08 02:46:03.076,851,870] OK    128    14 NONE WAIT
000d    7     0 [2014-02-08 02:59:05.631,369,851] OK    501     0 NONE WAIT
000e    8     1 [2014-02-08 02:59:10.044,547,908] OK    516     0 NONE WAIT
  • UCTX displays the OID of the ucontext that this cq belongs to. If the cq was created by kernel code, UCTX indicates KERN.
  • UHWD displays this cq's user handle ID.
  • S indicates "OK" or "ERR" as this cq's state.
  • TYPE indicates NONE (no completion channel attached), SOLI (solicited only) or COMP (all completions).
  • NOTIFY indicates NOTIFY or WAIT.
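
The cq listing can be filtered with awk to spot queues that need attention. The sketch below assumes the column layout shown above and assumes that CUR is the number of entries currently queued; the function name cq_attention is hypothetical.

```shell
# Print OID, state and CUR of every CQ that is not in the OK state or
# has CUR > 0. Reads a cq listing on standard input.
# The bracketed timestamp contains one space, so it spans fields 4 and 5:
#   $1=OID $2=UCTX $3=UHWD $4,$5=CREATIONTIME $6=S $7=MAX $8=CUR $9=TYPE $10=NOTIFY
cq_attention() {
    awk '$1 != "OID" && NF >= 10 && ($6 != "OK" || $8 + 0 > 0) { print $1, $6, $8 }'
}

# Usage, assuming the first pib device is pib_0:
#   cq_attention < /sys/kernel/debug/pib/pib_0/cq
```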

pd displays a list of protection domain(s).

OID  UCTX UHWD  CREATIONTIME
0001 KERN NOHWD [2014-02-08 02:46:03.058,019,052]
0002 KERN NOHWD [2014-02-08 02:46:03.062,021,207]
0003 KERN NOHWD [2014-02-08 02:46:03.067,386,427]
0004 KERN NOHWD [2014-02-08 02:46:03.073,458,442]
000b    7     0 [2014-02-08 02:59:05.631,326,473]
000c    8     1 [2014-02-08 02:59:10.044,504,640]

mr displays a list of memory region(s).

OID  UCTX UHWD  CREATIONTIME                      PD   START            LENGTH           LKEY     RKEY     DMA AC
0001 KERN NOHWD [2014-02-08 02:46:03.058,035,303] 0001 0000000000000000 ffffffffffffffff 63a43000 63a44000 DMA 1
0002 KERN NOHWD [2014-02-08 02:46:03.061,804,672] 0001 0000000000000000 ffffffffffffffff 28840001 38843001 DMA 1
0013    7     0 [2014-02-08 02:59:05.631,362,956] 000b 0000000001e4d000 0000000000001000 0bd78000 7bd7b000 USR 1
0014    8     1 [2014-02-08 02:59:10.044,541,101] 000c 00000000006a1000 0000000000001000 7a4e8000 4a4ef000 USR 1
  • DMA indicates DMA or USR.

ah displays a list of address handle(s).

OID    UCTX UHWD  CREATIONTIME                      PD   DLID AC PORT
000017 KERN NOHWD [2014-02-08 02:46:03.361,759,808] 0001 0001  0 1
000019 KERN NOHWD [2014-02-08 02:46:03.361,770,318] 0002 0001  0 2
00001a KERN NOHWD [2014-02-08 02:46:06.193,034,040] 0003 c000  1 1
000023 KERN NOHWD [2014-02-08 02:46:06.199,621,847] 0004 c006  1 2

srq displays a list of shared receive queue(s).

OID  UCTX UHWD  CREATIONTIME                      PD   S   MAX   CUR
0001 KERN NOHWD [2014-02-08 02:46:03.067,477,973] 0003 OK    256     0
0002 KERN NOHWD [2014-02-08 02:46:03.073,521,972] 0004 OK    256     0
0006    8     0 [2014-02-08 02:59:10.044,600,849] 000c OK    500     0

qp displays a list of queue pair(s).

OID    UCTX UHWD  CREATIONTIME                      PD   QT  STATE S-CQ R-CQ SRQ  MAX-S CUR-S MAX-R CUR-R
000000 KERN NOHWD [2014-02-08 02:46:03.058,037,569] 0001 SMI RTS   0001 0001 0000   128     0   512     0
000001 KERN NOHWD [2014-02-08 02:46:03.058,604,390] 0001 GSI RTS   0001 0001 0000   128     0   512     0
000000 KERN NOHWD [2014-02-08 02:46:03.062,054,749] 0002 SMI RTS   0002 0002 0000   128     0   512     0
000001 KERN NOHWD [2014-02-08 02:46:03.063,060,784] 0002 GSI RTS   0002 0002 0000   128     0   512     0
547575 KERN NOHWD [2014-02-08 02:46:03.070,408,186] 0003 UD  RTS   0004 0003 0000   128     0   256     0
547576 KERN NOHWD [2014-02-08 02:46:03.076,867,571] 0004 UD  RTS   0006 0005 0000   128     0   256     0
5475ab    8     1 [2014-02-08 02:59:10.044,746,008] 000c RC  INIT  000e 000e 0006     1     0     0     0
5475bb    9     0 [2014-02-08 03:01:35.975,160,059] 000d UD  INIT  000f 000f 0000     1     0   500     0
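
When a connection does not come up, it can help to list only the QPs that have not reached a given state. This is a minimal sketch that assumes the column layout shown above; the function name qp_not_in_state is hypothetical.

```shell
# Print OID (= QPN), type and state of every QP whose state differs from
# the one given as $1 (default RTS). Reads a qp listing on standard input.
# The bracketed timestamp spans fields 4 and 5:
#   $1=OID $2=UCTX $3=UHWD $4,$5=CREATIONTIME $6=PD $7=QT $8=STATE ...
qp_not_in_state() {
    awk -v want="${1:-RTS}" \
        '$1 != "OID" && NF >= 15 && $8 != want { print $1, $7, $8 }'
}

# Usage, assuming the first pib device is pib_0:
#   qp_not_in_state RTS < /sys/kernel/debug/pib/pib_0/qp
```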

Execution trace

trace displays the execution trace.

  • API indicates that a user-land program or kernel module called an IB API.
  • SEND indicates that pib's socket transmitted an IB packet encapsulated in a UDP packet.
  • RCV1 indicates that pib's socket received a UDP packet.
  • RCV2 indicates that pib's socket accepted the received UDP packet as an encapsulated IB packet.
  • RTRY indicates that pib's requester performed a retry due to a local ACK timeout.
  • COMP indicates that pib generated a successful completion or a completion error.
  • ASYNC indicates that an asynchronous error (including an event) occurred.
  • TIME is an internal entry.

You can insert a bookmark message into the execution trace.

$ echo "Benchmark Start." > /sys/kernel/debug/pib/pib_0/trace

[2014-02-09 03:51:58.198,557,202] RCV1 UD/SEND_ONLY       PORT:1 PSN:001cee LEN:0264 SLID:ffff DLID:0001 DQPN:000000
[2014-02-09 03:51:58.198,557,944] RCV2 UD/SEND_ONLY       PORT:1 PSN:001cee DATA:0256 SQPN:000000
[2014-02-09 03:51:58.198,562,915] API  req_notify_cq      OID:0001
[2014-02-09 03:51:58.198,563,055] API  poll_cq            OID:0001
[2014-02-09 03:51:58.198,571,649] API  destroy_ah         OID:001d35
[2014-02-09 03:51:58.198,573,892] API  post_recv          OID:000000
[2014-02-09 03:51:58.198,574,445] API  poll_cq            OID:0001
[2014-02-09 03:52:06.477,762,060] TIME
[2014-02-09 03:52:06.477,762,084] BOOKMARK Benchmark Start.
[2014-02-09 03:52:08.198,759,305] TIME
[2014-02-09 03:52:08.198,759,409] API  create_ah          OID:001d36
[2014-02-09 03:52:08.198,763,268] API  post_send          OID:000000
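
Trace output grows quickly, so summarizing it or cutting it at a bookmark is often more practical than reading it whole. The two helpers below are a sketch that assumes the line format shown above, where the entry type is the third whitespace-separated field; the function names are hypothetical, and the bookmark message is assumed to contain no backslashes.

```shell
# Count trace entries per type (API, SEND, RCV1, BOOKMARK, ...).
# Reads trace text on standard input; fields 1 and 2 are the timestamp.
trace_summary() {
    awk 'NF >= 3 { count[$3]++ } END { for (t in count) print t, count[t] }' |
        LC_ALL=C sort
}

# Print only the entries that follow the first BOOKMARK line whose text
# contains the message given as $1. The bookmark line itself is not printed.
trace_since_bookmark() {
    awk -v msg="$1" 'seen { print } $3 == "BOOKMARK" && index($0, msg) { seen = 1 }'
}

# Usage, assuming the first pib device is pib_0:
#   trace_summary < /sys/kernel/debug/pib/pib_0/trace
#   trace_since_bookmark "Benchmark Start." < /sys/kernel/debug/pib/pib_0/trace
```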

Error injection

You can inject a CQ, QP or SRQ asynchronous error via inject_err.

$ echo "CQ 0004" > inject_err
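
In a test script, a small wrapper can guard against typos in the object type before anything is written. This is only a sketch: it assumes the "TYPE OID" form shown above, and the function name and argument order are illustrative, not part of pib.

```shell
# Write "<TYPE> <OID>" to an inject_err file after checking that TYPE is
# one of CQ, QP or SRQ. Arguments: $1=TYPE, $2=OID, $3=path to inject_err.
inject_async_err() {
    case "$1" in
        CQ|QP|SRQ) ;;
        *) echo "unsupported object type: $1" >&2; return 1 ;;
    esac
    echo "$1 $2" > "$3"
}

# Usage, assuming the first pib device is pib_0:
#   inject_async_err CQ 0004 /sys/kernel/debug/pib/pib_0/inject_err
```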