Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CA-378768: capture block schedulers #4

Merged
merged 1 commit into from
Sep 19, 2023
Merged

Conversation

MarkSymsCtx
Copy link
Contributor

No description provided.

xen-bugtool Outdated Show resolved Hide resolved
@MarkSymsCtx MarkSymsCtx force-pushed the CA-378768 branch 2 times, most recently from 8ea204b to 5338de0 Compare June 20, 2023 12:10
andyhhp
andyhhp previously approved these changes Jun 20, 2023
xen-bugtool Outdated
@@ -765,6 +769,7 @@ exclude those logs from the archive.
cmd_output(CAP_DISK_INFO, [SG_MAP, '-x'])
func_output(CAP_DISK_INFO, 'scsi-hosts', dump_scsi_hosts)
cmd_output(CAP_DISK_INFO, [LVDISPLAY, '--map'])
cmd_output(CAP_BLOCK_SCHEDULER, [LSBLK, '-o', 'SCHED,NAME'])
Copy link
Collaborator

@bernhardkaindl bernhardkaindl Jun 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Example to get more information in the same call:

lsblk -io type,name,sched,tran,rota,log-sec,rq-size,vendor,model,serial
TYPE NAME                                                            SCHED TRAN   ROTA LOG-SEC RQ-SIZE VENDOR   MODEL            SERIAL
disk sda                                                             cfq   sata      1     512     128 ATA      ST1000NX0423     W470MWXE
part |-sda4                                                          cfq             1     512     128                           
part |-sda2                                                          cfq             1     512     128                           
part |-sda5                                                          cfq             1     512     128                           
part |-sda3                                                          cfq             1     512     128                           
lvm  | `-VG_XenStorage--03086bb3--efa1--a74e--fbc5--16077bf518e7-MGT                 1     512     128                           
part |-sda1                                                          cfq             1     512     128                           
part `-sda6                                                          cfq             1     512     128

As in this example, I'd also add -i to use ASCII characters for the tree formatting as otherwise lsblk would use Unicode characters to format the tree if it runs inside a UTF-8 locale (like when running it from the shell) but ASCII format if running it from some XAPI plugin (in which the en_US.UTF-8 default locale of XenServer) is not set. So for a clean and more-stable ASCII output, I'd recommend adding it.

Because , are used in the arguments, they would be carried over into the generated filename by cmd_output, so it would be good to add label="lsblk" to cmd_output(), to get a file like bug-report-20230622161233/lsblk.out.

If that's not done, a file like

bug-report-20230622161233/lsblk-osched,name.out

would be created.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started with using /sys/class/block/*/queue/scheduler and this has a number of problems

  1. picks up all kernel block devices, including the tdX devices for the live, active VBDs attached to VM. Where there may be many hundred of these. Filtering to just sd* and dm-* would be possible but potentially risks missing something that could be important in triaging a customer issue.
  2. The output results in one file in the bugtool for each entry in /sys/class/block/*/queue which results in a lot of files to deal with when triaging the logs whereas lsblk gives the result in one single file.
  3. Due to the implementation of sysfs the stat call for reading the size of the logs will always report 4KB so you very rapidly (and erroneously) exceed the size limit for the capability, or the size limit has to be set artificially high.

Copy link
Collaborator

@bernhardkaindl bernhardkaindl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mg12 Do you have a preference for just the file or a more complete set of values from lsblk, and would you need additional values from lsblk? (lsblk -h has a list)

As described in my review comment for the new cmd_output call, think this would be good:

  • Use -i or --ascii to not get Unicode tree formatting in the data
  • Add label=lsblk-sched" to cmd_output (or so) not get "," in the archived file name
  • To retrieve more info, not just the scheduler, what about this proposal:
    lsblk -io type,name,sched,tran,rota,log-sec,rq-size,vendor,model,serial

@edwintorok
Copy link
Contributor

For performance testing purposes we need the scheduler, and 'rotational' is also useful (e.g. if 'cfq' is used and it has wrongly determined that an SSD is rotational because it is hidden behind too many layers of PERC controllers, etc.).

I wouldn't mind getting the rest of the fields you suggested, except for 'serial' which might have some privacy implications and we don't really need it, so best not to explicitly capture it. 'vendor/model' can be useful to determine which drive got assigned which letter, so that is probably fine.

@mg12
Copy link
Collaborator

mg12 commented Aug 11, 2023

@MarkSymsCtx: what @edwintorok suggested above looks good to me, essentially @bernhardkaindl's parameters minus serial:
lsblk -io type,name,sched,tran,rota,log-sec,rq-size,vendor,model

mg12
mg12 previously approved these changes Aug 30, 2023
edwintorok
edwintorok previously approved these changes Aug 31, 2023
bernhardkaindl
bernhardkaindl previously approved these changes Sep 5, 2023
xen-bugtool Outdated Show resolved Hide resolved
Signed-off-by: Mark Syms <mark.syms@citrix.com>
@mg12 mg12 merged commit ea557ca into xenserver:master Sep 19, 2023
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants