@XiaoNi87
Collaborator

load_devices may replace the superblock pointer if it finds a better one on the member disks. When it does, it frees the memory that was passed to Assemble through the argument pointer st. The function main does not know this, so it may dereference memory that has already been freed after Assemble returns.

This can be reproduced by:
mdadm -CR /dev/md0 -l1 -n2 /dev/loop0 /dev/loop1 --assume-clean
mdadm -Ss
mdadm -A -e 1.2 /dev/md0 /dev/loop0 /dev/loop1

This patch passes a double pointer to Assemble, so that Assemble can update the caller's pointer after load_devices.
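
As a rough illustration of the double-pointer approach described above, here is a minimal, self-contained sketch. It is not the mdadm code: `struct supertype_demo`, `load_devices_demo`, `assemble_demo` and `demo_double_pointer` are hypothetical stand-ins, and the `events` field is only an assumed way to decide which superblock is "better".

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for mdadm's struct supertype. */
struct supertype_demo {
    int events; /* assumed "freshness" counter, for illustration only */
};

/*
 * Hypothetical stand-in for load_devices(): if a better superblock is
 * found, free the one that was passed in and store the new one through
 * the double pointer.
 */
static void load_devices_demo(struct supertype_demo **stp, int best_events)
{
    struct supertype_demo *better;

    if (best_events <= (*stp)->events)
        return; /* the current superblock is already the best one */

    better = malloc(sizeof(*better));
    if (!better)
        return; /* keep the old superblock on allocation failure */

    better->events = best_events;
    free(*stp);
    *stp = better;
}

/* Hypothetical stand-in for Assemble() taking a double pointer. */
static int assemble_demo(struct supertype_demo **super, int best_events)
{
    struct supertype_demo *st = *super;

    load_devices_demo(&st, best_events);

    /* The old object may have been freed: publish the current pointer. */
    *super = st;
    return 0;
}

/* Caller side: after the call, st is valid whether or not it was replaced. */
static int demo_double_pointer(void)
{
    struct supertype_demo *st = malloc(sizeof(*st));
    int events;

    if (!st)
        return -1;
    st->events = 1;

    assemble_demo(&st, 5);

    events = st->events; /* safe: st was updated through the double pointer */
    free(st);
    return events;
}
```

If `assemble_demo` took a plain `struct supertype_demo *` instead, the caller's `st` would still point at the freed object after the call, which is the use-after-free described above.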

@mtkaczyk
Member

@mwilck would you like to take a look?

Contributor

@mwilck mwilck left a comment

Looks good to me.

Assemble.c Outdated
/*
 * load_devices may release superblock which passes to it
 * and alloc new superblock for it.
 */
*super = st;
Contributor

Perhaps it would be more obvious if you just pass super instead of &st to load_devices?

Contributor

If you do that, use of st should probably be replaced by (*super).

Collaborator Author

That would make the patch much larger, because st is used in many places after load_devices.

@XiaoNi87
Collaborator Author

Hi all

This patch introduces a regression, which is caught by test case 07autoassemble. The command is:
/usr/bin/mdadm -As -c /dev/null --homehost=testing -vvv
loop0 7:0 0 20M 0 loop
└─md1 9:1 0 19M 0 raid1
  └─md127 9:127 0 34M 0 raid0
loop1 7:1 0 20M 0 loop
└─md1 9:1 0 19M 0 raid1
  └─md127 9:127 0 34M 0 raid0
loop2 7:2 0 20M 0 loop
└─md2 9:2 0 19M 0 raid1
  └─md127 9:127 0 34M 0 raid0
loop3 7:3 0 20M 0 loop
└─md2 9:2 0 19M 0 raid1
  └─md127 9:127 0 34M 0 raid0

After investigating, the relevant code is:
do {
    struct mddev_dev *devlist = conf_get_devs();
    acnt = 0;
    do {
        rv2 = Assemble(ss, NULL,
                       ident,
                       devlist, c);
        printf("%s:%d Assemble %d\n", __func__, __LINE__, rv2);
        if (rv2 == 0) {
            cnt++;
            acnt++;
        }
    } while (rv2 != 2);
    /*
     * In case there are stacked devices, we need to go around again
     */
} while (acnt);

This loop relies on the original superblock pointer on every call. With this patch, the pointer is replaced whenever the original memory is released. So I plan to remove the first argument, the supertype, from the Assemble function. I'll update this PR after fixing these problems.
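
To make the plan above concrete, here is a minimal sketch of an Assemble-like function that allocates and frees the superblock itself, so the calling loop never holds a pointer that can go stale. It is only an illustration under assumptions: `struct supertype_demo`, `load_devices_demo`, `assemble_owned_demo`, `assemble_all_demo` and the `pending` counter are hypothetical, and the return values 0, 1 and 2 only mimic the convention visible in the loop quoted above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for mdadm's struct supertype. */
struct supertype_demo {
    int events;
};

/* Hypothetical stand-in for load_devices(): may free *stp and replace it. */
static void load_devices_demo(struct supertype_demo **stp)
{
    struct supertype_demo *better = malloc(sizeof(*better));

    if (!better)
        return; /* keep the current superblock */

    better->events = (*stp)->events + 1;
    free(*stp);
    *stp = better;
}

/*
 * Hypothetical Assemble-like function that owns the superblock.
 * Returns 0 if an array was assembled, 1 on error, 2 if nothing is left.
 */
static int assemble_owned_demo(int *pending)
{
    struct supertype_demo *st;

    if (*pending == 0)
        return 2;

    st = calloc(1, sizeof(*st));
    if (!st)
        return 1;

    load_devices_demo(&st); /* st may now point to a new allocation */
    (*pending)--;

    free(st); /* release whatever st currently points to */
    return 0;
}

/* Caller loop: no superblock pointer crosses the function boundary. */
static int assemble_all_demo(int pending)
{
    int cnt = 0;
    int rv;

    do {
        rv = assemble_owned_demo(&pending);
        if (rv == 0)
            cnt++;
    } while (rv == 0);

    return cnt;
}
```

Because allocation and release happen in the same function, each iteration of the loop starts with a fresh superblock and nothing outside the function can dereference freed memory.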

@XiaoNi87 XiaoNi87 force-pushed the assemble-fix branch 2 times, most recently from 7af474e to 8a3a41d Compare October 23, 2025 01:43
@XiaoNi87 XiaoNi87 changed the title from "mdadm/Assemble: reset superblock pointer after replacing it" to "mdadm/Assemble: alloc superblock in Assemble" Oct 23, 2025
@XiaoNi87
Collaborator Author

Hi all

I updated the PR, and the regression tests have passed:
Testing on linux-6.17.0-rc3+ kernel
tests/00confnames... Execution time (seconds): 2 succeeded
tests/00createnames... Execution time (seconds): 3 succeeded
tests/00linear... Execution time (seconds): 0 succeeded
tests/00multipath... Execution time (seconds): 0 succeeded
tests/00names... Execution time (seconds): 5 succeeded
tests/00raid0... Execution time (seconds): 1 succeeded
tests/00raid1... Execution time (seconds): 7 succeeded
tests/00raid10... Execution time (seconds): 5 succeeded
tests/00raid4... Execution time (seconds): 2 succeeded
tests/00raid5... Execution time (seconds): 12 succeeded
tests/00raid5-zero... Execution time (seconds): 11 succeeded
tests/00raid6... Execution time (seconds): 3 succeeded
tests/00readonly... Execution time (seconds): 81 succeeded
tests/01r1fail... Execution time (seconds): 53 succeeded
tests/01r5fail... Execution time (seconds): 23 succeeded
tests/01r5integ... Execution time (seconds): 172 succeeded
tests/01raid6integ... Execution time (seconds): 1812 succeeded
tests/01replace... Execution time (seconds): 206 succeeded
tests/02lineargrow... Execution time (seconds): 0 succeeded
tests/02r1add... Execution time (seconds): 35 succeeded
tests/02r1grow... Execution time (seconds): 39 succeeded
tests/02r5grow... Execution time (seconds): 55 succeeded
tests/02r6grow... Execution time (seconds): 39 succeeded
tests/03assem-incr... Execution time (seconds): 3 succeeded
tests/03r0assem... Execution time (seconds): 4 succeeded
tests/03r5assem... Execution time (seconds): 21 succeeded
tests/03r5assemV1... Execution time (seconds): 23 succeeded
tests/04r0update... Execution time (seconds): 1 succeeded
tests/04r1update... Execution time (seconds): 11 succeeded
tests/04r5swap... Execution time (seconds): 11 succeeded
tests/04update-metadata... Execution time (seconds): 41 succeeded
tests/04update-uuid... Execution time (seconds): 3 succeeded
tests/05r1-add-badblocks... Execution time (seconds): 5 succeeded
tests/05r1-add-internalbitmap... Execution time (seconds): 9 succeeded
tests/05r1-add-internalbitmap-v1a... Execution time (seconds): 8 succeeded
tests/05r1-add-internalbitmap-v1b... Execution time (seconds): 9 succeeded
tests/05r1-add-internalbitmap-v1c... Execution time (seconds): 9 succeeded
tests/05r1-failfast... Execution time (seconds): 31 succeeded
tests/05r1-grow-internal... Execution time (seconds): 20 succeeded
tests/05r1-grow-internal-1... Execution time (seconds): 20 succeeded
tests/05r1-internalbitmap... Execution time (seconds): 35 succeeded
tests/05r1-internalbitmap-v1a... Execution time (seconds): 35 succeeded
tests/05r1-internalbitmap-v1b... Execution time (seconds): 35 succeeded
tests/05r1-internalbitmap-v1c... Execution time (seconds): 35 succeeded
tests/05r1-re-add... Execution time (seconds): 40 succeeded
tests/05r1-re-add-nosuper... Execution time (seconds): 38 succeeded
tests/05r1-remove-internalbitmap... Execution time (seconds): 9 succeeded
tests/05r1-remove-internalbitmap-v1a... Execution time (seconds): 8 succeeded
tests/05r1-remove-internalbitmap-v1b... Execution time (seconds): 9 succeeded
tests/05r1-remove-internalbitmap-v1c... Execution time (seconds): 9 succeeded
tests/05r5-internalbitmap... Execution time (seconds): 37 succeeded
tests/06name... Execution time (seconds): 0 succeeded
tests/06sysfs... Execution time (seconds): 1 succeeded
tests/06wrmostly... Execution time (seconds): 2 succeeded
tests/07autoassemble... Execution time (seconds): 24 succeeded
tests/07autodetect... skipping - see /var/tmp/07autodetect.log and /var/tmp/07autodetect.log for details
tests/07changelevelintr... Execution time (seconds): 61 succeeded
tests/07layouts... Execution time (seconds): 843 succeeded
tests/07reshape5intr... Execution time (seconds): 84 succeeded
tests/07testreshape5... Execution time (seconds): 126 succeeded
tests/10ddf-create... Execution time (seconds): 96 succeeded
tests/10ddf-create-fail-rebuild... Execution time (seconds): 58 succeeded
tests/10ddf-fail-readd... Execution time (seconds): 72 succeeded
tests/10ddf-fail-spare... Execution time (seconds): 19 succeeded
tests/10ddf-fail-stop-readd... Execution time (seconds): 82 succeeded
tests/10ddf-fail-twice... Execution time (seconds): 130 succeeded
tests/10ddf-geometry... Execution time (seconds): 12 succeeded
tests/10ddf-sudden-degraded... Execution time (seconds): 67 succeeded
tests/11spare-migration... Execution time (seconds): 1193 succeeded
tests/12imsm-r0_2d-grow-r0_3d... Execution time (seconds): 164 succeeded
tests/12imsm-r0_2d-grow-r0_4d... Execution time (seconds): 193 succeeded
tests/12imsm-r0_2d-grow-r0_5d... Execution time (seconds): 220 succeeded
tests/12imsm-r0_3d-grow-r0_4d... Execution time (seconds): 165 succeeded
tests/12imsm-r5_3d-grow-r5_4d... Execution time (seconds): 145 succeeded
tests/12imsm-r5_3d-grow-r5_5d... Execution time (seconds): 173 succeeded
tests/13imsm-r0_r0_2d-grow-r0_r0_4d... Execution time (seconds): 220 succeeded
tests/13imsm-r0_r0_2d-grow-r0_r0_5d... Execution time (seconds): 248 succeeded
tests/13imsm-r0_r0_3d-grow-r0_r0_4d... Execution time (seconds): 193 succeeded
tests/13imsm-r0_r5_3d-grow-r0_r5_4d... Execution time (seconds): 159 succeeded
tests/13imsm-r0_r5_3d-grow-r0_r5_5d... Execution time (seconds): 187 succeeded
tests/13imsm-r5_r0_3d-grow-r5_r0_4d... Execution time (seconds): 173 succeeded
tests/13imsm-r5_r0_3d-grow-r5_r0_5d... Execution time (seconds): 201 succeeded
tests/14imsm-r0_3d_no_spares-migrate-r5_3d... Execution time (seconds): 83 succeeded
tests/14imsm-r0_3d-r5_3d-migrate-r5_4d-r5_4d... Execution time (seconds): 118 succeeded
tests/14imsm-r0_r0_2d-takeover-r10_4d... Execution time (seconds): 111 succeeded
tests/14imsm-r10_4d-grow-r10_5d... Execution time (seconds): 90 succeeded
tests/14imsm-r10_r5_4d-takeover-r0_2d... Execution time (seconds): 71 succeeded
tests/14imsm-r5_3d-grow-r5_5d-no-spares... Execution time (seconds): 63 succeeded
tests/14imsm-r5_3d-migrate-r4_3d... Execution time (seconds): 63 succeeded
tests/15imsm-r0_3d_64k-migrate-r0_3d_256k... Execution time (seconds): 96 succeeded
tests/15imsm-r5_3d_4k-migrate-r5_3d_256k... Execution time (seconds): 76 succeeded
tests/15imsm-r5_3d_64k-migrate-r5_3d_256k... Execution time (seconds): 76 succeeded
tests/15imsm-r5_6d_4k-migrate-r5_6d_256k... Execution time (seconds): 76 succeeded
tests/15imsm-r5_r0_3d_64k-migrate-r5_r0_3d_256k... Execution time (seconds): 117 succeeded
tests/16imsm-r0_3d-migrate-r5_4d... Execution time (seconds): 115 succeeded
tests/16imsm-r0_5d-migrate-r5_6d... Execution time (seconds): 116 succeeded
tests/16imsm-r5_3d-migrate-r0_3d... Execution time (seconds): 63 succeeded
tests/16imsm-r5_5d-migrate-r0_5d... Execution time (seconds): 63 succeeded
tests/18imsm-1d-takeover-r0_1d... Execution time (seconds): 55 succeeded
tests/18imsm-1d-takeover-r1_2d... Execution time (seconds): 56 succeeded
tests/18imsm-r0_2d-takeover-r10_4d... Execution time (seconds): 150 succeeded
tests/19raid6check... Execution time (seconds): 279 succeeded
tests/19repair-does-not-destroy... Execution time (seconds): 11 succeeded
tests/21raid5cache... Execution time (seconds): 49 succeeded
tests/23rdev-lifetime... Execution time (seconds): 3 succeeded
tests/24raid10deadlock... skipping - see /var/tmp/24raid10deadlock.log and /var/tmp/24raid10deadlock.log for details
tests/24raid456deadlock... Execution time (seconds): 124 succeeded
tests/25raid456-recovery-while-reshape... Execution time (seconds): 13 succeeded
tests/25raid456-reshape-corrupt-data... Execution time (seconds): 19 succeeded
tests/25raid456-reshape-deadlock... Execution time (seconds): 7 succeeded
Removed IMSM_DEVNAME_AS_SERIAL=1 from systemd environment.
Removed IMSM_NO_PLATFORM=1 from systemd environment.

Currently the superblock is allocated outside Assemble and freed outside
Assemble. But the memory can be freed and reallocated inside Assemble, so
freed memory may be dereferenced outside Assemble. This patch moves the
memory management into Assemble, which is safer and reduces the number of
input arguments.

This can be reproduced by:
mdadm -CR /dev/md0 -l1 -n2 /dev/loop0 /dev/loop1 --assume-clean
mdadm -Ss
mdadm -A -e 1.2 /dev/md0 /dev/loop0 /dev/loop1

Signed-off-by: Xiao Ni <xni@redhat.com>
@XiaoNi87
Collaborator Author

Regression tests pass with the latest update:
tests/00confnames... Execution time (seconds): 2 succeeded
tests/00createnames... Execution time (seconds): 2 succeeded
tests/00linear... Execution time (seconds): 1 succeeded
tests/00multipath... Execution time (seconds): 0 succeeded
tests/00names... Execution time (seconds): 4 succeeded
tests/00raid0... Execution time (seconds): 2 succeeded
tests/00raid1... Execution time (seconds): 6 succeeded
tests/00raid10... Execution time (seconds): 5 succeeded
tests/00raid4... Execution time (seconds): 3 succeeded
tests/00raid5... Execution time (seconds): 11 succeeded
tests/00raid5-zero... Execution time (seconds): 11 succeeded
tests/00raid6... Execution time (seconds): 3 succeeded
tests/00readonly... Execution time (seconds): 80 succeeded
tests/01r1fail... Execution time (seconds): 54 succeeded
tests/01r5fail... Execution time (seconds): 22 succeeded
tests/01r5integ... Execution time (seconds): 174 succeeded
tests/01raid6integ... Execution time (seconds): 1819 succeeded
tests/01replace... Execution time (seconds): 207 succeeded
tests/02lineargrow... Execution time (seconds): 1 succeeded
tests/02r1add... Execution time (seconds): 35 succeeded
tests/02r1grow... Execution time (seconds): 38 succeeded
tests/02r5grow... Execution time (seconds): 56 succeeded
tests/02r6grow... Execution time (seconds): 39 succeeded
tests/03assem-incr... Execution time (seconds): 3 succeeded
tests/03r0assem... Execution time (seconds): 4 succeeded
tests/03r5assem... Execution time (seconds): 21 succeeded
tests/03r5assemV1... Execution time (seconds): 23 succeeded
tests/04r0update... Execution time (seconds): 0 succeeded
tests/04r1update... Execution time (seconds): 12 succeeded
tests/04r5swap... Execution time (seconds): 10 succeeded
tests/04update-metadata... Execution time (seconds): 42 succeeded
tests/04update-uuid... Execution time (seconds): 3 succeeded
tests/05r1-add-badblocks... Execution time (seconds): 4 succeeded
tests/05r1-add-internalbitmap... Execution time (seconds): 9 succeeded
tests/05r1-add-internalbitmap-v1a... Execution time (seconds): 9 succeeded
tests/05r1-add-internalbitmap-v1b... Execution time (seconds): 9 succeeded
tests/05r1-add-internalbitmap-v1c... Execution time (seconds): 9 succeeded
tests/05r1-failfast... Execution time (seconds): 31 succeeded
tests/05r1-grow-internal... Execution time (seconds): 20 succeeded
tests/05r1-grow-internal-1... Execution time (seconds): 21 succeeded
tests/05r1-internalbitmap... Execution time (seconds): 35 succeeded
tests/05r1-internalbitmap-v1a... Execution time (seconds): 35 succeeded
tests/05r1-internalbitmap-v1b... Execution time (seconds): 35 succeeded
tests/05r1-internalbitmap-v1c... Execution time (seconds): 35 succeeded
tests/05r1-re-add... Execution time (seconds): 40 succeeded
tests/05r1-re-add-nosuper... Execution time (seconds): 38 succeeded
tests/05r1-remove-internalbitmap... Execution time (seconds): 9 succeeded
tests/05r1-remove-internalbitmap-v1a... Execution time (seconds): 8 succeeded
tests/05r1-remove-internalbitmap-v1b... Execution time (seconds): 9 succeeded
tests/05r1-remove-internalbitmap-v1c... Execution time (seconds): 8 succeeded
tests/05r5-internalbitmap... Execution time (seconds): 37 succeeded
tests/06name... Execution time (seconds): 1 succeeded
tests/06sysfs... Execution time (seconds): 0 succeeded
tests/06wrmostly... Execution time (seconds): 3 succeeded
tests/07autoassemble... Execution time (seconds): 24 succeeded
tests/07autodetect... skipping - see /var/tmp/07autodetect.log and /var/tmp/07autodetect.log for details
tests/07changelevelintr... Execution time (seconds): 61 succeeded
tests/07layouts... Execution time (seconds): 843 succeeded
tests/07reshape5intr... Execution time (seconds): 84 succeeded
tests/07testreshape5... Execution time (seconds): 139 succeeded
tests/10ddf-create... Execution time (seconds): 100 succeeded
tests/10ddf-create-fail-rebuild... Execution time (seconds): 58 succeeded
tests/10ddf-fail-readd... Execution time (seconds): 73 succeeded
tests/10ddf-fail-spare... Execution time (seconds): 18 succeeded
tests/10ddf-fail-stop-readd... Execution time (seconds): 83 succeeded
tests/10ddf-fail-twice... Execution time (seconds): 130 succeeded
tests/10ddf-geometry... Execution time (seconds): 12 succeeded
tests/10ddf-sudden-degraded... Execution time (seconds): 67 succeeded
tests/11spare-migration... Execution time (seconds): 1184 succeeded
tests/12imsm-r0_2d-grow-r0_3d... Execution time (seconds): 165 succeeded
tests/12imsm-r0_2d-grow-r0_4d... Execution time (seconds): 193 succeeded
tests/12imsm-r0_2d-grow-r0_5d... Execution time (seconds): 219 succeeded
tests/12imsm-r0_3d-grow-r0_4d... Execution time (seconds): 165 succeeded
tests/12imsm-r5_3d-grow-r5_4d... Execution time (seconds): 146 succeeded
tests/12imsm-r5_3d-grow-r5_5d... Execution time (seconds): 173 succeeded
tests/13imsm-r0_r0_2d-grow-r0_r0_4d... Execution time (seconds): 220 succeeded
tests/13imsm-r0_r0_2d-grow-r0_r0_5d... Execution time (seconds): 247 succeeded
tests/13imsm-r0_r0_3d-grow-r0_r0_4d... Execution time (seconds): 193 succeeded
tests/13imsm-r0_r5_3d-grow-r0_r5_4d... Execution time (seconds): 160 succeeded
tests/13imsm-r0_r5_3d-grow-r0_r5_5d... Execution time (seconds): 186 succeeded
tests/13imsm-r5_r0_3d-grow-r5_r0_4d... Execution time (seconds): 154 succeeded
tests/13imsm-r5_r0_3d-grow-r5_r0_5d... Execution time (seconds): 200 succeeded
tests/14imsm-r0_3d_no_spares-migrate-r5_3d... Execution time (seconds): 84 succeeded
tests/14imsm-r0_3d-r5_3d-migrate-r5_4d-r5_4d... Execution time (seconds): 118 succeeded
tests/14imsm-r0_r0_2d-takeover-r10_4d... Execution time (seconds): 110 succeeded
tests/14imsm-r10_4d-grow-r10_5d... Execution time (seconds): 91 succeeded
tests/14imsm-r10_r5_4d-takeover-r0_2d... Execution time (seconds): 71 succeeded
mdadm: failed to write 'inactive' to '/sys/block/md126/md//array_state' (Device or resource busy)
tests/14imsm-r5_3d-grow-r5_5d-no-spares... Execution time (seconds): 63 succeeded
tests/14imsm-r5_3d-migrate-r4_3d... Execution time (seconds): 63 succeeded
tests/15imsm-r0_3d_64k-migrate-r0_3d_256k... Execution time (seconds): 96 succeeded
tests/15imsm-r5_3d_4k-migrate-r5_3d_256k... Execution time (seconds): 75 succeeded
tests/15imsm-r5_3d_64k-migrate-r5_3d_256k... Execution time (seconds): 76 succeeded
tests/15imsm-r5_6d_4k-migrate-r5_6d_256k... Execution time (seconds): 77 succeeded
tests/15imsm-r5_r0_3d_64k-migrate-r5_r0_3d_256k... Execution time (seconds): 116 succeeded
tests/16imsm-r0_3d-migrate-r5_4d... Execution time (seconds): 116 succeeded
tests/16imsm-r0_5d-migrate-r5_6d... Execution time (seconds): 115 succeeded
tests/16imsm-r5_3d-migrate-r0_3d... Execution time (seconds): 63 succeeded
tests/16imsm-r5_5d-migrate-r0_5d... Execution time (seconds): 64 succeeded
tests/18imsm-1d-takeover-r0_1d... Execution time (seconds): 55 succeeded
tests/18imsm-1d-takeover-r1_2d... Execution time (seconds): 56 succeeded
tests/18imsm-r0_2d-takeover-r10_4d... Execution time (seconds): 150 succeeded
tests/19raid6check... Execution time (seconds): 278 succeeded
tests/19repair-does-not-destroy... Execution time (seconds): 11 succeeded
tests/21raid5cache... Execution time (seconds): 49 succeeded
tests/23rdev-lifetime... Execution time (seconds): 3 succeeded
tests/24raid10deadlock... skipping - see /var/tmp/24raid10deadlock.log and /var/tmp/24raid10deadlock.log for details
tests/24raid456deadlock... Execution time (seconds): 123 succeeded
tests/25raid456-recovery-while-reshape... Execution time (seconds): 14 succeeded
tests/25raid456-reshape-corrupt-data... Execution time (seconds): 20 succeeded
tests/25raid456-reshape-deadlock... Execution time (seconds): 6 succeeded
Removed IMSM_DEVNAME_AS_SERIAL=1 from systemd environment.
Removed IMSM_NO_PLATFORM=1 from systemd environment.

@XiaoNi87 XiaoNi87 merged commit 6fa6c4b into md-raid-utilities:main Oct 27, 2025
7 of 8 checks passed

4 participants