Tested on a Mac M4. I made a small model with just an nn::Linear and saved it from both C++ and Python, and noticed that the two produce different .safetensors files. The only difference is the byte order of the data section (the headerLength and JSON sections are identical).
I was able to fix it by commenting out the following line in save_safetensors():
std::reverse(data_ptr + i, data_ptr + i + cpu_tensor.element_size());
I'm not familiar with the code, but the load() func has a similar reverse() that only runs inside an is_big_endian() condition. I'm wondering whether the save() func needs the same check. Either that, or I have something misconfigured on my machine.
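To make the question concrete, here is roughly the kind of guard I have in mind. This is a standalone sketch and not the library's code: host_is_big_endian(), swap_element_bytes() and to_little_endian() are names I made up for illustration, and I'm assuming the data section should always be written little-endian, as the safetensors format describes. On a little-endian host such as Apple silicon, the swap would then be skipped on save.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not the library's is_big_endian()): detects the host
// byte order at run time by looking at the first byte of a known 16-bit value.
inline bool host_is_big_endian() {
    const std::uint16_t probe = 0x0102;
    unsigned char first_byte = 0;
    std::memcpy(&first_byte, &probe, 1);
    return first_byte == 0x01;
}

// Reverses the bytes of every element in the buffer, in place.
// nbytes must be a multiple of element_size.
inline void swap_element_bytes(unsigned char* data, std::size_t nbytes,
                               std::size_t element_size) {
    assert(element_size > 0 && nbytes % element_size == 0);
    for (std::size_t i = 0; i < nbytes; i += element_size) {
        std::reverse(data + i, data + i + element_size);
    }
}

// Returns a copy of the buffer in little-endian element order.
// The swap only happens on a big-endian host; on a little-endian host the
// bytes are copied through unchanged.
inline std::vector<unsigned char> to_little_endian(const void* src,
                                                   std::size_t nbytes,
                                                   std::size_t element_size) {
    const auto* bytes = static_cast<const unsigned char*>(src);
    std::vector<unsigned char> out(bytes, bytes + nbytes);
    if (host_is_big_endian()) {
        swap_element_bytes(out.data(), out.size(), element_size);
    }
    return out;
}
```

If save_safetensors() did something along these lines, I'd expect the file written from C++ on my machine to match the Python one byte for byte, but I haven't checked how the rest of the save path handles the buffer.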