plan about my dev v2 #37
Looking forward to seeing the lima driver mainlined :) |
Progress update
Next
|
@yuq PP error irq isn't fixed for mali400. I still get this on some runs of glmark:
|
Also I get MMU faults in kmscube -M rgba:
|
What's your screen resolution when this kind of error happens? I remember you said you have a 2536x1440 monitor? I fixed this error at 1920x1080 with the PLB number not set to max. But there's another dimension I haven't tried: the PLB size. PLB size can be 128, 256, 512 or 1024. Dumping the mali driver I always see it set to 512, and lima-ng does the same. But maybe at higher resolutions it should be increased to 1024. |
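To put the two resolutions in this exchange side by side, here is a rough sketch of how many tiles each framebuffer covers. It assumes the 16x16 pixel tile size used by Mali-400 class GPUs; the helper name `tile_count` is made up for illustration and is not taken from lima.

```c
#include <stdint.h>

/* Assumption: Mali-400 class GPUs render in 16x16 pixel tiles. */
#define TILE_DIM 16u

/* Hypothetical helper: number of tiles needed to cover a width x height
 * framebuffer, rounding partial tiles at the right and bottom edges up
 * to whole tiles. */
static uint32_t tile_count(uint32_t width, uint32_t height)
{
    uint32_t tiles_x = (width + TILE_DIM - 1) / TILE_DIM;
    uint32_t tiles_y = (height + TILE_DIM - 1) / TILE_DIM;

    return tiles_x * tiles_y;
}
```

Under that assumption, 1920x1080 covers 120 x 68 = 8160 tiles and 2560x1440 covers 160 x 90 = 14400 tiles, so the larger monitor has roughly 1.76 times as many tiles to describe. Whether that alone calls for a larger PLB size is exactly the open question in this comment.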
My monitor resolution is 2560x1440. |
So does increasing LIMA_CTX_PLB_BLK_SIZE to 1024 solve the error on your side? |
No, with LIMA_CTX_PLB_BLK_SIZE = 1024 kmscube doesn't work at all - and I get this in dmesg:
|
OK, maybe there's another place that needs to be configured for 1024 PLB, like the DLBU reg: I just hard-coded 0x20000000 for 512 PLB, and 1024 PLB should be 0x30000000. So there may be the same field for mali400 that we haven't discovered. We can first dump mali at 2560x1440 and see if it uses 1024 PLB size, and then where this field is. |
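The two register values quoted above fit a simple pattern: the PLB size index sits in the top bits of the word. The sketch below encodes that pattern. Only the 512 and 1024 values come from this thread; the 128 and 256 cases are an unverified extrapolation, and `plb_size_to_dlbu_field` is a hypothetical name, not a lima function.

```c
#include <stdint.h>

/* Hypothetical helper: returns 0 on success, -1 for an unsupported size.
 * Only the 512 and 1024 encodings are stated in the thread. The 128 and
 * 256 encodings extrapolate the same pattern and are unverified. */
static int plb_size_to_dlbu_field(uint32_t plb_size, uint32_t *field)
{
    uint32_t index;

    switch (plb_size) {
    case 128:  index = 0; break; /* extrapolated */
    case 256:  index = 1; break; /* extrapolated */
    case 512:  index = 2; break; /* 0x20000000, from the thread */
    case 1024: index = 3; break; /* 0x30000000, from the thread */
    default:   return -1;
    }

    *field = index << 28;
    return 0;
}
```

Deriving the field from the same constant that sizes the PLB would keep the two from drifting apart when the size is changed in one place only.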
Here's dump: https://drive.google.com/file/d/16WDMIvAeE6-wK4NYvepF8R0YEfJXEUHD/view?usp=sharing - I'm not really sure what to look for. |
From your dump, although the gp stream mem is missing, I can see in the pp stream mem that it's still 512 PLB. But I also found in the code that LIMA_CTX_PLB_BLK_SIZE is not used everywhere it should be, so fixed with: With this fix, 1024 PLB works, could you try it again? |
1024 PLB works now, but it's the same as 512 - I'm still getting an mmu fault in 'kmscube -M rgba':
and pp error in glmark2-es2-drm -b build:
Btw, everything that uses textures stutters for me, e.g. textured cube or 'glmark2-es2-drm -b pulsar' |
OK, then it seems not to be the PLB size problem. As for the texture stutter, is it caused by the compiler: |
Oh, I wasn't aware of this change in mesa-18.0. That explains the stuttering. As for the issue - I suspect it's something related to cache, since it works fine roughly 4 times out of 5 and fails on the 5th |
The compiler scalar-back-to-vec problem will get worse with 18.1. But I want to focus on the kernel for now, so I left it with an incomplete workaround. The issue may be a cache problem. Another possibility is the switch_delay: I found that on an Amlogic chip at high frequency (>500MHz) it has to be bigger than 0xff, otherwise the chip works in an unstable state. Not sure if this affects your chip. |
Setting switch-delay to 0xffff doesn't help for the ppmmu error, but "pp error irq state=200" goes away. Looks like Mali400 in Allwinner A64 needs switch-delay 0xffff to work properly. Does it make sense to make switch-delay = 0xffff the default value? |
And I think I understand when "ppmmu error" happens - it always happens if I run some app that uses textures and press ctrl+c to interrupt it. I believe the driver tears down the MMU mapping while the PP is still running. |
I don't know if it's proper to always set the switch delay to 0xffff, as some platforms set this value to 0xff and some set it to 0xffff in the mali driver; this value also depends on the clk freq. Does the proprietary A64 mali kernel driver set it to 0xffff or 0xff? As for the ppmmu error, whether or not your guess is true, the kernel driver indeed has no mechanism to prevent this situation. If the user just calls vm_unmap before the PP task is done, this result is expected. If the user is interrupted and resources are freed due to the dev file descriptor being closed, we may add some code to wait for the task to finish. |
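The missing guard described above can be modelled in a few lines. This is only a userspace sketch of the idea: the `vm_model` structure and every function name in it are hypothetical and do not correspond to lima's data structures, and a real driver would more likely sleep until the tasks complete rather than return an error.

```c
#include <errno.h>

/* Hypothetical bookkeeping for one GPU VM; not the lima data structure. */
struct vm_model {
    int mapped;          /* 1 while the buffer is mapped in the GPU MMU */
    int tasks_in_flight; /* PP tasks submitted but not yet completed */
};

static void vm_task_submit(struct vm_model *vm)
{
    vm->tasks_in_flight++;
}

static void vm_task_done(struct vm_model *vm)
{
    if (vm->tasks_in_flight > 0)
        vm->tasks_in_flight--;
}

/* Refuse to tear down the mapping while any task may still touch it.
 * Returns 0 on success or -EBUSY while tasks are in flight. */
static int vm_unmap_checked(struct vm_model *vm)
{
    if (vm->tasks_in_flight > 0)
        return -EBUSY;

    vm->mapped = 0;
    return 0;
}
```

In this model the ctrl+c case maps to a release path that keeps retrying, or blocks, until `vm_unmap_checked` succeeds, so the PP never runs against a mapping that is already gone.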
If I read this code correctly, it uses 0x0 as the delay since there's no pmu_switch_delay in the device tree: What does 0x0 mean in this case? Highest possible delay? |
Are you sure the switch delay reg is set to 0? According to the comment, this is the min delay or no delay. |
I verified it, and it's setting it to 0. |
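The behaviour established in this exchange (no property in the device tree means 0 is written) can be written as an explicit fallback, which also leaves room for the per-platform defaults discussed earlier. A minimal sketch: `resolve_switch_delay` and its parameters are hypothetical, and a NULL pointer stands in for an absent device tree property.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper. dt_value is NULL when the device tree has no
 * switch delay property. fallback is what gets used in that case: 0
 * reproduces the behaviour verified in this thread, while a platform
 * could instead pass 0xff or 0xffff. */
static uint32_t resolve_switch_delay(const uint32_t *dt_value,
                                     uint32_t fallback)
{
    return dt_value ? *dt_value : fallback;
}
```

Passing the fallback in, instead of hard-coding it, avoids picking a single global default while it is still unclear which value each platform needs.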
Then if it's set to 0 in the lima kernel driver, does it fix your pp error too? |
Progress update:
I'll prepare an RFC for the kernel driver soon. |
@yuq please CC me on your RFC patches |
@anarsoul no problem. |
RFC has been sent: |
Soo.. I noticed that someone pointed out you are missing some Mali architectures there. |
Oh, I didn't know there are so many ARCHs. Now I've decided to just write it like this: Thanks for pointing it out. |
As the previous plan is done, I'm starting a new one.
I've set up a mali450 board for mali450 dev and found the kernel driver HW ops are not stable, e.g. L2 cache and MMU reset commands time out, so I want to refine and fix the kernel driver, which may also help with some problems found during mali400 dev. After this, I can send an RFC to the kernel DRM driver mailing list for feedback.