Prevent array functions from escaping taints #9041
base: 5.x
Looks promising :) Mind fixing the CI failures?
@orklah with new functions added to stubs without
I will look into both.
@weirdan indeed, we need to make sure the stubs are at least as detailed as the callmap, including PHP versions. I wonder if it would make sense to have a separate way to document taint flow without having to make new stubs.
Well, we have three sources of truth, in fact: Callmaps, Stubs and TypeProviders. Each overrides the previous one, and the complexity increases each time.
I don't think there's much room for improvement here: removing any of those three steps would be a massive amount of work, and I don't think it would be for the best. What bothered me about this PR is that you had to create stubs for functions that are perfectly fine in the Callmaps, just for tainting. This forces you to reproduce the Callmap info in the stubs, and that's a downgrade. I think a new Callmap-like file for taint could be a good addition; it would allow removing some stubs in favor of a simpler and more efficient syntax.
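To make the idea of a Callmap-like taint file more concrete, here is a minimal sketch of what such a file and its lookup could look like. Everything in it is an assumption made for illustration: the `$taintFlowMap` layout, the `paramsFlowingToReturn()` helper, and the choice of entries are hypothetical and are not part of Psalm.

```php
<?php
// Hypothetical taint-flow map (not an existing Psalm file format):
// function name => names of the parameters whose taint reaches the return value.
$taintFlowMap = [
    'array_slice'   => ['array'],
    'array_reverse' => ['array'],
    'array_merge'   => ['arrays'],
];

/**
 * Hypothetical lookup helper: returns the parameters of $functionName
 * whose taint should flow to the return value, or an empty list if the
 * function has no entry in the map.
 *
 * @param array<string, list<string>> $map
 * @return list<string>
 */
function paramsFlowingToReturn(array $map, string $functionName): array
{
    // PHP function names are case-insensitive, so normalise before the lookup.
    return $map[strtolower($functionName)] ?? [];
}
```

A file like this would only describe taint flow, leaving types and PHP-version details in the existing Callmaps, which is the separation suggested above.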
@orklah & @weirdan, don't feel like you need to rush to respond. I'm sure this is low priority for you. Consolidating the three sources of truth all at once does sound intimidating. However, it may not be that difficult to lay the groundwork for a new format that could eventually consolidate all of them. The functions in this PR could be the only things there to start, and everything else could be moved there incrementally over time. If you were designing Psalm for the first time today, what would be the ideal way to represent our three sources of truth in a single place? Maybe the CallMap format, with expanded support for more advanced features, like so?

    'array_splice' => [
        'getFunctionReturnType' => function (FunctionReturnTypeProviderEvent $event): array {
            // Insert TypeProvider implementation
        },
        'arguments' => [
            'array',
            '&rw_array' => [
                'returnType' => 'array',
                'taintFlow' => true,
                // Maybe another item to replace the odd 'rw' prefix
            ],
            'offset' => 'int',
            'length=' => '?int',
            'replacement=' => [
                'returnType' => 'array|string',
                'taintFlow' => true,
            ],
        ],
    ],

I also wonder if we're unnecessarily reinventing the wheel by trying to describe PHP behaviors in something other than PHP... Would it make more sense to go all in on the stubs feature, including analyzing implementations like we would any other code that Psalm analyzes? I'm sure I'm oversimplifying things, but dumb questions sometimes help get the wheels turning. There has to be an elegant solution if we can only imagine it, and an incremental way to transition toward it.
These changes allow tainted array function arguments to flow through to return values. Don't hesitate to speak up if there are any concerns or requests. I'm hoping you'll appreciate buildDataSets(), as it significantly reduces boilerplate. Once approved & merged, I plan to create similar PRs for other core functions.
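As a rough illustration of what "flowing through to return values" means, the sketch below uses Psalm's documented `@psalm-flow` and `@psalm-taint-source` docblock annotations on user-defined functions. The functions `firstItems()` and `readUserInput()` are hypothetical and are not taken from this PR; the actual stubs added here may be written differently.

```php
<?php
/**
 * Hypothetical wrapper around array_slice(), for illustration only.
 * The annotation declares that taint on $items reaches the return value.
 *
 * @psalm-flow ($items) -> return
 */
function firstItems(array $items, int $count): array
{
    return array_slice($items, 0, $count);
}

/**
 * Hypothetical taint source standing in for user-controlled input.
 *
 * @psalm-taint-source input
 */
function readUserInput(): array
{
    return ['<script>alert(1)</script>', 'safe'];
}

$picked = firstItems(readUserInput(), 1);

// With taint analysis enabled, Psalm would be expected to flag this echo,
// because the tainted input reaches an HTML sink without being escaped.
echo $picked[0];
```

If the array function in the middle had no taint flow declared, the taint would be lost at that call and the unescaped output could go unreported, which is the gap this PR aims to close for core array functions.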