Fix: Correctly resolve in scope variables of instanceof patterns #6645 #6647

Open
Luro02 wants to merge 1 commit into INRIA:master from Luro02:master

Conversation

@Luro02
Contributor

Luro02 commented Feb 25, 2026

I have not looked at the JLS, so I might have missed something.

The code in the SiblingsFunction introduced in #6444 has been removed, because I think it does not belong there? Seems wrong to return the instanceof variables there.

@MartinWitt
Collaborator

@Luro02 mind using our assertions and testing utils? Take a look at https://github.com/INRIA/spoon/blob/master/CONTRIBUTING.md or ask the next mistral instance.

@Luro02
Contributor Author

Luro02 commented Feb 25, 2026

> @Luro02 mind using our assertions and testing utils? Take a look at https://github.com/INRIA/spoon/blob/master/CONTRIBUTING.md or ask the next mistral instance.

Are you referring to that assertNotNull(decl); or do you have a problem with the assertThat? I am fine with changing the assertThat to an assertSame to conform with the rest of the test class.

@SirYwell
Collaborator

This looks somewhat incomplete, especially with more complex if conditions. For example:

  • if (!(!(a instanceof String s))) { ... }
  • if (!(!(!(a instanceof String s)))) { ... }
  • if (a instanceof String s || b) { ... }
  • if (a instanceof String s && b) { ... }
  • if (!(a instanceof String s1) || !(b instanceof String s2)) { ... }

Could you expand the tests to cover these scenarios?

It feels like we need full DA/DU analysis here to properly support that feature...
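
For reference, the conditions listed above can be written as plain Java (16 or newer) to see where the compiler accepts a use of the pattern variable. This is only an illustrative sketch; the class and method names are made up and it does not touch Spoon at all.

```java
// Each method mirrors one condition from the list above.
class PatternScopeCases {

    // Two negations cancel out: s is in scope in the then-branch.
    static String doubleNegation(Object a) {
        if (!(!(a instanceof String s))) {
            return s;
        }
        return "no match";
    }

    // Three negations: s is introduced when the condition is false,
    // so it is in scope in the else-branch only.
    static String tripleNegation(Object a) {
        if (!(!(!(a instanceof String s)))) {
            return "no match";
        } else {
            return s;
        }
    }

    // With ||, s is in scope in neither branch; using s here would not compile.
    static String orWithFlag(Object a, boolean b) {
        if (a instanceof String s || b) {
            return "then-branch, s not in scope";
        }
        return "else-branch, s not in scope";
    }

    // With &&, s is in scope in the then-branch.
    static String andWithFlag(Object a, boolean b) {
        if (a instanceof String s && b) {
            return s;
        }
        return "no match";
    }

    // Both operands negated and joined by ||: s1 and s2 are in scope in the else-branch.
    static String negatedOr(Object a, Object b) {
        if (!(a instanceof String s1) || !(b instanceof String s2)) {
            return "no match";
        } else {
            return s1 + s2;
        }
    }
}
```

A resolver test could assert, for each method, that the reference to s (or s1 and s2) resolves to the pattern declaration exactly in the branches where it is used above.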

@Luro02 Luro02 marked this pull request as draft February 26, 2026 10:20
@Luro02
Contributor Author

Luro02 commented Feb 26, 2026

This is likely going to be more involved than I expected, so I am converting this to a draft for now.

@Luro02
Contributor Author

Luro02 commented Mar 5, 2026

I went through the JLS https://docs.oracle.com/javase/specs/jls/se25/html/jls-6.html#jls-6.3.1 and implemented the pattern variable resolution accordingly.

The pushed implementation seems to be mostly working, except for switches, which I have not implemented yet. A lot of tests are missing and many things are not covered, but I am unsure what these tests should look like.

The Contributing Guidelines suggest using assertThat, which is kind of useless here, but they should work for assertThat(decl).isSame() and isNull.

The main problem would be how to get the declarations and associated references from a piece of code. I am tempted to simply fetch a list of all declarations, and all references, then hardcode which indices of the references should resolve to which declaration, but maybe there is a cleaner way?

@Luro02
Contributor Author

Luro02 commented Mar 6, 2026

Here is the current state of the PR:

The original code starts at the given element, checks if the desired variable is in scope, and if not, moves up to the parent of the element. This repeats until it reaches the package declaration or runs out of parents.

For instanceof patterns, the scope changes through the parents, e.g. a pattern that was in scope will not be in scope anymore when negated (it would then match on false). The final scope of the pattern is defined by the statement containing the expression with the instanceof pattern. To figure out which instanceof pattern is in scope, one has to start at the pattern definition, then walk up the element tree.
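
The effect of negation described above can be modelled in a few lines. The following is a minimal sketch of the JLS 6.3.1 rules for instanceof, !, && and || only (the conditional operator, switches and nested record patterns are left out). The types Expr, InstanceOfPattern, Not, And, Or and Other are hypothetical toy types, not part of Spoon's metamodel, and this is not the code in the PR.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical toy expression model, not Spoon's metamodel.
interface Expr {}
record InstanceOfPattern(String variable) implements Expr {}
record Not(Expr operand) implements Expr {}
record And(Expr left, Expr right) implements Expr {}
record Or(Expr left, Expr right) implements Expr {}
record Other() implements Expr {} // any expression that declares no pattern variable

class IntroducedVariables {

    // Returns the pattern variables introduced by e when e evaluates to `value`.
    static Set<String> introduced(Expr e, boolean value) {
        Set<String> result = new HashSet<>();
        if (e instanceof InstanceOfPattern p) {
            // a instanceof T v introduces v only when true
            if (value) {
                result.add(p.variable());
            }
        } else if (e instanceof Not n) {
            // !a when true is a when false, and vice versa
            result.addAll(introduced(n.operand(), !value));
        } else if (e instanceof And a) {
            // a && b introduces variables only when true
            if (value) {
                result.addAll(introduced(a.left(), true));
                result.addAll(introduced(a.right(), true));
            }
        } else if (e instanceof Or o) {
            // a || b introduces variables only when false
            if (!value) {
                result.addAll(introduced(o.left(), false));
                result.addAll(introduced(o.right(), false));
            }
        }
        return result;
    }
}
```

For an if statement, the set for `true` would be in scope in the then-branch and the set for `false` in the else-branch. The model works top-down on the condition, whereas the PR walks up from the pattern through its parents, so it only illustrates the rules, not the traversal.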

My implementation does the following:

The loop that moves up the tree keeps track of the current scopes. When it moves up to the parent, it updates the scopes with new variables that have become available (like one declared in a sibling) or removes ones that no longer apply. The code can then look at the scopes to find any variable that is in scope.

When moving to the parent, it only knows the variables from the child, but there might be siblings that introduce variables too. The code currently hardcodes which siblings are explored, but I think always applying the siblings function might be more sensible here.

For siblings it will try to find one variable declaration (a leaf in the tree), then move up from there. Given that it uses the same code that is used for the child, it will automatically explore the other branches.


The code has become rather complicated, so the next steps would be to simplify where applicable. I found a few bugs in the old code as well, e.g. it used filterChildren to get the variables of a case, but then a variable declared in the block case ..: { /* here */ } would be discovered in the following blocks, which is wrong.
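
The case-block bug mentioned above can be reproduced with plain Java. In this sketch (class, field and method names are made up for illustration), a field is shadowed by a local declared inside the braces of the first case, and the second case must still see the field:

```java
class CaseBlockScope {
    static String y = "field";

    static String resolve(int x) {
        switch (x) {
            case 1: {
                // This local shadows the field, but only inside this block.
                String y = "local of case 1";
                return y;
            }
            case 2: {
                // The local from case 1 is not in scope here, so y is the field.
                return y;
            }
            default:
                return "default";
        }
    }
}
```

A resolver that collects every declaration below a case and lets it leak into the following cases would bind the y in case 2 to the local of case 1, which is the wrong declaration.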

@Luro02
Contributor Author

Luro02 commented Mar 28, 2026

It has been a while since I last worked on the PR, because I was busy.

I think the implementation is far enough that it is ready for an initial review.
All tests are passing locally (let's see what the CI thinks), and most of the concerning pieces of code have been resolved/cleaned up.

It looks like I am not supposed to add custom test assertions? Will have to investigate where I should add them.

What remains to be done?

  • Go through the JLS and check that the code matches it. I think it does, but I may have missed something.
  • Add support for patterns in a ? b : c.
  • Add support for switch expressions and their patterns. I skipped these JLS sections, but the current code may already cover them.
  • Add more tests to ensure the code always behaves correctly (not sure how much time I will spend on this, maybe an AI agent can find some edge cases I might have forgotten).

What I need feedback on:

  • There are two FIXMEs, one of them is just the definition of "cannot complete normally" while the other requires some form of control flow analysis. The latter I would rather not implement, sounds like a ton of work.
  • I reused the old code that used the SiblingsFunction to resolve the variable declarations for things like executables, for loops, and so on. Here I would like to know whether I should keep it as-is or apply the mentioned fix (see TODO in the code)

@Luro02 Luro02 marked this pull request as ready for review March 28, 2026 13:41
@Luro02 Luro02 requested a review from I-Al-Istannen March 28, 2026 13:41
@Luro02 Luro02 changed the title from "fix: Correctly resolve in scope variables of instanceof patterns #6645" to "Fix: Correctly resolve in scope variables of instanceof patterns #6645" Mar 29, 2026
Collaborator

@I-Al-Istannen I-Al-Istannen left a comment

Thank you very much! I added some nitpicks and some thoughts, but I don't think I have completely grasped it.

I think I found one omission (if without a then but a non-terminating else) and some other small comments.
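
One possible reading of the omission mentioned above is an if statement whose then-branch is empty (and so completes normally) while the else-branch cannot complete normally. In that situation the JLS introduces the pattern variable after the whole if statement. A small sketch of that case, with a made-up class and method name:

```java
class IfElseScope {
    static int lengthOrMinusOne(Object a) {
        if (a instanceof String s) {
            // empty then-branch: it can complete normally
        } else {
            return -1; // the else-branch cannot complete normally
        }
        // s is introduced by the if statement as a whole, so it is in scope here
        return s.length();
    }
}
```

A test for the resolver could check that the s in the final return statement resolves to the pattern variable declared in the condition.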

Comment on lines +147 to +149
if (parent instanceof CtModifiable ctModifiable) {
    isInStaticScope = isInStaticScope || ctModifiable.hasModifier(ModifierKind.STATIC);
}
Collaborator

Suggested change
- if (parent instanceof CtModifiable ctModifiable) {
-     isInStaticScope = isInStaticScope || ctModifiable.hasModifier(ModifierKind.STATIC);
- }
+ if (parent instanceof CtModifiable ctModifiable && ctModifiable.hasModifier(ModifierKind.STATIC)) {
+     isInStaticScope = true;
+ }

?

Contributor Author

Luro02 commented Apr 1, 2026

That part is from the original code, but I will update it. I think my change there was just some whitespace adjustments.

Comment on lines +158 to +162
/**
* The variable this scope applies to.
*
* @return the variable
*/
Collaborator

Suggested change
- /**
-  * The variable this scope applies to.
-  *
-  * @return the variable
-  */
+ /**
+  * {@return the variable this scope applies to}
+  */

Comment on lines +165 to +169
/**
* The element in which the variable can be referenced.
*
* @return the element
*/
Collaborator

Suggested change
- /**
-  * The element in which the variable can be referenced.
-  *
-  * @return the element
-  */
+ /**
+  * {@return the element in which the variable can be referenced}
+  */

CtElement ctElement();

/**
* Checks whether the scope applies to the given element.
Collaborator

I can glean that from the method name, but what does "apply" mean?

*
* @return the variable
*/
CtVariable<?> ctVariable();
Collaborator

Is there any reason for prefixing everything with ct? I think the rest of the code doesn't do that.

Contributor Author

No, it is just my personal preference to name things after their types.

}

/**
* Updates the child scopes for the parent.
Collaborator

What does this mean?

Comment on lines +371 to +374
// TODO: With the current implementation any new CtBodyHolders will be resolved by the below code, potentially causing
// wrong variable resolution. Instead one could constrain the below if to
// if (parent instanceof CtCatch || parent instanceof CtExecutable || ...)
// Then in any new implementation the variables would resolve to null instead of a wrong variable (trickier to discover)
Collaborator

I am unsure. WDYT @SirYwell?

Collaborator

CtBodyHolder is a somewhat odd interface that doesn't really correspond to a JLS concept, so I'd rather cover the specific cases.

That also brings me to synchronized blocks, which is not a CtBodyHolder but has a similar shape. Is that handled correctly here?

Contributor Author

Luro02 commented Apr 1, 2026

> CtBodyHolder is a somewhat odd interface that doesn't really correspond to a JLS concept, so I'd rather cover the specific cases.
>
> That also brings me to synchronized blocks, which is not a CtBodyHolder but has a similar shape. Is that handled correctly here?

CtSynchronized isn't mentioned in the JLS for instanceof patterns, so I assume synchronized (a instanceof B b) { ... } is not a thing.

Other than that the CtSynchronized always has a CtBlock, and the variable can not escape the block. The CtBlock is handled correctly

I will add some tests for this
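
A possible shape for such a test input, sketched in plain Java with made-up names. The lock expression of a synchronized statement must have a reference type, so an instanceof expression cannot be the lock itself; what remains to check is that a pattern variable introduced inside the block does not escape it.

```java
class SynchronizedScope {
    private static final Object LOCK = new Object();

    static String describe(Object a) {
        synchronized (LOCK) {
            if (!(a instanceof String s)) {
                return "not a string";
            }
            // s stays in scope for the rest of the enclosing block
            if (!s.isEmpty()) {
                return s;
            }
        }
        // s is no longer in scope here; referencing it would not compile
        return "empty";
    }
}
```

The resolver should bind both uses of s inside the block to the pattern declaration and find nothing named s after the closing brace of the synchronized statement.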

Collaborator

> CtSynchronized isn't mentioned in the JLS for instanceof patterns, so I assume synchronized (a instanceof B b) { ... } is not a thing.
>
> Other than that the CtSynchronized always has a CtBlock, and the variable can not escape the block. The CtBlock is handled correctly

Oh so e.g., CtCatch is relevant here because it introduces a variable?

Contributor Author

Yes, it only cares about elements that introduce variables. It then defines where within such an element those variables are valid.

When moving to the parent, these will no longer apply, so they are discarded. The instanceof patterns are the special case for which we need the scopes, because they can still be valid in their parent element.

Comment on lines +234 to +236
private static <T> List<T> castList(List<?> sourceList) {
    return (List<T>) sourceList;
}
Collaborator

This feels... risky? I would inline that logic and add an explanatory comment ("Will only create VariableScope if branch is a CtVariable, which it isn't"), if you need that property
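
To illustrate the risk with a standalone example (the class name and the failsOnRead helper are made up, only castList is copied from the snippet above): the unchecked cast is not verified at runtime, so a wrongly typed list is only detected later, at the point where an element is read.

```java
import java.util.ArrayList;
import java.util.List;

class UncheckedCastDemo {

    // Same shape as the helper under review: the cast is unchecked.
    @SuppressWarnings("unchecked")
    static <T> List<T> castList(List<?> sourceList) {
        return (List<T>) sourceList;
    }

    // Returns true if reading from the mis-typed list throws a ClassCastException.
    static boolean failsOnRead() {
        List<Object> source = new ArrayList<>();
        source.add(42);
        // Compiles and runs without error, although the list holds an Integer.
        List<String> strings = castList(source);
        try {
            // The compiler inserts a cast to String here, which fails.
            String first = strings.get(0);
            return first == null;
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```

This is why an inlined cast with a comment stating the invariant, as suggested above, makes the assumption easier to audit than a general-purpose helper.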

}

// If the branch itself is a variable declaration, then this variable is in scope for the branch.
// The updateChildScopesForParent would have introduced it, but it was not called, because of the loop conidition
Collaborator

Suggested change
- // The updateChildScopesForParent would have introduced it, but it was not called, because of the loop conidition
+ // The updateChildScopesForParent would have introduced it, but it was not called, because of the loop condition

}

private static boolean completesNormally(CtStatement statement) {
// FIXME: The JLS has a definition for what "cannot complete normally" is
Collaborator

We should under-approximate this if the JLS implementation requires too much. I asked some AI and it spit out 60 lines of fun code, so I guess technically it is doable, but if you have a safe under-approximation that's fine by me. Do keep a FIXME there then, though :)
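
One way such an under-approximation could look, sketched on a hypothetical toy statement model (Stmt, Return, Throw, Block, IfElse and Opaque are made up and are not Spoon types). It reports "cannot complete normally" only for a few shapes where that is certain and answers "may complete normally" for everything else, so at worst a pattern variable is not found rather than resolved incorrectly.

```java
import java.util.List;

// Hypothetical toy statement model, not Spoon's metamodel.
interface Stmt {}
record Return() implements Stmt {}
record Throw() implements Stmt {}
record Block(List<Stmt> statements) implements Stmt {}
record IfElse(Stmt thenBranch, Stmt elseBranch) implements Stmt {}
record Opaque() implements Stmt {} // loops, switches, try, labels: not analysed

class Completion {

    // True only when the statement certainly cannot complete normally.
    // FIXME: this does not implement the full JLS definition.
    static boolean definitelyCannotCompleteNormally(Stmt s) {
        if (s instanceof Return || s instanceof Throw) {
            return true;
        }
        if (s instanceof Block b) {
            // A block completes normally iff its last statement does.
            List<Stmt> statements = b.statements();
            return !statements.isEmpty()
                && definitelyCannotCompleteNormally(statements.get(statements.size() - 1));
        }
        if (s instanceof IfElse i) {
            // if-else cannot complete normally only if both branches cannot.
            return definitelyCannotCompleteNormally(i.thenBranch())
                && definitelyCannotCompleteNormally(i.elseBranch());
        }
        // Everything else is assumed to possibly complete normally.
        return false;
    }
}
```

Statements such as break, continue or while (true) loops are deliberately treated as opaque here; handling them is where the control flow analysis mentioned in the PR description would start.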
