Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create a LocalScope for CollectionComprehensions and generate a Declaration #2019

Open
wants to merge 12 commits into
base: main
Choose a base branch
from
Original file line number Diff line number Diff line change
Expand Up @@ -242,6 +242,7 @@ class ScopeManager : ScopeProvider {
is TryStatement,
is IfStatement,
is CatchClause,
is CollectionComprehension,
is Block -> LocalScope(nodeToScope)
is FunctionDeclaration -> FunctionScope(nodeToScope)
is RecordDeclaration -> RecordScope(nodeToScope)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -101,7 +101,7 @@ fun LanguageProvider.objectType(
): Type {
// First, we check, whether this is a built-in type, to avoid necessary allocations of simple
// types
val builtIn = language?.getSimpleTypeOf(name.toString())
val builtIn = language.getSimpleTypeOf(name.toString())
if (builtIn != null) {
return builtIn
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -99,13 +99,25 @@ class ExpressionHandler(frontend: PythonLanguageFrontend) :
}
}

/**
* Works similar to [apply] but before executing [block], it enters the scope for this object
* and afterward leaves the scope again.
*/
private inline fun <T : Node> T.applyWithScope(block: T.() -> Unit): T {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be a helpful bit, also for the future node builder API, should we move this to the scope manager? Or keep it intentionally private for now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the same and wasn't sure where would be a good place. Maybe Node directly?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When in doubt place it in Extensions.kt:D

return this.apply {
ctx?.scopeManager?.enterScope(this)
block()
ctx?.scopeManager?.leaveScope(this)
}
}

/**
* Translates a Python
* [`GeneratorExp`](https://docs.python.org/3/library/ast.html#ast.GeneratorExp) into a
* [CollectionComprehension].
*/
private fun handleGeneratorExp(node: Python.AST.GeneratorExp): CollectionComprehension {
return newCollectionComprehension(rawNode = node).apply {
return newCollectionComprehension(rawNode = node).applyWithScope {
statement = handle(node.elt)
comprehensionExpressions += node.generators.map { handleComprehension(it, node) }
type = objectType("Generator")
Expand All @@ -117,10 +129,10 @@ class ExpressionHandler(frontend: PythonLanguageFrontend) :
* into a [CollectionComprehension].
*/
private fun handleListComprehension(node: Python.AST.ListComp): CollectionComprehension {
return newCollectionComprehension(rawNode = node).apply {
return newCollectionComprehension(rawNode = node).applyWithScope {
statement = handle(node.elt)
comprehensionExpressions += node.generators.map { handleComprehension(it, node) }
type = objectType("list") // TODO: Replace this once we have dedicated types
type = objectType("list")
}
}

Expand All @@ -129,10 +141,10 @@ class ExpressionHandler(frontend: PythonLanguageFrontend) :
* a [CollectionComprehension].
*/
private fun handleSetComprehension(node: Python.AST.SetComp): CollectionComprehension {
return newCollectionComprehension(rawNode = node).apply {
return newCollectionComprehension(rawNode = node).applyWithScope {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure, if this is really relevant in our case, but it seems that the exact definition is that only the control variable is not leaked into the outer scope, but I guess the only sane way to model this is exactly like you did by opening up a local scope for the comprehension.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, from https://peps.python.org/pep-0709/ -> "Comprehensions are currently compiled as nested functions, which provides isolation of the comprehension’s iteration variable, but is inefficient at runtime."

Copy link
Contributor Author

@KuechA KuechA Jan 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, this sounds as if the current behavior would be appropriately modeled with the local scope but the intended/future behavior might be different.

Actually, we're only generating a VariableDeclaration inside this local scope for the variable of the comprehension expression and this is done inside the PythonAddDeclarationsPass. Leaving the scope earlier (i.e., right after generating the ComprehensionExpression) here won't impact on resolving variables, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, this sounds as if the current behavior would be appropriately modeled with the local scope but the intended/future behavior might be different.

Yes, although I am not sure if this "new" bevahiour will leak the control variable or not. Maybe our PEP master @maximiliankaul understands this better than us.

Actually, we're only generating a VariableDeclaration inside this local scope for the variable of the comprehension expression and this is done inside the PythonAddDeclarationsPass. Leaving the scope earlier (i.e., right after generating the ComprehensionExpression) here won't impact on resolving variables, right?

I would assume so

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my understanding, the variable should be available only in the LocalScope tied to the ComprehensionExpression. Obviously, we can always enter this scope to incorrectly resolve a variable but this shouldn't be the case. I also added a test which simulates the test case he provided in the original comment.

this.statement = handle(node.elt)
this.comprehensionExpressions += node.generators.map { handleComprehension(it, node) }
this.type = objectType("set") // TODO: Replace this once we have dedicated types
this.type = objectType("set")
}
}

Expand All @@ -141,15 +153,15 @@ class ExpressionHandler(frontend: PythonLanguageFrontend) :
* into a [CollectionComprehension].
*/
private fun handleDictComprehension(node: Python.AST.DictComp): CollectionComprehension {
return newCollectionComprehension(rawNode = node).apply {
return newCollectionComprehension(rawNode = node).applyWithScope {
this.statement =
newKeyValueExpression(
key = handle(node.key),
value = handle(node.value),
rawNode = node,
)
this.comprehensionExpressions += node.generators.map { handleComprehension(it, node) }
this.type = objectType("dict") // TODO: Replace this once we have dedicated types
this.type = objectType("dict")
}
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,7 @@ import de.fraunhofer.aisec.cpg.graph.declarations.VariableDeclaration
import de.fraunhofer.aisec.cpg.graph.scopes.RecordScope
import de.fraunhofer.aisec.cpg.graph.statements.ForEachStatement
import de.fraunhofer.aisec.cpg.graph.statements.expressions.AssignExpression
import de.fraunhofer.aisec.cpg.graph.statements.expressions.ComprehensionExpression
import de.fraunhofer.aisec.cpg.graph.statements.expressions.InitializerListExpression
import de.fraunhofer.aisec.cpg.graph.statements.expressions.MemberExpression
import de.fraunhofer.aisec.cpg.graph.statements.expressions.Reference
Expand Down Expand Up @@ -67,15 +68,15 @@ class PythonAddDeclarationsPass(ctx: TranslationContext) : ComponentPass(ctx), L
}

/**
* This function checks for each [AssignExpression] whether there is already a matching variable
* or not. New variables can be one of:
* This function checks for each [AssignExpression], [ComprehensionExpression] and
* [ForEachStatement] whether there is already a matching variable or not. New variables can be
* one of:
* - [FieldDeclaration] if we are currently in a record
* - [VariableDeclaration] otherwise
*
* TODO: loops
*/
private fun handle(node: Node?) {
when (node) {
is ComprehensionExpression -> handleComprehensionExpression(node)
is AssignExpression -> handleAssignExpression(node)
is ForEachStatement -> handleForEach(node)
else -> {
Expand Down Expand Up @@ -194,6 +195,27 @@ class PythonAddDeclarationsPass(ctx: TranslationContext) : ComponentPass(ctx), L
this.base.name == scopeManager.currentMethod?.receiver?.name
}

/**
* Generates a new [VariableDeclaration] for [Reference] (and those included in a
* [InitializerListExpression]) in the [ComprehensionExpression.variable].
*/
private fun handleComprehensionExpression(comprehensionExpression: ComprehensionExpression) {
when (val variable = comprehensionExpression.variable) {
is Reference -> {
variable.access = AccessValues.WRITE
handleWriteToReference(variable)
}
is InitializerListExpression -> {
variable.initializers.forEach {
(it as? Reference)?.let { ref ->
ref.access = AccessValues.WRITE
handleWriteToReference(ref)
}
}
}
}
}

/**
* Generates a new [VariableDeclaration] if [target] is a [Reference] and there is no existing
* declaration yet.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -26,20 +26,77 @@
package de.fraunhofer.aisec.cpg.frontends.python

import de.fraunhofer.aisec.cpg.graph.*
import de.fraunhofer.aisec.cpg.graph.declarations.ParameterDeclaration
import de.fraunhofer.aisec.cpg.graph.declarations.VariableDeclaration
import de.fraunhofer.aisec.cpg.graph.scopes.LocalScope
import de.fraunhofer.aisec.cpg.graph.statements.expressions.AssignExpression
import de.fraunhofer.aisec.cpg.graph.statements.expressions.BinaryOperator
import de.fraunhofer.aisec.cpg.graph.statements.expressions.Block
import de.fraunhofer.aisec.cpg.graph.statements.expressions.CallExpression
import de.fraunhofer.aisec.cpg.graph.statements.expressions.CollectionComprehension
import de.fraunhofer.aisec.cpg.graph.statements.expressions.InitializerListExpression
import de.fraunhofer.aisec.cpg.graph.statements.expressions.KeyValueExpression
import de.fraunhofer.aisec.cpg.graph.statements.expressions.Reference
import de.fraunhofer.aisec.cpg.test.analyze
import de.fraunhofer.aisec.cpg.test.assertLiteralValue
import de.fraunhofer.aisec.cpg.test.assertLocalName
import de.fraunhofer.aisec.cpg.test.assertNotRefersTo
import de.fraunhofer.aisec.cpg.test.assertRefersTo
import java.nio.file.Path
import kotlin.test.*

class ExpressionHandlerTest {

@Test
fun testComprehensionExpressionTuple() {
val topLevel = Path.of("src", "test", "resources", "python")
val result =
analyze(listOf(topLevel.resolve("comprehension.py").toFile()), topLevel, true) {
it.registerLanguage<PythonLanguage>()
}
assertNotNull(result)

val tupleComp = result.functions["tupleComp"]
assertNotNull(tupleComp)

val body = tupleComp.body
assertIs<Block>(body)
val tupleAsVariableAssignment = body.statements[0]
assertIs<AssignExpression>(tupleAsVariableAssignment)
val tupleAsVariable = tupleAsVariableAssignment.rhs[0]
assertIs<CollectionComprehension>(tupleAsVariable)
val barCall = tupleAsVariable.statement
assertIs<CallExpression>(barCall)
assertLocalName("bar", barCall)
val argK = barCall.arguments[0]
assertIs<Reference>(argK)
assertLocalName("k", argK)
val argV = barCall.arguments[1]
assertIs<Reference>(argV)
assertLocalName("v", argV)
assertEquals(1, tupleAsVariable.comprehensionExpressions.size)
val initializerListExpression = tupleAsVariable.comprehensionExpressions[0].variable
assertIs<InitializerListExpression>(initializerListExpression)
val variableK = initializerListExpression.initializers[0]
assertIs<Reference>(variableK)
assertLocalName("k", variableK)
val variableV = initializerListExpression.initializers[1]
assertIs<Reference>(variableV)
assertLocalName("v", variableV)

// Check that the declarations exist for the variables k and v
val declK = variableK.refersTo
assertIs<VariableDeclaration>(declK)
assertIs<LocalScope>(declK.scope)
assertEquals(tupleAsVariable, declK.scope?.astNode)
assertRefersTo(argK, declK)
val declV = variableV.refersTo
assertIs<VariableDeclaration>(declV)
assertIs<LocalScope>(declV.scope)
assertEquals(tupleAsVariable, declV.scope?.astNode)
assertRefersTo(argV, declV)
}

@Test
fun testListComprehensions() {
val topLevel = Path.of("src", "test", "resources", "python")
Expand All @@ -50,21 +107,42 @@ class ExpressionHandlerTest {
assertNotNull(result)
val listComp = result.functions["listComp"]
assertNotNull(listComp)
val paramX = listComp.parameters[0]
assertIs<ParameterDeclaration>(paramX)
assertLocalName("x", paramX)

val body = listComp.body
assertIs<Block>(body)
val singleWithIfAssignment = body.statements[0]
assertIs<AssignExpression>(singleWithIfAssignment)
val singleWithIf = singleWithIfAssignment.rhs[0]
assertIs<CollectionComprehension>(singleWithIf)
assertIs<CallExpression>(singleWithIf.statement)
val fooCall = singleWithIf.statement
assertIs<CallExpression>(fooCall)
val usageI = fooCall.arguments[0]
assertIs<Reference>(usageI)
assertEquals(1, singleWithIf.comprehensionExpressions.size)
assertLocalName("i", singleWithIf.comprehensionExpressions[0].variable)
assertIs<Reference>(singleWithIf.comprehensionExpressions[0].iterable)
assertLocalName("x", singleWithIf.comprehensionExpressions[0].iterable)
val variableI = singleWithIf.comprehensionExpressions[0].variable
assertIs<Reference>(variableI)
assertLocalName("i", variableI)
val declI = variableI.refersTo
assertIs<VariableDeclaration>(declI)
assertEquals(singleWithIf, declI.scope?.astNode)
val iterableX = singleWithIf.comprehensionExpressions[0].iterable
assertIs<Reference>(iterableX)
assertLocalName("x", iterableX)
assertRefersTo(iterableX, paramX)
val ifPredicate = singleWithIf.comprehensionExpressions[0].predicate
assertIs<BinaryOperator>(ifPredicate)
assertEquals("==", ifPredicate.operatorCode)
assertRefersTo(usageI, declI)

val fooIOutside = body.statements[4]
assertIs<CallExpression>(fooIOutside)
val outsideI = fooIOutside.arguments[0]
assertIs<Reference>(outsideI)
assertLocalName("i", outsideI)
assertNotRefersTo(outsideI, declI)

val singleWithoutIfAssignment = body.statements[1]
assertIs<AssignExpression>(singleWithoutIfAssignment)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ def listComp(x, y):
b = [foo(i) for i in x]
c = {foo(i) for i in x if i == 10 if i < 20}
d = [foo(i) for z in y if z in x for i in z if i == 10 ]
foo(i)

def setComp(x, y):
a = {foo(i) for i in x if i == 10}
Expand All @@ -21,4 +22,10 @@ def dictComp(x, y):

def generator(x, y):
a = (i**2 for i in range(10) if i == 10)
b = (i**2 for i in range(10))
b = (i**2 for i in range(10))

def bar(k, v):
return k+v

KuechA marked this conversation as resolved.
Show resolved Hide resolved
def tupleComp(x):
a = [bar(k, v) for (k, v) in x]
Loading