-
Notifications
You must be signed in to change notification settings - Fork 421
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
dyno: Add resolution of nested functions with outer variables #25273
dyno: Add resolution of nested functions with outer variables #25273
Conversation
2cc8c45
to
64659fd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still need some time to think over the broader architectural changes, and what they'll mean going forward. I think my main concerns are around the ways in which we are kind of working around the query system, and the Impl/Query bifurcation. I think a second opinion would help here too.
frontend/lib/resolution/Resolver.cpp
Outdated
var x: T; | ||
proc foo() { | ||
// Here 'T' is read from the implicit receiver of 'foo'. | ||
proc bar(f: T) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a better example would be using x
in bar
frontend/lib/resolution/Resolver.cpp
Outdated
|
||
// The obvious case: we are a nested non-method function. | ||
if (isMentionInNested && !isMentionInMethod) { | ||
return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not exactly clear on the distinction between these two cases. How could this first case be triggered?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This first case is actually the case that is triggering for test5
:
record R {
type T;
var x : T;
proc foobar() {
proc helper(arg: T) { // Error for 'T' !
var y: x.type;
return y;
}
return helper(x);
}
}
var r : R(int);
var x = r.foobar();
Where I believe helper
is not actually considered to be a method, as in isMethod
returns false and it doesn't have a receiver formal. If that's a bug and we want it to be considered a method, then I should probably raise an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
helper
is definitely not a method in that case.
Since there is a lot that isn't immediately obvious about these cases, I'd like you to encourage to include these simple code examples as comments near the cases that are handling them. This will help anybody trying to understand the code in the future.
Additionally, IMO this case isn't using x
as an outer variable. It's using the implicit this
formal in proc foobar()
, combined with the implicit this.
. In other words, the correct resolution should be T
is determined to be foobarThis.T
where foobarThis
is what I am calling the implicit this
formal for proc foobar()
.
I'd recommend you bring this up in Future Work as something to adjust here. It is a case where your work on nested functions overlaps with the work I was doing on implicit this.
. IMO the big reason that we need to consider it as a case where this.
is the outer variable is that we need the strategy to apply to parenless methods as well as to fields.
frontend/lib/resolution/Resolver.cpp
Outdated
if (target.isEmpty()) return false; | ||
static bool | ||
isOuterVariable(Resolver& rv, const ID& target, const ID& mention) { | ||
if (!target || !mention || target.isSymbolDefiningScope()) return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would think that the Resolver knows if there is a parent frame going. Basically, I would think that the Resolver knows if it's resolving a nested function. Can't this function have an early return of false
in the case that the Resolver isn't working on a nested function? (or maybe, in the case that the Resolver isn't working on a function that contains nested functions?)
// in the 'CallerDetails'. If there are no details, then try to call the | ||
// 'typedSignatureInitial', but it will give up if the signature for the | ||
// "parent's parent" is generic or if the parent contains outer variables. | ||
const TypedFnSignature* parentSignature = nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO UntypedFnSignature
should have a field indicating it is a nested function & a field indicating if it's a function containing nested functions. That should make it easy to avoid these computations in the common case that there is no nesting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still something to do here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if we want to store those details inside UntypedFnSignature
, perhaps instead I could add a separate query later to compute that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not? They should be syntactically apparent. UntypedFnSignature
already has a bunch of bool
s so it should be possible to add it without taking any more space.
64659fd
to
07ed1d5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a bit of reviewing. I need to pick it back up with ResolutionContext's implementation and looking at the .cpp files.
void operator()(Context* context, const std::tuple<Ts...>& keep) const { | ||
mark_tuple_impl(context, keep, std::index_sequence_for<Ts...>()); | ||
} | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks identical to what is declared just a few lines up. What am I missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's specialized for const std::tuple
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh. I'm surprised you would need that.
// resulting 'ResolvedFunction' for the child is stored within this map. | ||
// The primary key is the POI trace, and the secondary key is signature | ||
// and initial PoiInfo. This mirrors the way functions are cached. | ||
PoiTraceToChildMap poiTraceToChild_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, a nested function could be instantiated differently than the outer function, so need its own POI info? Why use a PoiInfo::Trace rather than storing a new PoiInfo? Is it because of some query-based discipline with the PoiInfo that needs to be avoided here?
But SigAndInfoToChildPtrMap uses PoiInfo? I am not catching on to what "initial PoiInfo" means. Can you remind me?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So all I was trying to do here was mimic how caching of generics is implemented using queries. The map is just replacing the role of the query cache in resolveFunctionByPois
. If you think I can simplify what's going on here I'm all ears.
The initial PoiInfo is the PoiInfo that is input to resolveFunctionByInfoQuery
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's just update the comment to say that it mimics those specific queries, as you just described to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've looked over the changes here. It's a big PR & I know there is lots of future work. We will need to make further adjustments after it is merged & we work with it.
namespace chpl { | ||
|
||
namespace parsing { | ||
bool idIsNestedFunction(Context* context, ID id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's odd. Why not write #include "chpl/parsing/parsing-queries.h"
above?
|
||
namespace resolution { | ||
|
||
struct Resolver; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be nice to have a comment here like
// Forward declare various Resolver types so that they can be mentioned
// here even as their real declarations depend on this file
|
||
public: | ||
/** Create an empty 'ResolutionContext'. */ | ||
ResolutionContext(Context* context) : context_(context) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use explicit ResolutionContext(Context* context)
since AFAIK we don't want to enable implicit conversion.
using StoreHash = StoredResultBase::OwnedKeyByValHash; | ||
using StoreEqual = StoredResultBase::OwnedKeyByValEquals; | ||
using Store = std::unordered_set<StoreSlot, StoreHash, StoreEqual>; | ||
static constexpr int64_t BASE_FRAME_INDEX = -8; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why -8
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No reason, I can use -1
. 😄
auto crr = resolveCall(&rcval, ...); | ||
|
||
The 'ResolutionContext' consists of a series of 'frames' that roughly | ||
correspond to typical stack frames. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you say something more in the comment here about what's going on with stable and unstable frames and what these terms mean?
// context query cache. The query guard should have returned early. | ||
CHPL_ASSERT(false && "Should be set in 'resolveFunctionByInfoQuery'!"); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We lost this comment. Could you add it back in?
// the actual value is set in resolveFunctionByInfoQuery after it is
// computed because computing it generates the poiFnIdsUsed which is
// part of the key for this query.
// be considered "instantiated". | ||
// TODO: This needs to be consistent with code in 'AdjustMaybeRefs' | ||
// for inferring ref-maybe-const on calls (whatever this does to compute | ||
// the poiScope, that code also needs to do). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand there are TODOs here (to list in Future Work!) but shouldn't you make a helper function that both can call so at least it's consistent, even if it still needs work?
return 42; | ||
} else { | ||
return "hello"; | ||
// This is private issue #6022. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's also say this tests a case that was not working earlier where a nested function uses a field accessible through an outer method's this
.
assert(qt.type() && qt.type()->isIntType()); | ||
} | ||
|
||
static void test6(void) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// same as test5 but with a nested method instead of a nested function
*/ | ||
|
||
/* | ||
static void test8(void) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need a comment about why this is commented out
Signed-off-by: David Longnecker <dlongnecke-cray@users.noreply.github.com>
Signed-off-by: David Longnecker <dlongnecke-cray@users.noreply.github.com>
Signed-off-by: David Longnecker <dlongnecke-cray@users.noreply.github.com>
Signed-off-by: David Longnecker <dlongnecke-cray@users.noreply.github.com>
Signed-off-by: David Longnecker <dlongnecke-cray@users.noreply.github.com>
Signed-off-by: David Longnecker <dlongnecke-cray@users.noreply.github.com>
Signed-off-by: David Longnecker <dlongnecke-cray@users.noreply.github.com>
Signed-off-by: David Longnecker <dlongnecke-cray@users.noreply.github.com>
Signed-off-by: David Longnecker <dlongnecke-cray@users.noreply.github.com>
Signed-off-by: David Longnecker <dlongnecke-cray@users.noreply.github.com>
Signed-off-by: David Longnecker <dlongnecke-cray@users.noreply.github.com>
Signed-off-by: David Longnecker <dlongnecke-cray@users.noreply.github.com>
Signed-off-by: David Longnecker <dlongnecke-cray@users.noreply.github.com>
Signed-off-by: David Longnecker <dlongnecke-cray@users.noreply.github.com>
Signed-off-by: David Longnecker <dlongnecke-cray@users.noreply.github.com>
Signed-off-by: David Longnecker <dlongnecke-cray@users.noreply.github.com>
Signed-off-by: David Longnecker <dlongnecke-cray@users.noreply.github.com>
a092a58
to
3209021
Compare
Signed-off-by: David Longnecker <dlongnecke-cray@users.noreply.github.com>
Signed-off-by: David Longnecker <dlongnecke-cray@users.noreply.github.com>
Hooray, David! |
This PR adds support for resolving calls to nested functions that reference outer variables.
It adds a new type called
ResolutionContext*
that is passed instead of theContext*
to resolution functions that require it. TheResolutionContext*
is used to power a new type of query, theCHPL_RESOLUTION_QUERY...
. These queries enable stack frames for parent functions to be consulted while maintaining a strong invariant: any query computation that references state from a mutable parent frame will not store its results in the globalContext*
query cache.With stack frames for functions present, outer variables can be typed in programs like the following:
Interior calls to nested functions cause a problem for the query framework
As Chapel exists today, only interior calls to nested functions can be written - calls such as
bar()
orbaz()
in the example above. An interior call is a call to a nested function that is issued within its parent or a sibling function.Interior calls present a unique problem. When the call
bar()
is being resolved, the proceduresbaz()
andfoo()
are also in the process of being resolved! While a function is being resolved it is a big blob of mutable state - and many details of resolution are not finalized until several AST walks have been completed.One of the requirements of the query framework is that all the inputs to a query function are immutable: they must be values, or pointers to constants. When the procedure
bar()
is resolved, it will askfoo()
for the type ofx
. Sincefoo()
is still being resolved at that time, the answer thatfoo()
produces (though correct) is taken from its mutableResolutionResultsByPostorderID
.On its own, this would not be a problem, as the type of
x
is still just a value. However, the stack frames used to retrievex
are also a critical part of computing it, and they cannot easily be captured by the query framework. If not all of the inputs required to perform a computation can be stored by the query framework, then the computation can't be a query!(This is not to say that we cannot represent the stack frames as a query input in some other way. E.g., one of the followup ideas thrown around is to create some sort of trace, similar to what is done for POIs, used to indicate where all the outer variables came from. But it is not 100% clear what needs to be in such a type.)
The new
ResolutionContext
typeThis PR adds a new type called
ResolutionContext*
that is passed instead of theContext*
to resolution functions that require it. TheResolutionContext*
is used to power a new type of query, theCHPL_RESOLUTION_QUERY...
.These new "resolution context queries" are written exactly the same as regular
QUERY_...
queries, except the first argument must be of typeResolutionContext*
:The
ResolutionContext
(abbreviated asRC
) contains zero or more stack frames of the typeResolutionContext::Frame
. Right now, frames are pushed when aResolver
for a function is created, and are popped when thatResolver
is destroyed.A "resolution context query" will not always cache its computations in the "global query cache" that is maintained in the
Context*
. At any point in time, theRC
is either stable or unstable. When theRC
is unstable, that means it contains one or moreResolutionContext::Frames
that encapsulate aResolver
. ThisResolver
is mutable and is being mutated during an AST walk for some computation - usually a call toresolveFunction
for a parent function(s).When the resolution context query begins, the
RC
and the query inputs are consulted. If theRC
is stable, then that means no mutable state is present, so the global query cache can always be used. If theRC
is unstable, then the query inputs determine if the global query cache can be consulted. Right now, if any input ID is for a nested function, the global query cache cannot be used. This is a coarse filter and can be adjusted (along with many other aspects of a resolution context query) by specializing query components on a per-query basis.When the global query cache cannot be used, in most cases this means the query's computation will be performed every single time (as though it is uncached). However, it is possible for queries to specialize the "unstable cache" behavior and to fetch/store results within mutable resolver frames. This is done for
resolveFunction
to prevent interior calls to nested functions from resulting in the nested function's body being resolved repeatedly.History of this effort
Originally (any commit before 07ed1d5), this PR adopted a strategy of passing along an extra argument (called
CallerDetails
) to resolution functions which required it. This type modeled stack frames. When a signature or function required an outer variable, the resolver would walk up stack frames to find the variable.Because the global context query cache requires immutable inputs, any time stack frames were consulted to compute a result, the result could not be stored in the context query cache. This invariant was maintained in an impromptu manner, usually by adding the following branch to queries that required it:
Not only does this make the code harder to digest, it makes it hard to maintain the invariant (and thus the correctness of the query framework) due to the possibility of developer error.
After discussions with @benharsh and @mppf we came up with the following sort of ideas which could make this effort more maintainable going forward:
ResolutionContext
type which could be passed instead of theContext*
, removing the extraCallerDetails&
argumentSome things, like the
ResolutionContext
and theResolutionContext::Frame
, were very easy to add, as they were just adaptations of existing code. However, I really struggled with implementing the idea of a new query that contained and conditionally ran a traditional query.At first, I tried to embed the
QUERY_BEGIN
macros inside of new macros, but this very quickly became impossible to edit, understand, or maintain. However, I also ran into a more fundamental problem.The entire context query framework is implemented around the idea of a singular "query function", that must have a certain shape:
const&
.Context*
as the first argument.std::decay
.But these new "resolution query" functions would have a new type
ResolutionContext*
as the first argument, so they violated the second requirement.My first strategy to solve this problem was to embed a secondary, hidden function inside of the query function which has a different signature:
The function had to be embedded within a struct so that it could have a user-writeable name (lambdas are anonymous). There was a kernel of my final code present in this attempt, but it had another problem: the function
hidden__::theGlobalQuery
was not visible outside ofsomeNewResolutionQuery
! It could not be used in inactive stores, that is:QUERY_STORE_RESULT
and friends.Along with this embedded struct, I also implemented the "global portion" of the query end using what were effectively
STORE_RESULT
calls. This would come back to bite me later: inactive stores are NOT the same asQUERY_END
, and encapsulating the user's computation within some form ofQUERY_BEGIN
andQUERY_END
would prove essential to tracking query dependencies!Eventually, I got fed up with writing code embedded in giant macros, and started to think about a way around that. The problem with the
QUERY...
macros is that they are not extensible, because all names must be at the outermost scope in order to be referenced in bothQUERY_BEGIN
andQUERY_END
.I eventually realized that a good workaround is to move all the names that were at the top level, into a templated
struct
instead. This has the added advantage of allowing me to write 99% of the code as code that is not embedded in macros. I am very happy with how readable the code ultimately ended up looking. A new C++17 feature, "auto
value template parameters", enabled me to capture the resolution query functions as a template value in a very succinct manner.As iterations over this idea continued, I eventually realized that I had to insert
QUERY_BEGIN
andQUERY_END
calls into my new query guards in order to ensure that global dependencies were tracked. But I could not embed them in theResolutionContext::Query
struct methods because they are implemented by declaring hidden variables at the top level of a function.Ultimately, I had to reach into the body of the
QUERY...
macros and reimplement them as calls within my new struct methods. As I was working on this, I realized that I had implemented two types:GlobalQuery
, which essentially reimplements the body of theQUERY...
functions in a more readable and maintainable fashion.ResolutionContext::Query
, which is a wrapper around aGlobalQuery
that is required to uphold the invariant.Future work should take this new implementation and utilize it to implement the
QUERY...
macro functions.TESTING
linux64
,standard
Thanks to @mppf for a thorough review and pointers on future work. Thanks to @mppf and @benharsh for brainstorming the
ResolutionContext
and theCHPL_RESOLUTION_QUERY...
. And thanks to the dyno team for their patience!FUTURE WORK
See: https://github.com/Cray/chapel-private/issues/6721