-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
stdlib: create an init function for records with complex default values #9373
base: master
Are you sure you want to change the base?
stdlib: create an init function for records with complex default values #9373
Conversation
CT Test Results 2 files 97 suites 1h 9m 4s ⏱️ For more details on these failures, see this check. Results for commit c22b3fa. ♻️ This comment has been updated with latest results. To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass. See the TESTING and DEVELOPMENT HowTo guides for details about how to run test locally. Artifacts// Erlang/OTP Github Action Bot |
d544c98
to
e4b004f
Compare
St); | ||
|
||
IsUndefined = [{RF, AnnoRF, Field, {atom, AnnoRF, 'undefined'}} || {record_field=RF, AnnoRF, Field, _} <- Is], | ||
Fields = lists:flatten(lists:sort([atom_to_list(FieldAtom) || {record_field, _, {atom, _, FieldAtom}, _} <- Is])), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, prefer binaries over lists, and to sink expressions as far down as possible: the compiler has to maintain observability with regards to tracing, and cannot sink this expression down to the true
branch below as that would result in a different trace.
Now, as "free vars" are allowed in record fields, these free vars can affect each other. -module(z).
-export([mk1/0]).
-record(a, {
a = X = id(1),
b = X = id(2)
}).
id(X) -> X.
mk1() ->
#a{}.
|
Thanks, I'll get to it soon! |
Actually, I spoke to soon. This is as intended. Just like how it works if you try to update the record like this: But I will update the linter so that it warns for this. |
4ed5f0c
to
fc096c6
Compare
Any chance of not including the new function in the error description? So, from the other comment… 1> z:mk1().
** exception error: no match of right hand side value 2
in function z:'rec_init$^0'/0 (z.erl, line 13) I would've preferred to see something like… 1> z:mk1().
** exception error: no match of right hand side value 2
in function z:mk1/0 (z.erl, line 13) |
Tail calls, like in this case, makes it invisible in the stacktrace. Would that work for you? |
Another option would be to explicitly prevent the compiler from tail calling into this helper function, and always emit a full call with building a stack frame |
traverse_af(AF, Fun) -> | ||
traverse_af(AF, Fun, []). | ||
traverse_af(AF, Fun, Acc) when is_list(AF) -> | ||
[ traverse_af(Ast, Fun, Fun(Ast,Acc)) || Ast <- AF]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[ traverse_af(Ast, Fun, Fun(Ast,Acc)) || Ast <- AF]; | |
[traverse_af(Ast, Fun, Fun(Ast,Acc)) || Ast <- AF]; |
With the current implementation it's possible to crash -module(z).
-export([mk1/0]).
-record(a, {
a = X = 1,
b = X
}).
mk1() ->
#a{a = 2}.
The code after expansion is: -file("z.erl", 1).
-module(z).
-export([mk1/0]).
-record(a,{a = X = 1, b = X}).
mk1() ->
begin
REC0 = 'rec_init$^0'(),
case REC0 of
{a, _, _} ->
setelement(2, REC0, 2);
_ ->
error({badrecord, REC0})
end
end.
'rec_init$^0'() ->
{a, undefined, X}. |
The test case |
5e9f9b8
to
cf5cbab
Compare
I think my attempt at being clever about this failed, and a naiver approach is better.
In the above case, you will apply the initialize values, (a=2 in this case) #a{2,X,3}, check if there are variables left in the expression.
In this case after the initializing values are applied, you have #a{2,1,3}, there are no variables, so its not necessary to run the rec_init() function. If you have side effects (which you shouldn't) in your records default value, you have to think a second time if that really is what you want. Since they might be running when you do not expect it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So far I've looked at erl_error
and erl_expand_record
.
My comments are very nit-picky. We care very much about a consistent code-style in the compiler (and compiler-adjacent modules in STDLIB).
lib/stdlib/src/erl_error.erl
Outdated
no -> case is_rec_init(F) of | ||
true -> <<"in record">>; | ||
_ -> <<"in function ">> | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend not using _
to match a single known value. The _
can hide bugs if is_rec_init/1
is changed in the future to return more values. Also, if true
is misspelled as ture
, that bug is hidden too. Furthermore, the '_' can result in worse types in both Dialyzer and the compiler. Sometimes those worse types can result in worse code because the compiler is unable to see that some optimization is safe.
In this particular case, the compiler will "see" that if the value is not true
, it must be false. In other words, the compiler will do the optimization for you.
no -> case is_rec_init(F) of | |
true -> <<"in record">>; | |
_ -> <<"in function ">> | |
end | |
no -> | |
case is_rec_init(F) of | |
true -> <<"in record">>; | |
false -> <<"in function ">> | |
end |
lib/stdlib/src/erl_error.erl
Outdated
case is_rec_init(F) of | ||
true -> <<"default value">>; | ||
_ -> io_lib:fwrite(<<"~ts/~w">>, [mf_to_string({M, F}, A, Enc), A]) | ||
end. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
case is_rec_init(F) of | |
true -> <<"default value">>; | |
_ -> io_lib:fwrite(<<"~ts/~w">>, [mf_to_string({M, F}, A, Enc), A]) | |
end. | |
case is_rec_init(F) of | |
true -> | |
<<"default value">>; | |
false -> | |
io_lib:fwrite(<<"~ts/~w">>, | |
[mf_to_string({M, F}, A, Enc), A]) | |
end. |
@@ -95,6 +97,12 @@ forms([{function,Anno,N,A,Cs0} | Fs0], St0) -> | |||
forms([F | Fs0], St0) -> | |||
{Fs,St} = forms(Fs0, St0), | |||
{[F | Fs], St}; | |||
forms([], #exprec{new_forms=FsN}=St) -> | |||
{[{'function', Anno, | |||
maps:get(Def,FsN), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please follow the existing code style in this module.
maps:get(Def,FsN), | |
maps:get(Def, FsN), |
lib/stdlib/src/erl_error.erl
Outdated
origin(1, M, F, A) -> | ||
case is_op({M, F}, n_args(A)) of | ||
{yes, F} -> <<"in operator ">>; | ||
no -> <<"in function ">> | ||
no -> case is_rec_init(F) of | ||
true -> <<"in record">>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any test of this added code?
traverse_af(AF, Fun) -> | ||
traverse_af(AF, Fun, []). | ||
traverse_af(AF, Fun, Acc) when is_list(AF) -> | ||
[traverse_af(Ast, Fun, Fun(Ast,Acc)) || Ast <- AF]; | ||
traverse_af(AF, Fun, Acc) when is_tuple(AF) -> | ||
%% Iterate each tuple element, if the element is an AF, traverse it | ||
[[(fun (List) when is_list(List) -> | ||
traverse_af(List, Fun, Acc); | ||
(Tuple) when is_tuple(Tuple)-> | ||
case erl_anno:is_anno(Tuple) of | ||
true -> []; | ||
false -> traverse_af(Tuple, Fun, Fun(Tuple,Acc)) | ||
end; | ||
(_) -> [] | ||
end)(Term) || Term <- tuple_to_list(AF)],Acc]; | ||
traverse_af(_, _, Acc) -> Acc. | ||
save_vars({var, _, Var}, _) -> Var; | ||
save_vars(_, Acc) -> Acc. | ||
free_variables(AF, Acc) -> | ||
try | ||
_=traverse_af(AF, fun free_variables1/2, Acc), | ||
false | ||
catch | ||
throw:{error,unsafe_variable} -> true | ||
end. | ||
free_variables1({'fun',_anno,{clauses, _}}, Acc) -> | ||
{function,Acc}; %% tag that we are in a 'fun' now that can define new variables | ||
free_variables1({clause,_anno,Pattern,_guards,_body}, {function,Acc}) -> | ||
lists:flatten(traverse_af(Pattern, fun save_vars/2, [])++Acc); | ||
free_variables1({var, _, Var}, Acc) -> | ||
case lists:member(Var, Acc) of | ||
true -> Acc; | ||
false -> throw({error, unsafe_variable}) | ||
end; | ||
free_variables1(_, Acc) -> Acc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
traverse_af(AF, Fun) -> | |
traverse_af(AF, Fun, []). | |
traverse_af(AF, Fun, Acc) when is_list(AF) -> | |
[traverse_af(Ast, Fun, Fun(Ast,Acc)) || Ast <- AF]; | |
traverse_af(AF, Fun, Acc) when is_tuple(AF) -> | |
%% Iterate each tuple element, if the element is an AF, traverse it | |
[[(fun (List) when is_list(List) -> | |
traverse_af(List, Fun, Acc); | |
(Tuple) when is_tuple(Tuple)-> | |
case erl_anno:is_anno(Tuple) of | |
true -> []; | |
false -> traverse_af(Tuple, Fun, Fun(Tuple,Acc)) | |
end; | |
(_) -> [] | |
end)(Term) || Term <- tuple_to_list(AF)],Acc]; | |
traverse_af(_, _, Acc) -> Acc. | |
save_vars({var, _, Var}, _) -> Var; | |
save_vars(_, Acc) -> Acc. | |
free_variables(AF, Acc) -> | |
try | |
_=traverse_af(AF, fun free_variables1/2, Acc), | |
false | |
catch | |
throw:{error,unsafe_variable} -> true | |
end. | |
free_variables1({'fun',_anno,{clauses, _}}, Acc) -> | |
{function,Acc}; %% tag that we are in a 'fun' now that can define new variables | |
free_variables1({clause,_anno,Pattern,_guards,_body}, {function,Acc}) -> | |
lists:flatten(traverse_af(Pattern, fun save_vars/2, [])++Acc); | |
free_variables1({var, _, Var}, Acc) -> | |
case lists:member(Var, Acc) of | |
true -> Acc; | |
false -> throw({error, unsafe_variable}) | |
end; | |
free_variables1(_, Acc) -> Acc. | |
variables({var,_,'_'}) -> | |
[]; | |
variables({var,_,V}) -> | |
[V]; | |
variables({'fun',_,Def}) -> | |
%% The Def tuple has no annotation. Must handle it specially. | |
case Def of | |
{clauses,Cs} -> variables(Cs); | |
{function,F,A} -> variables([F,A]); | |
{function,M,F,A} -> variables([M,F,A]) | |
end; | |
variables(Tuple) when is_tuple(Tuple) -> | |
[Tag,Anno|T] = tuple_to_list(Tuple), | |
true = is_atom(Tag), %Assertion. | |
true = erl_anno:is_anno(Anno), %Assertion. | |
variables(T); | |
variables(List) when is_list(List) -> | |
foldl(fun(E, Vs0) -> | |
Vs1 = variables(E), | |
ordsets:union(Vs0, Vs1) | |
end, [], List); | |
variables(_) -> | |
[]. |
[traverse_af(Ast, Fun, Fun(Ast,Acc)) || Ast <- AF]; | ||
traverse_af(AF, Fun, Acc) when is_tuple(AF) -> | ||
%% Iterate each tuple element, if the element is an AF, traverse it | ||
[[(fun (List) when is_list(List) -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A tip: there is no need create and call a fun here. You can just use a case
or an if
here.
throw:{error,unsafe_variable} -> true | ||
end. | ||
free_variables1({'fun',_anno,{clauses, _}}, Acc) -> | ||
{function,Acc}; %% tag that we are in a 'fun' now that can define new variables |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is correct. Only variables defined in the function head will be shadowed. Variables defined in the function body will be matched against variables having the same name in the enclosing function.
In my suggested simplified function, I don't try to handle funs; I think they are used too infrequently to be worth the effort.
%% Initialize the record | ||
R_init = [{atom,Anno,Name} | | ||
record_inits(record_fields(Name, Anno0, St), Is)], | ||
Vars = lists:flatten(traverse_af(Is, fun save_vars/2)), | ||
%% Check if there are variables in the initialized record | ||
%% if there are, we need to initialize the record using a generated function | ||
case free_variables(R_init, Vars) of | ||
true -> | ||
%% Initialize the record with only the default values | ||
R_default_init = [{atom,Anno,Name} | | ||
record_inits(record_fields(Name, Anno, St),[])], | ||
{Def,St1} = expr({tuple,Anno,R_default_init},St), | ||
Map=St1#exprec.new_forms, | ||
{FName,St2} = case maps:get(Def, Map, undefined) of | ||
undefined-> | ||
C=St1#exprec.rec_init_count, | ||
NewName=list_to_atom("rec_init$^" ++ integer_to_list(C)), | ||
{NewName, St1#exprec{rec_init_count=C+1, new_forms=Map#{Def=>NewName}}}; | ||
OldName -> {OldName,St1} | ||
end, | ||
%% replace the init record expression with a call expression | ||
%% to the newly added function and a record update. | ||
expr({record, Anno0, {call,Anno,{atom,Anno,FName},[]}, Name, Is},St2); | ||
false -> | ||
%% No free variables means that we can just | ||
%% output the record as a tuple. | ||
expr({tuple,Anno,R_init},St) | ||
end; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here follows my suggested change to use the variables/1
function suggested by me earlier. I've also broken some long lines, renamed variables to follow the conventions we have in compiler modules, and a few other minor improvements.
For the main change below to work, line 180 must be changed as follows:
expr({record,Anno0,Name,Is}, St0) ->
Here follows the main suggestion:
%% Initialize the record | |
R_init = [{atom,Anno,Name} | | |
record_inits(record_fields(Name, Anno0, St), Is)], | |
Vars = lists:flatten(traverse_af(Is, fun save_vars/2)), | |
%% Check if there are variables in the initialized record | |
%% if there are, we need to initialize the record using a generated function | |
case free_variables(R_init, Vars) of | |
true -> | |
%% Initialize the record with only the default values | |
R_default_init = [{atom,Anno,Name} | | |
record_inits(record_fields(Name, Anno, St),[])], | |
{Def,St1} = expr({tuple,Anno,R_default_init},St), | |
Map=St1#exprec.new_forms, | |
{FName,St2} = case maps:get(Def, Map, undefined) of | |
undefined-> | |
C=St1#exprec.rec_init_count, | |
NewName=list_to_atom("rec_init$^" ++ integer_to_list(C)), | |
{NewName, St1#exprec{rec_init_count=C+1, new_forms=Map#{Def=>NewName}}}; | |
OldName -> {OldName,St1} | |
end, | |
%% replace the init record expression with a call expression | |
%% to the newly added function and a record update. | |
expr({record, Anno0, {call,Anno,{atom,Anno,FName},[]}, Name, Is},St2); | |
false -> | |
%% No free variables means that we can just | |
%% output the record as a tuple. | |
expr({tuple,Anno,R_init},St) | |
end; | |
RInit = [{atom,Anno,Name} | | |
record_inits(record_fields(Name, Anno0, St0), Is)], | |
Vars = variables(Is), | |
%% Check if there are variables in the initialized record. If | |
%% there are, we need to initialize the record using a generated | |
%% function | |
AnyVariables = not ordsets:is_subset(variables(RInit), Vars), | |
case AnyVariables of | |
true -> | |
%% Initialize the record with only the default values. | |
RDefInit = [{atom,Anno,Name} | | |
record_inits(record_fields(Name, Anno, St0),[])], | |
{Def,St1} = expr({tuple,Anno,RDefInit}, St0), | |
Map0 = St1#exprec.new_forms, | |
{FName,St2} = | |
case Map0 of | |
#{Def := OldName} -> | |
{OldName,St1}; | |
#{} -> | |
C = St1#exprec.rec_init_count, | |
NewName = list_to_atom("rec_init$^" ++ | |
integer_to_list(C)), | |
Map = Map0#{Def => NewName}, | |
{NewName,St1#exprec{rec_init_count=C+1, | |
new_forms=Map}} | |
end, | |
%% Replace the init record expression with a call expression | |
%% to the newly added function followed by a record update. | |
expr({record, Anno0, {call,Anno,{atom,Anno,FName},[]}, Name, Is},St2); | |
false -> | |
%% No free variables means that we can just output the | |
%% record as a tuple. | |
expr({tuple,Anno,RInit}, St0) | |
end; |
"Bound" variables from previous fields are not propagated correctly to next fields. This definition is accepted: -record(r1, {
a = X = 1,
c = X
}). While this definition is rejected: -record(r1, {
a = X = 1,
b,
c = X
}).
|
Thinking about it a bit more, I actually find it very surprising we want to allow variables "leaking" across fields in default values - we don't allow that in regular record creation! For example this is an error:
Why should moving this into the default expression somehow make this compile and OK? This sounds quite inconsistent to me. |
I agree, - this introduces one more "rule" about scoping of variables that is already quite complicated and sometimes surprising. |
Okey, I must have been confused because I thought this was valid! I'll fix the behavior next week then. |
create an init function e.g. rec_init$^0, for each record with definitions containing variables. e.g. -record(r, {f = fun(X)->case X of {y, Y} -> Y; _ -> X end, g=..., h=abc}). foo(X)->\#r{}. --> foo(X)->(rec_init()){}. rec_init() will initialize all fields with the default values If one field is set and the omitted field default value has variables, then a new init function is created that only initializes the omitted fields. - removes lint error for variables in definitions - updates erl_lint_SUITE and erl_expand_records_SUITE to work with this new behavior - adds handling of records that are calling functions to the shell - records calling local non exported functions will fail initialization
cf5cbab
to
f8dfae2
Compare
lib/stdlib/src/shell.erl
Outdated
[NE2]=reconstruct1(NE, [],0, []), | ||
ets:insert(FT, [begin {value, Fun, []} = erl_eval:expr({'fun', A, {clauses, F}}, []), | ||
{{function, {shell_default, FunName, 0}}, Fun} | ||
end || {function,_,FunName,0,F}=_F1<-Forms, FunName=/=foo]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
end || {function,_,FunName,0,F}=_F1<-Forms, FunName=/=foo]), | |
end || {function,_,FunName,0,F}<-Forms, FunName=/=foo]), |
lib/stdlib/src/edlin_expand.erl
Outdated
{ok, [{atom, _, Fun1}], _} -> | ||
shell(Fun1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{ok, [{atom, _, Fun1}], _} -> | |
shell(Fun1) | |
{ok, [{atom, _, Fun1}], _} -> shell(Fun1) |
{ok, [{atom, _, Fun1}], _} -> | ||
bif(Fun1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{ok, [{atom, _, Fun1}], _} -> | |
bif(Fun1) | |
{ok, [{atom, _, Fun1}], _} -> bif(Fun1) |
lib/stdlib/src/edlin_expand.erl
Outdated
true -> "shell_default"; | ||
_ -> bif(Fun) | ||
end | ||
shell_default_or_bif(Fun1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For keep one code style, it make sense to move this API call, to line 753
.
records that have field default values containing variables that are "free" was unsafe in functions that have variables with the same name. This commit creates init function for records to protect the variables in the default value.
e.g.
-record(r, {f = fun(X)->case X of {y, Y} -> Y; _ -> X end, g=..., h=abc}). foo(X)->#r{}. --> foo(X)->(r_init()){}.
r_init() will only initialize fields that will not be updated e.g.
foo(X)->#r{f=X} --> foo(X)->(r_init_f()){f=X}.
r_init_f will only initialize g and h with its default value, f will be initialized to undefined.
r_init() functions will not be generated if all fields of the record that contains "free variables" are initialized by the user.
e.g.
foo(X)->#r{f=X,g=X}. --> foo(X)->{r,X,X,abc}.
closes #9317