-
Notifications
You must be signed in to change notification settings - Fork 166
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BindPython ABI #2796
BindPython ABI #2796
Conversation
cpython_bindings.py will contain declarations for all the CPython C API function required
The right solution here is to have an ASR->ASR pass that takes the If you think you can get something working with relatively short code, you can finish the PR, but I think having a pass is a better solution long term. |
split one `pass_python_bind` function into `generate_body`, `native_to_cpython`, `cpython_to_native`, and `pass_python_bind`
Yes. I am implementing this using an ASR to ASR pass. The |
@Shaikh-Ubaid, @certik, Please have a look at this. |
current_scope = parent_scope; | ||
visit_ImportFrom(*imports); | ||
current_scope = function_scope; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these needed in AST->ASR?
Ideally the initial ASR has nothing to do with Python C/API. Then the ASR->ASR pass translates bind(python)
into C/API calls in ASR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly, even I was confused in my last meet with Vipul. Looking into the changes, I think I understand what he is doing here.
He has a new runtime module cpython_bindings.py
which contains the declarations to the C
functions in Python.h
that we need. In the above change, he is just creating AST
code for the following:
from cpython_bindings import Py_Initialize, Py_IsInitialized, PyRun_SimpleString, Py_DecodeLocale, PySys_SetArgv, Py_FinalizeEx, PyUnicode_FromString, PyUnicode_AsUTF8AndSize, PyImport_Import, Py_DecRef, Py_IncRef, PyObject_GetAttrString, PyTuple_New, PyTuple_SetItem, PyObject_CallObject, PyLong_AsLongLong, PyLong_AsUnsignedLongLong, PyLong_FromLongLong, PyLong_FromUnsignedLongLong, PyFloat_FromDouble, PyFloat_AsDouble, PyBool_FromLong, PyObject_IsTrue
When he visits the AST
code, lpython
reads/loads the runtime/cpython_bindings.py
and automatically generates the declarations for the above BindC
functions.
@Vipul-Cariappa Initially this approach sounded interesting to me, but now after going through the changes, it seems like we should instead write the ASR code for these directly in the python_bind
pass. I think if we simply extend the current pybind_funcs
array with the parameter types and return type info (for example specifying types like here
lpython/src/libasr/codegen/asr_to_wasm.cpp
Line 206 in 9374feb
m_wa.define_func({i64}, {}, {i64, i64, i64, i64}, "print_i64", [&](){ |
runtime/cpython_bindings.py
file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I will update the code to generate the required function declarations in the python_bind
pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initially this approach sounded interesting to me, but now after going through the changes, it seems like we should instead write the ASR code for these directly in the
python_bind
pass.
Yes, exactly.
Otherwise it looks like the right approach, it seems you are doing the type translation to/from Python in the ASR->ASR pass already, which is perfect. |
src/bin/lpython.cpp
Outdated
@@ -321,6 +321,7 @@ int emit_c(const std::string &infile, | |||
pass_manager.use_default_passes(true); | |||
compiler_options.po.always_run = true; | |||
compiler_options.po.run_fun = "f"; | |||
compiler_options.po.c_backend = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have --backend=c
which means the backend is c
. Another c_backend
could seem confusing. I think we should call it something like c_skip_bindpy_pass
.
We should also enable this flag in emit_c_to_file()
.
src/libasr/pass/python_bind.cpp
Outdated
// backend as it would be handled by this ASR pass. | ||
return; | ||
} | ||
bool bindpython_used = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need not check for bindpython_used
here, and just assume that bindpython is used when --enable-cpython
is provided.
We skip this whole pass if --enable-cpython
is not provided.
src/libasr/pass/python_bind.cpp
Outdated
ASR::symbol_t *sym_PyRun_SimpleString = parent_scope.get_symbol("PyRun_SimpleString"); | ||
Vec<ASR::call_arg_t> args_PyRun_SimpleString; | ||
s.from_str(al, "import sys; sys.path.append(\".\")"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to run this? I think the current example in integration_tests/bindpy_05.py
works without it, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the bindpy_05.py
example does not require this. But it is required for other tests example bindpy_01.py
.
We append the current path to CPython's module search path. If we do not do this, CPython will fail with ModuleNotFoundError
, because it only looks for modules installed using pip or the builtins/CPython standard library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think
lpython/integration_tests/bindpy_05.py
Lines 72 to 73 in 9374feb
argv1: CPtr = Py_DecodeLocale("", empty_c_void_p()) | |
PySys_SetArgv(1, pointer(argv1, i64)) |
.
to the path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can try this.
What does pointer(argv1, i64)
do? What does the pointer
function do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figured it out.
I am no longer using PyRun_SimpleString
and appending the path. I am using Py_DecodeLocale
and PySys_SetArgv
.
src/libasr/pass/python_bind.cpp
Outdated
// reassignment | ||
f.m_body = body.p; | ||
f.n_body = body.n; | ||
ASRUtils::get_FunctionType(f)->m_abi = ASR::abiType::BindC; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason to make the abi
as BindC
? I think we should be making it Source
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Will change it to Source in the next commit.
Vec<ASR::call_arg_t> args_PyLong_AsLongLong; | ||
args_PyLong_AsLongLong.reserve(al, 1); | ||
args_PyLong_AsLongLong.push_back(al, {f.base.base.loc, exp}); | ||
conv_result = ASRUtils::EXPR(ASR::make_Cast_t(al, f.base.base.loc, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to do Cast_t
here (and the rest of the types below)? Isn't the returned value by the specific function (in this case PyLong_AsLongLong
) already of the desired type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The signature of PyLong_AsLongLong
is
long long PyLong_AsLongLong(PyObject*)
. i.e.
i64 PyLong_AsLongLong(CPtr)
.
Python does not have PyLongAsInt
that returns an int
(i32
).
The Cast_t
is required in the following situation.
Assume the following PythonBind function
@pythoncall(module="mypylib")
def f(a: i32) -> i32:
pass
or
@pythoncall(module="mypylib")
def f(a: i8) -> i8:
pass
The Cast_t
casts the result of PyLong_AsLongLong
from i64
to i32
or i8
according to the actual type required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Thanks.
current_scope = parent_scope; | ||
visit_ImportFrom(*imports); | ||
current_scope = function_scope; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly, even I was confused in my last meet with Vipul. Looking into the changes, I think I understand what he is doing here.
He has a new runtime module cpython_bindings.py
which contains the declarations to the C
functions in Python.h
that we need. In the above change, he is just creating AST
code for the following:
from cpython_bindings import Py_Initialize, Py_IsInitialized, PyRun_SimpleString, Py_DecodeLocale, PySys_SetArgv, Py_FinalizeEx, PyUnicode_FromString, PyUnicode_AsUTF8AndSize, PyImport_Import, Py_DecRef, Py_IncRef, PyObject_GetAttrString, PyTuple_New, PyTuple_SetItem, PyObject_CallObject, PyLong_AsLongLong, PyLong_AsUnsignedLongLong, PyLong_FromLongLong, PyLong_FromUnsignedLongLong, PyFloat_FromDouble, PyFloat_AsDouble, PyBool_FromLong, PyObject_IsTrue
When he visits the AST
code, lpython
reads/loads the runtime/cpython_bindings.py
and automatically generates the declarations for the above BindC
functions.
@Vipul-Cariappa Initially this approach sounded interesting to me, but now after going through the changes, it seems like we should instead write the ASR code for these directly in the python_bind
pass. I think if we simply extend the current pybind_funcs
array with the parameter types and return type info (for example specifying types like here
lpython/src/libasr/codegen/asr_to_wasm.cpp
Line 206 in 9374feb
m_wa.define_func({i64}, {}, {i64, i64, i64, i64}, "print_i64", [&](){ |
runtime/cpython_bindings.py
file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work, Vipul! Shared some comments. Rest looks good.
src/libasr/pass/python_bind.cpp
Outdated
F64, | ||
PTR, | ||
PTR_TO_PTR, | ||
}; | ||
|
||
struct F { | ||
std::string m_name; | ||
std::vector<enum T> args; | ||
enum T retvar; | ||
}; | ||
|
||
ASR::expr_t *type_enum_to_asr_expr(Allocator &al, enum T t, Location &l, std::string n, SymbolTable *current_scope, ASR::intentType intent) { | ||
ASR::ttype_t *type = nullptr; | ||
|
||
Str s; | ||
s.from_str(al, n); | ||
|
||
switch (t) { | ||
case VOID: | ||
return nullptr; | ||
case I1: | ||
type = ASRUtils::TYPE(ASR::make_Logical_t(al, l, 4)); | ||
break; | ||
case I8: | ||
type = ASRUtils::TYPE(ASR::make_Integer_t(al, l, 1)); | ||
break; | ||
case I32: | ||
type = ASRUtils::TYPE(ASR::make_Integer_t(al, l, 4)); | ||
break; | ||
case I64: | ||
type = ASRUtils::TYPE(ASR::make_Integer_t(al, l, 8)); | ||
break; | ||
case U8: | ||
type = ASRUtils::TYPE(ASR::make_UnsignedInteger_t(al, l, 1)); | ||
break; | ||
case U32: | ||
type = ASRUtils::TYPE(ASR::make_UnsignedInteger_t(al, l, 4)); | ||
break; | ||
case U64: | ||
type = ASRUtils::TYPE(ASR::make_UnsignedInteger_t(al, l, 8)); | ||
break; | ||
case F32: | ||
type = ASRUtils::TYPE(ASR::make_Real_t(al, l, 4)); | ||
break; | ||
case F64: | ||
type = ASRUtils::TYPE(ASR::make_Real_t(al, l, 8)); | ||
break; | ||
case STR: | ||
type = ASRUtils::TYPE(ASR::make_Character_t(al, l, 1, -2, nullptr)); | ||
break; | ||
case PTR: | ||
type = ASRUtils::TYPE(ASR::make_CPtr_t(al, l)); | ||
break; | ||
case PTR_TO_PTR: | ||
type = ASRUtils::TYPE(ASR::make_Pointer_t(al, l, ASRUtils::TYPE(ASR::make_CPtr_t(al, l)))); | ||
break; | ||
} | ||
LCOMPILERS_ASSERT(type); | ||
ASR::symbol_t *v = ASR::down_cast<ASR::symbol_t>(ASR::make_Variable_t(al, l, current_scope, s.c_str(al), nullptr, | ||
0, intent, nullptr, nullptr, ASR::storage_typeType::Default, | ||
type, nullptr, ASR::abiType::BindC, ASR::Public, | ||
ASR::presenceType::Required, true)); // intent == ASRUtils::intent_in ? true : false | ||
current_scope->add_symbol(n, v); | ||
return ASRUtils::EXPR(ASR::make_Var_t(al, l, v)); | ||
} | ||
|
||
void declare_functions(Allocator &al, std::vector<struct F> fns, SymbolTable *parent_scope, | ||
std::string header_name="") { | ||
Location *l = al.make_new<Location>(); | ||
l->first = 0; | ||
l->last = 0; | ||
|
||
Str s; | ||
char *c_header = nullptr; | ||
if (header_name != "") { | ||
s.from_str(al, header_name); | ||
c_header = s.c_str(al); | ||
} | ||
|
||
for (auto i: fns) { | ||
Vec<ASR::expr_t*> args; | ||
args.reserve(al, i.args.size()); | ||
s.from_str(al, i.m_name); | ||
SymbolTable *current_scope = al.make_new<SymbolTable>(parent_scope); | ||
int c = 0; | ||
for (auto j: i.args) { | ||
args.push_back(al, type_enum_to_asr_expr(al, j, *l, i.m_name + std::to_string(++c), current_scope, | ||
ASRUtils::intent_in)); | ||
} | ||
ASR::expr_t *retval = type_enum_to_asr_expr(al, i.retvar, *l, "_lpython_return_variable", current_scope, | ||
ASRUtils::intent_return_var); | ||
char *fn_name = s.c_str(al); | ||
ASR::asr_t *f = ASRUtils::make_Function_t_util(al, *l, current_scope, fn_name, nullptr, 0, args.p, args.n, | ||
nullptr, 0, retval, ASR::abiType::BindC, ASR::accessType::Public, | ||
ASR::deftypeType::Interface, nullptr, false, false, false, false, false, nullptr, 0, | ||
false, false, false, c_header); | ||
|
||
parent_scope->add_symbol(i.m_name, ASR::down_cast<ASR::symbol_t>(f)); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about moving these functions to asr_utils.cpp
.
These are generic functions that can be used to create new BindC
Interface
function declarations.
Any thoughts? @certik @Shaikh-Ubaid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me. Also see/consider if pass_utils
can be a better option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel asr_utils.cpp
is the better option. In the future, if we need to generate function declarations for using any C function, we could just use these functions. Putting it in asr_utils.cpp
will let us use it in the AST to ASR step also (if necessary).
@Shaikh-Ubaid, I believe if the CI passes and these changes look good to you, then we can merge. |
src/libasr/pass/python_bind.cpp
Outdated
fns.push_back({"PyBool_FromLong", {ASRUtils::I32}, ASRUtils::PTR}); | ||
fns.push_back({"PyObject_IsTrue", {ASRUtils::PTR}, ASRUtils::I32}); | ||
|
||
bool included_cpython_funcs = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think included_cpython_funcs
is not needed. We can simply call ASRUtils::declare_functions(al, fns, *l, unit.m_symtab, "Python.h")
here. We skip the whole python_bind
pass if --enable-cpython
is not provided.
Also suggested here #2796 (comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The --enable-cpython
flag sets CompilerOptions.enable_cpython
to true. We do not have access to CompilerOptions
within an ASR pass. We only have access to PassOptions
.
Should I move enable_cpython
to PassOptions
then? Or have it in both places in CompilerOptions
and PassOptions
, so that we don't break any other code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can move it to pass options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have made the changes. Hope I understood it correctly and this is what you wanted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shared one comment here https://github.com/lcompilers/lpython/pull/2796/files#r1713010953. I think its good to merge after that. Thanks for this @Vipul-Cariappa!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job. After @Shaikh-Ubaid is happy, this can be merged. It looks clean and small, the pass has less than 460 lines and it does everything. I think this is the way to go.
Is this now used by default for all Python bindings, meaning all our existing tests pass with this new approach?
@certik |
Co-authored-by: Shaikh Ubaid <shaikhubaid769@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perfect, thanks Vipul! It looks great! I appreciate it.
No description provided.