ctypes-binding-generator is a Python package that generates ctypes bindings from C source files. It runs under both Python 2 and Python 3, and the bindings it generates are compatible with both. It requires libclang to parse source files.
ctypes-binding-generator provides a command-line program called cbind. You can use it to generate a ctypes binding for, say, stdio.h:
$ cbind -i /usr/include/stdio.h -o stdio.py -l libc.so.6 \
    -- -I/usr/local/lib/clang/3.4/include
Note that /usr/local/lib/clang/3.4/include is needed for stddef.h and similar compiler-provided headers. You can then test the generated ctypes binding of stdio.h:
$ python -c 'import stdio; stdio.printf("hello world\n")'
hello world
In fact, cbind by default uses a libclang binding that was itself generated by cbind. You can regenerate that binding with:
$ cbind -i /path/to/clang/include/clang-c/Index.h \
    -l libclang.so \
    -o cbind/min_cindex.py \
    --config demo/cindex.yaml \
    -- -I /usr/local/lib/clang/3.4/include
If you would like cbind to use the official libclang binding maintained by the Clang project, run cbind with the --cindex clang-cindex flag.
In the example above we generated the libclang binding with cbind. The "raw" binding alone would not be very useful, however, so we also provided the configuration file demo/cindex.yaml, which guides cbind to generate an object-oriented interface on top of the raw binding (as the suffix suggests, the configuration file is in YAML format). The configuration file is essentially a YAML mapping. The supported top-level keys of the mapping are:
- preamble
- import
- rename
- enum
- errcheck
- method
- mixin
We introduce each of them below.
The preamble top-level key maps to a string that will be inserted into the beginning of the output binding. Typically it is used to import helper Python modules. Alternatively, preamble maps to a mapping that supports the following keys:
- codes: A string of code that will be inserted.
- library: A string giving the name of the shared library.
- use_custom_loader: (Optional) Boolean value; if true, the codes string will be used as the library loader, and the default loader code will not be inserted.
All other top-level keys map to a list of matchers and actions. Matchers are tried in order, and only the action of the first matcher that matches the syntax tree node is performed. An action is a key-value pair whose key is the same as the top-level key. For example, a top-level key "import" with one matcher and one action might look like this:
import:
  - name: ^clang_createIndex$
    import: True
A matcher is a mapping that specifies how to match a syntax tree node. It supports the following keys:
- argtypes: A list of regular expressions matching function argument types.
- name: A regular expression matching syntax tree node's name.
- parent: A matcher matching syntax tree node's parent.
- restype: A regular expression matching function return type.
Note that the type string being matched is the ctypes binding for that type, i.e., Python code rather than C code.
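The first-match rule described above can be sketched in a few lines of Python. This is only an illustration, not cbind's actual implementation: the node dictionaries, the function names node_matches and first_action, and the use of re.search (rather than a full-string match) are all assumptions made for the example.

```python
import re


def node_matches(matcher, node):
    """Return True if every matcher key that is present accepts the node.

    Both `matcher` and `node` are plain dicts here (a hypothetical
    representation chosen for this sketch).
    """
    if "name" in matcher and not re.search(matcher["name"], node["name"]):
        return False
    if "restype" in matcher:
        if not re.search(matcher["restype"], node.get("restype", "")):
            return False
    if "argtypes" in matcher:
        patterns = matcher["argtypes"]
        argtypes = node.get("argtypes", [])
        if len(patterns) != len(argtypes):
            return False
        if not all(re.search(p, a) for p, a in zip(patterns, argtypes)):
            return False
    if "parent" in matcher:
        parent = node.get("parent")
        if parent is None or not node_matches(matcher["parent"], parent):
            return False
    return True


def first_action(rules, key, node):
    """Return the action of the first matching rule, or None.

    None is returned both when no rule matches and when the matching
    rule has an empty action; a real implementation would likely
    distinguish the two cases.
    """
    for rule in rules:
        if node_matches(rule, node):
            return rule.get(key)
    return None


# Two "import" rules: the first one that matches wins.
rules = [
    {"name": "^clang_createIndex$", "import": True},
    {"name": "^clang_", "import": False},
]
print(first_action(rules, "import", {"name": "clang_createIndex"}))
print(first_action(rules, "import", {"name": "clang_disposeIndex"}))
```

Because the second rule is only consulted when the first one fails, ordering the rules from most specific to most general gives exception-style behaviour.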
The import top-level key determines which syntax tree nodes are imported into (added to) the output Python binding. The (optional) action is Boolean valued; if true, the matched syntax tree node will be imported. If the import top-level key is not present, cbind imports only the syntax tree nodes that come from the input files.
The rename top-level key changes the names of output syntax tree nodes. The simplest action value is a substitution string, whose accompanying regular expression is the one given in the "name" matcher key. For example, this matches node names containing "CX" and removes that prefix:
rename:
  - name: CX(\w+)
    rename: \1
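The effect of this rule can be reproduced with Python's re.sub. The sketch below assumes that cbind applies the pattern and substitution string the way re.sub does; the helper name rename_simple is made up for the example.

```python
import re


def rename_simple(name, pattern=r"CX(\w+)", replacement=r"\1"):
    """Apply one pattern/substitution pair, as in the rule above."""
    return re.sub(pattern, replacement, name)


print(rename_simple("CXCursor"))           # Cursor
print(rename_simple("CXTranslationUnit"))  # TranslationUnit
```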
The action value can also be a list of regular expressions and substitutions, applied in order. For example, the following matches node names containing "CXCursor_" or "CXLinkage_", inserts an underscore at each lowercase-to-uppercase boundary, and then applies the upper() function to the part of the name after the prefix, effectively replacing CamelStyle with UNDERSCORE_STYLE:
rename:
  - name: CX(Cursor|Linkage)_
    rename:
      - pattern: '([a-z])([A-Z])'
        replace: \1_\2
      - pattern: CX(Cursor|Linkage)_(\w+)
        function: 'lambda match: match.group(2).upper()'
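To see what the two steps do to a concrete name, here is a minimal sketch that applies the same list of rules with re.sub. It assumes the rules are applied sequentially to the result of the previous step; the RULES list and apply_rename_rules are illustrative names, and the function is written as a real lambda instead of the string that the YAML file contains.

```python
import re

# Same two steps as the YAML rule above.
RULES = [
    {"pattern": r"([a-z])([A-Z])", "replace": r"\1_\2"},
    {"pattern": r"CX(Cursor|Linkage)_(\w+)",
     "function": lambda match: match.group(2).upper()},
]


def apply_rename_rules(name, rules=RULES):
    """Apply each pattern in turn, feeding the result to the next one."""
    for rule in rules:
        # re.sub accepts either a substitution string or a callable.
        replacement = rule["replace"] if "replace" in rule else rule["function"]
        name = re.sub(rule["pattern"], replacement, name)
    return name


# CXCursor_UnexposedDecl -> CXCursor_Unexposed_Decl -> UNEXPOSED_DECL
print(apply_rename_rules("CXCursor_UnexposedDecl"))
# CXLinkage_NoLinkage -> CXLinkage_No_Linkage -> NO_LINKAGE
print(apply_rename_rules("CXLinkage_NoLinkage"))
```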
The enum top-level key generates extra binding code around enum constant declarations. The action value is a Python format string; the supported keys are:
- enum_name: The name of enum declaration.
- enum_type: The integral type of enum values.
- enum_field: The name of enum constants.
- enum_value: The value of that constant.
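As an illustration of how such a format string might be expanded, the sketch below fills a template once per enum constant. Two things here are assumptions rather than documented behaviour: that the template uses str.format-style placeholders such as {enum_name}, and the template text itself (the register call is hypothetical).

```python
# Hypothetical action value; only the four placeholder names come from
# the list above.
TEMPLATE = "{enum_name}.register('{enum_field}', {enum_type}({enum_value}))"


def render_enum_action(template, enum_name, enum_type, enum_field, enum_value):
    """Expand the template for one enum constant."""
    return template.format(
        enum_name=enum_name,
        enum_type=enum_type,
        enum_field=enum_field,
        enum_value=enum_value,
    )


line = render_enum_action(TEMPLATE, "CursorKind", "c_uint", "UNEXPOSED_DECL", 1)
print(line)
```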
The errcheck top-level key attaches an errcheck function to ctypes functions. If the action value is empty, no errcheck function is attached. Because cbind applies matchers sequentially, you can use an empty action to exempt specific ctypes functions from a more general rule. For example, this attaches check_cursor as the errcheck of every function whose return type is Cursor, except for the clang_getNullCursor function:
errcheck:
  - name: clang_getNullCursor
    errcheck:
  - restype: Cursor
    errcheck: check_cursor
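For readers unfamiliar with the ctypes errcheck protocol, the following sketch shows what attaching such a function means at the ctypes level. ctypes calls the errcheck callable with (result, func, arguments) after the foreign function returns, and whatever it returns becomes the result of the call. To keep the example runnable without libclang, a Python callback wrapped in a CFUNCTYPE prototype stands in for a C function; decrement and check_nonnegative are illustrative names, not part of cbind or libclang.

```python
import ctypes


def check_nonnegative(result, func, arguments):
    """errcheck hook: validate the raw result and pass it through."""
    if result < 0:
        raise ValueError("call failed with result %d" % result)
    return result


# Stand-in for a C function of type int (*)(int).
PROTOTYPE = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)
decrement = PROTOTYPE(lambda value: value - 1)

# This assignment is what an errcheck rule would arrange for every
# matched function in the generated binding.
decrement.errcheck = check_nonnegative

print(decrement(5))
```

Calling decrement(0) would raise ValueError, since the checker sees the result -1.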
The method top-level key matches ctypes functions and adds these functions to ctypes classes as methods. The action value is a "class.method"-style string.
The mixin top-level key inserts mix-in classes when subclassing ctypes Structure and Union, and when generating a subclass for a C enum. Note that the mix-in classes are placed first in the list of base classes of the subclass definition, so that they can override methods of the ctypes classes. For example, if Foo is a C struct, given the config below:
mixin:
  - name: ^Foo$
    mixin: [FooMixin]
The output binding would look like:
class Foo(FooMixin, Structure):
    pass
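The reason the base-class order matters can be seen in a small self-contained example. The field x and the __repr__ override are invented for illustration; only the class names Foo and FooMixin come from the config above.

```python
import ctypes


class FooMixin(object):
    """Hand-written helper methods; assumes the struct has a field x."""

    def __repr__(self):
        return "Foo(x=%d)" % self.x


class Foo(FooMixin, ctypes.Structure):
    # In a generated binding the fields would come from the C header.
    _fields_ = [("x", ctypes.c_int)]


foo = Foo(3)
print(repr(foo))
```

Because FooMixin precedes Structure in the method resolution order, its __repr__ is found first and replaces the default one inherited from the ctypes base class.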
Since macros are an important part of C headers, cbind can translate simple C macros to Python code. Complicated macros that cbind cannot understand have to be translated manually. Let's consider the Linux input.h header as an example, and write a small program that dumps input events, such as mouse movements.
To enable macro translation, pass the --enable-macro flag to cbind.
$ cbind -i /usr/include/linux/input.h -o demo/linux_input.py -v \
    --enable-macro \
    -- -I/usr/local/lib/clang/3.4/include
macro.py: Could not parse macro: #define EVIOCGID (((2U) << (((0 +8)+8)+14)) | ((('E')) << (0 +8)) | (((0x02)) << 0) | ((((sizeof(struct input_id)))) << ((0 +8)+8)))
macro.py: Could not parse macro: #define EVIOCGREP (((2U) << (((0 +8)+8)+14)) | ((('E')) << (0 +8)) | (((0x03)) << 0) | ((((sizeof(unsigned int[2])))) << ((0 +8)+8)))
macro.py: Could not parse macro: #define EVIOCSREP (((1U) << (((0 +8)+8)+14)) | ((('E')) << (0 +8)) | (((0x03)) << 0) | ((((sizeof(unsigned int[2])))) << ((0 +8)+8)))
macro.py: Could not parse macro: #define EVIOCGKEYCODE (((2U) << (((0 +8)+8)+14)) | ((('E')) << (0 +8)) | (((0x04)) << 0) | ((((sizeof(unsigned int[2])))) << ((0 +8)+8)))
macro.py: Could not parse macro: #define EVIOCGKEYCODE_V2 (((2U) << (((0 +8)+8)+14)) | ((('E')) << (0 +8)) | (((0x04)) << 0) | ((((sizeof(struct input_keymap_entry)))) << ((0 +8)+8)))
macro.py: Could not parse macro: #define EVIOCSKEYCODE (((1U) << (((0 +8)+8)+14)) | ((('E')) << (0 +8)) | (((0x04)) << 0) | ((((sizeof(unsigned int[2])))) << ((0 +8)+8)))
macro.py: Could not parse macro: #define EVIOCSKEYCODE_V2 (((1U) << (((0 +8)+8)+14)) | ((('E')) << (0 +8)) | (((0x04)) << 0) | ((((sizeof(struct input_keymap_entry)))) << ((0 +8)+8)))
macro.py: Could not parse macro: #define EVIOCGABS(abs) (((2U) << (((0 +8)+8)+14)) | ((('E')) << (0 +8)) | (((0x40 + (abs))) << 0) | ((((sizeof(struct input_absinfo)))) << ((0 +8)+8)))
macro.py: Could not parse macro: #define EVIOCSABS(abs) (((1U) << (((0 +8)+8)+14)) | ((('E')) << (0 +8)) | (((0xc0 + (abs))) << 0) | ((((sizeof(struct input_absinfo)))) << ((0 +8)+8)))
macro.py: Could not parse macro: #define EVIOCSFF (((1U) << (((0 +8)+8)+14)) | (('E') << (0 +8)) | ((0x80) << 0) | ((sizeof(struct ff_effect)) << ((0 +8)+8)))
Note that we pass the -v flag to cbind, which enables verbose output, so cbind reports the macros that it cannot understand. Not all of them are beyond cbind's reach, however; it just needs some hints. Thanks to Clang, cbind can translate constant integer expressions, but you have to tell it which macros are in fact integer expressions with --macro-int.
$ cbind -i /usr/include/linux/input.h -o demo/linux_input.py -v \
    --enable-macro --macro-int EVIO \
    -- -I/usr/local/lib/clang/3.4/include
macro.py: Could not parse macro: #define EVIOCGABS(abs) (((2U) << (((0 +8)+8)+14)) | ((('E')) << (0 +8)) | (((0x40 + (abs))) << 0) | ((((sizeof(struct input_absinfo)))) << ((0 +8)+8)))
macro.py: Could not parse macro: #define EVIOCSABS(abs) (((1U) << (((0 +8)+8)+14)) | ((('E')) << (0 +8)) | (((0xc0 + (abs))) << 0) | ((((sizeof(struct input_absinfo)))) << ((0 +8)+8)))
The remaining two macros take an argument, so you have to translate them manually:
EVIOCGABS = lambda abs: (2 << 30) | (ord('E') << 8) | (0x40 + abs) | (sizeof(input_absinfo) << 16)
EVIOCSABS = lambda abs: (1 << 30) | (ord('E') << 8) | (0xc0 + abs) | (sizeof(input_absinfo) << 16)
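As a sanity check on the manual translation, the sketch below recomputes the two request codes from the bit layout visible in the macro expansions (number at bit 0, type character at bit 8, size at bit 16, direction at bit 30). The input_absinfo class here is a stand-in for the generated one, declared with six 32-bit signed fields as in linux/input.h, and the helper name ioc is made up for the example.

```python
import ctypes


class input_absinfo(ctypes.Structure):
    # Stand-in for the struct in the generated binding.
    _fields_ = [(field, ctypes.c_int32) for field in
                ("value", "minimum", "maximum", "fuzz", "flat", "resolution")]


def ioc(direction, type_char, number, size):
    """Pack an ioctl request code using the shifts from the macros above."""
    return (direction << 30) | (ord(type_char) << 8) | number | (size << 16)


def EVIOCGABS(axis):
    return ioc(2, "E", 0x40 + axis, ctypes.sizeof(input_absinfo))


def EVIOCSABS(axis):
    return ioc(1, "E", 0xc0 + axis, ctypes.sizeof(input_absinfo))


print(hex(EVIOCGABS(0)))  # 0x80184540
print(hex(EVIOCSABS(0)))  # 0x401845c0
```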
Under the demo/ directory there is an evtest program that uses the linux_input.py binding we generated. It requires root permission to access the device file. Press Ctrl-C to stop evtest.
$ sudo demo/evtest /dev/input/event0
input driver version : 1.0.1
input device ID : bus 0x3 vendor 0x46d product 0xc05b version 0x111
input device name : 'Logitech USB Optical Mouse'
supported events:
event type 0 (Sync)
event type 1 (Key)
event code 272 (LeftBtn)
event code 273 (RightBtn)
event code 274 (MiddleBtn)
event code 275 (SideBtn)
event code 276 (ExtraBtn)
event code 277 (ForwardBtn)
event code 278 (BackBtn)
event code 279 (TaskBtn)
event type 2 (Relative)
event code 0 (X)
event code 1 (Y)
event code 6 (HWheel)
event code 8 (Wheel)
event type 4 (Misc)
event code 4 (ScanCode)
testing ... (interrupt to exit)
event: time 1374999609.141463, type 2 (Relative), code 0 (X), value 1
event: time 1374999609.141466, type 2 (Relative), code 1 (Y), value -1
event: time 1374999609.141472, -------------- Report Sync ------------
event: time 1374999609.149452, type 2 (Relative), code 0 (X), value 4
event: time 1374999609.149454, type 2 (Relative), code 1 (Y), value -1
event: time 1374999609.149459, -------------- Report Sync ------------
^C
You should see evtest show the driver and device info and the supported events, and then dump input events as they arrive.