dear list,
c2ffi is a tool written in C++ that uses the llvm/clang libs to generate a json file from C headers. it's like gcc-xml, only much better. the json file has explicit offsets and sizes, etc... see an example here:
https://github.com/rpav/ZMQ4L/blob/master/src/autospec/zmq.x86_64-pc-linux-g...
once this json has been generated for your platform, it can be checked into the repo and no external tool is needed afterwards.
cl-autowrap uses such json files to generate an alternative FFI API, but i'd like to use vanilla CFFI, so i'm planning to write some code towards this direction, but i'd like to hear some input on this before i start working on anything...
i imagine it to be similar to how groveling works in iolib, namely: generate the intermediate files from the json files using some ASDF integration, and then compile the lisp files as any other lisp files.
my first question is why isn't this ASDF integration, or something like this, ported into cffi for the groveller? is there any other reason besides nobody has done it yet?
https://github.com/sionescu/iolib/blob/master/src/grovel/asdf.lisp
does anyone see some showstoppers? e.g. can i lay out defcstruct fields with explicit offsets? NOTICE-FOREIGN-STRUCT-DEFINITION suggests so.
should this code go into a cffi-c2ffi.asd or into a standalone project? any ideas for a name?
On Wed, Dec 9, 2015 at 9:10 PM, Attila Lendvai attila@lendvai.name wrote:
c2ffi is a tool written in C++ that uses the llvm/clang libs to generate a json file from C headers. it's like gcc-xml, only much better. the json file has explicit offsets and sizes, etc... see an example here:
https://github.com/rpav/ZMQ4L/blob/master/src/autospec/zmq.x86_64-pc-linux-g...
once this json has been generated for your platform, it can be checked into the repo and no external tool is needed afterwards.
That is very nice. The need for a C compiler during compilation of things that use cffi-grovel is a nuisance, particularly on Windows where, ironically, binary compatibility is easier.
cl-autowrap uses such json files to generate an alternative FFI API, but i'd like to use vanilla CFFI, so i'm planning to write some code towards this direction, but i'd like to hear some input on this before i start working on anything...
Cool!
i imagine it to be similar to how groveling works in iolib, namely: generate the intermediate files from the json files using some ASDF integration, and then compile the lisp files as any other lisp files.
I was thinking it'd be pretty neat to COMPILE-FILE the spec file directly using a custom readtable, but whatever's easier. :-)
my first question is why isn't this ASDF integration, or something like this, ported into cffi for the groveller? is there any other reason besides nobody has done it yet?
https://github.com/sionescu/iolib/blob/master/src/grovel/asdf.lisp
Not sure what you mean. This ASDF integration is in cffi-grovel.
BTW, the scope of the groveller is quite narrow. Things like c2ffi are for grabbing everything a header file has got to offer. The groveller way is just "hey, I know this constant exists, grab me its value please". But, if something like c2ffi turns out to work very well, then we can reimplement the groveller's API on top of it, sure.
Have you seen https://github.com/rpav/c2ffi-cffi?
does anyone see some showstoppers? e.g. can i lay out defcstruct fields with explicit offsets? NOTICE-FOREIGN-STRUCT-DEFINITION suggests so.
I don't see any showstoppers. cl-autowrap is built on top of CFFI, as is the groveller.
should this code go into a cffi-c2ffi.asd or into a standalone project? any ideas for a name?
cffi-c2ffi sounds good and I'd be happy to merge into CFFI.
Looking forward to your next steps!
Cheers,
i imagine it to be similar to how groveling works in iolib, namely: generate the intermediate files from the json files using some ASDF integration, and then compile the lisp files as any other lisp files.
I was thinking it'd be pretty neat to COMPILE-FILE the spec file directly using a custom readtable, but whatever's easier. :-)
that sounds like a nice idea but it may bite us later on because it'd provide less flexibility if we need to cross-reference things in the file and will make debugging also harder, including M-. navigation.
my first question is why isn't this ASDF integration, or something like this, ported into cffi for the groveller? is there any other reason besides nobody has done it yet?
https://github.com/sionescu/iolib/blob/master/src/grovel/asdf.lisp
Not sure what you mean. This ASDF integration is in cffi-grovel.
sorry, must have been blind and/or confused, i can see it now.
BTW, the scope of the groveller is quite narrow. Things like c2ffi are for grabbing everything a header file has got to offer. The groveller way is just "hey, I know this constant exists, grab me its value please". But, if something like c2ffi turns out to work very well, then we can reimplement the groveller's API on top of it, sure.
there are some issues with the c2ffi approach, too: even though it can extract the #define macro definitions, their value expressions still need to be evaluated somewhere by someone. maybe c2ffi could be extended to invoke the LLVM JIT and eval them?
except that macro values are not C code, e.g. what about macros that have inputs or do source code string concatenation? at first it sounds reasonable from an FFI perspective to just ignore the ones with inputs, but i have to admit that i didn't think enough about this.
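For context, the groveller-style way to get a macro's value is to let a C compiler evaluate it: a small C program is generated, compiled, and run, and it prints the constants in a form Lisp can read. The following is only a minimal sketch of that idea. The macro names and the `format_constant` helper are hypothetical and are not taken from cffi-grovel or c2ffi.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical macros standing in for #defines found in a C header. */
#define EXAMPLE_FLAG_A (1 << 4)
#define EXAMPLE_FLAG_B (EXAMPLE_FLAG_A | 2)

/* Writes a Lisp-readable constant definition into buf.
   The C compiler has already evaluated the macro expression by the
   time `value` is passed in. Returns what snprintf returns. */
int format_constant(char *buf, size_t size, const char *name, long value)
{
    assert(buf != NULL && name != NULL && strlen(name) > 0);
    return snprintf(buf, size, "(defconstant %s %ld)", name, value);
}
```

Calling `format_constant(buf, sizeof buf, "example-flag-b", EXAMPLE_FLAG_B)` fills `buf` with a form that a Lisp reader can consume. This only works for object-like macros whose expansion is a constant expression, which is the limitation raised above.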
another unresolved issue is inline functions in the header files.
a way to deal with these could be a hybrid approach, namely the generator could group the generated definitions into two or three files with different dependencies:
- one that needs support for call-by-value structs and thus depends on cffi-libffi
- one that needs cffi-grovel? the code could generate a grovel file e.g. to grovel static inline functions from C headers. and if c2ffi cannot evaluate #defines then those, too. the user could add hand-written grovel files also.
- a base set of definitions that only need simple cffi
and users could decide what to depend on.
Have you seen https://github.com/rpav/c2ffi-cffi?
oh, thanks for reminding me! i've seen it but it slipped out of my mind.
I don't see any showstoppers. cl-autowrap is built on top of CFFI, as is the groveller.
just to be sure: what i'm considering to work on will be a competitor to cl-autowrap.
initially i made quick progress in *using* cl-autowrap, but whenever i try to fix things, or extend it, or get to more complex usage, i get frustrated, and that's why i'm considering this project. plus i'm stuck with something now that needs pass by value.
On Thu, Dec 10, 2015 at 1:34 AM, Luís Oliveira luismbo@gmail.com wrote:
On Wed, Dec 9, 2015 at 9:10 PM, Attila Lendvai attila@lendvai.name wrote:
i imagine it to be similar to how groveling works in iolib, namely: generate the intermediate files from the json files using some ASDF integration, and then compile the lisp files as any other lisp files.
I was thinking it'd be pretty neat to COMPILE-FILE the spec file directly using a custom readtable, but whatever's easier. :-)
I think c2ffi can output sexps, not just json.
On Thu, Dec 10, 2015 at 11:55 AM, Stas Boukarev stassats@gmail.com wrote:
I think c2ffi can output sexps, not just json.
Seems sensible to compile those directly then.
I think c2ffi can output sexps, not just json.
the sexp output driver is half-baked.
json is the preferred one because it's more friendly towards non-lisp platforms.
On Fri, Dec 11, 2015 at 11:17 AM, Attila Lendvai attila@lendvai.name wrote:
json is the preferred one because it's more friendly towards non-lisp platforms.
Not to put too much emphasis on this issue, which is decidedly minor, but converting json to a (compile-file friendly) s-exp representation should be fairly trivial, right? Again, minor detail. ;-)
Not to put too much emphasis on this issue, which is decidedly minor, but converting json to a (compile-file friendly) s-exp representation should be fairly trivial, right? Again, minor detail. ;-)
i'm not sure i understand you Luis (especially "compile-file friendly"), but it sounds like i should, because you're suggesting a solution that may be better.
are you saying that the json output could be converted to some sexp form with a macro on the top, so that the macro could expand into the cffi definitions? and this file should be the one that gets checked in to the repos, not the json file? (as opposed to generating a standalone tmp lisp file from the json file holding the cffi definitions in the ASDF fasl cache; which is done already)
cl-json itself reads it into alists, so it shouldn't be hard... but what would we gain? one slight drawback would be that M-. would take us to the big toplevel macro, not the actual cffi definition.
anything i'm not aware of?
On Fri, Dec 11, 2015 at 5:09 PM, Attila Lendvai attila@lendvai.name wrote:
are you saying that the json output could be converted to some sexp form with a macro on the top, so that the macro could expand into the cffi definitions? and this file should be the one that gets checked in to the repos, not the json file? (as opposed to generating a standalone tmp lisp file from the json file holding the cffi definitions in the ASDF fasl cache; which is done already)
No need for the macro on the top, that macro could be defined a priori somewhere else in cffi-c2ffi.
cl-json itself reads it into alists, so it shouldn't be hard... but what would we gain? one slight drawback would be that M-. would take us to the big toplevel macro, not the actual cffi definition.
So if the resulting spec file looked something like
;; spec file starts here
(cffi-c2ffi:definition foo ...)
(cffi-c2ffi:definition bar ...)
(cffi-c2ffi:definition baz ...)
;; spec file ends here
Then M-. takes us to one of the definitions and the actual CFFI definition would be a macroexpansion away.
What do we gain? I'm not sure. I think this way is more lispy and less complex, but that's just a gut feeling at this point.
Cheers,
So if the resulting spec file looked something like
;; spec file starts here
(cffi-c2ffi:definition foo ...)
(cffi-c2ffi:definition bar ...)
(cffi-c2ffi:definition baz ...)
;; spec file ends here
ah, ok, this makes sense, thanks!
but it leads up to an issue i'm dealing with now:
the c2ffi output contains offset for each field in a struct, which enables us to generate partial bindings where not all struct fields need to have a defined type.
my current idea to deal with undefined types is the following:
- (cffi:defctype undefined :char "Used by cffi-c2ffi to mark types that are used in e.g. structs, but are not defined in the scope of the generation.")
- in a first phase collect all defined and referenced types. (this may be a PITA because i don't know all the ins and outs of C namespaces, e.g. where is it ok to reference a struct by only its name, as opposed to by "struct some-struct", and which namespaces are looked up in naked type references, etc.)
- in the emitting phase first emit defctype for each used but not defined type before emitting the definitions themselves
- while emitting definitions simplify them compatibly. e.g. the bindings of functions that take a pointer to an undefined type get simplified into a :pointer.
as you can see this needs a multi-phase whole-file processing, but i'm open for ideas.
the way cl-autowrap deals with this is that its struct framework doesn't complain for undefined types because it expects :offset for each field.
maybe defcstruct could be extended in a sane way with something like this?
the c2ffi output contains offset for each field in a struct, which enables us to generate partial bindings where not all struct fields need to have a defined type.
and on this note, where can i find info/examples on bitfield support in cffi? (as in { int bar : 3}, not DEFBITFIELD) IIRC it does support bitfields, but i couldn't find it in the code.
the OFFSET slot in FOREIGN-STRUCT-SLOT doesn't say anything about the unit it's assuming, but the way it's used in MEM-REF suggests it's in bytes (as opposed to bits), right?
as an example, this is what c2ffi emits:
struct {
  unsigned int f1 : 1;
  int f2 : 3;
} teststruct;

=>

{
  "type": {
    "fields": [
      {
        "type": { "type": { "tag": ":unsigned-int" }, "width": 1, "tag": ":bitfield" },
        "bit-alignment": 32,
        "bit-size": 32,
        "bit-offset": 0,
        "name": "f1",
        "tag": "field"
      },
      {
        "type": { "type": { "tag": ":int" }, "width": 3, "tag": ":bitfield" },
        "bit-alignment": 32,
        "bit-size": 32,
        "bit-offset": 1,
        "name": "f2",
        "tag": "field"
      }
    ],
    "bit-alignment": 32,
    "bit-size": 32,
    "location": "[...]/autospec/bluez.h:1:1",
    "id": 1,
    "name": "",
    "ns": 0,
    "tag": "struct"
  },
  "location": "[...]/autospec/bluez.h:4:3",
  "name": "teststruct",
  "tag": "const"
},
Attila Lendvai attila@lendvai.name writes:
the c2ffi output contains offset for each field in a struct, which enables us to generate partial bindings where not all struct fields need to have a defined type.
Is this because the user doesn't care about some slots or because the type may not be defined yet? If it's the former, you can just omit the slot, provided you pass the appropriate :size argument to defcstruct so that CFFI knows how big the struct is despite missing slots.
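To illustrate why an omitted slot is harmless when the total size and the remaining offsets are known, here is a small C sketch. The struct and the `read_int32_at` helper are hypothetical. The sketch reads one field purely from its byte offset and never needs to know the type of the field in between.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical struct; imagine the type of `opaque` is not defined
   in the generated bindings, so that slot is left out. */
struct example {
    char    tag;
    double  opaque;
    int32_t count;
};

/* Reads an int32_t stored `offset` bytes into an object of `size` bytes.
   Only the total size and the byte offset are needed, not the layout
   of the other fields. */
int32_t read_int32_at(const void *base, size_t size, size_t offset)
{
    int32_t value;
    assert(base != NULL && offset + sizeof value <= size);
    memcpy(&value, (const unsigned char *)base + offset, sizeof value);
    return value;
}
```

With `struct example e = { 'x', 1.5, 42 };`, the call `read_int32_at(&e, sizeof e, offsetof(struct example, count))` yields the `count` field. A binding that records the struct size plus the offset of each slot it does define can work the same way.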
my current idea to deal with undefined types is the following:
- (cffi:defctype undefined :char "Used by cffi-c2ffi to mark types that are used in e.g. structs,
but are not defined in the scope of the generation.")
This idea suggests you don't care about the slot; just omit it then.
- in a first phase collect all defined and referenced types. (this may be a PITA because i don't know all the ins and outs of C namespaces, e.g. where is it ok to reference a struct by only its name, as opposed to by "struct some-struct", and which namespaces are looked up in naked type references, etc.)
It's only possible to reference a struct with "some_struct" when there's a "typedef struct some_struct some_struct".
- in the emitting phase first emit defctype for each used but not defined type before emitting the definitions themselves
Can't you sort the definitions in topological order? https://en.wikipedia.org/wiki/Topological_sorting
That should ensure definitions would always show up before their uses, right?
- while emitting definitions simplify them compatibly. e.g. the bindings of functions that take a pointer to an undefined type get simplified into a :pointer.
the way cl-autowrap deals with this is that its struct framework doesn't complain for undefined types because it expects :offset for each field.
It also needs to know the size of the struct, right?
maybe defcstruct could be extended in a sane way with something like this?
You know what, we've had a similar discussion before, 8 years ago! :-) http://thread.gmane.org/gmane.lisp.cffi.devel/1116. It might be possible to delay the typing for defcstruct, but it'd be simpler (well, from my point of view) if you handled this during the binding generation.
Attila Lendvai attila@lendvai.name writes:
the c2ffi output contains offset for each field in a struct, which enables us to generate partial bindings where not all struct fields need to have a defined type.
and on this note, where can i find info/examples on bitfield support in cffi? (as in { int bar : 3}, not DEFBITFIELD) IIRC it does support bitfields, but i couldn't find it in the code.
defcstruct doesn't support bitfields, no. It shouldn't be too hard to implement, though. The harder bit is figuring out the packing layout, but, if we know the offsets, then we can skip that.
the OFFSET slot in FOREIGN-STRUCT-SLOT doesn't say anything about the unit it's assuming, but the way it's used in MEM-REF suggests it's in bytes (as opposed to bits), right?
Yes, bytes.
It shouldn't be too hard to add :bit-offset and :bit-width options. Let me know if you need help with that.
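As a rough illustration of what accessors built on a bit offset and a bit width would have to do, the following C sketch extracts a field from a 32-bit storage unit by shifting and masking. The function names are hypothetical. It also assumes bits are allocated starting from the least significant bit, as GCC and Clang do on little-endian x86_64; the C standard leaves bitfield layout implementation-defined, so this is an assumption rather than a guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* Extracts `width` bits starting at `bit_offset` (counted from the least
   significant bit) of a 32-bit storage unit, zero-extended.
   Widths of 1..31 are supported to keep the arithmetic simple. */
uint32_t extract_unsigned_bits(uint32_t unit, unsigned bit_offset, unsigned width)
{
    assert(width >= 1 && width <= 31 && bit_offset + width <= 32);
    uint32_t mask = (UINT32_C(1) << width) - 1;
    return (unit >> bit_offset) & mask;
}

/* Same extraction, but the top bit of the field is treated as a sign bit. */
int32_t extract_signed_bits(uint32_t unit, unsigned bit_offset, unsigned width)
{
    uint32_t raw  = extract_unsigned_bits(unit, bit_offset, width);
    uint32_t sign = UINT32_C(1) << (width - 1);
    /* (raw ^ sign) - sign sign-extends a `width`-bit value. */
    return (int32_t)(raw ^ sign) - (int32_t)sign;
}
```

For the teststruct example in this thread, f1 would correspond to bit offset 0 with width 1, and f2 to bit offset 1 with width 3, both within the same 32-bit unit.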
Cheers,
Is this because the user doesn't care about some slots or because the type may not be defined yet? If it's the former, you can just omit the slot, provided you pass the appropriate :size argument to defcstruct so that CFFI knows how big the struct is despite missing slots.
in cl-autowrap, and in my current cffi-c2ffi code, there are ways to filter out files and/or definitions to focus only on the interesting parts of the bindings. this keeps its size within reasonable limits, as it'd also be pointless for each cl binding project to contain the whole linux/c stuff -- plus a little interesting extra.
omitting the slot seems to have been too obvious an idea to come to me... thanks for drawing my attention! :)
i'll just change the generator to emit the slots that refer to a not yet defined type as a comment (to retain human readability).
the struct size is already part of the generated output. i'll try to record some v0.1 state soon so that interested parties can also look at what's working already.
It's only possible to reference a struct with "some_struct" when there's a "typedef struct some_struct some_struct".
that helps, thanks!
so, essentially there are a few namespaces in C:
- default (typedef, functions, variables)
- struct
- union (or it's the same namespace as struct?)
does someone know about documentation on this? i couldn't find anything on this topic specifically...
Can't you sort the definitions in topological order? https://en.wikipedia.org/wiki/Topological_sorting
That should ensure definitions would always show up before their uses, right?
that shouldn't be needed because to the best of my knowledge the order in the json file is the same as in the preprocessed C file.
maybe defcstruct could be extended in a sane way with something like this?
You know what, we've had a similar discussion before, 8 years ago! :-) http://thread.gmane.org/gmane.lisp.cffi.devel/1116. It might be possible to delay the typing for defcstruct, but it'd be simpler (well, from my point of view) if you handled this during the binding generation.
ehh, time flies! :) and apparently this issue has not yet been resolved by any opensource fairies in those 8 years.
defcstruct doesn't support bitfields, no. It shouldn't be too hard to implement, though. The harder bit is figuring out the packing layout, but, if we know the offsets, then we can skip that.
yep, we know bit-offsets (and bit-sizes) from the c2ffi output. it would be nice if there was at least an API (that initially throws an error in the runtime accessors that it's not yet implemented?).
but i guess it's not urgent. cffi-c2ffi can always filter those out, and support can be added later on.
the OFFSET slot in FOREIGN-STRUCT-SLOT doesn't say anything about the unit it's assuming, but the way it's used in MEM-REF suggests it's in bytes (as opposed to bits), right?
Yes, bytes.
i started to follow the convention to store everything in its natural unit (length in meters, time in seconds, etc). i'm aware of the pain it would take to migrate from bytes to bits under the same name, but it still tickles my idealism, especially if we bring in C bitfield support that deals with bits.
It shouldn't be too hard to add :bit-offset and :bit-width options. Let me know if you need help with that.
i will not jump on it for now, but as i wrote above it's also not urgent to be implemented in the first wave of features.
thanks,
FTR, tried to compile stuff and according to gcc error messages the following two namespaces are used in C:
- functions, variables, typedefs
- structs and unions
e.g. this is valid C:
// an anonymous struct typedef'd as foo
typedef struct { char c; } foo;

// a different body!
struct foo { int i; };

union a_union {
  struct foo s1;
  foo s2;
};
i've pushed a v0.1 here:
https://github.com/attila-lendvai/cffi
i know that it's pretty useless without an extensive example, but that's all for now.
it tracks type dependencies and only emits definitions that have their type dependencies already emitted. e.g. defcstruct will skip slots whose type has not been defined already.
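The dependency tracking described here can be pictured with a simplified C sketch. It is not the actual cffi-c2ffi code, and all names in it are hypothetical. It makes a single pass in file order and emits a definition only if everything it depends on was emitted earlier. Unlike the real generator, which skips individual struct slots, this sketch skips whole definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEPS 4

/* Hypothetical record for one definition read from the spec file. */
struct definition {
    const char *name;
    const char *deps[MAX_DEPS]; /* NULL-terminated unless all slots are used */
    bool        emitted;
};

/* True if a definition called `name` was emitted among the first `count` entries. */
static bool is_emitted(const struct definition *defs, size_t count, const char *name)
{
    for (size_t i = 0; i < count; i++)
        if (defs[i].emitted && strcmp(defs[i].name, name) == 0)
            return true;
    return false;
}

/* Walks the definitions in file order and marks each one as emitted only
   if all of its dependencies were emitted before it.
   Returns the number of emitted definitions. */
size_t emit_in_order(struct definition *defs, size_t count)
{
    assert(defs != NULL || count == 0);
    size_t emitted = 0;
    for (size_t i = 0; i < count; i++) {
        bool ready = true;
        for (size_t d = 0; d < MAX_DEPS && defs[i].deps[d] != NULL; d++)
            if (!is_emitted(defs, i, defs[i].deps[d]))
                ready = false;
        defs[i].emitted = ready;
        if (ready)
            emitted++;
    }
    return emitted;
}
```

Because an unemitted definition is never marked as available, anything that depends on a filtered-out type such as va_list is skipped as well, and so is anything that depends on that in turn.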
it can compile the following in its entirety into valid cffi (with only filtering out "va_list", and a small patch to cffi):
bluez.h:
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <bluetooth/bluetooth.h>
#include <bluetooth/hci.h>
#include <bluetooth/hci_lib.h>
and this is an example usage of the ASDF component:
(:cffi-c2ffi-file "bluez.h"
 :package #:hu.dwim.bluez
 :spec-path (system-relative-pathname :hu.dwim.bluez2 "autospec/")
 :exclude-archs ("i386-unknown-freebsd"
                 "x86_64-unknown-freebsd"
                 "i686-apple-darwin9"
                 "x86_64-apple-darwin9"
                 "i686-pc-windows-msvc"
                 "x86_64-pc-windows-msvc")
 :sys-include-paths (;; my llvm is not installed; or is it a c2ffi bug?
                     "/path/llvm-3.6/lib/clang/3.6.2/include/")
 :include-sources :all #+nil("bluetooth/bluetooth.h" "bluetooth/hci.h" "bluetooth/hci_lib.h")
 :exclude-sources () #+nil :all
 :include-definitions :all #+nil("memset" "size_t" "ssize_t" "socklen_t" ... )
 :exclude-definitions ("va_list"))
an open question is that cl-ppcre (or equivalent) would be handy for filtering definitions, but it's a rather heavy dependency. i'll probably keep it optional, and redefine some defun's if someone loads cffi-c2ffi+cl-ppcre.asd
but i'm open for suggestions.
more later,
i've pushed a state that is pretty functional, i'd call it a v1.0 beta:
https://github.com/attila-lendvai/cffi
and here's an example project that uses it:
https://github.com/attila-lendvai/hu.dwim.bluez
the ASDF part probably needs some care, especially the recompilation dependency checking, but it's beyond me. the shared ASDF parts of cffi-grovel should also be factored out. i'm hoping to convince Fare with something to deal with it.