Hi folks,
I'm currently working on adding caching of .so and wrapper files to
cffi. This is to handle issues arising from trying to use various
libraries on machines without a C compiler. I'm hoping you can take a
peek at my branch [0] and see if I'm going in the right direction.
The changes are very simple: we allow users to specify a cache
directory by using an optional :cache-dir argument in :grovel-file &
:wrapper-file forms. CFFI will then look there first for the required
files before trying to compile them itself.
The tricky bit is that we recommend that people use the #+ & #- reader
conditionals in the grovel and wrapper specifications, which means the
cached results are only applicable if all the required feature-forms
match.
To handle this I am currently using my with-cached-reader-conditionals
library [1] which, whilst small and standalone, could be replicated
inside CFFI if required. The library modifies the readtable so that
feature expressions still work as before, but it also records the
feature-forms used.
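To make the idea concrete, here is a minimal sketch of how a readtable could record feature expressions while still honouring them. It is not the code from with-cached-reader-conditionals: the names *recorded-features*, feature-true-p, recording-conditional and read-recording-features are made up for illustration, and it only handles the standard and/or/not operators.

```lisp
(defvar *recorded-features* '()
  "List of (feature-expression . truth-value) pairs seen while reading.")

(defun feature-true-p (expr)
  "Evaluate a feature expression that was read in the KEYWORD package."
  (if (atom expr)
      (and (member expr *features*) t)
      (ecase (first expr)
        (:and (every #'feature-true-p (rest expr)))
        (:or  (some #'feature-true-p (rest expr)))
        (:not (not (feature-true-p (second expr)))))))

(defun recording-conditional (stream sub-char numarg)
  "Reader macro function for #+ and #- that also records what it saw."
  (declare (ignore numarg))
  (let* ((expr (let ((*package* (find-package :keyword))
                     (*read-suppress* nil))
                 (read stream t nil t)))
         (truth (feature-true-p expr))
         (keep (if (char= sub-char #\+) truth (not truth))))
    (pushnew (cons expr truth) *recorded-features* :test #'equal)
    (if keep
        (read stream t nil t)
        ;; Skip the guarded form and contribute nothing to the result.
        (let ((*read-suppress* t))
          (read stream t nil t)
          (values)))))

(defun read-recording-features (string)
  "Read one form from STRING. Returns the form and the recorded pairs."
  (let ((*readtable* (copy-readtable nil))
        (*recorded-features* '()))
    (set-dispatch-macro-character #\# #\+ #'recording-conditional)
    (set-dispatch-macro-character #\# #\- #'recording-conditional)
    (let ((form (read-from-string string)))
      (values form (reverse *recorded-features*)))))
```

Because the standard readtable is copied rather than modified, the recording behaviour only applies while reading the spec file and does not leak into the rest of the image.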
With the captured feature information we can make a subdirectory
inside the :cache-dir directory, whose name is based on the feature
information. In my current experiment it does something simple & easy
to read; however, it would likely make directory names too long for
Windows users.
For example I took osicat and added a cache directory to the following lines:

  (:grovel-file "basic-unixint" :cache-dir "foo")
  (:wrapper-file "wrappers" :soname "libosicat" :cache-dir "foo")

The "foo" directory has the following contents

  bsd_nil_darwin_nil_freebsd_nil_linux_t_openbsd_nil
    unixint.processed-grovel-file
  darwin_nil_linux_t_mips_nil_openbsd_nil_windows_nil
    basic-unixint.processed-grovel-file
  linux_t_windows_nil
    libosicat.so
    wrappers.processed-wrapper-file
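
Directory names of this shape could be produced by something along these lines. This is only a sketch of the naming scheme as it appears in the listing, not the code in the branch; feature-directory-name is a hypothetical helper and it assumes the features are plain keywords.

```lisp
(defun feature-directory-name (features)
  "Build a name such as \"linux_t_windows_nil\" from a list of keywords.
Features are sorted by name so the result does not depend on the order
in which they appeared in the spec file."
  (format nil "~{~a~^_~}"
          (loop for feature in (sort (copy-list features) #'string<
                                     :key #'symbol-name)
                collect (string-downcase (symbol-name feature))
                collect (if (member feature *features*) "t" "nil"))))
```

On a Linux machine, (feature-directory-name '(:windows :linux)) would give the third directory name in the listing. Each feature adds its full name to the path, which is why the names grow quickly.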
All the system does is copy these files to the same directory that
stock CFFI would have put the compiled results in.
My next addition is going to be a function the user can call that will
do the following:
- Work out if cffi-grovel has cached files for the grovel and wrapper results
- If it doesn't have those files it will copy the cacheable files to a
user specified location (or maybe just the cache directory)
The idea of this is that it will allow users to easily donate
their cache files to the project, which hopefully means we can get
better coverage quickly.
Thoughts would be very welcome,
I hope this finds you all well,
Baggers
[0] https://github.com/cbaggers/cffi/tree/feature-grovel-caching
[1] https://github.com/cbaggers/with-cached-reader-conditionals
Ok I have made the changes we spoke about and I think it's in a place
where we can start testing some projects with it.
Find the latest at: https://github.com/cbaggers/cffi/tree/feature-grovel-caching
Excited to hear your results/issues,
Baggers
On 29 August 2016 at 00:18, Chris Bagley <chris.bagley(a)gmail.com> wrote:
> [sorry I keep replying to luis instead of the mailinglist]
>
>> but I think all we have to do is look at *features* really.
> Yup and uiop has #'operating-system and #'architecture that helps us out there
>
>> Perhaps grabbing the #+/#- reader macro functions and invoking them directly is slightly more elegant/robust than going through read-from-string?
> hmm could be, how would we get the feature-expression from those?
>
>> Which direction are we copying in, and is that .cache directory grovel's or ASDF's?
> So in my branch currently:
> - if we have the files cached we copy them to .cache
> - if we don't have them cached we build them in .cache
>
> My change would be to the 'don't have them' case:
> - if we don't have them cached we build them in .cache and then copy
> the results to the cache folder
[sorry I keep replying to luis instead of the mailinglist]
> but I think all we have to do is look at *features* really.
Yup and uiop has #'operating-system and #'architecture that helps us out there
> Perhaps grabbing the #+/#- reader macro functions and invoking them directly is slightly more elegant/robust than going through read-from-string?
hmm could be, how would we get the feature-expression from those?
> Which direction are we copying in, and is that .cache directory grovel's or ASDF's?
So in my branch currently:
- if we have the files cached we copy them to .cache
- if we don't have them cached we build them in .cache
My change would be to the 'don't have them' case:
- if we don't have them cached we build them in .cache and then copy
the results to the cache folder
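
The copy directions described above can be sketched as follows. This is an illustration under assumptions, not the branch's code: obtain-output and copy-file-contents are hypothetical names, output-file stands for the file in the .cache output directory, cached-file for the file in the :cache-dir folder, and build-function for whatever invokes the C compiler.

```lisp
(defun copy-file-contents (from to)
  "Copy the bytes of FROM to TO, creating TO's directory if needed."
  (ensure-directories-exist to)
  (with-open-file (in from :element-type '(unsigned-byte 8))
    (with-open-file (out to :element-type '(unsigned-byte 8)
                            :direction :output
                            :if-exists :supersede)
      (let ((buffer (make-array 4096 :element-type '(unsigned-byte 8))))
        (loop for count = (read-sequence buffer in)
              while (plusp count)
              do (write-sequence buffer out :end count)))))
  to)

(defun obtain-output (output-file cached-file build-function)
  "Fill OUTPUT-FILE from the cache if possible, otherwise build it and
then copy the result into the cache. Returns :from-cache or :built."
  (cond ((probe-file cached-file)
         ;; Cached: no compiler needed, just copy into the output dir.
         (copy-file-contents cached-file output-file)
         :from-cache)
        (t
         ;; Not cached: build in the output dir, then populate the cache.
         (funcall build-function output-file)
         (copy-file-contents output-file cached-file)
         :built)))
```

The second branch is the proposed change: after a successful build the result is also written back to the cache folder, so a later load on a machine without a compiler can take the first branch.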
On 29 August 2016 at 00:17, Chris Bagley <chris.bagley(a)gmail.com> wrote:
>> but I think all we have to do is look at *features* really.
> Yup and uiop has #'operating-system and #'architecture that helps us out there
>
>> Perhaps grabbing the #+/#- reader macro functions and invoking them directly is slightly more elegant/robust than going through read-from-string?
> hmm could be, how would we get the feature-expression from those?
>
>> Which direction are we copying in, and is that .cache directory grovel's or ASDF's?
> So in my branch currently:
> - if we have the files cached we copy them to .cache
> - if we don't have them cached we build them in .cache
>
> My change would be to the 'don't have them' case:
> - if we don't have them cached we build them in .cache and then copy
> the results to the cache folder
>
> On 28 August 2016 at 22:21, Luís Oliveira <luismbo(a)gmail.com> wrote:
>> [cc-ing cffi-devel]
>>
>> On Sun, Aug 28, 2016 at 8:08 PM, Chris Bagley <chris.bagley(a)gmail.com> wrote:
>>> Cool, thanks for the details and sorry I've been busy this last week.
>>>
>>>> cpu/vendor/os triplet
>>>
>>> Sounds like it would be good to include these by default in the
>>> feature check, I'll add that.
>>
>> At first I was worried we'd need to check this via uname or something
>> and that we'd need to handle the special case of running a 32-bit Lisp
>> on a 64-bit OS and things like that, but I think all we have to do is
>> look at *features* really.
>>
>>
>>>> dirty-featurep is not super pretty but it seems like the way to go
>>>
>>> Cool, then I will move this into cffi itself, and leave my potentially
>>> less portable version in the with-cached-reader-conditionals library
>>
>> Perhaps grabbing the #+/#- reader macro functions and invoking them
>> directly is slightly more elegant/robust than going through
>> read-from-string?
>>
>>
>>>> Windows is picky about what a valid pathname is and (b) it's got a 260 character limit for pathnames.
>>>
>>> 100% agreed, also symbols can be unicode so that would break fast.
>>>
>>>> Perhaps we just need to record the result of each reader conditional, store those boolean results as increasingly significant bits in an integer
>>>
>>> The case I was worried about there is say someone had the following in
>>> their spec file:
>>>
>>> (and linux swank (not windows))
>>>
>>> And that was #b110
>>>
>>> And they change it to:
>>>
>>> (and linux sly (not windows))
>>>
>>> And that would still be #b110.
>>>
>>> I think we should do what you said about appending to the triplet but
>>> we should use some simple string hashing function instead of the
>>> bitmask. Will be a little ugly but robust at least.
>>
>> Good point. Hashing seems much more reasonable than my suggestion. :-)
>>
>>
>>>> Can we avoid the copying by just writing to and loading from the cache directory unconditionally?
>>>
>>> Good question. I liked asdf's rationale for using the system's cache
>>> directory for fasls and intermediate files and like how cffi uses it
>>> too. I'm a bit nervous of someone trying to use asdf to load a
>>> grovelling library from a directory they don't have write permissions
>>> for as currently that works fine.
>>>
>>> Actually, we could just check: if we have write permission we copy
>>> unconditionally, if not then we just leave it in the .cache dir.
>>
>> Complying with ASDF's concept of output/cache directory does seem
>> important. (Perhaps we could ask asdf-devel for advice?) But I didn't
>> grasp this last solution you've suggested. Which direction are we
>> copying in, and is that .cache directory grovel's or ASDF's?
>>
>>>
>>> I'll get implementing the above. Feel free to throw more ideas in the pile!
>>>
>>> Baggers
>>
>> Cheers,
>>
>> --
>> Luís Oliveira
>> http://kerno.org/~luis/
[cc-ing cffi-devel]
On Sun, Aug 28, 2016 at 8:08 PM, Chris Bagley <chris.bagley(a)gmail.com> wrote:
> Cool, thanks for the details and sorry I've been busy this last week.
>
>> cpu/vendor/os triplet
>
> Sounds like it would be good to include these by default in the
> feature check, I'll add that.
At first I was worried we'd need to check this via uname or something
and that we'd need to handle the special case of running a 32-bit Lisp
on a 64-bit OS and things like that, but I think all we have to do is
look at *features* really.
>> dirty-featurep is not super pretty but it seems like the way to go
>
> Cool, then I will move this into cffi itself, and leave my potentially
> less portable version in the with-cached-reader-conditionals library
Perhaps grabbing the #+/#- reader macro functions and invoking them
directly is slightly more elegant/robust than going through
read-from-string?
>> Windows is picky about what a valid pathname is and (b) it's got a 260 character limit for pathnames.
>
> 100% agreed, also symbols can be unicode so that would break fast.
>
>> Perhaps we just need to record the result of each reader conditional, store those boolean results as increasingly significant bits in an integer
>
> The case I was worried about there is say someone had the following in
> their spec file:
>
> (and linux swank (not windows))
>
> And that was #b110
>
> And they change it to:
>
> (and linux sly (not windows))
>
> And that would still be #b110.
>
> I think we should do what you said about appending to the triplet but
> we should use some simple string hashing function instead of the
> bitmask. Will be a little ugly but robust at least.
Good point. Hashing seems much more reasonable than my suggestion. :-)
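
For illustration, the difference between the two schemes can be sketched like this. Everything here is an assumption made for the example: results-bitmask, fnv1a-32 and hashed-cache-name are hypothetical names, FNV-1a is just one simple string hash that is stable across implementations (unlike sxhash), and the triplet string is passed in rather than computed.

```lisp
(defun results-bitmask (results)
  "Pack a list of booleans into an integer, first result in the lowest bit."
  (loop for result in results
        for bit from 0
        sum (if result (ash 1 bit) 0)))

(defun fnv1a-32 (string)
  "32-bit FNV-1a hash of STRING, assuming ASCII-only characters."
  (let ((hash #x811c9dc5))
    (loop for char across string
          do (setf hash (logxor hash (char-code char))
                   hash (logand (* hash #x01000193) #xffffffff)))
    hash))

(defun hashed-cache-name (triplet feature-expressions)
  "TRIPLET, a dash, then 8 hex digits hashing the printed expressions."
  (let ((printed (with-standard-io-syntax
                   (prin1-to-string feature-expressions))))
    (string-downcase
     (format nil "~a-~8,'0x" triplet (fnv1a-32 printed)))))
```

The bitmask only sees the truth values, so the swank and sly versions of the expression both give #b110. The hash sees the printed expressions themselves, so renaming a feature changes the directory name (barring a hash collision), and the name stays short and ASCII-only regardless of how many features are involved.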
>> Can we avoid the copying by just writing to and loading from the cache directory unconditionally?
>
> Good question. I liked asdf's rationale for using the system's cache
> directory for fasls and intermediate files and like how cffi uses it
> too. I'm a bit nervous of someone trying to use asdf to load a
> grovelling library from a directory they don't have write permissions
> for as currently that works fine.
>
> Actually, we could just check: if we have write permission we copy
> unconditionally, if not then we just leave it in the .cache dir.
Complying with ASDF's concept of output/cache directory does seem
important. (Perhaps we could ask asdf-devel for advice?) But I didn't
grasp this last solution you've suggested. Which direction are we
copying in, and is that .cache directory grovel's or ASDF's?
>
> I'll get implementing the above. Feel free to throw more ideas in the pile!
>
> Baggers
Cheers,
--
Luís Oliveira
http://kerno.org/~luis/