Hi,
When I dump an SBCL core from an image that has antik loaded, I get a memory corruption error when loading the core:
    *** glibc detected *** sbcl: free(): invalid pointer: 0x0808d588 ***
This happens even with a minimal project that does nothing but load antik; I've attached a script to reproduce the problem.
I did some investigation, and it looks like the problem is the `*formatting-test-grid*' variable in `format-grid.lisp'. It contains a list of grids, including some foreign arrays. Presumably the foreign pointers in these arrays will be pointing to random/unowned memory when the core is loaded. I've run into similar problems before in my own code (attempting to dump a global `*rng*' variable that contained a GSLL random number generator).
I've attached a patch that replaces the problem `defparameter' form with a function instead. I can't find any references to this variable in the source tree, so I assume it's just there for manual testing.
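For readers without the attachment, the change has roughly the following shape. This is only a minimal sketch and not the patch itself: `make-formatting-test-grids' is a hypothetical name, and plain Lisp arrays stand in for the foreign arrays so that the snippet runs without antik loaded.

```lisp
;; Before (sketch): the value is built at load time, so whatever it holds
;; is written into the dumped core along with the rest of the image.
;; (defparameter *formatting-test-grid* (list ...grids...))

;; After (sketch): nothing is built until someone asks for it, so the
;; variable no longer keeps a foreign pointer alive at dump time.
(defun make-formatting-test-grids ()
  "Hypothetical replacement: build the test grids on demand."
  (list (make-array '(2 2) :initial-contents '((1 2) (3 4)))
        (make-array 3 :initial-contents '(1.0d0 2.0d0 3.0d0))))
```

Each call returns freshly built grids, so a manual test just calls the function instead of reading the variable.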
Thanks, James
On Fri, Dec 9, 2011 at 3:39 PM, James Wright james@chumsley.org wrote:
[...]
Thanks James. Indeed, *formatting-test-grid* is just a throwaway to test the formatting of grids. While a patch is OK, I would rather just not load it.
However, I've long been uneasy with saved images and foreign memory; I have no confidence that what's saved will come back. Indeed, if you are seeing rng variable problems, then there is a deeper problem that can't be fixed with turning a defparameter into a function.
I would like to get ideas from SBCL experts. I think the solution is likely to be implementation dependent, but that's the place to start. Can you generate a small example of your rng problem?
Liam
Hi Liam,
> Thanks James. Indeed, *formatting-test-grid* is just a throwaway to test the formatting of grids. While a patch is OK, I would rather just not load it.
Works for me; either solution will stop the corruption issue that I ran into. :)
> However, I've long been uneasy with saved images and foreign memory; I have no confidence that what's saved will come back.
I feel exactly the same way. It has become an article of faith with me that foreign pointers that get saved to a core will come back essentially uninitialized.
> Indeed, if you are seeing rng variable problems, then there is a deeper problem that can't be fixed with turning a defparameter into a function.
Actually, my workaround (to saved rngs coming back uninitialized) kind of does boil down to converting the defparameter to a function. Fortunately it's possible to access the state of a GSL rng, so I've been checkpointing that and then restoring it into a freshly-constructed rng every time my program starts.
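The checkpoint-and-restore pattern can be sketched as follows. This is an illustration under loud assumptions: `toy-rng' is a made-up pure-Lisp generator standing in for the GSL one, and none of these names come from GSLL. With a real foreign generator, the two marked places would call whatever the library provides for reading and writing generator state.

```lisp
;; Stand-in for a foreign generator: a tiny linear congruential generator.
(defstruct toy-rng
  (state 1 :type (unsigned-byte 32)))

(defun toy-rng-next (rng)
  "Advance RNG and return the next 32-bit value."
  (setf (toy-rng-state rng)
        (mod (+ (* 1664525 (toy-rng-state rng)) 1013904223)
             (expt 2 32))))

(defvar *toy-rng-checkpoint* nil
  "Plain Lisp data describing the generator; safe to dump in a core.")

(defvar *toy-rng* nil
  "The live generator. NIL before first use and after CHECKPOINT-TOY-RNG.")

(defun checkpoint-toy-rng ()
  "Copy the live generator's state into plain Lisp data and drop the object.
Call this before dumping the core."
  (when *toy-rng*
    ;; With a foreign rng: read its state out into a Lisp object here.
    (setf *toy-rng-checkpoint* (toy-rng-state *toy-rng*)
          *toy-rng* nil)))

(defun ensure-toy-rng ()
  "Return the live generator, constructing it from the checkpoint if needed."
  (or *toy-rng*
      ;; With a foreign rng: allocate a fresh one, then write the state back.
      (setf *toy-rng* (make-toy-rng :state (or *toy-rng-checkpoint* 1)))))
```

Code that needs random numbers calls `(ensure-toy-rng)' rather than reading the variable directly, which is the sense in which the workaround boils down to turning the defparameter into a function.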
> I would like to get ideas from SBCL experts. I think the solution is likely to be implementation dependent, but that's the place to start.
I'll be interested to hear if there's any realistic solution; I've mostly been assuming that it's up to me to make sure that I don't have any "live" foreign pointers when I dump core.
> Can you generate a small example of your rng problem?
Certainly; I've attached a small script along with its output on my machine.
James
On Sun, Dec 11, 2011 at 5:01 PM, James Wright james@chumsley.org wrote:
[...]
James,
In both the case of *all-formatting-test-grids* and your *rng*, we have a defparameter, and that may be the key to a workable solution here. I asked on #lisp and SBCL does not have a way to save and restore foreign objects.
So I'm thinking along the following lines: define a #'defautoload (or something) which would be like defparameter but leave the object unbound and store the thunk. It would actually bind the symbol as a symbol macro, and when you went to use it, it would invoke the thunk as an initializer; in future uses the function would merely look up the saved answer. Does this sound sensible?
It would act just like a global special as defined by defparameter, except you couldn't rebind it in a let or setf. This also should be portable; nothing SBCL specific. I vaguely remember I coded up something similar to this at one point as part of something else, so I'll have to hunt around for it. This would only initialize on reload, not restore the state at dump time, but in your case and my case that's all we need.
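A minimal sketch of what such a defautoload might look like, using only standard Common Lisp. Everything here is an assumption made for illustration: the two tables, `autoload-value', and the macro's argument list are invented names, not existing antik code.

```lisp
(defvar *autoload-thunks* (make-hash-table :test 'eq)
  "Maps a name to the function that builds its value.")

(defvar *autoload-values* (make-hash-table :test 'eq)
  "Maps a name to its value once built. Must be emptied before dumping,
otherwise the cached objects are saved in the core after all.")

(defun autoload-value (name)
  "Return NAME's value, running its thunk on first use."
  (multiple-value-bind (value foundp) (gethash name *autoload-values*)
    (if foundp
        value
        (setf (gethash name *autoload-values*)
              (funcall (gethash name *autoload-thunks*))))))

(defmacro defautoload (name form)
  "Like DEFPARAMETER, but FORM is not evaluated until NAME is first used."
  `(progn
     ;; Store the thunk and forget any value left from an earlier definition.
     (setf (gethash ',name *autoload-thunks*) (lambda () ,form))
     (remhash ',name *autoload-values*)
     ;; Every reference to NAME becomes a lookup through AUTOLOAD-VALUE.
     (define-symbol-macro ,name (autoload-value ',name))
     ',name))
```

After `(defautoload *grids* (build-grids))', with `build-grids' being any constructor, the first reference to `*grids*' runs the form and later references return the cached result. In this sketch setf on the name signals an error, because no setf function is defined for `autoload-value'.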
The general problem of saving and restoring foreign objects in any context and any state is a tough one. SBCL provides the variables *save-hooks* and *init-hooks*, which could provide the jumping-off point. Restoring the state of a variable using these hooks shouldn't be that hard, but if the object is buried inside some other thing, I don't know how we'd find it in general. The only avenue I can think of is that all the foreign objects are slot values in a CLOS object (if not a global special), so the symbol-macro/thunk trick could be used similarly. But without a strong motivation for doing it, I don't think it's worth the effort.
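For the simple case of global specials, the hook approach could be sketched like this. The registry and the three function names are hypothetical; only `sb-ext:*save-hooks*' and `sb-ext:*init-hooks*' are real SBCL variables, and they are touched only under #+sbcl. The sketch rebuilds values at startup rather than restoring the state they had at dump time, and it does nothing for objects buried inside other structures.

```lisp
(defvar *foreign-backed-variables* '()
  "Alist of (symbol . thunk) for specials whose values hold foreign memory.")

(defun register-foreign-backed (symbol thunk)
  "Remember how to rebuild SYMBOL's value, and build it now."
  (setf *foreign-backed-variables*
        (acons symbol thunk
               (remove symbol *foreign-backed-variables* :key #'car)))
  (setf (symbol-value symbol) (funcall thunk)))

(defun drop-foreign-backed ()
  "Run before dumping: unbind everything so no stale pointer is saved."
  (dolist (entry *foreign-backed-variables*)
    (makunbound (car entry))))

(defun rebuild-foreign-backed ()
  "Run at startup: give every registered variable a fresh value."
  (dolist (entry *foreign-backed-variables*)
    (setf (symbol-value (car entry)) (funcall (cdr entry)))))

;; SBCL calls these lists of functions around save-lisp-and-die and at
;; startup of a saved core, respectively.
#+sbcl
(progn
  (pushnew 'drop-foreign-backed sb-ext:*save-hooks*)
  (pushnew 'rebuild-foreign-backed sb-ext:*init-hooks*))
```

Other implementations would need their own equivalents of the two hook lists; the rest is portable.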
What do you think?
Liam
Hi Liam,
Thanks for looking into this. I agree that somehow tracking all the foreign objects in order to save/restore them using SBCL's hook variables is a lot of work for probably not much benefit. It seems like much of the performance benefit of using foreign objects directly is lost if they require a bunch of extra bookkeeping in a global structure somewhere.
The `defautoload' idea is pretty interesting; it could even leverage `make-load-form' to generate the thunk, I would think. According to the hyperspec, it looks like a symbol macro actually can be rebound by a let, so it's really just setf that would be problematic. I bet one could get around the setf problem with a bit of trickery as well; expand into a function call for accessing the thunk, and then define a setf method on that accessor, or something of the sort.
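To make the setf and let points concrete, here is a small sketch of the accessor trick. All names are hypothetical, and a plain list stands in for the foreign generator that a real initializer would construct.

```lisp
(defvar *lazy-rng-cell* nil
  "Holds a one-element list once initialized, NIL before.
Reset to NIL before dumping so the value is rebuilt on reload.")

(defun lazy-rng ()
  "Reader: build the value on first use, then return the cached one."
  (car (or *lazy-rng-cell*
           (setf *lazy-rng-cell* (list (list :rng :seed 42))))))

(defun (setf lazy-rng) (new-value)
  "Writer: replace the cached value."
  (setf *lazy-rng-cell* (list new-value))
  new-value)

;; References to *LAZY-RNG* expand into (LAZY-RNG), and SETF of the symbol
;; macro expands into SETF of that call, which the writer above handles.
(define-symbol-macro *lazy-rng* (lazy-rng))
```

With this in place `(setf *lazy-rng* x)' works, and `(let ((*lazy-rng* y)) ...)' shadows the symbol macro. Note that the let binding is lexical, not dynamic, so functions called from inside the let body still see the global value; that is the one place where this differs from a real defparameter.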
If I have some spare time over the weekend I might try coding up an initial stab.
James
On Sun, Feb 19, 2012 at 9:18 AM, Liam Healy lhealy@common-lisp.net wrote:
[...]