Hi everyone!
I ran into a weird issue using the latest croatoan commit. It seems that the example 'dlg02' does not work if the procedure is called from an executable generated by the compiler.
I do not know if this issue arises only on my machine though, so I prepared a little use case to reproduce the behaviour.
Attached you can find the code with an asdf file to compile the executable.
To build the package just use (asdf:make "test-menu"); the asdf file "test-menu.asd" must be in your source-registry (decompressing the tar in "$HOME/quicklisp/local-projects" is probably enough).
After compiling, run the executable: ./src/test-menu
On my machine (SBCL 2.2.9.debian, X86_64) I get an error:
#<SB-SYS:MEMORY-FAULT-ERROR {1004AB3183}>
debugger invoked on a SB-SYS:MEMORY-FAULT-ERROR in thread
#<THREAD "main thread" RUNNING {1001D50003}>:
  Unhandled memory fault at #x7FFB37AE1730.

Type HELP for debugger help, or (SB-EXT:EXIT) to exit from SBCL.

restarts (invokable by number or by possibly-abbreviated name):
  0: [ABORT] Exit from the current thread.

(DE.ANVI.CROATOAN:ACS :UPPER-LEFT-CORNER)
   source: (CFFI:MEM-AREF ACS-MAP-ARRAY 'DE.ANVI.NCURSES:CHTYPE
                          (CHAR-CODE (CDR (ASSOC CHAR-NAME ACS-ALIST))))
0]
But if you comment out line 8 (':build-operation program-op') in test-menu.asd and then run these commands:
$ sbcl (I suggest using rlwrap)
...wait for compilation....
the example runs just fine.
Can someone confirm if this happens on other machines?
Thanks in advance! C.
Hi!
After some more investigation with a friend, it seems the offending form can be reduced to:
(cffi:mem-aref de.anvi.croatoan::acs-map-array 'ncurses:chtype 108) ; → 0
where '108' is the code for ':upper-left-corner'.
It works in a REPL, but does not work in an executable.
Bye! C.
Hello cage,
After some more investigation with a friend, it seems the offending form can be reduced to:
(cffi:mem-aref de.anvi.croatoan::acs-map-array 'ncurses:chtype 108) ; → 0
I also tried to narrow down the source of the error, basically getting to the same conclusion, that it is a low-level cffi problem in accessing the alternate character set ACS used to draw the borders.
I also tried running the same code on older sbcl versions, older linux machines and older library versions. Even though I still have older binaries, built as late as the end of November, which run as expected and where the error doesn't occur, it now suddenly happens on every single backup I had. I guess _something_ changed in the underlying C libraries, which are updated all the time, and code that worked for years now doesn't, a pretty scary thought.
But after a helpful discussion in #sbcl, the cause and thankfully a solution was quickly found.
For the sake of completeness, this is a minimal ncurses example, based on your test-menu, which triggers the bug without any of the library overhead except cffi:
;;; test-menu.asd
(defsystem :test-menu
  :pathname "src"
  :serial t
  :depends-on (:cffi)
  :entry-point "test-menu::main"
  :build-operation program-op
  :components ((:file "main")))

;;; src/main.lisp
(defpackage :test-menu
  (:use :cl :cffi))

(in-package :test-menu)

(cffi:defctype chtype :uint32)
(cffi:defctype window :pointer)

(cffi:defcfun ("initscr" initscr) window)
(cffi:defcfun ("addch" addch) :int (ch chtype))
(cffi:defcfun ("refresh" refresh) :int)
(cffi:defcfun ("getch" getch) :int)
(cffi:defcfun ("endwin" endwin) :int)

(cffi:define-foreign-library libncursesw
  (:unix (:or "libncursesw.so.6.3"))
  (t (:default "libncursesw")))

(cffi:use-foreign-library libncursesw)

;; Evaluated when the system is loaded, i.e. at build time.
(defparameter acs-map-array (cffi:foreign-symbol-pointer "acs_map"))

(defun main ()
  (initscr)
  (addch (cffi:mem-aref acs-map-array :uint32 (char-code #\l)))
  (refresh)
  (getch)
  (endwin))
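It should display a single ACS char, but instead it signals essentially the same error.
The underlying cause of the error is that acs-map-array, which is a pointer to the acs_map array, is resolved at build time: the defparameter form is evaluated when the system is loaded for the build, so the address it stores ends up in the executable, and there is no guarantee that acs_map lives at the same address when the executable is started later. That is obviously wrong, and the fact that it worked till now was just luck.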
As soon as the function is called at run time, as in
(defun main ()
  (initscr)
  (addch (cffi:mem-aref (cffi:foreign-symbol-pointer "acs_map")
                        :uint32
                        (char-code #\l)))
  (refresh)
  (getch)
  (endwin))
the error doesn't occur any more. So the fix for the whole library is just that one line in the acs function, which is the only place the acs_map C array is accessed. I'm going to push that change in a minute. When you find time, please test and confirm that it also works for you.
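A minimal sketch of what such a one-line change could look like follows. It is not the actual croatoan source: the shape of the form is taken from the backtrace, the table holds only the single entry mentioned in this thread (#\l, code 108, for :upper-left-corner), and *acs-alist* and acs-index are hypothetical names. It assumes cffi is already loaded and libncursesw has already been opened, as in the minimal example.

```lisp
;; Sketch only, not the actual croatoan code. Assumes CFFI is loaded and
;; libncursesw has been opened (e.g. via cffi:use-foreign-library).

(cffi:defctype chtype :uint32)

;; Illustrative one-entry table (hypothetical name); the library's own
;; table is presumably much larger.
(defparameter *acs-alist*
  '((:upper-left-corner . #\l)))

(defun acs-index (char-name)
  "Index of CHAR-NAME in acs_map, i.e. the char code of its ACS letter."
  (char-code (cdr (assoc char-name *acs-alist*))))

(defun acs (char-name)
  "Read the chtype for CHAR-NAME from acs_map, resolving the pointer on each call."
  ;; foreign-symbol-pointer runs in the live process, so no address
  ;; computed at build time ends up in the saved executable.
  (cffi:mem-aref (cffi:foreign-symbol-pointer "acs_map")
                 'chtype
                 (acs-index char-name)))
```

As in the minimal example, acs is only meaningful after initscr has been called; the index computation itself is pure and can be checked in a REPL.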
There is also the question whether cffi:use-foreign-library should be called during build time or run time, but as long as it does not cause problems, I would let it stay as it is now, at build time.
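Should loading at build time ever become a problem, one possible alternative would be to open the library explicitly when the program starts, using cffi:load-foreign-library. This is only a sketch under assumptions, not something the library currently does; open-ncurses is a hypothetical name, and cffi is assumed to be loaded already.

```lisp
;; Sketch only. Assumes CFFI is loaded; OPEN-NCURSES is a hypothetical name.

(cffi:define-foreign-library libncursesw
  (:unix (:or "libncursesw.so.6.3"))
  (t (:default "libncursesw")))

(defun open-ncurses ()
  "Open libncursesw in the running process rather than while building."
  (cffi:load-foreign-library 'libncursesw))
```

open-ncurses would then be the first call in main, before initscr.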
Thanks for the report and the test case, it greatly accelerated the search. A.
On Sun, Jan 29, 2023 at 10:54:52PM +0100, you wrote:
Hello cage,
Hi!!
[...]
I also tried to narrow down the source of the error, basically getting to the same conclusion, that it is a low-level cffi problem in accessing the alternate character set ACS used to draw the borders.
I also tried [...] I guess _something_ changed in the underlying C libraries, which are updated all the time, and code that worked for years now doesn't, a pretty scary thought.
It is, indeed! O_O
But after a helpful discussion in #sbcl, the cause and thankfully a solution was quickly found.
For the sake of completeness, this is a minimal ncurses example, based on your test-menu, which triggers the bug without any of the library overhead except cffi:
[...]
As soon as the function is called at run time, as in
(defun main ()
  (initscr)
  (addch (cffi:mem-aref (cffi:foreign-symbol-pointer "acs_map")
                        :uint32
                        (char-code #\l)))
  (refresh)
  (getch)
  (endwin))
the error doesn't occur any more. So the fix for the whole library is just that one line in the acs function, which is the only place the acs_map C array is accessed. I'm going to push that change in a minute. When you find time, please test and confirm that it also works for you.
I have tested both the example above and the original code I posted a couple of days ago, and I can confirm both work like a charm! :)
Thanks for the report and the test case, it greatly accelerated the search.
Thanks to you for all your work!
Also I have learnt something new about sbcl and cffi! :))
Now, after your patch, I should be able to update tinmop and unblock the proposed update of croatoan on guix (a kind person seems interested in working on packaging the library!).
for reference:
https://issues.guix.gnu.org/issue/60944
Bye! C.
Hello cage,
I ran your test and can confirm the bug when called from an executable, and basically get the same error:
~/quicklisp/local-projects/test-menu/src$ ./test-menu
#<SB-SYS:MEMORY-FAULT-ERROR {1004DC5E43}>
debugger invoked on a SB-SYS:MEMORY-FAULT-ERROR in thread
#<THREAD "main thread" RUNNING {1001B40003}>:
  Unhandled memory fault at #x7FAAD73B7770.

Type HELP for debugger help, or (SB-EXT:EXIT) to exit from SBCL.

restarts (invokable by number or by possibly-abbreviated name):
  0: [ABORT] Exit from the current thread.

(DE.ANVI.CROATOAN:ACS :UPPER-LEFT-CORNER)
   source: (CFFI:MEM-AREF ACS-MAP-ARRAY 'DE.ANVI.NCURSES:CHTYPE
                          (CHAR-CODE (CDR (ASSOC CHAR-NAME ACS-ALIST))))
0]
This seems to be an issue at a rather low cffi level; I hope it does not point to a fundamental bug.
Instead of drawing the higher-level dialog, I have also confirmed it by making the simpler, lower-level ACS example, t12 (from clos.lisp) the main function, and then building that, which unfortunately leads to the same ACS bug:
(defun main ()
  (with-screen (scr :input-echoing nil
                    :cursor-visible t
                    :enable-colors t
                    :enable-function-keys t
                    :input-blocking t)
    (add-string scr "ACS_ULCORNER ")
    (add-char scr (acs :upper-left-corner))
    (get-char scr)
    (close scr)))
I am also no longer able to run the info msgbox in the irc client, which I have been building regularly over the last months without issues, but which now fails with the same error. The existing 1-2 month old builds work fine and can display dialogs and ACS drawings, but new ones don't, even if I build them with older croatoan versions and older sbcl versions.
I can't say for sure, but it looks to me like an issue with a glibc update over the last weeks. I'll have to check whether the same issue exists on some older machines I have. I also have to see whether I can replicate it with C code, in which case I will post to the ncurses mailing list again; the ncurses author is usually very quick to help.
Take care, Anton
croatoan-devel@common-lisp.net