Hi, is there a way to unwrap a Java boxed value to get the Java primitive value inside it?
In particular I am interested in obtaining the char inside a Character, in order to supply the char to a method which requires a char.
Thanks for any info,
Robert
Hey Robert,
In Java, both Character and char are Unicode. So, you can unbox Character to char, but you'll still have Unicode.
Also, although it intuitively seems like converting Unicode to ASCII should be easy, I've spent a bunch of time on this and was never able to do it completely. Here is what I have been using:
From ChatGPT:
    import java.text.Normalizer;
    import java.util.regex.Pattern;

    public class UnicodeToAsciiConverter {

        // Decompose accented characters (NFD), then drop everything outside ASCII.
        public static String convertToAscii(String unicodeStr) {
            String normalized = Normalizer.normalize(unicodeStr, Normalizer.Form.NFD);
            Pattern pattern = Pattern.compile("[^\\p{ASCII}]");
            return pattern.matcher(normalized).replaceAll("");
        }

        public static void main(String[] args) {
            String unicodeString = "Héllo, Wörld! 👋"; // Example Unicode string
            String asciiString = convertToAscii(unicodeString);
            System.out.println("Original Unicode String: " + unicodeString);
            System.out.println("Converted ASCII String: " + asciiString);
        }
    }
--blake
On Wed, Jan 3, 2024 at 2:27 PM Robert Dodier robert.dodier@gmail.com wrote:
Hi, is there a way to unwrap a Java boxed value to get the Java primitive value inside it?
In particular I am interested in obtaining the char inside a Character, in order to supply the char to a method which requires a char.
Thanks for any info,
Robert
Sorry for the confusing message, although it is correct. When I said I had spent a bunch of time on it and only had an incomplete solution, I was talking about JavaScript.
So, except for the paragraph that starts with "Also, ...", the rest is correct.
--blake
On Wed, Jan 3, 2024 at 3:46 PM Blake McBride blake@mcbride.name wrote:
Hey Robert,
In Java, both Character and char are Unicode. So, you can unbox Character to char, but you'll still have Unicode.
Also, although it intuitively seems like converting Unicode to ASCII should be easy, I've spent a bunch of time on this and was never able to do it completely. Here is what I have been using:
From ChatGPT:
    import java.text.Normalizer;
    import java.util.regex.Pattern;

    public class UnicodeToAsciiConverter {

        // Decompose accented characters (NFD), then drop everything outside ASCII.
        public static String convertToAscii(String unicodeStr) {
            String normalized = Normalizer.normalize(unicodeStr, Normalizer.Form.NFD);
            Pattern pattern = Pattern.compile("[^\\p{ASCII}]");
            return pattern.matcher(normalized).replaceAll("");
        }

        public static void main(String[] args) {
            String unicodeString = "Héllo, Wörld! 👋"; // Example Unicode string
            String asciiString = convertToAscii(unicodeString);
            System.out.println("Original Unicode String: " + unicodeString);
            System.out.println("Converted ASCII String: " + asciiString);
        }
    }
--blake
Java auto-unboxes, so you can do the following:
    Character co = 'A';
    char c = co;
On Wed, Jan 3, 2024 at 3:51 PM Blake McBride blake@mcbride.name wrote:
Sorry for the confusing message, although it is correct. When I said I had spent a bunch of time on it and only had an incomplete solution, I was talking about JavaScript.
So, except for the paragraph that starts with "Also, ...", the rest is correct.
--blake
However, remember that "co" can be null and "c" can't; auto-unboxing a null Character throws a NullPointerException. So, you may have to do something like:
    if (co == null)
        c = myDefault;
    else
        c = co;
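To make the unboxing advice above concrete, here is a minimal sketch. The names UnboxDemo, needsChar and unboxOrDefault are hypothetical and exist only for illustration; needsChar stands in for "a method which requires a char". It shows auto-unboxing, the explicit Character.charValue() call, and a null-safe fallback.

```java
// Sketch only: UnboxDemo, needsChar and unboxOrDefault are hypothetical names.
final class UnboxDemo {

    // Stand-in for a method that requires a primitive char.
    static int needsChar(char c) {
        return c; // widened to int so the code point is visible
    }

    // Explicit unboxing with a fallback. A bare `char c = co;` would throw
    // NullPointerException when co is null.
    static char unboxOrDefault(Character co, char fallback) {
        return (co == null) ? fallback : co.charValue();
    }

    public static void main(String[] args) {
        Character co = '\u2502';
        char viaAuto = co;              // auto-unboxing by the compiler
        char viaCall = co.charValue();  // explicit unboxing
        System.out.println(needsChar(viaAuto) == needsChar(viaCall));
        System.out.println(needsChar(co)); // boxed argument is unboxed at the call site
        System.out.println(unboxOrDefault(null, '?'));
    }
}
```

Auto-unboxing and charValue() are equivalent for a non-null Character; the helper only matters when null is a possible value.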
On Wed, Jan 3, 2024 at 3:57 PM Blake McBride blake@mcbride.name wrote:
Java auto-unboxes, so you can do the following:
    Character co = 'A';
    char c = co;
Hi Blake, thanks for your reply.
I actually don't need to convert to ASCII -- what I really want to do is to call a Java method which has a char argument. I got the following error:
    CL-USER(19): (jss:new "org.armedbear.lisp.LispCharacter" #\U2502)
    #<THREAD "interpreter" native {41BFB5BC}>: Debugger invoked on condition of type JAVA-EXCEPTION
      Java exception 'java.lang.NoSuchMethodException: LispCharacter(java.lang.Character)'.
It appears that the Lisp character #\U2502 has been converted to a Java Character, but that's not acceptable to the org.armedbear.lisp.LispCharacter constructor, which is declared to take a char argument; see line 70 of src/org/armedbear/lisp/LispCharacter.java in the current version (commit bba779e).
Now a complicating factor is that the LispCharacter(char) constructor is declared private. I don't know whether that is the actual problem, in which case the error message about the argument type is misleading. Are Java private methods, variables, and constructors visible from Lisp?
All the best,
Robert
On Jan 3, 2024, at 23:49, Robert Dodier robert.dodier@gmail.com wrote:
Hi Blake, thanks for your reply.
I actually don't need to convert to ASCII -- what I really want to do is to call a Java method which has a char argument. I got the following error:
    CL-USER(19): (jss:new "org.armedbear.lisp.LispCharacter" #\U2502)
    #<THREAD "interpreter" native {41BFB5BC}>: Debugger invoked on condition of type JAVA-EXCEPTION
      Java exception 'java.lang.NoSuchMethodException: LispCharacter(java.lang.Character)'.
Do you need to 1) call an arbitrary Java method with a char argument, or 2) just programmatically call the org.armedbear.lisp.LispCharacter constructor to get a Lisp character?
For 1) calling arbitrary Java methods, one should be able to use JSS to do the following:
(#0"charValue" (#"valueOf" 'java.lang.Character #\U2502))
The #0 returns the raw value of the method invocation without attempting to convert through the Lisp/Java FFI.
For 2), why can't you use CODE-CHAR?
    (char-code #\U2502) => 9474
    (code-char 9474) => #\│
It appears that the Lisp character #\U2502 has been converted to a Java Character, but that's not acceptable to the org.armedbear.lisp.LispCharacter constructor, which is declared to take a char argument; see line 70 of src/org/armedbear/lisp/LispCharacter.java in the current version (commit bba779e).
Indeed, it does change back to a java.lang.Character instance, even when I use the method to return the raw Java type. I can't figure out a way to coerce the calling arguments to a primitive char. Maybe I haven't found the right invocation yet; otherwise we will have to fix the implementation.
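On coercing arguments to a primitive char: in plain java.lang.reflect, a char parameter is matched at lookup time by char.class rather than Character.class, and Method.invoke then unwraps a boxed Character automatically. The sketch below shows only that JDK behaviour and says nothing about how ABCL or JSS resolve calls internally. CharLookupDemo, codeOf, hasCodeOf and callReflectively are hypothetical names.

```java
import java.lang.reflect.Method;

// Sketch only: all names in this class are hypothetical.
final class CharLookupDemo {

    // A method declared with a primitive char parameter.
    public static int codeOf(char c) {
        return c;
    }

    // True if a public method codeOf with exactly this parameter type exists.
    static boolean hasCodeOf(Class<?> paramType) {
        try {
            CharLookupDemo.class.getMethod("codeOf", paramType);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // Look the method up with char.class, then pass a boxed Character;
    // Method.invoke unwraps it to the primitive parameter.
    static int callReflectively(Character boxed) {
        try {
            Method m = CharLookupDemo.class.getMethod("codeOf", char.class);
            return (Integer) m.invoke(null, boxed);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hasCodeOf(Character.class)); // lookup by the boxed type
        System.out.println(hasCodeOf(char.class));      // lookup by the primitive type
        System.out.println(callReflectively('\u2502'));
    }
}
```

Under this model the argument never needs to be a "raw" char on the caller's side; what matters is which Class object is used to find the method.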
Now a complicating factor is that the LispCharacter(char) constructor is declared private. I don't know whether that is the actual problem, in which case the error message about the argument type is misleading. Are Java private methods, variables, and constructors visible from Lisp?
JSS:NEW should be able to invoke private constructors. That is one of the features which distinguishes it from JAVA:JNEW.
On Thu, Jan 4, 2024 at 3:11 AM Mark Evenson evenson@panix.com wrote:
or 2) just programmatically call the org.armedbear.lisp.LispCharacter constructor to get a Lisp character?
For 2), why can't you use CODE-CHAR?
Right, I want to call the LispCharacter constructor. CODE-CHAR yields a java.lang.Character, as reported by (java:jcall "getClass" (code-char #xnnnn)), so that's not quite enough.
To back up a little, what I really want is for ABCL to recognize Unicode characters by name, e.g. #\BOX_DRAWINGS_LIGHT_VERTICAL and so on. Towards that end, I believe, perhaps unreasonably, that if I can wedge a new item into LispCharacter.lispChars, then I can call LispCharacter.setCharName, and then -- presto, perhaps -- the reader will recognize #\MY_NEW_NAME.
Then, of course, I'll be able to compile some new code which makes use of those named characters without modifying that code. Some other Lisp implementations recognize those out of the box; at least one needs to have them defined, for which there is an internal Lisp function to do it. It appears it's also possible with ABCL with a little more effort.
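For reference, the JDK itself can map between Unicode character names and code points, which may be useful when building such a name table. This is a minimal sketch assuming Java 9 or later (Character.codePointOf was added in Java 9); it does not touch ABCL's reader or LispCharacter. UnicodeNameDemo and fromLispStyleName are hypothetical names, and the underscore-to-space convention is an assumption based on the #\BOX_DRAWINGS_LIGHT_VERTICAL example above.

```java
// Sketch only: UnicodeNameDemo and fromLispStyleName are hypothetical names.
final class UnicodeNameDemo {

    // Turn a name such as BOX_DRAWINGS_LIGHT_VERTICAL into a code point.
    // Throws IllegalArgumentException if the name is not a known Unicode name.
    static int fromLispStyleName(String name) {
        return Character.codePointOf(name.replace('_', ' '));
    }

    public static void main(String[] args) {
        // Code point to Unicode name.
        System.out.println(Character.getName(0x2502));
        // Lisp-style name back to a code point, printed in hex.
        System.out.println(Integer.toHexString(fromLispStyleName("BOX_DRAWINGS_LIGHT_VERTICAL")));
    }
}
```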
Thanks for your help, and all the best.
Robert
On Thu, Jan 4, 2024 at 3:11 AM Mark Evenson evenson@panix.com wrote:
Now a complicating factor is that the LispCharacter(char) constructor is declared private. I don't know whether that is the actual problem, in which case the error message about the argument type is misleading. Are Java private methods, variables, and constructors visible from Lisp?
JSS:NEW should be able to invoke private constructors. That is one of the features which distinguishes it from JAVA:JNEW.
JSS:NEW doesn't appear to be able to call the private constructor of LispCharacter -- at least, when I change it to public, then it succeeds.
With private constructor:
    CL-USER(3): #<THREAD "interpreter" native {6B89AFA7}>: Debugger invoked on condition of type JAVA-EXCEPTION
      Java exception 'java.lang.NoSuchMethodException: LispCharacter(java.lang.Character)'.
With public constructor:
    CL-USER(3): (jss:new "org.armedbear.lisp.LispCharacter" (code-char #x2502))
    #\│
The patch is just to make the constructor public -- I didn't change the argument type; so I guess ABCL was able to see that the LispCharacter(char) constructor is appropriate, and wedge the java.lang.Character into a char as needed.
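The public-versus-private difference seen here has a direct analogue in plain Java reflection: Class.getConstructor only finds public constructors, while Class.getDeclaredConstructor also finds private ones, which can then be invoked after setAccessible(true). The sketch below illustrates only that JDK behaviour; whether JSS:NEW or JAVA:JNEW use these calls is not established in this thread. Boxed and PrivateCtorDemo are hypothetical names, with Boxed standing in for a class that has a private (char) constructor.

```java
import java.lang.reflect.Constructor;

// Sketch only: Boxed is a hypothetical class with a private (char) constructor.
final class Boxed {
    final char value;

    private Boxed(char c) {
        this.value = c;
    }
}

final class PrivateCtorDemo {

    // getConstructor sees public constructors only, so this reports false.
    static boolean publicLookupSucceeds() {
        try {
            Boxed.class.getConstructor(char.class);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // getDeclaredConstructor also finds the private constructor; after
    // setAccessible(true) it accepts a boxed Character and unwraps it.
    static char constructAndRead(Character boxed) {
        try {
            Constructor<Boxed> ctor = Boxed.class.getDeclaredConstructor(char.class);
            ctor.setAccessible(true);
            return ctor.newInstance(boxed).value;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(publicLookupSucceeds());
        System.out.println((int) constructAndRead('\u2502'));
    }
}
```

In this model a failed lookup surfaces as NoSuchMethodException regardless of whether the cause is visibility or a parameter-type mismatch, so the exception type alone does not say which one applied.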
At this point I think the direction I was trying to go (to define new named characters to be recognized by the reader) is not going to work, because I can't, without modifying LispCharacter, create a LispCharacter instance.
Thanks to everyone for their help,
Robert
PS. Here's the patch:
    $ git stash show --patch
    diff --git a/src/org/armedbear/lisp/LispCharacter.java b/src/org/armedbear/lisp/LispCharacter.java
    index ef9a32c..4219d7e 100644
    --- a/src/org/armedbear/lisp/LispCharacter.java
    +++ b/src/org/armedbear/lisp/LispCharacter.java
    @@ -67,7 +67,7 @@ public final class LispCharacter extends LispObject
       }
     
       // This needs to be public for the compiler.
    -  private LispCharacter(char c)
    +  public LispCharacter(char c)
       {
         this.value = c;
       }
armedbear-devel@common-lisp.net