Do symbols need to be EQ?


Edi Weitz

3 Jul 2015

7:09 a.m.

Just out of curiosity and without any relevance in practise:

Is there one place in the standard where it is explicitly said that two symbols which are the "same" symbol must be "identical"? I know that there are a couple of examples where this is implied, but formally the examples aren't part of the standard, right?

The EQ dictionary entry for example shows this example:

(eq 'a 'a) => true

and then it continues with this note (emphasis mine): "Symbols that print the same USUALLY are EQ to each other because of the use of the INTERN function."

And the entry for INTERN is actually the closest I could find in terms of clarification because it says that if a symbol of a specified name is already accessible, _IT_ is returned -- which sounds like object identity to me.

But how does this fit into the picture?

CL-USER 1 > (defparameter *s* 'foo)
*S*
CL-USER 2 > (unintern 'foo)
T
CL-USER 3 > (defparameter *s2* 'foo)
*S2*
CL-USER 4 > (eq *s* *s2*)
NIL

*S* has lost its home package and is thus not EQ to *S2*, sure, but how do we explain this in terms of object identity? Has the UNINTERN operation changed the identity of *S* which once was the one and only CL-USER::FOO but can't be anymore because this role is now occupied by *S2*?

Did I miss some clarifying words in the standard? Did I just manage to confuse myself?

Thanks,
Edi.

PS: The UNINTERN entry warns about side effects which could harm consistency, so maybe this is what they meant?
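One way to probe the question is to check what UNINTERN leaves behind on the old symbol. The following is a minimal sketch and not part of the original message: the scratch package EQ-DEMO, the variables *OLD* and *NEW*, and the NOTE property are names made up for illustration, and a separate package is used only so the experiment does not disturb CL-USER.

```lisp
;; Hypothetical scratch package; it uses no other packages.
(defpackage :eq-demo (:use))

;; Keep a reference to the symbol and give it some state.
(defparameter *old* (intern "FOO" :eq-demo))
(setf (get *old* 'note) "set before unintern")

;; Remove the name -> symbol mapping, then ask for the name again.
(unintern *old* :eq-demo)
(defparameter *new* (intern "FOO" :eq-demo))

(list (eq *old* *new*)                                   ; NIL: two distinct objects
      (string= (symbol-name *old*) (symbol-name *new*))  ; T: same name
      (symbol-package *old*)                             ; NIL: no home package anymore
      (get *old* 'note)                                  ; the old property list is intact
      (get *new* 'note))                                 ; NIL: the new symbol starts empty
```

Under these assumptions the object held in *OLD* keeps its name and its property list; only its home package and the package's entry for the name "FOO" have changed.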


Anton Vodonosov

3 Jul

7:16 a.m.

I think the most confusing part is what you mean by "same" symbols.

Anton Vodonosov

7:30 a.m.

EQ just checks object identity. Symbol names like CL-USER::FOO are a way to refer to symbol objects using the package machinery. If we manipulate packages, then dereferencing the name CL-USER::FOO may return a different object, and the two would not be EQ.

Yes, INTERN gives us the ability to use CL-USER::FOO as a reference to exactly the same symbol object, unless someone has destroyed the mapping from symbol name to object. That's what I rely on, and I don't expect the standard to provide any more guarantees.

Best regards,
- Anton

Edi Weitz

7:53 a.m.

Let me repeat: I'm not concerned about whether this could impede my ability to write CL programs nor am I concerned that some future implementor might not do the right thing. I just can't see the internal logic (and the CLHS seems otherwise mostly very clear and logical to me).

The standard actually defines the word "same" and says that two objects are the same if they can't be distinguished by EQL (unless another predicate is explicitly mentioned). But let's forget about this definition (although it is hard to talk about such concepts if you can't use certain words). I'm more concerned with object identity:

1. I guess we all agree that there's one and only one mathematical object which is the number 536870912.

2. We also all know that on some 32-bit implementations (EQ 536870912 536870912) can yield NIL while (EQL 536870912 536870912) must yield T.

3. So EQL is the preferred predicate in the standard and is intended to mean that two things are _semantically_ identical although they might _technically_ be different (like above).

4. EQ on the other hand tests whether its arguments are (according to the CLHS) "the same, identical object." I've always understood this as a test for identity at the implementation level I shouldn't be concerned with. (Leaving the question open why EQ is in the standard at all...)

5. Now, and I think this is the crucial part, by using EQ to compare symbols in various parts of the standard, I take this as a suggestion that there is for example one and only one symbol CL-USER::FOO like there is one and only one number 536870912. Even more so, because they use EQ and not EQL they also suggest - it seems to me - that this one and only one symbol must have one and only one internal representation.

6. But if you agree with #5 and then look at my UNINTERN example how do you explain the results? Has the symbol which once was CL-USER::FOO and is still stored in *S* lost its identity? There are plenty of operations which modify objects - like (SETF GETHASH) - but none of them causes the object to lose its identity.

I guess I could rephrase my question like this: Wouldn't it be clearer if "sameness" of symbols were defined via EQL with something like: "Two symbols are EQL if their names are the same under STRING= and their home packages are the same under EQL." (And maybe some more sentences if necessary.)
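The EQ/EQL distinction for numbers discussed in this message can be tried out directly. This is a small sketch, not from the thread; COMPARE-BOTH is a hypothetical helper, and the EQ result for large integers is deliberately left without an expected value because it depends on the implementation.

```lisp
(defun compare-both (a b)
  "Return a list of the EQ result and the EQL result for A and B."
  (list (eq a b) (eql a b)))

;; Two separately computed integers of equal value: the EQL result is
;; true, while the EQ result may be true or false depending on how the
;; implementation represents large integers.
(compare-both (expt 2 100) (expt 2 100))

;; The reader interns both occurrences of FOO in the current package,
;; so both predicates see one and the same object.
(compare-both 'foo 'foo)   ; => (T T)
```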

Anton Vodonosov

8:14 a.m.

I personally don't think that the name CL-USER::FOO in any way represents the "nature" of the symbol.

The same number may be referenced as #x20000000 and as 536870912. It's just a way to refer to the object, not the object itself.

Let's consider an example of symbol use:

(defun print-value (value mode)
  (if (eq mode 'mypkg:lowcase)
      (format nil "~(~A~)" value)
      (format nil "~A" value)))

So, (print-value "HelLo" 'mypkg:lowcase) returns "hello".

Let's suppose someone manipulated packages: uninterned and re-interned MYPKG:LOWCASE.

This doesn't break my PRINT-VALUE function, because the contract of my function is not to print a lower-case value when MODE is a symbol named "MYPKG:LOWCASE", but when MODE is exactly the symbol referred to in PRINT-VALUE.

I provided a constant which allows callers to specify a different mode, and I provided a way to refer to it via the package system as 'MYPKG:LOWCASE. If someone destroyed the mapping, well, then he can't use the name to refer to my constant. He should have stored a reference to it, or something. But PRINT-VALUE remains correct.

How about this treatment?

Best regards,
- Anton
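The PRINT-VALUE argument can be replayed in a form that runs without a pre-existing MYPKG package. This is only a sketch under stated assumptions: the package MODE-DEMO and the variable *LOWCASE* are invented here, and the variable stands in for the literal 'MYPKG:LOWCASE that the compiled function in the message would hold on to.

```lisp
;; Hypothetical scratch package standing in for MYPKG.
(defpackage :mode-demo (:use))

;; Capture the symbol object once, much as a literal inside a compiled
;; function body would be captured.
(defparameter *lowcase* (intern "LOWCASE" :mode-demo))

(defun print-value (value mode)
  "Print VALUE in lower case when MODE is the captured symbol object."
  (if (eq mode *lowcase*)
      (format nil "~(~A~)" value)
      (format nil "~A" value)))

(print-value "HelLo" *lowcase*)                      ; => "hello"

;; Someone destroys the name -> symbol mapping.
(unintern *lowcase* :mode-demo)

(print-value "HelLo" *lowcase*)                      ; => "hello", same object as before
(print-value "HelLo" (intern "LOWCASE" :mode-demo))  ; => "HelLo", a fresh symbol
```

A caller who kept a reference to the original object still gets lower-case output; a caller who looks the name up again after the UNINTERN gets a different object and therefore the other branch.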

Edi Weitz

8:36 a.m.

On Fri, Jul 3, 2015 at 10:14 AM, Anton Vodonosov <avodonosov@yandex.ru> wrote:

...

This doesn't break my PRINT-VALUE function, because the contract of my function is not to print lower case value when MODE is a symbols named "MYPKG:LOWCASE", but when MODE is exactly the symbol referred to in PRINT-VALUE.

I think this is where we agree to disagree. Suppose you had written your function like so:

(defun print-value (value mode)
  (if (eql mode 42)
      (format nil "~(~A~)" value)
      (format nil "~A" value)))

Would you expect someone to be able to change the identity of the constant 42 in your function in such a way that it would no longer work if called as (PRINT-VALUE ... 42)?

Yes, there are different ways to represent 42 (as in binary, octal, and so on), but unless you totally mess up the readtable, there's no simple way to make it impossible to refer to 42 with a literal anymore.

Edi Weitz

8:48 a.m.

Perhaps the excerpt below (from a fresh LW image) makes more obvious what my "philosophical problem" is. I have redacted the output of DISASSEMBLE to only show the relevant parts. It shows that EQ is essentially just one simple comparison with a machine word (which is what I expected). It also shows that I get the same machine word again as long as I don't mess around with UNINTERN or something. But once I've done that, I get _another_ machine word and so in terms of simple-minded EQ I get a different object.

CL-USER 1 > (defun foo-1 (x) (eq x 'bar))
FOO-1
CL-USER 2 > (disassemble 'foo-1)
;; ...
21: 3DF771F921 cmp eax, 21F971F7 ; BAR
26: 750D jne L3
;; ...
NIL
CL-USER 3 > (defun foo-2 (x) (eq x 'bar))
FOO-2
CL-USER 4 > (disassemble 'foo-2)
;; ...
21: 3DF771F921 cmp eax, 21F971F7 ; BAR
26: 750D jne L3
;; ...
NIL
CL-USER 5 > (unintern 'bar)
T
CL-USER 6 > (defun foo-3 (x) (eq x 'bar))
FOO-3
CL-USER 7 > (disassemble 'foo-3)
;; ...
21: 3DAB71F921 cmp eax, 21F971AB ; BAR
26: 750D jne L3
;; ...
NIL

Kenneth Tilton

8:56 a.m.

Sorry, where is the problem? The spec is clear that a new object (with a new pointer) will be created given the unintern hijinx, so all is consistent: different pointer, EQ -> NIL.

I.e., it is not just "in terms of EQ" that you have a different object: you have created two distinct pointer objects (and EQ dutifully says so).

And at a higher level of abstraction, you have created two different symbols, one interned and one not.

-kt

Alessio Stalla

9:02 a.m.

Package = map from symbol name to symbol object.
INTERN ~= (or (gethash ...) (setf (gethash ...)))
UNINTERN ~= remhash

There's nothing special about symbols. You'd get the same effect with a map of constants and operations to add/remove them from the map.

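The "package as a map" model above can be written out as a toy. This is a sketch for illustration only and does not describe how any real implementation stores symbols: TOY-SYMBOL, TOY-INTERN and TOY-UNINTERN are hypothetical names, and a structure instance stands in for a symbol object.

```lisp
;; A stand-in for a symbol object: all that matters is its identity.
(defstruct toy-symbol name)

(defun toy-intern (name table)
  "Return the object registered under NAME, creating a fresh one if needed."
  (or (gethash name table)
      (setf (gethash name table) (make-toy-symbol :name name))))

(defun toy-unintern (name table)
  "Forget the mapping for NAME; the object itself is not touched."
  (remhash name table))

(let* ((table (make-hash-table :test 'equal))
       (a (toy-intern "FOO" table))
       (b (toy-intern "FOO" table)))
  (toy-unintern "FOO" table)
  (let ((c (toy-intern "FOO" table)))
    (list (eq a b)     ; T:   second lookup finds the registered object
          (eq a c))))  ; NIL: after removal a fresh object is created
```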
Edi Weitz

9:31 a.m.

On Fri, Jul 3, 2015 at 11:02 AM, Alessio Stalla <alessiostalla@gmail.com> wrote:

...

Package = map from symbol name to symbol object.
INTERN ~= (or (gethash ...) (setf (gethash ...)))
UNINTERN ~= remhash

I would consider that to be an implementation detail. As Anton said, this is mostly about saving space and time. It would not be inconceivable to have an "implementation" that worked like so:

(defparameter *my-package* (make-hash-table :test 'equal))

(defun my-intern (symbol-name &optional (package *my-package*))
  (or (gethash symbol-name package)
      (setf (gethash symbol-name package)
            (parse-integer symbol-name)))) ;; <-- imagine some clever hashing technique

(defun my-unintern (symbol-name &optional (package *my-package*))
  (remhash symbol-name package))

CL-USER > (defparameter *s* (my-intern "42"))
*S*
CL-USER > (my-unintern "42")
T
CL-USER > (eql (my-intern "42") *s*)
T

(Meaning you'd somehow enforce the same "pointer" once the symbol is "re-created".)

Sam Steingold

6 Jul

10:28 p.m.

This behavior is non-compliant. See http://www.lispworks.com/documentation/HyperSpec/Body/f_intern.htm, which says:

If no such symbol is accessible in package, a new symbol with the given name is created

I.e., after unintern, there is no symbol with this name, thus intern creates a _NEW_ symbol which cannot be EQ to any other existing object (this is the definition of the word "new" or "fresh").

-- Sam Steingold (http://sds.podval.org/)

Anton Vodonosov

3 Jul

9:15 a.m.

A note about "philosophical problems": if one wants to build a compact mental model that is reasonable and consistent with all the Common Lisp properties, there is probably more than one way to do so, and none of the possible models can be proven incorrect.

03.07.2015, 11:16, "Anton Vodonosov" <avodonosov@yandex.ru>:

I personally don't think that the name CL-USER::FOO in any way represents the "nature" of the symbol.

The same number may be referenced as #x20000000 and as 536870912. It's just a way to refer to the object, not the object itself.

I want to correct myself. Unlike numbers or any other objects, symbols _are_ about names, so we can say that the name CL-USER::FOO represents the "nature" of the symbol.

I think Common Lisp wants to save memory and speed up comparison, so when we use the same name we get the same object, as implemented by INTERN (this trick even has a name: the Flyweight pattern).

So, this is just an optimization trick, and UNINTERN is a maintenance, system-level tool, not designed to express programs. We are encouraged to operate as if the symbol name means the same object.

Anton Vodonosov

9:22 a.m.

BTW, you may be interested to know how Clojure handles symbols. Unlike CL, where a symbol is both a textual name and a slot where the symbol value may be stored, Clojure separates these concepts. The slot-holding object is called a Var; it is similar to a CL symbol. The symbols returned by the Clojure reader are essentially strings, qualified with a namespace (another string). Symbols are not reused; the Clojure reader creates new instances of them freely.

Martin Simmons

12:44 p.m.

On Fri, 03 Jul 2015 12:15:32 +0300, Anton Vodonosov said:

I think Common Lisp wants to save memory and speed up comparison, so when we use the same name we get the same object, as implemented by INTERN (this trick even has a name: the Flyweight pattern).

So, this is just an optimization trick, and UNINTERN is a maintenance, system-level tool, not designed to express programs. We are encouraged to operate as if the symbol name means the same object.

I disagree about it being to save memory -- a CL symbol is an object with mutable attributes, so identity is important.

Also, the identity of uninterned symbols is just as important (e.g. for macros) as that of interned ones, so finding symbols via packages (and the reader) is not fundamental to their common use.

Packages are just a way to convert strings to symbols, which is useful when they are obtained from files outside a running CL (e.g. via the reader/fasl loader).

-- Martin Simmons
LispWorks Ltd
http://www.lispworks.com/
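The remark about uninterned symbols and macros can be illustrated with the usual GENSYM idiom. This is a minimal sketch, not taken from the thread; SWAP-VALUES is a hypothetical macro written only for this example.

```lisp
(defmacro swap-values (a b)
  "Exchange the values of places A and B using a temporary variable."
  ;; GENSYM returns a fresh uninterned symbol, so the temporary cannot
  ;; collide with any variable the caller happens to use.
  (let ((tmp (gensym "TMP")))
    `(let ((,tmp ,a))
       (setf ,a ,b
             ,b ,tmp))))

;; The caller's own variable is called TMP, yet nothing is captured.
(let ((tmp 1) (other 2))
  (swap-values tmp other)
  (list tmp other))                      ; => (2 1)

;; Two uninterned symbols with the same name are still distinct objects.
(eq (make-symbol "X") (make-symbol "X")) ; => NIL
```

The macro relies on the identity of the generated symbol rather than on its name, which is the sense in which no package lookup is involved.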

Scott McKay

12:52 p.m.

On Fri, Jul 3, 2015 at 8:44 AM, Martin Simmons <martin@lispworks.com> wrote:

...

Packages are just a way to convert strings to symbols, which is useful when they are obtained from files outside a running CL (e.g. via the reader/fasl loader).

Agreed. Isn't it the case that {package x string} -> symbol is a 1-to-1 relationship? In which case, two symbols having the same name in the same package implies that the two symbols are in fact EQ? Sorry if I'm late to the party, I haven't been thinking about this for a few years. --S

Alessio Stalla

1:10 p.m.

In general it is an n-to-1 relationship, n >= 0. A symbol always has a name, but it can have either no home package or one home package, and additionally there can be any number of packages in which it is accessible. You have to think three-dimensionally ;)

Yes, two symbols with the same name in the same package are EQ. However, you can destructively alter packages so as to replace a symbol with another with the same name *at a later time*. Those won't be EQ, but a package will always contain at most one symbol with a given name - at a given time.
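The point that one symbol can be accessible in several packages while having at most one home package can be checked with IMPORT. This is a sketch with made-up names: the packages HOME-DEMO and GUEST-DEMO and the variable *SYM* are assumptions for illustration.

```lisp
;; Two hypothetical scratch packages that use nothing else.
(defpackage :home-demo (:use))
(defpackage :guest-demo (:use))

;; The symbol is created in HOME-DEMO, which becomes its home package.
(defparameter *sym* (intern "SHARED" :home-demo))

;; Make the very same object accessible in a second package.
(import *sym* :guest-demo)

(list (eq *sym* (find-symbol "SHARED" :home-demo))    ; T
      (eq *sym* (find-symbol "SHARED" :guest-demo))   ; T: same object via either package
      (package-name (symbol-package *sym*)))          ; "HOME-DEMO": one home package
```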

Anton Vodonosov

9:49 p.m.

03.07.2015, 15:45, "Martin Simmons" <martin@lispworks.com>:

On Fri, 03 Jul 2015 12:15:32 +0300, Anton Vodonosov said:

I want to correct myself. Unlike numbers or any other objects, symbols _are_ about names, so we can say that the name CL-USER::FOO represents the "nature" of symbol.

I think Common Lisp wants to save memory and speedup comparison, so when we use the same name we get the same object, as implemented by INTERN (this trick even has name - the Flyweight pattern).

So, this is just an optimization trick, and UNITERN is a maintenance, system tool, not designed to express programs. We are encouraged to operate as if the symbol name means the same object.

I disagree about it being to save memory -- a CL symbol is an object with mutable attributes, so identity is important.

This is part of the optimization. Functions like SYMBOL-VALUE and GET could be, for example, backed by hash maps keyed on the symbol name, thus returning the same value for equally named symbols.

I mean, at the level of abstraction mathematicians use, when they say "let X = 10" it means that the textual name X is bound to 10. In Common Lisp it means that the symbol object with name X is bound to 10.

So, in a general, abstract sense, symbols need not be EQ. But Common Lisp distinguishes symbols up to their object instance identity. I still suppose this choice is an optimization.

...

Also, the identity of uninterned symbols is just as important (e.g. for macros) as interned ones

I think if symbols were compared by their names instead of by EQ, there would be ways to satisfy the needs of macros. But that would be another language, not CL.

Best regards,
- Anton

Steve Haflich

11:37 p.m.

Symbols must behave under EQ as they traditionally always have if symbols are to be useful as property list indicators. There is a semantic issue that is not explicit in the ANS but which underlie language semantics and the concepts of "same", "identical", and "equivalent". It concerns object mutability. The only objects for which the EQ/EQL distinction is unspecified are characters and numbers. But hese objects are immutable (at least in the portable language). Most other kinds of objects, including symbols, are mutable. The fundamental principle of "identical" is that if two objects are EQ, mutating one of them [sic] will necessarily mutate the other. If two objects are _not_ EQ, then mutating one will _not_ mutate the other. I would propose that since symbols have several mutable properties, mutating a property of one reference to a symbol will mutate that property of another reference iff those two references are EQ. The "only if" part of "iff" is here crucial. This is exactly the same as for conses, arrays, structure slots, hashtables, readtable dispatches, etc. etc. etc. It still isn't clear whether this obvious semantic property can be proven from the ANS. On Fri, Jul 3, 2015 at 2:49 PM, Anton Vodonosov <avodonosov@yandex.ru> wrote:

...

03.07.2015, 15:45, "Martin Simmons" <martin@lispworks.com>:

...
...
...
...
...
> On Fri, 03 Jul 2015 12:15:32 +0300, Anton Vodonosov said: Envelope-From: avodonosov@yandex.by

I want to correct myself. Unlike numbers or any other objects, symbols _are_ about names, so we can say that the name CL-USER::FOO represents the "nature" of symbol.

I think Common Lisp wants to save memory and speedup comparison, so when we use the same name we get the same object, as implemented by INTERN (this trick even has name - the Flyweight pattern).

So, this is just an optimization trick, and UNITERN is a maintenance, system tool, not designed to express programs. We are encouraged to operate as if the symbol name means the same object.

I disagree about it being to save memory -- a CL symbol is an object with mutable attributes, so identity is important.

This is part of the optimization. Functions like SYMBOL-VALUE, GET could be, for example, backed by hash maps from symbol name, thus returning the same value for equally named symbols.
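As a rough sketch of the alternative described here (not something Common Lisp does), properties could be stored per name rather than per symbol object. The names `*props-by-name*` and `name-get` are hypothetical, and the sketch ignores packages for brevity.

```lisp
;; Hypothetical design: properties keyed by symbol NAME, not by symbol object.
(defvar *props-by-name* (make-hash-table :test #'equal))

(defun name-get (symbol indicator)
  "Hypothetical stand-in for GET that looks up by SYMBOL's name."
  (getf (gethash (symbol-name symbol) *props-by-name*) indicator))

(defun (setf name-get) (value symbol indicator)
  "Store VALUE under INDICATOR for every symbol sharing SYMBOL's name."
  (setf (getf (gethash (symbol-name symbol) *props-by-name*) indicator)
        value))

(defparameter *x1* (make-symbol "X"))
(defparameter *x2* (make-symbol "X"))   ; same name, not EQ to *x1*

(setf (name-get *x1* :size) 10)
(name-get *x2* :size)                   ; => 10, shared through the name

(setf (get *x1* :size) 10)
(get *x2* :size)                        ; => NIL, the real GET goes by identity
```

With such a table two equally named symbols would be indistinguishable through `name-get`, while the standard GET still tells them apart.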

I mean that on the level of abstraction mathematicians use, "let X = 10" means that the textual name X is bound to 10. In Common Lisp it means that the symbol object with name X is bound to 10.

So, in a general, abstract sense, symbols need not be EQ. But Common Lisp distinguishes symbols by their object instance identity. I still suppose this choice is an optimization.

...
Also, the identity of uninterned symbols is just as important (e.g. for macros) as that of interned ones.

I think if symbols were compared by their names instead of by EQ, there would be ways to satisfy the needs of macros. But that would be another language, not CL.
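For readers unfamiliar with the macro use case mentioned above, here is a small sketch of how CL macros conventionally rely on the identity of uninterned symbols. The macro `swap-values` and the function `swap-demo` are invented for illustration (the standard already provides ROTATEF for this job).

```lisp
(defmacro swap-values (a b)
  "Exchange the values of places A and B using a fresh temporary variable."
  (let ((tmp (gensym "TMP")))          ; uninterned symbol, EQ to no other symbol
    `(let ((,tmp ,a))
       (setf ,a ,b
             ,b ,tmp))))

(defun swap-demo ()
  ;; The caller's variable happens to be named TMP as well. Because the
  ;; macro's temporary is a distinct object, the two cannot collide.
  (let ((tmp 1)
        (other 2))
    (swap-values tmp other)
    (list tmp other)))                 ; => (2 1)
```

If the macro used an interned symbol named TMP instead, the expansion would shadow the caller's variable and `swap-demo` would return `(1 2)`.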

Best regards, - Anton

Kenneth Tilton

8:21 a.m.

On Fri, Jul 3, 2015 at 3:09 AM, Edi Weitz <edi@weitz.de> wrote:

...

Just out of curiosity and without any relevance in practise:

Is there one place in the standard where it is explicitly said that two symbols which are the "same" symbol must be "identical"? I know that there are a couple of examples where this is implied, but formally the examples aren't part of the standard, right?

The EQ dictionary entry for example shows this example:

(eq 'a 'a) => true

and then it continues with this note (emphasis mine): "Symbols that print the same USUALLY are EQ to each other because of the use of the INTERN function."

And the entry for INTERN is actually the closest I could find in terms of clarification because it says that if a symbol of a specified name is already accessible, _IT_ is returned -- which sounds like object identity to me.

But how does this fit into the picture?

CL-USER 1 > (defparameter *s* 'foo)
*S*
CL-USER 2 > (unintern 'foo)
T
CL-USER 3 > (defparameter *s2* 'foo)
*S2*
CL-USER 4 > (eq *s* *s2*)
NIL

*S* has lost its home package and is thus not EQ to *S2*, sure, but how do we explain this in terms of object identity? Has the UNINTERN operation changed the identity of *S* which once was the one and only CL-USER::FOO but can't be anymore because this role is now occupied by *S2*?

Did I miss some clarifying words in the standard? Did I just manage to confuse myself?

I think you managed to confuse yourself. unintern of course did not change the identity of *s* (by which we are meaning the symbol bound to *S*) -- identity is identity is identity. Unintern did, however, change the package of *s*, so (as one side-effect) a new symbol of the same name in the same package is a new object (identical to nothing at birth).

Perhaps the problem is confusing the levels of abstraction offered by (a) EQ and (b) object identity. The latter is a very simple idea. EQ, as you adroitly demonstrated, worries about all sorts of things, including a symbol's package.

My 2 cents anyway. -kt

...

Thanks, Edi.

PS: The UNINTERN entry warns about side effects which could harm consistency, so maybe this is what they meant?

-- Kenneth Tilton 54 Isle of Venice Dr Fort Lauderdale, FL 33301 ken@tiltontec.com http://tiltontec.com @tiltonsalgebra 646-269-1077 "In a class by itself." *-Macworld*

Edi Weitz

8:29 a.m.

On Fri, Jul 3, 2015 at 10:21 AM, Kenneth Tilton <ken@tiltontec.com> wrote:

...

EQ, as you adroitly demonstrated, worries about all sorts of things, including a symbol's package.

Which is part of what has me confused. Up until now I would have said that the "problem" of EQ is that it doesn't worry about _enough_ things. (EQ 3/4 3/4) is NIL because EQ doesn't bother to look "into" the numbers (as EQL does) but just superficially checks their "pointer identity". And for symbols that's not the case? Hmmm...
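A small sketch of the EQ/EQL difference for numbers follows. Note that the standard leaves the EQ result for numbers unspecified, so NIL is what many implementations return for ratios rather than a guaranteed answer. The function name `compare-ratios` is made up for this example.

```lisp
(defun compare-ratios ()
  "Compare two equal ratios with EQ and with EQL."
  (let ((a (/ 3 4))
        (b (/ 6 8)))                 ; also 3/4 after canonicalization
    (list :eq  (eq a b)              ; unspecified: may be true or false
          :eql (eql a b))))          ; always true: same type and same value
```

Because numbers are immutable, no program can observe whether two equal ratios are "the same object" other than by calling EQ itself, which is why EQL can safely compare them by value.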

Kenneth Tilton

8:46 a.m.

On Fri, Jul 3, 2015 at 4:29 AM, Edi Weitz <edi@weitz.de> wrote:

...

On Fri, Jul 3, 2015 at 10:21 AM, Kenneth Tilton <ken@tiltontec.com> wrote:

...
EQ, as you adroitly demonstrated, worries about all sorts of things, including a symbol's package.

Which is part of what has me confused. Up until now I would have said that the "problem" of EQ is that it doesn't worry about _enough_ things. (EQ 3/4 3/4) is NIL because EQ doesn't bother to look "into" the numbers (as EQL does) but just superficially checks their "pointer identity". And for symbols that's not the case? Hmmm...

<cough> OK, I myself was at the wrong level of abstraction: EQ is not worrying about anything other than pointer identity. It is the behavior of intern and unintern that arranges for two symbols with the same name to be distinct objects if their packages vary. -kt
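The corrected picture can be checked with a sketch that calls INTERN and UNINTERN directly instead of going through the reader. The package name "EQ-DEMO-SCRATCH" and the function name are assumptions made for this example; the package is created and deleted inside the function.

```lisp
(defun intern-unintern-demo ()
  "Intern a name, unintern the symbol, intern the name again, and report."
  (let ((pkg (make-package "EQ-DEMO-SCRATCH" :use '())))
    (unwind-protect
         (let* ((old   (intern "FOO" pkg))
                (again (intern "FOO" pkg)))   ; finds the existing symbol
           (unintern old pkg)                 ; OLD loses its home package
           (let ((new (intern "FOO" pkg)))    ; nothing found: a fresh symbol
             (list (eq old again)                          ; true
                   (eq old new)                            ; false
                   (null (symbol-package old))             ; true
                   (eq (symbol-package new) pkg)           ; true
                   (string= (symbol-name old)
                            (symbol-name new)))))          ; true
      (delete-package pkg))))
```

EQ never inspects names or packages here. The second result is false only because the second INTERN had to create a new object.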

Jason Cornez

4 Jul

9:47 a.m.

Sorry this is arriving late - I had some trouble posting to the group yesterday.

-Jason

On 07/03/2015 10:21 AM, Kenneth Tilton wrote:

...

Perhaps the problem is confusing the levels of abstraction offered by (a) EQ and (b) object identity. The latter is a very simple idea. EQ, as you adroitly demonstrated, worries about all sorts of things, including a symbol's package.

I don't think that anything has demonstrated that EQ is worried about anything other than object identity. And 5.3.33 is pretty clear that this is all that EQ does: "Returns true if its arguments are the same, identical object; otherwise, returns false."

As for symbols, I agree that unintern does NOT affect identity of a symbol. At the repl...

(defparameter *a* 'foo)
(defparameter *b* 'foo)
(eq *a* *b*) ==> T
(unintern 'foo)
(eq *a* *b*) ==> T
(defparameter *c* 'foo)
(eq *a* *c*) ==> NIL

If there is some doubt about why the last form is NIL, it is because when the (defparameter *c* 'foo) form is _read_, the reader creates a new symbol (via intern) because there is no current symbol named "FOO" in the current package - obviously, we just uninterned the previous symbol which is still the value of *a* and *b*.

The same thing is going on in the case of a function that refers to a symbol. The symbol won't change, unless the function text is _read_ again.

(defun func-foo (sym) (when (eq sym 'foo) ...))

If you pass in the same symbol object, EQ will always return T. But sure, if you unintern 'foo and then at the repl call (func-foo 'foo), you are now passing in a brand-new symbol, and so of course EQ will return NIL in that existing function.

Hope this helps,
-Jason
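The role of the reader described above can be made explicit with a sketch that calls READ-FROM-STRING at run time, so that the moment of reading is under the program's control rather than the REPL's. The package name "EQ-READER-SCRATCH" and the function name are invented for this example, and the standard readtable is assumed.

```lisp
(defun reader-intern-demo ()
  "Read the token foo before and after UNINTERN and compare the results."
  (let ((pkg (make-package "EQ-READER-SCRATCH" :use '())))
    (unwind-protect
         (let* ((*package* pkg)                    ; the reader interns here
                (a (read-from-string "foo"))
                (b (read-from-string "foo"))       ; same symbol as A
                (same-before (eq a b)))
           (unintern a pkg)
           (let ((c (read-from-string "foo")))     ; reader creates a new FOO
             (list same-before                     ; true
                   (eq a b)                        ; still true after UNINTERN
                   (eq a c))))                     ; false
      (delete-package pkg))))
```

Objects that were already read keep their identity. Only text that is read after the UNINTERN yields the new symbol.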

Thomas Burdick

3 Jul

11:04 a.m.

...

And the entry for INTERN is actually the closest I could find in terms of clarification because it says that if a symbol of a specified name is already accessible, _IT_ is returned -- which sounds like object identity to me. [...] Did I miss some clarifying words in the standard? Did I just manage to confuse myself?

I think the confusion here is because you're confounding the reader path to a symbol with the idea of the symbol itself. And honestly, most of the "Packages" chapter is written in a way that doesn't help (e.g., UNINTERN calling a symbol with no package "pathological").

First of all, just ignore packages completely. They are but glorified hash tables, as the first sentence of the Packages chapter tries to make clear: "A package establishes a mapping from names to symbols".

Instead, consider just the symbol type. Symbols are structured objects, just like any other. They have slots for their name, for their value, for their function value, for their property list, and for their (optional) home package. You make them with MAKE-SYMBOL, and you copy them with COPY-SYMBOL. AFAIK, the examples given for COPY-SYMBOL are the clearest attempt the spec makes to establish this as the conceptual model.

The reason that EQL doesn't do anything special for symbols the way it does for numbers is that symbols have structure. Symbols have names, home packages, property lists ... all sorts of things that you can destructively modify. The trick with EQL and numbers only works if you can't change anything about an object. Symbols being proper objects, object identity is the only thing you can use. INTERN and UNINTERN are just GETHASH and REMHASH under the hood.

Of course, the above is just the model; implementations can do anything they want that doesn't break the model, and the spec tries to use lots of formulations and careful, confusing wording to leave implementations the most freedom.

Cheers, Thomas
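The model above can be sketched with a toy "package" that is literally a hash table from names to symbols. This is only an illustration of the conceptual model, not how any implementation is known to work. The names `*toy-package*`, `toy-intern`, `toy-unintern` and `toy-demo` are hypothetical.

```lisp
(defvar *toy-package* (make-hash-table :test #'equal)
  "Hypothetical stand-in for a package: a mapping from names to symbols.")

(defun toy-intern (name)
  "Return the symbol registered under NAME, creating it if necessary."
  (or (gethash name *toy-package*)
      (setf (gethash name *toy-package*) (make-symbol name))))

(defun toy-unintern (name)
  "Forget the mapping for NAME. The symbol object itself is untouched."
  (remhash name *toy-package*))

(defun toy-demo ()
  (let* ((old  (toy-intern "FOO"))
         (same (toy-intern "FOO")))       ; table hit: the very same object
    (toy-unintern "FOO")
    (let ((new (toy-intern "FOO")))       ; table miss: a new object
      (list (eq old same)                 ; true
            (eq old new)                  ; false
            (symbol-name old)))))         ; "FOO", OLD is otherwise unchanged
```

In this model the behaviour from the original question needs no special explanation: removing a table entry changes what a later lookup returns, not the object that was stored there.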

Pascal Costanza

4 Jul

8:39 a.m.

Most languages don’t specify object identity in sufficient detail, so I’m not surprised Common Lisp doesn’t do this either. Pascal

...

On 3 Jul 2015, at 09:09, Edi Weitz <edi@weitz.de> wrote:

Just out of curiosity and without any relevance in practise:

Is there one place in the standard where it is explicitly said that two symbols which are the "same" symbol must be "identical"? I know that there are a couple of examples where this is implied, but formally the examples aren't part of the standard, right?

The EQ dictionary entry for example shows this example:

(eq 'a 'a) => true

and then it continues with this note (emphasis mine): "Symbols that print the same USUALLY are EQ to each other because of the use of the INTERN function."

And the entry for INTERN is actually the closest I could find in terms of clarification because it says that if a symbol of a specified name is already accessible, _IT_ is returned -- which sounds like object identity to me.

But how does this fit into the picture?

CL-USER 1 > (defparameter *s* 'foo)
*S*
CL-USER 2 > (unintern 'foo)
T
CL-USER 3 > (defparameter *s2* 'foo)
*S2*
CL-USER 4 > (eq *s* *s2*)
NIL

*S* has lost its home package and is thus not EQ to *S2*, sure, but how do we explain this in terms of object identity? Has the UNINTERN operation changed the identity of *S* which once was the one and only CL-USER::FOO but can't be anymore because this role is now occupied by *S2*?

Did I miss some clarifying words in the standard? Did I just manage to confuse myself?

Thanks, Edi.

PS: The UNINTERN entry warns about side effects which could harm consistency, so maybe this is what they meant?