[Glass] Why #identityHash does not rely on OOP?

Martin McClure via Glass glass at lists.gemtalksystems.com
Wed Oct 26 09:19:17 PDT 2016


On 10/26/2016 07:51 AM, Mariano Martinez Peck via Glass wrote:
> I am sure the answer is dumb, but I still wonder. At a first glance I
> thought it could be because you may reuse OOPs. But...if you re-use an
> OOP, it is because that object is not anywhere in the system (and hence,
> there is no need to re-hash any collection).
>
> Another thought I had is non-persistent objects and the cost of getting
> available OOPs vs a simple #identityHash implementation.
>
> BTW how long is the #identityHash? In Pharo (before Spur) we had 12 bits
> (or similar) in the object header, which yields 4k different values.
> That's not fun when you have large identity-based hashed collection
> because of collisions.  Anyway, how big is the identity hash in
> Gemstone? Or how "unique" it is?
>
> Anyway, I was simply curious about this topic and would like to know
> your thoughts.
>

Hi Mariano,

In GemStone, identity hash is based on the oop. Try looking at an 
object's oop and its identity hash in hex. There is an identity hash for 
each possible persistent object, so hash collisions between persistent 
objects cannot happen. It is possible to have, for instance, a 
SmallInteger and a regular object with the same identity hash, since a 
SmallInteger's identity hash is itself.

For all types of hashed collections you want to avoid collisions. 
GemStone is better at this than most other Smalltalks since there is a 
unique hash per object (excepting immediate objects like SmallIntegers).

However, for some kinds of hashed collections (especially open 
addressing with linear probing) you also need to avoid "clumping" of 
hash values. What you need are hash values that look random even when 
the actual keys are not. For SmallIntegers, most Smalltalks make the 
same mistake -- the hash values of 1 2 3 4 5 are 1 2 3 4 5, when you'd 
want those hashes to be spread all over the possible range. And oops in 
GemStone have patterns, too -- they're far from random. So if you're 
implementing a hashed collection, it's a good idea to put the hashes of 
the objects through a scrambling function before using them. See 
#hashMultiply in Pharo for an example scrambling function.
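
To make the clumping point concrete, here is a minimal sketch in Python
(not GemStone or Pharo code). The names scramble and total_probes are
hypothetical helpers invented for this illustration. The multiplier
1664525 and the 28-bit mask are what I recall of Pharo's #hashMultiply
and should be treated as assumptions rather than a verified copy. The
sketch inserts two runs of consecutive keys that overlap modulo the
table size and counts how many occupied slots linear probing has to
step over.

```python
MULTIPLIER = 1664525   # assumption: constant recalled from Pharo's #hashMultiply
MASK_28 = 0x0FFFFFFF   # assumption: keep only the low 28 bits


def scramble(h):
    """Multiplicative scramble of a non-negative integer hash."""
    return (h * MULTIPLIER) & MASK_28


def total_probes(keys, table_size, hash_fn):
    """Insert keys into an open-addressing table with linear probing.

    Returns the total number of occupied slots stepped over.
    Assumes len(keys) < table_size so that a free slot always exists.
    """
    table = [None] * table_size
    probes = 0
    for key in keys:
        slot = hash_fn(key) % table_size
        while table[slot] is not None:
            slot = (slot + 1) % table_size
            probes += 1
        table[slot] = key
    return probes


# Two runs of consecutive keys that land on the same slots modulo 101.
keys = list(range(0, 40)) + list(range(101, 141))
table_size = 101  # a prime size, chosen only for this illustration

identity_probes = total_probes(keys, table_size, lambda k: k)
scrambled_probes = total_probes(keys, table_size, scramble)
```

With the hash used as-is, the first run fills slots 0 to 39 as one solid
block, so every key in the second run has to walk past 40 occupied
slots. The scrambled version still has the same pairwise collisions, but
the occupied slots are broken into short runs, so each probe sequence
ends after a few steps. Note that a multiplicative scramble like this
leaves the low bits patterned, so it helps most when the table size is
not a power of two.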

Regards,

-Martin

