[Glass] Cost of includesKey in RcKeyValueDictionary
Martin McClure
martin.mcclure at gemtalksystems.com
Wed Feb 26 15:36:28 PST 2020
Hi Bruno,
What does the ID of a B object get used for? Is is needed to look up a B
object in the A object?
If so, I think using a random SmallInteger key is a reasonable approach,
likely less trouble than other possibilities. A UUID does not fit in a
SmallInteger and would take more space, and though a UUID could reduce
the probability of clashes to the point that you didn't need to test for
them, it might well be slower than using a SmallInteger and testing for
clashes.
Lookups with #includesKey: are fairly fast. I don't know what kind of
speed you need, but using a variant of your test code (included below) I
see the new keys being added at a rate of about 60,000 per second on an
8-year-old not very fast desktop machine I have at home. I'm using more
of the SmallInteger range by using a 56-bit random ID. The increase from
a 32-bit ID has very little impact on speed, in my quick testing.
Regards,
-Martin
| numbers instances r int milli1 milli2 clashes|
numbers := KeyValueDictionary new.
r := Lag1MwcRandom new.
milli1 := System millisecondsToRun: [
instances := 900000.
1 to: instances do: [:each | int := r integer + ((r integer bitAnd:
16rFFFFFF) bitShift: 32).
numbers at: int put: int]. "already generated collection"].
clashes := 0.
milli2 := System millisecondsToRun: [ "new ids"
1 to: 10000 do: [:each | int := ((r integer bitAnd: 16rFFFFFF) bitShift:
32).
[numbers includesKey: int] whileTrue: [clashes :=
clashes + 1. int :=
int +1].
numbers at: int put: int].
].
^ Array with: clashes with: milli1 with: milli2
On 2/26/20 1:50 PM, BrunoBB via Glass wrote:
> Hi,
>
> A domain object (A) has a RcKeyValueDictionary to store large quantity of
> (B) objects.
>
> B objects has a random id generated with:
> id := Lag1MwcRandom new integer. "i think the max number is 4294967295"
>
> The problem here is that with 100.000 (B) objects the probability of id
> clash is pretty high.
>
> It can be fixed with #includesKey: (inside a loop :( ) but which is the cost
> of sending #includesKey: in RcKeyValueDictionary ?
>
> It all reduce to #_at: but is a primitive. As far as i remember in a Martin
> McClure presentation about Hash functions the cost should NOT be to much.
> But just want to confirm this :)
>
> Maybe a better solution it will be to use persistent counters ?
> Or maybe a UUID ? (which has the same problem but a lower clash
> probability)
>
> regards,
> bruno
>
> PS:
> At first it seems that the loop is NOT a good idea but the collections of
> ids will be already generated and the #includesKey: does not take to long.
>
> | numbers instances r int milli clashes|
> numbers := Dictionary new.
> r := Lag1MwcRandom new.
> instances := 900000.
> 1 to: instances do: [:each | int := r integer.
> numbers at: int put: int]. "already generated collection"
>
> clashes := 0.
> milli := Time millisecondsToRun: [ "new ids"
> 1 to: 10000 do: [:each | int := r integer.
> [numbers includesKey: int] whileTrue: [clashes := clashes + 1. int :=
> int +1].
> numbers at: int put: int].
> ].
> Array with: clashes with: milli.
>
>
>
>
> --
> Sent from: http://forum.world.st/GLASS-f1460844.html
> _______________________________________________
> Glass mailing list
> Glass at lists.gemtalksystems.com
> https://lists.gemtalksystems.com/mailman/listinfo/glass
More information about the Glass
mailing list