[Glass] GsQuery results are duplicated

Dale Henrichs via Glass glass at lists.gemtalksystems.com
Sat Jan 7 16:47:49 PST 2017


Thanks for providing the additional details.

Hmmm, seems like a bug ... interesting that we haven't run across this 
earlier ...

I am assuming that your instancesSet is actually a kind of Bag and that 
all of the result sets are Bags as well - otherwise there would not be 
duplicates...

It looks like the behavior you are seeing stems from the fact that we 
use the #+ operation when combining intermediate results for predicates 
when the #| query operator is used (see 
GsCompoundClause>>resultOperatorFor: and IdentityBag>>+). The #+ 
operator adds together the occurrence counts from both operands when 
creating the new bag, so an object that satisfies more than one 
predicate shows up more than once in the combined result ...
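
To make the effect concrete, here is a minimal sketch that builds the 
intermediate results by hand. The variable names (a, b, c, left, right, 
combined) are made up for illustration, and the two hand-built bags are 
only an assumption about what each predicate would contribute, not a 
trace of what the query engine literally does:

```smalltalk
| a b c left right combined |
"Three distinct association objects, as in a small test bag"
a := 1 -> '1'.
b := 2 -> '1'.
c := 3 -> '1'.

"Hypothetical stand-in for the result of (each.key = 2)"
left := IdentityBag new.
left add: b.

"Hypothetical stand-in for the result of (each.value = '1')"
right := IdentityBag new.
right add: a; add: b; add: c.

"IdentityBag>>+ sums occurrence counts, so b is counted once per operand"
combined := left + right.
combined occurrencesOf: b.   "expected to answer 2, given the #+ behavior described above"
combined size.               "expected to answer 4 rather than 3"
```

With a true set union, b would be counted only once; the extra 
occurrence is where the inflated sizes come from.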

Sooooo this bug is tied into the expected behavior of bags and I'm not 
quite sure that there is a right answer here ... suggestions for 
alternative behaviors would be welcomed:)

After a bit of study, I have a feeling that the "right answer" will be 
something like the following:

   | nsc query |
   nsc := IdentityBag new.
   nsc add: 1 -> '1'.
   nsc add: 2 -> '1'.
   nsc add: 3 -> '1'.
   query := '(each.key = 2) | (each.value = ''1'')' asQueryOn: nsc.
   query queryResult * nsc

The effect of `* nsc` is to normalize the object counts in your result 
to match the counts in the original bag, which gives a result that 
would be correct even if the original bag has multiple occurrences of 
the objects... #* is implemented as a primitive and our identity-based 
intersection and union primitives are pretty efficient (they do not page 
the objects into memory ... operations are performed at the collection 
leaf level without touching the elements themselves).
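
As a rough illustration of that normalization, the sketch below uses an 
original bag that really does hold one object twice. As before, the 
names and the hand-built `combined` bag are hypothetical; `combined` is 
an assumption about what the #+ step would produce for the two 
predicates:

```smalltalk
| a b nsc combined normalized |
a := 1 -> '1'.
b := 2 -> '1'.

"Original bag: a occurs twice, b occurs once"
nsc := IdentityBag new.
nsc add: a; add: a; add: b.

"Hypothetical inflated result: (key = 2) contributes b,
 (value = '1') contributes a, a, b"
combined := IdentityBag new.
combined add: a; add: a; add: b; add: b.

"Intersection keeps, for each object, the smaller of the two counts"
normalized := combined * nsc.
normalized occurrencesOf: a.   "expected to answer 2 (the legitimate duplicate survives)"
normalized occurrencesOf: b.   "expected to answer 1 (the spurious duplicate is dropped)"
```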

If you know that your Bag does not contain multiple occurrences (or you 
don't care), then converting the query result using #asSet would also 
work ... but #asSet will end up paging in all of the objects in the 
result set and I actually think that `* nsc` would end up being more 
efficient.
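
For comparison, the #asSet variant would look something like the 
following sketch, which reuses the same three-element bag as the `* nsc` 
example. Keep in mind that collapsing to a Set also throws away any 
legitimate multiple occurrences that were in the original bag:

```smalltalk
| nsc query |
nsc := IdentityBag new.
nsc add: 1 -> '1'.
nsc add: 2 -> '1'.
nsc add: 3 -> '1'.
query := '(each.key = 2) | (each.value = ''1'')' asQueryOn: nsc.
"Collapse duplicates by converting the result bag to a Set"
query queryResult asSet
```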

I was thinking that using a do: block might actually give better 
results, but it turns out that the block also gives duplicate results 
(for a slightly different reason):

   | nsc query ar |
   nsc := IdentityBag new.
   nsc add: 1 -> '1'.
   nsc add: 2 -> '1'.
   nsc add: 3 -> '1'.
   query := '(each.key = 2) | (each.value = ''1'')' asQueryOn: nsc.
   ar := {}.
   query do: [ :each | ar add: each ].
   ar

I've created internal bug reports for both of these:

   46607  GsCompoundClause>>executeAndDo: and 
          GsCompoundClause>>executeAndDo:using: not correct for #| query 
          operator
   46609  GsCompoundClause>>executeClauseUsing: and 
          GsCompoundClause>>executeClauseNegated not correct for #| query 
          operator on bag-based collections

Thanks again for reporting this ... I imagine that the solution to both 
bugs will be in Smalltalk code, so if you are interested, I can probably 
provide a patch for one or both problems, otherwise the fix will be in 
the upcoming 3.4.0 (we're aiming at late spring, early summer at the 
moment).

Dale

On 1/7/17 2:20 PM, BrunoBB via Glass wrote:
> Dale,
>
> There is no duplicates in the main collection.
>
> (instancesSet size = instancesSet asSet size). "answer true"
> instancesSet occurrencesOf: (instancesSet detect: [:each | each username=
> 'admin']). "answer 1"
>
> (instancesSet select: [:each | (each username = 'admin') or: [(each
> groupname = 'admin')
> or:[(each groupname = 'orbeon-role')]]]) size. "answer 12"
>
> ('(each.username = ''admin'') | (each.groupname = ''admin'') |
> (each.groupname = ''orbeon-role'')' asQueryOn: instancesSet)
> size. "answer 24"
>
> ('(each.groupname|username = ''admin'') | (each.groupname =
> ''orbeon-role'')' asQueryOn: instancesSet)
> size. "answer 24"
>
> ('(each.groupname|username = ''admin'') | (each.groupname = ''admin'') |
> (each.groupname = ''orbeon-role'')' asQueryOn: instancesSet)
> size. "answer 36"
>
> ('(each.username = ''admin'') | (each.groupname = ''admin'')' asQueryOn:
> instancesSet)
> size. "answer 24"
>
> ('(each.username = ''admin'')' asQueryOn: instancesSet)
> size. "answer 12"
>
> No idea what is going on...
>
> regards
> bruno
>
>
>
> --
> View this message in context: http://forum.world.st/GsQuery-results-are-duplicated-tp4929023p4929027.html


