[Glass] There is really no way to have an ordered/sorted collection together with indexes?

Dale Henrichs via Glass glass at lists.gemtalksystems.com
Mon Aug 7 18:47:27 PDT 2017



On 8/7/17 1:04 PM, Richard Sargent via Glass wrote:
> GLASS mailing list wrote
>> Hi guys,
>>
>> I am storing huge lists of "prices". For this, it really helps me to store
>> things ordered (by date)...either in a SequenceableCollection or a
>> SortedCollection. On the other hand, I do want to use indexes to speed up
>> my query to find the price for a given date (equality index).
>>
>> But I have found no way to have them both. The only workaround I found is
>> to keep 2 collections for each of these collections, one sorted/ordered,
>> and the other one an unordered one for querying via index. But this is a
>> pain from a "developing" point of view, as well as for unnecessary
>> repository growth.
>>
>> Am I missing something?
> Mariano, have you read chapter 7 in the GemStone/S 64 Programming Guide?
> It's all about indexing.
>
> It looks like you could define a "range" query over the unordered collection
> to get the sorted sequence needed to traverse all the prices in date order.
> The index would be on the date and the range query would be from some "least
> date" through some "greatest date". You would then use the streaming results
> (or the #do: message, I think) to iterate over the query result in date order.
> There is a section in chapter 7.2 discussing result order.
>
>
> It might be helpful to discuss the use cases for the two collections. When
> would you iterate over all the dates versus when would you search for
> specific dates or ranges of dates?
>
>
Actually I think you can just set up two queries ... one to search for 
the exact date:

   detectQuery := (GsQuery fromString: 'each.value = targetDate' on: nsc)

and one query to find the first date prior to your targetDate:

   nearestQuery := (GsQuery fromString: 'each.value < targetDate' on: nsc)


Then you can send detect:ifNone: to the query object itself to
find either the first matching date or, failing that, the nearest date
prior to the target date. Both queries use the index, so you get
very efficient btree lookups and avoid the need to scan your
collection (the key is the price and the value is the date):

       | nsc random detectQuery targetDate result |
       nsc := IdentityBag new.
       random := HostRandom new.
       "Equality index on the association value (the date)"
       GsIndexSpec new
         equalityIndex: 'value' lastElementClass: Date;
         createIndexesOn: nsc.
       "Populate with 100 random price -> date associations"
       1 to: 100 do: [ :index |
         nsc
           add: (ScaledDecimal for: random float scale: 2) ->
             (Date
               newDay: (random integerBetween: 1 and: 365)
               year: (random integerBetween: 2000 and: 2017)) ].
       targetDate := Date newDay: 250 year: 2011.
       detectQuery := (GsQuery fromString: 'each.value = targetDate' on: nsc)
         bind: 'targetDate'
         to: targetDate.
       "Exact match first; otherwise fall back to the nearest earlier date"
       result := detectQuery
         detect: [ :each | true ]
         ifNone: [ | nearestQuery |
           nearestQuery := (GsQuery fromString: 'each.value < targetDate' on: nsc)
             bind: 'targetDate'
             to: targetDate.
           nearestQuery reversedReadStream next ].
       {nsc. nsc sortAscending: 'value'. targetDate. result}

I'm returning the sorted nsc to make it easy to validate that the
result is correct, since we're generating random dates.
The parsed query can be persisted (with or without the nsc attached) to 
avoid the overhead of parsing the query string each time you run the 
query ...
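
As a rough sketch of that last point, combined with the range query
Richard suggested: the parsed query below is stored without a collection
attached, then copied, attached to the indexed nsc, and bound each time
it is used. The key #PricesInRangeQuery, the variable names, and the
sample data are made up for illustration. I'm also assuming that copying
the stored query before binding is the safe way to reuse it, and that
the two comparisons on the same indexed path are evaluated as one range
over the index, so the stream answers elements in date order. Check the
indexing chapter of the Programming Guide before relying on either
assumption:

       | nsc template query stream results |
       "Sample data: a small indexed collection of price -> date associations"
       nsc := IdentityBag new.
       GsIndexSpec new
         equalityIndex: 'value' lastElementClass: Date;
         createIndexesOn: nsc.
       nsc add: (ScaledDecimal for: 10.5 scale: 2) -> (Date newDay: 40 year: 2011).
       nsc add: (ScaledDecimal for: 11.25 scale: 2) -> (Date newDay: 10 year: 2011).
       nsc add: (ScaledDecimal for: 9.75 scale: 2) -> (Date newDay: 200 year: 2012).

       "One-time setup: parse the string once and persist the query.
        #PricesInRangeQuery is a hypothetical key, not a GemStone convention."
       template := GsQuery
         fromString: '(each.value >= startDate) & (each.value <= endDate)'.
       UserGlobals at: #PricesInRangeQuery put: template.
       System commitTransaction.

       "Each use: copy the stored query, attach the collection, bind the dates.
        Copying first is an assumption that keeps the stored query unbound."
       query := (UserGlobals at: #PricesInRangeQuery) copy.
       query on: nsc.
       query := query bind: 'startDate' to: (Date newDay: 1 year: 2011).
       query := query bind: 'endDate' to: (Date newDay: 365 year: 2011).

       "Stream over the result and collect the matching associations"
       results := OrderedCollection new.
       stream := query readStream.
       [ stream atEnd ] whileFalse: [ results add: stream next ].
       results

If that holds, a single indexed collection covers both the equality
lookup and the ordered traversal, so the second sorted collection would
not be needed.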


