[Glass] There is really no way to have an ordered/sorted collection together with indexes?

Mariano Martinez Peck via Glass glass at lists.gemtalksystems.com
Mon Aug 7 19:53:11 PDT 2017


On Mon, Aug 7, 2017 at 10:47 PM, Dale Henrichs via Glass <
glass at lists.gemtalksystems.com> wrote:

>
>
> On 8/7/17 1:04 PM, Richard Sargent via Glass wrote:
>
>> GLASS mailing list wrote
>>
>>> Hi guys,
>>>
>>> I am storing huge lists of "prices". For this, it really helps me to store
>>> things ordered (by date), either in a SequenceableCollection or a
>>> SortedCollection. On the other hand, I do want to use indexes to speed up
>>> my query to find the price for a given date (equality index).
>>>
>>> But I have found no way to have them both. The only workaround I found is
>>> to keep 2 collections for each of these collections, one sorted/ordered,
>>> and the other one an unordered one for querying via index. But this is a
>>> pain from a development point of view, and it causes unnecessary
>>> repository growth.
>>>
>>> Am I missing something?
>>>
>> Mariano, have you read chapter 7 in the GemStone/S 64 Programming Guide?
>> It's all about indexing.
>>
>> It looks like you could define a "range" query over the unordered
>> collection to get the sorted sequence needed to traverse all the prices
>> in date order. The index would be on the date and the range query would
>> be from some "least date" through some "greatest date". You would then
>> use the streaming results (or the #do: message, I think) to iterate over
>> the query result in date order.
>> There is a section in chapter 7.2 discussing result order.
>>
>>
>> It might be helpful to discuss the use cases for the two collections. When
>> would you iterate over all the dates versus when would you search for
>> specific dates or ranges of dates?
>>
>>
> Actually I think you can just set up two queries ... one to search for
> the exact date:
>
>   detectQuery := (GsQuery fromString: 'each.value = targetDate' on: nsc)
>
> and one query to find the first date prior to your targetDate:
>
>   nearestQuery := (GsQuery fromString: 'each.value < targetDate' on: nsc)
>
>
>

Hi Dale,

Thanks for your answers. I am headed to bed right now, but I have a quick
question so that I have as much info as possible for tomorrow...

Aside from the two queries above, I would still need these 2 more APIs:

3) get the whole price history sorted.
4) get the newest available price of the collection

How can I make those fast too, taking advantage of the index? For 3) I can
simply run the nearestQuery with a future day (like tomorrow), but maybe
there is a cleaner way? And for 4)?

In summary... I am comparing whether to store things sorted (so that 3) and
4) are fast: 3) is simply the same collection, nothing to do, and 4) is a
simple #last) and do a binary search for the exact query and the nearest
query, vs. storing things in a bag and doing everything via the index
(querying will be fast, but I have doubts about 3) and 4)... mostly about
3)).
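
To make 3) and 4) concrete, here is a minimal sketch of how they might look
when done purely through the index. It reuses only the calls from the quoted
example (GsIndexSpec, GsQuery fromString:on:, bind:to:, reversedReadStream)
plus two assumptions: that the query also answers #readStream, streaming in
ascending index order, and that the stream understands #atEnd. The
upperBound sentinel date and the sample prices are hypothetical. Untested.

```smalltalk
| nsc upperBound boundedQuery newest stream history |
"Setup mirrors the quoted example: each element is price -> date,
 with an equality index on the date ('value')."
nsc := IdentityBag new.
GsIndexSpec new
  equalityIndex: 'value' lastElementClass: Date;
  createIndexesOn: nsc.
"Hypothetical sample data."
nsc add: (ScaledDecimal for: 10.5 scale: 2) -> (Date newDay: 10 year: 2016).
nsc add: (ScaledDecimal for: 11.25 scale: 2) -> (Date newDay: 200 year: 2017).
nsc add: (ScaledDecimal for: 9.75 scale: 2) -> (Date newDay: 45 year: 2015).

"Assumption: a sentinel date later than anything stored."
upperBound := Date newDay: 1 year: 2100.
boundedQuery := (GsQuery fromString: 'each.value < upperBound' on: nsc)
  bind: 'upperBound'
  to: upperBound.

"4) newest price: first element when streaming the index backwards."
newest := boundedQuery reversedReadStream next.

"3) whole history in date order. Assumes #readStream and #atEnd exist
 and that the stream follows ascending index order."
stream := boundedQuery readStream.
history := OrderedCollection new.
[ stream atEnd ] whileFalse: [ history add: stream next ].

{newest. history}
```

If that holds, 4) costs one btree lookup, and 3) is a full index traversal
rather than a sort.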


Thanks a lot!



> Then you can use detect:ifNone: sent to the query object itself to either
> find the first matching date using the index or the first date prior to the
> target date using an index ... in both queries you are using very efficient
> btree lookups and avoiding the need to scan your collection (key is the
> price and value is the date):
>
>       | nsc random maxYear detectQuery targetDate result |
>       nsc := IdentityBag new.
>       random := HostRandom new.
>       GsIndexSpec new
>         equalityIndex: 'value' lastElementClass: Date;
>         createIndexesOn: nsc.
>       1 to: 100 do: [ :index |
>         nsc
>           add: (ScaledDecimal for: random float scale: 2) ->
>             (Date
>               newDay: (random integerBetween: 1 and: 365)
>               year: (random integerBetween: 2000 and: 2017)) ].
>       targetDate := Date newDay: 250 year: 2011.
>       detectQuery := (GsQuery fromString: 'each.value = targetDate' on: nsc)
>         bind: 'targetDate'
>         to: targetDate.
>       result := detectQuery
>         detect: [ :date | true ]
>         ifNone: [ | nearestQuery |
>           nearestQuery := (GsQuery fromString: 'each.value < targetDate' on: nsc)
>             bind: 'targetDate'
>             to: targetDate.
>           nearestQuery reversedReadStream next ].
>       {nsc. nsc sortAscending: 'value'. targetDate. result}
>
> I'm returning the sorted nsc, to make it easy to validate that  the result
> is correct, since we're generating random dates.
> The parsed query can be persisted (with or without the nsc attached) to
> avoid the overhead of parsing the query string each time you run the query
> ...
>
>



-- 
Mariano
http://marianopeck.wordpress.com