[Glass] There is really no way to have an ordered/sorted collection together with indexes?

Richard Sargent via Glass glass at lists.gemtalksystems.com
Tue Aug 8 08:46:29 PDT 2017


GLASS mailing list wrote
> On Mon, Aug 7, 2017 at 10:47 PM, Dale Henrichs via Glass <glass at .gemtalksystems> wrote:
> 
>>
>>
>> On 8/7/17 1:04 PM, Richard Sargent via Glass wrote:
>>
>>> GLASS mailing list wrote
>>>
>>>> Hi guys,
>>>>
>>>> I am storing huge lists of "prices". For this, it really helps me to store
>>>> things ordered (by date), either in a SequenceableCollection or a
>>>> SortedCollection. On the other hand, I do want to use indexes to speed up
>>>> my query to find the price for a given date (equality index).
>>>>
>>>> But I have found no way to have both. The only workaround I found is to
>>>> keep 2 collections for each of these collections, one sorted/ordered and
>>>> the other an unordered one for querying via the index. But this is a pain
>>>> from a development point of view, and it causes unnecessary repository
>>>> growth.
>>>>
>>>> Am I missing something?
>>>>
>>> Mariano, have you read chapter 7 in the GemStone/S 64 Programming Guide?
>>> It's all about indexing.
>>>
>>> It looks like you could define a "range" query over the unordered
>>> collection to get the sorted sequence needed to traverse all the prices in
>>> date order. The index would be on the date and the range query would be
>>> from some "least date" through some "greatest date". You would then use
>>> the streaming results (or the #do: message, I think) to iterate over the
>>> query result in date order. There is a section in chapter 7.2 discussing
>>> result order.
>>>
>>>
>>> It might be helpful to discuss the use cases for the two collections. When
>>> would you iterate over all the dates versus when would you search for
>>> specific dates or ranges of dates?
>>>
>>
>> Actually I think you can just set up two queries ... one to search for
>> the exact date:
>>
>>   detectQuery := (GsQuery fromString: 'each.value = targetDate' on: nsc)
>>
>> and one query to find the first date prior to your targetDate:
>>
>>   nearestQuery := (GsQuery fromString: 'each.value < targetDate' on: nsc)
>>
>>
>>
> 
> Hi Dale,
> 
> Thanks for your answers. I am headed to bed right now, but I have a quick
> question so that I have as much info as possible for tomorrow...
> 
> Aside from the two queries above, I would still need these 2 more APIs:
> 
> 3) get the whole price history sorted.

Mariano, given the size of this collection, how would having all 1 million
prices (or 10 million?) be usable? #do: would allow you to iterate across
the collection indexed by date and get the prices in date order, as would
streaming over the result. Dale points out that you can get the result set
as a collection, but what would you do with it that requires the entire
sorted collection at one time?
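
A minimal sketch of that kind of date-ordered traversal, assuming the same
setup as Dale's example quoted below (an IdentityBag of price -> date
associations with an equality index on 'value'). The variable names and the
upper bound of Date today are illustrative assumptions, and this has not been
run against a live stone. It relies on the stream over an indexed query
returning results in index (date) order, as described in chapter 7.2:

```smalltalk
| nsc historyQuery stream |
nsc := IdentityBag new.
GsIndexSpec new
  equalityIndex: 'value' lastElementClass: Date;
  createIndexesOn: nsc.
"... add price -> date associations to nsc here ..."

"Assumption: no price is dated later than today, so this range covers everything."
historyQuery := (GsQuery fromString: 'each.value <= latestDate' on: nsc)
  bind: 'latestDate'
  to: Date today.

"Walk the results one at a time instead of building a sorted copy."
stream := historyQuery readStream.
[ stream atEnd ] whileFalse: [ | priceAndDate |
  priceAndDate := stream next.
  "priceAndDate key is the price, priceAndDate value is the date" ]
```

Nothing here materializes the whole history, so the cost is paid only for the
elements actually visited.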


> 4) get the newest available price of the collection
> 
> How can I make those fast too taking advantage of the index? For 3) I can
> simply do the nearestQuery with a future day (like tomorrow), but maybe
> there is a cleaner way?  And for 4?
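
For 4), one possible sketch reuses the reversedReadStream idiom from Dale's
example: query a range that covers every stored date and take the first
element of the reversed stream. The bound of Date today and the variable
names are assumptions for illustration, the sketch assumes no price is dated
in the future, and it has not been run against a live stone:

```smalltalk
| nsc newestQuery stream newest |
nsc := IdentityBag new.
GsIndexSpec new
  equalityIndex: 'value' lastElementClass: Date;
  createIndexesOn: nsc.
"... add price -> date associations to nsc here ..."

newestQuery := (GsQuery fromString: 'each.value <= latestDate' on: nsc)
  bind: 'latestDate'
  to: Date today.

"The reversed stream starts at the greatest date in the index."
stream := newestQuery reversedReadStream.
newest := stream atEnd
  ifTrue: [ nil ]
  ifFalse: [ stream next ].
newest
```

The atEnd guard returns nil for an empty collection instead of reading past
the end of the stream.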
> 
> In summary... I am comparing whether to store things sorted (so that 3 and 4
> are fast: 4 is a simple #last and 3 is simply the same collection, nothing
> to do) and do a binary search for the exact and nearest queries, versus
> storing things in a bag and doing everything via the index (querying will be
> fast, but I have doubts about 3) and 4), mostly about 3)).
> 
> 
> Thanks a lot!
> 
> 
> 
>> Then you can use detect:ifNone: sent to the query object itself to either
>> find the first matching date using the index or the first date prior to
>> the target date using an index ... in both queries you are using very
>> efficient btree lookups and avoiding the need to scan your collection (key
>> is the price and value is the date):
>>
>>       | nsc random maxYear detectQuery targetDate result |
>>       nsc := IdentityBag new.
>>       random := HostRandom new.
>>       GsIndexSpec new
>>         equalityIndex: 'value' lastElementClass: Date;
>>         createIndexesOn: nsc.
>>       1 to: 100 do: [ :index |
>>         nsc
>>           add: (ScaledDecimal for: random float scale: 2) ->
>>             (Date
>>               newDay: (random integerBetween: 1 and: 365)
>>               year: (random integerBetween: 2000 and: 2017)) ].
>>       targetDate := Date newDay: 250 year: 2011.
>>       detectQuery := (GsQuery fromString: 'each.value = targetDate' on: nsc)
>>         bind: 'targetDate'
>>         to: targetDate.
>>       result := detectQuery
>>         detect: [ :date | true ]
>>         ifNone: [ | nearestQuery |
>>           nearestQuery := (GsQuery fromString: 'each.value < targetDate' on: nsc)
>>             bind: 'targetDate'
>>             to: targetDate.
>>           nearestQuery reversedReadStream next ].
>>       {nsc. nsc sortAscending: 'value'. targetDate. result}
>>
>> I'm returning the sorted nsc to make it easy to validate that the result
>> is correct, since we're generating random dates.
>> The parsed query can be persisted (with or without the nsc attached) to
>> avoid the overhead of parsing the query string each time you run the
>> query ...
>>
>>
>> _______________________________________________
>> Glass mailing list
>> Glass at .gemtalksystems
>> http://lists.gemtalksystems.com/mailman/listinfo/glass
>>
> 
> 
> 
> -- 
> Mariano
> http://marianopeck.wordpress.com
> 
--
View this message in context: http://forum.world.st/There-is-really-no-way-to-have-an-ordered-sorted-collection-together-with-indexes-tp4959121p4959196.html
Sent from the GLASS mailing list archive at Nabble.com.