[Glass] There is really no way to have an ordered/sorted collection together with indexes?

Dale Henrichs via Glass glass at lists.gemtalksystems.com
Tue Aug 8 07:13:29 PDT 2017



On 8/7/17 7:53 PM, Mariano Martinez Peck wrote:
>
>
> On Mon, Aug 7, 2017 at 10:47 PM, Dale Henrichs via Glass 
> <glass at lists.gemtalksystems.com 
> <mailto:glass at lists.gemtalksystems.com>> wrote:
>
>
>
>     On 8/7/17 1:04 PM, Richard Sargent via Glass wrote:
>
>         GLASS mailing list wrote
>
>             Hi guys,
>
>             I am storing huge lists of "prices". For this, it really helps
>             me to store things ordered (by date), either in a
>             SequenceableCollection or a SortedCollection. On the other
>             hand, I do want to use indexes to speed up my query to find
>             the price for a given date (equality index).
>
>             But I have found no way to have both. The only workaround I
>             found is to keep 2 collections for each of these collections,
>             one sorted/ordered and the other an unordered one for querying
>             via index. But this is a pain from a development point of
>             view, and it causes unnecessary repository growth.
>
>             Am I missing something?
>
>         Mariano, have you read chapter 7 in the GemStone/S 64
>         Programming Guide?
>         It's all about indexing.
>
>         It looks like you could define a "range" query over the unordered
>         collection to get the sorted sequence needed to traverse all the
>         prices in date order. The index would be on the date and the range
>         query would be from some "least date" through some "greatest
>         date". You would then use the streaming results (or the #do:
>         message, I think) to iterate over the query result in date order.
>         There is a section in chapter 7.2 discussing result order.
>
>
>         It might be helpful to discuss the use cases for the two
>         collections. When would you iterate over all the dates versus
>         when would you search for specific dates or ranges of dates?
>
>
>     Actually I think you can just set up two queries ... one to search
>     for the exact date:
>
>       detectQuery := (GsQuery fromString: 'each.value = targetDate' on: nsc)
>
>     and one query to find the first date prior to your targetDate:
>
>       nearestQuery := (GsQuery fromString: 'each.value < targetDate' on: nsc)
>
>
>
>
> Hi Dale,
>
> Thanks for your answers. I am headed to bed right now, but I have a 
> quick question, so as to have as much info as possible for tomorrow...
>
> Aside from the two queries above, I would still need these 2 more APIs:
>
> 3) get the whole price history sorted.
You've got several options here: sortAscending: produces a sorted 
collection; a GsQuery combined with do:, detect:, etc.; a GsQuery 
combined with readStream; or a separate SortedCollection (as I 
described in response to your "can't we use indexes for sorting too" 
thread).

> 4) get the newest available price of the collection
You get the last price from a GsQuery using a reversedReadStream ...
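
A minimal sketch of the query-based answers to 3) and 4), using only the calls from my earlier example. The sample prices, the far-future cutoff date and the variable names are made up for illustration. I am also assuming that the query streams results in ascending date order and that the stream answers atEnd, so check the section on result order in the Programming Guide before relying on either:

```smalltalk
| nsc cutoff allPrices history stream newest |
nsc := IdentityBag new.
GsIndexSpec new
  equalityIndex: 'value' lastElementClass: Date;
  createIndexesOn: nsc.
"three sample prices: key is the price and value is the date"
nsc add: (ScaledDecimal for: 10.5 scale: 2) -> (Date newDay: 10 year: 2016).
nsc add: (ScaledDecimal for: 11.25 scale: 2) -> (Date newDay: 200 year: 2017).
nsc add: (ScaledDecimal for: 9.75 scale: 2) -> (Date newDay: 45 year: 2015).
"hypothetical cutoff far in the future, so the range covers every price"
cutoff := Date newDay: 1 year: 2100.
allPrices := (GsQuery fromString: 'each.value < cutoff' on: nsc)
  bind: 'cutoff'
  to: cutoff.
"3) collect the whole history in the order the query yields it"
history := OrderedCollection new.
allPrices do: [ :assoc | history add: assoc ].
"4) newest price: first element of the reversed stream, nil if there is none"
stream := allPrices reversedReadStream.
newest := stream atEnd
  ifTrue: [ nil ]
  ifFalse: [ stream next ].
{history. newest}
```

Note that 3) still has to touch every element, so the index only saves you the sort, not the traversal.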
>
> How can I make those fast too taking advantage of the index? For 3) I 
> can simply do the nearestQuery with a future day (like tomorrow), but 
> maybe there is a cleaner way?  And for 4?
>
> In summary, I am comparing whether to store things sorted (so that 3 
> and 4 are fast: 3 is simply the same collection, nothing to do, and 4 
> is a simple #last) and do a binary search for the exact query and 
> nearest query, vs. storing things in a bag and doing everything via 
> index (querying will be fast, but I have doubts about 3 and 4, mostly 
> about 3).
You are right: if you really need to produce the sorted price history 
quickly, then your best bet is to keep two collections. Use indexed 
queries for fast price lookups and the sorted collection for price 
history, first and last ...
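
Roughly, the two-collection setup could look like the sketch below. The names byDate and history and the two sample prices are hypothetical, and in real code you would wrap both collections in one object so that every add and remove goes to both:

```smalltalk
| byDate history targetDate lookup |
"indexed bag for fast lookups by date"
byDate := IdentityBag new.
GsIndexSpec new
  equalityIndex: 'value' lastElementClass: Date;
  createIndexesOn: byDate.
"sorted collection for history, first and last"
history := SortedCollection sortBlock: [ :a :b | a value <= b value ].
"every price goes into both collections"
{(ScaledDecimal for: 10.5 scale: 2) -> (Date newDay: 10 year: 2016).
 (ScaledDecimal for: 11.25 scale: 2) -> (Date newDay: 200 year: 2017)}
  do: [ :assoc |
    byDate add: assoc.
    history add: assoc ].
targetDate := Date newDay: 10 year: 2016.
"exact lookup goes through the index; nil when no price has that date"
lookup := ((GsQuery fromString: 'each.value = targetDate' on: byDate)
  bind: 'targetDate'
  to: targetDate)
    detect: [ :assoc | true ]
    ifNone: [ nil ].
{lookup. history first. history last. history asArray}
```

The cost is the one you already noted: each price is referenced from two collections, and you have to keep them in sync yourself.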

Dale
>
> Thanks a lot!
>
>     Then you can send detect:ifNone: to the query object itself to find
>     either the first matching date or, failing that, the first date
>     prior to the target date. Both queries use very efficient btree
>     lookups on the index and avoid the need to scan your collection
>     (key is the price and value is the date):
>
>           | nsc random detectQuery targetDate result |
>           nsc := IdentityBag new.
>           random := HostRandom new.
>           "equality index on the date (the value of each association)"
>           GsIndexSpec new
>             equalityIndex: 'value' lastElementClass: Date;
>             createIndexesOn: nsc.
>           "100 random price -> date associations"
>           1 to: 100 do: [ :index |
>             nsc
>               add: (ScaledDecimal for: random float scale: 2) ->
>                 (Date
>                   newDay: (random integerBetween: 1 and: 365)
>                   year: (random integerBetween: 2000 and: 2017)) ].
>           targetDate := Date newDay: 250 year: 2011.
>           detectQuery := (GsQuery fromString: 'each.value = targetDate' on: nsc)
>             bind: 'targetDate'
>             to: targetDate.
>           "exact match first; otherwise the latest date before targetDate"
>           result := detectQuery
>             detect: [ :each | true ]
>             ifNone: [ | nearestQuery |
>               nearestQuery := (GsQuery fromString: 'each.value < targetDate' on: nsc)
>                 bind: 'targetDate'
>                 to: targetDate.
>               nearestQuery reversedReadStream next ].
>           {nsc. nsc sortAscending: 'value'. targetDate. result}
>
>     I'm returning the sorted nsc, to make it easy to validate that 
>     the result is correct, since we're generating random dates.
>     The parsed query can be persisted (with or without the nsc
>     attached) to avoid the overhead of parsing the query string each
>     time you run the query ...
>
>
>
> -- 
> Mariano
> http://marianopeck.wordpress.com
