[Glass] Where are indices really useful ?
Dale Henrichs via Glass
glass at lists.gemtalksystems.com
Sat Jan 28 12:02:03 PST 2017
Marten,
I suggest that you wait for 3.4 before making up your mind about
indexing support.
The work that we are doing for 3.4 should significantly speed up the
execution of queries, but I don't think that query speedup is the only
thing that can be done ...
The main thrust of my efforts for 3.4 involves eliminating the need for
the RcIndexDictionary ... Without going into gory details, eliminating
the RcIndexDictionary means that we can eliminate quite a bit of object
faulting and execution time associated with **each object in a query
result***. In my initial prototype, I saw an overall speedup of 2x in
running one of the major index tests in our test suite ... this test
spends most of it's time in query evaluation, but it also sequences
through the whole life cycle of an index: index creation/removal;
adding, updating, and removing indexed objects... and my initial
prototype is written all in Smalltalk ... there are several areas where
we can add new primitives to speed things up even more.
You have been making several general observations about index
performance, but unfortunately you haven't provided any examples or test
cases to back up your observations ...
I am not claiming that your observations are wrong, but without an
example or test case I cannot know what the root cause may be and
whether or not the performance issues you are seeing in your particular
test cases will be addressed or not by the new work or even what
additional work that _could_ be done to improve things for you.
You have made some observations that are likely to not be directly
affected by my work, like order of predicate execution.
As you have observed, currently, you can only "hope to reduce the size
of your result set". One of features we are are adding is a counted b
plus tree implementation ... with a counted b plus tree, it should be
possible to calculate the size of a result set without executing the
entire query ... depending upon the exact nature of your queries, it may
be possible to define a query optimizer that utilizes the estimated
result set sizes to determine predicate evaluation order ... otoh,
without knowing the details of your exact use case, we may not implement
the optimizer in such a way that addresses your use case ...
If you are willing, I am sure that we could make an alpha release of 3.4
available to so you could give us better feedback. In fact, I would
really like to get a closer look at your use cases. Ideally I would get
a copy of your development extent so that I can work with exactly the
same sample data and code that you are using ... If you are unable to
share extents, then a set of test cases that exhibit the same behavior
as you are seeing would be much appreciated.
This happens to be the perfect time to be looking at index performance
issues ... I am a couple of weeks from finishing up the initial work on
the new implementation and being able to get direct feedback from you
would help us focus and prioritize any additional work that might be
needed to round out the usability and performance of the new implementation.
Dale
On 1/28/17 9:27 AM, Marten Feldtmann via Glass wrote:
>
> After testing the index support in Gemstone I come (for me) to the
> following point:
>
> a) try to define a query with only one predicate to get a streamable
> result set (and hope to reduce the size of your result set)
>
> b) if you have more than one predicate -> build your own block and
> then execute this block against the result set from (a).
>
> c) Defining more than one index on a set may result in strange
> (slower) query execution time - the query is executed faster, if no
> index is defined at all.
>
> Puuuh, lots of new things to learn. Perhaps it would be nice to ask
> GsQuery NOT to use indices at all - so you have ONE programming
> interface for all cases.
>
> Marten
>
>> Marten Feldtmann via Glass <glass at lists.gemtalksystems.com> hat am
>> 28. Januar 2017 um 18:09 geschrieben:
>>
>> But using streamable result sets pop up another interface problem:
>> they are only allowed to use, if an equalityIndex is defined on that
>> attribute. In the example I create an index only on larger sets
>> (>50000 items) - that means sometimes the index is available, but
>> sometimes not .. so to make it right, you should ask GsQuery (if this
>> is possible at all) if the query can be executed at all and then
>> according to the result, use the streaming interface - otherwise the
>> do: interface.
>>
>>
>> Marten
>>
>>>
>>> First streamable queries assume equality indices and I assume, that
>>> comparing numerical values (with something different than =) can not
>>> benefit from identity indices.
>>>
>>> Marten
>>>
>>>> Mariano Martinez Peck <marianopeck at gmail.com> hat am 28. Januar
>>>> 2017 um 17:25 geschrieben:
>>>> Vía phone.. Quick answer.. Small double are immediate objects so
>>>> always are identical. Can you measure performance with identity index?
>>>>
>>>>
>>
>>> _______________________________________________ Glass mailing list
>>> Glass at lists.gemtalksystems.com
>>> http://lists.gemtalksystems.com/mailman/listinfo/glass
>
>
>> _______________________________________________ Glass mailing list
>> Glass at lists.gemtalksystems.com
>> http://lists.gemtalksystems.com/mailman/listinfo/glass
>
>
> _______________________________________________
> Glass mailing list
> Glass at lists.gemtalksystems.com
> http://lists.gemtalksystems.com/mailman/listinfo/glass
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gemtalksystems.com/mailman/private/glass/attachments/20170128/dd59bc49/attachment.html>
More information about the Glass
mailing list