[Glass] Large collection and common practice
Smalltalk via Glass
glass at lists.gemtalksystems.com
Tue Jan 3 15:48:55 PST 2017
Paul,
Thank you for all the information; it is very useful. This
application runs with Seaside, not GBS with its replicates.
Time will tell how I will approach this problem, but the information is
very useful. And with GemStone there are plenty of solutions :)
For now I will keep it simple; maybe in the future I will try something
like your solution!
regards
bruno
On 03/01/2017 at 20:33, Paul Baumann wrote:
> Hi Bruno,
>
> Why you might not want to keep objects forever in an RC collection:
>
> RC collections have more overhead. Many are implemented with
> session-specific sub-structures that a session can modify with
> little risk of conflict (the rare conflict is when the RC
> collection itself changes). Consider an RC collection that was
> populated from 100 separate sessions, and the work it must do
> to check whether an object/key exists in it.
> RC collections are well implemented and reasonably
> efficient; they just aren't as efficient at some operations (like
> lookup) as some non-RC collections (otherwise all collections would be
> implemented with RC behavior). You'll find that RC collections sometimes
> have unexpected growth and shrinkage behavior. Some grow large
> session-specific subcollections that may never be cleaned up unless
> there is at least one removal. Some grow at inopportune moments, which
> can affect time-sensitive operations. I'm not saying that you
> shouldn't use RC collections for root collections; it depends on
> your application's needs.
>
> Regarding indexing for many attributes:
>
> It sounds like you want to create indexes on a common collection like
> OrderedCollection that only one session is in charge of updating. I
> know that GemTalk improved their indexing implementation several
> years back, but some practical issues likely remain. A
> field index updates an underlying structure that might also be
> updated by other sessions changing other objects, so updates to
> indexes used to cause many commit conflicts. The more indexes a
> collection has, the higher the odds of a commit conflict. Applications
> that I've worked on for the past decade or so didn't use collection
> indexes the way you are about to. An application that used a lot of
> indexes also had some custom code to save and replay changes to domain
> objects to compensate for unpredictable commit failures. It is from
> experiences like that that the queue-manager approach became useful
> despite all the cross-session coordination.
>
> I'd probably implement a query kind of object that wraps that
> collection to support collection-specific queries and maintenance
> operations. The OC (or whatever you use) would normally be private to
> the query object. The query object could even have special behavior
> for avoiding commit conflicts (like locking or queueing, for example).
> The query object might even be clever enough to use a
> private/internal RC queue when your application code detects that a
> conflict is possible (like from use of locks). The query object would
> manage the internal RC collection as practical.
>
> You might think of making that query object a subclass of Collection,
> but any GBS users out there should beware that there would be
> replication bugs (I reported the bug with workaround code to GemTalk
> many years ago). I doubt you'd be doing replication of something like
> this even if you used GBS; I'm just saying there is a bit of
> strangeness to be discovered at the basic/private/primitive levels, and
> unfortunately it means that caution applies to user-defined subclasses
> of Collection.
>
> I'm not suggesting you do this, but it is an option. Back when
> indexing was not reliable, I once resorted to creating my own
> application-specific indexes. The query object that I just mentioned
> could also have private dictionary instances that can quickly resolve
> specific keys (attributes of the objects). The query object has the
> overhead of also maintaining the private attribute-key dictionaries as
> objects are added and removed. I could go into how I implemented these
> application-defined indexes even without the query object wrapping them,
> but there is no need because you have good GemTalk-supported indexes now anyway.
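
Editor's sketch of the query-object idea described above. It is written in Python rather than GemStone Smalltalk, and every name in it (FormInstanceQuery, add, remove, at, select) is hypothetical; nothing here is a GemStone API or the code Paul actually wrote. It only shows a wrapper that keeps its primary collection private and maintains one attribute-key dictionary per indexed attribute as objects are added and removed.

```python
from collections import defaultdict


class FormInstanceQuery:
    """Hypothetical query object: a private collection plus hand-rolled indexes."""

    def __init__(self, indexed_attributes):
        # Private primary collection, keyed by the immutable id.
        self._by_id = {}
        # One private dictionary per indexed attribute: value -> set of ids.
        self._indexes = {name: defaultdict(set) for name in indexed_attributes}

    def add(self, obj):
        """Store the object and update every attribute dictionary."""
        self._by_id[obj["id"]] = obj
        for name, index in self._indexes.items():
            index[obj.get(name)].add(obj["id"])

    def remove(self, obj_id):
        """Remove the object and drop its entries from every attribute dictionary."""
        obj = self._by_id.pop(obj_id)
        for name, index in self._indexes.items():
            value = obj.get(name)
            bucket = index[value]
            bucket.discard(obj_id)
            if not bucket:
                del index[value]  # do not keep empty buckets around
        return obj

    def at(self, obj_id):
        """Direct lookup by id, as with a plain dictionary."""
        return self._by_id[obj_id]

    def select(self, name, value):
        """Return the objects whose indexed attribute equals value, ordered by id."""
        ids = self._indexes[name].get(value, ())
        return [self._by_id[i] for i in sorted(ids)]
```

The cost Paul mentions is visible in add and remove: each one touches every attribute dictionary, which is the price paid for fast select calls.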
>
> I've presented ideas more complicated than you'll need; hopefully an
> awareness of potential issues and past remedies will save you some effort.
>
> Regards,
>
> Paul Baumann
>
>
> On Tue, Jan 3, 2017 at 5:04 PM, Smalltalk <smalltalk at adinet.com.uy> wrote:
>
> Paul,
>
> Thanks for your answer ...
>
> /* you don't always want to keep the objects in the RC collection
>
> Why don't you always want to keep the objects in the RC collection?
> This is what I'm doing right now :( - RcKeyValueDictionary
>
> Thanks for the technique you are explaining.
>
> For now I will keep it as simple as I can :) Maybe in the future
> (next year) I can do something like that, but I need to do much
> more research :)
>
> /* I wonder what kind of indexing you would need besides ID. If
> you don't need to query for anything other than ID then a
> dictionary would be fine with the ID as key.
>
> This project/system implements a persistence layer (using REST
> services) for a Java application (www.orbeon.com) which is used to
> design, publish, save and query web forms
> (https://github.com/brunobuzzi/OrbeonPersistenceLayer/).
>
> When designing/publishing/sending/saving a form, the ID is mostly
> used.
> Then you have the Summary page, which displays all form instances
> (saved and sent forms) of some form definition.
> Here the user can search by a particular field of the defined form.
> A search can be by N different fields depending on the form
> definition (the definition could be a form with 200 nested fields
> and sections or whatever).
>
> In this case indexes are very useful, but in the previous cases a
> Dictionary keyed by the id (which is immutable once assigned) is
> more suitable.
>
> regards,
> bruno
>
>
> On 03/01/2017 at 17:38, Paul Baumann wrote:
>> Hi Bruno,
>>
>> Multiple sessions can feed an RC collection with reduced commit
>> conflicts, but you don't always want to keep the objects in the
>> RC collection. One common technique is to have a manager session
>> dedicated to moving objects from RC collections into collections
>> that can be accessed more efficiently. Design so that the manager
>> is the only session that will be updating the collections (so
>> that commit conflicts will not happen). The manager session can
>> do polling for new items and you can add gem-to-gem signaling to
>> wake the manager for more timely responses. The challenges with
>> this kind of design are related to update timing between
>> sessions. The process involves a commit to add to the RC
>> collection, an abort for the manager session to see the objects,
>> a commit by the manager to update the root collection (with
>> removal from the RC collection; RcQueues are usually used, BTW), and
>> an abort by the original session if it needs to see that the indexed
>> item was added to the root collection. If there is timing sensitivity
>> with this data then you'll likely resort to searching first in your
>> indexed collection and then also reviewing objects still in the
>> queue waiting for the manager to process them.
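
Editor's sketch of the manager-session flow described above, in plain Python with no real transactions. The Repository class and its method names are hypothetical, a deque stands in for the RC queue, and the commit/abort steps exist only as comments. It illustrates the last point of the paragraph: a lookup that searches the root collection first and then the items still waiting in the queue.

```python
from collections import deque


class Repository:
    """Hypothetical stand-in for an RC queue plus a manager-owned root collection."""

    def __init__(self):
        self.pending = deque()  # stand-in for the RC queue fed by many sessions
        self.by_id = {}         # root collection, updated only by the manager

    def submit(self, obj):
        """Any session: add to the queue (a real session would then commit)."""
        self.pending.append(obj)

    def manager_drain(self):
        """Manager only: after an abort to see new items, move them and commit."""
        moved = 0
        while self.pending:
            obj = self.pending.popleft()
            self.by_id[obj["id"]] = obj
            moved += 1
        return moved

    def find(self, obj_id):
        """Search the root collection first, then items still in the queue."""
        found = self.by_id.get(obj_id)
        if found is not None:
            return found
        for obj in self.pending:
            if obj["id"] == obj_id:
                return obj
        return None
```

Because only manager_drain writes to by_id, that collection has a single writer, which is what keeps commit conflicts away from it in the design Paul describes.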
>>
>> A variation of the manager session technique is to send data to
>> the manager session without doing a commit. This might be through
>> communication between gems or by using session-specific file
>> updates that the manager gem reads. Gem-to-gem signaling can be
>> added to this approach later too if you need to improve timing.
>> This variation can avoid the intermediate commit, but you
>> may still need to #continueTransaction to see what the manager session
>> updated.
>>
>> I wonder what kind of indexing you would need besides ID. If you
>> don't need to query for anything other than ID then a dictionary
>> would be fine with the ID as key. A dictionary can even use a key
>> that is a custom object that redefines equality and hash from
>> attributes of what is searched for. Merkle tree hashes might also
>> be used as a way to test if some attribute is contained, but that
>> is a bit advanced to go into. Another advanced item that I once
>> implemented was a custom Dictionary where the key was derived
>> from the value by behavior (it was more efficient because it
>> avoided the cost of Association creation). So many cool tricks, I
>> loved working with GS/S.
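
Editor's sketch of the custom-key idea above: a dictionary key that redefines equality and hash from the attributes being searched for. It is Python, not GemStone Smalltalk, and FormKey with its two attributes is a hypothetical example rather than anything from Bruno's or Paul's code.

```python
class FormKey:
    """Hypothetical key whose equality and hash come from selected attributes."""

    def __init__(self, form_name, username):
        self.form_name = form_name
        self.username = username

    def _parts(self):
        # The attributes that define the key's identity.
        return (self.form_name, self.username)

    def __eq__(self, other):
        return isinstance(other, FormKey) and self._parts() == other._parts()

    def __hash__(self):
        # Must agree with __eq__: equal keys produce equal hashes.
        return hash(self._parts())


# A freshly built key with the same attributes finds the stored value.
lookup = {FormKey("survey", "bruno"): "instance-1"}
probe = FormKey("survey", "bruno")
```

Here lookup[probe] resolves to the stored entry even though probe is a different object from the key used at insertion time.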
>>
>> Paul Baumann
>>
>>
>>
>> On Tue, Jan 3, 2017 at 2:39 PM, Mariano Martinez Peck via Glass
>> <glass at lists.gemtalksystems.com> wrote:
>>
>>
>> On Tue, Jan 3, 2017 at 3:14 PM, BrunoBB via Glass
>> <glass at lists.gemtalksystems.com> wrote:
>>
>> Hi All,
>>
>> I have a lot of RcKeyValueDictionary instances where the key is the id
>> of the object and the value is the object itself.
>> Once assigned, this id does NOT change, so far so good :)
>>
>> The RcKeyValueDictionary is used intensively to add and
>> remove objects (in my case OrbeonFormInstance). The dictionary is very
>> useful because the key is always given as a parameter.
>>
>> There are also searches by specific inst vars of the
>> OrbeonFormInstance class (like username, group, createdTime and so on).
>>
>> My problem is that I can NOT create an index on
>> aRcKeyValueDictionary.
>> So what is the common practice in these cases:
>> 1- Change the RcKeyValueDictionary to be an
>> UnorderedCollection?
>> 2- Add a new instance variable to the class that holds the
>> RcKeyValueDictionary, with this new variable being
>> anUnorderedCollection?
>>
>> 1) This will complicate my direct searches using the ID.
>> 2) Extra computation when adding and removing objects
>> (now there are 2 collections to maintain).
>>
>> The general question would be something like:
>> When Dictionaries are very suitable for storing a large
>> quantity of objects but indexes are also needed, which solution
>> should be implemented?
>>
>>
>>
>> Assuming you do need or get benefits from the RC flavor (else
>> it brings unnecessary overhead), then from a quick analysis of the
>> situation (until GemStone has an indexed, RC-flavor
>> Dictionary implementation), I think I would use an RcIdentityBag. I
>> would create an identity index for #id, and yes, you will
>> have to modify your code that accesses the dict to now detect
>> on the collection using the identity index on ID.
>>
>> But... I am sure someone more experienced will come up with a
>> better approach!
>>
>> Cheers,
>>
>>
>>
>> regards
>> bruno
>>
>>
>>
>> --
>> View this message in context:
>> http://forum.world.st/Large-collection-and-common-practice-tp4928607.html
>> Sent from the GLASS mailing list archive at Nabble.com.
>> _______________________________________________
>> Glass mailing list
>> Glass at lists.gemtalksystems.com
>> http://lists.gemtalksystems.com/mailman/listinfo/glass
>>
>>
>>
>>
>> --
>> Mariano
>> http://marianopeck.wordpress.com
>>
>>
>>
>
>
>