[Glass] Lots of seaside objects not being GCed (need gemstone advise)
Dale Henrichs via Glass
glass at lists.gemtalksystems.com
Mon Jul 6 14:47:34 PDT 2015
On 07/06/2015 12:28 PM, Mariano Martinez Peck wrote:
>
>
> On Mon, Jul 6, 2015 at 2:33 PM, Dale Henrichs via Glass
> <glass at lists.gemtalksystems.com> wrote:
>
> Mariano,
>
> I've read over your other messages and I guess you are still
> struggling to clean these guys up ... Rest of my comments in line.
>
>
> Hi Dale.
> Thanks, I answer inline.
>
> Dale
>
> On 07/03/2015 08:22 PM, Mariano Martinez Peck via Glass wrote:
>>
>> Then...I check some #allInstances size and I get this:
>>
>> DpWebSession allInstances size 32
>> WACallbackRegistry allInstances size 217
>> JQueryClass allInstances size 16519
>> WACache allInstances size 35
>> WAApplication allInstances size 3
>> WARenderVisitor allInstances size 217
>> WARenderContext allInstances size 217
>> WAHtmlCanvas allInstances size 909
>> .....
>>
> Right off the bat, my observation is that this doesn't seem like
> a lot of uncollected objects, presumably you churn through a lot
> more sessions than this on a regular basis, so these objects
> appear to be the exception instead of the rule...
>
>
> Of course. All those numbers are from a system which didn't receive a
> single request in a whole day. And these are the results after all the
> cleaning I could do. So this is why I expect to have zero instances
> of those (meaning... not zero, but much less in the real system than
> what I have now).
Okay, you didn't have a single request today, so these objects must be
hanging around from a previous day. Did you have zero instances the day
before?
Without any other information, it is possible that these objects got
left behind because of a voting issue (i.e., reference left in the head
of a vm) ... did you cycle all of the gems before running the mfc? What
is your setting for GEM_TEMPOBJ_POMGEN_PRUNE_ON_VOTE? If I'm not
mistaken GEM_TEMPOBJ_POMGEN_PRUNE_ON_VOTE does not guarantee that there
aren't other references in the gem's head to objects ...
These instances did not appear out of thin air and there is a logical
reason for them to be still hanging around ... This is a complicated system
with a number of moving parts and there is no way to rule out bugs
either ...
Without a detailed accounting of the "starting point" and the gems
started and stopped between that point and now, we cannot guess what
might have happened ...
I would suggest that at some point you record the oops of the session
objects so that we don't end up finding that every time we look we are
looking at a different set of sessions ...
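Something along these lines would do for recording the oops. This is an untested sketch: DpWebSession is the session class from your listing, and #SuspectSessionOops is just a name I made up for the UserGlobals key. The oops are stored as plain integers, so the Array itself does not keep the sessions alive:

```smalltalk
| oops |
"Refresh the view of the repository before looking at instances."
System abortTransaction.
"Collect the oop (an integer) of each suspect session."
oops := DpWebSession allInstances collect: [ :each | each asOop ].
"#SuspectSessionOops is a hypothetical key, chosen for illustration."
UserGlobals at: #SuspectSessionOops put: oops.
System commitTransaction.
```

On a later day you can compare the oops of the current instances against the saved Array to see whether you are still looking at the same sessions.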
>
>
>> (just as some examples).
>>
>> The good news is that ALL the sessions do look expired:
>>
>> (DpWebSession allInstances select: [ :each | (each instVarNamed:
>> 'parent') isNil ]) size 32
>>
>> (expired sessions have a nil 'parent').
> One of the interesting things that came out of Larry's "ordeal" is
> that we found a bug in WACache>>gemstoneReap, where an error while
> running this method can result in objects getting stuck in the
> WACache. Basically objects are marked as expired in the
> WARcLastAccessExpiryPolicy, but due to the error, they may not be
> removed from the objectsByKey and keysByObject dictionaries ...
> thus keeping them alive "forever".
>
> If you check your maintenance vm logs, you might find an error
> with WACache>>gemstoneReap (Almost Out of Memory is how we found
> the bug in Larry's case).
>
>
>
> I grepped but found no error in my maintenance logs.
>
>
> Since you have so few sessions, we can test whether the object
> leak is due to this bug:
>
>     | sessions |
>     System abortTransaction.
>     "expired sessions have a nil parent"
>     sessions := WASession allInstances
>         select: [ :each | (each instVarNamed: 'parent') isNil ].
>     System abortTransaction.
>     WAApplication allInstances
>         do: [ :app |
>             | cache keysByObject |
>             cache := app cache.
>             keysByObject := cache instVarNamed: 'keysByObject'.
>             sessions
>                 do: [ :session |
>                     "halt if an expired session is still registered in the cache"
>                     (keysByObject includesKey: session)
>                         ifTrue: [ self halt ] ] ]
>
> If you get a halt running the above, then you've been bitten by
> the bug and you need to arrange to remove the session objects
> from both dicts. See WACache>>gemstoneReap for example code ...
>
>
> I did not get a Halt in above code.
Did you replace `WASession allInstances` with DpWebSession?
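The reason I ask is that, if I'm not mistaken, allInstances only answers instances of the receiver itself and not of its subclasses, so the original expression may have found no sessions at all. A variant of the check that starts from DpWebSession and collects oops instead of halting might look like the sketch below. It is untested, and `stuck` is just an illustrative variable name:

```smalltalk
| sessions stuck |
System abortTransaction.
"Expired sessions have a nil parent."
sessions := DpWebSession allInstances
    select: [ :each | (each instVarNamed: 'parent') isNil ].
stuck := OrderedCollection new.
WAApplication allInstances
    do: [ :app |
        | keysByObject |
        keysByObject := app cache instVarNamed: 'keysByObject'.
        sessions
            do: [ :session |
                "Remember each expired session that is still registered in the cache."
                (keysByObject includesKey: session)
                    ifTrue: [ stuck add: session asOop ] ] ].
stuck
```

An empty result would suggest that the sessions are not stuck in a WACache; a non-empty one gives you the oops of the sessions to clean up.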
>
>
> If the WASessions are not stuck in a WAApplication, then it's
> likely that you have some accidental reference to the WASession
> objects and you'll have to trace the reference path back to a
> persistent root using Repository>>findReferencePathToObject: ..
> this method only returns one reference path... In 3.2 we've
> created Repository>>findAllReferencePathsToObject: that finds and
> returns all of the reference paths (in a pinch you could upgrade
> your repository to 3.2.6 just to run the analysis) ...
>
> [1] https://github.com/GsDevKit/Seaside31/issues/68
>
>
>
> Yes, in fact, earlier today I tried #findReferencePathToObject: with
> (MySessionSubclass allInstances any) and guess what????
> I get an array of only 2 entries: the first element is the target object
> and the second element is false. The method comment says this means there
> is no path to that object. WTF!!!! So then why do they not go away??? As
> I said, I do run MFC, I do run #reclaimAll... so..... *in which scenario
> would I hold onto instances (in fact found via #allInstances), yet
> #findReferencePathToObject: would say there is no path?*
If I'm not mistaken, #findReferencePathToObject: scans for references in
the repository, but does not take into account instances in a vm's
memory ...
At this point I don't know whether these objects are staying alive
because of persistent references or because they are in a vm's head and
being voted down ...
>
>
>>
>> However...I cannot explain why I still have all that garbage
>> above if all sessions are expired. Is that normal? I would expect
>> to have nothing.
> It's not normal:)
>
>
> That's cool to hear. So...even if that is a small number of objects,
> this gives me a small scenario of the real system. If this stone has
> not received a single request in hours, then I should get ZERO
> instances of those :) Cool.
>
To know with certainty whether or not an object is considered truly
dead, you can look at System class>>_deadNotReclaimed and see if the
oops of the suspect sessions are in it or not (see the comment in the
method for conditions of use). Barring any nasty bugs they are likely to
have been voted down ...
If you set STN_TRAN_LOG_DEBUG_LEVEL=3 in your system.conf and restart
your stone ... it is possible to find the list of objects in the
possible dead set, the list of objects voted down (and the session id
that voted them down) and the original list of deadNotReclaimed ...
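For reference, the entry in system.conf would look roughly like this. It uses the usual `NAME = value;` syntax of the GemStone config files, and the comment line is mine:

```
# Log possible-dead, voted-down and deadNotReclaimed details (takes effect on stone restart).
STN_TRAN_LOG_DEBUG_LEVEL = 3;
```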
Of course if you restart your stone then the heads of the various gems
will be cleared and it is likely that the objects will go away on the
next mfc ... Note that in 3.1.0.6, it is possible that the gem doing the
mfc is hanging onto some objects in its head, so unless you log out
after the mfc, that might be the reason for voting guys down ...
Dale