[Glass] Lots of seaside objects not being GCed (need gemstone advise)

Mariano Martinez Peck via Glass glass at lists.gemtalksystems.com
Mon Jul 13 11:08:17 PDT 2015


Ok, crystal clear! Thanks, Dale, for the explanation.

On Mon, Jul 13, 2015 at 1:55 PM, Dale Henrichs <
dale.henrichs at gemtalksystems.com> wrote:

>
>
> On 07/11/2015 02:28 PM, Mariano Martinez Peck wrote:
>
>
>
> On Tue, Jul 7, 2015 at 3:56 PM, Dale Henrichs <
> dale.henrichs at gemtalksystems.com> wrote:
>
>>
>>
>> On 07/07/2015 05:49 AM, Mariano Martinez Peck wrote:
>>
>>   Dale,
>>
>>  I have continued analyzing this in other stones, and after some testing
>> it is clear that some sessions (how many depends on system usage) are
>> NOT GCed unless I shut all Seaside gems down or cycle them. Originally
>> I had GEM_TEMPOBJ_POMGEN_PRUNE_ON_VOTE at 90% and I was cycling the
>> Seaside gems once a day as part of GC. Then I changed it to 100% and
>> stopped restarting gems. Now... it COULD be that I did not restart all
>> the gems after I modified GEM_TEMPOBJ_POMGEN_PRUNE_ON_VOTE, so the
>> system was still running with 90% while I was no longer restarting the
>> Seaside gems.
>>
>>  Yes. The meaning of GEM_TEMPOBJ_POMGEN_PRUNE_ON_VOTE=100 is that all
>> pomgen spaces are dropped ... this does not mean that all references to
>> persistent objects in the vm are dropped ....
>>
>
>  Indeed. That's why, to be 100% sure of dropping all references to
> persistent objects, you likely need to recycle the Seaside gems (even with
> GEM_TEMPOBJ_POMGEN_PRUNE_ON_VOTE=100)
>
> Right ... the odds of dead object references drop when using this
> approach, but to reach 100% drastic measures are needed ... Frankly, this is
> why I made the initial comment about it being only 32 sessions ....
>
>
>
>>    That could explain why I hold onto some instances, right?  Another
>> possibility is the "stale reference" you mention below. *I continue
>> answering below:*
>>
>>
>>>     Good point. Thanks. I will remember it for next time: whenever I
>>> am dealing with this kind of stuff, cycle all Seaside gems first!
>>> Thanks. BTW, my GEM_TEMPOBJ_POMGEN_PRUNE_ON_VOTE is at 100% now, to avoid
>>> having to cycle gems.
>>> I will continue the tests with cycling/killing the gems... but...
>>> continue reading below...
>>>
>>>  Do you also have the marksweep guy running?
>>>
>>
>>  The guy that every 30 minutes performs "System
>> _generationScavenge_vmMarkSweep"? Then yes. Why do you ask? How could
>> this guy affect things? He does not hold any Seaside session as far as I
>> know... I simply send "System _generationScavenge_vmMarkSweep". Could it
>> be that the #wait: freezes the gem and therefore it does not answer the
>> vote?
>>
>>  No. If a gem is busy, the stone patiently waits for the gem to hit a
>> transaction boundary - the vote happens on a transaction boundary.
>>
>
>  Dale, given this comment, I do not understand the comment in the
> sysadmin guide I pasted below: "*Gems do not vote until they complete
> their current transaction. If a Gem is sleeping or otherwise engaged in a
> long transaction, the vote cannot be **finalized and garbage collection
> pauses at this point.*"
>
>
> I'm not sure how the "the stone waits for the gem to hit a transaction
> boundary" is inconsistent with "gems do not vote until they complete their
> current transaction"...
>
>>    This is one of the factors that causes reclaimAll to be
>> non-deterministic (our goal is for reclaimAll to be deterministic, but the
>> system _is_ a complex state machine). Gems can be busy doing a long-running
>> transaction, or a gem can be idle sitting in transaction - like an idle
>> topaz or GemTools - and unless the system triggers an event to cause the gem
>> to wake up, like hitting the commit record limit thresholds, the system
>> patiently waits for the Gem to finish its "work".
>>
>
>  Ok... so it will wait. Ok, I got that.
>
> Ah, good:)
>
>
>
>>
>>  Mmmmm, now I read in the sysadmin guide: *"Gems do not vote until they
>> complete their current transaction. If a Gem is sleeping or otherwise
>> engaged in a long transaction, the vote cannot be*
>> *finalized and garbage collection pauses at this point. Commit records
>> accumulate, garbage accumulates, and a variety of problems can ensue."*
>>
>>  Ufff... since this guy practically sleeps all the time, and yet does
>> not commit or abort in each iteration of the loop... maybe this guy is
>> preventing the vote?
>>
>>  Recall the little process in which you installed the vm marksweep code?
>> That particular process is there so that a Seaside gem is guaranteed to
>> have a Smalltalk process ready and available to respond to the SigAbort ...
>> The SigAbort is sent by the stone when commit records accumulate ...
>>
>>
>     Well, here is where I have the last question. That little process we
> are talking about runs this code:
>
>   [ | count minutesToForceGemGC |
>     count := 0.
>     minutesToForceGemGC := 30.
>     "The loop ticks every 30 seconds, so count reaches
>      minutesToForceGemGC * 2 once every minutesToForceGemGC minutes."
>     [ true ] whileTrue: [
>       (Delay forSeconds: 30) wait.
>       count := count + 1.
>       (count \\ (minutesToForceGemGC * 2)) = 0
>         ifTrue: [
>           System _generationScavenge_vmMarkSweep.
>           count := 0 ] ] ]
>     forkAt: Processor lowestPriority.
>
>  So my question is... in that code, you can see I do NOT ever commit or
> abort. So I don't see how this code can reach what you describe as "the
> vote happens on a transaction boundary". I mean, that code spends 99.9% of
> its time in a #wait, doing no commit or abort. So... wouldn't that make the
> voting process wait for it forever? Or is the SigAbort what prevents that?
>
>
> Good question ... Immediately before the code you've shown, you will find
> the following code:
>
>  Exception
>   installStaticException:
>     [:ex :cat :num :args |
>       "Run the abort in a lowPriority process, since we must acquire the
>        transactionMutex."
>       [
>         GRPlatform current transactionMutex
>           critical: [
>             GRPlatform current doAbortTransaction ].
>         System enableSignaledAbortError.
>       ] forkAt: Processor lowestPriority.
>     ]
>   category: GemStoneError
>   number: 6009
>   subtype: nil.
>  System enableSignaledAbortError.
>
>
> The above code installs a static exception handler for the SigAbort
> exception (error number 6009). The SigAbort is an asynchronous signal that
> is signaled upon notification from the stone. The vm signals the SigAbort
> in the context of the currently active GsProcess. If there are no
> explicit handlers on the stack, the list of static handlers is searched. If
> a static handler is found, the handler is run by the currently active
> GsProcess ... If there are no active processes (i.e., all of the processes
> are blocked on a semaphore or a socket call), then the vm waits for the
> first process to go active ... if no process wakes up before the stone hits
> the STN_GEM_ABORT_TIMEOUT, the stone will signal a lost OT, effectively
> killing the session ... Since Seaside gems could very well be blocked
> sitting on an accept() call, the "extra" process was created to wake up
> every 30 seconds (half of the default STN_GEM_ABORT_TIMEOUT) to try to
> guarantee that there will always be an active GsProcess available to abort
> when a Seaside gem is idle and waiting for requests ...
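>
> For reference, the two settings discussed above live in the gem and stone
> config files (a hedged sketch: the parameter names are from this thread,
> but the values shown are illustrative, not recommendations - check your
> installation's defaults):
>
>   # gem config: drop all pomgen spaces when voting
>   GEM_TEMPOBJ_POMGEN_PRUNE_ON_VOTE = 100;
>
>   # stone config: how long the stone waits for a gem to service a
>   # SigAbort before signaling a lost OT and killing the session
>   # (the keep-alive process above wakes at half this interval)
>   STN_GEM_ABORT_TIMEOUT = 1;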
>
> Dale
>



-- 
Mariano
http://marianopeck.wordpress.com

