[Glass] Lots of seaside objects not being GCed (need gemstone advise)

Dale Henrichs via Glass glass at lists.gemtalksystems.com
Mon Jul 13 09:55:46 PDT 2015



On 07/11/2015 02:28 PM, Mariano Martinez Peck wrote:
>
>
> On Tue, Jul 7, 2015 at 3:56 PM, Dale Henrichs 
> <dale.henrichs at gemtalksystems.com 
> <mailto:dale.henrichs at gemtalksystems.com>> wrote:
>
>
>
>     On 07/07/2015 05:49 AM, Mariano Martinez Peck wrote:
>>     Dale,
>>
>>     I have continued analyzing this in other stones and after some
>>     testing it is clear that some sessions (the number would depend on
>>     the system usage) are NOT GCed unless I shut all seaside gems
>>     down or cycle them. Originally I had
>>      GEM_TEMPOBJ_POMGEN_PRUNE_ON_VOTE at 90% and I was cycling
>>     seaside gems once a day as part of GC. Then, I changed it to 100%
>>     and stopped restarting gems. Now...it COULD have happened that I did
>>     not restart all gems after I modified
>>     GEM_TEMPOBJ_POMGEN_PRUNE_ON_VOTE and so, the system was still
>>     running with 90% and yet I was not restarting seaside gems anymore.
>     Yes. The meaning of GEM_TEMPOBJ_POMGEN_PRUNE_ON_VOTE=100 is that
>     all pomgen spaces are dropped ... this does not mean that all
>     references to persistent objects in the vm are dropped ....
>
>
> Indeed. That's why to be 100% sure to drop all references to 
> persistent objects you likely need to recycle seaside gems (even with 
> GEM_TEMPOBJ_POMGEN_PRUNE_ON_VOTE=100)
Right ... the odds of dead object references drop lower when using this 
approach, but to reach 100% drastic measures are needed ... Frankly this 
is why I made the initial comment about it being only 32 sessions ....
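
On the earlier question of whether some gems were still running with 90%: a rough sketch of how to ask a running gem which value it actually has. This assumes #GemPomGenPruneOnVote is the runtime name for GEM_TEMPOBJ_POMGEN_PRUNE_ON_VOTE in your GemStone version (worth checking against the config section of the sysadmin guide), and it has to be evaluated inside the Seaside gem in question, not in a fresh topaz session:

```smalltalk
"Answers the pomgen prune-on-vote percentage this gem is running with.
 Assumption: #GemPomGenPruneOnVote is the runtime parameter name."
System gemConfigurationAt: #GemPomGenPruneOnVote
```

A gem that was started before the config change and never restarted would be expected to still answer the old value here.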
>
>>     That could explain why I hold onto some instances, right? 
>>     Another possibility is the "stale reference" you mention below.
>>     *I continue answering below:*
>>
>>>         Good point. Thanks. I will remember it for next time: each
>>>         time I am dealing with this kind of stuff: cycle all seaside
>>>         gems first!
>>>         Thanks. BTW, my GEM_TEMPOBJ_POMGEN_PRUNE_ON_VOTE is 100% now
>>>         to avoid having to cycle gems.
>>>         I will continue with the tests with cycling/killing the
>>>         gems... but.... continue reading below...
>>         Do you also have the marksweep guy running?
>>
>>
>>     The guy that every 30 minutes performs the "System
>>     _generationScavenge_vmMarkSweep."?  Then yes. Why do you ask? How
>>     could this guy affect it? He does not hold any seaside session as
>>     far as I know...it simply sends "System
>>     _generationScavenge_vmMarkSweep.". Could it be that the #wait
>>     freezes the gem and therefore it does not answer the voting?
>     No, if a gem is busy, the stone patiently waits for the gem to hit
>     a transaction boundary - the vote happens on a transaction boundary.
>
>
> Dale, given this comment, I do not understand the statement in 
> the sys admin guide I pasted below: "*Gems do not vote until they 
> complete their current transaction. If a Gem is sleeping or otherwise 
> engaged in a long transaction, the vote cannot be finalized and 
> garbage collection pauses at this point.*"
I'm not sure how "the stone waits for the gem to hit a transaction 
boundary" is inconsistent with "gems do not vote until they complete 
their current transaction"...
>
>     This is one of the factors that causes reclaimAll to be
>     non-deterministic (our goal is for reclaimAll to be deterministic,
>     but the system _is_ a complex state machine). Gems can be busy
>     doing a long running transaction or a gem can be idle sitting in
>     transaction - like an idle topaz or GemTools - and unless the system
>     triggers an event to cause the gem to wake up, like hitting the
>     commit record limit thresholds, the system patiently waits for the
>     Gem to finish its "work".
>
>
> Ok... so it will wait. Ok, I got that.
Ah, good:)
>
>>
>>     Mmmmm now I read in the sysadmin guide: *"Gems do not vote until
>>     they complete their current transaction. If a Gem is sleeping or
>>     otherwise engaged in a long transaction, the vote cannot be
>>     finalized and garbage collection pauses at this point. Commit
>>     records accumulate, garbage accumulates, and a variety of
>>     problems can ensue."*
>>
>>     Uffff maybe since this guy practically sleeps all the time and
>>     yet does neither a commit nor an abort in each iteration of the
>>     loop...maybe this guy is preventing the vote?
>     Recall the little process in which you installed the vm marksweep
>     code? This particular process is there so that a Seaside gem is
>     guaranteed to have a Smalltalk process ready and available to
>     respond to the SigAbort ... The SigAbort is sent by the stone,
>     when commit records accumulate ...
>>
> Well. Here is where I have the last question. That little process we 
> are talking about does this code:
>
>  [
>    | count minutesToForceGemGC |
>    count := 0.
>    minutesToForceGemGC := 30.
>    [ true ] whileTrue: [
>      "Wake up every 30 seconds, i.e. two wake-ups per minute."
>      (Delay forSeconds: 30) wait.
>      count := count + 1.
>      (count \\ (minutesToForceGemGC * 2)) = 0 ifTrue: [
>        System _generationScavenge_vmMarkSweep.
>        count := 0.
>      ].
>    ].
>  ] forkAt: Processor lowestPriority.
>
> So my question is.... in that code you see I do NOT ever do a commit 
> or abort. So I don't see how this code can reach what you describe as 
> "the vote happens on a transaction boundary". I mean...that code is 
> in a #wait 99.9% of the time, doing no commit nor abort. So...wouldn't 
> that make the voting process wait for it forever?  Or is the SigAbort 
> what would prevent that?

Good question ... Immediately before the code you've shown, you will find 
the following code:

  Exception
   installStaticException:
     [:ex :cat :num :args |
       "Run the abort in a lowPriority process, since we must acquire the
        transactionMutex."
       [
         GRPlatform current transactionMutex
           critical: [
             GRPlatform current doAbortTransaction ].
         System enableSignaledAbortError.
       ] forkAt: Processor lowestPriority.
     ]
   category: GemStoneError
   number: 6009
   subtype: nil.
  System enableSignaledAbortError.


The above code installs a static exception handler for the SigAbort 
exception (error number 6009). The SigAbort is an asynchronous signal 
that is signaled upon notification from the stone. The vm signals the 
SigAbort in the context of the currently active GsProcess. If there are 
no explicit handlers on the stack, the list of static handlers is 
searched. If a static handler is found, the handler is run by the 
currently active GsProcess.... If there are no active processes (i.e., 
all of the processes are blocked on a semaphore or a socket call), then 
the vm waits for the first process to go active ... if no process wakes 
up before the stone hits the STN_GEM_ABORT_TIMEOUT, the stone will 
signal a lost OT effectively killing the session ... Since Seaside gems 
could very well be blocked sitting on an accept() call, the "extra" 
process was created to wake up every 30 seconds (half of the default 
STN_GEM_ABORT_TIMEOUT) to try to guarantee that there will always be an 
active GsProcess available to abort when a Seaside gem is idle and 
waiting for requests ...
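
As a rough sketch of the "half of the timeout" idea, the wake-up interval could be derived from the stone's configured value instead of being hard-coded. This assumes #StnGemAbortTimeout is the runtime name for STN_GEM_ABORT_TIMEOUT and that the value is expressed in minutes; abortTimeoutMinutes and wakeSeconds are names made up for illustration, and this is not the code that ships with Seaside:

```smalltalk
| abortTimeoutMinutes wakeSeconds |
"Assumption: #StnGemAbortTimeout is the runtime parameter name, value in minutes."
abortTimeoutMinutes := System stoneConfigurationAt: #StnGemAbortTimeout.
"Half the timeout, but never less than one second."
wakeSeconds := ((abortTimeoutMinutes * 60) // 2) max: 1.
[
  "The loop body does nothing except wake up, so that some GsProcess
   goes active often enough to run the static SigAbort handler."
  [ true ] whileTrue: [
    (Delay forSeconds: wakeSeconds) wait ]
] forkAt: Processor lowestPriority.
```

If the timeout is 1 minute, this works out to the same 30 second wake-up used in the loop quoted above.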

Dale