[Glass] Explanation to "too many failed pom_gen scavenges" in this context??

Mariano Martinez Peck via Glass glass at lists.gemtalksystems.com
Tue Mar 8 18:26:02 PST 2016


On Tue, Mar 8, 2016 at 6:38 PM, Dale Henrichs <
dale.henrichs at gemtalksystems.com> wrote:

> Okay .. given your stack, OrderedCollection>>at: is probably not going to
> be causing you to run out of memory in and of itself. ...
>
>
Exactly, that's what I thought too.


> The error `too many failed pom_gen scavenges` is indicating that you are
> running into trouble in POM gen and by extension OLD gen ...
>
>
OK.


> From your memory dump, you've got 250K OrderedCollection instances taking
> up 500MB and another 1M Strings taking up 300MB ...
>
>
BTW, I think strings are holding 30MB, not 300MB.


> Objects are held in POM gen and OLD gen if they've been modified or if
> they are directly  referenced from temp objects or the stack and the error
> specifically means that you've filled both OLD Gen and POM gen
>

*Hold on!!!! *I have a question here. I do something like this (pseudo code)

myBootstrapMethod
| storage |
storage := FaUserContextInformation current outerSpace cache at:
'ZacksQuarterlyLoadingCache' put: Dictionary new.

self parsedData do: [:each |

       storage at: aKey put: self processInfoFrom: each.
]...

As you can see, I have a temp object (part of the stack) that directly
refers to a persistent object (a dictionary).
I thought that under memory pressure, the objects referenced from this
dictionary, would go to disk. But now I am not sure because of what you
said.

So maybe avoid the temp var "storage" and instead getting it from
"FaUserContextInformation current outerSpace cache at:
'ZacksQuarterlyLoadingCache' "   would allow what I was thinking?  mmmmm




> ... it is interesting that you are able to reference so many strings and
> ordered collections from the stack and/or temp objects ...
>
>
This is a bootstrap a huge text file of 600MB :)  Of course, I am using
temporary persistent dictionaries (similar to SIXX approeach when dealing
with large graphs).


> First we want to know what the setting for GEM_TEMPOBJ_AGGRESSIVE_STUBBING
> (perhaps you could send us the banner that gets dumped when topaz starts
> up, so we can get more info about what's happening). If
> GEM_TEMPOBJ_AGGRESSIVE_STUBBING  is FALSE, then that could be the source of
> your problem the default for GEM_TEMPOBJ_AGGRESSIVE_STUBBING is TRUE ...
> but seeing the banner will help us nonetheless...
>

I didn't change that. I checked and it's TRUE.

BTW, please find attached the whole topaz script. Note that in this case, I
tried with the default POM gem size


>
> We took a 3.2.9 topaz and started it with a 700MB temp obj cache and this
> memory space info was printed out by topaz:
>
> (vmGc spaceSizes: eden init 2048K max 134400K , survivor init 384K max
> 22400K,
>  vmGc    old max 537600K, code max 143360K, perm max 71680K, pom 10 *
> 59776K = 597760K,
>  vmGc    remSet 13828K, meSpace max 685324K oopMapSize 4194304  max
> footprint 2229M)
>
> Giving you a pom gen size of ~600M and old gem of ~500M
>
>
Indeed, similar here:

(vmGc spaceSizes: eden init 2048K max 131264K , survivor init 448K max
21888K,
 vmGc    old max 524928K, code max 140032K, perm max 70016K, pom 10 *
58368K = 583680K,
 vmGc    remSet 13252K, meSpace max 669196K oopMapSize 4194304  max
footprint 2178M)

BTW, thanks for pointing out this information, I wasn't aware of it :)



> In order to get additional information we want you to reduce the size of
> POM gen to 300M (which will in turn cause OLD gen to be increased in size)
> with the following settings in your system.conf:
>
> GEM_TEMPOBJ_AGGRESSIVE_STUBBING = TRUE;
> GEM_TEMPOBJ_POMGEN_SIZE = 300MB;
> GEM_TEMPOBJ_CACHE_SIZE = 700MB;
>
> This may shift the problem to a different point but at least we will learn
> more info ... especially getting the startup banner from topaz...
>
>
Thanks. I tried but it fails in the same place.




> Dale
>
>
> On 03/08/2016 10:57 AM, Mariano Martinez Peck wrote:
>
>
>
> On Tue, Mar 8, 2016 at 3:13 PM, Dale Henrichs <
> dale.henrichs at gemtalksystems.com> wrote:
>
>>
>>
>> On 03/08/2016 09:50 AM, Mariano Martinez Peck wrote:
>>
>> Hi guys,
>>
>> First of all, thanks to Richard for the suggestion. Adding the env var
>> did write the useful info in the gem log.
>>
>> Problem is that even if I tried to fix it (in my code), I still failed :(
>>    I attach the whole log in case someone wants to take a look.
>>
>> However, I think there is something strange in
>> #installAlmostOutOfMemoryStaticHandler:. I added some logging to this
>> method to see what was going on (in bold the important parts):
>>
>>
>> *installAlmostOutOfMemoryStaticHandler: initialCommitThreshold*
>> *  | handler commitThreshold used |*
>> *  commitThreshold := initialCommitThreshold.*
>> *  handler := AlmostOutOfMemory*
>> *    addDefaultHandler: [ :ex | *
>> *      "Exception caught, do a commit."*
>> *      Transcript*
>> *        show: 'exception handled: ' , ex printString;*
>> *        cr.*
>> *      System commitTransaction*
>> *        ifFalse: [ *
>> *          Transcript*
>> *            show: 'AutoCommit failed';*
>> *            cr.*
>> *          nil error: 'AutoCommit failed' ].*
>> *      System _vmMarkSweep.*
>> *      Transcript*
>> *        show: 'commit threshold value: ' , commitThreshold printString;*
>> *        cr.*
>> *      used := System _tempObjSpacePercentUsedLastMark.*
>> *      Transcript*
>> *        show: 'percent used ' , used printString;*
>> *        cr.*
>> *      used < commitThreshold*
>> *        ifTrue: [ *
>> *          "We dropped below the threshold, reenable the signal"*
>> *          Transcript*
>> *            show: 're-enable the signal ';*
>> *            cr.*
>> *          System enableAlmostOutOfMemoryError ]*
>> *        ifFalse: [ *
>> *          "commit and mark sweep did drop us below threshold. cut
>> threshold in half *
>> *           and try again. Record an entry in the object log "*
>> *          commitThreshold := (100 - commitThreshold) // 2 +
>> commitThreshold.*
>> *          Transcript*
>> *            show: 'commit threshold value: ' , commitThreshold
>> printString;*
>> *            cr.*
>> *          commitThreshold < 98*
>> *            ifTrue: [ *
>> *              Transcript*
>> *                show: 'signalAlmostOutOfMemoryThreshold: ';*
>> *                cr.*
>> *              System signalAlmostOutOfMemoryThreshold: commitThreshold ]*
>> *            ifFalse: [ *
>> *              "we're with 10% of out of memory, commit instance counts
>> structure *
>> *               into objectlog for posterity."*
>> *              Transcript*
>> *                show: '10% out of memory... record in the object log ';*
>> *                cr.*
>> *              (ObjectLogEntry*
>> *                trace: 'almost out of memory: ' , commitThreshold
>> printString*
>> *                object: (System _vmInstanceCountsReport: 3)) addToLog.*
>> *              System commitTransaction*
>> *                ifFalse: [ *
>> *                  Transcript*
>> *                    show: 'Last AutoCommit failed ';*
>> *                    cr.*
>> *                  nil error: 'AutoCommit failed' ] ] ].*
>> *      ex resume ].*
>> *  SessionTemps current*
>> *    at: #'CommitOnAlmostOutOfMemoryStaticException'*
>> *    put: handler.*
>> *  System signalAlmostOutOfMemoryThreshold: commitThreshold*
>>
>>
>>
>> Now...if I check the gem log, I get this:
>>
>> *--transcript--'exception handled: a AlmostOutOfMemory occurred
>> (notification 6013), Session''s temporary object memory is almost full'*
>> *--transcript--'commit threshold value: 10'*
>> *--transcript--'percent used 0'*
>> *--transcript--'re-enable the signal '*
>>
>> And that 4 lines repeated several times, until it starts with the:
>>
>>
>> ************ Fatal error 4067 OutOfMemory ******
>>
>> *Printing Smalltalk stack because of OutOfMemory:*
>> *Smalltalk stack: printing to topaz .out file at [03/08/2016 12:36:11.827
>> EST]*
>>
>>
>>
>> And this is what I don't understand...from the transcript, it seems that "*System
>> _tempObjSpacePercentUsedLastMark" *always answered 0, hence we are below
>> threshold (no problem!). But then I get the OutOfMemory anyway...  so...
>> how can I run out of memory if #_tempObjSpacePercentUsedLastMark answered 0
>> each time?
>>
>>
>> Okay the `--transcript--'re-enable the signal '` simply means that the
>> AlmostOfMemory signal was re-enabled - the handler is armed - it does not
>> guarantee that you won't run out of memory.
>>
>> The `*********** Fatal error 4067 OutOfMemory *****` means that you ran
>> out of memory without triggering an AlmostOutOfMemory exception ... for the
>> reasons I mentioned earlier: most likely you ran out of memory during a
>> primitive call BEFORE the deferred exception could be signaled ..
>>
>>
> OK, I got it, and that would explain it. OK.
>
>
>> BTW, I tried looking at the selector in my 3.2.9 and didn't see symbols,
>> so I don't know what methods are actually being called  in the first couple
>> of frames, so I can't guess about what might be consuming memory ...
>>
>>
> OK.. I did some analysis. The most complicated part that I didn't know how
> to debug was the "blocks" entries in the stack. Until some time I figured
> out I could use:
>
> (Object _objectForOop: XXX) _sourceStringForBlock
>
> where XXX is the OOP of the GsNMethod printed at the END of the line, not
> the homeMethod printing at the beginning.
>
>
> This is the resulting, translated analysis. Please, start reading from the
> bottom to the top.
>
>
> Thanks!!
>
>
>
> 1  Object >>   #'at:'  (receiver is OrderedCollection)
>
>
> 2. The closure
>
> '[ :coDataRowDict | coDataRowDict row at: index ]'
>
>
> 3.  SequenceableCollection >>   #'collect:'
>
>
> 4. Looks to me the closure of #mapSelectedRows: inputListOfRowDictionaries
> toQuuveUsing: quuveKeysForZacksTableAccessor prependKeysWith: aPrefix ,
> that is:
>
>
> [ :rowDict |
> | key tmp values dataType |
> ...
> values := aListOfRowDictionaries collect: [ :coDataRowDict | coDataRowDict
> row at: index ].
> ....
> ]
>
>
> 5. The closure of #doWithRowDictionaries:  that is:
>
> [ :row | | t|
> t := self rowAsRowDictionary: row. "now the user can use symbolic
> references for testing in the aTestBlock"
> (aBlock value: t)
> ].
>
>
> 6.  Collection  >>  do:
>
>
> 7. FaTableAccessor >> doWithRowDictionaries: aBlock
>   self table
>     do: [ :row |
>       | t |
>       t := self rowAsRowDictionary: row. "now the user can use symbolic
> references for testing in the aTestBlock"
>       aBlock value: t ]
>
>
>
> 8  FaQuandlZacksFundamentalsDataLoader >> mapSelectedRows:
> inputListOfRowDictionaries toQuuveUsing: quuveKeysForZacksTableAccessor
> prependKeysWith: aPrefix
>
> "some code here"
> ....
> quuveKeysForZacksTableAccessor doWithRowDictionaries: [ :rowDict |
> | key tmp values dataType |
> ...
> values := aListOfRowDictionaries collect: [ :coDataRowDict | coDataRowDict
> row at: index ].
> ....
> ]
>
>
>
>
>
>
>
> --
> Mariano
> http://marianopeck.wordpress.com
>
>
>


-- 
Mariano
http://marianopeck.wordpress.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gemtalksystems.com/mailman/private/glass/attachments/20160308/94ddee90/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: wholeTopazOutOfMemory.log
Type: application/octet-stream
Size: 132658 bytes
Desc: not available
URL: <http://lists.gemtalksystems.com/mailman/private/glass/attachments/20160308/94ddee90/attachment-0001.obj>


More information about the Glass mailing list