[Glass] Grrrr cannot migrate (class rename with subclasses and with a name of a deleted class)

Dale Henrichs via Glass glass at lists.gemtalksystems.com
Fri Sep 11 10:03:44 PDT 2015



On 09/09/2015 06:24 AM, Mariano Martinez Peck wrote:
>
> On Tue, Sep 8, 2015 at 7:00 PM, Dale Henrichs 
> <dale.henrichs at gemtalksystems.com 
> <mailto:dale.henrichs at gemtalksystems.com>> wrote:
>
>     Mariano,
>
>     I just talked with engineering and they concur that this is likely
>     to be a malloc failure and that this area of the code has been
>     substantially reworked in recent releases to attempt to reduce the
>     amount of RAM consumed during list instances ...
>
>     So for 3.1.0.6, you might try this operation with more RAM
>     available or perhaps just adding more swap space will allow the
>     malloc to complete ... running statmon with a 1 second interval
>     and looking at the heap consumption of the gem, might show growth
>     and a "sudden decline" when the malloc fails ...
>
>
> Hi Dale,
>
> Just for the record, I tried with this scenario:
>
> [marianopeck at quuveserver1 ~]$ free -m
>               total        used        free      shared  buff/cache   available
> Mem:           8014         388        6850         359         775        7205
> Swap:         16639           0       16639
>
> And still didn't work. Note that I have 7GB of RAM free. At the end, 
> when the system crashed, this was the resulting state:
>
> [marianopeck at quuveserver1 ~]$ free -m
>               total        used        free      shared  buff/cache   available
> Mem:           8014         338        1316         973        6359        6639
> Swap:         16639           0       16639
>
>
> Anyway, no problem, I would assume this is a problem in 3.1.0.6 and 
> hopefully I will never need to list instances / migrate this class 
> until I am in 3.2/3.3...
>

Okay, we've read the code and, to sorta confirm your experience, we _do not_ 
return a nil when the malloc fails ... So we're reading more code, but 
our suspicion now is that you are running out of TOC (temporary object 
cache) and the "normal" failure mechanisms aren't being triggered ... To 
help confirm this suspicion, we think that you can try two independent things:

   1. trigger an in-vm scavenge before making a call and/or
   2. bump up the TOC for that particular vm and see if you can find a
      size that works ...
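
As a side check on the malloc theory, the two `free -m` snapshots quoted above can be diffed to see where the memory actually went. This is only a minimal Python sketch: the helper names (`parse_mem_line`, `mem_delta`) are made up for illustration, and the two `Mem:` lines are copied from the quoted output rather than read from a live system.

```python
# Column order of the "Mem:" line in the `free -m` snapshots quoted above.
MEM_COLUMNS = ["total", "used", "free", "shared", "buff/cache", "available"]


def parse_mem_line(line):
    """Turn a 'Mem: ...' line from `free -m` into a dict of MB values."""
    label, *values = line.split()
    if label != "Mem:":
        raise ValueError("expected a line starting with 'Mem:'")
    return dict(zip(MEM_COLUMNS, map(int, values)))


def mem_delta(before_line, after_line):
    """Per-column change in MB between two 'Mem:' lines (after - before)."""
    before = parse_mem_line(before_line)
    after = parse_mem_line(after_line)
    return {col: after[col] - before[col] for col in MEM_COLUMNS}


# Hard-coded sample data: the two snapshots from the quoted message.
BEFORE = "Mem:  8014  388  6850  359   775  7205"
AFTER = "Mem:  8014  338  1316  973  6359  6639"

delta = mem_delta(BEFORE, AFTER)
for col in MEM_COLUMNS:
    print(f"{col:>10}: {delta[col]:+d} MB")
```

With those numbers, buff/cache grows by 5584 MB, free drops by 5534 MB, and available drops by only 566 MB, while swap use stays at 0 in both snapshots. That would fit the OS not being out of memory when the gem died, though it says nothing by itself about the TOC.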

The journey continues...

Dale

