[Glass] Grrrr cannot migrate (class rename with subclasses and with a name of a deleted class)

Mariano Martinez Peck via Glass glass at lists.gemtalksystems.com
Fri Sep 11 11:31:25 PDT 2015


Hi Dale,

Ok, I increased the SPC to 2GB and set the TOC to 1.8GB. Now the code
update DOES WORK and no longer crashes.
However, the result is again 2 metaclasses / 2 classes for the
same class. So I think we are dealing with 2 problems:

1) The listInstances call was clearly failing because of the TOC
size, as you just found out.
2) The kind of code refactoring I needed does not seem to be performed
correctly by Monticello. The way to solve this was to do the manual
steps that James and Martin recommended at the very beginning of this
thread. That change also avoided migration and so avoided the listInstances
issue too.

So... I think those are the 2 problems and conclusions. I don't think we
should continue investigating more. Thoughts?

Thank you very much for continuing to dig into this, and thanks to the
engineers as well.



On Fri, Sep 11, 2015 at 2:03 PM, Dale Henrichs <
dale.henrichs at gemtalksystems.com> wrote:

>
>
> On 09/09/2015 06:24 AM, Mariano Martinez Peck wrote:
>
>
> On Tue, Sep 8, 2015 at 7:00 PM, Dale Henrichs <
> dale.henrichs at gemtalksystems.com> wrote:
>
>> Mariano,
>>
>> I just talked with engineering and they concur that this is likely to be
>> a malloc failure and that this area of the code has been substantially
>> reworked in recent releases to attempt to reduce the amount of RAM consumed
>> during list instances ...
>>
>> So for 3.1.0.6, you might try this operation with more RAM available or
>> perhaps just adding more swap space will allow the malloc to complete ...
>> running statmon with a 1 second interval and looking at the heap
>> consumption of the gem might show growth and a "sudden decline" when the
>> malloc fails ...
>>
>
> Hi Dale,
>
> Just for the record, I tried with this scenario:
>
> [marianopeck at quuveserver1 ~]$ free -m
>               total        used        free      shared  buff/cache   available
> Mem:           8014         388        6850         359         775        7205
> Swap:         16639           0       16639
>
> And still didn't work. Note that I have 7GB of RAM free. At the end, when
> the system crashed, this was the resulting state:
>
> [marianopeck at quuveserver1 ~]$ free -m
>               total        used        free      shared  buff/cache   available
> Mem:           8014         338        1316         973        6359        6639
> Swap:         16639           0       16639
>
>
> Anyway, no problem, I would assume this is a problem in 3.1.0.6 and
> hopefully I will never need to list instances / migrate this class until I
> am in 3.2/3.3...
>
>
> Okay, we've read the code and, to sorta confirm your experience, we _do not_
> return a nil when the malloc fails ... So we're reading more code, but our
> suspicion now is that you are running out of TOC and the "normal" failure
> mechanisms aren't being triggered ... to help confirm this suspicion we
> think that you can try two independent things:
>
>   1. trigger an in-vm scavenge before making a call and/or
>   2. bump up the TOC for that particular vm and see if you can find a size
> that works ...
>
> The journey continues...
>
> Dale
>
>
>
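
For reference, the two `free -m` snapshots quoted above can be compared
with a few lines of Python. This is only a minimal sketch: the numbers
are copied by hand from the quoted output (in MiB), and the field names
and the `delta` helper are illustrative labels, not part of any GemStone
tool.

```python
# Values (MiB) copied by hand from the "Mem:" rows of the two
# `free -m` snapshots quoted in this thread. Field names are
# illustrative labels chosen for this sketch.
FIELDS = ("total", "used", "free", "shared", "buff_cache", "available")

before = dict(zip(FIELDS, (8014, 388, 6850, 359, 775, 7205)))
after = dict(zip(FIELDS, (8014, 338, 1316, 973, 6359, 6639)))


def delta(first, second):
    """Return the per-field change (second - first) in MiB."""
    return {name: second[name] - first[name] for name in first}


changes = delta(before, after)
for name, change in changes.items():
    print(f"{name:>10}: {change:+d} MiB")
```

Under these numbers "free" drops by about 5.5GB while "buff/cache" grows
by almost the same amount, "used" barely moves, and swap stays untouched
in both snapshots. That seems consistent with the machine not actually
running out of RAM, and with the TOC limit being the more likely cause.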


-- 
Mariano
http://marianopeck.wordpress.com