[Glass] Grrrr cannot migrate (class rename with subclasses and with a name of a deleted class)
Mariano Martinez Peck via Glass
glass at lists.gemtalksystems.com
Fri Sep 11 12:10:16 PDT 2015
On Fri, Sep 11, 2015 at 4:06 PM, Dale Henrichs <
dale.henrichs at gemtalksystems.com> wrote:
> Okay ... now that the bug is characterized we'll be able to determine if
> it exists in older versions or not ... the code in this area has been
> reworked for 3.2+ ...
>
>
Indeed.
> Which brings us to the second problem ... since I am entering the bug
> sweep, it will be worth creating a test case to produce the "2 metaclasses
> / 2 classes for the same class" and I plan to do that (if I can) and then
> see if there is a reasonable resolution (not sure:) ...
>
>
Yes! I will see if I can reproduce that too today. Basically, I had this:
Object
- *FaSecurityClosingPriceRecord* (no instances)
- SpecialSuperclass
- - *FaSecurityClosingPriceRecord2* (many instances)
- - - FSCPR2a (instances)
- - - FSCPR2b (instances)
and then I committed a monticello change with this:
Object
- SpecialSuperclass
- - *FaSecurityClosingPriceRecord* (many instances; note there is no 2 at the end)
- - - FSCPR2a (instances)
- - - FSCPR2b (instances)
I will see if I can reproduce it too using dummy classes.
> Cheers,
> Dale
>
>
> On 09/11/2015 11:31 AM, Mariano Martinez Peck wrote:
>
> Hi Dale,
>
> Ok, I increased the SPC to 2GB and set the TOC to 1.8GB. Now the code
> update DOES WORK and does not crash anymore.
> However, the resulting stuff is again the 2 metaclasses / 2 classes for
> the same class. So I think we are dealing with 2 problems:
>
> 1) One was that the listInstances thingy was clearly failing because of
> TOC size. As you just found out.
> 2) The kind of refactoring I needed does not seem to be performed correctly
> by Monticello. The way to solve this was to do the manual steps that James
> and Martin recommended at the very beginning of this thread. That change
> also avoided the migration and so avoided the listInstances issue too.
>
> So... I think those are the 2 problems and conclusions. I don't think we
> should continue investigating more. Thoughts?
>
> Thank you very much for continuing to dig into this, and thanks to the
> engineers as well.
>
>
>
> On Fri, Sep 11, 2015 at 2:03 PM, Dale Henrichs <
> dale.henrichs at gemtalksystems.com> wrote:
>
>>
>>
>> On 09/09/2015 06:24 AM, Mariano Martinez Peck wrote:
>>
>>
>> On Tue, Sep 8, 2015 at 7:00 PM, Dale Henrichs <
>> dale.henrichs at gemtalksystems.com> wrote:
>>
>>> Mariano,
>>>
>>> I just talked with engineering and they concur that this is likely to be
>>> a malloc failure and that this area of the code has been substantially
>>> reworked in recent releases to attempt to reduce the amount of RAM consumed
>>> during list instances ...
>>>
>>> So for 3.1.0.6, you might try this operation with more RAM available or
>>> perhaps just adding more swap space will allow the malloc to complete ...
>>> running statmon with a 1 second interval and looking at the heap
>>> consumption of the gem, might show growth and a "sudden decline" when the
>>> malloc fails ...
>>>
>>
>> Hi Dale,
>>
>> Just for the record, I tried with this scenario:
>>
>> [marianopeck at quuveserver1 ~]$ free -m
>>               total        used        free      shared  buff/cache   available
>> Mem:           8014         388        6850         359         775        7205
>> Swap:         16639           0       16639
>>
>> And still didn't work. Note that I have 7GB of RAM free. At the end, when
>> the system crashed, this was the resulting state:
>>
>> [marianopeck at quuveserver1 ~]$ free -m
>>               total        used        free      shared  buff/cache   available
>> Mem:           8014         338        1316         973        6359        6639
>> Swap:         16639           0       16639
>>
>>
>> Anyway, no problem, I would assume this is a problem in 3.1.0.6 and
>> hopefully I will never need to list instances / migrate this class until I
>> am in 3.2/3.3...
>>
>>
>> Okay, we've read code and to sorta confirm your experience, we _do not_
>> return a nil when the malloc fails ... So we're reading more code, but our
>> suspicion now is that you are running out of TOC and the "normal" failure
>> mechanisms aren't being triggered ... to help confirm this suspicion we
>> think that you can try two independent things:
>>
>> 1. trigger an in-vm scavenge before making a call and/or
>> 2. bump up the TOC for that particular vm and see if you can find a
>> size that works ...
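For point 2, here is a rough sketch of what the larger caches could look like in the GemStone configuration files, using the sizes that worked further up in this thread (SPC 2GB, TOC 1.8GB). It assumes the standard parameter names with values in KB. The KB figures are only a rounding of those sizes, and which .conf file each line belongs in depends on the installation:

```
# stone / shared page cache configuration (file location is an assumption)
SHR_PAGE_CACHE_SIZE_KB = 2000000;   # SPC, roughly 2GB

# configuration read by the gem that runs the code update (file location is an assumption)
GEM_TEMPOBJ_CACHE_SIZE = 1800000;   # TOC, roughly 1.8GB
```

The TOC probably only needs to be this large for the one gem doing the list instances / migration, so it may make sense to raise it just for that session rather than for every gem.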
>>
>> The journey continues...
>>
>> Dale
>>
>>
>>
>
>
> --
> Mariano
> http://marianopeck.wordpress.com
>
>
>
--
Mariano
http://marianopeck.wordpress.com