[Glass] GC causing system reboot?

Mariano Martinez Peck marianopeck at gmail.com
Thu Dec 19 10:24:51 PST 2013


On Thu, Dec 19, 2013 at 12:40 PM, Dale K. Henrichs <
dale.henrichs at gemtalksystems.com> wrote:

> By default the maintenance vm  runs an mfc once an hour, so if you haven't
> changed the frequncy for that I'm not sure what the "once a day job would
> be". The MFC's should show up in your maintenance vm log .... from your
> description a backup or mfc could cause "sluggishness" on a machine if the
> repository exceeds the size of your SPC by a significant margin
>

Dale, I am JUST NOW facing this problem in my local machine. I have a topaz
running taking 100% since 1:30 hours. The extent file is 4GB. Is this
normal to take 1:30 hours??? mmm I don't think so.

My grep tells me that it is the gc indeed:

 501 12858     1   0 12:38PM ??        99:12.24
/opt/gemstone/product/bin/topaz -l -e
/opt/gemstone/product/seaside/etc/maintenance30.conf

maintenance30.conf has a GEM_TEMPOBJ_CACHE_SIZE = 200000;

(although from the OSX monitor such a process is using 600MB)

My system.conf has:

SHR_PAGE_CACHE_SIZE_KB = 2000000;
GEM_TEMPOBJ_CACHE_SIZE = 1800000;

the big temp cache is just because I had some big reports that were
crashing with an out of memory.

My sysctl has:

kern.sysv.shmmax=68719476736
kern.sysv.shmmin=1
kern.sysv.shmmni=256
kern.sysv.shmseg=64
kern.sysv.shmall=4294967296

And my machine has 8GB of RAM.  yes, I know, maybe those limits are too
much.

So...should i let the topaz running? Since it started, I started having
entries in maintenance_gem.log and  seaside_11810reclaimgcgem0.log but
nothing strange to me.

seaside_11810reclaimgcgem0.log shows a reclaiming like the following every
15 mins:

12/19/2013 15:04:48 ART
   90 reclaims  3953 pagesProcessed  3953 pagesReclaimed 0 allValidPages  0
singleObjPages
   1831336 processedObjs  219 liveObjs  0 shadowObjs 1831117 deadObjs
 463.3 avgObjsPerPage

maintenance_gem.log shows:

Unregistering...2013-12-19T08:38:37.92392206192017-08:00
...Expired: 0 sessions.
--transcript--'Starting markForCollect.: 2013-12-19T*08:38*
:37.94464492797852-08:00'
--transcript--'Warning: markForCollection found 17142463 live objects,
17298 dead objects(occupying approx 1556820 bytes)'
--transcript--'...finished markForCollect.2013-12-19T*10:21*
:02.28886604309082-08:00'
Unregistering...2013-12-19

So as you can see, it took almost 2 hours to do the markForCollect: in a
4GB repository.

Normal or problem?

Thanks!


> ... Nothing GemStone does should cause a system to reboot, however, if you
> start running low on real memory (how much swap space do you nave
> allocated) linux will start killing processes that are memory hogs and
> perhaps it killed a process that resulted in a reboot?
>
> Dale
>
> ------------------------------
>
> *From: *"Mariano Martinez Peck" <marianopeck at gmail.com>
> *To: *glass at lists.gemtalksystems.com
> *Sent: *Thursday, December 19, 2013 5:28:34 AM
> *Subject: *[Glass] GC causing system reboot?
>
>
> Hi guys,
>
> Today morning I found out that my server was restarted automatically at
> 6:54 am. I was weird, since it was an unexpected restart. I tried to find
> the cause and I didn't find anything from the OS itself nor from VirtualBox
> logs (yes, the Linux runs over a VirtualBox machine).
>
> Then I started to take a look to see if GemStone could have been the
> cause...and indeed, there might be something related. Look at the logs (and
> the timestamps):
>
> seaside_20477admingcgem.log  shows:
>
> --- 12/19/2013 *06:08:21*.317 EST ---
> [Info]: Successfully acquired GC lockKind 6 on attempt 1.
> Starting SweepWsUnion, WSU size=18  PD size=7
>
> seaside_20475reclaimgcgem0.log shows:
>
> 12/19/2013 *06:19*:51 EST
>    1 reclaims  40 pagesProcessed  40 pagesReclaimed 2 allValidPages  38
> singleObjPages
>    60 processedObjs  21 liveObjs  39 shadowObjs 0 deadObjs    1.5
> avgObjsPerPa
>
>
> All 3 fastcgi logs have several lines like:
>
> --transcript--'handled sigabort: 2013-12-19T03:46:52.31182599067688-08:00'
>
> Here I think the difference in the time is that the previous ones were the
> server local time while in this one is the default timezone I have in
> GemStone (could that be it?).
>
> Finally, maintenance_gem.log has modification timestamp at *6:52* but no
> meaningful entry. It has plenty of entries like this:
>
> Unregistering...2013-12-19T03:52:42.86958003044128-08:00
> ...Expired: 0 sessions.
>
> (again, different timestampt, but maybe the difference I mention above).
>
>
> And at 6:54 (2 minutes after timestamp of mainenance_gem.log), the system
> was rebooted. So....is there anything particular strange here?
> I have the same setup in my machine (OSX) and I notice that every day at
> some point, a topaz process starts to run in my machine and takes 100% of
> CPU. It literally kills my machine, I mean, I cannot do anything else. And
> the process takes quite a while. I GUESS that is the GC running. So maybe
> the GC was simply trying to run in my server and the fact of being CPU
> intensive put it down? mmmmm
>
> Thanks in advance for any help,
>
>
>
> --
> Mariano
> http://marianopeck.wordpress.com
>
> _______________________________________________
> Glass mailing list
> Glass at lists.gemtalksystems.com
> http://lists.gemtalksystems.com/mailman/listinfo/glass
>
>
>


-- 
Mariano
http://marianopeck.wordpress.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gemtalksystems.com/mailman/private/glass/attachments/20131219/047905b4/attachment.html>


More information about the Glass mailing list