[Glass] GC causing system reboot?

Thu Dec 19 11:36:11 PST 2013

Mariano, 

This does sound way too long ... with a 2GB SPC and a 4GB repo the mfc should be done at once:) 

I'm suspicious that your Virtual Linux Machine is somehow paging it's brains out on your mac? I am not an expert in getting these thing setup, but if OSx is only giving your virtual machine 600Mb and your virtual machine thinks it has 2GB+ then that is probably your problem .... you need to find a way to give your virtual machine more real memory in OSx ... I think... 

Dale 

----- Original Message -----

| From: "Mariano Martinez Peck" <marianopeck at gmail.com>
| To: "Dale K. Henrichs" <dale.henrichs at gemtalksystems.com>
| Cc: glass at lists.gemtalksystems.com
| Sent: Thursday, December 19, 2013 10:24:51 AM
| Subject: Re: [Glass] GC causing system reboot?

| On Thu, Dec 19, 2013 at 12:40 PM, Dale K. Henrichs <
| dale.henrichs at gemtalksystems.com > wrote:

| | By default the maintenance vm runs an mfc once an hour, so if you
| | haven't changed the frequncy for that I'm not sure what the "once a
| | day job would be". The MFC's should show up in your maintenance vm
| | log .... from your description a backup or mfc could cause
| | "sluggishness" on a machine if the repository exceeds the size of
| | your SPC by a significant margin
| 
| Dale, I am JUST NOW facing this problem in my local machine. I have a
| topaz running taking 100% since 1:30 hours. The extent file is 4GB.
| Is this normal to take 1:30 hours??? mmm I don't think so.

| My grep tells me that it is the gc indeed:

| 501 12858 1 0 12:38PM ?? 99:12.24 /opt/gemstone/product/bin/topaz -l
| -e /opt/gemstone/product/seaside/etc/maintenance30.conf

| maintenance30.conf has a GEM_TEMPOBJ_CACHE_SIZE = 200000;

| (although from the OSX monitor such a process is using 600MB)

| My system.conf has:

| SHR_PAGE_CACHE_SIZE_KB = 2000000;
| GEM_TEMPOBJ_CACHE_SIZE = 1800000;

| the big temp cache is just because I had some big reports that were
| crashing with an out of memory.

| My sysctl has:

| kern.sysv.shmmax=68719476736
| kern.sysv.shmmin=1
| kern.sysv.shmmni=256
| kern.sysv.shmseg=64
| kern.sysv.shmall=4294967296

| And my machine has 8GB of RAM. yes, I know, maybe those limits are
| too much.

| So...should i let the topaz running? Since it started, I started
| having entries in maintenance_gem.log and
| seaside_11810reclaimgcgem0.log but nothing strange to me.

| seaside_11810reclaimgcgem0.log shows a reclaiming like the following
| every 15 mins:

| 12/19/2013 15:04:48 ART
| 90 reclaims 3953 pagesProcessed 3953 pagesReclaimed 0 allValidPages 0
| singleObjPages
| 1831336 processedObjs 219 liveObjs 0 shadowObjs 1831117 deadObjs
| 463.3 avgObjsPerPage

| maintenance_gem.log shows:

| Unregistering...2013-12-19T08:38:37.92392206192017-08:00
| ...Expired: 0 sessions.
| --transcript--'Starting markForCollect.: 2013-12-19T 08:38
| :37.94464492797852-08:00'
| --transcript--'Warning: markForCollection found 17142463 live
| objects, 17298 dead objects(occupying approx 1556820 bytes)'
| --transcript--'...finished markForCollect.2013-12-19T 10:21
| :02.28886604309082-08:00'
| Unregistering...2013-12-19

| So as you can see, it took almost 2 hours to do the markForCollect:
| in a 4GB repository.

| Normal or problem?

| Thanks!

| | ... Nothing GemStone does should cause a system to reboot, however,
| | if you start running low on real memory (how much swap space do you
| | nave allocated) linux will start killing processes that are memory
| | hogs and perhaps it killed a process that resulted in a reboot?
| 

| | Dale
| 

| | | From: "Mariano Martinez Peck" < marianopeck at gmail.com >
| | 
| 
| | | To: glass at lists.gemtalksystems.com
| | 
| 
| | | Sent: Thursday, December 19, 2013 5:28:34 AM
| | 
| 
| | | Subject: [Glass] GC causing system reboot?
| | 
| 

| | | Hi guys,
| | 
| 

| | | Today morning I found out that my server was restarted
| | | automatically
| | | at 6:54 am. I was weird, since it was an unexpected restart. I
| | | tried
| | | to find the cause and I didn't find anything from the OS itself
| | | nor
| | | from VirtualBox logs (yes, the Linux runs over a VirtualBox
| | | machine).
| | 
| 

| | | Then I started to take a look to see if GemStone could have been
| | | the
| | | cause...and indeed, there might be something related. Look at the
| | | logs (and the timestamps):
| | 
| 

| | | seaside_20477admingcgem.log shows:
| | 
| 

| | | --- 12/19/2013 06:08:21 .317 EST ---
| | 
| 
| | | [Info]: Successfully acquired GC lockKind 6 on attempt 1.
| | 
| 
| | | Starting SweepWsUnion, WSU size=18 PD size=7
| | 
| 

| | | seaside_20475reclaimgcgem0.log shows:
| | 
| 

| | | 12/19/2013 06:19 :51 EST
| | 
| 
| | | 1 reclaims 40 pagesProcessed 40 pagesReclaimed 2 allValidPages 38
| | | singleObjPages
| | 
| 
| | | 60 processedObjs 21 liveObjs 39 shadowObjs 0 deadObjs 1.5
| | | avgObjsPerPa
| | 
| 

| | | All 3 fastcgi logs have several lines like:
| | 
| 

| | | --transcript--'handled sigabort:
| | | 2013-12-19T03:46:52.31182599067688-08:00'
| | 
| 

| | | Here I think the difference in the time is that the previous ones
| | | were the server local time while in this one is the default
| | | timezone
| | | I have in GemStone (could that be it?).
| | 
| 

| | | Finally, maintenance_gem.log has modification timestamp at 6:52
| | | but
| | | no meaningful entry. It has plenty of entries like this:
| | 
| 

| | | Unregistering...2013-12-19T03:52:42.86958003044128-08:00
| | 
| 
| | | ...Expired: 0 sessions.
| | 
| 

| | | (again, different timestampt, but maybe the difference I mention
| | | above).
| | 
| 

| | | And at 6:54 (2 minutes after timestamp of mainenance_gem.log),
| | | the
| | | system was rebooted. So....is there anything particular strange
| | | here?
| | 
| 
| | | I have the same setup in my machine (OSX) and I notice that every
| | | day
| | | at some point, a topaz process starts to run in my machine and
| | | takes
| | | 100% of CPU. It literally kills my machine, I mean, I cannot do
| | | anything else. And the process takes quite a while. I GUESS that
| | | is
| | | the GC running. So maybe the GC was simply trying to run in my
| | | server and the fact of being CPU intensive put it down? mmmmm
| | 
| 

| | | Thanks in advance for any help,
| | 
| 

| | | --
| | 
| 
| | | Mariano
| | 
| 
| | | http://marianopeck.wordpress.com
| | 
| 

| | | _______________________________________________
| | 
| 
| | | Glass mailing list
| | 
| 
| | | Glass at lists.gemtalksystems.com
| | 
| 
| | | http://lists.gemtalksystems.com/mailman/listinfo/glass
| | 
| 

| --
| Mariano
| http://marianopeck.wordpress.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gemtalksystems.com/mailman/private/glass/attachments/20131219/c39d1e24/attachment.html>