[GemStone-Smalltalk] GbsGsErrStnNetLost every two hours?

David Shaffer shaffer at shaffer-consulting.com
Sun Nov 14 16:24:56 PST 2021


I just tried updating one ivar of an object on every commit (a timestamp stored in one of my “root” objects), so there should now be some traffic with every commit.  (Wouldn’t there be traffic even if my commits had no data to push?  At the very least, VW would need to sync with the gem to get the list of newly modified objects.)  Anyway, no dice: it still dies every 2 hours.
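
One way to test the inactivity hypothesis quoted below without involving GemStone at all is to hold a plain TCP connection idle between the two containers and time how long it survives. The following is only a sketch under assumptions: the function name `seconds_until_eof` is made up for illustration, and it presumes some throwaway listener in the other container that accepts the connection and then stays silent.

```python
import socket
import time


def seconds_until_eof(sock, poll_seconds=1.0, max_seconds=3 * 60 * 60):
    """Hold `sock` idle; return the seconds elapsed before the peer closes it,
    or None if it is still open after `max_seconds`.

    Hypothetical helper for illustration; not part of GemStone or GemBuilder.
    """
    start = time.monotonic()
    sock.settimeout(poll_seconds)
    while time.monotonic() - start < max_seconds:
        try:
            data = sock.recv(1)
        except socket.timeout:
            continue  # nothing arrived; the connection still looks open
        except OSError:
            return time.monotonic() - start  # reset or other socket error
        if data == b"":
            return time.monotonic() - start  # orderly close: read returned EOF
        # Any stray byte from the peer is ignored; keep waiting.
    return None
```

It could be called as `seconds_until_eof(socket.create_connection((host, port)))`, with `host` and `port` pointing at the listener. A result near 7200 seconds would suggest the drop is not specific to GemStone. Note that this only detects a close or reset that actually reaches the client; a connection that is silently discarded somewhere in between would still look open here.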

I am knee-deep in Google hits right now… I’ll post back if anything pans out.

-D

> On Nov 14, 2021, at 1:53 PM, James Foster <smalltalk at jgfoster.net> wrote:
> 
> David,
> 
> The error “socket read EOF” indicates that the Gem attempted to read from a socket and received an EOF response. 
> 
> Given that the client and the server are in different Docker containers, they are effectively on separate hosts and there is a strong indication that the socket closed between them. Given the timing duration and consistency, my first guess is that a socket is being closed due to inactivity. If you added a commit every minute (say, the last time through the loop), would that change the behavior?
> 
> James
> 
>> On Nov 14, 2021, at 10:40 AM, David Shaffer <shaffer at shaffer-consulting.com> wrote:
>> 
>> The host is an AWS EC2 instance running Ubuntu 20.04.2 (kernel 5.11.0-1020-aws) with Docker (version 20.10.7, build 20.10.7-0ubuntu5~20.04.2).  GemStone and the VisualWorks client run in separate Ubuntu containers (“latest” on Docker Hub, which is 20.04/focal).  Docker is running in “swarm mode” on this host, and both client and server are swarm services.
>> 
>> The sleep is 1 second and there is not always work to do.  In fact, most loop iterations complete without committing any changes.
>> 
>> I’m currently pursuing some sketchy syslog entries on the EC2 host that seem to correspond to the network errors.  I’ll share them as soon as I’ve pruned them down a bit.  This same setup ran for 6 years (this EC2 instance for 2 years, my transition to swarm mode was about 1 year ago) with GOODS as backend without network-related errors, though.
>> 
>> -D
>> 
>>> On Nov 14, 2021, at 1:29 PM, James Foster <smalltalk at jgfoster.net> wrote:
>>> 
>>> David,
>>> 
>>> Tell us a bit more about your configuration. Are you running Windows, macOS, or Linux? Is the client inside the Docker container (you mentioned the “entire system”)? How long is the sleep? Is there always work to do? 
>>> 
>>> James
>>> 
>>> 
>>>> On Nov 14, 2021, at 10:22 AM, David Shaffer via GemStone-Smalltalk <gemstone-smalltalk at lists.gemtalksystems.com> wrote:
>>>> 
>>>> Hey folks:
>>>> 
>>>> I’ve (finally) deployed a server using GemStone 3.6.2, GemBuilder 8.5 and VW 9.0.  My server’s main loop is essentially:
>>>> 
>>>> Abort
>>>> Do work
>>>> Commit
>>>> Sleep
>>>> 
>>>> Every two hours (I’m not sure it is exactly two hours but it seems pretty reliable), I get the following during the abort call:
>>>> 
>>>> GS Server Error - GbsGsErrStnNetLost - The session has lost its connection to the Stone Repository monitor.
>>>> 
>>>> The entire system runs on a single host in Docker so it can’t possibly be a network hiccup.  The gemnetobject logs are pasted below…they make it seem like a network error but, again, this is highly unlikely.  Has anyone else run into this or have any troubleshooting advice?
>>>> 
>>>> -David
>>>> 
>>>> --- 11/13/21 22:56:18.959 UTC Login
>>>> [Info]: Gave this process preference for OOM killer: wrote to /proc/460/oom_score_adj value 250
>>>> [Info]: User ID: Trader
>>>> [Info]: Repository: gs64stone
>>>> [Info]: Session ID: 5 login at 11/13/21 22:56:18.964 UTC
>>>> [Info]: GCI Client Host: 
>>>> [Info]: Page server PID: -1
>>>> [Info]: using libicu version 58.2
>>>> -----------------------------------------------------
>>>> GemStone: Error         Fatal
>>>> Network error - text follows:
>>>> , socket read EOF
>>>> Error Category: 231169 [GemStone] Number: 4137  Arg Count: 1 Context : 20 exception : 20
>>>> Arg 1:   20
>>>> --- 11/13/21 23:03:16.322 UTC Logging out
>>>> 
>>>> 
>>>> *****************************************************
>>>> ****** Abnormal Shutdown at 11/13/21 23:03:16.824 UTC
>>>> *****************************************************
>>>> -----------------------------------------------------
>>>> GemStone: Error         Fatal
>>>> Network error - text follows:
>>>> , socket read EOF
>>>> Error Category: 231169 [GemStone] Number: 4137  Arg Count: 1 Context : 20 exception : 20
>>>> Arg 1:   20
>>>> 
