[Glass] problem update: Zinc server with Seaside (3.1)

Johan Brichau johan at yesplan.be
Sun Dec 22 10:54:30 PST 2013


Dale,

With the new implementation of #atEnd, I see an error:

--transcript--2013-12-22 10:39:3.5306130886077881E+01 302143 D Executing request/response loop
--transcript--2013-12-22 10:39:3.5307526111602783E+01 302143 I Read aZnRequest(GET /welcome)
--transcript--2013-12-22 10:39:3.5339780092239380E+01 302143 T GET /welcome 200 4733B 32ms
--transcript--2013-12-22 10:39:3.5340270996093750E+01 302143 I Wrote aZnResponse(200 OK text/html;charset=utf-8 4733B)
--transcript--2013-12-22 10:39:3.5341017007827759E+01 302143 D a ZnUnknownHttpMethod occurred (error 2710) while reading request
--transcript--2013-12-22 10:39:3.5341190099716187E+01 302143 D Closing stream

With the old implementation of #atEnd, everything works (again) and I also do not have the error:

--transcript--2013-12-22 10:46:2.9496814012527466E+01 694893 D Executing request/response loop
--transcript--2013-12-22 10:46:2.9498058080673218E+01 694893 I Read aZnRequest(GET /welcome)
--transcript--2013-12-22 10:46:2.9528563976287842E+01 694893 T GET /welcome 200 4733B 30ms
--transcript--2013-12-22 10:46:2.9529000997543335E+01 694893 I Wrote aZnResponse(200 OK text/html;charset=utf-8 4733B)
--transcript--2013-12-22 10:46:5.9529083013534546E+01 694893 D a ConnectionTimedOut occurred (error 2710), Data receive timed out. while reading request
--transcript--2013-12-22 10:46:5.9529268026351929E+01 694893 D Closing stream

The only problem is… this codebase used to misbehave instantly. So… I agree that there is a condition that does not occur anymore but I cannot characterize it.
In any case, I think the failsafe code in #next to check if there is anything in the buffer after the receiveData should be sufficient because the change to #atEnd also changes its semantics and this appears to yield different results (see above).

I compared with the behaviour in Pharo and there I’m seeing the same log as the second one above.

Johan

On 22 Dec 2013, at 18:30, Dale K. Henrichs <dale.henrichs at gemtalksystems.com> wrote:

> Johan,
> 
> The implementation of atEnd only becomes important when you hit one of "null connections" if you get one of those, I think that SocketStream misbehaves .... the socket is alive, says that it will not block on read and then proceeds to return no data ... this could very easily be "an illegal combination", but if it occurs things go crazy ...
> 
> If you are able to avoid the "null connection" (my guess) then SocketStream works fine ...
> 
> So I think we need to understand where these "null connections" are coming from and make sure that they don't reappear.
> 
> Can you confirm that you are not seeing the "null connections" as well ... or are you surviving through them?
> 
> Dale
> 
> From: "Johan Brichau" <johan at yesplan.be>
> To: "Dale K. Henrichs" <dale.henrichs at gemtalksystems.com>
> Cc: glass at lists.gemtalksystems.com
> Sent: Sunday, December 22, 2013 4:10:18 AM
> Subject: Re: [Glass] problem update: Zinc server with Seaside (3.1)
> 
> Dale,
> 
> It’s becoming quite a crazy issue…
> 
> I’m now executing the Zinc adaptor perfectly _without_ any of the changes to SocketStream.
> I even did a complete load of Seaside31 and Zinc in a virgin image (without the fixes for SocketStream) and it works.
> 
> I got there because I was fiddling around to get to the origin of the second connection. To me, the original implementation of #atEnd is correct because #receiveData throws a timeout exception if it does not receive any data from the socket. However, I noticed that this does not happen in the Gs implementation of SocketStream because the timeout is set to 0. But now I’m unable to test my theory because the original implementation works…
> 
> So, there probably is still something fishy going on because we both had different kinds of errors before with the same codebase.
> 
> Johan
> 
> On 22 Dec 2013, at 05:10, Dale K. Henrichs <dale.henrichs at gemtalksystems.com> wrote:
> 
> Yes I was seeing the second odd (null) connection, too and didn't figure out where that was coming from ... Didn't try things in topaz ...
> 
> I never did characterize the out-of-memory cause either ...
> 
>  I just focused on solving the spinning in empty buffer problem ... there are probably some remaining gotchas hanging around ...
> 
> Dale
> From: "Johan Brichau" <johan at yesplan.be>
> To: "Dale K. Henrichs" <dale.henrichs at gemtalksystems.com>
> Cc: glass at lists.gemtalksystems.com
> Sent: Saturday, December 21, 2013 3:08:34 AM
> Subject: Re: [Glass] problem update: Zinc server with Seaside (3.1)
> 
> Obviously, once I turn on the logging, the dropped connections disappear too…. (see log below)
> I’m wondering about the ZnUnknownHttpMethod before closing the stream. It tries to read from an empty stream actually… 
> 
> But we are definitely making progress… 
> I’m off Xmas shopping today but will look at this in the evening..
> 
> Johan
> 
> topaz 1> topaz 1> topaz 1> topaz 1> Zinc Server started on port 8383
> --transcript--2013-12-21 03:03:1.1953476905822754E+01 455751 I Starting ZnManagingMultiThreadedServer HTTP port 8383
> --transcript--2013-12-21 03:03:1.1953932046890259E+01 455683 D Initializing server socket
> --transcript--'new stream 116656385'
> --transcript--2013-12-21 03:03:1.8871202945709229E+01 455692 D Executing request/response loop
> --transcript--2013-12-21 03:03:1.8872347116470337E+01 455692 I Read aZnRequest(GET /welcome)
> --transcript--2013-12-21 03:03:1.8899705886840820E+01 455692 T GET /welcome 200 4733B 27ms
> --transcript--2013-12-21 03:03:1.8900135993957520E+01 455692 I Wrote aZnResponse(200 OK text/html;charset=utf-8 4733B)
> --transcript--2013-12-21 03:03:1.8900440931320190E+01 455692 D a ZnUnknownHttpMethod occurred (error 2710) while reading request
> --transcript--2013-12-21 03:03:1.8900584936141968E+01 455692 D Closing stream 116656385
> 
> 
> On 21 Dec 2013, at 10:51, Johan Brichau <johan at yesplan.be> wrote:
> 
> Ha, a second try does yield better results. The start in topaz does not have the out-of-mem.
> I do see a connection being dropped every other request…. but there is progress!
> 
> Don’t know what made it fail similarly the first time. The only change I did between this email and the one below is a restart of the stone...
> 
> Johan
> 
> On 21 Dec 2013, at 10:42, Johan Brichau <johan at yesplan.be> wrote:
> 
> Dale,
> 
> Unfortunately I’m getting the same result in topaz and less stable behaviour from gemtools (connections are dropped)
> Looking into it…
> 
> Johan
> 
> On 20 Dec 2013, at 04:31, Dale K. Henrichs <dale.henrichs at gemtalksystems.com> wrote:
> 
> Johan,
> 
> I moved the package SocketStream to the repo (changed baseline) and commit on socketstream_fix ... let me know if this works for you..
> 
> Dale
> 
> ----- Original Message -----
> | From: "Dale K. Henrichs" <dale.henrichs at gemtalksystems.com>
> | To: "Johan Brichau" <johan at yesplan.be>
> | Cc: glass at lists.gemtalksystems.com
> | Sent: Thursday, December 19, 2013 5:45:26 PM
> | Subject: Re: [Glass] problem update: Zinc server with Seaside (3.1)
> | 
> | I think I've got things working now ... I will push a branch of
> | glassdb/zinc with my changes ... I need to clean things up a bit and
> | test with topaz ... right now its running in tODE ... I'm still
> | getting two connections per request, but it's much more stable now
> | ...
> | 
> | Dale
> | 
> | ----- Original Message -----
> | | From: "Johan Brichau" <johan at yesplan.be>
> | | To: "Dale K. Henrichs" <dale.henrichs at gemtalksystems.com>
> | | Cc: glass at lists.gemtalksystems.com
> | | Sent: Thursday, December 19, 2013 7:46:37 AM
> | | Subject: Re: [Glass] problem update: Zinc server with Seaside (3.1)
> | | 
> | | Hi Dale,
> | | 
> | | Thanks for these pointers.
> | | I will look into it asap.
> | | 
> | | One thing I noticed is that after I merged the latest changes into
> | | Seaside3.1 [1], a lot less processes were spawning and it took more
> | | time before the system ran out of memory.
> | | But it might just have been a coincidence...
> | | 
> | | Johan
> | | 
> | | [1] https://github.com/glassdb/Seaside31/pull/5
> | | 
> | | On 19 Dec 2013, at 06:38, Dale K. Henrichs
> | | <dale.henrichs at gemtalksystems.com> wrote:
> | | 
> | | > Just a touch more info:
> | | > 
> | | > ZnMultiThreadedServer>>listenLoop sents #repeat to a block ...
> | | > 
> | | > if `serverSocket isValid` returns false, the loop is exited but
> | | > it
> | | > launches another
> | | > #listenLoop ... if we get consistent invalid serverSockets, this
> | | > guy will spin out of control..
> | | > 
> | | > Inside the block, ZnMultiThreadedServer>>serveConnectionsOn: is
> | | > called (right before the repeat).
> | | > 
> | | > In ZnMultiThreadedServer>>serveConnectionsOn: if
> | | > #waitForAcceptFor:
> | | > returns a non-nil response, we are off to the races with a forked
> | | > process and a return to ZnMultiThreadedServer>>listenLoop which
> | | > itself spins up a new call to
> | | > ZnMultiThreadedServer>>serveConnectionsOn: ....
> | | > 
> | | > Soooo, it looks like SocketStreamSocket>>waitForAcceptFor: had
> | | > better function correctly .... and serverSocket had better not go
> | | > into terminal invalidity...
> | | > 
> | | > Some halts or breakpoints in and around these particular methods
> | | > should yield some good information ... or point us in a more
> | | > productive direction.
> | | > 
> | | > Dale
> | | > 
> | | > From: "Dale K. Henrichs" <dale.henrichs at gemtalksystems.com>
> | | > To: "Johan Brichau" <johan at yesplan.be>
> | | > Cc: glass at lists.gemtalksystems.com
> | | > Sent: Wednesday, December 18, 2013 8:59:47 PM
> | | > Subject: Re: [Glass] problem update: Zinc server with Seaside
> | | > (3.1)
> | | > 
> | | > Johan,
> | | > 
> | | > I've collected more info on two processes[1] ... first the one
> | | > that
> | | > gets an "index out of range error" and the other process that was
> | | > sitting in the process list waiting for more input from the
> | | > socket, but the socket was closed (at leasted #isConnected is
> | | > false).
> | | > 
> | | > I hit the out of memory earlier but didn't collect any meaningful
> | | > info ... I'm assuming that the multi-threaded boy is hitting an
> | | > error somewhere and spinning away like crazy ...
> | | > 
> | | > Need to catch the error in progress and perhaps find out from
> | | > Sven
> | | > if there is supposed to be a "deadman switch" on this guy to
> | | > avoid
> | | > infinite errors ..
> | | > 
> | | > Dale
> | | > 
> | | > [1]
> | | > https://github.com/dalehenrich/todeHome/blob/master/home/seaside3.1.0/stacks
> | | > 
> | | > From: "Dale K. Henrichs" <dale.henrichs at gemtalksystems.com>
> | | > To: "Johan Brichau" <johan at yesplan.be>
> | | > Cc: glass at lists.gemtalksystems.com
> | | > Sent: Wednesday, December 18, 2013 6:39:13 PM
> | | > Subject: Re: [Glass] problem update: Zinc server with Seaside
> | | > (3.1)
> | | > 
> | | > Here's the stack:
> | | > 
> | | > 1. OffsetError(AbstractException)>>_signalWith: @5 line 25
> | | > 2. OffsetError(AbstractException)>>signal @2 line 47
> | | > 3. ByteArray(Object)>>_error:args: @15 line 11
> | | > 4. ByteArray(Object)>>_errorIndexOutOfRange: @2 line 6
> | | > 5. ByteArray>>at: @4 line 13
> | | > 6. SocketStream>>next @13 line 10
> | | > 7. ZnLineReader>>processNext @5 line 4
> | | > 8. ZnLineReader>>nextLine @3 line 3
> | | > 9. ZnRequestLine>>readFrom: @3 line 3
> | | > 10. ZnRequestLine class>>readFrom: @3 line 3
> | | > 11. ZnRequest>>readHeaderFrom: @2 line 2
> | | > 12. ZnRequest(ZnMessage)>>readBinaryFrom: @2 line 3
> | | > 13. ZnRequest class(ZnMessage class)>>readBinaryFrom: @3 line 3
> | | > 14. [] in
> | | > ExecBlock1(ZnZincServerAdaptor)>>configureServerForBinaryReading
> | | > @2 line 4
> | | > 15. [] in
> | | > ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>readRequest:
> | | > @3 line 6
> | | > 16. [] in
> | | > ExecBlock0(ZnSingleThreadedServer)>>withMaximumEntitySizeDo: @2
> | | > line 6
> | | > 17. [] in ZnMaximumEntitySize class(DynamicVariable
> | | > class)>>value:during: @3 line 9
> | | > 18. ZnMaximumEntitySize class(ExecBlock)>>ensure: @2 line 12
> | | > 19. ZnMaximumEntitySize class(DynamicVariable
> | | > class)>>value:during:
> | | > @6 line 10
> | | > 20.
> | | > ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>withMaximumEntitySizeDo:
> | | > @5 line 5
> | | > 21.
> | | > ZnManagingMultiThreadedServer(ZnSingleThreadedServer)>>readRequest:
> | | > @2 line 6
> | | > 22. [] in
> | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>readRequestSafely:
> | | > @2 line 4
> | | > 23. ZnManagingMultiThreadedServer(ExecBlock)>>on:do: @3 line 42
> | | > 24.
> | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>readRequestSafely:
> | | > @3 line 5
> | | > 25.
> | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeOneRequestResponseOn:
> | | > @2 line 7
> | | > 26. [] in
> | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeRequestResponseLoopOn:
> | | > @2 line 10
> | | > 27. [] in ZnCurrentServer class(DynamicVariable
> | | > class)>>value:during: @3 line 9
> | | > 28. ZnCurrentServer class(ExecBlock)>>ensure: @2 line 12
> | | > 29. ZnCurrentServer class(DynamicVariable class)>>value:during:
> | | > @6
> | | > line 10
> | | > 30.
> | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>executeRequestResponseLoopOn:
> | | > @4 line 8
> | | > 31. [] in
> | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn:
> | | > @2 line 9
> | | > 32. ZnManagingMultiThreadedServer(ExecBlock)>>ensure: @2 line 12
> | | > 33. [] in
> | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn:
> | | > @2 line 10
> | | > 34. [] in ExecBlock0(ExecBlock)>>ifCurtailed: @2 line 6
> | | > 35. ExecBlock0(ExecBlock)>>ensure: @2 line 12
> | | > 36. ZnManagingMultiThreadedServer(ExecBlock)>>ifCurtailed: @3
> | | > line
> | | > 8
> | | > 37. [] in
> | | > ZnManagingMultiThreadedServer(ZnMultiThreadedServer)>>serveConnectionsOn:
> | | > @2 line 11
> | | > 38. GsProcess>>_start @7 line 16
> | | > 39. UndefinedObject(GsNMethod class)>>_gsReturnToC @1 line 1
> | | > 
> | | > 
> | | > From: "Dale K. Henrichs" <dale.henrichs at gemtalksystems.com>
> | | > To: "Johan Brichau" <johan at yesplan.be>
> | | > Cc: glass at lists.gemtalksystems.com
> | | > Sent: Wednesday, December 18, 2013 6:26:01 PM
> | | > Subject: Re: [Glass] problem update: Zinc server with Seaside
> | | > (3.1)
> | | > 
> | | > Johan,
> | | > 
> | | > I've finally started monkeying with this and the first thing I
> | | > ran
> | | > into (running this under tODE and trying to duplicate the
> | | > GemTools
> | | > error condition) was an error involving SocketStream and an
> | | > apparent buffer overflow ... it looks to me that
> | | > SocketStream>>next has no logic to refill the buffer once it is
> | | > full ... at least I got a stack in which the ByteArray buffer was
> | | > accessed at slot 4097 (1 larger than the buffer):
> | | > 
> | | > There are three other processes floating around as well ... I'll
> | | > keep digging ...
> | | > 
> | | > Dale
> | | > 
> | | > 
> | | > From: "Johan Brichau" <johan at yesplan.be>
> | | > To: glass at lists.gemtalksystems.com
> | | > Sent: Sunday, December 15, 2013 1:20:20 PM
> | | > Subject: [Glass] problem update: Zinc server with Seaside (3.1)
> | | > 
> | | > Hi all,
> | | > 
> | | > I’m working on the Zinc server adaptor for Seaside 3.1 but I got
> | | > stuck.
> | | > 
> | | > Launching the following from Gemtools brings up the server and
> | | > makes it respond to requests nicely. Trying to do the same from a
> | | > topaz session though, immediately throws an out-of-mem exception.
> | | > 
> | | > WAGsZincAdaptor startOn: 8383
> | | > 
> | | > If you break the blocking call in Gemtools and run the line above
> | | > again, you also run out-of-mem after a couple of times (but only
> | | > after _also_ connecting to the server from your web browser).
> | | > First question about this is why that does not happen in gem
> | | > tools?
> | | > I tried turning the auto commit off but that did not change
> | | > anything.
> | | > 
> | | > Following advice in [1], I started investigating from gemtools
> | | > and
> | | > got this for the byte sizes in temp obj memory on 75% full:
> | | > 
> | | > #'ByteArray'->22153440, #'GsMethodDictionary'->3073088,
> | | > #'GsMethodLookupCache'->2224064, #'GsProcess'->562224,
> | | > #'String'->415504, #'Array'->340168, #'SocketStream'->322680,
> | | > #'ExecBlock'->201664, #'VariableContext'->175552,
> | | > #'SocketStreamSocket'->150696,  …0
> | | > 
> | | > The process browser shows over 100 threads displayed as follows:
> | | > 
> | | > (priority=25) ready [oop= …. ]
> | | > 
> | | > I guess this is where Dale started having nightmares. The stack
> | | > traces show the out-of-mem always happens in the
> | | > ZnNetworkingUtils>>setSocketStreamParameters: method.
> | | > 
> | | > All of this is using the current Zinc version on github for
> | | > gemstone 3.1 [2]  and my version of the Seaside 3.1 port [3]
> | | > To try, load Seaside from my repo, load the ‘Zinc-Seaside’ group
> | | > from the baseline and hit the above.
> | | > 
> | | > Any ideas are welcome
> | | > 
> | | > Johan
> | | > 
> | | > [1]
> | | > http://gemstonesoup.wordpress.com/2008/11/19/gemstone-101-managing-out-of-memory-situations/
> | | > [2] https://github.com/glassdb/zinc/tree/gemstone3.1
> | | > [3] https://github.com/jbrichau/Seaside31
> | | > 
> | | > _______________________________________________
> | | > Glass mailing list
> | | > Glass at lists.gemtalksystems.com
> | | > http://lists.gemtalksystems.com/mailman/listinfo/glass
> | | 
> | | 
> |

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gemtalksystems.com/mailman/private/glass/attachments/20131222/808b7f6e/attachment-0001.html>


More information about the Glass mailing list