[Glass] Swazoo server hangs

Wed Nov 6 04:49:25 PST 2013

Thanks for the input Johan.

> First off: given the number of problems you have using Swazoo and that Zinc server has not been battle tested in Gemstone (and there are open issues nobody really looked at), I definitely recommend to switch (back) to FastCGI. It is stable and fast. But, of course, it would be great if you can flesh out the remaining issues with Zinc server on Gemstone ;-)

Thanks. I really can't work on Zinc right now; we're under pressure, so we'll go with battle-tested FastCGI.

> Second, are you seeing the lock ups occurring frequently? Are they irregular or is there a pattern?

Yes, there's a pattern.

Someone else looked at the problem, but here's my layman's
interpretation. We picked up the pattern when we had the same Ajax
call on the onblur and onchange events of the same HTML element. This
caused virtually simultaneous calls to 2 different Swazoo servers with
the same session (and action) id. The two requests conflicted, and one
process retried. When it retries, it tries to read from the socket
again, but the socket has already been read on the first try (hey,
there's no two-phase commit on reading from sockets). So, something
like that. In principle, whenever we read from or write to external
systems and a commit fails in GS, we generally have a problem.
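
To make the retry problem concrete, here is a minimal Python sketch. It
is not the actual Swazoo or GemStone code: CommitConflict,
handle_with_retry and process are hypothetical names, and the conflict
is simulated. It assumes one way around the problem is to read the
request from the socket exactly once and retry only against the
buffered bytes.

```python
import socket


class CommitConflict(Exception):
    """Hypothetical stand-in for a failed commit in the object database."""


def handle_with_retry(conn, process, max_attempts=3):
    # Read from the socket exactly once, outside the retry loop.
    # A second recv() here would block, because the bytes are already consumed.
    request = conn.recv(4096)
    for attempt in range(1, max_attempts + 1):
        try:
            # Every attempt replays the buffered bytes, never the socket.
            return process(request, attempt)
        except CommitConflict:
            continue
    raise CommitConflict("gave up after %d attempts" % max_attempts)


def demo():
    # socketpair() gives two connected sockets, standing in for browser and server.
    client, server = socket.socketpair()
    try:
        client.sendall(b"POST /ajax HTTP/1.1\r\n\r\n")

        def process(request, attempt):
            # Simulate a commit conflict on the first attempt only.
            if attempt == 1:
                raise CommitConflict("simulated conflict")
            return request.split(b" ")[1]

        return handle_with_retry(server, process)
    finally:
        client.close()
        server.close()
```

If the recv() call were moved inside the loop, the second attempt would
wait forever on an empty socket, which is roughly the hang described
above.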

Does this make sense? I can get more details if you need.

> I am asking this because we do have a similar problem that occurs (rather infrequently) with FastCGI adaptors for Seaside [1]:
> A seaside gem will become unresponsive after some time. I already managed to find out that the gateSemaphore of a quit system could still be less than 10 (i.e. some processes got locked and never signaled the semaphore) and that it might have something to do with the front-end server dropping connections. I'm not sure if these problems are related though.

It does not sound as if they are related, but I suppose they could be.

Thanks
Otto