[Glass] Nginx , FastCGI and GLASS problem

Wed Dec 4 20:13:13 PST 2013

OK, I found the problem.....

For some reason, reading from /dev/random is very very slow in my centos
server...the next lines takes forever there (but answered immediately in my
local gemstone in OSX):

[

| stream |
stream := (FileStream oldFileNamed: '/dev/random').
stream binary.
(stream next: 20) asByteArray.
stream close.

] timeToRun

On Thu, Dec 5, 2013 at 12:33 AM, Mariano Martinez Peck <
marianopeck at gmail.com> wrote:

> mmmm I could reproduce the problem even that long request....some problem
> as a few days back...my gems start to go down...just browsing a bit my
> app...
> In fact...I have now particular button that saves an object and that
> "hungs" the gem....
> Ohhh ah I have the same problem with Swazoo so I doesn't seem FastCGI
> related...
> Weird that in my local gemstone I have no problem.
> So it really seems like something is happening in the remote gemstone that
> doesn't happen here. I will try to do a backup in remote one and restore
> here to see if I can reproduce it. If not, then there is something in the
> server installation.
>
> Thanks guys. I will continue digging and let you know.
>
> Cheers,
>
>
> On Wed, Dec 4, 2013 at 8:12 PM, Mariano Martinez Peck <
> marianopeck at gmail.com> wrote:
>
>>
>>
>> On Wed, Dec 4, 2013 at 7:29 PM, Paul DeBruicker <pdebruic at gmail.com>wrote:
>>
>>> Sounds to me like you have an error in your Seaside app or your own code
>>> that
>>> kills your FastCGI gems.
>>>
>>
>> Weird :(   The only problem I found strange was clicking in "update
>> button" which took more than 30 seconds. See thread "[GS/SS Beta] What do
>> these error messages mean?".
>> Does this sound like the problem?
>>
>>
>>>
>>> In your Seaside app are you using the WAGemStoneProductionErrorHandler?
>>
>>
>> No, I am using WARemoteDebuggingWalkbackErrorHandler since I am still in
>> development and I really need to be able to debug problems.
>> I can temporarily move to WAGemStoneProductionErrorHandler to see if this
>> is the cause of the problem.
>>
>>
>>> If
>>> not, switch to it.  If so, are any errors getting put on the ObjectLog?
>>>
>>>
>> I will check.
>>
>>
>>>
>>>
>>> Do you have monit or daemontools monitoring your FastCGI gems to restart
>>> them automatically when they become unresponsive?
>>>
>>>
>> Not for FastCGI in particular, but I do have for nginx (monit):
>>
>> $ cat /etc/monit.d/nginx
>> check process nginx with pidfile /var/run/nginx.pid
>> group nginx
>> start program = "/etc/init.d/nginx start"
>> stop program = "/etc/init.d/nginx stop"
>> if failed host 127.0.0.1 port 80 protocol http then restart
>> if 5 restarts within 5 cycles then timeout
>>
>>
>>
>> Thanks Paul!!
>>
>>
>>
>>>
>>>
>>>
>>> Mariano Martinez Peck wrote
>>> > ObjectLog, from yesterday:  FastCGI: topaz exit     Object: port: 9003
>>> > pid:
>>> > 2743
>>> >
>>> >
>>> > And now, FastCGI of 9001 (gem 1)
>>> >
>>> > Unreportable ERROR Encountered:
>>> > 2013-12-04T07:22:54.77649903297424-08:00UserDefinedError:
>>> > send(15,0x7f352f636400,49,0) failed with errno=107,ENOTCONN, Socket is
>>> not
>>> > connected-----------
>>> >
>>> > So I got this error in each of the gems....
>>> >
>>> > Any idea?
>>> >
>>> >
>>> >
>>> > On Wed, Dec 4, 2013 at 9:56 AM, Mariano Martinez Peck <
>>>
>>> > marianopeck@
>>>
>>> > > wrote:
>>> >
>>> >>
>>> >>
>>> >>
>>> >> On Tue, Nov 26, 2013 at 3:22 AM, Johan Brichau <
>>>
>>> > johan@
>>>
>>> > > wrote:
>>> >>
>>> >>> Mariano,
>>> >>>
>>> >>> I would guess not all fastcgi gems are running correctly on all
>>> ports.
>>> >>> There is a default timeout of 60s in nginx before it will pass a
>>> fastcgi
>>> >>> request to another port in the upstream configuration.
>>> >>>
>>> >>> The reason you see the delay with the post and not with the following
>>> >>> get
>>> >>> might be because nginx is doing a round-trip load balance over the
>>> >>> fastcgi
>>> >>> backends and one request goes to a working port but the next is not
>>> (or
>>> >>> vice-versa).
>>> >>>
>>> >>> You might want to experiment by limiting the nginx conf to a single
>>> >>> fastcgi backend and/or change the default timeout and put some
>>> logging
>>> >>> code
>>> >>> in your app when the request gets to seaside. That would at least
>>> allow
>>> >>> to
>>> >>> find on what to focus.
>>> >>>
>>> >>>
>>> >>
>>> >> Thanks Johan, that was indeed the problem. This is the second time
>>> that
>>> >> happened to me. If I restart everything, it will probably be "fixed".
>>> But
>>> >> I
>>> >> don't want to do that to really fix the problem. So 9001 is answering
>>> >> correct, yet, 9002 yields timeone and 9003 yields could not connect
>>> after
>>> >> 1
>>> >> minute.... I tried by doing:
>>> >>
>>> >> curl -v http://localhost:9003/myApp
>>> >>
>>> >> So...the thing is how to know what the hell is happening that some of
>>> my
>>> >> gems (or  FastCGI adaptors) go down for some reason.
>>> >> I have checked all my gemstone logs. 2 things look suspicious.
>>> >> FastCGI_server-9003.log shows that I have a button that updates my app
>>> >> code
>>> >> in GemStone which is causing this:
>>> >>
>>> >> a RepositoryViewLost occurred (notification 3031), When running
>>> outside
>>> >> of a transaction the session failed to respond quickly enough to a
>>> >> request from Stone to abort.  All object caches have been cleared.
>>> >> nil
>>> >>
>>> >> Then,there are some:  ----------- Commit failure - retrying LOG ENTRY:
>>> >> aSymbolDictionary-----------
>>> >>
>>> >> Here is the pieces of logs I could rescue. Maybe someone could help me
>>> >> figure out what is happening!
>>> >>
>>> >> Thanks in advance!!
>>> >>
>>> >>
>>> >> seaside_2485pcmon.log
>>> >>
>>> >> [12/04/13 02:17:21.210 EST]: Client died: Slot   12, PID    2743,
>>> >> LostOtFlags    0, Name FastCGI-11-DataCurator
>>> >> [12/04/13 02:17:21.210 EST]: Starting crashed client recovery: Slot
>>> 12,
>>> >> PID 2743, Name FastCGI-11-DataCurator
>>> >>     Cleaned up locked/pinned frame 550 for crashed process 2743
>>> >>     Cleaned up locked/pinned frame 11545 for crashed process 2743
>>> >>     Cleaned up locked/pinned frame 3929 for crashed process 2743
>>> >>     Cleaned up locked/pinned frame 11407 for crashed process 2743
>>> >>     Disposing free frame cache from slot 12:  4 of 4 frames recovered.
>>> >> [12/04/13 02:17:23.451 EST]: Finished crashed client recovery: Slot
>>> 12,
>>> >> PID 2743, Name FastCGI-11-DataCurator
>>> >>
>>> >>
>>> >> FastCGI_server-9001.log  (which is the only gem/adaptor that seems to
>>> be
>>> >> working)
>>> >>
>>> >> paz 1> topaz 1> topaz 1> topaz 1> FastCGI Server started on port 9001
>>> >> -----------  Unreportable ERROR Encountered:
>>> >> 2013-12-03T22:52:06.46091294288635-08:00UserDefinedError:
>>> >> send(16,0x7f352f6ebba8,7416,0) failed with errno=107,ENOTCONN, Socket
>>> is
>>> >> not connected-----------
>>> >> -----------  Unreportable ERROR Encountered:
>>> >> 2013-12-03T22:53:06.45396900177002-08:00UserDefinedError:
>>> >> send(17,0x7f352f786110,7416,0) failed with errno=107,ENOTCONN, Socket
>>> is
>>> >> not connected-----------
>>> >> -----------  Unreportable ERROR Encountered:
>>> >> 2013-12-03T23:05:07.57434701919556-08:00UserDefinedError:
>>> >> send(14,0x7f352f65e6c0,2673,0) failed with errno=107,ENOTCONN, Socket
>>> is
>>> >> not connected-----------
>>> >> -----------  Unreportable ERROR Encountered:
>>> >> 2013-12-03T23:05:31.43626189231873-08:00UserDefinedError:
>>> >> send(15,0x7f352f6bed60,2673,0) failed with errno=107,ENOTCONN, Socket
>>> is
>>> >> not connected-----------
>>> >> -----------  Unreportable ERROR Encountered:
>>> >> 2013-12-03T23:14:13.39595103263855-08:00UserDefinedError:
>>> >> send(16,0x7f352f71f7e0,2673,0) failed with errno=107,ENOTCONN, Socket
>>> is
>>> >> not connected-----------
>>> >> -----------  Unreportable ERROR Encountered:
>>> >> 2013-12-03T23:15:13.47269105911255-08:00UserDefinedError:
>>> >> send(17,0x7f352f77ff90,2673,0) failed with errno=107,ENOTCONN, Socket
>>> is
>>> >> not connected-----------
>>> >>
>>> >>
>>> >> FastCGI_server-9002.log
>>> >>
>>> >> failure
>>> >> Read-Write Conflicts...
>>> >> Write-Write Conflicts...
>>> >>     383168257
>>> >> Write-Dependency Conflicts...
>>> >> Write-ReadLock Conflicts...
>>> >> Write-WriteLock Conflicts...
>>> >> Rc-Write-Write Conflicts...
>>> >> Synchronized-Commit Conflicts...
>>> >> ----------- Commit failure - retrying LOG ENTRY:
>>> >> aSymbolDictionary-----------
>>> >> failure
>>> >> Read-Write Conflicts...
>>> >> Write-Write Conflicts...
>>> >>     383168257
>>> >> Write-Dependency Conflicts...
>>> >> Write-ReadLock Conflicts...
>>> >> Write-WriteLock Conflicts...
>>> >> Rc-Write-Write Conflicts...
>>> >> Synchronized-Commit Conflicts...
>>> >> ----------- Commit failure - retrying LOG ENTRY:
>>> >> aSymbolDictionary-----------
>>> >> -----------  Unreportable ERROR Encountered:
>>> >> 2013-12-04T04:36:49.04186797142029-08:00UserDefinedError:
>>> >> send(16,0x7f21f9462268,2673,0) failed with errno=107,ENOTCONN, Socket
>>> is
>>> >> not connected-----------
>>> >> -----------  Unreportable ERROR Encountered:
>>> >> 2013-12-04T04:36:49.0669538974762-08:00UserDefinedError:
>>> >> send(17,0x7f21f94ed6c8,2673,0) failed with errno=107,ENOTCONN, Socket
>>> is
>>> >> not connected-----------
>>> >> -----------  Unreportable ERROR Encountered:
>>> >> 2013-12-04T04:36:49.10030889511108-08:00UserDefinedError:
>>> >> send(18,0x7f21f9551ae8,2673,0) failed with errno=107,ENOTCONN, Socket
>>> is
>>> >> not connected-----------
>>> >> -----------  Unreportable ERROR Encountered:
>>> >> 2013-12-04T04:36:49.11388611793518-08:00UserDefinedError:
>>> >> send(14,0x7f21f95b2918,2673,0) failed with errno=107,ENOTCONN, Socket
>>> is
>>> >> not connected-----------
>>> >> -----------  Unreportable ERROR Encountered:
>>> >> 2013-12-04T04:36:49.12687110900879-08:00UserDefinedError:
>>> >> send(19,0x7f21f96184e8,2673,0) failed with errno=107,ENOTCONN, Socket
>>> is
>>> >> not connected-----------
>>> >> -----------  Unreportable ERROR Encountered:
>>> >> 2013-12-04T04:36:49.12758493423462-08:00UserDefinedError: Incorrect
>>> FCGI
>>> >> version. Expected: 1 Received: 71-----------
>>> >> -----------  Unreportable ERROR Encountered:
>>> >> 2013-12-04T04:36:49.15798401832581-08:00UserDefinedError:
>>> >> send(15,0x7f21f968f7d8,2673,0) failed with errno=107,ENOTCONN, Socket
>>> is
>>> >> not connected-----------
>>> >>
>>> >>
>>> >>
>>> >> FastCGI_server-9003.log
>>> >>
>>> >>
>>> >> --transcript--'...finished 1.4-baseline'
>>> >> failure
>>> >> Read-Write Conflicts...
>>> >> Write-Write Conflicts...
>>> >>     288128769
>>> >>     294245633
>>> >> Write-Dependency Conflicts...
>>> >> Write-ReadLock Conflicts...
>>> >> Write-WriteLock Conflicts...
>>> >> Rc-Write-Write Conflicts...
>>> >> Synchronized-Commit Conflicts...
>>> >> ----------- Commit failure - retrying LOG ENTRY:
>>> >> aSymbolDictionary-----------
>>> >> -----------------------------------------------------
>>> >> GemStone: Error         Nonfatal
>>> >> a RepositoryViewLost occurred (notification 3031), When running
>>> outside
>>> >> of a transaction the session failed to respond quickly enough to a
>>> >> request from Stone to abort.  All object caches have been cleared.
>>> >> nil
>>> >> Error Category: 231169 [GemStone] Number: 3031  Arg Count: 1 Context :
>>> >> 318063361 exception : 318063105
>>> >> Arg 1: [20 sz:0 cls: 76289 UndefinedObject] nil
>>> >> topaz > exec iferr 1 : where
>>> >> ==> 1 AbstractException >> _signalWith:             (envId 0) @6 line
>>> 25
>>> >> [methId 4496129]
>>> >> 2 [] in  AbstractException >> _signalAsync      (envId 0) @2 line 19
>>> >> [methId 4605185]
>>> >> 3 ExecBlock >> ensure:                          (envId 0) @2 line 12
>>> >> [methId 3709185]
>>> >> 4 AbstractException >> _signalAsync             (envId 0) @6 line 21
>>> >> [methId 4600321]
>>> >> 5 ProcessorScheduler >> _reapEvents:            (envId 0) @96 line 86
>>> >> [methId 4810241]
>>> >> 6 ProcessorScheduler >> _findReadyProcess:      (envId 0) @2 line 8
>>> >> [methId 4815105]
>>> >> 7 ProcessorScheduler >> _reschedule             (envId 0) @3 line 8
>>> >> [methId 4805121]
>>> >> 8 GsProcess >> _wait                            (envId 0) @3 line 3
>>> >> [methId 4486145]
>>> >> 9 Delay >> _wait:                               (envId 0) @26 line 26
>>> >> [methId 13285377]
>>> >> 10 Delay >> wait                                 (envId 0) @3 line 5
>>> >> [methId 13307905]
>>> >> 11 [] in  Executed Code                          (envId 0) @3 line 32
>>> >> [methId 318074625]
>>> >> 12 GsProcess >> _start                           (envId 0) @7 line 16
>>> >> [methId 4467969]
>>> >> 13 GsNMethod class >> _gsReturnToC               (envId 0) @1 line 1
>>> >> [methId 4461825]
>>> >>   [GsProcess 318063361]
>>> >> topaz 1> topaz 1> [268 sz:0 cls: 68097 Boolean] true
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>> Johan
>>> >>>
>>> >>> On 26 Nov 2013, at 00:33, Mariano Martinez Peck <
>>>
>>> > marianopeck@
>>>
>>> > >
>>> >>> wrote:
>>> >>>
>>> >>> Also, when I see that 1-minute request in the browser development
>>> tools,
>>> >>> it appears like 2ms receiving, 1 minute waiting...weird!
>>> >>>
>>> >>>
>>> >>> On Mon, Nov 25, 2013 at 8:26 PM, Mariano Martinez Peck <
>>> >>>
>>>
>>> > marianopeck@
>>>
>>> >> wrote:
>>> >>>
>>> >>>> HI guys. This is the first time I am trying to use FastCGI. For some
>>> >>>> reason, i have a timeout with fastCGI.
>>> >>>> I started the Gems either with:
>>> >>>>
>>> >>>>
>>> >>>> WAGemStoneRunSeasideGems default
>>> >>>>         name: 'FastCGI';
>>> >>>>         adaptorClass: WAFastCGIAdaptor;
>>> >>>>         ports: #(9001 9002 9003).
>>> >>>> WAGemStoneRunSeasideGems restartGems.
>>> >>>>
>>> >>>> (from GemTools) or from command line:
>>> >>>>
>>> >>>> ./startSeaside_FastCGI 9001
>>> >>>>
>>> >>>> In either case, what happens is that I can browse my app
>>> >>>> http:localhost:9001/dp (and 9002 and 9003). But when I do a request
>>> to
>>> >>>> the
>>> >>>> server trough my FastCGI, it takes ONE MINUTE to load the page. I am
>>> >>>> using
>>> >>>> Nginx in the backend with this configuration:
>>> >>>>
>>> >>>>     upstream seaside
>>> >>>>       {
>>> >>>>         server localhost:9001;
>>> >>>>         server localhost:9002;
>>> >>>>         server localhost:9003;
>>> >>>>       }
>>> >>>>
>>> >>>>       server
>>> >>>>       {
>>> >>>>
>>> >>>> listen 33300;
>>> >>>> server_name localhost;
>>> >>>>         root /var/www/glass/;
>>> >>>>
>>> >>>>
>>> >>>>     access_log  /var/log/nginx/fastCGI.log;
>>> >>>>     error_log  /var/log/nginx/error.log info;
>>> >>>>
>>> >>>>         location /
>>> >>>>         {
>>> >>>>             try_files $uri @seaside;
>>> >>>>   #  error_page 403 404 = @seaside;
>>> >>>>         }
>>> >>>>
>>> >>>>         location @seaside
>>> >>>>         {
>>> >>>>           include fastcgi_params;
>>> >>>>           fastcgi_pass seaside;
>>> >>>>         }
>>> >>>>       }
>>> >>>>
>>> >>>>
>>> >>>> Using the webtools of the browser, I see that each time I go to a
>>> page,
>>> >>>> there is either a GET or a POST that takes exactly 1 minute. I
>>> >>>> mean...nothing happens, until one minute passes and then the page is
>>> >>>> rendered. The requests are small.  And in fact, browsing directly to
>>> >>>> 9001/2/3 works perfectly. So I guess this is a one minute timeout of
>>> >>>> something.
>>> >>>>
>>> >>>> For what I can see, it could be related to the
>>> >>>>
>>> >>>>  keepalive_timeout  65;
>>> >>>>
>>> >>>> And in nginx error.log I see something like:
>>> >>>>
>>> >>>> 2013/11/25 17:38:18 [error] 22835#0: *73 upstream timed out (60:
>>> >>>> Operation timed out) while reading response header from upstream,
>>> >>>> client:
>>> >>>> 127.0.0.1, server: localhost, request: "GET /dp HTTP/1.1", upstream:
>>> >>>> "fastcgi://127.0.0.1:9001", host: "localhost:33300"
>>> >>>> 2013/11/25 17:38:18 [info] 22835#0: *73 kevent() reported that
>>> client
>>> >>>> 127.0.0.1 closed keepalive connection
>>> >>>>
>>> >>>> Any ideas what could be wrong?
>>> >>>>
>>> >>>>
>>> >>>> Thanks!
>>> >>>>
>>> >>>> --
>>> >>>> Mariano
>>> >>>> http://marianopeck.wordpress.com
>>> >>>>
>>> >>>
>>> >>>
>>> >>>
>>> >>> --
>>> >>> Mariano
>>> >>> http://marianopeck.wordpress.com
>>> >>>
>>> >>> _______________________________________________
>>> >>>
>>> >>> Glass mailing list
>>> >>>
>>>
>>> > Glass at .gemtalksystems
>>>
>>> >>> http://lists.gemtalksystems.com/mailman/listinfo/glass
>>> >>>
>>> >>>
>>> >>
>>> >>
>>> >> --
>>> >> Mariano
>>> >> http://marianopeck.wordpress.com
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > Mariano
>>> > http://marianopeck.wordpress.com
>>> >
>>> > _______________________________________________
>>> > Glass mailing list
>>>
>>> > Glass at .gemtalksystems
>>>
>>> > http://lists.gemtalksystems.com/mailman/listinfo/glass
>>>
>>>
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://forum.world.st/Glass-Nginx-FastCGI-and-GLASS-problem-tp4725218p4727568.html
>>> Sent from the GLASS mailing list archive at Nabble.com.
>>> _______________________________________________
>>> Glass mailing list
>>> Glass at lists.gemtalksystems.com
>>> http://lists.gemtalksystems.com/mailman/listinfo/glass
>>>
>>
>>
>>
>> --
>> Mariano
>> http://marianopeck.wordpress.com
>>
>
>
>
> --
> Mariano
> http://marianopeck.wordpress.com
>

-- 
Mariano
http://marianopeck.wordpress.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gemtalksystems.com/mailman/private/glass/attachments/20131205/593f55a1/attachment-0001.html>