[Glass] Nginx , FastCGI and GLASS problem

Wed Dec 4 19:33:38 PST 2013

mmmm I could reproduce the problem even that long request....some problem
as a few days back...my gems start to go down...just browsing a bit my
app...
In fact...I have now particular button that saves an object and that
"hungs" the gem....
Ohhh ah I have the same problem with Swazoo so I doesn't seem FastCGI
related...
Weird that in my local gemstone I have no problem.
So it really seems like something is happening in the remote gemstone that
doesn't happen here. I will try to do a backup in remote one and restore
here to see if I can reproduce it. If not, then there is something in the
server installation.

Thanks guys. I will continue digging and let you know.

Cheers,

On Wed, Dec 4, 2013 at 8:12 PM, Mariano Martinez Peck <marianopeck at gmail.com
> wrote:

>
>
> On Wed, Dec 4, 2013 at 7:29 PM, Paul DeBruicker <pdebruic at gmail.com>wrote:
>
>> Sounds to me like you have an error in your Seaside app or your own code
>> that
>> kills your FastCGI gems.
>>
>
> Weird :(   The only problem I found strange was clicking in "update
> button" which took more than 30 seconds. See thread "[GS/SS Beta] What do
> these error messages mean?".
> Does this sound like the problem?
>
>
>>
>> In your Seaside app are you using the WAGemStoneProductionErrorHandler?
>
>
> No, I am using WARemoteDebuggingWalkbackErrorHandler since I am still in
> development and I really need to be able to debug problems.
> I can temporarily move to WAGemStoneProductionErrorHandler to see if this
> is the cause of the problem.
>
>
>> If
>> not, switch to it.  If so, are any errors getting put on the ObjectLog?
>>
>>
> I will check.
>
>
>>
>>
>> Do you have monit or daemontools monitoring your FastCGI gems to restart
>> them automatically when they become unresponsive?
>>
>>
> Not for FastCGI in particular, but I do have for nginx (monit):
>
> $ cat /etc/monit.d/nginx
> check process nginx with pidfile /var/run/nginx.pid
> group nginx
> start program = "/etc/init.d/nginx start"
> stop program = "/etc/init.d/nginx stop"
> if failed host 127.0.0.1 port 80 protocol http then restart
> if 5 restarts within 5 cycles then timeout
>
>
>
> Thanks Paul!!
>
>
>
>>
>>
>>
>> Mariano Martinez Peck wrote
>> > ObjectLog, from yesterday:  FastCGI: topaz exit     Object: port: 9003
>> > pid:
>> > 2743
>> >
>> >
>> > And now, FastCGI of 9001 (gem 1)
>> >
>> > Unreportable ERROR Encountered:
>> > 2013-12-04T07:22:54.77649903297424-08:00UserDefinedError:
>> > send(15,0x7f352f636400,49,0) failed with errno=107,ENOTCONN, Socket is
>> not
>> > connected-----------
>> >
>> > So I got this error in each of the gems....
>> >
>> > Any idea?
>> >
>> >
>> >
>> > On Wed, Dec 4, 2013 at 9:56 AM, Mariano Martinez Peck <
>>
>> > marianopeck@
>>
>> > > wrote:
>> >
>> >>
>> >>
>> >>
>> >> On Tue, Nov 26, 2013 at 3:22 AM, Johan Brichau <
>>
>> > johan@
>>
>> > > wrote:
>> >>
>> >>> Mariano,
>> >>>
>> >>> I would guess not all fastcgi gems are running correctly on all ports.
>> >>> There is a default timeout of 60s in nginx before it will pass a
>> fastcgi
>> >>> request to another port in the upstream configuration.
>> >>>
>> >>> The reason you see the delay with the post and not with the following
>> >>> get
>> >>> might be because nginx is doing a round-trip load balance over the
>> >>> fastcgi
>> >>> backends and one request goes to a working port but the next is not
>> (or
>> >>> vice-versa).
>> >>>
>> >>> You might want to experiment by limiting the nginx conf to a single
>> >>> fastcgi backend and/or change the default timeout and put some logging
>> >>> code
>> >>> in your app when the request gets to seaside. That would at least
>> allow
>> >>> to
>> >>> find on what to focus.
>> >>>
>> >>>
>> >>
>> >> Thanks Johan, that was indeed the problem. This is the second time that
>> >> happened to me. If I restart everything, it will probably be "fixed".
>> But
>> >> I
>> >> don't want to do that to really fix the problem. So 9001 is answering
>> >> correct, yet, 9002 yields timeone and 9003 yields could not connect
>> after
>> >> 1
>> >> minute.... I tried by doing:
>> >>
>> >> curl -v http://localhost:9003/myApp
>> >>
>> >> So...the thing is how to know what the hell is happening that some of
>> my
>> >> gems (or  FastCGI adaptors) go down for some reason.
>> >> I have checked all my gemstone logs. 2 things look suspicious.
>> >> FastCGI_server-9003.log shows that I have a button that updates my app
>> >> code
>> >> in GemStone which is causing this:
>> >>
>> >> a RepositoryViewLost occurred (notification 3031), When running outside
>> >> of a transaction the session failed to respond quickly enough to a
>> >> request from Stone to abort.  All object caches have been cleared.
>> >> nil
>> >>
>> >> Then,there are some:  ----------- Commit failure - retrying LOG ENTRY:
>> >> aSymbolDictionary-----------
>> >>
>> >> Here is the pieces of logs I could rescue. Maybe someone could help me
>> >> figure out what is happening!
>> >>
>> >> Thanks in advance!!
>> >>
>> >>
>> >> seaside_2485pcmon.log
>> >>
>> >> [12/04/13 02:17:21.210 EST]: Client died: Slot   12, PID    2743,
>> >> LostOtFlags    0, Name FastCGI-11-DataCurator
>> >> [12/04/13 02:17:21.210 EST]: Starting crashed client recovery: Slot 12,
>> >> PID 2743, Name FastCGI-11-DataCurator
>> >>     Cleaned up locked/pinned frame 550 for crashed process 2743
>> >>     Cleaned up locked/pinned frame 11545 for crashed process 2743
>> >>     Cleaned up locked/pinned frame 3929 for crashed process 2743
>> >>     Cleaned up locked/pinned frame 11407 for crashed process 2743
>> >>     Disposing free frame cache from slot 12:  4 of 4 frames recovered.
>> >> [12/04/13 02:17:23.451 EST]: Finished crashed client recovery: Slot 12,
>> >> PID 2743, Name FastCGI-11-DataCurator
>> >>
>> >>
>> >> FastCGI_server-9001.log  (which is the only gem/adaptor that seems to
>> be
>> >> working)
>> >>
>> >> paz 1> topaz 1> topaz 1> topaz 1> FastCGI Server started on port 9001
>> >> -----------  Unreportable ERROR Encountered:
>> >> 2013-12-03T22:52:06.46091294288635-08:00UserDefinedError:
>> >> send(16,0x7f352f6ebba8,7416,0) failed with errno=107,ENOTCONN, Socket
>> is
>> >> not connected-----------
>> >> -----------  Unreportable ERROR Encountered:
>> >> 2013-12-03T22:53:06.45396900177002-08:00UserDefinedError:
>> >> send(17,0x7f352f786110,7416,0) failed with errno=107,ENOTCONN, Socket
>> is
>> >> not connected-----------
>> >> -----------  Unreportable ERROR Encountered:
>> >> 2013-12-03T23:05:07.57434701919556-08:00UserDefinedError:
>> >> send(14,0x7f352f65e6c0,2673,0) failed with errno=107,ENOTCONN, Socket
>> is
>> >> not connected-----------
>> >> -----------  Unreportable ERROR Encountered:
>> >> 2013-12-03T23:05:31.43626189231873-08:00UserDefinedError:
>> >> send(15,0x7f352f6bed60,2673,0) failed with errno=107,ENOTCONN, Socket
>> is
>> >> not connected-----------
>> >> -----------  Unreportable ERROR Encountered:
>> >> 2013-12-03T23:14:13.39595103263855-08:00UserDefinedError:
>> >> send(16,0x7f352f71f7e0,2673,0) failed with errno=107,ENOTCONN, Socket
>> is
>> >> not connected-----------
>> >> -----------  Unreportable ERROR Encountered:
>> >> 2013-12-03T23:15:13.47269105911255-08:00UserDefinedError:
>> >> send(17,0x7f352f77ff90,2673,0) failed with errno=107,ENOTCONN, Socket
>> is
>> >> not connected-----------
>> >>
>> >>
>> >> FastCGI_server-9002.log
>> >>
>> >> failure
>> >> Read-Write Conflicts...
>> >> Write-Write Conflicts...
>> >>     383168257
>> >> Write-Dependency Conflicts...
>> >> Write-ReadLock Conflicts...
>> >> Write-WriteLock Conflicts...
>> >> Rc-Write-Write Conflicts...
>> >> Synchronized-Commit Conflicts...
>> >> ----------- Commit failure - retrying LOG ENTRY:
>> >> aSymbolDictionary-----------
>> >> failure
>> >> Read-Write Conflicts...
>> >> Write-Write Conflicts...
>> >>     383168257
>> >> Write-Dependency Conflicts...
>> >> Write-ReadLock Conflicts...
>> >> Write-WriteLock Conflicts...
>> >> Rc-Write-Write Conflicts...
>> >> Synchronized-Commit Conflicts...
>> >> ----------- Commit failure - retrying LOG ENTRY:
>> >> aSymbolDictionary-----------
>> >> -----------  Unreportable ERROR Encountered:
>> >> 2013-12-04T04:36:49.04186797142029-08:00UserDefinedError:
>> >> send(16,0x7f21f9462268,2673,0) failed with errno=107,ENOTCONN, Socket
>> is
>> >> not connected-----------
>> >> -----------  Unreportable ERROR Encountered:
>> >> 2013-12-04T04:36:49.0669538974762-08:00UserDefinedError:
>> >> send(17,0x7f21f94ed6c8,2673,0) failed with errno=107,ENOTCONN, Socket
>> is
>> >> not connected-----------
>> >> -----------  Unreportable ERROR Encountered:
>> >> 2013-12-04T04:36:49.10030889511108-08:00UserDefinedError:
>> >> send(18,0x7f21f9551ae8,2673,0) failed with errno=107,ENOTCONN, Socket
>> is
>> >> not connected-----------
>> >> -----------  Unreportable ERROR Encountered:
>> >> 2013-12-04T04:36:49.11388611793518-08:00UserDefinedError:
>> >> send(14,0x7f21f95b2918,2673,0) failed with errno=107,ENOTCONN, Socket
>> is
>> >> not connected-----------
>> >> -----------  Unreportable ERROR Encountered:
>> >> 2013-12-04T04:36:49.12687110900879-08:00UserDefinedError:
>> >> send(19,0x7f21f96184e8,2673,0) failed with errno=107,ENOTCONN, Socket
>> is
>> >> not connected-----------
>> >> -----------  Unreportable ERROR Encountered:
>> >> 2013-12-04T04:36:49.12758493423462-08:00UserDefinedError: Incorrect
>> FCGI
>> >> version. Expected: 1 Received: 71-----------
>> >> -----------  Unreportable ERROR Encountered:
>> >> 2013-12-04T04:36:49.15798401832581-08:00UserDefinedError:
>> >> send(15,0x7f21f968f7d8,2673,0) failed with errno=107,ENOTCONN, Socket
>> is
>> >> not connected-----------
>> >>
>> >>
>> >>
>> >> FastCGI_server-9003.log
>> >>
>> >>
>> >> --transcript--'...finished 1.4-baseline'
>> >> failure
>> >> Read-Write Conflicts...
>> >> Write-Write Conflicts...
>> >>     288128769
>> >>     294245633
>> >> Write-Dependency Conflicts...
>> >> Write-ReadLock Conflicts...
>> >> Write-WriteLock Conflicts...
>> >> Rc-Write-Write Conflicts...
>> >> Synchronized-Commit Conflicts...
>> >> ----------- Commit failure - retrying LOG ENTRY:
>> >> aSymbolDictionary-----------
>> >> -----------------------------------------------------
>> >> GemStone: Error         Nonfatal
>> >> a RepositoryViewLost occurred (notification 3031), When running outside
>> >> of a transaction the session failed to respond quickly enough to a
>> >> request from Stone to abort.  All object caches have been cleared.
>> >> nil
>> >> Error Category: 231169 [GemStone] Number: 3031  Arg Count: 1 Context :
>> >> 318063361 exception : 318063105
>> >> Arg 1: [20 sz:0 cls: 76289 UndefinedObject] nil
>> >> topaz > exec iferr 1 : where
>> >> ==> 1 AbstractException >> _signalWith:             (envId 0) @6 line
>> 25
>> >> [methId 4496129]
>> >> 2 [] in  AbstractException >> _signalAsync      (envId 0) @2 line 19
>> >> [methId 4605185]
>> >> 3 ExecBlock >> ensure:                          (envId 0) @2 line 12
>> >> [methId 3709185]
>> >> 4 AbstractException >> _signalAsync             (envId 0) @6 line 21
>> >> [methId 4600321]
>> >> 5 ProcessorScheduler >> _reapEvents:            (envId 0) @96 line 86
>> >> [methId 4810241]
>> >> 6 ProcessorScheduler >> _findReadyProcess:      (envId 0) @2 line 8
>> >> [methId 4815105]
>> >> 7 ProcessorScheduler >> _reschedule             (envId 0) @3 line 8
>> >> [methId 4805121]
>> >> 8 GsProcess >> _wait                            (envId 0) @3 line 3
>> >> [methId 4486145]
>> >> 9 Delay >> _wait:                               (envId 0) @26 line 26
>> >> [methId 13285377]
>> >> 10 Delay >> wait                                 (envId 0) @3 line 5
>> >> [methId 13307905]
>> >> 11 [] in  Executed Code                          (envId 0) @3 line 32
>> >> [methId 318074625]
>> >> 12 GsProcess >> _start                           (envId 0) @7 line 16
>> >> [methId 4467969]
>> >> 13 GsNMethod class >> _gsReturnToC               (envId 0) @1 line 1
>> >> [methId 4461825]
>> >>   [GsProcess 318063361]
>> >> topaz 1> topaz 1> [268 sz:0 cls: 68097 Boolean] true
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>> Johan
>> >>>
>> >>> On 26 Nov 2013, at 00:33, Mariano Martinez Peck <
>>
>> > marianopeck@
>>
>> > >
>> >>> wrote:
>> >>>
>> >>> Also, when I see that 1-minute request in the browser development
>> tools,
>> >>> it appears like 2ms receiving, 1 minute waiting...weird!
>> >>>
>> >>>
>> >>> On Mon, Nov 25, 2013 at 8:26 PM, Mariano Martinez Peck <
>> >>>
>>
>> > marianopeck@
>>
>> >> wrote:
>> >>>
>> >>>> HI guys. This is the first time I am trying to use FastCGI. For some
>> >>>> reason, i have a timeout with fastCGI.
>> >>>> I started the Gems either with:
>> >>>>
>> >>>>
>> >>>> WAGemStoneRunSeasideGems default
>> >>>>         name: 'FastCGI';
>> >>>>         adaptorClass: WAFastCGIAdaptor;
>> >>>>         ports: #(9001 9002 9003).
>> >>>> WAGemStoneRunSeasideGems restartGems.
>> >>>>
>> >>>> (from GemTools) or from command line:
>> >>>>
>> >>>> ./startSeaside_FastCGI 9001
>> >>>>
>> >>>> In either case, what happens is that I can browse my app
>> >>>> http:localhost:9001/dp (and 9002 and 9003). But when I do a request
>> to
>> >>>> the
>> >>>> server trough my FastCGI, it takes ONE MINUTE to load the page. I am
>> >>>> using
>> >>>> Nginx in the backend with this configuration:
>> >>>>
>> >>>>     upstream seaside
>> >>>>       {
>> >>>>         server localhost:9001;
>> >>>>         server localhost:9002;
>> >>>>         server localhost:9003;
>> >>>>       }
>> >>>>
>> >>>>       server
>> >>>>       {
>> >>>>
>> >>>> listen 33300;
>> >>>> server_name localhost;
>> >>>>         root /var/www/glass/;
>> >>>>
>> >>>>
>> >>>>     access_log  /var/log/nginx/fastCGI.log;
>> >>>>     error_log  /var/log/nginx/error.log info;
>> >>>>
>> >>>>         location /
>> >>>>         {
>> >>>>             try_files $uri @seaside;
>> >>>>   #  error_page 403 404 = @seaside;
>> >>>>         }
>> >>>>
>> >>>>         location @seaside
>> >>>>         {
>> >>>>           include fastcgi_params;
>> >>>>           fastcgi_pass seaside;
>> >>>>         }
>> >>>>       }
>> >>>>
>> >>>>
>> >>>> Using the webtools of the browser, I see that each time I go to a
>> page,
>> >>>> there is either a GET or a POST that takes exactly 1 minute. I
>> >>>> mean...nothing happens, until one minute passes and then the page is
>> >>>> rendered. The requests are small.  And in fact, browsing directly to
>> >>>> 9001/2/3 works perfectly. So I guess this is a one minute timeout of
>> >>>> something.
>> >>>>
>> >>>> For what I can see, it could be related to the
>> >>>>
>> >>>>  keepalive_timeout  65;
>> >>>>
>> >>>> And in nginx error.log I see something like:
>> >>>>
>> >>>> 2013/11/25 17:38:18 [error] 22835#0: *73 upstream timed out (60:
>> >>>> Operation timed out) while reading response header from upstream,
>> >>>> client:
>> >>>> 127.0.0.1, server: localhost, request: "GET /dp HTTP/1.1", upstream:
>> >>>> "fastcgi://127.0.0.1:9001", host: "localhost:33300"
>> >>>> 2013/11/25 17:38:18 [info] 22835#0: *73 kevent() reported that client
>> >>>> 127.0.0.1 closed keepalive connection
>> >>>>
>> >>>> Any ideas what could be wrong?
>> >>>>
>> >>>>
>> >>>> Thanks!
>> >>>>
>> >>>> --
>> >>>> Mariano
>> >>>> http://marianopeck.wordpress.com
>> >>>>
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> Mariano
>> >>> http://marianopeck.wordpress.com
>> >>>
>> >>> _______________________________________________
>> >>>
>> >>> Glass mailing list
>> >>>
>>
>> > Glass at .gemtalksystems
>>
>> >>> http://lists.gemtalksystems.com/mailman/listinfo/glass
>> >>>
>> >>>
>> >>
>> >>
>> >> --
>> >> Mariano
>> >> http://marianopeck.wordpress.com
>> >>
>> >
>> >
>> >
>> > --
>> > Mariano
>> > http://marianopeck.wordpress.com
>> >
>> > _______________________________________________
>> > Glass mailing list
>>
>> > Glass at .gemtalksystems
>>
>> > http://lists.gemtalksystems.com/mailman/listinfo/glass
>>
>>
>>
>>
>>
>> --
>> View this message in context:
>> http://forum.world.st/Glass-Nginx-FastCGI-and-GLASS-problem-tp4725218p4727568.html
>> Sent from the GLASS mailing list archive at Nabble.com.
>> _______________________________________________
>> Glass mailing list
>> Glass at lists.gemtalksystems.com
>> http://lists.gemtalksystems.com/mailman/listinfo/glass
>>
>
>
>
> --
> Mariano
> http://marianopeck.wordpress.com
>

-- 
Mariano
http://marianopeck.wordpress.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gemtalksystems.com/mailman/private/glass/attachments/20131205/a24d5bac/attachment-0001.html>