[GemStone-Smalltalk] Error with netldi, the connection to the repository is never made and the port becomes blocked
James Foster via GemStone-Smalltalk
gemstone-smalltalk at lists.gemtalksystems.com
Tue Mar 28 13:50:37 PDT 2017
Hi Ezequiel,
This is very helpful information. Let me summarize the information I got from the log files:
netldi-1.log (successful connection)
Before connection frame 8 is in waitForEvents when interrupted by signal handler.
After connection frame 8 is again in waitForEvents, as it should be.
netldi-3.log (failed connection)
Before connection frame 8 is in waitForEvents (as above).
After connection things are very different:
#6 <signal handler called>
#7 poll () from /lib64/libc.so.6
#8 __libc_res_nsend () from /lib64/libresolv.so.2
#9 __libc_res_nquery () from /lib64/libresolv.so.2
#10 _nss_dns_gethostbyaddr2_r () from /lib64/libnss_dns.so.2
#11 _nss_dns_gethostbyaddr_r () from /lib64/libnss_dns.so.2
#12 gethostbyaddr_r@@GLIBC_2.2.5 () from /lib64/libc.so.6
#13 getnameinfo () from /lib64/libc.so.6
#14 SocketGetPeerInfo () at build37291/src/socket.c:1714
#15 getServiceForClient () at build37291/src/nldicmn.c:1079
As requested, you repeated the signal after short pauses, and each of the three signals showed the process stuck in the same place (so it is either hung or in a tight loop). So the GemStone code (frame 14) calls getnameinfo() in libc, and that call fails to return. Frames 10/11 show that it is attempting a reverse DNS lookup, that is, finding the host name from the address.
You were able to ping MW from ML using both name and IP (as I suggested). The next thing to try is to see if name lookup works from the command line on ML. After a ping gives you the IP address of MW, try ‘host’ and ‘nslookup’ with that IP address as the argument. Each should return the name (with other information).
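If it helps to reproduce the lookup outside of netldi, here is a minimal sketch in Python that makes the same kind of call seen in frame 13 (getnameinfo on an address) but gives up after a timeout instead of hanging. The function name reverse_lookup, the 5-second default, and the injectable resolver argument are my own choices for illustration and are not part of GemStone; this is only a rough diagnostic, not what netldi itself does.

```python
import socket
import threading


def reverse_lookup(ip, timeout=5.0, resolver=socket.getnameinfo):
    """Reverse-resolve ip to a host name without blocking forever.

    Returns ("ok", name), ("error", message), or ("timeout", None).
    The resolver argument is a hypothetical hook so the function can be
    exercised without touching the real DNS.
    """
    result = {}

    def worker():
        try:
            # NI_NAMEREQD makes the call fail rather than echo back the IP.
            host, _service = resolver((ip, 0), socket.NI_NAMEREQD)
            result["host"] = host
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            result["error"] = str(exc)

    # Daemon thread: a lookup that never returns will not block program exit.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)

    if thread.is_alive():
        return ("timeout", None)
    if "error" in result:
        return ("error", result["error"])
    return ("ok", result["host"])
```

Calling reverse_lookup with the IP address of MW, from a Python session on ML, should come back quickly with ("ok", name) if reverse DNS is healthy. A ("timeout", None) result would be consistent with the hang shown in netldi-3.log.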
A brief Google search suggests that this may be an OS bug and might be fixed by upgrading your Linux kernel. See http://stackoverflow.com/questions/35880798/java-dns-resolution-hangs-forever. As a work-around, you might add MW to the /etc/hosts file on ML. This might avoid going to the DNS for the lookup.
Alternatively, you could upgrade to GS/S 3.2.15 (or 3.3.3) and use the -N option on startnetldi to avoid the name lookup entirely. See https://downloads.gemtalksystems.com/docs/GemStone64/3.2.x/GS64-ReleaseNotes-3.2.15/1-ReleaseNotes.htm#pgfId-1613796.
Let us know what you find!
James
> On Mar 28, 2017, at 12:49 PM, brianstone via GemStone-Smalltalk <gemstone-smalltalk at lists.gemtalksystems.com> wrote:
>
> Hi James,
>
> I ran the tests you mentioned and obtained the following results:
>
> 1) Pinging MW from ML works, using either name or IP. I can state that
> the machines 'see' each other on the network.
>
> 2) I found that netldi does not shut down normally once it has hung. It
> is killed instead of finishing normally.
> I tried to stop it using the 'stopnetldi' command:
> stopnetldi gs64ldi-3281
> stopnetldi[Info]: GemStone version '3.2.8.1'
> stopnetldi[Info]: GemStone server 'gs64ldi-3281' has been stopped.
> Then I checked whether netldi had terminated and saw the following:
> gslist -l
> Status   Version  Owner       Pid   Port  Started      Type   Name
> -------- -------- ----------- ----- ----- ------------ ------ ----
> *killed* 3.2.8.1  gemst643281 16056 50387 Mar 28 15:49 Netldi gs64ldi-3281
> exists   3.2.8.1  gemst643281 15396 39139 Mar 28 15:10 cache  newRepositorystone~7663a27bab8c7a96
> exists   3.2.8.1  gemst643281 15394 41387 Mar 28 15:10 Stone  newRepositorystone
>
> 3) I ran tests sending the SIGUSR1 signal to the netldi process. I noticed
> some differences between a 'normal' log and a log with netldi hung, but I
> could not detect where the problem is.
> Maybe you can read them better; I have attached three files.
>
> a) The first, named "netldi-1.log", is the log that results from
> these actions:
> -Start netldi
> -Send SIGUSR1 signal
> -Establish a session using Jade from a VPN connection
> -Send SIGUSR1 signal
> -Stop netldi
>
> b) The second, named "netldi-2.log", is the log that results from
> these actions:
> -Start netldi
> -Send SIGUSR1 signal
> -Try to establish a session using Jade from a wired connection
> -Send SIGUSR1 signal
> -Stop netldi
>
> c) The third, named "netldi-3.log", has the same content as
> the second but with some additional SIGUSR1 signals sent to netldi.
>
> netldi-1.log <http://forum.world.st/file/n4940263/netldi-1.log>
>
> netldi-2.log <http://forum.world.st/file/n4940263/netldi-2.log>
>
> netldi-3.log <http://forum.world.st/file/n4940263/netldi-3.log>
>
> I hope this information will be useful.
>
> Regarding an additional useful service on the server, I'm afraid that
> there is nothing there that can help us.
> But I will try to install Apache, NodeJS, or something similar in order to
> start a little web server. I don't know when I will have the time to run
> these tests, but I'll try to do it as soon as possible.
>
> Tell me if you need some additional information, please.
>
> Ezequiel