Problem Description:
A Java client on Linux uses a JNDI Context to look up a JMS Topic. Once that Topic connection is open, if the server it is connected to is powered down or otherwise becomes unavailable on the network, the JNDI connection never times out, which prevents our failover strategy from switching to the secondary server.
The same code running as a Windows client against the same Linux server works as expected.
Our research suggests that this issue is related to SO_TIMEOUT and SO_KEEPALIVE, and that on Linux the application has to set these values explicitly. In a standalone Java client you would call the following on an instance of java.net.Socket:
setKeepAlive(boolean on)
Enable/disable SO_KEEPALIVE.
and/or
setSoTimeout(int timeout)
Enable/disable SO_TIMEOUT with the specified timeout, in milliseconds.
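For reference, here is a minimal sketch of what those two calls do on a plain socket, assuming Java 7 or later. It is not OC4J code: the class name SocketTimeoutDemo, the method readTimesOut, and the local listener that never replies are all made up for illustration, standing in for a server that has dropped off the network.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class; nothing here is part of OC4J or its JMS client.
class SocketTimeoutDemo {

    /**
     * Connects to a local listener that never sends anything, then reports
     * whether the blocking read gave up after timeoutMillis.
     */
    static boolean readTimesOut(int timeoutMillis) {
        // Port 0 asks the OS for any free port; the listener plays the silent peer.
        try (ServerSocket silentServer = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silentServer.getInetAddress(), silentServer.getLocalPort())) {

            client.setKeepAlive(true);          // SO_KEEPALIVE: let the OS probe an idle connection
            client.setSoTimeout(timeoutMillis); // SO_TIMEOUT: bound each blocking read, in milliseconds

            try {
                client.getInputStream().read(); // blocks indefinitely when no timeout is set
                return false;                   // data or end-of-stream arrived, so no timeout
            } catch (SocketTimeoutException expected) {
                return true;                    // the read was abandoned after timeoutMillis
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(500));
    }
}
```

Note that setSoTimeout only bounds reads on that one socket, and that how soon SO_KEEPALIVE probes are sent is decided by operating system settings, not by the Java call.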
However, we cannot find any way within the OC4J framework to set these values.
Details on the behavior:
- The client is very simple: just the basics to do a Context lookup using com.evermind.server.rmi.RMIInitialContextFactory, set an ExceptionListener, and then publish a message.
- To publish messages, topicPublisher.publish(outMessage); is placed in a loop.
- Run the client. Messages are published.
- Remove the network cable from the server's NIC.
- The client stops publishing messages and simply hangs. There is never a timeout, no exceptions, etc.
- Run the same client code on a Windows platform and repeat the scenario: it works fine.
Things attempted:
- Properties for the InitialContext:
  - "oracle.j2ee.rmi.loadBalance", "lookup" - Did not work because we are not configured in an OC4J cluster
  - "rmi.client.connection.timeout", "20"
Therefore, my conclusion is that the issue is not with the JNDI lookup or other RMI-related configuration; it has to do with how the JMS publisher receives its response. To look into this, I ran strace on the Java client. As you will see below, messages are published, and then when I pull the network connection, the client simply sits and waits in recv for the response to the publish on the JMS session:
gettimeofday({1230053150, 638966}, NULL) = 0
send(7, "\0\0\27p\0\7APP_LOG\10\1\0005Oc4jJMS.Session"..., 275, 0) = 275
recv(7, "\0\0\4\322", 65536, 0) = 4
write(1, "Message Published 20271", 23) = 23
write(1, "\n", 1) = 1
gettimeofday({1230053150, 639711}, NULL) = 0
send(7, "\0\0\27p\0\7APP_LOG\10\1\0005Oc4jJMS.Session"..., 275, 0) = 275
recv(7,
So, let's look at a different question: how do I configure the JMS publisher or consumer to time out, or otherwise recognize that the network connection has been lost?
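One client-side stopgap we can think of is to run each publish on a worker thread and give up waiting after a deadline, so the failover code at least gets a chance to run. The sketch below assumes Java 8 or later and uses only java.util.concurrent; PublishWatchdog and completesWithin are hypothetical names, and the Callable stands in for topicPublisher.publish(outMessage). It is not an OC4J setting and it does not fix the underlying socket.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper; not part of OC4J or the JMS API.
class PublishWatchdog {

    /**
     * Runs one blocking call on a worker thread and waits at most timeoutMillis.
     * Returns true if the call finished in time, false if it is presumed hung.
     */
    static boolean completesWithin(Callable<?> blockingCall, long timeoutMillis) {
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "publish-worker");
            t.setDaemon(true); // a stuck worker must not keep the JVM alive
            return t;
        });
        Future<?> pending = worker.submit(blockingCall);
        try {
            pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException hung) {
            pending.cancel(true); // requests an interrupt; a thread blocked in a socket read may ignore it
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            pending.cancel(true);
            return false;
        } catch (ExecutionException failed) {
            // The call itself threw (for a real publish, e.g. a JMSException); surface the cause.
            throw new IllegalStateException(failed.getCause());
        } finally {
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Stand-ins for topicPublisher.publish(outMessage): one fast call, one that blocks.
        boolean fast = completesWithin(() -> "ok", 1000);
        boolean hung = completesWithin(() -> { Thread.sleep(60_000); return "late"; }, 200);
        System.out.println("fast call completed: " + fast);
        System.out.println("blocked call completed: " + hung);
    }
}
```

When completesWithin returns false, the caller could abandon the session and reconnect to the secondary server. The worker thread itself typically stays blocked in recv, which is why we would still prefer a real timeout setting on the JMS connection.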
Any ideas?? Thanks!!!
### Provide the full build number for the OC4J version that you are running: ###
Oracle Containers for J2EE 10g (10.1.3.4.0) (build 080709.0800.28953)