
Locking when reconnecting a socket

 
J Holls
Greenhorn
Posts: 6
I have a resource that must be shared across multiple threads. Simply put, it is a telnet client from Commons Net with a wrapper around it that formats commands so the server can understand them. I can only have one connection at a time. Also, the endpoint I am connecting to can fail over to a standby system. When this happens there are no warnings and, as can be expected, all calls to the system that is no longer active begin to fail. What I am trying to do in my client code is notice the failures and issue a reconnect, which will also renegotiate which system is currently active. The problem I am running into is this: let's say I have 5 threads currently executing across this object. When it fails, it fails for all of them, so each thread begins the reconnect process. This causes issues because only one connection can be active, and this attempts to create N connections, where N is the number of active threads accessing the object.

Right now the "execute" portion looks like this:




I tossed an extra synchronized (this) in there but that didn't work; connect and disconnect are both synchronized. I know I am not locking in the right place.

What I want to happen is that each thread does its work as normal. When an error is detected, the thread that first detected it initiates the restart procedure, and this blocks all other threads until it has successfully done so. If it fails to do so, I want the app to throw an uncaught exception and exit.
 
Jim Hoglund
Ranch Hand
Posts: 525
I think I see what you would like, but can you be more specific
about what is happening now? What errors and other symptoms
do you see?
Jim...
 
J Holls
Greenhorn
Posts: 6

OK, the snippet I showed was from the facade class. This class takes a list of systems to do its work on. Each time it does its work, it gets the active system and then executes the command across it. Should the command fail, the facade needs to reattempt the command. Before reattempting, the facade calls each system's disconnect method in an attempt to clean up resources, and then reconnects each system. The problem is that when command execution fails for whatever reason and it goes down the retry path, the reconnect is attempted once per thread that is using the facade. As I said in my initial post, this is not good because I can only have one active connection (the remote end's rules, not mine).

So let's say I have 7 threads accessing the facade, each trying to execute a command, and the work thread 1 was doing fails. This initiates a reconnect for all threads. Since I am sharing a single resource on the remote end I cannot have multiple connections, and this results in me being locked out and all connections failing.

What needs to happen is that when one thread fails, all other threads must be blocked while it reconnects. Once it has reconnected, it can begin its work again.

After explaining this, I'm thinking I need some sort of object to lock on when a failure is detected. Inside the reconnect code, lock the object; this would prevent all of the other threads from accessing it and would block them. Make the reconnect attempt, then unlock the object and allow the process to retry. If it fails again, then so be it: the "next first" thread to encounter a failure will try to renegotiate again.
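One wrinkle with a plain lock object is that the threads blocked on it have usually seen the same failure, so each of them would reconnect again as soon as it got the lock. A minimal sketch of one way around that, assuming the underlying send call is already thread-safe: remember a "generation" number before each call and reconnect only if nobody else has bumped it in the meantime. The names GuardedConnection, Transport, send and reconnect are made up for illustration and are not from the actual code or from Commons Net.

```java
import java.io.IOException;

// Hypothetical sketch: GuardedConnection and Transport are illustrative names,
// not taken from the original code or from Commons Net.
class GuardedConnection {
    /** Stand-in for the real telnet wrapper (assumed to be thread-safe). */
    interface Transport {
        String send(String command) throws IOException;
        void reconnect() throws IOException;
    }

    private final Object reconnectLock = new Object();
    private final Transport transport;
    // Bumped after every successful reconnect; volatile so readers see the update.
    private volatile int generation = 0;

    GuardedConnection(Transport transport) {
        this.transport = transport;
    }

    String execute(String command) throws IOException {
        int seen = generation; // which connection this attempt is using
        try {
            return transport.send(command);
        } catch (IOException firstFailure) {
            reconnectIfStillOn(seen);
            return transport.send(command); // one retry on the (possibly new) connection
        }
    }

    private void reconnectIfStillOn(int seen) throws IOException {
        synchronized (reconnectLock) {
            // If another thread already reconnected while this one was waiting,
            // the generation has moved on and there is nothing left to do.
            if (generation == seen) {
                transport.reconnect();
                generation = seen + 1;
            }
        }
    }
}
```

If reconnect() itself throws, the exception propagates to the caller and the generation is left unchanged, so the next thread to hit a failure will try to renegotiate again.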

 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13078
let's say I have 5 threads currently executing across this


I don't see how you can have 5 Threads using the same socket "at the same time" - how would you know which one gets the response?

Why not put all of your socket communication in a single object which is shared by the other threads with a typical locking mechanism?

That way only one Thread will see the loss of the connection and be responsible for re-establishing the connection.
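A minimal sketch of that idea, assuming one synchronized method covers the whole send, receive and reconnect cycle. SharedChannel, Link, roundTrip and reconnect are hypothetical names used only for illustration.

```java
import java.io.IOException;

// Hypothetical sketch: SharedChannel and Link are illustrative names only.
class SharedChannel {
    /** Stand-in for whatever actually owns the socket. */
    interface Link {
        String roundTrip(String command) throws IOException; // write command, read reply
        void reconnect() throws IOException;
    }

    private final Link link;

    SharedChannel(Link link) {
        this.link = link;
    }

    // A single monitor guards the request, the response, and any reconnect,
    // so callers queue up here and each reply goes back to the thread that asked.
    synchronized String execute(String command) throws IOException {
        try {
            return link.roundTrip(command);
        } catch (IOException lostConnection) {
            // Every other caller is still waiting on the monitor, so only
            // this thread ever sees the failure and re-establishes the link.
            link.reconnect();
            return link.roundTrip(command); // retry once on the fresh connection
        }
    }
}
```

The cost of this design is that commands are fully serialized, which matches a remote end that allows only one connection anyway.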

Bill
 
J Holls
Greenhorn
Posts: 6
William Brogden wrote:
let's say I have 5 threads currently executing across this


I don't see how you can have 5 Threads using the same socket "at the same time" - how would you know which one gets the response?

Why not put all of your socket communication in a single object which is shared by the other processes with a typical locking mechanism.

That way only one Thread will see the loss of the connection and be responsible for re-establishing the connection.

Bill


Sorry for not explaining this better, and for the potential thread necro, but I wanted to post an update in case others are having similar problems, and possibly to get a sanity check.


Here are the basics of what I have, and what I have done that seems to be working.

I have an interface called Session. Its definition is:


I have the following objects that implement this interface:
SessionImpl, which has the actual socket and all of the logic to read and write commands on that socket
CollapsedSession, which has a collection of Sessions that it executes the commands across

The implementation of executeCommand for the CollapsedSession is the one listed in the original post. It is here that I had my issues. Every thread that needed to execute a command would do so on the CollapsedSession. I was locking properly if the command could simply be executed; however, if something happened and I had to renegotiate which system was the active one, it was renegotiating once per thread that had access to the CollapsedSession. That caused me to get locked out by the server at the remote end, failing all other commands.

What I have done now is add a reentrant lock to the CollapsedSession. On each call to executeCommand, it will try to acquire the lock. If it gets the lock, it will execute the command; should this fail, it will try to renegotiate the connection and send the command again, as long as there are retries left.

This is working, and here is the modified code:


I'm working on cleaning it up and testing it further; however, I have been able to simulate a dropped connection and it worked as expected (only one attempt at renegotiation).
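For anyone who wants the general shape of the lock-plus-retry approach, here is a rough sketch, not the actual CollapsedSession code. It assumes a blocking lock() call rather than tryLock(), and it assumes that running out of retries should surface as an unchecked exception. LockedSession, Backend and renegotiate are hypothetical names.

```java
import java.io.IOException;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: LockedSession and Backend are illustrative names,
// not the real CollapsedSession or Session types.
class LockedSession {
    /** Stand-in for the underlying sessions and the failover logic. */
    interface Backend {
        String executeCommand(String command) throws IOException;
        void renegotiate() throws IOException; // disconnect, find the active system, reconnect
    }

    private final ReentrantLock lock = new ReentrantLock();
    private final Backend backend;
    private final int maxRetries;

    LockedSession(Backend backend, int maxRetries) {
        this.backend = backend;
        this.maxRetries = maxRetries;
    }

    String executeCommand(String command) {
        lock.lock(); // every caller waits here, including during a renegotiation
        try {
            IOException last = null;
            for (int attempt = 0; attempt <= maxRetries; attempt++) {
                try {
                    if (attempt > 0) {
                        backend.renegotiate(); // only the lock holder ever does this
                    }
                    return backend.executeCommand(command);
                } catch (IOException e) {
                    last = e; // remember the cause and fall through to the next attempt
                }
            }
            // Unchecked on purpose: if it is never caught, it ends the calling thread.
            throw new IllegalStateException("giving up after " + maxRetries + " retries", last);
        } finally {
            lock.unlock(); // always release, even when giving up
        }
    }
}
```

Because the command and the renegotiation both happen while the lock is held, a waiting thread only gets in once the connection is usable again or the holder has given up.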
 