Hi everybody,
Thank you all for entering this discussion.
Funny introduction

While finishing the db part of the assignment (where I invested far too much time and energy IMO, with a global design which is anything but simple, though scalable and performant), I made a promise to myself: "For the next two parts (network and GUI), keep it simple, Phil!"
Well, unfortunately, I had already decided to implement a socket solution before I made that promise; sockets are anything but simple (though I didn't know to what extent); and finally, I am too faithful to my own decisions (if not to my promises).
The Design Choice Issue

Yes, let's take it from the beginning. My instructions (URLyBird 1.2.1) state:
Network Communication Approach
You have a choice regarding the network connection protocol. You must use either serialized objects over a simple socket connection, or RMI. Both options are equally acceptable.
and (under "Packaging of Submissions"):
A file called choices.txt that containing pure ASCII (not a word processor format) text describing the significant design choices you made. Detail the problems you perceived, the issues surrounding them, your value judgments, and the decisions that you made.
It means that you cannot choose one solution over the other without justifying your choice.
Like so many people here, I read Max's book (and for the others: it's the best investment you can make as far as SCJD preparation is concerned). According to his book, the pros of sockets are performance and scalability. I will add this one: sockets are a standard, I mean an open standard, while RMI is "just" a Java standard. Indirectly, that's what Max writes too when he says "... sockets are well suited for sending data, often in compressed form, ...". "Compressed form": you send and receive whatever you want, as long as both sides of the connection agree on it.
To be honest, let's have a look at the cons of sockets (which mirror RMI's pros): they are more complex to implement (low-level, no "network transparency", and the need to build a multi-threaded server yourself).
Here comes the issue. We are not talking about the weather here, but about a design choice: the one which answers this simple question: "Why did you choose sockets over RMI?". And there flies away the design simplicity you had in mind despite your promises, because you simply cannot claim "I chose sockets over RMI for performance, scalability and openness considerations" while coming up with a slow, non-scalable and closed solution.
Sockets may be simple: Ephemeral connections / One thread per connection

Server-side, you just need a ServerSocket accepting on a given port. When a client comes in, it creates a Thread, passes the new connection socket to it, and goes on accepting new connections. That Thread's job is simple too: read from its socket InputStream, interpret what's being read as some Command (an abstract "executable thing"), get the result, send it back to the same socket through its OutputStream, close the socket and die.
Thanks to Java serialization, even the "marshalling" is simple: you get commands with a simple readObject() and send results back with a simple writeObject().
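To make that concrete, here is a minimal sketch of that ephemeral-connection scheme. The Command interface, class names and port number are hypothetical illustrations, not the assignment's API:

```java
import java.io.*;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical Command abstraction: an "executable thing" returning a result.
interface Command extends Serializable {
    Serializable execute();
}

public class SimpleCommandServer {
    public static void main(String[] args) throws IOException {
        ServerSocket serverSocket = new ServerSocket(4444);
        while (true) {
            final Socket client = serverSocket.accept(); // blocks until a client comes in
            new Thread(new Runnable() {                  // one short-lived thread per connection
                public void run() {
                    try {
                        // Create the output stream first and flush its header,
                        // to avoid the classic ObjectInputStream deadlock.
                        ObjectOutputStream out = new ObjectOutputStream(client.getOutputStream());
                        out.flush();
                        ObjectInputStream in = new ObjectInputStream(client.getInputStream());
                        Command command = (Command) in.readObject(); // simple unmarshalling
                        out.writeObject(command.execute());          // simple marshalling
                        out.flush();
                    } catch (Exception e) {
                        e.printStackTrace();
                    } finally {
                        // Ephemeral: close the socket and let the thread die.
                        try { client.close(); } catch (IOException ignored) {}
                    }
                }
            }).start();
        }
    }
}
```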
Unfortunately, that implementation is neither performant, nor scalable, nor open:
- Not performant, because each time some request needs to be sent to the server, a network connection is opened and then closed, which is very time-consuming. To execute the command, a new thread is allocated, which is time-consuming too.
- Not scalable, because threads are a scarce resource on any system: allocating a thread for each connection automatically limits the number of concurrent connections.
- And finally closed, because the only marshalling protocol that basic implementation supports is Java serialization: if you later want to connect to your application server with something other than a Java application, you'll be in big trouble.
Or a little more complex: Permanent connections / Pool of threads

Permanent connections: once accepted by the server (a client connects at start time), a given connection stays open for the whole client's life (unless the connection is broken for any reason, in which case some reconnection must happen).
Pool of threads: we have threads, created and started at server startup, which never die while the server is running. In my implementation I called them Handlers; a number of them are created from the start (a property), up to some maximum number (a property too). Incoming connections are put in a queue and automatically allocated to some handler, which... handles one and puts it back in the queue (if nothing "bad" happened in the meantime from the network point of view).
In theory it is not that much more complex BTW, and it is far more performant and scalable:
More performant, because threads stay alive, and so do network connections.
More scalable, because you may have many more "concurrent" connections than you have threads running.
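A sketch of that handler pool could look like the following. The class and method names are mine (not my actual classes), the request-serving logic is left out, and a real implementation also has to cope with the timeout issue described next:

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class PooledCommandServer {

    // Connections waiting to be served, in FIFO order.
    private final BlockingQueue<Socket> connections = new LinkedBlockingQueue<Socket>();

    public void start(int port, int handlerCount) throws IOException {
        // Fixed pool: handlers are created once and never die.
        for (int i = 0; i < handlerCount; i++) {
            new Thread(new Handler(), "Handler-" + i).start();
        }
        ServerSocket serverSocket = new ServerSocket(port);
        while (true) {
            Socket client = serverSocket.accept();
            client.setSoTimeout(1);  // minimum timeout: don't let a handler block on a quiet client
            connections.add(client); // permanent connection: queued, not closed per request
        }
    }

    private class Handler implements Runnable {
        public void run() {
            while (true) {
                try {
                    Socket connection = connections.take(); // next connection in FIFO order
                    if (serveOnePendingRequest(connection)) {
                        connections.add(connection); // still healthy: back in the queue
                    } else {
                        connection.close(); // broken from the network point of view
                    }
                } catch (Exception e) {
                    // log it, but keep the handler alive
                }
            }
        }
    }

    // Reads at most one pending request and sends its result back; returns
    // false if the connection turned out to be broken. Left out of this sketch.
    private boolean serveOnePendingRequest(Socket connection) {
        return true;
    }
}
```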
But in practice, I noticed that it is difficult to achieve: when you allocate existing connections to threads in some FIFO order, it makes no sense for a given handler to wait long on a given connection, maybe just to notice that it had nothing to do. So I set the timeout of the client socket, server-side, to its minimum value: setSoTimeout(1). Here came the biggest issue I had to solve: a SocketTimeoutException may be thrown when there is nothing to read from the socket and the timeout expires (which I can understand), but sometimes it happens that it's thrown in the middle of a read (!). At first sight it just seems funny, but let's say the client sent some request object serialized into a 3447-byte stream. How do you think your server-side ObjectInputStream.readObject() reacts when it is interrupted after reading just 845 of them? I can tell you: it hates that. At best you get an EOFException, but it can be a StreamCorruptedException as well. I wrote "at best" just because an EOFException is more understandable. But in any case it's an unrecoverable error.
Fortunately, the solution I found was just what I needed to achieve a design decision I had in mind from the beginning: decoupling the marshalling process from the communication layer. Before I come back to that issue, just a few words on that decoupling:
Sockets are a standard, while "serialized objects over socket connections" is a pure java-to-java solution: sockets themselves know only the byte streams they receive and/or send. Objects of any kind must be packaged in some way that both parties (server and client) understand. That's the marshalling process, and Java object serialization is just one such protocol. If you abstract the marshalling away, you get a much more "open" system, open to the outside world but also within the Java one:
Open to the world: as long as client and server can interpret a given byte stream through some common protocol, they can communicate over sockets (a bad example of this would be some (legacy) ASP application querying your (new) Java application server).
Open to Java: let's say that among your 10 CSRs, one of them negotiated the right to work from home. If performance considerations require it, and if your application server supports multiple marshalling schemes (as mine does), it's easy to add one more to support those remote clients: compressed serialized objects (see the sketch below).
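Here is a minimal sketch of what such an abstracted marshalling layer could look like. ObjectConverter and ObjectSerialConverter are named after my classes, but the signatures shown here are guesses, and CompressedSerialConverter is just an illustration of the compressed scheme:

```java
import java.io.*;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// One marshalling protocol: how objects become bytes and vice versa.
interface ObjectConverter {
    byte[] toBytes(Object object) throws IOException;
    Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException;
}

// Protocol 1: plain Java serialization.
class ObjectSerialConverter implements ObjectConverter {
    public byte[] toBytes(Object object) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeObject(object);
        out.flush();
        return bytes.toByteArray();
    }
    public Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        return new ObjectInputStream(new ByteArrayInputStream(bytes)).readObject();
    }
}

// Protocol 2: the same, GZIP-compressed -- handy for slow links
// (the work-from-home CSR of the example above).
class CompressedSerialConverter implements ObjectConverter {
    public byte[] toBytes(Object object) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(new GZIPOutputStream(bytes));
        out.writeObject(object);
        out.close(); // finishes the GZIP stream
        return bytes.toByteArray();
    }
    public Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        return new ObjectInputStream(
            new GZIPInputStream(new ByteArrayInputStream(bytes))).readObject();
    }
}
```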
Back to the SocketTimeoutException issue: the solution simply consists of reading from the socket InputStream as many bytes as you can before getting interrupted, and putting them in some buffer (back to this buffer soon). If 0 bytes are read, you are done. If more are read, you just delegate to some ObjectConverter (back to it soon too), which tries to interpret them and throws an IncompleteObjectException in case the byte stream is incomplete. When that arises (very rarely), you just need to read more bytes until the object is complete. Of course, if an IOException of some sort is thrown in the middle, the process stops, the connection is lost and the client gets a SocketException.
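In code, the reading loop could look something like this sketch. It assumes setSoTimeout(1) was already called on the socket; my real ObjectConverter throws IncompleteObjectException, while this stand-in simply detects incompleteness through the EOFException a truncated byte stream provokes:

```java
import java.io.*;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class BufferedObjectReader {

    // Returns null if the client had nothing pending, the received object
    // otherwise. Assumes socket.setSoTimeout(1) was set beforehand.
    public static Object readPendingObject(Socket socket)
            throws IOException, ClassNotFoundException {
        InputStream in = socket.getInputStream();
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        while (true) {
            try {
                int read = in.read(chunk);
                if (read == -1) {
                    throw new EOFException("connection closed by peer");
                }
                buffer.write(chunk, 0, read); // accumulate whatever we got
            } catch (SocketTimeoutException quietSocket) {
                // The timeout may fire mid-object, but only this read() call is
                // interrupted: deserialization works on the buffered bytes.
                if (buffer.size() == 0) {
                    return null; // nothing to do on this connection, move on
                }
                Object object = tryToConvert(buffer.toByteArray());
                if (object != null) {
                    return object; // complete object: done
                }
                // incomplete object (rare): loop and read more bytes
            }
        }
    }

    // Stand-in for the ObjectConverter: returns null while the bytes do not
    // yet form a complete serialized object.
    private static Object tryToConvert(byte[] bytes)
            throws IOException, ClassNotFoundException {
        try {
            return new ObjectInputStream(new ByteArrayInputStream(bytes)).readObject();
        } catch (EOFException incomplete) {
            return null;
        }
    }
}
```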
I have two things to tell you about that buffer:

It must be "resizable": while reading from the socket InputStream, you have no way to know how much data you'll get. So if the buffer gets full, you may call its increaseCapacity() method (the current size is increased by some increaseCapacityFactor). And when writing an object representation to it (through an OutputStream of my own), you may call its ensureCapacity() method to make sure your object fits in.

It is "soft": thanks to a design mistake, I decided to implement it through a soft reference. Let me explain: initially, such a buffer was owned by my ClientConnection class (bad design), and I thought it would be stupid to see your server run out of memory because of buffers owned by "dormant" connections. Now that the buffers are owned by the connection handlers, there is little point in making them "soft", but I kept that so-called SoftResizableBuffer class as it is. How does it work? The buffer is privately stored as a SoftReference. The garbage collector is allowed to clear soft references before running out of memory, as long as there is no strong reference left to the referenced object. The buffer's public method getBuffer() returns the byte array if it was not cleared (probably to be stored in some normal "strong" reference, preventing the garbage collector from clearing it), or allocates a new one at its initial capacity if it has been cleared. Two additional methods (fix() / unfix()) allow a process to temporarily prevent the buffer from being cleared, without needing to store (and pass along to other methods) the strong reference obtained from getBuffer().

Now, thanks to a bug (handlers were created up to their maximum number even when they weren't needed to serve existing connections) (there are so many "thanks to mistakes" in this paragraph), I saw SoftResizableBuffer really at work: some buffers were reallocated at their initial capacity, grew as needed, then were cleared and reallocated, etc. SoftReferences are magic! After correcting that bug, I wondered whether it wouldn't be better to simplify the class (making it a simple ResizableBuffer). My conclusion is that the "soft" behaviour is still interesting, to smooth memory peaks. Let's say that such a buffer has an initial capacity of 32Kb. Now some handler needs to handle a huge query result (1Mb?). No problem, the buffer grows. But after the result has been sent back to the client, what to do with that "huge" buffer? Deallocate it? That would be a pity, because the next connection to handle may need such a huge buffer too. Keep it? If other handlers need such huge buffers too, we risk the fatal OutOfMemoryError. Just keeping it "soft" is clearly the best solution.
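For what it's worth, here is a sketch of the SoftReference mechanics just described. It is simplified: my real class grows by an increaseCapacityFactor, and thread-safety is ignored here since each handler owns its buffer:

```java
import java.lang.ref.SoftReference;

public class SoftResizableBuffer {

    private final int initialCapacity;
    private SoftReference<byte[]> soft;
    private byte[] pinned; // strong reference held between fix() and unfix()

    public SoftResizableBuffer(int initialCapacity) {
        this.initialCapacity = initialCapacity;
        this.soft = new SoftReference<byte[]>(new byte[initialCapacity]);
    }

    // Returns the buffer, reallocating it at its initial capacity if the
    // garbage collector cleared it in the meantime.
    public byte[] getBuffer() {
        byte[] buffer = soft.get();
        if (buffer == null) {
            buffer = new byte[initialCapacity];
            soft = new SoftReference<byte[]>(buffer);
        }
        return buffer;
    }

    // Grows the buffer (preserving its content) until it holds minCapacity bytes.
    public byte[] ensureCapacity(int minCapacity) {
        byte[] buffer = getBuffer();
        if (buffer.length < minCapacity) {
            byte[] bigger = new byte[Math.max(minCapacity, buffer.length * 2)];
            System.arraycopy(buffer, 0, bigger, 0, buffer.length);
            soft = new SoftReference<byte[]>(bigger);
            if (pinned != null) {
                pinned = bigger; // keep the pinned reference in sync
            }
            buffer = bigger;
        }
        return buffer;
    }

    // Temporarily prevents the buffer from being cleared, without forcing the
    // caller to carry the strong reference around.
    public void fix()   { pinned = getBuffer(); }
    public void unfix() { pinned = null; }
}
```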
Mmh, this post is huge already and it's late here, but I still have a few things to tell you about (if you're still interested):
- Abstract marshalling: how the class SocketObjectReaderWriter, the interface ObjectConverter and the classes ObjectConvertersFactory and ObjectSerialConverter work together in order to support multiple marshalling protocols at the same time.
- The Command pattern as implemented in my solution (no switch server-side).
- How sessions (objects which maintain the "state" of a connection) are handled abstractly. An example of a useful Session object that a permanent network connection may be bound to would be a database connection.
- The two-way communication: how optional callbacks are supported in a simple way, limited in my implementation to server-side messages broadcastable to clients.
- The hand-shake process: how clients and server agree (or not) on some protocol.

No indigestion yet?
Best,
Phil.