Threading strategy for the server (long)

 
Greenhorn
Posts: 18
The first line of the description of "Fly by night services" says that the company is small but growing.
This has made scalability (not performance nota bene) one of my primary non-functional design goals.
Another personal reason for making scalability important is to be able to learn more about threading models, not just passing the exam.
After browsing this forum it seems that it is enough to just assume that RMI will spawn threads for each remote method invocation to pass the exam.
Some questions:
1) The RMI-spec states that method invocations may or may not run in separate threads. Is this because the underlying OS and HW might have different capabilities or is it also dependent on the VM and the particular RMI-implementation?
2) Does the SUN RMI-implementation just spawn new threads indefinitely or does it use a threadpool?
3) If the SUN RMI-implementation uses pooling, is it a dedicated pool implemented as part of RMI or is it actually the underlying VM trying to reuse threads behind the scenes?
4) I'd appreciate comments on the following approach:
4.1) A scalable server must use thread-pooling to avoid creating an excessive number of threads, which would ultimately result in a denial-of-service.
4.2) There are two potential scalability bottle-necks in the server: 1) creation of objects representing a client request on the server, and 2) execution of the actual database queries.
4.3) The bottle-necks in 4.2) can be avoided by having 2 threadpools, one for constructing clientrequests and one for executing queries. Two pools are better than one since this maximizes availability of the server to clients (this assumes that, in general, constructing a clientrequest is cheaper than carrying out a query).
4.4) Requests created by the first pool are enqueued on a FIFO-queue, which is dequeued by the second pool running the queries. This maintains a degree of fairness in the order of execution. And of course a queue is necessary if we cannot have an unlimited number of threads for running queries (refer to 4.1)).
4.5) The server should use callbacks to avoid congestion in the traffic going into the server.
A nice side-effect is that the client doesn't have to deal with blocking I/O.
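
The pool-plus-FIFO-queue idea in 4.1) to 4.4) can be sketched roughly as follows. This is only a minimal sketch, assuming a modern JDK with java.util.concurrent (not the hand-written pool of the prototype); QueryPoolSketch and run are hypothetical names, and the "query" is a dummy that just records which worker thread ran it.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: a fixed number of worker threads draining a FIFO queue.
class QueryPoolSketch {

    // Runs `requests` dummy queries on at most `workers` threads and
    // returns the name of the worker thread that handled each request.
    static List<String> run(int workers, int requests) {
        List<String> threadNames = new CopyOnWriteArrayList<>();
        // LinkedBlockingQueue hands tasks out in FIFO order; with core size
        // equal to max size the pool never grows beyond `workers` threads.
        ExecutorService pool = new ThreadPoolExecutor(
                workers, workers, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
        for (int i = 0; i < requests; i++) {
            // Stand-in for "execute a database query".
            pool.execute(() -> threadNames.add(Thread.currentThread().getName()));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return threadNames;
    }

    public static void main(String[] args) {
        List<String> names = run(4, 100);
        System.out.println(names.size() + " requests handled by "
                + names.stream().distinct().count() + " worker threads");
    }
}
```

However many requests are enqueued, the number of threads executing them stays bounded by the pool size, which is the point of 4.1).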

I have a prototype for the above that works; however, it relies on RMI for constructing clientrequests and enqueueing them. So I don't know if it will actually scale, since RMI might just keep creating new threads until there are too many.
Thanks for your time
/Alex
 
Ranch Hand
Posts: 2937


4.1) A scalable server must use thread-pooling to avoid creating an excessive number of threads, which would ultimately result in a denial-of-service.
4.3) The bottle-necks in 4.2) can be avoided by having 2 threadpools, one for constructing clientrequests and one for executing queries. Two pools are better than one since this maximizes availability of the server to clients (this assumes that, in general, constructing a clientrequest is cheaper than carrying out a query).
4.4) Requests created by the first pool are enqueued on a FIFO-queue, which is dequeued by the second pool running the queries. This maintains a degree of fairness in the order of execution. And of course a queue is necessary if we cannot have an unlimited number of threads for running queries (refer to 4.1)).
4.5) The server should use callbacks to avoid congestion in the traffic going into the server.
I have a prototype for the above that works; however, it relies on RMI for constructing clientrequests and enqueueing them.


You are going way beyond the requirements, and you may actually be going against the requirements. One of the requirements is to use standard, well-known solutions to standard problems, and avoid the complex solution for the sake of performance. RMI itself is a standard, well-known solution to networking, and it's already loaded with features such as thread pooling, security, distributed garbage collection, etc.
I commend your desire to come up with a comprehensive solution, but this is not what this assignment is about.
Eugene.
[ March 12, 2003: Message edited by: Eugene Kononov ]
 
Alexander Gunnerhell
Greenhorn
Posts: 18
thanks for the reply,
It's true that I'm probably trying to achieve more SCALABILITY than the assignment requires. But I cannot see how that goes against the assignment.
Using a threadpool combined with a FIFO-queue and doing callbacks ARE standard/well-known solutions for achieving SCALABILITY as far as I know. HOWEVER I'm NOT implementing these solutions to achieve PERFORMANCE. In fact the above will actually lower PERFORMANCE from a single client's perspective, since there's more overhead.
The assignment only talks about avoiding complex solutions to achieve performance, NOT scalability.
I'd really like comments on my replies above.
You mention that RMI uses a threadpool, perhaps you could tell me a bit more about how it works? Is the threadpool part of RMI or actually just the VM re-using threads?
Is this threadpool self-adapting to the load somehow?
 
John Smith
Ranch Hand
Posts: 2937


Using a threadpool combined with a FIFO-queue and doing callbacks ARE standard/well-known solutions for achieving SCALABILITY as far as I know.


Ok, but they are the solutions to a higher level problem than the one that we are supposed to solve. By analogy, a relational database is a standard solution to persistence, but you are not going to implement it for this assignment, are you?
Eugene.
 
Alexander Gunnerhell
Greenhorn
Posts: 18
No, I won't implement a full-featured DBMS since there's nothing in the assignment that indicates or states that this is a current or future need/requirement.
However, the requirement for scalability is strongly hinted in the first sentence describing FBN.
A discussion regarding how much scalability is enough is impossible to win for anyone since there are no hard figures stated in the assignment.
I'd like to add that if this was reality I would probably tell the customer that they should use a real DBMS right away since it's more than likely that they would need one for later releases.
Do you think SUN would accept my reasoning as a rationale to include some scalability enhancements?
Finally, I'm really curious about the capabilities of SUN's RMI-implementation. It looked like you had some knowledge about this in your first post? I would also very much appreciate other people's thoughts on my questions and thinking about scalability.
 
Ranch Hand
Posts: 234
Alexander -
I agree strongly with Eugene. I think you are going in the wrong direction with the scalability part. The point of this exercise is to satisfy the vague and incomplete requirements as put forth by Sun, not produce a commercial product. If you spend some time reading this forum you will see what I mean.
Also, I have never written my own thread pool, but I know you would be opening an ugly can of worms with that and it could lead to trouble, unless you already have some strong experience with it. If you need to go this route, write a simple unscalable solution for Sun, submit it, and then write a scalable solution for yourself.

-BJ
 
Alexander Gunnerhell
Greenhorn
Posts: 18
BJ Grau,
I realize that there's a common truth on this forum on what the assignment is really about in practice.
But I still think that if my design documents give clear reasoning, based on the description in the assignment, for why these scalability enhancements are implemented, SUN must accept this.
Also my server-prototype is tested and is working fine and is ready to be refined for use in the assignment, so I'm not really worried about not getting it to work from a pure functional point of view.
What I am worried about though, is that my threadpool for executing db-queries might be moot if the SUN VM is actually re-using threads internally. If it does, this might mean that my threadpool (and the FIFO-queue) is just overhead and nothing more.
Furthermore there's still the question of how SUN's RMI-implementation behaves, although I'm fairly sure after reading this forum that it will multithread if the underlying platform has this capability. So this would mean I could use RMI to concurrently create and enqueue new clientrequests.
But I guess if not a single person on this forum believes that this is an approach that SUN will approve, I'll give it up and just do it as it is meant to be done.
Anyone???
 
John Smith
Ranch Hand
Posts: 2937


You mention that RMI uses a threadpool, perhaps you could tell me a bit more about how it works? Is the threadpool part of RMI or actually just the VM re-using threads?


All I could find in the RMI Specification with regard to threads was this:
"A method dispatched by the RMI runtime to a remote object implementation may or may not execute in a separate thread. The RMI runtime makes no guarantees with respect to mapping remote object invocations to threads. Since remote method invocation on the same remote object may execute concurrently, a remote object implementation needs to make sure its implementation is thread-safe."
Perhaps some other people can point to a more detailed technical description of how the threads are managed by RMI.
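
To make the thread-safety part of that quote concrete, here is a minimal sketch of a remote object whose state is guarded by its own monitor. BookingCounter, BookingCounterImpl and ThreadSafetySketch are hypothetical names, and the object is called from plain local threads rather than exported through RMI, so it only illustrates the concurrency concern, not the RMI plumbing.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

// Hypothetical remote interface; the name and method are illustrative only.
interface BookingCounter extends Remote {
    int book() throws RemoteException;
}

class BookingCounterImpl implements BookingCounter {
    private int booked;

    // synchronized because the RMI runtime may call this method
    // from several threads at the same time.
    @Override
    public synchronized int book() {
        return ++booked;
    }

    synchronized int total() {
        return booked;
    }
}

class ThreadSafetySketch {

    // Calls book() from `threads` local threads and returns the final count.
    static int hammer(int threads, int callsPerThread) {
        BookingCounterImpl counter = new BookingCounterImpl();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < callsPerThread; j++) {
                    counter.book();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.total();
    }

    public static void main(String[] args) {
        System.out.println("total bookings: " + hammer(8, 1000));
    }
}
```

Without the synchronized keyword the increments could interleave and some bookings could be lost.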


Is this threadpool self-adapting to the load somehow?


That's a good question, but again, I think that it is irrelevant to this assignment. Think of it this way: suppose you are an EJB developer, and you are asked to write a stateless session bean. Now, of course, you need to understand the life cycle of the stateless session bean to write your code, but how exactly it is implemented by a particular EJB container is irrelevant to you. Or, taking the RDBMS example, when you write the JDBC code, all you care about is that the RDBMS has a JDBC driver and supports SQL. You take it for granted that your relational database is designed to handle a multi-user, multi-transactional environment. How exactly it is implemented by Oracle/DB2/Informix engineers is a question for computer science theorists, but not for the developers. This separation of knowledge is a good thing, of course. As a Java Developer, I can write simple JDBC code that will work against any database, while I don't have a clue how exactly it works.
Coming back to RMI, take it as a wrapped up solution (just like an RDBMS, or a servlet container, or an EJB container, or JDBC). All you need to know is that it is a wrapper around a network communication protocol that also incorporates security, distributed garbage collection, and thread pooling, among other things. And yes, for the purposes of this assignment, you can safely assume that RMI is scalable enough.
The truth is that Sun's implementation of RMI is not scalable, and that's why the application server vendors chose to provide their own implementation of RMI (BEA weblogic is an example with a custom RMI implementation). In your testing, you may notice that RMI sometimes "chokes" when bombarded with too many connection requests (it happens to me every time I run more than 25 clients on my Windows 98 machine). But again, this is completely irrelevant in the context of this developer assignment.
You have good ideas, and they would certainly be applicable to some industrial strength, mission critical applications, where scalability, availability, and reliability are of the utmost importance (air traffic control system?). However, Sun's assessors are likely to be overwhelmed by your extra efforts to improve scalability, and I would not take this risk. But, after 700 messages in this forum, my mind may be clouded with stereotypes, so if someone straightens me out, I would accept it.
Eugene.
[ March 12, 2003: Message edited by: Eugene Kononov ]
 
Alexander Gunnerhell
Greenhorn
Posts: 18
Eugene,
you're absolutely right: my questions and thoughts are irrelevant for the assignment in itself.
The problem for me, however, is that I also wanted to learn to implement some scalability "patterns" in Java along the way. This in turn means that I MUST investigate and understand parts of the SUN VM and SUN RMI threading capabilities. Otherwise I might produce a solution that is less scalable than pure RMI threading. If that is the case I will for sure fail the exam. This is clearly stated in the instructions.
I have noticed the choking you talk about; however, a lot of other stuff was going on in the server, apart from receiving remote method invocations, so I cannot really say that it had the same cause.
I'll think about your comments for a day or two and then decide. In the meantime there's at least a small chance that someone posts a supporting post. But it doesn't look like my prototype will survive.
 
Greenhorn
Posts: 16
I am still trying to come up with the strength of will to tackle the RMI part, and felt compelled to say just how interesting this insightful (and also respectful) thread of discussion is.
- Eric
 
BJ Grau
Ranch Hand
Posts: 234

Originally posted by Alexander Gunnerhell:
No, I won't implement a full-featured DBMS since there's nothing in the assignment that indicates or states that this is a current or future need/requirement.
However, the requirement for scalability is strongly hinted in the first sentence describing FBN.


Is it really scalability that is strongly hinted at? If so, everyone on this forum is in big trouble. Perhaps it is extensibility that they are hinting at.
If it is scalability, how scalable do you think Data and db.db are as a database?
This assignment is merely symbolic of a real project. If we were to take it literally, there would be so many implied requirements to fulfill that we would never finish.
I think your idea is very interesting, and I am very impressed that you successfully implemented your own thread pool. I almost feel bad for not saying "Go for it", but I have just never heard any advice from any person or forum that suggested anything but keeping this very, very simple. If you do go for it, the worst that will happen is you spend another $250 and have to submit again. If you do go for it, please let us know how it works out for you.
I wish my understanding of RMI went beyond the specification, then I could actually address your questions #1-3.
-BJ
 
Alexander Gunnerhell
Greenhorn
Posts: 18
BJ Grau,
I agree, in reality this would probably mean both scalability and extensibility. I have deliberately chosen to interpret it as scalability only, to be able to fit my scalability exercises into the exam. I don't think that this is a problem (just as long as you clearly state your rationale).
I think the requirements are so vague that you can come up with many different interpretations, so it's not strange that most people can get away with disregarding scalability altogether.
Your second thought about db.db is more worrying: they might fail me just because the design is unbalanced, i.e. I have wasted time and money on scaling the server, but the database will become a bottle-neck anyway.
I'm currently designing the db-part and was planning to use a LockManager based on a HashTable with queued read-write locks for each row.
However, this is still far from what could be done in terms of escalating locks to tables or the whole database to avoid row-lock overhead when searching, for example. Not to mention a richer set of lock types, such as intent locks. There are also loads of theories about queued locks and which kinds of db-operations to prioritize dynamically at runtime.
Clearly, the latter is way beyond my knowledge and the time I'm willing to put into this exam.
So would this make my design unbalanced? I don't know, since I'm not experienced enough to get a gut-feeling for this just by looking at the design. The only way to answer the question is to run a white-box test and identify the bottle-necks.
What's at stake for me is not only the course fee, it's the time invested and the fact that I probably won't have the energy and interest to re-do the exam the way it's supposed to be done if I fail with my current design.
I haven't decided yet...
 
Alexander Gunnerhell
Greenhorn
Posts: 18
I realize not many people are interested in the issues above, but I'll keep posting just in case someone else comes here later on looking for this kind of information.
I've been busy with other stuff for a while but today I had some time to continue my research:
I managed to dig up some additional information about SUN's RMI threading:
http://forum.java.sun.com/thread.jsp?thread=323360&forum=58&message=1315076
According to the thread, RMI creates 1 accept-thread for each server and listening port. In addition to that, RMI creates one additional thread for each connection (ConnectionThreads). The ConnectionThreads are supposed to time out a while after the client closes its connection.
Some conclusions:
- since there's only one accept-thread per remotely exported object, it becomes a precious resource, therefore callbacks are a good thing (this is assuming it works the same as with sockets, where I think the accept-thread normally is used for the response back to the client as well)
- there will be a potentially unlimited number of ConnectionThreads, i.e. there's no pool, just a timer for each thread's life-cycle. The big question here is which is the lesser evil:
1) having a threadpool for creating requests on top of all this (with added overhead) or
2) minimizing overhead and disregarding the potential situation where RMI creates too many threads
I think 2) is closer to the business requirements, which means RMI will handle creation and enqueueing of client-requests and my own threadpool will execute db-operations.
Aside from RMI I have also researched thread reuse in the JVM in general, and I haven't found a single trace of current SUN JVMs having this feature. So the db threadpool stays.
Now, what's left is to think about the potential risk of the db becoming a bottle-neck, which might be a nightmare to resolve, if it's even possible.
After that I'm ready to decide how to proceed.
 
John Smith
Ranch Hand
Posts: 2937


Aside from RMI I have also researched thread reuse in the JVM in general, and I haven't found a single trace of current SUN JVMs having this feature. So the db threadpool stays.


Would you rephrase that for me? Are you saying that RMI does not use a thread pool and that's why you intend to implement that pool yourself?
Eugene.
 
Alexander Gunnerhell
Greenhorn
Posts: 18
Clarification:
no, this is not what I mean. RMI will handle creation and enqueueing of clientrequests (i.e. no pooling whatsoever) by means of its internal threading. Dequeueing and execution of the clientrequests in the "database" will be managed by my own threadpool. These threads will also handle the callbacks to the clients.
I'm still waiting for insights and comments on the db-scalability problem from a colleague of mine (he's a DBA) before deciding how to proceed.
 
John Smith
Ranch Hand
Posts: 2937


Dequeueing and execution of the clientrequests in the "database" will be managed by my own threadpool.


So, every incoming RMI call will be placed in some sort of a queue on the server, and in addition to the RMI thread pool you will take a thread from your own thread pool to pick these calls from the queue and execute them. Shouldn't it be just a single thread that is draining the queue? I mean, what is there to pool?


These threads will also handle the callbacks to the clients.


What kind of callbacks? Do you mean that the server returns an "acknowledged" response to the client, and when the actual request is serviced, the server will notify the client by invoking some method on the remote object that the client uses? Kind of asynchronous communication, JMS-style?
All this sounds to me like implementing your own EJB specification around the RMI protocol. Way, way too much stuff. I would hate to see you fail, but for the purposes of the experiment, I am almost willing to pay for your re-submission.
Eugene.
 
Alexander Gunnerhell
Greenhorn
Posts: 18
You're correct about the RMI calls, this is basically how I described it in my first post.
The reason for having multiple threads for executing clientrequests in the "database" is that a thread in the db will typically be blocked for most of its scheduled time since it's doing I/O. Now, instead of just waiting for that single thread to stop blocking, why not let other threads get a chance to run? Of course this won't be possible for all cases.
The reason for managing those multiple threads in a pool is to be able to limit the number of threads executing clientrequests, to avoid the overhead of too many threads.
You're correct about the callbacks. In the prototype the client observes a remotely observable server-object. When that observable server object changes state (i.e. a clientrequest has finished) it executes a method on the remotely exported observer object in the client.
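
The shape of that callback mechanism might look roughly like the sketch below. It is not the prototype's code: RequestObserver, ObservableServer and CallbackSketch are hypothetical names, and the observer is invoked in-process instead of being exported as a remote object, so only the observer wiring is shown.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical callback interface the client would export to the server.
interface RequestObserver extends Remote {
    void requestFinished(int requestId, String result) throws RemoteException;
}

// Hypothetical server-side observable that notifies clients when a request is done.
class ObservableServer {
    private final List<RequestObserver> observers = new CopyOnWriteArrayList<>();

    void addObserver(RequestObserver observer) {
        observers.add(observer);
    }

    // Called by a worker thread once a clientrequest has been executed.
    void finish(int requestId, String result) {
        for (RequestObserver observer : observers) {
            try {
                observer.requestFinished(requestId, result);
            } catch (RemoteException e) {
                // Assumption: an unreachable client is simply dropped.
                observers.remove(observer);
            }
        }
    }
}

class CallbackSketch {

    // Registers one in-process observer and returns what it received.
    static List<String> demo() {
        List<String> received = new CopyOnWriteArrayList<>();
        ObservableServer server = new ObservableServer();
        server.addObserver((id, result) -> received.add(id + ":" + result));
        server.finish(1, "booked");
        return received;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With real RMI the client would export its observer (for example via UnicastRemoteObject) and pass the stub to the server, so that finish() becomes a remote call back into the client.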
I know this is more than what is required, I think I have already explained why I'm doing this and how I try to fit it into the assignment without violating the "rules".
I'd also like to point out again that I have already proven my "architecture" (from a functional point of view) in the prototype. So it's not an impossible task that lies ahead. But it might still be very hard to make the database as scalable as the rest of the server.
The prototype only had a stub simulating the database part. But the FIFO-queue, the threadpool and the callback mechanism were working fine.
Comparing my "server" with a J2EE application server is very exaggerated, but I guess you were joking about that part as well as the part about paying my re-submission.
Tell me when you're ready to pay
I'm still waiting for my colleague's thoughts about the database, but he's on vacation right now.
Even if he thinks too many advanced mechanisms are necessary to scale the database and I have to change approach, I'm still very glad I've done all this thinking/design as well as the prototype. I have really learnt a lot and have had great fun so far.
 
John Smith
Ranch Hand
Posts: 2937


The reason for having multiple threads for executing clientrequests in the "database" is that a thread in the db will typically be blocked for most of its scheduled time since it's doing I/O.


Ok, that makes sense. I just want to point out that since access to the database is granted through the synchronized methods of Data, and all your threads will presumably use the same instance of Data, nearly 100% of the time spent executing the methods of the remote object will be spent in a serialized sequence anyway. That is, you will not gain much if you let multiple threads execute code that is almost entirely synchronized around the same semaphore.
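
A small experiment along these lines shows the effect. This is only a sketch with hypothetical names (SharedData stands in for Data, and the sleep stands in for file I/O): it counts how many threads are ever inside the synchronized method at the same moment.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a Data class whose methods are all synchronized.
class SharedData {
    private final AtomicInteger inside = new AtomicInteger();
    private final AtomicInteger maxInside = new AtomicInteger();

    synchronized void read() {
        int now = inside.incrementAndGet();
        maxInside.accumulateAndGet(now, Math::max);
        try {
            Thread.sleep(1); // stand-in for blocking file I/O
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        inside.decrementAndGet();
    }

    int maxConcurrency() {
        return maxInside.get();
    }
}

class SerializationSketch {

    // Returns the highest number of threads seen inside read() at once.
    static int maxConcurrency(int threads, int callsPerThread) {
        SharedData data = new SharedData();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < callsPerThread; j++) {
                    data.read();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return data.maxConcurrency();
    }

    public static void main(String[] args) {
        System.out.println("max threads inside read(): " + maxConcurrency(8, 5));
    }
}
```

Because every caller needs the same monitor, the reported maximum stays at 1 regardless of how many worker threads the pool has.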


The reason for managing those multiple threads in a pool is to be able to limit the number of threads executing clientrequests, to avoid the overhead of too many threads.


Ah, a resource controlling mechanism, similar to the one used in application servers, such as Weblogic?


You're correct about the callbacks. In the prototype the client observes a remotely observable server-object. When that observable server object changes state (i.e. a clientrequest has finished) it executes a method on the remotely exported observer object in the client.


Yeah, a distributed and asynchronous Observer-Observable. I must tell you, that's very cool.


Comparing my "server" with a J2EE application server is very exaggerated, but I guess you were joking about that part as well as the part about paying my re-submission.


I did exaggerate, of course, but not by much. So, no, I am not joking. You are trying to emulate the best of J2EE with your own code. Your arguments and justifications are certainly very coherent, but your assessor will be in "shock and awe" while reviewing your submission


Tell me when you're ready to pay


I am willing to put down $25, just to see if you pass. Any other ranchers willing to participate?


Even if he thinks too many advanced mechanisms are necessary to scale the database and I have to change approach, I'm still very glad I've done all this thinking/design as well as the prototype.


Yeah, way to go.
Eugene.
 
Alexander Gunnerhell
Greenhorn
Posts: 18
Yes, using the Data class as a semaphore makes all my other efforts moot.
Therefore my initial design of the database put all the locking in the LockManager, which is basically a hashtable with a key corresponding to the key in the file and a reference to an object containing some sort of locking mechanism. A db-operation would then lock on a single element in that hashtable.
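
One way such a LockManager could look is sketched below. It is a sketch under assumptions, not the design from the prototype: it uses ConcurrentHashMap and ReentrantReadWriteLock from java.util.concurrent (a modern JDK), and the class and method names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical per-record lock manager: one read-write lock per record number.
class LockManagerSketch {
    private final ConcurrentMap<Integer, ReadWriteLock> locks = new ConcurrentHashMap<>();

    private ReadWriteLock lockFor(int recNo) {
        // Fair mode makes waiting threads acquire in roughly arrival order.
        return locks.computeIfAbsent(recNo, k -> new ReentrantReadWriteLock(true));
    }

    void lockRead(int recNo)    { lockFor(recNo).readLock().lock(); }
    void unlockRead(int recNo)  { lockFor(recNo).readLock().unlock(); }
    void lockWrite(int recNo)   { lockFor(recNo).writeLock().lock(); }
    void unlockWrite(int recNo) { lockFor(recNo).writeLock().unlock(); }

    boolean tryLockWrite(int recNo) {
        return lockFor(recNo).writeLock().tryLock();
    }

    // Helper for the demo: tries the write lock from a second thread.
    static boolean tryWriteFromOtherThread(LockManagerSketch manager, int recNo) {
        boolean[] acquired = new boolean[1];
        Thread other = new Thread(() -> {
            acquired[0] = manager.tryLockWrite(recNo);
            if (acquired[0]) {
                manager.unlockWrite(recNo);
            }
        });
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return acquired[0];
    }

    public static void main(String[] args) {
        LockManagerSketch manager = new LockManagerSketch();
        manager.lockWrite(1);
        System.out.println("record 1 free for others: " + tryWriteFromOtherThread(manager, 1));
        System.out.println("record 2 free for others: " + tryWriteFromOtherThread(manager, 2));
        manager.unlockWrite(1);
    }
}
```

Threads working on different records never contend with each other here; only operations on the same record wait.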
You can't compare my threadpool with WebLogic's; mine is static at runtime, i.e. you have to decide before execution what the optimal number of threads is according to the capabilities of the underlying platform. I'm pretty sure WebLogic's threadpooling is much more advanced.
A discussion about what features are available in a typical J2EE server is really beyond the scope of this thread, so let's just agree to disagree on this issue?
It's nice of you to offer to pay my re-submission, but I'm not paying myself anyhow. Besides I would feel like being under pressure if you paid. So thanks, but no thanks.
I have talked to my DBA colleague now, and he thinks it will be tricky (as was expected) to make the db scale as well as the netserver probably will.
What I can do, however, is to extend my prototype with my redesigned database-package and do some benchmarks.
What I'm prepared to do is to include the above LockManager and queued read/write locks. If that's not enough to make the scaling balanced, I'll give up on my current approach.
I believe the main problem will be that searches will require traversing the hashtable and setting readlocks for each and every element included in the search. This could be tackled by implementing escalating locks so that I could set a read-lock for the whole table in one single set-operation, but that is simply too much.
Do you know a good tool for finding "hotspots" in an application? Preferably I'd like a tool that I can use to instrument my code and then have it report how much time is spent in different parts of the code.
 
John Smith
Ranch Hand
Posts: 2937


Do you know a good tool for finding "hotspots" in an application? Preferably I'd like a tool that I can use to instrument my code and then have it report how much time is spent in different parts of the code.


You mean the profiler? I have OptimizeIt at work, and I like it very much. The closest competitor is probably JProbe. There is also a free one (I believe), called JInsight.


I believe the main problem will be that searches will require traversing the hashtable and setting readlocks for each and every element included in the search.


Hashtable is a legacy collection, don't use it. Try HashMap (or synchronized HashMap) instead. So, you are planning to have a collection where you would map all records to their lock status? If your db table contains 1 million records, you would then have a map of 1 million entries?
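
For comparison, one common alternative keeps an entry only for records that are currently locked, so the map size follows the number of active locks rather than the table size. The sketch below is an assumption-laden illustration with hypothetical names (SparseLockTable and its methods), built on a plain HashMap guarded by the object's own monitor; note that it trades the memory saving for a single shared monitor.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical lock table that holds entries only for currently locked records.
class SparseLockTable {
    private final Map<Integer, Object> owners = new HashMap<>();

    // Blocks until the record is free, then records `owner` as the lock holder.
    synchronized void lock(int recNo, Object owner) throws InterruptedException {
        while (owners.containsKey(recNo)) {
            wait();
        }
        owners.put(recNo, owner);
    }

    // Non-blocking variant: returns false if the record is already locked.
    synchronized boolean tryLock(int recNo, Object owner) {
        if (owners.containsKey(recNo)) {
            return false;
        }
        owners.put(recNo, owner);
        return true;
    }

    // Only the owner may unlock; the entry is removed so the map stays small.
    synchronized void unlock(int recNo, Object owner) {
        if (owners.get(recNo) == owner) {
            owners.remove(recNo);
            notifyAll();
        }
    }

    synchronized int size() {
        return owners.size();
    }

    public static void main(String[] args) {
        SparseLockTable table = new SparseLockTable();
        Object clientA = new Object();
        Object clientB = new Object();
        System.out.println("A locks 5: " + table.tryLock(5, clientA));
        System.out.println("B locks 5: " + table.tryLock(5, clientB));
        table.unlock(5, clientA);
        System.out.println("entries after unlock: " + table.size());
    }
}
```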

Eugene.
[ March 24, 2003: Message edited by: Eugene Kononov ]
 
Alexander Gunnerhell
Greenhorn
Posts: 18
Mmm, I have looked through a lot of profilers and decided to start with JInsight, since it's free.
Thanks for the tip about what collection implementation to use; however, I was referring to Hashtable in a generic sense, not a particular implementation (perhaps I used capital letters when describing it earlier, which is confusing of course).
Yes, this Hashtable will potentially be huge, just like an index/lock structure would be in a real database. Are you questioning if a Hashtable is a suitable collection or if any of the hashtable implementations would cope? Or perhaps the memory requirements?
I'm thinking that the tests would consist of three main test cases:
1) benchmark ordered reads after a warm-up period; this would ensure that most if not all data is cached so I can measure the effectiveness of the different parts of the application without being affected by the disks
2) random reads (80%) and writes (20%), to simulate a more realistic load; here it's important to avoid having sustained datastreams since this would just place a bottle-neck on the platters of the disks.
3) tests 1) or 2) but measuring the primary memory requirements for a huge database. This is important since I have absolutely no idea how many bytes each element in the Hashtable will use. There's a risk that memory consumption turns out to be unrealistically high.
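
The read/write mix for test case 2) could be generated along these lines. This is a minimal sketch with hypothetical names (WorkloadSketch, mix, countReads); a fixed seed is assumed so that benchmark runs are repeatable, and the actual database calls are left out.

```java
import java.util.Random;

// Hypothetical generator for a random read/write operation mix.
class WorkloadSketch {
    enum Op { READ, WRITE }

    // Produces `n` operations where each one is a READ with probability `readFraction`.
    static Op[] mix(int n, double readFraction, long seed) {
        Random random = new Random(seed); // fixed seed keeps runs comparable
        Op[] ops = new Op[n];
        for (int i = 0; i < n; i++) {
            ops[i] = random.nextDouble() < readFraction ? Op.READ : Op.WRITE;
        }
        return ops;
    }

    static int countReads(Op[] ops) {
        int reads = 0;
        for (Op op : ops) {
            if (op == Op.READ) {
                reads++;
            }
        }
        return reads;
    }

    public static void main(String[] args) {
        Op[] ops = mix(10_000, 0.8, 42L);
        System.out.println("reads: " + countReads(ops) + " of " + ops.length);
    }
}
```

Each worker thread in the benchmark would then walk its own array and issue the corresponding read or write against a randomly chosen record.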
[ March 25, 2003: Message edited by: Alexander Gunnerhell ]
 
Alexander Gunnerhell
Greenhorn
Posts: 18
OK, today I have thought all this through and decided to skip the scalability mechanisms.
First remember that the reasons for having the FIFO-queue and the pool with db-threads were to:
1) limit the max number of threads in the server (this reason became moot when I decided to use SUN-RMI which anyhow spawns at least one thread per client)
2) maximize availability, so that a client could issue a typical sequence of operations without waiting for each operation to finish (for example lock->modify->unlock)
The main reason for scrapping scalability showed up when I discovered that there's a clear possibility for the sequence lock->db-operation->unlock to be executed out-of-order.
According to my current design each of the above operations would be separate requests to the server and therefore be executed by separate workerthreads. Now, the thread executing lock could be switched out of the cpu immediately before it has actually acquired the lock in the db but after it has dequeued the lock-request from the queue. Just after the previous dequeue another workerthread dequeues the db-operation and tries to execute the db-operation without the associated lock, which would result in an error.
This can be fixed (this was also my plan from the beginning) by "packaging" a sequence of logically related operations in one request to guarantee the correct ordering and execution in one thread.
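
The "packaging" idea can be sketched as one task object that carries the whole sequence, so a single worker thread runs the steps in order. The names below (CompositeRequest, PackagedRequestSketch) are hypothetical, java.util.concurrent from a modern JDK is assumed, and the steps only record a trace instead of touching a real database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical request that bundles lock -> modify -> unlock into one unit of work.
class CompositeRequest implements Callable<List<String>> {
    private final int recNo;

    CompositeRequest(int recNo) {
        this.recNo = recNo;
    }

    @Override
    public List<String> call() {
        List<String> trace = new ArrayList<>();
        trace.add("lock " + recNo);        // stand-in for acquiring the record lock
        try {
            trace.add("modify " + recNo);  // stand-in for the db-operation
        } finally {
            trace.add("unlock " + recNo);  // always released, even if modify fails
        }
        return trace;
    }
}

class PackagedRequestSketch {

    // Submits one composite request to a small pool and returns its trace.
    static List<String> submit(int recNo) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            return pool.submit(new CompositeRequest(recNo)).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(submit(7));
    }
}
```

Since the three steps never sit in the queue as separate entries, no other worker can pick up the modify before the lock has been taken.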
However, the unfortunate side-effect of this is that reason 2) above becomes moot as well, which makes it impossible to defend the FIFO-queue and the pool with db-threads.
So I will have to redesign according to the typical "RMI-threads only" approach.
 