jatan bhavsar wrote:hi All,
It's very old thread but just curious to know how can i increase the performance with the 4,00,000 records.Suppose i am using joins to search the record and multiple user is requesting the same.So how can i implement the caching mechanism which will be efficient. I am not using ORM.
Regards
Jatan
Winston's covered the key points, but it sounds like you need to re-think your query strategy.
First, build and
test your SQL in the database, so you can explore issues like performance without all the overhead of Java-DB interfaces etc. Make your SQL as efficient as you can within the database - look at all those joins and see if they are really necessary or if you could tweak your table structures to make them easier e.g. denormalise commonly used columns from parent tables into the children (but remember you will need to maintain these denormalised values when the parent record changes).
If you have lots of similar queries, consider defining a database view to encapsulate the common joins etc, then your individual queries against the view will be simpler. but you still need to make sure the underlying view SQL is as efficient as possible. Also, make sure you do NOT do lots of DML across the network: fetching large volumes of data and then writing it back again is very inefficient. Instead, do big INSERTs/UPDATEs as single SQL statements within the database where possible, or use stored procedures if necessary.
Make sure you use all the available database tools to make your queries as efficient as possible - EXPLAIN PLAN, use of indexes, denormalisation, partitioning, etc. Partitioning might make a big difference if your users tend to query against particular logical sets of data e.g. for a particular month, as you can partition your table on the relevant column so that each month's data (for example) is physically separate, so any queries for e.g. January only have to search through January's partition instead of the whole table. This will depend on your RDBMS as these features are usually platform-specific.
If you don't know how to do all this stuff, get your DBA to help. If they don't know how to do this either, you shouldn't be running an application with millions of rows in the first place.
Now you have some SQL queries and you want to run them from your Java application. Again, make sure you do everything you can to make this as efficient as possible. Make sure you use PreparedStatement and bind variables to make your SQL both secure and re-usable (the SQL engine can re-use parsed SQL with bind variables), use batches etc if appropriate. Make sure you apply as much filtering as possible within the database before you fetch any records back. Reading data out of the database across the network is slow, so only read the data you really want - don't fetch everything and try to filter the data in Java.
Caching query results is a problem - you'll need some proper Java experts to advise on that (i.e. not me!) - but remember that the DB has its own caching mechanisms as well. Again, talk to your DBA to make sure you are using these effectively.
Use a profiling tool to find out where all the time is being spent. No point trying to speed up your SQL queries if most of the time is being wasted on network traffic, for example.
If your queries are still too slow, consider separating the "read" data from the "write" data. Your data model may be optimised for writing transaction data into a particular set of tables, but these may not be best structure for querying large volumes of data later on. You could look at building snapshots (in Oracle these are "materialised views") to hold commonly queried data in a more suitable structure, then query against the snaphots instead of the data tables, and refresh these snapshots e.g. once a day from your operational (transaction) data store. This approach is similar to the basic principle for data warehouses/data marts (on a much larger scale), but you can take the same approach on a smaller scale if it will help your application. This will also depend on your query requirements etc.
Finally, maybe think about whether you might be able to use some of the performance features of an ORM on your application.