I have a table with two BIGINT columns, and a query of the form "SELECT col2 FROM table WHERE col1 = someValue".
There will be 1 row satisfying the query, or no rows at all.
I currently have 6,800,000 rows of data spread evenly over 10 tables containing 680,000 rows each.
The time for 1 query is 0.1 seconds.
My question is: would queries be faster or slower against one large table than against several smaller ones?
I could test it myself, but handling that much data is time-consuming.
Thanks for any input.
Your query is almost certainly getting the row via an index, so the total size of the data is largely irrelevant: index lookup time is practically constant. With really small tables the index might (just might) have a lower height, saving maybe one logical I/O per lookup, but that probably would not offset the additional overhead of managing 10 tables and partitioning the data across them.
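The "practically constant" claim comes from the logarithmic depth of a tree index. A quick back-of-the-envelope check in Python, using the row counts from the question (the base-2 logarithm is a rough stand-in for index depth; a real B-tree has a much higher fan-out and is even shallower, but the comparison shrinks the same way):

```python
import math

# Worst-case comparisons for a binary search over the two table sizes.
small = math.ceil(math.log2(680_000))     # one of the ten tables
large = math.ceil(math.log2(6_800_000))   # everything in a single table
print(small, large)  # 20 23 -- ten times the rows costs only ~3 extra steps
```

So collapsing the ten tables into one adds almost nothing to the per-lookup cost.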
That is assuming you know in advance which table contains the value you're looking up. If it could be in any of the ten tables, you'd of course have to probe every one, and the resulting ten queries would probably take roughly ten times as long as a single query against one big table.
What could speed things up a bit is a composite index covering both columns. In some databases the optimizer will figure out that it can answer the query directly from the index and never read the table itself, saving one logical I/O per query. And some databases (e.g. Oracle) let you create an index-organized table, so you can get rid of the underlying heap table altogether, saving space and update time.
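For what it's worth, the index-only behaviour is easy to see with SQLite's query planner from Python's built-in sqlite3 module (the table and index names here are invented for the demo; MySQL's EXPLAIN reports the same idea as "Using index" in the Extra column):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE lookup (col1 INTEGER, col2 INTEGER)")
# Composite index: col1 serves the WHERE clause, and col2 is included
# so the SELECT list can be answered from the index alone.
con.execute("CREATE INDEX idx_lookup ON lookup (col1, col2)")

plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT col2 FROM lookup WHERE col1 = 42"
).fetchall()
print(plan[0][-1])  # mentions a COVERING INDEX: the table is never read
```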
I'm not sure what you mean about indexing, do you think I could make it any faster than it is now?
There is one sure way to know: run an explain plan on your query and the database will tell you. There is no 100% guaranteed way of knowing up front whether your table structure will produce fast queries; it depends very much on the type of queries (and other processing) you intend to run and on the data itself. Examine the query workload against real data and tune accordingly; guessing without measuring can easily make things slower.
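The same kind of check works in miniature with SQLite from Python; the point is simply how the plan output changes once an index exists (in MySQL's EXPLAIN you'd see the analogous switch from a full scan, type=ALL, to an index lookup, type=ref):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (col1 INTEGER, col2 INTEGER)")

def plan(sql):
    # First line of the planner's summary for the query.
    return con.execute("EXPLAIN QUERY PLAN " + sql).fetchall()[0][-1]

query = "SELECT col2 FROM t WHERE col1 = 42"
print(plan(query))   # SCAN ...   -> full table scan, reads every row
con.execute("CREATE INDEX idx_col1 ON t (col1)")
print(plan(query))   # SEARCH ... USING INDEX ... -> direct index lookup
```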
colin shuker wrote:I'm not sure what you mean about indexing, do you think I could make it any faster than it is now?
I could help you with this only if you are on Oracle. I don't know other databases well enough to offer more specific help.
If you use an Oracle database, post the DDL statements you used to create the table(s) and any indexes on them, plus the exact text of the query. If I see room for improvement, I'll let you know.
colin shuker wrote:I'm using MySQL. I'm getting a time of 0.1 seconds for each SELECT query, which is pretty decent, because I was previously using VARCHAR in my table and the query was taking 3.5 seconds.
ALWAYS use correct data types. If you use incorrect types, an implicit conversion will probably take place somewhere. Then, all of a sudden, your code might stop working when the locale changes, because e.g. numbers or dates will be formatted another way and won't match the data in the database any longer. Moreover, with wrong data types the database might create a suboptimal plan (e.g. when a numeric column holds dates) or might be unable to sort the data properly (e.g. when a VARCHAR column holds numbers).
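The sorting problem in particular is easy to reproduce. A minimal sketch with SQLite from Python (the table and column names are invented for the demo):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE scores (points TEXT)")  # wrong type: numbers stored as text
con.executemany("INSERT INTO scores VALUES (?)", [("9",), ("10",), ("2",)])

rows = [r[0] for r in con.execute("SELECT points FROM scores ORDER BY points")]
print(rows)  # ['10', '2', '9'] -- lexicographic order, not numeric
```

With an INTEGER column the same ORDER BY would return 2, 9, 10 as expected.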
In this particular case, I'd guess that using VARCHAR prevented your application from using an index (or you didn't have an index in place), so the table was full-scanned (every row read) on each execution of your query. Your current solution very probably uses an index, vastly reducing the amount of work the database has to do to answer the query. Hence the speedup.
The time for a query is now 0.02 seconds; that's what phpMyAdmin says, anyway. I guess this includes the time to locate the table named in the SELECT query.
My application does seem a little snappier now anyway, so it looks good.