I wouldn't recommend this scheme. It would make querying the database very cumbersome and could increase the risk of SQL injection attacks: you would be creating SQL queries on the fly and stuffing table names into them in the process, which is a recipe for SQL injection. Some queries would also be outright impossible to run efficiently. Let's say, for example, that each of the company tables had a salary column and you wanted to find the maximum salary over all of the companies. With all data stored in one table, the query would be trivial (and if the column was indexed, it would also be nearly instantaneous). In the scheme you propose, however, no single SQL query can return this result, and you'd have to compute it procedurally. (Well, you could maintain a view returning a UNION of all company tables and query that view, but that would essentially be a crude form of the partitioning I'll describe later, and an administrative nightmare to keep in sync.)
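To illustrate, here's a minimal sketch; the table and column names are made up for the example. With one shared table, the query is a one-liner; with per-company tables, you have to enumerate every table yourself:

    -- single shared table: trivial, and can use an index on salary
    SELECT MAX(salary) FROM employees;

    -- one table per company: every table must be listed by hand, and the
    -- list must be updated every time a company is added or removed
    SELECT MAX(salary) FROM (
        SELECT salary FROM employees_acme
        UNION ALL
        SELECT salary FROM employees_globex
        UNION ALL
        SELECT salary FROM employees_initech
    ) AS all_salaries;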
First of all, as some others have already told you several times, databases are fast and were built to process large amounts of data efficiently. It is certainly possible to design a database in a particularly inefficient way, but when done correctly, even millions of records do not pose a particular problem these days. And if you're worried about efficiency, the best thing you could do is create the database, fill it with hundreds of thousands of artificial records and see for yourself how it behaves under the conditions you envisage.
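Generating that much test data is easy. A sketch, assuming PostgreSQL (generate_series is PostgreSQL-specific; other databases have their own ways of producing rows in bulk) and the hypothetical employees table from above:

    -- insert 500,000 artificial employees spread over 100 companies
    INSERT INTO employees (company_id, name, salary)
    SELECT (random() * 99)::int + 1,
           'employee_' || i,
           (30000 + random() * 70000)::numeric(10,2)
    FROM generate_series(1, 500000) AS i;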
Also, if your individual tables will have hundreds of thousands of records, you'll need to index them properly for quick data retrieval anyway. And if you manage to get quick access to data in one properly indexed sub-table, you'll get practically the same retrieval speed from one global table with corresponding indexes (edit: the indexes wouldn't be identical, they would have a column identifying the company at the beginning). The only difference is the case when a full scan of the table (i.e. reading the contents of the entire table, without using any index) is required. Here the split-up tables could theoretically perform better (depending on the exact query), but a full scan over several hundred thousand records is going to take some time, so if you want quick responses, you'll have to avoid full scans anyway.
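In other words, the composite index would simply lead with the company column. A sketch, again with made-up names:

    -- the index leads with the company column, so lookups scoped to one
    -- company touch only that company's slice of the index
    CREATE INDEX idx_employees_company_name
        ON employees (company_id, name);

    -- behaves much like an index on a dedicated per-company table
    SELECT * FROM employees
    WHERE company_id = 42 AND name = 'employee_12345';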
Having a different structure for each individual company table complicates things further, in a big way, as every application operating on such a database will need information about all of these tables to avoid accessing columns that aren't there.
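Your application would end up interrogating the database's metadata before building every query. A sketch against the standard information_schema (supported, with minor variations, by most databases; the table name is again made up):

    -- discover which columns a particular company table actually has
    SELECT column_name, data_type
    FROM information_schema.columns
    WHERE table_name = 'employees_acme';

With a single shared table, none of this bookkeeping is needed.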
Some databases have ways to split up large tables. This is called partitioning and you might try to look it up for the database you plan to use. In general, partitioning allows you to break large tables up into smaller chunks (partitions) by some criteria. Depending on your exact database, it might be possible to achieve splitting the table up to partitions each containing data about just one company. So, you would achieve exactly what you plan (except that all the partitions need to have the same columns, since they're part of the original, partitioned table), and the database would handle by itself all the nasty details of going to the correct partition for the piece of data being accessed. The application code doesn't even need to know whether the table is partitioned or not!
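As a sketch of what that can look like, assuming PostgreSQL's declarative partitioning syntax (other databases spell this differently):

    -- parent table; all queries go against this name only
    CREATE TABLE employees (
        company_id integer NOT NULL,
        name       text,
        salary     numeric(10,2)
    ) PARTITION BY LIST (company_id);

    -- one partition per company; all share the parent's columns
    CREATE TABLE employees_c1 PARTITION OF employees FOR VALUES IN (1);
    CREATE TABLE employees_c2 PARTITION OF employees FOR VALUES IN (2);

    -- the database routes this to the right partition by itself
    SELECT MAX(salary) FROM employees WHERE company_id = 2;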
However, most of the time partitioning is used to make administering really, really large tables easier (and we're speaking about billions of rows here), and one of the challenges of partitioning is making sure that operations over the partitioned table don't get slowed down. Only seldom is partitioning used to speed things up, and, frankly, one needs to read up on the subject to be able to utilize it this way, especially as the details differ between databases.