This week's book giveaway is in the XML and Related Technologies forum.
We're giving away four copies of Java XML & JSON and have Jeff Friesen on-line!
See this thread for details.
Win a copy of Java XML & JSON this week in the XML and Related Technologies forum!
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other all forums
this forum made possible by our volunteer staff, including ...
  • Campbell Ritchie
  • Liutauras Vilda
  • Devaka Cooray
  • Jeanne Boyarsky
  • Bear Bibeault
  • Junilu Lacar
  • Paul Clapham
  • Knute Snortum
Saloon Keepers:
  • Ron McLeod
  • Tim Moores
  • Stephan van Hulst
  • salvin francis
  • Carey Brown
  • Tim Holloway
  • Frits Walraven
  • Ganesh Patekar

Partitioning or new schema  RSS feed

Posts: 13
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I have a schema inside my database with big tables (the schema is composed by 20 tables and each of them contains 500 million rows equivalent about  40GB ). the tables are quite simple, they contain 3 fields: id,value,datetime.
The queries (normally are based on the search in a certain time range one or more parameters values) now are very slow and some improvement is needed, so for the structure of the tables I was thinking about horizontal partitioning by range using the datetime field as element or split the schema in different pieces each of them including data for a certain period of time.
The second solution will create some problems in the software using the database because according to the range time selected by the user, the query must be addressed to a specific DB, containing that period, so if the performances are similar I will prefer the fist one.
What do you think about the performances of the two solutions, is there any other (third) applicable solution?    
Posts: 20307
Android Eclipse IDE Linux
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Usually you'd partition based on the ID, but if you search more often on time, perhaps that might work. But before you do that, have you tried indexing on the datetime field?

Another reason for partitioning large data based on its timestamp would be if you were doing the majority of your queries over a given time range. In that case, partitioning might make it easier to roll the data offline, or for DBMS's that support it, archival mode (where the old stuff might be run out to tape and restored on-demand).

Certainly it's worth considering whether or not to manually partition by storing data for a separate month or year in its own table, too.
All of the world's problems can be solved in a garden - Geoff Lawton. Tiny ad:
RavenDB is an Open Source NoSQL Database that’s fully transactional (ACID) across your database
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!