• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Tim Cooke
  • paul wheaton
  • Paul Clapham
  • Ron McLeod
Sheriffs:
  • Jeanne Boyarsky
  • Liutauras Vilda
Saloon Keepers:
  • Tim Holloway
  • Carey Brown
  • Roland Mueller
  • Piet Souris
Bartenders:

SQL Query Optimization for ORDER BY and LIMIT

 
Ranch Hand
Posts: 42
1
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Hi All.

I want to optimize a query

SELECT * FROM EMPLOYEE ORDER BY EMPID ASC LIMIT 9,1.

This Query will give me the 10th LOWEST EMPID in the table.
But if the number of rows is abut 100 million the query have a bad performance.

So please help me on this.
 
Sheriff
Posts: 3837
66
Netbeans IDE Oracle Firefox Browser
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
First of all, how often is the query run, how long it takes now and how much do you need it to improve?

If there isn't an index on the EMPID column, create one. Since you're selecting just a few rows, the database should use the index and it should be quite fast.

The query itself is so simple that there isn't much to optimize. However, you should explicitly name columns instead of specifying SELECT *. It won't help much, but (depending on the database, your indexes and network) it might be a bit faster, especially if you select more than just a few rows and/or there is an index containing all columns you're selecting.
 
Vishal Baid
Ranch Hand
Posts: 42
1
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Martin Vajsar wrote:First of all, how often is the query run, how long it takes now and how much do you need it to improve?

If there isn't an index on the EMPID column, create one. Since you're selecting just a few rows, the database should use the index and it should be quite fast.

The query itself is so simple that there isn't much to optimize. However, you should explicitly name columns instead of specifying SELECT *. It won't help much, but (depending on the database, your indexes and network) it might be a bit faster, especially if you select more than just a few rows and/or there is an index containing all columns you're selecting.



Hi Thanks for your reply.

Their EMPID is a primary key. The index is not defined properly. First Employee ID is 1000 the n may be next one is 10010. So no proper indexing.
Problem is that for finding the 10th lowest EMPID, ORDER BY do the ordering of whole table first then only give the 10th row. If the number of rows are 100 Million then it is taking abt 1 mins.
The process takes lot of time. So I want someone give me some idea who to make the query work fast.

Thanks in advance.
 
Bartender
Posts: 2407
36
Scala Python Oracle Postgres Database Linux
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Firstly, if there is no index on your EMPID and you are sorting 100 million rows by EMPID, then you need to create an index on EMPID. Basically, if you are going to do a lot of sorting on a column (or columns) then you need to index the column(s). However, if the EMPID has been declared with a primary key constraint, then it should already be indexed - the index is how the database enforces the PK constraint.

After you've done this, there are different ways to achieve your goal of finding the Nth record in a sorted list of records. You could use a Top-N query and just discard the first N-1 rows. Alternatively, if you are on Oracle, you could use an analytic function like RANK or DENSE_RANK to calculate a ranking for each record (based on your sorting criteria), then wrap this query in a simple outer SELECT that just fetches the record where the ranking = N.

But as Martin says, you should avoid doing SELECT *. One option might be to find your chosen EMPID first, without fetching any of the other values for the Employee record, then wrap this SELECT EMPID... in another SELECT that fetches the rest of the Employee data for the single EMPID you are interested in. This should allow your "find the Nth record" query to return the required EMPID using just the index, then the final SELECT to get the rest of the Employee data for the chosen EMPID will be very quick.
 
Bartender
Posts: 1051
5
Hibernate Eclipse IDE Chrome
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Their EMPID is a primary key. The index is not defined properly. First Employee ID is 1000 the n may be next one is 10010. So no proper indexing


I think you may be confusing indexes with sequences here. This post may help:
https://coderanch.com/t/453208/Oracle-OAS/Difference-betwwn-Indexes-Sequences
 
Martin Vashko
Sheriff
Posts: 3837
66
Netbeans IDE Oracle Firefox Browser
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
What database are you using, Vishal? (It certainly isn't Oracle, Oracle doesn't have the LIMIT clause).

It looks like there's a bit of confusion about database indexes, as James already noted. A database index is similar to an index in a book - the index keeps all indexed keys in a sorted structure (corresponding to the keywords in a book index sorted alphabetically), together with an "address" of the row containing the key (a page number in the book index). So to find the 10th lowest key, the database has to find 10th entry in the index (it does so by reading the first ten entries) and locate corresponding row. This should under all circumstances be much quicker than to read (and sort) hundred million rows.

If the EMPID column is a primary key, in most databases it means that it is indexed. I'd also expect the LIMIT clause to use an index. Perhaps someone who knows the database you're using will be able to help.
 
Vishal Baid
Ranch Hand
Posts: 42
1
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I am using MYSQL.

EMPID is not auto-incremant. In ofbiz framework, the primary key Empid is random number.
Not in an order that is why i am facing this issue.
 
chris webster
Bartender
Posts: 2407
36
Scala Python Oracle Postgres Database Linux
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Vishal Baid wrote:I am using MYSQL. EMPID is not auto-incremant. In ofbiz framework, the primary key Empid is random number. Not in an order that is why i am facing this issue.


It doesn't matter how the EMPID is generated, what's important is that it is correctly defined as a primary key in MySQL. If this is the case then EMPID should already have a unique index, which means you can use this for your query. Check the MySQL online docs:

MySQL manual wrote:The primary key for a table represents the column or set of columns that you use in your most vital queries. It has an associated index, for fast query performance.


If possible, I would suggest you use a Top-N query with LIMIT and only fetch the EMPIDs, because this can be achieved using just the index, which will be fast. I don't use MySQL, but I think you could get just the record you want by using the offset (N-1) in the LIMIT clause e.g. if you want just the 10th record try using:You'll need to play around with this to make sure you get the record you want. Then wrap this in another SELECT that fetches the Employee details for the single EMPID you are interested in.
 
Don't get me started about those stupid light bulbs.
reply
    Bookmark Topic Watch Topic
  • New Topic