Nth Highest Distinct Record using SQL


Write a SQL query to get the nth highest distinct salary from the Employee table.

+----+--------+
| Id | Salary |
+----+--------+
| 1  | 100    |
| 2  | 200    |
| 3  | 300    |
+----+--------+

For example, given the above Employee table, the nth highest salary where n = 2 is 200. If there is no nth highest salary, then the query should return null.

Nth Highest Distinct Record using SQL via Sorting

The straightforward solution is to sort the Data in descending order and returns the N-th salary by using the Limit. However, the Limit cannot take a variable directly so you have to build the query dynamically. The following runs at 811ms.

CREATE FUNCTION getNthHighestSalary(N INT) RETURNS INT
BEGIN
  Set N=N-1;
  RETURN (
      Select Distinct SALARY From Employee Order by Salary Desc Limit N, 1
  );
END

The parameter N cannot be used directly unless you explicitly SET N=N-1. Alternatively, you can

Declare M INT;
Set M=N-1;

to declare a new variable.

You can also group by Salary column and this gives the same results:

CREATE FUNCTION getNthHighestSalary(N INT) RETURNS INT
BEGIN
  Set N=N-1;
  RETURN (
      Select Max(SALARY) From Employee Group By Salary Order By Salary DESC Limit N, 1
  );
END

This is a bit slower, that runs the query at 862ms.

Nth Highest Distinct Record using SQL via Counting

We can return the N-th salary by counting its position i.e. there are N-1 salaries which are higher. Based on this idea, we need to select the same table multiple times.

CREATE FUNCTION getNthHighestSalary(N INT) RETURNS INT
BEGIN
  RETURN (      
      SELECT 
        e1.Salary
      FROM 
        Employee e1
      WHERE (
              SELECT COUNT(*) FROM (SELECT DISTINCT Salary FROM Employee) e2 WHERE e2.Salary > e1.Salary
            ) = N - 1      
      Limit 1
  );
END

This is the slowest, and it takes 1088ms to complete. It involves nested SELECT statements which are time consuming. We use Limit 1 to return 1 salary only in case there are several same answers i.e. multiple highest salary. We can replace it with the Distinct keyword to return the unique value.

CREATE FUNCTION getNthHighestSalary(N INT) RETURNS INT
BEGIN
  RETURN (      
      SELECT 
        Distinct e1.Salary
      FROM 
        Employee e1
      WHERE 
        (
           SELECT COUNT(*) FROM (SELECT DISTINCT Salary FROM Employee) e2 WHERE e2.Salary > e1.Salary
        ) = N - 1      
  );
END

And this is improved to 939ms. The following also works:

CREATE FUNCTION getNthHighestSalary(N INT) RETURNS INT
BEGIN
  RETURN (
    select distinct Salary 
    from Employee e1 
    where N = (
         select count(distinct Salary)
         from Employee e2
         where e1.Salary <= e2.Salary 
    )
  );
END

This correlated SQL takes 1300ms to finish. The following takes slightly faster (876ms):

CREATE FUNCTION getNthHighestSalary(N INT) RETURNS INT
BEGIN
  RETURN (
    SELECT e1.Salary
    FROM (SELECT DISTINCT Salary FROM Employee) e1
    WHERE (
        SELECT COUNT(1) FROM (SELECT DISTINCT Salary FROM Employee) e2 WHERE e2.Salary > e1.Salary
    ) = N - 1      
  );
END

Algorithms: Second Highest Record

–EOF (The Ultimate Computing & Technology Blog) —

GD Star Rating
loading...
711 words
Last Post: How to Compare Version Numbers in C++?
Next Post: C++ Coding Exercise - How to Check Duplicates with Sliding Window?

The Permanent URL is: Nth Highest Distinct Record using SQL

Leave a Reply