Teaching Kids Programming – Second Highest Distinct Record using SQL and Python


Teaching Kids Programming: Videos on Data Structures and Algorithms

Table: Employee

+-------------+------+
| Column Name | Type |
+-------------+------+
| id          | int  |
| salary      | int  |
+-------------+------+

id is the primary key column for this table.
Each row of this table contains information about the salary of an employee.

Write an SQL query to report the second highest salary from the Employee table. If there is no second highest salary, the query should report null. The query result format is in the following example.

Example 1:
Input:
Employee table:

+----+--------+
| id | salary |
+----+--------+
| 1  | 100    |
| 2  | 200    |
| 3  | 300    |
+----+--------+

Output:

+---------------------+
| SecondHighestSalary |
+---------------------+
| 200                 |
+---------------------+

Example 2:
Input:
Employee table:

+----+--------+
| id | salary |
+----+--------+
| 1  | 100    |
+----+--------+

Output:

+---------------------+
| SecondHighestSalary |
+---------------------+
| null                |
+---------------------+

Get the Second Highest Distinct Record from Table using Limit/Offset (Sorting)

We can use the Limit/Offset to get the second highest record. The key to this is to sort (order by), and then with the distinct keyword, skip the first record (offset), and return the next record (limit 1).

SELECT DISTINCT Salary
FROM Employee
ORDER BY Salary DESC
LIMIT 1 OFFSET 1

However, when the table consists of only 1 distinct record, the above SQL will return empty rows, and we can use the IFNULL function to make it return NULL.

SELECT IFNULL(
    (SELECT DISTINCT Salary
    FROM Employee
    ORDER BY Salary DESC
    LIMIT 1 OFFSET 1), NULL
) AS SecondHighestSalary

The sorting makes the SQL O(NLogN) which isn’t that efficient.

Python Function to Get Second Highest Distinct Element in the List or Iterables via Sorting Algorithm

The Python code to show the algorithm to get the second highest distinct element in the list or array. Time time complexity is O(NLogN) due to sorting:

1
2
3
4
def secondHighest(data):
    data = list(set(data))
    data.sort()
    return data[-2] if len(data) > 1 else None
def secondHighest(data):
    data = list(set(data))
    data.sort()
    return data[-2] if len(data) > 1 else None

Example:

1
2
3
4
5
6
7
8
9
10
print(secondHighest([2,3,5,12,23,3,3,4,99]))
print(secondHighest([2,2,2,2,2]))
print(secondHighest([2,2,2,2,2,3]))
print(secondHighest([2,2,2,2,2,3,3]))
 
### prints the following  
23
None
2
2
print(secondHighest([2,3,5,12,23,3,3,4,99]))
print(secondHighest([2,2,2,2,2]))
print(secondHighest([2,2,2,2,2,3]))
print(secondHighest([2,2,2,2,2,3,3]))

### prints the following  
23
None
2
2

We convert the given iterables (list or tuple) to set, to consider only unique elements, and then we sort them, and return the second highest element if there is such, otherwise return None (if the array contains only 1 unique element).

Linear Algorithm to Get the Second Highest Distinct Record from Table

We know already the effective algorithm to get the highest record from array/table, which is to linear scan e.g. O(N) complexity where N is the number of total records.

Select Max(Salary) from Employee

To Return the Second Distinct Highest, we can perform such linear scan again, but filter out the records that have the highest record:

select max(Salary) SecondHighestSalary 
from Employee
where Salary not in (select max(Salary) from Employee)

Here, we use the subquery to obtain the highest record, and then we can filter out the highest record via “not in”, or less equal (<) or not equal (<>).

The Python code to show the idea:

Optimal Linear Algorithm: Python Function to Get Second Highest Distinct Element in the List or Iterables

The following Python function performs the linear scan twice to obtain the second highest record in the list or iterables (e.g. tuples). If no such record exists, it will return None.

1
2
3
def secondHighest(data):
    mx = max(data) # get the highest
    return max((i for i in data if i != mx), default=None)
def secondHighest(data):
    mx = max(data) # get the highest
    return max((i for i in data if i != mx), default=None)

Algorithms: Second Highest Record

Learning SQL

–EOF (The Ultimate Computing & Technology Blog) —

GD Star Rating
loading...
915 words
Last Post: OpenAI Text Classifier Fails to Detect ChatGPT Text
Next Post: Introduction to Dry Testing in Software Engineering

The Permanent URL is: Teaching Kids Programming – Second Highest Distinct Record using SQL and Python

Leave a Reply