SQL Queries: Filtering Data with SQL OR Conditions, UNION Operator, and Common Table Expressions (CTEs)

·

4 min read

SQL Queries: Filtering Data with SQL OR Conditions, UNION Operator, and Common Table Expressions (CTEs)

Day 3 of 100 Days of DSA: Exploring Big Countries with SQL

Welcome back to the "100 Days of Data Structures and Algorithms (DSA)" series! In this 3rd-day entry, we continue our exploration by solving a problem from LeetCode that revolves around the concept of querying big countries using SQL.
LeetCode: 595. Big Countries

The Problem: Identifying Big Countries

The problem, sourced from LeetCode, presents us with a table named "World" containing information about different countries. Our task is to identify the "big" countries based on the following criteria:

  • Area of at least 3,000,000 km²

  • The population of at least 25,000,000

We'll approach this task using two different SQL solutions, carefully explaining each step to make it accessible for SQL beginners.

The World Table Structure

Here's a snapshot of the table:

Snapshot of a table with columsn name, continent, area, population and gdp

Solution #1: Using the OR Condition

This was the out-of-the-box entry that I would write on our MS SQL. The reality is that very rarely queries end up being this easy, but then again as I go through this path I am most likely getting used to writing the tougher ones too. As it is our database at work is so complex that I usually use the Query Designer GUI to build up the most complex queries.

So here is the code snippet for the Query:

SQL Query

SELECT name, population, area
FROM World
WHERE area >= 3000000 OR population >= 25000000;

Explanation

  • SELECT Statement: Specifies the columns to retrieve (name, population, area).

  • FROM Statement: Identifies the table being queried ("World").

  • WHERE Clause: Applies the filtering conditions using an OR operator, ensuring that either condition is met.

Solution #2: Using the UNION Operator

Then searched around for possible other solutions and decided to test out the UNION operator which is mainly used to make complex Queries more readable by breaking them down into multiple queries that are then combined with the UNION operator. It doesn't look that bad but that becomes a problem when the database has millions of entries and you have to read them through multiple times. SO ok on this size but suboptimal with big datasets.

SQL Query

SELECT name, population, area
FROM World
WHERE area >= 3000000
UNION
SELECT name, population, area
FROM World
WHERE population >= 25000000;

Explanation

  • Two Separate SELECT Statements: Breaks down the problem into two queries for area and population.

  • UNION Operator: Combines the results of both queries, removing any duplicates.

Solution #3: Using Common Table Expression (CTE)

In this particular instance, the solution employs a new learning approach for me, treating SQL as if it were a function in Python, or at least it appears that way to me and I understand it better like that. The reason behind this approach is that the LeetCode servers operate on Coordinated Universal Time (UTC), and the solution builds were initiated from a timezone that is two hours ahead of UTC (UTC+2). As a result, the Day 3 markings were not obtained. Instead, the rest of the night was spent writing this solution, with plans to test it during a more suitable time in the morning.

SQL Query

WITH BigCountries AS (
  SELECT name, population, area
  FROM World
  WHERE area >= 3000000 OR population >= 25000000
)
SELECT * FROM BigCountries;

Explanation

  • WITH Clause: This defines a CTE named "BigCountries" that includes countries meeting either of the big country criteria (area or population).

  • SELECT Statement: This queries the results from the CTE to retrieve the name, population, and area of the big countries.

Why Use a CTE?

Using a CTE allows us to create a temporary result set that can be referenced within the main query. In this particular problem, it may add clarity by separating the filtering logic into a named expression.

Comparison with Previous Solutions

  • Solution #1: The third approach is similar to Solution #1 but adds a layer of abstraction with the CTE. It may be useful in more complex queries where the filtering logic is reused.

  • Solution #2: Unlike Solution #2, which breaks down the problem into two separate queries, Solution #3 filters the data in one go, similar to Solution #1.

Conclusion

In conclusion, we explored three different SQL solutions to identify big countries based on area and population criteria. Each approach has its advantages, with the use of the OR condition, UNION operator, and Common Table Expression catering to varying levels of complexity and readability. Selecting the appropriate method depends on the specific problem and the size of the dataset being queried.

Status in Leetcode after 3 days

Did you find this article valuable?

Support Timo by becoming a sponsor. Any amount is appreciated!