Lesson 8: Advanced Join Techniques

Executive Summary:

This post is going to delve deeper into SQL Joins, with a focus on several advanced techniques and strategies. We’ll be exploring leveraging some of these techniques to improve query performance and be more efficient.

Creating Temp Tables:

Please be aware, that for space reasons, I’m not utilizing a ”real” database size. In production settings, you will encounter situations of 10,000 customers or more. Imagine if a company has 3 stores and each store does 1,000 transactions over a 12 hour shift; Think of how many records would be created in a month. What would happen when they celebrate their fifth year of business? As you can imagine, records snowball into larger and larger volumes.

--Create and Fill Order Table                     
CREATE TABLE Customers (CustomerID INT, CustomerName VARCHAR(100) )
INSERT INTO Customers (CustomerID, CustomerName )
values (1, 'John Doe'), (2, 'Jane Smith'), (3, 'David Johnson')
Insert into Customers values ('4','Scrooge McDuck')

--Create and Fill Order Table
CREATE TABLE Orders (OrderID INT, CustomerID INT )
INSERT INTO Orders (OrderID, CustomerID )
VALUES (1, 1), (2, 1), (3, 2), (4, 3), (5, 3)

--Create and Fill Discounts Table
CREATE TABLE Discounts (MinOrders INT, MaxOrders INT, DiscountRate FLOAT)
INSERT INTO Discounts (MinOrders, MaxOrders, DiscountRate)
VALUES (1, 1, 0.05), (2, 2, 0.1), (3, 9999, 0.15)

WHERE and Joins

In our above example data, let’s examine a common query. Our customer table (in a production system) would have both leads and customers that have a purchase history with the company. In our case, there’d be a few different ways to find out all customer names (or Customers with a purchase history).
In the below example, we’ll complete a traditional left join and filter to the desired result sets with a where statement. Alternatively, we can replace the where clause with a simple join, with the caveat that we would not be able to identify leads.
Finally, we can also utilize a normal join and a group by to both replace the distinct and

--Find all Customers with Orders
---Traditional Left Join With Where
SELECT C.CustomerName
FROM Customers C
LEFT JOIN Orders O ON C.CustomerID = O.CustomerID
WHERE O.OrderID IS NOT NULL

---Traditional Join W/Distinct 
SELECT distinct C.CustomerName
FROM Customers C
JOIN Orders O ON C.CustomerID = O.CustomerID

---Group By Aggregation 
Select C.CustomerName
FROM Customers C
JOIN Orders O ON C.CustomerID = O.CustomerID
GROUP BY C.CustomerName
HAVING Count(O.OrderID) >= 1

Joins with Derived Tables

Frequently, it’s useful to create a derived table (or a table created from a query) to join to. This generally can help organize your query and allow you to have the ‘perfect’ dataset for your use.
Check the below example, which shows two ways of finding customer order counts.

--SubQuery 
SELECT R.CustomerName, Count(OrderID) NumberOfOrders
FROM ( 
	SELECT C.CustomerName, O.OrderID 
	FROM Customers C 
	JOIN Orders O on C.CustomerID = O.CustomerID
) R
GROUP BY R.CustomerName

--Derived Table & CTE
; WITH OrderCount AS 
(
  SELECT CustomerID, COUNT(OrderID) NumberOfOrders 
  FROM Orders 
  GROUP BY CustomerID
) 
SELECT C.CustomerName, OrderCount.NumberOfOrders
FROM Customers C
INNER JOIN 
  OrderCount
ON C.CustomerID = OrderCount.CustomerID

Implicit Joins

There’s even more syntax to create joins; You can simply list the tables and the relationship as below. How do you think this syntax will handle complicated queries?

SELECT DISTINCT C.CustomerName
FROM Customers c, Orders O
WHERE C.CustomerID = O.CustomerID

Leave a Reply

Your email address will not be published. Required fields are marked *