This is a repost of a post I wrote years ago (circa 2018). At the time I was working on some This was educational content to teach data scientists some software engineering fundamentals. The join explanations are still accurate and the examples work, though my writing style has evolved since then. Including it for the archive.


When working with a normalized SQL database, you’re often left with disjointed data spread across multiple tables. Since these individual datum tend not to carry a lot of weight when viewed in isolation, we have to reunite the disparate pieces. One of the most ubiquitous, performant ways to coalesce this data is through Join.

Join is used to combine data from multiple entities in a relational database. ANSI SQL defines 5 Joins. They are:

  • Inner Join
  • Left Outer Join
  • Right Outer Join
  • Full Outer Join
  • Cross Join

INNER JOIN

An INNER JOIN returns a set of relations where attributes of the first table are combined with attributes of the second, given the relations satisfy a join predicate. That is, tables are combined when they share a common identifying attribute.

Syntax

SELECT [COLUMNS]
FROM table_1
INNER JOIN table_2
ON table_1.analagous_column = table_2.analagous_column;

or

SELECT [COLUMNS]
FROM table_1, table_2
ON table_1.analagous_column = table_2.analagous_column;

Example

Consider the following tables:

staff

staff_num staff_name
123 John Doe
456 Jane Doe
789 Robert Fakename

project_staff

project_num staff_num
12345 123
12345 456
67890 123
13579 456
13579 789
13579 123
24680 789
24680 456

If we were to use an INNER JOIN to combine these tables, the query would look something like this:

SELECT project_staff.project_num, staff.staff_num, staff.staff_name
FROM staff, project_staff
ON project_staff.staff_num = staff.staff_num;

The query would return the following result set:

project_staff.project_num staff.staff_num staff.staff_name
12345 123 John Doe
12345 456 Jane Doe
67890 123 John Doe
3579 456 Jane Doe
13579 789 Robert Fakename
13579 123 John Doe
24680 789 Robert Fakename
24680 456 Jane Doe

LEFT OUTER JOIN

A LEFT OUTER JOIN returns a set of relations where attributes from every relation in the first table are combined with attributes of the second, given the relations in the second satisfy a join predicate. In other words, everything from the first table is returned, along with stuff from the second table, so long as the tables share a common identifying attribute.

Syntax

SELECT [COLUMNS]
FROM table_1
LEFT OUTER JOIN table_2
ON table_1.analagous_column = table_2.analagous_column;

Example

Consider the following tables:

staff

staff_num staff_name
123 John Doe
456 Jane Doe
789 Robert Fakename

project_staff

project_num staff_num
12345 123
12345 456
67890 123
13579 456
13579 789
13579 123
24680 789
24680 456
86420 NULL
97531 NULL

A LEFT OUTER JOIN of project_staff and staff would look like this:

SELECT project_staff.project_num, staff.staff_num, staff.staff_name
FROM staff
LEFT OUTER JOIN project_staff
ON project_staff.staff_num = staff.staff_num;

… and would yield this:

project_staff.project_num staff.staff_num staff.staff_name
12345 123 John Doe
12345 456 Jane Doe
67890 123 John Doe
13579 456 Jane Doe
13579 789 Robert Fakename
13579 123 John Doe
24680 789 Robert Fakename
24680 456 Jane Doe
86420 NULL NULL
97531 NULL NULL

RIGHT OUTER JOIN

A RIGHT OUTER JOIN mimics the behavior of a LEFT OUTER JOIN, except the tables are reversed. It returns a set of relations where attributes satisfying a join predicate from the first table are combined with every relation in the second table. In other words, matching records from the first table are combined with all records from the second.

Syntax

SELECT [COLUMNS]
FROM table_1
RIGHT OUTER JOIN table_2
ON table_1.analagous_column = table_2.analagous_column;

Example

For demonstration purposes, consider the following tables:

staff

staff_num staff_name
123 John Doe
456 Jane Doe
789 Robert Fakename
555 Bobby Tables

project_staff

project_num staff_num
12345 123
12345 456
67890 123
13579 456
13579 789
13579 123
24680 789
24680 456

A RIGHT OUTER JOIN to combine the tables would look like this:

SELECT project_staff.project_num, staff.staff_num, staff.staff_name
FROM staff
RIGHT OUTER JOIN project_staff
ON project_staff.staff_num = staff.staff_num;

Yielding this as a result set:

project_staff.project_num staff.staff_num staff.staff_name
12345 123 John Doe
12345 456 Jane Doe
67890 123 John Doe
13579 456 Jane Doe
13579 789 Robert Fakename
13579 123 John Doe
24680 789 Robert Fakename
24680 456 Jane Doe
NULL 555 Bobby Tables

FULL OUTER JOIN

A FULL OUTER JOIN returns a set of relations where attributes from every record in both left and right tables are returned. Conceptually, this combines LEFT OUTER JOIN and RIGHT OUTER JOIN functionality. Matching records mimic INNER JOIN. In other words, every record from the each table is returned, regardless of completion.

Syntax

SELECT [COLUMNS]
FROM table_1
FULL OUTER JOIN table_2
ON table_1.analagous_column = table_2.analagous_column;

Example

Consider the following tables:

staff

staff_num staff_name
123 John Doe
456 Jane Doe
789 Robert Fakename
555 Bobby Tables

project_staff

project_num staff_num
12345 123
12345 456
67890 123
13579 456
13579 789
13579 123
24680 789
24680 456
86420 NULL
97531 NULL

FULL OUTER JOIN of project_staff and staff would take this shape:

SELECT project_staff.project_num, staff.staff_num, staff.staff_name
FROM staff
FULL OUTER JOIN project_staff
ON project_staff.staff_num = staff.staff_num;

It would return results like this:

project_staff.project_num staff.staff_num staff.staff_name
12345 123 John Doe
12345 456 Jane Doe
67890 123 John Doe
13579 456 Jane Doe
13579 789 Robert Fakename
13579 123 John Doe
24680 789 Robert Fakename
24680 456 Jane Doe
86420 NULL NULL
97531 NULL NULL
NULL 555 Bobby Tables

CROSS JOIN

A CROSS JOIN returns the Cartesian product of rows in the joined relations. In other words, it produces a table with every row in the first table combined with every table from the second table.

Syntax

SELECT [COLUMNS]
FROM table_1
CROSS JOIN table_2;

or

SELECT [COLUMNS]
FROM table_1, table_1;

Example

Consider the following tables:

staff

staff_num staff_name
123 John Doe
456 Jane Doe
789 Robert Fakename

project_staff

project_num staff_num
12345 123
12345 456
67890 123
13579 456
13579 789
13579 123
24680 789
24680 456

CROSS JOIN of project_staff and staff would look like this:

SELECT project_staff.project_num, staff.staff_num, staff.staff_name
FROM staff
CROSS JOIN project_staff;

The query above would yield:

project_num staff_num staff_name
12345 123 John Doe
12345 456 John Doe
67890 123 John Doe
13579 456 John Doe
13579 789 John Doe
13579 123 John Doe
24680 789 John Doe
24680 456 John Doe
12345 123 Jane Doe
12345 456 Jane Doe
67890 123 Jane Doe
13579 456 Jane Doe
13579 789 Jane Doe
13579 123 Jane Doe
24680 789 Jane Doe
24680 456 Jane Doe
12345 123 Robert Fakename
12345 456 Robert Fakename
67890 123 Robert Fakename
13579 456 Robert Fakename
13579 789 Robert Fakename
13579 123 Robert Fakename
24680 789 Robert Fakename
24680 456 Robert Fakename