join

SQL Joins Explained |¦| Joins in SQL |¦| SQL Tutorial



Sharing buttons:

Welcome to the future. This timeline is amazing! The global hyperloop network is finally complete.

The bioengineered glow-in-the-dark Redwood Forest in the Sahara is a vacation paradise.

And best of all, we have colonies throughout the Solar System. One thing that might surprise

you about the future is that everyone still uses SQL. I didn’t see that coming.

So to prepare you for the future, we now explore an essential part of SQL: Joins.

The database we will be using to learn about joins has 5 Tables:

martian, base, visitor, inventory, and supply.

The Martian table is a list of all people

living on the Red Planet. After all, if you are living on Mars, you

are no longer an Earthling - you’re a Martian! The “base” table holds information on

all the different habitats on Planet #4. In addition to serving as a stopover for people

visiting the outer planets, Mars is now a tourist destination.

Every visitor to Mars is tracked in the “visitor” table.

The “inventory” table keeps track of the supplies at each base.

And the “supply” table shows what is available at the central Martian distribution center.

Let’s take a closer look at the “Martian” and “Base” tables.

Here are the columns in the “Martian” table…

The “super_id” is the “ID” of the Martian they report to.

And here are the columns in the “Base” table.

These two tables are connected, because each “Martian” has a “base_id” identifying

where they live and work. Suppose you need to generate a report listing

the full names of all Martians and the name of their home base.

How would you do it? The Martian name is in the “Martian” table,

while the “base” name is in the “Base” table.

The solution is to JOIN these two tables together by the “base_id.”

To begin, look at the DATA in the Martian table… and the Base table.

Let us visualize how to join these tables together by the “base_id.”

Take the first row from the “Martian” table...

Find the row in the Base table with the matching base_id…

And then join them together to make a new, larger row.

You continue doing this row… by row… by row…

This brings us to John Carter. He does not have a base_id, so what do we

do? Do we include him in the join? Or leave him

out? This is a very important question.

The answer depends on the TYPE of join you do.

For now, we will leave John Carter out. The result is 12 rows.

Each row is a combination of data from both tables.

Having joined these two tables, you can select data from these rows like an ordinary SELECT query.

By the way, notice that there is currently

no Martian at the Olympus Mons Spa & Casino. This is because the staff is en route from

the Moon and will not arrive for a few more weeks.

Here’s how to write a join using SQL. Begin with SELECT * so we can see all columns.

This will help in understanding the join. Next, specify the two tables.

FROM martian … INNER JOIN base. We will call “martian” the left table,

and “base” the right table. The “JOIN” keyword indicates we will be

joining these two tables. We’ll explain the INNER keyword in just a moment.

Next, specify HOW to connect two rows from

these tables with an ON clause. We only want to connect a martian row with

a base row if they have the same base_id. In the “ON” clause, you have to specify

both the table name... and column name. This makes it clear which table each column

is from. Now execute…

The result contains all the columns from the Martian table…

And all the columns from the Base table. The rows are matched by base_id.

John Carter and the Olympus Mons Spa & Casino do not appear in the result.

This is because neither one had a match in the other table.

We can now update this query to return only those columns we need for our report.

First name… last name… and base name… Execute.

Report complete.

When joining the tables, we encountered two problematic rows. John Carter is in the Martian

table, but his base is unknown. And the Olympus Mons Spa & Casino currently has no Martians.

So how do you tell SQL whether or not to include rows from a table that do not have a match

in the other? Hmm?? We have two tables, and we have two choices per table. Include rows

without a match or exclude them. This means there are 4 total options. Spoiler Alert:

SQL has a JOIN for each of the four options. Let’s visit them now.

We return to the “Martian” and “Base” tables.

We will call the “Martian” table the left table, and “Base” the right table.

Each table has a row with no match in the other table.

Here is the syntax for joining these two tables: Write SELECT followed by the columns you want

FROM martian... blank... JOIN base

ON martian.base_id = base.base_id The ON clause specifies HOW rows from the

two tables will be connected. Because this is a SELECT query, you can also

have WHERE and ORDER clauses. All that remains is to fill in the blank and

specify what kind of JOIN to perform.

We will learn about four types of joins: INNER.. LEFT .. RIGHT .. and FULL.

An INNER JOIN will only return connected rows

when there is a matching base_id in both tables.

A LEFT JOIN will return EVERY row from the left table, even if there is no matching row

in the right table. If a Martian row has no match in the Base

table, it will return nulls for ALL the columns in the right table.

A RIGHT JOIN will return EVERY row from the right table, even when there isn’t a match

in the left table. When a Base row does not have a matching Martian

row, null values will be used for ALL columns in the left table.

And finally there is the FULL JOIN. This is sometimes called a FULL OUTER JOIN.

The FULL JOIN is a combination of the LEFT JOIN and RIGHT JOIN.

A full join returns every row from the left table and every row from the right table.

When the base_ids match, the rows are connected. But when there isn’t a match, the row is

still included in the JOIN with nulls for the columns from the other table.

A common way to visualize the difference between these 4 kinds of joins is with Venn Diagrams.

The Inner Join… Left Join… Right Join… and Full Join…

The inner join diagram highlights how only rows with matching columns are connected.

The left join diagram shows that all rows in the left table are returned, even those

without a match on the right. The right join is the mirror image of the

left join. And the full join shows that all

rows from both tables are returned. When two rows match, they are connected.

Otherwise, nulls will fill in the gaps.

I just received a Dear John letter. It’s especially sad because my name isn’t even

John! It reads, “Dear John. Not every database supports the four joins you discussed. For

example, MySql and SQLite do not support FULL JOINs. Sincerely, R. E. Joinder”

I will not be deterred. We are going to see examples of these four kinds of joins using Postgres.

Maybe the other databases will… join us?

Before seeing examples for INNER, LEFT, RIGHT,

and FULL joins I would like to revisit our first query.

SELECT * FROM martian

INNER JOIN base ON martian.base_id = base.base_id

Execute…

Notice how the output has TWO columns named base_id.

The first one is from the “martian” table,

and the second one is from the “base” table.

What if instead of selecting all columns from both tables, we only select the martian_id,

base_id, and base_name.

Execute.

We get an error. This is because it is ambiguous which base_id column we want.

Is it the base_id from the martian table or base table?

To avoid ambiguity, you specify the table

AND column name in the SELECT clause. We will select the “base_id” from the

“base” table... Execute…

Success...

Repeatedly typing the table name can make queries a bit word-y, and hands a bit sore-y

We can fix this by creating an “alias” for each table name

To do this, use the “AS” keyword. If we write “FROM martian AS m”, then

‘m’ is now an alias for the table name “martian”

And let us give the “base” table an alias of “b”

Now everywhere in our query we can use “m” instead of the table name “martian”, and

use “b” instead of the table name “base”. The “ON” clause becomes nicer and shorter:

And the SELECT clause is more compact as well

Now that you have properly been introduced to Joins, it is time to see some examples.

Not just one or two examples, mind you, but a LOT of examples. But not too many! We don’t

want you to feel bored. Somewhere between not quite enough and way too many. That is

how many examples we are going to see…. In our next video. So JOIN me in the next

video as we boldly go where no one has gone before… Ok, that’s a lie. If you look

at the number of views on the next video, you will see exactly how many people have

gone before you.