When we're using relational databases correctly, we're going to end up with a lot of tables in our database. That's great for organizing our data, but how do we query that data when it's spread across multiple tables? Well, the answer is JOIN.
We're going to work with a small database that contains just two tables pulled from the ACADEMY AWARDS® DATABASE. An actors table and an awards table. The following SQL contains the schema for the database and some seed data.
Note
Make sure you execute the schema then the seeds before continuing.
Schema
Seeds
If we wanted to get all of the awards with the id of the actor that won the award, that would be a simple query
But instead of getting the winner's id, we want the winner's name. We want to
SELECT year, title, full_name
Now we're dealing with data from more than one table. We've got the year
and title
from the awards
table, and the full_name
from the actors
table. We also have a relationship, the winner_id
is an id of one of the actors
.
So we need to use this information to JOIN
the tables together so that we can get the data we need.
Note that when there is a shared column name between tables, we must specify which table the column belongs to, hence the actors.id
In order to get data from multiple tables, we can JOIN
the tables together. When we JOIN
tables, we must specify how the tables are related to each other using ON
.
This kind of JOIN is called an INNER JOIN
. When we just write JOIN
like this, it's just short hand for INNER JOIN
. Try writing INNER
before the word JOIN
and re running the query.
Did you notice that the first query without the JOIN
returned 12 results, but the second query with the INNER JOIN
only returned 8 results? Why is that?
It's because an INNER JOIN
will only return values where there's a match between the tables.
In the first query, we selected all of the awards, even if there wasn't a winner (the winner_id was null).
Since the table's are being JOIN
ed when actors.id = winner_id
, we won't get any results when the winner_id
is null.
But what if we want all of the awards, even when there's no winner? Then we'll have to use another JOIN
!
INNER
returns rows from both tables where there's a match between the tables.LEFT
andRIGHT
return all rows from the left or right table respectively, and only the matched results from the other table, or NULL if there is no match.FULL
returns all rows from both tables.