A Visual Explanation of SQL Joins

FaridM · June 30, 2010, 12:00am

thanks, this is the best Joins explanation, the diagrams HELP A LOT!
this solved my problem, thanks again

Carnright · July 9, 2010, 12:00am

There are sixteen possible Venn states with two variables; you forgot eleven of them. Seriously, I wonder how many people have independently made the connection between SQL joins and Venn diagrams. I also worked this out a while back, as an expansion of the UNIX sort and uniq combinations I used to use.

Given two lists, and given that each list has only unique entries inside itself:

sort a b b | uniq -u (gives only what is in a)
sort a b | uniq -d (gives what is in both a and b)
sort a a b | uniq -u (gives what is only in b)

I found this useful for simple lists like hostnames or IPs
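
The same counting trick can be sketched in Python. This is only an illustration of the commands above, not a replacement for them: the helper names `uniq_u` and `uniq_d` and the sample lists are made up here, and, as stated above, each list is assumed to have no duplicates of its own.

```python
from collections import Counter

def uniq_u(lines):
    """Like `sort | uniq -u`: keep lines that occur exactly once."""
    counts = Counter(lines)
    return sorted(line for line, n in counts.items() if n == 1)

def uniq_d(lines):
    """Like `sort | uniq -d`: keep one copy of each line that occurs more than once."""
    counts = Counter(lines)
    return sorted(line for line, n in counts.items() if n > 1)

# Hypothetical sample lists; each has unique entries within itself.
a = ["alpha", "bravo", "charlie"]
b = ["bravo", "charlie", "delta"]

only_a = uniq_u(a + b + b)  # sort a b b | uniq -u
both = uniq_d(a + b)        # sort a b   | uniq -d
only_b = uniq_u(a + a + b)  # sort a a b | uniq -u

print(only_a, both, only_b)
```

Listing `b` twice pushes every entry of `b` to a count of at least two, so only entries exclusive to `a` survive the "exactly once" filter.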

Cotyrocksteady · July 18, 2010, 12:00am

Thank you so much for this explanation!!! NOW I CAN FINALLY UNDERSTAND THIS!!!

Helpstring · December 16, 2010, 12:00am

Thanks for this great visual explanation - just what I needed.

Ken_Weston · January 6, 2011, 12:00am

I think a Venn diagram of a cross join would show a single filled circle equal to the area^2 of the previous sets of circles.
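
To make the "area squared" intuition concrete, here is a minimal Python sketch. The two tables are hypothetical single-column lists chosen for illustration; the point is only that a cross join pairs every row of one table with every row of the other, so the row count is the product of the two table sizes.

```python
from itertools import product

# Hypothetical single-column tables; the names are illustrative only.
table_a = ["Pirate", "Monkey", "Ninja", "Spaghetti"]
table_b = ["Rutabaga", "Pirate", "Darth Vader", "Ninja"]

# A cross join has no matching condition: every row meets every row.
cross_join = list(product(table_a, table_b))

print(len(cross_join))  # 4 * 4 = 16 rows
```

With two tables of the same size n the result has n * n rows, which is where the "squared" picture comes from; with sizes n and m it is n * m.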

JonathonO · January 24, 2011, 12:00am

Superbly explained, Jeff. I must say that I am very naive when it comes to database programming, since I am primarily an applications and networking programmer and, as such, have a tendency to view databases as a big black box of data, nothing more…

I liked your use of Venn diagrams; they helped me see joins in a completely different light. I was very narrow-minded.

That being said, could Venn Diagrams be applied to three or more table join statements? I find myself running into some problems recently in my rather inferior SQL statements hence the question.

Once again, great work, really enjoy reading your posts.

outis · April 19, 2011, 12:00am

Old post, and this will probably be lost among the spam, but…

While joins themselves can’t be accurately defined by Venn diagrams, the relationships between joins can be. With the left and right circles representing the left and right joins, their union is the outer join and their intersection is the inner join.
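
A small Python sketch of that relationship, treating each join result as a set of rows. Everything here is an assumption made for illustration: the tables are hypothetical single-column lists, the function names are invented, and modelling results as sets only works when the tables contain no duplicate values and no NULLs.

```python
# Hypothetical single-column tables; the names are illustrative only.
table_a = ["Pirate", "Monkey", "Ninja", "Spaghetti"]
table_b = ["Rutabaga", "Pirate", "Darth Vader", "Ninja"]

def inner_join(a, b):
    """Rows (x, y) where the two values match."""
    return {(x, y) for x in a for y in b if x == y}

def left_join(a, b):
    """Inner-join rows plus (x, None) for every x in a with no match in b."""
    return inner_join(a, b) | {(x, None) for x in a if x not in b}

def right_join(a, b):
    """Inner-join rows plus (None, y) for every y in b with no match in a."""
    return inner_join(a, b) | {(None, y) for y in b if y not in a}

def full_outer_join(a, b):
    """Matched rows plus the unmatched rows from both sides."""
    unmatched_a = {(x, None) for x in a if x not in b}
    unmatched_b = {(None, y) for y in b if y not in a}
    return inner_join(a, b) | unmatched_a | unmatched_b

left = left_join(table_a, table_b)
right = right_join(table_a, table_b)

# The two circles are the left and right join results (sets of rows).
print(left & right == inner_join(table_a, table_b))       # intersection
print(left | right == full_outer_join(table_a, table_b))  # union
```

Note that the circles here hold result rows, not the rows of the original tables, which is the distinction the comment above is drawing.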

HendrikK · July 10, 2011, 12:00am

Awesome explanation, thanks!

jpk · August 1, 2011, 12:00am

Nice Visuals…
thanks for the explanations…

ejhost · April 23, 2014, 1:02pm

7 or so years after this was originally posted, I still refer to it all of the time & refer others to it. Nicely done!

poissonlabs · May 31, 2014, 6:23am

This post is hugely misleading. There is some relationship between a join and an intersection, sure. But they are not the same thing.

In particular, an intersection is a join in the lattice of sets. But tables/relations aren’t just sets. They’re sets of tuples, and they have a richer notion of a join, based on quotients and other algebraic constructs. The join of a pair of relations is the smallest relation that is bigger than them both. The inner join is a closure operator. The (left) outer join is an adjoint operator. Etc.

The funny thing is, the blog post gave you half the story in 10 times as many words. Don’t fight math. It will always win.

pai1009 · September 26, 2014, 8:41am

Great article! Following up on one of the posts about other articles on SQL joins, I recommend another with a similar approach (Venn diagrams, example tables, and a query with its result for each case): http://www.vertabelo.com/blog/technical-articles/sql-joins

learning_dbs · February 21, 2015, 8:09pm

2015 and STILL your example is the very BEST on the web!

matthewsmaynard · March 26, 2015, 2:35pm

I have been looking for this explanation for years, thank you.

John_Gotts · May 29, 2015, 6:06pm

If you’ve been using SQL, but not primarily (in other words, as a means to an end), for many years, you know when you need to use a JOIN. Most programmers don’t care enough about SQL or databases or set theory or discrete math to learn this theory in detail and pages like this are critical for us to get our jobs done.

John_Gotts · May 29, 2015, 6:10pm

That’s mathematical rigor and 99% of programmers don’t need it. Programming is about getting the math good enough, or else no programs would ever get written. We already know where the math matters, for example, in encryption code.

poissonlabs · June 1, 2015, 6:40pm

You missed the point. The rigor makes things easier, not harder. It takes 10 times as many words to explain things the wrong way. And then you often need to know how to un-roll the wrong explanation into the right one.

How do you figure out a query’s computational complexity with the “set” explanation? You have to unroll it into the real thing (i.e., tuples) and then do a counting argument. If you start with the real thing, the counting argument takes less than a minute. If you start with the wrong explanation, you might never even reach a valid estimate.
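
One possible reading of that counting argument, sketched in Python under stated assumptions: the join is an equality join on a single key column, and the question is only how many rows come out. The function names and key lists below are hypothetical. For each key value, every matching row on the left pairs with every matching row on the right, so the result size is the sum over keys of the product of the two counts.

```python
from collections import Counter

def inner_join_size(a_keys, b_keys):
    """Row count of an equality inner join, computed by counting per key."""
    count_a, count_b = Counter(a_keys), Counter(b_keys)
    # A key missing from b contributes 0 because Counter returns 0 for it.
    return sum(n * count_b[key] for key, n in count_a.items())

def inner_join_size_brute_force(a_keys, b_keys):
    """Same count by comparing every pair of rows (tuple-at-a-time view)."""
    return sum(1 for x in a_keys for y in b_keys if x == y)

# Hypothetical key columns with duplicates.
a_keys = [1, 1, 2, 3]
b_keys = [1, 2, 2, 2, 4]

print(inner_join_size(a_keys, b_keys))
print(len(set(a_keys) & set(b_keys)))  # what a plain set intersection reports
```

With duplicate keys the join produces more rows than the set intersection has elements, which is the gap between the "set" picture and the tuple picture.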

snoyes · June 16, 2015, 4:18pm

How about something like this for the cross join diagram?

MrBCut · February 7, 2016, 9:36am

Hey thanks for making JOINS visually familiar with the notion of Venn diagrams. I want to thank my colleague Hengineer for pointing me to this reference too. But I have a question:

I’m getting more comfortable with SQL syntax. I use this as a resource. In your examples, your functions are:

INNER JOIN
FULL OUTER JOIN
LEFT OUTER JOIN

On the site I mentioned, these are the JOIN functions:

INNER JOIN
LEFT JOIN
RIGHT JOIN
FULL JOIN
SELF JOIN
CARTESIAN JOIN

I see a couple of matches and similarities, but I don’t want to assume. Also, the wiki page you mentioned in your intro kinda messes me up too.

Can you or anyone walk me through the syntax (maybe semantics?) of these function names, so I can be clear about what really does what?

Thanks!
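
For anyone with the same question, here is a hedged sketch using Python's built-in sqlite3 module. In standard SQL the OUTER keyword is optional, so LEFT JOIN and LEFT OUTER JOIN mean the same thing (likewise RIGHT and FULL), "Cartesian join" is another name for CROSS JOIN, and a "self join" is an ordinary join of a table with itself rather than a separate keyword. The tables `a` and `b` and their contents are invented for this example. RIGHT and FULL joins are not run directly because older SQLite versions do not support them; the right join is emulated by swapping the tables in a LEFT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, name TEXT);
    CREATE TABLE b (id INTEGER, name TEXT);
    INSERT INTO a VALUES (1, 'Pirate'), (2, 'Monkey'), (3, 'Ninja');
    INSERT INTO b VALUES (1, 'Pirate'), (2, 'Ninja'), (3, 'Robot');
""")

def run(sql):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

cols = "SELECT a.name, b.name FROM "

# LEFT JOIN is shorthand for LEFT OUTER JOIN.
left_short = run(cols + "a LEFT JOIN b ON a.name = b.name ORDER BY a.id")
left_long = run(cols + "a LEFT OUTER JOIN b ON a.name = b.name ORDER BY a.id")

# A bare JOIN is an INNER JOIN.
inner_short = run(cols + "a JOIN b ON a.name = b.name ORDER BY a.id")
inner_long = run(cols + "a INNER JOIN b ON a.name = b.name ORDER BY a.id")

# "Cartesian join" = CROSS JOIN: every row of a paired with every row of b.
cartesian = run(cols + "a CROSS JOIN b")

# a RIGHT JOIN b, emulated as b LEFT JOIN a with the same column order.
right_emulated = run(cols + "b LEFT JOIN a ON a.name = b.name ORDER BY b.id")

# "Self join": the same table joined to itself under two aliases.
self_join = run(
    "SELECT x.name, y.name FROM a AS x JOIN a AS y ON x.id < y.id "
    "ORDER BY x.id, y.id"
)

print(left_short == left_long, inner_short == inner_long, len(cartesian))
```

FULL JOIN (FULL OUTER JOIN) returns the left join rows plus the unmatched rows from the right table.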

JuanRamos · February 8, 2016, 7:10pm

It is a nice explanation, but why do you use tables with only names?
A real scenario would be better.