A Visual Explanation of SQL Joins

Big deal, SQL finally caught up with the 1980’s. Used a very fine DBMS and language, NOMAD2, on mainframes that did this stuff but only much better and more thoroughly.

nicely done. Though I do have to admit to finding it humorous that what started out as a fast, dirty way to try to clarify to someone (on an IRC channel) why they were getting the wrong information for a join now has others correcting and linking to me.

Bindun!

http://www.khankennels.com/blog/index.php/archives/2007/04/20/getting-joins/

Nice post though :wink:

“I was under the impression that INNER JOINs use a hash match internally (in fact, I’m quite certain of this). CROSS JOINs obviously don’t. Therefore I don’t see how it’s reasonable to say that an IJ is just a special case of the CJ. I suppose in bread-and-butter set theory, where there’s no notion of a hash or an index, this could work, but this is the world of CS, where the clean academic solution is almost never the most practical one. The IJ you speak of is the naivest possible algorithm, only used when there are no indexes (and if you’re designing databases without any indexes, well… stop designing databases).” – Aaron G

An IJ is defined as being the equivalent of a filtered CJ. The fact that this would not be a reasonable implementation does not make a difference.

IJs are ABSOLUTELY NOT implemented exclusively with hash algorithms. Sort merge algorithms and nested loop algorithms are used frequently.

Jeff:

Nice illustration. I think your example would be minutely more readable if the left-hand table had an ID column containing values 1,2,3,4 (as it does) but the right-hand table had an ID column containing values A,B,C,D (instead of numbers). It just makes that tiny bit easier to tell what’s what by avoiding the two similar-looking-but-different columns.

Very nice article. I use a Sqlite database (via the command line) to manage my database of comics and this would have been a very nice thing to have when I started out. Hopefully, this will help people in grasping SQL joins.

I love it. The grumpy comments, that is. Plenty of complaints that this is too simplistic a view to be useful, nary a link to anything meatier.

And SQL-addicts wonder why everyone else who has to deal with RDBS jumps at the first library that lets them treat them as anything other than RDBS…

(just finished re-working a big fat chunk of code originally written to pull all tables into memory and iteratively query, join, and sort them - even an ultra-simplistic understanding would have done someone a ton of good)

The Oracle (+), actually predates Oracle’s usage, was because when implemented there was ANSI no outer join syntax.
In Oracle 8, they added the ANSI syntax. No one should be using the old syntax so hopfully you will not being seeing it out in the wide.

While this is an interesting example of these kinds of join, I have found that in practice I almost never need anything other than left/inner join. 10 years of practice, and it’s not the worlds most complicated sql most of the time, but there it is. Thank GOD that stupid thing didn’t illustrate “RIGHT” joins too!

The first “left outer join” is the same as just “left join”. The second one is the same as well, just put “b.id is null” in the where. And please don’t anyone tell me that the optimizer isn’t going to figure that out (or do the same thing with the outer.) Well, maybe it won’t if it is MySQL, but thats why you don’t use that.

Die “outer”, die!

Excellent visual representation. This should help the novice see a concrete picture in their mind of the different joins. While LINQ isn’t quite set based (but close) these same visualizations would work there too.

Excellent article.

Thanks a lot.

can you write something similar for indexes.

mpunaskar at gmail dot com

Something you may or may not know. . . .

During the late 70’s Maths was re-packaged to include

Venn Diagrams
Matrices

and so on,
with the idea of preparing for the coming of computers.
At least in England that happened…

What they teach now is hard for me to determine, as most victims of the educational system can not count out the correct change in a shop.

Wow man, your design is stellar. Content is great too…you’ve got a new reader.

Holy crap this is WAY over due! I’ve read TONS of SQL books and NONE of them had this simple diagram!

I think the Cartesian equation or Cross Product in Venn Diagram form would be both circles combining to form a sphere.

Interesting. I find joins to be pretty obvious personally. Fundamentally, you have two sets of rows, and you’re matching them up by some criteria. In an inner join, you only take rows that match up. In a left outer join, you take all the ones from the left, plus the ones from the right that match. In a right outer join, the opposite. If you have multiple matches, you take all possible combinations of them.

Thank you! I’ve been using SQL for so long and have many books on SQL and I have never seen such simple examples. I love it!

The reason many people who do DB development have such a inadequate understanding of this simple topic is because the SQL language is an inadequate tool to try to absorb it from. So, if you never were taught this in school or on the job, it’s kind of a surprise.

Some DBMS’ has syntax with more descriptive verbs (e.g., ‘Merge’, ‘Subset’, ‘Reject’), and Venn diagrams as well in the vendor documentation.

SQL 2005 is an improvement over 2000. But the sad fact is is many people learn about relational algebra by writing SQL statements.

A less pretty version of much of what Jeff is saying can be found here:

http://db.grussell.org/section010.html

With all respect to Jeff Atwood, beware if you’ve read this post and think, “wow, that’s easy – why did noone explain it like that before?” Well, there’s a reason why so many SQL tutorials and books don’t use these diagrams – they’re an inadequate explanation of what database joins are all about.

Joins are indeed analogous to set operations, but it is only an analogy. Understand what’s really going on, or you will get burned when you actually have to use the things.

Very nice first post on the subject. I think many of the comments are valid and I think it would be good to continue this as a series that builds up the more complex issues of SQL joins.

I, who learned this stuff the hard way, really apprecieate these primers!