mysql – Filtering Condition in n-Table Joins

(Aside from JD’s Answer, …)

SQL syntax is picky; WHERE comes after all the JOINs.

The ON clauses should say how the table is ‘related’ to the preceding table. Otherwise, it works identically to WHERE.

I’ll quibble with the “notoriously inefficient”. This depends on the ONs and WHERE. In particular, if WHERE filters on more than one table, the Optimizer must guess at what order to look at the tables, and may fail to do the best job. Also, it implies gathering data from some tables for throwing out the rows due to a WHERE clause that is acted on later.

The Optimizer is free to look at the tables in any order and rearrange ON and WHERE clauses as long as it leads to the same resultset.

An optimization that shows up some times:

    FROM a
    JOIN b  ON b.x = a.y
    WHERE b.x = 123

The Optimizer is smart enough to realize that a.y is necessarily filtered to “123”. This might be advantageous, for example, if a has INDEX(y).

After WHERE can come GROUP BY, HAVING, ORDER BY, and LIMIT, each optional, but necessarily in that order.

HAVING is identical to WHERE but HAVING can see aggregates, but WHERE cannot. Example:

SELECT user_id, COUNT(*) AS ct
    FROM t
    GROUP BY user_id
    HAVING ct >= 2

This provides any users that show up at least twice. This cannot be done with WHERE. (OK, you could do it less efficiently via a subquery with WHERE.)