(Aside from JD’s Answer, …)
SQL syntax is picky; WHERE
comes after all the JOINs
.
The ON
clauses should say how the table is ‘related’ to the preceding table. Otherwise, it works identically to WHERE
.
I’ll quibble with the “notoriously inefficient”. This depends on the ONs
and WHERE
. In particular, if WHERE
filters on more than one table, the Optimizer must guess at what order to look at the tables, and may fail to do the best job. Also, it implies gathering data from some tables for throwing out the rows due to a WHERE
clause that is acted on later.
The Optimizer is free to look at the tables in any order and rearrange ON
and WHERE
clauses as long as it leads to the same resultset.
An optimization that shows up some times:
SELECT ...
FROM a
JOIN b ON b.x = a.y
WHERE b.x = 123
The Optimizer is smart enough to realize that a.y
is necessarily filtered to “123”. This might be advantageous, for example, if a
has INDEX(y)
.
After WHERE
can come GROUP BY
, HAVING
, ORDER BY
, and LIMIT
, each optional, but necessarily in that order.
HAVING
is identical to WHERE
but HAVING
can see aggregates, but WHERE
cannot. Example:
SELECT user_id, COUNT(*) AS ct
FROM t
GROUP BY user_id
HAVING ct >= 2
This provides any users that show up at least twice. This cannot be done with WHERE
. (OK, you could do it less efficiently via a subquery with WHERE
.)