I was looking at a query to make it more performant and I encountered an interesting case. This query executes much faster when I add a (redundant) JOIN that doesn’t change the actual result set.
EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON) select * from product_reservations l1_ inner join product_occupancy_items l0_ on l1_.id = l0_.id inner join products l2_ on l0_.product_id = l2_.id where l2_.customer_id = 'a19917c2-5ee8-47c2-a757-7799c0e54b0d' and l0_.date_range && '(2019-08-09,2019-08-11)' = true
The query executes at ~88ms, this is the explain
Adding a redundant JOIN:
EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON) select * from product_reservations l1_ inner join product_occupancy_items l0_ on l1_.id = l0_.id inner join products l2_ on l0_.product_id = l2_.id inner join products l3_ on l0_.product_id = l3_.id where l2_.customer_id = 'a19917c2-5ee8-47c2-a757-7799c0e54b0d' and l0_.date_range && '(2019-08-09,2019-08-11)' = true
This produces the following explain https://explain.dalibo.com/plan/Op3
This query runs at 20ms. About 4 times faster. As you maybe noticed, the only difference is just an useless JOIN on a table already JOINed (products).
When looking at the EXPLAIN for differences, we can find something in the
- In the 1st query, the planner uses only the index of the external key with
productsand takes all the matching products applying the date range filter without index. This is much faster even though there are ~12k matching products to deal with (and then after applying the filter just 65)
- In the 2nd query, the planner uses the index
idx_occupancy_items_date_rangethat’s actually good on paper, but it takes a whole 80ms.
What’s going on is kind of clear from a “what” point of view. I’m not sure to understand the “why”. It looks to me that purely for an implementation point of view, it changed the part of the plan to access
product_occupancy_items and got lucky.
I’m wondering if we can do anything to allow the planner to make the same decision without requiring that JOIN.
Note how while the difference may sound in the order of few ms, this query is executed many times for a bulk process. The whole bulk execution runs at about 3s with the double join and at about 30s without it. So the difference is sensible.