postgresql – A query with multiple JOINs versus multiple queries

I'm working with Postgres 9.6 and PostGIS 2.3, hosted on AWS RDS. I am trying to optimize some geo-radius queries for data that comes from different tables.

I'm thinking of two approaches: a single query with multiple joins, or two separate but simpler queries.


At a high level, and simplified, my schema is:

CREATE EXTENSION "uuid-ossp";
CREATE EXTENSION IF NOT EXISTS postgis;


CREATE TABLE addresses (
id bigint NOT NULL,
latitude double precision,
longitude double precision,
line1 character varying NOT NULL,
"position" geography(Point, 4326),
CONSTRAINT enforce_srid CHECK ((st_srid("position") = 4326))
);

CREATE INDEX index_addresses_on_position ON addresses USING gist ("position");

CREATE TABLE locations (
id bigint NOT NULL,
uuid uuid DEFAULT uuid_generate_v4 () NOT NULL,
address_id bigint NOT NULL
);

CREATE TABLE shops (
id bigint NOT NULL,
name character varying NOT NULL,
location_id bigint NOT NULL
);

CREATE TABLE inventories (
id bigint NOT NULL,
shop_id bigint NOT NULL,
status character varying NOT NULL
);

The addresses table contains the geographic data. The position column is calculated from the latitude/longitude columns as rows are inserted or updated.
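How the position column gets populated is not shown here. One possible way to do it inside the database is a row-level trigger; the following is only a sketch, and the function name set_address_position and trigger name addresses_set_position are hypothetical, not part of my actual schema:

```sql
-- Hypothetical trigger function: derives "position" from the lat/lng columns.
CREATE OR REPLACE FUNCTION set_address_position() RETURNS trigger AS $$
BEGIN
    -- ST_MakePoint takes (x, y), i.e. (longitude, latitude).
    -- If either coordinate is NULL, the result is NULL as well.
    NEW."position" :=
        ST_SetSRID(ST_MakePoint(NEW.longitude, NEW.latitude), 4326)::geography;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Fires on every insert, and on updates that touch latitude or longitude.
CREATE TRIGGER addresses_set_position
BEFORE INSERT OR UPDATE OF latitude, longitude ON addresses
FOR EACH ROW EXECUTE PROCEDURE set_address_position();
```

The same value could just as well be computed by the application before writing the row; the trigger only keeps the derivation in one place.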

Each address is associated with one location.

Each address can have many shops, and each shop has one inventory.

I've omitted them for the sake of brevity, but all tables have the correct foreign key constraints and B-tree indexes on the referencing columns.

The tables have hundreds of thousands of rows.


With this schema, my main use case can be satisfied by a single query that searches for addresses within 1000 meters of a central geographic point (10.0, 10.0) and returns data from all tables:

SELECT
s.id AS shop_id,
s.name AS shop_name,
i.status AS inventory_status,
l.uuid AS location_uuid,
a.line1 AS addr_line,
a.latitude AS lat,
a.longitude AS lng
FROM addresses a
JOIN locations l ON l.address_id = a.id
JOIN shops s ON s.location_id = l.id
JOIN inventories i ON i.shop_id = s.id
WHERE ST_DWithin(
a.position, -- the position of each address
ST_SetSRID(ST_Point(10.0, 10.0), 4326), -- the center of the circle
1000, -- radius distance in meters
true
);

This query works, and EXPLAIN ANALYZE indicates that it correctly uses the GiST index.
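For reference, this is roughly how the plan can be inspected. It is a minimal sketch on the address filter alone; the BUFFERS option is my addition here and is not required. Note that EXPLAIN ANALYZE actually executes the statement:

```sql
-- Runs the query and reports the actual plan, timings and buffer usage.
EXPLAIN (ANALYZE, BUFFERS)
SELECT a.id
FROM addresses a
WHERE ST_DWithin(
    a.position,
    ST_SetSRID(ST_Point(10.0, 10.0), 4326),
    1000,  -- radius in meters
    true
);
```

If the GiST index is being used, the plan should contain an index scan or bitmap index scan on index_addresses_on_position instead of a sequential scan on addresses.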

However, I could split this query in two and manage the intermediate results at the application level. For example, this also works:

-- Search only for the addresses
SELECT
a.id AS addr_id,
a.line1 AS addr_line,
a.latitude AS lat,
a.longitude AS lng
FROM addresses a
WHERE ST_DWithin(
a.position, -- the position of each address
ST_SetSRID(ST_Point(10.0, 10.0), 4326), -- the center of the circle
1000, -- radius distance in meters
true
);

-- Get the rest of the data
SELECT
s.id AS shop_id,
s.name AS shop_name,
i.status AS inventory_status,
l.id AS location_id,
l.uuid AS location_uuid
FROM locations l
JOIN shops s ON s.location_id = l.id
JOIN inventories i ON i.shop_id = s.id
WHERE
l.address_id IN (1, 2, 3, 4, 5) -- possibly thousands of values
;

where the values in l.address_id IN (1, 2, 3, 4, 5) come from the first query.
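Since the list could contain thousands of ids, the application would probably not build the IN list as literal text. One possible variant, shown only as a sketch, passes the ids as a single bigint[] parameter; the prepared statement name shops_for_addresses is hypothetical:

```sql
-- The ids from the first query are bound as one array parameter ($1).
PREPARE shops_for_addresses(bigint[]) AS
SELECT
    s.id AS shop_id,
    s.name AS shop_name,
    i.status AS inventory_status,
    l.id AS location_id,
    l.uuid AS location_uuid
FROM locations l
JOIN shops s ON s.location_id = l.id
JOIN inventories i ON i.shop_id = s.id
WHERE l.address_id = ANY ($1);

-- Example call with the same placeholder ids as above.
EXECUTE shops_for_addresses('{1,2,3,4,5}');
```

This keeps the statement text constant regardless of how many ids are passed. I am not assuming it plans any better than the IN list; it is still a second round trip.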


The query plans for the two separate queries look simpler than the first, but I wonder if that in itself means the second solution is better.

I know that inner joins are pretty well optimized and that a single roundtrip to the DB is preferable.

What about memory usage? Or resource contention on the tables (e.g. locks)?