postgresql – What could be the cause of postgres functions losing or swapping values from a CTE?

I have a number of functions which are generally in this shape:

with cte1 as (
    select a, b, c, d, e
    from (some_table)
    where (some_condition)
),
cte2 as (
    select x, y, z
    from (big_table)
    where (bunch_of_conditions)
)
select foo.x, foo.y, bar.a, bar.b, bar.c, bar.d
from cte2 foo
join cte1 bar on foo.z = bar.a

I have seen occasions where the values of bar.b and bar.d are swapped, and others where no rows are returned at all. On investigation, the empty result turned out to be because bar.a contained only nulls.

When I take the code out of the functions and run it directly, it always runs correctly. In each case where it goes wrong inside a function, I cannot point to anything that triggers the glitch, and usually it goes away for no obvious reason. This is all running in pgAdmin under Windows 10, and the behaviour persists through stopping pgAdmin and/or rebooting the PC.

The db where this happens is running on AWS, and I don’t have access to stop/restart that server.

This suggests to me that the problem lies at the AWS and/or postgres server end, but that may or may not be correct.

Possibly also relevant: we have seen cases where queries run exceptionally slowly (45 minutes instead of less than one second, to take one example from today).

In each case of the swap or the slowdown, behaviour eventually returned to normal without any obvious intervention from me. The null problem, however, is still occurring: bar.a is null where it should be populated.

Has anyone seen behaviour like this? Any ideas what could be causing it, or how to remedy it?