postgresql – Why is the Postgres replication write_lag measured in seconds?

I run pg_receivewal locally without compression; the segments are stored on the same drive as the DB files. Here are the replication statistics:

imbolc=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 12920                   
usesysid         | 10                           
usename          | postgres                
application_name | pg_receivewal                
client_addr      |               
client_hostname  |                
client_port      | -1             
backend_start    | 2019-08-25 13:18:00.504531+07
backend_xmin     |               
state            | streaming                    
sent_lsn         | C/80009218              
write_lsn        | C/80009218                   
flush_lsn        |                         
replay_lsn       |                              
write_lag        | 00:00:09.838314         
flush_lag        | 00:00:40.028949              
replay_lag       | 00:00:40.028949
sync_priority    | 0              
sync_state       | async

Even when I set synchronous_commit = off, write_lag remains significant, over 2.5 seconds.
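For what it's worth, the same lag can also be checked in bytes rather than time (assuming the PostgreSQL 10+ function names):

select application_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), write_lsn) as write_lag_bytes
from pg_stat_replication;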

So my questions are:

  • Why is write_lag that high?
  • Does this mean that in the event of a database crash, the last 10 seconds of transactions may be lost?
  • Is there a way to improve it?

postgresql – Regex character class [:print:] in Postgres

I'm trying to replicate, in Postgres, an Oracle check condition that rejects non-printable characters, and I'm at a loss. I thought Postgres could handle [:print:] and [:ascii:] as synonyms?!?


select REGEXP_REPLACE('bla', '[^[:print:]]', '(X)','g') ;
 regexp_replace 
----------------
 bla
(1 row)

select REGEXP_REPLACE('bla'||chr(10)||'bla', '[^[:print:]]', '(X)','g') ;
 regexp_replace 
----------------
 bla(X)bla
(1 row)

select REGEXP_REPLACE('Ҕ', '[^[:print:]]', '(X)','g') ;
 regexp_replace 
----------------
 (X)
(1 row)

select REGEXP_REPLACE('ñino', '[^[:print:]]', '(X)','g') ;
 regexp_replace 
----------------
 (X)ino
(1 row)

Why are Ҕ and ñ caught as non-printable?

Any guidance is appreciated.
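A small probe like the following shows which code points an installation's [:print:] class accepts (assuming a UTF-8 database, where chr() takes Unicode code points; the range is arbitrary):

select chr(c) as ch, chr(c) ~ '[[:print:]]' as is_print
from generate_series(240, 250) as g(c);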

postgresql – How do I handle `.partial` WAL segments generated by the Postgres `pg_receivewal` command?

pg_receivewal creates a .partial file for the incomplete WAL segment. My question is how to handle this file during recovery. If I leave it as it is, the last transactions are not restored. If I rename the file, removing the .partial suffix, the recovery procedure fails with the error:

FATAL:  archive file "000000010000000C00000080" has wrong size: 152 instead of 16777216

The only way I have found is to remove this .partial segment after the previous recovery attempt has crashed and then start the process again. Done this way, the DB seems to be fully restored.
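For reference, the recovery here goes through an ordinary restore_command that copies segments back from the archive, roughly of this form (the archive path is a placeholder):

restore_command = 'cp /path/to/wal_archive/%f %p'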

What is the right way to deal with these .partial segments?

My DB version is 11.5.

postgresql – Passing Postgres subquery results to a function returning TABLE

I have a function that accepts arrays as parameters and returns a table. I can pass the arguments correctly, but the call returns rows of type record. When I try to expand the record by moving the call into FROM, I get syntax errors:

SELECT
  zdb.filters(
      sub.idx
    , sub.labels
    , variadic sub.filters
    ) 
    FROM (
      SELECT
        'conversations_zdx' as idx
        , ARRAY_AGG(id)::TEXT[] as labels
        , ARRAY_AGG(filters) as filters
        FROM conversation_views 
        group by organization_id
) as sub;
+------------------------------------------+
| filters                                  |
|------------------------------------------|
| (32a430c6-eb33-4ac1-8ce7-9fb64e5b465c,1) |
| (a9177e44-95bf-4b1b-8724-b48efc97e3cf,4) |
+------------------------------------------+

The function signature is as follows:

FUNCTION zdb.filters(
    index regclass,
    labels text[],
    filters JSON[])
RETURNS TABLE (
    label text,
    doc_count bigint)
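For reference, the kind of FROM-clause expansion being attempted looks roughly like this (a sketch using LATERAL; the names simply mirror the query above and are not verified here):

SELECT f.label, f.doc_count
FROM (
      SELECT
        'conversations_zdx' as idx
        , ARRAY_AGG(id)::TEXT[] as labels
        , ARRAY_AGG(filters) as filters
        FROM conversation_views
        group by organization_id
) as sub,
LATERAL zdb.filters(sub.idx, sub.labels, variadic sub.filters) as f;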

restore – Postgres completely ignores the settings specified in the postgresql.conf file

Every time I start the server, it seems to completely ignore all configuration settings. My data directory is on an external drive, so I start it with pg_ctl -D /Volumes/Data/Postgres start.

Only localhost connections on port 5432 are accepted, which I have confirmed with show listen_addresses, although the configuration file contains the following under its connection settings:

#listen_addresses = '*' # what IP address(es) to listen on;
                    # comma-separated list of addresses;
                    # defaults to 'localhost'; use '*' for all
                    # (change requires restart)
#port = 9000                # (change requires restart)

I also checked the postgresql.auto.conf file, which apparently overrides the configuration file; it is completely empty.
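A quick way to see which file and line the running server actually took these settings from (sourcefile and sourceline are shown only to superusers):

select name, setting, source, sourcefile, sourceline
from pg_settings
where name in ('listen_addresses', 'port');

show config_file;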

For the record, the database has crashed a few times because of power interruptions / connection problems … but it still seems to work fine, because I can connect locally …

postgresql – Array containment queries in Postgres

A question about GIN indexes on arrays.

I have 2 million rows in a work table, and I have to find the work that a user can do based on their skills. A user can have any number of skills.

I started with the standard relational (RDBMS) approach, but query performance was bad. While looking for other options I found that Postgres supports array containment queries, and such arrays can also be indexed.

Table:

CREATE TABLE
    work
    (
        work_id TEXT DEFAULT nextval('work_id_seq'::regclass) NOT NULL,
        priority_score BIGINT NOT NULL,
        work_data JSONB,
        created_date TIMESTAMP(6) WITHOUT TIME ZONE NOT NULL,
        current_status CHARACTER VARYING,
        PRIMARY KEY (work_id)
    );

Index:

CREATE INDEX test_gin_1 ON work USING gin (jsonarray2intarray((work_data ->> 'skills'::text)));

Function: 

CREATE OR REPLACE FUNCTION jsonarray2intarray" (text)  RETURNS integer()
  IMMUTABLE
AS $body$
SELECT translate($1, '()', '{}')::int()
$body$ LANGUAGE sql

Sample data:

282941 1564 {"skills": [213, 311, 374, 554]}
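For clarity, the helper just rewrites a JSON array literal into a Postgres array literal:

select jsonarray2intarray('[213, 311, 374, 554]');
-- {213,311,374,554}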

The response is slow with the following query. There is only one record with 254, 336, 391, 485 as its skills array:

with T as (
SELECT   work_id,
        priority_score,
        current_status,
        work_data
FROM     work
WHERE    jsonarray2intarray( work.work_data ->> 'skills') <@ '{254,336,391,485 }'
AND      work.current_status = 'ASSIGNABLE'
ORDER BY priority_score DESC, created_date  ) 
select * from t  LIMIT 1 FOR UPDATE skip locked
Limit  (cost=45095.54..45095.56 rows=1 width=296) (actual time=3776.169..3776.170 rows=1 loops=1)                                                                                                                                                                                                                                                                                                                    
  Output: t.work_id,t.priority_score, t.current_status,t.work_data                                                                                                                                                                            
  CTE t                                                                                                                                                                                                                                                                                                                                                                                                              
    ->  Sort  (cost=45059.29..45095.54 rows=14503 width=325) (actual time=3776.166..3776.166 rows=1 loops=1)                                                                                                                                                                                                                                                                                                         
          Output: work.work_id,work.priority_score, work.current_status,work.work_data        
          Sort Key: work.priority_score DESC, work.created_date                                                                                                                                                                                                                                                                                                                                    
          Sort Method: quicksort  Memory: 25kB                                                                                                                                                                                                                                                                                                                                                                       
          ->  Bitmap Heap Scan on work  (cost=524.44..41872.83 rows=14503 width=325) (actual time=37.718..3776.159 rows=1 loops=1)                                                                                                                                                                                                                                             
                Output: work.work_id,work.priority_score, work.current_status,work.work_data  
                Recheck Cond: (jsonarray2intarray((work.work_data ->> 'skills'::text)) <@ '{254,336,391,485}'::integer[])
                Rows Removed by Index Recheck: 1072296                                                                                                                                                                                                                                                                                                                                                               
                Filter: ((work.current_status)::text = 'ASSIGNABLE'::text)                                                                                                                                                                                                                                                                                                                             
                Heap Blocks: exact=41243 lossy=26451                                                                                                                                                                                                                                                                                                                                                                 
                ->  Bitmap Index Scan on test_gin_1  (cost=0.00..520.81 rows=14509 width=0) (actual time=30.699..30.699 rows=154888 loops=1)                                                                                                                                                                                                                                                                         
                      Index Cond: (jsonarray2intarray((work.work_data ->> 'skills'::text)) <@ '{254,336,391,485}'::integer[])
  ->  CTE Scan on t  (cost=0.00..290.06 rows=14503 width=296) (actual time=3776.168..3776.168 rows=1 loops=1)                                                                                                                                                                                                                                                                                                        
        Output: t.work_id,t.priority_score, t.current_status,t.work_data                                                                                                                                                                      
Planning time: 0.161 ms                                                                                                                                                                                                                                                                                                                                                                                              
Execution time: 3776.202 ms                                                                                                                                                                                                                                                              

The same query with different inputs is fast. There are approximately 26,000 records with the skills 101, 103:

with T as (
SELECT   work_id,
        priority_score,
        current_status,
        work_data
FROM     work
WHERE    jsonarray2intarray( work.work_data ->> 'skills') <@ '{101, 103 }'
AND      work.current_status = 'ASSIGNABLE'
ORDER BY priority_score DESC, created_date  ) 
select * from t  LIMIT 1 FOR UPDATE skip locked
Limit  (cost=45076.55..45076.57 rows=1 width=296) (actual time=116.185..116.186 rows=1 loops=1)                                                                                                                                                                                                                                                                                                                      
  Output: t.work_id,t.priority_score, t.current_status,t.work_data                                                                                                                                                                         
 CTE t                                                                                                                                                                                                                                                                                                                                                                                                              
    ->  Sort  (cost=45040.26..45076.55 rows=14513 width=325) (actual time=116.182..116.182 rows=1 loops=1)                                                                                                                                                                                                                                                                                                           
          Output: work.work_id,work.priority_score, work.current_status,work.work_data        
          Sort Key: work.priority_score DESC, work.created_date                                                                                                                                                                                                                                                                                                                                    
          Sort Method: external merge  Disk: 8088kB                                                                                                                                                                                                                                                                                                                                                                  
          ->  Bitmap Heap Scan on work  (cost=476.52..41853.05 rows=14513 width=325) (actual time=9.223..94.591 rows=26301 loops=1)                                                                                                                                                                                                                                            
                Output: work.work_id,work.priority_score, work.current_status,work.work_data  
                Recheck Cond: (jsonarray2intarray((work.work_data ->> 'skills'::text)) <@ '{101,103}'::integer[])
                Filter: ((work.current_status)::text = 'ASSIGNABLE'::text)
                Rows Removed by Filter: 1357                                                                                                                                                                                                                                                                                                                                                                         
                Heap Blocks: exact=2317                                                                                                                                                                                                                                                                                                                                                                              
                ->  Bitmap Index Scan on test_gin_1  (cost=0.00..472.89 rows=14519 width=0) (actual time=4.638..4.638 rows=39871 loops=1)                                                                                                                                                                                                                                                                            
                      Index Cond: (jsonarray2intarray((work.work_data ->> 'skills'::text)) <@ '{101,103}'::integer[])
  ->  CTE Scan on t  (cost=0.00..290.26 rows=14513 width=296) (actual time=116.184..116.184 rows=1 loops=1)                                                                                                                                                                                                                                                                                                          
        Output: t.work_id,t.priority_score, t.current_status,t.work_data                                                                                                                                                                       
Planning time: 0.160 ms                                                                                                                                                                                                                                                                                                                                                                                              
Execution time: 117.278 ms                                                                                                                               

I am looking for suggestions to get consistent response times.

NOTE:
For comparison, here is the approach that is not Postgres-specific (a normalized skills join table). Its query takes about 40 to 50 seconds, which is very bad.

I used two tables:

CREATE TABLE public.work
(
    id integer NOT NULL DEFAULT nextval('work_id_seq'::regclass),
    priority_score BIGINT NOT NULL,
    work_data JSONB,
    created_date TIMESTAMP(6) WITHOUT TIME ZONE NOT NULL,
    current_status CHARACTER VARYING,
    PRIMARY KEY (id)
)

CREATE TABLE public.work_data
(
    skill_id bigint,
    work_id bigint

)

Query:

 select work.id 
    from work  
       inner join work_data on (work.id=work_data.work_id) 
    group by work.id 
    having sum(case when work_data.skill_id in (2269,3805,828,9127) then 0 else 1 end)=0 

postgresql – Problems with Postgres permissions for new views

I have an RDS Postgres database that uses only the public schema.

I expect the webservice user to gain access to all newly created views. When I create a view as the master user, I see all the grants as expected.

Here are the DEFAULT PRIVILEGES that I have set up.

ALTER DEFAULT PRIVILEGES IN SCHEMA public
GRANT SELECT, UPDATE, INSERT, DELETE ON TABLES TO webservice;

However, when the RoR migration user creates a view in the public schema, the view is owned by that user and none of the DEFAULT PRIVILEGES apply. I have to change the owner to get the grants.
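For reference, ALTER DEFAULT PRIVILEGES can also be scoped to the role that creates the objects; a sketch of that variant (migration_user is a placeholder for the migration role's actual name):

ALTER DEFAULT PRIVILEGES FOR ROLE migration_user IN SCHEMA public
GRANT SELECT, UPDATE, INSERT, DELETE ON TABLES TO webservice;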

What can I do to fix these permission issues?

postgresql performance – Postgres memory settings (RAM, work_mem, etc.) for complex text searches on indexed tsvectors

I know very little about the Postgres memory settings.
I have developed a database that mainly stores text (and some metadata).
Two tables have 10 million rows with little text (one paragraph per row).
Two more tables have 100,000 rows of full texts (about 40 pages per row).
Overall, the database size is about 10 GB.

The goal of the project is to enable users to perform complex text searches (sometimes combined with straightforward metadata queries on things like titles or dates).
In the tables described above, text searches run against indexed tsvector columns that are kept up to date automatically by a trigger (roughly the setup sketched below).
They are pretty fast on my PC (MacBook Pro 2019 – 2.6 GHz Intel Core i7 6 cores – RAM 16 GB 2400 MHz DDR4).
Unfortunately, they are quite slow on the remote ODS server (RAM 2 GB) during production.
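For context, a minimal sketch of such a trigger-maintained tsvector setup (documents, body and tsv are placeholder names, not the real schema; PostgreSQL 11+ syntax):

ALTER TABLE documents ADD COLUMN tsv tsvector;

CREATE INDEX documents_tsv_idx ON documents USING gin (tsv);

-- keep tsv in sync with the text column on every write
CREATE TRIGGER documents_tsv_update
    BEFORE INSERT OR UPDATE ON documents
    FOR EACH ROW
    EXECUTE FUNCTION tsvector_update_trigger(tsv, 'pg_catalog.english', body);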

Hence my question:
What are the ideal memory settings for complex text searches on indexed tsvectors?

Does it help if the server's RAM is larger than the database size (in my case 10 GB), so that the database can be cached entirely in RAM?

Should I increase work_mem or other settings?

Does it help to reduce the number of queried tables, or is it irrelevant?
(Let's say if users are only allowed to query the tables with paragraphs and not the tables with full texts)

Many thanks for your help!

postgresql – Postgres trigger that updates a materialized view is missing a record

We perform several inserts into a table that has a statement-level trigger to refresh a materialized view after the insertions (roughly the setup sketched below). Although all inserts into the table complete correctly, the last record inserted into the source table does not show up in the materialized view. Any ideas why the last commit is not reflected in the materialized view?
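A minimal sketch of the kind of statement-level refresh trigger in question (source_table and my_matview are placeholder names; PostgreSQL 11+ syntax):

CREATE OR REPLACE FUNCTION refresh_my_matview() RETURNS trigger AS $$
BEGIN
    -- re-run the materialized view's query and replace its contents
    REFRESH MATERIALIZED VIEW my_matview;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER refresh_my_matview_trg
    AFTER INSERT ON source_table
    FOR EACH STATEMENT
    EXECUTE FUNCTION refresh_my_matview();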