master and sync-replica diverging - Mailing list pgsql-hackers

From Erik Rijkers
Subject master and sync-replica diverging
Date
Msg-id 879c61028d9ed551eb2da6fe93fa2616.squirrel@webmail.xs4all.nl
Responses Re: master and sync-replica diverging
List pgsql-hackers
AMD FX 8120 / centos 6.2 / latest source (git head)


It seems to be quite easy to force a 'sync' replica to not be equal to master by
recreating+loading a table in a while loop.


For this test I compiled, checked and installed three separate instances on the same machine: one
master and two replicas.  The replicas' application_name values are 'wal_receiver_$copy', where
$copy is 01 or 02.

$ ./sync_state.sh
  pid  | application_name |   state   | sync_state
-------+------------------+-----------+------------
 19520 | wal_receiver_01  | streaming | sync
 19567 | wal_receiver_02  | streaming | async
(2 rows)

 port | synchronous_commit | synchronous_standby_names
------+--------------------+---------------------------
 6564 | on                 | wal_receiver_01
(1 row)

 port | synchronous_commit | synchronous_standby_names
------+--------------------+---------------------------
 6565 | off                |
(1 row)

 port | synchronous_commit | synchronous_standby_names
------+--------------------+---------------------------
 6566 | off                |
(1 row)
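
The sync_state.sh script itself is not included in this mail. A rough sketch of what such a script might look like, assuming it queries pg_stat_replication on the master (port 6564) and then reads the two settings on each instance; the function and variable names are made up for illustration:

```shell
# Hypothetical reconstruction of sync_state.sh; the original script is not shown.
master_port=6564
all_ports="6564 6565 6566"

# SQL for the replication status as seen from the master.
replication_sql() {
    echo "select pid, application_name, state, sync_state from pg_stat_replication"
}

# SQL for the replication-related settings of a single instance.
settings_sql() {
    echo "select current_setting('port') as port, current_setting('synchronous_commit') as synchronous_commit, current_setting('synchronous_standby_names') as synchronous_standby_names"
}

# Print the replication status once, then the settings of every instance.
sync_state() {
    replication_sql | psql -qX -p "$master_port"
    for port in $all_ports; do
        settings_sql | psql -qX -p "$port"
    done
}
```

Calling sync_state with all three instances running should give output in the shape shown above.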



The test consists of creating a table and loading tab-separated data from file with COPY and then
taking the rowcount of that table (13 MB, almost 200k rows) in all three instances:


# wget http://flybase.org/static_pages/downloads/FB2012_03/genes/fbgn_annotation_ID_fb_2012_03.tsv.gz

slurp_file=fbgn_annotation_ID_fb_2012_03.tsv.gz

zcat $slurp_file | grep -v '^#' | grep -Ev '^[[:space:]]*$' | psql -c "
   drop table if exists $table cascade;
   create table $table (
       gene_symbol              text
     , primary_fbgn             text
     , secondary_fbgns          text
     , annotation_id            text
     , secondary_annotation_ids text
   );
   copy $table from stdin csv delimiter E'\t';
";

# count on master:
echo "select current_setting('port') port,count(*) from $table"|psql -qtXp 6564

# count on wal_receiver_01 (sync replica):
echo "select current_setting('port') port,count(*) from $table"|psql -qtXp 6565

# count on wal_receiver_02 (async replica):
echo "select current_setting('port') port,count(*) from $table"|psql -qtXp 6566



I expected the rowcounts from master and sync replica to always be the same.

Initially this seemed to be the case, but when I run the above sequence in a while loop for a few
minutes, about 10% of the rowcounts from the sync replica differ from the master's.

Perhaps not a likely scenario, but surely such a deviating rowcount on a sync replica should not
be possible?
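
The while loop is not spelled out in this mail. A minimal sketch of how the load-and-compare sequence could be driven, assuming ports 6564 (master) and 6565 (sync replica) as above; rowcount, compare_counts, run_test and load_table are hypothetical names, and load_table is assumed to wrap the zcat | psql pipeline shown earlier:

```shell
# Fetch the rowcount of table $2 from the instance listening on port $1.
rowcount() {
    echo "select count(*) from $2" | psql -qtAX -p "$1"
}

# Compare two rowcounts; print a line and return non-zero on mismatch.
compare_counts() {
    if [ "$1" = "$2" ]; then
        echo "ok $1"
    else
        echo "MISMATCH master=$1 replica=$2"
        return 1
    fi
}

# Run the load-and-compare sequence $1 times on table $2 and
# print "<mismatches>/<iterations>".
# load_table (not defined here) is assumed to drop, recreate and COPY the table.
run_test() {
    iterations=$1
    table=$2
    mismatches=0
    i=0
    while [ "$i" -lt "$iterations" ]; do
        load_table "$table"
        m=$(rowcount 6564 "$table")
        r=$(rowcount 6565 "$table")
        compare_counts "$m" "$r" >/dev/null || mismatches=$((mismatches + 1))
        i=$((i + 1))
    done
    echo "$mismatches/$iterations"
}
```

With load_table defined, something like `run_test 100 mytable` would report how many of 100 iterations showed a differing count on the sync replica.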


thank you,


Erik Rijkers





