Thread: pg_restore: error returned by PQputline

pg_restore: error returned by PQputline

From
Clodoaldo Pinto Neto
Date:
Hi all,

Trying to restore actual data into a development database:
$ pg_dump -Fc KakaoStats > KakaoStats.dump
$ pg_restore -a -d ksDesenv KakaoStats.dump
pg_restore: [compressão de arquivo (banco de dados)] erro retornado
pelo PQputline -- [file compression (database)] error returned by
PQputline

The database is about 3GB
postgresql 7.4.2, FC2, 768 MB memory
$ uname -a
Linux s1 2.6.8-1.521 #1 Mon Aug 16 09:01:18 EDT 2004 i686 athlon i386
GNU/Linux

During some minutes the machine was irresponsive. A litle before that
free showed:
$ free
             total       used       free     shared    buffers
cached
Mem:        759600     755712       3888          0       8380
70144
-/+ buffers/cache:     677188      82412
Swap:      1539712    1539712          0

From postgreql.conf:
shared_buffers = 3000           # min 16, at least max_connections*2,
8KB each
sort_mem = 65536                # min 64, size in KB
vacuum_mem = 65536              # min 1024, size in KB
max_fsm_pages = 300000          # min max_fsm_relations*16, 6 bytes
each
#max_fsm_relations = 1000       # min 100, ~50 bytes each






_______________________________________________________
Yahoo! Acesso Grátis - navegue de graça com conexão de qualidade!
http://br.acesso.yahoo.com/

Re: pg_restore: error returned by PQputline

From
Tom Lane
Date:
=?iso-8859-1?q?Clodoaldo=20Pinto=20Neto?= <clodoaldo_pinto@yahoo.com.br> writes:
> Trying to restore actual data into a development database:
> $ pg_dump -Fc KakaoStats > KakaoStats.dump
> $ pg_restore -a -d ksDesenv KakaoStats.dump
> pg_restore: [compress�o de arquivo (banco de dados)] erro retornado
> pelo PQputline -- [file compression (database)] error returned by
> PQputline

Can't tell a thing from that --- presumably there was some error on the
backend side, but this message isn't telling us what.  Look in the
postmaster's log to see what happened.

            regards, tom lane

Re: pg_restore: error returned by PQputline

From
Clodoaldo Pinto Neto
Date:
 --- Tom Lane <tgl@sss.pgh.pa.us> escreveu:
> Look in the
> postmaster's log to see what happened.
>
LOG:  recycled transaction log file "0000002B0000007B"
LOG:  recycled transaction log file "0000002B0000007A"
LOG:  recycled transaction log file "0000002B00000079"
LOG:  checkpoints are occurring too frequently (17 seconds apart)
HINT:  Consider increasing the configuration parameter
"checkpoint_segments".
LOG:  server process (PID 9576) was terminated by signal 9
LOG:  terminating any other active server processes
WARNING:  terminating connection because of crash of another server
process
DETAIL:  The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and
repeat your command.
CONTEXT:  writing block 767665 of relation 463888/463894
LOG:  all server processes terminated; reinitializing
LOG:  database system was interrupted at 2004-08-29 22:08:55 BRT
LOG:  checkpoint record is at 2B/81896F18
LOG:  redo record is at 2B/7F6E7230; undo record is at 0/0; shutdown
FALSE
LOG:  next transaction ID: 41275; next OID: 882451
LOG:  database system was not properly shut down; automatic recovery in
progress
LOG:  redo starts at 2B/7F6E7230
LOG:  unexpected pageaddr 2B/76C1E000 in log file 43, segment 130,
offset 12705792
LOG:  redo done at 2B/82C1BD08
LOG:  recycled transaction log file "0000002B0000007D"
LOG:  recycled transaction log file "0000002B0000007C"
LOG:  removing transaction log file "0000002B0000007E"
LOG:  database system is ready

From postgresql.conf:
#checkpoint_segments = 3        # in logfile segments, min 1, 16MB each

/var/log/messages:
Aug 29 22:31:31 s1 kernel: oom-killer: gfp_mask=0x1d2
Aug 29 22:31:33 s1 kernel: DMA per-cpu:
Aug 29 22:31:33 s1 kernel: cpu 0 hot: low 2, high 6, batch 1
Aug 29 22:31:33 s1 kernel: cpu 0 cold: low 0, high 2, batch 1
Aug 29 22:31:33 s1 kernel: Normal per-cpu:
Aug 29 22:31:33 s1 kernel: cpu 0 hot: low 32, high 96, batch 16
Aug 29 22:31:33 s1 kernel: cpu 0 cold: low 0, high 32, batch 16
Aug 29 22:31:33 s1 kernel: HighMem per-cpu: empty
Aug 29 22:31:33 s1 kernel:
Aug 29 22:31:33 s1 kernel: Free pages:        3408kB (0kB HighMem)
Aug 29 22:31:33 s1 kernel: Active:100779 inactive:81540 dirty:0
writeback:0 unstable:0 free:852 slab:2537 mapped:174822 pagetables:1417
Aug 29 22:31:33 s1 kernel: DMA free:1744kB min:16kB low:32kB high:48kB
active:5404kB inactive:5196kB present:16384kB
Aug 29 22:31:33 s1 kernel: protections[]: 8 436 436
Aug 29 22:31:33 s1 kernel: Normal free:1664kB min:856kB low:1712kB
high:2568kB active:397712kB inactive:320964kB present:753600kB
Aug 29 22:31:33 s1 kernel: protections[]: 0 428 428
Aug 29 22:31:34 s1 kernel: HighMem free:0kB min:128kB low:256kB
high:384kB active:0kB inactive:0kB present:0kB
Aug 29 22:31:34 s1 kernel: protections[]: 0 0 0
Aug 29 22:31:35 s1 kernel: DMA: 4*4kB 8*8kB 14*16kB 1*32kB 0*64kB
1*128kB 1*256kB 2*512kB 0*1024kB 0*2048kB 0*4096kB = 1744kB
Aug 29 22:31:35 s1 kernel: Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB
1*128kB 0*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 1664kB
Aug 29 22:31:36 s1 kernel: HighMem: empty
Aug 29 22:31:36 s1 kernel: Swap cache: add 1273826, delete 1273816,
find 26834/46799, race 0+109
Aug 29 22:31:36 s1 kernel: Out of Memory: Killed process 9576
(postmaster).

Regards,
Clodoaldo Pinto





_______________________________________________________
Yahoo! Acesso Grátis - navegue de graça com conexão de qualidade!
http://br.acesso.yahoo.com/

Re: pg_restore: error returned by PQputline

From
Tom Lane
Date:
=?iso-8859-1?q?Clodoaldo=20Pinto=20Neto?= <clodoaldo_pinto@yahoo.com.br> writes:
>  --- Tom Lane <tgl@sss.pgh.pa.us> escreveu:
>> Look in the postmaster's log to see what happened.

> LOG:  server process (PID 9576) was terminated by signal 9

Well, there's a smoking gun ...

Assuming that you didn't deliberately kill -9 your server process,
the only mechanism I've heard of that causes this is the Linux kernel's
out-of-memory killer --- see
http://www.postgresql.org/docs/7.4/static/kernel-resources.html#AEN17068
Googling for "OOM kill" will give you additional gory details :-(.

            regards, tom lane

Re: pg_restore: error returned by PQputline

From
Richard Huxton
Date:
Tom Lane wrote:
> =?iso-8859-1?q?Clodoaldo=20Pinto=20Neto?= <clodoaldo_pinto@yahoo.com.br> writes:
>
>> --- Tom Lane <tgl@sss.pgh.pa.us> escreveu:
>>
>>>Look in the postmaster's log to see what happened.
>
>
>>LOG:  server process (PID 9576) was terminated by signal 9
>
>
> Well, there's a smoking gun ...
>
> Assuming that you didn't deliberately kill -9 your server process,
> the only mechanism I've heard of that causes this is the Linux kernel's
> out-of-memory killer

In fact, looking at the original post, the evidence is there in the free
output - swap all used up, no shared/buffers left to speak of...

 > $ free
 >              total       used       free     shared    buffers
 > cached
 > Mem:        759600     755712       3888          0       8380
 > 70144
 > -/+ buffers/cache:     677188      82412
 > Swap:      1539712    1539712          0


The sort_mem might be too high at 64MB - that's memory per sort and I
don't know how many simultaneous sorts you might have.
--
   Richard Huxton
   Archonet Ltd

Re: pg_restore: error returned by PQputline

From
Clodoaldo Pinto Neto
Date:
 --- Richard Huxton <dev@archonet.com> escreveu:
> The sort_mem might be too high at 64MB - that's memory per sort and I
>
> don't know how many simultaneous sorts you might have.
> --
Lowered it to 32MB and restarted the server with the same bad result.

How many simultaneous sorts a restore can do? Will it build more than
one index at a time? I know nothing about dumping and restoring or
database internals. Just following the recipe.

Regards,
Clodoaldo Pinto





_______________________________________________________
Yahoo! Acesso Grátis - navegue de graça com conexão de qualidade!
http://br.acesso.yahoo.com/

Re: pg_restore: error returned by PQputline

From
Clodoaldo Pinto Neto
Date:
 --- Tom Lane <tgl@sss.pgh.pa.us> escreveu:
> Assuming that you didn't deliberately kill -9 your server process,

I didn't

> the only mechanism I've heard of that causes this is the Linux
> kernel's
> out-of-memory killer --- see
>
http://www.postgresql.org/docs/7.4/static/kernel-resources.html#AEN17068
> Googling for "OOM kill" will give you additional gory details :-(.

I will try the solution suggested in the docs: sysctl -w
vm.overcommit_memory=2

Has the -c option any effect on memory usage?
$ pg_restore -c -a -d ksDesenv KakaoStats.dump

Regards,
Clodoaldo Pinto





_______________________________________________________
Yahoo! Acesso Grátis - navegue de graça com conexão de qualidade!
http://br.acesso.yahoo.com/

Re: pg_restore: error returned by PQputline

From
Gaetano Mendola
Date:
Tom Lane wrote:

> =?iso-8859-1?q?Clodoaldo=20Pinto=20Neto?= <clodoaldo_pinto@yahoo.com.br> writes:
>
>> --- Tom Lane <tgl@sss.pgh.pa.us> escreveu:
>>
>>>Look in the postmaster's log to see what happened.
>
>
>>LOG:  server process (PID 9576) was terminated by signal 9
>
>
> Well, there's a smoking gun ...
>
> Assuming that you didn't deliberately kill -9 your server process,
> the only mechanism I've heard of that causes this is the Linux kernel's
> out-of-memory killer --- see
> http://www.postgresql.org/docs/7.4/static/kernel-resources.html#AEN17068
> Googling for "OOM kill" will give you additional gory details :-(.

Is for sure his messages contain:

Aug 29 22:31:36 s1 kernel: Out of Memory: Killed process 9576
(postmaster).



Regards
Gaetano Mendola



Re: pg_restore: error returned by PQputline

From
Clodoaldo Pinto Neto
Date:
>  --- Tom Lane <tgl@sss.pgh.pa.us> escreveu:
> > the only mechanism I've heard of that causes this is the Linux
> > kernel's
> > out-of-memory killer --- see
> >
>
http://www.postgresql.org/docs/7.4/static/kernel-resources.html#AEN17068
> > Googling for "OOM kill" will give you additional gory details :-(.
>
I did:
# /sbin/sysctl -w vm.overcommit_memory=2
vm.overcommit_memory = 2

But it didn't help much:

TopMemoryContext: 32768 total in 3 blocks; 6512 free (3 chunks); 26256
used
TopTransactionContext: 8192 total in 1 blocks; 8136 free (0 chunks); 56
used
DeferredTriggerXact: 1414520832 total in 179 blocks; 3016 free (10
chunks); 1414517816 used
MessageContext: 8192 total in 1 blocks; 7008 free (1 chunks); 1184 used
PortalMemory: 8192 total in 1 blocks; 8040 free (0 chunks); 152 used
PortalHeapMemory: 23552 total in 4 blocks; 20800 free (20 chunks); 2752
used
ExecutorState: 8192 total in 1 blocks; 7944 free (0 chunks); 248 used
ExprContext: 8192 total in 1 blocks; 8016 free (0 chunks); 176 used
CacheMemoryContext: 516096 total in 6 blocks; 256216 free (3 chunks);
259880 used
usuarios_data: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
usuarios_temp_ndx: 1024 total in 1 blocks; 320 free (0 chunks); 704
used
pg_index_indrelid_index: 1024 total in 1 blocks; 640 free (0 chunks);
384 used
pg_amop_opc_strategy_index: 1024 total in 1 blocks; 320 free (0
chunks); 704 used
pg_shadow_usename_index: 1024 total in 1 blocks; 640 free (0 chunks);
384 used
pg_amop_opr_opc_index: 1024 total in 1 blocks; 320 free (0 chunks); 704
used
pg_conversion_oid_index: 1024 total in 1 blocks; 640 free (0 chunks);
384 used
pg_language_name_index: 1024 total in 1 blocks; 640 free (0 chunks);
384 used
pg_statistic_relid_att_index: 1024 total in 1 blocks; 320 free (0
chunks); 704 used
pg_attribute_relid_attnam_index: 1024 total in 1 blocks; 320 free (0
chunks); 704 used
pg_shadow_usesysid_index: 1024 total in 1 blocks; 640 free (0 chunks);
384 used
pg_cast_source_target_index: 1024 total in 1 blocks; 320 free (0
chunks); 704 used
pg_conversion_name_nsp_index: 1024 total in 1 blocks; 320 free (0
chunks); 704 used
pg_trigger_tgrelid_tgname_index: 1024 total in 1 blocks; 320 free (0
chunks); 704 used
pg_namespace_nspname_index: 1024 total in 1 blocks; 640 free (0
chunks); 384 used
pg_conversion_default_index: 2048 total in 1 blocks; 704 free (0
chunks); 1344 used
pg_class_relname_nsp_index: 1024 total in 1 blocks; 320 free (0
chunks); 704 used
pg_aggregate_fnoid_index: 1024 total in 1 blocks; 640 free (0 chunks);
384 used
pg_inherits_relid_seqno_index: 1024 total in 1 blocks; 320 free (0
chunks); 704 used
pg_language_oid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384
used
pg_type_typname_nsp_index: 1024 total in 1 blocks; 320 free (0 chunks);
704 used
pg_group_sysid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384
used
pg_namespace_oid_index: 1024 total in 1 blocks; 640 free (0 chunks);
384 used
pg_proc_proname_args_nsp_index: 2048 total in 1 blocks; 704 free (0
chunks); 1344 used
pg_opclass_am_name_nsp_index: 2048 total in 1 blocks; 768 free (0
chunks); 1280 used
pg_group_name_index: 1024 total in 1 blocks; 640 free (0 chunks); 384
used
pg_proc_oid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384
used
pg_operator_oid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384
used
pg_amproc_opc_procnum_index: 1024 total in 1 blocks; 320 free (0
chunks); 704 used
pg_index_indexrelid_index: 1024 total in 1 blocks; 640 free (0 chunks);
384 used
pg_operator_oprname_l_r_n_index: 2048 total in 1 blocks; 704 free (0
chunks); 1344 used
pg_opclass_oid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384
used
pg_rewrite_rel_rulename_index: 1024 total in 1 blocks; 320 free (0
chunks); 704 used
pg_type_oid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384
used
pg_attribute_relid_attnum_index: 1024 total in 1 blocks; 320 free (0
chunks); 704 used
pg_class_oid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384
used
MdSmgr: 8192 total in 1 blocks; 6096 free (0 chunks); 2096 used
DynaHash: 8192 total in 1 blocks; 6680 free (0 chunks); 1512 used
DynaHashTable: 8192 total in 1 blocks; 5080 free (0 chunks); 3112 used
DynaHashTable: 8192 total in 1 blocks; 2008 free (0 chunks); 6184 used
DynaHashTable: 8192 total in 1 blocks; 3016 free (0 chunks); 5176 used
DynaHashTable: 8192 total in 1 blocks; 4040 free (0 chunks); 4152 used
DynaHashTable: 24576 total in 2 blocks; 13240 free (4 chunks); 11336
used
DynaHashTable: 0 total in 0 blocks; 0 free (0 chunks); 0 used
DynaHashTable: 0 total in 0 blocks; 0 free (0 chunks); 0 used
DynaHashTable: 0 total in 0 blocks; 0 free (0 chunks); 0 used
DynaHashTable: 0 total in 0 blocks; 0 free (0 chunks); 0 used
DynaHashTable: 0 total in 0 blocks; 0 free (0 chunks); 0 used
DynaHashTable: 0 total in 0 blocks; 0 free (0 chunks); 0 used
DynaHashTable: 0 total in 0 blocks; 0 free (0 chunks); 0 used
DynaHashTable: 0 total in 0 blocks; 0 free (0 chunks); 0 used
DynaHashTable: 0 total in 0 blocks; 0 free (0 chunks); 0 used
DynaHashTable: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ErrorContext: 8192 total in 1 blocks; 8176 free (4 chunks); 16 used
ERROR:  out of memory
DETAIL:  Failed on request of size 44.
CONTEXT:  COPY usuarios, line 19646042: "428279 1251    10362   11
27"

Regards,
Clodoaldo Pinto





_______________________________________________________
Yahoo! Acesso Grátis - navegue de graça com conexão de qualidade!
http://br.acesso.yahoo.com/

Re: pg_restore: error returned by PQputline

From
Tom Lane
Date:
=?iso-8859-1?q?Clodoaldo=20Pinto=20Neto?= <clodoaldo_pinto@yahoo.com.br> writes:
>> --- Tom Lane <tgl@sss.pgh.pa.us> escreveu:
>>> Googling for "OOM kill" will give you additional gory details :-(.
>>
> I did:
> # /sbin/sysctl -w vm.overcommit_memory=2
> vm.overcommit_memory = 2

> But it didn't help much:

> DeferredTriggerXact: 1414520832 total in 179 blocks; 3016 free (10
> chunks); 1414517816 used

Well, at least now we can see *why* you have an issue: the
pending-trigger list is getting too big.

What you probably need to do is disable user triggers and/or drop
foreign key constraints while you load the data, then reinstall
these things.

The normal pg_restore sequence is designed to avoid this problem
by doing things in the "right" order.  However, if you restore
schema and data separately then you can easily run into this problem.

            regards, tom lane