Thread: Slow BLOBs restoring

Slow BLOBs restoring

From: Vlad Arkhipov
Date:
I have encountered a problem while restoring the database. There is a
table that contains XML data (BLOB), ~ 3 000 000 records, ~5.5 GB of
data. pg_restore has been running for a week without any considerable
progress. There are plenty of lines like these in the log:

pg_restore: processing item 3125397 BLOB 10001967
pg_restore: executing BLOB 10001967

CPU usage is constantly at 100%. The total database size is about 100 GB and it
restores in an hour or so without BLOBs.
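
(For reference, the number of large objects can be checked with something
like

psql -d mydb -c "SELECT count(DISTINCT loid) FROM pg_largeobject"

where mydb stands for the database name; on 9.0 reading pg_largeobject may
require superuser access, and pg_largeobject_metadata can be counted
instead.)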

Re: Slow BLOBs restoring

From: Vlad Arkhipov
Date:
I investigated this issue a bit more. The -j option slows down BLOB
restoration; it's about 1000 times slower if you specify this option.
Does anybody plan to fix it?
> I have encountered a problem while restoring the database. There is a
> table that contains XML data (BLOB), ~ 3 000 000 records, ~5.5 GB of
> data. pg_restore has been running for a week without any considerable
> progress. There are plenty of lines like these in the log:
>
> pg_restore: processing item 3125397 BLOB 10001967
> pg_restore: executing BLOB 10001967
>
> CPU usage is constantly at 100%. The total database size is about 100 GB and
> it restores in an hour or so without BLOBs.
>


Re: Slow BLOBs restoring

From: Tom Lane
Date:
Vlad Arkhipov <arhipov@dc.baikal.ru> writes:
> I investigated this issue a bit more. The -j option slows down BLOB
> restoration; it's about 1000 times slower if you specify this option.

Are you by any chance restoring from an 8.3 or older pg_dump file made
on Windows?  If so, it's a known issue.

> Does anybody plan to fix it?

Not without a complete reproducible example ... and not at all if it's
the known problem.  The fix for that is to update pg_dump to 8.4 or
later.
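
(Concretely, that means re-dumping the old server with a >= 8.4 pg_dump
binary, along the lines of

/usr/pgsql-8.4/bin/pg_dump -F c -h oldhost olddb > test.backup

with the host and database names adjusted to your environment.)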

            regards, tom lane

Re: Slow BLOBs restoring

From: Vlad Arkhipov
Date:
08.12.2010 22:46, Tom Lane writes:
> Are you by any chance restoring from an 8.3 or older pg_dump file made
> on Windows?  If so, it's a known issue.
>
No, I tried Linux only.

> Not without a complete reproducible example ... and not at all if it's
> the known problem.  The fix for that is to update pg_dump to 8.4 or
> later.
>
I think you can reproduce it. First I created a database full of many
BLOBs on Postgres 8.4.5. Then I created a dump:

pg_dump -F c test > test.backup8

It took about 15 minutes. Then I tried to restore it on Postgres 8.

pg_restore -v -d test2 -j 2 test.backup8

It restored in 18 minutes. Then I restored it on Postgres 9.0.1, which took
20 minutes. Then I created a dump there:

/usr/pgsql-9.0/bin/pg_dump -F c test > test.backup9

It took 25 minutes. Finally I tried to restore it and got what I've
already described:

/usr/pgsql-9.0/bin/pg_restore -v -d test2 -j 2 test.backup9

However, if I remove the '-j' option, the database restores in 45 minutes.
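
(For what it's worth, a test database like this can be populated with a few
hundred thousand small blobs using nothing more than psql's \lo_import, e.g.
assuming a small XML file at /tmp/sample.xml:

createdb test
for i in $(seq 1 300000); do echo "\lo_import /tmp/sample.xml"; done | psql -q test

Each \lo_import creates a new large object from the same file.)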

Re: Slow BLOBs restoring

From: Tom Lane
Date:
Vlad Arkhipov <arhipov@dc.baikal.ru> writes:
> 08.12.2010 22:46, Tom Lane writes:
>> Are you by any chance restoring from an 8.3 or older pg_dump file made
>> on Windows?  If so, it's a known issue.

> No, I tried Linux only.

OK, then it's not the missing-data-offsets issue.

> I think you can reproduce it. First I created a database full of many
> BLOBs on Postgres 8.4.5. Then I created a dump:

Oh, you should have said how many was "many".  I had tried with several
thousand large blobs yesterday and didn't see any problem.  However,
with several hundred thousand small blobs, indeed it gets pretty slow
as soon as you use -j.

oprofile shows all the time is going into reduce_dependencies during the
first loop in restore_toc_entries_parallel (ie, before we've actually
started doing anything in parallel).  The reason is that for each blob,
we're iterating through all of the several hundred thousand TOC entries,
uselessly looking for anything that depends on the blob.  And to add
insult to injury, because the blobs are all marked as SECTION_PRE_DATA,
we don't get to parallelize at all.  I think we won't get to parallelize
the blob data restoration either, since all the blob data is hidden in a
single TOC entry :-(
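
(For anyone who wants to reproduce the profile: roughly, using the legacy
opcontrol interface, something like

opcontrol --init
opcontrol --no-vmlinux
opcontrol --start
/usr/pgsql-9.0/bin/pg_restore -v -d test2 -j 2 test.backup9
opcontrol --dump
opreport --symbols /usr/pgsql-9.0/bin/pg_restore | head

should show reduce_dependencies at the top; the exact opcontrol options
depend on the local oprofile setup.)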

So the short answer is "don't bother to use -j in a mostly-blobs restore,
because it isn't going to help you in 9.0".
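
(In other words, just run it single-threaded for now, e.g.

/usr/pgsql-9.0/bin/pg_restore -v -d test2 test.backup9

which in your test finished in about 45 minutes.)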

One fairly simple, if ugly, thing we could do about this is skip calling
reduce_dependencies during the first loop if the TOC object is a blob;
effectively assuming that nothing could depend on a blob.  But that does
nothing about the point that we're failing to parallelize blob
restoration.  Right offhand it seems hard to do much about that without
some changes to the archive representation of blobs.  Some things that
might be worth looking at for 9.1:

* Add a flag to TOC objects saying "this object has no dependencies",
to provide a generalized and principled way to skip the
reduce_dependencies loop.  This is only a good idea if pg_dump knows
that or can cheaply determine it at dump time, but I think it can.

* Mark BLOB TOC entries as SECTION_DATA, or somehow otherwise make them
parallelizable.  Also break the BLOBS data item apart into an item per
BLOB, so that that part's parallelizable.  Maybe we should combine the
metadata and data for each blob into one TOC item --- if we don't, it
seems like we need a dependency, which will put us back behind the
eight-ball.  I think the reason it's like this is we didn't originally
have a separate TOC item per blob; but now that we added that to support
per-blob ACL data, the monolithic BLOBS item seems pretty pointless.
(Another thing that would have to be looked at here is the dependency
between a BLOB and any BLOB COMMENT for it.)

Thoughts?

            regards, tom lane