Hi,
@Francisco - Yeah, the file is around 600 MB currently, uncompressed.
You're right, our internet connection is going to be the limiting factor.
Essentially, the PostgreSQL server is in a datacentre, while the server we're dumping to is in the office.
Running a script on the PostgreSQL server in the datacentre is going to be tricky (not so much technically, just from a procedures/security point of view).
Dumping to a spare table is an interesting idea - so we'd create the table, COPY the query results into it, paginate through it with LIMIT/OFFSET, then drop the table afterwards?
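For what it's worth, here's roughly how I'm picturing that, as a minimal sketch assuming psycopg2 on the office side; the table names, connection details and page size below are all made up:

# A minimal sketch of the spare-table approach, assuming psycopg2.
# "source_table", "export_scratch", the connection string and PAGE_SIZE
# are placeholders for illustration only.
import psycopg2

PAGE_SIZE = 50000

conn = psycopg2.connect("dbname=mydb")  # hypothetical connection details

# Materialise the query results once, server-side.
with conn, conn.cursor() as cur:
    cur.execute("CREATE TABLE export_scratch AS SELECT * FROM source_table")

# Page through the scratch table; ORDER BY keeps the pages deterministic.
offset = 0
with conn, conn.cursor() as cur:
    while True:
        cur.execute(
            "SELECT * FROM export_scratch ORDER BY id LIMIT %s OFFSET %s",
            (PAGE_SIZE, offset),
        )
        rows = cur.fetchall()
        if not rows:
            break
        # ... append rows to the local dump file here ...
        offset += PAGE_SIZE

# Clean up.
with conn, conn.cursor() as cur:
    cur.execute("DROP TABLE export_scratch")
conn.close()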
Currently, I'm doing a quick hack where we download an ordered list of the ids (an auto-incrementing integer) into Python, chunk it into groups of ids, then use a WHERE ... IN clause to download each chunk via COPY.
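For reference, the current hack looks roughly like this (again a sketch assuming psycopg2; the table name, chunk size and output filename are illustrative):

# A minimal sketch of the current id-chunking hack, assuming psycopg2.
# "big_table", CHUNK_SIZE and "dump.csv" are placeholders.
import psycopg2

CHUNK_SIZE = 10000

conn = psycopg2.connect("dbname=mydb")  # hypothetical connection details

# Pull down the full ordered list of ids first.
with conn, conn.cursor() as cur:
    cur.execute("SELECT id FROM big_table ORDER BY id")
    ids = [row[0] for row in cur.fetchall()]

# Fetch each chunk with COPY and append it to a local file.
with open("dump.csv", "wb") as out:
    with conn.cursor() as cur:
        for start in range(0, len(ids), CHUNK_SIZE):
            chunk = ids[start:start + CHUNK_SIZE]
            id_list = ",".join(str(i) for i in chunk)  # ids are ints, safe to inline
            sql = ("COPY (SELECT * FROM big_table WHERE id IN (%s) ORDER BY id) "
                   "TO STDOUT WITH CSV") % id_list
            cur.copy_expert(sql, out)
conn.close()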
Would dumping to a spare table and paginating be a better approach? If so, why? (Not challenging it, I just want to understand everything.)
Cheers,
Victor