Thread: I have an exporting need...

I have an exporting need...

From
Juan Hernández
Date:

Hi team!

First, I want to thank you for all the work you put into this. You are doing a fantastic job. Bless you all!

I have a special need I want to raise with you. This is not a bug; it is a need of mine, and I am writing here so I can be redirected wherever is appropriate.

I have to make a daily backup. The database grows a lot every day, and sometimes I have needed to recover just one table. It would be easier to move a 200 MB file containing only the table I need than a 5 GB file with all the tables I don't; it's just a matter of speed.

I've created a script that exports every table one by one, so if I ever need to import a table again, I don't have to use the very big dump file, just the "tablename.sql" file created for that table.
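Each line of the script is more or less like this (a simplified sketch; the real table names and credentials are different):

PGHOST="/tmp" PGPASSWORD="mydbpass" pg_dump -U dbusername -t table1 -Fp --inserts dbname > "/path/dbname_table1.sql"
PGHOST="/tmp" PGPASSWORD="mydbpass" pg_dump -U dbusername -t table2 -Fp --inserts dbname > "/path/dbname_table2.sql"
...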

My hosting provider truncated my script because it is very large (more than 200 lines, one line to export each table), so I think the way I am doing this is hurting server performance.

So, here is my question.

Would you consider it useful to add a parameter (for example, --separatetables) so that the export process automatically creates a separate tablename.sql file for each table in the database?

Example...

PGHOST="/tmp" PGPASSWORD="mydbpass" pg_dump -U dbusername --separatetables -Fp --inserts dbname > "/path/dbname.sql"

And if this database has tables table1...table10, then 10 files are created...

dbname_table1.sql
dbname_table2.sql
dbname_table3.sql
...
dbname_table8.sql
dbname_table9.sql
dbname_table10.sql


In each file, the main preamble settings would be generated again. For example, the file dbname_table1.sql...

--
-- PostgreSQL database dump
--
-- Dumped from database version 10.21
-- Dumped by pg_dump version 15.6
SET statement_timeout = 0;
SET lock_timeout = 0;
SET client_encoding = 'UTF8';
...
...
SET default_tablespace = '';
--
-- Name: table1; Type: TABLE; Schema: public; Owner: dbusername
--
CREATE TABLE public.table1 (
    code numeric(5,0),
    name character varying(20)
);


I don't know if many developers have the same need as I do. I hope this helps in the future.

Thanks for reading, and thanks for all you've done. You are doing great! Cheers!


______________
Juan de Jesús


Re: I have an exporting need...

From
David Rowley
Date:
On Tue, 14 May 2024 at 06:18, Juan Hernández <dejesusog@gmail.com> wrote:
> Would you consider it useful to add a parameter (for example, --separatetables) so that the export process
> automatically creates a separate tablename.sql file for each table in the database?
>
> Example...
>
> PGHOST="/tmp" PGPASSWORD="mydbpass" pg_dump -U dbusername --separatetables -Fp --inserts dbname > "/path/dbname.sql"
>
> And if this database has tables table1...table10, then 10 files are created...

pg_dump has code to figure out the dependencies between objects in the
database so that the dump file produced can be restored.  If one file
was output per table, how would you know in which order to restore
them? For example, if you have a table with a FOREIGN KEY referencing
some other table, you need to restore the referenced table first.
That's true for both restoring the schema and restoring the data.
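To illustrate with two hypothetical per-table files, where orders.sql
contains a table with a FOREIGN KEY to the table in customers.sql:

psql -U dbusername dbname -f customers.sql   # the referenced table first
psql -U dbusername dbname -f orders.sql      # CREATE TABLE ... REFERENCES customers

Run in the opposite order, orders.sql fails because customers does not
exist yet, and loading the orders data would likewise fail its foreign
key checks if the customers rows were not loaded first.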

David



Re: I have an exporting need...

From
Heikki Linnakangas
Date:
On 13/05/2024 16:01, Juan Hernández wrote:
> Hi team!
> 
> First, I want to thank you for all the work you put into this. You are 
> doing a fantastic job. Bless you all!
> 
> I have a special need I want to raise with you. This is not a bug; it 
> is a need of mine, and I am writing here so I can be redirected 
> wherever is appropriate.
> 
> I have to make a daily backup. The database grows a lot every day, and 
> sometimes I have needed to recover just one table. It would be easier 
> to move a 200 MB file containing only the table I need than a 5 GB 
> file with all the tables I don't; it's just a matter of speed.
> 
> I've created a script that exports every table one by one, so if I 
> ever need to import a table again, I don't have to use the very big 
> dump file, just the "tablename.sql" file created for that table.
> 
> My hosting provider truncated my script because it is very large (more 
> than 200 lines, one line to export each table), so I think the way I 
> am doing this is hurting server performance.

Some ideas for you to explore:

- Use the "pg_dump -Fcustom" format. That still creates one large file, but 
you can then use "pg_restore --table=foobar" to extract a .sql file for a 
single table from it when restoring (see the sketch below).

- The "pg_dump -Fdirectory" format does actually create one file per table. 
It's in pg_dump's internal format though, so you'll still need to use 
pg_restore to make sense of it (also sketched below).

- Use rsync to copy just the changed parts between two dumps.
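Roughly like this, for the first two ideas (a sketch; the file and table 
names are made up):

# Custom format: one big archive, but single tables can be extracted later
pg_dump -Fcustom -U dbusername dbname -f dbname.dump
pg_restore --table=table1 -f table1.sql dbname.dump

# Directory format: one file per table, in pg_dump's internal layout
pg_dump -Fdirectory -U dbusername dbname -f dbname_dir
pg_restore --table=table1 -f table1.sql dbname_dir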

> So, here is my question.
> 
> Would you consider it useful to add a parameter (for example, 
> --separatetables) so that the export process automatically creates a 
> separate tablename.sql file for each table in the database?

It'd be tricky to restore from, as you need to restore the tables in the 
right order. I think you'd still need a "main" sql file that includes 
all the other files in the right order. And using the table names as 
filenames gets tricky if the table names contain any funny characters.
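Hypothetically, a dbname_main.sql whose only job is to pull the per-table 
files in, in dependency order:

\i dbname_table2.sql
\i dbname_table1.sql
-- table2 comes first here because, say, table1 references it

(\i is psql's include command, so this main file would be run with psql.)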

For manual operations, yeah, I can see it being useful nevertheless.

-- 
Heikki Linnakangas
Neon (https://neon.tech)




Re: I have an exporting need...

From
Tom Lane
Date:
Heikki Linnakangas <hlinnaka@iki.fi> writes:
> On 13/05/2024 16:01, Juan Hernández wrote:
>> Would you consider it useful to add a parameter (for example,
>> --separatetables) so that the export process automatically creates a
>> separate tablename.sql file for each table in the database?

> It'd be tricky to restore from, as you need to restore the tables in the 
> right order. I think you'd still need a "main" sql file that includes 
> all the other files in the right order. And using the table names as 
> filenames gets tricky if the table names contain any funny characters.

It's a lot worse than that, as it's entirely possible to have circular
FK dependencies, meaning there is no "right order" if you think of
each table file as self-contained DDL plus data.  Other sorts of
circularities are possible too.

pg_dump deals with that hazard by splitting things up: first create
all the tables, then load all the data, then create all the indexes
and foreign keys.  You can tell it to just emit the parts relevant to
a particular table, but it's on your head whether that's actually
going to be useful in your context.  I doubt that it's widely enough
useful to justify creating a special mode beyond what we already
have.
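For example, something like this, using options that already exist (the
table name is made up):

# Schema, data, and indexes/foreign keys for one table, as three files:
pg_dump --table=table1 --section=pre-data dbname > table1_pre.sql
pg_dump --table=table1 --section=data dbname > table1_data.sql
pg_dump --table=table1 --section=post-data dbname > table1_post.sql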

            regards, tom lane



Re: I have an exporting need...

From
Juan Hernández
Date:
Hi team!

I have read all your comments, and they are leading me to learn more.

For me and my case it would be useful, even though there are other ways to solve this; but I may be wrong and may just need to learn more about maintenance, backup, and recovery tasks.

What if, when the --separatetables option is used, the table definition and data were exported, and the index, foreign key, and relation declarations were exported too, but commented out, with a notice? Just an idea.
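For example, the end of dbname_table1.sql could then look something like
this (hypothetical output; the exact wording is just an idea):

-- NOTICE: this constraint was not restored automatically. Run it manually
-- once the referenced table's data has been loaded:
-- ALTER TABLE public.table1 ADD CONSTRAINT table1_code_fkey
--     FOREIGN KEY (code) REFERENCES public.table2 (code);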

Thank you all and best regards!

Juan de Jesús


