Thread: Generate test data inserts - 1GB
Hello
PostgreSQL 9.6
I need to generate 1GB of test data in as little time as possible. I found some techniques online, but they take around 40 minutes for 400GB. Is there a quicker way?
Thanks.
On Fri, 9 Aug 2019, 20:08 Adrian Klaver, <adrian.klaver@aklaver.com> wrote:
> On 8/9/19 4:12 AM, Shital A wrote:
>> Hello
>>
>> PostgreSQL 9.6
>>
>> I need to generate 1GB of test data in as little time as possible. I
>> found some techniques online, but they take around 40 minutes for
>> 400GB. Is there a quicker way?
>
> 1) Postgres version?
>
> 2) Data going into single table or multiple tables?
>
> 3) What is the time frame you are trying to achieve?
>
> 4) What techniques have you tried?
>
> 5) If you need only 1GB why the 400GB number?
>
>> Thanks.
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com
Hello,
Sorry, 400GB was a typo. It's 400 MB. Details are below:
1) Postgres version?
9.6
2) Data going into single table or multiple tables?
A single table with multiple columns.
3) What is the time frame you are trying to achieve?
As quick as possible.
4) What techniques have you tried?
An INSERT INTO using a WITH statement, inserting 2,000,000 rows at a time. This takes 40 minutes.
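Roughly along these lines (an illustrative sketch only; the table and columns are placeholders for the real ones):

WITH rows AS (
    SELECT generate_series(1, 2000000) AS g
)
INSERT INTO test_table (id, payload)
SELECT g, md5(g::text)
FROM rows;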
5) If you need only 1GB why the 400GB number?
That was 400MB. I am checking the size using the \l+ option.
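For reference, the equivalent of that size check as a query:

SELECT pg_size_pretty(pg_database_size(current_database()));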
Thanks.
On 8/9/19 8:14 AM, Shital A wrote:
> 4) What techniques have you tried?
> An INSERT INTO using a WITH statement, inserting 2,000,000 rows at a
> time. This takes 40 minutes.

Might take a look at COPY:

https://www.postgresql.org/docs/11/sql-copy.html

or its psql equivalent: \copy.

COPY runs as the server user and sees files relative to the server location. \copy runs as the psql (client) user. In either case, suitability will depend on where you are sourcing the test data from and what format it is in to begin with.

If you want a more sophisticated wrapper over the above, then:

https://ossc-db.github.io/pg_bulkload/pg_bulkload.html

--
Adrian Klaver
adrian.klaver@aklaver.com
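For example, a client-side load with \copy might look like this (the table, columns, and file path are placeholders):

\copy test_table (id, payload) FROM '/tmp/test_data.csv' WITH (FORMAT csv)

The server-side equivalent, reading a file local to the server:

COPY test_table (id, payload) FROM '/srv/test_data.csv' WITH (FORMAT csv);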
On Fri, 9 Aug 2019, 21:25 Adrian Klaver, <adrian.klaver@aklaver.com> wrote:
> On 8/9/19 8:14 AM, Shital A wrote:
>> Hello,
>>
>> 4) What techniques have you tried?
>> An INSERT INTO using a WITH statement, inserting 2,000,000 rows at a
>> time. This takes 40 minutes.
>
> To add to my previous post. If you already have data in a Postgres
> database then you could do:
>
> pg_dump -d db -t some_table -a -f test_data.sql
>
> That will dump the data only for the table in COPY format. Then you
> could apply that to your test database (after a TRUNCATE on the table,
> assuming you want to start fresh):
>
> psql -d test_db -f test_data.sql
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com
Thanks for the reply Adrian.
I missed one requirement: will these methods generate the WAL needed for replication?
Actually, the data is to check whether replication catches up. Below is the scenario:
1. Have a master-slave cluster with replication set up.
2. Kill the master so that the standby takes over. We are using Pacemaker for automatic failover. Insert 1 GB of data into the new master while replication is broken.
3. Start the old node as standby and check if the 1GB of data gets replicated.
As such testing might be frequent, we need to spend minimal time generating the data (see the sketch below). Master and slave are on the same network.
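For step 2, something like this generates roughly 1 GB quickly (a sketch; the table is made up, and the row count would need tuning to land near 1 GB):

CREATE TABLE repl_test (id bigint, payload text);

-- repeat(md5(...), 32) yields ~1 KB of text per row,
-- so ~1,000,000 rows is on the order of 1 GB
INSERT INTO repl_test
SELECT g, repeat(md5(g::text), 32)
FROM generate_series(1, 1000000) AS g;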
Thanks!
How quickly does your production instance grow, in terms of GB per unit of time?
--
Adrian Klaver
adrian.klaver@aklaver.com
On 8/9/19 9:51 AM, Shital A wrote:
> I missed one requirement: will these methods generate the WAL needed
> for replication?

For COPY, AFAIK yes. To verify, set up a small test table, COPY to it, and see if the data shows up on the standby.

pg_bulkload:

https://ossc-db.github.io/pg_bulkload/pg_bulkload.html

"IMPORTANT NOTE: Under streaming replication environment, pg_bulkload does not work properly. See here for details."

--
Adrian Klaver
adrian.klaver@aklaver.com
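A minimal version of that verification might look like this (table name, file path, and CSV format are placeholders):

-- on the primary
CREATE TABLE copy_wal_test (id int, val text);
\copy copy_wal_test FROM '/tmp/sample.csv' WITH (FORMAT csv)

-- on the standby, once replication has caught up
SELECT count(*) FROM copy_wal_test;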