Thread: querying while copying into a table and optimizations

querying while copying into a table and optimizations

From
davidgn@servidor.unam.mx
Date:
Message-ID: <1130523763.43626c739791d@www.correo.unam.mx>
Date: Fri, 28 Oct 2005 13:22:43 -0500 (CDT)
From: davidgn@servidor.unam.mx
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
User-Agent: IMP/PHP IMAP webmail program 2.2.8
X-Originating-IP: 200.34.166.34

Hi
This is postgresql 7.4
I am trying to check that postgres is updating a table.
I have a pretty large ascii table  file (+- 210 Mb) which I am copying into a
table with pgsql, but it is taking a long time, and any select query I do to the
table returns me 0 rows

Is there any way to ask postgres to update the data more frequently or some way
to make it end sooner?
I am a bit clueless as to what can I do to the configuration files to optimize
this copy.

Thank you

-------------------------------------------------
www.correo.unam.mx
UNAMonos Comunic�ndonos


Re: querying while copying into a table and optimizations

From
Michael Fuhr
Date:
On Fri, Oct 28, 2005 at 06:22:43PM -0000, davidgn@servidor.unam.mx wrote:
> This is postgresql 7.4
> I am trying to check that postgres is updating a table.
> I have a pretty large ascii table  file (+- 210 Mb) which I am copying into a
> table with pgsql, but it is taking a long time, and any select query I do to the
> table returns me 0 rows

Presumably you're using COPY or doing multiple inserts inside a
transaction (hopefully the former for performance reasons).  No
other transaction will be able to see any of the data until the
inserting transaction commits it (a single statement like COPY is
wrapped in its own transaction even if it occurs outside an explicit
transaction block).  You could use contrib/pgstattuple to check on
the copy/insert's progress, but that won't allow you to query the
data itself.

> Is there any way to ask postgres to update the data more frequently or some way
> to make it end sooner?

By "update the data more frequently" I assume you mean "make the
data visible to other transactions more frequently."  To do that
you could use multiple COPY statements and make sure each is committed
before beginning the next.  One way to do that would be to write a
simple client that reads the file and issues a new "COPY tablename
FROM stdin" for each X number of lines.  But are you sure you want
other transactions querying the table before it's completely loaded?
Are queries based on incomplete data acceptable for whatever you're
doing?

> I am a bit clueless as to what can I do to the configuration files to optimize
> this copy.

See "Populating a Database" in the "Performance Tips" chapter of
the documentation for some ideas.

http://www.postgresql.org/docs/8.0/interactive/populate.html

--
Michael Fuhr

Re: querying while copying into a table and optimizations

From
David Eduardo Gómez Noguera
Date:
On Sat, 2005-10-29 at 09:10 -0600, Michael Fuhr wrote:
> On Fri, Oct 28, 2005 at 06:22:43PM -0000, davidgn@servidor.unam.mx wrote:
> > This is postgresql 7.4
> > I am trying to check that postgres is updating a table.
> > I have a pretty large ascii table  file (+- 210 Mb) which I am copying into a
> > table with pgsql, but it is taking a long time, and any select query I do to the
> > table returns me 0 rows
>
> Presumably you're using COPY or doing multiple inserts inside a
> transaction (hopefully the former for performance reasons).  No
> other transaction will be able to see any of the data until the
> inserting transaction commits it (a single statement like COPY is
> wrapped in its own transaction even if it occurs outside an explicit
> transaction block).  You could use contrib/pgstattuple to check on
> the copy/insert's progress, but that won't allow you to query the
> data itself.
>
Yep. The script uses COPY to populate the database right after erasing
all its contents.
I was wondering about querying just as a mean of feedback, as you
guessed, but anything else would be right.
I initially though that read access was available as each row was
copied.

I have been playing with this script, which was handed to me, and before
DELETE and COPY they run a vacuumdb on the whole database.

I commented that line out and the COPY ran in about 5.5 mins (under
postgres 7.3 the script runs all right)
I am wondering if that was what was somehow causing the problem, as I
did some lot of stuff with the database and couldn't be sure.

I found some discussions about ANALYZE, but such discussions were under
Postgres 8, so I don't know if this is related.


> > I am a bit clueless as to what can I do to the configuration files to optimize
> > this copy.
>
> See "Populating a Database" in the "Performance Tips" chapter of
> the documentation for some ideas.
>
> http://www.postgresql.org/docs/8.0/interactive/populate.html
>

Thank you.
A lot of stuff I can't grab yet. Guess I need some more experience
administering databases.

Sorry about the double mail, I didn't realize I hadn't replied to the
list. Lest this be archived.
--
David Eduardo Gómez Noguera <davidgn@servidor.unam.mx>