Thread: querying while copying into a table and optimizations
Message-ID: <1130523763.43626c739791d@www.correo.unam.mx> Date: Fri, 28 Oct 2005 13:22:43 -0500 (CDT) From: davidgn@servidor.unam.mx MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit User-Agent: IMP/PHP IMAP webmail program 2.2.8 X-Originating-IP: 200.34.166.34 Hi This is postgresql 7.4 I am trying to check that postgres is updating a table. I have a pretty large ascii table file (+- 210 Mb) which I am copying into a table with pgsql, but it is taking a long time, and any select query I do to the table returns me 0 rows Is there any way to ask postgres to update the data more frequently or some way to make it end sooner? I am a bit clueless as to what can I do to the configuration files to optimize this copy. Thank you ------------------------------------------------- www.correo.unam.mx UNAMonos Comunic�ndonos
On Fri, Oct 28, 2005 at 06:22:43PM -0000, davidgn@servidor.unam.mx wrote: > This is postgresql 7.4 > I am trying to check that postgres is updating a table. > I have a pretty large ascii table file (+- 210 Mb) which I am copying into a > table with pgsql, but it is taking a long time, and any select query I do to the > table returns me 0 rows Presumably you're using COPY or doing multiple inserts inside a transaction (hopefully the former for performance reasons). No other transaction will be able to see any of the data until the inserting transaction commits it (a single statement like COPY is wrapped in its own transaction even if it occurs outside an explicit transaction block). You could use contrib/pgstattuple to check on the copy/insert's progress, but that won't allow you to query the data itself. > Is there any way to ask postgres to update the data more frequently or some way > to make it end sooner? By "update the data more frequently" I assume you mean "make the data visible to other transactions more frequently." To do that you could use multiple COPY statements and make sure each is committed before beginning the next. One way to do that would be to write a simple client that reads the file and issues a new "COPY tablename FROM stdin" for each X number of lines. But are you sure you want other transactions querying the table before it's completely loaded? Are queries based on incomplete data acceptable for whatever you're doing? > I am a bit clueless as to what can I do to the configuration files to optimize > this copy. See "Populating a Database" in the "Performance Tips" chapter of the documentation for some ideas. http://www.postgresql.org/docs/8.0/interactive/populate.html -- Michael Fuhr
On Sat, 2005-10-29 at 09:10 -0600, Michael Fuhr wrote: > On Fri, Oct 28, 2005 at 06:22:43PM -0000, davidgn@servidor.unam.mx wrote: > > This is postgresql 7.4 > > I am trying to check that postgres is updating a table. > > I have a pretty large ascii table file (+- 210 Mb) which I am copying into a > > table with pgsql, but it is taking a long time, and any select query I do to the > > table returns me 0 rows > > Presumably you're using COPY or doing multiple inserts inside a > transaction (hopefully the former for performance reasons). No > other transaction will be able to see any of the data until the > inserting transaction commits it (a single statement like COPY is > wrapped in its own transaction even if it occurs outside an explicit > transaction block). You could use contrib/pgstattuple to check on > the copy/insert's progress, but that won't allow you to query the > data itself. > Yep. The script uses COPY to populate the database right after erasing all its contents. I was wondering about querying just as a mean of feedback, as you guessed, but anything else would be right. I initially though that read access was available as each row was copied. I have been playing with this script, which was handed to me, and before DELETE and COPY they run a vacuumdb on the whole database. I commented that line out and the COPY ran in about 5.5 mins (under postgres 7.3 the script runs all right) I am wondering if that was what was somehow causing the problem, as I did some lot of stuff with the database and couldn't be sure. I found some discussions about ANALYZE, but such discussions were under Postgres 8, so I don't know if this is related. > > I am a bit clueless as to what can I do to the configuration files to optimize > > this copy. > > See "Populating a Database" in the "Performance Tips" chapter of > the documentation for some ideas. > > http://www.postgresql.org/docs/8.0/interactive/populate.html > Thank you. A lot of stuff I can't grab yet. Guess I need some more experience administering databases. Sorry about the double mail, I didn't realize I hadn't replied to the list. Lest this be archived. -- David Eduardo Gómez Noguera <davidgn@servidor.unam.mx>