Thread: Insertion to temp table deteriorating over time

Insertion to temp table deteriorating over time

From
"Steven Flatt"
Date:
Hi,
 
Our application is using Postgres 7.4 and I'd like to understand the root cause of this problem:
 
To speed up overall insert time, our application will write thousands of rows, one by one, into a temp table (same structure as a permanent table), then do a bulk insert from the temp table to the permanent table.  After this bulk insert is done, the temp table is truncated and the process is repeated.  We do this because Postgres can do many individual inserts to a temp table much faster than to a permanent table.
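 
In rough outline, each cycle looks something like this (a sketch with made-up table and column names; the real temp table has 15 columns):
 
CREATE TEMP TABLE tmp (ts timestamptz, val integer);  -- stand-in for the real structure

-- thousands of individual inserts, one row per statement:
INSERT INTO tmp VALUES (now(), 1);
INSERT INTO tmp VALUES (now(), 2);
-- ... one INSERT per incoming row ...

-- then one bulk move into the permanent table and a reset:
INSERT INTO perm SELECT * FROM tmp;  -- perm: the permanent table with the same columns
TRUNCATE TABLE tmp;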
 
The problem we are seeing is that over time, the cost of a single insert to the temp table seems to grow.  After a restart of Postgres, a single insert to the temp table takes about 3ms.  Over a few days, this grows to about 60ms per insert.  Restarting Postgres drops the insert time back to 3ms, presumably because the temp table is re-created.  Our workaround right now is to restart the database every few days, but we don't like this solution much.
 
Any idea where the bloat is happening?  I believe that if we were dropping and re-creating the temp table over and over, that could result in pg_class bloat (among other catalog tables), but what is going wrong if we use the same table over and over and truncate it?
 
Thanks,
Steve

Re: Insertion to temp table deteriorating over time

From
"Rajesh Kumar Mallah"
Date:
On 12/13/06, Steven Flatt <steven.flatt@gmail.com> wrote:
> Hi,
>
> Our application is using Postgres 7.4 and I'd like to understand the root
> cause of this problem:
>
> To speed up overall insert time, our application will write thousands of
> rows, one by one, into a temp table

1. How frequently are you committing the transaction?
    If you commit less frequently, it will be faster (see the sketch below).

2. If you use COPY instead of INSERT, it will be faster.
    Using COPY is easy with DBD::Pg (Perl).  I think the 8.x versions
    have also brought major speed improvements to COPY.
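
For example, something along these lines (an untested sketch; "tmp" stands in for your real temp table):

CREATE TEMP TABLE tmp (ts timestamptz, val integer);

-- 1. batch many inserts into a single transaction instead of committing each row:
BEGIN;
INSERT INTO tmp VALUES (now(), 1);
INSERT INTO tmp VALUES (now(), 2);
-- ... more rows ...
COMMIT;

-- 2. or load the whole batch with one COPY (server-side file; from psql,
--    \copy reads a client-side file instead):
COPY tmp (ts, val) FROM '/tmp/batch.dat';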

I do not know the root cause of the slowdown, though.

Regds
mallah.




Re: Insertion to temp table deteriorating over time

From
Tom Lane
Date:
"Steven Flatt" <steven.flatt@gmail.com> writes:
> Any idea where the bloat is happening?  I believe that if we were dropping
> and re-creating the temp table over and over, that could result in pg_class
> bloat (among other catalog tables), but what is going wrong if we use the
> same table over and over and truncate it?

That seems very strange --- I too would have expected a TRUNCATE to
bring you back to ground zero performance-wise.  I wonder whether the
issue is not directly related to the temp table but is just some generic
resource leakage problem in a very long-running backend.  Have you
checked to see if the backend process bloats memory-wise, or perhaps has
a huge number of files open (I wonder if it could be leaking open file
handles to the successive generations of the temp table)?  Are you sure
that the slowdown is specific to inserts into the temp table, as opposed
to generic SQL activity?

Also, which PG version is this exactly ("7.4" is not specific enough)?
On what platform?  Can you show us the full schema definition for the
temp table and any indexes on it?

            regards, tom lane

Re: Insertion to temp table deteriorating over time

From
"Steven Flatt"
Date:
After running some further standalone tests using temp tables, I'm not convinced the problem is specific to temp table usage.  In fact it looks like generic SQL activity degrades over time.
 
Having said that, what kinds of things should I be looking for that could deteriorate/bloat over time?  Ordinarily the culprit might be infrequent vacuuming or analyzing, but that wouldn't be corrected by a restart of Postgres.  In our case, restarting Postgres gives us a huge performance improvement (for a short while, anyways).
 
By the way, we are using PG 7.4.6 on FreeBSD 5.30.0170.  The temp table has 15 columns: a timestamp, a double, and the rest integers.  It has no indexes.
 
Thanks,
Steve

 
On 12/13/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Steven Flatt" <steven.flatt@gmail.com> writes:
> Any idea where the bloat is happening?  I believe that if we were dropping
> and re-creating the temp table over and over, that could result in pg_class
> bloat (among other catalog tables), but what is going wrong if we use the
> same table over and over and truncate it?

That seems very strange --- I too would have expected a TRUNCATE to
bring you back to ground zero performance-wise.  I wonder whether the
issue is not directly related to the temp table but is just some generic
resource leakage problem in a very long-running backend.  Have you
checked to see if the backend process bloats memory-wise, or perhaps has
a huge number of files open (I wonder if it could be leaking open file
handles to the successive generations of the temp table)?  Are you sure
that the slowdown is specific to inserts into the temp table, as opposed
to generic SQL activity?

Also, which PG version is this exactly ("7.4" is not specific enough)?
On what platform?  Can you show us the full schema definition for the
temp table and any indexes on it?

                       regards, tom lane

Re: Insertion to temp table deteriorating over time

From
Tom Lane
Date:
"Steven Flatt" <steven.flatt@gmail.com> writes:
> Having said that, what kinds of things should I be looking for that could
> deteriorate/bloat over time?  Ordinarily the culprit might be infrequent
> vacuuming or analyzing, but that wouldn't be corrected by a restart of
> Postgres.  In our case, restarting Postgres gives us a huge performance
> improvement (for a short while, anyways).

Do you actually need to restart the postmaster, or is just starting a
fresh session (fresh backend) sufficient?  And again, have you monitored
the backend process to see if it's bloating memory-wise or open-file-wise?

            regards, tom lane

Re: Insertion to temp table deteriorating over time

From
Tom Lane
Date:
"Steven Flatt" <steven.flatt@gmail.com> writes:
> Having said that, what kinds of things should I be looking for that could
> deteriorate/bloat over time?  Ordinarily the culprit might be infrequent
> vacuuming or analyzing, but that wouldn't be corrected by a restart of
> Postgres.  In our case, restarting Postgres gives us a huge performance
> improvement (for a short while, anyways).

> By the way, we are using PG 7.4.6 on FreeBSD 5.30.0170.  The temp table has
> 15 columns: a timestamp, a double, and the rest integers.  It has no
> indexes.

Hm, *are* you vacuuming only infrequently?  In particular, what is your
maintenance policy for pg_class?

Some experimentation with TRUNCATE and VACUUM VERBOSE shows that in 7.4,
a TRUNCATE of a temp table with no indexes and no toast table generates
three dead row versions in pg_class.  (I'm surprised that it's as many
as three, but in any case the TRUNCATE would certainly have to do one
update of the table's pg_class entry and thereby generate one dead row
version.)

If you're being sloppy about vacuuming pg_class, then over time the
repeated-truncate pattern would build up a huge number of dead rows
in pg_class, *all with the same OID*.  It's unsurprising that this
would create some slowness in looking up the temp table's pg_class
entry.

If this theory is correct, the reason that starting a fresh backend
makes it fast again is that the new backend creates a whole new temp
table with a new OID assigned, and so the adjacent litter in pg_class
doesn't matter anymore (or not so much anyway).

Solution would be to institute regular vacuuming of the system
catalogs...
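
For instance, something like this from the existing cron job (run as a user allowed to vacuum the catalogs; the exact list is just a suggestion):

VACUUM ANALYZE pg_class;
-- these also churn as temp tables are created and dropped across sessions:
VACUUM ANALYZE pg_attribute;
VACUUM ANALYZE pg_type;
-- or, more bluntly, vacuum the whole database:
-- VACUUM ANALYZE;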

            regards, tom lane

Re: Insertion to temp table deteriorating over time

From
"Steven Flatt"
Date:
Thanks for your replies.
 
Starting a fresh session (not restarting the postmaster) seems to be sufficient to reset performance (and is an easy enough workaround).  Still, it would be nice to know the root cause of the problem.
 
The backend process does not seem to be bloating memory-wise (I'm using vmstat to monitor memory usage on the machine).  It also does not appear to be bloating in terms of open file handles (using fstat, I can see the backend process has 160-180 open file handles, not growing).
 
Regarding your other email -- interesting -- but we are vacuuming pg_class every hour.  So I don't think the answer lies there...
 
Steve
 
On 12/13/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Steven Flatt" <steven.flatt@gmail.com> writes:
> Having said that, what kinds of things should I be looking for that could
> deteriorate/bloat over time?  Ordinarily the culprit might be infrequent
> vacuuming or analyzing, but that wouldn't be corrected by a restart of
> Postgres.  In our case, restarting Postgres gives us a huge performance
> improvement (for a short while, anyways).

Do you actually need to restart the postmaster, or is just starting a
fresh session (fresh backend) sufficient?  And again, have you monitored
the backend process to see if it's bloating memory-wise or open-file-wise?

                       regards, tom lane

Re: Insertion to temp table deteriorating over time

From
Tom Lane
Date:
"Steven Flatt" <steven.flatt@gmail.com> writes:
> Regarding your other email -- interesting -- but we are vacuuming pg_class
> every hour.  So I don't think the answer lies there...

That's good, but is the vacuum actually accomplishing anything?  I'm
wondering if there's also a long-running transaction in the mix.
Try a manual "VACUUM VERBOSE pg_class;" after the thing has slowed down,
and see what it says about removable and nonremovable rows.

            regards, tom lane

Re: Insertion to temp table deteriorating over time

From
"Steven Flatt"
Date:
Here's the output of "VACUUM VERBOSE pg_class".  I think it looks fine.  I even did it three times in a row, each about 10 minutes apart, just to see what was changing:
 
--------------------
INFO:  vacuuming "pg_catalog.pg_class"
INFO:  index "pg_class_oid_index" now contains 3263 row versions in 175 pages
DETAIL:  5680 index row versions were removed.
150 index pages have been deleted, 136 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "pg_class_relname_nsp_index" now contains 3263 row versions in 1301 pages
DETAIL:  5680 index row versions were removed.
822 index pages have been deleted, 734 are currently reusable.
CPU 0.01s/0.01u sec elapsed 0.03 sec.
INFO:  "pg_class": removed 5680 row versions in 109 pages
DETAIL:  CPU 0.00s/0.01u sec elapsed 0.04 sec.
INFO:  "pg_class": found 5680 removable, 3263 nonremovable row versions in 625 p
ages
DETAIL:  0 dead row versions cannot be removed yet.
There were 23925 unused item pointers.
0 pages are entirely empty.
CPU 0.02s/0.04u sec elapsed 0.10 sec.
VACUUM
--------------------
INFO:  vacuuming "pg_catalog.pg_class"
INFO:  index "pg_class_oid_index" now contains 3263 row versions in 175 pages
DETAIL:  24 index row versions were removed.
150 index pages have been deleted, 150 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "pg_class_relname_nsp_index" now contains 3263 row versions in 1301 pages
DETAIL:  24 index row versions were removed.
822 index pages have been deleted, 822 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_class": removed 24 row versions in 2 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_class": found 24 removable, 3263 nonremovable row versions in 625 pag
es
DETAIL:  0 dead row versions cannot be removed yet.
There were 29581 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM
--------------------
INFO:  vacuuming "pg_catalog.pg_class"
INFO:  index "pg_class_oid_index" now contains 3263 row versions in 175 pages
DETAIL:  150 index pages have been deleted, 150 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "pg_class_relname_nsp_index" now contains 3263 row versions in 1301 pages
DETAIL:  822 index pages have been deleted, 822 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_class": found 0 removable, 3263 nonremovable row versions in 625 page
s
DETAIL:  0 dead row versions cannot be removed yet.
There were 29605 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM
--------------------
 
The one thing that seems to be steadily increasing is the number of unused item pointers.  Not sure if that's normal.  I should also point out that SELECT statements are not experiencing the same degradation as the INSERTs to the temp table.  SELECTs are performing just as well now (24 hours since restarting the connection) as they did immediately after restarting the connection.  INSERTs to the temp table are 5 times slower now than they were 24 hours ago.
 
I wonder if the problem has to do with a long running ODBC connection.
 
Steve

 
On 12/14/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Steven Flatt" <steven.flatt@gmail.com> writes:
> Regarding your other email -- interesting -- but we are vacuuming pg_class
> every hour.  So I don't think the answer lies there...

That's good, but is the vacuum actually accomplishing anything?  I'm
wondering if there's also a long-running transaction in the mix.
Try a manual "VACUUM VERBOSE pg_class;" after the thing has slowed down,
and see what it says about removable and nonremovable rows.

                       regards, tom lane

Re: Insertion to temp table deteriorating over time

From
Tom Lane
Date:
"Steven Flatt" <steven.flatt@gmail.com> writes:
> Here's the output of "VACUUM VERBOSE pg_class".  I think it looks fine.  I
> even did it three times in a row, each about 10 minutes apart, just to see
> what was changing:

Hm, look at the numbers of rows removed:

> INFO:  "pg_class": found 5680 removable, 3263 nonremovable row versions in
> 625 pages
> DETAIL:  0 dead row versions cannot be removed yet.

> INFO:  "pg_class": found 24 removable, 3263 nonremovable row versions in 625
> pages
> DETAIL:  0 dead row versions cannot be removed yet.

> INFO:  "pg_class": found 0 removable, 3263 nonremovable row versions in 625
> pages
> DETAIL:  0 dead row versions cannot be removed yet.

The lack of unremovable dead rows is good, but why were there so many
dead rows the first time?  You didn't say what the cycle time is on your
truncate-and-refill process, but the last two suggest that the average
rate of accumulation of dead pg_class rows is only a couple per minute,
in which case it's been a lot longer than an hour since the previous
VACUUM of pg_class.  I'm back to suspecting that you don't vacuum
pg_class regularly.  You mentioned having an hourly cron job to fire off
vacuums ... are you sure it's run as a database superuser?
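
(A quick way to check, from whatever account the cron job connects as:

SELECT current_user, usesuper FROM pg_user WHERE usename = current_user;

If usesuper is false, a database-wide vacuum will skip the system catalogs it doesn't own, leaving only a warning in the log.)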

            regards, tom lane

Re: Insertion to temp table deteriorating over time

From
"Steven Flatt"
Date:
Our application is such that there is a great deal of activity at the beginning of the hour and minimal activity near the end of the hour.  Those 3 vacuums were done at (approximately) 30 minutes past, 40 minutes past, and 50 minutes past the hour, during low activity.  Vacuums of pg_class look like they're being done on the hour.  So it's not surprising that the first vacuum found a lot of dead rows while the latter two found very few.
 
In fact, I just did another vacuum (about 30 minutes past the hour again) and got:
 
INFO:  "pg_class": found 5490 removable, 3263 nonremovable row versions in 171 pages
DETAIL:  0 dead row versions cannot be removed yet.
 
... and clearly a vacuum was done under an hour ago.
 
The truncate and re-fill process is done once per hour, at the end of the high-load cycle, so I doubt that's even a big contributor to the number of removable rows in pg_class.
 
For this particular setup, we expect the high-load period to last 10-15 minutes at the beginning of the hour, which is what we see when the connection is freshly initialized.  After a day or so (as is happening right now), the high-load period spills into the second half of the hour.  Within 3-4 days, we start spilling into the next hour and, as you can imagine, everything gets behind and we spiral down from there.  For now, our workaround is to manually kill the connection every few days, but I would like a better solution than setting up a cron job to do this!
 
Thanks again,
Steve
 
 
On 12/15/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Hm, look at the numbers of rows removed:

> INFO:  "pg_class": found 5680 removable, 3263 nonremovable row versions in
> 625 pages
> DETAIL:  0 dead row versions cannot be removed yet.

> INFO:  "pg_class": found 24 removable, 3263 nonremovable row versions in 625
> pages
> DETAIL:  0 dead row versions cannot be removed yet.

> INFO:  "pg_class": found 0 removable, 3263 nonremovable row versions in 625
> pages
> DETAIL:  0 dead row versions cannot be removed yet.

The lack of unremovable dead rows is good, but why were there so many
dead rows the first time?  You didn't say what the cycle time is on your
truncate-and-refill process, but the last two suggest that the average
rate of accumulation of dead pg_class rows is only a couple per minute,
in which case it's been a lot longer than an hour since the previous
VACUUM of pg_class.  I'm back to suspecting that you don't vacuum
pg_class regularly.  You mentioned having an hourly cron job to fire off
vacuums ... are you sure it's run as a database superuser?

                       regards, tom lane

Re: Insertion to temp table deteriorating over time

From
Tom Lane
Date:
"Steven Flatt" <steven.flatt@gmail.com> writes:
> Our application is such that there is a great deal of activity at the
> beginning of the hour and minimal activity near the end of the hour.

OK ...

> The truncate and re-fill process is done once per hour, at the end of the
> high-load cycle, so I doubt that's even a big contributor to the number of
> removable rows in pg_class.

Oh, then where *are* the removable rows coming from?  At this point I
think that the truncate/refill thing is not the culprit, or at any rate
is only one part of a problematic usage pattern that we don't see all of
yet.

            regards, tom lane

Re: Insertion to temp table deteriorating over time

From
"Steven Flatt"
Date:
Good question, and I agree with your point.
 
Are the removable rows in pg_class even an issue?  So what if 5000-6000 dead tuples are generated every hour then vacuumed?  Performance continues to steadily decline over a few days time.  Memory usage does not appear to be bloating.  Open file handles remain fairly fixed.  Is there anything else I can monitor (perhaps something to do with the odbc connection) that I could potentially correlate with the degrading performance?
 
Steve

 
On 12/15/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Oh, then where *are* the removable rows coming from?  At this point I
think that the truncate/refill thing is not the culprit, or at any rate
is only one part of a problematic usage pattern that we don't see all of
yet.

                       regards, tom lane

Re: Insertion to temp table deteriorating over time

From
Tom Lane
Date:
"Steven Flatt" <steven.flatt@gmail.com> writes:
> Are the removable rows in pg_class even an issue?  So what if 5000-6000 dead
> tuples are generated every hour then vacuumed?  Performance continues to
> steadily decline over a few days time.  Memory usage does not appear to be
> bloating.  Open file handles remain fairly fixed.  Is there anything else I
> can monitor (perhaps something to do with the odbc connection) that I could
> potentially correlate with the degrading performance?

At this point I think the most productive thing for you to do is to try
to set up a self-contained test case that reproduces the slowdown.  That
would allow you to poke at it without disturbing your production system,
and would let other people look at it too.  From what you've said, I'd
try a simple little program that inserts some data into a temp table,
truncates the table, and repeats, as fast as it can, using the same SQL
commands as your real code and similar but dummy data.  It shouldn't
take long to observe the slowdown if it occurs.  If you can't reproduce
it in isolation then we'll know that some other part of your application
environment is contributing to the problem; if you can, I'd be happy to
look at the test case with gprof or oprofile and find out exactly what's
going on.

            regards, tom lane

Re: Insertion to temp table deteriorating over time

From
"Steven Flatt"
Date:
I've been trying to reproduce the problem for days now :).  I've done pretty much exactly what you describe below, but I can't reproduce the problem on any of our lab machines.  Something is indeed special in this environment.
 
Thanks for all your help,
 
Steve

 
On 12/15/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Steven Flatt" <steven.flatt@gmail.com> writes:
> Are the removable rows in pg_class even an issue?  So what if 5000-6000 dead
> tuples are generated every hour then vacuumed?  Performance continues to
> steadily decline over a few days time.  Memory usage does not appear to be
> bloating.  Open file handles remain fairly fixed.  Is there anything else I
> can monitor (perhaps something to do with the odbc connection) that I could
> potentially correlate with the degrading performance?

At this point I think the most productive thing for you to do is to try
to set up a self-contained test case that reproduces the slowdown.  That
would allow you to poke at it without disturbing your production system,
and would let other people look at it too.  From what you've said, I'd
try a simple little program that inserts some data into a temp table,
truncates the table, and repeats, as fast as it can, using the same SQL
commands as your real code and similar but dummy data.  It shouldn't
take long to observe the slowdown if it occurs.  If you can't reproduce
it in isolation then we'll know that some other part of your application
environment is contributing to the problem; if you can, I'd be happy to
look at the test case with gprof or oprofile and find out exactly what's
going on.

                       regards, tom lane

Re: Insertion to temp table deteriorating over time

From
Tom Lane
Date:
"Steven Flatt" <steven.flatt@gmail.com> writes:
> I've been trying to reproduce the problem for days now :).  I've done pretty
> much exactly what you describe below, but I can't reproduce the problem on
> any of our lab machines.  Something is indeed special in this environment.

Yuck.  You could try strace'ing the problem backend and see if anything
is visibly different between fast and slow operation.  I don't suppose
you have oprofile on that machine, but if you did it'd be even better.

            regards, tom lane

Re: Insertion to temp table deteriorating over time

From
"Steven Flatt"
Date:
I have an update on this.
 
The reason I couldn't reproduce this problem was because of the way I was creating the temp table in my tests.  I was using:
 
CREATE TEMP TABLE tmp (LIKE perm);
 
This form did not show any performance degradation over time.
 
However, the way our application was creating this table (something I should have observed sooner, no doubt) is:
 
CREATE TEMP TABLE tmp AS SELECT <column-list> FROM perm LIMIT 0;
 
This on its own, however, is not enough to reproduce the problem.  Next imagine that perm is actually a view, defined as a UNION ALL SELECT over several other views, which are themselves defined as UNION ALL SELECTs over a bunch of permanent tables.  All of the views have insert rules redirecting rows according to some criteria.  The whole structure is pretty convoluted.
 
I can fix this problem by using CREATE TEMP TABLE ... LIKE instead of CREATE TEMP TABLE ... AS.
 
I'm still curious about the root cause of this problem.  From the docs, I see that CREATE TABLE AS evaluates the query just once to create the table, but based on what I'm seeing, I'm wondering whether this isn't truly the case.  Are there any known issues with CREATE TABLE AS when the table you're creating is temporary and you're selecting from a view?
 
Steve

 
On 12/15/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Steven Flatt" <steven.flatt@gmail.com> writes:
> I've been trying to reproduce the problem for days now :).  I've done pretty
> much exactly what you describe below, but I can't reproduce the problem on
> any of our lab machines.  Something is indeed special in this environment.

Yuck.  You could try strace'ing the problem backend and see if anything
is visibly different between fast and slow operation.  I don't suppose
you have oprofile on that machine, but if you did it'd be even better.

                       regards, tom lane

Re: Insertion to temp table deteriorating over time

From
Tom Lane
Date:
"Steven Flatt" <steven.flatt@gmail.com> writes:
> I can fix this problem by using CREATE TEMP TABLE ... LIKE instead of CREATE
> TEMP TABLE ... AS.

That seems ... um ... bizarre.  Now are you able to put together a
self-contained test case?  Seems like we could have two independent bugs
here: first, why (and how) is the temp table different, and second how
does that result in the observed performance problem.

            regards, tom lane

Re: Insertion to temp table deteriorating over time

From
"Steven Flatt"
Date:
Please ignore my post from earlier today.  As strange as it sounds, changing "CREATE TEMP TABLE ... AS" to "CREATE TEMP TABLE ... LIKE" only appeared to fix my performance problem because the statement errored out immediately (and silently, in my test program).  After checking the pgsql logs, it became clear to me that you can't use LIKE on a view.  Duh.

Moving forward, I have also discovered that our temp table did in fact have a varchar column (no specified limit on varchar).  With this in mind, I could easily reproduce the problem on a temp table with one column.  So...
 
Issue #1:
 
(I'm assuming there's a reasonable explanation for this.)  If I create a temp table with a single varchar column (or text column), do 100 inserts to that table, copy to a permanent table, truncate the temp table and repeat, the time required for the 100 inserts grows almost linearly.  Maybe the data is treated as large objects.
 
Note that if I change the column type to varchar(SOME_LIMIT), integer, timestamptz, interval, etc., performance does not degrade.  Also note that if I do not use a temp table (but do use a varchar column), inserts are slower (as expected) but do not degrade over time.  So this seems to be specific to temp tables with varchar/text column(s).
 
Issue #2:
 
As I said earlier, the temp table is created via:
 
CREATE TEMP TABLE tmp AS SELECT <column-list> FROM perm LIMIT 0;
 
where perm is a view defined as follows:
 
View definition:
 SELECT <column-list>
   FROM view2
   JOIN tbl USING (col1, col2)
  WHERE <some-conditions>
UNION ALL
 SELECT <column-list>
   FROM view3
   JOIN tbl USING (col1, col2)
  WHERE <some-conditions>;
 
Now the varchar columns that end up in the perm view come from the tbl table, but in tbl, they are defined as varchar(40).  Somehow the 40 limit is lost when constructing the view.  After a little more testing, I found that the limit is only lost when creating a view (CREATE TABLE ... AS, for example, does not have this problem), and only when a UNION ALL clause is present.
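 
A stripped-down illustration of what I'm seeing (table names invented; on this install, \d v reports col1 as plain "character varying" with no length):
 
CREATE TABLE t1 (col1 varchar(40), col2 integer);
CREATE TABLE t2 (col1 varchar(40), col2 integer);
CREATE VIEW v AS
  SELECT col1, col2 FROM t1
  UNION ALL
  SELECT col1, col2 FROM t2;
-- by contrast, CREATE TABLE t3 AS SELECT col1, col2 FROM t1; keeps varchar(40)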
 
This looks like a bug.  I know this is Postgres 7.4.6 and I haven't been able to verify with a later version of Postgres, but does this look familiar to anyone?
 
Steve

Re: Insertion to temp table deteriorating over time

From
Tom Lane
Date:
"Steven Flatt" <steven.flatt@gmail.com> writes:
> Issue #1:

> (I'm assuming there's a reasonable explanation for this.)  If I create a
> temp table with a single varchar column (or text column), do 100 inserts to
> that table, copy to a permanent table, truncate the temp table and repeat,
> the time required for the 100 inserts grows almost linearly.

I still can't reproduce this.  Using 7.4 branch tip, I did

create temp table foo(f1 varchar);
create table nottemp(f1 varchar);
\timing
insert into foo select stringu1 from tenk1 limit 100; insert into nottemp select * from foo; truncate foo;
insert into foo select stringu1 from tenk1 limit 100; insert into nottemp select * from foo; truncate foo;
... repeat several thousand times ...

and couldn't see any consistent growth in the reported times.  So either
it's been fixed since 7.4.6 (but I don't see anything related-looking in
the CVS logs), or you haven't provided all the details.

> Now the varchar columns that end up in the perm view come from the tbl
> table, but in tbl, they are defined as varchar(40).  Somehow the 40 limit is
> lost when constructing the view.

Yeah, this is a known issue with UNIONs not preserving the length info
--- which is not entirely unreasonable: what will you do with varchar(40)
union varchar(50)?  There's a hack in place as of 8.2 to keep the
length if all the union arms have the same length.
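
One possible way to sidestep the lost length for this particular case (a suggestion, not something tested here) is to put an explicit cast in the CREATE TEMP TABLE ... AS select list, so the staging column's type comes from the cast rather than from the view:

CREATE TEMP TABLE tmp AS
  SELECT col1::varchar(40) AS col1, col2
  FROM perm
  LIMIT 0;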

            regards, tom lane

Re: Insertion to temp table deteriorating over time

From
"Steven Flatt"
Date:
On 12/19/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I still can't reproduce this.  Using 7.4 branch tip, I did

create temp table foo(f1 varchar);
create table nottemp(f1 varchar);
\timing
insert into foo select stringu1 from tenk1 limit 100; insert into nottemp select * from foo; truncate foo;
insert into foo select stringu1 from tenk1 limit 100; insert into nottemp select * from foo; truncate foo;
... repeat several thousand times ...
 
I can't reproduce the problem that way either (or when using a server-side PL/pgSQL function to do something similar).  It looks like you have to go through an ODBC connection, with the looping done on the client side.  Each individual insert to the temp table needs to be sent over the connection, and this is what degrades over time.  I can reproduce this on 7.4.6 and 8.1.4.  I have a small C program that does this, which I can send you offline if you're interested.
 

> Now the varchar columns that end up in the perm view come from the tbl
> table, but in tbl, they are defined as varchar(40).  Somehow the 40 limit is
> lost when constructing the view.

Yeah, this is a known issue with UNIONs not preserving the length info
--- which is not entirely unreasonable: what will you do with varchar(40)
union varchar(50)?  There's a hack in place as of 8.2 to keep the
length if all the union arms have the same length.
 
I guess it comes down to your philosophy on this.  You might simply disallow unions when the data types don't match (varchar(40) != varchar(50)), though what's best probably depends on the application.  I tend to think that when the unioned types do match, the type should be preserved in the resulting view (as the 8.2 "hack" does).

Thanks again for all your help.
Steve
 

Re: Insertion to temp table deteriorating over time

From
Tom Lane
Date:
"Steven Flatt" <steven.flatt@gmail.com> writes:
>  I can't reproduce the problem that way either (or when using a server-side
> PL/pgSQL function to do something similar).  It looks like you have to go
> through an ODBC connection, with the looping done on the client side.  Each
> individual insert to the temp table needs to be sent over the connection,
> and this is what degrades over time.  I can reproduce this on 7.4.6 and
> 8.1.4.  I have a small C program that does this, which I can send you
> offline if you're interested.

Please.

            regards, tom lane