Thread: A couple of errors encountered in 7.1.3=>7.2.1-2 data migration
Hi-

The background info:
  Debian 2.4.14 on both source & destination boxes
  PostgreSQL 7.1.3 on source
  PostgreSQL 7.2.1-2 on destination

We are testing 7.2 with our application by installing it on a separate box
& accessing it via the network. The 7.2 install is just a plain vanilla
install using the Debian package, which went well. (Well done, as usual,
Oliver!)

The export of data from the old system was simply a pg_dumpall, and the
import was a load of the dump file using psql into an empty, newly created
database. The load returned two errors that I don't understand:

(1) ERROR:  DefineIndex: index function must be marked iscachable

Unfortunately, I don't know which line of the dump file we were on when
this occurred, but it only happened once, and the only indexes I can think
of that might be different from the rest are functional indexes. One of
them is an index on upper(field), and the other is based on a function
defined in psql that uses the translate function.

(2) ERROR:  cannot find attribute 1 of relation schedule_notification_log
    ERROR:  cannot find attribute 1 of relation schedule_notification_log

We got this one twice, just as shown above. When I do a
\d schedule_notification_log, I get this result:

develop=# \d schedule_notification_log
Table "schedule_notification_log"
 Column | Type | Modifiers
--------+------+-----------

Other than these errors, the data migration went very smoothly, and our
applications have functioned at least partially without modification. We
haven't been able to test thoroughly yet, because a few queries are very
slow, and the second error above seems to keep VACUUM ANALYZE from running
as well.

Any thoughts on these errors are appreciated.

Regards,
-Nick

--------------------------------------------------------------------------
Nick Fankhauser  nickf@ontko.com  Phone 1.765.935.4283  Fax 1.765.962.9788
Ray Ontko & Co.  Software Consulting Services  http://www.ontko.com/
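[For anyone retracing this migration, the dump/reload was roughly the
following (file and database names here are hypothetical). pg_dumpall
output begins with \connect commands, so it is loaded via template1;
psql's -e flag echoes each statement, which helps pin down which line of
the dump an error comes from:]

    # on the 7.1.3 box
    $ pg_dumpall > db.out

    # on the 7.2.1-2 box, into a freshly initdb'd cluster
    $ psql -e template1 < db.out 2>&1 | tee load.log
    $ grep ERROR load.log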
"Nick Fankhauser" <nickf@ontko.com> writes: > The load returned two errors that I don't understand: > ERROR: DefineIndex: index function must be marked iscachable This means what it says: add "with (iscachable)" to your function declaration. This is intended to remind you that it is not sensible to build an index on a function that has time-varying results... > ERROR: cannot find attribute 1 of relation schedule_notification_log This is more disturbing. Can you provide the sequence of commands that led up to it? regards, tom lane
Ray Ontko <rayo@ontko.com> writes:
> develop=# select count(*) from temp_event ;
> ERROR:  cannot find attribute 1 of relation temp_event

I'm beginning to think there is something seriously messed up about your
installation. The simplest theory is that the indexes on pg_attribute
are corrupted. You could try a query that won't use the indexes, eg

	select (select relname from pg_class where oid = attrelid)
	from pg_attribute where attnum = 1 order by 1;

and look through the output to see if temp_event and the other broken
table are mentioned. If so, then the rows are there but the index is
wrong. You should be able to recover using REINDEX --- read the
reference page for it carefully before executing the recovery procedure!

If that doesn't work, re-initdb and re-importation of your dump would be
a good next step. Something must have dropped a bit somewhere...

			regards, tom lane
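[As a rough sketch of the recovery procedure Tom refers to: the 7.2
REINDEX reference page has system indexes rebuilt from a standalone
backend started with -O and -P, so that the damaged indexes are ignored
while reading the catalogs. Database name is hypothetical, and the
reference page should be read first, as Tom says:]

    $ pg_ctl stop
    $ postgres -O -P develop     -- standalone backend; -P ignores system indexes
    backend> REINDEX TABLE pg_attribute FORCE;
    backend> -- exit with ctrl-D, then restart the postmaster
    $ pg_ctl start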
Tom-

Thanks! Your diagnosis was correct & the repair worked.

-Nick & Ray

> I'm beginning to think there is something seriously messed up about your
> installation. The simplest theory is that the indexes on pg_attribute
> are corrupted.
...
> You should be able to recover using REINDEX
Okay, I guess I misunderstood something about "VACUUM FULL". My
understanding was that a VACUUM (without FULL) marked unused records for
reuse, while VACUUM FULL moved records from the end of a file into the
holes where unused records were stored and truncated the file to free
disk space. So far so good, but....

I have had a continued loss of performance on one of my largest tables
(600,000,000 records). There are 4 fields in a record, like so:

CREATE TABLE foo (
    a int not null references bar(a) on delete cascade on update no action,
    b int not null references baz(b) on delete cascade on update no action,
    c int,
    d smallint,
    primary key (a, b)) ;
CREATE INDEX foo_ac ON foo (a, c) ;
CREATE INDEX foo_ad ON foo (a, d) ;

And there are 3 triggers which fire before insert/delete/update.

I INSERT/UPDATE/DELETE approximately 300,000 records per day, but this
number is increasing on a daily basis as I make changes which improve the
performance of my data gathering tools (spiders ;). Two days ago, it had
reached the point where a search for a 3-word term (i.e. "free news
servers") took about 60 seconds. I have just spent 48 hours running a
VACUUM FULL on my table, and now the same search takes < 10 seconds. I
assume that the increase in performance is due to the decrease in
table/index size, which added up to approximately 1GB of freed space on
the machine, or approximately 4% of the original size of the table and
all its indices. But a 4% decrease in size should not add up to an 84%
increase in performance (is that right? I always get the ratio
confused :).

If all that VACUUM FULL did was move records from file 12345678.6 to file
12345678, the database would still be doing a large number of random
accesses on the table. And if VACUUM FULL clusters the data according to
the primary key, it would still be doing a large number of random
accesses on the table, because the primary key has almost nothing to do
with how I actually access the data in real life. So, is VACUUM FULL
looking somewhere in pg_statistics (or pg_stat_user_indexes) to determine
which index I actually use most (foo_ad), and then clustering the data
that way, or is there some other agent at work here?

- brian

Wm. Brian McCane                    | Life is full of doors that won't open
Search http://recall.maxbaud.net/   | when you knock, equally spaced amid those
Usenet http://freenews.maxbaud.net/ | that open when you don't want them to.
Auction http://www.sellit-here.com/ |   - Roger Zelazny "Blood of Amber"
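[For what it's worth: if the goal were to physically order the table by
foo_ad, the explicit way to do that in 7.2 is CLUSTER, not VACUUM FULL.
A sketch, with the caveat that the 7.2 CLUSTER reference page lists
serious side effects (the table is rewritten, and things like grants and
other indexes are not preserved), so read it before trying this:]

    CLUSTER foo_ad ON foo;    -- rewrite foo in foo_ad index order
    ANALYZE foo;              -- refresh planner statistics after the rewrite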
On 10 May 2002 at 19:18, Brian McCane wrote:

> But a 4% decrease in size should not add up to an 84% increase in
> performance (is that right? I always get the ratio confused :).

Have you tried an ANALYZE as well? It wasn't mentioned.

See http://www.postgresql.org/idocs/index.php?routine-vacuuming.html

  PostgreSQL's VACUUM command must be run on a regular basis for several
  reasons:

  1. To recover disk space occupied by updated or deleted rows.
  2. To update data statistics used by the PostgreSQL query planner.
  3. To protect against loss of very old data due to transaction ID
     wraparound.

Read 8.2.2. Updating planner statistics.

--
Dan Langille
The FreeBSD Diary - http://freebsddiary.org/ - practical examples
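[In other words, the routine maintenance for a table like this would be
something along these lines; as of 7.2, plain VACUUM and ANALYZE run
without the exclusive lock that VACUUM FULL takes:]

    VACUUM ANALYZE foo;   -- mark dead rows reusable and refresh planner stats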
Brian McCane <bmccane@mccons.net> writes:
> I INSERT/UPDATE/DELETE approximately 300,000 records per day, but this
> number is increasing on a daily basis as I make changes which improve
> the performance of my data gathering tools (spiders ;). Two days ago,
> it had reached the point where a search for a 3-word term (i.e. "free
> news servers") took about 60 seconds. I have just spent 48 hours
> running a VACUUM FULL on my table, and now the same search takes < 10
> seconds.

I would have to assume that the change is due to a change in plan, and
not directly due to VACUUM having compacted out a small part of the
space. Did you do VACUUM ANALYZE, or just plain VACUUM? When was the
last time you did VACUUM (or ANALYZE) on this table?

What plan is being used now? Do you have any idea what plan was used
before?

			regards, tom lane
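[The plan can be inspected with EXPLAIN; 7.2 also added EXPLAIN ANALYZE,
which executes the query and reports actual row counts and timings. The
query below is only a stand-in for Brian's real search:]

    EXPLAIN ANALYZE
    SELECT *
    FROM foo
    WHERE a = 12345 AND d = 42;   -- a stand-in query; foo_ad could serve this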
Hi Brian,

Are you performing full text search in any case?

Apart from optimizing the table/indexes (through VACUUM measures), I feel
the text indexes provided by contrib/tsearch can also lead to a
significant improvement in search performance.

regds
mallah.

On Saturday 11 May 2002 05:48 am, Brian McCane wrote:
> I have had a continued loss of performance on one of my largest tables
> (600,000,000 records). [...] Two days ago, it had reached the point
> where a search for a 3-word term (i.e. "free news servers") took about
> 60 seconds.

--
Rajesh Kumar Mallah,
Project Manager (Development)
Infocom Network Limited, New Delhi
phone: +91(11)6152172 (221) (L), 9811255597 (M)

Visit http://www.trade-india.com ,
India's Leading B2B eMarketplace.
I have tried contrib/tsearch, but I do fairly complex score computation
based on word placement (headers, body, graphics, metatags, etc.), and I
also do additional computations on the scores when the user does a
search. It is very quick to use something like tsearch to figure out
which pages match my criteria, but very slow to then compute a valid
score for each page. Instead, in an attempt to speed things up, a score
is computed for each distinct word on a page, and that score is stored in
this table. Then when a search is requested, I use previously existing
search information and data from this table to compute the new results.

The upshot of all this computing is that I get fairly relevant results,
but I have to do a lot of math to get there. I have been considering
combining the two methods: doing something like a full text search to
find pages that meet my criteria, and then using this table to actually
compute a page's score.

- brian

On Sat, 11 May 2002, Rajesh Kumar Mallah wrote:
> Apart from optimizing the table/indexes (through VACUUM measures), I
> feel the text indexes provided by contrib/tsearch can also lead to a
> significant improvement in search performance.

Wm. Brian McCane                    | Life is full of doors that won't open
Search http://recall.maxbaud.net/   | when you knock, equally spaced amid those
Usenet http://freenews.maxbaud.net/ | that open when you don't want them to.
Auction http://www.sellit-here.com/ |   - Roger Zelazny "Blood of Amber"
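[A sketch of the per-word score lookup Brian describes, against an
entirely hypothetical schema: words(word_id, word) and
word_score(page_id, word_id, score) are made-up names, since the thread
only identifies foo's columns as a, b, c, d. The idea is to aggregate
the precomputed scores at query time:]

    SELECT ws.page_id, sum(ws.score) AS total_score
    FROM word_score ws, words w
    WHERE w.word IN ('free', 'news', 'servers')
      AND ws.word_id = w.word_id
    GROUP BY ws.page_id
    ORDER BY total_score DESC
    LIMIT 20;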
Hi Brian,

Yes, contrib/tsearch does not provide relevance ranking at the moment,
but OpenFTS-perl/tcl (openfts.sf.net), from which tsearch is derived,
does provide relevance ranking, and it is also very fast. It works in a
similar fashion to what you mentioned in your reply, and OpenFTS also
does stemming on your search phrases.

But I understand that the applicability of those solutions is also
subject to your current requirements.

regds
mallah.

On Saturday 11 May 2002 10:23 pm, Brian McCane wrote:
> I have tried contrib/tsearch, but I do fairly complex score computation
> based on word placement (headers, body, graphics, metatags, etc.) [...]
> I have been considering combining the two methods: doing something like
> a full text search to find pages that meet my criteria, and then using
> this table to actually compute a page's score.

--
Rajesh Kumar Mallah,
Project Manager (Development)
Infocom Network Limited, New Delhi
phone: +91(11)6152172 (221) (L), 9811255597 (M)

Visit http://www.trade-india.com ,
India's Leading B2B eMarketplace.
Tom, et al,

> > ERROR: cannot find attribute 1 of relation schedule_notification_log
>
> This is more disturbing. Can you provide the sequence of commands that
> led up to it?

Here are a few related datapoints.

1) We think that schedule_notification_log is the only table that we
   loaded from the dump that had _no_ rows in it.

2) We encountered the same "cannot find attribute" error when doing
   something like this:

develop=# create table temp_event as select * from event ;
...
develop=# \d temp_event
Table "temp_event"
 Attribute | Type | Modifier
-----------+------+----------

develop=# select count(*) from temp_event ;
ERROR:  cannot find attribute 1 of relation temp_event
develop=# drop table temp_event ;
ERROR:  cannot find attribute 1 of relation temp_event

Note that "event" has plenty of rows. It may be our only table that
includes the INTERVAL datatype.

BTW, how do I get rid of temp_event?

So: Is there something about a table that might cause it to NOT load
correctly? Is there something about a table that might cause it to not
copy correctly (using CREATE TABLE ... AS SELECT * FROM ...)?

Ray

----------------------------------------------------------------------
Ray Ontko  rayo@ontko.com  Phone 1.765.935.4283  Fax 1.765.962.9788
Ray Ontko & Co.  Software Consulting Services  http://www.ontko.com/
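[One belt-and-braces variant of Tom's diagnostic, assuming the suspect
index is on pg_attribute: disabling index scans for the session should
steer the planner toward a sequential scan of the catalog, so the rows
are read directly from the heap:]

    SET enable_indexscan TO off;   -- avoid the possibly corrupt index
    SELECT attrelid, attnum, attname
    FROM pg_attribute
    WHERE attnum = 1
    ORDER BY attrelid;
    SET enable_indexscan TO on;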