This is a *big* help.
Yes, the table is approximately 10-12 GB in size, and running your
length() and update queries is going to take a lifetime, since they
require a calculation on all 4 million rows.
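If the scan is unavoidable, it may at least be worth doing the whole
cleanup in one pass. This is only a sketch, assuming the table and
column names from the quoted reply below (adjust for the actual
schema); the WHERE clause keeps the UPDATE from rewriting rows that
are already short enough:

```sql
-- One-pass cleanup: only rows whose subject exceeds the btree
-- item-size limit are rewritten; shorter rows are scanned but
-- left untouched, which avoids 4 million needless row rewrites.
UPDATE tbl_mail_archive
   SET fld_mail_subject = SubStr(fld_mail_subject, 1, 2700)
 WHERE Length(fld_mail_subject) > 2700;
```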
This doesn't address the serious performance problem I'm seeing in
7.0.2 with a multi-key select/order by/limit/offset query, which I
described in a separate email.
Tim
Mike Mascari wrote:
>
> Tim Perdue wrote:
> > ...
> > After that, I get this error from psql:
> >
> > ERROR: btree: index item size 2820 exceeds maximum 2717
> >
> > Any way to tell where that item is at?
>
> I've been wondering at the state of the problems you've been
> having with PostgreSQL and wondering why I haven't experienced
> the same. I think this may very well be it. Earlier versions of
> PostgreSQL allowed for the creation of indexes on fields whose
> length would not permit at least 2 entries per index page. 95% of
> the time, things would work fine, but the other 5% of the time you
> would get corrupted data.
>
> Before creating the index:
>
> SELECT * FROM tbl_mail_archive WHERE Length(fld_mail_subject) >
> 2700;
>
> will get you the list of records that cannot be indexed. You're
> attempting to create a multi-key index, so I would truncate (or
> delete) any record whose fld_mail_subject is longer than 2700
> characters:
>
> UPDATE tbl_mail_archive SET fld_mail_subject =
> SubStr(fld_mail_subject, 1, 2700)
> WHERE Length(fld_mail_subject) > 2700;
>
> At this point, your index creation should be relatively quick
> (and successful) depending upon how many rows you have. I have a
> few tables with ~2 million rows that take about 5-10 minutes
> (with fsync off, naturally) to index. I would also recommend
> letting PostgreSQL determine the correct "ops":
>
> CREATE INDEX idx_mail_archive_list_subject
> ON tbl_mail_archive (fld_mail_list, fld_mail_subject);
>
> Without following the lists every day, most people wouldn't know
> about this issue. I'm surprised it took so long for PostgreSQL
> 7.0.2 to bail on the index creation though. Is this a
> particularly large table? At any rate, this is an example of a
> bug that *would* allow the kinds of corruption you've seen in
> the past, and it has been addressed in 7.0.2, as Tom Lane crushed
> bugs like it by the hundreds. If you can:
>
> psql db_geocrawler < 6_4dump.txt
>
> and it never bails, then you know all your data is "clean". Until
> that point, any index you have on a "text" datatype is subject to
> similar problems.
>
> Hope that helps,
>
> Mike Mascari
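One quick sanity check before re-creating the index might be to look
at the longest remaining subject. Again just a sketch, assuming the
names used above; the 2717-byte figure comes from the error message
quoted at the top:

```sql
-- If this comes back at or under 2700, the multi-key index on
-- (fld_mail_list, fld_mail_subject) should build without hitting
-- the "index item size ... exceeds maximum 2717" error.
SELECT Max(Length(fld_mail_subject)) AS longest_subject
FROM tbl_mail_archive;
```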
--
Founder - PHPBuilder.com / Geocrawler.com
Lead Developer - SourceForge
VA Linux Systems
408-542-5723