Re: [HACKERS] New Driver and Unique Indexes - Mailing list pgsql-interfaces

From Bruce Momjian
Subject Re: [HACKERS] New Driver and Unique Indexes
Date
Msg-id 199804280242.WAA02534@candle.pha.pa.us
Whole thread Raw
In response to New Driver and Unique Indexes  (Byron Nikolaidis <byronn@insightdist.com>)
List pgsql-interfaces
> One downside about UNIQUE INDEXES however, is how Microsoft Access
> handles them when you open the table in datasheet view.  Whether you
> specify the unique index at link time, or the driver provides the info,
> Access will try to use queries which show up a problem with the backend:
>
> Here is an example of an Access query with a unique index on a single
> field:
>
> SELECT balance_id,company_id, balance_date, is_audited,comment,
> balance_type, balance_filename  FROM balance  WHERE balance_id = 1 OR
> balance_id = 2 OR balance_id = 3 OR balance_id = 4 OR balance_id = 5 OR
> balance_id = 6 OR balance_id = 7 OR balance_id = 8 OR balance_id = 9 OR
> balance_id = 10
>
> The more keyparts you have, the worse the problem is (2 keyparts):
>
> SELECT balance_id,company_id, balance_date, is_audited,comment,
> balance_type, balance_filename  FROM balance  WHERE balance_id = 1 AND
> company_id=1 OR balance_id = 1 AND company_id=2 OR balance_id = 1 AND
> company_id=3 OR balance_id = 2 AND company_id=1 OR balance_id = 2 AND
> company_id=2 OR balance_id = 2 AND company_id=3 OR balance_id = 3 AND
> company_id=1 OR balance_id = 3 AND company_id=2 OR balance_id = 3 AND
> company_id=3 OR balance_id = 4 AND company_id=1

OK, I have the dope on this one.  The palloc failure is not the OR
indexing, but rather the item:

    * Fix memory exhaustion when using many OR's

The bug report that prompted this is attached.  As you can see, it was
also prompted by MS-Access.  The problem is that the backend uses the
text-book method of processing OR's by converting the WHERE clause to
Conjunctive-Normal-Form(CNF), and this exponentially explodes the number
of tests where there are many OR clauses.

We are not sure how to fix it yet.  Vadim has improved the handling of
this in 6.3.*, but it still is not perfect and needs a solution.
Obviously other databases are not CNF'ifing the queries so there must be
a solution.  David?

---------------------------------------------------------------------------

Date: Mon, 12 Jan 1998 15:53:18 -0500
From: David Hartwig <daveh@insightdist.com>
To: Bruce Momjian <maillist@candle.pha.pa.us>
Subject: Re: [BUGS] General Bug Report: palloc fails with lots of ANDs and ORs

This is a multi-part message in MIME format.
--------------20C7AC27E8BCA117B23354BE
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Bruce,

I did some homework.   Here is what I have.   The default max data segment size on our (AIX 4.1.4) box is around 130000
kbytes.

I put together a query which put me just past the threshold of the palloc "out of memory error".  It is as follows:

create table outlet (
    number int,
    name varchar(30),
    ...
}

create unique index outlet_key on outlet using btree (number);

select count(*) from outlet
where
    (number = 1 and number = 1 and number = 1) or
    (number = 1 and number = 1 and number = 1) or
    (number = 1 and number = 1 and number = 1) or
    (number = 1 and number = 1 and number = 1) or
    (number = 1 and number = 1 and number = 1) or
    (number = 1 and number = 1 and number = 1) or
    (number = 1 and number = 1 and number = 1) or
    (number = 1 and number = 1 and number = 1) or
    (number = 1 and number = 1 and number = 1);

Not pretty but it makes the point.   Take out two OR clauses and the query works fine (but a bit slow).

The above query is all it takes to use up all 130000 Kbytes of memory.    And, since the query takes a long time to
finallyfail, I was able to 
observe the memory consumption.

I extended the max data segment to 300000.  And tried again.   I could observer the memory consumption up to about
280000when the system 
suddenly got sick.   I was getting all kinds of messages like "cant fork"; bad stuff.  The system did finally recover
onits own.   I am not 
sure happened there.   I know that ulimit puts us right around the physical memory limits of out system.

Using 300 meg for the above query seems like a bit of a problem.   It is difficult to imagine where all that memory is
beingused.   I will 
research the problem further if you need more information.

Bruce Momjian wrote:

> Try changing your OS default memory size.  Unsure how to do this under
> AIX.
>
> >
> >
> > ============================================================================
> >                         POSTGRESQL BUG REPORT TEMPLATE
> > ============================================================================
> >
> >
> > Your name             : David Hartwig
> > Your email address    : daveh@insightdist.com
> >
> > Category              : runtime: back-end: SQL
> > Severity              : serious
> >
> > Summary: palloc fails with lots of ANDs and ORs
> >
> > System Configuration
> > --------------------
> >   Operating System   : AIX 4.1
> >
> >   PostgreSQL version : 6.2
> >
> >   Compiler used      : native CC
> >
> > Hardware:
> > ---------
> > RS 6000
> >
> > Versions of other tools:
> > ------------------------
> > NA
> >
> > --------------------------------------------------------------------------
> >
> > Problem Description:
> > --------------------
> > The follow is a mail message describing the problem on the PostODBC mailing list:
> >
> >
> > I have run across this also.   We traced it down to a failure in the PostgreSQL server.   This occurs under the
followingconditions. 
> >
> >     1.  MS Access
> >     2.  Specify a multi-part key in the link time setup with postgresql
> >     3.  Click on table view.
> >
> > What happens is MS Access takes the following steps.   First it selects all possible key values for the table being
viewed.  I 
> > suspect it maps the key values to the relative row position in the display.   Then it uses the mapping to generate
futurequeries based 
> > on the mapping and the rows showing on the screen.   The queries take the following form:
> >
> >     SELECT  keypart1, keypart2, keypart3, col4, col5, col6 ... FROM example_table
> >     WHERE
> >         (keypart1 = row1keypartval1 AND  keypart2 = row1keypartval2 AND  keypart3 = row1keypartval3) OR
> >         (keypart1 = row2keypartval1 AND  keypart2 = row2keypartval2 AND  keypart3 = row2keypartval3) OR
> >             .
> >             .      --  28 lines of this stuff.   Why 28... Why not 28
> >             .
> >         (keypart1 = row27keypartval1 AND  keypart2 = row27keypartval2 AND  keypart3 = row27keypartval3) OR
> >         (keypart1 = row28keypartval1 AND  keypart2 = row28keypartval2 AND  keypart3 = row28keypartval3);
> >
> >
> > The PostgreSQL sever chokes on this statement claiming it is out of memory.  (palloc)  In this example I used a
threepart key.  I 
> > do not recall if a three part key is enough to trash the backend.  It has been a while.    I have tried sending
thesekinds of statements 
> > directly through the psql monitor and get the same result.
> >
> >
> > --------------------------------------------------------------------------
> >
> > Test Case:
> > ----------
> > select c1, c1 c3, c4, c5 ... from example_table
> > where
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something) or
> > (c1 = something and c2 = something and c3 = something and c4 = something);
> >
> >
> > --------------------------------------------------------------------------
> >
> > Solution:
> > ---------
> >
> >
> > --------------------------------------------------------------------------
> >
> >
> >
>
> --
> Bruce Momjian
> maillist@candle.pha.pa.us



--------------20C7AC27E8BCA117B23354BE
Content-Type: text/x-vcard; charset=us-ascii; name="vcard.vcf"
Content-Transfer-Encoding: 7bit
Content-Description: Card for David Hartwig
Content-Disposition: attachment; filename="vcard.vcf"

begin:          vcard
fn:             David Hartwig
n:              Hartwig;David
org:            Insight Distribution Systems
adr:            222 Shilling Circle;;;Hunt Valley ;MD;21030;USA
email;internet: daveh@insightdist.com
title:          Manager Research & Development
tel;work:       (410)403-2308
x-mozilla-cpt:  ;0
x-mozilla-html: TRUE
version:        2.1
end:            vcard


--------------20C7AC27E8BCA117B23354BE--


--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

pgsql-interfaces by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: [HACKERS] Re: [INTERFACES] retrieving varchar size
Next
From: The Hermit Hacker
Date:
Subject: [ssl-users] ANNOUNCE: PostgreSQL-SSL patch v.01a (fwd)