Re: Query completed in < 1s in PG 9.1 and ~ 700s in PG 9.2 - Mailing list pgsql-performance

From Merlin Moncure
Subject Re: Query completed in < 1s in PG 9.1 and ~ 700s in PG 9.2
Date
Msg-id CAHyXU0yKyz43s1GXO2QBDRwK8FeNqNAb6S0XUU66dSg+BD=xhg@mail.gmail.com
In response to Re: Query completed in < 1s in PG 9.1 and ~ 700s in PG 9.2  (Rodrigo Rosenfeld Rosas <rr.rosas@gmail.com>)
List pgsql-performance
On Tue, Nov 6, 2012 at 12:09 PM, Rodrigo Rosenfeld Rosas
<rr.rosas@gmail.com> wrote:
> http://explain.depesz.com/s/ToX (fast on 9.1)
> http://explain.depesz.com/s/65t (fast on 9.2)
> http://explain.depesz.com/s/gZm (slow on 9.1)
> http://explain.depesz.com/s/END (slow on 9.2 - funny that the generated URL
> was END while this was my last explain :D )

Hm -- looking at your 'slow' 9.2 query, it is reporting that the query
took 3 seconds (reported times are in milliseconds).  How are you
timing the query?  What happens when you run explain analyze
<your_query> from psql (as in, how long does it take)?
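Concretely, that would look like this from a psql session (\timing is a
psql meta-command; substitute your actual query for the placeholder):

```sql
-- Turn on client-side elapsed-time reporting in psql
\timing on

-- EXPLAIN ANALYZE actually executes the query and reports
-- per-node and total times in milliseconds
EXPLAIN ANALYZE <your_query>;
```

If the \timing figure is far larger than the total EXPLAIN ANALYZE time,
the slowness is outside the server-side execution (e.g. planning, network,
or the client), which narrows down where to look.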

> Let me explain how the application works, how the database was designed and
> hopefully you'll be able to guide me in the correct way to design the
> database for this use case.
>
> Our application will present a big contract to some attorneys. There is
> currently a dynamic template with around 800 fields to be extracted from
> each contract in our system. These fields can be of different types
> (boolean, string, number, currency, percents, fixed options, dates,
> time-spans and so on). There is a fields tree that is maintained by the
> application editors. The application will allow the attorneys to read the
> contracts and highlight parts of the contract where they extracted each
> field from and associate each field with its value interpreted by the
> attorney and store the reference to what paragraphs in the contract
> demonstrate where the value came from.
>
> Then there is an interface that will allow clients to search for
> transactions based on its associated contracts and those ~800 fields. For
> the particular query above, 14 of the 800 fields have been searched by this
> particular user (most of them were boolean ones plus a few options and a
> string field). Usually the queries perform much better when less than 10
> fields are used in the criteria. But our client wants us to handle up to 20
> fields in a single query or they won't close the deal and this is a really
> important client to us.
>
> So, for the time being my only plan is to roll back to PG 9.1 and replace my
> query builder, which currently generates queries like slow.sql, so that it
> generates queries like fast.sql instead. But I'm pretty sure this approach
> should be avoided; I just don't know any other alternative for the time being.
>
> What database design would you recommend me for this use case?

I would strongly consider investigation of the hstore type along with
a gist/gin index.
select * from company_transaction where contract_attributes @>
'State=>Delaware, Paid=>Y';
etc
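Spelled out end to end, that suggestion might look like the sketch below.
The table and column names (company_transaction, contract_attributes) come
from the example above; everything else is assumed:

```sql
-- hstore ships as a contrib extension
CREATE EXTENSION IF NOT EXISTS hstore;

-- One key=>value map per transaction instead of ~800 joined rows
ALTER TABLE company_transaction
    ADD COLUMN contract_attributes hstore;

-- A GIN index makes @> (containment) searches on the map fast
CREATE INDEX company_transaction_attrs_idx
    ON company_transaction USING gin (contract_attributes);

-- "All transactions whose contract has State=Delaware and Paid=Y";
-- adding more criteria just adds more pairs to the literal
SELECT *
  FROM company_transaction
 WHERE contract_attributes @> 'State=>Delaware, Paid=>Y'::hstore;
```

The appeal for this use case is that a 14- or 20-field search stays a
single indexed containment test rather than growing a join or subquery
per field.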

Barring that, I would then consider complete elimination of integer
proxies for your variables.  They make your query virtually impossible
to read/write, and they don't help.
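For illustration only (the table and column names here are made up, not
taken from the actual schema), the difference reads roughly like this:

```sql
-- With integer proxies: the reader must know that 137 means "Governing Law"
SELECT t.id
  FROM transaction_condition c
  JOIN company_transaction t ON t.id = c.transaction_id
 WHERE c.field_id = 137
   AND c.string_value = 'Delaware';

-- Without them: the key is self-describing, and the generated SQL
-- can be read and debugged directly
SELECT t.id
  FROM transaction_condition c
  JOIN company_transaction t ON t.id = c.transaction_id
 WHERE c.field_name = 'governing_law'
   AND c.string_value = 'Delaware';
```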

merlin

