Thread: Trying to track down weird query stalls

Trying to track down weird query stalls

From: dan@sidhe.org
I'm running a 64-bit build of Postgres 8.3.5 on AIX 5.3, and have a really
strange, annoying transient problem with one particular query stalling.

The symptom here is that when this query is made with X or more records in
a temp table involved in the join (where X is constant when the problem
manifests, but is different between manifestations) the query takes about
20 minutes. When the query is made with X-1 records it takes less than a
second. It's just this one query -- for everything else the system's nice
and snappy. The machine's load is never above 9 (it's a 32 CPU box) and it
has never had less than 60G of free system memory.

An EXPLAIN ANALYZE of the two queries (with X-1 and X records) is even
more bizarre. Not only are the two query plans identical (save trivial
differences because of the record count differences), but the EXPLAIN
ANALYZE total runtimes are near-identical -- the fast case showed 259ms,
the slow case 265ms.

When the slow query was actually run, though, it took 20 minutes. There
were no blocked back ends shown in pg_stat_activity, and the back end
itself was definitely moving. I trussed the back end stuck running the
slow query and it spent nearly all its time doing kread() and kwrite()
calls. The DB log didn't show anything interesting in it. I checked to
make sure the SQL statement that was EXPLAIN ANALYZEd was the one actually
executed, and I pulled the client code into the debugger and
single-stepped through just to make sure it was getting stuck on that one
SQL statement and it wasn't the client doing something unexpected.

Just to be even more annoying, the problem goes away. Right now I can't
trigger the problem. Last Friday it happened reliably when feeding 58
records into this query. The week before it was 38 records. Today?
Nothing, the system's full of snappy.

At the moment I'm at a loss as to what's going wrong, and the fact that I
can't duplicate it right now, when I actually have time to look into it,
is damned annoying. What I'm looking for are some hints as to where to
look the next time it does happen, or things I can flip on to catch more
in the log. (The logging parameters are all set at their defaults.) I'm
going to try attaching with dbx to get a stack trace (I assume this is
relatively safe?) but past that I'm kind of stumped.
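
For example, is this the sort of thing worth flipping on? (Only a sketch --
all three settings exist in 8.3, they need superuser or a postgresql.conf
edit plus a reload, and the values below are just illustrative.)

    SET log_statement = 'all';   -- capture the statement text actually sent
    SET log_lock_waits = on;     -- log waits that outlast deadlock_timeout
    SET log_temp_files = 0;      -- log every temp file spilled to disk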

Help?

Re: Trying to track down weird query stalls

From: Robert Haas
On Mon, Mar 30, 2009 at 1:50 PM,  <dan@sidhe.org> wrote:
> I'm running a 64-bit build of Postgres 8.3.5 on AIX 5.3, and have a really
> strange, annoying transient problem with one particular query stalling.
>
> The symptom here is that when this query is made with X or more records in
> a temp table involved in the join (where X is constant when the problem
> manifests, but is different between manifestations) the query takes about
> 20 minutes. When the query is made with X-1 records it takes less than a
> second. It's just this one query -- for everything else the system's nice
> and snappy. The machine load's never above 9 (it's a 32 CPU box) and
> hasn't had less than 60G of free system memory on it.
>
> An EXPLAIN ANALYZE of the two queries (with X-1 and X records) is even
> more bizarre. Not only are the two query plans identical (save trivial
> differences because of the record count differences) but the explain
> EXPLAIN ANALYZE total runtimes are near-identical -- the fast case showed
> 259ms, the slow case 265ms.
>
> When the slow query was actually run, though, it took 20 minutes. There
> were no blocked back ends shown in pg_stat_activity, and the back end
> itself was definitely moving. I trussed the back end stuck running the
> slow query and it spent nearly all its time doing kread() and kwrite()
> calls. The DB log didn't show anything interesting in it. I checked to
> make sure the SQL statement that was EXPLAIN ANALYZEd was the one actually
> executed, and I pulled the client code into the debugger and
> single-stepped through just to make sure it was getting stuck on that one
> SQL statement and it wasn't the client doing something unexpected.
>
> Just to be even more annoying, the problem goes away. Right now I can't
> trigger the problem. Last friday it happened reliably feeding 58 records
> into this query. The week before it was 38 records. Today? Nothing, the
> system's full of snappy.
>
> At the moment I'm at a loss as to what's going wrong, and the fact that I
> can't duplicate it right now when I actually have time to look into the
> problem's damned annoying. What I'm looking for are some hints as to where
> to look the next time it does happen, or things I can flip on to catch
> more in the log. (The logging parameters are all set at their defaults)
> I'm going to try attaching with dbx to get a stack trace (I assume this is
> relatively safe?) but past that I'm kind of stumped.

log_min_duration_statement is a good place to start, but it sounds
like the query plan you're getting when you test it by hand isn't the
same one that's actually executing, so I'm not sure how far that's
going to get you.  contrib/auto_explain sounds like it would be just
the thing, but unfortunately that's an 8.4 feature which hasn't been
released yet.  I'm not sure whether it could be built and run against
8.3.
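
A sketch of the first part, in case it helps -- run it as a superuser, or
put the equivalent in postgresql.conf and reload; the 1000 ms threshold is
only illustrative:

    -- 0 would log every statement; -1 (the default) disables it.
    SET log_min_duration_statement = 1000;

That gets the statement text and the real elapsed time into the log, but
not the plan, which is the piece auto_explain would have covered.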

...Robert

Re: Trying to track down weird query stalls

From: dan@sidhe.org
> On Mon, Mar 30, 2009 at 1:50 PM,  <dan@sidhe.org> wrote:
>> I'm running a 64-bit build of Postgres 8.3.5 on AIX 5.3, and have a
>> really
>> strange, annoying transient problem with one particular query stalling.
>>
>> The symptom here is that when this query is made with X or more records
>> in
>> a temp table involved in the join (where X is constant when the problem
>> manifests, but is different between manifestations) the query takes
>> about
>> 20 minutes. When the query is made with X-1 records it takes less than a
>> second. It's just this one query -- for everything else the system's
>> nice
>> and snappy. The machine load's never above 9 (it's a 32 CPU box) and
>> hasn't had less than 60G of free system memory on it.
>>
>> An EXPLAIN ANALYZE of the two queries (with X-1 and X records) is even
>> more bizarre. Not only are the two query plans identical (save trivial
>> differences because of the record count differences) but the explain
>> EXPLAIN ANALYZE total runtimes are near-identical -- the fast case
>> showed
>> 259ms, the slow case 265ms.
>
> log_min_duration_statement is a good place to start, but it sounds
> like the query plan you're getting when you test it by hand isn't the
> same one that's actually executing, so I'm not sure how far that's
> going to get you.  contrib/auto_explain sounds like it would be just
> the thing, but unfortunately that's an 8.4 feature which hasn't been
> released yet.  I'm not sure whether it could be built and run against
> 8.3.

I'm not executing any of the EXPLAINs by hand, because I didn't want to
have to worry about typos or filling in temp tables with test data. Inside
the app the SQL for the problematic query's stored in a variable -- when
the task runs with debugging enabled it first executes the query with
EXPLAIN ANALYZE prepended and dumps the output, then it executes the query
itself. It's possible something's going wrong in that, but the code's
pretty simple.

Arguably in this case the actual query should run faster than the EXPLAIN
ANALYZE version, since the cache is hot. (Though that'd only likely shave
a few dozen ms off the runtime)

-Dan

Re: Trying to track down weird query stalls

From: Robert Haas
On Mon, Mar 30, 2009 at 2:42 PM,  <dan@sidhe.org> wrote:
>> On Mon, Mar 30, 2009 at 1:50 PM,  <dan@sidhe.org> wrote:
>>> I'm running a 64-bit build of Postgres 8.3.5 on AIX 5.3, and have a
>>> really
>>> strange, annoying transient problem with one particular query stalling.
>>>
>>> The symptom here is that when this query is made with X or more records
>>> in
>>> a temp table involved in the join (where X is constant when the problem
>>> manifests, but is different between manifestations) the query takes
>>> about
>>> 20 minutes. When the query is made with X-1 records it takes less than a
>>> second. It's just this one query -- for everything else the system's
>>> nice
>>> and snappy. The machine load's never above 9 (it's a 32 CPU box) and
>>> hasn't had less than 60G of free system memory on it.
>>>
>>> An EXPLAIN ANALYZE of the two queries (with X-1 and X records) is even
>>> more bizarre. Not only are the two query plans identical (save trivial
>>> differences because of the record count differences) but the explain
>>> EXPLAIN ANALYZE total runtimes are near-identical -- the fast case
>>> showed
>>> 259ms, the slow case 265ms.
>>
>> log_min_duration_statement is a good place to start, but it sounds
>> like the query plan you're getting when you test it by hand isn't the
>> same one that's actually executing, so I'm not sure how far that's
>> going to get you.  contrib/auto_explain sounds like it would be just
>> the thing, but unfortunately that's an 8.4 feature which hasn't been
>> released yet.  I'm not sure whether it could be built and run against
>> 8.3.
>
> I'm not executing any of the EXPLAINs by hand, because I didn't want to
> have to worry about typos or filling in temp tables with test data. Inside
> the app the SQL for the problematic query's stored in a variable -- when
> the task runs with debugging enabled it first executes the query with
> EXPLAIN ANALYZE prepended and dumps the output, then it executes the query
> itself. It's possible something's going wrong in that, but the code's
> pretty simple.
>
> Arguably in this case the actual query should run faster than the EXPLAIN
> ANALYZE version, since the cache is hot. (Though that'd only likely shave
> a few dozen ms off the runtime)

Well... yeah.  Also EXPLAIN ANALYZE has a non-trivial amount of
overhead, so that is quite bizarre.  I have to suspect there is some
subtle difference between the way the EXPLAIN ANALYZE is done and the
way the actual query is done... like maybe one uses parameter
substitution and the other doesn't or, well, I don't know.  But I
don't see how turning on debugging (which is essentially what EXPLAIN
ANALYZE is) can prevent the query from being slow.

...Robert

Re: Trying to track down weird query stalls

From: dan@sidhe.org
> On Mon, Mar 30, 2009 at 2:42 PM,  <dan@sidhe.org> wrote:
>>> On Mon, Mar 30, 2009 at 1:50 PM,  <dan@sidhe.org> wrote:
>> I'm not executing any of the EXPLAINs by hand, because I didn't want to
>> have to worry about typos or filling in temp tables with test data.
>> Inside
>> the app the SQL for the problematic query's stored in a variable -- when
>> the task runs with debugging enabled it first executes the query with
>> EXPLAIN ANALYZE prepended and dumps the output, then it executes the
>> query
>> itself. It's possible something's going wrong in that, but the code's
>> pretty simple.
>>
>> Arguably in this case the actual query should run faster than the
>> EXPLAIN
>> ANALYZE version, since the cache is hot. (Though that'd only likely
>> shave
>> a few dozen ms off the runtime)
>
> Well... yeah.  Also EXPLAIN ANALYZE has a non-trivial amount of
> overhead, so that is quite bizarre.  I have to suspect there is some
> subtle difference between the way the EXPLAIN ANALYZE is done and the
> way the actual query is done... like maybe one uses parameter
> substitution and the other doesn't or, well, I don't know.  But I
> don't see how turning on debugging (which is essentially what EXPLAIN
> ANALYZE is) can prevent the query from being slow.

Hence the query to the list. *Something* is going on, and beats me what.
I'm assuming I'm triggering some bug in the postgres back end, or there's
some completely bizarre edge case that this tickles. (The massive
kread/kwrite activity that truss showed me when I checked seemed rather
unusual, to say the least)

EXPLAIN ANALYZE is my normal means of diagnosing performance problems, but
that isn't helping as it shows perfectly sane results. That leaves
abnormal means, and outside of trussing the back end or attaching with dbx
to get a stack trace I just don't have any of those. I'm not even sure
what I should be looking for when I do get a stack trace.

-Dan

Re: Trying to track down weird query stalls

From: dan@sidhe.org
> On Mon, Mar 30, 2009 at 12:42 PM,  <dan@sidhe.org> wrote:
>> Arguably in this case the actual query should run faster than the
>> EXPLAIN
>> ANALYZE version, since the cache is hot. (Though that'd only likely
>> shave
>> a few dozen ms off the runtime)
>
> Joining a lot of tables together?  Could be GEQO kicking in.

Only if I get different query plans for the query depending on whether
it's being EXPLAIN ANALYZEd or not. That seems unlikely...

-Dan

Re: Trying to track down weird query stalls

From: Scott Marlowe
On Mon, Mar 30, 2009 at 12:42 PM,  <dan@sidhe.org> wrote:
>> On Mon, Mar 30, 2009 at 1:50 PM,  <dan@sidhe.org> wrote:
>>> I'm running a 64-bit build of Postgres 8.3.5 on AIX 5.3, and have a
>>> really
>>> strange, annoying transient problem with one particular query stalling.
>>>
>>> The symptom here is that when this query is made with X or more records
>>> in
>>> a temp table involved in the join (where X is constant when the problem
>>> manifests, but is different between manifestations) the query takes
>>> about
>>> 20 minutes. When the query is made with X-1 records it takes less than a
>>> second. It's just this one query -- for everything else the system's
>>> nice
>>> and snappy. The machine load's never above 9 (it's a 32 CPU box) and
>>> hasn't had less than 60G of free system memory on it.
>>>
>>> An EXPLAIN ANALYZE of the two queries (with X-1 and X records) is even
>>> more bizarre. Not only are the two query plans identical (save trivial
>>> differences because of the record count differences) but the explain
>>> EXPLAIN ANALYZE total runtimes are near-identical -- the fast case
>>> showed
>>> 259ms, the slow case 265ms.
>>
>> log_min_duration_statement is a good place to start, but it sounds
>> like the query plan you're getting when you test it by hand isn't the
>> same one that's actually executing, so I'm not sure how far that's
>> going to get you.  contrib/auto_explain sounds like it would be just
>> the thing, but unfortunately that's an 8.4 feature which hasn't been
>> released yet.  I'm not sure whether it could be built and run against
>> 8.3.
>
> I'm not executing any of the EXPLAINs by hand, because I didn't want to
> have to worry about typos or filling in temp tables with test data. Inside
> the app the SQL for the problematic query's stored in a variable -- when
> the task runs with debugging enabled it first executes the query with
> EXPLAIN ANALYZE prepended and dumps the output, then it executes the query
> itself. It's possible something's going wrong in that, but the code's
> pretty simple.
>
> Arguably in this case the actual query should run faster than the EXPLAIN
> ANALYZE version, since the cache is hot. (Though that'd only likely shave
> a few dozen ms off the runtime)

Joining a lot of tables together?  Could be GEQO kicking in.
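
Easy enough to check or rule out from the session the app uses -- just a
sketch, nothing specific to your schema:

    SHOW geqo;            -- is the genetic optimizer enabled at all?
    SHOW geqo_threshold;  -- GEQO kicks in once a query has at least this many FROM items (default 12)
    SET geqo TO off;      -- rule it out entirely for this session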

Re: Trying to track down weird query stalls

From: Scott Marlowe
On Mon, Mar 30, 2009 at 1:42 PM,  <dan@sidhe.org> wrote:
>> On Mon, Mar 30, 2009 at 12:42 PM,  <dan@sidhe.org> wrote:
>>> Arguably in this case the actual query should run faster than the
>>> EXPLAIN
>>> ANALYZE version, since the cache is hot. (Though that'd only likely
>>> shave
>>> a few dozen ms off the runtime)
>>
>> Joining a lot of tables together?  Could be GEQO kicking in.
>
> Only if I get different query plans for the query depending on whether
> it's being EXPLAIN ANALYZEd or not. That seems unlikely...

Yes, you can get different plans.  In fact you often will.  Not because
it's being explained or not, just because that's how GEQO works.

Re: Trying to track down weird query stalls

From: dan@sidhe.org
> On Mon, Mar 30, 2009 at 1:42 PM,  <dan@sidhe.org> wrote:
>>> On Mon, Mar 30, 2009 at 12:42 PM,  <dan@sidhe.org> wrote:
>>>> Arguably in this case the actual query should run faster than the
>>>> EXPLAIN
>>>> ANALYZE version, since the cache is hot. (Though that'd only likely
>>>> shave
>>>> a few dozen ms off the runtime)
>>>
>>> Joining a lot of tables together?  Could be GEQO kicking in.
>>
>> Only if I get different query plans for the query depending on whether
>> it's being EXPLAIN ANALYZEd or not. That seems unlikely...
>
> Yes, you can.  In fact you often will.  Not because it's being
> explained or not, just because that's how GEQO works.

Ouch. I did *not* know that was possible -- I assumed that the plan was
deterministic and independent of explain analyze. The query has seven
tables (one of them a temp table) and my geqo_threshold is set to 12. If
I'm reading the docs right GEQO shouldn't kick in.

-Dan

Re: Trying to track down weird query stalls

From: Robert Haas
On Mon, Mar 30, 2009 at 4:02 PM,  <dan@sidhe.org> wrote:
>> On Mon, Mar 30, 2009 at 1:42 PM,  <dan@sidhe.org> wrote:
>>>> On Mon, Mar 30, 2009 at 12:42 PM,  <dan@sidhe.org> wrote:
>>>>> Arguably in this case the actual query should run faster than the
>>>>> EXPLAIN
>>>>> ANALYZE version, since the cache is hot. (Though that'd only likely
>>>>> shave
>>>>> a few dozen ms off the runtime)
>>>>
>>>> Joining a lot of tables together?  Could be GEQO kicking in.
>>>
>>> Only if I get different query plans for the query depending on whether
>>> it's being EXPLAIN ANALYZEd or not. That seems unlikely...
>>
>> Yes, you can.  In fact you often will.  Not because it's being
>> explained or not, just because that's how GEQO works.
>
> Ouch. I did *not* know that was possible -- I assumed that the plan was
> deterministic and independent of explain analyze. The query has seven
> tables (one of them a temp table) and my geqo_threshold is set to 12. If
> I'm reading the docs right GEQO shouldn't kick in.

Any chance we could see the actual query?  Right now I think we are
shooting in the dark.

...Robert

Re: Trying to track down weird query stalls

From: dan@sidhe.org
> On Mon, Mar 30, 2009 at 4:02 PM,  <dan@sidhe.org> wrote:
>>> On Mon, Mar 30, 2009 at 1:42 PM,  <dan@sidhe.org> wrote:
>>>>> On Mon, Mar 30, 2009 at 12:42 PM,  <dan@sidhe.org> wrote:
>>>>>> Arguably in this case the actual query should run faster than the
>>>>>> EXPLAIN
>>>>>> ANALYZE version, since the cache is hot. (Though that'd only likely
>>>>>> shave
>>>>>> a few dozen ms off the runtime)
>>>>>
>>>>> Joining a lot of tables together?  Could be GEQO kicking in.
>>>>
>>>> Only if I get different query plans for the query depending on whether
>>>> it's being EXPLAIN ANALYZEd or not. That seems unlikely...
>>>
>>> Yes, you can.  In fact you often will.  Not because it's being
>>> explained or not, just because that's how GEQO works.
>>
>> Ouch. I did *not* know that was possible -- I assumed that the plan was
>> deterministic and independent of explain analyze. The query has seven
>> tables (one of them a temp table) and my geqo_threshold is set to 12. If
>> I'm reading the docs right GEQO shouldn't kick in.
>
> Any chance we could see the actual query?  Right now I think we are
> shooting in the dark.

The query is:

select distinct
       temp_symbol.entityid,
       temp_symbol.libname,
       temp_symbol.objid,
       temp_symbol.objname,
       temp_symbol.fromsymid,
       temp_symbol.fromsymtype,
       temp_symbol.objinstance,
       NULL,
       temp_symbol.csid,
       libinstance.entityid,
       NULL,
       libobject.objid,
       NULL,
       provide_symbol.symbolid,
       provide_symbol.symboltype,
       libobject.objinstance,
       libobject.libinstanceid,
       objectinstance.csid,
       NULL,
       provide_symbol.is_weak,
       NULL,
       provide_symbol.is_local,
       NULL,
       provide_symbol.is_template,
       NULL,
       provide_symbol.is_common
  from libinstance,
       library,
       libobject,
       provide_symbol,
       temp_symbol,
       objectinstance,
       attributes
where libinstance.libdate <= 1238445044
   and libinstance.enddate > 1238445044
   and libinstance.libinstanceid = libobject.libinstanceid
   and libinstance.architecture = ?


   and attributes.entityid = libinstance.entityid
   and attributes.branchid = libinstance.branchid
   and attributes.architecture = libinstance.architecture
   and library.libid = libinstance.libid
   and not secondary
and attribute in ('notoffline', 'notoffline')
and (provide_symbol.symboltype = 'T')
   and libobject.objinstance = provide_symbol.objinstance
   and libinstance.branchid = ?
   and provide_symbol.symbolid = temp_symbol.symbolid
   and objectinstance.objinstance = libobject.objinstance
and libinstance.istemp =  0

The EXPLAIN ANALYZE output for the query is attached, in a (possibly
hopeless) attempt to keep it from being word-wrapped into unreadability.

-Dan

[Attachment: EXPLAIN ANALYZE output]

Re: Trying to track down weird query stalls

From: Alvaro Herrera
dan@sidhe.org escribió:

> where libinstance.libdate <= 1238445044
>    and libinstance.enddate > 1238445044
>    and libinstance.libinstanceid = libobject.libinstanceid
>    and libinstance.architecture = ?

How are you generating the explain?  My bet is that you're just
substituting a literal in the architecture condition, but if the driver
is smart then maybe it's preparing the query beforehand.  You'll get a
different plan in that case.  Try something like this:

prepare foo(smallint) as ...
   where libinstance.architecture = $1
   ...

explain analyze execute foo(1);

If the plan you get from that is bad (and it often is), you should look
at avoiding a query prepare.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Trying to track down weird query stalls

From: dan@sidhe.org
> dan@sidhe.org escribió:
>
>> where libinstance.libdate <= 1238445044
>>    and libinstance.enddate > 1238445044
>>    and libinstance.libinstanceid = libobject.libinstanceid
>>    and libinstance.architecture = ?
>
> How are you generating the explain?  My bet is that you're just
> substituting a literal in the architecture condition, but if the driver
> is smart then maybe it's preparating the query beforehand.  You'll get a
> different plan in that case.

I don't think so. Perl's DBI is involved, but the statement's in a
variable. The code in question is:

  if ($db->{debug}) {
    $db->debug("SQL is: $sql\n");
    my $rows = $db->{dbh}->selectall_arrayref("explain analyze $sql",
                                              undef, $db->{arch},
                                              $db->{basebranch});
    foreach my $row (@$rows) {
      $db->debug(join(" ", @$row) . "\n");
    }
    $db->debug_stamp("Initial query done\n");
  }

  $rows = $db->{dbh}->selectall_arrayref($sql,
                                         undef, $db->{arch},
                                         $db->{basebranch});

There's no transform of the sql variable between the two statements, just
a quick loop over the returned rows from the explain analyze to print them
out. (I did try to make sure that the debugging bits were done the same
way as the mainline code, but I may have bobbled something)

-Dan



Re: Trying to track down weird query stalls

From: Alvaro Herrera
dan@sidhe.org escribió:
> > dan@sidhe.org escribió:
> >
> >> where libinstance.libdate <= 1238445044
> >>    and libinstance.enddate > 1238445044
> >>    and libinstance.libinstanceid = libobject.libinstanceid
> >>    and libinstance.architecture = ?
> >
> > How are you generating the explain?  My bet is that you're just
> > substituting a literal in the architecture condition, but if the driver
> > is smart then maybe it's preparating the query beforehand.  You'll get a
> > different plan in that case.
>
> I don't think so. Perl's DBI is involved, but the statement's in a
> variable.

So what's the "?" in the query you pasted earlier?

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Trying to track down weird query stalls

From: dan@sidhe.org
> dan@sidhe.org escribió:
>> > dan@sidhe.org escribió:
>> >
>> >> where libinstance.libdate <= 1238445044
>> >>    and libinstance.enddate > 1238445044
>> >>    and libinstance.libinstanceid = libobject.libinstanceid
>> >>    and libinstance.architecture = ?
>> >
>> > How are you generating the explain?  My bet is that you're just
>> > substituting a literal in the architecture condition, but if the
>> driver
>> > is smart then maybe it's preparating the query beforehand.  You'll get
>> a
>> > different plan in that case.
>>
>> I don't think so. Perl's DBI is involved, but the statement's in a
>> variable.
>
> So what's the "?" in the query you pasted earlier?

The first ? (for architecture) is 1, the second ? (for branchid) is 0.
They both should get passed to Postgres as $1 and $2, respectively,
assuming DBD::Pg does its substitution right. (They're both supposed to go
in as placeholders)

-Dan

Re: Trying to track down weird query stalls

From: Alvaro Herrera
dan@sidhe.org escribió:

> > So what's the "?" in the query you pasted earlier?
>
> The first ? (for architecture) is 1, the second ? (for branchid) is 0.
> They both should get passed to Postgres as $1 and $2, respectively,
> assuming DBD::Pg does its substitution right. (They're both supposed to go
> in as placeholders)

Right, so how about you reread what I wrote above?

Oh, hmm, so to be more clear: I don't think DBD::Pg is actually sending
EXECUTE PREPARE.  You need to do this over psql.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Trying to track down weird query stalls

From: dan@sidhe.org
> dan@sidhe.org escribió:
>
>> > So what's the "?" in the query you pasted earlier?
>>
>> The first ? (for architecture) is 1, the second ? (for branchid) is 0.
>> They both should get passed to Postgres as $1 and $2, respectively,
>> assuming DBD::Pg does its substitution right. (They're both supposed to
>> go
>> in as placeholders)
>
> Right, so how about you reread what I wrote above?
>
> Oh, hmm, so to be more clear: I don't think DBD::Pg is actually sending
> EXECUTE PREPARE.  You need to do this over psql.

Fair enough. (And sorry about the mis-read) Next time this occurs I'll try
and duplicate this in psql. FWIW, a quick read of the C underlying the
DBD::Pg module shows it using PQexecPrepared, so I'm pretty sure it is
using prepared statements with placeholders, but double-checking seems
prudent.
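
Roughly what I'm planning to run in psql, sketched against a toy table
since the real query needs its temp table populated -- "t" and
"stall_check" are just stand-in names, with the same 1 and 0 bound for
architecture and branchid:

    CREATE TEMP TABLE t (architecture int, branchid int, payload text);
    PREPARE stall_check(int, int) AS
      SELECT payload FROM t WHERE architecture = $1 AND branchid = $2;
    EXPLAIN ANALYZE EXECUTE stall_check(1, 0);
    DEALLOCATE stall_check;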

-Dan

Re: Trying to track down weird query stalls

From: Alvaro Herrera
dan@sidhe.org escribió:

> Fair enough. (And sorry about the mis-read) Next time this occurs I'll try
> and duplicate this in psql. FWIW, a quick read of the C underlying the
> DBD::Pg module shows it using PQexecPrepared, so I'm pretty sure it is
> using prepared statements with placeholders, but double-checking seems
> prudent.

Yes, but I doubt that it'll be smart enough to work for EXPLAIN in the
way we need here.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Trying to track down weird query stalls

From: Scott Carey
On 3/30/09 2:34 PM, "dan@sidhe.org" <dan@sidhe.org> wrote:

>> dan@sidhe.org escribió:
>>
>>>> So what's the "?" in the query you pasted earlier?
>>>
>>> The first ? (for architecture) is 1, the second ? (for branchid) is 0.
>>> They both should get passed to Postgres as $1 and $2, respectively,
>>> assuming DBD::Pg does its substitution right. (They're both supposed to
>>> go
>>> in as placeholders)
>>
>> Right, so how about you reread what I wrote above?
>>
>> Oh, hmm, so to be more clear: I don't think DBD::Pg is actually sending
>> EXECUTE PREPARE.  You need to do this over psql.
>
> Fair enough. (And sorry about the mis-read) Next time this occurs I'll try
> and duplicate this in psql. FWIW, a quick read of the C underlying the
> DBD::Pg module shows it using PQexecPrepared, so I'm pretty sure it is
> using prepared statements with placeholders, but double-checking seems
> prudent.
>
> -Dan
>

Regardless, it's always a good idea to do a manual EXPLAIN ANALYZE with and
without parameterization in psql if prepared statements are involved.  The
query planner behaves very differently with and without parameters, almost
always to the detriment of query execution time when the query is
parameterized.
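
As a sketch of the contrast (toy table and stand-in names; the real query
goes in place of the SELECTs):

    CREATE TEMP TABLE t (architecture int, branchid int, payload text);

    -- With literals inlined, the planner sees the actual values:
    EXPLAIN ANALYZE
      SELECT payload FROM t WHERE architecture = 1 AND branchid = 0;

    -- With placeholders, a named prepared statement in 8.3 is planned
    -- without knowledge of the parameter values -- which is what the
    -- driver's PQexecPrepared path gets:
    PREPARE p(int, int) AS
      SELECT payload FROM t WHERE architecture = $1 AND branchid = $2;
    EXPLAIN ANALYZE EXECUTE p(1, 0);
    DEALLOCATE p;

Comparing the two plans (and their row estimates) is usually enough to tell
whether the parameterized plan is the one going off the rails.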