Thread: 8.2.4 selects make applications wait indefinitely

8.2.4 selects make applications wait indefinitely

From
"Carlos H. Reimer"
Date:
Hi all,
 
We are facing some problems after migrating our PostgreSQL 8.0 installation to version 8.2.4. The entire box runs under SUSE 10.3.
 
bd_sgp=# select version();
                                          version
--------------------------------------------------------------------------------------------
 PostgreSQL 8.2.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.2.1 (SUSE Linux)
 
The problem occurs when some SELECTs do not return any rows and the application waits indefinitely. One of the SELECTs that locks is "SELECT * FROM tb_produtos where codigo=5002;", although the query "SELECT codigo, descricao, embalagem, grupo, marca, unidade, grupo_cliente, codmarca, ativo, kg, codigo_deposito FROM tb_produtos where codigo=5002" runs fine. In summary, if you name all the table columns instead of using the *, the query runs fine; otherwise it locks.
 
I've queried pg_locks and no locks were there while the application was waiting.

pg_stat_activity reports that the SELECT was accepted by the database, because the column "query_start" is updated, although the server log in pg_log (with log_statement set to all) does not report it.
 
If the where clause is changed from "codigo=5002" to "codigo=3334" in the "SELECT *" statement, it runs fine.
 
The problem only occurs if we use remote clients; if the "SELECT * from tb_produtos where codigo=5002" is processed by a local (server) psql utility, it runs fine too. When we try to run the query from a remote client using the Windows psql, it locks. The pg_stat_activity current_query column reports "<idle>". We also tried ODBC clients and they lock too.

I've defined another table using CREATE TABLE with the LIKE option, inserted all 85 rows of tb_produtos into the new one, and tried "SELECT * FROM tb_produtostest where codigo=5002" against it. The query locks too.
 
Summary:
Client  Query                                                         Result
Local   SELECT * FROM tb_produtos where codigo=5002                   runs
Remote  SELECT * FROM tb_produtos where codigo=5002                   locks
Remote  SELECT * from tb_produtos where codigo=3334                   runs
Remote  SELECT list of all columns FROM tb_produtos where codigo=5002 runs
 
I've noticed one strange local psql behaviour when we try to see the table definition of the tb_produtos table using the \d command. The column named "codigo_deposito" is returned as "ndices:deposito". Apparently it is a psql issue, because if we query pg_attribute the column name appears correctly as "codigo_deposito".
 
I'm thinking of installing 8.2.5 to fix this issue. Am I thinking right?
 
Would appreciate any other suggestions.
 
Thank you very much in advance.

Reimer

Re: 8.2.4 selects make applications wait indefinitely

From
Tom Lane
Date:
"Carlos H. Reimer" <carlos.reimer@opendb.com.br> writes:
> ... if you name all the table columns instead of using the * the query
> runs fine, otherwise it locks.

[ blink... ]  You really, really, really need to provide a reproducible
test case to prove that claim.

> The problem only occurs if we use remote clients, if the "SELECT * from
> tb_produtos where codigo=5002" is processed by a local(server) psql utility
> it runs fine too. When we try to run the query in a remote client using the
> windows psql it locks.

That sounds like your unspecified "remote client" has got some issues,
but you've not provided any details that would let anyone else figure
it out.

            regards, tom lane

Re: 8.2.4 selects make applications wait indefinitely

From
Erik Jones
Date:
On Oct 10, 2007, at 10:09 PM, Carlos H. Reimer wrote:

> Hi all,
>
> We are facing some problems after the migration of our PostgreSQL
> 8.0 to the 8.2.4 version. The entire box runs under SUSE 10.3.
>
> bd_sgp=# select version();
>                                           version
> ----------------------------------------------------------------------
> ----------------------
>  PostgreSQL 8.2.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc
> (GCC) 4.2.1 (SUSE Linux)
>
> The problem occurs when some SELECTs does not return any row and
> the application waits indefinitely. One of the SELECTs that locks
> is the "SELECT * FROM tb_produtos where codigo=5002;" although the
> query "SELECT codigo, descricao, embalagem, grupo, marca, unidade,
> grupo_cliente, codmarca, ativo, kg, codigo_deposito FROM
> tb_produtos where codigo=5002" runs fine. In summary, if you name
> all the table columns instead of using the * the query runs fine,
> otherwise it locks.
>
> I've queried the pg_locks and no locks are there when the
> application was waiting.
>
> pg_stat_activity reports that the SELECT was accepted by the
> database because the column "query_start" is updated although the
> pg_log (log_statement(all)) does not report it.
>
> If the where clause is changed from "codigo=5002" to "codigo=3334"
> in the "SELECT *" statement, it runs fine.
>
> The problem only occurs if we use remote clients, if the "SELECT *
> from tb_produtos where codigo=5002" is processed by a local(server)
> psql utility it runs fine too. When we try to run the query in a
> remote client using the windows psql it locks. The
> pg_stat_activity's current_query column reports "<idle>". We also
> tried ODBC clients and they lock too.
>
> I've defined another table using the LIKE CREATE option and
> inserted all the 85 lines of tb_produtos into the new one and tried
> the "SELECT * FROM tb_produtostest where codigo=5002" against it.
> The query locks too.
>
> Summary:
> Local   SELECT * FROM tb_produtos where codigo=5002 Runs
> Remote  SELECT * FROM tb_produtos where codigo=5002 locks
> Remote  SELECT * from tb_produtos where codigo=3334 runs
> Remote  SELECT list of all columns
>         FROM tb_produtos where codigo=5002          runs
>
> I´ve noticed one strange local psql behaviour when we try to see
> the table definition of the tb_produtos table using the \d command.
> The column named "codigo_deposito" is returned as
> "ndices:deposito". Apparently is a psql issue because if we query
> the pg_attribute the column name appears correctly as
> "codigo_deposito".
>
> I'm thinking to install the 8.2.5 to fix this issue. Am I thinking
> right?
>
> Would appreciate any other suggestions.
>
> Thank you very much in advance.
> Reimer
Are all of these remote connections from the same machine?  Did you
upgrade your client postgres libraries on your remote machine(s) as
well?

Erik Jones




RES: 8.2.4 selects make applications wait indefinitely

From
"Carlos H. Reimer"
Date:
Thank you Tom,

> That sounds like your unspecified "remote client" has got some issues,
> but you've not provided any details that would let anyone else figure
> it out.
The client in question is the Windows psql 8.2.4 utility. This client is running
as part of a Windows PostgreSQL 8.2.4 server, from which we are making these tests
against the SUSE PostgreSQL 8.2.4 server.

But the problem was first reported by users of our Visual Basic
applications. After debugging the application we realized that all the
problems reported with the VB applications involved "select *"
statements. We suspected the ODBC driver, so we used psql from the Windows
PostgreSQL 8.2.4 server to run the same "select *" against the SUSE
PostgreSQL 8.2.4 server. When that psql locked too, we discarded the
possibility of a problem with the ODBC driver.

Reimer


RES: 8.2.4 selects make applications wait indefinitely

From
"Carlos H. Reimer"
Date:
Hi Erik,

> Are all of these remote connections from the same machine?  Did you
> upgrade your client postgres libraries on your remote machine(s) as
> well?

No, the problem happens on many machines where our Visual Basic
applications are running. After debugging the application we discovered that
the problem was always with "select *" statements. We then started some
tests to understand what was happening. These tests were done using the psql
utility from a Windows PostgreSQL 8.2.4 server against the SUSE PG 8.2.4
production server. We ran all the other tests from this Windows server
as well.

Reimer


RES: 8.2.4 selects make applications wait indefinitely

From
"Carlos H. Reimer"
Date:
Some new data about this issue:

SELECT * or naming all the columns locks the client application. Yesterday
I wrongly said that when naming all the columns instead of using the *
the applications did not lock.

I cannot reproduce the problem on other 8.2.4 servers. I have other 8.2.4
servers and I'm able to run the "SELECT * from tb_produtos where codigo=5002"
there, always using the same Windows psql client. I transferred the table to
the other 8.2.4 servers using the pg_dump/psql sequence.

If I create another database on the same server and transfer the table into
it with pg_dump/psql, the problem persists.

Even creating another cluster on the same server and restoring the table
there does not help.

This is the only 64-bit box running 8.2.4; I don't know if this has anything
to do with the problem.

Apparently the problem is not in the client, as I'm able to connect and do
the select * against other 8.2.4 servers.

I don't know what kind of tests I should do to help fix this problem.

Any suggestions?

Reimer


Re: RES: 8.2.4 selects make applications wait indefinitely

From
Alan Hodgson
Date:
On Thursday 11 October 2007, "Carlos H. Reimer"
<carlos.reimer@opendb.com.br> wrote:
> Don´t know but apparently the problem is not an issue in the client, as
> I´m able to connect and do the select * in other 8.2.4 servers.
>
> Don´t know what kind of tests I should do to help fixing this problem.
>
> Any suggestions?

It kind of sounds like a network problem, perhaps a duplex setting or driver
issue on the server, or some sort of firewall problem.
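
The duplex theory above can be checked on the server by reading the negotiated link settings. The following is a minimal sketch, assuming a Linux server with the ethtool utility installed and an interface named eth0 (both are assumptions, not details given in the thread). The helper name link_settings is hypothetical.

```shell
#!/bin/sh
# Extract the negotiated speed and duplex from `ethtool IFACE` output on stdin.
# Only the "Speed:" and "Duplex:" lines are kept; everything else is ignored.
link_settings() {
  awk -F': *' '/^[ \t]*(Speed|Duplex):/ {
    gsub(/^[ \t]+/, "", $1)   # drop the leading indentation ethtool prints
    print $1 "=" $2
  }'
}

# Usage on the server (eth0 is an assumed interface name; may need root):
#   ethtool eth0 | link_settings
```

A result of Duplex=Half on a switched network, or a setting that differs from the switch port, would be worth investigating, since a duplex mismatch tends to let small exchanges through while stalling larger transfers.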



Re: RES: 8.2.4 selects make applications wait indefinitely

From
Tom Lane
Date:
"Carlos H. Reimer" <carlos.reimer@opendb.com.br> writes:
> SELECT * or naming all the columns locks the client application. Yesterday
> I've wrongly said that when naming all the columns instead of using the *
> the applications did not lock.

Hm, are you sure it's not one specific column that's causing the
problem?

            regards, tom lane

RES: RES: 8.2.4 selects make applications wait indefinitely

From
"Carlos H. Reimer"
Date:
> "Carlos H. Reimer" <carlos.reimer@opendb.com.br> writes:
> > SELECT * or naming all the columns locks the client
> application. Yesterday
> > I´ve wrongly said that when naming all the columns instead of
> using the *
> > the applications did not lock.
>
> Hm, are you sure it's not one specific column that's causing the
> problem?
Yes, I've just double-checked. The table has 11 columns; I ran 11
SELECTs, one for each column, and all ran successfully. I then started adding
more columns, with no problem. When the full list of columns was specified in
the SELECT it locked.


Re: RES: 8.2.4 selects make applications wait indefinitely

From
"Scott Marlowe"
Date:
On 10/11/07, Carlos H. Reimer <carlos.reimer@opendb.com.br> wrote:
> > "Carlos H. Reimer" <carlos.reimer@opendb.com.br> writes:
> > > SELECT * or naming all the columns locks the client
> > application. Yesterday
> > > I´ve wrongly said that when naming all the columns instead of
> > using the *
> > > the applications did not lock.
> >
> > Hm, are you sure it's not one specific column that's causing the
> > problem?
> Yes, I´ve just doublechecked again. The table has 11 columns, I used 11
> SELECTs, one for each column, and all run successfully. Started adding more
> columns, no problem. When the full list of columns was specified in the
> SELECT it locked.

If you turn on stats_command_string, what do you see in

select * from pg_stat_activity;

for the current_query?

RES: RES: 8.2.4 selects make applications wait indefinitely

From
"Carlos H. Reimer"
Date:
> On 10/11/07, Carlos H. Reimer <carlos.reimer@opendb.com.br> wrote:
> > > "Carlos H. Reimer" <carlos.reimer@opendb.com.br> writes:
> > > > SELECT * or naming all the columns locks the client
> > > application. Yesterday
> > > > I´ve wrongly said that when naming all the columns instead of
> > > using the *
> > > > the applications did not lock.
> > >
> > > Hm, are you sure it's not one specific column that's causing the
> > > problem?
> > Yes, I´ve just doublechecked again. The table has 11 columns, I used 11
> > SELECTs, one for each column, and all run successfully. Started
> adding more
> > columns, no problem. When the full list of columns was specified in the
> > SELECT it locked.
>
> If you turn on stats_command_string do you see in
>
> select * from pg_stat_activity;
>
> for the current_query ???

It's "<IDLE>" but the "query_start" column is refreshed.

Reimer


Re: RES: 8.2.4 selects make applications wait indefinitely

From
"Scott Marlowe"
Date:
On 10/11/07, Carlos H. Reimer <carlos.reimer@opendb.com.br> wrote:
> > On 10/11/07, Carlos H. Reimer <carlos.reimer@opendb.com.br> wrote:
> > > > "Carlos H. Reimer" <carlos.reimer@opendb.com.br> writes:
> > > > > SELECT * or naming all the columns locks the client
> > > > application. Yesterday
> > > > > I´ve wrongly said that when naming all the columns instead of
> > > > using the *
> > > > > the applications did not lock.
> > > >
> > > > Hm, are you sure it's not one specific column that's causing the
> > > > problem?
> > > Yes, I´ve just doublechecked again. The table has 11 columns, I used 11
> > > SELECTs, one for each column, and all run successfully. Started
> > adding more
> > > columns, no problem. When the full list of columns was specified in the
> > > SELECT it locked.
> >
> > If you turn on stats_command_string do you see in
> >
> > select * from pg_stat_activity;
> >
> > for the current_query ???
>
> It´s "<IDLE>" but the "query_start" column is refreshed.

Then the query runs and finishes and the problem is something to do
with the delivery of the data.  Not sure after that...

Re: RES: 8.2.4 selects make applications wait indefinitely

From
Tom Lane
Date:
"Scott Marlowe" <scott.marlowe@gmail.com> writes:
> On 10/11/07, Carlos H. Reimer <carlos.reimer@opendb.com.br> wrote:
>> It's "<IDLE>" but the "query_start" column is refreshed.

> Then the query runs and finishes and the problem is something to do
> with the delivery of the data.  Not sure after that...

That seems to eliminate the possibility that the problem is on the
server side.  I'd suggest trying to get a stack trace from the client
to figure out what it's doing.

BTW, have you looked into the theory that it's triggered by total
data volume rather than number of columns?  That is, try selecting
all the columns but use LIMIT to reduce the number of rows fetched?

            regards, tom lane

RES: RES: 8.2.4 selects make applications wait indefinitely

From
"Carlos H. Reimer"
Date:
> "Scott Marlowe" <scott.marlowe@gmail.com> writes:
> > On 10/11/07, Carlos H. Reimer <carlos.reimer@opendb.com.br> wrote:
> >> It's "<IDLE>" but the "query_start" column is refreshed.
>
> > Then the query runs and finishes and the problem is something to do
> > with the delivery of the data.  Not sure after that...
>
> That seems to eliminate the possibility that the problem is on the
> server side.  I'd suggest trying to get a stack trace from the client
> to figure out what it's doing.
>
> BTW, have you looked into the theory that it's triggered by total
> data volume rather than number of columns?  That is, try selecting
> all the columns but use LIMIT to reduce the number of rows fetched?
The where clause already limits the number of rows returned to one. Only some
primary keys are affected. For example, "select * from table where pk=1"
works and returns one row, but "select * from table where pk=2" locks,
and there is only one row with pk=2 in the table. I believe it is triggered
by something else.

Reimer


Re: RES: RES: 8.2.4 selects make applications wait indefinitely

From
Tom Lane
Date:
"Carlos H. Reimer" <carlos.reimer@opendb.com.br> writes:
>> BTW, have you looked into the theory that it's triggered by total
>> data volume rather than number of columns?  That is, try selecting
>> all the columns but use LIMIT to reduce the number of rows fetched?

> The where clause limits the number of rows returned to 1 only. Only some
> primary keys are affected. For example, "Select * from table where pk=1"
> works and returns only one line but "select * from table where pk=2" locks
> and there is only one line with pk=2 in the table. I believe it is triggered
> by something else.

Hmm ... are some of the rows particularly wide?  It could still be a
data-volume effect ...

            regards, tom lane
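
One way to test the row-width question above is to compare the stored size of a row that locks with one that returns. This is a minimal sketch, assuming the table and key values quoted earlier in the thread and the database name bd_sgp taken from the psql prompt in the first message. It relies on pg_column_size(), which exists in PostgreSQL 8.1 and later; passing the whole row as t.* is assumed to work on 8.2. The helper name row_widths and the PSQL variable are hypothetical.

```shell
#!/bin/sh
# PSQL can be overridden to point at a specific psql binary.
PSQL=${PSQL:-psql}

# Print the stored size in bytes of each listed row of tb_produtos.
# Usage: row_widths DBNAME KEY [KEY...]
row_widths() {
  db=$1
  shift
  keys=$(echo "$*" | sed 's/ /, /g')   # "3334 5002" -> "3334, 5002"
  "$PSQL" -d "$db" -c "SELECT codigo, pg_column_size(t.*) AS row_bytes
FROM tb_produtos t WHERE codigo IN ($keys) ORDER BY codigo;"
}

# Example, run locally on the server, where psql is known to return the row:
#   row_widths bd_sgp 3334 5002
```

If the row for codigo=5002 turns out to be noticeably wider than the one for 3334, that would support the data-volume theory even though each query returns a single row.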

Re: RES: 8.2.4 selects make applications wait indefinitely

From
Gregory Stark
Date:
"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> BTW, have you looked into the theory that it's triggered by total
> data volume rather than number of columns?  That is, try selecting
> all the columns but use LIMIT to reduce the number of rows fetched?

Or conversely select the first or second half of the columns but put two
copies of them in the column list.

Are the client and server on two separate machines? Is it possible you have a
network issue between these two machines (like pmtud problems, for example)?

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
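
The path-MTU theory above can be probed from the server with pings that are not allowed to fragment. This is a minimal sketch, assuming Linux iputils ping, where -M do sets the don't-fragment bit and -s sets the ICMP payload size. CLIENT_HOST is a placeholder and the function names are hypothetical. The payload that fills a given MTU is the MTU minus 28 bytes (20 for the IPv4 header, 8 for the ICMP header).

```shell
#!/bin/sh
# ICMP payload size that exactly fills the given MTU:
# subtract the 20-byte IPv4 header and the 8-byte ICMP header.
icmp_payload() {
  echo $(( $1 - 28 ))
}

# Send three unfragmentable pings sized for the given MTU.
# Usage: probe_mtu HOST MTU
probe_mtu() {
  host=$1
  mtu=$2
  ping -c 3 -M do -s "$(icmp_payload "$mtu")" "$host"
}

# Usage (CLIENT_HOST stands for the address of the Windows client):
#   probe_mtu CLIENT_HOST 1500
#   probe_mtu CLIENT_HOST 1400
```

If the smaller probe gets replies while the full-size one gets neither a reply nor an error, something on the path is silently dropping large packets, which would fit a short result arriving while a wider row never does. From the Windows side the comparable test is ping -f -l 1472 against the server.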

Re: RES: 8.2.4 selects make applications wait indefinitely

From
Tomasz Ostrowski
Date:
On Thu, 11 Oct 2007, Carlos H. Reimer wrote:

> the problem happens with many machines where our Visual Basic
> applications is running. After debuging the application we discovered that
> the problem was always with "select *" statements.

I'd try locally:

$ psql -c 'select * from table where pk=1' [dbname] [username]

then locally:

$ psql -h localhost -c 'select * from table where pk=1' [dbname] [username]

then locally:

$ psql -h [IP_eth0] -c 'select * from table where pk=1' [dbname] [username]

then I'd use a computer on the local network with a clean operating system,
like a Fedora Live CD, because this can be caused by, for example, some
kind of intrusion detection system that accidentally triggers on the
data in this row.

Regards
Tometzky
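
The sequence above can be scripted so that each step either completes or is cut off after a timeout. This is a minimal sketch, assuming GNU coreutils timeout is available on the machine running it. The database name and query are taken from earlier in the thread, the username argument is omitted, and SERVER_IP is a placeholder address, not a value from the thread. The names run_step and PSQL are hypothetical.

```shell
#!/bin/sh
# Run the same query over progressively "more remote" paths and report
# which step is the first to fail or hang.
PSQL=${PSQL:-psql}
DB=bd_sgp
QUERY='SELECT * FROM tb_produtos WHERE codigo = 5002'
SERVER_IP=192.0.2.10   # placeholder for the address of the server's eth0

# Usage: run_step LABEL [extra psql options...]
run_step() {
  label=$1
  shift
  # timeout kills psql after 10 seconds, so a hang shows up as a failure
  if timeout 10 "$PSQL" "$@" -c "$QUERY" "$DB" >/dev/null 2>&1; then
    echo "$label: ok"
  else
    echo "$label: FAILED or hung"
  fi
}

run_step "unix socket"
run_step "localhost" -h localhost
run_step "eth0 address" -h "$SERVER_IP"
```

The first step that reports a failure narrows down where the data stops being delivered: the loopback path, the server's own network interface, or something between the server and the remote client.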