Re: Receiving many more rows than expected - Mailing list pgsql-general

From Adrian Klaver
Subject Re: Receiving many more rows than expected
Date
Msg-id 536C19A9.3010204@aklaver.com
Whole thread Raw
In response to Re: Receiving many more rows than expected  (Vincent de Phily <vincent.dephily@mobile-devices.fr>)
Responses Re: Receiving many more rows than expected  (Vincent de Phily <vincent.dephily@mobile-devices.fr>)
List pgsql-general
On 05/08/2014 03:11 PM, Vincent de Phily wrote:
> On Thursday 08 May 2014 06:30:39 Adrian Klaver wrote:
>> On 05/08/2014 04:09 AM, Vincent de Phily wrote:
>>> Hello,
>>>


>
> DO_LOOP is tested in a few places where we can make a clean exit. A cronjob
> will restart the process if it is not or badly running.
>
>>>           curs.execute(query)
>>>           results = curs.fetchall()
>>>           rlen = len(results)
>>>
>>>           if rlen > 0:
>>>               LOG.debug("Fetched %d rows", rlen)
>>
>>              # What do you see in LOG for rlen values?
>
> The histogram shows a large amount of small values, progressively becoming
> rarer for bigger values, up to value 5000 which is very frequent again
> (depending on the day, between 0.5 and 5% of queries return the maximum number
> of rows), then a handfull freak values of ~9k, 11k, 15k, 33k, 46k, 107k, etc
> which cause my problems.

So just to be clear this the value for rlen and that value and log
message only appear in the loop above?

As Sim suggested, it might be worth it to add a unique id to the above
message to prove or not that the values are coming from where you think
they are.

>
>>>           if rlen == 0:
>>>               # [...] wait for notification...
>>>               continue
>>>
>>>           # [...] Enqueue batch and let other threads process it.
>>>           # [...] Those threads will eventually delete the processed rows
>>>           from
>>>           #       the foo table.
>>>
>>> The problem is that sometimes (once every few days at about 2-300K queries
>>> per day) I get many more rows than the max 5000 I asked for (I've seen up
>>> to 25k). And I'm getting timeouts and other problems as a result.


And there is no instance of the UPDATE query that is unconstrained and a
code path to that query?

Or as Sim suggested another copy of this code without the LIMIT?

Also what happens in the rlen == 0 case?

>>> Any idea ? Thanks.
>


--
Adrian Klaver
adrian.klaver@aklaver.com


pgsql-general by date:

Previous
From: Vincent de Phily
Date:
Subject: Re: Receiving many more rows than expected
Next
From: Tatsuo Ishii
Date:
Subject: Re: Pgpool starting problem