Re: Query hitting empty tables taking 48 minutes - Mailing list pgsql-general

From Rob Sargent
Subject Re: Query hitting empty tables taking 48 minutes
Date
Msg-id 3f0ce0f4-1103-bce8-7398-c77c7ef8c739@gmail.com
Whole thread Raw
In response to Re: Query hitting empty tables taking 48 minutes  (Robert Creager <robert@logicalchaos.org>)
Responses Re: Query hitting empty tables taking 48 minutes  (Adrian Klaver <adrian.klaver@aklaver.com>)
Re: Query hitting empty tables taking 48 minutes  (Steve Atkins <steve@blighty.com>)
List pgsql-general



On 06/07/2018 12:55 PM, Robert Creager wrote:


On Jun 7, 2018, at 12:40 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 06/07/2018 11:17 AM, Robert Creager wrote:
I have a system running FreeBSD 11.1-STABLE, PostgreSQL 9.6.8,Java OpenJDK 1.8.0_131, jdbc 9.3-1104-jdbc41 which is exhibiting very bizarre behavior.  A query is executing against a couple of tables that have 1 and 0 records in them.  ds3.job_entry has 0 rows, ds3.blob has 1 row.   If I execute the query manually via command line, it executes fine.   There are no other active queries, no locks.  The system is idle, in our Dev Test group, so I can leave it this way for a bit of time.  The general software setup is in production and I’ve seen nothing like this before.  Even a system with 300M ds3.blob entries executes this query fine.

So I am assuming the problem query is being run through Java/jdbc, correct?

Yes.


There is also the below in the log:

" ... execute fetch from S_2037436/C_2037437 …"

So, that means nothing to me.  Something to you?


My guess is that we will need to see the Java code that sets up and runs the query. Is that possible?

OMG, you had to ask ;-)  Not really, for two reasons.  It’s an actual shipping product, and I’d have to send you so much code to get from the source of the query down through the execute…. Now, the execute, on it’s own, is below.  m_statement is a java.sql.PreparedStatement.  Keep in mind this code is literally executing millions of times a day on a busy system.  Except for this special snowflake system...

private Executor executeQuery()
{
    final MonitoredWork monitoredWork = new MonitoredWork( 
            StackTraceLogging.NONE, m_readableSql.getLogMessagePreExecution() );
    try
    {
        m_closeTransactionUponClose = ( null == m_transactionNumber );
        m_statement.setFetchSize( 10000 );
        final Duration duration = new Duration();
        m_resultSet = m_statement.executeQuery();
        m_readableSql.log( duration, null );
        return this;
    }
    catch ( final SQLException ex )
    {
        throw new DaoException( 
                "Failed to execute: " + m_readableSql.getLogMessagePreExecution(),
                ex );
    }
    finally
    {
        monitoredWork.completed();
    }
}


Jun  7 17:24:21 blackpearl postgres[10670]: [7737-1] db=tapesystem,user=Administrator,app=[unknown],client=127.0.0.1 LOG:  duration: *2903612.206* ms  execute fetch from S_2037436/C_2037437: SELECT * FROM ds3.blob WHERE EXISTS (SELECT * FROM ds3.job_entry WHERE blob_id = ds3.blob.id <http://ds3.blob.id> AND (job_id = $1))
tapesystem=# explain analyze SELECT * FROM ds3.blob WHERE EXISTS (SELECT * FROM ds3.job_entry WHERE blob_id = ds3.blob.id <http://ds3.blob.id> AND (job_id = 'b51357cd-e07a-4c87-a50b-999c347a5c71'));
                                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------   Nested Loop  (cost=0.84..6.89 rows=1 width=77) (actual time=0.044..0.044 rows=0 loops=1)
   ->  Index Scan using job_entry_job_id_idx on job_entry  (cost=0.42..2.44 rows=1 width=16) (actual time=0.042..0.042 rows=0 loops=1)
         Index Cond: (job_id = 'b51357cd-e07a-4c87-a50b-999c347a5c71'::uuid)
   ->  Index Scan using blob_pkey on blob  (cost=0.42..4.44 rows=1 width=77) (never executed)
         Index Cond: (id = job_entry.blob_id)
 Planning time: 0.388 ms
 Execution time: 0.118 ms
pid client_port runtime query_start datname state query usename
10670 11211 0 years 0 mons 0 days 0 hours 43 mins 28.852273 secs 2018-06-07 17:24:22.026384 tapesystem active SELECT * FROM ds3.blob WHERE EXISTS (SELECT * FROM ds3.job_entry WHERE blob_id = ds3.blob.id <http://ds3.blob.id> AND (job_id = $1)) Administrator
From the system with 300M ds3.blob entries, which works fine (along with every other system in house):
QUERY PLAN
Nested Loop  (cost=0.57..738052.90 rows=164379 width=75) (actual time=1.001..1947.029 rows=50000 loops=1)
  ->  Seq Scan on job_entry  (cost=0.00..10039.50 rows=164379 width=16) (actual time=0.871..56.442 rows=50000 loops=1)
        Filter: (job_id = 'ef27d2fa-2727-424e-8f44-da9e33a5ca05'::uuid)
        Rows Removed by Filter: 60001
  ->  Index Scan using blob_pkey on blob  (cost=0.57..4.42 rows=1 width=75) (actual time=0.037..0.037 rows=1 loops=50000)
        Index Cond: (id = job_entry.blob_id)
Planning time: 6.330 ms
Execution time: 1951.564 ms
Please keep my CC of my work e-mail present.
Best,
Robert


--
Adrian Klaver
adrian.klaver@aklaver.com

What's the url doing in "blob_id = ds3.blob.id <http://ds3.blob.id"?

pgsql-general by date:

Previous
From: Robert Creager
Date:
Subject: Re: Query hitting empty tables taking 48 minutes
Next
From: Adrian Klaver
Date:
Subject: Re: Query hitting empty tables taking 48 minutes