Thread: Firebird 1.5.3 X Postgresql 8.1.3 (linux and windows)

Firebird 1.5.3 X Postgresql 8.1.3 (linux and windows)

From
"andremachado"
Date:
Hello,
continuing the saga,
http://archives.postgresql.org/pgsql-performance/2006-04/msg00558.php ,
my coleague created a test database with fake data (see below).

The above archived message contains the the timings of firebird and postgresql.
The weird problem are the 2 queries that firebird executes in less than 2
seconds and postgresql took almost half hour to complete at 100% cpu.


you could download the test database at the address below. It is a 128 kpbs
adsl connection.
74 MB
http://www.eicomm.no-ip.com/download/BackDNF_Cript.zip


Many thanks.
Andre Felipe Machado


Re: Firebird 1.5.3 X Postgresql 8.1.3 (linux and windows)

From
"Jim C. Nasby"
Date:
On Fri, May 12, 2006 at 12:48:52PM -0200, andremachado wrote:
> Hello,
> continuing the saga,
> http://archives.postgresql.org/pgsql-performance/2006-04/msg00558.php ,
> my coleague created a test database with fake data (see below).
>
> The above archived message contains the the timings of firebird and postgresql.
> The weird problem are the 2 queries that firebird executes in less than 2
> seconds and postgresql took almost half hour to complete at 100% cpu.

How about posting EXPLAIN ANALYZE for those two queries, as well as the
queries themselves?
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Firebird 1.5.3 X Postgresql 8.1.3 (linux and windows)

From
"Joshua D. Drake"
Date:
Jim C. Nasby wrote:
> On Fri, May 12, 2006 at 12:48:52PM -0200, andremachado wrote:
>> Hello,
>> continuing the saga,
>> http://archives.postgresql.org/pgsql-performance/2006-04/msg00558.php ,
>> my coleague created a test database with fake data (see below).
>>
>> The above archived message contains the the timings of firebird and postgresql.
>> The weird problem are the 2 queries that firebird executes in less than 2
>> seconds and postgresql took almost half hour to complete at 100% cpu.
>
> How about posting EXPLAIN ANALYZE for those two queries, as well as the
> queries themselves?

I have this database downloaded if anyone wants a copy off a faster link.

--

            === The PostgreSQL Company: Command Prompt, Inc. ===
      Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
      Providing the most comprehensive  PostgreSQL solutions since 1997
                     http://www.commandprompt.com/



Re: Firebird 1.5.3 X Postgresql 8.1.3 (linux and windows)

From
Tom Lane
Date:
"andremachado" <andremachado@techforce.com.br> writes:
> continuing the saga,
> http://archives.postgresql.org/pgsql-performance/2006-04/msg00558.php ,
> my coleague created a test database with fake data (see below).

Thanks.  I played around with this a bit, and got results like these:
original query, 8.1 branch from a couple weeks back: 945 sec
original query, 8.1 branch tip: 184 sec
modified query, 8.1 branch tip: 15 sec

The first differential is because of this patch:
http://archives.postgresql.org/pgsql-committers/2006-04/msg00355.php
viz
    Remove the restriction originally coded into
    optimize_minmax_aggregates() that MIN/MAX not be converted to
    use an index if the query WHERE clause contains any volatile
    functions or subplans.

Allowing the max(DEC2.AM_REFERENCIA) subquery to be converted to an
indexscan makes for about a 5X reduction in the number of times the
EXISTS sub-subquery is executed.  But the real problem is that Postgres
isn't excessively smart about EXISTS subqueries.  I manually changed it
into an IN to get the 15-second runtime: instead of

      (select max(DEC2.AM_REFERENCIA) from DECLARACAO DEC2
        where DEC2.IN_FOI_RETIFICADA=0 and
      exists (select CAD3.ID_CADASTRO from CADASTRO CAD3 where
          CAD3.ID_DECLARACAO=DEC2.ID_DECLARACAO and
          CAD3.ID_EMPRESA=CADASTRO.ID_EMPRESA ) )

write

      (select max(DEC2.AM_REFERENCIA) from DECLARACAO DEC2
        where DEC2.IN_FOI_RETIFICADA=0 and DEC2.ID_DECLARACAO in
             (select CAD3.ID_DECLARACAO from CADASTRO CAD3 where
          CAD3.ID_EMPRESA=CADASTRO.ID_EMPRESA ) )

I'm not clear on how Firebird is managing to do this query in under
a second --- I can believe that they know how to do EXISTS as a join
but it still seems like the subqueries need to be done many thousand
times.  I thought maybe they were caching the results of the overall
subquery for specific values of CADASTRO.ID_EMPRESA, but now that I
see your test data, there are several thousand distinct values of
that, so there's not a lot of traction to be gained that way.

            regards, tom lane

Firebird 1.5.3 X Postgresql 8.1.3 (linux and windows)

From
André Felipe Machado
Date:
Hello, Jim
I did not want to clutter mailboxes of those who are not interested at
this weird problem.
So, i pointed to the archived message that contains 2 tar.gz files
(around 50 KB) with the two sets of queries (firebird and postgresql
respectively), its results, explain analyze, pg configurations, firebird
plan outputs, etc.
Please, open the cited link and scroll to the end of message. You will
find the 2 tar.gz files.
http://archives.postgresql.org/pgsql-performance/2006-04/msg00558.php
If you have some difficulty, I could send a private email containing the
2 files in order to not send the big email to the all list again.
Many thanks.
Andre Felipe Machado


Em Sex, 2006-05-12 às 14:14 -0500, Jim C. Nasby escreveu:
> On Fri, May 12, 2006 at 12:48:52PM -0200, andremachado wrote:
> > Hello,
> > continuing the saga,
> > http://archives.postgresql.org/pgsql-performance/2006-04/msg00558.php ,
> > my coleague created a test database with fake data (see below).
> >
> > The above archived message contains the the timings of firebird and postgresql.
> > The weird problem are the 2 queries that firebird executes in less than 2
> > seconds and postgresql took almost half hour to complete at 100% cpu.
>
> How about posting EXPLAIN ANALYZE for those two queries, as well as the
> queries themselves?


Re: Firebird 1.5.3 X Postgresql 8.1.3 (linux and windows)

From
Tom Lane
Date:
"andremachado" <andremachado@techforce.com.br> writes:
> continuing the saga,
> http://archives.postgresql.org/pgsql-performance/2006-04/msg00558.php ,
> my coleague created a test database with fake data (see below).

I tried to use this data to replicate your results, and could not.
I grabbed a copy of what I think is the latest Firebird release,
firebird-1.5.3.4870, built it on a Fedora Core 4 machine (32-bit,
couldn't get it to build cleanly on my newer 64-bit machine :-()
and compared to Postgres 8.1 branch tip on the same machine.
On the interesting sub-sub-EXISTS query, I see these results:

Firebird:
SQL> set stats on;
SQL> set plan on;
SQL>  update CADASTRO set IN_CADASTRO_MAIS_ATUAL = case when CADASTRO.ID_CADASTRO=  (select max(CAD2.ID_CADASTRO) from
CADASTROCAD2 inner join DECLARACAO DECL on (DECL.ID_DECLARACAO=CAD2.ID_DECLARACAO) where
CAD2.ID_EMPRESA=CADASTRO.ID_EMPRESA and DECL.AM_REFERENCIA =  (select max(DEC2.AM_REFERENCIA) from DECLARACAO DEC2
whereDEC2.IN_FOI_RETIFICADA=0 and exists (select CAD3.ID_CADASTRO from CADASTRO CAD3  where
CAD3.ID_DECLARACAO=DEC2.ID_DECLARACAOand CAD3.ID_EMPRESA=CADASTRO.ID_EMPRESA )  )and DECL.IN_FOI_RETIFICADA=0 )then 1
else0 end ; 

PLAN (CAD3 INDEX (RDB$FOREIGN1))
PLAN (DEC2 NATURAL)
PLAN JOIN (DECL INDEX (IDX_DT_REFERENCIA),CAD2 INDEX (RDB$FOREIGN1))
PLAN (CADASTRO NATURAL)
Current memory = 786704
Delta memory = 309056
Max memory = 786704
Elapsed time= 344.19 sec
Cpu = 0.03 sec
Buffers = 75
Reads = 2081702
Writes = 16173
Fetches = 21713743

The cpu = 0.03 sec bit is bogus; in reality the CPU is maxed out
and the isql process accumulates very nearly 344 seconds runtime.

Postgres:
bc=# \timing
Timing is on.
bc=# update CADASTRO set IN_CADASTRO_MAIS_ATUAL = case when CADASTRO.ID_CADASTRO=  (select max(CAD2.ID_CADASTRO) from
CADASTROCAD2 inner join DECLARACAO DECL on (DECL.ID_DECLARACAO=CAD2.ID_DECLARACAO) where
CAD2.ID_EMPRESA=CADASTRO.ID_EMPRESA and DECL.AM_REFERENCIA =  (select max(DEC2.AM_REFERENCIA) from DECLARACAO DEC2
whereDEC2.IN_FOI_RETIFICADA=0 and exists (select CAD3.ID_CADASTRO from CADASTRO CAD3  where
CAD3.ID_DECLARACAO=DEC2.ID_DECLARACAOand CAD3.ID_EMPRESA=CADASTRO.ID_EMPRESA )  )and DECL.IN_FOI_RETIFICADA=0 )then 1
else0 end ; 
UPDATE 15490
Time: 420350.628 ms

Now I know nothing about Firebird and it's quite possible that I missed
some essential tuning step, but I'm sure not in the same ballpark as
your report of 0.72 sec to run this query.

            regards, tom lane