Thread: The Two Towers

The Two Towers

From
Mladen Gogala
Date:
Battleground was a machine running CentOS 5.5 and the two towers were 2
databases. First, PostgreSQL 8.4.4:

scott=#  select count(*) from emp
e1,emp e2, emp e3, emp e4,emp e5, emp e6;
    count 
  ---------
   7529536
  (1 row)

  Time: 2127.345 ms
  scott=# select version();

                                                   

version                    


                               

  --------------------------------------------------------------------------------
  --------------------------------
   PostgreSQL 8.4.4 on i686-redhat-linux-gnu,
compiled by GCC gcc (GCC) 4.1.2 2008
  0704 (Red Hat 4.1.2-46), 32-bit
  (1 row)

  Time: 1.008 ms
  scott=# explain analyze
  scott-# select count(*) from emp e1,emp e2,
emp e3, emp e4,emp e5, emp e6;

                                                              
QUERY PLAN      

                                                        

  --------------------------------------------------------------------------------
  ---------------------------------------------------------
   Aggregate 
(cost=1052615959933232.62..1052615959933232.62 rows=1 width=0) (actu
  al time=13684.811..13684.811 rows=1 loops=1)
     ->  Nested Loop 
(cost=16.60..935975959933232.62 rows=46656000000000000 width
  =0) (actual time=0.038..10508.108
rows=7529536 loops=1)
           ->  Nested Loop 
(cost=0.00..2855959933216.00 rows=77760000000000 width
  =0) (actual time=0.025..820.829 rows=537824
loops=1)
                 ->  Nested Loop 
(cost=0.00..4759933216.00 rows=129600000000 widt
  h=0) (actual time=0.021..59.701 rows=38416
loops=1)

                    
-> Nested Loop  
(cost=0.00..7933216.00 rows=216000000 widt
  h=0) (actual time=0.018..4.424 rows=2744
loops=1)

                          
-> Nested 
Loop  (cost=0.00..13216.00 rows=360000 wid
  th=0) (actual time=0.015..0.333 rows=196
loops=1)

                                
-> Seq 
Scan on emp e1  (cost=0.00..16.00 rows=
  600 width=0) (actual time=0.009..0.018
rows=14 loops=1)

                                
-> Seq 
Scan on emp e6  (cost=0.00..16.00 rows=
  600 width=0) (actual time=0.002..0.010
rows=14 loops=14)

                          
-> Seq Scan 
on emp e5  (cost=0.00..16.00 rows=600 wi
  dth=0) (actual time=0.002..0.008 rows=14
loops=196)

                    
-> Seq Scan on emp 
e4  (cost=0.00..16.00 rows=600 width=0)
   (actual time=0.001..0.007 rows=14
loops=2744)
                 ->  Seq Scan on emp
e3 
(cost=0.00..16.00 rows=600 width=0) (actu
  al time=0.001..0.007 rows=14 loops=38416)
           ->  Materialize 
(cost=16.60..22.60 rows=600 width=0) (actual time=0.00
  0..0.006 rows=14 loops=537824)
                 ->  Seq Scan on emp
e2 
(cost=0.00..16.00 rows=600 width=0) (actu
  al time=0.003..0.012 rows=14 loops=1)
   Total runtime: 13684.977 ms
  (14 rows)
  Time: 13714.576 ms
  scott=# \d+ emp

                            
Table
"public.emp"
    Column  |           
Type            | 
Modifiers | Storage  | Description
  ----------+-----------------------------+-----------+----------+-------------
   empno    |
smallint                   
|
not null  | plain    |
   ename    | character varying(10)       |
not null  | extended |
   job      | character varying(9)       
|           | extended |
   mgr      |
smallint                   
|           | plain    |
   hiredate | timestamp without time zone
|           | plain    |
   sal      | double
precision           
|           | plain    |
   comm     | double
precision           
|           | plain    |
   deptno   |
smallint                   
|           | plain    |
  Indexes:
      "emp_pkey" PRIMARY KEY, btree (empno)
      "emp_deptno_i" btree (deptno)
      "emp_ename_id" btree (ename)
      "ind_emp_deptno" btree (deptno)
  Foreign-key constraints:
      "fk_deptno" FOREIGN KEY (deptno)
REFERENCES dept(deptno)
  Has OIDs: no


Now, Oracle 11.2.0.1 with the April patch:
SQL> set autotrace on
  SQL> select count(*) from emp e1,emp e2,
emp e3, emp e4,emp e5, emp e6;

    COUNT(*)
  ----------
     7529536

  Elapsed: 00:00:00.85

  Execution Plan
  ----------------------------------------------------------
  Plan hash value: 1049923164

  ------------------------------------------------------------------------------
  | Id  |
Operation                  | 
Name   | Rows  | Cost (%CPU)| Time     |
  ------------------------------------------------------------------------------
  |   0 | SELECT STATEMENT           
|        |     1 |   157K  (1)| 00:31:25 |
  |   1 |  SORT AGGREGATE            
|        |     1
|           
|         | 
  |   2 |   MERGE JOIN CARTESIAN     
|        |  7529K|   157K  (1)| 00:31:25 |
  |   3 |    MERGE JOIN CARTESIAN    
|        |   537K| 11224   (1)| 00:02:15 |
  |   4 |     MERGE JOIN CARTESIAN   
|        | 38416 |   808   (1)| 00:00:10 |
  |   5 |      MERGE JOIN CARTESIAN  
|        |  2744 |    62   (0)| 00:00:01 |
  |   6 |       MERGE JOIN CARTESIAN 
|        |   196 |     7   (0)| 00:00:01 |
  |   7 |        INDEX FULL SCAN      |
PK_EMP |    14 |     1   (0)| 00:00:01 |
  |   8 |        BUFFER
SORT         
|        |    14 |     6   (0)|
00:00:01| 
  |   9 |         INDEX FAST FULL SCAN|
PK_EMP |    14 |     0   (0)| 00:00:01 |
  |  10 |       BUFFER
SORT          
|        |    14 |    62   (0)| 00:00:01 |
  |  11 |        INDEX FAST FULL SCAN |
PK_EMP |    14 |     0   (0)| 00:00:01 |
  |  12 |      BUFFER
SORT           
|        |    14 |   808   (1)| 00:00:10 |
  |  13 |       INDEX FAST FULL SCAN  |
PK_EMP |    14 |     0   (0)| 00:00:01 |
  |  14 |     BUFFER
SORT            
|        |    14 | 11224   (1)| 00:02:15 |
  |  15 |      INDEX FAST FULL SCAN   |
PK_EMP |    14 |     0   (0)| 00:00:01 |
  |  16 |    BUFFER
SORT             
|        |    14 |   157K  (1)| 00:31:25 |
  |  17 |     INDEX FAST FULL SCAN    |
PK_EMP |    14 |     0   (0)| 00:00:01 |
  ------------------------------------------------------------------------------


  Statistics
  ----------------------------------------------------------
            0  recursive calls
            0  db block gets
           21  consistent gets
            0  physical reads
            0  redo size
          425  bytes sent via SQL*Net to
client
          419  bytes received via SQL*Net
from client
            2  SQL*Net roundtrips to/from
client
            5  sorts (memory)
            0  sorts (disk)
            1  rows processed

  SQL>

  SQL> select * from v$version;

  BANNER
  --------------------------------------------------------------------------------
  Oracle Database 11g Enterprise Edition
Release 11.2.0.1.0 - Production
  PL/SQL Release 11.2.0.1.0 - Production
  CORE    11.2.0.1.0      Production
  TNS for Linux: Version 11.2.0.1.0 -
Production
  NLSRTL Version 11.2.0.1.0 - Production


Both databases have current statistics, the data is exactly the same,
as shown by the results. Oracle finished in 0.85 second (850
milliseconds) while PostgreSQL took 2127.345 ms, 2.5 times slower than
Oracle. The difference that is obvious is the access path: Postgres
chose nested loops, while Oracle chose merge join and utilized the
primary key for the table. Postgres, apparently cannot do "fast full
scan" of an index and doesn't know how to utilize the primary key in
cases like this. However, my gripe is that the optimizer should have
selected merge join, just like Oracle did. In this case, nested loops
are definitely the wrong choice. Rule based optimizers, the kind of the
optimizer that takes into consideration only the structure of the
table, usually ends up being dominated by the nested loops method.
Nested loops method usually dominates the OLTP type applications but
can really mess up large reports. I am under the impression that
Postgres query planner is geared toward the OLTP type of the database.
Maybe a new parameter is needed that would somehow shift gears to "data
warehouse use", on demand? I have to say, the advantage is still on the
side of Sauron.


--
Mladen Gogala
Sr. Oracle DBA
1500 Broadway
New York, NY 10036
(212) 329-5251
www.vmsinfo.com


--
Mladen Gogala
Sr. Oracle DBA
1500 Broadway
New York, NY 10036
(212) 329-5251
www.vmsinfo.com

Re: The Two Towers

From
Donn Washburn
Date:
On 06/07/2010 09:41 PM, Mladen Gogala wrote:
> Battleground was a machine running CentOS 5.5 and the two towers were
> 2 databases. First, PostgreSQL 8.4.4:
>
By the way if anyone is using openSuSE's version of 8.4.4 in M7.
I finally got it to work by editing /etc/init.d/postgresql and
/etc/sysconfig/postgresql to correct incorrect /data directory.
And finally I needed to completely replace 8.4.4 from source.
It is now working correctly.  It seems that the binary is broken also.

--
73 de Donn Washburn
307 Savoy Street     Email:" n5xwb@comcast.net "
Sugar Land, TX 77478 LL# 1.281.242.3256
Ham Callsign N5XWB   HAMs : " n5xwb@arrl.net "
VoIP via Gizmo: bmw_87kbike / via Skype: n5xwbg
BMW MOA #: 4146 - Ambassador
       " http://counter.li.org " #279316


Re: The Two Towers

From
Mladen Gogala
Date:
Donn Washburn wrote:
> By the way if anyone is using openSuSE's version of 8.4.4 in M7.
> I finally got it to work by editing /etc/init.d/postgresql and
> /etc/sysconfig/postgresql to correct incorrect /data directory.
> And finally I needed to completely replace 8.4.4 from source.
> It is now working correctly.  It seems that the binary is broken also.
>
>
This has nothing to do with my thread. Please, open a separate thread
the next time.  This is really annoying.

--
Mladen Gogala
Sr. Oracle DBA
1500 Broadway
New York, NY 10036
(212) 329-5251
www.vmsinfo.com


Re: The Two Towers

From
Tom Lane
Date:
Mladen Gogala <mladen.gogala@vmsinfo.com> writes:
> This has nothing to do with my thread. Please, open a separate thread
> the next time.  This is really annoying.

Not *nearly* as annoying as HTML-only email (ahem).

            regards, tom lane

Re: The Two Towers

From
Andrej
Date:
On 8 June 2010 16:05, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> This has nothing to do with my thread. Please, open a separate thread
>> the next time.  This is really annoying.
> Not *nearly* as annoying as HTML-only email (ahem).
Don't know about that, Tom, I'd say they're en par. ;)


Cheers,
Andrej

--
Please don't top post, and don't use HTML e-Mail :}  Make your quotes concise.

http://www.american.edu/econ/notes/htmlmail.htm

Re: The Two Towers

From
Jon Jensen
Date:
> Both databases have current statistics, the data is exactly the same, as shown by the results. Oracle finished in
0.85second (850 milliseconds) while PostgreSQL took 2127.345 ms, 2.5 times slower than Oracle. The difference that is
obviousis the access path: Postgres chose nested loops, while Oracle chose merge join and utilized the primary key for
thetable. Postgres, apparently cannot do "fast full scan" of an index and doesn't know how to utilize the primary key
incases like this. However, my gripe is that the optimizer should have selected merge join, just like Oracle did. In
thiscase, nested loops are definitely the wrong choice. Rule based optimizers, the kind of the optimizer that takes
intoconsideration only the structure of the table, usually ends up being dominated by the nested loops method. Nested
loopsmethod usually dominates the OLTP type applications but can really mess up large reports. I am under the
impressionthat Postgres query planner is geared toward the OLTP type of the database. Maybe a new parameter is needed
thatwould somehow shift gears to "data warehouse use", on demand? I have to say, the advantage is still on the side of
Sauron.

You may want to check out http://www.postgresql.org/docs/8.4/static/runtime-config-query.html

If you have a good understanding of your query and how it should be run, you can toggle such settings as
enable_nestloopon demand to see if you get a better result (e.g. SET enable_nestloop TO OFF). Of course, it's a
double-edgedsword and it's far easier to make queries perform more poorly when toggling these settings. 

Jon