Query optimizing - paradox behave - Mailing list pgsql-sql

From David M. Richter
Subject Query optimizing - paradox behave
Date
Msg-id 3B57F09E.C807E20A@freenet.de
Whole thread Raw
Responses Re: Query optimizing - paradox behave  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-sql
(Here again; my email adress was killed)

Hallo !

I want to tune a database. There a many redundant datas in the database
, because of all the relations were consider as n:m relations. But the
most of them are 1:n Relations. So my approach was to cut the
redundancies to get more performance. But .. happens!

The query with the 3 tables is faster than the query with 2 tables.

That is paradox to the Explain output.
And: the real database functions like dbPG95GetIndex and all functions
defined by me are slower.
The whole program is slower than before.
I disabled all the index.(since with index the behavior is the same) The
database pacs ist only restructured. They have the same data. With
database pacs and compare a vacuum was made.
I looked at the user time , since system time is faked because my
testprogram hands over the control to the postmaster and the postmaster
is doing his own work. So I made a lot of tests to get a average
usertime. So escapes will be catched and eliminated.

Here are the tabledescriptions for the original database "compare":

tables i.e.
There is a n:m relationship between patient and study realized with
relpatient_study000 relationtable.

                    Table "patient"
      Attribute       |          Type          | Modifier
----------------------+------------------------+----------
 chilioid             | character varying(80)  |
 name                 | text                   |
 id                   | character varying(256) |
 birthdate            | date                   |
 birthtime            | time                   |
 sex                  | character(1)           |
 medicalrecordlocator | character varying(128) |

                       Table "study"
       Attribute        |          Type          | Modifier
------------------------+------------------------+----------
 chilioid               | character varying(80)  |
 instanceuid            | character varying(64)  |
 id                     | character varying(64)  |
 studydate              | date                   |
 studytime              | time                   |
 modality               | character varying(2)   |
 manufacturer           | character varying(128) |
 referingphysician      | text                   |
 description            | character varying(128) |
 manufacturersmodelname | character varying(128) |
 importtime             | double precision       |
 chilisenderid          | character varying(80)  |
 accessionnumber        | character varying(64)  |
 institutionname        | character varying(128) |
 workflowstate          | character varying(8)   |
 flags                  | character varying(8)   |
 performingphysician    | character varying(128) |
 reportingphysician     | character varying(128) |


         Table "relpatient_study000"
 Attribute |         Type          | Modifier
-----------+-----------------------+----------
 chilioid  | character varying(80) |
 parentoid | character varying(80) |
 childoid  | character varying(80) |

 parentoid is here the oid of the patient and childoid is here the oid
of the study.

Thats the query with the original database "compare":

time psql -d compare -c "SELECT patient.*,study.* FROM
patient,study,relpatient_study000 r0 WHERE
(patient.chiliOID=r0.parentOID AND study.chiliOID=r0.childOID) order by
patient.name using <" > 3tableni
1.671u 0.130s 0:11.14 16.4%     0+0k 0+0io 208pf+0w

psql -d compare -c "EXPLAIN SELECT patient.*,study.* FROM
patient,study,relpatient_study000 r0 WHERE
(patient.chiliOID=r0.parentOID AND study.chiliOID=r0.childOID) order by
patient.name using <" > 3tableni
NOTICE:  QUERY PLAN:

Sort  (cost=1135170635.79..1135170635.79 rows=748802386 width=296)
  ->  Merge Join  (cost=1025510.64..1057837.48 rows=748802386 width=296)
        ->  Sort  (cost=1017989.22..1017989.22 rows=2556861 width=96)
              ->  Merge Join  (cost=4287.84..4763.21 rows=2556861
width=96)
                    ->  Sort  (cost=990.43..990.43 rows=8725 width=72)
                          ->  Seq Scan on patient  (cost=0.00..212.25
rows=8725 width=72)
                    ->  Sort  (cost=3297.40..3297.40 rows=29305
width=24)
                          ->  Seq Scan on relpatient_study000 r0
(cost=0.00..774.05 rows=29305 width=24)
        ->  Sort  (cost=7521.42..7521.42 rows=29286 width=200)
              ->  Seq Scan on study  (cost=0.00..1116.86 rows=29286
width=200)

-----------------------------



Thats the query with the new restructured database "pacs":

time psql -d pacs -c "SELECT patient.*,study.* FROM patient,study WHERE
(patient.chiliOID=study.patientOID ) order by patient.name using <" >
2tableni
1.730u 0.120s 0:09.44 19.5%     0+0k 0+0io 208pf+0w


psql -d pacs -c "EXPLAIN SELECT patient.*,study.* FROM patient,study
WHERE (patient.chiliOID=study.patientOID ) order by patient.name using
<" > 2tableni       NOTICE:  QUERY PLAN:

Sort  (cost=2194791.19..2194791.19 rows=2555204 width=284)
  ->  Merge Join  (cost=8978.44..9453.57 rows=2555204 width=284)
        ->  Sort  (cost=990.43..990.43 rows=8725 width=72)
              ->  Seq Scan on patient  (cost=0.00..212.25 rows=8725
width=72)
        ->  Sort  (cost=7988.00..7988.00 rows=29286 width=212)
              ->  Seq Scan on study  (cost=0.00..1236.86 rows=29286
width=212)


Restructured tables i.e.
PAtient-study relationship is 1:n realized with column patientoid in
table study.

                    Table "patient"
      Attribute       |          Type          | Modifier
----------------------+------------------------+----------
 chilioid             | character varying(80)  |
 name                 | text                   |
 id                   | character varying(256) |
 birthdate            | date                   |
 birthtime            | time                   |
 sex                  | character(1)           |
 medicalrecordlocator | character varying(128) |



Table "study"
       Attribute        |          Type          | Modifier
------------------------+------------------------+----------
 chilioid               | character varying(80)  |
 instanceuid            | character varying(64)  |
 id                     | character varying(64)  |
 studydate              | date                   |
 studytime              | time                   |
 modality               | character varying(2)   |
 manufacturer           | character varying(128) |
 referingphysician      | text                   |
 description            | character varying(128) |
 manufacturersmodelname | character varying(128) |
 importtime             | double precision       |
 chilisenderid          | character varying(80)  |
 accessionnumber        | character varying(64)  |
 institutionname        | character varying(128) |
 workflowstate          | character varying(8)   |
 flags                  | character varying(8)   |
 performingphysician    | character varying(128) |
 reportingphysician     | character varying(128) |
 patientoid             | character varying(80)  |


The times of the processes are escape-eliminated by statistical methods.

I determined that the "compare" database is 8% times faster than the new
restructured "pacs" database.
How can I understand this? Whats my mistake?

Anybody  who can make some sugestions on the above will
receive my enthusiastic gratitude

David M. Richter
Attachment

pgsql-sql by date:

Previous
From: Henry House
Date:
Subject: Re: nextval on insert by arbitrary sequence
Next
From: Gary Stainburn
Date:
Subject: multiple lookup per row