Re: postgres slower on nested queries - Mailing list pgsql-general

From Alexander Elgert
Subject Re: postgres slower on nested queries
Date
Msg-id 45EE9E34.2010900@adiva.de
Whole thread Raw
In response to Re: postgres slower on nested queries  (Richard Huxton <dev@archonet.com>)
Responses Re: postgres slower on nested queries
Re: postgres slower on nested queries
List pgsql-general
Richard Huxton schrieb:
> Alexander Elgert wrote:
>> Hello,
>>
>> I programmed a little script which iterates over all databases in a
>> DBMS, iterating over all tables and then over all columns.
>> This skript works for mysql and postgres.
>>
>> 1. Solution overview
>> ====================
>>
>> foreach database {
>>    foreach table {
>>        foreach column {
>>            do something ...
>>        }
>>    }
>> }
>
> OK - obviously not the way to do it if you really want all the columns
> in all the databases...
>
>> 1. Solution detail
>> ==================
>>
>> foreach database
>> ----------------
>> mysql: show databases
>> postgres: SELECT d.datname as Name FROM pg_catalog.pg_database d
>> WHERE (d.datname != 'template0' AND d.datname != 'template1') ORDER
>> BY 1;)
>
> You can have template databases called other names too of course.
>
>> foreach table
>> -------------
>> mysql: show tables
>> postgres: select table_name from information_schema.tables where
>> table_schema = 'public')
>>
>> foreach column
>> --------------
>> mysql: show columns from '%s'
>> postgres: select column_name from information_schema.columns where
>> table_name = '%s')
>>
>> (If there are better queries for postgres, please let me know.)
>>
>> 2. Solution
>> ===========
>>
>> I found the postgres version VERY slow, so a decided to fetch
>
> Define VERY - it took what, milliseconds to do this? Seconds? Hours?

I wrote two different functions to get the data out of the database, the
one uses a huge amount of queries, the other uses an array to hold the data.
Because in mysql I have to use "show tables/show columns from ..." I
thought I can use the same style in postgres.

Of course, it works, but the unoptimized postgres code is 80 times
slower. ;(
The overhead for a single 'show columns from ...' seems to be small in
mysql.
Please read on, I hope I am able to show you the difference.

>
>>    select table_name, column_name from information_schema.columns
>> where table_schema = 'public'
>>
>> and wrote the output to an two dimensional array to process the
>> elements later:
>>    $tableA[$row['table_name']][$row['column_name']]=True;
>
> Not sure what this code is meant to show.
foreach database {
   foreach table_name, column_name {
       $tableA[$row['table_name']][$row['column_name']]=True;
   }
   // process 2-dim $tableA array
}

This results in a structure where I can itereate over all keys in the
2-dim array.
You can see I iterate first over the databases and then over table AND
columns!
--- mysql: ~1s (Database X)
--- postgres: ~1s (Database Y)
;)

In contrast: =======================================================

foreach database {
   foreach table {
       foreach column {
           do something ...
       }
   }
}
--- mysql: ~1s (Database X)
--- postgres: ~80s (Database Y)
;(

>
>> The second approach ist much faster, this must be because there is no
>> nesting. ;(
>
> What nesting? Are you trying to do sub-queries of some sort?
I did a loop over all tables and THEN calling a query for each table to
get the columns (from the same table).
Yes, there are definitively more queries the DBMS has to manage.
(It is a bad style, but it is intuitive. Maybe the overhead of a single
query is more time consuming than in mysql.)
>
> In your first example, you were querying for individual rows, in the
> second you fetch them all at once. Of course it's quicker to run one
> query and return many rows than many queries each returning one row.
> Otherwise your RDBMS has something seriously wrong with it.
>
>> Further - I think - there is no real caching done in the PHP Library
>> of xampp 1.5.4, php_pgsql.dll ;(
>
> No (not that I'm sure what xampp is). Look into memcache/pg_memcache
> if you want caching - that does it properly.
Thanks.
(xampp is an environment to develop wep applications.)
>
>
>> It would be very helpful what causes the problem.
>
> You haven't actually described a problem afaict. Can you describe what
> it is you're trying to do? Do you want:
>   1. Details of a single column
>   2. To loop through all the columns doing something
> If you can say which of these you're trying to do, and what the
> "something" is, then I'm sure there's somebody here who can help.
I think it will be the best to provide example code.
>
>> Another question rises from this task:
>> ======================================
>>
>> Is there any possibility to use the shortcut \d as a query from PHP?
>> I used psql -E to display the query, but these queries rely on the
>> system catalogs, but the page:
>> http://www.postgresql.org/docs/8.2/interactive/information-schema.html
>> says they are not stable.
>> And - of course - it is much easier to type! ;)
>
> Nope - \d is psql only. That's because what it does changes with the
> system catalogues. If you want a stable representation, use the
> information_schema. If you want all the PostgreSQL-specific details
> you'll have to cope with changes to the system catalogues. However, I
> seem to remember some utility views that pgadmin uses that might be of
> use.
It would be nice to have the \d-query stored somewhere in the database.
>
> Oh - and you might find phppgadmin useful to look at.
>
Thanks.

Greetings,
    Alexander Elgert


pgsql-general by date:

Previous
From: Martijn van Oosterhout
Date:
Subject: Re: vacuum error
Next
From: Markus Schiltknecht
Date:
Subject: Re: real multi-master replication?