Thread: PsqlODBC slow on UNION queries

PsqlODBC slow on UNION queries

From
Zoltan Boszormenyi
Date:
Hi,

we are using PowerBuilder for development and I experimented a little
with PostgreSQL-8.1 through ODBC and I experienced the following
on psqlodbc-8.01.0106.

We have some large views, consisting of 6+ UNION members.
Recently, I rewrote them to use UNION ALL, since the members
give distinct rows across the whole view.

The performance difference between the two is enormous, e.g. using UNION,
the rows starts flowing only after 30-32 seconds but when using UNION ALL,
it starts instantly. Doing the same in psql like below runs under 4 seconds
using UNION and under about 12 seconds using UNION ALL, and this
includes getting all the rows from the server.

time echo "select * from view_x" | psql db >result.txt

What may cause such a large difference?


Re: PsqlODBC slow on UNION queries

From
Tom Lane
Date:
Zoltan Boszormenyi <zboszor@dunaweb.hu> writes:
> We have some large views, consisting of 6+ UNION members.
> Recently, I rewrote them to use UNION ALL, since the members
> give distinct rows across the whole view.

> The performance difference between the two is enormous, e.g. using UNION,
> the rows starts flowing only after 30-32 seconds but when using UNION ALL,
> it starts instantly.

Well, UNION requires fetching all the source rows and then doing a
duplicate-elimination step (sort+uniq basically).  Why does the
performance change surprise you?

            regards, tom lane

Re: PsqlODBC slow on UNION queries

From
Zoltan Boszormenyi
Date:
Tom Lane írta:

>Zoltan Boszormenyi <zboszor@dunaweb.hu> writes:
>
>
>>We have some large views, consisting of 6+ UNION members.
>>Recently, I rewrote them to use UNION ALL, since the members
>>give distinct rows across the whole view.
>>
>>
>
>
>
>>The performance difference between the two is enormous, e.g. using UNION,
>>the rows starts flowing only after 30-32 seconds but when using UNION ALL,
>>it starts instantly.
>>
>>
>
>Well, UNION requires fetching all the source rows and then doing a
>duplicate-elimination step (sort+uniq basically).  Why does the
>performance change surprise you?
>
>            regards, tom lane
>
>

Well, as I said, doing the same test from PSQL, the performance
difference is much less. I was really asking about the difference
between psql and PsqlODBC:

                            psql(output to file)         psqlodbc
UNION               total time: < 12 sec        first row received after
30 seconds
UNION ALL      total time < 4 sec           first row received instantly

My question is, why does PsqlODBC does the same unique
seach again, when the server already did it? Only this would explain
the difference between the psql and PsqlODBC timings.

Best regards,
Zoltán Böszörményi


Re: PsqlODBC slow on UNION queries

From
Ludek Finstrle
Date:
> Well, as I said, doing the same test from PSQL, the performance
> difference is much less. I was really asking about the difference
> between psql and PsqlODBC:
>
>                            psql(output to file)         psqlodbc
> UNION               total time: < 12 sec        first row received after
> 30 seconds
> UNION ALL      total time < 4 sec           first row received instantly

I suppose you have used Declare/Fetch turned on. Please try the same
query in psql client. The exact query you can find in mylog output.
It could be something like
DECLARE CURSOR <CURSOR_NAME> FOR SELECT <your select>;
FETCH FORWARD <number of rows to fetch at first time> FROM <CURSOR_NAME>;

If you have used Server side prepare then query is already different.
It use:
PREPARE <plan_name>[(paremters types)] AS <query>;
EXECUTE <plan_name> [(parameters)];

Regards,

Luf

Re: PsqlODBC slow on UNION queries

From
Zoltan Boszormenyi
Date:
Hi,

Ludek Finstrle írta:

>>Well, as I said, doing the same test from PSQL, the performance
>>difference is much less. I was really asking about the difference
>>between psql and PsqlODBC:
>>
>>                           psql(output to file)         psqlodbc
>>UNION               total time: < 12 sec        first row received after
>>30 seconds
>>UNION ALL      total time < 4 sec           first row received instantly
>>
>>
>
>I suppose you have used Declare/Fetch turned on. Please try the same
>query in psql client. The exact query you can find in mylog output.
>It could be something like
>DECLARE CURSOR <CURSOR_NAME> FOR SELECT <your select>;
>FETCH FORWARD <number of rows to fetch at first time> FROM <CURSOR_NAME>;
>
>If you have used Server side prepare then query is already different.
>It use:
>PREPARE <plan_name>[(paremters types)] AS <query>;
>EXECUTE <plan_name> [(parameters)];
>
>Regards,
>
>Luf
>
>

At the moment I am at home, being sick, so I cannot tell which queries
PowerBuilder use.

But here are the above queries and their timings from psql,
this is on my home machine, so the timings are different from
what I quoted. 84693 rows were produced by SELECT *,
so that's the number I used in FETCH FORWARD.

[zozo@host-81-17-177-202 psql]$ cat q1.txt
select * from v_invoice_browse;

[zozo@host-81-17-177-202 psql]$ cat q2.txt
begin;
declare mycur1 cursor for select * from v_invoice_browse;
FETCH FORWARD 84693 FROM mycur1;
commit;

[zozo@host-81-17-177-202 psql]$ cat q3.txt
prepare plan1 as select * from v_invoice_browse;
execute plan1;

DECLARE CURSOR complained about being outside of a transaction
block, so I put it between BEGIN and  COMMIT.

UNION ALL:

[zozo@host-81-17-177-202 psql]$ time psql -f q1.txt -U bolt bolt2 >r1.txt

real    0m5.663s
user    0m2.440s
sys     0m0.347s

[zozo@host-81-17-177-202 psql]$ time psql -f q2.txt -U bolt bolt2 >r2.txt

real    0m5.870s
user    0m2.480s
sys     0m0.354s

[zozo@host-81-17-177-202 psql]$ time psql -f q3.txt -U bolt bolt2 >r3.txt

real    0m5.981s
user    0m2.449s
sys     0m0.353s

UNION:

[zozo@host-81-17-177-202 psql]$ time psql -f q1.txt -U bolt bolt2 >r1.txt

real    0m10.237s
user    0m2.453s
sys     0m0.214s

[zozo@host-81-17-177-202 psql]$ time psql -f q2.txt -U bolt bolt2 >r2.txt

real    0m10.535s
user    0m2.480s
sys     0m0.241s
[zozo@host-81-17-177-202 psql]$ time psql -f q3.txt -U bolt bolt2 >r3.txt

real    0m10.488s
user    0m2.466s
sys     0m0.245s

The timings between UNION and UNION ALL comparing the same query
is about what I expected. But there isn't too much difference between
the different queries on the same view, so the PowerBuilder behaviour still
isn't clear to me.

BTW, the quoted timings in the previous post from psql and UNION
vs UNION ALL were taken on a 3GHz HT P4, running RedHat 9.
The client is Windows2000 with PsqlODBC 08.01.0106.
The above timings were on an Athlon 3200+, running FC3/x86-64.
Both servers are running PostgreSQL 8.1.1.

And the  timing differences happen differently if I attach the view
in an Access database in Office2003 using PsqlODBC 8.01.0106.
This is on WinXP running in VMWare on the same machine as the PostgreSQL
server. E.g. the view opens in Access in about 7 seconds when I use
UNION ALL,
but it requires about 17 seconds if I use UNION. This was with both
Declare/Fetch and Server side prepare being OFF. Also the same with
only Server side prepare being ON.

If I set only Declare/Fetch or both to ON, opening the VIEW in Access
comes close to what I experinced in psql, e.g. it appears almost instantly
when using UNION ALL, and just under 7 seconds when using UNION.

I don't remember which ODBC settings I used for the Windows2000 client
and PowerBuilder, I am not near that machine. I will recheck it when
I get back to my workplace next monday.

Best regards,
Zoltán Böszörményi


Re: PsqlODBC slow on UNION queries

From
Ludek Finstrle
Date:
> >>Well, as I said, doing the same test from PSQL, the performance
> >>difference is much less. I was really asking about the difference
> >>between psql and PsqlODBC:
> >>
> >>                          psql(output to file)         psqlodbc
> >>UNION               total time: < 12 sec        first row received after
> >>30 seconds
> >>UNION ALL      total time < 4 sec           first row received instantly
> >
> >I suppose you have used Declare/Fetch turned on. Please try the same
> >query in psql client. The exact query you can find in mylog output.
> >It could be something like
> >DECLARE CURSOR <CURSOR_NAME> FOR SELECT <your select>;
> >FETCH FORWARD <number of rows to fetch at first time> FROM <CURSOR_NAME>;
> >
> >If you have used Server side prepare then query is already different.
> >It use:
> >PREPARE <plan_name>[(paremters types)] AS <query>;
> >EXECUTE <plan_name> [(parameters)];
>
> At the moment I am at home, being sick, so I cannot tell which queries
> PowerBuilder use.

You don't understand me. When you turn on psqlODBC option Use Declare/Fetch
all SELECT statements are internally (by psqlODBC) changed to
DECLARE/FETCH commands. It's the same with Server side prepare.

So it doesn't matter what does PowerBuilder use.

> But here are the above queries and their timings from psql,
> this is on my home machine, so the timings are different from
> what I quoted. 84693 rows were produced by SELECT *,
> so that's the number I used in FETCH FORWARD.

psqlODBC driver use only Cache size (default is 100) in FETCH FORWARD.
But you can change it to get better performance.

> And the  timing differences happen differently if I attach the view
> in an Access database in Office2003 using PsqlODBC 8.01.0106.
> This is on WinXP running in VMWare on the same machine as the PostgreSQL
> server. E.g. the view opens in Access in about 7 seconds when I use
> UNION ALL,
> but it requires about 17 seconds if I use UNION. This was with both
> Declare/Fetch and Server side prepare being OFF. Also the same with
> only Server side prepare being ON.
>
> If I set only Declare/Fetch or both to ON, opening the VIEW in Access
> comes close to what I experinced in psql, e.g. it appears almost instantly
> when using UNION ALL, and just under 7 seconds when using UNION.
>
> I don't remember which ODBC settings I used for the Windows2000 client
> and PowerBuilder, I am not near that machine. I will recheck it when
> I get back to my workplace next monday.

There is another thing. psqlODBC driver simulate SQLColAttributes (I wrote
it on the fly so it could be different ODBC API) throught call
the query and then get the columns information before query is already
open. So it could be the time bottleneck.
I try describe it better:
SQLPrepare, SQLColAttributes (this call the statement for getting columns
information), SQLExecute (this call the statement for result).

If you want trace it down the good start point is turn mylog output on
and read the result in C:\mylog_XXXX.log (or txt extension?).

Regards,

Luf

Re: PsqlODBC slow on UNION queries

From
Ludek Finstrle
Date:
> > If I set only Declare/Fetch or both to ON, opening the VIEW in Access
> > comes close to what I experinced in psql, e.g. it appears almost instantly
> > when using UNION ALL, and just under 7 seconds when using UNION.
> >
> > I don't remember which ODBC settings I used for the Windows2000 client
> > and PowerBuilder, I am not near that machine. I will recheck it when
> > I get back to my workplace next monday.
>
> There is another thing. psqlODBC driver simulate SQLColAttributes (I wrote
> it on the fly so it could be different ODBC API) throught call
> the query and then get the columns information before query is already
> open. So it could be the time bottleneck.
> I try describe it better:
> SQLPrepare, SQLColAttributes (this call the statement for getting columns
> information), SQLExecute (this call the statement for result).
>
> If you want trace it down the good start point is turn mylog output on
> and read the result in C:\mylog_XXXX.log (or txt extension?).

I forgot. Feel free to send the mylog output here to list. We can
help you to decode it ;-)

Regards,

Luf

Re: PsqlODBC slow on UNION queries

From
Zoltan Boszormenyi
Date:
Ludek Finstrle írta:

>>>If I set only Declare/Fetch or both to ON, opening the VIEW in Access
>>>comes close to what I experinced in psql, e.g. it appears almost instantly
>>>when using UNION ALL, and just under 7 seconds when using UNION.
>>>
>>>I don't remember which ODBC settings I used for the Windows2000 client
>>>and PowerBuilder, I am not near that machine. I will recheck it when
>>>I get back to my workplace next monday.
>>>
>>>
>>There is another thing. psqlODBC driver simulate SQLColAttributes (I wrote
>>it on the fly so it could be different ODBC API) throught call
>>the query and then get the columns information before query is already
>>open. So it could be the time bottleneck.
>>I try describe it better:
>>SQLPrepare, SQLColAttributes (this call the statement for getting columns
>>information), SQLExecute (this call the statement for result).
>>
>>If you want trace it down the good start point is turn mylog output on
>>and read the result in C:\mylog_XXXX.log (or txt extension?).
>>
>>
>
>I forgot. Feel free to send the mylog output here to list. We can
>help you to decode it ;-)
>
>

I will, when I am back to my workplace.

In the meantime, I have set MyLog on in WinXP under VMWare here,
but Access produced some 120+ MB log and it wasn't even finished
reading the records... I didn't feel like I have to send it to the list. ;-)
Anyway, it turned out it used DECLARE CURSOR and it fetched
100 records at a time.

Best regards,
Zoltán Böszörményi


Re: PsqlODBC slow on UNION queries

From
Ludek Finstrle
Date:
> >I forgot. Feel free to send the mylog output here to list. We can
> >help you to decode it ;-)
>
> I will, when I am back to my workplace.
>
> In the meantime, I have set MyLog on in WinXP under VMWare here,
> but Access produced some 120+ MB log and it wasn't even finished
> reading the records... I didn't feel like I have to send it to the list. ;-)
> Anyway, it turned out it used DECLARE CURSOR and it fetched
> 100 records at a time.

Ufff. What tons of records do you get? BTW could you compile psqlODBC
from source code? It is possible to disable mylog output in function
CC_mapping (it's per record mylog for problems with encoding). I think
you don't have such a problem so there is no problem to comment out
mylog(...) calls from CC_mapping. This results in much smaller mylog
output.

Regards,

Luf

Re: PsqlODBC slow on UNION queries

From
Zoltan Boszormenyi
Date:
Ludek Finstrle írta:

>>>I forgot. Feel free to send the mylog output here to list. We can
>>>help you to decode it ;-)
>>>
>>>
>>I will, when I am back to my workplace.
>>
>>In the meantime, I have set MyLog on in WinXP under VMWare here,
>>but Access produced some 120+ MB log and it wasn't even finished
>>reading the records... I didn't feel like I have to send it to the list. ;-)
>>Anyway, it turned out it used DECLARE CURSOR and it fetched
>>100 records at a time.
>>
>>
>
>Ufff. What tons of records do you get? BTW could you compile psqlODBC
>
>

About 86000 records and I found every fields in every records logged, too.

>from source code? It is possible to disable mylog output in function
>
>

I can, but not on Windows. I can take the postgresql-odbc src.rpm
from RedHat Rawhide and replace the source TGZ with the current
CVS snapshot. In fact, I have done it several times, to test my Linux apps
against the "official" ODBC driver instead of the ones that come with
unixODBC.
But that doesn't help me under Windows. :-) I don't have VC++.

>CC_mapping (it's per record mylog for problems with encoding). I think
>you don't have such a problem so there is no problem to comment out
>mylog(...) calls from CC_mapping. This results in much smaller mylog
>output.
>
>Regards,
>
>Luf
>
>
>

Best regards,
Zoltán Böszörményi


Re: PsqlODBC slow on UNION queries

From
Ludek Finstrle
Date:
> I can, but not on Windows. I can take the postgresql-odbc src.rpm
...
> But that doesn't help me under Windows. :-) I don't have VC++.

Ok. I'll send you changed psqlodbc[aw].dll tomorrow at first.
I'm going to come at home late today.

Regards,

Luf

Re: PsqlODBC slow on UNION queries

From
Ludek Finstrle
Date:
Wed, Jan 11, 2006 at 10:25:33AM +0100, Ludek Finstrle napsal(a):
> > I can, but not on Windows. I can take the postgresql-odbc src.rpm
> ...
> > But that doesn't help me under Windows. :-) I don't have VC++.
>
> Ok. I'll send you changed psqlodbc[aw].dll tomorrow at first.
> I'm going to come at home late today.

Oh, I'm sorry. I'm cleaning my mailbox and I see I forgot to you :-(
Please feel free to notify me next time.

I'll send it to you privately in few minutes.

BTW there was similar problem reported from some man. He use BC++
and the solution was turn on Parse Statement with recent psqlODBC
driver.

Regards,

Luf

Re: PsqlODBC slow on UNION queries

From
Ludek Finstrle
Date:
> finally here's the results. I hope you can find something useful in it.

Finally I see the problem in mylog output. Unfortunatelly I don't know
how to change the behaviour :-( without more investigation.

There is problem that you (I mean PowerBuilder) calling
SQLExecDirect select * from view (which fire the execution)
SQLDescribeCol (for each column in result)
SQLCancel (I don't know why the statement is canceled
           it seems to me that PB miss some information)
...
SQLTables (get additional informations)
...
SQLColumns
...
SQLPrimaryKeys
...
SQLStatistics
...
SQLExecDirect select * from view (which fire the execution again)

I have a lot of questions it will be the best if you could try
create some repro app. Some of the questions:
1) could you try similar view without text (char, varchar, ...) columns?
2) maybe minimalization of view to one column could be the best test
3) could you try enable trace ODBC (in ODBC manager) - independent on
   datasource and create two logs - againist psqlodbc and another DB
   - maybe we could find the difference there
4) could you say me what are the query for (it's fired bettwen two calls
   of select * from view):
   - select pbd_fhgt, pbd_fwgt, pbd_fitl, pbd_funl, pbd_fchr, pbd_fptc,
            pbd_ffce, pbh_fhgt, pbh_fwgt, pbh_fitl, pbh_funl, pbh_fchr,
            pbh_fptc, pbh_ffce, pbl_fhgt, pbl_fwgt, pbl_fitl, pbl_funl,
            pbl_fchr, pbl_fptc, pbl_ffce, pbt_cmnt
     from informix.pbcattbl
     where pbt_tnam = 'v_invoice_browse' and pbt_ownr = 'informix'
   - select pbv_name, pbv_vald, pbv_type, pbv_cntr, pbv_msg
     from informix.pbcatvld
   - select pbf_name, pbf_frmt, pbf_type, pbf_cntr
     from informix.pbcatfmt
   - select pbe_name, pbe_edit, pbe_type, pbe_cntr, pbe_work, pbe_seqn,
            pbe_flag
     from informix.pbcatedt
     order by pbe_name, pbe_seqn
   - select pbc_tnam, pbc_ownr, pbc_cnam, pbc_labl, pbc_lpos, pbc_hdr,
            pbc_hpos, pbc_jtfy, pbc_mask, pbc_case, pbc_hght, pbc_wdth,
            pbc_ptrn, pbc_bmap, pbc_init, pbc_cmnt, pbc_edit
     from informix.pbcatcol
     where (pbc_tnam = 'v_invoice_browse' and pbc_ownr = 'informix')
5) How long does it take only (without fetch):
   BEGIN; DECLARE CURSOR csr_test AS select * from v_invoice_browse; COMMIT;
6) How long does it take:
   BEGIN; DECLARE CURSOR csr_test AS select * from v_invoice_browse;
   FETCH BACKWARD in csr_test; COMMIT;

Maybe the PowerBuilder go this way all the time. It's our problem
that we get first data in SQLExecDirect. I don't know it right now.
We have to find the solution but the way could be hard.

Regards,

Luf

Re: PsqlODBC slow on UNION queries

From
Zoltan Boszormenyi
Date:
Ludek Finstrle írta:

>>finally here's the results. I hope you can find something useful in it.
>>
>>
>
>Finally I see the problem in mylog output. Unfortunatelly I don't know
>how to change the behaviour :-( without more investigation.
>
>There is problem that you (I mean PowerBuilder) calling
>SQLExecDirect select * from view (which fire the execution)
>SQLDescribeCol (for each column in result)
>SQLCancel (I don't know why the statement is canceled
>           it seems to me that PB miss some information)
>...
>SQLTables (get additional informations)
>...
>SQLColumns
>...
>SQLPrimaryKeys
>...
>SQLStatistics
>...
>SQLExecDirect select * from view (which fire the execution again)
>
>

... retrieve 1000+ rows and then _I_ pressed Cancel. Can you confirm that?
I am just curious.

>I have a lot of questions it will be the best if you could try
>create some repro app. Some of the questions:
>1) could you try similar view without text (char, varchar, ...) columns?
>
>

Would the same view do but only seleting numeric (integer and decimal)
columns?

>2) maybe minimalization of view to one column could be the best test
>
>

OK.

>3) could you try enable trace ODBC (in ODBC manager) - independent on
>   datasource and create two logs - againist psqlodbc and another DB
>   - maybe we could find the difference there
>
>

OK.

>4) could you say me what are the query for (it's fired bettwen two calls
>   of select * from view):
>   - select pbd_fhgt, pbd_fwgt, pbd_fitl, pbd_funl, pbd_fchr, pbd_fptc,
>            pbd_ffce, pbh_fhgt, pbh_fwgt, pbh_fitl, pbh_funl, pbh_fchr,
>            pbh_fptc, pbh_ffce, pbl_fhgt, pbl_fwgt, pbl_fitl, pbl_funl,
>            pbl_fchr, pbl_fptc, pbl_ffce, pbt_cmnt
>     from informix.pbcattbl
>     where pbt_tnam = 'v_invoice_browse' and pbt_ownr = 'informix'
>   - select pbv_name, pbv_vald, pbv_type, pbv_cntr, pbv_msg
>     from informix.pbcatvld
>   - select pbf_name, pbf_frmt, pbf_type, pbf_cntr
>     from informix.pbcatfmt
>   - select pbe_name, pbe_edit, pbe_type, pbe_cntr, pbe_work, pbe_seqn,
>            pbe_flag
>     from informix.pbcatedt
>     order by pbe_name, pbe_seqn
>   - select pbc_tnam, pbc_ownr, pbc_cnam, pbc_labl, pbc_lpos, pbc_hdr,
>            pbc_hpos, pbc_jtfy, pbc_mask, pbc_case, pbc_hght, pbc_wdth,
>            pbc_ptrn, pbc_bmap, pbc_init, pbc_cmnt, pbc_edit
>     from informix.pbcatcol
>     where (pbc_tnam = 'v_invoice_browse' and pbc_ownr = 'informix')
>
>

I don't know what are these. I guess  PowerBuilder tries something
behind everyones' back.
I just opened the "Database Painter" and issued "select * from
v_invoice_browse;".
I will recheck my database but those table names aren't familiar at all.
Maybe PB created
them at first connect and collected something about the database in
those tables.

>5) How long does it take only (without fetch):
>   BEGIN; DECLARE CURSOR csr_test AS select * from v_invoice_browse; COMMIT;
>6) How long does it take:
>   BEGIN; DECLARE CURSOR csr_test AS select * from v_invoice_browse;
>   FETCH BACKWARD in csr_test; COMMIT;
>
>

I will test these, too.

>Maybe the PowerBuilder go this way all the time. It's our problem
>that we get first data in SQLExecDirect. I don't know it right now.
>We have to find the solution but the way could be hard.
>
>Regards,
>
>Luf
>
>

Best regards,
Zoltán Böszörményi


Re: PsqlODBC slow on UNION queries

From
Ludek Finstrle
Date:
> >SQLExecDirect select * from view (which fire the execution again)
>
> ... retrieve 1000+ rows and then _I_ pressed Cancel. Can you confirm that?
> I am just curious.

Yes. I see > 1000 < 1100 rows. You could go to the bottom of mylog (I see
union one) and search backward "select * from v_invoice_browse".
Then you search forward fetch <case size> (<case size> = 100 in your
configuration so you search "fetch 100"). I count 11 (I hope I don't
create a mistake) "fetch 100" => > 1000 < 1100 :-)

There is SQLCancel after last "fetch 100" ;-)

> >create some repro app. Some of the questions:
> >1) could you try similar view without text (char, varchar, ...) columns?
>
> Would the same view do but only seleting numeric (integer and decimal)
> columns?

Yes. It's enough. I don't know if the problem isn't char column
handling.

> >4) could you say me what are the query for (it's fired bettwen two calls
> >  of select * from view):

...

> I don't know what are these. I guess  PowerBuilder tries something
> behind everyones' back.

I think so. Could you try these selects in psql or PgAdmin tool?
I'm curios what you see.

Regards,

Luf