Thread: Re: [INTERFACES] ODBC is slow with M$-Access Report

Re: [INTERFACES] ODBC is slow with M$-Access Report

From
"Jose' Soares Da Silva"
Date:
We are working on a project that IMHO give more prestige to
PostgreSQL.
The Hygea project concern the use of an Unix-like Operating  sys-
tem  as  "back-end" of a Client M$-windows application connected
by ODBC that will be installed in about 80 Italian Helth Depart-
ments for the veterinary controls and prevention.
Therefore...

O.S.: We choose Linux for his proved reliability.

Client: We choose to develop the Client with M$-Access because we
need (unfortunately) a complete integration with Micro$oft World.

Database: We choose PostgreSQL for his reliability  and  for  his
compatibility with SQL/92 standard recommendation and for his ex-
cellent technical support provided by "The PostgreSQL Development
Team" and his mailing lists.

Nevertheless  the  union  among M$-Access and PostgreSQL is quite
suffered for the following reasons:

1. The PostgreSQL doesn't use the index with  "OR"  operator  and
so is not possible to define a multiple key to use with M$-Access
and we need to retreat using OID as primary keys (thanks to Byron
Nikolaidis and David Hartwig of insightdist.com that are doing a
really great job with ODBC driver), but with the obvious consequences.

2.  As PostgreSQL doesn't allow an "ORDER BY" on columns not
included in the target list of the "SELECT", (I know that it is
SQL/92 standard, but IMO it's a fool thing), therefore, is not possible
to  have the "dynaset "sorted for any field that is different from
the key (in our case the useless OIDs).

3. The times required to run complex reports (for example those that
include LEFT JOINS) is very long (about 15 minutes to retrieve
2850 rows).

We hope the PostgreSQL next release v6.4 may have some of these features
otherwise, we have to give up the project.

> Jose' Soares Da Silva wrote:
>
> > Hi,
> >
> >    I'm using PostgreSQL-6.3 / psqlodbc 06.30.0242 / M$-Access97.
> > I created a REPORT with a leftjoin that takes a lot of time.
> > There are 3850 rows in the main table.
> > PostgreSQL takes about..............: 960 secs to print all records.
> > The same test using MySQL takes only:  85 secs and the same
> > test using M$-Access takes about....:  45 secs.
>
> This is never a simple comparison.   Performance using Access/PostgreSQL can be
> greatly effected by the driver settings.   In particular, if you tell MS Access
> that there is a unique index on a table, at link time, or to "Recognize Unique
> Indexed"  (and there is one), Access will generate queries which the backend will
> not respond to very optimally.   Especially where outer joins are concerned.
> These queries are characterized by numerous OR(s).    Unfortunately under these
> conditions the backend does make use of the very index that Access is trying to
> take advantage of.
>
> So relinking the table without Access's recognition of the primary key (unique
> index) may help performance.  The down side is that you may not modify a table
> from Access without a specified primary key.
>
> There is also another  factor.  Does MySql support outer joins?   PostgreSQL does
> not at this time.   MS Access will hide this fact from the users and perform the
> join within Access.  Thus, creating the situation described above.
>
> > I configured ODBC drive to write the log file to sees what ODBC is doing
> > but seems that it writes log file only while fetching rows.
> > Is there a way to know what ODBC is doing. To know why it takes so long time?
> >                                                            Thanks, Jose'
>
> The CommLog was created to log SQL statement  communication with the server.   A
> much more detailed log can be activated from the "ODBC Data Source Administrator"
> dialog under the "Tracing" tab.  If you use this feature you may want to clear it
> out first.   It will also bring processing to a craw.

                                  |  |
~~~~~~~~~~~~~~~~~~~~~~~~          |  |            ~~~~~~~~~~~~~~~~~~~~~~~~
          Progetto HYGEA      ----    ----        www.sferacarta.com
    Sfera Carta Software      ----    ----        sferac@bo.nettuno.it
       Via Bazzanese, 69          |  |            Fax. ++39 51 6131537
Casalecchio R.(BO) Italy          |  |            Tel. ++39 51  591054
-----------------------------------------------------------------------------


Re: [INTERFACES] ODBC is slow with M$-Access Report

From
David Hartwig
Date:

Jose' Soares Da Silva wrote:

> We are working on a project that IMHO give more prestige to
> PostgreSQL.
> The Hygea project concern the use of an Unix-like Operating  sys-
> tem  as  "back-end" of a Client M$-windows application connected
> by ODBC that will be installed in about 80 Italian Helth Depart-
> ments for the veterinary controls and prevention.
> Therefore...

> O.S.: We choose Linux for his proved reliability.
>
> Client: We choose to develop the Client with M$-Access because we
> need (unfortunately) a complete integration with Micro$oft World.
>
> Database: We choose PostgreSQL for his reliability  and  for  his
> compatibility with SQL/92 standard recommendation and for his ex-
> cellent technical support provided by "The PostgreSQL Development
> Team" and his mailing lists.
>
> Nevertheless  the  union  among M$-Access and PostgreSQL is quite
> suffered for the following reasons:
>
> 1. The PostgreSQL doesn't use the index with  "OR"  operator  and
> so is not possible to define a multiple key to use with M$-Access
> and we need to retreat using OID as primary keys (thanks to Byron
> Nikolaidis and David Hartwig of insightdist.com that are doing a
> really great job with ODBC driver), but with the obvious consequences.

 I am currently working on a solution as time will allow.   Hopefully part of 6.4

>
>
> 2.  As PostgreSQL doesn't allow an "ORDER BY" on columns not
> included in the target list of the "SELECT", (I know that it is
> SQL/92 standard, but IMO it's a fool thing), therefore, is not possible
> to  have the "dynaset "sorted for any field that is different from
> the key (in our case the useless OIDs).
>

This fix is in alpha and will be in the 6.4 release.  I do not know when 6.4 is slated
for release, but I am willing to send you a patch if it is critical for you to proceed.

> 3. The times required to run complex reports (for example those that
> include LEFT JOINS) is very long (about 15 minutes to retrieve
> 2850 rows).
>

The solution to your first item will resolve this also.

> We hope the PostgreSQL next release v6.4 may have some of these features
> otherwise, we have to give up the project.
>
> > Jose' Soares Da Silva wrote:
> >
> > > Hi,
> > >
> > >    I'm using PostgreSQL-6.3 / psqlodbc 06.30.0242 / M$-Access97.
> > > I created a REPORT with a leftjoin that takes a lot of time.
> > > There are 3850 rows in the main table.
> > > PostgreSQL takes about..............: 960 secs to print all records.
> > > The same test using MySQL takes only:  85 secs and the same
> > > test using M$-Access takes about....:  45 secs.
> >
> > This is never a simple comparison.   Performance using Access/PostgreSQL can be
> > greatly effected by the driver settings.   In particular, if you tell MS Access
> > that there is a unique index on a table, at link time, or to "Recognize Unique
> > Indexed"  (and there is one), Access will generate queries which the backend will
> > not respond to very optimally.   Especially where outer joins are concerned.
> > These queries are characterized by numerous OR(s).    Unfortunately under these
> > conditions the backend does make use of the very index that Access is trying to
> > take advantage of.
> >
> > So relinking the table without Access's recognition of the primary key (unique
> > index) may help performance.  The down side is that you may not modify a table
> > from Access without a specified primary key.
> >
> > There is also another  factor.  Does MySql support outer joins?   PostgreSQL does
> > not at this time.   MS Access will hide this fact from the users and perform the
> > join within Access.  Thus, creating the situation described above.
> >
> > > I configured ODBC drive to write the log file to sees what ODBC is doing
> > > but seems that it writes log file only while fetching rows.
> > > Is there a way to know what ODBC is doing. To know why it takes so long time?
> > >                                                            Thanks, Jose'
> >
> > The CommLog was created to log SQL statement  communication with the server.   A
> > much more detailed log can be activated from the "ODBC Data Source Administrator"
> > dialog under the "Tracing" tab.  If you use this feature you may want to clear it
> > out first.   It will also bring processing to a craw.
>


Re: [HACKERS] Re: [INTERFACES] ODBC is slow with M$-Access Report

From
Bruce Momjian
Date:
>
> We are working on a project that IMHO give more prestige to
> PostgreSQL.
> The Hygea project concern the use of an Unix-like Operating  sys-
> tem  as  "back-end" of a Client M$-windows application connected
> by ODBC that will be installed in about 80 Italian Helth Depart-
> ments for the veterinary controls and prevention.
> Therefore...
>
> O.S.: We choose Linux for his proved reliability.
>
> Client: We choose to develop the Client with M$-Access because we
> need (unfortunately) a complete integration with Micro$oft World.
>
> Database: We choose PostgreSQL for his reliability  and  for  his
> compatibility with SQL/92 standard recommendation and for his ex-
> cellent technical support provided by "The PostgreSQL Development
> Team" and his mailing lists.

Great.

>
> Nevertheless  the  union  among M$-Access and PostgreSQL is quite
> suffered for the following reasons:
>
> 1. The PostgreSQL doesn't use the index with  "OR"  operator  and
> so is not possible to define a multiple key to use with M$-Access
> and we need to retreat using OID as primary keys (thanks to Byron
> Nikolaidis and David Hartwig of insightdist.com that are doing a
> really great job with ODBC driver), but with the obvious consequences.

Yes, we need to work on this.  I am sure performance really suffers
because of this.  Vadim, is this on your short list?

>
> 2.  As PostgreSQL doesn't allow an "ORDER BY" on columns not
> included in the target list of the "SELECT", (I know that it is
> SQL/92 standard, but IMO it's a fool thing), therefore, is not possible
> to  have the "dynaset "sorted for any field that is different from
> the key (in our case the useless OIDs).

David at Insight just added this, so it certainly will be in 6.4.

>
> 3. The times required to run complex reports (for example those that
> include LEFT JOINS) is very long (about 15 minutes to retrieve
> 2850 rows).

Yea, we need this too.  Not sure where we are with this.  Can you give
an example?



--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] Re: [INTERFACES] ODBC is slow with M$-Access Report

From
The Hermit Hacker
Date:
On Tue, 2 Jun 1998, David Hartwig wrote:

> > O.S.: We choose Linux for his proved reliability.

    *quiet snicker*

> > 2.  As PostgreSQL doesn't allow an "ORDER BY" on columns not
> > included in the target list of the "SELECT", (I know that it is
> > SQL/92 standard, but IMO it's a fool thing), therefore, is not possible
> > to  have the "dynaset "sorted for any field that is different from
> > the key (in our case the useless OIDs).
> >
>
> This fix is in alpha and will be in the 6.4 release.  I do not know when
> 6.4 is slated for release, but I am willing to send you a patch if it is
> critical for you to proceed.

    6.4 is slated for Oct 1st...we had thought Sep 1st, except that,
being the tail end of the summer, alot of ppl tend to be in limbo



Re: [INTERFACES] ODBC is slow with M$-Access Report

From
Hannu Krosing
Date:
David Hartwig wrote:
>
> > 1. The PostgreSQL doesn't use the index with  "OR"  operator  and
> > so is not possible to define a multiple key to use with M$-Access
> > and we need to retreat using OID as primary keys (thanks to Byron
> > Nikolaidis and David Hartwig of insightdist.com that are doing a
> > really great job with ODBC driver), but with the obvious consequences.
>
>  I am currently working on a solution as time will allow.   Hopefully part of 6.4
>

Will this solution be in ODBC driver (rewrite ORs to UNION) or in
the backend (fix the optimiser)?

------
Hannu

Re: [HACKERS] Re: [INTERFACES] ODBC is slow with M$-Access Report

From
David Hartwig
Date:

Hannu Krosing wrote:

> David Hartwig wrote:
> >
> > > 1. The PostgreSQL doesn't use the index with  "OR"  operator  and
> > > so is not possible to define a multiple key to use with M$-Access
> > > and we need to retreat using OID as primary keys (thanks to Byron
> > > Nikolaidis and David Hartwig of insightdist.com that are doing a
> > > really great job with ODBC driver), but with the obvious consequences.
> >
> >  I am currently working on a solution as time will allow.   Hopefully part of 6.4
> >
>
> Will this solution be in ODBC driver (rewrite ORs to UNION) or in
> the backend (fix the optimiser)?
>

The short answer is that the rewrite on the driver side is problematic.

I had hoped to be further along with my feasibility research before raising the issue
for again discussion.  But, now is as good a time as any.  Let me first clarify the
problem for any ppl jumping into the middle of this thread.

Many general purpose database clients applications such as MS Access routinely generate
queries with the following signature:

SELECT k1, k2, k3, a4, a5, ...  FROM t WHERE
    (k1 = const01 AND k2 = const02 AND k3 = const03)  OR
    (k1 = const11 AND k2 = const12 AND k3 = const13)  OR
    (k1 = const21 AND k2 = const22 AND k3 = const23)  OR
    (k1 = const31 AND k2 = const32 AND k3 = const33)  OR
    (k1 = const41 AND k2 = const42 AND k3 = const43)  OR
    (k1 = const51 AND k2 = const52 AND k3 = const53)  OR
    (k1 = const61 AND k2 = const62 AND k3 = const63)  OR
    (k1 = const71 AND k2 = const72 AND k3 = const73)  OR
    (k1 = const81 AND k2 = const82 AND k3 = const73)  OR
    (k1 = const91 AND k2 = const92 AND k3 = const93);

Where k(n) id is the attribute for a multi-part primary key and const(m)(n) is any
constant.

Performance on this kind of a query is crucial to these client side tools.  These are
used to maneuver through large tables without having to slurp in the entire table.
Currently the backend optimizer tries to arrange the WHERE clause into conjunctive
normal form (cnfify()).  Unfortunatley this process leads to memory exhaustion.

I have come up with 3 methods of attacking the problem.

1.   As Mr. Krosing mentioned we could rewrite the query on the driver side. before
sending it to the backend.    One could identify the signature of such a query and upon
verification replace all the ORs with a "UNION SELECT k1, k2, k3, a4, ... FROM t
WHERE"    I have tested this substitution with up to 30 OR groupings and it performs
like a charm.   Thanks Bruce.  Here is the kicker.  If you do some guestimations using
a table with say 50 attributes, you will see that very quickly you will be bumping into
the 8K message limit.   I am finding that this is not unusual in our user community.

2.  Use a similar strategy to the first method except you do the rewrite the query in
the backend; some where after parsing and before optimizations.   The exact location
can be debated.   The basic idea is to pre-qualify the rewrite by requiring only one
table, no subselects, no unions, etc.   Then, identify the AND/OR signature in the
qualifier expression tree.   For each OR grouping, clone the main query tree (minus the
qualifier clause) onto a list of query trees hanging off the UNION structure element.
All the while, pruning the OR nodes off and attaching them to the cloned query tree.
The code required by this approach is very isolated and should be low risk as a
result.   My concern is that this approach is too narrow and does not fit well into the
long term goals of the project.   My guess is that performance will be even better than
the first method.

3.  Get out of the way and let Vadim do his thing.

Comments?


Re: [HACKERS] Re: [INTERFACES] ODBC is slow with M$-Access Report

From
Bruce Momjian
Date:
> 3.  Get out of the way and let Vadim do his thing.
>
> Comments?

Yes, I have queried him to find out where this sits on his list.  It
would be intestesting to what, if anything, he has planned for 6.4.  I
think I have forgotten.

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [INTERFACES] ODBC is slow with M$-Access Report

From
"Jose' Soares Da Silva"
Date:
On Tue, 2 Jun 1998, David Hartwig wrote:

<DELETED>
> > Nikolaidis and David Hartwig of insightdist.com that are doing a
> > really great job with ODBC driver), but with the obvious consequences.
>
>  I am currently working on a solution as time will allow.   Hopefully part of 6.4
> > 2.  As PostgreSQL doesn't allow an "ORDER BY" on columns not
> > included in the target list of the "SELECT", (I know that it is
> > SQL/92 standard, but IMO it's a fool thing), therefore, is not possible
> > to  have the "dynaset "sorted for any field that is different from
> > the key (in our case the useless OIDs).
>
> This fix is in alpha and will be in the 6.4 release.  I do not know when 6.4 is slated
> for release, but I am willing to send you a patch if it is critical for you to proceed.
>
> > 3. The times required to run complex reports (for example those that
> > include LEFT JOINS) is very long (about 15 minutes to retrieve
> > 2850 rows).
> >
> The solution to your first item will resolve this also.
>
This is a great new David, Thank you very much for your work, this allow us
to go on with this important project.
For us, is enough to know that it will be available *maybe* on next release.
                                               Thanks a lot, Jose'


Re: [HACKERS] Re: [INTERFACES] ODBC is slow with M$-Access Report

From
"Jose' Soares Da Silva"
Date:
On Tue, 2 Jun 1998, The Hermit Hacker wrote:

> On Tue, 2 Jun 1998, David Hartwig wrote:
>
> > > O.S.: We choose Linux for his proved reliability.
>
>     *quiet snicker*
If I understand, this mean incredibility.
We are using Linux since 1994 and we are satisfied. ;-)
> > > 2.  As PostgreSQL doesn't allow an "ORDER BY" on columns not
> > > included in the target list of the "SELECT", (I know that it is
> > > SQL/92 standard, but IMO it's a fool thing), therefore, is not possible
> > > to  have the "dynaset "sorted for any field that is different from
> > > the key (in our case the useless OIDs).
> > >
> >
> > This fix is in alpha and will be in the 6.4 release.  I do not know when
> > 6.4 is slated for release, but I am willing to send you a patch if it is
> > critical for you to proceed.
>
>     6.4 is slated for Oct 1st...we had thought Sep 1st, except that,
> being the tail end of the summer, alot of ppl tend to be in limbo
Thank you,
                                         Jose'


Re: [HACKERS] Re: [INTERFACES] ODBC is slow with M$-Access Report

From
"Jose' Soares Da Silva"
Date:
On Tue, 2 Jun 1998, Bruce Momjian wrote:

> > 3. The times required to run complex reports (for example those that
> > include LEFT JOINS) is very long (about 15 minutes to retrieve
> > 2850 rows).
>
> Yea, we need this too.  Not sure where we are with this.  Can you give
> an example?
Our problem is linked with using sub-reports, for now we solved this problem
using queries instead of sub-reports and it works well.
                                                          Thanks any way,
                                   Jose'