Thread: atttypmod now 32 bits, interface change
I was wrong.

atttypmod was passed as int16 to the clients. atttypmod is now passed as
int32. I have modified libpq to fix this. I think only odbc needs to
be changed for this. I know odbc is not maintained here, but is
uploaded from somewhere else. The maintainer needs to change this. The
other interfaces look OK.

--
Bruce Momjian                        |  830 Blythe Avenue
maillist@candle.pha.pa.us            |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,   |  (610) 353-9879(w)
  +  Christ can be your backup.      |  (610) 853-3000(h)
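[Editor's note: the width change can be illustrated with a small sketch. This is hypothetical Python, not the actual libpq code; it only assumes that, as in the FE/BE protocol, integers travel in network (big-endian) byte order. A typmod that overflows 16 bits only survives the trip in the new 32-bit field.]

```python
import struct

# Toy illustration of the wire-format change, not libpq source code:
# the FE/BE protocol carries integers in network (big-endian) byte order.

def pack_typmod_int16(typmod):
    """Old format: atttypmod on the wire as a signed 16-bit field."""
    return struct.pack("!h", typmod)

def pack_typmod_int32(typmod):
    """New format: atttypmod on the wire as a signed 32-bit field."""
    return struct.pack("!i", typmod)

# A small typmod round-trips through either width ...
assert struct.unpack("!h", pack_typmod_int16(104))[0] == 104
assert struct.unpack("!i", pack_typmod_int32(104))[0] == 104

# ... but one beyond 32767 fits only in the 32-bit field.
big = 100000
assert struct.unpack("!i", pack_typmod_int32(big))[0] == big
try:
    pack_typmod_int16(big)          # struct refuses the overflow
except struct.error:
    print("int16 cannot carry typmod", big)
```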
Not that we have been sitting on our hands, but we have been waiting
for the FE/BE protocol to stabilize before updating the ODBC driver to
the 6.4 specs. Have we reached this point?

Bruce Momjian wrote:
> I was wrong.
>
> atttypmod was passed as int16 to the clients. atttypmod is now passed as
> int32. I have modified libpq to fix this. I think only odbc needs to
> be changed for this. I know odbc is not maintained here, but is
> uploaded from somewhere else. The maintainer needs to change this. The
> other interfaces look OK.
>
> --
> Bruce Momjian                        |  830 Blythe Avenue
> maillist@candle.pha.pa.us            |  Drexel Hill, Pennsylvania 19026
>   +  If your life is a hard drive,   |  (610) 353-9879(w)
>   +  Christ can be your backup.      |  (610) 853-3000(h)
David Hartwig <daveh@insightdist.com> writes:
> Not that we have been sitting on our hands, but we have been waiting
> for the FE/BE protocol to stabilize before updating the ODBC driver
> to the 6.4 specs. Have we reached this point?

The cancel changeover and this atttypmod width business were the only
open issues I know about. I'm prepared to declare the protocol frozen
for 6.4 ... are there any objections?

			regards, tom lane
> Not that we have been sitting on our hands, but we have been waiting
> for the FE/BE protocol to stabilize before updating the ODBC driver
> to the 6.4 specs. Have we reached this point?

Of course, beta does not start until Sep 1, so it is possible to wait
some more to see if other things change before updating things, but
currently, there are no open items I know about.

--
Bruce Momjian                        |  830 Blythe Avenue
maillist@candle.pha.pa.us            |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,   |  (610) 353-9879(w)
  +  Christ can be your backup.      |  (610) 853-3000(h)
> Not that we have been sitting on our hands, but we have been waiting
> for the FE/BE protocol to stabilize before updating the ODBC driver
> to the 6.4 specs. Have we reached this point?

Good point. I totally agree. I think we have finally stabilized the
protocol, with the CANCEL completed last week by Tom Lane. As far as I
know, the libpq and sgml docs are updated, so you can use them to see
the changes. If you need details, I have kept some of Tom Lane's
postings.

--
Bruce Momjian                        |  830 Blythe Avenue
maillist@candle.pha.pa.us            |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,   |  (610) 353-9879(w)
  +  Christ can be your backup.      |  (610) 853-3000(h)
> The cancel changeover and this atttypmod width business were the only
> open issues I know about. I'm prepared to declare the protocol frozen
> for 6.4 ... are there any objections?

Sounds good. Should we ask Tatsuo to do some mixed-endian tests, or is
that area completely unchanged from v6.3?

                                      - Tom
> David Hartwig <daveh@insightdist.com> writes:
> > Not that we have been sitting on our hands, but we have been waiting
> > for the FE/BE protocol to stabilize before updating the ODBC driver
> > to the 6.4 specs. Have we reached this point?
>
> The cancel changeover and this atttypmod width business were the only
> open issues I know about. I'm prepared to declare the protocol frozen
> for 6.4 ... are there any objections?

I agree. We are done.

--
Bruce Momjian                        |  830 Blythe Avenue
maillist@candle.pha.pa.us            |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,   |  (610) 353-9879(w)
  +  Christ can be your backup.      |  (610) 853-3000(h)
> > The cancel changeover and this atttypmod width business were the only
> > open issues I know about. I'm prepared to declare the protocol frozen
> > for 6.4 ... are there any objections?
>
> Sounds good. Should we ask Tatsuo to do some mixed-endian tests, or is
> that area completely unchanged from v6.3?
>
>                                       - Tom

Unchanged, I think.

--
Bruce Momjian                        |  830 Blythe Avenue
maillist@candle.pha.pa.us            |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,   |  (610) 353-9879(w)
  +  Christ can be your backup.      |  (610) 853-3000(h)
"Thomas G. Lockhart" <lockhart@alumnus.caltech.edu> writes: > Should we ask Tatsuo to do some mixed-endian tests, or is > that area completely unchanged from v6.3? I don't think I broke anything in that regard ... but more testing is always a good thing. If Tatsuo-san can spare the time, it would be appreciated. regards, tom lane
At 11:32 AM 98.7.13 -0400, Tom Lane wrote:
>"Thomas G. Lockhart" <lockhart@alumnus.caltech.edu> writes:
>> Should we ask Tatsuo to do some mixed-endian tests, or is
>> that area completely unchanged from v6.3?
>
>I don't think I broke anything in that regard ... but more testing is
>always a good thing. If Tatsuo-san can spare the time, it would be
>appreciated.

Ok, I think I can start the testing next week. This week I'm too busy
because I have to finish writing an article on PostgreSQL for a
Japanese magazine!

By the way, what are the visible changes of 6.4? I know now we can
cancel a query. Could you tell me any other thing so that I could refer
to them in the article?
--
Tatsuo Ishii
t-ishii@sra.co.jp
At 11:32 AM 98.7.13 -0400, Tom Lane wrote:
>"Thomas G. Lockhart" <lockhart@alumnus.caltech.edu> writes:
>> Should we ask Tatsuo to do some mixed-endian tests, or is
>> that area completely unchanged from v6.3?
>
>I don't think I broke anything in that regard ... but more testing is
>always a good thing. If Tatsuo-san can spare the time, it would be
>appreciated.

>Ok, I think I can start the testing next week.
>This week I'm too busy because I have to finish writing an article
>on PostgreSQL for a Japanese magazine!
>By the way what are the visible changes of 6.4?
>I know now we can cancel a query. Could you tell me any other
>thing so that I could refer to them in the article?

---------------------------------------------------------------------------

Nothing big that I can think of. Lots of cleanup/improvements to
existing areas. Vadim has some big items (as usual), but I don't think
we want to mention them publicly yet.

--
Bruce Momjian                        |  830 Blythe Avenue
maillist@candle.pha.pa.us            |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,   |  (610) 353-9879(w)
  +  Christ can be your backup.      |  (610) 853-3000(h)
> >This week I'm too busy because I have to finish writing an article
> >on PostgreSQL for a Japanese magazine!
> >By the way what are the visible changes of 6.4?
> >I know now we can cancel a query. Could you tell me any other
> >thing so that I could refer to them in the article?
> Nothing big that I can think of. Lots of cleanup/improvements to
> existing areas.

Now Bruce! The automatic type coercion features are a pretty big
change, especially for the casual user; the columns in queries get
matched up and converted without any explicit work from the user. I can
give Tatsuo some examples if he would like. I'll bet there are a few
other changes which would give readers a good idea about the ongoing
support and improvements to Postgres...

Speaking of docs, we'll have SQL and utility commands in an
html/hardcopy reference manual. Hmm, may not be as exciting for
Japanese readers, but... :)

I've been updating the old release notes in the sgml sources, and have
that completed. Perhaps we can start the v6.4 release notes now? With
the sgml sources we can have more summary verbiage to help users get
introduced to new features, and then roll it out into a text file if
necessary.

                                      - Tom
> > >This week I'm too busy because I have to finish writing an article
> > >on PostgreSQL for a Japanese magazine!
> > >By the way what are the visible changes of 6.4?
> > >I know now we can cancel a query. Could you tell me any other
> > >thing so that I could refer to them in the article?
> > Nothing big that I can think of. Lots of cleanup/improvements to
> > existing areas.
>
> Now Bruce! The automatic type coercion features are a pretty big change,
> especially for the casual user; the columns in queries get matched up
> and converted without any explicit work from the user. I can give Tatsuo
> some examples if he would like. I'll bet there are a few other changes
> which would give readers a good idea about the ongoing support and
> improvements to Postgres...
>
> Speaking of docs, we'll have SQL and utility commands in an
> html/hardcopy reference manual. Hmm, may not be as exciting for Japanese
> readers, but... :)
>
> I've been updating the old release notes in the sgml sources, and have
> that completed. Perhaps we can start the v6.4 release notes now? With
> the sgml sources we can have more summary verbiage to help users get
> introduced to new features, and then roll it out into a text file if
> necessary.

I was afraid I was going to insult someone by saying what I did.

I MEANT that there are no features being added that a non-postgresql
user would be interested in. subselects was one feature that
non-postgresql users would understand. Most of our stuff now is
cleanup/extension of 6.3 features, many of which would be uninteresting
to potential users.

I suggest we focus on telling them about 6.3, which is ready NOW, and
has many nice features. In fact, since we started two years ago, every
release has gotten much better than the previous one, so we are now at
a point where we are adding 'nifty' features like 'cancel' and
atttypmod and stuff like that.

The days where every release fixed server crashes, or added a feature
that users were 'screaming for', may be a thing of the past. We are
nearing a maturity stage, where we can focus on performance,
documentation, features, and cleanup. The days when we have a 'major'
feature may be fewer, because we have added 'most' of the major
features people have been asking for.

Our user base is growing, and the number of sophisticated developers is
growing too, so we are getting major patches to improve lots of
existing functionality.

--
Bruce Momjian                        |  830 Blythe Avenue
maillist@candle.pha.pa.us            |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,   |  (610) 353-9879(w)
  +  Christ can be your backup.      |  (610) 853-3000(h)
At 18:37 +0300 on 14/7/98, Bruce Momjian wrote:

> The days where every release fixed server crashes, or added a feature
> that users were 'screaming for' may be a thing of the past. We are
> nearing a maturity stage, where we can focus on performance,
> documentation, features, and cleanup. The days when we have a 'major'
> feature may be fewer, because we have added 'most' of the major features
> people have been asking for.

Except row-level locking, referential integrity and PL/SQL... Just an
example of major features yet to be implemented (speaking from the
point of view of a user who doesn't know what the plans are for 6.4, of
course).

Herouth

(PS. This thread doesn't really have anything to do with the interfaces
list, does it? I redirected the crosspost to "general".)

--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma
> At 18:37 +0300 on 14/7/98, Bruce Momjian wrote:
>
> > The days where every release fixed server crashes, or added a feature
> > that users were 'screaming for' may be a thing of the past. We are
> > nearing a maturity stage, where we can focus on performance,
> > documentation, features, and cleanup. The days when we have a 'major'
> > feature may be fewer, because we have added 'most' of the major features
> > people have been asking for.
>
> Except row-level locking, referential integrity and PL/SQL...

I said the days would be fewer, not gone.

--
Bruce Momjian                        |  830 Blythe Avenue
maillist@candle.pha.pa.us            |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,   |  (610) 353-9879(w)
  +  Christ can be your backup.      |  (610) 853-3000(h)
Bruce Momjian wrote:
>
> I was afraid I was going to insult someone by saying what I did.
>
> I MEANT that there are no features being added that a non-postgresql
> user would be interested in. subselects was one feature that
> non-postgresql users would understand. Most of our stuff now is
> cleanup/extension of 6.3 features, many of which would be uninteresting
> to potential users.

Not requiring the column to sort on in the target list is also quite
important.

As are the (still elementary) constraints; still elementary because
there is no way to use functions or "is null" in a check constraint,
and constraints can be used only when defining tables, not in the
"alter table" construct.

> The days where every release fixed server crashes, or added a feature
> that users were 'screaming for' may be a thing of the past.

Is anyone working on fixing the exploding optimisations for many OR-s,
at least the canonic case used by Access?

My impression is that this has fallen somewhere between insightdist
and Vadim.

> We are nearing a maturity stage, where we can focus on performance,
> documentation, features, and cleanup. The days when we have a 'major'
> feature may be fewer, because we have added 'most' of the major features
> people have been asking for.

Expect them asking more soon ;)

I'm sure that soon being just basic ANSI SQL compliant is not enough;
people will want (in no particular order ;):
 * ANSI CLI,
 * updatable cursors,
 * foreign key constraints,
 * distributed databases,
 * row level locking,
 * better inheritance,
 * domains,
 * isolation levels,
 * PL/SQL,
 * better optimisation for special cases,
 * temporary tables (both global and session level),
 * more SQL3 constructs,
 * unlisten command, maybe an argument to listen command,
 * better support for installing your own access methods,
 * separating the methods typinput/typoutput (native binary)
   and typreceive/typsend (wire binary); they are in pg_type,
 * implementing a new fe/be protocol that is easier to implement
   (does not mix zero-terminated and count-prefixed chunks),
   preferably modelled after the X Window protocol,
 * getting rid of the 8k limitations, both in the fe/be protocol and
   in disk storage.

I know that some of these things are being worked on, but I've lost
track of which are expected for 6.4, which for 6.5, and which I should
not expect before 8.0 ;)

> Our user base is growing, and the number of sophisticated developers is
> growing too, so we are getting major patches to improve lots of existing
> functionality.

Yep. A great future is awaiting PostgreSQL.

I'm really looking forward to a time when I can find some time to
contribute some actual code.

Hannu
> Bruce Momjian wrote:
> >
> > I was afraid I was going to insult someone by saying what I did.
> >
> > I MEANT that there are no features being added that a non-postgresql
> > user would be interested in. subselects was one feature that
> > non-postgresql users would understand. Most of our stuff now is
> > cleanup/extension of 6.3 features, many of which would be uninteresting
> > to potential users.
>
> Not requiring the column to sort on in the target list is also quite
> important.
>
> As are the (still elementary) constraints; still elementary because
> there is no way to use functions or "is null" in a check constraint,
> and constraints can be used only when defining tables, not in the
> "alter table" construct.
>
> > The days where every release fixed server crashes, or added a feature
> > that users were 'screaming for' may be a thing of the past.
>
> Is anyone working on fixing the exploding optimisations for many OR-s,
> at least the canonic case used by Access?
>
> My impression is that this has fallen somewhere between
> insightdist and Vadim.
>
> I'm sure that soon being just basic ANSI SQL compliant is not enough;
> people will want (in no particular order ;):
>  * ANSI CLI,
>  * updatable cursors,
>  * foreign key constraints,
>  * distributed databases,
>  * row level locking,
>  * better inheritance,
>  * domains,
>  * isolation levels,
>  * PL/SQL,
>  * better optimisation for special cases,
>  * temporary tables (both global and session level),
>  * more SQL3 constructs,
>  * unlisten command, maybe an argument to listen command,
>  * better support for installing your own access methods,
>  * separating the methods typinput/typoutput (native binary)
>    and typreceive/typsend (wire binary); they are in pg_type,
>  * implementing a new fe/be protocol that is easier to implement
>    (does not mix zero-terminated and count-prefixed chunks),
>    preferably modelled after the X Window protocol,
>  * getting rid of the 8k limitations, both in the fe/be protocol and
>    in disk storage.
>
> I know that some of these things are being worked on, but I've lost
> track of which are expected for 6.4, which for 6.5, and which I should
> not expect before 8.0 ;)
>
> Yep. A great future is awaiting PostgreSQL.
>
> I'm really looking forward to a time when I can find some time to
> contribute some actual code.
>
> Hannu

Hard to argue with this. There are more MAJOR things that I had
forgotten. Still, I will say that the things we are working on now are
more 'extras' than the stuff we were working on a year ago, which were
'usability' issues.

--
Bruce Momjian                        |  830 Blythe Avenue
maillist@candle.pha.pa.us            |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,   |  (610) 353-9879(w)
  +  Christ can be your backup.      |  (610) 853-3000(h)
Hannu Krosing wrote:
>
> Not requiring the column to sort on in target list is also quite
> important.

I'm not sure, but isn't this already in 6.4-current?

> As are the (still elementary) constraints, still elementary because
> there is no way to use functions or "is null" in check constraint,

ispas=> create table t (x int, check (x is null or x = 5));
CREATE
ispas=> insert into t values (1);
ERROR:  ExecAppend: rejected due to CHECK constraint $1
ispas=> insert into t values (null);
INSERT 168212 1
ispas=> insert into t values (5);
INSERT 168213 1

And I'm sure that functions are supported too. This is 6.3.2.

> and constraints can be used only when defining tables, not in
> "alter table" construct.

I hadn't time to do this when implementing, and have no plans to do
this. In the "near" future :)

> > The days where every release fixed server crashes, or added a feature
> > that users were 'screaming for' may be a thing of the past.
>
> Is anyone working on fixing the exploding optimisations for many OR-s,
> at least the canonic case used by Access?
>
> My impression is that this has fallen somewhere between
> insightdist and Vadim.

I'm not working...

Vadim
> > > The days where every release fixed server crashes, or added a feature
> > > that users were 'screaming for' may be a thing of the past.
> >
> > Is anyone working on fixing the exploding optimisations for many OR-s,
> > at least the canonic case used by Access?
> >
> > My impression is that this has fallen somewhere between
> > insightdist and Vadim.
>
> I'm not working...

Not sure anyone has an idea how to fix this.

--
Bruce Momjian                        |  830 Blythe Avenue
maillist@candle.pha.pa.us            |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,   |  (610) 353-9879(w)
  +  Christ can be your backup.      |  (610) 853-3000(h)
Hannu Krosing wrote:
> Bruce Momjian wrote:
> >
> > I was afraid I was going to insult someone by saying what I did.
> >
> > I MEANT that there are no features being added that a non-postgresql
> > user would be interested in. subselects was one feature that
> > non-postgresql users would understand. Most of our stuff now is
> > cleanup/extension of 6.3 features, many of which would be uninteresting
> > to potential users.
>
> Not requiring the column to sort on in target list is also quite
> important.

Along these lines - I heard someone grumbling a while back about not
being able to use a function in the ORDER/GROUP BY clauses. (i.e.
SELECT bar FROM foo ORDER BY LCASE(alpha);) I believe it is on the
TODO list. Bruce, I will claim this item unless someone else already
has.
> Along these lines - I heard someone grumbling a while back about not
> being able to use a function in the ORDER/GROUP BY clauses. (i.e.
> SELECT bar FROM foo ORDER BY LCASE(alpha);) I believe it is on the
> TODO list. Bruce, I will claim this item unless someone else already
> has.

Done.

--
Bruce Momjian                        |  830 Blythe Avenue
maillist@candle.pha.pa.us            |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,   |  (610) 353-9879(w)
  +  Christ can be your backup.      |  (610) 853-3000(h)
Hannu Krosing wrote:
> > The days where every release fixed server crashes, or added a feature
> > that users were 'screaming for' may be a thing of the past.
>
> Is anyone working on fixing the exploding optimisations for many OR-s,
> at least the canonic case used by Access?
>
> My impression is that this has fallen somewhere between
> insightdist and Vadim.

This is really big for the ODBCers. (And I suspect for JDBCers too.)
Many desktop libraries and end-user tools depend on this "record set"
strategy to operate effectively.

I have put together a workable hack that runs just before cnfify().
The option is activated through the SET command. Once activated, it
identifies queries with this particular multi-OR pattern generated by
these RECORD SET strategies. Qualified query trees are rewritten as
multiple UNIONs. (One for each OR grouping.)

The results are profound. Queries that used to scan tables because of
the ORs now make use of any indexes. Thus, the size of the table has
virtually no effect on performance. Furthermore, queries that used to
crash the backend now run in under a second.

Currently the down sides are:

1. If there is no usable index, performance is significantly worse. The
   patch does not check to make sure that there is a usable index. I
   could use some pointers on this.

2. Small tables are actually a bit slower than without the patch.

3. Not very elegant. I am looking for a more generalized solution.

I have lots of ideas, but I would need to know the backend much better
before attempting any of them. My favorite idea is, before cnfify(), to
factor the OR terms and pull the constants out into a virtual
(temporary) table space. Then rewrite the query as a join. The
optimizer will (should) treat the new query accordingly. This assumes
that an efficient factoring algorithm exists and that temporary tables
can exist in the heap.

Illustration:

    SELECT ...
    FROM tab
    WHERE (var1 = const1 AND var2 = const2) OR
          (var1 = const3 AND var2 = const4) OR
          (var1 = const5 AND var2 = const6)

becomes:

    SELECT ...
    FROM tab, tmp
    WHERE (var1 = var_x AND var2 = var_y)

    tmp
    var_x | var_y
    --------------
    const1|const2
    const3|const4
    const5|const6

Comments?
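[Editor's note: the factoring idea can be sanity-checked outside the backend with a toy sketch. This is hypothetical Python over plain tuples, not the proposed patch; it only shows that the OR-of-ANDs filter and the join against a constant-pair table select exactly the same rows.]

```python
# Toy model of the proposed rewrite, not backend code: "tab" is a list of
# tuples, and the constant pairs play the role of the virtual "tmp" table.
rows = [(1, 2, "a"), (3, 4, "b"), (5, 6, "c"), (7, 8, "d")]

def or_scan(rows, pairs):
    """The record-set query: (var1=c1 AND var2=c2) OR (var1=c3 AND ...)."""
    return [r for r in rows
            if any(r[0] == x and r[1] == y for (x, y) in pairs)]

def join_scan(rows, pairs):
    """The rewrite: join tab against tmp(var_x, var_y) on both columns."""
    tmp = set(pairs)                 # the virtual constant table
    return [r for r in rows if (r[0], r[1]) in tmp]

pairs = [(1, 2), (5, 6), (9, 9)]     # last pair matches nothing
assert or_scan(rows, pairs) == join_scan(rows, pairs) == \
    [(1, 2, "a"), (5, 6, "c")]
```

The win in the real backend would come from the join form being able to use an index on (var1, var2) instead of a sequential scan per OR arm.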
> The results are profound. Queries that used to scan tables because of
> the ORs now make use of any indexes. Thus, the size of the table has
> virtually no effect on performance. Furthermore, queries that used to
> crash the backend now run in under a second.
>
> Currently the down sides are:
>
> 1. If there is no usable index, performance is significantly worse. The
>    patch does not check to make sure that there is a usable index. I
>    could use some pointers on this.
>
> 2. Small tables are actually a bit slower than without the patch.
>
> 3. Not very elegant. I am looking for a more generalized solution.
>
> I have lots of ideas, but I would need to know the backend much better
> before attempting any of them. My favorite idea is, before cnfify(), to
> factor the OR terms and pull the constants out into a virtual
> (temporary) table space. Then rewrite the query as a join. The
> optimizer will (should) treat the new query accordingly. This assumes
> that an efficient factoring algorithm exists and that temporary tables
> can exist in the heap.

OK, I have an idea. Just today, we allow:

	select *
	from tab1
	where val in (
		select x from tab2
		union
		select y from tab3
	)

How about if instead of doing:

	select * from tab1 where val = 3
	union
	select * from tab1 where val = 4
	...

you change it to:

	select * from tab1 where val in (
		select 3
		union
		select 4
	)

This may be a big win. You aren't running the same query over and over
again, with the same joins, and just a different constant. Let me know.
If it fails for some reason, it is possible my subselect union code has
a bug, so let me know.

--
Bruce Momjian                        |  830 Blythe Avenue
maillist@candle.pha.pa.us            |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,   |  (610) 353-9879(w)
  +  Christ can be your backup.      |  (610) 853-3000(h)
> > The results are profound. Queries that used to scan tables because of the

> OK, I have an idea. Just today, we allow:
>
> 	select *
> 	from tab1
> 	where val in (
> 		select x from tab2
> 		union
> 		select y from tab3
> 	)
>
> How about if instead of doing:
>
> 	select * from tab1 where val = 3
> 	union
> 	select * from tab1 where val = 4
> 	...
>
> you change it to:
>
> 	select * from tab1 where val in (
> 		select 3
> 		union
> 		select 4
> 	)

OK, I just ran some tests, and it does not look good:

---------------------------------------------------------------------------

son_db=> explain select mmatter from matter where mmatter = 'A01-001';
NOTICE:  QUERY PLAN:

Index Scan using i_matt2 on matter  (cost=2.05 size=1 width=12)

EXPLAIN
son_db=> explain select mmatter from matter where mmatter in (select 'A01-001');
NOTICE:  QUERY PLAN:

Seq Scan on matter  (cost=512.20 size=1001 width=12)
  SubPlan
    ->  Result  (cost=0.00 size=0 width=0)

EXPLAIN

---------------------------------------------------------------------------

Turns out indexes are not used in outer queries of subselects. Not sure
why. Vadim?

--
Bruce Momjian                        |  830 Blythe Avenue
maillist@candle.pha.pa.us            |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,   |  (610) 353-9879(w)
  +  Christ can be your backup.      |  (610) 853-3000(h)
On Wed, 15 Jul 1998, Bruce Momjian wrote:

> > > > The days where every release fixed server crashes, or added a feature
> > > > that users were 'screaming for' may be a thing of the past.
> > >
> > > Is anyone working on fixing the exploding optimisations for many OR-s,
> > > at least the canonic case used by Access?
> > >
> > > My impression is that this has fallen somewhere between
> > > insightdist and Vadim.
> >
> > I'm not working...
>
> Not sure anyone has an idea how to fix this.

What? How to get Vadim back to work? ;)

Maarten

_____________________________________________________________________________
| TU Delft, The Netherlands, Faculty of Information Technology and Systems  |
|               Department of Electrical Engineering                        |
|           Computer Architecture and Digital Technique section             |
|                      M.Boekhold@et.tudelft.nl                             |
-----------------------------------------------------------------------------
Vadim Mikheev wrote:
>
> Hannu Krosing wrote:
> >
> > Not requiring the column to sort on in target list is also quite
> > important.
>
> I'm not sure but isn't this already in 6.4-current ?
>
> > As are the (still elementary) constraints, still elementary because
> > there is no way to use functions or "is null" in check constraint,
>
> ispas=> create table t (x int, check (x is null or x = 5));
> CREATE
> ispas=> insert into t values (1);
> ERROR:  ExecAppend: rejected due to CHECK constraint $1
> ispas=> insert into t values (null);
> INSERT 168212 1
> ispas=> insert into t values (5);
> INSERT 168213 1
>
> And I'm sure that functions are supported too. This is 6.3.2

Sorry, I tried the wrong syntax (without IS) ;(

but functions still don't work:

hannu=> create table test1 (a text, b text,
hannu-> check (trim(a) <> '' or trim(b) <> ''));
ERROR:  parser: parse error at or near "trim"

If I use a non-existing function, I get a different answer:

hannu=> create table test1 (a text, b text,
hannu-> check (strip(a) <> '' or strip(b) <> ''));
ERROR:  function strip(text) does not exist

So it can't be just a "parser" error.

> > and constraints can be used only when defining tables, not in
> > "alter table" construct.
>
> I hadn't time to do this when implementing and have no plans
> to do this. In "near" future :)
>
> > > The days where every release fixed server crashes, or added a feature
> > > that users were 'screaming for' may be a thing of the past.
> >
> > Is anyone working on fixing the exploding optimisations for many OR-s,
> > at least the canonic case used by Access?
> >
> > My impression is that this has fallen somewhere between
> > insightdist and Vadim.
>
> I'm not working...

Are you after some general solution, or are you first implementing the
'rewrite to union' way?

Hannu
Bruce Momjian wrote:
>
> > The results are profound. Queries that used to scan tables because of the
>
> How about if instead of doing:
>
> 	select * from tab1 where val = 3
> 	union
> 	select * from tab1 where val = 4
> 	...
>
> you change it to:
>
> 	select * from tab1 where val in (
> 		select 3
> 		union
> 		select 4
> 	)

The explosion happens for ORs of multiple ANDs that get rewritten to:

select * from tab1 where val1=1 and val2=1 and val3=1
union
select * from tab1 where val1=1 and val2=1 and val3=2
union
...

And there is no way of doing (at least presently):

select * from table where (val1,val2,val3)
in (select 1,1,1 union select 1,1,2);

Hannu
Hannu Krosing wrote:
>
> but functions still don't work:
>
> hannu=> create table test1 (a text, b text,
> hannu-> check (trim(a) <> '' or trim(b) <> ''));
> ERROR:  parser: parse error at or near "trim"

TRIM is a keyword, not a function...
We have to copy some lines in gram.y

Real functions are working...

Vadim
hi, guys!

It seems to me that two or three weeks ago there were some messages
about porting libpq to the Win32 platform. I think it is a very
important feature and it should without a doubt be mentioned in all
reviews of PostgreSQL, 'cause it moves PostgreSQL far beyond any other
free DB engines in the world of Windowz.

Al.
Vadim Mikheev wrote:
>
> Hannu Krosing wrote:
> >
> > but functions still don't work:
> >
> > hannu=> create table test1 (a text, b text,
> > hannu-> check (trim(a) <> '' or trim(b) <> ''));
> > ERROR:  parser: parse error at or near "trim"
>
> TRIM is a keyword, not a function...
> We have to copy some lines in gram.y

Wow! is this standard?

I found the function trim by doing 'select oprname from pg_oper' and
tested it as follows:

hannu=> select trim(' x ');
btrim
-----
x
(1 row)

why is the column called btrim?
some rewrite magic in the parser?

If it must stay a keyword, then perhaps we should remove the proc?

> Real functions are working...

yep! Thanks:

create table test2(a text,b text,
check (btrim(a) <> '' or btrim(b) <> ''));

does work ;)

Hannu
> > TRIM is keyword, not a function...
> > We have to copy some lines in gram.y
> Wow! is this standard ?
> I found the function trim by doing 'select oprname from pg_oper'
> and tested it as follows:
>
> hannu=> select trim(' x ');
> btrim
> -----
> x
> (1 row)
>
> why is the column called btrim ?
> some rewrite magic in parser ?
> If it must stay a keyword, then perhaps we should remove the proc ?

Uh, yes, I think you are right. Here's why:

The SQL92 syntax for the trim() function is as follows:

  TRIM([LEADING|TRAILING|BOTH] [char FROM] string)

This syntax is _not_ the clean "function(arg1,arg2,...)" syntax that
the parser could handle without change, so I had to make TRIM a keyword
in the parser and explicitly decode the possible argument phrases. To
implement all the possibilities, I transform the function in the parser
to the functions btrim(), rtrim(), and ltrim() implemented earlier by
Edmund Mergl as the "Oracle compatibility functions".

I'll add TRIM() and the other goofy pseudo-functions to the CHECK
syntax, and take the trim(arg1) declaration out of pg_proc since it can
never get executed.

Oh, btw, we allow trimming strings from strings, not just trimming
chars from strings :)

                                      - Tom
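[Editor's note: the mapping Tom describes can be sketched in a few lines. This is hypothetical Python standing in for the gram.y transform, not the actual parser code; it shows how the three argument phrases select one of the Oracle-compatibility functions, which is why `select trim(' x ')` comes back with a column labelled btrim.]

```python
# Toy model of the parser transform, not gram.y: each TRIM() argument
# phrase is routed to one of the Oracle-compatibility functions.
def sql_trim(string, chars=" ", spec="BOTH"):
    if spec == "LEADING":
        return string.lstrip(chars)      # becomes ltrim(string, chars)
    if spec == "TRAILING":
        return string.rstrip(chars)      # becomes rtrim(string, chars)
    return string.strip(chars)           # default BOTH becomes btrim()

assert sql_trim("  x  ") == "x"                        # TRIM('  x  ')
assert sql_trim("xxabcxx", "x", "LEADING") == "abcxx"  # TRIM(LEADING 'x' FROM ...)
assert sql_trim("xxabcxx", "x", "TRAILING") == "xxabc"
assert sql_trim("xxabcxx", "x") == "abc"
```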
> And there is no way of doing (at least presently): > > select * from table where (val1,val2,val3) > in (select 1,1,1 union select 1,1,2); I'll look at that... - Tom
> hi, guys! > > > It seems to me that two or three weeks ago there were some messages about > porting libpq for Win32 platform. I think it is very imporant feature and > it should be mentioned with no doubts in all reviews about PostgreSQL > 'cause it moved PostgreSQL far beyond any other free DB engeens in the Already done in the current snapshot on ftp.postgresql.org. -- Bruce Momjian | 830 Blythe Avenue maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026 + If your life is a hard drive, | (610) 353-9879(w) + Christ can be your backup. | (610) 853-3000(h)
> > > hannu-> check (trim(a) <> '' or trim(b) <> '')); > > > ERROR: parser: parse error at or near "trim" > > > > TRIM is keyword, not a function... > > We have to copy some lines in gram.y I think that having trim as a keyword is a problem. The primary virtue of postgres is that everything is either a function or a type and as such is definable by the user and resolved at runtime. Making a keyword out of a function spoils that capability. -dg David Gould dg@illustra.com 510.628.3783 or 510.305.9468 Informix Software (No, really) 300 Lakeside Drive Oakland, CA 94612 - If simplicity worked, the world would be overrun with insects. -
> > > > hannu-> check (trim(a) <> '' or trim(b) <> '')); > > > > ERROR: parser: parse error at or near "trim" > > > > > > TRIM is keyword, not a function... > > > We have to copy some lines in gram.y > > I think that having trim as a keyword is a problem. The primary virtue of > postgres is that everything is either a function or a type and as such is > definable by the user and resolved at runtime. Making a keyword out of a > function spoils that capability. The problem was that the SQL standard syntax (and Oracle's) did not allow it to be a function. -- Bruce Momjian | 830 Blythe Avenue maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026 + If your life is a hard drive, | (610) 353-9879(w) + Christ can be your backup. | (610) 853-3000(h)
On Thu, 16 Jul 1998, Aleksey Dashevsky wrote: > It seems to me that two or three weeks ago there were some messages about > porting libpq for Win32 platform. I think it is very imporant feature and > it should be mentioned with no doubts in all reviews about PostgreSQL > 'cause it moved PostgreSQL far beyond any other free DB engeens in the > world of Windowz I'd thought that the ODBC driver would have more of an impact with Win32 than porting libpq, especially with existing applications. -- Peter T Mount peter@retep.org.uk or petermount@earthling.net Main Homepage: http://www.retep.org.uk ************ Someday I may rebuild this signature completely ;-) ************ Work Homepage: http://www.maidstone.gov.uk Work EMail: peter@maidstone.gov.uk
On Wed, 15 Jul 1998, David Hartwig wrote: > This is really big for the ODBCers. (And I suspect for JDBCers too.) Many > desktop libraries and end-user tools depend on this "record set" strategy to > operate effectively. Although I haven't seen what they produce, it is possible that JBuilder and others do have this effect with JDBC. However, not all JDBC applications have this problem. In fact, the majority I've seen only produce much simpler queries. -- Peter T Mount peter@retep.org.uk or petermount@earthling.net Main Homepage: http://www.retep.org.uk ************ Someday I may rebuild this signature completely ;-) ************ Work Homepage: http://www.maidstone.gov.uk Work EMail: peter@maidstone.gov.uk
Thomas G. Lockhart wrote: > > > And there is no way of doing (at least presently): > > > > select * from table where (val1,val2,val3) > > in (select 1,1,1 union select 1,1,2); > > I'll look at that... Could it be a good idea to have the syntax (at least for constants) changed to (or at least allow ;) the following: select * from table where (val1,val2,val3) in ( (1,1,3), (1,1,2), (1,1,1) ); Which brings us to another issue: Should (val1,val2,val3) be just some construct that gets rewritten to "something else" in the parser, or should it create an instance of an anonymous row type ? Allowing anonymous row type creation on the fly would allow us many nice things, for example a way to create new types of aggregate functions, like FOR_MAX((price,date)), so that we could do the following in only one pass SELECT FOR_MAX((price,sales_datetime)) as last_price, MAX(sales_datetime) as last_sale, WEEK(sales_datetime) week_nr GROUP BY week_nr ; This would get the prices and dates of each week's last sale, and is much hairier to do using just standard SQL. Hannu
>At 11:32 AM 98.7.13 -0400, Tom Lane wrote: >>"Thomas G. Lockhart" <lockhart@alumnus.caltech.edu> writes: >>> Should we ask Tatsuo to do some mixed-endian tests, or is >>> that area completely unchanged from v6.3? >> >>I don't think I broke anything in that regard ... but more testing is >>always a good thing. If Tatsuo-san can spare the time, it would be >>appreciated. > >Ok, I think I can start the testing next week. I did some cross-platform testing today against 7/18 snapshot. The platforms tested are: 1. Sparc/Solaris 2.6 2. PowerPC/Linux 3. x86/FreeBSD 4. x86/Linux They worked fine! P.S. I noticed that 6.4 client did not talk to 6.3.2 server. [srapc451.sra.co.jp]t-ishii{157} Connection to database 'test' failed. Unsupported frontend protocol.[srapc451.sra.co.jp]t-ishii{158} I thought we had kept the "backward compatibility" since we introduced "protocol version" in libpq?
> >At 11:32 AM 98.7.13 -0400, Tom Lane wrote: > >>"Thomas G. Lockhart" <lockhart@alumnus.caltech.edu> writes: > >>> Should we ask Tatsuo to do some mixed-endian tests, or is > >>> that area completely unchanged from v6.3? > >> > >>I don't think I broke anything in that regard ... but more testing is > >>always a good thing. If Tatsuo-san can spare the time, it would be > >>appreciated. > > > >Ok, I think I can start the testing next week. > > I did some cross-platform testing today against 7/18 snapshot. The > platforms tested are: > > 1. Sparc/Solaris 2.6 > 2. PowerPC/Linux > 3. x86/FreeBSD > 4. x86/Linux > > They workd fine! > > P.S. > I noticed that 6.4 client did not talk to 6.3.2 server. > > [srapc451.sra.co.jp]t-ishii{157} > > > > Connection to database 'test' failed. > Unsupported frontend protocol.[srapc451.sra.co.jp]t-ishii{158} > > I thought that we have kept the "backward compatibility" since we > introduced "protocol version" in libpq? Might be my atttypmod changes. I did not make those version-sensitive. I will do that now. However, the protocol version number thing looks like something more fundamental. -- Bruce Momjian | 830 Blythe Avenue maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026 + If your life is a hard drive, | (610) 353-9879(w) + Christ can be your backup. | (610) 853-3000(h)
Re: [INTERFACES] Re: [HACKERS] atttypmod now 32 bits, interface change
From: "Thomas G. Lockhart"
> I did some cross-platform testing today against 7/18 snapshot. > They workd fine! Great. Thanks Tatsuo. - Tom
t-ishii@sra.co.jp writes: >> I noticed that 6.4 client did not talk to 6.3.2 server. >> Connection to database 'test' failed. >> Unsupported frontend protocol. >> >> I thought that we have kept the "backward compatibility" since we >> introduced "protocol version" in libpq? Backwards compatibility yes: a 6.4 server should be able to talk to an old client. You're asking about cross-version compatibility in the other direction, which is something we don't have. The connection protocol is designed to let the server accommodate to the client, not vice versa --- the client tells the server its version, but not vice versa. I suppose the client might check for that particular error message after a connect failure and then try again with a lower version number ... but that's pretty messy. On a practical level, the new libpq is not capable of talking to an old server anyway --- some of the cleanups I made are critically dependent on new protocol features, such as the 'Z' (ReadyForQuery) message. Bruce Momjian <maillist@candle.pha.pa.us> writes: > Might be my atttypmod changes. I did not make those version-sensitive. > I will do that now. Yes, if we want to have backward compatibility as I just defined it, then the backend will have to send atttypmod as either 2 or 4 bytes depending on ProtocolVersion. Shouldn't be too hard. But I'm concerned that you and I both missed that initially. We had better actually test that the current backend sources will work with a 6.3.2-release frontend. regards, tom lane
> Bruce Momjian <maillist@candle.pha.pa.us> writes: > > Might be my atttypmod changes. I did not make those version-sensitive. > > I will do that now. > > Yes, if we want to have backward compatibility as I just defined it, > then the backend will have to send atttypmod as either 2 or 4 bytes > depending on ProtocolVersion. Shouldn't be too hard. But I'm concerned > that you and I both missed that initially. We had better actually test > that the current backend sources will work with a 6.3.2-release frontend. Already done. We never passed atttypmod to the frontend before 6.4, so the change is just to pass it or not pass it, and Tom already did that. The fact that the internal length was 2 and is now 4 is not relevant because we never passed it to the frontend in the past. if (PG_PROTOCOL_MAJOR(FrontendProtocol) >= 2) pq_putint(attrs[i]->atttypmod, sizeof(attrs[i]->atttypmod... -- Bruce Momjian | 830 Blythe Avenue maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026 + If your life is a hard drive, | (610) 353-9879(w) + Christ can be your backup. | (610) 853-3000(h)
Bruce Momjian <maillist@candle.pha.pa.us> writes: > Already done. We never passed atttypmod to the backend before 6.4, so > the change it just to pass it or not pass it, and Tom already did that. > The fact that the internal length was 2 and is not 4 is not relevant > because we never passed it to the frontend in the past. Ah, right. Should check the code before opining that it's wrong ;-) regards, tom lane
> > Hannu Krosing wrote: > > > > The days where every release fixed server crashes, or added a feature > > > that users were 'screaming for' may be a thing of the past. > > > > Is anyone working on fixing the exploding optimisations for many OR-s, > > at least the canonic case used by access? > > > > My impression is that this has fallen somewhere between > > insightdist and Vadim. > > This is really big for the ODBCers. (And I suspect for JDBCers too.) Many > desktop libraries and end-user tools depend on this "record set" strategy to > operate effectively. > > I have put together a workable hack that runs just before cnfify(). The > option is activated through the SET command. Once activated, it identifies > queries with this particular multi-OR pattern generated by these RECORD SET > strategies. Qualified query trees are rewritten as multiple UNIONs. (One > for each OR grouping). > > The results are profound. Queries that used to scan tables because of the > ORs, now make use of any indexes. Thus, the size of the table has virtually > no effect on performance. Furthermore, queries that used to crash the > backend, now run in under a second. > > Currently the down sides are: > 1. If there is no usable index, performance is significantly worse. The > patch does not check to make sure that there is a usable index. I could use > some pointers on this. > > 2. Small tables are actually a bit slower than without the patch. > > 3. Not very elegant. I am looking for a more generalized solution. > I have lots of ideas, but I would need to know the backend much better before > attempting any of them. My favorite idea is before cnfify(), to factor the > OR terms and pull out the constants into a virtual (temporary) table spaces. > Then rewrite the query as a join. The optimizer will (should) treat the new > query accordingly. This assumes that an efficient factoring algorithm exists > and that temporary tables can exist in the heap. > > Illustration: > SELECT ... 
FROM tab WHERE > (var1 = const1 AND var2 = const2) OR > (var1 = const3 AND var2 = const4) OR > (var1 = const5 AND var2 = const6) > > SELECT ... FROM tab, tmp WHERE > (var1 = var_x AND var2 = var_y) > > tmp > var_x | var_y > -------------- > const1|const2 > const3|const4 > const5|const6 David, where are we on this? I know we have OR's using indexes. Do we still need to look at this as a fix, or are we OK? I have not gotten far enough in the optimizer to know how to fix the cnf'ify problem. -- Bruce Momjian | 830 Blythe Avenue maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026 + If your life is a hard drive, | (610) 353-9879(w) + Christ can be your backup. | (610) 853-3000(h)
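The quoted illustration above can be written out as actual statements. This is only a sketch of the idea under discussion, substituting literal values for const1..const6 and an ordinary scratch table for the proposed in-heap temporary table:

```sql
-- factor the constants out of the OR groups into a scratch table
CREATE TABLE tmp (var_x int4, var_y int4);
INSERT INTO tmp VALUES (1, 2);
INSERT INTO tmp VALUES (3, 4);
INSERT INTO tmp VALUES (5, 6);

-- the multi-OR query then becomes a plain two-table join,
-- which the optimizer can plan like any other join
SELECT tab.* FROM tab, tmp
WHERE tab.var1 = tmp.var_x AND tab.var2 = tmp.var_y;
```

The attraction is that the optimizer never sees the ORs at all, so cnf'ify has nothing to explode; the open questions raised in the thread (an efficient factoring algorithm, and true temporary tables in the heap) remain.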
Bruce Momjian wrote: > > > > Hannu Krosing wrote: > > > > > > The days where every release fixed server crashes, or added a feature > > > > that users were 'screaming for' may be a thing of the past. > > > > > > Is anyone working on fixing the exploding optimisations for many OR-s, > > > at least the canonic case used by access? > > > > > > My impression is that this has fallen somewhere between > > > insightdist and Vadim. > > > > This is really big for the ODBCers. (And I suspect for JDBCers too.) Many > > desktop libraries and end-user tools depend on this "record set" strategy to > > operate effectively. > > > > I have put together a workable hack that runs just before cnfify(). The > > option is activated through the SET command. Once activated, it identifies > > queries with this particular multi-OR pattern generated by these RECORD SET > > strategies. Qualified query trees are rewritten as multiple UNIONs. (One > > for each OR grouping). > > > > The results are profound. Queries that used to scan tables because of the > > ORs, now make use of any indexes. Thus, the size of the table has virtually > > no effect on performance. Furthermore, queries that used to crash the > > backend, now run in under a second. > > > > Currently the down sides are: > > 1. If there is no usable index, performance is significantly worse. The > > patch does not check to make sure that there is a usable index. I could use > > some pointers on this. > > > > 2. Small tables are actually a bit slower than without the patch. > > > > 3. Not very elegant. I am looking for a more generalized solution. > > I have lots of ideas, but I would need to know the backend much better before > > attempting any of them. My favorite idea is before cnfify(), to factor the > > OR terms and pull out the constants into a virtual (temporary) table spaces. > > Then rewrite the query as a join. The optimizer will (should) treat the new > > query accordingly. 
This assumes that an efficient factoring algorithm exists > > and that temporary tables can exist in the heap. > > > > Illustration: > > SELECT ... FROM tab WHERE > > (var1 = const1 AND var2 = const2) OR > > (var1 = const3 AND var2 = const4) OR > > (var1 = const5 AND var2 = const6) > > > > SELECT ... FROM tab, tmp WHERE > > (var1 = var_x AND var2 = var_y) > > > > tmp > > var_x | var_y > > -------------- > > const1|const2 > > const3|const4 > > const5|const6 > > David, where are we on this? I know we have OR's using indexes. Do we > > still need to look this as a fix, or are we OK. I have not gotten far > > enough in the optimizer to know how to fix the Bruce, If the question is, have I come up with a solution for the cnf'ify problem: No If the question is, is it still important: Very much yes. It is essential for many RAD tools using remote data objects which make use of key sets. Your recent optimization of the OR list goes a long way, but inevitably users are confronted with multi-part keys. When I look at the problem my head spins. I do not have the experience (yet?) with the backend to be mucking around in the optimizer. As I see it, cnf'ify is doing just what it is supposed to do. Boundless boolean logic. I think hope may lie, though, in identifying each AND'ed group associated with a key and tagging it as a special sub-root node which cnf'ify does not penetrate. This node would be allowed to pass to the later stages of the optimizer where it will be used to plan index scans. Easy for me to say. In the meantime, I still have the patch that I described in prior email. It has worked well for us. Let me restate that. We could not survive without it! However, I do not feel that is a sufficiently functional approach that should be incorporated as a final solution. I will submit the patch if you (or anyone else) do not come up with a better solution. It is coded to be activated by a SET KSQO to minimize its reach.
> > > Bruce Momjian wrote: > > > > > > > Hannu Krosing wrote: > > > > > > > > The days where every release fixed server crashes, or added a feature > > > > > that users were 'screaming for' may be a thing of the past. > > > > > > > > Is anyone working on fixing the exploding optimisations for many OR-s, > > > > at least the canonic case used by access? > > > > > > > > My impression is that this has fallen somewhere between > > > > insightdist and Vadim. > > > > > > This is really big for the ODBCers. (And I suspect for JDBCers too.) Many > > > desktop libraries and end-user tools depend on this "record set" strategy to > > > operate effectively. > > > > > > I have put together a workable hack that runs just before cnfify(). The > > > option is activated through the SET command. Once activated, it identifies > > > queries with this particular multi-OR pattern generated by these RECORD SET > > > strategies. Qualified query trees are rewritten as multiple UNIONs. (One > > > for each OR grouping). > > > > > > The results are profound. Queries that used to scan tables because of the > > > ORs, now make use of any indexes. Thus, the size of the table has virtually > > > no effect on performance. Furthermore, queries that used to crash the > > > backend, now run in under a second. > > > > > > Currently the down sides are: > > > 1. If there is no usable index, performance is significantly worse. The > > > patch does not check to make sure that there is a usable index. I could use > > > some pointers on this. > > > > > > 2. Small tables are actually a bit slower than without the patch. > > > > > > 3. Not very elegant. I am looking for a more generalized solution. > > > I have lots of ideas, but I would need to know the backend much better before > > > attempting any of them. My favorite idea is before cnfify(), to factor the > > > OR terms and pull out the constants into a virtual (temporary) table spaces. > > > Then rewrite the query as a join. 
The optimizer will (should) treat the new > > > query accordingly. This assumes that an efficient factoring algorithm exists > > > and that temporary tables can exist in the heap. > > > > > > Illustration: > > > SELECT ... FROM tab WHERE > > > (var1 = const1 AND var2 = const2) OR > > > (var1 = const3 AND var2 = const4) OR > > > (var1 = const5 AND var2 = const6) > > > > > > SELECT ... FROM tab, tmp WHERE > > > (var1 = var_x AND var2 = var_y) > > > > > > tmp > > > var_x | var_y > > > -------------- > > > const1|const2 > > > const3|const4 > > > const5|const6 > > > > David, where are we on this? I know we have OR's using indexes. Do we > > still need to look this as a fix, or are we OK. I have not gotten far > > enough in the optimizer to know how to fix the > > Bruce, > > If the question is, have I come up with a solution for the cnf'ify problem: No > > If the question is, is it still important: Very much yes. > > It is essential for many RAD tools using remote data objects which make use of key > sets. Your recent optimization of the OR list goes a long way, but inevitably > users are confronted with multi-part keys. > > When I look at the problem my head spins. I do not have the experience (yet?) > with the backend to be mucking around in the optimizer. As I see it, cnf'ify is > doing just what it is supposed to do. Boundless boolean logic. > > I think hope may lay though, in identifying each AND'ed group associated with a key > and tagging it as a special sub-root node which cnf'ify does not penetrate. This > node would be allowed to pass to the later stages of the optimizer where it will be > used to plan index scans. Easy for me to say. > > In the meantime, I still have the patch that I described in prior email. It has > worked well for us. Let me restate that. We could not survive without it! > However, I do not feel that is a sufficiently functional approach that should be > incorporated as a final solution. 
I will submit the patch if you, (anyone) does > not come up with a better solution. It is coded to be activated by a SET KSQO to > minimize its reach. > > OK, let me try this one. Why is the system cnf'ifying the query. Because it wants to have a list of qualifications that are AND'ed, so it can just pick the most restrictive/cheapest, and evaluate that one first. If you have: (a=b and c=d) or e=1 In this case, without cnf'ify, it has to evaluate both of them, because if one is false, you can't be sure another would be true. In the cnf'ify case, (a=b or e=1) and (c=d or e=1) In this case, it can choose either, and act on just one, if a row fails to meet it, it can stop and not evaluate it using the other restriction. The fact is that it is only going to use fancy join/index in one of the two cases, so it tries to pick the best one, and does a brute-force qualification test on the remaining item if the first one tried is true. The problem is of course large where clauses can exponentially expand this. What it really trying to do is to pick a cheapest restriction, but the memory explosion and query failure are serious problems. The issue is that it thinks it is doing something to help things, while it is actually hurting things. In the ODBC case of: (x=3 and y=4) or (x=3 and y=5) or (x=3 and y=6) or ... it clearly is not going to gain anything by choosing any CHEAPEST path, because they are all the same in terms of cost, and the use by ODBC clients is hurting reliability. I am inclined to agree with David's solution of breaking apart the query into separate UNION queries in certain cases. It seems to be the most logical solution, because the cnf'ify code is working counter to its purpose in these cases. Now, the question is how/where to implement this. I see your idea of making the OR a join to a temp table that holds all the constants. 
Another idea would be to do actual UNION queries: SELECT * FROM tab WHERE (x=3 and y=4) UNION SELECT * FROM tab WHERE (x=3 and y=5) UNION SELECT * FROM tab WHERE (x=3 and y=6) ... This would work well for tables with indexes, but for a sequential scan, you are doing a sequential scan for each UNION. Another idea is subselects. Also, you have to make sure you return the proper rows, keeping duplicates where they are in the base table, but not returning them when they meet more than one qualification. SELECT * FROM tab WHERE (x,y) IN (SELECT 3, 4 UNION SELECT 3, 5 UNION SELECT 3, 6) I believe we actually support this. This is not going to use an index on tab, so it may be slow if x and y are indexed. Another more bizarre solution is: SELECT * FROM tab WHERE (x,y) = (SELECT 3, 4) OR (x,y) = (SELECT 3, 5) OR (x,y) = (SELECT 3, 6) Again, I think we do this too. I don't think cnf'ify does anything with this. I also believe "=" uses indexes on subselects, while IN does not because IN could return lots of rows, and an index is slower than a non-index join on lots of rows. Of course, now that we index OR's. Let me ask another question. If I do: SELECT * FROM tab WHERE x=3 OR x=4 it works, and uses indexes. Why can't the optimizer just not cnf'ify things sometimes, and just do: SELECT * FROM tab WHERE (x=3 AND y=4) OR (x=3 AND y=5) OR (x=3 AND y=6) Why can it handle x=3 OR x=4, but not the more complicated case above, without trying to be too smart? If x,y is a multi-key index, it could use that quite easily. If not, it can do a sequential scan and run the tests. Another issue. To the optimizer, x=3 and x=y are totally different. In x=3, it is a column compared to a constant, while in x=y, it is a join. That makes a huge difference. In the case of (a=b and c=d) or e=1, you pick the best path and do the a=b join, and throw in the e=1 entries. You can't easily do both joins, because you also need the e=1 stuff. 
I wonder what would happen if we prevent cnf'ifying of cases where the OR represents only column = constant restrictions. I meant to really go through the optimizer this month, but other backend items took my time. Can someone run some tests on disabling the cnf'ify calls. It is my understanding that with the non-cnf-ify'ed query, it can't choose an optimal path, and starts to do either straight index matches, sequential scans, or cartesian products where it joins every row to every other row looking for a match. Let's say we turn off cnf-ify just for non-join queries. Does that help? I am not sure of the ramifications of telling the optimizer it no longer has a variety of paths to choose for evaluating the query. -- Bruce Momjian | 830 Blythe Avenue maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026 + If your life is a hard drive, | (610) 353-9879(w) + Christ can be your backup. | (610) 853-3000(h)
Bruce Momjian wrote: > OK, let me try this one. > > Why is the system cnf'ifying the query. Because it wants to have a > list of qualifications that are AND'ed, so it can just pick the most > restrictive/cheapest, and evaluate that one first. > > If you have: > > (a=b and c=d) or e=1 > > In this case, without cnf'ify, it has to evaluate both of them, because > if one is false, you can't be sure another would be true. In the > cnf'ify case, > > (a=b or e=1) and (c=d or e=1) > > In this case, it can choose either, and act on just one, if a row fails > to meet it, it can stop and not evaluate it using the other restriction. > > The fact is that it is only going to use fancy join/index in one of the > two cases, so it tries to pick the best one, and does a brute-force > qualification test on the remaining item if the first one tried is true. > > The problem is of course large where clauses can exponentially expand > this. What it really trying to do is to pick a cheapest restriction, > but the memory explosion and query failure are serious problems. > > The issue is that it thinks it is doing something to help things, while > it is actually hurting things. > > In the ODBC case of: > > (x=3 and y=4) or > (x=3 and y=5) or > (x=3 and y=6) or ... > > it clearly is not going to gain anything by choosing any CHEAPEST path, > because they are all the same in terms of cost, and the use by ODBC > clients is hurting reliability. > > I am inclined to agree with David's solution of breaking apart the query > into separate UNION queries in certain cases. It seems to be the most > logical solution, because the cnf'ify code is working counter to its > purpose in these cases. > > Now, the question is how/where to implement this. I see your idea of > making the OR a join to a temp table that holds all the constants. 
> Another idea would be to do actual UNION queries: > > SELECT * FROM tab > WHERE (x=3 and y=4) > UNION > SELECT * FROM tab > WHERE (x=3 and y=5) > UNION > SELECT * FROM tab > WHERE (x=3 and y=6) ... > > This would work well for tables with indexes, but for a sequential scan, > you are doing a sequential scan for each UNION. Practically speaking, the lack-of-an-index concern may not be justified. The reason these queries are being generated, with this shape, is because remote data objects on the client side are being told that a primary key exists on these tables. The object is told about these keys in one of two ways. 1. It queries the database for the primary key of the table. The ODBC driver serviced this request by querying for the attributes used in {table_name}_pkey. 2. The user manually specifies the primary key. In this case an actual index may not exist. (i.e. MS Access asks the user for this information if a primary key is not found in a table) The second case is the only one that would cause a problem. Fortunately, the solution is simple. Add a primary key index! My only concern is to be able to accurately identify a query with the proper signature before rewriting it as a UNION. To what degree should this inspection be taken? BTW, I would not do the rewrite on OR's without AND's since you have fixed the OR's use of the index. There is one other potential issue. My experience with using arrays in tables and UNIONs creates problems. There are missing array comparison operators which are used by the implied DISTINCT. > Another idea is > subselects. Also, you have to make sure you return the proper rows, > keeping duplicates where they are in the base table, but not returning > them when the meet more than one qualification. > > SELECT * FROM tab > WHERE (x,y) IN (SELECT 3, 4 > UNION > SELECT 3, 5 > UNION > SELECT 3, 6) > > I believe we actually support this. This is not going to use an index > on tab, so it may be slow if x and y are indexed. 
> > Another more bizarre solution is: > > SELECT * FROM tab > WHERE (x,y) = (SELECT 3, 4) OR > (x,y) = (SELECT 3, 5) OR > (x,y) = (SELECT 3, 6) > > Again, I think we do this too. I don't think cnf'ify does anything with > this. I also believe "=" uses indexes on subselects, while IN does not > because IN could return lots of rows, and an index is slower than a > non-index join on lots of rows. Of course, now that we index OR's. > > Let me ask another question. If I do: > > SELECT * FROM tab WHERE x=3 OR x=4 > > it works, and uses indexes. Why can't the optimizer just not cnf'ify > things sometimes, and just do: > > SELECT * FROM tab > WHERE (x=3 AND y=4) OR > (x=3 AND y=5) OR > (x=3 AND y=6) > > Why can it handle x=3 OR x=4, but not the more complicated case above, > without trying to be too smart? If x,y is a multi-key index, it could > use that quite easily. If not, it can do a sequentail scan and run the > tests. > > Another issue. To the optimizer, x=3 and x=y are totally different. In > x=3, it is a column compared to a constant, while in x=y, it is a join. > That makes a huge difference. > > In the case of (a=b and c=d) or e=1, you pick the best path and do the > a=b join, and throw in the e=1 entries. You can't easily do both joins, > because you also need the e=1 stuff. > > I wounder what would happen if we prevent cnf'ifying of cases where the > OR represent only column = constant restrictions. > > I meant to really go through the optimizer this month, but other backend > items took my time. > > Can someone run some tests on disabling the cnf'ify calls. It is my > understanding that with the non-cnf-ify'ed query, it can't choose an > optimial path, and starts to do either straight index matches, > sequential scans, or cartesian products where it joins every row to > every other row looking for a match. > > Let's say we turn off cnf-ify just for non-join queries. Does that > help? 
> > I am not sure of the ramifications of telling the optimizer it no longer > has a variety of paths to choose for evaluating the query. I did not try this earlier because I thought it was too good to be true. I was right. I tried commenting out the normalize() function in the cnfify(). The EXPLAIN showed a sequential scan and the resulting tuple set was empty. Time will not allow me to dig into this further this weekend. Unless you come up with a better solution, I am going to submit my patch on Monday to make the Sept. 1st deadline. It includes a SET switch to activate the rewrite so as not to cause problems outside the ODBC users. We can either improve, it or yank it, by the Oct. 1st deadline.
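For reference, a sketch of how the proposed patch would be exercised, assuming the SET variable is spelled KSQO as David describes (the rewrite itself happens inside the backend, before cnfify()):

```sql
-- enable the keyset query optimizer rewrite for this session
SET ksqo TO 'on';

-- a typical keyset query generated by an ODBC record-set strategy;
-- with KSQO on, the backend would rewrite it as a UNION of the
-- three AND groups, so each group can use an index on (x, y)
SELECT * FROM tab
WHERE (x = 3 AND y = 4)
   OR (x = 3 AND y = 5)
   OR (x = 3 AND y = 6);
```

As noted above, the rewrite is a win only when a usable index exists; on small or unindexed tables it can be slower than the unrewritten query, which is why it is gated behind a SET switch rather than enabled unconditionally.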
Hello, At 11.40 30/08/98 -0400, David Hartwig wrote: >> Why is the system cnf'ifying the query. Because it wants to have a >> list of qualifications that are AND'ed, so it can just pick the most >> restrictive/cheapest, and evaluate that one first. Just a small question about all this optimization stuff. I'm not a database expert but I think we are talking about an NP-complete problem. Couldn't we convert this optimization problem into another NP one that is known to have a good solution ? For example for the traveling salesman problem there's an algorithm that provides a solution that's never more than two times the optimal one and provides results that are *really* near the optimal one most of the time. The simplex algorithm may be another example. I think that this kind of algorithm would be better than a collection of tricks for special cases, and these tricks could be used anyway when special cases are detected. Furthermore I also know that there exists a free program I used in the past that provides this kind of optimization for chip design. I don't remember the exact name of the program but I remember it came from Berkeley university. Of course maybe I'm totally missing the point. Hope it helps ! Bye! Dr. Sbragion Denis InfoTecna Tel, Fax: +39 39 2324054 URL: http://space.tin.it/internet/dsbragio
This is an old message, but still relevant.  I believe 6.6 will have
much better OR memory usage.

> > > Bruce Momjian wrote:
> > > >
> > > > Hannu Krosing wrote:
> > > > >
> > > > > > The days where every release fixed server crashes, or added a
> > > > > > feature that users were 'screaming for' may be a thing of the
> > > > > > past.
> > > > >
> > > > > Is anyone working on fixing the exploding optimisations for many
> > > > > OR-s, at least the canonic case used by access?
> > > > >
> > > > > My impression is that this has fallen somewhere between
> > > > > insightdist and Vadim.
> > > >
> > > This is really big for the ODBCers.  (And I suspect for JDBCers too.)
> > > Many desktop libraries and end-user tools depend on this "record set"
> > > strategy to operate effectively.
> > >
> > > I have put together a workable hack that runs just before cnfify().
> > > The option is activated through the SET command.  Once activated, it
> > > identifies queries with this particular multi-OR pattern generated by
> > > these RECORD SET strategies.  Qualified query trees are rewritten as
> > > multiple UNIONs.  (One for each OR grouping.)
> > >
> > > The results are profound.  Queries that used to scan tables because
> > > of the ORs now make use of any indexes.  Thus, the size of the table
> > > has virtually no effect on performance.  Furthermore, queries that
> > > used to crash the backend now run in under a second.
> > >
> > > Currently the down sides are:
> > >
> > >   1. If there is no usable index, performance is significantly worse.
> > >      The patch does not check to make sure that there is a usable
> > >      index.  I could use some pointers on this.
> > >
> > >   2. Small tables are actually a bit slower than without the patch.
> > >
> > >   3. Not very elegant.  I am looking for a more generalized solution.
> > >      I have lots of ideas, but I would need to know the backend much
> > >      better before attempting any of them.
> > > My favorite idea is, before cnfify(), to factor the OR terms and
> > > pull out the constants into a virtual (temporary) table space.  Then
> > > rewrite the query as a join.  The optimizer will (should) treat the
> > > new query accordingly.  This assumes that an efficient factoring
> > > algorithm exists and that temporary tables can exist in the heap.
> > >
> > > Illustration:
> > >
> > >     SELECT ... FROM tab WHERE
> > >         (var1 = const1 AND var2 = const2) OR
> > >         (var1 = const3 AND var2 = const4) OR
> > >         (var1 = const5 AND var2 = const6)
> > >
> > >     SELECT ... FROM tab, tmp WHERE
> > >         (var1 = var_x AND var2 = var_y)
> > >
> > >     tmp
> > >     var_x  | var_y
> > >     ---------------
> > >     const1 | const2
> > >     const3 | const4
> > >     const5 | const6
> >
> > David, where are we on this?  I know we have OR's using indexes.  Do
> > we still need to look at this as a fix, or are we OK?  I have not
> > gotten far enough in the optimizer to know how to fix the
>
> Bruce,
>
> If the question is, have I come up with a solution for the cnf'ify
> problem:  No
>
> If the question is, is it still important:  Very much yes.
>
> It is essential for many RAD tools using remote data objects which make
> use of key sets.  Your recent optimization of the OR list goes a long
> way, but inevitably users are confronted with multi-part keys.
>
> When I look at the problem my head spins.  I do not have the experience
> (yet?) with the backend to be mucking around in the optimizer.  As I
> see it, cnf'ify is doing just what it is supposed to do.  Boundless
> boolean logic.
>
> I think hope may lie, though, in identifying each AND'ed group
> associated with a key and tagging it as a special sub-root node which
> cnf'ify does not penetrate.  This node would be allowed to pass to the
> later stages of the optimizer, where it will be used to plan index
> scans.  Easy for me to say.
>
> In the meantime, I still have the patch that I described in prior
> email.  It has worked well for us.  Let me restate that.
> We could not survive without it!  However, I do not feel it is a
> sufficiently functional approach that it should be incorporated as a
> final solution.  I will submit the patch if you (anyone) do not come up
> with a better solution.  It is coded to be activated by a SET KSQO to
> minimize its reach.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
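The factor-out-the-constants idea illustrated above can be sketched concretely: pull the constant pairs into a temporary table and replace the OR chain with a join against it. A minimal demonstration, again using Python's sqlite3 as a stand-in for the Postgres backend (all table and column names are hypothetical, matching the illustration rather than any real patch):

```python
# Show that factoring OR-group constants into a temp table and joining
# yields the same rows as the original OR chain.  sqlite3 stands in for
# Postgres; names (tab, tmp, var1, var2, var_x, var_y) follow the
# illustration in the quoted message.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE tab (var1 INTEGER, var2 INTEGER, val TEXT)")
cur.executemany("INSERT INTO tab VALUES (?, ?, ?)",
                [(1, 10, "a"), (2, 20, "b"), (3, 30, "c"), (4, 40, "d")])

# Virtual (temporary) table holding the factored-out constants.
cur.execute("CREATE TEMP TABLE tmp (var_x INTEGER, var_y INTEGER)")
cur.executemany("INSERT INTO tmp VALUES (?, ?)",
                [(1, 10), (2, 20), (3, 30)])

# The original OR chain ...
or_rows = sorted(cur.execute("""
    SELECT val FROM tab WHERE
        (var1 = 1 AND var2 = 10) OR
        (var1 = 2 AND var2 = 20) OR
        (var1 = 3 AND var2 = 30)
""").fetchall())

# ... rewritten as a join against the constants table, which the
# optimizer can plan as an index lookup per constant pair.
join_rows = sorted(cur.execute("""
    SELECT val FROM tab, tmp WHERE
        var1 = var_x AND var2 = var_y
""").fetchall())

print(or_rows == join_rows)  # same result set either way
```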
Hi All

I'm a bit lost!  Where can I find documentation on accessing postgres
from inside PERL (5)?

Any help will be appreciated.  (The thing is, I'm sure I've seen the
info somewhere, but for the life of me I can't remember where...)

Thanks
Jason Doller
On Sun, 19 Sep 1999, Jason Doller wrote:

> I'm a bit lost!  Where can I find documentation on accessing postgres
> from inside PERL (5)?
>
> Any help will be appreciated.  (The thing is, I'm sure I've seen the
> info somewhere, but for the life of me I can't remember where...)

Under the source tree, go to the interfaces directory, and there is a
directory for the Perl interface (Pg.pm).  You have to enable the Perl
option when you run configure, and you will also have to install the
module as root, since it gets installed under the module hierarchy
wherever you have Perl installed.  Then you only need to do a 'perldoc
Pg' to see the documentation on it.  See the build instructions for more
information on how to install the Perl module.

Brett W. McCoy
                                http://www.lan2wan.com/~bmccoy/
-----------------------------------------------------------------------
Don't let your mind wander -- it's too little to be let out alone.