Thread: Docs: GIST
I'm a bit confused. In 7.3, is it possible to use GIST without using any of the stuff in contrib/? If it is, how can it be done, and in which cases should it be done? The pgsql docs about indexes keep talking about GIST here and there, but I can't seem to use it on anything, and there's no gist in the "ops" and access method listings.

Having the docs say PostgreSQL provides GIST as one of the four access methods, GIST supports multicolumn indexes, and so on, is just confusing if the docs pertaining to indexes don't also say that in a default PostgreSQL installation you cannot create an index using GIST (if you actually can create a GIST index "out of the box", how?).

Another thing: is Eugene Selkov's 1998 message on GIST indexes in the 7.3 docs (see "GiST Indexes") still valid? There's mention of PostgreSQL 6.3 and Postgres95 in there too.

BTW, 7.3 is GREAT! Multiple col/row returns, prepared queries, schemas, etc. Also, SET enable_seqscan = off can now get rolled back to whatever it was before, right? Cool, because I have to force index use for a particular select.

Thanks to the PostgreSQL dev team and everyone involved!

Cheerio,
Link.
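As a side note on that last point, here is a minimal sketch of forcing index use for a single statement with enable_seqscan and then restoring it explicitly; the table, column, and value are invented for illustration:

    SET enable_seqscan = off;   -- make sequential scans look very expensive to the planner
    SELECT * FROM mytable WHERE mycol = 42;   -- now prefers an index scan, if a usable index exists
    SET enable_seqscan = on;    -- put the planner back to normal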
Lincoln Yeoh <lyeoh@pop.jaring.my> writes:
> I'm a bit confused.
> In 7.3 is it possible to use GIST without using any of the stuff in
> contrib/?

No, because there are no GIST opclasses in the standard installation. They are only in contrib. Yes, that's a bit silly. As GIST improves out of the "academic toy" category into the "production tool" category, I expect we will migrate GIST opclasses into the standard installation.

			regards, tom lane
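To make the contrib dependency concrete, here is a minimal sketch of creating a GIST index on 7.3, assuming the contrib/btree_gist module has been built and its btree_gist.sql script loaded into the database; the table and index names are invented, and the types you can index this way depend on which contrib module you load:

    -- A stock 7.3 install has no GIST opclasses at all, so this CREATE INDEX
    -- fails until a contrib module (here btree_gist) has provided one for
    -- the column's type.
    CREATE TABLE gist_demo (id int4, val int4);

    CREATE INDEX gist_demo_val_idx ON gist_demo USING gist (val);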
I did figure it out eventually, but it would be clearer to mention that in the docs - e.g. that the only way to use GIST is to use the stuff in contrib. I had a bit of wishful thinking - I thought that maybe some bits of GIST might at least have become usable by default in 7.3, e.g. the simpler stuff (the docs didn't quite contradict that).

Definitely not asking for it to be rushed in, though. Software is more reliable when the developers know what they are doing and get to release stuff when they think it's ready, not when others say it is.

Cheerio,
Link.

At 12:41 AM 12/12/02 -0500, Tom Lane wrote:
>Lincoln Yeoh <lyeoh@pop.jaring.my> writes:
> > I'm a bit confused.
> > In 7.3 is it possible to use GIST without using any of the stuff in
> > contrib/?
>
>No, because there are no GIST opclasses in the standard installation.
>They are only in contrib.
Hi all,

I just read about the CLUSTER command and was a little (very) disappointed. Clustered tables do not remain clustered after inserts. Clustered tables are useful when the table is very large and there are few different keys.

Because the table file is already split past the 2G limit into segments with different file extensions (.N), how complicated would it be (modifying the code) to have the table files split according to the cluster key?

This would:

Greatly improve performance when the cluster key is included in the search criteria.
Allow for a much larger table before a file has to be split (.N).
Simplify the management of symlinks (that's something else we need to look at).
Make the index file for that field unnecessary.

Of course, there should be only one cluster key per table. The key should be short and the number of different keys should be low as well.

So... ?

JLL
Oh, and something else: I think the syntax should be:

CLUSTER <table> ON <attribute>

Maybe inheritance can be used here. The problem is creating the new "table" when a new key is detected. I know I can use rules, but the optimizer is not aware of the clustering (a rough sketch of that approach follows below).

Enough from me for now. What do you think?

JLL

Jean-Luc Lachance wrote:
>
> Hi all,
>
> I just read about the CLUSTER command and was a little (very)
> disappointed.
> Clustered tables do not remain clustered after inserts.
> Clustered tables are useful when the table is very large and there are
> few different keys.
>
> Because the table file is already split past the 2G limit into segments
> with different file extensions (.N), how complicated would it be
> (modifying the code) to have the table files split according to the
> cluster key?
>
> This would:
>
> Greatly improve performance when the cluster key is included in the
> search criteria.
> Allow for a much larger table before a file has to be split (.N).
> Simplify the management of symlinks (that's something else we need to
> look at).
> Make the index file for that field unnecessary.
>
> Of course, there should be only one cluster key per table.
> The key should be short and the number of different keys should be low
> as well.
>
> So... ?
>
> JLL
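To illustrate the inheritance-plus-rules idea mentioned above, here is a rough sketch of emulating key-based partitioning on 7.3. The table, column, and rule names are invented; each new key value still needs a child table and rule created by hand, and the 7.3 planner does not prune children for you, which is exactly the "optimizer is not aware" problem:

    -- Parent table; one child per cluster/partition key value.
    CREATE TABLE orders (id int4, region int4, amount numeric);

    CREATE TABLE orders_r1 (CHECK (region = 1)) INHERITS (orders);
    CREATE TABLE orders_r2 (CHECK (region = 2)) INHERITS (orders);

    -- Rewrite rules redirect inserts on the parent to the matching child.
    CREATE RULE orders_ins_r1 AS ON INSERT TO orders
        WHERE new.region = 1
        DO INSTEAD
        INSERT INTO orders_r1 VALUES (new.id, new.region, new.amount);

    CREATE RULE orders_ins_r2 AS ON INSERT TO orders
        WHERE new.region = 2
        DO INSTEAD
        INSERT INTO orders_r2 VALUES (new.id, new.region, new.amount);

    -- SELECT * FROM orders WHERE region = 1;  -- still scans every child in 7.3
    -- SELECT * FROM orders_r1;                -- touches only that child's files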
On Thu, 12 Dec 2002, Jean-Luc Lachance wrote:

> Hi all,
>
> I just read about the CLUSTER command and was a little (very)
> disappointed.
> Clustered tables do not remain clustered after inserts.
> Clustered tables are useful when the table is very large and there are
> few different keys.
>
> Because the table file is already split past the 2G limit into segments
> with different file extensions (.N), how complicated would it be
> (modifying the code) to have the table files split according to the
> cluster key?

I'd vote against changing the existing CLUSTER, since the existing CLUSTER, while not great, does handle many different key values fairly well, and this solution wouldn't. Tables with many different key values are still useful to cluster if you're doing searches over ranges, since clustering lowers the number of heap file reads necessary. If done, this should probably be separate from the existing CLUSTER, or at least both versions should be possible.
The current CLUSTER command is equivalent to:

create table b as select * from a order by i;

So you would not be losing anything.

Stephan Szabo wrote:
>
> On Thu, 12 Dec 2002, Jean-Luc Lachance wrote:
>
> > Hi all,
> >
> > I just read about the CLUSTER command and was a little (very)
> > disappointed.
> > Clustered tables do not remain clustered after inserts.
> > Clustered tables are useful when the table is very large and there are
> > few different keys.
> >
> > Because the table file is already split past the 2G limit into segments
> > with different file extensions (.N), how complicated would it be
> > (modifying the code) to have the table files split according to the
> > cluster key?
>
> I'd vote against changing the existing CLUSTER, since the existing CLUSTER,
> while not great, does handle many different key values fairly well, and
> this solution wouldn't. Tables with many different key values are still
> useful to cluster if you're doing searches over ranges, since clustering
> lowers the number of heap file reads necessary. If done, this should
> probably be separate from the existing CLUSTER, or at least both versions
> should be possible.
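Spelled out a little further, a sketch of that manual equivalent might look like the following, assuming the only thing to rebuild is a single index on a(i) and ignoring permissions, constraints, and any other indexes (which is the caveat raised in the follow-up below):

    BEGIN;
    CREATE TABLE a_sorted AS SELECT * FROM a ORDER BY i;
    DROP TABLE a;
    ALTER TABLE a_sorted RENAME TO a;
    CREATE INDEX a_i_idx ON a (i);   -- the original index has to be recreated by hand
    COMMIT;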
On Thu, 12 Dec 2002, Jean-Luc Lachance wrote:

> The current CLUSTER command is equivalent to:
>
> create table b as select * from a order by i;
>
> So you would not be losing anything.

Except for the fact that CLUSTER is intended (although I don't know whether it does yet) to retain things like constraints and other indexes, which the above doesn't.
Splitting table files by indexed value may not help if the operating system doesn't manage to keep the files unfragmented on disk. I suppose the OS should know how to do that, though.

Cheerio,
Link.

At 04:31 PM 12/12/02 -0500, Jean-Luc Lachance wrote:
>Hi all,
>
>I just read about the CLUSTER command and was a little (very)
>disappointed.
>Clustered tables do not remain clustered after inserts.
>Clustered tables are useful when the table is very large and there are
>few different keys.
>
>Because the table file is already split past the 2G limit into segments
>with different file extensions (.N), how complicated would it be
>(modifying the code) to have the table files split according to the
>cluster key?
OK, fine. Let's create a new command:

PARTITION <table> ON <attribute>

I did not want to start a fight. You can keep the CLUSTER command as it is. I still think clustering/partitioning would be a great idea, and that is what I want to talk about. Look at the original post for the reasons.

JLL

johnnnnnn wrote:
>
> On Thu, Dec 12, 2002 at 02:03:56PM -0800, Stephan Szabo wrote:
> > I'd vote against changing the existing CLUSTER, since the existing
> > CLUSTER, while not great, does handle many different key values fairly
> > well, and this solution wouldn't.
>
> I would agree. What's being proposed sounds much more like table
> partitioning than clustering.
>
> That's not to say that the existing CLUSTER couldn't be improved, at
> the very least to the point where it allows inserts to respect the
> clustered structure. That's a post for another thread, though.
>
> -johnnnnnnnnnnn
Hi -

I've been running PostgreSQL 7.3 on Mac OS X 10.2 since it was released and it's been running fine. I'm using pyPgSQL 2.3 for client-side programming, which also was working great until tonight. Now whenever I do any query of any type, I get warnings like this:

WARNING: PerformPortalClose: portal "pgsql_00179f10" not found

It "appears" that everything is still working the way it was, but it's a bit discomforting to have these show up on my screen while running my applications.

Anyone that can explain this?

Here's a tiny bit of Python sample code that I used to make sure it wasn't my other code causing the problem:

from pyPgSQL import PgSQL

dbname = "template1"
conn = PgSQL.connect(database=dbname)
cursor = conn.cursor()
sql = "SELECT now()"
cursor.execute(sql)
res = cursor.fetchall()
for i in res:
    print i
cursor.close()
conn.commit()

Strangely, if I remove the last two lines (cursor.close() and conn.commit()) I don't get the warnings. I also notice that I don't have this problem with the psql command line. Is this the Python API causing this?

Thanks for any help
Mike
Michael Engelhart wrote in gmane.comp.db.postgresql.general:
> Hi -
> I've been running PostgreSQL 7.3 on Mac OS X 10.2 since it was released
> and it's been running fine. I'm using pyPgSQL 2.3 for client-side
> programming, which also was working great until tonight. Now whenever
> I do any query of any type, I get warnings like this:
>
> WARNING: PerformPortalClose: portal "pgsql_00179f10" not found
>
> It "appears" that everything is still working the way it was, but it's
> a bit discomforting to have these show up on my screen while running
> my applications.
>
> Anyone that can explain this?
>
> Here's a tiny bit of Python sample code that I used to make sure it
> wasn't my other code causing the problem:
>
> from pyPgSQL import PgSQL
>
> dbname = "template1"
> conn = PgSQL.connect(database=dbname)
> cursor = conn.cursor()
> sql = "SELECT now()"
> cursor.execute(sql)
> res = cursor.fetchall()
> for i in res:
>     print i
> cursor.close()
> conn.commit()

Actually, pyPgSQL is using PostgreSQL portals behind your back. This is a feature!

To show this, we use the undocumented but very handy toggleShowQuery flag. The effect is that we can see what SQL pyPgSQL sends to the backend using libpq (the lines starting with QUERY: below):

#v+
gerhard@gargamel:~$ python
Python 2.2.2 (#1, Nov 30 2002, 23:19:58)
[GCC 2.95.4 20020320 [FreeBSD]] on freebsd4
Type "help", "copyright", "credits" or "license" for more information.
>>> from pyPgSQL import PgSQL
>>> con = PgSQL.connect()
>>> con.conn.toggleShowQuery
'On'
>>> cursor = con.cursor()
QUERY: BEGIN WORK
>>> cursor.execute("select * from test")
QUERY: DECLARE "PgSQL_0811F1EC" CURSOR FOR select * from test
QUERY: FETCH 1 FROM "PgSQL_0811F1EC"
QUERY: SELECT typname, -1 , typelem FROM pg_type WHERE oid = 23
QUERY: SELECT typname, -1 , typelem FROM pg_type WHERE oid = 1043
>>> result = cursor.fetchmany(5)
QUERY: FETCH 4 FROM "PgSQL_0811F1EC"
>>> result
[[None, 'A'], [None, 'B'], [None, 'C'], [None, 'F'], [None, 'F']]
>>> con.commit()
QUERY: CLOSE PgSQL_0811F1EC
QUERY: COMMIT WORK
>>>
#v-

This gives me a warning like this:

#v+
WARNING: PerformPortalClose: portal "pgsql_0811f1ec" not found
#v-

As far as I can see, the SQL pyPgSQL emits is perfectly OK. But I'd be glad to hear a clarification.

> Strangely, if I remove the last two lines (cursor.close() and
> conn.commit()) I don't get the warnings.
>
> I also notice that I don't have this problem with the psql command
> line. Is this the Python API causing this?

If you use the same SQL statements using portals in psql, you get the same warning (obviously). I just tried.

Gerhard (pyPgSQL developer)
--
Favourite database: http://www.postgresql.org/
Favourite programming language: http://www.python.org/
Combine the two: http://pypgsql.sf.net/
Embedded database for Python: http://pysqlite.sf.net/
Gerhard Haering <haering_postgresql@gmx.de> writes:
> To show this, we use the undocumented but very handy toggleShowQuery
> flag. The effect is that we can see what SQL pyPgSQL sends to the
> backend using libpq (the lines starting with QUERY: below):

> QUERY: DECLARE "PgSQL_0811F1EC" CURSOR FOR select * from test
> ...
> QUERY: CLOSE PgSQL_0811F1EC

This looks like a pyPgSQL bug to me. If it's going to use a mixed-case name for the cursor, then it must either always double-quote the name or never do so. Failing to double-quote in the CLOSE command is wrong.

			regards, tom lane
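For anyone who wants to see the identifier folding at work, here is a small sketch that can be pasted into psql; the cursor name is invented, and on 7.3 the mismatched CLOSE shows up as the same PerformPortalClose warning discussed above:

    BEGIN;
    DECLARE "PgSQL_Demo" CURSOR FOR SELECT 1;

    CLOSE PgSQL_Demo;     -- unquoted: folds to pgsql_demo, which names no portal
    CLOSE "PgSQL_Demo";   -- quoted: matches the declared portal and succeeds

    COMMIT;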