Thread: tsearch2 patch status report
I've applied version 0.58 of the patch with a lot of further editorializing. I feel fairly confident now in the code that interfaces between tsearch and the rest of the system, but a lot of the lowest-level "guts" of tsearch (mainly in src/backend/tsearch/*.c and src/backend/utils/adt/ts*.c) left my eyes glazing over. Perhaps someone else can make an extra review pass over that stuff. I am quite confident that this commit broke the MSVC build, which seems to need to know individually about each shared library ... Magnus, can you do something about that? We'll see what other portability problems emerge from the buildfarm. The main thing that is lacking at the moment is documentation. The stuff Bruce has been working on will be good introductory material, but we've got basically zip in reference material. I'll do some work on that over the next couple of days, but there's probably room for more hands. Also, we need to decide what to do with contrib/tsearch2, which is currently DOA because of conflicts with the new core code. We could either rip it out entirely, or try to convert it into a compatibility package. In view of the renamings of functions we agreed to do, I think there is some scope for a compatibility package, but I have no time to work on that. This is, by a wide margin, the largest single patch ever to hit the Postgres CVS tree. Congratulations to Oleg and Teodor on seeing it through! regards, tom lane
Tom Lane wrote: > I am quite confident that this commit broke the MSVC build, which seems > to need to know individually about each shared library ... Magnus, > can you do something about that? We'll see what other portability > problems emerge from the buildfarm. > > > You broke my shiny new MinGW and Cygwin buildfarm members too :-) (MSVC is on the way - I'll try to have it up in a few days). cheers andrew
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Tom Lane wrote: > Also, we need to decide what to do with contrib/tsearch2, which is > currently DOA because of conflicts with the new core code. We could > either rip it out entirely, or try to convert it into a compatibility > package. In view of the renamings of functions we agreed to do, I > think there is some scope for a compatibility package, but I have no > time to work on that. I saw we leave it as a stub with a readme pointing to the docs. Sincerely, Joshua D. Drake > > This is, by a wide margin, the largest single patch ever to hit the > Postgres CVS tree. Congratulations to Oleg and Teodor on seeing > it through! > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org > - -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240 PostgreSQL solutions since 1997 http://www.commandprompt.com/ UNIQUE NOT NULL Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGyk80ATb/zqfZUUQRAhbQAJwP95u10LTZ/apiUELtT2GthIZHfQCdGxDh JRfdszL69TQOBD/6hlVZLuA= =h9E9 -----END PGP SIGNATURE-----
Andrew Dunstan <andrew@dunslane.net> writes: > You broke my shiny new MinGW and Cygwin buildfarm members too :-) Yeah, I was just looking at that. I seem to recall that the fu000001.o(.idata$3+0xc): undefined reference to `libpostgres_a_iname' bleat is a symptom of a reference to a variable that isn't marked DLLIMPORT ... but CurrentMemoryContext certainly is, so there's not anything here sufficient to fix it. I trust someone with access to a Windows build environment will dig into that. regards, tom lane
Tom Lane wrote: > The main thing that is lacking at the moment is documentation. The > stuff Bruce has been working on will be good introductory material, > but we've got basically zip in reference material. I'll do some work > on that over the next couple of days, but there's probably room for > more hands. Oleg and Teodor did provide reference documentation. You can see the SGML here: http://momjian.us/expire/textsearch/SGML/ref/ The SQL commands were in a state of flux so I haven't worked on them yet. I can start now. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian <bruce@momjian.us> writes: > Oleg and Teodor did provide reference documentation. You can see the > SGML here: > http://momjian.us/expire/textsearch/SGML/ref/ > The SQL commands were in a state of flux so I haven't worked on them > yet. I can start now. OK. I whacked around the command syntax a bit in order to cut down the number of keywords the grammar needed to know about. (Every new keyword creates a distributed cost in the size and speed of the parser, so we shouldn't create 'em when we don't have to.) So I guess I'm on the hook to get the command syntax reference pages done while the reality is fresh in my mind. Will get on it tomorrow. The other areas that need work include datatype and function reference documentation --- how do you want to attack that? Should we create new <sect1> sections in those chapters? regards, tom lane
On Mon, Aug 20, 2007 at 10:50:49PM -0400, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > > You broke my shiny new MinGW and Cygwin buildfarm members too :-) > > Yeah, I was just looking at that. I seem to recall that the > fu000001.o(.idata$3+0xc): undefined reference to `libpostgres_a_iname' > bleat is a symptom of a reference to a variable that isn't marked > DLLIMPORT ... but CurrentMemoryContext certainly is, so there's not > anything here sufficient to fix it. I trust someone with access to > a Windows build environment will dig into that. Mingw fixed, I think. I think that will also fix cygwin, but less sure there.. //Magnus
On Mon, Aug 20, 2007 at 09:23:25PM -0400, Tom Lane wrote: > I've applied version 0.58 of the patch with a lot of further > editorializing. I feel fairly confident now in the code that interfaces > between tsearch and the rest of the system, but a lot of the > lowest-level "guts" of tsearch (mainly in src/backend/tsearch/*.c and > src/backend/utils/adt/ts*.c) left my eyes glazing over. Perhaps someone > else can make an extra review pass over that stuff. > > I am quite confident that this commit broke the MSVC build, which seems > to need to know individually about each shared library ... Magnus, > can you do something about that? We'll see what other portability > problems emerge from the buildfarm. Looking at that now. I get awarning from the following line in bison generated parse.h: #define TEXT 577 Because TEXT is a macro that's used in the windows headers. (A redefinition warning). Any chance to change that, or is it coming out of the syntax itself? (bison newbie here, as you know :-P) > Also, we need to decide what to do with contrib/tsearch2, which is > currently DOA because of conflicts with the new core code. We could > either rip it out entirely, or try to convert it into a compatibility > package. In view of the renamings of functions we agreed to do, I > think there is some scope for a compatibility package, but I have no > time to work on that. OTOH, if we do it as a compat package, we need to set a firm end-date on it, so we don't have to maintain it forever. Given the issues always at hand for doing such an upgrade, my vote is actually for ripping it out completely and take the migration pain once and then be done with it. > This is, by a wide margin, the largest single patch ever to hit the > Postgres CVS tree. Congratulations to Oleg and Teodor on seeing > it through! Yes, congratulations indeed. Great to see this in! //Magnus
Magnus Hagander wrote: > On Mon, Aug 20, 2007 at 09:23:25PM -0400, Tom Lane wrote: > I get awarning from the following line in bison generated parse.h: > #define TEXT 577 That's easy, change the terminal to TEXT_P on gram.y and keywords.c. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > Magnus Hagander wrote: >> I get awarning from the following line in bison generated parse.h: >> #define TEXT 577 > That's easy, change the terminal to TEXT_P on gram.y and keywords.c. Yeah, I was wondering if we'd have to do that with any of the new keywords. Will fix, if you didn't already. regards, tom lane
On Tue, Aug 21, 2007 at 10:59:52AM -0400, Tom Lane wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > Magnus Hagander wrote: > >> I get awarning from the following line in bison generated parse.h: > >> #define TEXT 577 > > > That's easy, change the terminal to TEXT_P on gram.y and keywords.c. > > Yeah, I was wondering if we'd have to do that with any of the new > keywords. Will fix, if you didn't already. Nope, not yet since it's just a warning, so please do. //Magnus
On Mon, 20 Aug 2007, Tom Lane wrote: > I've applied version 0.58 of the patch with a lot of further > editorializing. I feel fairly confident now in the code that interfaces Great ! Just checked and most things after trivial changes are working ! We need to summarize changes and provide upgraide guide. > The main thing that is lacking at the moment is documentation. The > stuff Bruce has been working on will be good introductory material, > but we've got basically zip in reference material. I'll do some work > on that over the next couple of days, but there's probably room for > more hands. We have SQL reference and I'm sure Bruce should have it too. Bruce, let me know when you do last changes, so I could read docs again > > Also, we need to decide what to do with contrib/tsearch2, which is > currently DOA because of conflicts with the new core code. We could > either rip it out entirely, or try to convert it into a compatibility > package. In view of the renamings of functions we agreed to do, I > think there is some scope for a compatibility package, but I have no > time to work on that. Probably, we could leave tsearch2 in contrib as a compat module and explicitly define 8.4 will be the last release with tsearch2 support. > > This is, by a wide margin, the largest single patch ever to hit the > Postgres CVS tree. Congratulations to Oleg and Teodor on seeing > it through! It was really hard task for all of us. It's pity I don't drink at all, but Teodor will drink the health of "text search in PostgreSQL" today ! Tom, as I know, already drain out bottle of vodka, so Bruce, you need to join us ! Regards, Oleg _____________________________________________________________ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
Hi, On Tue, 2007-08-21 at 20:04 +0400, Oleg Bartunov wrote: > It was really hard task for all of us. It's pity I don't drink at all, > but Teodor will drink the health of "text search in PostgreSQL" > today . Haha :) I'd drink Vodka today, with some of my friends. (I'll drink Absolut, and I know you guys do not like it ;) ) Thanks for the good work guys. Cheers, -- Devrim GÜNDÜZ PostgreSQL Replication, Consulting, Custom Development, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, ODBCng - http://www.commandprompt.com/
Tom Lane wrote: > I've applied version 0.58 of the patch with a lot of further > editorializing. I feel fairly confident now in the code that interfaces > between tsearch and the rest of the system, but a lot of the > lowest-level "guts" of tsearch (mainly in src/backend/tsearch/*.c and > src/backend/utils/adt/ts*.c) left my eyes glazing over. Perhaps someone > else can make an extra review pass over that stuff. I'm skimming through tsearch code, trying to understand it. I'd like to see more comments, at least function-comments explaining what each function does, what the arguments are etc. I can try to add them as I go as well, and send a patch. The memory management of the init functions looks weird. In spell.c, there's this piece of code: > /* > * during initialization dictionary requires a lot > * of memory, so it will use temporary context > */ > static MemoryContext tmpCtx = NULL; > > #define tmpalloc(sz) MemoryContextAlloc(tmpCtx, (sz)) > #define tmpalloc0(sz) MemoryContextAllocZero(tmpCtx, (sz)) > > static void > checkTmpCtx(void) > { > if (CurrentMemoryContext->firstchild == NULL) > { > tmpCtx = AllocSetContextCreate(CurrentMemoryContext, > "Ispell dictionary init context", > ALLOCSET_DEFAULT_MINSIZE, > ALLOCSET_DEFAULT_INITSIZE, > ALLOCSET_DEFAULT_MAXSIZE); > } > else > tmpCtx = CurrentMemoryContext->firstchild; > } checkTmpCtx is called by all the initialization functions in spell.c. I believe it's assumed that if firstchild exists, it's a previously allocated "init context". But there isn't actually any guarantee that the CurrentMemoryContext doesn't already have a child context, in which case we would use whatever the first child context happens to be. And at least dispell_init calls MemoryContextDeleteChildren(CurrentMemoryContext), again with no guarantee that there isn't other unrelated child contexts. I think dispell_init should create the new context before calling the functions in spell.c, and destroy it at the end. I can submit a patch, unless I'm missing something. More comments as I get further... PS. Nice to see tsearch in core! -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki@enterprisedb.com> writes: > I'm skimming through tsearch code, trying to understand it. I'd like to > see more comments, at least function-comments explaining what each > function does, what the arguments are etc. I can try to add them as I go > as well, and send a patch. Yes, the patch is not well commented. I improved matters in the interface code that I worked over (eg tsearchcmds.c), but ran out of patience for trying to comment the "guts" code. Please send a patch for any improvements you make. > The memory management of the init functions looks weird. In spell.c, > there's this piece of code: I saw that and didn't care for it much, but didn't look into exactly what would be needed to get rid of it. (I did clean up another place that had a *global* magic context pointer ... that looked like trouble waiting to happen.) If you make the outside caller create the context, does it need to be passed explicitly or can it just be CurrentMemoryContext when they are called? regards, tom lane
Tom Lane wrote: >> The memory management of the init functions looks weird. In spell.c, >> there's this piece of code: > > I saw that and didn't care for it much, but didn't look into exactly > what would be needed to get rid of it. (I did clean up another place > that had a *global* magic context pointer ... that looked like trouble > waiting to happen.) If you make the outside caller create the context, > does it need to be passed explicitly or can it just be > CurrentMemoryContext when they are called? I believe CurrentMemoryContext would work just fine. I'll put together a patch. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
tgl@sss.pgh.pa.us (Tom Lane) writes: > The main thing that is lacking at the moment is documentation. The > stuff Bruce has been working on will be good introductory material, > but we've got basically zip in reference material. I'll do some work > on that over the next couple of days, but there's probably room for > more hands. If someone has either text or HTML formatted material that seems plausible, I'd be happy to help get it into DocBook form. -- "cbbrowne","@","acm.org" http://cbbrowne.com/info/linuxxian.html "Although Unix is more reliable, NT may become more reliable with time" -- Ron Redman, deputy technical director of the Fleet Introduction Division of the Aegis Program Executive Office, US Navy.
Oleg Bartunov wrote: > On Mon, 20 Aug 2007, Tom Lane wrote: > > > I've applied version 0.58 of the patch with a lot of further > > editorializing. I feel fairly confident now in the code that interfaces > > Great ! Just checked and most things after trivial changes are working ! > We need to summarize changes and provide upgraide guide. > > > The main thing that is lacking at the moment is documentation. The > > stuff Bruce has been working on will be good introductory material, > > but we've got basically zip in reference material. I'll do some work > > on that over the next couple of days, but there's probably room for > > more hands. > > We have SQL reference and I'm sure Bruce should have it too. Bruce, > let me know when you do last changes, so I could read docs again OK, I am unclear how the main documentation is going to change. I will look over the reference pages and get them online soon and we can all see them. > > This is, by a wide margin, the largest single patch ever to hit the > > Postgres CVS tree. Congratulations to Oleg and Teodor on seeing > > it through! > > It was really hard task for all of us. It's pity I don't drink at all, but > Teodor will drink the health of "text search in PostgreSQL" today ! > Tom, as I know, already drain out bottle of vodka, so Bruce, you need to > join us ! Shame I don't drink either. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Just a heads up, not sure if you guys are aware of it. But one of the Makefile's (src/backend/tsearch/Makefile) added bythis patch breaks the "build out of source tree" feature of autoconf/automake. The problem is pretty straightforward,and after adding $(srcdir) everything seems to be fine. <br /><br />Index: src/backend/tsearch/Makefile<br/>===================================================================<br />RCS file: /u01/mirrors/cvs/pgsql/pgsql/src/backend/tsearch/Makefile,v<br/>retrieving revision 1.1<br />diff -p -c - r1.1 Makefile<br/>*** src/backend/tsearch/Makefile 21 Aug 2007 01:11:18 -0000 1.1<br />--- src/backend/tsearch/Makefile 21 Aug 2007 23:58:37 -0000<br />*************** depend dep:<br />*** 31,37 ****<br /> .PHONY: install-data <br /> install-data: $(DICTFILES) installdirs<br /> for i in $(DICTFILES); \<br />! do $(INSTALL_DATA) $$i '$(DESTDIR)$(datadir)/$(DICTDIR)/'$$i; \<br /> done<br /> <br /> installdirs:<br/>--- 31,37 ---- <br /> .PHONY: install-data<br /> install-data: $(DICTFILES) installdirs<br /> fori in $(DICTFILES); \<br />! do $(INSTALL_DATA) $(srcdir)/$$i '$(DESTDIR)$(datadir)/$(DICTDIR)/'$$i; \<br/> done <br /> <br /> installdirs:<br /><br />
On 8/21/07, Magnus Hagander <magnus@hagander.net> wrote: > OTOH, if we do it as a compat package, we need to set a firm end-date on > it, so we don't have to maintain it forever. Given the issues always at > hand for doing such an upgrade, my vote is actually for ripping it out > completely and take the migration pain once and then be done with it. I would suggest making a pgfoundry project...that's what was done with userlocks. I'm pretty certain no one besides me has ever used the wrappers I created...a lot more people use tsearch2 than userlocks though. merlin
On Tue, 21 Aug 2007, Chris Browne wrote: > tgl@sss.pgh.pa.us (Tom Lane) writes: >> The main thing that is lacking at the moment is documentation. The >> stuff Bruce has been working on will be good introductory material, >> but we've got basically zip in reference material. I'll do some work >> on that over the next couple of days, but there's probably room for >> more hands. > > If someone has either text or HTML formatted material that seems > plausible, I'd be happy to help get it into DocBook form. > Documentation existed for months on DocBook form. I even got an offer to translate it to chinese while patch was evaluated. Bruce did a lot of editorial work and now he maintain cvs version. What we really need - is a summary of changes and upgrade guide. Regards, Oleg _____________________________________________________________ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
"Chad Wagner" <chad.wagner@gmail.com> writes: > Just a heads up, not sure if you guys are aware of it. But one of the > Makefile's (src/backend/tsearch/Makefile) added by this patch breaks the > "build out of source tree" feature of autoconf/automake. The problem is > pretty straightforward, and after adding $(srcdir) everything seems to be > fine. Applied, thanks. (Hm, I thought we had some buildfarm machines testing VPATH builds these days? Guess not ...) regards, tom lane
Le mardi 21 août 2007, Oleg Bartunov a écrit : > On Mon, 20 Aug 2007, Tom Lane wrote: > > I've applied version 0.58 of the patch with a lot of further > > editorializing. I feel fairly confident now in the code that interfaces > > Great ! Just checked and most things after trivial changes are working ! > We need to summarize changes and provide upgraide guide. Congratulations Teodor and Oleg ! > > > Also, we need to decide what to do with contrib/tsearch2, which is > > currently DOA because of conflicts with the new core code. We could > > either rip it out entirely, or try to convert it into a compatibility > > package. In view of the renamings of functions we agreed to do, I > > think there is some scope for a compatibility package, but I have no > > time to work on that. > > Probably, we could leave tsearch2 in contrib as a compat module and > explicitly define 8.4 will be the last release with tsearch2 support. Are you going to keep tools to compile stemmer from tsearch2 ? (I think about those files : http://developer.postgresql.org/cvsweb.cgi/pgsql/contrib/tsearch2/gendict/config.sh ) Or just let people use this method : http://momjian.us/expire/textsearch/HTML/textsearch-parser-example.html ? -- Cédric Villemain Administrateur de Base de Données Cel: +33 (0)6 74 15 56 53 http://dalibo.com - http://dalibo.org
Tom Lane wrote: > Applied, thanks. (Hm, I thought we had some buildfarm machines testing > VPATH builds these days? Guess not ...) > > > I have switched dungbeetle to use vpath. It's a one line config file change, and it builds every hour (3 a day for stable banches, remainder for HEAD) if there are changes on the relevant branch. cheers andrew
Merlin Moncure wrote: > On 8/21/07, Magnus Hagander <magnus@hagander.net> wrote: >> OTOH, if we do it as a compat package, we need to set a firm end-date on >> it, so we don't have to maintain it forever. .... > > I would suggest making a pgfoundry project...that's what was done with > userlocks. I'm pretty certain no one besides me has ever used the > wrappers I created...a lot more people use tsearch2 than userlocks > though. > Hmm.. In that case I'd think people should ask if anyone would use the tsearch2 compatibility layer before even doing pgfoundry. Speaking for myself, I expect migrating to the core text search APIs as something we'd do as part of our 8.3 migration even if such a compatibility layer existed.