Thread: Re: [PATCHES] Patch for more readable parse error messages
Jeroen van Vianen <jeroen@design.nl> writes:
>> Does this work with a non-bison parser? It looks mighty
>> bison-dependent to me...

> I'm not sure, but it probably is flex dependent (but Postgres always needed
> flex anyway). I'm not aware of any yacc / byacc / bison dependencies. Don't
> know if anybody has been successful building Postgres with another parser
> generator.

Um, you're right of course --- those are lexer not parser datastructures you're poking into. Sorry for my confusion.

We do in fact work with non-bison parser generators, or did last time I tried it (around 6.5 release). I would not like us to stop working with non-bison yaccs, since bison's output depends on alloca() which is not available everywhere.

I'm not sure about the situation with lexers. We have been saying for a long time that flex was required, but since we got rid of the scanner's use of trailing context ('/' rules) I think there is a better chance that it would work with vanilla lex. Anyone want to try that with current sources?

> BTW, as we ship flex's output lex.yy.c (as scan.c) and bison's output
> (gram.c) in the distribution, any user would be able to compile the
> sources, but if they want to start hacking the .l or .y files, they'll
> need appropriate tools.

Right. I am not aware of any portability problems with flex's output as there are with bison's, so it may be that the concern is moot. We may just be able to say "use the prebuilt scan.c or get flex; we don't care about supporting vendor lexes anymore".

I do see a potential problem with this patch that's not related to portability questions; it is that you're assuming that the lexer's furthest penetration into the source text is a good place to point at for parser errors. That may not be true always. In particular, I've been advocating solving some other problems by inserting a one-token lookahead buffer between the parser and the lexer. If that happens then you'd be off by (at least) one token in some cases.

I think the way that this sort of thing is customarily handled in "real" compilers is that each token carries along an indication of just where it was found in the source, and then error messages can finger the right place without making assumptions about synchronization between different phases of the scanning/parsing process. That might be more work than we can justify for SQL queries; not sure.

BTW, I think that the immediate problem of generating a good error message for unterminated comments and literals could be solved in other ways. This patch or something like it might be cool anyway, but you should bear in mind that printing out a query and then a marker that's supposed to line up with something in the query doesn't always work all that well. Consider a query that's several dozen lines long, such as a large table definition. If we had more control over the user interface and could highlight the offending token directly, I'd be more excited about doing something like this. (Actually, you could partially address that problem by only printing one line's worth of query text leading up to the error marker point. It would still be tricky to get it right in the presence of newlines, tabs, etc.)

regards, tom lane
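The point about tokens carrying their own source position, and about a one-token lookahead buffer letting the lexer run ahead of the parser, can be sketched in C as follows. This is only an illustration under stated assumptions, not code from the patch or from the PostgreSQL scanner: the Token and Scanner types and the functions scanner_init, raw_lex and next_token are hypothetical names, and the toy lexer merely splits words from punctuation.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token record: the kind plus where it was found. */
typedef enum { TOK_EOF, TOK_WORD, TOK_PUNCT } TokKind;

typedef struct {
    TokKind kind;
    size_t  start;   /* byte offset of the token's first character */
    size_t  len;     /* token length in bytes */
} Token;

typedef struct {
    const char *src;        /* query text being scanned */
    size_t      pos;        /* lexer's furthest penetration so far */
    Token       ahead;      /* one-token lookahead buffer */
    int         have_ahead; /* nonzero once the buffer is primed */
} Scanner;

static int is_word_char(char c)
{
    return isalnum((unsigned char) c) || c == '_';
}

/* Toy lexer: skips whitespace, then returns one word or one punctuation char. */
static Token raw_lex(Scanner *s)
{
    Token t;

    while (isspace((unsigned char) s->src[s->pos]))
        s->pos++;
    t.start = s->pos;
    if (s->src[s->pos] == '\0') {
        t.kind = TOK_EOF;
    } else if (is_word_char(s->src[s->pos])) {
        while (is_word_char(s->src[s->pos]))
            s->pos++;
        t.kind = TOK_WORD;
    } else {
        s->pos++;
        t.kind = TOK_PUNCT;
    }
    t.len = s->pos - t.start;
    return t;
}

void scanner_init(Scanner *s, const char *src)
{
    assert(s != NULL && src != NULL);
    s->src = src;
    s->pos = 0;
    s->have_ahead = 0;
}

/*
 * What the parser would call.  The buffered token is handed out and the
 * buffer is refilled at once, so the lexer is always one token ahead.
 */
Token next_token(Scanner *s)
{
    Token cur;

    if (!s->have_ahead) {
        s->ahead = raw_lex(s);
        s->have_ahead = 1;
    }
    cur = s->ahead;
    s->ahead = raw_lex(s);
    return cur;
}
```

With the query `SELECT * FORM tab`, by the time next_token() hands FORM to the parser the raw lexer has already consumed `tab`. A marker placed at the lexer's position would land past the misspelled keyword, while the start offset stored in the token still identifies it.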
Tom Lane wrote:
> We do in fact work with non-bison parser generators, or did last time
> I tried it (around 6.5 release). I would not like us to stop working
> with non-bison yaccs, since bison's output depends on alloca() which
> is not available everywhere.

I think GNU alloca should work on any platform because it's written in a portable way.
At 20:03 20-02-00 -0500, Tom Lane wrote:
>I do see a potential problem with this patch that's not related to
>portability questions; it is that you're assuming that the lexer's
>furthest penetration into the source text is a good place to point
>at for parser errors. That may not be true always. In particular,
>I've been advocating solving some other problems by inserting a
>one-token lookahead buffer between the parser and the lexer. If that
>happens then you'd be off by (at least) one token in some cases.

That's true, but the '*' indicator might at least indicate the approximate location of the error. I'm not aware of many (programming) languages that are able to indicate the error at the correct location all the time, anyway.

>I think the way that this sort of thing is customarily handled in
>"real" compilers is that each token carries along an indication of
>just where it was found in the source, and then error messages can
>finger the right place without making assumptions about synchronization
>between different phases of the scanning/parsing process. That might
>be more work than we can justify for SQL queries; not sure.

True, but requires a lot more work.

>BTW, I think that the immediate problem of generating a good error
>message for unterminated comments and literals could be solved in other
>ways. This patch or something like it might be cool anyway, but you
>should bear in mind that printing out a query and then a marker that's
>supposed to line up with something in the query doesn't always work
>all that well. Consider a query that's several dozen lines long,
>such as a large table definition. If we had more control over the
>user interface and could highlight the offending token directly,
>I'd be more excited about doing something like this. (Actually, you
>could partially address that problem by only printing one line's worth
>of query text leading up to the error marker point. It would still be
>tricky to get it right in the presence of newlines, tabs, etc.)

I try to make a good guess at where the location of the error is, but am hesitant to only print a few tokens near the error locations, as you won't be able to know where the error was found in complex queries or table definitions. Please try with more complex queries and tell me what you think.

Jeroen
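The idea of printing only the line that contains the error, with a marker underneath, can be sketched in C as below. This is a hedged illustration rather than the code in the patch: format_error_line and its parameters are hypothetical, and it assumes the error position is a byte offset into the query text. Tabs are copied into the marker line so that the '*' lines up whatever tab width the terminal uses; multibyte encodings, where bytes and display columns differ, are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Write into out the single line of query that contains byte offset
 * errpos, a newline, and a marker line with '*' under that offset.
 * Returns 0 on success, -1 if out is too small.
 */
int format_error_line(const char *query, size_t errpos,
                      char *out, size_t outsize)
{
    size_t qlen, start, end, i, n;

    assert(query != NULL && out != NULL);

    qlen = strlen(query);
    if (errpos > qlen)
        errpos = qlen;          /* clamp to end of text */

    /* Find the start and end of the line holding errpos. */
    start = errpos;
    while (start > 0 && query[start - 1] != '\n')
        start--;
    end = errpos;
    while (query[end] != '\0' && query[end] != '\n')
        end++;

    /* Needed: line + '\n' + padding + '*' + terminating NUL. */
    if ((end - start) + 1 + (errpos - start) + 1 + 1 > outsize)
        return -1;

    memcpy(out, query + start, end - start);
    n = end - start;
    out[n++] = '\n';

    /* Reproduce tabs so the marker keeps the same alignment as the text. */
    for (i = start; i < errpos; i++)
        out[n++] = (query[i] == '\t') ? '\t' : ' ';
    out[n++] = '*';
    out[n] = '\0';
    return 0;
}
```

For a multi-line table definition this prints just the offending line and its marker. Whether that gives enough context in complex queries is exactly the concern raised above, so a caller might also want to report the line number.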
Re: [HACKERS] Re: [PATCHES] Patch for more readable parse error messages
From: Peter Eisentraut
On 2000-02-20, Tom Lane mentioned:

> I would not like us to stop working
> with non-bison yaccs, since bison's output depends on alloca() which
> is not available everywhere.

Couldn't alloca(x) be defined to palloc(x) where missing? The performance will be worse, but it ought to work.

-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden
Peter Eisentraut <peter_e@gmx.net> writes:
> On 2000-02-20, Tom Lane mentioned:
>> I would not like us to stop working
>> with non-bison yaccs, since bison's output depends on alloca() which
>> is not available everywhere.

> Couldn't alloca(x) be defined to palloc(x) where missing?

Probably, but I wasn't looking for a workaround; that was just one quick illustration of a reason not to want to use bison (one that's bitten me personally, so I knew it offhand). We should try not to become dependent on bison when there are near-equivalent tools, just on general principles of maintaining portability. For an analogy, I believe most of the developers use gcc, but it would be a real bad idea for us to abandon support for other compilers.

For the same sort of reasons I'd prefer that our scanner worked with vanilla lex, not just flex. I'm not sure how far away we are from that; it may be an unrealistic goal. But if it is within reach then we shouldn't give it up lightly.

regards, tom lane
> Peter Eisentraut <peter_e@gmx.net> writes:
> > On 2000-02-20, Tom Lane mentioned:
> >> I would not like us to stop working
> >> with non-bison yaccs, since bison's output depends on alloca() which
> >> is not available everywhere.
>
> > Couldn't alloca(x) be defined to palloc(x) where missing?
>
> Probably, but I wasn't looking for a workaround; that was just one
> quick illustration of a reason not to want to use bison (one that's
> bitten me personally, so I knew it offhand). We should try not to
> become dependent on bison when there are near-equivalent tools, just
> on general principles of maintaining portability. For an analogy,
> I believe most of the developers use gcc, but it would be a real bad
> idea for us to abandon support for other compilers.

But I don't see non-bison solutions for finding the location of errors. Is it possible? Could we enable the feature just for bison?

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
At 10:56 PM 2/21/00 -0500, Tom Lane wrote:

>Probably, but I wasn't looking for a workaround; that was just one
>quick illustration of a reason not to want to use bison (one that's
>bitten me personally, so I knew it offhand). We should try not to
>become dependent on bison when there are near-equivalent tools, just
>on general principles of maintaining portability. For an analogy,
>I believe most of the developers use gcc, but it would be a real bad
>idea for us to abandon support for other compilers.
>
>For the same sort of reasons I'd prefer that our scanner worked
>with vanilla lex, not just flex. I'm not sure how far away we are
>from that; it may be an unrealistic goal. But if it is within reach
>then we shouldn't give it up lightly.

I agree entirely with the above. The more portable the tool, the larger the potential user base. Unless the goal is to bundle up Postgres with a pre-defined set of software, i.e. GNU in this case (despite the fact that I don't see Postgres on their site as part of their list of open-source software, and I think I looked twice), go for the cover-the-earth approach.

SQL syntax isn't particularly difficult. On the other hand, I realize there's a legacy to support. Still, making portions of the product dependent on one tool or another is an issue that merits close scrutiny. It shouldn't be done except for compelling reasons.

I mean, presuming a reasonably modern C, C tools, and large-scale operating-system environment makes sense (no reason to run native on a palm pilot, at this point). But unnecessary dependence on particular tools doesn't make much sense.

Just IMO, of course.

- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest
  Rare Bird Alert Service and other goodies at
  http://donb.photo.net.
Re: [HACKERS] Re: [PATCHES] Patch for more readable parse error messages
From: Peter Eisentraut
On Mon, 21 Feb 2000, Tom Lane wrote:

> For the same sort of reasons I'd prefer that our scanner worked
> with vanilla lex, not just flex. I'm not sure how far away we are
> from that; it may be an unrealistic goal. But if it is within reach
> then we shouldn't give it up lightly.

I concur. Somewhere in between vanilla lex and flex is also POSIX lex, which does support exclusive start conditions but no <<EOF>>.

Anyone for getting rid of GNU make?

-- 
Peter Eisentraut                  Sernanders vaeg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden
Peter Eisentraut <e99re41@DoCS.UU.SE> writes:
> ... Somewhere in between vanilla lex and flex is also POSIX lex,
> which does support exclusive start conditions but no <<EOF>>.

I noticed that in the flex manual. Does it help us any? That is, are there a useful number of lexes out there that do the full POSIX spec? If flex is our only real choice for exclusive start conditions then it's pointless to avoid <<EOF>>.

> Anyone for getting rid of GNU make?

No ;-). GNU make has enough important features that there is no near-equivalent non-GNU make. VPATH, for example. One thing I hope we will be able to do sometime soon is build in an object directory tree separate from the source tree... can't realistically do that with any non-GNU make that I've heard of.

regards, tom lane
Re: [HACKERS] Re: [PATCHES] Patch for more readable parse error messages
From: Peter Eisentraut
On 2000-02-22, Tom Lane mentioned:

> > Anyone for getting rid of GNU make?
>
> No ;-). GNU make has enough important features that there is no
> near-equivalent non-GNU make. VPATH, for example.

There are other makes that support this too. While I love GNU make, too, all the talk about allowing vanilla lex, etc. is pointless while GNU make is required. Users don't see lex at all, they do see make.

OTOH, it is very hard for me to get an overview these days what's actually out there in terms of other make's, other lex's, other yacc's, other compilers. You should have an edge there (HPUX and all). Most installations of commercial Unix vendors I get to nowadays use gcc, gmake, flex as system tools. Yesterday I read that Sun builds Java proper with GNU make!

The best way of going about this seems to take one of the perpetrators (make file, gram.y, etc.) and try to port it to some given non-GNU tool and take a look at the consequences. For example, if we get PostgreSQL to compile with FreeBSD's make without crippling everything, that would be a win for the user base. This may in fact be the first experiment.

> One thing I hope we will be able to do sometime soon is build in an
> object directory tree separate from the source tree... can't
> realistically do that with any non-GNU make that I've heard of.

I'm planning to work on that for 7.1. But here's an interesting tidbit: Automake does support this feature but in its manual it claims that it does not use any GNU make specific features. And in fact, VPATH exists in both System V's and 4.3 BSD's make.

-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden
Peter Eisentraut <peter_e@gmx.net> writes:
> On 2000-02-22, Tom Lane mentioned:
>>>> Anyone for getting rid of GNU make?
>>
>> No ;-). GNU make has enough important features that there is no
>> near-equivalent non-GNU make. VPATH, for example.

> There are other makes that support this too. While I love GNU make, too,
> all the talk about allowing vanilla lex, etc. is pointless while GNU make
> is required. Users don't see lex at all, they do see make.

Huh? Assuming someone will have program X installed is not the same as assuming they will have program Y installed. In this particular case, a more exact way of putting it is that assuming program X is installed is not the same as assuming that program Y's prebuilt-on-another-machine output is usable on this platform.

> OTOH, it is very hard for me to get an overview these days what's actually
> out there in terms of other make's, other lex's, other yacc's, other
> compilers.

Not much. The real problem here is "what set of tool features do you assume you have, and what's it costing you in portability?" GNU make provides a very rich feature set that's widely portable, although you do have to port the particular implementation. If you don't want to assume GNU make but just a generic make, there's a big gap in features before you drop down to what's actually portable to a wide class of vendor-provided makes. VPATH, for example, does exist in *some* vendor makes, but as a practical matter if you use it then you'd better tell people "my program requires GNU make". It's not worth the trouble to keep track of the exceptions.

I will be the first to admit this is all a matter of judgment calls rather than certainties. As far as I can see, it's not worth our trouble to try to operate with non-GNU makes; it is worth the trouble to work with non-GNU yaccs, because we're not really using any bison-specific features; it's looking like we should forget about non-GNU lexes, but I'm not quite convinced yet. You're free to hold different opinions of course. I've been around for a few years in the portable-software game, so I tend to think I know where the minefields are, but perhaps my hard experiences are out of date.

> The best way of going about this seems to take one of the perpetrators
> (make file, gram.y, etc.) and try to port it to some given non-GNU tool
> and take a look at the consequences.

But that only tells you about the one tool; in fact, only about the one version of the one tool that you test. In practice, useful knowledge in this area comes from the school of hard knocks: ship an open-source program and see what complaints you get. I'd rather rely on experience previously gained than learn those lessons again...

>> One thing I hope we will be able to do sometime soon is build in an
>> object directory tree separate from the source tree... can't
>> realistically do that with any non-GNU make that I've heard of.

> I'm planning to work on that for 7.1. But here's an interesting tidbit:
> Automake does support this feature but in its manual it claims that it
> does not use any GNU make specific features.

Yeah? Do they claim not to need VPATH to do it? I suppose it might be possible, if they are willing to write sufficiently ugly and non-hand-maintainable makefiles. Not sure that's a good tradeoff though.

> And in fact, VPATH exists in both System V's and 4.3 BSD's make.

You're still confusing two datapoints with the wide world...

regards, tom lane
GNU make (Re: [HACKERS] Re: [PATCHES] Patch for more readable parse error messages)
From: Peter Eisentraut
On Wed, 23 Feb 2000, Tom Lane wrote:

> > And in fact, VPATH exists in both System V's and 4.3 BSD's make.
>
> You're still confusing two datapoints with the wide world...

I challenge everyone to show me a make without VPATH. In fact, show me two makes without a feature that you can't live without, and I shall forever hold my peace.

It's certainly easier to say "let's support yacc, because we actually don't use any non-yacc features" than saying it for make. But it's not the idea to say "we need GNU make because it has all these features" when 93% of these features in fact exist in all other reasonable makes as well. It's not the end of the world but it's something that shouldn't be ignored.

-- 
Peter Eisentraut                  Sernanders vaeg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden
Re: GNU make (Re: [HACKERS] Re: [PATCHES] Patch for more readable parse error messages)
From: Tom Lane
Peter Eisentraut <e99re41@DoCS.UU.SE> writes:
> On Wed, 23 Feb 2000, Tom Lane wrote:
>>>> And in fact, VPATH exists in both System V's and 4.3 BSD's make.
>>
>> You're still confusing two datapoints with the wide world...

> I challenge everyone to show me a make without VPATH. In fact, show me two
> makes without a feature that you can't live without, and I shall forever
> hold my peace.

Out of the four systems I have easy access to: HPUX 10, HPUX 9, Linux (some fairly old RedHat version), and SunOS 4.1.4, two have makes without VPATH ... and Linux doesn't really count since it's using gmake anyway.

Now you can argue that HPUX 9 and SunOS 4.1.4 are dinosaurs that should be put out of their misery, and I wouldn't disagree --- but reality is that a lot of people are running older systems and don't have the time or interest to upgrade 'em. "Portability" doesn't mean "portability to the newest and most standards-conformant systems", it means portability to what's actually out there.

> it's not the idea to say "we need GNU make because it has all these
> features" when 93% of these features in fact exist in all other reasonable
> makes as well.

If I thought we were anywhere near that close to being able to use old makes, I'd be arguing for removing the GNU-make dependency too. But I don't think it's going to be practical...

regards, tom lane
Re: GNU make (Re: [HACKERS] Re: [PATCHES] Patch for more readable parse error messages)
From: Peter Eisentraut
On Wed, 23 Feb 2000, Tom Lane wrote:

> Peter Eisentraut <e99re41@DoCS.UU.SE> writes:
> > I challenge everyone to show me a make without VPATH. In fact, show me two
> > makes without a feature that you can't live without, and I shall forever
> > hold my peace.
>
> Out of the four systems I have easy access to: HPUX 10, HPUX 9, Linux
> (some fairly old RedHat version), and SunOS 4.1.4, two have makes
> without VPATH

You win. ;)

I surveyed several machines as well (Solaris, IRIX, FreeBSD, HPUX) which all had this feature. I feel better now with actual data points, I hope that's fair enough.

-- 
Peter Eisentraut                  Sernanders vaeg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden
Here is more information about it.

> Jeroen van Vianen <jeroen@design.nl> writes:
> >> Does this work with a non-bison parser? It looks mighty
> >> bison-dependent to me...
>
> > I'm not sure, but it probably is flex dependent (but Postgres always needed
> > flex anyway). I'm not aware of any yacc / byacc / bison dependencies. Don't
> > know if anybody has been successful building Postgres with another parser
> > generator.
>
> Um, you're right of course --- those are lexer not parser datastructures
> you're poking into. Sorry for my confusion.
>
> We do in fact work with non-bison parser generators, or did last time
> I tried it (around 6.5 release). I would not like us to stop working
> with non-bison yaccs, since bison's output depends on alloca() which
> is not available everywhere.
>
> I'm not sure about the situation with lexers. We have been saying for
> a long time that flex was required, but since we got rid of the
> scanner's use of trailing context ('/' rules) I think there is a better
> chance that it would work with vanilla lex. Anyone want to try that
> with current sources?
>
> > BTW, as we ship flex's output lex.yy.c (as scan.c) and bison's output
> > (gram.c) in the distribution, any user would be able to compile the
> > sources, but if they want to start hacking the .l or .y files, they'll
> > need appropriate tools.
>
> Right. I am not aware of any portability problems with flex's output
> as there are with bison's, so it may be that the concern is moot.
> We may just be able to say "use the prebuilt scan.c or get flex; we
> don't care about supporting vendor lexes anymore".
>
> I do see a potential problem with this patch that's not related to
> portability questions; it is that you're assuming that the lexer's
> furthest penetration into the source text is a good place to point
> at for parser errors. That may not be true always. In particular,
> I've been advocating solving some other problems by inserting a
> one-token lookahead buffer between the parser and the lexer. If that
> happens then you'd be off by (at least) one token in some cases.
>
> I think the way that this sort of thing is customarily handled in
> "real" compilers is that each token carries along an indication of
> just where it was found in the source, and then error messages can
> finger the right place without making assumptions about synchronization
> between different phases of the scanning/parsing process. That might
> be more work than we can justify for SQL queries; not sure.
>
> BTW, I think that the immediate problem of generating a good error
> message for unterminated comments and literals could be solved in other
> ways. This patch or something like it might be cool anyway, but you
> should bear in mind that printing out a query and then a marker that's
> supposed to line up with something in the query doesn't always work
> all that well. Consider a query that's several dozen lines long,
> such as a large table definition. If we had more control over the
> user interface and could highlight the offending token directly,
> I'd be more excited about doing something like this. (Actually, you
> could partially address that problem by only printing one line's worth
> of query text leading up to the error marker point. It would still be
> tricky to get it right in the presence of newlines, tabs, etc.)
>
> regards, tom lane
>
> ************
>

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026