Thread: RE: [HACKERS] Lex and things...

RE: [HACKERS] Lex and things...

From
"Ansley, Michael"
Date:
As far as I understand it, the MAX_PARSE_BUFFER limit only applies if char
parsestring[] is used, not if char *parsestring is used.  This is the whole
reason for using flex.  And scan.l is set up to compile using char
*parsestring, not char parsestring[].

The NAMEDATALEN limit is imposed by the db structure, and is the limit on the
length of an identifier.  Because identifiers are not actual data, I'm not too
concerned with this at the moment.  As long as we can get pretty much
unlimited data into the tuples, I don't care what I have to call my tables,
views, procedures, etc.
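
A minimal sketch of the identifier truncation described in the quoted text below, for illustration only. The value 32 for NAMEDATALEN and the helper name truncate_identifier are assumptions made for this example, not code taken from scan.l.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed value for illustration; the real constant lives in the
 * PostgreSQL headers and may differ. */
#define NAMEDATALEN 32

/*
 * Hypothetical helper: copy an identifier into dst (which must hold at
 * least NAMEDATALEN bytes), cutting it off so that at most
 * NAMEDATALEN - 1 characters survive before the terminating '\0'.
 * Returns the length of the stored identifier.
 */
static size_t truncate_identifier(char *dst, const char *src)
{
    size_t len;

    assert(dst != NULL && src != NULL);

    len = strlen(src);
    if (len >= NAMEDATALEN)
        len = NAMEDATALEN - 1;      /* silently drop the excess */

    memcpy(dst, src, len);
    dst[len] = '\0';
    return len;
}
```

Under these assumptions a long name is silently shortened rather than rejected, which is why the limit only affects what objects can be called and not how much data fits in a tuple.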

>> 
>> Ansley, Michael wrote:
>> 
>> > >> Hmm. There is something going on to remove fixed length limits
>> > >> entirely, maybe someone is already doing something to lexer in
>> > >> that respect? If no, I could look at what can be done there.
>> > Yes, me.  I've removed the query string limit from psql, libpq, and as
>> > much of the backend as I can see.  I have done some (very) preliminary
>> > testing, and managed to get a 95kB query to execute.  However, the two
>> > remaining problems that I have run into so far are token size (which you
>> > have just removed, many thanks ;-), 
>> 
>> I'm afraid not. There is an arbitrary limit (named NAMEDATALEN) 
>> in the lexer.
>> If an identifier exceeds it, it gets '\0' at that limit, so it is
>> effectively truncated. Strings are also limited by MAX_PARSE_BUFFER, which
>> in the end is something like QUERY_BUF_SIZE = 8k*2.
>> 
>> It seems that string literals are the primary target, because that is a
>> real-life constraint right now. This is not the case with hypothetical
>> huge identifiers. Should I work on it, or will you do it yourself?
>> 
>> > and string literals, which are limited, it seems
>> > to YY_BUF_SIZE (I think).
>> 
>> -- 
>> Leon.
>> 


Re: [HACKERS] Lex and things...

From
Leon
Date:
Ansley, Michael wrote:
> 
> As far as I understand it, the MAX_PARSE_BUFFER limit only applies if char
> parsestring[] is used, not if char *parsestring is used.  This is the whole
> reason for using flex.  And scan.l is set up to compile using char
> *parsestring, not char parsestring[].
> 

What is defined explicitly:

#ifdef  YY_READ_BUF_SIZE
#undef  YY_READ_BUF_SIZE
#endif
#define YY_READ_BUF_SIZE    MAX_PARSE_BUFFER

(these lines are repeated twice :)

...
char literal[MAX_PARSE_BUFFER];

...
<xq>{xqliteral} {
                if ((llen+yyleng) > (MAX_PARSE_BUFFER - 1))
                    elog(ERROR,"quoted string parse buffer of %d chars exceeded",MAX_PARSE_BUFFER);
                memcpy(literal+llen, yytext, yyleng+1);
                llen += yyleng;
            }
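
For comparison, here is a rough sketch of how a literal could be accumulated without a fixed-size array, by growing a heap buffer on demand. This is only an illustration of the general technique: the LiteralBuf type and literal_append function are hypothetical names invented for this example and do not exist in scan.l, and plain realloc stands in for whatever allocator the backend would actually use.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable replacement for "char literal[MAX_PARSE_BUFFER]". */
typedef struct
{
    char   *data;   /* NUL-terminated contents, or NULL while empty */
    size_t  len;    /* bytes stored, not counting the terminator */
    size_t  cap;    /* bytes allocated */
} LiteralBuf;

/*
 * Append n bytes of text to buf, doubling the allocation as needed.
 * Returns 0 on success and -1 if memory could not be obtained, in which
 * case buf is left unchanged.
 */
static int literal_append(LiteralBuf *buf, const char *text, size_t n)
{
    assert(buf != NULL && text != NULL);

    if (buf->len + n + 1 > buf->cap)
    {
        size_t  newcap = buf->cap ? buf->cap : 64;
        char   *p;

        while (newcap < buf->len + n + 1)
            newcap *= 2;

        p = realloc(buf->data, newcap);
        if (p == NULL)
            return -1;
        buf->data = p;
        buf->cap = newcap;
    }

    memcpy(buf->data + buf->len, text, n);
    buf->len += n;
    buf->data[buf->len] = '\0';
    return 0;
}
```

In a rule like the one above, the length check and memcpy would be replaced by a call along the lines of literal_append(&buf, yytext, yyleng), with an error raised only when allocation fails rather than at a compile-time size.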
 

Seems that limits are everywhere ;)

-- 
Leon.