Thread: RE: [HACKERS] Lex and things...

RE: [HACKERS] Lex and things...

From
"Ansley, Michael"
Date:
>> > Shot, Leon.  The patch removes the #define YY_USES_REJECT from scan.c, which
>> > means we now have expandable tokens.  Of course, it also removes the
>> > scanning of "embedded minuses", which apparently causes the optimizer to
>> > unoptimize a little.
>> 
>> Oh, no. Unary minus reaches the grammar parser and is recognized there
>> as such. Then, for numeric constants, it becomes an *embedded* minus in
>> the function doNegate. So after parsing, a unary minus on a numeric
>> constant is an embedded minus, just as it was before the patch. In other
>> words, I can see no change in what the grammar produces after patching.
Great.
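
For readers following along, the folding Leon describes can be sketched roughly as below. This is only an illustration under stated assumptions: the Node type, the constructors and negate_sketch are hypothetical stand-ins, not the real parse-tree structures or the actual doNegate. The idea shown is that the grammar action for a unary minus negates a numeric constant in place and builds an operator node only for anything else.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified parse-tree node; not the real backend structure. */
typedef enum {
    NODE_INT_CONST,
    NODE_FLOAT_CONST,
    NODE_OTHER,        /* any non-constant expression, e.g. a column reference */
    NODE_UNARY_MINUS
} NodeKind;

typedef struct Node {
    NodeKind     kind;
    long         ival;     /* used when kind == NODE_INT_CONST */
    double       dval;     /* used when kind == NODE_FLOAT_CONST */
    struct Node *operand;  /* used when kind == NODE_UNARY_MINUS */
} Node;

Node *make_node(NodeKind kind)
{
    Node *n = calloc(1, sizeof(Node));
    assert(n != NULL);     /* sketch only: no real error handling */
    n->kind = kind;
    return n;
}

Node *make_int_const(long v)
{
    Node *n = make_node(NODE_INT_CONST);
    n->ival = v;
    return n;
}

Node *make_float_const(double v)
{
    Node *n = make_node(NODE_FLOAT_CONST);
    n->dval = v;
    return n;
}

/*
 * What a grammar action for "'-' expr" might call (assumed name).
 * Numeric constants are negated in place, so the minus ends up
 * "embedded" in the constant; everything else gets an operator node.
 */
Node *negate_sketch(Node *n)
{
    if (n->kind == NODE_INT_CONST) {
        n->ival = -n->ival;
        return n;
    }
    if (n->kind == NODE_FLOAT_CONST) {
        n->dval = -n->dval;
        return n;
    }
    Node *op = make_node(NODE_UNARY_MINUS);
    op->operand = n;
    return op;
}
```

With this shape the lexer never has to recognize "-5" as one token; it can hand over "-" and "5" separately and the tree still ends up holding a single constant.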
>> 
>> > However, the next step is attacking the limit on the
>> > size of string literals.  These seemed to be wired to YY_BUF_SIZE, or
>> > something.  Is there any reason for this?
>> 
>> Hmm. There is work going on to remove fixed-length limits
>> entirely; maybe someone is already doing something to the lexer in
>> that respect? If not, I could look at what can be done there.
Yes, me.  I've removed the query string limit from psql, libpq, and as much
of the backend as I can see.  I have done some (very) preliminary testing,
and managed to get a 95kB query to execute.  However, the two remaining
problems that I have run into so far are token size (which you have just
removed, many thanks ;-), and string literals, which are limited, it seems,
to YY_BUF_SIZE (I think).

You see, if I can get the query string limit removed, perhaps someone who
knows a bit more than I do will do something like, hmmm, say, remove the
block size limit from tuple size... hint, hint... anybody...

MikeA


>> 
>> -- 
>> Leon.
>> 
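
As a rough illustration of what lifting the string literal limit could look like, here is a minimal sketch of a buffer that grows as the scanner feeds it pieces of a literal. Everything in it is an assumption made for the example: LitBuf, litbuf_init, litbuf_append and litbuf_free are hypothetical names, the starting capacity of 16 bytes is arbitrary, and none of it is taken from scan.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer for accumulating one string literal. */
typedef struct {
    char  *data;  /* always NUL-terminated */
    size_t len;   /* bytes stored, not counting the terminator */
    size_t cap;   /* bytes allocated */
} LitBuf;

/* Returns 0 on success, -1 if memory could not be allocated. */
int litbuf_init(LitBuf *b)
{
    assert(b != NULL);
    b->cap = 16;               /* arbitrary starting size */
    b->len = 0;
    b->data = malloc(b->cap);
    if (b->data == NULL)
        return -1;
    b->data[0] = '\0';
    return 0;
}

/*
 * Append n bytes of scanned text, doubling the allocation until it fits.
 * Returns 0 on success, -1 if memory could not be allocated.
 */
int litbuf_append(LitBuf *b, const char *chunk, size_t n)
{
    assert(b != NULL && b->data != NULL);
    if (b->len + n + 1 > b->cap) {
        size_t newcap = b->cap;
        while (b->len + n + 1 > newcap)
            newcap *= 2;
        char *p = realloc(b->data, newcap);
        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = newcap;
    }
    memcpy(b->data + b->len, chunk, n);
    b->len += n;
    b->data[b->len] = '\0';
    return 0;
}

void litbuf_free(LitBuf *b)
{
    free(b->data);
    b->data = NULL;
    b->len = 0;
    b->cap = 0;
}
```

A scanner rule for the body of a quoted string would call something like litbuf_append with the matched text and its length each time it matches a piece, so the literal is bounded only by available memory rather than by a compile-time constant.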


Re: [HACKERS] Lex and things...

From
Leon
Date:
Ansley, Michael wrote:

> >> Hmm. There is work going on to remove fixed-length limits
> >> entirely; maybe someone is already doing something to the lexer in
> >> that respect? If not, I could look at what can be done there.
> Yes, me.  I've removed the query string limit from psql, libpq, and as much
> of the backend as I can see.  I have done some (very) preliminary testing,
> and managed to get a 95kB query to execute.  However, the two remaining
> problems that I have run into so far are token size (which you have just
> removed, many thanks ;-), 

I'm afraid not. There is an arbitrary limit (named NAMEDATALEN) in the lexer.
If an identifier exceeds it, it gets a '\0' at that limit, so it is
effectively truncated. Strings are also limited by MAX_PARSE_BUFFER, which
ultimately is something like QUERY_BUF_SIZE = 8k*2.

Seems that string literals are the primary target, because they are a
real-life constraint right now. That is not the case with hypothetical
huge identifiers. Should I work on it, or will you do it yourself?

> and string literals, which are limited, it seems,
> to YY_BUF_SIZE (I think).

-- 
Leon.
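
The identifier truncation described above can be sketched as follows. This is a sketch under stated assumptions, not the lexer's actual code: truncate_identifier and NAMEDATALEN_SKETCH are hypothetical names, the value 32 is assumed for the example, and placing the terminator at index limit - 1 (so that the name plus its '\0' fits in a field of that size) is also an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed value for illustration; the real constant is NAMEDATALEN. */
#define NAMEDATALEN_SKETCH 32

/*
 * Cut an identifier down in place so that it, plus its terminating
 * '\0', fits in NAMEDATALEN_SKETCH bytes. Returns the resulting length.
 */
size_t truncate_identifier(char *ident)
{
    assert(ident != NULL);
    size_t len = strlen(ident);
    if (len >= NAMEDATALEN_SKETCH) {
        ident[NAMEDATALEN_SKETCH - 1] = '\0';  /* silently drop the tail */
        len = NAMEDATALEN_SKETCH - 1;
    }
    return len;
}
```

Note that the cut is silent: two long identifiers that differ only after the limit would become the same name.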



Re: [HACKERS] Lex and things...

From
Adriaan Joubert
Date:
> I'm afraid not. There is an arbitrary limit (named NAMEDATALEN) in the lexer.
> If an identifier exceeds it, it gets a '\0' at that limit, so it is
> effectively truncated. Strings are also limited by MAX_PARSE_BUFFER, which
> ultimately is something like QUERY_BUF_SIZE = 8k*2.

I think NAMEDATALEN refers to the size of a NAME field in the database,
which is used to store attribute names etc. So you cannot exceed
NAMEDATALEN, or the identifier won't fit into the system tables.

Adriaan


Re: [HACKERS] Lex and things...

From
Leon
Date:
Adriaan Joubert wrote:

> I think NAMEDATALEN refers to the size of a NAME field in the database,
> which is used to store attribute names etc. So you cannot exceed
> NAMEDATALEN, or the identifier won't fit into the system tables.

Ok. Let's leave identifiers alone.

-- 
Leon.