Thread: Read-only plan trees

Read-only plan trees

From
Tom Lane
Date:
Han Holl's recent complaint about memory leaks in SQL-language functions
has started me thinking again about making plan trees read-only to the
executor.  This would make it a lot easier to manage memory cleanly in
the SQL function executor, and would eliminate a lot of plan tree
copying that currently goes on in plpgsql, prepared queries, etc.

Basically, instead of having plan tree nodes point to associated
executor state nodes, we should turn that around: executor state should
point to plan nodes.  Executor startup should build a state-node tree
that exactly parallels the plan tree, and *all* data that is changed by
the executor should live in that tree.  We can build this tree in a
memory context that is made to have query lifetime.  At executor
shutdown, rather than individually pfree'ing lots of stuff (and having
memory leaks wherever we forget), we can just delete the query memory
context.

This is a nontrivial task, and so I plan to tackle it in several stages.

Step 1: restructure plannodes.h and execnodes.h so that there is an
executor state tree with entries for each "plan node".  This tree will
be built recursively during ExecInitNode() --- you pass it a plan tree,
and it returns a state tree that links to the plan tree nodes.
ExecutorRun then needs only a pointer to the state tree.

Step 2: similarly restructure trees for expressions (quals and
targetlists).  Currently we do not explicitly build a state tree for
expressions --- the objects that ought to be in this tree are the
"fcache" entries that are attached to OP_EXPR and FUNC_EXPR nodes in
an expression plan tree.  The fcache objects really need to be in the
executor's context however, and the cleanest way to make that happen
seems to be to build a state tree paralleling the expression plan tree.

But this is slightly inefficient, since there would be many nodes in the
expression state trees that aren't doing anything very useful, ie, all
the ones that correspond to nodes other than OP and FUNC in the plan
tree.

An alternative approach would be to make it work somewhat like Params
do now: in each OP and FUNC node, put an integer index field to replace
the current fcache pointer.  The planner would be responsible for
assigning sequential index values to every OP and FUNC in a plan, and
storing the total number of 'em in the plan's top node.  Then at
runtime, the executor would allocate an array of that many fcache
structs which it'd store in the EState for the plan.  Execution of
an individual op or func would index into the EState to find the fcache.

Either of these approaches would mean that we couldn't easily "just
execute" a scalar expression tree, which is something that we do in
quite a few places (constraint checking for instance).  There would need
to be some advance setup done.  With the Param-style approach, the
advance setup would not be read-only on the expression plan tree ...
which seems like a bad idea, so I'm leaning towards building the more
expensive data structure.

Step 3: only after all the above spadework is done could we actually set
up a query-lifetime memory context and build the executor's state in it.

Comments?
        regards, tom lane


Re: Read-only plan trees

From
Joe Conway
Date:
Tom Lane wrote:
> Either of these approaches would mean that we couldn't easily "just
> execute" a scalar expression tree, which is something that we do in
> quite a few places (constraint checking for instance).  There would need
> to be some advance setup done.  With the Param-style approach, the
> advance setup would not be read-only on the expression plan tree ...
> which seems like a bad idea, so I'm leaning towards building the more
> expensive data structure.

Even though the former is a bit more expensive, it sounds like it is still a 
net win due to reduced/eliminated need for making plan tree copies, right? It 
sounds like it is also simpler and easier to maintain.

> Step 3: only after all the above spadework is done could we actually set
> up a query-lifetime memory context and build the executor's state in it.
> 
> Comments?

Sounds like a great plan. Let me know if there's anything I can do to help.

Joe




Re: Read-only plan trees

From
Tom Lane
Date:
Joe Conway <mail@joeconway.com> writes:
> Tom Lane wrote:
>> Either of these approaches would mean that we couldn't easily "just
>> execute" a scalar expression tree, which is something that we do in
>> quite a few places (constraint checking for instance).  There would need
>> to be some advance setup done.  With the Param-style approach, the
>> advance setup would not be read-only on the expression plan tree ...
>> which seems like a bad idea, so I'm leaning towards building the more
>> expensive data structure.

> Even though the former is a bit more expensive, it sounds like it is still a 
> net win due to reduced/eliminated need for making plan tree copies,
> right?

I think it will be a net win compared to our current code, because we
can save copying whole plan trees in a number of places.  But I was
wondering if further improvement is possible.

Another reason not to go with the fcache-array approach is that it does not
help with storing executor-state data for anything except op/func nodes.
I am not sure offhand whether we need any for other expression node
types, but it's sure a reasonably likely future possibility.

> Sounds like a great plan. Let me know if there's anything I can do to help.

Right at the moment I'm struggling a bit with terminology.  We've got
basically four categories of node types to deal with in this scheme:
        Plan steps        Expressions        (Scan, Sort, etc)    (Var, Op, Func, etc)

Planner output        "Plan"            "Expr"?

Executor state        "CommonState"        ???

The existing Plan-category nodes are all derived from nodetype Plan,
so that seems reasonably well set.  The existing executor state nodes
for Plan nodes are all derived from CommonState, but that seems like
a name that conveys hardly anything.  The existing expression-category
nodes do *not* have any common substructure, and don't seem to need any.
I'm not thrilled about using Expr as a generic term for them, but am not
sure what else to write.  (I'm also finding it confusing whether "plan
node" means "any node in a tree output by the planner" --- which would
then include expression nodes --- or just nodes that correspond to major
steps in the query pipeline --- which is the present usage.)  And what
about a generic term for execution state nodes for expression nodes?

Any ideas about naming are welcome.
        regards, tom lane


Re: Read-only plan trees

From
Joe Conway
Date:
Tom Lane wrote:
> Right at the moment I'm struggling a bit with terminology.  We've got
> basically four categories of node types to deal with in this scheme:
> 
>             Plan steps        Expressions
>             (Scan, Sort, etc)    (Var, Op, Func, etc)
> 
> Planner output        "Plan"            "Expr"?
> 
> Executor state        "CommonState"        ???
> 
> The existing Plan-category nodes are all derived from nodetype Plan,
> so that seems reasonably well set.  The existing executor state nodes
> for Plan nodes are all derived from CommonState, but that seems like
> a name that conveys hardly anything.  The existing expression-category
> nodes do *not* have any common substructure, and don't seem to need any.
> I'm not thrilled about using Expr as a generic term for them, but am not
> sure what else to write.  (I'm also finding it confusing whether "plan
> node" means "any node in a tree output by the planner" --- which would
> then include expression nodes --- or just nodes that correspond to major
> steps in the query pipeline --- which is the present usage.)  And what
> about a generic term for execution state nodes for expression nodes?
> 
> Any ideas about naming are welcome.

Maybe:        Plan steps        Expressions        -----------------    --------------------
Planner output        "Plan"            "Expr"
Executor state        "PlanState"        "ExprState"

I think "Plan node" should only refer to nodes literally derived from nodetype 
Plan. Similarly with "PlanState nodes".

Joe



Re: Read-only plan trees

From
Tom Lane
Date:
Joe Conway <mail@joeconway.com> writes:
>> Any ideas about naming are welcome.

> Maybe:
>             Plan steps        Expressions
>             -----------------    --------------------
> Planner output        "Plan"            "Expr"
> Executor state        "PlanState"        "ExprState"

> I think "Plan node" should only refer to nodes literally derived from
> nodetype Plan. Similarly with "PlanState nodes".

That part works for me.  The other part isn't quite right since most
expression-class nodes don't inherit from Expr, and their state nodes
certainly don't need an fcache.

But come to think of it, we don't need an fcache for AND/OR/NOT nodes,
and SUBPLAN has different needs altogether.  I wonder if it's time to
split the Expr node class into three or so classes: op/func, boolean,
and subplan.  If we did that, we could use the Expr struct name for the
superclass of all expression-type nodes (since it'd contain only
NodeTag, it'd be a purely decorative superclass) and then ExprState
works as the name of the associated superclass of expression-state nodes
(only slightly less decorative, it'd contain NodeTag and the "Expr *"
link to the associated expression node).  The existing FunctionCache
struct would then become part of the ExprState subclass that's
associated with the op/func Expr subclass.  This seems like it works...
        regards, tom lane


Patch to make Turks happy.

From
Nicolai Tufar
Date:
Hi,

Yet another problem with Turkish encoding. clean_encoding_name()
in src/backend/utils/mb/encnames.c uses tolower() to convert locale
names to lower-case. This causes errors if locale name contains
capital "I" and current olcale is Turkish. Some examples:

aaa=# \l
      List of databases
   Name    | Owner | Encoding
-----------+-------+----------
 aaa       | pgsql | LATIN5
 bbb       | pgsql | LATIN5
 template0 | pgsql | LATIN5
 template1 | pgsql | LATIN5
(4 rows)
aaa=# CREATE DATABASE ccc ENCODING='LATIN5';
ERROR:  LATIN5 is not a valid encoding name
aaa=# \encoding
SQL_ASCII
aaa=# \encoding SQL_ASCII
SQL_ASCII: invalid encoding name or conversion procedure not found
aaa=# \encoding LATIN5
LATIN5: invalid encoding name or conversion procedure not found


Patch, is a simple change to use ASCII-only lower-case conversion
instead of locale-dependent tolower()

Best regards,
Nic.






*** ./src/backend/utils/mb/encnames.c.orig    Mon Dec  2 15:58:49 2002
--- ./src/backend/utils/mb/encnames.c    Mon Dec  2 18:13:23 2002
***************
*** 407,413 ****
      for (p = key, np = newkey; *p != '\0'; p++)
      {
          if (isalnum((unsigned char) *p))
!             *np++ = tolower((unsigned char) *p);
      }
      *np = '\0';
      return newkey;
--- 407,416 ----
      for (p = key, np = newkey; *p != '\0'; p++)
      {
          if (isalnum((unsigned char) *p))
!             if (*p >= 'A' && *p <= 'Z')
!                 *np++ = *p + 'a' - 'A';
!             else
!                 *np++ = *p;
      }
      *np = '\0';
      return newkey;


Re: Patch to make Turks happy.

From
Bruce Momjian
Date:
I am not going to apply this patch because I think it will mess up the
handling of other locales.


---------------------------------------------------------------------------

Nicolai Tufar wrote:
> Hi,
>
> Yet another problem with Turkish encoding. clean_encoding_name()
> in src/backend/utils/mb/encnames.c uses tolower() to convert locale
> names to lower-case. This causes errors if locale name contains
> capital "I" and current olcale is Turkish. Some examples:
>
> aaa=# \l
>       List of databases
>    Name    | Owner | Encoding
> -----------+-------+----------
>  aaa       | pgsql | LATIN5
>  bbb       | pgsql | LATIN5
>  template0 | pgsql | LATIN5
>  template1 | pgsql | LATIN5
> (4 rows)
> aaa=# CREATE DATABASE ccc ENCODING='LATIN5';
> ERROR:  LATIN5 is not a valid encoding name
> aaa=# \encoding
> SQL_ASCII
> aaa=# \encoding SQL_ASCII
> SQL_ASCII: invalid encoding name or conversion procedure not found
> aaa=# \encoding LATIN5
> LATIN5: invalid encoding name or conversion procedure not found
>
>
> Patch, is a simple change to use ASCII-only lower-case conversion
> instead of locale-dependent tolower()
>
> Best regards,
> Nic.
>
>
>
>
>
>
> *** ./src/backend/utils/mb/encnames.c.orig    Mon Dec  2 15:58:49 2002
> --- ./src/backend/utils/mb/encnames.c    Mon Dec  2 18:13:23 2002
> ***************
> *** 407,413 ****
>       for (p = key, np = newkey; *p != '\0'; p++)
>       {
>           if (isalnum((unsigned char) *p))
> !             *np++ = tolower((unsigned char) *p);
>       }
>       *np = '\0';
>       return newkey;
> --- 407,416 ----
>       for (p = key, np = newkey; *p != '\0'; p++)
>       {
>           if (isalnum((unsigned char) *p))
> !             if (*p >= 'A' && *p <= 'Z')
> !                 *np++ = *p + 'a' - 'A';
> !             else
> !                 *np++ = *p;
>       }
>       *np = '\0';
>       return newkey;
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: Patch to make Turks happy.

From
Nicolai Tufar
Date:
Bruce Momjian wrote:
> I am not going to apply this patch because I think it will mess up the
> handling of other locales.

As far as I figured from the source code this function only deals with
cleaning up
locale names and nothing else. Since all the locale names are in plain
ASCII I think
it will be safe to use ASCII-only lower-case conversion.

By the way, I noticed only after sending the patch that compiler
complains about
ambiguous `else' so it can be rewritten as:


    if (*p >= 'A' && *p <= 'Z'){

        *np++ = *p + 'a' - 'A';

    }else{

        *np++ = *p;
            }



Regards,
Nicolai


>
>
> ---------------------------------------------------------------------------
>
> Nicolai Tufar wrote:
>
>>Hi,
>>
>>Yet another problem with Turkish encoding. clean_encoding_name()
>>in src/backend/utils/mb/encnames.c uses tolower() to convert locale
>>names to lower-case. This causes errors if locale name contains
>>capital "I" and current olcale is Turkish. Some examples:
>>
>>aaa=# \l
>>      List of databases
>>   Name    | Owner | Encoding
>>-----------+-------+----------
>> aaa       | pgsql | LATIN5
>> bbb       | pgsql | LATIN5
>> template0 | pgsql | LATIN5
>> template1 | pgsql | LATIN5
>>(4 rows)
>>aaa=# CREATE DATABASE ccc ENCODING='LATIN5';
>>ERROR:  LATIN5 is not a valid encoding name
>>aaa=# \encoding
>>SQL_ASCII
>>aaa=# \encoding SQL_ASCII
>>SQL_ASCII: invalid encoding name or conversion procedure not found
>>aaa=# \encoding LATIN5
>>LATIN5: invalid encoding name or conversion procedure not found
>>
>>
>>Patch, is a simple change to use ASCII-only lower-case conversion
>>instead of locale-dependent tolower()
>>
>>Best regards,
>>Nic.
>>
>>
>>
>>
>>
>>
>>*** ./src/backend/utils/mb/encnames.c.orig    Mon Dec  2 15:58:49 2002
>>--- ./src/backend/utils/mb/encnames.c    Mon Dec  2 18:13:23 2002
>>***************
>>*** 407,413 ****
>>      for (p = key, np = newkey; *p != '\0'; p++)
>>      {
>>          if (isalnum((unsigned char) *p))
>>!             *np++ = tolower((unsigned char) *p);
>>      }
>>      *np = '\0';
>>      return newkey;
>>--- 407,416 ----
>>      for (p = key, np = newkey; *p != '\0'; p++)
>>      {
>>          if (isalnum((unsigned char) *p))
>>!             if (*p >= 'A' && *p <= 'Z')
>>!                 *np++ = *p + 'a' - 'A';
>>!             else
>>!                 *np++ = *p;
>>      }
>>      *np = '\0';
>>      return newkey;
>>
>>
>>---------------------------(end of broadcast)---------------------------
>>TIP 4: Don't 'kill -9' the postmaster
>>
>
>




Re: [PATCHES] Patch to make Turks happy.

From
Peter Eisentraut
Date:
Bruce Momjian writes:

> I am not going to apply this patch because I think it will mess up the
> handling of other locales.

This patch looks OK to me.  Normally, character set names should use
identifier case-folding rules anyway, so seems to be a step in the right
direction.  Much better than saying that users of certain locales can't
properly use PostgreSQL.

>
>
> ---------------------------------------------------------------------------
>
> Nicolai Tufar wrote:
> > Hi,
> >
> > Yet another problem with Turkish encoding. clean_encoding_name()
> > in src/backend/utils/mb/encnames.c uses tolower() to convert locale
> > names to lower-case. This causes errors if locale name contains
> > capital "I" and current olcale is Turkish. Some examples:
> >
> > aaa=# \l
> >       List of databases
> >    Name    | Owner | Encoding
> > -----------+-------+----------
> >  aaa       | pgsql | LATIN5
> >  bbb       | pgsql | LATIN5
> >  template0 | pgsql | LATIN5
> >  template1 | pgsql | LATIN5
> > (4 rows)
> > aaa=# CREATE DATABASE ccc ENCODING='LATIN5';
> > ERROR:  LATIN5 is not a valid encoding name
> > aaa=# \encoding
> > SQL_ASCII
> > aaa=# \encoding SQL_ASCII
> > SQL_ASCII: invalid encoding name or conversion procedure not found
> > aaa=# \encoding LATIN5
> > LATIN5: invalid encoding name or conversion procedure not found
> >
> >
> > Patch, is a simple change to use ASCII-only lower-case conversion
> > instead of locale-dependent tolower()
> >
> > Best regards,
> > Nic.
> >
> >
> >
> >
> >
> >
> > *** ./src/backend/utils/mb/encnames.c.orig    Mon Dec  2 15:58:49 2002
> > --- ./src/backend/utils/mb/encnames.c    Mon Dec  2 18:13:23 2002
> > ***************
> > *** 407,413 ****
> >       for (p = key, np = newkey; *p != '\0'; p++)
> >       {
> >           if (isalnum((unsigned char) *p))
> > !             *np++ = tolower((unsigned char) *p);
> >       }
> >       *np = '\0';
> >       return newkey;
> > --- 407,416 ----
> >       for (p = key, np = newkey; *p != '\0'; p++)
> >       {
> >           if (isalnum((unsigned char) *p))
> > !             if (*p >= 'A' && *p <= 'Z')
> > !                 *np++ = *p + 'a' - 'A';
> > !             else
> > !                 *np++ = *p;
> >       }
> >       *np = '\0';
> >       return newkey;
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 4: Don't 'kill -9' the postmaster
> >
>
>

--
Peter Eisentraut   peter_e@gmx.net




Re: [PATCHES] Patch to make Turks happy.

From
Bruce Momjian
Date:
OK, Peter, that helps.  Thanks.  I will apply it.

---------------------------------------------------------------------------

Peter Eisentraut wrote:
> Bruce Momjian writes:
>
> > I am not going to apply this patch because I think it will mess up the
> > handling of other locales.
>
> This patch looks OK to me.  Normally, character set names should use
> identifier case-folding rules anyway, so seems to be a step in the right
> direction.  Much better than saying that users of certain locales can't
> properly use PostgreSQL.
>
> >
> >
> > ---------------------------------------------------------------------------
> >
> > Nicolai Tufar wrote:
> > > Hi,
> > >
> > > Yet another problem with Turkish encoding. clean_encoding_name()
> > > in src/backend/utils/mb/encnames.c uses tolower() to convert locale
> > > names to lower-case. This causes errors if locale name contains
> > > capital "I" and current olcale is Turkish. Some examples:
> > >
> > > aaa=# \l
> > >       List of databases
> > >    Name    | Owner | Encoding
> > > -----------+-------+----------
> > >  aaa       | pgsql | LATIN5
> > >  bbb       | pgsql | LATIN5
> > >  template0 | pgsql | LATIN5
> > >  template1 | pgsql | LATIN5
> > > (4 rows)
> > > aaa=# CREATE DATABASE ccc ENCODING='LATIN5';
> > > ERROR:  LATIN5 is not a valid encoding name
> > > aaa=# \encoding
> > > SQL_ASCII
> > > aaa=# \encoding SQL_ASCII
> > > SQL_ASCII: invalid encoding name or conversion procedure not found
> > > aaa=# \encoding LATIN5
> > > LATIN5: invalid encoding name or conversion procedure not found
> > >
> > >
> > > Patch, is a simple change to use ASCII-only lower-case conversion
> > > instead of locale-dependent tolower()
> > >
> > > Best regards,
> > > Nic.
> > >
> > >
> > >
> > >
> > >
> > >
> > > *** ./src/backend/utils/mb/encnames.c.orig    Mon Dec  2 15:58:49 2002
> > > --- ./src/backend/utils/mb/encnames.c    Mon Dec  2 18:13:23 2002
> > > ***************
> > > *** 407,413 ****
> > >       for (p = key, np = newkey; *p != '\0'; p++)
> > >       {
> > >           if (isalnum((unsigned char) *p))
> > > !             *np++ = tolower((unsigned char) *p);
> > >       }
> > >       *np = '\0';
> > >       return newkey;
> > > --- 407,416 ----
> > >       for (p = key, np = newkey; *p != '\0'; p++)
> > >       {
> > >           if (isalnum((unsigned char) *p))
> > > !             if (*p >= 'A' && *p <= 'Z')
> > > !                 *np++ = *p + 'a' - 'A';
> > > !             else
> > > !                 *np++ = *p;
> > >       }
> > >       *np = '\0';
> > >       return newkey;
> > >
> > >
> > > ---------------------------(end of broadcast)---------------------------
> > > TIP 4: Don't 'kill -9' the postmaster
> > >
> >
> >
>
> --
> Peter Eisentraut   peter_e@gmx.net
>
>
>
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: [PATCHES] Patch to make Turks happy.

From
Bruce Momjian
Date:
OK, patch applied.  Peter, should this appear in 7.3.1 too?

---------------------------------------------------------------------------

Peter Eisentraut wrote:
> Bruce Momjian writes:
>
> > I am not going to apply this patch because I think it will mess up the
> > handling of other locales.
>
> This patch looks OK to me.  Normally, character set names should use
> identifier case-folding rules anyway, so seems to be a step in the right
> direction.  Much better than saying that users of certain locales can't
> properly use PostgreSQL.
>
> >
> >
> > ---------------------------------------------------------------------------
> >
> > Nicolai Tufar wrote:
> > > Hi,
> > >
> > > Yet another problem with Turkish encoding. clean_encoding_name()
> > > in src/backend/utils/mb/encnames.c uses tolower() to convert locale
> > > names to lower-case. This causes errors if locale name contains
> > > capital "I" and current olcale is Turkish. Some examples:
> > >
> > > aaa=# \l
> > >       List of databases
> > >    Name    | Owner | Encoding
> > > -----------+-------+----------
> > >  aaa       | pgsql | LATIN5
> > >  bbb       | pgsql | LATIN5
> > >  template0 | pgsql | LATIN5
> > >  template1 | pgsql | LATIN5
> > > (4 rows)
> > > aaa=# CREATE DATABASE ccc ENCODING='LATIN5';
> > > ERROR:  LATIN5 is not a valid encoding name
> > > aaa=# \encoding
> > > SQL_ASCII
> > > aaa=# \encoding SQL_ASCII
> > > SQL_ASCII: invalid encoding name or conversion procedure not found
> > > aaa=# \encoding LATIN5
> > > LATIN5: invalid encoding name or conversion procedure not found
> > >
> > >
> > > Patch, is a simple change to use ASCII-only lower-case conversion
> > > instead of locale-dependent tolower()
> > >
> > > Best regards,
> > > Nic.
> > >
> > >
> > >
> > >
> > >
> > >
> > > *** ./src/backend/utils/mb/encnames.c.orig    Mon Dec  2 15:58:49 2002
> > > --- ./src/backend/utils/mb/encnames.c    Mon Dec  2 18:13:23 2002
> > > ***************
> > > *** 407,413 ****
> > >       for (p = key, np = newkey; *p != '\0'; p++)
> > >       {
> > >           if (isalnum((unsigned char) *p))
> > > !             *np++ = tolower((unsigned char) *p);
> > >       }
> > >       *np = '\0';
> > >       return newkey;
> > > --- 407,416 ----
> > >       for (p = key, np = newkey; *p != '\0'; p++)
> > >       {
> > >           if (isalnum((unsigned char) *p))
> > > !             if (*p >= 'A' && *p <= 'Z')
> > > !                 *np++ = *p + 'a' - 'A';
> > > !             else
> > > !                 *np++ = *p;
> > >       }
> > >       *np = '\0';
> > >       return newkey;
> > >
> > >
> > > ---------------------------(end of broadcast)---------------------------
> > > TIP 4: Don't 'kill -9' the postmaster
> > >
> >
> >
>
> --
> Peter Eisentraut   peter_e@gmx.net
>
>
>
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Index: src/backend/utils/mb/encnames.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/utils/mb/encnames.c,v
retrieving revision 1.10
diff -c -c -r1.10 encnames.c
*** src/backend/utils/mb/encnames.c    4 Sep 2002 20:31:31 -0000    1.10
--- src/backend/utils/mb/encnames.c    5 Dec 2002 23:19:40 -0000
***************
*** 407,413 ****
      for (p = key, np = newkey; *p != '\0'; p++)
      {
          if (isalnum((unsigned char) *p))
!             *np++ = tolower((unsigned char) *p);
      }
      *np = '\0';
      return newkey;
--- 407,418 ----
      for (p = key, np = newkey; *p != '\0'; p++)
      {
          if (isalnum((unsigned char) *p))
!         {
!             if (*p >= 'A' && *p <= 'Z')
!                 *np++ = *p + 'a' - 'A';
!             else
!                 *np++ = *p;
!         }
      }
      *np = '\0';
      return newkey;

Re: [PATCHES] Patch to make Turks happy.

From
Florian Weimer
Date:
Nicolai Tufar <ntufar@apb.com.tr> writes:

> As far as I figured from the source code this function only deals
> with cleaning up locale names and nothing else. Since all the locale
> names are in plain ASCII I think it will be safe to use ASCII-only
> lower-case conversion.

Does PostgreSQL run on the UNIX subsystem of OS/390? ;-)

(EBCDIC is a bit, uhm, strange.  Only the decimal digits are
consecutive the rest is a big mess.)

-- 
Florian Weimer                       Weimer@CERT.Uni-Stuttgart.DE
University of Stuttgart           http://CERT.Uni-Stuttgart.DE/people/fw/
RUS-CERT                          fax +49-711-685-5898


Re: [PATCHES] Patch to make Turks happy.

From
Bruce Momjian
Date:
Are you 64-bit s390?

We have OS/390 patches in 7.4 but they were judged too risky for 7.3.
Attached is the patch.   Comment is:

    revision 1.103
    date: 2002/11/22 01:13:16;  author: tgl;  state: Exp;  lines: +29 -2
    TAS code originally written for s390 (32-bit) does not work for s390x
    (64-bit).  Fix it.  Per report from Permaine Cheung.

---------------------------------------------------------------------------

Florian Weimer wrote:
> Nicolai Tufar <ntufar@apb.com.tr> writes:
>
> > As far as I figured from the source code this function only deals
> > with cleaning up locale names and nothing else. Since all the locale
> > names are in plain ASCII I think it will be safe to use ASCII-only
> > lower-case conversion.
>
> Does PostgreSQL run on the UNIX subsystem of OS/390? ;-)
>
> (EBCDIC is a bit, uhm, strange.  Only the decimal digits are
> consecutive the rest is a big mess.)
>
> --
> Florian Weimer                       Weimer@CERT.Uni-Stuttgart.DE
> University of Stuttgart           http://CERT.Uni-Stuttgart.DE/people/fw/
> RUS-CERT                          fax +49-711-685-5898
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Index: s_lock.h
===================================================================
RCS file: /cvsroot/pgsql-server/src/include/storage/s_lock.h,v
retrieving revision 1.102
retrieving revision 1.103
diff -c -c -r1.102 -r1.103
*** s_lock.h    10 Nov 2002 00:33:43 -0000    1.102
--- s_lock.h    22 Nov 2002 01:13:16 -0000    1.103
***************
*** 63,69 ****
   * Portions Copyright (c) 1996-2002, PostgreSQL Global Development Group
   * Portions Copyright (c) 1994, Regents of the University of California
   *
!  *      $Id: s_lock.h,v 1.102 2002/11/10 00:33:43 momjian Exp $
   *
   *-------------------------------------------------------------------------
   */
--- 63,69 ----
   * Portions Copyright (c) 1996-2002, PostgreSQL Global Development Group
   * Portions Copyright (c) 1994, Regents of the University of California
   *
!  *      $Id: s_lock.h,v 1.103 2002/11/22 01:13:16 tgl Exp $
   *
   *-------------------------------------------------------------------------
   */
***************
*** 150,156 ****

  #endif     /* __arm__ */

! #if defined(__s390__) || defined(__s390x__)
  /*
   * S/390 Linux
   */
--- 150,157 ----

  #endif     /* __arm__ */

!
! #if defined(__s390__) && !defined(__s390x__)
  /*
   * S/390 Linux
   */
***************
*** 175,180 ****
--- 176,207 ----
  }

  #endif     /* __s390__ */
+
+ #if defined(__s390x__)
+ /*
+  * S/390x Linux (64-bit zSeries)
+  */
+ #define TAS(lock)       tas(lock)
+
+ static __inline__ int
+ tas(volatile slock_t *lock)
+ {
+     int            _res;
+
+     __asm__    __volatile__(
+         "    la    1,1            \n"
+         "    lg     2,%2        \n"
+         "    slr 0,0            \n"
+         "    cs     0,1,0(2)    \n"
+         "    lr     %1,0        \n"
+ :        "=m"(lock), "=d"(_res)
+ :        "m"(lock)
+ :        "0", "1", "2");
+
+     return (_res);
+ }
+
+ #endif     /* __s390x__ */


  #if defined(__sparc__)

Re: [PATCHES] Patch to make Turks happy.

From
Florian Weimer
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:

> Are you 64-bit s390?

The patch is for GNU/Linux on S/390 hardware.  This platform is sane
and it uses ASCII.

Only the traditional UNIX subsystem for OS/390 uses EBCDIC.

-- 
Florian Weimer                       Weimer@CERT.Uni-Stuttgart.DE
University of Stuttgart           http://CERT.Uni-Stuttgart.DE/people/fw/
RUS-CERT                          fax +49-711-685-5898


Re: [PATCHES] Patch to make Turks happy.

From
Bruce Momjian
Date:
Peter, is that patch OK for 7.3.1?  I am not sure.

---------------------------------------------------------------------------

Peter Eisentraut wrote:
> Bruce Momjian writes:
>
> > I am not going to apply this patch because I think it will mess up the
> > handling of other locales.
>
> This patch looks OK to me.  Normally, character set names should use
> identifier case-folding rules anyway, so seems to be a step in the right
> direction.  Much better than saying that users of certain locales can't
> properly use PostgreSQL.
>
> >
> >
> > ---------------------------------------------------------------------------
> >
> > Nicolai Tufar wrote:
> > > Hi,
> > >
> > > Yet another problem with Turkish encoding. clean_encoding_name()
> > > in src/backend/utils/mb/encnames.c uses tolower() to convert locale
> > > names to lower-case. This causes errors if locale name contains
> > > capital "I" and current olcale is Turkish. Some examples:
> > >
> > > aaa=# \l
> > >       List of databases
> > >    Name    | Owner | Encoding
> > > -----------+-------+----------
> > >  aaa       | pgsql | LATIN5
> > >  bbb       | pgsql | LATIN5
> > >  template0 | pgsql | LATIN5
> > >  template1 | pgsql | LATIN5
> > > (4 rows)
> > > aaa=# CREATE DATABASE ccc ENCODING='LATIN5';
> > > ERROR:  LATIN5 is not a valid encoding name
> > > aaa=# \encoding
> > > SQL_ASCII
> > > aaa=# \encoding SQL_ASCII
> > > SQL_ASCII: invalid encoding name or conversion procedure not found
> > > aaa=# \encoding LATIN5
> > > LATIN5: invalid encoding name or conversion procedure not found
> > >
> > >
> > > Patch, is a simple change to use ASCII-only lower-case conversion
> > > instead of locale-dependent tolower()
> > >
> > > Best regards,
> > > Nic.
> > >
> > >
> > >
> > >
> > >
> > >
> > > *** ./src/backend/utils/mb/encnames.c.orig    Mon Dec  2 15:58:49 2002
> > > --- ./src/backend/utils/mb/encnames.c    Mon Dec  2 18:13:23 2002
> > > ***************
> > > *** 407,413 ****
> > >       for (p = key, np = newkey; *p != '\0'; p++)
> > >       {
> > >           if (isalnum((unsigned char) *p))
> > > !             *np++ = tolower((unsigned char) *p);
> > >       }
> > >       *np = '\0';
> > >       return newkey;
> > > --- 407,416 ----
> > >       for (p = key, np = newkey; *p != '\0'; p++)
> > >       {
> > >           if (isalnum((unsigned char) *p))
> > > !             if (*p >= 'A' && *p <= 'Z')
> > > !                 *np++ = *p + 'a' - 'A';
> > > !             else
> > > !                 *np++ = *p;
> > >       }
> > >       *np = '\0';
> > >       return newkey;
> > >
> > >
> > > ---------------------------(end of broadcast)---------------------------
> > > TIP 4: Don't 'kill -9' the postmaster
> > >
> >
> >
>
> --
> Peter Eisentraut   peter_e@gmx.net
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: [PATCHES] Patch to make Turks happy.

From
Peter Eisentraut
Date:
Bruce Momjian writes:

> Peter, is that patch OK for 7.3.1?  I am not sure.

Definitely.  It's a bug fix.

--
Peter Eisentraut   peter_e@gmx.net


Re: [PATCHES] Patch to make Turks happy.

From
Bruce Momjian
Date:
Thanks.   Applied for 7.3.1.

---------------------------------------------------------------------------

Peter Eisentraut wrote:
> Bruce Momjian writes:
>
> > Peter, is that patch OK for 7.3.1?  I am not sure.
>
> Definitely.  It's a bug fix.
>
> --
> Peter Eisentraut   peter_e@gmx.net
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073