Thread: \dS and \df crashing psql

\dS and \df crashing psql

From
Nishad PRAKASH
Date:
Your name        : Nishad Prakash
Your email address    : prakashn@uci.edu


System Configuration
---------------------
  Architecture (example: Intel Pentium)      : Sun Sparc

  Operating System (example: Linux 2.0.26 ELF)     : Solaris 2.6

  PostgreSQL version (example: PostgreSQL-6.5.1): PostgreSQL-7.0

  Compiler used (example:  gcc 2.8.0)        : gcc 2.95.2


Please enter a FULL description of your problem:
------------------------------------------------

In psql, when connected to template1 as the postgres superuser, the
\df function complains about some memory allocation problem.  See the
following four examples for representative errors:

template1=# \df get
ERROR:  AllocSetFree: cannot find block containing chunk

template1=# \df get
NOTICE:  PortalHeapMemoryFree: 0x31f5b0 not in alloc set!
             List of functions
 Result |      Function       |  Arguments
--------+---------------------+-------------
 int4   | get_bit             | bytea int4
 int4   | get_byte            | bytea int4
 name   | getdatabaseencoding |
 name   | getpgusername       |
(4 rows)

template1=# \df get
NOTICE:  PortalHeapMemoryFree: 0x344350 not in alloc set!
ERROR:  AllocSetFree: cannot find block containing chunk

template1=# \df get
ERROR:  SearchSysCache: recursive use of cache 2

Note that this is before creating any of my own databases -- at the
time when I got these errors I had just finished the installation.

There is another problem with the \d family.  I created a new db
(named can) and its tables.  Then, typing \dS has the following
effect:

can=# \dS
The connection to the server was lost. Attempting reset: Failed.
!# \d
You are currently not connected to a database.
!# \c can
No Postgres username specified in startup packet.
Segmentation fault

Note that this happens whether or not the tables are actually
populated; I ran a vacuum right after both acts (creation and
population) and \dS caused a crash out of psql each time.

FWIW, my 6.5.3 installation with the same configure and build
parameters, same data, etc. ran with no problems at all.  Has anyone
had similar problems with the \d functions in 7.0?

Nishad

Re: \dS and \df crashing psql

From
Tatsuo Ishii
Date:
> System Configuration
> ---------------------
>   Architecture (example: Intel Pentium)      : Sun Sparc
> 
>   Operating System (example: Linux 2.0.26 ELF)     : Solaris 2.6
> 
>   PostgreSQL version (example: PostgreSQL-6.5.1): PostgreSQL-7.0
> 
>   Compiler used (example:  gcc 2.8.0)        : gcc 2.95.2
> 
> 
> Please enter a FULL description of your problem:
> ------------------------------------------------
> 
> In psql, when connected to template1 as the postgres superuser, the
> \df function complains about some memory allocation problem.  See the
> following four examples for representative errors:

Neither the \df nor the \dS problem reproduces here (I have exactly the
same configuration as you).

Instead, I have another problem, already reported on the hackers list:
createdb/dropdb does not work

See the posting "Solaris 2.6 problems" in the archives.
--
Tatsuo Ishii


Re: \dS and \df crashing psql

From
Peter Eisentraut
Date:
Nishad PRAKASH writes:

> In psql, when connected to template1 as the postgres superuser, the
> \df function complains about some memory allocation problem.

The \d series of psql commands are really just shortcuts for various SQL
queries to the system catalogs. Start psql with the -E option to see them.
Therefore it is unlikely that this behaviour is entirely localized at
these functions. Have you run the regression tests without problems?
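
A minimal sketch of the suggestion above: the helper below builds a psql
command line that turns on -E, so the SQL generated for a backslash command
is echoed, and runs a single command with -c. The helper name
psql_echo_hidden_argv is hypothetical, and combining -E with -c is an
assumption about how one might script this; interactively, starting
"psql -E template1" and typing "\df get" should show the same queries.

```python
import shlex


def psql_echo_hidden_argv(dbname, meta_command):
    """Build argv for a psql run that echoes the SQL behind a backslash command.

    -E asks psql to print the queries it generates internally;
    -c runs one command and exits.
    """
    return ["psql", "-E", "-c", meta_command, dbname]


if __name__ == "__main__":
    argv = psql_echo_hidden_argv("template1", r"\df get")
    # Print a shell-quoted command line that can be pasted into a terminal.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

The queries printed this way can then be run by hand to narrow down which
one triggers the error.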

> can=# \dS
> The connection to the server was lost. Attempting reset: Failed.

Can you show the server output? There's probably a segmentation fault or
failed assertion in the backend involved, which we'd need to see.

> !# \d
> You are currently not connected to a database.
> !# \c can
> No Postgres username specified in startup packet.
> Segmentation fault

That's certainly a psql problem. Can you show a backtrace from gdb?


--
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden

Re: \dS and \df crashing psql

From
Nishad PRAKASH
Date:
On Fri, 26 May 2000, Peter Eisentraut wrote:


> The \d series of psql commands are really just shortcuts for various SQL
> queries to the system catalogs. Start psql with the -E option to see them.
> Therefore it is unlikely that this behaviour is entirely localized at
> these functions. Have you run the regression tests without problems?

First of all, this was not a Postgres bug but a configuration mistake on
my part.  I had been meaning to write back to the list explaining what
really happened:

I compiled 7.0 with locale support, recode, and multibyte options all
enabled.  In the postgres (db superuser) .cshrc, I had set LC_CTYPE to
"en_US".  This was the problem.  When I would start postmaster and run
anything that involved a regexp (and the query that \dS expands to uses
regexps) on a "bytea" type field, psql would crash.

To fix this, I tried first letting the locale default to "C", then setting
LC_CTYPE to "iso_8859_1".  Starting postmaster with either of these works
perfectly.
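
For anyone scripting the workaround, here is a rough sketch of preparing an
environment in which the locale variables are pinned to "C" before the
postmaster is launched. The function name env_with_c_locale and the list of
variables are illustrative assumptions; I simply left the variables unset,
whereas this sketch sets them explicitly.

```python
import os

# Variables that commonly influence collation and character classification.
LOCALE_VARS = ("LANG", "LC_ALL", "LC_COLLATE", "LC_CTYPE")


def env_with_c_locale(base_env):
    """Return a copy of base_env with the locale variables forced to "C"."""
    env = dict(base_env)
    for name in LOCALE_VARS:
        env[name] = "C"
    return env


if __name__ == "__main__":
    env = env_with_c_locale(os.environ)
    for name in LOCALE_VARS:
        print(f"{name}={env[name]}")
```

The resulting dictionary could be passed as env= to subprocess.Popen when
starting the postmaster, so the superuser's .cshrc settings do not leak in.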

If you are still interested in server output or backtraces (perhaps to
implement a more graceful exit?), I'd be glad to send them, but I'm sure
you can replicate this pretty easily now if required.

I have never needed to mess around with locales before, so I apologize for
posting this as bug -- I didn't quite know where to look at first.

By the way, I don't know what you guys have done with the optimizer but my
previously slow queries now run VERY FAST.  This prevents me from
taking cigarette breaks, coffee breaks, etc. under the "I'm running a
large query" pretext.  Please do what you can to fix this problem.

Thanks for the help,

Nishad

Re: \dS and \df crashing psql

From
Tom Lane
Date:
Nishad  PRAKASH <prakashn@uci.edu> writes:
> I compiled 7.0 with locale support, recode, and multibyte options all
> enabled.  In the postgres (db superuser) .cshrc, I had set LC_CTYPE to
> "en_US".  This was the problem.  When I would start postmaster and run
> anything that involved a regexp (and the query that \dS expands to uses
> regexps) on a "bytea" type field, psql would crash.

> To fix this, I tried first letting the locale default to "C", then setting
> LC_CTYPE to "iso_8859_1".  Starting postmaster with either of these works
> perfectly.

> If you are still interested in server output or backtraces (perhaps to
> implement a more graceful exit?), I'd be glad to send them, but I'm sure
> you can replicate this pretty easily now if required.

Hmm, news to us.  It may be a platform-specific problem, so yes please
do send a backtrace.

            regards, tom lane

Re: \dS and \df crashing psql

From
Tatsuo Ishii
Date:
> First of all, this was not a Postgres bug but a configuration mistake on
> my part.  I had been meaning to write back to the list explaining what
> really happened:
>
> I compiled 7.0 with locale support, recode, and multibyte options all
> enabled.  In the postgres (db superuser) .cshrc, I had set LC_CTYPE to
> "en_US".  This was the problem.  When I would start postmaster and run
> anything that involved a regexp (and the query that \dS expands to uses
> regexps) on a "bytea" type field, psql would crash.
>
> To fix this, I tried first letting the locale default to "C", then setting
> LC_CTYPE to "iso_8859_1".  Starting postmaster with either of these works
> perfectly.
>
> If you are still interested in server output or backtraces (perhaps to
> implement a more graceful exit?), I'd be glad to send them, but I'm sure
> you can replicate this pretty easily now if required.

Of course regexp should not crash in the situation above. Thanks for
the info. I will dig into the problem.
--
Tatsuo Ishii

Re: \dS and \df crashing psql

From
Nishad PRAKASH
Date:
On Thu, 25 May 2000, Tom Lane wrote:

>
> Hmm, news to us.  It may be a platform-specific problem, so yes please
> do send a backtrace.
>

CAVEAT: I may just be missing something really obvious.

A high-level description of the problem is: If postmaster is started
with LC_COLLATE set to en_US in the db superuser's environment, then
working on a db created with createdb -E LATIN1 <foo> causes strange
behaviour in regexps.  If that sounds like an obviously wrong use
of locale settings, you probably don't need to read any further, but
just tell me what's going on.

To replicate the problem, you need to do the following.  All actions
are performed by postgres, the db superuser account.

Install postgres 7.0 with all three of --enable-locale, --enable-recode,
and --enable-multibyte specified.  Set the user postgres's LC_COLLATE env
var to any of the en_* locales available on your machine /except/
en_US.UTF-8, which doesn't seem to cause problems.  The other locale vars
appear to be irrelevant; LC_COLLATE alone will do for replication.  These
were my settings:

> locale
LANG=
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE=en_US
LC_MONETARY="C"
LC_MESSAGES="C"
LC_ALL=
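
As a quick check before starting the postmaster, the output of "locale" can
be scanned for categories that are not "C". The sketch below is only an
illustration; the function name non_c_categories is made up, and the sample
text is the listing above.

```python
def non_c_categories(locale_output):
    """Map each locale variable to its value when that value is set and is
    something other than "C" or "POSIX". Empty values (LANG=) are skipped."""
    found = {}
    for line in locale_output.splitlines():
        name, sep, value = line.strip().partition("=")
        if not sep:
            continue  # not a NAME=value line
        value = value.strip('"')
        if value and value not in ("C", "POSIX"):
            found[name] = value
    return found


sample = """LANG=
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE=en_US
LC_MONETARY="C"
LC_MESSAGES="C"
LC_ALL=
"""

print(non_c_categories(sample))  # {'LC_COLLATE': 'en_US'}
```
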

What follows are the operations I performed to get psql to crash:

> createdb -E LATIN1 foo
CREATE DATABASE
> psql foo
Welcome to psql, the PostgreSQL interactive terminal.
<snip>
foo=# create table TenChrName ( somelongname varchar (100) unique);
NOTICE:  CREATE TABLE/UNIQUE will create implicit index
'tenchrname_somelongname_key' for table 'tenchrname'
CREATE
foo=# vacuum analyze;
VACUUM
foo=# \dS
The connection to the server was lost. Attempting reset: Failed.
!# \q
> kill `cat postmaster.pid`
> gdb postgres
<snip>
(gdb) run foo

/* note: the following query is the smallest part of \dS's expansion
 * that is sufficient for a crash
 */
backend> select * from pg_class where relname ~ '^n';
ERROR:  expression_tree_walker: Unexpected node type 0
ERROR:  expression_tree_walker: Unexpected node type 0
backend> select * from pg_class where relname ~ '^n';
NOTICE:  PortalHeapMemoryFree: 0x51c330 not in alloc set!
NOTICE:  PortalHeapMemoryFree: 0x51c330 not in alloc set!

Program received signal SIGBUS, Bus error.
0x21ddf4 in AllocSetAlloc (set=0x500ff8, size=12) at aset.c:233
233                     if (chunk->size >= size)
(gdb) bt
#0  0x21ddf4 in AllocSetAlloc (set=0x500ff8, size=12) at aset.c:233
#1  0x21f8a0 in PortalHeapMemoryAlloc (this=0x2bddc0, size=12)
    at portalmem.c:253
#2  0x21ed20 in MemoryContextAlloc (context=0x2bddc0, size=12)
    at mcxt.c:224
#3  0x126e84 in newNode (size=12, tag=T_List) at nodes.c:38
#4  0x127180 in lcons (obj=0x51a240, list=0x0) at list.c:112
#5  0x127220 in lappend (list=0x0, obj=0x51a240) at list.c:144
#6  0x14e6f8 in get_actual_clauses (restrictinfo_list=0x51a298)
    at restrictinfo.c:55
#7  0x144b80 in create_scan_node (root=0x5134f8, best_path=0x51be80,
    tlist=0x51b0b0) at createplan.c:152
#8  0x144ab0 in create_plan (root=0x5134f8, best_path=0x51be80)
    at createplan.c:103
#9  0x147698 in subplanner (root=0x5134f8, flat_tlist=0x51a4a0,
    qual=0x51a280, tuple_fraction=0) at planmain.c:288
#10 0x14740c in query_planner (root=0x5134f8, tlist=0x519b08, qual=0x51a280,
    tuple_fraction=0) at planmain.c:128
#11 0x14817c in union_planner (parse=0x5134f8, tuple_fraction=0)
    at planner.c:530
#12 0x147b38 in subquery_planner (parse=0x5134f8, tuple_fraction=-1)
    at planner.c:202
#13 0x147810 in planner (parse=0x5134f8) at planner.c:67
#14 0x1977c0 in pg_plan_query (querytree=0x5134f8) at postgres.c:512
#15 0x197a9c in pg_exec_query_dest (
    query_string=0x2ba070 "select * from pg_class where relname ~ '^n'; \n",
    dest=Debug, aclOverride=0 '\000') at postgres.c:646
#16 0x1978e4 in pg_exec_query (
    query_string=0x2ba070 "select * from pg_class where relname ~ '^n'; \n")
    at postgres.c:562
#17 0x1996f4 in PostgresMain (argc=2, argv=0xeffffa64, real_argc=2,
    real_argv=0xeffffa64) at postgres.c:1590
#18 0x1026d0 in main (argc=2, argv=0xeffffa64) at main.c:103

If you actually care to go through the steps above, don't leave
anything out.  The vacuum analyze serves no useful purpose, but you
won't get a crash if you omit it.  The table identifiers really
do need to be around 10 chars long.  The regexp needs to match the
front of a string, so use '^foo' -- I couldn't get a crash with other
types of regexps but then I didn't try too many.
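
To make the steps easier to repeat, they can be collected into one script.
This is only a sketch: the function name repro_sql is hypothetical, and the
statements are the ones shown in the transcript above. The database is
assumed to have been created first with "createdb -E LATIN1 foo", with
LC_COLLATE set as described.

```python
def repro_sql(table="TenChrName", column="somelongname", prefix="n"):
    """Return the SQL statements from the transcript as one script string."""
    statements = [
        f"create table {table} ( {column} varchar (100) unique);",
        "vacuum analyze;",
        # Smallest part of the \dS expansion that was enough for a crash.
        f"select * from pg_class where relname ~ '^{prefix}';",
    ]
    return "\n".join(statements) + "\n"


if __name__ == "__main__":
    print(repro_sql(), end="")
```

The printed script could be fed to "psql foo" or typed at the standalone
backend prompt; whether it crashes will depend on the platform and locale.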

With the locale settings described above, a query on pg_proc of the type
"select * from pg_proc where proname ~ '^n';" will /always/ produce the
following kind of error: "NOTICE:  PortalHeapMemoryFree: <addr> not in
alloc set!" before printing the result (it never causes a crash, AFAICT,
and always does produce a correct result).  You can get this behaviour
just by connecting to template1; perhaps other tables with bytea fields
may also do this, but pg_proc does it every single time. If you like, I'll
do a backtrace from where it produces that error, but this message is
getting too long for that.

If someone can replicate this (or even try and fail), it would help me
to learn whether the error lies in Postgres, Solaris's locales, or
yours truly.  It seems too quirky to be a genuine bug.

Thanks, and let me know if you have any ideas.

Nishad