Thread: Question about debugging bootstrapping and catalog entries

Question about debugging bootstrapping and catalog entries

From
Gregory Stark
Date:
I've been fooling with catalog entries here and I've obviously done something
wrong. But I'm a bit frustrated trying to debug initdb. Because of the way it
starts up the database in a separate process I'm finding it really hard to
connect to the database and get a backtrace. And the debugging log is being
spectacularly unhelpful in not telling me where the problem is.

Are there any tricks people have for debugging bootstrapping processing? I
just need to know what index it's trying to build here and that should be
enough to point me in the right direction:

creating template1 database in /var/tmp/db7/base/1 ... FATAL:  could not create unique index
DETAIL:  Table contains duplicated values.


--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com


Re: Question about debugging bootstrapping and catalog entries

From
Martijn van Oosterhout
Date:
On Mon, Dec 18, 2006 at 11:35:44AM +0000, Gregory Stark wrote:
> Are there any tricks people have for debugging bootstrapping processing? I
> just need to know what index it's trying to build here and that should be
> enough to point me in the right direction:

Here's what I did: you can step over functions in initdb until it fails
(although I already know which part it's failing I guess). Restart. Then
you go into that function and step until the new backend has been
started. At this point you attach another gdb to the backend and let it
run.

Some steps create multiple backends; a printf() statement sometimes helps
determine where to stop.

If the backend process segfaults, the easiest is to enable core dumps,
then you can run gdb on the left-overs, so to speak.

If you get an error, you put a breakpoint on errfinish(). Note, that
gets called even on messages you don't normally see, so you may have to
skip a couple to get the real message.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: Question about debugging bootstrapping and catalog entries

From
"Zeugswetter Andreas ADI SD"
Date:
> > Are there any tricks people have for debugging bootstrapping processing? I
> > just need to know what index it's trying to build here and that should be
> > enough to point me in the right direction:
>
> Here's what I did: you can step over functions in initdb until it fails
> (although I already know which part it's failing I guess). Restart. Then
> you go into that function and step until the new backend has been
> started. At this point you attach another gdb to the backend and let it
> run.

How do you attach fast enough that it is not all over before you are able
to attach?
I'd like to debug an initdb failure on Windows (postgres executable not
found) when running make check with the is_admin check disabled and
--prefix=/postgres in msys.

The program "postgres" is needed by initdb but was not found in the
same directory as
"j:/postgres/src/test/regress/./tmp_check/install/postgres/bin/initdb".

Andreas


Re: Question about debugging bootstrapping and catalog entries

From
Gregory Stark
Date:
"Martijn van Oosterhout" <kleptog@svana.org> writes:

> Here's what I did: you can step over functions in initdb until it fails
> (although I already know which part it's failing I guess). Restart. Then
> you go into that function and step until the new backend has been
> started. At this point you attach another gdb to the backend and let it
> run.

Hm, I suppose. Though starting a second gdb is a pain. What I've done in the
past is introduce a usleep(30000000) in strategic points in the backend to
give me a chance to attach.

Perhaps what would be handy is having an option to initdb to just run the
backend under gdb automatically. I'm not sure if initdb runs the backend in
the terminal though. Or perhaps initdb should start the backend with an option
that instructs it to enter an infinite loop shortly after startup so you can
attach with gdb.

In the meantime this trivial patch saved my day:

diff -c -r1.225 bootstrap.c
*** src/backend/bootstrap/bootstrap.c    4 Oct 2006 00:29:49 -0000    1.225
--- src/backend/bootstrap/bootstrap.c    18 Dec 2006 12:11:11 -0000
***************
*** 1293,1298 ****
--- 1293,1300 ----
          heap = heap_open(ILHead->il_heap, NoLock);
          ind = index_open(ILHead->il_ind, NoLock);
  
+         elog(DEBUG4, "building index %s on %s", NameStr(ind->rd_rel->relname), NameStr(heap->rd_rel->relname));
+ 
          index_build(heap, ind, ILHead->il_info, false);
          index_close(ind, NoLock);

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com


Re: Question about debugging bootstrapping and catalog entries

From
"Gurjeet Singh"
Date:
On 12/18/06, Martijn van Oosterhout <kleptog@svana.org> wrote:
> If you get an error, you put a breakpoint on errfinish(). Note, that
> gets called even on messages you don't normally see, so you may have to
> skip a couple to get the real message.

You wouldn't need to skip anything if you put the breakpoint inside the 'if (elevel == ERROR)' code block in errfinish(). It will then stop only for an ERROR.

Regards,

--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | yahoo }.com

Re: Question about debugging bootstrapping and catalog entries

From
Martijn van Oosterhout
Date:
On Mon, Dec 18, 2006 at 12:59:28PM +0100, Zeugswetter Andreas ADI SD wrote:
> How do you attach fast enough, so not all is over before you are able to
> attach ?
> I'd like to debug initdb failure on Windows (postgres executable not
> found) when
> running make check with disabled is_admin check and --prefix=/postgres
> in msys.

When running initdb under gdb, you step over the PG_CMD_OPEN;. At that
point the backend is started, but hasn't done anything yet, so you can
attach to it. The backend stays until the next PG_CMD_CLOSE;

As someone pointed out, sleep works also.

> The program "postgres" is needed by initdb but was not found in the
> same directory as
> "j:/postgres/src/test/regress/./tmp_check/install/postgres/bin/initdb".

No idea about that, the binary *should* be there...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: Question about debugging bootstrapping and catalog

From
Gavin Sherry
Date:
On Mon, 18 Dec 2006, Gregory Stark wrote:

>
> I've been fooling with catalog entries here and I've obviously done something
> wrong. But I'm a bit frustrated trying to debug initdb. Because of the way it
> starts up the database in a separate process I'm finding it really hard to
> connect to the database and get a backtrace. And the debugging log is being
> spectacularly unhelpful in not telling me where the problem is.
>
> Are there any tricks people have for debugging bootstrapping processing? I
> just need to know what index it's trying to build here and that should be
> enough to point me in the right direction:
>
> creating template1 database in /var/tmp/db7/base/1 ... FATAL:  could not create unique index
> DETAIL:  Table contains duplicated values.
>

Not much fun. Run src/include/catalog/duplicate_oids first.
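
For readers who have not used it: the script reports OIDs that are assigned more than once in the catalog headers. The C below is only a rough sketch of that kind of check, not the script itself. It assumes the OIDs have already been pulled out into an array, and find_duplicate_oids is a hypothetical helper name, not anything in the PostgreSQL tree.

```c
#include <stdlib.h>

/* Comparison callback for qsort() over unsigned OID values. */
static int
compare_oids(const void *a, const void *b)
{
    unsigned int x = *(const unsigned int *) a;
    unsigned int y = *(const unsigned int *) b;

    return (x > y) - (x < y);
}

/*
 * Hypothetical helper (not PostgreSQL code): sorts oids[] in place and
 * stores every OID that occurs more than once into dups[], once each,
 * in ascending order. dups[] needs room for n / 2 entries.
 * Returns the number of distinct duplicated OIDs.
 */
static size_t
find_duplicate_oids(unsigned int *oids, size_t n, unsigned int *dups)
{
    size_t ndups = 0;

    qsort(oids, n, sizeof(oids[0]), compare_oids);

    for (size_t i = 1; i < n; i++)
    {
        /* Record a value only at its first repeat in the sorted run. */
        if (oids[i] == oids[i - 1] &&
            (i == 1 || oids[i - 2] != oids[i]))
            dups[ndups++] = oids[i];
    }
    return ndups;
}
```

A non-zero return value means at least one OID is claimed twice, which is one way to end up with the "could not create unique index" failure during bootstrap.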

Thanks,

Gavin


Re: Question about debugging bootstrapping and catalog entries

From
Alvaro Herrera
Date:
Gregory Stark wrote:
> 
> I've been fooling with catalog entries here and I've obviously done something
> wrong. But I'm a bit frustrated trying to debug initdb. Because of the way it
> starts up the database in a separate process I'm finding it really hard to
> connect to the database and get a backtrace. And the debugging log is being
> spectacularly unhelpful in not telling me where the problem is.
> 
> Are there any tricks people have for debugging bootstrapping processing? I
> just need to know what index it's trying to build here and that should be
> enough to point me in the right direction:
> 
> creating template1 database in /var/tmp/db7/base/1 ... FATAL:  could not create unique index
> DETAIL:  Table contains duplicated values.

One easy thing to try is to use -n (noclean) and then start a standalone
backend on the borked dir and issue the commands that initdb was feeding
at that point (usually embedded in the initdb source).

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: Question about debugging bootstrapping and catalog

From
Zdenek Kotala
Date:
Gregory Stark wrote:
> "Martijn van Oosterhout" <kleptog@svana.org> writes:
> 
>> Here's what I did: you can step over functions in initdb until it fails
>> (although I already know which part it's failing I guess). Restart. Then
>> you go into that function and step until the new backend has been
>> started. At this point you attach another gdb to the backend and let it
>> run.
> 
> Hm, I suppose. Though starting a second gdb is a pain. What I've done in the
> past is introduce a usleep(30000000) in strategic points in the backend to
> give me a chance to attach.

I use a dtrace script that waits for a write syscall on stderr; when that
happens it stops (freezes) the process, and I am able to attach to the
process with a debugger and examine what happened.


Zdenek


Re: Question about debugging bootstrapping and catalog

From
Andrew Dunstan
Date:
Alvaro Herrera wrote:
> Gregory Stark wrote:
>   
>> I've been fooling with catalog entries here and I've obviously done something
>> wrong. But I'm a bit frustrated trying to debug initdb. Because of the way it
>> starts up the database in a separate process I'm finding it really hard to
>> connect to the database and get a backtrace. And the debugging log is being
>> spectacularly unhelpful in not telling me where the problem is.
>>
>> Are there any tricks people have for debugging bootstrapping processing? I
>> just need to know what index it's trying to build here and that should be
>> enough to point me in the right direction:
>>
>> creating template1 database in /var/tmp/db7/base/1 ... FATAL:  could not create unique index
>> DETAIL:  Table contains duplicated values.
>>     
>
> One easy thing to try is to use -n (noclean) and then start a standalone
> backend on the borked dir and issue the commands that initdb was feeding
> at that point (usually embedded in the initdb source).
>
>   

This step actually runs the BKI file, so it's not embedded in the initdb 
code. The other thing with this procedure is to clean up any partial 
data left behind first, i.e. clean the global and base/1 directories. 
Apart from that it should work fine, I think - probably something like:
 gdb postgres
 set args -boot -x1 -F -d 5 template1
 run < /path/to/bkifile

cheers

andrew




Re: Question about debugging bootstrapping and catalog entries

From
Tom Lane
Date:
Gregory Stark <stark@enterprisedb.com> writes:
> Hm, I suppose. Though starting a second gdb is a pain. What I've done in the
> past is introduce a usleep(30000000) in strategic points in the backend to
> give me a chance to attach.

There is already an option to sleep early in backend startup for the
normal case.  Not sure if it works for bootstrap, autovacuum, etc,
but I could see making it do so.  The suggestion of single-stepping
initdb will only work well if you have a version of gdb that can step
into a fork, which is something that's never worked for me :-(.
Otherwise the backend will free-run until it blocks waiting for input
from initdb, which means you are still stuck for debugging startup
crashes ...
        regards, tom lane


Re: Question about debugging bootstrapping and catalog entries

From
"Takayuki Tsunakawa"
Date:
Hello, Mr. Stark

> Are there any tricks people have for debugging bootstrapping processing? I
> just need to know what index it's trying to build here and that should be
> enough to point me in the right direction:

As Mr. Lane says, it would be best to be able to make postgres sleep
for an arbitrary time.  The mechanism could be either a command line
option or an environment variable (like BOOTSTRAP_SLEEP) or both.  I
think the env variable is easy to handle in this case.
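
To make the environment variable idea concrete, here is a minimal sketch in C. It is an illustration only: PostgreSQL does not read BOOTSTRAP_SLEEP, and parse_sleep_seconds and sleep_if_requested are hypothetical names. A real patch would more likely use pg_usleep() and sit wherever the startup path needs the pause.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: convert the text of an environment variable into
 * a delay in seconds. Returns 0 (no delay) for NULL, empty, non-numeric,
 * negative, or out-of-range input.
 */
static unsigned int
parse_sleep_seconds(const char *value)
{
    char *end;
    long  secs;

    if (value == NULL || *value == '\0')
        return 0;

    errno = 0;
    secs = strtol(value, &end, 10);
    if (errno != 0 || *end != '\0' || secs < 0 || secs > INT_MAX)
        return 0;

    return (unsigned int) secs;
}

/*
 * Sketch of a call that could be placed early in a startup path.
 * BOOTSTRAP_SLEEP is only the name floated in this thread.
 */
static void
sleep_if_requested(void)
{
    unsigned int secs = parse_sleep_seconds(getenv("BOOTSTRAP_SLEEP"));

    if (secs > 0)
        sleep(secs);    /* window in which to attach a debugger */
}
```

Because initdb's child processes inherit its environment, setting the variable before running initdb would reach the backend without any change to initdb's command line.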

How about mimicking postgres with a script that starts gdb to run
postgres?  That is, rename the original postgres module to
postgres.org and create a shell script named postgres like this:

#!/bin/bash
gdb postgres $*

Tell me if it works.







Re: Question about debugging bootstrapping and catalog entries

From
"Takayuki Tsunakawa"
Date:
From: "Takayuki Tsunakawa" <tsunakawa.takay@jp.fujitsu.com>
> How about mimicking postgres with a script that starts gdb to run
> postgres?  That is, rename the original postgres module to
> postgres.org and create a shell script named postgres like this:
>
> #!/bin/bash
> gdb postgres $*

Sorry, this should be postgres.org $*.

----- Original Message ----- 
From: "Takayuki Tsunakawa" <tsunakawa.takay@jp.fujitsu.com>
To: "Gregory Stark" <stark@enterprisedb.com>; "PostgreSQL Hackers"
<pgsql-hackers@postgresql.org>
Sent: Tuesday, December 19, 2006 9:37 AM
Subject: Re: [HACKERS] Question about debugging bootstrapping and
catalog entries


> Hello, Mr. Stark
>
>> Are there any tricks people have for debugging bootstrapping processing? I
>> just need to know what index it's trying to build here and that should be
>> enough to point me in the right direction:
>
> As Mr. Lane says, it would be best to be able to make postgres sleep
> for an arbitrary time.  The mechanism could be either a command line
> option or an environment variable (like BOOTSTRAP_SLEEP) or both.  I
> think the env variable is easy to handle in this case.
>
> How about mimicking postgres with a script that starts gdb to run
> postgres?  That is, rename the original postgres module to
> postgres.org and create a shell script named postgres like this:
>
> #!/bin/bash
> gdb postgres $*
>
> Tell me if it works.




Re: Question about debugging bootstrapping and catalog entries

From
"Gurjeet Singh"
Date:
On 12/18/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Gregory Stark <stark@enterprisedb.com> writes:
>> Hm, I suppose. Though starting a second gdb is a pain. What I've done in the
>> past is introduce a usleep(30000000) in strategic points in the backend to
>> give me a chance to attach.
>
> There is already an option to sleep early in backend startup for the
> normal case.  Not sure if it works for bootstrap, autovacuum, etc,
> but I could see making it do so.

You are probably referring to the command-line switch -W to postgres, which translates to the 'PostAuthDelay' GUC variable; I think that kicks in a bit too late! Once I was trying to debug check_root() (called by main()), and had to resort to my own pg_usleep() to make the process wait for a debugger to attach. We should somehow pull the sleep() code into main() as far up as possible.

BTW, here's how I made PG sleep until I attached to it (should be done only in the function you intend to debug):

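{
  /* volatile, so the loop re-reads the flag after the debugger changes it */
  volatile bool waitForDebugger = true;
  while (waitForDebugger)
    pg_usleep(1000000);    /* one second between checks */
}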

It will wait forever here, until you set a breakpoint on 'while' and then set the var to false.

> The suggestion of single-stepping
> initdb will only work well if you have a version of gdb that can step
> into a fork, which is something that's never worked for me :-(.
> Otherwise the backend will free-run until it blocks waiting for input
> from initdb, which means you are still stuck for debugging startup
> crashes ...
>
>                         regards, tom lane




--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | yahoo }.com

Re: Question about debugging bootstrapping and catalog

From
Zdenek Kotala
Date:
Zdenek Kotala wrote:
> Gregory Stark wrote:
>> "Martijn van Oosterhout" <kleptog@svana.org> writes:
>>
>>> Here's what I did: you can step over functions in initdb until it fails
>>> (although I already know which part it's failing I guess). Restart. Then
>>> you go into that function and step until the new backend has been
>>> started. At this point you attach another gdb to the backend and let it
>>> run.
>>
>> Hm, I suppose. Though starting a second gdb is a pain. What I've done 
>> in the
>> past is introduce a usleep(30000000) in strategic points in the 
>> backend to
>> give me a chance to attach.
> 
> I use a dtrace script that waits for a write syscall on stderr; when that
> happens it stops (freezes) the process, and I am able to attach to the
> process with a debugger and examine what happened.
> 

Here is a dtrace script that "sits" on exec: it stops the postgres
process right after exec.  It works on Solaris. The kernel function
will probably have a different name on other platforms where dtrace is
implemented (FreeBSD, Mac OS).



::exec_common:return
/execname == "initdb"/
{
  exec_pg = 1;
}


syscall:::entry
/execname == "postgres" && exec_pg == 1/
{
  stop();
  printf("Postgres is stopped.\n");
  exec_pg = 0;
}
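
Where dtrace is not available, a similar "freeze before doing any work" effect can be sketched in plain POSIX C. This is a different technique from the script above: the child stops itself with raise(SIGSTOP) rather than being stopped from outside, so in practice it would mean adding a line to the program being debugged. run_child_stopped_at_start is a hypothetical name, and the child's exit code stands in for the real startup work.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Forks a child that stops itself before doing anything else, waits
 * until the child is reported as stopped, then resumes it.
 * Returns the child's exit code, or -1 on failure.
 */
static int
run_child_stopped_at_start(int child_exit_code)
{
    int   status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0)
    {
        raise(SIGSTOP);             /* freeze until SIGCONT arrives */
        _exit(child_exit_code);     /* stand-in for the real work */
    }

    /* Parent: block until the child reports that it has stopped. */
    if (waitpid(pid, &status, WUNTRACED) < 0 || !WIFSTOPPED(status))
        return -1;

    /* A debugger could be attached to pid at this point. */

    kill(pid, SIGCONT);             /* let the child continue */
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;

    return WEXITSTATUS(status);
}
```

The point of the sketch is the ordering: the child is guaranteed to be stopped before it runs any interesting code, so there is no race between startup and attaching.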



Re: Question about debugging bootstrapping and catalog entries

From
Tom Lane
Date:
"Gurjeet Singh" <singh.gurjeet@gmail.com> writes:
> On 12/18/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> There is already an option to sleep early in backend startup for the
>> normal case.  Not sure if it works for bootstrap, autovacuum, etc,
>> but I could see making it do so.

> You are probably referring to the command-line switch -W to postgres, that
> translates to 'PostAuthDelay' GUC variable; I think that kicks in a bit too
> late!

No, I was thinking of PreAuthDelay.  There might be cases where even
that is too late in the procedure --- probably not on Unix, but on
Windows there's a lot that happens before BackendInitialize.  But
offhand I don't know how we'd have a configurable delay much earlier
... custom insertions of hardwired delays into the source code are
probably the only good approach if you find that, say, guc.c
initialization fails in individual backends under Windows.

Back at the ranch, though, the question was whether it'd be worth
honoring PreAuthDelay in the other startup code paths such as
BootstrapMain.
        regards, tom lane