Thread: preventing encoding conversion while starting up

preventing encoding conversion while starting up

From
Tatsuo Ishii
Date:
I have faced a problem with encoding conversion while the database is
starting up. If postmaster accepts a connection request while in that
state, it issues a fatal message "The database system is starting
up". Then the encoding conversion system tries to convert the message
to the client encoding if necessary (e.g. PGCLIENTENCODING is set for
the postmaster process). It then calls recomputeNamespacePath(), which
calls GetUserId(), and that ends in an assertion failure (see below).

To prevent this, I would like to add a public function to postmaster.c
to know we are in the database starting up phase:

CAC_state BackendState()

Comments?
--
Tatsuo Ishii

#0  0x40103841 in __kill () from /lib/libc.so.6
#1  0x40103594 in raise (sig=6) at ../sysdeps/posix/raise.c:27
#2  0x40104c81 in abort () at ../sysdeps/generic/abort.c:88
#3  0x0817932b in Letext () at excabort.c:27
#4  0x08179292 in ExcUnCaught (excP=0x821813c, detail=0, data=0x0,
    message=0x820dd40 "!(((bool) ((CurrentUserId) != ((Oid)0))))")
    at exc.c:168
#5  0x081792d9 in ExcRaise (excP=0x821813c, detail=0, data=0x0,
    message=0x820dd40 "!(((bool) ((CurrentUserId) != ((Oid)0))))")
    at exc.c:185
#6  0x08177fa2 in ExceptionalCondition (
    conditionName=0x820dd40 "!(((bool) ((CurrentUserId) != ((Oid) 0))))",
    exceptionP=0x821813c, detail=0x0, fileName=0x820dcc0 "miscinit.c",
    lineNumber=502) at assert.c:70
#7  0x0817c8b2 in GetUserId () at miscinit.c:502
#8  0x080a2726 in recomputeNamespacePath () at namespace.c:1301
#9  0x080a26e8 in FindDefaultConversionProc (for_encoding=0, to_encoding=4)
    at namespace.c:1280
#10 0x0818930e in pg_do_encoding_conversion (
    src=0xbfffe510 "FATAL:  The database system is starting up\n", len=43,
    src_encoding=0, dest_encoding=4) at mbutils.c:108
#11 0x081896a3 in pg_server_to_client (
    s=0xbfffe510 "FATAL:  The database system is starting up\n", len=43)
    at mbutils.c:243
#12 0x080f2d76 in pq_sendstring (buf=0xbfffe474,
    str=0xbfffe510 "FATAL:  The database system is starting up\n")
    at pqformat.c:162
#13 0x08178c16 in send_message_to_frontend (type=20,
    msg=0xbfffe510 "FATAL:  The database system is starting up\n")
    at elog.c:750
#14 0x08178562 in elog (lev=21,
    fmt=0x81ecc60 "The database system is starting up") at elog.c:427
#15 0x0811755f in ProcessStartupPacket (port=0x826f4e0, SSLdone=0)
    at postmaster.c:1176
#16 0x0811838a in DoBackend (port=0x826f4e0) at postmaster.c:2115
#17 0x08117f6f in BackendStartup (port=0x826f4e0) at postmaster.c:1863
#18 0x081171dc in ServerLoop () at postmaster.c:972
#19 0x08116d46 in PostmasterMain (argc=4, argv=0x8254bf0) at postmaster.c:754
#20 0x080f33ff in main (argc=4, argv=0xbffff1a4) at main.c:204
#21 0x400f1fff in __libc_start_main (main=0x80f3230 <main>, argc=4,
    ubp_av=0xbffff1a4, init=0x8069b30 <_init>, fini=0x818a4e0 <_fini>,
    rtld_fini=0x4000c420 <_dl_fini>, stack_end=0xbffff19c)
    at ../sysdeps/generic/libc-start.c:129



Re: preventing encoding conversion while starting up

From
Tom Lane
Date:
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> I have faced a problem with encoding conversion while the database is
> starting up. If postmaster accepts a connection request while in that
> state, it issues a fatal message "The database system is starting
> up". Then the encoding conversion system tries to convert the message
> to the client encoding if necessary (e.g. PGCLIENTENCODING is set for
> the postmaster process). It then calls recomputeNamespacePath(), which
> calls GetUserId(), and that ends in an assertion failure (see below).

This seems to me to be a fatal objection to the entire concept of
storing encoding status in the database.  If the low-level client
communication code needs to access that information, then the postmaster
is broken and is not repairable.

> To prevent this, I would like to add a public function to postmaster.c
> to know we are in the database starting up phase:

How will that help?
        regards, tom lane


Re: preventing encoding conversion while starting up

From
Hannu Krosing
Date:
On Thu, 2002-07-18 at 06:55, Tatsuo Ishii wrote:
> I have faced a problem with encoding conversion while the database is
> starting up. If postmaster accepts a connection request while in that
> state, it issues a fatal message "The database system is starting
> up". Then the encoding conversion system tries to convert the message
> to the client encoding if necessary (e.g. PGCLIENTENCODING is set for
> the postmaster process). It then calls recomputeNamespacePath(), which
> calls GetUserId(), and that ends in an assertion failure (see below).
> 
> To prevent this, I would like to add a public function to postmaster.c
> to know we are in the database starting up phase:

Why can't we just open the listening socket _after_ the database has
completed starting up phase ?

------------------
Hannu



Re: preventing encoding conversion while starting up

From
Tom Lane
Date:
Hannu Krosing <hannu@tm.ee> writes:
> Why can't we just open the listening socket _after_ the database has
> completed starting up phase ?

The problem is not just there.  The real problem is that with this patch
installed, it is impossible to report startup errors of any kind,
because the client communication mechanism now depends on having working
database access.  I regard this as a fatal problem :-(
        regards, tom lane


Re: preventing encoding conversion while starting up

From
Hannu Krosing
Date:
On Thu, 2002-07-18 at 18:57, Tom Lane wrote:
> Hannu Krosing <hannu@tm.ee> writes:
> > Why can't we just open the listening socket _after_ the database has
> > completed starting up phase ?
> 
> The problem is not just there.  The real problem is that with this patch
> installed, it is impossible to report startup errors of any kind,
> because the client communication mechanism now depends on having working
> database access.  I regard this as a fatal problem :-(

So the right way would be to always start up in us-ascii (7-bit) and
re-negotiate encodings later ?

-----------
Hannu



Re: preventing encoding conversion while starting up

From
Tom Lane
Date:
Hannu Krosing <hannu@tm.ee> writes:
> On Thu, 2002-07-18 at 18:57, Tom Lane wrote:
>> The problem is not just there.  The real problem is that with this patch
>> installed, it is impossible to report startup errors of any kind,
>> because the client communication mechanism now depends on having working
>> database access.  I regard this as a fatal problem :-(

> So the right way would be to always start up in us-ascii (7-bit) and
> re-negotiate encodings later ?

That might be one way out ... but doesn't it mean breaking the wire
protocol?  Existing clients aren't likely to know to do that.

It seems like we've collected enough reasons for a protocol change that
one might happen for 7.4.  I'd rather not have it happen in 7.3, though,
because we don't have enough time left to address all the issues I'd
like to see addressed...
        regards, tom lane


Re: preventing encoding conversion while starting up

From
Hannu Krosing
Date:
On Fri, 2002-07-19 at 03:21, Tom Lane wrote:
> Hannu Krosing <hannu@tm.ee> writes:
> > On Thu, 2002-07-18 at 18:57, Tom Lane wrote:
> >> The problem is not just there.  The real problem is that with this patch
> >> installed, it is impossible to report startup errors of any kind,
> >> because the client communication mechanism now depends on having working
> >> database access.  I regard this as a fatal problem :-(
> 
> > So the right way would be to always start up in us-ascii (7-bit) and
> > re-negotiate encodings later ?
> 
> That might be one way out ... but doesn't it mean breaking the wire
> protocol?  Existing clients aren't likely to know to do that.

It may be possible to make it compatible with old clients by

1) starting with the same encodings as we always did

2) changing the encoding only if both parties agree to do so. I think that
we could use listen/notify for that

So the client must first ask for a certain encoding by (mis)using listen,
and the request will then be confirmed by notify

hannu=# listen "pg_encoding ISO-8859-15"; 
LISTEN
hannu=# notify "pg_encoding ISO-8859-15"; 
NOTIFY
Asynchronous NOTIFY 'pg_encoding ISO-8859-15' from backend with pid 2319 received.
hannu=# 

It would allow us to do it without protocol changes.

Not that I like it, though ;(

> It seems like we've collected enough reasons for a protocol change that
> one might happen for 7.4.  I'd rather not have it happen in 7.3, though,
> because we don't have enough time left to address all the issues I'd
> like to see addressed...

But we could start making a list of issues and proposed solutions now, or
we will not have enough time in the 7.4 cycle either.

--------------
Hannu




Re: preventing encoding conversion while starting up

From
Tatsuo Ishii
Date:
> Hannu Krosing <hannu@tm.ee> writes:
> > On Thu, 2002-07-18 at 18:57, Tom Lane wrote:
> >> The problem is not just there.  The real problem is that with this patch
> >> installed, it is impossible to report startup errors of any kind,
> >> because the client communication mechanism now depends on having working
> >> database access.  I regard this as a fatal problem :-(
> 
> > So the right way would be to always start up in us-ascii (7-bit) and
> > re-negotiate encodings later ?
> 
> That might be one way out ... but doesn't it mean breaking the wire
> protocol?  Existing clients aren't likely to know to do that.

No. We have been doing the encoding negotiation outside the existing
protocol. After the backend goes into the normal idle loop, the client
sends a "set client_encoding" query if it wishes.

BTW, for the problem I reported, what about checking that
IsTransactionState() returns true before accessing the database to look
up conversions?
--
Tatsuo Ishii


Re: preventing encoding conversion while starting up

From
Tom Lane
Date:
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> BTW, for the problem I reported, what about checking that
> IsTransactionState() returns true before accessing the database to look
> up conversions?

The $64 problem here is *what do you do before you can access the database*.
Detecting whether you can access the database yet is irrelevant unless
you can say what you're going to do when the answer is "no".
        regards, tom lane


Re: preventing encoding conversion while starting up

From
Tatsuo Ishii
Date:
> Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> > BTW, for the problem I reported, what about checking that
> > IsTransactionState() returns true before accessing the database to look
> > up conversions?
> 
> The $64 problem here is *what do you do before you can access the database*.
> Detecting whether you can access the database yet is irrelevant unless
> you can say what you're going to do when the answer is "no".

Of course we could do no encoding conversion if the answer is "no".
What's wrong with this?

Also I'm thinking about treating SQL_ASCII encoding as "special": if
the database or client encoding is SQL_ASCII, then we could always avoid
encoding conversion. Currently guc assumes the default encoding for
the client is SQL_ASCII until the conversion system finds the requested
client encoding (actually the conversion system itself regards SQL_ASCII
as the default). This is actually unnecessary right now, but it would
minimize possible problems in the future. Ideally there should be a
special encoding "NO_CONVERSION", but people seem to treat SQL_ASCII as
almost identical to it anyway (remember the days when multibyte was
optional).
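
The SQL_ASCII special case described above could be sketched roughly as
follows. This is only an illustration under stated assumptions: the
encoding_id type, its members other than SQL_ASCII, and the function
conversion_needed() are hypothetical names, not the backend's actual
definitions.

```c
#include <stdbool.h>

/* Hypothetical encoding identifiers; the numeric values are arbitrary. */
typedef enum {
    ENC_SQL_ASCII,
    ENC_EUC_JP,
    ENC_LATIN1
} encoding_id;

/*
 * Decide whether a message needs encoding conversion at all.
 * SQL_ASCII on either side is treated as "no conversion", so this
 * path never needs a conversion lookup.
 */
bool conversion_needed(encoding_id server, encoding_id client)
{
    if (server == ENC_SQL_ASCII || client == ENC_SQL_ASCII)
        return false;           /* special case: pass bytes through */
    return server != client;    /* identical encodings need no work */
}
```

With a check like this in front of the conversion code, a message sent
while the client encoding is still at its SQL_ASCII default would be passed
through unchanged.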
--
Tatsuo Ishii


Re: preventing encoding conversion while starting up

From
Tom Lane
Date:
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
>> The $64 problem here is *what do you do before you can access the database*.
>> Detecting whether you can access the database yet is irrelevant unless
>> you can say what you're going to do when the answer is "no".

> Of course we could do no encoding conversion if the answer is "no".
> What's wrong with this?

Maybe nothing, if that's what the clients will expect.

I don't actually much care for using IsTransactionState() to try to
determine whether it's safe to look at the database.  I'd suggest that
the conversion system start up in the no-conversions state, and change
over to doing a conversion only when explicitly told to --- which would
happen in the same late phase of startup that it used to (near the
SetConfigOption("client_encoding") calls), or at subsequent SET
commands.  In other words, keep the database lookups firmly out of the
conversion system proper, and have it do only what it's told.  This
seems much safer than doing a lookup at whatever random time a message
is generated.
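
A minimal sketch of that arrangement, assuming hypothetical names
throughout (conv_fn, set_client_conversion(), server_to_client() and the
stand-in toy_upcase() are illustrative and are not the functions in
mbutils.c): the message path consults only in-memory state, and that state
changes only when some other code explicitly sets it.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical signature for a conversion routine: writes at most dstcap
 * bytes to dst and returns the number of bytes written. */
typedef size_t (*conv_fn)(const char *src, size_t len,
                          char *dst, size_t dstcap);

/* NULL means "no conversion", which is the state at process start. */
static conv_fn active_conversion = NULL;

/*
 * Called only when explicitly told to: late in backend startup or from
 * a SET command.  The caller does any database lookup beforehand and
 * passes in the resolved routine (or NULL to go back to no conversion).
 */
void set_client_conversion(conv_fn fn)
{
    active_conversion = fn;
}

/* Message path: consults only the in-memory state, never the database. */
size_t server_to_client(const char *src, size_t len,
                        char *dst, size_t dstcap)
{
    if (active_conversion == NULL) {
        size_t n = len < dstcap ? len : dstcap;
        memcpy(dst, src, n);    /* pass bytes through unchanged */
        return n;
    }
    return active_conversion(src, len, dst, dstcap);
}

/* Stand-in "conversion" used only to make the sketch testable. */
size_t toy_upcase(const char *src, size_t len, char *dst, size_t dstcap)
{
    size_t n = len < dstcap ? len : dstcap;
    for (size_t i = 0; i < n; i++)
        dst[i] = (char) toupper((unsigned char) src[i]);
    return n;
}
```

In this sketch an error raised before set_client_conversion() has been
called is simply copied through, so reporting it cannot trigger a catalog
lookup.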

> Also I'm thinking about treating SQL_ASCII encoding as "special": if
> the database or client encoding is SQL_ASCII, then we could always avoid
> encoding conversion.

Good idea.  I was somewhat troubled by the prospect of added overhead
for people who didn't need multibyte at all --- wasn't there a
commitment that there wouldn't be any noticeable slowdown from enabling
multibyte all the time?
        regards, tom lane


PITR and rollback

From
Dhruv Pilania
Date:
Hi,

I am a new PostgreSQL developer and need some help with WAL/PITR. Can
someone working in this area answer my question?
(The email looks long but the question is simple :) )

I have been trying to implement undo of transactions using WAL, i.e. given
an xid x, postgres can undo all operations of x. For starters, I
want to do this in very simple cases, i.e. assume x only
inserts/updates/deletes tuples and does not change the database schema. I
also assume that all of x's WAL entries are in one segment.

The code for this is quite simple if the database supports undo or rollback
to a point in time. There is a lot of discussion on the mailing list about
PITR. I am eagerly waiting for the PITR code to be available in CVS. So
my questions are:

1. Once PITR has been implemented, infinite play forward will work. Will
undo also be supported? I.e. can we recover to the past from a "current"
WAL log?
As a very simple scenario:
xid 1 "insert record y in relation r" commit
xid 2 "update record x in relation r" commit
shutdown
Now we take the database back to the start of xid 1.

If the answer to question 1 is no:
2. My approach is something like this:
scan log back until start of transaction record
scan forward until commit record
    if record is for transaction x
        undo(record)
To undo,
use the preimage in the record; everything else is pretty much the same as redo.
I.e. we open the relation, get the desired block, work on it, etc.
Can someone tell me if this will work?


Hoping someone currently working on WAL/PITR can help me with these
issues.

thanks,
Dhruv


PS.

transaction dependency tracking
-------------------------------
I added support in postgres to do transaction dependency tracking.
Basically, x depends on y if x reads something written by y. I maintain a
dependency graph and also a corresponding disk-based log that is accessed
only at transaction commit. There is a tool which can be used to query
this graph. The time overheads are pretty low (< 1%).
With a dependency graph a DBA can say "I want to undo transaction x and
all transactions that depend on x".
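
The "x and all transactions that depend on x" query amounts to a
reachability walk over the dependency graph. A small sketch, assuming an
in-memory adjacency matrix and hypothetical names (depends_on,
record_dependency(), collect_undo_set()); the real tracking code and its
on-disk log are not shown here.

```c
#include <stdbool.h>

#define MAX_XACT 16

/* depends_on[x][y] is true when x read something written by y. */
static bool depends_on[MAX_XACT][MAX_XACT];

/* Record that transaction `reader` read data written by `writer`. */
void record_dependency(int reader, int writer)
{
    depends_on[reader][writer] = true;
}

/*
 * Mark x and every transaction that depends on it, directly or
 * transitively.  The undo[] array must be all false on first call;
 * it doubles as the "visited" set, so cycles cannot loop forever.
 */
void collect_undo_set(int x, bool undo[MAX_XACT])
{
    if (undo[x])
        return;
    undo[x] = true;
    for (int t = 0; t < MAX_XACT; t++)
        if (depends_on[t][x])
            collect_undo_set(t, undo);
}
```

For example, if 2 read from 1 and 3 read from 2, asking to undo 1 marks
1, 2 and 3, while unrelated transactions stay unmarked.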

So now in the second phase, I am looking at undo of transactions. Any
thoughts on this are very welcome.




Re: PITR and rollback

From
Richard Tucker
Date:

> -----Original Message-----
> From: Dhruv Pilania [mailto:dhruv@cs.sunysb.edu]
> Sent: Saturday, July 20, 2002 11:37 PM
> To: pgsql-hackers@postgresql.org
> Cc: pgman@candle.pha.pa.us; richt@multera.com
> Subject: PITR and rollback
>
>
> Hi,
>
> I am a new PostgreSQL developer and need some help with WAL/PITR. Can
> someone working in this area answer my question?
> (The email looks long but the question is simple :) )
>
> I have been trying to implement undo of transactions using WAL, i.e. given
> an xid x, postgres can undo all operations of x. For starters, I
> want to do this in very simple cases, i.e. assume x only
> inserts/updates/deletes tuples and does not change the database schema. I
> also assume that all of x's WAL entries are in one segment.

Strictly speaking Postgres does not undo transactions but only aborts them.
It does not roll back through the log undoing the effects of the
transaction.  It merely sets the state of the transaction in the commit log
(clog) to aborted.  Then application of the tuple visibility rules prevents
other transactions from seeing any tuples changed by the aborted
transaction.
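
A toy model of that behaviour, as a sketch only: the status array, the
tuple_header layout and the simplified visibility test below are
illustrative assumptions (no snapshots, no in-progress handling beyond
"not visible"), not the actual clog or tuple-visibility code.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t xid_t;

typedef enum {
    XACT_IN_PROGRESS = 0,   /* default for a zero-initialized slot */
    XACT_COMMITTED,
    XACT_ABORTED
} xact_status;

#define MAX_XIDS 1024

/* Stand-in for the commit log: one status per transaction id. */
static xact_status clog_status[MAX_XIDS];

typedef struct {
    xid_t xmin;   /* transaction that inserted the tuple */
    xid_t xmax;   /* transaction that deleted it, 0 if none */
} tuple_header;

/* Commit and abort only flip the status; no tuple is touched. */
void xact_commit(xid_t xid) { clog_status[xid] = XACT_COMMITTED; }
void xact_abort(xid_t xid)  { clog_status[xid] = XACT_ABORTED; }

/* Simplified rule: visible if the insert committed and no delete did. */
bool tuple_visible(const tuple_header *t)
{
    if (clog_status[t->xmin] != XACT_COMMITTED)
        return false;
    if (t->xmax != 0 && clog_status[t->xmax] == XACT_COMMITTED)
        return false;
    return true;
}
```

Aborting the inserting transaction hides the tuple without rewriting it,
and aborting a deleting transaction leaves the tuple visible, which is why
no undo pass over the log is needed.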

>
> The code for this is quite simple if the database supports undo or rollback
> to a point in time. There is a lot of discussion on the mailing list about
> PITR. I am eagerly waiting for the PITR code to be available in CVS. So
> my questions are:

What I implemented was a roll forward to a point in time.  That is, restore a
backup and roll forward through the WAL files until you reach a transaction
that committed at or after the specified time.
I should have a context diff for my roll forward implementation available
against current cvs HEAD by the end of the week.
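
The stopping rule for the roll forward can be sketched as below. This is
not the forthcoming patch: the record layout and the roll_forward() function
are hypothetical, and the sketch assumes the commit record that reaches the
target time is itself not replayed.

```c
#include <stddef.h>
#include <time.h>

typedef enum { REC_DATA, REC_COMMIT } rec_kind;

/* Hypothetical, heavily simplified log record. */
typedef struct {
    rec_kind kind;
    unsigned xid;
    time_t   commit_time;   /* meaningful only for REC_COMMIT */
} wal_rec;

/*
 * Replay records in order and stop at the first commit record whose
 * time is at or after `target`.  Returns the number of records
 * replayed.
 */
size_t roll_forward(const wal_rec *recs, size_t n, time_t target)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (recs[i].kind == REC_COMMIT && recs[i].commit_time >= target)
            break;              /* stop before applying this commit */
        /* a real implementation would redo recs[i] here */
    }
    return i;
}
```

Data records of transactions that never reach their commit record before
the stop point are replayed but remain uncommitted, so their changes stay
invisible.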

>
> 1. Once PITR has been implemented, infinite play forward will work. Will
> undo also be supported? I.e. can we recover to the past from a "current"
> WAL log?
> As a very simple scenario:
> xid 1 "insert record y in relation r" commit
> xid 2 "update record x in relation r" commit
> shutdown
> Now we take the database back to the start of xid 1.
>
> If the answer to question 1 is no:
> 2. My approach is something like this:
> scan log back until start of transaction record
> scan forward until commit record
>     if record is for transaction x
>         undo(record)
> To undo,
> use the preimage in the record; everything else is pretty much the same as redo.

The WAL file does not contain "preimages", only after-images.

> I.e. we open the relation, get the desired block, work on it, etc.
> Can someone tell me if this will work?
>
I did a roll forward to a point in time, but I think a roll back to a point
in time would work like this:
Roll back through the WAL files looking for transaction commit records and
change the status in the clog to aborted, until you reach the first commit
record for a transaction that committed before the specified roll back time.
The main thing that needs to be implemented here is reading backward through
the log files, which I'm not sure is possible since the WAL records do not
have a length suffix (they have a length prefix for reading forward, but I
don't think they have a length suffix as well).
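
To make the prefix/suffix point concrete, here is a sketch over a made-up
record layout of [length][payload][length]. It is not the real WAL format;
append_record(), next_record() and prev_record() are hypothetical helpers.
With only the leading length, next_record() works but there is nothing at a
known position from which prev_record() could find where the previous
record starts.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write one record at offset `off`; returns the offset just past it. */
size_t append_record(uint8_t *buf, size_t off,
                     const void *payload, uint32_t len)
{
    memcpy(buf + off, &len, sizeof len);                    /* prefix */
    memcpy(buf + off + sizeof len, payload, len);
    memcpy(buf + off + sizeof len + len, &len, sizeof len); /* suffix */
    return off + 2 * sizeof len + len;
}

/* Forward scan: the prefix says how far to hop. */
size_t next_record(const uint8_t *buf, size_t off)
{
    uint32_t len;

    memcpy(&len, buf + off, sizeof len);
    return off + 2 * sizeof len + len;
}

/*
 * Backward scan: `off` is the start of a record (or the end of the
 * log).  The suffix of the previous record sits immediately before it.
 */
size_t prev_record(const uint8_t *buf, size_t off)
{
    uint32_t len;

    memcpy(&len, buf + off - sizeof len, sizeof len);
    return off - 2 * sizeof len - len;
}
```

A back-pointer stored in each record header would serve the same purpose
as the trailing length in this sketch.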




Re: PITR and rollback

From
Bruce Momjian
Date:
Any chance you can work on save points/nested transactions?  See
doc/TODO.detail/transactions for info.  I can help explain the ideas
in there.


--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026