Re: xlog viewer proposal - Mailing list pgsql-hackers

From Diogo Biazus
Subject Re: xlog viewer proposal
Date
Msg-id eca519a10606230659p19f8be81p38a1801a2838ae65@mail.gmail.com
In response to Re: xlog viewer proposal  (Simon Riggs <simon@2ndquadrant.com>)
Responses Re: xlog viewer proposal  ("Diogo Biazus" <diogob@gmail.com>)
Re: xlog viewer proposal  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-hackers


On 6/23/06, Simon Riggs <simon@2ndquadrant.com> wrote:
> - give more flexibility for managing the xlogs remotely

Not sure what you mean.

> - I think it's faster to implement and to have a working and usable
> tool.

Why do you think that? It sounds like you've got more work since you
effectively need to rewrite the _desc routines.

Yes, but I don't need to worry about formatting the program's output, and I get the backend's memory management and error handling for free.
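
For illustration, a minimal sketch of what that buys (the function and its callers are hypothetical, not existing code): allocations come from the current memory context via palloc, so they are released when the context is reset, and failures are raised with ereport instead of hand-rolled cleanup-and-exit paths:

#include "postgres.h"

/*
 * Hypothetical helper for a backend-side xlog decoder.  palloc'd
 * memory is freed automatically when the surrounding memory context
 * is reset, and ereport(ERROR, ...) reports the problem and unwinds
 * without any explicit cleanup code here.
 */
static char *
copy_payload(const char *data, int len)
{
    char   *buf;

    if (len <= 0)
        ereport(ERROR,
                (errmsg("record has no payload to copy")));

    buf = palloc(len);
    memcpy(buf, data, len);
    return buf;
}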

> And there is one option to minimize the problem in the failed cluster
> case: the wrapper program could give the option to initdb a temporary
> area when no connection is given, creating a backend just to analyze a
> set of xlogs.

It seems a reasonable assumption that someone reading PostgreSQL logs
would have access to another PostgreSQL cluster. It obviously needs to
work when the server that originated the logs is unavailable, but that
does not mean that all PostgreSQL systems are unavailable. There's no
need to try to wrap initdb - just note that people would have to have
access to a PostgreSQL system.

Yes, that's what I thought: wrapping initdb isn't needed, but it would make things easier for a newbie.

> Other option is to start by the standalone tool and create a wrapper
> function inside postgresql that would just call this external program
> and extract data from the xlogs using this program's output (with some
> option to output all data in a CSV format).

I think this idea is a good one, but we must also consider whether it
can be done effectively within the time available. Is this something you
can do now, or something you want to do in the future?

I think it could be done now; I already have some code ready for calling external programs from within the database. It would be one of the enhancements if we choose the standalone path.
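
A rough sketch of that path, assuming a standalone xlogdump that can emit CSV (the --csv flag and the "xid,rmid,info" field layout are assumptions, not an existing interface): the wrapper, whether inside the backend or not, just runs the program and splits its output:

#include <stdio.h>
#include <string.h>

/*
 * Hedged sketch: run a standalone xlogdump over one WAL segment and
 * parse hypothetical CSV output of the form "xid,rmid,info".
 */
int
main(int argc, char **argv)
{
    char    cmd[1024];
    char    line[1024];
    FILE   *pipe;

    if (argc < 2)
    {
        fprintf(stderr, "usage: %s <xlog-segment>\n", argv[0]);
        return 1;
    }

    snprintf(cmd, sizeof(cmd), "xlogdump --csv %s", argv[1]);

    pipe = popen(cmd, "r");
    if (pipe == NULL)
    {
        perror("popen");
        return 1;
    }

    while (fgets(line, sizeof(line), pipe) != NULL)
    {
        char   *xid = strtok(line, ",");
        char   *rmid = strtok(NULL, ",");
        char   *info = strtok(NULL, "\n");

        if (xid && rmid && info)
            printf("xid=%s rmid=%s info=%s\n", xid, rmid, info);
    }

    pclose(pipe);
    return 0;
}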

The alternative of enhancing xlogdump needs to be considered more
fully now and quickly, so coding can begin as soon as possible.
- Diogo: what additional things can you make xlogdump do?

I could add options to display the data of the backup blocks, add the missing rmids (RM_HASH_ID, RM_GIST_ID, RM_SEQ_ID), add an option to report only transaction info (xids and their commit status), and write a contrib module that calls xlogdump and parses its output.
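
For the rmid part, the shape of the change is roughly the following. Everything here is a stand-in so the sketch compiles on its own: the real RM_*_ID values live in access/rmgr.h, the real record header in access/xlog.h, and the print_* routines are the code that would have to be written:

#include <stdio.h>

/* Placeholder rmgr IDs and record header, only to keep this sketch
 * self-contained; the values are not the real ones. */
enum { RM_XACT_ID, RM_HEAP_ID, RM_HASH_ID, RM_GIST_ID, RM_SEQ_ID };

typedef struct
{
    int xl_rmid;
} RecordStub;

static void
print_seq_record(RecordStub *rec)
{
    printf("sequence record (rmid %d)\n", rec->xl_rmid);
}

static void
print_index_record(RecordStub *rec)
{
    printf("hash/gist index record (rmid %d)\n", rec->xl_rmid);
}

/* Per-record dispatch: supporting a missing rmid is one more case. */
static void
dump_record(RecordStub *rec)
{
    switch (rec->xl_rmid)
    {
        case RM_SEQ_ID:
            print_seq_record(rec);      /* to be added */
            break;
        case RM_HASH_ID:
        case RM_GIST_ID:
            print_index_record(rec);    /* to be added */
            break;
        default:
            printf("rmid %d: already handled or not decoded\n",
                   rec->xl_rmid);
            break;
    }
}

int
main(void)
{
    RecordStub rec = { RM_SEQ_ID };

    dump_record(&rec);
    return 0;
}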

- Tom: can you say more about what you'd like to see from a tool, to
help Diogo determine the best way forward. What value can he add if you
have already written the tool?


Some other considerations:
The biggest difficulty is finding "loser transactions" - ones that have
not yet committed by the end of the log. You need to do this in both
cases if you want to allow transaction state to be determined precisely
for 100% of transactions; otherwise you might have to have an Unknown
transaction state in addition to the others.

Yes, this is one thing we have to do in any case.
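
A minimal sketch of that bookkeeping (the record feed and the fixed-size table are toy stand-ins for a real WAL reader): remember every xid seen, mark it resolved when a commit or abort record appears, and whatever is left unresolved at end-of-log is a loser, or at best Unknown:

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

#define MAX_XIDS 1024

typedef struct
{
    unsigned int xid;
    bool         resolved;   /* saw a commit or abort record */
} XidEntry;

static XidEntry seen[MAX_XIDS];
static int      nseen = 0;

static XidEntry *
lookup_xid(unsigned int xid)
{
    for (int i = 0; i < nseen; i++)
        if (seen[i].xid == xid)
            return &seen[i];

    if (nseen >= MAX_XIDS)
    {
        fprintf(stderr, "too many xids for this toy table\n");
        exit(1);
    }
    seen[nseen].xid = xid;
    seen[nseen].resolved = false;
    return &seen[nseen++];
}

/* Called once per WAL record; is_xact_end is true for commit/abort. */
static void
observe_record(unsigned int xid, bool is_xact_end)
{
    XidEntry *e = lookup_xid(xid);

    if (is_xact_end)
        e->resolved = true;
}

static void
report_losers(void)
{
    for (int i = 0; i < nseen; i++)
        if (!seen[i].resolved)
            printf("xid %u: no commit/abort by end of log\n", seen[i].xid);
}

int
main(void)
{
    /* toy feed of records */
    observe_record(100, false);
    observe_record(101, false);
    observe_record(100, true);   /* xid 100 committed */
    report_losers();             /* xid 101 is a loser */
    return 0;
}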

What nobody has mentioned is that connecting to a db to lookup table
names from OIDs is only possible if that db knows about the set of
tables the log files refer to. How would we be certain that the
OID-to-tablename match would be a reliable one?

Good question. It seems to me that the only case where this has a trivial answer is when you're inside the backend querying the current xlog directory.
I'm still thinking about a solution for the other cases (inside or outside the backend).
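
For that trivial case, the lookup itself is straightforward, e.g. via SPI; the function below is a hedged sketch, and the caveat above still applies whenever the connected cluster is not the one that produced the log:

#include "postgres.h"
#include "executor/spi.h"

/*
 * Hedged sketch: ask the cluster we are connected to what name it has
 * for a relation OID seen in the WAL.  The answer is only trustworthy
 * when this is the cluster that wrote the log.
 */
static void
report_relname(Oid reloid)
{
    char    query[128];

    snprintf(query, sizeof(query),
             "SELECT relname FROM pg_class WHERE oid = %u", reloid);

    if (SPI_connect() != SPI_OK_CONNECT)
        elog(ERROR, "SPI_connect failed");

    if (SPI_execute(query, true, 1) == SPI_OK_SELECT && SPI_processed > 0)
        elog(NOTICE, "oid %u is relation \"%s\"", reloid,
             SPI_getvalue(SPI_tuptable->vals[0], SPI_tuptable->tupdesc, 1));
    else
        elog(NOTICE, "oid %u is not known to this cluster", reloid);

    SPI_finish();
}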

--
Diogo Biazus - diogob@gmail.com
Móvel Consultoria
http://www.movelinfo.com.br
http://www.postgresql.org.br
