Thread: building a binary-portable database

building a binary-portable database

From

Alexy Khrabrov

Date:

31 July 2009, 19:02:58

I've filled a database on Linux and tried to bring it across to Mac OSX, and got an error: FATAL: incorrect checksum in control file.

Exploring with pg_controldata shows differences such as:

Linux:

pg_control version number: 833

Catalog version number: 200711281

Database system identifier: 5338438316294798685

Database cluster state: in production

pg_control last modified: Fri 24 Jul 2009 06:37:28 PM EDT

Latest checkpoint location: 4B/5E4B2EC8

Prior checkpoint location: 4B/5E011FD8

Latest checkpoint's REDO location: 4B/5E4B2EC8

Latest checkpoint's TimeLineID: 1

Latest checkpoint's NextXID: 0/47680877

Latest checkpoint's NextOID: 24576

Latest checkpoint's NextMultiXactId: 1

Latest checkpoint's NextMultiOffset: 0

Time of latest checkpoint: Fri 24 Jul 2009 06:34:58 PM EDT

Minimum recovery ending location: 0/0

Maximum data alignment: 8

Database block size: 8192

Blocks per segment of large relation: 131072

WAL block size: 8192

Bytes per WAL segment: 16777216

Maximum length of identifiers: 64

Maximum columns in an index: 32

Maximum size of a TOAST chunk: 1996

Date/time type storage: floating-point numbers

Maximum length of locale name: 128

LC_COLLATE: en_US.UTF-8

LC_CTYPE: en_US.UTF-8

Mac:

WARNING: Calculated CRC checksum does not match value stored in file.

Either the file is corrupt, or it has a different layout than this program

is expecting. The results below are untrustworthy.

pg_control version number: 833

Catalog version number: 200711281

Database system identifier: 5338438316294798685

Database cluster state: in production

pg_control last modified: Wed Dec 31 19:00:00 1969

Latest checkpoint location: 4A6A35E8/0

Prior checkpoint location: 4B/5E011FD8

Latest checkpoint's REDO location: 4B/5B2B3F28

Latest checkpoint's TimeLineID: 75

Latest checkpoint's NextXID: 1560372384/1

Latest checkpoint's NextOID: 0

Latest checkpoint's NextMultiXactId: 47680877

Latest checkpoint's NextMultiOffset: 24576

Time of latest checkpoint: Wed Dec 31 19:00:01 1969

Minimum recovery ending location: 0/4A6A35E6

Maximum data alignment: 0

Database block size: 8

Blocks per segment of large relation: 0

WAL block size: 0

Bytes per WAL segment: 1093850759

Maximum length of identifiers: 8192

Maximum columns in an index: 131072

Maximum size of a TOAST chunk: 8192

Date/time type storage: 64-bit integers

Maximum length of locale name: 64

LC_COLLATE:

LC_CTYPE:

-- apparently, Linux uses floating-point timestamps, while Mac uses 64-bit integers.

Is there a set of options for the build which will ensure I'll be able to rsync the databases between 64-bit Linux and 64-bit Mac, both Intel ones?

Cheers,

Alexy

Re: building a binary-portable database

From

John R Pierce

Date:

31 July 2009, 19:31:02

Alexy Khrabrov wrote:
> I've filled a database on Linux and tried to bring it across to Mac
> OSX, and got an error: FATAL:  incorrect checksum in control file.

use pg_dumpall | psql  to export/import databases between machines.

Re: building a binary-portable database

From

Alexy Khrabrov

Date:

31 July 2009, 19:54:03

On Jul 31, 2009, at 6:30 PM, John R Pierce wrote:

> Alexy Khrabrov wrote:
>> I've filled a database on Linux and tried to bring it across to Mac
>> OSX, and got an error: FATAL:  incorrect checksum in control file.
>
> use pg_dumpall | psql  to export/import databases between machines.

That's doable of course, but I wonder what would it take to get it to
work as-is, when building pg from source on each box, giving their
fairly similar characteristics.  Or, if time_t is different, would it
be a show-stopper?

Cheers,
Alexy

Re: building a binary-portable database

From

Tom Lane

Date:

31 July 2009, 20:10:14

Alexy Khrabrov <deliverable@gmail.com> writes:
> That's doable of course, but I wonder what would it take to get it to
> work as-is, when building pg from source on each box, giving their
> fairly similar characteristics.

It's not promised to work, and if it breaks you get to keep both pieces.

> Or, if time_t is different, would it be a show-stopper?

The pg_controldata output suggests that indeed time_t size difference is
the immediate issue.  But there might be other ones lurking behind that.

(FWIW, 8.4 has removed this particular potential platform difference:
http://archives.postgresql.org/pgsql-committers/2008-02/msg00184.php
But I still wouldn't trust copying DB files between significantly
different OSes.)

            regards, tom lane

Re: building a binary-portable database

From

Alban Hertroys

Date:

01 August 2009, 09:24:35

On 1 Aug 2009, at 24:53, Alexy Khrabrov wrote:
> That's doable of course, but I wonder what would it take to get it
> to work as-is, when building pg from source on each box, giving
> their fairly similar characteristics.  Or, if time_t is different,
> would it be a show-stopper?

Fairly similar? The one is a Linux kernel with a Linux environment,
the other is a Mach kernel with a mostly BSD environment. I think you
can expect all kinds of fun with line-ending styles (Mac traditionally
used \r, Linux/Unix uses \n), locale handling differences, different
time zone handling, different page sizes, different floating point
sizes, etc.

You're in for an interesting experiment, if you manage to get the DB
running at all. If you do I'd thoroughly check for any data-
misinterpretation. I suggest doing a diff on text-dumps from both
databases and see if there are any differences. I'd still not trust
the data in it afterwards, but whether that matters depends on what
you intend to use it for.

If you want safe and sound, use pg_dump/restore.

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.

!DSPAM:737,4a7433f410135509746239!

Re: building a binary-portable database

From

Alexy Khrabrov

Date:

02 August 2009, 02:42:33

On Aug 1, 2009, at 8:24 AM, Alban Hertroys wrote:

> On 1 Aug 2009, at 24:53, Alexy Khrabrov wrote:
>> That's doable of course, but I wonder what would it take to get it
>> to work as-is, when building pg from source on each box, giving
>> their fairly similar characteristics.  Or, if time_t is different,
>> would it be a show-stopper?
>
>
> Fairly similar? The one is a Linux kernel with a Linux environment,
> the other is a Mach kernel with a mostly BSD environment. I think
> you can expect all kinds of fun with line-ending styles (Mac
> traditionally used \r, Linux/Unix uses \n), locale handling
> differences, different time zone handling, different page sizes,
> different floating point sizes, etc.
>
> You're in for an interesting experiment, if you manage to get the DB
> running at all. If you do I'd thoroughly check for any data-
> misinterpretation. I suggest doing a diff on text-dumps from both
> databases and see if there are any differences. I'd still not trust
> the data in it afterwards, but whether that matters depends on what
> you intend to use it for.
>
> If you want safe and sound, use pg_dump/restore.

Well, my question, of course, is, how come all those differences might
affect PG binary data so much -- portable design would try to minimize
such effects, wouldn't it?  Does it optimize for all of the above
intentionally, is it a side-effect of its design, and/or is there a
set of options for the build time which might minimize binary
incompatibility?  I'd like to understand exactly why and how we get
binary incompatibility, and what exactly do we get for not having it,
and can it be a choice. There's a lot of databases out there. e.g.
Berkeley DB, where the backup is mv or ftp.  Performance is allright,
too.  I wish I could configure some of my PG ones that way...

Cheers,
Alexy

Re: building a binary-portable database

From

Martijn van Oosterhout

Date:

02 August 2009, 07:02:09

On Sun, Aug 02, 2009 at 01:42:13AM -0400, Alexy Khrabrov wrote:
> Well, my question, of course, is, how come all those differences might
> affect PG binary data so much -- portable design would try to minimize
> such effects, wouldn't it?  Does it optimize for all of the above
> intentionally, is it a side-effect of its design, and/or is there a set
> of options for the build time which might minimize binary
> incompatibility?  I'd like to understand exactly why and how we get
> binary incompatibility, and what exactly do we get for not having it,
> and can it be a choice. There's a lot of databases out there. e.g.
> Berkeley DB, where the backup is mv or ftp.  Performance is allright,
> too.  I wish I could configure some of my PG ones that way...

As long as you're only dealing with strings it's not a problem, but
when you start talking about integers or floating point there is no
"standard format". While it would theoretically be possible to make a
binary compatable version, the cost would be an extra conversion layer
for each and every column access in every table.

This is before you have even taken into acocunt the fact that different
CPUs have different alignment requirements for different types, so to
be portable you would have to take the worst case, which is just
wasting space on architectures which don't need it.

Finally you have effects like on 64-bit architectures you can pass a
64-bit value in a register, whereas on 32-bit architectures you may need
to allocate memory and pass a pointer.

Binary compatability takes work and costs performance and we prefer to
focus on other things.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.

Attachment

signature.asc

Re: building a binary-portable database

From

Alexy Khrabrov

Date:

02 August 2009, 13:02:57

How about portability between systems with the same endianness and
bitness, e.g. Intel 64-bit ones?

Cheers,
Alexy

Re: building a binary-portable database

From

Martijn van Oosterhout

Date:

02 August 2009, 13:21:29

On Sun, Aug 02, 2009 at 12:02:41PM -0400, Alexy Khrabrov wrote:
> How about portability between systems with the same endianness and
> bitness, e.g. Intel 64-bit ones?

Some parameters vary between compilers on the same platform. IIRC
whether a long on a 64-bit platform is 64-bit depends on the compiler
(windows platforms leave long as 32-bit). Alignment also differs
between compilers which will translate to differnces on disk.

Then you have things like time_t which depend on the C library you use.
size_t depends on the memory model, or perhaps even on the compile
flags. integer datetimes is a configure option.

There's nothing to stop you trying, but there's been no effort in
making it work.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.

Attachment

signature.asc

Re: building a binary-portable database

From

Tom Lane

Date:

02 August 2009, 13:45:11

Martijn van Oosterhout <kleptog@svana.org> writes:
> On Sun, Aug 02, 2009 at 12:02:41PM -0400, Alexy Khrabrov wrote:
>> How about portability between systems with the same endianness and
>> bitness, e.g. Intel 64-bit ones?

> There's nothing to stop you trying, but there's been no effort in
> making it work.

Exactly.  It might work, but we don't promise it or test for it.
If it breaks you get to keep both pieces.

            regards, tom lane

Problem trying to load trigger

From

Michael Gould

Date:

02 August 2009, 14:08:57

CREATE OR REPLACE FUNCTION iss.accessor_trigger()  RETURNS "trigger" AS $$
BEGIN

IF    (TG_WHEN = 'BEFORE') THEN
  IF    (TG_OP = 'INSERT') THEN
    NEW.createdatetime := current_timestamp
    NEW.createuser := current_user
  ELSIF (TG_OP = 'UPDATE') THEN
    INSERT into iss.auditaccessor SELECT 'B','C',OLD.*;
  ELSIF (TG_OP = 'DELETE') THEN
    INSERT into iss.auditaccessor SELECT 'B','D',OLD.*;
  END IF;
  RETURN NEW;
ELSIF (TG_WHEN = 'AFTER') THEN
  IF    (TG_OP = 'INSERT') THEN
  ELSIF (TG_OP = 'UPDATE') THEN
    INSERT into iss.auditaccessor SELECT 'A','C',NEW.*;
  ELSIF (TG_OP = 'DELETE') THEN
  END IF;
  RETURN OLD;
END IF;

END $$ LANGUAGE plpgsql VOLATILE;

I'm trying to use a single trigger to do a couple of things...

The first is when a record is created to timestamp the createdatetime and
the createuser columns with the current date/time or user.  If there is a
update then I want to make before and after images of the record and if a
delete I want to keep the before image of the record.

when I try and load this I get the following errors.


I'm new to postgres so I'm not sure where I'm off here.  Any help is greatly
appreciated

ERROR:  syntax error at or near "$1"
LINE 1: SELECT  current_timestamp  $1  := current_user ELSIF ( $2  =...
                                   ^
QUERY:  SELECT  current_timestamp  $1  := current_user ELSIF ( $2  =
'UPDATE') THEN INSERT into iss.auditaccessor SELECT 'B','C', $3 .*
CONTEXT:  SQL statement in PL/PgSQL function "accessor_trigger" near line 8

********** Error **********

ERROR: syntax error at or near "$1"
SQL state: 42601
Context: SQL statement in PL/PgSQL function "accessor_trigger" near line 8





--
Michael Gould, Managing Partner
Intermodal Software Solutions, LLC
904.226.0978
904.592.5250 fax

Re: Problem trying to load trigger

From

Alban Hertroys

Date:

02 August 2009, 14:42:23

On 2 Aug 2009, at 19:08, Michael Gould wrote:

> CREATE OR REPLACE FUNCTION iss.accessor_trigger()  RETURNS "trigger"
> AS $$
> BEGIN
>
> IF    (TG_WHEN = 'BEFORE') THEN
>  IF    (TG_OP = 'INSERT') THEN
>    NEW.createdatetime := current_timestamp
>    NEW.createuser := current_user
>  ELSIF (TG_OP = 'UPDATE') THEN
>    INSERT into iss.auditaccessor SELECT 'B','C',OLD.*;
>  ELSIF (TG_OP = 'DELETE') THEN
>    INSERT into iss.auditaccessor SELECT 'B','D',OLD.*;
>  END IF;
>  RETURN NEW;
> ELSIF (TG_WHEN = 'AFTER') THEN
>  IF    (TG_OP = 'INSERT') THEN
>  ELSIF (TG_OP = 'UPDATE') THEN
>    INSERT into iss.auditaccessor SELECT 'A','C',NEW.*;
>  ELSIF (TG_OP = 'DELETE') THEN
>  END IF;
>  RETURN OLD;
> END IF;
>
> END $$ LANGUAGE plpgsql VOLATILE;

> ERROR:  syntax error at or near "$1"
> LINE 1: SELECT  current_timestamp  $1  := current_user ELSIF ( $2
> =...
>                                   ^
> QUERY:  SELECT  current_timestamp  $1  := current_user ELSIF ( $2  =
> 'UPDATE') THEN INSERT into iss.auditaccessor SELECT 'B','C', $3 .*
> CONTEXT:  SQL statement in PL/PgSQL function "accessor_trigger" near
> line 8
>
> ********** Error **********
>
> ERROR: syntax error at or near "$1"
> SQL state: 42601
> Context: SQL statement in PL/PgSQL function "accessor_trigger" near
> line 8


You're missing a few semi-colons after the first two assignments. I'm
also not sure whether the empty THEN clauses at lines 14 and 17 will
be accepted, it's probably better to leave them out. And lastly, you
don't need braces around your conditional expressions or identifier
quotation around the RETURN-type.

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.


!DSPAM:737,4a75cff110131139260432!

Re: Problem trying to load trigger

From

Tom Lane

Date:

02 August 2009, 14:42:54

Michael Gould <mgould@intermodalsoftwaresolutions.net> writes:
> CREATE OR REPLACE FUNCTION iss.accessor_trigger()  RETURNS "trigger" AS $$
> BEGIN

> IF    (TG_WHEN = 'BEFORE') THEN
>   IF    (TG_OP = 'INSERT') THEN
>     NEW.createdatetime := current_timestamp
>     NEW.createuser := current_user


You've forgotten to end these statements with semicolons ...

            regards, tom lane

Re: Problem trying to load trigger

From

Michael Gould

Date:

02 August 2009, 17:03:30

Thanks to everyone who answered.  Getting used to PostGres's unique syntax
can take time getting used to.

Best Regards

Michael Gould

"Tom Lane" <tgl@sss.pgh.pa.us> wrote:
> Michael Gould <mgould@intermodalsoftwaresolutions.net> writes:
>> CREATE OR REPLACE FUNCTION iss.accessor_trigger()  RETURNS "trigger" AS
$$
>> BEGIN
>
>> IF    (TG_WHEN = 'BEFORE') THEN
>>   IF    (TG_OP = 'INSERT') THEN
>>     NEW.createdatetime := current_timestamp
>>     NEW.createuser := current_user
>
>
> You've forgotten to end these statements with semicolons ...
>
>                         regards, tom lane
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>