Thread: building a binary-portable database
I've filled a database on Linux and tried to bring it across to Mac OSX, and got an error: FATAL: incorrect checksum in control file.
Exploring with pg_controldata shows differences such as:
Linux:
pg_control version number: 833
Catalog version number: 200711281
Database system identifier: 5338438316294798685
Database cluster state: in production
pg_control last modified: Fri 24 Jul 2009 06:37:28 PM EDT
Latest checkpoint location: 4B/5E4B2EC8
Prior checkpoint location: 4B/5E011FD8
Latest checkpoint's REDO location: 4B/5E4B2EC8
Latest checkpoint's TimeLineID: 1
Latest checkpoint's NextXID: 0/47680877
Latest checkpoint's NextOID: 24576
Latest checkpoint's NextMultiXactId: 1
Latest checkpoint's NextMultiOffset: 0
Time of latest checkpoint: Fri 24 Jul 2009 06:34:58 PM EDT
Minimum recovery ending location: 0/0
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Date/time type storage: floating-point numbers
Maximum length of locale name: 128
LC_COLLATE: en_US.UTF-8
LC_CTYPE: en_US.UTF-8
Mac:
WARNING: Calculated CRC checksum does not match value stored in file.
Either the file is corrupt, or it has a different layout than this program
is expecting. The results below are untrustworthy.
pg_control version number: 833
Catalog version number: 200711281
Database system identifier: 5338438316294798685
Database cluster state: in production
pg_control last modified: Wed Dec 31 19:00:00 1969
Latest checkpoint location: 4A6A35E8/0
Prior checkpoint location: 4B/5E011FD8
Latest checkpoint's REDO location: 4B/5B2B3F28
Latest checkpoint's TimeLineID: 75
Latest checkpoint's NextXID: 1560372384/1
Latest checkpoint's NextOID: 0
Latest checkpoint's NextMultiXactId: 47680877
Latest checkpoint's NextMultiOffset: 24576
Time of latest checkpoint: Wed Dec 31 19:00:01 1969
Minimum recovery ending location: 0/4A6A35E6
Maximum data alignment: 0
Database block size: 8
Blocks per segment of large relation: 0
WAL block size: 0
Bytes per WAL segment: 1093850759
Maximum length of identifiers: 8192
Maximum columns in an index: 131072
Maximum size of a TOAST chunk: 8192
Date/time type storage: 64-bit integers
Maximum length of locale name: 64
LC_COLLATE:
LC_CTYPE:
-- apparently, Linux uses floating-point timestamps, while Mac uses 64-bit integers.
Is there a set of options for the build which will ensure I'll be able to rsync the databases between 64-bit Linux and 64-bit Mac, both Intel ones?
Cheers,
Alexy
Alexy Khrabrov wrote: > I've filled a database on Linux and tried to bring it across to Mac > OSX, and got an error: FATAL: incorrect checksum in control file. use pg_dumpall | psql to export/import databases between machines.
On Jul 31, 2009, at 6:30 PM, John R Pierce wrote: > Alexy Khrabrov wrote: >> I've filled a database on Linux and tried to bring it across to Mac >> OSX, and got an error: FATAL: incorrect checksum in control file. > > use pg_dumpall | psql to export/import databases between machines. That's doable of course, but I wonder what would it take to get it to work as-is, when building pg from source on each box, giving their fairly similar characteristics. Or, if time_t is different, would it be a show-stopper? Cheers, Alexy
Alexy Khrabrov <deliverable@gmail.com> writes: > That's doable of course, but I wonder what would it take to get it to > work as-is, when building pg from source on each box, giving their > fairly similar characteristics. It's not promised to work, and if it breaks you get to keep both pieces. > Or, if time_t is different, would it be a show-stopper? The pg_controldata output suggests that indeed time_t size difference is the immediate issue. But there might be other ones lurking behind that. (FWIW, 8.4 has removed this particular potential platform difference: http://archives.postgresql.org/pgsql-committers/2008-02/msg00184.php But I still wouldn't trust copying DB files between significantly different OSes.) regards, tom lane
On 1 Aug 2009, at 24:53, Alexy Khrabrov wrote: > That's doable of course, but I wonder what would it take to get it > to work as-is, when building pg from source on each box, giving > their fairly similar characteristics. Or, if time_t is different, > would it be a show-stopper? Fairly similar? The one is a Linux kernel with a Linux environment, the other is a Mach kernel with a mostly BSD environment. I think you can expect all kinds of fun with line-ending styles (Mac traditionally used \r, Linux/Unix uses \n), locale handling differences, different time zone handling, different page sizes, different floating point sizes, etc. You're in for an interesting experiment, if you manage to get the DB running at all. If you do I'd thoroughly check for any data- misinterpretation. I suggest doing a diff on text-dumps from both databases and see if there are any differences. I'd still not trust the data in it afterwards, but whether that matters depends on what you intend to use it for. If you want safe and sound, use pg_dump/restore. Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll see there is no forest. !DSPAM:737,4a7433f410135509746239!
On Aug 1, 2009, at 8:24 AM, Alban Hertroys wrote: > On 1 Aug 2009, at 24:53, Alexy Khrabrov wrote: >> That's doable of course, but I wonder what would it take to get it >> to work as-is, when building pg from source on each box, giving >> their fairly similar characteristics. Or, if time_t is different, >> would it be a show-stopper? > > > Fairly similar? The one is a Linux kernel with a Linux environment, > the other is a Mach kernel with a mostly BSD environment. I think > you can expect all kinds of fun with line-ending styles (Mac > traditionally used \r, Linux/Unix uses \n), locale handling > differences, different time zone handling, different page sizes, > different floating point sizes, etc. > > You're in for an interesting experiment, if you manage to get the DB > running at all. If you do I'd thoroughly check for any data- > misinterpretation. I suggest doing a diff on text-dumps from both > databases and see if there are any differences. I'd still not trust > the data in it afterwards, but whether that matters depends on what > you intend to use it for. > > If you want safe and sound, use pg_dump/restore. Well, my question, of course, is, how come all those differences might affect PG binary data so much -- portable design would try to minimize such effects, wouldn't it? Does it optimize for all of the above intentionally, is it a side-effect of its design, and/or is there a set of options for the build time which might minimize binary incompatibility? I'd like to understand exactly why and how we get binary incompatibility, and what exactly do we get for not having it, and can it be a choice. There's a lot of databases out there. e.g. Berkeley DB, where the backup is mv or ftp. Performance is allright, too. I wish I could configure some of my PG ones that way... Cheers, Alexy
On Sun, Aug 02, 2009 at 01:42:13AM -0400, Alexy Khrabrov wrote: > Well, my question, of course, is, how come all those differences might > affect PG binary data so much -- portable design would try to minimize > such effects, wouldn't it? Does it optimize for all of the above > intentionally, is it a side-effect of its design, and/or is there a set > of options for the build time which might minimize binary > incompatibility? I'd like to understand exactly why and how we get > binary incompatibility, and what exactly do we get for not having it, > and can it be a choice. There's a lot of databases out there. e.g. > Berkeley DB, where the backup is mv or ftp. Performance is allright, > too. I wish I could configure some of my PG ones that way... As long as you're only dealing with strings it's not a problem, but when you start talking about integers or floating point there is no "standard format". While it would theoretically be possible to make a binary compatable version, the cost would be an extra conversion layer for each and every column access in every table. This is before you have even taken into acocunt the fact that different CPUs have different alignment requirements for different types, so to be portable you would have to take the worst case, which is just wasting space on architectures which don't need it. Finally you have effects like on 64-bit architectures you can pass a 64-bit value in a register, whereas on 32-bit architectures you may need to allocate memory and pass a pointer. Binary compatability takes work and costs performance and we prefer to focus on other things. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Please line up in a tree and maintain the heap invariant while > boarding. Thank you for flying nlogn airlines.
Attachment
How about portability between systems with the same endianness and bitness, e.g. Intel 64-bit ones? Cheers, Alexy
On Sun, Aug 02, 2009 at 12:02:41PM -0400, Alexy Khrabrov wrote: > How about portability between systems with the same endianness and > bitness, e.g. Intel 64-bit ones? Some parameters vary between compilers on the same platform. IIRC whether a long on a 64-bit platform is 64-bit depends on the compiler (windows platforms leave long as 32-bit). Alignment also differs between compilers which will translate to differnces on disk. Then you have things like time_t which depend on the C library you use. size_t depends on the memory model, or perhaps even on the compile flags. integer datetimes is a configure option. There's nothing to stop you trying, but there's been no effort in making it work. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Please line up in a tree and maintain the heap invariant while > boarding. Thank you for flying nlogn airlines.
Attachment
Martijn van Oosterhout <kleptog@svana.org> writes: > On Sun, Aug 02, 2009 at 12:02:41PM -0400, Alexy Khrabrov wrote: >> How about portability between systems with the same endianness and >> bitness, e.g. Intel 64-bit ones? > There's nothing to stop you trying, but there's been no effort in > making it work. Exactly. It might work, but we don't promise it or test for it. If it breaks you get to keep both pieces. regards, tom lane
CREATE OR REPLACE FUNCTION iss.accessor_trigger() RETURNS "trigger" AS $$ BEGIN IF (TG_WHEN = 'BEFORE') THEN IF (TG_OP = 'INSERT') THEN NEW.createdatetime := current_timestamp NEW.createuser := current_user ELSIF (TG_OP = 'UPDATE') THEN INSERT into iss.auditaccessor SELECT 'B','C',OLD.*; ELSIF (TG_OP = 'DELETE') THEN INSERT into iss.auditaccessor SELECT 'B','D',OLD.*; END IF; RETURN NEW; ELSIF (TG_WHEN = 'AFTER') THEN IF (TG_OP = 'INSERT') THEN ELSIF (TG_OP = 'UPDATE') THEN INSERT into iss.auditaccessor SELECT 'A','C',NEW.*; ELSIF (TG_OP = 'DELETE') THEN END IF; RETURN OLD; END IF; END $$ LANGUAGE plpgsql VOLATILE; I'm trying to use a single trigger to do a couple of things... The first is when a record is created to timestamp the createdatetime and the createuser columns with the current date/time or user. If there is a update then I want to make before and after images of the record and if a delete I want to keep the before image of the record. when I try and load this I get the following errors. I'm new to postgres so I'm not sure where I'm off here. Any help is greatly appreciated ERROR: syntax error at or near "$1" LINE 1: SELECT current_timestamp $1 := current_user ELSIF ( $2 =... ^ QUERY: SELECT current_timestamp $1 := current_user ELSIF ( $2 = 'UPDATE') THEN INSERT into iss.auditaccessor SELECT 'B','C', $3 .* CONTEXT: SQL statement in PL/PgSQL function "accessor_trigger" near line 8 ********** Error ********** ERROR: syntax error at or near "$1" SQL state: 42601 Context: SQL statement in PL/PgSQL function "accessor_trigger" near line 8 -- Michael Gould, Managing Partner Intermodal Software Solutions, LLC 904.226.0978 904.592.5250 fax
On 2 Aug 2009, at 19:08, Michael Gould wrote: > CREATE OR REPLACE FUNCTION iss.accessor_trigger() RETURNS "trigger" > AS $$ > BEGIN > > IF (TG_WHEN = 'BEFORE') THEN > IF (TG_OP = 'INSERT') THEN > NEW.createdatetime := current_timestamp > NEW.createuser := current_user > ELSIF (TG_OP = 'UPDATE') THEN > INSERT into iss.auditaccessor SELECT 'B','C',OLD.*; > ELSIF (TG_OP = 'DELETE') THEN > INSERT into iss.auditaccessor SELECT 'B','D',OLD.*; > END IF; > RETURN NEW; > ELSIF (TG_WHEN = 'AFTER') THEN > IF (TG_OP = 'INSERT') THEN > ELSIF (TG_OP = 'UPDATE') THEN > INSERT into iss.auditaccessor SELECT 'A','C',NEW.*; > ELSIF (TG_OP = 'DELETE') THEN > END IF; > RETURN OLD; > END IF; > > END $$ LANGUAGE plpgsql VOLATILE; > ERROR: syntax error at or near "$1" > LINE 1: SELECT current_timestamp $1 := current_user ELSIF ( $2 > =... > ^ > QUERY: SELECT current_timestamp $1 := current_user ELSIF ( $2 = > 'UPDATE') THEN INSERT into iss.auditaccessor SELECT 'B','C', $3 .* > CONTEXT: SQL statement in PL/PgSQL function "accessor_trigger" near > line 8 > > ********** Error ********** > > ERROR: syntax error at or near "$1" > SQL state: 42601 > Context: SQL statement in PL/PgSQL function "accessor_trigger" near > line 8 You're missing a few semi-colons after the first two assignments. I'm also not sure whether the empty THEN clauses at lines 14 and 17 will be accepted, it's probably better to leave them out. And lastly, you don't need braces around your conditional expressions or identifier quotation around the RETURN-type. Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll see there is no forest. !DSPAM:737,4a75cff110131139260432!
Michael Gould <mgould@intermodalsoftwaresolutions.net> writes: > CREATE OR REPLACE FUNCTION iss.accessor_trigger() RETURNS "trigger" AS $$ > BEGIN > IF (TG_WHEN = 'BEFORE') THEN > IF (TG_OP = 'INSERT') THEN > NEW.createdatetime := current_timestamp > NEW.createuser := current_user You've forgotten to end these statements with semicolons ... regards, tom lane
Thanks to everyone who answered. Getting used to PostGres's unique syntax can take time getting used to. Best Regards Michael Gould "Tom Lane" <tgl@sss.pgh.pa.us> wrote: > Michael Gould <mgould@intermodalsoftwaresolutions.net> writes: >> CREATE OR REPLACE FUNCTION iss.accessor_trigger() RETURNS "trigger" AS $$ >> BEGIN > >> IF (TG_WHEN = 'BEFORE') THEN >> IF (TG_OP = 'INSERT') THEN >> NEW.createdatetime := current_timestamp >> NEW.createuser := current_user > > > You've forgotten to end these statements with semicolons ... > > regards, tom lane > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >