Thread: [Solved] SQL Server to PostgreSQL

[Solved] SQL Server to PostgreSQL

From
"Roderick A. Anderson"
Date:
Thanks to everyone on this list, and especially to Jeffrey Rhines, Chris
Knight, Chris Bitmead, and Sevo Stille.

The solution turned out to be very simple.  After catching a SCSI bus
speed mismatch that caused an NT Backup 'Restore' failure, I discovered
that the actual data was in .mdb files!  I copied the files to a system
running MS Access (Office 97) and was able to export them to a delimited
format, which went into PostgreSQL with very few problems.  Mostly there
were split lines that the \copy command didn't like; I corrected those
by hand.
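
For the record, the import was along these lines (table name, file name,
and delimiter here are made up):

    -- psql client-side copy of a delimited export into an existing table
    \copy customers from customers.txt using delimiters '|'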

I was able to get the table format by using MS Access.  Only question left
is what is the corresponding field type in PostgreSQL for a memo field in
SQL Server/Access (varchar(nnnn))?

Again thanks for all the help,
Rod
--
Roderick A. Anderson
raanders@altoplanos.net               Altoplanos Information Systems, Inc.
Voice: 208.765.6149                            212 S. 11th Street, Suite 5
FAX: 208.664.5299                                  Coeur d'Alene, ID 83814


Re: [Solved] SQL Server to PostgreSQL

From
Tressens Lionel
Date:
On 22.08.00 at 09:37, "Roderick A. Anderson" wrote to me:

)I was able to get the table format by using MS Access.  Only question left
)is what is the corresponding field type in PostgreSQL for a memo field in
)SQL Server/Access (varchar(nnnn))?

'text' type perhaps ?
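
Something like this, perhaps (a sketch with a hypothetical table; text
has no declared length limit):

    -- an Access/SQL Server memo column mapped to PostgreSQL text
    CREATE TABLE notes (
        id   int,
        body text    -- unbounded variable-length string
    );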

Lionel


I lied! [Solved] SQL Server to PostgreSQL

From
"Roderick A. Anderson"
Date:
I hate it when I do this: see an answer I want and run with it rather
than finding the real answer.

It turned out the data files (.mdb) _didn't_ belong to the database;
they were just a piece of it that was used for a report.

Back to the old grind wheel.


Rod
--
Roderick A. Anderson
raanders@altoplanos.net               Altoplanos Information Systems, Inc.
Voice: 208.765.6149                            212 S. 11th Street, Suite 5
FAX: 208.664.5299                                  Coeur d'Alene, ID 83814


Re: [Solved] SQL Server to PostgreSQL

From
Tom Lane
Date:
Tressens Lionel <tressens@etud.insa-tlse.fr> writes:
> On 22.08.00 at 09:37, "Roderick A. Anderson" wrote to me:
> )I was able to get the table format by using MS Access.  Only question left
> )is what is the corresponding field type in PostgreSQL for a memo field in
> )SQL Server/Access (varchar(nnnn))?

> 'text' type perhaps ?

Uh ... what's wrong with varchar(n) ?

            regards, tom lane

Re: [Solved] SQL Server to PostgreSQL

From
Vince Vielhaber
Date:
On Tue, 22 Aug 2000, Tom Lane wrote:

> Tressens Lionel <tressens@etud.insa-tlse.fr> writes:
> > On 22.08.00 at 09:37, "Roderick A. Anderson" wrote to me:
> > )I was able to get the table format by using MS Access.  Only question left
> > )is what is the corresponding field type in PostgreSQL for a memo field in
> > )SQL Server/Access (varchar(nnnn))?
>
> > 'text' type perhaps ?
>
> Uh ... what's wrong with varchar(n) ?

How big can our n be for varchar?  From his description, I'm thinking
SQL Server allows a large n.

Vince.
--
==========================================================================
Vince Vielhaber -- KA8CSH    email: vev@michvhf.com    http://www.pop4.net
 128K ISDN from $22.00/mo - 56K Dialup from $16.00/mo at Pop4 Networking
        Online Campground Directory    http://www.camping-usa.com
       Online Giftshop Superstore    http://www.cloudninegifts.com
==========================================================================




Re: [Solved] SQL Server to PostgreSQL

From
"Jeffrey A. Rhines"
Date:
I've wondered that myself, actually.  What are the benefits and
drawbacks to going with one over the other, besides the obvious 255-char
field length limit for varchar?  The reasons to stay away from "memo"
fields in other serious RDBMSs are typically more difficult maintenance,
significantly lower performance, and the need for special function calls
to get the data out.  Do any of those apply to PG?

Jeff

Tom Lane wrote:
>
> Tressens Lionel <tressens@etud.insa-tlse.fr> writes:
> > On 22.08.00 at 09:37, "Roderick A. Anderson" wrote to me:
> > )I was able to get the table format by using MS Access.  Only question left
> > )is what is the corresponding field type in PostgreSQL for a memo field in
> > )SQL Server/Access (varchar(nnnn))?
>
> > 'text' type perhaps ?
>
> Uh ... what's wrong with varchar(n) ?
>
>                         regards, tom lane

Re: [Solved] SQL Server to PostgreSQL

From
Tom Lane
Date:
"Jeffrey A. Rhines" <jrhines@email.com> writes:
>> Uh ... what's wrong with varchar(n) ?
>
> I've wondered that myself, actually.  What are the benefits and
> drawbacks to going with one over the other, besides the obvious 255-char
> field length limit for varchar?

AFAIK there has *never* been a 255-char limit on char or varchar in
pgsql ... you must be thinking of Some Other DBMS.

The limit for these datatypes in 7.0 and before is BLCKSZ less some
overhead --- ~8000 bytes in a default setup.  Beginning in 7.1 it's
an essentially arbitrary number.  I set it at 10Mb in current sources,
but there's no strong reason for that number over any other.  In theory
it could be up to 1Gb, but as Jan Wieck points out in a nearby thread,
you probably wouldn't like the performance of shoving gigabyte-sized
text values around.  We need to think about offering API functions that
will allow reading and writing huge field values in bite-sized chunks.

There's no essential performance difference between char(n), varchar(n),
and text in Postgres, given the same-sized data value.  char(n)
truncates or blank-pads to exactly n characters; varchar(n) truncates
if more than n characters; text never truncates nor pads.  Beyond that
they are completely identical in storage requirements.  Pick one based
on the semantics you want for your application.
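
To illustrate (a quick sketch, not actual session output):

    -- char(n) blank-pads, varchar(n) truncates, text does neither
    SELECT 'ab'::char(5);          -- 'ab   '   (padded to 5 characters)
    SELECT 'abcdefg'::varchar(5);  -- 'abcde'   (truncated to 5)
    SELECT 'abcdefg'::text;        -- 'abcdefg' (stored as-is)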

            regards, tom lane

RE: [Solved] SQL Server to PostgreSQL

From
Franck Martin
Date:
As we are talking about 7.1 and huge field size...

MS-SQL has a function that allows you to retrieve part of a field, a
kind of mid$(field, start, length). This would be a good addition indeed.

One last problem: PG does not allow you to store binary data. I'm not
talking about the current implementation of BLOBs, but about what will
happen in 7.1...

If I want to store an image in a field, I cannot do that in 7.1, because
the data sent by 7.1 and received in libpq must be formatted in ASCII. I
haven't played around to see whether I could create a user type called
varbinary(n), which would output just the contents of a buffer via
varbinary_out... Maybe varchar does it already (even if there is a \0?).

I know I should submit this problem to the hacker list, but I don't want to
subscribe to hacker just to submit one message...

BTW, is there an alpha/beta release of PG 7.1?

Franck Martin
Database Development Officer
SOPAC South Pacific Applied Geoscience Commission
Fiji
E-mail: franck@sopac.org <mailto:franck@sopac.org>
Web site: http://www.sopac.org/ <http://www.sopac.org/>

This e-mail is intended for its recipients only. Do not forward this
e-mail without approval. The views expressed in this e-mail may not
necessarily be the views of SOPAC.




Re: [Solved] SQL Server to PostgreSQL

From
Craig Johannsen
Date:
I think the ODBC spec limits varchar to 255 bytes.
Some ODBC drivers enforce that limit.

Tom Lane wrote:

> "Jeffrey A. Rhines" <jrhines@email.com> writes:
> >> Uh ... what's wrong with varchar(n) ?
> >
> > I've wondered that myself, actually.  What are the benefits and
> > drawbacks to going with one over the other, besides the obvious 255-char
> > field length limit for varchar?
>
> AFAIK there has *never* been a 255-char limit on char or varchar in
> pgsql ... you must be thinking of Some Other DBMS.
>
> [snip]
>                         regards, tom lane


Re: [Solved] SQL Server to PostgreSQL

From
Martijn van Oosterhout
Date:
Franck Martin wrote:
>
> As we are talking about 7.1 and huge field size...
>
> MS-SQL has a function that allows you to retrieve part of a field, a
> kind of mid$(field, start, length). This would be a good addition indeed.

Does substr() not handle this?
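
For a hypothetical table and column, the equivalent of mid$(field, 10, 20)
would be something like:

    -- 20 characters starting at position 10
    SELECT substr(field, 10, 20) FROM mytable;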

> One last problem: PG does not allow you to store binary data. I'm not
> talking about the current implementation of BLOBs, but about what will
> happen in 7.1...
>
> If I want to store an image in a field, I cannot do that in 7.1, because
> the data sent by 7.1 and received in libpq must be formatted in ASCII. I
> haven't played around to see whether I could create a user type called
> varbinary(n), which would output just the contents of a buffer via
> varbinary_out... Maybe varchar does it already (even if there is a \0?).

One thing I've thought about is creating an escape character to delimit
this sort of thing: maybe a control-A, followed by four bytes giving the
length, followed by that many bytes of data.  Escape \0's with ^A0 and a
real ^A with ^A^A.

I would love something like this, because I always get tripped up by
fields I'm inserting that contain quotes and other characters that need
to be escaped.
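
(The stock workaround is doubling the quote character; a sketch with a
hypothetical table:

    -- a literal single quote inside a string literal is written twice
    INSERT INTO mytable (field) VALUES ('it''s escaped');

but an unescaped binary format would still be nicer.)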

> I know I should submit this problem to the hacker list, but I don't want to
> subscribe to hacker just to submit one message...

Heh, I know what you mean...

--
Martijn van Oosterhout <kleptog@cupid.suninternet.com>
http://cupid.suninternet.com/~kleptog/

Re: [Solved] SQL Server to PostgreSQL

From
Radoslaw Stachowiak
Date:
*** Tom Lane <tgl@sss.pgh.pa.us> [Tuesday, 22.August.2000, 23:11 -0400]:
> There's no essential performance difference between char(n), varchar(n),
> and text in Postgres, given the same-sized data value.  char(n)
> truncates or blank-pads to exactly n characters; varchar(n) truncates
> if more than n characters; text never truncates nor pads.  Beyond that
> they are completely identical in storage requirements.
[.rs.]

Does varchar(188) take 188 bytes (+ bytes for length storage) every
time, no matter whether it contains 'my text' or 'my long 188 char
text.....'?


--
radoslaw.stachowiak.........................................http://alter.pl/

Re: [Solved] SQL Server to PostgreSQL

From
Martin Christensen
Date:
>>>>> "Radoslaw" == Radoslaw Stachowiak <radek@alter.pl> writes:
Radoslaw> Does varchar(188) take 188 bytes (+ bytes for length
Radoslaw> storage) every time, no matter whether it contains 'my text'
Radoslaw> or 'my long 188 char text.....'?

The way I understand it, varchar(n) is variable-length, while char(n)
is fixed-length. Thus the behaviour you describe above is that of
char(n).
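
A quick way to check (hypothetical queries; octet_length reports the
logical size of the value, assuming a single-byte encoding):

    SELECT octet_length('my text'::varchar(188));  -- 7: just the text
    SELECT octet_length('my text'::char(188));     -- 188: padded to n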

Martin

--
GPG public key: http://home1.stofanet.dk/factotum/gpgkey.txt

Re: [Solved] SQL Server to PostgreSQL

From
Jan Wieck
Date:
Martin Christensen wrote:
> >>>>> "Radoslaw" == Radoslaw Stachowiak <radek@alter.pl> writes:
> Radoslaw> Does varchar(188) take 188 bytes (+ bytes for length
> Radoslaw> storage) every time, no matter whether it contains 'my text'
> Radoslaw> or 'my long 188 char text.....'?
>
> The way I understand it, varchar(n) is variable-length, while char(n)
> is fixed-length. Thus the behaviour you describe above is that of
> char(n).

    Right for any pre-7.1 version.

    From 7.1 on the system will try to compress all types
    internally stored as variable length (char(), varchar(), text
    and some more). So the real number of bytes for a char(188)
    will be "at maximum 192 - probably less".


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



Re: [Solved] SQL Server to PostgreSQL

From
Ian Turner
Date:
>     From 7.1 on the system will try to compress all types
>     internally stored as variable length (char(), varchar(), text
>     and some more). So the real number of bytes for a char(188)
>     will be "at maximum 192 - probably less".

Don't variable-length records incur a performance overhead? In that
case, ought I to be able to specify the length for a record if I know
ahead of time it will be the same in every case? :o

Ian Turner


Re: [Solved] SQL Server to PostgreSQL

From
Tom Lane
Date:
Ian Turner <vectro@pipeline.com> writes:
> Don't variable-length records incur a performance overhead?

Only to the extent that the system can't cache offset information for
later columns in that table.  While someone evidently once thought that
was worthwhile, I've never seen the column-access code show up as a
particularly hot spot in any profile I've run.  I doubt you could
actually measure any difference, let alone show it to be important
enough to be worth worrying about.

In any case, char(n) will still do what you want for reasonable-size
records.  The TOAST code only kicks in when the total tuple size exceeds
BLCKSZ/4 ... and at that point, compression is a good idea in any case.

Now that you mention it, though, doesn't TOAST break heapam's assumption
that char(n) is fixed length?  Seems like we'd better either remove that
assumption or mark char(n) nontoastable.  Any opinions which is better?

            regards, tom lane

Re: [Solved] SQL Server to PostgreSQL

From
Jan Wieck
Date:
Tom Lane wrote:
> Now that you mention it, though, doesn't TOAST break heapam's assumption
> that char(n) is fixed length?  Seems like we'd better either remove that
> assumption or mark char(n) nontoastable.  Any opinions which is better?

    Is the saved overhead from assuming char(n) is fixed really
    so big that it's worth NOT gaining the TOAST advantages?
    After the GB benchmarks we know that we have some spare
    performance to waste for such things :-)


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



Re: [Solved] SQL Server to PostgreSQL

From
Bruce Momjian
Date:
> Ian Turner <vectro@pipeline.com> writes:
> > Don't variable-length records incur a performance overhead?
>
> Only to the extent that the system can't cache offset information for
> later columns in that table.  While someone evidently once thought that
> was worthwhile, I've never seen the column-access code show up as a
> particularly hot spot in any profile I've run.  I doubt you could
> actually measure any difference, let alone show it to be important
> enough to be worth worrying about.

It clearly is a hot spot.  That monster macro, fastgetattr(), in
heapam.h is in there for a reason.  It accounts for about 5% in the
straight sequential-scan case, last I heard from someone who ran a test.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: [Solved] SQL Server to PostgreSQL

From
Bruce Momjian
Date:
> Ian Turner <vectro@pipeline.com> writes:
> > Don't variable-length records incur a performance overhead?
>
> Only to the extent that the system can't cache offset information for
> later columns in that table.  While someone evidently once thought that
> was worthwhile, I've never seen the column-access code show up as a
> particularly hot spot in any profile I've run.  I doubt you could
> actually measure any difference, let alone show it to be important
> enough to be worth worrying about.
>
> In any case, char(n) will still do what you want for reasonable-size
> records.  The TOAST code only kicks in when the total tuple size exceeds
> BLCKSZ/4 ... and at that point, compression is a good idea in any case.

My logic is that I use char() when I want the length to be fixed, like
2-letter state codes, and varchar() where I just want a maximum allowed,
like last names.  I use text for arbitrary-length stuff.  Tom is right
that, though there is a small performance difference, it is better just
to use the right type.
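
Roughly (made-up columns):

    CREATE TABLE people (
        state     char(2),      -- always exactly two characters
        last_name varchar(40),  -- variable, up to a maximum
        bio       text          -- arbitrary length
    );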

>
> Now that you mention it, though, doesn't TOAST break heapam's assumption
> that char(n) is fixed length?  Seems like we'd better either remove that
> assumption or mark char(n) nontoastable.  Any opinions which is better?

I am sure Jan handled that.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: [Solved] SQL Server to PostgreSQL

From
Bruce Momjian
Date:
> Tom Lane wrote:
> > Now that you mention it, though, doesn't TOAST break heapam's assumption
> > that char(n) is fixed length?  Seems like we'd better either remove that
> > assumption or mark char(n) nontoastable.  Any opinions which is better?
>
>     Is the saved overhead from assuming char(n) is fixed really
>     so big that it's worth NOT gaining the TOAST advantages?
>     After the GB benchmarks we know that we have some spare
>     performance to waste for such things :-)

Oh, now I get it.  Some TOAST values may be out-of-line.  Can we really
throw char() into TOAST?  I guess we can.  We have to record somewhere
that we have toasted that tuple and disable the offset cache for it.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: [Solved] SQL Server to PostgreSQL

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> Tom Lane wrote:
>>>> Now that you mention it, though, doesn't TOAST break heapam's assumption
>>>> that char(n) is fixed length?  Seems like we'd better either remove that
>>>> assumption or mark char(n) nontoastable.  Any opinions which is better?

> I don't see any more communication on this in my mail archives.

Nothing's been done yet, but we *must* fix this before 7.1.

I'm wondering whether it's worthwhile keeping the VARLENA_FIXED_SIZE
macro at all.  Is there any potential use for it?

            regards, tom lane