Thread: Syntax error in a large COPY

Syntax error in a large COPY

From

Reg Me Please

Date:

06 November 2007, 14:06:05

Hi all.
I'm generating an SQL script to load some million rows into a table.
I'm trying to use the COPY command in order to speed the load up.

At a certain point I get an error telling about a
"invalid input syntax for type numeric"

The incriminated line number is the last one (the one containing the \.).

Is there a way to know which line is really malformed?

Thanks.

--
Reg me Please

Re: Syntax error in a large COPY

From

Tom Lane

Date:

06 November 2007, 14:14:07

Reg Me Please <regmeplease@gmail.com> writes:
> At a certain point I get an error telling about a
> "invalid input syntax for type numeric"
> The incriminated line number is the last one (the one containing the \.).
> Is there a way to know which line is really malformed?

Why do you think the report is inaccurate?

I can reproduce this by putting a few spaces in front of \., for
instance.

regression=# create table foo(f1 numeric);
CREATE TABLE
regression=# copy foo from stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> 123
>> \.
regression=# copy foo from stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>>   \.
>> \.
ERROR:  invalid input syntax for type numeric: "  "
CONTEXT:  COPY foo, line 1, column f1: "  "
regression=#

As the psql prompt mentions, \. has to be alone on a line.

            regards, tom lane

Re: Syntax error in a large COPY

From

Reg Me Please

Date:

06 November 2007, 14:23:59

That seems not to be the case.
The last line has a \. by its own and the last but one is
well formed.

Il Tuesday 06 November 2007 19:14:00 Tom Lane ha scritto:
> Reg Me Please <regmeplease@gmail.com> writes:
> > At a certain point I get an error telling about a
> > "invalid input syntax for type numeric"
> > The incriminated line number is the last one (the one containing the \.).
> > Is there a way to know which line is really malformed?
>
> Why do you think the report is inaccurate?
>
> I can reproduce this by putting a few spaces in front of \., for
> instance.
>
> regression=# create table foo(f1 numeric);
> CREATE TABLE
> regression=# copy foo from stdin;
> Enter data to be copied followed by a newline.
> End with a backslash and a period on a line by itself.
>
> >> 123
> >> \.
>
> regression=# copy foo from stdin;
> Enter data to be copied followed by a newline.
> End with a backslash and a period on a line by itself.
>
> >>   \.
> >> \.
>
> ERROR:  invalid input syntax for type numeric: "  "
> CONTEXT:  COPY foo, line 1, column f1: "  "
> regression=#
>
> As the psql prompt mentions, \. has to be alone on a line.
>
>             regards, tom lane



--
Reg me Please

Re: Syntax error in a large COPY

From

"Scott Marlowe"

Date:

06 November 2007, 14:43:49

On 11/6/07, Reg Me Please <regmeplease@gmail.com> wrote:
> That seems not to be the case.
> The last line has a \. by its own and the last but one is
> well formed.

(Please don't top post...)

Got a self contained test case you can post?

Re: Syntax error in a large COPY

From

Collin Kidder

Date:

06 November 2007, 15:17:17

This is offtopic but there is nothing wrong with top posting. Is there a
mail list policy on it or are you just picky about it?

Scott Marlowe wrote:
> On 11/6/07, Reg Me Please <regmeplease@gmail.com> wrote:
>
>> That seems not to be the case.
>> The last line has a \. by its own and the last but one is
>> well formed.
>>
>
> (Please don't top post...)
>
> Got a self contained test case you can post?
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that your
>        message can get through to the mailing list cleanly
>

Re: Syntax error in a large COPY

From

"Scott Marlowe"

Date:

06 November 2007, 15:22:35

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

On Nov 6, 2007 1:16 PM, Collin Kidder <adderd@kkmfg.com> wrote:
> This is offtopic but there is nothing wrong with top posting. Is there a
> mail list policy on it or are you just picky about it?
>
>
> Scott Marlowe wrote:
> > On 11/6/07, Reg Me Please <regmeplease@gmail.com> wrote:
> >
> >> That seems not to be the case.
> >> The last line has a \. by its own and the last but one is
> >> well formed.
> >>
> >
> > (Please don't top post...)
> >
> > Got a self contained test case you can post?
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 1: if posting/reading through Usenet, please send an appropriate
> >        subscribe-nomail command to majordomo@postgresql.org so that your
> >        message can get through to the mailing list cleanly
> >
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
>

Re: Syntax error in a large COPY

From

"Scott Marlowe"

Date:

06 November 2007, 15:39:10

On Nov 6, 2007 1:16 PM, Collin Kidder <adderd@kkmfg.com> wrote:
> This is offtopic but there is nothing wrong with top posting. Is there a
> mail list policy on it or are you just picky about it?

OK, I was being a bit flippant with my last response.  The reasoning
on technical mailing lists for preferring inline or bottom posting is
several fold.  One is the reason I posted, that the answers being out
of order tend to make it difficult to follow the flow of the
conversation, especially if the conversation has, until then, been
inline / bottom posted.

But the other reason is that top posters are often the biggest
offenders of not editing their replies.  there's nothing like seeing a
one line reply to a 500 line conversation.

When inline posting, one can delete the parts of the converstation
that they aren't replying to so that it is obvious what parts they are
replying to and keep their post concise.

Basically, in a technical mailing list, one is trying to converse in
the most efficient and concise manner.  Top posting does not help to
accomplish this goal, because a top post has no context within the
conversation.

OTOH, sometimes, a top post is appropriate.  I.e. someone posts a long
article explaining how some arcane part of the database works, and a
person replies with a top post of "Thanks for the great article, it
really explained a lot" and then deletes MOST of the article since it
isn't needed for the reply.

So it's not really a hard and fast rule, it's more a guideline on a
technical list.

RFC 1855 is a bit old but still makes some good points

http://tools.ietf.org/html/rfc1855

Re: Syntax error in a large COPY

From

Tom Lane

Date:

06 November 2007, 16:21:37

"Scott Marlowe" <scott.marlowe@gmail.com> writes:
> But the other reason is that top posters are often the biggest
> offenders of not editing their replies.  there's nothing like seeing a
> one line reply to a 500 line conversation.
> When inline posting, one can delete the parts of the converstation
> that they aren't replying to so that it is obvious what parts they are
> replying to and keep their post concise.

The real point here is that when writing to a mailing list, you should
make an effort to conserve other peoples' time.  Hundreds of people are
going to read what you wrote, possibly thousands when you count in
people searching the list archives in the future.  Saving them a few
seconds apiece adds up fast.  Not bothering to expend a minute of your
own time to make it a bit easier on the reader marks you as a boor.

In the context of this particular issue, what that means is that it's
polite to quote only enough of the previous message to make it clear
what you're replying to --- you should assume that most readers already
saw the earlier message and don't need to read it all again, but maybe
could use a few sentences of context to remember what the thread was
about (especially since it might be hours or days since they read the
prior message).  And you should quote the part of the message that bears
on what you've got to say, if there's an identifiable part that you're
responding to.  And that quoting should come before your reply, because
otherwise your comment is out of context for the reader; again, they'd
have to burrow through the quoted message to remember what the
discussion was, and then come back and re-read your comment.

Another reason for being selective when you quote is to be polite to
people searching the list archives in the future --- they don't want to
be buried in hits from repeated quotes of the message they're actually
looking for.

I don't say that you *can't* top post and still be polite, but nearly
every example I've ever seen came off as rude to the reader because
it didn't take any of these things into account, and it very obviously
was something that the sender didn't spend any time on.  If people
decide you're a jerk they're likely to ignore your posts.

            regards, tom lane

Re: Syntax error in a large COPY

From

Reg Me Please

Date:

06 November 2007, 17:08:40

Il Tuesday 06 November 2007 19:43:38 Scott Marlowe ha scritto:
> On 11/6/07, Reg Me Please <regmeplease@gmail.com> wrote:
> > That seems not to be the case.
> > The last line has a \. by its own and the last but one is
> > well formed.
>
> (Please don't top post...)
>
> Got a self contained test case you can post?

Back to the original topic ...

I'm trying to understand what and where.
The point is that I have this 29M+ lines script telling me there's a problem
somewhere.

A self contained test, at the moment, would be that long!

I'm considering a "branch and bound" approach ... but it'd be quite long
and tedious as the program generating the script has not been written
to do such things.

I'm pretty sure that the interpreter knows exactly where the problem
is. It's just missing to tell it.
Unless there's some friendly advise.

--
Reg me Please

Re: Syntax error in a large COPY

From

"Scott Marlowe"

Date:

06 November 2007, 17:14:07

On 11/6/07, Reg Me Please <regmeplease@gmail.com> wrote:
> Il Tuesday 06 November 2007 19:43:38 Scott Marlowe ha scritto:
> > On 11/6/07, Reg Me Please <regmeplease@gmail.com> wrote:
> > > That seems not to be the case.
> > > The last line has a \. by its own and the last but one is
> > > well formed.
> >
> > (Please don't top post...)
> >
> > Got a self contained test case you can post?
>
> Back to the original topic ...
>
> I'm trying to understand what and where.
> The point is that I have this 29M+ lines script telling me there's a problem
> somewhere.
>
> A self contained test, at the moment, would be that long!
>
> I'm considering a "branch and bound" approach ... but it'd be quite long
> and tedious as the program generating the script has not been written
> to do such things.

Split it in half, with the appropriate sql on each end so the data
still works, and see which half causes a problem.  Keep splitting the
one that causes a problem in half until you have a small one with the
problem still.

I'm guessing the problem will become obvious then.

Re: Syntax error in a large COPY

From

Reg Me Please

Date:

06 November 2007, 17:19:18

Il Tuesday 06 November 2007 22:13:15 hai scritto:
> On 11/6/07, Reg Me Please <regmeplease@gmail.com> wrote:
> > Il Tuesday 06 November 2007 19:43:38 Scott Marlowe ha scritto:
> > > On 11/6/07, Reg Me Please <regmeplease@gmail.com> wrote:
> > > > That seems not to be the case.
> > > > The last line has a \. by its own and the last but one is
> > > > well formed.
> > >
> > > (Please don't top post...)
> > >
> > > Got a self contained test case you can post?
> >
> > Back to the original topic ...
> >
> > I'm trying to understand what and where.
> > The point is that I have this 29M+ lines script telling me there's a
> > problem somewhere.
> >
> > A self contained test, at the moment, would be that long!
> >
> > I'm considering a "branch and bound" approach ... but it'd be quite long
> > and tedious as the program generating the script has not been written
> > to do such things.
>
> Split it in half, with the appropriate sql on each end so the data
> still works, and see which half causes a problem.  Keep splitting the
> one that causes a problem in half until you have a small one with the
> problem still.
>
> I'm guessing the problem will become obvious then.

That's the "branch and bound". Editing 29M+ lines file takes some time.
But this is the way I'm going to go right now.

--
Reg me Please

Re: Syntax error in a large COPY

From

"Scott Marlowe"

Date:

06 November 2007, 17:37:19

On 11/6/07, Reg Me Please <regmeplease@gmail.com> wrote:
> Il Tuesday 06 November 2007 22:13:15 hai scritto:
> > On 11/6/07, Reg Me Please <regmeplease@gmail.com> wrote:
> > > Il Tuesday 06 November 2007 19:43:38 Scott Marlowe ha scritto:
> > > > On 11/6/07, Reg Me Please <regmeplease@gmail.com> wrote:
> > > > > That seems not to be the case.
> > > > > The last line has a \. by its own and the last but one is
> > > > > well formed.
> > > >
> > > > (Please don't top post...)
> > > >
> > > > Got a self contained test case you can post?
> > >
> > > Back to the original topic ...
> > >
> > > I'm trying to understand what and where.
> > > The point is that I have this 29M+ lines script telling me there's a
> > > problem somewhere.
> > >
> > > A self contained test, at the moment, would be that long!
> > >
> > > I'm considering a "branch and bound" approach ... but it'd be quite long
> > > and tedious as the program generating the script has not been written
> > > to do such things.
> >
> > Split it in half, with the appropriate sql on each end so the data
> > still works, and see which half causes a problem.  Keep splitting the
> > one that causes a problem in half until you have a small one with the
> > problem still.
> >
> > I'm guessing the problem will become obvious then.
>
> That's the "branch and bound". Editing 29M+ lines file takes some time.
> But this is the way I'm going to go right now.

Oh, we called it half-splitting in the military.

Using something like head / tail in unix to do it.  Should be fairly
fast, especially if you keep cutting it in half after the first test.
I can't imagine editing something that large even in vi being very
fast.

Re: Syntax error in a large COPY

From

Tom Lane

Date:

06 November 2007, 17:40:58

Reg Me Please <regmeplease@gmail.com> writes:
> I'm trying to understand what and where.
> The point is that I have this 29M+ lines script telling me there's a problem
> somewhere.

It told you exactly where it detected the problem.

Actually ... are you looking at the right error message?  If this is a
copy operation within a larger psql script, I think it's probably true
that psql will point at the end of the copy data, because *it* doesn't
know any better.  But the error message from the backend should
correctly finger which line of copy input it got confused at.

            regards, tom lane

Re: Syntax error in a large COPY

From

Reg Me Please

Date:

06 November 2007, 17:51:34

Il Tuesday 06 November 2007 22:37:12 Scott Marlowe ha scritto:
> On 11/6/07, Reg Me Please <regmeplease@gmail.com> wrote:
> > Il Tuesday 06 November 2007 22:13:15 hai scritto:
> > > On 11/6/07, Reg Me Please <regmeplease@gmail.com> wrote:
> > > > Il Tuesday 06 November 2007 19:43:38 Scott Marlowe ha scritto:
> > > > > On 11/6/07, Reg Me Please <regmeplease@gmail.com> wrote:
> > > > > > That seems not to be the case.
> > > > > > The last line has a \. by its own and the last but one is
> > > > > > well formed.
> > > > >
> > > > > (Please don't top post...)
> > > > >
> > > > > Got a self contained test case you can post?
> > > >
> > > > Back to the original topic ...
> > > >
> > > > I'm trying to understand what and where.
> > > > The point is that I have this 29M+ lines script telling me there's a
> > > > problem somewhere.
> > > >
> > > > A self contained test, at the moment, would be that long!
> > > >
> > > > I'm considering a "branch and bound" approach ... but it'd be quite
> > > > long and tedious as the program generating the script has not been
> > > > written to do such things.
> > >
> > > Split it in half, with the appropriate sql on each end so the data
> > > still works, and see which half causes a problem.  Keep splitting the
> > > one that causes a problem in half until you have a small one with the
> > > problem still.
> > >
> > > I'm guessing the problem will become obvious then.
> >
> > That's the "branch and bound". Editing 29M+ lines file takes some time.
> > But this is the way I'm going to go right now.
>
> Oh, we called it half-splitting in the military.
>
> Using something like head / tail in unix to do it.  Should be fairly
> fast, especially if you keep cutting it in half after the first test.
> I can't imagine editing something that large even in vi being very
> fast.

My laptop has a fairly slow disk and even with Linux and vi it takes time.

Well, a better diagnostic messages would be better in any case.


--
Reg me Please

Re: Syntax error in a large COPY

From

Thomas Kellerer

Date:

06 November 2007, 18:05:39

Tom Lane wrote on 06.11.2007 21:21:
> The real point here is that when writing to a mailing list, you should
> make an effort to conserve other peoples' time.  Hundreds of people are
> going to read what you wrote, possibly thousands when you count in
> people searching the list archives in the future.  Saving them a few
> seconds apiece adds up fast.  Not bothering to expend a minute of your
> own time to make it a bit easier on the reader marks you as a boor.

Hmm.
My News/Mail client (Thunderbird) displays all the posts belonging together in a
threaded manner. So I click on the first post, read the posting, click on the
second post, and the first thing I see is the first post repeated (or at least
parts of it).
If everyone simply top-posted, there would be no need for me to scroll down,
just to find a two line answer below a forty line quote - which I personally
find more irritating than top-posting.

But then that may only be me ;)

Re: Syntax error in a large COPY

From

Dimitri Fontaine

Date:

06 November 2007, 18:11:58

Hi,

Le Tuesday 06 November 2007 22:40:50 Tom Lane, vous avez écrit :
> Reg Me Please <regmeplease@gmail.com> writes:
> > I'm trying to understand what and where.
> > The point is that I have this 29M+ lines script telling me there's a
> > problem somewhere.

You could use pgloader, which reads CSV input (or text which is not quite CSV)
and will import valid rows and reject invalid ones, filling both a reject log
file with error messages and a rejected data file with bad input lines.
Basically, it will issue the dichotomy steps (branch and bound) for you.
  http://pgfoundry.org/projects/pgloader
  http://pgloader.projects.postgresql.org/

> It told you exactly where it detected the problem.
>
> Actually ... are you looking at the right error message?  If this is a
> copy operation within a larger psql script, I think it's probably true
> that psql will point at the end of the copy data, because *it* doesn't
> know any better.  But the error message from the backend should
> correctly finger which line of copy input it got confused at.

Or just use the server logged error message, which will be the same as
reported by pgloader (no magic here) --- pgloader will import successfully
the rest of the data and have the faulty lines separated out in the reject
file, but still uses COPY internally.

Hope this helps,
--
dim

Re: Syntax error in a large COPY

From

Alvaro Herrera

Date:

06 November 2007, 20:29:58

Reg Me Please wrote:
> Il Tuesday 06 November 2007 22:13:15 hai scritto:

> > Split it in half, with the appropriate sql on each end so the data
> > still works, and see which half causes a problem.  Keep splitting the
> > one that causes a problem in half until you have a small one with the
> > problem still.
> >
> > I'm guessing the problem will become obvious then.
>
> That's the "branch and bound". Editing 29M+ lines file takes some time.
> But this is the way I'm going to go right now.

Huh, why not just use pgloader?

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Syntax error in a large COPY

From

Tom Lane

Date:

07 November 2007, 01:24:56

Thomas Kellerer <spam_eater@gmx.net> writes:
> If everyone simply top-posted, there would be no need for me to scroll down,
> just to find a two line answer below a forty line quote - which I personally
> find more irritating than top-posting.

I think you're ignoring my basic point, which was that people shouldn't
be quoting forty lines' worth in the first place.  *Especially* not if
they only have two lines to contribute.

But this horse has been beat to death before...

            regards, tom lane

Re: Syntax error in a large COPY

From

Reg Me Please

Date:

07 November 2007, 02:54:49

Il Wednesday 07 November 2007 01:29:44 Alvaro Herrera ha scritto:
> Reg Me Please wrote:
> > Il Tuesday 06 November 2007 22:13:15 hai scritto:
> > That's the "branch and bound". Editing 29M+ lines file takes some time.
> > But this is the way I'm going to go right now.
>
> Huh, why not just use pgloader?

Becasue I never heard about it.
Because it's not included into my distribution package list.
And because I was trusting the "core tools" to work reasonably.

I'll compile and use that.

By the way, unsless you want to have logs at the debug level, no
information has been found in the logs about the offending line(s)
in the 29M+ COPY script.

--
Reg me Please

Re: Syntax error in a large COPY

From

Thomas Kellerer

Date:

07 November 2007, 03:50:05

Tom Lane, 07.11.2007 06:14:
> Thomas Kellerer <spam_eater@gmx.net> writes:
>> If everyone simply top-posted, there would be no need for me to scroll down,
>> just to find a two line answer below a forty line quote - which I personally
>> find more irritating than top-posting.
>
> I think you're ignoring my basic point, which was that people shouldn't
> be quoting forty lines' worth in the first place.  *Especially* not if
> they only have two lines to contribute.

No, I did get your point.

My point is: with top-posting I don't care how many lines were repeated
because I don't have to scroll.


> But this horse has been beat to death before...
Yes, nuff said :)

Regards
Thomas

Re: Syntax error in a large COPY

From

Reg Me Please

Date:

07 November 2007, 05:51:34

Il Wednesday 07 November 2007 07:54:41 Reg Me Please ha scritto:
> Il Wednesday 07 November 2007 01:29:44 Alvaro Herrera ha scritto:
> > Reg Me Please wrote:
> > > Il Tuesday 06 November 2007 22:13:15 hai scritto:
> > > That's the "branch and bound". Editing 29M+ lines file takes some time.
> > > But this is the way I'm going to go right now.
> >
> > Huh, why not just use pgloader?
>
> Becasue I never heard about it.
> Because it's not included into my distribution package list.
> And because I was trusting the "core tools" to work reasonably.
>
> I'll compile and use that.
>
> By the way, unsless you want to have logs at the debug level, no
> information has been found in the logs about the offending line(s)
> in the 29M+ COPY script.

pgloader seems not that easy to use for a newbie like myself.
Also because domentation seems too skinny.

In any case each "goto line, add lines, save, run" cycle requires about 10
minutes on my PC. And the logs don't provide any useful detail.
So, again, better logging would help in any case.

--
Reg me Please

Re: Syntax error in a large COPY

From

Dimitri Fontaine

Date:

07 November 2007, 06:10:54

Le mercredi 07 novembre 2007, Reg Me Please a écrit :
> pgloader seems not that easy to use for a newbie like myself.
> Also because domentation seems too skinny.

Sorry about this, writting documentation in English is not that easy when it's
not one's natural language... I'll accept any comment/patch to the
documentation, the aim of it being to ease users life, of course ;)
  http://pgloader.projects.postgresql.org/

Short story: you have to make a pgloader.conf file where you explain where is
the data file and what pgloader should expect into it (csv, text, what
delimiter and quotes, etc), then run
  pgloader -Tc pgloader.conf

The -T option will TRUNCATE the configured table(s) before COPYing data into
it (them).

> In any case each "goto line, add lines, save, run" cycle requires about 10
> minutes on my PC. And the logs don't provide any useful detail.
> So, again, better logging would help in any case.

pgloader would certainly give this, at first run... It seems to me it is worth
the effort of reading the manual...
--
dim

Attachment

signature.asc

Re: Syntax error in a large COPY

From

Reg Me Please

Date:

07 November 2007, 06:18:01

Il Wednesday 07 November 2007 11:10:40 Dimitri Fontaine ha scritto:
> Le mercredi 07 novembre 2007, Reg Me Please a écrit :
> > pgloader seems not that easy to use for a newbie like myself.
> > Also because domentation seems too skinny.
>
> Sorry about this, writting documentation in English is not that easy when
> it's not one's natural language... I'll accept any comment/patch to the
> documentation, the aim of it being to ease users life, of course ;)
> http://pgloader.projects.postgresql.org/
>
> Short story: you have to make a pgloader.conf file where you explain where
> is the data file and what pgloader should expect into it (csv, text, what
> delimiter and quotes, etc), then run
>   pgloader -Tc pgloader.conf
>
> The -T option will TRUNCATE the configured table(s) before COPYing data
> into it (them).
>
> > In any case each "goto line, add lines, save, run" cycle requires about
> > 10 minutes on my PC. And the logs don't provide any useful detail.
> > So, again, better logging would help in any case.
>
> pgloader would certainly give this, at first run... It seems to me it is
> worth the effort of reading the manual...

Hi.

pgloader rocks!

Maybe just a complete example would suffice. Let's say a table structure, a
CSV and a raw text file, a config file and the run output.

Thanks.

P.S.
Why not including the pgloader into the main tarball?


--
Reg me Please

Re: Syntax error in a large COPY

From

Dimitri Fontaine

Date:

07 November 2007, 06:27:11

Le mercredi 07 novembre 2007, Reg Me Please a écrit :
> Maybe just a complete example would suffice. Let's say a table structure, a
> CSV and a raw text file, a config file and the run output.

Do you mean something like the included examples, which I tend to also use as
(regression) tests?
  http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/pgloader/pgloader/examples/

> Why not including the pgloader into the main tarball?

If you're talking about the main PostgreSQL tarball... For starters it would
have to be rewritten in C and be good enough to merit a contrib inclusion,
and that's considering the core product would benefit of such a project in
the first place...

PgFoundry is a great resource for a myriad of tools making the PG-user life
easier, maybe you'd be better off browsing it than expecting core to include
more helper tools!

Regards,
--
dim

Attachment

signature.asc

Re: Syntax error in a large COPY

From

Reg Me Please

Date:

07 November 2007, 06:50:20

Il Wednesday 07 November 2007 11:26:56 Dimitri Fontaine ha scritto:
> Le mercredi 07 novembre 2007, Reg Me Please a écrit :
> > Maybe just a complete example would suffice. Let's say a table structure,
> > a CSV and a raw text file, a config file and the run output.
>
> Do you mean something like the included examples, which I tend to also use
> as (regression) tests?
>   http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/pgloader/pgloader/examples/

Right.
I installed .deb. The man page has not been included.
The examples have been copied in /usr/share/doc/pgloader/examples.

The examples are OK.

> Regards,

Thanks for yor work.

--
Reg me Please

Re: Syntax error in a large COPY

From

Reg Me Please

Date:

07 November 2007, 07:03:23

Il Tuesday 06 November 2007 19:05:52 Reg Me Please ha scritto:
> Hi all.
> I'm generating an SQL script to load some million rows into a table.
> I'm trying to use the COPY command in order to speed the load up.
>
> At a certain point I get an error telling about a
> "invalid input syntax for type numeric"
>
> The incriminated line number is the last one (the one containing the \.).
>
> Is there a way to know which line is really malformed?
>
> Thanks.

Blame on me!
The problem (spotted thanks to the pgloader) was that I was using "\n" for
null values instead of "\N" (capital n).

As stated into the friendly documentation "\n" stands for "new line",
while "\N" stands for "NULL".

While being clearly stated, this choice is a little bit confusing, at least
for newbies like myself.

Thanks everyone and Dimitri Fontaine for his pgloader.

--
Reg me Please

Re: Syntax error in a large COPY

From

Dimitri Fontaine

Date:

07 November 2007, 07:15:54

Le mercredi 07 novembre 2007, Reg Me Please a écrit :
> I installed .deb. The man page has not been included.

It seems the latter package on pgfoundry does have a problem here. As I have
some patches waiting for a release, I'll make current CVS the 2.2.2 version
and update the pgfoundry files sometime later.

Thanks for reporting the issue, regards,
--
dim

Attachment

signature.asc

Re: Syntax error in a large COPY

From

Bill Moran

Date:

07 November 2007, 09:31:25

In response to Thomas Kellerer <spam_eater@gmx.net>:

> Tom Lane, 07.11.2007 06:14:
> > Thomas Kellerer <spam_eater@gmx.net> writes:
> >> If everyone simply top-posted, there would be no need for me to scroll down,
> >> just to find a two line answer below a forty line quote - which I personally
> >> find more irritating than top-posting.
> >
> > I think you're ignoring my basic point, which was that people shouldn't
> > be quoting forty lines' worth in the first place.  *Especially* not if
> > they only have two lines to contribute.
>
> No, I did get your point.
>
> My point is: with top-posting I don't care how many lines were repeated
> because I don't have to scroll.

Considering there is an RFC that recommends inline posting over
top-posting (http://tools.ietf.org/html/rfc1855), and considering the
fact that this topic has been beat to death on dozens of mailing lists
and the predominant preference is _not_ for top-posting -- perhaps you
should either follow the preferences of the group, or leave the group.

> > But this horse has been beat to death before...

Obviously not, as it keeps coming back to life.  I guess it's an
undead horse?

--
Bill Moran
http://www.potentialtech.com

Re: Syntax error in a large COPY

From

Alvaro Herrera

Date:

07 November 2007, 09:37:43

Reg Me Please wrote:

> P.S.
> Why not including the pgloader into the main tarball?

We are not attempting to include every useful tool in the database
server.  We're actually moving in the opposite direction: stuff has been
offloaded to pgfoundry as appropriate. Add-on packages are encouraged.
If your distributor does not carry packages for interesting pgFoundry
projects, complain to them!

--
Alvaro Herrera                  http://www.amazon.com/gp/registry/5ZYLFMCVHXC
"We're here to devour each other alive"            (Hobbes)

Re: Syntax error in a large COPY

From

Collin Kidder

Date:

07 November 2007, 13:12:23

>>
>> My point is: with top-posting I don't care how many lines were repeated
>> because I don't have to scroll.
>>
>
> Considering there is an RFC that recommends inline posting over
> top-posting (http://tools.ietf.org/html/rfc1855), and considering the
> fact that this topic has been beat to death on dozens of mailing lists
> and the predominant preference is _not_ for top-posting -- perhaps you
> should either follow the preferences of the group, or leave the group.
>

I'm with Thomas. I think that, while inline posting is a good thing,
bottom posting is dead stupid and wastes my time. It is far easier to
follow a thread with top posting as the relevant text is right there at
the top ready to be read.

>
>>> But this horse has been beat to death before...
>>>
>
> Obviously not, as it keeps coming back to life.  I guess it's an
> undead horse?
>
>

No, just not everyone agrees with your viewpoint on this topic. Top
posting has it's place and some of us prefer it. Obviously I'm not doing
it but it's only because of the large amount of anal retentive people on
lists like this. And so... with that my view is out there. I hate bottom
posting. But I for one will do it to keep the peace.

Re: Syntax error in a large COPY

From

"Andrej Ricnik-Bay"

Date:

07 November 2007, 13:36:03

On 11/8/07, Collin Kidder <adderd@kkmfg.com> wrote:

> I'm with Thomas. I think that, while inline posting is a good thing,
> bottom posting is dead stupid and wastes my time.
Just as bad as top-posting, really.

> It is far easier to
> follow a thread with top posting as the relevant text is right there at
> the top ready to be read.
The relevant bit being what?  Two lines dangling loosely
at the top of a mail?

You omitted the crucial bit here:
"It is far easier FOR ME WITH MY CURRENT MAIL CLIENT
to follow a thread with top posting ..."
If that's good enough reason for you to ignore RFCs and
complain about the habit on this list, by all means, there's
no point in arguing. But we can flog the dead horse some
more ....


> No, just not everyone agrees with your viewpoint on this topic. Top
> posting has it's place and some of us prefer it.
But they could just adhere to the "law of the land", or "when in
Rome, ... " practice instead of kicking off fuss.  And with my
"mail client" top-posting has no place.  Let's just stick to good
old standards.

[ ... offensive material removed ... ]
> And so... with that my view is out there. I hate bottom
> posting. But I for one will do it to keep the peace.
You were actually using the appropriate interleaved quoting
style, not bottom posting (minus the 'trimming', mind you).
At least get your terminology right. ;D



Cheers,
Andrej


--
Please don't top post, and don't use HTML e-Mail :}  Make your quotes concise.

http://www.american.edu/econ/notes/htmlmail.htm

Re: Syntax error in a large COPY

From

Steve Wampler

Date:

07 November 2007, 13:42:27

Collin Kidder wrote:
> I'm with Thomas. I think that, while inline posting is a good thing,
> bottom posting is dead stupid and wastes my time. It is far easier to
> follow a thread with top posting as the relevant text is right there at
> the top ready to be read.

That sounds more like an argument to not including the original text at all.

As far as I'm concerned, in-line posting *with* editing of the text is
the *only* reasonable thing to do.  Top-posting is lazy, arrogant, and assumes
reading material on a recently read thread (so the context is fresh).  It
offers no advantages when reading a posting after-the-fact or taken in
isolation.  The reader has to first re-establish the context, which means
reading the message from the bottom up.

Bottom posting (w/o editing) is only *slightly* less lazy, but doesn't
make the assumption that the reader is current on the context, at least.
Bottom posting w/o editing forces the reader to wade through old material
that isn't relevant, however, to reestablish the context.

The argument based on being able to link back up through a thread to
get context is a non-sequitur.  If one really believes that's the case,
then don't include the original text *at all* (whether top or bottom
posting) [and see how many people complain about lack of context!]
If there is some context that is relevant to what's being
added, seeing *just that context* immediately prior to reading the new material
is invaluable.

[This *isn't* a bottom-posted message - it just looks like one because
of the context editing!]

--
Steve Wampler -- swampler@noao.edu
The gods that smiled on your birth are now laughing out loud.

Re: Syntax error in a large COPY

From

Ron Johnson

Date:

07 November 2007, 14:20:43

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 11/07/07 11:35, Andrej Ricnik-Bay wrote:
> On 11/8/07, Collin Kidder <adderd@kkmfg.com> wrote:
[snip]
>
>
>> No, just not everyone agrees with your viewpoint on this topic. Top
>> posting has it's place and some of us prefer it.
> But they could just adhere to the "law of the land", or "when in
> Rome, ... " practice instead of kicking off fuss.  And with my
> "mail client" top-posting has no place.  Let's just stick to good
> old standards.

<SARCASM>
What ever happened to "I gotta do what's right for me!" and "I'm OK,
you're OK"?
</SARCASM>

- --
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHMgG/S9HxQb37XmcRAiDmAJ9heLxbBvBSVP0duhzSfI1bvnskoACeOlWp
UDS6YJV0KsYD44FMIpa54m4=
=x+Xs
-----END PGP SIGNATURE-----

Re: Syntax error in a large COPY

From

Bruce Momjian

Date:

07 November 2007, 22:48:26

Andrej Ricnik-Bay wrote:
> On 11/8/07, Collin Kidder <adderd@kkmfg.com> wrote:
>
> > I'm with Thomas. I think that, while inline posting is a good thing,
> > bottom posting is dead stupid and wastes my time.
> Just as bad as top-posting, really.
>
> > It is far easier to
> > follow a thread with top posting as the relevant text is right there at
> > the top ready to be read.
> The relevant bit being what?  Two lines dangling loosely
> at the top of a mail?
>
> You omitted the crucial bit here:
> "It is far easier FOR ME WITH MY CURRENT MAIL CLIENT
> to follow a thread with top posting ..."
> If that's good enough reason for you to ignore RFCs and
> complain about the habit on this list, by all means, there's
> no point in arguing. But we can flog the dead horse some
> more ....

Offtopic, but what actually confuses me most is people replying to
quoted text and not putting a blank line before their additional text,
like above.  It makes it very hard for me to quickly see the next text.

FYI, I only top post when I want talk talk _about_ the email, like "Is
this a TODO item", and put a dashed line under my text so people realize
I top-posted and there is nothing new below my text.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +