Thread: gsoc ideas

gsoc ideas

From
longlong
Date:
hi,all.

i want some advice about ideas for gsoc. i don't know if it is appropriate to send an email here, so if it bothers you, please accept my apology.

1. release 8.2 made COPY TO able to copy the output of an arbitrary SELECT statement. so i think maybe COPY FROM could get its data from such output and 'insert into' designated columns. the format of the command would need to be discussed.

2. this comes from the TODO list: COPY always behaves like a unit of work that consists of a number of insert commands; if any error occurs, it rolls back. but sometimes we only care that the data gets inserted. in that situation, i used to use "try....catch...." and insert row by row to skip the errors, because it would take too much time to examine every row. so:
    Allow COPY to report error lines and continue.
this is a good idea.
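
The row-by-row "try....catch...." workaround described above might look roughly like the sketch below. It is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in so that it runs without a server, the function name load_with_rejects and the tab-separated reject-file layout are hypothetical, and nothing here is how COPY itself works or would work.

```python
import sqlite3


def load_with_rejects(conn, rows, reject_path):
    """Insert rows one at a time into x(a); log failing rows and continue.

    Hypothetical helper: sqlite3 stands in for a real server, and the
    reject-file layout (line number, value, error) is an assumption.
    Returns a (loaded, rejected) pair of counts.
    """
    loaded = 0
    rejected = 0
    with open(reject_path, "w", encoding="utf-8") as reject_file:
        for line_no, value in enumerate(rows, start=1):
            try:
                # "with conn" commits on success and rolls back on error,
                # so a bad row never undoes rows that were already loaded.
                with conn:
                    conn.execute("INSERT INTO x (a) VALUES (?)", (value,))
                loaded += 1
            except sqlite3.Error as exc:
                # Record the failing input line and the reason, then go on.
                reject_file.write(f"{line_no}\t{value}\t{exc}\n")
                rejected += 1
    return loaded, rejected
```

With a table created as x (a INTEGER NOT NULL UNIQUE), loading the values 1, 2, 2, NULL, 3 this way keeps three rows and writes the duplicate and the NULL to the reject file. The cost is one transaction per row, which is the slowness the TODO item would remove.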

3. sometimes i want to copy data from one database to another. i think using COPY would simplify the code. i want the output of COPY TO to be kept in memory rather than stored in a file, so that i can COPY FROM memory (i don't know whether COPY with STDIN and STDOUT can do this or not).

what do you think of these ideas?

Re: gsoc ideas

From
Devrim GÜNDÜZ
Date:
Hi,

On Mon, 2008-03-10 at 13:10 +0800, longlong wrote:
> i want some advice about ideas for gsoc. i don't know if it is
> appropriate to send an email here, so if it bothers you,
> please accept my apology.

Use the pgsql-hackers list. Development discussions are held on that list.

Regards,
--
Devrim GÜNDÜZ , RHCE
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, ODBCng - http://www.commandprompt.com/

Re: gsoc ideas

From
Alvaro Herrera
Date:
longlong wrote:

> 3. sometimes i want to copy data from one database to another. i think
> using COPY would simplify the code. i want the output of COPY TO to be
> kept in memory rather than stored in a file, so that i can COPY FROM
> memory (i don't know whether COPY with STDIN and STDOUT can do this or
> not).

I don't think this is very interesting because you can do

pg_dump -t foo | psql

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: gsoc ideas

From
Greg Smith
Date:
On Mon, 10 Mar 2008, longlong wrote:

> 1. release 8.2 made COPY TO able to copy the output of an arbitrary SELECT
> statement. so i think maybe COPY FROM could get its data from such output
> and 'insert into' designated columns. the format of the command would need
> to be discussed.

This would be a nice feature.  Right now there are often applications
where there is a data loading or staging table that ends up being merged
with a larger table after some cleanup.  Moving that data from the
preparation area into the final table is currently most easily done with
INSERT INTO X (SELECT A,B FROM C) type actions.  This is slow because
INSERT takes much longer than COPY.  Adding support for COPY X FROM
(SELECT A,B FROM C) would make this problem go away.

It is possible to do this right now with some clever use of STDIN/OUT like
the example below, but having a pure SQL solution would be more widely
applicable.  The overhead of having to pass everything through the client
(as STDIN/OUT do) is certainly not zero.

> 2. this comes from the TODO list: COPY always behaves like a unit of work
> that consists of a number of insert commands; if any error occurs, it
> rolls back. but sometimes we only care that the data gets inserted. in
> that situation, i used to use "try....catch...." and insert row by row to
> skip the errors, because it would take too much time to examine every
> row. so:
>    Allow COPY to report error lines and continue.  this is a good idea.

This is a long-standing request and many people would be happy to see it
implemented.  You do want to make sure the implementation easily allows
pushing all the lines that didn't commit into what's commonly called a
"reject file".

> 3. sometimes i want to copy data from one database to another. i think
> using COPY would simplify the code. i want the output of COPY TO to be
> kept in memory rather than stored in a file, so that i can COPY FROM
> memory (i don't know whether COPY with STDIN and STDOUT can do this or
> not).

It can:

create table x(a int);
insert into x(select generate_series(1,10));
create table y(b int);

psql -c "copy x to stdout" | psql -c "copy y from stdin"

Try it out; table y will have the same contents as x when it's all done.

I think you've got the basics of some useful features to add here.  What
you probably want to do is write a slightly longer description of your
plan and submit it to the pgsql-hackers list, where the developers are, to
get feedback on the feasibility of doing this as a GSOC project.  From
your message, I get the impression that English writing is tough for you.
That will make it a little harder for you to get through the process of
getting a patch designed and then accepted, as this community likes to
talk through that sort of thing.  If you've got another language you're
more comfortable with, you might also want to see if there's an existing
community member who speaks it and could work with you to make that easier.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: gsoc ideas

From
longlong
Date:
your words are a great encouragement to me!

i am embarrassed by my English and i will improve it.

maybe i left something out that made you misunderstand point 3 a little. what i want is data copies between databases processed automatically by a program, and the databases may not be on the same host. i feel that this is hard to implement now, so i'll try 1 and 2 on [HACKERS].

thanks for your advice.

Re: gsoc ideas

From
Richard Huxton
Date:
longlong wrote:
> your words are a great encouragement to me!
>
> i am embarrassed by my English and i will improve it.
>
> maybe i left something out that made you misunderstand point 3 a little.
> what i want is data copies between databases processed automatically by a
> program, and the databases may not be on the same host. i feel that this
> is hard to implement now, so i'll try 1 and 2 on [HACKERS].

You can do it with different hosts too:

psql -h host1 -c "copy x to stdout" | psql -h host2 -c "copy y from stdin"
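
If the copy has to be driven automatically by a program, the same pipe could be built from code. The following is a rough sketch only, assuming psql is on the PATH and that the host and table names are trusted values (they are interpolated into the command unquoted). The function names copy_commands and copy_between_hosts are hypothetical and exist just for this illustration.

```python
import subprocess


def copy_commands(src_host, dst_host, src_table, dst_table):
    """Build the two psql command lines used in the shell pipeline.

    Hypothetical helper; table names are inserted as-is, so they must
    come from a trusted source.
    """
    export_cmd = ["psql", "-h", src_host, "-c", f"copy {src_table} to stdout"]
    import_cmd = ["psql", "-h", dst_host, "-c", f"copy {dst_table} from stdin"]
    return export_cmd, import_cmd


def copy_between_hosts(src_host, dst_host, src_table, dst_table):
    """Stream rows from one server to another through a pipe, no temp file."""
    export_cmd, import_cmd = copy_commands(
        src_host, dst_host, src_table, dst_table
    )
    exporter = subprocess.Popen(export_cmd, stdout=subprocess.PIPE)
    importer = subprocess.Popen(import_cmd, stdin=exporter.stdout)
    # Close our copy of the pipe so the exporter sees a broken pipe
    # if the importer exits early.
    exporter.stdout.close()
    importer_rc = importer.wait()
    exporter_rc = exporter.wait()
    if exporter_rc != 0 or importer_rc != 0:
        raise RuntimeError(
            f"copy failed (export={exporter_rc}, import={importer_rc})"
        )
```

Calling copy_between_hosts("host1", "host2", "x", "y") would run the same two psql commands as the one-liner above. The data still passes through the client machine, so the overhead Greg mentioned applies here as well.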


--
   Richard Huxton
   Archonet Ltd

Re: gsoc ideas

From
longlong
Date:
i see.
now i know that COPY with STDIN/STDOUT can do exactly what i mentioned before.

 
2008/3/11, Richard Huxton <dev@archonet.com>:
longlong wrote:
> your words are a great encouragement to me!
>
> i am embarrassed by my English and i will improve it.
>
> maybe i left something out that made you misunderstand point 3 a little.
> what i want is data copies between databases processed automatically by a
> program, and the databases may not be on the same host. i feel that this
> is hard to implement now, so i'll try 1 and 2 on [HACKERS].

You can do it with different hosts too:

psql -h host1 -c "copy x to stdout" | psql -h host2 -c "copy y from stdin"


--
  Richard Huxton
  Archonet Ltd