Thread: interesting SQL puzzle - concatenating column with itself.

interesting SQL puzzle - concatenating column with itself.

From
Nick Fankhauser
Date:
Hi- I'm doing a conversion from an older database in which a memo field was handled by storing one line per record and
thendisplaying the related records in order. I want to compress all of the lines into a single text field with one
recordper memo entry.
 


So for instance, the old database looks like this:


memo_id | sequence |     memo_text
---------------------------------------
666     | 1        | The quick
666     | 2        | red fox
666     | 3        | jumped over
666     | 4        | the lazy brown dog


And my goal is to transform each group of lines into a single record that looks like this:


memo_id | memo_text
----------------------------------------------------------
666     | The quick red fox jumped over the lazy brown dog


Any thoughts on how to do this via sql? I could write a little program to do it, but it seems like there must be a
pure-SQLsolution that I'm not seeing.
 


Thanks   -Nick
-- 
------------------------------------------------------------------
Nick Fankhauser    
  nickf@ontko.com   Phone 765.935.4283   Fax 765.962.9788      Ray Ontko & Co. - Software Consulting Services
        http://www.ontko.com
 



Re: interesting SQL puzzle - concatenating column with itself.

From
Alvaro Herrera
Date:
On Sun, May 08, 2005 at 01:55:56PM -0500, Nick Fankhauser wrote:
> 
> Hi- I'm doing a conversion from an older database in which a memo field was 
> handled by storing one line per record and then displaying the related 
> records in order. I want to compress all of the lines into a single text 
> field with one record per memo entry.
> 
> 
> So for instance, the old database looks like this:
> 
> 
> memo_id | sequence |     memo_text
> ---------------------------------------
> 666     | 1        | The quick
> 666     | 2        | red fox
> 666     | 3        | jumped over
> 666     | 4        | the lazy brown dog
> 
> 
> And my goal is to transform each group of lines into a single record that 
> looks like this:
> 
> 
> memo_id | memo_text
> ----------------------------------------------------------
> 666     | The quick red fox jumped over the lazy brown dog
> 
> 
> Any thoughts on how to do this via sql? I could write a little program to 
> do it, but it seems like there must be a pure-SQL solution that I'm not 
> seeing.

You can create a custom aggregate to do concatenation and group by
memo_id.

create or replace function text_cat(text, text) returns text called on null input
language sql immutable as 'select case when $1 is null then $2 when $2
is null then $1 else $1 || '' '' || $2 end';

create aggregate textcat (basetype = text, sfunc = text_cat, stype = text);

create table memos (memo_id int, sequence int, memo_text text);
insert into memos values (666, 3, 'jumped over');
insert into memos values (666, 1, 'The quick');
insert into memos values (666, 4, 'the lazy brown dog');
insert into memos values (666, 2, 'red fox');

select memo_id, textcat(memo_text) from (select * from memos order by
memo_id, sequence) as foo group by memo_id;


The order is not really guaranteed, though if this is a one-shot thing,
you may get away with turning off hashed aggregates.

-- 
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"There was no reply" (Kernel Traffic)


Re: interesting SQL puzzle - concatenating column with itself.

From
Nick Fankhauser
Date:
Excellent! Thanks for providing both the idea and an example. I didn't get the idea right away, but the example made it
clear.I'll try that on my table and report back on how it works out.
 

Regards,    -Nick


------------------------------------------------------------------
Nick Fankhauser    
  nickf@ontko.com   Phone 765.935.4283   Fax 765.962.9788      Ray Ontko & Co. - Software Consulting Services
        http://www.ontko.com
 

Alvaro Herrera wrote:

> 
> You can create a custom aggregate to do concatenation and group by
> memo_id.
> 
> create or replace function text_cat(text, text) returns text called on null input
> language sql immutable as 'select case when $1 is null then $2 when $2
> is null then $1 else $1 || '' '' || $2 end';
> 
> create aggregate textcat (basetype = text, sfunc = text_cat, stype = text);
> 
> create table memos (memo_id int, sequence int, memo_text text);
> insert into memos values (666, 3, 'jumped over');
> insert into memos values (666, 1, 'The quick');
> insert into memos values (666, 4, 'the lazy brown dog');
> insert into memos values (666, 2, 'red fox');
> 
> select memo_id, textcat(memo_text) from (select * from memos order by
> memo_id, sequence) as foo group by memo_id;
> 
> 
> The order is not really guaranteed, though if this is a one-shot thing,
> you may get away with turning off hashed aggregates.
> 



Re: interesting SQL puzzle - concatenating column with itself.

From
Nick Fankhauser
Date:

Alvaro Herrera wrote:

> The order is not really guaranteed, though if this is a one-shot thing,
> you may get away with turning off hashed aggregates.
> 

When I read this, I assumed there was a runtime parameter I could set that was similar to ENABLE_HASHJOIN. Are you
referringto a different runtime parameter or something else entirely?
 

BTW the concatenation function you suggest works nicely except that as you noted, it concatenates in an unpredictable
order,so I'm now trying to solve that problem.
 

-Nick

------------------------------------------------------------------
Nick Fankhauser    
  nickf@ontko.com   Phone 765.935.4283   Fax 765.962.9788      Ray Ontko & Co. - Software Consulting Services
        http://www.ontko.com
 



> BTW the concatenation function you suggest works nicely except that as  
> you noted, it concatenates in an unpredictable order, so I'm now trying  
> to solve that problem.

memo_id | sequence |     memo_text
---------------------------------------
666     | 1        | The quick
666     | 2        | red fox
666     | 3        | jumped over
666     | 4        | the lazy brown dog


You have :

SELECT your_concat( memo_text ) FROM table GROUP BY memo_id

You can use :

SELECT your_concat( memo_text ) FROM
(SELECT memo_id, sequence, memo_text FROM table ORDER BY memo_id, sequence  
OFFSET 0) AS foo
GROUP BY memo_id

the OFFSET 0 may be necessary (or not). Try it !



Re: interesting SQL puzzle - concatenating column with itself.

From
Greg Stark
Date:
Nick Fankhauser <nickf@ontko.com> writes:

> Alvaro Herrera wrote:
> 
> > The order is not really guaranteed, though if this is a one-shot thing,
> > you may get away with turning off hashed aggregates.
> >
> 
> When I read this, I assumed there was a runtime parameter I could set that was
> similar to ENABLE_HASHJOIN. Are you referring to a different runtime parameter
> or something else entirely?

Similar but different. enable_hashagg. You can see all of these with "show
all" in psql.

However you do not have to worry. Even with (actually, *especially with*) hash
aggregates the records will be processed in the order they're received.

It's actually the normal aggregate method of sorting on the GROUP BY columns
that risks damaging the order. However Postgres does a special check to see if
the subquery is already sorted by the GROUP BY column and avoids the extra
sort which could cause you problems.

So this is not guaranteed by the SQL spec to work (but then the SQL spec
doesn't have custom aggregates at all) but Postgres goes out of its way to
make sure this doesn't break unnecessarily.

[This is all for 8.0 and I think 7.4. Some versions before that might 

-- 
greg



Re: interesting SQL puzzle - concatenating column with itself.

From
"Dmitri Bichko"
Date:
> SELECT your_concat( memo_text ) FROM
> (SELECT memo_id, sequence, memo_text FROM table ORDER BY memo_id,
sequence
> OFFSET 0) AS foo
> GROUP BY memo_id

I'm just curious - what's the 'OFFSET 0' for?

Dmitri


> I'm just curious - what's the 'OFFSET 0' for?
Trick to fool postgres into thinking it can't rewrite out your subquery  
and eliminate it ...