Thread: Problem with Crosstab - Allocating value to wrong column

Problem with Crosstab - Allocating value to wrong column

From
Stefan Schwarzer
Date:
Hi there,

I have a database with statistical tables which look like this:

id    |    year_start   |    value
3             1960              736.1
3             1961              723.3
....
4             1960              123.4


Years can cover 40 years for each of the countries (ID) and each (variable) table .

Now, if for a given year there is no value, expressed not in form of NoData, but as missing line in the table, than I have a problem with this SQL: 

SELECT
FROM 
crosstab( 'SELECT 
COALESCE(c.name, '''') AS name,  
d.year_start AS year, 
d.value AS value  
FROM 
co2_total_cdiac AS d 
RIGHT JOIN 
countries_view AS c ON c.id = d.id_country 
WHERE 
(d.year_start = 1960 OR d.year_start = 2007 ) 
ORDER BY 
1,2;', 3) AS ct (name varchar, y_1960 numeric, y_2007 numeric)


For one of the countries, I have a value for 2007, but not for 1960. When using only the inner query, than I see one line: Andorra - 2007 - 539
But when running the whole SQL, the value for year 2007 get's allocated to the year 1960. The table looks as follows:

name     |    y_1960    |   y_2007
Andorra         539            NULL



(Not sure if the copy/pasted part displays correct:

namey_1960y_2007
Afghanistan
415
716
Albania
2026
4246
Algeria
6166
140234
Andorra
539
NULL
)


Can anyone give me a hint in which direction to search for a solution?

Thanks a lot!

Stefan


Re: Problem with Crosstab - Allocating value to wrong column

From
Joe Conway
Date:
On 10/19/2010 03:07 AM, Stefan Schwarzer wrote:
> For one of the countries, I have a value for 2007, but not for 1960.
> When using only the inner query, than I see one line: Andorra - 2007
> - 539 But when running the whole SQL, the value for year 2007 get's
> allocated to the year 1960. The table looks as follows:
>
> name     |    y_1960    |   y_2007
> Andorra  |       539    |   NULL


That is documented behavior. See:
  http://www.postgresql.org/docs/8.4/interactive/tablefunc.html

> F.33.1.2. crosstab(text)
...
> The crosstab function produces one output row for each consecutive
> group of input rows with the same row_name value. It fills the output
> value columns, left to right, with the value fields from these rows.
> If there are fewer rows in a group than there are output value
> columns, the extra output columns are filled with nulls; if there are
> more rows, the extra input rows are skipped.

You probably want the other form of crosstab

> F.33.1.4. crosstab(text, text)

> The main limitation of the single-parameter form of crosstab is that
> it treats all values in a group alike, inserting each value into the
> first available column. If you want the value columns to correspond to
> specific categories of data, and some groups might not have data for
> some of the categories, that doesn't work well. The two-parameter form
> of crosstab handles this case by providing an explicit list of the
> categories corresponding to the output columns.

HTH,

Joe


--
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support


Attachment

Re: Problem with Crosstab (Concatenate Problem)

From
Stefan Schwarzer
Date:
>> For one of the countries, I have a value for 2007, but not for 1960.
>> When using only the inner query, than I see one line: Andorra - 2007
>> - 539 But when running the whole SQL, the value for year 2007 get's
>> allocated to the year 1960. The table looks as follows:
>>
>> name     |    y_1960    |   y_2007
>> Andorra  |       539    |   NULL
>
>
> That is documented behavior. See:
> http://www.postgresql.org/docs/8.4/interactive/tablefunc.html
>
> You probably want the other form of crosstab
>
>>
> F.33.1.4. crosstab(text, text)

Thanks a lot for the help. Indeed, that should be the one which should do it. Unfortunately, I ran into an error
message:

I have this query:

SELECT * FROM
    crosstab(
        'SELECT
            c.name AS name,
            d.year_start AS year,
            d.value AS value
        FROM
            co2_total_cdiac AS d
        RIGHT JOIN
            countries_view AS c ON c.id = d.id_country
        WHERE
            (d.year_start = 1960 OR d.year_start = 1965 OR d.year_start = 2003 OR d.year_start = 2007 )
        ORDER BY 1',

        'SELECT
            DISTINCT ''y_'' || year_start AS year
        FROM
            co2_total_cdiac AS d
        WHERE
            (d.year_start = 1960 OR d.year_start = 1965 OR d.year_start = 2003 OR d.year_start = 2007 )
        ORDER BY 1'
    )
    AS ct(name varchar, y_1960 numeric, y_1965 numeric, y_2003 numeric, y_2007 numeric)


Now, I get an error message:

ERROR:  invalid input syntax for integer: "SELECT
            DISTINCT 'y_' || year_start AS year
        FROM
            co2_total_cdiac AS d
        WHERE
            (d.year_start = 1960 OR d.year_start = 1965 OR d.year_start = 2003 OR d.year_start = 2007 )
        ORDER BY 1"
LINE 15:   'SELECT
          ^


I need to convert the integer values for the years into column names, i.e. "1965" into "y_1965". How do I achieve this
then?

Thanks for any help!

Stef



Re: Problem with Crosstab (Concatenate Problem)

From
Stefan Schwarzer
Date:
>> For one of the countries, I have a value for 2007, but not for 1960.
>> When using only the inner query, than I see one line: Andorra - 2007
>> - 539 But when running the whole SQL, the value for year 2007 get's
>> allocated to the year 1960. The table looks as follows:
>>
>> name     |    y_1960    |   y_2007
>> Andorra  |       539    |   NULL
>
>
> That is documented behavior. See:
>  http://www.postgresql.org/docs/8.4/interactive/tablefunc.html
>
> You probably want the other form of crosstab
>
>>
> F.33.1.4. crosstab(text, text)

Thanks a lot for the help. Indeed, that should be the one which should do it. Unfortunately, I ran into an error
message:

I have this query:

SELECT * FROM
    crosstab(
        'SELECT
            c.name AS name,
            d.year_start AS year,
            d.value AS value
        FROM
            co2_total_cdiac AS d
        RIGHT JOIN
            countries_view AS c ON c.id = d.id_country
        WHERE
            (d.year_start = 1960 OR d.year_start = 1965 OR d.year_start = 2003 OR d.year_start = 2007 )
        ORDER BY 1',

        'SELECT
            DISTINCT ''y_'' || year_start AS year
        FROM
            co2_total_cdiac AS d
        WHERE
            (d.year_start = 1960 OR d.year_start = 1965 OR d.year_start = 2003 OR d.year_start = 2007 )
        ORDER BY 1'
    )
    AS ct(name varchar, y_1960 numeric, y_1965 numeric, y_2003 numeric, y_2007 numeric)


Now, I get an error message:

ERROR:  invalid input syntax for integer: "SELECT
            DISTINCT 'y_' || year_start AS year
        FROM
            co2_total_cdiac AS d
        WHERE
            (d.year_start = 1960 OR d.year_start = 1965 OR d.year_start = 2003 OR d.year_start = 2007 )
        ORDER BY 1"
LINE 15:   'SELECT
           ^


I need to convert the integer values for the years into column names, i.e. "1965" into "y_1965". How do I achieve this
then?

Thanks for any help!

Stef


Re: Problem with Crosstab (Concatenate Problem)

From
Joe Conway
Date:
On 11/01/2010 06:24 AM, Stefan Schwarzer wrote:
> I need to convert the integer values for the years into column names, i.e. "1965" into "y_1965". How do I achieve
thisthen? 

Try something like:

create table foo (
name text,
year_start int,
value float8);

insert into foo values('a',2010,1.23),('b',2011,2.34);

SELECT * FROM
    crosstab(
    'SELECT name, year_start, value FROM foo ORDER BY 1',
    'SELECT DISTINCT year_start FROM foo'
    )
AS ct(name varchar, y_2010 float8, y_2011 float8);

 name | y_2010 | y_2011
------+--------+--------
 a    |        |   1.23
 b    |   2.34 |
(2 rows)


HTH,

Joe

--
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support


Attachment

Re: Problem with Crosstab (Concatenate Problem)

From
Stefan Schwarzer
Date:
>> I need to convert the integer values for the years into column names, i.e. "1965" into "y_1965". How do I achieve
thisthen? 
>
> Try something like:
>
> create table foo (
> name text,
> year_start int,
> value float8);
>
> insert into foo values('a',2010,1.23),('b',2011,2.34);
>
> SELECT * FROM
>    crosstab(
>    'SELECT name, year_start, value FROM foo ORDER BY 1',
>    'SELECT DISTINCT year_start FROM foo'
>    )
> AS ct(name varchar, y_2010 float8, y_2011 float8);
>
> name | y_2010 | y_2011
> ------+--------+--------
> a    |        |   1.23
> b    |   2.34 |
> (2 rows)

Hi Joe. Thanks a lot for the suggestions. Tried it out, but same error message:

ERROR:  invalid input syntax for integer: "SELECT DISTINCT year_start FROM foo"
LINE 4:    'SELECT DISTINCT year_start FROM foo'
           ^

Did this work for you? Then this would indeed be strange.

I wonder if the second crosstab SQL must have the same column names as the final output or not ("2010" vs. "y_2010").

Anyone can help me out? Thanks a lot for any tips!

Stef

Re: Problem with Crosstab (Concatenate Problem)

From
Joseph Conway
Date:
On 11/1/10 11:54 PM, Stefan Schwarzer wrote:
>> SELECT * FROM
>>    crosstab(
>>    'SELECT name, year_start, value FROM foo ORDER BY 1',
>>    'SELECT DISTINCT year_start FROM foo'
>>    )
>> AS ct(name varchar, y_2010 float8, y_2011 float8);

> Hi Joe. Thanks a lot for the suggestions. Tried it out, but same error message:
>
> ERROR:  invalid input syntax for integer: "SELECT DISTINCT year_start FROM foo"
> LINE 4:    'SELECT DISTINCT year_start FROM foo'
>            ^
>
> Did this work for you? Then this would indeed be strange.
>
> I wonder if the second crosstab SQL must have the same column names as the final output or not ("2010" vs. "y_2010").
>
> Anyone can help me out? Thanks a lot for any tips!

What version of PostgreSQL are you running? The error seems to indicate
that you don't have the crosstab(text,text) form of the function.

In psql do:
contrib_regression=# \df crosstab
                          List of functions
 Schema |   Name   | Result data type | Argument data types |  Type
--------+----------+------------------+---------------------+--------
 public | crosstab | SETOF record     | text                | normal
 public | crosstab | SETOF record     | text, integer       | normal
 public | crosstab | SETOF record     | text, text          | normal
(3 rows)

Does it look like this?

Joe

--
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support


Re: Problem with Crosstab (Concatenate Problem)

From
Stefan Schwarzer
Date:
> What version of PostgreSQL are you running? The error seems to indicate
> that you don't have the crosstab(text,text) form of the function.
>
> In psql do:
> contrib_regression=# \df crosstab
>                          List of functions
> Schema |   Name   | Result data type | Argument data types |  Type
> --------+----------+------------------+---------------------+--------
> public | crosstab | SETOF record     | text                | normal
> public | crosstab | SETOF record     | text, integer       | normal
> public | crosstab | SETOF record     | text, text          | normal
> (3 rows)
>
> Does it look like this?

Ha, that's a thing! Indeed, it looks like this in my case:


 Schema |   Name   | Result data type | Argument data types |  Type
--------+----------+------------------+---------------------+--------
 public | crosstab | SETOF record     | text, integer       | normal
(1 row)


I am running version 8.4.3, on Mac with Kingchaos libraries. Installed the contrib/crosstab myself afterwards.

What can I do now to include the other functions too? Or is this only with newer versions of Crosstab?

Thanks a lot for your help!

Stef


Re: Problem with Crosstab (Concatenate Problem)

From
Stefan Schwarzer
Date:
>> What version of PostgreSQL are you running? The error seems to indicate
>> that you don't have the crosstab(text,text) form of the function.
>>
>> In psql do:
>> contrib_regression=# \df crosstab
>>                         List of functions
>> Schema |   Name   | Result data type | Argument data types |  Type
>> --------+----------+------------------+---------------------+--------
>> public | crosstab | SETOF record     | text                | normal
>> public | crosstab | SETOF record     | text, integer       | normal
>> public | crosstab | SETOF record     | text, text          | normal
>> (3 rows)
>>
>> Does it look like this?
>
> Ha, that's a thing! Indeed, it looks like this in my case:
>
>
> Schema |   Name   | Result data type | Argument data types |  Type
> --------+----------+------------------+---------------------+--------
> public | crosstab | SETOF record     | text, integer       | normal
> (1 row)
>
>
> I am running version 8.4.3, on Mac with Kingchaos libraries. Installed the contrib/crosstab myself afterwards.
>
> What can I do now to include the other functions too? Or is this only with newer versions of Crosstab?

Ok, got it re-compiled and re-inserted, and now they are there, these functions. Thanks a lot for your help!