Problem with Crosstab - Allocating value to wrong column
Hi there,
I have a database with statistical tables which look like this:
id | year_start | value
3 1960 736.1
3 1961 723.3
....
4 1960 123.4
Years can cover 40 years for each of the countries (ID) and each (variable) table .
Now, if for a given year there is no value, expressed not in form of NoData, but as missing line in the table, than I have a problem with this SQL:
SELECT
*
FROM
crosstab( 'SELECT
COALESCE(c.name, '''') AS name,
d.year_start AS year,
d.value AS value
FROM
co2_total_cdiac AS d
RIGHT JOIN
countries_view AS c ON c.id = d.id_country
WHERE
(d.year_start = 1960 OR d.year_start = 2007 )
ORDER BY
1,2;', 3) AS ct (name varchar, y_1960 numeric, y_2007 numeric)
For one of the countries, I have a value for 2007, but not for 1960. When using only the inner query, than I see one line: Andorra - 2007 - 539
But when running the whole SQL, the value for year 2007 get's allocated to the year 1960. The table looks as follows:
name | y_1960 | y_2007
Andorra 539 NULL
(Not sure if the copy/pasted part displays correct:
name y_1960 y_2007
Afghanistan
415
716
Albania
2026
4246
Algeria
6166
140234
Andorra
539
NULL
)
Can anyone give me a hint in which direction to search for a solution?
Thanks a lot!
Stefan
On 10/19/2010 03:07 AM, Stefan Schwarzer wrote:
For one of the countries, I have a value for 2007, but not for 1960.
When using only the inner query, than I see one line: Andorra - 2007
- 539 But when running the whole SQL, the value for year 2007 get's
allocated to the year 1960. The table looks as follows:name | y_1960 | y_2007
Andorra | 539 | NULL
That is documented behavior. See:
http://www.postgresql.org/docs/8.4/interactive/tablefunc.html
F.33.1.2. crosstab(text)
...
The crosstab function produces one output row for each consecutive
group of input rows with the same row_name value. It fills the output
value columns, left to right, with the value fields from these rows.
If there are fewer rows in a group than there are output value
columns, the extra output columns are filled with nulls; if there are
more rows, the extra input rows are skipped.
You probably want the other form of crosstab
F.33.1.4. crosstab(text, text)
The main limitation of the single-parameter form of crosstab is that
it treats all values in a group alike, inserting each value into the
first available column. If you want the value columns to correspond to
specific categories of data, and some groups might not have data for
some of the categories, that doesn't work well. The two-parameter form
of crosstab handles this case by providing an explicit list of the
categories corresponding to the output columns.
HTH,
Joe
--
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support
For one of the countries, I have a value for 2007, but not for 1960.
When using only the inner query, than I see one line: Andorra - 2007
- 539 But when running the whole SQL, the value for year 2007 get's
allocated to the year 1960. The table looks as follows:name | y_1960 | y_2007
Andorra | 539 | NULLThat is documented behavior. See:
http://www.postgresql.org/docs/8.4/interactive/tablefunc.htmlYou probably want the other form of crosstab
F.33.1.4. crosstab(text, text)
Thanks a lot for the help. Indeed, that should be the one which should do it. Unfortunately, I ran into an error message:
I have this query:
SELECT * FROM
crosstab(
'SELECT
c.name AS name,
d.year_start AS year,
d.value AS value
FROM
co2_total_cdiac AS d
RIGHT JOIN
countries_view AS c ON c.id = d.id_country
WHERE
(d.year_start = 1960 OR d.year_start = 1965 OR d.year_start = 2003 OR d.year_start = 2007 )
ORDER BY 1',
'SELECT
DISTINCT ''y_'' || year_start AS year
FROM
co2_total_cdiac AS d
WHERE
(d.year_start = 1960 OR d.year_start = 1965 OR d.year_start = 2003 OR d.year_start = 2007 )
ORDER BY 1'
)
AS ct(name varchar, y_1960 numeric, y_1965 numeric, y_2003 numeric, y_2007 numeric)
Now, I get an error message:
ERROR: invalid input syntax for integer: "SELECT
DISTINCT 'y_' || year_start AS year
FROM
co2_total_cdiac AS d
WHERE
(d.year_start = 1960 OR d.year_start = 1965 OR d.year_start = 2003 OR d.year_start = 2007 )
ORDER BY 1"
LINE 15: 'SELECT
^
I need to convert the integer values for the years into column names, i.e. "1965" into "y_1965". How do I achieve this then?
Thanks for any help!
Stef
For one of the countries, I have a value for 2007, but not for 1960.
When using only the inner query, than I see one line: Andorra - 2007
- 539 But when running the whole SQL, the value for year 2007 get's
allocated to the year 1960. The table looks as follows:name | y_1960 | y_2007
Andorra | 539 | NULLThat is documented behavior. See:
http://www.postgresql.org/docs/8.4/interactive/tablefunc.htmlYou probably want the other form of crosstab
F.33.1.4. crosstab(text, text)
Thanks a lot for the help. Indeed, that should be the one which should do it. Unfortunately, I ran into an error message:
I have this query:
SELECT * FROM
crosstab(
'SELECT
c.name AS name,
d.year_start AS year,
d.value AS value
FROM
co2_total_cdiac AS d
RIGHT JOIN
countries_view AS c ON c.id = d.id_country
WHERE
(d.year_start = 1960 OR d.year_start = 1965 OR d.year_start = 2003 OR d.year_start = 2007 )
ORDER BY 1',
'SELECT
DISTINCT ''y_'' || year_start AS year
FROM
co2_total_cdiac AS d
WHERE
(d.year_start = 1960 OR d.year_start = 1965 OR d.year_start = 2003 OR d.year_start = 2007 )
ORDER BY 1'
)
AS ct(name varchar, y_1960 numeric, y_1965 numeric, y_2003 numeric, y_2007 numeric)
Now, I get an error message:
ERROR: invalid input syntax for integer: "SELECT
DISTINCT 'y_' || year_start AS year
FROM
co2_total_cdiac AS d
WHERE
(d.year_start = 1960 OR d.year_start = 1965 OR d.year_start = 2003 OR d.year_start = 2007 )
ORDER BY 1"
LINE 15: 'SELECT
^
I need to convert the integer values for the years into column names, i.e. "1965" into "y_1965". How do I achieve this then?
Thanks for any help!
Stef
On 11/01/2010 06:24 AM, Stefan Schwarzer wrote:
I need to convert the integer values for the years into column names, i.e. "1965" into "y_1965". How do I achieve this then?
Try something like:
create table foo (
name text,
year_start int,
value float8);
insert into foo values('a',2010,1.23),('b',2011,2.34);
SELECT * FROM
crosstab(
'SELECT name, year_start, value FROM foo ORDER BY 1',
'SELECT DISTINCT year_start FROM foo'
)
AS ct(name varchar, y_2010 float8, y_2011 float8);
name | y_2010 | y_2011
------+--------+--------
a | | 1.23
b | 2.34 |
(2 rows)
HTH,
Joe
--
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support
I need to convert the integer values for the years into column names, i.e. "1965" into "y_1965". How do I achieve this then?
Try something like:
create table foo (
name text,
year_start int,
value float8);insert into foo values('a',2010,1.23),('b',2011,2.34);
SELECT * FROM
crosstab(
'SELECT name, year_start, value FROM foo ORDER BY 1',
'SELECT DISTINCT year_start FROM foo'
)
AS ct(name varchar, y_2010 float8, y_2011 float8);name | y_2010 | y_2011
------+--------+--------
a | | 1.23
b | 2.34 |
(2 rows)
Hi Joe. Thanks a lot for the suggestions. Tried it out, but same error message:
ERROR: invalid input syntax for integer: "SELECT DISTINCT year_start FROM foo"
LINE 4: 'SELECT DISTINCT year_start FROM foo'
^
Did this work for you? Then this would indeed be strange.
I wonder if the second crosstab SQL must have the same column names as the final output or not ("2010" vs. "y_2010").
Anyone can help me out? Thanks a lot for any tips!
Stef
On 11/1/10 11:54 PM, Stefan Schwarzer wrote:
SELECT * FROM
crosstab(
'SELECT name, year_start, value FROM foo ORDER BY 1',
'SELECT DISTINCT year_start FROM foo'
)
AS ct(name varchar, y_2010 float8, y_2011 float8);
Hi Joe. Thanks a lot for the suggestions. Tried it out, but same error message:
ERROR: invalid input syntax for integer: "SELECT DISTINCT year_start FROM foo"
LINE 4: 'SELECT DISTINCT year_start FROM foo'
^Did this work for you? Then this would indeed be strange.
I wonder if the second crosstab SQL must have the same column names as the final output or not ("2010" vs. "y_2010").
Anyone can help me out? Thanks a lot for any tips!
What version of PostgreSQL are you running? The error seems to indicate
that you don't have the crosstab(text,text) form of the function.
In psql do:
contrib_regression=# \df crosstab
List of functions
Schema | Name | Result data type | Argument data types | Type
--------+----------+------------------+---------------------+--------
public | crosstab | SETOF record | text | normal
public | crosstab | SETOF record | text, integer | normal
public | crosstab | SETOF record | text, text | normal
(3 rows)
Does it look like this?
Joe
--
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support
What version of PostgreSQL are you running? The error seems to indicate
that you don't have the crosstab(text,text) form of the function.In psql do:
contrib_regression=# \df crosstab
List of functions
Schema | Name | Result data type | Argument data types | Type
--------+----------+------------------+---------------------+--------
public | crosstab | SETOF record | text | normal
public | crosstab | SETOF record | text, integer | normal
public | crosstab | SETOF record | text, text | normal
(3 rows)Does it look like this?
Ha, that's a thing! Indeed, it looks like this in my case:
Schema | Name | Result data type | Argument data types | Type
--------+----------+------------------+---------------------+--------
public | crosstab | SETOF record | text, integer | normal
(1 row)
I am running version 8.4.3, on Mac with Kingchaos libraries. Installed the contrib/crosstab myself afterwards.
What can I do now to include the other functions too? Or is this only with newer versions of Crosstab?
Thanks a lot for your help!
Stef
What version of PostgreSQL are you running? The error seems to indicate
that you don't have the crosstab(text,text) form of the function.In psql do:
contrib_regression=# \df crosstab
List of functions
Schema | Name | Result data type | Argument data types | Type
--------+----------+------------------+---------------------+--------
public | crosstab | SETOF record | text | normal
public | crosstab | SETOF record | text, integer | normal
public | crosstab | SETOF record | text, text | normal
(3 rows)Does it look like this?
Ha, that's a thing! Indeed, it looks like this in my case:
Schema | Name | Result data type | Argument data types | Type
--------+----------+------------------+---------------------+--------
public | crosstab | SETOF record | text, integer | normal
(1 row)I am running version 8.4.3, on Mac with Kingchaos libraries. Installed the contrib/crosstab myself afterwards.
What can I do now to include the other functions too? Or is this only with newer versions of Crosstab?
Ok, got it re-compiled and re-inserted, and now they are there, these functions. Thanks a lot for your help!