unnest and string_to_array on two columns

Started by Michael Grahamover 14 years ago2 messagesgeneral
Jump to latest
#1Michael Graham
mgraham@bloxx.com

Hi all,

I'm trying to migrate an old (and sucky) schema to a new one and I'm
having some difficulties coming up with a sane select.

I have basically id, a, and b where a and b contain a list of flags like

id | a | b |
--------------------
1 | abc | abcdef |

and what to convert this to multiple ids with single flags, like:

id | a | b |
------------------
1 | a | a |
1 | b | b |
1 | c | c |
1 | NULL | d |
1 | NULL | e |
1 | NULL | f |

My first attempt was

SELECT id, unnest(string_to_array(a,NULL)),
unnest(string_to_array(b,NULL)) FROM foo;

But this causes the shorter string to be repeated until it is the same
length as the shorter string. In the end I have managed to get the
behaviour that I want but the select is horrible:

SELECT COALESCE(aa.id,bb.id) AS id,
aa.unnest AS aaaaa,
bb.unnest AS bbbbb FROM
(
SELECT *, row_number() OVER() FROM
(
SELECT id,unnest(string_to_array(a,NULL)) FROM foo
) AS a
) AS aa
FULL JOIN
(
SELECT *, row_number() OVER() FROM
(
SELECT id,unnest(string_to_array(b,NULL)) FROM foo
) AS b
) AS bb
ON aa.row_number=bb.row_number AND aa.id=bb.id;

So I was wondering if anyone had any better solutions.

Thanks,
--
Michael Graham <mgraham@bloxx.com>

#2David G. Johnston
david.g.johnston@gmail.com
In reply to: Michael Graham (#1)
Re: unnest and string_to_array on two columns

-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Michael Graham
Sent: Tuesday, October 25, 2011 4:36 AM
To: pgsql-general@postgresql.org
Subject: [GENERAL] unnest and string_to_array on two columns

Hi all,

I'm trying to migrate an old (and sucky) schema to a new one and I'm having some difficulties coming up with a sane select.

I have basically id, a, and b where a and b contain a list of flags like

id | a | b |
--------------------
1 | abc | abcdef |

and what to convert this to multiple ids with single flags, like:

id | a | b |
------------------
1 | a | a |
1 | b | b |
1 | c | c |
1 | NULL | d |
1 | NULL | e |
1 | NULL | f |

My first attempt was

SELECT id, unnest(string_to_array(a,NULL)),
unnest(string_to_array(b,NULL)) FROM foo;

But this causes the shorter string to be repeated until it is the same length as the shorter string. In the end I have managed to get the behaviour that I want but the select is horrible:

SELECT COALESCE(aa.id,bb.id) AS id,
aa.unnest AS aaaaa,
bb.unnest AS bbbbb FROM
(
SELECT *, row_number() OVER() FROM
(
SELECT id,unnest(string_to_array(a,NULL)) FROM foo
) AS a
) AS aa
FULL JOIN
(
SELECT *, row_number() OVER() FROM
(
SELECT id,unnest(string_to_array(b,NULL)) FROM foo
) AS b
) AS bb
ON aa.row_number=bb.row_number AND aa.id=bb.id;

So I was wondering if anyone had any better solutions.

Thanks,
--
Michael Graham <mgraham@bloxx.com>

----------------- /Original Message -----------------
Same solution but using CTEs.

WITH val_src AS (
SELECT * FROM (VALUES (1, 'a,b,c','a,b,c,d,e')) vals (id, a, b)
), a_expanded AS (
SELECT *, ROW_NUMBER() OVER () AS row_index FROM (
SELECT id, unnest(string_to_array(a, ',')) AS a_item FROM val_src
) a_src
), b_expanded aS (
SELECT *, ROW_NUMBER() OVER () AS row_index FROM (
SELECT id, unnest(string_to_array(b, ',')) AS b_item FROM val_src
) b_src
)
SELECT *
FROM b_expanded NATURAL FULL OUTER JOIN a_expanded;

The only other thought would be to limit your first query to [pseudocode] "WHERE ROW_NUMBER() OVER () <= MIN(b.length, a.length)" and the UNION the remainder of A and B where the row number is "> MIN(b.length, a.length)". This seems worse though.

David J.