lag_until_you_get_something() OVER () window function

Started by Kirk Roybalabout 11 years ago6 messages
#1Kirk Roybal
kirk@webfinish.com

Hi Guys,

I propose a lag (and/or lead) window function that propagates the last
non-null value to the current row.
Here's an example of what I mean by that:

CREATE TABLE lag_test (id serial primary key, natural_key integer,
somebody text);

INSERT INTO lag_test(natural_key, somebody)
VALUES (1, NULL), (1, 'Kirk'), (1, NULL), (2, 'Roybal'), (2, NULL), (2,
NULL);

/*

Creates this data in the table.
id natural_key somebody
-- ----------- --------
1 1 NULL
2 1 Kirk
3 1 NULL
4 2 Roybal
5 2 NULL
6 2 NULL

lag_until_you_get_something(text) function should return this in the
"somebody" column:

id natural_key somebody
-- ----------- --------
1 1 NULL
2 1 Kirk
3 1 Kirk
4 2 Roybal
5 2 Roybal
6 2 Roybal

Notice that row 6 has a value "Roybal", when the last known value was in
row 4. Also, Row 1 did not get a value.
*/

-- Query that gets the right result for limited example data:

CREATE FUNCTION last_elem (text[]) RETURNS text AS $$
SELECT $1[array_upper($1,1)];
$$ LANGUAGE SQL;

SELECT id, natural_key,
last_elem(string_to_array(string_agg(somebody, '|') OVER (ORDER BY
natural_key, id)::text, '|')) lag_hard
FROM lag_test
ORDER BY natural_key, id;

Sorry, I'm not a C-coder, or I'd whip this up myself and submit it.

Thank you for your consideration,

/Kirk

#2Vladimir Sitnikov
sitnikov.vladimir@gmail.com
In reply to: Kirk Roybal (#1)
Re: lag_until_you_get_something() OVER () window function

There is already a patch for that (ignore/respect nulls in lead/lag):
https://commitfest.postgresql.org/action/patch_view?id=1096

--
Vladimir

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Merlin Moncure
mmoncure@gmail.com
In reply to: Kirk Roybal (#1)
Re: lag_until_you_get_something() OVER () window function

On Tue, Oct 28, 2014 at 12:40 PM, Kirk Roybal <kirk@webfinish.com> wrote:

Hi Guys,

I propose a lag (and/or lead) window function that propagates the last
non-null value to the current row.
Here's an example of what I mean by that:

CREATE TABLE lag_test (id serial primary key, natural_key integer, somebody
text);

INSERT INTO lag_test(natural_key, somebody)
VALUES (1, NULL), (1, 'Kirk'), (1, NULL), (2, 'Roybal'), (2, NULL), (2,
NULL);

/*

Creates this data in the table.
id natural_key somebody
-- ----------- --------
1 1 NULL
2 1 Kirk
3 1 NULL
4 2 Roybal
5 2 NULL
6 2 NULL

lag_until_you_get_something(text) function should return this in the
"somebody" column:

id natural_key somebody
-- ----------- --------
1 1 NULL
2 1 Kirk
3 1 Kirk
4 2 Roybal
5 2 Roybal
6 2 Roybal

Notice that row 6 has a value "Roybal", when the last known value was in row
4. Also, Row 1 did not get a value.
*/

-- Query that gets the right result for limited example data:

CREATE FUNCTION last_elem (text[]) RETURNS text AS $$
SELECT $1[array_upper($1,1)];
$$ LANGUAGE SQL;

SELECT id, natural_key,
last_elem(string_to_array(string_agg(somebody, '|') OVER (ORDER BY
natural_key, id)::text, '|')) lag_hard
FROM lag_test
ORDER BY natural_key, id;

Here's a more efficient and cleaner version of same:

CREATE OR REPLACE FUNCTION GapFillInternal(
s anyelement,
v anyelement) RETURNS anyelement AS
$$
BEGIN
RETURN COALESCE(v,s);
END;
$$ LANGUAGE PLPGSQL IMMUTABLE;

CREATE AGGREGATE GapFill(anyelement) (
SFUNC=GapFillInternal,
STYPE=anyelement
);

postgres=# select id, natural_key, gapfill(somebody) OVER (ORDER BY
natural_key, id) from lag_test;
id │ natural_key │ gapfill
────┼─────────────┼─────────
1 │ 1 │
2 │ 1 │ Kirk
3 │ 1 │ Kirk
4 │ 2 │ Roybal
5 │ 2 │ Roybal
6 │ 2 │ Roybal
(6 rows)

merlin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Kirk Roybal
kirk@webfinish.com
In reply to: Vladimir Sitnikov (#2)
Re: lag_until_you_get_something() OVER () window function

This is cleaner and better.

Thanks for the link, I hope to see it in a commitfest some time soon.

/Kirk

On 2014-10-28 16:34, Vladimir Sitnikov wrote:

There is already a patch for that (ignore/respect nulls in lead/lag):
https://commitfest.postgresql.org/action/patch_view?id=1096 [1]

--
Vladimir

Links:
------
[1]: https://commitfest.postgresql.org/action/patch_view?id=1096

#5Kirk Roybal
kirk@webfinish.com
In reply to: Merlin Moncure (#3)
Re: lag_until_you_get_something() OVER () window function

This is a pretty elegant way of getting there.

It also does a better job of respecting the window frame.

I'll use this until this
https://commitfest.postgresql.org/action/patch_view?id=1096 [1]https://commitfest.postgresql.org/action/patch_view?id=1096 shows
up.

Thanks

On 2014-10-28 17:35, Merlin Moncure wrote:

On Tue, Oct 28, 2014 at 12:40 PM, Kirk Roybal <kirk@webfinish.com> wrote:

Hi Guys, I propose a lag (and/or lead) window function that propagates the last non-null value to the current row. Here's an example of what I mean by that: CREATE TABLE lag_test (id serial primary key, natural_key integer, somebody text); INSERT INTO lag_test(natural_key, somebody) VALUES (1, NULL), (1, 'Kirk'), (1, NULL), (2, 'Roybal'), (2, NULL), (2, NULL); /* Creates this data in the table. id natural_key somebody -- ----------- -------- 1 1 NULL 2 1 Kirk 3 1 NULL 4 2 Roybal 5 2 NULL 6 2 NULL lag_until_you_get_something(text) function should return this in the "somebody" column: id natural_key somebody -- ----------- -------- 1 1 NULL 2 1 Kirk 3 1 Kirk 4 2 Roybal 5 2 Roybal 6 2 Roybal Notice that row 6 has a value "Roybal", when the last known value was in row 4. Also, Row 1 did not get a value. */ -- Query that gets the right result for limited example data: CREATE FUNCTION last_elem (text[]) RETURNS text AS $$ SELECT $1[array_upper($1,1)]; $$ LANGUAGE SQL; SELECT id,

natural_key, last_elem(string_to_array(string_agg(somebody, '|') OVER (ORDER BY natural_key, id)::text, '|')) lag_hard FROM lag_test ORDER BY natural_key, id;

Here's a more efficient and cleaner version of same:

CREATE OR REPLACE FUNCTION GapFillInternal(
s anyelement,
v anyelement) RETURNS anyelement AS
$$
BEGIN
RETURN COALESCE(v,s);
END;
$$ LANGUAGE PLPGSQL IMMUTABLE;

CREATE AGGREGATE GapFill(anyelement) (
SFUNC=GapFillInternal,
STYPE=anyelement
);

postgres=# select id, natural_key, gapfill(somebody) OVER (ORDER BY
natural_key, id) from lag_test;
id │ natural_key │ gapfill
────┼─────────────┼─────────
1 │ 1 │
2 │ 1 │ Kirk
3 │ 1 │ Kirk
4 │ 2 │ Roybal
5 │ 2 │ Roybal
6 │ 2 │ Roybal
(6 rows)

merlin

Links:
------
[1]: https://commitfest.postgresql.org/action/patch_view?id=1096

#6Merlin Moncure
mmoncure@gmail.com
In reply to: Kirk Roybal (#5)
Re: lag_until_you_get_something() OVER () window function

On Wed, Oct 29, 2014 at 12:04 PM, Kirk Roybal <kirk@webfinish.com> wrote:

This [custom aggregate gapfill] is a pretty elegant way of getting there.

It also does a better job of respecting the window frame.

I'll use this until this
https://commitfest.postgresql.org/action/patch_view?id=1096 shows up.

Yes. In fact it turns out you can implement all kinds of things
including gaps in standards support by via the combination of windows
functions + custom aggregates. Performance is pretty good but not
great.

merlin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers