Access a window's frame_end row from a window function

Started by Clodoaldoabout 13 years ago3 messagesgeneral
Jump to latest
#1Clodoaldo
clodoaldo.pinto.neto@gmail.com

Suppose there is the need to get the average of a value v over a 6 hours
time window starting 7 hours before the current row's time.

create table t (ts timestamp, v integer);
insert into t (ts, v) values
('2013-01-01 00:46', 2),
('2013-01-01 03:54', 4),
('2013-01-01 06:28', 4),
('2013-01-01 11:19', 2),
('2013-01-01 14:44', 1),
('2013-01-01 15:56', 5),
('2013-01-01 18:01', 4),
('2013-01-01 19:40', 0),
('2013-01-01 20:38', 5),
('2013-01-01 21:22', 0);

I can do it with a correlated subquery:

select ts, v,
(
select avg(v)
from t s
where ts between
t.ts - interval '7 hours'
and t.ts - interval '1 hour'
) average
from t
order by ts
;
ts | v | average
---------------------+---+--------------------
2013-01-01 00:46:00 | 2 |
2013-01-01 03:54:00 | 4 | 2.0000000000000000
2013-01-01 06:28:00 | 4 | 3.0000000000000000
2013-01-01 11:19:00 | 2 | 4.0000000000000000
2013-01-01 14:44:00 | 1 | 2.0000000000000000
2013-01-01 15:56:00 | 5 | 1.5000000000000000
2013-01-01 18:01:00 | 4 | 2.6666666666666667
2013-01-01 19:40:00 | 0 | 3.3333333333333333
2013-01-01 20:38:00 | 5 | 3.3333333333333333
2013-01-01 21:22:00 | 0 | 2.5000000000000000

But if I could access a window's frame_end row as a record from a window
function:

select ts,
avg(case when ts between
frame_end.ts - interval '7 hours'
and frame_end.ts - interval '1 hour'
then v else null end
) over(order by ts)
from t
order by ts

I'm naively posting this as I have no idea how complex would it be to add
this feature. Would it perform better than the correlated subquery?

Regards, Clodoaldo

#2Merlin Moncure
mmoncure@gmail.com
In reply to: Clodoaldo (#1)
Re: Access a window's frame_end row from a window function

On Fri, Feb 22, 2013 at 8:26 AM, Clodoaldo Neto
<clodoaldo.pinto.neto@gmail.com> wrote:

Suppose there is the need to get the average of a value v over a 6 hours
time window starting 7 hours before the current row's time.

create table t (ts timestamp, v integer);
insert into t (ts, v) values
('2013-01-01 00:46', 2),
('2013-01-01 03:54', 4),
('2013-01-01 06:28', 4),
('2013-01-01 11:19', 2),
('2013-01-01 14:44', 1),
('2013-01-01 15:56', 5),
('2013-01-01 18:01', 4),
('2013-01-01 19:40', 0),
('2013-01-01 20:38', 5),
('2013-01-01 21:22', 0);

I can do it with a correlated subquery:

select ts, v,
(
select avg(v)
from t s
where ts between
t.ts - interval '7 hours'
and t.ts - interval '1 hour'
) average
from t
order by ts
;
ts | v | average
---------------------+---+--------------------
2013-01-01 00:46:00 | 2 |
2013-01-01 03:54:00 | 4 | 2.0000000000000000
2013-01-01 06:28:00 | 4 | 3.0000000000000000
2013-01-01 11:19:00 | 2 | 4.0000000000000000
2013-01-01 14:44:00 | 1 | 2.0000000000000000
2013-01-01 15:56:00 | 5 | 1.5000000000000000
2013-01-01 18:01:00 | 4 | 2.6666666666666667
2013-01-01 19:40:00 | 0 | 3.3333333333333333
2013-01-01 20:38:00 | 5 | 3.3333333333333333
2013-01-01 21:22:00 | 0 | 2.5000000000000000

But if I could access a window's frame_end row as a record from a window
function:

select ts,
avg(case when ts between
frame_end.ts - interval '7 hours'
and frame_end.ts - interval '1 hour'
then v else null end
) over(order by ts)
from t
order by ts

I'm naively posting this as I have no idea how complex would it be to add
this feature. Would it perform better than the correlated subquery?

Well, correlated subquery is about the bottom of the barrel in
performance terms, so anything would be an improvement.

merlin

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Clodoaldo (#1)
Re: Access a window's frame_end row from a window function

Clodoaldo Neto <clodoaldo.pinto.neto@gmail.com> writes:

Suppose there is the need to get the average of a value v over a 6 hours
time window starting 7 hours before the current row's time.
...
But if I could access a window's frame_end row as a record from a window
function:

select ts,
avg(case when ts between
frame_end.ts - interval '7 hours'
and frame_end.ts - interval '1 hour'
then v else null end
) over(order by ts)
from t
order by ts

I'm naively posting this as I have no idea how complex would it be to add
this feature. Would it perform better than the correlated subquery?

Doubt it. As stated, it'd likely perform worse, since it's not obvious
in this construction that rows outside the desired time window need not
be scanned to compute the avg(). But even if you rearranged the SQL to
avoid that pitfall, I'm not sure how the implementation could look
noticeably different from a correlated subselect. It'd still end up
scanning all the desired rows for each row of the outer query.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general