aggregate and order by

Started by Matthew Dennisover 18 years ago8 messagesgeneral

mdennis@merfer.net

over 18 years ago

I want to create an aggregate that will give the average velocity (sum of
distance traveled / sum of elapsed time) from position and timestamps.

example:

create table data(position integer, pos_time timestamp, trip_id integer);

insert into data values(1, "time x", 1);
insert into data values(2, "time x plus 1 second", 1);
insert into data values(4, "time x plus 2 second", 1);
insert into data values(1, "time y", 2);
insert into data values(4, "time y plus 1 second", 2);
insert into data values(16, "time y plus 3 second", 2);
etc, etc, etc...

select trip_id, avg_vel(position, pos_time) from data group by trip_id;

Row one to row two has an elapsed time of ("time x plus 1 second" - "time
x") and a difference in position of (2 - 1) and from row two to row three
there is a difference in position of (4 -2) and a elapsed time of ("time x
plus 1 second" - "time x plus 2 seconds") so for trip_id we get ((2-1) +
(4-2)) / (1 + 1).

Row 4 to row 5 has a difference in position of (4-1) and a elapsed time of
("time y plus 1 second" - "time y") and from row 5 to row 6 there is a
position difference of (16-4) and time difference of ("time y plus 3
seconds" - "time y plus 1 second") so for trip_id 2 we get ((4-1) + (16-4))
/ (1 + 2).

Keep in mind that I can't just take the difference between the start and end
of the trip because I might move from 1 to 10 to 1. If I just took the end
points (1-1) the velocity would be zero because it looks like I didn't move.

So I could write an aggregate that remembers the last row and on each new
row, does the diff and keeps the running sums and then when it's done, the
final function does the division and returns the average velocity. However,
this only works if the rows come into the aggregate function in the correct
order (otherwise I might count the total distance and/or elapsed time wrong
because both are calculated from the difference of the previous row). So,
my question is if I can have PostgreSQL honor order by clauses such as:

select trip_id, avg_vel(position, pos_time) from (select position, pos_time,
trip_id from data order by pos_time) sorted_data

Would this in fact guarantee that the rows are passed into the aggregate in
the order specified?

Other suggestions/discussions/questions/etc are welcome.

Bruce Momjian

bruce@momjian.us

over 18 years ago

In reply to: Matthew Dennis (#1)

Re: aggregate and order by

"Matthew Dennis" <mdennis@merfer.net> writes:

So, my question is if I can have PostgreSQL honor order by clauses such as:

select trip_id, avg_vel(position, pos_time)
from (select position, pos_time, trip_id from data order by pos_time) sorted_data

Would this in fact guarantee that the rows are passed into the aggregate in
the order specified?

Yes. AFAIK this isn't covered by the spec but it works in Postgres and we know
there are people depending on it so we wouldn't break it without a big notice
and presumably some replacement.

Other suggestions/discussions/questions/etc are welcome.

Good luck, this looks pretty painful to get right. Keep in mind you have to
keep all your state in the state data. If you keep a temporary variable
outside that data then your function won't work if it's called twice in the
same query like "select avg_vel(position, pos_time), avg_vel(position2,
pos_time2) from ..."

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's 24x7 Postgres support!

Glyn Astill

glynastill@yahoo.co.uk

over 18 years ago

In reply to: Bruce Momjian (#2)

Replication Monitoring

Hi people,

I intend to set up two slave servers, one using WAL shipping and one
using Slony I.

Are there any good tools, or scripts that'll help us check that both
replication methods are working?

I know theres Nagios - but what does this actually allow us to
monitor?

Also if I want to make backups from the slaves to save master
downtime / load what are my options?

Thanks

__________________________________________________________
Sent from Yahoo! - the World's favourite mail http://uk.mail.yahoo.com

Bruce Momjian

bruce@momjian.us

over 18 years ago

In reply to: Glyn Astill (#3)

Re: Replication Monitoring

"Glyn Astill" <glynastill@yahoo.co.uk> writes:

Hi people,

I intend to set up two slave servers, one using WAL shipping and one
using Slony I.

This has nothing to do with "aggregate and ordering" the subject of the
message to which you're replying. You're more likely to see responses if you
post in a new thread.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!

Glyn Astill

glynastill@yahoo.co.uk

over 18 years ago

In reply to: Bruce Momjian (#4)

Re: Replication Monitoring

How did that happen? The subject is totally different, so is the
body.

This is shit.

--- Gregory Stark <stark@enterprisedb.com> wrote:

"Glyn Astill" <glynastill@yahoo.co.uk> writes:

Hi people,

I intend to set up two slave servers, one using WAL shipping and

one

using Slony I.

This has nothing to do with "aggregate and ordering" the subject of
the
message to which you're replying. You're more likely to see
responses if you
post in a new thread.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Get trained by Bruce Momjian - ask me about EnterpriseDB's
PostgreSQL training!

---------------------------(end of
broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org/

Glyn Astill

__________________________________________________________
Sent from Yahoo! - the World's favourite mail http://uk.mail.yahoo.com

Alvaro Herrera

alvherre@2ndquadrant.com

over 18 years ago

In reply to: Glyn Astill (#5)

Re: Replication Monitoring

Glyn Astill wrote:

How did that happen? The subject is totally different, so is the
body.

It has an "In-Reply-To:" and possibly "References:" header which relates
it to the other thread.

The solution is simple. Don't reply to an existing message when you
want to post a new thread. Compose a new one instead.

--
Alvaro Herrera Developer, http://www.PostgreSQL.org/
"Aprender sin pensar es inï¿½til; pensar sin aprender, peligroso" (Confucio)

Glyn Astill

glynastill@yahoo.co.uk

over 18 years ago

In reply to: Alvaro Herrera (#6)

Re: Replication Monitoring

Ah thanks, thats what I must have done. Never happened on other lists
so I assumed it'd be okay. My Bad.
--- Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

Glyn Astill wrote:

How did that happen? The subject is totally different, so is the
body.

It has an "In-Reply-To:" and possibly "References:" header which
relates
it to the other thread.

The solution is simple. Don't reply to an existing message when
you
want to post a new thread. Compose a new one instead.

--
Alvaro Herrera Developer,
http://www.PostgreSQL.org/
"Aprender sin pensar es inï¿½til; pensar sin aprender, peligroso"
(Confucio)

Glyn Astill

__________________________________________________________
Sent from Yahoo! - the World's favourite mail http://uk.mail.yahoo.com

Sam Mason

sam@samason.me.uk

over 18 years ago

In reply to: Matthew Dennis (#1)

Re: aggregate and order by

On Thu, Dec 06, 2007 at 02:12:48PM -0600, Matthew Dennis wrote:

I want to create an aggregate that will give the average velocity (sum of
distance traveled / sum of elapsed time) from position and timestamps.

How do you want to handle noisy data? If you want to handle it in any
reasonable way you'd need to some sort of regression; i.e. you'd need
to consider all the relavant data and then try and minimise the total
error somehow. I've always relied on external tools to do this sort of
thing, but if you want to do it in the database you may be able to get
somewhere with pl/r.

Sam