Histogram generator
Hi,
I have a table containing events with timestamps. I would like to generate a histogram of the number of each type of event for each half-hour period from 8:00 am to 6:00 pm. Are there any tools that will help me do this?
Thanks,
Patrick
Attachments:
-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
owner@postgresql.org] On Behalf Of Patrick May
Sent: Tuesday, July 27, 2010 3:59 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] Histogram generatorHi,
I have a table containing events with timestamps. I would like
to generate a histogram of the number of each type of event for each
half-hour period from 8:00 am to 6:00 pm. Are there any tools that
will help me do this?
GROUP BY is essentially a histogram generator.
The age() function will give you an interval.
I am not sure if you want to combine half hour periods from different days with the same time or not (IOW are 8AM Tuesday and 8AM Wednesday supposed to be grouped together or not?)
I guess that if you are more specific about exactly what you want you can get a better answer. Your requirement is a little bit vague or ambiguous.
On Jul 27, 2010, at 7:12 PM, Dann Corbit wrote:
I have a table containing events with timestamps. I would like
to generate a histogram of the number of each type of event for each
half-hour period from 8:00 am to 6:00 pm. Are there any tools that
will help me do this?GROUP BY is essentially a histogram generator.
The age() function will give you an interval.I am not sure if you want to combine half hour periods from different days with the same time or not (IOW are 8AM Tuesday and 8AM Wednesday supposed to be grouped together or not?)
I guess that if you are more specific about exactly what you want you can get a better answer. Your requirement is a little bit vague or ambiguous.
I don't mind using Excel to generate the actual graphic. Ideally I'd get output something like this:
date start end event count
---------- -------- -------- -------- -----
2010-07-27 08:00:00 08:29:59 EVENT_1 20
2010-07-27 08:00:00 08:29:59 EVENT_2 15
2010-07-27 08:30:00 08:59:59 EVENT_1 10
2010-07-27 08:30:00 08:59:59 EVENT_3 5
I'm quite familiar with SQL, but I'm not sure how to generate the half hour intervals without hard coding them.
Thanks,
Patrick
Attachments:
On Jul 27, 2010, at 6:07 PM, Patrick May wrote:
On Jul 27, 2010, at 7:12 PM, Dann Corbit wrote:
I have a table containing events with timestamps. I would like
to generate a histogram of the number of each type of event for each
half-hour period from 8:00 am to 6:00 pm. Are there any tools that
will help me do this?GROUP BY is essentially a histogram generator.
The age() function will give you an interval.I am not sure if you want to combine half hour periods from different days with the same time or not (IOW are 8AM Tuesday and 8AM Wednesday supposed to be grouped together or not?)
I guess that if you are more specific about exactly what you want you can get a better answer. Your requirement is a little bit vague or ambiguous.
I don't mind using Excel to generate the actual graphic. Ideally I'd get output something like this:
date start end event count
---------- -------- -------- -------- -----
2010-07-27 08:00:00 08:29:59 EVENT_1 20
2010-07-27 08:00:00 08:29:59 EVENT_2 15
2010-07-27 08:30:00 08:59:59 EVENT_1 10
2010-07-27 08:30:00 08:59:59 EVENT_3 5I'm quite familiar with SQL, but I'm not sure how to generate the half hour intervals without hard coding them.
There's probably a better way, but something like this might work:
select date_trunc('hour', foo) + interval '30 minutes' * floor(extract(minute from foo) / 30) as start, event, count(*) from bar group by 1, 2 order by 1 asc;
Cheers,
Steve
On Jul 27, 2010, at 9:21 PM, Steve Atkins wrote:
On Jul 27, 2010, at 6:07 PM, Patrick May wrote:
On Jul 27, 2010, at 7:12 PM, Dann Corbit wrote:
I have a table containing events with timestamps. I would like
to generate a histogram of the number of each type of event for each
half-hour period from 8:00 am to 6:00 pm. Are there any tools that
will help me do this?GROUP BY is essentially a histogram generator.
The age() function will give you an interval.I am not sure if you want to combine half hour periods from different days with the same time or not (IOW are 8AM Tuesday and 8AM Wednesday supposed to be grouped together or not?)
I guess that if you are more specific about exactly what you want you can get a better answer. Your requirement is a little bit vague or ambiguous.
I don't mind using Excel to generate the actual graphic. Ideally I'd get output something like this:
date start end event count
---------- -------- -------- -------- -----
2010-07-27 08:00:00 08:29:59 EVENT_1 20
2010-07-27 08:00:00 08:29:59 EVENT_2 15
2010-07-27 08:30:00 08:59:59 EVENT_1 10
2010-07-27 08:30:00 08:59:59 EVENT_3 5I'm quite familiar with SQL, but I'm not sure how to generate the half hour intervals without hard coding them.
There's probably a better way, but something like this might work:
select date_trunc('hour', foo) + interval '30 minutes' * floor(extract(minute from foo) / 30) as start, event, count(*) from bar group by 1, 2 order by 1 asc;
Thanks! It looks like interval is what I need to play with.
Regards,
Patrick
Attachments:
On Tue, Jul 27, 2010 at 09:25:05PM -0400, Patrick May wrote:
On Jul 27, 2010, at 9:21 PM, Steve Atkins wrote:
select date_trunc('hour', foo) + interval '30 minutes' * floor(extract(minute from foo) / 30) as start, event, count(*) from bar group by 1, 2 order by 1 asc;
Thanks! It looks like interval is what I need to play with.
Another useful tool to use is the classic unix "seconds since epoch".
You could turn the key expression from above into:
timestamp 'epoch' + interval '1 second' * floor(date_part('epoch',foo) / (30*60)) * (30*60)
I'd probably go with Steve's version here, it's a bit more obvious
what's going on. Also note, that if you don't really care about what
the specific groups are, just that you have a set of evenly divided
30minute periods you don't need to convert back to a date, so could just
use:
floor(date_part('epoch',foo) / (30*60))
One final note, if you're dealing with lots of data and the above
expression is slow, you could delay converting back to a date until
"after" the grouping, i.e:
SELECT timestamp 'epoch' + interval '1 second' * floor(date_part('epoch',foo) / (30*60)) * (30*60) AS t, COUNT(*)
FROM data
GROUP BY floor(date_part('epoch',foo) / (30*60));
This will save PG from converting back to a date for every row when it's
going to chuck most of them away anyway.
Hope that gives you some more ideas!
--
Sam http://samason.me.uk/