Grouping logs by ip and time

Started by Alex Thurlowover 14 years ago7 messagesgeneral
Jump to latest
#1Alex Thurlow
alex-reports@blastro.com

Hello all,
I have a table which stores action logs from users. It looks
something like this:
log_type text,
date date,
"time" time without time zone,
ip inet

The log type can be action1, action2, action3, action4, or action5. I
know that each user session will have a max of one of each log and it
will always start with action1. It may not have every action though. I
also know that each session will take no longer than one minute.

What I'd like to do is be able to group these logs by sessions based on
the IP and the time range so I can figure out the time taken between
each action.

I know how to script it, although it's very slow. I was hoping there
was some way to do this in SQL. I'm running Postgresql 8.3.7 on this
machine right now, but if there's something that would make this easier
and doesn't exist there yet, I would be willing to upgrade.

Thanks,
Alex

In reply to: Alex Thurlow (#1)
Re: Grouping logs by ip and time

On 08/11/2011 17:59, Alex Thurlow wrote:

Hello all,
I have a table which stores action logs from users. It looks
something like this:
log_type text,
date date,
"time" time without time zone,
ip inet

[snip]

What I'd like to do is be able to group these logs by sessions based on
the IP and the time range so I can figure out the time taken between
each action.

Would something like this work? -

select ip, max("time") - min("time") as session_duration
from log_table
group by ip;

This doesn't take the date into account - what happens if the session
spans midnight? You can get around this by using a timestamp column
instead of separate date and time.

Ray.

--
Raymond O'Donnell :: Galway :: Ireland
rod@iol.ie

In reply to: Raymond O'Donnell (#2)
Re: Grouping logs by ip and time

On 08/11/2011 18:48, Raymond O'Donnell wrote:

On 08/11/2011 17:59, Alex Thurlow wrote:

Hello all,
I have a table which stores action logs from users. It looks
something like this:
log_type text,
date date,
"time" time without time zone,
ip inet

[snip]

What I'd like to do is be able to group these logs by sessions based on
the IP and the time range so I can figure out the time taken between
each action.

Would something like this work? -

select ip, max("time") - min("time") as session_duration
from log_table
group by ip;

This doesn't take the date into account - what happens if the session
spans midnight? You can get around this by using a timestamp column
instead of separate date and time.

Whoops - just re-read your email, and realised that you're looking for
the time between actions; the above just gives you the total duration of
the session from each IP address.

Ray.

--
Raymond O'Donnell :: Galway :: Ireland
rod@iol.ie

#4John R Pierce
pierce@hogranch.com
In reply to: Raymond O'Donnell (#2)
Re: Grouping logs by ip and time

On 11/08/11 10:48 AM, Raymond O'Donnell wrote:

Would something like this work? -

select ip, max("time") - min("time") as session_duration
from log_table
group by ip;

This doesn't take the date into account - what happens if the session
spans midnight? You can get around this by using a timestamp column
instead of separate date and time.

he said a session always starts with 'action1', and presumably there can
be more than one session per day, so this won't work. the 'end' of a
session is presumably the previous action2|3|4|5 thats prior to the next
action1. I have no idea how you'd code this as a SQL query.

--
john r pierce N 37, W 122
santa cruz ca mid-left coast

#5Ascarabina
ascarabina@gmail.com
In reply to: John R Pierce (#4)
Re: Grouping logs by ip and time

Would something like this work? -

select ip, max("time") - min("time") as session_duration
from log_table
group by ip;

I don't think this is the right way to do. This is based on ip address,
so if
- client connect diffrent times with same ip
- client has sime ip but he made another action on other day.
you will have a wrong results.

You should save also the session id and group by sesion id not ip.
Ex. :
Table
------------------
log_type text,
date date,
"time" time without time zone,
ip inet session_id text -- you can use maybe foreign tables ?

SQL ( Same as Raynold's but groups session ids)
-------------------
select ip, max("time") - min("time") as session_duration
from log_table
group by session_id;

#6Alex Thurlow
alex-reports@blastro.com
In reply to: Ascarabina (#5)
Re: Grouping logs by ip and time

On 11/8/2011 1:00 PM, Ascarabina wrote:

Would something like this work? -

select ip, max("time") - min("time") as session_duration
from log_table
group by ip;

I don't think this is the right way to do. This is based on ip
address, so if
- client connect diffrent times with same ip
- client has sime ip but he made another action on other day.
you will have a wrong results.

You should save also the session id and group by sesion id not ip.
Ex. :
Table
------------------
log_type text,
date date,
"time" time without time zone,
ip inet session_id text -- you can use maybe foreign tables ?

SQL ( Same as Raynold's but groups session ids)
-------------------
select ip, max("time") - min("time") as session_duration
from log_table
group by session_id;

Thanks for the responses guys. I guess I'll need to figure out how to
add a session ID if I'm going to do this.

-Alex

#7Thomas Markus
t.markus@proventis.net
In reply to: Alex Thurlow (#1)
Re: Grouping logs by ip and time

Hi Alex,

in PG9 you can use a query like this:

with a as ( select a.*, rank() over (partition by a.ip,a.date order by
a.log_type, a.time) from log_table a )
select a.*, b.*, b.time-a.time from a join a b on a.ip=b.ip and
a.date=b.date and a.rank+1=b.rank

this orders entry by time grouped by ip and date and selects entries
with there successors. In older versions this is not so easy. It should
work like this:
create temp sequence s;
create temp table a as select a.*, nextval('s') as rank from ( select
a.* from log_table a order by a.ip, a.date, a.time) a;
select a.*, b.*, b.time-a.time from a a join a b on a.ip=b.ip and
a.date=b.date and a.rank+1=b.rank;

Thomas

Am 08.11.2011 18:59, schrieb Alex Thurlow:

Show quoted text

Hello all,
I have a table which stores action logs from users. It looks
something like this:
log_type text,
date date,
"time" time without time zone,
ip inet

The log type can be action1, action2, action3, action4, or action5. I
know that each user session will have a max of one of each log and it
will always start with action1. It may not have every action though.
I also know that each session will take no longer than one minute.

What I'd like to do is be able to group these logs by sessions based
on the IP and the time range so I can figure out the time taken
between each action.

I know how to script it, although it's very slow. I was hoping there
was some way to do this in SQL. I'm running Postgresql 8.3.7 on this
machine right now, but if there's something that would make this
easier and doesn't exist there yet, I would be willing to upgrade.

Thanks,
Alex