best practice in archiving CDR data

Started by Juan Backsonabout 16 years ago5 messagesgeneral
Jump to latest
#1Juan Backson
juanbackson@gmail.com

Hi,

I am using Postgres to store CDR data for voip switches. The data size
quickly goes about a few TBs.

What I would like to do is to be able to regularly archive the oldest data
so only the most recent 6 months of data is available.

All those old data will be stored in a format that can be retrieved back
either into DB table or flat files.

Does anyone know how should I go about doing that? Is there any existing
tool that can already do that?

thanks,
jb

#2A. Kretschmer
andreas.kretschmer@schollglas.com
In reply to: Juan Backson (#1)
Re: best practice in archiving CDR data

In response to Juan Backson :

Hi,

I am using Postgres to store CDR data for voip switches. �The data size quickly
goes about a few TBs. �

What I would like to do is to be able to regularly archive the oldest data so
only the most recent 6 months of data is available. �

All those old data will be stored in a format that can be retrieved back either
into DB table or flat files.

Does anyone know how should I go about doing that? �Is there any existing tool
that can already do that?

Sounds like table partitioning: create, for instance, a table for each
month and DROP old tables after 6 month or so.

http://www.postgresql.org/docs/current/static/ddl-partitioning.html

Regards, Andreas
--
Andreas Kretschmer
Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header)
GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431 2EB0 389D 1DC2 3172 0C99

#3Edgardo Portal
egportal2002@yahoo.com
In reply to: Juan Backson (#1)
Re: best practice in archiving CDR data

On 2010-03-29, Juan Backson <juanbackson@gmail.com> wrote:

--0016e64ccb10fb54050482f07924
Content-Type: text/plain; charset=ISO-8859-1

Hi,

I am using Postgres to store CDR data for voip switches. The data size
quickly goes about a few TBs.

What I would like to do is to be able to regularly archive the oldest data
so only the most recent 6 months of data is available.

All those old data will be stored in a format that can be retrieved back
either into DB table or flat files.

Does anyone know how should I go about doing that? Is there any existing
tool that can already do that?

thanks,
jb

FWIW, I partition by ISO week, use INSERT RULEs to route CDRs to the correct
partition (keeping about 3 partitions "open" to new CDRs at any one time),
use pg_dump to archive partition tables to off-line storage, and
DROP TABLE to keep the main DBs to about 40 weeks of data. I used
to use monthly partitioning, but the file sizes got a bit awkward
to deal with.

When I need to restore old CDRs (e.g. to service a subpoena) I
use pg_restore to load the needed CDRs to a throwaway database
and process as necessary.

#4Juan Backson
juanbackson@gmail.com
In reply to: A. Kretschmer (#2)
Re: best practice in archiving CDR data

Hi

Instead of dropping the table, I would like to archive the old table into a
format that can be read and retrieved.

Can I db_dump on each child table?

What is the best way to do it? db_dump and make the data into csv and then
tar.gz it or backup it up into a pg archived format?

thanks,
jb

On Mon, Mar 29, 2010 at 9:33 PM, A. Kretschmer <
andreas.kretschmer@schollglas.com> wrote:

Show quoted text

In response to Juan Backson :

Hi,

I am using Postgres to store CDR data for voip switches. The data size

quickly

goes about a few TBs.

What I would like to do is to be able to regularly archive the oldest

data so

only the most recent 6 months of data is available.

All those old data will be stored in a format that can be retrieved back

either

into DB table or flat files.

Does anyone know how should I go about doing that? Is there any existing

tool

that can already do that?

Sounds like table partitioning: create, for instance, a table for each
month and DROP old tables after 6 month or so.

http://www.postgresql.org/docs/current/static/ddl-partitioning.html

Regards, Andreas
--
Andreas Kretschmer
Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header)
GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431 2EB0 389D 1DC2 3172 0C99

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5David Fetter
david@fetter.org
In reply to: Edgardo Portal (#3)
Re: best practice in archiving CDR data

On Mon, Mar 29, 2010 at 02:08:23PM +0000, Edgardo Portal wrote:

On 2010-03-29, Juan Backson <juanbackson@gmail.com> wrote:

--0016e64ccb10fb54050482f07924
Content-Type: text/plain; charset=ISO-8859-1

Hi,

I am using Postgres to store CDR data for voip switches. The data
size quickly goes about a few TBs.

What I would like to do is to be able to regularly archive the
oldest data so only the most recent 6 months of data is available.

All those old data will be stored in a format that can be
retrieved back either into DB table or flat files.

Does anyone know how should I go about doing that? Is there any
existing tool that can already do that?

FWIW, I partition by ISO week, use INSERT RULEs to route CDRs

Just generally, triggers are much better than RULEs for this kind of
thing. The underlying functions can be made quite efficient.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate