Postgres (selection of thesis topic)

Started by Harpreet Dhaliwal · almost 19 years ago · 7 messages · general
#1Harpreet Dhaliwal
harpreet.dhaliwal01@gmail.com

Hi,
I'm fairly new to PostgreSQL. The project I'm currently working on parses
emails, stores the parsed components in a PostgreSQL DB, and fires triggers
on certain inserts. Each trigger opens a socket connection to a Unix tools
server, which runs tools like whois and traceroute. The Unix tools server
then opens an ODBC connection back to the same Postgres database and stores
the results of running those tools.

In this regard, I have to start working on some thesis topic related to the
postgres database that we are using in the project. It can be in conjunction
with email parsing on unix tools but the theme of the thesis topic should
revolve around postgres database.

I have done a lot of homework on this and could think of something like "bulk
data storage in email parsing and how vacuuming it would increase
performance", because I think this vacuum DB concept is not there in other
RDBMSs. This is just a petty topic, but I was thinking of something along
these lines.

I have no clue what other options or topics I have to start writing
my thesis on.
Any kind of help would be highly appreciated in this regard.

Thanks,
~Harpreet

#2Richard Huxton
dev@archonet.com
In reply to: Harpreet Dhaliwal (#1)
Re: Postgres (selection of thesis topic)

Harpreet Dhaliwal wrote:

In this regard, I have to start working on some thesis topic related to the
postgres database that we are using in the project. It can be in
conjunction
with email parsing on unix tools but the theme of the thesis topic should
revolve around postgres database.

You probably want to talk to the people behind this:
http://www.archiveopteryx.org/

--
Richard Huxton
Archonet Ltd

#3Alexander Staubo
alex@purefiction.net
In reply to: Harpreet Dhaliwal (#1)
Re: Postgres (selection of thesis topic)

On 5/2/07, Harpreet Dhaliwal <harpreet.dhaliwal01@gmail.com> wrote:

I'm kind of new to postgresql and the project that I'm working on currently
deals with parsing emails, storing parsed components in postgresql DB and
fire triggers
on certain inserts that opens socket connection with a unix tools server,

Are you sure it is a good idea to do this processing synchronously?
What happens if there is a network problem? It sounds like an
inefficient and inflexible design.
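
One way to avoid the synchronous call is to have the trigger only record that work is needed, and let a separate worker do the network part later. The following is a minimal sketch of that idea, not the poster's design: SQLite stands in for PostgreSQL so that it runs with only the Python standard library, and the table names, trigger name, and `fake_lookup` function are all hypothetical.

```python
import sqlite3

# SQLite stands in for PostgreSQL so the sketch runs anywhere; all table,
# trigger, and function names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emails (id INTEGER PRIMARY KEY, sender_host TEXT NOT NULL);
CREATE TABLE lookup_jobs (
    id INTEGER PRIMARY KEY,
    email_id INTEGER NOT NULL REFERENCES emails(id),
    status TEXT NOT NULL DEFAULT 'pending',
    result TEXT
);
-- The trigger only records that work is needed; it never touches the network.
CREATE TRIGGER enqueue_lookup AFTER INSERT ON emails
BEGIN
    INSERT INTO lookup_jobs (email_id) VALUES (NEW.id);
END;
""")


def run_pending_jobs(conn, lookup):
    """Run `lookup` for each pending job; failed jobs stay pending for retry."""
    rows = conn.execute(
        "SELECT j.id, e.sender_host FROM lookup_jobs j "
        "JOIN emails e ON e.id = j.email_id WHERE j.status = 'pending'"
    ).fetchall()
    done = 0
    for job_id, host in rows:
        try:
            result = lookup(host)
        except OSError:
            continue  # network trouble: leave the job pending
        conn.execute(
            "UPDATE lookup_jobs SET status = 'done', result = ? WHERE id = ?",
            (result, job_id),
        )
        done += 1
    conn.commit()
    return done


def fake_lookup(host):
    """Placeholder for whois/traceroute; fails for one host to mimic an outage."""
    if host == "unreachable.example":
        raise OSError("network down")
    return "ok:" + host


conn.execute("INSERT INTO emails (sender_host) VALUES ('mail.example')")
conn.execute("INSERT INTO emails (sender_host) VALUES ('unreachable.example')")
conn.commit()
completed = run_pending_jobs(conn, fake_lookup)
pending = conn.execute(
    "SELECT COUNT(*) FROM lookup_jobs WHERE status = 'pending'"
).fetchone()[0]
```

With this layout the insert of the email succeeds even when the tools server is unreachable, and the failed lookup simply stays in the queue for a later run.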

I have done a lot of homework on this and could think of something like "bulk
of data storage in email parsing and how vacuuming it would increase the
performance" because I think this vacuum DB concept is not there in other
RDBMS.

SQLite also requires vacuuming, as do other databases based on
MVCC-like designs, although some (e.g., Oracle with its undo segments,
iirc) do their housekeeping behind the scenes.
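
The SQLite case is easy to try, since it ships with Python. This is a rough sketch rather than a benchmark: the file name and table are hypothetical, and auto_vacuum is switched off explicitly so the behaviour does not depend on how the library was built.

```python
import os
import sqlite3
import tempfile

# Hypothetical table and file names; SQLite is used only because it ships
# with Python.
path = os.path.join(tempfile.mkdtemp(), "mail.db")
conn = sqlite3.connect(path)
# Make sure freed pages are not returned to the OS automatically.
conn.execute("PRAGMA auto_vacuum = NONE")
conn.execute("CREATE TABLE parts (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO parts (body) VALUES (?)",
    [("x" * 1000,) for _ in range(2000)],
)
conn.commit()
size_full = os.path.getsize(path)

# Deleting rows frees pages inside the file but does not shrink it.
conn.execute("DELETE FROM parts")
conn.commit()
size_after_delete = os.path.getsize(path)

# VACUUM rewrites the database file and drops the free pages.
conn.execute("VACUUM")
size_after_vacuum = os.path.getsize(path)
conn.close()

print(size_full, size_after_delete, size_after_vacuum)
```

The three printed sizes show the file staying large after the delete and shrinking only once VACUUM has run.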

Alexander.

#4Scott Marlowe
smarlowe@g2switchworks.com
In reply to: Alexander Staubo (#3)
Re: Postgres (selection of thesis topic)

On Wed, 2007-05-02 at 08:00, Alexander Staubo wrote:

On 5/2/07, Harpreet Dhaliwal <harpreet.dhaliwal01@gmail.com> wrote:

I'm kind of new to postgresql and the project that I'm working on currently
deals with parsing emails, storing parsed components in postgresql DB and
fire triggers
on certain inserts that opens socket connection with a unix tools server,

Are you sure it is a good idea to do this processing synchronously?
What happens if there is a network problem? It sounds like an
inefficient and inflexible design.

I have done a lot of homework on this and could think of something like "bulk
of data storage in email parsing and how vacuuming it would increase the
performance" because I think this vacuum DB concept is not there in other
RDBMS.

SQLite also requires vacuuming, as do other databases based on
MVCC-like designs, although some (e.g., Oracle with its undo segments,
iirc) do their housekeeping behind the scenes.

We're running Oracle 9 here, and it's even worse than vacuuming. Once a
table grows, it stays grown until you rebuild it (you use the move
command, you just don't move it), and if it has filled up its tablespace,
you have to extend it to get room to do that. On top of that, you can't
move a partitioned table.

I'd say Oracle9 is about 10 times worse than PostgreSQL (any version)
for the amount of manual maintenance it takes to keep it happy.

#5Martin Gainty
mgainty@hotmail.com
In reply to: Harpreet Dhaliwal (#1)
Re: Postgres (selection of thesis topic)

Good Morning Scott-

The following URL contains the directive to move a partition in a
partitioned table:
http://www.csee.umbc.edu/help/oracle8/server.815/a67772/partiti.htm
You will then need to rebuild the indices to point to the new partition.

Is there some manner of automatically rebuilding the indices when moving
partitioned tables under Postgres?

Thanks,
Martin


#6Scott Marlowe
smarlowe@g2switchworks.com
In reply to: Martin Gainty (#5)
Re: Postgres (selection of thesis topic)

On Wed, 2007-05-02 at 10:52, Martin Gainty wrote:

Good Morning Scott-

The following URL contains the directive to move a partition in a
partitioned table:
http://www.csee.umbc.edu/help/oracle8/server.815/a67772/partiti.htm
You will then need to rebuild the indices to point to the new partition.

Yeah, I've seen that, and used it even.

Is there some manner of automatically rebuilding the indices when moving
partitioned tables under Postgres?

You generally don't need to move partitioned tables or rebuild indexes in
PostgreSQL; that was my main point. If you need to reclaim space because
you forgot to vacuum regularly, then you can do a vacuumdb -fz and a
reindexdb... And no mucked up indexes like you get when you move a
table in Oracle and forget to rebuild its indexes. Why automatic index
rebuilding wasn't a part of Oracle 9 I'll never know.
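
As a rough sketch of that recovery step, the two commands can be wrapped in a small script. The database name `maildb` and both helper functions are hypothetical; the long options are the spelled-out forms of `-f`, `-z`, and `-d`. The script only prints the commands unless `dry_run` is turned off.

```python
import shutil
import subprocess


def maintenance_commands(dbname):
    """Return the argv lists for a full vacuum with analyze, then a reindex."""
    return [
        ["vacuumdb", "--full", "--analyze", "--dbname", dbname],  # same as -f -z
        ["reindexdb", "--dbname", dbname],
    ]


def run_maintenance(dbname, dry_run=True):
    """Print each command; execute only when dry_run is False and the tool exists."""
    for argv in maintenance_commands(dbname):
        print(" ".join(argv))
        if not dry_run and shutil.which(argv[0]):
            subprocess.run(argv, check=True)


commands = maintenance_commands("maildb")  # "maildb" is a hypothetical database name
run_maintenance("maildb")
```

A full vacuum takes an exclusive lock on each table while it runs, so this is a one-off repair for a quiet period, not a substitute for regular plain vacuuming.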

It just seems that all the things someone decided to write a script /
program for in PostgreSQL, and that got included in the pgsql/bin directory
or at least contrib or pgFoundry, are spread across various web pages for
Oracle. Which seems almost backwards. I'd kind of expect the open
source project to have things all over the web, mildly disorganized, and
the commercial project to have it all come with the package.

Keep in mind, I'm not slagging Oracle, really. It's an impressive
database. It just feels crufty, like there are tons of things you just
"have to know" to make it work. More than I'd expected when I first
started using it.

#7Alexander Staubo
alex@purefiction.net
In reply to: Scott Marlowe (#4)
Re: Postgres (selection of thesis topic)

On 5/2/07, Scott Marlowe <smarlowe@g2switchworks.com> wrote:

We're running Oracle 9 here, and it's even worse than vacuuming. Once a
table grows, it stays grown until you rebuild it (you use the move
command, you just don't move it), and if it has filled up its tablespace,

It's been a while since I touched the Beast, but does this unused
space significantly impact performance in the way it does with
PostgreSQL?

Alexander.