performance issues with DBI module when data too big
<html><div style='background-color:'><DIV>
<DIV>
<DIV></DIV>
<DIV></DIV>
<P>Hi all,</P>
<P>I have developped a perl script to populate a database with two tables: sessions and actions.</P>
<P>the table actions contains the following columns: session_id, url, timestamp. The column session_id references to the table sessions. </P>
<P>the table sessions contains the following columns: session_id, remote_ip, phone_type, phone_number. The column session_id is serial.</P>
<P>The lines of the table actions which are part of the same session have the same session_id.</P>
<P>There are then more lines in the table actions than in the table session.</P>
<P><BR>To fill the two tables, I first need to know if the session already exists for a certain phone_type, a certain remote_ip and a certain phone_number:</P>
<P>SELECT session_id FROM sessions WHERE (phone_number = ?) AND (remote_ip = ?) AND (phone_type = ?) ORDER BY session_id;</P>
<P>I also need to apply a criteria to know if I have to add a new entry in the table sessions or not:</P>
<P>SELECT (max(timestamp) + ?)<? FROM actions WHERE (session_id = ?); </P>
<P> </P>
<P>If the session already exists I add a line in the table actions with a INSERT</P>
<P>If the session doesn't exist or if the criteria is true, I add a line in the table sessions with an INSERT, retrieve the session_id of the line just added with the following request</P>
<P>SELECT session_id FROM sessions where (msisdn=?) AND (remote_ip=?) AND (user_agent=?) ORDER BY session_id DESC LIMIT 1</P>
<P>and the add with a INSERT a line in the table actions.</P>
<P>I have put indexes on sessions(session_id), sessions(msisdn), actions(session_id).</P>
<P>My problem is that populating my database is slower when the data gets bigger and the performance falls dramatically. I thought that is would be improve with my indexes but the problem still persists.</P>
<P>Could you please give me some clues that could solve this issue.</P>
<P>Thanks.</P>
<P><BR> <BR><BR>Nicolas Nolst </P>
<DIV></DIV>
<DIV></DIV><IMG src="http://www.3dflags/World/Gif/belgium_gs.gif">
<DIV></DIV></DIV></DIV></div><br clear=all><hr>Join the world�s largest e-mail service with MSN Hotmail. <a href='http://g.msn.com/1HM300901/158'>Click Here</a><br></html>
Nicolas Nolst wrote:
[text/html is unsupported, treating like TEXT/PLAIN]
[...]
<P>My problem is that populating my database is slower when the data gets bigger and the performance falls dramatically. I thought that is would be improve with my indexes but the problem still persists.</P>
<P>Could you please give me some clues that could solve this issue.</P>
Is your database frequently vacuumed and analyzed (and do you
frequently read frequently asked questions)?
<P>Thanks.</P>
<P><BR> <BR><BR>Nicolas Nolst </P>
And just for the record, do you prefer answers to HTML mail
in Postscript, PDF, DVI or nothing at all?
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #
Are you using the {AutoCommit => 0} argument to DBI->connect()? If not,
do so, and then add a $dbh->commit; line when you're done with the
inserts. Should help a lot.
----------------------------------------------------------------------
Andrew J Perrin - http://www.unc.edu/~aperrin
Assistant Professor of Sociology, U of North Carolina, Chapel Hill
clists@perrin.socsci.unc.edu * andrew_perrin (at) unc.edu
On Mon, 3 Jun 2002, Nicolas Nolst wrote:
Show quoted text
Hi all,
I have developped a perl script to populate a database with two tables:
sessions and actions.the table actions contains the following columns: session_id, url,
timestamp. The column session_id references to the table sessions.the table sessions contains the following columns: session_id, remote_ip,
phone_type, phone_number. The column session_id is serial.The lines of the table actions which are part of the same session have
the same session_id.There are then more lines in the table actions than in the table session.
To fill the two tables, I first need to know if the session already
exists for a certain phone_type, a certain remote_ip and a certain
phone_number:SELECT session_id FROM sessions WHERE (phone_number = ?) AND (remote_ip
= ?) AND (phone_type = ?) ORDER BY session_id;I also need to apply a criteria to know if I have to add a new entry in
the table sessions or not:SELECT (max(timestamp) + ?)<? FROM actions WHERE (session_id = ?);
If the session already exists I add a line in the table actions with a
INSERTIf the session doesn't exist or if the criteria is true, I add a line in
the table sessions with an INSERT, retrieve the session_id of the line
just added with the following requestSELECT session_id FROM sessions where (msisdn=?) AND (remote_ip=?) AND
(user_agent=?) ORDER BY session_id DESC LIMIT 1and the add with a INSERT a line in the table actions.
I have put indexes on sessions(session_id), sessions(msisdn),
actions(session_id).My problem is that populating my database is slower when the data gets
bigger and the performance falls dramatically. I thought that is would be
improve with my indexes but the problem still persists.Could you please give me some clues that could solve this issue.
Thanks.
Nicolas Nolst
[belgium_gs.gif]
________________________________________________________________________________
Join the worlds largest e-mail service with MSN Hotmail. Click Here