Maximum number of tables per database and slowness
What is the current maximum number of tables per database? Also, does
having more tables slow down performance in any way?
Benjamin Arai wrote:
What is the current maximum number of tables per database? Also, does
having more tables slow down performance in any way?
For most cases, the answer is no. However, once you get near six-figure
table counts, pg_catalog ends up being pretty massive. The problem is
that the query planner must check pg_catalog for every query to see what
indexes are available, what the statistics and value distributions are,
etc., in order to build the optimal plan. At some point, a really large
pg_catalog can begin to bog down your system.
See the performance list for a recent discussion about this very issue.
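If you want a rough sense of how big the catalog has gotten, the system
catalogs themselves will tell you; something like this should work (plain
SQL against pg_class, the queries are only illustrative):

    SELECT count(*) FROM pg_class WHERE relkind = 'r';  -- how many tables

    SELECT relname, relpages  -- largest catalog tables, in 8kB pages
    FROM pg_class
    WHERE relnamespace = (SELECT oid FROM pg_namespace
                          WHERE nspname = 'pg_catalog')
    ORDER BY relpages DESC LIMIT 5;

(relpages is only refreshed by VACUUM/ANALYZE, so treat it as an estimate.)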
My issue actually stems from the fact that I cannot run the large weekly
updates fast enough to meet a weekend window for the following work
week. I am currently using a machine with RAID 1, 4GB RAM, and dual
Opterons. I could go to RAID 0+1, but the performance increase is only
about 20% according to the benchmarks I have read. Any suggestions would
be greatly appreciated.
On Sat, Dec 10, 2005 at 03:22:47PM -0800, Benjamin Arai wrote:
My issue actually stems from the fact that I cannot run the large weekly
updates fast enough to meet a weekend window for the following work
week. I am currently using a machine with RAID 1, 4GB RAM, and dual
Opterons. I could go to RAID 0+1, but the performance increase is only
about 20% according to the benchmarks I have read. Any suggestions would
be greatly appreciated.
Since you have given us no information to work with, you're not going to
get any specific recommendations either. If you can tell us the things
you do most commonly and are most performance sensitive (exact
commands, timings, versions, and settings, please), we may be able to
help.
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
tool for doing 5% of the work and then sitting around waiting for someone
else to do the other 95% so you can sue them.
To be more specific, there are two types of commands that are run on
the system. There are application commands that do many different kinds
of joins and so on, but for the most part they are fast enough to meet
user expectations. On the other hand, there is a weekly update (this is
the problem) that updates all of the modified records for a bunch of
financial data, such as closing prices. For the most part the records
are of the form name, date, value. The update currently takes almost
two days. It does deletions, insertions, and updates depending on what
has happened since the previous week.

For the most part the updates are simple one-liners. I currently commit
in large batches to increase performance, but it still takes a while, as
stated above. From evaluating the computer's performance during an
update, the system is thrashing both memory and disk. I am currently
using PostgreSQL 8.0.3.

Example command: UPDATE data SET value = z WHERE name = x AND date = y;
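In other words, each batch looks roughly like the following (the tickers
and values here are made up):

    BEGIN;
    UPDATE data SET value = 101.25 WHERE name = 'AAPL' AND date = '2005-12-09';
    UPDATE data SET value = 27.17  WHERE name = 'MSFT' AND date = '2005-12-09';
    -- ... thousands more one-liners ...
    DELETE FROM data WHERE name = 'DLST' AND date = '2005-12-09';
    INSERT INTO data (name, date, value) VALUES ('NEWC', '2005-12-09', 14.02);
    COMMIT;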
Martijn van Oosterhout wrote:
On Sat, Dec 10, 2005 at 03:22:47PM -0800, Benjamin Arai wrote:
My issue actually stems from the fact that I cannot run the large weekly
updates fast enough to meet a weekend window for the following work
week. I am currently using a machine with RAID 1, 4GB RAM, and dual
Opterons. I could go to RAID 0+1, but the performance increase is only
about 20% according to the benchmarks I have read. Any suggestions would
be greatly appreciated.

Since you have given us no information to work with, you're not going to
get any specific recommendations either. If you can tell us the things
you do most commonly and are most performance sensitive (exact
commands, timings, versions, and settings, please), we may be able to
help.
On 12/10/05, Benjamin Arai <barai@cs.ucr.edu> wrote:
To be more specific, there are two types of commands that are run on
the system. There are application commands that do many different kinds
of joins and so on, but for the most part they are fast enough to meet
user expectations. On the other hand, there is a weekly update (this is
the problem) that updates all of the modified records for a bunch of
financial data, such as closing prices. For the most part the records
are of the form name, date, value. The update currently takes almost
two days. It does deletions, insertions, and updates depending on what
has happened since the previous week.

For the most part the updates are simple one-liners. I currently commit
in large batches to increase performance, but it still takes a while, as
stated above. From evaluating the computer's performance during an
update, the system is thrashing both memory and disk. I am currently
using PostgreSQL 8.0.3.

Example command: UPDATE data SET value = z WHERE name = x AND date = y;
Try using VACUUM or VACUUM FULL after those weekly updates...
--
regards,
Jaime Casanova
(DBA: DataBase Aniquilator ;)
On Saturday 10 December 2005 19:28, Jaime Casanova wrote:
On 12/10/05, Benjamin Arai <barai@cs.ucr.edu> wrote:
To be more specific, there are two types of commands that are run on
the system. There are application commands that do many different kinds
of joins and so on, but for the most part they are fast enough to meet
user expectations. On the other hand, there is a weekly update (this is
the problem) that updates all of the modified records for a bunch of
financial data, such as closing prices. For the most part the records
are of the form name, date, value. The update currently takes almost
two days. It does deletions, insertions, and updates depending on what
has happened since the previous week.

For the most part the updates are simple one-liners. I currently commit
in large batches to increase performance, but it still takes a while, as
stated above. From evaluating the computer's performance during an
update, the system is thrashing both memory and disk. I am currently
using PostgreSQL 8.0.3.

Example command: UPDATE data SET value = z WHERE name = x AND date = y;
Try using VACUUM or VACUUM FULL after those weekly updates...
Once you've updated a majority of the rows in a large table, if you're
still doing further work on that table that involves complex
conditionals, you'll probably notice a slowdown, so you might even think
about doing a vacuum during the updates if you can swing it.
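Something along these lines, assuming the table is called data as in the
example upthread:

    VACUUM ANALYZE data;  -- partway through the weekly run, and again when it finishes

    VACUUM FULL data;     -- only if you need the disk space back; it takes an exclusive lock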
--
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
Benjamin Arai wrote on Saturday, December 10, 2005 3:37 PM:
... On the other hand, there is a weekly update (this is the
problem) that updates all of the modified records for a bunch of
financial data, such as closing prices. For the most part the records
are of the form name, date, value. The update currently takes almost
two days. It does deletions, insertions, and updates depending on what
has happened since the previous week.

For the most part the updates are simple one-liners. I currently commit
in large batches to increase performance, but it still takes a while, as
stated above. From evaluating the computer's performance during an
update, the system is thrashing both memory and disk.
I experimented with batch size and found that large batches (thousands or
tens of thousands) slowed things down in our situation, while using a
batch size of around 100 or so sped things up tremendously.
Of course, YMMV ...
-Roger
William Yu <wyu@talisys.com> writes:
Benjamin Arai wrote:
What is the current maximum number of tables per database? Also, does
having more tables slow down performance in any way?
For most cases, the answer is no. However, once you get near six-figure
table counts, pg_catalog ends up being pretty massive.
You also have to think about the performance implications of having tens
of thousands of files in your database directory. While some newer
filesystems aren't fazed by that particularly, a lot of 'em bog down on
lookups when there are more than a few thousand entries in a directory.
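Every table, index, and toast table has at least one file in that
directory (more once a relation passes 1GB), so you can estimate the
entry count with something like:

    SELECT count(*) FROM pg_class WHERE relkind IN ('r', 'i', 't');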
regards, tom lane
Benjamin Arai wrote:
For the most part the updates are simple one-liners. I currently commit
in large batches to increase performance, but it still takes a while, as
stated above. From evaluating the computer's performance during an
update, the system is thrashing both memory and disk. I am currently
using PostgreSQL 8.0.3.

Example command: UPDATE data SET value = z WHERE name = x AND date = y;
Before you start throwing the baby out with the bathwater by totally
revamping your DB architecture, try some simple debugging first to see
why these queries take so long. Use EXPLAIN ANALYZE, test
vacuuming/analyzing mid-update, and fiddle with postgresql.conf
parameters (the WAL/commit settings especially). Try committing with
different numbers of statements per transaction; the optimal number
won't be the same across all development tools.
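For example, on one of the one-liners (illustrative values again):

    EXPLAIN ANALYZE
    UPDATE data SET value = 101.25 WHERE name = 'AAPL' AND date = '2005-12-09';

If that shows a sequential scan over the whole table, an index on
(name, date) is the first thing to look at.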
My own experience is that periodic vacuuming and analyzing are very much
needed for batches of small update commands. For our batch processing,
autovacuum plus commit batches of 1K-10K statements did the trick in
keeping performance up.
Please start new threads to ask unrelated questions, rather than replying
to an existing thread. Replying to an existing thread makes the archives
less useful, and may keep people from seeing your question.
On Sat, Dec 10, 2005 at 15:37:01 -0800,
Benjamin Arai <barai@cs.ucr.edu> wrote:
To be more specific, there are two types of commands that are run on
the system. There are application commands that do many different kinds
of joins and so on, but for the most part they are fast enough to meet
user expectations. On the other hand, there is a weekly update (this is
the problem) that updates all of the modified records for a bunch of
financial data, such as closing prices. For the most part the records
are of the form name, date, value. The update currently takes almost
two days. It does deletions, insertions, and updates depending on what
has happened since the previous week.

For the most part the updates are simple one-liners. I currently commit
in large batches to increase performance, but it still takes a while, as
stated above. From evaluating the computer's performance during an
update, the system is thrashing both memory and disk. I am currently
using PostgreSQL 8.0.3.

Example command: UPDATE data SET value = z WHERE name = x AND date = y;
Another possibility is that foreign key checks may be slowing things down.
You may need to create some indexes on referencing tables if you are doing
deletes (or updates which change the key) in referenced tables.
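For example, if some other table (a hypothetical positions table, say)
references data(name, date), then every delete on data has to look for
matching rows in positions, and without an index each such check is a
sequential scan:

    -- hypothetical referencing table, for illustration only
    CREATE INDEX positions_name_date_idx ON positions (name, date);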
Finding out more about which queries are slow and what plans they are using
is a good first step.
On Dec 10, 2005, at 6:37 PM, Benjamin Arai wrote:
For the most part the updates are simple one-liners. I currently
commit in large batches to increase performance, but it still takes a
while, as stated above. From evaluating the computer's performance
during an update, the system is thrashing both memory and disk. I
am currently using PostgreSQL 8.0.3.
Then buy faster disks. My current favorite is to use U320 15kRPM
disks on a dual-channel RAID controller, with half the disks on one
channel and half on the other, mirroring them across channels and then
striping across the mirrors (i.e., RAID 10).

I use no fewer than 6 disks (RAID 10) for data and 2 for pg_log in a
RAID 1.
Hello,

may I ask you some questions.

What is the performance difference between U320 15kRPM and U320 10kRPM?
Does your RAID controller have some memory (e.g. 128 MB or 256 MB) and
something like a battery-backed write cache (like an HP DL380 server)?
Do you use Intel or Opteron CPUs?
regards,
-Franz
-----Original Message-----
From: Vivek Khera [mailto:vivek@khera.org]
Sent: Monday, December 12, 2005 23:15
To: PG-General General
Subject: Re: [GENERAL] Performance large tables.
On Dec 10, 2005, at 6:37 PM, Benjamin Arai wrote:
For the most part the updates are simple one-liners. I currently
commit in large batches to increase performance, but it still takes a
while, as stated above. From evaluating the computer's performance
during an update, the system is thrashing both memory and disk. I
am currently using PostgreSQL 8.0.3.

Then buy faster disks. My current favorite is to use U320 15kRPM
disks on a dual-channel RAID controller, with half the disks on one
channel and half on the other, mirroring them across channels and then
striping across the mirrors (i.e., RAID 10).

I use no fewer than 6 disks (RAID 10) for data and 2 for pg_log in a
RAID 1.
What kind of performance boost do you get from using RAID 10? I am
trying to do a little cost analysis.
Benjamin Arai
barai@cs.ucr.edu
benjamin@cs.ucr.edu
http://www.benjaminarai.com
On Dec 13, 2005, at 2:49 AM, Franz.Rasper@izb.de wrote:
What is the performance difference between U320 15kRPM and U320 10kRPM?
Does your RAID controller have some memory (e.g. 128 MB or 256 MB) and
something like a battery-backed write cache (like an HP DL380 server)?
Do you use Intel or Opteron CPUs?
The 15k drives have higher sustained throughput, so theoretically they
would be faster for sequential scans of data. I have no hard numbers
about this, though. See my thread on choosing between them from last
Thursday.
As for the RAID controller, I've been using the LSI MegaRAID 320-2x
lately. I like it a lot. I configure half the disks on one channel and
half on the other, mirror the pairs across the channels, and stripe
across the mirrors (RAID 10). I *always* get battery backup for the
controllers. No point not to.
As for Intel vs. Opteron: Opteron hands down. The Intel Xeon EM64T chips
are adequate for low-end use, but for really pushing the bits back
and forth the Opterons are the top of the heap in performance.
On Dec 13, 2005, at 3:50 AM, Benjamin Arai wrote:
What kind of performance boost do you get from using RAID 10? I am
trying to do a little cost analysis.
For small amounts of data you probably won't notice anything. Once
you get into the tens of GB, you'll notice an improvement when you have
lots of scattered reads, as they get parallelized well.
Vivek Khera <vivek@khera.org> writes:
On Dec 13, 2005, at 2:49 AM, Franz.Rasper@izb.de wrote:
What is the performance difference between U320 15kRPM and U320 10kRPM?
Does your RAID controller have some memory (e.g. 128 MB or 256 MB) and
something like a battery-backed write cache (like an HP DL380 server)?
Do you use Intel or Opteron CPUs?

The 15k drives have higher sustained throughput, so theoretically they
would be faster for sequential scans of data. I have no hard numbers
about this, though. See my thread on choosing between them from last
Thursday.
Actually the 15k drives have only moderately higher throughput. The
top-of-the-line 15k Maxtor has a maximum throughput of 98MB/s, while my
3-year-old 7200rpm drive can get over 50MB/s. Newer 7200rpm drives would
be better, but they don't seem to include throughput in their specs.

While that's not bad, the difference in seek time and rotational latency
is the main advantage of a faster drive. The seek time of a 7200rpm
drive is about 8-9ms and the rotational latency about 4ms. The seek time
of a top-of-the-line 15kRPM drive is about 3ms and the rotational
latency 2ms.

In short, while the throughput is less than doubled, the speed for
random access reads is almost tripled.
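Working the numbers out: at 7200rpm, ~8.5ms seek + ~4.2ms rotational
latency is ~12.7ms per random access, or ~79 accesses/s; at 15k,
~3ms + ~2ms = ~5ms, or ~200 accesses/s. That's a factor of about 2.5,
which is where "almost tripled" comes from.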
--
greg