indexes and big tables
Hi,
I'm fighting with an index problem. I have read the documentation on
performance tips and on the planner logic that decides whether or not to
use an index, and I'm still stuck. I'm not an SQL guru and I don't know
what to do now. My tables and indexes look like this:
CREATE TABLE counters (
    line         VARCHAR(64) NOT NULL,
    counterfrom  INT8 NOT NULL,
    counterto    INT8 NOT NULL,
    counterstamp TIMESTAMP NOT NULL,
    stamp        TIMESTAMP NOT NULL DEFAULT 'now');

CREATE INDEX counters_line_stamp ON counters (line, counterstamp);
I also have several other tables named static_counters_(hour|day|month).
Why? They exist only for fast summarization:
- counters holds the 5-minute counters for the last hour; after the hourly
  summarization into counters_hour, the rows are moved into static_counters.
- counters_hour holds the hourly sums for the last day; after the daily
  summarization into counters_day, the rows are moved into static_counters_hour.
- counters_day holds the daily sums for the last month; after the monthly
  summarization into counters_month, the rows are moved into static_counters_days.
I'm inserting about 300 rows into the counters table every 5 minutes (fetching
info from routers). Summarization runs every hour with some internal logic, and
the decisions are made from the hourly info. There are about 3 million rows in
the static_counters table, and they cover only the last month; when the next
month begins, we move that old data into tables named counters_YYYY_MM, etc.
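Just to illustrate the flow (the counters_hour column list and the archive step
are simplified assumptions here, not my real code), the hourly roll-up is
roughly:

-- Sum the last hour of 5-minute samples into counters_hour,
-- then archive the raw rows and empty the working table.
INSERT INTO counters_hour
SELECT line,
       SUM(counterfrom) AS sumfrom,
       SUM(counterto)   AS sumto,
       date_trunc('hour', counterstamp) AS hourstamp
  FROM counters
 GROUP BY line, hourstamp;

INSERT INTO static_counters SELECT * FROM counters;
DELETE FROM counters;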
I'm running VACUUM ANALYZE twice a day. Everything works fine, except that I'm
drawing graphs from the static_counters and counters tables. For the first
graph I need roughly the last 33 hours of data, and for the second graph
roughly the last week. I know there is more data in these tables than I need,
but even if I create a table with only the needed values, no indexes are used
either. The select for the graphs looks like this:
netacc=> EXPLAIN (SELECT SUM(counterfrom) AS from, SUM(counterto) AS to,
           floor((985098900 - date_part('epoch', counterstamp)) / 300) AS sequence
           FROM counters WHERE line='absolonll'
           AND date_part('epoch', counterstamp) > 984978900
           GROUP BY sequence, line)
         UNION
         (SELECT SUM(counterfrom) AS from, SUM(counterto) AS to,
           floor((985098900 - date_part('epoch', counterstamp)) / 300) AS sequence
           FROM static_counters WHERE line='absolonll'
           AND date_part('epoch', counterstamp) > 984978900
           GROUP BY sequence, line);
NOTICE:  QUERY PLAN:
Unique  (cost=67518.73..67525.44 rows=89 width=36)
  ->  Sort  (cost=67518.73..67518.73 rows=895 width=36)
        ->  Append  (cost=1860.01..67474.87 rows=895 width=36)
              ->  Aggregate  (cost=1860.01..1870.90 rows=109 width=36)
                    ->  Group  (cost=1860.01..1865.46 rows=1089 width=36)
                          ->  Sort  (cost=1860.01..1860.01 rows=1089 width=36)
                                ->  Seq Scan on counters  (cost=0.00..1805.10 rows=1089 width=36)
              ->  Aggregate  (cost=65525.38..65603.97 rows=786 width=36)
                    ->  Group  (cost=65525.38..65564.67 rows=7858 width=36)
                          ->  Sort  (cost=65525.38..65525.38 rows=7858 width=36)
                                ->  Seq Scan on static_counters  (cost=0.00..65016.95 rows=7858 width=36)

EXPLAIN
netacc=>
... The indexes are used only when there are just a few rows in the table :(
Either way, the result of this select is only about ~105 rows. Now I don't know
what to do, because drawing these two graphs takes about 30 seconds, and that's
too much.
How should I change my solution so the graphs draw faster? Should I split this
table? Make a table for each line? Upgrade the hardware?
I'm running PostgreSQL 7.0.3 on a RedHat 6.2 Linux box. The hardware is a
Duron 700MHz, 384MB RAM, and a SCSI disk. Should I upgrade PostgreSQL? Should I
upgrade the RAM or CPU? I don't know what to do now, and any help will be very
much appreciated.
Thank you very much,
kind regards,
Robert Vojta
--
_
|-| __ Robert Vojta <vojta-at-ipex.cz> -= Oo.oO =-
|=| [Ll] IPEX, s.r.o.
"^" ====`o
netacc=> EXPLAIN (SELECT SUM(counterfrom) AS from, SUM(counterto) AS to,
           floor((985098900 - date_part('epoch', counterstamp)) / 300) AS sequence
           FROM counters WHERE line='absolonll'
           AND date_part('epoch', counterstamp) > 984978900
           GROUP BY sequence, line) ...
I would guess the problem is the restriction on counterstamp, because
written like that, it probably can't use the index.
Try something that avoids the date_part function, e.g.:
AND counterstamp > '2001-07-26 00:00:00.0'
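For instance, the counters half of your query might become something like this
(sketch only; the literal timestamp is just an example, substitute whatever
your 984978900 epoch cutoff corresponds to):

SELECT SUM(counterfrom) AS from, SUM(counterto) AS to,
       floor((985098900 - date_part('epoch', counterstamp)) / 300) AS sequence
  FROM counters
 WHERE line = 'absolonll'
   -- compare the column directly so the (line, counterstamp) index is usable
   AND counterstamp > '2001-07-26 00:00:00.0'
 GROUP BY sequence, line;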
Andreas
I would guess the problem is the restriction on counterstamp, because
written like that, it probably can't use the index.
Try something that avoids the date_part function, e.g.:
AND counterstamp > '2001-07-26 00:00:00.0'
I will try it, but the index is used when there are only a few rows in the
table. When I insert a lot of rows (millions, as in my case), the index isn't
used. I don't know the number of rows that marks the border between using and
not using the index, but I can find out if you want. Going to try your
suggestions ...
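One way to probe that border (assuming your PostgreSQL version has the
enable_seqscan planner setting) would be to switch off sequential scans for a
moment and compare the plans:

-- For diagnosis only, not for production queries.
SET enable_seqscan TO 'off';
EXPLAIN SELECT * FROM static_counters
 WHERE line = 'absolonll'
   AND counterstamp > '2001-07-26 00:00:00.0';
SET enable_seqscan TO 'on';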
Best regards,
Robert
--
_
|-| __ Robert Vojta <vojta-at-ipex.cz> -= Oo.oO =-
|=| [Ll] IPEX, s.r.o.
"^" ====`o
The index is only used for the line= part of the where clause
with your query. With many rows the "line=" is not selective enough
to justify the index.
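A rough way to see that selectivity for yourself (simple counts, using only the
columns you already posted):

-- If line='absolonll' alone still matches a large share of the rows,
-- an index scan plus heap lookups can cost more than a sequential scan.
SELECT count(*) FROM static_counters;
SELECT count(*) FROM static_counters WHERE line = 'absolonll';
SELECT count(*) FROM static_counters
 WHERE line = 'absolonll'
   AND counterstamp > '2001-07-26 00:00:00.0';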
Hi,
I tried your suggestion about AND counterstamp > '2001-07-26 00:00:00.0' and
it works, the index is used :) But the whole query now runs for 10 seconds
(it was 30s) and that's still too much, I need about 1 second. Should I
optimize my tables or queries, or should I upgrade something in my hardware
(Duron 700, 384MB RAM, slow SCSI disk :( )? I don't need a complete solution,
just a hint about which part to focus on, and I will go through the
documentation again. Thank you very much.
Best regards,
Robert
--
_
|-| __ Robert Vojta <vojta-at-ipex.cz> -= Oo.oO =-
|=| [Ll] IPEX, s.r.o.
"^" ====`o
The index is only used for the line= part of the where clause
with your query. With many rows the "line=" is not selective enough
to justify the index.
I tried moving only the needed data into a new table and changed the query to this:
netacc=> EXPLAIN SELECT counterfrom AS from, counterto AS to,
           floor((980000000 - date_part('epoch', counterstamp)) / 300) AS sequence
           FROM graphs_5m WHERE line='absolonll';
NOTICE:  QUERY PLAN:

Index Scan using graphs_5m_idx on graphs_5m  (cost=0.00..58.38 rows=29 width=24)

EXPLAIN
The query now runs in 3-5 seconds. Any idea how to make it faster? I think
that at this point a hardware upgrade is the next step for a faster result ...
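For reference, graphs_5m itself is nothing special; roughly something like this
(the column list, index definition, and refresh step are simplified here):

CREATE TABLE graphs_5m (
    line         VARCHAR(64) NOT NULL,
    counterfrom  INT8 NOT NULL,
    counterto    INT8 NOT NULL,
    counterstamp TIMESTAMP NOT NULL);
CREATE INDEX graphs_5m_idx ON graphs_5m (line);

-- Refilled by the summarization job with just the window the graphs need
-- (the cutoff is illustrative; in reality the rows come from counters
-- and static_counters).
DELETE FROM graphs_5m;
INSERT INTO graphs_5m
SELECT line, counterfrom, counterto, counterstamp
  FROM static_counters
 WHERE counterstamp > '2001-07-26 00:00:00';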
Best regards,
Robert
--
_
|-| __ Robert Vojta <vojta-at-ipex.cz> -= Oo.oO =-
|=| [Ll] IPEX, s.r.o.
"^" ====`o
On Fri, 27 Jul 2001, Robert Vojta wrote:
netacc=> EXPLAIN (SELECT SUM(counterfrom) AS from, SUM(counterto) AS to,
           floor((985098900 - date_part('epoch', counterstamp)) / 300) AS sequence
           FROM counters WHERE line='absolonll'
           AND date_part('epoch', counterstamp) > 984978900
           GROUP BY sequence, line)
         UNION
         (SELECT SUM(counterfrom) AS from, SUM(counterto) AS to,
           floor((985098900 - date_part('epoch', counterstamp)) / 300) AS sequence
           FROM static_counters WHERE line='absolonll'
           AND date_part('epoch', counterstamp) > 984978900
           GROUP BY sequence, line);
NOTICE:  QUERY PLAN:
Is there any possibility of overlapping rows between the parts of the
union? If not, I'd suggest union all, since that might get rid of the top
level unique and sort steps (probably not a huge gain, but might help).
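If the two halves really can't produce the same row, it would look roughly like
this (untested sketch, with the counterstamp comparison written directly as
suggested earlier in the thread; the cutoff timestamp is illustrative):

(SELECT SUM(counterfrom) AS from, SUM(counterto) AS to,
        floor((985098900 - date_part('epoch', counterstamp)) / 300) AS sequence
   FROM counters
  WHERE line = 'absolonll'
    AND counterstamp > '2001-07-26 00:00:00'
  GROUP BY sequence, line)
UNION ALL
(SELECT SUM(counterfrom) AS from, SUM(counterto) AS to,
        floor((985098900 - date_part('epoch', counterstamp)) / 300) AS sequence
   FROM static_counters
  WHERE line = 'absolonll'
    AND counterstamp > '2001-07-26 00:00:00'
  GROUP BY sequence, line);
-- UNION ALL keeps all rows, so the planner can drop the top-level
-- Sort/Unique steps that plain UNION needs for duplicate elimination.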
Unique  (cost=67518.73..67525.44 rows=89 width=36)
  ->  Sort  (cost=67518.73..67518.73 rows=895 width=36)
        ->  Append  (cost=1860.01..67474.87 rows=895 width=36)
              ->  Aggregate  (cost=1860.01..1870.90 rows=109 width=36)
                    ->  Group  (cost=1860.01..1865.46 rows=1089 width=36)
                          ->  Sort  (cost=1860.01..1860.01 rows=1089 width=36)
                                ->  Seq Scan on counters  (cost=0.00..1805.10 rows=1089 width=36)
              ->  Aggregate  (cost=65525.38..65603.97 rows=786 width=36)
                    ->  Group  (cost=65525.38..65564.67 rows=7858 width=36)
                          ->  Sort  (cost=65525.38..65525.38 rows=7858 width=36)
                                ->  Seq Scan on static_counters  (cost=0.00..65016.95 rows=7858 width=36)

EXPLAIN
netacc=>
Is there any possibility of overlapping rows between the parts of the
union? If not, I'd suggest union all, since that might get rid of the top
level unique and sort steps (probably not a huge gain, but might help).
Hi,
thanks for the response; there is a small possibility of overlapping rows.
But I removed the date_part function and the index is used now. That was still
too slow, so I moved all the needed data into the graphs_5m table and it works
fine now; the previous time for drawing the graphs was ~30-60 seconds, today
it's ~3-5 seconds. Thanks.
Best regards,
Robert
--
_
|-| __ Robert Vojta <vojta-at-ipex.cz> -= Oo.oO =-
|=| [Ll] IPEX, s.r.o.
"^" ====`o