speeding up a query

Started by Marcus Engeneabout 19 years ago4 messagesgeneral
Jump to latest
#1Marcus Engene
mengpg2@engene.se

Hi,

I'm on 8.0.10 and there is a query I cannot quite get adequately fast.
Should it take 2.5s to sort these 442 rows? Are my settings bad? Is
my query stupid?

Would appreciate any tips.

Best regards,
Marcus

apa=> explain analyze
apa-> select
apa-> ai.objectid as ai_objectid
apa-> from
apa-> apa_item ai
apa-> where
apa-> idxfti @@ to_tsquery('default', 'KCA0304') AND
apa-> ai.status = 30
apa-> ORDER BY ai.calc_rating desc
apa-> LIMIT 1000;

Limit (cost=54.40..54.43 rows=12 width=8) (actual
time=2650.254..2651.093 rows=442 loops=1)
-> Sort (cost=54.40..54.43 rows=12 width=8) (actual
time=2650.251..2650.515 rows=442 loops=1)
Sort Key: calc_rating
-> Index Scan using apa_item_fts on apa_item ai
(cost=0.00..54.18 rows=12 width=8) (actual time=61.261..2649.045
rows=442 loops=1)
Index Cond: (idxfti @@ '''kca0304'''::tsquery)
Filter: (status = 30)
Total runtime: 2651.659 ms
(7 rows)

apa=> explain analyze
apa-> select
apa-> ai.objectid as ai_objectid
apa-> from
apa-> apa_item ai
apa-> where
apa-> idxfti @@ to_tsquery('default', 'KCA0304') AND
apa-> ai.status = 30
apa-> LIMIT 1000;

Limit (cost=0.00..54.18 rows=12 width=4) (actual time=0.186..18.628
rows=442 loops=1)
-> Index Scan using apa_item_fts on apa_item ai (cost=0.00..54.18
rows=12 width=4) (actual time=0.183..17.999 rows=442 loops=1)
Index Cond: (idxfti @@ '''kca0304'''::tsquery)
Filter: (status = 30)
Total runtime: 19.062 ms
(5 rows)

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Marcus Engene (#1)
Re: speeding up a query

Marcus Engene <mengpg2@engene.se> writes:

Should it take 2.5s to sort these 442 rows?

Limit (cost=54.40..54.43 rows=12 width=8) (actual
time=2650.254..2651.093 rows=442 loops=1)
-> Sort (cost=54.40..54.43 rows=12 width=8) (actual
time=2650.251..2650.515 rows=442 loops=1)
Sort Key: calc_rating
-> Index Scan using apa_item_fts on apa_item ai
(cost=0.00..54.18 rows=12 width=8) (actual time=61.261..2649.045
rows=442 loops=1)
Index Cond: (idxfti @@ '''kca0304'''::tsquery)
Filter: (status = 30)
Total runtime: 2651.659 ms

It's not the sort that's taking 2.5s --- the sort looks to be taking
about a millisec and a half. The indexscan is eating the other 2649
msec. The question that seems to be interesting is what's the
difference between the contexts of your two queries, because they
sure look like the indexscans were the same. Maybe the second one
is merely benefiting from the first one having already sucked all the
data into cache?

regards, tom lane

#3Marcus Engene
mengpg2@engene.se
In reply to: Marcus Engene (#1)
Re: speeding up a query

Hi again,

I was thinking, in my slow query it seems the sorting is the villain.
Doing a simple qsort test I notice that:
ehsmeng@menglap /cygdrive/c/pond/dev/tt
$ time ./a.exe 430

real 0m0.051s
user 0m0.030s
sys 0m0.000s

ehsmeng@menglap /cygdrive/c/pond/dev/tt
$ time ./a.exe 430000

real 0m0.238s
user 0m0.218s
sys 0m0.015s

ehsmeng@menglap /cygdrive/c/pond/dev/tt
$ time ./a.exe 4300000

real 0m2.594s
user 0m2.061s
sys 0m0.108s

From this very unfair test indeed I see that my machine has the
capability to sort 4.3 million entries during the same time my pg is
sorting 430.

And i cannot stop wondering if there is some generic sorting routine
that is incredibly slow? Would it be possible to, in the situations
where order by is by simple datatypes of one column, to do a special
sorting, like the qsort example in the end of this mail? Is this already
addressed in later versions?

If no, why? and if yes, where in the pg code do I look?

Best regards,
Marcus

#include <stdio.h>
#include <stdlib.h>

typedef struct {
int val;
void *pek;
} QSORTSTRUCT_INT_S;

int sortstruct_int_compare(void const *a, void const *b)
{
return ( ((QSORTSTRUCT_INT_S *)a)->val - ((QSORTSTRUCT_INT_S
*)b)->val );
}

int main (int argc, char **argv)
{
int nbr = 0;
int i = 0;
QSORTSTRUCT_INT_S *sort_arr = 0;
if (1 == argc) {
printf("forgot amount argument\n");
exit(1);
}
nbr = atoi (argv[1]);
if (0 == (sort_arr = malloc (sizeof(QSORTSTRUCT_INT_S) * nbr))) {
printf("cannot alloc\n");
exit(1);
}
srand(123);
for (i=0; i<nbr; i++) {
sort_arr[i].val = rand();
}
qsort(sort_arr, nbr, sizeof(QSORTSTRUCT_INT_S),sortstruct_int_compare);
return 0;
}

#4Marcus Engene
mengpg2@engene.se
In reply to: Tom Lane (#2)
Re: speeding up a query

Tom Lane skrev:

Marcus Engene <mengpg2@engene.se> writes:

Should it take 2.5s to sort these 442 rows?

Limit (cost=54.40..54.43 rows=12 width=8) (actual
time=2650.254..2651.093 rows=442 loops=1)
-> Sort (cost=54.40..54.43 rows=12 width=8) (actual
time=2650.251..2650.515 rows=442 loops=1)
Sort Key: calc_rating
-> Index Scan using apa_item_fts on apa_item ai
(cost=0.00..54.18 rows=12 width=8) (actual time=61.261..2649.045
rows=442 loops=1)
Index Cond: (idxfti @@ '''kca0304'''::tsquery)
Filter: (status = 30)
Total runtime: 2651.659 ms

It's not the sort that's taking 2.5s --- the sort looks to be taking
about a millisec and a half. The indexscan is eating the other 2649
msec. The question that seems to be interesting is what's the
difference between the contexts of your two queries, because they
sure look like the indexscans were the same. Maybe the second one
is merely benefiting from the first one having already sucked all the
data into cache?

regards, tom lane

Yes indeed you are completely right! Both queries take about the same when
run after the other. And I just made a fool of myself with an optimizing
idea I
had...

Sorry for the noise and thanks for your answer!

Best regards,
Marcus