Index on integer or on string field
Hello,
i have a table of about 10 millions of records, with the index on a string field.
Actually is alphabetical; since queries are about 100/200 per seconds, i was looking for a better way to improve performance and reduce workload.
The unique values, of that fields, are about the 50 (category name), and we could create a second table to codify, with numerical integer values, the 50 recurring names.
Is index are integer and not characteral, performance are better and workload reduces?
Is there any comparisons?
Thank you!
Francesco
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
You should probably experiment with a btree-gin index on those.
Em 15/05/2015 12:22, "Job" <Job@colliniconsulting.it> escreveu:
Show quoted text
Hello,
i have a table of about 10 millions of records, with the index on a string
field.
Actually is alphabetical; since queries are about 100/200 per seconds, i
was looking for a better way to improve performance and reduce workload.The unique values, of that fields, are about the 50 (category name), and
we could create a second table to codify, with numerical integer values,
the 50 recurring names.Is index are integer and not characteral, performance are better and
workload reduces?Is there any comparisons?
Thank you!
Francesco--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Hello Arthur!
So, i read that btree-gin have got "the ability to enforce uniqueness".
If in this 10.millions long table i have, in index, 50 recurring values, i can leave the alphabetical field and change to btree-gin the index on it?!
Thank you!
Francesco
________________________________
Da: Arthur Silva [arthurprs@gmail.com]
Inviato: venerdì 15 maggio 2015 17.26
A: Job
Cc: pgsql-general@postgresql.org
Oggetto: Re: [GENERAL] Index on integer or on string field
You should probably experiment with a btree-gin index on those.
Em 15/05/2015 12:22, "Job" <Job@colliniconsulting.it<mailto:Job@colliniconsulting.it>> escreveu:
Hello,
i have a table of about 10 millions of records, with the index on a string field.
Actually is alphabetical; since queries are about 100/200 per seconds, i was looking for a better way to improve performance and reduce workload.
The unique values, of that fields, are about the 50 (category name), and we could create a second table to codify, with numerical integer values, the 50 recurring names.
Is index are integer and not characteral, performance are better and workload reduces?
Is there any comparisons?
Thank you!
Francesco
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org<mailto:pgsql-general@postgresql.org>)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Are you saying your indexed field has only 50 distinct values? Seems a horrible candidate for an index. Might be good to partition on those fifty values but ten million records probably doesn't warrant that.
Sent from my iPhone
Show quoted text
On May 15, 2015, at 9:34 AM, Job <Job@colliniconsulting.it> wrote:
Hello Arthur!
So, i read that btree-gin have got "the ability to enforce uniqueness".
If in this 10.millions long table i have, in index, 50 recurring values, i can leave the alphabetical field and change to btree-gin the index on it?!
Thank you!
FrancescoDa: Arthur Silva [arthurprs@gmail.com]
Inviato: venerdì 15 maggio 2015 17.26
A: Job
Cc: pgsql-general@postgresql.org
Oggetto: Re: [GENERAL] Index on integer or on string fieldYou should probably experiment with a btree-gin index on those.
Em 15/05/2015 12:22, "Job" <Job@colliniconsulting.it> escreveu:Hello,
i have a table of about 10 millions of records, with the index on a string field.
Actually is alphabetical; since queries are about 100/200 per seconds, i was looking for a better way to improve performance and reduce workload.The unique values, of that fields, are about the 50 (category name), and we could create a second table to codify, with numerical integer values, the 50 recurring names.
Is index are integer and not characteral, performance are better and workload reduces?
Is there any comparisons?
Thank you!
Francesco--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Yes that's my suggestion. Btree-Gin deals with lots of repeated values much
better than the Btree index as repeated keys are only stored once.
Em 15/05/2015 12:38, "Job" <Job@colliniconsulting.it> escreveu:
Show quoted text
Hello Arthur!
So, i read that btree-gin have got "the ability to enforce uniqueness".
If in this 10.millions long table i have, in index, 50 recurring values, i
can leave the alphabetical field and change to btree-gin the index on it?!Thank you!
Francesco------------------------------
*Da:* Arthur Silva [arthurprs@gmail.com]
*Inviato:* venerdì 15 maggio 2015 17.26
*A:* Job
*Cc:* pgsql-general@postgresql.org
*Oggetto:* Re: [GENERAL] Index on integer or on string fieldYou should probably experiment with a btree-gin index on those.
Em 15/05/2015 12:22, "Job" <Job@colliniconsulting.it> escreveu:Hello,
i have a table of about 10 millions of records, with the index on a
string field.
Actually is alphabetical; since queries are about 100/200 per seconds, i
was looking for a better way to improve performance and reduce workload.The unique values, of that fields, are about the 50 (category name), and
we could create a second table to codify, with numerical integer values,
the 50 recurring names.Is index are integer and not characteral, performance are better and
workload reduces?Is there any comparisons?
Thank you!
Francesco--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Hello Francesco,
You should probably set timing on, run an explain analyze, and use pgbadger
to diagnose your performance issue.
While it may be the case that comparison in the index might be slightly
faster because of the modulo arithmetic, those in-memory operations are
extremely fast and it is likely that the seek in that index is the fastest
part of your query. And since you only have 50 distinct values the btree is
probably extremely shallow and there will be very few comparisons anyway.
I don't know much about your query but I suspect that the issue is that
your index scan is not selective enough so Postgres needs to scan a lot
from disk (which is *extremely* slow). If you want to improve the
performance you should first try to make the query as selective as
possible, and try to put an index on a more selective column of the WHERE
clause.
If you really must rely primarily on that column which has only 50 distinct
values you can try periodically running a CLUSTER command on the table for
that column index (doc:
http://www.postgresql.org/docs/devel/static/sql-cluster.html), or partition
the table (doc:
http://www.postgresql.org/docs/devel/static/ddl-partitioning.html) so that
the data you are scanning is close together on disk and you can get as much
of it per IO operation as possible.
*Will J. Dunn*
*willjdunn.com <http://willjdunn.com>*
On Fri, May 15, 2015 at 1:32 PM, Arthur Silva <arthurprs@gmail.com> wrote:
Show quoted text
Yes that's my suggestion. Btree-Gin deals with lots of repeated values
much better than the Btree index as repeated keys are only stored once.
Em 15/05/2015 12:38, "Job" <Job@colliniconsulting.it> escreveu:Hello Arthur!
So, i read that btree-gin have got "the ability to enforce uniqueness".
If in this 10.millions long table i have, in index, 50 recurring values,
i can leave the alphabetical field and change to btree-gin the index on it?!Thank you!
Francesco------------------------------
*Da:* Arthur Silva [arthurprs@gmail.com]
*Inviato:* venerdì 15 maggio 2015 17.26
*A:* Job
*Cc:* pgsql-general@postgresql.org
*Oggetto:* Re: [GENERAL] Index on integer or on string fieldYou should probably experiment with a btree-gin index on those.
Em 15/05/2015 12:22, "Job" <Job@colliniconsulting.it> escreveu:Hello,
i have a table of about 10 millions of records, with the index on a
string field.
Actually is alphabetical; since queries are about 100/200 per seconds, i
was looking for a better way to improve performance and reduce workload.The unique values, of that fields, are about the 50 (category name), and
we could create a second table to codify, with numerical integer values,
the 50 recurring names.Is index are integer and not characteral, performance are better and
workload reduces?Is there any comparisons?
Thank you!
Francesco--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Fri, May 15, 2015 at 9:18 AM, Job <Job@colliniconsulting.it> wrote:
Hello,
i have a table of about 10 millions of records, with the index on a string field.
Actually is alphabetical; since queries are about 100/200 per seconds, i was looking for a better way to improve performance and reduce workload.The unique values, of that fields, are about the 50 (category name), and we could create a second table to codify, with numerical integer values, the 50 recurring names.
Is index are integer and not characteral, performance are better and workload reduces?
Is there any comparisons?
Have you considered using a multi-column index here? if there's a more
selective field you could index along with your rather non-selective
one that might work better. But it's hard to tell without looking at
you database usage etc.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Hi,
My approach would be to improve the uniqueness of each record/row. Otherwise
you'll have to traverse the entire table for every query. At 100/200 queries
per second you are asking for trouble on several fronts. Including crashing
your hard disk faster than need be.
Hope this helps. Good luck.
Best Regards
Farjad Farid
-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Job
Sent: 15 May 2015 16:19
To: pgsql-general@postgresql.org
Subject: [GENERAL] Index on integer or on string field
Hello,
i have a table of about 10 millions of records, with the index on a string
field.
Actually is alphabetical; since queries are about 100/200 per seconds, i was
looking for a better way to improve performance and reduce workload.
The unique values, of that fields, are about the 50 (category name), and we
could create a second table to codify, with numerical integer values, the 50
recurring names.
Is index are integer and not characteral, performance are better and
workload reduces?
Is there any comparisons?
Thank you!
Francesco
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make
changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On 2015-05-15 18:31, Scott Marlowe wrote:
On Fri, May 15, 2015 at 9:18 AM, Job <Job@colliniconsulting.it> wrote:
i have a table of about 10 millions of records, with the index on a string field.
Actually is alphabetical; since queries are about 100/200 per seconds, i was looking for a better way to improve performance and reduce workload.The unique values, of that fields, are about the 50 (category name), and we could create a second table to codify, with numerical integer values, the 50 recurring names.
Is index are integer and not characteral, performance are better and workload reduces?
Is there any comparisons?
Have you considered using a multi-column index here? if there's a more
selective field you could index along with your rather non-selective
one that might work better. But it's hard to tell without looking at
you database usage etc.
To add to Scott suggestion, think about what columns you include in your
queries when you are looking for a unique row (what columns are on the right
hand side of the WHEN clause). This will help you narrow down what makes
unique conditions on your table, from which you can create a better index.
--
http://yves.zioup.com
gpg: 4096R/32B0F416
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general