what can depend on index

Started by Szymon Guzover 15 years ago14 messagesgeneral
Jump to latest
#1Szymon Guz
mabewlun@gmail.com

Hi,
today I noticed that in the documentation there is DROP INDEX CASCADE.
I've got one question: what is that for? What can depend on index?

regards
Szymon

#2David Fetter
david@fetter.org
In reply to: Szymon Guz (#1)
Re: what can depend on index

On Tue, Oct 26, 2010 at 10:13:04AM +0200, Szymon Guz wrote:

Hi,
today I noticed that in the documentation there is DROP INDEX
CASCADE. I've got one question: what is that for? What can depend
on index?

A foreign key can, if the index is unique.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Szymon Guz (#1)
Re: what can depend on index

Szymon Guz <mabewlun@gmail.com> writes:

today I noticed that in the documentation there is DROP INDEX CASCADE.
I've got one question: what is that for? What can depend on index?

Unique constraints and foreign key constraints, for starters.

But even if there weren't anything, we'd still provide the CASCADE
syntax for future-proofing purposes.

regards, tom lane

#4Szymon Guz
mabewlun@gmail.com
In reply to: David Fetter (#2)
Re: what can depend on index

On 26 October 2010 16:21, David Fetter <david@fetter.org> wrote:

On Tue, Oct 26, 2010 at 10:13:04AM +0200, Szymon Guz wrote:

Hi,
today I noticed that in the documentation there is DROP INDEX
CASCADE. I've got one question: what is that for? What can depend
on index?

A foreign key can, if the index is unique.

Thanks for the answer.

regards
Szymon

#5Szymon Guz
mabewlun@gmail.com
In reply to: Tom Lane (#3)
Re: what can depend on index

On 26 October 2010 16:29, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Szymon Guz <mabewlun@gmail.com> writes:

today I noticed that in the documentation there is DROP INDEX CASCADE.
I've got one question: what is that for? What can depend on index?

Unique constraints and foreign key constraints, for starters.

But even if there weren't anything, we'd still provide the CASCADE
syntax for future-proofing purposes.

regards, tom lane

Thanks for the answer, I didn't know about the foreign key, I only thought
about the future use.

regards
Szymon

#6Ozz Nixon
ozznixon@gmail.com
In reply to: Szymon Guz (#5)
Why Select Count(*) from table - took over 20 minutes?

I am the only user on this system right now, and one table select count(*) took over 20 minutes:

wikitags exists and has 58,988,656 records.

Structure (in pascal) is:

quer.SQL.Add('create table '+DBTags+' (');
quer.SQL.Add(' pagename '+SQL_TITLE+'(100) not null,');
quer.SQL.Add(' tagword '+SQL_TITLE+'(15) not null,');
quer.SQL.Add(' soundex2 '+SQL_TITLE+'(4) not null,');
quer.SQL.Add(' metaphone '+SQL_TITLE+'(15) not null,');
quer.SQL.Add(' metaphone2 '+SQL_TITLE+'(22) not null,');
quer.SQL.Add(' carverphone '+SQL_TITLE+'(22) not null,');
quer.SQL.Add(' instances '+SQL_INT32+' not null,');
if SQL_NAME_PRIMARY_KEYS then quer.SQL.Add(' constraint '+DBTags+'_PK');
quer.SQL.Add(' primary key(pagename, tagword, instances)');
quer.SQL.Add(')');

where SQL_TITLE = 'varchar', SQL_IN32 = 'int'

I have hung off indexes for each column, to resolve my previous "performance" issue from 3+ weeks ago. However, COUNT() is still dog slow - this table is a write once, read many... *never* update, nor delete.

Any suggestions?

#7Bill Moran
wmoran@potentialtech.com
In reply to: Ozz Nixon (#6)
Re: Why Select Count(*) from table - took over 20 minutes?

In response to Ozz Nixon <ozznixon@gmail.com>:

I am the only user on this system right now, and one table select count(*) took over 20 minutes:

wikitags exists and has 58,988,656 records.

Structure (in pascal) is:

quer.SQL.Add('create table '+DBTags+' (');
quer.SQL.Add(' pagename '+SQL_TITLE+'(100) not null,');
quer.SQL.Add(' tagword '+SQL_TITLE+'(15) not null,');
quer.SQL.Add(' soundex2 '+SQL_TITLE+'(4) not null,');
quer.SQL.Add(' metaphone '+SQL_TITLE+'(15) not null,');
quer.SQL.Add(' metaphone2 '+SQL_TITLE+'(22) not null,');
quer.SQL.Add(' carverphone '+SQL_TITLE+'(22) not null,');
quer.SQL.Add(' instances '+SQL_INT32+' not null,');
if SQL_NAME_PRIMARY_KEYS then quer.SQL.Add(' constraint '+DBTags+'_PK');
quer.SQL.Add(' primary key(pagename, tagword, instances)');
quer.SQL.Add(')');

where SQL_TITLE = 'varchar', SQL_IN32 = 'int'

I have hung off indexes for each column, to resolve my previous "performance" issue from 3+ weeks ago. However, COUNT() is still dog slow - this table is a write once, read many... *never* update, nor delete.

Any suggestions?

Generate the count one time and store it somewhere for quick retrieval.

In an MVCC database, count(*) is designed to be accurate, which requires
a scan of the entire table (which appears to take about 20 mins on your
hardware).

MVCC just isn't optimized for a table that never changes. However, it's
easy to cache that value, since it never changes the cache never needs
to be updated.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

#8John R Pierce
pierce@hogranch.com
In reply to: Ozz Nixon (#6)
Re: Why Select Count(*) from table - took over 20 minutes?

On 10/26/10 10:18 AM, Ozz Nixon wrote:

I am the only user on this system right now, and one table select count(*) took over 20 minutes:

wikitags exists and has 58,988,656 records.

Structure (in pascal) is:

quer.SQL.Add('create table '+DBTags+' (');
quer.SQL.Add(' pagename '+SQL_TITLE+'(100) not null,');
quer.SQL.Add(' tagword '+SQL_TITLE+'(15) not null,');
quer.SQL.Add(' soundex2 '+SQL_TITLE+'(4) not null,');
quer.SQL.Add(' metaphone '+SQL_TITLE+'(15) not null,');
quer.SQL.Add(' metaphone2 '+SQL_TITLE+'(22) not null,');
quer.SQL.Add(' carverphone '+SQL_TITLE+'(22) not null,');
quer.SQL.Add(' instances '+SQL_INT32+' not null,');
if SQL_NAME_PRIMARY_KEYS then quer.SQL.Add(' constraint '+DBTags+'_PK');
quer.SQL.Add(' primary key(pagename, tagword, instances)');
quer.SQL.Add(')');

where SQL_TITLE = 'varchar', SQL_IN32 = 'int'

I have hung off indexes for each column, to resolve my previous "performance" issue from 3+ weeks ago. However, COUNT() is still dog slow - this table is a write once, read many... *never* update, nor delete.

count(*) has to read the whole table to get the accurate count. The
reason for this is that different clients can see different versions of
that table, for instance, if client A is already in a transaction, and
client B then does an INSERT, the two clients will see different values
for the count.

#9Alan Hodgson
ahodgson@simkin.ca
In reply to: Ozz Nixon (#6)
Re: Why Select Count(*) from table - took over 20 minutes?

On October 26, 2010 10:18:41 am Ozz Nixon wrote:

I have hung off indexes for each column, to resolve my previous
"performance" issue from 3+ weeks ago. However, COUNT() is still dog slow
- this table is a write once, read many... *never* update, nor delete.

Any suggestions?

If you need to do count(*) on 60 million row tables, you will probably need
faster hardware.

#10Diego Schulz
dschulz@gmail.com
In reply to: Ozz Nixon (#6)
Re: Why Select Count(*) from table - took over 20 minutes?

On Tue, Oct 26, 2010 at 2:18 PM, Ozz Nixon <ozznixon@gmail.com> wrote:

I am the only user on this system right now, and one table select count(*) took over 20 minutes:

wikitags exists and has 58,988,656 records.

Structure (in pascal) is:

  quer.SQL.Add('create table '+DBTags+' (');
  quer.SQL.Add('   pagename '+SQL_TITLE+'(100) not null,');
  quer.SQL.Add('   tagword '+SQL_TITLE+'(15) not null,');
  quer.SQL.Add('   soundex2 '+SQL_TITLE+'(4) not null,');
  quer.SQL.Add('   metaphone '+SQL_TITLE+'(15) not null,');
  quer.SQL.Add('   metaphone2 '+SQL_TITLE+'(22) not null,');
  quer.SQL.Add('   carverphone '+SQL_TITLE+'(22) not null,');
  quer.SQL.Add('   instances '+SQL_INT32+' not null,');
  if SQL_NAME_PRIMARY_KEYS then quer.SQL.Add('   constraint '+DBTags+'_PK');
  quer.SQL.Add('   primary key(pagename, tagword, instances)');
  quer.SQL.Add(')');

where SQL_TITLE = 'varchar', SQL_IN32 = 'int'

I have hung off indexes for each column, to resolve my previous "performance" issue from 3+ weeks ago. However, COUNT() is still dog slow - this table is a write once, read many... *never* update, nor delete.

Any suggestions?
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

You could try something like what's suggested in this blog post:
http://jakub.fedyczak.net/post/26

I didn't actually tried it, but I think it should work ok.

cheers,

diego

#11Merlin Moncure
mmoncure@gmail.com
In reply to: Diego Schulz (#10)
Re: Why Select Count(*) from table - took over 20 minutes?

On Tue, Oct 26, 2010 at 4:30 PM, Diego Schulz <dschulz@gmail.com> wrote:

On Tue, Oct 26, 2010 at 2:18 PM, Ozz Nixon <ozznixon@gmail.com> wrote:

I am the only user on this system right now, and one table select count(*) took over 20 minutes:

wikitags exists and has 58,988,656 records.

Structure (in pascal) is:

  quer.SQL.Add('create table '+DBTags+' (');
  quer.SQL.Add('   pagename '+SQL_TITLE+'(100) not null,');
  quer.SQL.Add('   tagword '+SQL_TITLE+'(15) not null,');
  quer.SQL.Add('   soundex2 '+SQL_TITLE+'(4) not null,');
  quer.SQL.Add('   metaphone '+SQL_TITLE+'(15) not null,');
  quer.SQL.Add('   metaphone2 '+SQL_TITLE+'(22) not null,');
  quer.SQL.Add('   carverphone '+SQL_TITLE+'(22) not null,');
  quer.SQL.Add('   instances '+SQL_INT32+' not null,');
  if SQL_NAME_PRIMARY_KEYS then quer.SQL.Add('   constraint '+DBTags+'_PK');
  quer.SQL.Add('   primary key(pagename, tagword, instances)');
  quer.SQL.Add(')');

where SQL_TITLE = 'varchar', SQL_IN32 = 'int'

I have hung off indexes for each column, to resolve my previous "performance" issue from 3+ weeks ago. However, COUNT() is still dog slow - this table is a write once, read many... *never* update, nor delete.

Any suggestions?
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

You could try something like what's suggested in this blog post:
http://jakub.fedyczak.net/post/26

I didn't actually tried it, but I think it should work ok.

Before you try that, you should ask yourself if you really need a 100%
accurate count. A reasonable approximation is maintained via the
stats system (pg_class.reltuples) that will often do and is free.

merlin

#12Ozz Nixon
ozznixon@gmail.com
In reply to: Merlin Moncure (#11)
Re: Why Select Count(*) from table - took over 20 minutes?

How/where do I query this?

My script does not need a 100% accurate count - just a recently valid count - so I can verify the web crawlers are still crawling :-)

On Oct 27, 2010, at 7:15 AM, Merlin Moncure wrote:

Show quoted text

pg_class.reltuples

#13Merlin Moncure
mmoncure@gmail.com
In reply to: Ozz Nixon (#12)
Re: Why Select Count(*) from table - took over 20 minutes?

On Thu, Oct 28, 2010 at 11:36 AM, Ozz Nixon <ozznixon@gmail.com> wrote:

How/where do I query this?

My script does not need a 100% accurate count - just a recently valid count - so I can verify the web crawlers are still crawling :-)

you can do this:
select reltuples from pg_class where relname = 'your_table' and relkind = 'r';

that will give you accurate count as of the last analyze, which is
going to be driven by table usage and/or manual analyze. Probably
much better in your particular case is to do this:

select * from pg_stat_all_tables where relname = 'your_table';

and look at the n_tup_ins, del, etc. and make sure they are changing
(those numbers are reset when server resets, fyi).

merlin

#14Jasen Betts
jasen@xnet.co.nz
In reply to: Szymon Guz (#1)
Re: Why Select Count(*) from table - took over 20 minutes?

On 2010-10-26, John R Pierce <pierce@hogranch.com> wrote:

count(*) has to read the whole table to get the accurate count. The
reason for this is that different clients can see different versions of
that table, for instance, if client A is already in a transaction, and
client B then does an INSERT, the two clients will see different values
for the count.

They may or may not. the default transaction isolation level "read commited"
allows a session to see most changes that were committed externally after the
start of the transaction.

Tlso the index may include deleted rows.
which is another reason count(*) does a table scan.

--
ɹǝpun uʍop ɯoɹɟ sƃuıʇǝǝɹ⅁