Home > mailing lists

Thread: what can depend on index

what can depend on index

From

Szymon Guz

Date:

26 October 2010, 05:13:12

Hi,

today I noticed that in the documentation there is DROP INDEX CASCADE.

I've got one question: what is that for? What can depend on index?

regards

Szymon

Re: what can depend on index

From

David Fetter

Date:

26 October 2010, 11:22:07

On Tue, Oct 26, 2010 at 10:13:04AM +0200, Szymon Guz wrote:
> Hi,
> today I noticed that in the documentation there is DROP INDEX
> CASCADE.  I've got one question: what is that for? What can depend
> on index?

A foreign key can, if the index is unique.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: what can depend on index

From

Tom Lane

Date:

26 October 2010, 11:29:53

Szymon Guz <mabewlun@gmail.com> writes:
> today I noticed that in the documentation there is DROP INDEX CASCADE.
> I've got one question: what is that for? What can depend on index?

Unique constraints and foreign key constraints, for starters.

But even if there weren't anything, we'd still provide the CASCADE
syntax for future-proofing purposes.

            regards, tom lane

Re: what can depend on index

From

Szymon Guz

Date:

26 October 2010, 11:37:55

On 26 October 2010 16:21, David Fetter <david@fetter.org> wrote:

On Tue, Oct 26, 2010 at 10:13:04AM +0200, Szymon Guz wrote:
> Hi,
> today I noticed that in the documentation there is DROP INDEX
> CASCADE. I've got one question: what is that for? What can depend
> on index?

A foreign key can, if the index is unique.

Thanks for the answer.

regards

Szymon

Re: what can depend on index

From

Szymon Guz

Date:

26 October 2010, 11:38:13

On 26 October 2010 16:29, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Szymon Guz <mabewlun@gmail.com> writes:
> today I noticed that in the documentation there is DROP INDEX CASCADE.
> I've got one question: what is that for? What can depend on index?

Unique constraints and foreign key constraints, for starters.

But even if there weren't anything, we'd still provide the CASCADE
syntax for future-proofing purposes.

regards, tom lane

Thanks for the answer, I didn't know about the foreign key, I only thought about the future use.

regards

Szymon

Why Select Count(*) from table - took over 20 minutes?

From

Ozz Nixon

Date:

26 October 2010, 14:19:57

I am the only user on this system right now, and one table select count(*) took over 20 minutes:

wikitags exists and has 58,988,656 records.

Structure (in pascal) is:

   quer.SQL.Add('create table '+DBTags+' (');
   quer.SQL.Add('   pagename '+SQL_TITLE+'(100) not null,');
   quer.SQL.Add('   tagword '+SQL_TITLE+'(15) not null,');
   quer.SQL.Add('   soundex2 '+SQL_TITLE+'(4) not null,');
   quer.SQL.Add('   metaphone '+SQL_TITLE+'(15) not null,');
   quer.SQL.Add('   metaphone2 '+SQL_TITLE+'(22) not null,');
   quer.SQL.Add('   carverphone '+SQL_TITLE+'(22) not null,');
   quer.SQL.Add('   instances '+SQL_INT32+' not null,');
   if SQL_NAME_PRIMARY_KEYS then quer.SQL.Add('   constraint '+DBTags+'_PK');
   quer.SQL.Add('   primary key(pagename, tagword, instances)');
   quer.SQL.Add(')');

where SQL_TITLE = 'varchar', SQL_IN32 = 'int'

I have hung off indexes for each column, to resolve my previous "performance" issue from 3+ weeks ago. However, COUNT()
isstill dog slow - this table is a write once, read many... *never* update, nor delete. 

Any suggestions?

Re: Why Select Count(*) from table - took over 20 minutes?

From

Bill Moran

Date:

26 October 2010, 14:26:58

In response to Ozz Nixon <ozznixon@gmail.com>:

> I am the only user on this system right now, and one table select count(*) took over 20 minutes:
>
> wikitags exists and has 58,988,656 records.
>
> Structure (in pascal) is:
>
>    quer.SQL.Add('create table '+DBTags+' (');
>    quer.SQL.Add('   pagename '+SQL_TITLE+'(100) not null,');
>    quer.SQL.Add('   tagword '+SQL_TITLE+'(15) not null,');
>    quer.SQL.Add('   soundex2 '+SQL_TITLE+'(4) not null,');
>    quer.SQL.Add('   metaphone '+SQL_TITLE+'(15) not null,');
>    quer.SQL.Add('   metaphone2 '+SQL_TITLE+'(22) not null,');
>    quer.SQL.Add('   carverphone '+SQL_TITLE+'(22) not null,');
>    quer.SQL.Add('   instances '+SQL_INT32+' not null,');
>    if SQL_NAME_PRIMARY_KEYS then quer.SQL.Add('   constraint '+DBTags+'_PK');
>    quer.SQL.Add('   primary key(pagename, tagword, instances)');
>    quer.SQL.Add(')');
>
> where SQL_TITLE = 'varchar', SQL_IN32 = 'int'
>
> I have hung off indexes for each column, to resolve my previous "performance" issue from 3+ weeks ago. However,
COUNT()is still dog slow - this table is a write once, read many... *never* update, nor delete. 
>
> Any suggestions?

Generate the count one time and store it somewhere for quick retrieval.

In an MVCC database, count(*) is designed to be accurate, which requires
a scan of the entire table (which appears to take about 20 mins on your
hardware).

MVCC just isn't optimized for a table that never changes.  However, it's
easy to cache that value, since it never changes the cache never needs
to be updated.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

Re: Why Select Count(*) from table - took over 20 minutes?

From

John R Pierce

Date:

26 October 2010, 14:33:25

On 10/26/10 10:18 AM, Ozz Nixon wrote:
> I am the only user on this system right now, and one table select count(*) took over 20 minutes:
>
> wikitags exists and has 58,988,656 records.
>
> Structure (in pascal) is:
>
>     quer.SQL.Add('create table '+DBTags+' (');
>     quer.SQL.Add('   pagename '+SQL_TITLE+'(100) not null,');
>     quer.SQL.Add('   tagword '+SQL_TITLE+'(15) not null,');
>     quer.SQL.Add('   soundex2 '+SQL_TITLE+'(4) not null,');
>     quer.SQL.Add('   metaphone '+SQL_TITLE+'(15) not null,');
>     quer.SQL.Add('   metaphone2 '+SQL_TITLE+'(22) not null,');
>     quer.SQL.Add('   carverphone '+SQL_TITLE+'(22) not null,');
>     quer.SQL.Add('   instances '+SQL_INT32+' not null,');
>     if SQL_NAME_PRIMARY_KEYS then quer.SQL.Add('   constraint '+DBTags+'_PK');
>     quer.SQL.Add('   primary key(pagename, tagword, instances)');
>     quer.SQL.Add(')');
>
> where SQL_TITLE = 'varchar', SQL_IN32 = 'int'
>
> I have hung off indexes for each column, to resolve my previous "performance" issue from 3+ weeks ago. However,
COUNT()is still dog slow - this table is a write once, read many... *never* update, nor delete. 

count(*) has to read the whole table to get the accurate count.   The
reason for this is that different clients can see different versions of
that table, for instance, if client A is already in a transaction, and
client B then does an INSERT, the two clients will see different values
for the count.

Re: Why Select Count(*) from table - took over 20 minutes?

From

Alan Hodgson

Date:

26 October 2010, 14:34:03

On October 26, 2010 10:18:41 am Ozz Nixon wrote:
> I have hung off indexes for each column, to resolve my previous
> "performance" issue from 3+ weeks ago. However, COUNT() is still dog slow
> - this table is a write once, read many... *never* update, nor delete.
>
> Any suggestions?

If you need to do count(*) on 60 million row tables, you will probably need
faster hardware.

Re: Why Select Count(*) from table - took over 20 minutes?

From

Diego Schulz

Date:

26 October 2010, 17:30:54

On Tue, Oct 26, 2010 at 2:18 PM, Ozz Nixon <ozznixon@gmail.com> wrote:
> I am the only user on this system right now, and one table select count(*) took over 20 minutes:
>
> wikitags exists and has 58,988,656 records.
>
> Structure (in pascal) is:
>
>   quer.SQL.Add('create table '+DBTags+' (');
>   quer.SQL.Add('   pagename '+SQL_TITLE+'(100) not null,');
>   quer.SQL.Add('   tagword '+SQL_TITLE+'(15) not null,');
>   quer.SQL.Add('   soundex2 '+SQL_TITLE+'(4) not null,');
>   quer.SQL.Add('   metaphone '+SQL_TITLE+'(15) not null,');
>   quer.SQL.Add('   metaphone2 '+SQL_TITLE+'(22) not null,');
>   quer.SQL.Add('   carverphone '+SQL_TITLE+'(22) not null,');
>   quer.SQL.Add('   instances '+SQL_INT32+' not null,');
>   if SQL_NAME_PRIMARY_KEYS then quer.SQL.Add('   constraint '+DBTags+'_PK');
>   quer.SQL.Add('   primary key(pagename, tagword, instances)');
>   quer.SQL.Add(')');
>
> where SQL_TITLE = 'varchar', SQL_IN32 = 'int'
>
> I have hung off indexes for each column, to resolve my previous "performance" issue from 3+ weeks ago. However,
COUNT()is still dog slow - this table is a write once, read many... *never* update, nor delete. 
>
> Any suggestions?
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

You could try something like what's suggested in this blog post:
http://jakub.fedyczak.net/post/26

I didn't actually tried it, but I think it should work ok.

cheers,

diego

Re: Why Select Count(*) from table - took over 20 minutes?

From

Diego Schulz

Date:

26 October 2010, 18:24:01

On Tue, Oct 26, 2010 at 2:18 PM, Ozz Nixon <ozznixon@gmail.com> wrote:
> I am the only user on this system right now, and one table select count(*) took over 20 minutes:
>
> wikitags exists and has 58,988,656 records.
>
> Structure (in pascal) is:
>
>   quer.SQL.Add('create table '+DBTags+' (');
>   quer.SQL.Add('   pagename '+SQL_TITLE+'(100) not null,');
>   quer.SQL.Add('   tagword '+SQL_TITLE+'(15) not null,');
>   quer.SQL.Add('   soundex2 '+SQL_TITLE+'(4) not null,');
>   quer.SQL.Add('   metaphone '+SQL_TITLE+'(15) not null,');
>   quer.SQL.Add('   metaphone2 '+SQL_TITLE+'(22) not null,');
>   quer.SQL.Add('   carverphone '+SQL_TITLE+'(22) not null,');
>   quer.SQL.Add('   instances '+SQL_INT32+' not null,');
>   if SQL_NAME_PRIMARY_KEYS then quer.SQL.Add('   constraint '+DBTags+'_PK');
>   quer.SQL.Add('   primary key(pagename, tagword, instances)');
>   quer.SQL.Add(')');
>
> where SQL_TITLE = 'varchar', SQL_IN32 = 'int'
>
> I have hung off indexes for each column, to resolve my previous "performance" issue from 3+ weeks ago. However,
COUNT()is still dog slow - this table is a write once, read many... *never* update, nor delete. 
>
> Any suggestions?
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

You could try something like what's suggested in this blog post:
http://jakub.fedyczak.net/post/26

I didn't actually tried it, but I think it should work ok.

cheers,

diego

Re: Why Select Count(*) from table - took over 20 minutes?

From

Merlin Moncure

Date:

27 October 2010, 08:15:22

On Tue, Oct 26, 2010 at 4:30 PM, Diego Schulz <dschulz@gmail.com> wrote:
> On Tue, Oct 26, 2010 at 2:18 PM, Ozz Nixon <ozznixon@gmail.com> wrote:
>> I am the only user on this system right now, and one table select count(*) took over 20 minutes:
>>
>> wikitags exists and has 58,988,656 records.
>>
>> Structure (in pascal) is:
>>
>>   quer.SQL.Add('create table '+DBTags+' (');
>>   quer.SQL.Add('   pagename '+SQL_TITLE+'(100) not null,');
>>   quer.SQL.Add('   tagword '+SQL_TITLE+'(15) not null,');
>>   quer.SQL.Add('   soundex2 '+SQL_TITLE+'(4) not null,');
>>   quer.SQL.Add('   metaphone '+SQL_TITLE+'(15) not null,');
>>   quer.SQL.Add('   metaphone2 '+SQL_TITLE+'(22) not null,');
>>   quer.SQL.Add('   carverphone '+SQL_TITLE+'(22) not null,');
>>   quer.SQL.Add('   instances '+SQL_INT32+' not null,');
>>   if SQL_NAME_PRIMARY_KEYS then quer.SQL.Add('   constraint '+DBTags+'_PK');
>>   quer.SQL.Add('   primary key(pagename, tagword, instances)');
>>   quer.SQL.Add(')');
>>
>> where SQL_TITLE = 'varchar', SQL_IN32 = 'int'
>>
>> I have hung off indexes for each column, to resolve my previous "performance" issue from 3+ weeks ago. However,
COUNT()is still dog slow - this table is a write once, read many... *never* update, nor delete. 
>>
>> Any suggestions?
>> --
>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
>>
>
> You could try something like what's suggested in this blog post:
> http://jakub.fedyczak.net/post/26
>
> I didn't actually tried it, but I think it should work ok.

Before you try that, you should ask yourself if you really need a 100%
accurate count.  A reasonable approximation is maintained via the
stats system (pg_class.reltuples) that will often do and is free.

merlin

Re: Why Select Count(*) from table - took over 20 minutes?

From

Ozz Nixon

Date:

28 October 2010, 12:37:30

How/where do I query this?

My script does not need a 100% accurate count - just a recently valid count - so I can verify the web crawlers are
stillcrawling :-) 

On Oct 27, 2010, at 7:15 AM, Merlin Moncure wrote:

> pg_class.reltuples

Re: Why Select Count(*) from table - took over 20 minutes?

From

Merlin Moncure

Date:

28 October 2010, 13:05:02

On Thu, Oct 28, 2010 at 11:36 AM, Ozz Nixon <ozznixon@gmail.com> wrote:
> How/where do I query this?
>
> My script does not need a 100% accurate count - just a recently valid count - so I can verify the web crawlers are
stillcrawling :-) 

you can do this:
select reltuples from pg_class where relname = 'your_table' and relkind = 'r';

that will give you accurate count as of the last analyze, which is
going to be driven by table usage and/or manual analyze.   Probably
much better in your particular case is to do this:

select * from pg_stat_all_tables where relname = 'your_table';

and look at the n_tup_ins, del, etc.  and make sure they are changing
(those numbers are reset when server resets, fyi).

merlin

Re: Why Select Count(*) from table - took over 20 minutes?

From

Jasen Betts

Date:

03 November 2010, 07:15:45

On 2010-10-26, John R Pierce <pierce@hogranch.com> wrote:
>
> count(*) has to read the whole table to get the accurate count.   The
> reason for this is that different clients can see different versions of
> that table, for instance, if client A is already in a transaction, and
> client B then does an INSERT, the two clients will see different values
> for the count.

They may or may not. the default transaction isolation level "read commited"
allows a session to see most changes that were committed externally after the
start of the transaction.

Tlso the index may include deleted rows.
which is another reason count(*) does a table scan.

--
ɹǝpun uʍop ɯoɹɟ sƃuıʇǝǝɹ⅁