Thread: Partial index question

Partial index question

From
Anton Nikiforov
Date:
Dear All,
I have a question about using partial indexes.
Lets say i have a table containing data types (table_datatype) and the
table containing data entrys (table_data).
While inserting into data entrys i have to number the entrys according
to it's type, so i always have to do
select max(id) from table_data where data_type=X;
And then insert a new value into the table data with this type and index.
Looks like there is no way to use sequences in this case without using
different tables that will make application not so clear.
But "my" way is not so clear also because i could get a collision while
concurrent inserts, so i have to control insertion from the application
and always check that it is unique.
So i'm planning to use partable indexes and hope they will help in
performance improving (the table data will contain millions of records
of each type so without indexing the performance will be not good and it
is not clear form me that it will be faster using complex index)
I know that i can do
create indexe ...... where type=X;
But is there any way to create all types of indexes at a time of
database creation without using triggers and creating indexes from it?
The matter is that data types are being added by the user, so i do not
know the indexes that i should create now.
And what will be faster?
CREATE UNIQUE INDEX type_index ON table_data (type, id);
or
CREATE UNIQUE INDEX type_1_index ON table_data (id) WHERE type=1;
CREATE UNIQUE INDEX type_X_index ON table_data (id) WHERE type=X;

--
Thanks in advance,
Best regads,
Anton Nikiforov


Attachment

Re: Partial index question

From
Anton Nikiforov
Date:
Anton Nikiforov пишет:

> ....skipped.....
> CREATE UNIQUE INDEX type_index ON table_data (type, id);
> or
> CREATE UNIQUE INDEX type_1_index ON table_data (id) WHERE type=1;
> CREATE UNIQUE INDEX type_X_index ON table_data (id) WHERE type=X;
>
Hello All,
looks like after a few tests that i've done i'm confused more than before.
I did create a table and inserted 200000 random records of two different
types of data into it.
Now i'm trying to do different selects with different where options.
And i'm always getting explain analyze output like this:
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 Seq Scan on table_data  (cost=0.00..4105.40 rows=63882 width=59)
(actual time=0.477..425.550 rows=65536 loops=1)
   Filter: (game = 1)
 Total runtime: 657.153 ms
It is no matter the type of select
SELECT max(id) FROM table_data WHERE game=1;
SELECT * FROM table_data WHERE game=1;
The same selects even without where clause.
I always have the same result - sequence scan but the filter appears in
the explain output and the number of records being reduced.
It that means that prtial index works? Or this is just means that i use
where clause in the select?

--
Best regads,
Anton Nikiforov



Attachment

Re: Partial index question

From
Bruno Wolff III
Date:
On Thu, Apr 29, 2004 at 11:34:43 +0400,
  Anton Nikiforov <anton@nikiforov.ru> wrote:
> Dear All,
> I have a question about using partial indexes.
> Lets say i have a table containing data types (table_datatype) and the
> table containing data entrys (table_data).
> While inserting into data entrys i have to number the entrys according
> to it's type, so i always have to do
> select max(id) from table_data where data_type=X;
> And then insert a new value into the table data with this type and index.
> Looks like there is no way to use sequences in this case without using
> different tables that will make application not so clear.

Not if you have to really number by each data_type entry without gaps.
You could use one sequence for the whole table and calculate the numbers
when you select the data. Depending on how often you need to do that, that
might have acceptable performance and you could use a simpler scheme for
storing the data.

> But "my" way is not so clear also because i could get a collision while
> concurrent inserts, so i have to control insertion from the application
> and always check that it is unique.

You can avoid this by locking the table. If you do lots of inserts, updates
and/or deletes the contention caused by doing this may be a problem. If
it won't be, than simpler is probably better.

> So i'm planning to use partable indexes and hope they will help in
> performance improving (the table data will contain millions of records
> of each type so without indexing the performance will be not good and it
> is not clear form me that it will be faster using complex index)
> I know that i can do
> create indexe ...... where type=X;

You almost certainly don't want to use partial indexes to solve this
problem. You want to use a two column index on (data_type, id).

> But is there any way to create all types of indexes at a time of
> database creation without using triggers and creating indexes from it?

Solving this problem is one reason you want to use a two column index.

Re: Partial index question

From
Bruno Wolff III
Date:
On Thu, Apr 29, 2004 at 12:53:24 +0400,
  Anton Nikiforov <anton@nikiforov.ru> wrote:
> Seq Scan on table_data  (cost=0.00..4105.40 rows=63882 width=59)
> (actual time=0.477..425.550 rows=65536 loops=1)
>   Filter: (game = 1)
> Total runtime: 657.153 ms
> It is no matter the type of select
> SELECT max(id) FROM table_data WHERE game=1;
> SELECT * FROM table_data WHERE game=1;
> The same selects even without where clause.
> I always have the same result - sequence scan but the filter appears in
> the explain output and the number of records being reduced.
> It that means that prtial index works? Or this is just means that i use
> where clause in the select?

You don't want to use an aggregate. For reasons why aggregates search the
archives. What you want is to have an index on (game, id) and
use:
SELECT id from table_data where game = 1 order by game desc, id desc limit 1;

Re: Partial index question

From
Paul Thomas
Date:
On 29/04/2004 09:53 Anton Nikiforov wrote:
> looks like after a few tests that i've done i'm confused more than
> before.
> I did create a table and inserted 200000 random records of two different
> types of data into it.

Did you analyze the table afterwards?

--
Paul Thomas
+------------------------------+---------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for
Business             |
| Computer Consultants         |
http://www.thomas-micro-systems-ltd.co.uk   |
+------------------------------+---------------------------------------------+

Re: Partial index question

From
Anton Nikiforov
Date:
Paul Thomas пишет:

>
> On 29/04/2004 09:53 Anton Nikiforov wrote:
>
>> looks like after a few tests that i've done i'm confused more than
>> before.
>> I did create a table and inserted 200000 random records of two
>> different types of data into it.
>
>
> Did you analyze the table afterwards?
>
Hello and thanks for the comment. Yes. I did analyze.
Also thanks to Bruno Wolff III for his comments, i have found it easyear
to use his suggestion.
But while i was making my tests i found a way how to use partial or
complex indexes, but they make my selects slower than without indexes at
all :( (I was selecting * not a ma()). But i'll continue my tests.

--
Best regads,
Anton Nikiforov



Attachment

Re: Partial index question

From
"John Sidney-Woollett"
Date:
Anton Nikiforov said:
> Dear All,
> I have a question about using partial indexes.
> Lets say i have a table containing data types (table_datatype) and the
> table containing data entrys (table_data).
> While inserting into data entrys i have to number the entrys according
> to it's type, so i always have to do
> select max(id) from table_data where data_type=X;
> And then insert a new value into the table data with this type and index.
> Looks like there is no way to use sequences in this case without using
> different tables that will make application not so clear.
> But "my" way is not so clear also because i could get a collision while
> concurrent inserts, so i have to control insertion from the application
> and always check that it is unique.
> So i'm planning to use partable indexes and hope they will help in
> performance improving (the table data will contain millions of records
> of each type so without indexing the performance will be not good and it
> is not clear form me that it will be faster using complex index)
> I know that i can do
> create indexe ...... where type=X;
> But is there any way to create all types of indexes at a time of
> database creation without using triggers and creating indexes from it?
> The matter is that data types are being added by the user, so i do not
> know the indexes that i should create now.
> And what will be faster?
> CREATE UNIQUE INDEX type_index ON table_data (type, id);
> or
> CREATE UNIQUE INDEX type_1_index ON table_data (id) WHERE type=1;
> CREATE UNIQUE INDEX type_X_index ON table_data (id) WHERE type=X;

Not really answering your question directly, but some alternative
stragegies are:

If the set of types is small, then using sequences would be manageable.
Create a function that returns the next value for the passed-in data_type.
If the sequence does not yet exist for that id, you create the sequence,
and then return the sequence value.

If the set is large, then consider using an extra table which creates a
mapping between the data_type, and the last allocated value. Again create
a function to increment and return the next id for that data type.

My feeling (and I may be wrong) is that SELECT MAX(id) FROM table_data
WHERE type=x is always going to be less efficient that one of the above
methods regardless of the types of index you use, especially as the table
gets larger.

John Sidney-Woollett