Thread: temporary indexes?

temporary indexes?

From
Jonathan Vanasco
Date:
I couldn't find any mention of this on the archives...

Have the project maintainers ever considered extending CREATE INDEX to support "temporary" indexes like CREATE
TEMPORARYTABLE? 

When creating temporary tables for analytics/reporting, I've noticed that I often need to create (then drop) indexes on
regulartables.  Temporary indexes seemed like a natural fit here, so i was wondering if there was any reason why
they'renot supported (other than no one wanted it!) 

Re: temporary indexes?

From
Adrian Klaver
Date:
On 10/21/2015 11:43 AM, Jonathan Vanasco wrote:
> I couldn't find any mention of this on the archives...
>
> Have the project maintainers ever considered extending CREATE INDEX to support "temporary" indexes like CREATE
TEMPORARYTABLE? 
>
> When creating temporary tables for analytics/reporting, I've noticed that I often need to create (then drop) indexes
onregular tables.  Temporary indexes seemed like a natural fit here, so i was wondering if there was any reason why
they'renot supported (other than no one wanted it!) 

Something like this?:

aklaver@test=> create temporary table temp_test(id int, fld_1 varchar);
CREATE TABLE

aklaver@test=> create index temp_idx on temp_test(fld_1);
CREATE INDEX

aklaver@test=> \d temp_test
       Table "pg_temp_2.temp_test"
  Column |       Type        | Modifiers
--------+-------------------+-----------
  id     | integer           |
  fld_1  | character varying |
Indexes:
     "temp_idx" btree (fld_1)

>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: temporary indexes?

From
Jeff Janes
Date:
On Wed, Oct 21, 2015 at 11:50 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 10/21/2015 11:43 AM, Jonathan Vanasco wrote:
I couldn't find any mention of this on the archives...

Have the project maintainers ever considered extending CREATE INDEX to support "temporary" indexes like CREATE TEMPORARY TABLE?

When creating temporary tables for analytics/reporting, I've noticed that I often need to create (then drop) indexes on regular tables.  Temporary indexes seemed like a natural fit here, so i was wondering if there was any reason why they're not supported (other than no one wanted it!)

Something like this?:

aklaver@test=> create temporary table temp_test(id int, fld_1 varchar);
CREATE TABLE

aklaver@test=> create index temp_idx on temp_test(fld_1);
CREATE INDEX


I think he means more like:

create temporary table temp_test(id int, fld_1 varchar);
create temporary index on permanent_table (fld_1);

select something from temp_test join permanent_table using (fld_1) where a=b;
select something_else from temp_test join permanent_table using (fld_1) where c=d;

Cheers,

Jeff

Re: temporary indexes?

From
Jonathan Vanasco
Date:
On Oct 21, 2015, at 2:59 PM, Jeff Janes wrote:
> I think he means more like:
>
> create temporary table temp_test(id int, fld_1 varchar);
> create temporary index on permanent_table (fld_1);
>
> select something from temp_test join permanent_table using (fld_1) where a=b;
> select something_else from temp_test join permanent_table using (fld_1) where c=d;

Yes. That's exactly what I mean:  A "temporary index" on a "permanent table" , which expires in the same manner of a
"temporarytable". 




Re: temporary indexes?

From
melvin6925
Date:
What Adrian is saying is that there is no need for "temporary" indexes. You can create the idxs on a temp table and they get dropped when you drop the table.



Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone
-------- Original message --------
From: Adrian Klaver <adrian.klaver@aklaver.com>
Date: 10/21/2015 14:50 (GMT-05:00)
To: Jonathan Vanasco <postgres@2xlp.com>, PostgreSQL mailing lists <pgsql-general@postgresql.org>
Subject: Re: [GENERAL] temporary indexes?

On 10/21/2015 11:43 AM, Jonathan Vanasco wrote:
> I couldn't find any mention of this on the archives...
>
> Have the project maintainers ever considered extending CREATE INDEX to support "temporary" indexes like CREATE TEMPORARY TABLE?
>
> When creating temporary tables for analytics/reporting, I've noticed that I often need to create (then drop) indexes on regular tables.  Temporary indexes seemed like a natural fit here, so i was wondering if there was any reason why they're not supported (other than no one wanted it!)

Something like this?:

aklaver@test=> create temporary table temp_test(id int, fld_1 varchar);
CREATE TABLE

aklaver@test=> create index temp_idx on temp_test(fld_1);
CREATE INDEX

aklaver@test=> \d temp_test
       Table "pg_temp_2.temp_test"
  Column |       Type        | Modifiers
--------+-------------------+-----------
  id     | integer           |
  fld_1  | character varying |
Indexes:
     "temp_idx" btree (fld_1)

>


--
Adrian Klaver
adrian.klaver@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: temporary indexes?

From
Adrian Klaver
Date:
On 10/21/2015 12:27 PM, Jonathan Vanasco wrote:
>
> On Oct 21, 2015, at 2:59 PM, Jeff Janes wrote:
>> I think he means more like:
>>
>> create temporary table temp_test(id int, fld_1 varchar);
>> create temporary index on permanent_table (fld_1);
>>
>> select something from temp_test join permanent_table using (fld_1) where a=b;
>> select something_else from temp_test join permanent_table using (fld_1) where c=d;
>
> Yes. That's exactly what I mean:  A "temporary index" on a "permanent table" , which expires in the same manner of a
"temporarytable". 

I misunderstood then. The only thing I can think of is to wrap in a
transaction, though that presents other issues with open transactions
and/or errors in the transaction.

>
>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: temporary indexes?

From
Jonathan Vanasco
Date:
On Oct 21, 2015, at 3:42 PM, Adrian Klaver wrote:

> I misunderstood then. The only thing I can think of is to wrap in a transaction, though that presents other issues
withopen transactions and/or errors in the transaction. 

I just explicitly drop.  The convenience of an auto-drop would be a nice backup.

Transactions and table-locking issues are probably why temporary indexes don't exist.

Re: temporary indexes?

From
Adrian Klaver
Date:
On 10/21/2015 01:28 PM, Jonathan Vanasco wrote:
>
> On Oct 21, 2015, at 3:42 PM, Adrian Klaver wrote:
>
>> I misunderstood then. The only thing I can think of is to wrap in a transaction, though that presents other issues
withopen transactions and/or errors in the transaction. 
>
> I just explicitly drop.  The convenience of an auto-drop would be a nice backup.
>
> Transactions and table-locking issues are probably why temporary indexes don't exist.
>

On later versions there is CREATE INDEX CONCURRENTLY which alleviates
locking issues at the expense of time. I would think the greater issue
is the time and overhead of building an index for a table of any size
would eat into 'temporary'. Seems if you are joining temporary tables
against permanent tables on a regular basis it would pay just to keep
the indexes on the permanent tables and pay the expense over a longer
period of time.

--
Adrian Klaver
adrian.klaver@aklaver.com


Re: temporary indexes?

From
Andreas Kretschmer
Date:
Jonathan Vanasco <postgres@2xlp.com> wrote:

> I couldn't find any mention of this on the archives...
>
> Have the project maintainers ever considered extending CREATE INDEX to
> support "temporary" indexes like CREATE TEMPORARY TABLE?

Not sure if you mean something like this:

http://www.depesz.com/2015/09/07/hypothetical-indexes/


> When creating temporary tables for analytics/reporting, I've noticed
> that I often need to create (then drop) indexes on regular tables.

Than you have to create this index. But the creation of an index is
expensive, maybe it's cheaper do the nalytics/reporting without an
index.

And yes: you can create an indexe concurrently.


Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°


Re: temporary indexes?

From
vincent elschot
Date:

On 21-10-15 20:43, Jonathan Vanasco wrote:
> I couldn't find any mention of this on the archives...
>
> Have the project maintainers ever considered extending CREATE INDEX to support "temporary" indexes like CREATE
TEMPORARYTABLE? 
>
> When creating temporary tables for analytics/reporting, I've noticed that I often need to create (then drop) indexes
onregular tables.  Temporary indexes seemed like a natural fit here, so i was wondering if there was any reason why
they'renot supported (other than no one wanted it!) 
>
Do you mean creating a temporary index on a non-temporary table to speed
up the queries that fills the temporary table?


Re: temporary indexes?

From
Jim Nasby
Date:
On 10/21/15 3:28 PM, Jonathan Vanasco wrote:
>
> On Oct 21, 2015, at 3:42 PM, Adrian Klaver wrote:
>
>> I misunderstood then. The only thing I can think of is to wrap in a transaction, though that presents other issues
withopen transactions and/or errors in the transaction. 
>
> I just explicitly drop.  The convenience of an auto-drop would be a nice backup.
>
> Transactions and table-locking issues are probably why temporary indexes don't exist.

I think it's more that no one has proposed it until now. It probably
wouldn't be terribly hard to add them... the biggest issue would
probably be changing the buffer management code so it didn't assume that
a temporary relation went into temporary buffers. Other than that,
presumably they'd work the same as temp tables.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


Re: temporary indexes?

From
Tom Lane
Date:
Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
> On 10/21/15 3:28 PM, Jonathan Vanasco wrote:
>> Transactions and table-locking issues are probably why temporary indexes don't exist.

> I think it's more that no one has proposed it until now. It probably
> wouldn't be terribly hard to add them... the biggest issue would
> probably be changing the buffer management code so it didn't assume that
> a temporary relation went into temporary buffers.

Uh, why would you do that?  You'd be throwing away one of the principal
performance advantages of temp tables.

            regards, tom lane


Re: temporary indexes?

From
Jim Nasby
Date:
On 10/22/15 12:36 PM, Tom Lane wrote:
> Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
>> On 10/21/15 3:28 PM, Jonathan Vanasco wrote:
>>> Transactions and table-locking issues are probably why temporary indexes don't exist.
>
>> I think it's more that no one has proposed it until now. It probably
>> wouldn't be terribly hard to add them... the biggest issue would
>> probably be changing the buffer management code so it didn't assume that
>> a temporary relation went into temporary buffers.
>
> Uh, why would you do that?  You'd be throwing away one of the principal
> performance advantages of temp tables.

This would be for temporary *indexes* on permanent tables.

Actually, it depends on what behavior you'd expect from a temporary
index. If it was only going to exist for the duration of a REPEATABLE
READ transaction it wouldn't care about concurrent DML on the table, so
the index could use temp buffers and the index creation could take
shortcuts as well, since it'd only need to index tuples that satisfy
that transaction's snapshot.

OTOH, if you had anything looser than that the index would need to
operate the same as a regular index, so all other backends would need to
update it.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


Re: temporary indexes?

From
Tom Lane
Date:
Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
> On 10/22/15 12:36 PM, Tom Lane wrote:
>> Uh, why would you do that?  You'd be throwing away one of the principal
>> performance advantages of temp tables.

> Actually, it depends on what behavior you'd expect from a temporary
> index. If it was only going to exist for the duration of a REPEATABLE
> READ transaction it wouldn't care about concurrent DML on the table, so
> the index could use temp buffers and the index creation could take
> shortcuts as well, since it'd only need to index tuples that satisfy
> that transaction's snapshot.

> OTOH, if you had anything looser than that the index would need to
> operate the same as a regular index, so all other backends would need to
> update it.

Hmm, good point.  Still, that means that such indexes would be
considerably more expensive than true temp indexes, because of the
concurrency and locking requirements, which would be just like regular
indexes.  AFAICS it would be better to think of them as unlogged indexes,
because suppressing WAL logging is all you could get out of it.

FWIW, I don't find much attraction in the idea of building an index for
use by a single query.  There basically isn't any scenario where that's
going to beat running a plan that doesn't require the index.  The value of
an index is generally to avoid a whole-table scan and/or a sort, but
you'll necessarily pay those costs to make the index.

            regards, tom lane


Re: temporary indexes?

From
Jonathan Vanasco
Date:

On Oct 22, 2015, at 2:08 PM, Tom Lane wrote:
> FWIW, I don't find much attraction in the idea of building an index for
> use by a single query.  There basically isn't any scenario where that's
> going to beat running a plan that doesn't require the index.  The value of
> an index is generally to avoid a whole-table scan and/or a sort, but
> you'll necessarily pay those costs to make the index.


On Oct 22, 2015, at 8:17 AM, vincent elschot wrote:
> Do you mean creating a temporary index on a non-temporary table to speed up the queries that fills the temporary
table?

One of the use-cases is speeding up inserts on create, but another is for periodic analytics routines (which we handle
withexplicit create/drop index commands. 

In one example of our analytics routines, we end up needing to create/drop about 15 indexes to optimize 45 queries.
Thisspeeds up the execution by 1000% and minimizes RAM usage.  We don't keep the indexes active, because we only need
themfor analytics and the overhead of managing them during high write periods during the day is noticeable.  Creating
anddropping these indexes on-demand gives us all the benefit with none of the drawbacks. 

Re: temporary indexes?

From
Jim Nasby
Date:
On 10/22/15 3:15 PM, Jonathan Vanasco wrote:
> On Oct 22, 2015, at 8:17 AM, vincent elschot wrote:
>> Do you mean creating a temporary index on a non-temporary table to speed up the queries that fills the temporary
table?
>
> One of the use-cases is speeding up inserts on create, but another is for periodic analytics routines (which we
handlewith explicit create/drop index commands. 
>
> In one example of our analytics routines, we end up needing to create/drop about 15 indexes to optimize 45 queries.
Thisspeeds up the execution by 1000% and minimizes RAM usage.  We don't keep the indexes active, because we only need
themfor analytics and the overhead of managing them during high write periods during the day is noticeable.  Creating
anddropping these indexes on-demand gives us all the benefit with none of the drawbacks. 

What % of execution time is spent creating those indexes? Or is that
factored into the 1000%? Also, could your analysis queries be run in a
REPEATABLE READ transaction (meaning that once the transaction starts it
doesn't get any new data)? If it could then the temp indexes could be
static, which would mean no update overhead.

Tom's point about unlogged indexes is a good one, I hadn't thought about
that.

Something else to think about here is catalog churn. So the ideal case
for an index that couldn't be static would be an unlogged index that you
could invalidate and rebuild at will. When you're done with it you
invalidate it and maintenance costs go to essentially 0. Rebuilds and
maintenance are cheaper because they're not WAL logged.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


Re: temporary indexes?

From
Jonathan Vanasco
Date:
On Oct 22, 2015, at 5:04 PM, Jim Nasby wrote:

>
> What % of execution time is spent creating those indexes? Or is that factored into the 1000%? Also, could your
analysisqueries be run in a REPEATABLE READ transaction (meaning that once the transaction starts it doesn't get any
newdata)? If it could then the temp indexes could be static, which would mean no update overhead. 

Running without the indexes would take over an hour to execute the scripts, and totally jams the machine (we got
30minutesin once, and had to kill it).  That's because of millions of rows used in joins and sequential scans.   

Building all the indexes takes 30 seconds; most SQL commands then run only against the indexes (some of which are
partial)and the entire suite finishes in about 3 minutes. 

If the indexes stay active during the day, there seems to be a 2-3% drop in write performance.   This is on a webapp,
sowe're just happier shifting the index work from peak hours to offpeak hours.  It means we can delay spinning up
anotherapplication server a bit longer. 

I'll definitely look into your suggestions the next time I hit this code.