Re: Can the query planner create indexes? - Mailing list pgsql-general

From Jeremy Harris
Subject Re: Can the query planner create indexes?
Date
Msg-id 4D10C90E.7070403@wizmail.org
Whole thread Raw
In response to Re: Can the query planner create indexes?  (tv@fuzzy.cz)
Responses Re: Can the query planner create indexes?  (bricklen <bricklen@gmail.com>)
Re: Can the query planner create indexes?  (Tomas Vondra <tv@fuzzy.cz>)
List pgsql-general
On 2010-12-21 14:26, tv@fuzzy.cz wrote:
>> Why not auto-create indices for some limited period after database load
>> (copy?  any large number of inserts from a single connection?), track
>> those
>> that actually get re-used and remove the rest?   Would this not provide
>> a better out-of-the-box experience for neophytes?
>
> Say you have a table with several columns (A,B,C), and the query is using
> some of them. What indexes would you create? One index on every column? A
> multi-column index on all columns? Indexs for each combination of columns?

One single-column index, on the first index-worthy column appearing.
Keep it simple.    Maybe, while you're doing that full-table-scan. gather
stats on all the indexable columns for later reference, to guide choice of
which column to index later.

> There really is no automatic way to solve this puzzle using a single
> query. Indexing strategy is a very tough design discipline, and it
> requires a complex knowledge of the workload. One slow query does not mean
> the index should be created - what if that was just an ad-hoc query and
> will not be executed ever again?

Then the index you just built gets automatically dropped, as I said above.

> I really don't want to work with products that try to be smarter than me
> (and I admit I'm pretty dumb from time to time) and act rather randomly
> due to this 'artificial intelligence'. I've already been there and I don't
> want to repeat this experience.

Then, since you're not a neophyte, leave the feature turned off.   But don't
deny the possibility of using it to actual neophytes.


>>> as in b): Creating an index is quite expensiv
>>
>> How much more so than doing that full-table-scan plus sort, which your
>> query is doing anyway?
>
> A lot. It consumes a lot of CPU, it may consume a lot of memory (up to
> maintenance_work_mem) etc. So imagine a few users, building indices on a
> big table simultaneously

Pffthht.   One simple trylock, used only by the auto-indexer.

>
> Building an index is just one side of the problem - maintenance of the
> indexes is another thing. Each index has an impact on write operations
> (INSERT/UPDATE) and may cause that HOT actually does not work (thus
> causing unnecessary bloat of the index).

This is worth discussing.    How much help does the DBMS currently give
the DBA in evaluating these tradeoffs?    Could we do better, given an
actual installation and workload?

> And the indexes may need lot of
> space on a disk.

By all means require limits as well as a "don't do that" switch.

>
> But the real show stopper is probably locking. Building an index takes a
> write lock on a table, effectively blocking writes. Sure, you can use a
> 'CREATE INDEX ... CONCURRENTLY' introduced in 8.2, but there are some
> disadvantages of that (see
> http://www.postgresql.org/docs/9.0/static/sql-createindex.html#SQL-CREATEINDEX-CONCURRENTLY).

The only disadvantages I see there are a) the leftover "invalid" index - which feels like a
bug; why is it not auto-dropped?   and b) the second scan, which implies more total work
and a wish to background that portion after completing the query triggering
the auto-index.

Don't forget I suggested doing this only for a limited time after DB creation.  Possibly another
reason to turn it off should be any manual index creation, as that implies that the DBA
knows about indexing.

I see in another thread you suggest merely placing hints in the log.   That's a fine
first step - but I'll then be wanting to auto-parse that log to auto-create....

Cheers,
     Jeremy


pgsql-general by date:

Previous
From: tv@fuzzy.cz
Date:
Subject: Re: Can the query planner create indexes?
Next
From: bricklen
Date:
Subject: Re: Can the query planner create indexes?