Thread: Indexing questions: Index == key? And index vs substring - how successful?
As the title of this message suggests, I've got a couple of questions about indexing that I'm not sure about. I've tried to take a look at the docs, but I can't remember seeing anything on these; it's quite possible, I admit, that I'm simply not remembering all of what I saw, but I would appreciate it if someone would help me to understand these.
1. Does an indexed column on a table have to be a potential primary key? I've been working with a couple of rather large tables where a common select is on a foreign key called 'cntrct_id' (Varchar(9) in format). However, the same 'cntrct_id' can appear on multiple records in the tables I'm trying to work with now; the tables themselves record events associated with the given 'cntrct_id' record and can store many events for one 'cntrct_id' value. I'd thought that creating an index on the table.cntrct_id field for the event tables would allow me to speed up the transations some, but comparisons of time before and after the indexing lead me to wonder if I was mistaken in this. The times were almost identical in the following areas: Before Indexing, after Indexing but before Analyzing, and after Analyzing.
2. Another common sort on these fields uses part, not all, of the 'cntrct_id' value to search for things; the first character marks original location in an internal framework we're using, for example, and the third character marks the month of the year that the original 'cntrct_id' record was set up. Sorts on either of those are fairly common as well; would indexing on the cntrct_id as a whole be able to speed up a sort on a portion of it?
I have in mind something like this:
select * from [event table] where substring(cntrct_id, 3,1) = 'H';
which should select any event records associated with 'cntrct_id' values initally set up in August. (Jan = A, Feb = B, etc)
If I established an index on the 'cntrct_id' field in the event tables, would it assist in speeding up the substring-based search, or would it not be effective at doing so?
Thank you for your assistance.
Ahhh...imagining that irresistible "new car" smell?
Check out new cars at Yahoo! Autos.
Re: Indexing questions: Index == key? And index vs substring - how successful?
From
Martijn van Oosterhout
Date:
On Thu, May 03, 2007 at 01:42:44PM -0700, Andrew Edson wrote: > As the title of this message suggests, I've got a couple of questions about indexing that I'm not sure about. I've triedto take a look at the docs, but I can't remember seeing anything on these; it's quite possible, I admit, that I'm simplynot remembering all of what I saw, but I would appreciate it if someone would help me to understand these. > > 1. Does an indexed column on a table have to be a potential primary key? No, that's the difference between unique and non-unique indexes. > The times were almost identical in the following areas: Before Indexing, after Indexing but before Analyzing, and afterAnalyzing. If you want reasons, you're going to need to provide EXPLAIN ANALYSE output. > I have in mind something like this: > select * from [event table] where substring(cntrct_id, 3,1) = 'H'; > which should select any event records associated with 'cntrct_id' values initally set up in August. (Jan = A, Feb =B, etc) > > If I established an index on the 'cntrct_id' field in the event > tables, would it assist in speeding up the substring-based search, > or would it not be effective at doing so? Not directly, no. However, you can have indexes on expressions: CREATE INDEX foo ON bar((substring(cntrct_id, 3,1))); Which could speed up the above query (could, since it depends on exactly how much of the table actually needs to be searched...) Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
> 1. Does an indexed column on a table have to be a potential primary > key? Nope, create as many index as you need/must/should. > I've been working with a couple of rather large tables where a common > select is on a foreign key called 'cntrct_id' (Varchar(9) in format). > However, the same 'cntrct_id' can appear on multiple records in the > tables I'm trying to work with now; the tables themselves record events > associated with the given 'cntrct_id' record and can store many events > for one 'cntrct_id' value. I'd thought that creating an index on the > table.cntrct_id field for the event tables would allow me to speed up > the transations some, but comparisons of time before and after the > indexing lead me to wonder if I was mistaken in this. The times were > almost identical in the following areas: Before Indexing, after Indexing > but before Analyzing, and after Analyzing. > 2. Another common sort on these fields uses part, not all, of the > 'cntrct_id' value to search for things; the first character marks > original location in an internal framework we're using, for example, and > the third character marks the month of the year that the original > 'cntrct_id' record was set up. Sorts on either of those are fairly > common as well; would indexing on the cntrct_id as a whole be able to > speed up a sort on a portion of it? Nope. This looks like suboptimal schema design... If you had an indexed date column, you would be able to make fast indexed queries with BETWEEN, >=, <=, etc.