Indexing questions: Index == key? And index vs substring - how successful?
As the title of this message suggests, I've got a couple of questions about indexing that I'm not sure about. I've tried to take a look at the docs, but I can't remember seeing anything on these; it's quite possible, I admit, that I'm simply not remembering all of what I saw, but I would appreciate it if someone would help me to understand these.
1. Does an indexed column on a table have to be a potential primary key? I've been working with a couple of rather large tables where a common select is on a foreign key called 'cntrct_id' (Varchar(9) in format). However, the same 'cntrct_id' can appear on multiple records in the tables I'm trying to work with now; the tables themselves record events associated with the given 'cntrct_id' record and can store many events for one 'cntrct_id' value. I'd thought that creating an index on the table.cntrct_id field for the event tables would allow me to speed up the transations some, but comparisons of time before and after the indexing lead me to wonder if I was mistaken in this. The times were almost identical in the following areas: Before Indexing, after Indexing but before Analyzing, and after Analyzing.
2. Another common sort on these fields uses part, not all, of the 'cntrct_id' value to search for things; the first character marks original location in an internal framework we're using, for example, and the third character marks the month of the year that the original 'cntrct_id' record was set up. Sorts on either of those are fairly common as well; would indexing on the cntrct_id as a whole be able to speed up a sort on a portion of it?
I have in mind something like this:
select * from [event table] where substring(cntrct_id, 3,1) = 'H';
which should select any event records associated with 'cntrct_id' values initally set up in August. (Jan = A, Feb = B, etc)
If I established an index on the 'cntrct_id' field in the event tables, would it assist in speeding up the substring-based search, or would it not be effective at doing so?
Thank you for your assistance.
---------------------------------
Ahhh...imagining that irresistible "new car" smell?
Check outnew cars at Yahoo! Autos.
On Thu, May 03, 2007 at 01:42:44PM -0700, Andrew Edson wrote:
As the title of this message suggests, I've got a couple of questions about indexing that I'm not sure about. I've tried to take a look at the docs, but I can't remember seeing anything on these; it's quite possible, I admit, that I'm simply not remembering all of what I saw, but I would appreciate it if someone would help me to understand these.
1. Does an indexed column on a table have to be a potential primary key?
No, that's the difference between unique and non-unique indexes.
The times were almost identical in the following areas: Before Indexing, after Indexing but before Analyzing, and after Analyzing.
If you want reasons, you're going to need to provide EXPLAIN ANALYSE
output.
I have in mind something like this:
select * from [event table] where substring(cntrct_id, 3,1) = 'H';
which should select any event records associated with 'cntrct_id' values initally set up in August. (Jan = A, Feb = B, etc)If I established an index on the 'cntrct_id' field in the event
tables, would it assist in speeding up the substring-based search,
or would it not be effective at doing so?
Not directly, no. However, you can have indexes on expressions:
CREATE INDEX foo ON bar((substring(cntrct_id, 3,1)));
Which could speed up the above query (could, since it depends on
exactly how much of the table actually needs to be searched...)
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Show quoted text
From each according to his ability. To each according to his ability to litigate.
1. Does an indexed column on a table have to be a potential primary
key?
Nope, create as many index as you need/must/should.
I've been working with a couple of rather large tables where a common
select is on a foreign key called 'cntrct_id' (Varchar(9) in format).
However, the same 'cntrct_id' can appear on multiple records in the
tables I'm trying to work with now; the tables themselves record events
associated with the given 'cntrct_id' record and can store many events
for one 'cntrct_id' value. I'd thought that creating an index on the
table.cntrct_id field for the event tables would allow me to speed up
the transations some, but comparisons of time before and after the
indexing lead me to wonder if I was mistaken in this. The times were
almost identical in the following areas: Before Indexing, after Indexing
but before Analyzing, and after Analyzing.
2. Another common sort on these fields uses part, not all, of the
'cntrct_id' value to search for things; the first character marks
original location in an internal framework we're using, for example, and
the third character marks the month of the year that the original
'cntrct_id' record was set up. Sorts on either of those are fairly
common as well; would indexing on the cntrct_id as a whole be able to
speed up a sort on a portion of it?
Nope.
This looks like suboptimal schema design...
If you had an indexed date column, you would be able to make fast indexed
queries with BETWEEN, >=, <=, etc.