Re: Re: Advice - indexing on varchar fields where only last x characters known - Mailing list pgsql-sql

From Gary Stainburn
Subject Re: Re: Advice - indexing on varchar fields where only last x characters known
Date
Msg-id 201310180822.48251.gary.stainburn@ringways.co.uk
Whole thread Raw
In response to Re: Advice - indexing on varchar fields where only last x characters known  (David Johnston <polobo@yahoo.com>)
Responses Re: Re: Advice - indexing on varchar fields where only last x characters known  (skinner@britvault.co.uk (Craig R. Skinner))
Re: Advice - indexing on varchar fields where only last x characters known  (David Johnston <polobo@yahoo.com>)
List pgsql-sql
On Thursday 17 October 2013 20:16:30 David Johnston wrote:
> Gary Stainburn wrote
>
> > However, it means that every time I'm trying to connect various tables up
> > using foreign keys
>
> The degree to which each input source guarantees uniqueness of a given VIN
> matters.  Keep in mind that any system that required the user to manually
> enter the VIN has the propensity for errors.  Either outright invalid VINs
> or marginally correct VINs with typos (which mean the VIN might be less or
> more than 17 characters even if the 17-character version was intended).
> Specifically it is not uncommon for the VIN to be made-up when it is a
> required field but the user does not know what the VIN is.
>
> 6 characters are unique within a model year but you need at least 8
> characters to be generally unique for a given manufacturer.
>
> For foreign key purposes it may be worthwhile to generate a "matching"
> table and then during import use an algorithm to match up different
> records.  Then during general queries that table can be used for joins.  In
> this way you only pay the price of matching once and that during import as
> opposed to during user requests.  Having a canonical VIN table helps here
> though during import that table then has to be maintained.  The added
> advantage is that such a mapping table allows you to search against a
> single table and such a table (and likely its indexes) should be fairly
> small so as to make good use of memory.
>
> David J.

David

You raise a number of valid points here. Thankfully it is very rare that the 
user creates a VIN number so the chances of creating the wrong VIN is slim. 
Most data is created from data feeds from external sources. Although these 
may also be wrong (even Ford get it wrong some times) the chances are slim.

Where a user enters a 6 character VIN the times that more than one vehicle is 
returned is negligable, and where it does I return an error and ask for a 
longer VIN, which may be anywhere between 8 and the full 17 characters.

I like the idea of using reverse() to create the index, and will investigate 
that. Will it work on an old server running 8.3.3?

On one system I did generate a second field fixed at 11 characters for use 
only with foreign keys and used a function and trigger to automatically 
update it based on the value of the original VIN field.

I don't like it because it goes against the principles of relational databases 
in that it is redundant information, and was wondering if there were better 
options.  

While the reverse() option may work for searching it will not be ideal for 
foreign keys.



-- 
Gary Stainburn
Group I.T. Manager
Ringways Garages
http://www.ringways.co.uk 



pgsql-sql by date:

Previous
From: David Johnston
Date:
Subject: Re: Advice - indexing on varchar fields where only last x characters known
Next
From: skinner@britvault.co.uk (Craig R. Skinner)
Date:
Subject: Re: Re: Advice - indexing on varchar fields where only last x characters known