Thread: optimizing LIKE '%2345' queries

optimizing LIKE '%2345' queries

From
Gene
Date:
Is there any way to create a reverse index on string columns so that
queries of the form:

where column like '%2345';

can use an index and perform as fast as searching with like '2345%'?

Is the only way to create a reverse function and create an index using
the reverse function and modify queries to use:

where reverse(column) like reverse('%2345') ?

thanks

--
Eugene Hart
Cell: 443-604-2679

Re: optimizing LIKE '%2345' queries

From
Mario Weilguni
Date:
On Sunday, 2 July 2006 23:50, Gene wrote:
> Is there any way to create a reverse index on string columns so that
> queries of the form:
>
> where column like '%2345';
>
> can use an index and perform as fast as searching with like '2345%'?
>
> Is the only way to create a reverse function and create an index using
> the reverse function and modify queries to use:
>
> where reverse(column) like reverse('%2345') ?
>
> thanks

create a trigger that computes the reversed value at insert/update time,
index that field, and rewrite the query this way:
where inverted_column like '5432%';
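
For illustration, a minimal sketch of that approach (table and column
names are hypothetical, and it assumes a reverse(text) function is
available, e.g. one written in plpgsql):

-- keep inverted_column in sync with col on every write
CREATE OR REPLACE FUNCTION t_fill_inverted() RETURNS trigger AS $$
BEGIN
    NEW.inverted_column := reverse(NEW.col);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t_fill_inverted_trg
    BEFORE INSERT OR UPDATE ON t
    FOR EACH ROW EXECUTE PROCEDURE t_fill_inverted();

CREATE INDEX t_inverted_column_idx ON t (inverted_column);

-- a left-anchored pattern on the inverted column can use the index
SELECT * FROM t WHERE inverted_column LIKE '5432%';

Note that for LIKE to use a plain btree index the database must run in
the C locale; otherwise create the index with the text_pattern_ops
operator class.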


Can PostgreSQL do the job for software deployed in ASP or SaaS mode?

From
David Gagnon
Date:
Hi all,

  I've been working on my personal project for 3.5 years now.  I
developed an ERP system in web/Java.  Now the people I will work with
suggest offering it in SaaS mode, which means my customers will connect
to my website and find their ERP software and data there.  It's not the
deployment I planned initially, so could you validate some technical
points, to be sure it's not crazy to use PostgreSQL here rather than a
big $$$ db to do the job.

Typically I will have 1 db per client and around 150 tables per db.
Since I hope I didn't work all those years for nothing, I expect to
have a bunch of clients, which means the same number of dbs since I
have 1 db/client.

Can I hope to have several hundred dbs on 1 db server?  Like 250 dbs =
250 clients = 37,500 tables!!!
So is there a limit on the number of dbs on a db server? (This spec is
not on the website.)
What about performance?  Can I expect to have the same performance?

Since I am putting everything on the web I do need a high-availability
infrastructure.  I looked into Slony-I and Mammoth to replicate the db,
but since Slony-I uses triggers, can I expect it to do the job?  Is
Mammoth the only available solution?

Last question, and not the least: I've been reading this performance
list for several years now, and I know hardware suggestions for running
PostgreSQL are discussed here.  Since I write software, there are
several points about hardware that I don't understand.  Do you have any
suggestion of a platform to run my SaaS configuration on?  I do need
the WISE one!  I'm pretty sure that if I were a big company I would be
able to throw a bunch of $$$$ at it, but that's not my case.  I'm
pretty sure there is some piece of hardware out there that would do the
job perfectly at a fair price.

So far I've understood that PostgreSQL loves Opterons, and I have
looked into the DL145 series from HP.  I've also heard that Dell
hardware is not reliable.  But it's still not clear what my
requirements should be for memory, disk, number of CPUs, CPU power,
etc.

I'm pretty sure it's better to have more, slower CPUs than the latest
Opteron available on the market, or more, slower servers than the
fastest one...  am I right?  But again, what is the optimal choice?

Thank you for sharing your knowledge on these points.  I have
considered PostgreSQL the smart choice for my project since the
beginning, but before putting in all the money (that I don't have..
:-)) to buy some hardware, I just want to be sure I'm not crazy!

Thanks for your help, I really appreciate it!!

Best Regards
/David

Re: Can PostgreSQL do the job for software deployed in ASP or SaaS mode?

From
Richard Broersma Jr
Date:
> Typically I will have 1 db per client and around 150 tables per db.
> Since I hope I didn't work all those years for nothing, I expect to
> have a bunch of clients, which means the same number of dbs since I
> have 1 db/client.
>
> Can I hope to have several hundred dbs on 1 db server?  Like 250 dbs =
> 250 clients = 37,500 tables!!!
> So is there a limit on the number of dbs on a db server? (This spec is
> not on the website.)

I'll take a stab at this question.

Each table and each database is referenced by an OID, so
count(tables) + count(databases) must stay below the maximum OID.
OIDs are unsigned 32-bit integers, so the maximum value is 4294967295.
Even if you limited yourself to 1% of the OID space for tables and
databases, you could still potentially have about 42 million of them.

Creating a database produces a directory named with the database's OID:
./data/base/10792
./data/base/10793
./data/base/16814
...
...

Since the creation of a new db produces a directory, one limitation
would come from your file system's limit on the number of
subdirectories allowed.

Each table within a database is assigned an OID and is stored as a file
inside the DB directory.  So if there is a file-system limit on the
number of files within a given directory, that would also limit the
number of tables that can be created in each database.
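
For illustration, both mappings can be inspected from SQL via the
system catalogs (a quick sketch; 'mytable' is a hypothetical table
name):

-- database name -> OID (the directory name under data/base)
SELECT oid, datname FROM pg_database;

-- table name -> the file that backs it inside that directory
SELECT relname, relfilenode FROM pg_class WHERE relname = 'mytable';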

The only other limitation that I am aware of is the storage capacity of your DB server.

If there are additional limitations beyond these, I would be interested
in knowing about them; adding them to http://www.postgresql.org/about/
would be helpful as well.

Regards,

Richard Broersma Jr.

Re: Can PostgreSQL do the job for software deployed in ASP or SaaS mode?

From
"Chris Hoover"
Date:
On 7/3/06, David Gagnon <dgagnon@siunik.com> wrote:

Can I hope to have several hundred dbs on 1 db server?  Like 250 dbs =
250 clients = 37,500 tables!!!
So is there a limit on the number of dbs on a db server? (This spec is
not on the website.)
What about performance?  Can I expect to have the same performance?


I am running a similar environment.  Each of our customers has a separate database with several hundred tables per database.  One of our servers is running over 200 customer databases with absolutely no problems.

HTH,

chris

Re: Can PostgreSQL do the job for software deployed in ASP or SaaS mode?

From
"Craig A. James"
Date:
Richard Broersma Jr wrote:
> Each table with-in the database is assigned an OID and is located inside the DB directory.  So if
> there is a file-system limitation on the number of files with-in a given directory it would also
> be a limit to the number of tables that could be created for each database.

You could handle this with tablespaces.  For example, create ten tablespaces, and then assign customer databases to
them in round-robin fashion.  This also lets you assign databases to different disks to balance the I/O load.
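
A sketch of that layout (paths and names are hypothetical):

-- one tablespace per disk
CREATE TABLESPACE disk1 LOCATION '/mnt/disk1/pgdata';
CREATE TABLESPACE disk2 LOCATION '/mnt/disk2/pgdata';

-- spread customer databases across them round-robin
CREATE DATABASE customer_001 TABLESPACE = disk1;
CREATE DATABASE customer_002 TABLESPACE = disk2;
CREATE DATABASE customer_003 TABLESPACE = disk1;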

Craig

Re: Can PostgreSQL do the job for software deployed in ASP or SaaS mode?

From
"Guoping Zhang"
Date:
Hi, Chris,
 
In your deployment, can you give a bit more detail, if available?  Many thanks!

My questions are:
  a)  How do you handle connection pooling?
  b)  If each client application has many threads of connections to the remote server, what is the likely performance penalty compared to a DBMS hosted on the same host as the client?

Indeed, our application requirements may be quite different, but the above two are my main concerns before porting a large application (from another vendor's DBMS).

We have several identical applications on different servers, each with 250+ database connections.  Currently each server hosts its own DBMS, but we want them to share one DBMS on a dedicated DBMS server (in the same LAN) if the performance penalty is small.  I wonder if anyone here can share comments and experience on this.  Many thanks.
 
Regards,
Guoping
 
-----Original Message-----
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org]On Behalf Of Chris Hoover
Sent: 4 July 2006 3:20
To: David Gagnon
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Can PostgreSQL do the job for software deployed in ASP or SaaS mode?

On 7/3/06, David Gagnon <dgagnon@siunik.com> wrote:

Can I hope to have several hundred dbs on 1 db server?  Like 250 dbs =
250 clients = 37,500 tables!!!
So is there a limit on the number of dbs on a db server? (This spec is
not on the website.)
What about performance?  Can I expect to have the same performance?


I am running a similar environment.  Each of our customers has a separate database with several hundred tables per database.  One of our servers is running over 200 customer databases with absolutely no problems.

HTH,

chris

Re: optimizing LIKE '%2345' queries

From
Tarhon-Onu Victor
Date:
On Sun, 2 Jul 2006, Gene wrote:

> can use an index and perform as fast as searching with like '2345%'?
>
> Is the only way to create a reverse function and create an index using
> the reverse function and modify queries to use:
>
> where reverse(column) like reverse('%2345') ?

     Hmm.. interesting.
     If (and only if) the records stored in the "column" column have a
fixed length (say, all are 50 characters long) you could create an index
on, say, substring(column,45,50), and use that expression in the WHERE
clauses of your queries.
     Or, if the length of those records is not the same, it may be
feasible to create an index on substring(column, length(column)-5,
length(column)).
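
For illustration, a sketch of the fixed-length variant (assuming
50-character values and a 4-character search suffix; names are
hypothetical):

-- index only the tail of the fixed-length value
CREATE INDEX t_col_tail_idx ON t (substring(col, 47, 4));

-- the query must use exactly the same expression as the index
SELECT * FROM t WHERE substring(col, 47, 4) = '2345';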

--
Any views or opinions presented within this e-mail are solely those of
the author and do not necessarily represent those of any company, unless
otherwise expressly stated.

Re: optimizing LIKE '%2345' queries

From
Gene
Date:
Thanks for the suggestion. Actually I went ahead and created a reverse
function using plpgsql, created an index on reverse(column), and now
my queries use "where reverse(column) like reverse('%2345')" and it's
using the index like I hoped it would! Now if I could figure out how
to optimize like '%2345%' queries. I don't want to create many
indexes, though; the table is very write-heavy.
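
For reference, a minimal sketch of that setup (this reverse() is one
possible plpgsql version, since PostgreSQL of this era has no built-in
reverse; table and column names are hypothetical):

CREATE OR REPLACE FUNCTION reverse(text) RETURNS text AS $$
DECLARE
    result text := '';
    i integer;
BEGIN
    -- walk the string back to front, one character at a time
    FOR i IN REVERSE length($1)..1 LOOP
        result := result || substring($1 FROM i FOR 1);
    END LOOP;
    RETURN result;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

-- expression index; the function must be IMMUTABLE for this
CREATE INDEX t_col_rev_idx ON t (reverse(col));

-- reverse('%2345') folds to the left-anchored constant '5432%'
SELECT * FROM t WHERE reverse(col) LIKE reverse('%2345');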

> > Is the only way to create a reverse function and create an index using
> > the reverse function and modify queries to use:
> >
> > where reverse(column) like reverse('%2345') ?
>
>         Hmm.. interesting.
>         If (and only if) the records stored in the "column" column have a
> fixed length (say, all are 50 characters long) you could create an index
> on, say, substring(column,45,50), and use that expression in the WHERE
> clauses of your queries.
>         Or, if the length of those records is not the same, it may be
> feasible to create an index on substring(column, length(column)-5,
> length(column)).
>
> --
> Any views or opinions presented within this e-mail are solely those of
> the author and do not necessarily represent those of any company, unless
> otherwise expressly stated.
>

Re: optimizing LIKE '%2345' queries

From
Chris
Date:
Gene wrote:
> Thanks for the suggestion. Actually I went ahead and created a reverse
> function using plpgsql, created an index on reverse(column), and now
> my queries use "where reverse(column) like reverse('%2345')" and it's
> using the index like I hoped it would! Now if I could figure out how
> to optimize like '%2345%' queries. I don't want to create many
> indexes, though; the table is very write-heavy.

You can't, because that text can be anywhere inside the field, so the
whole field basically has to be checked to see if it's there.

You could check out full text indexing (tsearch2).
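
For illustration, the usual tsearch2 pattern looks roughly like this (a
sketch for the contrib module of this era; table/column names are
hypothetical, and note that full text search matches whole words or
numbers, not arbitrary substrings like '%2345%'):

-- add a tsvector column, populate it, and index it
ALTER TABLE t ADD COLUMN fti tsvector;
UPDATE t SET fti = to_tsvector('default', col);
CREATE INDEX t_fti_idx ON t USING gist (fti);

-- keep it current on writes with the trigger function tsearch2 ships
CREATE TRIGGER t_fti_update BEFORE INSERT OR UPDATE ON t
    FOR EACH ROW EXECUTE PROCEDURE tsearch2(fti, col);

SELECT * FROM t WHERE fti @@ to_tsquery('2345');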

<shameless plug>
http://www.designmagick.com/article/27/PostgreSQL/Introduction-to-Full-Text-Indexing

--
Postgresql & php tutorials
http://www.designmagick.com/