Thread: Regular Expression for 'and' instead of 'or'
This expression matches the word socks or shoes or nike in product category
where productdescr ~* '(socks|shoes|nike)'
Does anybody know what the expression should be if I want to do 'and' of those key words instead of 'or' ?
Thanks,
Sam
En un mensaje anterior, Samuel J. Sutjiono escribió: > This expression matches the word socks or shoes or nike in product category > where productdescr ~* '(socks|shoes|nike)' > > Does anybody know what the expression should be if I want to do 'and' of those key words instead of 'or' ? You have to use something like: '(socks.*shoes.*nike)|(socks.*nike.*shoes)|...' where ... is every other possible combination. Not pretty, but that's a limitation of finite automata (or regular expressions, which are the same). Regards. Fernando P. Schapachnik Gerente de tecnología de red y sistemas de información VIA NET.WORKS ARGENTINA S.A. fschapachnik@vianetworks.com.ar Tel.: (54-11) 4323-3381
En un mensaje anterior, Samuel J. Sutjiono escribió: > Thank you very much Fernando. Do you know whether this expression (regex) > use index scan ? Not sure, but probably an EXPLAIN will tell. Good luck. Fernando P. Schapachnik Gerente de tecnología de red y sistemas de información VIA NET.WORKS ARGENTINA S.A. fschapachnik@vianetworks.com.ar Tel.: (54-11) 4323-3381
Thank you very much Fernando. Do you know whether this expression (regex) use index scan ? Regards, Sam ----- Original Message ----- From: "Fernando Schapachnik" <fschapachnik@vianetworks.com.ar> To: "Samuel J. Sutjiono" <ssutjiono@wc-group.com> Cc: <pgsql-sql@postgresql.org>; <pgsql-general@postgresql.org> Sent: Friday, February 22, 2002 1:15 PM Subject: Re: [SQL] [GENERAL] Regular Expression for 'and' instead of 'or' > En un mensaje anterior, Samuel J. Sutjiono escribió: > > This expression matches the word socks or shoes or nike in product category > > where productdescr ~* '(socks|shoes|nike)' > > > > Does anybody know what the expression should be if I want to do 'and' of those key words instead of 'or' ? > > You have to use something like: > > '(socks.*shoes.*nike)|(socks.*nike.*shoes)|...' > > where ... is every other possible combination. Not pretty, but that's > a limitation of finite automata (or regular expressions, which are > the same). > > Regards. > > > > Fernando P. Schapachnik > Gerente de tecnología de red > y sistemas de información > VIA NET.WORKS ARGENTINA S.A. > fschapachnik@vianetworks.com.ar > Tel.: (54-11) 4323-3381 > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) >
Fernando Schapachnik <fschapachnik@vianetworks.com.ar> writes: > You have to use something like: > '(socks.*shoes.*nike)|(socks.*nike.*shoes)|...' > where ... is every other possible combination. Not pretty, but that's > a limitation of finite automata (or regular expressions, which are > the same). Seems a lot easier to do WHERE field ~* 'socks' AND field ~* 'shoes' AND field ~* 'nike' regards, tom lane
Based on generic rules of Regular Expression, the default is a AND. For example RE-1RE-2RE-3 is really (RE-1) and (RE-2) and (RE-3). However Regular Expression also provides the "Alternation Metachar" to change it from a default AND to a OR. The problem however is if you have target-1 junk target-2 junk junk target-3 Then the simple (target-1target-2target-3) would not work, as you are implying they are following each other. So the fix could be productdescr ~* '(socks)' AND productdescr ~* '(shoes)' AND productdescr ~*(nike)' Effectively you are creating a composite expression consisting of three sub-expression. "Samuel J. Sutjiono" wrote: > This expression matches the word socks or shoes or nike in product > categorywhere productdescr ~* '(socks|shoes|nike)' Does anybody know > what the expression should be if I want to do 'and' of those key > words instead of 'or' ? Thanks,Sam -- ------------------------------------------------------------------------- Medi Montaseri medi@CyberShell.com Unix Distributed Systems Engineer HTTP://www.CyberShell.com CyberShell Engineering -------------------------------------------------------------------------
Samuel J. Sutjiono writes: > This expression matches the word socks or shoes or nike in product category > where productdescr ~* '(socks|shoes|nike)' > > Does anybody know what the expression should be if I want to do 'and' > of those key words instead of 'or' ? You probably want something like productdescr ~* 'socks' AND productdescr ~* 'shoes' AND productdescr ~* 'nike' I don't think it's possible to do this for the general case with just a regular expression. -- Peter Eisentraut peter_e@gmx.net
I'm having an interesting time trying to figure out some behavior with postgresql indexes that I am trying to understand. I have a table, historyticket. In that table, I have a column, fk_opener_id, which is an integer column. I have an index (default btree index) on the fk_opener_id column in that table. When I select from this table with the following query: select * from historyticket where fk_opener_id = ? The query sometimes uses the index, and sometimes does a sequential scan. I experimented for a little bit and found out that if the number of rows that match the query is greater than a certain number (somewhere around 1000 rows from what I can tell) then the index is not used. For example: ============= testing=# explain select * from historyticket where fk_opener_id = 67; NOTICE: QUERY PLAN: Seq Scan on historyticket (cost=0.00..768.62 rows=1246 width=419) testing=# select count(*) from historyticket where fk_opener_id = 67; count ------- 1158 (1 row) --------- testing=# explain select * from historyticket where fk_opener_id = 4; NOTICE: QUERY PLAN: Index Scan using fk_opener_id on historyticket (cost=0.00..179.47 rows=47 width=419) testing=# select count(*) from historyticket where fk_opener_id = 4; count ------- 79 (1 row) =============== I did more queries and confirmed that when the number of rows returned is below a certain number (I don't have enough data to determine the exact number) the index is used, and when it is above a certain number, it is not used. Can anyone explain to me what is happening / why it is happening / how to make the indexes work correctly? Thanks. Brian
On Wed, Mar 13, 2002 at 04:28:00PM -0500, Brian Knox wrote: > > I'm having an interesting time trying to figure out some behavior with > postgresql indexes that I am trying to understand. [snip] > I did more queries and confirmed that when the number of rows returned is > below a certain number (I don't have enough data to determine the exact > number) the index is used, and when it is above a certain number, it is > not used. > > Can anyone explain to me what is happening / why it is happening / how to > make the indexes work correctly? Well, checking a tuple from an index is more expensive than checking a tuple from a sequential scan. So, if you want to select 50% of the table, it's faster to read the whole table than it is to use the index. The planner tries to guess where the break-even point is. Above, seq scan, below index scan. This is a FAQ, IIRC. HTH, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > 11/9/2001 - a new beginning or the beginning of the end?