Thread: Re: To all the pgsql developers..Have a look at the operators proposed by me in my researc

Re: To all the pgsql developers..Have a look at the operators proposed by me in my researc

From

Tasneem Memon

Date:

02 June 2007, 10:42:43

> CC: pgsql-hackers@postgresql.org
> From: decibel@decibel.org
> Subject: Re: [HACKERS] To all the pgsql developers..Have a look at the
> operators proposed by me in my research paper.
> Date: Fri, 1 Jun 2007 19:13:54 -0500
> To: tasneememon@hotmail.com
>
> On Jun 1, 2007, at 8:24 AM, Tasneem Memon wrote:
> > 1. NEAR
> >
> > It deals with the NUMBER and DATE datatypes simulating the human
> > behavior and processing the
>
> Why just number and date?

I have just started working on it for my MS research work. For the
moment I have written algorithms for these two datatypes only, but I
intend to implement these operators for other datatypes as well. Other
datatypes, especially those involving strings, are much more
complicated.

>
> > information contained in NEAR in the same way as we humans take it.
> > This is a binary operator with the syntax:
> > op1 NEAR op2
> > Here, op1 refers to an attribute, whereas op2 is a fixed value,
> > both of the same datatype.
> > Suppose we want a list of all the VGAs, price of which should be
> > somewhere around $30. The query will look like:
> >
> > SELECT *
> > FROM accessories
> > WHERE prod_name = 'VGA'
> > AND prod_price NEAR 30
> >
> > A query for the datatype DATE will look like:
> >
> > SELECT *
> > FROM sales
> > WHERE item = 'printer'
> > AND s_date NEAR 10-7-06
> >
> >
> > The algorithm for the NEAR operator works as follows:
> >
> > 1. The margins to the op2, i.e. m1 and m2, are added dynamically on
> > both the sides, considering the value it contains. To keep this
> > margin big is important for a certain reason discussed later.
> > 2. The NEAR operator is supposed to obtain the values near to the op2,
> > thus the target membership degree (md) is initially set to 0.8.
> > 3. The algorithm compares the op1 (column) values row by row to the
> > elements of the set that NEAR defined, i.e. the values from md 1.0
> > to 0.8, adding matching tuples to the result set.
>
> How would one change 0.8 to some other value?

We could make the system ask the user which membership degree s/he
wants, but we don't want to make the system interactive, with the user
supplying a membership degree of his/her choice. These operators are
supposed to work just like the other operators in SQL: you just put
them in the query and get a result. I chose 0.8 because in all the case
studies I have made for NEAR, 0.8 seems to be the best choice: 0.9
narrows the range, while 0.75 or 0.7 also brings in values that are
irrelevant. However, those values no longer seem irrelevant when there
are no values down to md 0.8, so the operator fetches them when they
are the NEARest.

I would also like to mention that this looks like defining a range the
way the BETWEEN operator does, but it is different: with BETWEEN we
define an exact, strict range. Anything outside that range won't be
included, even if the value might be of interest to the user querying
the system, and if there are no values within that range, the result
set is empty.
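
A rough Python sketch of the first three NEAR steps quoted above, to show
how choosing 0.9, 0.8, or 0.7 changes the result. It is not the paper's
implementation: the linear membership function, the helper names
`membership` and `near_at`, the sample prices, and the margin of 15 (half
of op2) are all assumptions made for illustration, since the post does not
say how m1 and m2 are computed.

```python
def membership(value, target, margin):
    """Assumed NEAR membership degree: 1.0 at target, falling linearly to
    about 0.1 at the margin edge, and 0.0 beyond it. margin must be > 0."""
    distance = abs(value - target)
    if distance > margin:
        return 0.0
    return 1.0 - 0.9 * distance / margin


def near_at(values, target, margin, md):
    """Single pass of steps 2 and 3: keep values whose degree is at least md."""
    return [v for v in values if membership(v, target, margin) >= md]


prices = [22, 27, 29, 31, 34, 36]  # hypothetical prod_price column
margin = 15                        # assumed: half of op2 = 30

for md in (0.9, 0.8, 0.7):
    print(md, near_at(prices, 30, margin, md))
```

With these assumed numbers, 0.9 keeps only 29 and 31, 0.8 adds 27, and 0.7
also admits 34.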

>
> > 4. It is very much possible that the result set is empty since
> > no values within the range exist in the column. Thus, the algorithm
> > checks for empty result set, and in that case, decreases the target
> > md by 0.2 and jumps to step 3. This is the reason big margins to
> > the op2 are added.
> > 5. In case there are no values in op1 that are between m1 and
> > m2 (where the membership degree of the values with respect to NEAR
> > becomes 0.1) and the result set is empty, the algorithm fetches the
> > two nearest values (tuples) to op2, one smaller and one larger than
> > the op2, as the result.
> >
> > The algorithm will give an empty result only if the table referred
> > to in the query is empty.
> >
> > 2. NOT NEAR
> >
> > This operator is also a binary operator, dealing with
> > the datatypes NUMBER and DATE. It has the syntax:
> > op1 NOT NEAR op2
> > The op1 refers to an attribute, whereas op2 is a fixed value, both
> > of the same data type.
> > A query containing the operator looks like:
> >
> > SELECT id, name, age, history
> > FROM casualties
> > WHERE cause = 'heart attack'
> > AND age NOT NEAR 55
> >
> > Or suppose we need a list of some event that is not clashing with
> > some commitment of ours:
> >
> > SELECT *
> > FROM events
> > WHERE e_name = 'concert'
> > AND date NOT NEAR 8/28/2007
> >
> > The algorithm for NOT NEAR works like this:
> > 1. First of all it adds the margins to the op2, i.e. m1 and m2,
> > dynamically on both the sides, considering the value op2 contains.
> > 2. op1 values outside the scope of the op2 (m1, m2) are retrieved and
> > added to the result.
> > 3. If the result set is empty, the farthest values within the op2
> > fuzzy set (those possessing the least membership degree) are
> > retrieved. This is done by continuing the search from values with
> > md=0.1 till the md=0.6, where the md for NOT NEAR reaches 0.4.
>
> Why isn't this just the exact opposite set of NEAR?

Because we are talking about fuzzy behavior, it doesn't have to be the
exact opposite, and it shouldn't be. If it were, we might not see
information that is important to us in the result set. This is
something you cannot define precisely; if it had to be precise, BETWEEN
already does a good job, and there would have been no need to introduce
these operators.

Even for different datatypes, different values of md seem to work
better. For NUMBER, op2 md = 0.6 (NOT NEAR md = 0.4) looks just right to
identify the margins, but for DATE it seems better to extend the margins
to op2 md = 0.85 (NOT NEAR md = 0.15) if we don't get any values up to
the previous margins.

> --
> Jim Nasby jim@nasby.net
> EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
>

I hope I have explained things better. My English is not very good, so I
am sorry if you couldn't get my point.
Any other comments/criticism welcome.
Regards,
-Tasneem Memon
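
The fallback behaviour described in the message above (NEAR steps 4 and 5,
and the NOT NEAR description) can be sketched in Python as follows. This is
one possible reading under stated assumptions, not the paper's algorithm:
the linear membership function, the margin value, the function names, and
the final "anything inside the margins" pass before step 5 are all
hypothetical, and `limit=0.6` is the NUMBER value mentioned in the message
(0.85 is suggested there for DATE).

```python
def membership(value, target, margin):
    """Assumed NEAR membership degree: 1.0 at target, falling linearly to
    about 0.1 at the margin edge, and 0.0 beyond it. margin must be > 0."""
    distance = abs(value - target)
    if distance > margin:
        return 0.0
    return 1.0 - 0.9 * distance / margin


def near(values, target, margin):
    """NEAR with relaxation: lower the target md until something matches."""
    for md in (0.8, 0.6, 0.4, 0.2):  # step 4: decrease the target md by 0.2
        hits = [v for v in values if membership(v, target, margin) >= md]
        if hits:
            return hits
    # Assumed last relaxation level: anything still inside the margins m1..m2.
    hits = [v for v in values if membership(v, target, margin) > 0.0]
    if hits:
        return hits
    # Step 5: nothing inside the margins, so take the nearest value on each side.
    below = [v for v in values if v < target]
    above = [v for v in values if v > target]
    return ([max(below)] if below else []) + ([min(above)] if above else [])


def not_near(values, target, margin, limit=0.6):
    """NOT NEAR: values outside the margins, else the farthest ones inside."""
    outside = [v for v in values if abs(v - target) > margin]
    if outside:
        return outside
    # Fallback: least NEAR degree wins, but only up to NEAR md = limit.
    degrees = {v: membership(v, target, margin) for v in values}
    eligible = [d for d in degrees.values() if d <= limit]
    if not eligible:
        return []
    least = min(eligible)
    return [v for v in values if degrees[v] == least]


print(near([22, 36], 30, 15))          # nothing at md 0.8, relaxes to 0.6
print(near([10, 60], 30, 15))          # nothing inside the margins: step 5
print(not_near([22, 29, 36], 30, 15))  # nothing outside the margins: fallback
```

Under this reading NEAR returns an empty list only for an empty input,
which matches the claim in the quoted text, while NOT NEAR can still come
back empty when every value has a NEAR degree above the limit.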

Re: To all the pgsql developers..Have a look at the operators proposed by me in my researc

From

"Jim C. Nasby"

Date:

07 June 2007, 01:47:38

On Sat, Jun 02, 2007 at 01:37:19PM +0000, Tasneem Memon wrote:
> We could make the system ask the user which membership degree s/he
> wants, but we don't want to make the system interactive, with the user
> supplying a membership degree of his/her choice. These operators are
> supposed to work just like the other operators in SQL: you just put
> them in the query and get a result. I chose 0.8 because in all the case
> studies I have made for NEAR, 0.8 seems to be the best choice: 0.9
> narrows the range, while 0.75 or 0.7 also brings in values that are
> irrelevant. However, those values no longer seem irrelevant when there
> are no values down to md 0.8, so the operator fetches them when they
> are the NEARest.

While having them function just like any other operator is good, it
seems like you're making quite a bit of an assumption for the user;
namely that you know what their data looks like better than they might.
Is it not possible that someone would come along with a dataset that
looks different enough from your test cases so that the values you
picked wouldn't work?
--
Jim Nasby                                      decibel@decibel.org
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)
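
To make the concern in the reply above concrete, here is a small Python
sketch in which the same fixed threshold behaves very differently on two
datasets. Everything in it is an assumption for illustration: the linear
membership function, the margin of 15, the helper names, and both sample
columns. It only shows why a caller-supplied `md` might be useful; it says
nothing about how the proposed operators are actually implemented.

```python
def membership(value, target, margin):
    """Assumed NEAR membership degree: 1.0 at target, falling linearly to
    about 0.1 at the margin edge, and 0.0 beyond it. margin must be > 0."""
    distance = abs(value - target)
    if distance > margin:
        return 0.0
    return 1.0 - 0.9 * distance / margin


def near_at(values, target, margin, md=0.8):
    """Keep values whose degree is at least md (default is the fixed 0.8)."""
    return [v for v in values if membership(v, target, margin) >= md]


coarse = [22, 27, 29, 31, 36]      # hypothetical prices, whole-dollar spread
fine = [29.5, 29.98, 30.01, 30.4]  # hypothetical readings, tight spread

print(near_at(coarse, 30, 15))          # the default separates the values
print(near_at(fine, 30, 15))            # the default matches every row
print(near_at(fine, 30, 15, md=0.995))  # a caller-chosen md separates them
```

On the tightly clustered column the default of 0.8 returns every row, so
the operator no longer distinguishes anything until the threshold is
raised.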