Thread: To all the pgsql developers..Have a look at the operators proposed by me in my research paper.
To all the pgsql developers..Have a look at the operators proposed by me in my research paper.
From
Tasneem Memon
Date:
<br /><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><font color="#000000" face="Times New Roman" size="3">Dear All, </font><pclass="MsoNormal" style="MARGIN: 0in 0in 0pt"><font color="#000000" face="Times New Roman" size="3">I am explaininghere how i have defined the operators NEAR and NOT NEAR in my research paper. I want critics/comments from allof you on the operators.</font><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><font color="#000000" face="Times NewRoman" size="3"> </font><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><font color="#000000" face="Times New Roman"size="3">These operators belong to the category of “Fuzzy Operators” proposed in the paper for ANSI SQL.</font><p class="MsoNormal"style="MARGIN: 0in 0in 0pt"><font color="#000000" face="Times New Roman" size="3"> </font><ol style="MARGIN-TOP:0in" type="1"><li class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-list: l0 level1 lfo1; tab-stops: list.5in"><font color="#000000" face="Times New Roman" size="3">NEAR</font></ol><p class="MsoNormal" style="MARGIN: 0in 0in0pt"><b style="mso-bidi-font-weight: normal"><span style="FONT-SIZE: 11pt"><font color="#000000" face="Times New Roman"> </font></span></b><pclass="MsoNormal" style="MARGIN: 0in 0in 0pt; TEXT-INDENT: 0.5in"><font color="#000000" face="TimesNew Roman" size="3">It deals with the NUMBER and DATE datatypes simulating the human behavior and processing theinformation contained in NEAR in the same way as we humans take it. This is a binary operator with the syntax:</font><pclass="MsoNormal" style="MARGIN: 0in 0in 0pt; TEXT-INDENT: 0.5in"><span style="FONT-FAMILY: Verdana"><fontsize="3"><font color="#000000">op1<span style="mso-spacerun: yes"> </span>NEAR<span style="mso-spacerun: yes"> </span>op2</font></font></span><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><font color="#000000" face="Times NewRoman" size="3">Here, the <b style="mso-bidi-font-weight: normal">op1</b> refers to an attribute, whereas <b style="mso-bidi-font-weight:normal">op2 </b>is a fixed value, both of the same datatype. </font><p class="MsoNormal" style="MARGIN:0in 0in 0pt"><font color="#000000" face="Times New Roman" size="3">Suppose we want a list of all the VGAs,price of which should be somewhere around 30$ .. the query will look like:</font><p class="MsoNormal" style="MARGIN:0in 0in 0pt"><span style="FONT-SIZE: 8pt"><font color="#000000"><font face="Times New Roman"><span style="mso-spacerun:yes"> </span></font></font></span><p class="MsoNormal" style="MARGIN: 0in 0in 0pt 9pt"><span style="FONT-FAMILY:Verdana"><font size="3"><font color="#000000">SELECT <span style="mso-tab-count: 1"> </span>*</font></font></span><pclass="MsoNormal" style="MARGIN: 0in 0in 0pt 9pt"><span style="FONT-FAMILY: Verdana"><fontsize="3"><font color="#000000">FROM <span style="mso-tab-count: 1"> </span>accessories</font></font></span><pclass="MsoNormal" style="MARGIN: 0in 0in 0pt 9pt"><span style="FONT-FAMILY: Verdana"><fontsize="3"><font color="#000000">WHERE<span style="mso-tab-count: 1"> </span>prod_name = ‘VGA’</font></font></span><pclass="MsoNormal" style="MARGIN: 0in 0in 0pt 9pt"><span style="FONT-FAMILY: Verdana"><font size="3"><fontcolor="#000000">AND<span style="mso-tab-count: 2"> </span>prod_price <b style="mso-bidi-font-weight:normal">NEAR</b> 30</font></font></span><p class="MsoNormal" style="MARGIN: 0in 0in 0pt 9pt"><spanstyle="FONT-FAMILY: Verdana"><font color="#000000" size="3"> </font></span><p class="MsoNormal" style="MARGIN:0in 0in 0pt"><font color="#000000" face="Times New Roman" size="3">A query for the datatype DATE will looklike:</font><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><span style="FONT-SIZE: 8pt"><font color="#000000" face="TimesNew Roman"> </font></span><p class="MsoNormal" style="MARGIN: 0in 0in 0pt 9pt"><span style="FONT-FAMILY: Verdana"><fontsize="3"><font color="#000000">SELECT <span style="mso-tab-count: 1"> </span>*</font></font></span><p class="MsoNormal"style="MARGIN: 0in 0in 0pt 9pt"><span style="FONT-FAMILY: Verdana"><font size="3"><font color="#000000">FROM<span style="mso-tab-count: 1"> </span>sales</font></font></span><p class="MsoNormal" style="MARGIN:0in 0in 0pt 9pt"><span style="FONT-FAMILY: Verdana"><font size="3"><font color="#000000">WHERE<span style="mso-tab-count:1"> </span>item = ’printer’</font></font></span><p class="MsoNormal" style="MARGIN: 0in 0in 0pt 9pt"><fontsize="3"><font color="#000000"><span style="FONT-FAMILY: Verdana">AND<span style="mso-tab-count: 2"> </span>s_date <b style="mso-bidi-font-weight: normal">NEAR</b> </span><span style="FONT-FAMILY: Verdana">10-7-06</span><spanstyle="FONT-FAMILY: Verdana"></span></font></font><p class="MsoNormal" style="MARGIN: 0in 0in0pt"><font color="#000000" face="Times New Roman" size="3"> </font><p class="MsoNormal" style="MARGIN: 0in 0in 0pt 9pt"><spanstyle="FONT-FAMILY: Verdana"><font color="#000000" size="3"> </font></span><p class="MsoNormal" style="MARGIN:0in 0in 0pt; TEXT-INDENT: 0.5in"><font color="#000000" face="Times New Roman" size="3">The algorithm for theNEAR operator works as follows:</font><p class="MsoNormal" style="MARGIN: 0in 0in 0pt; TEXT-INDENT: 0.5in"><font color="#000000"face="Times New Roman" size="3"> </font><ol style="MARGIN-TOP: 0in" type="1"><li class="MsoNormal" style="TEXT-JUSTIFY:inter-ideograph; MARGIN: 0in 0in 0pt; TEXT-ALIGN: justify; mso-list: l1 level1 lfo2; tab-stops: list.5in"><font color="#000000" face="Times New Roman" size="3">The margins to the <b style="mso-bidi-font-weight: normal">op2</b>,i.e. <b style="mso-bidi-font-weight: normal">m1</b> and<b style="mso-bidi-font-weight: normal"> m2,</b> areadded dynamically on both the sides, considering the value it contains. To keep this margin big is important for a certainreason discussed later.</font><li class="MsoNormal" style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 0pt; TEXT-ALIGN:justify; mso-list: l1 level1 lfo2; tab-stops: list .5in"><font color="#000000" face="Times New Roman" size="3">The<b style="mso-bidi-font-weight: normal">NEAR</b> operator is supposed to obtain the values near to the op2, thusthe target membership degree(<b style="mso-bidi-font-weight: normal">md</b>) is initially set to 0.8.</font><li class="MsoNormal"style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 0pt; TEXT-ALIGN: justify; mso-list: l1 level1 lfo2;tab-stops: list .5in"><font color="#000000" face="Times New Roman" size="3">The algorithm compares the <b style="mso-bidi-font-weight:normal">op1</b>(column) values row by row to the elements of the set that <b style="mso-bidi-font-weight:normal">NEAR</b> defined, i.e. the values from md 1.0 to 0.8, adding matching tuples to the resultset.</font></ol><p class="MsoNormal" style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 0pt 0.5in; TEXT-INDENT:-0.25in; TEXT-ALIGN: justify; mso-list: l1 level1 lfo2; tab-stops: list .25in"><font face="Times New Roman"><spanstyle="mso-fareast-font-family: 'Times New Roman'"><span style="mso-list: Ignore"><font color="#000000"><fontsize="3">4.</font><span style="FONT: 7pt 'Times New Roman'"> </span></font></span></span><fontcolor="#000000" size="3">It is very much possible that the result set is empty since novalues within the range exist in the column. Thus, the algorithm checks for empty result set, and in that case, decreasesthe target <b style="mso-bidi-font-weight: normal">md</b> by 0.2 and jumps to step 3. This is the reason big marginsto the <b style="mso-bidi-font-weight: normal">op2</b> are added.</font></font><p class="MsoNormal" style="TEXT-JUSTIFY:inter-ideograph; MARGIN: 0in 0in 0pt 0.5in; TEXT-INDENT: -0.25in; TEXT-ALIGN: justify; mso-list: l1 level1lfo2; tab-stops: list .25in"><font face="Times New Roman"><span style="mso-fareast-font-family: 'Times New Roman'"><spanstyle="mso-list: Ignore"><font color="#000000"><font size="3">5.</font><span style="FONT: 7pt 'Times New Roman'"> </span></font></span></span><font color="#000000" size="3">In case there are no values in <b style="mso-bidi-font-weight:normal">op1</b> that are between <b style="mso-bidi-font-weight: normal">m1</b> and <b style="mso-bidi-font-weight:normal">m2 </b>(where the<b style="mso-bidi-font-weight: normal"> </b>membership degree of thevalues with respect to NEAR becomes 0.1) and the result set is empty, the algorithm fetches the two nearest values (tuples)to <b style="mso-bidi-font-weight: normal">op2, </b>one smaller and one larger than the op2, as the result.</font></font><pclass="MsoNormal" style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 0pt; TEXT-ALIGN: justify"><fontcolor="#000000" face="Times New Roman" size="3"> </font><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><fontcolor="#000000" face="Times New Roman" size="3">The algorithm will give an empty result only if the table referredto in the query is empty. </font><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><font color="#000000" face="TimesNew Roman" size="3"> </font><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><b style="mso-bidi-font-weight: normal"><spanstyle="FONT-SIZE: 11pt"><font color="#000000"><font face="Times New Roman">2. NOT NEAR </font></font></span></b><pclass="MsoNormal" style="MARGIN: 0in 0in 0pt"><b style="mso-bidi-font-weight: normal"><span style="FONT-SIZE:11pt"><font color="#000000" face="Times New Roman"> </font></span></b><p class="MsoNormal" style="MARGIN:0in 0in 0pt"><font size="3"><font color="#000000"><font face="Times New Roman"><b style="mso-bidi-font-weight:normal"><span style="mso-tab-count: 1"> </span></b>This operator is also a binaryoperator, dealing with the datatype NUMBER and DATE. It has the syntax:</font></font></font><p class="MsoNormal" style="MARGIN:0in 0in 0pt; TEXT-INDENT: 0.5in"><span style="FONT-FAMILY: Verdana"><font size="3"><font color="#000000">op1<spanstyle="mso-spacerun: yes"> </span>NOT NEAR<span style="mso-spacerun: yes"> </span>op2</font></font></span><pclass="MsoNormal" style="MARGIN: 0in 0in 0pt"><font color="#000000" face="Times New Roman"size="3">The <b style="mso-bidi-font-weight: normal">op1</b> refers to an attribute, whereas <b style="mso-bidi-font-weight:normal">op2 </b>is a fixed value, both of the same data type.</font><p class="MsoNormal" style="MARGIN:0in 0in 0pt"><font size="3"><font color="#000000"><font face="Times New Roman"><span style="mso-tab-count:1"> </span>A query containing the operator looks like:</font></font></font><p class="MsoNormal"style="MARGIN: 0in 0in 0pt"><font color="#000000" face="Times New Roman" size="3"> </font><p class="MsoNormal"style="MARGIN: 0in 0in 0pt 0.25in"><span style="FONT-FAMILY: Verdana"><font size="3"><font color="#000000">SELECT<span style="mso-tab-count: 1"></span>id, name, age, history</font></font></span><p class="MsoNormal"style="MARGIN: 0in 0in 0pt 0.25in"><span style="FONT-FAMILY: Verdana"><font size="3"><font color="#000000">FROM<spanstyle="mso-tab-count: 1"> </span>casualties </font></font></span><p class="MsoNormal" style="MARGIN:0in 0in 0pt 0.25in"><span style="FONT-FAMILY: Verdana"><font size="3"><font color="#000000">WHERE<span style="mso-tab-count:1"> </span>cause = ‘heart attack’</font></font></span><p class="MsoNormal" style="MARGIN: 0in 0in 0pt0.25in"><span style="FONT-FAMILY: Verdana"><font size="3"><font color="#000000">AND<span style="mso-tab-count: 1"> </span>age NOT NEAR 55</font></font></span><p class="MsoNormal" style="MARGIN: 0in 0in 0pt 0.25in"><span style="FONT-FAMILY:Verdana"><font color="#000000" size="3"> </font></span><p class="MsoNormal" style="MARGIN: 0in 0in 0pt;TEXT-INDENT: 0.5in"><font color="#000000" face="Times New Roman" size="3">Or suppose we need a list of some event thatis not clashing with some commitment of ours:</font><p class="MsoNormal" style="MARGIN: 0in 0in 0pt; TEXT-INDENT: 0.5in"><fontcolor="#000000" face="Times New Roman" size="3"> </font><p class="MsoNormal" style="MARGIN: 0in 0in 0pt 0.25in"><spanstyle="FONT-FAMILY: Verdana"><font size="3"><font color="#000000">SELECT <span style="mso-tab-count: 1"></span>*</font></font></span><p class="MsoNormal" style="MARGIN: 0in 0in 0pt 0.25in"><span style="FONT-FAMILY: Verdana"><fontsize="3"><font color="#000000">FROM <span style="mso-tab-count: 1"> </span>events</font></font></span><pclass="MsoNormal" style="MARGIN: 0in 0in 0pt 0.25in"><span style="FONT-FAMILY: Verdana"><fontsize="3"><font color="#000000">WHERE<span style="mso-tab-count: 1"> </span>e_name= ‘concert’</font></font></span><pclass="MsoNormal" style="MARGIN: 0in 0in 0pt 1in; TEXT-INDENT: -0.75in; tab-stops: 423.0pt"><fontsize="3"><font color="#000000"><span style="FONT-FAMILY: Verdana">AND <span style="mso-tab-count: 1"> </span>dateNOT NEAR </span><span style="FONT-FAMILY: Verdana">8/28/2007</span><span style="FONT-FAMILY: Verdana"></span></font></font><pclass="MsoNormal" style="MARGIN: 0in 0in 0pt"><span style="mso-tab-count: 1"><font color="#000000"face="Times New Roman" size="3"> </font></span><p class="MsoNormal" style="MARGIN: 0in 0in 0pt;TEXT-INDENT: 0.5in"><font size="3"><font color="#000000"><font face="Times New Roman">The algorithm for NOT NEAR workslike this:</font></font></font><ol style="MARGIN-TOP: 0in" type="1"><li class="MsoNormal" style="TEXT-JUSTIFY: inter-ideograph;MARGIN: 0in 0in 0pt; TEXT-ALIGN: justify; mso-list: l2 level1 lfo3; tab-stops: list .5in"><font color="#000000"face="Times New Roman" size="3">First of all it adds the margins to the <b style="mso-bidi-font-weight: normal">op2</b>,i.e. <b style="mso-bidi-font-weight: normal">m1</b> and<b style="mso-bidi-font-weight: normal"> m2,</b> dynamicallyon both the sides, considering the value <b style="mso-bidi-font-weight: normal">op2</b> contains.</font><li class="MsoNormal"style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 0pt; TEXT-ALIGN: justify; mso-list: l2 level1 lfo3;tab-stops: list .5in"><font size="3"><font color="#000000"><font face="Times New Roman"><b style="mso-bidi-font-weight:normal">op1 v</b>alues outside the scope of the <b style="mso-bidi-font-weight: normal">op2</b>(<b style="mso-bidi-font-weight: normal">m1, m2</b>) are retrieved and added to the result. </font></font></font><liclass="MsoNormal" style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 0pt; TEXT-ALIGN: justify;mso-list: l2 level1 lfo3; tab-stops: list .5in"><font color="#000000" face="Times New Roman" size="3">If the resultset is empty, the farthest values within the <b style="mso-bidi-font-weight: normal">op2 </b>fuzzy set (those possessingthe least membership degree) are retrieved. This is done by continuing the search from values with md=0.1 tillthe md=0.6, where the md for <b style="mso-bidi-font-weight: normal">NOT NEAR</b> reaches 0.4. </font></ol><p class="MsoNormal"style="MARGIN: 0in 0in 0pt"><b style="mso-bidi-font-weight: normal"><font color="#000000" face="Times NewRoman" size="3"> </font></b><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><font color="#000000" face="Times New Roman"size="3">Regards,</font><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><font color="#000000" face="Times New Roman"size="3"> </font><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><font face="Times New Roman, Times, Serif" size="3"><strong>TasneemIshaque Memon <br /></strong>MS (Computer Science) <br />Institute of Mathmetics and Computer Science<br />University of Sindh, Jamshoro <br />Pakistan <br /><br /></font><a href="http://www.geocities.com/tasneememon/"><fontface="Times New Roman, Times, Serif" size="3">http://www.geocities.com/tasneememon/</font></a><br/><br /><font size="3"><font color="#660099"><font face="TimesNew Roman, Times, Serif"><strong><font color="#3300cc">- </font></strong><strong><font color="#3300cc">The processof gaining knowledge is not producing results but clearing away ignorance.</font><br /></strong></font></font></font><br/><hr />Discover the new Windows Vista <a href="http://search.msn.com/results.aspx?q=windows+vista&mkt=en-US&form=QBRE"target="_new">Learn more!</a>
Re: To all the pgsql developers..Have a look at the operators proposed by me in my research paper.
From
Jim Nasby
Date:
On Jun 1, 2007, at 8:24 AM, Tasneem Memon wrote: > NEAR > > It deals with the NUMBER and DATE datatypes simulating the human > behavior and processing the Why just number and date? > information contained in NEAR in the same way as we humans take it. > This is a binary operator with the syntax: > op1 NEAR op2 > Here, the op1 refers to an attribute, whereas op2 is a fixed value, > both of the same datatype. > Suppose we want a list of all the VGAs, price of which should be > somewhere around 30$ .. the query will look like: > > SELECT * > FROM accessories > WHERE prod_name = ‘VGA’ > AND prod_price NEAR 30 > > A query for the datatype DATE will look like: > > SELECT * > FROM sales > WHERE item = ’printer’ > AND s_date NEAR 10-7-06 > > > The algorithm for the NEAR operator works as follows: > > The margins to the op2, i.e. m1 and m2, are added dynamically on > both the sides, considering the value it contains. To keep this > margin big is important for a certain reason discussed later. > The NEAR operator is supposed to obtain the values near to the op2, > thus the target membership degree(md) is initially set to 0.8. > The algorithm compares the op1(column) values row by row to the > elements of the set that NEAR defined, i.e. the values from md 1.0 > to 0.8, adding matching tuples to the result set. How would one change 0.8 to some other value? > 4. It is very much possible that the result set is empty since > no values within the range exist in the column. Thus, the algorithm > checks for empty result set, and in that case, decreases the target > md by 0.2 and jumps to step 3. This is the reason big margins to > the op2 are added. > 5. In case there are no values in op1 that are between m1 and > m2 (where the membership degree of the values with respect to NEAR > becomes 0.1) and the result set is empty, the algorithm fetches the > two nearest values (tuples) to op2, one smaller and one larger than > the op2, as the result. > > The algorithm will give an empty result only if the table referred > to in the query is empty. > > 2. NOT NEAR > > This operator is also a binary operator, dealing with > the datatype NUMBER and DATE. It has the syntax: > op1 NOT NEAR op2 > The op1 refers to an attribute, whereas op2 is a fixed value, both > of the same data type. > A query containing the operator looks like: > > SELECT id, name, age, history > FROM casualties > WHERE cause = ‘heart attack’ > AND age NOT NEAR 55 > > Or suppose we need a list of some event that is not clashing with > some commitment of ours: > > SELECT * > FROM events > WHERE e_name= ‘concert’ > AND date NOT NEAR 8/28/2007 > > The algorithm for NOT NEAR works like this: > First of all it adds the margins to the op2, i.e. m1 and m2, > dynamically on both the sides, considering the value op2 contains. > op1 values outside the scope of the op2 (m1, m2) are retrieved and > added to the result. > If the result set is empty, the farthest values within the op2 > fuzzy set (those possessing the least membership degree) are > retrieved. This is done by continuing the search from values with > md=0.1 till the md=0.6, where the md for NOT NEAR reaches 0.4. Why isn't this just the exact opposite set of NEAR? -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Re: To all the pgsql developers..Have a look at the operators proposed by me in my research paper.
From
Josh Berkus
Date:
Tasneem, > > The margins to the op2, i.e. m1 and m2, are added dynamically on > > both the sides, considering the value it contains. To keep this > > margin big is important for a certain reason discussed later. > > The NEAR operator is supposed to obtain the values near to the op2, > > thus the target membership degree(md) is initially set to 0.8. > > The algorithm compares the op1(column) values row by row to the > > elements of the set that NEAR defined, i.e. the values from md 1.0 > > to 0.8, adding matching tuples to the result set. Are we talking about a mathematical calculation on the values, or an algorithm against the population of the result set? I'm presuming the latter or you could just use a function. If so, is NEAR an absolute range or based on something logarithmic like standard deviation? Beyond that, I would think that this mechanism would need some kind of extra heuristics to be at all performant, otherwise you're querying the entire table (or at least the entire index) every time you run a query. Have you given any thought to this? -- Josh Berkus PostgreSQL @ Sun San Francisco