Thread: OT: seeking query help, where?
First, sorry for the OT, flame me off-list! I'm a sysadmin being impressed into dba service. i've been getting along pretty well writing queries and making reports, but i've got some questions. suggestions for a good list/forum for help? from two tables both with email_addr columns, i want a distinct list of all email_addrs in one column. what i do now is select distinct on each and then sort -u the results. thanks in advance -t
I've never really used this site but it might be of interest: http://www.dbforums.com/ You may also want to check out "SQL Queries for Mere Mortals" (Hernandez/Viescas - Addison Wesley) -Steve On Thursday 16 January 2003 2:15 pm, Tim Lynch wrote: > First, sorry for the OT, flame me off-list! > > I'm a sysadmin being impressed into dba service. i've been getting along > pretty well writing queries and making reports, but i've got some > questions. suggestions for a good list/forum for help? > > from two tables both with email_addr columns, i want a distinct list of all > email_addrs in one column. what i do now is select distinct on each and > then sort -u the results. > > thanks in advance > -t > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
there are several ways to do this, one example would be: select distinct(email_addr) from table1 union select distinct(email_addr) from table2 You can ask questions like these on the pgsql-newbies or pgsql-sql lists if you'd like (though generally you shouldn't get flamed no matter which group you post to) Robert Treat On Thu, 2003-01-16 at 17:15, Tim Lynch wrote: > First, sorry for the OT, flame me off-list! > > I'm a sysadmin being impressed into dba service. i've been getting along > pretty well writing queries and making reports, but i've got some questions. > suggestions for a good list/forum for help? > > from two tables both with email_addr columns, i want a distinct list of all > email_addrs in one column. what i do now is select distinct on each and then > sort -u the results. > > thanks in advance > -t > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
you should be able to do something like:
select distinct email from test1
union (select distinct email from test2);
regards,
Devinder Rajput
Stores Division Corporate Offices
Chicago, IL
(773) 442-6474
"Tim Lynch" <admin+pgsqladmin@thirdage.com> Sent by: pgsql-admin-owner@postgresql.org 01/16/2003 04:15 PM | To: pgsql-admin@postgresql.org cc: Subject: [ADMIN] OT: seeking query help, where? |
First, sorry for the OT, flame me off-list!
I'm a sysadmin being impressed into dba service. i've been getting along
pretty well writing queries and making reports, but i've got some questions.
suggestions for a good list/forum for help?
from two tables both with email_addr columns, i want a distinct list of all
email_addrs in one column. what i do now is select distinct on each and then
sort -u the results.
thanks in advance
-t
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
On Thu, Jan 16, 2003 at 14:15:22 -0800, Tim Lynch <admin+pgsqladmin@thirdage.com> wrote: > First, sorry for the OT, flame me off-list! > > I'm a sysadmin being impressed into dba service. i've been getting along > pretty well writing queries and making reports, but i've got some questions. > suggestions for a good list/forum for help? > > from two tables both with email_addr columns, i want a distinct list of all > email_addrs in one column. what i do now is select distinct on each and then > sort -u the results. Use the union operator.
On Thu, 2003-01-16 at 17:15, Tim Lynch wrote: > from two tables both with email_addr columns, i want a distinct list of all > email_addrs in one column. what i do now is select distinct on each and then > sort -u the results. You want either pgsql-sql or pgsql-novice, both of which are mailing lists for PostgreSQL. The former would be exactly the correct forum. Odds are someone in this list will answer you anyway with something mostly-useful. I think the UNION clause will help you to your eventual goal. -- Tim Ellis Senior Database Architect
I'd suggest something similar to, but slightly different from, what others have: SELECT DISTINCT email FROM (SELECT email FROM table_1 UNION SELECT email FROM table_2) AS combined; that way you avoid duplicates across tables. ---------------------------------------------------------------------- Andrew J Perrin - http://www.unc.edu/~aperrin Assistant Professor of Sociology, U of North Carolina, Chapel Hill clists@perrin.socsci.unc.edu * andrew_perrin (at) unc.edu On Thu, 16 Jan 2003, Tim Lynch wrote: > First, sorry for the OT, flame me off-list! > > I'm a sysadmin being impressed into dba service. i've been getting along > pretty well writing queries and making reports, but i've got some questions. > suggestions for a good list/forum for help? > > from two tables both with email_addr columns, i want a distinct list of all > email_addrs in one column. what i do now is select distinct on each and then > sort -u the results. > > thanks in advance > -t > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) >
On Fri, Jan 17, 2003 at 08:57:14 -0500, Andrew Perrin <clists@perrin.socsci.unc.edu> wrote: > I'd suggest something similar to, but slightly different from, what others > have: > > SELECT DISTINCT email FROM (SELECT email FROM table_1 UNION SELECT > email FROM table_2) AS combined; > > that way you avoid duplicates across tables. The union operator already removes duplicates.
Hey, thanks - nice to learn something. ap ---------------------------------------------------------------------- Andrew J Perrin - http://www.unc.edu/~aperrin Assistant Professor of Sociology, U of North Carolina, Chapel Hill clists@perrin.socsci.unc.edu * andrew_perrin (at) unc.edu On Fri, 17 Jan 2003, Bruno Wolff III wrote: > On Fri, Jan 17, 2003 at 08:57:14 -0500, > Andrew Perrin <clists@perrin.socsci.unc.edu> wrote: > > I'd suggest something similar to, but slightly different from, what others > > have: > > > > SELECT DISTINCT email FROM (SELECT email FROM table_1 UNION SELECT > > email FROM table_2) AS combined; > > > > that way you avoid duplicates across tables. > > The union operator already removes duplicates. > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly >
On Fri, 17 Jan 2003, Bruno Wolff III wrote: > On Fri, Jan 17, 2003 at 08:57:14 -0500, > Andrew Perrin <clists@perrin.socsci.unc.edu> wrote: > > > > that way you avoid duplicates across tables. > > The union operator already removes duplicates. Right, which means everyone's UNION queries have too many DISTINCTs in them: SELECT email from table1 UNION SELECT email from table2 ; should do it. Here's proof: note the duplicate 'foo' and 'bar': test=# select * from table1; email ----------------- foo@example.com bar@example.com bar@example.com (3 rows) test=# select * from table2; email ------------------ quux@example.com foo@example.com (2 rows) test=# select email test=# select email from table1 union select email from table2; email ------------------ bar@example.com foo@example.com quux@example.com (3 rows) Ross -- Ross Reedstrom, Ph.D. reedstrm@rice.edu Research Scientist phone: 713-348-6166 The Connexions Project http://cnx./rice.edu fax: 713-348-6182 Rice University MS-39 Houston, TX 77005
"Ross J. Reedstrom" <reedstrm@rice.edu> writes: > On Fri, 17 Jan 2003, Bruno Wolff III wrote: >> The union operator already removes duplicates. > Right, which means everyone's UNION queries have too many DISTINCTs > in them: Relevant to this thread: if you don't want duplicate removal, write UNION ALL. This is considerably cheaper than UNION, so it's a good thing to keep in mind. Examples: regression=# explain analyze select unique1 from tenk1 union all select unique2 from tenk1; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ Append (cost=0.00..916.00 rows=20000 width=4) (actual time=0.20..1416.60 rows=20000 loops=1) -> Subquery Scan "*SELECT* 1" (cost=0.00..458.00 rows=10000 width=4) (actual time=0.18..549.34 rows=10000 loops=1) -> Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=4) (actual time=0.15..279.58 rows=10000 loops=1) -> Subquery Scan "*SELECT* 2" (cost=0.00..458.00 rows=10000 width=4) (actual time=0.14..548.90 rows=10000 loops=1) -> Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=4) (actual time=0.11..278.33 rows=10000 loops=1) Total runtime: 1570.02 msec (6 rows) regression=# explain analyze select unique1 from tenk1 union select unique2 from tenk1; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------ Unique (cost=2344.77..2444.77 rows=20000 width=4) (actual time=2881.85..3477.51 rows=10000 loops=1) -> Sort (cost=2344.77..2394.77 rows=20000 width=4) (actual time=2881.82..3169.29 rows=20000 loops=1) Sort Key: unique1 -> Append (cost=0.00..916.00 rows=20000 width=4) (actual time=0.21..1590.55 rows=20000 loops=1) -> Subquery Scan "*SELECT* 1" (cost=0.00..458.00 rows=10000 width=4) (actual time=0.19..570.31 rows=10000loops=1) -> Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=4) (actual time=0.15..293.18 rows=10000 loops=1) -> Subquery Scan "*SELECT* 2" (cost=0.00..458.00 rows=10000 width=4) (actual time=0.15..698.43 rows=10000loops=1) -> Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=4) (actual time=0.12..419.75 rows=10000 loops=1) Total runtime: 3574.98 msec (9 rows) The sort-and-unique phases are what implement duplicate removal, and as you can see they add a good deal to the cost of the query. regards, tom lane
On Thu, 16 Jan 2003, Tim Lynch wrote: > First, sorry for the OT, flame me off-list! > > I'm a sysadmin being impressed into dba service. i've been getting along > pretty well writing queries and making reports, but i've got some questions. > suggestions for a good list/forum for help? > > from two tables both with email_addr columns, i want a distinct list of all > email_addrs in one column. what i do now is select distinct on each and then > sort -u the results. > I like it! My sort of solution. Only I don't know the -u switch, I'd have done `cat blah* | sort | uniq` but it's obviously the same idea. As for the query that would do it, I believe that's already been answered I just thought I'd stand up for the good old fashioned unix ways. :) -- Nigel J. Andrews
UNION! thanks everyone! i'll take this stuff up on the pgsql-sql list in the future. union does intra- and inter- table distinct, notice `red' and `two': test=> select str from foo ; str ----- one two two red (4 rows) test=> select str from bar ; str -------- red orange yellow green blue indigo violet red (8 rows) test=> select str from foo union ( select str from bar ) ; str -------- blue green indigo one orange red two violet yellow (9 rows)