Thread: stuck on values in 8.2
OK, after reviewing many emails and what I was trying to do I upgraded from 8.2. Seems to work as it did in 8.1 which is a good start. I'm doing all of this so I can use the 'values' that was described as being something like: select * from (values ('one','two','three')) "foo"; But instead of getting three rows of one value each, I get three columns of one row. Not what I was looking for. :( What I am doing now is run a large list (300+) of values that is being iterated through a sql statement like: 'select * from mytable where token = ?' to see if the value exists or not and then "do stuff". A while back I was told that this would all be fixed if I used values (...) Did I misunderstand the proper care and feeding of the VALUES statement or is there something else I need to do here? I initially thought that I could do this with: select t.value, v.value from values('one','two','three') left outer join mytable using (value) except -- not sure how to name the column in values() to come out as name "value" -- and quite frankly I can't get anything to actually run other than my first example. Ideally the output would have something like: v.value | t.value ----------+--------- one | two | two three | three where the value 'one' is not in the table... Am I still able to do something like this but don't have the syntax correct? Or am I more SOL than RTFM?
"Tom Allison" <tom@tacocat.net> writes: > OK, after reviewing many emails and what I was trying to do I upgraded from 8.2. > > Seems to work as it did in 8.1 which is a good start. > > I'm doing all of this so I can use the 'values' that was described as being > something like: > > select * from (values ('one','two','three')) "foo"; SELECT * FROM (VALUES ('one'),('two'),('three')) AS foo(value) > I initially thought that I could do this with: > select t.value, v.value from > values('one','two','three') left outer join mytable using (value) postgres=# SELECT * FROM (VALUES ('one'),('two'),('three')) AS foo(value) LEFT OUTER JOIN mytable ON (foo.value = mytable.value); value | value -------+------- one | two | two three | three (3 rows) "USING" would work too but then you only get one output column rather than two which is not so helpful in this case. postgres=# SELECT * FROM (VALUES ('one'),('two'),('three')) AS foo(value) LEFT OUTER JOIN mytable USING (value) ; value ------- one two three (3 rows) -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
This is getting really ugly... it won't finish in less than .. minutes. spam=> explain select u2.token_idx, t.token_idx, foo.token from tokens t left outer join user_token u1 using (token_idx), users u left outer join user_token u2 using (user_idx), (values('one'),('want'),('examine'),('three')) as foo(token) left outer join tokens using (token) where u.user_idx = 15; QUERY PLAN ------------------------------------------------------------------------ -------------------------------------- Nested Loop Left Join (cost=423.70..4327392.26 rows=216168368 width=48) -> Nested Loop (cost=250.49..3851.69 rows=30472 width=44) -> Hash Left Join (cost=249.40..3241.16 rows=30472 width=40) Hash Cond: (t.token_idx = u1.token_idx) -> Nested Loop (cost=0.00..967.34 rows=29908 width=40) -> Nested Loop Left Join (cost=0.00..33.18 rows=4 width=32) -> Values Scan on "*VALUES*" (cost=0.00..0.05 rows=4 width=32) -> Index Scan using tokens_token_key on tokens (cost=0.00..8.27 rows=1 width=16) Index Cond: ("*VALUES*".column1 = (tokens.token)::text) -> Seq Scan on tokens t (cost=0.00..158.77 rows=7477 width=8) -> Hash (cost=145.29..145.29 rows=8329 width=8) -> Seq Scan on user_token u1 (cost=0.00..145.29 rows=8329 width=8) -> Materialize (cost=1.09..1.10 rows=1 width=4) -> Seq Scan on users u (cost=0.00..1.09 rows=1 width=4) Filter: (user_idx = 15) -> Materialize (cost=173.21..244.15 rows=7094 width=12) -> Seq Scan on user_token u2 (cost=0.00..166.11 rows=7094 width=12) Filter: (user_idx = 15) (18 rows) On May 12, 2007, at 11:08 AM, Gregory Stark wrote: > > "Tom Allison" <tom@tacocat.net> writes: > >> OK, after reviewing many emails and what I was trying to do I >> upgraded from 8.2. >> >> Seems to work as it did in 8.1 which is a good start. >> >> I'm doing all of this so I can use the 'values' that was >> described as being >> something like: >> >> select * from (values ('one','two','three')) "foo"; > > SELECT * FROM (VALUES ('one'),('two'),('three')) AS foo(value) > >> I initially thought that I could do this with: >> select t.value, v.value from >> values('one','two','three') left outer join mytable using (value) > > postgres=# SELECT * > FROM (VALUES ('one'),('two'),('three')) AS foo(value) > LEFT OUTER JOIN mytable ON (foo.value = mytable.value); > > value | value > -------+------- > one | > two | two > three | three > (3 rows) > > "USING" would work too but then you only get one output column > rather than two > which is not so helpful in this case. > > postgres=# SELECT * > FROM (VALUES ('one'),('two'),('three')) AS foo(value) > LEFT OUTER JOIN mytable USING (value) ; > > value > ------- > one > two > three > (3 rows) > > -- > Gregory Stark > EnterpriseDB http://www.enterprisedb.com > > > ---------------------------(end of > broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org/
--- Tom Allison <tom@tacocat.net> wrote: > This is getting really ugly... > it won't finish in less than .. minutes. > > spam=> explain select u2.token_idx, t.token_idx, foo.token from > tokens t left outer join user_token u1 using (token_idx), > users u left outer join user_token u2 using (user_idx), > (values('one'),('want'),('examine'),('three')) as foo(token) > left outer join tokens using (token) > where u.user_idx = 15; It looks to me that your query has (3) left joins and (3) implied cross-joins. Perhaps reforming your query to eliminate the cross-joins with help performance. In regard to your tables <tokens> and <user_tokens>, in this query you are referring to (2) separate instances of these tables when a single instance these tables would probably work just fine. i.e. tokens t vs. tokens, user_token u1 vs user_token u2 Regards, Richard Broersma Jr.
I agree I'm going in the wrong direction. in a Venn sort of way, what I'm trying to do is: values(...) --> left outer --> tokens --> left outer --> (user_tokens where user_tokens.user_idx = users.user_idx and users.user_idx = 4) To give me a list of all values || any matching token || any matching user_token where user_idx = 4 something like: SELECT values.token, t.token_idx, ut.token_idx FROM (values('one'),('want'),('examine'),('three')) as values(token) left outer join tokens t using (token) left outer join ( select token_idx from user_token where user_idx = 14) "ut" using (token_idx) ; That seems to be better. I think the part I was trying to get my brain around was how, in postgres, do you do multiple outer joins. On my day job I do this in Oracle without thinking, but the syntax of postgres is new to me. Like doing dates. Everyone has a different way of doing dates and they are all weird. Now I have to go impliment it into my code and see what it actually does. I'm hoping to peel 3-5 seconds off each process! On May 12, 2007, at 12:06 PM, Richard Broersma Jr wrote: > > --- Tom Allison <tom@tacocat.net> wrote: > >> This is getting really ugly... >> it won't finish in less than .. minutes. >> >> spam=> explain select u2.token_idx, t.token_idx, foo.token from >> tokens t left outer join user_token u1 using (token_idx), >> users u left outer join user_token u2 using (user_idx), >> (values('one'),('want'),('examine'),('three')) as foo(token) >> left outer join tokens using (token) >> where u.user_idx = 15; > > It looks to me that your query has (3) left joins and (3) implied > cross-joins. Perhaps reforming > your query to eliminate the cross-joins with help performance. In > regard to your tables <tokens> > and <user_tokens>, in this query you are referring to (2) separate > instances of these tables when > a single instance these tables would probably work just fine. i.e. > > tokens t vs. tokens, > user_token u1 vs user_token u2 > > Regards, > Richard Broersma Jr. > > ---------------------------(end of > broadcast)--------------------------- > TIP 6: explain analyze is your friend
"Tom Allison" <tom@tacocat.net> writes: > This is getting really ugly... > it won't finish in less than .. minutes. > > spam=> explain select u2.token_idx, t.token_idx, foo.token from > tokens t left outer join user_token u1 using (token_idx), > users u left outer join user_token u2 using (user_idx), > (values('one'),('want'),('examine'),('three')) as foo(token) > left outer join tokens using (token) > where u.user_idx = 15; You might find it easier to follow your SQL queries if you formatted them to make their structure clear: SELECT u2.token_idx, t.token_idx, foo.token FROM tokens t LEFT OUTER JOIN user_token u1 USING (token_idx), users u LEFT OUTER JOIN user_token u2 USING (user_idx), (VALUES('one'),('want'),('examine'),('three')) AS foo(token) LEFT OUTER JOIN tokens USING (token) WHERE u.user_idx = 15; Note that your query is joining 6 tables and there are two joins that don't have any join constraint on them. So you're getting the cartesian product of those joins. Postgres estimates your query will return 216 million records. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Thank you very much for all your help!!! Solved this one rather nicely: my $glue = q{'),(E'}; my $string = "(E'" . join($glue, map{quotemeta } @$tokens) . "')"; my $sql =<<SQL; SELECT values.token, t.token_idx, ut.token_idx FROM (values TOKEN_LIST_STRING) as values(token) left outer join tokens t using (token) left outer join ( select token_idx from user_token where user_idx = $self-> {user_idx}) "ut" using (token_idx) SQL $sql =~ s/TOKEN_LIST_STRING/$string/o; ------------------------- This is something on the order to 10-1000 times faster depending on the various result sets matching more/less across the various tables. "very nice..." I'm thinking I could do the same thing where I have a lot of inserts. But this is going to be considerably harder because I don't have any really good experience with doing this with transactional integrity. From this query, I eventually want to grab all the values.token and insert them into the token and user_token table. I can call: insert into tokens(token) values(('one'),('two'),('three')); then call: insert into user_token(user_idx, token_idx) select 14, token_idx from tokens where token in ('one','two','three') And that should work. HOWEVER: If I have two simultaneous INSERT INTO token(token) queries with overlapping values I'm going to get into all kinds of trouble with the integrity constraint on my index of unique tokens. Typically I'll get an integrity violation error. Or am I looking at something like: insert into tokens select .. values(..) as values(token) left outer join tokens using (token) (Am I getting the hang of this any better?) On May 12, 2007, at 12:06 PM, Richard Broersma Jr wrote: > > --- Tom Allison <tom@tacocat.net> wrote: > >> This is getting really ugly... >> it won't finish in less than .. minutes. >> >> spam=> explain select u2.token_idx, t.token_idx, foo.token from >> tokens t left outer join user_token u1 using (token_idx), >> users u left outer join user_token u2 using (user_idx), >> (values('one'),('want'),('examine'),('three')) as foo(token) >> left outer join tokens using (token) >> where u.user_idx = 15; > > It looks to me that your query has (3) left joins and (3) implied > cross-joins. Perhaps reforming > your query to eliminate the cross-joins with help performance. In > regard to your tables <tokens> > and <user_tokens>, in this query you are referring to (2) separate > instances of these tables when > a single instance these tables would probably work just fine. i.e. > > tokens t vs. tokens, > user_token u1 vs user_token u2 > > Regards, > Richard Broersma Jr.
I think I fixed the rest of my sql statements with the following: insert into tokens (token)select values.token from (values TOKEN_LIST_STRING ) as values(token) left outer join tokens t using (token) where t.token_idx is null insert into user_token(user_idx, token_idx)select $self->{user_idx}, token_idxfrom tokens left outer join (select token_idx from user_token where user_idx = $self->{user_idx}) ut using (token_idx) where ut.token_idx is null and token_idx in ($string) I think does what I was trying to accomplish. At least the little test sql seems to work. interestingly, the time to process has gone from >100s to <1s. On May 12, 2007, at 2:23 PM, Tom Allison wrote: > Thank you very much for all your help!!! > > Solved this one rather nicely: > > my $glue = q{'),(E'}; > > my $string = "(E'" . join($glue, map{quotemeta } @$tokens) . "')"; > > my $sql =<<SQL; > SELECT values.token, t.token_idx, ut.token_idx > FROM > (values TOKEN_LIST_STRING) as values(token) > left outer join tokens t using (token) > left outer join > ( select token_idx from user_token where user_idx = $self-> > {user_idx}) "ut" > using (token_idx) > SQL > > $sql =~ s/TOKEN_LIST_STRING/$string/o; > > ------------------------- > This is something on the order to 10-1000 times faster depending on > the various result sets matching more/less across the various tables. > "very nice..." > > I'm thinking I could do the same thing where I have a lot of inserts. > But this is going to be considerably harder because I don't have > any really good experience with doing this with transactional > integrity. > > From this query, I eventually want to grab all the values.token and > insert them into the token and user_token table. > > I can call: > insert into tokens(token) values(('one'),('two'),('three')); > then call: > insert into user_token(user_idx, token_idx) > select 14, token_idx from tokens where token in > ('one','two','three') > > And that should work. > HOWEVER: > If I have two simultaneous INSERT INTO token(token) queries with > overlapping values I'm going to get into all kinds of trouble with > the integrity constraint on > my index of unique tokens. Typically I'll get an integrity > violation error. > > Or am I looking at something like: > insert into tokens > select .. > values(..) as values(token) > left outer join tokens using (token) > > (Am I getting the hang of this any better?) > > > > On May 12, 2007, at 12:06 PM, Richard Broersma Jr wrote: > >> >> --- Tom Allison <tom@tacocat.net> wrote: >> >>> This is getting really ugly... >>> it won't finish in less than .. minutes. >>> >>> spam=> explain select u2.token_idx, t.token_idx, foo.token from >>> tokens t left outer join user_token u1 using (token_idx), >>> users u left outer join user_token u2 using (user_idx), >>> (values('one'),('want'),('examine'),('three')) as foo(token) >>> left outer join tokens using (token) >>> where u.user_idx = 15; >> >> It looks to me that your query has (3) left joins and (3) implied >> cross-joins. Perhaps reforming >> your query to eliminate the cross-joins with help performance. In >> regard to your tables <tokens> >> and <user_tokens>, in this query you are referring to (2) separate >> instances of these tables when >> a single instance these tables would probably work just fine. i.e. >> >> tokens t vs. tokens, >> user_token u1 vs user_token u2 >> >> Regards, >> Richard Broersma Jr. > > > ---------------------------(end of > broadcast)--------------------------- > TIP 2: Don't 'kill -9' the postmaster