Thread: max length of sql select statement ?

max length of sql select statement ?

From
markus brosch
Date:
Hi All!

I was searching the archive and was wondering why nobody asked this
strange(!) question (or I've not found it?):

"What is the max allowed length of a sql statement or query?"
I want to combine hundrets or thousands 'OR' within a select statement.
Possible or not (e.g: SELECT * FROM table WHERE col='a' OR col='b' OR
col='c' OR ...... )

This would be a very "dirty" or some would say, a "horrible" solution;

but how are you searching through your table with sql, when your SELECT
is against a collection of different elements (e.g. Array of Strings,
which should fit on one column and return all records which fit)

Hope for help
cheers Markus - sql beginner ;-)





Re: max length of sql select statement ?

From
Date:

Depending on ur original problem EXISTS or IN may be usable
EXISTS is efficient and IN can be used efficiently in 7.4 version
of postgresql

regds
mallah.




> Hi All!
>
> I was searching the archive and was wondering why nobody asked this
> strange(!) question (or I've not found it?):
>
> "What is the max allowed length of a sql statement or query?"
> I want to combine hundrets or thousands 'OR' within a select statement.
> Possible or not (e.g: SELECT * FROM table WHERE col='a' OR col='b' OR
> col='c' OR ...... )
>
> This would be a very "dirty" or some would say, a "horrible" solution;
>
> but how are you searching through your table with sql, when your SELECT
> is against a collection of different elements (e.g. Array of Strings,
> which should fit on one column and return all records which fit)
>
> Hope for help
> cheers Markus - sql beginner ;-)
>
>
>
>
> ---------------------------(end of
> broadcast)--------------------------- TIP 1: subscribe and unsubscribe
> commands go to majordomo@postgresql.org



-----------------------------------------
Get your free web based email at trade-india.com.  "India's Leading B2B eMarketplace.!"
http://www.trade-india.com/




Re: max length of sql select statement ?

From
markus brosch
Date:
On Mon, 2003-07-07 at 17:57, mallah@trade-india.com wrote: 
> Depending on ur original problem EXISTS or IN may be usable
> EXISTS is efficient and IN can be used efficiently in 7.4 version
> of postgresql

Could be a solution?!
The question is - how long could the IN be?

I mean, if I write something like: 
SELECT * FROM table WHERE columnX IN ('a', 'b', 'c', 'd', 'e', ... ); 
How long can the collection (list) within IN be? Also thousands of
elements?

And what means efficient? Goes the DB only once through the table?

Cheers, Markus





Re: max length of sql select statement ?

From
Date:
> On Mon, 2003-07-07 at 17:57, mallah@trade-india.com wrote:
>> Depending on ur original problem EXISTS or IN may be usable
>> EXISTS is efficient and IN can be used efficiently in 7.4 version of
>> postgresql
>
> Could be a solution?!
> The question is - how long could the IN be?
>
> I mean, if I write something like:
> SELECT * FROM table WHERE columnX IN ('a', 'b', 'c', 'd', 'e', ... );
> How long can the collection (list) within IN be? Also thousands of
> elements?


Well i DO NOT know the exact limit.
May be someone else can answer it accurately.

But you could  produce the list within IN using a subselect
that again depends on the exact problem.

regds
Mallah.


>
> And what means efficient? Goes the DB only once through the table?
>
> Cheers, Markus
>
>
>
>
> ---------------------------(end of
> broadcast)--------------------------- TIP 2: you can get off all lists
> at once with the unregister command
>    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)



-----------------------------------------
Get your free web based email at trade-india.com.  "India's Leading B2B eMarketplace.!"
http://www.trade-india.com/




Re: max length of sql select statement ?

From
Rod Taylor
Date:
> Could be a solution?!
> The question is - how long could the IN be?

I'm not sure about IN specifically, but I know you can do:

SELECT * FROM table WHERE col = '<1GB long file>';

It tends not to be friendly for Ram though :)

Re: max length of sql select statement ?

From
markus brosch
Date:
On Mon, 2003-07-07 at 18:13, mallah@trade-india.com wrote:
> Well i DO NOT know the exact limit.
> May be someone else can answer it accurately.
> 
> But you could  produce the list within IN using a subselect
> that again depends on the exact problem.
> 

Maybe anybody knows how many?

Anyway: My exact problem is "in words" quite easy:
col1 | col2
------------123  | 958143  | 394124  | 345324  | 345346  | 541743  | 144346  | 986

Imagine, this table is really big (millions records). 
Now, I want to retrieve for all records in col A OR col B where either
123, 124, 144, 541 (and a view thousands more ...) fits.

As far as I understud you:
SELECT * FROM table 
WHERE col1 IN (123,124,144,541,...) 
OR col2 IN (123,124,144,541,...);

Cheers, Markus



Re: max length of sql select statement ?

From
Stephan Szabo
Date:
On 7 Jul 2003, markus brosch wrote:

> I was searching the archive and was wondering why nobody asked this
> strange(!) question (or I've not found it?):
>
> "What is the max allowed length of a sql statement or query?"

AFAIR in recent versions it's effectively limited only by resources (how
much bandwidth/memory do you want to use).

> I want to combine hundrets or thousands 'OR' within a select statement.
> Possible or not (e.g: SELECT * FROM table WHERE col='a' OR col='b' OR
> col='c' OR ...... )

It should be possible.

> This would be a very "dirty" or some would say, a "horrible" solution;
>
> but how are you searching through your table with sql, when your SELECT
> is against a collection of different elements (e.g. Array of Strings,
> which should fit on one column and return all records which fit)

Are the 'a', 'b', etc... fixed or generated on the fly?  I'm not sure
that the plan for a thousand OR clauses (or equivalently a thousand
entries in a static IN) is going to necessarily be terribly good.  You
might have better luck setting it up to do a join with a table (possibly a
temporary table if they're generated on the fly).



Re: max length of sql select statement ?

From
markus brosch
Date:
> > but how are you searching through your table with sql, when your SELECT
> > is against a collection of different elements (e.g. Array of Strings,
> > which should fit on one column and return all records which fit)
> 
> Are the 'a', 'b', etc... fixed or generated on the fly?  I'm not sure
> that the plan for a thousand OR clauses (or equivalently a thousand
> entries in a static IN) is going to necessarily be terribly good.  You
> might have better luck setting it up to do a join with a table (possibly a
> temporary table if they're generated on the fly).

Generated "on the fly" by JDBC within Java ;-)
I already tried the join, but it takes ages *g*



Re: max length of sql select statement ?

From
markus brosch
Date:
On Mon, 2003-07-07 at 18:20, Rod Taylor wrote: 
> > Could be a solution?!
> > The question is - how long could the IN be?
> 
> I'm not sure about IN specifically, but I know you can do:
> SELECT * FROM table WHERE col = '<1GB long file>';
> It tends not to be friendly for Ram though :)


Hi again!

After I decided to use this statement (discussion two days ago!):
SELECT * FROM table WHERE columnX IN ('a', 'b', 'c', 'd', 'e', ... );

I got this error :-(((

java.sql.SQLException: ERROR:  Expression too complex: nesting depth
exceeds max_expr_depth = 10000

I was searching the web and only found this discussion:


http://groups.google.de/groups?hl=de&lr=&ie=UTF-8&oe=UTF-8&threadm=3B2EA1E9.843D940A%40selectacast.net&rnum=1&prev=/groups%3Fhl%3Dde%26lr%3D%26ie%3DUTF-8%26oe%3DUTF-8%26q%3DExpression%2Btoo%2Bcomplex%253A%2Bnesting%2Bdepth%2Bexceeds%2Bmax_expr_depth%2B%253D%2B10000%26btnG%3DGoogle%2BSuche

Maybe we can disuss that problem here again?! What exactly means
"max_expr_depth"? Thanks for any help
Cheers Markus




Re: max length of sql select statement ?

From
Rod Taylor
Date:
> Maybe we can disuss that problem here again?! What exactly means
> "max_expr_depth"? Thanks for any help

If I'm not mistaken, max_expr_depth is used to catch runaway recursion
(view a is select * from b, view b is select * from a).

It's a tunable in postgresql.conf.  Toss a couple of 0's behind the
number and uncomment the line.


Re: max length of sql select statement ?

From
markus brosch
Date:
On Wed, 2003-07-09 at 17:45, Rod Taylor wrote:
> > Nobody a better idea? Why is the join of a temporary table (the IN
> > paramters) and the original table so slow? Any tricks here? 
> 
> Did you index and ANALYZE the temporary table?

No!

I have to do this "strange" and "long" statement also a view thousand
times ... so I always have to create the temp table again and again and
if I index the table it costs me more time. 

As far I am relativly new to SQL, I ask you:
Where's the difference between and indexed temp (!!!) table and an
unindexed one? The main table is of course fully indexed ;-)

Cheers, Markus




Re: max length of sql select statement ?

From
Rod Taylor
Date:
On Wed, 2003-07-09 at 15:51, markus brosch wrote:
> On Wed, 2003-07-09 at 17:45, Rod Taylor wrote:
> > > Nobody a better idea? Why is the join of a temporary table (the IN
> > > paramters) and the original table so slow? Any tricks here?
> >
> > Did you index and ANALYZE the temporary table?
>
> No!
>
> I have to do this "strange" and "long" statement also a view thousand
> times ... so I always have to create the temp table again and again and
> if I index the table it costs me more time.

So.. which costs more.  Building the index + fast query or doing the
slow query?

If you have anything more than a thousand lines in the temp table, I bet
indexing it would be the better way to go.

> As far I am relativly new to SQL, I ask you:
> Where's the difference between and indexed temp (!!!) table and an
> unindexed one? The main table is of course fully indexed ;-)

One has an index, which means that other (much faster) methods may be
available for the database to use.

Re: max length of sql select statement (long!)

From
markus brosch
Date:
It's me again - with the easy and complex statement at the same time ;-)
Who can help me?!

As far, my solution is still unsatisfactory. 
I'll explain my full task this time, 
otherwise it is hard to explain!

My tables: 
(redundant for speed reasons - doesn't matter - only requests!)

===============================================================

table: goGraph
primarykey(parent,child)
index (B-tree) on parent column and on child column.

parent     |   child
--------------------------
GO:0000001 |   GO:0000002
GO:0000001 |   GO:0000003
GO:0000002 |   GO:0000005
GO:0000002 |   GO:0000006
GO:0000003 |   GO:0000006
GO:0000003 |   GO:0000007
...        |   ...
about 15000 elements more!

This represents a direct acyclic graph:
                GO:0000001                /        \               v          v       GO:0000002        GO:0000003
/        \        /        \      v          v      v          v
 
GO:0000005       GO:0000006        GO:0000007

===============================================================

table: binaryInteraction
primarykey(baitProtein, preyProtein, gogoKey)
index (B-tree) on gogokey.

baitProtein | preyProtein | "baitGoId" | "preyGoId" | gogoKey
--------------------------------------------------------------
123         | 123         | GO:0000001 | GO:0000001 | 11
123         | 123         | GO:0000020 | GO:0000001 | 120
123         | 123         | GO:0000020 | GO:0000020 | 2020
123         | 345         | GO:0000001 | GO:0000001 | 11 
123         | 345         | GO:0000020 | GO:0000001 | 120
...         | ...         | ...        | ...        | ...
up to several millions entries!

===============================================================

So, what I have to do is:
Someone choose 2 goIds. For these ids and all their children
(information from goGraph) I want the count(*) of "fits" within the
table binaryInteraction. That means, baitGoId and preyGoId must be
either the two choosen goIds or one of the related children. I use the
gogoKey for this task: I compute Integers (within Java) from the goIds
and sort smallerGoId.concatenate(biggerGoId) - so I've unique keys for
baitGo / preyGo pairs. One more problem: The goGraph (see table) is a
graph ... and for that reason one node can have more parents. If I
choose e.g. GO:000002 and GO:000003 from the example above and want to
compute their children, I don't want to take GO:000006 into account (two
parents in which I am intested in). That menas, whenever I ask for
children of two nodes, I want a DISTINCT SET of children.  

Example how I am processing the data at the moment:

User chosse GoId1: GO:0000005 and GO:0000008;

Java: 
- get all childs of GO:000005 by jdbc from goGraph - collection "one"
- get all childs of GO:000006 by jdbc from goGraph - collection "two"
- compute intersection of the two collections
- substract of each collection the intersection
- result: distinct children of two goIds

Now the binaryInteraction table came into our game ;-)
I apply the one distinct collection "one" and collection "two" to
gogoKey:

-----------------------

1. SELECT count(*)   FROM (SELECT DISTINCT bait, prey         FROM binaryInteraction         WHERE gogoKey IN
(gogokey1,gogokey2, gogokey3, ... )       )  AS foo;
 

gogokeyX will be processed by java -> it generates for each
go-go-combination the unique gogokey. 

Speed: medium

Limit: only 10000 Elements within the IN statement can be processed by
default. In postgres.conf I can change this limit, as Rod Taylor already
posted. But in the manual there is a warning of BufferOverflow if the
value it too high ...

If I reach the limit, I get something like:
java.sql.SQLException: ERROR:  Expression too complex: nesting depth
exceeds max_expr_depth = 10000

BTW: Why nested depth? Within IN I compute thousands of "OR"s, but why
nested ... ???

-----------------------

2. for all gogokeys I generate a temporary table and do a join with the
binaryInteraction table.

SELECT COUNT(*)
FROM (SELECT DISTINCT binaryInteraction.bait binaryInteraction.prey      FROM binaryInteraction, tempTable      WHERE
binaryInteraction.gogokey= tempTable.gogokey) 
 
AS temp)

Speed: very slow, but no limit!

-----------------------

After all, I want to compute and precalculate every possible
goId-goId-combination ;-) So, each of these things I explained here,
I've to do millions of time! I tried to find a recursive solution -
impossible!

Who can help? Maybe there is a way to compute it in one but more complex
sql-statemnt. Me as far I am still new to sql I don't know the "tricks"!
If there would be a way to calculate the intersection ... then maybe a
solution is not that far ... Also I was thinking about a denormalized
goGraph table: parent -> allChilds. Then I compute (within SQL?) the
intersection between the 2 GoIds and then I apply it to
binaryInteraction.

Thanks for any ideas and thanks at least 
for reading that long story ;-)

Cheers Markus

PS: If you have any additional questions: please feel free to ask!




Re: max length of sql select statement (long!)

From
greg@turnstep.com
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


> ... I don't want to take GO:000006 into account (two parents in which 
> I am intested in). That menas, whenever I ask for children of two nodes, 
> I want a DISTINCT SET of children.

To start with, you can avoid the Java and do this in SQL:

SELECT child FROM gograph WHERE parent='GO:0000002' OR parent='GO:0000005'
EXCEPT
(SELECT child FROM gograph WHERE parent='GO:0000002'
INTERSECT 
SELECT child FROM gograph WHERE parent='GO:0000005');


And yes, I would certainly start by normalizing things a little bit:


CREATE SEQUENCE goid_seq;

CREATE TABLE goID ( idname TEXT, id INTEGER NOT NULL DEFAULT nextval('goid_seq')
);

INSERT INTO goid(idname) SELECT DISTINCT parent FROM gograph;

INSERT INTO goid(idname) 
SELECT DISTINCT child FROM gograph WHERE NOT EXISTS (SELECT 1 FROM goID WHERE idname = child);

CREATE TABLE gomap ( parent INTEGER, child INTEGER
);

INSERT INTO gomap SELECT 
(SELECT id FROM goid WHERE idname=parent),
(SELECT id FROM goid WHERE idname=child)
FROM gograph


As far as the binaryInteraction table, a little more information is needed: 
how are each of these tables being populated? Why the distinct? Is it because 
there may be duplicate rows in the table? The reason I as is that it might be 
better to ue triggers to compute some of the information as it comes in, 
depending on which tables are changes and how often.


- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200307151035

-----BEGIN PGP SIGNATURE-----
Comment: http://www.turnstep.com/pgp.html

iD8DBQE/FBGrvJuQZxSWSsgRAlEfAKCL4ttDdTRHxPRW9N00nowPh1/q/QCgqrkv
e7Ncj4al4aJ4ihktEyweJJo=
=Z/rk
-----END PGP SIGNATURE-----