Thread: Support for Limit in Update, Insert...
Would be any future support for limit in update/insert queries? so you could do something like
update table1 set col1=value1 limit 1000;
would update just the first 1000 rows in the table. I've been playing a little with the SPI and I get the SPI already has the support for limit the number of changes or elements in a query (the SPI_execute function).
Any comments?
Thanks a lot
"Cristian Prieto" <cristian@clickdiario.com> writes: > Would be any future support for limit in update/insert queries? so you = > could do something like > update table1 set col1=3Dvalue1 limit 1000; > would update just the first 1000 rows in the table. That seems like a spectacularly bad idea, considering that you could have no guarantees about *which* 1000 rows get updated. regards, tom lane
Maybe the first 1000 rows based in the primary index???? ----- Original Message ----- From: "Tom Lane" <tgl@sss.pgh.pa.us> To: "Cristian Prieto" <cristian@clickdiario.com> Cc: <pgsql-general@postgresql.org> Sent: Thursday, September 08, 2005 6:05 PM Subject: Re: [GENERAL] Support for Limit in Update, Insert... > "Cristian Prieto" <cristian@clickdiario.com> writes: >> Would be any future support for limit in update/insert queries? so you = >> could do something like >> update table1 set col1=3Dvalue1 limit 1000; >> would update just the first 1000 rows in the table. > > That seems like a spectacularly bad idea, considering that you could > have no guarantees about *which* 1000 rows get updated. > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 9: In versions below 8.0, the planner will ignore your desire to > choose an index scan if your joining column's datatypes do not > match
On Thu, Sep 08, 2005 at 07:19:34PM -0600, Cristian Prieto wrote: > Maybe the first 1000 rows based in the primary index???? No, this is not a satisfactory answer, because 1. it's possible that there's no primary key at all, or 2. said index may not get used for the execution of the update. Maybe something else I'm forgetting right now. This has been discussed before, and rejected. Please see the archives. -- Alvaro Herrera -- Valdivia, Chile Architect, www.EnterpriseDB.com "Java is clearly an example of a money oriented programming" (A. Stepanov)
Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > This has been discussed before, and rejected. Please see the archives. For SELECT, both LIMIT and OFFSET are only well-defined in the presence of an ORDER BY clause. (One could argue that we should reject them when no ORDER BY, but given that the database isn't getting changed as a side effect, that's probably too anal-retentive. When the database *is* going to be changed, however, I for one like well-defined results.) If this proposal included adding an ORDER BY to UPDATE/DELETE, then it would at least be logically consistent. I have not seen the use-case for it though. In any case you can usually get the equivalent result with something like UPDATE foo SET ... WHERE pkey IN (SELECT pkey FROM foo ORDER BY ... LIMIT ...); regards, tom lane
Well, I do have a use case for it. Context: We have data coming in from web requests, which must be fast, so we just insert them in temporary tables without any verification. Then they are periodically processed by a background task, but even that one will process just a chunk at a time to avoid long running queries and the possible socket timeouts bundled with them. Now for identifying a chunk we use a "chunkid" field in those temporary tables, which is initially null. When a chunk is selected for processing, we update the chunkid field with the next value of a sequence, and then all further processing has a where clause which selects only records with that chunkid. Use case: To set the chunkid only for 1000 rows, we actually don't care which ones. The idea is to uniquely mark a chunk of data, we really don't care which rows are selected in each chunk, they will be processed all eventually. Of course right now we do it by something similar with what you proposed, using a subselect with a limit clause, I wonder if a simple update with limit could be faster ? Cheers, Csaba. On Fri, 2005-09-09 at 04:49, Tom Lane wrote: > Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > > This has been discussed before, and rejected. Please see the archives. > > For SELECT, both LIMIT and OFFSET are only well-defined in the presence > of an ORDER BY clause. (One could argue that we should reject them when > no ORDER BY, but given that the database isn't getting changed as a side > effect, that's probably too anal-retentive. When the database *is* > going to be changed, however, I for one like well-defined results.) > > If this proposal included adding an ORDER BY to UPDATE/DELETE, then it > would at least be logically consistent. I have not seen the use-case > for it though. In any case you can usually get the equivalent result > with something like > > UPDATE foo SET ... > WHERE pkey IN (SELECT pkey FROM foo ORDER BY ... LIMIT ...); > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 2: Don't 'kill -9' the postmaster
On Thu, Sep 08, 2005 at 10:49:25PM -0400, Tom Lane wrote: > Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > > This has been discussed before, and rejected. Please see the archives. > > For SELECT, both LIMIT and OFFSET are only well-defined in the presence > of an ORDER BY clause. (One could argue that we should reject them when > no ORDER BY, but given that the database isn't getting changed as a side > effect, that's probably too anal-retentive. When the database *is* > going to be changed, however, I for one like well-defined results.) > > If this proposal included adding an ORDER BY to UPDATE/DELETE, then it > would at least be logically consistent. I have not seen the use-case > for it though. In any case you can usually get the equivalent result > with something like > > UPDATE foo SET ... > WHERE pkey IN (SELECT pkey FROM foo ORDER BY ... LIMIT ...); BTW, this is a case where using ctid would make sense, though you can't: decibel=# update rrs set parent=parent+1 where ctid in (select ctid from rrs order by rrs_id limit 1); ERROR: could not identify an ordering operator for type tid HINT: Use an explicit ordering operator or modify the query. ERROR: could not identify an ordering operator for type tid HINT: Use an explicit ordering operator or modify the query. decibel=# -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
On Fri, Sep 09, 2005 at 06:42:10PM -0500, Jim C. Nasby wrote: > On Thu, Sep 08, 2005 at 10:49:25PM -0400, Tom Lane wrote: > > Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > > > This has been discussed before, and rejected. Please see the archives. > > > > For SELECT, both LIMIT and OFFSET are only well-defined in the presence > > of an ORDER BY clause. (One could argue that we should reject them when > > no ORDER BY, but given that the database isn't getting changed as a side > > effect, that's probably too anal-retentive. When the database *is* > > going to be changed, however, I for one like well-defined results.) > > > > If this proposal included adding an ORDER BY to UPDATE/DELETE, then it > > would at least be logically consistent. I have not seen the use-case > > for it though. In any case you can usually get the equivalent result > > with something like > > > > UPDATE foo SET ... > > WHERE pkey IN (SELECT pkey FROM foo ORDER BY ... LIMIT ...); > > BTW, this is a case where using ctid would make sense, though you can't: > > decibel=# update rrs set parent=parent+1 where ctid in (select ctid from > rrs order by rrs_id limit 1); > ERROR: could not identify an ordering operator for type tid > HINT: Use an explicit ordering operator or modify the query. > ERROR: could not identify an ordering operator for type tid > HINT: Use an explicit ordering operator or modify the query. > decibel=# Actually, after trying this, curiosity took hold: (Note that it's not actually safe to use ctid like this) decibel=# explain analyze select * from rrs where ctid='(0,3)'; QUERY PLAN ---------------------------------------------------------------------------------------------- Tid Scan on rrs (cost=0.00..4.01 rows=1 width=66) (actual time=0.072..0.076 rows=1 loops=1) Filter: (ctid = '(0,3)'::tid) Total runtime: 0.265 ms (3 rows) decibel=# Shouldn't there be an access method that goes directly to the specified ctid instead of doing a seqscan? Even on a small table it seems this would be faster than a seqscan. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461