Thread: query for a time interval
Hello everybody, I'm looking for an elegant SQL statement that will work in Postgresql, MySQL and ORACLE. The query will be executed by Java client. To have this query for Postgresql is priority number one. In this query I try to get a list of message Ids that expired. time_to_live is in seconds. SELECT id FROM mq WHERE now - start_date > time_to_live; I have a following table: CREATE TABLE mq { msg_id INTEGER, retry_date TIMESTAMP NOT NULL DEFAULT ('now'::text)::timestamp(1), start_date TIMESTAMP NOT NULL DEFAULT ('now'::text)::timestamp(1), time_to_live INTEGER } Thanks! Mark. __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com
Hello everybody, I have a table that stores responses to questions in different questionnaires. This table will grow to millions of rows. My problem is that while most of the data in the table are rarely used in queries one type of response will be used quite often: biodata - name, gender, e-mail and this sort of. This data is also collected as responses to questionnaires. My question: what is the best strategy if I wanted to quickly retrieve biodata from this table: CREATE TABLE "public"."itemresponse" ( "testoccasionid" INTEGER NOT NULL, "itemorder" SMALLINT NOT NULL, "response" TEXT NOT NULL, "bio" INTEGER DEFAULT 0 NOT NULL, "datatype" SMALLINT NOT NULL, CONSTRAINT "ItemResponseText_pk" PRIMARY KEY("testoccasionid", "itemorder"), CONSTRAINT "ItemResponseText_TestOccasionID_fkey" FOREIGN KEY ("testoccasionid") REFERENCES "public"."testoccasion"("testoccasionid") ON DELETE NO ACTION ON UPDATE NO ACTION NOT DEFERRABLE ) WITH OIDS; I can store the fact that it is biodata in the bio field - it is biodata if the value of that field is not 0 and I can index that field and simply use that as one of the conditions in queries. Or should I instead create a view that contains only the biodata and select from that? But will postgres use the indexes than? Would that be a better approach? Thanks for the help. SWK
Am Mittwoch, 21. Dezember 2005 21:27 schrieb Klein Balázs: > Or should I instead create a view that contains only the biodata and select > from that? But will postgres use the indexes than? Would that be a better > approach? Whether the query is executed by a view or typed in in its full form by hand is completely irrelevant to the question whether indexes are used or should be created. Views do not optimize anything.
I thought that if I used a view to retrieve data its content might be cached so it would make the query faster. Am Mittwoch, 21. Dezember 2005 21:27 schrieb Klein Balázs: > Or should I instead create a view that contains only the biodata and select > from that? But will postgres use the indexes than? Would that be a better > approach? Whether the query is executed by a view or typed in in its full form by hand is completely irrelevant to the question whether indexes are used or should be created. Views do not optimize anything. ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly
Klein Balázs wrote: > Hello everybody, > > I have a table that stores responses to questions in different > questionnaires. This table will grow to millions of rows. [snip] > I can store the fact that it is biodata in the bio field - it is biodata if > the value of that field is not 0 and I can index that field and simply use > that as one of the conditions in queries. > > Or should I instead create a view that contains only the biodata and select > from that? But will postgres use the indexes than? Would that be a better > approach? Create the index on the table and a view should use it. You might like to read up on partial indexes where you can do something like: CREATE INDEX my_index ON itemresponse (testoccasionid,itemorder) WHERE bio > 0; So long as your query/view definition has WHERE bio > 0 in it then this index can be used. HTH -- Richard Huxton Archonet Ltd
On Wed, Dec 21, 2005 at 10:49:29PM +0100, Klein Bal?zs wrote: > I thought that if I used a view to retrieve data its content might be cached > so it would make the query faster. No. A view is essentially exactly the same as inserting the view definition into the query that's using it. IE: CREATE VIEW v AS SELECT * FROM t; SELECT * FROM v becomes: SELECT * FROM (SELECT * FROM t) v; What you could do is partition the table so that critical information is stored in a smaller table while everything else goes to a larger table. You can then do a UNION ALL view on top of that to 'glue' the two tables together. You can even define rules so that you can do updates on the view. http://cvs.distributed.net/viewcvs.cgi/stats-sql/logdb/ has an example that's similar to this. Note that you'll need an appropriate index on the large table so that PostgreSQL can quickly tell it doesn't contain values that are in the small table. Or, in 8.1 you could use a constraint. You could also do this with inherited tables instead of views. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
On Wed, Dec 21, 2005 at 11:52:56AM -0800, Mark wrote: > SELECT id > FROM mq > WHERE now - start_date > time_to_live; The problem is you can't use an index on this, because you'd need to index on (now() - start_date), which obviously wouldn't work. Instead, re-write the WHERE as: WHERE start_date < now() - time_to_live -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
thanks for the help >What you could do is partition the table so that critical information is >stored in a smaller table while everything else goes to a larger table. I was thinking the other way round - maybe I can split the large table by creating a materialized view. But than I read that it is maybe unnecessary to create a materialized view because pg "Materialized views sound a useful workaround, if your database doesn't have a query cache. If you do have a query cache, then you already effectively have eager or lazy materialized views (depending on your isolation level): Just use your normal view (or query) and let the database figure it out." Quote from Farce Pest in http://spyced.blogspot.com/2005/05/materialized-views-in-postgresql.html But later in the same blog it seems to indicate that there is a choice to either use or not use the query cache of pg. So I don't know now how this cache works and whether it could help me in this. SWK
=?iso-8859-1?Q?Klein_Bal=E1zs?= <Balazs.Klein@axelero.hu> writes: > But later in the same blog it seems to indicate that there is a choice to > either use or not use the query cache of pg. Hm? There is no query cache in PG. regards, tom lane
On Thu, Dec 22, 2005 at 09:47:11AM -0600, Jim C. Nasby wrote: > On Wed, Dec 21, 2005 at 11:52:56AM -0800, Mark wrote: > > SELECT id > > FROM mq > > WHERE now - start_date > time_to_live; > > The problem is you can't use an index on this, because you'd need to > index on (now() - start_date), which obviously wouldn't work. Instead, > re-write the WHERE as: > > WHERE start_date < now() - time_to_live Unless I'm missing something that wouldn't use an index either, because the planner wouldn't know what value to compare start_date against without hitting each row to find that row's time_to_live. But something like this should be able to use an expression index on (start_date + time_to_live): WHERE start_date + time_to_live < now() -- Michael Fuhr
Michael Fuhr <mike@fuhr.org> writes: > Unless I'm missing something that wouldn't use an index either, > because the planner wouldn't know what value to compare start_date > against without hitting each row to find that row's time_to_live. > But something like this should be able to use an expression index > on (start_date + time_to_live): > WHERE start_date + time_to_live < now() Or perhaps simpler, get rid of time_to_live in favor of an end_date column, which could be indexed directly. regards, tom lane
On Wed, Dec 21, 2005 at 11:52:56 -0800, Mark <sendmailtomark@yahoo.com> wrote: > Hello everybody, > > I'm looking for an elegant SQL statement that will work in > Postgresql, MySQL and ORACLE. > The query will be executed by Java client. > > To have this query for Postgresql is priority number one. > > > In this query I try to get a list of message Ids that expired. > > time_to_live is in seconds. > > SELECT id > FROM mq > WHERE now - start_date > time_to_live; An interval is not going to be comparable to an integer without casting. If you use explicit casting the query probably won't be portable. If there is some implicit cast that makes the above SQL valid, it is probably a cast to text which won't do what you want. > > I have a following table: > > CREATE TABLE mq > { > msg_id INTEGER, > retry_date TIMESTAMP NOT NULL DEFAULT ('now'::text)::timestamp(1), > start_date TIMESTAMP NOT NULL DEFAULT ('now'::text)::timestamp(1), > time_to_live INTEGER > } Can you make time_to_live an interval? > > Thanks! > Mark. > > __________________________________________________ > Do You Yahoo!? > Tired of spam? Yahoo! Mail has the best spam protection around > http://mail.yahoo.com > > ---------------------------(end of broadcast)--------------------------- > TIP 9: In versions below 8.0, the planner will ignore your desire to > choose an index scan if your joining column's datatypes do not > match