Collating records based on a custom group by (aggregate like) function - Mailing list pgsql-hackers
From | Dan Searle |
---|---|
Subject | Collating records based on a custom group by (aggregate like) function |
Date | |
Msg-id | 1281754331.20080318140010@adelix.com Whole thread Raw |
List | pgsql-hackers |
Hi, I want to create a custom function/query (using the C API?) that collates records from a record set (takes multiple records as input) and produces a set of records (returns multiple records). The Postgres docs only give examples of how to create a custom function that returns multiple records (sets of tuples), but does not show how you can create a function that can take a set of tuples as input and scan them. This requirement has come about because I want to scan an event log table and collate individual log entries (group them) depending on a custom notion of equality. For instance, if we had the table "events": timestamp, event 00:00:00, a 00:00:01, b 00:00:02, a 00:10:00, b 00:10:01, b The custom function/query should output: event, start, end a, 00:00:00, 00:00:02 b, 00:00:01, 00:00:01 b, 00:10:00, 00:10:01 It's like grouping the events in the log table by name and timestamp, however the timestamp grouping is not based simply on equality, but the notion that if two events happen within 5 minutes of each other they are collated into the same event "bucket". Another way to look at the problem would be to create a special grouping function in a GROUP BY clause, e.g. SELECT name, min(timestamp), max(timestamp) FROM events GROUP BY name, enumerate_event(name, timestamp); Where "enumerate_event" would take the name ("a" or "b" as in the example previously), and the timestamp for the event, then return a unique collated event ID for each distinct event "bucket". The problem here is that the "enumerate_event" function would need to save it's state (a temporary list of active events) between calls. I can't see an easy way to make a custom function using the Postgres C API that can (for each transaction) setup an internal state, repeatedly use the state during the execution of a query, and then free it's state when the query finishes. I've racked my brain about this but can't think of a simple solution, even though this appears to be a simple problem, any suggestions much appreciated. Regards, Dan... -- Dan Searle Adelix Ltd dan.searle@adelix.com web: www.adelix.com tel: 0845 230 9590 / fax: 0845 230 9591 / support: 0845 230 9592 snail: The Old Post Office, Bristol Rd, Hambrook, Bristol BS16 1RY. UK. Adelix Ltd is a registered company in England & Wales No. 4232156 VAT registration number 779 4232 91 Adelix Ltd is BS EN ISO 9001:2000 Certified (No. GB 12763) Any views expressed in this email communication are those of the individual sender, except where the sender specifically states them to be the views of a member of Adelix Ltd. Adelix Ltd. does not represent, warrant or guarantee that the integrity of this communication has been maintained nor that the communication is free of errors or interference. ------------------------------------------------------------------------------------ Scanned for viruses, spam and offensive content by CensorNet MailSafe Professional Web & E-mail Filtering from www.censornet.com
pgsql-hackers by date: