Re: Find inconsistencies in data with date range - Mailing list pgsql-sql
From | Jason Aleski |
---|---|
Subject | Re: Find inconsistencies in data with date range |
Date | |
Msg-id | 550111A0.2070105@gmail.com Whole thread Raw |
In response to | Re: Find inconsistencies in data with date range (s d <daku.sandor@gmail.com>) |
List | pgsql-sql |
In case anyone else needs similar code, I was able to get this working. Below is the code that pulls the missing datesusing a cursor and returns the information into a table. I'm sure there may be a way to make the code more efficient,but considering this will only get ran maybe once a quarter (for quarterly reports), it works for me. With 700+stores, it takes about 30 minutes to fully run from a reporting server. I have a JAVA program that queries the function"SELECT * FROM eod_missing_dates();" Then sends all the missing dates to a RabbitMQ server to with a worker programto try to rebuild the missing eod summaries and if not, it will send a message to the store managers. Hopefully thiscode will help someone else!<br /><br /><br /> CREATE OR REPLACE FUNCTION eod_missing_dates()<br /> RETURNS TABLE(store_iduuid, location character varying, missing_ts timestamp with time zone) AS<br /> $BODY$<br /> DECLARE<br /> location_cursor CURSOR FOR SELECT * FROM locations ORDER BY store_id;<br /> store_rec location%ROWTYPE;<br /> BEGIN<br/><br /> CREATE TEMP TABLE dr_temptable(dr_ts, dr_dow) AS (SELECT t1.GenDate as gendate, extract(dow from GenDate)AS dayofweek<br /> FROM (SELECT date as GenDate<br /> FROM generate_series('1950-01-01'::date,CURRENT_TIMESTAMP::date,'1day'::interval) date<br /> ) AS t1 <br /> WHERE extract(dowfrom GenDate) NOT IN (0,6));<br /><br /> OPEN location_cursor;<br /> LOOP<br /> FETCH location_cursorINTO store_rec;<br /> EXIT WHEN store_rec IS NULL;<br /><br /> IF NOT FOUND THEN<br /> EXIT;<br/> END IF;<br /> <br /> RAISE INFO '%', 'Checking data/dates for ' || store_rec.location;<br /> RETURNQUERY SELECT store_rec.row_id as store_id, store_rec.location AS location, dr_ts AS missing_ts FROM dr_temptable ASt1<br /> WHERE dr_ts NOT IN (SELECT t1.eod_ts FROM daily_salessummary AS t1<br /> JOIN location AS t2 ON t1.store_id = t2.row_id<br /> WHERE t2.location = store_rec.location) <br /> AND dr_ts > (SELECT start_date FROM locations WHERElocation=store_rec.location); <br /> END LOOP;<br /> CLOSE location_cursor; <br /> DROP TABLE dr_temptable;<br/> END;<br /> $BODY$<br /> LANGUAGE plpgsql VOLATILE<br /> ;<br /><br /><br /><pre class="moz-signature"cols="72">Jason Aleski / IT Specialist</pre><br /><blockquote cite="mid:CAKyoTgbVfWULPdG1-BX3FSfm=hT3aFRs=avM+kg=0ozhP_RCwg@mail.gmail.com"type="cite"><div class="gmail_extra"><br /><divclass="gmail_quote">On 6 March 2015 at 22:38, Jason Aleski <span dir="ltr"><<a href="mailto:jason.aleski@gmail.com"moz-do-not-send="true" target="_blank">jason.aleski@gmail.com</a>></span> wrote:<br/><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><divbgcolor="#FFFFFF" text="#000000"> I know I can do this Java, but I'd rather have this runningas a Stored Procedure. What I am wanting to do is identify and potentially correct the summary data for date inconsistencies. We have policies/red flag reports in place to keep this from happening, but we are now cleaning up history. The query below works on a per store basis, but I'd like to be able to run this for all stores in the location table.<br/><br /> I've looked at some procedure codes regarding looping, but everything I try to create seems to give meproblems. THe code I'm trying is also below. Does anyone have any suggestions on how to accomplish this?<br /><br /><br/><br /><u>Working Tables</u><br /> locations - table contains store information, startup date, address, etc<br /> daily_salessummary - table holds daily sales summary by store (summary should be updated nightly). eod_ts is End ofDay Timestamp.<br /><br /><u>Query</u><br /> WITH datelist AS(<br /> SELECT t1.GenDate as gendate, extract(dow from GenDate)AS dayofweek<br /> FROM (SELECT date as GenDate<br /> FROM generate_series('1985-01-01'::date,CURRENT_TIMESTAMP::date,'1day'::interval) date<br /> ) AS t1 <br /> )<br /> SELECTgendate FROM datelist AS t1<br /> WHERE gendate NOT IN (SELECT t1.eod_ts FROM daily_salessummary AS t1<br /> JOIN locations AS t2 ON t1.location_id = t2.row_id<br /> WHERE t2.locationCode= 'US_FL_TAMPA_141')<br /> <br /> AND gendate > (SELECT start_date FROM locationsWHERE locationCode = 'US_FL_TAMPA_141')<br /><br /><br /><u>Desired Output</u> - could output to an exceptions table<br/> StoreCode 'US_FL_TAMA_141' missing daily summary for 1998-01-01<br /> StoreCode 'MX_OAXACA_SALINA_8344' missingdaily summary for 2011-06-05<br /><br /><br /><u>ProcedureSQL</u> (contains unknown errors)<br /> DECLARE<br /> CURSOR location_table IS<br /> SELECT locationCode FROM locations;<br /> BEGIN<br /> FOR thisSymbol IN ticker_tablesLOOP<br /> EXECUTE IMMEDIATE 'WITH datelist AS(<br /> SELECTt1.GenDate as gendate, extract(dow from GenDate) AS dayofweek<br /> FROM (SELECT date as GenDate<br /> FROM generate_series('1985-01-01'::date,CURRENT_TIMESTAMP::date,'1day'::interval) date<br /> ) AS t1 <br /> )<br /> SELECT gendate FROM datelist AS t1<br /> WHERE gendate NOT IN (SELECT t1.eod_ts FROM daily_salessummary AS t1<br /> JOIN locations AS t2 ON t1.location_id = t2.row_id<br/> WHERE t2.locationCode = '' || location_table.locationCode || '') <br /> AND gendate >(SELECT start_date FROM locations WHERE locationCode = '' || location_table.locationCode || '')';<br /> END LOOP;<br/> END;<span class="HOEnZb"><font color="#888888"><br /><br /><br /><br /><pre cols="72">-- Jason Aleski / IT Specialist</pre> </font></span></div></blockquote></div><br /></div></blockquote><br />