Re: How to find first non-vacation day - Mailing list pgsql-general

From Chris Browne
Subject Re: How to find first non-vacation day
Date
Msg-id 607j8cj9j9.fsf@dba2.int.libertyrms.com
Whole thread Raw
In response to How to find first non-vacation day  ("Andrus Moor" <eetasoft@online.ee>)
List pgsql-general
"Andrus Moor" <eetasoft@online.ee> writes:

> I have a table of vacations
>
> create table vacation (
> id integer primary key,
> dstart date,
> dend date );
>
>
> I need to find first non-vacation day before given date.
>
> This can be done using the following procedural vfp code
>
> function nonvacation( dbefore )
>
> for i=dbefore to  date(1960,1,1) step -1
>   select vacation
>   locate for between( i, dstart, dend )
>   if not found()
>     return i
>     endif
>   endfor
> return null
>
> but this is very slow
>
> How to implement this as sql select statement ?

People try to get baroquely clever about building tiny tables to
represent these things; it tends not to work out well, because the
queries get even more baroque...

I'd create a table of all the days of the year:

create table days (
   a_day date,
   primary key(a_day)
);

Fill it in with 365 values:

insert into days
   select '2005-12-31'::date + (generate_series||'days')::interval from generate_series(1,365);

Suppose vacations are thus...

/* cbbrowne@[local]/dba2 ~=*/ select * from vacation;
   dstart   |    dend
------------+------------
 2006-01-01 | 2006-01-01
 2006-03-15 | 2006-03-19
 2006-12-24 | 2006-12-25
(3 rows)

Forget about your representation of vacation; replace it with the
following "set of vacation days":

create table vacation_days as select distinct a_day from vacation, days where a_day between dstart and dend;

Now, to find the *last working day* before, oh, say, 2006-03-18...

/* cbbrowne@[local]/dba2 ~=*/ select max(a_day) from (select a_day from days where a_day not in (select a_day from
vacation_days))as non_vac_days where a_day < '2006-03-18'; 
    max
------------
 2006-03-14
(1 row)

Determining cost...

/* cbbrowne@[local]/dba2 ~=*/ explain analyze select max(a_day) from (select a_day from days where a_day not in (select
a_dayfrom vacation_days)) as non_vac_days where a_day < '2006-03-18'; 
                                                       QUERY PLAN


------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=5.43..5.44 rows=1 width=4) (actual time=0.644..0.647 rows=1 loops=1)
   ->  Index Scan using days_pkey on days  (cost=1.10..5.33 rows=38 width=4) (actual time=0.112..0.406 rows=72 loops=1)
         Index Cond: (a_day < '2006-03-18'::date)
         Filter: (NOT (hashed subplan))
         SubPlan
           ->  Seq Scan on vacation_days  (cost=0.00..1.08 rows=8 width=4) (actual time=0.006..0.033 rows=8 loops=1)
 Total runtime: 0.729 ms
(7 rows)

If you're doing a lot of calculations of "work days," then it would
make a lot of sense to create a "materialized calendar" representing
the work days of the year...

--- Start with all days
create table work_calendar as select a_day from days;
create unique index wc_day on work_calendar (a_day);
--- Drop out Saturday/Sunday
delete from work_calendar where date_part('dow', a_day) not in (0,6);
--- Drop out vacation days
delete from work_calendar where a_day in (select a_day from days, vacation where a_day between dstart and dend);

/* cbbrowne@[local]/dba2 ~=*/ select max(a_day) from work_calendar where a_day < '2006-03-18';
    max
------------
 2006-03-14
(1 row)

/* cbbrowne@[local]/dba2 ~=*/ explain analyze select max(a_day) from work_calendar where a_day < '2006-03-18';
                                                                  QUERY PLAN
                      

----------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=0.07..0.08 rows=1 width=0) (actual time=0.043..0.047 rows=1 loops=1)
   InitPlan
     ->  Limit  (cost=0.00..0.07 rows=1 width=4) (actual time=0.027..0.030 rows=1 loops=1)
           ->  Index Scan Backward using wc_day on work_calendar  (cost=0.00..3.73 rows=54 width=4) (actual
time=0.019..0.019rows=1 loops=1) 
                 Index Cond: (a_day < '2006-03-18'::date)
                 Filter: (a_day IS NOT NULL)
 Total runtime: 0.101 ms
(7 rows)

The overall point: If you create the calendars as sets of days, then
SQL provides you with *excellent* ways of manipulating them as sets
where you say things like "where day is in this set" and "where day is
*not* in that set" and such.
--
"cbbrowne","@","acm.org"
http://cbbrowne.com/info/nonrdbms.html
"If the programmer  can simulate a construct faster  then the compiler
can implement the construct itself, then the compiler writer has blown
it badly."  -- Guy L. Steele, Jr., Tartan Laboratories

pgsql-general by date:

Previous
From: Philippe Ferreira
Date:
Subject: Bug with sequences and WAL ?
Next
From: Chris Browne
Date:
Subject: Re: How to find first non-vacation day