Thread: Re: Speeding up a query.
yes, I have to make that because the data on the table need to be pivoted so it is joined many times with different filter on the column that describe the meaning of the column called numeric_value I'm going to show. That could be very ineffective, event because that table contains something like 25000000 rows... There are two tables in this condition (as you can se in the explain) and both are the table with the higher number of rows in the database. But I don's see any other choice to obtain that information. P.S.: i'm trying with all enable_* to on and pumping to higher values from_collapse_limit and join_collapse_limit that I've put to 30. The result is that the query, after an hour of work, goes out of memory (SQL State 53200)...
Alberto Dalmaso <dalmaso@clesius.it> wrote: > P.S.: i'm trying with all enable_* to on and pumping to higher > values from_collapse_limit and join_collapse_limit that I've put to > 30. Tom suggested that you set those numbers higher than the number of tables joined in the query. I don't think 30 will do that. > The result is that the query, after an hour of work, goes out of > memory (SQL State 53200)... Ouch! Can you provide more details? All information from the PostgreSQL log about that event would be good. If there's anything which might be related in the OS logs from around that time, please include that, too. Also, with those settings at a high value, try running just an EXPLAIN (no ANALYZE) of the query, to see how long that takes, and whether you have a memory issue during the planning phase. (You can use \timing in psql to get a report of the run time of the EXPLAIN.) -Kevin
I promised to provide more details of the query (or the function as it is). Here goes.
Scenario:
A chemotherapy regimen requires chair time and nursing time. A patient might sit in the chair for three hours but the nurse only has to be with them for the first hour. Therefore, nurses can manage multiple chairs at a time. Each regimen has a different time requirement.
To efficiently manage our chair and nursing resources, we want to schedule against these constraints. Our room currently has 17 chairs and around 8 nurses per day. We administer several hundred different regimens and the time for each regimen varies based on the day of the regimen as well as the course. All of these variables are entered and maintained through a web application I wrote.
Scheduling algorithm:
Written in PostgreSQL (naturally), the algorithm is a single function call. It gathers the data for a day into a temporary table and cycles through each appointment. Appointments are scheduled in the following order: locked appointments (previously scheduled and assigned to a nurse and chair), reserved appointments (a desired time slot has been selected), open appointments (ordered by the required chair time descending and the required nurse time descending). Here’s the busy part that loops through each appointment. The table definition follows. Anything beginning with an underscore is a declared variable.
-- Reserved and unscheduled appointments.
FOR _APPOINTMENT IN SELECT * FROM MATRIX_UNSCHEDULED WHERE APPT_STATUS <> 'L' ORDER BY ROW_NUM
LOOP
-- Initialize the variables for this record.
RAISE NOTICE 'Status ''%'' - %', _APPOINTMENT.APPT_STATUS, _APPOINTMENT;
_AVAILABLE := null;
select into _UNIT_INTERVALS, _NURSE_INTERVALS, _UNIT_REQUIRED, _NURSE_REQUIRED
_APPOINTMENT.total_unit_time / 5,
_APPOINTMENT.total_nurse_time / 5,
(_APPOINTMENT.total_unit_time || ' minutes')::INTERVAL,
(_APPOINTMENT.total_nurse_time || ' minutes')::INTERVAL;
-- Find the first available row for the required unit and nurse time.
select into _AVAILABLE unit.row_num
from (
select m1.row_num
from matrix m1,
matrix m2
where m1.unit_id = m2.unit_id
and m1.nurse_id = m2.nurse_id
and m1.unit_scheduled = false
and m2.unit_scheduled = false
and (_APPOINTMENT.reserved_time is null or m1.timeslot = _APPOINTMENT.reserved_time)
and m2.timeslot between m1.timeslot and (m1.timeslot + _UNIT_REQUIRED)
group by m1.row_num
having count(m2.row_num) = _UNIT_INTERVALS + 1
) unit,
(
select m1.row_num
from matrix m1,
matrix m2
where m1.unit_id = m2.unit_id
and m1.nurse_id = m2.nurse_id
and m1.nurse_scheduled = false
and m2.nurse_scheduled = false
and (_APPOINTMENT.reserved_time is null or m1.timeslot = _APPOINTMENT.reserved_time)
and m2.timeslot between m1.timeslot and (m1.timeslot + _NURSE_REQUIRED)
group by m1.row_num
having count(m1.row_num) = _NURSE_INTERVALS + 1
) nurse
where nurse.row_num = unit.row_num
order by unit.row_num
limit 1;
-- Assign the time, unit, and nurse to the unscheduled appointment.
update matrix_unscheduled set
appt_time = matrix.timeslot,
unit_id = matrix.unit_id,
nurse_id = matrix.nurse_id,
appt_status = 'S'
from matrix
where schedule_appt_id = _APPOINTMENT.schedule_appt_id
and matrix.row_num = _AVAILABLE;
-- Mark the unit as scheduled for that time.
update matrix set
unit_scheduled = true
from (select timeslot, unit_id from matrix where row_num = _AVAILABLE) m2
where matrix.unit_id = m2.unit_id
and matrix.timeslot between m2.timeslot and (m2.timeslot + _UNIT_REQUIRED);
-- Mark the nurse as scheduled for that time.
update matrix set
nurse_scheduled = true
from (select timeslot, nurse_id from matrix where row_num = _AVAILABLE) m2
where matrix.nurse_id = m2.nurse_id
and matrix.timeslot between m2.timeslot and (m2.timeslot + _NURSE_REQUIRED);
END LOOP;
CREATE TABLE matrix_unscheduled
(
row_num serial NOT NULL,
schedule_appt_id integer NOT NULL,
appt_time timestamp without time zone,
reserved_time timestamp without time zone,
appt_status character(1) NOT NULL,
unit_id integer,
nurse_id integer,
total_unit_time integer NOT NULL,
total_nurse_time integer NOT NULL,
CONSTRAINT pk_matrix_unscheduled PRIMARY KEY (row_num)
)
WITH (OIDS=FALSE);
CREATE TABLE matrix
(
row_num serial NOT NULL,
timeslot timestamp without time zone NOT NULL,
unit_id integer NOT NULL,
nurse_id integer NOT NULL,
unit_scheduled boolean NOT NULL,
nurse_scheduled boolean NOT NULL,
CONSTRAINT pk_matrix PRIMARY KEY (row_num)
)
WITH (OIDS=FALSE);
There are indexes on “matrix” for “timeslot,unit_id”, “timeslot,nurse_id”, and “unit_id,nurse_id”.
Matthew Hartman
Programmer/Analyst
Information Management, ICP
Kingston General Hospital
(613) 549-6666 x4294
Alberto Dalmaso <dalmaso@clesius.it> writes: > P.S.: i'm trying with all enable_* to on and pumping to higher values > from_collapse_limit and join_collapse_limit that I've put to 30. > The result is that the query, after an hour of work, goes out of memory > (SQL State 53200)... Hmm, is that happening during planning (ie, do you get the same error if you just try to EXPLAIN the query with those settings)? If not, please show the EXPLAIN output. regards, tom lane