The on-disk sort, and everything after it, still looks like where the query plan dies. It seems odd that the sort spilled to disk when work_mem was already 250MB. Can you allocate more as a test? Alternatively, if this data is needed frequently, can you aggregate it and keep a summarized copy that is refreshed periodically?
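
For what it's worth, here is a rough sketch of both ideas. The table and column names (orders, customer_id, amount), the view name order_totals, and the 1GB figure are all made up for illustration, so substitute your own query and a value your server can actually afford. The materialized view part assumes a PostgreSQL version recent enough to support REFRESH ... CONCURRENTLY.

```sql
-- Test 1: raise work_mem for this session only, then re-check the plan.
-- '1GB' is an arbitrary test value, not a recommendation.
SET work_mem = '1GB';

-- Hypothetical stand-in for the slow query. In the output, look at the
-- Sort node: "Sort Method: external merge  Disk: ..." means it still
-- spilled to disk, "Sort Method: quicksort  Memory: ..." means it fit.
EXPLAIN (ANALYZE, BUFFERS)
SELECT customer_id, sum(amount) AS total_amount
FROM orders
GROUP BY customer_id
ORDER BY total_amount DESC;

-- Put the session back to the server default afterwards.
RESET work_mem;

-- Test 2: keep a summarized copy instead of aggregating on every request.
CREATE MATERIALIZED VIEW order_totals AS
SELECT customer_id, sum(amount) AS total_amount
FROM orders
GROUP BY customer_id;

-- REFRESH ... CONCURRENTLY requires a unique index on the view.
CREATE UNIQUE INDEX ON order_totals (customer_id);

-- Run this periodically (cron or similar) to bring the summary up to date.
REFRESH MATERIALIZED VIEW CONCURRENTLY order_totals;
```

Since SET only affects the current session, the first test is safe to try without touching postgresql.conf. The summary approach only makes sense if somewhat stale results are acceptable between refreshes.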