thank you very much for all this advice! I will try each of these and post back results (some of this stuff, like creating the index, which is happening now, takes a very long time).
> Index Scan using field_unit_id_idx on field (cost=0.00..8746678.52 rows=850149 width=8) (actual time=0.030..2414345.998 rows=10315 loops=1)"
This estimate seems a long way off. Are the stats up-to-date on the table? Try again after running: ANALYZE field;
It might also be a good idea to ANALYZE all the tables. Is auto-vacuum switched on?
The plan in question would work better if you create an index on field (field_name, unit_id);
but I think if you update the stats the plan will switch.
A HashJoin, hashing "unit" and index scanning on field_field_name_idx would have been a much smarter plan choice for the planner to make.
Also how many distinct field_names are there? SELECT COUNT(DISTINCT field_name) FROM field;
You may want to increase the histogram buckets on that columns if there are more than 100 field names, and the number of rows with each field name is highly variable. ALTER TABLE field ALTER COLUMN field_name SET STATISTICS <n buckets>; 100 is the default, and 10000 is the maximum.