Re: How to read query plan - Mailing list pgsql-performance

From Miroslav Šulc
Subject Re: How to read query plan
Date
Msg-id 42347407.2000202@startnet.cz
Whole thread Raw
In response to Re: How to read query plan  (John Arbash Meinel <john@arbash-meinel.com>)
Responses Re: How to read query plan
Re: How to read query plan
List pgsql-performance
Hi John,

thank you for your response.

John Arbash Meinel wrote:

> You really need to post the original query, so we can see *why* postgres
> thinks it needs to run the plan this way.

Here it is:

SELECT AdDevicesSites.IDPK, AdDevicesSites.AdDevicesSiteSizeIDFK,
AdDevicesSites.AdDevicesSiteRegionIDFK,
AdDevicesSites.AdDevicesSiteCountyIDFK,
AdDevicesSites.AdDevicesSiteCityIDFK,
AdDevicesSites.AdDevicesSiteDistrictIDFK,
AdDevicesSites.AdDevicesSiteStreetIDFK,
AdDevicesSites.AdDevicesSiteStreetDescriptionIDFK,
AdDevicesSites.AdDevicesSitePositionIDFK,
AdDevicesSites.AdDevicesSiteVisibilityIDFK,
AdDevicesSites.AdDevicesSiteStatusTypeIDFK,
AdDevicesSites.AdDevicesSitePartnerIdentificationOperatorIDFK,
AdDevicesSites.AdDevicesSitePartnerElectricitySupplierIDFK,
AdDevicesSites.AdDevicesSitePartnerMaintainerIDFK,
AdDevicesSites.AdDevicesSitePartnerStickerIDFK,
AdDevicesSites.CadastralUnitIDFK, AdDevicesSites.MediaType,
AdDevicesSites.Mark, AdDevicesSites.Amount, AdDevicesSites.Distance,
AdDevicesSites.OwnLightening, AdDevicesSites.LocationDownTown,
AdDevicesSites.LocationSuburb, AdDevicesSites.LocationBusinessDistrict,
AdDevicesSites.LocationResidentialDistrict,
AdDevicesSites.LocationIndustrialDistrict,
AdDevicesSites.LocationNoBuildings, AdDevicesSites.ParkWayHighWay,
AdDevicesSites.ParkWayFirstClassRoad, AdDevicesSites.ParkWayOtherRoad,
AdDevicesSites.ParkWayStreet, AdDevicesSites.ParkWayAccess,
AdDevicesSites.ParkWayExit, AdDevicesSites.ParkWayParkingPlace,
AdDevicesSites.ParkWayPassangersOnly, AdDevicesSites.ParkWayCrossRoad,
AdDevicesSites.PositionStandAlone,
AdDevicesSites.NeighbourhoodPublicTransportation,
AdDevicesSites.NeighbourhoodInterCityTransportation,
AdDevicesSites.NeighbourhoodPostOffice,
AdDevicesSites.NeighbourhoodNewsStand,
AdDevicesSites.NeighbourhoodAmenities,
AdDevicesSites.NeighbourhoodSportsSpot,
AdDevicesSites.NeighbourhoodHealthServiceSpot,
AdDevicesSites.NeighbourhoodShops,
AdDevicesSites.NeighbourhoodShoppingCenter,
AdDevicesSites.NeighbourhoodSuperMarket,
AdDevicesSites.NeighbourhoodPetrolStation,
AdDevicesSites.NeighbourhoodSchool, AdDevicesSites.NeighbourhoodBank,
AdDevicesSites.NeighbourhoodRestaurant,
AdDevicesSites.NeighbourhoodHotel, AdDevicesSites.RestrictionCigarettes,
AdDevicesSites.RestrictionPolitics, AdDevicesSites.RestrictionSpirits,
AdDevicesSites.RestrictionSex, AdDevicesSites.RestrictionOther,
AdDevicesSites.RestrictionNote, AdDevicesSites.SpotMapFile,
AdDevicesSites.SpotPhotoFile, AdDevicesSites.SourcePhotoTimeStamp,
AdDevicesSites.SourceMapTimeStamp, AdDevicesSites.Price,
AdDevicesSites.WebPrice, AdDevicesSites.CadastralUnitCode,
AdDevicesSites.BuildingNumber, AdDevicesSites.ParcelNumber,
AdDevicesSites.GPSLatitude, AdDevicesSites.GPSLongitude,
AdDevicesSites.GPSHeight, AdDevicesSites.MechanicalOpticalCoordinates,
AdDevicesSites.Deleted, AdDevicesSites.Protected,
AdDevicesSites.DateCreated, AdDevicesSites.DateLastModified,
AdDevicesSites.DateDeleted, AdDevicesSites.CreatedByUserIDFK,
AdDevicesSites.LastModifiedByUserIDFK, AdDevicesSites.DeletedByUserIDFK,
AdDevicesSites.PhotoLastModificationDate,
AdDevicesSites.MapLastModificationDate, AdDevicesSites.DateLastImported,
AdDevicesSiteRegions.Name AS AdDevicesSiteRegionName,
AdDevicesSiteCounties.Name AS AdDevicesSiteCountyName,
AdDevicesSiteCities.Name AS AdDevicesSiteCityName,
AdDevicesSiteStreets.Name AS AdDevicesSiteStreetName,
AdDevicesSiteDistricts.Name AS AdDevicesSiteDistrictName,
AdDevicesSiteStreetDescriptions.Name_cs AS
AdDevicesSiteStreetDescriptionName_cs,
AdDevicesSiteStreetDescriptions.Name_en AS
AdDevicesSiteStreetDescriptionName_en, AdDevicesSiteSizes.Name AS
AdDevicesSiteSizeName, SUBSTRING(AdDevicesSiteVisibilities.Name_cs, 3)
AS AdDevicesSiteVisibilityName_cs,
SUBSTRING(AdDevicesSiteVisibilities.Name_en, 3) AS
AdDevicesSiteVisibilityName_en, AdDevicesSitePositions.Name_cs AS
AdDevicesSitePositionName_cs, AdDevicesSitePositions.Name_en AS
AdDevicesSitePositionName_en, AdDevicesSiteStatusTypes.Name_cs AS
AdDevicesSiteStatusTypeName_cs, AdDevicesSiteStatusTypes.Name_en AS
AdDevicesSiteStatusTypeName_en, PartnerIdentificationsOperator.Name AS
PartnerIdentificationOperatorName, PartnersElectricitySupplier.Name AS
PartnerElectricitySupplierName, PartnersMaintainer.Name AS
PartnerMaintainerName, PartnersSticker.Name AS PartnerStickerName,
CadastralUnits.Code AS CadastralUnitCodeNative, CadastralUnits.Name AS
CadastralUnitName
FROM AdDevicesSites
LEFT JOIN AdDevicesSiteRegions ON AdDevicesSites.AdDevicesSiteRegionIDFK
= AdDevicesSiteRegions.IDPK
LEFT JOIN AdDevicesSiteCounties ON
AdDevicesSites.AdDevicesSiteCountyIDFK = AdDevicesSiteCounties.IDPK
LEFT JOIN AdDevicesSiteCities ON AdDevicesSites.AdDevicesSiteCityIDFK =
AdDevicesSiteCities.IDPK
LEFT JOIN AdDevicesSiteStreets ON AdDevicesSites.AdDevicesSiteStreetIDFK
= AdDevicesSiteStreets.IDPK
LEFT JOIN AdDevicesSiteStreetDescriptions ON
AdDevicesSites.AdDevicesSiteStreetDescriptionIDFK =
AdDevicesSiteStreetDescriptions.IDPK
LEFT JOIN AdDevicesSiteDistricts ON
AdDevicesSites.AdDevicesSiteDistrictIDFK = AdDevicesSiteDistricts.IDPK
LEFT JOIN AdDevicesSiteSizes ON AdDevicesSites.AdDevicesSiteSizeIDFK =
AdDevicesSiteSizes.IDPK
LEFT JOIN AdDevicesSiteVisibilities ON
AdDevicesSites.AdDevicesSiteVisibilityIDFK = AdDevicesSiteVisibilities.IDPK
LEFT JOIN AdDevicesSitePositions ON
AdDevicesSites.AdDevicesSitePositionIDFK = AdDevicesSitePositions.IDPK
LEFT JOIN AdDevicesSiteStatusTypes ON
AdDevicesSites.AdDevicesSiteStatusTypeIDFK = AdDevicesSiteStatusTypes.IDPK
LEFT JOIN PartnerIdentifications AS PartnerIdentificationsOperator ON
AdDevicesSites.AdDevicesSitePartnerIdentificationOperatorIDFK =
PartnerIdentificationsOperator.IDPK
LEFT JOIN Partners AS PartnersElectricitySupplier ON
AdDevicesSites.AdDevicesSitePartnerElectricitySupplierIDFK =
PartnersElectricitySupplier.IDPK
LEFT JOIN Partners AS PartnersMaintainer ON
AdDevicesSites.AdDevicesSitePartnerMaintainerIDFK = PartnersMaintainer.IDPK
LEFT JOIN Partners AS PartnersSticker ON
AdDevicesSites.AdDevicesSitePartnerStickerIDFK = PartnersSticker.IDPK
LEFT JOIN CadastralUnits ON AdDevicesSites.CadastralUnitIDFK =
CadastralUnits.IDPK

> Also, the final sort actually isn't that expensive.
>
> When you have the numbers (cost=xxx..yyy) the xxx is the time when the
> step can start, and the yyy is the time when the step can finish. For a
> lot of steps, it can start running while the sub-steps are still feeding
> back more data, for others, it has to wait for the sub-steps to finish.

This is thi bit of information I didn't find in the documentation and
were looking for. Thank you for the enlightening :-) With this knowledge
I can see that the JOINs are the bottleneck.

> The first thing to look for, is to make sure the estimated number of
> rows is close to the actual number of rows. If they are off, then
> postgres may be mis-estimating the optimal plan. (If postgres thinks it
> is going to only need 10 rows, it may use an index scan, but when 1000
> rows are returned, a seq scan might have been faster.)

The "row=" numbers are equal to those of the total count of items in
that tables (generated by VACUUM ANALYZE).

> You seem to be doing a lot of outer joins. Is that necessary?

These external tables contain information that are a unique parameter of
the AdDevice (like Position, Region, County, City etc.), in some
containing localized description of the property attribute. Some of them
could be moved into the main table but that would create a redundancy,
some of them cannot be moved into the main table (like information about
Partners which is definitely another object with respect to AdDevices).
I think the names of the tables are self-explanatory so it should be
clear what each table stores. Is this design incorrect?

In fact, we only need about 30 records at a time but LIMIT can speed-up
the query only when looking for the first 30 records. Setting OFFSET
slows the query down.

> I don't
> really know what you are looking for, but you are joining against enough
> tables, that I think this query is always going to be slow.

In MySQL the query was not so slow and I don't see any reason why there
should be large differences in SELECT speed. But if the design of the
tables is incorrect, we will correct it.

> From what I can tell, you have 1 table which has 6364 rows, and you are
> grabbing all of those rows, and then outer joining it with about 11
> other tables.

Here are the exact numbers:

AdDevicesSites - 6364
AdDevicesSiteRegions - 15
AdDevicesSiteCounties - 110
AdDevicesSiteCities - 124
AdDevicesSiteStreets - 2858
AdDevicesSiteStreetDescriptions - 4585
AdDevicesSiteDistricts - 344
AdDevicesSiteSizes - 110
AdDevicesSiteVisibilities - 4
AdDevicesSitePositions - 3
AdDevicesSiteStatusTypes - 5
PartnerIdentifications - 61
Partners - 61
CadastralUnits - 13027

> I would actually guess that the most expensive parts of the plan are the
> NESTED LOOPS which when they go to materialize have to do a sequential
> scan, and they get executed 6364 times. It looks like the other tables
> are small (only 3-5 rows), so it takes about 0.05 ms for each seqscan,
> the problem is that because you are doing it 6k times, it ends up taking
> about 300ms of your time.
>
> You could try setting "set enable_nestloop to off".
> I don't know that it will be faster, but it could be.

I have tried that and it resulted in about 2 sec slowdown :-(

> In general, though, it seems like you should be asking a different
> question, rather than trying to optimize the query that you have.

You mean "how should I improve the design to make the query faster"?

> Can you post the original SQL statement, and maybe describe what you are
> trying to do?

I hope the explanation above is clear and sufficient :-)

>
> John
> =:->
>

Attachment

pgsql-performance by date:

Previous
From: Ragnar Hafstað
Date:
Subject: Re: How to read query plan
Next
From: "Tambet Matiisen"
Date:
Subject: Re: One tuple per transaction