Re: How to read query plan - Mailing list pgsql-performance
From | Miroslav Šulc |
---|---|
Subject | Re: How to read query plan |
Date | |
Msg-id | 42347407.2000202@startnet.cz Whole thread Raw |
In response to | Re: How to read query plan (John Arbash Meinel <john@arbash-meinel.com>) |
Responses |
Re: How to read query plan
Re: How to read query plan |
List | pgsql-performance |
Hi John, thank you for your response. John Arbash Meinel wrote: > You really need to post the original query, so we can see *why* postgres > thinks it needs to run the plan this way. Here it is: SELECT AdDevicesSites.IDPK, AdDevicesSites.AdDevicesSiteSizeIDFK, AdDevicesSites.AdDevicesSiteRegionIDFK, AdDevicesSites.AdDevicesSiteCountyIDFK, AdDevicesSites.AdDevicesSiteCityIDFK, AdDevicesSites.AdDevicesSiteDistrictIDFK, AdDevicesSites.AdDevicesSiteStreetIDFK, AdDevicesSites.AdDevicesSiteStreetDescriptionIDFK, AdDevicesSites.AdDevicesSitePositionIDFK, AdDevicesSites.AdDevicesSiteVisibilityIDFK, AdDevicesSites.AdDevicesSiteStatusTypeIDFK, AdDevicesSites.AdDevicesSitePartnerIdentificationOperatorIDFK, AdDevicesSites.AdDevicesSitePartnerElectricitySupplierIDFK, AdDevicesSites.AdDevicesSitePartnerMaintainerIDFK, AdDevicesSites.AdDevicesSitePartnerStickerIDFK, AdDevicesSites.CadastralUnitIDFK, AdDevicesSites.MediaType, AdDevicesSites.Mark, AdDevicesSites.Amount, AdDevicesSites.Distance, AdDevicesSites.OwnLightening, AdDevicesSites.LocationDownTown, AdDevicesSites.LocationSuburb, AdDevicesSites.LocationBusinessDistrict, AdDevicesSites.LocationResidentialDistrict, AdDevicesSites.LocationIndustrialDistrict, AdDevicesSites.LocationNoBuildings, AdDevicesSites.ParkWayHighWay, AdDevicesSites.ParkWayFirstClassRoad, AdDevicesSites.ParkWayOtherRoad, AdDevicesSites.ParkWayStreet, AdDevicesSites.ParkWayAccess, AdDevicesSites.ParkWayExit, AdDevicesSites.ParkWayParkingPlace, AdDevicesSites.ParkWayPassangersOnly, AdDevicesSites.ParkWayCrossRoad, AdDevicesSites.PositionStandAlone, AdDevicesSites.NeighbourhoodPublicTransportation, AdDevicesSites.NeighbourhoodInterCityTransportation, AdDevicesSites.NeighbourhoodPostOffice, AdDevicesSites.NeighbourhoodNewsStand, AdDevicesSites.NeighbourhoodAmenities, AdDevicesSites.NeighbourhoodSportsSpot, AdDevicesSites.NeighbourhoodHealthServiceSpot, AdDevicesSites.NeighbourhoodShops, AdDevicesSites.NeighbourhoodShoppingCenter, AdDevicesSites.NeighbourhoodSuperMarket, AdDevicesSites.NeighbourhoodPetrolStation, AdDevicesSites.NeighbourhoodSchool, AdDevicesSites.NeighbourhoodBank, AdDevicesSites.NeighbourhoodRestaurant, AdDevicesSites.NeighbourhoodHotel, AdDevicesSites.RestrictionCigarettes, AdDevicesSites.RestrictionPolitics, AdDevicesSites.RestrictionSpirits, AdDevicesSites.RestrictionSex, AdDevicesSites.RestrictionOther, AdDevicesSites.RestrictionNote, AdDevicesSites.SpotMapFile, AdDevicesSites.SpotPhotoFile, AdDevicesSites.SourcePhotoTimeStamp, AdDevicesSites.SourceMapTimeStamp, AdDevicesSites.Price, AdDevicesSites.WebPrice, AdDevicesSites.CadastralUnitCode, AdDevicesSites.BuildingNumber, AdDevicesSites.ParcelNumber, AdDevicesSites.GPSLatitude, AdDevicesSites.GPSLongitude, AdDevicesSites.GPSHeight, AdDevicesSites.MechanicalOpticalCoordinates, AdDevicesSites.Deleted, AdDevicesSites.Protected, AdDevicesSites.DateCreated, AdDevicesSites.DateLastModified, AdDevicesSites.DateDeleted, AdDevicesSites.CreatedByUserIDFK, AdDevicesSites.LastModifiedByUserIDFK, AdDevicesSites.DeletedByUserIDFK, AdDevicesSites.PhotoLastModificationDate, AdDevicesSites.MapLastModificationDate, AdDevicesSites.DateLastImported, AdDevicesSiteRegions.Name AS AdDevicesSiteRegionName, AdDevicesSiteCounties.Name AS AdDevicesSiteCountyName, AdDevicesSiteCities.Name AS AdDevicesSiteCityName, AdDevicesSiteStreets.Name AS AdDevicesSiteStreetName, AdDevicesSiteDistricts.Name AS AdDevicesSiteDistrictName, AdDevicesSiteStreetDescriptions.Name_cs AS AdDevicesSiteStreetDescriptionName_cs, AdDevicesSiteStreetDescriptions.Name_en AS AdDevicesSiteStreetDescriptionName_en, AdDevicesSiteSizes.Name AS AdDevicesSiteSizeName, SUBSTRING(AdDevicesSiteVisibilities.Name_cs, 3) AS AdDevicesSiteVisibilityName_cs, SUBSTRING(AdDevicesSiteVisibilities.Name_en, 3) AS AdDevicesSiteVisibilityName_en, AdDevicesSitePositions.Name_cs AS AdDevicesSitePositionName_cs, AdDevicesSitePositions.Name_en AS AdDevicesSitePositionName_en, AdDevicesSiteStatusTypes.Name_cs AS AdDevicesSiteStatusTypeName_cs, AdDevicesSiteStatusTypes.Name_en AS AdDevicesSiteStatusTypeName_en, PartnerIdentificationsOperator.Name AS PartnerIdentificationOperatorName, PartnersElectricitySupplier.Name AS PartnerElectricitySupplierName, PartnersMaintainer.Name AS PartnerMaintainerName, PartnersSticker.Name AS PartnerStickerName, CadastralUnits.Code AS CadastralUnitCodeNative, CadastralUnits.Name AS CadastralUnitName FROM AdDevicesSites LEFT JOIN AdDevicesSiteRegions ON AdDevicesSites.AdDevicesSiteRegionIDFK = AdDevicesSiteRegions.IDPK LEFT JOIN AdDevicesSiteCounties ON AdDevicesSites.AdDevicesSiteCountyIDFK = AdDevicesSiteCounties.IDPK LEFT JOIN AdDevicesSiteCities ON AdDevicesSites.AdDevicesSiteCityIDFK = AdDevicesSiteCities.IDPK LEFT JOIN AdDevicesSiteStreets ON AdDevicesSites.AdDevicesSiteStreetIDFK = AdDevicesSiteStreets.IDPK LEFT JOIN AdDevicesSiteStreetDescriptions ON AdDevicesSites.AdDevicesSiteStreetDescriptionIDFK = AdDevicesSiteStreetDescriptions.IDPK LEFT JOIN AdDevicesSiteDistricts ON AdDevicesSites.AdDevicesSiteDistrictIDFK = AdDevicesSiteDistricts.IDPK LEFT JOIN AdDevicesSiteSizes ON AdDevicesSites.AdDevicesSiteSizeIDFK = AdDevicesSiteSizes.IDPK LEFT JOIN AdDevicesSiteVisibilities ON AdDevicesSites.AdDevicesSiteVisibilityIDFK = AdDevicesSiteVisibilities.IDPK LEFT JOIN AdDevicesSitePositions ON AdDevicesSites.AdDevicesSitePositionIDFK = AdDevicesSitePositions.IDPK LEFT JOIN AdDevicesSiteStatusTypes ON AdDevicesSites.AdDevicesSiteStatusTypeIDFK = AdDevicesSiteStatusTypes.IDPK LEFT JOIN PartnerIdentifications AS PartnerIdentificationsOperator ON AdDevicesSites.AdDevicesSitePartnerIdentificationOperatorIDFK = PartnerIdentificationsOperator.IDPK LEFT JOIN Partners AS PartnersElectricitySupplier ON AdDevicesSites.AdDevicesSitePartnerElectricitySupplierIDFK = PartnersElectricitySupplier.IDPK LEFT JOIN Partners AS PartnersMaintainer ON AdDevicesSites.AdDevicesSitePartnerMaintainerIDFK = PartnersMaintainer.IDPK LEFT JOIN Partners AS PartnersSticker ON AdDevicesSites.AdDevicesSitePartnerStickerIDFK = PartnersSticker.IDPK LEFT JOIN CadastralUnits ON AdDevicesSites.CadastralUnitIDFK = CadastralUnits.IDPK > Also, the final sort actually isn't that expensive. > > When you have the numbers (cost=xxx..yyy) the xxx is the time when the > step can start, and the yyy is the time when the step can finish. For a > lot of steps, it can start running while the sub-steps are still feeding > back more data, for others, it has to wait for the sub-steps to finish. This is thi bit of information I didn't find in the documentation and were looking for. Thank you for the enlightening :-) With this knowledge I can see that the JOINs are the bottleneck. > The first thing to look for, is to make sure the estimated number of > rows is close to the actual number of rows. If they are off, then > postgres may be mis-estimating the optimal plan. (If postgres thinks it > is going to only need 10 rows, it may use an index scan, but when 1000 > rows are returned, a seq scan might have been faster.) The "row=" numbers are equal to those of the total count of items in that tables (generated by VACUUM ANALYZE). > You seem to be doing a lot of outer joins. Is that necessary? These external tables contain information that are a unique parameter of the AdDevice (like Position, Region, County, City etc.), in some containing localized description of the property attribute. Some of them could be moved into the main table but that would create a redundancy, some of them cannot be moved into the main table (like information about Partners which is definitely another object with respect to AdDevices). I think the names of the tables are self-explanatory so it should be clear what each table stores. Is this design incorrect? In fact, we only need about 30 records at a time but LIMIT can speed-up the query only when looking for the first 30 records. Setting OFFSET slows the query down. > I don't > really know what you are looking for, but you are joining against enough > tables, that I think this query is always going to be slow. In MySQL the query was not so slow and I don't see any reason why there should be large differences in SELECT speed. But if the design of the tables is incorrect, we will correct it. > From what I can tell, you have 1 table which has 6364 rows, and you are > grabbing all of those rows, and then outer joining it with about 11 > other tables. Here are the exact numbers: AdDevicesSites - 6364 AdDevicesSiteRegions - 15 AdDevicesSiteCounties - 110 AdDevicesSiteCities - 124 AdDevicesSiteStreets - 2858 AdDevicesSiteStreetDescriptions - 4585 AdDevicesSiteDistricts - 344 AdDevicesSiteSizes - 110 AdDevicesSiteVisibilities - 4 AdDevicesSitePositions - 3 AdDevicesSiteStatusTypes - 5 PartnerIdentifications - 61 Partners - 61 CadastralUnits - 13027 > I would actually guess that the most expensive parts of the plan are the > NESTED LOOPS which when they go to materialize have to do a sequential > scan, and they get executed 6364 times. It looks like the other tables > are small (only 3-5 rows), so it takes about 0.05 ms for each seqscan, > the problem is that because you are doing it 6k times, it ends up taking > about 300ms of your time. > > You could try setting "set enable_nestloop to off". > I don't know that it will be faster, but it could be. I have tried that and it resulted in about 2 sec slowdown :-( > In general, though, it seems like you should be asking a different > question, rather than trying to optimize the query that you have. You mean "how should I improve the design to make the query faster"? > Can you post the original SQL statement, and maybe describe what you are > trying to do? I hope the explanation above is clear and sufficient :-) > > John > =:-> >
Attachment
pgsql-performance by date: