Re: Avoiding tuple construction/deconstruction during joining - Mailing list pgsql-performance
From | Miroslav Šulc |
---|---|
Subject | Re: Avoiding tuple construction/deconstruction during joining |
Date | |
Msg-id | 4235E1EF.2000402@startnet.cz Whole thread Raw |
In response to | Avoiding tuple construction/deconstruction during joining (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Avoiding tuple construction/deconstruction during joining
|
List | pgsql-performance |
Tom Lane wrote: >>So I have some results. I have tested the query on both PostgreSQL 8.0.1 >>and MySQL 4.1.8 with LIMIT set to 30 and OFFSET set to 6000. PostgreSQL >>result is 11,667.916 ms, MySQL result is 448.4 ms. >> >> > >That's a fairly impressive discrepancy :-(, and even the slot_getattr() >patch that Atsushi Ogawa provided isn't going to close the gap. >(I got about a 4x speedup on Miroslav's example in my testing, which >leaves us still maybe 6x slower than MySQL.) > > As I wrote, the comparison is not "fair". Here are the conditions: "Both databases are running on the same machine (my laptop) and contain the same data. However there are some differences in the data table definitions: 1) in PostgreSQL I use 'varchar(1)' for a lot of fields and in MySQL I use 'enum' 2) in PostgreSQL in some cases I use connection fields that are not of the same type (smallint <-> integer (SERIAL)), in MySQL I use the same types" For those not used to MySQL, enum is an integer "mapped" to a text string (that's how I see it). That means that when you have field such as enum('yes','no','DK'), in the table there are stored numbers 1, 2 and 3 which are mapped to the text values 'yes', 'no' and 'DK'. The description is not accurate (I'm not MySQL programmer, I didn't check it recently and I didn't inspect the code - I wouldn't understand it either) but I think it's not that important. What is important is the fact that MySQL has to work with some dozen fields that are numbers and PostgreSQL has to work with the same fields as varchar(). Some of the other fields are also varchars. This might (or might not) cause the speed difference. However I think that if devs figure out how to speed this up, other cases will benefit from the improvement too. As I understood from the contributions of other, the 2) shouldn't have a great impact on the speed. >Looking at the post-patch profile for the test case, there is still >quite a lot of cycles going into tuple assembly and disassembly: > >Each sample counts as 0.01 seconds. > % cumulative self self total > time seconds seconds calls ms/call ms/call name > 24.47 4.49 4.49 _mcount > 8.01 5.96 1.47 9143692 0.00 0.00 ExecEvalVar > 6.92 7.23 1.27 6614373 0.00 0.00 slot_deformtuple > 6.54 8.43 1.20 9143692 0.00 0.00 slot_getattr > 6.21 9.57 1.14 103737 0.01 0.03 ExecTargetList > 5.56 10.59 1.02 103775 0.01 0.01 DataFill > 3.22 11.18 0.59 103775 0.01 0.01 ComputeDataSize > 2.83 11.70 0.52 ExecEvalVar > 2.72 12.20 0.50 9094122 0.00 0.00 memcpy > 2.51 12.66 0.46 encore > 2.40 13.10 0.44 427448 0.00 0.00 nocachegetattr > 2.13 13.49 0.39 103775 0.00 0.02 heap_formtuple > 2.07 13.87 0.38 noshlibs > 1.20 14.09 0.22 225329 0.00 0.00 _doprnt > 1.20 14.31 0.22 msquadloop > 1.14 14.52 0.21 chunks > 0.98 14.70 0.18 871885 0.00 0.00 AllocSetAlloc > 0.98 14.88 0.18 $$dyncall > 0.76 15.02 0.14 594242 0.00 0.00 FunctionCall3 > 0.71 15.15 0.13 213312 0.00 0.00 comparetup_heap > 0.65 15.27 0.12 6364 0.02 0.13 printtup > 0.60 15.38 0.11 790702 0.00 0.00 pfree > >(_mcount is profiling overhead, ignore it.) It looks to me like just >about everything in the top dozen functions is there as a result of the >fact that join steps form new tuples that are the merge of their input >tuples. Even our favorite villains, palloc and pfree, are down in the >sub-percent range. > >I am guessing that the reason MySQL wins on this is that they avoid >doing any data copying during a join step. I wonder whether we could >accomplish the same by taking Ogawa's patch to the next level: allow >a TupleTableSlot to contain either a "materialized" tuple as now, >or a "virtual" tuple that is simply an array of Datums and null flags. >(It's virtual in the sense that any pass-by-reference Datums would have >to be pointing to data at the next level down.) This would essentially >turn the formtuple and deformtuple operations into no-ops, and get rid >of a lot of the associated overhead such as ComputeDataSize and >DataFill. The only operations that would have to forcibly materialize >a tuple would be ones that need to keep the tuple till after they fetch >their next input tuple --- hashing and sorting are examples, but very >many plan node types don't ever need to do that. > >I haven't worked out the details, but it seems likely that this could be >a relatively nonintrusive patch. The main thing that would be an issue >would be that direct reference to slot->val would become verboten (since >you could no longer be sure there was a materialized tuple there). >I think this would possibly affect some contrib stuff, which is a strong >hint that it'd break some existing user-written code out there. > >Thoughts? > > Mine thought is that I don't know what you are talking about :-) Now seriously, I am far below this level of knowledge. But I can contribute a test that (maybe) can help. I have rewritten the query so it JOINs the varchar() fields (in fact all fields except the IDPK) at the last INNER JOIN. Though there is one more JOIN, the query is more than 5 times faster (1975.312 ms) :-) So my silly opinion is that if the planner could decide that there are too much time expensive fields that are not needed during performing JOINs and these could be JOINed at the last step, it would do it this way :-) Below is the adjusted query and the EXPLAIN ANALYZE output. (Tom, you can run it on the data I have sent you and it should run without changes.) SELECT AdDevicesSites.IDPK, AdDevicesSites2.AdDevicesSiteSizeIDFK, AdDevicesSites2.AdDevicesSiteRegionIDFK, AdDevicesSites2.AdDevicesSiteCountyIDFK, AdDevicesSites2.AdDevicesSiteCityIDFK, AdDevicesSites2.AdDevicesSiteDistrictIDFK, AdDevicesSites2.AdDevicesSiteStreetIDFK, AdDevicesSites2.AdDevicesSiteStreetDescriptionIDFK, AdDevicesSites2.AdDevicesSitePositionIDFK, AdDevicesSites2.AdDevicesSiteVisibilityIDFK, AdDevicesSites2.AdDevicesSiteStatusTypeIDFK, AdDevicesSites2.AdDevicesSitePartnerIdentificationOperatorIDFK, AdDevicesSites2.AdDevicesSitePartnerElectricitySupplierIDFK, AdDevicesSites2.AdDevicesSitePartnerMaintainerIDFK, AdDevicesSites2.AdDevicesSitePartnerStickerIDFK, AdDevicesSites2.CadastralUnitIDFK, AdDevicesSites2.MediaType, AdDevicesSites2.Mark, AdDevicesSites2.Amount, AdDevicesSites2.Distance, AdDevicesSites2.OwnLightening, AdDevicesSites2.LocationDownTown, AdDevicesSites2.LocationSuburb, AdDevicesSites2.LocationBusinessDistrict, AdDevicesSites2.LocationResidentialDistrict, AdDevicesSites2.LocationIndustrialDistrict, AdDevicesSites2.LocationNoBuildings, AdDevicesSites2.ParkWayHighWay, AdDevicesSites2.ParkWayFirstClassRoad, AdDevicesSites2.ParkWayOtherRoad, AdDevicesSites2.ParkWayStreet, AdDevicesSites2.ParkWayAccess, AdDevicesSites2.ParkWayExit, AdDevicesSites2.ParkWayParkingPlace, AdDevicesSites2.ParkWayPassangersOnly, AdDevicesSites2.ParkWayCrossRoad, AdDevicesSites2.PositionStandAlone, AdDevicesSites2.NeighbourhoodPublicTransportation, AdDevicesSites2.NeighbourhoodInterCityTransportation, AdDevicesSites2.NeighbourhoodPostOffice, AdDevicesSites2.NeighbourhoodNewsStand, AdDevicesSites2.NeighbourhoodAmenities, AdDevicesSites2.NeighbourhoodSportsSpot, AdDevicesSites2.NeighbourhoodHealthServiceSpot, AdDevicesSites2.NeighbourhoodShops, AdDevicesSites2.NeighbourhoodShoppingCenter, AdDevicesSites2.NeighbourhoodSuperMarket, AdDevicesSites2.NeighbourhoodPetrolStation, AdDevicesSites2.NeighbourhoodSchool, AdDevicesSites2.NeighbourhoodBank, AdDevicesSites2.NeighbourhoodRestaurant, AdDevicesSites2.NeighbourhoodHotel, AdDevicesSites2.RestrictionCigarettes, AdDevicesSites2.RestrictionPolitics, AdDevicesSites2.RestrictionSpirits, AdDevicesSites2.RestrictionSex, AdDevicesSites2.RestrictionOther, AdDevicesSites2.RestrictionNote, AdDevicesSites2.SpotMapFile, AdDevicesSites2.SpotPhotoFile, AdDevicesSites2.SourcePhotoTimeStamp, AdDevicesSites2.SourceMapTimeStamp, AdDevicesSites2.Price, AdDevicesSites2.WebPrice, AdDevicesSites2.CadastralUnitCode, AdDevicesSites2.BuildingNumber, AdDevicesSites2.ParcelNumber, AdDevicesSites2.GPSLatitude, AdDevicesSites2.GPSLongitude, AdDevicesSites2.GPSHeight, AdDevicesSites2.MechanicalOpticalCoordinates, AdDevicesSites2.Deleted, AdDevicesSites2.Protected, AdDevicesSites2.DateCreated, AdDevicesSites2.DateLastModified, AdDevicesSites2.DateDeleted, AdDevicesSites2.CreatedByUserIDFK, AdDevicesSites2.LastModifiedByUserIDFK, AdDevicesSites2.DeletedByUserIDFK, AdDevicesSites2.PhotoLastModificationDate, AdDevicesSites2.MapLastModificationDate, AdDevicesSites2.DateLastImported, AdDevicesSiteRegions.Name AS AdDevicesSiteRegionName, AdDevicesSiteCounties.Name AS AdDevicesSiteCountyName, AdDevicesSiteCities.Name AS AdDevicesSiteCityName, AdDevicesSiteStreets.Name AS AdDevicesSiteStreetName, AdDevicesSiteDistricts.Name AS AdDevicesSiteDistrictName, AdDevicesSiteStreetDescriptions.Name_cs AS AdDevicesSiteStreetDescriptionName_cs, AdDevicesSiteStreetDescriptions.Name_en AS AdDevicesSiteStreetDescriptionName_en, AdDevicesSiteSizes.Name AS AdDevicesSiteSizeName, SUBSTRING(AdDevicesSiteVisibilities.Name_cs, 3) AS AdDevicesSiteVisibilityName_cs, SUBSTRING(AdDevicesSiteVisibilities.Name_en, 3) AS AdDevicesSiteVisibilityName_en, AdDevicesSitePositions.Name_cs AS AdDevicesSitePositionName_cs, AdDevicesSitePositions.Name_en AS AdDevicesSitePositionName_en, AdDevicesSiteStatusTypes.Name_cs AS AdDevicesSiteStatusTypeName_cs, AdDevicesSiteStatusTypes.Name_en AS AdDevicesSiteStatusTypeName_en, PartnerIdentificationsOperator.Name AS PartnerIdentificationOperatorName, PartnersElectricitySupplier.Name AS PartnerElectricitySupplierName, PartnersMaintainer.Name AS PartnerMaintainerName, PartnersSticker.Name AS PartnerStickerName, CadastralUnits.Code AS CadastralUnitCodeNative, CadastralUnits.Name AS CadastralUnitName FROM AdDevicesSites LEFT JOIN AdDevicesSiteRegions ON AdDevicesSites.AdDevicesSiteRegionIDFK = AdDevicesSiteRegions.IDPK LEFT JOIN AdDevicesSiteCounties ON AdDevicesSites.AdDevicesSiteCountyIDFK = AdDevicesSiteCounties.IDPK LEFT JOIN AdDevicesSiteCities ON AdDevicesSites.AdDevicesSiteCityIDFK = AdDevicesSiteCities.IDPK LEFT JOIN AdDevicesSiteStreets ON AdDevicesSites.AdDevicesSiteStreetIDFK = AdDevicesSiteStreets.IDPK LEFT JOIN AdDevicesSiteStreetDescriptions ON AdDevicesSites.AdDevicesSiteStreetDescriptionIDFK = AdDevicesSiteStreetDescriptions.IDPK LEFT JOIN AdDevicesSiteDistricts ON AdDevicesSites.AdDevicesSiteDistrictIDFK = AdDevicesSiteDistricts.IDPK LEFT JOIN AdDevicesSiteSizes ON AdDevicesSites.AdDevicesSiteSizeIDFK = AdDevicesSiteSizes.IDPK LEFT JOIN AdDevicesSiteVisibilities ON AdDevicesSites.AdDevicesSiteVisibilityIDFK = AdDevicesSiteVisibilities.IDPK LEFT JOIN AdDevicesSitePositions ON AdDevicesSites.AdDevicesSitePositionIDFK = AdDevicesSitePositions.IDPK LEFT JOIN AdDevicesSiteStatusTypes ON AdDevicesSites.AdDevicesSiteStatusTypeIDFK = AdDevicesSiteStatusTypes.IDPK LEFT JOIN PartnerIdentifications AS PartnerIdentificationsOperator ON AdDevicesSites.AdDevicesSitePartnerIdentificationOperatorIDFK = PartnerIdentificationsOperator.IDPK LEFT JOIN Partners AS PartnersElectricitySupplier ON AdDevicesSites.AdDevicesSitePartnerElectricitySupplierIDFK = PartnersElectricitySupplier.IDPK LEFT JOIN Partners AS PartnersMaintainer ON AdDevicesSites.AdDevicesSitePartnerMaintainerIDFK = PartnersMaintainer.IDPK LEFT JOIN Partners AS PartnersSticker ON AdDevicesSites.AdDevicesSitePartnerStickerIDFK = PartnersSticker.IDPK LEFT JOIN CadastralUnits ON AdDevicesSites.CadastralUnitIDFK = CadastralUnits.IDPK INNER JOIN AdDevicesSites AS AdDevicesSites2 ON AdDevicesSites.IDPK = AdDevicesSites2.IDPK QUERY PLAN Hash Join (cost=193867.68..235417.92 rows=142556 width=815) (actual time=1577.080..2200.677 rows=6364 loops=1) Hash Cond: ("outer".idpk = "inner".idpk) -> Seq Scan on addevicessites addevicessites2 (cost=0.00..13118.56 rows=142556 width=473) (actual time=40.650..49.195rows=6364 loops=1) -> Hash (cost=186898.29..186898.29 rows=142556 width=350) (actual time=1534.080..1534.080 rows=0 loops=1) -> Merge Right Join (cost=184758.38..186898.29 rows=142556 width=350) (actual time=1187.653..1244.955 rows=6364loops=1) Merge Cond: ("outer".idpk = "inner".cadastralunitidfk) -> Index Scan using cadastralunits_pkey on cadastralunits (cost=0.00..314.49 rows=13027 width=31) (actualtime=0.034..0.128 rows=63 loops=1) -> Sort (cost=184758.38..185114.77 rows=142556 width=325) (actual time=1187.582..1190.111 rows=6364 loops=1) Sort Key: addevicessites.cadastralunitidfk -> Hash Left Join (cost=93887.04..143074.76 rows=142556 width=325) (actual time=502.584..1129.145 rows=6364loops=1) Hash Cond: ("outer".addevicessitepartnerstickeridfk = "inner".idpk) -> Hash Left Join (cost=93884.28..140933.65 rows=142556 width=307) (actual time=502.388..1075.572rows=6364 loops=1) Hash Cond: ("outer".addevicessitepartnermaintaineridfk = "inner".idpk) -> Hash Left Join (cost=93881.51..138792.55 rows=142556 width=289) (actual time=502.208..1023.802rows=6364 loops=1) Hash Cond: ("outer".addevicessitepartnerelectricitysupplieridfk = "inner".idpk) -> Hash Left Join (cost=93878.75..136651.45 rows=142556 width=271) (actual time=502.041..969.959rows=6364 loops=1) Hash Cond: ("outer".addevicessitepartneridentificationoperatoridfk = "inner".idpk) -> Nested Loop Left Join (cost=93875.99..134510.35 rows=142556 width=253) (actualtime=501.849..915.256 rows=6364 loops=1) Join Filter: ("outer".addevicessitestatustypeidfk = "inner".idpk) -> Nested Loop Left Join (cost=93874.93..118471.74 rows=142556 width=228)(actual time=501.826..818.436 rows=6364 loops=1) Join Filter: ("outer".addevicessitepositionidfk = "inner".idpk) -> Nested Loop Left Join (cost=93873.90..108848.18 rows=142556width=207) (actual time=501.802..737.137 rows=6364 loops=1) Join Filter: ("outer".addevicessitevisibilityidfk = "inner".idpk) -> Merge Right Join (cost=93872.86..96017.09 rows=142556width=177) (actual time=501.741..554.834 rows=6364 loops=1) Merge Cond: ("outer".idpk = "inner".addevicessitesizeidfk) -> Index Scan using addevicessitesizes_pkey on addevicessitesizes (cost=0.00..5.62 rows=110 width=14) (actual time=0.009..0.264 rows=110 loops=1) -> Sort (cost=93872.86..94229.25 rows=142556 width=169)(actual time=501.697..504.267 rows=6364 loops=1) Sort Key: addevicessites.addevicessitesizeidfk -> Hash Left Join (cost=57752.91..65764.23 rows=142556width=169) (actual time=234.321..466.130 rows=6364 loops=1) Hash Cond: ("outer".addevicessitedistrictidfk= "inner".idpk) -> Hash Left Join (cost=57743.17..63616.15rows=142556 width=164) (actual time=233.456..421.267 rows=6364 loops=1) Hash Cond: ("outer".addevicessitestreetdescriptionidfk= "inner".idpk) -> Hash Left Join (cost=57634.86..62707.43rows=142556 width=137) (actual time=223.396..362.945 rows=6364 loops=1) Hash Cond: ("outer".addevicessitestreetidfk= "inner".idpk) -> Hash Left Join (cost=57566.95..61844.20rows=142556 width=119) (actual time=217.101..312.605 rows=6364 loops=1) Hash Cond: ("outer".addevicessitecityidfk= "inner".idpk) -> Merge Right Join (cost=57561.85..59700.76rows=142556 width=110) (actual time=216.635..266.672 rows=6364 loops=1) Merge Cond: ("outer".idpk= "inner".addevicessitecountyidfk) -> Sort (cost=6.19..6.48rows=117 width=17) (actual time=0.350..0.389 rows=116 loops=1) Sort Key: addevicessitecounties.idpk -> Seq Scanon addevicessitecounties (cost=0.00..2.17 rows=117 width=17) (actual time=0.008..0.146 rows=117 loops=1) -> Sort (cost=57555.66..57912.05rows=142556 width=99) (actual time=216.250..218.611 rows=6364 loops=1) Sort Key: addevicessites.addevicessitecountyidfk -> Merge RightJoin (cost=33573.63..35712.03 rows=142556 width=99) (actual time=173.646..199.036 rows=6364 loops=1) MergeCond: ("outer".idpk = "inner".addevicessiteregionidfk) -> Sort (cost=1.44..1.48 rows=15 width=23) (actual time=0.055..0.059 rows=13 loops=1) Sort Key: addevicessiteregions.idpk -> Seq Scan on addevicessiteregions (cost=0.00..1.15 rows=15 width=23) (actual time=0.016..0.032 rows=15 loops=1) -> Sort (cost=33572.19..33928.58 rows=142556 width=82) (actual time=173.559..176.398 rows=6364 loops=1) Sort Key: addevicessites.addevicessiteregionidfk -> Seq Scan on addevicessites (cost=0.00..13118.56 rows=142556 width=82) (actual time=62.345..164.783 rows=6364 loops=1) -> Hash (cost=4.48..4.48rows=248 width=17) (actual time=0.283..0.283 rows=0 loops=1) -> Seq Scan on addevicessitecities (cost=0.00..4.48 rows=248 width=17) (actual time=0.011..0.162 rows=138 loops=1) -> Hash (cost=60.53..60.53rows=2953 width=34) (actual time=6.229..6.229 rows=0 loops=1) -> Seq Scan on addevicessitestreets (cost=0.00..60.53 rows=2953 width=34) (actual time=0.010..3.816 rows=2984 loops=1) -> Hash (cost=96.85..96.85 rows=4585width=43) (actual time=10.017..10.017 rows=0 loops=1) -> Seq Scan on addevicessitestreetdescriptions (cost=0.00..96.85 rows=4585 width=43) (actual time=0.008..6.371 rows=4585 loops=1) -> Hash (cost=8.59..8.59 rows=459 width=21)(actual time=0.815..0.815 rows=0 loops=1) -> Seq Scan on addevicessitedistricts (cost=0.00..8.59 rows=459 width=21) (actual time=0.007..0.541 rows=382 loops=1) -> Materialize (cost=1.04..1.08 rows=4 width=36) (actualtime=0.000..0.002 rows=4 loops=6364) -> Seq Scan on addevicessitevisibilities (cost=0.00..1.04rows=4 width=36) (actual time=0.026..0.035 rows=4 loops=1) -> Materialize (cost=1.03..1.06 rows=3 width=27) (actual time=0.000..0.001rows=3 loops=6364) -> Seq Scan on addevicessitepositions (cost=0.00..1.03 rows=3width=27) (actual time=0.007..0.010 rows=3 loops=1) -> Materialize (cost=1.05..1.10 rows=5 width=31) (actual time=0.000..0.002rows=5 loops=6364) -> Seq Scan on addevicessitestatustypes (cost=0.00..1.05 rows=5width=31) (actual time=0.006..0.016 rows=5 loops=1) -> Hash (cost=2.61..2.61 rows=61 width=34) (actual time=0.144..0.144 rows=0loops=1) -> Seq Scan on partneridentifications partneridentificationsoperator (cost=0.00..2.61 rows=61 width=34) (actual time=0.007..0.103 rows=61 loops=1) -> Hash (cost=2.61..2.61 rows=61 width=34) (actual time=0.122..0.122 rows=0 loops=1) -> Seq Scan on partners partnerselectricitysupplier (cost=0.00..2.61 rows=61width=34) (actual time=0.004..0.072 rows=61 loops=1) -> Hash (cost=2.61..2.61 rows=61 width=34) (actual time=0.136..0.136 rows=0 loops=1) -> Seq Scan on partners partnersmaintainer (cost=0.00..2.61 rows=61 width=34) (actualtime=0.005..0.086 rows=61 loops=1) -> Hash (cost=2.61..2.61 rows=61 width=34) (actual time=0.143..0.143 rows=0 loops=1) -> Seq Scan on partners partnerssticker (cost=0.00..2.61 rows=61 width=34) (actual time=0.009..0.098rows=61 loops=1) Total runtime: 2210.937 ms > regards, tom lane > > Miroslav
Attachment
pgsql-performance by date: