Re: Berkeley DB... - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Berkeley DB... |
Date | |
Msg-id | 2130.959491684@sss.pgh.pa.us Whole thread Raw |
In response to | RE: Berkeley DB... ("Mikheev, Vadim" <vmikheev@SECTORBASE.COM>) |
List | pgsql-hackers |
"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes: >>>> Well, I've dropped index but INSERTs still take 70 sec and >>>> COPY just 1sec -:((( Mebbe so, but you can't blame it all on parse/plan overhead. I did some experimentation on this with current sources, using a test case of inserting 100,000 rows of 16 columns (the regression "tenk1" table's contents repeated 10 times). Each test was started with a freshly created empty table. The initial runs were done with all postmaster options except -F defaulted. All numbers are wall-clock time in seconds; the "+" column is the time increase from the previous case: load via COPY, fsync off: 0 indexes 24.45s 1 index 48.88s + 24.43 2 indexes 62.65s + 13.77 3 indexes 96.84s + 34.19 4 indexes 134.09s + 37.25 load via INSERTs, fsync off, one xact (begin/end around all inserts): 0 indexes 194.95s 1 index 247.21s + 52.26 2 indexes 269.69s + 22.48 3 indexes 307.33s + 37.64 4 indexes 352.72s + 45.39 load via INSERTs, fsync off, separate transaction for each insert: 0 indexes 236.53s 1 index 295.96s + 59.43 2 indexes 323.40s + 27.44 [ got bored before doing 3/4 index cases ... ] load via INSERTs, fsync on, separate transactions: 0 indexes 5189.99s [ don't want to know how long it will take with indexes :-( ] So while the parse/plan overhead looks kinda bad next to a bare COPY, it's not anything like a 70:1 penalty. But an fsync per insert is that bad and worse. I then recompiled with -pg to learn more about where the time was going. One of the useful places to look at is calls to FileSeek, since mdread, mdwrite, and mdextend all call it. To calibrate these numbers, the table being created occupies 2326 pages and the first index is 343 pages. Inserts (all in 1 xact), no indexes: 0.00 0.00 1/109528 init_irels [648] 0.00 0.00 85/109528 mdread [592] 0.01 0.00 2327/109528 mdextend [474] 0.01 0.00 2343/109528 mdwrite [517] 0.23 0.00 104772/109528 _mdnblocks [251] [250] 0.0 0.24 0.00 109528 FileSeek [250] Inserts (1 xact), 1 index: 0.00 0.00 1/321663 init_irels [649] 0.00 0.00 2667/321663 mdextend [514] 0.10 0.00 55478/321663 mdread [277] 0.11 0.00 58096/321663 mdwrite [258] 0.38 0.00 205421/321663 _mdnblocks [229] [213] 0.1 0.60 0.00 321663 FileSeek [213] COPY, no indexes: 0.00 0.00 1/109527 init_irels [431] 0.00 0.00 84/109527 mdread [404] 0.00 0.00 2327/109527 mdextend [145] 0.00 0.00 2343/109527 mdwrite [178] 0.07 0.00 104772/109527 _mdnblocks [77] [83] 0.0 0.07 0.00 109527 FileSeek [83] COPY, 1 index: 0.00 0.00 1/218549 init_irels [382] 0.00 0.00 2667/218549 mdextend [220] 0.07 0.00 53917/218549 mdread [106] 0.08 0.00 56542/218549 mdwrite [99] 0.14 0.00 105422/218549 _mdnblocks [120] [90] 0.0 0.30 0.00 218549 FileSeek [90] The extra _mdnblocks() calls for the inserts/1index case seem to be from the pg_index scans in ExecOpenIndices (which is called 100000 times in the inserts case but just once in the COPY case). We know how to fix that. Otherwise the COPY and INSERT paths seem to be pretty similar as far as actual I/O calls go. The thing that jumps out here, however, is that it takes upwards of 50000 page reads and writes to prepare a 343-page index. Most of the write calls turn out to be from BufferReplace, which is pretty conclusive evidence that the default setting of -B 64 is not enough for this example; we need more buffers. At -B 128, inserts/0index seems about the same, inserts/1index traffic is 0.00 0.00 1/270331 init_irels [637] 0.01 0.00 2667/270331 mdextend [510] 0.06 0.00 29798/270331 mdread [354] 0.06 0.00 32444/270331 mdwrite [277] 0.40 0.00 205421/270331 _mdnblocks [229] [223] 0.1 0.52 0.00 270331 FileSeek [223] At -B 256, inserts/1index traffic is 0.00 0.00 1/221849 init_irels [650] 0.00 0.00 2667/221849 mdextend [480] 0.01 0.00 5556/221849 mdread [513] 0.01 0.00 8204/221849 mdwrite [460] 0.37 0.00 205421/221849 _mdnblocks [233] [240] 0.0 0.40 0.00 221849 FileSeek [240] At -B 512, inserts/1index traffic is 0.00 0.00 1/210788 init_irels [650] 0.00 0.00 25/210788 mdread [676] 0.00 0.00 2667/210788 mdextend [555] 0.00 0.00 2674/210788 mdwrite [564] 0.27 0.00 205421/210788 _mdnblocks [248] [271] 0.0 0.28 0.00 210788 FileSeek [271] So as long as the -B setting is large enough to avoid thrashing, there shouldn't be much penalty to making an index. I didn't have time to run the COPY cases but I expect they'd be about the same. Bottom line is that where I/O costs are concerned, the parse/plan overhead for INSERTs is insignificant except for the known problem of wanting to rescan pg_index for each INSERT. The CPU overhead is significant, at least if you're comparing no-fsync performance ... but as I commented before, I doubt we can do a whole lot better in that area for simple INSERTs. regards, tom lane
pgsql-hackers by date: