More thoughts on sorting - Mailing list pgsql-hackers
| From | PFC | 
|---|---|
| Subject | More thoughts on sorting | 
| Date | |
| Msg-id | op.uxxmuqgncigqcu@soyouz Whole thread Raw | 
| Responses | Re: More thoughts on sorting | 
| List | pgsql-hackers | 
There was a thread some time ago about sorting... it kind of died... I did some tests on a desktop (Postgres 8.3.7, kubuntu, Core 2 dual core, 4GB RAM, RAID1 of 2 SATA disks) Quick conclusions : - grabbing the stuff to sort can be IO bound of course (not here) - for short strings (average 12 bytes), sort is CPU-bound in strcoll() - for longer strings (average 120 bytes), sort is even more CPU-bound in strcoll() - strcoll() time seems to depend on the length of the strings, not the place where a difference occurs (grokking glibc source code confirms) See detailed test procedure below. Locale is fr_FR.UTF-8 and database is UNICODE All strings are ASCII, they are mostly alphanumeric references. There are 391469 strings.min length 6 charsmax length 80 charsavg length 11.82 chars We have a table test with (id INTEGER PRIMARY KEY, TEXT, BYTEA ), and contents of TEXT and BYTEA are identical. We have a table test2 which contains the same thing as test, except the id and a 100-character constant are appended to the strings to make them longer. Test Procedure : Grab test data from : http://home.peufeu.com/pg/test_data_references.copy.gz **** Sorting with Python Sorting all string converted to unicode (from utf8) using strcoll() and correct locale => 5.8 s With longer strings (as in table test2 below ) => 8 s **** Postgres To get query timings, I use \t and "SELECT * FROM test ORDER BY id OFFSET 391468;" which avoids EXPLAIN ANALYZE overhead, it just prints the last row from the results. Timings are a bit shorter than EXPLAIN ANALYZE gives, and I checked the plans, they are all sorts. -- Create test table and load it BEGIN; CREATE TABLE test1 (t TEXT NOT NULL); \copy test1 FROM test_data_references.copy CREATE TABLE test (id SERIAL PRIMARY KEY, t TEXT NOT NULL, b BYTEA NOT NULL ); INSERT INTO test (t,b) SELECT t,t::BYTEA FROM test1; DROP TABLE test1; ALTER TABLE test DROP CONSTRAINT test_pkey; CREATE TABLE test2 (id INTEGER NOT NULL, t TEXT NOT NULL, b BYTEA NOT NULL ); INSERT INTO test2 SELECT id, (t || id || 'This is a dummy text of length 100 bytes************************************************************') AS t, (t || id || 'This is a dummy text of length 100 bytes************************************************************')::BYTEA AS b FROM test; COMMIT; \d test SHOW work_mem; --> 16MB SHOW maintenance_work_mem; --> 512MB \timing -- cache it really well SELECT count(*) FROM test; SELECT count(*) FROM test; SELECT count(*) FROM test; --> 391469 --> Temps : 87,033 ms SELECT * FROM test ORDER BY id OFFSET 391468; --> Temps : 918,893 ms SELECT id FROM test ORDER BY id OFFSET 391468; --> Temps : 948,015 ms Interpretation : - Time for hauling around extra data (SELECT * instead of SELECT id) is not significant. - Sorting by integers is quite fast (not THAT fast though, MySQL below is 3x faster when selecting just 'id' and 2x slower when SELECT *, hum.) SELECT * FROM test ORDER BY b OFFSET 391468; --> Temps : 2145,555 ms SELECT id FROM test ORDER BY b OFFSET 391468; --> Temps : 2152,273 ms Interpretation : - Time for hauling around extra data (SELECT * instead of SELECT id) is not significant. - Sorting by BYTEA is just a memcmp(), it is strange that is it 2x slower than ints. Probably the varlena stuff, I guess. - See ridiculous MySQL results using a BLOB below which are 10x slower SELECT * FROM test ORDER BY t OFFSET 391468; --> Temps : 7305,373 ms SELECT id FROM test ORDER BY t OFFSET 391468; --> Temps : 7345,234 ms Interpretation : - Time for hauling around extra data (SELECT * instead of SELECT id) is not significant. - Sorting localized TEXT really is SLOW !!!!! - The little test above calling strcoll() from Python confirms the slowness is in strcoll() - MySQL (see below) seems to be much faster (about equal to postgres) on VARCHAR, and 2x slower on TEXT (hum...) BEGIN; CREATE INDEX test_id ON test( id ); --> Temps : 555,718 ms CREATE INDEX test_b ON test( b ); --> Temps : 1762,263 ms CREATE INDEX test_t ON test( t ); --> Temps : 6274,624 ms ROLLBACK; Interpretation : - maintenance_work_mem is much higher than work_mem so the sorts are faster, but the slowness in localized text sorting subsists... SELECT count(*) FROM test2; --> 391469 --> Temps : 114,669 ms SELECT * FROM test2 ORDER BY id OFFSET 391468; --> Temps : 1788,246 ms SELECT id FROM test2 ORDER BY id OFFSET 391468; --> Temps : 989,238 ms Interpretation : - Time for hauling around extra data (SELECT * instead of SELECT id) IS significant this time due to the extra string lengths. SELECT * FROM test2 ORDER BY b OFFSET 391468; --> Temps : 2906,108 ms SELECT id FROM test2 ORDER BY b OFFSET 391468; --> Temps : 2554,931 ms SELECT * FROM test2 ORDER BY t OFFSET 391468; --> Temps : 10637,649 ms SELECT id FROM test2 ORDER BY t OFFSET 391468; --> Temps : 10322,480 ms Interpretation : - Note : the strings are longer, however they are sortable only by looking at the first few chars. - As shown by the bytea sort above, hauling around the longer strings does take a little bit more time, perhaps enough to justify an extra second over the short strings, but not to justify 3 more seconds. - So, strcoll() again. SELECT * FROM test2 ORDER BY t::BYTEA OFFSET 391468; --> Temps : 4406,958 ms Interpretation : - Conversion of TEXT to BYTEA and memcmp() is much faster than strcoll() BEGIN; CREATE INDEX test2_id ON test2( id ); --> 586,790 ms CREATE INDEX test2_b ON test2( b ); --> 3011,777 ms CREATE INDEX test2_t ON test2( t ); --> 8970,160 ms ROLLBACK; Now let's check... MySQL 5.0.51a-3ubuntu5.4 ! CREATE TABLE test1 (t TEXT NOT NULL); LOAD DATA INFILE '/tmp/test_data_references.copy' INTO TABLE test1; CREATE TABLE test (id INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY, t TEXT NOT NULL, c VARCHAR(255) NOT NULL, b BLOB NOT NULL ); INSERT INTO test (t,c,b) SELECT t,t,t FROM test1; DROP TABLE test1; ALTER TABLE `test` CHANGE `id` `id` INT( 11 ) NOT NULL; ALTER TABLE test DROP PRIMARY KEY; CREATE TABLE test2 (id INTEGER NOT NULL, t TEXT NOT NULL, c VARCHAR(255) NOT NULL, b BLOB NOT NULL ); INSERT INTO test2 SELECT id, concat(t, id, 'This is a dummy text of length 100 bytes************************************************************'), concat(t, id, 'This is a dummy text of length 100 bytes************************************************************'), concat(t, id, 'This is a dummy text of length 100 bytes************************************************************') FROM test; -- give it same sort buffer as pg had SET sort_buffer_size=16777216; SELECT * FROM test ORDER BY id LIMIT 391468,391468; --> 1.87 sec SELECT id FROM test ORDER BY id LIMIT 391468,391468; --> 0.37 sec SELECT * FROM test ORDER BY t LIMIT 391468,391468; --> 14.67 sec SELECT id FROM test ORDER BY t LIMIT 391468,391468; --> 14.73 sec SELECT * FROM test ORDER BY c LIMIT 391468,391468; --> 6.22 sec SELECT id FROM test ORDER BY c LIMIT 391468,391468; --> 5.37 sec SELECT * FROM test ORDER BY b LIMIT 391468,391468; --> 20.77 sec SELECT id FROM test ORDER BY b LIMIT 391468,391468; --> 21.12 sec -- now table test2 SELECT * FROM test2 ORDER BY id LIMIT 391468,391468; --> 2.02 sec SELECT id FROM test2 ORDER BY id LIMIT 391468,391468; --> 0.47 sec SELECT * FROM test2 ORDER BY t LIMIT 391468,391468; --> 15.45 sec SELECT id FROM test2 ORDER BY t LIMIT 391468,391468; --> 14.81 sec SELECT * FROM test2 ORDER BY c LIMIT 391468,391468; --> 6.01 sec SELECT id FROM test2 ORDER BY c LIMIT 391468,391468; --> 6.53 sec SELECT * FROM test2 ORDER BY b LIMIT 391468,391468; --> 20.33 sec SELECT id FROM test2 ORDER BY b LIMIT 391468,391468; --> 20.84 sec Ideas : In this case, and probably in much other use cases, the strings are references, simple short alphanumeric strings, that contain only ASCII character codes. strcoll() is overkill. Many other things only use ASCII : emails, urls, product codes, zipcodes, etc. - implement per-column locales Set the relevant columns to US_ASCII Problem, per-column locales are definitely not easy to implement, and besides, it seems like too much generalization. Since a table could contain text entries for many countries, how to you decide on the locale to use for the usernames ? Besides, if you go to the end of this line of thought, you'll want to sort according to the current user's locale, and this doesn't solve it at all. - new TEXT types There is citext, the case-insensitive text... Perhaps there should be : * Type ASCII : ASCII TEXT. - Behaves exactly the same as TEXT, except : - Restricted on input to only ASCII values, or maybe a subset of ASCII - Convertible from TEXT (with errors if it encounters an illegal char) - Convertible to TEXT - Sorted using simple fast locale-agnostic case-insensitive comparison Performance gains : As seen above, a big speedup on sorting : => 3.5x faster on 400K small strings. But also, comparing TEXT to BYTEA here, which gives the same results because all strings are actually lowercase : select count(*) from test where t >= 'a' AND t < 'b' ; => returns 32K rows => bytea is 1.6x faster without an index => also 1.6x speedup with an index (Bitmap Index Scan) because of the 'Recheck Cond' which hits on strcoll() again (same with BETWEEN) select count(*) from test where t >= 'annonce_pap_6' AND t < 'annonce_pap_63'; => returns 163 rows => bytea is 1.6 / 1.7x faster on index scan * a little JOIN test with indexes on a and b - 1000 rows (uses nested loop + index scan) SELECT count(b.id) FROM test a JOIN test b ON (a.t=b.t) WHERE a.id <= 1000; => bytea is 2.3-2.5x faster - 100K rows (index scan on id, sort on a.t, merge join with index a.t) SELECT count(b.id) FROM test a JOIN test b ON (a.t=b.t) WHERE a.id <= 100000; => bytea is 3x faster - full table JOIN (uses Merge Join using the index) SELECT * FROM test a JOIN test b ON (a.t=b.t) => bytea is 2x faster I tested on other types of queries, basically using BYTEA instead of TEXT gives a 1.5-2.5x speedup on - < > <= >= BETWEEN - all index operations - JOINs on TEXT columns - sorts (even better speedup) Since case-insensitive comparison of ASCII strings is an extremely fast and simple operation, I don't think an ASCII text type would be slower than a BYTEA... Switching the column in MySQL from utf8_general_ci to ascii_general_ci makes it 2x faster on Sorts. MySQL on Joins : SELECT count(b.id) FROM test a JOIN test b ON (a.t=b.t) WHERE a.id <= 1000; SELECT count(b.id) FROM test a JOIN test b ON (a.t=b.t) WHERE a.id <= 100000; Here MySQL is 10x slower than postgres (ahem) so I guess the string encoding is the least of its worries... Thoughts ?
pgsql-hackers by date: