Thread: Re : Solaris Performance - Profiling
I am attempting to produce a profile for a slow query run on Solaris - (c.f previous thread titled "Solaris performance"). I am a bit confused about the the output I am seeing, as the times do not get anywhere close to the 1 hour elapsed time thatthe query takes....(see "notes" section at the end for detail on the query) I will confess to being a bit of a profile newbie... so if I missed something important, please dont hesitate to put me right. So anyway I thought I would put down here what I actually had...here goes ... I tried 2 approaches - tracing via truss and profiling via gprof. i) Tracing Before profiling I used truss on the backend process, latter bits of the output being : (left hand column is the time delta) 0.0003 read(40, "\0\0\0\087 KC4\b\0\0\0\t".., 8192) = 8192 0.0009 lseek(40, 0x14796000, SEEK_SET) = 0x14796000 0.0002 read(40, "\0\0\0\080 t E90\0\0\0\t".., 8192) = 8192 0.0094 read(43, "\0\0\0\0\0\0\0\0\0\0\0\0".., 8192) = 8192 0.0154 read(43, "\0\0\0\0\0\0\0\0\0\0\0\0".., 8192) = 8192 0.0155 read(43, "\0\0\0\0\0\0\0\0\0\0\0\0".., 8192) = 8192 4992.4387 send(7, " P b l a n k\0 T\002 f 3".., 99, 0) = 99 0.0012 recv(7, " Q S E L E C T c u r r".., 8192, 0) = 22 0.0011 lseek(36, 0, SEEK_END) = 8192 0.0013 send(7, " P b l a n k\0 T\001 t i".., 60, 0) = 60 recv(7, 0x002929A8, 8192, 0) (sleeping...) 46994.9585 recv(7, " X", 8192, 0) = 1 Including more librarys in the truss'ing showed thousands of loops, consisting mainly of memcpy+memcmp, unfortunatly this file is _huge_ ... so I did not take a copy of it (I am not sending this mail from the Solaris box....). ii) Profiling Rebuilt Postgresql with : CFLAGS=-pg ./configure --disable-shared --prefix=/opt/pgsql started up the database, ran the query and shut it down again. A raw profile gmon.out was created, which I ran gprof on. Call Graph for Top 10 granularity: each sample hit covers 4 byte(s) for 4.17% of 0.24 seconds called/total parents index %time self descendents called+self name index called/total children 0.00 0.23 1/1 main [2] [1] 95.8 0.00 0.23 1 PostmasterMain [1] 0.00 0.22 1/1 pgstat_start [4] 0.00 0.01 1/1 reset_shared [11] 0.00 0.00 1/1 MemoryContextInit [41] 0.00 0.00 1/1 checkDataDir [42] 0.00 0.00 1/1 CreateOptsFile [43] 0.00 0.00 1/12 AllocSetContextCreate [24] 0.00 0.00 1/1 ProcessConfigFile [46] 0.00 0.00 1/1 StreamServerPort [47] 0.00 0.00 11/25 pqsignal [72] 0.00 0.00 2/2 getenv [131] 0.00 0.00 2/2 _getopt [4862] 0.00 0.00 1/1 _umask [4891] 0.00 0.00 1/5 _getpid [4841] 0.00 0.00 1/1 EnableExceptionHandling [149] 0.00 0.00 1/1 MemoryContextSwitchTo [163] 0.00 0.00 1/1 ResetAllOptions [168] 0.00 0.00 1/1 SetDataDir [170] 0.00 0.00 1/1 IgnoreSystemIndexes [155] 0.00 0.00 1/2 FindExec [122] 0.00 0.00 1/1 CreateDataDirLockFile [143] 0.00 0.00 1/1 RemovePgTempFiles [167] 0.00 0.00 1/1 XLOGPathInit [172] 0.00 0.00 1/1 DLNewList [148] 0.00 0.00 1/1 pqinitmask [193] 0.00 0.00 1/1 _sigprocmask [4884] 0.00 0.00 1/1 pgstat_init [190] ----------------------------------------------- 0.00 0.23 1/1 _start [3] [2] 95.8 0.00 0.23 1 main [2] 0.00 0.23 1/1 PostmasterMain [1] 0.00 0.00 2/5 _geteuid [4840] 0.00 0.00 1/1 save_ps_display_args [194] 0.00 0.00 1/82 malloc [57] 0.00 0.00 1/49 _strdup [4802] 0.00 0.00 1/2 _getuid [4863] 0.00 0.00 1/137 strlen [55] 0.00 0.00 1/15 strcmp [83] ----------------------------------------------- <spontaneous> [3] 95.8 0.00 0.23 _start [3] 0.00 0.23 1/1 main [2] 0.00 0.00 2/4 atexit [103] ----------------------------------------------- 0.00 0.22 1/1 PostmasterMain [1] [4] 91.7 0.00 0.22 1 pgstat_start [4] 0.02 0.19 1/1 pgstat_main [5] 0.00 0.01 1/2 _fork [15] 0.00 0.00 2/4 fflush [104] 0.00 0.00 1/1 on_exit_reset [188] 0.00 0.00 1/1 ClosePostmasterPorts [142] 0.00 0.00 1/1 exit [179] ----------------------------------------------- 0.02 0.19 1/1 pgstat_start [4] [5] 89.6 0.02 0.19 1 pgstat_main [5] 0.00 0.17 30497/30497 select [6] 0.02 0.00 30499/30556 _memset [9] 0.00 0.01 1/2 _fork [15] 0.00 0.00 1/1 pgstat_read_statsfile [32] 0.00 0.00 1/1 pgstat_recv_bestart [33] 0.00 0.00 1/8 hash_create [20] 0.00 0.00 4/4 pgstat_write_statsfile [38] 0.00 0.00 1/1 set_ps_display [49] 0.00 0.00 14/25 pqsignal [72] 0.00 0.00 6/6 _gettimeofday [4838] 0.00 0.00 3/8 close [91] 0.00 0.00 3/5 read [102] 0.00 0.00 2/45 .div [61] 0.00 0.00 2/17 .rem [75] 0.00 0.00 1/2 _pipe [4864] 0.00 0.00 1/1 init_ps_display [184] 0.00 0.00 1/82 malloc [57] 0.00 0.00 1/1 pgstat_recv_beterm [191] ----------------------------------------------- 0.00 0.17 30497/30497 pgstat_main [5] [6] 70.8 0.00 0.17 30497 select [6] 0.00 0.17 30497/30497 _select [8] ----------------------------------------------- 0.17 0.00 30497/30497 _select [8] [7] 70.8 0.17 0.00 30497 _poll [7] ----------------------------------------------- 0.00 0.17 30497/30497 select [6] [8] 70.8 0.00 0.17 30497 _select [8] 0.17 0.00 30497/30497 _poll [7] 0.00 0.00 6/53 .mul [59] ----------------------------------------------- 0.00 0.00 1/30556 CLOGShmemInit [44] 0.00 0.00 1/30556 StreamServerPort [47] 0.00 0.00 1/30556 pgstat_read_statsfile [32] 0.00 0.00 1/30556 InitShmemAllocation [45] 0.00 0.00 1/30556 LockMethodTableInit [27] 0.00 0.00 1/30556 set_ps_display [49] 0.00 0.00 2/30556 pgstat_add_backend [31] 0.00 0.00 3/30556 XLOGShmemInit [40] 0.00 0.00 9/30556 getiop [28] 0.00 0.00 12/30556 MemoryContextCreate [25] 0.00 0.00 25/30556 seg_alloc [21] 0.02 0.00 30499/30556 pgstat_main [5] [9] 8.3 0.02 0.00 30556 _memset [9] ----------------------------------------------- 0.00 0.01 1/1 reset_shared [11] [10] 4.2 0.00 0.01 1 CreateSharedMemoryAndSemaphores [10] 0.00 0.01 1/1 IpcMemoryCreate [13] 0.00 0.00 1/1 InitLockTable [26] 0.00 0.00 1/1 InitShmemIndex [37] 0.00 0.00 1/1 InitBufferPool [35] 0.00 0.00 1/1 InitFreeSpaceMap [36] 0.00 0.00 1/1 XLOGShmemInit [40] 0.00 0.00 1/1 InitShmemAllocation [45] 0.00 0.00 1/1 CLOGShmemInit [44] 0.00 0.00 1/1 BufferShmemSize [139] 0.00 0.00 1/1 LockShmemSize [162] 0.00 0.00 1/1 XLOGShmemSize [173] 0.00 0.00 1/1 CLOGShmemSize [140] 0.00 0.00 1/2 LWLockShmemSize [127] 0.00 0.00 1/2 SInvalShmemSize [128] 0.00 0.00 1/1 FreeSpaceShmemSize [150] 0.00 0.00 1/1 CreateSpinlocks [147] 0.00 0.00 1/1 CreateLWLocks [144] 0.00 0.00 1/1 InitLocks [157] 0.00 0.00 1/1 InitProcGlobal [158] 0.00 0.00 1/1 CreateSharedInvalidationState [145] 0.00 0.00 1/1 PMSignalInit [164] ----------------------------------------------- Flat Graph granularity: each sample hit covers 4 byte(s) for 4.17% of 0.24 seconds % cumulative self self total time seconds seconds calls ms/call ms/call name 70.8 0.17 0.17 30497 0.01 0.01 _poll [7] 8.3 0.19 0.02 30556 0.00 0.00 _memset [9] 8.3 0.21 0.02 1 20.00 214.98 pgstat_main [5] 4.2 0.22 0.01 2 5.00 5.00 _libc_fork [16] 4.2 0.23 0.01 2 5.00 5.00 _syscall [17] 4.2 0.24 0.01 internal_mcount [14] 0.0 0.24 0.00 30497 0.00 0.01 _select [8] 0.0 0.24 0.00 30497 0.00 0.01 select [6] 0.0 0.24 0.00 8008 0.00 0.00 LWLockAssign [51] 0.0 0.24 0.00 442 0.00 0.00 strcasecmp [52] Index by function name [61] .div [4844] __open64 [4830] _xflsbuf [59] .mul [4809] __sigaction [174] assign_defaultxacti [75] .rem [4869] __sigfillset [175] assign_xlog_sync_me [136] .udiv [4870] _atexit_init [103] atexit [98] .umul [4821] _cerror [176] atfork_alloc [63] AllocSetAlloc [4871] _chmod [177] atfork_append [24] AllocSetContextCrea[4872] _cleanup [79] call_hash [137] AllocSetFree [4817] _close [42] checkDataDir [62] AllocSetFreeIndex [4859] _closedir [178] check_nlspath_env [86] AllocSetInit [4813] _doprnt [56] cleanfree [39] AllocateFile [4873] _doscan [91] close [138] BasicOpenFile [4801] _dowrite [115] element_alloc [139] BufferShmemSize [4825] _endopen [179] exit [44] CLOGShmemInit [4874] _exithandle [92] fclose [140] CLOGShmemSize [4810] _ferror_unlocked [180] fcntl [141] ClearDateCache [4819] _fflush_u [104] fflush [142] ClosePostmasterPort[4860] _fflush_u_iops [93] find_option [143] CreateDataDirLockFi[4861] _fileno [29] fopen [144] CreateLWLocks [4834] _findbuf [80] fputc [121] CreateLockFile [30] _findiop [181] fputs [43] CreateOptsFile [15] _fork [130] fread [145] CreateSharedInvalid[4875] _fprintf [73] free [10] CreateSharedMemoryA[4814] _free_unlocked [116] free_mem [146] CreateSocketLockFil[4824] _fstat64 [182] free_name_value_lis [147] CreateSpinlocks [4831] _fwrite_unlocked [183] fscanf [148] DLNewList [4835] _getc_unlocked [94] fwrite [70] DynaHashAlloc [4845] _getdents64 [105] get_mem [149] EnableExceptionHand[4876] _getegid [131] getenv [122] FindExec [4840] _geteuid [28] getiop [109] FreeFile [4877] _getgid [20] hash_create [150] FreeSpaceShmemSize [4862] _getopt [100] hash_estimate_size [151] GUC_yy_create_buffe[4841] _getpid [81] hash_search [123] GUC_yy_flex_alloc [4838] _gettimeofday [101] hash_select_dirsize [124] GUC_yy_flush_buffer[4863] _getuid [89] hash_seq_init [125] GUC_yy_get_next_buf[4826] _ioctl [76] hash_seq_search [152] GUC_yy_get_previous[4827] _isatty [95] hdefault [126] GUC_yy_init_buffer [4878] _libc_cond_init [22] init_htab [110] GUC_yy_load_buffer_[4853] _libc_fcntl [184] init_ps_display [53] GUC_yylex [16] _libc_fork [14] internal_mcount [153] GUC_yyrestart [4815] _libc_mutex_init [185] issock [154] GUC_yywrap [4822] _libc_open [186] keys_destruct [155] IgnoreSystemIndexes[4846] _libc_open64 [2] main [34] InitBufTable [4854] _libc_rwlock_init [57] malloc [35] InitBufferPool [4811] _libc_sigaction [187] mem_init [156] InitFreeList [4879] _libc_thr_keycreate [54] mutex_unlock [36] InitFreeSpaceMap [4880] _lseek [68] my_log2 [26] InitLockTable [4842] _lseek64 [132] number [157] InitLocks [4799] _malloc_unlocked [67] nvmatch [158] InitProcGlobal [4881] _memccpy [188] on_exit_reset [45] InitShmemAllocation[4798] _memcpy [117] on_proc_exit [37] InitShmemIndex [9] _memset [118] on_shmem_exit [12] InternalIpcMemoryCr[4818] _morecore [106] open [111] InternalIpcSemaphor[4797] _mutex_lock [96] parse_int [159] IpcInitKeyAssignmen[4847] _opendir [189] pfree [13] IpcMemoryCreate [4864] _pipe [31] pgstat_add_backend [112] IpcSemaphoreCreate [7] _poll [190] pgstat_init [113] IpcSemaphoreUnlock [4882] _private_sigprocmas [5] pgstat_main [84] LWLockAcquire [4883] _profil [32] pgstat_read_statsfi [51] LWLockAssign [4848] _putc_unlocked [33] pgstat_recv_bestart [85] LWLockRelease [4832] _read [191] pgstat_recv_beterm [127] LWLockShmemSize [4828] _readdir [4] pgstat_start [160] LockMethodInit [4829] _readdir64 [192] pgstat_sub_backend [27] LockMethodTableInit[4805] _realbufend [38] pgstat_write_statsf [161] LockMethodTableRena[4849] _rename [193] pqinitmask [162] LockShmemSize [4806] _return_negone [72] pqsignal [64] MemoryContextAlloc [4796] _return_zero [82] putc [25] MemoryContextCreate[4803] _sbrk [102] read [41] MemoryContextInit [4804] _sbrk_unlocked [60] realfree [163] MemoryContextSwitch [8] _select [133] realloc [114] NumLWLocks [4839] _semctl [107] rename [164] PMSignalInit [4855] _semget [11] reset_shared [1] PostmasterMain [4856] _semop [194] save_ps_display_arg [46] ProcessConfigFile [4823] _semsys [21] seg_alloc [165] ReadControlFile [4836] _setbufend [6] select [166] RecordSharedMemoryI[4837] _setorientation [97] set_config_option [167] RemovePgTempFiles [18] _shmat [49] set_ps_display [168] ResetAllOptions [19] _shmget [74] sigvalid [169] SIBufferInit [4812] _sigaction [77] snprintf [128] SInvalShmemSize [4816] _sigdelset [99] sprintf [170] SetDataDir [4808] _sigemptyset [52] strcasecmp [65] ShmemAlloc [4865] _sigfillset [134] strcat [23] ShmemInitHash [4884] _sigprocmask [135] strchr [87] ShmemInitStruct [4800] _smalloc [83] strcmp [171] StreamClose [4866] _so_bind [58] strcpy [47] StreamServerPort [4885] _so_connect [88] string_hash [129] ValidateBinary [4886] _so_getsockname [55] strlen [48] ValidatePgVersion [4887] _so_getsockopt [78] strncpy [172] XLOGPathInit [4888] _so_listen [108] strrchr [40] XLOGShmemInit [4889] _so_setsockopt [90] strtol [173] XLOGShmemSize [4857] _stat [195] t_delete [4807] ___errno [4802] _strdup [119] tag_hash [4868] __doscan_u [17] _syscall [66] 1:tas [4851] __fcntl [4890] _sysconfig [71] 1:tas [4852] __filbuf [4891] _umask [69] thr_main [4843] __flsbuf [4867] _unlink [120] write [4858] __mul64 [4833] _write [50] <cycle 1> [4820] __open [4850] _wrtchk Object modules 1: /opt/pgsql/bin/postmaster Notes ----- i) Background This query takes 30s on an old, slow Intel box (333Mhz) running Freebsd. It takes about 1 hour on this Sparc box. This suggested something not quite right somewhere. ii) Platform Info Sun E220 2x450Mhz Utlrasparc CPU + 2G memory. OS + Postgresql is installed on 2x36G 15000 RPM Seagate Cheetah drives Remaining IO subsystem is fibre to a Compaq SAN (RAID 5). The Postgresql database files are located on this SAN system. iii) Version Info db1=# SELECT version(); version ---------------------------------------------------------------- PostgreSQL 7.2 on sparc-sun-solaris2.8, compiled by GCC 2.95.2 (1 row) iv) Tables And Indexes CREATE TABLE dim0 ( d0key INTEGER, f1 TIMESTAMP, f2 VARCHAR(20), f3 VARCHAR(20) ); CREATE TABLE fact0 ( d0key INTEGER, d1key INTEGER, d2key INTEGER, val INTEGER, filler TEXT ); CREATE UNIQUE INDEX dim0_pk ON dim0 (d0key); CREATE UNQIUE INDEX dim0_q1 ON dim0 (f1); CREATE INDEX fact0_q1 ON fact0(d0key) ; db1=# SELECT relname,tuples FROM pg_class WHERE relname IN ('dim0','fact0'); relname | reltuples ---------+----------- dim0 | 10080 fact0 | 1e+07 (2 rows) v) Query Access Plan And Elapsed Time db1=# EXPLAIN SELECT d0.f3, count(f.val) FROM dim0 d0, fact0 f WHERE d0.d0key = f.d0key AND d0.f1 between '1999-12-01' AND '2000-02-29' GROUP BY d0.f3 ; Aggregate (cost=27560.60..27594.10 rows=670 width=18) -> Group (cost=27560.60..27577.35 rows=6700 width=18) -> Sort (cost=27560.60..27560.60 rows=6700 width=18) -> Nested Loop (cost=0.00..27134.84 rows=6700 width=18) -> Index Scan using dim0_q1 on dim0 d0 (cost=0.00..4.40 rows=87 width=10) -> Index Scan using fact0_q1 on fact0 f (cost=0.00..310.06 rows=77 width=8) EXPLAIN db1=# \i qtype1.sql timetz -------------------- 18:54:43.366466+12 (1 row) f3 | count ----+------- 01 | 30000 02 | 28000 12 | 30000 (3 rows) timetz -------------------- 20:18:01.102664+12 (1 row) vi) Non Default Postgresql Server Parameters shared_buffers=4000 sort_mem=10240 wal_buffers=100 wal_files=10
Mark kirkwood <markir@slingshot.co.nz> writes: > I will confess to being a bit of a profile newbie... so if I missed something > important, please dont hesitate to put me right. It looks like you profiled the postmaster (parent process), not the backend you were interested in. I'm aware of two ways to make that sort of mistake: 1. Backends will drop their gmon.out files into the database subdirectory they connect to ($PGDATA/base/nnn/gmon.out). Any gmon.out you see in the top-level directory is from the postmaster itself or a checkpoint process --- neither of which are likely to be interesting. 2. I dunno if Solaris has this problem, but on Linux profiling is not inherited across fork(), so the postmaster's children fail to collect useful data at all. There is a patch in current sources to work around this by explicitly re-enabling the ITIMER_PROF interrupt after fork. If this seems like the problem then try the attached (slightly old) patch. regards, tom lane *** src/backend/postmaster/postmaster.c.orig Wed Dec 12 14:52:03 2001 --- src/backend/postmaster/postmaster.c Mon Dec 17 19:38:29 2001 *************** *** 1823,1828 **** --- 1823,1829 ---- { Backend *bn; /* for backend cleanup */ pid_t pid; + struct itimerval svitimer; /* * Compute the cancel key that will be assigned to this backend. The *************** *** 1858,1869 **** --- 1859,1874 ---- beos_before_backend_startup(); #endif + getitimer(ITIMER_PROF, &svitimer); + pid = fork(); if (pid == 0) /* child */ { int status; + setitimer(ITIMER_PROF, &svitimer, NULL); + free(bn); #ifdef __BEOS__ /* Specific beos backend startup actions */
>Previously (snippage....) > > It looks like you profiled the postmaster (parent process), not the > backend you were interested in.... > 1. Backends will drop their gmon.out files into the database > subdirectory they connect to ($PGDATA/base/nnn/gmon.out). > Any gmon.out you see in the top-level directory is from the postmaster > itself or a checkpoint process --- neither of which are likely to be > interesting. > Thanks Tom... I looked just about everywhere _except_ there :-( The backend profile is in $PGDATA/base/nnnn. I am a little concerned that the cpu time recored by the profiler seems to be a bit on the short side at 4047.53 seconds (67 minutes), as opposed to 119.2 for the backend itself. Are these two numbers supposed to match closely ? (they did for shorter queries..) Anyway in the hope that this captured output is interesting enough, here it is...or rather here is some of it (biggish file), so I am leaving a complete copy under : http://homepages.slingshot.co.nz/~markir/tar/solaris/ Let me know if further information is likely to be helpful. I have profiled the same query (on the same dataset) on Freebsd and have the output available at : http://homepages.slingshot.co.nz/~markir/tar/freebsd/ regards Mark The interesting bits of the flat profile looks like : % cumulative self self total time seconds seconds calls ms/call ms/call name 30.8 1268.83 1268.83 internal_mcount [20] 19.5 2070.96 802.13 1680561053 0.00 0.00 comparetup_heap [7] 18.8 2843.64 772.68 3361298446 0.00 0.00 nocachegetattr [24] 4.8 3041.20 197.56 1680926512 0.00 0.00 _memset [27] 4.3 3218.96 177.76 1680561053 0.00 0.00 varcharcmp [25] 4.1 3386.71 167.75 1680561053 0.00 0.00 qsort_comparetup [6] 3.8 3543.21 156.50 1680835238 0.00 0.00 FunctionCall2 <cycle 14> [23] 3.4 3684.64 141.43 1680561053 0.00 0.00 ApplySortFunction [21] 3.4 3825.56 140.92 3361298219 0.00 0.00 pg_detoast_datum [28] 2.8 3941.52 115.96 1680649052 0.00 0.00 varstr_cmp [26] 2.0 4023.16 81.64 1680649731 0.00 0.00 strncmp [29] 1.7 4094.92 71.76 _mcount (5712) 0.5 4114.65 19.73 1 19730.00 2774314.94 qst [5] 0.0 4115.28 0.63 3078 0.20 0.20 _read [41] 0.0 4115.59 0.31 616267 0.00 0.00 ExecEvalVar [52] 0.0 4115.76 0.17 440367 0.00 0.00 AllocSetReset [62] 0.0 4115.93 0.17 264093 0.00 0.00 heap_formtuple [47] 0.0 4116.09 0.16 264093 0.00 0.00 ExecTargetList [38] 0.0 4116.24 0.15 533566 0.00 0.00 AllocSetAlloc [59] 0.0 4116.39 0.15 365666 0.00 0.00 LWLockAcquire [58] 0.0 4116.53 0.14 96464 0.00 0.00 .rem [67] 0.0 4116.67 0.14 88400 0.00 0.02 heap_fetch [34] 0.0 4116.78 0.11 365666 0.00 0.00 LWLockRelease [63] 0.0 4116.88 0.10 731332 0.00 0.00 tas [75] 0.0 4116.98 0.10 352190 0.00 0.01 ExecProcNode <cycle 3> [30] 0.0 4117.07 0.09 977319 0.00 0.00 AllocSetFreeIndex [78] 0.0 4117.16 0.09 264093 0.00 0.00 ComputeDataSize [82] 0.0 4117.25 0.09 264093 0.00 0.00 DataFill [70] 0.0 4117.34 0.09 120472 0.00 0.00 _memcmp [80] 0.0 4117.43 0.09 89883 0.00 0.01 ReadBufferInternal [36] 0.0 4117.52 0.09 88491 0.00 0.00 _bt_checkkeys <cycle 14> [81] 0.0 4117.61 0.09 88088 0.00 0.00 HeapTupleSatisfiesSnapshot [74] 0.0 4117.70 0.09 5198 0.02 0.02 .div [79] 0.0 4117.77 0.07 88497 0.00 0.00 btgettuple <cycle 14> [56] 0.0 4117.84 0.07 88001 0.00 0.01 ExecNestLoop <cycle 3> [44] 0.0 4117.90 0.06 528386 0.00 0.00 ExecClearTuple [54] 0.0 4117.96 0.06 264093 0.00 0.00 ExecProject [37] 0.0 4118.02 0.06 96981 0.00 0.00 hash_search [50] 0.0 4118.08 0.06 88177 0.00 0.03 ExecScan [32] 0.0 4118.14 0.06 88092 0.00 0.00 _bt_restscan [72] 0.0 4118.19 0.05 533564 0.00 0.00 MemoryContextAlloc [55] 0.0 4118.24 0.05 443729 0.00 0.00 AllocSetFree [77] 0.0 4118.29 0.05 271989 0.00 0.00 _memcpy [92] 0.0 4118.34 0.05 264291 0.00 0.00 heap_freetuple [71] 0.0 4118.39 0.05 89797 0.00 0.00 UnpinBuffer [86] 0.0 4118.44 0.05 88177 0.00 0.02 IndexNext [33] 0.0 4118.49 0.05 88094 0.00 0.00 _bt_step [87] 0.0 4118.54 0.05 86858 0.00 0.00 PinBuffer [91] and similarly the call graph like : granularity: each sample hit covers 4 byte(s) for 0.00% of 4047.53 seconds called/total parents index %time self descendents called+self name index called/total children 0.00 2778.68 1/1 main [2] [1] 68.7 0.00 2778.68 1 PostmasterMain [1] 0.00 1391.42 1/1 ServerLoop [8] 0.00 1387.23 1/1 load_password_cache [18] 0.00 0.01 1/1 load_hba_and_ident [136] 0.00 0.01 1/1 reset_shared [138] 0.00 0.00 1/1 pgstat_start [221] 0.00 0.00 1/1 SSDataBase [220] 0.00 0.00 1/1 ProcessConfigFile [335] 0.00 0.00 1/1 checkDataDir [345] 0.00 0.00 1/1 CreateOptsFile [440] 0.00 0.00 1/1 StreamServerPort [443] 0.00 0.00 1/1 CreateDataDirLockFile [445] 0.00 0.00 1/2 FindExec [560] 0.00 0.00 1/1 RemovePgTempFiles [639] 0.00 0.00 1/1 XLOGPathInit [696] 0.00 0.00 1/1 pgstat_init [697] 0.00 0.00 1/1 MemoryContextInit [701] 0.00 0.00 1/70 AllocSetContextCreate [505] 0.00 0.00 1/1056908 MemoryContextSwitchTo [112] 0.00 0.00 11/28 pqsignal [873] 0.00 0.00 2/39 getenv [857] 0.00 0.00 2/5 _getopt [4913] 0.00 0.00 1/1 _umask [4973] 0.00 0.00 1/7 _getpid [4902] 0.00 0.00 1/1 EnableExceptionHandling [1096] 0.00 0.00 1/1 ResetAllOptions [1120] 0.00 0.00 1/1 SetDataDir [1124] 0.00 0.00 1/1 IgnoreSystemIndexes [1109] 0.00 0.00 1/1 DLNewList [1094] 0.00 0.00 1/2 pqinitmask [1077] 0.00 0.00 1/11 _sigprocmask [4893] ----------------------------------------------- 0.00 2778.68 1/1 _start [3] [2] 68.7 0.00 2778.68 1 main [2] 0.00 2778.68 1/1 PostmasterMain [1] 0.00 0.00 2/5 _geteuid [4912] 0.00 0.00 1/1 save_ps_display_args [1173] 0.00 0.00 1/264 malloc [818] 0.00 0.00 1/53 _strdup [4875] 0.00 0.00 1/2 _getuid [4939] 0.00 0.00 1/3826 strlen [768] 0.00 0.00 1/403 strcmp [809] ----------------------------------------------- <spontaneous> [3] 68.7 0.00 2778.68 _start [3] 0.00 2778.68 1/1 main [2] 0.00 0.00 2/4 atexit [987] ----------------------------------------------- 0.00 1387.23 1/2 load_password_cache [18] 0.00 1387.23 1/2 tuplesort_performsort [19] [4] 68.5 0.00 2774.46 2 qsort [4] 19.73 2754.58 1/1 qst [5] 0.01 0.14 88012/1680561053 qsort_comparetup [6] 0.00 0.00 1/61473 .umul [154] ----------------------------------------------- 19.73 2754.58 1/1 qsort [4] [5] 68.5 19.73 2754.58 1 qst [5] 167.74 2586.81 1680473041/1680561053 qsort_comparetup [6] 0.02 0.00 59630/59703 .udiv [114] 0.01 0.00 59630/61473 .umul [154] ----------------------------------------------- 0.01 0.14 88012/1680561053 qsort [4] 167.74 2586.81 1680473041/1680561053 qst [5] [6] 68.1 167.75 2586.95 1680561053 qsort_comparetup [6] 802.13 1784.82 1680561053/1680561053 comparetup_heap [7] ----------------------------------------------- 802.13 1784.82 1680561053/1680561053 qsort_comparetup [6] [7] 63.9 802.13 1784.82 1680561053 comparetup_heap [7] 141.43 870.75 1680561053/1680561053 ApplySortFunction [21] 772.64 0.00 3361122106/3361298446 nocachegetattr [24] ----------------------------------------------- 0.00 1391.42 1/1 PostmasterMain [1] [8] 34.4 0.00 1391.42 1 ServerLoop [8] 0.00 1391.42 1/1 BackendStartup [9] 0.00 0.00 1/1 ConnCreate [526] 0.00 0.00 4/271989 _memcpy [92] 0.00 0.00 1/1 initMasks [750] 0.00 0.00 4/11 _sigprocmask [4893] 0.00 0.00 2/7 _gettimeofday [4903] 0.00 0.00 2/2 select [1078] ----------------------------------------------- 0.00 1391.42 1/1 ServerLoop [8] [9] 34.4 0.00 1391.42 1 BackendStartup [9] 0.00 1391.42 1/1 DoBackend [10] 0.00 0.00 1/3 _fork [152] 0.00 0.00 2/8 fflush [915] 0.00 0.00 1/5 PostmasterRandom [945] 0.00 0.00 1/264 malloc [818] 0.00 0.00 1/99 free [836] ----------------------------------------------- 0.00 1391.42 1/1 BackendStartup [9] [10] 34.4 0.00 1391.42 1 DoBackend [10] 0.00 1391.41 1/1 PostgresMain [11] 0.01 0.00 1/1 ClientAuthentication [145] 0.00 0.00 1/1 ProcessStartupPacket [535] 0.00 0.00 1/1 enable_sigalrm_interrupt [536] 0.00 0.00 2/2140 strncpy [155] 0.00 0.00 1/9 set_ps_display [502] 0.00 0.00 1/1 init_ps_display [729] 0.00 0.00 1/62 sprintf [515] 0.00 0.00 1/33 MemoryContextDelete <cycle 15> [571] 0.00 0.00 1/1056908 MemoryContextSwitchTo [112] 0.00 0.00 3/28 pqsignal [873] 0.00 0.00 2/11 _sigprocmask [4893] 0.00 0.00 2/2 split_opts [1081] 0.00 0.00 1/1 on_exit_reset [1160] 0.00 0.00 1/1 ClosePostmasterPorts [1089] 0.00 0.00 1/7 _getpid [4902] 0.00 0.00 1/1 pq_init [1170] 0.00 0.00 1/1 disable_sigalrm_interrupt [1145] 0.00 0.00 1/7 _gettimeofday [4903] 0.00 0.00 1/2 srandom [1082] ----------------------------------------------- 0.00 1391.41 1/1 DoBackend [10] [11] 34.4 0.00 1391.41 1 PostgresMain [11] 0.00 1391.39 3/3 pg_exec_query_string [12] 0.00 0.01 1/1 InitPostgres [134] 0.00 0.00 5/9 set_ps_display [502] 0.00 0.00 1/1 proc_exit [537] 0.00 0.00 1/1 BaseInit [582] 0.00 0.00 4/4 makeStringInfo [612] 0.00 0.00 4/7 MemoryContextResetAndDeleteChildren [603] 0.00 0.00 4/4 ReadyForQuery [725] 0.00 0.00 1/70 AllocSetContextCreate [505] 0.00 0.00 1/10 pq_endmessage [593] 0.00 0.00 1/14 initStringInfo [577] 0.00 0.00 2/31 pq_sendint [587] 0.00 0.00 4/1056908 MemoryContextSwitchTo [112] 0.00 0.00 14/28 pqsignal [873] 0.00 0.00 7/7 pgstat_report_activity [929] 0.00 0.00 4/4 pgstat_report_tabstat [997] 0.00 0.00 4/4 IsTransactionBlock [977] 0.00 0.00 4/4 EnableNotifyInterrupt [969] 0.00 0.00 4/4 ReadCommand [978] 0.00 0.00 4/4 DisableNotifyInterrupt [968] 0.00 0.00 3/5 _getopt [4913] 0.00 0.00 3/3 strspn [1042] 0.00 0.00 2/11 _sigprocmask [4893] 0.00 0.00 1/1 set_default_datestyle [1175] 0.00 0.00 1/53 _strdup [4875] 0.00 0.00 1/1 atoi [1139] 0.00 0.00 1/2 pqinitmask [1077] 0.00 0.00 1/39 _sigdelset [4879] 0.00 0.00 1/10 pq_sendbyte [906] 0.00 0.00 1/1 pgstat_bestart [1164] 0.00 0.00 1/1 _sigsetjmp [4965] ----------------------------------------------- 0.00 1391.39 3/3 PostgresMain [11] [12] 34.4 0.00 1391.39 3 pg_exec_query_string [12] 0.00 1391.35 3/3 ProcessQuery [13] 0.00 0.03 3/3 pg_plan_query [95] 0.00 0.02 3/3 pg_analyze_and_rewrite [125] 0.00 0.00 3/3 start_xact_command [287] 0.00 0.00 3/3 pg_parse_query [501] 0.00 0.00 3/3 finish_xact_command [516] 0.00 0.00 3/3 SetQuerySnapshot [585] 0.00 0.00 3/3 CommandCounterIncrement [626] 0.00 0.00 3/7 MemoryContextResetAndDeleteChildren [603] 0.00 0.00 12/1056908 MemoryContextSwitchTo [112] 0.00 0.00 3/3 IsAbortedTransactionBlockState [1017] ----------------------------------------------- 0.00 1391.35 3/3 pg_exec_query_string [12] [13] 34.4 0.00 1391.35 3 ProcessQuery [13] 0.00 1391.32 3/3 ExecutorRun [14] 0.00 0.02 3/3 ExecutorEnd [103] 0.00 0.01 3/3 ExecutorStart [177] 0.00 0.00 3/9 set_ps_display [502] 0.00 0.00 3/3 BeginCommand [572] 0.00 0.00 3/3 EndCommand [634] 0.00 0.00 3/3 CreateExecutorState [674] 0.00 0.00 3/3 CreateQueryDesc [685] 0.00 0.00 3/3 CreateOperationTag [1009] 0.00 0.00 3/3 UpdateCommandInfo [1020] ----------------------------------------------- 0.00 1391.32 3/3 ProcessQuery [13] [14] 34.4 0.00 1391.32 3 ExecutorRun [14] 0.00 1391.32 3/3 ExecutePlan [15] 0.00 0.00 3/3 printtup_cleanup [671] 0.00 0.00 3/3 DestToFunction [686] 0.00 0.00 3/3 printtup_setup [1037] ----------------------------------------------- 0.00 1391.32 3/3 ExecutorRun [14] [15] 34.4 0.00 1391.32 3 ExecutePlan [15] 0.20 1391.12 8/8 ExecProcNode <cycle 3> [30] 0.00 0.00 5/5 ExecRetrieve [238] ----------------------------------------------- [16] 34.4 0.20 1391.12 8+704194 <cycle 3 as a whole> [16] 0.03 1387.42 88001 ExecSort <cycle 3> [17] 0.10 2.50 352190 ExecProcNode <cycle 3> [30] 0.07 0.50 88001 ExecNestLoop <cycle 3> [44] 0.00 0.55 88003 ExecGroupEveryTuple <cycle 3> [45] 0.00 0.15 4 ExecAgg <cycle 3> [65] 0.00 0.00 88003 ExecGroup <cycle 3> [763] ----------------------------------------------- 88001 ExecProcNode <cycle 3> [30] [17] 34.3 0.03 1387.42 88001 ExecSort <cycle 3> [17] 0.00 1387.23 1/1 tuplesort_performsort [19] 0.02 0.12 88000/88000 tuplesort_puttuple [69] 0.01 0.04 88001/528274 ExecStoreTuple [53] 0.00 0.00 1/1 tuplesort_begin_heap [332] 0.00 0.00 1/1 ExtractSortKeys [705] 0.00 0.00 2/443727 pfree [73] 0.00 0.00 88001/88001 tuplesort_gettuple [765] 0.00 0.00 1/8 ExecGetTupType [912] 0.00 0.00 1/9 ExecAssignResultType [907] 88001 ExecProcNode <cycle 3> [30] ----------------------------------------------- 0.00 1387.23 1/1 PostmasterMain [1] [18] 34.3 0.00 1387.23 1 load_password_cache [18] 0.00 1387.23 1/2 qsort [4] 0.00 0.00 2/2 fgets [384] 0.00 0.00 1/1 crypt_openpwdfile [444] 0.00 0.00 1/92 MemoryContextStrdup [486] 0.00 0.00 1/533564 MemoryContextAlloc [55] 0.00 0.00 1/3826 strlen [768] 0.00 0.00 1/7 FreeFile [925] ----------------------------------------------- 0.00 1387.23 1/1 ExecSort <cycle 3> [17] [19] 34.3 0.00 1387.23 1 tuplesort_performsort [19] 0.00 1387.23 1/2 qsort [4] ----------------------------------------------- <spontaneous> [20] 31.3 1268.83 0.00 internal_mcount [20] 0.00 0.00 1/4 atexit [987] ----------------------------------------------- 141.43 870.75 1680561053/1680561053 comparetup_heap [7] [21] 25.0 141.43 870.75 1680561053 ApplySortFunction [21] 156.67 714.08 1680561053/1680739842 FunctionCall2 <cycle 14> [23] ----------------------------------------------- [22] 21.5 156.69 714.15 1680739842+368109 <cycle 14 as a whole> [22] 156.50 713.84 1680835238 FunctionCall2 <cycle 14> [23] 0.07 0.17 88497 btgettuple <cycle 14> [56] 0.03 0.13 88092 _bt_next <cycle 14> [64] 0.09 0.00 88491 _bt_checkkeys <cycle 14> [81] 0.00 0.02 405 _bt_search <cycle 14> [117] 0.00 0.00 405 _bt_first <cycle 14> [270] 0.00 0.00 5666 _bt_compare <cycle 14> [4858] 0.00 0.00 781 _bt_binsrch <cycle 14> [4862] 0.00 0.00 376 _bt_moveright <cycle 14> [4865] -----------------------------------------------
Mark kirkwood <markir@slingshot.co.nz> writes: > I am a little concerned that the cpu time recored by the profiler seems > to be a bit on the short side at 4047.53 seconds (67 minutes), as > opposed to 119.2 for the backend itself. Are these two numbers supposed > to match closely ? (they did for shorter queries..) Hmm. Where exactly did you get those numbers from? I see 4118.54 sec as the total CPU accounted for in the profile. Some versions of gprof hide the time spent in the profiler subroutine, leading to discrepancies between the time accounted for in the profile and the actual process CPU time. However, yours doesn't seem to be one of them --- internal_mcount is right there. > Anyway in the hope that this captured output is interesting enough, here > it is...or rather here is some of it (biggish file), so I am leaving a > complete copy under : > http://homepages.slingshot.co.nz/~markir/tar/solaris/ > Let me know if further information is likely to be helpful. I have > profiled the same query (on the same dataset) on Freebsd and have the > output available at : > http://homepages.slingshot.co.nz/~markir/tar/freebsd/ Hmm. Assuming that the profile data is trustworthy and the queries are indeed the same (did you compare EXPLAIN output?), it seems that Solaris' problem is a spectacularly bad qsort() implementation. The FreeBSD profile shows: 1 qsort [29] 0.00 0.31 1/2 load_password_cache [44] 0.00 0.31 1/2 tuplesort_performsort [45] [29] 9.8 0.01 0.62 2+1 qsort [29] 0.01 0.62 148033/148033 qsort_comparetup [30] 1 qsort [29] where Solaris has 0.00 1387.23 1/2 load_password_cache [18] 0.00 1387.23 1/2 tuplesort_performsort [19] [4] 68.5 0.00 2774.46 2 qsort [4] 19.73 2754.58 1/1 qst [5] 0.01 0.14 88012/1680561053 qsort_comparetup [6] 0.00 0.00 1/61473 .umul [154] ----------------------------------------------- 19.73 2754.58 1/1 qsort [4] [5] 68.5 19.73 2754.58 1 qst [5] 167.74 2586.81 1680473041/1680561053 qsort_comparetup [6] 0.02 0.00 59630/59703 .udiv [114] 0.01 0.00 59630/61473 .umul [154] and all the rest of the top profile entries are explained by the fact that qsort_comparetup is called 1.68 billion times instead of 148K times. Can these really be the same dataset? Can Solaris' qsort really be that outstandingly incompetent? How many rows are actually being sorted here, anyway? It might be entertaining to snarf a qsort off the net (from glibc, perhaps) and link it into Postgres to see if you get better results. regards, tom lane
On Wed, 2002-04-03 at 04:02, Tom Lane wrote: > > Hmm. Where exactly did you get those numbers from? I see 4118.54 sec > as the total CPU accounted for in the profile. > odd ...the call graph has 4047.53 and the flat graph has 4118.54 > > Hmm. Assuming that the profile data is trustworthy and the queries are > indeed the same (did you compare EXPLAIN output?), it seems that > Solaris' problem is a spectacularly bad qsort() implementation. > A bit surfing finds heaps of unhappy Solaris qsort users... apparently it cannot sort lists with many repeated items... so our GROUP BY will be causing it grief here > > It might be entertaining to snarf a qsort off the net (from glibc, > perhaps) and link it into Postgres to see if you get better results. > > regards, tom lane > Indeed it is - obtained qsort.c from Freebsd CVS and rebuilt Postgresql : The query now takes 6 seconds instead of 1 hour ! Thanks for an excellent suggestion. For those in need to a quick fix : I did a cheap and dirty mod to src/backend/utils/sort/Makefile changed OBJS = logtape.o -> OBJS = qsort.o logtape.o and copied qsort.c into this directory (had to comment out a couple of lines to compile under Solaris : /*#include <sys/cdefs.h> __FBSDID("$FreeBSD: src/lib/libc/stdlib/qsort.c,v 1.11 2002/03/22 21:53:10 obrien Exp $"); */ ) What do you think about providing something like this for the Solaris port ? (since its clearly quicker...) regards Mark
Hi Tom, How about we include this and have configure somehow ensure the Solaris users get it automatically? There are a *bunch* of Solaris users out there. :-) Regards and best wishes, Justin Clift Mark kirkwood wrote: > > On Wed, 2002-04-03 at 04:02, Tom Lane wrote: > > > > Hmm. Where exactly did you get those numbers from? I see 4118.54 sec > > as the total CPU accounted for in the profile. > > > odd ...the call graph has 4047.53 and the flat graph has 4118.54 > > > > Hmm. Assuming that the profile data is trustworthy and the queries are > > indeed the same (did you compare EXPLAIN output?), it seems that > > Solaris' problem is a spectacularly bad qsort() implementation. > > > A bit surfing finds heaps of unhappy Solaris qsort users... apparently > it cannot sort lists with many repeated items... so our GROUP BY will be > causing it grief here > > > > It might be entertaining to snarf a qsort off the net (from glibc, > > perhaps) and link it into Postgres to see if you get better results. > > > > regards, tom lane > > > Indeed it is - obtained qsort.c from Freebsd CVS and rebuilt Postgresql : > The query now takes 6 seconds instead of 1 hour ! Thanks for an > excellent suggestion. > > For those in need to a quick fix : > > I did a cheap and dirty mod to src/backend/utils/sort/Makefile > > changed OBJS = logtape.o -> OBJS = qsort.o logtape.o > > and copied qsort.c into this directory > > (had to comment out a couple of lines to compile under Solaris : > > /*#include <sys/cdefs.h> > __FBSDID("$FreeBSD: src/lib/libc/stdlib/qsort.c,v 1.11 2002/03/22 > 21:53:10 obrien Exp $"); > */ > > ) > > What do you think about providing something like this for the Solaris > port ? (since its clearly quicker...) > > regards > > Mark > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org -- "My grandfather once told me that there are two kinds of people: those who work and those who take the credit. He told me to try to be in the first group; there was less competition there." - Indira Gandhi
On Wed, Apr 03, 2002 at 10:23:41AM -0500, Tom Lane wrote: > Justin Clift <justin@postgresql.org> writes: > > How about we include this and have configure somehow ensure the Solaris > > users get it automatically? > > Hmm. I suppose there'd be no license issues with borrowing a BSD qsort. > But I can't see any reasonable way for configure to decide automatically > whether we should replace the system qsort. I think we'd have to put > a USE_PRIVATE_QSORT symbol definition into src/template/solaris. > > Can anyone see a problem with doing it that way --- are there any > versions of Solaris where this'd be a bad idea? Would it be possible instead to make it a --configure option, or just to add a note to the Solaris FAQ about adding an option to CFLAGS? I'd be leery of automatically replacing system libraries, if only because it might surprise people. Clearly the improvement is a win in this case, but if Sun fixes their library, it might be yet faster. (By the way, I've been following this thread, and noticed that the problem shows up with gcc 2.95; AFAIK, 2.95 couldn't generate 64 bit Solaris binaries, so we can be fairly certain the problem is in the 32 bit library. Maybe the 64 bit one is better? I _may_ have time to check this week, but it's looking unlikely. If no-one else does, I'll try it out as soon as I can.) A -- ---- Andrew Sullivan 87 Mowat Avenue Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.info> M6K 3E3 +1 416 646 3304 x110
Andrew Sullivan <andrew@libertyrms.info> writes: > On Wed, Apr 03, 2002 at 10:23:41AM -0500, Tom Lane wrote: >> But I can't see any reasonable way for configure to decide automatically >> whether we should replace the system qsort. I think we'd have to put >> a USE_PRIVATE_QSORT symbol definition into src/template/solaris. > Would it be possible instead to make it a --configure option, or just > to add a note to the Solaris FAQ about adding an option to CFLAGS? I think the default should be to replace, but we could probably have a configure option to prevent it --- or to force it, in case people want to try a non-system qsort on other platforms besides Solaris. Whenever I see something like this, I wonder whether the problem is more widespread than we know. > ... so we can be fairly certain the problem is in the > 32 bit library. Maybe the 64 bit one is better? Good point. Please check it out and let us know. regards, tom lane
Mark kirkwood wrote: > > It might be entertaining to snarf a qsort off the net (from glibc, > > perhaps) and link it into Postgres to see if you get better results. > > > > regards, tom lane > > > Indeed it is - obtained qsort.c from Freebsd CVS and rebuilt Postgresql : > The query now takes 6 seconds instead of 1 hour ! Thanks for an > excellent suggestion. That is shocking. I have the Solaris 8 source code here so if people want info on exactly what is done in their qsort() routine, I can supply that. How can a routine change a query from 1 hour to 6 seconds? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
On Thu, 2002-04-04 at 06:41, Bruce Momjian wrote: > That is shocking. I have the Solaris 8 source code here so if people > want info on exactly what is done in their qsort() routine, I can supply > that. How can a routine change a query from 1 hour to 6 seconds? > I am considering raising a bug with Sun ( I have a simple test c program, that runs about 100x faster with BSD qsort.. enough to prove the point for the support analyst) However considering this problem seems to have been around since 1996, I am not all that hopeful of persuading them to alter it. regards Mark
Added to TODO: * Add BSD-licensed qsort() for Solaris --------------------------------------------------------------------------- Mark kirkwood wrote: > On Wed, 2002-04-03 at 04:02, Tom Lane wrote: > > > > Hmm. Where exactly did you get those numbers from? I see 4118.54 sec > > as the total CPU accounted for in the profile. > > > odd ...the call graph has 4047.53 and the flat graph has 4118.54 > > > > Hmm. Assuming that the profile data is trustworthy and the queries are > > indeed the same (did you compare EXPLAIN output?), it seems that > > Solaris' problem is a spectacularly bad qsort() implementation. > > > A bit surfing finds heaps of unhappy Solaris qsort users... apparently > it cannot sort lists with many repeated items... so our GROUP BY will be > causing it grief here > > > > It might be entertaining to snarf a qsort off the net (from glibc, > > perhaps) and link it into Postgres to see if you get better results. > > > > regards, tom lane > > > Indeed it is - obtained qsort.c from Freebsd CVS and rebuilt Postgresql : > The query now takes 6 seconds instead of 1 hour ! Thanks for an > excellent suggestion. > > For those in need to a quick fix : > > I did a cheap and dirty mod to src/backend/utils/sort/Makefile > > changed OBJS = logtape.o -> OBJS = qsort.o logtape.o > > and copied qsort.c into this directory > > (had to comment out a couple of lines to compile under Solaris : > > /*#include <sys/cdefs.h> > __FBSDID("$FreeBSD: src/lib/libc/stdlib/qsort.c,v 1.11 2002/03/22 > 21:53:10 obrien Exp $"); > */ > > ) > > What do you think about providing something like this for the Solaris > port ? (since its clearly quicker...) > > regards > > Mark > > > > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
On Wed, Apr 03, 2002 at 11:35:30AM -0500, Tom Lane wrote: > Andrew Sullivan <andrew@libertyrms.info> writes: > > > ... so we can be fairly certain the problem is in the > > 32 bit library. Maybe the 64 bit one is better? > > Good point. Please check it out and let us know. Sorry this has taken me so long, but what with various stuff going on here [1] and some vacation, I didn't have a chance to get to it sooner. But now I'm wondering, is there anyone who knows of troubles with the profiling of programs compiled with -pg under gcc 3.0.3 64 bit on Solaris 7? Here's my configure line: CFLAGS="-pg -mcmodel=medlow" ./configure --prefix=/opt/OXRS/pgsql721-profile --with-pgport=12000 --disable-shared But I can't do anything with it: postgres721-profdata$ /opt/OXRS/pgsql721-profile/bin/postgres No space for profiling buffer(s) FATAL 2: invalid checksum in control file Segmentation Fault (core dumped) Anyone with advice? [1] We were purchased by our largest customer, which is probably a good thing, but meant more meetings and less, um, other work. -- ---- Andrew Sullivan 87 Mowat Avenue Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.info> M6K 3E3 +1 416 646 3304 x110
Sorry to reply to myself, but this might be useful for the archive. On Tue, Apr 30, 2002 at 11:55:53AM -0400, Andrew Sullivan wrote: > But now I'm wondering, is there anyone who knows of troubles with the > profiling of programs compiled with -pg under gcc 3.0.3 64 bit on > Solaris 7? Here's my configure line: You can't use the -pg cflag without -mcmodel=medlow, because gcc (at least in its 64 bit incarnation on Solaris) apparently can't produce binaries that way: only the medlow memory model is supported for profiling. Unhappily, this appears to cause conflicting libraries to be invoked (I _think_ that's what's going on, anyway). I think this means I can't build a 64-bit system with gcc for profiling. I might be wrong (I'm sort of puzzling this out from two or three rather cryptic entries in some man pages; and as anyone who's ever seen my posts will attest, I'm not that bright anyway, so I may have misunderstood something). In any case, I can't offer a definite answer about the 64-bit qsort for now. If I have a chance to come back to it, and discover anything, I'll post it here. A -- ---- Andrew Sullivan 87 Mowat Avenue Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.info> M6K 3E3 +1 416 646 3304 x110
On Tue, Apr 30, 2002 at 03:28:13PM -0400, Tom Lane wrote: > Andrew Sullivan <andrew@libertyrms.info> writes: > > ... In any case, I can't offer a definite > > answer about the 64-bit qsort for now. > > Do you need to profile it? It seemed that the 32-bit behavior for > many-equal-keys was so bad that it'd be easy to tell whether it's > fixed, just by rough overall timing of a test case... Yes, that's what I thought, too, so I figured I'd do that instead (although I didn't think of it until after I sent the mail). On the other hand, now I'm like a dog with a bone, because I want to know why in the world it doesn't work. No wonder I never get anything done. Thanks to Travis Hoyt, who pointed out that I could at least test for library problems with truss. I did, and the interesting thing is that it appears to be the profile writing that's causing the segfault (it's during the write to gmon.out that the segfault occurs). So my earlier view was wrong. But in any case, it looks like there really is something broken about profiling with this configuration. Since the original case was so bad, can anyone tell me roughly how many equal keys were in the set, and how big the total set was? That way I'll be able to get something reasonably close, and I can use wall-clock time or something to expose whether there's a problem for 64 bit libraries too. Thanks, A -- ---- Andrew Sullivan 87 Mowat Avenue Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.info> M6K 3E3 +1 416 646 3304 x110
Hi I am fairly new to PostgreSQL. I have been trying to install PostgreSQL-7.2.1 on my Sun server, however it fails at this point: ... ... gcc -Wall -Wmissing-prototypes -Wmissing-declarations -fPIC -I. -I../../../src/include -DFRONTEND -DSYSCONFDIR='"/usr/local/pgsql/etc"' -c -o pqsignal.o pqsignal.c ar crs libpq.a fe-auth.o fe-connect.o fe-exec.o fe-misc.o fe-print.o fe-lobj.o pqexpbuffer.o dllist.o md5.o pqsignal.o make[3]: ar: Command not found make[3]: *** [libpq.a] Error 127 make[3]: Leaving directory `/usr/local/install/postgresql-7.2/src/interfaces/libpq' make[2]: *** [all] Error 2 make[2]: Leaving directory `/usr/local/install/postgresql-7.2/src/interfaces' make[1]: *** [all] Error 2 make[1]: Leaving directory `/usr/local/install/postgresql-7.2/src' make: *** [all] Error 2 System details: bash-2.03$ uname -a SunOS webdev1 5.8 Generic_108528-12 sun4u sparc SUNW,UltraAX-i2 Is this a common problem on this platform? Or am I doing something wrong. Any help in getting this working would be greatly appreciated. Thanks Adam
Hi Adam, Are you using the Solaris compilation instructions from? http://techdocs.postgresql.org/installguides.php#solaris They're known to work. :-) Regards and best wishes, Justin Clift Adam Witney wrote: > > Hi > > I am fairly new to PostgreSQL. I have been trying to install > PostgreSQL-7.2.1 on my Sun server, however it fails at this point: > > ... > ... > > gcc -Wall -Wmissing-prototypes -Wmissing-declarations -fPIC -I. > -I../../../src/include -DFRONTEND -DSYSCONFDIR='"/usr/local/pgsql/etc"' -c > -o pqsignal.o pqsignal.c > ar crs libpq.a fe-auth.o fe-connect.o fe-exec.o fe-misc.o fe-print.o > fe-lobj.o pqexpbuffer.o dllist.o md5.o pqsignal.o > make[3]: ar: Command not found > make[3]: *** [libpq.a] Error 127 > make[3]: Leaving directory > `/usr/local/install/postgresql-7.2/src/interfaces/libpq' > make[2]: *** [all] Error 2 > make[2]: Leaving directory > `/usr/local/install/postgresql-7.2/src/interfaces' > make[1]: *** [all] Error 2 > make[1]: Leaving directory `/usr/local/install/postgresql-7.2/src' > make: *** [all] Error 2 > > System details: > > bash-2.03$ uname -a > SunOS webdev1 5.8 Generic_108528-12 sun4u sparc SUNW,UltraAX-i2 > > Is this a common problem on this platform? Or am I doing something wrong. > > Any help in getting this working would be greatly appreciated. > > Thanks > > Adam > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org -- "My grandfather once told me that there are two kinds of people: those who work and those who take the credit. He told me to try to be in the first group; there was less competition there." - Indira Gandhi
On Wed, May 01, 2002 at 10:38:26AM +0000, Adam Witney wrote: > fe-lobj.o pqexpbuffer.o dllist.o md5.o pqsignal.o > make[3]: ar: Command not found ^^^^^^^^^^^^^^^^^^^^^ 'ar' isn't in your PATH. It's a common problem. It's probably in /usr/ccs/bin. Just append that to your path (make sure you add it last, because there's probably a non-GNU 'make' in there too) and you'll be set to go. A -- ---- Andrew Sullivan 87 Mowat Avenue Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.info> M6K 3E3 +1 416 646 3304 x110
Andrew Sullivan <andrew@libertyrms.info> writes: > ... In any case, I can't offer a definite > answer about the 64-bit qsort for now. Do you need to profile it? It seemed that the 32-bit behavior for many-equal-keys was so bad that it'd be easy to tell whether it's fixed, just by rough overall timing of a test case... regards, tom lane
On Tue, Apr 30, 2002 at 03:28:13PM -0400, Tom Lane wrote: > Do you need to profile it? It seemed that the 32-bit behavior for > many-equal-keys was so bad that it'd be easy to tell whether it's > fixed, just by rough overall timing of a test case... Sorry for taking yet again so long. Fitting in little tests of this sort of thing can be a bit of a bear -- there's always about 50 other things to do. Anyway, I've performed some simple timed tests that, I think, confirm that the 64 bit library on Solaris is not so bad. version ----------------------------------------------------------------- PostgreSQL 7.2.1 on sparc-sun-solaris2.7, compiled by GCC 3.0.3 bin$ file postmaster postmaster: ELF 64-bit MSB executable SPARCV9 Version 1, dynamically linked, not stripped The config file is the default I _think_ I've captured the case that was problematic. As I understood it, qsort was having trouble when hit with many equal keys. I created this table: CREATE TABLE table1 (_date_stamp timestamp default current_timestamp, foo text); The table has no index. It has 5120000 records; field "foo" has only four distinct values. No matter whether I compiled with the system qsort or the qsort from FreeBSD, I got roughly equivalent results running psql under time. I know that's hardly an ideal test, but as Tom suggested, the 32-bit case seemed to be so astonishingly bad that it should have been enough. I ran the test repeatedly, and the results seem pretty consistent. Here are some typical results: system lib: src$ time psql -p 12000 -o /dev/null -c "select * from table1 order ^by foo" test1 real 29m23.822s user 2m10.241s sys 0m7.432s FreeBSD lib: postgresql-7.2.1$ time psql -p 12000 -o /dev/null -c "select * from table1 order by foo" test1 real 29m38.880s user 2m10.571s sys 0m8.032s This example suggests the FreeBSD library is slightly worse in the 64-bit case. That's consistently the case, but the difference is not so great that I'd put any stock in it. I do not know whether there might be any trouble using the FreeBSD library in a 64-bit configuration. I'd say, if you're going to use a 64-bit postmaster, use the Solaris libraries. Hope this is helpful, A -- ---- Andrew Sullivan 87 Mowat Avenue Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.info> M6K 3E3 +1 416 646 3304 x110
TODO updated: Add BSD-licensed qsort() for 32-bit Solaris --------------------------------------------------------------------------- Andrew Sullivan wrote: > On Tue, Apr 30, 2002 at 03:28:13PM -0400, Tom Lane wrote: > > > Do you need to profile it? It seemed that the 32-bit behavior for > > many-equal-keys was so bad that it'd be easy to tell whether it's > > fixed, just by rough overall timing of a test case... > > Sorry for taking yet again so long. Fitting in little tests of this > sort of thing can be a bit of a bear -- there's always about 50 other > things to do. Anyway, I've performed some simple timed tests that, I > think, confirm that the 64 bit library on Solaris is not so bad. > > version > ----------------------------------------------------------------- > PostgreSQL 7.2.1 on sparc-sun-solaris2.7, compiled by GCC 3.0.3 > > bin$ file postmaster > postmaster: ELF 64-bit MSB executable SPARCV9 Version 1, > dynamically linked, not stripped > > The config file is the default > > I _think_ I've captured the case that was problematic. As I > understood it, qsort was having trouble when hit with many equal > keys. I created this table: > > CREATE TABLE table1 (_date_stamp timestamp default current_timestamp, > foo text); > > The table has no index. It has 5120000 records; field "foo" has only > four distinct values. > > No matter whether I compiled with the system qsort or the qsort from > FreeBSD, I got roughly equivalent results running psql under time. I > know that's hardly an ideal test, but as Tom suggested, the 32-bit > case seemed to be so astonishingly bad that it should have been > enough. I ran the test repeatedly, and the results seem pretty > consistent. Here are some typical results: > > system lib: > > src$ time psql -p 12000 -o /dev/null -c "select * from table1 order > ^by foo" test1 > > real 29m23.822s > user 2m10.241s > sys 0m7.432s > > FreeBSD lib: > > postgresql-7.2.1$ time psql -p 12000 -o /dev/null -c "select * from > table1 order by foo" test1 > > > real 29m38.880s > user 2m10.571s > sys 0m8.032s > > > This example suggests the FreeBSD library is slightly worse in the > 64-bit case. That's consistently the case, but the difference is not > so great that I'd put any stock in it. > > I do not know whether there might be any trouble using the FreeBSD > library in a 64-bit configuration. I'd say, if you're going to use a > 64-bit postmaster, use the Solaris libraries. > > Hope this is helpful, > > A > > -- > ---- > Andrew Sullivan 87 Mowat Avenue > Liberty RMS Toronto, Ontario Canada > <andrew@libertyrms.info> M6K 3E3 > +1 416 646 3304 x110 > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
On Mon, 27 May 2002 21:00:43 -0400 (EDT) "Bruce Momjian" <pgman@candle.pha.pa.us> wrote: > TODO updated: > > Add BSD-licensed qsort() for 32-bit Solaris Is this necessary? Didn't someone say that Sun had acknowledged the performance problem and were going to be releasing a patch for it? If that patch exists (or will exist), it would probably be better to suggest in the docs that users of 32-bit Solaris apply the patch. Cheers, Neil -- Neil Conway <neilconway@rogers.com> PGP Key ID: DB3C29FC
On Mon, May 27, 2002 at 09:00:43PM -0400, Bruce Momjian wrote: > > TODO updated: > > Add BSD-licensed qsort() for 32-bit Solaris I've received an email noting that someone else ran a test program with the 64 bit library, and had just as bad performance as the 32 bit one. I haven't had a chance to look at it yet, but it suggests that the result is still inconclusive. Maybe, if just one more fire goes out here, I can look at it this week. A -- ---- Andrew Sullivan 87 Mowat Avenue Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.info> M6K 3E3 +1 416 646 3304 x110
Neil Conway wrote: > On Mon, 27 May 2002 21:00:43 -0400 (EDT) > "Bruce Momjian" <pgman@candle.pha.pa.us> wrote: > > TODO updated: > > > > Add BSD-licensed qsort() for 32-bit Solaris > > Is this necessary? Didn't someone say that Sun had acknowledged the > performance problem and were going to be releasing a patch for it? > If that patch exists (or will exist), it would probably be better to > suggest in the docs that users of 32-bit Solaris apply the patch. Sun said they would look at it (actually McNeeley (sp?)), but I haven't seen any mention of a patch. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Andrew Sullivan wrote: > On Mon, May 27, 2002 at 09:00:43PM -0400, Bruce Momjian wrote: > > > > TODO updated: > > > > Add BSD-licensed qsort() for 32-bit Solaris > > I've received an email noting that someone else ran a test program > with the 64 bit library, and had just as bad performance as the 32 > bit one. I haven't had a chance to look at it yet, but it suggests > that the result is still inconclusive. Maybe, if just one more fire > goes out here, I can look at it this week. TODO reverted to be: Add BSD-licensed qsort() for Solaris My guess is that your test case didn't tickle the bug. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026