Thread: Sort time

Sort time

From
pginfo
Date:
Hi,

Why is the sort part of my query taking so much time?

I run a relatively complex query and it takes about 50 sec.
The sorting needs another 50 sec!

Can I increase the sort memory to get better performance?
How much memory is needed for a sort in pg?
The same data read into Java and sorted takes 10 sec!

Any idea about pg tuning?

Regards,
Ivan.


Re: Sort time

From
"scott.marlowe"
Date:
On Thu, 14 Nov 2002, pginfo wrote:

> Hi,
>
> Why is the sort part of my query taking so much time?
>
> I run a relatively complex query and it takes about 50 sec.
> The sorting needs another 50 sec!
>
> Can I increase the sort memory to get better performance?
> How much memory is needed for a sort in pg?
> The same data read into Java and sorted takes 10 sec!

Increasing sort_mem can help, but often the problem is that your query
isn't optimal.  If you'd like to post the explain analyze output of your
query, someone might have a hint on how to increase the efficiency of the
query.
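
For example, something along these lines (a sketch only; sort_mem is
per-backend and measured in kilobytes, and the table/column names here are
placeholders):

  SET sort_mem = 65536;   -- 64 MB, for this session only
  EXPLAIN ANALYZE SELECT * FROM mytable ORDER BY mycol;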


Re: Sort time

From
pginfo
Date:
Hi,
The sort mem is pretty big at the moment.
For this tuning I use 256 MB of sort mem!

The explain plan is:
EXPLAIN
gibi=# explain analyze select
S.IDS_NUM,S.OP,S.KOL,S.OTN_MED,S.CENA,S.DDS,S.KURS,S.TOT,S.DTO,S.PTO,S.DTON,MED.MNAME
AS MEDNAME,N.MNAME AS NOMENNAME,N.NUM AS NNUM,S.PART,S.IZV,D.DATE_OP  from
A_DOC D  , A_SKLAD S,  A_NOMEN N ,A_MED MED       WHERE S.FID=0 AND
N.OSN_MED=MED.IDS AND S.IDS_NUM=N.IDS AND S.IDS_DOC=D.IDS  ORDER BY  S.IDS_NUM,S.PART,S.OP ;
NOTICE:  QUERY PLAN:

Sort  (cost=100922.53..100922.53 rows=22330 width=215) (actual
time=111241.88..111735.33 rows=679743 loops=1)
  ->  Hash Join  (cost=9153.28..99309.52 rows=22330 width=215) (actual
time=3386.45..53065.59 rows=679743 loops=1)
        ->  Hash Join  (cost=2271.05..91995.05 rows=30620 width=198) (actual
time=2395.76..36710.54 rows=679743 loops=1)
              ->  Seq Scan on a_sklad s  (cost=0.00..84181.91 rows=687913
width=111) (actual time=2111.30..22354.10 rows=679743 loops=1)
              ->  Hash  (cost=2256.59..2256.59 rows=5784 width=87) (actual
time=282.95..282.95 rows=0 loops=1)
                    ->  Hash Join  (cost=2.52..2256.59 rows=5784 width=87)
(actual time=132.54..270.29 rows=5784 loops=1)
                          ->  Seq Scan on a_nomen n  (cost=0.00..2152.84
rows=5784 width=74) (actual time=127.97..218.02 rows=5784 loops=1)
                          ->  Hash  (cost=2.42..2.42 rows=42 width=13)
(actual time=0.55..0.55 rows=0 loops=1)
                                ->  Seq Scan on a_med med  (cost=0.00..2.42
rows=42 width=13) (actual time=0.22..0.43 rows=42 loops=1)
        ->  Hash  (cost=6605.19..6605.19 rows=110819 width=17) (actual
time=987.26..987.26 rows=0 loops=1)
              ->  Seq Scan on a_doc d  (cost=0.00..6605.19 rows=110819
width=17) (actual time=67.96..771.54 rows=109788 loops=1)
Total runtime: 112402.30 msec

EXPLAIN

All IDS_XXX fields are varchar(20); S.PART is also varchar(20).
All tables are indexed.

Can I change any parameters in my pg setup to increase the speed?
It looks very slow.

Only as a test (I do not need it) I executed:
EXPLAIN
gibi=# explain analyze select
S.IDS_NUM,S.OP,S.KOL,S.OTN_MED,S.CENA,S.DDS,S.KURS,S.TOT,S.DTO,S.PTO,S.DTON,MED.MNAME
AS MEDNAME,N.MNAME AS NOMENNAME,N.NUM AS NNUM,S.PART,S.IZV,D.DATE_OP  from
A_DOC D  , A_SKLAD S,  A_NOMEN N ,A_MED MED       WHERE S.FID=0 AND
N.OSN_MED=MED.IDS AND S.IDS_NUM=N.IDS AND S.IDS_DOC=D.IDS  ORDER BY S.OP ;
NOTICE:  QUERY PLAN:

Sort  (cost=100922.53..100922.53 rows=22330 width=215) (actual
time=62141.60..62598.05 rows=679743 loops=1)
  ->  Hash Join  (cost=9153.28..99309.52 rows=22330 width=215) (actual
time=9032.59..54703.33 rows=679743 loops=1)
        ->  Hash Join  (cost=2271.05..91995.05 rows=30620 width=198) (actual
time=8046.91..39132.91 rows=679743 loops=1)
              ->  Seq Scan on a_sklad s  (cost=0.00..84181.91 rows=687913
width=111) (actual time=7790.01..25565.74 rows=679743 loops=1)
              ->  Hash  (cost=2256.59..2256.59 rows=5784 width=87) (actual
time=255.32..255.32 rows=0 loops=1)
                    ->  Hash Join  (cost=2.52..2256.59 rows=5784 width=87)
(actual time=123.40..243.02 rows=5784 loops=1)
                          ->  Seq Scan on a_nomen n  (cost=0.00..2152.84
rows=5784 width=74) (actual time=118.75..204.41 rows=5784 loops=1)
                          ->  Hash  (cost=2.42..2.42 rows=42 width=13)
(actual time=0.59..0.59 rows=0 loops=1)
                                ->  Seq Scan on a_med med  (cost=0.00..2.42
rows=42 width=13) (actual time=0.25..0.47 rows=42 loops=1)
        ->  Hash  (cost=6605.19..6605.19 rows=110819 width=17) (actual
time=982.22..982.22 rows=0 loops=1)
              ->  Seq Scan on a_doc d  (cost=0.00..6605.19 rows=110819
width=17) (actual time=73.46..787.87 rows=109788 loops=1)
Total runtime: 63194.60 msec

The field S.OP is INT.

It is a huge improvement when I sort by an INT field, but I need to sort
varchar fields!

Is it normal for pg to be so slow with varchar, or can I change the setup?

Also I think the query time (without sorting) is big.

regards and thanks in advance.

scott.marlowe wrote:

> On Thu, 14 Nov 2002, pginfo wrote:
>
> > Hi,
> >
> > Why is the sort part of my query taking so much time?
> >
> > I run a relatively complex query and it takes about 50 sec.
> > The sorting needs another 50 sec!
> >
> > Can I increase the sort memory to get better performance?
> > How much memory is needed for a sort in pg?
> > The same data read into Java and sorted takes 10 sec!
>
> Increasing sort_mem can help, but often the problem is that your query
> isn't optimal.  If you'd like to post the explain analyze output of your
> query, someone might have a hint on how to increase the efficiency of the
> query.




Re: Sort time

From
pginfo
Date:
Sorry,
I can post a little more info:

I ran the same query (and received the same result), but this time I
started vmstat 2 to watch the system state.
The results:

gibi=# explain analyze select
S.IDS_NUM,S.OP,S.KOL,S.OTN_MED,S.CENA,S.DDS,S.KURS,S.TOT,S.DTO,S.PTO,S.DTON,MED.MNAME
AS MEDNAME,N.MNAME AS NOMENNAME,N.NUM AS NNUM,S.PART,S.IZV,D.DATE_OP  from
A_DOC D  , A_SKLAD S,  A_NOMEN N ,A_MED MED       WHERE S.FID=0 AND
N.OSN_MED=MED.IDS AND S.IDS_NUM=N.IDS AND S.IDS_DOC=D.IDS  ORDER BY  S.IDS_NUM,S.PART,S.OP ;
NOTICE:  QUERY PLAN:

Sort  (cost=100922.53..100922.53 rows=22330 width=215) (actual
time=109786.23..110231.74 rows=679743 loops=1)
  ->  Hash Join  (cost=9153.28..99309.52 rows=22330 width=215) (actual
time=12572.01..56330.28 rows=679743 loops=1)
        ->  Hash Join  (cost=2271.05..91995.05 rows=30620 width=198) (actual
time=7082.66..36482.57 rows=679743 loops=1)
              ->  Seq Scan on a_sklad s  (cost=0.00..84181.91 rows=687913
width=111) (actual time=6812.81..23085.36 rows=679743 loops=1)
              ->  Hash  (cost=2256.59..2256.59 rows=5784 width=87) (actual
time=268.05..268.05 rows=0 loops=1)
                    ->  Hash Join  (cost=2.52..2256.59 rows=5784 width=87)
(actual time=125.25..255.48 rows=5784 loops=1)
                          ->  Seq Scan on a_nomen n  (cost=0.00..2152.84
rows=5784 width=74) (actual time=120.63..216.93 rows=5784 loops=1)
                          ->  Hash  (cost=2.42..2.42 rows=42 width=13)
(actual time=0.57..0.57 rows=0 loops=1)
                                ->  Seq Scan on a_med med  (cost=0.00..2.42
rows=42 width=13) (actual time=0.24..0.46 rows=42 loops=1)
        ->  Hash  (cost=6605.19..6605.19 rows=110819 width=17) (actual
time=5485.90..5485.90 rows=0 loops=1)
              ->  Seq Scan on a_doc d  (cost=0.00..6605.19 rows=110819
width=17) (actual time=61.18..5282.99 rows=109788 loops=1)
Total runtime: 110856.36 msec

EXPLAIN

 vmstat 2
   procs                      memory    swap          io     system          cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  0  0  32104 196932  77404 948256   0   0    30    12   24    12   6   1  27
 0  1  1  32104 181792  77404 952416   0   0  2080    36  328   917   7   9  84
 0  1  0  32104 170392  77404 959584   0   0  3584    16  533  1271   5   4  91
 1  0  0  32104 162612  77404 965216   0   0  2816     0  514  1332   2   6  92
 1  0  0  32104 146832  77404 979956   0   0  7370    18  631  1741   5  16  79
 1  0  0  32104 129452  77404 997364   0   0  8704     0  719  1988   7   7  86
 0  2  1  32104 116016  77404 1010632   0   0  6634     8  563  1495   6  20  74
 1  0  0  32104 109844  77404 1013360   0   0  1364     2  228   584  31  24  45
 1  0  0  32104 101244  77404 1013364   0   0     2     0  103   219  43  11  46
 1  0  0  32104  84652  77404 1021328   0   0  3982    16  402   455  44   8  49
 3  0  0  32104  72916  77404 1024404   0   0  1538     0  294   215  44   5  51
 2  0  0  32104  63844  77404 1024404   0   0     0    10  103   222  47   3  50
 1  0  0  32104  54600  77404 1024404   0   0     0     0  102   222  55   6  39
 1  0  0  32104  45472  77404 1024404   0   0     0     0  102   220  45   6  50
 1  0  0  32104  36060  77404 1024404   0   0     0    10  103   215  45   5  50
 2  0  0  32104  26640  77404 1024404   0   0     0     0  106   218  43   7  50
 2  0  0  32104  17440  77404 1024404   0   0     0    10  148   253  46   6  48
 1  0  0  32104  10600  77404 1022004   0   0     0     0  102   215  42   8  50
 1  0  0  32104  10604  77404 1013900   0   0     0     0  103   212  41   9  50
 1  0  0  32104  10600  77404 1006452   0   0     0    26  106   225  38  12  50
 2  0  0  32104  10600  77404 997412   0   0     0     0  102   213  48   3  50
 1  0  0  32104  10572  77428 988936   0   0   340   118  214   455  62   8  29
 1  0  0  32104  10532  77432 979872   0   0   642   124  307   448  70  12  18
 1  0  0  32104  10516  77432 970316   0   0     0     0  102   238  49   6  45
 1  0  0  32104  10508  77432 960880   0   0     0    46  105   224  50   5  45
 1  0  0  32104  10500  77432 951740   0   0  3398    34  174   445  47   9  44
 1  0  1  32104  10112  77432 943588   0   0  8192    94  289   544  50  12  39
 1  0  0  32104  10484  77432 937204   0   0 16896     0  386  1698  37  20  43
 2  0  0  32104  10484  77432 930004   0   0 14080     0  345  1415  39  17  45
 3  0  0  32104  27976  77432 925592   0   0  1844    16  136   329  46   6  49
 2  0  0  32104  27924  77432 925592   0   0     0     0  104   220  50   0  49
 2  0  0  32104  27756  77436 925592   0   0     0     8  103   222  51   2  47
 1  0  0  32104  27756  77436 925592   0   0     0     0  102   222  54   1  45
 1  0  0  32104  27756  77436 925592   0   0     0     0  102   220  55   0  45
 1  0  0  32104  27424  77436 925592   0   0     0    24  104   224  54   1  45
 1  0  0  32104  27424  77436 925592   0   0     0     0  102   218  55   0  45
 3  0  0  32104  27424  77436 925592   0   0     0     8  103   221  55   0  45
 1  0  0  32104  27424  77436 925592   0   0     0     0  103   222  55   0  45
 1  0  0  32104  27456  77436 925592   0   0     0     0  104   222  55   0  45
 1  0  0  32104  27456  77436 925592   0   0     0     8  104   222  55   0  45
 2  0  0  32104  26792  77436 925592   0   0     0     0  102   218  55   1  44
 2  0  0  32104  26792  77436 925592   0   0     0     8  103   222  55   0  44
 2  0  0  32104  26792  77436 925592   0   0     0     0  102   221  66   0  33
 1  0  0  32104  26792  77436 925592   0   0     0     0  103   221  55   0  44
 1  0  0  32104  26792  77436 925592   0   0     0     8  103   219  55   0  44
 1  0  0  32104  26792  77436 925592   0   0     0     0  104   221  56   0  44
 2  0  0  32104  26792  77436 925592   0   0     0     8  105   223  56   0  44
 1  0  0  32104  26792  77436 925592   0   0     0     0  102   222  56   0  44
 1  0  0  32104  26792  77436 925592   0   0     0     8  106   223  55   1  44
 1  0  0  32104  26792  77436 925592   0   0     0     0  102   216  56   0  44
 2  0  0  32104  26792  77436 925592   0   0     0     0  102   221  56   0  43
 2  0  0  32104  26628  77436 925592   0   0     0    26  106   230  57   0  43
 1  0  0  32104  26768  77440 925592   0   0     0    12  104   228  57   0  43
 1  0  0  32104  26760  77448 925592   0   0     0    30  106   226  56   1  43
 2  0  0  32104  26168  77448 925592   0   0     0     0  102   221  57   0  43
 1  0  0  32104  28088  77448 925592   0   0     0     0  103   220  46  12  42

Can I tune my Linux box or pg better to get faster execution?

regards.



scott.marlowe wrote:

> On Thu, 14 Nov 2002, pginfo wrote:
>
> > Hi,
> >
> > Why is the sort part of my query taking so much time?
> >
> > I run a relatively complex query and it takes about 50 sec.
> > The sorting needs another 50 sec!
> >
> > Can I increase the sort memory to get better performance?
> > How much memory is needed for a sort in pg?
> > The same data read into Java and sorted takes 10 sec!
>
> Increasing sort_mem can help, but often the problem is that your query
> isn't optimal.  If you'd like to post the explain analyze output of your
> query, someone might have a hint on how to increase the efficiency of the
> query.




Re: Sort time

From
Tom Lane
Date:
pginfo <pginfo@t1.unisoftbg.com> writes:
> It is a huge improvement when I sort by an INT field, but I need to sort
> varchar fields!

What locale are you using?  strcoll() comparisons can be awfully slow in
some locales.

            regards, tom lane

Re: Sort time

From
pginfo
Date:
Hi Tom,

I use Unicode for my db, but the locale is US!
The Unicode is only for non-English varchars, and I do not make any
comparisons, sorts, or joins based on non-English fields (all of that is
done in the client part of the system).

What locale will be fast?
Do you have any info about the speed of the faster locale compared to INT?

regards.

Tom Lane wrote:

> pginfo <pginfo@t1.unisoftbg.com> writes:
> > It is a huge improvement when I sort by an INT field, but I need to sort
> > varchar fields!
>
> What locale are you using?  strcoll() comparisons can be awfully slow in
> some locales.
>
>                         regards, tom lane




Re: Sort time

From
Tom Lane
Date:
pginfo <pginfo@t1.unisoftbg.com> writes:
> What locale will be fast?

C locale (a/k/a POSIX locale) should be quick.  Not sure about anything
else.
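
In 7.x the collation locale is fixed for the whole cluster at initdb time,
so checking and switching would look roughly like this (a sketch; if SHOW
lc_collate is not available on your release, pg_controldata reports the
same information):

  SHOW lc_collate;
  -- switching means re-running initdb:  initdb --locale=C -D /new/data/dir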

            regards, tom lane

Re: Sort time

From
pginfo
Date:
OK,
thanks!

Does anyone have another idea?
regards.

Tom Lane wrote:

> pginfo <pginfo@t1.unisoftbg.com> writes:
> > What locale will be fast?
>
> C locale (a/k/a POSIX locale) should be quick.  Not sure about anything
> else.
>
>                         regards, tom lane




Re: Sort time

From
pginfo
Date:
Hi,
Yes, I have indexes on all these fields.
Also I vacuumed, and that is the result after it.
Actually I do not see anything bad in the query execution. The problem is
the sort time!

regards.

Josh Berkus wrote:

> Pginfo,
>
> > Sort  (cost=100922.53..100922.53 rows=22330 width=215) (actual
> > time=109786.23..110231.74 rows=679743 loops=1)
> >   ->  Hash Join  (cost=9153.28..99309.52 rows=22330 width=215)
> > (actual
> > time=12572.01..56330.28 rows=679743 loops=1)
> >         ->  Hash Join  (cost=2271.05..91995.05 rows=30620 width=198)
> > (actual
> > time=7082.66..36482.57 rows=679743 loops=1)
> >               ->  Seq Scan on a_sklad s  (cost=0.00..84181.91
> > rows=687913
> > width=111) (actual time=6812.81..23085.36 rows=679743 loops=1)
> >               ->  Hash  (cost=2256.59..2256.59 rows=5784 width=87)
> > (actual
> > time=268.05..268.05 rows=0 loops=1)
> >                     ->  Hash Join  (cost=2.52..2256.59 rows=5784
> > width=87)
> > (actual time=125.25..255.48 rows=5784 loops=1)
> >                           ->  Seq Scan on a_nomen n
> >  (cost=0.00..2152.84
> > rows=5784 width=74) (actual time=120.63..216.93 rows=5784 loops=1)
> >                           ->  Hash  (cost=2.42..2.42 rows=42
> > width=13)
> > (actual time=0.57..0.57 rows=0 loops=1)
> >                                 ->  Seq Scan on a_med med
> >  (cost=0.00..2.42
> > rows=42 width=13) (actual time=0.24..0.46 rows=42 loops=1)
> >         ->  Hash  (cost=6605.19..6605.19 rows=110819 width=17)
> > (actual
> > time=5485.90..5485.90 rows=0 loops=1)
> >               ->  Seq Scan on a_doc d  (cost=0.00..6605.19
> > rows=110819
> > width=17) (actual time=61.18..5282.99 rows=109788 loops=1)
> > Total runtime: 110856.36 msec
>
> Pardon me if we've been over this ground, but that's a *lot* of seq
> scans for this query.   It seems odd that there's not *one* index scan.
>
> Have you tried indexing *all* of the following fields?
> S.FID
> N.OSN_MED
> S.IDS_NUM
> N.IDS
> S.IDS_DOC
> D.IDS
> (check to avoid duplicate indexes.  don't forget to VACUUM ANALYZE
> after you index)
>
> -Josh Berkus




Re: Sort time

From
"Josh Berkus"
Date:
Pginfo,

> Sort  (cost=100922.53..100922.53 rows=22330 width=215) (actual
> time=109786.23..110231.74 rows=679743 loops=1)
>   ->  Hash Join  (cost=9153.28..99309.52 rows=22330 width=215)
> (actual
> time=12572.01..56330.28 rows=679743 loops=1)
>         ->  Hash Join  (cost=2271.05..91995.05 rows=30620 width=198)
> (actual
> time=7082.66..36482.57 rows=679743 loops=1)
>               ->  Seq Scan on a_sklad s  (cost=0.00..84181.91
> rows=687913
> width=111) (actual time=6812.81..23085.36 rows=679743 loops=1)
>               ->  Hash  (cost=2256.59..2256.59 rows=5784 width=87)
> (actual
> time=268.05..268.05 rows=0 loops=1)
>                     ->  Hash Join  (cost=2.52..2256.59 rows=5784
> width=87)
> (actual time=125.25..255.48 rows=5784 loops=1)
>                           ->  Seq Scan on a_nomen n
>  (cost=0.00..2152.84
> rows=5784 width=74) (actual time=120.63..216.93 rows=5784 loops=1)
>                           ->  Hash  (cost=2.42..2.42 rows=42
> width=13)
> (actual time=0.57..0.57 rows=0 loops=1)
>                                 ->  Seq Scan on a_med med
>  (cost=0.00..2.42
> rows=42 width=13) (actual time=0.24..0.46 rows=42 loops=1)
>         ->  Hash  (cost=6605.19..6605.19 rows=110819 width=17)
> (actual
> time=5485.90..5485.90 rows=0 loops=1)
>               ->  Seq Scan on a_doc d  (cost=0.00..6605.19
> rows=110819
> width=17) (actual time=61.18..5282.99 rows=109788 loops=1)
> Total runtime: 110856.36 msec

Pardon me if we've been over this ground, but that's a *lot* of seq
scans for this query.   It seems odd that there's not *one* index scan.

Have you tried indexing *all* of the following fields?
S.FID
N.OSN_MED
S.IDS_NUM
N.IDS
S.IDS_DOC
D.IDS
(check to avoid duplicate indexes.  don't forget to VACUUM ANALYZE
after you index)
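
In SQL that would be something like the following (index names are made up;
the table/column mapping follows the aliases in your query):

  CREATE INDEX a_sklad_fid_idx     ON a_sklad (fid);
  CREATE INDEX a_sklad_ids_num_idx ON a_sklad (ids_num);
  CREATE INDEX a_sklad_ids_doc_idx ON a_sklad (ids_doc);
  CREATE INDEX a_nomen_osn_med_idx ON a_nomen (osn_med);
  CREATE INDEX a_nomen_ids_idx     ON a_nomen (ids);
  CREATE INDEX a_doc_ids_idx       ON a_doc (ids);
  VACUUM ANALYZE;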

-Josh Berkus







Re: Sort time

From
"Josh Berkus"
Date:
Pginfo,

> Yes, I have indexes on all these fields.
> Also I vacuumed, and that is the result after it.
> Actually I do not see anything bad in the query execution. The problem
> is the sort time!

Hmmm... I don't understand.  The way I read the EXPLAIN, the sort is
only taking a few seconds.   Am I missing something, here?

And that's "VACUUM FULL ANALYZE", not just "VACUUM", yes?

If all of the above has been tried, what happens to the query when you
set enable_seqscan=off?

-Josh Berkus

Re: Sort time

From
Rod Taylor
Date:
On Fri, 2002-11-15 at 12:33, Josh Berkus wrote:
> Pginfo,
>
> > Yes, I have indexes on all these fields.
> > Also I vacuumed, and that is the result after it.
> > Actually I do not see anything bad in the query execution. The problem
> > is the sort time!
>
> Hmmm... I don't understand.  The way I read the EXPLAIN, the sort is
> only taking a few seconds.   Am I missing something, here?

The estimated cost had the sort at a few seconds, but the actual times
show it is taking 50% of the total query time.

The big problem is he's sorting by a varchar() which isn't overly quick
no matter what locale.  Integers are nice and quick (s.OP is an int,
which shows this).

If IDS_NUM is a number, he could try casting it to an int8, but without
data examples I couldn't say.
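
The cast would look something like this (hypothetical, and it only works if
every IDS_NUM value is purely numeric; otherwise the cast errors out):

  SELECT ids_num FROM a_sklad ORDER BY CAST(ids_num AS int8);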

--
Rod Taylor <rbt@rbt.ca>

Re: Sort time

From
Josh Berkus
Date:
Rod,

> The estimated cost had the sort at a few seconds, but the actual times
> show it is taking 50% of the total query time.

D'oh!   I was, of course, subtracting the estimated from the actual time.
Oops.

>
> The big problem is he's sorting by a varchar() which isn't overly quick
> no matter what locale.  Integers are nice and quick (s.OP is an int,
> which shows this).
>
> If IDS_NUM is a number, he could try casting it to an int8, but without
> data examples I couldn't say.

Hmmm ... how big *is* that varchar field?  8 characters gives us about 6mb for
the column.  Of course, if it's a 128-char global unique id, that's a bit
larger.

--
-Josh Berkus

______AGLIO DATABASE SOLUTIONS___________________________
                                        Josh Berkus
   Complete information technology     josh@agliodbs.com
    and data management solutions     (415) 565-7293
   for law firms, small businesses      fax 621-2533
    and non-profit organizations.     San Francisco


Re: Sort time

From
Rod Taylor
Date:
On Fri, 2002-11-15 at 14:27, Josh Berkus wrote:
> > The big problem is he's sorting by a varchar() which isn't overly quick
> > no matter what locale.  Integers are nice and quick (s.OP is an int,
> > which shows this).
> >
> > If IDS_NUM is a number, he could try casting it to an int8, but without
> > data examples I couldn't say.
>
> Hmmm ... how big *is* that varchar field?  8 characters gives us about 6mb for
> the column.  Of course, if it's a 128-char global unique id, that's a bit
> larger.

20 characters long in the Unicode locale -- which is 40 bytes?
--
Rod Taylor <rbt@rbt.ca>

Re: Sort time

From
Josh Berkus
Date:
Rod,

> > Hmmm ... how big *is* that varchar field?  8 characters gives us about
> > 6mb for the column.  Of course, if it's a 128-char global unique id,
> > that's a bit larger.
>
> 20 characters long in the Unicode locale -- which is 40 bytes?

Well, 40+, probably about 43.    Should be about 29mb, yes?
Here's a question: is the total size of the column a good indicator of the
sort_mem required?   Or does the rowsize affect it somehow?

--
-Josh Berkus

______AGLIO DATABASE SOLUTIONS___________________________
                                        Josh Berkus
   Complete information technology     josh@agliodbs.com
    and data management solutions     (415) 565-7293
   for law firms, small businesses      fax 621-2533
    and non-profit organizations.     San Francisco


Re: Sort time

From
Rod Taylor
Date:
On Fri, 2002-11-15 at 16:18, Josh Berkus wrote:
> Rod,
>
> > > Hmmm ... how big *is* that varchar field?  8 characters gives us about
> > > 6mb for the column.  Of course, if it's a 128-char global unique id,
> > > that's a bit larger.
> >
> > 20 characters long in the Unicode locale -- which is 40 bytes?
>
> Well, 40+, probably about 43.    Should be about 29mb, yes?
> Here's a question: is the total size of the column a good indicator of the
> sort_mem required?   Or does the rowsize affect it somehow?

I'd suspect the total row is sorted, especially in this case where he's
sorting more than one attribute.

--
Rod Taylor <rbt@rbt.ca>

ANALYZE and indexes (was Re: Sort time)

From
Tom Lane
Date:
"Josh Berkus" <josh@agliodbs.com> writes:
> ... don't forget to VACUUM ANALYZE after you index ...

People keep saying that, but it's a myth.  ANALYZE doesn't care what
indexes are present; adding or deleting an index doesn't invalidate
previous ANALYZE results.

            regards, tom lane

Re: Sort time

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> Here's a question: is the total size of the column a good indicator of the
> sort_mem required?   Or does the rowsize affect it somehow?

It will include all the data that's supposed to be output by the sort...
both the key column(s) and the others.

            regards, tom lane

Re: Sort time

From
pginfo
Date:
Hi,

Rod Taylor wrote:

> I'd suspect the total row is sorted, especially in this case where he's
> sorting more than one attribute.

I think the total row is sorted. I do not know how sorting works in pg and
why it is so slow, but I tested all this in Java (in C it is much quicker)
like this:
1. Read all the data into memory, as an ArrayList of data structures.
2. Make a comparator with Unicode string compare.
3. Execute the sort (all in memory).

The sort takes 2-4 sec for all these rows!!!
That is good performance.
The question is: why is it so slow in pg?
Sorting is a normal thing for a db!
Also I have 256 MB of sort mem, and this was the only query executing at
the moment.

I know that if the fields are INT everything will work better, but we
migrated this application from Oracle, and the fields in Oracle were
varchar. We did not have any performance problems with Oracle and this data.
Also, some of the users will continue to work with Oracle and exchange
(import and export) data with the pg systems.

> --
> Rod Taylor <rbt@rbt.ca>




Re: Sort time

From
pginfo
Date:
Hi,

Tom Lane wrote:

> Josh Berkus <josh@agliodbs.com> writes:
> > Here's a question: is the total size of the column a good indicator of the
> > sort_mem required?   Or does the rowsize affect it somehow?
>
> It will include all the data that's supposed to be output by the sort...
> both the key column(s) and the others.
>

Hmm, it is not clear to me. Let us say we have all the data.
If I sort by S.OP (it is INT) the sort takes < 6 sec.
I think we move all the same data; only the comparisons are on INT, and the
number of comparisons is ~ n * ln(n).
If we sort by S.IDS_xxx we also have n*ln(n) comparisons, but on varchar
(string).
I don't think that it can take 50 sec.

Is it not so?

regards,
ivan.

>                         regards, tom lane




Re: Sort time

From
Stephan Szabo
Date:
On Sat, 16 Nov 2002, pginfo wrote:

> Hi,
>
> Hmm, it is not clear to me. Let us say we have all the data.
> If I sort by S.OP (it is INT) the sort takes < 6 sec.
> I think we move all the same data; only the comparisons are on INT, and the
> number of comparisons is ~ n * ln(n).
> If we sort by S.IDS_xxx we also have n*ln(n) comparisons, but on varchar
> (string).
> I don't think that it can take 50 sec.
>
> Is it not so?

Have you tried setting up another database in "C" locale and compared the
timings there?  I'd wonder if maybe there's some extra copying going on
given the comments in varstr_cmp.


Re: Sort time

From
pginfo
Date:
Hi,

Stephan Szabo wrote:

> Have you tried setting up another database in "C" locale and compared the
> timings there?  I'd wonder if maybe there's some extra copying going on
> given the comments in varstr_cmp.

No, I do not have any info about it. I will see if it is possible (the data
are not so simple).
If it is possible I will make the tests.
Does no one have 700K rows in two tables?
It is simple to test:
1. Run a query that returns ~700K rows from these tables.
2. Sort them.

Only the sort time is of interest!

regards,
Ivan.



Re: Sort time

From
Stephan Szabo
Date:
On Sun, 17 Nov 2002, pginfo wrote:

> Hi,
>
> No, I do not have any info about it. I will see if it is possible (the data
> are not so simple).
> If it is possible I will make the tests.
> Does no one have 700K rows in two tables?
> It is simple to test:
> 1. Run a query that returns ~700K rows from these tables.
> 2. Sort them.
>
> Only the sort time is of interest!

I can make a table of 700k rows and test it (and am generating 700k of
random varchar rows), but I wouldn't hold great hope that this is
necessarily a valid test since possibly any of OS, configuration settings
and actual data (width and values) might have an effect on the results.


Re: Sort time

From
Stephan Szabo
Date:
On Sat, 16 Nov 2002, Stephan Szabo wrote:

> I can make a table of 700k rows and test it (and am generating 700k of
> random varchar rows), but I wouldn't hold great hope that this is
> necessarily a valid test since possibly any of OS, configuration settings
> and actual data (width and values) might have an effect on the results.

On my not terribly powerful or memory filled box, I got a time of about
16s after going through a couple iterations of raising sort_mem and
watching if it made temp files (which is probably a good idea to check as
well).  The data size ended up being in the vicinity of 100 meg in my
case.
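
For anyone wanting to reproduce the check, roughly (the table name is made
up, and the temp-file path assumes the standard data directory layout):

  SET sort_mem = 32768;   -- in KB; raise between runs
  EXPLAIN ANALYZE SELECT * FROM bigtable ORDER BY textcol;
  -- while this runs, watch for files under $PGDATA/base/<dboid>/pgsql_tmp/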



Re: Sort time

From
pginfo
Date:
Hi,

Stephan Szabo wrote:

> I can make a table of 700k rows and test it (and am generating 700k of
> random varchar rows), but I wouldn't hold great hope that this is
> necessarily a valid test since possibly any of OS, configuration settings
> and actual data (width and values) might have an effect on the results.

That is so, but the info will help.
If the sort time is 5-6 sec (for me it is 50 sec) I will work on config and
OS settings.
I am using RH 7.3 at the moment. If another OS has better performance I will
make the change.
But if the sort time is ~50 sec on any OS and config, then the problem is in
pg, and I will start to think about rewriting the sort part of the source or
migrating to another db (MySQL or SAP DB. On Oracle we have super
performance in sorting at the moment, but the idea is to move the project
to pg).

I think the sort is very important for any db.

Also it will be possible for me (in 1-2 days) to install another box for
tests and give access to someone who can look at the problem.
But to begin with, it would be great to have more info about sort test
results.

If anyone has a better idea, I am ready to discuss it.

regards,
Ivan.





Re: Sort time

From
pginfo
Date:
Hi,

Stephan Szabo wrote:

> On my not terribly powerful or memory filled box, I got a time of about
> 16s after going through a couple iterations of raising sort_mem and
> watching if it made temp files (which is probably a good idea to check as
> well).  The data size ended up being in the vicinity of 100 meg in my
> case.

The time is very good!
It is a very good idea to watch the temp files.
I set sort_mem to 32 MB (it is 256 on the production system)
and I see 3 temp files. The first is ~1.8 MB, the second is ~55 MB, and the
last is ~150 MB.

Also I removed the biggest fields (by size) from my query but got only a
little improvement.

regards,
ivan.


Re: Sort time

From
Stephan Szabo
Date:
On Sun, 17 Nov 2002, pginfo wrote:

> > On my not terribly powerful or memory filled box, I got a time of about
> > 16s after going through a couple iterations of raising sort_mem and
> > watching if it made temp files (which is probably a good idea to check as
> > well).  The data size ended up being in the vicinity of 100 meg in my
> > case.
>
> The time is very good!
> It is a very good idea to watch the temp files.
> I set sort_mem to 32 MB (it is 256 on the production system)
> and I see 3 temp files. The first is ~1.8 MB, the second is ~55 MB, and
> the last is ~150 MB.

As a note, the same data loaded into a non-"C" locale database took about
42 seconds on the same machine, approximately 2.5x as long.


Re: Sort time

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> As a note, the same data loaded into a non-"C" locale database took about
> 42 seconds on the same machine, approximately 2.5x as long.

The non-C locale is undoubtedly the problem.  I made a test dataset of
700000 all-alphabetic 20-character random strings:

$ head rand.data
duofoesrlycdnilvlcrg
crealjdrjpyczfbnlouo
lxaiyicslwjnxgpehtzp
ykizuovkvpkvvqsaocys
rkkvrqfiiybczwqdvvfu
stonxhbbvgwtjszodguv
prqxhwcfibiopjpiddud
ubgexbfdodhnauytebcf
urfoqifgbrladpssrwzw
ydcrsnxjpxospfqqoilw

I performed the following experiment in 7.3 using a database in
en_US locale, SQL_ASCII encoding:

enus=# create table vc20 (f1 varchar(20));
CREATE TABLE
enus=# \copy vc20 from rand.data
\.
enus=# vacuum analyze vc20;
VACUUM
enus=# set sort_mem to 50000;
SET
enus=# explain analyze select count(*) from
enus-# (select * from vc20 order by f1) ss;
                                                           QUERY PLAN


---------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=83607.48..83607.48 rows=1 width=24) (actual time=1058167.66..1058167.67 rows=1 loops=1)
   ->  Subquery Scan ss  (cost=80107.48..81857.48 rows=700000 width=24) (actual time=1022972.86..1049559.50 rows=700000
loops=1)
         ->  Sort  (cost=80107.48..81857.48 rows=700000 width=24) (actual time=1022972.80..1034036.58 rows=700000
loops=1)
               Sort Key: f1
               ->  Seq Scan on vc20  (cost=0.00..12148.00 rows=700000 width=24) (actual time=0.20..24651.65 rows=700000
loops=1)
 Total runtime: 1058220.10 msec
(6 rows)

(The point of the select count(*) was to avoid shipping the result rows
to the client, but in hindsight "explain analyze" would suppress that
anyway.  But the main datapoint here is the time for the Sort step.)

I tried the test using datatype NAME as well, since it sorts using
plain strcmp() instead of strcoll():

enus=# create table nm (f1 name);
CREATE TABLE
enus=# insert into nm select f1 from vc20;
INSERT 0 700000
enus=# vacuum analyze nm;
VACUUM
enus=# set sort_mem to 50000;
SET
enus=# explain analyze select count(*) from
enus-# (select * from nm order by f1) ss;
                                                           QUERY PLAN


---------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=122701.48..122701.48 rows=1 width=64) (actual time=157877.84..157877.85 rows=1 loops=1)
   ->  Subquery Scan ss  (cost=119201.48..120951.48 rows=700000 width=64) (actual time=121286.65..149376.93 rows=700000
loops=1)
         ->  Sort  (cost=119201.48..120951.48 rows=700000 width=64) (actual time=121286.60..134075.61 rows=700000
loops=1)
               Sort Key: f1
               ->  Seq Scan on nm  (cost=0.00..15642.00 rows=700000 width=64) (actual time=0.21..24150.57 rows=700000
loops=1)
 Total runtime: 157962.79 msec
(6 rows)

In C locale, the identical test sequence gives

                                                           QUERY PLAN


---------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=83607.48..83607.48 rows=1 width=24) (actual time=187480.70..187480.71 rows=1 loops=1)
   ->  Subquery Scan ss  (cost=80107.48..81857.48 rows=700000 width=24) (actual time=141100.03..178625.97 rows=700000
loops=1)
         ->  Sort  (cost=80107.48..81857.48 rows=700000 width=24) (actual time=141099.98..162288.95 rows=700000
loops=1)
               Sort Key: f1
               ->  Seq Scan on vc20  (cost=0.00..12148.00 rows=700000 width=24) (actual time=0.20..23954.71 rows=700000
loops=1)
 Total runtime: 187565.79 msec
(6 rows)

and of course about the same runtime as before for datatype NAME.  So on
this platform (HPUX 10.20), en_US locale incurs about a 6x penalty over
C locale for sorting varchars.


Note that NAME beats VARCHAR by a noticeable margin even in C locale,
despite the handicap of requiring much more I/O (being 64 bytes per row
not 24).  This surprises me; it looks like varstr_cmp() is reasonably
well optimized in the C-locale case.  But the real loser is VARCHAR in
non-C locales.  I suspect the primary time sink is strcoll() not the
palloc/copy overhead in varstr_cmp(), but don't have time right now to
do profiling to prove it.


Anyway, use of NAME instead of VARCHAR might be a workable workaround
if you cannot change your database locale to C.
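
Concretely, that might look like the following (only a sketch: 7.3 has no
ALTER COLUMN TYPE, so you'd add a parallel column; note that NAME silently
truncates anything longer than 63 bytes):

  ALTER TABLE a_sklad ADD COLUMN ids_num_nm name;
  UPDATE a_sklad SET ids_num_nm = ids_num::name;
  VACUUM FULL ANALYZE a_sklad;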

            regards, tom lane

Re: Sort time

From
Hannu Krosing
Date:
Stephan Szabo wrote on Sunday, 17.11.2002, at 22:29:
> On Sun, 17 Nov 2002, pginfo wrote:
>
> > It is a very good idea to watch the temp files.
> > I set sort_mem to 32 MB (it is 256 on the production system)
> > and I see 3 temp files.
>
> As a note, the same data loaded into a non-"C" locale database took about
> 42 seconds on the same machine, approximately 2.5x as long.

I have investigated IBM's ICU (International Components for Unicode) in
order to use it for implementing native UNICODE text types.

The sorting portion seems to work in two stages - 1. convert UTF-16 to a
"sorting string" and 2. compare said "sorting strings" - with the stages
also being available separately.

If the same is true for "native" locale support, then there is a good
explanation why the text sort is orders of magnitude slower than the int
sort: the full conversion to a "sorting string" has to be done at each
comparison (plus probably a malloc/free) for a locale-aware compare,
whereas in most cases in the C locale one does not need these, plus the
comparison can usually stop at the first or second char.

Getting good performance on locale-aware text sorts seems to require
storing these "sorting strings", either in addition to the originals or
alone, with a way to reverse the conversion ("sorting string" --> original).

Some speed could be gained by doing the original --> "sorting string"
conversion only once for each row, but that will probably require a major
rewrite of the sorting code - in essence

select loctxt,a,b,c,d,e,f,g from mytab order by loctxt;

should become

select loctxt,a,b,c,d,e,f,g from (
   select loctxt,a,b,c,d,e,f,g
     from mytab
  order by sorting_string(loctxt)
) t;

or even

select loctxt,a,b,c,d,e,f,g from (
  select loctxt,a,b,c,d,e,f,g, ss from (
    select loctxt,a,b,c,d,e,f,g, sorting_string(loctxt) as ss
      from mytab
    ) s
  order by ss
) t;

depending on how the second form is implemented (i.e. whether
sorting_string(loctxt) is evaluated once per row or once per compare)

-------------
Hannu



Re: Sort time

From
Tom Lane
Date:
I've applied the attached patch to current sources (7.4devel).  It
eliminates palloc/pfree overhead in varstr_cmp() for short strings
(up to 1K as committed).  I find that this reduces the sort time for
700,000 rows by about 10% on my HPUX box; might be better on machines
with better-optimized strcoll().

            regards, tom lane

*** src/backend/utils/adt/varlena.c.orig    Wed Sep  4 17:30:48 2002
--- src/backend/utils/adt/varlena.c    Sun Nov 17 17:21:43 2002
***************
*** 736,771 ****
  varstr_cmp(char *arg1, int len1, char *arg2, int len2)
  {
      int            result;
-     char       *a1p,
-                *a2p;

      /*
       * Unfortunately, there is no strncoll(), so in the non-C locale case
       * we have to do some memory copying.  This turns out to be
       * significantly slower, so we optimize the case where LC_COLLATE is
!      * C.
       */
      if (!lc_collate_is_c())
      {
!         a1p = (char *) palloc(len1 + 1);
!         a2p = (char *) palloc(len2 + 1);

          memcpy(a1p, arg1, len1);
!         *(a1p + len1) = '\0';
          memcpy(a2p, arg2, len2);
!         *(a2p + len2) = '\0';

          result = strcoll(a1p, a2p);

!         pfree(a1p);
!         pfree(a2p);
      }
      else
      {
!         a1p = arg1;
!         a2p = arg2;
!
!         result = strncmp(a1p, a2p, Min(len1, len2));
          if ((result == 0) && (len1 != len2))
              result = (len1 < len2) ? -1 : 1;
      }
--- 736,782 ----
  varstr_cmp(char *arg1, int len1, char *arg2, int len2)
  {
      int            result;

      /*
       * Unfortunately, there is no strncoll(), so in the non-C locale case
       * we have to do some memory copying.  This turns out to be
       * significantly slower, so we optimize the case where LC_COLLATE is
!      * C.  We also try to optimize relatively-short strings by avoiding
!      * palloc/pfree overhead.
       */
+ #define STACKBUFLEN        1024
+
      if (!lc_collate_is_c())
      {
!         char    a1buf[STACKBUFLEN];
!         char    a2buf[STACKBUFLEN];
!         char   *a1p,
!                *a2p;
!
!         if (len1 >= STACKBUFLEN)
!             a1p = (char *) palloc(len1 + 1);
!         else
!             a1p = a1buf;
!         if (len2 >= STACKBUFLEN)
!             a2p = (char *) palloc(len2 + 1);
!         else
!             a2p = a2buf;

          memcpy(a1p, arg1, len1);
!         a1p[len1] = '\0';
          memcpy(a2p, arg2, len2);
!         a2p[len2] = '\0';

          result = strcoll(a1p, a2p);

!         if (len1 >= STACKBUFLEN)
!             pfree(a1p);
!         if (len2 >= STACKBUFLEN)
!             pfree(a2p);
      }
      else
      {
!         result = strncmp(arg1, arg2, Min(len1, len2));
          if ((result == 0) && (len1 != len2))
              result = (len1 < len2) ? -1 : 1;
      }

Re: Sort time

From
Tom Lane
Date:
Hannu Krosing <hannu@tm.ee> writes:
> Some speed could be gained by doing the original --> "sorting string"
> conversion only once for each row, but that will probably require a
> major rewrite of the sorting code - in essence

> select loctxt,a,b,c,d,e,f,g from mytab order by loctxt;

> should become

> select loctxt,a,b,c,d,e,f,g from (
>    select loctxt,a,b,c,d,e,f,g
>      from mytab
>   order by sorting_string(loctxt)
> ) t;

> or even

> select loctxt,a,b,c,d,e,f,g from (
>   select loctxt,a,b,c,d,e,f,g, ss from (
>     select loctxt,a,b,c,d,e,f,g, sorting_string(loctxt) as ss
>       from mytab
>     ) s
>   order by ss
> ) t;

> depending on how the second form is implemented (i.e. whether
> sorting_string(loctxt) is evaluated once per row or once per compare)

Indeed the function call will be evaluated only once per row, so it
wouldn't be too hard to kluge up a prototype implementation to test what
the real speed difference turns out to be.  You'd basically need
(a) a non-locale-aware set of comparison operators for type text ---
you might as well build a whole index opclass, so that non-locale-aware
indexes could be made (this'd be a huge win for LIKE optimization too);
(b) a strxfrm() function to produce the sortable strings.

If it turns out to be a big win, which is looking probable from the
comparisons Stephan and I just reported, then the next question is how
to make the transformation occur automatically.  I think it'd be
relatively simple to put a hack in the planner to do this when it's
emitting a SORT operation that uses the locale-aware sort operators.
It'd be kind of an ugly special case, but surely no worse than the ones
that are in there already for LIKE and some other operators.

            regards, tom lane

Re: Sort time

From
pginfo
Date:
Hi Tom,

The idea is very good.
I recreated the tables, and for all IDS_xxx fields I used name (not
varchar(20)). The db is still in Unicode.
I ran the query and got a huge improvement!
The total time is 166 sec (before it was ~320-340 sec).

I will continue to make new tests and play around with the setups.
I think all this can be even quicker. I expect to get ~45-60 sec (this is
the time in Oracle), but 166 sec is also good.

I think that we need to work on the non-US sorting and comparing.
It should not be possible for it to be so slow (all the functions are
executed in memory, and in Java, in Oracle, and in MS all this works very
fast).

regards,
ivan.

Tom Lane wrote:

> Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> > As a note, the same data loaded into a non-"C" locale database took about
> > 42 seconds on the same machine, approximately 2.5x as long.
>
> The non-C locale is undoubtedly the problem.  I made a test dataset of
> 700000 all-alphabetic 20-character random strings:
> [...]
> So on this platform (HPUX 10.20), en_US locale incurs about a 6x penalty
> over C locale for sorting varchars.
> [...]
> Anyway, use of NAME instead of VARCHAR might be a workable workaround
> if you cannot change your database locale to C.




Re: Sort time

From
pginfo
Date:
Hi,
another important point for me:

As I know, the size of sort_mem is given at pg start
and is not shared.
Also I can use set sort_mem to xxx;
Can I do "set sort_mem to myvalue; execute my query; set sort_mem to
old_value;" only for queries that need more sort memory?

If I can, will the newly set sort_mem apply only to the opened connection
or to all connections?
Also, will this dynamic sort_mem setting cause problems in pg?
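
The pattern I mean, in psql form (only a sketch of the idea; the value is
arbitrary):

  SET sort_mem TO 262144;    -- raise it only for this connection
  -- ... execute my query here ...
  SET sort_mem TO DEFAULT;   -- back to the configured value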

regards,
ivan.




Re: Sort time

From
Stephan Szabo
Date:
On Mon, 18 Nov 2002, pginfo wrote:

> I think that we need to work on the non-US sorting and comparing.
> It should not be possible for it to be so slow (all the functions are
> executed in memory, and in Java, in Oracle, and in MS all this works
> very fast).

I get similar results from the unix sort command (8 sec elapsed for the C
locale, 25 sec for en_US) on my Red Hat 8 machine (and I forced the buffer
size high enough to not get any temp files, afaict).

I'm not sure what platform Tom was using for his test, but maybe someone
could try this on a non-x86/Linux machine and see what they get (I don't
have access to one).