Thread: my.cnf to postgresql.conf Conversion

my.cnf to postgresql.conf Conversion

From
Kevin Old
Date:
Hello everyone,

I have a question about setting the variables in the postgresql.conf
file.  I have the following settings in my my.cnf file for MySQL....and
need to know which variables in the postgresql.conf file would
correspond to the ones below, so that I may tune my system correctly.

set-variable    = key_buffer=384M
set-variable    = max_allowed_packet=1M
set-variable    = table_cache=64
set-variable    = thread_stack=64K
set-variable    = sort_buffer=2M
set-variable    = record_buffer=2M
set-variable    = thread_cache=8
# Try number of CPU's*2 for thread_concurrency
set-variable    = thread_concurrency=8
set-variable    = myisam_sort_buffer_size=64M

Any suggestions are greatly appreciated!!!!

Kevin
--
Kevin Old <kold@carolina.rr.com>


Re: my.cnf to postgresql.conf Conversion

From
"scott.marlowe"
Date:
On 31 Oct 2002, Kevin Old wrote:

> Hello everyone,
>
> I have a question about setting the variables in the postgresql.conf
> file.  I have the following settings in my my.cnf file for MySQL....and
> need to know which variables in the postgresql.conf file would
> correspond to the ones below, so that I may tune my system correctly.
>
> set-variable    = key_buffer=384M
> set-variable    = max_allowed_packet=1M
> set-variable    = table_cache=64
> set-variable    = thread_stack=64K
> set-variable    = sort_buffer=2M
> set-variable    = record_buffer=2M
> set-variable    = thread_cache=8
> # Try number of CPU's*2 for thread_concurrency
> set-variable    = thread_concurrency=8
> set-variable    = myisam_sort_buffer_size=64M

Many of these have no direct one-to-one equivalent in PostgreSQL.

The only ones that are pretty similar are sort_buffer and record_buffer.

The first ones to look at changing are:

max_connections   (the default 32 is kinda small for some servers, but
fine for departmental servers.  You may need to adjust fs.file-max in the
kernel (I'm assuming linux here) if you go very high here.)

shared_buffers (These are measured in 8k blocks.  I've found anywhere from
1000 to 20000 works well, anything over that tends to be a waste of
memory for most servers.  Note you may need to adjust the settings for
kernel.shmall and kernel.shmmax if you want this to be very high).

sort_mem  (Bigger isn't always better here.  This is the amount of memory
EACH sort operation will grab when it runs, and it seems that it will
grab all of it whether it needs it or not, so be careful not to crank
this up.  I've found that 2048 to 8192 are suitably large.  It's measured
in kilobytes, I believe.)

fsync  (Setting this to false can more than double the speed of your
database, while slightly increasing the chance that an unscheduled power
down (i.e. a big dummy trips over your power cord kinda thing) will lose
data.  In my testing, though, I've never lost data by pulling the plug,
whether it was set to false or true.)

Then you can look at these settings, which are used to tell the planner
how to execute your queries:

random_page_cost  (The default of 4 here is a bit high for most people.  1
to 2 seems a more realistic setting for machines with a bit of memory
where a lot of your result set may be cached.)

cpu_tuple_cost
cpu_index_tuple_cost  (setting this lower makes the machine favor using
indexes)
cpu_operator_cost

If you're going to update thousands of rows at a time, look at increasing
these settings, which let the database reclaim vacuumed rows from tables
that have had massive updates:

max_fsm_relations
max_fsm_pages
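Pulling the above together, a postgresql.conf fragment with illustrative
values (starting points picked from the ranges above, not recommendations;
tune for your own hardware and workload) might look like:

```
# postgresql.conf -- illustrative values only, from the ranges above
max_connections = 64        # default 32 is small for busy servers
shared_buffers = 4096       # in 8k blocks; 4096 = 32MB
sort_mem = 4096             # in kilobytes
fsync = false               # faster, but riskier on unclean shutdown
random_page_cost = 2        # default 4 is high when data is mostly cached
max_fsm_relations = 1000
max_fsm_pages = 20000
```

Most of these take effect only at postmaster start, so restart after
editing the file.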


Re: my.cnf to postgresql.conf Conversion

From
Neil Conway
Date:
"scott.marlowe" <scott.marlowe@ihs.com> writes:
> sort_mem  (bigger isn't always better here.  This is the amount of memory
> EACH sort operation will grab when it runs, and it seems to be that it
> will grab all of it whether it needs it or not, so be careful not to crank
> this up.

That's not correct -- sort_mem specifies the point at which a sort
operation will start to swap data to disk (i.e. the memory is not
consumed regardless). See the entry on SORT_MEM here for more info:

  http://developer.postgresql.org/docs/postgres/runtime-config.html

Another parameter you might want to tweak is wal_buffers.

Cheers,

Neil

--
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC

Re: my.cnf to postgresql.conf Conversion

From
Kevin Old
Date:
Scott,

Thanks for your reply.  I am running Solaris 2.7 with 4 400MHz processors
and 2GB RAM.  I still have questions about the settings in
postgresql.conf.  With MySQL, the key_buffer variable seems to "rope
off" a particular amount of memory for the mysql daemon.  That's
basically what I'm trying to do here for PostgreSQL.

I have 200,000+ rows (approx 50 fields) going into a postgres database
and I need to give it as much memory as I can.

I am using the copy command, but it is still very slow.  Any suggestions?

My perl code and tables/queries are as optimized as they can be.
postgres shows a size of only about 3,000K in top while the copy command
is running, so I assume it's a problem with memory allocation.

Thanks,
Kevin

On Thu, 2002-10-31 at 17:46, scott.marlowe wrote:
> [scott.marlowe's tuning suggestions snipped; quoted in full above]
--
Kevin Old <kold@carolina.rr.com>


Re: my.cnf to postgresql.conf Conversion

From
Neil Conway
Date:
Kevin Old <kold@carolina.rr.com> writes:
> My perl code and tables/queries are as optimized as they can
> be....postgres shows that the size is only about 3,000K when viewed in
> top and running the copy command......so I assume it's problem with
> memory allocation.

Disabling fsync should help. Also, consider removing any indexes /
foreign keys on the table you're copying into, and then re-adding them
when the bulk load is finished.
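A sketch of that bulk-load pattern, with hypothetical table and index
names standing in for the real schema:

```sql
-- Hypothetical schema: an index on "calls" slows down every COPY'd row.
DROP INDEX calls_call_id_idx;

-- Bulk load without per-row index maintenance.
COPY calls FROM '/tmp/calls.dat';

-- Rebuild the index once, after all the data is in.
CREATE INDEX calls_call_id_idx ON calls (call_id);
```

The same idea applies to foreign keys: remove them before the load and
re-add them afterwards, so each row isn't checked individually.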

Cheers,

Neil

--
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC

Re: my.cnf to postgresql.conf Conversion

From
"scott.marlowe"
Date:
What are your current settings for postgresql.conf?

I'm not that familiar with Solaris, but I know there were some issues with
the built-in sort having performance problems when sorting a lot of
duplicate keys.  That could be the problem, but I'm not sure how likely
it is.

What kinds of indexes do you have on that table?

If you have indexes on the table, it may be faster to drop them and then
recreate them after the load.



On 1 Nov 2002, Kevin Old wrote:

> [Kevin Old's message snipped; quoted in full above]


Re: my.cnf to postgresql.conf Conversion

From
Bruce Momjian
Date:
scott.marlowe wrote:
> What are your current settings for postgresql.conf?
>
> I'm not that familiar with Solaris, but I know there were some issues with
> the built in sort having performance issues if there were a lot of
> duplicate keys when sorting.  That could be a possible problem, but I'm
> not sure how likely that is.

The solaris duplicate keys problem is fixed in 7.3beta by using our own
NetBSD qsort.  :-)

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073