Thread: Postgres questions

Postgres questions

From
Trenta sis
Date:

Hi,


I have a Debian Lenny server with Postgres 8.3 that is working correctly, but I have some questions about this installation.

This server is working without any problems, but it is somewhat critical and I need to know how to do the following:

- I need to connect from Postgres to another, non-Postgres database (a linked server), for example via JDBC or ODBC. I have tried dbi-link with SQL Server; it seems to work, but with poor performance, and with databases other than MSSQL it is not working. What options exist with 8.3? And with other versions?

- This server runs some critical applications and I need high availability, but I'm not sure about the possible options for this version or similar ones. I have thought about active/active, active/passive or active/read-only, but I'm not sure which are realistic options, and what a possible environment for this situation could look like.

Thanks

Re: Postgres questions

From
Craig Ringer
Date:
On 24/05/2011 6:10 AM, Trenta sis wrote:

> - I need to connect from Postgres to another, non-Postgres database (a
> linked server), for example via JDBC or ODBC. I have tried dbi-link
> with SQL Server; it seems to work, but with poor performance, and with
> databases other than MSSQL it is not working. What options exist with
> 8.3? And with other versions?

DBI-link is probably your best bet. Another possibility is to use an
in-database procedural language to talk to the other database - for
example, PL/Perl via Perl DBI or PL/Python via a PEP-249
(http://www.python.org/dev/peps/pep-0249/) database driver like pymssql.

Otherwise you can do the data sharing/sync/whatever via a client
application that has connections to PostgreSQL and to the other database
of interest. That's often a better choice for more complex jobs.
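
For the in-database route, a minimal plperlu sketch might look like the following. The DSN, credentials, function name and table are hypothetical placeholders, not anything from this thread:

```sql
-- Hypothetical sketch: a plperlu function that queries a remote
-- database through Perl DBI over ODBC. 'remote_db', the credentials
-- and the table name are placeholders.
CREATE OR REPLACE FUNCTION remote_row_count(tbl text)
RETURNS bigint LANGUAGE plperlu AS $$
    use DBI;
    my ($tbl) = @_;
    my $dbh = DBI->connect('dbi:ODBC:DSN=remote_db', 'someuser', 'secret',
                           { RaiseError => 1 });
    my ($n) = $dbh->selectrow_array("SELECT count(*) FROM $tbl");
    $dbh->disconnect;
    return $n;
$$;

-- Usage: SELECT remote_row_count('some_remote_table');
```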

Perhaps it'd help if you explained why you need this and what you want
to accomplish with it?

> - This server runs some critical applications and I need high
> availability, but I'm not sure about the possible options for this
> version or similar ones. I have thought about active/active,
> active/passive or active/read-only, but I'm not sure which are
> realistic options, and what a possible environment for this situation
> could look like.

It depends a LOT on what your needs are, and what your budget is. You
have some basic questions to ask yourself, like:

- Do I need true HA with failover, or just to protect against data loss?

- Can I modify my apps to be aware of failover, or does failover have to
   be transparent?

- Do I need multi-site failover or is all access of interest within
   one site?

- What kind of guarantees do I need about data loss windows at failover
   time? Can I afford to lose the last <x> transactions / seconds worth
   of transactions? Or must absolutely every transaction be retained
   at all costs?

Once you've worked out the answers to those kinds of questions, THEN you
can look at bucardo, slony-I, PostgreSQL 9.0 native replication, etc etc
as well as failover-control options like heartbeat and decide what might
be suitable for you.

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/

Re: Postgres questions

From
Trenta sis
Date:
Hi Craig,

First of all, thanks for your answer.
What I need is to connect to another database (InterSystems Caché) and run selects on it from Postgres. I have tried to configure dbi-link, but it is not working with this database; with SQL Server it seems to work, but with poor performance.

About the second question, I'll try to answer your questions:
- Do I need true HA with failover, or just to protect against data loss?
It's a critical environment, and if possible we need failover.

- Can I modify my apps to be aware of failover, or does failover have to
 be transparent?

Failover must be transparent to the applications.

- Do I need multi-site failover or is all access of interest within
 one site?

With 2 sites it could be a valid option, if one is the master (RW) and the secondary is read-only (RO) for use as a data warehouse (allowing only selects, except in a failover scenario, where it would allow everything).

- What kind of guarantees do I need about data loss windows at failover
 time? Can I afford to lose the last <x> transactions / seconds worth
 of transactions? Or must absolutely every transaction be retained
 at all costs?

A data-loss window of some seconds or minutes would be acceptable, but if it is not too complex, no data loss would be excellent.

I have planned to set up a second server, but I'm not sure which options to choose to meet our requirements.

Thanks


Re: Postgres questions

From
Craig Ringer
Date:
On 05/24/2011 05:15 PM, Trenta sis wrote:

> What I need is to connect to another database (InterSystems Caché) and
> run selects on it from Postgres. I have tried to configure dbi-link,
> but it is not working with this database; with SQL Server it seems to
> work, but with poor performance.

You really need to be specific about things like "not working" and "poor
performance". Nobody can help you if that's all you say; you need to
give error messages, versions, commands, configurations, etc. You
haven't shown any queries, any EXPLAIN ANALYZE output, any schema, any
timing data, or much of anything else that would allow anyone to help
you. It might be worth remedying that. See:

In addition to that, some obvious things to test and report back on include:

Can you connect to Cache from a stand-alone Perl script using Perl DBI?

If so, can you connect to Cache from a PL/Perlu script using Perl DBI?

Is MS-SQL performance OK when connecting from a stand-alone Perl script
outside the database using Perl DBI?

Is MS-SQL performance OK when connecting from a custom test plperlu
script using Perl DBI?

What do you expect performance to be like? Why? What kind of queries are
you executing? Which ones perform badly?

> /- What kind of guarantees do I need about data loss windows at failover
>   time? Can I afford to lose the last <x> transactions / seconds worth
>   of transactions? Or must absolutely every transaction be retained
>   at all costs?/
> A data-loss window of some seconds or minutes would be acceptable, but
> if it is not too complex, no data loss would be excellent.

It sounds like your requirements can probably be satisfied by PostgreSQL
9.0's built-in replication combined with the use of repmgr and heartbeat.
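
As a rough illustration of what the 9.0 built-in route involves (hostnames, addresses and user names below are placeholders; this is a sketch, not a tested configuration):

```
# master: postgresql.conf (PostgreSQL 9.0)
wal_level = hot_standby
max_wal_senders = 3

# master: pg_hba.conf - allow the standby to connect for replication
host  replication  replica_user  192.0.2.10/32  md5

# standby: postgresql.conf - allow read-only queries during recovery
hot_standby = on

# standby: recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=master.example.com user=replica_user'
```

repmgr and heartbeat then sit on top of this to monitor the pair and handle promotion at failover time.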

For anything like this, though, I STRONGLY suggest that you hire a
consultant who has done what you need before and has learned the traps
and pitfalls. See:

http://www.postgresql.org/support/professional_support

I don't do PostgreSQL support commercially and am not experienced with
failover/HA setups, so I can't help you much there and suggest you find
someone who _really_ knows what they are doing. Getting HA right is *hard*.

--
Craig Ringer

Re: Postgres questions

From
Trenta sis
Date:

dbi-link was configured following the dbi-link instructions. Where can I find these scripts:
Can you connect to Cache from a stand-alone Perl script using Perl DBI?

If so, can you connect to Cache from a PL/Perlu script using Perl DBI?

Is MS-SQL performance OK when connecting from a stand-alone Perl script outside the database using Perl DBI?

Is MS-SQL performance OK when connecting from a custom test plperlu script using Perl DBI?


The queries against MSSQL are SQL selects from a table on MSSQL with about 400,000 rows.

About replication: right now I can't upgrade to 9.0. What options exist with 8.3? Even if it is active/passive and not real HA, that would be OK.

Thanks


Re: Postgres questions

From
Craig Ringer
Date:
On 05/28/2011 06:08 PM, Trenta sis wrote:
>
> dbi-link was configured following the dbi-link instructions. Where can
> I find these scripts:

Write them.

> The queries against MSSQL are SQL selects from a table on MSSQL with about 400,000 rows.

Well, no wonder it's slow. As far as I know DBI-Link can't push
predicates down into queries to the remote database. If you write:

SELECT * FROM some_remote_table t WHERE t.some_field > 42;

then (as far as I know) the whole of some_remote_table is retrieved,
then the filter is applied. It executes:
"SELECT * FROM underlying_table"
on the remote server, not
"SELECT * FROM underlying_table WHERE underlying_table.some_field > 42"

For more information, read:

http://stackoverflow.com/questions/1328601/postgresql-and-sql-server-2008-dbi-link

A full SQL/MED implementation *might* be able to push down predicates,
but even then it's hard to do reliably because of possible differences
in locales, collation rules, etc.

For that kind of data size you're usually better off periodically
copying the data from the remote table to a local table with the same
structure, then querying the local copy. Think of it as a materialized
view of the remote table. You can use DBI-Link to refresh the local
copy and PgAgent to schedule periodic refreshes.
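
The periodic-copy approach can be sketched like this (table names are illustrative; remote_table stands for the DBI-Link view of the remote side):

```sql
-- Refresh a local snapshot of the remote table inside one
-- transaction, so the swap is atomic for readers.
BEGIN;
TRUNCATE local_copy;
INSERT INTO local_copy SELECT * FROM remote_table;
COMMIT;
-- Queries then run against local_copy, where ordinary Postgres
-- indexes, statistics and planning apply.
```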

> About replication: right now I can't upgrade to 9.0. What options
> exist with 8.3? Even if it is active/passive and not real HA, that
> would be OK.

Use Google. Type in "postgresql replication" and the first result should be:

http://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling
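
For 8.3 specifically, the built-in option on that page is warm standby via WAL shipping. A minimal, illustrative sketch (the archive path is a placeholder):

```
# master: postgresql.conf (PostgreSQL 8.3)
archive_mode = on
archive_command = 'cp %p /path/to/archive/%f'

# standby: recovery.conf, using contrib/pg_standby to wait for
# each WAL segment to arrive
restore_command = 'pg_standby /path/to/archive %f %p'
```

This gives active/passive failover with a data-loss window of up to one WAL segment, which matches the seconds-to-minutes tolerance described earlier in the thread.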

--
Craig Ringer