Thread: Migrating a FoxPro system and would like input on the best way to achieve optimal performance

Migrating a FoxPro system and would like input on the best way to achieve optimal performance

From

TonyS

Date:

09 February 2015, 17:21:43

I have a system that I am needing to convert from FoxPro files being accessed
with DAO to PostgreSQL.

This system serves 1,000 clients and will be expanding to 2,000 within the
next 18 months.

The current system has a directory with files that contain information of a
global nature such as the list of clients, list of users and which client
they belong to, etc. Then each client has files within their own directory
to keep the size of the tables manageable. Each client has 165 tables. These
tables are all the same definition across the different groups.

I have been trying to determine the best way to define this system within
PostgreSQL.

I have considered partitioning tables, but if I am correct that would result
in 330,000 files and I am not certain if that will cause an issue with
degraded file system performance.

Is it possible to create a tablespace for each group and then create
partitioned tables for the groups within the group's tablespace to limit the
number of files in a directory? I plan on utilizing the built-in streaming
replication, so I assume if I went the tablespace route I would need to
create directories for all future groups from the outset since creating them
individually with code on the backup systems would be difficult.

Another option would be placing an extra field in each table identifying the
group it belongs to and combining all of the separate tables of the same
definition into one table. This would result in some tables having 300
million entries currently and that would climb over the next 18 months.

The final option I can see is creating a schema for each of the different
clients. I am not certain if this is a better option than partitioned
tables. I haven't been able to determine if schema objects are stored in a
sub directory or if they are in the same directory as all of the other
tables. If they are in the same directory then the same issue arises as the
partitioned tables.

Of course, I am certain there are a number of other possibilities that I am
overlooking. I am just trying to determine the best way to move this over
and get things into a more modern system.

--
View this message in context:
http://postgresql.nabble.com/Migrating-a-FoxPro-system-and-would-like-input-on-the-best-way-to-achieve-optimal-performance-tp5837211.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

Re: Migrating a FoxPro system and would like input on the best way to achieve optimal performance

From

David G Johnston

Date:

09 February 2015, 21:17:58

TonyS wrote
> Then each client has files within their own directory to keep the size of
> the tables manageable. Each client has 165 tables. These tables are all
> the same definition across the different groups.
>
> I have considered partitioning tables, but if I am correct that would
> result in 330,000 files and I am not certain if that will cause an issue
> with degraded file system performance.

I suggest you not think about "files" when pondering about PostgreSQL.  That
said, 330,000 tables within a single database, or even cluster, is likely to
be problematic.


> Is it possible to create a tablespace for each group and then create
> partitioned tables for the groups within the group's tablespace to limit
> the number of files in a directory?

Same point about ignoring "files" and "directories".  Tablespaces let you
place different kinds of data onto different filesystems; using them for
"directory management" is not particularly helpful.

Note that I presume you are planning on leaving the database backend on
Windows...my experience is more with Linux but your core issue is data model
which is largely O/S agnostic.


> I plan on utilizing the built-in streaming replication, so I assume if I
> went the tablespace route I would need to create directories for all
> future groups from the outset since creating them individually with code
> on the backup systems would be difficult.

Which is another reason why tablespaces should not implement logical
attributes of the system.


> Another option would be placing an extra field in each table identifying
> the group it belongs to and combining all of the separate tables of the
> same definition into one table. This would result in some tables having
> 300 million entries currently and that would climb over the next 18
> months.

This is the canonical solution to multi-tenancy.  Physical partitioning then
occurs on a hash of whatever key you are using; you do not have one tenant
per table.


> The final option I can see is creating a schema for each of the different
> clients. I am not certain if this is a better option than partitioned
> tables. I haven't been able to determine if schema objects are stored in a
> sub directory or if they are in the same directory as all of the other
> tables. If they are in the same directory then the same issue arises as
> the partitioned tables.

Depending on whether clients are able to get access to the data directly you
can also consider having a separate database for each client.  I would then
recommend using either dblink or postgres_fdw to connect to the single
shared database - or just replicate the shared schema and data subset into
each individual client database.


> Of course, I am certain there are a number of other possibilities that I
> am overlooking. I am just trying to determine the best way to move this
> over and get things into a more modern system.

Without understanding how your application works and makes use of the
existing data it is difficult to suggest alternatives.  Specifically around
data visibility and the mechanics behind how the application access
different clients' data.

I would personally choose only between having different databases for each
client or using a "client_id" column in conjunction with a multi-tenant
database.  Those are the two logical models; everything else (e.g.
partitioning) are physical implementation details.

David J.



--
View this message in context:
http://postgresql.nabble.com/Migrating-a-FoxPro-system-and-would-like-input-on-the-best-way-to-achieve-optimal-performance-tp5837211p5837241.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.