Thread: Distributed Table Partitioning
I have this problem: a Master table containing records with a timestamp column registering creation date-time, and one Detail table containing info related to the Master table. As time went by, those tables grew enormously, and I can't afford expanding my SSD VPS. So I'm thinking about storing only NEW data into it, and move OLD data to a cheaper SATA VPS. The goal is using the SSD server as "main", and the other (or others?) as "child", so queries still go to the main server, it somehow detects which records must be fetched from it and what from the child servers, then return the "composed" dataset to the caller. I think this is called Distributed Horizontal Table Partitioning. Is there a way to do this without changing my application code?. Regards, -- Leonardo M. Ramé Medical IT - Griensu S.A. Av. Colón 636 - Piso 8 Of. A X5000EPT -- Córdoba Tel.: +54(351)4246924 +54(351)4247788 +54(351)4247979 int. 19 Cel.: +54 9 (011) 40871877
Hi. I think pgpool-II can do that job for you. It's a middleware, so you can use it without even changing your app code(butyour postgres configuration). It suppoerts many clustering functions, including replication, failover, and a lotmore; it also supports partitioning. so that may be suitable for you. Check the tutorial, it even has some examples: http://www.pgpool.net/docs/latest/tutorial-en.html Alvaro Aguayo Jefe de Operaciones Open Comb Systems E.I.R.L. Oficina: (+51-1) 3377813 | RPM: #034252 / (+51) 995540103 | RPC: (+51) 954183248 Website: www.ocs.pe ----- Original Message ----- From: "Leonardo M. Ramé" <l.rame@griensu.com> To: "PostgreSql-general" <pgsql-general@postgresql.org> Sent: Saturday, 12 March, 2016 8:25:01 PM Subject: [GENERAL] Distributed Table Partitioning I have this problem: a Master table containing records with a timestamp column registering creation date-time, and one Detail table containing info related to the Master table. As time went by, those tables grew enormously, and I can't afford expanding my SSD VPS. So I'm thinking about storing only NEW data into it, and move OLD data to a cheaper SATA VPS. The goal is using the SSD server as "main", and the other (or others?) as "child", so queries still go to the main server, it somehow detects which records must be fetched from it and what from the child servers, then return the "composed" dataset to the caller. I think this is called Distributed Horizontal Table Partitioning. Is there a way to do this without changing my application code?. Regards, -- Leonardo M. Ramé Medical IT - Griensu S.A. Av. Colón 636 - Piso 8 Of. A X5000EPT -- Córdoba Tel.: +54(351)4246924 +54(351)4247788 +54(351)4247979 int. 19 Cel.: +54 9 (011) 40871877 -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
On Sat, Mar 12, 2016 at 8:33 PM, Alvaro Aguayo Garcia-Rada <aaguayo@opensysperu.com> wrote:
Hi. I think pgpool-II can do that job for you. It's a middleware, so you can use it without even changing your app code(but your postgres configuration). It suppoerts many clustering functions, including replication, failover, and a lot more; it also supports partitioning. so that may be suitable for you. Check the tutorial, it even has some examples: http://www.pgpool.net/docs/latest/tutorial-en.html
Alvaro Aguayo
Jefe de Operaciones
Open Comb Systems E.I.R.L.
Oficina: (+51-1) 3377813 | RPM: #034252 / (+51) 995540103 | RPC: (+51) 954183248
Website: www.ocs.pe
----- Original Message -----
From: "Leonardo M. Ramé" <l.rame@griensu.com>
To: "PostgreSql-general" <pgsql-general@postgresql.org>
Sent: Saturday, 12 March, 2016 8:25:01 PM
Subject: [GENERAL] Distributed Table Partitioning
I have this problem: a Master table containing records with a timestamp
column registering creation date-time, and one Detail table containing
info related to the Master table.
As time went by, those tables grew enormously, and I can't afford
expanding my SSD VPS. So I'm thinking about storing only NEW data into
it, and move OLD data to a cheaper SATA VPS.
The goal is using the SSD server as "main", and the other (or others?)
as "child", so queries still go to the main server, it somehow detects
which records must be fetched from it and what from the child servers,
then return the "composed" dataset to the caller.
I think this is called Distributed Horizontal Table Partitioning.
Is there a way to do this without changing my application code?.
Regards,
--
Leonardo M. Ramé
Medical IT - Griensu S.A.
Av. Colón 636 - Piso 8 Of. A
X5000EPT -- Córdoba
Tel.: +54(351)4246924 +54(351)4247788 +54(351)4247979 int. 19
Cel.: +54 9 (011) 40871877
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Why don't you just make use of tablespaces and partition the child tablespaces
so that the newer parttion is on the SSD and the older one is on SATA?
You will need a trigger and tg function to handle inserts
eg: {note: below is psuedo code}
child {master} (SSD) NO ROWS33
child1 (SSD) CONSTRAINT timestamp > {specified time}
tg_insert_child1 on insert execute tgf_split_data
child2 (SATA) CONSTRAINT timestamp <= {specified time}
tg_insert_child2 on insert execute tgf_split_data
tgf_split_data()
if timestamp > {specified time}
insert into child1
else
insert into child2
endif
--
Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
oops! Better example
eg: {note: below is psuedo code}
child {master} (SSD) NO ROWS33
tg_insert_child before insert execute tgf_split_data
child1 (SSD) CONSTRAINT timestamp > {specified time}
child2 (SATA) CONSTRAINT timestamp <= {specified time}
tgf_split_data()
if timestamp > {specified time}
insert into child1
else
insert into child2
endif
eg: {note: below is psuedo code}
child {master} (SSD) NO ROWS33
tg_insert_child before insert execute tgf_split_data
child1 (SSD) CONSTRAINT timestamp > {specified time}
child2 (SATA) CONSTRAINT timestamp <= {specified time}
tgf_split_data()
if timestamp > {specified time}
insert into child1
else
insert into child2
endif
On Sat, Mar 12, 2016 at 9:19 PM, Melvin Davidson <melvin6925@gmail.com> wrote:
Why don't you just make use of tablespaces and partition the child tablespacesOn Sat, Mar 12, 2016 at 8:33 PM, Alvaro Aguayo Garcia-Rada <aaguayo@opensysperu.com> wrote:Hi. I think pgpool-II can do that job for you. It's a middleware, so you can use it without even changing your app code(but your postgres configuration). It suppoerts many clustering functions, including replication, failover, and a lot more; it also supports partitioning. so that may be suitable for you. Check the tutorial, it even has some examples: http://www.pgpool.net/docs/latest/tutorial-en.html
Alvaro Aguayo
Jefe de Operaciones
Open Comb Systems E.I.R.L.
Oficina: (+51-1) 3377813 | RPM: #034252 / (+51) 995540103 | RPC: (+51) 954183248
Website: www.ocs.pe
----- Original Message -----
From: "Leonardo M. Ramé" <l.rame@griensu.com>
To: "PostgreSql-general" <pgsql-general@postgresql.org>
Sent: Saturday, 12 March, 2016 8:25:01 PM
Subject: [GENERAL] Distributed Table Partitioning
I have this problem: a Master table containing records with a timestamp
column registering creation date-time, and one Detail table containing
info related to the Master table.
As time went by, those tables grew enormously, and I can't afford
expanding my SSD VPS. So I'm thinking about storing only NEW data into
it, and move OLD data to a cheaper SATA VPS.
The goal is using the SSD server as "main", and the other (or others?)
as "child", so queries still go to the main server, it somehow detects
which records must be fetched from it and what from the child servers,
then return the "composed" dataset to the caller.
I think this is called Distributed Horizontal Table Partitioning.
Is there a way to do this without changing my application code?.
Regards,
--
Leonardo M. Ramé
Medical IT - Griensu S.A.
Av. Colón 636 - Piso 8 Of. A
X5000EPT -- Córdoba
Tel.: +54(351)4246924 +54(351)4247788 +54(351)4247979 int. 19
Cel.: +54 9 (011) 40871877
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
so that the newer parttion is on the SSD and the older one is on SATA?
You will need a trigger and tg function to handle inserts
eg: {note: below is psuedo code}
child {master} (SSD) NO ROWS33
child1 (SSD) CONSTRAINT timestamp > {specified time}
tg_insert_child1 on insert execute tgf_split_data
child2 (SATA) CONSTRAINT timestamp <= {specified time}
tg_insert_child2 on insert execute tgf_split_data
tgf_split_data()
if timestamp > {specified time}
insert into child1
else
insert into child2
endif
--Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
--
Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
On 2016-03-12 21:19:11 -0500, Melvin Davidson wrote: > ----- Original Message ----- > From: "Leonardo M. Ramé" <l.rame@griensu.com> > To: "PostgreSql-general" <pgsql-general@postgresql.org> > Sent: Saturday, 12 March, 2016 8:25:01 PM > Subject: [GENERAL] Distributed Table Partitioning > > I have this problem: a Master table containing records with a timestamp > column registering creation date-time, and one Detail table containing > info related to the Master table. > > As time went by, those tables grew enormously, and I can't afford > expanding my SSD VPS. So I'm thinking about storing only NEW data into > it, and move OLD data to a cheaper SATA VPS. [...] > Why don't you just make use of tablespaces and partition the child tablespaces > so that the newer parttion is on the SSD and the older one is on SATA? Since he mentioned virtual private servers (VPS) the reason might be that his hoster offers VPS with SSDs (of various sizes) and VPS with rotating hard disks (of various sizes), but not VPS with both. So he can't rent a VPS with a relatively small SSD and a larger hard disk. That might be a reason to look for an alternate hoster, but if he's otherwise happy, switching to an unknown provider might be considered too large a risk. hp -- _ | Peter J. Holzer | I want to forget all about both belts and |_|_) | | suspenders; instead, I want to buy pants | | | hjp@hjp.at | that actually fit. __/ | http://www.hjp.at/ | -- http://noncombatant.org/
Attachment
El 13/03/16 a las 10:04, Peter J. Holzer escribió: > On 2016-03-12 21:19:11 -0500, Melvin Davidson wrote: >> ----- Original Message ----- >> From: "Leonardo M. Ramé" <l.rame@griensu.com> >> To: "PostgreSql-general" <pgsql-general@postgresql.org> >> Sent: Saturday, 12 March, 2016 8:25:01 PM >> Subject: [GENERAL] Distributed Table Partitioning >> >> I have this problem: a Master table containing records with a timestamp >> column registering creation date-time, and one Detail table containing >> info related to the Master table. >> >> As time went by, those tables grew enormously, and I can't afford >> expanding my SSD VPS. So I'm thinking about storing only NEW data into >> it, and move OLD data to a cheaper SATA VPS. > [...] >> Why don't you just make use of tablespaces and partition the child tablespaces >> so that the newer parttion is on the SSD and the older one is on SATA? > > Since he mentioned virtual private servers (VPS) the reason might be > that his hoster offers VPS with SSDs (of various sizes) and VPS with > rotating hard disks (of various sizes), but not VPS with both. So he > can't rent a VPS with a relatively small SSD and a larger hard disk. > > That might be a reason to look for an alternate hoster, but if he's > otherwise happy, switching to an unknown provider might be considered > too large a risk. > > hp > Yes, Peter is right, I must store one table in one VPS and the other in a 2nd VPS. I'm thinking of partitioning the table, on local and the remote using a Foreign Data Wrapper, what do you think about this approach?.
I guess that would work if you used a view to combine both tables, but you would still need a BEFORE trigger to make sure DML goes to the appropriate table.
Another solution might be to use dblink if you need to access the SATA table on the 2nd VPS,On Sun, Mar 13, 2016 at 11:37 AM, Leonardo M. Ramé <l.rame@griensu.com> wrote:
El 13/03/16 a las 10:04, Peter J. Holzer escribió:On 2016-03-12 21:19:11 -0500, Melvin Davidson wrote:----- Original Message -----[...]
From: "Leonardo M. Ramé" <l.rame@griensu.com>
To: "PostgreSql-general" <pgsql-general@postgresql.org>
Sent: Saturday, 12 March, 2016 8:25:01 PM
Subject: [GENERAL] Distributed Table Partitioning
I have this problem: a Master table containing records with a timestamp
column registering creation date-time, and one Detail table containing
info related to the Master table.
As time went by, those tables grew enormously, and I can't afford
expanding my SSD VPS. So I'm thinking about storing only NEW data into
it, and move OLD data to a cheaper SATA VPS.Why don't you just make use of tablespaces and partition the child tablespaces
so that the newer parttion is on the SSD and the older one is on SATA?
Since he mentioned virtual private servers (VPS) the reason might be
that his hoster offers VPS with SSDs (of various sizes) and VPS with
rotating hard disks (of various sizes), but not VPS with both. So he
can't rent a VPS with a relatively small SSD and a larger hard disk.
That might be a reason to look for an alternate hoster, but if he's
otherwise happy, switching to an unknown provider might be considered
too large a risk.
hp
Yes, Peter is right, I must store one table in one VPS and the other in a 2nd VPS.
I'm thinking of partitioning the table, on local and the remote using a Foreign Data Wrapper, what do you think about this approach?.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.