Thread: Ways to deal with large amount of columns;
Hi all:
I need to make a table contains projected monthly cashflow for multiple agents (10,000 around).
Therefore, the column number would be 1000+.
I would need to perform simple aggregate function such as count, sum or average on each cashflow projected.
So if there is anyway of doing this? Will there be anything like define a macro in C that I can manipulate multiple columns by simple word that representing them.
Thanks so much!
Shore
On Thursday, August 30, 2018, a <372660931@qq.com> wrote:
Hi all:I need to make a table contains projected monthly cashflow for multiple agents (10,000 around).Therefore, the column number would be 1000+.I would need to perform simple aggregate function such as count, sum or average on each cashflow projected.So if there is anyway of doing this? Will there be anything like define a macro in C that I can manipulate multiple columns by simple word that representing them.
Better to design a data model that doesn't have so many columns. Otherwise generating dynamic SQL via the for,at function and loops and such is your best bet. Can be down in pl/pgsql or in your preferred programming language. Psql variables can maybe be useful too.
David J.
Am 30.08.2018 um 11:13 schrieb a: > Therefore, the column number would be 1000+. just as a additional note: there is a limit, a table can contains not more than 250-100 columns, dependsing on column types. https://wiki.postgresql.org/wiki/FAQ Regards, Andreas -- 2ndQuadrant - The PostgreSQL Support Company. www.2ndQuadrant.com
As David said, you'd be better off having a table that looks like this (in terms of columns): * MONTH * AGENT * CASHFLOW So your query to get the sum of a single agent would be looking like: select sum(CHASFLOW) where AGENT = 'Agent' and MONTH between values; It might be a little more work to create a GUI for that (or map this model to the existing GUI) but it is much simpler to maintain and work with. On 30.08.2018 11:13, a wrote: > Hi all: > > > I need to make a table contains projected monthly cashflow for multiple agents (10,000 around). > > > Therefore, the column number would be 1000+. > > > I would need to perform simple aggregate function such as count, sum or average on each cashflow projected. > > > So if there is anyway of doing this? Will there be anything like define a macro in C that I can manipulate multiple columnsby simple word that representing them. > > > Thanks so much! > > > Shore >
Am 30.08.2018 um 15:15 schrieb Robert Zenz: > As David said, you'd be better off having a table that looks like this (in terms > of columns): > > * MONTH > * AGENT > * CASHFLOW > > So your query to get the sum of a single agent would be looking like: > > select > sum(CHASFLOW) > where > AGENT = 'Agent' > and MONTH between values; > > It might be a little more work to create a GUI for that (or map this model to > the existing GUI) but it is much simpler to maintain and work with. agree. it is also possible to use partitioning. Regards, Andreas -- 2ndQuadrant - The PostgreSQL Support Company. www.2ndQuadrant.com
a <372660931@qq.com> writes: > Hi all: > > > I need to make a table contains projected monthly cashflow for multiple agents (10,000 around). > > > Therefore, the column number would be 1000+. > > Not sure your data model is correct. Typically, with something like this, increasing the number of agents would result in tables with more rows rather than more columns. Tables with large numbers of columns is often a 'code smell' and indicates the underlying data model needs to be reviewed. Designs which result in new columns being required because you are adding new data sources is almost certainly an indication of the need to review the data model. Postgres (and many other databases) have lots of functionality to help deal with tables that have large numbers of rows, but working with tables that have large numbers of columns has less functionality and options. While it is very tempting to create a table and then start coding, you will almost always get a much better result and simpler code if you spend some initial time to really analyse your domain, understand the data elements and how they relate to each other, map them out into a data model and then start development. Have a look at https://en.wikipedia.org/wiki/Database_normalization for some background on the normal forms and why they are useful. HTH Tim -- Tim Cross
Hi thanks for your mail.
So my data is arranged as this due to the nature of business. I need the projected cash flow in each agent to calculate such as present value or other related issue to report to regulator. Furthermore, some basic functions such as sum, average and other aggregate functions are needed. I also considered to store them as array, but it would not be easy to write SQL for summing the select rows.
My need to discuss further on designing of the model.
Thank you again.
Shore
------------------ Original message ------------------
From: "Tim Cross";
Sendtime: Friday, Aug 31, 2018 6:24 AM
To: "a"<372660931@qq.com>;
Cc: "pgsql-general";
Subject: Re: Ways to deal with large amount of columns;
a <372660931@qq
> Hi all:
>
>
> I need to make a table contains projected monthly cashflow for multiple agents (10,000 around).
>
>
> Therefore, the column number would be 1000+.
>
>
Not sure your data model is correct. Typically, with something like
this, increasing the number of agents would result in tables with more
rows rather than more columns. Tables with large numbers of columns is
often a 'code smell' and indicates the underlying data model needs to be
reviewed. Designs which result in new columns being required because you
are adding new data sources is almost certainly an indication of the
need to review the data model.
Postgres (and many other databases) have lots of functionality to help
deal with tables that have large numbers of rows, but working with
tables that have large numbers of columns has less functionality and
options.
While it is very tempting to create a table and then start coding, you
will almost always get a much better result and simpler code if you
spend some initial time to really analyse your domain, understand the
data elements and how they relate to each other, map them out into a
data model and then start development. Have a look at
https://en.w
on the normal forms and why they are useful.
HTH
Tim
--
Tim Cross
Thank you very much. Creating a function seems to be a good idea :)
------------------ Original message ------------------
From: "David G. Johnston";
Sendtime: Thursday, Aug 30, 2018 8:31 PM
To: "a"<372660931@qq.com>;
Cc: "pgsql-general";
Subject: Re: Ways to deal with large amount of columns;
Hi all:I need to make a table contains projected monthly cashflow for multiple agents (10,000 around).Therefore, the column number would be 1000+.I would need to perform simple aggregate function such as count, sum or average on each cashflow projected.So if there is anyway of doing this? Will there be anything like define a macro in C that I can manipulate multiple columns by simple word that representing them.
Better to design a data model that doesn't have so many columns. Otherwise generating dynamic SQL via the for,at function and loops and such is your best bet. Can be down in pl/pgsql or in your preferred programming language. Psql variables can maybe be useful too.
David J.
On Fri, 31 Aug 2018 at 10:47, a <372660931@qq.com> wrote:
Thank you very much. Creating a function seems to be a good idea :)------------------ Original message ------------------From: "David G. Johnston";Sendtime: Thursday, Aug 30, 2018 8:31 PMTo: "a"<372660931@qq.com>;Cc: "pgsql-general";Subject: Re: Ways to deal with large amount of columns;On Thursday, August 30, 2018, a <372660931@qq.com> wrote:Hi all:I need to make a table contains projected monthly cashflow for multiple agents (10,000 around).Therefore, the column number would be 1000+.I would need to perform simple aggregate function such as count, sum or average on each cashflow projected.So if there is anyway of doing this? Will there be anything like define a macro in C that I can manipulate multiple columns by simple word that representing them.Better to design a data model that doesn't have so many columns. Otherwise generating dynamic SQL via the for,at function and loops and such is your best bet. Can be down in pl/pgsql or in your preferred programming language. Psql variables can maybe be useful too.David J.
| agent_id | year | cash_on_hand | bank | creditors | debtors | ....
and queries like
select sum(cash_on_hand)
from table
where agent_id = 'agent1'
and yesr = 2018;
to get the sum of cash on hand for agent1 in 2018.
instead of something like
| agent1_cash2018 | agent2_cash2017 | ....
which will not work well.
Tim
regards,
Tim
--
Tim CrossHi - this is a spreadsheet model, not a database model, and could be modelled with three columns.
The aggregate functions are an analytic issue, not a data issue.
cheers
Ben
On 30 August 2018 at 17:13, a <372660931@qq.com> wrote:
Hi all:I need to make a table contains projected monthly cashflow for multiple agents (10,000 around).Therefore, the column number would be 1000+.I would need to perform simple aggregate function such as count, sum or average on each cashflow projected.So if there is anyway of doing this? Will there be anything like define a macro in C that I can manipulate multiple columns by simple word that representing them.Thanks so much!Shore
Dr Ben Madin
BVMS MVPHMgmt PhD MANZCVS GAICD
Managing Director
You need to normalise your data model... the basic dataset you describe will probably fit in one table with three columns, but that might be simplistic...
All your aggregate queries are pretty basic SQL statements, which could be represented as views or made simpler to invoke by turning them into functions.
So, one table describing agents, with columns for agent, date and projected cashflow will store all your data...
(index agent & date for performance)
eg: for a summary for all agents for the last 12 months:
select agent, sum(cashflow)
from table
where date >= now() - interval '1 year'
group by agent
order by agent;
or a cashflow aggregate summary for the latest month
select sum(cashflow), avg(cashflow), min(cashflow), max(cashflow)
from table
where date = (select max(date) from table);
or get the agent with highest projected cashflow for the latest month
select agent
from table
where date = (select max(date) from table)
and cashflow=(select max(cashflow) from table
where date = (select max(date) from table));
From: a <372660931@qq.com>
To: pgsql-general <pgsql-general@postgresql.org>
Sent: Thursday, August 30, 2018 9:14 PM
Subject: Ways to deal with large amount of columns;
Hi all:
I need to make a table contains projected monthly cashflow for multiple agents (10,000 around).
Therefore, the column number would be 1000+.
I would need to perform simple aggregate function such as count, sum or average on each cashflow projected.
So if there is anyway of doing this? Will there be anything like define a macro in C that I can manipulate multiple columns by simple word that representing them.
Thanks so much!
Shore