Thread: Ways to deal with large amount of columns;

Ways to deal with large amount of columns;

From
"a"
Date:
Hi all:

I need to make a table contains projected monthly cashflow for multiple agents (10,000 around).

Therefore, the column number would be 1000+. 

I would need to perform simple aggregate function such as count, sum or average on each cashflow projected.

So if there is anyway of doing this? Will there be anything like define a macro in C that I can manipulate multiple columns by simple word that representing them.

Thanks so much!

Shore

Re: Ways to deal with large amount of columns;

From
"David G. Johnston"
Date:
On Thursday, August 30, 2018, a <372660931@qq.com> wrote:
Hi all:

I need to make a table contains projected monthly cashflow for multiple agents (10,000 around).

Therefore, the column number would be 1000+. 

I would need to perform simple aggregate function such as count, sum or average on each cashflow projected.

So if there is anyway of doing this? Will there be anything like define a macro in C that I can manipulate multiple columns by simple word that representing them.

Better to design a data model that doesn't have so many columns. Otherwise generating dynamic SQL via the for,at function and loops and such is your best bet.  Can be down in pl/pgsql or in your preferred programming language.  Psql variables can maybe be useful too.

David J.
 

Re: Ways to deal with large amount of columns;

From
Andreas Kretschmer
Date:

Am 30.08.2018 um 11:13 schrieb a:
> Therefore, the column number would be 1000+.

just as a additional note: there is a limit, a table can contains not 
more than 250-100 columns, dependsing on column types.
https://wiki.postgresql.org/wiki/FAQ

Regards, Andreas

-- 
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com



Re: Ways to deal with large amount of columns;

From
Robert Zenz
Date:
As David said, you'd be better off having a table that looks like this (in terms
of columns):

 * MONTH
 * AGENT
 * CASHFLOW

So your query to get the sum of a single agent would be looking like:

    select
        sum(CHASFLOW)
    where
        AGENT = 'Agent'
        and MONTH between values;

It might be a little more work to create a GUI for that (or map this model to
the existing GUI) but it is much simpler to maintain and work with.


On 30.08.2018 11:13, a wrote:
> Hi all:
> 
> 
> I need to make a table contains projected monthly cashflow for multiple agents (10,000 around).
> 
> 
> Therefore, the column number would be 1000+. 
> 
> 
> I would need to perform simple aggregate function such as count, sum or average on each cashflow projected.
> 
> 
> So if there is anyway of doing this? Will there be anything like define a macro in C that I can manipulate multiple
columnsby simple word that representing them.
 
> 
> 
> Thanks so much!
> 
> 
> Shore
>

Re: Ways to deal with large amount of columns;

From
Andreas Kretschmer
Date:

Am 30.08.2018 um 15:15 schrieb Robert Zenz:
> As David said, you'd be better off having a table that looks like this (in terms
> of columns):
>
>   * MONTH
>   * AGENT
>   * CASHFLOW
>
> So your query to get the sum of a single agent would be looking like:
>
>      select
>          sum(CHASFLOW)
>      where
>          AGENT = 'Agent'
>          and MONTH between values;
>
> It might be a little more work to create a GUI for that (or map this model to
> the existing GUI) but it is much simpler to maintain and work with.

agree. it is also possible to use partitioning.

Regards, Andreas

-- 
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com



Re: Ways to deal with large amount of columns;

From
Tim Cross
Date:
a <372660931@qq.com> writes:

> Hi all:
>
>
> I need to make a table contains projected monthly cashflow for multiple agents (10,000 around).
>
>
> Therefore, the column number would be 1000+. 
>
>

Not sure your data model is correct. Typically, with something like
this, increasing the number of agents would result in tables with more
rows rather than more columns. Tables with large numbers of columns is
often a 'code smell' and indicates the underlying data model needs to be
reviewed. Designs which result in new columns being required because you
are adding new data sources is almost certainly an indication of the
need to review the data model.

Postgres (and many other databases) have lots of functionality to help
deal with tables that have large numbers of rows, but working with
tables that have large numbers of columns has less functionality and
options.

While it is very tempting to create a table and then start coding, you
will almost always get a much better result and simpler code if you
spend some initial time to really analyse your domain, understand the
data elements and how they relate to each other, map them out into a
data model and then start development. Have a look at
https://en.wikipedia.org/wiki/Database_normalization for some background
on the normal forms and why they are useful.

HTH

Tim



-- 
Tim Cross


Re: Ways to deal with large amount of columns;

From
"a"
Date:
Hi thanks for your mail.

So my data is arranged as this due to the nature of business. I need the projected cash flow in each agent to calculate such as present value or other related issue to report to regulator. Furthermore, some basic functions such as sum, average and other aggregate functions are needed. I also considered to store them as array, but it would not be easy to write SQL for summing the select rows.

My need to discuss further on designing of the model.

Thank you again.

Shore


------------------ Original message ------------------
From: "Tim Cross";
Sendtime: Friday, Aug 31, 2018 6:24 AM
To: "a"<372660931@qq.com>;
Cc: "pgsql-general";
Subject: Re: Ways to deal with large amount of columns;


a <372660931@qq.com> writes:

> Hi all:
>
>
> I need to make a table contains projected monthly cashflow for multiple agents (10,000 around).
>
>
> Therefore, the column number would be 1000+.
>
>

Not sure your data model is correct. Typically, with something like
this, increasing the number of agents would result in tables with more
rows rather than more columns. Tables with large numbers of columns is
often a 'code smell' and indicates the underlying data model needs to be
reviewed. Designs which result in new columns being required because you
are adding new data sources is almost certainly an indication of the
need to review the data model.

Postgres (and many other databases) have lots of functionality to help
deal with tables that have large numbers of rows, but working with
tables that have large numbers of columns has less functionality and
options.

While it is very tempting to create a table and then start coding, you
will almost always get a much better result and simpler code if you
spend some initial time to really analyse your domain, understand the
data elements and how they relate to each other, map them out into a
data model and then start development. Have a look at
https://en.wikipedia.org/wiki/Database_normalization for some background
on the normal forms and why they are useful.

HTH

Tim



--
Tim Cross

Re: Ways to deal with large amount of columns;

From
"a"
Date:
Thank you very much. Creating a function seems to be a good idea :)


------------------ Original message ------------------
From: "David G. Johnston";
Sendtime: Thursday, Aug 30, 2018 8:31 PM
To: "a"<372660931@qq.com>;
Cc: "pgsql-general";
Subject: Re: Ways to deal with large amount of columns;

On Thursday, August 30, 2018, a <372660931@qq.com> wrote:
Hi all:

I need to make a table contains projected monthly cashflow for multiple agents (10,000 around).

Therefore, the column number would be 1000+. 

I would need to perform simple aggregate function such as count, sum or average on each cashflow projected.

So if there is anyway of doing this? Will there be anything like define a macro in C that I can manipulate multiple columns by simple word that representing them.

Better to design a data model that doesn't have so many columns. Otherwise generating dynamic SQL via the for,at function and loops and such is your best bet.  Can be down in pl/pgsql or in your preferred programming language.  Psql variables can maybe be useful too.

David J.
 

Re: Ways to deal with large amount of columns;

From
Tim Cross
Date:


On Fri, 31 Aug 2018 at 10:47, a <372660931@qq.com> wrote:
Thank you very much. Creating a function seems to be a good idea :)


------------------ Original message ------------------
From: "David G. Johnston";
Sendtime: Thursday, Aug 30, 2018 8:31 PM
To: "a"<372660931@qq.com>;
Cc: "pgsql-general";
Subject: Re: Ways to deal with large amount of columns;

On Thursday, August 30, 2018, a <372660931@qq.com> wrote:
Hi all:

I need to make a table contains projected monthly cashflow for multiple agents (10,000 around).

Therefore, the column number would be 1000+. 

I would need to perform simple aggregate function such as count, sum or average on each cashflow projected.

So if there is anyway of doing this? Will there be anything like define a macro in C that I can manipulate multiple columns by simple word that representing them.

Better to design a data model that doesn't have so many columns. Otherwise generating dynamic SQL via the for,at function and loops and such is your best bet.  Can be down in pl/pgsql or in your preferred programming language.  Psql variables can maybe be useful too.

David J.
 

Perhaps post your proposed table design/definition. There is nothing in what you have described so far which would indicate a necessity to have more columns as you increase the number of agents.  It would be normal to have something like

| agent_id | year | cash_on_hand | bank | creditors | debtors | ....

and queries like

select sum(cash_on_hand)
from table
where agent_id = 'agent1'
and yesr = 2018;

to get the sum of cash on hand for agent1 in 2018.

instead of something like

| agent1_cash2018 | agent2_cash2017 | ....

which will not work well.

Tim


--
regards,

Tim

--
Tim Cross

Re: Ways to deal with large amount of columns;

From
Ben Madin
Date:
Hi - this is a spreadsheet model, not a database model, and could be modelled with three columns. 

The aggregate functions are an analytic issue, not a data issue.

cheers

Ben


On 30 August 2018 at 17:13, a <372660931@qq.com> wrote:
Hi all:

I need to make a table contains projected monthly cashflow for multiple agents (10,000 around).

Therefore, the column number would be 1000+. 

I would need to perform simple aggregate function such as count, sum or average on each cashflow projected.

So if there is anyway of doing this? Will there be anything like define a macro in C that I can manipulate multiple columns by simple word that representing them.

Thanks so much!

Shore



--

Ausvet Logo

Dr Ben Madin
 
BVMS MVPHMgmt PhD MANZCVS GAICD
Managing Director
Skype: benmadin
Address: 
5 Shuffrey Street
Fremantle, WA 6160
Australia

Re: Ways to deal with large amount of columns;

From
Brent Wood
Date:
You need to normalise your data model... the basic dataset you describe will probably fit in one table with three columns, but that might be simplistic...

All your aggregate queries are pretty basic SQL statements, which could be represented as views or made simpler to invoke by turning them into functions.


So, one table describing agents, with columns for agent, date and projected cashflow will store all your data...
(index agent & date for performance)

eg: for a summary for all agents for the last 12 months:

select agent, sum(cashflow)
from table
where date >= now() - interval '1 year'
group by agent
order by agent;

or a cashflow aggregate summary for the latest month

select sum(cashflow), avg(cashflow), min(cashflow), max(cashflow) 
from table
where date = (select max(date) from table);

or get the agent with highest projected cashflow for the latest month

select agent
from table
where date = (select max(date) from table)
  and cashflow=(select max(cashflow) from table
      where date = (select max(date) from table));



From: a <372660931@qq.com>
To: pgsql-general <pgsql-general@postgresql.org>
Sent: Thursday, August 30, 2018 9:14 PM
Subject: Ways to deal with large amount of columns;

Hi all:

I need to make a table contains projected monthly cashflow for multiple agents (10,000 around).

Therefore, the column number would be 1000+. 

I would need to perform simple aggregate function such as count, sum or average on each cashflow projected.

So if there is anyway of doing this? Will there be anything like define a macro in C that I can manipulate multiple columns by simple word that representing them.

Thanks so much!

Shore