Thread: Patching for increasing the number of columns

Patching for increasing the number of columns

From
Mayeul Kauffmann
Date:
Hello,

I am trying to patch the server source to increase the number of columns 
above 1600. I'm not planning to commit this but as suggested elsewhere 
[1], someone might suggest a configure option based on this.
I came up with a patch which seems to work (see below), but 3 of the 136 
tests fail.

I understand some will question db design, but, as written elsewhere, 
"What would be most helpful though is if the answer to this question 
stop being an attack on the business requirement analysis, database 
design skills, and/or sanity of the requester" [1]

I based my attempts on these discussions:
http://www.postgresql.org/message-id/200512221633.jBMGXWM13248@candle.pha.pa.us
http://www.postgresql.org/message-id/8914.1289620175@sss.pgh.pa.us

http://dba.stackexchange.com/questions/40137/in-postgresql-is-it-possible-to-change-the-maximum-number-of-columns-a-table-ca

I built this on Ubuntu 14.04, 64-bit. Bash session follows:

======================================================
sudo apt-get install flex
sudo apt-get install bison build-essential
sudo apt-get install libreadline6-dev
sudo apt-get install zlib1g-dev
sudo apt-get install libossp-uuid-dev

version=3_64  # change this if you want to build several versions of postgres in parallel
# see also "MODIFY THIS TOO" below

echo "current version is"  $version
mkdir -p ~/bin/postgresql_9.3.4
cd ~/bin/postgresql_9.3.4
wget ftp://ftp.postgresql.org/pub/source/v9.3.4/postgresql-9.3.4.tar.bz2
mkdir -p ~/bin/postgresql_9.3.4/patched_$version
tar -xvf postgresql-9.3.*.tar.bz2 -C ~/bin/postgresql_9.3.4/patched_$version
cd patched_$version/postgresql-9.3.*

# use kate (KDE) or your preferred text editor:
kate src/include/access/htup_details.h

# See: http://dba.stackexchange.com/questions/40137/in-postgresql-is-it-possible-to-change-the-maximum-number-of-columns-a-table-ca
# Replace this:
#define MaxTupleAttributeNumber 1664    /* 8 * 208 */
# by this: (the '#' sign before 'define' should be included)
#define MaxTupleAttributeNumber 6656    /* 32 * 208 */
# or this:
#define MaxTupleAttributeNumber 13312    /* 64 * 208 */
# or this:
#define MaxTupleAttributeNumber 13312    /* 64 * 208 */

# Replace this:
#define MaxHeapAttributeNumber    1600    /* 8 * 200 */
# by this: (the '#' sign before 'define' should be included)
#define MaxHeapAttributeNumber    6400    /* 32 * 200 */
# or this:
#define MaxHeapAttributeNumber    12800    /* 64 * 200 */


# See: http://www.postgresql.org/message-id/8914.1289620175@sss.pgh.pa.us
# which suggests this:  uint16    t_hoff;
# Replace this (in TWO PLACES, near lines 148 and 523; if you miss one, postgresql segfaults):
uint8       t_hoff;     /* sizeof header incl. bitmap, padding */
# by this (in TWO PLACES):
uint32      t_hoff;     /* sizeof header incl. bitmap, padding */
# or by this (in TWO PLACES):
uint64      t_hoff;     /* sizeof header incl. bitmap, padding */

# Save and close htup_details.h
# (TODO: write the above as a command-line patch)
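# A first stab at that TODO: the three edits expressed as sed one-liners
# (values for the "32 *" variant). This is a sketch: the patterns are assumptions
# about the exact 9.3.4 source text, so it is demonstrated on an inline sample;
# in a real tree, grep the patterns first, then point f at
# src/include/access/htup_details.h. The trailing /* 8 * ... */ comments are
# deliberately left untouched.

```shell
# Build a small sample standing in for the relevant htup_details.h lines.
f=$(mktemp)
printf '%s\n' \
  '#define MaxTupleAttributeNumber 1664    /* 8 * 208 */' \
  '#define MaxHeapAttributeNumber  1600    /* 8 * 200 */' \
  'uint8   t_hoff;   /* sizeof header incl. bitmap, padding */' > "$f"
# Apply the three replacements in place; the t_hoff pattern hits every occurrence,
# which is what we want (the struct appears in two places).
sed -i \
  -e 's/\(MaxTupleAttributeNumber[[:space:]]*\)1664/\16656/' \
  -e 's/\(MaxHeapAttributeNumber[[:space:]]*\)1600/\16400/' \
  -e 's/uint8\([[:space:]]*t_hoff\)/uint32\1/' \
  "$f"
cat "$f"
```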

./configure --with-blocksize=32 --prefix=/usr/local/pgsql_patched_$version

make
make check

# join                     ... FAILED
# select_views             ... FAILED
# without_oid              ... FAILED
# ========================
#  3 of 136 tests failed.     FIXME
# ========================
(not sure whether I can attach the log and diff of the test here).

I launched the server anyway and logged in with pgadmin3. I created a 
few tables with 2000 integer fields or so, and performed a few inserts, 
selects, updates and joins without any issue.
So at least a basic join works. And in pgadmin3, the "has OIDs" property 
of the tables I created is not checked.
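A scripted version of that smoke test might look like this (a sketch: wide_a, wide_b and the column names are made up for illustration, not the actual tables created in pgadmin3):

```shell
# Generate a smoke-test script: two 2000-integer-column tables, then
# insert/select/update/join them (hypothetical table and column names).
cols=$(seq -f 'f%g integer' -s ', ' 1 2000)
{
  echo "CREATE TABLE wide_a ($cols);"
  echo "CREATE TABLE wide_b ($cols);"
  echo "INSERT INTO wide_a (f1, f2000) VALUES (1, 42);"
  echo "INSERT INTO wide_b (f1, f2000) VALUES (1, 43);"
  echo "UPDATE wide_a SET f2000 = f2000 + 1;"
  echo "SELECT a.f2000, b.f2000 FROM wide_a a JOIN wide_b b ON a.f1 = b.f1;"
} > wide_smoke.sql
# then run it against the patched build, e.g.:
# /usr/local/pgsql_patched_$version/bin/psql -d postgres -f wide_smoke.sql
```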

Just to be sure, I ran all the tests again with 'make check', this time 
without the patch and without raising the blocksize (configure option), 
and all the tests passed (no failures).

Would anyone have some hint or advice?
Thank you!
Best regards,
Mayeul


[1] http://www.postgresql.org/message-id/8914.1289620175@sss.pgh.pa.us

PS: and since it's my first post here: thank you all so much for this 
wonderful DBMS :-)



Re: Patching for increasing the number of columns

From
Tom Lane
Date:
Mayeul Kauffmann <mayeul.kauffmann@free.fr> writes:
> I am trying to patch the server source to increase the number of columns 
> above 1600. I'm not planning to commit this but as suggested elsewhere 
> [1], someone might suggest a configure option based on this.
> I came up with a patch which seems to work (see below), but 3 of the 136 
> tests fail.

You would have to show us the actual failure diffs to get much useful
comment, but in general increasing the size of tuple headers could
easily lead to changes in plan choices, which would affect output
row ordering (not to mention EXPLAIN results).  This is particularly
the case if you're testing on a 64-bit machine, since the maxalign'd
size of the header would go from 24 to 32 bytes ...
        regards, tom lane



Re: Patching for increasing the number of columns

From
Mayeul Kauffmann
Date:
Tom wrote:
> You would have to show us the actual failure diffs to get much useful
> comment, but in general increasing the size of tuple headers could
> easily lead to changes in plan choices

Thank you Tom. So there is some hope! Indeed, the query plan is 
different for the join and the view tests; the result set is different 
only for the 'without_oid' test.
A side question: Are these tests comprehensive, or should I run other 
tests just to be sure? Hints on where to find those tests are welcome.
Thanks!
(diff below)
Mayeul
*** ~/bin/postgresql_9.3.4/patched_3_64/postgresql-9.3.4/src/test/regress/expected/join.out	2014-03-17 19:35:47.000000000 +0000
--- ~/bin/postgresql_9.3.4/patched_3_64/postgresql-9.3.4/src/test/regress/results/join.out	2014-08-20 15:40:56.248603754 +0100
***************
*** 2791,2814 ****
    join int4_tbl i1 on b.thousand = f1
    right join int4_tbl i2 on i2.f1 = b.tenthous
    order by 1;
!                                        QUERY PLAN
! -----------------------------------------------------------------------------------------
   Sort
     Sort Key: b.unique1
!    ->  Nested Loop Left Join
!          ->  Seq Scan on int4_tbl i2
           ->  Nested Loop Left Join
                 Join Filter: (b.unique1 = 42)
                 ->  Nested Loop
                       ->  Nested Loop
                             ->  Seq Scan on int4_tbl i1
!                            ->  Index Scan using tenk1_thous_tenthous on tenk1 b
!                                  Index Cond: ((thousand = i1.f1) AND (i2.f1 = tenthous))
                       ->  Index Scan using tenk1_unique1 on tenk1 a
                             Index Cond: (unique1 = b.unique2)
                 ->  Index Only Scan using tenk1_thous_tenthous on tenk1 c
                       Index Cond: (thousand = a.thousand)
! (15 rows)
   select b.unique1 from
     tenk1 a join tenk1 b on a.unique1 = b.unique2
--- 2791,2818 ----
    join int4_tbl i1 on b.thousand = f1
    right join int4_tbl i2 on i2.f1 = b.tenthous
    order by 1;
!                                   QUERY PLAN
! -------------------------------------------------------------------------------
   Sort
     Sort Key: b.unique1
!    ->  Hash Right Join
!          Hash Cond: (b.tenthous = i2.f1)
           ->  Nested Loop Left Join
                 Join Filter: (b.unique1 = 42)
                 ->  Nested Loop
                       ->  Nested Loop
                             ->  Seq Scan on int4_tbl i1
!                            ->  Bitmap Heap Scan on tenk1 b
!                                  Recheck Cond: (thousand = i1.f1)
!                                  ->  Bitmap Index Scan on tenk1_thous_tenthous
!                                        Index Cond: (thousand = i1.f1)
                       ->  Index Scan using tenk1_unique1 on tenk1 a
                             Index Cond: (unique1 = b.unique2)
                 ->  Index Only Scan using tenk1_thous_tenthous on tenk1 c
                       Index Cond: (thousand = a.thousand)
!          ->  Hash
!                ->  Seq Scan on int4_tbl i2
! (19 rows)
   select b.unique1 from
     tenk1 a join tenk1 b on a.unique1 = b.unique2

======================================================================

*** ~/bin/postgresql_9.3.4/patched_3_64/postgresql-9.3.4/src/test/regress/expected/select_views_1.out	2014-03-17 19:35:47.000000000 +0000
--- ~/bin/postgresql_9.3.4/patched_3_64/postgresql-9.3.4/src/test/regress/results/select_views.out	2014-08-20 15:41:01.212603532 +0100
***************
*** 1413,1423 ****
         WHERE f_leak(cnum) AND ymd >= '2011-10-01' AND ymd < '2011-11-01';
        QUERY PLAN
  ------------------------------------------------------------------------------
!  Nested Loop
!    Join Filter: (l.cid = r.cid)
     ->  Seq Scan on credit_usage r
           Filter: ((ymd >= '10-01-2011'::date) AND (ymd < '11-01-2011'::date))
!    ->  Materialize
           ->  Subquery Scan on l
                 Filter: f_leak(l.cnum)
                 ->  Hash Join
--- 1413,1423 ----
         WHERE f_leak(cnum) AND ymd >= '2011-10-01' AND ymd < '2011-11-01';
        QUERY PLAN
  ------------------------------------------------------------------------------
!  Hash Join
!    Hash Cond: (r.cid = l.cid)
     ->  Seq Scan on credit_usage r
           Filter: ((ymd >= '10-01-2011'::date) AND (ymd < '11-01-2011'::date))
!    ->  Hash
           ->  Subquery Scan on l
                 Filter: f_leak(l.cnum)
                 ->  Hash Join
***************
*** 1446,1456 ****
  ------------------------------------------------------------------------------------
   Subquery Scan on my_credit_card_usage_secure
     Filter: f_leak(my_credit_card_usage_secure.cnum)
!    ->  Nested Loop
!          Join Filter: (l.cid = r.cid)
           ->  Seq Scan on credit_usage r
                 Filter: ((ymd >= '10-01-2011'::date) AND (ymd < '11-01-2011'::date))
!          ->  Materialize
                 ->  Hash Join
                       Hash Cond: (r_1.cid = l.cid)
                       ->  Seq Scan on credit_card r_1
--- 1446,1456 ----
  ------------------------------------------------------------------------------------
   Subquery Scan on my_credit_card_usage_secure
     Filter: f_leak(my_credit_card_usage_secure.cnum)
!    ->  Hash Join
!          Hash Cond: (r.cid = l.cid)
           ->  Seq Scan on credit_usage r
                 Filter: ((ymd >= '10-01-2011'::date) AND (ymd < '11-01-2011'::date))
!          ->  Hash
                 ->  Hash Join
                       Hash Cond: (r_1.cid = l.cid)
                       ->  Seq Scan on credit_card r_1

======================================================================

*** ~/bin/postgresql_9.3.4/patched_3_64/postgresql-9.3.4/src/test/regress/expected/without_oid.out	2014-03-17 19:35:47.000000000 +0000
--- ~/bin/postgresql_9.3.4/patched_3_64/postgresql-9.3.4/src/test/regress/results/without_oid.out	2014-08-20 15:41:02.068603494 +0100
***************
*** 53,59 ****
   WHERE relname IN ('wi', 'wo');
   ?column? | ?column?
  ----------+----------
!  t        |        0
  (1 row)
  DROP TABLE wi;
--- 53,59 ----
   WHERE relname IN ('wi', 'wo');
   ?column? | ?column?
  ----------+----------
!  f        |        0
  (1 row)
  DROP TABLE wi;

======================================================================






Re: Patching for increasing the number of columns

From
Tom Lane
Date:
Mayeul Kauffmann <mayeul.kauffmann@free.fr> writes:
> Tom wrote:
>> You would have to show us the actual failure diffs to get much useful 
>> comment, but in general increasing the size of tuple headers could 
>> easily lead to changes in plan choices

> Thank you Tom. So there is some hope! In effect the query plan is 
> different for the join and the view tests. The result set is different 
> only for the 'without_oid' test.

Hm.  I think the without_oid test is not showing that anything is broken;
what it's testing is whether a table with oids is physically bigger (more
pages) than one without oids but the same data.  It's not implausible
that your change results in the same number of tuples fitting onto a page
in both cases.  It'd be worth doing the math to make sure that makes
sense.  Not sure if there's an easy way to change the table schema so that
you get different physical sizes in both cases.
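The "do the math" suggestion can be roughed out with MAXALIGN arithmetic. This is only a sketch: it ignores the null bitmap, and the assumption that the uint32 t_hoff patch grows the 23-byte fixed header by exactly 3 bytes is mine, not confirmed by the struct layout.

```shell
# Back-of-envelope MAXALIGN check on a 64-bit build (MAXIMUM_ALIGNOF = 8).
maxalign() { echo $(( ( ($1 + 7) / 8 ) * 8 )); }
# Stock 9.3: 23-byte fixed header; WITH OIDS appends a 4-byte oid to the header.
maxalign 23    # without oids
maxalign 27    # with oids: larger aligned header, so 'wi' needs more pages than 'wo'
# Patched (assumption: uint8 -> uint32 t_hoff grows the fixed header by 3 bytes):
maxalign 26    # without oids
maxalign 30    # with oids: same aligned size, so the page counts can come out equal
```

Under those assumptions the stock header goes from 24 to 32 aligned bytes when oids are added, while the patched header is 32 aligned bytes either way, which would explain the 't' flipping to 'f' in the without_oid test.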

The other tests aren't showing any functional issue either AFAICS.
The change away from a nestloop plan in join.out is a bit annoying,
because IIRC that test is specifically intended to check nestloop
parameter management; but that just means the test is brittle.

> A side question: Are these tests comprehensive, or should I run other 
> tests just to be sure? Hints on where to find those tests are welcome.

No, they're not comprehensive, and no, we don't have more :-(
        regards, tom lane



Re: Patching for increasing the number of columns

From
Mayeul Kauffmann
Date:
On 20/08/14 18:17, Tom Lane wrote:
> Hm. I think the without_oid test is not showing that anything is broken; 

> The other tests aren't showing any functional issue either AFAICS. 
Thanks a lot Tom! That's very helpful.
I have written more details and some basic SQL tests in the wiki of the 
application (LimeSurvey) which requires this:

http://manual.limesurvey.org/Instructions_for_increasing_the_maximum_number_of_columns_in_PostgreSQL_on_Linux

I will post updates here or on that wiki (wherever most relevant) should I 
find issues while testing.

Cheers,

mayeulk




Re: Patching for increasing the number of columns

From
Steven Niu
Date:


Mayeul Kauffmann <mayeul.kauffmann@free.fr> wrote on Friday, 13 December 2024 at 15:11:

> On 20/08/14 18:17, Tom Lane wrote:
> > Hm. I think the without_oid test is not showing that anything is broken;
>
> > The other tests aren't showing any functional issue either AFAICS.
> Thanks a lot Tom! That's very helpful.
> I have written more details and some basic SQL tests in the wiki of the
> application (LimeSurvey) which requires this:
>
> http://manual.limesurvey.org/Instructions_for_increasing_the_maximum_number_of_columns_in_PostgreSQL_on_Linux
>
> I will give update here or on that wiki (where most relevant) should I
> find issues while testing.
>
> Cheers,
>
> mayeulk






Hi Mayeul,

I tried your solution on PG 17.2 and it does work when there are fewer than 2048 columns.
However, when there are 2048 columns in one table, you cannot read back the contents of the table.

postgres=# CREATE TABLE large_table (
    col1 char,
    col2 char,
    col3 char,
    col4 char,
    col5 char,
    ...... 
    col2045 char,
    col2046 char,
    col2047 char,
    col2048 char
);
CREATE TABLE

postgres=# insert into large_table values(3);
INSERT 0 1

postgres=# select col1 from large_table;
col1
------

(1 row)

Yes, there really is a tuple in this table, and the result claims one row was retrieved, but you get nothing back. Definitely some limitation still exists somewhere.
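For anyone wanting to reproduce this, the elided DDL above can be regenerated with a short script (a sketch; the table and column names follow the report):

```shell
# Regenerate the 2048-char-column table from the report and replay the session.
cols=$(seq -f 'col%g char' -s ', ' 1 2048)
{
  printf 'CREATE TABLE large_table (%s);\n' "$cols"
  printf 'INSERT INTO large_table VALUES (3);\n'
  printf 'SELECT col1 FROM large_table;\n'
} > large_table.sql
# then, against the patched server:
# psql -d postgres -f large_table.sql
```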

Steven