Thread: pgsql: doc: Warn that ts_headline() output is not HTML-safe.
doc: Warn that ts_headline() output is not HTML-safe.

Add a documentation warning to ts_headline() pointing out that, when
working with untrusted input documents, the output is not guaranteed
to be safe for direct inclusion in web pages. This is because, while
it does remove some XML tags from the input, it doesn't remove all
HTML markup, and so the result may be unsafe (e.g., it might permit
XSS attacks).

To guard against that, all HTML markup should be removed from the
input, making it plain text, or the output should be passed through an
HTML sanitizer.

In addition, document precisely what the default text search parser
recognises as valid XML tags, since that's what determines which XML
tags ts_headline() will remove.

Reported-by: Richard Neill <richard.neill@telos.digital>
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Backpatch-through: 13

Branch
------
REL_14_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/1ba9ffa56eb510b8d9ae57431ad61a9e1a396674

Modified Files
--------------
doc/src/sgml/textsearch.sgml | 29 ++++++++++++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)
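For readers wondering what the guard described in the commit message might look like in application code, here is a minimal Python sketch. It is not taken from the commit, and it swaps in a simpler technique than a full HTML sanitizer: escape the whole headline, then restore only the highlight markers. It assumes the query called ts_headline() with the options 'StartSel=[[HL]], StopSel=[[/HL]]'; the marker strings and the function name render_headline are made up for illustration.

```python
import html

# Hypothetical placeholder markers, assumed to have been passed to
# ts_headline() as StartSel / StopSel instead of the default <b> and </b>.
START_MARK = "[[HL]]"
STOP_MARK = "[[/HL]]"


def render_headline(raw_headline: str) -> str:
    """Return a headline that is safer to embed in an HTML page.

    Everything is escaped first, so any markup that survived in the
    document text becomes inert. Only afterwards are the placeholder
    markers turned into real <b> tags.
    """
    escaped = html.escape(raw_headline)
    return escaped.replace(START_MARK, "<b>").replace(STOP_MARK, "</b>")


if __name__ == "__main__":
    print(render_headline("<script>alert(1)</script> [[HL]]cat[[/HL]]"))
```

One caveat with this approach: a document that already contains the literal marker text can still cause a <b> or </b> tag to be emitted, possibly unbalanced, so the markers should be strings that are unlikely to occur in real input.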
Hi,
On Thu, 1 May 2025 at 19:43, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> doc: Warn that ts_headline() output is not HTML-safe.
> Backpatch-through: 13
This commit looks harmless, but 2 separate machines are
failing on this commit (at the same point).
For now, it appears more like a compiler bug. I have requested
a gcc account (to file a bug) but I wouldn't be surprised if these
machines keep failing until that resolves (or until I fix a gcc
compile flag etc).
(I'd expect v13 to fail soon too)
Pasting the error here, in case someone can point me to something
I'm doing wrong, or else, I'll revert once I have an update from GCC.
postgres@dell:~/proj/postgres/src/backend/nodes$ gcc -v -save-temps -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wcast-function-type -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-deprecated-non-prototype -Wno-format-truncation -Wno-stringop-truncation -g -O2 -std=gnu17 -I../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2 -c -o nodeFuncs.o nodeFuncs.c
Using built-in specs.
COLLECT_GCC=gcc
Target: x86_64-pc-linux-gnu
Configured with: /opt/gcc/source/configure --prefix=/opt/gcc/target --disable-multilib : (reconfigured) /opt/gcc/source/configure --prefix=/opt/gcc/target --disable-multilib : (reconfigured) /opt/gcc/source/configure --prefix=/opt/gcc/target --disable-multilib --enable-languages=c,c++,fortran,lto,objc --no-create --no-recursion
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 16.0.0 20250501 (experimental) (GCC)
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-Wall' '-Wmissing-prototypes' '-Wpointer-arith' '-Wdeclaration-after-statement' '-Werror=vla' '-Wendif-labels' '-Wsuggest-attribute=format' '-Wimplicit-fallthrough=3' '-Wcast-function-type' '-Wformat-security' '-fno-strict-aliasing' '-fwrapv' '-fexcess-precision=standard' '-Wno-deprecated-non-prototype' '-Wformat-truncation=0' '-Wno-stringop-truncation' '-g' '-O2' '-std=gnu17' '-I' '../../../src/include' '-D' '_GNU_SOURCE' '-I' '/usr/include/libxml2' '-c' '-o' 'nodeFuncs.o' '-mtune=generic' '-march=x86-64'
/opt/gcc/prod/bin/../libexec/gcc/x86_64-pc-linux-gnu/16.0.0/cc1 -E -quiet -v -I ../../../src/include -I /usr/include/libxml2 -imultiarch x86_64-linux-gnu -iprefix /opt/gcc/prod/bin/../lib/gcc/x86_64-pc-linux-gnu/16.0.0/ -D _GNU_SOURCE nodeFuncs.c -mtune=generic -march=x86-64 -std=gnu17 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wsuggest-attribute=format -Wimplicit-fallthrough=3 -Wcast-function-type -Wformat-security -Wno-deprecated-non-prototype -Wformat-truncation=0 -Wno-stringop-truncation -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -fworking-directory -O2 -fpch-preprocess -o nodeFuncs.i
ignoring nonexistent directory "/opt/gcc/prod/bin/../lib/gcc/x86_64-pc-linux-gnu/16.0.0/include-fixed/x86_64-linux-gnu"
ignoring nonexistent directory "/opt/gcc/prod/bin/../lib/gcc/x86_64-pc-linux-gnu/16.0.0/../../../../x86_64-pc-linux-gnu/include"
ignoring duplicate directory "/opt/gcc/prod/bin/../lib/gcc/../../lib/gcc/x86_64-pc-linux-gnu/16.0.0/include"
ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu"
ignoring nonexistent directory "/opt/gcc/prod/bin/../lib/gcc/../../lib/gcc/x86_64-pc-linux-gnu/16.0.0/include-fixed/x86_64-linux-gnu"
ignoring duplicate directory "/opt/gcc/prod/bin/../lib/gcc/../../lib/gcc/x86_64-pc-linux-gnu/16.0.0/include-fixed"
ignoring nonexistent directory "/opt/gcc/prod/bin/../lib/gcc/../../lib/gcc/x86_64-pc-linux-gnu/16.0.0/../../../../x86_64-pc-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
../../../src/include
/usr/include/libxml2
/opt/gcc/prod/bin/../lib/gcc/x86_64-pc-linux-gnu/16.0.0/include
/opt/gcc/prod/bin/../lib/gcc/x86_64-pc-linux-gnu/16.0.0/include-fixed
/usr/local/include
/opt/gcc/prod/bin/../lib/gcc/../../include
/usr/include/x86_64-linux-gnu
/usr/include
End of search list.
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-Wall' '-Wmissing-prototypes' '-Wpointer-arith' '-Wdeclaration-after-statement' '-Werror=vla' '-Wendif-labels' '-Wsuggest-attribute=format' '-Wimplicit-fallthrough=3' '-Wcast-function-type' '-Wformat-security' '-fno-strict-aliasing' '-fwrapv' '-fexcess-precision=standard' '-Wno-deprecated-non-prototype' '-Wformat-truncation=0' '-Wno-stringop-truncation' '-g' '-O2' '-std=gnu17' '-I' '../../../src/include' '-D' '_GNU_SOURCE' '-I' '/usr/include/libxml2' '-c' '-o' 'nodeFuncs.o' '-mtune=generic' '-march=x86-64'
/opt/gcc/prod/bin/../libexec/gcc/x86_64-pc-linux-gnu/16.0.0/cc1 -fpreprocessed nodeFuncs.i -quiet -dumpbase nodeFuncs.c -dumpbase-ext .c -mtune=generic -march=x86-64 -g -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wsuggest-attribute=format -Wimplicit-fallthrough=3 -Wcast-function-type -Wformat-security -Wno-deprecated-non-prototype -Wformat-truncation=0 -Wno-stringop-truncation -std=gnu17 -version -fno-strict-aliasing -fwrapv -fexcess-precision=standard -o nodeFuncs.s
GNU C17 (GCC) version 16.0.0 20250501 (experimental) (x86_64-pc-linux-gnu)
compiled by GNU C version 16.0.0 20250501 (experimental), GMP version 6.3.0, MPFR version 4.2.1, MPC version 1.3.1, isl version none
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
Compiler executable checksum: 1595b8feb42ed5b4136e55368db04a28
nodeFuncs.c: In function ‘expression_tree_walker’:
nodeFuncs.c:1949:25: internal compiler error: Segmentation fault
1949 | return walker(((WithCheckOption *) node)->qual, context);
| ^~~~~~
0x263119f internal_error(char const*, ...)
/opt/gcc/source/gcc/diagnostic-global-context.cc:517
0x1131fef crash_signal
/opt/gcc/source/gcc/toplev.cc:321
0x75a3c964532f ???
./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
0xa30a1f tree_check(tree_node const*, char const*, int, char const*, tree_code)
/opt/gcc/source/gcc/tree.h:3979
0xa30a1f fndecl_built_in_p(tree_node const*)
/opt/gcc/source/gcc/tree.h:6922
0xa30a1f convert_arguments
/opt/gcc/source/gcc/c/c-typeck.cc:4340
0xa30a1f build_function_call_vec(unsigned long, vec<unsigned long, va_heap, vl_ptr>, tree_node*, vec<tree_node*, va_gc, vl_embed>*, vec<tree_node*, va_gc, vl_embed>*, tree_node*)
/opt/gcc/source/gcc/c/c-typeck.cc:3881
0xa7c194 c_parser_postfix_expression_after_primary
/opt/gcc/source/gcc/c/c-parser.cc:13735
0xa58893 c_parser_postfix_expression
/opt/gcc/source/gcc/c/c-parser.cc:13286
0xa5deaa c_parser_unary_expression
/opt/gcc/source/gcc/c/c-parser.cc:10604
0xa5fb1b c_parser_cast_expression
/opt/gcc/source/gcc/c/c-parser.cc:10445
0xa5ff0f c_parser_binary_expression
/opt/gcc/source/gcc/c/c-parser.cc:10213
0xa61523 c_parser_conditional_expression
/opt/gcc/source/gcc/c/c-parser.cc:9908
0xa61d24 c_parser_expr_no_commas
/opt/gcc/source/gcc/c/c-parser.cc:9821
0xa62187 c_parser_expression
/opt/gcc/source/gcc/c/c-parser.cc:13875
0xa629b7 c_parser_expression_conv
/opt/gcc/source/gcc/c/c-parser.cc:13934
0xa556bd c_parser_statement_after_labels
/opt/gcc/source/gcc/c/c-parser.cc:8226
0xa58541 c_parser_compound_statement_nostart
/opt/gcc/source/gcc/c/c-parser.cc:7757
0xa8cad7 c_parser_compound_statement
/opt/gcc/source/gcc/c/c-parser.cc:6975
0xa54029 c_parser_statement_after_labels
/opt/gcc/source/gcc/c/c-parser.cc:8163
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
-
robins
Robins Tharakan <tharakan@gmail.com> writes:
> On Thu, 1 May 2025 at 19:43, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>> doc: Warn that ts_headline() output is not HTML-safe.
> This commit looks harmless, but 2 separate machines are
> failing on this commit (at the same point).
It's hardly likely that a docs-only commit is the cause. Did you
update the "experimental" compiler on those machines since their last
run?
regards, tom lane
On Fri, 2 May 2025 at 06:57, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robins Tharakan <tharakan@gmail.com> writes:
> > On Thu, 1 May 2025 at 19:43, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> >> doc: Warn that ts_headline() output is not HTML-safe.
> > This commit looks harmless, but 2 separate machines are
> > failing on this commit (at the same point).
> It's hardly likely that a docs-only commit is the cause. Did you
> update the "experimental" compiler on those machines since their last
> run?
I agree. I did try compiling postgres (in a separate folder) with the most recent gcc,
and the failure still occurs even with the latest gcc commit (25921d66424),
checked in ~40 min ago.
For background, the machines are pretty aggressive [1] in updating gcc: a job retries every
15 min [2]. So if gcc fixes the issue (or if I fix something stupid in the way I am
compiling gcc etc), the next buildfarm run should automatically go green soon after.
(I tried to auto-update the gcc version string in the postgres buildfarm programmatically
whenever gcc is recompiled, but it just gets too noisy on the members page. It would be
nicer if gcc -v could also report the commit hash, but I am not sure how to force that yet.)
-
robins
2. robins@dell:/opt/gcc$ tail -20 build.log
gcs09da 20250502_0615 - make successful
gcs09da 20250502_0616 - make install successful.
gcs09da 20250502_0616 - Postgres Buildfarm process not running (0). Good.
gcs09da 20250502_0616 - gcc version string has changed from [16.0.0 20250501 (experimental) - b6d37ec1dd2] to [16.0.0 20250501 (experimental) - 87c4460024d]
gcs2d5a 20250502_0630 - git checkout successful.
gcs2d5a 20250502_0630 - git pull successful.
gcs2d5a 20250502_0630 - No change in gcc version. Quitting.
gcs6cc4 20250502_0645 - git checkout successful.
gcs6cc4 20250502_0645 - git pull successful.
gcs6cc4 20250502_0645 - gcc has changed - [87c4460024d] vs [25921d66424]. Recompiling.
gcs6cc4 20250502_0647 - make successful
gcs6cc4 20250502_0648 - make install successful.
gcs6cc4 20250502_0648 - Postgres Buildfarm process not running (0). Good.
gcs6cc4 20250502_0648 - gcc version string has changed from [16.0.0 20250501 (experimental) - 87c4460024d] to [16.0.0 20250501 (experimental) - 25921d66424]
gcs0df0 20250502_0700 - git checkout successful.
gcs0df0 20250502_0700 - git pull successful.
gcs0df0 20250502_0700 - No change in gcc version. Quitting.
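The update cycle shown in the log above can be modelled roughly as follows. This is a dry-run sketch in Python, not the actual script (which is not shown in this thread): it runs no git or make commands and only reproduces the rebuild decision and the resulting log messages. The function names, the base_version default, and the assumption that every build step succeeds are illustrative.

```python
def version_string(base_version: str, commit: str) -> str:
    """Combine the gcc version text with a commit hash, as in the log."""
    return f"{base_version} - {commit}"


def update_step(
    installed_commit: str,
    pulled_commit: str,
    base_version: str = "16.0.0 20250501 (experimental)",
) -> list[str]:
    """Return the log lines one 15-minute cycle would produce (dry run).

    Assumption: checkout, pull, make and make install all succeed, and
    no buildfarm process is running. A real script would execute those
    steps and stop at the first failure.
    """
    log = ["git checkout successful.", "git pull successful."]

    # Nothing new upstream: skip the rebuild entirely.
    if pulled_commit == installed_commit:
        log.append("No change in gcc version. Quitting.")
        return log

    log.append(
        f"gcc has changed - [{installed_commit}] vs [{pulled_commit}]. Recompiling."
    )
    log.append("make successful")
    log.append("make install successful.")
    log.append("Postgres Buildfarm process not running (0). Good.")
    log.append(
        "gcc version string has changed from "
        f"[{version_string(base_version, installed_commit)}] to "
        f"[{version_string(base_version, pulled_commit)}]"
    )
    return log


if __name__ == "__main__":
    for line in update_step("87c4460024d", "25921d66424"):
        print(line)
```

Keeping the commit hash in the recorded version string is what makes it possible to tell which gcc build produced a given failure, since the plain "16.0.0 20250501 (experimental)" text did not change between these rebuilds.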
On Fri, 2 May 2025 at 07:28, Robins Tharakan <tharakan@gmail.com> wrote:
> So ideally if gcc fixes the issue (or if I fix something stupid in the way I am
> compiling gcc etc), the next buildfarm run should automatically go green soon after.
Does look like a gcc bug. Ideally should go green automatically (once the proposed patch goes through).
-
robins