Home > mailing lists

The query plan get all columns but I'm using only one column. - Mailing list pgsql-performance

From	Moises Lopez
Subject	The query plan get all columns but I'm using only one column.
Date	April 24, 2020 21:11:12
Msg-id	CAGro9RXViOXNBBA-SZur_UbpWrbkg_Ow4ha13FJu9dOJ=WdB7Q@mail.gmail.com Whole thread
Responses	Re: The query plan get all columns but I'm using only one column.
List	pgsql-performance

Tree view

We have production database that has slow queries because of the query get all columns even if I'm using only one column.

The result is slow for tables that there are too much columns

The weird part is that there is environment that I can't reproduce it even if they are using the same postgresql.conf

I didn't find what is the variant/configuration to avoid it

I could reproduce it using the official docker image of postgresql

* Steps to reproduce it

1. Run the following script:
docker run --name psql1 -d -e POSTGRES_PASSWORD=pwd postgres
docker exec -it --user=postgres psql1 psql
# Into docker container
CREATE DATABASE db;
\connect db;
CREATE TABLE link (
ID serial PRIMARY KEY,
url VARCHAR (255) NOT NULL,
name VARCHAR (255) NOT NULL,
description VARCHAR (255),
rel VARCHAR (50)
);
EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
SELECT l1.url
FROM link l1
JOIN link l2
ON l1.url=l2.url;

2. See result of the Query Plan:
QUERY PLAN
-------------------------------------------------------------------------------------------
Hash Join (cost=10.90..21.85 rows=40 width=516) (actual time=0.080..0.081 rows=1 loops=1)
Output: l1.url
Hash Cond: ((l1.url)::text = (l2.url)::text)
Buffers: shared hit=5
-> Seq Scan on public.link l1 (cost=0.00..10.40 rows=40 width=516) (actual time=0.010..0.011 rows=1 loops=1)
Output: l1.id, l1.url, l1.name, l1.description, l1.rel
Buffers: shared hit=1
-> Hash (cost=10.40..10.40 rows=40 width=516) (actual time=0.021..0.021 rows=1 loops=1)
Output: l2.url
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=1
-> Seq Scan on public.link l2 (cost=0.00..10.40 rows=40 width=516) (actual time=0.010..0.011 rows=1 loops=1)
Output: l2.url
Buffers: shared hit=1
Planning Time: 0.564 ms
Execution Time: 0.142 ms

3. Notice that I'm using only the column "url" for "JOIN" and "SELECT" section,
but the "Output" section is returning all columns.

Is there a manner to avoid returning all columns in order to get a better performance?

Thank you in advance

* PostgreSQL version:

psql postgres -c "SELECT version()"
PostgreSQL 12.2 (Debian 12.2-2.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit

Changes made to the settings in the postgresql.conf file: see Server Configuration for a quick way to list them all.
without changes

Operating system and version:
cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 10 (buster)"

Moisés López

@moylop260

pgsql-performance by date:

From: Tom Lane
Date: 24 April 2020, 19:39:52
Subject: Re: PostgreSQL does not choose my indexes well

From: Jeff Janes
Date: 24 April 2020, 22:59:41
Subject: Re: PostgreSQL does not choose my indexes well

The query plan get all columns but I'm using only one column. - Mailing list pgsql-performance

Previous

Next