Thread: Index size
I'd like to understand a bit more about indexes in PG. When I have a row in a table, and an index, say, for a single column, does that duplicate the entire row on disk? Or is there some kind of id lookup involved? Thanks.
On 11/29/2016 03:30 PM, Samuel Williams wrote: > I'd like to understand a bit more about indexes in PG. https://www.postgresql.org/docs/9.5/static/indexam.html "An index is effectively a mapping from some data key values to tuple identifiers, or TIDs, of row versions (tuples) in the index's parent table. A TID consists of a block number and an item number within that block (see Section 63.6). This is sufficient information to fetch a particular row version from the table. ..." > > When I have a row in a table, and an index, say, for a single column, > does that duplicate the entire row on disk? Or is there some kind of > id lookup involved? > > Thanks. > > -- Adrian Klaver adrian.klaver@aklaver.com
Is there any reason why for the same data set, and same indexes, that the data in postgres would be significantly larger than innodb/mariadb?
On Thu, Dec 01, 2016 at 12:38:37AM +1300, Samuel Williams wrote: > Is there any reason why for the same data set, and same indexes, that > the data in postgres would be significantly larger than > innodb/mariadb? Sure: because they do entirely different things on-disk. Regards, Karsten -- GPG key ID E4071346 @ eu.pool.sks-keyservers.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
On 11/30/2016 03:38 AM, Samuel Williams wrote: > Is there any reason why for the same data set, and same indexes, that > the data in postgres would be significantly larger than > innodb/mariadb? > Hard to say without the table/index definitions and some indication of what the data is. What version of Postgres? Also how are you measuring the size of the data set? Have you looked at/tried the functions here?: https://www.postgresql.org/docs/9.5/static/functions-admin.html#FUNCTIONS-ADMIN-DBOBJECT -- Adrian Klaver adrian.klaver@aklaver.com
Thanks everyone for your feedback so far. I've done a bit more digging: MySQL in MBytes (about 350 million rows): index_user_event_on_what_category_id_created_at_latlng | 22806.00 index_user_event_for_reporting | 18211.00 index_user_event_on_created_at | 9519.00 index_user_event_on_user_id | 6884.00 index_user_event_on_poi_id | 4891.00 index_user_event_on_deal_id | 3979.00 Postgres (about 250 million rows): index_user_event_on_what_category_id_created_at_latlng | 25 GB index_user_event_for_reporting | 19 GB index_user_event_on_created_at | 7445 MB index_user_event_on_user_id | 7274 MB index_user_event_on_deal_id | 7132 MB index_user_event_on_poi_id | 7099 MB So, the index is a bit bigger, plus there is also the PKEY index which increases disk usage by another whole index. Keep in mind in the above, MySQL has about 40% more data. With some indexes, it looks like MySQL might not be adding all data to the index (e.g. ignoring NULL values). Does MySQL ignore null values in an index? Can we get the same behaviour in Postgres to minimise usage? What would be the recommendation here? For the composite indexes, if any value is null, we are also less interested in it, so it could be ignored from the index. Thanks, Samuel
> On Dec 3, 2016, at 3:57 PM, Samuel Williams <space.ship.traveller@gmail.com> wrote: > > Thanks everyone for your feedback so far. I've done a bit more digging: > > MySQL in MBytes (about 350 million rows): > > index_user_event_on_what_category_id_created_at_latlng | 22806.00 > index_user_event_for_reporting | 18211.00 > index_user_event_on_created_at | 9519.00 > index_user_event_on_user_id | 6884.00 > index_user_event_on_poi_id | 4891.00 > index_user_event_on_deal_id | 3979.00 > > Postgres (about 250 million rows): > > index_user_event_on_what_category_id_created_at_latlng | 25 GB > index_user_event_for_reporting | 19 GB > index_user_event_on_created_at | 7445 MB > index_user_event_on_user_id | 7274 MB > index_user_event_on_deal_id | 7132 MB > index_user_event_on_poi_id | 7099 MB > > So, the index is a bit bigger, plus there is also the PKEY index which > increases disk usage by another whole index. Keep in mind in the > above, MySQL has about 40% more data. > > With some indexes, it looks like MySQL might not be adding all data to > the index (e.g. ignoring NULL values). Does MySQL ignore null values > in an index? Can we get the same behaviour in Postgres to minimise > usage? What would be the recommendation here? It's unlikely anyone will be able to usefully answer the questions you should be asking without seeing the schema and index definitions, and maybe some clues about how you're querying the data. Cheers, Steve
> On Dec 3, 2016, at 3:57 PM, Samuel Williams <space.ship.traveller@gmail.com> wrote:
>
> Thanks everyone for your feedback so far. I've done a bit more digging:
>
> MySQL in MBytes (about 350 million rows):
>
> index_user_event_on_what_category_id_created_at_latlng | 22806.00
> index_user_event_for_reporting | 18211.00
> index_user_event_on_created_at | 9519.00
> index_user_event_on_user_id | 6884.00
> index_user_event_on_poi_id | 4891.00
> index_user_event_on_deal_id | 3979.00
>
> Postgres (about 250 million rows):
>
> index_user_event_on_what_category_id_created_at_latlng | 25 GB
> index_user_event_for_reporting | 19 GB
> index_user_event_on_created_at | 7445 MB
> index_user_event_on_user_id | 7274 MB
> index_user_event_on_deal_id | 7132 MB
> index_user_event_on_poi_id | 7099 MB
>
> So, the index is a bit bigger, plus there is also the PKEY index which
> increases disk usage by another whole index. Keep in mind in the
> above, MySQL has about 40% more data.
>
> With some indexes, it looks like MySQL might not be adding all data to
> the index (e.g. ignoring NULL values). Does MySQL ignore null values
> in an index? Can we get the same behaviour in Postgres to minimise
> usage? What would be the recommendation here?
It's unlikely anyone will be able to usefully answer the questions you
should be asking without seeing the schema and index definitions,
and maybe some clues about how you're querying the data.
Cheers,
Steve
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
You are comparing apples to oranges. MySQL and PostgreSQL engines are different by design, so likewise
the size of the indexes will be different.
You may as well ask why a 2015 Porsche 911 Carrera 6 cyl, 3.4 L, Auto(AM-S7) gets 22 MPG,
but the 2016 Jaguar XF MPG gets 24 mpg.
--
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
--
-- PostgreSQL database dump
--
-- Dumped from database version 9.5.4
-- Dumped by pg_dump version 9.5.4
SET statement_timeout = 0;
SET lock_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SET check_function_bodies = false;
SET client_min_messages = warning;
SET row_security = off;
SET search_path = public, pg_catalog;
SET default_tablespace = '';
SET default_with_oids = false;
--
-- Name: user_event; Type: TABLE; Schema: public; Owner: http
--
CREATE TABLE user_event (
id integer NOT NULL,
user_id integer NOT NULL,
latitude numeric(9,6),
longitude numeric(9,6),
active boolean DEFAULT true NOT NULL,
poi_id integer,
deal_id integer,
category_id integer,
what character varying(32) NOT NULL,
locale character varying(8),
created_at timestamp without time zone NOT NULL,
parameters jsonb
);
ALTER TABLE user_event OWNER TO http;
--
-- Name: user_event_id_seq; Type: SEQUENCE; Schema: public; Owner: http
--
CREATE SEQUENCE user_event_id_seq
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1;
ALTER TABLE user_event_id_seq OWNER TO http;
--
-- Name: user_event_id_seq; Type: SEQUENCE OWNED BY; Schema: public; Owner: http
--
ALTER SEQUENCE user_event_id_seq OWNED BY user_event.id;
--
-- Name: id; Type: DEFAULT; Schema: public; Owner: http
--
ALTER TABLE ONLY user_event ALTER COLUMN id SET DEFAULT nextval('user_event_id_seq'::regclass);
--
-- Name: user_event_pkey; Type: CONSTRAINT; Schema: public; Owner: http
--
ALTER TABLE ONLY user_event
ADD CONSTRAINT user_event_pkey PRIMARY KEY (id);
--
-- Name: index_user_event_for_reporting; Type: INDEX; Schema: public; Owner: http
--
CREATE INDEX index_user_event_for_reporting ON user_event USING btree (latitude, longitude, created_at, user_id);
--
-- Name: index_user_event_on_created_at; Type: INDEX; Schema: public; Owner: http
--
CREATE INDEX index_user_event_on_created_at ON user_event USING btree (created_at);
--
-- Name: index_user_event_on_deal_id; Type: INDEX; Schema: public; Owner: http
--
CREATE INDEX index_user_event_on_deal_id ON user_event USING btree (deal_id);
--
-- Name: index_user_event_on_poi_id; Type: INDEX; Schema: public; Owner: http
--
CREATE INDEX index_user_event_on_poi_id ON user_event USING btree (poi_id);
--
-- Name: index_user_event_on_user_id; Type: INDEX; Schema: public; Owner: http
--
CREATE INDEX index_user_event_on_user_id ON user_event USING btree (user_id);
--
-- Name: index_user_event_on_what_category_id_created_at_latlng; Type: INDEX; Schema: public; Owner: http
--
CREATE INDEX index_user_event_on_what_category_id_created_at_latlng ON user_event USING btree (what, category_id, created_at, latitude, longitude);
--
-- PostgreSQL database dump complete
--
You are comparing apples to oranges. MySQL and PostgreSQL engines are different by design, so likewiseOn Sat, Dec 3, 2016 at 9:32 PM, Steve Atkins <steve@blighty.com> wrote:
> On Dec 3, 2016, at 3:57 PM, Samuel Williams <space.ship.traveller@gmail.com> wrote:
>
> Thanks everyone for your feedback so far. I've done a bit more digging:
>
> MySQL in MBytes (about 350 million rows):
>
> index_user_event_on_what_category_id_created_at_latlng | 22806.00
> index_user_event_for_reporting | 18211.00
> index_user_event_on_created_at | 9519.00
> index_user_event_on_user_id | 6884.00
> index_user_event_on_poi_id | 4891.00
> index_user_event_on_deal_id | 3979.00
>
> Postgres (about 250 million rows):
>
> index_user_event_on_what_category_id_created_at_latlng | 25 GB
> index_user_event_for_reporting | 19 GB
> index_user_event_on_created_at | 7445 MB
> index_user_event_on_user_id | 7274 MB
> index_user_event_on_deal_id | 7132 MB
> index_user_event_on_poi_id | 7099 MB
>
> So, the index is a bit bigger, plus there is also the PKEY index which
> increases disk usage by another whole index. Keep in mind in the
> above, MySQL has about 40% more data.
>
> With some indexes, it looks like MySQL might not be adding all data to
> the index (e.g. ignoring NULL values). Does MySQL ignore null values
> in an index? Can we get the same behaviour in Postgres to minimise
> usage? What would be the recommendation here?
It's unlikely anyone will be able to usefully answer the questions you
should be asking without seeing the schema and index definitions,
and maybe some clues about how you're querying the data.
Cheers,
Steve
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
the size of the indexes will be different.
You may as well ask why a 2015 Porsche 911 Carrera 6 cyl, 3.4 L, Auto(AM-S7) gets 22 MPG,
but the 2016 Jaguar XF MPG gets 24 mpg.
--Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
On Sun, Dec 4, 2016 at 4:43 AM, Samuel Williams <space.ship.traveller@gmail.com> wrote:
Melvin, of course there are differences. However, I suspect there are at least SOME tangible differences which can be identified.On 4 December 2016 at 15:53, Melvin Davidson <melvin6925@gmail.com> wrote:You are comparing apples to oranges. MySQL and PostgreSQL engines are different by design, so likewiseOn Sat, Dec 3, 2016 at 9:32 PM, Steve Atkins <steve@blighty.com> wrote:
> On Dec 3, 2016, at 3:57 PM, Samuel Williams <space.ship.traveller@gmail.com> wrote:
>
> Thanks everyone for your feedback so far. I've done a bit more digging:
>
> MySQL in MBytes (about 350 million rows):
>
> index_user_event_on_what_category_id_created_at_latlng | 22806.00
> index_user_event_for_reporting | 18211.00
> index_user_event_on_created_at | 9519.00
> index_user_event_on_user_id | 6884.00
> index_user_event_on_poi_id | 4891.00
> index_user_event_on_deal_id | 3979.00
>
> Postgres (about 250 million rows):
>
> index_user_event_on_what_category_id_created_at_latlng | 25 GB
> index_user_event_for_reporting | 19 GB
> index_user_event_on_created_at | 7445 MB
> index_user_event_on_user_id | 7274 MB
> index_user_event_on_deal_id | 7132 MB
> index_user_event_on_poi_id | 7099 MB
>
> So, the index is a bit bigger, plus there is also the PKEY index which
> increases disk usage by another whole index. Keep in mind in the
> above, MySQL has about 40% more data.
>
> With some indexes, it looks like MySQL might not be adding all data to
> the index (e.g. ignoring NULL values). Does MySQL ignore null values
> in an index? Can we get the same behaviour in Postgres to minimise
> usage? What would be the recommendation here?
It's unlikely anyone will be able to usefully answer the questions you
should be asking without seeing the schema and index definitions,
and maybe some clues about how you're querying the data.
Cheers,
Steve
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
the size of the indexes will be different.
You may as well ask why a 2015 Porsche 911 Carrera 6 cyl, 3.4 L, Auto(AM-S7) gets 22 MPG,
but the 2016 Jaguar XF MPG gets 24 mpg.
--Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
>Melvin, of course there are differences. However, I suspect there are at least SOME tangible differences which can be identified.
Tutorial - Indexes
https://www.tutorialspoint.com/postgresql/postgresql_indexes.htm
CREATE INDEX
https://www.postgresql.org/docs/9.4/static/sql-createindex.html
--
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
On Sun, Dec 4, 2016 at 4:43 AM, Samuel Williams <space.ship.traveller@gmail.com> wrote:Melvin, of course there are differences. However, I suspect there are at least SOME tangible differences which can be identified.On 4 December 2016 at 15:53, Melvin Davidson <melvin6925@gmail.com> wrote:You are comparing apples to oranges. MySQL and PostgreSQL engines are different by design, so likewiseOn Sat, Dec 3, 2016 at 9:32 PM, Steve Atkins <steve@blighty.com> wrote:
> On Dec 3, 2016, at 3:57 PM, Samuel Williams <space.ship.traveller@gmail.com> wrote:
>
> Thanks everyone for your feedback so far. I've done a bit more digging:
>
> MySQL in MBytes (about 350 million rows):
>
> index_user_event_on_what_category_id_created_at_latlng | 22806.00
> index_user_event_for_reporting | 18211.00
> index_user_event_on_created_at | 9519.00
> index_user_event_on_user_id | 6884.00
> index_user_event_on_poi_id | 4891.00
> index_user_event_on_deal_id | 3979.00
>
> Postgres (about 250 million rows):
>
> index_user_event_on_what_category_id_created_at_latlng | 25 GB
> index_user_event_for_reporting | 19 GB
> index_user_event_on_created_at | 7445 MB
> index_user_event_on_user_id | 7274 MB
> index_user_event_on_deal_id | 7132 MB
> index_user_event_on_poi_id | 7099 MB
>
> So, the index is a bit bigger, plus there is also the PKEY index which
> increases disk usage by another whole index. Keep in mind in the
> above, MySQL has about 40% more data.
>
> With some indexes, it looks like MySQL might not be adding all data to
> the index (e.g. ignoring NULL values). Does MySQL ignore null values
> in an index? Can we get the same behaviour in Postgres to minimise
> usage? What would be the recommendation here?
It's unlikely anyone will be able to usefully answer the questions you
should be asking without seeing the schema and index definitions,
and maybe some clues about how you're querying the data.
Cheers,
Steve
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
the size of the indexes will be different.
You may as well ask why a 2015 Porsche 911 Carrera 6 cyl, 3.4 L, Auto(AM-S7) gets 22 MPG,
but the 2016 Jaguar XF MPG gets 24 mpg.
--Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.Samuel,Please note that the accepted convention in this forum is to bottom post. Please do not top post.
On 12/03/2016 03:57 PM, Samuel Williams wrote: > With some indexes, it looks like MySQL might not be adding all data to > the index (e.g. ignoring NULL values). Does MySQL ignore null values > in an index? Can we get the same behaviour in Postgres to minimise > usage? What would be the recommendation here? I don't think you are going to find many answers about the MySQL implementation (since it will be storage engine specific and it isn't Postgres) on a PostgreSQL list. Sincerely, JD -- Command Prompt, Inc. http://the.postgres.company/ +1-503-667-4564 PostgreSQL Centered full stack support, consulting and development. Everyone appreciates your honesty, until you are honest with them. Unless otherwise stated, opinions are my own.
Melvin, uh... I'm a software engineer... since when was it a problem to want to know how things work and why they are different? If you have nothing to contribute of a relevant technical nature, please don't reply, I'm really not interested.....
Samuel,
>since when was it a problem to want to know how things work and why they are different?
I am a retired DBA/Software Engineer/Customer Engineer. I have worked all aspects of IT.
Hardware, Software, Database, etc. I never said there was a problem wanting to know
the difference between things. What I did say was that there IS a difference. You are
concentrating your inquiry in the wrong area. If you want to know something specific about
a PostgreSQL index, then ask that, but don't expect anyone to explain why it's different
from a MySQL index, because there is no point in it.
> Could it be I should add WHERE the fields are not null?
That depends on the queries they will be used with.
If you have a WHERE clause in your query that requires a not null field, then the index
will be useful, providing the table is large enough to make the index scan faster than
a sequential scan.
So if you have a query like:
SELECT id, user_id, poi_id
FROM user_event
WHERE deal_id IS NULL;
Then, yes, adding a WHERE clause to index_user_event_on_deal_id for NULL deal_id's
is fine.
But if you want
SELECT id, user_id, poi_id
FROM user_event
WHERE deal_id = <some_value> ;
Then you don't need the WHERE deal_id IS NULL in your index.
BTW, most dba's prefer simpler, shorter, index names.
EG: idx_user_event_deal_id
--
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
Samuel Williams <space.ship.traveller@gmail.com> wrote: > So, uh, my main question was, does MySQL add null values to an index, and is this different from Postgres... Samuel, A quick google says that Mysql does index NULLs. Ask a Mysql group to get a more definitive answer. More relevant to your original question, I'll go out on a limb as I struggle to recall a fuzzy memory. The difference between Mysql and Postgresql is fundamental architecture, so yes the index creation will be very different,as others have said. IIRC (and I may not be), Mysql stores where a row is on the disk via the PK index. That meansthat secondary indexes point to the proper row in the PK index, which does mean that when you use a secondary indexto get data that there is a double look up. They claim that's faster for updates and other stuff because a change toa row only requires 1 index to be changed. Postgresql stores the direct disk location in each index, which slows down updates a little, but makes selects faster (andI find this really amusing because so many people say Mysql is great because it's fast at reads, yet architecturallyPG is faster). If I'm wrong here, I'm sure I'll be corrected. :) So you can see that Mysql indexes should be smaller than PG indexes because of what they carry. Personally, I think the diffis small enough I'm not going to worry about it, but math is such that some numbers are smaller than others. :) So thatshould explain what you're seeing. My opinion is that you shouldn't worry about the index size. Which DB does what you want the best? That obviously dependson what your needs are, but after using both Mysql and PG, I'll take PG whenever possible, thank you. Mysql has gottenbetter over the last 5-8 years, but there are still many pits of quicksand ready to swallow you up there that PG doesn'thave. If you know where those pits are and/or don't travel into that part of the jungle, great for you; personally,I prefer to get the job done without having to watch where I'm walking. ;) HTH, Kevin
Thanks Kevin, that makes sense. Yeah, I understand the architectural difference a bit more now. I also read that when you change a column which is not index, all the indexes for that row need to be updated anyway. Is that correct? On 7 December 2016 at 05:27, <kbrannen@pwhome.com> wrote: > Samuel Williams <space.ship.traveller@gmail.com> wrote: >> So, uh, my main question was, does MySQL add null values to an index, and is this different from Postgres... > > Samuel, > > A quick google says that Mysql does index NULLs. Ask a Mysql group to get a more definitive answer. > > More relevant to your original question, I'll go out on a limb as I struggle to recall a fuzzy memory. > > The difference between Mysql and Postgresql is fundamental architecture, so yes the index creation will be very different,as others have said. IIRC (and I may not be), Mysql stores where a row is on the disk via the PK index. That meansthat secondary indexes point to the proper row in the PK index, which does mean that when you use a secondary indexto get data that there is a double look up. They claim that's faster for updates and other stuff because a change toa row only requires 1 index to be changed. > > Postgresql stores the direct disk location in each index, which slows down updates a little, but makes selects faster (andI find this really amusing because so many people say Mysql is great because it's fast at reads, yet architecturallyPG is faster). If I'm wrong here, I'm sure I'll be corrected. :) > > So you can see that Mysql indexes should be smaller than PG indexes because of what they carry. Personally, I think thediff is small enough I'm not going to worry about it, but math is such that some numbers are smaller than others. :) Sothat should explain what you're seeing. > > My opinion is that you shouldn't worry about the index size. Which DB does what you want the best? That obviously dependson what your needs are, but after using both Mysql and PG, I'll take PG whenever possible, thank you. Mysql has gottenbetter over the last 5-8 years, but there are still many pits of quicksand ready to swallow you up there that PG doesn'thave. If you know where those pits are and/or don't travel into that part of the jungle, great for you; personally,I prefer to get the job done without having to watch where I'm walking. ;) > > HTH, > Kevin > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general
Thanks Kevin, that makes sense. Yeah, I understand the architectural
difference a bit more now. I also read that when you change a column
which is not index, all the indexes for that row need to be updated
anyway. Is that correct?
On 7 December 2016 at 05:27, <kbrannen@pwhome.com> wrote:
> Samuel Williams <space.ship.traveller@gmail.com> wrote:
>> So, uh, my main question was, does MySQL add null values to an index, and is this different from Postgres...
>
> Samuel,
>
> A quick google says that Mysql does index NULLs. Ask a Mysql group to get a more definitive answer.
>
> More relevant to your original question, I'll go out on a limb as I struggle to recall a fuzzy memory.
>
> The difference between Mysql and Postgresql is fundamental architecture, so yes the index creation will be very different, as others have said. IIRC (and I may not be), Mysql stores where a row is on the disk via the PK index. That means that secondary indexes point to the proper row in the PK index, which does mean that when you use a secondary index to get data that there is a double look up. They claim that's faster for updates and other stuff because a change to a row only requires 1 index to be changed.
>
> Postgresql stores the direct disk location in each index, which slows down updates a little, but makes selects faster (and I find this really amusing because so many people say Mysql is great because it's fast at reads, yet architecturally PG is faster). If I'm wrong here, I'm sure I'll be corrected. :)
>
> So you can see that Mysql indexes should be smaller than PG indexes because of what they carry. Personally, I think the diff is small enough I'm not going to worry about it, but math is such that some numbers are smaller than others. :) So that should explain what you're seeing.
>
> My opinion is that you shouldn't worry about the index size. Which DB does what you want the best? That obviously depends on what your needs are, but after using both Mysql and PG, I'll take PG whenever possible, thank you. Mysql has gotten better over the last 5-8 years, but there are still many pits of quicksand ready to swallow you up there that PG doesn't have. If you know where those pits are and/or don't travel into that part of the jungle, great for you; personally, I prefer to get the job done without having to watch where I'm walking. ;)
>
> HTH,
> Kevin
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
>I also read that when you change a column which is not index, all the
>indexes for that row need to be updated anyway. Is that correct?
That is not correct. Indexes are changed under the following conditions:
A. An insert is done to the table which involves an index.
B. A delete is done to the table which involves an index.
C. An update is done that involves columns included in an index.
Indexes point to the tid of the row for which the column(s) in the index
are involved. So if columns updated are not involved in the index,
there is no need to change the index.
https://www.postgresql.org/docs/9.4/static/sql-createindex.html
https://www.postgresql.org/docs/9.4/static/sql-reindex.html
--
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
On 2016-12-09 21:45:35 -0500, Melvin Davidson wrote: > On Fri, Dec 9, 2016 at 6:40 PM, Samuel Williams <space.ship.traveller@gmail.com> > wrote: > >I also read that when you change a column which is not index, all the > >indexes for that row need to be updated anyway. Is that correct? > > That is not correct. Indexes are changed under the following conditions: > A. An insert is done to the table which involves an index. > B. A delete is done to the table which involves an index. > C. An update is done that involves columns included in an index. > D. An index is REINDEXed > > Indexes point to the tid of the row for which the column(s) in the index > are involved. So if columns updated are not involved in the index, > there is no need to change the index. I don't think this is generally correct. The TID is a (block,item) tuple. It the updated version of the row doesn't fit into the same block it has to be stored in a different block, so the TID will change (AIUI there is a bit of trickery to avoid changing the TID if the new version is stored in the same block). This means that all the index entries for this row (not just for the changed field) will have to be updated. You can set fillfactor to a smaller value to make this less likely. hp -- _ | Peter J. Holzer | A coding theorist is someone who doesn't |_|_) | | think Alice is crazy. | | | hjp@hjp.at | -- John Gordon __/ | http://www.hjp.at/ | http://downlode.org/Etext/alicebob.html
Attachment
On 2016-12-09 21:45:35 -0500, Melvin Davidson wrote:
> On Fri, Dec 9, 2016 at 6:40 PM, Samuel Williams <space.ship.traveller@gmail.com>
> wrote:
> >I also read that when you change a column which is not index, all the
> >indexes for that row need to be updated anyway. Is that correct?
>
> That is not correct. Indexes are changed under the following conditions:
> A. An insert is done to the table which involves an index.
> B. A delete is done to the table which involves an index.
> C. An update is done that involves columns included in an index.
> D. An index is REINDEXed
>
> Indexes point to the tid of the row for which the column(s) in the index
> are involved. So if columns updated are not involved in the index,
> there is no need to change the index.
I don't think this is generally correct. The TID is a (block,item)
tuple. It the updated version of the row doesn't fit into the same block
it has to be stored in a different block, so the TID will change (AIUI
there is a bit of trickery to avoid changing the TID if the new version
is stored in the same block). This means that all the index entries for
this row (not just for the changed field) will have to be updated. You
can set fillfactor to a smaller value to make this less likely.
hp
--
_ | Peter J. Holzer | A coding theorist is someone who doesn't
|_|_) | | think Alice is crazy.
| | | hjp@hjp.at | -- John Gordon
__/ | http://www.hjp.at/ | http://downlode.org/Etext/alicebob.html
Yes, I see your point, but the case where the row does not fit into the same block
would only occur with unlimited field types such as var[], bytea[], etc. I believe
that to be the exception, and not the rule, so can we agree that we are both right
in that for the general case indexes are updated as I have described and for the
exception they act as you describe?
--
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
> On Dec 11, 2016, at 08:37, Melvin Davidson <melvin6925@gmail.com> wrote: > > Yes, I see your point, but the case where the row does not fit into the same block > would only occur with unlimited field types such as var[], bytea[], etc. No, that's not correct either. A block can fill up for any number of reasons: Dead tuples, a long HOT chain, or just a lotof data packed into the block. The first UPDATE to a tuple in a block that was populated by continuous INSERTs or a COPY,if the table is using the default fillfactor, will cause the new tuple to be written to a different block, which willcause an index update. The important thing to remember is that the process you are describing (in which indexes are not updated unless a columninvolved in the index changes) is an optimization, Heap-Only Tuples. It's a very common optimization, but it's notguaranteed. -- -- Christophe Pettus xof@thebuild.com