Thread: Physical column size
Hi,
I've created a table like this :
CREATE TABLE tmp_A (
c "char",
i int4
);
And another one
CREATE TABLE tmp_B (
i int4,
ii int4
);
I then inerted a bit more than 19 million rows in each table (exactly the same number of rows in each).
The end result is that the physical size on disk used by table tmp_A is exactly the same as table tmp_B (as revealed by the pg_relation_size function) ! Given that a "char" field is supposed to be 1 byte in size and a int4 4 bytes, shouldn't the tmp_A use a smaller disk space ? Or is it that any value, whatever the type, requires at least 4 bytes to be stored ?
Thanks,
Paul
I've created a table like this :
CREATE TABLE tmp_A (
c "char",
i int4
);
And another one
CREATE TABLE tmp_B (
i int4,
ii int4
);
I then inerted a bit more than 19 million rows in each table (exactly the same number of rows in each).
The end result is that the physical size on disk used by table tmp_A is exactly the same as table tmp_B (as revealed by the pg_relation_size function) ! Given that a "char" field is supposed to be 1 byte in size and a int4 4 bytes, shouldn't the tmp_A use a smaller disk space ? Or is it that any value, whatever the type, requires at least 4 bytes to be stored ?
Thanks,
Paul
Am Donnerstag, 26. Januar 2006 11:06 schrieb Paul Mackay: > Hi, > > I've created a table like this : > CREATE TABLE tmp_A ( > c "char", > i int4 > ); > > And another one > CREATE TABLE tmp_B ( > i int4, > ii int4 > ); > > I then inerted a bit more than 19 million rows in each table (exactly the > same number of rows in each). > > The end result is that the physical size on disk used by table tmp_A is > exactly the same as table tmp_B (as revealed by the pg_relation_size > function) ! Given that a "char" field is supposed to be 1 byte in size and > a int4 4 bytes, shouldn't the tmp_A use a smaller disk space ? Or is it > that any value, whatever the type, requires at least 4 bytes to be stored ? I think this is caused by alignment.
Hey guys, how u been. This is quite a newbie question, but I need to ask it. I'm trying to wrap my mind around the syntax of join and why and when to use it. I understand the concept of making a query go faster by creating indexes, but it seems that when I want data from multiple tables that link together the query goes slow. The slow is typically due to expensive nested loops. The reason is, all my brain understands is:
select
tablea.data
tableb.data
tablec.data
from
tablea
tableb
tablec
where
tablea.pri_key = tableb.foreign_key AND
tableb.pri_key = tablec.foreign_key AND...
From what I read, it seems you can use inner/outer right/left join on (bla) but when I see syntax examples I see that sometimes tables are omitted from the 'from' section of the query and other times, no. Sometimes I see that the join commands are nested and others, no and sometimes I see joins syntax that only applies to one table. From what I understand join can be used to tell the database the fast way to murge table data together to get results by specifiying the table that has the primary keys and the table that has the foreign keys.
I've read all through the postgres docs on this command and I'm still left lost. Can someone please explain to me in simple language how to use these commands or provide me with a link. I need it to live right now. Thanx.
First, this isn't really the right place to ask -- this forum is about performance, not SQL syntax. Second, this isn't a question anyone can answer in a reasonable length of time. What you're asking for usually is taughtin a class on relational database theory, which is typically a semester or two in college. If you really need a crash course, dig around on the web for terms like "SQL Tutorial". Good luck, Craig J@Planeti.Biz wrote: > Hey guys, how u been. This is quite a newbie question, but I need to ask > it. I'm trying to wrap my mind around the syntax of join and why and > when to use it. I understand the concept of making a query go faster by > creating indexes, but it seems that when I want data from multiple > tables that link together the query goes slow. The slow is typically due > to expensive nested loops. The reason is, all my brain understands is: > > select > tablea.data > tableb.data > tablec.data > from > tablea > tableb > tablec > where > tablea.pri_key = tableb.foreign_key AND > tableb.pri_key = tablec.foreign_key AND... > > From what I read, it seems you can use inner/outer right/left join on > (bla) but when I see syntax examples I see that sometimes tables are > omitted from the 'from' section of the query and other times, no. > Sometimes I see that the join commands are nested and others, no and > sometimes I see joins syntax that only applies to one table. From what I > understand join can be used to tell the database the fast way to murge > table data together to get results by specifiying the table that has the > primary keys and the table that has the foreign keys. > > I've read all through the postgres docs on this command and I'm still > left lost. Can someone please explain to me in simple language how to > use these commands or provide me with a link. I need it to live right > now. Thanx. > >
If I want my database to go faster, due to X then I would think that the issue is about performance. I wasn't aware of a paticular constraint on X. I have more that a rudementary understanding of what's going on here, I was just hoping that someone could shed some light on the basic principal of this JOIN command and its syntax. Most people I ask, don't give me straight answers and what I have already read on the web is not very helpful thus far. ----- Original Message ----- From: "Craig A. James" <cjames@modgraph-usa.com> To: <pgsql-performance@postgresql.org> Sent: Thursday, January 26, 2006 11:12 AM Subject: Re: [PERFORM] Query optimization with X Y JOIN > First, this isn't really the right place to ask -- this forum is about > performance, not SQL syntax. > > Second, this isn't a question anyone can answer in a reasonable length of > time. What you're asking for usually is taught in a class on relational > database theory, which is typically a semester or two in college. > > If you really need a crash course, dig around on the web for terms like > "SQL Tutorial". > > Good luck, > Craig > > > J@Planeti.Biz wrote: >> Hey guys, how u been. This is quite a newbie question, but I need to ask >> it. I'm trying to wrap my mind around the syntax of join and why and when >> to use it. I understand the concept of making a query go faster by >> creating indexes, but it seems that when I want data from multiple tables >> that link together the query goes slow. The slow is typically due to >> expensive nested loops. The reason is, all my brain understands is: >> select >> tablea.data >> tableb.data >> tablec.data >> from >> tablea >> tableb >> tablec >> where >> tablea.pri_key = tableb.foreign_key AND >> tableb.pri_key = tablec.foreign_key AND... >> From what I read, it seems you can use inner/outer right/left join on >> (bla) but when I see syntax examples I see that sometimes tables are >> omitted from the 'from' section of the query and other times, no. >> Sometimes I see that the join commands are nested and others, no and >> sometimes I see joins syntax that only applies to one table. From what I >> understand join can be used to tell the database the fast way to murge >> table data together to get results by specifiying the table that has the >> primary keys and the table that has the foreign keys. >> I've read all through the postgres docs on this command and I'm still >> left lost. Can someone please explain to me in simple language how to use >> these commands or provide me with a link. I need it to live right now. >> Thanx. >> > > ---------------------------(end of broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly >
J@Planeti.Biz wrote: > If I want my database to go faster, due to X then I would think that > the issue is about performance. I wasn't aware of a paticular > constraint on X. > > I have more that a rudementary understanding of what's going on here, > I was just hoping that someone could shed some light on the basic > principal of this JOIN command and its syntax. Most people I ask, > don't give me straight answers and what I have already read on the web > is not very helpful thus far. What you are looking for is here: http://sqlzoo.net/ It is an excellent website that discusses in depth but at a tutorial style level how and what SQL is and how to use it. Including JOINS. FYI, a JOIN is basically a FROM with an integrated WHERE clause. That is a very simplified description and isn't 100% accurate but it is close. I strongly suggest the website I mentioned above as it will resolve your question. Joshua D. Drake -- The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564 PostgreSQL Replication, Consulting, Custom Development, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: PLphp, PLperl - http://www.commandprompt.com/
J@Planeti.Biz wrote: > If I want my database to go faster, due to X then I would think that the > issue is about performance. I wasn't aware of a paticular constraint on X. You haven't asked a performance question yet though. > I have more that a rudementary understanding of what's going on here, I > was just hoping that someone could shed some light on the basic > principal of this JOIN command and its syntax. Most people I ask, don't > give me straight answers and what I have already read on the web is not > very helpful thus far. OK - firstly it's not a JOIN command. It's a SELECT query that happens to join (in your example) three tables together. The syntax is specified in the SQL reference section of the manuals, and I don't think it's different from the standard SQL spec here. A query that joins two or more tables (be they real base-tables, views or sub-query result-sets) produces the product of both. Normally you don't want this so you apply constraints to that join (table_a.col1 = table_b.col2). In some cases you want all the rows from one side of a join, whether or not you get a match on the other side of the join. This is called an outer join and results in NULLs for all the columns on the "outside" of the join. A left-join returns all rows from the table on the left of the join, a right-join from the table on the right of it. When planning a join, the planner will try to estimate how many matches it will see on each side, taking into account any extra constraints (you might want only some of the rows in table_a anyway). It then decides whether to use any indexes on the relevant column(s). Now, if you think the planner is making a mistake we'll need to see the output of EXPLAIN ANALYSE for the query and will want to know that you've vacuumed and analysed the tables in question. Does that help at all? -- Richard Huxton Archonet Ltd
Yes, that helps a great deal. Thank you so much. ----- Original Message ----- From: "Richard Huxton" <dev@archonet.com> To: <J@planeti.biz> Cc: <pgsql-performance@postgresql.org> Sent: Thursday, January 26, 2006 11:47 AM Subject: Re: [PERFORM] Query optimization with X Y JOIN > J@Planeti.Biz wrote: >> If I want my database to go faster, due to X then I would think that the >> issue is about performance. I wasn't aware of a paticular constraint on >> X. > > You haven't asked a performance question yet though. > >> I have more that a rudementary understanding of what's going on here, I >> was just hoping that someone could shed some light on the basic principal >> of this JOIN command and its syntax. Most people I ask, don't give me >> straight answers and what I have already read on the web is not very >> helpful thus far. > > OK - firstly it's not a JOIN command. It's a SELECT query that happens to > join (in your example) three tables together. The syntax is specified in > the SQL reference section of the manuals, and I don't think it's different > from the standard SQL spec here. > > A query that joins two or more tables (be they real base-tables, views or > sub-query result-sets) produces the product of both. Normally you don't > want this so you apply constraints to that join (table_a.col1 = > table_b.col2). > > In some cases you want all the rows from one side of a join, whether or > not you get a match on the other side of the join. This is called an outer > join and results in NULLs for all the columns on the "outside" of the > join. A left-join returns all rows from the table on the left of the join, > a right-join from the table on the right of it. > > When planning a join, the planner will try to estimate how many matches it > will see on each side, taking into account any extra constraints (you > might want only some of the rows in table_a anyway). It then decides > whether to use any indexes on the relevant column(s). > > Now, if you think the planner is making a mistake we'll need to see the > output of EXPLAIN ANALYSE for the query and will want to know that you've > vacuumed and analysed the tables in question. > > Does that help at all? > -- > Richard Huxton > Archonet Ltd > > ---------------------------(end of broadcast)--------------------------- > TIP 2: Don't 'kill -9' the postmaster >