Re: Alter/update large tables - VERRRY annoying behaviour! - Mailing list pgsql-general

From Dmitry Tkach
Subject Re: Alter/update large tables - VERRRY annoying behaviour!
Date
Msg-id 3CBB4554.90701@openratings.com
Whole thread Raw
In response to Alter/update large tables - VERRRY annoying behaviour!  (Dmitry Tkach <dmitry@openratings.com>)
Responses Re: Alter/update large tables - VERRRY annoying behaviour!
Re: Alter/update large tables - VERRRY annoying behaviour!
List pgsql-general
Neil Conway wrote:<br /><blockquote cite="mid:20020415142451.1d8a21d0.nconway@klamath.dyndns.org" type="cite"><pre
wrap="">OnMon, 15 Apr 2002 13:07:20 -0400<br />"Dmitry Tkach" <a class="moz-txt-link-rfc2396E"
href="mailto:dmitry@openratings.com"><dmitry@openratings.com></a>wrote:<br /></pre><blockquote type="cite"><pre
wrap="">Hi,everybody!<br /></pre></blockquote><pre wrap=""><br />Hi Dmitry! Don't cross-post! It's annoying!<br
/></pre></blockquote>What do you mean by 'cross-post'?<br /> Are you saying that posting to several lists at a time is
annoying?<br/> I just thought, that this problem might be interesting to people, who read those (and not necessarily
ALL<br /> of them)... What's annoying about it?<br /><blockquote
cite="mid:20020415142451.1d8a21d0.nconway@klamath.dyndns.org"type="cite"><pre wrap=""><br /></pre><blockquote
type="cite"><prewrap="">This took me awfully long, but worked (I guess).<br />I say 'I guess', because I wasn't able so
farto verify that - when I triued to do<br /><br />select * from a limit 1;<br /><br />It just hungs on me ... at
least,it looks like it does.<br /></pre></blockquote><pre wrap=""><br />This didn't hang, it just requires a sequential
scanof the whole table.</pre></blockquote> I know it does (as I said below). The point is that it SHOULD NOT, and
especially,that I can't imagine anyone, not familiar with postgres internals to expect that it would - all it needs to
dois to grab the first row and return immediately.<br /> That's what it would do, if you just create a new table and
populateit with data.<br /><br /><blockquote cite="mid:20020415142451.1d8a21d0.nconway@klamath.dyndns.org"
type="cite"><prewrap=""><br />As you observe below, it will also need to scan through dead tuples,</pre></blockquote>
Not'also' - JUST the dead ones! That's what's especially annoying about it - as soon as it finds the first tuple,
that'snot dead, it returns.<br /><blockquote cite="mid:20020415142451.1d8a21d0.nconway@klamath.dyndns.org"
type="cite"><prewrap=""><br />but that is just a product of MVCC and there's no real way around<br
/>it.</pre></blockquote>My whole point is that I don't believe it (that there is no way around) :-)<br /> For one
thing,I have never seen ANY database engine (Oracle, Informix, DB2) that would take more than a second to get the first
rowfrom a table, regardless of what has been done to that table before.<br /> That (and my common sense too) tells me
thatthere MUST be a 'way around it'. <br /> I can see, that it's not currently implemented in postgres, but do believe
(andthat's the whole point of me posting that message in the first place) that it is a huge usability issue and really
needsto be fixed.<br /><blockquote cite="mid:20020415142451.1d8a21d0.nconway@klamath.dyndns.org" type="cite"><pre
wrap="">Once you VACUUM the dead tuples will be removed and sequential<br />scans should be fast once more.<br
/></pre></blockquote><br/> Yeah... I hope so. I am running vacuum on that table. It's been running for 6 hours now and
stillhas not finished. <br /> Doesn't it look to you like a little too much trouble to go through just to take a look
atthe first row of a table ? :-)<br /><br /> And, once again, I am not done modifying that schema - this is just an
intermediatestep, which means, I will have to do the vacuum all over when I am finished...<br /><br /> This seems like
WAYtoo much trouble to me :-(<br /><br /><blockquote cite="mid:20020415142451.1d8a21d0.nconway@klamath.dyndns.org"
type="cite"><prewrap=""><br />And before assuming that something has hung, it's a good idea to<br />look at the output
ofEXPLAIN for that query, as well as monitoring<br />system performance (through top, vmstat, etc) to see what the<br
/>systemis doing.<br /></pre></blockquote> Yeah, right...<br /><br /> explain select * from a limit 1;<br /> NOTICE: 
QUERYPLAN:<br /><br /> Limit  (cost=0.00..1.01 rows=1 width=46)<br />   ->  Seq Scan on a  (cost=0.00..32529003.00
rows=32243660width=46)<br /><br /> EXPLAIN<br /><br /> There is absolutely nothing in this plan, that would suggest it
willgo on executing for ages...<br /> Look at the 'cost' value for example...<br /> In any event, there is nothing
differentin this plan from what I was getting before I modified the table (when the query would take just a few
millisecondsto be executed).<br /><br /> As for monitoring system performance... Well, I could see it maxing out on CPU
usageand disk IO at times... How exactly does it help me to realize it did not hung?<br /><br /> (Let me clarify that -
by'hung' I mean 'not going to return the results in any reasonable time', not necessarily 'not doing anything at
all')<br/><br /><br /><blockquote cite="mid:20020415142451.1d8a21d0.nconway@klamath.dyndns.org" type="cite"><pre
wrap=""><br/></pre><blockquote type="cite"><pre wrap="">Lucky me, I have compiled the backend from sources with full
debuginfo, because if I hadn't done that,<br />(as most users), I would certainly had thought, that my database is
hopelesslycorrupted, and would have to<br />recreate it from scratch :-(<br /></pre></blockquote><pre wrap=""><br
/>That'sa ludicrous conclusion.<br /></pre></blockquote><br /> Why is it?<br /><blockquote
cite="mid:20020415142451.1d8a21d0.nconway@klamath.dyndns.org"type="cite"><pre wrap=""><br /></pre><blockquote
type="cite"><prewrap="">First of all, a question for you - is ANY update to a table equivalent (in this respect) to a
delete+insert?<br/></pre></blockquote><pre wrap=""><br />Yes, AFAIK -- MVCC requires this.<br /></pre></blockquote><br
/>What's MVCC?<br /><blockquote cite="mid:20020415142451.1d8a21d0.nconway@klamath.dyndns.org" type="cite"><pre
wrap=""><br/></pre><blockquote type="cite"><pre wrap="">- Vacuum, isn't the speediest thing in the world too (it's been
runningfor a hour now, and still has not finished).<br /></pre></blockquote><pre wrap=""><br />Is this 7.2? If not,
VACUUMshould be substantially faster in 7.2.</pre></blockquote> Yes, it is 7.2<br /><blockquote
cite="mid:20020415142451.1d8a21d0.nconway@klamath.dyndns.org"type="cite"><pre wrap=""><br />In any case, you'll always
wantto VACUUM or VACUUM FULL (and<br />ANALYZE) when you change your tables in such a dramatic fashion.<br /><br
/></pre></blockquote><br/> I know... Once again, I was hoping to be able to complete my changes before doing the vacuum
:-(<br/><br /> Thanks for your reply!<br /><br /> Dima<br /><br /><br /> 

pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: using CAST and LIKE
Next
From: elein
Date:
Subject: Re: User-Defined Datatypes