Thread: Major Problem, need help! Can't run our website!

Major Problem, need help! Can't run our website!

From
"ITS ONT Alcazar, Jose Aguedo C"
Date:
Anyone!

Before anything else, I have no background in PostgreSQL, but I have a
little knowledge of Linux. We use PostgreSQL to run one of our websites. It
runs on Red Hat Linux 7.3. Our System Administrator, who used to maintain
this server, has resigned and didn't leave proper documentation on how to
maintain it. Right now, our NEW System Administrator is clearing
some logs in /var/lib/pgsql/data/pg_xlog in order to free some space in the
/var file system. It used to work before, but now it's not working anymore.
The information below is the message we encounter when we try to
connect to the website. Please, ANYONE, help us!

Warning: Unable to connect to PostgreSQL server: could not connect to
server: Connection refused Is the server running on host localhost and
accepting TCP/IP connections on port 5432? in
/var/lib/phplib-7.2d/php/db_pgsql.inc on line 48
Database error: Link-ID == false, pconnect failed
ODBC Error: 0 ()
Session halted.


Thanks! --AGUEDO


Re: Major Problem, need help! Can't run our website!

From
Tim Allen
Date:
ITS ONT Alcazar, Jose Aguedo C wrote:
> Anyone!
>
> Before anything else, I have no background in PostgreSQL, but I have a
> little knowledge of Linux. We use PostgreSQL to run one of our websites. It
> runs on Red Hat Linux 7.3. Our System Administrator, who used to maintain
> this server, has resigned and didn't leave proper documentation on how to
> maintain it. Right now, our NEW System Administrator is clearing
> some logs in /var/lib/pgsql/data/pg_xlog in order to free some space in the
> /var file system. It used to work before, but now it's not working anymore.
> The information below is the message we encounter when we try to
> connect to the website. Please, ANYONE, help us!

We've seen reports of people firing this particular foot-gun before,
haven't we? Would it make sense to rename pg_xlog to something that
doesn't sound like it's "just" full of log files? Eg pg_wal - something
where the half-educated will have no idea what it is, and therefore not
think they know what they can do with it.

Tim

--
-----------------------------------------------
Tim Allen          tim@proximity.com.au
Proximity Pty Ltd  http://www.proximity.com.au/

Re: [HACKERS] Major Problem, need help! Can't run our website!

From
Tom Lane
Date:
Tim Allen <tim@proximity.com.au> writes:
> We've seen reports of people firing this particular foot-gun before,
> haven't we? Would it make sense to rename pg_xlog to something that
> doesn't sound like it's "just" full of log files? Eg pg_wal - something
> where the half-educated will have no idea what it is, and therefore not
> think they know what they can do with it.

There's something in what you say.  We'd have to rename pg_clog as well,
since that's even more critical than pg_xlog ...

            regards, tom lane

Re: [HACKERS] Major Problem, need help! Can't run our website!

From
Christopher Kings-Lynne
Date:
> We've seen reports of people firing this particular foot-gun before,
> haven't we? Would it make sense to rename pg_xlog to something that
> doesn't sound like it's "just" full of log files? Eg pg_wal - something
> where the half-educated will have no idea what it is, and therefore not
> think they know what they can do with it.

Would it be wise or insane for us to mention in the startup error a
HINT that if you've removed such files, the only hope is a full restore from
backup or pg_resetxlog with data loss?

Chris


Re: Major Problem, need help! Can't run our website!

From
"ITS ONT Alcazar, Jose Aguedo C"
Date:
Can anyone help me on how to create this? I will personally do this so that
it does not happen again. Or can you link me to the part of the archive that
covers re-creating the xlog? Our internet here is soooooooooo slow.
Pleasssssssseeeeeeeee... Many thanks in advance!

-----Original Message-----
From: Tim Allen [mailto:tim@proximity.com.au]
Sent: Tuesday, November 15, 2005 11:40 AM
To: ITS ONT Alcazar, Jose Aguedo C
Cc: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] Major Problem, need help! Can't run our website!


ITS ONT Alcazar, Jose Aguedo C wrote:
> Anyone!
>
> Before anything else, I have no background in PostgreSQL, but I have a
> little knowledge of Linux. We use PostgreSQL to run one of our websites. It
> runs on Red Hat Linux 7.3. Our System Administrator, who used to maintain
> this server, has resigned and didn't leave proper documentation on how to
> maintain it. Right now, our NEW System Administrator is clearing
> some logs in /var/lib/pgsql/data/pg_xlog in order to free some space in the
> /var file system. It used to work before, but now it's not working anymore.

Ouch. Tell him to stop. Though it's probably too late. The transaction
logs (ie anything you'll find in pg_xlog) are not disposable, they're an
important part of the database. Your new sysadmin has presumably
corrupted your database. So you've definitely lost any recent
transactions. What you can probably do is at least get your database up
and running again by re-creating the xlog (aka WAL) files with empty
data. Maybe someone else will post telling you how to do that; otherwise,
search the mailing list archives.

> Information below is the message we are encountering when we are trying to
> connect to the website. Please, ANYONE, help us!
>
> Warning: Unable to connect to PostgreSQL server: could not connect to
> server: Connection refused Is the server running on host localhost and
> accepting TCP/IP connections on port 5432? in
> /var/lib/phplib-7.2d/php/db_pgsql.inc on line 48
> Database error: Link-ID == false, pconnect failed
> ODBC Error: 0 ()
> Session halted.
>
> Thanks! --AGUEDO

Tim

--
-----------------------------------------------
Tim Allen          tim@proximity.com.au
Proximity Pty Ltd  http://www.proximity.com.au/


Re: [HACKERS] Major Problem, need help! Can't run our website!

From
Tom Lane
Date:
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
> Would it be wise or insane for us to mention in the startup error a
> HINT that if you've removed such files, the only hope is a full restore from
> backup or pg_resetxlog with data loss?

Not sure that we should have a HINT recommending a worst-case-scenario
course of action as the first resort.  We'll have people blowing away
their data for what might be relatively fixable problems (eg, bogus
permissions on the pg_xlog directory, which I think was an issue that
just came up a day or two ago ...)

(We're all really jumping to conclusions here anyway.  The guy may have
been foolish to remove xlog files, but that doesn't explain why his
postmaster isn't running.  There's some facts missing in that report.)

            regards, tom lane

Re: [HACKERS] Major Problem, need help! Can't run our

From
Rod Taylor
Date:
On Mon, 2005-11-14 at 23:02 -0500, Tom Lane wrote:
> Tim Allen <tim@proximity.com.au> writes:
> > We've seen reports of people firing this particular foot-gun before,
> > haven't we? Would it make sense to rename pg_xlog to something that
> > doesn't sound like it's "just" full of log files? Eg pg_wal - something
> > where the half-educated will have no idea what it is, and therefore not
> > think they know what they can do with it.
>
> There's something in what you say.  We'd have to rename pg_clog as well,
> since that's even more critical than pg_xlog ...

Rename them to pg_donttouchthis and pg_youneedthis.


Re: Major Problem, need help! Can't run our website!

From
Tom Lane
Date:
"ITS ONT Alcazar, Jose Aguedo C" <jacalcazar@exportbank.com.ph> writes:
> Can anyone help me on how to create this?

Before you do anything else, it would help to figure out why your
postmaster stopped running in the first place.  Removing xlog files
wasn't a very bright recovery measure, but that didn't cause the
postmaster to stop.  What did?  Were you completely out of disk space?

            regards, tom lane
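
The disk-space question above can be answered with a quick check. What
follows is a minimal sketch, assuming a POSIX shell with df and awk
available; the function name fs_usage_percent and the 95% threshold in
the example are illustrative and do not come from the thread.

```shell
#!/bin/sh
# Print the usage percentage (digits only) of the filesystem holding
# the given path. fs_usage_percent is a hypothetical helper name.
fs_usage_percent() {
    # df -P gives one header line plus one data line per filesystem;
    # column 5 of the data line is the capacity, e.g. "96%".
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Example (threshold is an assumption, adjust to taste):
#   if [ "$(fs_usage_percent /var)" -ge 95 ]; then
#       echo "/var is nearly full"
#   fi
```

A filesystem that is completely full would be one plausible reason for
the postmaster to stop, which is why it is worth checking before
touching anything under the data directory.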

Re: Major Problem, need help! Can't run our website!

From
"ITS ONT Alcazar, Jose Aguedo C"
Date:
Hi Tom! The hard disk usage right now is 96%; according to them (sysadmins)
it should work now. But it's not displaying anything on the website. They
cleared the said pg_xlog after making backups. They issued mv
*filename* /directory (for backup) and cat null > *filename* (for creating
the file). Please help. We need to get this working ASAP. Thanks!




Re: [HACKERS] Major Problem, need help! Can't run our

From
Tom Lane
Date:
Rod Taylor <pg@rbt.ca> writes:
> On Mon, 2005-11-14 at 23:02 -0500, Tom Lane wrote:
>> There's something in what you say.  We'd have to rename pg_clog as well,
>> since that's even more critical than pg_xlog ...

> Rename them to pg_donttouchthis and pg_youneedthis.

:-)

On a more serious level: Tim's suggestion of "pg_wal" for pg_xlog sounds
fine to me.  How about "pg_trans" for pg_clog, by analogy to the
existing pg_subtrans?  Nothing else in the standard layout looks like
it's got a name that a newbie would think means discardable data.

            regards, tom lane

Re: Major Problem, need help! Can't run our website!

From
Tom Lane
Date:
"ITS ONT Alcazar, Jose Aguedo C" <jacalcazar@exportbank.com.ph> writes:
> Hi Tom! The hard disk usage right now is 96%; according to them (sysadmins)
> it should work now. But it's not displaying anything on the website. They
> cleared the said pg_xlog after making backups. They issued mv
> *filename* /directory (for backup) and cat null > *filename* (for creating
> the file). Please help. We need to get this working ASAP. Thanks!

Well, if they were smart enough to keep a copy, then just moving the
copy back should make it possible to restart the postmaster.

What happens exactly when you try to start the postmaster?

            regards, tom lane

Re: Major Problem, need help! Can't run our website!

From
"ITS ONT Alcazar, Jose Aguedo C"
Date:
Tom, as I said in my previous email, I have no background in PostgreSQL.
Can you help me with how to restart this? My apologies for my zero knowledge,
but I really need this to work. Thanks!





Re: Major Problem, need help! Can't run our website!

From
Aaron Glenn
Date:
On 11/14/05, ITS ONT Alcazar, Jose Aguedo C
<jacalcazar@exportbank.com.ph> wrote:
> Tom, as I said in my previous email, I have no background in PostgreSQL.
> Can you help me with how to restart this? My apologies for my zero knowledge,
> but I really need this to work. Thanks!
>

In a situation like this, you want to hire a knowledgeable PostgreSQL
consultant. www.commandprompt.com is a good place to start.

aaron.glenn

Re: Major Problem, need help! Can't run our website!

From
"ITS ONT Alcazar, Jose Aguedo C"
Date:
It will be an option. I will include the training requirement for this in
my budget plan for Y2006. Meanwhile, I might ask for some "FREE" help from
some generous and expert people here... ;)



Re: Major Problem, need help! Can't run our website!

From
Tom Lane
Date:
"ITS ONT Alcazar, Jose Aguedo C" <jacalcazar@exportbank.com.ph> writes:
> Tom, as I said in my previous email, I have no background in PostgreSQL.
> Can you help me with how to restart this? My apologies for my zero knowledge,
> but I really need this to work. Thanks!

On typical Linux systems, "sudo /sbin/service postgresql start" would
probably do it, but this assumes that there isn't any problem that
requires manual intervention.  If you do that, does it work?  If not,
what do you get from "su - postgres -c postmaster" ?

            regards, tom lane

Re: Major Problem, need help! Can't run our website!

From
"ITS ONT Alcazar, Jose Aguedo C"
Date:
Thanks Tom! I think we did it. Below are the things I entered. Also, I
changed the attributes and ownership of the xlog file to match those of
the previous file. Probably they (sysadmins) restored the backup
xlog file to the original folder; that's why its attributes
and ownership had changed. Do you think this will resolve the issue? BTW, can
you suggest something on the clean-up? I don't think I'll recommend what they
are doing right now, deleting the files in pg_xlog. What's the best way for
them to do the clean-up? Which files can be "deleted"? Please advise. Thanks!

# sudo /sbin/service postgresql start
   Starting postgresql service:   [FAILED]
# su - postgres -c postmaster
   DEBUG: database system was shut down at 2005-11-13 22:14:02 PHT
   DEBUG: checkpoint record is at 0/51C059A4
   DEBUG: redo record is at 0/51C059A4; undo record is at 0/0; shutdown TRUE
   DEBUG: next transaction id: 7517511; next oid: 492827
   FATAL 2:   open of /var/lib/pgsql/data/pg_xlog/0000000000000052 (logfile 0, segment 82) failed: Permission denied
   DEBUG: startup process (pid 2502) exited with exit code 2
   DEBUG: aborting startup due to startup process failure
# chmod 600 0000000000000052
# chown postgres:postgres 0000000000000052
# sudo /sbin/service postgresql start
   Starting postgresql service:   [OK]
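
The "Permission denied" failure in the transcript above suggests checking
the whole pg_xlog directory rather than one file at a time. The following
is a minimal sketch, assuming (as in the transcript) that the files should
be owned by postgres with mode 600; list_suspect_files is a hypothetical
helper name, not a PostgreSQL tool.

```shell
#!/bin/sh
# List regular files under DIR that are not owned by OWNER or whose
# mode is not exactly 600. Prints nothing when every file matches.
# list_suspect_files is a hypothetical helper, not part of PostgreSQL.
list_suspect_files() {
    dir="$1"
    owner="$2"
    find "$dir" -type f \( ! -user "$owner" -o ! -perm 600 \) -print
}

# Example, using the path and user reported in the thread:
#   list_suspect_files /var/lib/pgsql/data/pg_xlog postgres
```

Any file it prints is a candidate for the same chown and chmod shown in
the transcript. The check only reports; it does not change anything.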






Re: [HACKERS] Major Problem, need help! Can't run our

From
"Jonah H. Harris"
Date:
I agree.
 
(sorry again Tom... dang GMAIL should default reply to all.... grrrr!)
 
On 11/14/05, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Rod Taylor <pg@rbt.ca> writes:
> On Mon, 2005-11-14 at 23:02 -0500, Tom Lane wrote:
>> There's something in what you say.  We'd have to rename pg_clog as well,
>> since that's even more critical than pg_xlog ...

> Rename them to pg_donttouchthis and pg_youneedthis.

:-)

On a more serious level: Tim's suggestion of "pg_wal" for pg_xlog sounds
fine to me.  How about "pg_trans" for pg_clog, by analogy to the
existing pg_subtrans?  Nothing else in the standard layout looks like
it's got a name that a newbie would think means discardable data.

                       regards, tom lane


Re: Major Problem, need help! Can't run our website!

From
Andrew Sullivan
Date:
On Tue, Nov 15, 2005 at 01:53:27PM +0800, ITS ONT Alcazar, Jose Aguedo C wrote:
> Thanks Tom! I think we did it. Below are the things I entered. Also, I
> changed the attributes and ownership of the xlog file to match those of
> the previous file. Probably they (sysadmins) restored the backup
> xlog file to the original folder; that's why its attributes
> and ownership had changed. Do you think this will resolve the issue? BTW, can
> you suggest something on the clean-up? I don't think I'll recommend what they
> are doing right now, deleting the files in pg_xlog. What's the best way for
> them to do the clean-up? Which files can be "deleted"? Please advise. Thanks!

Actually, given that you are nearly full on that disk, you probably
need more disk.  You ought to be able to trim some things out of
/var/log.  But you can't delete anything from the PostgreSQL area
(/var/lib/pgsql/*, it looks like).  There's no "clean up" you can do
there -- your data is there.

You _may_ need a vacuum full -- if you have a lot of dead tuples, you
might be able to recover some disk space.  It'll lock all the tables,
so you'll need to be able to take an outage to do this.  In the psql
command monitor (psql -U postgres [yourdbname]) you can issue a
VACUUM FULL VERBOSE and recover available space (and find out how
much you're recovering -- that's what the VERBOSE does).  Note that
this may take a _long_ time to complete (depending on how big your
database is), and (as I noted) it will cause your application not to
work while it's going on.
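
The VACUUM FULL step described above can also be run non-interactively.
This is a minimal sketch, assuming psql is on the PATH and that the
postgres superuser can connect; run_vacuum_full and the DRY_RUN switch
are hypothetical conveniences, not part of PostgreSQL.

```shell
#!/bin/sh
# Run VACUUM FULL VERBOSE against one database during a planned outage.
# run_vacuum_full and DRY_RUN are illustrative names. With DRY_RUN=1 the
# command is printed instead of executed, so it can be reviewed first.
run_vacuum_full() {
    db="$1"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf 'psql -U postgres -d %s -c "VACUUM FULL VERBOSE"\n' "$db"
    else
        psql -U postgres -d "$db" -c "VACUUM FULL VERBOSE"
    fi
}

# Example: review the command, then run it for real.
#   DRY_RUN=1 run_vacuum_full yourdbname
#   run_vacuum_full yourdbname
```

As noted above, the tables stay locked while this runs, so the website
should be taken offline first.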

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
"The year's penultimate month" is not in truth a good way of saying
November.
        --H.W. Fowler

Re: Major Problem, need help! Can't run our website!

From
"Jim C. Nasby"
Date:
http://www.postgresql.org/support/professional_support is an even better
place to start, since it lists all companies who've registered as
providing support. :)

On Mon, Nov 14, 2005 at 09:06:20PM -0800, Aaron Glenn wrote:
> On 11/14/05, ITS ONT Alcazar, Jose Aguedo C
> <jacalcazar@exportbank.com.ph> wrote:
> > Tom, as I said in my previous email, I have no background in PostgreSQL.
> > Can you help me with how to restart this? My apologies for my zero knowledge,
> > but I really need this to work. Thanks!
> >
>
> In a situation like this, you want to hire a knowledgeable PostgreSQL
> consultant. www.commandprompt.com is a good place to start.
>
> aaron.glenn
>

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461