Thread: Problem retrieving large records (bytea) data from a table
I am hitting a hang condition every time I try to retrieve large bytea records from a table.

The OS is Solaris 5.11 snv_134 i86pc with 4 GB of memory, running PostgreSQL 8.4.3 with a standard postgresql.conf (nothing has been changed).

I have the following table, doc_table:

    Column    |              Type              |                              Modifiers                                | Storage
--------------+--------------------------------+-----------------------------------------------------------------------+----------
 id           | numeric                        | not null                                                              | main
 file_n       | character varying(4000)        |                                                                       | extended
 create_date  | timestamp(6) without time zone | not null default (clock_timestamp())::timestamp(0) without time zone  | plain
 desc         | character varying(4000)        |                                                                       | extended
 doc_cc       | character varying(120)         | not null                                                              | extended
 by           | numeric                        | not null                                                              | main
 doc_data     | bytea                          |                                                                       | extended
 mime_type_id | character varying(16)          | not null                                                              | extended
 doc_src      | text                           |                                                                       | extended
 doc_stat     | character varying(512)         | not null default 'ACTIVE'::character varying                          | extended
Indexes:
    "documents_pk" PRIMARY KEY, btree (document_id)

A while ago some developers inserted several records whose document (stored in doc_data) was around 400 - 450 MB each. Now when you do a select * (all, no where clause) from this table you get a hang and the system becomes unresponsive. Prior to these inserts, a select * worked. I'm also told a select * from doc_table where id = xxx still works. I haven't seen any error messages in the PostgreSQL log files.

So I'm not sure how to find these bad records or why I am getting a hang. Since this PostgreSQL is running with the default config files, could I be running out of a resource? If so, I'm not sure how much to add to these resources to fix the problem, since I have very little memory on this system. Does anyone have any ideas why I am getting a hang? Thanks
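[A query along these lines can locate the oversized rows without pulling the documents through the client. This is a sketch, not from the thread: it assumes the id/file_n/doc_data column names shown in the table above, and uses octet_length(), which returns the byte length of a bytea value.]

```sql
-- List rows whose document exceeds ~100 MB, largest first.
-- octet_length(doc_data) computes the size server-side, so the
-- huge values are never shipped to the client.
SELECT id,
       file_n,
       octet_length(doc_data) AS doc_bytes
FROM   doc_table
WHERE  octet_length(doc_data) > 100 * 1024 * 1024
ORDER  BY doc_bytes DESC;
```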
You could take a backup of this table, then search for your documents with an editor such as UltraEdit and remove them.

2011/7/5, jtkells@verizon.net <jtkells@verizon.net>:
> I am having a hang condition every time I try to retrieve a large
> records (bytea) data from a table
> [...]
> So I'm not sure how to find these bad records and why I am getting a
> hang. Since this postgresql is running with the default config files
> could I be running out of a resource?

--
------------
pasman
jtkells@verizon.net, 05.07.2011 18:44:
> A while ago some developers inserted several records with a
> document (stored in doc_data) that was around 400 - 450 MB each. Now
> when you do a select * (all) from this table you get a hang and the
> system becomes unresponsive.

Which application/program is "hanging": the client that retrieves the data, or PostgreSQL itself?

> So I'm not sure how to find these bad records and why I am getting a
> hang.

If your client application is hanging, I don't think there is a single "bad" record; it simply chokes on the size of the result set.

Regards
Thomas
He could use something like this to bucket the rows by size:

SELECT count(*),
       CASE WHEN length(doc_data) <  50000000 THEN '<= 50 MB'
            WHEN length(doc_data) < 100000000 THEN '<= 100 MB'
            ELSE '> 100 MB'
       END
FROM doc_table
GROUP BY 2;

and then, based on the result, run finer queries to find the large rows. However, I don't know whether cursor-based queries (like the ones used by JDBC) are affected by the size; normally they would not be. Querying this way from psql, though, will definitely be slow.

On Wednesday 20 July 2011 17:31:45, pasman pasmański wrote:
> You may do a backup of this table. Then with ultraedit search your
> documents and remove them.
> [...]

--
Achilleas Mantzios
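[The cursor-based fetching mentioned above can also be done by hand in psql. A minimal sketch, assuming the doc_table from the original post; the cursor name doc_cur is made up for illustration:]

```sql
-- Fetch the table in small batches inside a transaction, so the
-- client never has to buffer the whole result set at once.
BEGIN;
DECLARE doc_cur CURSOR FOR SELECT * FROM doc_table;
FETCH 10 FROM doc_cur;   -- repeat until no rows come back
CLOSE doc_cur;
COMMIT;
```

This is essentially what JDBC does under the hood when a fetch size is set on the statement.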
PostgreSQL has to accumulate all the rows of a query before returning the result set to the client. It is probably spooling those several 400-450 MB docs, plus all the other attributes, to a temporary file prior to sending the results back. If you have just three documents stored in the database you're looking at > 1 GB for the spool file alone.

Remember, select * is a convenience. You will probably get the same response time as before if you name the columns, except doc_data, in the select clause of your query.

See the 'extended' attribute of doc_data? That means the bytea data is stored out-of-line from the other columns like id, create_date and by. See http://www.postgresql.org/docs/9.0/interactive/storage-toast.html for the particulars of TOAST.

If you need to remove the doc data from the table, a quick way to do that would be either to update the table and set doc_data to NULL, or to use the "create table as select" (CTAS) syntax and specify NULL as the value for doc_data, then drop the original table and rename the new one to doc_table. Note that if you use the CTAS method you will have to alter the table afterwards to re-establish the not null and default attributes of each column. Don't forget to recreate the primary key, too.

Finally, talk with the developers to see if the document data really needs to be in the database, or could just be in a file outside of the database. If you need transactional semantics (ACID properties) to manage the documents you may be stuck. If not, replace doc_data with doc_filename (or maybe file_n is that column already) and move on from there.

Good luck,

Bob Lunney
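[The CTAS route described above might look like the following. A sketch only: the constraint name, the quoting of the reserved words "desc" and "by", and putting the primary key on id (the \d output in the original post shows an index on document_id but no such column) are all assumptions to be checked against the real schema.]

```sql
-- Rebuild the table with the document payloads replaced by NULL.
CREATE TABLE doc_table_new AS
SELECT id, file_n, create_date, "desc", doc_cc, "by",
       NULL::bytea AS doc_data,
       mime_type_id, doc_src, doc_stat
FROM doc_table;

-- CTAS copies neither constraints nor defaults, so re-establish them.
ALTER TABLE doc_table_new
  ALTER COLUMN id           SET NOT NULL,
  ALTER COLUMN doc_cc       SET NOT NULL,
  ALTER COLUMN "by"         SET NOT NULL,
  ALTER COLUMN mime_type_id SET NOT NULL,
  ALTER COLUMN doc_stat     SET NOT NULL,
  ALTER COLUMN doc_stat     SET DEFAULT 'ACTIVE',
  ALTER COLUMN create_date  SET NOT NULL,
  ALTER COLUMN create_date  SET DEFAULT (clock_timestamp())::timestamp(0) without time zone;

-- Swap the tables and recreate the primary key.
DROP TABLE doc_table;
ALTER TABLE doc_table_new RENAME TO doc_table;
ALTER TABLE doc_table ADD CONSTRAINT documents_pk PRIMARY KEY (id);
```

The single-statement alternative, UPDATE doc_table SET doc_data = NULL, is simpler but rewrites every row in place and leaves the table bloated until a VACUUM FULL.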
--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin