Thread: Problem retrieving large records (bytea) data from a table
I am hitting a hang condition every time I try to retrieve large bytea records from a table.

The OS is Solaris 5.11 snv_134 i86pc with 4 GB of memory, running PostgreSQL 8.4.3 with a standard postgresql.conf (nothing has been changed).

I have the following table, doc_table:

    Column    |              Type              |                              Modifiers                                | Storage
--------------+--------------------------------+-----------------------------------------------------------------------+----------
 id           | numeric                        | not null                                                              | main
 file_n       | character varying(4000)        |                                                                       | extended
 create_date  | timestamp(6) without time zone | not null default (clock_timestamp())::timestamp(0) without time zone  | plain
 desc         | character varying(4000)        |                                                                       | extended
 doc_cc       | character varying(120)         | not null                                                              | extended
 by           | numeric                        | not null                                                              | main
 doc_data     | bytea                          |                                                                       | extended
 mime_type_id | character varying(16)          | not null                                                              | extended
 doc_src      | text                           |                                                                       | extended
 doc_stat     | character varying(512)         | not null default 'ACTIVE'::character varying                          | extended
Indexes:
    "documents_pk" PRIMARY KEY, btree (document_id)

A while ago some developers inserted several records whose document (stored in doc_data) was around 400 - 450 MB each. Now when you do a select * (all, no where clause) from this table you get a hang and the system becomes unresponsive. Prior to these inserts, a select * worked. I'm also told a select * from doc_table where id = xxx still works. I haven't seen any error messages in the PostgreSQL log files.

So I'm not sure how to find these bad records or why I am getting a hang. Since this PostgreSQL is running with the default config files, could I be running out of a resource? If so, I'm not sure how much to add to these resources to fix the problem, since I have very little memory on this system. Does anyone have any ideas why I am getting a hang? Thanks
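[A query along these lines can locate the oversized rows without pulling the documents through the client. This is a sketch, not from the thread: it assumes the id/file_n/doc_data column names shown in the table above, and uses octet_length(), which returns the byte length of a bytea value.]

```sql
-- List rows whose document exceeds ~100 MB, largest first.
-- octet_length(doc_data) computes the size server-side, so the
-- huge values are never shipped to the client.
SELECT id,
       file_n,
       octet_length(doc_data) AS doc_bytes
FROM   doc_table
WHERE  octet_length(doc_data) > 100 * 1024 * 1024
ORDER  BY doc_bytes DESC;
```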
You could take a backup of this table, then search for your documents with an editor such as UltraEdit and remove them.

2011/7/5, jtkells@verizon.net <jtkells@verizon.net>:
> I am having a hang condition every time I try to retrieve a large
> records (bytea) data from a table
> [...]
> So I'm not sure how to find these bad records and why I am getting a
> hang. Since this postgresql is running with the default config files
> could I be running out of a resource?

--
------------
pasman
jtkells@verizon.net, 05.07.2011 18:44:
> A while ago some developers inserted several records with a
> document (stored in doc_data) that was around 400 - 450 MB each. Now
> when you do a select * (all) from this table you get a hang and the
> system becomes unresponsive.

Which application/program is "hanging": the client that retrieves the data, or PostgreSQL itself?

> So I'm not sure how to find these bad records and why I am getting a
> hang.

If your client application is hanging, I don't think there is a single "bad" record; it simply chokes on the size of the result set.

Regards
Thomas
He could use something like this to bucket the rows by size:

SELECT count(*),
       CASE WHEN length(doc_data) <  50000000 THEN '<= 50 MB'
            WHEN length(doc_data) < 100000000 THEN '<= 100 MB'
            ELSE '> 100 MB'
       END
FROM doc_table
GROUP BY 2;

and then, based on the result, run finer queries to find the large rows. However, I don't know whether cursor-based queries (like the ones used by JDBC) are affected by the size; normally they would not be. Querying this way from psql, though, will definitely be slow.

On Wednesday 20 July 2011 17:31:45, pasman pasmański wrote:
> You may do a backup of this table. Then with ultraedit search your
> documents and remove them.
> [...]

--
Achilleas Mantzios
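[The cursor-based fetching mentioned above can also be done by hand in psql. A minimal sketch, assuming the doc_table from the original post; the cursor name doc_cur is made up for illustration:]

```sql
-- Fetch the table in small batches inside a transaction, so the
-- client never has to buffer the whole result set at once.
BEGIN;
DECLARE doc_cur CURSOR FOR SELECT * FROM doc_table;
FETCH 10 FROM doc_cur;   -- repeat until no rows come back
CLOSE doc_cur;
COMMIT;
```

This is essentially what JDBC does under the hood when a fetch size is set on the statement.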
PostgreSQL has to accumulate all the rows of a query before returning the result set to the client. It is probably spooling those several 400-450 MB docs, plus all the other attributes, to a temporary file prior to sending the results back. If you have just three documents stored in the database you're looking at > 1 GB for the spool file alone.

Remember, select * is a convenience. You will probably get the same response time as before if you name the columns, except doc_data, in the select clause of your query.

See the 'extended' attribute of doc_data? That means the bytea data is stored out-of-line from the other columns like id, create_date and by. See http://www.postgresql.org/docs/9.0/interactive/storage-toast.html for the particulars of TOAST.

If you need to remove the doc data from the table, a quick way to do that would be either to update the table and set doc_data to NULL, or to use the "create table as select" (CTAS) syntax and specify NULL as the value for doc_data, then drop the original table and rename the new one to doc_table. Note that if you use the CTAS method you will have to alter the table afterwards to re-establish the not null and default attributes of each column. Don't forget to recreate the primary key, too.

Finally, talk with the developers to see if the document data really needs to be in the database, or could just be in a file outside of the database. If you need transactional semantics (ACID properties) to manage the documents you may be stuck. If not, replace doc_data with doc_filename (or maybe file_n is that column already) and move on from there.

Good luck,

Bob Lunney
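[The CTAS route described above might look like the following. A sketch only: the constraint name, the quoting of the reserved words "desc" and "by", and putting the primary key on id (the \d output in the original post shows an index on document_id but no such column) are all assumptions to be checked against the real schema.]

```sql
-- Rebuild the table with the document payloads replaced by NULL.
CREATE TABLE doc_table_new AS
SELECT id, file_n, create_date, "desc", doc_cc, "by",
       NULL::bytea AS doc_data,
       mime_type_id, doc_src, doc_stat
FROM doc_table;

-- CTAS copies neither constraints nor defaults, so re-establish them.
ALTER TABLE doc_table_new
  ALTER COLUMN id           SET NOT NULL,
  ALTER COLUMN doc_cc       SET NOT NULL,
  ALTER COLUMN "by"         SET NOT NULL,
  ALTER COLUMN mime_type_id SET NOT NULL,
  ALTER COLUMN doc_stat     SET NOT NULL,
  ALTER COLUMN doc_stat     SET DEFAULT 'ACTIVE',
  ALTER COLUMN create_date  SET NOT NULL,
  ALTER COLUMN create_date  SET DEFAULT (clock_timestamp())::timestamp(0) without time zone;

-- Swap the tables and recreate the primary key.
DROP TABLE doc_table;
ALTER TABLE doc_table_new RENAME TO doc_table;
ALTER TABLE doc_table ADD CONSTRAINT documents_pk PRIMARY KEY (id);
```

The single-statement alternative, UPDATE doc_table SET doc_data = NULL, is simpler but rewrites every row in place and leaves the table bloated until a VACUUM FULL.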
--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin