Thread: Re: how to store more than 3 MB of character data in Postgres

Re: how to store more than 3 MB of character data in Postgres

From

Richard Huxton

Date:

06 June 2005, 11:55:30

Vadivel Subramaniam wrote:
> 
> I assume, it could not be a problem with ODBC. I am able to store 2.5 MB of
> data into Oracle using the same ODBC APIs.

Well, it certainly isn't to do with PG itself:

$ cat bigtest.pl
#!/usr/bin/perl
print "CREATE TABLE foo (a int, b varchar);\n";
print "INSERT INTO foo VALUES (1, '".('a' x 5000000)."');\n";

$ perl bigtest.pl | psql -Urichardh richardh
CREATE TABLE
INSERT 3634376 1
$ psql -Urichardh richardh

richardh=> SELECT a,length(b) FROM foo; a | length
---+--------- 1 | 5000000
(1 row)


> Only difference is that Oracle supports CLOB datatype, that is mapped to
> LONG VARCHAR in ODBC.

> The input parameter during insert is bound using the below ODBC API
>       retStatus = SQLBindParameter(dStmtHandle, 1, SQL_PARAM_INPUT,
>             SQL_C_CHAR, SQL_LONGVARCHAR, dLen, 1, pData, 0,  &dLen);
> 
> For PostGres also the same API is used.  Here varchar/text at the Postgres
> schema is mapped to LONGVARCHAR at ODBC.
> We can not go for bytea etc, since the data is in character format.
> 
> I tried using "text" also at the postgres schema, still the impact is same
> (only stores around 64K).
> Could you please provide some example, how the large character data ( more
> than 1 MB ) can be inserted to text/varchar (in PostGres) using ODBC?

Hmm - looking back at your original email, I see you're using the 
UnixODBC driver. Might be worth trying the PG version:
http://gborg.postgresql.org/project/psqlodbc/genpage.php?downloads

First step though, should be to check whether there are any 
configuration settings that apply to longvarchar, because it sounds like 
the odbc driver should be mapping to that.

I can't help you with finding those settings, since I have only ever 
used the "official" PG driver on Windows.

--  Richard Huxton  Archonet Ltd

What is faster?

From

"ON.KG"

Date:

06 June 2005, 14:25:32

Hi All!

What is faster - SLECTion data from one large table (200 000 - 300 000
records), or SELECTion from a few small tables (example, 2 tables 150
000 records each)?

Thank You

Re: What is faster?

From

Richard Huxton

Date:

06 June 2005, 14:45:29

ON.KG wrote:
> Hi All!
> 
> What is faster - SLECTion data from one large table (200 000 - 300 000
> records), or SELECTion from a few small tables (example, 2 tables 150
> 000 records each)?

It depends. Are you selecting all records? One record? A few records? If 
one or a few, do you have a suitable index on the table(s)? Is the table 
clustered? Expected to be cached in RAM?

Do you have a specific problem, or reason to believe you may encounter one?

--   Richard Huxton  Archonet Ltd

Re: What is faster?

From

"ON.KG"

Date:

06 June 2005, 15:02:18

Hi All!

>> What is faster - SLECTion data from one large table (200 000 - 300 000
>> records), or SELECTion from a few small tables (example, 2 tables 150
>> 000 records each)?

RH> It depends. Are you selecting all records? One record? A few records? If
RH> one or a few, do you have a suitable index on the table(s)? Is the table
RH> clustered? Expected to be cached in RAM?

RH> Do you have a specific problem, or reason to believe you may encounter one?

It's a real problem

For example i have two large tables
Structure of tables is same - has two fields - id, ip

Now i'm using two selection from each in one transaction
Each of them selects only one record
selection clase like WHERE ip = 'xxx.xxx.xxx.xxx'
so it is searches existance of IP in each table

tables are clustered

about cached in RAM - i'm novice in Postgresql - how does it work?

now i need to make much faster as it is possible
and have an idea just merge two tables in one - will it help me?

Thank You

Re: What is faster?

From

Richard Huxton

Date:

06 June 2005, 16:20:50

ON.KG wrote:
> Hi All!
> 
>>>What is faster - SLECTion data from one large table (200 000 - 300 000
>>>records), or SELECTion from a few small tables (example, 2 tables 150
>>>000 records each)?
> 
> RH> It depends. Are you selecting all records? One record? A few records? If
> RH> one or a few, do you have a suitable index on the table(s)? Is the table
> RH> clustered? Expected to be cached in RAM?
> 
> RH> Do you have a specific problem, or reason to believe you may encounter one?
> 
> It's a real problem
> 
> For example i have two large tables
> Structure of tables is same - has two fields - id, ip
> 
> Now i'm using two selection from each in one transaction
> Each of them selects only one record
> selection clase like WHERE ip = 'xxx.xxx.xxx.xxx'
> so it is searches existance of IP in each table
> 
> tables are clustered

OK - so the tables aren't updated frequently, I assume. Do you have an 
index on "ip"?

> about cached in RAM - i'm novice in Postgresql - how does it work?

The operating-system will keep frequently used disk-blocks in memory. 
You don't have to do anything. Have you done any performance tuning in 
your postgresql.conf file? If not, try reading:  http://www.powerpostgresql.com/PerfList

> now i need to make much faster as it is possible
> and have an idea just merge two tables in one - will it help me?

If they hold the same information, they probably shouldn't have been 
split in the first place.

--   Richard Huxton  Archonet Ltd

Re: What is faster?

From

"ON.KG"

Date:

07 June 2005, 05:28:34

RH> ON.KG wrote:
>> Hi All!
>> 
>>>>What is faster - SLECTion data from one large table (200 000 - 300 000
>>>>records), or SELECTion from a few small tables (example, 2 tables 150
>>>>000 records each)?
>> 
>> For example i have two large tables
>> Structure of tables is same - has two fields - id, ip
>> 
>> Now i'm using two selection from each in one transaction
>> Each of them selects only one record
>> selection clase like WHERE ip = 'xxx.xxx.xxx.xxx'
>> so it is searches existance of IP in each table
>> 
>> tables are clustered

RH> OK - so the tables aren't updated frequently, I assume. Do you have an
RH> index on "ip"?

Yes

>> about cached in RAM - i'm novice in Postgresql - how does it work?

RH> The operating-system will keep frequently used disk-blocks in memory.
RH> You don't have to do anything. Have you done any performance tuning in
RH> your postgresql.conf file? If not, try reading:
RH>    http://www.powerpostgresql.com/PerfList

Server Administrator says, he did

>> now i need to make much faster as it is possible
>> and have an idea just merge two tables in one - will it help me?

RH> If they hold the same information, they probably shouldn't have been
RH> split in the first place.

Content of tables is not absolutely same
One has one kind of IPs, second - others

And there's one more addition in question - if I will merge tables,
in new table will be set new additional field - `type` char - to
determine type of IP

Thank You

Re: What is faster?

From

Achilleus Mantzios

Date:

07 June 2005, 06:55:53

O ON.KG έγραψε στις Jun 7, 2005 :

> RH> ON.KG wrote:
> >> Hi All!
> >> 
> >>>>What is faster - SLECTion data from one large table (200 000 - 300 000
> >>>>records), or SELECTion from a few small tables (example, 2 tables 150
> >>>>000 records each)?
> >> 
> >> For example i have two large tables
> >> Structure of tables is same - has two fields - id, ip
> >> 
> >> Now i'm using two selection from each in one transaction
> >> Each of them selects only one record
> >> selection clase like WHERE ip = 'xxx.xxx.xxx.xxx'
> >> so it is searches existance of IP in each table
> >> 
> >> tables are clustered
> 
> RH> OK - so the tables aren't updated frequently, I assume. Do you have an
> RH> index on "ip"?
> 
> Yes
> 
> >> about cached in RAM - i'm novice in Postgresql - how does it work?
> 
> RH> The operating-system will keep frequently used disk-blocks in memory.
> RH> You don't have to do anything. Have you done any performance tuning in
> RH> your postgresql.conf file? If not, try reading:
> RH>    http://www.powerpostgresql.com/PerfList
> 
> Server Administrator says, he did
> 
> >> now i need to make much faster as it is possible
> >> and have an idea just merge two tables in one - will it help me?
> 
> RH> If they hold the same information, they probably shouldn't have been
> RH> split in the first place.
> 
> Content of tables is not absolutely same
> One has one kind of IPs, second - others
> 
> And there's one more addition in question - if I will merge tables,
> in new table will be set new additional field - `type` char - to
> determine type of IP
> 
> Thank You

Just a thought, also you could try if converting your ip from varchar(32)
to int8 (unsigned int4) would make any difference, since
the index on int8 would be more efficient.

> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if your
>       joining column's datatypes do not match
> 

-- 
-Achilleus