Re: Speed of lo_unlink vs. DELETE on BYTEA - Mailing list pgsql-general
From | Reuven M. Lerner |
---|---|
Subject | Re: Speed of lo_unlink vs. DELETE on BYTEA |
Date | |
Msg-id | 4E7BB8F3.7020608@lerner.co.il |
In response to | Speed of lo_unlink vs. DELETE on BYTEA ("Reuven M. Lerner" <reuven@lerner.co.il>) |
Responses | Re: Speed of lo_unlink vs. DELETE on BYTEA |
List | pgsql-general |
Hi again, everyone. I'm replying to my own posting, to add some information: I decided to do some of my own benchmarking. And if my benchmarks are at all accurate, then I'm left wondering why people use large objects at all, given their clunky API and their extremely slow speed. I'm posting my benchmarks as a sanity check, because I'm blown away by the results.

I basically tried three different scenarios, each with 1,000 and 10,000 records. In each scenario, there was a table named MasterTable containing a SERIAL "id" column and a "one_value" integer column holding a number from generate_series, and a second table named SecondaryTable containing its own SERIAL "id" column, a "one_value" column (from generate_series, identical to the "id" column), and a "master_value" column that's a foreign key back to the main table. That is, here's the definition of the tables in the 10,000-record benchmark:

CREATE TABLE MasterTable (
   id           SERIAL   NOT NULL,
   one_value    INTEGER  NOT NULL,
   PRIMARY KEY(id)
);

INSERT INTO MasterTable (one_value) VALUES (generate_series(1,10000));

CREATE TABLE SecondaryTable (
   id           SERIAL   NOT NULL,
   one_value    INTEGER  NOT NULL,
   master_value INTEGER  NOT NULL REFERENCES MasterTable ON DELETE CASCADE,
   PRIMARY KEY(id)
);

INSERT INTO SecondaryTable (master_value, one_value)
  (SELECT s.a, s.a FROM generate_series(1,10000) AS s(a));

I also had two other versions of SecondaryTable: in one scenario, it had a my_blob column, of type BYTEA, containing 5 million 'x' characters. A final version had a 5-million-character 'x' document loaded into a large object referenced from SecondaryTable.

The idea was simple: I wanted to see how much faster or slower it was to delete (not truncate) all of the records in MasterTable, given these different data types. Would bytea be significantly faster than large objects? How would the cascading delete affect things? And how long does it take to pg_dump with large objects around?

Here are the results, which were pretty dramatic. Basically, large objects seem to always be far, far slower than BYTEA columns, both to delete and to dump. Again, I'm wondering whether I'm doing something wrong here, or if this explains why, in my many years of using PostgreSQL, I've neither used nor been tempted to use large objects before.

1.1 1,000 records
==================

                 Delete    Dump
 ---------------+---------+---------
  Empty content   0.172s    0.057s
  bytea           0.488s    0.066s
  large object   30.833s    9.275s

1.2 10,000 records
===================

                 Delete      Dump
 ---------------+-----------+------------
  Empty content   8.162s      0.064s
  bytea          1m0.417s     0.157s
  large object   4m44.501s   1m38.454s

Any ideas? If this is true, should we be warning people away from large objects in the documentation, and toward bytea?

Reuven
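
P.S. The setup for the BYTEA and large-object variants of SecondaryTable isn't shown above. Here is a minimal sketch of one way such variants could be created; the my_blob and my_blob_oid column names, the lo_import path, and the cleanup trigger are illustrative assumptions, not a transcript of the exact commands used in the benchmark:

-- BYTEA variant: add a 5-million-character document to every row
ALTER TABLE SecondaryTable ADD COLUMN my_blob BYTEA;
UPDATE SecondaryTable SET my_blob = convert_to(repeat('x', 5000000), 'UTF8');

-- Large-object variant: each row stores the OID of a server-side large object
-- imported from a file containing 5 million 'x' characters (one object per row)
ALTER TABLE SecondaryTable ADD COLUMN my_blob_oid OID;
UPDATE SecondaryTable SET my_blob_oid = lo_import('/tmp/five_million_x.txt');

-- Deleting a row would otherwise leave its large object orphaned, so the
-- cascading DELETE is paired with a trigger that lo_unlinks each row's object
CREATE FUNCTION unlink_blob() RETURNS trigger AS $$
BEGIN
    PERFORM lo_unlink(OLD.my_blob_oid);
    RETURN OLD;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER secondarytable_unlink_blob
    BEFORE DELETE ON SecondaryTable
    FOR EACH ROW EXECUTE PROCEDURE unlink_blob();

With a setup along these lines, DELETE FROM MasterTable cascades to SecondaryTable and unlinks one large object per row, which is the kind of work the "Delete" column would be timing in the large-object case.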