Thread: DBT-5 Stored Procedure Development (2022)

DBT-5 Stored Procedure Development (2022)

From: Mahesh Gouru

Dear all,

Please review the attached for my jerry-rigged project proposal. I am seeking to continually refactor the proposal as I can!

Thanks,
Mahesh

Re: DBT-5 Stored Procedure Development (2022)

From: Peter Geoghegan

On Tue, Apr 19, 2022 at 11:02 AM Mahesh Gouru <mahesh.gouru@gmail.com> wrote:
> Please review the attached for my jerry-rigged project proposal. I am
> seeking to continually refactor the proposal as I can!

I for one see a lot of value in this proposal. I think it would be
great to revive DBT-5, since TPC-E has a number of interesting
bottlenecks that we'd likely learn something from. It's particularly
good at stressing concurrency control, which TPC-C really doesn't do.
It's also a lot easier to run smaller benchmarks that don't require
lots of storage space, but are nevertheless correct according to the
spec.

-- 
Peter Geoghegan



Re: DBT-5 Stored Procedure Development (2022)

From: Mark Wong

Hi Mahesh,

On Tue, Apr 19, 2022 at 02:01:54PM -0400, Mahesh Gouru wrote:
> Dear all,
> 
> Please review the attached for my jerry-rigged project proposal. I am
> seeking to continually refactor the proposal as I can!

My comments might be briefer than they should be, but I need to write this
quickly.  :)

* The 4 steps in the description aren't needed, they already exist.
* May 20: I think this should be more about reviewing the TPC-E
  specification rather than industry research, as we want to try to
  follow specification guidelines.
* June 20: Random data generation and scaling are provided by and
  already defined by the spec
* Aug 01: A report generator already exists, but I think time could be
  allocated to redoing the raw HTML generation with something like
  reStructuredText, something that is easier to generate with scripts
  and convertible into other formats with other tools

As some of the tasks proposed are actually in place, one other task could be
updating egen (the TPC-supplied code). The kit was last developed against
1.12, and 1.14 is current as of this email.

Regards,
Mark



Re: DBT-5 Stored Procedure Development (2022)

From: Peter Geoghegan

On Tue, Apr 19, 2022 at 11:31 AM Mark Wong <markwkm@gmail.com> wrote:
> As some of the tasks proposed are actually in place, one other task could be
> updating egen (the TPC-supplied code). The kit was last developed against
> 1.12, and 1.14 is current as of this email.

As you know, I have had some false starts with using DBT-5 on a modern
Linux distribution. Perhaps I gave up too easily at the time, but I'm
definitely still interested. Has there been work on that since?

Thanks
-- 
Peter Geoghegan



Re: DBT-5 Stored Procedure Development (2022)

From: Mark Wong

On Tue, Apr 19, 2022 at 05:20:50PM -0700, Peter Geoghegan wrote:
> On Tue, Apr 19, 2022 at 11:31 AM Mark Wong <markwkm@gmail.com> wrote:
> > As some of the tasks proposed are actually in place, one other task could be
> > updating egen (the TPC-supplied code). The kit was last developed against
> > 1.12, and 1.14 is current as of this email.
> 
> As you know, I have had some false starts with using DBT-5 on a modern
> Linux distribution. Perhaps I gave up too easily at the time, but I'm
> definitely still interested. Has there been work on that since?

I'm afraid not.  I'm guessing that pulling in egen 1.14 would address
that.  Maybe it would make sense to put that at the top of the todo list
if this project is accepted...

Regards,
Mark



Re: DBT-5 Stored Procedure Development (2022)

From: Peter Geoghegan

On Tue, Apr 26, 2022 at 10:36 AM Mark Wong <markwkm@gmail.com> wrote:
> I'm afraid not.  I'm guessing that pulling in egen 1.14 would address
> that.  Maybe it would make sense to put that at the top of the todo list
> if this project is accepted...

Wouldn't it be a prerequisite here? I don't actually have any reason
to prefer the old function-based code to the new stored procedure
based code. Really, all I'm looking for is a credible implementation
of TPC-E that I can use to model some aspects of OLTP performance for
my own purposes.

TPC-C (which I have plenty of experience with) has only two secondary
indexes (in typical configurations), and doesn't really stress
concurrency control at all. Plus there are no low cardinality indexes
in TPC-C, while TPC-E has quite a few. Chances are high that I'd learn
something from TPC-E, which has all of these things -- I'm really
looking for bottlenecks, where Postgres does entirely the wrong thing.
It's especially interesting to me as somebody that focuses on B-Tree
indexing.

-- 
Peter Geoghegan



Re: DBT-5 Stored Procedure Development (2022)

From: Mark Wong

On Mon, May 02, 2022 at 07:14:28AM -0700, Mark Wong wrote:
> On Tue, Apr 26, 2022, 10:45 AM Peter Geoghegan <pg@bowt.ie> wrote:
> 
> > On Tue, Apr 26, 2022 at 10:36 AM Mark Wong <markwkm@gmail.com> wrote:
> > > I'm afraid not.  I'm guessing that pulling in egen 1.14 would address
> > > that.  Maybe it would make sense to put that at the top of the todo list
> > > if this project is accepted...
> >
> > Wouldn't it be a prerequisite here? I don't actually have any reason
> > to prefer the old function-based code to the new stored procedure
> > based code. Really, all I'm looking for is a credible implementation
> > of TPC-E that I can use to model some aspects of OLTP performance for
> > my own purposes.
> >
> > TPC-C (which I have plenty of experience with) has only two secondary
> > indexes (in typical configurations), and doesn't really stress
> > concurrency control at all. Plus there are no low cardinality indexes
> > in TPC-C, while TPC-E has quite a few. Chances are high that I'd learn
> > something from TPC-E, which has all of these things -- I'm really
> > looking for bottlenecks, where Postgres does entirely the wrong thing.
> > It's especially interesting to me as somebody that focuses on B-Tree
> > indexing.

I think it could be done in either order.

While it's not ideal that the kit seems to work most reliably as-is on
RHEL/CentOS/etc. 6, I think that could provide some confidence while
getting familiar with the kit on a working platform.  The updates to
the stored functions/procedures would be the same regardless of the egen
version.

If we get the project slot, we can talk further about what to actually
tackle first.

Regards,
Mark