Re: [GSOC 18] Discussion on the datatable - Mailing list pgsql-hackers

From Mark Wong
Subject Re: [GSOC 18] Discussion on the datatable
Date
Msg-id 20180307161057.GA23953@2ndQuadrant.com
Whole thread Raw
List pgsql-hackers
Hi Hongyuan,

On Tue, Mar 06, 2018 at 01:36:23AM +0800, Hongyuan Ma wrote:
> Hi Mark,
> In the past few days I have read some code in pgperffarm.git repository.I look forward to discussing the project in
detailwith you and gradually defining the datasheet structure and refining the requirements. Here are some of my ideas,
ifthere are any errors or deficiencies, please feel free to correct me.
 
> 
> 
> To create a datasheet: pg_perf_cate
> Overview:
> pg_perf_cate table is used to store performance test project categories that support multi-level categories.
> 
> 
> Description:
> In line 12 of the "pgperffarm \ client \ benchmarks \ runner.py" file there is a line like this:
> 
> 
> ''
> 'manages runs of all the benchmarks, including cluster restarts etc.'
> ''
> 
> 
> In my imagination, there may be more items of performance tests than build tests. Based on the above comments, I
guess,for example, may be there are "single instance of performance tests","cluster performance tests", "other
performancetests" three major categories. Each major category also contains their own test sub-categories, such as
additiontests and deletion tests and so on. In the pg_perf_cate table, the cate_pid field indicates the parent category
ofthe current test category. If the pid field of a row of data has a value of 0, the row represents the top-level
category.
> 
> 
> Related Functions:
>  - Maybe in the navigation bar we can create a category menu to help users quickly find their own interest in the
testitems (similar to the Amazon Mall product categories). The cate_order field is used to manage the order of the
categoriesin the current level for easy front-end menu display.
 
>  - In the admin back-end need a page which can add or modify categories.
>
-----------------------------------------------------------------------------------------------------------------------------------------------------
> To create a datasheet: pg_perf_test
> Overview:
> The pg_perf_test table is used to store specific test items, including the test item number(test_id), the name of the
testitem(test_name), the ID of the sub-category(cate_id), the description of the test item (test_desc,to be discussed),
andthe person ID(user_id).
 
> 
> 
> Description:
> In line 15 of the "pgperffarm \ client \ benchmarks \ pgbench.py" file, I see a note like this:
> ''
> # TODO allow running custom scripts, not just the default
> ''
> Now that I want to allow users to run custom test scripts and upload them, I think it is necessary to tell others how
torun the test again. So I want to add a test_desc field that will store the details about this test and tell the user
howto run this test. But I am not too sure about the storage format for the details of the test, perhaps the rich text
formator markdown format is a suitable choice.
 
> When this test item is added by the administrator, the user_id field has a value of 0. Otherwise, this field
correspondsto the user_id field in the user table. For this field, I prefer not to use foreign keys.
 
> 
> 
> Related Functions:
>  - At the front end, each test has its own detail page, on which the test related content is presented and a list of
testresults is listed.
 
>  - In the admin background need a page to manage test items.
>
-----------------------------------------------------------------------------------------------------------------------------------------------------
> To create a datasheet: pg_perf_test_result
> 
> 
> Overview:
> The pg_perf_test_result table is used to store test results, including at least the result ID(result_id), user ID
(user_id,Iprefer not to create a user-test result association table), test item ID(test_id), test branch
number(branch_id),system configuration(os_config), pg configuration(pg_config), test result details(result_desc) , test
time(add_time)and other fields.
 
> Confusion:
> I think compared to other tables, pg_perf_test_result table may be a relatively complex one.
> This piece of code can be seen around line 110 of the "pgperffarm \ client \ benchmarks \ runner.py" file:
> ''
>         r ['meta'] = {
>                 'benchmark': config ['benchmark'],
>                 'date': strftime ("% Y-% m-% d% H:% M:% S.000000 + 00", gmtime ()),
>                 'name': config_name,
>                 'uname': uname,
>         }
> 
> 
>         with open ('% s / results.json'% self._output, 'w') as f:
>             f.write (json.dumps (r, indent = 4))
> ''
> Could you please provide a sample results.json file so that I can better understand what information is contained in
theuploaded data and design the datasheet based on it.
 

Don't let this distract you too much from finishing your current term.
There will be plenty of time to hammer out the schema.

Here's a brief description of the data that is summarized in a json
object:

    The idea is that the json document represents a series of tests
    against one particular commit.  Then detail of each individual test
    run.

    {
        "pgbench": {
            "ro": {
                "results at a scale factor": {
                    "number of clients": {
                        "std": 4389.727756305762,
                        "metric": 41390.590287,
                        "median": 43137.716637,
                        "results": [
                            {
                                ... pgbench results of a test ...
                            }
                            ...
                        ]
                    },
                    ....
            },
            "rw": {
                ... read write test results ...
        },
        ... other tests results
    }


> Related Function(s):
>  - At the front end, when the user clicks the test result item in the list, it jumps to the test result details page,
whichshow the test result information in detail.
 
>
-----------------------------------------------------------------------------------------------------------------------------------------------------
> To create a datasheet: pg_perf_test_branch
> 
> 
> Overview:
> The pg_perf_test_branch table is used to store the branch version, including the branch ID(branch_id) and branch
name(branch_name)fields.
 
> Related Function(s):
> At the front end and back end, provides the ability to search test results by branch version.
>
-----------------------------------------------------------------------------------------------------------------------------------------------------
> To create a data table: pg_perf_user
> Overview:
> pg_perf_user table is used to register the user ID(user_id), user's name, (user_name)user's email
address(user_email),the user's unique key (to be discussed, used to generate user certificates), registration
time(register_time),the last login ip(last_ip) fields.
 
> Confusion:
> I tried to fill in the information on the registration page of the build farm project and click the submit button,
butunfortunately the page shows "500 - Internal Server Error". So I can only try to design the user table by
imagination.If you've ever used buildfarm's registration feature, please tell me more details about it.
 
> 
> 
> Related Functions
>  - Possibly: The user sends a confirmation email and a certificate file to the user's mailbox after registering.
>  - Provid the ability to search test results by username.
>
-----------------------------------------------------------------------------------------------------------------------------------------------------
> In the last email, you mentioned Django Rest Framework. I very much agree to use it, because it can be very good to
helpthe site to achieve the separation of the front and back ends. At the same time I very much hope to use Vue as a
front-enddevelopment framework. Vue provides the concept of component is relatively easy to use. Vue is widely used in
manyprojects (including NASA), so I think vue is trustworthy.
 
> 
> 
> My postgresql.org community account is:
> Username: maleicacid
> Email: cs_maleicacid@163.com
> 
> 
> I hope to get permission to the pgperffarm.git repository. (In fact, I think it would be better to pull request than
tosubmit code directly, because my code may need to be reviewed.) I would like to create the basic model class. 

You can fork the repository somewhere public to allow us to pull and
review from.  That's how we're likely going to work with whoever is
accepted.

Regards,
Mark


pgsql-hackers by date:

Previous
From: David Steele
Date:
Subject: Re: PATCH: Configurable file mode mask
Next
From: Petr Jelinek
Date:
Subject: Re: public schema default ACL