Re: Do we want a hashset type? - Mailing list pgsql-hackers

From jian he
Subject Re: Do we want a hashset type?
Date
Msg-id CACJufxFUeadb3qj8sSM_z2Z9YxGtQn_9YEL3Y=xd=xVVZ_=SaA@mail.gmail.com
In response to Re: Do we want a hashset type?  ("Joel Jacobson" <joel@compiler.org>)
Responses Re: Do we want a hashset type?
List pgsql-hackers


On Thu, Jun 15, 2023 at 5:04 AM Joel Jacobson <joel@compiler.org> wrote:
On Wed, Jun 14, 2023, at 15:16, Tomas Vondra wrote:
> On 6/14/23 14:57, Joel Jacobson wrote:
>> Would it be feasible to teach the planner to utilize the internal hash table of
>> hashset directly? In the case of arrays, the hash table construction is an
...
> It's definitely something I'd leave out of v0, personally.

OK, thanks for guidance, I'll stay away from it.

I've been doing some preparatory work on this todo item:

> 3) support for other types (now it only works with int32)

I've renamed the type from "hashset" to "int4hashset",
and the SQL-functions are now prefixed with "int4"
when necessary. The overloaded functions with
int4hashset as input parameters don't need to be prefixed,
e.g. hashset_add(int4hashset, int).

Other changes since last update (4e60615):

* Support creation of empty hashset using '{}'::hashset
* Introduced a new function hashset_capacity() to return the current capacity
  of a hashset.
* Refactored hashset initialization:
  - Replaced hashset_init(int) with int4hashset() to initialize an empty hashset
    with zero capacity.
  - Added int4hashset_with_capacity(int) to initialize a hashset with
    a specified capacity.
* Improved README.md and testing

As a next step, I'm planning on adding int8 support.

Looks and sounds good?

/Joel

I'm still playing around with hashset-0.0.1-a8a282a.patch.

I think "postgres.h" should be the first include (someone said so in another email thread; I forget who).

In my local /home/jian/postgres/pg16/include/postgresql/server/libpq/pqformat.h:
/*
 * Append a binary integer to a StringInfo buffer
 *
 * This function is deprecated; prefer use of the functions above.
 */
static inline void
pq_sendint(StringInfo buf, uint32 i, int b)

So I changed it to pq_sendint32.
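For reference, pq_sendint32 appends a 4-byte integer to the buffer in network byte order. Below is a minimal standalone sketch of that behavior, not PostgreSQL's code: ByteBuf and buf_send_int32 are hypothetical stand-ins for StringInfo and pq_sendint32, used only so the example compiles without the server headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size stand-in for PostgreSQL's StringInfo. */
typedef struct ByteBuf
{
    unsigned char data[64];
    size_t        len;
} ByteBuf;

/*
 * Append a 32-bit integer in network byte order (most significant byte
 * first), which is what pq_sendint32 does for a StringInfo.
 * Returns 0 on success, -1 if the buffer has no room left.
 */
static int
buf_send_int32(ByteBuf *buf, uint32_t value)
{
    if (buf->len + 4 > sizeof(buf->data))
        return -1;
    buf->data[buf->len++] = (unsigned char) (value >> 24);
    buf->data[buf->len++] = (unsigned char) (value >> 16);
    buf->data[buf->len++] = (unsigned char) (value >> 8);
    buf->data[buf->len++] = (unsigned char) value;
    return 0;
}
```

Unlike the deprecated pq_sendint, there is no width argument here; the width is fixed by the function name, so the call site cannot pass a mismatched size.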

Leading and trailing whitespace, and whitespace between elements, should be stripped. The following C example seems OK for now, but I am not sure about it, and I don't know how to glue it into hashset_in.

Please forgive the patch name....

/*
gcc /home/jian/Desktop/regress_pgsql/strip_white_space.c && ./a.out
*/

#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <stdbool.h>
#include <ctype.h>
#include <stdlib.h>

/*
 * array_isspace() --- a non-locale-dependent isspace()
 *
 * We used to use isspace() for parsing array values, but that has
 * undesirable results: an array value might be silently interpreted
 * differently depending on the locale setting.  Now we just hard-wire
 * the traditional ASCII definition of isspace().
 */
static bool
array_isspace(char ch)
{
    if (ch == ' ' ||
        ch == '\t' ||
        ch == '\n' ||
        ch == '\r' ||
        ch == '\v' ||
        ch == '\f')
        return true;
    return false;
}

int main(void)
{
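    long *temp   = malloc(10 * sizeof(long));
    if (temp == NULL)
        exit(EXIT_FAILURE);
    /* Zero the whole array; the size is in bytes, not elements */
    memset(temp, 0, 10 * sizeof(long));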
    char    source[5][50]   = {{0}};
    snprintf(source[0],sizeof(source[0]),"%s","  { 1   ,   20  }");
    snprintf(source[1],sizeof(source[0]),"%s","   { 1      ,20 ,   30 ");
    snprintf(source[2],sizeof(source[0]),"%s","   {1      ,20 ,   30 ");
    snprintf(source[3],sizeof(source[0]),"%s","   {1      ,  20 ,   30  }");
    snprintf(source[4],sizeof(source[0]),"%s","   {1      ,  20 ,   30  } ");
    /* Make a modifiable copy of the input */
    char    *p;
    char    string_save[50];
   
    for(int j = 0; j < 5; j++)
    {
        snprintf(string_save,sizeof(string_save),"%s",source[j]);
        p = string_save;

        int     i = 0;
        while (array_isspace(*p))
            p++;
        if (*p != '{')
        {
            printf("line: %d should be {\n",__LINE__);
            exit(EXIT_FAILURE);
        }

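        /* Step past the opening brace checked above */
        p++;

        for (;;)
        {
            char   *q;

            temp[i] = strtol(p, &q, 10);
            printf("temp[j=%d] [%d]=%ld\n", j, i, temp[i]);

            /* Skip whitespace after the number before checking the delimiter */
            while (array_isspace(*q))
                q++;
            if (*q == '}')
            {
                /* Only whitespace may follow the closing brace */
                q++;
                while (array_isspace(*q))
                    q++;
                if (*q == '\0')
                    printf("all works ok now exit\n");
                else
                    printf("wrong format: junk after closing brace\n");
                break;
            }
            if (*q != ',')
            {
                /* Missing closing brace or stray character: give up on this input */
                printf("wrong format: expected ',' or '}'\n");
                break;
            }
            p = q + 1;
            i++;
        }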
    }
    free(temp);
    return 0;
}
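On the question of gluing this into hashset_in: one possible shape is a helper that returns a status instead of printing and exiting, so the caller decides how to report the error. The sketch below is only an illustration under assumptions: parse_int_set, its signature, and the fixed-size output array are hypothetical and are not taken from the patch. A real input function would presumably raise an error via ereport, add the values to the hashset directly, and range-check against int32.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* ASCII-only whitespace test, same idea as array_isspace() above. */
static bool
is_ascii_space(char ch)
{
    return ch == ' ' || ch == '\t' || ch == '\n' ||
           ch == '\r' || ch == '\v' || ch == '\f';
}

/*
 * Hypothetical helper: parse text such as "{1, 2, 3}" into out[0..max-1].
 * Returns true and stores the element count in *count on success.
 * Returns false on malformed input or when more than max values appear.
 */
static bool
parse_int_set(const char *str, long *out, size_t max, size_t *count)
{
    const char *p = str;
    size_t      n = 0;

    /* Leading whitespace, then a mandatory opening brace */
    while (is_ascii_space(*p))
        p++;
    if (*p != '{')
        return false;
    p++;

    /* Accept an empty set such as "{}" or "{  }" */
    while (is_ascii_space(*p))
        p++;
    if (*p == '}')
        p++;
    else
    {
        for (;;)
        {
            char   *end;
            long    value;

            /* Skip whitespace ourselves so strtol never sees any */
            while (is_ascii_space(*p))
                p++;
            if (!(*p == '-' || *p == '+' || (*p >= '0' && *p <= '9')))
                return false;

            errno = 0;
            value = strtol(p, &end, 10);
            if (end == p || errno == ERANGE || n >= max)
                return false;
            out[n++] = value;

            /* After a value, expect ',' (more values) or '}' (done) */
            p = end;
            while (is_ascii_space(*p))
                p++;
            if (*p == ',')
            {
                p++;
                continue;
            }
            if (*p == '}')
            {
                p++;
                break;
            }
            return false;
        }
    }

    /* Only whitespace may follow the closing brace */
    while (is_ascii_space(*p))
        p++;
    if (*p != '\0')
        return false;

    *count = n;
    return true;
}
```

With this shape, the two test strings that lack a closing brace are rejected, while the ones with whitespace before or after the brace are accepted.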


