Thread: The Axe list

The Axe list

From
Josh Berkus
Date:
Folks,

It's that time again!  Purging antiquated contrib modules.

chkpass: this module is incomplete and does not implement all functions 
it describes.  It's not really even useful as an Example since it uses 
crypt() and not any modern encryption.  And Darcy hasn't touched it in 6 
years.

intagg: the aggregation function has been obsolete since 7.4 because 
standard array functionality supports the same.  intagg has a nice 
equivalent for UNROLL, but it only works for arrays of INT, and only 
one-dimensional arrays.  Has not been updated since 2001.

Any objections to dropping both of these?

--Josh


Re: The Axe list

From
"Robert Haas"
Date:
> intagg: the aggregation function has been obsolete since 7.4 because
> standard array functionality supports the same.  intagg has a nice
> equivalent for UNROLL, but it only works for arrays of INT, and only
> one-dimensional arrays.  Has not been updated since 2001.

I think this one can be dropped.  The definition of a general
array_accum() is very easy, and supplied in the docs:

CREATE AGGREGATE array_accum (anyelement)
(   sfunc = array_append,   stype = anyarray,   initcond = '{}'
);

A general one-dimensional array enumerator is equally easy to define,
though not in the docs (there is a two-dimensional array enumerator
called unnest2 under generate_subscripts):

CREATE OR REPLACE FUNCTION array_enum(anyarray)   RETURNS SETOF anyelement AS
$$   SELECT $1[i] FROM generate_subscripts($1,1) i
$$ LANGUAGE sql;

With a little more work you could even (crazy thought) add some error checking.

It would probably be helpful if the generate_subscripts and array
documentation were cross-referenced a bit better.  There's no
indication on the array page that generate_subscripts() is out there,
and it's clearly both very useful and array-related.

...Robert


Re: The Axe list

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> Any objections to dropping both of these?

You should ask on -general, not here, if you are trying to find out
whether the modules have any users.

I tend to agree that chkpass is of doubtful value, but I'm not so sure
about intagg.  As you said yourself, we haven't yet replaced its
functionality with a generalized substitute.  Also, it's the only
example in the current codebase of a useful technique, ie, an aggregate
function doing its own memory management.
        regards, tom lane


Re: The Axe list

From
Joshua Drake
Date:
On Fri, 10 Oct 2008 16:28:29 -0700
Josh Berkus <josh@agliodbs.com> wrote:

> Folks,
> 
> It's that time again!  Purging antiquated contrib modules.
> 
> chkpass: this module is incomplete and does not implement all
> functions it describes.  It's not really even useful as an Example
> since it uses crypt() and not any modern encryption.  And Darcy
> hasn't touched it in 6 years.
> 
> intagg: the aggregation function has been obsolete since 7.4 because 
> standard array functionality supports the same.  intagg has a nice 
> equivalent for UNROLL, but it only works for arrays of INT, and only 
> one-dimensional arrays.  Has not been updated since 2001.
> 
> Any objections to dropping both of these?

None.


> 
> --Josh
> 


-- 
The PostgreSQL Company since 1997: http://www.commandprompt.com/ 
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/




Re: The Axe list

From
Ron Mayer
Date:
Josh Berkus wrote:
> intagg: ...  Has not been updated since 2001.

Really?  Just a couple years ago (2005) bugs we reported were
still getting fixed in it:
http://archives.postgresql.org/pgsql-bugs/2005-03/msg00202.php
http://archives.postgresql.org/pgsql-bugs/2005-04/msg00165.php

Here's one use-case I had at the time.
http://archives.postgresql.org/pgsql-general/2005-04/msg01249.php
I think we still use that code, but I suppose I can re-write it
to use features that were added since then.
And hey - where'd README.int_aggregate go? IIRC that had
a couple interesting use-cases too.

I also like intagg, because it's kinda like a "hello world" for
writing one kind of C extensions.   I'm not saying it needs to
stay in contrib.  Perhaps it could live on as an example in the docs?


Re: The Axe list

From
tomas@tuxteam.de
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Fri, Oct 10, 2008 at 09:09:51PM -0700, Ron Mayer wrote:
> Josh Berkus wrote:
>> intagg: ...  Has not been updated since 2001.

[...]

> I also like intagg, because it's kinda like a "hello world" for
> writing one kind of C extensions.   I'm not saying it needs to
> stay in contrib.  Perhaps it could live on as an example in the docs?

and Tom Lane said upthread

> I tend to agree that chkpass is of doubtful value, but I'm not so sure
> about intagg.  As you said yourself, we haven't yet replaced its
> functionality with a generalized substitute.  Also, it's the only
> example in the current codebase of a useful technique, ie, an aggregate
> function doing its own memory management.

So it seems that intagg should rather live in a section "examples" than
in contrib?

(this seems to stem from the promotion of contrib from "just a bunch of
stuff which might be useful" to "useful modules you can use as-is to
extend PostgreSQL, no?)

Regards
- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFI8C/yBcgs9XrR2kYRAr3WAJ0dDY6yVIqTitmjVB2bfih5K3/rzwCfd7wk
7hvtw/pHPyfbEUgGvFT87QQ=
=iofV
-----END PGP SIGNATURE-----


Re: The Axe list

From
Gregory Stark
Date:
"Robert Haas" <robertmhaas@gmail.com> writes:

> CREATE AGGREGATE array_accum (anyelement)
>
> CREATE OR REPLACE FUNCTION array_enum(anyarray)

Have you actually tried these functions on large data sets? They're not in the
same performance league as intagg. Your array_accum is O(n^2)!

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!


Re: The Axe list

From
"Robert Haas"
Date:
>> CREATE AGGREGATE array_accum (anyelement)
>>
>> CREATE OR REPLACE FUNCTION array_enum(anyarray)
>
> Have you actually tried these functions on large data sets?

No.  :-)

> They're not in the same performance league as intagg. Your array_accum is O(n^2)!

It's not "mine" - I copied it from the official PostgreSQL
documentation and used it because it did what I needed!

If it's a bad way to do it, that's certainly an argument for keeping
(or maybe generalizing) intagg.

...Robert


Re: The Axe list

From
Gregory Stark
Date:
"Robert Haas" <robertmhaas@gmail.com> writes:

> If it's a bad way to do it, that's certainly an argument for keeping
> (or maybe generalizing) intagg.

There was actually a patch this past commitfest to *add* functionality to
intagg. When I reviewed it I said it would make more sense to generalize it
and integrate that functionality into the base array operations.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!


Re: The Axe list

From
"Robert Haas"
Date:
>> If it's a bad way to do it, that's certainly an argument for keeping
>> (or maybe generalizing) intagg.
>
> There was actually a patch this past commitfest to *add* functionality to
> intagg. When I reviewed it I said it would make more sense to generalize it
> and integrate that functionality into the base array operations.

I suppose it's just a question of finding enough round tuits.

I might take a look at it but my grasp of toasting and memory
management may not be good enough yet.

...Robert


Re: The Axe list

From
"D'Arcy J.M. Cain"
Date:
On Fri, 10 Oct 2008 16:28:29 -0700
Josh Berkus <josh@agliodbs.com> wrote:
> It's that time again!  Purging antiquated contrib modules.
> 
> chkpass: this module is incomplete and does not implement all functions 
> it describes.  It's not really even useful as an Example since it uses 
> crypt() and not any modern encryption.  And Darcy hasn't touched it in 6 
> years.

Well, I still use it a lot but I have it in my own CVS tree anyway so I
won't miss it in contrib.

However, if all it needs is a modern encryption scheme that's probably
an hour's work.  The only reason that I haven't done so yet is because
I have no use case.  If I am storing encrypted passwords in a database
it's probably because I need to generate many password files from it.
As a result I need to keep it at the LCD.  That's DES.

Which described functions are missing?  I wouldn't mind having a chance
to clean it up before it is removed just in case someone else wants to
grab it from CVS later.

-- 
D'Arcy J.M. Cain <darcy@druid.net>         |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 425 1212     (DoD#0082)    (eNTP)   |  what's for dinner.


Re: The Axe list

From
Ron Mayer
Date:
tomas@tuxteam.de wrote:
> So it seems that intagg should rather live in a section "examples" than
> in contrib?

Perhaps.   Seems my old intagg use case from 8.1 is not really needed
anymore since it seems ANY got much smarter since then.   Cool.




=================================================================================
== With 8.1
=================================================================================
fli=# explain analyze
fli-#   select * from lines, selected_lines
fli-#   where selected_lines.id = 16238 and tlid = ANY (line_ids);
    QUERY PLAN
 

--------------------------------------------------------------------------------------------------------------------------------
NestedLoop  (cost=24.51..16899784.67 rows=166031616 width=202) (actual time=0.980..615547.605 rows=2 loops=1)   Join
Filter:("outer".tlid = ANY ("inner".line_ids))   ->  Seq Scan on lines  (cost=0.00..1956914.72 rows=55343872 width=166)
(actualtime=0.012..160106.897 rows=55291697 loops=1)   ->  Materialize  (cost=24.51..24.57 rows=6 width=36) (actual
time=0.002..0.003rows=1 loops=55291697)         ->  Seq Scan on selected_lines  (cost=0.00..24.50 rows=6 width=36)
(actualtime=0.012..0.016 rows=1 loops=1)               Filter: (id = 16238) Total runtime: 615547.680 ms
 
(7 rows)


fli=# explain analyze
fli-#   select *
fli-#  from lines,
fli-#  (select int_array_enum(line_ids) as lid from selected_lines where id=16238) as z
fli-#   where lid = lines.tlid;                                                            QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------
NestedLoop  (cost=0.00..54.42 rows=6 width=170) (actual time=36.755..42.268 rows=2 loops=1)   ->  Seq Scan on
selected_lines (cost=0.00..24.52 rows=6 width=32) (actual time=0.023..0.029 rows=2 loops=1)         Filter: (id =
16238)  ->  Index Scan using rtgr_lines__tlid on lines  (cost=0.00..4.96 rows=1 width=166) (actual time=21.105..21.108
rows=1loops=2)         Index Cond: ("outer"."?column1?" = lines.tlid) Total runtime: 42.352 ms
 
(6 rows)

=================================================================================
==  With HEAD
=================================================================================

fli=# explain analyze  select * from lines, selected_lines  where selected_lines.id = 16238 and tlid = ANY (line_ids);

fli-# fli-#                                                            QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------
NestedLoop  (cost=36.44..424.78 rows=61 width=210) (actual time=0.335..0.421 rows=2 loops=1)   ->  Seq Scan on
selected_lines (cost=0.00..24.50 rows=6 width=36) (actual time=0.018..0.021 rows=1 loops=1)         Filter: (id =
16238)  ->  Bitmap Heap Scan on lines  (cost=36.44..66.49 rows=10 width=174) (actual time=0.293..0.374 rows=2 loops=1)
      Recheck Cond: (lines.tlid = ANY (selected_lines.line_ids))         ->  Bitmap Index Scan on rtgr_lines__tlid
(cost=0.00..36.44rows=10 width=0) (actual time=0.251..0.251 rows=2 loops=1)               Index Cond: (lines.tlid = ANY
(selected_lines.line_ids))Total runtime: 0.653 ms
 
(8 rows)


Re: The Axe list

From
Stephen Frost
Date:
* Gregory Stark (stark@enterprisedb.com) wrote:
> "Robert Haas" <robertmhaas@gmail.com> writes:
> > CREATE AGGREGATE array_accum (anyelement)
> >
> > CREATE OR REPLACE FUNCTION array_enum(anyarray)
>
> Have you actually tried these functions on large data sets? They're not in the
> same performance league as intagg. Your array_accum is O(n^2)!

array_accum itself may also end up in core, if I have my dithers.  It
makes psql's job to display column-level privs in a nice way alot
easier, and there's been quite a few cases where I've used it outside of
that, along with others.  It'd be nice to have there.

That said, once it's in, we really need to rework it to not suck(tm).  I
wrote a patch to do just that quite some time ago, but it required some
things in core that were ugly to expose to any aggregation function (as
I recall).  If we made that only available to built-in's, then we might
be able to have array_accum in core *and* have it be fast. :)

It's a goal anyway.
Thanks,
    Stephen

Re: The Axe list

From
Magnus Hagander
Date:
D'Arcy J.M. Cain wrote:
> On Fri, 10 Oct 2008 16:28:29 -0700
> Josh Berkus <josh@agliodbs.com> wrote:
>> It's that time again!  Purging antiquated contrib modules.
>>
>> chkpass: this module is incomplete and does not implement all functions 
>> it describes.  It's not really even useful as an Example since it uses 
>> crypt() and not any modern encryption.  And Darcy hasn't touched it in 6 
>> years.
> 
> Well, I still use it a lot but I have it in my own CVS tree anyway so I
> won't miss it in contrib.
> 
> However, if all it needs is a modern encryption scheme that's probably
> an hour's work.  The only reason that I haven't done so yet is because
> I have no use case.  If I am storing encrypted passwords in a database
> it's probably because I need to generate many password files from it.
> As a result I need to keep it at the LCD.  That's DES.

Is there any reason for using this one over just using pgcrypto, which
also gives you a whole lot more functionality?


> Which described functions are missing?  I wouldn't mind having a
> chance to clean it up before it is removed just in case someone else
> wants to grab it from CVS later.

/* This function checks that the password is a good one* It's just a placeholder for now */
static int
verify_pass(const char *str)
{       return 0;
}


It is documented that this is just a stub though.

//Magnus




Re: The Axe list

From
"D'Arcy J.M. Cain"
Date:
On Sat, 11 Oct 2008 16:07:31 +0200
Magnus Hagander <magnus@hagander.net> wrote:
> D'Arcy J.M. Cain wrote:
> > However, if all it needs is a modern encryption scheme that's probably
> > an hour's work.  The only reason that I haven't done so yet is because
> > I have no use case.  If I am storing encrypted passwords in a database
> > it's probably because I need to generate many password files from it.
> > As a result I need to keep it at the LCD.  That's DES.
> 
> Is there any reason for using this one over just using pgcrypto, which
> also gives you a whole lot more functionality?

Not quite the same.  The pgcrypto module adds encryption functions but
chkpass adds an encrypted type.  I suppose chkpass could be implemented
in terms of pgcrypto if one wished.

> > Which described functions are missing?  I wouldn't mind having a
> > chance to clean it up before it is removed just in case someone else
> > wants to grab it from CVS later.
> 
> /* This function checks that the password is a good one
>  * It's just a placeholder for now */
> static int
> verify_pass(const char *str)
> {
>         return 0;
> }
> 
> 
> It is documented that this is just a stub though.

Ah yes.  I generally call external modules for that functionality as
they are much better at that than I could be in chkpass.  I can upgrade
the external module when new ones appear rather than recompiling
chkpass each time.

-- 
D'Arcy J.M. Cain <darcy@druid.net>         |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 425 1212     (DoD#0082)    (eNTP)   |  what's for dinner.


Re: The Axe list

From
Josh Berkus
Date:
D'Arcy,

> However, if all it needs is a modern encryption scheme that's probably
> an hour's work.  The only reason that I haven't done so yet is because
> I have no use case. 

Well, I had no use case either which is why I didn't propose updating 
it.  I can certainly see having chkpass live on pgFoundry, though.

--Josh



Re: The Axe list

From
Josh Berkus
Date:
All,

So it sounds like intagg is still in use/development.  But ... is it 
more of an example, or is it useful as a type/function in production?

--Josh


Re: The Axe list

From
"D'Arcy J.M. Cain"
Date:
On Sat, 11 Oct 2008 11:57:50 -0700
Josh Berkus <josh@agliodbs.com> wrote:
> > However, if all it needs is a modern encryption scheme that's probably
> > an hour's work.  The only reason that I haven't done so yet is because
> > I have no use case. 
> 
> Well, I had no use case either which is why I didn't propose updating 
> it.  I can certainly see having chkpass live on pgFoundry, though.

No need.  I have places to put it up.  I would like to make the
following changes for the CVS archives before it is removed though.
Any objections?

Index: chkpass.c
===================================================================
RCS file: /cvsroot/pgsql/contrib/chkpass/chkpass.c,v
retrieving revision 1.20
diff -u -p -u -r1.20 chkpass.c
--- chkpass.c   25 Mar 2008 22:42:41 -0000  1.20
+++ chkpass.c   11 Oct 2008 19:52:52 -0000
@@ -70,6 +70,7 @@ chkpass_in(PG_FUNCTION_ARGS)   char       *str = PG_GETARG_CSTRING(0);   chkpass    *result;   char
    mysalt[4];
 
+   static bool random_initialized = false;   static char salt_chars[] =
"./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
@@ -88,10 +89,16 @@ chkpass_in(PG_FUNCTION_ARGS)   result = (chkpass *) palloc(sizeof(chkpass));
+   if (!random_initialized)
+   {
+       srandom((unsigned int) time(NULL));
+       random_initialized = true;
+   }
+   mysalt[0] = salt_chars[random() & 0x3f];   mysalt[1] = salt_chars[random() & 0x3f];
-   mysalt[2] = 0;              /* technically the terminator is not
necessary
-                                * but I like to play safe */
+   mysalt[2] = 0;              /* technically the terminator is not
+                                * necessary but I like to play safe */   strcpy(result->password, crypt(str, mysalt));
 PG_RETURN_POINTER(result);}
 
@@ -108,9 +115,11 @@ chkpass_out(PG_FUNCTION_ARGS)   chkpass    *password = (chkpass *) PG_GETARG_POINTER(0);   char
  *result;
 
-   result = (char *) palloc(16);
-   result[0] = ':';
-   strcpy(result + 1, password->password);
+   if ((result = (char *) palloc(16)) != NULL)
+   {
+       result[0] = ':';
+       strcpy(result + 1, password->password);
+   }   PG_RETURN_CSTRING(result);}
@@ -142,6 +151,9 @@ chkpass_eq(PG_FUNCTION_ARGS)   text       *a2 = PG_GETARG_TEXT_PP(1);   char        str[9];
+   if (!a1 || !a2)
+       PG_RETURN_BOOL(0);
+   text_to_cstring_buffer(a2, str, sizeof(str));   PG_RETURN_BOOL(strcmp(a1->password, crypt(str, a1->password)) ==
0);}
@@ -154,6 +166,9 @@ chkpass_ne(PG_FUNCTION_ARGS)   text       *a2 = PG_GETARG_TEXT_PP(1);   char        str[9];
+   if (!a1 || !a2)
+       PG_RETURN_BOOL(0);
+   text_to_cstring_buffer(a2, str, sizeof(str));   PG_RETURN_BOOL(strcmp(a1->password, crypt(str, a1->password)) !=
0);}

-- 
D'Arcy J.M. Cain <darcy@druid.net>         |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 425 1212     (DoD#0082)    (eNTP)   |  what's for dinner.


Re: The Axe list

From
Ron Mayer
Date:
Josh Berkus wrote:
> So it sounds like intagg is still in use/development.  But ... is it 
> more of an example, or is it useful as a type/function in production?

Where I work we (and our customers) use it in our production systems.

At first glance it seems our reasons for using it are mostly
legacy reasons dating to 8.1 where intagg was the best way to
write some queries.  At least some of these seem to be unnecessary
with 8.3.  If intagg's at risk of going away soon I could
further check the range of queries where we use it against 8.3
or CVS head if that's useful to the discussion.
From our testing notes, here's another 8.1 query where we had
order-of-magnitude speedups using intagg and friends.
-- with 30000
-- explain analyze select fac_nam  from userfeatures.point_features  join entity_facets using (entity_id)  where
featureid=115group by fac_nam;
 
-- -- Total runtime: 7125.322 ms
-- select fac_nam from (select distinct int_array_enum(fac_ids) as fac_id from (select distinct fac_ids from
entity_facidsnatural join point_features where featureid=115) as a) as a join facet_lookup using (fac_id);
 
-- -- Total runtime: 1297.558 ms
-- explain analyze select fac_nam from (select int_array_enum(fac_ids) as fac_id from (select fac_ids from
entity_facidsnatural join point_features where featureid=115 group by fac_ids) as a group by int_array_enum(fac_ids))
asa join facet_lookup using (fac_id) order by fac_nam;
 
-- -- Total runtime: 1164.258 ms
-- explain analyze select fac_nam from (select distinct int_array_enum(fac_ids) as fac_id from (select
intarray_union_agg(fac_ids)as fac_ids from entity_facids natural join point_features where featureid=115) as a) as a
joinfacet_lookup using (fac_id);
 
-- -- Total runtime: 803.187 ms
I can check it on 8.3 monday.


Re: The Axe list

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> So it sounds like intagg is still in use/development.  But ... is it 
> more of an example, or is it useful as a type/function in production?

You're still asking the wrong list ...
        regards, tom lane


Re: The Axe list

From
"Marko Kreen"
Date:
On 10/11/08, D'Arcy J.M. Cain <darcy@druid.net> wrote:
> No need.  I have places to put it up.  I would like to make the
>  following changes for the CVS archives before it is removed though.
>  Any objections?
>
>  Index: chkpass.c
>  ===================================================================
>  RCS file: /cvsroot/pgsql/contrib/chkpass/chkpass.c,v
>  retrieving revision 1.20
>  diff -u -p -u -r1.20 chkpass.c
>  --- chkpass.c   25 Mar 2008 22:42:41 -0000  1.20
>  +++ chkpass.c   11 Oct 2008 19:52:52 -0000
>  @@ -70,6 +70,7 @@ chkpass_in(PG_FUNCTION_ARGS)
>     char       *str = PG_GETARG_CSTRING(0);
>     chkpass    *result;
>     char        mysalt[4];
>  +   static bool random_initialized = false;
>     static char salt_chars[] =
>     "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
>
>  @@ -88,10 +89,16 @@ chkpass_in(PG_FUNCTION_ARGS)
>
>     result = (chkpass *) palloc(sizeof(chkpass));
>
>  +   if (!random_initialized)
>  +   {
>  +       srandom((unsigned int) time(NULL));
>  +       random_initialized = true;
>  +   }
>  +

This is bad idea, postgres already does srandom()

>     mysalt[0] = salt_chars[random() & 0x3f];
>     mysalt[1] = salt_chars[random() & 0x3f];
>  -   mysalt[2] = 0;              /* technically the terminator is not
>  necessary
>  -                                * but I like to play safe */
>  +   mysalt[2] = 0;              /* technically the terminator is not
>  +                                * necessary but I like to play safe */
>     strcpy(result->password, crypt(str, mysalt));
>     PG_RETURN_POINTER(result);
>   }

Comment change only?  Ok.

>  @@ -108,9 +115,11 @@ chkpass_out(PG_FUNCTION_ARGS)
>     chkpass    *password = (chkpass *) PG_GETARG_POINTER(0);
>     char       *result;
>
>  -   result = (char *) palloc(16);
>  -   result[0] = ':';
>  -   strcpy(result + 1, password->password);
>  +   if ((result = (char *) palloc(16)) != NULL)
>  +   {
>  +       result[0] = ':';
>  +       strcpy(result + 1, password->password);
>  +   }

AFAIK palloc() cannot return NULL?

>
>     PG_RETURN_CSTRING(result);
>   }
>  @@ -142,6 +151,9 @@ chkpass_eq(PG_FUNCTION_ARGS)
>     text       *a2 = PG_GETARG_TEXT_PP(1);
>     char        str[9];
>
>  +   if (!a1 || !a2)
>  +       PG_RETURN_BOOL(0);
>  +
>     text_to_cstring_buffer(a2, str, sizeof(str));
>     PG_RETURN_BOOL(strcmp(a1->password, crypt(str, a1->password)) == 0);
>   }
>  @@ -154,6 +166,9 @@ chkpass_ne(PG_FUNCTION_ARGS)
>     text       *a2 = PG_GETARG_TEXT_PP(1);
>     char        str[9];
>
>  +   if (!a1 || !a2)
>  +       PG_RETURN_BOOL(0);
>  +
>     text_to_cstring_buffer(a2, str, sizeof(str));
>     PG_RETURN_BOOL(strcmp(a1->password, crypt(str, a1->password)) != 0);
>
>  }

The functions are already defined as STRICT, so unnecessary.
Also returning non-NULL on NULL input seems to go against SQL style.

-- 
marko


Re: The Axe list

From
Gregory Stark
Date:
> Josh Berkus <josh@agliodbs.com> writes:
>> So it sounds like intagg is still in use/development.  But ... is it 
>> more of an example, or is it useful as a type/function in production?

Based on the patch submitted it's definitely in heavy production use.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about
EnterpriseDB'sPostgreSQL training!
 


Re: The Axe list

From
"D'Arcy J.M. Cain"
Date:
On Sun, 12 Oct 2008 12:57:58 +0300
"Marko Kreen" <markokr@gmail.com> wrote:
> On 10/11/08, D'Arcy J.M. Cain <darcy@druid.net> wrote:
> >  +   if (!random_initialized)
> >  +   {
> >  +       srandom((unsigned int) time(NULL));
> >  +       random_initialized = true;
> >  +   }
> 
> This is bad idea, postgres already does srandom()

Is that new?  I added that to my local version at one time because I
was getting the same salt every time I ran it.

> >  -                                * but I like to play safe */
> >  +   mysalt[2] = 0;              /* technically the terminator is not
> >  +                                * necessary but I like to play safe */
> >     strcpy(result->password, crypt(str, mysalt));
> >     PG_RETURN_POINTER(result);
> >   }
> 
> Comment change only?  Ok.

If that turns out to be the only change I won't bother.

> >  +   if ((result = (char *) palloc(16)) != NULL)
> >  +   {
> >  +       result[0] = ':';
> >  +       strcpy(result + 1, password->password);
> >  +   }
> 
> AFAIK palloc() cannot return NULL?

Really?  My program will simply come crashing down if there is a memory
problem without giving me a chance to clean up?

> >  +   if (!a1 || !a2)
> >  +       PG_RETURN_BOOL(0);
> >  +
> >     text_to_cstring_buffer(a2, str, sizeof(str));
> >     PG_RETURN_BOOL(strcmp(a1->password, crypt(str, a1->password)) == 0);
> >   }
> >  @@ -154,6 +166,9 @@ chkpass_ne(PG_FUNCTION_ARGS)
> >     text       *a2 = PG_GETARG_TEXT_PP(1);
> >     char        str[9];
> >
> >  +   if (!a1 || !a2)
> >  +       PG_RETURN_BOOL(0);
> >  +
> >     text_to_cstring_buffer(a2, str, sizeof(str));
> >     PG_RETURN_BOOL(strcmp(a1->password, crypt(str, a1->password)) != 0);
> >
> >  }
> 
> The functions are already defined as STRICT, so unnecessary.
> Also returning non-NULL on NULL input seems to go against SQL style.

I'm a belt and suspenders guy.  However, I agree that this is
unneccessary.  So, I guess I just need to know, how long has PG been
doing srandom().


-- 
D'Arcy J.M. Cain <darcy@druid.net>         |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 425 1212     (DoD#0082)    (eNTP)   |  what's for dinner.


Re: The Axe list

From
Martijn van Oosterhout
Date:
On Sun, Oct 12, 2008 at 10:41:21AM -0400, D'Arcy J.M. Cain wrote:
> > >  +   if ((result = (char *) palloc(16)) != NULL)
> > >  +   {
> > >  +       result[0] = ':';
> > >  +       strcpy(result + 1, password->password);
> > >  +   }
> >
> > AFAIK palloc() cannot return NULL?
>
> Really?  My program will simply come crashing down if there is a memory
> problem without giving me a chance to clean up?

If by "come crashing down" you mean "the transaction will be aborted,
all memory and other resources freed and the user gets a nice error
message", then yes. palloc will never return an invalid pointer.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.

Re: The Axe list

From
Tom Lane
Date:
"D'Arcy J.M. Cain" <darcy@druid.net> writes:
> On Sun, 12 Oct 2008 12:57:58 +0300
> "Marko Kreen" <markokr@gmail.com> wrote:
>> This is bad idea, postgres already does srandom()

> Is that new?  I added that to my local version at one time because I
> was getting the same salt every time I ran it.

Quite a while ago we went around and removed random calls to srandom.
In any case it is *not* acceptable to put one into a datatype, because
the effects are global.  If we did have a problem like that, the
appropriate solution would be elsewhere.
        regards, tom lane


Re: The Axe list

From
Magnus Hagander
Date:
D'Arcy J.M. Cain wrote:
> On Sun, 12 Oct 2008 12:57:58 +0300
> "Marko Kreen" <markokr@gmail.com> wrote:
>> On 10/11/08, D'Arcy J.M. Cain <darcy@druid.net> wrote:
>>>  +   if (!random_initialized)
>>>  +   {
>>>  +       srandom((unsigned int) time(NULL));
>>>  +       random_initialized = true;
>>>  +   }
>> This is bad idea, postgres already does srandom()
> 
> Is that new?  I added that to my local version at one time because I
> was getting the same salt every time I ran it.

You really should not be using the standard random() function to generat
salts... You need a more secure one.


>>>  +   if ((result = (char *) palloc(16)) != NULL)
>>>  +   {
>>>  +       result[0] = ':';
>>>  +       strcpy(result + 1, password->password);
>>>  +   }
>> AFAIK palloc() cannot return NULL?
> 
> Really?  My program will simply come crashing down if there is a memory
> problem without giving me a chance to clean up?

It will do an ereport() call and clean things up. This is one of the
things that rock with using palloc ;-)

//Magnus



Re: The Axe list

From
Gregory Stark
Date:
Magnus Hagander <magnus@hagander.net> writes:

> D'Arcy J.M. Cain wrote:
>> On Sun, 12 Oct 2008 12:57:58 +0300
>> "Marko Kreen" <markokr@gmail.com> wrote:
>>> On 10/11/08, D'Arcy J.M. Cain <darcy@druid.net> wrote:
>>>>  +   if (!random_initialized)
>>>>  +   {
>>>>  +       srandom((unsigned int) time(NULL));
>>>>  +       random_initialized = true;
>>>>  +   }
>>> This is bad idea, postgres already does srandom()
>> 
>> Is that new?  I added that to my local version at one time because I
>> was getting the same salt every time I ran it.
>
> You really should not be using the standard random() function to generat
> salts... You need a more secure one.

Do salts have to be secure at all? I thought they just had to be widely
distributed so that you couldn't use a dictionary attack. The traditional way
to pick crypt salts for /etc/passwd was to use the first two letters of the
username after all.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about
EnterpriseDB'sPostgreSQL training!
 


Re: The Axe list

From
Magnus Hagander
Date:
Gregory Stark wrote:
> Magnus Hagander <magnus@hagander.net> writes:
> 
>> D'Arcy J.M. Cain wrote:
>>> On Sun, 12 Oct 2008 12:57:58 +0300
>>> "Marko Kreen" <markokr@gmail.com> wrote:
>>>> On 10/11/08, D'Arcy J.M. Cain <darcy@druid.net> wrote:
>>>>>  +   if (!random_initialized)
>>>>>  +   {
>>>>>  +       srandom((unsigned int) time(NULL));
>>>>>  +       random_initialized = true;
>>>>>  +   }
>>>> This is bad idea, postgres already does srandom()
>>> Is that new?  I added that to my local version at one time because I
>>> was getting the same salt every time I ran it.
>> You really should not be using the standard random() function to generat
>> salts... You need a more secure one.
> 
> Do salts have to be secure at all? I thought they just had to be widely
> distributed so that you couldn't use a dictionary attack. The traditional way
> to pick crypt salts for /etc/passwd was to use the first two letters of the
> username after all.

Gah. I blame (jet|beer)lag. You're right, of course. Salts just need to
be distributed, because you usually store them along with the encrypted
password after all.

Now it can be argued that random() just isn't distributed enough for
even that - using the userid is usually guaranteed to be distributed
since it's the primary key...

/me pulls bag back over head.

//Magnus


Re: The Axe list

From
Markus Wanner
Date:
Hi,

Josh Berkus wrote:
> So it sounds like intagg is still in use/development.  But ... is it
> more of an example, or is it useful as a type/function in production?

We use it in production for quite remarkable speedups of operations on
int4[].

Having reviewed the last commit fest's intagg patch as well, I thought
we agreed that a more general functionality is wanted for core. But as
long as we don't have that, I'd like intagg to stay in contrib.

Regards

Markus Wanner



Re: The Axe list

From
Robert Treat
Date:
On Monday 13 October 2008 04:53:44 Markus Wanner wrote:
> Hi,
>
> Josh Berkus wrote:
> > So it sounds like intagg is still in use/development.  But ... is it
> > more of an example, or is it useful as a type/function in production?
>
> We use it in production for quite remarkable speedups of operations on
> int4[].
>

+1, istr we are indexing icount values on one system, but certainly this is 
something being actively used.

> Having reviewed the last commit fest's intagg patch as well, I thought
> we agreed that a more general functionality is wanted for core. But as
> long as we don't have that, I'd like intagg to stay in contrib.
>

While I agree that the "right" solution would be to make this code work more 
generally for other aggregates, I also think that until someone is willing to 
do that work, this needs to stay in contrib, and that we ought to accept 
patches improving it. 

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL


Re: The Axe list

From
"Ian Caulfield"
Date:
2008/10/14 Robert Treat <xzilla@users.sourceforge.net>
>
> On Monday 13 October 2008 04:53:44 Markus Wanner wrote:
>
> > Having reviewed the last commit fest's intagg patch as well, I thought
> > we agreed that a more general functionality is wanted for core. But as
> > long as we don't have that, I'd like intagg to stay in contrib.
> >
>
> While I agree that the "right" solution would be to make this code work more
> generally for other aggregates, I also think that until someone is willing to
> do that work, this needs to stay in contrib, and that we ought to accept
> patches improving it.

I started to look at implementing array_agg by making the existing
intagg stuff more generic, but I came across an issue (which might
also be a bug in intagg). Intagg relies on being able to stomp on the
input transition value, and basically allocates the working array as a
block, reallocating it when it becomes too small. The lower bound of
the array is (ab)used to keep track of how many items have been
allocated.

For a generic aggregate, which accepts any types and NULL inputs, in
order to avoid a linear search through the data to find where to put
the next element, it would be useful to instead store the offset to
the next free byte in the lower pointer.

The problem is that I can't see any way to guarantee that someone
wouldn't create another aggregate using the same transition function,
but giving an initial value to the lower bound which will cause the
transition function to do naughty things (as far as I can tell, this
is also true of intagg - giving an inital state value of
'[200000:200000]{1}' will cause it to happily write up to 200000
integers off the end of that one element array without allocating any
extra storage...

I'm not sure what the best way around this is - it seems that
performancewise, avoiding an array_seek() call in the transition
function would be good. Is there some way that the transition function
can tell which context the state value was allocated in, and thus
whether it was supplied as an initial value or not?

Ian


Re: The Axe list

From
Greg Stark
Date:
Sorry for top posting - darn phone...

1) as I mentioned when I reviewed the patch in commitfest I don't see  
the point of the manual memory management. Palloc/realloc has just the  
same kind of doubling behaviour behind the scenes anyways. Just call  
realloc before adding every new element.

2) look at the aggregate for count() which avoids pallocing a new  
bigint using a similar trick. It protects against the bug you describe  
checking the fctxt to verify it's being called as an aggregate  
function and not a regular function. So as long as the aggregate has  
the right initial state it should be fine.

Come to think of it though... Do we require creators of new aggregates  
own the state transition function? If not we have a problem...

greg

On 15 Oct 2008, at 07:08 PM, "Ian Caulfield" <ian.caulfield@gmail.com>  
wrote:

> 2008/10/14 Robert Treat <xzilla@users.sourceforge.net>
>>
>> On Monday 13 October 2008 04:53:44 Markus Wanner wrote:
>>
>>> Having reviewed the last commit fest's intagg patch as well, I  
>>> thought
>>> we agreed that a more general functionality is wanted for core.  
>>> But as
>>> long as we don't have that, I'd like intagg to stay in contrib.
>>>
>>
>> While I agree that the "right" solution would be to make this code  
>> work more
>> generally for other aggregates, I also think that until someone is  
>> willing to
>> do that work, this needs to stay in contrib, and that we ought to  
>> accept
>> patches improving it.
>
> I started to look at implementing array_agg by making the existing
> intagg stuff more generic, but I came across an issue (which might
> also be a bug in intagg). Intagg relies on being able to stomp on the
> input transition value, and basically allocates the working array as a
> block, reallocating it when it becomes too small. The lower bound of
> the array is (ab)used to keep track of how many items have been
> allocated.
>
> For a generic aggregate, which accepts any types and NULL inputs, in
> order to avoid a linear search through the data to find where to put
> the next element, it would be useful to instead store the offset to
> the next free byte in the lower pointer.
>
> The problem is that I can't see any way to guarantee that someone
> wouldn't create another aggregate using the same transition function,
> but giving an initial value to the lower bound which will cause the
> transition function to do naughty things (as far as I can tell, this
> is also true of intagg - giving an inital state value of
> '[200000:200000]{1}' will cause it to happily write up to 200000
> integers off the end of that one element array without allocating any
> extra storage...
>
> I'm not sure what the best way around this is - it seems that
> performancewise, avoiding an array_seek() call in the transition
> function would be good. Is there some way that the transition function
> can tell which context the state value was allocated in, and thus
> whether it was supplied as an initial value or not?
>
> Ian
>
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


Re: The Axe list

From
"Ian Caulfield"
Date:
2008/10/15 Greg Stark <greg.stark@enterprisedb.com>:
> Sorry for top posting - darn phone...
>
> 1) as I mentioned when I reviewed the patch in commitfest I don't see the
> point of the manual memory management. Palloc/realloc has just the same kind
> of doubling behaviour behind the scenes anyways. Just call realloc before
> adding every new element.

Does this work? If an aggregate returns a different pointer to the one
it was passed, nodeAgg.c will free the original - but if we've
actually realloc'ed it, won't this lead to a double free?

> 2) look at the aggregate for count() which avoids pallocing a new bigint
> using a similar trick. It protects against the bug you describe checking the
> fctxt to verify it's being called as an aggregate function and not a regular
> function. So as long as the aggregate has the right initial state it should
> be fine.

The right initial state bit is what I was worried about, though I
think I've thought of a way around for now...

> Come to think of it though... Do we require creators of new aggregates own
> the state transition function? If not we have a problem...

No idea...

Ian

> greg
>
> On 15 Oct 2008, at 07:08 PM, "Ian Caulfield" <ian.caulfield@gmail.com>
> wrote:
>
>> 2008/10/14 Robert Treat <xzilla@users.sourceforge.net>
>>>
>>> On Monday 13 October 2008 04:53:44 Markus Wanner wrote:
>>>
>>>> Having reviewed the last commit fest's intagg patch as well, I thought
>>>> we agreed that a more general functionality is wanted for core. But as
>>>> long as we don't have that, I'd like intagg to stay in contrib.
>>>>
>>>
>>> While I agree that the "right" solution would be to make this code work
>>> more
>>> generally for other aggregates, I also think that until someone is
>>> willing to
>>> do that work, this needs to stay in contrib, and that we ought to accept
>>> patches improving it.
>>
>> I started to look at implementing array_agg by making the existing
>> intagg stuff more generic, but I came across an issue (which might
>> also be a bug in intagg). Intagg relies on being able to stomp on the
>> input transition value, and basically allocates the working array as a
>> block, reallocating it when it becomes too small. The lower bound of
>> the array is (ab)used to keep track of how many items have been
>> allocated.
>>
>> For a generic aggregate, which accepts any types and NULL inputs, in
>> order to avoid a linear search through the data to find where to put
>> the next element, it would be useful to instead store the offset to
>> the next free byte in the lower pointer.
>>
>> The problem is that I can't see any way to guarantee that someone
>> wouldn't create another aggregate using the same transition function,
>> but giving an initial value to the lower bound which will cause the
>> transition function to do naughty things (as far as I can tell, this
>> is also true of intagg - giving an inital state value of
>> '[200000:200000]{1}' will cause it to happily write up to 200000
>> integers off the end of that one element array without allocating any
>> extra storage...
>>
>> I'm not sure what the best way around this is - it seems that
>> performancewise, avoiding an array_seek() call in the transition
>> function would be good. Is there some way that the transition function
>> can tell which context the state value was allocated in, and thus
>> whether it was supplied as an initial value or not?
>>
>> Ian
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: The Axe list

From
Tom Lane
Date:
Greg Stark <greg.stark@enterprisedb.com> writes:
> Come to think of it though... Do we require creators of new aggregates  
> own the state transition function? If not we have a problem...

No, but they are required to have permission to call it.  So you could
restrict the transition function to superusers if you were worried.
        regards, tom lane