Thread: The Axe list
Folks, It's that time again! Purging antiquated contrib modules. chkpass: this module is incomplete and does not implement all functions it describes. It's not really even useful as an Example since it uses crypt() and not any modern encryption. And Darcy hasn't touched it in 6 years. intagg: the aggregation function has been obsolete since 7.4 because standard array functionality supports the same. intagg has a nice equivalent for UNROLL, but it only works for arrays of INT, and only one-dimensional arrays. Has not been updated since 2001. Any objections to dropping both of these? --Josh
> intagg: the aggregation function has been obsolete since 7.4 because > standard array functionality supports the same. intagg has a nice > equivalent for UNROLL, but it only works for arrays of INT, and only > one-dimensional arrays. Has not been updated since 2001. I think this one can be dropped. The definition of a general array_accum() is very easy, and supplied in the docs: CREATE AGGREGATE array_accum (anyelement) ( sfunc = array_append, stype = anyarray, initcond = '{}' ); A general one-dimensional array enumerator is equally easy to define, though not in the docs (there is a two-dimensional array enumerator called unnest2 under generate_subscripts): CREATE OR REPLACE FUNCTION array_enum(anyarray) RETURNS SETOF anyelement AS $$ SELECT $1[i] FROM generate_subscripts($1,1) i $$ LANGUAGE sql; With a little more work you could even (crazy thought) add some error checking. It would probably be helpful if the generate_subscripts and array documentation were cross-referenced a bit better. There's no indication on the array page that generate_subscripts() is out there, and it's clearly both very useful and array-related. ...Robert
Josh Berkus <josh@agliodbs.com> writes: > Any objections to dropping both of these? You should ask on -general, not here, if you are trying to find out whether the modules have any users. I tend to agree that chkpass is of doubtful value, but I'm not so sure about intagg. As you said yourself, we haven't yet replaced its functionality with a generalized substitute. Also, it's the only example in the current codebase of a useful technique, ie, an aggregate function doing its own memory management. regards, tom lane
On Fri, 10 Oct 2008 16:28:29 -0700 Josh Berkus <josh@agliodbs.com> wrote: > Folks, > > It's that time again! Purging antiquated contrib modules. > > chkpass: this module is incomplete and does not implement all > functions it describes. It's not really even useful as an Example > since it uses crypt() and not any modern encryption. And Darcy > hasn't touched it in 6 years. > > intagg: the aggregation function has been obsolete since 7.4 because > standard array functionality supports the same. intagg has a nice > equivalent for UNROLL, but it only works for arrays of INT, and only > one-dimensional arrays. Has not been updated since 2001. > > Any objections to dropping both of these? None. > > --Josh > -- The PostgreSQL Company since 1997: http://www.commandprompt.com/ PostgreSQL Community Conference: http://www.postgresqlconference.org/ United States PostgreSQL Association: http://www.postgresql.us/
Josh Berkus wrote: > intagg: ... Has not been updated since 2001. Really? Just a couple years ago (2005) bugs we reported were still getting fixed in it: http://archives.postgresql.org/pgsql-bugs/2005-03/msg00202.php http://archives.postgresql.org/pgsql-bugs/2005-04/msg00165.php Here's one use-case I had at the time. http://archives.postgresql.org/pgsql-general/2005-04/msg01249.php I think we still use that code, but I suppose I can re-write it to use features that were added since then. And hey - where'd README.int_aggregate go? IIRC that had a couple interesting use-cases too. I also like intagg, because it's kinda like a "hello world" for writing one kind of C extensions. I'm not saying it needs to stay in contrib. Perhaps it could live on as an example in the docs?
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Fri, Oct 10, 2008 at 09:09:51PM -0700, Ron Mayer wrote: > Josh Berkus wrote: >> intagg: ... Has not been updated since 2001. [...] > I also like intagg, because it's kinda like a "hello world" for > writing one kind of C extensions. I'm not saying it needs to > stay in contrib. Perhaps it could live on as an example in the docs? and Tom Lane said upthread > I tend to agree that chkpass is of doubtful value, but I'm not so sure > about intagg. As you said yourself, we haven't yet replaced its > functionality with a generalized substitute. Also, it's the only > example in the current codebase of a useful technique, ie, an aggregate > function doing its own memory management. So it seems that intagg should rather live in a section "examples" than in contrib? (this seems to stem from the promotion of contrib from "just a bunch of stuff which might be useful" to "useful modules you can use as-is to extend PostgreSQL, no?) Regards - -- tomás -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFI8C/yBcgs9XrR2kYRAr3WAJ0dDY6yVIqTitmjVB2bfih5K3/rzwCfd7wk 7hvtw/pHPyfbEUgGvFT87QQ= =iofV -----END PGP SIGNATURE-----
"Robert Haas" <robertmhaas@gmail.com> writes: > CREATE AGGREGATE array_accum (anyelement) > > CREATE OR REPLACE FUNCTION array_enum(anyarray) Have you actually tried these functions on large data sets? They're not in the same performance league as intagg. Your array_accum is O(n^2)! -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!
>> CREATE AGGREGATE array_accum (anyelement) >> >> CREATE OR REPLACE FUNCTION array_enum(anyarray) > > Have you actually tried these functions on large data sets? No. :-) > They're not in the same performance league as intagg. Your array_accum is O(n^2)! It's not "mine" - I copied it from the official PostgreSQL documentation and used it because it did what I needed! If it's a bad way to do it, that's certainly an argument for keeping (or maybe generalizing) intagg. ...Robert
"Robert Haas" <robertmhaas@gmail.com> writes: > If it's a bad way to do it, that's certainly an argument for keeping > (or maybe generalizing) intagg. There was actually a patch this past commitfest to *add* functionality to intagg. When I reviewed it I said it would make more sense to generalize it and integrate that functionality into the base array operations. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!
>> If it's a bad way to do it, that's certainly an argument for keeping >> (or maybe generalizing) intagg. > > There was actually a patch this past commitfest to *add* functionality to > intagg. When I reviewed it I said it would make more sense to generalize it > and integrate that functionality into the base array operations. I suppose it's just a question of finding enough round tuits. I might take a look at it but my grasp of toasting and memory management may not be good enough yet. ...Robert
On Fri, 10 Oct 2008 16:28:29 -0700 Josh Berkus <josh@agliodbs.com> wrote: > It's that time again! Purging antiquated contrib modules. > > chkpass: this module is incomplete and does not implement all functions > it describes. It's not really even useful as an Example since it uses > crypt() and not any modern encryption. And Darcy hasn't touched it in 6 > years. Well, I still use it a lot but I have it in my own CVS tree anyway so I won't miss it in contrib. However, if all it needs is a modern encryption scheme that's probably an hour's work. The only reason that I haven't done so yet is because I have no use case. If I am storing encrypted passwords in a database it's probably because I need to generate many password files from it. As a result I need to keep it at the LCD. That's DES. Which described functions are missing? I wouldn't mind having a chance to clean it up before it is removed just in case someone else wants to grab it from CVS later. -- D'Arcy J.M. Cain <darcy@druid.net> | Democracy is three wolves http://www.druid.net/darcy/ | and a sheep voting on +1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
tomas@tuxteam.de wrote: > So it seems that intagg should rather live in a section "examples" than > in contrib? Perhaps. Seems my old intagg use case from 8.1 is not really needed anymore since it seems ANY got much smarter since then. Cool. ================================================================================= == With 8.1 ================================================================================= fli=# explain analyze fli-# select * from lines, selected_lines fli-# where selected_lines.id = 16238 and tlid = ANY (line_ids); QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------- NestedLoop (cost=24.51..16899784.67 rows=166031616 width=202) (actual time=0.980..615547.605 rows=2 loops=1) Join Filter:("outer".tlid = ANY ("inner".line_ids)) -> Seq Scan on lines (cost=0.00..1956914.72 rows=55343872 width=166) (actualtime=0.012..160106.897 rows=55291697 loops=1) -> Materialize (cost=24.51..24.57 rows=6 width=36) (actual time=0.002..0.003rows=1 loops=55291697) -> Seq Scan on selected_lines (cost=0.00..24.50 rows=6 width=36) (actualtime=0.012..0.016 rows=1 loops=1) Filter: (id = 16238) Total runtime: 615547.680 ms (7 rows) fli=# explain analyze fli-# select * fli-# from lines, fli-# (select int_array_enum(line_ids) as lid from selected_lines where id=16238) as z fli-# where lid = lines.tlid; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------- NestedLoop (cost=0.00..54.42 rows=6 width=170) (actual time=36.755..42.268 rows=2 loops=1) -> Seq Scan on selected_lines (cost=0.00..24.52 rows=6 width=32) (actual time=0.023..0.029 rows=2 loops=1) Filter: (id = 16238) -> Index Scan using rtgr_lines__tlid on lines (cost=0.00..4.96 rows=1 width=166) (actual time=21.105..21.108 rows=1loops=2) Index Cond: ("outer"."?column1?" = lines.tlid) Total runtime: 42.352 ms (6 rows) ================================================================================= == With HEAD ================================================================================= fli=# explain analyze select * from lines, selected_lines where selected_lines.id = 16238 and tlid = ANY (line_ids); fli-# fli-# QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------- NestedLoop (cost=36.44..424.78 rows=61 width=210) (actual time=0.335..0.421 rows=2 loops=1) -> Seq Scan on selected_lines (cost=0.00..24.50 rows=6 width=36) (actual time=0.018..0.021 rows=1 loops=1) Filter: (id = 16238) -> Bitmap Heap Scan on lines (cost=36.44..66.49 rows=10 width=174) (actual time=0.293..0.374 rows=2 loops=1) Recheck Cond: (lines.tlid = ANY (selected_lines.line_ids)) -> Bitmap Index Scan on rtgr_lines__tlid (cost=0.00..36.44rows=10 width=0) (actual time=0.251..0.251 rows=2 loops=1) Index Cond: (lines.tlid = ANY (selected_lines.line_ids))Total runtime: 0.653 ms (8 rows)
* Gregory Stark (stark@enterprisedb.com) wrote: > "Robert Haas" <robertmhaas@gmail.com> writes: > > CREATE AGGREGATE array_accum (anyelement) > > > > CREATE OR REPLACE FUNCTION array_enum(anyarray) > > Have you actually tried these functions on large data sets? They're not in the > same performance league as intagg. Your array_accum is O(n^2)! array_accum itself may also end up in core, if I have my dithers. It makes psql's job to display column-level privs in a nice way alot easier, and there's been quite a few cases where I've used it outside of that, along with others. It'd be nice to have there. That said, once it's in, we really need to rework it to not suck(tm). I wrote a patch to do just that quite some time ago, but it required some things in core that were ugly to expose to any aggregation function (as I recall). If we made that only available to built-in's, then we might be able to have array_accum in core *and* have it be fast. :) It's a goal anyway. Thanks, Stephen
D'Arcy J.M. Cain wrote: > On Fri, 10 Oct 2008 16:28:29 -0700 > Josh Berkus <josh@agliodbs.com> wrote: >> It's that time again! Purging antiquated contrib modules. >> >> chkpass: this module is incomplete and does not implement all functions >> it describes. It's not really even useful as an Example since it uses >> crypt() and not any modern encryption. And Darcy hasn't touched it in 6 >> years. > > Well, I still use it a lot but I have it in my own CVS tree anyway so I > won't miss it in contrib. > > However, if all it needs is a modern encryption scheme that's probably > an hour's work. The only reason that I haven't done so yet is because > I have no use case. If I am storing encrypted passwords in a database > it's probably because I need to generate many password files from it. > As a result I need to keep it at the LCD. That's DES. Is there any reason for using this one over just using pgcrypto, which also gives you a whole lot more functionality? > Which described functions are missing? I wouldn't mind having a > chance to clean it up before it is removed just in case someone else > wants to grab it from CVS later. /* This function checks that the password is a good one* It's just a placeholder for now */ static int verify_pass(const char *str) { return 0; } It is documented that this is just a stub though. //Magnus
On Sat, 11 Oct 2008 16:07:31 +0200 Magnus Hagander <magnus@hagander.net> wrote: > D'Arcy J.M. Cain wrote: > > However, if all it needs is a modern encryption scheme that's probably > > an hour's work. The only reason that I haven't done so yet is because > > I have no use case. If I am storing encrypted passwords in a database > > it's probably because I need to generate many password files from it. > > As a result I need to keep it at the LCD. That's DES. > > Is there any reason for using this one over just using pgcrypto, which > also gives you a whole lot more functionality? Not quite the same. The pgcrypto module adds encryption functions but chkpass adds an encrypted type. I suppose chkpass could be implemented in terms of pgcrypto if one wished. > > Which described functions are missing? I wouldn't mind having a > > chance to clean it up before it is removed just in case someone else > > wants to grab it from CVS later. > > /* This function checks that the password is a good one > * It's just a placeholder for now */ > static int > verify_pass(const char *str) > { > return 0; > } > > > It is documented that this is just a stub though. Ah yes. I generally call external modules for that functionality as they are much better at that than I could be in chkpass. I can upgrade the external module when new ones appear rather than recompiling chkpass each time. -- D'Arcy J.M. Cain <darcy@druid.net> | Democracy is three wolves http://www.druid.net/darcy/ | and a sheep voting on +1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
D'Arcy, > However, if all it needs is a modern encryption scheme that's probably > an hour's work. The only reason that I haven't done so yet is because > I have no use case. Well, I had no use case either which is why I didn't propose updating it. I can certainly see having chkpass live on pgFoundry, though. --Josh
All, So it sounds like intagg is still in use/development. But ... is it more of an example, or is it useful as a type/function in production? --Josh
On Sat, 11 Oct 2008 11:57:50 -0700 Josh Berkus <josh@agliodbs.com> wrote: > > However, if all it needs is a modern encryption scheme that's probably > > an hour's work. The only reason that I haven't done so yet is because > > I have no use case. > > Well, I had no use case either which is why I didn't propose updating > it. I can certainly see having chkpass live on pgFoundry, though. No need. I have places to put it up. I would like to make the following changes for the CVS archives before it is removed though. Any objections? Index: chkpass.c =================================================================== RCS file: /cvsroot/pgsql/contrib/chkpass/chkpass.c,v retrieving revision 1.20 diff -u -p -u -r1.20 chkpass.c --- chkpass.c 25 Mar 2008 22:42:41 -0000 1.20 +++ chkpass.c 11 Oct 2008 19:52:52 -0000 @@ -70,6 +70,7 @@ chkpass_in(PG_FUNCTION_ARGS) char *str = PG_GETARG_CSTRING(0); chkpass *result; char mysalt[4]; + static bool random_initialized = false; static char salt_chars[] = "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"; @@ -88,10 +89,16 @@ chkpass_in(PG_FUNCTION_ARGS) result = (chkpass *) palloc(sizeof(chkpass)); + if (!random_initialized) + { + srandom((unsigned int) time(NULL)); + random_initialized = true; + } + mysalt[0] = salt_chars[random() & 0x3f]; mysalt[1] = salt_chars[random() & 0x3f]; - mysalt[2] = 0; /* technically the terminator is not necessary - * but I like to play safe */ + mysalt[2] = 0; /* technically the terminator is not + * necessary but I like to play safe */ strcpy(result->password, crypt(str, mysalt)); PG_RETURN_POINTER(result);} @@ -108,9 +115,11 @@ chkpass_out(PG_FUNCTION_ARGS) chkpass *password = (chkpass *) PG_GETARG_POINTER(0); char *result; - result = (char *) palloc(16); - result[0] = ':'; - strcpy(result + 1, password->password); + if ((result = (char *) palloc(16)) != NULL) + { + result[0] = ':'; + strcpy(result + 1, password->password); + } PG_RETURN_CSTRING(result);} @@ -142,6 +151,9 @@ chkpass_eq(PG_FUNCTION_ARGS) text *a2 = PG_GETARG_TEXT_PP(1); char str[9]; + if (!a1 || !a2) + PG_RETURN_BOOL(0); + text_to_cstring_buffer(a2, str, sizeof(str)); PG_RETURN_BOOL(strcmp(a1->password, crypt(str, a1->password)) == 0);} @@ -154,6 +166,9 @@ chkpass_ne(PG_FUNCTION_ARGS) text *a2 = PG_GETARG_TEXT_PP(1); char str[9]; + if (!a1 || !a2) + PG_RETURN_BOOL(0); + text_to_cstring_buffer(a2, str, sizeof(str)); PG_RETURN_BOOL(strcmp(a1->password, crypt(str, a1->password)) != 0);} -- D'Arcy J.M. Cain <darcy@druid.net> | Democracy is three wolves http://www.druid.net/darcy/ | and a sheep voting on +1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
Josh Berkus wrote: > So it sounds like intagg is still in use/development. But ... is it > more of an example, or is it useful as a type/function in production? Where I work we (and our customers) use it in our production systems. At first glance it seems our reasons for using it are mostly legacy reasons dating to 8.1 where intagg was the best way to write some queries. At least some of these seem to be unnecessary with 8.3. If intagg's at risk of going away soon I could further check the range of queries where we use it against 8.3 or CVS head if that's useful to the discussion. From our testing notes, here's another 8.1 query where we had order-of-magnitude speedups using intagg and friends. -- with 30000 -- explain analyze select fac_nam from userfeatures.point_features join entity_facets using (entity_id) where featureid=115group by fac_nam; -- -- Total runtime: 7125.322 ms -- select fac_nam from (select distinct int_array_enum(fac_ids) as fac_id from (select distinct fac_ids from entity_facidsnatural join point_features where featureid=115) as a) as a join facet_lookup using (fac_id); -- -- Total runtime: 1297.558 ms -- explain analyze select fac_nam from (select int_array_enum(fac_ids) as fac_id from (select fac_ids from entity_facidsnatural join point_features where featureid=115 group by fac_ids) as a group by int_array_enum(fac_ids)) asa join facet_lookup using (fac_id) order by fac_nam; -- -- Total runtime: 1164.258 ms -- explain analyze select fac_nam from (select distinct int_array_enum(fac_ids) as fac_id from (select intarray_union_agg(fac_ids)as fac_ids from entity_facids natural join point_features where featureid=115) as a) as a joinfacet_lookup using (fac_id); -- -- Total runtime: 803.187 ms I can check it on 8.3 monday.
Josh Berkus <josh@agliodbs.com> writes: > So it sounds like intagg is still in use/development. But ... is it > more of an example, or is it useful as a type/function in production? You're still asking the wrong list ... regards, tom lane
On 10/11/08, D'Arcy J.M. Cain <darcy@druid.net> wrote: > No need. I have places to put it up. I would like to make the > following changes for the CVS archives before it is removed though. > Any objections? > > Index: chkpass.c > =================================================================== > RCS file: /cvsroot/pgsql/contrib/chkpass/chkpass.c,v > retrieving revision 1.20 > diff -u -p -u -r1.20 chkpass.c > --- chkpass.c 25 Mar 2008 22:42:41 -0000 1.20 > +++ chkpass.c 11 Oct 2008 19:52:52 -0000 > @@ -70,6 +70,7 @@ chkpass_in(PG_FUNCTION_ARGS) > char *str = PG_GETARG_CSTRING(0); > chkpass *result; > char mysalt[4]; > + static bool random_initialized = false; > static char salt_chars[] = > "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"; > > @@ -88,10 +89,16 @@ chkpass_in(PG_FUNCTION_ARGS) > > result = (chkpass *) palloc(sizeof(chkpass)); > > + if (!random_initialized) > + { > + srandom((unsigned int) time(NULL)); > + random_initialized = true; > + } > + This is bad idea, postgres already does srandom() > mysalt[0] = salt_chars[random() & 0x3f]; > mysalt[1] = salt_chars[random() & 0x3f]; > - mysalt[2] = 0; /* technically the terminator is not > necessary > - * but I like to play safe */ > + mysalt[2] = 0; /* technically the terminator is not > + * necessary but I like to play safe */ > strcpy(result->password, crypt(str, mysalt)); > PG_RETURN_POINTER(result); > } Comment change only? Ok. > @@ -108,9 +115,11 @@ chkpass_out(PG_FUNCTION_ARGS) > chkpass *password = (chkpass *) PG_GETARG_POINTER(0); > char *result; > > - result = (char *) palloc(16); > - result[0] = ':'; > - strcpy(result + 1, password->password); > + if ((result = (char *) palloc(16)) != NULL) > + { > + result[0] = ':'; > + strcpy(result + 1, password->password); > + } AFAIK palloc() cannot return NULL? > > PG_RETURN_CSTRING(result); > } > @@ -142,6 +151,9 @@ chkpass_eq(PG_FUNCTION_ARGS) > text *a2 = PG_GETARG_TEXT_PP(1); > char str[9]; > > + if (!a1 || !a2) > + PG_RETURN_BOOL(0); > + > text_to_cstring_buffer(a2, str, sizeof(str)); > PG_RETURN_BOOL(strcmp(a1->password, crypt(str, a1->password)) == 0); > } > @@ -154,6 +166,9 @@ chkpass_ne(PG_FUNCTION_ARGS) > text *a2 = PG_GETARG_TEXT_PP(1); > char str[9]; > > + if (!a1 || !a2) > + PG_RETURN_BOOL(0); > + > text_to_cstring_buffer(a2, str, sizeof(str)); > PG_RETURN_BOOL(strcmp(a1->password, crypt(str, a1->password)) != 0); > > } The functions are already defined as STRICT, so unnecessary. Also returning non-NULL on NULL input seems to go against SQL style. -- marko
> Josh Berkus <josh@agliodbs.com> writes: >> So it sounds like intagg is still in use/development. But ... is it >> more of an example, or is it useful as a type/function in production? Based on the patch submitted it's definitely in heavy production use. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB'sPostgreSQL training!
On Sun, 12 Oct 2008 12:57:58 +0300 "Marko Kreen" <markokr@gmail.com> wrote: > On 10/11/08, D'Arcy J.M. Cain <darcy@druid.net> wrote: > > + if (!random_initialized) > > + { > > + srandom((unsigned int) time(NULL)); > > + random_initialized = true; > > + } > > This is bad idea, postgres already does srandom() Is that new? I added that to my local version at one time because I was getting the same salt every time I ran it. > > - * but I like to play safe */ > > + mysalt[2] = 0; /* technically the terminator is not > > + * necessary but I like to play safe */ > > strcpy(result->password, crypt(str, mysalt)); > > PG_RETURN_POINTER(result); > > } > > Comment change only? Ok. If that turns out to be the only change I won't bother. > > + if ((result = (char *) palloc(16)) != NULL) > > + { > > + result[0] = ':'; > > + strcpy(result + 1, password->password); > > + } > > AFAIK palloc() cannot return NULL? Really? My program will simply come crashing down if there is a memory problem without giving me a chance to clean up? > > + if (!a1 || !a2) > > + PG_RETURN_BOOL(0); > > + > > text_to_cstring_buffer(a2, str, sizeof(str)); > > PG_RETURN_BOOL(strcmp(a1->password, crypt(str, a1->password)) == 0); > > } > > @@ -154,6 +166,9 @@ chkpass_ne(PG_FUNCTION_ARGS) > > text *a2 = PG_GETARG_TEXT_PP(1); > > char str[9]; > > > > + if (!a1 || !a2) > > + PG_RETURN_BOOL(0); > > + > > text_to_cstring_buffer(a2, str, sizeof(str)); > > PG_RETURN_BOOL(strcmp(a1->password, crypt(str, a1->password)) != 0); > > > > } > > The functions are already defined as STRICT, so unnecessary. > Also returning non-NULL on NULL input seems to go against SQL style. I'm a belt and suspenders guy. However, I agree that this is unneccessary. So, I guess I just need to know, how long has PG been doing srandom(). -- D'Arcy J.M. Cain <darcy@druid.net> | Democracy is three wolves http://www.druid.net/darcy/ | and a sheep voting on +1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
On Sun, Oct 12, 2008 at 10:41:21AM -0400, D'Arcy J.M. Cain wrote: > > > + if ((result = (char *) palloc(16)) != NULL) > > > + { > > > + result[0] = ':'; > > > + strcpy(result + 1, password->password); > > > + } > > > > AFAIK palloc() cannot return NULL? > > Really? My program will simply come crashing down if there is a memory > problem without giving me a chance to clean up? If by "come crashing down" you mean "the transaction will be aborted, all memory and other resources freed and the user gets a nice error message", then yes. palloc will never return an invalid pointer. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Please line up in a tree and maintain the heap invariant while > boarding. Thank you for flying nlogn airlines.
"D'Arcy J.M. Cain" <darcy@druid.net> writes: > On Sun, 12 Oct 2008 12:57:58 +0300 > "Marko Kreen" <markokr@gmail.com> wrote: >> This is bad idea, postgres already does srandom() > Is that new? I added that to my local version at one time because I > was getting the same salt every time I ran it. Quite a while ago we went around and removed random calls to srandom. In any case it is *not* acceptable to put one into a datatype, because the effects are global. If we did have a problem like that, the appropriate solution would be elsewhere. regards, tom lane
D'Arcy J.M. Cain wrote: > On Sun, 12 Oct 2008 12:57:58 +0300 > "Marko Kreen" <markokr@gmail.com> wrote: >> On 10/11/08, D'Arcy J.M. Cain <darcy@druid.net> wrote: >>> + if (!random_initialized) >>> + { >>> + srandom((unsigned int) time(NULL)); >>> + random_initialized = true; >>> + } >> This is bad idea, postgres already does srandom() > > Is that new? I added that to my local version at one time because I > was getting the same salt every time I ran it. You really should not be using the standard random() function to generat salts... You need a more secure one. >>> + if ((result = (char *) palloc(16)) != NULL) >>> + { >>> + result[0] = ':'; >>> + strcpy(result + 1, password->password); >>> + } >> AFAIK palloc() cannot return NULL? > > Really? My program will simply come crashing down if there is a memory > problem without giving me a chance to clean up? It will do an ereport() call and clean things up. This is one of the things that rock with using palloc ;-) //Magnus
Magnus Hagander <magnus@hagander.net> writes: > D'Arcy J.M. Cain wrote: >> On Sun, 12 Oct 2008 12:57:58 +0300 >> "Marko Kreen" <markokr@gmail.com> wrote: >>> On 10/11/08, D'Arcy J.M. Cain <darcy@druid.net> wrote: >>>> + if (!random_initialized) >>>> + { >>>> + srandom((unsigned int) time(NULL)); >>>> + random_initialized = true; >>>> + } >>> This is bad idea, postgres already does srandom() >> >> Is that new? I added that to my local version at one time because I >> was getting the same salt every time I ran it. > > You really should not be using the standard random() function to generat > salts... You need a more secure one. Do salts have to be secure at all? I thought they just had to be widely distributed so that you couldn't use a dictionary attack. The traditional way to pick crypt salts for /etc/passwd was to use the first two letters of the username after all. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB'sPostgreSQL training!
Gregory Stark wrote: > Magnus Hagander <magnus@hagander.net> writes: > >> D'Arcy J.M. Cain wrote: >>> On Sun, 12 Oct 2008 12:57:58 +0300 >>> "Marko Kreen" <markokr@gmail.com> wrote: >>>> On 10/11/08, D'Arcy J.M. Cain <darcy@druid.net> wrote: >>>>> + if (!random_initialized) >>>>> + { >>>>> + srandom((unsigned int) time(NULL)); >>>>> + random_initialized = true; >>>>> + } >>>> This is bad idea, postgres already does srandom() >>> Is that new? I added that to my local version at one time because I >>> was getting the same salt every time I ran it. >> You really should not be using the standard random() function to generat >> salts... You need a more secure one. > > Do salts have to be secure at all? I thought they just had to be widely > distributed so that you couldn't use a dictionary attack. The traditional way > to pick crypt salts for /etc/passwd was to use the first two letters of the > username after all. Gah. I blame (jet|beer)lag. You're right, of course. Salts just need to be distributed, because you usually store them along with the encrypted password after all. Now it can be argued that random() just isn't distributed enough for even that - using the userid is usually guaranteed to be distributed since it's the primary key... /me pulls bag back over head. //Magnus
Hi, Josh Berkus wrote: > So it sounds like intagg is still in use/development. But ... is it > more of an example, or is it useful as a type/function in production? We use it in production for quite remarkable speedups of operations on int4[]. Having reviewed the last commit fest's intagg patch as well, I thought we agreed that a more general functionality is wanted for core. But as long as we don't have that, I'd like intagg to stay in contrib. Regards Markus Wanner
On Monday 13 October 2008 04:53:44 Markus Wanner wrote: > Hi, > > Josh Berkus wrote: > > So it sounds like intagg is still in use/development. But ... is it > > more of an example, or is it useful as a type/function in production? > > We use it in production for quite remarkable speedups of operations on > int4[]. > +1, istr we are indexing icount values on one system, but certainly this is something being actively used. > Having reviewed the last commit fest's intagg patch as well, I thought > we agreed that a more general functionality is wanted for core. But as > long as we don't have that, I'd like intagg to stay in contrib. > While I agree that the "right" solution would be to make this code work more generally for other aggregates, I also think that until someone is willing to do that work, this needs to stay in contrib, and that we ought to accept patches improving it. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
2008/10/14 Robert Treat <xzilla@users.sourceforge.net> > > On Monday 13 October 2008 04:53:44 Markus Wanner wrote: > > > Having reviewed the last commit fest's intagg patch as well, I thought > > we agreed that a more general functionality is wanted for core. But as > > long as we don't have that, I'd like intagg to stay in contrib. > > > > While I agree that the "right" solution would be to make this code work more > generally for other aggregates, I also think that until someone is willing to > do that work, this needs to stay in contrib, and that we ought to accept > patches improving it. I started to look at implementing array_agg by making the existing intagg stuff more generic, but I came across an issue (which might also be a bug in intagg). Intagg relies on being able to stomp on the input transition value, and basically allocates the working array as a block, reallocating it when it becomes too small. The lower bound of the array is (ab)used to keep track of how many items have been allocated. For a generic aggregate, which accepts any types and NULL inputs, in order to avoid a linear search through the data to find where to put the next element, it would be useful to instead store the offset to the next free byte in the lower pointer. The problem is that I can't see any way to guarantee that someone wouldn't create another aggregate using the same transition function, but giving an initial value to the lower bound which will cause the transition function to do naughty things (as far as I can tell, this is also true of intagg - giving an inital state value of '[200000:200000]{1}' will cause it to happily write up to 200000 integers off the end of that one element array without allocating any extra storage... I'm not sure what the best way around this is - it seems that performancewise, avoiding an array_seek() call in the transition function would be good. Is there some way that the transition function can tell which context the state value was allocated in, and thus whether it was supplied as an initial value or not? Ian
Sorry for top posting - darn phone... 1) as I mentioned when I reviewed the patch in commitfest I don't see the point of the manual memory management. Palloc/realloc has just the same kind of doubling behaviour behind the scenes anyways. Just call realloc before adding every new element. 2) look at the aggregate for count() which avoids pallocing a new bigint using a similar trick. It protects against the bug you describe checking the fctxt to verify it's being called as an aggregate function and not a regular function. So as long as the aggregate has the right initial state it should be fine. Come to think of it though... Do we require creators of new aggregates own the state transition function? If not we have a problem... greg On 15 Oct 2008, at 07:08 PM, "Ian Caulfield" <ian.caulfield@gmail.com> wrote: > 2008/10/14 Robert Treat <xzilla@users.sourceforge.net> >> >> On Monday 13 October 2008 04:53:44 Markus Wanner wrote: >> >>> Having reviewed the last commit fest's intagg patch as well, I >>> thought >>> we agreed that a more general functionality is wanted for core. >>> But as >>> long as we don't have that, I'd like intagg to stay in contrib. >>> >> >> While I agree that the "right" solution would be to make this code >> work more >> generally for other aggregates, I also think that until someone is >> willing to >> do that work, this needs to stay in contrib, and that we ought to >> accept >> patches improving it. > > I started to look at implementing array_agg by making the existing > intagg stuff more generic, but I came across an issue (which might > also be a bug in intagg). Intagg relies on being able to stomp on the > input transition value, and basically allocates the working array as a > block, reallocating it when it becomes too small. The lower bound of > the array is (ab)used to keep track of how many items have been > allocated. > > For a generic aggregate, which accepts any types and NULL inputs, in > order to avoid a linear search through the data to find where to put > the next element, it would be useful to instead store the offset to > the next free byte in the lower pointer. > > The problem is that I can't see any way to guarantee that someone > wouldn't create another aggregate using the same transition function, > but giving an initial value to the lower bound which will cause the > transition function to do naughty things (as far as I can tell, this > is also true of intagg - giving an inital state value of > '[200000:200000]{1}' will cause it to happily write up to 200000 > integers off the end of that one element array without allocating any > extra storage... > > I'm not sure what the best way around this is - it seems that > performancewise, avoiding an array_seek() call in the transition > function would be good. Is there some way that the transition function > can tell which context the state value was allocated in, and thus > whether it was supplied as an initial value or not? > > Ian > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers
2008/10/15 Greg Stark <greg.stark@enterprisedb.com>: > Sorry for top posting - darn phone... > > 1) as I mentioned when I reviewed the patch in commitfest I don't see the > point of the manual memory management. Palloc/realloc has just the same kind > of doubling behaviour behind the scenes anyways. Just call realloc before > adding every new element. Does this work? If an aggregate returns a different pointer to the one it was passed, nodeAgg.c will free the original - but if we've actually realloc'ed it, won't this lead to a double free? > 2) look at the aggregate for count() which avoids pallocing a new bigint > using a similar trick. It protects against the bug you describe checking the > fctxt to verify it's being called as an aggregate function and not a regular > function. So as long as the aggregate has the right initial state it should > be fine. The right initial state bit is what I was worried about, though I think I've thought of a way around for now... > Come to think of it though... Do we require creators of new aggregates own > the state transition function? If not we have a problem... No idea... Ian > greg > > On 15 Oct 2008, at 07:08 PM, "Ian Caulfield" <ian.caulfield@gmail.com> > wrote: > >> 2008/10/14 Robert Treat <xzilla@users.sourceforge.net> >>> >>> On Monday 13 October 2008 04:53:44 Markus Wanner wrote: >>> >>>> Having reviewed the last commit fest's intagg patch as well, I thought >>>> we agreed that a more general functionality is wanted for core. But as >>>> long as we don't have that, I'd like intagg to stay in contrib. >>>> >>> >>> While I agree that the "right" solution would be to make this code work >>> more >>> generally for other aggregates, I also think that until someone is >>> willing to >>> do that work, this needs to stay in contrib, and that we ought to accept >>> patches improving it. >> >> I started to look at implementing array_agg by making the existing >> intagg stuff more generic, but I came across an issue (which might >> also be a bug in intagg). Intagg relies on being able to stomp on the >> input transition value, and basically allocates the working array as a >> block, reallocating it when it becomes too small. The lower bound of >> the array is (ab)used to keep track of how many items have been >> allocated. >> >> For a generic aggregate, which accepts any types and NULL inputs, in >> order to avoid a linear search through the data to find where to put >> the next element, it would be useful to instead store the offset to >> the next free byte in the lower pointer. >> >> The problem is that I can't see any way to guarantee that someone >> wouldn't create another aggregate using the same transition function, >> but giving an initial value to the lower bound which will cause the >> transition function to do naughty things (as far as I can tell, this >> is also true of intagg - giving an inital state value of >> '[200000:200000]{1}' will cause it to happily write up to 200000 >> integers off the end of that one element array without allocating any >> extra storage... >> >> I'm not sure what the best way around this is - it seems that >> performancewise, avoiding an array_seek() call in the transition >> function would be good. Is there some way that the transition function >> can tell which context the state value was allocated in, and thus >> whether it was supplied as an initial value or not? >> >> Ian >> >> -- >> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-hackers >
Greg Stark <greg.stark@enterprisedb.com> writes: > Come to think of it though... Do we require creators of new aggregates > own the state transition function? If not we have a problem... No, but they are required to have permission to call it. So you could restrict the transition function to superusers if you were worried. regards, tom lane