The Axe list

Started by Josh Berkusover 17 years ago34 messageshackers
Jump to latest
#1Josh Berkus
josh@agliodbs.com

Folks,

It's that time again! Purging antiquated contrib modules.

chkpass: this module is incomplete and does not implement all functions
it describes. It's not really even useful as an Example since it uses
crypt() and not any modern encryption. And Darcy hasn't touched it in 6
years.

intagg: the aggregation function has been obsolete since 7.4 because
standard array functionality supports the same. intagg has a nice
equivalent for UNROLL, but it only works for arrays of INT, and only
one-dimensional arrays. Has not been updated since 2001.

Any objections to dropping both of these?

--Josh

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#1)
Re: The Axe list

Josh Berkus <josh@agliodbs.com> writes:

Any objections to dropping both of these?

You should ask on -general, not here, if you are trying to find out
whether the modules have any users.

I tend to agree that chkpass is of doubtful value, but I'm not so sure
about intagg. As you said yourself, we haven't yet replaced its
functionality with a generalized substitute. Also, it's the only
example in the current codebase of a useful technique, ie, an aggregate
function doing its own memory management.

regards, tom lane

#3Robert Haas
robertmhaas@gmail.com
In reply to: Josh Berkus (#1)
Re: The Axe list

intagg: the aggregation function has been obsolete since 7.4 because
standard array functionality supports the same. intagg has a nice
equivalent for UNROLL, but it only works for arrays of INT, and only
one-dimensional arrays. Has not been updated since 2001.

I think this one can be dropped. The definition of a general
array_accum() is very easy, and supplied in the docs:

CREATE AGGREGATE array_accum (anyelement)
(
sfunc = array_append,
stype = anyarray,
initcond = '{}'
);

A general one-dimensional array enumerator is equally easy to define,
though not in the docs (there is a two-dimensional array enumerator
called unnest2 under generate_subscripts):

CREATE OR REPLACE FUNCTION array_enum(anyarray)
RETURNS SETOF anyelement AS
$$
SELECT $1[i] FROM generate_subscripts($1,1) i
$$ LANGUAGE sql;

With a little more work you could even (crazy thought) add some error checking.

It would probably be helpful if the generate_subscripts and array
documentation were cross-referenced a bit better. There's no
indication on the array page that generate_subscripts() is out there,
and it's clearly both very useful and array-related.

...Robert

#4Joshua D. Drake
jd@commandprompt.com
In reply to: Josh Berkus (#1)
Re: The Axe list

On Fri, 10 Oct 2008 16:28:29 -0700
Josh Berkus <josh@agliodbs.com> wrote:

Folks,

It's that time again! Purging antiquated contrib modules.

chkpass: this module is incomplete and does not implement all
functions it describes. It's not really even useful as an Example
since it uses crypt() and not any modern encryption. And Darcy
hasn't touched it in 6 years.

intagg: the aggregation function has been obsolete since 7.4 because
standard array functionality supports the same. intagg has a nice
equivalent for UNROLL, but it only works for arrays of INT, and only
one-dimensional arrays. Has not been updated since 2001.

Any objections to dropping both of these?

None.

--Josh

--
The PostgreSQL Company since 1997: http://www.commandprompt.com/
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/

#5Ron Mayer
rm_pg@cheapcomplexdevices.com
In reply to: Josh Berkus (#1)
Re: The Axe list

Josh Berkus wrote:

intagg: ... Has not been updated since 2001.

Really? Just a couple years ago (2005) bugs we reported were
still getting fixed in it:
http://archives.postgresql.org/pgsql-bugs/2005-03/msg00202.php
http://archives.postgresql.org/pgsql-bugs/2005-04/msg00165.php

Here's one use-case I had at the time.
http://archives.postgresql.org/pgsql-general/2005-04/msg01249.php
I think we still use that code, but I suppose I can re-write it
to use features that were added since then.
And hey - where'd README.int_aggregate go? IIRC that had
a couple interesting use-cases too.

I also like intagg, because it's kinda like a "hello world" for
writing one kind of C extensions. I'm not saying it needs to
stay in contrib. Perhaps it could live on as an example in the docs?

#6tomas@tuxteam.de
tomas@tuxteam.de
In reply to: Ron Mayer (#5)
Re: The Axe list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Fri, Oct 10, 2008 at 09:09:51PM -0700, Ron Mayer wrote:

Josh Berkus wrote:

intagg: ... Has not been updated since 2001.

[...]

I also like intagg, because it's kinda like a "hello world" for
writing one kind of C extensions. I'm not saying it needs to
stay in contrib. Perhaps it could live on as an example in the docs?

and Tom Lane said upthread

I tend to agree that chkpass is of doubtful value, but I'm not so sure
about intagg. As you said yourself, we haven't yet replaced its
functionality with a generalized substitute. Also, it's the only
example in the current codebase of a useful technique, ie, an aggregate
function doing its own memory management.

So it seems that intagg should rather live in a section "examples" than
in contrib?

(this seems to stem from the promotion of contrib from "just a bunch of
stuff which might be useful" to "useful modules you can use as-is to
extend PostgreSQL, no?)

Regards
- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFI8C/yBcgs9XrR2kYRAr3WAJ0dDY6yVIqTitmjVB2bfih5K3/rzwCfd7wk
7hvtw/pHPyfbEUgGvFT87QQ=
=iofV
-----END PGP SIGNATURE-----

#7Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#3)
Re: The Axe list

"Robert Haas" <robertmhaas@gmail.com> writes:

CREATE AGGREGATE array_accum (anyelement)

CREATE OR REPLACE FUNCTION array_enum(anyarray)

Have you actually tried these functions on large data sets? They're not in the
same performance league as intagg. Your array_accum is O(n^2)!

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!

#8Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#7)
Re: The Axe list

CREATE AGGREGATE array_accum (anyelement)

CREATE OR REPLACE FUNCTION array_enum(anyarray)

Have you actually tried these functions on large data sets?

No. :-)

They're not in the same performance league as intagg. Your array_accum is O(n^2)!

It's not "mine" - I copied it from the official PostgreSQL
documentation and used it because it did what I needed!

If it's a bad way to do it, that's certainly an argument for keeping
(or maybe generalizing) intagg.

...Robert

#9Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#8)
Re: The Axe list

"Robert Haas" <robertmhaas@gmail.com> writes:

If it's a bad way to do it, that's certainly an argument for keeping
(or maybe generalizing) intagg.

There was actually a patch this past commitfest to *add* functionality to
intagg. When I reviewed it I said it would make more sense to generalize it
and integrate that functionality into the base array operations.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's RemoteDBA services!

#10Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#9)
Re: The Axe list

If it's a bad way to do it, that's certainly an argument for keeping
(or maybe generalizing) intagg.

There was actually a patch this past commitfest to *add* functionality to
intagg. When I reviewed it I said it would make more sense to generalize it
and integrate that functionality into the base array operations.

I suppose it's just a question of finding enough round tuits.

I might take a look at it but my grasp of toasting and memory
management may not be good enough yet.

...Robert

#11D'Arcy J.M. Cain
darcy@druid.net
In reply to: Josh Berkus (#1)
Re: The Axe list

On Fri, 10 Oct 2008 16:28:29 -0700
Josh Berkus <josh@agliodbs.com> wrote:

It's that time again! Purging antiquated contrib modules.

chkpass: this module is incomplete and does not implement all functions
it describes. It's not really even useful as an Example since it uses
crypt() and not any modern encryption. And Darcy hasn't touched it in 6
years.

Well, I still use it a lot but I have it in my own CVS tree anyway so I
won't miss it in contrib.

However, if all it needs is a modern encryption scheme that's probably
an hour's work. The only reason that I haven't done so yet is because
I have no use case. If I am storing encrypted passwords in a database
it's probably because I need to generate many password files from it.
As a result I need to keep it at the LCD. That's DES.

Which described functions are missing? I wouldn't mind having a chance
to clean it up before it is removed just in case someone else wants to
grab it from CVS later.

-- 
D'Arcy J.M. Cain <darcy@druid.net>         |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 425 1212     (DoD#0082)    (eNTP)   |  what's for dinner.
#12Ron Mayer
rm_pg@cheapcomplexdevices.com
In reply to: tomas@tuxteam.de (#6)
Re: The Axe list

tomas@tuxteam.de wrote:

So it seems that intagg should rather live in a section "examples" than
in contrib?

Perhaps. Seems my old intagg use case from 8.1 is not really needed
anymore since it seems ANY got much smarter since then. Cool.

=================================================================================
== With 8.1
=================================================================================
fli=# explain analyze
fli-# select * from lines, selected_lines
fli-# where selected_lines.id = 16238 and tlid = ANY (line_ids);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=24.51..16899784.67 rows=166031616 width=202) (actual time=0.980..615547.605 rows=2 loops=1)
Join Filter: ("outer".tlid = ANY ("inner".line_ids))
-> Seq Scan on lines (cost=0.00..1956914.72 rows=55343872 width=166) (actual time=0.012..160106.897 rows=55291697 loops=1)
-> Materialize (cost=24.51..24.57 rows=6 width=36) (actual time=0.002..0.003 rows=1 loops=55291697)
-> Seq Scan on selected_lines (cost=0.00..24.50 rows=6 width=36) (actual time=0.012..0.016 rows=1 loops=1)
Filter: (id = 16238)
Total runtime: 615547.680 ms
(7 rows)

fli=# explain analyze
fli-# select *
fli-# from lines,
fli-# (select int_array_enum(line_ids) as lid from selected_lines where id=16238) as z
fli-# where lid = lines.tlid;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..54.42 rows=6 width=170) (actual time=36.755..42.268 rows=2 loops=1)
-> Seq Scan on selected_lines (cost=0.00..24.52 rows=6 width=32) (actual time=0.023..0.029 rows=2 loops=1)
Filter: (id = 16238)
-> Index Scan using rtgr_lines__tlid on lines (cost=0.00..4.96 rows=1 width=166) (actual time=21.105..21.108 rows=1 loops=2)
Index Cond: ("outer"."?column1?" = lines.tlid)
Total runtime: 42.352 ms
(6 rows)

=================================================================================
== With HEAD
=================================================================================

fli=# explain analyze
select * from lines, selected_lines
where selected_lines.id = 16238 and tlid = ANY (line_ids);

fli-# fli-# QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=36.44..424.78 rows=61 width=210) (actual time=0.335..0.421 rows=2 loops=1)
-> Seq Scan on selected_lines (cost=0.00..24.50 rows=6 width=36) (actual time=0.018..0.021 rows=1 loops=1)
Filter: (id = 16238)
-> Bitmap Heap Scan on lines (cost=36.44..66.49 rows=10 width=174) (actual time=0.293..0.374 rows=2 loops=1)
Recheck Cond: (lines.tlid = ANY (selected_lines.line_ids))
-> Bitmap Index Scan on rtgr_lines__tlid (cost=0.00..36.44 rows=10 width=0) (actual time=0.251..0.251 rows=2 loops=1)
Index Cond: (lines.tlid = ANY (selected_lines.line_ids))
Total runtime: 0.653 ms
(8 rows)

#13Stephen Frost
sfrost@snowman.net
In reply to: Bruce Momjian (#7)
Re: The Axe list

* Gregory Stark (stark@enterprisedb.com) wrote:

"Robert Haas" <robertmhaas@gmail.com> writes:

CREATE AGGREGATE array_accum (anyelement)

CREATE OR REPLACE FUNCTION array_enum(anyarray)

Have you actually tried these functions on large data sets? They're not in the
same performance league as intagg. Your array_accum is O(n^2)!

array_accum itself may also end up in core, if I have my dithers. It
makes psql's job to display column-level privs in a nice way alot
easier, and there's been quite a few cases where I've used it outside of
that, along with others. It'd be nice to have there.

That said, once it's in, we really need to rework it to not suck(tm). I
wrote a patch to do just that quite some time ago, but it required some
things in core that were ugly to expose to any aggregation function (as
I recall). If we made that only available to built-in's, then we might
be able to have array_accum in core *and* have it be fast. :)

It's a goal anyway.

Thanks,

Stephen

#14Magnus Hagander
magnus@hagander.net
In reply to: D'Arcy J.M. Cain (#11)
Re: The Axe list

D'Arcy J.M. Cain wrote:

On Fri, 10 Oct 2008 16:28:29 -0700
Josh Berkus <josh@agliodbs.com> wrote:

It's that time again! Purging antiquated contrib modules.

chkpass: this module is incomplete and does not implement all functions
it describes. It's not really even useful as an Example since it uses
crypt() and not any modern encryption. And Darcy hasn't touched it in 6
years.

Well, I still use it a lot but I have it in my own CVS tree anyway so I
won't miss it in contrib.

However, if all it needs is a modern encryption scheme that's probably
an hour's work. The only reason that I haven't done so yet is because
I have no use case. If I am storing encrypted passwords in a database
it's probably because I need to generate many password files from it.
As a result I need to keep it at the LCD. That's DES.

Is there any reason for using this one over just using pgcrypto, which
also gives you a whole lot more functionality?

Which described functions are missing? I wouldn't mind having a
chance to clean it up before it is removed just in case someone else
wants to grab it from CVS later.

/* This function checks that the password is a good one
* It's just a placeholder for now */
static int
verify_pass(const char *str)
{
return 0;
}

It is documented that this is just a stub though.

//Magnus

#15D'Arcy J.M. Cain
darcy@druid.net
In reply to: Magnus Hagander (#14)
Re: The Axe list

On Sat, 11 Oct 2008 16:07:31 +0200
Magnus Hagander <magnus@hagander.net> wrote:

D'Arcy J.M. Cain wrote:

However, if all it needs is a modern encryption scheme that's probably
an hour's work. The only reason that I haven't done so yet is because
I have no use case. If I am storing encrypted passwords in a database
it's probably because I need to generate many password files from it.
As a result I need to keep it at the LCD. That's DES.

Is there any reason for using this one over just using pgcrypto, which
also gives you a whole lot more functionality?

Not quite the same. The pgcrypto module adds encryption functions but
chkpass adds an encrypted type. I suppose chkpass could be implemented
in terms of pgcrypto if one wished.

Which described functions are missing? I wouldn't mind having a
chance to clean it up before it is removed just in case someone else
wants to grab it from CVS later.

/* This function checks that the password is a good one
* It's just a placeholder for now */
static int
verify_pass(const char *str)
{
return 0;
}

It is documented that this is just a stub though.

Ah yes. I generally call external modules for that functionality as
they are much better at that than I could be in chkpass. I can upgrade
the external module when new ones appear rather than recompiling
chkpass each time.

-- 
D'Arcy J.M. Cain <darcy@druid.net>         |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 425 1212     (DoD#0082)    (eNTP)   |  what's for dinner.
#16Josh Berkus
josh@agliodbs.com
In reply to: D'Arcy J.M. Cain (#11)
Re: The Axe list

D'Arcy,

However, if all it needs is a modern encryption scheme that's probably
an hour's work. The only reason that I haven't done so yet is because
I have no use case.

Well, I had no use case either which is why I didn't propose updating
it. I can certainly see having chkpass live on pgFoundry, though.

--Josh

#17Josh Berkus
josh@agliodbs.com
In reply to: Stephen Frost (#13)
Re: The Axe list

All,

So it sounds like intagg is still in use/development. But ... is it
more of an example, or is it useful as a type/function in production?

--Josh

#18D'Arcy J.M. Cain
darcy@druid.net
In reply to: Josh Berkus (#16)
Re: The Axe list

On Sat, 11 Oct 2008 11:57:50 -0700
Josh Berkus <josh@agliodbs.com> wrote:

However, if all it needs is a modern encryption scheme that's probably
an hour's work. The only reason that I haven't done so yet is because
I have no use case.

Well, I had no use case either which is why I didn't propose updating
it. I can certainly see having chkpass live on pgFoundry, though.

No need. I have places to put it up. I would like to make the
following changes for the CVS archives before it is removed though.
Any objections?

Index: chkpass.c
===================================================================
RCS file: /cvsroot/pgsql/contrib/chkpass/chkpass.c,v
retrieving revision 1.20
diff -u -p -u -r1.20 chkpass.c
--- chkpass.c   25 Mar 2008 22:42:41 -0000  1.20
+++ chkpass.c   11 Oct 2008 19:52:52 -0000
@@ -70,6 +70,7 @@ chkpass_in(PG_FUNCTION_ARGS)
    char       *str = PG_GETARG_CSTRING(0);
    chkpass    *result;
    char        mysalt[4];
+   static bool random_initialized = false;
    static char salt_chars[] =
    "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

@@ -88,10 +89,16 @@ chkpass_in(PG_FUNCTION_ARGS)

result = (chkpass *) palloc(sizeof(chkpass));

+   if (!random_initialized)
+   {
+       srandom((unsigned int) time(NULL));
+       random_initialized = true;
+   }
+
    mysalt[0] = salt_chars[random() & 0x3f];
    mysalt[1] = salt_chars[random() & 0x3f];
-   mysalt[2] = 0;              /* technically the terminator is not
necessary
-                                * but I like to play safe */
+   mysalt[2] = 0;              /* technically the terminator is not
+                                * necessary but I like to play safe */
    strcpy(result->password, crypt(str, mysalt));
    PG_RETURN_POINTER(result);
 }
@@ -108,9 +115,11 @@ chkpass_out(PG_FUNCTION_ARGS)
    chkpass    *password = (chkpass *) PG_GETARG_POINTER(0);
    char       *result;
-   result = (char *) palloc(16);
-   result[0] = ':';
-   strcpy(result + 1, password->password);
+   if ((result = (char *) palloc(16)) != NULL)
+   {
+       result[0] = ':';
+       strcpy(result + 1, password->password);
+   }

PG_RETURN_CSTRING(result);
}
@@ -142,6 +151,9 @@ chkpass_eq(PG_FUNCTION_ARGS)
text *a2 = PG_GETARG_TEXT_PP(1);
char str[9];

+ if (!a1 || !a2)
+ PG_RETURN_BOOL(0);
+
text_to_cstring_buffer(a2, str, sizeof(str));
PG_RETURN_BOOL(strcmp(a1->password, crypt(str, a1->password)) == 0);
}
@@ -154,6 +166,9 @@ chkpass_ne(PG_FUNCTION_ARGS)
text *a2 = PG_GETARG_TEXT_PP(1);
char str[9];

+   if (!a1 || !a2)
+       PG_RETURN_BOOL(0);
+
    text_to_cstring_buffer(a2, str, sizeof(str));
    PG_RETURN_BOOL(strcmp(a1->password, crypt(str, a1->password)) != 0);
 }
-- 
D'Arcy J.M. Cain <darcy@druid.net>         |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 425 1212     (DoD#0082)    (eNTP)   |  what's for dinner.
#19Ron Mayer
rm_pg@cheapcomplexdevices.com
In reply to: Josh Berkus (#17)
Re: The Axe list

Josh Berkus wrote:

So it sounds like intagg is still in use/development. But ... is it
more of an example, or is it useful as a type/function in production?

Where I work we (and our customers) use it in our production systems.

At first glance it seems our reasons for using it are mostly
legacy reasons dating to 8.1 where intagg was the best way to
write some queries. At least some of these seem to be unnecessary
with 8.3. If intagg's at risk of going away soon I could
further check the range of queries where we use it against 8.3
or CVS head if that's useful to the discussion.

From our testing notes, here's another 8.1 query where we had
order-of-magnitude speedups using intagg and friends.
-- with 30000
-- explain analyze select fac_nam from userfeatures.point_features join entity_facets using (entity_id) where featureid=115 group by fac_nam;
-- -- Total runtime: 7125.322 ms
-- select fac_nam from (select distinct int_array_enum(fac_ids) as fac_id from (select distinct fac_ids from entity_facids natural join point_features where featureid=115) as a) as a join facet_lookup using (fac_id);
-- -- Total runtime: 1297.558 ms
-- explain analyze select fac_nam from (select int_array_enum(fac_ids) as fac_id from (select fac_ids from entity_facids natural join point_features where featureid=115 group by fac_ids) as a group by int_array_enum(fac_ids)) as a join facet_lookup using (fac_id) order by fac_nam;
-- -- Total runtime: 1164.258 ms
-- explain analyze select fac_nam from (select distinct int_array_enum(fac_ids) as fac_id from (select intarray_union_agg(fac_ids) as fac_ids from entity_facids natural join point_features where featureid=115) as a) as a join facet_lookup using (fac_id);
-- -- Total runtime: 803.187 ms
I can check it on 8.3 monday.

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#17)
Re: The Axe list

Josh Berkus <josh@agliodbs.com> writes:

So it sounds like intagg is still in use/development. But ... is it
more of an example, or is it useful as a type/function in production?

You're still asking the wrong list ...

regards, tom lane

#21Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#20)
#22Marko Kreen
markokr@gmail.com
In reply to: D'Arcy J.M. Cain (#18)
#23D'Arcy J.M. Cain
darcy@druid.net
In reply to: Marko Kreen (#22)
#24Martijn van Oosterhout
kleptog@svana.org
In reply to: D'Arcy J.M. Cain (#23)
#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: D'Arcy J.M. Cain (#23)
#26Magnus Hagander
magnus@hagander.net
In reply to: D'Arcy J.M. Cain (#23)
#27Bruce Momjian
bruce@momjian.us
In reply to: Magnus Hagander (#26)
#28Magnus Hagander
magnus@hagander.net
In reply to: Bruce Momjian (#27)
#29Markus Wanner
markus@bluegap.ch
In reply to: Josh Berkus (#17)
#30Robert Treat
xzilla@users.sourceforge.net
In reply to: Markus Wanner (#29)
#31Ian Caulfield
ian.caulfield@gmail.com
In reply to: Robert Treat (#30)
#32Bruce Momjian
bruce@momjian.us
In reply to: Ian Caulfield (#31)
#33Ian Caulfield
ian.caulfield@gmail.com
In reply to: Bruce Momjian (#32)
#34Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#32)