question about HTTP API

Started by Szymon Guz over 12 years ago · 10 messages
#1 Szymon Guz
mabewlun@gmail.com

Are there any attempts at implementing the HTTP server described at
http://wiki.postgresql.org/wiki/HTTP_API?

It seems there are only design ideas so far. Are there any thoughts on
implementation, such as building on an existing HTTP server or writing
everything from scratch?

regards
Szymon

#2 Josh Berkus
josh@agliodbs.com
In reply to: Szymon Guz (#1)
Re: question about HTTP API

On 08/08/2013 11:44 AM, Szymon Guz wrote:

Are there any attempts at implementing the HTTP server described at
http://wiki.postgresql.org/wiki/HTTP_API?

It seems there are only design ideas so far. Are there any thoughts on
implementation, such as building on an existing HTTP server or writing
everything from scratch?

Well, there's HTSQL: http://htsql.org/

Other than that, no. I was thinking of creating a general tool as a
custom background worker, which would take stored procedure calls and
pass them through to PostgreSQL, returning results as JSON. Mainly
because I need it for a project. However, it wouldn't accept arbitrary
queries.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3 Andrew Tipton
andrew@kiwidrew.com
In reply to: Szymon Guz (#1)
Re: question about HTTP API

On Fri, Aug 9, 2013 at 2:44 AM, Szymon Guz <mabewlun@gmail.com> wrote:

Are there any attempts at implementing the HTTP server described at
http://wiki.postgresql.org/wiki/HTTP_API?

It seems there are only design ideas so far. Are there any thoughts on
implementation, such as building on an existing HTTP server or writing
everything from scratch?

I recently threw together a quick-and-dirty prototype of this idea. It was
an external tool which used the libmicrohttpd library to accept incoming
requests, convert them to a SQL query (which called a stored procedure),
and return the query results. (It allowed *any* content-type to be
returned, not just JSON.) I only got as far as handling GET requests. The
code is available here:

http://code.malloclabs.com/pghttpd.v1

I'm also aware of an nginx module (ngx_postgres) that lets you transform
requests into queries against a Postgres database, but it isn't "generic"
-- you have to configure it for each URL that should be handled.
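To make the request-to-query translation concrete, here is a minimal sketch of
the kind of mapping described above. This is *not* the pghttpd code: the
/proc?arg=value URL scheme, the identifier checks, and the (mime_type, body)
result contract are all invented for illustration.

```python
# Sketch only: map an HTTP GET onto a parameterized stored-procedure call.
# The URL scheme and the (mime_type, body) result shape are invented; having
# the procedure return the content-type alongside the body is what would
# allow *any* content-type, not just JSON.
from urllib.parse import parse_qsl, urlsplit

def request_to_query(url):
    """E.g. /get_customer?id=42 -> SQL text plus an ordered parameter list."""
    parts = urlsplit(url)
    proc = parts.path.strip("/")
    if not proc.replace("_", "").isalnum():  # crude guard against SQL injection
        raise ValueError("invalid procedure name: %r" % proc)
    params = parse_qsl(parts.query)
    for name, _ in params:
        if not name.replace("_", "").isalnum():  # names go into SQL text too
            raise ValueError("invalid parameter name: %r" % name)
    args = ", ".join(
        "%s := $%d" % (name, i + 1) for i, (name, _) in enumerate(params)
    )
    sql = "SELECT mime_type, body FROM %s(%s)" % (proc, args)
    return sql, [value for _, value in params]
```

Values travel separately as query parameters ($1, $2, ...), so only the
validated identifiers are ever interpolated into the SQL text.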

Regards,
Andrew Tipton

#4 Andrew Tipton
andrew@kiwidrew.com
In reply to: Josh Berkus (#2)
Re: question about HTTP API

On Fri, Aug 9, 2013 at 3:44 AM, Josh Berkus <josh@agliodbs.com> wrote:

Well, there's HTSQL: http://htsql.org/

Other than that, no. I was thinking of creating a general tool as a
custom background worker, which would take stored procedure calls and
pass them through to PostgreSQL, returning results as JSON. Mainly
because I need it for a project. However, it wouldn't accept arbitrary
queries.

I'm actually in the process of writing an HTTP server that lives inside
Postgres. I hope to have a WIP patch ready in the next week or two, and
then (if all goes well) submit it for CF2. [There are a few impediments to
doing this as an extension module, which I shall detail as part of the WIP
patch...]

Why integrate a webserver with Postgres? Well, the trend we're seeing
in web development is to push much of the display logic into client-side
JavaScript frameworks and expose the database through an HTTP API. (Good
examples of this are Parse and Meteor.) CouchDB can even host the
application's static content alongside the data. As a result, many
applications don't need a middle-tier Python/Ruby/Java framework at all.

One of my goals is to allow Postgres to directly serve HTTP requests and
return arbitrary content (e.g. text/html in addition to JSON) directly to
end-user browsers. With the JSON datatype and PL/v8, code can even be
re-used on both client and server. Getting rid of an entire middle tier
would make small-scale application development dramatically easier.

Regards,
Andrew Tipton

#5 Greg Stark
stark@mit.edu
In reply to: Andrew Tipton (#3)
Re: question about HTTP API

On Fri, Aug 9, 2013 at 9:21 AM, Andrew Tipton <andrew@kiwidrew.com> wrote:

I recently threw together a quick-and-dirty prototype of this idea. It
was an external tool which used the libmicrohttpd library to accept
incoming requests, convert them to a SQL query (which called a stored
procedure), and return the query results. (It allowed *any* content-type
to be returned, not just JSON.) I only got as far as handling GET
requests. The code is available here:

I looked at the wiki page and thought it had a lot of good ideas, but also
a lot of open questions. Do you have any idea how to tackle the session
problem?

Postgres has always assumed session == backend == connection. 2PC prepared
transactions are the one main break in this model, and they can take a lot
of shortcuts because they know there will be no more operations in the
transaction aside from commit or rollback.

A decent HTTP RPC layer will need some way of creating a session and
issuing multiple requests on that session. That session will need to be
stored and available for future requests. The obvious concern is state
like the current database, current role, GUCs, and prepared queries. But
even if you're prepared to discard those for a stateless interface, the
performance cost of not having a relcache built will be pretty severe.

I suspect this is something better built into something like pgbouncer
which already has to deal with multiplexing many clients onto a single
connection.
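Purely to make the session problem concrete, here is an illustrative sketch of
the token-based state replay such an HTTP layer would need. Nothing here is a
proposal and every name is invented; note that it can restore the role and
GUCs by replaying SQL, but it cannot cheaply restore prepared statements or a
warm relcache, which live inside a particular backend.

```python
# Illustrative only: a server-side session registry an HTTP layer could use
# to fake sessions over stateless requests. A real version would also have
# to quote identifiers and values properly and expire idle sessions.
import secrets

class SessionRegistry:
    def __init__(self):
        self._sessions = {}

    def create(self, role, gucs=None):
        """Issue an opaque token the client sends back with each request."""
        token = secrets.token_hex(16)
        self._sessions[token] = {"role": role, "gucs": dict(gucs or {})}
        return token

    def replay_sql(self, token):
        """SQL to re-establish session state at the start of each request.
        Values here are assumed pre-quoted; prepared statements and the
        relcache cannot be restored this way at all."""
        state = self._sessions[token]
        stmts = ["SET ROLE %s" % state["role"]]
        stmts += ["SET %s = %s" % (k, v)
                  for k, v in sorted(state["gucs"].items())]
        return stmts
```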

--
greg

#6 Andrew Tipton
andrew@kiwidrew.com
In reply to: Greg Stark (#5)
Re: question about HTTP API

On 9 Aug 2013 17:03, "Greg Stark" <stark@mit.edu> wrote:

I looked at the wiki page and thought it had a lot of good ideas, but also
a lot of open questions. Do you have any idea how to tackle the session
problem?

[...]

A decent HTTP RPC layer will need some way of creating a session and
issuing multiple requests on that session. That session will need to be
stored and available for future requests. The obvious concern is state
like the current database, current role, GUCs, and prepared queries. But
even if you're prepared to discard those for a stateless interface, the
performance cost of not having a relcache built will be pretty severe.

The performance certainly will be poor to start with, yes. Sessions and
HTTP simply don't go together, so I think we need to accept that each
request is going to be stateless. (We could use WebSockets and pass the
socket to libpq... but that hardly counts as an HTTP API.)

For my patch, I plan to use pre-forked bgworkers which have already
connected to the backend, so that populating the relcache and other
process-startup costs don't impact the HTTP response time. (This still
means queries are being planned and function code is being compiled for
each request, of course...)

This is going to be a very long series of patches, but IMHO we have to
start somewhere! For some applications, performance is far less important
than ease-of-use and ease-of-deployment.

Regards,
Andrew Tipton

#7 Josh Berkus
josh@agliodbs.com
In reply to: Szymon Guz (#1)
Re: question about HTTP API

For my patch, I plan to use pre-forked bgworkers which have already
connected to the backend, so that populating the relcache and other
process-startup costs don't impact the HTTP response time. (This still
means queries are being planned and function code is being compiled for
each request, of course...)

This is going to be a very long series of patches, but IMHO we have to
start somewhere! For some applications, performance is far less important
than ease-of-use and ease-of-deployment.

Agreed. Too bad you can't do this as an extension; it would let you
rev releases a lot faster than once a year.

Actually, maybe you should look at "what is the minimum patch required
to enable a webserver extension", with the idea that most of the
webserver code would still live outside the core? That way you could
continue to develop it a lot faster.

Also, if all aspects of the web services model (management of sessions,
security, etc.) need to be a core PostgreSQL patch, you're in for a
really long set of arguments, since there's no one "best" way to do these
things. Keeping the web services engine outside the core would let you
avoid those arguments on this list, which otherwise would likely cause
the feature to miss 9.4.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#8 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#7)
Re: question about HTTP API

Josh Berkus <josh@agliodbs.com> writes:

Agreed. Too bad you can't do this as an extension; it would let you
rev releases a lot faster than once a year.

Actually, maybe you should look at "what is the minimum patch required
to enable a webserver extension", with the idea that most of the
webserver code would still live outside the core? That way you could
continue to develop it a lot faster.

+1. I think for reasons such as security, a lot of people would rather
*not* see any such thing in core anyway, independent of development
issues. It's also far from clear that there is only one desirable
behavior of this sort, so a design path that offers the possibility
of multiple webserver implementations as separate extensions seems
attractive.

regards, tom lane


#9 Peter Eisentraut
peter_e@gmx.net
In reply to: Josh Berkus (#2)
Re: question about HTTP API

On 8/8/13 3:44 PM, Josh Berkus wrote:

Other than that, no. I was thinking of creating a general tool as a
custom background worker, which would take stored procedure calls and
pass them through to PostgreSQL, returning results as JSON. Mainly
because I need it for a project. However, it wouldn't accept arbitrary
queries.

You can write such a thing in 20 lines of code as an external service.
What's the value in having it as a background worker? (Note also the
term *background* worker.) It just seems harder to manage and scale
that way.
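Peter's estimate is about right. Here is one sketch of such an external
service, using only Python's standard library; the actual database call is
stubbed out (a real version would use psycopg2 or libpq), and the /proc_name
URL scheme and all names are invented for illustration:

```python
# A minimal external HTTP-to-stored-procedure gateway, roughly the "20
# lines" Peter mentions. run_query is a stub standing in for a real driver
# call (e.g. psycopg2); everything else is the standard library.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def run_query(proc, args):  # stub: replace with a real database call
    return [{"proc": proc, "args": args}]

class RpcHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Treat the URL path as the procedure name and the JSON body as args.
        length = int(self.headers.get("Content-Length", 0))
        args = json.loads(self.rfile.read(length) or b"{}")
        rows = run_query(self.path.strip("/"), args)
        body = json.dumps(rows).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

# To serve for real: HTTPServer(("localhost", 8080), RpcHandler).serve_forever()
```

Note this deliberately punts on everything the thread worries about:
authentication, sessions, connection pooling, and error handling.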


#10 Szymon Guz
mabewlun@gmail.com
In reply to: Peter Eisentraut (#9)
Re: question about HTTP API

On 12 August 2013 18:37, Peter Eisentraut <peter_e@gmx.net> wrote:

On 8/8/13 3:44 PM, Josh Berkus wrote:

Other than that, no. I was thinking of creating a general tool as a
custom background worker, which would take stored procedure calls and
pass them through to PostgreSQL, returning results as JSON. Mainly
because I need it for a project. However, it wouldn't accept arbitrary
queries.

You can write such a thing in 20 lines of code as an external service.
What's the value in having it as a background worker? (Note also the
term *background* worker.) It just seems harder to manage and scale
that way.

When I think about it, the only value I see would be a nice SQL command
for starting the service. On the other hand, I could implement it in
Python/Perl/C and start an external server from the same SQL query. When
I started this thread I was thinking about writing some super-simple app,
in something like Python or Perl, and running it externally. I really
don't like the idea of having this in core, as it would be another thing
to support, test, etc., and another source of security/efficiency bugs.
What we really need is something like phpPgAdmin with a JSON/XML/whatever
API.

And one more thing: I would never let my DB users start such a service
on their own.

Szymon