================================================================================
 README for pgsql_fdw

 Copyright (c) 2011, NIPPON TELEGRAPH AND TELEPHONE CORPORATION
================================================================================

Overview of pgsql_fdw
---------------------

The pgsql_fdw is a foreign-data wrapper for external PostgreSQL.  The pgsql_fdw
module provides foreign-data wrapper handler function pgsql_fdw_handler which
can be used to access external PostgreSQL server via plain SQL, and
foreign-data wrapper validator function pgsql_fdw_validator which can be used
to validate generic options set to SQL/MED objects defined for pgsql_fdw.

The pgsql_fdw can be installed on only PostgreSQL 9.1, but it can connect to
older version of PostgreSQL.

How to Install
--------------

At first you need to install binary module of pgsql_fdw into your PostgreSQL
installation.  To build pgsql_fdw, you need source code of PostgreSQL 9.1, or
pgxs environment to build pgsql_fdw.  If you have appropriate source code, you
can build with steps below:

	1) expand pgsql_fdw source archive into contrib directory
	2) execute "make -C contrib/pgsql_fdw" to build
	3) execute "make -C contrib/pgsql_fdw install" to install

If you don't have appropriate source code, you have to use pgxs environment
with steps below:

	1) extract pgsql_fdw source code into some directory
	2) execute "USE_PGXS=1 make -C contrib/pgsql_fdw" to build
	3) execute "USE_PGXS=1 make -C contrib/pgsql_fdw install" to install

Now you can install pgsql_fdw into your database via CREATE EXTENSION command.
Note that you need superuser privilege to do this.  Creating extension also
creates default FDW, pgsql_fdw.

	$ psql
	postgres=# CREATE EXTENSION pgsql_fdw;
	CREATE EXTENSION
	postgres=# \dew
						List of foreign-data wrappers
	   Name    |  Owner   |      Handler      |     Validator
	-----------+----------+-------------------+---------------------
	 pgsql_fdw | postgres | pgsql_fdw_handler | pgsql_fdw_validator
	(1 row)

How to access external data
---------------------------

First of all, you need to create foreign server and user mapping with
connection information.

	postgres=# CREATE SERVER asset_db FOREIGN DATA WRAPPER pgsql_fdw
	postgres-# OPTIONS (host 'server', port '5432', dbname 'asset_db');
	CREATE SERVER
	postgres=# CREATE USER MAPPING FOR current_user SERVER asset_db
	postgres-# OPTIONS (user 'manager', password 'secret');

After that, you can create foreign table which defines data structure of remote
table.  If you have a remote table such as:

	postgres=# \d person
                              Table "public.app_user"
      Column  |  Type   |                      Modifiers
    ----------+---------+-------------------------------------------------------
     id       | integer | not null default nextval('app_user_id_seq'::regclass)
     name     | text    |
     birthday | date    |
    Indexes:
        "app_user_pkey" PRIMARY KEY, btree (id)

You can define a foreign table as below:

	postgres=# CREATE FOREIGN TABLE app_user (
	postgres(# id integer,
	postgres(# name text,
	postgres(# birthday date
	postgres(# ) SERVER asset_db;

Note that you need to use integer and bigint instead of serial and bigserial
respectively, because actual values are determined on remote side.  Once you
have created a foreign table, you can execute SELECT query against it.

	postgres=# SELECT * FROM app_user WHERE birthday < '2000-01-01' ORDER BY id;

Some of conditions which appear in WHERE clause are sent to remote side to
reduce amount of data transferred.  You can see the query actually sent to
remote side in result of EXPLAIN command, "Remote SQL" item below "ForeignScan"
node.  Please see the section "SQL sent to remote PostgreSQL" for details.

	postgres=# EXPLAIN SELECT id FROM app_user
    postgres-# WHERE birthday < '2000-01-01' ORDER BY id;
                                                                QUERY PLAN
    ----------------------------------------------------------------------------------------------------------------------------------
     Sort  (cost=59.83..62.33 rows=1000 width=4)
       Sort Key: id
       ->  Foreign Scan on app_user  (cost=10.00..10.00 rows=1000 width=4)
             Remote SQL: SELECT app_user.id, NULL, NULL FROM public.app_user app_user WHERE (app_user.birthday < '2000-01-01'::date)
    (4 rows)

SQL sent to remote PostgreSQL
-----------------------------

Foreign tables can be used in any SELECT statement, including CTE, subquery and
PREPARE statement.  Currently pgsql_fdw sends SELECT query for each foreign
table appears in the local query.  This means that every join is done in local
side.

Some kind of query can be optimized by pgsql_fdw.

	1) unused column reference would be replaced with "NULL".
	2) some part of WHERE/JOIN clause, which consists of limited elements, can
	   be sent and evaluated on remote side.  Expressions can be sent are:
			* constant value
			* array, if every elements suit these rules recursively
			* stable functions, if every arguments suit these rules recursively
			* operators implemented with stable functions
			* Boolean expressions such as AND, OR and NOT
			* x IS NULL and x IS NOT NULL
			* x IS DISTINCT FROM y
			* scalar op ANY/ALL (array), if op is stable
			* argument of EXECUTE statement
			* column reference

These optimization would decrease amount of data transferred from remote
server.  You can see the query which is going to be sent to remote side via
EXPLAIN command.

If the planner estimates that result is more than 1000 rows, pgsql_fdw uses
cursor statements such as DECLARE and FETCH instead of simple SELECT statement,
to reduce memory usage to a certain level.

FDW options
-----------

The pgsql_fdw accepts FDW generic options, and they can be classified to some
groups.

* Connection options
The pgsql_fdw retrieves connection information from FDW options of foreign
server and user mapping.  The pgsql_fdw accepts subset of libpq connection
options:

	* For server option
	service, connect_timeout, dbname, host, hostaddr, port, options,
	application_name, requiressl, sslmode, krbsrvname, gsslib

	* For user mapping option
	user, password

Note: If you omit connection information, pgsql_fdw takes the alternative from
environment variables of the user who launched the postgres server.  Usually
such omission would cause unexpected result, so it's strongly recommended to
specify connection information explicitly as much as you can.

* Object name options
You can specify schema name and relation name of a remote table with foreign
table's generic option nspname and relname respectively.  These options allow
you to define local foreign tables with different name from remote side.

* Cursor options
The pgsql_fdw toggles fetch-method and simple SELECT statement and SQL-level
cursor (DECLARE, FETCH and CLOSE statement) according to the estimated size of
result.  If the estimated number of rows is less than min_cursor_rows or
min_cursor_rows is 0, all result is fetched at once with SELECT.  Otherwise,
pgsql_fdw uses cursor and fetches fetch_count rows at once.  Default value of
those options is 1000.  These cursor options can be set both foreign server and
foreign table, and latter overrides former.

Connection management
---------------------

Connection to a foreign server is established in the beginning of a query
execution, and discarded at the end of the query.  The pgsql_fdw shares a
connection if multiple foreign tables are used in a query and they belong to
same foreign table.

Transaction management
----------------------

All remote query for a foreign server is executed in a transaction block.
That transaction's isolation level is READ COMMITTED, default of PostgreSQL.

Copyright
---------

Copyright (c) 2011, NIPPON TELEGRAPH AND TELEPHONE CORPORATION
