Is it possible to create an index without keeping the indexed data in a column?

Started by Larry Whiteover 11 years ago7 messagesgeneral
Jump to latest
#1Larry White
ljw1001@gmail.com

Hi,

I would like to create a GIN index on a set of JSON documents. Right now
I'm storing the data in a JSONB column. The current index looks like this:

CREATE INDEX document_payload_idx
ON document
USING gin
(payload jsonb_path_ops);

The index is pretty small, but the actual data takes up a *lot* of space.
Is there a way to get Postgres to index the table *as if* the JSON were
there, but not actually put the data in the table? I could either store the
docs elsewhere and keep a reference, or compress them and put them in the
table in compressed form as a blob.

Thanks much for your help.

Larry

#2Michael Paquier
michael@paquier.xyz
In reply to: Larry White (#1)
Re: Is it possible to create an index without keeping the indexed data in a column?

On Fri, Aug 1, 2014 at 4:47 AM, Larry White <ljw1001@gmail.com> wrote:

Is there a way to get Postgres to index the table as if the JSON were there,
but not actually put the data in the table?
I could either store the docs
elsewhere and keep a reference, or compress them and put them in the table
in compressed form as a blob.

No. This is equivalent to the creation of an index on a foreign table.
Regards,
--
Michael

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Michael Paquier (#2)
Re: Is it possible to create an index without keeping the indexed data in a column?

On Fri, Aug 1, 2014 at 10:48 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:

On Fri, Aug 1, 2014 at 4:47 AM, Larry White <ljw1001@gmail.com> wrote:

Is there a way to get Postgres to index the table as if the JSON were there,
but not actually put the data in the table?
I could either store the docs
elsewhere and keep a reference, or compress them and put them in the table
in compressed form as a blob.

No. This is equivalent to the creation of an index on a foreign table.

Not sure exactly if it applies here; but I seem to recall reading
somewhere that you can index "generated" columns. Something like
following (this example is similar what I recall seeing there)

postgres=# CREATE TABLE test(a, b) AS SELECT md5(g::text)::char(10),
md5(g::text)::char(5) FROM generate_series(1, 100000) g;
SELECT 100000

postgres=# CREATE OR REPLACE FUNCTION ab(rec test) RETURNS text AS $$
SELECT rec.a || rec.b; $$ STABLE LANGUAGE SQL;
CREATE FUNCTION

postgres=# CREATE EXTENSION pg_trgm;
CREATE EXTENSION

postgres=# CREATE INDEX test_idx ON test USING GIN (ab(test) gin_trgm_ops);
CREATE INDEX

postgres=# EXPLAIN SELECT * FROM test WHERE ab(test) LIKE '%c4c%';
QUERY PLAN
-------------------------------------------------------------------------
Bitmap Heap Scan on test (cost=16.09..52.53 rows=10 width=17)
Recheck Cond: (((a)::text || (b)::text) ~~ '%c4c%'::text)
-> Bitmap Index Scan on test_idx (cost=0.00..16.08 rows=10 width=0)
Index Cond: (((a)::text || (b)::text) ~~ '%c4c%'::text)
Planning time: 0.361 ms
(5 rows)

--
Amit

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#4Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#3)
Re: Is it possible to create an index without keeping the indexed data in a column?

On Fri, Aug 1, 2014 at 2:50 PM, Amit Langote <amitlangote09@gmail.com> wrote:

Not sure exactly if it applies here;

Re-reading the OP again, perhaps it doesn't. Sorry about the noise

--
Amit

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5David G. Johnston
david.g.johnston@gmail.com
In reply to: Amit Langote (#4)
Re: Is it possible to create an index without keeping the indexed data in a column?

Amit Langote wrote

On Fri, Aug 1, 2014 at 2:50 PM, Amit Langote &lt;

amitlangote09@

&gt; wrote:

Not sure exactly if it applies here;

Re-reading the OP again, perhaps it doesn't. Sorry about the noise

This is a functional index which lets you store derived data in the index
without having to also store it in the table. Mostly useful for stuff that
is only relevant in the context of searching and not something you would
ever return to the user.

The restriction here is that the raw data still needs to be stored in the
table.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Is-it-possible-to-create-an-index-without-keeping-the-indexed-data-in-a-column-tp5813461p5813499.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#6David G. Johnston
david.g.johnston@gmail.com
In reply to: Larry White (#1)
Re: Is it possible to create an index without keeping the indexed data in a column?

larrry wrote

Hi,

I would like to create a GIN index on a set of JSON documents. Right now
I'm storing the data in a JSONB column. The current index looks like this:

CREATE INDEX document_payload_idx
ON document
USING gin
(payload jsonb_path_ops);

The index is pretty small, but the actual data takes up a *lot* of space.
Is there a way to get Postgres to index the table *as if* the JSON were
there, but not actually put the data in the table? I could either store
the
docs elsewhere and keep a reference, or compress them and put them in the
table in compressed form as a blob.

Thanks much for your help.

Larry

No idea if this works but maybe you can store the compressed data and then
write the index expression like:

USING gin (unzip(payload) jsonb_path_ops)

The unzip function would need to be custom I think...

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Is-it-possible-to-create-an-index-without-keeping-the-indexed-data-in-a-column-tp5813461p5813500.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#7Larry White
ljw1001@gmail.com
In reply to: David G. Johnston (#6)
Re: Re: Is it possible to create an index without keeping the indexed data in a column?

Thank you David and Amit. This is more or less what I was looking for.

I _think_ I might be able to store the data as TEXT, which is highly
compressed by Toast, and then perhaps write the function in terms of a TEXT
to JSONB conversion. I will give it a try. It might perform terribly, but
will be an interesting experiment.:)

On Fri, Aug 1, 2014 at 3:14 AM, David G Johnston <david.g.johnston@gmail.com

Show quoted text

wrote:

larrry wrote

Hi,

I would like to create a GIN index on a set of JSON documents. Right now
I'm storing the data in a JSONB column. The current index looks like

this:

CREATE INDEX document_payload_idx
ON document
USING gin
(payload jsonb_path_ops);

The index is pretty small, but the actual data takes up a *lot* of space.
Is there a way to get Postgres to index the table *as if* the JSON were
there, but not actually put the data in the table? I could either store
the
docs elsewhere and keep a reference, or compress them and put them in the
table in compressed form as a blob.

Thanks much for your help.

Larry

No idea if this works but maybe you can store the compressed data and then
write the index expression like:

USING gin (unzip(payload) jsonb_path_ops)

The unzip function would need to be custom I think...

David J.

--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Is-it-possible-to-create-an-index-without-keeping-the-indexed-data-in-a-column-tp5813461p5813500.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general