pgvector extension error

Started by Zacher, Stacyabout 1 year ago10 messagesgeneral

Jump to latest

Zacher, Stacy

szacher@mcw.edu

about 1 year ago

Hello:

We are having issues getting the pgvector extension to work in a database.

* Installed the code
* Created the extension in a database - it says it already exists but it's not there:

dbdev=# create extension if not exists vector;
ERROR: type "vector" already exists
dbdev=# drop extension pgvector;
ERROR: extension "pgvector" does not exist

Works in a different database on the same server:

dbdev=# \c dbrag;
You are now connected to database "dbrag" as user "postgres".
It works in this database below.
dbrag=# select * from pg_extension;
oid | extname | extowner | extnamespace | extrelocatable | extversion | extconfig | extcondition
--------+---------+----------+--------------+----------------+------------+-----------+--------------
13422 | plpgsql | 10 | 11 | f | 1.0 | |
157739 | vector | 10 | 2200 | t | 0.7.4 | |
(2 rows)

Question: how can I fix the error in the dbdev database?

We already did a reinstall of the extension code and that didn't resolve it.

Thank you,

Stacy Zacher
Enterprise Database Administrator III | Information Services
Medical College of Wisconsin
szacher@mcw.edu<mailto:szacher@mcw.edu>
414-955-5654

Tom Lane

tgl@sss.pgh.pa.us

about 1 year ago

In reply to: Zacher, Stacy (#1)

Re: pgvector extension error

"Zacher, Stacy" <szacher@mcw.edu> writes:

We are having issues getting the pgvector extension to work in a database.

This:

dbdev=# create extension if not exists vector;
ERROR: type "vector" already exists

indicates you've got some type named "vector" in that database
already. It didn't come from pgvector, so where did it come from?

regards, tom lane

Brent Wood

Brent.Wood@niwa.co.nz

about 1 year ago

In reply to: Tom Lane (#2)

concatenating hstores in a group by?

Hi,

I have a table with a timestamp and hstore columns with times in milliseconds.

I want to concatenate the hstores, grouped by timestamp, with the timestamps truncated to whole seconds: date_trunc('seconds', timer).

How can I concatenate all the hstores within a one second interval,

ie, something like:

select timer,
<hstores concatenated>
from (select date_trunc('seconds', timer) as timer_sec, hstore)
group by timer;

Thanks

Brent Wood

Principal Technician, Fisheries
NIWA
DDI: +64 (4) 3860529

[https://www.niwa.co.nz/static/niwa-2018-horizontal-180.png] <https://www.niwa.co.nz>
Brent Wood
Principal Technician - GIS and Spatial Data Management
Programme Leader - Environmental Information Delivery
+64-4-386-0529

National Institute of Water & Atmospheric Research Ltd (NIWA)
301 Evans Bay Parade Hataitai Wellington New Zealand
Connect with NIWA: niwa.co.nz<https://www.niwa.co.nz> Facebook<https://www.facebook.com/nzniwa> LinkedIn<https://www.linkedin.com/company/niwa> Twitter<https://twitter.com/niwa_nz> Instagram<https://www.instagram.com/niwa_science> YouTube<https://www.youtube.com/channel/UCJ-j3MLMg1H59Ak2UaNLL3A>
To ensure compliance with legal requirements and to maintain cyber security standards, NIWA's IT systems are subject to ongoing monitoring, activity logging and auditing. This monitoring and auditing service may be provided by third parties. Such third parties can access information transmitted to, processed by and stored on NIWA's IT systems.
Note: This email is intended solely for the use of the addressee and may contain information that is confidential or subject to legal professional privilege. If you receive this email in error please immediately notify the sender and delete the email.

David G. Johnston

david.g.johnston@gmail.com

about 1 year ago

In reply to: Brent Wood (#3)

concatenating hstores in a group by?

On Friday, January 17, 2025, Brent Wood <Brent.Wood@niwa.co.nz> wrote:

I want to concatenate the hstores,

There are no hstore aggregate functions. You’ll want to convert them
to,json first then use the json_objectagg aggregate function.

David J.

Adrian Klaver

adrian.klaver@aklaver.com

about 1 year ago

In reply to: Brent Wood (#3)

Re: concatenating hstores in a group by?

On 1/17/25 12:43, Brent Wood wrote:

Hi,

I have a table with a timestamp and hstore columns with times in
milliseconds.

I want to concatenate the hstores, grouped by timestamp, with the
timestamps truncated to whole seconds: date_trunc('seconds', timer).

How can I concatenate all the hstores within a one second interval,

ie, something like:

select timer,
<hstores concatenated>
from (select date_trunc('seconds', timer) as timer_sec, hstore)
group by timer;

1) This is going to need some example data and the table definition.

2) The above query does not make sense.

3) Hstore does not store duplicate key values:

From here:

https://www.postgresql.org/docs/current/hstore.html#HSTORE-OPS-FUNCS

select 'a=>b, c=>d'::hstore || 'c=>x, d=>q'::hstore ;
?column?
------------------------------
"a"=>"b", "c"=>"x", "d"=>"q"

Are there duplicated keys and if so how do you want to handle them?

Thanks

Brent Wood

Principal Technician, Fisheries
NIWA
DDI: +64 (4) 3860529

--
Adrian Klaver
adrian.klaver@aklaver.com

Tom Lane

tgl@sss.pgh.pa.us

about 1 year ago

In reply to: David G. Johnston (#4)

Re: concatenating hstores in a group by?

"David G. Johnston" <david.g.johnston@gmail.com> writes:

On Friday, January 17, 2025, Brent Wood <Brent.Wood@niwa.co.nz> wrote:

I want to concatenate the hstores,

There are no hstore aggregate functions.

It would be trivial to make a user-defined one.
More or less (untested)

create aggregate hstore_agg(hstore)
(
sfunc = hs_concat,
stype = hstore
);

Either way though, the order of aggregation is unspecified and thus
it's unclear which input will win when there are duplicate keys.
If that matters then you'll need to think about how you want
to resolve it.

regards, tom lane

Brent Wood

Brent.Wood@niwa.co.nz

about 1 year ago

In reply to: David G. Johnston (#4)

Re: concatenating hstores in a group by?

Thanks for the replies, appreciated...

My current solution is:

select trip_code,
station_no,
timer_sec + interval '12 hour' as NZST,
timer_sec as utc,
hstore_to_json(string_agg(values_sec::text, ', ')::hstore) as values_sec
from (select '$TRIP' as trip_code,
$STATION as station_no,
date_trunc('second', timer) as timer_sec,
values_sec
from t_reading_hstore_sec
where timer >= '$ISO_S'::timestamp - interval '12 hour'
and timer <= '$ISO_F'::timestamp - interval '12 hour') as foo
group by timer_sec, trip_code, station_no;

Convert the hstore to text, aggregate the text with string_agg(), convert back to hstore (which seems to remove duplicate keys, OK for my purpose)
and group by timer truncated to whole seconds. I also provide UTC & local timezone times for each set of readings. It is run in a bash script which passes the trip & station values to the query, as well as the start/finish times as ISO format strings.

The output is going to a Sqlite3 (Spatialite) database, which does not have hstore, or all the hstore functionality that Postgres has, but does have a json datatype which is adequate for our purposes, hence the hstore_to_json in the query.

Thanks again,

Brent Wood

Principal Technician, Fisheries
NIWA
DDI: +64 (4) 3860529
________________________________
From: David G. Johnston <david.g.johnston@gmail.com>
Sent: Monday, January 20, 2025 04:28
To: Brent Wood <Brent.Wood@niwa.co.nz>
Cc: pgsql-general@lists.postgresql.org <pgsql-general@lists.postgresql.org>
Subject: concatenating hstores in a group by?

On Friday, January 17, 2025, Brent Wood <Brent.Wood@niwa.co.nz<mailto:Brent.Wood@niwa.co.nz>> wrote:

I want to concatenate the hstores,

There are no hstore aggregate functions. You’ll want to convert them to,json first then use the json_objectagg aggregate function.

David J.

Adrian Klaver

adrian.klaver@aklaver.com

about 1 year ago

In reply to: Brent Wood (#7)

Re: concatenating hstores in a group by?

On 1/19/25 12:09, Brent Wood wrote:

Thanks for the replies, appreciated...

My current solution is:

/select trip_code,/
/ station_no,/
/ timer_sec + interval '12 hour' as NZST,/
/ timer_sec as utc,/
/ hstore_to_json(string_agg(values_sec::text, ', ')::hstore)
as values_sec/
/ from (select '$TRIP' as trip_code,/
/ $STATION as station_no,/
/ date_trunc('second', timer) as timer_sec,/
/ values_sec/
/ from t_reading_hstore_sec/
/ where timer >= '$ISO_S'::timestamp - interval '12 hour'/
/ and timer <= '$ISO_F'::timestamp - interval '12 hour') as foo/
/group by timer_sec, trip_code, station_no;/

Convert the hstore to text, aggregate the text with string_agg(),
convert back to hstore (which seems to remove duplicate keys, OK for my
purpose)

To be clear values_sec in t_reading_hstore_sec is the hstore field?

If so what is it's structure?

and group by timer truncated to whole seconds. I also provide UTC &
local timezone times for each set of readings. It is run in a bash
script which passes the trip & station values to the query, as well as
the start/finish times as ISO format strings.

The output is going to a Sqlite3 (Spatialite) database, which does not
have hstore, or all the hstore functionality that Postgres has, but does
have a json datatype which is adequate for our purposes, hence the
hstore_to_json in the query.

Thanks again,

Brent Wood

--
Adrian Klaver
adrian.klaver@aklaver.com

Brent Wood

pcreso@yahoo.com

about 1 year ago

In reply to: Adrian Klaver (#8)

Re: concatenating hstores in a group by?

Yes, a Timescale hypertable with 500,000,000 rows of about 15 key/value pairs per record.
I'm not sure why there is both a gin & gist index on the hstore, or the merits of each.

Thanks....

\d t_reading_hstore_sec
                                         Table "public.t_reading_hstore_sec"
   Column   |            Type             | Collation | Nullable |                      Default
------------+-----------------------------+-----------+----------+---------------------------------------------------
key        | bigint                      |           | not null | nextval('t_reading_hstore_sec_key_seq'::regclass)
timer      | timestamp without time zone |           | not null |
values_sec | hstore                      |           |          |
Indexes:
    "t_reading_hstore_sec_pkey" PRIMARY KEY, btree (key, timer)
    "t_reading_hstore_sec_timer_idx" btree (timer)
    "t_reading_hstore_sec_timer_key" UNIQUE CONSTRAINT, btree (timer)
    "t_reading_hstore_sec_values_idx_gin" gin (values_sec)
    "t_reading_hstore_sec_values_idx_gist" gist (values_sec)

On Wednesday, January 22, 2025 at 06:34:38 AM GMT+13, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 1/19/25 12:09, Brent Wood wrote:

Thanks for the replies, appreciated...

My current solution is:

/select trip_code,/
/ station_no,/
/ timer_sec + interval '12 hour' as NZST,/
/ timer_sec as utc,/
/ hstore_to_json(string_agg(values_sec::text, ', ')::hstore)
as values_sec/
/ from (select '$TRIP' as trip_code,/
/ $STATION as station_no,/
/ date_trunc('second', timer) as timer_sec,/
/ values_sec/
/ from t_reading_hstore_sec/
/ where timer >= '$ISO_S'::timestamp - interval '12 hour'/
/ and timer <= '$ISO_F'::timestamp - interval '12 hour') as foo/
/group by timer_sec, trip_code, station_no;/

Convert the hstore to text, aggregate the text with string_agg(),
convert back to hstore (which seems to remove duplicate keys, OK for my
purpose)

To be clear values_sec in t_reading_hstore_sec is the hstore field?

If so what is it's structure?

and group by timer truncated to whole seconds. I also provide UTC &
local timezone times for each set of readings. It is run in a bash
script which passes the trip & station values to the query, as well as
the start/finish times as ISO format strings.

The output is going to a Sqlite3 (Spatialite) database, which does not
have hstore, or all the hstore functionality that Postgres has, but does
have a json datatype which is adequate for our purposes, hence the
hstore_to_json in the query.

Thanks again,

Brent Wood

--
Adrian Klaver
adrian.klaver@aklaver.com

#10

Adrian Klaver

adrian.klaver@aklaver.com

about 1 year ago

In reply to: Brent Wood (#9)

Re: concatenating hstores in a group by?

On 1/21/25 10:02, Brent Wood wrote:

Yes, a Timescale hypertable with 500,000,000 rows of about 15 key/value
pairs per record.

I'm not sure why there is both a gin & gist index on the hstore, or the
merits of each.

Questions:

1) What is the timezone setting for the database?

2) What is the system time zone where the BASH script is run?

3) Are you sure you don't need the duplicate keys?
The keys maybe duplicated, but are the values duplicated also?

Also could this be simplified to something like?:

create table hstore_tbl (id integer, tsz_fld timestamptz, hstore_fld
hstore);

insert into hstore_tbl values (1, now(), 'a=>1, b=>2'), (2, now() +
interval '0.5 sec', 'a=>3, b=>4'), (3, now() + interval '1 sec', 'a=>1,
b=>5'), (4, now()
+ interval '1.5 sec', 'a=>6, b=>7');

select id, date_trunc('second', tsz_fld) AS t_sec, h.key, h.value from
hstore_tbl, each(hstore_fld) as h ;
id | t_sec | key | value
----+------------------------+-----+-------
1 | 2025-01-21 14:00:27-08 | a | 1
1 | 2025-01-21 14:00:27-08 | b | 2
2 | 2025-01-21 14:00:28-08 | a | 3
2 | 2025-01-21 14:00:28-08 | b | 4
3 | 2025-01-21 14:00:28-08 | a | 1
3 | 2025-01-21 14:00:28-08 | b | 5
4 | 2025-01-21 14:00:29-08 | a | 6
4 | 2025-01-21 14:00:29-08 | b | 7

It would unnest the data.

Thanks....

\d t_reading_hstore_sec
                                         Table
"public.t_reading_hstore_sec"
   Column   |            Type             | Collation | Nullable
|                      Default
------------+-----------------------------+-----------+----------+---------------------------------------------------
key        | bigint                      |           | not null |
nextval('t_reading_hstore_sec_key_seq'::regclass)
timer      | timestamp without time zone |           | not null |
values_sec | hstore                      |           |          |
Indexes:
    "t_reading_hstore_sec_pkey" PRIMARY KEY, btree (key, timer)
    "t_reading_hstore_sec_timer_idx" btree (timer)
    "t_reading_hstore_sec_timer_key" UNIQUE CONSTRAINT, btree (timer)
    "t_reading_hstore_sec_values_idx_gin" gin (values_sec)
    "t_reading_hstore_sec_values_idx_gist" gist (values_sec)

On Wednesday, January 22, 2025 at 06:34:38 AM GMT+13, Adrian Klaver
<adrian.klaver@aklaver.com> wrote:

On 1/19/25 12:09, Brent Wood wrote:

Thanks for the replies, appreciated...

My current solution is:

/select trip_code,/
/ station_no,/
/ timer_sec + interval '12 hour' as NZST,/
/ timer_sec as utc,/
/ hstore_to_json(string_agg(values_sec::text, ', ')::hstore)
as values_sec/
/ from (select '$TRIP' as trip_code,/
/ $STATION as station_no,/
/ date_trunc('second', timer) as timer_sec,/
/ values_sec/
/ from t_reading_hstore_sec/
/ where timer >= '$ISO_S'::timestamp - interval '12 hour'/
/ and timer <= '$ISO_F'::timestamp - interval '12 hour')

as foo/

/group by timer_sec, trip_code, station_no;/

Convert the hstore to text, aggregate the text with string_agg(),
convert back to hstore (which seems to remove duplicate keys, OK for my
purpose)

To be clear values_sec in t_reading_hstore_sec is the hstore field?

If so what is it's structure?

and group by timer truncated to whole seconds. I also provide UTC &
local timezone times for each set of readings. It is run in a bash
script which passes the trip & station values to the query, as well as
the start/finish times as ISO format strings.

The output is going to a Sqlite3 (Spatialite) database, which does not
have hstore, or all the hstore functionality that Postgres has, but does
have a json datatype which is adequate for our purposes, hence the
hstore_to_json in the query.

Thanks again,

Brent Wood

--
Adrian Klaver
adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>

--
Adrian Klaver
adrian.klaver@aklaver.com