External data files possible?

Started by Chris Cleveland, almost 4 years ago, 3 messages
#1 Chris Cleveland
ccleve+github@dieselpoint.com

It's turning out to be difficult to store the data for my custom index
access method in the main fork. Breaking up the data into pages with page
headers means a lot of extra work, a big performance hit, and disk space
management headaches. It's just not a good fit for my particular file
format.

It would be much better to store the index in a set of external data files.
This seems possible so long as I put the files under the database's
directory and name things properly.

But here's the one thing I haven't figured out: how to delete the files
when the index, table, or database gets dropped. The IndexAmRoutine does
not have an "amdrop" hook that gets called when the index gets dropped.

Is there a hook I can use to clean these files up? More generally, can I
get away with using my own data files without causing a problem?

#2 Andres Freund
andres@anarazel.de
In reply to: Chris Cleveland (#1)
Re: External data files possible?

Hi,

On 2022-02-21 15:16:31 -0600, Chris Cleveland wrote:

> It's turning out to be difficult to store the data for my custom index
> access method in the main fork. Breaking up the data into pages with page
> headers means a lot of extra work, a big performance hit, and disk space
> management headaches. It's just not a good fit for my particular file
> format.

I assume you're planning to not go through shared buffers, right?

> It would be much better to store the index in a set of external data files.
> This seems possible so long as I put the files under the database's
> directory and name things properly.
>
> But here's the one thing I haven't figured out: how to delete the files
> when the index, table, or database gets dropped. The IndexAmRoutine does
> not have an "amdrop" hook that gets called when the index gets dropped.

For some things it'd probably work to just use the normal files, but format
them differently. I.e. go through the smgr.c layer, but not bufmgr.

But unfortunately e.g. basebackup.c will assume they're the normal format and
complain about checksums etc. I don't think there's a way around that right
now.

> Is there a hook I can use to clean these files up? More generally, can I
> get away with using my own data files without causing a problem?

Not currently. A plain hook wouldn't suffice, because it'd not integrate with
transactional DDL and crash recovery.

Greetings,

Andres Freund
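
To illustrate the point about transactional DDL, here is a minimal standalone C sketch of deferring file removal until the transaction outcome is known. It is not PostgreSQL code and does not use any PostgreSQL API: the names ext_file_created, ext_file_dropped, and ext_xact_end are hypothetical, and the list lives only in memory, so it does nothing for crash recovery, which is the other half of the problem raised above.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* One deferred file action. All names here are hypothetical. */
typedef struct PendingDelete
{
    char        path[256];
    bool        at_commit;  /* true: remove on commit (file was dropped);
                             * false: remove on abort (file was created) */
    struct PendingDelete *next;
} PendingDelete;

static PendingDelete *pending = NULL;

/* Remember a file whose fate depends on how the transaction ends. */
static void
schedule(const char *path, bool at_commit)
{
    PendingDelete *p = malloc(sizeof(*p));

    if (p == NULL)
        abort();
    snprintf(p->path, sizeof(p->path), "%s", path);
    p->at_commit = at_commit;
    p->next = pending;
    pending = p;
}

/* A file created in this transaction must disappear if it aborts. */
void
ext_file_created(const char *path)
{
    schedule(path, false);
}

/* A file dropped in this transaction may only disappear if it commits. */
void
ext_file_dropped(const char *path)
{
    schedule(path, true);
}

/*
 * Process the list at transaction end and return the number of files
 * actually removed. Entries that do not match the outcome are discarded
 * without touching the file.
 */
int
ext_xact_end(bool committed)
{
    int         removed = 0;

    while (pending != NULL)
    {
        PendingDelete *p = pending;

        pending = p->next;
        if (p->at_commit == committed && remove(p->path) == 0)
            removed++;
        free(p);
    }
    return removed;
}
```

With this shape, a DROP that is later rolled back leaves the file in place, and a CREATE that is rolled back cleans up after itself. A plain "on drop" hook that unlinked the file immediately could give neither guarantee.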

#3 Aleksander Alekseev
aleksander@timescale.com
In reply to: Chris Cleveland (#1)
Re: External data files possible?

Hi Chris,

> Breaking up the data into pages with page headers means a lot of extra
> work [...]. It would be much better to store the index in a set of
> external data files. This seems possible so long as I put the files under
> the database's directory and name things properly.

Just curious, what is your index for, and how are you going to handle
crash recovery?

--
Best regards,
Aleksander Alekseev