8K block limit

Started by Ken Mort · 12 messages
#1 Ken Mort
kenmort@mort.port.net

I already asked this on the other lists, so now here.

I need to store some polygons that are larger than 8K.
I was reading talk in the hackers archives about a solution
to the 8K limit. Was anything done? If so, what do I
need to do to solve my problem?

Regards,
Kenneth R. Mort <kenmort@mort.port.net>
TreeTop Research
Brooklyn, NY, USA

#2 Peter T Mount
peter@retep.org.uk
In reply to: Ken Mort (#1)
Re: [HACKERS] 8K block limit

On Mon, 15 Feb 1999, Ken Mort wrote:

> I already asked this on the other lists, so now here.
>
> I need to store some polygons that are larger than 8K.
> I was reading talk in the hackers archives about a solution
> to the 8K limit. Was anything done? If so, what do I
> need to do to solve my problem?

There is an option that can be set at compile time to change the block size
from 8k to something like 32 or 64K (not sure which).

Note: changing the block size may incur a performance hit, however.

Another way is to break the polygons down into smaller pieces.
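[For reference, a sketch of the compile-time change. It assumes the block
size is the BLCKSZ define in include/config.h; the exact file and default
may differ by release, so check your source tree. The on-disk page layout
changes, so existing databases must be re-created with initdb.]

```shell
# Hypothetical sketch: bump the block size from 8k to 32k and rebuild.
# BLCKSZ and its location are assumptions; verify against your tree.
sed -i 's/^#define BLCKSZ.*/#define BLCKSZ 32768/' src/include/config.h
make clean && make && make install
# Page layout on disk changes, so a fresh initdb is required.
initdb
```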

Peter

--
Peter T Mount peter@retep.org.uk
Main Homepage: http://www.retep.org.uk
PostgreSQL JDBC Faq: http://www.retep.org.uk/postgres
Java PDF Generator: http://www.retep.org.uk/pdf

#3 Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Peter T Mount (#2)
Re: [HACKERS] 8K block limit

> On Mon, 15 Feb 1999, Ken Mort wrote:
>
> > I already asked this on the other lists, so now here.
> >
> > I need to store some polygons that are larger than 8K.
> > I was reading talk in the hackers archives about a solution
> > to the 8K limit. Was anything done? If so, what do I
> > need to do to solve my problem?
>
> There is an option that can be set at compile time to change the block size
> from 8k to something like 32 or 64K (not sure which).

I think it is 32k. (tuple offset in a block is limited to 15 bits)

> Note: changing the block size may incur a performance hit, however.

Why?
---
Tatsuo Ishii

#4 Peter T Mount
peter@taer.maidstone.gov.uk
In reply to: Tatsuo Ishii (#3)
Re: [HACKERS] 8K block limit

On Wed, 17 Feb 1999, Tatsuo Ishii wrote:

> > On Mon, 15 Feb 1999, Ken Mort wrote:
> >
> > > I already asked this on the other lists, so now here.
> > >
> > > I need to store some polygons that are larger than 8K.
> > > I was reading talk in the hackers archives about a solution
> > > to the 8K limit. Was anything done? If so, what do I
> > > need to do to solve my problem?
> >
> > There is an option that can be set at compile time to change the block
> > size from 8k to something like 32 or 64K (not sure which).
>
> I think it is 32k. (tuple offset in a block is limited to 15 bits)
>
> > Note: changing the block size may incur a performance hit, however.
>
> Why?

I think some file systems are more optimised for 8K blocks. I may be
thinking of the original reason for the 8k limit in the first place, but I
remember there were discussions about this when the block size was altered.

Peter

--
Peter Mount, IT Section
petermount@it.maidstone.gov.uk
Anything I write here is my own view, and cannot be taken as the
official words of Maidstone Borough Council

#5 Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Peter T Mount (#4)
Re: [HACKERS] 8K block limit

> I think some file systems are more optimised for 8K blocks. I may be
> thinking of the original reason for the 8k limit in the first place, but I
> remember there were discussions about this when the block size was altered.

Yes, most UFS file systems use 8k blocks/2k fragments. That allows a
block to be written in one I/O operation.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#6 Stupor Genius
stuporg@erols.com
In reply to: Bruce Momjian (#5)
RE: [HACKERS] 8K block limit

> > I think some file systems are more optimised for 8K blocks. I may be
> > thinking of the original reason for the 8k limit in the first place,
> > but I remember there were discussions about this when the block size
> > was altered.
>
> Yes, most UFS file systems use 8k blocks/2k fragments. That allows a
> block to be written in one I/O operation.

The max is 32k because of the aforementioned 15 bits available, but I'd
be a bit cautious about trying it. When I put this in, the highest I could
get to work on AIX was 16k. Pushing it up to 32k caused major breakage
in the system internals. I had to reboot the machine and fsck the file
system. Some files were linked incorrectly, other files disappeared, etc.
A real mess.

Not sure exactly what it corrupted, but I'd try the 32k limit on a
non-production system first...

Darren

#7 Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Stupor Genius (#6)
Re: [HACKERS] 8K block limit

> > > I think some file systems are more optimised for 8K blocks. I may be
> > > thinking of the original reason for the 8k limit in the first place,
> > > but I remember there were discussions about this when the block size
> > > was altered.
> >
> > Yes, most UFS file systems use 8k blocks/2k fragments. That allows a
> > block to be written in one I/O operation.

But modern Unixes do read-ahead/write-behind I/O when access looks
sequential, don't they? I did some testing on my LinuxPPC box.

0. create table t2(i int,c char(4000));
1. time psql -c "copy t2 from '/tmp/aaa'" test
(aaa has 5120 records and this will create a 20MB table)
2. time psql -c "select count(*) from t2" test
3. total time of the regression test

o result of testing 1

8K: 0.02user 0.04system 3:26.20elapsed
32K: 0.03user 0.06system 0:48.25elapsed

32K is 4 times faster than 8k!

o result of testing 2

8K: 0.02user 0.04system 6:00.31elapsed
32K: 0.04user 0.02system 1:02.13elapsed

32K is nearly 6 times faster than 8k!

o result of testing 3

8K: 11.46user 9.51system 6:08.24
32K: 11.34user 9.54system 7:35.35

32K is a little bit slower than 8K?

My thought:

In my test case the tuple size is relatively large, so with ordinary-sized
tuples we might get different results. And of course different OSes may
behave differently...

Another point is the access method. I only tested seq scans; I don't
know about index scans.

Additional testing is welcome...
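[For anyone repeating this, the numbered steps above amount to roughly the
following; the database name "test" and the 5120-line file /tmp/aaa come
from the message itself and won't exist on your system.]

```shell
# Sketch of the timing test above, against a database named "test".
psql -c "create table t2(i int, c char(4000))" test
time psql -c "copy t2 from '/tmp/aaa'" test
time psql -c "select count(*) from t2" test
```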

> The max is 32k because of the aforementioned 15 bits available, but I'd
> be a bit cautious about trying it. When I put this in, the highest I
> could get to work on AIX was 16k. Pushing it up to 32k caused major
> breakage in the system internals. I had to reboot the machine and fsck
> the file system. Some files were linked incorrectly, other files
> disappeared, etc. A real mess.
>
> Not sure exactly what it corrupted, but I'd try the 32k limit on a
> non-production system first...

I did the above on 6.4.2. What version are you using? Or maybe it's a
platform-dependent problem?

BTW, the biggest problem is that there are some hard-coded query length
limits somewhere (for example MAX_MESSAGE_LEN in libpq-int.h). Until
these get fixed, the 32K option is only useful for (possible) performance
boosting.
---
Tatsuo Ishii

#8 Vadim Mikheev
vadim@krs.ru
In reply to: Tatsuo Ishii (#7)
Re: [HACKERS] 8K block limit

Tatsuo Ishii wrote:

> But modern Unixes do read-ahead/write-behind I/O when access looks
> sequential, don't they? I did some testing on my LinuxPPC box.
>
> 0. create table t2(i int,c char(4000));
> 1. time psql -c "copy t2 from '/tmp/aaa'" test
>    (aaa has 5120 records and this will create a 20MB table)
> 2. time psql -c "select count(*) from t2" test
> 3. total time of the regression test
>
> o result of testing 1
>
> 8K: 0.02user 0.04system 3:26.20elapsed
> 32K: 0.03user 0.06system 0:48.25elapsed
>
> 32K is 4 times faster than 8k!
>
> o result of testing 2
>
> 8K: 0.02user 0.04system 6:00.31elapsed
> 32K: 0.04user 0.02system 1:02.13elapsed
>
> 32K is nearly 6 times faster than 8k!

Did you use the same -B for 8K and 32K?
You should use 4x the buffers in the 8K case!

Vadim

#9 Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Vadim Mikheev (#8)
Re: [HACKERS] 8K block limit

> Tatsuo Ishii wrote:
>
> > But modern Unixes do read-ahead/write-behind I/O when access looks
> > sequential, don't they? I did some testing on my LinuxPPC box.
> >
> > 0. create table t2(i int,c char(4000));
> > 1. time psql -c "copy t2 from '/tmp/aaa'" test
> >    (aaa has 5120 records and this will create a 20MB table)
> > 2. time psql -c "select count(*) from t2" test
> > 3. total time of the regression test
> >
> > o result of testing 1
> >
> > 8K: 0.02user 0.04system 3:26.20elapsed
> > 32K: 0.03user 0.06system 0:48.25elapsed
> >
> > 32K is 4 times faster than 8k!
> >
> > o result of testing 2
> >
> > 8K: 0.02user 0.04system 6:00.31elapsed
> > 32K: 0.04user 0.02system 1:02.13elapsed
> >
> > 32K is nearly 6 times faster than 8k!
>
> Did you use the same -B for 8K and 32K?
> You should use 4x the buffers in the 8K case!

Ok. This time I started postmaster as 'postmaster -S -i -B 256'.

test1:
0.03user 0.02system 3:21.65elapsed

test2:
0.01user 0.08system 5:30.94elapsed

a little bit faster, but no significant difference?
--
Tatsuo Ishii

#10 Vadim Mikheev
vadim@krs.ru
In reply to: Tatsuo Ishii (#9)
Re: [HACKERS] 8K block limit

Tatsuo Ishii wrote:

> > Did you use the same -B for 8K and 32K?
> > You should use 4x the buffers in the 8K case!
>
> Ok. This time I started postmaster as 'postmaster -S -i -B 256'.
>
> test1:
> 0.03user 0.02system 3:21.65elapsed
>
> test2:
> 0.01user 0.08system 5:30.94elapsed
>
> a little bit faster, but no significant difference?

Yes. So, 32K is a sure value for a few simultaneous sessions.

Vadim

#11 Stupor Genius
stuporg@erols.com
In reply to: Tatsuo Ishii (#7)
RE: [HACKERS] 8K block limit

> Additional testing is welcome...
>
> > The max is 32k because of the aforementioned 15 bits available, but I'd
> > be a bit cautious about trying it. When I put this in, the highest I
> > could get to work on AIX was 16k. Pushing it up to 32k caused major
> > breakage in the system internals. I had to reboot the machine and fsck
> > the file system. Some files were linked incorrectly, other files
> > disappeared, etc. A real mess.
> >
> > Not sure exactly what it corrupted, but I'd try the 32k limit on a
> > non-production system first...
>
> I did the above on 6.4.2. What version are you using? Or maybe it's a
> platform-dependent problem?

My platform at the time was AIX 4.1.4.0 and it was definitely AIX
that broke, not postgres.

Glad to hear it works at 32k on other systems though!

Darren

#12 Zeugswetter Andreas IZ5
Andreas.Zeugswetter@telecom.at
In reply to: Stupor Genius (#11)
AW: [HACKERS] 8K block limit

This is all logical; all sequential access will be faster with a bigger
blocksize. It usually gets faster up to a blocksize of 256k.
You could have done the same test using dd.
The slowdown will be on random access using an index, with too much data
to cache all data pages and needed index pages:
1. The cache will not be as selective, since for one often-needed
row the backend will cache 32k (plus at least 32k for each index
needed to find this row).
2. The index lookup will use more CPU time, since one index page
is larger.
3. To cache one row with 3 indices you will need at least 128k of memory,
compared to 32k with an 8k pagesize.
4. It will also increase the amount of disk space and cache needed
for the system tables (since they are all rather small).

For the default configuration a 4-8k page size seems to be a good
tradeoff in other DBMSs. For PostgreSQL this might not be so,
because of the lack of read-ahead and write-behind.
Remember that the read-ahead would actually need to read in bigger
blocks (up to 256k) to perform best. The same is true for
the write-behind.

Andreas
