COPY JSON: use trailing commas in FORCE_ARRAY output
Hi,
Another issue I found with COPY TO (FORMAT json, FORCE_ARRAY) is that it places the delimiter comma at the beginning of the next line, like this:
```
evantest=# copy test_json_copy to stdout with (format json, force_array);
[
{"id":1,"name":"Alice","is_active":true,"tags":["dev","db"],"created_at":"2026-05-01"}
,{"id":2,"name":"Bob","is_active":false,"tags":["manager"],"created_at":"2026-05-02"}
,{"id":3,"name":"Charlie","is_active":null,"tags":null,"created_at":null}
,{"id":4,"name":"Special Case: \"Quotes\"","is_active":true,"tags":["a","b"],"created_at":"2026-05-04"}
]
```
I was surprised by the comma placement. It is valid JSON, but it looks quite uncommon.
For comparison, the existing json_agg() places commas at the end of the line:
```
evantest=# select json_agg(t) from (select id, name from test_json_copy) t;
json_agg
----------------------------------------------
[{"id":1,"name":"Alice"}, +
{"id":2,"name":"Bob"}, +
{"id":3,"name":"Charlie"}, +
{"id":4,"name":"Special Case: \"Quotes\""}]
(1 row)
```
If this feature had already been released, I would not think it worth changing just for formatting. But since "FORMAT json" is a new PG19 feature and has not been released yet, I think it is better to make the output to follow the more common style.
This patch changes the output to place the comma at the end of the previous line instead. The fix only adjusts how commas and newlines are emitted. It does not buffer the whole result, so it should not have any performance impact.
See the attached patch for details.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
Attachments:
v1-0001-COPY-JSON-use-trailing-commas-in-FORCE_ARRAY-outp.patchapplication/octet-stream; name=v1-0001-COPY-JSON-use-trailing-commas-in-FORCE_ARRAY-outp.patch; x-unix-mode=0644Download+36-18
Hi,
On Wed, 6 May 2026 at 12:11, Chao Li <li.evan.chao@gmail.com> wrote:
Hi,
Another issue I found with COPY TO (FORMAT json, FORCE_ARRAY) is that it
places the delimiter comma at the beginning of the next line, like this:
```
evantest=# copy test_json_copy to stdout with (format json, force_array);
[{"id":1,"name":"Alice","is_active":true,"tags":["dev","db"],"created_at":"2026-05-01"}
,{"id":2,"name":"Bob","is_active":false,"tags":["manager"],"created_at":"2026-05-02"}
,{"id":3,"name":"Charlie","is_active":null,"tags":null,"created_at":null}
,{"id":4,"name":"Special Case:
\"Quotes\"","is_active":true,"tags":["a","b"],"created_at":"2026-05-04"}
]
```I was surprised by the comma placement. It is valid JSON, but it looks
quite uncommon.For comparison, the existing json_agg() places commas at the end of the
line:
```
evantest=# select json_agg(t) from (select id, name from test_json_copy) t;
json_agg
----------------------------------------------
[{"id":1,"name":"Alice"}, +
{"id":2,"name":"Bob"}, +
{"id":3,"name":"Charlie"}, +
{"id":4,"name":"Special Case: \"Quotes\""}]
(1 row)
```If this feature had already been released, I would not think it worth
changing just for formatting. But since "FORMAT json" is a new PG19 feature
and has not been released yet, I think it is better to make the output to
follow the more common style.This patch changes the output to place the comma at the end of the
previous line instead. The fix only adjusts how commas and newlines are
emitted. It does not buffer the whole result, so it should not have any
performance impact.
The patch looks good to me. I applied and the focused 'copy'
regression test too passed.
I agree that since FORMAT json is new for PG19, it is reasonable to adjust
the formatting before release. The implementation still streams the output
and only changes where the separator/newline are emitted.
Regards,
Ayush
On 6 May 2026, at 08:40, Chao Li <li.evan.chao@gmail.com> wrote:
I was surprised by the comma placement. It is valid JSON, but it looks quite uncommon.
It might look uncommon, but for very wide lines it's IMHO preferrable to not
have to scroll all the way to the end of the line to know that the line is part
of an array.
For comparison, the existing json_agg() places commas at the end of the line:
That's true, but json_agg() and COPY TO in ndjson format have different use
cases.
..it should not have any performance impact.
It does add branches though, and in one branch use a non-inlined function where
previously it would unconditionally use an inline function. ISTM it would
still be valuable to do performance testing given that COPY is commonly used in
performance sensitive settings.
--
Daniel Gustafsson
On 5/6/26 2:40 PM, Chao Li wrote:
Hi,
Another issue I found with COPY TO (FORMAT json, FORCE_ARRAY) is that it places the delimiter comma at the beginning of the next line, like this:
```
evantest=# copy test_json_copy to stdout with (format json, force_array);
[
{"id":1,"name":"Alice","is_active":true,"tags":["dev","db"],"created_at":"2026-05-01"}
,{"id":2,"name":"Bob","is_active":false,"tags":["manager"],"created_at":"2026-05-02"}
,{"id":3,"name":"Charlie","is_active":null,"tags":null,"created_at":null}
,{"id":4,"name":"Special Case: \"Quotes\"","is_active":true,"tags":["a","b"],"created_at":"2026-05-04"}
]
```I was surprised by the comma placement. It is valid JSON, but it looks quite uncommon.
For comparison, the existing json_agg() places commas at the end of the line:
```
evantest=# select json_agg(t) from (select id, name from test_json_copy) t;
json_agg
----------------------------------------------
[{"id":1,"name":"Alice"}, +
{"id":2,"name":"Bob"}, +
{"id":3,"name":"Charlie"}, +
{"id":4,"name":"Special Case: \"Quotes\""}]
(1 row)
```If this feature had already been released, I would not think it worth changing just for formatting. But since "FORMAT json" is a new PG19 feature and has not been released yet, I think it is better to make the output to follow the more common style.
This patch changes the output to place the comma at the end of the previous line instead. The fix only adjusts how commas and newlines are emitted. It does not buffer the whole result, so it should not have any performance impact.
See the attached patch for details.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
Thanks for the patch, I like it as I feel better with placing commas at the end of lines.
I have a small suggestion. The function name CopySendTextLikeEOL reads very similar to the existing CopySendTextLikeEndOfRow. Would it better to rename it to CopySendTextLikeLineTerminator?
Other than that, the patch looks good to me.
Regards,
Alex Guo
On May 6, 2026, at 16:23, Daniel Gustafsson <daniel@yesql.se> wrote:
On 6 May 2026, at 08:40, Chao Li <li.evan.chao@gmail.com> wrote:
I was surprised by the comma placement. It is valid JSON, but it looks quite uncommon.
It might look uncommon, but for very wide lines it's IMHO preferrable to not
have to scroll all the way to the end of the line to know that the line is part
of an array.For comparison, the existing json_agg() places commas at the end of the line:
That's true, but json_agg() and COPY TO in ndjson format have different use
cases...it should not have any performance impact.
It does add branches though, and in one branch use a non-inlined function where
previously it would unconditionally use an inline function. ISTM it would
still be valuable to do performance testing given that COPY is commonly used in
performance sensitive settings.
Make sense. I just did a test to compare the performance between master and the patch:
For the data setup, since the patch only changes where the comma is emitted, I intentionally used a table with only one column, to minimize the cost of formatting each row:
```
DROP TABLE IF EXISTS copy_json_force_array_perf;
CREATE UNLOGGED TABLE copy_json_force_array_perf(id int);
INSERT INTO copy_json_force_array_perf
SELECT g
FROM generate_series(1, 10000000) AS g;
VACUUM ANALYZE copy_json_force_array_perf;
\timing on
```
On master:
```
evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY);
COPY 10000000
Time: 1208.694 ms (00:01.209)
evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY);
COPY 10000000
Time: 1200.203 ms (00:01.200)
evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY);
COPY 10000000
Time: 1238.639 ms (00:01.239)
evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY);
COPY 10000000
Time: 1211.344 ms (00:01.211)
evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY);
COPY 10000000
Time: 1252.197 ms (00:01.252)
evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY);
COPY 10000000
Time: 1223.510 ms (00:01.224)
evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY);
COPY 10000000
Time: 1212.378 ms (00:01.212)
```
Average: ~1221 ms
With the patch:
```
evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY);
COPY 10000000
Time: 1218.580 ms (00:01.219)
evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY);
COPY 10000000
Time: 1212.913 ms (00:01.213)
evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY);
COPY 10000000
Time: 1204.350 ms (00:01.204)
evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY);
COPY 10000000
Time: 1205.276 ms (00:01.205)
evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY);
COPY 10000000
Time: 1202.088 ms (00:01.202)
evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY);
COPY 10000000
Time: 1222.390 ms (00:01.222)
```
Average: ~1211 ms
The difference doesn't look quite meaningful from this test. I built with debug and asserts disabled, and compiled with -O2. The output was written to /dev/null to avoid client/network overhead. The tests ran on my MacBook M4.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
On May 6, 2026, at 16:46, Alex Guo <guo.alex.hengchen@gmail.com> wrote:
On 5/6/26 2:40 PM, Chao Li wrote:
Hi,
Another issue I found with COPY TO (FORMAT json, FORCE_ARRAY) is that it places the delimiter comma at the beginning of the next line, like this:
```
evantest=# copy test_json_copy to stdout with (format json, force_array);
[
{"id":1,"name":"Alice","is_active":true,"tags":["dev","db"],"created_at":"2026-05-01"}
,{"id":2,"name":"Bob","is_active":false,"tags":["manager"],"created_at":"2026-05-02"}
,{"id":3,"name":"Charlie","is_active":null,"tags":null,"created_at":null}
,{"id":4,"name":"Special Case: \"Quotes\"","is_active":true,"tags":["a","b"],"created_at":"2026-05-04"}
]
```I was surprised by the comma placement. It is valid JSON, but it looks quite uncommon.
For comparison, the existing json_agg() places commas at the end of the line:
```
evantest=# select json_agg(t) from (select id, name from test_json_copy) t;
json_agg
----------------------------------------------
[{"id":1,"name":"Alice"}, +
{"id":2,"name":"Bob"}, +
{"id":3,"name":"Charlie"}, +
{"id":4,"name":"Special Case: \"Quotes\""}]
(1 row)
```If this feature had already been released, I would not think it worth changing just for formatting. But since "FORMAT json" is a new PG19 feature and has not been released yet, I think it is better to make the output to follow the more common style.
This patch changes the output to place the comma at the end of the previous line instead. The fix only adjusts how commas and newlines are emitted. It does not buffer the whole result, so it should not have any performance impact.
See the attached patch for details.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/Thanks for the patch, I like it as I feel better with placing commas at the end of lines.
I have a small suggestion. The function name CopySendTextLikeEOL reads very similar to the existing CopySendTextLikeEndOfRow. Would it better to rename it to CopySendTextLikeLineTerminator?
Other than that, the patch looks good to me.
Regards,
Alex Guo
Thanks for the suggestion, I take it.
PFA v2 - Renamed CopySendTextLikeEOL to CopySendTextLikeLineTerminator.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/