JSONB order?

Started by Tony Shelverover 5 years ago9 messagesgeneral
Jump to latest
#1Tony Shelver
tshelver@gmail.com

I am getting data out of a spreadsheet (Google API) and loading it into a
Python 3.8 dict.
I then dump it to json format. On printing, it's in the correct order:
{
"Timestamp": "05/11/2020 17:08:08",
"Site Name": "SureSecurity Calgary",
"Last Name": "Shelver",
"First Name": "Anthony",
"Middle Name(s)": "",
"Phone": 555757007,
"Person visited": "test",
"Body Temperature": 44,
"Fever or chills": "No",
"Difficulty breathing or shortness of breath": "No",
"Cough": "No",
"Sore throat, trouble swallowing": "No",
"Runny nose/stuffy nose or nasal congestion": "No",
"Decrease or loss of smell or taste": "No",
"Nausea, vomiting, diarrhea, abdominal pain": "No",
"Not feeling well, extreme tiredness, sore muscles":
"Yes", "Have you travelled outside of Canada in the past 14 days?": "No",
"Have you had close contact with a confirmed or probable case of
COVID-19?": "No"
}

It's passed to a plpgsql function, using a jsonb parameter variable.
This insets it into the table, into into a jsonb column.

When looking at what the column contents are, it's been rearranged. The
order always seems to have been rearranged in the same way, as below:
{
"Cough": "No",
"Phone": 5555757007,
"Last Name": "Shelver",
"Site Name": "SureSecurity Calgary",
"Timestamp": "04/11/2020 17:34:48",
"First Name": "Anthony",
"Middle Name(s)": "",
"Person visited": "Many",
"Fever or chills": "No",
"Body Temperature": 44,
"Sore throat, trouble swallowing": "No",
"Decrease or loss of smell or taste": "No",
"Nausea, vomiting, diarrhea, abdominal pain": "No",
"Runny nose/stuffy nose or nasal congestion": "No",
"Difficulty breathing or shortness of breath": "No",
"Not feeling well, extreme tiredness, sore muscles": "No",
"Have you travelled outside of Canada in the past 14 days?": "No",
"Have you had close contact with a confirmed or probable case of
COVID-19?": "No"
}

If the order had remained the same, it's child's play to pull the data out
and present it in a report, even if the data elements change.
But... seen above, the order gets mixed up.

Any ideas?

Thanks

Tony Shelver

#2Christophe Pettus
xof@thebuild.com
In reply to: Tony Shelver (#1)
Re: JSONB order?

On Nov 5, 2020, at 07:34, Tony Shelver <tshelver@gmail.com> wrote:
But... seen above, the order gets mixed up.

Any ideas?

JSON objects, like Python dicts, are not automatically ordered by key. Once you move from the column space to the JSON object space, you can't rely on the object keys being in a consistent order.

You'll want to have a step when ingesting the JSON object into a report that lines up the key values appropriately with the right presentation in the report.
--
-- Christophe Pettus
xof@thebuild.com

#3Magnus Hagander
magnus@hagander.net
In reply to: Tony Shelver (#1)
Re: JSONB order?

On Thu, Nov 5, 2020 at 4:35 PM Tony Shelver <tshelver@gmail.com> wrote:

I am getting data out of a spreadsheet (Google API) and loading it into a Python 3.8 dict.
I then dump it to json format. On printing, it's in the correct order:
{
"Timestamp": "05/11/2020 17:08:08",
"Site Name": "SureSecurity Calgary",
"Last Name": "Shelver",
"First Name": "Anthony",
"Middle Name(s)": "",
"Phone": 555757007,
"Person visited": "test",
"Body Temperature": 44,
"Fever or chills": "No",
"Difficulty breathing or shortness of breath": "No",
"Cough": "No",
"Sore throat, trouble swallowing": "No",
"Runny nose/stuffy nose or nasal congestion": "No",
"Decrease or loss of smell or taste": "No",
"Nausea, vomiting, diarrhea, abdominal pain": "No",
"Not feeling well, extreme tiredness, sore muscles":
"Yes", "Have you travelled outside of Canada in the past 14 days?": "No",
"Have you had close contact with a confirmed or probable case of COVID-19?": "No"
}

It's passed to a plpgsql function, using a jsonb parameter variable.
This insets it into the table, into into a jsonb column.

When looking at what the column contents are, it's been rearranged. The order always seems to have been rearranged in the same way, as below:
{
"Cough": "No",
"Phone": 5555757007,
"Last Name": "Shelver",
"Site Name": "SureSecurity Calgary",
"Timestamp": "04/11/2020 17:34:48",
"First Name": "Anthony",
"Middle Name(s)": "",
"Person visited": "Many",
"Fever or chills": "No",
"Body Temperature": 44,
"Sore throat, trouble swallowing": "No",
"Decrease or loss of smell or taste": "No",
"Nausea, vomiting, diarrhea, abdominal pain": "No",
"Runny nose/stuffy nose or nasal congestion": "No",
"Difficulty breathing or shortness of breath": "No",
"Not feeling well, extreme tiredness, sore muscles": "No",
"Have you travelled outside of Canada in the past 14 days?": "No",
"Have you had close contact with a confirmed or probable case of COVID-19?": "No"
}

If the order had remained the same, it's child's play to pull the data out and present it in a report, even if the data elements change.
But... seen above, the order gets mixed up.

Any ideas?

The json standard declares that the keys in a document are unordered,
and can appear at any order.

In PostgreSQL, jsonb will not preserve key ordering, as a feature for
efficiency. The plain json datatype will, so if key ordering is
important you should use json instead of jsonb (but you should
probably also not use the json format in general, as it does not
guarantee this)

See https://www.postgresql.org/docs/13/datatype-json.html

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

#4Tony Shelver
tshelver@gmail.com
In reply to: Tony Shelver (#1)
Fwd: JSONB order?

---------- Forwarded message ---------
From: Tony Shelver <tshelver@gmail.com>
Date: Thu, 5 Nov 2020 at 17:45
Subject: Re: JSONB order?
To: Christophe Pettus <xof@thebuild.com>

Thanks Christophe, that's what I thought.
Just seemed weird that they were 'disordered' in exactly the same way every
time.

FYI, as of Python 3.7, dicts *are* ordered.

The problem is that we are possibly going to have many versions of these
forms with slightly differing keys, which will be a pain to order in some
hard coded way.

On Thu, 5 Nov 2020 at 17:40, Christophe Pettus <xof@thebuild.com> wrote:

Show quoted text

On Nov 5, 2020, at 07:34, Tony Shelver <tshelver@gmail.com> wrote:
But... seen above, the order gets mixed up.

Any ideas?

JSON objects, like Python dicts, are not automatically ordered by key.
Once you move from the column space to the JSON object space, you can't
rely on the object keys being in a consistent order.

You'll want to have a step when ingesting the JSON object into a report
that lines up the key values appropriately with the right presentation in
the report.
--
-- Christophe Pettus
xof@thebuild.com

#5Christophe Pettus
xof@thebuild.com
In reply to: Tony Shelver (#4)
Re: JSONB order?

On Nov 5, 2020, at 07:45, Tony Shelver <tshelver@gmail.com> wrote:
Thanks Christophe, that's what I thought.
Just seemed weird that they were 'disordered' in exactly the same way every time.

FYI, as of Python 3.7, dicts are ordered.

The problem is that we are possibly going to have many versions of these forms with slightly differing keys, which will be a pain to order in some hard coded way.

As Magnus noted, you can use JSON instead of JSONB. JSON is basically a text blob with a syntax check wrapper around it, so it will be order-stable once created. (If you run it through a JSONB-expecting function, then the ordering may change again.) It's less efficient to operate on than JSONB, but that might be OK for your purposes.

--
-- Christophe Pettus
xof@thebuild.com

#6David G. Johnston
david.g.johnston@gmail.com
In reply to: Tony Shelver (#4)
Re: JSONB order?

On Thu, Nov 5, 2020 at 8:46 AM Tony Shelver <tshelver@gmail.com> wrote:

---------- Forwarded message ---------
From: Tony Shelver <tshelver@gmail.com>
Date: Thu, 5 Nov 2020 at 17:45
Subject: Re: JSONB order?
To: Christophe Pettus <xof@thebuild.com>

Thanks Christophe, that's what I thought.
Just seemed weird that they were 'disordered' in exactly the same way
every time.

FYI, as of Python 3.7, dicts *are* ordered.

The problem is that we are possibly going to have many versions of these
forms with slightly differing keys, which will be a pain to order in some
hard coded way.

The Google Sheet source document has column ordering. If that is what you
want to rely upon have your Python code capture that into an array and
attach that array to the json document as a separate
"field_headers_ordered" key (or something similar).

David J.

#7Rob Sargent
robjsargent@gmail.com
In reply to: Tony Shelver (#4)
Re: JSONB order?

On Nov 5, 2020, at 8:45 AM, Tony Shelver <tshelver@gmail.com> wrote:

---------- Forwarded message ---------
From: Tony Shelver <tshelver@gmail.com <mailto:tshelver@gmail.com>>
Date: Thu, 5 Nov 2020 at 17:45
Subject: Re: JSONB order?
To: Christophe Pettus <xof@thebuild.com <mailto:xof@thebuild.com>>

Thanks Christophe, that's what I thought.
Just seemed weird that they were 'disordered' in exactly the same way every time.

FYI, as of Python 3.7, dicts are ordered.

The problem is that we are possibly going to have many versions of these forms with slightly differing keys, which will be a pain to order in some hard coded way.

On Thu, 5 Nov 2020 at 17:40, Christophe Pettus <xof@thebuild.com <mailto:xof@thebuild.com>> wrote:

On Nov 5, 2020, at 07:34, Tony Shelver <tshelver@gmail.com <mailto:tshelver@gmail.com>> wrote:
But... seen above, the order gets mixed up.

Any ideas?

JSON objects, like Python dicts, are not automatically ordered by key. Once you move from the column space to the JSON object space, you can't rely on the object keys being in a consistent order.

You'll want to have a step when ingesting the JSON object into a report that lines up the key values appropriately with the right presentation in the report.
--
-- Christophe Pettus
xof@thebuild.com <mailto:xof@thebuild.com>

Sounds like you’ll need a separate mechanism for maintaining versions of the forms and which headers represent the same data concept. Always access data via canonical header translated to current form.

#8Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Tony Shelver (#4)
Re: Fwd: JSONB order?

On 11/5/20 7:45 AM, Tony Shelver wrote:

---------- Forwarded message ---------
From: *Tony Shelver* <tshelver@gmail.com <mailto:tshelver@gmail.com>>
Date: Thu, 5 Nov 2020 at 17:45
Subject: Re: JSONB order?
To: Christophe Pettus <xof@thebuild.com <mailto:xof@thebuild.com>>

Thanks Christophe, that's what I thought.
Just seemed weird that they were 'disordered' in exactly the same way
every time.

Probably because that resolves to the most efficient way to store in
jsonb for that particular record.

FYI, as of Python 3.7, dicts /_are_/ ordered.

By insertion order so updating a dict with a new item will add new key
to end.

The problem is that we are possibly going to have many versions of these
forms with slightly differing keys, which will be a pain to order in
some hard coded way.

On Thu, 5 Nov 2020 at 17:40, Christophe Pettus <xof@thebuild.com
<mailto:xof@thebuild.com>> wrote:

On Nov 5, 2020, at 07:34, Tony Shelver <tshelver@gmail.com

<mailto:tshelver@gmail.com>> wrote:

But...  seen above, the order gets mixed up.

Any ideas?

JSON objects, like Python dicts, are not automatically ordered by
key.  Once you move from the column space to the JSON object space,
you can't rely on the object keys being in a consistent order.

You'll want to have a step when ingesting the JSON object into a
report that lines up the key values appropriately with the right
presentation in the report.
--
-- Christophe Pettus
xof@thebuild.com <mailto:xof@thebuild.com>

--
Adrian Klaver
adrian.klaver@aklaver.com

#9Tony Shelver
tshelver@gmail.com
In reply to: Rob Sargent (#7)
Re: JSONB order?

On Thu, 5 Nov 2020 at 18:27, Rob Sargent <robjsargent@gmail.com> wrote:

On Nov 5, 2020, at 8:45 AM, Tony Shelver <tshelver@gmail.com> wrote:

---------- Forwarded message ---------
From: Tony Shelver <tshelver@gmail.com>
Date: Thu, 5 Nov 2020 at 17:45
Subject: Re: JSONB order?
To: Christophe Pettus <xof@thebuild.com>

Thanks Christophe, that's what I thought.
Just seemed weird that they were 'disordered' in exactly the same way
every time.

FYI, as of Python 3.7, dicts *are* ordered.

The problem is that we are possibly going to have many versions of these
forms with slightly differing keys, which will be a pain to order in some
hard coded way.

Sounds like you’ll need a separate mechanism for maintaining versions of
the forms and which headers represent the same data concept. Always access
data via canonical header translated to current form.

Did a workaround. For what I needed, I used the python dict to json
function, which creates a string, and then stored that string in a varchar
column on the DB, leaving the json in place for other use.

For reporting, pulled the varchar back and used the json to dict function
on python. Minimal code changes required.

The output is formatted into an Excel spreadsheet, writing the column
header / title from the dict keys, and then formatting the values
underneath.

Not exactly 3rd normal form and all the other best practices, but this is a
hard prototype we have some customers for, so ease of being able to iterate
multiple forms and changes is key.