Hello,
In Python pandas there is, for example, a json_normalize function; I couldn't find anything similar (or better) in PostgreSQL.
The general idea would be to "easily" convert a hierarchical structure
like JSON or XML into a table: for example, creating columns by
appending the key names while going down the tree, using null for
empty values, and adding more columns as needed by the given structure.
(a one-way operation)
A few conceptual gists:
jsonX=
{
  "glossary": {
    "title": "example glossary",
    "GlossDiv": {
      "title": "S",
      "GlossList": {
        "GlossEntry": {
          "ID": "SGML",
          "SortAs": "SGML",
          "GlossTerm": "Sta..",
          "Acronym": "SGML",
          "Abbrev": "ISO 8879:1986",
          "GlossDef": {
            "para": "A meta-m..",
            "GlossSeeAlso": ["GML", "XML"]
          },
          "GlossSee": "markup"
        }
      }
    }
  }
}
select *
from json_to_table('jsonX');
-- generated columns with no data/only nulls could be removed..
-- arrays could be unnested in the process as well
glossary | glossary.title   | glossary.GlossDiv.title | ..
---------+------------------+-------------------------+---
null     | example glossary | S                       |
..
The last column, glossary.GlossDiv.GlossList.GlossEntry.GlossSee,
would hold the value "markup".
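For comparison, the closest I seem to get with the built-in jsonb functions is a long path/value list rather than a wide layout with one column per path; a rough sketch with a recursive CTE over jsonb_each, on a trimmed-down jsonX (pivoting the paths into actual columns would still need dynamic SQL or something like crosstab):

-- rough sketch, not the proposed json_to_table: walk the tree with a
-- recursive CTE, concatenating key names into a path and keeping the
-- leaf values; objects are recursed into, everything else is a leaf
with recursive flat(path, value) as (
    select e.key, e.value
    from jsonb_each(
           '{"glossary": {"title": "example glossary",
                          "GlossDiv": {"title": "S"}}}'::jsonb) as e
    union all
    select f.path || '.' || e.key, e.value
    from flat f
    cross join lateral jsonb_each(
           case when jsonb_typeof(f.value) = 'object'
                then f.value else '{}'::jsonb end) as e
)
select path, value
from flat
where jsonb_typeof(value) <> 'object';
-- path                    | value
-- glossary.title          | "example glossary"
-- glossary.GlossDiv.title | "S"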
---
What if there are different structures that need to be combined?
(They could be added in the same manner as before.)
jsonY=
{
  "s1": [
    {
      "f1": "a",
      "f2": "b",
      "f3": { "f3.1": "c",
              "f3.2": "d" }
    },
    {
      "f1": "e",
      "f4": "g"
    }
  ]
}
select *
from json_to_table('jsonY');
-- generated columns with no data/only nulls could be removed..
-- the '.' separator cannot really be trusted, since key names may contain dots themselves (e.g. "f3.1")
s1   | s1.f1 | s1.f2 | s1.f3 | s1.f3.f3.1 | s1.f3.f3.2 | s1.f4
-----+-------+-------+-------+------------+------------+------
null | a     | b     | null  | c          | d          | null
null | e     | null  | null  | null       | null       | g
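Under the same assumptions, arrays could be unnested first with jsonb_array_elements (with ordinality to remember which element a value came from); keys that are missing in one element simply produce no row, which would be the long-form counterpart of the nulls above:

-- rough sketch for jsonY: unnest s1, then flatten each element as before
with recursive flat(ord, path, value) as (
    select a.ord, e.key, e.value
    from jsonb_array_elements(
           '{"s1": [{"f1": "a", "f2": "b",
                     "f3": {"f3.1": "c", "f3.2": "d"}},
                    {"f1": "e", "f4": "g"}]}'::jsonb -> 's1')
         with ordinality as a(elem, ord)
    cross join lateral jsonb_each(a.elem) as e
    union all
    select f.ord, f.path || '.' || e.key, e.value
    from flat f
    cross join lateral jsonb_each(
           case when jsonb_typeof(f.value) = 'object'
                then f.value else '{}'::jsonb end) as e
)
select ord as elem, 's1.' || path as path, value
from flat
where jsonb_typeof(value) <> 'object'
order by ord, path;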
Any ideas or suggestions (apart from plpython)?
Wim