Round-trip GeoJSON
Created by: jayvdb
I would like to be able to round-trip GeoJSON through csvkit, so that small operations can be done using csvkit in the middle, without resulting in strange diffs. https://github.com/wireservice/csvkit/pull/867 is part of this effort.
That is, the following should produce very little diff output:
$ wget https://raw.githubusercontent.com/lyzidiamond/learn-geojson/master/geojson/pdxplaces.geojson
$ in2csv -f geojson pdxplaces.geojson > pdxplaces.csv
$ csvjson --lon longitude --lat latitude --indent=2 pdxplaces.csv > pdxplaces.csv.geojson
$ diff -u pdxplaces.geojson pdxplaces.csv.geojson
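To separate ordering noise from real value changes while looking at that diff, something like the following Python sketch could help. It is only an illustration: value_changes is a hypothetical helper, not part of csvkit, and it assumes both files are FeatureCollections whose features appear in the same order.

```python
import json


def value_changes(original, round_tripped):
    """Yield (feature index, key, before, after) for changed property values.

    Key order inside "properties" is ignored; only values are compared.
    """
    pairs = zip(original["features"], round_tripped["features"])
    for index, (before, after) in enumerate(pairs):
        old = before.get("properties") or {}
        new = after.get("properties") or {}
        for key in sorted(set(old) | set(new)):
            # Compare the JSON encodings so that 2 vs "2" and "Yes" vs true
            # are reported, and so that 1 and true are not treated as equal.
            if json.dumps(old.get(key), sort_keys=True) != json.dumps(new.get(key), sort_keys=True):
                yield index, key, old.get(key), new.get(key)


# Usage (file names taken from the commands above):
# with open("pdxplaces.geojson") as a, open("pdxplaces.csv.geojson") as b:
#     for change in value_changes(json.load(a), json.load(b)):
#         print(change)
```

A property that is missing on one side is treated as null here, so an empty string that was dropped during the round trip is still reported.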
Some of the differences that should be controllable with command-line arguments:
- sequence of keys in a feature; i.e. should geometry appear before properties, or the opposite. It seems different tools make different choices for this ordering, and ideally in2csv can annotate its output with this ordering (assuming it is consistent throughout the input) so that csvjson can re-use the same ordering.
- do not emit empty properties. Currently they are being emitted and mostly they shouldn't be emitted. (https://github.com/wireservice/csvkit/pull/869)
- order the properties by key. in2csv adds a column when it sees a new property, which doesn't work well if many properties do not exist on the first Feature but appear in later features. Sorting could be a feature of in2csv or csvjson.
- generation of the bbox. It should be possible to disable this being computed and added, and it is non-trivial to compute it for complex geojson (c.f. https://github.com/wireservice/csvkit/pull/867).
- other metadata (https://github.com/wireservice/csvkit/issues/870)
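As a rough illustration of the ordering and empty-property points in the list above, here is a minimal Python sketch. It is not how in2csv or csvjson are implemented; the helper names (collect_property_keys, feature_member_order, drop_empty) are hypothetical and exist only for this example.

```python
import json


def collect_property_keys(features, sort_keys=False):
    """Return every property key seen in any feature.

    Keys come back in first-seen order, or sorted when sort_keys is True,
    so a property missing from the first Feature still gets a stable column.
    """
    keys, seen = [], set()
    for feature in features:
        for key in (feature.get("properties") or {}):
            if key not in seen:
                seen.add(key)
                keys.append(key)
    return sorted(keys) if sort_keys else keys


def feature_member_order(features):
    """Return the top-level key order shared by all features, else None."""
    orders = {tuple(feature.keys()) for feature in features}
    return list(orders.pop()) if len(orders) == 1 else None


def drop_empty(properties):
    """Return a copy of properties without empty-string or null values."""
    return {k: v for k, v in properties.items() if v not in ("", None)}


doc = json.loads("""
{"type": "FeatureCollection", "features": [
  {"type": "Feature", "properties": {"Reason": 2, "Name": "place 2"}, "geometry": null},
  {"type": "Feature", "properties": {"my polygon": "it's here", "Name": ""}, "geometry": null}
]}
""")
features = doc["features"]
print(collect_property_keys(features))
print(collect_property_keys(features, sort_keys=True))
print(feature_member_order(features))
print(drop_empty(features[1]["properties"]))
```

The member order found by feature_member_order is the kind of information in2csv would need to pass along for csvjson to reproduce the input layout; how that annotation would be carried through the CSV is left open here.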
After simplistic solutions for those three, the diff looks like:
diff -u pdxplaces.geojson pdxplaces.csv.geojson
--- pdxplaces.geojson 2017-07-24 10:22:24.234446535 +0700
+++ pdxplaces.csv.geojson 2017-07-24 12:58:46.408943761 +0700
@@ -106,7 +106,7 @@
"properties": {
"Name": "place 2",
"Contributor": "geografa",
- "Reason": 2
+ "Reason": "2"
},
"geometry": {
"type": "Point",
@@ -164,8 +164,7 @@
{
"type": "Feature",
"properties": {
- "my polygon": "it's here",
- "Name": ""
+ "my polygon": "it's here"
},
"geometry": {
"type": "Polygon",
@@ -251,8 +250,8 @@
"type": "Feature",
"properties": {
"Name": "The Commons Brewery",
- "Reason": "Farmhouse ales, duh.",
- "Contributor": "Dillon Mahmoudi"
+ "Contributor": "Dillon Mahmoudi",
+ "Reason": "Farmhouse ales, duh."
},
"geometry": {
"type": "Point",
@@ -267,7 +266,7 @@
"properties": {
"Name": "Kenilworth Coffeehouse",
"Utility": "Excellent biscuits",
- "Coffee": "Yes"
+ "Coffee": true
},
"geometry": {
"type": "Point",
@@ -327,8 +326,7 @@
"properties": {
"Name": "Square 54",
"Contributor": "Henrik",
- "Reason": "Mystic zombie area",
- "my polygon": ""
+ "Reason": "Mystic zombie area"
},
"geometry": {
"type": "Polygon",
The change to "Coffee" is concerning (and maybe there is a command line arg which would prevent that), but the rest of those changes are IMO a good "linted" output.
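One plausible explanation for both the "Coffee" and the "Reason" hunks is per-column type inference on the CSV side. The Python sketch below is a simplified stand-in for that idea, not csvkit's actual code: infer_column and the TRUE_VALUES / FALSE_VALUES word lists are assumptions made for illustration.

```python
# Assumed word lists for illustration only.
TRUE_VALUES = {"yes", "y", "true", "t", "1"}
FALSE_VALUES = {"no", "n", "false", "f", "0"}


def to_bool(value):
    """Cast a string to bool, raising ValueError when it is not boolean-like."""
    low = value.strip().lower()
    if low in TRUE_VALUES:
        return True
    if low in FALSE_VALUES:
        return False
    raise ValueError(value)


def infer_column(values):
    """Cast a whole column to the first type that fits every non-empty cell.

    Tries boolean, then integer, and otherwise keeps the strings as text.
    Empty cells become None in all cases.
    """
    if any(v != "" for v in values):
        for cast in (to_bool, int):
            try:
                return [None if v == "" else cast(v) for v in values]
            except ValueError:
                continue  # at least one cell did not fit; try the next type
    return [None if v == "" else v for v in values]


# A column holding only "Yes" (plus blanks) is read as boolean.
print(infer_column(["Yes", "", ""]))
# A column mixing a number with free text stays text, so 2 comes back as "2".
print(infer_column(["2", "Mystic zombie area", ""]))
```

Under this model the original JSON type of each value is lost once it is written to CSV, so preserving "Yes" as a string would need either inference to be turned off or some type annotation carried alongside the data.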