Thread: PostGIS data loader / dumper - GSOC 2022 Idea

PostGIS data loader / dumper - GSOC 2022 Idea

From
"Regina Obe"
Date:
In pgAdmin3, our ESRI shapefile loader /dumper GUI was installable as a
plugin.  We lost that feature with pgAdmin4.

To get this feature back, I was thinking that our command line tools would
be easier to integrate in pgAdmin4 than the gui
and also solve the issue of our GUI not packaged  on many OS.

The command line tools I am thinking about are these that are part of the
PostGIS code base and generally installable even on headless systems.
shp2pgsql (ESRI shape file loader) and pgsql2shp (ESRI shapefile dumper)

These could be set up in much the same way as how the Backup / Restore /
export/import (call pg_dump, pg_restore, psql)

This would require the user to have PostGIS command-line tools and PostGIS
in database installed to use it and the binaries in the PostgreSQL bin
folder.
For windows these files are always installed in PostgreSQL bin when PostGIS
is installed.  I think on Linux they are installed in system.

Do you see any issues with the idea I've outlined in Idea 2:

https://trac.osgeo.org/postgis/wiki/GoogleSummerCode2022

I haven't looked that closely at how those pieces in pgAdmin4 work so not
sure how difficult the task and if it makes sense for a GSOC project.

Thanks,
Regina




Re: PostGIS data loader / dumper - GSOC 2022 Idea

From
Dave Page
Date:
Hi

On Sat, 19 Feb 2022 at 22:51, Regina Obe <lr@pcorp.us> wrote:
In pgAdmin3, our ESRI shapefile loader /dumper GUI was installable as a
plugin.  We lost that feature with pgAdmin4.

To get this feature back, I was thinking that our command line tools would
be easier to integrate in pgAdmin4 than the gui
and also solve the issue of our GUI not packaged  on many OS.

The command line tools I am thinking about are these that are part of the
PostGIS code base and generally installable even on headless systems.
shp2pgsql (ESRI shape file loader) and pgsql2shp (ESRI shapefile dumper)

These could be set up in much the same way as how the Backup / Restore /
export/import (call pg_dump, pg_restore, psql)

Makes sense.
 

This would require the user to have PostGIS command-line tools and PostGIS
in database installed to use it and the binaries in the PostgreSQL bin
folder.
For windows these files are always installed in PostgreSQL bin when PostGIS
is installed.  I think on Linux they are installed in system.

We could probably come up with a sensible way of setting the path with reasonable defaults. I'd be inclined to have that as a separate setting from the normal binpath (as we likely wouldn't have per-server versions, except on Windows). 
 

Do you see any issues with the idea I've outlined in Idea 2:

https://trac.osgeo.org/postgis/wiki/GoogleSummerCode2022

I haven't looked that closely at how those pieces in pgAdmin4 work so not
sure how difficult the task and if it makes sense for a GSOC project.

It's certainly feasible for someone to do. We have infrastructure for running external tools, so that part is easy. Adding a PostGIS binpath would also be very easy. Most of the work would be designing and building the wizard that the user would complete to generate the command that would be run by the process manager. I don't know the intricacies of those commands though, so I have no idea if this is a simple flow or something that's complex with lots of inter-dependencies between options.

My main concern with this project would be on the mentoring side. How familiar are you and Sandro with the pgAdmin codebase, particularly with the newer React framework we're moving to now (which you should probably add to your skills required list in place of Javascript)? The team can offer advice and guidance of course, but if a student ends up needing a lot of hand-holding, that might become more difficult.
 
--

RE: PostGIS data loader / dumper - GSOC 2022 Idea

From
"Regina Obe"
Date:


This would require the user to have PostGIS command-line tools and PostGIS
in database installed to use it and the binaries in the PostgreSQL bin
folder.
For windows these files are always installed in PostgreSQL bin when PostGIS
is installed.  I think on Linux they are installed in system.

 

We could probably come up with a sensible way of setting the path with reasonable defaults. I'd be inclined to have that as a separate setting from the normal binpath (as we likely wouldn't have per-server versions, except on Windows). 

[Regina Obe]

I was thinking of that too, but wasn’t sure if that was asking for too much.  BTW when I brought this idea up, others had similar wishes

Like pgPointcloud loading ogr2ogr (used to load/read 1000s of other spatial vector and non-spatial data) (which is often installed if you have postgis_raster installed).  So we might have some others coming down the pike if this is successful, and might to just add these as additional tabs aka a “PostGIS gui”

 


Do you see any issues with the idea I've outlined in Idea 2:

https://trac.osgeo.org/postgis/wiki/GoogleSummerCode2022

I haven't looked that closely at how those pieces in pgAdmin4 work so not
sure how difficult the task and if it makes sense for a GSOC project.

 

It's certainly feasible for someone to do. We have infrastructure for running external tools, so that part is easy. Adding a PostGIS binpath would also be very easy. Most of the work would be designing and building the wizard that the user would complete to generate the command that would be run by the process manager. I don't know the intricacies of those commands though, so I have no idea if this is a simple flow or something that's complex with lots of inter-dependencies between options.

[Regina Obe]

For shp2pgsql and pgsql2shp both just need a database connection. There isn’t that much inter-dependency between the options that couldn’t be handled with one screen.

 

shp2pgsql generates a psql/sql data which can then be piped to psql.  That is the only flow where more than one commandline is needed

 

pgsql2shp needs a connection to the database and then dumps 2-4 files or for each table in the target folder.  So for postgis tables it would have a .dbf, .shp, .shx, .cpg

For regular tables (no postgis columns) – it would have a .dbf and .cpg (the cpg just has character encoding information so not really necessary)

 

My main concern with this project would be on the mentoring side. How familiar are you and Sandro with the pgAdmin codebase, particularly with the newer React framework we're moving to now (which you should probably add to your skills required list in place of Javascript)? The team can offer advice and guidance of course, but if a student ends up needing a lot of hand-holding, that might become more difficult.

[Regina Obe]

I’ve updated the required skill list. 

I’ve lurked in the code base but haven’t spent much time looking at it. 

No neither Sandro or I are familiar with React.  It has been on my list to learn, since many of my clients use it.

I’ve mostly used Angular, JQuery, Bootstrap for my web projects.  I’m guessing React wouldn’t be too hard for me to pick up.

 

Thanks,

Regina

 

Re: PostGIS data loader / dumper - GSOC 2022 Idea

From
Dave Page
Date:
Hi

On Mon, 21 Feb 2022 at 16:53, Regina Obe <lr@pcorp.us> wrote:


This would require the user to have PostGIS command-line tools and PostGIS
in database installed to use it and the binaries in the PostgreSQL bin
folder.
For windows these files are always installed in PostgreSQL bin when PostGIS
is installed.  I think on Linux they are installed in system.

 

We could probably come up with a sensible way of setting the path with reasonable defaults. I'd be inclined to have that as a separate setting from the normal binpath (as we likely wouldn't have per-server versions, except on Windows). 

[Regina Obe]

I was thinking of that too, but wasn’t sure if that was asking for too much.  BTW when I brought this idea up, others had similar wishes

Like pgPointcloud loading ogr2ogr (used to load/read 1000s of other spatial vector and non-spatial data) (which is often installed if you have postgis_raster installed).  So we might have some others coming down the pike if this is successful, and might to just add these as additional tabs aka a “PostGIS gui”


Yes. I wouldn't want to have to add search path options for every possible tool we might interact with, but we could certainly have a generic set of paths for tools, as well as the DB server-specific ones we already have.

Or we could just have one set of options with full paths to any tools we integrate with (and ignore the automatic searching that we have for DB server-specific stuff), e.g.

shp2pgsql Binary: /usr/bin/shp2pgsql
pgsql2shp Binary: /usr/bin/pgsql2shp
Foo Binary: /usr/bin/foo
Bar Binary: /usr/bin/bar
...
...
 

 


Do you see any issues with the idea I've outlined in Idea 2:

https://trac.osgeo.org/postgis/wiki/GoogleSummerCode2022

I haven't looked that closely at how those pieces in pgAdmin4 work so not
sure how difficult the task and if it makes sense for a GSOC project.

 

It's certainly feasible for someone to do. We have infrastructure for running external tools, so that part is easy. Adding a PostGIS binpath would also be very easy. Most of the work would be designing and building the wizard that the user would complete to generate the command that would be run by the process manager. I don't know the intricacies of those commands though, so I have no idea if this is a simple flow or something that's complex with lots of inter-dependencies between options.

[Regina Obe]

For shp2pgsql and pgsql2shp both just need a database connection. There isn’t that much inter-dependency between the options that couldn’t be handled with one screen.

 

shp2pgsql generates a psql/sql data which can then be piped to psql.  That is the only flow where more than one commandline is needed

 

pgsql2shp needs a connection to the database and then dumps 2-4 files or for each table in the target folder.  So for postgis tables it would have a .dbf, .shp, .shx, .cpg

For regular tables (no postgis columns) – it would have a .dbf and .cpg (the cpg just has character encoding information so not really necessary)


OK, that sounds straightforward enough.
 

 

My main concern with this project would be on the mentoring side. How familiar are you and Sandro with the pgAdmin codebase, particularly with the newer React framework we're moving to now (which you should probably add to your skills required list in place of Javascript)? The team can offer advice and guidance of course, but if a student ends up needing a lot of hand-holding, that might become more difficult.

[Regina Obe]

I’ve updated the required skill list. 

I’ve lurked in the code base but haven’t spent much time looking at it. 

No neither Sandro or I are familiar with React.  It has been on my list to learn, since many of my clients use it.

I’ve mostly used Angular, JQuery, Bootstrap for my web projects.  I’m guessing React wouldn’t be too hard for me to pick up.


Great - then I don't imagine this would be a major issue :-)

From the sound of it, this would be a relatively simple GSoC project.
 
--