Thread: indexes on float8 vs integer
Anyone got any insight or experience in the speed and size of indexes on Integer(4 byte) vs float (8byte). For a projectthat I'm on, I'm contemplating using an integer for: Latitude Longitude In a huge, publically searchable table. In the INSERTS, the representation would be equal to: IntegerLatOrLong = to_integer( float8LatOrLong * to_float(1000000) ); This would keep it in a smaller (4 bytes vs 8 byte) representation with simple numeric comparison for indexing values whilestill provide 6 decimals of precision, i.e. 4.25 inches of resolution, what google mapes provides. I am expecting this table to be very huge. Hey, I want to be the next 'portal' :-) Dennis Gearon Signature Warning ---------------- EARTH has a Right To Life I agree with Bolivian President Evo Morales # The right to life: "The right for no ecosystem to be eliminated by the irresponsible acts of human beings." # The right of biosystems to regenerate themselves: "Development cannot be infinite. There's a limit on everything." # The right to a clean life: "The right for Mother Earth to live without contamination, pollution. Fish and animals and treeshave rights." # The right to harmony and balance between everyone and everything: "We are all interdependent." See the movie - 'Inconvenient Truth' See the movie - 'Syriana'
On Sat, Jul 11, 2009 at 10:15 PM, Dennis Gearon<gearond@sbcglobal.net> wrote: > > Anyone got any insight or experience in the speed and size of indexes on Integer(4 byte) vs float (8byte). For a projectthat I'm on, I'm contemplating using an integer for: > > Latitude > Longitude > > In a huge, publically searchable table. > > In the INSERTS, the representation would be equal to: > > IntegerLatOrLong = to_integer( float8LatOrLong * to_float(1000000) ); > > This would keep it in a smaller (4 bytes vs 8 byte) representation with simple numeric comparison for indexing values whilestill provide 6 decimals of precision, i.e. 4.25 inches of resolution, what google mapes provides. > > I am expecting this table to be very huge. Hey, I want to be the next 'portal' :-) > Dennis Gearon Well, floats can be bad if you need exact math or matching anyway, and math on them is generally slower than int math. OTOH, you could look into numeric to see if it does what you want. Used to be way slower than int, but in recent versions of pgsql it's gotten much faster. Numeric is exact, where float is approximate, so if having exact values be stored is important, then either using int and treating it like fixed point, or using numeric is usually better.
Hi Dennis, Is there any reason you are not using PostGIS to store the values as point geometries & use a spatial (GIST) index on them?I have tables with hundreds of millions of point features which work well. On disk data volume is not really worth optimisingfor with such systems, i suggest flexibility, ease of implementation & overall performance should be more valuable. If you need to store & query coordinates, then a map based tool seems relevant, and there are plenty of tools to do thissoirt of thing with PostGIS data, such as Mapserver, GeoServer at the back end & OpenLayers in the front end. Cheers, Brent Wood Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand >>> Scott Marlowe <scott.marlowe@gmail.com> 07/12/09 10:31 PM >>> On Sat, Jul 11, 2009 at 10:15 PM, Dennis Gearon<gearond@sbcglobal.net> wrote: > > Anyone got any insight or experience in the speed and size of indexes on Integer(4 byte) vs float (8byte). For a projectthat I'm on, I'm contemplating using an integer for: > > Latitude > Longitude > > In a huge, publically searchable table. > > In the INSERTS, the representation would be equal to: > > IntegerLatOrLong = to_integer( float8LatOrLong * to_float(1000000) ); > > This would keep it in a smaller (4 bytes vs 8 byte) representation with simple numeric comparison for indexing values whilestill provide 6 decimals of precision, i.e. 4.25 inches of resolution, what google mapes provides. > > I am expecting this table to be very huge. Hey, I want to be the next 'portal' :-) > Dennis Gearon Well, floats can be bad if you need exact math or matching anyway, and math on them is generally slower than int math. OTOH, you could look into numeric to see if it does what you want. Used to be way slower than int, but in recent versions of pgsql it's gotten much faster. Numeric is exact, where float is approximate, so if having exact values be stored is important, then either using int and treating it like fixed point, or using numeric is usually better. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general NIWA is the trading name of the National Institute of Water & Atmospheric Research Ltd.
Well, Brent, I'm just getting started on this design. I'm doing it at a hosting site, initially, so I have to find out if they haveor will load this module. At first, I was just going to interpolate the distance as a bounding box based on the distance between latitude linesand longitude lines at that latitude. Then serve the data based on the integers for lat/long between two values. Allthe geographic calculations would have taken place in the server app, then postgres would only be working with integers. So, what is the base type for the point column? I had planned on using google maps as the geographic server, I was going to query them using their API and a data setof center location and labeled points within a certain range. Lot's to learn here, that's for sure. I will file your reply and look at it in a week or so when I store the first data. Dennis Gearon --- On Sun, 7/12/09, Brent Wood <b.wood@niwa.co.nz> wrote: > From: Brent Wood <b.wood@niwa.co.nz> > Subject: Re: [GENERAL] indexes on float8 vs integer > To: gearond@sbcglobal.net > Cc: pgsql-general@postgresql.org > Date: Sunday, July 12, 2009, 1:52 PM > Hi Dennis, > > Is there any reason you are not using PostGIS to store the > values as point geometries & use a spatial (GIST) index > on them? I have tables with hundreds of millions of point > features which work well. On disk data volume is not really > worth optimising for with such systems, i suggest > flexibility, ease of implementation & overall > performance should be more valuable. > > If you need to store & query coordinates, then a map > based tool seems relevant, and there are plenty of tools to > do this soirt of thing with PostGIS data, such as Mapserver, > GeoServer at the back end & OpenLayers in the front > end. > > > Cheers, > > Brent Wood > > > Brent Wood > DBA/GIS consultant > NIWA, Wellington > New Zealand > >>> Scott Marlowe <scott.marlowe@gmail.com> > 07/12/09 10:31 PM >>> > On Sat, Jul 11, 2009 at 10:15 PM, Dennis Gearon<gearond@sbcglobal.net> > wrote: > > > > Anyone got any insight or experience in the speed and > size of indexes on Integer(4 byte) vs float (8byte). For a > project that I'm on, I'm contemplating using an integer > for: > > > > Latitude > > Longitude > > > > In a huge, publically searchable table. > > > > In the INSERTS, the representation would be equal to: > > > > IntegerLatOrLong = to_integer( > float8LatOrLong * to_float(1000000) ); > > > > This would keep it in a smaller (4 bytes vs 8 byte) > representation with simple numeric comparison for indexing > values while still provide 6 decimals of precision, i.e. > 4.25 inches of resolution, what google mapes provides. > > > > I am expecting this table to be very huge. Hey, I want > to be the next 'portal' :-) > > Dennis Gearon > > Well, floats can be bad if you need exact math or matching > anyway, and > math on them is generally slower than int math. OTOH, > you could look > into numeric to see if it does what you want. Used to > be way slower > than int, but in recent versions of pgsql it's gotten much > faster. > Numeric is exact, where float is approximate, so if having > exact > values be stored is important, then either using int and > treating it > like fixed point, or using numeric is usually better. > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general > > NIWA is the trading name of the National Institute of Water > & Atmospheric Research Ltd. >
I wish that I didn't have to say this, but that is over my head at this point. I see this HUGE, steep mountain ahead of meand a little sign in front of it saying, "Learning Curve, start here." :-) Dennis Gearon Signature Warning ---------------- EARTH has a Right To Life I agree with Bolivian President Evo Morales # The right to life: "The right for no ecosystem to be eliminated by the irresponsible acts of human beings." # The right of biosystems to regenerate themselves: "Development cannot be infinite. There's a limit on everything." # The right to a clean life: "The right for Mother Earth to live without contamination, pollution. Fish and animals and treeshave rights." # The right to harmony and balance between everyone and everything: "We are all interdependent." See the movie - 'Inconvenient Truth' See the movie - 'Syriana' --- On Sun, 7/12/09, Brent Wood <b.wood@niwa.co.nz> wrote: > From: Brent Wood <b.wood@niwa.co.nz> > Subject: Re: [GENERAL] indexes on float8 vs integer > To: gearond@sbcglobal.net > Date: Sunday, July 12, 2009, 9:10 PM > You might look at UMN mapserver or > Geoserver to provide PostGIS data via WMS/WFS and OpenLayers > to plot these layers on top of Google Maps. These tools > facilitate this sort of online map production pretty easily, > although hosting can be an issue as teh requirements become > more specific. > > Cheers, > > Brent > > > Brent Wood > DBA/GIS consultant > NIWA, Wellington > New Zealand > >>> Dennis Gearon <gearond@sbcglobal.net> > 07/13/09 1:05 PM >>> > > Well, Brent, > I'm just getting started on this > design. I'm doing it at a hosting site, initially, so I have > to find out if they have or will load this module. > At first, I was just going to > interpolate the distance as a bounding box based on the > distance between latitude lines and longitude lines at that > latitude. Then serve the data based on the integers for > lat/long between two values. All the geographic calculations > would have taken place in the server app, then postgres > would only be working with integers. > So, what is the base type for the > point column? > I had planned on using google maps > as the geographic server, I was going to query them using > their API and a data set of center location and labeled > points within a certain range. > Lot's to learn here, that's for > sure. I will file your reply and look at it in a week or so > when I store the first data. > Dennis Gearon > > > > --- On Sun, 7/12/09, Brent Wood <b.wood@niwa.co.nz> > wrote: > > > From: Brent Wood <b.wood@niwa.co.nz> > > Subject: Re: [GENERAL] indexes on float8 vs integer > > To: gearond@sbcglobal.net > > Cc: pgsql-general@postgresql.org > > Date: Sunday, July 12, 2009, 1:52 PM > > Hi Dennis, > > > > Is there any reason you are not using PostGIS to store > the > > values as point geometries & use a spatial (GIST) > index > > on them? I have tables with hundreds of millions of > point > > features which work well. On disk data volume is not > really > > worth optimising for with such systems, i suggest > > flexibility, ease of implementation & overall > > performance should be more valuable. > > > > If you need to store & query coordinates, then a > map > > based tool seems relevant, and there are plenty of > tools to > > do this soirt of thing with PostGIS data, such as > Mapserver, > > GeoServer at the back end & OpenLayers in the > front > > end. > > > > > > Cheers, > > > > Brent Wood > > > > > > Brent Wood > > DBA/GIS consultant > > NIWA, Wellington > > New Zealand > > >>> Scott Marlowe <scott.marlowe@gmail.com> > > 07/12/09 10:31 PM >>> > > On Sat, Jul 11, 2009 at 10:15 PM, Dennis Gearon<gearond@sbcglobal.net> > > wrote: > > > > > > Anyone got any insight or experience in the speed > and > > size of indexes on Integer(4 byte) vs float (8byte). > For a > > project that I'm on, I'm contemplating using an > integer > > for: > > > > > > Latitude > > > Longitude > > > > > > In a huge, publically searchable table. > > > > > > In the INSERTS, the representation would be equal > to: > > > > > > IntegerLatOrLong = > to_integer( > > float8LatOrLong * to_float(1000000) ); > > > > > > This would keep it in a smaller (4 bytes vs 8 > byte) > > representation with simple numeric comparison for > indexing > > values while still provide 6 decimals of precision, > i.e. > > 4.25 inches of resolution, what google mapes > provides. > > > > > > I am expecting this table to be very huge. Hey, I > want > > to be the next 'portal' :-) > > > Dennis Gearon > > > > Well, floats can be bad if you need exact math or > matching > > anyway, and > > math on them is generally slower than int math. > OTOH, > > you could look > > into numeric to see if it does what you want. > Used to > > be way slower > > than int, but in recent versions of pgsql it's gotten > much > > faster. > > Numeric is exact, where float is approximate, so if > having > > exact > > values be stored is important, then either using int > and > > treating it > > like fixed point, or using numeric is usually better. > > > > -- > > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > > To make changes to your subscription: > > http://www.postgresql.org/mailpref/pgsql-general > > > > NIWA is the trading name of the National Institute of > Water > > & Atmospheric Research Ltd. > > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general > > NIWA is the trading name of the National Institute of Water > & Atmospheric Research Ltd. >