Re: [psycopg] Turbo ODBC - Mailing list psycopg

From Uwe L. Korn
Subject Re: [psycopg] Turbo ODBC
Date
Msg-id 1484666321.1044625.850459792.6F63D407@webmail.messagingengine.com
In response to Re: [psycopg] Turbo ODBC  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
List psycopg
In Arrow, we have a bitmap for each column indicating whether a value is
NULL.  We can convert this cleanly to NumPy masked arrays, but once the
data is converted to Pandas, integer columns with NULLs are converted to
floats, with NaN representing NULL, as there is no explicit NULL
representation in Pandas 0.x.
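
To make the NULL handling concrete, here is a minimal sketch using only NumPy. It is not turbodbc's or Arrow's actual code: the helper name validity_bitmap_to_mask and the sample data are made up for illustration. It assumes Arrow's validity bitmap layout (least-significant bit first, a set bit meaning "valid") and imitates the int-to-float upcast described above with an explicit conversion.

```python
import numpy as np


def validity_bitmap_to_mask(bitmap: bytes, length: int) -> np.ndarray:
    """Hypothetical helper: True where the value is NULL.

    Assumes an Arrow-style validity bitmap: bit i (LSB first) is 1
    when element i is valid and 0 when it is NULL.
    """
    bits = np.unpackbits(np.frombuffer(bitmap, dtype=np.uint8), bitorder="little")
    return bits[:length] == 0


# Illustrative integer column with four slots; slot 1 is NULL, so its
# stored value (0 here) is meaningless.
values = np.array([10, 0, 30, 40], dtype=np.int64)
bitmap = bytes([0b00001101])  # bits 0, 2, 3 set -> valid; bit 1 clear -> NULL

# Step 1: the masked array keeps the integer dtype and tracks NULLs separately.
masked = np.ma.masked_array(values, mask=validity_bitmap_to_mask(bitmap, len(values)))

# Step 2: without an explicit NULL representation, the column has to be
# upcast to float so that NaN can stand in for NULL.
as_float = masked.astype(np.float64).filled(np.nan)

print(masked.dtype, masked)
print(as_float.dtype, as_float)
```

The masked array preserves the integer type, while the float version loses it. That loss is the cost of the NaN convention.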

--
  Uwe L. Korn
  uwelk@xhochy.com

On Tue, Jan 17, 2017, at 04:06 PM, Jim Nasby wrote:
> On 1/17/17 4:51 AM, Uwe L. Korn wrote:
> > One important thing for fast columnar data access is that you don't want
> > to have the data as Python objects before they will be turned into a
> > DataFrame. Besides much better buffering, this was one of the main
> > advantages we have with Turbodbc. Given that the ODBC drivers for
> > Postgres seem to be in a miserable state, it would be much preferable to
> > have such functionality directly in psycopg2. From meetings with
> > people at some PyData conferences to whom I showed turbodbc, I can
> > definitely say that there are some users out there who would like a
> > fast path for Postgres-to-Pandas.
> >
> > In turbodbc, there are two additional functions added to the DB-API
> > cursor object: fetchallnumpy and fetchallarrow. These mostly suffice for
> > typical pandas workloads. The experience from implementing this is
> > basically that with Arrow it was quite simple to add a columnar
> > interface as most of the data conversions were handled by Arrow. Also
> > there was no need for me to interface with any Python types as the
> > language "barrier" was transparently handled by Arrow.
>
> I certainly see the advantages to not creating objects. How do you end
> up handling NULLs?
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com
> 855-TREBLE2 (855-873-2532)

