Thread: writing a foreign data wrapper for hdfs, but getting an undefined symbol error for hdfsConnect

I've been working on creating a foreign data wrapper for hdfs using version
9.1.0. This is my first time creating C functions against postgres, so
hopefully this falls under the 'newbie' category and is easy to solve.

The source code does compile, resulting in a shared library:

file mylibrary.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV),
dynamically linked, not stripped

ldd mylibrary.so
        linux-vdso.so.1 =>  (0x00007fff40fff000)
        libc.so.6 => /lib/libc.so.6 (0x00007f3adb8cc000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f3adbe72000)

But, the library fails to load when I use the LOAD statement:

LOAD 'mylibrary.so';

The error is:

 ERROR:  could not load library "mylibrary.so": mylibrary.so: undefined symbol: hdfsConnect

I already figured it needs to recognize the hadoop shared library, libhdfs.so.0
but loading hdfs directly, of course, results in the following error:

ERROR:  incompatible library "/home/robert/lib/libhdfs.so.0": missing magic block
HINT:  Extension libraries are required to use the PG_MODULE_MAGIC macro.

So how do I manage to load hdfs? All that is required at this point for the
data wrapper to do is to open and read a very small file.

Here's the snippet in question:

/////////////////////////////////////////////////////////////////////
#include "hdfs.h"

#include <string.h>
#include <stdio.h>

#include "postgres.h"
#include "fmgr.h"
#include "funcapi.h"
#include "foreign/fdwapi.h"
#include "foreign/foreign.h"
#include "commands/explain.h"
#include "commands/defrem.h"
#include "catalog/pg_foreign_table.h"

PG_MODULE_MAGIC;

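typedef struct {
  char     *connection,
           *filename,
           *limit,
           *offset;
  hdfsFS    fs;   /* hdfsFS and hdfsFile are already handle types in hdfs.h, */
  hdfsFile  fp;   /* so they are stored by value, not as pointers to handles */
} hdfsFdwExecutionState;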

extern Datum hdfs_fdw_handler(PG_FUNCTION_ARGS);
.
.
.
// here's where the function "hdfsConnect" is first called

static void hdfsBeginForeignScan(ForeignScanState *node, int eflags) {
  hdfsFdwExecutionState *festate;
  /* palloc, not malloc: the buffer is released with its memory context */
  char  *recordset  = palloc(LINE_LENGTH*sizeof(*recordset)),
        *connection = "default",
        *filename,
        *limit,
        *offset;
  hdfsFS   fs;
  hdfsFile fp;

  if (eflags & EXEC_FLAG_EXPLAIN_ONLY) return;

  hdfsGetOptions(RelationGetRelid(node->ss.ss_currentRelation),
                 &filename, &limit, &offset);

  festate           = (hdfsFdwExecutionState *)
                      palloc(sizeof(hdfsFdwExecutionState));

  fs                = hdfsConnect(connection, 0);
  fp                = setFILEoffset(fs, filename, offset);

  festate->filename = filename;
  festate->limit    = limit;
  festate->offset   = offset;
  festate->fs       = fs;
  festate->fp       = (void *) fp;

  node->fdw_state = (void *) festate;
}
/////////////////////////////////////////////////////////////////////




"Rob_pg" <robert7390@comcast.net> writes:
> I've been working on creating a foreign data wrapper for hdfs using version
> 9.1.0. This is my first time creating C functions against postgres, so
> hopefully this falls under the 'newbie' category and is easy to solve.

> The source code does compile, resulting in a shared library:

> file mylibrary.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV),
> dynamically linked, not stripped

> ldd mylibrary.so
>         linux-vdso.so.1 =>  (0x00007fff40fff000)
>         libc.so.6 => /lib/libc.so.6 (0x00007f3adb8cc000)
>         /lib64/ld-linux-x86-64.so.2 (0x00007f3adbe72000)

The reason it's not working is that libhdfs.so isn't listed as a
requirement for mylibrary.so.  You did not show us your link command
for mylibrary.so, but most likely there needs to be a -lhdfs in it.

            regards, tom lane
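
A quick way to confirm this diagnosis is to check the ldd output for libhdfs
before trying LOAD again. The sketch below assumes a POSIX shell with awk
available; has_dep is a hypothetical helper name, not part of any tool. The
here-document is the ldd output posted at the top of the thread.

```shell
# has_dep PREFIX: read ldd-style output on stdin and exit 0 only if some
# listed library name begins with PREFIX (for example "libhdfs.so").
has_dep() {
    awk -v lib="$1" 'index($1, lib) == 1 { found = 1 } END { exit !found }'
}

# Check the ldd output from the original post; libhdfs is absent, so the
# message is printed.
has_dep libhdfs.so <<'EOF' || echo "libhdfs.so is not a recorded dependency"
        linux-vdso.so.1 =>  (0x00007fff40fff000)
        libc.so.6 => /lib/libc.so.6 (0x00007f3adb8cc000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f3adbe72000)
EOF
```

Against the real file this would be run as "ldd mylibrary.so | has_dep
libhdfs.so". With GNU binutils, "nm -D --undefined-only mylibrary.so" should
also list hdfsConnect among the symbols the library expects some other object
to supply.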

Hi Tom,

Thanks for the tip. I altered the gcc invocation as follows.

Here are the two gcc invocations that originally created the shared library:

gcc -Wall -fPIC -c mylibrary.c -o mylibrary.o \
-I $(A) -I $(B) -I $(C) -I $(E) -lhdfs

gcc -I $(A) -I $(B) -I $(C) -I $(E) -shared\
-Wl,-soname,mylibrary.so -o mylibrary.so mylibrary.o

#############################
Here is the new invocation: I added "-lhdfs" to the second gcc invocation.

gcc -I $(A) -I $(B) -I $(C) -I $(E) -shared\
 -lhdfs -Wl,-soname,mylibrary.so -o mylibrary.so mylibrary.o

Now I can see the libraries!

ldd mylibrary.so
        linux-vdso.so.1 =>  (0x00007fff499c5000)
        libhdfs.so.0 => /usr/lib/libhdfs.so.0 (0x00007f44e4739000)
        libc.so.6 => /lib/libc.so.6 (0x00007f44e43b6000)
        libjvm.so => /usr/lib/jvm/java-6-openjdk/jre/lib/amd64/server/libjvm.so (0x00007f44e3866000)
        libdl.so.2 => /lib/libdl.so.2 (0x00007f44e3662000)
        libpthread.so.0 => /lib/libpthread.so.0 (0x00007f44e3445000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f44e4b66000)
        libm.so.6 => /lib/libm.so.6 (0x00007f44e31c1000)


"Rob_pg" <robert7390@comcast.net> writes:
> Thanks for the tip, I altered the gcc invocation as follows:

> Here are the two gcc invocations originally creating the shared library:

> gcc -Wall -fPIC -c mylibrary.c -o mylibrary.o \
> -I $(A) -I $(B) -I $(C) -I $(E) -lhdfs

> gcc -I $(A) -I $(B) -I $(C) -I $(E) -shared\
> -Wl,-soname,mylibrary.so -o mylibrary.so mylibrary.o

> Here is the new invocation: I added "-lhdfs" to the second gcc invocation.

Yeah, -l is useless when building a .o file; gcc will just ignore it.
(Conversely, there's not much point in -I switches in a link step.)

            regards, tom lane
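
The split described above can be sketched as a dry run that prints one compile
command and one link command. INCLUDES, LIBS, compile_cmd, and link_cmd are
hypothetical names, and /path/to/include is a placeholder for the $(A), $(B),
$(C), and $(E) directories, whose values the thread never shows.

```shell
# Assumed placeholders; substitute the real include directories and libraries.
INCLUDES="-I /path/to/include"   # header search paths: compile step only
LIBS="-lhdfs"                    # libraries to record as dependencies: link step only

# Print the compile command for NAME.c. It carries -I flags and no -l flags.
compile_cmd() {
    printf 'gcc -Wall -fPIC %s -c %s.c -o %s.o\n' "$INCLUDES" "$1" "$1"
}

# Print the link command for NAME.so. It carries -l flags and no -I flags,
# with the libraries placed after the object file that needs them.
link_cmd() {
    printf 'gcc -shared -Wl,-soname,%s.so -o %s.so %s.o %s\n' "$1" "$1" "$1" "$LIBS"
}

compile_cmd mylibrary
link_cmd mylibrary
```

Piping the two printed lines to "sh -e" would run them. Listing -lhdfs after
mylibrary.o is a precaution rather than something the thread requires: some
toolchains link with --as-needed by default and can drop a library that
appears before the objects that reference it.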