Thread: PGXS problem with pdftotext
I've been wondering whether anyone else would want to use the functions we wrote to extract text from PDF documents stored in bytea columns. If so, I would need to sort out the problems I've been having with builds through the PGXS techniques.

Here's the directory, after a successful build under contrib:

kgrittn@project-db:~/postgresql-8.3.7/contrib/pdftotext> ll
total 108
-rw-r--r-- 1 kgrittn dbas 22990 2009-04-14 17:14 libpdftotext.a
lrwxrwxrwx 1 kgrittn dbas    19 2009-04-14 17:14 libpdftotext.so -> libpdftotext.so.0.0
lrwxrwxrwx 1 kgrittn dbas    19 2009-04-14 17:14 libpdftotext.so.0 -> libpdftotext.so.0.0
-rwxr-xr-x 1 kgrittn dbas 21666 2009-04-14 17:14 libpdftotext.so.0.0
-rw-r--r-- 1 kgrittn dbas   443 2009-04-14 17:14 Makefile
-rw-r--r-- 1 kgrittn dbas  2980 2008-07-22 13:00 pdftotext.c
-rw-r--r-- 1 kgrittn dbas 14184 2009-04-14 17:14 pdftotext.o
-rw-r--r-- 1 kgrittn dbas   285 2009-04-14 17:14 pdftotext.sql
-rw-r--r-- 1 kgrittn dbas   285 2008-07-22 13:00 pdftotext.sql.in
-rw-r--r-- 1 kgrittn dbas  4658 2009-04-13 17:02 poppler_compat.cc
-rw-r--r-- 1 kgrittn dbas   355 2008-07-22 13:00 poppler_compat.h
-rw-r--r-- 1 kgrittn dbas  8208 2009-04-14 17:14 poppler_compat.o
-rw-r--r-- 1 kgrittn dbas   733 2008-07-22 13:00 README.pdftotext

Here's the Makefile contents:

MODULE_big = pdftotext
OBJS = pdftotext.o poppler_compat.o
DATA_built = pdftotext.sql
DOCS = README.pdftotext
PG_CPPFLAGS =-I/usr/include/poppler -shared -fpic
SHLIB_LINK = -lpoppler -L/usr/local/lib

ifdef USE_PGXS
PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)
else
subdir = contrib/pdftotext
top_builddir = ../..
include $(top_builddir)/src/Makefile.global
include $(top_srcdir)/contrib/contrib-global.mk
endif

If we

export PGXS=1

and

make ; sudo make install

outside the PostgreSQL build tree, it seems to build and deploy OK, but it can't find the poppler implementation at run time. If we do it in the build tree, all is good.

Where's the problem? Is the SHLIB_LINK setting proper?
What's the right way to do this?

BTW, libpoppler is GPL licensed, and always reminds me of what Churchill said about democracy, if that affects anyone's interest in the code. You're likely to need to tweak the code based on the particular version of libpoppler you're using. If you use an older version of libpoppler, it can crash the whole PostgreSQL environment if you try to use it with a PDF using newer features. :-(

If anyone's still interested, and I can fix the build problem, I'll throw the source code onto pgfoundry.

-Kevin

"It has been said that democracy is the worst form of government except all the others that have been tried."
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > PG_CPPFLAGS =-I/usr/include/poppler -shared -fpic > SHLIB_LINK = -lpoppler -L/usr/local/lib It doesn't seem appropriate to put -shared or -fpic into PG_CPPFLAGS. If you need those, the makefiles should add them automatically. The other thing that seems peculiar is looking for the include files in /usr/include and the library in /usr/local/lib. I've never seen any package install itself like that --- either everything goes under /usr/local or nothing does. I suspect you might have two incompatible poppler installations on the machine and you're picking up the wrong combination of files. Running ldd or local equivalent on pdftotext.so might help you determine what's going on as far as finding the library goes. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> PG_CPPFLAGS =-I/usr/include/poppler -shared -fpic
>> SHLIB_LINK = -lpoppler -L/usr/local/lib
>
> It doesn't seem appropriate to put -shared or -fpic into
> PG_CPPFLAGS. If you need those, the makefiles should add them
> automatically.
>
> The other thing that seems peculiar is looking for the include files
> in /usr/include and the library in /usr/local/lib. I've never
> seen any package install itself like that --- either everything goes
> under /usr/local or nothing does. I suspect you might have two
> incompatible poppler installations on the machine and you're picking
> up the wrong combination of files.
>
> Running ldd or local equivalent on pdftotext.so might help you
> determine what's going on as far as finding the library goes.

Thanks. Let's just say that the poppler build from source has not ever gone as smoothly as the most eventful PostgreSQL build from source. We've had to do much ad hoc hacking to get anything usable, and I'm sure we've made some bad choices in the process. I'll take a close look at where everything has landed in light of your advice, and see if I can arrange things more sensibly.

Does it seem likely that fixing these issues will allow PGXS to work?

-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Does it seem likely that fixing these issues will allow PGXS to work? Couldn't say. It would be useful to compare ldd output for pdftotext.so built both ways. regards, tom lane
Hi,

On 2 Jul 09 at 22:20, Kevin Grittner wrote:
> Here's the Makefile contents:

You could compare to this:
http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/backports/uuid-ossp/Makefile?rev=1.1.1.1&content-type=text/x-cvsweb-markup

> SHLIB_LINK = -lpoppler -L/usr/local/lib

SHLIB_LINK += $(OSSP_UUID_LIBS)

Dunno how far it'll get you, but it may help some :)

Regards,
--
dim
I cleaned up the poppler build situation, and all looks good except:

Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> PG_CPPFLAGS =-I/usr/include/poppler -shared -fpic
>
> It doesn't seem appropriate to put -shared or -fpic into
> PG_CPPFLAGS. If you need those, the makefiles should add them
> automatically.

Leaving off -shared was OK, but when I left off -fpic, I got this:

/usr/lib64/gcc/x86_64-suse-linux/4.1.2/../../../../x86_64-suse-linux/bin/ld: poppler_compat.o: relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC
poppler_compat.o: could not read symbols: Bad value
collect2: ld returned 1 exit status
make: *** [libpdftotext.so.0.0] Error 1

With -fPIC or -fpic in my Makefile, PGXS now seems to work as intended.

Is it worth doing anything to check on why that is needed or how to get rid of it? Might it have something to do with compiling both .c and .cc files?

-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Leaving off -shared was OK, but when I left off -fpic, I got this: > /usr/lib64/gcc/x86_64-suse-linux/4.1.2/../../../../x86_64-suse-linux/bin/ld: > poppler_compat.o: relocation R_X86_64_32 against `a local symbol' can > not be used when making a shared object; recompile with -fPIC > poppler_compat.o: could not read symbols: Bad value > collect2: ld returned 1 exit status > make: *** [libpdftotext.so.0.0] Error 1 Huh. On Linux platforms, the PG makefiles should include -fpic in CFLAGS (via CFLAGS_SL) automatically; you should not need to repeat it in CPPFLAGS. For instance, if I go into contrib/adminpack and make, I see sed 's,MODULE_PATHNAME,$libdir/adminpack,g' adminpack.sql.in >adminpack.sql gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv-g -fpic -I../../src/interfaces/libpq -I. -I../../src/include -D_GNU_SOURCE -c -o adminpack.o adminpack.c gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv-g -fpic -shared adminpack.o -L../../src/port -Wl,-rpath,'/home/tgl/testversion/lib' -o adminpack.so What do you get? What does pg_config report for the various FLAGS variables? regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> What do you get?

sed 's,MODULE_PATHNAME,$libdir/adminpack,g' adminpack.sql.in >adminpack.sql
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Winline -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv -g -fpic -I/usr/local/pgsql-8.3.7/include -I. -I/usr/local/pgsql-8.3.7/include/server -I/usr/local/pgsql-8.3.7/include/internal -D_GNU_SOURCE -I/usr/include/libxml2 -c -o adminpack.o adminpack.c
ar crs libadminpack.a adminpack.o
ranlib libadminpack.a
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Winline -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv -g -fpic -shared -Wl,-soname,libadminpack.so.0 adminpack.o -L/usr/local/pgsql-8.3.7/lib -Wl,-rpath,'/usr/local/pgsql-8.3.7/lib' -o libadminpack.so.0.0

> What does pg_config report for the various FLAGS variables?

CPPFLAGS = -D_GNU_SOURCE -I/usr/include/libxml2
CFLAGS = -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Winline -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv -g
CFLAGS_SL = -fpic
LDFLAGS = -Wl,-rpath,'/usr/local/pgsql-8.3.7/lib'
LDFLAGS_SL =

-Kevin
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> What do you get?

More to the point, here's what I get when I use PGXS with my pdf code.

sed 's,MODULE_PATHNAME,$libdir/pdftotext,g' pdftotext.sql.in >pdftotext.sql
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Winline -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv -g -fpic -I/usr/local/include/poppler -I. -I/usr/local/pgsql-8.3.7/include/server -I/usr/local/pgsql-8.3.7/include/internal -D_GNU_SOURCE -I/usr/include/libxml2 -c -o pdftotext.o pdftotext.c
g++ -I/usr/local/include/poppler -I. -I/usr/local/pgsql-8.3.7/include/server -I/usr/local/pgsql-8.3.7/include/internal -D_GNU_SOURCE -I/usr/include/libxml2 -c -o poppler_compat.o poppler_compat.cc
ar crs libpdftotext.a pdftotext.o poppler_compat.o
ranlib libpdftotext.a
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Winline -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv -g -fpic -shared -Wl,-soname,libpdftotext.so.0 pdftotext.o poppler_compat.o -L/usr/local/pgsql-8.3.7/lib -lpoppler -Wl,-rpath,'/usr/local/pgsql-8.3.7/lib' -o libpdftotext.so.0.0
/usr/lib64/gcc/x86_64-suse-linux/4.1.2/../../../../x86_64-suse-linux/bin/ld: poppler_compat.o: relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC
poppler_compat.o: could not read symbols: Bad value
collect2: ld returned 1 exit status
make: *** [libpdftotext.so.0.0] Error 1

Since the gcc line has it, it must be the g++ line that's the problem?

-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> What does pg_config report for the various FLAGS variables? > CPPFLAGS = -D_GNU_SOURCE -I/usr/include/libxml2 > CFLAGS = -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Winline > -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing > -fwrapv -g > CFLAGS_SL = -fpic > LDFLAGS = -Wl,-rpath,'/usr/local/pgsql-8.3.7/lib' > LDFLAGS_SL = Well, that looks about right, so the next question is why the CFLAGS value isn't getting used in your build. What's the whole output of make when you try to build your module? regards, tom lane
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Since the gcc line has it, it must be the g++ line that's the problem? Hmm, try addingCXXFLAGS = $(CFLAGS) Although in general we don't try very hard to support C++ code inside the backend. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Hmm, try adding
> CXXFLAGS = $(CFLAGS)

Thanks, that worked; I don't need to specify -fpic in my file if I put the above line in.

> Although in general we don't try very hard to support C++ code
> inside the backend.

I try to avoid it when possible. The C++ code is the thinnest wrapper we could arrange around the poppler code to allow access from the C code.

Would it make sense to add the above to the PGXS file somewhere, for those cases (like this) when someone has to access some existing C++ code base?

-Kevin
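For reference, a sketch of what the Makefile might look like with the advice from this thread applied. This is a reconstruction, not the file Kevin actually ended up with: the PG_CPPFLAGS and SHLIB_LINK values are inferred from the PGXS build output quoted earlier (`-I/usr/local/include/poppler`, plain `-lpoppler`), and the position of the CXXFLAGS line is an assumption. It was only reported to work against PostgreSQL 8.3.7; releases that define CXXFLAGS themselves may behave differently.

```makefile
MODULE_big = pdftotext
OBJS = pdftotext.o poppler_compat.o
DATA_built = pdftotext.sql
DOCS = README.pdftotext

# Preprocessor flags only. -shared and -fpic are left out because the
# PostgreSQL makefiles add them for C code (-fpic via CFLAGS_SL).
# The include path is inferred from the build output, not confirmed.
PG_CPPFLAGS = -I/usr/local/include/poppler
SHLIB_LINK = -lpoppler

# Tom Lane's suggestion: have g++ reuse the C flags so that
# poppler_compat.cc is also compiled with -fpic. This is a recursive (=)
# assignment, so CFLAGS is expanded when the compile rule runs.
CXXFLAGS = $(CFLAGS)

ifdef USE_PGXS
PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)
else
subdir = contrib/pdftotext
top_builddir = ../..
include $(top_builddir)/src/Makefile.global
include $(top_srcdir)/contrib/contrib-global.mk
endif
```

One possible side effect: CFLAGS carries C-only warning options such as -Wmissing-prototypes, so g++ may print warnings about them when compiling the .cc file.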