Issue installing pdftotext in Python 3.6 on CentOS due to poppler


I'm having trouble installing pdftotext in Python 3.6 (Anaconda 5.1.0) on CentOS.



Some quick notes first:



  • I'm using CentOS 6.7 on VirtualBox

  • I know it can work because my IT group has it installed on our server. NOTE: I found that our server does have the C++ wrapper installed, and I'm trying to figure out how they got it.

  • I'm trying to get an existing application to work, so I'm not looking for an alternative to the pdftotext library at this time.

I followed the instructions from the GitHub repo and already tried this step:



Fedora, Red Hat, and friends:



sudo yum install gcc-c++ pkgconfig poppler-cpp-devel python-devel redhat-rpm-config


But the problem seems to be around poppler-cpp-devel. I don't see that package within yum search poppler:



============================= N/S Matched: poppler =============================
poppler-devel.i686 : Libraries and headers for poppler
poppler-devel.x86_64 : Libraries and headers for poppler
poppler-glib.i686 : Glib wrapper for poppler
poppler-glib.x86_64 : Glib wrapper for poppler
poppler-qt.i686 : Qt3 wrapper for poppler
poppler-qt.x86_64 : Qt3 wrapper for poppler
poppler-qt4.i686 : Qt4 wrapper for poppler
poppler-qt4.x86_64 : Qt4 wrapper for poppler
poppler.i686 : PDF rendering library
poppler.x86_64 : PDF rendering library
poppler-data.noarch : Encoding files
poppler-glib-devel.i686 : Development files for glib wrapper
poppler-glib-devel.x86_64 : Development files for glib wrapper
poppler-qt-devel.i686 : Development files for Qt3 wrapper
poppler-qt-devel.x86_64 : Development files for Qt3 wrapper
poppler-qt4-devel.i686 : Development files for Qt4 wrapper
poppler-qt4-devel.x86_64 : Development files for Qt4 wrapper
poppler-utils.x86_64 : Command line utilities for converting PDF files
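
To make the check above less error-prone than reading the list by eye, here is a minimal Python sketch that pulls the package names out of `yum search` output and looks for a C++ wrapper. The helper name `package_names` and the shortened sample text are illustrative assumptions, and the parsing assumes result lines of the form "name.arch : summary" as shown above.

```python
import re

def package_names(yum_search_output):
    """Return the package base names listed in `yum search` output."""
    names = set()
    for line in yum_search_output.splitlines():
        # Result lines look like "name.arch : summary"; the header line starts with "=".
        match = re.match(r"^(\S+)\.(?:i686|x86_64|noarch)\s+:", line.strip())
        if match:
            names.add(match.group(1))
    return names

# Shortened copy of the search output, used only for illustration.
sample = """\
============================= N/S Matched: poppler =============================
poppler-devel.x86_64 : Libraries and headers for poppler
poppler-glib.x86_64 : Glib wrapper for poppler
poppler-qt4-devel.x86_64 : Development files for Qt4 wrapper
poppler-utils.x86_64 : Command line utilities for converting PDF files
"""

names = package_names(sample)
cpp_packages = sorted(n for n in names if "cpp" in n)
print("C++ wrapper packages found:", cpp_packages)
```

With the sample above, no package name contains "cpp", which matches what I see in the full listing.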


My IT group gave me the steps they had attempted, and I tried installing poppler-devel and poppler-glib. But every time I run pip install pdftotext I get the following output:



[root@localhost stack]# pip install pdftotext
Collecting pdftotext
Using cached https://files.pythonhosted.org/packages/21/35/60094dbadd9de2035873390b1cac25e01da605844eba6a07a53a82fa4adc/pdftotext-2.1.1.tar.gz
Building wheels for collected packages: pdftotext
Building wheel for pdftotext (setup.py) ... error
Complete output from command /root/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-1mu2f1n2/pdftotext/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/pip-wheel-khm9zova --python-tag cp36:
/root/anaconda3/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'long_description_content_type'
warnings.warn(msg)
running bdist_wheel
running build
running build_ext
building 'pdftotext' extension
creating build
creating build/temp.linux-x86_64-3.6
gcc -pthread -B /root/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DPOPPLER_CPP_AT_LEAST_0_30_0=0 -I/root/anaconda3/include/python3.6m -c pdftotext.cpp -o build/temp.linux-x86_64-3.6/pdftotext.o -Wall
cc1plus: warning: command line option "-Wstrict-prototypes" is valid for Ada/C/ObjC but not for C++
pdftotext.cpp:3:42: error: poppler/cpp/poppler-document.h: No such file or directory
pdftotext.cpp:4:40: error: poppler/cpp/poppler-global.h: No such file or directory
pdftotext.cpp:5:38: error: poppler/cpp/poppler-page.h: No such file or directory
pdftotext.cpp:20: error: ‘poppler’ has not been declared
pdftotext.cpp:20: error: ISO C++ forbids declaration of ‘document’ with no type
pdftotext.cpp:20: error: expected ‘;’ before ‘*’ token
pdftotext.cpp: In function ‘void PDF_clear(PDF*)’:
pdftotext.cpp:26: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:27: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_create_doc(PDF*)’:
pdftotext.cpp:66: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:66: error: ‘poppler’ has not been declared
pdftotext.cpp:67: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_unlock(PDF*, char*)’:
pdftotext.cpp:75: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_init(PDF*, PyObject*, PyObject*)’:
pdftotext.cpp:105: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘PyObject* PDF_read_page(PDF*, int)’:
pdftotext.cpp:119: error: ‘poppler’ has not been declared
pdftotext.cpp:119: error: expected initializer before ‘*’ token
pdftotext.cpp:120: error: ‘poppler’ has not been declared
pdftotext.cpp:120: error: expected ‘;’ before ‘layout_mode’
pdftotext.cpp:123: error: ‘page’ was not declared in this scope
pdftotext.cpp:123: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:129: error: ‘poppler’ has not been declared
pdftotext.cpp:129: error: expected initializer before ‘rect’
pdftotext.cpp:130: error: ‘rect’ was not declared in this scope
pdftotext.cpp:133: error: ‘layout_mode’ was not declared in this scope
pdftotext.cpp:133: error: ‘poppler’ has not been declared
pdftotext.cpp:135: error: ‘poppler’ has not been declared
pdftotext.cpp:137: error: ‘poppler’ has not been declared
pdftotext.cpp:138: error: type ‘<type error>’ argument given to ‘delete’, expected pointer
error: command 'gcc' failed with exit status 1

----------------------------------------
Failed building wheel for pdftotext
Running setup.py clean for pdftotext
Failed to build pdftotext
Installing collected packages: pdftotext
Running setup.py install for pdftotext ... error
Complete output from command /root/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-1mu2f1n2/pdftotext/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-ghuhvuhl/install-record.txt --single-version-externally-managed --compile:
/root/anaconda3/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'long_description_content_type'
warnings.warn(msg)
running install
running build
running build_ext
building 'pdftotext' extension
creating build
creating build/temp.linux-x86_64-3.6
gcc -pthread -B /root/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DPOPPLER_CPP_AT_LEAST_0_30_0=0 -I/root/anaconda3/include/python3.6m -c pdftotext.cpp -o build/temp.linux-x86_64-3.6/pdftotext.o -Wall
cc1plus: warning: command line option "-Wstrict-prototypes" is valid for Ada/C/ObjC but not for C++
pdftotext.cpp:3:42: error: poppler/cpp/poppler-document.h: No such file or directory
pdftotext.cpp:4:40: error: poppler/cpp/poppler-global.h: No such file or directory
pdftotext.cpp:5:38: error: poppler/cpp/poppler-page.h: No such file or directory
pdftotext.cpp:20: error: ‘poppler’ has not been declared
pdftotext.cpp:20: error: ISO C++ forbids declaration of ‘document’ with no type
pdftotext.cpp:20: error: expected ‘;’ before ‘*’ token
pdftotext.cpp: In function ‘void PDF_clear(PDF*)’:
pdftotext.cpp:26: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:27: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_create_doc(PDF*)’:
pdftotext.cpp:66: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:66: error: ‘poppler’ has not been declared
pdftotext.cpp:67: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_unlock(PDF*, char*)’:
pdftotext.cpp:75: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_init(PDF*, PyObject*, PyObject*)’:
pdftotext.cpp:105: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘PyObject* PDF_read_page(PDF*, int)’:
pdftotext.cpp:119: error: ‘poppler’ has not been declared
pdftotext.cpp:119: error: expected initializer before ‘*’ token
pdftotext.cpp:120: error: ‘poppler’ has not been declared
pdftotext.cpp:120: error: expected ‘;’ before ‘layout_mode’
pdftotext.cpp:123: error: ‘page’ was not declared in this scope
pdftotext.cpp:123: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:129: error: ‘poppler’ has not been declared
pdftotext.cpp:129: error: expected initializer before ‘rect’
pdftotext.cpp:130: error: ‘rect’ was not declared in this scope
pdftotext.cpp:133: error: ‘layout_mode’ was not declared in this scope
pdftotext.cpp:133: error: ‘poppler’ has not been declared
pdftotext.cpp:135: error: ‘poppler’ has not been declared
pdftotext.cpp:137: error: ‘poppler’ has not been declared
pdftotext.cpp:138: error: type ‘<type error>’ argument given to ‘delete’, expected pointer
error: command 'gcc' failed with exit status 1

----------------------------------------
Command "/root/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-1mu2f1n2/pdftotext/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-ghuhvuhl/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-install-1mu2f1n2/pdftotext/


I'm assuming the problem here is that it's looking for the C++ compiled files and I could only get the glib?
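
One way to test that assumption is to list exactly which headers the compiler could not find and then look for them on disk. The sketch below is only an illustration: the function names are made up for this post, and `/usr/include` and `/usr/local/include` are assumed to be the directories gcc searches by default on this machine.

```python
import os
import re

# Matches gcc messages such as
# "pdftotext.cpp:3:42: error: poppler/cpp/poppler-document.h: No such file or directory"
MISSING_HEADER = re.compile(r"error: (\S+\.h): No such file or directory")

def missing_headers(build_log):
    """Collect the header paths the compiler reported as missing, without repeats."""
    seen = []
    for match in MISSING_HEADER.finditer(build_log):
        if match.group(1) not in seen:
            seen.append(match.group(1))
    return seen

def find_header(header, include_dirs):
    """Return the first existing path for `header` under `include_dirs`, or None."""
    for directory in include_dirs:
        candidate = os.path.join(directory, header)
        if os.path.isfile(candidate):
            return candidate
    return None

# First few error lines from the pip output, used only for illustration.
log = """\
pdftotext.cpp:3:42: error: poppler/cpp/poppler-document.h: No such file or directory
pdftotext.cpp:4:40: error: poppler/cpp/poppler-global.h: No such file or directory
pdftotext.cpp:5:38: error: poppler/cpp/poppler-page.h: No such file or directory
pdftotext.cpp:20: error: 'poppler' has not been declared
"""

headers = missing_headers(log)
for header in headers:
    # Assumed default search directories; adjust for the actual system.
    location = find_header(header, ["/usr/include", "/usr/local/include"])
    print(header, "->", location or "not found")
```

All three missing files live under poppler/cpp/, so if they show up as "not found" here, the remaining errors in the log are most likely just fallout from those failed includes.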



What can I look into?










  • Is the IT group's server running CentOS 7, by any chance? It looks as if poppler-cpp-devel is available on CentOS 7 but not on CentOS 6.

    – B. Shefter
    Mar 8 at 20:15






  • They're running CentOS release 6.7 (Final)

    – Michael Stackhouse
    Mar 8 at 21:25











  • Thanks for adding a self-answer here. Unfortunately that needed to be deleted, as we don't accept link-only answers. If you can expand it so the instructions are in the answer itself, with suitable attributions as necessary, that would be welcome.

    – halfer
    Mar 10 at 17:08






  • @halfer I edited the response to include the full formatted text from the source.

    – Michael Stackhouse
    Mar 18 at 19:56

















-1















I'm having some issues getting installing pdftotext in Python 3.6 (Anaconda 5.1.0) on CentOS.



Some quick notes first:



  • I'm using CentOS 6.7 on VirtualBox

  • I know it can work because my IT group has it installed on our server. NOTE: I found that our server did have the C++ wrapper installed and I'm trying to figure out how the got it.

  • I'm trying to get an existing application to work, so I'm not looking for an alternative to pdftotext the library at this time.

I followed the instructions from the github repo and already tried this step:



Fedora, Red Hat, and friends:



sudo yum install gcc-c++ pkgconfig poppler-cpp-devel python-devel redhat-rpm-config


But the problem seems to be around poppler-cpp-devel. I don't see that package within yum search poppler:



============================= N/S Matched: poppler =============================
poppler-devel.i686 : Libraries and headers for poppler
poppler-devel.x86_64 : Libraries and headers for poppler
poppler-glib.i686 : Glib wrapper for poppler
poppler-glib.x86_64 : Glib wrapper for poppler
poppler-qt.i686 : Qt3 wrapper for poppler
poppler-qt.x86_64 : Qt3 wrapper for poppler
poppler-qt4.i686 : Qt4 wrapper for poppler
poppler-qt4.x86_64 : Qt4 wrapper for poppler
poppler.i686 : PDF rendering library
poppler.x86_64 : PDF rendering library
poppler-data.noarch : Encoding files
poppler-glib-devel.i686 : Development files for glib wrapper
poppler-glib-devel.x86_64 : Development files for glib wrapper
poppler-qt-devel.i686 : Development files for Qt3 wrapper
poppler-qt-devel.x86_64 : Development files for Qt3 wrapper
poppler-qt4-devel.i686 : Development files for Qt4 wrapper
poppler-qt4-devel.x86_64 : Development files for Qt4 wrapper
poppler-utils.x86_64 : Command line utilities for converting PDF files


My IT group gave me the instructions of what they had attempted and I tried installing poppler-devel and poppler-glib. But every time I try pip install pdftotext I'm getting the following output:



[root@localhost stack]# pip install pdftotext
Collecting pdftotext
Using cached https://files.pythonhosted.org/packages/21/35/60094dbadd9de2035873390b1cac25e01da605844eba6a07a53a82fa4adc/pdftotext-2.1.1.tar.gz
Building wheels for collected packages: pdftotext
Building wheel for pdftotext (setup.py) ... error
Complete output from command /root/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-1mu2f1n2/pdftotext/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('rn', 'n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/pip-wheel-khm9zova --python-tag cp36:
/root/anaconda3/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'long_description_content_type'
warnings.warn(msg)
running bdist_wheel
running build
running build_ext
building 'pdftotext' extension
creating build
creating build/temp.linux-x86_64-3.6
gcc -pthread -B /root/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DPOPPLER_CPP_AT_LEAST_0_30_0=0 -I/root/anaconda3/include/python3.6m -c pdftotext.cpp -o build/temp.linux-x86_64-3.6/pdftotext.o -Wall
cc1plus: warning: command line option "-Wstrict-prototypes" is valid for Ada/C/ObjC but not for C++
pdftotext.cpp:3:42: error: poppler/cpp/poppler-document.h: No such file or directory
pdftotext.cpp:4:40: error: poppler/cpp/poppler-global.h: No such file or directory
pdftotext.cpp:5:38: error: poppler/cpp/poppler-page.h: No such file or directory
pdftotext.cpp:20: error: ‘poppler’ has not been declared
pdftotext.cpp:20: error: ISO C++ forbids declaration of ‘document’ with no type
pdftotext.cpp:20: error: expected ‘;’ before ‘*’ token
pdftotext.cpp: In function ‘void PDF_clear(PDF*)’:
pdftotext.cpp:26: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:27: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_create_doc(PDF*)’:
pdftotext.cpp:66: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:66: error: ‘poppler’ has not been declared
pdftotext.cpp:67: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_unlock(PDF*, char*)’:
pdftotext.cpp:75: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_init(PDF*, PyObject*, PyObject*)’:
pdftotext.cpp:105: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘PyObject* PDF_read_page(PDF*, int)’:
pdftotext.cpp:119: error: ‘poppler’ has not been declared
pdftotext.cpp:119: error: expected initializer before ‘*’ token
pdftotext.cpp:120: error: ‘poppler’ has not been declared
pdftotext.cpp:120: error: expected ‘;’ before ‘layout_mode’
pdftotext.cpp:123: error: ‘page’ was not declared in this scope
pdftotext.cpp:123: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:129: error: ‘poppler’ has not been declared
pdftotext.cpp:129: error: expected initializer before ‘rect’
pdftotext.cpp:130: error: ‘rect’ was not declared in this scope
pdftotext.cpp:133: error: ‘layout_mode’ was not declared in this scope
pdftotext.cpp:133: error: ‘poppler’ has not been declared
pdftotext.cpp:135: error: ‘poppler’ has not been declared
pdftotext.cpp:137: error: ‘poppler’ has not been declared
pdftotext.cpp:138: error: type ‘<type error>’ argument given to ‘delete’, expected pointer
error: command 'gcc' failed with exit status 1

----------------------------------------
Failed building wheel for pdftotext
Running setup.py clean for pdftotext
Failed to build pdftotext
Installing collected packages: pdftotext
Running setup.py install for pdftotext ... error
Complete output from command /root/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-1mu2f1n2/pdftotext/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('rn', 'n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-ghuhvuhl/install-record.txt --single-version-externally-managed --compile:
/root/anaconda3/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'long_description_content_type'
warnings.warn(msg)
running install
running build
running build_ext
building 'pdftotext' extension
creating build
creating build/temp.linux-x86_64-3.6
gcc -pthread -B /root/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DPOPPLER_CPP_AT_LEAST_0_30_0=0 -I/root/anaconda3/include/python3.6m -c pdftotext.cpp -o build/temp.linux-x86_64-3.6/pdftotext.o -Wall
cc1plus: warning: command line option "-Wstrict-prototypes" is valid for Ada/C/ObjC but not for C++
pdftotext.cpp:3:42: error: poppler/cpp/poppler-document.h: No such file or directory
pdftotext.cpp:4:40: error: poppler/cpp/poppler-global.h: No such file or directory
pdftotext.cpp:5:38: error: poppler/cpp/poppler-page.h: No such file or directory
pdftotext.cpp:20: error: ‘poppler’ has not been declared
pdftotext.cpp:20: error: ISO C++ forbids declaration of ‘document’ with no type
pdftotext.cpp:20: error: expected ‘;’ before ‘*’ token
pdftotext.cpp: In function ‘void PDF_clear(PDF*)’:
pdftotext.cpp:26: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:27: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_create_doc(PDF*)’:
pdftotext.cpp:66: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:66: error: ‘poppler’ has not been declared
pdftotext.cpp:67: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_unlock(PDF*, char*)’:
pdftotext.cpp:75: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_init(PDF*, PyObject*, PyObject*)’:
pdftotext.cpp:105: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘PyObject* PDF_read_page(PDF*, int)’:
pdftotext.cpp:119: error: ‘poppler’ has not been declared
pdftotext.cpp:119: error: expected initializer before ‘*’ token
pdftotext.cpp:120: error: ‘poppler’ has not been declared
pdftotext.cpp:120: error: expected ‘;’ before ‘layout_mode’
pdftotext.cpp:123: error: ‘page’ was not declared in this scope
pdftotext.cpp:123: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:129: error: ‘poppler’ has not been declared
pdftotext.cpp:129: error: expected initializer before ‘rect’
pdftotext.cpp:130: error: ‘rect’ was not declared in this scope
pdftotext.cpp:133: error: ‘layout_mode’ was not declared in this scope
pdftotext.cpp:133: error: ‘poppler’ has not been declared
pdftotext.cpp:135: error: ‘poppler’ has not been declared
pdftotext.cpp:137: error: ‘poppler’ has not been declared
pdftotext.cpp:138: error: type ‘<type error>’ argument given to ‘delete’, expected pointer
error: command 'gcc' failed with exit status 1

----------------------------------------
Command "/root/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-1mu2f1n2/pdftotext/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('rn', 'n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-ghuhvuhl/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-install-1mu2f1n2/pdftotext/


I'm assuming the problem here is that it's looking for the C++ compiled files and I could only get the glib?



What I can look into?










share|improve this question
























  • Is the IT group's server running CentOS 7, by any chance? It looks as if poppler-cpp-devel is available on CentOS 7 but not on CentOS 6.

    – B. Shefter
    Mar 8 at 20:15






  • 1





    They're running CentOS release 6.7 (Final)

    – Michael Stackhouse
    Mar 8 at 21:25











  • Thanks for adding a self-answer here. Unfortunately that needed to be deleted, as we don't accept link-only answers. If you can expand it so the instructions are in the answer itself, with suitable attributions as necessary, that would be welcome.

    – halfer
    Mar 10 at 17:08






  • 1





    @halfer I edited the response to include the full formatted text from the source.

    – Michael Stackhouse
    Mar 18 at 19:56













-1












-1








-1


0






I'm having some issues getting installing pdftotext in Python 3.6 (Anaconda 5.1.0) on CentOS.



Some quick notes first:



  • I'm using CentOS 6.7 on VirtualBox

  • I know it can work because my IT group has it installed on our server. NOTE: I found that our server did have the C++ wrapper installed and I'm trying to figure out how the got it.

  • I'm trying to get an existing application to work, so I'm not looking for an alternative to pdftotext the library at this time.

I followed the instructions from the github repo and already tried this step:



Fedora, Red Hat, and friends:



sudo yum install gcc-c++ pkgconfig poppler-cpp-devel python-devel redhat-rpm-config


But the problem seems to be around poppler-cpp-devel. I don't see that package within yum search poppler:



============================= N/S Matched: poppler =============================
poppler-devel.i686 : Libraries and headers for poppler
poppler-devel.x86_64 : Libraries and headers for poppler
poppler-glib.i686 : Glib wrapper for poppler
poppler-glib.x86_64 : Glib wrapper for poppler
poppler-qt.i686 : Qt3 wrapper for poppler
poppler-qt.x86_64 : Qt3 wrapper for poppler
poppler-qt4.i686 : Qt4 wrapper for poppler
poppler-qt4.x86_64 : Qt4 wrapper for poppler
poppler.i686 : PDF rendering library
poppler.x86_64 : PDF rendering library
poppler-data.noarch : Encoding files
poppler-glib-devel.i686 : Development files for glib wrapper
poppler-glib-devel.x86_64 : Development files for glib wrapper
poppler-qt-devel.i686 : Development files for Qt3 wrapper
poppler-qt-devel.x86_64 : Development files for Qt3 wrapper
poppler-qt4-devel.i686 : Development files for Qt4 wrapper
poppler-qt4-devel.x86_64 : Development files for Qt4 wrapper
poppler-utils.x86_64 : Command line utilities for converting PDF files


My IT group gave me the instructions of what they had attempted and I tried installing poppler-devel and poppler-glib. But every time I try pip install pdftotext I'm getting the following output:



[root@localhost stack]# pip install pdftotext
Collecting pdftotext
Using cached https://files.pythonhosted.org/packages/21/35/60094dbadd9de2035873390b1cac25e01da605844eba6a07a53a82fa4adc/pdftotext-2.1.1.tar.gz
Building wheels for collected packages: pdftotext
Building wheel for pdftotext (setup.py) ... error
Complete output from command /root/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-1mu2f1n2/pdftotext/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('rn', 'n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/pip-wheel-khm9zova --python-tag cp36:
/root/anaconda3/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'long_description_content_type'
warnings.warn(msg)
running bdist_wheel
running build
running build_ext
building 'pdftotext' extension
creating build
creating build/temp.linux-x86_64-3.6
gcc -pthread -B /root/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DPOPPLER_CPP_AT_LEAST_0_30_0=0 -I/root/anaconda3/include/python3.6m -c pdftotext.cpp -o build/temp.linux-x86_64-3.6/pdftotext.o -Wall
cc1plus: warning: command line option "-Wstrict-prototypes" is valid for Ada/C/ObjC but not for C++
pdftotext.cpp:3:42: error: poppler/cpp/poppler-document.h: No such file or directory
pdftotext.cpp:4:40: error: poppler/cpp/poppler-global.h: No such file or directory
pdftotext.cpp:5:38: error: poppler/cpp/poppler-page.h: No such file or directory
pdftotext.cpp:20: error: ‘poppler’ has not been declared
pdftotext.cpp:20: error: ISO C++ forbids declaration of ‘document’ with no type
pdftotext.cpp:20: error: expected ‘;’ before ‘*’ token
pdftotext.cpp: In function ‘void PDF_clear(PDF*)’:
pdftotext.cpp:26: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:27: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_create_doc(PDF*)’:
pdftotext.cpp:66: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:66: error: ‘poppler’ has not been declared
pdftotext.cpp:67: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_unlock(PDF*, char*)’:
pdftotext.cpp:75: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_init(PDF*, PyObject*, PyObject*)’:
pdftotext.cpp:105: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘PyObject* PDF_read_page(PDF*, int)’:
pdftotext.cpp:119: error: ‘poppler’ has not been declared
pdftotext.cpp:119: error: expected initializer before ‘*’ token
pdftotext.cpp:120: error: ‘poppler’ has not been declared
pdftotext.cpp:120: error: expected ‘;’ before ‘layout_mode’
pdftotext.cpp:123: error: ‘page’ was not declared in this scope
pdftotext.cpp:123: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:129: error: ‘poppler’ has not been declared
pdftotext.cpp:129: error: expected initializer before ‘rect’
pdftotext.cpp:130: error: ‘rect’ was not declared in this scope
pdftotext.cpp:133: error: ‘layout_mode’ was not declared in this scope
pdftotext.cpp:133: error: ‘poppler’ has not been declared
pdftotext.cpp:135: error: ‘poppler’ has not been declared
pdftotext.cpp:137: error: ‘poppler’ has not been declared
pdftotext.cpp:138: error: type ‘<type error>’ argument given to ‘delete’, expected pointer
error: command 'gcc' failed with exit status 1

----------------------------------------
Failed building wheel for pdftotext
Running setup.py clean for pdftotext
Failed to build pdftotext
Installing collected packages: pdftotext
Running setup.py install for pdftotext ... error
Complete output from command /root/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-1mu2f1n2/pdftotext/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('rn', 'n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-ghuhvuhl/install-record.txt --single-version-externally-managed --compile:
/root/anaconda3/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'long_description_content_type'
warnings.warn(msg)
running install
running build
running build_ext
building 'pdftotext' extension
creating build
creating build/temp.linux-x86_64-3.6
gcc -pthread -B /root/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DPOPPLER_CPP_AT_LEAST_0_30_0=0 -I/root/anaconda3/include/python3.6m -c pdftotext.cpp -o build/temp.linux-x86_64-3.6/pdftotext.o -Wall
cc1plus: warning: command line option "-Wstrict-prototypes" is valid for Ada/C/ObjC but not for C++
I'm having trouble installing pdftotext in Python 3.6 (Anaconda 5.1.0) on CentOS.



Some quick notes first:



  • I'm using CentOS 6.7 on VirtualBox

  • I know it can work because my IT group has it installed on our server. NOTE: I found that our server does have the C++ wrapper installed, and I'm trying to figure out how they got it.

  • I'm trying to get an existing application to work, so I'm not looking for an alternative to the pdftotext library at this time.

I followed the instructions from the GitHub repo and already tried this step:



Fedora, Red Hat, and friends:



sudo yum install gcc-c++ pkgconfig poppler-cpp-devel python-devel redhat-rpm-config


But the problem seems to be around poppler-cpp-devel. I don't see that package within yum search poppler:



============================= N/S Matched: poppler =============================
poppler-devel.i686 : Libraries and headers for poppler
poppler-devel.x86_64 : Libraries and headers for poppler
poppler-glib.i686 : Glib wrapper for poppler
poppler-glib.x86_64 : Glib wrapper for poppler
poppler-qt.i686 : Qt3 wrapper for poppler
poppler-qt.x86_64 : Qt3 wrapper for poppler
poppler-qt4.i686 : Qt4 wrapper for poppler
poppler-qt4.x86_64 : Qt4 wrapper for poppler
poppler.i686 : PDF rendering library
poppler.x86_64 : PDF rendering library
poppler-data.noarch : Encoding files
poppler-glib-devel.i686 : Development files for glib wrapper
poppler-glib-devel.x86_64 : Development files for glib wrapper
poppler-qt-devel.i686 : Development files for Qt3 wrapper
poppler-qt-devel.x86_64 : Development files for Qt3 wrapper
poppler-qt4-devel.i686 : Development files for Qt4 wrapper
poppler-qt4-devel.x86_64 : Development files for Qt4 wrapper
poppler-utils.x86_64 : Command line utilities for converting PDF files
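The check above can also be scripted. The following is a minimal sketch, assuming the `name.arch : summary` line layout shown in the listing; the helper name `packages_in_yum_search` is hypothetical and not part of yum or pdftotext.

```python
def packages_in_yum_search(output):
    """Collect package names (without the .arch suffix) from `yum search` text."""
    names = set()
    for line in output.splitlines():
        # Skip the "==== N/S Matched ====" banner and anything that is not
        # a "name.arch : summary" row.
        if line.startswith("=") or " : " not in line:
            continue
        name_arch = line.split(" : ", 1)[0].strip()
        # Drop the trailing architecture, e.g. ".x86_64" or ".noarch".
        names.add(name_arch.rsplit(".", 1)[0])
    return names
```

Feeding it the listing above and testing `"poppler-cpp-devel" in packages_in_yum_search(text)` would return False, which matches what the listing shows by eye.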


My IT group gave me the steps they had attempted, so I tried installing poppler-devel and poppler-glib. But every time I run pip install pdftotext I get the following output:



[root@localhost stack]# pip install pdftotext
Collecting pdftotext
Using cached https://files.pythonhosted.org/packages/21/35/60094dbadd9de2035873390b1cac25e01da605844eba6a07a53a82fa4adc/pdftotext-2.1.1.tar.gz
Building wheels for collected packages: pdftotext
Building wheel for pdftotext (setup.py) ... error
Complete output from command /root/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-1mu2f1n2/pdftotext/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/pip-wheel-khm9zova --python-tag cp36:
/root/anaconda3/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'long_description_content_type'
warnings.warn(msg)
running bdist_wheel
running build
running build_ext
building 'pdftotext' extension
creating build
creating build/temp.linux-x86_64-3.6
gcc -pthread -B /root/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DPOPPLER_CPP_AT_LEAST_0_30_0=0 -I/root/anaconda3/include/python3.6m -c pdftotext.cpp -o build/temp.linux-x86_64-3.6/pdftotext.o -Wall
cc1plus: warning: command line option "-Wstrict-prototypes" is valid for Ada/C/ObjC but not for C++
pdftotext.cpp:3:42: error: poppler/cpp/poppler-document.h: No such file or directory
pdftotext.cpp:4:40: error: poppler/cpp/poppler-global.h: No such file or directory
pdftotext.cpp:5:38: error: poppler/cpp/poppler-page.h: No such file or directory
pdftotext.cpp:20: error: ‘poppler’ has not been declared
pdftotext.cpp:20: error: ISO C++ forbids declaration of ‘document’ with no type
pdftotext.cpp:20: error: expected ‘;’ before ‘*’ token
pdftotext.cpp: In function ‘void PDF_clear(PDF*)’:
pdftotext.cpp:26: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:27: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_create_doc(PDF*)’:
pdftotext.cpp:66: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:66: error: ‘poppler’ has not been declared
pdftotext.cpp:67: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_unlock(PDF*, char*)’:
pdftotext.cpp:75: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_init(PDF*, PyObject*, PyObject*)’:
pdftotext.cpp:105: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘PyObject* PDF_read_page(PDF*, int)’:
pdftotext.cpp:119: error: ‘poppler’ has not been declared
pdftotext.cpp:119: error: expected initializer before ‘*’ token
pdftotext.cpp:120: error: ‘poppler’ has not been declared
pdftotext.cpp:120: error: expected ‘;’ before ‘layout_mode’
pdftotext.cpp:123: error: ‘page’ was not declared in this scope
pdftotext.cpp:123: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:129: error: ‘poppler’ has not been declared
pdftotext.cpp:129: error: expected initializer before ‘rect’
pdftotext.cpp:130: error: ‘rect’ was not declared in this scope
pdftotext.cpp:133: error: ‘layout_mode’ was not declared in this scope
pdftotext.cpp:133: error: ‘poppler’ has not been declared
pdftotext.cpp:135: error: ‘poppler’ has not been declared
pdftotext.cpp:137: error: ‘poppler’ has not been declared
pdftotext.cpp:138: error: type ‘<type error>’ argument given to ‘delete’, expected pointer
error: command 'gcc' failed with exit status 1

----------------------------------------
Failed building wheel for pdftotext
Running setup.py clean for pdftotext
Failed to build pdftotext
Installing collected packages: pdftotext
Running setup.py install for pdftotext ... error
Complete output from command /root/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-1mu2f1n2/pdftotext/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-ghuhvuhl/install-record.txt --single-version-externally-managed --compile:
/root/anaconda3/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'long_description_content_type'
warnings.warn(msg)
running install
running build
running build_ext
building 'pdftotext' extension
creating build
creating build/temp.linux-x86_64-3.6
gcc -pthread -B /root/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DPOPPLER_CPP_AT_LEAST_0_30_0=0 -I/root/anaconda3/include/python3.6m -c pdftotext.cpp -o build/temp.linux-x86_64-3.6/pdftotext.o -Wall
cc1plus: warning: command line option "-Wstrict-prototypes" is valid for Ada/C/ObjC but not for C++
pdftotext.cpp:3:42: error: poppler/cpp/poppler-document.h: No such file or directory
pdftotext.cpp:4:40: error: poppler/cpp/poppler-global.h: No such file or directory
pdftotext.cpp:5:38: error: poppler/cpp/poppler-page.h: No such file or directory
pdftotext.cpp:20: error: ‘poppler’ has not been declared
pdftotext.cpp:20: error: ISO C++ forbids declaration of ‘document’ with no type
pdftotext.cpp:20: error: expected ‘;’ before ‘*’ token
pdftotext.cpp: In function ‘void PDF_clear(PDF*)’:
pdftotext.cpp:26: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:27: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_create_doc(PDF*)’:
pdftotext.cpp:66: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:66: error: ‘poppler’ has not been declared
pdftotext.cpp:67: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_unlock(PDF*, char*)’:
pdftotext.cpp:75: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_init(PDF*, PyObject*, PyObject*)’:
pdftotext.cpp:105: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘PyObject* PDF_read_page(PDF*, int)’:
pdftotext.cpp:119: error: ‘poppler’ has not been declared
pdftotext.cpp:119: error: expected initializer before ‘*’ token
pdftotext.cpp:120: error: ‘poppler’ has not been declared
pdftotext.cpp:120: error: expected ‘;’ before ‘layout_mode’
pdftotext.cpp:123: error: ‘page’ was not declared in this scope
pdftotext.cpp:123: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:129: error: ‘poppler’ has not been declared
pdftotext.cpp:129: error: expected initializer before ‘rect’
pdftotext.cpp:130: error: ‘rect’ was not declared in this scope
pdftotext.cpp:133: error: ‘layout_mode’ was not declared in this scope
pdftotext.cpp:133: error: ‘poppler’ has not been declared
pdftotext.cpp:135: error: ‘poppler’ has not been declared
pdftotext.cpp:137: error: ‘poppler’ has not been declared
pdftotext.cpp:138: error: type ‘<type error>’ argument given to ‘delete’, expected pointer
error: command 'gcc' failed with exit status 1

----------------------------------------
Command "/root/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-1mu2f1n2/pdftotext/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-ghuhvuhl/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-install-1mu2f1n2/pdftotext/


I assume the problem is that the build is looking for the poppler C++ wrapper headers (poppler/cpp/*.h), and I was only able to install the glib wrapper?

What can I look into?
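One thing worth checking first is whether the three headers named in the compiler errors exist anywhere the compiler would look. The sketch below is only an illustration: the function name `missing_poppler_headers` and the default list of include directories are assumptions, so adjust the directories for your system.

```python
import os

# The three headers named in the compiler errors above.
REQUIRED_HEADERS = (
    "poppler/cpp/poppler-document.h",
    "poppler/cpp/poppler-global.h",
    "poppler/cpp/poppler-page.h",
)

# Assumed include roots to probe; these are common defaults, not a complete list.
DEFAULT_INCLUDE_DIRS = ("/usr/include", "/usr/local/include")


def missing_poppler_headers(include_dirs=DEFAULT_INCLUDE_DIRS):
    """Return the required headers that are not found under any include root."""
    missing = []
    for header in REQUIRED_HEADERS:
        found = any(
            os.path.isfile(os.path.join(root, header)) for root in include_dirs
        )
        if not found:
            missing.append(header)
    return missing
```

If `missing_poppler_headers()` returns all three names, the C++ wrapper's development files are not installed in those locations, which would be consistent with the "No such file or directory" errors.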







linux python-3.x centos pdftotext poppler






edited Mar 10 at 17:06 by halfer

asked Mar 8 at 1:57 by Michael Stackhouse












  • Is the IT group's server running CentOS 7, by any chance? It looks as if poppler-cpp-devel is available on CentOS 7 but not on CentOS 6.

    – B. Shefter
    Mar 8 at 20:15

  • They're running CentOS release 6.7 (Final).

    – Michael Stackhouse
    Mar 8 at 21:25

  • Thanks for adding a self-answer here. Unfortunately that needed to be deleted, as we don't accept link-only answers. If you can expand it so the instructions are in the answer itself, with suitable attributions as necessary, that would be welcome.

    – halfer
    Mar 10 at 17:08

  • @halfer I edited the response to include the full formatted text from the source.

    – Michael Stackhouse
    Mar 18 at 19:56





























2 Answers
pdftotext should be in poppler-utils, so try yum install poppler-utils.

EDIT: Hmm. There's a package called pypoppler available for CentOS 6 in the EPEL repository, which describes itself as "Python bindings for the Poppler PDF rendering library." I see no indication that it includes poppler/cpp/anything, but you can give it a try. (You may need to install pycairo first.)

Failing that, you might try installing an earlier version of pdftotext (e.g. pip install pdftotext==1.0.0) to find one compatible with CentOS 6. The earliest version came out in June of 2017, though, so that may not help.

I don't suppose you're interested in upgrading to CentOS 7?







edited Mar 8 at 21:09

answered Mar 8 at 3:40 by B. Shefter







  • I tried that one too, but it's just the command-line utilities. The Python library pdftotext distinctly seems to rely on the C++ wrapper for poppler.

    – Michael Stackhouse
    Mar 8 at 15:17

  • I tried pypoppler too, to no avail. Upgrading the OS is not an option since I'm trying to emulate my company server. I'll post an edit, but my IT group got back to me: they manually installed the C++ version, so I asked how they did it. I plan to submit a pull request to the pdftotext GitHub repo detailing my solution once I figure it out.

    – Michael Stackhouse
    Mar 8 at 21:18













I found the solution to this. By following the instructions for installing libpoppler-cpp from this link, I was able to install pdftotext successfully.

Following the instructions from this repo:

On CentOS

On CentOS the libpoppler-cpp library is not included with the system so we need to build from source. Note that recent versions of poppler require C++11 which is not available on CentOS, so we build a slightly older version of libpoppler.

# Build dependencies
yum install wget xz libjpeg-devel openjpeg2-devel

# Download and extract
wget https://poppler.freedesktop.org/poppler-0.47.0.tar.xz
tar -Jxvf poppler-0.47.0.tar.xz
cd poppler-0.47.0

# Build and install
./configure
make
sudo make install


By default libraries get installed in /usr/local/lib and /usr/local/include. On CentOS this is not a default search path so we need to set PKG_CONFIG_PATH and LD_LIBRARY_PATH to point R to the right directory:

export LD_LIBRARY_PATH="/usr/local/lib"
export PKG_CONFIG_PATH="/usr/local/lib/pkgconfig"
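For a Python workflow, the same two variables can be passed to pip and to a fresh interpreter from a script. This is a hedged sketch rather than the method from the quoted instructions: `build_env` and `install_and_check` are hypothetical helper names, the `/usr/local` prefix is taken from the text above, and whether a given pdftotext release consults pkg-config during its build is an assumption. Unlike the export lines, the sketch prepends to any existing value instead of overwriting it.

```python
import os
import subprocess
import sys


def build_env(base_env=None, prefix="/usr/local"):
    """Return a copy of the environment with the two search paths prepended."""
    env = dict(os.environ if base_env is None else base_env)
    additions = {
        "LD_LIBRARY_PATH": prefix + "/lib",
        "PKG_CONFIG_PATH": prefix + "/lib/pkgconfig",
    }
    for name, path in additions.items():
        existing = env.get(name)
        # Keep any value that was already set, but search the new path first.
        env[name] = path + os.pathsep + existing if existing else path
    return env


def install_and_check(env):
    """Run pip, then try the import in a new interpreter started with env."""
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", "pdftotext"], env=env
    )
    # LD_LIBRARY_PATH is read when a process starts, so the import check
    # runs in a child process rather than in the current one.
    result = subprocess.run([sys.executable, "-c", "import pdftotext"], env=env)
    return result.returncode == 0
```

Calling `install_and_check(build_env())` would attempt the install and report whether the module imports. The import is checked in a child process because the dynamic linker also needs to find the library at run time, not just at build time.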







edited Mar 18 at 22:50 by halfer

answered Mar 8 at 22:01 by Michael Stackhouse


























