John Muccigrosso
2016-06-24 00:12:13 UTC
I recently installed tesseract and am having some trouble with PDFs. The
error is some form of:
Error in fopenReadStream: file not found
%ᅵᅵᅵᅵ in pixRead: image file not found: %PDF-1.3
%ᅵᅵᅵᅵ cannot be read!
Error during processing.
where the 1.3 may be 1.4 or 1.6. Things are fine with a jpg or tiff version
of the same PDF (created by exporting from Preview.app).
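The `%PDF-1.3` in the message looks like the first bytes of the PDF file itself, which would suggest the file is being opened as if it were an image. As a minimal sketch (the function name `pdf_version` is hypothetical and not part of tesseract or leptonica), the header can be checked like this before handing a file to tesseract:

```python
import re

# A PDF file normally begins with a header such as "%PDF-1.3".
PDF_HEADER = re.compile(rb"%PDF-(\d\.\d)")


def pdf_version(path):
    """Return the PDF version (e.g. "1.3") if the file starts with a
    PDF header, or None if it does not look like a PDF."""
    with open(path, "rb") as f:
        head = f.read(16)  # the header sits at the very start of the file
    match = PDF_HEADER.match(head)
    return match.group(1).decode("ascii") if match else None
```

Calling `pdf_version("scan.pdf")` (a placeholder filename) returns a version string for a PDF and `None` for a JPEG or TIFF, which at least confirms which kind of file is being passed in.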
System: Mac OS X 10.9.5.
"tesseract -v" reports:
tesseract 3.04.01
leptonica-1.72
libjpeg 8d : libpng 1.6.23 : libtiff 4.0.6 : zlib 1.2.5
I installed tesseract and leptonica with homebrew and "brew info
tesseract" reports:
tesseract: stable 3.04.01 (bottled), HEAD
OCR (Optical Character Recognition) engine
https://github.com/tesseract-ocr/
/usr/local/Cellar/tesseract/3.04.01_1 (93 files, 39.5M) *
Poured from bottle on 2016-05-27 at 15:41:15
From: https://github.com/Homebrew/homebrew-core/blob/master/Formula/tesseract.rb
==> Dependencies
Required: leptonica ✔
Recommended: libtiff ✔
==> Options
--with-all-languages
Install recognition data for all languages
--with-opencl
Enable OpenCL support
--with-training-tools
Install OCR training tools
--without-libtiff
Build without libtiff support
--HEAD
Install HEAD version
I suspect some missing package or something similar, but don't know what
exactly.
TIA.
--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+***@googlegroups.com.
To post to this group, send email to tesseract-***@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/de320a67-b788-4263-8486-a522c556051c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.