Shandigutt
2018-09-01 22:10:45 UTC
Hi,
I was trying to create LSTM training data using tesstrain.sh, but I got the
error below. Can somebody explain what has gone wrong?
*Command I used:*
./src/training/tesstrain.sh --fonts_dir ../Support/font --lang sin \
  --linedata_only --noextract_font_properties --langdata_dir ../langdata \
  --tessdata_dir ../tessdata --output_dir ../training/sintrain \
  --fontlist "BhashitaComplex" --training_text ../langdata/sin/sin.training_text
*Extract of the output:*
=== Phase E: Generating lstmf files ===
Using TESSDATA_PREFIX=../tessdata
[2018 September 1, Saturday 21:41:25 +0300] /usr/local/bin/tesseract
/tmp/sin-2018-09-01.E4T/sin.BhashitaComplex.exp0.tif
/tmp/sin-2018-09-01.E4T/sin.BhashitaComplex.exp0 --psm 6 lstm.train
../langdata/sin/sin.config
read_params_file: Can't open lstm.train
Tesseract Open Source OCR Engine v4.0.0-beta.4-74-gd8237 with Leptonica
Page 1
Page 2
Page 3
ERROR: /tmp/sin-2018-09-01.E4T/sin.BhashitaComplex.exp0.lstmf does not
exist or is not readable
*For the complete output please see the attached err.txt*
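One thing the output above suggests checking is whether the lstm.train config file is visible to Tesseract at all. The sketch below assumes that a bare config name such as lstm.train is looked up under the configs/ and tessconfigs/ subdirectories of the tessdata directory; the function name check_tess_configs is hypothetical and only for illustration.

```shell
#!/bin/sh
# Report whether each named config file can be found under a tessdata directory.
# Assumption: Tesseract resolves a bare config name (e.g. "lstm.train")
# against <tessdata>/configs/ and <tessdata>/tessconfigs/.
# check_tess_configs is a hypothetical helper, not part of Tesseract.
check_tess_configs() {
    tessdata_dir=$1
    shift
    missing=0
    for name in "$@"; do
        if [ -r "$tessdata_dir/configs/$name" ] ||
           [ -r "$tessdata_dir/tessconfigs/$name" ]; then
            echo "found: $name"
        else
            echo "missing: $name"
            missing=1
        fi
    done
    # Non-zero exit status if at least one config was not found.
    return "$missing"
}

# Example with the tessdata directory from the command above:
# check_tess_configs ../tessdata lstm.train
```

If this reports lstm.train as missing, that would be consistent with the "read_params_file: Can't open lstm.train" message and with no .lstmf file being written.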
*After running the command, I checked the temporary directory it created. Its
contents were:*
***@tharaka-laptop-ubuntu:~$ cd /tmp/sin-2018-09-01.E4T/
***@tharaka-laptop-ubuntu:/tmp/sin-2018-09-01.E4T$ ll
total 776
drwx------  2 tharaka tharaka   4096 Sep 1 21:41 ./
drwxrwxrwt 50 root    root      4096 Sep 2 00:10 ../
-rw-r--r--  1 tharaka tharaka 249413 Sep 1 21:41 sin.BhashitaComplex.exp0.box
-rw-r--r--  1 tharaka tharaka 436290 Sep 1 21:41 sin.BhashitaComplex.exp0.tif
-rw-r--r--  1 tharaka tharaka   9099 Sep 1 23:27 sin.BhashitaComplex.exp0.txt
-rw-r--r--  1 tharaka tharaka   6543 Sep 1 21:41 sin.unicharset
-rw-r--r--  1 tharaka tharaka   3053 Sep 1 21:41 sin.xheights
-rw-r--r--  1 tharaka tharaka  71704 Sep 1 23:27 tesstrain.log
***@tharaka-laptop-ubuntu:/tmp/sin-2018-09-01.E4T$
*My tesseract version:*
tesseract 4.0.0-beta.4-74-gd8237
leptonica-1.77.0
libjpeg 8d (libjpeg-turbo 1.5.2) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11
Found SSE
*My OS details:*
***@tharaka-laptop-ubuntu:/tmp/sin-2018-09-01.E4T$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.1 LTS
Release: 18.04
Codename: bionic
I would appreciate your help with this.
Thanks
--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+***@googlegroups.com.
To post to this group, send email to tesseract-***@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/7d771008-c142-4302-8b5e-e1fd130cc140%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.