Revision [5040]

Last edited on 2018-09-04 23:57:39 by BenoitAudouard
Additions:
~~- keep only the relevant part of the scanned surface (possibly the same area for pages scanned right side up and upside down), e.g. 20 x 28 (margins can be left out...)
~~- rotate some pages by 180°: a book may be best scanned right side up, then upside down; with the same horizontal/vertical ruler positions, the scanned zone should be the same
Deletions:
~~- keep only the relevant part of the scanned surface
~~- rotate some pages by 180°


Revision [5039]

Edited on 2018-09-04 23:48:24 by BenoitAudouard
Additions:
=== processing articles ===
~- scan reliably at the size of the magazine/book
~~- 300 dpi in color, about 20 MB/page
~~- keep only the relevant part of the scanned surface
~~- rotate some pages by 180°
~~- then compress them; they can be stored as PNG images in a PDF file (for example)
~- OCR (recognize characters) when possible, stored as a searchable front layer
~- add tags
~- possibly take corrections into account to improve the OCR step, and test better scanning options
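
As a rough check of the "300 dpi in color, about 20 MB/page" figure above, the uncompressed size of a scan can be estimated from its dimensions. This is only a minimal sketch: it assumes the scanned area is given in centimetres (for example 20 x 28) and that color means 24-bit RGB, i.e. 3 bytes per pixel. The function names are hypothetical and not part of any scanning tool.

```python
CM_PER_INCH = 2.54


def scan_pixels(width_cm, height_cm, dpi=300):
    """Return the (width, height) of the scan in pixels at the given resolution."""
    width_px = round(width_cm / CM_PER_INCH * dpi)
    height_px = round(height_cm / CM_PER_INCH * dpi)
    return width_px, height_px


def raw_size_bytes(width_cm, height_cm, dpi=300, bytes_per_pixel=3):
    """Estimate the uncompressed image size (assumption: 3 bytes per pixel for RGB)."""
    width_px, height_px = scan_pixels(width_cm, height_cm, dpi)
    return width_px * height_px * bytes_per_pixel


if __name__ == "__main__":
    # Assumed page area of 20 cm x 28 cm, scanned at 300 dpi in color.
    width_px, height_px = scan_pixels(20, 28)
    size_mb = raw_size_bytes(20, 28) / 1_000_000
    print(f"{width_px} x {height_px} px, about {size_mb:.1f} MB uncompressed")
```

Under these assumptions the estimate lands a little above 20 MB per page, which is why compressing the pages before storing them in a PDF matters.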


Revision [5038]

Edited on 2018-09-04 23:28:36 by BenoitAudouard
Additions:
Scan as an image (in a PDF), then recognize characters
https://help.ubuntu.com/community/OCR
~- use `tesseract -l fra` to set French as the language of the content
~- gocr
~- ocrfeeder
~- paperwork
~- simple-scan
~- tesseract
~- xsane
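
The `tesseract -l fra` hint above can be wrapped in a small script. This is a minimal sketch, assuming a `tesseract` binary with the French language data is installed and that the scanned page is a hypothetical file named `page.png`. The trailing `pdf` argument asks tesseract to write a searchable PDF instead of plain text. The helper names are made up for illustration.

```python
import shutil
import subprocess


def tesseract_command(image_path, output_base, lang="fra", searchable_pdf=True):
    """Build the tesseract command line as a list of arguments."""
    cmd = ["tesseract", image_path, output_base, "-l", lang]
    if searchable_pdf:
        # The "pdf" config makes tesseract write <output_base>.pdf with a text layer.
        cmd.append("pdf")
    return cmd


def run_ocr(image_path, output_base, lang="fra"):
    """Run tesseract on one image; raises if the binary is missing or the run fails."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not installed or not on PATH")
    subprocess.run(tesseract_command(image_path, output_base, lang), check=True)


if __name__ == "__main__":
    # Only print the command here; call run_ocr("page.png", "page") to execute it.
    print(" ".join(tesseract_command("page.png", "page")))
```

Keeping the command builder separate from the call that runs it makes it easy to check the arguments without having tesseract installed.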
Other useful programs:
~- gscan2pdf
~- scantailor - interactive post-processing tool for scanned pages. It performs operations such as page splitting, deskewing, adding/removing borders, and others. You give it raw scans, and you get pages ready to be printed or assembled into a PDF or DJVU file. Scanning, optical character recognition, and assembling multi-page documents are out of scope of this project
~- skanlite
~- zbar - bar code scanner
=== obsolete programs in 2018 ===
They are no longer packaged for Mageia; perhaps they need a packager?
~- clara
~- ocrad
~- ocropus
Deletions:
https://help.ubuntu.com/community/OCR add `tesseract -l fra` to identify french for the language of the content


Revision [4117]

Edited on 2015-10-26 19:44:08 by BenoitAudouard
Additions:
OCRFeeder did the job (converting a PDF to text)


Revision [4116]

The oldest known version of this page was created on 2015-10-23 20:11:59 by BenoitAudouard