Arabic OCR: Difference between revisions
From the Arabeyes Wiki
Revision as of 15:57, 20 January 2007
Optical Character Recognition
Optical character recognition (OCR) takes a scanned document (or a PDF file) and generates an editable text file from it, based on optical recognition and approximation. For an introduction to OCR, see http://www.students.cs.uu.nl/people/mjkammer/Work/intro_2_OCR.html
Current Status of Arabic OCR software
I (MuhammadAlkarouri) know of no actually working Arabic OCR software that is open source. Any additions are certainly welcome.
Resources
Arabic OCR Links
Papers
- Automatic Recognition Using Zernike Moments As A Feature Extractor (Paper)
- Graph Based Segmentation .. (Paper)
- Structural Features Of Cursive Arabic Scripts (Paper)
- Multilingual Machine Printed OCR (Paper)
- Test of two Arabic OCR programs
- Performance Evaluation of two Arabic OCR products
Software
- Readiris http://www.irislink.com/c2-532/OCR-Software---Product-list.aspx - Supports Arabic and Persian.
- NovoDynamics VERUS http://www.novodynamics.com - Focuses on high-performance OCR and image enhancement for Arabic-based scripts, including Arabic, Persian, Pashto, and Urdu.
FOSS (no Arabic support)
- Tesseract - an open source OCR engine, initially developed by HP and released under the Apache License, Version 2.0. It can be compiled using MSVC 6.0 or GCC.
- OOCR http://oocr.sourceforge.net - an OCR program still in development, released under the GPL.
- GOCR - included in Debian and other distributions.
- GNU Ocrad http://www.gnu.org/software/ocrad/ocrad.html - "is an OCR [...] program based on a feature extraction method".
Other Links
- How to encode image produced by a recognition system (mailing thread) http://lists.arabeyes.org/archives/general/2002/March/msg00001.html
- Rapidly Retargetable Translingual Detection http://tides.umiacs.umd.edu/description.html
- Sibawayhi Project http://www.hf.uio.no/east/sibawayhi/HomePage/