Text Detection in Document Images: Highlight on using FAST algorithm

Authors

  • Geetika Mathur
  • Ms. Suneetha Rikhari

Keywords:

Corner point, FAST (Features from Accelerated Segment Test), OCR, multilingual documents, handwritten documents

Abstract

In recent years, text extraction from document images has become one of the most widely studied topics in image analysis and optical character recognition (OCR). The extracted text can be used for document analysis, content analysis, document retrieval and more. Many complex text extraction techniques, such as Maximum Likelihood (ML), edge point detection and corner point detection, are used to extract text from document images. In this article, the corner point approach is used, in a very simple form based on the FAST algorithm. First, we divide the image into blocks and check the density of corner points in each block. The denser blocks are labeled as text blocks, and the less dense blocks are treated as image regions or noise. We then check the connectivity of the text blocks and group them so that the text part can be isolated from the rest of the image. The method is fast and versatile: it can be used to detect text in various languages, handwriting, and even images with a lot of noise and blur. Although the program is very simple, the precision of the method is close to or above 90%. In conclusion, this method enables more accurate and less complex detection of text in document images.
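
The block-density and connectivity steps described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names `block_density` and `group_text_blocks`, the block size of 16 pixels, the `min_corners` threshold and the use of 4-connectivity are all assumptions made for the example. The corner points are assumed to come from a FAST detector (for instance, the keypoints returned by OpenCV's `cv2.FastFeatureDetector_create().detect(...)`), passed in as plain `(x, y)` pairs.

```python
from collections import deque


def block_density(corners, width, height, block=16):
    """Count corner points per block; returns a rows x cols grid of counts.

    `block` (block side in pixels) is an assumed value, not from the article.
    """
    cols = (width + block - 1) // block
    rows = (height + block - 1) // block
    grid = [[0] * cols for _ in range(rows)]
    for x, y in corners:
        if 0 <= x < width and 0 <= y < height:
            grid[int(y) // block][int(x) // block] += 1
    return grid


def group_text_blocks(grid, min_corners=4):
    """Group dense blocks into connected regions.

    A block counts as "text" when it holds at least `min_corners` corner
    points (an assumed threshold). Text blocks that touch horizontally or
    vertically are merged. Returns one bounding box per region, in block
    units, as (row0, col0, row1, col1) with inclusive bounds.
    """
    if not grid:
        return []
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or grid[r][c] < min_corners:
                continue
            # Breadth-first search over neighbouring dense blocks.
            queue = deque([(r, c)])
            seen[r][c] = True
            r0, c0, r1, c1 = r, c, r, c
            while queue:
                cr, cc = queue.popleft()
                r0, c0 = min(r0, cr), min(c0, cc)
                r1, c1 = max(r1, cr), max(c1, cc)
                for nr, nc in ((cr + 1, cc), (cr - 1, cc),
                               (cr, cc + 1), (cr, cc - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc]
                            and grid[nr][nc] >= min_corners):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            regions.append((r0, c0, r1, c1))
    return regions
```

Multiplying a returned box by the block size gives pixel coordinates for the candidate text region. In practice the block size and density threshold would need tuning to the image resolution and font size.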

Published

2020-12-03

Section

Articles

How to Cite

Mathur, G., & Rikhari, M. S. (2020). Text Detection in Document Images: Highlight on using FAST algorithm. International Journal of Advanced Engineering Research and Science, 4(3). https://journal-repository.com/index.php/ijaers/article/view/2818