Text extraction method for historical Tibetan document images based on block projections
Author:
Affiliation:

1. Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
2. Beijing Key Laboratory on Integration and Analysis of Large-scale Stream Data, Beijing University of Technology, Beijing 100124, China
3. Beijing Key Laboratory of Trusted Computing, Beijing University of Technology, Beijing 100124, China

    Abstract:

    Text extraction is an important first step in digitizing historical documents. In this paper, we present a text extraction method for historical Tibetan document images based on block projections. Text extraction is treated as a text area detection and localization problem. Each image is divided into equal-sized blocks, and the blocks are filtered using the categories of their connected components and their corner point density. Analyzing the projections of the filtered blocks locates the approximate text areas, from which the text regions are then extracted. Experiments on a dataset of historical Tibetan documents demonstrate the effectiveness of the proposed method.
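
The block-and-projection idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no parameters, so the block size, the ink-density thresholds, and the function names `block_mask` and `locate_text_area` are all assumptions made for illustration. In particular, a simple ink-density test stands in for the paper's filter based on connected-component categories and corner point density.

```python
import numpy as np


def block_mask(binary, block, min_density=0.05, max_density=0.6):
    """Split a binary image (1 = ink) into equal blocks and keep plausible text blocks.

    Assumption: ink density replaces the paper's connected-component and
    corner-point-density criteria, which the abstract does not specify.
    """
    rows, cols = binary.shape[0] // block, binary.shape[1] // block
    # Crop to a whole number of blocks, then view as (rows, block, cols, block).
    cropped = binary[: rows * block, : cols * block]
    density = cropped.reshape(rows, block, cols, block).mean(axis=(1, 3))
    return (density >= min_density) & (density <= max_density)


def locate_text_area(mask, block, min_fraction=0.5):
    """Project the kept blocks onto both axes and return a pixel bounding box.

    Returns (top, bottom, left, right), or None if no block survived filtering.
    """
    row_profile = mask.sum(axis=1)  # kept blocks per block-row
    col_profile = mask.sum(axis=0)  # kept blocks per block-column
    if row_profile.max() == 0:
        return None
    rows = np.flatnonzero(row_profile >= min_fraction * row_profile.max())
    cols = np.flatnonzero(col_profile >= min_fraction * col_profile.max())
    return (
        int(rows[0]) * block,
        int(rows[-1] + 1) * block,
        int(cols[0]) * block,
        int(cols[-1] + 1) * block,
    )


# Synthetic page: a sparse "text" area, a solid dark band, and one noise pixel.
page = np.zeros((80, 120), dtype=np.uint8)
page[20:60:2, 30:90:2] = 1  # text-like region, ink density 0.25 per block
page[0:10, :] = 1           # solid band, rejected as too dense
page[70, 5] = 1             # isolated speck, rejected as too sparse

box = locate_text_area(block_mask(page, block=10), block=10)
print(box)  # (20, 60, 30, 90)
```

On real scans the density test would need to be replaced by the filtering criteria described in the paper, and a page with several text areas would require splitting the projection profiles at their gaps rather than taking a single bounding box.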

Get Citation

DUAN Li-juan, ZHANG Xi-qun, MA Long-long, WU Jian. Text extraction method for historical Tibetan document images based on block projections[J]. Optoelectronics Letters, 2017, 13(6): 457-461

History
  • Received: August 30, 2017
  • Revised: September 18, 2017
  • Online: November 17, 2017