Segmenting handwritten document images into text lines and words is an essential step for optical character recognition (OCR) and, indirectly, for identifying the writer of a handwritten document. The task is challenging because the features of handwriting are irregular and vary from person to person. To address it, we formulate word segmentation as a binary quadratic assignment problem that considers both the properties of individual gaps in the text and the pairwise correlations between gaps. We estimate all parameters within the structured SVM (support vector machine) framework, so that the proposed method works well across different writing styles and written languages without user-defined parameters.
Handwritten document, word segmentation, SVM, binary quadratic assignment, text lines
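To make the binary quadratic assignment formulation concrete, the following is a minimal sketch, not the method as implemented in this work. It assumes each gap in a text line receives a binary label (0 for a gap inside a word, 1 for a gap between words) and that the labeling minimizes a sum of per-gap costs and pairwise costs. The function name `segment_gaps`, the cost tables, and all numeric values are hypothetical. In the approach described above such weights would be learned with a structured SVM, whereas here they are hand-picked, and exhaustive search stands in for whatever solver is actually used.

```python
from itertools import product


def segment_gaps(unary, pairwise):
    """Label each gap 0 (intra-word) or 1 (inter-word).

    Minimizes  sum_i unary[i][y_i] + sum_{(i, j)} pairwise[(i, j)][y_i][y_j]
    by brute force, which is only practical for a handful of gaps.
    """
    n = len(unary)
    best_labels, best_cost = None, float("inf")
    for labels in product((0, 1), repeat=n):
        # Cost of each gap taken on its own.
        cost = sum(unary[i][labels[i]] for i in range(n))
        # Cost of each pair of gaps taken together.
        cost += sum(table[labels[i]][labels[j]]
                    for (i, j), table in pairwise.items())
        if cost < best_cost:
            best_labels, best_cost = labels, cost
    return list(best_labels), best_cost


# Toy text line with three gaps; all numbers are made up for illustration.
unary = [[0.2, 1.0],   # gap 0: narrow, probably inside a word
         [1.0, 0.1],   # gap 1: wide, probably between words
         [0.6, 0.5]]   # gap 2: ambiguous on its own
# Hypothetical pairwise term: gaps 1 and 2 look alike, so penalize
# giving them different labels.
pairwise = {(1, 2): [[0.0, 0.3], [0.3, 0.0]]}

labels, cost = segment_gaps(unary, pairwise)
print(labels)  # [0, 1, 1]
```

In this toy case the pairwise term reinforces the label of the ambiguous third gap: the penalty for disagreeing with its similar neighbor widens the margin in favor of the inter-word label. This is the kind of correlation that a purely per-gap threshold cannot express.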