New methods, techniques and applications for sketch recognition
Abstract
The use of diagrams is common in various disciplines. Typical examples
include maps, line graphs, bar charts, engineering blueprints, architects’
sketches, and hand-drawn schematics. In general, diagrams can be created
either by using pen and paper, or by using specific computer programs. These
programs provide functions to facilitate the creation of the diagram, such as
copy-and-paste, but the classic WIMP interfaces they use are unnatural when
compared to pen and paper. Indeed, designers often prefer to use pen and
paper at the beginning of the design process and to transfer the diagram to
the computer later.
To avoid this double step, one solution is to allow users to sketch directly
on the computer. This requires both specific hardware and software based on
sketch recognition. As for hardware, many pen- and touch-based devices, such
as tablets, smartphones, and interactive boards and tables, are available
today at reasonable cost. Sketch recognition is needed whenever the sketch
must be processed rather than treated as a simple image, and it is crucial to
the success of this new modality of interaction. It is a difficult problem
because of the inherent imprecision and ambiguity of freehand drawing and the
many application domains involved. The aim of this thesis is to propose new
methods and applications for sketch recognition. The results are presented as
several contributions, which address corner detection, sketched symbol
recognition and autocompletion, graphical context detection, and sketched
Euler diagram interpretation.
The first contribution addresses the detection of corners in a stroke.
Corner detection is often performed during preprocessing to segment a stroke
into simple geometric primitives such as lines or curves. The corner
recognizer proposed in this thesis, RankFrag, is inspired by the method
proposed by Ouyang and Davis in 2011 and achieves higher accuracy than other
methods recently proposed in the literature.
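As a rough illustration of the segmentation step described above, the
following Python sketch marks the points where the stroke direction changes
sharply and splits the stroke there. It is neither RankFrag nor the Ouyang and
Davis method: the turning-angle heuristic, the function names, and the window
and min_angle parameters are assumptions chosen only for illustration.

```python
import math


def turning_angle(p_prev, p, p_next):
    """Absolute change of direction (radians) at p between the two chords."""
    a1 = math.atan2(p[1] - p_prev[1], p[0] - p_prev[0])
    a2 = math.atan2(p_next[1] - p[1], p_next[0] - p[0])
    d = abs(a2 - a1)
    return min(d, 2 * math.pi - d)


def detect_corners(stroke, window=2, min_angle=math.radians(60)):
    """Return indices of candidate corners; both endpoints are always included.

    stroke is a list of (x, y) points with at least two entries.
    window and min_angle are illustrative values, not tuned parameters.
    """
    n = len(stroke)
    angles = [0.0] * n
    for i in range(window, n - window):
        angles[i] = turning_angle(stroke[i - window], stroke[i], stroke[i + window])

    corners = [0]
    i = window
    while i < n - window:
        if angles[i] >= min_angle:
            # Collapse a run of consecutive above-threshold points
            # into the single point with the sharpest turn.
            j = i
            while j + 1 < n - window and angles[j + 1] >= min_angle:
                j += 1
            corners.append(max(range(i, j + 1), key=lambda k: angles[k]))
            i = j + 1
        else:
            i += 1
    corners.append(n - 1)
    return corners


def split_at_corners(stroke, corners):
    """Cut the stroke into sub-strokes that share their corner points."""
    return [stroke[a:b + 1] for a, b in zip(corners, corners[1:])]


# An L-shaped stroke: a horizontal run followed by a vertical run.
l_stroke = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0),
            (4, 1), (4, 2), (4, 3), (4, 4)]
l_corners = detect_corners(l_stroke)
l_segments = split_at_corners(l_stroke, l_corners)
```

On the L-shaped stroke, the only interior corner found is the point where the
two runs meet, so the stroke is split into two straight segments. A threshold
heuristic of this kind is sensitive to noise and sampling density, which is
one reason more elaborate corner recognizers exist.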
The second contribution is a new method to recognize multi-stroke
hand-drawn symbols, which is invariant with respect to scaling and supports
symbol recognition independently of the number and order of strokes. The
method adapts the algorithm proposed by Belongie et al. in 2002 to the case
of sketched images by using stroke-related information. The method has been
evaluated on a set of more than 100 symbols from the Military Course of
Action domain, and the results show that the new recognizer outperforms the
original one.
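To make the scale invariance concrete, the sketch below computes a log-polar
histogram of point positions in the style of the shape context descriptor,
assuming that this is the 2002 algorithm by Belongie et al. referred to above.
It covers only the basic descriptor, not the stroke-related adaptation
proposed in the thesis. The function names, bin counts, and radius limits are
illustrative assumptions.

```python
import math


def mean_pairwise_distance(points):
    """Average distance over all point pairs; used as the scale reference."""
    total, count = 0.0, 0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            total += math.dist(points[i], points[j])
            count += 1
    return total / count if total > 0 else 1.0


def shape_context(points, index, n_radial=5, n_angular=12,
                  r_min=0.125, r_max=2.0):
    """Log-polar histogram of all other points relative to points[index].

    Distances are divided by the mean pairwise distance, so a uniformly
    scaled copy of the point set yields the same histogram.
    Simplification: points closer than r_min or farther than r_max are
    clamped into the innermost or outermost ring instead of being dropped.
    """
    px, py = points[index]
    scale = mean_pairwise_distance(points)
    log_min, log_max = math.log(r_min), math.log(r_max)
    hist = [[0] * n_angular for _ in range(n_radial)]
    for k, (qx, qy) in enumerate(points):
        if k == index:
            continue
        dx, dy = qx - px, qy - py
        r = math.hypot(dx, dy) / scale
        if r == 0:
            continue  # coincident point carries no direction information
        t = (math.log(r) - log_min) / (log_max - log_min)
        r_bin = min(max(int(t * n_radial), 0), n_radial - 1)
        theta = math.atan2(dy, dx) % (2 * math.pi)
        a_bin = min(int(theta / (2 * math.pi) * n_angular), n_angular - 1)
        hist[r_bin][a_bin] += 1
    return hist


sample = [(0, 0), (1, 0), (2, 1), (0, 2), (3, 3)]
doubled = [(2 * x, 2 * y) for x, y in sample]
h_sample = shape_context(sample, 0)
h_doubled = shape_context(doubled, 0)
```

Here h_sample and h_doubled describe the same reference point in the original
and in a doubled copy of the point set. Because the histogram depends only on
the point set and not on how the points were produced, a descriptor of this
kind is also unaffected by the number and order of strokes.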
The third contribution is a new method for recognizing partially drawn
multi-stroke hand-drawn symbols, which is invariant with respect to scale and
supports symbol recognition independently of the number and order of
strokes. The recognition technique is based on subgraph isomorphism and
exploits a novel spatial descriptor, based on polar histograms, to represent
relations between two stroke primitives. The tests show that the approach
gives a satisfactory recognition rate with partially drawn symbols, even at
a very low level of drawing completion, and outperforms the existing
approaches proposed in the literature. Furthermore, as an application, a
system has been developed that provides a user interface for drawing symbols
and implements the proposed autocompletion approach. Moreover, a user study
evaluating human performance in hand-drawn symbol autocompletion is
presented. Using the set of symbols from the Military Course of Action
domain, the user study evaluates the conditions under which users are willing
to exploit the autocompletion functionality and those under which they can
use it efficiently. The results show that the autocompletion functionality
can be used profitably, with a drawing time saving of about 18%.
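The abstract does not define the polar-histogram descriptor used to relate
two stroke primitives. The following Python sketch shows one plausible
reading, offered purely as an assumption: a normalised histogram of the
directions from sample points of one primitive to sample points of the other.
The function name, the bin count, and the point-sampling representation are
all hypothetical.

```python
import math


def polar_relation_histogram(prim_a, prim_b, n_bins=8):
    """Normalised histogram of directions from points of prim_a to prim_b.

    Each primitive is a list of (x, y) sample points. Only directions are
    counted, so scaling both primitives by the same factor leaves the
    histogram unchanged.
    """
    hist = [0.0] * n_bins
    for ax, ay in prim_a:
        for bx, by in prim_b:
            dx, dy = bx - ax, by - ay
            if dx == 0 and dy == 0:
                continue  # coincident points define no direction
            theta = math.atan2(dy, dx) % (2 * math.pi)
            hist[min(int(theta / (2 * math.pi) * n_bins), n_bins - 1)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# A short horizontal primitive and a second one placed above it.
lower = [(0, 0), (1, 0)]
upper = [(0.5, 2), (1.5, 2)]
relation = polar_relation_histogram(lower, upper)
```

In this example all the mass falls into the bins that cover upward
directions, which encodes that the second primitive lies above the first. In
a graph-based recognizer, a descriptor of this kind could label the edges
between primitives that a subgraph isomorphism search then has to match.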
The fourth contribution concerns the detection of the graphical context of
hand-drawn symbols and, in particular, the development of an approach for
identifying attachment areas on sketched symbols. In the field of syntactic
recognition of hand-drawn visual languages, recognizing the relations among
graphical symbols is one of the first important tasks to be accomplished and
is usually reduced to recognizing the attachment areas of each symbol and
the relations among them. The approach is independent of the method used to
recognize symbols and assumes that the symbol has already been recognized.
The approach is evaluated through a user study that compares the attachment
areas detected by the system with those devised by the users. The results
show that the system can identify attachment areas with reasonable
accuracy.
The last contribution is EulerSketch, an interactive system for sketching
and interpreting Euler diagrams (EDs). The interpretation of a hand-drawn ED
produces two types of textual encoding of the ED topology, called static code
and ordered Gauss paragraph (OGP) code, and a further encoding of its
regions. Given the topology of an ED expressed through static or OGP
code, EulerSketch automatically generates a new topologically equivalent ED
in its graphical representation. [edited by author]