|
@@ -0,0 +1,249 @@
|
|
|
+\documentclass[a4paper]{scrartcl}
|
|
|
+\usepackage{amssymb, amsmath} % needed for math
|
|
|
+\usepackage[utf8]{inputenc} % this is needed for umlauts
|
|
|
+\usepackage[english]{babel} % language-specific hyphenation and strings
|
|
|
+\usepackage[T1]{fontenc} % this is needed for correct output of umlauts in pdf
|
|
|
+\usepackage[margin=2.5cm]{geometry} %layout
|
|
|
+\usepackage{hyperref} % links in the text
|
|
|
+\usepackage{color}
|
|
|
+\usepackage{framed}
|
|
|
+\usepackage{enumerate} % for advanced numbering of lists
|
|
|
+\usepackage{csquotes}
|
|
|
+\usepackage{ifxetex,ifluatex}
|
|
|
+\usepackage{etoolbox}
|
|
|
+\usepackage[svgnames]{xcolor}
|
|
|
+\usepackage{tikz}
|
|
|
+\usepackage{framed}
|
|
|
+\usepackage{parskip}
|
|
|
+\usepackage{cite}
|
|
|
+\usepackage{fancyref}
|
|
|
+\usepackage{mystyle}
|
|
|
+\clubpenalty = 10000 % prevent orphans
|
|
|
+\widowpenalty = 10000 % prevent widows
|
|
|
+
|
|
|
+\hypersetup{
|
|
|
+ pdfauthor = {Martin Thoma},
|
|
|
+ pdfkeywords = {Bachelor proposal, LaTeX, handwriting recognition},
|
|
|
+ pdftitle = {Proposal for a Bachelor of Science Thesis: Interactive on-line handwriting recognition of mathematical formulae}
|
|
|
+}
|
|
|
+
|
|
|
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
|
+
|
|
|
+\begin{document}
|
|
|
+ \title{Proposal for a Bachelor of Science Thesis:\\Interactive on-line handwriting recognition of mathematical formulae}
|
|
|
+ \author{Martin Thoma}
|
|
|
+ \maketitle
|
|
|
+\section{The problem background}
|
|
|
+ There are people who don't know how to write even
|
|
|
+ simple mathematical formulae with \LaTeX{} like
|
|
|
+ \[\pi/\alpha=\sum_{n=-\infty}^\infty \frac{\sin^2 (c+n)\alpha}{(c+n)^2}=\int_{-\infty}^\infty \frac{\sin^2 (c+n)\alpha}{(c+n)^2}\, \text{d}n\]
|
|
|
+ or who need much time to do so. Currently, there are several online
|
|
|
+ services, programs and apps that help to write mathematical
|
|
|
+ formulae, but all programs I know have serious disadvantages:
|
|
|
+ \begin{itemize}
|
|
|
+ \item \href{http://detexify.kirelabs.org/classify.html}{detexify.kirelabs.org}
|
|
|
+ recognizes \textbf{only symbols},
|
|
|
+ \item the formula editor of LibreOffice Writer 3.6 as shown
|
|
|
+ in \Fref{fig:libre-office-3.6} offers some
|
|
|
+ guidance by grouping common operations while showing
|
|
|
+ a WYSIWYG editor, but it has \textbf{no handwriting recognition}.
|
|
|
+ Another drawback is the fact that it is \textbf{not available
|
|
|
+ as an online service}, so you have to install LibreOffice
|
|
|
+ which might not be possible on all devices.
|
|
|
+ \item The \enquote{Daum Equation Editor} (see \Fref{fig:daum-editor}) is available online
|
|
|
+ and offers guidance through the creation of equations,
|
|
|
+ but does not offer handwriting recognition. Although
|
|
|
+ it might be OpenSource, the \textbf{source code is difficult to
|
|
|
+ find}. This means that if you want to improve the editor,
|
|
|
+ it is not possible. It also makes use of Adobe Flash
|
|
|
+ which is not available on many smartphones and tablet
|
|
|
+ computers.
|
|
|
+ \item Maple seems to offer handwritten symbol recognition (\href{http://www.maplesoft.com/products/maple/features/handwritten.aspx}{source}),
|
|
|
+ but on the one hand I was not able to test that, because
|
|
|
+ it is \textbf{not available for free}. On the other hand you
|
|
|
+ have to install additional software, it seems not to be
|
|
|
+ available for tablet computers and it only recognizes
|
|
|
+ single symbols.
|
|
|
+ \item Wolfram Mathematica seems to be able to do complete
|
|
|
+ formula recognition at least for simple formulae (\href{http://reference.wolfram.com/mathematica/tutorial/HandwrittenMathRecognition.html}{source})
|
|
|
+ by using Microsoft's \href{http://windows.microsoft.com/en-ph/windows7/use-math-input-panel-to-write-and-correct-math-equations}{Math Input Panel},
|
|
|
+ but this is neither OpenSource nor available as an
|
|
|
+ online service. Additionally, it is not
|
|
|
+ available for Linux systems, so I can't test it.
|
|
|
+ \end{itemize}
|
|
|
+
|
|
|
+ A more comprehensive list can be found at \href{https://en.wikipedia.org/wiki/Formula_editor}{https://en.wikipedia.org/wiki/Formula\_editor}.
|
|
|
+ A problem of some of the projects presented there is that they
|
|
|
+ require the client to execute Java applets, which is a security
|
|
|
+ risk.
|
|
|
+
|
|
|
+ \begin{figure}[h]
|
|
|
+ \centering
|
|
|
+ \includegraphics*[width=5cm, keepaspectratio]{figures/libreoffice-writer.png}
|
|
|
+ \caption{LibreOffice Writer 3.6 - Formula Editor}
|
|
|
+ \label{fig:libre-office-3.6}
|
|
|
+ \end{figure}
|
|
|
+
|
|
|
+ \begin{figure}[h]
|
|
|
+ \centering
|
|
|
+ \includegraphics*[width=15cm, keepaspectratio]{figures/daum-editor.png}
|
|
|
+ \caption{Daum Equation Editor}
|
|
|
+ \label{fig:daum-editor}
|
|
|
+ \end{figure}
|
|
|
+\break
|
|
|
+\section{The problem statement}
|
|
|
+ What I would like to have is an interactive on-line handwriting
|
|
|
+ recognition service that is available as a web service and makes
|
|
|
+ use of touchscreens. Additionally, it should be free and
|
|
|
+ OpenSource, and the source code should be easy to find and documented.
|
|
|
+ This means:
|
|
|
+ \begin{itemize}
|
|
|
+ \item \textbf{Service}: The program can be accessed over the web, so
|
|
|
+ that the user only needs a modern browser.
|
|
|
+ As a consequence, the software could be used with any
|
|
|
+ device that has a touch screen.
|
|
|
+ \item \textbf{On-line handwriting recognition}: The service
|
|
|
+ starts recognizing while the user enters a formula.
|
|
|
+ \item \textbf{Interactive}: The service offers symbols and constructs
|
|
|
+ to the user before the user starts typing. These suggestions
|
|
|
+ might change depending on what the user has typed before.
|
|
|
+ \item \textbf{OpenSource}: Any license in this list: \href{http://opensource.org/licenses}{http://opensource.org/licenses}
|
|
|
+ \item \textbf{Easy to find}: Ideally, the project should have
|
|
|
+ its own domain that contains the source code, the service
|
|
|
+ and documentation. But it might be enough to provide
|
|
|
+ a developer's email address at the top of
|
|
|
+ the source code of the delivered HTML document.
|
|
|
+ \end{itemize}
|
|
|
+
|
|
|
+ This service should also encourage the users by techniques
|
|
|
+ of \enquote{gamification} to give as much
|
|
|
+ meta information about their formulae as possible:
|
|
|
+ \begin{itemize}
|
|
|
+ \item Which problem domain does the formula belong to, e.~g. \enquote{Euclidean geometry}, \enquote{analysis} or \enquote{calculus}?
|
|
|
+ \item Does the formula itself have a name, e.~g. \enquote{Pythagorean theorem}, \enquote{Fibonacci numbers} or \enquote{geometric series}?
|
|
|
+ \end{itemize}
|
|
|
+
|
|
|
+ This information should be used to create a formula database.
|
|
|
+
|
|
|
+\section{Significance}
|
|
|
+For me as a Linux user, there is no software that I can test and which
|
|
|
+offers on-line, interactive math handwriting recognition. But the
|
|
|
+need for such software exists.
|
|
|
+
|
|
|
+But there are more reasons why this bachelor's thesis matters:
|
|
|
+Projects like \LaTeX{}, Linux, Apache or Firefox have shown that
|
|
|
+OpenSource software can enrich the development in specific areas. The
|
|
|
+\enquote{Browser Wars} might be the most famous result of an active
|
|
|
+OpenSource community. Internet Explorer 6 had
|
|
|
+a market share of over 80\% in 2003. Predecessors of Firefox and the Mozilla
|
|
|
+Foundation already existed, but Firefox 1.0 was not released until
|
|
|
+November 2004. After that, Firefox and other open browsers added many
|
|
|
+features that Internet Explorer had to compete with, like tabbed browsing,
|
|
|
+HTML4 standard conformance, support of the \texttt{<canvas>} tag and
|
|
|
+speed of HTML rendering and JavaScript execution.\footnote{\href{http://www.evolutionoftheweb.com/}{www.evolutionoftheweb.com} offers a graphical overview. Note that support for standards like HTML4 or CSS~2 does not arrive with a single version, but is rather an incremental process.} Some of these
|
|
|
+features raise questions that are interesting for science, such as many problems related
|
|
|
+to layouts and just-in-time compilation (JIT). With OpenSource software
|
|
|
+that makes it easy to find its source and offers good documentation,
|
|
|
+researchers can simply try their ideas without being blocked by
|
|
|
+having to obtain access to the source code first.
|
|
|
+
|
|
|
+Additionally, such a project might give researchers more time to
|
|
|
+concentrate on the tasks they really want to do rather than spending
|
|
|
+hours learning \LaTeX{}.
|
|
|
+
|
|
|
+One last reason why this thesis matters is the formula database that
|
|
|
+gets created by users. This database might be used in follow-up work,
|
|
|
+e.~g. a formula spotter for presentations or a math detector for speech.
|
|
|
+
|
|
|
+\section{Time schedule}
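+As a quick plausibility check, the hour estimates in the following list
+can be summed up in the order in which they appear:
+\[70 + 5 + 20 + 60 + 20 + 10 + 60 + 30 + 70 + 50 + 45 + 10 = 450\]
+So the schedule covers 450 hours in total.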
|
|
|
+\begin{itemize}
|
|
|
+ \item[70h] Literature research about on-line handwriting recognition
|
|
|
+ techniques and gamification.
|
|
|
+ \item[5h] Defining browsers and devices that should get supported
|
|
|
+ and required client-side technologies like HTML5, CSS~3
|
|
|
+ and ECMAScript (better known as JavaScript). Also,
|
|
|
+ required input methods like touchscreens and styluses
|
|
|
+ should be mentioned.
|
|
|
+ \item[20h] Writing use cases. This includes writing example
|
|
|
+ formulae that the user should type and the system should
|
|
|
+ be able to recognize; finding people with different
|
|
|
+ knowledge of \LaTeX{} and from different fields who
|
|
|
+ want to participate in user tests.
|
|
|
+ \item[60h] Implementing the core of the application: Handwriting
|
|
|
+ recognition of digits and symbols by using only
|
|
|
+ HTML, CSS and JavaScript on the client side. This includes implementing
|
|
|
+ a way for the user to enter new symbols and to correct the
|
|
|
+ symbol that was suggested by the recognition system.
|
|
|
+ \item[20h] Introduce testers that already know \LaTeX{} to the
|
|
|
+ current system. At this point, the system only performs
|
|
|
+ symbol recognition. The testers should train it and
|
|
|
+ insert symbols like $a-z, A-Z, 0-9, \alpha-\omega, A-\Omega, \cdot, \circ, \dots$.
|
|
|
+ \item[10h] Get feedback from the users. This feedback will not be included
|
|
|
+ in the thesis, but the improvements will get documented.
|
|
|
+ \item[60h] Finding structures and ways to enter them. Examples
|
|
|
+ of structures that can be nested are sums:
|
|
|
+ \begin{verbatim}\sum_{<some structure>}^{<another structure>} <a third structure>\end{verbatim}
|
|
|
+ Implement the recognition of those structures.
|
|
|
+ \item[30h] Observe \enquote{fresh} testers while they try to use
|
|
|
+ the system.
|
|
|
+ \item[70h] Improving the software to fix problems that were found
|
|
|
+ in user tests.
|
|
|
+ \item[50h] Fix bugs, improve code quality and readability as well
|
|
|
+ as documentation.
|
|
|
+ \item[45h] Usability testing: Try hallway testing. The results
|
|
|
+ of these tests will be documented and will be part of the
|
|
|
+ bachelor's thesis. If possible, I would like
|
|
|
+ to let the testers use their own devices.
|
|
|
+ \item[10h] Mentioning open questions and ideas on how they could be
|
|
|
+ analyzed with the service that was created.
|
|
|
+\end{itemize}
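+
+As a sketch of what \enquote{nested} means for the sum structure mentioned
+in the schedule above, each of the three placeholders can itself be filled
+with a structure. The concrete formula is only an illustrative assumption
+and is not taken from the planned test formulae: the upper bound is a
+fraction and the summand is a square root.
+\begin{verbatim}
+\sum_{i=1}^{\frac{n}{2}} \sqrt{i}
+\end{verbatim}
+This input is typeset as
+\[\sum_{i=1}^{\frac{n}{2}} \sqrt{i}\]
+so a recognizer for such structures would have to detect the fraction
+inside the upper bound and the root inside the summand.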
|
|
|
+
|
|
|
+\section{Outline}
|
|
|
+I have described in which steps I would like to write the software,
|
|
|
+but almost all points include writing the bachelor's thesis document.
|
|
|
+A first draft of the outline could look like this:
|
|
|
+
|
|
|
+\begin{enumerate}
|
|
|
+ \item Introduction
|
|
|
+ \item Definitions
|
|
|
+ \begin{enumerate}
|
|
|
+ \item Hardware: What is available and what is the distribution?
|
|
|
+ \item Software: What is available and what is the distribution?
|
|
|
+ \item Support of standards like HTML, CSS, ECMAScript, Flash, Cookies, \dots
|
|
|
+ \item Choice of hardware, software and standards that should get supported as well as the choice of libraries and the required server-side software
|
|
|
+ \item Application to the domain of math recognition
|
|
|
+ \end{enumerate}
|
|
|
+ \item On-line handwriting recognition techniques
|
|
|
+ \begin{enumerate}
|
|
|
+ \item Description of techniques in general
|
|
|
+ \item Application to the domain of math recognition
|
|
|
+ \end{enumerate}
|
|
|
+ \item Gamification techniques
|
|
|
+ \begin{enumerate}
|
|
|
+ \item Description of techniques in general
|
|
|
+ \item Application to the domain of math recognition in the web
|
|
|
+ \end{enumerate}
|
|
|
+ \item Software Project
|
|
|
+ \begin{enumerate}
|
|
|
+ \item Structure of the code
|
|
|
+ \item Availability of documentation
|
|
|
+ \item Availability of the service
|
|
|
+ \end{enumerate}
|
|
|
+ \item Summary
|
|
|
+ \begin{enumerate}
|
|
|
+ \item Future Work
|
|
|
+ \end{enumerate}
|
|
|
+\end{enumerate}
|
|
|
+\break
|
|
|
+
|
|
|
+\renewcommand\refname{Related Literature}
|
|
|
+\nocite{*}
|
|
|
+\bibliographystyle{itmalpha}
|
|
|
+\bibliography{literatur}
|
|
|
+
|
|
|
+This literature list only contains sources that seem relevant to me
|
|
|
+right now. As I proceed, I might find more useful sources for the different
|
|
|
+topics, so I might add elements to this list, but also remove some.
|
|
|
+Especially for gamification I might read documents from
|
|
|
+\href{http://gamification-research.org/}{gamification-research.org}.
|
|
|
+\end{document}
|