\documentclass[a4paper]{scrartcl}
\usepackage{amssymb, amsmath} % needed for math
\usepackage[utf8]{inputenc} % needed for umlauts in the source
\usepackage[english]{babel} % English hyphenation and language settings
\usepackage[T1]{fontenc} % needed for correct output of umlauts in the PDF
\usepackage[margin=2.5cm]{geometry} % layout
\usepackage{hyperref} % links in the text
\usepackage{color}
\usepackage{framed}
\usepackage{enumerate} % for advanced numbering of lists
\usepackage{csquotes}
\usepackage{ifxetex,ifluatex}
\usepackage{etoolbox}
\usepackage[svgnames]{xcolor}
\usepackage{tikz}
\usepackage{parskip}
\usepackage{cite}
\usepackage{mystyle}
\clubpenalty = 10000 % prevent orphans (club lines)
\widowpenalty = 10000 % prevent widows
\hypersetup{
  pdfauthor = {Martin Thoma},
  pdfkeywords = {Bachelor proposal: },
  pdftitle = {Bachelor proposal}
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{document}
\title{Proposal for a Bachelor of Science Thesis:\\Recognition of Mathematical Formulae in the Context of Lecture Translation}
\author{Martin Thoma}
\maketitle
\section{The problem background}
The KIT Lecture Translator, CMUSphinx, Android voice typing and
many other speech recognition systems have proven that it is possible to
recognize speech. But at the moment, there does not seem to be a single
system that manages to recognize mathematics spoken in natural
language. For example, a term like
\[\sum_{n=1}^\infty \frac{1}{n^2} = \frac{\pi^2}{6}\]
would naturally be spoken as
\begin{shadequote}[l]{}
The sum of one divided by n squared for n from one to infinity equals pi squared divided by six.
\end{shadequote}
Today, speech recognition systems only
recognize the words spoken. They do not recognize that it was a
mathematical term which could and should be expressed with symbols.

One way to extend an existing speech recognition system $A$ is to
proceed in the following steps:
\begin{enumerate}
    \item $A$ recognizes speech and returns a text $T$. This text
          has to contain annotations that indicate at which time
          in the original recording the various parts of speech
          were detected.
    \item A math detector parses $T$ and returns the time intervals $I$
          in which math was detected.
    \item A math parser tries to parse the speech in $I$. This parser
          can make use of a language model dedicated to math. It
          returns weighted hypotheses for the terms that might have
          been spoken.
    \item Finally, a program compares the hypotheses with math
          in a formula database. Many formulas might already have been
          written in \TeX{}, e.g. on Wikipedia, math.stackexchange.com
          or in freely available \LaTeX{} / \TeX{} files.
\end{enumerate}
\break
\section{The problem statement}
The bachelor's thesis at KIT is worth 15 ECTS. It should be
completed within 4 months and take at most 450 hours.

The aim of this bachelor's thesis is to answer the following
questions:
\begin{itemize}
    \item \textbf{Representation of math:} How can math be expressed
          for speech recognition in a textual way?
          In particular:
          \begin{itemize}
              \item What reasons are there to use \TeX{}, and which
                    reasons are there to use MathML?
              \item Are there alternatives?
          \end{itemize}
    \item \textbf{Detection:} How can parts of speech that contain
          math be detected?
          \begin{itemize}
              \item Which keywords indicate mathematics?
              \item Is a keyword-density based approach sufficient?
          \end{itemize}
    \item \textbf{Evaluation of math recognition strength:}
          \begin{itemize}
              \item How can speech recognition systems be evaluated
                    for their strength in math recognition?
              \item Is the \textbf{W}ord \textbf{E}rror \textbf{R}ate
                    a suitable measure of how well the recognition worked?
          \end{itemize}
    \item \textbf{Literature research:}
          \begin{itemize}
              \item Can \TeX{} be used as a grammar to recognize math speech?
              \item Can MathML be used as a grammar to recognize math speech?
          \end{itemize}
\end{itemize}
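
For reference, the Word Error Rate named in the evaluation question is
commonly defined by aligning the recognized word sequence with a reference
transcript and counting the edit operations. The symbol names below are
chosen for this proposal only:
\[\mathrm{WER} = \frac{N_{\mathrm{sub}} + N_{\mathrm{del}} + N_{\mathrm{ins}}}{N_{\mathrm{ref}}}\]
where $N_{\mathrm{sub}}$, $N_{\mathrm{del}}$ and $N_{\mathrm{ins}}$ are the
numbers of substituted, deleted and inserted words and $N_{\mathrm{ref}}$ is
the number of words in the reference. As an illustration with made-up
numbers, a reference of 20 words that is recognized with one substitution
and one deletion gives
\[\mathrm{WER} = \frac{1 + 1 + 0}{20} = 10\,\%.\]
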
Follow-up tasks that will not be part of this bachelor's thesis
include:
\begin{itemize}
    \item \textbf{Other languages:} This thesis will focus on math
          recognition for the English language. Follow-up work might
          try to deal with math independent of the language.
    \item \textbf{Implementation:} The aim of this thesis is not
          to create a working math recognition system.
\end{itemize}
\section{Significance}
This thesis will create a basis for follow-up work on recognizing speech
that contains mathematical content. It will enable people to evaluate
various speech2math recognition ideas. It will also give an overview
of the current state of the art in math speech recognition and of the
questions that need to be tackled in the future.
\section{Time schedule}
\begin{itemize}
    \item[10h] Researching ways to represent math
    \item[20h] Researching how \TeX{} deals with math
    \item[20h] Researching how MathML deals with math
    \item[50h] Recording math lectures
    \item[100h] Annotating math lectures; writing the best
                representation for the mathematical terms contained in
                these lectures
    \item[10h] Finding keywords that indicate mathematical formulas
    \item[5h] Testing the keyword approach with the annotated lectures
\end{itemize}
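
Summed up, the items listed above account for
\[10 + 20 + 20 + 50 + 100 + 10 + 5 = 215\]
hours. Of the 450 hours named in the problem statement,
$450 - 215 = 235$ hours are therefore not broken down in this list.
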
\renewcommand\refname{Related Literature}
\nocite{*}
\bibliographystyle{itmalpha}
\bibliography{literatur}
\section{Hypotheses}
I think that MathML will be the best way to represent math, because
it was designed to do this. MathML~3.0, the most recent version,
has been a W3C Recommendation since October 2010.
\TeX{}, in contrast, is great at rendering mathematical equations,
but it grew over time. It existed even before the web was invented.

Another reason why I think MathML might be preferable for internal
representation is that it was created to be parsed and written by
machines. It is an XML-based standard, so XML tools
and libraries can be used to parse it. \TeX{}, on the other hand, was created
to be written by humans.

I am fairly sure that it is hopeless to create a grammar for math
in its general form. But for some areas like Boolean logic, arithmetic
or analysis it might work quite well.
\end{document}