\documentclass{article}
\usepackage{arxiv}
\usepackage[utf8]{inputenc} % allow utf-8 input
\usepackage[T1]{fontenc} % use 8-bit T1 fonts
\usepackage{hyperref} % hyperlinks
\usepackage{url} % simple URL typesetting
\usepackage{booktabs} % professional-quality tables
\usepackage{amsfonts} % blackboard math symbols
\usepackage{nicefrac} % compact symbols for 1/2, etc.
\usepackage{microtype} % microtypography
\usepackage{lipsum}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{kbordermatrix}
\usepackage[
defernumbers=true,
backend=bibtex,
style=numeric,
sorting=none
]{biblatex} % Use the bibtex backend with the numeric citation style
\addbibresource{bibliografie.bib}
\usepackage[section]{placeins}
\usepackage{caption}
\DeclareCaptionFormat{citation}{%
\ifx\captioncitation\relax\relax\else
\captioncitation\par
\fi
#1#2#3\par}
\newcommand*\setcaptioncitation[1]{\def\captioncitation{\textit{Source:}~#1}}
\let\captioncitation\relax
\captionsetup{format=citation,justification=centering,margin=40pt,font=small,labelfont=bf}
\usepackage{graphicx}
\graphicspath{ {images/} }
\title{MACHINE LEARNING BASED METHODS USED FOR IMPROVING SCHOLAR PERFORMANCE}
\author{
Radu BONCEA \\
ICI Bucuresti\\
\texttt{radu@rotld.ro} \\
\And
Ionut PETRE \\
ICI Bucuresti\\
\texttt{ionut.petre@rotld.ro} \\
\AND
Victor VEVERA \\
ICI Bucuresti\\
\texttt{victor.vevera@ici.ro}
\And
Alexandru GHEORGHIȚĂ \\
ICI Bucuresti\\
\texttt{alex.gheorghita@rotld.ro}
}
\begin{document}
\maketitle
\begin{abstract}
In their 1968 study \textit{"The Teaching-Learning Paradox: A Comparative Analysis of College Teaching Methods"}, Robert Dubin and Thomas Taveggia found no evidence to indicate any basis for preferring one teaching method over another as measured by the performance of students on course examinations. The conclusion is based on a systematic reanalysis of the data of almost 100 comparative studies of different college teaching methods. The teaching-learning process is virtually a black box in which the teaching methods do not influence scholar performance. Rather than focusing on teaching methods, we therefore propose a method for improving scholar performance through a continuous and intelligent process of monitoring and assessing daily knowledge gains. Recent developments in machine learning and data analysis allow us to use techniques that unveil the strengths and weaknesses in the learning process. Our proposed solution \textit{micro-assesses} each student after each course: it sends a five-minute mini test to the student's smartphone, collects the result and sends it for analysis to a dedicated platform. Once the system has sufficient data, it can personalize the tests for each student, focusing on areas where the student is lacking, and it can make recommendations to teachers, students, schools and competent authorities at local or national level. The solution is not a replacement for classical examinations; it augments the learning experience through interactive and personalized quizzes. The teacher also gains a better view of students' knowledge and can therefore make better assessments overall. Abnormal \textit{deviations} in a student's performance can be detected much faster, while competent authorities can assess the impact of their decisions in near real time.
\end{abstract}
% keywords can be removed
\keywords{scholar performance \and assessment model \and machine learning \and knowledge gain \and RLO \and IAMA}
\section{Introduction}
\paragraph{}
A study conducted in 1968 concluded that the teaching method has no impact on final exam performance \cite{rdtct1968}, a principle known as the \textit{Teaching-Learning Paradox}. Regardless of the quality of the teacher or the program, regardless of teaching methods and despite the circumstances before them, the students learned and demonstrated their learning consistently across time. In his book "Why Don't Students Like School", Daniel Willingham provides a useful model for how the mind retains factual information (see Figure~\ref{fig:model_of_mind}) and points out that factual information is not sufficient for learning, but is often the basis for real learning, since in order to apply concepts and make creative evaluations, we must have a database of knowledge from which to pull and make connections \cite{danielw}. Thus, to achieve long-term memory, two things need to happen:
\begin{quote}
\textit{1) said data must be subject to an intense level of attention while it is readily available in working memory, and... \\
2) content previously stored in long-term memory must be pulled up and matched with the new data in order for it to permanently \emph{stick}} \cite{andrewneuendorf}.
\end{quote}
\begin{figure}[ht]
\centering
\includegraphics[scale=0.7]{model_of_mind}
\caption{Daniel Willingham's model of mind.}
\label{fig:model_of_mind}
\end{figure}
\paragraph{}
In this article we propose new methods that address the above two challenges and improve scholar performance through means of demonstrating knowledge and understanding, using an \textbf{I}terative (daily) and \textbf{A}daptive \textbf{M}onitoring and \textbf{A}ssessing (IAMA) process that relies on state-of-the-art technologies which unlock the power of \textit{Data Intelligence} \cite{Drazdilova2010, Romero:2010:EDM:1922269.1922270, Wang:2009:DMC:1822686}. One such method implies a process of \textit{micro-assessing} students right after each course, using profiled Reusable Learning Objects (RLO) for assessment \cite{Neven:2002:RLO:641007.641067, JoDI89, SiqSean2003}. The assessment result is input data for generating and consolidating a student profile focused on strengths and weaknesses. The collected data can also be used to monitor student performance and to alert on and react to deviations. \emph{IAMA} can also be read as a play on words, echoing the \emph{subreddit} question-and-answer format \emph{iAMA}: \textit{\textbf{i}nteractive \textbf{A}sk \textbf{M}e \textbf{A}nything}.
\paragraph{}
It should be noted that the whole process is automated. The teacher allocates 5 minutes for each hour of course to assess students, by sending mini tests with 10 questions to their mobile devices. Students receive the test and, upon completion or expiry of the allocated time, the application installed on their devices sends the test results to a central, cloud-based, data-intensive platform, where they are stored. The system processes the results (computes scores, makes inferences) and creates and consolidates a knowledge-centered profile for each student. The profile is then used to identify areas where the student lacks knowledge and to provide the input for future assessment adjustments and fine-tuning.
\section{IAMA Building Blocks}
\subsection{Assessment Reusable Learning Object}
\paragraph{}
RLOs are conceptualized as accessible, reusable, interoperable, and adaptable learning resources that facilitate developmental cost savings. Therefore, RLOs need to be designed as independent learning units that are free from context as well as links with external resources \cite{AJET3072}.
Characteristics of RLOs:
\begin{itemize}
\item \textit{digital asset} - stored, retrieved, modified by electronic means, available online 24/7;
\item \textit{self-contained} - loosely coupled; each learning object can be taken independently;
\item \textit{reusable} – a single RLO may be used in multiple contexts for multiple purposes;
\item \textit{searchable} - easy-to-find learning material; each RLO is tagged with metadata (see the LOM subsection \ref{lom});
\item \textit{atomicity} - RLOs are roughly 5-15 minutes long, distinct, small \emph{units} of knowledge;
\item \textit{flexible} – easy to update and change;
\item \textit{standardized} – adopt the same organizational structure from an enterprise architecture perspective;
\item \textit{aggregability} – learning objects can be grouped into larger collections of content, including traditional course structures;
\item \textit{interoperability} – blend into Learning Management Systems (e.g. WebCT Vista, Moodle);
\item \textit{digital adaptability} - RLOs are suited to address a new type of learner, the "\emph{Net-generation learner}" adapted to multi-tasking and digital technologies;
\item \textit{student-focus} - enhance student-centered learning.
\end{itemize}
\paragraph{}
An assessment RLO (aRLO) is an RLO that is, in essence, a question associated with a set of possible answers and a set of correct answers, and has additional properties:
\begin{itemize}
\item \textit{mutability} - given an aRLO and a specific algorithm, we can obtain a set of distinct aRLOs. A very simple algorithm is to randomly shuffle the answers. A more complex algorithm is to make an inference on the question, as seen in Figure \ref{fig:rlo_mutability}.
\item \textit{affiliation} - aRLOs are associated with one or more knowledge domains (parts of the curriculum) and memory domains (e.g. recall, identify, recognize, recount, relate, etc.).
\end{itemize}
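The simple shuffling variant of mutability can be sketched in a few lines of Python. The dictionary layout and function name below are illustrative assumptions, not part of the IAMA specification:

```python
import random

def mutate_arlo(question, answers, correct, seed=None):
    """Return a new aRLO variant by shuffling the answer order.

    `answers` is the list of possible answers and `correct` the set of
    correct ones; shuffling preserves the correct-answer set while
    producing a distinct presentation of the same aRLO.
    """
    rng = random.Random(seed)
    shuffled = answers[:]   # copy so the parent aRLO stays untouched
    rng.shuffle(shuffled)
    return {"question": question, "answers": shuffled, "correct": set(correct)}

variant = mutate_arlo("2 + 2 = ?", ["3", "4", "5", "22"], {"4"}, seed=42)
```

Because the correct-answer set travels with the variant, any shuffled aRLO can still be scored exactly like its parent.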
\begin{figure}[ht]
\centering
\includegraphics[scale=0.4]{rlo_mutability}
\caption{Inference based mutability of RLOs}
\label{fig:rlo_mutability}
\end{figure}
\subsection{Learning Object Metadata}
\label{lom}
\paragraph{}
Learning Object Metadata (LOM) \cite{1032843} is a data model used to describe a \emph{learning object} and similar digital resources used to support learning. It is based on the IEEE 1484.12.1 multi-part standard and has been developed to facilitate the search, evaluation, acquisition, and use of learning objects, for instance by learners, instructors or automated software processes. LOM also facilitates the sharing and exchange of learning objects, by enabling the development of catalogs and inventories while taking into account the diversity of cultural and lingual contexts in which the learning objects and their metadata are reused. It is a natural choice to describe RLOs.\\
The LOM conceptual data schema has a hierarchical tree structure composed of the following nine categories \cite{LCampbell}:
\begin{enumerate}
\item \textit{General}: information that describes the learning object as a whole.
\item \textit{Lifecycle}: history and current status of the learning object and those who have contributed to its creation.
\item \textit{Meta-metadata}: information about the metadata describing the learning object, as opposed to the learning object itself.
\item \textit{Technical}: technical requirements and characteristics of the learning object.
\item \textit{Educational}: educational and pedagogic characteristics of the learning object.
\item \textit{Rights}: intellectual property rights and conditions of use of the learning object.
\item \textit{Relation}: relationship between the learning object and other related objects.
\item \textit{Annotation}: comments on the educational use of the learning object, including when and by whom the comments were created.
\item \textit{Classification}: classification schemes used to describe different characteristics of the learning object.
\end{enumerate}
\paragraph{}
As seen in Figure \ref{fig:lom_schema}, the LOM schema is relatively extensive, so application profiles based on the standard generally restrict the elements used, designate certain elements as mandatory or optional, specify vocabulary usage and interpretation, and add organization- or community-specific classification schemes. In some cases, the LOM schema must also be extended with new properties. In our case, the aRLO should extend the \emph{general} category so that it includes the two sets of possible and correct answers.
\begin{figure}[ht]
\centering
\includegraphics[scale=0.5]{lom_schema}
\caption{LOM conceptual data schema}
\label{fig:lom_schema}
\end{figure}
\subsection{Semantic Repository}
\paragraph{}
A semantic repository is a database management system capable of handling structured data while taking their semantics into consideration, and it is well suited to storing RLOs \cite{Ochoa}. We can use the Resource Description Framework (RDF) \cite{532408} to express RLO metadata following the IEEE LOM standard. A strong candidate for storing and handling RDF structures is Apache Jena \cite{apachejenaurl}, a free and open-source Java framework for building semantic web and linked data applications that offers:
\begin{itemize}
\item an RDF API to create and read RDF graphs and serialize the triples using popular formats such as RDF/XML or Turtle;
\item a graph-oriented query language, ARQ, which is a SPARQL 1.1 compliant engine; ARQ supports remote federated queries and free text search via Lucene;
\item a fast persistent triple store that stores directly to disk (TDB);
\item an ontology API working with models, RDFS and the Web Ontology Language (OWL) to add extra semantics to RDF data;
\item an inference API wrapping functionality around built-in OWL and RDFS reasoners.
\end{itemize}
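To make this concrete, an aRLO's metadata serialized in Turtle might look roughly as follows. The namespaces and property names are illustrative assumptions only; a real deployment would follow an RDF binding of the IEEE LOM schema:

```turtle
@prefix dc:   <http://purl.org/dc/elements/1.1/> .
@prefix lom:  <http://example.org/lom#> .    # hypothetical LOM namespace
@prefix arlo: <http://example.org/arlo#> .   # hypothetical aRLO extension

<http://example.org/arlo/q-001>
    a arlo:AssessmentRLO ;
    dc:title "Pythagorean theorem recall" ;
    lom:difficulty 0.8 ;
    arlo:knowledgeDomain "geometry" ;
    arlo:memoryDomain "recall" ;
    arlo:possibleAnswers ( "25" "7" "12" "5" ) ;
    arlo:correctAnswers ( "5" ) .
```

Such a graph can be loaded into Jena's TDB store and queried with SPARQL, for example to select all aRLOs affiliated with a given knowledge domain.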
\subsection{Machine Learning Models and Frameworks}
\paragraph{}
Monitoring student performance over time based on small iterative tests implies that we should \emph{feed} the test results into an algorithm that outputs a score, or a list of scores, for a knowledge domain or a set of knowledge domains. Several machine learning models are well suited for classification and regression \cite{Mandeep}:
\begin{enumerate}
\item \textit{Naive Bayes Classifier} is a probabilistic classifier based on Bayes' Theorem with an assumption of independence among predictors, particularly useful for very large data sets. It is simple to implement and can outperform even highly sophisticated classification methods.
\item \textit{Logistic Regression} is the appropriate regression predictive analysis to conduct when the dependent variable (the outcome) is dichotomous (binary).
\item \textit{Decision Trees} build classification or regression models in the form of a tree structure, by breaking a data set down into smaller and smaller subsets while an associated decision tree is incrementally developed. The final result is a tree with decision nodes and leaf nodes.
\item \textit{Random Forests} are an ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees. This model is very accurate, runs efficiently on large data sets and has a very useful feature: it gives estimates of which variables are important in the classification.
\item \textit{Neural Networks} consist of units (neurons), arranged in layers, which convert an input vector into some output. Each unit takes an input, applies an (often nonlinear) function to it and then passes the output on to the next layer. Generally the networks are defined to be feed-forward: a unit feeds its output to all the units on the next layer, but there is no feedback to the previous layer. Weightings are applied to the signals passing from one unit to another, and it is these weightings which are tuned in the training phase to adapt the network to the particular problem at hand.
\item \textit{Nearest Neighbor} is a supervised, non-parametric classification and regression algorithm, where the input consists of the \emph{k} closest training examples in the feature space.
\end{enumerate}
\paragraph{}
These algorithms can be implemented using low-level frameworks such as SciPy \cite{scipy} and scikit-learn \cite{scikit}, widely used, simple yet efficient Python libraries built on \emph{NumPy}. For implementing neural networks, specifically deep learning models, we should use higher-level frameworks such as TensorFlow \cite{tensorflow}, CNTK \cite{cntk}, PyTorch \cite{pytorch} or Keras \cite{keras}.
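As a rough sketch of the classification step, the following uses scikit-learn's Random Forest on synthetic stand-in data. The feature layout and the labeling rule are assumptions made purely for illustration, not the system's actual profile schema:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in data: 200 students x 6 knowledge-domain scores in [0, 1].
# In the real system these rows would come from the domain-score matrix.
X = rng.random((200, 6))
# Toy labeling rule (an assumption): students whose mean domain score
# exceeds 0.5 are labeled "proficient" (1), the rest 0.
y = (X.mean(axis=1) > 0.5).astype(int)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X[:150], y[:150])               # train on the first 150 students
accuracy = clf.score(X[150:], y[150:])  # evaluate on the held-out 50

# Random Forests also estimate which domains matter most for the prediction,
# the property highlighted in the model list above.
importances = clf.feature_importances_
```

The `feature_importances_` vector is what would let the platform report which knowledge domains drive a student's classification.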
\section{The Solution}
\paragraph{}
Let $\mathcal{S}\in\mathcal{R}^{m \times n}$ denote the associative matrix of computed scores for all \emph{m} students and all \emph{n} aRLOs, where $\mathcal{R}$ is the closed interval $[0,1] =\{x \in \mathbb{R} : 0\leq x \leq 1\}$. As mentioned before, an aRLO is an assessment reusable object that has attached to it a set of possible answers, a set of correct answers and a difficulty level $\delta \in (0,1]$, where $\delta=1$ means the most difficult level. Each element $\sigma_i$ of the matrix $\mathcal{S}$ is calculated as $\sigma_i = \gamma_i \times \delta_i$, where $\gamma_i$ is the grade for the corresponding aRLO (students get the maximum grade 1 if they get all correct answers right) and $\delta_i$ is the difficulty level of the aRLO \cite{BONCEA2018PRO}.\\
Let $\mathcal{D}\in\{0,1\}^{n \times p}$ denote the matrix of mappings between knowledge domains and aRLOs and $\mathcal{M}\in\{0,1\}^{n \times q}$ be the matrix of mappings between memory domains and aRLOs, where \emph{n} is the number of aRLOs, \emph{p} is the number of knowledge domains and \emph{q} is the number of memory domains.
\paragraph{}
Then, we calculate the scores for the associated knowledge domains $\mathcal{S}_{\mathcal{D}}$ with the formula $(\mathcal{S} \otimes \mathcal{D})_{ij} = \frac{\sum_{k=1}^n \mathcal{S}_{ik} \, \mathcal{D}_{kj}}{\sum_{k=1}^n \mathcal{D}_{kj}}$, i.e. the matrix product with each domain column normalized by the number of aRLOs associated with that domain (the count of \textit{1}s in the corresponding column of $\mathcal{D}$), written compactly as $\mathcal{S}_{\mathcal{D}} = \mathcal{S} \otimes \mathcal{D}$.\\
Similarly, we calculate the scores for the associated memory domains: $\mathcal{S}_{\mathcal{M}} = \mathcal{S} \otimes \mathcal{M}$.
For exemplification, let's consider the following matrices: the grade scores \emph{Gs} with \emph{m} questions and \emph{n} students, the knowledge domain association matrix $\mathcal{D}$ and the memory domain association matrix $\mathcal{M}$,
\renewcommand{\kbldelim}{(}% Left delimiter
\renewcommand{\kbrdelim}{)}% Right delimiter
\[
\text{\emph{Gs}} = \kbordermatrix{
& S_1 & S_2 & S_3 & ... & S_n \\
\gamma_1 & 1 & 1 & 0 & ... & 1 \\
\gamma_2 & 1 & 0 & 1 & ... & 1 \\
\gamma_3 & 0 & 1 & 1 & ... & 1 \\
... & ... & ... & ... & ... & ... \\
\gamma_m & 1 & 0 & 1 & ... & 1
}
\]
\[
\mathcal{D} = \kbordermatrix{
& \gamma_1 & \gamma_2 & \gamma_3 & ... & \gamma_m \\
\delta_1 & 1 & 0 & 0 & ... & 0 \\
\delta_2 & 0 & 1 & 0 & ... & 0 \\
\delta_3 & 0 & 0 & 1 & ... & 1 \\
... & ... & ... & ... & ... & ... \\
\delta_p & 1 & 0 & 1 & ... & 0
}
\]
\[
\mathcal{M} = \kbordermatrix{
& \gamma_1 & \gamma_2 & \gamma_3 & ... & \gamma_m \\
\mu_1 & 0 & 0 & 0 & ... & 1 \\
\mu_2 & 0 & 1 & 0 & ... & 0 \\
\mu_3 & 1 & 0 & 1 & ... & 0 \\
... & ... & ... & ... & ... & ... \\
\mu_q & 0 & 0 & 1 & ... & 0
}
\]
and the difficulty levels \emph{Diff}
\[
\text{\emph{Diff}} = \kbordermatrix{
& \gamma_1 & \gamma_2 & \gamma_3 & ... & \gamma_m \\
& 0.8 & 0.6 & 0.6 & ... & 1 \\
}
\]
Then we calculate the score matrix $\mathcal{S}$, the knowledge domain scores $\mathcal{S}_{\mathcal{D}}$ and the memory domain scores $\mathcal{S}_{\mathcal{M}}$:
\[
\mathcal{S} =
\begin{bmatrix}
1 \times 0.8 = 0.8 & 1 \times 0.8 = 0.8 & 0 \times 0.8 = 0 & ... & 1 \times 0.8 = 0.8\\
1 \times 0.6 = 0.6 & 0 \times 0.6 = 0 & 1 \times 0.6 = 0.6 & ... & 1 \times 0.6 = 0.6\\
0 \times 0.6 = 0 & 1 \times 0.6 = 0.6 & 1 \times 0.6 = 0.6 & ... & 1 \times 0.6 = 0.6\\
... & ... & ... & ... & ...\\
1 \times 1 = 1 & 1 \times 1 = 1 & 1 \times 1 = 1 & ... & 1 \times 1 = 1\\
\end{bmatrix}
=
\begin{bmatrix}
0.8 & 0.8 & 0 & ... & 0.8\\
0.6 & 0 & 0.6 & ... & 0.6\\
0 & 0.6 & 0.6 & ... & 0.6\\
... & ... & ... & ... & ...\\
1 & 1 & 1 & ... & 1\\
\end{bmatrix}
\]
\[
\mathcal{S}_{\mathcal{D}} = \mathcal{S} \otimes \mathcal{D} =
\begin{bmatrix}
0.8 & 0.8 & 0 & ... & 0.8\\
0.6 & 0 & 0.6 & ... & 0.6\\
0 & 0.6 & 0.6 & ... & 0.6\\
... & ... & ... & ... & ...\\
1 & 1 & 1 & ... & 1
\end{bmatrix}
\otimes
\begin{bmatrix}
1 & 0 & 0 & ... & 0 \\
0 & 1 & 0 & ... & 0 \\
0 & 0 & 1 & ... & 1 \\
... & ... & ... & ... & ... \\
1 & 0 & 1 & ... & 0
\end{bmatrix}
=
\begin{bmatrix}
\frac{0.8+0.8}{2} & \frac{0.8}{1} & \frac{0.8}{1} & ... & 0 \\
\frac{0.6+0.6}{2} & 0 & \frac{0.6+0.6}{2} & ... & \frac{0.6}{1} \\
0.6 & 0.6 & 0.6 & ... & 0.6 \\
... & ... & ... & ... & ... \\
1 & 1 & 1 & ... & 1
\end{bmatrix}
\]
\[
=
\kbordermatrix{
& \delta_1 & \delta_2 & \delta_3 & ... & \delta_p \\
S_1 & 0.8 & 0.8 & 0.8 & ... & 0 \\
S_2 & 0.6 & 0 & 0.6 & ... & 0.6 \\
S_3 & 0.6 & 0.6 & 0.6 & ... & 0.6 \\
... & ... & ... & ... & ... & ... \\
S_n & 1 & 1 & 1 & ... & 1
}
\]
\[
\mathcal{S}_{\mathcal{M}} = \mathcal{S} \otimes \mathcal{M} =
\begin{bmatrix}
0.8 & 0.8 & 0 & ... & 0.8\\
0.6 & 0 & 0.6 & ... & 0.6\\
0 & 0.6 & 0.6 & ... & 0.6\\
... & ... & ... & ... & ...\\
1 & 1 & 1 & ... & 1
\end{bmatrix}
\otimes
\begin{bmatrix}
0 & 0 & 0 & ... & 1 \\
0 & 1 & 0 & ... & 0 \\
1 & 0 & 1 & ... & 0 \\
... & ... & ... & ... & ... \\
0 & 0 & 1 & ... & 0
\end{bmatrix}
\]
\[
=
\kbordermatrix{
& \mu_1 & \mu_2 & \mu_3 & ... & \mu_q \\
S_1 & 0 & 0.8 & 0.8 & ... & 0.8 \\
S_2 & 0.6 & 0 & 0.6 & ... & 0 \\
S_3 & 0.6 & 0.6 & 0.6 & ... & 0 \\
... & ... & ... & ... & ... & ... \\
S_n & 1 & 1 & 1 & ... & 1
}
\]
\paragraph{}
From the example above we can conclude that the \emph{n}\textsuperscript{th} student has a perfect score in every knowledge domain and every memory domain. The knowledge domain that is best demonstrated is $\delta_3$, while $\delta_p$ is the worst.
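The computation above is straightforward to sketch with NumPy. The toy numbers below mirror the worked example's first rows (rows of \emph{Gs} are aRLOs, columns are students), and the normalization divides each domain's summed scores by its count of associated aRLOs:

```python
import numpy as np

# Toy data mirroring the worked example: 4 aRLOs (questions), 3 students.
Gs = np.array([[1, 1, 0],          # grades: rows = aRLOs, cols = students
               [1, 0, 1],
               [0, 1, 1],
               [1, 0, 1]], dtype=float)
diff = np.array([0.8, 0.6, 0.6, 1.0])   # difficulty level per aRLO

# Score matrix S: each grade weighted by its aRLO's difficulty.
S = Gs * diff[:, None]

# D maps knowledge domains (rows) to aRLOs (columns); 2 toy domains here.
D = np.array([[1, 0, 1, 0],
              [0, 1, 0, 1]], dtype=float)

# Normalized product: a domain score is the mean score over the aRLOs
# associated with that domain (divide by the count of 1s per domain).
S_D = (D @ S) / D.sum(axis=1, keepdims=True)   # shape: domains x students
```

With this orientation, `S_D[j, i]` is student `i`'s normalized score in knowledge domain `j`; the memory domain scores follow the same pattern with the $\mathcal{M}$ mapping.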
\paragraph{}
We then store the triple $\mathcal{T} = (\mathcal{S},\mathcal{S}_{\mathcal{D}},\mathcal{S}_{\mathcal{M}})$ in a non-relational database such as MongoDB or Cassandra and, after a certain number of iterations or mini tests, we can infer new relations using the machine learning models described before. For instance, we could classify the students by knowledge level (a classification problem) or compute a score in the interval $[0,1]$ (a multivariate regression problem). The result can then be used to provide students with profiled tests in the domains where they are less knowledgeable.
\paragraph{}
Figure \ref{fig:architecture} shows a high-level generic architecture of an IAMA solution, with three important application components:
\begin{enumerate}
\item The \textit{aRLO Management Component} is responsible for the development of aRLOs using LOM and RDF as semantic assets. aRLOs are stored in and retrieved from a semantic repository, such as Apache Jena. This component also handles aRLO mutability: given an aRLO and inference rules, the component generates a set of related aRLOs.
\item The \textit{Assessment Component}'s main functionality is to generate profiled tests for students and to handle the test results by storing them in a datastore.
\item The \textit{Data Analysis Component} uses machine learning models to generate the student profiles and derive new insights.
\end{enumerate}
The three components are well integrated in order to provide business services that can be used by a large variety of actors (e.g. teachers, students, parents, local authorities, government), including integration into a larger ecosystem of application services.
\begin{figure}[ht]
\centering
\includegraphics[scale=0.5]{architecture}
\caption{IAMA Reference Architecture}
\label{fig:architecture}
\end{figure}
\section{Conclusions}
\paragraph{}
The solution addresses key challenges in the teaching-learning process by using data intelligence and machine learning to monitor student performance over time. The IAMA model is based on an iterative and adaptive evaluation process that uses semantic, highly extensible reusable learning objects built on LOM and serialized and stored as RDF structures. The RLOs are associated with one or more knowledge domains (identifiable components of the curriculum), one or more memory domains or long-term memory capabilities (e.g. recall, recount, recognize, relate, identify) and a difficulty level. The test results are stored for future retrieval and processed using machine learning algorithms suited to solving classification and regression problems. The main objective of this process is to obtain a student profile which is continuously updated and improved. The student profile is then used to tune the assessment, focusing on domains where the student is less knowledgeable.
\paragraph{}
Except for the RLO creation process, all IAMA processes should be fully automated. The system should provide distinct tests to all students in a class, making sure no two students complete the same test. Assessments should contain a certain number of RLOs associated with previous courses, so that all memory domains are sufficiently addressed; the same requirement applies to knowledge domain coverage.\\
RLO creation could also be partially automated, though human supervision is a requirement that cannot be overlooked. New RLOs can be procedurally generated using inference rules or simpler algorithms such as permutations of the possible answers. Procedurally generated RLOs inherit most of the properties of the parent RLO, including affiliation (the domain association matrices).
\paragraph{}
Besides the score, knowledge domain and memory domain matrices, we can always add new features. For instance, we could additionally record the response time for each RLO or the student's age and gender, or provide data regarding the classroom, such as the number of students in the class. And we can always verify which features are most important, i.e. which have the greatest impact on the final classification or regression result, by using \emph{Random Forests}. This process can be further extended by including features regarding the school, the city or the country. Thus, the results can be aggregated into critical information for better decisions by teachers, students, parents and competent authorities.\\
It is this feature extensibility that requires our solution to store the assessment results and associated data in a schema-free, non-relational database.
\nocite{*}
\printbibliography[heading=bibintoc, title={Bibliography}]
\end{document}