Pedagogical experiments with MathCheck in university teaching

MathCheck is a relatively new online tool that gives students feedback on their solutions to elementary university mathematics and theoretical computer science exercises. MathCheck was designed with constructivism learning theory in mind, and it differs from other online tools in that it checks the solutions step by step and shows a counter-example if a step is incorrect. It has been in student use since the autumn of 2015 and under design-based research from the first online day. The main research questions of this study are the following. 1) How can the usage of MathCheck support the aspects of conceptual understanding and procedural fluency of constructivism learning? 2) How can MathCheck empower both students and teachers in the education of mathematics? This paper presents the results of five pedagogical experiments considering both students’ and teachers’ points of view. In each experiment, the students have suggested improvements, which have affected the further development of MathCheck. In general, both students and teachers have given positive feedback on MathCheck. MathCheck seems to support learning better than tools that only provide the “incorrect”/“correct” verdict after checking the answer. MathCheck is suitable for independent studying as well as an addition to traditional lectures. In the best case, it can reduce teachers’ workload during courses.


Introduction
Traditionally, university mathematics has been taught with the pencil and paper method. Over the last decade, computers and online tools have established their place as a part of mathematics courses (Mäkelä, 2016). There are plenty of online tools for students to use in mathematics. One popular type of online mathematics tool simplifies expressions, evaluates expressions, and solves equations. Examples of such tools are Matlab (MathWorks), Wolfram Alpha (Wolfram Alpha) and GeoGebra (GeoGebra). GeoGebra is used more in upper secondary schools, while Matlab and Wolfram Alpha are used more in universities. Such tools are convenient when the student already understands the mathematics behind the operation. However, teachers have observed that students increasingly use these tools just to get correct answers without understanding the mathematics.
A prime example of another popular type of online tool for mathematics education is STACK (System for Teaching and Assessment using a Computer algebra Kernel) (Sangwin, 2015). With it, the teacher provides problems for the student (Mäkelä et al., 2016). The student solves each problem with a pencil and paper (at least the teacher hopes so) and then types the final answer to the website. Let us consider the simplification of $\sqrt{3x + x^2 + 1 - x}$ as an example. The student computes $\sqrt{3x + x^2 + 1 - x} = \sqrt{x^2 + 2x + 1} = \sqrt{(x + 1)^2} = x + 1$ on paper and types $x + 1$ to STACK. STACK checks the answer immediately and gives feedback telling that the answer is incorrect. STACK compares the student's answer to the teacher's answer with Maxima (a symbolic algebra system) (Maxima) and reports whether or not they are mathematically equivalent. STACK does not tell in its feedback where the possible mistake has happened; it cannot, because it has only been given the final answer and not the intermediate steps that led to it. This software is used in many universities, and its pedagogical utility has been the subject of much research and discussion from different perspectives (Mäkelä, 2016; Pelkola, Rasila & Sangwin, 2018).
Unfortunately, students can use these systems to support behavioural learning. While it is possible for a teacher to build in STACK task sets and feedback systems that also ensure in-depth learning, this requires a lot of teacher work. Therefore, it is possible that both of these methods support behavioural learning, where the aim is on the right answers, and the wrong answers are disregarded. MathCheck differs from the tools mentioned above as it gives feedback on all steps of the solution that the student types, not on just the final answer. As the feedback on an incorrect step, it gives a counter-example. Therefore, MathCheck could support constructivism learning, as in constructivism learning the learner builds her knowledge and concept understanding by making sense of all information perceived from her experiences (Bada, 2015).
In this paper, we study the usage of MathCheck in teaching in order to answer the following questions. How can the usage of MathCheck support the aspects of conceptual understanding and procedural fluency of constructivism learning? How can MathCheck empower both students and teachers in the education of mathematics?

MathCheck as a constructivism learning platform
Nowadays students' role in the learning process is emphasised. The constructivism learning process gives students the responsibility of learning (Weegar & Pacis, 2012). A student must be an active thinker and processor, and construct new information on top of old information. With a suitable learning process, it is possible to affect different areas of mathematics learning. As a theoretical framework to describe mathematics learning, we use the concept of mathematical proficiency, which consists of the following five components (National Research Council, 2001):
1. conceptual understanding: comprehension of mathematical concepts, operations, and relations,
2. procedural fluency: skill in carrying out procedures flexibly, accurately, efficiently and appropriately,
3. strategic competence: ability to formulate, represent, and solve mathematical problems,
4. adaptive reasoning: a capacity for logical thought, reflection, explanation, and justification, and
5. productive disposition: a habitual inclination to see mathematics as sensible, useful, and worthwhile, coupled with a belief in diligence and one's own efficacy.
Of the components listed above, MathCheck (MathCheck; Valmari & Rantala, 2019) aims to support especially conceptual understanding and procedural fluency. MathCheck supports these from the constructivism learning point of view as it gives feedback on all steps of the solution that the student types, not on just the final answer. As the feedback on an incorrect step, it gives a counterexample. The current version of MathCheck also draws graphs of the expressions on both sides of the error place (this feature had not been implemented yet in the versions that were used in the experiments reported in this study). These give the student a starting point for tracing the error. When a student's erroneous chain of thought is overturned by a counterexample, the student must rethink their preconceptions. The student then rebuilds their reasoning, which is close to radical constructivism. At the same time, we are working in the student's zone of proximal development (Vygotsky). Boudourides nicely explores various sub-categories of constructivism and explains Vygotsky's theory in his article Constructivism, Education, Science, and Technology (Boudourides, 2003). Figure 1 shows the feedback that MathCheck gives on our example. It is clear that the step $\sqrt{(x + 1)^2} = x + 1$ is incorrect. The green and red graphs show that for negative enough numbers, $x + 1$ yields negative values while $\sqrt{(x + 1)^2}$ yields positive values. The correct step is $\sqrt{(x + 1)^2} = |x + 1|$.
An example of how MathCheck shows errors.
To produce the feedback in Figure 1, MathCheck needs absolutely no model solution or other contribution by the teacher. It suffices that the student types the solution to the main page of MathCheck and presses the submit button. The default mode of MathCheck works by checking the mathematical correctness of each equality and inequality in the input, without assuming that the input should be an answer to some specific problem or that the computation in the input should follow some prespecified path.
When checking a relation in the simplification mode, MathCheck first tries to prove it correct. If that succeeds, MathCheck shows the relation symbol in green. The proof engine of MathCheck is rather straightforward, but also weak. If MathCheck fails to prove the relation, it tries to find a counter-example by trying many combinations of values of the variables in question. If MathCheck finds a counterexample, it prints the relation symbol, the expression to its right, and the counterexample in red. Otherwise, it prints the relation symbol and the expression in black. In the summer of 2017, MathCheck was modified to print the relation symbol in magenta in those rare cases where there is strong evidence but no certainty of an error, or where the checking was less reliable than usual. Figure 2 shows an example of such a case.
An example of when MathCheck has strong evidence of error, but not a certainty.
The example shows that MathCheck may fail to detect an error. Fortunately, addition, division, trigonometric functions, and so on have regular mathematical properties that make it unlikely for two different functions built from them to yield the same value for all the test values that MathCheck uses. Indeed, MathCheck has proven reliable in practice. Initially, the plan was to use a better proof engine like in STACK but testing with value combinations was needed in any case to produce counter-examples for the students, and when that had been implemented, it proved so reliable that there was no need to improve the proof engine.
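The counter-example search described above can be illustrated with a minimal Python sketch. It is not MathCheck's implementation: the function name `find_counterexample`, the list `TEST_VALUES`, the tolerance, and the choice to skip points where an expression is undefined are all assumptions made for illustration. The sketch only shows the idea of trying combinations of variable values until the two sides of a relation disagree.

```python
import math
from itertools import product

# Hypothetical test values; the values MathCheck actually uses are not given in the text.
TEST_VALUES = [-3, -2, -1, -0.5, 0, 0.5, 1, 2, 3]


def find_counterexample(lhs, rhs, variables, test_values=TEST_VALUES, tol=1e-9):
    """Try every combination of test values; return the first assignment
    for which lhs and rhs differ, or None if no difference is found."""
    for combo in product(test_values, repeat=len(variables)):
        env = dict(zip(variables, combo))
        try:
            left, right = lhs(**env), rhs(**env)
        except (ValueError, ZeroDivisionError):
            # Assumption of this sketch: points outside the domain are skipped.
            continue
        if not math.isclose(left, right, rel_tol=tol, abs_tol=tol):
            return env
    return None


# The incorrect step from the running example: sqrt((x + 1)^2) = x + 1
wrong_step = find_counterexample(
    lambda x: math.sqrt((x + 1) ** 2), lambda x: x + 1, ["x"]
)

# The corrected step: sqrt((x + 1)^2) = |x + 1|
right_step = find_counterexample(
    lambda x: math.sqrt((x + 1) ** 2), lambda x: abs(x + 1), ["x"]
)
```

Here `wrong_step` is an assignment with a negative x, which plays the role of the counter-example shown to the student, while `right_step` is `None`. A `None` result only means that no difference was found among the test values, which matches the text's remark that such testing may fail to detect an error.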
In the equation mode, MathCheck checks that each step has at least the roots provided by the teacher, and after seeing the roots found by the student, MathCheck checks that they are also roots of the original equation. This makes it possible to deal with numerous equation types, instead of being restricted to, for instance, linear and quadratic equations. The array claim mode relies on checking with all arrays of size at most four with elements being integers between 0 and 3 (or between n and n+3, where n is an integer given by the teacher). In the propositional logic, quotient ring, and expression tree comparison modes, MathCheck checks the solution steps thoroughly. Also, membership of a string in the language defined by a context-free grammar is checked exhaustively. The comparison of context-free grammars given by the teacher and the student is based on generating strings in each language until a difference is found or an upper limit of work is met. It is thus incomplete.
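The exhaustive checking of the array claim mode can be sketched in Python as follows. The sketch assumes that "size at most four" includes the empty array and that a claim is given as a Python predicate; the names `all_small_arrays` and `check_array_claim` are hypothetical and do not come from MathCheck.

```python
from itertools import product


def all_small_arrays(max_size=4, low=0, high=3):
    """Yield every array of size 0..max_size with integer elements in low..high."""
    for size in range(max_size + 1):
        for values in product(range(low, high + 1), repeat=size):
            yield list(values)


def check_array_claim(claim, max_size=4, low=0, high=3):
    """Return the first small array that violates the claim, or None."""
    for arr in all_small_arrays(max_size, low, high):
        if not claim(arr):
            return arr
    return None


# A false claim: "every array is sorted in non-decreasing order".
counterexample = check_array_claim(lambda a: a == sorted(a))

# A claim that holds on all tested arrays: "the sum is at most 3 times the length".
no_counterexample = check_array_claim(lambda a: sum(a) <= 3 * len(a))
```

With the default bounds there are 1 + 4 + 16 + 64 + 256 = 341 arrays to test, so the search is cheap. Replacing `low` and `high` by n and n+3 corresponds to the variant in which the teacher supplies the integer n.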
MathCheck was originally developed at the Tampere University of Technology (TUT) and then at the University of Jyväskylä (JYU). It has been open for student use since the autumn of 2015. Originally, MathCheck only had the simplification mode illustrated above, without the graph-drawing feature that was added in December 2016 (Valmari, 2016). Since then, new problem modes have been added and old problem modes improved. As the aim of the study is to improve MathCheck and confirm that it supports constructivism learning in university mathematics education, we have used design and development research (Richey & Klein, 2014) and design-based research (Anderson & Shattuck, 2012) methods. Our research is design-based as it contains an iteration process where interventions are used in traditional university mathematics education (Anderson & Shattuck, 2012). We have two types of interventions in our research: MathCheck itself as an educational tool, and teaching modules where MathCheck is used as a support. Within design and development research, MathCheck is a tool that is developed during the research, and teaching modules are models that are studied during the tool development (Richey & Klein, 2014). The research was done in cycles in order to measure the learning outcomes of the students and to receive feedback on usability. Improvements have taken place in the form of new features and better instructions. The process is shown in Figure 3.
Research iterations used in this study.
The results address both conceptual understanding and procedural fluency.
4. In the experiment in Propositional Logic 2017, the aim was to evaluate the usefulness of MathCheck as a supporting tool for independent studies in the basics of propositional logic and normal forms. The results gave answers to the second research question: "How can MathCheck empower both students and teachers in the education of mathematics?"
5. The Context-Free Grammars experiment in 2018 also addressed the research question on empowering students and teachers, as the aim of the experiment was to find out whether MathCheck can help a teacher to find a counter-example or to be convinced that a CFG designed by a newcomer is correct.

Engineering Mathematics 1 in autumn 2015
Engineering Mathematics 1 was a first-year university-level course at Tampere University of Technology, TUT (Finland). Its contents included limits, continuity, and derivatives. In the experiment, students used MathCheck as a part of regular weekly exercises. Each week, one or two exercises among the full set of that week's exercises were MathCheck exercises, that is, their solutions were meant to be checked with MathCheck at home before the exercise session. The aim of using MathCheck was that students could check almost any solution and simplifications of intermediate steps with MathCheck on their own. The solutions were not returned to the teacher, that is, the use of MathCheck was solely between the student and MathCheck. However, the solutions were presented and discussed in the exercise sessions as usual. Figure 4 shows an example of a MathCheck exercise used in the course.
An example of a simplification exercise.
About 150 students used MathCheck in the exercises. Feedback on using MathCheck was obtained from 120 students. Of the students who gave feedback, 44 % found MathCheck useful, 40 % did not find it useful, and 16 % did not use it. The most common response was thus that MathCheck is useful.

Algorithm Mathematics in spring 2016
This experiment addressed the observations of an experienced teacher (one of the authors) on first- and second-year university students' conceptual understanding in mathematics. The teacher had observed that approximating expressions from below or above is a tough task for the students, which also makes it difficult to understand the concept of asymptotic time complexity (that is, the big $O$, Θ, and Ω notation). The following experiment was conducted to find out if MathCheck can be used to increase the understanding of expression approximation and time complexity. The participant group was the students in the Algorithm Mathematics course at TUT.
Algorithm Mathematics is a first- or second-year course, depending on the student group. Its contents are set theory, relations, functions, logic, induction, and recursion. Because of the experience of the previous experiment, this time the students were given more difficult exercises to encourage the usage of intermediate steps in the solution process (Rasimus, Valmari & Kaarakka, 2016). These exercises were considered as special exercises instead of being part of the regular weekly exercises. The students were asked to save the feedback given by MathCheck as a PDF file and deliver it to the teacher via the course page in Moodle. For example, one of the problems was "Simplify the expression ln (10) , and give the answer in terms of the log function." In the same week as the exercise mentioned above, the students were asked to approximate the expression $\log(n^4 + 3n - 5)$ upwards to find $c \in \mathbb{R}$ and $n_0 \in \mathbb{N}$ such that $\log(n^4 + 3n - 5) \le c \log n$ when $n \ge n_0$. The next week, the MathCheck exercises included the task of proving, using the definition, that (a) $2n^3 - 2n + 5 = O(n^3)$ and (b) $2n^2 - 10n + 3 = \Omega(n^2)$. One question in the examination asked the students to prove that $\log(2n^3 - 6n^2) = \Omega(\log(n))$, using the definition. Table 1 relates the points that students got from this examination question to the points that the same students got from the MathCheck exercises on $\log(n^4 + 3n - 5)$, $O$, and $\Omega$. Each entry shows the number of students. The result shows that the MathCheck points that the students ($N = 135$) had obtained and the examination results had a positive correlation ($r = 0.4845$), which is statistically highly significant ($p < 0.001$). Unfortunately, this does not necessarily tell much about the benefit of using MathCheck. It is only natural that a skilful and motivated student performs better in both the MathCheck exercises and the examination than a less skilful and unmotivated student.
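To illustrate what a proof "using the definition" involves, the following is one possible derivation for the examination question. It is a sketch under two assumptions that are not stated in the text: the expression reads $\log(2n^3 - 6n^2)$ with $n$ a positive integer, and $f(n) = \Omega(g(n))$ is taken to mean that there are $c > 0$ and $n_0 \in \mathbb{N}$ such that $f(n) \ge c\, g(n)$ whenever $n \ge n_0$. The constants $c = 3$ and $n_0 = 6$ are only one valid choice.

```latex
\begin{align*}
n \ge 6 \;&\Longrightarrow\; 6n^2 \le n \cdot n^2 = n^3 \\
        &\Longrightarrow\; 2n^3 - 6n^2 \ge 2n^3 - n^3 = n^3 \\
        &\Longrightarrow\; \log(2n^3 - 6n^2) \ge \log(n^3) = 3 \log n ,
\end{align*}
% The last step uses that log is increasing.
% Hence f(n) >= c g(n) holds with c = 3 and n_0 = 6.
```

Each line is a downward approximation of the previous expression, which is the kind of step whose correctness a student could check one relation at a time.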

MathCheck versus Wolfram Alpha in autumn 2016
First-year students at TUT and the Norwegian Defence Cyber Academy (NDCA) participated in this experiment. The aim was to find out whether students' learning differs between MathCheck and Wolfram Alpha. Both tools were used to check the correctness of solutions to simplification problems. The exercises and the final test are shown in Appendix A1-A3. Veera Hakala's (2016) project work contains more detailed results.
Altogether 146 students participated in the experiment, 106 in TUT and 40 in NDCA. In each place, the students were divided into two groups: those who were told to use MathCheck as a checking tool (N(TUT) = 56 and N(NDCA) = 20) and those who were told to use Wolfram Alpha (N(TUT) = 50 and N(NDCA) = 20). Each student had to solve a collection of exercises and check the solutions / final answers either with MathCheck or Wolfram Alpha. After completing the exercise collection, the students took part in a test, which was done without any tools. The maximum possible number of points from the test was 16. In the test, Finnish students were also asked to tell the time they had spent with the program. The students were divided into three grade intervals: 0 or 1, 2 or 3, and 4 or 5. The highest possible grade is 5, and the lowest accepted grade is 1. Table 2 and Table 3 show the grade distributions of MathCheck and Wolfram Alpha users in each usage time group among Finnish students.
MathCheck users who had used the program at least one hour succeeded better than students who had used it less than one hour (Table 2). A similar difference cannot be observed among Wolfram Alpha users (Table 3). It can also be observed that among those who had used the tool at least one hour, 30 % of MathCheck users and 16 % of Wolfram Alpha users got one of the two highest grades 4 or 5. In Norway, all of the students (N = 40) used either Wolfram Alpha or MathCheck over an hour because they did their exercises during lessons. Therefore, Norwegian students belong to the category "used at least an hour". Half of the students used MathCheck and the other half Wolfram Alpha. Due to the small number of participants in Norway, it is not reasonable to analyze Norwegian results in isolation. In Table 4, the Finnish and Norwegian students' results have been combined. From Table 4 it can be seen that among the users of MathCheck, the proportion of students in the highest-grade interval 4-5 (36 %) is higher than the similar proportion with Wolfram Alpha (24 %). In brief, those students who practised with MathCheck succeeded better than those who practised with Wolfram Alpha.
The students at NDCA were asked an open question on whether MathCheck is a suitable tool for independent studying. Nineteen students out of twenty answered the questionnaire, and of those, 14 thought that MathCheck is well or to some extent suited for independent studying. Five out of nineteen students felt that MathCheck is not suited for independent studying or is too hard to use.

Propositional logic (Algorithm Mathematics) in spring 2017
In the spring of 2017, MathCheck was tested again in the Algorithm Mathematics course with 160 participants at TUT. However, the focus was different. A teaching module was created containing the basics of propositional logic and normal forms. It was a part of the course, but the idea was that students could independently study and practice these topics with the module. The basics of propositional logic were familiar to the students from previous mathematics courses, so the propositional logic part was more of a revision. Conjunctive and disjunctive normal forms and full normal forms were new topics.
The module contained a total of eleven pages. The most common structure for a page was a short theory part, an example and an exercise about the current topic. This way the module was interactive and students got to try the theory immediately in practice. Figure 5 shows an example of the exercises of the module.
An example of Algorithm Mathematics exercise.
The students were motivated to complete the module by telling them that one examination question would be about one of the topics of the module. After completing the module, the students were asked the following six open questions:
1. Did you get all the exercises done in your opinion?
2. Did the module help your learning?
3. Was the platform pleasant to use (why / why not)?
4. Would you want to study independently with this kind of a platform in the future?
5. Would you have liked to have a pause option during the module?
6. Development proposals?
Unfortunately, only 21 students answered the questionnaire. The actual number of students who completed the module cannot be known, because MathCheck does not keep any track of its users. In their answers to the open questions, 20 out of 21 students reported that they had done all or almost all exercises. This suggests that the exercises were not too hard and that those who did the module were motivated. Of the 21 students, 15 commented that the module had helped their learning, and only two said that it did not help at all.
The answers to the third question were categorized into three groups: positive (the platform was pleasant to use); positive but needs improvement, and negative. Nine out of 21 students experienced the platform as pleasant, six answered positively but felt that it could have been better with improvement, and five felt that the platform was not pleasant to use. One set of answers did not match the questions. The answers tell that the user interface could be improved.
Thirteen out of 21 students reported their willingness to study independently at home. Two out of 21 preferred that a part of the teaching would be independent. They preferred the blended learning method, where different kinds of teaching methods are used during the course. Two said that they could be interested in studying independently if improvements were made and two did not want to study with this kind of a platform. The answers of the two students did not match the questions. It seems that the majority of students who answered would like to study independently, but considering the small number of participants, it may be that the students who did the module, were already motivated to study independently.
The module did not have an explicit opportunity to pause, because implementing such a possibility would require introducing user accounts, which we want to avoid. Indeed, MathCheck does not collect any data about its users and does not know who is using it. The module was not very long, so it should have been possible to complete it during one session. Furthermore, one can save the position by using a feature that is available in all browsers: bookmarking the current question page. Still, 17 out of 21 students would have wanted a pause option. This could also result from the fact that, by mistake, the students were not told the overall length of the module. A couple of students commented that this module was of suitable length, but that a longer one would have needed a pause option.
There were many development proposals. Despite the small number of answers, some problems came up often. In addition to the pause option, students suggested that the program should point out more precisely the location of the error and that the screen view should be more modern. Also, it was proposed that MathCheck should check not only that the answer is logically equivalent to the correct answer, but also that it satisfies the particular requirements stated by the teacher, such as if it should be in the disjunctive normal form.

Context-free grammars (Automata and Grammars) in autumn 2018
Context-free grammars (CFGs) are the most important method of defining structures of formal languages, such as programming languages. They are a simple but deep mathematical formalism. If a CFG does not yield the intended language, then there always is a counter-example. A CFG designed by a beginner is sometimes so difficult to analyse that the teacher can neither find a counter-example nor be convinced that the CFG is correct. This makes the teaching of CFGs difficult.
In the autumn of 2018, features were added to MathCheck for comparing the languages defined by two CFGs, for checking whether a character string belongs to the language defined by a CFG, and for drawing a parse tree in case it does. A web page that teaches CFGs and contains exercises was written. Students of the Automata and Grammars course at the University of Jyväskylä, JYU (Finland), were given a link to this web page among their weekly homework problems. The CFG exercises constituted one-third of the problems of that week, while two-thirds were traditional paper and pencil tasks. A link to the CFG exercises is given in Appendix B1.
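The comparison of the languages defined by two CFGs can be illustrated with a small Python sketch. It is not MathCheck's algorithm: instead of an upper limit of work it uses a bound on string length, and the grammar representation (a dictionary from nonterminals to lists of right-hand sides) and the names `strings_up_to` and `find_difference` are assumptions made for this example. Like the approach described in the text, it is incomplete, because two grammars may differ only on strings longer than the bound.

```python
def strings_up_to(grammar, start, max_len):
    """Return all terminal strings of length <= max_len derivable from start.

    grammar maps each nonterminal to a list of right-hand sides, each a tuple
    of symbols; a symbol that is not a key of grammar is treated as a terminal.
    The sets are grown by fixpoint iteration until nothing new appears.
    """
    lang = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, productions in grammar.items():
            for rhs in productions:
                partial = {""}  # an empty right-hand side yields the empty string
                for sym in rhs:
                    choices = lang[sym] if sym in grammar else {sym}
                    partial = {p + c for p in partial for c in choices
                               if len(p) + len(c) <= max_len}
                    if not partial:
                        break
                new = partial - lang[nt]
                if new:
                    lang[nt] |= new
                    changed = True
    return lang[start]


def find_difference(g1, start1, g2, start2, max_len):
    """Return a shortest string that is in exactly one language, or None."""
    diff = strings_up_to(g1, start1, max_len) ^ strings_up_to(g2, start2, max_len)
    return min(diff, key=lambda s: (len(s), s)) if diff else None


# Intended language: a^n b^n.
teacher = {"S": [("a", "S", "b"), ()]}
# A hypothetical beginner's attempt that actually defines a* b*.
student = {"S": [("a", "S"), ("S", "b"), ()]}
# A different but equivalent grammar for a^n b^n.
alternative = {"S": [("a", "S", "b"), ("a", "b"), ()]}

counterexample = find_difference(teacher, "S", student, "S", 4)
same_so_far = find_difference(teacher, "S", alternative, "S", 6)
```

Here `counterexample` is the string "a", which the student's grammar generates but the teacher's does not, and `same_so_far` is `None`. A `None` result only says that no difference exists up to the chosen length.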
At the beginning of the next meeting, the students were given a questionnaire on paper and asked to fill it in immediately. The questionnaire is shown in Appendix B1. Altogether 28 students returned it fully (25 students) or partially (3 students) filled. In every case, at least 14 out of a total of 18 questions were answered. Eighteen students told that they had done at least 80 % of the CFG exercises, 2 more had done at least 60 %, 6 more at least 40 %, and the last two at least 20 %. Table 5 shows the results for some of the questions.
Table 5. The results of the questionnaire. The columns are sd = strongly disagree, wd = weakly disagree, n = neutral, wa = weakly agree, sa = strongly agree, a = average, and p = statistical significance (p-value). The limits for *, **, and *** are 5 %, 1 % and 0.1 %, respectively.

Question | sd | wd | n | wa | sa | a | p
The exercises are suitable for 1st-year students | 1 | 3 | 8 | 10 | 6 | 3.6 | *
The exercises are suitable for 2nd- and 3rd-year students | 0 | 0 | 5 | 12 | 11 | 4.2 | ***
The exercises are suitable for 4th-year and older students | 0 | 0 | 7 | 13 | 8 | 4.0 | ***
It was more pleasant to study with this than with traditional exercises | 0 | 0 | 1 | 16 | 11 | 4.4 | ***
I believe I learnt more than I would have with traditional exercises | 0 | 0 | 1 | 18 | 9 | 4.3 | ***
The exercises make traditional lectures on the same topic unnecessary | 7 | 15 | 6 | 0 | 0 | 2.0 | ***

The students had to collect a sufficient number of points from the weekly meetings to earn the right to participate in the examination. A student got points by telling in the meeting what exercises they had done and/or by actively participating in the discussion on a solution. Seven students claimed points only from the CFG exercises, three only from the remaining exercises, and 18 from both. That is, the students favoured the web-based exercises over the traditional exercises. At least three students who had returned the questionnaire did not claim points from the CFG exercises, perhaps because they had done too small a percentage of them, or because they did not bother (if they already had enough points). Among the students who claimed points that week, 18 had already earned enough points and 10 had not, meaning that the sample represents both fast and slow students.
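The averages in column a of Table 5 can be reproduced from the response counts with a short Python sketch. It assumes that the answers were coded from 1 (strongly disagree) to 5 (strongly agree). This coding is not stated in the text, but it is consistent with the reported averages. The names `TABLE_5` and `likert_mean` are hypothetical, and the p-values are not recomputed because the statistical test used is not specified.

```python
# Response counts (sd, wd, n, wa, sa) copied from Table 5.
TABLE_5 = {
    "suitable for 1st-year students": (1, 3, 8, 10, 6),
    "suitable for 2nd- and 3rd-year students": (0, 0, 5, 12, 11),
    "suitable for 4th-year and older students": (0, 0, 7, 13, 8),
    "more pleasant than traditional exercises": (0, 0, 1, 16, 11),
    "learnt more than with traditional exercises": (0, 0, 1, 18, 9),
    "make traditional lectures unnecessary": (7, 15, 6, 0, 0),
}


def likert_mean(counts):
    """Mean answer, assuming the coding sd=1, wd=2, n=3, wa=4, sa=5."""
    total = sum(counts)
    return sum(score * k for score, k in enumerate(counts, start=1)) / total


averages = {question: round(likert_mean(counts), 1)
            for question, counts in TABLE_5.items()}
```

Every row sums to 28 respondents. Under the assumed coding, the neutral midpoint is 3, so an average above 3 indicates agreement on balance and an average below 3 indicates disagreement.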
It is clear that the students liked the MathCheck CFG exercises. Unfortunately, observations made later in the course and after the examination revealed that the students had not learnt the topic as deeply as the teacher hoped. Since then, much more MathCheck-based teaching material on CFGs has been developed. After all, exercises worth one-third of a week are not much for a topic like CFGs.

Discussion
Hundreds of university students have used MathCheck in their mathematics courses during the five experiments presented above. Generally, the feedback on using MathCheck collected via inquiries and interviews has been positive. This section discusses the results of the experiments in the light of the research questions.

How can the usage of MathCheck support the aspects of conceptual understanding and procedural fluency of constructivism learning?
The results in Experiment 2 (Algorithm Mathematics 2016) and Experiment 3 (MathCheck vs. Wolfram Alpha 2016) indicate that using MathCheck to evaluate their own solutions helps students to gain conceptual understanding and increase procedural fluency.
The nature of using MathCheck differs from that of common mathematical programs that are used in teaching, for example, Wolfram Alpha or STACK. MathCheck is to be used during the solution process, for checking whether the intermediate steps are correct. The student can develop the solution step by step and check each step immediately (or rather the sequence of steps written so far). MathCheck points out errors but does not tell what the right step would be. So, the students must analyse for themselves what the possible mistake is. As a consequence, MathCheck directs better towards conceptual understanding than Wolfram Alpha or even STACK.
MathCheck supports procedural fluency because when a student is given many exercises (whose solutions can be checked by the student herself), the student has to pay attention to writing expressions precisely, and with several repetitions, the fluency will increase. In contrast, with Wolfram Alpha, the student only has to write the starting point correctly, and the program does the rest independently.
In more detail, MathCheck proved especially suitable when approximating values of functions upwards or downwards. Students are used to computing with precise values. However, in the real world, it is often necessary to approximate values rather than calculate with precise values.
With simplification problems, MathCheck seemed an excellent tool for students. The possibility of making mistakes increases with the number of computing steps. Similarly, finding the mistakes becomes more difficult the longer the path to the final solution is. With the help of MathCheck, the mistake is quickly found and the limited time can be spent on solving the problem instead of being wasted on finding the first error. Also, it was observed that when the students were forced to define the domain (e.g., declare $x \ne -5$ if the expression is $1/(x + 5)$) before checking the answer with MathCheck, the habit stuck and several students continued to define the domains through the whole course. This is an improvement, because usually this habit fades away when it is no longer "needed", meaning that it is not noted in the book's solutions.

How can MathCheck empower both students and teachers in the education of mathematics?
Students mostly experienced MathCheck as a useful tool in mathematics education. However, not everyone found MathCheck useful, especially not in the beginning. Not understanding the scope of MathCheck partly explains why a large number of students participating in the first experiment did not experience MathCheck as useful. Some students did not understand that the idea is not that MathCheck should find the final answers for them, but that MathCheck should give them feedback on their solutions. One factor may also have been too easy tasks. As it happened, some of the exercises used in the experiment were too simple, so there were no intermediate steps that needed checking (Rasimus, Valmari & Kaarakka, 2016).
In the rest of the experiments, the scope of MathCheck has been clearly explained and the complexity of the exercises has been raised.
As stated earlier, in the MathCheck vs. Wolfram Alpha experiment in 2016, those who used MathCheck succeeded better in the examination than those who used Wolfram Alpha or no tool at all (where the dividing line between "used" and "not used" is one hour). The same effect, that is, that usage of MathCheck improved examination results, was also noticed in other experiments (Algorithm Mathematics 2016 and 2017) when students were compared by their activity in doing MathCheck exercises. However, it has to be taken into account that other factors, such as motivation, also affect the examination results.
In Propositional Logic 2017, MathCheck was used as a supporting tool for self-study of the basics of propositional logic and normal forms. Most of the students who answered the questionnaire in the course commented that the independent learning module had helped their overall learning. Similarly, most of the respondents reported their willingness to study independently at home. However, in order to gain the full benefit of MathCheck in independent studying, thorough user guidance must be given.
From the teachers' point of view, MathCheck decreases the teachers' workload, especially in courses with a large number of students. For example, in exercise sessions, MathCheck, instead of the teacher, can show the exact point of the mistake. One suggestion for lowering teachers' workload was evaluated in the Context-Free Grammars experiment in 2018, where MathCheck was used to help a teacher either to find a counter-example or to be convinced that a CFG designed by a beginner is correct. It became clear that the students liked the MathCheck CFG exercises; however, the number of homework problems was too small for the students to gain as deep an understanding of the topic as was hoped.
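What finding a counter-example for a beginner's CFG can mean is sketched below. The sketch is an assumption for illustration and not a description of how MathCheck works: it enumerates all strings up to a length bound that each grammar generates and reports the shortest string on which the two grammars disagree. Nonterminals are single characters that appear as dictionary keys, empty right-hand sides are assumed not to occur, and both example grammars are invented.

```python
def language_up_to(grammar, start, max_len):
    """Return all terminal strings of length <= max_len derivable from start.

    Assumes no empty right-hand sides, so sentential forms never shrink
    and every form longer than max_len can safely be discarded.
    """
    seen = {start}
    frontier = [start]
    words = set()
    while frontier:
        form = frontier.pop()
        # Position of the leftmost nonterminal, or None if the form is a word.
        idx = next((i for i, ch in enumerate(form) if ch in grammar), None)
        if idx is None:
            words.add(form)
            continue
        for rhs in grammar[form[idx]]:
            new_form = form[:idx] + rhs + form[idx + 1:]
            if len(new_form) <= max_len and new_form not in seen:
                seen.add(new_form)
                frontier.append(new_form)
    return words


def first_difference(student, reference, start="S", max_len=6):
    """Return the shortest string generated by exactly one grammar, else None."""
    diff = language_up_to(student, start, max_len) ^ language_up_to(reference, start, max_len)
    return min(diff, key=lambda w: (len(w), w)) if diff else None


# Invented task: generate a^n b^n for n >= 1.
reference = {"S": ["aSb", "ab"]}
student = {"S": ["aS", "ab"]}  # beginner's attempt; generates a^n b instead

print(first_difference(student, reference))    # aab
print(first_difference(reference, reference))  # None
```

A result of None only says that the grammars agree on all strings up to the chosen length; it can make a teacher more confident but is not a proof of equivalence.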
MathCheck also offers an alternative for differentiating the level of education based on the students' individual abilities. Teachers can create extra problems for those students who need or want extra practice. It is possible to build web pages that create random problems of a fixed structure but varying parameters. By creating exercises of different levels of difficulty, MathCheck can be used as a differentiating method, thus taking the students into account, no matter what their starting level is. Another way the teachers can use MathCheck is to create teaching modules or courses. Teaching modules can be used as a revision or as a tool for learning a new topic. The modules give students more flexibility, in that they can decide when and where they will study.
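The idea of random problems with a fixed structure but varying parameters can be sketched as follows. The problem template, the dictionary keys and the ASCII notation are hypothetical and are not taken from MathCheck's question pages; the sketch only shows one parameter being drawn at random while the structure stays the same.

```python
import random


def make_problem(rng, max_a=9):
    """Build one problem of the fixed form (x^2 - a^2) / (x + a) with random a."""
    a = rng.randint(2, max_a)  # a larger max_a gives larger numbers to handle
    return {
        "question": f"Simplify (x^2 - {a * a}) / (x + {a})",
        "domain": f"x != {-a}",
        "answer": f"x - {a}",
    }


rng = random.Random(2018)  # a fixed seed makes the generated set reproducible
for problem in (make_problem(rng) for _ in range(3)):
    print(problem["question"], "| domain:", problem["domain"])
```

Differentiation by level could then amount to choosing between several such templates, or between parameter ranges, according to the student's needs.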

User interface
The user interface issue deserves a discussion. Some of the first-year students had problems with textual input. We investigated the possibility of adding a mouse-clickable keyboard to question pages. It can be used for selecting the most commonly needed symbols and structures. For instance, clicking √ would write sqrt() into the answer box so that the user can write the argument between the ( and ). One problem is that such a keyboard can only contain a small number of symbols because otherwise it would occupy too much space on the question page.
With Norwegian participants, there were fewer problems with textual input. Two facts explain this: several Norwegian participants had earlier programming experience (because programming experience is counted as positive among the selection criteria for entering NDCA), and all Norwegian participants had a programming course in the same semester as the mathematics course where the experiment was conducted. It seemed that motivation for programming generally helped in adopting a new program with textual input.
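The behaviour of such a keyboard button can be sketched as a pure text operation. Only the √ button producing sqrt() comes from the description above; the other table entry, the function name and the representation of the answer box as a string with a cursor index are assumptions made for illustration.

```python
# Button label -> (text to insert, cursor offset inside the inserted text).
TEMPLATES = {
    "√": ("sqrt()", 5),  # cursor lands between the parentheses
    "≤": ("<=", 2),      # hypothetical entry; the real textual syntax may differ
}


def press(button, text, cursor):
    """Insert the button's template at the cursor and return (new_text, new_cursor)."""
    snippet, offset = TEMPLATES[button]
    new_text = text[:cursor] + snippet + text[cursor:]
    return new_text, cursor + offset


text, cursor = press("√", "2*", 2)
print(text, cursor)  # 2*sqrt() 7
```

After the click, the cursor index 7 lies directly after the opening parenthesis, so the student can continue by typing the argument.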
The students at JYU studied information technology. They had no serious problems with textual input.
There is also another user interface issue. Technically, MathCheck is executed via web forms. It stores information neither on the server nor on the user's computer. There is no need for downloads or opening an account. Starting to use MathCheck as a student is technically as easy as it can be. Furthermore, question pages are just ordinary web pages with a web form. Therefore, teachers who are fluent in HTML and CSS have great freedom to design the pages as they wish. The other side of the coin is that the possibilities for MathCheck to provide feedback in a natural and easy-to-use fashion are limited.
Initially, MathCheck provided its feedback as a separate web page that replaced the question web page on the user's screen. Getting back to the question page was possible using the back button of the web browser. As an attempt to improve user experience, since April 2017, many question pages have contained two submit buttons, one that delivers the feedback as described above and one that opens it in a new tab (or a new browser window, if the browser has been configured to work so). The students can thus choose whichever feels better.
There have also been attempts to make the feedback open in a separate box beside the answer box. That is otherwise very natural, but it introduces the need for clumsy scrolling if the feedback is too wide or long. Because MathCheck aims at making it possible to give students problems whose solutions need many, possibly complicated steps, long and wide feedback should be expected.
In July 2017, we decided to test the feedback function by giving the students two submit buttons, one that opens the feedback in an area to the right of the answer box, and another that opens it in a new tab or browser window. The idea was that the students always first use the former button, and then use the latter if scrolling becomes a problem. Submitting the same answer twice is not a problem and does not force the student to rewrite the answer.
This improvement made it possible to put many exercises on the same web page, together with text that teaches the material in question. Consequently, question pages grew long. Originally, many feedback boxes were used, each one beside the group of questions that it corresponds to, so that the relevant feedback box is always visible when the long web page is scrolled. In January 2018, we found out how to fix the position of the feedback box so that it does not move when the question page is scrolled. The question pages written since then contain only one feedback box. Each group of questions has two submit buttons, one sending the feedback to the feedback box and the other sending it to a new tab or window. Figure 6 shows an example.
Figure 6. An example of the new user interface.
Since these improvements, the students have made very few complaints about the user interface. Unfortunately, now that it is possible to put multiple question groups on the same question page, a new problem has arisen: it is technically challenging to combine the answers to different question groups into a single package that could be sent to the teacher or a point recording system. With Firefox, it is possible to save the page in such a way that the resulting file contains all the answers (and also the questions, which is an advantage), but we have not yet found out how the same could be achieved with other browsers. Making it possible to save the answers one group at a time would be technically easy, but this solution is clumsy for the students. It may be that a reliable solution to this problem is only possible when using user accounts. One reason why we have not put much effort into solving this problem is that, for reasons explained by Gibbs and Simpson (2004) and Gibbs (2010), the authors believe that it is not necessarily pedagogically advantageous to record points in the middle of a course.
In the first four experiments, when asked about the user experience, students wanted more precise feedback about the location of mistakes. They also commented that the screen view was somewhat outdated and hoped that it would be updated to look more modern and more pleasant to the eye. In addition, they hoped for a feature that would check whether the answer satisfies all special requirements in the teacher's question. Most of these issues had been addressed by the fifth experiment. Consequently, similar remarks were almost absent in the feedback obtained from the fifth experiment and from the use of MathCheck by more than 100 students at JYU. Students at JYU wanted answer boxes to have a running number so that it would be easier to refer to the right place when discussing an exercise. This has now been implemented. Currently, the only repeatedly occurring wish is that there should be a mechanism for recording the answers so that the students could more easily reproduce their answers in the weekly meetings of a course. Our standard reply is: With such a mechanism, you would run into trouble in the examination because the recorded answers would not be available there. Therefore, the idea is not to record the answers but to learn the topic so well that you can re-generate the answers.
The present version of MathCheck has seven problem modes: simplification of arithmetic expressions (including derivatives), propositional logic, equation solving, use of predicate logic for formulating claims about arrays, predicate logic and equations in quotient rings, expression tree comparison (a problem mode designed to help students perceive expressions as structural entities and learn such concepts as operator precedence), and context-free grammars (commonly written in Backus-Naur form).
In the experiments reported in this study, the simplification, propositional logic, and context-free grammar modes were used.

Conclusion
The results above are only suggestive, but they are encouraging. Overall, the study provides evidence that MathCheck supports conceptual understanding and procedural fluency. The results also indicate that MathCheck can be used as a supporting tool in independent studying. In addition, MathCheck can lower the workload of a teacher.
The interest in MathCheck is growing in the Mathematics laboratory of TUT. In August 2017, MathCheck was connected to the electronic examination system Exam. During the examination, MathCheck only checks that the solution is syntactically correct and satisfies the particular requirements stated by the teacher, for instance, that it is in disjunctive normal form and is not more complicated than allowed. Afterwards, the teacher can use the full checking ability of MathCheck, which makes the grading process quicker (and perhaps even more reliable) and reduces the teachers' workload.
In the future, university education will place more emphasis on student-oriented teaching, where the aim is constructivist learning that facilitates a deeper conceptual understanding of mathematical concepts and procedures (Rämö, Oinonen & Vikberg, 2015). MathCheck addresses this aim, and it has a role both as a part of traditional university courses (lectures, practice sessions) and as a support for the students' independent studying. No matter the place or time, students can use MathCheck during the solution process to check the correctness of the part of the solution obtained so far. The same applies to both teacher-given problems and problems that the students invent by themselves.