# ATStat: Active Training of Statistics

**Project information**

When learning statistics, students need to develop conceptual relationships between characteristics of the data and the resulting values from statistical procedures. To learn these relationships with existing statistical software, students must perform a large number of steps between manipulating data and seeing the resulting change in the statistical values. This long gap in the feedback loop obscures the conceptual relationships. Because students are simultaneously learning to operate the software, they also face cognitive overload. Together, these two factors make learning statistics significantly more difficult for many students.

ATStat, an online interactive platform for training statistics, is intended to lessen this problem. ATStat comprises software that lets students manipulate data points graphically and immediately see the resulting changes in statistical values, which helps them build knowledge about conceptual relationships. ATStat will provide data sets and prompts in two domains: psychology and human-computer interaction (HCI). The user interface and the psychology content will be available in both English and German. ATStat will be released under an open-source license to support future adoption and extension.

**Goal of the project**

Targets:

1. Interactive software for learning statistics that incorporates direct manipulation of data and immediate feedback on statistical results.

2. Accompanying datasets and descriptions of experiments, integrated into the software.

3. Prompts of contrasting cases for students to build their understanding of statistical concepts with the software.

4. The user interface and materials mentioned above in both English and German.

**Advancement of teaching**

Problem with existing software for learning statistics:

---

Mainstream statistical software (e.g., SPSS, SAS, Stata, R) is designed for conducting statistical analysis rather than for learning about it. By design, these packages discourage manipulating data. To re-run an analysis with modified data, the user has to enter a numerical change (e.g., by typing a number into the data table) and re-run the analysis (through menus or commands), and can then see only the current version of the results.

This complex, multi-step manipulation leaves a long gap between a user’s action (e.g., changing the value of a data point) and the interpretation of the results (e.g., seeing how the t-value changes). Furthermore, the results are usually presented numerically, which makes comparing results across different datasets cognitively more demanding. Additionally, students already need to be familiar with the menus or commands used to run statistical tests. Thus, this software is unsuitable for learning statistical concepts, especially with the Prepare for Future Learning (PFL) approach.
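To make the edit-and-re-run cycle concrete, the following minimal sketch changes one data point and recomputes a t statistic by hand. The data values, the variable names, and the choice of Welch's two-sample t statistic are assumptions made for illustration; they are not taken from ATStat.

```python
import math
import statistics


def welch_t(a, b):
    """Return Welch's two-sample t statistic for two lists of numbers."""
    var_a = statistics.variance(a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(b)
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


# Hypothetical scores for two groups (illustrative values only).
control = [4.0, 5.0, 6.0, 5.0, 5.0]
treatment = [6.0, 7.0, 8.0, 7.0, 7.0]

# Step 1: run the analysis on the original data.
t_before = welch_t(treatment, control)

# Step 2: edit a single value, as a student would in a data table.
treatment_edited = list(treatment)
treatment_edited[2] = 14.0  # turn one observation into an outlier

# Step 3: re-run the analysis and compare the two numbers by eye.
t_after = welch_t(treatment_edited, control)

print(f"t before edit: {t_before:.3f}")
print(f"t after edit:  {t_after:.3f}")
```

In this example the edit increases the difference between the group means, yet the t statistic falls because the added variance inflates the standard error. With a numeric-only, re-run-based workflow, a student sees only the two end results and not the gradual change in between.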

Possibilities of the new scenarios:

---

ATStat originated from three strands of knowledge: (1) a set of principles for learning statistics, (2) the Prepare for Future Learning approach, and (3) the direct manipulation interaction style.

(1) Principles for learning statistics:

In 1995, Joan Garfield synthesized a set of principles for learning statistics based on the literature in statistical research and psychological research. Garfield and her colleague Dani Ben-Zvi revalidated this set of principles in 2007. The following principles from their research inspire ATStat:

a. Letting students actively analyze data

b. Allowing students to guess and predict the outcome before being confronted with actual results

c. Using technology to visualize and explore data and statistical models

(2) Prepare for Future Learning (PFL):

Classroom teaching usually has students do exercises after a lecture. This is called the tell-and-practice (T&P) approach. In the PFL approach (Bransford and Schwartz, 1999), students work before the lecture with learning materials that provide contrasting cases, so that they can form hypotheses about the phenomena of interest. In the lecture, students contrast their hypotheses with the explanation from the lecturer. The preparation activity encourages students to build the cognitive structure needed to integrate information from the lecture. PFL has been shown to be more effective than T&P for learning descriptive statistics (Schwartz and Martin, 2004) and neuroscience (Schneider et al., 2013). As illustrated in the scenario above, ATStat is a playground for PFL and for practice after learning.

(3) Direct manipulation:

Imagine using a mouse cursor to drag the icon of a file on the screen and drop it into a folder. This is an example of a direct manipulation interface. The characteristics of a direct manipulation interface are a) a continuous visual representation of the object of interest, b) the use of physical actions to manipulate the representation, and c) continuous feedback with reversible, incremental actions. In the file-moving example, the icons of the file and the folder are the visual representations; the drag-and-drop is a physical action; and the movement of the file icon on the screen as the mouse cursor moves is the continuous feedback, which is reversible and incremental. Direct manipulation interfaces reduce the “gulf of execution” (between what the user meant to do and the action that they take) as well as the “gulf of evaluation” (between what the system meant to communicate and the output shown on the screen, and between the output and the user’s interpretation) (Hutchins et al., 1985). Direct manipulation interfaces are easy to learn and encourage exploration (Shneiderman, 2010; Sharp et al., 2007). ATStat allows students to directly manipulate data points and receive incremental feedback on the statistical result.
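The three characteristics can be sketched in code. The example below is a minimal, hypothetical model and not ATStat's actual implementation: the class name `ManipulableSample`, its methods, and the choice of mean and standard deviation as the displayed statistics are all assumptions for illustration. Each simulated drag step updates one value, immediately reports fresh statistics through a callback (continuous feedback), and is recorded so that it can be undone (reversible, incremental actions).

```python
import statistics
from typing import Callable, List, Tuple


class ManipulableSample:
    """Hypothetical model of a sample whose points can be dragged."""

    def __init__(self, values: List[float],
                 on_change: Callable[[float, float], None]) -> None:
        self._values = list(values)
        self._history: List[Tuple[int, float]] = []  # (index, previous value)
        self._on_change = on_change
        self._notify()  # report the initial statistics

    def _notify(self) -> None:
        # Recompute and report statistics after every change.
        self._on_change(statistics.mean(self._values),
                        statistics.stdev(self._values))

    def drag(self, index: int, new_value: float) -> None:
        """Move one data point; one call per incremental drag step."""
        self._history.append((index, self._values[index]))
        self._values[index] = new_value
        self._notify()

    def undo(self) -> None:
        """Reverse the most recent drag step, if there is one."""
        if not self._history:
            return
        index, previous = self._history.pop()
        self._values[index] = previous
        self._notify()


# Collect feedback in a list; a real interface would redraw a plot instead.
feedback_log: List[Tuple[float, float]] = []
sample = ManipulableSample([2.0, 4.0, 6.0],
                           lambda mean, sd: feedback_log.append((mean, sd)))

for y in (7.0, 8.0, 9.0):  # simulate dragging the third point upward
    sample.drag(2, y)
sample.undo()  # step back from 9.0 to 8.0

for mean, sd in feedback_log:
    print(f"mean={mean:.3f}  sd={sd:.3f}")
```

Reporting the statistics on every drag step, rather than once the drag has finished, is what closes the gap between the user's action and the visible result.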