Inducing Mathematical Concepts from Specific Examples: The Role of Schema-Level Variation

David W. Braithwaite
Robert L. Goldstone
Indiana University, 1101 E. 10th Street, Bloomington, IN 47405 USA

Abstract

Previous research suggests that comparing multiple specific examples of a general concept can promote knowledge transfer. The present study investigated whether this approach could be made more effective by systematic variation in the semantic content of the specific examples. Participants received instruction in a mathematical concept in the context of several examples, which instantiated either a single semantic schema (non-varied condition) or two different schemas (varied condition). Schema-level variation during instruction led to better knowledge transfer, as predicted. However, this advantage was limited to participants with relatively high performance before instruction. Variation also improved participants’ ability to describe the target concept in abstract terms. Surprisingly, however, this ability was not associated with successful knowledge transfer.

Keywords: mathematics; analogy; comparison; schemas; instruction; transfer

Introduction

Part of the power of mathematics lies in its generality. The same mathematical formulae may be used to understand the growth of slime molds or the accumulation of interest from investments, the probabilities of hands in poker or outcomes of scientific experiments, and the oscillations of mechanical or electromagnetic systems. In order to fully realize this power, however, learners must be able to recognize and apply mathematical concepts in contexts different from those in which they were learned, that is, to transfer their mathematical knowledge from learned to novel contexts.

Learners’ difficulties in achieving such transfer are well-documented (Novick & Holyoak, 1991; Ross, 1987). One reason may be that, when a general idea is learned in the context of specific examples, learners’ concepts become tied to the details of the examples, inhibiting their ability to recall the concept or apply it correctly when faced with cases that do not share similar details (Ross, 1987). This difficulty may be especially strong when the examples are presented in a perceptually detailed format (Kaminski, Sloutsky, & Heckler, 2008), and is likely to be more serious for domain novices than experts (Novick & Holyoak, 1991).

One way to address this difficulty is to present mathematical ideas in abstract form, without specific examples. Such an approach has indeed been shown to promote transfer in some cases (Kaminski et al., 2008). However, in other cases, learners have experienced serious difficulties with abstractly presented mathematics, despite being competent with the same mathematics encountered in familiar contexts (Nunes, Schliemann, & Carraher, 1993). In such contexts, learners can apply intuitions from everyday life to help in understanding the mathematical ideas involved. Abstract presentation of mathematical ideas therefore risks sacrificing learning for the sake of transfer.

It may, then, be desirable for learners to encounter mathematical ideas in a way that leverages their intuitive understanding of specific examples, while also drawing attention to the abstract structure present in those examples. Research on analogy suggests that this goal might be achieved through presentation of multiple specific examples followed by comparison (Gentner, Loewenstein, & Thompson, 2003; Gick & Holyoak, 1983). Comparing examples encourages learners to align their corresponding elements, and thereby to notice their common relational structure. Awareness of this structure, in turn, can facilitate understanding of new cases with the same structure. Thus, learning mathematical ideas by studying and then comparing multiple examples may enable learners to gain intuitive accessibility without losing generality.

The question then arises as to how the examples which will instantiate a mathematical concept during learning are to be chosen. Central to this question is the issue of how much, and in what ways, the examples should differ from each other. If, as the above research suggests, learners induce concepts that incorporate commonalities among the examples, it seems desirable that the examples should share the mathematical structure in question, but should not share other extraneous details. Extraneous commonalities might be misunderstood as part of the concept to be learned, limiting learners’ ability to generalize (Medin & Ross, 1989), and so defeating the purpose of using multiple examples in the first place. These observations suggest that extraneous aspects should be systematically varied across examples, while holding mathematical structure constant.

The present study investigates the effects on mathematical concept learning of a particular type of variation among examples: variation at the level of “semantic schemas.” This term here refers to structures more general than specific examples but less general than mathematical structure. Consider the three combinatorics problems shown in Figure 1. Problems (a) and (b) share a schema, termed “Objects Selected in Sequence” (OSS), in which a sequence of selections is made from a fixed set of options. Problem (c), by contrast, belongs to a different schema, termed “People Choosing Options” (PCO), in which several people each choose once from a fixed set of options.

Figure 1. Three combinatorics problems.

Of course, all three problems share the same mathematical structure (discussed further in the Methods section), and the differences between them would likely not seem important to a mathematics expert. For mathematics novices, however, semantic schemas are known to exert a strong influence on the mathematical interpretation of contextualized problems. For example, Bassok, Wu, and Olseth (1995) found that learners were more likely to correctly solve problems in which schematic and mathematical roles were matched consistently with their default expectations than problems in which such matches were inconsistent.

In light of the preceding discussion, learning about a mathematical structure via several examples based on the same schema might lead learners to induce concepts tied to that particular schema, and thus to perform poorly on problems involving other schemas. Conversely, systematic variation of the schemas encountered during learning should lead to induction of more general concepts and thus to more successful transfer to novel problems.

This hypothesis was investigated in the present study. Combinatorics problems were used as the domain for study and transfer for several reasons. First, the discovery of better methods for learning and teaching combinatorics would have considerable practical value due to the foundational role of combinatorics in applied mathematics, in particular probability and statistics. Second, mathematics learners are known to have considerable difficulty correctly applying combinatorics methods to novel problems (Bassok et al., 1995; Ross, 1987). Finally, semantic schemas are known to play a role in the mathematical interpretation of combinatorics problems (Bassok et al., 1995).

Methods

Participants

Participants were 109 Indiana University undergraduate students, who participated in partial fulfillment of a course requirement.

Materials

Sixteen story problems were constructed as stimuli. All of the problems had the same mathematical structure: Sampling with Replacement (SWR), in which multiple selections are made from a fixed set. The number of possible joint outcomes in such a case is given by the expression $m^n$, where $m$ is the number of elements of the set and $n$ is the number of selections, or sampling events.

The sixteen problems belonged to four different schema categories. The first two categories were those already illustrated above: PCO and OSS (OSS: Figure 1a-b, PCO: Figure 1c). Problems in these categories were used as learning examples. The other two categories were Options Assigned to Places (OAPlc) and Objects Assigned to People (OAPpl), illustrated below (Figures 2a and 2b respectively). OAPlc and OAPpl problems served as pretest and transfer problems. Note that in the learning examples (OSS and PCO) and OAPlc problems, people are either doing the choosing or are not mentioned at all. In OAPpl, by contrast, people are being chosen instead of choosing. Due to this role reversal relative to the learning examples, transfer to OAPpl problems was expected to be particularly difficult, as found in previous research (Ross, 1987).

Figure 2. Combinatorics problems from the (a) OAPlc and (b) OAPpl categories.

Each problem category contained two pairs of problems, for a total of four problems. The problems within a pair involved the same back story but different numbers, while the two pairs within each category involved different back stories (and different numbers from each other). The order in which the two critical numbers, i.e., the size of the sampled set and the number of sampling events, were presented was varied among questions so that it could not serve as a cue to match the numbers to their respective roles.

The experiment employed a pretest-training-posttest design, summarized in Figure 3. The pretest consisted of one OAPlc problem pair and one OAPpl problem pair, for four problems altogether. The posttest consisted of the other OAPlc problem pair followed by the other OAPpl problem pair. Thus, all eight OAPlc and OAPpl problems appeared in either the pretest or the posttest.

Figure 3. Summary of experimental design.

The training consisted of worked solutions to four problems drawn from the PCO and OSS categories. Participants were assigned randomly to one of two training conditions. In the varied condition, participants were shown one pair of problems from each category, either PCO followed by OSS or vice versa (these two possible orders were balanced across participants). In the non-varied condition, participants were shown two pairs of problems from the same category, either both PCO or both OSS (again, the two possibilities were balanced across participants). If a certain problem category was shown in a given position (either first pair or second pair), it was always the same problem pair regardless of condition. For example, if PCO problems were shown first in the varied condition, they were the same problems that were shown first in the non-varied condition. An important consequence of this design is that each training problem was shown equally often across the two conditions.

Procedure

Participants were randomly assigned to receive one set of OAPlc / OAPpl problems as pretest. The pretest problems were displayed to participants on a computer monitor together with a virtual calculator, which participants were encouraged to use as needed. Only one problem appeared on the screen at a time. Two spaces were provided below each problem: one in which to show work, and another in which to write the final answer. Participants were required to show their work and enter some number as their final answer before they could proceed to the next question.

After the pretest, answers were scored for correctness, and participants were classified as high pretest performers if they answered at least 50% of the pretest problems correctly and low pretest performers otherwise. They were then assigned randomly to one of the two training conditions with the constraint that, at each level of pretest performance, the number of participants in each condition was balanced. This manipulation was intended to reduce differences in pretest scores between training conditions.

The training problems corresponding to participants’ training conditions were then presented in the same way as the pretest problems. However, after completing each problem, participants were shown the correct answer together with a brief explanation of how the answer was calculated and why this calculation was appropriate. These explanations utilized exponential notation but did not show the general expression $m^n$. Instead, they only showed specific versions of this expression instantiated with the numbers used in the problem. The explanation for a given problem did not differ between training conditions.

After completing each pair of training problems, participants were asked to choose from a list of options the correct method of solving problems like those just seen, independent of the specific numbers involved. For example, the correct answer to this question after the problems involving pizza flavors (Figure 1c above) was “Multiply the number of pizza flavors by itself as many times as there are consumers.” Participants who chose incorrectly were not allowed to proceed until they chose the correct answer.

After answering the above question for the second pair of training problems (only), participants were asked to choose from a list of options the correct mapping between elements of the preceding two problem pairs. For example, the correct answer to this question if the preceding problem pairs involved a website generating passwords and consumers tasting pizza flavors (Figure 1b and 1c) was “The length of the note sequences corresponds to the number of consumers, and the number of possible notes corresponds to the number of pizza flavors.” The purpose of this question was to encourage participants to think about the shared structure of the training problem pairs. After answering this question, participants were asked to describe, in free-response format, a general method for solving problems like those just seen. No feedback was given for either of these questions.

Finally, participants were administered the posttest. The posttest utilized whichever set of OAPlc / OAPpl problems had not been presented during the pretest, and the procedure was in all ways the same as for the pretest.

Coding

For each problem, participants were assigned a score of 1 if their answer was correct and 0 otherwise.

Responses to the free-response question regarding a general solution method, posed at the end of the training, were coded on a 0-2 scale in each of two respects. For the first respect, Correctness, responses were assigned a score of 2 if they indicated that the number of elements in the sampled set should be raised to the power of the number of sampling events (or multiplied by itself as many times as the latter). Responses that involved exponentiation but did not correctly identify the base and exponent were assigned a score of 1, and all other responses received a score of 0. The second respect, Abstractness, was intended to measure how well participants had generalized beyond the specific details of the learning examples. Responses were assigned a score of 2 if they referred to the two numbers using general words, such as “the options” (for the size of the sampled set) or “the number of times they are able to be chosen” (for the number of sampling events). Responses which used general words for one but not the other number were assigned a score of 1, and all other responses received a score of 0. All responses were coded by two independent coders, and all disagreements were resolved through discussion. In the analyses detailed below, scores of 0 and 1 were combined for both correctness and abstractness, so that responses were classified as either correct (2) or not correct (0 or 1) and abstract (2) or not abstract (0 or 1).
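The scoring rules described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' scoring software: the function names, the example flavor list, and the example numbers (3 options, 4 selections) are all hypothetical and chosen only for illustration. The sketch shows the Sampling with Replacement count (the number of options raised to the power of the number of selections), the 0/1 problem score, the 50% rule for classifying pretest performers, and the collapsing of the 0-2 codes into two classes.

```python
from itertools import product


def swr_outcomes(m: int, n: int) -> int:
    """Joint outcomes when n selections are made, with replacement, from m options."""
    return m ** n


def score_answer(answer: int, m: int, n: int) -> int:
    """Problem score: 1 if the final answer equals the SWR count, 0 otherwise."""
    return 1 if answer == swr_outcomes(m, n) else 0


def pretest_level(problem_scores: list[int]) -> str:
    """'high' if at least 50% of pretest problems were correct, else 'low'."""
    proportion = sum(problem_scores) / len(problem_scores)
    return "high" if proportion >= 0.5 else "low"


def is_best_score(code: int) -> bool:
    """Collapse a 0-2 code: only 2 counts as correct (or abstract)."""
    if code not in (0, 1, 2):
        raise ValueError("code must be 0, 1, or 2")
    return code == 2


# Brute-force check on a hypothetical PCO-style problem:
# 4 consumers each choose one of 3 pizza flavors (numbers are illustrative).
flavors = ["cheese", "pepperoni", "veggie"]
consumers = 4
enumerated = len(list(product(flavors, repeat=consumers)))
assert enumerated == swr_outcomes(len(flavors), consumers)
```

Enumerating every joint outcome with `itertools.product` is only practical for small numbers, but it makes the role assignment explicit: the size of the sampled set is the base and the number of sampling events is the exponent. Swapping the two roles, or multiplying them, yields a different count and therefore a problem score of 0.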
Results

Average pretest and posttest scores are shown in Figure 4. Participants demonstrated considerable improvement on posttest, but the amount of improvement varied by problem category. The data were entered into a 2 (test section: pretest or posttest) x 2 (problem category: OAPlc or OAPpl) within-subjects ANOVA. The main effects of both factors and the interaction between them were all significant (test section: F(1,108)=69.8, p<.001; problem category: F(1,108)=14.6, p<.001; interaction: F(1,108)=16.4, p<.001). Participants improved from pretest (0.216) to posttest (0.489), but this improvement was greater for OAPlc (0.225 to 0.638) than for OAPpl (0.206 to 0.339).

Figure 4. Pretest and posttest accuracy by problem category. (Here and elsewhere, error bars indicate standard errors.)

Figure 5 shows average transfer scores, defined as the difference between posttest and pretest scores, for each training condition, among low and high pretest performers. Transfer scores were submitted to a 2x2x2 mixed ANOVA with training condition (varied vs. non-varied) and pretest performance (low or high) as between-subjects factors and problem category (OAPlc or OAPpl) as a within-subjects factor. The main effect of pretest performance was significant, F(1,105)=66.6, p<.001, indicating more improvement from pretest to posttest among low pretest performers (0.404) than high pretest performers (-0.056). Also, the effect of problem category was significant, F(1,105)=12.3, p=.001, indicating greater improvement on OAPlc (0.413) than on OAPpl (0.133). Problem category did not interact significantly with any of the other factors.

Figure 5. Transfer by condition and pretest performance.

More importantly, the main effect of training condition was significant, F(1,105)=4.0, p=.049, indicating greater improvement in the varied (0.305) than in the non-varied (0.201) condition. However, this effect was qualified by a marginally significant condition by pretest performance interaction, F(1,105)=3.1, p=.08. Consequently, the same model (excluding the pretest performance factor) was applied separately to the data from low and high pretest performers. This analysis found a significant effect of training condition among high performers, F(1,29)=.706, p=.022, indicating higher transfer in the varied condition (0.047) than in the non-varied condition (-0.167), but no effect of training condition among low performers, F(1,76)=.042, p=.838 (varied: 0.410, non-varied: 0.397).

In addition to the effect of training condition on transfer, we were also interested in whether training condition affected participants’ ability to induce a general method for solving SWR problems. The proportions of participants providing correct and abstract solution descriptions (i.e., receiving scores of 2 on the correctness and abstractness scales) within each training condition are shown in Figure 6. In the varied condition, 40% of participants’ solutions were scored as correct, 62% as abstract, and 29% as both correct and abstract. In the non-varied condition, 56% of participants’ solutions were scored as correct, 39% as abstract, and 20% as both correct and abstract.

Figure 6. Percent generating correct or abstract general solutions by training condition.

The Breslow-Day test, a non-parametric test for stratified analysis of 2x2 tables, was applied to the frequencies of best (2) and other (0-1) scores within each training condition (varied or non-varied) for each aspect rated (correctness or abstractness). The relative frequencies of best vs. other scores between training conditions differed significantly according to aspect rated, p=.004. In other words, the effectiveness of varied relative to non-varied training was greater with respect to abstractness than with respect to correctness. To further clarify this effect, Pearson’s chi-square tests were applied to the contingency tables of best vs. other scores by training condition separately for each aspect. These analyses found that abstract solutions were more common in the varied than in the non-varied condition, p=.028, but the proportion of correct solutions did not differ by training condition, p=.152.

Were participants who provided solutions that were abstract, correct, or both more likely to perform well on the posttest? Average posttest scores among participants displaying each combination of solution abstractness and correctness are shown in Figure 7. (Participants were approximately equally distributed over these combinations.) Scores were virtually identical for each of these combinations: 0.50 for both correct and abstract, 0.49 for neither abstract nor correct, 0.48 for abstract but not correct, and 0.48 for correct but not abstract. A mixed ANOVA applied to posttest scores with solution correctness (correct or not), solution abstractness (abstract or not), pretest performance, and training condition as between-subjects factors and problem category as a within-subjects factor found no significant main effects of solution correctness or abstractness, no significant interaction between them, and no significant interaction of either or both with any other factor. (None of these effects were significant when transfer rather than posttest scores were entered into the model.)

Figure 7. Average transfer scores by correctness and abstractness of generated solution and test problem pair.

Discussion

This experiment investigated whether exposure to multiple examples of an abstract mathematical concept followed by comparison among them would lead to better induction of the general concept when the semantic schemas of the examples were systematically varied during learning than when all examples were based on the same schema. As predicted, participants in the varied condition both induced more abstract solution methods for SWR problems, and showed greater improvement on a transfer test requiring them to apply such methods. These results suggest that schema-level variation of examples can be an effective way to promote transfer.

Caution is necessary in interpreting these results because the advantage of the varied over the non-varied condition in promoting transfer was almost entirely driven by high pretest performers. Low pretest performers did not benefit from the varied condition, although they were not hurt by it either. A possible reason is that the dissimilarity between examples in the varied condition made it difficult to notice their shared structure. This difficulty might be overcome by presenting several examples from the same schema, thus facilitating comparison and alignment of the examples, before introducing schema-level variation. Consistent with this view, Kotovsky and Gentner (1996) found that children initially presented with several examples sharing both abstract structure and superficial details were later able to notice shared structure even in the absence of superficial similarity. Similarly, Elio and Anderson (1984) found that category learning was better after a learning schedule beginning with low variation among exemplars and later progressing to more variation, as opposed to one beginning with and maintaining a high level of variability.

Interestingly, Elio and Anderson (1984) also found that when learners were specifically instructed to take an analytical approach to category learning, the effectiveness of training with initially high variability improved. Similarly, high pretest performers in the present study, who may have been better equipped to take an analytical approach to learning the SWR concept, derived greater benefits from varied relative to non-varied training. One account for this result is that good learners are more attentive to the features and relations that are relevant to domain principles. Consequently, good learners would be less likely to be distracted by, and more likely to benefit from, variation in extraneous features and relations. Considering this conclusion together with the previous one regarding weaker learners, the best instructional approach might be an adaptive one, beginning with examples drawn from a single schema and transitioning to schema-level variation once learners demonstrate understanding of the target concept in the context of the initial schema. This interesting possibility deserves further investigation.

However, the observed advantage of the varied training for high pretest performers must also be interpreted with caution. Transfer scores among high pretest performers were rather low, averaging around zero in the varied condition and below zero in the non-varied condition. One interpretation of these data is that varied training merely helped to avoid negative transfer, and did not actually benefit learners. On the other hand, high pretest performers might be expected to show regression to the mean on the posttest, resulting in negative scores on our measure of transfer. In this case, the actual (slightly above zero) transfer scores in the varied condition would represent a positive effect of training. It is difficult to disambiguate between these possibilities due to the lack of a control condition in the present study. Also, the inclusion of particularly difficult transfer problems, i.e., those in the OAPpl category, may have obscured the presence of positive transfer by bringing down the overall average. The beneficial effects of schema-level variation might be better explored in future studies by using a wider range of relatively easy transfer problems.

In addition to their differing effects on transfer, the varied and non-varied training conditions also led to differing levels of success in describing general solutions for SWR problems. In particular, while participants in both conditions were equally able to describe correct solutions, those in the varied condition were better able to characterize the elements of those solutions in abstract, general terms. Previous research has demonstrated that comparison between multiple analogous examples can lead participants to induce their shared abstract structure (Gentner et al., 2003; Gick & Holyoak, 1983). The present findings build on that principle by suggesting that if the examples in question share semantic content not intrinsic to the desired structure, learners may induce a more limited, less general concept than if such extraneous semantic content is systematically varied across learning examples. Moreover, not only superficial elements but also more abstract semantic structures, such as the schemas of the present study, can count as extraneous content in this context. This conclusion implies that instructional design in mathematics could benefit from attention to variation of semantic schemas across examples of a given concept.

Although the varied condition led both to more abstract described solutions and to better transfer performance, the former effect did not mediate the latter as expected. In fact, participants who succeeded in describing general solutions were not more likely than other participants actually to demonstrate successful transfer. This result is surprising in light of previous research, in which the quality of participants’ generalizations following exposure to multiple examples of a concept did predict their ability to apply the concept to novel cases (Gick & Holyoak, 1983; Novick & Holyoak, 1991). Several explanations are possible for this dissociation of described solution methods and problem-solving performance.

First, participants may not have attempted to apply their described solutions during the transfer test, possibly due to failure to recall the solutions or failure to recognize their relevance. However, these possibilities seem unlikely given that the transfer test was administered immediately after participants described their general solutions, and that the problems in the transfer test were presented in the same format and with very similar wording to those in the training. Second, participants may have attempted to apply their solutions, but failed to do so successfully on either or both pairs of transfer problems. Such failure might have been due either to inability to map the elements of the transfer problems to the roles mentioned in their solutions, or to inability to apply the solution procedure despite having correctly mapped the corresponding elements. Both of these issues have been implicated in failures of analogical transfer in mathematics learning (Novick & Holyoak, 1991). Future research might disambiguate between these possibilities by, on the one hand, directly testing whether participants could map elements in the transfer problems to those in training problems, and on the other hand, testing the effects of providing such a mapping to participants.

Regardless of why posttest performance was not predicted by participants’ ability to describe correct and general solution methods, it is clear that such ability was not the cause of the superior transfer observed in the varied over the non-varied condition. The question then arises: what was the cause of that advantage in transfer? Because this advantage was dissociated from explicit, articulable knowledge of how to solve the problems, it seems likely to relate to some form of implicit knowledge, e.g., improved perception / encoding of problems or improved procedural skill. Because the procedures required were essentially the same across problems and conditions, the perceptual explanation seems more likely. The varied condition may have encouraged learners to encode the elements of the problems in terms of their general roles in the mathematical structure of SWR, rather than in terms of their more specific roles in one or another semantic schema. Such improved encoding could, in turn, have facilitated application of the solution procedures learned during training to the transfer problems. This explanation is admittedly speculative, but offers a promising direction for future research.

Acknowledgments

This research was supported by National Science Foundation REESE grant 0910218.

References

Bassok, M., Wu, L.-L., & Olseth, K. L. (1995). Judging a book by its cover: Interpretative effects of content on problem-solving transfer. Memory & Cognition, 23(3), 354-367.

Elio, R., & Anderson, J. R. (1984). The effects of information order and learning mode on schema abstraction. Memory & Cognition, 12(1), 20-30.

Gentner, D., Loewenstein, J., & Thompson, L. (2003). Learning and transfer: A general role for analogical encoding. Journal of Educational Psychology, 95(2), 393-405.

Gick, M. L., & Holyoak, K. J. (1983). Schema induction and analogical transfer. Cognitive Psychology, 15, 1-38.

Kaminski, J. A., Sloutsky, V. M., & Heckler, A. F. (2008). The advantage of abstract examples in learning math. Science, 320, 454-455.

Kotovsky, L., & Gentner, D. (1996). Comparison and categorization in the development of relational similarity. Child Development, 67(6), 2797-2822.

Medin, D. L., & Ross, B. H. (1989). The specific character of abstract thought: Categorization, problem solving, and induction. In R. J. Sternberg (Ed.), Advances in the Psychology of Human Intelligence, Vol. 5 (pp. 189-223). Hillsdale, NJ: Lawrence Erlbaum.

Novick, L. R., & Holyoak, K. J. (1991). Mathematical problem solving by analogy. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17(3), 398-415.

Nunes, T., Schliemann, A. D., & Carraher, D. W. (1993). Street Mathematics and School Mathematics. Cambridge University Press.

Ross, B. H. (1987). This is like that: The use of earlier problems and the separation of similarity effects. Journal of Experimental Psychology: Learning, Memory, and Cognition, 13(4), 629-639.