By Elizabeth Spelke, Ph.D., Harvard University
Summary
Our research examined the relationship between cognitive systems that underlie music and mathematical abilities. Specifically, we undertook studies to determine whether, when children or adolescents produce music—comparing and operating on melodies, harmonies, and rhythms—they activate brain systems that also enable them to compare and operate on representations of number and geometry.
Prior research has established three core systems at the foundation of mathematical reasoning. These are: 1) a system for representing small exact numbers of objects (up to three); 2) a system for representing large, approximate numerical magnitudes; and 3) a system for representing geometric properties and relationships (especially Euclidean distance and angle). Each system emerges in infancy, continues into adulthood, and is malleable by specific experiences.
Children use object representations to learn the meaning of words for numbers and the logic of verbal counting. They use approximation of large numbers to learn arithmetic and its logical properties (such as the inverse relation of addition to subtraction). They use geometric representation to learn and use symbolic maps. Students’ mastery of formal mathematics depends, in part, on these three systems.
If music training fosters mathematical ability, it could do so by activating and enhancing one or more of these systems. In addition, music training could activate and enhance processes that connect these systems. We examined, therefore, whether instruction in music is associated with higher performance on tasks that tap each of the three core systems and their connections.
We conducted three experiments in children and adolescents. All three experiments assessed participants’ performance on a total of six tests. Three of the tests assessed the function of each of the core systems underlying math capabilities. Three further tests measured abilities to connect pairs of systems to one another.
The first experiment compared test performance of participants who had moderate training in music or sports. This experiment involved 85 children and adolescents, aged 5 to 17. The second experiment compared performance of participants who had intense music training to those who had little such training. This experiment involved 32 children, aged 8 to 13, with high-intensity music training and 29 same-aged children with low-intensity music training. The third experiment compared the effects of intensive training in music, dance, theater, creative writing, and visual arts. Participants were 80 students, aged 13 to 18, attending a private school for the arts.
Our results show that intensive music training is associated with improved performance in the core mathematical system for representing abstract geometry. Controlling for an array of other variables (such as IQ, academic performance, social and economic factors), we found that intensively music-trained students outperformed students with little or no music training at detecting geometric properties of visual forms, relating Euclidean distance to numerical magnitude, and using geometric relationships between forms on a map to locate objects in a larger spatial layout. While this association was found for participants who received intensive music training, our sample size might have been too small to detect more subtle improvements that may occur from less intensive music training. That possibility needs to be explored in future studies.
Our findings are consistent with recent studies that suggest that sequences of tones spontaneously activate representations of space in the brain. We are now beginning to use behavioral and brain imaging methods to explore the nature and development of this relationship in infants and young children.
Introduction
Music is universal across human cultures and is a source of pleasure to people of all ages, but its place in humans’ cognitive landscape is poorly understood. The present research investigates the relation between the cognitive systems at the foundations of music (and possibly other arts) and those at the foundations of mathematics and science. When children or adults produce or actively listen to music, comparing and operating on melodies, harmonies, and rhythms, do they activate brain systems that also allow them to compare and operate on representations of number and geometry? We explore this question by investigating whether instruction in music is associated with higher core abilities in the numerical and spatial domains.
Many studies have documented that musically trained children and adults outperform their untrained peers on tests of academic aptitude, especially in mathematics (see Schellenberg, 2005, for discussion). These studies are consistent with the possibility that musical training enhances children’s core cognitive capacities. The studies are not conclusive, however, because they focus on complex measures of higher cognitive function, such as IQ scores (e.g. Schellenberg, 2004), that depend on multiple, diverse cognitive abilities. If music instruction enhances particular core cognitive functions, however, its effects should be evident in tasks that tap those core functions more directly.
This report describes three experiments that were conducted to investigate associations between music training and performance on each of the core systems at the foundations of higher cognition in mathematics and the sciences. The experiments focus on children in three age groups (from kindergarten to high school) and on music instruction at three levels of intensity (from mild extracurricular interest to principal discipline). In different studies, we compare effects of music training to a no-instruction control, to sports training, or to training in other arts disciplines. By these diverse approaches, we test for fundamental associations between musical and mathematical cognition.
Core Systems of Number and Geometry
Formal mathematics and science are recent achievements in human history: recursive natural number systems emerged only several thousand years ago and are not universal among humans today (Everett, 2005; Pica, Lemer, Izard & Dehaene, 2004). Geometric mapmaking is even more recent, and the formal unification of number and geometry is less than 400 years old (see Dehaene, 1997, for discussion). Thus, the human brain cannot have been shaped, by natural selection, to perform symbolic mathematics. When children learn mathematics, they harness brain systems that evolved for other purposes.
What are those systems and purposes? Research in cognitive neuroscience provides evidence for three core systems at the foundations of mathematics and science: a system for representing small, exact numbers of objects (up to 3); a system for representing large, approximate numerical magnitudes (e.g., about 20); and a system for representing geometric properties and relationships (especially Euclidean distance and angle) (Feigenson, Dehaene & Spelke, 2004; Wang & Spelke, 2002). Each of these systems emerges in human infancy and continues to function in adults.
Each of the systems, moreover, is malleable by specific experiences (e.g., Green & Bavelier, 2003; Baenninger & Newcombe, 1989; Dehaene et al., 2006; Newcombe & Uttal, 2006). Behavioral experiments provide evidence that these systems guide adults’ intuitive reasoning about object mechanics (e.g., McCloskey, 1983), mental arithmetic (e.g., Gallistel & Gelman, 1992), and spatial relationships (e.g., Shepard & Metzler, 1971). Studies using functional brain imaging provide evidence that the systems are activated when adults or children solve problems in symbolic mathematics (e.g., Piazza, Pinel, & Dehaene, 2006; Temple & Posner, 1998). Moreover, impairment of these systems is associated with impaired mathematical and spatial performance (e.g., Lemer, Dehaene, Spelke, & Cohen, 2003; Cappelletti, Barth, Fregni, Spelke, & Pascual-Leone, 2007).
Most importantly, each of these systems supports young children’s learning of formal mathematics and science. Children use the system of object representation to learn the meanings of number words and the logic of the system of verbal counting (Lipton & Spelke, 2006). Children also use this system to organize their learning about the mechanical properties of objects (Kim & Spelke, 1999; Huntley-Fenner, Carey, & Solimando, 2002). Children use the system of large, approximate numbers to learn symbolic arithmetic (Gilmore, McCarthy, & Spelke, 2007) and to master logical properties of arithmetic such as the inverse relation between addition and subtraction (Gilmore & Spelke, in press). Finally, children as young as two years of age use the system of geometric representation to make sense of symbolic maps (Winkler-Rhoades, Carey, & Spelke, 2007).
Although the three systems at the core of number and geometry are relatively independent of one another at the start of human life, they become linked over the course of human cognitive development, and the most important linkages emerge before children begin their formal education. When children master verbal counting, at about 4 years of age, they connect their representations of small, exact numbers and large, approximate numbers to construct new concepts of natural number (Spelke, 2003).
By the time children enter kindergarten, they have come to relate their representations of number to their representations of space and construct the “number lines” that will become central to their mastery of measurement and higher mathematics (Siegler & Opfer, 2003; de Hevia & Spelke, in review). Even earlier, children begin to link their representations of objects and geometry: they detect the geometric relations among a set of objects and use those relations to understand maps (Shusterman, Lee & Spelke, in press; Winkler-Rhoades et al., 2007). These linkages provide kindergarten children with highly useful and versatile tools for mastering mathematics and science in school (see Gelman, 1991; Griffin & Case, 1996).
If music training enhances mathematical ability, therefore, it could act in either of two general ways. First, musical processing may specifically activate one or more of the core systems, and so music training may enhance processing of small exact numbers, large approximate numbers, or geometric relationships. Second, musical processing may specifically activate processes that connect different systems together in the construction of natural number, number lines, or symbolic maps.
Our consortium research investigates the first suite of possibilities by testing whether children and adolescents with more musical experience show enhanced representations of small exact numbers, of large approximate numbers, or of geometric relationships in small-scale visual arrays or large-scale spatial layouts. It investigates the second suite of possibilities by testing whether children with more musical experience show enhanced abilities to represent and operate on symbolic natural numbers, number lines, and maps.
The Experiments
We conducted three experiments comparing children and adolescents with training in music to those with (a) no specific training, (b) training in sports, or (c) training in other art forms. Separate experiments tested for effects of mild to moderate training in children and adolescents (Experiment 1), moderate to intense training in children (Experiment 2) and highly intense training in adolescents (Experiment 3).
In Experiment 1, we studied 5- to 17-year-old children and adolescents from an affluent suburban Massachusetts community who attended either a music school offering Saturday music lessons, or a soccer program offering Saturday sports lessons. Both programs offered lessons to all interested children, and both were attended primarily by children with only a mild to moderate interest in music or sports. In Experiment 2, we compared a group of 8- to 13-year-old children receiving intense music training, mostly recruited from music schools in the Boston area to which they were admitted by audition, to a group of control children with little or no music training. In Experiment 3, we studied 13- to 18-year-old students attending a private and selective high school for the arts, and we compared those majoring in different arts disciplines: music, dance, theater, writing, or visual arts.
Methods and Measures
Participants were given six behavioral tests of mathematical and spatial abilities: three tests tapping core systems of numerical and spatial representations and three tests tapping abilities to link those systems. Because all of the abilities emerge in preschool children, we were able to use the same tests on participants ranging in age from 6 to 18 years. On each test, participants were tested at a wide range of difficulty levels so as to reveal patterns of variation both across age and across individuals.
In addition to these tests (described below), each participant was given a test of verbal ability (the vocabulary subtest of the Wechsler Intelligence Scale for Children), and a questionnaire assessment of his or her history of lessons and practice in music, other arts, and sports. The vocabulary test was started at different levels for the elementary school and high school students. The questionnaire was answered by the participants themselves in the high school sample and by the participants’ parents in the younger samples.
Small exact number: Multiple Object Tracking
To investigate whether music training enhances the ability to represent small, exact numbers of objects, we created a child-friendly version of the Multiple Object Tracking (MOT) task (Pylyshyn & Storm, 1988). In our version of the task, children saw an array of white dots described as ladybug eggs, and they were instructed to keep track of a subset of the eggs in the array while all the eggs moved independently around the screen. After 10 seconds of motion, the eggs stopped moving, two eggs turned into ladybugs, and the child indicated which of the two ladybugs was in her tracked set (Figure 1a). The number of eggs tracked varied across trials (from 2 to 6; 10 trials at each number); on each trial, one of the two ladybugs was a member of the tracked set.
This task was selected because a wealth of research provides evidence that the same signature limits characterize adults’ performance on this task and infants’ performance in tests assessing small-number representations (see Scholl, 2001, and Carey & Xu, 2001, for review). Moreover, task performance has been shown to be highly malleable by experience (Green & Bavelier, 2003) yet little affected by verbal or symbolic object knowledge (Scholl, 2001). If music training enhances representations of small numbers of objects, then children and adolescents with a history of music training should track objects more accurately.

Figure 1a. Multiple object tracking (“which is your ladybug?”) (image courtesy of Kristen LaMont) 
Large, approximate number: comparison
To investigate whether music training enhances the ability to represent the approximate cardinal values of large sets of objects, we used a variant of the method of Barth, La Mont, Lipton, & Spelke (2005) to assess children’s and adolescents’ abilities to compare two visual arrays of elements on the basis of number over variation in continuous quantities. In our version of the task, participants saw an array of blue dots and an array of red dots side by side, and they indicated which array had more dots (Fig. 1b).

Figure 1b. Numerical comparison (“are there more blue or red dots?”) (images courtesy of Spelke Lab) 
The ratio of the two numbers varied across trials, as did the continuous quantitative variables of dot size, position, density, and array area. Each participant received 48 trials, 8 per ratio, with arrays differing in number at each of 6 ratios: 2:3, 3:4, 4:5, 5:6, 7:8, and 9:10. For each participant, we computed the proportion of correct comparisons at each of these ratios (chance=.50) and tested for associations with music training both overall and at each ratio. If music training enhances representations of large, approximate numerical magnitudes, then children with music training should choose the array with the larger number more accurately, especially at the more difficult ratios.
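The per-ratio scoring described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `proportion_correct_by_ratio` and the response counts are hypothetical and are not data from the study.

```python
from collections import defaultdict


def proportion_correct_by_ratio(trials):
    """Return {ratio_label: proportion correct} from (ratio_label, correct) pairs."""
    counts = defaultdict(lambda: [0, 0])  # ratio -> [n_correct, n_trials]
    for ratio, correct in trials:
        counts[ratio][0] += int(correct)
        counts[ratio][1] += 1
    return {ratio: n_correct / n_trials
            for ratio, (n_correct, n_trials) in counts.items()}


# Hypothetical participant: number of correct answers out of 8 trials per ratio.
hypothetical_correct = {"2:3": 8, "3:4": 7, "4:5": 6, "5:6": 6, "7:8": 5, "9:10": 4}
trials = [(ratio, i < k)
          for ratio, k in hypothetical_correct.items()
          for i in range(8)]

by_ratio = proportion_correct_by_ratio(trials)      # accuracy at each ratio
overall = sum(c for _, c in trials) / len(trials)   # accuracy over all 48 trials
```

Each per-ratio proportion can then be compared against the chance level of .50, and the overall proportion summarizes the 48 trials.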
Geometry: Detecting invariants
To investigate whether music training enhances the ability to represent geometric properties in visual forms, we used a task developed by Dehaene, Izard, Pica & Spelke (2006) for studies of sensitivity to geometry in adults and children. On each trial, children viewed six geometric figures that differed in size and orientation. Five of the figures shared a single property not shared by the sixth figure; children were instructed to find the deviant figure (Fig. 1c).

Figure 1c. Detecting geometric invariants (“which one is different?”) 
Participants received a total of 45 trials testing for properties in 7 categories: topological properties, Euclidean properties of points and lines, Euclidean properties of figures, symmetry, chirality, proportionality, and geometrical transformations. Their performance was assessed both overall and within each category (chance=.167). If music training enhances sensitivity to geometry in visual arrays, then children with music training should detect more reliably the geometric properties that unite 5 of the 6 figures in each trial, and they therefore should choose the deviant figure with higher accuracy.
Linking small and large numbers: verbal estimation

Figure 1d. Numerical estimation (“guess how many dots are on the screen.”) 
We used a verbal estimation task to assess participants’ command of the system of number words and sensitivity to their numerical reference. On each trial, participants saw an array of dots, presented too briefly for counting, and gave a verbal estimate of their number (Fig. 1d). Participants were tested with 3 arrays at each of 15 numerosities (10 through 150, in increments of 10). At each numerosity, the sizes and distributions of elements varied across trials. From the verbal estimates, we calculated a measure of the accuracy of estimation performance by taking the average distance of each estimate from the true number. If musical training enhances the link between approximate number and language that underpins knowledge of number words and verbal counting, then participants with music training may perform more accurately on this test.
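One way to read the accuracy measure described above is as a mean absolute error. The sketch below assumes that "distance" means the absolute difference between estimate and true number, which the text does not state explicitly. The function name and the responses are hypothetical.

```python
def mean_absolute_error(trials):
    """Average |estimate - true number| over (true_number, estimate) pairs."""
    trials = list(trials)
    return sum(abs(estimate - true) for true, estimate in trials) / len(trials)


numerosities = range(10, 151, 10)  # 15 numerosities: 10, 20, ..., 150

# Hypothetical participant who underestimates every array by 20%,
# with 3 arrays shown at each numerosity.
hypothetical_trials = [(n, 0.8 * n) for n in numerosities for _ in range(3)]

error = mean_absolute_error(hypothetical_trials)
```

Smaller values indicate more accurate estimation, so a benefit of music training would appear as a lower error score.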
Linking number and space: Number lines
We used a task developed by Siegler and Opfer (2003) to investigate participants’ command over the relationship between numerical and spatial magnitudes. On each trial, participants were presented with a line whose left endpoint was marked “1” and whose right endpoint was marked with a higher number (e.g., “100”). Above this line was a third number between those extremes (e.g., “38”; Fig. 1e). The participants’ task was to mark the location on the line where that number belonged. The task was presented in paper-and-pencil form. Each participant received 12 trials at each of 3 number line scales: 1 to 100, 1 to 1,000, and 1 to 1,000,000. Performance was analyzed by recording the location of each point and calculating its accuracy by means of the same accuracy measure as for the estimation task, after transforming the numerical magnitudes so that errors at each of the magnitude ranges were computed on a common scale.

Figure 1e. Number line task (“Draw a mark on the line where the number in the circle belongs.”) 
Linking geometry and objects: Maps
Our final task used a method developed by Dehaene et al. (2006) to assess the ability to represent geometric properties of the surrounding environment in maps. On each of 18 to 28 trials (with more trials for the older students), children viewed a simple map depicting three geometric forms in a triangular arrangement, while facing away from an array of three containers forming a similar triangle but 12 times larger and at a different orientation. The experimenter pointed to a single location on the map and instructed the child to place an object in the corresponding container (Fig. 1f).
Across trials, the nature of the triangle (isosceles or right triangle) and its orientation were varied. On half the trials, one form on the map and one container in the room were distinct in color and served as a landmark. Children’s performance was analyzed separately for trials with and without the landmark. If music training enhances sensitivity to the geometry of the spatial layout, then children with music training should perform better on the map task.

Figure 1f. Map task (“Put the toy in this can (pointing to asterisk)”) (image courtesy of Kristen LaMont) 
Experiment 1: Effects of mild to moderate music training, relative to sports training, at 5 to 17 years of age.
Participants
Participants were 85 students aged 5 to 17 years (mean 10.4 years) in grades ranging from preschool to 11 (mean 4th grade). Participants and their parents were recruited through their participation in one of two Saturday programs in the same suburban community in Massachusetts: a music program and a soccer program.
Measures
Music and sports training each were calculated as weeks of training, derived from the start date of training in each area and the end date. In most cases, the training was ongoing and the date of test was used as the end date. Other measures were as described above.
Analyses
Preliminary analyses of students’ experience with music and sports revealed high overlap between the profiles of children recruited from the music school and those recruited from the soccer program. For this reason, the two groups were collapsed together. Associations between music training, sports training, and performance on the 6 numerical and spatial tasks were analyzed by hierarchical regressions that tested and controlled for effects of age, sex, socioeconomic status, and verbal IQ. Regressions on each task proceeded by entering age, sex, parental SES, and verbal IQ scores simultaneously as the first set of predictors, followed by music and sports training entered simultaneously as the second set of predictors. In this way, we were able to explore the effect of music and sports training independently of any effects of the suite of demographic variables.
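The two-step logic of these hierarchical regressions can be illustrated with a small, self-contained Python sketch. It is not the authors' analysis code. The data are simulated, the variable names are hypothetical, and the ordinary least squares fit uses a simple normal-equations solver in place of a statistics package. The quantity of interest is the change in R-square when the training variables are added to a model that already contains the demographic variables.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(predictors, y):
    """R-square of an OLS fit of y on the predictor columns plus an intercept."""
    rows = [[1.0] + list(p) for p in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * x for b, x in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Simulated (not real) data for 85 hypothetical participants.
rng = random.Random(0)
n = 85
age = [rng.uniform(5, 17) for _ in range(n)]
sex = [float(rng.randint(0, 1)) for _ in range(n)]
ses = [rng.gauss(0, 1) for _ in range(n)]
verbal_iq = [rng.gauss(100, 15) for _ in range(n)]
music_weeks = [rng.uniform(0, 300) for _ in range(n)]
sports_weeks = [rng.uniform(0, 300) for _ in range(n)]
# The simulated score depends on age and verbal IQ only, plus noise.
score = [0.5 + 0.02 * a + 0.001 * v + rng.gauss(0, 0.05)
         for a, v in zip(age, verbal_iq)]

step1 = [age, sex, ses, verbal_iq]           # demographic block
step2 = step1 + [music_weeks, sports_weeks]  # adds the training block
r2_step1 = r_squared(step1, score)
r2_step2 = r_squared(step2, score)
delta_r2 = r2_step2 - r2_step1               # variance unique to training
```

Because the simulated scores do not depend on training, the change in R-square stays close to zero, which mirrors the pattern of a training block that adds no unique variation once the demographic block is in the model.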
Findings
Small, exact number
Performance on the Multiple Object Tracking task was high yet below ceiling (mean 87.1%, chance=50%, t(84)>4,000, p<.001), and it showed reasonably high variability across children (range 57-100%, s.d. 9.5) (Fig. 2a). A one-way analysis of the effect of number of items tracked (5 levels: 2, 3, 4, 5, and 6) revealed the expected linear contrast, with declining accuracy as the number tracked increased (F(1,84)>169, p<.001). Because the distribution of overall performance was both negatively skewed (z-skewness=4.64) and kurtotic (z-kurtosis=2.59) and because mean performance was quite high, we focused our correlational and regression analyses on participants’ performance on the trials with 5 or 6 targets. Performance on this subset of trials was above chance (mean 76.5%, t(84)>3,000, p<.001) and below ceiling, and variable across participants (range 40-100%, s.d. 13.7), and the distribution was normal.
The hierarchical regression analysis revealed that age and SES significantly predicted performance on trials with 5 or 6 targets (standardized beta of age=.475, p<.001; standardized beta of SES=.205, p=.043), but that neither music nor sports training predicted unique variation after the variation related to the suite of demographic variables was removed (R-square=.250, delta R-square=.003, 1st step of model p<.001, 2nd step of model p>.8, n.s.).

Figure 2a. Multiple object tracking performance in Experiment 1. 
Large-number comparison
Performance on the numerical comparison task also was above chance and below ceiling (mean 70.1%, t(84)>6,800, p<.001), and variable across children (range 52-83%, s.d. 6.65). A one-way ANOVA of the effect of discrimination ratio revealed the expected linear contrast (F(1,84)>168, p<.001), showing decreasing accuracy as the ratio of the two numbers approached 1 (Fig. 2b).
In the hierarchical regression analysis, age was the only significant predictor of task performance (standardized beta=.37, p=.001). Neither music nor sports training predicted unique variation in task performance after the variation related to the suite of demographic variables was removed (R-square=.143, delta R-square=.007, 1st step of model p=.016, 2nd step of model p>.7, n.s.).

Figure 2b. Numerical comparison performance by ratio in Experiment 1. 
Geometrical invariants
Performance on the geometry task was above chance (mean 73.70%, t(84)>990, p<.001), below ceiling, and variable (range 33-95%, s.d. 14.8). Because the distribution was negatively skewed (z-skewness=2.96), the performance variable was transformed by taking the square root of its reflection (as was done with certain variables in Experiment 1). A one-way analysis comparing performance in the 7 different categories of trials revealed significant differences in sensitivity to the different geometric properties (Fig. 2c). Because of nonsphericity in the data, the Greenhouse-Geisser correction was used for interpreting this effect (F(4.24, 356.52)>63, p<.001).
After Bonferroni correction, post-hoc pairwise comparisons revealed that performance on the topology and Euclidean geometry categories was superior to performance on all other categories of trials (p’s<.001), performance on the geometric figures category was significantly better than performance on the chiral figures, metric properties, and geometrical transformation categories (p’s<.001), and performance on the symmetrical figures and metric properties categories was significantly better than on the geometrical transformations category (p=.031 and p=.007, respectively).
The hierarchical regression analysis of overall task performance revealed that age and verbal IQ predicted performance on the geometry task (standardized beta of age=.804, p<.001; standardized beta of verbal IQ=.264, p=.001), such that subjects who were older or of higher verbal IQ scored better on this task. However, neither music nor sports training predicted unique variation on the task, after the variation related to the suite of demographic variables was removed (R-square=.603, delta R-square=.004, 1st step of model p<.001, 2nd step of model p>.6, n.s.).
A set of similar regression analyses tested performance on each of the subsets of the task. Age predicted performance on all subsets of the task, but music and sports training amounts were never significant predictors of performance after partialing out the effects of the demographic variables.

Figure 2c. Sensitivity to geometric invariants in Experiment 1. 
Estimation
Performance on this task showed a significant linear contrast between number of dots presented visually and the verbal estimates given (F(1,84)>851, p<.001), indicating that participants’ estimates increased linearly and monotonically as the number of dots presented increased (Fig. 2d). Because the distribution of the measure of error was positively skewed (z-skewness=3.72), we transformed the variable by taking its square root.
The hierarchical regression analysis revealed that age was a significant predictor of performance on the estimation task (standardized beta=.607, p<.001) such that older children had smaller estimation error, but that neither music training nor sports training accounted for unique variation in the task after variation related to the suite of demographic variables was removed (R-square=.116, delta R-square=.071, 1st step of model p=.047, 2nd step of model p=.044).

Figure 2d. Estimation performance in Experiment 1. 
Number Line
Performance on the number line task showed reasonable accuracy at all three scales (Fig. 2e). One-way repeated measures ANOVAs on spatial estimates at each target number, for each scale, revealed a significant linear contrast on the hundreds scale (F(1,84)>2,600, p<.001), the thousands scale (F(1,82)>3,600, p<.001; 2 children did not complete this scale), and the millions scale (F(1,74)>1,700, p<.001; 10 children did not complete this scale). Because error distributions at each of the three scales were both positively skewed and kurtotic (z-skewness>3, z-kurtosis>=2.57), each distribution was transformed by logarithm, to minimize skewness and kurtosis. A one-way ANOVA on log-transformed performance by scale (3 levels) revealed the expected linear contrast (F(1,74)>12,000, p<.001), with greater errors on trials at the higher scales.
In order to facilitate correlational and regression analyses, we formed a composite distance score by averaging the untransformed distance scores on each of the scales, after adjusting the thousands and millions scale scores down to the same order of magnitude as the 100s distance. This was done by dividing the thousands distance by ten and the millions distance by 10,000. The distribution of this composite measure was both highly positively skewed and kurtotic, and so it was transformed by logarithm, to yield a normal distribution.
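The rescaling and averaging just described can be written out as a short Python sketch. The function name and the input values are hypothetical. The natural logarithm is used here only as an assumption, since the base of the logarithm is not specified in the text.

```python
import math


def composite_distance(hundreds, thousands, millions):
    """Average three mean-distance scores after rescaling to the 1-100 range."""
    rescaled = [hundreds, thousands / 10, millions / 10_000]
    return sum(rescaled) / len(rescaled)


# Hypothetical mean distances for one participant on the three scales.
composite = composite_distance(6.0, 90.0, 120_000.0)  # (6 + 9 + 12) / 3

# Log transform to reduce positive skew (log base is an assumption).
log_composite = math.log(composite)
```

Dividing by 10 and by 10,000 puts an error of a given proportional size on the same footing at every scale, so that no single scale dominates the average.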
The hierarchical regression analysis on this transformed variable revealed that age, sex, and verbal IQ predicted performance on the number line task (standardized beta of age=.669, p<.001; standardized beta of sex=.379, p<.001; standardized beta of verbal IQ=.244, p=.006), such that older children, boys, and children with higher verbal IQ scores performed better. However, neither music training nor sports training predicted unique variation on this task, after the variation related to the demographic variables was removed (R-square=.582, delta R-square=.016, 1st step of model p<.001, 2nd step of model p>.2).

Figure 2e. Number Line placements in Experiment 1. 
Map
Performance on the map task was well above chance (mean 78.58%, t(84)>2,000, p<.001) yet below ceiling and variable (range 42-100%, s.d. 14.2). Because the distribution was negatively skewed (z-skewness=2.36), we transformed the scores by reflecting them and taking the square root of the reflected scores in order to achieve a normal distribution. A preliminary one-way ANOVA on the 2 different categories of trials (with and without landmarks) revealed marginally higher performance on landmark trials (F(1,84)=3.94, p=.051; Fig. 2f).
The hierarchical regression analysis of overall task performance showed that age and sex significantly predicted performance on the map task (standardized beta of age=.520, p<.001; standardized beta of sex=.244, p=.012), such that older children and boys performed better on the map task. Music and sports training did not predict unique variation on the task after the variation related to the demographic variables was removed (R-square=.316, delta R-square=.013, 1st step of model p<.001, 2nd step of model p>.4, n.s.). Similar regression analyses of the subset of trials with and without landmarks also revealed no effects of music or sports training after controlling for the demographic variables.
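The reflect-and-square-root transformation for negatively skewed scores can be sketched as follows. The reflection constant (highest score plus 1) is an assumption borrowed from the description of the transformation in Experiment 2. The function name and the scores are hypothetical.

```python
import math


def reflect_sqrt(scores):
    """Reflect scores about (highest score + 1), then take the square root."""
    top = max(scores) + 1
    return [math.sqrt(top - s) for s in scores]


# Hypothetical percent-correct scores bunched toward the high end.
hypothetical_scores = [42, 60, 75, 84, 91, 96, 100]
transformed = reflect_sqrt(hypothetical_scores)
```

Reflection turns the long lower tail into a long upper tail, which the square root then compresses. It also reverses the direction of the scale: higher raw scores map to lower transformed values, so the sign of any association computed on the transformed scores is flipped relative to the raw scores.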

Figure 2f. Map performance on trials with and without landmarks in Experiment 1. 
Summary
Experiment 1 tested for effects of mild to moderate levels of music training on a suite of numerical and spatial abilities. In this experiment, we attempted to control for motivation and extracurricular engagement by recruiting participants either from a Saturday music school or a Saturday sports program. We also controlled for effects of age, verbal IQ, and SES on children’s performance. Preliminary analyses (not reported here) showed associations between music training and performance on a number of our mathematical and spatial tasks, but these relationships disappeared when we controlled for the suite of demographic variables, including verbal IQ, that correlated with music training. Age, verbal IQ, and SES predicted performance on a number of our tasks, but the amount of training in music (or sports) did not, when the effects of these variables were controlled.
These findings provide no evidence, therefore, that low or moderate levels of music training specifically enhance any core mathematical abilities. The next experiment accordingly tested for effects of more intense music training on the same abilities.
Experiment 2: Effects of intensive music training, relative to no training, at 8-13 years of age.
Participants
A total of 61 elementary and middle-school children aged 8 to 13 years (mean 10.97 years) participated in the experiment. Of these, 32 children had high levels of music training and 29 children had low levels. Most of the children with high music training were recruited from selective music schools in the Boston area; most of the control children were recruited from the records of the Lab for Developmental Studies at Harvard University.
Measures
Music training was measured in weeks, calculated by date of start and end of music training. In most cases, music training was ongoing at time of test, so the date of test was used as the end date. All other measures were as described above, except that the number line task was not administered.
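The weeks-of-training measure described above can be sketched as a simple date difference. This is a minimal sketch under stated assumptions: the function name and arguments are hypothetical, and fractional weeks are kept rather than rounded, which the report does not specify.

```python
from datetime import date

def training_weeks(start, end=None, test_date=None):
    """Weeks of music training between the start and end dates.

    If training is ongoing (no end date), the date of test is used
    as the end date, as described in the text.
    """
    if end is None:
        if test_date is None:
            raise ValueError("ongoing training requires a test date")
        end = test_date
    if end < start:
        raise ValueError("end date precedes start date")
    return (end - start).days / 7

# Hypothetical child who started lessons in September 2005 and was
# still taking them when tested in March 2008.
weeks = training_weeks(date(2005, 9, 1), test_date=date(2008, 3, 15))
print(round(weeks, 1))
```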
Analyses
Preliminary analyses showed that the musicians and control group did not differ significantly by parental SES (p=0.508), but that the musicians were older than the control group (t(59)=2.26, p=0.028) and had higher verbal IQ scores (t(59)=2.70, p=0.009). In order to control for differences in age and verbal IQ, ANCOVAs compared performance of the music-trained and control groups on the experimental tasks, after controlling for these covariates. In addition to the group comparisons, hierarchical regressions performed on all the participants together tested for the effect of amount of music training, after controlling for the effects of age, sex, SES, and verbal IQ. Age, sex, SES, and verbal IQ were entered simultaneously as the set of predictors for the first step of each regression, followed by music training as the second step.
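The two-step logic of the hierarchical regression can be illustrated with a small sketch: fit the demographic predictors first, then add music training and see how much R² increases (ΔR²). Everything here is an assumption made for illustration: the data are invented, only two step-1 predictors are used to keep the example short, and the plain normal-equations solver stands in for whatever statistics package the study actually used. Significance tests for each step are omitted.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def r_squared(predictors, y):
    """R^2 of an ordinary least squares fit that includes an intercept."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

def hierarchical_r2(step1, step2_extra, y):
    """Return (R^2 of step 1, increase in R^2 when step-2 predictors are added)."""
    r2_step1 = r_squared(step1, y)
    full = [list(a) + list(b) for a, b in zip(step1, step2_extra)]
    return r2_step1, r_squared(full, y) - r2_step1

# Hypothetical data: (age, verbal IQ) in step 1, weeks of music training in step 2.
step1 = [(8, 100), (9, 110), (10, 95), (11, 120), (12, 105), (13, 115)]
music = [(10,), (200,), (50,), (150,), (20,), (300,)]
y = [60, 72, 66, 85, 75, 90]

r2_step1, delta_r2 = hierarchical_r2(step1, music, y)
print(round(r2_step1, 3), round(delta_r2, 3))
```

A ΔR² near zero, as in most of the analyses reported here, means that music training explains almost no variance beyond what the step-1 variables already account for.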
Findings
Small, exact number
Raw overall performance on the Multiple Object Tracking task was high yet below ceiling (mean 88.72%, t(60)>34, p<.001) and showed moderate variability across children (range 56-100%, s.d. 8.8). To correct for negative skew in the score distribution (z-skewness=2.96), the performance scores were square-root transformed and then reflected by subtracting each score from the highest score plus 1. All subsequent analyses used the transformed scores.
A two-way ANCOVA with age and verbal IQ as covariates, crossing subpopulation (musician or control) and number of items tracked (5 levels: 2, 3, 4, 5, or 6 items) revealed a significant linear contrast on the number of items tracked (F(1,59)>158, p<.01), but no effect of Group (F(1,57)=1.20, p>.2 n.s.), and no interaction (using Greenhouse-Geisser correction for nonsphericity: F(3.50,206.45)=1.25, p>.2 n.s.). Children with high and low music training showed equal object tracking performance (Fig. 3a).
Because of the high mean of overall performance on this task, we focused the regression analyses on performance on the trials with the two largest numbers of targets (mean 80.1%, range 40-100%, s.d. 13.7). This analysis tested for the effect of music training, after controlling for age, sex, parental SES, and verbal IQ. All four demographic variables were entered simultaneously in the first step of the regression, followed by music training as the variable in the second step. The regression analysis revealed that the regression model as a whole did not predict a significant amount of variance in the scores, at either the first or second steps of the model (R² of first step=.03, ΔR²=.000, 1st step of model p>.7 n.s., 2nd step of model p>.8 n.s.).

Figure 3a. Multiple object tracking performance by music-trained and control children 8-13 years.
Large-number comparison
Overall performance on the numerical comparison task was above chance, below ceiling, and variable across children (range 46-92%, mean 75.88%, s.d. 9.90, t(60)<3,000, p<.001). Because the distribution was negatively skewed (z-skewness=2.78), scores on this task were transformed in the same way as MOT. All subsequent analyses use the transformed scores.
A two-way ANCOVA with age and verbal IQ as covariates, crossing Group (music-trained vs. untrained) and numerical ratio (6 levels) found a significant linear contrast on numerical ratio (F(1,59)>20, p<.001), no significant effect of Group (F(1,57)<1), and no interaction effect (F(5,295)=1.98, p=.08, n.s.). Children with high and low music training performed equally well on the numerical comparison task (Fig. 3b).
The hierarchical regression analysis, testing for the effect of music training after controlling for age, sex, parental SES, and verbal IQ, revealed that the regression model did not predict a significant amount of variation in performance on this task (R²=.047, ΔR²=.003, 1st step of model p>.5 n.s., 2nd step of model p>.6 n.s.).

Figure 3b. Numerical comparison by music-trained and control children 8-13 years.
Geometrical invariants
Performance on the geometry task was above chance (mean 75.52%, where chance is 16.67%, t(60)=38.73, p<.001), below ceiling, and variable (range 47-95%, s.d. 11.87). An ANCOVA with age and verbal IQ as covariates, crossing Group (music training vs. control) by geometrical property (7 levels) revealed that the effects of both covariates, age and verbal IQ, were significant (F(1,57)=9.02, p=.004; F(1,57)=6.74, p=.012, respectively). In addition, there was a significant effect of geometrical property (F(6,354)>36.74, p<.001), and no effect of group (F(1,57)=1), or interaction (F(6,354)=2.00, p>.06 n.s.).
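As a worked check, the comparison with chance reported above can be approximated from the summary statistics alone with a one-sample t statistic. This sketch assumes that the test was run on raw percent-correct scores, that the reported s.d. is the sample standard deviation of the 61 children, and that the 16.67% chance level corresponds to one correct choice out of six; none of these details is stated explicitly in the report.

```python
import math

def one_sample_t(mean, sd, n, chance):
    """t statistic (df = n - 1) for a sample mean against a fixed chance level."""
    standard_error = sd / math.sqrt(n)
    return (mean - chance) / standard_error

# Summary statistics from the text; chance is assumed to be 1 in 6.
t = one_sample_t(mean=75.52, sd=11.87, n=61, chance=100 / 6)
print(round(t, 2))
```

Under these assumptions the result lands within rounding distance of the reported t(60)=38.73.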
Post hoc pairwise comparisons after Bonferroni correction revealed that performance on the topology and Euclidean geometry categories was significantly higher than all other categories (p’s<.001), performance on the geometric figures category was significantly higher than the symmetrical figures (p=.009), metric properties (p<.001), and geometrical transformation categories (p<.001), and that performance on the chiral figures and metric properties categories was significantly higher than geometrical transformations (p<.001 and p=.042, respectively).
A subsequent set of one-way ANCOVAs compared the performance of the two groups on each of the subsets of trials. When age and verbal IQ were controlled, musicians significantly outperformed nonmusicians on the trials testing Euclidean geometric relationships (F(1,57)=5.90, p=.018; Fig. 3c), and they tended nonsignificantly to outperform nonmusicians on all the trial subsets except for the topology subset.
Hierarchical regression analysis tested for the effect of music training, after controlling for age, sex, parental SES, and verbal IQ. The analysis revealed that age and verbal IQ were significant predictors of performance on the geometry task, but that music training did not explain unique variation in performance after the effect of those variables was removed (R²=.315, ΔR²=.019, 1st step of model p<.001, 2nd step of model p>.2, standardized beta of age=.392, p=.002, standardized beta of verbal IQ=.345, p=.007).

Figure 3c. Sensitivity to geometric invariants by music-trained and control children 8-13 years.
Estimation
Performance on this task, assessed by a two-way ANOVA crossing subpopulation with average estimates at each target number, showed a significant linear contrast between the number of dots presented visually and the verbal estimates given (F(1,59)=383.18, p<.001), no effect of subpopulation (F(1,59)=2.05, p>.1 n.s.), and no interaction effect (F(14,826)<1, p>.6 n.s.; Fig. 3d).
A t-test comparing the performance of the two groups by the overall error measure revealed no significant difference between the musician and control groups (t(59)=0.061, p>0.9 n.s.). The hierarchical regression analysis testing for the effect of music training, after controlling for age, sex, parental SES, and verbal IQ, revealed that age was a significant predictor of performance on this task (standardized beta of age=.371, p=.005), but that music training did not predict unique variation in the task after removing variation accounted for by age and the other demographic variables (R²=.151, ΔR²=.000, 1st step of model p=.053, 2nd step of model p>.9).

Figure 3d. Estimation by music-trained and control children 8-13 years.
Map
Performance on the map task was well above chance, below ceiling (mean 79.38%, where chance is 33%, t(60)>2,200, p<.001) and variable (range 60-100%, s.d. 10.99).
A two-way ANCOVA crossing Group (music-trained vs. control) and task condition (arrays with vs. without landmarks) revealed a significant effect of task condition (F(1,57)>44, p<.001), with higher performance on trials with landmarks, and no significant effect of group (F(1,59)=1.70, p>.1 n.s.), or interaction (F(1,59)=3.27, p=.076; Fig. 3f). Separate one-way ANCOVAs comparing the subpopulations on the subsets of trials in arrays with versus without a landmark also revealed no significant group difference (both Fs(1,57)<1).
In contrast, the hierarchical regression analysis, testing for the effect of music training, after controlling for age, sex, parental SES, and verbal IQ, revealed that music training was a significant predictor of overall performance on the map task, after removing the variation related to the suite of demographic variables (standardized beta of music training=.318, p=.009; R²=.305, ΔR²=.082, 1st step of model p<.001, 2nd step of model p=.009). Age was also a significant predictor of map task performance (standardized beta of age=.391, p=.002), and sex was a marginally significant predictor, with males performing slightly better than females (standardized beta of sex=.211, p=.052).

Figure 3f. Map performance by music-trained and control children aged 8-13 years.
Summary
In this second experiment, the attempt to find a specific association between music training and core systems of number and geometry clearly suffered from the strong relationships between music training, verbal IQ, and age. Numerous associations between music and mathematics disappeared when the latter relationships were controlled. Nevertheless, several findings emerged.
First, there was no hint of an association between music training and representations of small exact numbers, large approximate numbers, or number words. Second, there were associations between music training and two measures of spatial cognition: sensitivity to Euclidean geometry in visual forms and use of simple geometric maps. Music training predicted geometric map performance when the suite of demographic variables and verbal IQ were controlled. Children with extensive music training also outperformed control children on all the measures of sensitivity to geometry that involved metric geometric relationships, although the effect of music training attained conventional levels of significance only on one measure after controls for age and verbal ability.
This experiment therefore suggests a relationship between music training and spatial ability, in this moderately-to-intensively trained population of children. We return to this relationship after considering the findings from the third experiment.
Experiment 3: Effects of intensive music training, relative to intensive training in other arts disciplines, at 13-18 years of age.
Participants
In total, 80 students at a private high school for the arts in suburban Massachusetts participated in the experiment. Students ranged in age from 13.84 to 18.81 years (mean=16.35) and in grade level from 9 to 12. Most (64) of the students were female, reflecting the gender ratio of the overall school population. Students were drawn from five arts programs: music (N=16), dance (N=23), theater (N=26), creative writing (N=4), and visual arts (N=11). They were tested at the school in two one-hour sessions, separated by a short break.
Measures
Training experience in music, dance, visual arts, theater, writing, and sports was calculated separately for each area by multiplying the number of years of lessons the student received in the area by their self-reported intensity of focus. Intensity was reported on a scale of 1-5, where 1 indicated that the student tried the activity but didn’t pursue it with intensity, 3 indicated that the student pursued the activity with some intensity but it was not the primary focus, and 5 indicated that the student pursued the activity with great intensity as the main focus of his or her work.
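The training-experience measure described above is a simple product of duration and intensity. A minimal sketch follows; the function name and the example student are hypothetical, and the report does not say whether fractional years were allowed (they are accepted here).

```python
def training_score(years_of_lessons, intensity):
    """Training experience = years of lessons x self-reported intensity (1-5)."""
    if intensity not in (1, 2, 3, 4, 5):
        raise ValueError("intensity must be an integer from 1 to 5")
    if years_of_lessons < 0:
        raise ValueError("years of lessons cannot be negative")
    return years_of_lessons * intensity

# Hypothetical student: (years of lessons, intensity) for each area.
student = {
    "music": (6, 5),       # main focus
    "dance": (2, 3),       # some intensity, not the primary focus
    "visual arts": (1, 1), # tried but not pursued
}
scores = {area: training_score(y, i) for area, (y, i) in student.items()}
print(scores)  # {'music': 30, 'dance': 6, 'visual arts': 1}
```

One consequence of this scoring is that different histories can yield the same score: ten years at intensity 3 and six years at intensity 5 both give 30.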
Analyses
Preliminary analyses of students’ musical experience revealed high and overlapping values for those in the music and dance programs, and lower, overlapping values for those in the other programs. Accordingly, group comparisons of performance on each task were analyzed both by discipline and by two groups of disciplines: music and dance (N=39) vs. theater, writing, and visual arts (N=41).
A one-way ANOVA compared the 5 arts majors by age and revealed a significant difference (omnibus F(4,75)=3.91, p=.006). Post hoc pairwise comparisons corrected by Hochberg’s GT2 showed that music majors were significantly older than dance majors. A similar one-way ANOVA compared the 5 arts majors on verbal IQ and found no significant difference between the arts majors (F(4,75)<1, p>.4, n.s.). Consequently, comparisons of performance by the five different arts majors on the experimental tasks were undertaken by ANCOVAs partialing out age, in an effort to see whether group differences emerged after statistical control for age.
T-tests comparing music and dance majors to visual art, theater, and writing majors found no significant differences between the groups on age (t(78)<1, p>.3 n.s.) or verbal IQ (t(78)<1, p>.7, n.s.). Simple ANOVAs therefore compared these groups’ performance on each spatial and numerical task.
Finally, hierarchical, stepwise regression analyses tested for the effect of training duration and intensity in music, dance, visual arts, theater, writing and sports, after controlling for age, sex, and verbal IQ. Regressions proceeded by entering age, sex, and verbal IQ simultaneously as predictors in the first step of the model, followed by all training amounts in music, dance, visual arts, theater, writing and sports entered simultaneously as predictors in the second step.
Findings
Small, exact number
Overall, MOT scores for these high school students were uniformly high (mean 93.48%, t(79)>9,200, p<.001), yielding a restricted range of variation (80-100%, s.d. 4.8). Because the distribution of scores was negatively skewed (z-skewness at each target number <2), all analyses used the square root transform of the reflected percent correct on this task.
A two-way ANCOVA crossing arts major by number of items tracked and statistically controlling for age revealed a significant linear contrast on items tracked (F(1,75)>120, p<.001), no significant group difference (F(4,74)<1, p>.7), and no interaction between arts major and number of items tracked (using Greenhouse-Geisser correction for nonsphericity: F(12.24,229.54)=1.18, p>.2, n.s.; Fig. 4a). There was no significant effect of age (F(1,74)=1.75, p>.1). No pairwise post hoc comparisons survived correction for multiple tests.
A second two-way ANOVA crossing arts major group (music and dance majors vs. visual arts, theater, and writing majors) by number of items tracked revealed no difference between the groups of arts majors (F(1,78)=1.38, p>.2, n.s.), the significant linear contrast (F(1,78)>172, p<.001), and no interaction (using Greenhouse-Geisser correction: F(3.09,240.96)=2.42, p=.065; Fig. 4a). No pairwise post hoc comparisons survived Bonferroni correction. A subsequent one-way ANOVA comparing arts majors’ performance on the hardest subset of trials (those with 5-6 targets) also revealed no effect of arts major, either in the comparison of the five groups (F(4,75)<1, p>.6 n.s.) or in the comparison of music and dance majors to the majors in theater, writing, and visual arts (F(1,78)=1.75, p>.1 n.s.).
Given that performance on the overall task was uniformly high with a restricted range of variation, in order to maximize the sensitivity of the measure, regression analyses focused on performance on the most difficult trials: those with 5 and 6 targets (mean 85.6%, t(79)>4,300, p<.001; range 60-100%, s.d. 10.2). Because the distribution of scores on this subset of trials is also skewed negatively (z-skewness=2.17), these analyses again used the square root transform of the reflected variable. This regression analysis revealed that dance training, visual art training, sex, and verbal IQ were all significant predictors of performance on MOT, such that children with greater dance or visual arts training, boys, and those with higher verbal IQ scores performed better on this task (standardized betas of: dance training=.314, visual art training=.301, sex=.421, and verbal IQ score=.343, all p’s≤.01; R²=.165, ΔR²=.141, 1st step of model p=.003, 2nd step p=.038). Music training was not a significant predictor of performance.


Figure 4a. Multiple object tracking by high school students in differing arts disciplines. 
Large number comparison
Performance on the test of numerical comparison was well above chance and below ceiling, and it showed considerable variability across participants (mean 74.0%; t(79)>6,000, p<.001; range 56-90%; s.d. 7.2). Effects of arts training first were analyzed by a 5 (arts major) by 6 (Ratio) ANCOVA with age as the covariate. Although the effect of ratio showed a highly significant linear contrast, F(1,75)>161, p<.001, there was no effect of arts major (F(4,74)<1, p>.7 n.s.; Fig. 4b) and no effect of age (p>.7 n.s.). Follow-up analyses of performance at the hardest two ratios also found no significant effects of arts major. The ANOVA comparing music and dance majors to students in the other arts majors revealed the significant linear contrast on ratio (F(1,78)>240, p<.001), no effect of arts major (F(1,78)<1, p>.5 n.s.; Fig. 4b), and no interaction effect (F(4.15,323.63)=1.87, p>.1, after Greenhouse-Geisser correction for nonsphericity).
The regression analysis revealed that none of the training amounts or demographic variables was a significant predictor of performance on the numerical discrimination task (R²=.012, ΔR²=.105, 1st step of model p>.8, 2nd step of model p>.4, n.s.).


Figure 4b. Numerical comparison by high school students in differing arts disciplines. 
Geometrical invariants
Overall, performance on this task was well above chance and below ceiling (mean 83.43%; t(79)>59, p<.001; range 51-100%; s.d. 10.1). Because the distribution of scores was negatively skewed (z-skewness=4.65) and kurtotic (z-kurtosis=2.68), all analyses used the square root transform of the reflected percent correct on this task. The 5 (arts field) by 7 (geometric property) ANCOVA controlling for age revealed a significant effect of arts major (F(4,74)=9.39, p<.001; Fig. 4c) and geometric property (F(4.22,316.28)=27.27, p<.001, after Greenhouse-Geisser correction for nonsphericity), and no interaction (F(16.78, 316.28)=1.15, p=.09). The effect of age was not significant (p>.3 n.s.). After Bonferroni correction for multiple comparisons, post hoc pairwise comparisons of arts majors revealed that music majors performed significantly better than theater majors (p=.002), dance majors outperformed both theater majors (p<.001) and writing majors (p=.03), and visual arts majors outperformed both theater majors (p=.001) and writing majors (p=.026).
A second planned two-way ANOVA crossing arts major grouping (music and dance majors vs. visual arts, theater, and writing majors) with geometric property found a significant effect of arts major grouping (F(1,78)>14, p<.001), such that music and dance majors performed better than the other group (Fig. 4c). It also showed the significant effect of geometric property (F(4.28,333.82)>48, p<.001, after Greenhouse-Geisser correction for nonsphericity) and no interaction (F(4.28, 333.82)=2.15, p=.07 n.s.).
Another set of planned ANOVAs compared the performance of music and dance majors to all other majors on each subset of the task. The ANOVA comparing music and dance majors to all other majors on the topology subset revealed no significant group difference (F(1,78)=1.50, p>.4 n.s.). However, the music and dance major group outperformed the other majors on the Euclidean geometry, geometric figures, symmetric figures, chiral figures, and metric properties subsets (F(1,78)=7.57, p=.007; F(1,78)=11.12, p=.001; F(1,78)=4.73, p=.033; F(1,78)=8.88, p=.004; F(1,78)=6.99, p=.010). There was no significant group difference on the geometric transformations subset (F(1,78)<1, p>.5 n.s.).
A follow-up set of ANOVAs tested the threefold comparison of visual arts majors to the group of music and dance majors and the group of theater and writing majors on each of the task subsets. One-way ANOVAs tested for group differences on each of the demographic variables for this population, age and verbal IQ, and found no significant differences across these groups (F(2,77)=2.31, p>.1 n.s.; F(2,77)=1.68, p>.1 n.s.). Further ANOVAs revealed no effect of arts major grouping on the topology subset (F(2,77)=2.49, p>.08 n.s.) but significant effects on all the other measures. In particular, there was an effect of arts major grouping on the Euclidean geometry subset (F(2,77)=5.04, p=.009), and the geometric figures subset (F(2,77)=7.65, p=.001).
In both cases, post hoc pairwise comparisons corrected by the Hochberg’s GT2 procedure revealed that music and dance majors significantly outperformed the theater and writing majors. There were also effects of arts major grouping on the chiral figures subset (F(2,77)=8.53, p<.001), the symmetric figure subset (F(2,77)=5.83, p=.004), and the metric properties subset (F(2,77)=7.13, p=.001). In all these cases, post hoc pairwise GT2-corrected comparisons showed that both music and dance majors and visual arts majors outperformed the theater and writing majors. Finally, there was an effect of arts major grouping on the geometric transformations subset (F(2,77)=4.18, p=.019), and post hoc pairwise GT2-corrected comparisons showed the visual artists outperformed theater and writing majors.
The regression analysis on overall task performance showed that amount of visual art training predicted performance on the geometry task (standardized beta=.243, p=.047), such that students with more visual art training perform better on this task. Task performance was also predicted by amount of theater training, but in the opposite direction, such that children with more theater training perform worse on this task (standardized beta of theater training=.325, p=.004; R²=.021, ΔR²=.205, 1st step of model p>.6, 2nd step p=.01).


Figure 4c. Sensitivity to geometric invariants by high school students in differing arts disciplines. 
Estimation
Performance on this task showed a significant linear contrast between number of dots presented visually and the verbal estimates given (F(1,79)>813, p<.001), indicating that subjects’ estimates increased linearly and monotonically as the number of dots presented increased. As in the previous two experiments, the metric for performance on this task was amount of error as measured by the average distance of verbal estimates from the ideal line y=x. Because this distribution was positively skewed (z-skewness=3.41), correlational and regression analyses used the square root transform of the variable.
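The error metric described above can be sketched as follows. This is one possible reading, labeled as an assumption: "distance from the ideal line y=x" is taken here to be the absolute difference between each verbal estimate and the number of dots actually shown, averaged over trials. The report does not say whether vertical or perpendicular distance was used, and the trial data below are invented.

```python
import math

def estimation_error(targets, estimates):
    """Mean absolute deviation of estimates from the ideal line y = x.

    Assumption: distance is measured vertically, |estimate - target|.
    """
    if len(targets) != len(estimates) or not targets:
        raise ValueError("need equal-length, non-empty lists")
    return sum(abs(e - t) for t, e in zip(targets, estimates)) / len(targets)

def sqrt_transform(error):
    """Square root transform applied to a positively skewed error score."""
    return math.sqrt(error)

# Hypothetical participant: dots shown vs. verbal estimates given.
targets = [10, 20, 40, 80]
estimates = [12, 18, 35, 95]
error = estimation_error(targets, estimates)
print(error, round(sqrt_transform(error), 3))
```

Because the error distribution is positively skewed, no reflection is needed before the square root, unlike the percent-correct scores on the other tasks.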
A one-way ANCOVA, comparing the 5 different arts majors and controlling for age, revealed a significant difference in performance between majors (omnibus F(4,74)=3.89, p=.006). There was no effect of age (p>.7 n.s.). A second planned one-way ANOVA compared the performance of the music and dance majors to all other majors. It revealed that music and dance majors significantly outperformed the other majors (F(1,78)=4.75, p=.032; Fig. 4d).
Hierarchical regression analysis revealed that none of the training amounts or demographic variables was a significant predictor of performance on this task (R²=.022, ΔR²=.051, 1st step of model p>.6, 2nd step of model p>.6, n.s.).


Figure 4d. Estimation by high school students in differing arts disciplines. 
Number line
Performance on the number line task showed reasonable accuracy, assessed by one-way repeated measures ANOVA on spatial estimates at each target number, with a significant linear contrast on the estimates given for each target number at the hundreds scale (F(1,79)>7,000, p<.001), the thousands scale (F(1,79)>3,800, p<.001), and the millions scale (F(1,79)>3,400, p<.001). Because the error distributions for the three scales were skewed and kurtotic (average z-skewness=10.63; average z-kurtosis=25.87), we performed a logarithmic transformation of the scores before performing our analyses.
An ANCOVA crossing arts major and scale (3 levels) showed a significant linear contrast on scale (F(1,75)>11,000, p<.001), no effect of arts major (F(4,74)=1.31, p>.2 n.s.), and no interaction effect (F(8,150)=1.18, p>.3 n.s.). There was no effect of age (F(1,74)<1, p>.7 n.s.). A second planned ANOVA crossing arts major grouping (music and dance majors vs. all other majors) and scale showed that music and dance majors outperformed the other majors (F(1,78)=4.23, p=.043; Fig. 4e), as well as demonstrating the linear contrast on scale (F(1,78)>18,000, p<.001) and a lack of interaction effect (F(2,156)=2.44, p=.09 n.s.).
Regression analysis with age, sex and verbal IQ entered as the first step, followed by all the training variables in the second step, revealed that only the amount of theater training predicted performance on this task, such that children with more theater training perform less well on this task (standardized beta of theater training=.345, p=.002; R²=.073, ΔR²=.156, 1st step of model p>.1, 2nd step of model p=.04).


Figure 4e. Number Line performance by students in differing arts disciplines. 
Map
Overall performance on the map task was well above chance (mean 87.8%, t(79)>2,800, p<.001) with considerable variability (range 64-100%, s.d. 10.4). Because this distribution was negatively skewed (z-skewness=2.22), all analyses used the square root transform of the reflected percent correct on this task.
Performance first was analyzed by a 5 (arts major) by 2 (trial type: landmark vs. no landmark) ANCOVA. This revealed no effect of arts major (F(4,74)=1.64, p>.1 n.s.), no effect of trial type (F(1,75)<1, p>.7 n.s.), and no interaction (F(4,75)<1, p>.8 n.s.). There was no effect of age (F(1,74)=1.01, p>.3 n.s.). No post hoc pairwise comparisons were significant after correction for multiple comparisons. The second planned ANOVA crossing arts major group (music and dance majors vs. all others) and trial type found a significant difference in arts major groups such that music and dance majors outperformed the other majors (F(1,78)=4.77, p=.032), with no significant effect of trial type (F(1,78)<1, p>.4 n.s.) and no interaction (F(1,78)=1.22, p>.2 n.s.; Fig. 4f). Visual inspection of the graph reveals that visual artists and writers performed least well and that theater majors were intermediate.
A one-way ANCOVA comparing the arts majors on trials in arrays with landmarks, and controlling for age, revealed no significant difference in the arts majors’ performance (F(4,74)=1.87, p>.1 n.s.) and no significant effect of age (F(1,74)=3.05, p>.08, n.s.).
A one-way ANOVA comparing music and dance majors to all other majors on trials in arrays with landmarks revealed that music and dance majors outperformed all of the other majors (F(1,78)=5.41, p=.023). A one-way ANCOVA comparing the arts majors on trials in arrays without landmarks revealed no significant effects of either arts major (F(4,74)<1, p>.6 n.s.) or age (F(1,74)<1, p>.9 n.s.). A one-way ANOVA comparing music and dance majors to all other majors on trials in arrays without landmarks revealed no significant difference between the two groups (F(1,78)=1.43, p>.2 n.s.).
The hierarchical regression analyses of overall task performance showed that none of the training amounts or demographic variables was a significant predictor of performance on the map task (R²=.03, ΔR²=.085, 1st step of model p>.5, 2nd step of model p>.3). Similar negative findings come from regression analyses of performance on trials with a landmark (R²=.079, ΔR²=.159, 1st step of model p=.098, 2nd step p>.3) and trials without a landmark (R²=.027, ΔR²=.074, 1st step of model p>.5, 2nd step of model p>.4).


Figure 4f. Geometric map performance by high school students in differing arts disciplines. 
Summary
This experiment confirms, strengthens, and clarifies the suggested findings of Experiment 2. Like Experiments 1 and 2, it provides no evidence for an association between training in music and representations of small exact numbers, large approximate numbers, or number words and verbal counting. Like Experiment 2, it provides evidence that music-trained students outperform students with no music training on tasks that involve geometric representations and reasoning. In the present study, this association was found on all three tasks involving geometrical reasoning. Music-trained students outperformed untrained students on a test of sensitivity to geometric invariants in visual forms, a test of mappings between space and number, and a test of understanding of geometric maps. The association also was found on a task that is believed to involve spatial representations indirectly: numerical estimation (see Dehaene, 1997).
The present findings also clarify the relation between music training and geometry in two respects. First, they provide the first evidence that this relationship is specific to music training and not a more general effect of training in any form of the arts. Music-trained students excelled at spatial reasoning not only when compared to students lacking any special training in the arts (Experiment 2), but when compared to students who had pursued arts training at equal intensity but in other disciplines not involving music (especially theater and writing). Second, they provide evidence for an interesting role for training in the visual arts. Students with intense training in the visual arts excelled on the test of sensitivity to geometry in visual forms, but they performed less well on the other tests of sensitivity to geometry.
These findings suggest that geometrical cognition, like numerical cognition, has multiple sources, and that training in the arts can have diverse enhancing effects. This unexpected finding deserves to be replicated and extended.
Conclusions
Our experiments suggest that the well-documented association between music training and mathematical ability depends, in part, on a more specific relationship between music and the core system for representing abstract geometry. Music-trained students performed better than students with little or no music training on three tests of sensitivity to geometry: one focused on the geometric properties of visual forms, a second on the relation of Euclidean distance to numerical magnitude, and the third on the geometric relationship between forms on a map and objects in the larger spatial layout.
In our experiments, the relationship between musical training and geometric representation is specific in three respects. First, it is not a byproduct of individual differences in intelligence, academic achievement, or the social and economic factors that underpin these qualities, because all of the present analyses control for those important factors.
Second, it is not a byproduct of a more general relationship between music and all mathematical abilities, because no associations were found between music training and any of three tests of numerical reasoning. Third, it is not a byproduct of a more general relationship between geometric representation and training in the arts, because students with extensive music training (either through study of music or study of dance) showed greater geometrical abilities than students with equally intense training in theater, writing, or (for two of the three measures) in the visual arts.
Our findings accord with those of recent studies linking impairments in music cognition to impairments in spatial cognition. Patients with amusia, a specific insensitivity to tonal relationships, were found to perform reliably worse than either music-trained or untrained adults on a test of mental rotation similar to the chirality subtest of our test of sensitivity to geometric invariants (Douglas & Bilkey, 2007).
Moreover, patients with amusia showed less interference between spatial and tonal processing than the two comparison groups, producing a performance advantage on a tone-discrimination task with spatial interference (Douglas & Bilkey, 2007). These findings suggest that sequences of tones spontaneously activate representations of space in normal humans. Experiments in my lab are now beginning to probe the nature and development of this relationship in human infants and young children.
Finally, our experiments provide evidence for an association between music and geometry only when training in music is intensive and prolonged. No clear association was found in Experiment 1, which focused on a population of children whose training in music varied from light to moderate. Associations emerged in Experiment 2, which focused on children with more intense music training, but they were not uniformly strong.
Clear and strong relationships were obtained only in Experiment 3, which focused on older children whose primary interest and academic work centered on their music training. These findings provide no evidence that short-term, low-intensity training in music enhances abilities at the foundations of mathematics. It is possible, however, that this negative conclusion says more about the limits of our methods than about the true strength and generality of the association between music and mathematical achievement. Studies with larger samples may reveal more subtle enhancements in spatial abilities after music training of lower intensity.
More generally, our findings underscore both the importance and the feasibility of breaking down children’s complex learning capacities into component systems at the foundations of human knowledge. With the present methods, developmental cognitive neuroscientists and educators can ask not only “is arts instruction good for children?” but also “in what ways does arts instruction enhance children’s academic ability: what brain/cognitive systems are enhanced by training in the arts?” These methods are simple, engaging to children across a wide age range, and revealing of the functioning of educationally relevant, core cognitive systems in both children and adults.
Because of its correlational approach, the present research does not reveal whether music training causes improvements in children’s fundamental mathematical abilities. It does, however, provide tools that future experiments could use to answer that question.
References
Baenninger, M. & Newcombe, N. (1989). The role of experience in spatial test performance. Sex Roles, 20, 327–344.
Barth, H., La Mont, K., Lipton, J., & Spelke, E.S. (2005). Abstract number and arithmetic in young children. Proceedings of the National Academy of Sciences, 102(39), 14117–14121.
Cappelletti, M., Barth, H., Fregni, F., Spelke, E., & Pascual-Leone, A. (2007). rTMS over the intraparietal sulcus disrupts numerosity processing. Experimental Brain Research.
Carey, S., & Xu, F. (2001). Infant knowledge of objects: Beyond object files and object tracking. Cognition, 80(1/2), 179–213.
Dehaene, S., Izard, V., Pica, P., & Spelke, E. (2006). Core knowledge of geometry in an Amazonian indigene group. Science, 311, 381–384.
de Hevia, M.D., & Spelke, E.S. (in review). Spontaneous mapping of number and space in adults and young children. Psychological Science.
Douglas, K.M. & Bilkey, D.K. (2007). Amusia is associated with deficits in spatial processing. Nature Neuroscience, 10, 915–921.
Everett, D.L. (2005). Cultural constraints on grammar and cognition in Pirahã. Current Anthropology, 46, 621–646.
Feigenson, L. & Carey, S. (2003). Tracking individuals via object files: Evidence from infants’ manual search. Developmental Science, 6, 568–584.
Feigenson, L., Dehaene, S., & Spelke, E.S. (2004). Core systems of number. Trends in Cognitive Sciences, 8, 307–314.
Gallistel, C.R., & Gelman, R. (1992). Preverbal and verbal counting and computation. Cognition, 44, 43–74.
Gelman, R. (1991). Epigenetic foundations of knowledge structures: Initial and transcendent conditions. In S. Carey & R. Gelman (Eds.), Epigenesis of Mind. Hillsdale: Erlbaum.
Gilmore, C.K., McCarthy, S.E., & Spelke, E.S. (2007). Symbolic arithmetic knowledge without instruction. Nature, 447, 589–592.
Gilmore, C.K. & Spelke, E. (under review). Children’s understanding of the relationship between addition and subtraction.
Green, C.S. & Bavelier, D. (2003). Action video game modifies visual selective attention. Nature, 425, 534–537.
Griffin, S. & Case, R. (1996). Evaluating the breadth and depth of training effects when central conceptual structures are taught. Society for Research in Child Development Monographs, 59, 90–113.
Kim, I.K., & Spelke, E.S. (1999). Perception and understanding of effects of gravity and inertia on object motion. Developmental Science, 2(3), 339–362.
Lipton, J.S., & Spelke, E.S. (2006). Preschool children master the logic of number word meanings. Cognition, 98(3), B57–B66.
Lemer, C., Dehaene, S., Spelke, E., & Cohen, L. (2003). Approximate quantities and exact number words: Dissociable systems. Neuropsychologia, 41, 1942–1958.
McCarthy, S., Gilmore, C., & Spelke, E. (in prep.). Nonsymbolic arithmetic and school performance in kindergarten children.
McCloskey, M. (1983). Intuitive Physics. Scientific American, 248, 122–130.
Huntley-Fenner, G., Carey, S., & Solimando, A. (2002). Objects are individuals but stuff doesn’t count: Perceived rigidity and cohesiveness influence infants’ representations of small groups of distinct entities. Cognition, 85(3), 203–221.
Newcombe, N.S., & Uttal, D.H. (2006). Whorf versus Socrates, round 10. Trends in Cognitive Sciences, 10, 394–396.
Piazza, M., Pinel, P. & Dehaene, S. (2006). A magnitude code common to numerosities and number symbols in human intraparietal cortex. Neuron.
Pica, P., Lemer, C., Izard, V., & Dehaene, S. (2004). Exact and approximate arithmetic in an Amazonian indigene group. Science, 306, 499–503.
Pylyshyn, Z.W. & Storm, R.W. (1988). Tracking multiple independent targets: Evidence for a parallel tracking mechanism. Spatial Vision, 3, 179–197.
Schellenberg, E.G. (2004). Music lessons enhance IQ. Psychological Science, 15, 511–514.
Schellenberg, E.G. (2005). Music and cognitive abilities. Current Directions in Psychological Science, 14, 317–320.
Scholl, B.J. (2001). Objects and attention: The state of the art. Cognition, 80(1/2), 1–46.
Shepard, R.N. & Metzler, J. (1971). Mental rotation of three-dimensional objects. Science, 171, 701–703.
Siegler, R.S., & Opfer, J.E. (2003). The development of numerical estimation: Evidence for multiple representations of numerical quantity. Psychological Science, 14, 237–243.
Spelke, E.S. (2000). Core knowledge. American Psychologist, 55, 1233–1243.
Spelke, E.S. (2003). Core knowledge. In N. Kanwisher & J. Duncan (Eds.) Attention and Performance, vol. 20: Functional neuroimaging of visual cognition. Oxford University Press.
Temple, E. & Posner, M.I. (1998). Brain mechanisms of quantity are similar in 5-year-olds and adults. Proceedings of the National Academy of Sciences, 95, 7836–7841.
Wang, R.F. & Spelke, E.S. (2002). Human spatial representation: Insights from animals. Trends in Cognitive Sciences, 6(9), 376–382.
Winkler-Rhoades, N., Carey, S.E., & Spelke, E.S. (2007). Of Pictures and Words: Language as a Mechanism Underlying the Emergence of Depictive-Symbolic Understanding. Poster presented at the Biennial Meeting of the Society for Research on Child Development, Boston, MA.