Use of R as a toolbox for mathematical statistics exploration

Nicholas J. Horton, Department of Mathematics, Smith College, Northampton, MA

Elizabeth R. Brown, Department of Biostatistics, University of Washington, Seattle, WA

Linjuan Qian, Department of Mathematics, Smith College, Northampton, MA


I. Introduction

The R language, a freely available environment for statistical computing and graphics, is widely used in many fields. This "expert-friendly" system has a powerful command language and programming environment, combined with an active user community. We discuss how R is ideal as a platform to support experimentation in mathematical statistics, at both the undergraduate and graduate levels. Using a series of case studies and activities, we describe how R can be used in a mathematical statistics course as a toolbox for experimentation. Examples include the calculation of a running variance, maximization of a non-linear function, resampling of a statistic, simple Bayesian modeling, sampling from a multivariate normal distribution, and estimation of power. These activities, often requiring only a few dozen lines of code, give the student the opportunity to explore statistical concepts and experiment. In addition, they provide an introduction to the framework and idioms available in this rich environment.

Keywords: mathematical statistics education, statistical computing


II. Downloads

PDF version of paper (published in the November 2004 issue of The American Statistician)

The R Project for Statistical Computing

Lavine's Introduction to Statistical Thought

 

III. Sample Activities

3.1 Calculation of a running average

3.2 Simulating the sample distribution of the mean

3.3 Sampling from multivariate normal distribution

3.4 Power and sample size calculation (analytic)

3.5 Power calculation (empirical)

3.6 Bootstrapping of a sample statistic

3.7 Iteration to maximize a likelihood

3.8.1 ROC curves

3.8.2 ROC curves (bootstrap estimate of SE for AUC)

3.9 EM algorithm

3.10 Bayesian inference
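
To give a flavor of how compact such activities can be, here is a minimal sketch in the spirit of activities 3.1 and 3.2. It is not the code distributed with the paper: the variable names, the seed, the sample sizes, and the choice of an exponential population are all assumptions made for illustration.

```r
# Hypothetical sketch; NOT the authors' code for activities 3.1 and 3.2.
set.seed(42)  # arbitrary seed (assumption), for reproducibility

# In the spirit of 3.1: running average of a sequence of normal draws.
x <- rnorm(1000, mean = 0, sd = 1)
running_mean <- cumsum(x) / seq_along(x)  # mean of the first k values, k = 1..1000

# In the spirit of 3.2: simulate the sampling distribution of the mean.
n      <- 25    # size of each sample (assumed)
n_sims <- 5000  # number of simulated samples (assumed)
sample_means <- replicate(n_sims, mean(rexp(n, rate = 1)))

# Exponential(1) has mean 1 and sd 1, so theory gives SE = 1 / sqrt(n).
sim_se    <- sd(sample_means)
theory_se <- 1 / sqrt(n)
```

Calling plot(running_mean, type = "l") and hist(sample_means) would display the running average settling toward 0 and the roughly bell-shaped distribution of the simulated means, and sim_se can be compared directly with theory_se.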

 

IV. Acknowledgements

We are grateful to Ken Kleinman and Paul Kienzle for comments on an earlier draft of the manuscript, and for the support provided by NIMH grant R01-MH54693 and a Career Development Fund Award from the Department of Biostatistics at the University of Washington.


Address for correspondence: Nicholas Horton, Department of Mathematics and Statistics, Smith College, Northampton, MA 01063.
Phone: 413-585-3688, fax: 413-585-3786.

Last updated May 24, 2006