Data Analysis and Graphics Using R:
An Example-Based Approach, 2nd ed.
J. Maindonald and J. Braun
Cambridge University Press, 2007, 502 pages + 10 color plates

Statistics packages, even those that use a graphical user
interface, are notoriously clumsy and unintuitive to use, and are often
poorly documented. This book uses examples to teach R, a free but powerful
command-line based statistics package. A familiarity with basic
statistics is assumed.
This book doesn't waste time telling you how to install or configure R. It goes right to its topic, teaching R by example. The teach-by-example method can be very effective; an excellent example is Statistical Analysis: A Decision-Making Approach by Robert Parsons. Making this sort of book useful requires discipline and organization. Data Analysis and Graphics Using R is fairly well organized, and it has an extensive index of R functions and statistical topics.
However, there is one big problem: Where are all the examples? It turns
out that by 'examples', the authors don't mean 'examples of R code,' but
sample statistical problems, as distinct from a theory-based approach.
There is surprisingly little R code in this book. The commands for linear
modeling (R's lm() function, which covers linear regression), for instance,
are scattered across several chapters, making it hard for the reader to
piece together the correct syntax. This could have
been avoided by including parts (or all) of the authors' R scripts (such
as lm-tests.R). This well-written file makes
it immediately obvious how to run a linear model. These are the "examples"
that should have been included in the text. I eventually discovered that
it was much easier to learn R by reading the help pages within R than by
guessing the correct syntax from the text.
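For readers wondering what such an example might look like, here is a minimal sketch of fitting a linear model in R. It is not taken from the book or from the authors' lm-tests.R script. It assumes only base R and the built-in cars data set (stopping distance versus speed), chosen so that it runs without installing any extra package; the variable names fit, new_speeds, and predicted are made up for illustration.

```r
# Minimal linear-model sketch using the built-in `cars` data set
# (stopping distance `dist` versus `speed`). Illustrative only;
# this is not code from the book.
fit <- lm(dist ~ speed, data = cars)

# Coefficient table, residual summary, and R-squared.
print(summary(fit))

# Intercept and slope as a named numeric vector.
print(coef(fit))

# Predictions for new speeds; the column name must match the predictor.
new_speeds <- data.frame(speed = c(10, 15, 20))
predicted <- predict(fit, newdata = new_speeds)
print(predicted)
```

Typing ?lm or ?predict.lm at the R prompt brings up the help pages mentioned above, which document the formula syntax and the remaining arguments.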
A related problem, at least in the early sections, is that some of the examples don't make sense unless you install the authors' DAAG data package. The book also provides relatively little insight into how R processes the data internally.
On the positive side, topics such as time series analysis and tree-based classification, which are missing from many other books, are thoroughly covered. The authors try to teach some statistics along with R and give many warnings about whether a particular model is appropriate. As might be expected, little mathematical background is provided for the statistical methods.
