ggplot differs from base and lattice graphics in how you build up the plot. With ggplot you build up a plot object, rather than drawing directly on the screen as in base graphics or specifying the whole plot at once as in lattice graphics.
We start with ggplot(), which creates a new plot object, and then add (using +) the other components: a single layer, specifying the data, mapping, geom and stat; the two scales; and the Cartesian coordinate system. The layer is the most complicated component and specifies:
data: the data set to use, in this case the diamonds data.
mapping: the mapping between data and aesthetics, here carat to horizontal position and price to vertical position.
geom and stat: the point geom and the identity stat should be used.
Only one of geom and stat needs to be specified, as each geom has a default stat and each stat has a default geom. You can have any number of layers.
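As a rough illustration of the layer components listed above, the sketch below spells out a single layer. It assumes a current ggplot2 release, where layer() also expects a position argument and takes extra settings through a params list; neither is discussed in the text here, and the name point_layer is purely illustrative.

```r
library(ggplot2)

# One fully specified layer: data, aesthetic mapping, geom and stat.
# Assumption: recent ggplot2 releases also require `position`;
# "identity" leaves every point where the data puts it.
point_layer <- layer(
  data = diamonds,
  mapping = aes(x = carat, y = price),
  geom = "point",
  stat = "identity",
  position = "identity",
  params = list(na.rm = FALSE)
)

# Adding the layer to an empty plot object
p <- ggplot() + point_layer
length(p$layers)  # number of layers attached to the plot object
```

Nothing is drawn until the plot object is printed, which is the sense in which the plot is built up as an object first.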
Scales and coordinate systems are added with a consistent naming scheme (scales begin with scale_ and coordinate systems with coord_). You must have one scale per aesthetic attribute used, and only one coordinate system. Adding additional scales or coordinate systems will overwrite existing values. Often you will tune various parameters to get the plot that you want. Parameters are specified in the format name = value. Parameters can also be specified by position, but this requires much greater familiarity with the code.
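The naming scheme and the overwrite behaviour can be sketched as follows. This assumes a current ggplot2 release (where layer() needs a position argument); the object names p, p_log, x_linear and x_logged and the axis titles are made up for the example.

```r
library(ggplot2)

# Base plot: one point layer plus explicitly named scales and coordinates.
# Every argument is given in name = value form.
p <- ggplot(data = diamonds, mapping = aes(x = carat, y = price)) +
  layer(
    geom = "point", stat = "identity", position = "identity",
    params = list(na.rm = FALSE)
  ) +
  scale_x_continuous(name = "Carat") +
  scale_y_continuous(name = "Price") +
  coord_cartesian()

# Adding another x scale replaces the continuous x scale added above
# (recent releases print a message when this happens).
p_log <- p + scale_x_log10(name = "Carat (log scale)")

x_linear <- layer_data(p, 1)$x      # positions on the original scale
x_logged <- layer_data(p_log, 1)$x  # positions after the log10 transform
```

Comparing x_linear with x_logged shows that only the last x scale added is in effect.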
The above definition is perfectly explicit, but extremely verbose. By default, ggplot will automatically add sensible scales and the default coordinate system, so that the following code is equivalent to that above:
ggplot() +
  layer(
    data = diamonds, mapping = aes(x = carat, y = price),
    geom = "point"
  )
The more complex Figure \ref{fig:logged} can be produced with the following code:
ggplot() +
  layer(
    data = diamonds, mapping = aes(x = carat, y = price),
    geom = "point", stat = "identity"
  ) +
  layer(
    data = diamonds, mapping = aes(x = carat, y = price),
    geom = "smooth", stat = "smooth", method = "lm"
  ) +
  scale_y_log10() +
  scale_x_log10()
Here we have used multiple layers. In the second layer, method = "lm" specifies a parameter of the smooth statistic: the method to use, here lm, which fits a linear smooth. Additional arguments to the layer will be passed to the stat or geom as appropriate. If there is any ambiguity, geom and stat arguments can be specified directly as, e.g., stat_params = list(method = "lm").
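For readers on a current ggplot2 release, a minimal sketch of the same parameter passing is given below. It assumes the wrapper functions geom_point() and stat_smooth(), which build layers and forward extra arguments such as method to the stat; the explicit formula argument and the name fit are additions for the example, not part of the text above.

```r
library(ggplot2)

p <- ggplot() +
  geom_point(data = diamonds, mapping = aes(x = carat, y = price)) +
  stat_smooth(
    data = diamonds, mapping = aes(x = carat, y = price),
    geom = "smooth",
    method = "lm",    # forwarded to the smooth stat
    formula = y ~ x   # stated explicitly to avoid the default-formula message
  ) +
  scale_y_log10() + scale_x_log10()

# Fitted values computed by the second layer, on the log10 scale
fit <- layer_data(p, 2)
```

Because the fit is linear in the transformed values, fit$y changes by a constant amount between successive, evenly spaced values of fit$x.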
There is some duplication in this example, which we can reduce by using plot defaults:
ggplot(data = diamonds, mapping = aes(x = carat, y = price)) +
  layer(geom = "point") +
  layer(stat = "smooth", method = "lm") +
  scale_y_log10() +
  scale_x_log10()
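One way to check that plot defaults change nothing but the amount of typing is to compare the data each version computes. The sketch below assumes a current ggplot2 release and uses the geom_point() and geom_smooth() wrappers; the names p_explicit, p_defaults, same_points and same_smooth are illustrative only.

```r
library(ggplot2)

# Data and mapping repeated in every layer
p_explicit <- ggplot() +
  geom_point(data = diamonds, mapping = aes(x = carat, y = price)) +
  geom_smooth(
    data = diamonds, mapping = aes(x = carat, y = price),
    method = "lm", formula = y ~ x
  ) +
  scale_y_log10() + scale_x_log10()

# Data and mapping set once as plot defaults and inherited by both layers
p_defaults <- ggplot(data = diamonds, mapping = aes(x = carat, y = price)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x) +
  scale_y_log10() + scale_x_log10()

# Compare what each layer computes in the two versions
same_points <- isTRUE(all.equal(layer_data(p_explicit, 1), layer_data(p_defaults, 1)))
same_smooth <- isTRUE(all.equal(layer_data(p_explicit, 2), layer_data(p_defaults, 2)))
```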
You can use summary to give a quick description of a plot. If you want to change the background colour, how the panel strips are displayed, or any other default graphical option, see ggopt.
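A short sketch of inspecting a plot object with summary() follows, assuming a current ggplot2 release; the exact wording of the printed description varies between versions, and the name description is made up for the example.

```r
library(ggplot2)

p <- ggplot(data = diamonds, mapping = aes(x = carat, y = price)) +
  geom_point() +
  scale_y_log10() + scale_x_log10()

# Prints a short text description of the plot object (data, mapping, layers, ...)
summary(p)

# The same description captured as a character vector, one element per line
description <- capture.output(summary(p))
```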