Syntax

Main Clauses

ggsql augments the standard SQL syntax with a number of new clauses to describe a visualisation:

  • VISUALISE initiates the visualisation part of the query
  • DRAW adds a new layer to the visualisation
  • SCALE specifies how an aesthetic should be scaled
  • FACET describes how data should be split into small multiples
  • PROJECT is used for selecting the coordinate system to use
  • LABEL is used to manually add titles to the plot or the various axes and legends

Layers

There are many different layers to choose from when visualising your data. Some, such as the point layer, are straightforward translations of your data into visual marks, while others, such as the histogram layer, perform more or less complicated calculations first. A layer is selected by providing the layer name after the DRAW clause:

  • point is used to create a scatterplot layer.
  • line is used to produce lineplots with the data sorted along the x axis.
  • path is like line above but does not sort the data; it plots the data in its own order.
  • segment connects two points with a line segment.
  • linear draws a long line parameterised by a coefficient and intercept.
  • rule draws horizontal and vertical reference lines.
  • area is used to display series as an area chart.
  • ribbon is used to display series extrema.
  • polygon is used to display arbitrary shapes as polygons.
  • bar creates a bar chart, optionally calculating y from the number of records in each bar.
  • density creates univariate kernel density estimates, showing the distribution of a variable.
  • violin displays a rotated kernel density estimate.
  • histogram bins the data along the x axis and produces a bar for each bin showing the number of records in it.
  • boxplot displays continuous variables as 5-number summaries.
  • errorbar draws a line segment with hinges at the endpoints.
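
To make the difference between "direct" and "calculating" layers concrete, here is a minimal Python sketch. It is not ggsql code and not ggsql's implementation: the function names (line_order, path_order, histogram_counts) and the bin_width and origin parameters are hypothetical, chosen only to illustrate how line differs from path, and what kind of calculation a histogram layer performs before drawing.

```python
import math
from collections import Counter


def line_order(points):
    """Drawing order for a line-style layer: records sorted along x."""
    return sorted(points, key=lambda p: p[0])


def path_order(points):
    """Drawing order for a path-style layer: records in their own order."""
    return list(points)


def histogram_counts(values, bin_width, origin=0.0):
    """Count records per bin along x.

    Bins are assumed to be left-closed intervals
    [origin + k * bin_width, origin + (k + 1) * bin_width).
    Returns a dict mapping each non-empty bin's left edge to its count.
    """
    counts = Counter(math.floor((v - origin) / bin_width) for v in values)
    return {origin + k * bin_width: n for k, n in sorted(counts.items())}


points = [(3, 1), (1, 2), (2, 0)]
print(line_order(points))   # sorted by x
print(path_order(points))   # unchanged
print(histogram_counts([0.5, 1.2, 1.9, 3.0], bin_width=1.0))
```

The point and path cases pass records through (at most reordering them), while the histogram case replaces the records with one summary row per bin, which is what the bars are then drawn from.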

Scales

A scale is responsible for translating a data value to an aesthetic literal, e.g. a specific color for the fill aesthetic, or a radius in points for the size aesthetic. A scale is a combination of a specific aesthetic and a scale type.

Aesthetics

  • Position aesthetics are those aesthetics related to the spatial location of the data in the coordinate system.
  • Color aesthetics are related to the color of fill and stroke
  • opacity is the aesthetic that determines the opacity of the color
  • linetype governs the pattern of strokes
  • linewidth determines the width of strokes
  • shape determines the shape of points
  • size governs the radius of points
  • Faceting aesthetics are used to determine which facet panel the data belongs to

Scale types

  • continuous scales translate a continuous input to a continuous output
  • discrete scales translate a discrete input to a discrete output
  • binned scales translate a continuous input to an ordered discrete output by binning the data
  • ordinal scales translate a discrete input to an ordered discrete output by enforcing an ordering on the input
  • identity scales pass the data through unchanged
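
The five scale types can be thought of as five kinds of function from data values to aesthetic values. The Python sketch below illustrates that idea only; it is not how ggsql implements scales, and every name in it (continuous_scale, breaks, ordered_levels, and so on) is hypothetical. It also assumes linear interpolation for the continuous case and left-closed bins for the binned case.

```python
from bisect import bisect_right


def continuous_scale(value, domain, output_range):
    """Continuous in, continuous out: linear interpolation (an assumption)."""
    (d0, d1), (r0, r1) = domain, output_range
    t = (value - d0) / (d1 - d0)
    return r0 + t * (r1 - r0)


def discrete_scale(value, mapping):
    """Discrete in, discrete out: a plain lookup table."""
    return mapping[value]


def binned_scale(value, breaks, outputs):
    """Continuous in, ordered discrete out.

    `breaks` are sorted inner bin boundaries, so `outputs` must have
    len(breaks) + 1 entries, one per bin.
    """
    return outputs[bisect_right(breaks, value)]


def ordinal_scale(value, ordered_levels, outputs):
    """Discrete in, ordered discrete out: the output follows the input's rank."""
    return outputs[ordered_levels.index(value)]


def identity_scale(value):
    """The data value is already an aesthetic value; pass it through."""
    return value


print(continuous_scale(5, domain=(0, 10), output_range=(1.0, 3.0)))
print(discrete_scale("a", {"a": "red", "b": "blue"}))
print(binned_scale(7, breaks=[5, 10], outputs=["low", "mid", "high"]))
print(ordinal_scale("medium", ["low", "medium", "high"], [1, 2, 3]))
print(identity_scale("#ff0000"))
```

Note how the binned and ordinal cases both produce an ordered discrete output but start from different kinds of input: binned imposes bins on continuous data, while ordinal imposes an ordering on categories.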

Coordinate systems

The coordinate system defines how the abstract positional aesthetics are projected onto the screen or paper where the final plot appears. As such, it has great influence over the final look of the plot.

  • cartesian is the classic coordinate system consisting of two perpendicular axes, one being horizontal and one being vertical
  • polar interprets the primary position as the angular location relative to the center and the secondary position as the distance (radius) from the center, creating a circular coordinate system
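
As a rough illustration of what a polar projection does with the two positional aesthetics, the following Python sketch converts a (primary, secondary) pair into screen coordinates. It is a sketch under stated assumptions, not ggsql's implementation: the function name polar_to_screen and its parameters are hypothetical, the angle is assumed to start at 12 o'clock and run clockwise, and the lower end of the secondary range is assumed to map to radius zero.

```python
import math


def polar_to_screen(primary, secondary, primary_range, secondary_range,
                    max_radius=1.0):
    """Project (primary, secondary) to (x, y) around a centre at (0, 0)."""
    p0, p1 = primary_range
    s0, s1 = secondary_range
    # Primary position becomes an angle covering one full turn.
    theta = 2 * math.pi * (primary - p0) / (p1 - p0)
    # Secondary position becomes the distance from the centre.
    r = max_radius * (secondary - s0) / (s1 - s0)
    # Assumed convention: angle measured clockwise from 12 o'clock.
    return (r * math.sin(theta), r * math.cos(theta))


# A quarter of the way through the primary range, at full radius,
# lands (up to floating-point error) at 3 o'clock on the unit circle.
print(polar_to_screen(1, 10, primary_range=(0, 4), secondary_range=(0, 10)))
```

Under a cartesian system the same pair would simply be placed along the horizontal and vertical axes, which is why the choice of coordinate system changes the look of a plot so much.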