library(gglite)
Data can be provided at the chart level, at the mark level, or not at all (for marks that generate their own data from transforms or inline values).
Pass a data frame to g2() and it becomes the default data source for all
marks in the chart.
g2(mtcars, hp ~ mpg) |> mark_point()
Multiple marks can share the same chart-level data frame:
g2(mtcars, hp ~ mpg) |>
  mark_point() |>
  mark_line()
Supply a data frame directly to a mark to override or supplement the chart-level data for that mark only. This is useful for annotation layers or overlays that use a different data source.
# Mark-level data for an annotation line
g2(mtcars, hp ~ mpg) |>
  mark_point() |>
  mark_line_y(
    data = data.frame(y = 150),
    encode = list(y = 'y'),
    style = list(stroke = 'red', lineDash = c(4, 4))
  )
Marks can have entirely independent data:
df1 = data.frame(x = 1:5, y = c(2, 4, 3, 5, 1))
df2 = data.frame(x = 1:5, y = c(1, 3, 5, 2, 4))
g2() |>
  mark_line(data = df1, encode = list(x = 'x', y = 'y')) |>
  mark_point(data = df2, encode = list(x = 'x', y = 'y'))
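The precedence described above, where a mark's own data wins over the chart-level default, can be modelled in a few lines of base R. This is only an illustrative sketch: `resolve_mark_data()` is a hypothetical helper, not a gglite function, and it models the "override" case only, not any merging gglite may do internally.

```r
# Hypothetical helper (not part of gglite): choose the data a mark would use.
resolve_mark_data = function(chart_data = NULL, mark_data = NULL) {
  # Mark-level data, when supplied, takes precedence for that mark only
  if (!is.null(mark_data)) mark_data else chart_data
}

chart_df = data.frame(x = 1:3, y = c(2, 4, 3))
ref_df = data.frame(y = 150)

# No mark-level data: the mark falls back to the chart-level data frame
inherited = resolve_mark_data(chart_df)

# Mark-level data present: it replaces the chart-level data for this mark
overridden = resolve_mark_data(chart_df, ref_df)

# Neither supplied: the mark has to produce its own data
none = resolve_mark_data()
```

Under this model, the last example in the text (two marks with independent data and an empty `g2()` call) is simply the case where `chart_data` is `NULL` and every mark supplies `mark_data`.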
For marks that do not work with data frames — such as reference lines, annotations, or hierarchical charts — pass data as a list of records or as a nested list structure.
# A single reference line at y = 150
g2(mtcars, hp ~ mpg) |>
  mark_point() |>
  mark_line_y(
    data = list(list(y = 150)),
    encode = list(y = 'y'),
    style = list(stroke = 'tomato', lineWidth = 2)
  )
# A shaded region between x = 15 and x = 25
g2(mtcars, hp ~ mpg) |>
  mark_point() |>
  mark_range_x(
    data = list(list(x = c(15, 25))),
    encode = list(x = 'x'),
    style = list(fill = 'steelblue', fillOpacity = 0.15)
  )
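To make the "list of records" shape concrete, the base R sketch below converts a small data frame into one named list per row. `to_records()` is a hypothetical helper written for this illustration; it is not claimed to be how gglite converts data internally.

```r
# Hypothetical helper (not part of gglite): one named list per data frame row.
to_records = function(df) {
  lapply(seq_len(nrow(df)), function(i) as.list(df[i, , drop = FALSE]))
}

df = data.frame(y = c(150, 200))
records = to_records(df)

# Each element is a record with the same field names as the data frame columns
str(records)
```

A record can also hold a field that is awkward to express as an ordinary data frame column, such as the two-element vector in `list(x = c(15, 25))` used for the shaded region, which is one reason to write the records by hand.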
G2 can fetch data directly from a URL. Pass data = list(type = 'fetch', value = '<url>') to any mark to load JSON or CSV data client-side.
g2() |> mark_point(
  data = list(
    type = 'fetch',
    value = 'https://gw.alipayobjects.com/os/antvdemo/assets/data/scatter.json'
  ),
  encode = list(x = 'weight', y = 'height', color = 'gender')
)
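If several marks load remote data, the fetch specification can be built by a small wrapper so the list structure is written only once. The `fetch_data()` function below is a hypothetical convenience helper, not part of gglite; it returns only the `type`/`value` list described above.

```r
# Hypothetical convenience wrapper (not part of gglite).
fetch_data = function(url) {
  # Guard against empty or vector-valued URLs before building the spec
  stopifnot(is.character(url), length(url) == 1, nzchar(url))
  list(type = 'fetch', value = url)
}

src = fetch_data(
  'https://gw.alipayobjects.com/os/antvdemo/assets/data/scatter.json'
)
str(src)
```

The result has the same shape as the inline list in the example above, so it could be passed as `data = src` to a mark.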
gglite automatically trims data frame columns to only those referenced by the chart before serializing to JSON. This keeps the HTML output compact when working with wide data frames that have many unused columns.
The iris dataset has five columns: Sepal.Length, Sepal.Width,
Petal.Length, Petal.Width, and Species. When only two columns are
mapped, only those two columns end up in the generated HTML.
# Only Sepal.Length and Sepal.Width are serialized
g2(iris, Sepal.Length ~ Sepal.Width) |> mark_point()
Additional aesthetic channels count as used columns too:
# Sepal.Length, Sepal.Width, and Species are included; Petal.* are dropped
g2(iris, Sepal.Length ~ Sepal.Width, color = ~ Species) |> mark_point()
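The column selection described above can be approximated in base R with `all.vars()`, which extracts the variable names used in a formula. This is a minimal sketch under the assumption that trimming amounts to "keep only the columns named in the formulas"; `trim_columns()` is hypothetical and gglite's actual logic may differ.

```r
# Hypothetical sketch of column trimming (not gglite's implementation).
trim_columns = function(data, ...) {
  # Collect every variable name mentioned in the supplied formulas
  used = unique(unlist(lapply(list(...), all.vars)))
  # Keep only the referenced columns, in their original order
  data[, intersect(names(data), used), drop = FALSE]
}

two = trim_columns(iris, Sepal.Length ~ Sepal.Width)
three = trim_columns(iris, Sepal.Length ~ Sepal.Width, ~ Species)

names(two)    # "Sepal.Length" "Sepal.Width"
names(three)  # "Sepal.Length" "Sepal.Width" "Species"
```

A purely static approach like this one only sees names that appear in formulas or encodings, which is why columns referenced solely inside JavaScript strings would be missed.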
Trimming also applies to labels: the text column referenced by labels()
is automatically preserved.
df = data.frame(
  x = c('A', 'B', 'C'), y = c(3, 7, 2),
  label = c('low', 'high', 'mid'), extra = 1:3
)
# label is kept (used by labels()); extra is trimmed
g2(df, y ~ x) |>
  mark_interval() |>
  labels(text = ~ label, position = 'inside')
Some configurations reference columns inside inline JavaScript functions
that gglite cannot detect statically. A common case is a custom style
callback that reads a field from the data row directly:
# Species is used in the JS fill callback but is not listed in encode.
# Without I(), Species would be trimmed and the callback would not work.
g2(I(iris), Sepal.Length ~ Sepal.Width) |>
  mark_point(style = list(
    fill = js('(d) => d.Species === "setosa" ? "steelblue" : "tomato"')
  ))
Wrapping the data in I() tells gglite to preserve all columns and skip
trimming. The AsIs class is stripped before JSON serialization, so the
chart works exactly as if the data were passed directly — but with all
columns available to JavaScript.
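The mechanism relies on base R: `I()` prepends the class `AsIs` to an object, which other code can test for with `inherits()`. The sketch below shows how a preparation step could use that marker to skip trimming and then strip it again. `prepare_data()` and its `used` argument are hypothetical and only illustrate the idea; they are not gglite's internal API.

```r
# Hypothetical sketch (not gglite's implementation): trim unless wrapped in I().
prepare_data = function(data, used) {
  if (inherits(data, 'AsIs')) {
    # Opt-out: drop the AsIs marker and keep every column
    class(data) = setdiff(class(data), 'AsIs')
    return(data)
  }
  # Default: keep only the columns the chart is known to reference
  data[, intersect(names(data), used), drop = FALSE]
}

used = c('Sepal.Length', 'Sepal.Width')

trimmed = prepare_data(iris, used)
kept = prepare_data(I(iris), used)

names(trimmed)  # "Sepal.Length" "Sepal.Width"
ncol(kept)      # 5
class(kept)     # "data.frame"
```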
The same applies to mark-level data:
df = data.frame(x = 1:5, y = c(2, 4, 3, 5, 1), label = c('a', 'b', 'c', 'd', 'e'))
# label is referenced in a JS tooltip callback, not in encode
g2() |> mark_point(
  data = I(df),
  encode = list(x = 'x', y = 'y'),
  tooltip = list(items = list(js('(d) => ({ name: "label", value: d.label })')))
)