---
title: "Introduction to gglite"
output:
  html:
    meta:
      css: ['@default', '@article', '@copy-button', '@heading-anchor', '@pages']
      js: ['@sidenotes', '@appendix', '@toc-highlight', '@copy-button', '@heading-anchor', '@pages']
    options:
      toc: true
      number_sections: true
vignette: >
  %\VignetteIndexEntry{Introduction to gglite}
  %\VignetteEngine{litedown::vignette}
  %\VignetteEncoding{UTF-8}
---

```{r, setup, include = FALSE}
if (!exists('penguins')) {
  load(con <- url('https://cdn.jsdelivr.net/gh/r-devel/r-svn/src/library/datasets/data/penguins.rda'))
  close(con)
}
```

The **gglite** package provides a lightweight R interface to the
[AntV G2](https://g2.antv.antgroup.com/) JavaScript visualization library. It
follows the _Grammar of Graphics_ framework---the same theoretical foundation
behind **ggplot2**---but renders interactive, web-based charts powered by G2.

A visualization in gglite is built by composing independent layers:

1. **Data** -- the data frame you want to visualize.
2. **Marks** (geometries) -- the visual shapes representing data (points,
   lines, bars, ...).
3. **Encodings** (aesthetics) -- mappings from data columns to visual channels
   (position, color, size, ...).
4. **Scales** -- control how data values translate to visual values.
5. **Coordinates** -- the coordinate system (Cartesian, polar, ...).
6. **Transforms** -- statistical or layout transforms applied to the data.
7. **Facets** -- split data into multiple panels.
8. **Themes** -- overall visual styling.
9. **Components** -- axes, legends, titles, tooltips, and labels.

Each layer is added with the pipe operator `|>`, so building a chart reads
naturally from left to right. If you prefer the ggplot2 convention, you can
use `+` instead of `|>`; both operators produce identical results.

## Data and encodings

Every chart starts with `g2()`, which accepts a data frame and aesthetic
mappings as R formulas:

```{r}
library(gglite)
g2(mtcars, hp ~ mpg)
```

You can also set encodings later with `encode()`:

```{r}
g2(mtcars) |> encode(x = ~ mpg, y = ~ hp, color = ~ cyl)
```

### Formula interface

In a two-sided formula, the left-hand side maps to `y` and the right-hand
side maps to `x`:

```{r}
g2(mtcars, hp ~ mpg)
```

Additional aesthetics like `color` can be passed alongside the formula:

```{r}
g2(iris, Sepal.Length ~ Sepal.Width, color = ~ Species)
```

Use `|` for faceting:

```{r}
g2(iris, Sepal.Length ~ Sepal.Width | Species)
```

A one-sided formula maps only `x` (useful for histograms or counts):

```{r}
g2(mtcars, ~ mpg)
```

Use `+` on the right-hand side for multiple position fields (parallel
coordinates):

```{r}
g2(iris, ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
  color = ~ Species)
```
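
To make the formula conventions above concrete, here is a minimal base R sketch
of how such a formula can be taken apart. The helper `split_formula()` is
hypothetical and is not part of gglite; it only illustrates the conventions and
makes no claim about how gglite parses formulas internally.

```{r}
# Hypothetical helper (not a gglite function): split a formula into the
# y, x, and facet variables described above, using only base R.
split_formula = function(f) {
  stopifnot(inherits(f, 'formula'))
  # A two-sided formula has three elements (`~`, lhs, rhs); one-sided has two
  lhs = if (length(f) == 3) f[[2]] else NULL
  rhs = f[[length(f)]]
  facet = NULL
  # `a | b` on the right-hand side is parsed as a call to `|`
  if (is.call(rhs) && identical(rhs[[1]], as.name('|'))) {
    facet = rhs[[3]]
    rhs = rhs[[2]]
  }
  list(
    y = if (!is.null(lhs)) all.vars(lhs),
    x = all.vars(rhs),
    facet = if (!is.null(facet)) all.vars(facet)
  )
}

str(split_formula(hp ~ mpg))
str(split_formula(Sepal.Length ~ Sepal.Width | Species))
str(split_formula(~ Sepal.Length + Sepal.Width))
```

Because `~` has lower precedence than `|` and `+` in R, the whole expression
after `~` arrives as a single right-hand side, which is why the facet part can
be peeled off afterwards.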

### Character string interface

All aesthetic channels also accept plain character strings instead of formulas.
This alternative syntax is equivalent---`color = 'species'` produces the same
result as `color = ~ species`:

```{r}
g2(mtcars, x = 'mpg', y = 'hp', color = 'cyl')
```

The `encode()` function also accepts character strings:

```{r}
g2(mtcars) |> encode(x = 'mpg', y = 'hp', color = 'cyl')
```
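
One way to picture the equivalence of the two interfaces is that both forms
reduce to the same field name. The helper `field_name()` below is a
hypothetical illustration in base R, not a gglite function.

```{r}
# Hypothetical helper (not part of gglite): reduce a one-sided formula or a
# character string to the column name it refers to.
field_name = function(v) {
  if (inherits(v, 'formula')) all.vars(v) else as.character(v)
}

# Both spellings name the same column
identical(field_name(~ cyl), field_name('cyl'))
```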

## Marks (geometries)

Marks are the visual building blocks. gglite provides 35+ mark types. Here are
the most common ones.

### Points (scatter plot)

```{r}
g2(iris, Sepal.Length ~ Sepal.Width, color = ~ Species) |>
  mark_point()
```

### Lines

```{r}
df = data.frame(
  x = rep(1:5, 2), y = c(3, 1, 4, 1, 5, 2, 7, 1, 8, 3),
  group = rep(c('A', 'B'), each = 5)
)
g2(df, y ~ x, color = ~ group) |> mark_line()
```

### Bars (intervals)

```{r}
df = data.frame(x = c('A', 'B', 'C', 'D'), y = c(3, 7, 2, 5))
g2(df, y ~ x) |> mark_interval()
```

### Areas

```{r}
df = data.frame(x = 1:10, y = c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3))
g2(df, y ~ x) |> mark_area()
```

### Box plots

```{r}
g2(iris, Sepal.Width ~ Species) |> mark_boxplot()
```

### Combining marks

Multiple marks can be layered on the same chart:

```{r}
df = data.frame(x = c('A', 'B', 'C'), y = c(3, 7, 2))
g2(df, y ~ x) |>
  mark_interval() |>
  mark_text(encode = list(text = 'y'))
```

## Automatic marks

When no `mark_*()` is added to the pipeline, gglite automatically chooses a
mark based on the types of the `x` and `y` variables:

| `x` type | `y` type | Mark | Chart type |
|-----------|----------|------|------------|
| numeric | numeric | `point` | Scatter plot |
| categorical (unique) | numeric | `interval` | Bar plot |
| categorical (repeated) | numeric | `beeswarm` | Beeswarm plot |
| categorical (repeated, n ≥ 30) | numeric | `beeswarm` + `density` | Beeswarm + density |
| numeric | categorical (unique) | `interval` (transposed) | Horizontal bar plot |
| numeric | categorical (repeated) | `beeswarm` (transposed) | Horizontal beeswarm |
| numeric | categorical (repeated, n ≥ 30) | `beeswarm` + `density` (transposed) | Horizontal beeswarm + density |
| categorical | categorical | `cell` + `group` | Contingency table |
| Date | numeric | `line` | Line chart |
| `ts`/`mts` | _(auto)_ | `line` | Time series line chart |
| numeric | _(none)_ | `interval` + `binX` | Histogram |
| categorical | _(none)_ | `interval` + `groupX` | Count bar chart |
| _(position)_ | _(none)_ | `line` + parallel | Parallel coordinates |

When one of `x` and `y` is categorical and the other is numeric, the choice
depends on whether the categories are unique in the data. If every category
appears exactly once, a bar plot (`interval`) is drawn. If categories are
repeated, a beeswarm plot shows individual data points. When every group has at
least 30 observations, a density curve is overlaid on the beeswarm for a
summary view.
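
The rules for the categorical case can be sketched in a few lines of base R.
The function `guess_mark()` is a hypothetical illustration that covers only
some rows of the table (it ignores transposition, one-variable charts, and time
series); it is not gglite's actual detection code.

```{r}
# Hypothetical sketch of a few rows of the table above (not gglite's code).
guess_mark = function(x, y) {
  is_cat = function(v) is.factor(v) || is.character(v)
  if (inherits(x, 'Date') && is.numeric(y)) return('line')
  if (is.numeric(x) && is.numeric(y)) return('point')
  if (is_cat(x) && is_cat(y)) return('cell')
  if (is_cat(x) && is.numeric(y)) {
    n = table(as.character(x))  # observations per category
    if (all(n == 1)) return('interval')                 # unique categories
    if (all(n >= 30)) return(c('beeswarm', 'density'))  # large groups
    return('beeswarm')                                  # repeated categories
  }
  NA_character_  # cases this sketch does not cover
}

guess_mark(mtcars$mpg, mtcars$hp)
guess_mark(c('A', 'B', 'C'), c(3, 7, 2))
guess_mark(chickwts$feed, chickwts$weight)
guess_mark(iris$Species, iris$Sepal.Width)
```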

Because of these rules, you can often skip the mark entirely, as the following
examples show.

### Scatter plot (numeric × numeric)

```{r}
g2(penguins, bill_len ~ bill_dep, color = ~ species)
```

### Bar plot (categorical × numeric, unique categories)

When each category appears once, a bar chart is drawn:

```{r}
df = data.frame(x = c('A', 'B', 'C', 'D'), y = c(3, 7, 2, 5))
g2(df, y ~ x)
```

### Beeswarm plot (categorical × numeric, repeated categories)

When categories are repeated, individual points are shown in a beeswarm layout:

```{r}
g2(chickwts, weight ~ feed)
```

### Beeswarm + density (categorical × numeric, large groups)

When every group has at least 30 observations, a density curve is overlaid on
the beeswarm:

```{r}
g2(penguins, bill_len ~ species)
```

### Horizontal beeswarm (numeric × categorical)

```{r}
g2(penguins, species ~ bill_len)
```

### Contingency table (categorical × categorical)

Cells are automatically colored by the count of each combination:

```{r}
g2(penguins, island ~ species)
```
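
To see what "count of each combination" means, the same kind of tally can be
computed with base R. This sketch uses `mtcars` and `table()` as a stand-in;
how gglite computes the counts internally is not specified here.

```{r}
# Tally every combination of two categorical variables
counts = as.data.frame(table(cyl = mtcars$cyl, gear = mtcars$gear))
counts
```

Each row corresponds to one cell, and `Freq` is the value that would drive the
cell color in a chart of this kind.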

### Line chart (Date × numeric)

```{r}
df = data.frame(date = Sys.Date() + 0:9, value = cumsum(rnorm(10)))
g2(df, value ~ date)
```

### Histogram (numeric only)

```{r}
g2(penguins, ~ bill_len)
```

### Count bar chart (categorical only)

```{r}
g2(penguins, ~ species)
```

### Parallel coordinates (multiple position fields)

```{r}
g2(penguins, ~ bill_len + bill_dep + flipper_len + body_mass, color = ~ species)
```

You can still add scales, themes, titles, and other components as usual:

```{r}
g2(mtcars, hp ~ mpg, color = ~ cyl) |>
  scale_color(type = 'ordinal') |>
  titles('Motor Trend Cars')
```

If you add any `mark_*()`, automatic detection is skipped entirely, so explicit
marks always take priority.

### Time series

`g2()` also accepts R time series (`ts` and `mts`) objects directly. Univariate
series are converted to a data frame with `time` and `value` columns;
multivariate series are reshaped to long format with `time`, `series`, and
`value` columns. The auto-mark feature draws a line chart automatically:

```{r}
g2(sunspot.year) |> titles('Yearly Sunspot Numbers (1700--1988)')
```

Multivariate time series produce one line per series:

```{r}
g2(EuStockMarkets) |> titles('EU Stock Markets (1991--1998)')
```
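
The conversion described above can be approximated in base R. The helper
`ts_to_df()` is a hypothetical sketch that produces the column layout named in
the text (`time`, `series`, `value`); gglite's own conversion may differ in
details such as column types or ordering.

```{r}
# Hypothetical sketch (not gglite's code): convert ts/mts to a data frame.
ts_to_df = function(x) {
  stopifnot(is.ts(x))
  tm = as.numeric(time(x))
  if (is.matrix(x)) {
    # Multivariate: stack the columns into long format
    data.frame(
      time = rep(tm, ncol(x)),
      series = rep(colnames(x), each = nrow(x)),
      value = as.numeric(x)
    )
  } else {
    data.frame(time = tm, value = as.numeric(x))
  }
}

head(ts_to_df(sunspot.year), 3)
head(ts_to_df(EuStockMarkets), 3)
```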

## Scales

Scales control how data values map to visual properties. Use helpers like
`scale_x()`, `scale_y()`, and `scale_color()` to configure scales:

```{r}
g2(mtcars, hp ~ mpg, color = ~ wt) |>
  scale_y(type = 'log') |>
  scale_color(palette = 'viridis')
```

Custom domains for the x and y scales:

```{r}
g2(mtcars, hp ~ mpg) |>
  scale_x(domain = c(10, 35)) |>
  scale_y(domain = c(0, 400))
```
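
Conceptually, a continuous scale is a function from a data domain to a visual
range. The base R sketch below illustrates that idea for linear and log scales;
`make_scale()` and its arguments are hypothetical and do not reflect how G2
implements scales.

```{r}
# Hypothetical sketch: build a function mapping `domain` onto `out`.
make_scale = function(domain, out = c(0, 1), type = c('linear', 'log')) {
  type = match.arg(type)
  tr = if (type == 'log') log10 else identity
  d = tr(domain)
  function(v) out[1] + (tr(v) - d[1]) / (d[2] - d[1]) * diff(out)
}

sx = make_scale(c(10, 35))
sx(c(10, 22.5, 35))  # 0.0 0.5 1.0

sy = make_scale(c(1, 100), type = 'log')
sy(10)  # 0.5, because 10 is halfway between 1 and 100 on a log scale
```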

## Coordinates

Coordinate systems change how positional encodings are interpreted. gglite
supports Cartesian (default), polar, theta, and radial coordinates.

### Polar coordinates (rose chart)

```{r}
df = data.frame(x = c('A', 'B', 'C', 'D'), y = c(3, 7, 2, 5))
g2(df, y ~ x, color = ~ x) |>
  mark_interval() |>
  coord_polar()
```

### Theta coordinates (pie chart)

```{r}
g2(df, y ~ x, color = ~ x) |>
  mark_interval() |>
  transform('stackY') |>
  coord_theta(innerRadius = 0.5)
```
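
The reason stacking is needed for a pie chart is that each slice must start
where the previous one ends. A rough base R sketch of the resulting angles is
shown below, assuming angles are proportional to the stacked values; the actual
start angle and direction used by G2 may differ.

```{r}
# Same values as the pie chart above
vals = c(A = 3, B = 7, C = 2, D = 5)
# Cumulative share of the total, expressed in degrees
end = cumsum(vals) / sum(vals) * 360
slices = data.frame(
  x = names(vals),
  start = unname(c(0, head(end, -1))),
  end = unname(end)
)
slices
```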

### Transposing axes

`coord_transpose()` swaps x and y (similar to ggplot2's `coord_flip()`):

```{r}
g2(df, y ~ x) |>
  mark_interval() |>
  coord_transpose()
```

## Transforms

Transforms modify the data before rendering. Use `transform()` to apply
statistical or layout transforms. When chaining with `+` instead of `|>`, the
first argument to `transform()` must be unnamed; see `?transform.g2` for
details.

### Stacked bars

```{r}
df = data.frame(
  x = rep(c('A', 'B', 'C'), each = 2), y = c(3, 2, 5, 4, 1, 6),
  color = rep(c('a', 'b'), 3)
)
g2(df, y ~ x, color = ~ color) |>
  mark_interval() |>
  transform('stackY')
```
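
As a rough picture of what stacking does to the data, the following base R
sketch computes the bottom (`y0`) and top (`y1`) of each bar segment by taking
a running total within each `x` category. The column names `y0` and `y1` are
chosen for illustration, and G2's `stackY` may order the segments differently.

```{r}
# Same data as the stacked bar example, under a different name
bars = data.frame(
  x = rep(c('A', 'B', 'C'), each = 2), y = c(3, 2, 5, 4, 1, 6),
  color = rep(c('a', 'b'), 3)
)
# Running total within each x category gives the top of each segment
bars$y1 = ave(bars$y, bars$x, FUN = cumsum)
# The bottom of a segment is the top of the one beneath it
bars$y0 = bars$y1 - bars$y
bars
```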

### Dodged bars

```{r}
g2(df, y ~ x, color = ~ color) |>
  mark_interval() |>
  transform('dodgeX')
```

### Stacked area chart

```{r}
df = data.frame(
  x = rep(1:5, 2), y = c(3, 1, 4, 1, 5, 2, 7, 1, 8, 3),
  group = rep(c('A', 'B'), each = 5)
)
g2(df, y ~ x, color = ~ group) |>
  mark_area() |>
  transform('stackY')
```

## Facets

Faceting splits data into panels. Use `facet_rect()` for a grid layout:

```{r}
g2(iris, Sepal.Length ~ Sepal.Width, color = ~ Species) |>
  facet_rect(~ Species)
```

The formula interface supports faceting with `|`. Use `| var` for column facets,
`| 0 + var` for row facets, and `| var1 + var2` for both:

```{r}
# Column facet: panels arranged in columns by species
g2(penguins, bill_len ~ bill_dep | species)
```

```{r}
# Row facet: panels arranged in rows by island
g2(penguins, bill_len ~ bill_dep | 0 + island)
```

```{r}
# Both: columns by species, rows by island
g2(penguins, bill_len ~ bill_dep | species + island)
```

## Themes

Themes change the overall look. Built-in themes include `theme_classic()`
(default), `theme_classic_dark()`, `theme_light()`, `theme_dark()`, and
`theme_academy()`:

```{r}
g2(iris, Sepal.Length ~ Sepal.Width, color = ~ Species) |>
  theme_academy()
```

## Components

Components are the non-data elements of a chart: titles, tooltips, axes,
legends, and labels.

### Titles

```{r}
g2(mtcars, hp ~ mpg) |>
  titles('Motor Trend Cars', subtitle = 'mpg vs horsepower')
```

### Tooltips

```{r}
g2(sunspot.year) |> tooltip(crosshairs = TRUE)
```

### Labels

Use `labels()` to add text annotations. When chaining with `+` instead of
`|>`, the first argument to `labels()` must be unnamed; see `?labels.g2` for
details.

```{r}
df = data.frame(x = c('A', 'B', 'C', 'D'), y = c(3, 7, 2, 5))
g2(df, y ~ x) |>
  mark_interval() |>
  labels(text = ~ y)
```

## Interactions

Interactions add user-driven behaviors like hovering, brushing, and filtering:

```{r}
g2(iris, Sepal.Length ~ Sepal.Width, color = ~ Species) |>
  interact('tooltip') |>
  interact('legendFilter') |>
  interact('brushHighlight')
```

## Putting it all together

Here is a more complete example combining several grammar layers:

```{r}
df = data.frame(
  x = rep(c('Q1', 'Q2', 'Q3', 'Q4'), each = 2),
  y = c(120, 80, 150, 90, 180, 110, 200, 130),
  product = rep(c('Widget', 'Gadget'), 4)
)
g2(df, y ~ x, color = ~ product) |>
  mark_interval() |>
  transform('dodgeX') |>
  scale_color(range = c('#5470c6', '#91cc75')) |>
  titles('Quarterly Sales', subtitle = 'By product line') |>
  interact('tooltip') |>
  interact('elementHighlightByX') |>
  theme_classic()
```

## Using `+` instead of `|>`

If you are used to ggplot2, you can replace `|>` with `+`. Both operators
produce identical charts:

```{r}
# Pipe style
g2(iris, Sepal.Length ~ Sepal.Width, color = ~ Species) |>
  scale_color(palette = 'set2') |>
  titles('Iris Dataset')
```

```{r}
# ggplot2 style
g2(iris, Sepal.Length ~ Sepal.Width, color = ~ Species) +
  scale_color(palette = 'set2') +
  titles('Iris Dataset')
```

You can mix modifiers freely---marks, scales, coordinates, themes, facets,
transforms, and components all work with `+`:

```{r}
df = data.frame(x = c('A', 'B', 'C', 'D'), y = c(3, 7, 2, 5))
g2(df, y ~ x, color = ~ x) +
  mark_interval() +
  coord_polar() +
  titles('Polar Bar Chart') +
  theme_academy()
```

You can even freely mix `+` and `|>` in the same expression---due to R's
operator precedence (`|>` binds tighter than `+`), any combination produces
the same result:

```{r}
# These two expressions are equivalent:
g2(mtcars, hp ~ mpg) |>
  scale_x(type = 'log') + theme_dark()

g2(mtcars, hp ~ mpg) +
  scale_x(type = 'log') + theme_dark()
```
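
The precedence rule can be checked directly in base R, because the native pipe
is rewritten when the code is parsed. In the sketch below, `p`, `f`, and `g`
are placeholder names rather than gglite functions; `quote()` captures the
parsed call without evaluating it.

```{r}
# The pipe binds first, so `+` receives the already-piped call on its left
deparse(quote(p |> f() + g()))  # "f(p) + g()"

# Here the pipe applies to f() alone; g() receives f(), not p
deparse(quote(p + f() |> g()))  # "p + g(f())"
```

In the second form the right-hand operand of `+` becomes `g(f())`, so the
equivalence stated above depends on how gglite's functions handle a modifier
passed as their first argument.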
