_{This page is available as an executable or viewable Jupyter Notebook:}

**Lets-Plot** is an open-source plotting library for statistical data. It is implemented using the
Kotlin programming language that has a multi-platform nature.
That's why Lets-Plot provides the plotting functionality that
is packaged as a JavaScript library, a JVM library, and a native Python extension.

The design of the Lets-Plot library is heavily influenced by ggplot2 library.

Library is distributed via Maven Repository.
You can include it in your Kotlin or Java project using Maven or Gradle configuration files (see also Developer guide),
or include it in your Jupyter notebook script via `%use lets-plot`

annotation (see Kotlin kernel for IPython/Jupyter).

In `lets-plot`

, the **plot** is represented at least by one
**layer**. It can be built based on the default dataset with the aesthetics mappings, set of scales, or additional
features applied.

The **Layer** is responsible for creating the objects painted on the ‘canvas’ and it contains the following elements:

**Data**- the set of data specified either once for all layers or on a per layer basis. One plot can combine multiple different datasets (one per layer).**Aesthetic mapping**- describes how variables in the dataset are mapped to the visual properties of the layer, such as color, shape, size, or position.**Geometric object**- a geometric object that represents a particular type of charts.**Statistical transformation**- computes some kind of statistical summary on the raw input data. For example,`bin`

statistics is used for histograms and`smooth`

is used for regression lines. Most stats take additional parameters to specify details of the statistical transformation of data.**Position adjustment**- a method used to compute the final coordinates of geometry. Used to build variants of the same`geom`

object or to avoid overplotting.

The typical code fragment that plots a Lets-Plot chart looks as follows:

```
import jetbrains.letsPlot.*
import jetbrains.letsPlot.geom.*
import jetbrains.letsPlot.stat.*
p = lets_plot(<dataframe>)
p + geom_<chart_type>(stat=<stat>, position=<adjustment>) { <aesthetics mapping> }
```

`geom`

¶You can add a new geometric object (or plot layer) by creating it using the `geom_xxx()`

function and then adding this object to `ggplot`

:

```
p = lets_plot(data=df)
p + geom_point()
```

See the geom reference for more information about the supported geometric objects, their arguments, and default values.

There is also a few `stat_xxx()`

functions which also create a plot layer.
Occasionally, it feels more naturally to use `stat_xxx()`

instead of `geom_xxx()`

function to add a new plot layer.
For example, you might prefer to use `stat_count()`

instead of `geom_bar()`

.

See the stat layer reference for more information about the supported stat plot-layer objects, their arguments, and default values.

With the GGBunch() object, you can
render a collection of plots.
Use the `addPlot()`

method to add a plot to the bunch and set an arbitrary location and size for plots inside the grid:

```
bunch = GGBunch()
.add_plot(plot1, 0, 0)
.add_plot(plot2, 0, 200)
bunch.show()
```

See the ggbunch.ipynb example for more information.

`stat`

¶Add `stat`

as an argument to `geom_xxx()`

function to define statistical data transformations:

`geom_point(stat=Stat.count())`

Supported statistical transformations:

`identity`

: leave the data unchanged`count`

: calculate the number of points with same x-axis coordinate`bin`

: calculate the number of points falling in each of adjacent equally sized ranges along the x-axis`bin2d`

: calculate the number of points falling in each of adjacent equal sized rectangles on the plot plane`smooth`

: perform smoothing`contour`

,`contourf`

, : calculate contours of 3D data`boxplot`

: calculate components of a box plot.`density`

,`density2d`

,`density2df`

: perform a kernel density estimation for 1D and 2D data

See the stat reference for more information about the supported stat objects, their arguments, and default values.

`mapping`

¶With mappings, you can define how variables in dataset are mapped to the visual elements of the chart.
Add the `{x=< >; y=< >; ...}`

closure to `geom`

, where:

`x`

: the dataframe column to map to the x axis.`y`

: the dataframe column to map to the y axis.`...`

: other visual properties of the chart, such as color, shape, size, or position.

`geom_point() {x = "cty"; y = "hwy"; color="cyl"}`

`position`

¶All layers have a position adjustment that computes the final coordinates of geometry.
Position adjustment is used to build variances of the same plots and resolve overlapping.
Override the default settings by using the `position`

argument in the `geom`

functions:

`geom_bar(position=Pos.dodge)`

or

`geom_bar(position=position_dodge(width=1.01))`

Available adjustments:

`dodge`

`jitter`

`jitterdodge`

`nudge`

`identity`

`fill`

`stack`

See Pos object and position functions reference for more information about position adjustments.

Enables choosing a reasonable scale for each mapped variable depending on the variable attributes. Override default scales to tweak details like the axis labels or legend keys, or to use a completely different translation from data to aesthetic. For example, to override the fill color on the histogram:

`p + geom_histogram() + scale_fill_continuous("red", "green")`

See the list of the available `scale`

methods in the scale reference

The coordinate system determines how the x and y aesthetics combine to position elements in the plot. For example, to override the default X and Y ratio:

`p + coord_fixed(ratio=2)`

See the list of the available methods in coordinates reference

The axes and legends help users interpret plots.
Use the `guide`

methods or the `guide`

argument of the `scale`

method to customize the legend.
For example, to define the number of columns in the legend:

`p + scale_color_discrete(guide=guide_legend(ncol=2))`

See more information about the `guide_colorbar, guide_legend`

functions in the scale reference

Adjust legend location on plot using the `theme`

legendPosition, legendJustification and legendDirection methods, see:
theme reference

Sampling is a special technique of data transformation built into Lets-Plot and it is applied after stat transformation.
Sampling helps prevents UI freezes and out-of-memory crashes when attempting to plot an excessively large number of geometries.
By default, the technique applies automatically when the data volume exceeds a certain threshold.
The `sampling_none`

value disables any sampling for the given layer. The sampling methods can be chained together using the + operator.

Available methods:

`sampling_random_stratified`

: randomly selects points from each group proportionally to the group size but also ensures that each group is represented by at least a specified minimum number of points.`sampling_random`

: selects data points at randomly chosen indices without replacement.`sampling_pick`

: analyses X-values and selects all points which X-values get in the set of first`n`

X-values found in the population.`sampling_systematic`

: selects data points at evenly distributed indices.`sampling_vertex_dp`

,`sampling_vertex_vw`

: simplifies plotting of polygons. There is a choice of two implementation algorithms: Douglas-Peucker (`_dp`

) and Visvalingam-Whyatt (`_vw`

).

For more details, see the sampling reference.

Let's plot a point chart built using the mpg dataset.

Create the `DataFrame`

object and retrieve the data.

In [1]:

```
%use lets-plot
@file:DependsOn("com.github.doyaaaaaken:kotlin-csv-jvm:0.7.3")
import com.github.doyaaaaaken.kotlincsv.client.*
val csvData = java.io.File("mpg.csv")
val mpg: List<Map<String, String>> = CsvReader().readAllWithHeader(csvData)
fun col(name: String, discrete: Boolean=false): List<*> {
return mpg.map {
val v = it[name]
if(discrete) v else v?.toDouble()
}
}
val df = mapOf(
"displ" to col("displ"),
"hwy" to col("hwy"),
"cyl" to col("cyl"),
"index" to col(""),
"cty" to col("cty"),
"drv" to col("drv", true),
"year" to col("year")
)
```

Plot the basic point chart.

Perform the following aesthetic mappings:

`x`

= displ (the**displ**column of the dataframe)`y`

= hwy (the**hwy**column of the dataframe)`color`

= cyl (the**cyl**column of the dataframe)

In [2]:

```
// Mapping
lets_plot(df) {x = "displ"; y = "hwy"; color = "cyl"} + geom_point(df)
```

Out[2]:

Apply statistical data transformation to count the number of cases at each x position.

In [3]:

```
val p = lets_plot(df)
p + geom_point(df, stat = Stat.count()) {x = "displ"; color = "..count.."; size = "..count.."}
```

Out[3]:

Change the pallete and the legend, add the title.

In [4]:

```
val p = lets_plot(df) {x = "displ"; y = "hwy"; color = "cyl"}
p +
geom_point(df, position = Pos.nudge) +
scale_color_continuous("red", "green", guide = guide_legend(ncol=2)) +
ggtitle("Displacement by horsepower")
```

Out[4]:

Apply the randomly stratified sampling to select points from each group proportionally to the group size.

In [5]:

```
val p = lets_plot(df) {x = "displ"; y = "hwy"; color = "cyl"}
p + geom_point(
data=df, position=Pos.nudge,
sampling = sampling_random_stratified(40)
) + scale_color_continuous(
"blue", "pink",
guide = guide_legend(ncol=2)
)
```

Out[5]: