Vignette / Tutorial

This tutorial walks through the basic features of this Plotly-based package, using the chart types currently available.

Note

Before you begin, make sure the following packages are installed (for example, via pip install):

  • numpy

  • pandas

  • plotly

The plotly.express, plotly.graph_objects, and plotly.offline modules ship with the plotly package and do not need to be installed separately.
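As a quick sanity check before installing the package itself, a short script can report which top-level packages are still missing. This is only a sketch: missing_packages is a hypothetical helper written for this tutorial, not part of easyplotly.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be imported."""
    return [name for name in names if find_spec(name) is None]


# Top-level packages the tutorial relies on.
required = ["numpy", "pandas", "plotly"]

missing = missing_packages(required)
if missing:
    print("Install these first: pip install " + " ".join(missing))
else:
    print("All required packages are available.")
```

Only top-level names are checked here, since submodules such as plotly.express come along with their parent package.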

Then, from the directory that contains setup.py, run:

python setup.py develop

This installs the package on your local machine in development mode. To use it, import the following:

from easyplotly import Interactive_Visuals

Control Chart

To create a control chart, the data frame must contain columns named exactly as in the example below. Make sure the Date column is set as the index if it isn’t already (data coming from ADTK already has it as the index). Create an Interactive_Visuals object and then call the plot function.

import pandas as pd
from plotly.offline import plot  # assumed source of plot(); the tutorial does not show this import

df = pd.DataFrame(dict(
        Date=["2020-01-10", "2020-02-10", "2020-03-10", "2020-04-10", "2020-05-10", "2020-06-10", "2020-07-10"],
        Values=[1, 2, 3, 1, 2, 4, 5],
        Median=[2, 2, 2, 2, 2, 2, 2],
        UCL=[3, 3, 3, 3, 3, 3, 3],
        LCL=[1, 1, 1, 1, 1, 1, 1],
        Violation=[0, 0, 0, 0, 0, .5, .9]
    ))

# Set Date as the index (this is how the data arrives from ADTK)
df = df.set_index("Date")
iv = Interactive_Visuals(df)
plot(iv.control_chart_ADTK(title = "Anomaly Detection Graph"))
../_images/Control_Chart.png
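Because the control chart depends on exact column names, it can help to check them before plotting. The sketch below assumes the required names are the ones used in the example (Values, Median, UCL, LCL, Violation, with Date as the index). missing_control_chart_columns is a hypothetical helper, not a function provided by the package.

```python
# Column names taken from the example data frame above; "Date" is expected
# to be the index rather than a regular column.
REQUIRED_COLUMNS = ("Values", "Median", "UCL", "LCL", "Violation")


def missing_control_chart_columns(columns):
    """Return required column names absent from `columns`, in a fixed order."""
    present = set(columns)
    return [name for name in REQUIRED_COLUMNS if name not in present]


# Works with any iterable of names, e.g. df.columns for a pandas data frame.
print(missing_control_chart_columns(["Values", "Median", "UCL"]))
# prints ['LCL', 'Violation']
```

An empty list means every expected column is present.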

Scatterplot

Scatter plots come in several variations. First, load a data frame (here, we’ll use the well-known iris dataset).

import plotly.express as px

df = px.data.iris()
iv = Interactive_Visuals(df)

To obtain a very basic scatterplot, run this:

plot(iv.scatterplot(x = "sepal_length", y = "sepal_width"))
../_images/Scatterplot_Basic.png

Marginal Scatterplot

To create a scatterplot with a marginal box plot, run the following:

plot(iv.scatterplot(x = "sepal_length", y = "sepal_width", marg_x = "box", marg_y = "box"))
../_images/Scatterplot_Marginal.png

(Note that histograms or violin plots can also be plotted in the margins.)

Change Colors Based on Another Variable

Points can be colored by a categorical (factor) variable:

plot(iv.scatterplot(x = "sepal_length", y = "sepal_width",
                    marg_x = "box", marg_y = "box", color = "species"))
../_images/Scatterplot_Marginal_Factor.png

Or a numeric variable:

plot(iv.scatterplot(x = "sepal_length", y = "sepal_width",
                    marg_x = "box", marg_y = "box", color = "petal_width"))
../_images/Scatterplot_Marginal_Numeric.png

Prettify with Jitter and Opacity

If points overlap, jitter can be applied with jitter = True (if the default amount of jitter is unsatisfactory, adjust it with jitter_sd):

plot(iv.scatterplot(x = "sepal_length", y = "sepal_width",
                    marg_x = "box", marg_y = "box", color = "species", jitter = True))
../_images/Scatterplot_Marginal_Jitter.png
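To make the idea of jitter concrete, here is a minimal sketch of one common approach: adding a small amount of Gaussian noise to each value so that identical points no longer sit on top of each other. The name jitter_sd suggests a standard deviation, but the package's actual implementation is not shown in this tutorial, so add_jitter and its default sd are assumptions for illustration only.

```python
import random


def add_jitter(values, sd=0.05, seed=None):
    """Return a copy of `values` with Gaussian noise of standard deviation `sd`."""
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, sd) for v in values]


sepal_length = [5.1, 5.1, 5.1, 4.9, 4.9]  # repeated values overlap when plotted
print(add_jitter(sepal_length, sd=0.05, seed=0))
```

A larger sd spreads the points further apart, at the cost of showing values that are further from the measured data.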

Opacity can also be lowered so that overlapping points are easier to see:

plot(iv.scatterplot(x = "sepal_length", y = "sepal_width",
                    marg_x = "box", marg_y = "box", color = "species",
                    jitter = True, opacity = .5))
../_images/Scatterplot_Marginal_Opacity.png

Add Trendlines

Trendlines can also be added by setting trendline to “ols” (ordinary least squares):

plot(iv.scatterplot(x = "sepal_length", y = "sepal_width",
                    marg_x = "box", marg_y = "box", color = "species", jitter = True,
                    opacity = .8, trendline = "ols"))
../_images/Scatterplot_Marginal_Trendline.png
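For readers unfamiliar with ordinary least squares, the sketch below computes the slope and intercept of the best-fit line by hand. It illustrates the technique only. ols_line is a hypothetical helper, and the package presumably delegates the fit to its plotting backend. If that backend is Plotly Express (an assumption; the tutorial does not say), the "ols" option also requires the statsmodels package to be installed.

```python
def ols_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Points that lie exactly on y = 2x + 1.
slope, intercept = ols_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)  # prints 2.0 1.0
```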

Histogram

A basic histogram can be created by using a numeric variable:

plot(iv.histogram(x = "sepal_length"))
../_images/Histogram_Basic.png

Color by Categorical Variable

This histogram can be split based on a categorical variable:

plot(iv.histogram(x = "sepal_length", color = "species"))
../_images/Histogram_Factor.png

Show Marginal Distribution

The marginal distributions can be shown above the histogram:

plot(iv.histogram(x = "sepal_length", color = "species", marginal="box"))
../_images/Histogram_Marginal.png

Facet Plots

The plots can also be faceted, either vertically or horizontally, for readability (the example below uses facet_col to place the panels side by side):

plot(iv.histogram(x = "sepal_length", color = "species", facet_col = "species", marginal="box"))
../_images/Histogram_Facet.png
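Conceptually, faceting partitions the rows by a category and draws one panel per group. The sketch below shows that partitioning step in plain Python. split_by_category is a hypothetical helper and the rows are a tiny illustrative sample shaped like the iris data, not the package's internals.

```python
from collections import defaultdict


def split_by_category(rows, value_key, category_key):
    """Group the values of `value_key` by the value of `category_key`."""
    panels = defaultdict(list)
    for row in rows:
        panels[row[category_key]].append(row[value_key])
    return dict(panels)


# A few rows shaped like the iris data, for illustration only.
rows = [
    {"sepal_length": 5.1, "species": "setosa"},
    {"sepal_length": 7.0, "species": "versicolor"},
    {"sepal_length": 4.9, "species": "setosa"},
]
print(split_by_category(rows, "sepal_length", "species"))
# prints {'setosa': [5.1, 4.9], 'versicolor': [7.0]}
```

Each key of the result corresponds to one facet panel, and its list holds the values that panel's histogram would summarize.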

Customize Bins

The number of bins is also customizable:

plot(iv.histogram(x = "sepal_length", color = "species", facet_col = "species",
                  marginal = "box", bins = 10))
../_images/Histogram_Bins.png
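To see what the number of bins controls, the following sketch counts values in equal-width intervals between the minimum and maximum. bin_counts is a hypothetical helper for illustration. The plotting backend may choose slightly different bin edges than this simple scheme.

```python
def bin_counts(values, bins):
    """Count values in `bins` equal-width intervals spanning min..max."""
    if bins < 1 or not values:
        raise ValueError("need at least one bin and one value")
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        if width == 0:
            index = 0  # all values are identical, so use the first bin
        else:
            # The maximum lands on the last edge, so clamp it into the last bin.
            index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts


print(bin_counts([4.3, 5.0, 5.1, 5.8, 6.4, 7.9], bins=3))  # prints [3, 2, 1]
```

Fewer bins give a smoother but coarser picture; more bins show finer detail but can look noisy on small samples.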

Titles

Titles can be removed if they are distracting:

plot(iv.histogram(x = "sepal_length", color = "species", facet_col = "species",
                  marginal = "box", bins = 10, has_title = False))
../_images/Histogram_NoTitle.png

Or replaced with a custom title:

plot(iv.histogram(x = "sepal_length", color = "species", facet_col = "species",
                  marginal = "box", bins = 10, title = "Sepal Length Faceted on Species"))
../_images/Histogram_CustomTitle.png

Bar Plot

For bar plots, we will use a dataset with more categorical variables:

df = px.data.tips()
iv = Interactive_Visuals(df)

A basic bar plot can be created by using a categorical variable:

plot(iv.barplot(x = "sex"))
../_images/Barplot_Basic.png

Stacked Bar Plots

Stacked bar plots can be created by passing a categorical variable to color:

plot(iv.barplot(x = "sex", color = "smoker"))
../_images/Barplot_Stacked.png
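Each segment of a stacked (or grouped) bar represents the number of rows for one combination of the x category and the color category. The sketch below computes those counts in plain Python. cross_counts is a hypothetical helper and the rows are made up to resemble the tips data, so the numbers are illustrative only.

```python
from collections import Counter


def cross_counts(rows, x_key, color_key):
    """Count rows for every (x, color) combination."""
    return Counter((row[x_key], row[color_key]) for row in rows)


# Made-up rows shaped like the tips data, for illustration only.
rows = [
    {"sex": "Female", "smoker": "No"},
    {"sex": "Male", "smoker": "No"},
    {"sex": "Male", "smoker": "Yes"},
    {"sex": "Male", "smoker": "No"},
]
counts = cross_counts(rows, "sex", "smoker")
print(counts[("Male", "No")])  # prints 2
```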

Grouped Bar Plots

The bars can instead be grouped side by side:

plot(iv.barplot(x = "sex", color = "smoker", barmode = "group"))
../_images/Barplot_Grouped.png

Horizontal Bars

Bars can also be drawn horizontally:

plot(iv.barplot(x = "sex", color = "smoker", is_horizontal = True))
../_images/Barplot_Horizontal.png

Plot on Percentages

Bar plots can also show percentages instead of counts:

plot(iv.barplot(x = "sex", color = "smoker", is_horizontal = True, is_percent = True))
../_images/Barplot_Percent.png

Add Actual Values Onto Plots

If the graphs are going into a slide deck such as PowerPoint, the actual values can be added to the bars for both counts and percentages (percentages are automatically rounded to two decimal places):

plot(iv.barplot(x = "sex", color = "smoker", is_horizontal = True,
                is_percent = True, show_num = True))
../_images/Barplot_Values.png
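The percentage labels can be thought of as category counts divided by the total and rounded to two decimal places. The sketch below assumes each percentage is a share of all rows; the tutorial does not state whether the package normalizes over all rows or within each bar. percent_by_category is a hypothetical helper.

```python
from collections import Counter


def percent_by_category(values, ndigits=2):
    """Return each category's share of all values as a rounded percentage."""
    counts = Counter(values)
    total = sum(counts.values())
    return {key: round(100 * n / total, ndigits) for key, n in counts.items()}


print(percent_by_category(["Male", "Male", "Female"]))
# prints {'Male': 66.67, 'Female': 33.33}
```

Note that rounded percentages do not always sum to exactly 100.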