BAA1028 - Workflow & Data Management
An open-source scientific and technical publishing system that builds on standard markdown with features essential for scientific communication.
Pandoc Markdown
Jupyter Kernels
Dozens of Output Formats
Specialized Project Types
Render to output formats:
# ipynb notebook
quarto render notebook.ipynb
quarto render notebook.ipynb --to docx
# plain text qmd
quarto render notebook.qmd
quarto render notebook.qmd --to pdf
Live preview server (re-render on save):
penguins.qmd
---
title: "Palmer Penguins"
author: Norah Jones
date: March 12, 2023
format: html
jupyter: python3
---
```{python}
#| echo: false
import pandas as pd
df = pd.read_csv("palmer-penguins.csv")
df = df[["species", "island", "year",
         "bill_length_mm", "bill_depth_mm"]]
```
## Exploring the Data
See @fig-bill-sizes for an exploration of bill sizes.
```{python}
#| label: fig-bill-sizes
#| fig-cap: Bill Sizes by Species
import matplotlib.pyplot as plt
import seaborn as sns
g = sns.FacetGrid(df, hue="species", height=3)
g.map(plt.scatter, "bill_length_mm", "bill_depth_mm") \
  .add_legend()
```
Editable with any text editor (extensions for VS Code, Neovim, and Emacs)
Cells always run in the same order
Integrates well with version control
Cache output with Jupyter Cache or Quarto freezer
Lots of pros and cons vis-à-vis the traditional .ipynb format and editors; use the right tool for each job
Notebook workflow (no execution occurs by default):
Plain text workflow (.qmd => .ipynb, then execute cells):
See for example:
Navigation Bar and Pages — Icon, title, and author along with links to sub-pages (if more than one page is defined).
Sidebars, Rows & Columns, and Tabsets — Rows and columns using markdown heading (with optional attributes to control height, width, etc.). Sidebars for interactive inputs. Tabsets to further divide content.
Cards (Plots, Tables, Value Boxes, Content) — Cards are containers for cell outputs and free form markdown text. The content of cards typically maps to cells in your notebook or source document.
All of these components can be authored and customized within notebook UI or plain text qmd.
```{python}
#| title: GDP and Life Expectancy
import plotly.express as px
df = px.data.gapminder()
px.scatter(
    df, x="gdpPercap", y="lifeExp",
    animation_frame="year", animation_group="country",
    size="pop", color="continent", hover_name="country",
    facet_col="continent", log_x=True, size_max=45,
    range_x=[100, 100000], range_y=[25, 90]
)
```
## Row
```{python}
#| component: valuebox
#| title: "Current Price"
dict(icon="currency-dollar",
     color="secondary",
     value=get_price(data))
```
```{python}
#| component: valuebox
#| title: "Change"
change = get_change(data)
dict(value=change['amount'],
     icon=change['icon'],
     color=change['color'])
```
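The value box cells above call `get_price()` and `get_change()`, which the slides do not define. A minimal sketch of what such helpers might look like, assuming `data` is a plain list of closing prices (the helper bodies, the sample data, and the icon and color choices are all hypothetical):

```python
# Hypothetical helpers: the slides call get_price() and get_change()
# without defining them. `data` is assumed to be a list of closing prices.

def get_price(data):
    """Return the most recent price formatted as a string."""
    return f"{data[-1]:.2f}"


def get_change(data):
    """Return amount, icon, and color describing the last price change."""
    delta = data[-1] - data[-2]
    rising = delta >= 0
    return {
        "amount": f"{delta:+.2f}",
        # Bootstrap icon names and theme colors, chosen for illustration
        "icon": "arrow-up" if rising else "arrow-down",
        "color": "success" if rising else "danger",
    }


data = [101.5, 99.0, 103.25]  # made-up sample prices
price = get_price(data)
change = get_change(data)
```

Keeping the formatting logic in helpers like these leaves each value box cell as a short `dict(...)` expression, which is the last expression of the cell and therefore what the dashboard renders.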
## Column
```{python}
#| title: Population
px.area(df, x="year", y="pop",
        color="continent",
        line_group="country")
```
```{python}
#| title: Life Expectancy
px.line(df, x="year", y="lifeExp",
        color="continent",
        line_group="country")
```
::: {.card}
Gapminder combines data from multiple sources
into unique coherent time-series that can’t be
found elsewhere. Learn more about the Gapminder
dataset at <https://www.gapminder.org/data/>.
:::
Cards provide an Expand button which appears at bottom right on hover:
Dashboards are typically just static HTML pages so can be deployed to any web server or web host.
| Type | Description |
|------|-------------|
| Static | Rendered a single time (e.g. when underlying data won’t ever change). |
| Scheduled | Rendered on a schedule (e.g. via cron job) to accommodate changing data. |
| Parameterized | Variations of static or scheduled dashboards based on parameters. |
| Interactive | Fully interactive dashboard using Shiny (requires a server for deployment). |
Add a parameters tag to the first cell (based on papermill):
Use the -P command line option to vary the parameter:
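As a rough illustration of the two steps above, a parameters cell might look like the sketch below. The parameter names `year` and `continent` and their defaults are hypothetical; only the `tags: [parameters]` cell option and the `-P name:value` syntax come from Quarto itself.

```python
#| tags: [parameters]
# Default values used when no -P option is passed at render time.
# `year` and `continent` are hypothetical parameter names for illustration.
year = 2007
continent = "Europe"
```

With a cell like this in place, a command such as `quarto render dashboard.qmd -P year:1997 -P continent:Asia` (the file name is also an assumption) would render a variation with the overridden values, while a plain `quarto render` keeps the defaults.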
https://quarto.org/docs/dashboards/interactivity/shiny-python/
For interactive exploration, some dashboards can benefit from a live Python backend
To do this with Quarto Dashboards, add interactive Shiny components
Note that this requires a server for deployment
---
title: "Penguin Bills"
format: dashboard
server: shiny
---
```{python}
import seaborn as sns
penguins = sns.load_dataset("penguins")
```
## {.sidebar}
```{python}
from shiny import render, ui
ui.input_select("x", "Variable:",
                choices=["bill_length_mm", "bill_depth_mm"])
ui.input_select("dist", "Distribution:", choices=["hist", "kde"])
ui.input_checkbox("rug", "Show rug marks", value = False)
```
## Column
```{python}
@render.plot
def displot():
    sns.displot(
        data=penguins, hue="species", multiple="stack",
        x=input.x(), rug=input.rug(), kind=input.dist())
```
Shiny for Python applications are built on Starlette and ASGI, and can be deployed in server environments that support WebSockets and sticky sessions.
On-Prem
Alternatively, deploy serverless using Pyodide. See the Retirement Simulation example for details.
Create a Dashboard with the following widgets:
Remember to use in your YAML:
Huge thanks to the following people who have generated and shared most of the content of this lecture:
Mine Çetinkaya-Rundel: Build-a-Dashboard Workshop
J.J. Allaire: Dashboards with Jupyter and Quarto
Thanks for your attention and don’t hesitate to ask if you have any questions!
@damien_dupre
@damien-dupre
https://damien-dupre.github.io
damien.dupre@dcu.ie