diff --git a/docs/applications/design_space_exploration.ipynb b/docs/applications/design_space_exploration.ipynb
index f60cb0ed..fa59e488 100644
--- a/docs/applications/design_space_exploration.ipynb
+++ b/docs/applications/design_space_exploration.ipynb
@@ -1,4392 +1,4702 @@
{
- "cells": [
- {
- "cell_type": "markdown",
- "id": "f5bc5fd7-33fc-472c-b59b-fb279a320d00",
- "metadata": {
- "tags": []
- },
- "source": [
- "# Using Draco for Visualization Design Space Exploration\n",
- "To help verify, debug, and tune the recommendation results, we provide general [guidelines](https://dig.cmu.edu/draco2/applications/debug_draco.html#). \n",
- "We apply the guidelines and features in the following demonstration. \n",
- "\n",
- "In this example we will use Draco to explore the visualization design space for the Seattle weather dataset.\n",
- "Starting with nothing but a raw dataset, we are going to use the reusable building blocks that Draco provides to generate a wide space\n",
- "of recommendations, and we will investigate the produced designs using the debugger module."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 1,
- "id": "e010c7ae-4d1f-4783-9b83-9f5414f392de",
- "metadata": {
- "ExecuteTime": {
- "end_time": "2023-04-30T12:19:39.865774Z",
- "start_time": "2023-04-30T12:19:39.830896Z"
- },
- "tags": []
- },
- "outputs": [],
- "source": [
- "# Suppressing warnings raised by altair in the background\n",
- "# (iteration-related deprecation warnings)\n",
- "import warnings\n",
- "\n",
- "warnings.filterwarnings(\"ignore\")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 2,
- "id": "137594a6-c856-441f-82ff-78250cf9e74f",
- "metadata": {
- "ExecuteTime": {
- "end_time": "2023-04-30T12:19:39.866209Z",
- "start_time": "2023-04-30T12:19:39.834772Z"
- },
- "tags": []
- },
- "outputs": [],
- "source": [
- "# Display utilities\n",
- "from pprint import pprint\n",
- "from IPython.display import display, Markdown"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "fde40b06",
- "metadata": {},
- "source": [
- "## Loading the Data\n",
- "\n",
- "We will use the Seattle weather dataset from the [Vega Datasets](https://vega.github.io/vega-datasets/) for this example."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 3,
- "id": "95789170-a7d4-4924-ae23-3736e03ea006",
- "metadata": {
- "ExecuteTime": {
- "end_time": "2023-04-30T12:19:39.897305Z",
- "start_time": "2023-04-30T12:19:39.837783Z"
- },
- "tags": []
- },
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- "  <thead>\n",
- "    <tr style=\"text-align: right;\">\n",
- "      <th></th>\n",
- "      <th>date</th>\n",
- "      <th>precipitation</th>\n",
- "      <th>temp_max</th>\n",
- "      <th>temp_min</th>\n",
- "      <th>wind</th>\n",
- "      <th>weather</th>\n",
- "    </tr>\n",
- "  </thead>\n",
- "  <tbody>\n",
- "    <tr>\n",
- "      <th>0</th>\n",
- "      <td>2012-01-01</td>\n",
- "      <td>0.0</td>\n",
- "      <td>12.8</td>\n",
- "      <td>5.0</td>\n",
- "      <td>4.7</td>\n",
- "      <td>drizzle</td>\n",
- "    </tr>\n",
- "    <tr>\n",
- "      <th>1</th>\n",
- "      <td>2012-01-02</td>\n",
- "      <td>10.9</td>\n",
- "      <td>10.6</td>\n",
- "      <td>2.8</td>\n",
- "      <td>4.5</td>\n",
- "      <td>rain</td>\n",
- "    </tr>\n",
- "    <tr>\n",
- "      <th>2</th>\n",
- "      <td>2012-01-03</td>\n",
- "      <td>0.8</td>\n",
- "      <td>11.7</td>\n",
- "      <td>7.2</td>\n",
- "      <td>2.3</td>\n",
- "      <td>rain</td>\n",
- "    </tr>\n",
- "    <tr>\n",
- "      <th>3</th>\n",
- "      <td>2012-01-04</td>\n",
- "      <td>20.3</td>\n",
- "      <td>12.2</td>\n",
- "      <td>5.6</td>\n",
- "      <td>4.7</td>\n",
- "      <td>rain</td>\n",
- "    </tr>\n",
- "    <tr>\n",
- "      <th>4</th>\n",
- "      <td>2012-01-05</td>\n",
- "      <td>1.3</td>\n",
- "      <td>8.9</td>\n",
- "      <td>2.8</td>\n",
- "      <td>6.1</td>\n",
- "      <td>rain</td>\n",
- "    </tr>\n",
- "  </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " date precipitation temp_max temp_min wind weather\n",
- "0 2012-01-01 0.0 12.8 5.0 4.7 drizzle\n",
- "1 2012-01-02 10.9 10.6 2.8 4.5 rain\n",
- "2 2012-01-03 0.8 11.7 7.2 2.3 rain\n",
- "3 2012-01-04 20.3 12.2 5.6 4.7 rain\n",
- "4 2012-01-05 1.3 8.9 2.8 6.1 rain"
- ]
- },
- "execution_count": 3,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "import draco as drc\n",
- "import pandas as pd\n",
- "from vega_datasets import data as vega_data\n",
- "import altair as alt\n",
- "\n",
- "# Loading data to be explored\n",
- "df: pd.DataFrame = vega_data.seattle_weather()\n",
- "df.head()"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "b9335743",
- "metadata": {},
- "source": [
- "We can use the `schema_from_dataframe` function to generate the schema of the dataset, including the data types of each column and their statistical properties."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 4,
- "id": "8e0d10d9-1d3b-463d-97bb-9dae981930a5",
- "metadata": {
- "ExecuteTime": {
- "end_time": "2023-04-30T12:19:39.898967Z",
- "start_time": "2023-04-30T12:19:39.847058Z"
- },
- "tags": []
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "{'field': [{'entropy': 7287,\n",
- " 'name': 'date',\n",
- " 'type': 'datetime',\n",
- " 'unique': 1461},\n",
- " {'entropy': 2422,\n",
- " 'max': 55,\n",
- " 'min': 0,\n",
- " 'name': 'precipitation',\n",
- " 'std': 6,\n",
- " 'type': 'number',\n",
- " 'unique': 111},\n",
- " {'entropy': 3934,\n",
- " 'max': 35,\n",
- " 'min': -1,\n",
- " 'name': 'temp_max',\n",
- " 'std': 7,\n",
- " 'type': 'number',\n",
- " 'unique': 67},\n",
- " {'entropy': 3596,\n",
- " 'max': 18,\n",
- " 'min': -7,\n",
- " 'name': 'temp_min',\n",
- " 'std': 5,\n",
- " 'type': 'number',\n",
- " 'unique': 55},\n",
- " {'entropy': 3950,\n",
- " 'max': 9,\n",
- " 'min': 0,\n",
- " 'name': 'wind',\n",
- " 'std': 1,\n",
- " 'type': 'number',\n",
- " 'unique': 79},\n",
- " {'entropy': 1201,\n",
- " 'freq': 714,\n",
- " 'name': 'weather',\n",
- " 'type': 'string',\n",
- " 'unique': 5}],\n",
- " 'number_rows': 1461}\n"
- ]
- }
- ],
- "source": [
- "data_schema = drc.schema_from_dataframe(df)\n",
- "pprint(data_schema)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "c72d7d02",
- "metadata": {},
- "source": [
- "We transform the data schema into a set of facts that Draco can use to reason about the data when generating recommendations. To do so, we use the `dict_to_facts` function, which takes a dictionary and returns a list of facts.\n",
- "The output list encodes the same information as the input dictionary; it is just a different representation that can be fed into [Clingo](https://potassco.org/clingo/) under the hood."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 5,
- "id": "f004d684-9ba5-4f73-9383-34a3007ba6c0",
- "metadata": {
- "ExecuteTime": {
- "end_time": "2023-04-30T12:19:39.899337Z",
- "start_time": "2023-04-30T12:19:39.854958Z"
- },
- "tags": []
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "['attribute(number_rows,root,1461).',\n",
- " 'entity(field,root,0).',\n",
- " 'attribute((field,name),0,date).',\n",
- " 'attribute((field,type),0,datetime).',\n",
- " 'attribute((field,unique),0,1461).',\n",
- " 'attribute((field,entropy),0,7287).',\n",
- " 'entity(field,root,1).',\n",
- " 'attribute((field,name),1,precipitation).',\n",
- " 'attribute((field,type),1,number).',\n",
- " 'attribute((field,unique),1,111).',\n",
- " 'attribute((field,entropy),1,2422).',\n",
- " 'attribute((field,min),1,0).',\n",
- " 'attribute((field,max),1,55).',\n",
- " 'attribute((field,std),1,6).',\n",
- " 'entity(field,root,2).',\n",
- " 'attribute((field,name),2,temp_max).',\n",
- " 'attribute((field,type),2,number).',\n",
- " 'attribute((field,unique),2,67).',\n",
- " 'attribute((field,entropy),2,3934).',\n",
- " 'attribute((field,min),2,-1).',\n",
- " 'attribute((field,max),2,35).',\n",
- " 'attribute((field,std),2,7).',\n",
- " 'entity(field,root,3).',\n",
- " 'attribute((field,name),3,temp_min).',\n",
- " 'attribute((field,type),3,number).',\n",
- " 'attribute((field,unique),3,55).',\n",
- " 'attribute((field,entropy),3,3596).',\n",
- " 'attribute((field,min),3,-7).',\n",
- " 'attribute((field,max),3,18).',\n",
- " 'attribute((field,std),3,5).',\n",
- " 'entity(field,root,4).',\n",
- " 'attribute((field,name),4,wind).',\n",
- " 'attribute((field,type),4,number).',\n",
- " 'attribute((field,unique),4,79).',\n",
- " 'attribute((field,entropy),4,3950).',\n",
- " 'attribute((field,min),4,0).',\n",
- " 'attribute((field,max),4,9).',\n",
- " 'attribute((field,std),4,1).',\n",
- " 'entity(field,root,5).',\n",
- " 'attribute((field,name),5,weather).',\n",
- " 'attribute((field,type),5,string).',\n",
- " 'attribute((field,unique),5,5).',\n",
- " 'attribute((field,entropy),5,1201).',\n",
- " 'attribute((field,freq),5,714).']\n"
- ]
- }
- ],
- "source": [
- "data_schema_facts = drc.dict_to_facts(data_schema)\n",
- "pprint(data_schema_facts)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "bbf3a2a6",
- "metadata": {},
- "source": [
- "## Iterating the partial specification query\n",
- "\n",
- "> Generating recommendations from a minimal input\n",
- "\n",
- "We start by defining `input_spec_base`, a list of facts that includes the data schema, a single view, and a single mark.\n",
- "This is the minimal set of facts that Draco needs to generate recommendations which can be rendered into charts.\n",
- "\n",
- "We instantiate a `Draco` object, using the default knowledge base, and an `AltairRenderer` object which will be used to render the recommendations into Vega-Lite charts."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 6,
- "id": "0c3ab0c4-095d-48ba-b2f5-729a0a5de316",
- "metadata": {
- "ExecuteTime": {
- "end_time": "2023-04-30T12:19:39.899709Z",
- "start_time": "2023-04-30T12:19:39.857856Z"
- },
- "tags": []
- },
- "outputs": [],
- "source": [
- "from draco.renderer import AltairRenderer\n",
- "\n",
- "input_spec_base = data_schema_facts + [\n",
- " \"entity(view,root,v0).\",\n",
- " \"entity(mark,v0,m0).\",\n",
- "]\n",
- "d = drc.Draco()\n",
- "renderer = AltairRenderer()"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "1a3c4d59",
- "metadata": {},
- "source": [
- "We can now use the `complete_spec` method of the `Draco` object to generate recommendations from incomplete specifications.\n",
- "The function below is a reusable utility for this example, responsible for generating, rendering and displaying the recommendations."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 7,
- "id": "da55ac50-c3f4-4064-ac0d-ce09ee6e7d7e",
- "metadata": {
- "ExecuteTime": {
- "end_time": "2023-04-30T12:19:39.899750Z",
- "start_time": "2023-04-30T12:19:39.861482Z"
- },
- "tags": []
- },
- "outputs": [],
- "source": [
- "def recommend_charts(\n",
- "    spec: list[str], draco: drc.Draco, num: int = 5, labeler=lambda i: f\"CHART {i+1}\"\n",
- ") -> dict[str, list[str]]:\n",
- "    # Dictionary to store the generated recommendations (as lists of facts), keyed by chart name\n",
- "    chart_specs = {}\n",
- "    for i, model in enumerate(draco.complete_spec(spec, num)):\n",
- "        chart_name = labeler(i)\n",
- "        spec = drc.answer_set_to_dict(model.answer_set)\n",
- "        chart_specs[chart_name] = drc.dict_to_facts(spec)\n",
- "\n",
- "        print(chart_name)\n",
- "        print(f\"COST: {model.cost}\")\n",
- "        chart = renderer.render(spec=spec, data=df)\n",
- "        # Adjust column-faceted chart size\n",
- "        if (\n",
- "            isinstance(chart, alt.FacetChart)\n",
- "            and chart.facet.column is not alt.Undefined\n",
- "        ):\n",
- "            chart = chart.configure_view(continuousWidth=130, continuousHeight=130)\n",
- "        display(chart)\n",
- "\n",
- "    return chart_specs"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "dd4301df",
- "metadata": {},
- "source": [
- "We are using `input_spec_base` as the starting point for our exploration, that is, we are only specifying the data schema, and that we want the recommendations to have at least one view and one mark."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 8,
- "id": "068fc36e-7910-42f6-9f61-f9cc9b31ba9c",
- "metadata": {
- "ExecuteTime": {
- "end_time": "2023-04-30T12:19:40.052837Z",
- "start_time": "2023-04-30T12:19:39.864028Z"
- },
- "tags": []
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 1\n",
- "COST: [3]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.Chart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 2\n",
- "COST: [4]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.Chart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 3\n",
- "COST: [4]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.FacetChart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 4\n",
- "COST: [4]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.Chart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 5\n",
- "COST: [5]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.FacetChart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "input_spec = input_spec_base\n",
- "recommend_charts(spec=input_spec, draco=d);"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "8ec5f8ca",
- "metadata": {},
- "source": [
- "While the above recommendations are valid, they are not very diverse. We can extend the input specification to better specify the design space we want to see recommendations for.\n",
- "Let's say we want the fields `date` and `temp_max` of the weather dataset to be encoded in the charts.\n",
- "We also specify that we want a column-faceted chart.\n",
- "Note that we specify neither the mark type, nor the encoding channels for the two fields, nor the field to facet by. We leave these decisions to Draco, based on its underlying knowledge base."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 9,
- "id": "c0e76b39-31ec-459f-ba56-ee06ed080011",
- "metadata": {
- "ExecuteTime": {
- "end_time": "2023-04-30T12:19:40.267472Z",
- "start_time": "2023-04-30T12:19:40.054062Z"
- },
- "tags": []
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 1\n",
- "COST: [16]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.FacetChart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 2\n",
- "COST: [16]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.FacetChart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 3\n",
- "COST: [17]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.FacetChart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 4\n",
- "COST: [17]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.FacetChart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 5\n",
- "COST: [17]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.FacetChart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "input_spec = input_spec_base + [\n",
- " # We want to encode the `date` field\n",
- " \"entity(encoding,m0,e0).\",\n",
- " \"attribute((encoding,field),e0,date).\",\n",
- " # We want to encode the `temp_max` field\n",
- " \"entity(encoding,m0,e1).\",\n",
- " \"attribute((encoding,field),e1,temp_max).\",\n",
- " # We want the chart to be a faceted chart\n",
- " \"entity(facet,v0,f0).\",\n",
- " \"attribute((facet,channel),f0,col).\",\n",
- "]\n",
- "recommendations = recommend_charts(spec=input_spec, draco=d, num=5)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "f468eaad",
- "metadata": {},
- "source": [
- "## Inspecting the Knowledge Base\n",
- "\n",
- "> Debugging the recommendations\n",
- "\n",
- "We can use the `DracoDebug` class to investigate the recommendations generated by Draco and whether they violate any of the soft constraints.\n",
- "We start by instantiating a `DracoDebug` object, passing the recommendations and the `Draco` object used to generate them.\n",
- "Its `chart_preferences` attribute is a `DataFrame` that lists, for each recommendation, every soft constraint together with how many times the recommendation violates it (`count`) and the weight associated with the constraint."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 10,
- "id": "84b3493b",
- "metadata": {
- "ExecuteTime": {
- "end_time": "2023-04-30T12:19:40.271155Z",
- "start_time": "2023-04-30T12:19:40.269429Z"
- },
- "collapsed": false,
- "jupyter": {
- "outputs_hidden": false
- }
- },
- "outputs": [],
- "source": [
- "# Parameterized helper to avoid code duplication as we iterate on designs\n",
- "def display_debug_data(draco: drc.Draco, specs: dict[str, list[str]]):\n",
- "    debugger = drc.DracoDebug(specs=specs, draco=draco)\n",
- "    chart_preferences = debugger.chart_preferences\n",
- "    display(Markdown(\"**Raw debug data**\"))\n",
- "    display(chart_preferences.head())\n",
- "\n",
- "    display(Markdown(\"**Number of violated preferences**\"))\n",
- "    num_violations = len(\n",
- "        set(chart_preferences[chart_preferences[\"count\"] != 0][\"pref_name\"])\n",
- "    )\n",
- "    num_all = len(set(chart_preferences[\"pref_name\"]))\n",
- "    display(\n",
- "        Markdown(\n",
- "            f\"*{num_violations} preferences are violated out of a total of {num_all} preferences (soft constraints)*\"\n",
- "        )\n",
- "    )\n",
- "\n",
- "    display(\n",
- "        Markdown(\n",
- "            \"Using `DracoDebugPlotter` to visualize the debug `DataFrame` produced by `DracoDebug`:\"\n",
- "        )\n",
- "    )\n",
- "    plotter = drc.DracoDebugPlotter(chart_preferences)\n",
- "    plot_size = (600, 300)\n",
- "    chart = plotter.create_chart(\n",
- "        cfg=drc.DracoDebugChartConfig.SORT_BY_COUNT_SUM,\n",
- "        violated_prefs_only=True,\n",
- "        plot_size=plot_size,\n",
- "    )\n",
- "    display(chart)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 11,
- "id": "a3bb3393",
- "metadata": {
- "ExecuteTime": {
- "end_time": "2023-04-30T12:19:40.444672Z",
- "start_time": "2023-04-30T12:19:40.271857Z"
- },
- "collapsed": false,
- "jupyter": {
- "outputs_hidden": false
- }
- },
- "outputs": [
- {
- "data": {
- "text/markdown": [
- "**Raw debug data**"
- ],
- "text/plain": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- "  <thead>\n",
- "    <tr style=\"text-align: right;\">\n",
- "      <th></th>\n",
- "      <th>chart_name</th>\n",
- "      <th>pref_name</th>\n",
- "      <th>pref_description</th>\n",
- "      <th>count</th>\n",
- "      <th>weight</th>\n",
- "    </tr>\n",
- "  </thead>\n",
- "  <tbody>\n",
- "    <tr>\n",
- "      <th>0</th>\n",
- "      <td>CHART 1</td>\n",
- "      <td>cartesian_coordinate</td>\n",
- "      <td>Cartesian coordinates.</td>\n",
- "      <td>1</td>\n",
- "      <td>0</td>\n",
- "    </tr>\n",
- "    <tr>\n",
- "      <th>1</th>\n",
- "      <td>CHART 1</td>\n",
- "      <td>summary_point</td>\n",
- "      <td>Point mark for summary tasks.</td>\n",
- "      <td>1</td>\n",
- "      <td>0</td>\n",
- "    </tr>\n",
- "    <tr>\n",
- "      <th>2</th>\n",
- "      <td>CHART 1</td>\n",
- "      <td>linear_y</td>\n",
- "      <td>Linear scale with y channel.</td>\n",
- "      <td>1</td>\n",
- "      <td>0</td>\n",
- "    </tr>\n",
- "    <tr>\n",
- "      <th>3</th>\n",
- "      <td>CHART 1</td>\n",
- "      <td>linear_x</td>\n",
- "      <td>Linear scale with x channel.</td>\n",
- "      <td>1</td>\n",
- "      <td>0</td>\n",
- "    </tr>\n",
- "    <tr>\n",
- "      <th>4</th>\n",
- "      <td>CHART 1</td>\n",
- "      <td>c_c_point</td>\n",
- "      <td>Continuous by continuous for point mark.</td>\n",
- "      <td>1</td>\n",
- "      <td>0</td>\n",
- "    </tr>\n",
- "  </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " chart_name pref_name pref_description \\\n",
- "0 CHART 1 cartesian_coordinate Cartesian coordinates. \n",
- "1 CHART 1 summary_point Point mark for summary tasks. \n",
- "2 CHART 1 linear_y Linear scale with y channel. \n",
- "3 CHART 1 linear_x Linear scale with x channel. \n",
- "4 CHART 1 c_c_point Continuous by continuous for point mark. \n",
- "\n",
- " count weight \n",
- "0 1 0 \n",
- "1 1 0 \n",
- "2 1 0 \n",
- "3 1 0 \n",
- "4 1 0 "
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "text/markdown": [
- "**Number of violated preferences**"
- ],
- "text/plain": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "text/markdown": [
- "*18 preferences are violated out of a total of 147 preferences (soft constraints)*"
- ],
- "text/plain": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "text/markdown": [
- "Using `DracoDebugPlotter` to visualize the debug `DataFrame` produced by `DracoDebug`:"
- ],
- "text/plain": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.VConcatChart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "display_debug_data(draco=d, specs=recommendations)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "89e4bb3d",
- "metadata": {},
- "source": [
- "## Generating Input Specifications Programmatically\n",
- "\n",
- "> Exploring more possibilities within the design space\n",
- "\n",
- "To get a better impression of the space of possible visualizations and to produce examples that might be covered by more soft constraints, we can programmatically generate further input specifications.\n",
- "We define a list of possible values for the mark type, fields and encoding channels that we want to be used in the recommendations and combine them using a nested list comprehension.\n",
- "We also filter out designs with fewer than 3 encodings and exclude multi-layer designs for now."
- ]
- },
- {
- "cell_type": "markdown",
- "id": "9be1c816-4892-4fba-a524-3ff5d54e3aef",
- "metadata": {},
- "source": [
- "We set off by creating the helper function `rec_from_generated_spec` to avoid code duplication as we iterate on designs."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 12,
- "id": "b4ec26e7",
- "metadata": {
- "ExecuteTime": {
- "end_time": "2023-04-30T12:19:40.445020Z",
- "start_time": "2023-04-30T12:19:40.439019Z"
- },
- "collapsed": false,
- "jupyter": {
- "outputs_hidden": false
- }
- },
- "outputs": [],
- "source": [
- "def rec_from_generated_spec(\n",
- "    marks: list[str],\n",
- "    fields: list[str],\n",
- "    encoding_channels: list[str],\n",
- "    draco: drc.Draco,\n",
- "    num: int = 1,\n",
- ") -> dict[str, list[str]]:\n",
- "    input_specs = [\n",
- "        (\n",
- "            (mark, field, enc_ch),\n",
- "            input_spec_base\n",
- "            + [\n",
- "                f\"attribute((mark,type),m0,{mark}).\",\n",
- "                \"entity(encoding,m0,e0).\",\n",
- "                f\"attribute((encoding,field),e0,{field}).\",\n",
- "                f\"attribute((encoding,channel),e0,{enc_ch}).\",\n",
- "                # filter out designs with fewer than 3 encodings\n",
- "                \":- {entity(encoding,_,_)} < 3.\",\n",
- "                # exclude multi-layer designs\n",
- "                \":- {entity(mark,_,_)} != 1.\",\n",
- "            ],\n",
- "        )\n",
- "        for mark in marks\n",
- "        for field in fields\n",
- "        for enc_ch in encoding_channels\n",
- "    ]\n",
- "    recs = {}\n",
- "    for cfg, spec in input_specs:\n",
- "        labeler = lambda i: f\"CHART {i + 1} ({' | '.join(cfg)})\"\n",
- "        recs = recs | recommend_charts(spec=spec, draco=draco, num=num, labeler=labeler)\n",
- "\n",
- "    return recs"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 13,
- "id": "8120a84a-d3f5-48fa-9c6e-abfe91f3d050",
- "metadata": {
- "ExecuteTime": {
- "end_time": "2023-04-30T12:19:42.272484Z",
- "start_time": "2023-04-30T12:19:40.439246Z"
- },
- "tags": []
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 1 (point | weather | color)\n",
- "COST: [25]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.Chart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 1 (point | weather | shape)\n",
- "COST: [28]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.Chart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 1 (point | weather | size)\n",
- "COST: [30]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.Chart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 1 (point | temp_min | color)\n",
- "COST: [27]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.Chart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 1 (point | temp_min | shape)\n",
- "COST: [41]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.Chart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 1 (point | date | color)\n",
- "COST: [28]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.Chart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 1 (point | date | shape)\n",
- "COST: [42]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.Chart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 1 (point | date | size)\n",
- "COST: [19]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.Chart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 1 (bar | weather | color)\n",
- "COST: [25]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.Chart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 1 (bar | temp_min | color)\n",
- "COST: [27]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.Chart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 1 (bar | date | color)\n",
- "COST: [28]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.Chart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 1 (line | weather | color)\n",
- "COST: [45]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.Chart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 1 (line | temp_min | color)\n",
- "COST: [47]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.Chart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 1 (line | date | color)\n",
- "COST: [48]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.Chart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 1 (rect | weather | color)\n",
- "COST: [71]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.FacetChart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 1 (rect | temp_min | color)\n",
- "COST: [39]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.Chart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 1 (rect | date | color)\n",
- "COST: [40]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.FacetChart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "recommendations = rec_from_generated_spec(\n",
- " marks=[\"point\", \"bar\", \"line\", \"rect\"],\n",
- " fields=[\"weather\", \"temp_min\", \"date\"],\n",
- " encoding_channels=[\"color\", \"shape\", \"size\"],\n",
- " draco=d,\n",
- ")"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "5644a643",
- "metadata": {},
- "source": [
- "It is no secret that some of the above recommendations are not very useful when it comes to communicating the data. Nevertheless, they are valid visualizations from the space of possibilities. Following the already introduced workflow, we can use `DracoDebug` to investigate the soft constraint violations of the generated recommendations. If there are recommendations we are not happy with, we can extend the knowledge base to cover them so that they do not appear in the future."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 14,
- "id": "4179f1d1-8b97-48a0-9400-3c6798f686d4",
- "metadata": {
- "ExecuteTime": {
- "end_time": "2023-04-30T12:19:42.665603Z",
- "start_time": "2023-04-30T12:19:42.274931Z"
- },
- "tags": []
- },
- "outputs": [
- {
- "data": {
- "text/markdown": [
- "**Raw debug data**"
- ],
- "text/plain": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- "  <thead>\n",
- "    <tr style=\"text-align: right;\">\n",
- "      <th></th>\n",
- "      <th>chart_name</th>\n",
- "      <th>pref_name</th>\n",
- "      <th>pref_description</th>\n",
- "      <th>count</th>\n",
- "      <th>weight</th>\n",
- "    </tr>\n",
- "  </thead>\n",
- "  <tbody>\n",
- "    <tr>\n",
- "      <th>0</th>\n",
- "      <td>CHART 1 (point | weather | color)</td>\n",
- "      <td>cartesian_coordinate</td>\n",
- "      <td>Cartesian coordinates.</td>\n",
- "      <td>1</td>\n",
- "      <td>0</td>\n",
- "    </tr>\n",
- "    <tr>\n",
- "      <th>1</th>\n",
- "      <td>CHART 1 (point | weather | color)</td>\n",
- "      <td>summary_point</td>\n",
- "      <td>Point mark for summary tasks.</td>\n",
- "      <td>1</td>\n",
- "      <td>0</td>\n",
- "    </tr>\n",
- "    <tr>\n",
- "      <th>2</th>\n",
- "      <td>CHART 1 (point | weather | color)</td>\n",
- "      <td>aggregate_mean</td>\n",
- "      <td>Mean as aggregate op.</td>\n",
- "      <td>1</td>\n",
- "      <td>1</td>\n",
- "    </tr>\n",
- "    <tr>\n",
- "      <th>3</th>\n",
- "      <td>CHART 1 (point | weather | color)</td>\n",
- "      <td>aggregate_count</td>\n",
- "      <td>Count as aggregate op.</td>\n",
- "      <td>1</td>\n",
- "      <td>0</td>\n",
- "    </tr>\n",
- "    <tr>\n",
- "      <th>4</th>\n",
- "      <td>CHART 1 (point | weather | color)</td>\n",
- "      <td>ordinal_color</td>\n",
- "      <td>Ordinal scale with color channel.</td>\n",
- "      <td>1</td>\n",
- "      <td>8</td>\n",
- "    </tr>\n",
- "  </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " chart_name pref_name \\\n",
- "0 CHART 1 (point | weather | color) cartesian_coordinate \n",
- "1 CHART 1 (point | weather | color) summary_point \n",
- "2 CHART 1 (point | weather | color) aggregate_mean \n",
- "3 CHART 1 (point | weather | color) aggregate_count \n",
- "4 CHART 1 (point | weather | color) ordinal_color \n",
- "\n",
- " pref_description count weight \n",
- "0 Cartesian coordinates. 1 0 \n",
- "1 Point mark for summary tasks. 1 0 \n",
- "2 Mean as aggregate op. 1 1 \n",
- "3 Count as aggregate op. 1 0 \n",
- "4 Ordinal scale with color channel. 1 8 "
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "text/markdown": [
- "**Number of violated preferences**"
- ],
- "text/plain": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "text/markdown": [
- "*40 preferences are violated out of a total of 147 preferences (soft constraints)*"
- ],
- "text/plain": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "text/markdown": [
- "Using `DracoDebugPlotter` to visualize the debug `DataFrame` produced by `DracoDebug`:"
- ],
- "text/plain": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.VConcatChart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "display_debug_data(draco=d, specs=recommendations)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "ba981d6e-a9c2-4725-a5b4-91b799cdd9d7",
- "metadata": {},
- "source": [
- "## Adjusting the Knowledge Base\n",
- "\n",
- "> Filtering out suboptimal designs by creating a new soft constraint and tuning its weight\n",
- "\n",
- "As apparent from the above-generated recommendations, there are some visualizations that are valid but not as expressive as we would desire. As a concrete example, the recommendations `CHART 1 (rect | weather | color)` and `CHART 1 (rect | date | color)` used row-faceting and no rules in our knowledge base penalised them for doing so. \n",
- "\n",
- "We demonstrate how we can extend the knowledge base with a design rule (soft constraint) to discourage using faceting with `rect` mark and `color` encoding and how to tune its weight to achieve more desirable recommendations."
- ]
- },
- {
- "cell_type": "markdown",
- "id": "e89ef2a2",
- "metadata": {},
- "source": [
- "We start by creating the helper function `draco_with_updated_kb`, to return a `Draco` instance with the updated knowledge base.\n",
- "We extend the knowledge base with a new preference (soft constraint) called `rect_color_facet` to discourage\n",
- "faceting with rect mark and color encoding. To explore how the recommendations change as we assign different weights to the new soft constraint, we parameterize this function to accept `pref_weight` as an argument. This weight will be associated with the `rect_color_facet` preference we extend the knowledge base with."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 15,
- "id": "ad0c4c38",
- "metadata": {
- "ExecuteTime": {
- "end_time": "2023-04-30T12:19:42.665960Z",
- "start_time": "2023-04-30T12:19:42.650120Z"
- },
- "collapsed": false,
- "jupyter": {
- "outputs_hidden": false
- }
- },
- "outputs": [],
- "source": [
- "def draco_with_updated_kb(pref_weight: int) -> drc.Draco:\n",
- " # Custom soft constraint to discourage faceting with rect mark and color encoding\n",
- " rect_color_facet_pref = \"\"\"\n",
- " % @soft(rect_color_facet) Faceting with rect mark and color encoding.\n",
- " preference(rect_color_facet,Fa) :-\n",
- " attribute((mark,type),_,rect),\n",
- " attribute((encoding,channel),_,color),\n",
- " attribute((facet,channel),Fa,_).\n",
- " \"\"\".strip()\n",
- " rect_color_facet_pref_weight = pref_weight\n",
- "\n",
- " # Update the default soft constraint knowledge base (program)\n",
- " soft_updated = drc.Draco().soft + f\"\\n\\n{rect_color_facet_pref}\\n\\n\"\n",
- " # Assign the weight to the new soft constraint\n",
- " weights_updated = drc.Draco().weights | {\n",
- " \"rect_color_facet_weight\": rect_color_facet_pref_weight\n",
- " }\n",
- " return drc.Draco(soft=soft_updated, weights=weights_updated)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "8392bc7b",
- "metadata": {},
- "source": [
- "As opposed to the previous example, we only generate specifications for the `rect` mark, the `weather` and `date` fields and the `color` encoding channel, since we observed the undesired faceted recommendations for these configurations. We explore how the recommendations change as we assign a higher weight value (that is, a higher penalty) to the new soft constraint."
- ]
- },
- {
- "cell_type": "markdown",
- "id": "44943c2c-a5e3-4fb0-936f-be55618b5643",
- "metadata": {},
- "source": [
- "### Verifying That the Knowledge Base Got Updated"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "3e04c127-5405-4c97-8d62-bb6c3a398b86",
- "metadata": {},
- "source": [
- "First, to validate that the `rect_color_facet` soft constraint we created got registered properly to our knowledge base we start with a weight of `0`. We expect to obtain the same, faceted recommendations, but we also expect to see in the plot created in `display_debug_data` by `DracoDebugPlotter` that the faceted recommendations violate the design preference we defined."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 16,
- "id": "99399a60",
- "metadata": {
- "ExecuteTime": {
- "end_time": "2023-04-30T12:19:43.402066Z",
- "start_time": "2023-04-30T12:19:42.650386Z"
- },
- "collapsed": false,
- "jupyter": {
- "outputs_hidden": false
- },
- "tags": []
- },
- "outputs": [
- {
- "data": {
- "text/markdown": [
- "**Weight for `rect_color_facet` preference: 0**"
- ],
- "text/plain": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 1 (rect | weather | color)\n",
- "COST: [71]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.FacetChart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 1 (rect | date | color)\n",
- "COST: [40]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.FacetChart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "text/markdown": [
- "**Raw debug data**"
- ],
- "text/plain": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "
\n",
- " \n",
- " \n",
- " | \n",
- " chart_name | \n",
- " pref_name | \n",
- " pref_description | \n",
- " count | \n",
- " weight | \n",
- "
\n",
- " \n",
- " \n",
- " \n",
- " 0 | \n",
- " CHART 1 (rect | weather | color) | \n",
- " rect_color_facet | \n",
- " Faceting with rect mark and color encoding. | \n",
- " 1 | \n",
- " 0 | \n",
- "
\n",
- " \n",
- " 1 | \n",
- " CHART 1 (rect | weather | color) | \n",
- " cartesian_coordinate | \n",
- " Cartesian coordinates. | \n",
- " 1 | \n",
- " 0 | \n",
- "
\n",
- " \n",
- " 2 | \n",
- " CHART 1 (rect | weather | color) | \n",
- " summary_rect | \n",
- " Rect mark for summary tasks. | \n",
- " 1 | \n",
- " 0 | \n",
- "
\n",
- " \n",
- " 3 | \n",
- " CHART 1 (rect | weather | color) | \n",
- " ordinal_color | \n",
- " Ordinal scale with color channel. | \n",
- " 1 | \n",
- " 8 | \n",
- "
\n",
- " \n",
- " 4 | \n",
- " CHART 1 (rect | weather | color) | \n",
- " ordinal_y | \n",
- " Ordinal scale with y channel. | \n",
- " 1 | \n",
- " 0 | \n",
- "
\n",
- " \n",
- "
\n",
- "
"
- ],
- "text/plain": [
- " chart_name pref_name \\\n",
- "0 CHART 1 (rect | weather | color) rect_color_facet \n",
- "1 CHART 1 (rect | weather | color) cartesian_coordinate \n",
- "2 CHART 1 (rect | weather | color) summary_rect \n",
- "3 CHART 1 (rect | weather | color) ordinal_color \n",
- "4 CHART 1 (rect | weather | color) ordinal_y \n",
- "\n",
- " pref_description count weight \n",
- "0 Faceting with rect mark and color encoding. 1 0 \n",
- "1 Cartesian coordinates. 1 0 \n",
- "2 Rect mark for summary tasks. 1 0 \n",
- "3 Ordinal scale with color channel. 1 8 \n",
- "4 Ordinal scale with y channel. 1 0 "
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "text/markdown": [
- "**Number of violated preferences**"
- ],
- "text/plain": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "text/markdown": [
- "*20 preferences are violated out of a total of 148 preferences (soft constraints)*"
- ],
- "text/plain": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "text/markdown": [
- "Using `DracoDebugPlotter` to visualize the debug `DataFrame` produced by `DracoDebug`:"
- ],
- "text/plain": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.VConcatChart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "weight = 0\n",
- "display(Markdown(f\"**Weight for `rect_color_facet` preference: {weight}**\"))\n",
- "updated_draco = draco_with_updated_kb(pref_weight=weight)\n",
- "recommendations = rec_from_generated_spec(\n",
- " marks=[\"rect\"],\n",
- " fields=[\"weather\", \"date\"],\n",
- " encoding_channels=[\"color\"],\n",
- " draco=updated_draco,\n",
- ")\n",
- "display_debug_data(draco=updated_draco, specs=recommendations)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "b711ddf4-9722-4d18-b84b-35c0b1adeb6f",
- "metadata": {},
- "source": [
- "As expected, our debug plot indicates in the heatmap's 6th column that both `CHART 1 (rect | weather | color)` and `CHART 1 (rect | date | color)` violate the `rect_color_facet` preference we introduced. Now we can work on tuning the weight associated with this rule, so that we actually penalize the usage of faceting when we have a `rect` mark and `color` as the encoding channel. "
- ]
- },
- {
- "cell_type": "markdown",
- "id": "cca01ae7-e874-4092-8337-9e0d2bb30419",
- "metadata": {},
- "source": [
- "### Weight Tuning\n",
- "\n",
- "We increase the weight from `0` to `10` and by doing so we expect that this penalty will be sufficient for the Clingo solver to find a model with a lower cost, not violating the `rect_color_facet` design rule we extended our knowledge base with."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 17,
- "id": "3e200090-4991-4754-bdb0-badce0768908",
- "metadata": {
- "tags": []
- },
- "outputs": [
- {
- "data": {
- "text/markdown": [
- "**Weight for `rect_color_facet` preference: 10**"
- ],
- "text/plain": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 1 (rect | weather | color)\n",
- "COST: [73]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.Chart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "CHART 1 (rect | date | color)\n",
- "COST: [42]\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.Chart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "text/markdown": [
- "**Raw debug data**"
- ],
- "text/plain": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "
\n",
- " \n",
- " \n",
- " | \n",
- " chart_name | \n",
- " pref_name | \n",
- " pref_description | \n",
- " count | \n",
- " weight | \n",
- "
\n",
- " \n",
- " \n",
- " \n",
- " 0 | \n",
- " CHART 1 (rect | weather | color) | \n",
- " cartesian_coordinate | \n",
- " Cartesian coordinates. | \n",
- " 1 | \n",
- " 0 | \n",
- "
\n",
- " \n",
- " 1 | \n",
- " CHART 1 (rect | weather | color) | \n",
- " summary_rect | \n",
- " Rect mark for summary tasks. | \n",
- " 1 | \n",
- " 0 | \n",
- "
\n",
- " \n",
- " 2 | \n",
- " CHART 1 (rect | weather | color) | \n",
- " aggregate_median | \n",
- " Median as aggregate op. | \n",
- " 1 | \n",
- " 3 | \n",
- "
\n",
- " \n",
- " 3 | \n",
- " CHART 1 (rect | weather | color) | \n",
- " ordinal_color | \n",
- " Ordinal scale with color channel. | \n",
- " 1 | \n",
- " 8 | \n",
- "
\n",
- " \n",
- " 4 | \n",
- " CHART 1 (rect | weather | color) | \n",
- " ordinal_y | \n",
- " Ordinal scale with y channel. | \n",
- " 1 | \n",
- " 0 | \n",
- "
\n",
- " \n",
- "
\n",
- "
"
- ],
- "text/plain": [
- " chart_name pref_name \\\n",
- "0 CHART 1 (rect | weather | color) cartesian_coordinate \n",
- "1 CHART 1 (rect | weather | color) summary_rect \n",
- "2 CHART 1 (rect | weather | color) aggregate_median \n",
- "3 CHART 1 (rect | weather | color) ordinal_color \n",
- "4 CHART 1 (rect | weather | color) ordinal_y \n",
- "\n",
- " pref_description count weight \n",
- "0 Cartesian coordinates. 1 0 \n",
- "1 Rect mark for summary tasks. 1 0 \n",
- "2 Median as aggregate op. 1 3 \n",
- "3 Ordinal scale with color channel. 1 8 \n",
- "4 Ordinal scale with y channel. 1 0 "
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "text/markdown": [
- "**Number of violated preferences**"
- ],
- "text/plain": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "text/markdown": [
- "*18 preferences are violated out of a total of 148 preferences (soft constraints)*"
- ],
- "text/plain": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "text/markdown": [
- "Using `DracoDebugPlotter` to visualize the debug `DataFrame` produced by `DracoDebug`:"
- ],
- "text/plain": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- ""
- ],
- "text/plain": [
- "alt.VConcatChart(...)"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "weight = 10\n",
- "display(Markdown(f\"**Weight for `rect_color_facet` preference: {weight}**\"))\n",
- "updated_draco = draco_with_updated_kb(pref_weight=weight)\n",
- "recommendations = rec_from_generated_spec(\n",
- " marks=[\"rect\"],\n",
- " fields=[\"weather\", \"date\"],\n",
- " encoding_channels=[\"color\"],\n",
- " draco=updated_draco,\n",
- ")\n",
- "display_debug_data(draco=updated_draco, specs=recommendations)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "5f153e84-7754-43f9-a010-775cb6e1f048",
- "metadata": {},
- "source": [
- "Just as expected, thanks to the higher weight assigned to the newly added `rect_color_facet` rule, we don't see recommendations using faceting when a `rect` mark and `color` encoding is used. One can use the very same process to tailor the knowledge base and fine-tune the constraint weights to obtain more expressive visualization recommendations."
- ]
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "Python 3 (ipykernel)",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.10.0"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 5
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "f5bc5fd7-33fc-472c-b59b-fb279a320d00",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "# Using Draco for Visualization Design Space Exploration\n",
+ "To help verify, debug, and tune the recommendation results, we provide general [guidelines](https://dig.cmu.edu/draco2/applications/debug_draco.html#). \n",
+ "We apply the guidelines and features in the following demonstration. \n",
+ "\n",
+ "In this example we will use Draco to explore the visualization design space for the Seattle weather dataset.\n",
+ "Starting with nothing but a raw dataset, we are going to use the reusable building blocks that Draco provides to generate a wide space\n",
+ "of recommendations, and we will investigate the produced designs using the debugger module."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "id": "e010c7ae-4d1f-4783-9b83-9f5414f392de",
+ "metadata": {
+ "ExecuteTime": {
+ "end_time": "2023-08-04T07:31:46.736324Z",
+ "start_time": "2023-08-04T07:31:46.627931Z"
+ },
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# Suppressing warnings raised by altair in the background\n",
+ "# (iteration-related deprecation warnings)\n",
+ "import warnings\n",
+ "\n",
+ "warnings.filterwarnings(\"ignore\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "id": "137594a6-c856-441f-82ff-78250cf9e74f",
+ "metadata": {
+ "ExecuteTime": {
+ "end_time": "2023-08-04T07:31:46.745850Z",
+ "start_time": "2023-08-04T07:31:46.632050Z"
+ },
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# Display utilities\n",
+ "from IPython.display import display, Markdown\n",
+ "import json\n",
+ "import numpy as np\n",
+ "\n",
+ "# Handles serialization of common numpy datatypes\n",
+ "class NpEncoder(json.JSONEncoder):\n",
+ " def default(self, obj):\n",
+ " if isinstance(obj, np.integer):\n",
+ " return int(obj)\n",
+ " elif isinstance(obj, np.floating):\n",
+ " return float(obj)\n",
+ " elif isinstance(obj, np.ndarray):\n",
+ " return obj.tolist()\n",
+ " else:\n",
+ " return super(NpEncoder, self).default(obj)\n",
+ "\n",
+ "\n",
+ "def md(markdown: str):\n",
+ " display(Markdown(markdown))\n",
+ "\n",
+ "\n",
+ "def pprint(obj):\n",
+ " md(f\"```json\\n{json.dumps(obj, indent=2, cls=NpEncoder)}\\n```\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "fde40b06",
+ "metadata": {},
+ "source": [
+ "## Loading the Data\n",
+ "\n",
+ "We will use the Seattle weather dataset from the [Vega Datasets](https://vega.github.io/vega-datasets/) for this example."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "id": "95789170-a7d4-4924-ae23-3736e03ea006",
+ "metadata": {
+ "ExecuteTime": {
+ "end_time": "2023-08-04T07:31:46.749563Z",
+ "start_time": "2023-08-04T07:31:46.634904Z"
+ },
+ "tags": []
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "
\n",
+ " \n",
+ " \n",
+ " | \n",
+ " date | \n",
+ " precipitation | \n",
+ " temp_max | \n",
+ " temp_min | \n",
+ " wind | \n",
+ " weather | \n",
+ "
\n",
+ " \n",
+ " \n",
+ " \n",
+ " 0 | \n",
+ " 2012-01-01 | \n",
+ " 0.0 | \n",
+ " 12.8 | \n",
+ " 5.0 | \n",
+ " 4.7 | \n",
+ " drizzle | \n",
+ "
\n",
+ " \n",
+ " 1 | \n",
+ " 2012-01-02 | \n",
+ " 10.9 | \n",
+ " 10.6 | \n",
+ " 2.8 | \n",
+ " 4.5 | \n",
+ " rain | \n",
+ "
\n",
+ " \n",
+ " 2 | \n",
+ " 2012-01-03 | \n",
+ " 0.8 | \n",
+ " 11.7 | \n",
+ " 7.2 | \n",
+ " 2.3 | \n",
+ " rain | \n",
+ "
\n",
+ " \n",
+ " 3 | \n",
+ " 2012-01-04 | \n",
+ " 20.3 | \n",
+ " 12.2 | \n",
+ " 5.6 | \n",
+ " 4.7 | \n",
+ " rain | \n",
+ "
\n",
+ " \n",
+ " 4 | \n",
+ " 2012-01-05 | \n",
+ " 1.3 | \n",
+ " 8.9 | \n",
+ " 2.8 | \n",
+ " 6.1 | \n",
+ " rain | \n",
+ "
\n",
+ " \n",
+ "
\n",
+ "
"
+ ],
+ "text/plain": [
+ " date precipitation temp_max temp_min wind weather\n",
+ "0 2012-01-01 0.0 12.8 5.0 4.7 drizzle\n",
+ "1 2012-01-02 10.9 10.6 2.8 4.5 rain\n",
+ "2 2012-01-03 0.8 11.7 7.2 2.3 rain\n",
+ "3 2012-01-04 20.3 12.2 5.6 4.7 rain\n",
+ "4 2012-01-05 1.3 8.9 2.8 6.1 rain"
+ ]
+ },
+ "execution_count": 3,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "import draco as drc\n",
+ "import pandas as pd\n",
+ "from vega_datasets import data as vega_data\n",
+ "import altair as alt\n",
+ "\n",
+ "# Loading data to be explored\n",
+ "df: pd.DataFrame = vega_data.seattle_weather()\n",
+ "df.head()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b9335743",
+ "metadata": {},
+ "source": [
+ "We can use the `schema_from_dataframe` function to generate the schema of the dataset, including the data types of each column and their statistical properties."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "id": "8e0d10d9-1d3b-463d-97bb-9dae981930a5",
+ "metadata": {
+ "ExecuteTime": {
+ "end_time": "2023-08-04T07:31:46.804058Z",
+ "start_time": "2023-08-04T07:31:46.642610Z"
+ },
+ "tags": []
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/markdown": [
+ "```json\n",
+ "{\n",
+ " \"number_rows\": 1461,\n",
+ " \"field\": [\n",
+ " {\n",
+ " \"name\": \"date\",\n",
+ " \"type\": \"datetime\",\n",
+ " \"unique\": 1461,\n",
+ " \"entropy\": 7287\n",
+ " },\n",
+ " {\n",
+ " \"name\": \"precipitation\",\n",
+ " \"type\": \"number\",\n",
+ " \"unique\": 111,\n",
+ " \"entropy\": 2422,\n",
+ " \"min\": 0,\n",
+ " \"max\": 55,\n",
+ " \"std\": 6\n",
+ " },\n",
+ " {\n",
+ " \"name\": \"temp_max\",\n",
+ " \"type\": \"number\",\n",
+ " \"unique\": 67,\n",
+ " \"entropy\": 3934,\n",
+ " \"min\": -1,\n",
+ " \"max\": 35,\n",
+ " \"std\": 7\n",
+ " },\n",
+ " {\n",
+ " \"name\": \"temp_min\",\n",
+ " \"type\": \"number\",\n",
+ " \"unique\": 55,\n",
+ " \"entropy\": 3596,\n",
+ " \"min\": -7,\n",
+ " \"max\": 18,\n",
+ " \"std\": 5\n",
+ " },\n",
+ " {\n",
+ " \"name\": \"wind\",\n",
+ " \"type\": \"number\",\n",
+ " \"unique\": 79,\n",
+ " \"entropy\": 3950,\n",
+ " \"min\": 0,\n",
+ " \"max\": 9,\n",
+ " \"std\": 1\n",
+ " },\n",
+ " {\n",
+ " \"name\": \"weather\",\n",
+ " \"type\": \"string\",\n",
+ " \"unique\": 5,\n",
+ " \"entropy\": 1201,\n",
+ " \"freq\": 714\n",
+ " }\n",
+ " ]\n",
+ "}\n",
+ "```"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "data_schema = drc.schema_from_dataframe(df)\n",
+ "pprint(data_schema)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "c72d7d02",
+ "metadata": {},
+ "source": [
+ "We transform the data schema into a set of facts that Draco can use to reason about the data when generating recommendations. To do so, we use the `dict_to_facts` function, which takes a dictionary and returns a list of facts.\n",
+ "The output list of facts encodes the same information as the input dictionary; it is just a different representation that we can feed into [Clingo](https://potassco.org/clingo/) under the hood."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "id": "f004d684-9ba5-4f73-9383-34a3007ba6c0",
+ "metadata": {
+ "ExecuteTime": {
+ "end_time": "2023-08-04T07:31:46.805392Z",
+ "start_time": "2023-08-04T07:31:46.648759Z"
+ },
+ "tags": []
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/markdown": [
+ "```json\n",
+ "[\n",
+ " \"attribute(number_rows,root,1461).\",\n",
+ " \"entity(field,root,0).\",\n",
+ " \"attribute((field,name),0,date).\",\n",
+ " \"attribute((field,type),0,datetime).\",\n",
+ " \"attribute((field,unique),0,1461).\",\n",
+ " \"attribute((field,entropy),0,7287).\",\n",
+ " \"entity(field,root,1).\",\n",
+ " \"attribute((field,name),1,precipitation).\",\n",
+ " \"attribute((field,type),1,number).\",\n",
+ " \"attribute((field,unique),1,111).\",\n",
+ " \"attribute((field,entropy),1,2422).\",\n",
+ " \"attribute((field,min),1,0).\",\n",
+ " \"attribute((field,max),1,55).\",\n",
+ " \"attribute((field,std),1,6).\",\n",
+ " \"entity(field,root,2).\",\n",
+ " \"attribute((field,name),2,temp_max).\",\n",
+ " \"attribute((field,type),2,number).\",\n",
+ " \"attribute((field,unique),2,67).\",\n",
+ " \"attribute((field,entropy),2,3934).\",\n",
+ " \"attribute((field,min),2,-1).\",\n",
+ " \"attribute((field,max),2,35).\",\n",
+ " \"attribute((field,std),2,7).\",\n",
+ " \"entity(field,root,3).\",\n",
+ " \"attribute((field,name),3,temp_min).\",\n",
+ " \"attribute((field,type),3,number).\",\n",
+ " \"attribute((field,unique),3,55).\",\n",
+ " \"attribute((field,entropy),3,3596).\",\n",
+ " \"attribute((field,min),3,-7).\",\n",
+ " \"attribute((field,max),3,18).\",\n",
+ " \"attribute((field,std),3,5).\",\n",
+ " \"entity(field,root,4).\",\n",
+ " \"attribute((field,name),4,wind).\",\n",
+ " \"attribute((field,type),4,number).\",\n",
+ " \"attribute((field,unique),4,79).\",\n",
+ " \"attribute((field,entropy),4,3950).\",\n",
+ " \"attribute((field,min),4,0).\",\n",
+ " \"attribute((field,max),4,9).\",\n",
+ " \"attribute((field,std),4,1).\",\n",
+ " \"entity(field,root,5).\",\n",
+ " \"attribute((field,name),5,weather).\",\n",
+ " \"attribute((field,type),5,string).\",\n",
+ " \"attribute((field,unique),5,5).\",\n",
+ " \"attribute((field,entropy),5,1201).\",\n",
+ " \"attribute((field,freq),5,714).\"\n",
+ "]\n",
+ "```"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "data_schema_facts = drc.dict_to_facts(data_schema)\n",
+ "pprint(data_schema_facts)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "bbf3a2a6",
+ "metadata": {},
+ "source": [
+ "## Iterating the Partial Specification Query\n",
+ "\n",
+ "> Generating recommendations from a minimal input\n",
+ "\n",
+ "We start by defining `input_spec_base`, which is a list of facts including the data schema, a single view and a single mark.\n",
+ "This is the minimal set of facts that Draco needs to generate recommendations which can be rendered into charts.\n",
+ "\n",
+ "We instantiate a `Draco` object, using the default knowledge base, and an `AltairRenderer` object which will be used to render the recommendations into Vega-Lite charts."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "id": "0c3ab0c4-095d-48ba-b2f5-729a0a5de316",
+ "metadata": {
+ "ExecuteTime": {
+ "end_time": "2023-08-04T07:31:46.805499Z",
+ "start_time": "2023-08-04T07:31:46.651939Z"
+ },
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "from draco.renderer import AltairRenderer\n",
+ "\n",
+ "input_spec_base = data_schema_facts + [\n",
+ " \"entity(view,root,v0).\",\n",
+ " \"entity(mark,v0,m0).\",\n",
+ "]\n",
+ "d = drc.Draco()\n",
+ "renderer = AltairRenderer()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "1a3c4d59",
+ "metadata": {},
+ "source": [
+ "We can now use the `complete_spec` method of the `Draco` object to generate recommendations from incomplete specifications.\n",
+ "The function below is a reusable utility for this example, responsible for generating, rendering and displaying the recommendations."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "id": "da55ac50-c3f4-4064-ac0d-ce09ee6e7d7e",
+ "metadata": {
+ "ExecuteTime": {
+ "end_time": "2023-08-04T07:31:46.806754Z",
+ "start_time": "2023-08-04T07:31:46.656207Z"
+ },
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "def recommend_charts(\n",
+ " spec: list[str], draco: drc.Draco, num: int = 5, labeler=lambda i: f\"CHART {i+1}\"\n",
+ ") -> dict[str, tuple[list[str], dict]]:\n",
+ " # Dictionary to store the generated recommendations, keyed by chart name\n",
+ " chart_specs = {}\n",
+ " for i, model in enumerate(draco.complete_spec(spec, num)):\n",
+ " chart_name = labeler(i)\n",
+ " spec = drc.answer_set_to_dict(model.answer_set)\n",
+ " chart_specs[chart_name] = drc.dict_to_facts(spec), spec\n",
+ "\n",
+ " print(chart_name)\n",
+ " print(f\"COST: {model.cost}\")\n",
+ " chart = renderer.render(spec=spec, data=df)\n",
+ " # Adjust column-faceted chart size\n",
+ " if (\n",
+ " isinstance(chart, alt.FacetChart)\n",
+ " and chart.facet.column is not alt.Undefined\n",
+ " ):\n",
+ " chart = chart.configure_view(continuousWidth=130, continuousHeight=130)\n",
+ " display(chart)\n",
+ "\n",
+ " return chart_specs"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "dd4301df",
+ "metadata": {},
+ "source": [
+ "We use `input_spec_base` as the starting point for our exploration; that is, we specify only the data schema and require that the recommendations have at least one view and one mark."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "id": "068fc36e-7910-42f6-9f61-f9cc9b31ba9c",
+ "metadata": {
+ "ExecuteTime": {
+ "end_time": "2023-08-04T07:31:46.943638Z",
+ "start_time": "2023-08-04T07:31:46.658807Z"
+ },
+ "tags": []
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 1\n",
+ "COST: [3]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.Chart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 2\n",
+ "COST: [4]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.Chart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 3\n",
+ "COST: [4]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.FacetChart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 4\n",
+ "COST: [4]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.Chart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 5\n",
+ "COST: [5]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.FacetChart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "input_spec = input_spec_base\n",
+ "initial_recommendations = recommend_charts(spec=input_spec, draco=d)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "87be73a6-cb5b-4aa6-bf9d-be422cf1ad16",
+ "metadata": {},
+ "source": [
+ "While the above recommendations are valid, they are not very diverse. We can also observe that the first two recommendations are represented by seemingly identical Vega-Lite specifications, yet they have different costs. We explore this behavior below by inspecting the Draco specification of the first two charts."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "id": "8a629259-cd47-411d-9842-3923ff3618df",
+ "metadata": {
+ "ExecuteTime": {
+ "end_time": "2023-08-04T07:31:46.943997Z",
+ "start_time": "2023-08-04T07:31:46.851319Z"
+ }
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/markdown": [
+ "**Draco Specification of CHART 1**"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/markdown": [
+ "```json\n",
+ "{\n",
+ " \"number_rows\": 1461,\n",
+ " \"task\": \"summary\",\n",
+ " \"field\": [\n",
+ " {\n",
+ " \"name\": \"date\",\n",
+ " \"type\": \"datetime\",\n",
+ " \"unique\": 1461,\n",
+ " \"entropy\": 7287\n",
+ " },\n",
+ " {\n",
+ " \"name\": \"precipitation\",\n",
+ " \"type\": \"number\",\n",
+ " \"unique\": 111,\n",
+ " \"entropy\": 2422,\n",
+ " \"min\": 0,\n",
+ " \"max\": 55,\n",
+ " \"std\": 6\n",
+ " },\n",
+ " {\n",
+ " \"name\": \"temp_max\",\n",
+ " \"type\": \"number\",\n",
+ " \"unique\": 67,\n",
+ " \"entropy\": 3934,\n",
+ " \"min\": -1,\n",
+ " \"max\": 35,\n",
+ " \"std\": 7\n",
+ " },\n",
+ " {\n",
+ " \"name\": \"temp_min\",\n",
+ " \"type\": \"number\",\n",
+ " \"unique\": 55,\n",
+ " \"entropy\": 3596,\n",
+ " \"min\": -7,\n",
+ " \"max\": 18,\n",
+ " \"std\": 5\n",
+ " },\n",
+ " {\n",
+ " \"name\": \"wind\",\n",
+ " \"type\": \"number\",\n",
+ " \"unique\": 79,\n",
+ " \"entropy\": 3950,\n",
+ " \"min\": 0,\n",
+ " \"max\": 9,\n",
+ " \"std\": 1\n",
+ " },\n",
+ " {\n",
+ " \"name\": \"weather\",\n",
+ " \"type\": \"string\",\n",
+ " \"unique\": 5,\n",
+ " \"entropy\": 1201,\n",
+ " \"freq\": 714\n",
+ " }\n",
+ " ],\n",
+ " \"view\": [\n",
+ " {\n",
+ " \"coordinates\": \"cartesian\",\n",
+ " \"mark\": [\n",
+ " {\n",
+ " \"type\": \"bar\",\n",
+ " \"encoding\": [\n",
+ " {\n",
+ " \"channel\": \"x\",\n",
+ " \"aggregate\": \"count\"\n",
+ " }\n",
+ " ]\n",
+ " }\n",
+ " ],\n",
+ " \"scale\": [\n",
+ " {\n",
+ " \"type\": \"linear\",\n",
+ " \"channel\": \"x\",\n",
+ " \"zero\": \"true\"\n",
+ " }\n",
+ " ]\n",
+ " }\n",
+ " ]\n",
+ "}\n",
+ "```"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/markdown": [
+ "**Draco Specification of CHART 2**"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/markdown": [
+ "```json\n",
+ "{\n",
+ " \"number_rows\": 1461,\n",
+ " \"task\": \"value\",\n",
+ " \"field\": [\n",
+ " {\n",
+ " \"name\": \"date\",\n",
+ " \"type\": \"datetime\",\n",
+ " \"unique\": 1461,\n",
+ " \"entropy\": 7287\n",
+ " },\n",
+ " {\n",
+ " \"name\": \"precipitation\",\n",
+ " \"type\": \"number\",\n",
+ " \"unique\": 111,\n",
+ " \"entropy\": 2422,\n",
+ " \"min\": 0,\n",
+ " \"max\": 55,\n",
+ " \"std\": 6\n",
+ " },\n",
+ " {\n",
+ " \"name\": \"temp_max\",\n",
+ " \"type\": \"number\",\n",
+ " \"unique\": 67,\n",
+ " \"entropy\": 3934,\n",
+ " \"min\": -1,\n",
+ " \"max\": 35,\n",
+ " \"std\": 7\n",
+ " },\n",
+ " {\n",
+ " \"name\": \"temp_min\",\n",
+ " \"type\": \"number\",\n",
+ " \"unique\": 55,\n",
+ " \"entropy\": 3596,\n",
+ " \"min\": -7,\n",
+ " \"max\": 18,\n",
+ " \"std\": 5\n",
+ " },\n",
+ " {\n",
+ " \"name\": \"wind\",\n",
+ " \"type\": \"number\",\n",
+ " \"unique\": 79,\n",
+ " \"entropy\": 3950,\n",
+ " \"min\": 0,\n",
+ " \"max\": 9,\n",
+ " \"std\": 1\n",
+ " },\n",
+ " {\n",
+ " \"name\": \"weather\",\n",
+ " \"type\": \"string\",\n",
+ " \"unique\": 5,\n",
+ " \"entropy\": 1201,\n",
+ " \"freq\": 714\n",
+ " }\n",
+ " ],\n",
+ " \"view\": [\n",
+ " {\n",
+ " \"coordinates\": \"cartesian\",\n",
+ " \"mark\": [\n",
+ " {\n",
+ " \"type\": \"bar\",\n",
+ " \"encoding\": [\n",
+ " {\n",
+ " \"channel\": \"x\",\n",
+ " \"aggregate\": \"count\"\n",
+ " }\n",
+ " ]\n",
+ " }\n",
+ " ],\n",
+ " \"scale\": [\n",
+ " {\n",
+ " \"zero\": \"true\",\n",
+ " \"channel\": \"x\",\n",
+ " \"type\": \"linear\"\n",
+ " }\n",
+ " ]\n",
+ " }\n",
+ " ]\n",
+ "}\n",
+ "```"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "chart_1_key, chart_2_key = 'CHART 1', 'CHART 2'\n",
+ "(_, chart_1), (_, chart_2) = initial_recommendations[chart_1_key], initial_recommendations[chart_2_key]\n",
+ "\n",
+ "md(f\"**Draco Specification of {chart_1_key}**\")\n",
+ "pprint(chart_1)\n",
+ "\n",
+ "md(f\"**Draco Specification of {chart_2_key}**\")\n",
+ "pprint(chart_2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "cd2d41f5-5ba6-4257-aa06-c8314abc7ec1",
+ "metadata": {},
+ "source": [
+ "Taking a good look at the specifications above, we can see that they differ only in their `\"task\"` attribute value: `CHART 1` has `\"task\": \"summary\"`, while `CHART 2` has `\"task\": \"value\"`. Thanks to the constraints in the default Draco knowledge base, the solver assigns slightly different costs to the two specifications. However, since the two charts use the same fields, scales, marks and encodings, the Vega-Lite specifications generated from the two Draco specifications are identical."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "8ec5f8ca",
+ "metadata": {},
+ "source": [
+ "We can extend the input specification to narrow down the design space we want recommendations for and to obtain more diverse results.\n",
+ "Let's say we want the fields `date` and `temp_max` of the weather dataset to be encoded in the charts.\n",
+ "We also specify that the chart should be faceted, using the `col` facet channel.\n",
+ "Note that we specify neither the mark type nor the encoding channels for the two fields. We leave these choices to Draco, which makes them based on its underlying knowledge base."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "id": "c0e76b39-31ec-459f-ba56-ee06ed080011",
+ "metadata": {
+ "ExecuteTime": {
+ "end_time": "2023-08-04T07:31:47.123433Z",
+ "start_time": "2023-08-04T07:31:46.858016Z"
+ },
+ "tags": []
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 1\n",
+ "COST: [16]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.FacetChart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 2\n",
+ "COST: [16]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.FacetChart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 3\n",
+ "COST: [17]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.FacetChart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 4\n",
+ "COST: [17]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.FacetChart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 5\n",
+ "COST: [17]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.FacetChart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "input_spec = input_spec_base + [\n",
+ " # We want to encode the `date` field\n",
+ " \"entity(encoding,m0,e0).\",\n",
+ " \"attribute((encoding,field),e0,date).\",\n",
+ " # We want to encode the `temp_max` field\n",
+ " \"entity(encoding,m0,e1).\",\n",
+ " \"attribute((encoding,field),e1,temp_max).\",\n",
+ " # We want the chart to be a faceted chart\n",
+ " \"entity(facet,v0,f0).\",\n",
+ " \"attribute((facet,channel),f0,col).\",\n",
+ "]\n",
+ "recommendations = recommend_charts(spec=input_spec, draco=d, num=5)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "f468eaad",
+ "metadata": {},
+ "source": [
+ "## Inspecting the Knowledge Base\n",
+ "\n",
+ "> Debugging the recommendations\n",
+ "\n",
+ "We can use the `DracoDebug` class to investigate the recommendations generated by Draco and to see which soft constraints they violate.\n",
+ "We start by instantiating a `DracoDebug` object, passing the recommendations and the `Draco` object used to generate them.\n",
+ "Its `chart_preferences` attribute is a `DataFrame` that lists, for each recommendation, every soft constraint together with the number of times it is violated (`count`) and the weight associated with it."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "id": "84b3493b",
+ "metadata": {
+ "ExecuteTime": {
+ "end_time": "2023-08-04T07:31:47.123908Z",
+ "start_time": "2023-08-04T07:31:47.082803Z"
+ },
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
+ "outputs": [],
+ "source": [
+ "# Parameterized helper to avoid code duplication as we iterate on designs\n",
+ "def display_debug_data(draco: drc.Draco, specs: dict[str, tuple[list[str], dict]]):\n",
+ " debugger = drc.DracoDebug(specs={chart_name: fact_spec for chart_name, (fact_spec, _) in specs.items()}, draco=draco)\n",
+ " chart_preferences = debugger.chart_preferences\n",
+ " display(Markdown(\"**Raw debug data**\"))\n",
+ " display(chart_preferences.head())\n",
+ "\n",
+ " display(Markdown(\"**Number of violated preferences**\"))\n",
+ " num_violations = len(\n",
+ " set(chart_preferences[chart_preferences[\"count\"] != 0][\"pref_name\"])\n",
+ " )\n",
+ " num_all = len(set(chart_preferences[\"pref_name\"]))\n",
+ " display(\n",
+ " Markdown(\n",
+ " f\"*{num_violations} preferences are violated out of a total of {num_all} preferences (soft constraints)*\"\n",
+ " )\n",
+ " )\n",
+ "\n",
+ " display(\n",
+ " Markdown(\n",
+ " \"Using `DracoDebugPlotter` to visualize the debug `DataFrame` produced by `DracoDebug`:\"\n",
+ " )\n",
+ " )\n",
+ " plotter = drc.DracoDebugPlotter(chart_preferences)\n",
+ " plot_size = (600, 300)\n",
+ " chart = plotter.create_chart(\n",
+ " cfg=drc.DracoDebugChartConfig.SORT_BY_COUNT_SUM,\n",
+ " violated_prefs_only=True,\n",
+ " plot_size=plot_size,\n",
+ " )\n",
+ " display(chart)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "id": "a3bb3393",
+ "metadata": {
+ "ExecuteTime": {
+ "end_time": "2023-08-04T07:31:47.213230Z",
+ "start_time": "2023-08-04T07:31:47.085904Z"
+ },
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/markdown": [
+ "**Raw debug data**"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<table>\n",
+ "<thead>\n",
+ "<tr><th></th><th>chart_name</th><th>pref_name</th><th>pref_description</th><th>count</th><th>weight</th></tr>\n",
+ "</thead>\n",
+ "<tbody>\n",
+ "<tr><th>0</th><td>CHART 1</td><td>cartesian_coordinate</td><td>Cartesian coordinates.</td><td>1</td><td>0</td></tr>\n",
+ "<tr><th>1</th><td>CHART 1</td><td>summary_point</td><td>Point mark for summary tasks.</td><td>1</td><td>0</td></tr>\n",
+ "<tr><th>2</th><td>CHART 1</td><td>linear_y</td><td>Linear scale with y channel.</td><td>1</td><td>0</td></tr>\n",
+ "<tr><th>3</th><td>CHART 1</td><td>linear_x</td><td>Linear scale with x channel.</td><td>1</td><td>0</td></tr>\n",
+ "<tr><th>4</th><td>CHART 1</td><td>c_c_point</td><td>Continuous by continuous for point mark.</td><td>1</td><td>0</td></tr>\n",
+ "</tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " chart_name pref_name pref_description \\\n",
+ "0 CHART 1 cartesian_coordinate Cartesian coordinates. \n",
+ "1 CHART 1 summary_point Point mark for summary tasks. \n",
+ "2 CHART 1 linear_y Linear scale with y channel. \n",
+ "3 CHART 1 linear_x Linear scale with x channel. \n",
+ "4 CHART 1 c_c_point Continuous by continuous for point mark. \n",
+ "\n",
+ " count weight \n",
+ "0 1 0 \n",
+ "1 1 0 \n",
+ "2 1 0 \n",
+ "3 1 0 \n",
+ "4 1 0 "
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/markdown": [
+ "**Number of violated preferences**"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/markdown": [
+ "*18 preferences are violated out of a total of 147 preferences (soft constraints)*"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/markdown": [
+ "Using `DracoDebugPlotter` to visualize the debug `DataFrame` produced by `DracoDebug`:"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.VConcatChart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "display_debug_data(draco=d, specs=recommendations)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "89e4bb3d",
+ "metadata": {},
+ "source": [
+ "## Generating Input Specifications Programmatically\n",
+ "\n",
+ "> Exploring more possibilities within the design space\n",
+ "\n",
+ "To get a better impression of the space of possible visualizations and to produce examples that might be covered by more soft constraints, we can programmatically generate further input specifications.\n",
+ "We define lists of possible values for the mark type, the field and the encoding channel that we want to be used in the recommendations and combine them using a nested list comprehension.\n",
+ "We also filter out designs with fewer than 3 encodings and exclude multi-layer designs for now."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "9be1c816-4892-4fba-a524-3ff5d54e3aef",
+ "metadata": {},
+ "source": [
+ "We set off by creating the helper function `rec_from_generated_spec` to avoid code duplication as we iterate on designs."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "id": "b4ec26e7",
+ "metadata": {
+ "ExecuteTime": {
+ "end_time": "2023-08-04T07:31:47.221555Z",
+ "start_time": "2023-08-04T07:31:47.193921Z"
+ },
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
+ "outputs": [],
+ "source": [
+ "def rec_from_generated_spec(\n",
+ " marks: list[str],\n",
+ " fields: list[str],\n",
+ " encoding_channels: list[str],\n",
+ " draco: drc.Draco,\n",
+ " num: int = 1,\n",
+ ") -> dict[str, tuple[list[str], dict]]:\n",
+ " input_specs = [\n",
+ " (\n",
+ " (mark, field, enc_ch),\n",
+ " input_spec_base\n",
+ " + [\n",
+ " f\"attribute((mark,type),m0,{mark}).\",\n",
+ " \"entity(encoding,m0,e0).\",\n",
+ " f\"attribute((encoding,field),e0,{field}).\",\n",
+ " f\"attribute((encoding,channel),e0,{enc_ch}).\",\n",
+ " # filter out designs with less than 3 encodings\n",
+ " \":- {entity(encoding,_,_)} < 3.\",\n",
+ " # exclude multi-layer designs\n",
+ " \":- {entity(mark,_,_)} != 1.\",\n",
+ " ],\n",
+ " )\n",
+ " for mark in marks\n",
+ " for field in fields\n",
+ " for enc_ch in encoding_channels\n",
+ " ]\n",
+ " recs = {}\n",
+ " for cfg, spec in input_specs:\n",
+ " labeler = lambda i: f\"CHART {i + 1} ({' | '.join(cfg)})\"\n",
+ " recs = recs | recommend_charts(spec=spec, draco=draco, num=num, labeler=labeler)\n",
+ "\n",
+ " return recs"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "id": "8120a84a-d3f5-48fa-9c6e-abfe91f3d050",
+ "metadata": {
+ "ExecuteTime": {
+ "end_time": "2023-08-04T07:31:49.215073Z",
+ "start_time": "2023-08-04T07:31:47.198691Z"
+ },
+ "tags": []
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 1 (point | weather | color)\n",
+ "COST: [25]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.Chart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 1 (point | weather | shape)\n",
+ "COST: [28]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.Chart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 1 (point | weather | size)\n",
+ "COST: [30]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.Chart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 1 (point | temp_min | color)\n",
+ "COST: [27]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.Chart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 1 (point | temp_min | shape)\n",
+ "COST: [41]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.Chart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 1 (point | date | color)\n",
+ "COST: [28]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.Chart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 1 (point | date | shape)\n",
+ "COST: [42]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.Chart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 1 (point | date | size)\n",
+ "COST: [19]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.Chart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 1 (bar | weather | color)\n",
+ "COST: [25]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.Chart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 1 (bar | temp_min | color)\n",
+ "COST: [27]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.Chart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 1 (bar | date | color)\n",
+ "COST: [28]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.Chart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 1 (line | weather | color)\n",
+ "COST: [45]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.Chart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 1 (line | temp_min | color)\n",
+ "COST: [47]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.Chart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 1 (line | date | color)\n",
+ "COST: [48]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.Chart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 1 (rect | weather | color)\n",
+ "COST: [71]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.FacetChart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 1 (rect | temp_min | color)\n",
+ "COST: [39]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.Chart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 1 (rect | date | color)\n",
+ "COST: [40]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.FacetChart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "recommendations = rec_from_generated_spec(\n",
+ " marks=[\"point\", \"bar\", \"line\", \"rect\"],\n",
+ " fields=[\"weather\", \"temp_min\", \"date\"],\n",
+ " encoding_channels=[\"color\", \"shape\", \"size\"],\n",
+ " draco=d,\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "5644a643",
+ "metadata": {},
+ "source": [
+ "It is no secret that some of the above recommendations are not very useful when it comes to communicating the data. Nevertheless, they are valid visualizations from the space of possibilities. Following the already introduced workflow, we can use `DracoDebug` to investigate the soft constraint violations of the generated recommendations. If there are recommendations we are not happy with, we can extend the knowledge base to cover them so that they do not appear in the future."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "id": "4179f1d1-8b97-48a0-9400-3c6798f686d4",
+ "metadata": {
+ "ExecuteTime": {
+ "end_time": "2023-08-04T07:31:49.610795Z",
+ "start_time": "2023-08-04T07:31:49.213929Z"
+ },
+ "tags": []
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/markdown": [
+ "**Raw debug data**"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<table>\n",
+ "<thead>\n",
+ "<tr><th></th><th>chart_name</th><th>pref_name</th><th>pref_description</th><th>count</th><th>weight</th></tr>\n",
+ "</thead>\n",
+ "<tbody>\n",
+ "<tr><th>0</th><td>CHART 1 (point | weather | color)</td><td>cartesian_coordinate</td><td>Cartesian coordinates.</td><td>1</td><td>0</td></tr>\n",
+ "<tr><th>1</th><td>CHART 1 (point | weather | color)</td><td>summary_point</td><td>Point mark for summary tasks.</td><td>1</td><td>0</td></tr>\n",
+ "<tr><th>2</th><td>CHART 1 (point | weather | color)</td><td>aggregate_mean</td><td>Mean as aggregate op.</td><td>1</td><td>1</td></tr>\n",
+ "<tr><th>3</th><td>CHART 1 (point | weather | color)</td><td>aggregate_count</td><td>Count as aggregate op.</td><td>1</td><td>0</td></tr>\n",
+ "<tr><th>4</th><td>CHART 1 (point | weather | color)</td><td>ordinal_color</td><td>Ordinal scale with color channel.</td><td>1</td><td>8</td></tr>\n",
+ "</tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " chart_name pref_name \\\n",
+ "0 CHART 1 (point | weather | color) cartesian_coordinate \n",
+ "1 CHART 1 (point | weather | color) summary_point \n",
+ "2 CHART 1 (point | weather | color) aggregate_mean \n",
+ "3 CHART 1 (point | weather | color) aggregate_count \n",
+ "4 CHART 1 (point | weather | color) ordinal_color \n",
+ "\n",
+ " pref_description count weight \n",
+ "0 Cartesian coordinates. 1 0 \n",
+ "1 Point mark for summary tasks. 1 0 \n",
+ "2 Mean as aggregate op. 1 1 \n",
+ "3 Count as aggregate op. 1 0 \n",
+ "4 Ordinal scale with color channel. 1 8 "
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/markdown": [
+ "**Number of violated preferences**"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/markdown": [
+ "*40 preferences are violated out of a total of 147 preferences (soft constraints)*"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/markdown": [
+ "Using `DracoDebugPlotter` to visualize the debug `DataFrame` produced by `DracoDebug`:"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.VConcatChart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "display_debug_data(draco=d, specs=recommendations)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "ba981d6e-a9c2-4725-a5b4-91b799cdd9d7",
+ "metadata": {},
+ "source": [
+ "## Adjusting the Knowledge Base\n",
+ "\n",
+ "> Filtering out suboptimal designs by creating a new soft constraint and tuning its weight\n",
+ "\n",
+ "As is apparent from the recommendations generated above, some visualizations are valid but not as expressive as we would like. As a concrete example, the recommendations `CHART 1 (rect | weather | color)` and `CHART 1 (rect | date | color)` use row-faceting, and no rule in our knowledge base penalizes them for doing so. \n",
+ "\n",
+ "We demonstrate how we can extend the knowledge base with a design rule (soft constraint) to discourage using faceting with `rect` mark and `color` encoding and how to tune its weight to achieve more desirable recommendations."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "e89ef2a2",
+ "metadata": {},
+ "source": [
+ "We start by creating the helper function `draco_with_updated_kb`, to return a `Draco` instance with the updated knowledge base.\n",
+ "We extend the knowledge base with a new preference (soft constraint) called `rect_color_facet` to discourage\n",
+ "faceting with rect mark and color encoding. To explore how the recommendations change as we assign different weights to the new soft constraint, we parameterize this function to accept `pref_weight` as an argument. This weight will be associated with the `rect_color_facet` preference we extend the knowledge base with."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "id": "ad0c4c38",
+ "metadata": {
+ "ExecuteTime": {
+ "end_time": "2023-08-04T07:31:49.610953Z",
+ "start_time": "2023-08-04T07:31:49.404648Z"
+ },
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
+ "outputs": [],
+ "source": [
+ "def draco_with_updated_kb(pref_weight: int) -> drc.Draco:\n",
+ " # Custom soft constraint to discourage faceting with rect mark and color encoding\n",
+ " rect_color_facet_pref = \"\"\"\n",
+ " % @soft(rect_color_facet) Faceting with rect mark and color encoding.\n",
+ " preference(rect_color_facet,Fa) :-\n",
+ " attribute((mark,type),_,rect),\n",
+ " attribute((encoding,channel),_,color),\n",
+ " attribute((facet,channel),Fa,_).\n",
+ " \"\"\".strip()\n",
+ " rect_color_facet_pref_weight = pref_weight\n",
+ "\n",
+ " # Update the default soft constraint knowledge base (program)\n",
+ " soft_updated = drc.Draco().soft + f\"\\n\\n{rect_color_facet_pref}\\n\\n\"\n",
+ " # Assign the weight to the new soft constraint\n",
+ " weights_updated = drc.Draco().weights | {\n",
+ " \"rect_color_facet_weight\": rect_color_facet_pref_weight\n",
+ " }\n",
+ " return drc.Draco(soft=soft_updated, weights=weights_updated)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "8392bc7b",
+ "metadata": {},
+ "source": [
+ "As opposed to the previous example, we only generate specifications for the `rect` mark, the `weather` and `date` fields and the `color` encoding channel, since we observed the undesired faceted recommendations for these configurations. We explore how the recommendations change as we assign a higher weight value (that is, a higher penalty) to the new soft constraint."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "44943c2c-a5e3-4fb0-936f-be55618b5643",
+ "metadata": {},
+ "source": [
+ "### Verifying That the Knowledge Base Got Updated"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "3e04c127-5405-4c97-8d62-bb6c3a398b86",
+ "metadata": {},
+ "source": [
+ "First, to validate that the `rect_color_facet` soft constraint we created was registered properly in our knowledge base, we start with a weight of `0`. We expect to obtain the same faceted recommendations, but we also expect the plot created by `DracoDebugPlotter` in `display_debug_data` to show that the faceted recommendations violate the design preference we defined."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "id": "99399a60",
+ "metadata": {
+ "ExecuteTime": {
+ "end_time": "2023-08-04T07:31:49.672895Z",
+ "start_time": "2023-08-04T07:31:49.407768Z"
+ },
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ },
+ "tags": []
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/markdown": [
+ "**Weight for `rect_color_facet` preference: 0**"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 1 (rect | weather | color)\n",
+ "COST: [71]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.FacetChart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 1 (rect | date | color)\n",
+ "COST: [40]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.FacetChart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/markdown": [
+ "**Raw debug data**"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<table>\n",
+ "<thead>\n",
+ "<tr><th></th><th>chart_name</th><th>pref_name</th><th>pref_description</th><th>count</th><th>weight</th></tr>\n",
+ "</thead>\n",
+ "<tbody>\n",
+ "<tr><th>0</th><td>CHART 1 (rect | weather | color)</td><td>rect_color_facet</td><td>Faceting with rect mark and color encoding.</td><td>1</td><td>0</td></tr>\n",
+ "<tr><th>1</th><td>CHART 1 (rect | weather | color)</td><td>cartesian_coordinate</td><td>Cartesian coordinates.</td><td>1</td><td>0</td></tr>\n",
+ "<tr><th>2</th><td>CHART 1 (rect | weather | color)</td><td>summary_rect</td><td>Rect mark for summary tasks.</td><td>1</td><td>0</td></tr>\n",
+ "<tr><th>3</th><td>CHART 1 (rect | weather | color)</td><td>ordinal_color</td><td>Ordinal scale with color channel.</td><td>1</td><td>8</td></tr>\n",
+ "<tr><th>4</th><td>CHART 1 (rect | weather | color)</td><td>ordinal_y</td><td>Ordinal scale with y channel.</td><td>1</td><td>0</td></tr>\n",
+ "</tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " chart_name pref_name \\\n",
+ "0 CHART 1 (rect | weather | color) rect_color_facet \n",
+ "1 CHART 1 (rect | weather | color) cartesian_coordinate \n",
+ "2 CHART 1 (rect | weather | color) summary_rect \n",
+ "3 CHART 1 (rect | weather | color) ordinal_color \n",
+ "4 CHART 1 (rect | weather | color) ordinal_y \n",
+ "\n",
+ " pref_description count weight \n",
+ "0 Faceting with rect mark and color encoding. 1 0 \n",
+ "1 Cartesian coordinates. 1 0 \n",
+ "2 Rect mark for summary tasks. 1 0 \n",
+ "3 Ordinal scale with color channel. 1 8 \n",
+ "4 Ordinal scale with y channel. 1 0 "
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/markdown": [
+ "**Number of violated preferences**"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/markdown": [
+ "*20 preferences are violated out of a total of 148 preferences (soft constraints)*"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/markdown": [
+ "Using `DracoDebugPlotter` to visualize the debug `DataFrame` produced by `DracoDebug`:"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.VConcatChart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "weight = 0\n",
+ "display(Markdown(f\"**Weight for `rect_color_facet` preference: {weight}**\"))\n",
+ "updated_draco = draco_with_updated_kb(pref_weight=weight)\n",
+ "recommendations = rec_from_generated_spec(\n",
+ " marks=[\"rect\"],\n",
+ " fields=[\"weather\", \"date\"],\n",
+ " encoding_channels=[\"color\"],\n",
+ " draco=updated_draco,\n",
+ ")\n",
+ "display_debug_data(draco=updated_draco, specs=recommendations)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b711ddf4-9722-4d18-b84b-35c0b1adeb6f",
+ "metadata": {},
+ "source": [
+ "As expected, our debug plot indicates in the heatmap's 6th column that both `CHART 1 (rect | weather | color)` and `CHART 1 (rect | date | color)` violate the `rect_color_facet` preference we introduced. Now we can work on tuning the weight associated with this rule, so that we actually penalize the usage of faceting when we have a `rect` mark and `color` as the encoding channel. "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "cca01ae7-e874-4092-8337-9e0d2bb30419",
+ "metadata": {},
+ "source": [
+ "### Weight Tuning\n",
+ "\n",
+ "We increase the weight from `0` to `10`. We expect this penalty to be sufficient for the Clingo solver to find a model that costs less than the now-penalized faceted design and does not violate the `rect_color_facet` design rule we extended our knowledge base with."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 18,
+ "id": "3e200090-4991-4754-bdb0-badce0768908",
+ "metadata": {
+ "ExecuteTime": {
+ "end_time": "2023-08-04T07:31:50.087462Z",
+ "start_time": "2023-08-04T07:31:49.613876Z"
+ },
+ "tags": []
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/markdown": [
+ "**Weight for `rect_color_facet` preference: 10**"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 1 (rect | weather | color)\n",
+ "COST: [73]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.Chart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CHART 1 (rect | date | color)\n",
+ "COST: [42]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.Chart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/markdown": [
+ "**Raw debug data**"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<table>\n",
+ "<thead>\n",
+ "<tr><th></th><th>chart_name</th><th>pref_name</th><th>pref_description</th><th>count</th><th>weight</th></tr>\n",
+ "</thead>\n",
+ "<tbody>\n",
+ "<tr><th>0</th><td>CHART 1 (rect | weather | color)</td><td>cartesian_coordinate</td><td>Cartesian coordinates.</td><td>1</td><td>0</td></tr>\n",
+ "<tr><th>1</th><td>CHART 1 (rect | weather | color)</td><td>summary_rect</td><td>Rect mark for summary tasks.</td><td>1</td><td>0</td></tr>\n",
+ "<tr><th>2</th><td>CHART 1 (rect | weather | color)</td><td>aggregate_median</td><td>Median as aggregate op.</td><td>1</td><td>3</td></tr>\n",
+ "<tr><th>3</th><td>CHART 1 (rect | weather | color)</td><td>ordinal_color</td><td>Ordinal scale with color channel.</td><td>1</td><td>8</td></tr>\n",
+ "<tr><th>4</th><td>CHART 1 (rect | weather | color)</td><td>ordinal_y</td><td>Ordinal scale with y channel.</td><td>1</td><td>0</td></tr>\n",
+ "</tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " chart_name pref_name \\\n",
+ "0 CHART 1 (rect | weather | color) cartesian_coordinate \n",
+ "1 CHART 1 (rect | weather | color) summary_rect \n",
+ "2 CHART 1 (rect | weather | color) aggregate_median \n",
+ "3 CHART 1 (rect | weather | color) ordinal_color \n",
+ "4 CHART 1 (rect | weather | color) ordinal_y \n",
+ "\n",
+ " pref_description count weight \n",
+ "0 Cartesian coordinates. 1 0 \n",
+ "1 Rect mark for summary tasks. 1 0 \n",
+ "2 Median as aggregate op. 1 3 \n",
+ "3 Ordinal scale with color channel. 1 8 \n",
+ "4 Ordinal scale with y channel. 1 0 "
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/markdown": [
+ "**Number of violated preferences**"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/markdown": [
+ "*18 preferences are violated out of a total of 148 preferences (soft constraints)*"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/markdown": [
+ "Using `DracoDebugPlotter` to visualize the debug `DataFrame` produced by `DracoDebug`:"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ "alt.VConcatChart(...)"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "weight = 10\n",
+ "display(Markdown(f\"**Weight for `rect_color_facet` preference: {weight}**\"))\n",
+ "updated_draco = draco_with_updated_kb(pref_weight=weight)\n",
+ "recommendations = rec_from_generated_spec(\n",
+ " marks=[\"rect\"],\n",
+ " fields=[\"weather\", \"date\"],\n",
+ " encoding_channels=[\"color\"],\n",
+ " draco=updated_draco,\n",
+ ")\n",
+ "display_debug_data(draco=updated_draco, specs=recommendations)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "5f153e84-7754-43f9-a010-775cb6e1f048",
+ "metadata": {},
+ "source": [
+ "As expected, thanks to the higher weight assigned to the newly added `rect_color_facet` rule, the recommendations no longer use faceting when a `rect` mark and a `color` encoding are used. The same process can be used to tailor the knowledge base and fine-tune the constraint weights to obtain more expressive visualization recommendations."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "015537dd-7672-4d06-b8bc-74eefcbf23f3",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.11"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
}