Recipe page

David Megginson edited this page Oct 26, 2016 · 11 revisions
The **Recipe page** is the heart of the HXL Proxy. Here, you define a series of steps (filters) that transform the data: for example, you might add or remove columns, change the general shape of the data, produce a report (similar to a pivot table), automatically clean dates and numbers, or replace text.

In the typical workflow, you arrive here from the Source page or Tagger page.

When you are finished editing the recipe, the Done filters button will take you to the View page, where you can browse or download the result of your recipe.

Live demo

Options

Strip text headers: remove the text headers and leave only the HXL hashtags. This is useful when you need to use HXL data with tools that allow only one header row.
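To make the effect of this option concrete, here is a minimal sketch in plain Python. It is not the HXL Proxy's own code: the function name `strip_text_headers`, the sample data, and the simplified rule for spotting the hashtag row (the first row whose non-empty cells all start with `#`) are assumptions made for illustration.

```python
import csv
import io


def strip_text_headers(rows):
    """Drop every row above the HXL hashtag row (simplified detection)."""
    for index, row in enumerate(rows):
        cells = [cell.strip() for cell in row if cell.strip()]
        # Assumption: the hashtag row is the first row whose
        # non-empty cells all begin with "#".
        if cells and all(cell.startswith("#") for cell in cells):
            return rows[index:]
    raise ValueError("no HXL hashtag row found")


# Hypothetical sample: one text header row, one hashtag row, two data rows.
SAMPLE = """Province,Sector,People reached
#adm1,#sector,#reached
Coast,WASH,1200
Plains,Health,800
"""

rows = list(csv.reader(io.StringIO(SAMPLE)))
stripped = strip_text_headers(rows)

for row in stripped:
    print(",".join(row))
```

After stripping, the hashtag row is the only header row, so a tool that expects a single header line can read the output directly.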

Never cache: show changes in the source data instantly. See the Caching article for details.

Download filename: specify a basename for downloadable data. For example, if you enter "survey-data" as the value, the download filename for CSV data will be "survey-data.csv" and the download filename for JSON data will be "survey-data.json".

Filters

Each transformation step is a filter that reads the data from the previous step (or the original source), changes it in some way, then passes it on to the next step (or final output). You can rearrange the steps using the drag handles to the left of each one, or delete steps using the [x] box to the right.
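The chaining described above can be pictured as a list of functions, each taking the rows produced by the previous one. The following is a rough sketch under that assumption, not the Proxy's implementation; the helper names (`select_rows`, `cut_columns`, `sort_rows`, `run_recipe`) and the sample rows are hypothetical.

```python
from functools import reduce


def select_rows(column, predicate):
    """Build a step that keeps only rows whose value passes the predicate."""
    def step(rows):
        return [row for row in rows if predicate(row[column])]
    return step


def cut_columns(*columns):
    """Build a step that removes the named columns from every row."""
    def step(rows):
        return [{k: v for k, v in row.items() if k not in columns} for row in rows]
    return step


def sort_rows(column):
    """Build a step that sorts rows by one column."""
    def step(rows):
        return sorted(rows, key=lambda row: row[column])
    return step


def run_recipe(source, steps):
    """Feed the source through each step in order, like the recipe's filter list."""
    return reduce(lambda rows, step: step(rows), steps, source)


# Hypothetical source data, keyed by HXL hashtag.
source = [
    {"#adm1": "Plains", "#sector": "Health", "#date": "2014-11-02"},
    {"#adm1": "Coast", "#sector": "WASH", "#date": "2015-03-15"},
    {"#adm1": "Hills", "#sector": "WASH", "#date": "2015-01-20"},
]

recipe = [
    select_rows("#date", lambda value: value >= "2015-01-01"),  # ISO dates compare as text
    cut_columns("#sector"),
    sort_rows("#date"),
]

result = run_recipe(source, recipe)
```

Order matters in this model: moving `cut_columns("#sector")` ahead of a step that reads `#sector` would make that later step fail, which is why the ability to rearrange steps is useful.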

The following filter types are available for each step (select a filter type to see typical use cases):

| Filter | Description |
| --- | --- |
| Add column filter | Add a new column with a fixed value to the left or right side of the dataset. |
| Append datasets filter | Combine multiple source datasets into a single output dataset (even if the columns don't exactly match). |
| Clean data filter | Perform automated cleanup of dates, numbers, whitespace, and character case. |
| Count rows filter | Aggregate data to produce reports and summaries, like in a spreadsheet pivot table. |
| Cut columns filter | Remove columns from a dataset. |
| Deduplicate rows filter | Remove duplicate rows from a dataset. |
| Explode data filter | Normalise data by converting "wide" data (e.g. time series) to "long" data. |
| Merge columns filter | Combine data from multiple datasets (similar to SQL "join"). |
| Rename columns filter | Change the hashtags and headers on a dataset column. |
| Replace data filter | Replace data selectively, using string or regular expression patterns. |
| Replace data (mapping table) filter | Replace data selectively, driven by an external data table (useful for larger collections of replacements). |
| Select rows filter | Filter rows out of a dataset (e.g. every row with a date before 2015). |
| Sort rows filter | Sort the rows of a dataset based on one or more columns. |
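
Two of the less obvious filter types, counting rows and exploding wide data, can be illustrated with a short plain-Python sketch. This shows the general idea only and is not the Proxy's actual behaviour: the function names, the output hashtags (`#meta+count`, `#meta+label`, `#meta+value`), and the sample data are assumptions chosen for the example.

```python
from collections import Counter


def count_rows(rows, column):
    """Aggregate rows into one row per distinct value, pivot-table style."""
    counts = Counter(row[column] for row in rows)
    return [{column: value, "#meta+count": n} for value, n in sorted(counts.items())]


def explode(rows, key_column, label_name="#meta+label", value_name="#meta+value"):
    """Turn "wide" rows (one column per period) into "long" rows (one row per value)."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column != key_column:
                long_rows.append(
                    {key_column: row[key_column], label_name: column, value_name: value}
                )
    return long_rows


# Hypothetical activity list for the count example.
activities = [
    {"#adm1": "Coast", "#sector": "WASH"},
    {"#adm1": "Hills", "#sector": "WASH"},
    {"#adm1": "Plains", "#sector": "Health"},
]
summary = count_rows(activities, "#sector")

# Hypothetical wide time series: one column per year.
wide = [
    {"#adm1": "Coast", "#reached+y2014": 50, "#reached+y2015": 70},
    {"#adm1": "Hills", "#reached+y2014": 20, "#reached+y2015": 35},
]
long_form = explode(wide, "#adm1")
```

In the long form, each year's figure sits on its own row, which makes the data easier to select, sort, or count in later steps.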