fixed typos and added last checks section
amandaha8 committed Jun 26, 2024
1 parent 06d5b13 commit 3e5be7b
Showing 2 changed files with 35 additions and 22 deletions.
37 changes: 27 additions & 10 deletions docs/publishing/sections/4_notebooks_styling.md
@@ -1,6 +1,13 @@
# Getting Notebooks Ready for the Portfolio

We want all the content on our [portfolio](https://analysis.calitp.org/) to be consistent and tidy. Below are some guidelines for you to follow when creating the Jupyter Notebooks.
Depending on the complexity of your visualizations, you may want to produce
a full website composed of multiple notebooks and/or the same notebook that is rerun across different parameters.

For these situations, the [Jupyter Book-based](https://jupyterbook.org/en/stable/intro.html)
[publishing framework](https://github.com/cal-itp/data-analyses/tree/main/portfolio)
present in the data-analyses repo is your friend. You can find the Cal-ITP Analytics Portfolio at [analysis.calitp.org](https://analysis.calitp.org).

We want all the content on our portfolio to be consistent. Below are guidelines for you to follow when creating the Jupyter Notebooks.

## Narrative

@@ -55,18 +62,18 @@ These are a set of principles to adhere to when writing the narrative content in

- Integers when referencing dates, times, etc

- 2020 for year, not 2020.0 (coerce to int64 or Int64 in `pandas`; Int64 are nullable integers, which allow for NaNs to appear alongside integers)
- 1 hr 20 min, not 1.33 hr (use best judgment to decide what's easier for readers to interpret)
- 2020 for year, not 2020.0. Coerce to int64 or Int64 in `pandas`; Int64 is the nullable integer type, which allows NaNs to appear alongside integers.
- 1 hr 20 min, not 1.33 hr. Use your best judgment to decide what's easier for readers to interpret.

- Round at the end of the analysis. Use best judgment to decide on significant digits. National Institutes of Health has a guide on [Rounding Rules](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4483789/#:~:text=Ideally%20data%20should%20be%20rounded,might%20call%20it%20Goldilocks%20rounding.&text=The%20European%20Association%20of%20Science,2%E2%80%933%20effective%20digits%E2%80%9D.).

- Too many decimal places give an air of precision that may not be present.
- Too few decimal places may not give enough detail to distinguish between categories or ranges.
- A good rule of thumb when deriving statistics (averages, percentiles) is to start with 1 more decimal place than is present in the other columns, and decide from there if you want to round further.
- An average of `$100,000.0` can simply be rounded to `$100,000`.
- An average of 5.2 mi might be left as is.
- An average of 5.2 miles might be left as is.

- Additional references: [American Psychological Association (APA) style](https://apastyle.apa.org/instructional-aids/numbers-statistics-guide.pdf) and [Purdue](https://owl.purdue.edu/owl/research_and_citation/apa_style/apa_formatting_and_style_guide/apa_numbers_statistics.html).
- Additional references we recommend are from the [American Psychological Association (APA)](https://apastyle.apa.org/instructional-aids/numbers-statistics-guide.pdf) and [Purdue University](https://owl.purdue.edu/owl/research_and_citation/apa_style/apa_formatting_and_style_guide/apa_numbers_statistics.html).
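
The year and duration guidance above can be sketched in `pandas` as follows. This is a minimal illustration rather than code from the repo: the column names and the `format_duration` helper are hypothetical.

```python
import pandas as pd

# Hypothetical data: years were read in as floats because one value is missing.
df = pd.DataFrame({"year": [2020.0, 2021.0, None], "trip_hours": [1.33, 0.5, 2.0]})

# Int64 (capital I) is the pandas nullable integer dtype, so the missing
# value is kept as <NA> while the others display as 2020, not 2020.0.
df["year"] = df["year"].astype("Int64")


def format_duration(hours: float) -> str:
    """Turn decimal hours into an 'H hr M min' string (hypothetical helper)."""
    total_minutes = round(hours * 60)
    hrs, mins = divmod(total_minutes, 60)
    return f"{hrs} hr {mins} min"


# 1.33 hours becomes "1 hr 20 min", which is easier for readers to interpret.
df["trip_duration"] = df["trip_hours"].apply(format_duration)
```

Keeping the numeric `trip_hours` column alongside the formatted string means later calculations still work on numbers, with formatting applied only for display.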

## Standard Names

@@ -89,7 +96,7 @@ These are a set of principles to adhere to when writing the narrative content in

## Accessibility

It's important to make our content as user-friendly as possible. Here are a few options to consider.
It's important to make our content as user-friendly as possible. Here are a few things to consider.

- Use a color palette that is color-blind friendly. There is no standard palette for now, so use your best judgement. There are many resources online such as [this one from the University of California, Santa Barbara](https://www.nceas.ucsb.edu/sites/default/files/2022-06/Colorblind%20Safe%20Color%20Schemes.pdf).
- Add tooltips to your visualizations so users can find more detail.
@@ -121,17 +128,26 @@ Markdown cells of the <i>H1</i> type creates the titles of our website, not the
- We can see that the [yml](https://github.com/cal-itp/data-analyses/blob/main/portfolio/sites/sb125_route_illustrations.yml) file lists the abbreviated county names as the parameter.
![example yml file](../assets/section4_image5.png)

- However, the titles and headers in the notebook are the fully spelled out conunty names.
- However, the titles and headers in the notebook are the fully spelled out county names.
![rendered notebook](../assets/section4_image3.png)

- This is due to the fact that the parameter is mapped to another variable in the [notebook](https://github.com/cal-itp/data-analyses/blob/main/sb125_analyses/path_examples_tttf4/path_examples.ipynb).
![notebook heading](../assets/section4_image4.png)

## Getting Ready for Parameterization
## Last Checks

Now that your notebook is styled appropriately, setting up your Jupyter Notebook to be parameterized and published to the portflio requires a few extra steps.
Your notebook is almost ready to be published. However, it never hurts to double-check your work. Here are some things to look over.

- All your values are formatted properly. Currencies should have $ and percentages should have %.
- The titles of your visualizations make sense and have the correct capitalizations.
- The legends of your visualizations are not cut off horizontally or vertically. If you have many values in your legend, Altair will truncate them.
- The values in your visualizations are sorted properly. For example, if you use the string column `month` with values January, February, etc. on the x-axis of an Altair chart, Altair will sort these months alphabetically.
- If you are displaying a pandas dataframe, consider styling it. You can look through the [pandas website](https://pandas.pydata.org/pandas-docs/stable/user_guide/style.html) for inspiration. We also have a [function](https://github.com/cal-itp/data-analyses/blob/main/_shared_utils/shared_utils/portfolio_utils.py#L96) in our `portfolio_utils` that styles and formats a dataframe.
- Look at your notebook(s) on your laptop versus a monitor.
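
A few of these checks can be illustrated with plain Python. This is a sketch under stated assumptions: the sample values are made up, month names are assumed to be in English (the default locale), and passing an explicit order list to the `sort` argument of an Altair encoding channel is one way to override the alphabetical default.

```python
import calendar

# Format values at display time so readers see units (sample values are made up).
average_cost = 100000.0
share_of_trips = 0.4567
cost_label = f"${average_cost:,.0f}"   # currency with $ and thousands separators
share_label = f"{share_of_trips:.1%}"  # percentage with % and one decimal place

# Sorting month names as plain strings gives alphabetical order, not calendar order.
months = ["January", "February", "March", "April"]
alphabetical = sorted(months)

# Build an explicit calendar order. A list like this can be passed as the
# `sort` argument of an Altair encoding, e.g. alt.X("month:N", sort=month_order).
month_order = list(calendar.month_name)[1:]  # index 0 is an empty string
in_calendar_order = sorted(months, key=month_order.index)
```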

## Getting Ready for Parameterization

The instructions below are also detailed in this [sample parameterized notebook here.](https://github.com/cal-itp/data-analyses/blob/main/starter_kit/parameterized_notebook.ipynb)
If you plan to rerun the same Jupyter Notebook over a set of different parameters, you need to set up your Jupyter Notebook in a particular way.

### Step 1: Packages to include

@@ -147,6 +163,7 @@ warnings.filterwarnings('ignore')
import calitp_data_analysis.magics
# All your other packages go here
```

### Capturing Parameters
20 changes: 8 additions & 12 deletions docs/publishing/sections/5_analytics_portfolio_site.md
@@ -2,18 +2,13 @@

# The Cal-ITP Analytics Portfolio

Depending on the complexity of your visualizations, you may want to produce
a full website composed of multiple notebooks and/or the same notebook that is rerun across different parameters.
For these situations, the [Jupyter Book-based](https://jupyterbook.org/en/stable/intro.html)
[publishing framework](https://github.com/cal-itp/data-analyses/tree/main/portfolio)
present in the data-analyses repo is your friend. You can find the Cal-ITP Analytics Portfolio at [analysis.calitp.org](https://analysis.calitp.org).

## Netlify Setup

Netlify is the platform that turns our Jupyter Notebooks uploaded to GitHub into a full website.

To set up your Netlify key:

- Ask in Slack/Teams for a Netlify key if you don't have one yet.
- Install netlify: `npm install -g netlify-cli`
- Navigate to your main directory
- Edit your bash profile using Nano:
@@ -40,9 +35,10 @@ In order to publish to analysis.calitp.org, you need to create two different fil

### README.md

Create a `README.md` file in the repo where your work lies. This serves to detail purpose of your website, methologies, relevant links, instructions, and more. However, this also forms the landing page of your website.
Create a `README.md` file in the repo where your work lies. This also forms the landing page of your website.

- Your file should <b>always</b> be titled as `README.md`. No other variants such as `README_gtfs.md` or `read me.md` or ` README.md` are allowed. The portfolio can only take a `README.md` when generating the landing page of your website.
- The `README.md` is the first thing the audience will see when they visit your website. Therefore, this page should contain content such as the goal of your work, the methodology you used, relevant links, and more. [Here](https://github.com/cal-itp/data-analyses/blob/main/portfolio/template_README.md) is a template for you to populate.
- If you do accidentally create a `README.md` file with extra strings, you can fix this by taking the following steps:
- `git rm portfolio/my_analysis/README_accidentally_named_something_else.md`
- `rm portfolio/my_analysis/_build/html/README_accidentally_named_something.html`. We use `rm` because the `_build/html` folder is not checked into GitHub.
@@ -160,9 +156,9 @@ After your Jupyter Notebook (refer to the previous section), `README.md`, and `.
### Deploy your Report
1. Make sure you are in the root of the data-analyses repo: `~/data-analyses`
1. Make sure you are in the root of the data-analyses repo: `~/data-analyses`.
2. Run `python portfolio/portfolio.py build my_report --deploy`
2. Run `python portfolio/portfolio.py build my_report --deploy`.
- By running `--deploy`, you are deploying the changes to display in the Analytics Portfolio.
- **Note:** The `my_report` will be replaced by the name of your `.yml` file in [data-analyses/portfolio/sites](https://github.com/cal-itp/data-analyses/tree/main/portfolio/sites).
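
Because the report name comes from the `.yml` file name, the build command can be derived from the file path. The sketch below only assembles the command as a list of strings. `build_command` is a hypothetical helper, not part of `portfolio.py`, and it assumes the report name is the `.yml` file name without its extension.

```python
from pathlib import Path


def build_command(site_yml: str, deploy: bool = True) -> list[str]:
    """Assemble the portfolio build command for a site .yml (hypothetical helper)."""
    report_name = Path(site_yml).stem  # e.g. "my_report" from "my_report.yml"
    command = ["python", "portfolio/portfolio.py", "build", report_name]
    # Choose between the --deploy and --no-deploy flags of the build command.
    command.append("--deploy" if deploy else "--no-deploy")
    return command


# Print the command to run from the root of the data-analyses repo.
print(" ".join(build_command("portfolio/sites/my_report.yml")))
```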
@@ -182,7 +178,7 @@ After your Jupyter Notebook (refer to the previous section), `README.md`, and `.
### Other Specifications
- You also have the option to specify after the initial `python portfolio/portfolio.py build my_report [specification goes here]`: run `python portfolio/portfolio.py build --help` to see the following options:
- You can also add options after the initial `python portfolio/portfolio.py build my_report [specification goes here]` command. Run `python portfolio/portfolio.py build --help` to see the following options:
- `--deploy / --no-deploy`
- deploy this component to netlify.
- `--prepare-only / --no-prepare-only`
@@ -199,9 +195,9 @@ After your Jupyter Notebook (refer to the previous section), `README.md`, and `.
### Adding to the Makefile
Another, more efficient way to write to the Analytics Portfolio is to use the Makefile and run
`make build_my_report -f Makefile` in data-analyses
`make build_my_report -f Makefile` in the `data-analyses` repo.
Example makefile in [`cal-itp/data-analyses`](https://github.com/cal-itp/data-analyses/blob/main/Makefile):
Here's an example makefile in [`cal-itp/data-analyses`](https://github.com/cal-itp/data-analyses/blob/main/Makefile):
```
build_my_reports:
