Expand cookbook #104

Open
11 of 29 tasks
jacobvjk opened this issue Sep 4, 2024 · 2 comments

@jacobvjk
Member

jacobvjk commented Sep 4, 2024

We currently have some documentation that will become part of the cookbook:

  • simple example of running the code
  • config setup
  • data dictionary

The final cookbook should serve as a central document from which users can implement the entire analysis and/or find pointers to other resources.

The suggested structure is:

  • Title page/Intro/Context
  • Preparatory steps
    • Which data sets are required as input?
      • Where/how to obtain external inputs
      • How to prepare own inputs (loan books and possibly sector classification)
      • Highlight which ones are required and which ones are optional
    • Installing relevant software
      • R, RStudio, this package, dependencies
    • Setting up the project
      • configuration file
      • Folder structure
  • Running the analysis
    • Basic flow of analysis
    • Matching and data prep
      • Matching the loan books
      • Manual matching process
      • Checking coverage and iterating in a smart way
      • misclassified loans
      • ABCD preparation
      • Optional: sector split
    • Run main analysis
      • Expected PACTA outputs
      • Expected Net Aggregate Alignment outputs
  • Basic interpretation of outputs
    • Data dictionary
    • individual output graphs (each graph gets one page, explaining the idea and how the values from the data dictionary map to the graph)
  • Advanced use cases
    • How to adjust the config and inputs to generate the results of interest
    • gather realistic use cases for the analysis and add one page each
@jacobvjk
Member Author

jacobvjk commented Sep 4, 2024

@cjyetman @jdhoffa I think in the end a cookbook that can be used as the main information hub for users to run P4S by themselves could have a structure like this. Feel free to suggest additions or flag aspects that you think do not belong here. I realize this is a lot of bullets, but in some cases several of them may be combined into a single page.

@jdhoffa
Member

jdhoffa commented Sep 9, 2024

I think the structure looks great! I have some comments (might be things you are already thinking of, but good to have them explicit!). Also not necessarily saying each of these comments is strictly necessary, just brainstorming 😄

I'll start with General suggestions, and then more specific suggestions per section.

General suggestions

Modularity
Consider making each section as modular as possible, so users can jump to the section they need without following the entire document in order.

Troubleshooting and FAQs
Each top-level section (e.g., setup, running analysis, outputs) could end with a “Common Issues” or “FAQ” section to anticipate user difficulties.

Glossary
A glossary of technical terms might be useful for less-experienced users.

Resources
Provide links to other external resources and documentation, including the main PACTA website.

Specific suggestions

Title page/Intro/Context

  • Should clearly communicate the purpose of the tool and its overall capabilities.
  • Consider briefly outlining the audience (who is this for?) and why the tool is valuable for them.
  • A high-level diagram or flowchart showing how different stages in the analysis interact might help contextualize the later sections.

Preparatory steps

  • It would be helpful to provide a concise "checklist" of the required software (e.g., R, RStudio, dependencies) before diving into detailed instructions.
  • Minimum versions of software dependencies should be defined/noted to avoid compatibility issues.

Data Input

  • Clarify the format of the input data (e.g., .csv, .xlsx) and any specific formatting requirements.
  • For external inputs, adding instructions on how to validate or clean the data before using it in the tool could prevent user errors later on.
  • The distinction between required and optional datasets should be visually obvious, perhaps using different sections, icons, or a color-coded table.
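As a rough illustration of the kind of pre-flight validation the cookbook could describe, here is a minimal sketch that checks a CSV header for required columns. It is written in Python purely for illustration (the package itself is in R), and the column names are hypothetical placeholders; the real schema would come from the data dictionary.

```python
import csv
import io

# Hypothetical required columns, for illustration only; the actual
# loan book schema would be taken from the data dictionary.
REQUIRED_COLUMNS = {"id_loan", "name_borrower", "sector"}

def missing_columns(csv_text, required=REQUIRED_COLUMNS):
    """Return the sorted list of required columns absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = set(reader.fieldnames or [])
    return sorted(required - header)

# A tiny made-up loan book that lacks the "sector" column.
sample = "id_loan,name_borrower\nL1,Example Corp\n"
print(missing_columns(sample))
```

A check like this could run before matching, so that users see a clear message about missing columns instead of a failure deep inside the analysis.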

Installing relevant software

  • A step-by-step approach is essential here. Users appreciate precise details, so specifying terminal commands (if applicable) and any known issues/troubleshooting steps could make this section more comprehensive.
  • It is better to link or refer to pre-existing documentation here (e.g. for installing RStudio), where available, to avoid duplicating information.

Setting up the project

  • Clarity is key in this critical section. If the configuration file or folder structure is complex, providing examples or templates will help the user.
  • Consider including a visual representation of the folder structure (e.g. generated with the tree CLI), so the user can clearly see where different files should be located.
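To show what such a visual could look like, here is a small sketch that renders a tree-style listing from a nested dictionary. The folder and file names (config.yml, input, output, loanbook.csv) are hypothetical and only stand in for whatever layout the cookbook ends up recommending; Python is used purely for illustration.

```python
def render_tree(tree, prefix=""):
    """Yield tree-style lines for a nested dict; None or {} means no children."""
    entries = list(tree.items())
    for i, (name, sub) in enumerate(entries):
        last = i == len(entries) - 1
        yield prefix + ("└── " if last else "├── ") + name
        if sub:
            # Indent children; keep the vertical bar if siblings follow.
            yield from render_tree(sub, prefix + ("    " if last else "│   "))

# Hypothetical project layout, for illustration only.
layout = {
    "config.yml": None,
    "input": {"loanbook.csv": None},
    "output": {},
}
lines = ["project/"] + list(render_tree(layout))
print("\n".join(lines))
```

The output resembles what the tree CLI produces, which makes it easy to paste into the cookbook as a plain-text block.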

Running the analysis

  • "Basic flow of analysis" sounds general. Consider breaking this down into smaller sub-tasks and using numbered steps or a flowchart to show dependencies between tasks (e.g., Matching -> Data Prep -> Main Analysis).
  • Including a “Troubleshooting” section here would help users who encounter issues, especially in matching data, loan book misclassification, and iteration processes.
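The dependency idea (Matching -> Data Prep -> Main Analysis) can be sketched as a small dependency graph that is resolved into numbered steps. This is only an illustration in Python using the standard-library graphlib module (Python 3.9+); the step names are taken from the example above, and a real flow would likely have more nodes.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it depends on.
steps = {
    "Matching": set(),
    "Data Prep": {"Matching"},
    "Main Analysis": {"Data Prep"},
}

# Resolve the graph into an order that respects all dependencies.
order = list(TopologicalSorter(steps).static_order())
for n, step in enumerate(order, start=1):
    print(f"{n}. {step}")
```

Keeping the dependencies in one structure like this would let the numbered list and any flowchart in the cookbook be derived from the same source.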

Expected Outputs

  • In the "Expected PACTA outputs" and "Net Aggregate Alignment outputs" sections, example outputs (with explanations of each column/graph) will be crucial. Users will benefit from screenshots or generated output samples.
  • Cross-reference the data dictionary as much as possible.

Basic interpretation of outputs

  • This section should provide simple, understandable interpretations of the results for both non-technical and technical audiences. Ensure that common questions or confusions around the outputs are addressed here.
  • If possible, include practical examples of what decisions or next steps might be taken based on different output scenarios.

Advanced use cases

  • I’d recommend starting each use case with a problem statement or scenario and showing step-by-step how to adjust the config/inputs to achieve the desired result.
