Equations to define causal graph including causal mechanism #1066

bhatt-priyadutt
Contributor

  • This feature enables users to write a custom equation for each node and get back a causal model with the corresponding causal mechanisms assigned.
  • It is mainly intended for cases where the functional relationship between nodes is known, and it lets the user specify that relationship in equation form, as demonstrated below:
X = empirical()
Y = 12*exp(X) + halfnorm()
Z = 3*Y + empirical()
  • List of supported functions for specifying parent-child relationships: here

  • List of supported functions for specifying noise models: here
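
As a rough illustration of what the three example equations express, here is a minimal sketch in plain Python. It does not use the API added in this PR: `sample_from_equations`, `observed_x`, and `observed_z_noise` are hypothetical names, `empirical()` is assumed to mean resampling from previously observed values, and `halfnorm()` is taken as the absolute value of a standard normal draw.

```python
import math
import random


def sample_from_equations(observed_x, observed_z_noise, n_samples, seed=0):
    """Draw samples following
        X = empirical()
        Y = 12*exp(X) + halfnorm()
        Z = 3*Y + empirical()
    Hypothetical helper for illustration only, not part of DoWhy.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        # empirical(): assumed to resample uniformly from observed values.
        x = rng.choice(observed_x)
        # halfnorm(): absolute value of a standard normal draw.
        y = 12 * math.exp(x) + abs(rng.gauss(0.0, 1.0))
        # The noise of Z is again resampled from observed noise values.
        z = 3 * y + rng.choice(observed_z_noise)
        samples.append({"X": x, "Y": y, "Z": z})
    return samples


samples = sample_from_equations([0.1, 0.2, 0.3], [-0.5, 0.0, 0.5], n_samples=5)
```

Each node is computed from its parents plus its own noise term, which is the structure the equation syntax is meant to capture.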

@bhatt-priyadutt
Contributor Author

Some tasks are still remaining, such as writing test cases, packaging the newly used libraries, and handling disconnected nodes.

@bhatt-priyadutt bhatt-priyadutt marked this pull request as draft November 7, 2023 19:04
amit-sharma and others added 26 commits November 10, 2023 10:08
* fixed frontdoor bug

Signed-off-by: Amit Sharma <amit_sharma@live.com>

* fixed formatting issues

Signed-off-by: Amit Sharma <amit_sharma@live.com>

---------

Signed-off-by: Amit Sharma <amit_sharma@live.com>
This should work better with multivariate data and mixed data types. However, it is generally slower than the knn approach.

Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
Before, when creating a linear regressor with fixed parameters, these parameters were overridden when the model was fit to data. Now, the parameters remain fixed.

Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
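
The fixed-parameter behavior described in the commit above can be sketched roughly as follows. `FixedOrFittedLine` is a hypothetical single-feature class written only for this illustration; it is not the regressor from the GCM module and makes no claim about how that one is implemented.

```python
from statistics import fmean


class FixedOrFittedLine:
    """Hypothetical model y = intercept + slope * x (illustration only)."""

    def __init__(self, slope=None, intercept=None):
        self.slope = slope
        self.intercept = intercept
        # Parameters supplied by the caller are treated as fixed.
        self._fixed = slope is not None and intercept is not None

    def fit(self, xs, ys):
        if self._fixed:
            # Keep the user-supplied parameters instead of overriding them.
            return self
        # Ordinary least squares for a single feature.
        mean_x, mean_y = fmean(xs), fmean(ys)
        var_x = sum((x - mean_x) ** 2 for x in xs)
        cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        self.slope = cov_xy / var_x
        self.intercept = mean_y - self.slope * mean_x
        return self

    def predict(self, xs):
        return [self.intercept + self.slope * x for x in xs]


xs, ys = [0.0, 1.0, 2.0], [1.0, 3.0, 5.0]
fitted = FixedOrFittedLine().fit(xs, ys)
fixed = FixedOrFittedLine(slope=5.0, intercept=0.0).fit(xs, ys)
```

In this sketch `fitted` learns its parameters from the data, while `fixed` keeps the values it was constructed with.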
This change aims at providing a better overview of the notebooks by displaying them as separate cards instead of a card carousel.

Other changes:
- Introductory examples and real-world-inspired examples are now more prominent, with individual images and a grid layout of 2 per row.
- All other examples are now in a grid layout with 3 examples per row.
- Clear outputs of some notebooks.
- Fix issue with rendering counterfactual example notebook.

Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
Also slightly change citation hint.

Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
The build sometimes randomly fails due to a timeout in the unit tests of the unit change methods of the GCM module. While this only happens in the GitHub builds, it is most likely due to the parallelization of the underlying RandomForestRegressors being fitted.

Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
This module adds a new method for evaluating a fitted gcm. Here, we evaluate the performance of causal mechanisms, the underlying modeling assumptions (if possible), the goodness of the generated joint distribution and the graph structure. This utilizes some of the existing methods, but also introduces new ones.

This further adds a new user guide and notebook entries demonstrating the usage.

Introducing the module required some changes in other modules and implementations, which are mostly fixes and improvements.

Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
These methods are now available in the feature_relevance.py module.

Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
Before, the method threw an error when all samples were equal. However, in these cases, it should rather return a KL divergence of 0.

Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
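
To make the degenerate case from the commit above concrete, here is a minimal sketch. It assumes a simple Gaussian approximation of both sample sets, which is a stand-in chosen for brevity and not the estimator used in the GCM module; `gaussian_kl` is a hypothetical name.

```python
import math
from statistics import fmean, pvariance


def gaussian_kl(samples_p, samples_q):
    """KL(P || Q), assuming both sample sets are Gaussian (illustration only)."""
    # Degenerate case: every sample in both sets has the same value, so both
    # empirical distributions coincide and the divergence is 0. Without this
    # guard, the zero variances below would raise a ZeroDivisionError.
    if len(set(samples_p) | set(samples_q)) == 1:
        return 0.0
    mu_p, mu_q = fmean(samples_p), fmean(samples_q)
    var_p, var_q = pvariance(samples_p), pvariance(samples_q)
    # Closed-form KL divergence between two univariate Gaussians.
    return 0.5 * (math.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)


kl_equal = gaussian_kl([2.0] * 5, [2.0] * 5)
kl_shifted = gaussian_kl([0.0, 1.0, 2.0], [1.0, 2.0, 3.0])
```

The guard only covers the case named in the commit message; a set with zero variance that differs from the other set would still need separate handling.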
NaN values are now correctly counted when estimating the anomaly score.

Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
This is an updated and slightly modified version of the blog post: https://aws.amazon.com/blogs/opensource/root-cause-analysis-with-dowhy-an-open-source-python-library-for-causal-machine-learning/

Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
Bumps [actions/github-script](https://github.com/actions/github-script) from 6 to 7.
- [Release notes](https://github.com/actions/github-script/releases)
- [Commits](actions/github-script@v6...v7)

---
updated-dependencies:
- dependency-name: actions/github-script
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Before, the scorer was not able to handle numpy object types directly. However, GCM often uses the object dtype to ensure support of mixing categorical and float values. This fixes the handling of object dtypes by explicitly converting them to floats first.

Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
If the confidence intervals are misspecified, e.g., the lower bound is greater than the upper bound, the method previously threw an error. This, however, can sometimes happen due to precision errors in some algorithms and lead to random build failures. This change fixes the issue and ignores invalid intervals accordingly.

Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
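
Ignoring invalid intervals, as described in the commit above, could look roughly like this. `keep_valid_intervals` is a hypothetical helper for illustration and is not taken from the DoWhy code base.

```python
def keep_valid_intervals(intervals):
    """Return only the (lower, upper) pairs whose lower bound does not exceed the upper bound."""
    return [(lower, upper) for lower, upper in intervals if lower <= upper]


# The second interval mimics a precision error: its lower bound is
# marginally larger than its upper bound.
intervals = [(0.1, 0.4), (0.3000000001, 0.3), (-1.0, 1.0)]
valid = keep_valid_intervals(intervals)
```

Filtering instead of raising keeps a single numerically inverted interval from failing the whole computation.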
Before, the Support Vector Classifier did not produce probabilities, which are required for different algorithms in the GCM module. This changes the 'probability' parameter to True.

Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
In addition to CRPS and depending on the node data type, it now also reports the MSE, NMSE, R2 and F1 score.

Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
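
For readers unfamiliar with the metrics named in the commit above, the following sketch shows the textbook definitions of MSE, R2, and the binary F1 score in plain Python. The function names are hypothetical, NMSE is left out because its normalization convention is not stated here, and the evaluation module may compute these quantities differently.

```python
from statistics import fmean


def mse(y_true, y_pred):
    """Mean squared error."""
    return fmean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumes non-constant y_true)."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def f1_binary(y_true, y_pred, positive=1):
    """F1 score for one positive class: 2*TP / (2*TP + FP + FN)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return 2 * tp / (2 * tp + fp + fn)


mse_value = mse([1, 2, 3, 4], [1, 2, 3, 5])
r2_value = r2([1, 2, 3, 4], [1, 2, 3, 5])
f1_value = f1_binary([1, 0, 1, 1], [1, 0, 0, 1])
```

MSE and R2 apply to continuous nodes, while F1 applies to categorical nodes, which matches the commit's note that the reported metrics depend on the node data type.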
…nML estimators (py-why#1061)

* auto identify the effect modifier columns

Signed-off-by: Amit Sharma <amit_sharma@live.com>

* fixed formatting errors

Signed-off-by: Amit Sharma <amit_sharma@live.com>

---------

Signed-off-by: Amit Sharma <amit_sharma@live.com>
…y-why#943)

* Deprecate CausalGraph

The effect estimation API is now based on a functional API that expects a networkx graph as input.

- The graph should now be defined via a networkx graph. Most identification methods now expect an additional "observed_nodes" parameter accordingly.
- CausalModel and CausalGraph still exist and should be compatible with the old API.

---------

Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
Signed-off-by: Amit Sharma <amit_sharma@live.com>
Co-authored-by: Amit Sharma <amit_sharma@live.com>
bloebp and others added 29 commits December 1, 2023 06:58
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
- Slightly update and revise existing GCM notebooks
- Moving mediation analysis, direct arrow strength and ICC to their own "Quantify Causal Influence" section
- Adding brief overview to describe differences between the quantification methods
- Change navigation image to reflect newest changes
- Adding related notebooks links to some of the causal task entries
- Adding a direct arrow strength example to the ICC notebook
- Adding a brief overview of the available root cause analysis and explanation methods
- Smaller revision of other GCM entries, such as the basic example
- Fixes for smaller typos and missing references

Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
…ed_conditional_estimates is True (py-why#1092)

* fixed bug where CATE is not returned by lr

Signed-off-by: Amit Sharma <amit_sharma@live.com>

* added test

Signed-off-by: Amit Sharma <amit_sharma@live.com>

* formatted file

Signed-off-by: Amit Sharma <amit_sharma@live.com>

---------

Signed-off-by: Amit Sharma <amit_sharma@live.com>
)

* fixed frontdoor bug and added tests

Signed-off-by: Amit Sharma <amit_sharma@live.com>

* updated docstring

Signed-off-by: Amit Sharma <amit_sharma@live.com>

* reformatted file

Signed-off-by: Amit Sharma <amit_sharma@live.com>

---------

Signed-off-by: Amit Sharma <amit_sharma@live.com>
* linked to up-to-date list of estimators

Signed-off-by: Amit Sharma <amit_sharma@live.com>

* updated docs

Signed-off-by: Amit Sharma <amit_sharma@live.com>

* using absolute paths

Signed-off-by: Amit Sharma <amit_sharma@live.com>

---------

Signed-off-by: Amit Sharma <amit_sharma@live.com>
…hy#1091)

* removed deepiv and updated flaky test

Signed-off-by: Amit Sharma <amit_sharma@live.com>

* black reformatting

Signed-off-by: Amit Sharma <amit_sharma@live.com>

* removed all outputs from nb

Signed-off-by: Amit Sharma <amit_sharma@live.com>

---------

Signed-off-by: Amit Sharma <amit_sharma@live.com>
It no longer raises a division by zero error. Other changes:
- Add new parameter indicating whether the method requires data for all nodes in the graph or also allows a subset of data.
- If no tests were performed, the summary now returns "Cannot be evaluated".

Signed-off-by: Patrick Bloebaum <bloebp@amazon.com>
…' into equations-to-define-causal-graph

# Conflicts:
#	dowhy/gcm/__init__.py
#	dowhy/gcm/causal_models.py
3 participants