Try the app now: https://casual-competitor-susbk7aqfls5uxswyrp2qz.streamlit.app/
No installation required! Just upload your CSV file and start discovering causal relationships.
Watch the app in action! The demo video (Streamlit.-.Brave.2025-07-31.14-10-02.mp4) shows how to upload data, configure analysis, and discover causal relationships.
How to add your Google Drive video
- Upload your demo video to Google Drive
- Right-click the video → "Get shareable link"
- Make sure it's set to "Anyone with the link can view"
- Copy the file ID from the URL (the long string between /d/ and /view)
- Replace YOUR_GOOGLE_DRIVE_FILE_ID in the link above with your actual file ID
Example: If your Google Drive link is:
https://drive.google.com/file/d/1ABC123xyz789DEF456/view?usp=sharing
Then your file ID is: 1ABC123xyz789DEF456
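If you would rather not pick the ID out of the URL by hand, the rule above (take the string between /d/ and /view) can be sketched in a few lines of Python. The function name `extract_drive_file_id` is made up for this illustration and is not part of the app.

```python
import re
from typing import Optional

# Matches the path segment that follows "/d/" in a Google Drive file URL.
_DRIVE_ID_PATTERN = re.compile(r"/d/([A-Za-z0-9_-]+)")


def extract_drive_file_id(url: str) -> Optional[str]:
    """Return the file ID from a Drive share link, or None if no ID is present."""
    match = _DRIVE_ID_PATTERN.search(url)
    return match.group(1) if match else None


link = "https://drive.google.com/file/d/1ABC123xyz789DEF456/view?usp=sharing"
print(extract_drive_file_id(link))  # 1ABC123xyz789DEF456
```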
Alternative display option (an embedded player; note that GitHub removes iframe tags when it renders a README, so this works only on hosts that allow iframes):
<div align="center">
<iframe src="https://drive.google.com/file/d/1xcQMgNFicwbHpBuE5N1mBUsf5P5VdoEJ/view?usp=sharing"
width="640" height="360"
allow="autoplay">
</iframe>
</div>

- Data Upload: How to upload and preview your CSV dataset
- Variable Selection: Choosing treatment and outcome variables
- Causal Graph: Viewing the generated causal relationship graph
- Inference Results: Understanding the causal effect estimates
- Error Handling: What to do when issues arise
- Real Example: Complete walkthrough with sample data
A Streamlit web application for discovering causal relationships in your data using Microsoft's DoWhy library. This tool helps you identify and quantify causal effects between variables in your datasets through correlation-based graph discovery and rigorous causal inference.
- Causal Graph Discovery: Automatically generates causal graphs using correlation-based analysis
- Causal Inference: Estimates causal effects using DoWhy's backdoor identification methods
- Interactive Web Interface: User-friendly Streamlit interface with real-time analysis
- Data Preprocessing: Automatic data cleaning, encoding, and preprocessing pipeline
- Fast Mode: Optimized performance for large datasets
- Traditional Analysis: Optional statistical analysis with correlation matrices and visualizations
- Error Handling: Comprehensive error handling with helpful debugging information
- Python 3.8+
- Virtual environment (recommended)
Simply visit: https://casual-competitor-susbk7aqfls5uxswyrp2qz.streamlit.app/
Upload your CSV file and start analyzing causal relationships immediately!
- Clone the repository:
git clone https://github.com/Krishnadev-cmd/Casual-Competitor.git
cd Casual-Competitor
- Create and activate virtual environment:
# Windows
python -m venv venv_py
venv_py\Scripts\activate
# Linux/Mac
python -m venv venv_py
source venv_py/bin/activate
- Install dependencies:
pip install -r requirements.txt
- Run locally:
streamlit run src/app.py

- streamlit: Web application framework
- dowhy: Microsoft's causal inference library
- pandas: Data manipulation and analysis
- numpy: Numerical computing
- matplotlib: Plotting library
- seaborn: Statistical data visualization
- networkx: Graph analysis
- scikit-learn: Machine learning utilities
- pydot: Graph visualization (requires Graphviz)
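To confirm that a local environment has everything listed above, a small check like the following may help. It is only a sketch: the import names are the usual ones for these packages (scikit-learn imports as `sklearn`), and looking for the `dot` executable is one assumed way to detect a Graphviz installation.

```python
import importlib.util
import shutil
from typing import Dict

# Import names for the packages listed above (scikit-learn imports as "sklearn").
REQUIRED_MODULES = [
    "streamlit", "dowhy", "pandas", "numpy", "matplotlib",
    "seaborn", "networkx", "sklearn", "pydot",
]


def check_environment() -> Dict[str, bool]:
    """Report which modules are importable and whether Graphviz's `dot` is on PATH."""
    status = {name: importlib.util.find_spec(name) is not None
              for name in REQUIRED_MODULES}
    status["graphviz (dot executable)"] = shutil.which("dot") is not None
    return status


for name, ok in check_environment().items():
    print(f"{name:28s} {'OK' if ok else 'MISSING'}")
```

Anything reported as MISSING can usually be fixed by re-running the `pip install -r requirements.txt` step, or by installing Graphviz through your system package manager.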
- Visit: https://casual-competitor-susbk7aqfls5uxswyrp2qz.streamlit.app/
- Upload your CSV file using the sidebar
- Configure your analysis (see steps below)
- Run and explore the results!
- Start the application:
streamlit run src/app.py
- Open your browser and navigate to http://localhost:8501
- Upload a CSV file
- Specify treatment variable (what you want to change)
- Specify outcome variable (what you want to predict)
- Choose whether to show dataset analysis
- Enable fast mode for large datasets
- Run the analysis and explore the results!
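If you have no dataset at hand, a synthetic CSV with a known effect is a convenient way to try the steps above. The sketch below is not one of the bundled sample files; the column names (`holiday_season`, `marketing_spend`, `sales`) and the coefficients are invented for illustration. It writes to the system temp directory and prints the path so you can upload the file.

```python
import csv
import os
import random
import tempfile


def write_sample_csv(path: str, n_rows: int = 300, seed: int = 0) -> None:
    """Write synthetic rows where sales = 3 * spend + 20 * holiday + noise."""
    rng = random.Random(seed)
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["holiday_season", "marketing_spend", "sales"])
        for _ in range(n_rows):
            holiday = rng.randint(0, 1)                          # confounder
            spend = 10 + 5 * holiday + rng.gauss(0, 2)           # treatment
            sales = 3 * spend + 20 * holiday + rng.gauss(0, 5)   # outcome
            writer.writerow([holiday, round(spend, 2), round(sales, 2)])


sample_path = os.path.join(tempfile.gettempdir(), "sample_marketing.csv")
write_sample_csv(sample_path)
print(f"Wrote {sample_path}")
```

With this file, `marketing_spend` would be the treatment and `sales` the outcome; because the data are generated with a coefficient of 3 on spend, an estimate far from that value suggests the confounder was not adjusted for.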
- Automatic ID column removal
- Missing value imputation
- Categorical variable encoding
- Feature scaling and normalization
- Date parsing and feature extraction
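To make the preprocessing steps above concrete, here is a minimal standard-library sketch of four of them (ID removal, imputation, encoding, scaling). It is not the code in `Preprocess.py`; the function names, the "name ends in id and all values unique" heuristic, mean/mode imputation, label encoding, and min-max scaling are all assumptions chosen for illustration. Date parsing is left out.

```python
from statistics import mean
from typing import Dict, List


def looks_like_id(name: str, values: list) -> bool:
    """Assumed heuristic: the name ends in 'id' and every value is unique."""
    return name.lower().endswith("id") and len(set(values)) == len(values)


def preprocess(columns: Dict[str, list]) -> Dict[str, List[float]]:
    cleaned = {}
    for name, values in columns.items():
        if looks_like_id(name, values):
            continue  # 1. drop ID-like columns
        present = [v for v in values if v is not None]
        if all(isinstance(v, (int, float)) for v in present):
            # 2. impute missing numbers with the column mean
            fill = mean(present)
            numeric = [float(fill if v is None else v) for v in values]
        else:
            # 3. label-encode categories; missing values take the most common label
            labels = sorted(set(present))
            fill = max(labels, key=present.count)
            numeric = [float(labels.index(fill if v is None else v)) for v in values]
        # 4. min-max scale to [0, 1]; constant columns become 0.0
        low, high = min(numeric), max(numeric)
        span = high - low
        cleaned[name] = [(x - low) / span if span else 0.0 for x in numeric]
    return cleaned


raw = {
    "order_id": [101, 102, 103, 104],
    "price": [10.0, None, 30.0, 20.0],
    "region": ["north", "south", None, "south"],
}
print(preprocess(raw))
```

Here `order_id` is dropped, the missing price is filled with the mean (20.0) before scaling, and the missing region is filled with "south", the most frequent label.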
- Correlation-based edge detection
- Automatic DAG (Directed Acyclic Graph) construction
- Treatment → Outcome path guarantee
- Threshold-based edge filtering (correlation > 0.3)
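The graph-discovery steps above can be sketched as follows. This is an illustration under stated assumptions, not the logic in `doWhy_utils.py`: it compares the absolute Pearson correlation against the 0.3 threshold, and it keeps the graph acyclic by pointing every edge from earlier to later in a fixed order (other variables, then treatment, then outcome). How the app actually orients edges is not described here, so treat that ordering as hypothetical.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def build_graph(data: Dict[str, List[float]], treatment: str, outcome: str,
                threshold: float = 0.3) -> Set[Edge]:
    # Assumed ordering: other variables first, then treatment, then outcome.
    # Edges only ever point forward in this order, so no cycle can form.
    order = [c for c in data if c not in (treatment, outcome)] + [treatment, outcome]
    edges = {(a, b) for a, b in combinations(order, 2)
             if abs(pearson(data[a], data[b])) > threshold}
    edges.add((treatment, outcome))  # treatment -> outcome path guarantee
    return edges


data = {
    "season": [0, 0, 0, 1, 1, 1, 0, 1],
    "noise": [1, -1, 0, 0, 1, -1, 0, 0],
    "marketing_spend": [10, 11, 9, 15, 16, 14, 10, 15],
    "sales": [30, 33, 28, 60, 65, 58, 31, 62],
}
for edge in sorted(build_graph(data, "marketing_spend", "sales")):
    print(edge)
```

In this toy data `noise` is only weakly correlated with the other columns, so it ends up with no edges, while `season` points at both the treatment and the outcome and therefore acts as a common cause to adjust for.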
- DoWhy's backdoor identification method
- Linear regression estimation
- Placebo treatment refutation tests
- Statistical significance testing
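The app delegates these steps to DoWhy. To show the idea without that dependency, the sketch below does a backdoor-style adjustment for a single confounder with plain linear regression, then runs a placebo check by shuffling the treatment. It is a stand-in, not DoWhy's implementation: the adjusted coefficient is computed by residualizing both treatment and outcome on the confounder (the Frisch-Waugh-Lovell approach), and all names and data-generating coefficients are invented for the example.

```python
import random
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, List[float]]:
    """OLS of y on x with an intercept; returns (slope, residuals)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    residuals = [(b - my) - slope * (a - mx) for a, b in zip(x, y)]
    return slope, residuals


def adjusted_effect(treatment: List[float], outcome: List[float],
                    confounder: List[float]) -> float:
    """Coefficient on treatment in the regression outcome ~ treatment + confounder."""
    _, t_res = fit_line(confounder, treatment)
    _, y_res = fit_line(confounder, outcome)
    effect, _ = fit_line(t_res, y_res)
    return effect


# Synthetic data with a true effect of 3 and a confounder that drives both variables.
rng = random.Random(42)
z = [rng.gauss(0, 1) for _ in range(2000)]
t = [2 * zi + rng.gauss(0, 1) for zi in z]
y = [3 * ti + 5 * zi + rng.gauss(0, 1) for ti, zi in zip(t, z)]

naive, _ = fit_line(t, y)          # ignores the confounder, so it is biased upward
adjusted = adjusted_effect(t, y, z)

placebo_t = t[:]
rng.shuffle(placebo_t)             # placebo: treatment values no longer match their rows
placebo = adjusted_effect(placebo_t, y, z)

print(f"naive={naive:.2f} adjusted={adjusted:.2f} placebo={placebo:.2f}")
```

The adjusted estimate should land near the true value of 3, the naive one noticeably higher, and the placebo estimate near zero. A placebo effect that is clearly non-zero is a sign the estimate is picking up something other than the treatment.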
Casual_Competitor/
├── src/
│   ├── app.py              # Main Streamlit application
│   ├── doWhy_utils.py      # DoWhy integration utilities
│   ├── Preprocess.py       # Data preprocessing pipeline
│   ├── Utils.py            # Helper functions
│   └── __pycache__/        # Python cache files
├── data/                   # Sample datasets
│   ├── Auto Sales data.csv
│   ├── bakery_Sales.csv
│   ├── big_mart_sales.csv
│   ├── Electronic_sales_Sep2023-Sep2024.csv
│   ├── retail_data.csv
│   └── Video_Games_Sales_as_at_22_Dec_2016.csv
├── venv_py/                # Virtual environment
├── requirements.txt        # Python dependencies
└── README.md               # This file
- Dataset Analysis: Show/hide traditional statistical analysis
- Preprocessing: Specify if your dataset is already preprocessed
- Fast Mode: Enable for faster processing on large datasets
- Treatment Variable: The variable you want to manipulate
- Outcome Variable: The variable you want to predict/analyze
- Fast Mode: Recommended for datasets with >500 rows or >15 columns
- Preprocessing: Skip if your data is already clean and encoded
- Variable Selection: Choose variables with clear causal relationships
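The Fast Mode rule of thumb above is easy to check before uploading. The helper names below are hypothetical and simply restate the thresholds from the tip (more than 500 rows or more than 15 columns).

```python
import csv
import io
from typing import Tuple


def csv_shape(text: str) -> Tuple[int, int]:
    """Return (data rows, columns) for CSV text whose first line is a header."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return 0, 0
    return len(rows) - 1, len(rows[0])


def recommend_fast_mode(n_rows: int, n_columns: int) -> bool:
    """Rule of thumb from the tips: more than 500 rows or more than 15 columns."""
    return n_rows > 500 or n_columns > 15


n_rows, n_columns = csv_shape("price,sales\n10,100\n12,90\n")
print(n_rows, n_columns, recommend_fast_mode(n_rows, n_columns))  # 2 2 False
```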
- Marketing Analysis:
  - Treatment: Marketing spend
  - Outcome: Sales revenue
  - Discover: How marketing investment affects sales
- Pricing Strategy:
  - Treatment: Product price
  - Outcome: Customer demand
  - Discover: Price elasticity effects
- A/B Testing:
  - Treatment: Feature flag (0/1)
  - Outcome: User engagement
  - Discover: Feature impact on user behavior
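For the A/B testing case, a quick sanity check on the app's estimate is the plain difference in group means: with a single 0/1 regressor, the linear regression slope equals that difference. The sketch below uses made-up engagement numbers, and it only reads as a causal effect if the flag was assigned at random.

```python
from statistics import mean
from typing import List


def difference_in_means(flag: List[int], outcome: List[float]) -> float:
    """Mean outcome of the treated group (flag == 1) minus the control group (flag == 0)."""
    treated = [y for f, y in zip(flag, outcome) if f == 1]
    control = [y for f, y in zip(flag, outcome) if f == 0]
    return mean(treated) - mean(control)


feature_flag = [0, 0, 0, 0, 1, 1, 1, 1]
engagement = [4.0, 5.0, 6.0, 5.0, 7.0, 8.0, 6.0, 7.0]
print(difference_in_means(feature_flag, engagement))  # 2.0
```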
- Causation vs. Correlation: This tool identifies potential causal relationships, but domain expertise is still required for interpretation
- Data Quality: Results are only as good as your input data - ensure clean, relevant datasets
- Sample Size: Larger datasets generally produce more reliable causal estimates
- Variable Selection: Choose treatment and outcome variables with theoretical causal relationships
- "Variable not found": Check that your treatment/outcome variables exactly match column names
- "Graph contains cycles": Enable fast mode or check for circular relationships in your data
- Zero causal effect: May indicate no causal relationship or insufficient data signal
- Visualization errors: Ensure Graphviz is installed for graph plotting
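The "Variable not found" case is usually a capitalization or spelling mismatch. A check along these lines can surface the closest column names before you run the analysis; `check_variable` is a hypothetical helper, not a function from the app, and the 0.6 similarity cutoff is an arbitrary choice.

```python
import difflib
from typing import List


def check_variable(name: str, columns: List[str]) -> str:
    """Say whether `name` is an exact column match and suggest near misses if not."""
    if name in columns:
        return f"'{name}' found"
    suggestions = difflib.get_close_matches(name, columns, n=3, cutoff=0.6)
    hint = f"; did you mean {suggestions}?" if suggestions else ""
    return f"'{name}' not found{hint}"


columns = ["Marketing_Spend", "Sales", "Region"]
print(check_variable("Sales", columns))
print(check_variable("marketing_spend", columns))
```

Matching is case-sensitive on purpose, mirroring the requirement that the treatment and outcome names match the column names exactly.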
The app provides comprehensive debug information including:
- Graph structure visualization
- Available column names
- Raw inference results
- Error details and suggestions
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Microsoft DoWhy: For the powerful causal inference framework
- Streamlit: For the excellent web application framework
- NetworkX: For graph analysis capabilities
- Pandas & NumPy: For data manipulation foundations
If you encounter any issues or have questions:
- Check the troubleshooting section above
- Look at the debug information in the app
- Open an issue on GitHub
- Review the DoWhy documentation for advanced usage
- Support for more causal discovery algorithms
- Advanced visualization options
- Export functionality for results
- Batch processing capabilities
- Integration with more causal inference methods
- Real-time data streaming support
Happy Causal Discovery!