A multi-source, RAG-enhanced fake news detection system using ML models, retrieval-augmented generation, and AI-powered analysis. Combines news API verification, knowledge base retrieval, and advanced analytics in a Streamlit web app.
- Multi-API News Verification: Checks news existence across NewsAPI, GNews, CurrentsAPI, ContextualWeb, and Google Fact Check.
- RAG Knowledge Base: Ingests facts from datasets and APIs, supports semantic retrieval (ChromaDB + SentenceTransformers, TF-IDF fallback).
- ML Model Analysis: Uses multiple trained models (Naive Bayes, Logistic Regression, Random Forest, CatBoost) for prediction.
- AI Assessment: Integrates Gemini AI for deep analysis.
- Content Validation: Ensures only news-like content is analyzed.
- Bulk Dataset Ingestion: Supports streaming ingestion from large CSVs (True.csv, Fake.csv).
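
The streaming ingestion mentioned in the last feature can be pictured roughly as follows. This is a minimal sketch, not the project's actual ingestion code: the helper names `iter_batches` and `stream_csv`, the `label` field, and the default batch size are all assumptions made for illustration. The idea is to read the CSV lazily and hand fixed-size batches to the knowledge base, so a large file never has to fit in memory.

```python
import csv
from itertools import islice
from typing import Iterable, Iterator


def iter_batches(rows: Iterable[dict], batch_size: int) -> Iterator[list[dict]]:
    """Yield lists of at most batch_size rows without materializing the input."""
    it = iter(rows)
    while batch := list(islice(it, batch_size)):
        yield batch


def stream_csv(path: str, label: str, batch_size: int = 500) -> Iterator[list[dict]]:
    """Read a CSV lazily and yield batches of rows tagged with a label.

    The "label" field is a hypothetical addition (e.g. "true" for True.csv,
    "fake" for Fake.csv); the original columns are passed through unchanged.
    """
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)  # reads one row at a time
        labeled = ({**row, "label": label} for row in reader)
        yield from iter_batches(labeled, batch_size)
```

A caller would loop over `stream_csv("Fake.csv", "fake")` and pass each batch to whatever function adds documents to the knowledge base.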
- Clone the repository:

      git clone https://github.com/muhammadnavas/Fake_News_Predictor.git
      cd Fake_News_Predictor

- Install dependencies:

      pip install -r requirements.txt

- Configure API keys:
  - Copy `.streamlit/secrets.toml.example` to `.streamlit/secrets.toml` and fill in your keys:

        NEWSAPI_KEY = "your_newsapi_key"
        GNEWS_KEY = "your_gnews_key"
        CURRENTS_KEY = "your_currents_key"
        CONTEXTUALWEB_KEY = "your_contextualweb_key"
        GOOGLE_FACTCHECK_API_KEY = "your_google_factcheck_key"
        GEMINI_API_KEY = "your_gemini_key"

  - Or set keys in `.env` (see `.env.example`).
- Run the app:

      streamlit run app.py
- Enter a news headline or article in the input box.
- The app validates content and runs analysis using selected methods (API, RAG, ML, AI).
- View results in tabs: Verification, RAG Analysis, ML Models, AI Assessment, Summary.
- Use the sidebar to ingest datasets and manage the knowledge base.
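
To give a feel for what the TF-IDF fallback behind the RAG Analysis tab does, here is a small pure-Python sketch of TF-IDF retrieval with cosine similarity. It is an illustration under assumptions, not the code in `rag_system.py`: the class name `TfidfIndex`, the tokenizer, and the smoothed IDF formula are choices made here, and the real project may well rely on scikit-learn instead.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TfidfIndex:
    """Tiny in-memory TF-IDF index with cosine-similarity search (hypothetical)."""

    def __init__(self, docs: list[str]):
        self.docs = docs
        tokenized = [tokenize(d) for d in docs]
        n = len(docs)
        # Document frequency: in how many documents each term appears.
        df = Counter(term for toks in tokenized for term in set(toks))
        # Smoothed IDF, so terms found in every document keep a positive weight.
        self.idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
        self.vectors = [self._vectorize(toks) for toks in tokenized]

    def _vectorize(self, tokens: list[str]) -> dict[str, float]:
        """Build an L2-normalized sparse TF-IDF vector, ignoring unknown terms."""
        tf = Counter(t for t in tokens if t in self.idf)
        vec = {t: c * self.idf[t] for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        return {t: w / norm for t, w in vec.items()} if norm else {}

    def search(self, query: str, k: int = 3) -> list[tuple[str, float]]:
        """Return up to k (document, score) pairs with non-zero similarity."""
        q = self._vectorize(tokenize(query))
        scores = [
            (sum(w * doc.get(t, 0.0) for t, w in q.items()), i)
            for i, doc in enumerate(self.vectors)
        ]
        ranked = sorted(scores, key=lambda s: (-s[0], s[1]))
        return [(self.docs[i], score) for score, i in ranked[:k] if score > 0]
```

Because both vectors are normalized, the dot product in `search` is the cosine similarity. A query that shares no terms with the knowledge base returns an empty list, which a caller could treat as "no supporting facts found".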
- No news fetched: Ensure API keys are set and match expected names in both secrets and code.
- Analysis blocked: Input must be news-like (headline or article, not personal/casual text).
- Module import errors: Run `pip install -r requirements.txt` to install all dependencies.
- Secrets parse error: All values in `.streamlit/secrets.toml` must be quoted strings.
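
When no news is fetched, a quick check of the configured keys can narrow things down. The sketch below is an assumption-laden helper, not part of the repository: `missing_keys` and `EXPECTED_KEYS` are hypothetical names, and the key list is copied from the example secrets shown in the setup steps. It flags keys that are absent, empty, not strings, or still set to a `your_...` placeholder.

```python
from typing import Mapping

# Key names taken from the example secrets file in the setup steps.
EXPECTED_KEYS = (
    "NEWSAPI_KEY",
    "GNEWS_KEY",
    "CURRENTS_KEY",
    "CONTEXTUALWEB_KEY",
    "GOOGLE_FACTCHECK_API_KEY",
    "GEMINI_API_KEY",
)


def missing_keys(config: Mapping[str, object]) -> list[str]:
    """Return expected key names that look unset in the given mapping."""
    problems = []
    for name in EXPECTED_KEYS:
        value = config.get(name)
        if not isinstance(value, str):
            problems.append(name)  # absent, or not a quoted string
        elif not value.strip() or value.startswith("your_"):
            problems.append(name)  # empty, or still the example placeholder
    return problems
```

Calling `missing_keys(os.environ)` (after `import os`) checks keys set through `.env` or the shell; a dict parsed from `.streamlit/secrets.toml` can be passed the same way.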
- Never commit real API keys to version control.
- `.gitignore` excludes `.env` and `.streamlit/secrets.toml` by default.
- Rotate keys if accidentally exposed.
- `app.py`: Main Streamlit app
- `rag_system.py`, `rag_pipeline.py`: RAG logic and pipeline
- `ml_analysis.py`: ML model loading and analysis
- `content_detector.py`: Content validation
- `fetch_news.py`: News API integration
- `models/`: Pretrained ML models
- `chroma_db/`: ChromaDB persistent storage
- `fact_database.json`: Knowledge base facts
MIT (or specify your license)
- GitHub: https://github.com/muhammadnavas/Fake_News_Predictor.git
- Owner: muhammadnavas (MrHidey)
- Branch: main
- Built by MrHidey (muhammadnavas) and contributors
- Uses open-source libraries: Streamlit, scikit-learn, chromadb, sentence-transformers, Google Generative AI, etc.
For questions or contributions, open an issue or pull request at the GitHub repository.