
Veritas: News Credibility Analyzer

Can we trust what we read online?
Can AI help spot fake news and explain why?

These questions inspired the creation of Veritas, a full-stack, ML-powered platform that combines risk assessment, pattern analysis, and explainable AI to verify news using text and metadata.


Overview

Veritas isn’t just a classifier; it’s a decision-support tool that helps users think critically rather than simply react, by showing why a news item is flagged. It reveals how language can persuade, mislead, or inform.


Key Features

Intelligent Analysis

  • ML-Powered Detection: Advanced classification using trained models to identify fake vs. real news
  • SHAP Explainability: Transparent decision-making with feature importance rankings
  • Pattern Recognition: Analyzes writing style, linguistic patterns, and content structure

Defensive Programming

  • Lorem Ipsum Detection: Catches placeholder text before expensive ML processing
  • Character Repetition Filtering: Identifies suspicious repetitive patterns
  • Special Character Validation: Monitors unusual character usage ratios
  • Early Issue Detection: Prevents obvious problems from reaching the ML model
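
A minimal sketch of what these pre-checks could look like in Python (the thresholds and the helper name defensive_checks are illustrative assumptions, not the project's actual implementation):

import re

def defensive_checks(text: str) -> list:
    """Cheap sanity checks run before the ML model; thresholds are illustrative."""
    issues = []
    # Placeholder text such as "lorem ipsum"
    if re.search(r"\blorem\s+ipsum\b", text, flags=re.IGNORECASE):
        issues.append("placeholder_text")
    # Suspicious repetition of a single character, e.g. "aaaaaaa" or "!!!!!!!"
    if re.search(r"(.)\1{6,}", text):
        issues.append("character_repetition")
    # Unusually high ratio of special (non-alphanumeric, non-space) characters
    specials = sum(1 for c in text if not c.isalnum() and not c.isspace())
    if text and specials / len(text) > 0.3:
        issues.append("special_character_ratio")
    return issues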

Risk Assessment

  • Severity Levels: High, Medium, Low risk categorization
  • Confidence Scoring: Numerical confidence ratings (0-100%)
  • Reliability Metrics: Comprehensive credibility scoring system
  • Risk Indicators: Identifies specific patterns that raise concerns
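
For illustration, the classifier's probability output could be mapped to these severity levels and a 0-100% confidence score roughly as follows (the cut-offs are assumed, not taken from the project):

def risk_level(fake_probability: float):
    """Map the predicted probability of 'fake' to a severity level and a confidence score."""
    confidence = round(max(fake_probability, 1 - fake_probability) * 100, 1)
    if fake_probability >= 0.75:
        return "High", confidence
    if fake_probability >= 0.5:
        return "Medium", confidence
    return "Low", confidence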

Interactive Visualizations

  • SHAP Summary Plots: Visual feature impact analysis
  • Text Pattern Charts: Graphical representation of writing patterns
  • Decision Factors: Clear visualization of prediction reasoning
  • Feature Analysis: Detailed breakdown of model decision factors

Professional Features

  • Text Statistics: Word count, sentence analysis, readability scores
  • Content Metrics: ALL CAPS usage, punctuation patterns, URL detection
  • Export Capabilities: Save analysis results for further review
  • History Tracking: Monitor analysis patterns over time
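
A rough sketch of how these content metrics could be computed with NLTK and the standard library (the exact metrics in the app may differ; sent_tokenize needs the NLTK "punkt" data package):

import re
from nltk.tokenize import sent_tokenize

def text_statistics(text: str) -> dict:
    words = text.split()
    return {
        "word_count": len(words),
        "sentence_count": len(sent_tokenize(text)),
        "all_caps_ratio": sum(w.isupper() and len(w) > 1 for w in words) / max(len(words), 1),
        "exclamation_marks": text.count("!"),
        "contains_url": bool(re.search(r"https?://\S+", text)),
    }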

Live Frontend & API Access

Streamlit Frontend

Access the interactive web application here: https://veritas-news-credibility-analyzer.streamlit.app

Paste or upload a news article and receive:

  • Credibility prediction (Real or Fake)
  • Risk level and confidence score
  • SHAP-based explanation and key features

Demo

App Demo

FastAPI Backend

Access the API at: https://veritas-news-credibility-analyzer.onrender.com

Method   Endpoint    Description
POST     /predict    Returns prediction, confidence, SHAP values
GET      /health     Check if the API service is running
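
For example, the /predict endpoint can be called from Python as sketched below; the request field name "text" and the response layout are assumptions, so check the interactive FastAPI docs at /docs for the authoritative schema.

import requests

API_URL = "https://veritas-news-credibility-analyzer.onrender.com"

response = requests.post(f"{API_URL}/predict",
                         json={"text": "Paste the news article text here..."},
                         timeout=60)
response.raise_for_status()
print(response.json())  # prediction, confidence, SHAP values per the table above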

Demo

App Demo


How It Works

  1. Input Processing: Text is analyzed for basic patterns and defensive checks
  2. Feature Extraction: Advanced linguistic and structural features are computed
  3. ML Prediction: Trained model classifies content as real or fake news
  4. SHAP Analysis: Explains which features influenced the prediction
  5. Risk Assessment: Evaluates overall credibility and assigns risk levels
  6. Visualization: Presents results through interactive charts and summaries
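
As a sketch of steps 2-3, the saved artifact can be loaded and queried directly, assuming model/model_pipeline.pkl is a scikit-learn Pipeline serialized with joblib (the label encoding shown is also an assumption):

import joblib

pipeline = joblib.load("model/model_pipeline.pkl")  # vectorizer + classifier in one Pipeline (assumed)

article = "Paste the news article text here..."
prediction = pipeline.predict([article])[0]           # e.g. 0 = real, 1 = fake (encoding assumed)
probabilities = pipeline.predict_proba([article])[0]  # class probabilities feed the risk assessment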

Technical Stack

  • Frontend: Streamlit for interactive web interface
  • Backend: FastAPI for API and documentation
  • ML Framework: Scikit-learn for model training and prediction
  • Explainability: SHAP for transparent AI decision explanations
  • Visualization: Matplotlib, Seaborn, and Plotly for charts and interactive plots
  • Text Processing: NLTK for linguistic analysis
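
To illustrate the explainability layer, SHAP can explain a linear text classifier roughly like this; the TF-IDF + logistic regression setup and the variable names (train_texts, train_labels, article_text) are assumptions for the sketch, not confirmed details of the deployed model.

import shap
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

vectorizer = TfidfVectorizer(max_features=5000)
X_train = vectorizer.fit_transform(train_texts)
clf = LogisticRegression(max_iter=1000).fit(X_train, train_labels)

# Explain one article against a small dense background sample
explainer = shap.LinearExplainer(clf, X_train[:100].toarray())
shap_values = explainer.shap_values(vectorizer.transform([article_text]).toarray())

# Print the ten most influential tokens for this prediction
feature_names = vectorizer.get_feature_names_out()
contrib = shap_values[0]
for i in abs(contrib).argsort()[::-1][:10]:
    print(feature_names[i], round(float(contrib[i]), 4))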

Use Cases

  • Journalists: Verify source credibility and fact-check articles
  • Researchers: Study misinformation patterns and linguistic indicators
  • Educators: Teach media literacy and critical thinking skills
  • General Users: Evaluate news authenticity before sharing

Evaluation Results

To assess model performance, we evaluated it on a held-out test set using several classification metrics.

  • Confusion Matrix: Shows the distribution of true vs. predicted classes

  • Precision-Recall Curve: Highlights model performance on imbalanced data

These plots help assess the model’s ability to distinguish fake from real news, especially under class imbalance. For fair evaluation, we prioritized metrics like F1-score and AUC-PR over accuracy alone.
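
These metrics can be reproduced on the held-out split with scikit-learn; the variable names below (y_test, y_pred, y_scores) are placeholders for the project's own test data and model outputs.

from sklearn.metrics import confusion_matrix, f1_score, average_precision_score, PrecisionRecallDisplay

print(confusion_matrix(y_test, y_pred))                       # true vs. predicted classes
print("F1:", f1_score(y_test, y_pred))
print("AUC-PR:", average_precision_score(y_test, y_scores))   # y_scores = predicted probability of "fake"
PrecisionRecallDisplay.from_predictions(y_test, y_scores)     # precision-recall curve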


Exploratory Data Analysis (EDA) Summary

A comprehensive EDA was conducted on the misinformation dataset to uncover patterns and insights critical to model development. The key findings include:

  1. Class Distribution: Dataset imbalance with more real news than fake, requiring stratified sampling and careful metric selection.
  2. Subject Distribution: Perfect correlation with target label presents data leakage risk; excluded from model training.
  3. Title Length: Fake news titles are longer and more variable, often sensational or verbose.
  4. Text Length: Fake news articles tend to be longer and highly variable; real news is more concise.
  5. Punctuation Usage: Exclamation and question marks occur more frequently in fake news.
  6. Uppercase Words: Fake news contains more uppercase words for emphasis or sensationalism.
  7. Temporal Patterns: Fake news spikes on weekends; real news is more evenly distributed.
  8. Word Clouds: Real news uses institutional and factual language; fake news uses emotional and subjective terms.
  9. Top Unigrams & Bigrams: Distinct vocabulary reflecting formal reporting in real news and sensationalism in fake news.

These insights informed feature engineering and helped improve model interpretability and fairness.
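
Several of these statistics can be recomputed with a few lines of pandas from the raw CSVs; the column names title and text are assumptions about the dataset layout.

import pandas as pd

real = pd.read_csv("data/raw/True.csv").assign(label="real")
fake = pd.read_csv("data/raw/Fake.csv").assign(label="fake")
df = pd.concat([real, fake], ignore_index=True)

df["title_len"] = df["title"].str.len()
df["exclamations"] = df["text"].str.count("!")

print(df["label"].value_counts())                                     # class distribution
print(df.groupby("label")[["title_len", "exclamations"]].describe())  # title length and punctuation by class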

For the full detailed report, see: EDA_Report.md


Important Disclaimer

Veritas is designed as a supplementary tool for news analysis. It provides guidance based on writing patterns and linguistic features, not absolute truth determination. Users should:

  • Always verify information through multiple reliable sources
  • Consider context and domain expertise
  • Use critical thinking alongside automated analysis
  • Understand that no AI system is infallible

Repo Structure

misinfo-detector/
├── .gitignore                     # Git ignore file
├── app.py                         # Main Streamlit application
├── Dockerfile                     # Docker configuration
├── .dockerignore                  # Docker ignore files
├── .env_example                   # Example environment variables
├── LICENSE                        # MIT License
├── output.png                     # Output visualization
├── README.md                      # Project documentation
├── requirements.txt               # Python dependencies
├── api/                           # API-related files
│   ├── dependencies.py            # API dependencies
│   ├── main.py                    # API main entry point
│   └── schema/                    # API schema definitions
├── data/                          # Data directory
│   ├── processed/                 # Processed datasets
│   └── raw/                       # Raw datasets (True.csv, Fake.csv)
├── model/                         # Model artifacts
│   ├── __init__.py                # Package initialization
│   ├── best_params.pkl            # Best hyperparameters
│   └── model_pipeline.pkl         # Model pipeline
├── notebooks/                     # Jupyter notebooks
│   ├── eda.ipynb                  # Exploratory Data Analysis
│   └── modeling.ipynb             # Model training and evaluation
├── reports/                       # Analysis reports and figures
│   ├── figures/                   # EDA visualizations
│   ├── evaluatin_metrics/         # Evaluation metric figures
│   └── eda_report.md              # EDA report
├── utils/                         # Utility functions
│   ├── model_utils.py             # Model handling utilities
│   ├── predict_output.py          # Prediction utilities
│   ├── nltk_config.py             # NLTK resource configuration
│   ├── nltk_setup.py              # NLTK data download setup
│   └── preprocessing.py           # Text preprocessing utilities

Getting Started

To run the Veritas News Credibility Analyzer locally, follow these steps:

  1. Clone the repository:
# Clone the repository
git clone https://github.com/kushalregmi61/veritas-news-analyzer.git

# Navigate into the project directory
cd veritas-news-analyzer
  2. Set up a virtual environment:
# Create a virtual environment (using venv)

# mac/linux
python3 -m venv .venv
source .venv/bin/activate

# windows
python -m venv .venv
.venv\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt
  4. Set up environment variables:
# Copy the .env_example to .env file

# mac/linux
cp .env_example .env

# windows
copy .env_example .env

# Then, edit the .env file to set the API_URL if needed

# Example URL for local FastAPI Model Inference Service   
API_URL=http://127.0.0.1:8000/predict 
  5. Run the application locally:
# 5.1 Start the FastAPI backend (in a separate terminal)
uvicorn api.main:app --host 127.0.0.1 --port 8000
# 5.2 Run the Streamlit app (in another terminal)
streamlit run app.py
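
Once both services are running, the backend can be sanity-checked from Python (or simply open http://127.0.0.1:8000/docs in a browser); the request field name "text" is an assumption, as noted earlier.

import requests

print(requests.get("http://127.0.0.1:8000/health").json())    # confirms the API service is up
print(requests.post("http://127.0.0.1:8000/predict",
                    json={"text": "Example article text..."},
                    timeout=60).json())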

Docker Deployment

The app is containerized for reproducible deployment. Run the FastAPI backend in Docker by either pulling the published image or building locally.

Option A: Pull and run (quick start)

  1. Pull the published image:
docker pull kushalregmi61/veritas-news-analyzer:latest
  2. Run the container (maps container port 8000 to host port 8000):
docker run -p 8000:8000 kushalregmi61/veritas-news-analyzer:latest

Option B: Build and run locally

  1. Build the image from the project root:
docker build -t veritas-news-analyzer .
  2. Run the container:
docker run -p 8000:8000 veritas-news-analyzer

Accessing the services

  • With the backend running in Docker (or via uvicorn), start the Streamlit frontend locally:
streamlit run app.py
  • Ensure the Streamlit app’s API_URL (in .env or config) points to the running backend, e.g.:
API_URL=http://localhost:8000/predict
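
If app.py reads this value with python-dotenv (an assumption about the frontend's internals and dependencies), the lookup might look like:

import os
from dotenv import load_dotenv  # python-dotenv, assumed to be listed in requirements.txt

load_dotenv()
API_URL = os.getenv("API_URL", "http://localhost:8000/predict")  # falls back to the local backend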

Contributing

Contributions are welcome! Please refer to the contributing guidelines for:

  • Bug reports and feature requests
  • Code improvements and optimizations
  • Documentation enhancements
  • Model performance improvements

License

This project is licensed under the MIT License - see the LICENSE file for details.