Jakee4488/Twitter_Sentiment_Analysis

📊 Twitter Sentiment Analysis for Financial Markets

A production-ready Flask application that fetches tweets from top financial influencers and institutions and scores them with VADER-based sentiment analysis extended by a custom financial lexicon. Use it to track market sentiment, identify trends, and visualize financial opinions on an interactive dashboard.

🌟 Features

Core Functionality

  • Real-time Tweet Fetching: Automated collection from Twitter API v2 for financial influencers
  • Advanced Sentiment Analysis: VADER-based analysis enhanced with custom financial lexicon
  • Financial Entity Extraction: Automatic detection of stock tickers ($AAPL), crypto (BTC), and financial keywords
  • Interactive Dashboard: Real-time visualization with Chart.js and Bootstrap 5
  • RESTful API: Comprehensive endpoints for programmatic access
  • Background Scheduling: Periodic tweet fetching with APScheduler
  • SQLite Database: Persistent storage for tweets and sentiment scores
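
To make the entity extraction feature concrete, here is a minimal sketch of how cashtags and crypto symbols could be pulled from a tweet. It is not taken from the project source: `extract_entities`, `CASHTAG_RE`, and the `CRYPTO_SYMBOLS` set are hypothetical names, and the real rules in src/ may differ.

```python
import re

# Hypothetical symbol list for illustration; the project's own list may differ.
CRYPTO_SYMBOLS = {"BTC", "ETH", "SOL", "DOGE"}

# A cashtag is "$" followed by 1-5 letters, e.g. $AAPL.
CASHTAG_RE = re.compile(r"\$([A-Za-z]{1,5})\b")
# Short alphabetic words are candidates for crypto symbols.
WORD_RE = re.compile(r"\b[A-Za-z]{2,5}\b")


def extract_entities(text):
    """Return the stock tickers and crypto symbols mentioned in a tweet."""
    tickers = sorted({match.upper() for match in CASHTAG_RE.findall(text)})
    words = {word.upper() for word in WORD_RE.findall(text)}
    crypto = sorted(words & CRYPTO_SYMBOLS)
    return {"tickers": tickers, "crypto": crypto}
```

Matching crypto symbols case-insensitively keeps the sketch short, but it can produce false positives on ordinary words, so a real implementation would likely be stricter.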

Analytics

  • Overall market sentiment gauge
  • Sentiment trends over time (daily/hourly)
  • Influencer comparison and rankings
  • Trending financial topics and keywords
  • Stock ticker mention tracking with sentiment
  • Cryptocurrency sentiment analysis
  • Bullish vs Bearish ratio metrics
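
Two of the metrics above can be sketched in a few lines. The function names and the input shape (pairs of timestamp and compound score) are assumptions for illustration; the 0.2 threshold comes from the Sentiment Classification table in this README, and src/analytics.py may compute these differently.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Sample input: (created_at, compound_score) pairs.
SAMPLE = [
    (datetime(2024, 1, 1, 9), 0.5),
    (datetime(2024, 1, 1, 15), -0.5),
    (datetime(2024, 1, 2, 9), 0.25),
]


def daily_mean_sentiment(scored_tweets):
    """Average compound score per calendar day, keyed by ISO date."""
    by_day = defaultdict(list)
    for created_at, score in scored_tweets:
        by_day[created_at.date().isoformat()].append(score)
    return {day: mean(scores) for day, scores in sorted(by_day.items())}


def bullish_bearish_ratio(scores, threshold=0.2):
    """Ratio of bullish to bearish tweets; None when nothing is bearish."""
    bullish = sum(1 for score in scores if score >= threshold)
    bearish = sum(1 for score in scores if score <= -threshold)
    return bullish / bearish if bearish else None
```

Returning `None` instead of dividing by zero leaves it to the caller to decide how to display a period with no bearish tweets.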

Technical Highlights

  • Text Processing: Advanced cleaning, tokenization, lemmatization
  • Custom Financial Lexicon: 60+ financial terms with sentiment weights
  • Rate Limiting: Exponential backoff for Twitter API compliance
  • Error Handling: Comprehensive logging and error recovery
  • Testing: Full test suite with pytest
  • Production Ready: Gunicorn deployment with proper logging
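
The cleaning step of text processing might look roughly like the sketch below. `clean_tweet` is a hypothetical helper, not the function in src/text_processor.py, and tokenization and lemmatization are left out because they depend on NLTK.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text):
    """Strip URLs and @mentions, drop '#' from hashtags, collapse whitespace."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", "")  # keep the hashtag word, drop the symbol
    return WHITESPACE_RE.sub(" ", text).strip()
```

Cashtags such as $AAPL are deliberately left untouched so that ticker extraction can still see them.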

📁 Project Structure

Twitter_Sentiment_Analysis/
├── bin/
│   └── bullish.sh              # Automated deployment script
├── Database/
│   └── tweets.db               # SQLite database (created on init)
├── src/
│   ├── __init__.py
│   ├── analytics.py            # Sentiment aggregation and metrics
│   ├── data_loader.py          # Twitter API integration
│   ├── main.py                 # Flask app factory and scheduler
│   ├── models.py               # SQLAlchemy database models
│   ├── sentiment_analyzer.py  # NLP sentiment analysis engine
│   └── text_processor.py      # Tweet preprocessing and cleaning
├── tests/
│   ├── test_analytics.py
│   ├── test_data_loader.py
│   ├── test_sentiment_analyzer.py
│   └── test_text_processor.py
├── Tweet_Analysis/
│   └── app.py                  # Flask routes and API endpoints
├── templates/
│   └── dashboard.html          # Interactive web dashboard
├── .env.example                # Environment variables template
├── .gitignore
├── pyproject.toml              # Project dependencies
└── README.md

🚀 Quick Start

Prerequisites

  • Python 3.9 or higher
  • Twitter Developer Account with API v2 access
  • Bearer Token from Twitter Developer Portal

Installation

Option 1: Automated Setup (Recommended)

# Make the deployment script executable
chmod +x bin/bullish.sh

# Run the setup script
./bin/bullish.sh

The script will:

  1. Check Python version
  2. Create virtual environment
  3. Install dependencies
  4. Download NLTK data
  5. Initialize database
  6. Run tests (optional)
  7. Start the application

Option 2: Manual Setup

  1. Clone or Navigate to Project Directory
cd Twitter_Sentiment_Analysis
  2. Create Virtual Environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install Dependencies
# Using pip
pip install -e .

# Or using uv (faster)
uv pip install -e .
  4. Configure Environment Variables
cp .env.example .env

Edit .env and add your credentials:

TWITTER_BEARER_TOKEN=your_actual_bearer_token_here
SECRET_KEY=generate_a_random_secret_key
  5. Initialize Database
python -c "from src.models import init_db; init_db()"
  6. Run the Application
# Development mode
python -m src.main

# Production mode with Gunicorn
gunicorn -w 4 -b 0.0.0.0:5000 'src.main:create_app()'

🔑 Twitter API Setup

  1. Go to Twitter Developer Portal
  2. Create a new App or use existing one
  3. Navigate to "Keys and tokens"
  4. Generate/Copy your Bearer Token
  5. Add it to your .env file

Required API Access

  • Twitter API v2 Essential or Elevated access
  • Endpoints used:
    • GET /2/users/by/username/:username
    • GET /2/users/:id/tweets
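
For reference, the two endpoints can be called with nothing but the standard library. This is a sketch rather than the code in src/data_loader.py: the helper names are hypothetical, and `tweet.fields=created_at` is just one example of an optional field.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.twitter.com/2"


def build_request(path, bearer_token, params=None):
    """Build an authenticated GET request for a Twitter API v2 path."""
    url = API_BASE + path
    if params:
        url += "?" + urllib.parse.urlencode(params)
    headers = {"Authorization": f"Bearer {bearer_token}"}
    return urllib.request.Request(url, headers=headers)


def user_lookup_request(username, bearer_token):
    """GET /2/users/by/username/:username"""
    return build_request(f"/users/by/username/{urllib.parse.quote(username)}", bearer_token)


def user_tweets_request(user_id, bearer_token, max_results=100):
    """GET /2/users/:id/tweets"""
    # This endpoint accepts max_results between 5 and 100.
    params = {"max_results": max(5, min(100, max_results)), "tweet.fields": "created_at"}
    return build_request(f"/users/{user_id}/tweets", bearer_token, params)


def fetch_json(request, timeout=10):
    """Send the request and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A typical flow is to look up the user first, read the numeric id from the response, and then request that user's tweets with the id.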

📊 Dashboard

Access the interactive dashboard at http://localhost:5000

Dashboard Features:

  • Sentiment Gauge: Real-time overall market sentiment (-1 to +1 scale)
  • Time Series Chart: 7-day sentiment trend visualization
  • Influencer Comparison: Bar chart comparing sentiment across accounts
  • Trending Topics: Word cloud of most frequent financial keywords
  • Top Tickers: Stock symbols with mention counts and sentiment
  • Recent Tweets: Live feed of analyzed tweets with sentiment labels
  • Auto-Refresh: Dashboard updates every 30 seconds

🔌 API Endpoints

Health Check

GET /api/health

Get Dashboard Data

GET /api/dashboard?days=7

Get Tweets

GET /api/tweets?username=Reuters&limit=50&offset=0

Get Sentiment by Influencer

GET /api/sentiment/elonmusk?days=7

Analyze Custom Text

POST /api/analyze
Content-Type: application/json

{
  "text": "Bitcoin is surging! Very bullish on crypto markets."
}

Get Trending Topics

GET /api/trends?days=7&limit=20

Get Influencer Comparison

GET /api/comparison?days=7

Get Time Series Data

GET /api/timeseries?username=Reuters&days=7&interval=daily

Get Financial Terms Analysis

GET /api/financial-terms?days=7

Trigger Manual Fetch

POST /api/fetch-now
Content-Type: application/json

{
  "influencers": ["Reuters", "Bloomberg", "federalreserve"]
}
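
The endpoints above can be exercised from Python with a small client like the following sketch. The helper names are hypothetical and the base URL assumes the default local port. The response format is not documented in this README, so the sketch only decodes whatever JSON comes back.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:5000"  # assumes the default FLASK_PORT


def get_request(path, **params):
    """Build a GET request with optional query parameters."""
    query = "?" + urllib.parse.urlencode(params) if params else ""
    return urllib.request.Request(BASE_URL + path + query, method="GET")


def post_json_request(path, payload):
    """Build a POST request with a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(BASE_URL + path, data=body, headers=headers, method="POST")


def send(request, timeout=10):
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    print(send(get_request("/api/trends", days=7, limit=20)))
    print(send(post_json_request("/api/analyze", {"text": "Very bullish on crypto markets."})))
```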

⚙️ Configuration

Environment Variables

| Variable | Description | Default |
|---|---|---|
| TWITTER_BEARER_TOKEN | Twitter API Bearer Token | Required |
| DATABASE_URL | Database connection string | sqlite:///Database/tweets.db |
| FLASK_ENV | Flask environment | development |
| SECRET_KEY | Flask secret key | Generate random |
| FLASK_PORT | Application port | 5000 |
| LOG_LEVEL | Logging level | INFO |
| FETCH_INTERVAL_HOURS | Tweet fetch frequency | 1 |
| MAX_TWEETS_PER_REQUEST | Max tweets per API call | 100 |
| FINANCIAL_INFLUENCERS | Comma-separated usernames | See .env.example |
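
As a rough illustration of how these variables and defaults fit together, the sketch below reads them into a dictionary. `load_config` is a hypothetical helper, not the project's actual loader, and generating SECRET_KEY when it is missing is an assumption based on the "Generate random" default.

```python
import os
import secrets


def load_config(env=None):
    """Read settings from the environment, applying the documented defaults."""
    env = os.environ if env is None else env
    token = env.get("TWITTER_BEARER_TOKEN")
    if not token:
        raise RuntimeError("TWITTER_BEARER_TOKEN is required")
    raw_influencers = env.get("FINANCIAL_INFLUENCERS", "")
    return {
        "TWITTER_BEARER_TOKEN": token,
        "DATABASE_URL": env.get("DATABASE_URL", "sqlite:///Database/tweets.db"),
        "FLASK_ENV": env.get("FLASK_ENV", "development"),
        # Assumption: fall back to a random key when none is configured.
        "SECRET_KEY": env.get("SECRET_KEY") or secrets.token_hex(32),
        "FLASK_PORT": int(env.get("FLASK_PORT", "5000")),
        "LOG_LEVEL": env.get("LOG_LEVEL", "INFO"),
        "FETCH_INTERVAL_HOURS": float(env.get("FETCH_INTERVAL_HOURS", "1")),
        "MAX_TWEETS_PER_REQUEST": int(env.get("MAX_TWEETS_PER_REQUEST", "100")),
        "FINANCIAL_INFLUENCERS": [
            name.strip() for name in raw_influencers.split(",") if name.strip()
        ],
    }
```

A randomly generated SECRET_KEY changes on every restart, which invalidates existing sessions, so setting it explicitly is the safer choice outside development.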

Tracked Influencers (Default)

  • Government: @federalreserve, @SECGov
  • News: @Reuters, @Bloomberg, @CNBC, @WSJ
  • Tech/Business: @elonmusk, @BillGates
  • Investors: @chamath, @cathiedwood, @WarrenBuffett

🧪 Testing

Run the test suite:

# Run all tests
pytest

# Run with coverage
pytest --cov=src --cov-report=html

# Run specific test file
pytest tests/test_sentiment_analyzer.py -v

# Run tests with detailed output
pytest -vv --tb=short

Test Coverage

  • Text Processing: URL/mention removal, tokenization, lemmatization
  • Sentiment Analysis: Financial lexicon, classification, batch processing
  • Data Loading: API mocking, rate limiting, error handling
  • Analytics: Aggregations, trending topics, time series

📈 Sentiment Classification

The system classifies sentiment into 5 categories based on compound scores:

| Score Range | Label | Description |
|---|---|---|
| > 0.6 | Very Bullish | Strong positive sentiment |
| 0.2 to 0.6 | Bullish | Positive sentiment |
| -0.2 to 0.2 | Neutral | Balanced or no clear sentiment |
| -0.6 to -0.2 | Bearish | Negative sentiment |
| < -0.6 | Very Bearish | Strong negative sentiment |
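
The mapping from compound score to label can be written as a short function. The table leaves the exact boundary values 0.2 and -0.2 ambiguous, so this sketch assumes they fall into Bullish and Bearish respectively. `classify` is a hypothetical name, and the project's own boundary handling may differ.

```python
def classify(compound):
    """Map a compound score in [-1, 1] to one of the five sentiment labels."""
    if compound > 0.6:
        return "Very Bullish"
    if compound >= 0.2:  # assumption: exactly 0.2 counts as Bullish
        return "Bullish"
    if compound > -0.2:
        return "Neutral"
    if compound >= -0.6:  # assumption: exactly -0.2 counts as Bearish
        return "Bearish"
    return "Very Bearish"
```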

Financial Lexicon Samples

Bullish Terms: moon (+0.8), surge (+0.7), rally (+0.7), bullish (+0.8), breakout (+0.6)

Bearish Terms: crash (-0.9), plunge (-0.8), bearish (-0.8), recession (-0.7), collapse (-0.8)

🔧 Development

Adding New Influencers

Update the FINANCIAL_INFLUENCERS variable in .env:

FINANCIAL_INFLUENCERS=Reuters,Bloomberg,elonmusk,your_new_account

Customizing Sentiment Lexicon

Edit src/sentiment_analyzer.py and modify the FINANCIAL_LEXICON dictionary:

FINANCIAL_LEXICON = {
    'your_term': 0.7,  # Positive
    'another_term': -0.6,  # Negative
}
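
One possible way to feed such a dictionary into VADER is sketched below. This README does not show how the project combines the two, so treat it as an assumption. VADER's built-in lexicon stores valences on roughly a -4 to +4 scale, while the weights here are on a -1 to +1 scale, so the sketch multiplies by a hypothetical factor of 4. `to_vader_scale` and `build_analyzer` are illustrative names.

```python
SAMPLE_LEXICON = {"moon": 0.8, "surge": 0.7, "crash": -0.9, "plunge": -0.8}


def to_vader_scale(lexicon, factor=4.0):
    """Rescale -1..+1 weights to VADER's valence range (factor is an assumption)."""
    return {term.lower(): weight * factor for term, weight in lexicon.items()}


def build_analyzer(lexicon):
    """Return a VADER analyzer whose lexicon includes the custom terms."""
    # Imported lazily so the scaling helper works without NLTK installed.
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()  # needs the 'vader_lexicon' NLTK data
    analyzer.lexicon.update(to_vader_scale(lexicon))
    return analyzer


if __name__ == "__main__":
    analyzer = build_analyzer(SAMPLE_LEXICON)
    print(analyzer.polarity_scores("Bitcoin could moon after this surge")["compound"])
```

Because `update` overwrites existing entries, a custom term replaces any valence VADER already has for the same word.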

Adjusting Fetch Frequency

Modify FETCH_INTERVAL_HOURS in .env:

FETCH_INTERVAL_HOURS=2  # Fetch every 2 hours
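
The interval could be wired into APScheduler along the following lines. This is a sketch under assumptions, not the scheduler setup in src/main.py: `fetch_interval_hours` and `start_scheduler` are hypothetical helpers, and falling back to the default on invalid input is a design choice made for this example.

```python
import os


def fetch_interval_hours(env=None, default=1.0):
    """Parse FETCH_INTERVAL_HOURS, falling back to the default on bad input."""
    env = os.environ if env is None else env
    try:
        hours = float(env.get("FETCH_INTERVAL_HOURS", ""))
    except ValueError:
        return default
    return hours if hours > 0 else default


def start_scheduler(fetch_job, env=None):
    """Run fetch_job periodically in a background thread."""
    # Imported lazily so the parsing helper works without APScheduler installed.
    from apscheduler.schedulers.background import BackgroundScheduler

    scheduler = BackgroundScheduler()
    scheduler.add_job(fetch_job, "interval", hours=fetch_interval_hours(env))
    scheduler.start()
    return scheduler
```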

🐛 Troubleshooting

Common Issues

Import Errors

# Reinstall dependencies
pip install -e . --force-reinstall

NLTK Data Missing

python -c "import nltk; nltk.download('all')"

Twitter API Rate Limits

  • The app handles rate limits automatically with exponential backoff
  • Check logs for rate limit messages
  • Consider increasing FETCH_INTERVAL_HOURS
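
For readers unfamiliar with exponential backoff, the idea is to double the wait after each rate-limited attempt. The sketch below is generic and not the project's implementation: `RateLimitError` is a hypothetical exception standing in for an HTTP 429 response, and the retry counts and delays are illustrative defaults.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical exception raised when the API answers HTTP 429."""


def call_with_backoff(func, max_retries=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call func, retrying on RateLimitError with exponentially growing waits."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 10% jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

The `sleep` parameter exists so tests can substitute a fake and avoid real waiting.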

Database Locked

# Remove and reinitialize database (this deletes all stored tweets)
rm Database/tweets.db
python -c "from src.models import init_db; init_db()"

Logs

Application logs are stored in logs/app.log with automatic rotation:

  • Max size: 10MB per file
  • Keeps 10 backup files
  • Check for errors and API responses

📊 Performance

Benchmarks (on standard hardware)

  • Tweet fetching: ~2-3 seconds per influencer
  • Sentiment analysis: ~0.01 seconds per tweet
  • Dashboard load: < 1 second with 1000 tweets
  • Database queries: < 100ms for most operations

Optimization Tips

  1. Use indexes: Database models include indexes on frequently queried fields
  2. Batch processing: Analyze multiple tweets simultaneously
  3. Caching: Consider adding Redis for frequently accessed data
  4. Database: For production, migrate to PostgreSQL for better performance

🚢 Production Deployment

Using Gunicorn

gunicorn -w 4 -b 0.0.0.0:5000 \
  --access-logfile logs/access.log \
  --error-logfile logs/error.log \
  'src.main:create_app()'

Docker Deployment (Future)

FROM python:3.11-slim
WORKDIR /app
COPY . .
RUN pip install -e .
CMD ["gunicorn", "-w", "4", "-b", "0.0.0.0:5000", "src.main:create_app()"]

Systemd Service

Create /etc/systemd/system/twitter-sentiment.service:

[Unit]
Description=Twitter Sentiment Analysis
After=network.target

[Service]
User=your_user
WorkingDirectory=/path/to/Twitter_Sentiment_Analysis
Environment="PATH=/path/to/venv/bin"
ExecStart=/path/to/venv/bin/gunicorn -w 4 -b 0.0.0.0:5000 'src.main:create_app()'

[Install]
WantedBy=multi-user.target

🤝 Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Ensure all tests pass
  5. Submit a pull request

📝 License

This project is licensed under the MIT License.

🙏 Acknowledgments

  • VADER Sentiment: Natural Language Toolkit (NLTK)
  • TextBlob: Simplified text processing
  • Flask: Web framework
  • Chart.js: Beautiful charts and visualizations
  • Twitter API: Real-time data source

📧 Support

For issues, questions, or suggestions:

  • Open an issue on GitHub
  • Check logs in logs/app.log
  • Review API documentation above

🔮 Future Enhancements

  • WebSocket support for real-time updates
  • Multi-language sentiment analysis
  • Machine learning model training on historical data
  • Advanced visualizations (D3.js integration)
  • Export reports to PDF/Excel
  • User authentication and personalized dashboards
  • Telegram/Discord bot integration
  • Historical data comparison
  • Sentiment prediction models
  • Integration with stock market APIs

Built with ❤️ for financial sentiment analysis

Note: This tool is for educational and research purposes. Always conduct your own research before making financial decisions.
