ammarbinshakir/devboard-ocr


# 🧠 DevBoard OCR+

A lightweight, production-ready FastAPI microservice that extracts **TODOs**, **PR references**, and **deadlines** from uploaded images or PDFs using **OCR** (`Tesseract`) and **regex-based NLP**.

---

## 🚀 Features

- ✅ Upload image or PDF files via API
- 🧠 Extracts actionable dev tasks like:
  - `Fix login bug`
  - `Review PR #42`
  - `Deadline: 2025-07-22`
- ⚙️ Powered by: https://github.com/tesseract-ocr/tesseract
- 🔍 Clean and regex-based extraction logic
- 🦾 Built with FastAPI and async IO
- 💼 Great for portfolio/demo projects
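
The regex-based extraction listed above could be sketched roughly as follows. This is a minimal illustration, not the code in this repository: the function name `extract_items`, the `ACTION_VERBS` list, and both patterns are assumptions chosen to match the examples in this README.

```python
import re

# Hypothetical rule: a line counts as a TODO if it starts with one of these verbs.
ACTION_VERBS = {"fix", "review", "deploy", "add", "update", "remove", "test"}

# Hypothetical patterns for PR references and for ISO or dd/mm/yyyy dates.
REF_RE = re.compile(r"\bPR\s*#\d+\b", re.IGNORECASE)
DATE_RE = re.compile(r"\b(?:\d{4}-\d{2}-\d{2}|\d{1,2}/\d{1,2}/\d{4})\b")


def extract_items(text: str) -> dict:
    """Split OCR text into lines and pull out todos, PR refs, and dates."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    todos = [line for line in lines if line.split()[0].lower() in ACTION_VERBS]
    return {
        "text": "\n".join(lines),
        "todos": todos,
        "refs": REF_RE.findall(text),
        "dates": DATE_RE.findall(text),
    }
```

For the input `"Fix broken navbar\nReview PR #101\nDeploy by 22/07/2025"`, this sketch returns all three lines as `todos`, `["PR #101"]` as `refs`, and `["22/07/2025"]` as `dates`. A line such as `Deadline: 2025-07-22` yields a date but no todo, because `Deadline:` is not in the assumed verb list.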

---

## 📸 Example

Upload a screenshot or scanned whiteboard that contains:

    Fix broken navbar
    Review PR #101
    Deploy by 22/07/2025

The API returns:

```json
{
  "text": "Fix broken navbar\nReview PR #101\nDeploy by 22/07/2025",
  "todos": ["Fix broken navbar", "Review PR #101", "Deploy by 22/07/2025"],
  "refs": ["PR #101"],
  "dates": ["22/07/2025"]
}
```

## 🧑‍💻 Local Development

### 1. Clone the repository

    git clone https://github.com/your-username/devboard-ocr.git
    cd devboard-ocr

### 2. Create and activate a virtual environment

    python -m venv venv
    source venv/Scripts/activate   # Windows (Git Bash)
    # OR
    source venv/bin/activate       # Linux/macOS

### 3. Install dependencies

    pip install -r requirements.txt

### 4. Install Tesseract OCR

- Windows: download the installer
- macOS: `brew install tesseract`
- Ubuntu: `sudo apt install tesseract-ocr`

### 5. Configure .env

Create a `.env` file in the project root:

    TESSERACT_PATH=C:\Program Files\Tesseract-OCR\tesseract.exe

Update the path to match your system.
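
One way the `TESSERACT_PATH` setting could reach pytesseract is sketched below. The function names `resolve_tesseract_cmd` and `configure_pytesseract` are hypothetical, and the fallback to searching `PATH` is an assumption rather than documented behaviour of this project. The only library calls used are `load_dotenv()` from python-dotenv and the `pytesseract.pytesseract.tesseract_cmd` attribute.

```python
import os
import shutil


def resolve_tesseract_cmd(environ=None):
    """Return TESSERACT_PATH if set, otherwise look for `tesseract` on PATH."""
    environ = os.environ if environ is None else environ
    configured = environ.get("TESSERACT_PATH", "").strip()
    if configured:
        return configured
    # Assumed fallback: None is returned when no binary is found on PATH.
    return shutil.which("tesseract")


def configure_pytesseract():
    """Load .env and point pytesseract at the resolved Tesseract binary."""
    # Imported here so the helper above works without these packages installed.
    from dotenv import load_dotenv
    import pytesseract

    load_dotenv()  # reads .env from the current directory or a parent
    cmd = resolve_tesseract_cmd()
    if cmd:
        pytesseract.pytesseract.tesseract_cmd = cmd
    return cmd
```

Calling `configure_pytesseract()` once at startup, before the first OCR request, would be enough under these assumptions.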


## 🚦 Run the Server

    uvicorn main:app --reload

Then open:
👉 http://localhost:8000/docs

Upload an image/PDF using the Swagger UI.
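
To upload from a script instead of the Swagger UI, a standard-library client along these lines may work. The endpoint path `/ocr/upload` and the form field name `file` are placeholders, since this README does not state them; check `/docs` for the real values. `build_multipart` and `upload` are illustrative names.

```python
import json
import mimetypes
import os
import urllib.request
import uuid


def build_multipart(field: str, filename: str, data: bytes, content_type: str):
    """Encode one file as multipart/form-data; returns (body, Content-Type header)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def upload(path: str, url: str = "http://localhost:8000/ocr/upload") -> dict:
    """POST a local file to the service and return the decoded JSON response."""
    # Assumption: the URL path and the "file" field name are placeholders.
    content_type = mimetypes.guess_type(path)[0] or "application/octet-stream"
    with open(path, "rb") as handle:
        body, header = build_multipart(
            "file", os.path.basename(path), handle.read(), content_type
        )
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": header}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the server running, `upload("whiteboard.png")` would return the parsed JSON response as a dictionary, provided the placeholder path and field name are replaced with the real ones.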


## 📁 Project Structure

    devboard-ocr/
    ├── app/
    │   ├── routes/
    │   │   └── ocr.py
    │   └── services/
    │       └── ocr_service.py
    ├── main.py
    ├── .env
    ├── .gitignore
    ├── requirements.txt
    └── README.md

## 🛠 Tech Stack

- FastAPI – blazing-fast Python web framework
- pytesseract – Python wrapper for Tesseract OCR
- Pillow – image handling
- dotenv – environment config
- re – built-in regex for text parsing
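
As a rough illustration of how Pillow and pytesseract from the list above fit together, the OCR step might look like the sketch below. `run_tesseract` and `ocr_lines` are hypothetical names, not functions from this repository, and passing the OCR function as a parameter is an assumed design choice that lets the line handling be tested without Tesseract installed.

```python
import io
from typing import Callable


def run_tesseract(data: bytes) -> str:
    """OCR raw image bytes using Pillow to decode and pytesseract to read."""
    # Imported lazily so this module loads even where the OCR stack is absent.
    from PIL import Image
    import pytesseract

    with Image.open(io.BytesIO(data)) as image:
        return pytesseract.image_to_string(image)


def ocr_lines(data: bytes, ocr: Callable[[bytes], str] = run_tesseract) -> list[str]:
    """Return the non-empty, stripped lines recognised in an uploaded image."""
    text = ocr(data)
    return [line.strip() for line in text.splitlines() if line.strip()]
```

In a test, a stub such as `ocr_lines(b"", ocr=lambda _: "Fix broken navbar\n")` exercises the line cleanup without calling Tesseract.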

## 🌐 Future Ideas

- PDF support with pdf2image
- SQLite task history
- Frontend dashboard with upload + history
- Background task queue for heavy processing

## 📄 License

MIT. Free to use, fork, and build on.


Built by Ammar Bin Shakir. Feel free to fork & showcase 🚀
