GitHub - ammarbinshakir/devboard-ocr: 🚀 FastAPI-based microservice that extracts TODOs, PR references, and deadlines from images and PDFs using OCR and regex-based NLP.

Here’s a complete, polished, and developer-friendly README.md for your DevBoard OCR+ FastAPI project:

# 🧠 DevBoard OCR+

A lightweight, production-ready FastAPI microservice that extracts **TODOs**, **PR references**, and **deadlines** from uploaded images or PDFs using **OCR** (`Tesseract`) and **regex-based NLP**.

---

## 🚀 Features

- ✅ Upload image or PDF files via API
- 🧠 Extracts actionable dev tasks like:
  - `Fix login bug`
  - `Review PR #42`
  - `Deadline: 2025-07-22`
- ⚙️ Powered by: https://github.com/tesseract-ocr/tesseract
- 🔍 Clean and regex-based extraction logic
- 🦾 Built with FastAPI and async IO
- 💼 Great for portfolio/demo projects

---

## 📸 Example

Upload a screenshot or scanned whiteboard that contains:

Fix broken navbar Review PR #101 Deploy by 22/07/2025


The API returns:

```json
{
  "text": "Fix broken navbar\nReview PR #101\nDeploy by 22/07/2025",
  "todos": ["Fix broken navbar", "Review PR #101", "Deploy by 22/07/2025"],
  "refs": ["PR #101"],
  "dates": ["22/07/2025"]
}

🧑‍💻 Local Development

1. Clone the repository

git clone https://github.com/your-username/devboard-ocr.git
cd devboard-ocr

2. Create and activate virtual environment

python -m venv venv
source venv/Scripts/activate   # Windows
# OR
source venv/bin/activate       # Linux/macOS

3. Install dependencies

pip install -r requirements.txt

4. Install Tesseract OCR

Windows: Download Installer
macOS: brew install tesseract
Ubuntu: sudo apt install tesseract-ocr

5. Configure `.env`

Create a .env file in the root:

TESSERACT_PATH=C:\Program Files\Tesseract-OCR\tesseract.exe

Update the path to match your system.

🚦 Run the Server

uvicorn main:app --reload

Then open:
👉 http://localhost:8000/docs

Upload an image/PDF using the Swagger UI.

📁 Project Structure

devboard-ocr/
├── app/
│   ├── routes/
│   │   └── ocr.py
│   └── services/
│       └── ocr_service.py
├── main.py
├── .env
├── .gitignore
├── requirements.txt
└── README.md

🛠 Tech Stack

FastAPI – blazing-fast Python web framework
pytesseract – Python wrapper for Tesseract OCR
Pillow – image handling
dotenv – environment config
re – built-in regex for text parsing

🌐 Future Ideas

PDF support with pdf2image
SQLite task history
Frontend dashboard with upload + history
Background task queue for heavy processing

📄 License

MIT — free to use, fork, and build on.

Built by Ammar Bin Shakir — feel free to fork & showcase 🚀

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
.vscode		.vscode
app		app
public/fonts		public/fonts
.gitignore		.gitignore
README.md		README.md
main.py		main.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🧑‍💻 Local Development

1. Clone the repository

2. Create and activate virtual environment

3. Install dependencies

4. Install Tesseract OCR

5. Configure `.env`

🚦 Run the Server

📁 Project Structure

🛠 Tech Stack

🌐 Future Ideas

📄 License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

🧑‍💻 Local Development

1. Clone the repository

2. Create and activate virtual environment

3. Install dependencies

4. Install Tesseract OCR

5. Configure .env

🚦 Run the Server

📁 Project Structure

🛠 Tech Stack

🌐 Future Ideas

📄 License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

5. Configure `.env`

Packages