NovelAI - AI Powered Children's Storybook Generator

Transform any PDF into an engaging, illustrated children's storybook using Google's Gemini API.

Link - https://novel-ai-one.vercel.app/

Backend -

main.py: Contains the core logic: PDF processing, Gemini client initialisation, and the text and image API calls
chat.py: Contains the chatbot logic for answering questions about the uploaded novel

Overview

NovelAI uses Google's Gemini 2.5 Flash API to convert complex PDF documents into child-friendly storybooks, complete with AI-generated illustrations. It simplifies text for young readers (ages 6-8) and creates colorful, engaging images that bring stories to life.

Why Gemini API is Essential

This project fundamentally relies on two distinct Gemini models to achieve its core functionality:

1. Gemini 2.5 Flash (Text Generation)

Purpose: Simplifies complex text into child-friendly language
Why it's necessary:
- Rewrites difficult English text as simple English for children
- Generates concise, meaningful chapter titles (2-3 words)
- Powers the chatbot, which answers questions using the uploaded novel PDF as its context
- Creates detailed image prompts for illustration generation
- Maintains story coherence while simplifying vocabulary
- Uses structured JSON output for reliable parsing

API Implementation:

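from google import genai
from google.genai import types

client = genai.Client()  # assumes the API key is set in the environment

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=prompt,  # prompt string built by the caller
    config=types.GenerateContentConfig(
        response_mime_type="application/json",  # request JSON so the reply can be parsed reliably
        temperature=0.6
    )
)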

2. Gemini 2.5 Flash Image (Image Generation)

Purpose: Creates colorful children's book illustrations
Why it's necessary:
- Generates unique, context-aware images for each chapter
- Produces child-friendly, vibrant illustrations
- Maintains consistent artistic style throughout the book
- No external image APIs or databases required

API Implementation:

response = client.models.generate_content(
    model="gemini-2.5-flash-image",
    contents=f"Children's book illustration, colorful: {prompt}",
    config=types.GenerateContentConfig(response_modalities=["IMAGE"])
)

Features

PDF Text Extraction: Automatically extracts and chunks content into chapters
ChatBot: Answers kids' questions in simple language based on the novel
Text Simplification: Converts complex text into simple, engaging stories
AI-Generated Illustrations: Creates unique images for each chapter
Lazy Loading: On-demand chapter processing for optimised performance
Image Caching: Reduces redundant API calls and improves speed
Retry Mechanism: Robust error handling with exponential backoff
Rate Limiting: Semaphore-based concurrency control
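
The chapter chunking mentioned under PDF Text Extraction could look roughly like the sketch below. It assumes the PDF text has already been extracted into a single string (for example with PyPDF2). The function name `chunk_into_chapters` and the `words_per_chapter` parameter are hypothetical and are not taken from main.py.

```python
def chunk_into_chapters(text: str, words_per_chapter: int = 300) -> list[str]:
    """Split extracted text into chapter-sized chunks of at most words_per_chapter words."""
    if words_per_chapter <= 0:
        raise ValueError("words_per_chapter must be positive")
    words = text.split()
    return [
        " ".join(words[start:start + words_per_chapter])
        for start in range(0, len(words), words_per_chapter)
    ]


if __name__ == "__main__":
    sample = "word " * 700
    chapters = chunk_into_chapters(sample, words_per_chapter=300)
    print([len(chapter.split()) for chapter in chapters])  # [300, 300, 100]
```

Splitting on a fixed word count is only one option; a real implementation might prefer to break at paragraph or heading boundaries so chapters do not end mid-sentence.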

Architecture

┌─────────────────────────────────────────────────────────────┐
│                         Frontend                            │
│                    (Next.js + React)                        │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │ PDF Upload   │  │ Book Display │  │ Story View   │       │
│  └──────────────┘  └──────────────┘  └──────────────┘       │
└─────────────────────────────────────────────────────────────┘
                            │
                            │ HTTP/REST API
                            ▼
┌─────────────────────────────────────────────────────────────┐
│                         Backend                             │
│                    (FastAPI + Python)                       │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │ PDF Parser   │  │Text Processor│  │ Image Gen    │       │
│  └──────────────┘  └──────────────┘  └──────────────┘       │
└─────────────────────────────────────────────────────────────┘
                            │
                            │ API Calls
                            ▼
┌─────────────────────────────────────────────────────────────┐
│                    Google Gemini API                        │
│  ┌──────────────────────────┐  ┌──────────────────────┐     │
│  │  Gemini 2.5 Flash        │  │ Gemini 2.5 Flash     │     │
│  │  (Text Simplification)   │  │ Image (Illustrations)│     │
│  └──────────────────────────┘  └──────────────────────┘     │
└─────────────────────────────────────────────────────────────┘

Technology Stack

Backend

FastAPI: High-performance async web framework
Google Gemini API: AI text and image generation
PyPDF2: PDF text extraction
Pydantic: Data validation and serialization
Python-dotenv: Environment variable management

Frontend

Next.js 16: React framework with server-side rendering
React 19: Modern UI library
Framer Motion: Smooth animations and transitions
TailwindCSS 4: Utility-first CSS framework
TypeScript: Type-safe development

Usage

Upload a PDF: Click the upload button and select any PDF document
Wait for Processing: The app extracts text and splits it into chapters
View Chapters: Navigate through chapters using the interactive UI
Lazy Loading: Chapters are simplified and illustrated on demand as you view them
Enjoy: Read the simplified story with beautiful AI-generated illustrations!

Gemini API Optimization Techniques

1. Concurrency Control

TEXT_SEMAPHORE = asyncio.Semaphore(2)   # Max 2 concurrent text requests
IMAGE_SEMAPHORE = asyncio.Semaphore(1)  # Max 1 concurrent image request
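
As a rough illustration of how such a semaphore caps concurrency, the sketch below wraps each call in `async with`. The helper `run_limited` and the stand-in `fake_text_call` are hypothetical; the real backend would await a Gemini request at that point.

```python
import asyncio

TEXT_LIMIT = 2  # mirrors asyncio.Semaphore(2) for text requests


async def run_limited(semaphore: asyncio.Semaphore, call):
    """Run one call while holding the semaphore, so at most TEXT_LIMIT run at once."""
    async with semaphore:
        return await call()


async def demo(num_calls: int = 6) -> int:
    """Fire num_calls fake requests and return the peak concurrency observed."""
    semaphore = asyncio.Semaphore(TEXT_LIMIT)
    active = 0
    peak = 0

    async def fake_text_call():  # hypothetical stand-in for a Gemini request
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)
        active -= 1

    await asyncio.gather(*(run_limited(semaphore, fake_text_call) for _ in range(num_calls)))
    return peak


if __name__ == "__main__":
    print(asyncio.run(demo()))  # peak concurrency stays at the limit of 2
```

Note that a semaphore bounds how many requests are in flight at once; it does not by itself limit requests per minute.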

2. Image Caching

IMAGE_CACHE: dict[str, str] = {}  # Cache generated images by prompt
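
A minimal sketch of how a prompt-keyed cache like this might be consulted before calling the image model. The function `get_or_generate_image` and the stand-in `fake_generate` are hypothetical names, and the cached value is assumed to be an encoded image string.

```python
import asyncio

IMAGE_CACHE: dict[str, str] = {}  # prompt -> encoded image string


async def get_or_generate_image(prompt: str, generate) -> str:
    """Return the cached image for prompt, calling generate(prompt) only on a miss."""
    if prompt in IMAGE_CACHE:
        return IMAGE_CACHE[prompt]
    image = await generate(prompt)
    IMAGE_CACHE[prompt] = image
    return image


async def demo() -> tuple[str, str, int]:
    calls = 0

    async def fake_generate(prompt: str) -> str:  # hypothetical stand-in for the image API call
        nonlocal calls
        calls += 1
        return f"image-for:{prompt}"

    first = await get_or_generate_image("a red fox", fake_generate)
    second = await get_or_generate_image("a red fox", fake_generate)  # cache hit
    return first, second, calls


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

An in-memory dict is lost on restart and grows without bound, and two simultaneous misses for the same prompt would both call the generator; whether that matters depends on how the backend is deployed.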

3. Retry with Exponential Backoff

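import asyncio
import random

from google.genai.errors import ServerError

async def retry_with_backoff(func, *args, max_retries=5, initial_delay=1, **kwargs):
    for attempt in range(max_retries):
        try:
            return await func(*args, **kwargs)
        except ServerError as e:
            # Retry only on 503 (model overloaded); re-raise other errors and the final failure
            if e.code != 503 or attempt == max_retries - 1:
                raise
            # Exponential backoff plus up to 1 second of random jitter
            delay = initial_delay * (2 ** attempt) + random.uniform(0, 1)
            await asyncio.sleep(delay)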
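
To make the delay formula concrete, the short sketch below evaluates `initial_delay * (2 ** attempt)` for each attempt index with the default parameters. The helper name `backoff_schedule` is hypothetical, and the random jitter of up to 1 second is left out so the output is deterministic.

```python
def backoff_schedule(max_retries: int = 5, initial_delay: float = 1) -> list[float]:
    """Base delay computed for each attempt index, excluding the random jitter."""
    return [initial_delay * (2 ** attempt) for attempt in range(max_retries)]


if __name__ == "__main__":
    print(backoff_schedule())  # [1, 2, 4, 8, 16]
```

Each delay doubles the previous one, so repeated 503 responses back off quickly, while the jitter keeps concurrent requests from retrying in lockstep.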

4. Structured JSON Output

config=types.GenerateContentConfig(
    response_mime_type="application/json",  # Ensures parseable output
    temperature=0.6  # Balanced creativity
)
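
Requesting JSON makes the reply easy to validate before use. The sketch below shows one way to do that with the standard library; the field names `title`, `simplified_text`, and `image_prompt` are assumptions for illustration, not the schema main.py actually uses.

```python
import json
from dataclasses import dataclass

REQUIRED_FIELDS = {"title", "simplified_text", "image_prompt"}  # assumed schema


@dataclass
class Chapter:
    title: str
    simplified_text: str
    image_prompt: str


def parse_chapter(raw_json: str) -> Chapter:
    """Parse the model's JSON reply, failing loudly if it is malformed or incomplete."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_FIELDS - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return Chapter(
        title=data["title"],
        simplified_text=data["simplified_text"],
        image_prompt=data["image_prompt"],
    )


if __name__ == "__main__":
    reply = '{"title": "The Brave Fox", "simplified_text": "A fox helps a friend.", "image_prompt": "a fox in a forest"}'
    print(parse_chapter(reply).title)  # The Brave Fox
```

Since the stack already lists Pydantic, a Pydantic model could fill the same role as the dataclass and the manual field check here.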

5. Lazy Loading

Chapters are only processed when viewed by the user
Reduces initial load time and API costs
Improves user experience with progressive loading
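
The on-demand behaviour described above can be sketched as a small store that processes a chapter the first time it is requested and reuses the result afterwards. The class `LazyChapters` and the stand-in `fake_process` are hypothetical; in the real backend the processing step would simplify the text and generate the illustration.

```python
import asyncio


class LazyChapters:
    """Holds raw chapter text and processes each chapter the first time it is requested."""

    def __init__(self, raw_chapters: list[str], process):
        self._raw = raw_chapters
        self._process = process          # async callable: raw text -> processed chapter
        self._done: dict[int, str] = {}  # chapter index -> processed chapter

    async def get(self, index: int) -> str:
        if index not in self._done:
            self._done[index] = await self._process(self._raw[index])
        return self._done[index]

    @property
    def processed_count(self) -> int:
        return len(self._done)


async def demo() -> tuple[str, int]:
    async def fake_process(text: str) -> str:  # hypothetical stand-in for simplify + illustrate
        return text.upper()

    book = LazyChapters(["one", "two", "three"], fake_process)
    first = await book.get(0)
    await book.get(0)  # second request is served from memory, not reprocessed
    return first, book.processed_count


if __name__ == "__main__":
    print(asyncio.run(demo()))  # ('ONE', 1)
```

Only the chapter that was opened is processed; the other two stay untouched until a reader navigates to them.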

Performance Metrics

PDF Processing: ~10-30 seconds (depending on size)
Text Simplification: ~2-5 seconds per chapter
Image Generation: ~3-8 seconds per image
Total Time (10 chapters): ~5-10 minutes (with lazy loading)

Error Handling

503 Errors: Automatic retry with exponential backoff
Rate Limiting: Semaphore-based concurrency control
Network Failures: Graceful degradation with user feedback
Invalid PDFs: Clear error messages
API Quota: Informative error responses
