Skip to content

Database

Laith edited this page May 25, 2026 · 2 revisions

PostgreSQL via Neon with the pgvector extension for RAG embeddings.

Tables are defined in server/models.py as SQLModel classes. init_db() (called on server startup via FastAPI lifespan) creates all tables if they don't exist.


Schema

users

Column Type Notes
id UUID PK  
auth0_id string UNIQUE Auth0 sub claim
email string UNIQUE Lowercase
name string Display name
role string STUDENT or ADMIN
major_id UUID FK → majors  
total_points int Denormalised sum for fast leaderboard queries
created_at timestamp UTC
# Database

PostgreSQL via [Neon](https://neon.tech) with the pgvector extension for RAG embeddings.

Tables are defined in server/models.py as SQLModel classes. init_db() (called on server startup via FastAPI lifespan) creates all tables if they don't exist.


Schema

users

Column Type Notes
id UUID PK
auth0_id string UNIQUE Auth0 sub claim
email string UNIQUE Lowercase
name string Display name
role string STUDENT or ADMIN
major_id UUID FK → majors
total_points int Denormalised sum for fast leaderboard queries
created_at timestamp UTC

Missing (planned): bio, avatar_url, university, last_login_at


majors

Column Type Notes
id UUID PK
name string UNIQUE e.g. "Software Engineering"
slug string UNIQUE e.g. "software-engineering"

courses

Column Type Notes
id UUID PK
code string UNIQUE e.g. "10036"
name string
major_id UUID FK → majors
year_id int 1–4 (program level)
semester int 1 = Fall, 2 = Spring

lecturers

Column Type Notes
id UUID PK
name string UNIQUE

material_types

Column Type Notes
id UUID PK
display_name string "Slides", "Exam", "Lecture Notes", "Summary", "Homework"

materials

Approved, published files.

Column Type Notes
id UUID PK Shares ID with its originating file_request
title string
academic_year int Program level 1–4
material_year int Calendar year (e.g. 2023)
course_id UUID FK → courses
lecturer_id UUID FK → lecturers
type_id UUID FK → material_types
uploader_id UUID FK → users Who gets the XP credit
notes string nullable
file_url string GCS path: <CourseName>/<filename>
created_at timestamp

file_requests

Pending student submissions.

Column Type Notes
id UUID PK
user_id UUID FK → users
course_id UUID FK → courses
type_id UUID FK → material_types
title string
academic_year int
material_year int
file_url string GCS path — changes as file moves between folders
lecturer_id UUID FK → lecturers
status string pending / approved / rejected
notes string nullable Student's optional note
admin_note string nullable Admin's rejection reason (visible to student)
created_at timestamp

Status transitions:
pending → approved (approve) / pending → rejected (reject)
approved → pending (undo-approve) / rejected → pending (undo-reject)


points_transactions

XP ledger. Never deleted — append-only.

Column Type Notes
id UUID PK
user_id UUID FK → users
amount int Positive = earned
action string upload_approval, file_completed, course_complete
reason string Human-readable
request_id UUID FK → file_requests nullable For upload approvals
source_id string UNIQUE nullable Idempotency key (e.g. "view_<session_id>", "course_complete:<user_id>:<course_id>")
created_at timestamp

Unique constraint: (request_id, action) — prevents double XP per approval
Unique constraint: source_id — prevents all other double awards


file_viewing_sessions

One row per student per file view attempt.

Column Type Notes
id UUID PK Returned to frontend as session_id
user_id UUID FK → users
file_id UUID FK → materials
course_id UUID FK → courses Denormalised for activity queries
started_at timestamp Used for plausibility check on active_seconds
active_seconds int Cumulative; incremented by heartbeat
required_active_seconds int total_pages × 45 × 0.60
completion_score float Recomputed each heartbeat
is_complete bool Immutable once true

user_course_activities

Materialised activity state per user per course.

Column Type Notes
user_id UUID PK (composite)
course_id UUID PK (composite)
status string not_started / exploring / engaged / completed
files_completed int Count of is_complete = true sessions for this course
total_files int ⚠️ May not be populated in current implementation

Status logic: not_started (0), exploring (1 to <50%), engaged (≥50%), completed (100%)


user_tasks

Personal learning-plan tasks.

Column Type Notes
id UUID PK
user_id UUID FK → users
title string
date string "YYYY-MM-DD" — avoids TZ issues
start_hour float 0–23, 0.5-step resolution
duration float Hours
priority string normal / high / urgent
completed bool
created_at timestamp

Index: (user_id, date) for fast per-user day queries.


user_notes

Per-user per-file scratch pad.

Column Type Notes
user_id UUID PK (composite)
file_id UUID PK (composite) FK → materials
content string JSON-encoded sticky notes array
updated_at timestamp

user_recent_files

Column Type Notes
user_id UUID PK (composite)
file_id UUID PK (composite) FK → materials
viewed_at timestamp Updated on re-view

audit_logs

Append-only admin action log.

Column Type Notes
id UUID PK
timestamp timestamp indexed
actor_id UUID FK → users Admin who acted
actor_name string Denormalised
action string approve / reject / bulk_approve / bulk_reject / undo_approve / undo_reject
target_type string "file_request"
target_ids JSON array UUIDs of affected requests
meta_data JSONB title, reason, count, etc.

material_chunks

RAG vector embeddings. One row per text chunk of a material.

Column Type Notes
id UUID PK
material_id UUID FK → materials
content string 1000-char text chunk
embedding vector(1536) gemini-embedding-001 output, truncated from 3072
page_number int nullable Source page
created_at timestamp

Index: HNSW on embedding with cosine_distance ops, m=16, ef_construction=64


Entity relationships (simplified)

users ──────────────────────── majors
  │
  ├── file_requests ─────────── courses ──── majors
  │                         └── lecturers
  │                         └── material_types
  │
  ├── materials ─────────────── courses
  │       └── material_chunks (RAG)
  │
  ├── points_transactions
  ├── user_tasks
  ├── user_recent_files ─────── materials
  ├── user_notes ────────────── materials
  ├── user_course_activities ── courses
  ├── file_viewing_sessions ─── materials
  └── audit_logs

Conventions

  • All primary keys are UUIDs (default_factory=uuid4).
  • All timestamps are UTC (datetime.now(timezone.utc)).
  • academic_year = program level (1–4). material_year = calendar year (e.g. 2023).
  • total_points on users is a denormalised cache of the sum of points_transactions.amount. Updated synchronously on every XP event.

Migrations

There is no Alembic or migration framework. Schema changes are applied via:

  1. Adding/modifying model classes in models.py.
  2. init_db() creates new tables on server restart (does not alter existing columns).
  3. Column additions are done manually via server/add_column.py one-off scripts.

Recommendation: Adopt Alembic before the first production schema change that modifies an existing column.


Known gaps

Gap Impact
user_course_activity.total_files not populated by current code Course status (exploring/engaged/completed) may be incorrect
motivational_quotes table not in models.py Course completion celebration will skip the quote
No Alembic migrations Schema changes require manual ALTER TABLE scripts
users missing bio, avatar_url columns Profile page cannot show avatar or bio

Clone this wiki locally