This project provides a tool to scrape Thai school information from Wikipedia and serve it via an API. It is built with Bun, ElysiaJS, and Cheerio.
- Live Scraping: Fetches up-to-date school data directly from Wikipedia.
- REST API: Serves school data via high-performance ElysiaJS endpoints.
- Data Export: Script to scrape and save all school data to JSON and CSV files.
Prerequisites:

- Bun runtime installed.
Install dependencies:

```sh
bun install
```

Before running the API, you must scrape the data from Wikipedia. This will generate the necessary data files in the `dist/` directory.
```sh
bun run scrape
```

Output:
- JSON (a record-shape sketch follows this list):
  - `dist/json/schools.json`: Combined list of all schools (pretty-printed).
  - `dist/json/schools.min.json`: Combined list of all schools (minified).
  - `dist/json/provinces/[province].json`: Individual JSON file for each province.
- CSV:
  - `dist/csv/schools.csv`: Combined list of all schools.
  - `dist/csv/provinces/[province].csv`: Individual CSV file for each province.
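The README does not pin down the record schema, so the following TypeScript type is only a hypothetical sketch of what one entry in `schools.json` might look like; every field name here is an assumption, not the project's actual schema:

```ts
// Hypothetical shape of one record in dist/json/schools.json.
// Field names are illustrative assumptions, not the real schema.
interface School {
  name: string;     // school name as scraped from Wikipedia
  province: string; // Thai province the school belongs to
}

// The combined files would then hold an array of such records:
type Schools = School[];
```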
Start the development server:
```sh
bun dev
```

The server will be running at http://localhost:3000.
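To sanity-check that the server is up, you can hit it from another terminal. This is a minimal sketch using Bun's global `fetch`, assuming the default port:

```ts
// Ping the root route of the dev server (assumes the default port 3000).
const res = await fetch("http://localhost:3000/");
console.log(res.status, await res.text());
```

Save it as, say, `ping.ts` and run it with `bun ping.ts`.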
- `GET /schools`: Retrieve a list of schools.
  - Query parameters:
    - `q`: Search by school name (optional).
    - `province`: Filter by province (optional).
  - Example: `GET /schools?province=ภูเก็ต` (a client-side sketch follows this list).
- `GET /`: API information.
- `GET /openapi`: OpenAPI documentation.
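Putting the `/schools` query parameters together, here is a sketch of calling the search endpoint from TypeScript. It assumes the dev server from `bun dev` is listening on the default port, and it treats the response as opaque JSON since its exact shape is not documented here:

```ts
// Query the API for schools in a province, optionally filtered by name.
const url = new URL("http://localhost:3000/schools");
url.searchParams.set("province", "ภูเก็ต"); // filter by province
// url.searchParams.set("q", "...");       // optional: search by school name

const res = await fetch(url);
if (!res.ok) throw new Error(`Request failed with status ${res.status}`);

const schools = await res.json(); // shape depends on the scraped data
console.log(schools);
```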
- `src/index.ts`: API server entry point.
- `src/scripts/scrape.ts`: CLI script for scraping and saving data.
- `src/services/scraper.ts`: Scraper logic using Cheerio (sketched below).
- `src/constants/provinces.ts`: List of Thai provinces.
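The scraper's internals are not reproduced here, but the general Cheerio pattern behind `src/services/scraper.ts` looks roughly like this sketch. The `table.wikitable` selector and the column layout are assumptions about how such a Wikipedia page is structured, not the project's actual code:

```ts
import * as cheerio from "cheerio";

interface School {
  name: string;
  province: string;
}

// Sketch: scrape school names for one province from a Wikipedia page.
// The selector and column layout are assumptions for illustration.
async function scrapeProvince(pageUrl: string, province: string): Promise<School[]> {
  const res = await fetch(pageUrl);
  if (!res.ok) throw new Error(`Failed to fetch ${pageUrl}: ${res.status}`);

  const $ = cheerio.load(await res.text());
  const schools: School[] = [];

  // Walk every table row; rows without <td> cells are headers and are skipped.
  $("table.wikitable tr").each((_, row) => {
    const name = $(row).find("td").first().text().trim();
    if (name) schools.push({ name, province });
  });

  return schools;
}
```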
This project uses GitHub Actions to automatically update the school data.
- The workflow runs on the 1st of every 3rd month at midnight UTC.
- It executes the scraper and commits any changes to the `dist/` directory back to the repository.
- You can also manually trigger the "Update School Data" workflow from the Actions tab.