A static website that visualises Scottish Government payment card spending over £500, sourced from the official monthly reports published at https://www.gov.scot/collections/government-spend-over-gbp500-monthly-reports/
Live site: https://govspend.opendata.scot
The site is built with Eleventy and deployed to GitHub Pages. Data is scraped from gov.scot using a Python script and stored as JSON files in the repository. A GitHub Actions workflow runs monthly to fetch any newly published reports and rebuild the site automatically.
.
├── .github/
│   └── workflows/
│       ├── build.yml            # Builds and deploys the site on push to main
│       └── update-data.yml      # Monthly cron to fetch new spend reports
├── _assets/
│   ├── css/main.scss            # Bootstrap overrides and site styles
│   └── js/
│       ├── main.js              # JS entry point
│       ├── charts.js            # Highcharts visualisations
│       └── constants.js         # Shared constants (currency formatting)
├── _data/
│   ├── site.js                  # Site metadata (name, URL, build time)
│   └── spendsOver500/           # One JSON file per monthly report (YYYY-MM.json)
├── _includes/
│   └── layouts/base.njk         # Base HTML layout
├── index.njk                    # Homepage
├── spendover500.njk             # Monthly detail pages (paginated)
├── spendover500.json.njk        # JSON export pages (/spends/YYYY-MM.json)
├── scrape_data.py               # Data scraper
├── headers.csv                  # Column schema log (one row per month)
├── package.json                 # Node dependencies and build scripts
└── requirements.txt             # Python dependencies
Each monthly report is stored as _data/spendsOver500/YYYY-MM.json. Records follow this schema:
| Field | Description |
|---|---|
| Directorate | Scottish Government directorate or department |
| Merchant Name | Vendor or supplier name |
| Merchant Category Name | Payment category |
| Transaction Date | Date of transaction (DD/MM/YYYY) |
| Transaction Amount | Value in GBP (numeric) |
| Expense Description | Description of the purchase |
Data covers May 2016 onwards and includes only transactions over £500.
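As a rough illustration of working with these records, the sketch below totals spend per directorate and parses the DD/MM/YYYY dates. It assumes each monthly file is a JSON array of objects keyed by the field names in the table above; the exact file layout is not shown here, so treat that as an assumption. The function names and the sample records are hypothetical and made up for the example.

```python
import json
from collections import defaultdict
from datetime import datetime
from pathlib import Path


def load_month(path):
    """Load one monthly report (assumed to be a JSON array of record objects)."""
    with Path(path).open(encoding="utf-8") as fh:
        return json.load(fh)


def total_by_directorate(records):
    """Sum 'Transaction Amount' for each 'Directorate'."""
    totals = defaultdict(float)
    for record in records:
        totals[record["Directorate"]] += float(record["Transaction Amount"])
    return dict(totals)


def parse_transaction_date(record):
    """Parse the DD/MM/YYYY 'Transaction Date' string into a date object."""
    return datetime.strptime(record["Transaction Date"], "%d/%m/%Y").date()


# Made-up records for illustration only; these are not real spend data.
sample_records = [
    {"Directorate": "Example Directorate A", "Merchant Name": "Example Ltd",
     "Merchant Category Name": "Example category",
     "Transaction Date": "03/05/2016", "Transaction Amount": 750.0,
     "Expense Description": "Example purchase"},
    {"Directorate": "Example Directorate A", "Merchant Name": "Sample Co",
     "Merchant Category Name": "Example category",
     "Transaction Date": "17/05/2016", "Transaction Amount": 1250.5,
     "Expense Description": "Example purchase"},
    {"Directorate": "Example Directorate B", "Merchant Name": "Demo plc",
     "Merchant Category Name": "Example category",
     "Transaction Date": "21/05/2016", "Transaction Amount": 600.0,
     "Expense Description": "Example purchase"},
]

totals = total_by_directorate(sample_records)
first_date = parse_transaction_date(sample_records[0])
```

With a local checkout, `load_month("_data/spendsOver500/2016-05.json")` would supply the records in place of `sample_records`, provided the file matches the assumed layout.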
- Node.js (LTS recommended)
- Python 3.x
- pip
To download all monthly reports from gov.scot:
pip install -r requirements.txt
python scrape_data.py
To download only the months that are not already present locally (faster for incremental updates):
python scrape_data.py --skip-existing
To run the site locally:
npm run start
This starts Eleventy with live reload and the Parcel asset bundler in watch mode.
The update-data.yml workflow runs at 09:00 UTC on the first of every month. It:
- Runs scrape_data.py --skip-existing to fetch any newly published monthly reports
- Commits the new JSON files and updated headers.csv to main
- Rebuilds and redeploys the site to GitHub Pages
The workflow can also be triggered manually from the Actions tab in GitHub using the workflow_dispatch event, which is useful for backfilling missing months.
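The effect of --skip-existing, and why a manual run can backfill gaps, can be pictured as a simple set difference between the months published on gov.scot and the files already in _data/spendsOver500/. The sketch below is not taken from scrape_data.py; the function name `months_to_fetch` and its arguments are hypothetical, and it assumes files are named YYYY-MM.json as described above.

```python
import tempfile
from pathlib import Path


def months_to_fetch(available_months, data_dir):
    """Return the months (YYYY-MM) that have no matching JSON file in data_dir."""
    # Path.stem drops the ".json" suffix, leaving the YYYY-MM part of the name.
    existing = {path.stem for path in Path(data_dir).glob("*.json")}
    return [month for month in available_months if month not in existing]


# Demonstration against a temporary directory rather than the real repository.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "2016-05.json").write_text("[]", encoding="utf-8")
    (Path(tmp) / "2016-07.json").write_text("[]", encoding="utf-8")
    missing = months_to_fetch(["2016-05", "2016-06", "2016-07", "2016-08"], tmp)

print(missing)
```

Because the check is per file rather than "newest month only", a month missing from the middle of the series is picked up on the next run as well as any newly published one.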
The site is deployed automatically to GitHub Pages on every push to main (via build.yml), and also as part of the monthly data update workflow. GitHub Pages should be configured to serve from the gh-pages branch.