Berlin Buzzwords Conference Materials

2023

The code is a proof of concept (PoC) showing how to use TensorFlow in a Solr query parser. It includes:

/2023/config - the configuration files for Solr
/2023/solr - the final PoC code that extends Solr
/2023/tf-java-poc - the work in progress code for loading and working with the created model
/2023/tf-python-poc - the code for creating the model used for PoC
/2023/demo - the script for the demo during the talk

The video of the talk is available on YouTube.

2025

The video of the talk is available on YouTube.

2026

This code was used to prepare the data for the Berlin Buzzwords 2026 talk.

MovieLens ml-25m Vespa Indexer

Indexes the MovieLens ml-25m dataset into a locally running Vespa instance. The script creates three document types (movie, rating, and tag) and deploys the full Vespa application package automatically.

Prerequisites

Python 3.10+
Vespa running in Docker with ports 19071 (config server) and 8080 (feed/query) exposed
The ml-25m/ dataset directory (containing movies.csv, ratings.csv, tags.csv, links.csv)
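Before starting a long indexing run, it can help to confirm that the dataset directory is complete. The following is a minimal sketch, not part of the repository: the helper name `missing_files` is hypothetical, and the file list is taken from the prerequisites above.

```python
from pathlib import Path

# Files named in the prerequisites above.
REQUIRED_FILES = ("movies.csv", "ratings.csv", "tags.csv", "links.csv")


def missing_files(data_dir: Path) -> list[str]:
    """Return the required CSV files that are not present in data_dir."""
    return [name for name in REQUIRED_FILES if not (data_dir / name).is_file()]
```

Calling `missing_files(Path("./ml-25m"))` returns an empty list when all four files are in place.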

Start Vespa in Docker

docker run --detach \
  --name vespa \
  --hostname vespa-container \
  --publish 8080:8080 \
  --publish 19071:19071 \
  vespaengine/vespa
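The container needs a little time before the config server accepts deployments. As a sketch, assuming the default port mapping above, the readiness check can be automated by polling the config server health endpoint (`/state/v1/health`) until it reports `up`. The function names, attempt count, and delay are illustrative choices, not taken from the repository.

```python
import json
import time
import urllib.request
from typing import Callable

# Health endpoint of the config server on the published port 19071.
HEALTH_URL = "http://localhost:19071/state/v1/health"


def fetch_json(url: str) -> dict:
    """Fetch a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return json.load(response)


def is_up(payload: dict) -> bool:
    """Return True when the health payload reports status code 'up'."""
    return payload.get("status", {}).get("code") == "up"


def wait_for_config_server(
    url: str = HEALTH_URL,
    attempts: int = 60,
    delay_seconds: float = 2.0,
    fetch: Callable[[str], dict] = fetch_json,
) -> bool:
    """Poll the health endpoint until it reports 'up' or attempts run out."""
    for _ in range(attempts):
        try:
            if is_up(fetch(url)):
                return True
        except (OSError, ValueError):
            # Connection refused or an unreadable body: still starting up.
            pass
        time.sleep(delay_seconds)
    return False
```

Calling `wait_for_config_server()` returns `True` once the config server is ready, or `False` if it does not come up within the configured number of attempts.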

Setup

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Run

DATA_DIR=./ml-25m python3 batch.py

The script will:

Generate the Vespa application package (schemas + services.xml)
Deploy it to the config server
Feed all movies (~62K docs)
Feed all ratings (~25M docs)
Feed all tags (~1M docs)
Print a final summary
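To illustrate the feed steps above, here is a rough sketch of how rows from movies.csv could be turned into Vespa put operations in the JSON feed format. It is not the code from batch.py: the namespace `movielens` and the field names are assumptions made for illustration, and the actual schema may differ. The column names (`movieId`, `title`, `genres`) are those of the MovieLens CSV files.

```python
import csv
import io
from typing import Iterator, TextIO

# Assumption: hypothetical namespace; batch.py may use a different one.
NAMESPACE = "movielens"


def document_id(doc_type: str, key: str) -> str:
    """Build a Vespa document ID of the form id:<namespace>:<type>::<key>."""
    return f"id:{NAMESPACE}:{doc_type}::{key}"


def movie_operations(csv_file: TextIO) -> Iterator[dict]:
    """Lazily turn movies.csv rows into put operations, one row at a time."""
    for row in csv.DictReader(csv_file):
        yield {
            "put": document_id("movie", row["movieId"]),
            "fields": {
                # Field names are assumptions, not the schema from batch.py.
                "movie_id": int(row["movieId"]),
                "title": row["title"],
                # MovieLens separates genres with '|'.
                "genres": row["genres"].split("|"),
            },
        }
```

Yielding operations from a generator keeps memory use flat, which matters mostly for the much larger ratings file.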

Note: Indexing the full dataset can take quite some time; the ~25M ratings account for most of the documents.

Verification

You can run verify.py while indexing is in progress to check its status:

python3 verify.py
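The internals of verify.py are not shown here. As a hedged sketch of one way to check progress, the document count per type can be read from the Vespa query API on port 8080 by running a match-all query with `hits=0` and reading `root.fields.totalCount` from the response. The function names are hypothetical, and the sketch assumes each schema is named after its document type.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable

QUERY_URL = "http://localhost:8080/search/"
DOCUMENT_TYPES = ("movie", "rating", "tag")


def count_query_url(doc_type: str, base_url: str = QUERY_URL) -> str:
    """Build a match-all query URL that returns no hits, only the total count."""
    params = {"yql": f"select * from {doc_type} where true", "hits": "0"}
    return f"{base_url}?{urllib.parse.urlencode(params)}"


def total_count(response: dict) -> int:
    """Read root.fields.totalCount from a query response, defaulting to 0."""
    return int(response.get("root", {}).get("fields", {}).get("totalCount", 0))


def fetch_json(url: str) -> dict:
    """Fetch a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def fetch_counts(fetch: Callable[[str], dict] = fetch_json) -> dict[str, int]:
    """Return the current number of indexed documents for each type."""
    return {t: total_count(fetch(count_query_url(t))) for t in DOCUMENT_TYPES}
```

Comparing the result of `fetch_counts()` with the approximate totals listed in the Run section gives a rough idea of how far the feed has progressed.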
