There are many ways to be an open source contributor, and we're here to help you on your way! You may:
- Propose ideas in the #elasticgraph channel on the Block Open Source Discord server
- Raise an issue or feature request in our issue tracker
- Help another contributor with one of their questions, or a code review
- Suggest improvements to our Getting Started documentation by supplying a Pull Request
- Evangelize our work together in conferences, podcasts, and social media spaces
This guide is for you.
| Requirement | Tested Version | Installation Instructions |
|---|---|---|
| Ruby | 3.4.x or 4.0.x | ruby-lang.org |
| Java | JDK 11+ | java.com |
| Docker Engine | 27.x | docker.com |
| Docker Compose | 2.29.x | docker.com |
This project is written in Ruby, a dynamic, open source programming language with a focus on simplicity and productivity.
You may verify your Ruby installation via the terminal:

```
$ ruby -v
ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +PRISM [arm64-darwin24]
```

If you do not have Ruby, we recommend installing it using a Ruby version manager.
Ruby dependencies are managed using Bundler, which comes installed with Ruby. To install Ruby dependencies, run:

```
$ bundle install
```

Once that is done, prefix Ruby commands (`ruby`, `rspec`, `rake`, etc.) with `bundle exec` in order to run them in the context of the project bundle.
This project uses Docker Engine and Docker Compose to run Elasticsearch and OpenSearch locally. We recommend installing Docker Desktop to get both Docker dependencies.
The test suite requires Java 11 or greater to be available on `$PATH`. You can install a modern JDK using your package manager (e.g. `brew install java`).
The project bundle only contains the gems necessary for what runs on CI. For local development, you may want to use some additional gems, such as:
- debug for debugging
- vernier for profiling
- solargraph for an LSP implementation used by an IDE
Different engineers have different preferences around what gems to include, so the standard project bundle does not include gems like these. However, support is included to customize the development environment:
- Make a `Gemfile-custom` file listing the additional gems you want to include. See `Gemfile-custom.example` for an example.
- Run `source script/enable_custom_gemfile`. This will set the `BUNDLE_GEMFILE` and `BUNDLE_LOCKFILE` environment variables in your shell session so that `bundle exec` will run in the context of your custom bundle.
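For instance, a `Gemfile-custom` pulling in the gems mentioned above might look like the following sketch. This uses Bundler's standard `eval_gemfile` DSL method; see `Gemfile-custom.example` in the repo for the authoritative format:

```ruby
# Gemfile-custom (sketch only -- see Gemfile-custom.example for the real format).
# Load the standard project Gemfile, then layer on extra development-only gems.
eval_gemfile "Gemfile"

group :development do
  gem "debug"      # debugging
  gem "vernier"    # profiling
  gem "solargraph" # LSP implementation for IDE integration
end
```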
To understand how the different parts of the codebase fit together, see the codebase overview.
Using AI tools such as Goose, ChatGPT, Cursor, or Claude to contribute is encouraged. However:
- AI tools are assistants and should not replace critical thinking and judgement.
- We expect contributors to understand every line of code submitted in a PR--in the long run, humans are on the hook for maintaining it!
If you use an AI agent, feel free to leverage the growing AI memory bank, and if updates made by your AI agent to that directory seem worth keeping, please include them in your submitted PR!
Primary agent instructions are maintained in AGENTS.md. We keep CLAUDE.md as a compatibility symlink.
The codebase includes a variety of build scripts and executables which are useful for local development:
- `script/quick_build`: Performs an abridged version of the CI build. This is generally the most complete CI build we run locally. We recommend running it before opening a PR.
- `script/lint`: Runs the linter on the codebase, surfacing style and formatting issues.
  - Run `script/lint --fix` to autocorrect most linting issues.
- `script/type_check`: Runs a Steep type check.
- `script/spellcheck`: Spellchecks the codebase using codespell.
  - Run `script/spellcheck -w` to write autocorrections back to source files.
- `script/run_specs`: Runs the test suite.
- `script/run_gem_specs [gem_name]`: Runs the test suite for one ElasticGraph gem.
We use RSpec as our test framework.
Each of the ElasticGraph gems has its own test suite in `spec` (e.g. `elasticgraph-support/spec` contains the tests for `elasticgraph-support`).
Run the entire suite:
```
script/run_specs
```

To test a single gem (e.g. `elasticgraph-support`):

```
# From the root:
bundle exec rspec elasticgraph-support/spec

# Alternatively, run a gem's specs within the context of that gem's bundle, with code coverage tracked:
script/run_gem_specs elasticgraph-support

# Alternatively, run tests from within the gem's subdirectory:
cd elasticgraph-support
bundle exec rspec
```

The RSpec CLI is extremely flexible. Here are some useful options:
```
# See RSpec CLI options
bundle exec rspec --help

# Run all specs in one directory
bundle exec rspec path/to/dir

# Run all specs in one file
bundle exec rspec path/to/dir/file_spec.rb

# Run the spec defined at a specific line in a file
bundle exec rspec path/to/dir/file_spec.rb:47

# Run only the tests that failed the last time they ran
bundle exec rspec --only-failures

# Run just failures, and halt after the first failure (designed to be run repeatedly)
bundle exec rspec --next-failure
```

In addition, you can run tests in parallel by using `script/flatware_rspec` instead of `bundle exec rspec`:

```
script/flatware_rspec path/to/dir
```

Running tests in parallel using flatware tends to be faster for large test suite runs, but is usually slower for a small subset of the test suite (e.g. one file or directory).
script/quick_build, script/run_specs, and script/run_gem_specs use flatware when appropriate. (It's not always faster!)
The integration and acceptance tests require Elasticsearch or OpenSearch to be running locally on a specific port; to boot it for those tests, run one of the following in a separate terminal and leave it running:
```
bundle exec rake elasticsearch:test:boot
# or
bundle exec rake opensearch:test:boot
```

Note: our integration and acceptance tests hammer Elasticsearch/OpenSearch pretty hard, particularly when running tests in parallel. Sometimes that puts the datastore into a bad state. When this happens, simply kill the `rake *:test:boot` process, run it again, and then re-run the tests.
The source code for https://block.github.io/elasticgraph/ lives in config/site.
To serve it locally, run:
```
bundle exec rake site:serve
```

Then visit http://localhost:4000/elasticgraph/ in your browser. Local edits to the site will be reflected when you reload a page.
ElasticGraph's Ruby code is documented using YARD. You can view the rendered API docs in the context of the
project website using the same site:serve rake task (just visit http://localhost:4000/elasticgraph/api-docs/main/). However, that task
fully regenerates the documentation from scratch and it's not very quick. If you're working on multiple changes to the API documentation,
you'll get a faster feedback loop using the site:preview_docs:[gem name] tasks. For example, to preview the docs of
elasticgraph-schema_definition, run:
```
bundle exec rake site:preview_docs:elasticgraph-schema_definition
```

Then visit http://localhost:8808/. The preview task rebuilds only the parts of the generated docs impacted by your edits, and is quite fast.
One common type of contribution to ElasticGraph is adding a new query API feature, such as a new filtering predicate or aggregation function. This section walks through the process using the substring filtering feature (added in #555, #557, #559, and #560) as an example.
Before implementing a new query API feature:
- Create a GitHub Discussion to propose the feature and gather feedback
- Research the underlying datastore capabilities (Elasticsearch/OpenSearch features)
- Design the GraphQL API considering ElasticGraph's guiding principles:
- Maximize functionality while minimizing API surface area
- Ensure query validity can be statically verified
- Maintain consistency with existing patterns
Note
What if a breaking API change is needed? We prioritize API stability and aim to avoid that as much as possible. However, if a breaking change unlocks the ability to offer a significant improvement, it's something we'll allow using a multi-step process:
- Offer a schema definition option (e.g. `legacy_grouping_schema: true`) that lets users opt out of the breaking change, while defaulting to the new GraphQL schema (so that new projects automatically get the new-and-improved schema). As per our versioning policy, such a change can only go in a minor or major release, not a patch release. Be sure to update the example test schema to have fields/types using both the new and old schema features, so that we can maintain comprehensive test coverage of both approaches.
- In the next major release (which may be much, much later), we'll plan to remove the provided legacy option. Such a removal can only happen in a major release as per our versioning policy, since the upgrade may impact GraphQL clients. The release notes will need to include detailed upgrade instructions. See "Remove `legacy_grouping_schema: true`" in our v1.0.0 release notes for an example.
If you decide a breaking API change is needed, be sure to document your plans in the discussion proposing the feature.
See the substring filtering discussion for an example.
The first implementation step is to define the new GraphQL schema elements in the schema definition DSL. For this step, the changes usually include:
- New schema element names for any new fields or arguments. The `SchemaElementNames` class allows ElasticGraph users to customize the names used in the generated GraphQL schema. For example, in this case, it would allow a user to name the new prefix filtering predicate `beginsWith` instead of `startsWith`.
- New built-in types, or updates to existing built-in types, to expose the new functionality. Be sure to include documentation on any new types or fields.
- Test coverage of the new GraphQL schema elements.
- Artifact updates for the local/test schema used in this repo. The artifacts can be updated by running `bundle exec rake schema_artifacts:dump`.
See the substring schema definition PR for a complete example.
Next, implement the logic to translate from GraphQL to the appropriate datastore query form. For this step, the changes usually include:
- Updates to the core query engine logic. The place to make changes depends on what kind of functionality you're adding:
  - Changes may need to be made to `ElasticGraph::GraphQL::DatastoreQuery`, which is the intermediate form used by ElasticGraph internally to model an OpenSearch/Elasticsearch query.
  - For a new filtering predicate, add a new entry to the map of filter operators in `filter_node_interpreter.rb`.
  - For a new aggregation feature, multiple changes are typically needed under the `ElasticGraph::GraphQL::Aggregation` module:
    - `ElasticGraph::GraphQL::Aggregation::Query` models an aggregation query.
    - `ElasticGraph::GraphQL::Aggregation::QueryAdapter` is responsible for building an `ElasticGraph::GraphQL::Aggregation::Query` from the GraphQL query AST.
    - The `ElasticGraph::GraphQL::Aggregation::Resolvers` module is responsible for resolving GraphQL aggregation fields by extracting values from the datastore response.
- Multiple levels of comprehensive test coverage:
  - The acceptance tests exercise the new GraphQL feature end-to-end, and are the ultimate demonstration that your new feature works. We intentionally do not follow a "one assertion per test" rule with these tests; instead, we optimize for test speed by running multiple GraphQL queries after indexing some documents.
  - The integration tests still hit the datastore "for real", but do not exercise the GraphQL layer. Instead, these tests directly build and execute a `DatastoreQuery`.
  - The unit tests also directly build a `DatastoreQuery`. However, instead of executing it, we inspect the body of the produced query to verify it is correct.
- For new filtering predicates, be sure to consider what impact your change may have on shard routing and search index expressions. Otherwise, the queries may target the wrong shards or indices!
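To make the "map of filter operators" idea concrete, here is a hypothetical sketch (not ElasticGraph's actual implementation; all names are assumptions for illustration). Each GraphQL filtering predicate maps to a lambda that builds the corresponding datastore query clause, so a new predicate like `starts_with` slots in as one more entry:

```ruby
# Hypothetical sketch of a filter-operator map. A real entry in
# filter_node_interpreter.rb will look different; this just shows the shape:
# predicate name => lambda producing an Elasticsearch/OpenSearch query clause.
FILTER_OPERATORS = {
  equal_to_any_of: ->(field, values) { {terms: {field => values}} },
  gt: ->(field, value) { {range: {field => {gt: value}}} },
  # The new predicate: translates to the datastore's `prefix` query.
  starts_with: ->(field, prefix) { {prefix: {field => prefix}} }
}.freeze

# Translate one (operator, field, value) triple into a datastore clause.
def translate_filter(operator, field, value)
  FILTER_OPERATORS.fetch(operator).call(field, value)
end
```

With this shape, `translate_filter(:starts_with, "name", "foo")` yields `{prefix: {"name" => "foo"}}`, the clause the datastore would execute.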
See the substring query translation PR for a complete example.
Finally, add user-facing documentation to the ElasticGraph website to help users understand and use the new feature. This could take the form of a brand new page and/or updates to an existing page. As you work on the updates, run the following so you can view the site locally in your browser (at http://localhost:4000/elasticgraph/):
```
bundle exec rake site:serve
```

We aim to include working examples throughout our docs, so please add one or more example queries demonstrating usage of the new feature.
Example GraphQL queries are defined under `config/site/examples/*/queries` and then included in a documentation page using `{% include copyable_code_snippet.html language="graphql" data="..." %}`.
All example queries are validated as part of the CI build against the example schema provided by elasticgraph new when
bootstrapping a project, to verify that they return no errors and return some data. To try an example query out locally, run:
```
ELASTICGRAPH_GEMS_PATH=`pwd` bundle exec elasticgraph new tmp/demo_app
cd tmp/demo_app
bundle exec rake boot_locally
```

You may need to update the schema or factories provided in the project template so that the new query feature is available and produces matching results.
See the substring documentation PR for a complete example.
Common codebase maintenance tasks are documented in the maintainer's runbook.
Anyone from the community is welcome (and encouraged!) to raise issues via GitHub Issues.
Design discussions and proposals take place on GitHub discussions. We advocate an asynchronous, written discussion model - so write up your thoughts and invite the community to join in!
In addition, we have a Discord channel (#elasticgraph) on the Block Open Source Discord server
for synchronous communication. Discord is best for questions and general conversation.
Build and test cycles are run on every commit to every branch on GitHub Actions.
We review contributions to the codebase via GitHub's Pull Request mechanism. We have the following guidelines to ease your experience and help our leads respond quickly to your valuable work:
- Start by proposing a change either on Discord (most appropriate for small change requests or bug fixes) or in Discussions (most appropriate for design and architecture considerations, proposing a new feature, or where you'd like insight and feedback).
- Cultivate consensus around your ideas; the project leads will help you pre-flight how beneficial the proposal might be to the project. Developing early buy-in will help others understand what you're looking to do, and give you a greater chance of your contributions making it into the codebase! No one wants to see work done in an area that's unlikely to be incorporated into the codebase.
- Fork the repo into your own namespace/remote.
- Work in a dedicated feature branch. Atlassian wrote a great description of this workflow.
- When you're ready to submit a pull request:
  - Squash your commits into a single one (or an appropriate small number of commits), and rebase atop the upstream `main` branch. This will limit the potential for merge conflicts during review, and helps keep the audit trail clean. A good writeup for how this is done is here, and if you're having trouble, feel free to ask a member of the community for help, or leave the commits as-is and flag that you'd like rebasing assistance in your PR! We're here to support you.
  - Please run `script/quick_build` and fix any failures (it'll be faster to get your change merged if it already passes the build!)
    - If you're not sure how to fix the failures (and an AI agent isn't helping), feel free to submit what you have and we'll recommend the fix.
  - Open a PR in the project to bring in the code from your feature branch.
  - The maintainers noted in the CODEOWNERS file will review your PR and optionally open a discussion about its contents before moving forward.
  - Remain responsive to follow-up questions, be open to making requested changes, and... you're a contributor!
- Remember to respect everyone in our development community. Guidelines are established in our Code of Conduct.