
Contribution Guide

There are many ways to be an open source contributor, and we're here to help you on your way! You may:

  • Propose ideas in the #elasticgraph channel on the Block Open Source Discord server
  • Raise an issue or feature request in our issue tracker
  • Help another contributor with one of their questions, or a code review
  • Suggest improvements to our Getting Started documentation by supplying a Pull Request
  • Evangelize our work together in conferences, podcasts, and social media spaces

Whichever path you choose, this guide is for you.

Development Prerequisites

Requirement       Tested Version    Installation Instructions
Ruby              3.4.x or 4.0.x    ruby-lang.org
Java              JDK 11+           java.com
Docker Engine     27.x              docker.com
Docker Compose    2.29.x            docker.com

Ruby

This project is written in Ruby, a dynamic, open source programming language with a focus on simplicity and productivity.

You may verify your ruby installation via the terminal:

$ ruby -v
ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +PRISM [arm64-darwin24]

If you do not have Ruby, we recommend installing it with a Ruby version manager.

Ruby Dependencies

Ruby dependencies are managed using bundler, which comes installed with Ruby. To install Ruby dependencies, run:

$ bundle install

Once that is done, prefix Ruby commands (ruby, rspec, rake, etc) with bundle exec in order to run them in the context of the project bundle.

Docker and Docker Compose

This project uses Docker Engine and Docker Compose to run Elasticsearch and OpenSearch locally. We recommend installing Docker Desktop to get both Docker dependencies.

Java

The test suite requires Java 11 or greater to be available on $PATH. You can install a modern JDK using your package manager (e.g. brew install java).

Customizing the Development Environment

The project bundle only contains the gems necessary for what runs on CI. For local development, you may want to use some additional gems.

Different engineers have different preferences around which extra gems to include, so the standard project bundle does not include any of them. However, support is included to customize the development environment:

  • Make a Gemfile-custom file listing the additional gems you want to include. See Gemfile-custom.example for an example.
  • Run source script/enable_custom_gemfile.

This will set the BUNDLE_GEMFILE and BUNDLE_LOCKFILE environment variables in your shell session so that bundle exec will run in the context of your custom bundle.
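For illustration, a Gemfile-custom might look like the following (the gem choices here are hypothetical; see Gemfile-custom.example for the project's real example):

```ruby
# Gemfile-custom (hypothetical contents): extra gems for local development
# that are not part of the CI bundle.
gem "debug"      # interactive debugger
gem "solargraph" # language server for editor integration
```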

Codebase Overview

To understand how the different parts of the codebase fit together, see the codebase overview.

Using AI Tools

Using AI tools such as Goose, ChatGPT, Cursor, or Claude to contribute is encouraged. However:

  • AI tools are assistants and should not replace critical thinking and judgement.
  • We expect contributors to understand every line of code submitted in a PR; in the long run, humans are on the hook for maintaining it!

If you use an AI agent, feel free to leverage the growing AI memory bank, and if updates made by your AI agent to that directory seem worth keeping, please include them in your submitted PR!

Primary agent instructions are maintained in AGENTS.md. We keep CLAUDE.md as a compatibility symlink.

Build Scripts and Executables

The codebase includes a variety of build scripts and executables which are useful for local development:

  • script/quick_build: Performs an abridged version of the CI build. This is generally the most complete CI build we run locally. We recommend running it before opening a PR.
  • script/lint: Runs the linter on the codebase, surfacing style and formatting issues.
    • Run script/lint --fix to autocorrect most linting issues.
  • script/type_check: Runs a steep type check.
  • script/spellcheck: Spellchecks the codebase using codespell.
    • Run script/spellcheck -w to write autocorrections back to source files.
  • script/run_specs: Runs the test suite.
  • script/run_gem_specs [gem_name]: Runs the test suite for one ElasticGraph gem.

Running Tests

We use RSpec as our test framework.

Each of the ElasticGraph gems has its own test suite in spec (e.g. elasticgraph-support/spec contains the tests for elasticgraph-support).

Run the entire suite:

script/run_specs

To test a single gem (e.g., elasticgraph-support):

# From the root:
bundle exec rspec elasticgraph-support/spec

# Alternatively run a gem's specs within the context of that gem's bundle, with code coverage tracked:
script/run_gem_specs elasticgraph-support

# Alternatively, you can run tests within a subdirectory:
cd elasticgraph-support
bundle exec rspec

The RSpec CLI is extremely flexible. Here are some useful options:

# See RSpec CLI options
bundle exec rspec --help

# Run all specs in one directory
bundle exec rspec path/to/dir

# Run all specs in one file
bundle exec rspec path/to/dir/file_spec.rb

# Run the spec defined at a specific line in a file
bundle exec rspec path/to/dir/file_spec.rb:47

# Run only the tests that failed the last time they ran
bundle exec rspec --only-failures

# Run just failures, and halt after the first failure (designed to be run repeatedly)
bundle exec rspec --next-failure
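Note that --only-failures and --next-failure rely on RSpec persisting example statuses between runs. A typical spec_helper.rb setting for this (shown as a sketch; this repo's actual configuration may differ) looks like:

```ruby
# spec_helper.rb sketch: tell RSpec where to record pass/fail statuses
# so that --only-failures / --next-failure know what failed last run.
RSpec.configure do |config|
  config.example_status_persistence_file_path = "tmp/rspec_examples.txt"
end
```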

In addition, you can run tests in parallel by using script/flatware_rspec instead of bundle exec rspec:

script/flatware_rspec path/to/dir

Running tests in parallel using flatware tends to be faster for large test suite runs, but is usually slower for running a small subset of the test suite (e.g. one file or directory).

script/quick_build, script/run_specs, and script/run_gem_specs use flatware when appropriate. (It's not always faster!)

The integration and acceptance tests require Elasticsearch or OpenSearch to be running locally on a specific port; to boot it for those tests, run one of the following in a separate terminal and leave it running:

bundle exec rake elasticsearch:test:boot
# or
bundle exec rake opensearch:test:boot

Note: our integration and acceptance tests hammer Elasticsearch/OpenSearch pretty hard, particularly when running tests in parallel. Sometimes that puts the datastore into a bad state. When this happens, simply kill the rake *:test:boot process, and run it again; then re-run the tests.

Project Website

The source code for https://block.github.io/elasticgraph/ lives in config/site.

To serve it locally, run:

bundle exec rake site:serve

Then visit http://localhost:4000/elasticgraph/ in your browser. Local edits to the site will be reflected when you reload a page.

API Documentation

ElasticGraph's Ruby code is documented using YARD. You can view the rendered API docs in the context of the project website using the same site:serve rake task (just visit http://localhost:4000/elasticgraph/api-docs/main/). However, that task fully regenerates the documentation from scratch and it's not very quick. If you're working on multiple changes to the API documentation, you'll get a faster feedback loop using the site:preview_docs:[gem name] tasks. For example, to preview the docs of elasticgraph-schema_definition, run:

bundle exec rake site:preview_docs:elasticgraph-schema_definition

Then visit http://localhost:8808/. The preview task will rebuild the parts of the generated docs impacted by your edits, and is quite fast.
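For reference, YARD documentation is written as structured comments above each method. A minimal sketch (the method below is hypothetical, not part of ElasticGraph):

```ruby
# Truncates a string to at most `max_length` characters, replacing the
# trailing characters with an ellipsis when truncation occurs.
#
# @param string [String] the string to truncate
# @param max_length [Integer] maximum length of the returned string
# @return [String] the original or truncated string
#
# @example
#   truncate("ElasticGraph", 7) # => "Elas..."
def truncate(string, max_length)
  return string if string.length <= max_length
  "#{string[0, max_length - 3]}..."
end
```

YARD renders the @param/@return/@example tags into the API docs pages served by the rake tasks above.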

Adding a New Query API Feature

One common type of contribution to ElasticGraph is adding a new query API feature, such as a new filtering predicate or aggregation function. This section walks through the process using the substring filtering feature (added in #555, #557, #559, and #560) as an example.

Step 1: Design and Discussion

Before implementing a new query API feature:

  1. Create a GitHub Discussion to propose the feature and gather feedback
  2. Research the underlying datastore capabilities (Elasticsearch/OpenSearch features)
  3. Design the GraphQL API considering ElasticGraph's guiding principles:
    • Maximize functionality while minimizing API surface area
    • Ensure query validity can be statically verified
    • Maintain consistency with existing patterns

Note

What if a breaking API change is needed? We prioritize API stability and aim to avoid that as much as possible. However, if a breaking change unlocks the ability to offer a significant improvement, it's something we'll allow using a multi-step process:

  1. Offer a schema definition option (e.g. legacy_grouping_schema: true) that lets users opt-out of the breaking change, while defaulting to the new GraphQL schema (so that new projects automatically get the new-and-improved schema). As per our versioning policy, such a change can only go in a minor or major release, not a patch release. Be sure to update the example test schema to have fields/types using both the new and old schema features, so that we can maintain comprehensive test coverage of both the old and new approaches.
  2. In the next major release (which may be much, much later), we'll plan to remove the provided legacy option. Such a removal can only happen in a major release as per our versioning policy, since the upgrade may impact GraphQL clients. The release notes will need to include detailed upgrade instructions. See "Remove legacy_grouping_schema: true" from our v1.0.0 release notes for an example.

If you decide a breaking API change is needed, be sure to document your plans in the discussion proposing the feature.

See the substring filtering discussion for an example.

Step 2: Define Schema Elements

The first implementation step is to define the new GraphQL schema elements in the schema definition DSL. For this step, the changes usually include:

  • New schema element names for any new fields or arguments. The SchemaElementNames class allows ElasticGraph users to customize the names used in the generated GraphQL schema. For example, in this case, it would allow a user to name the new prefix filtering predicate beginsWith instead of startsWith.
  • New built-in types or updates to existing built-in types to expose the new functionality. Be sure to include documentation on any new types or fields.
  • Test coverage of the new GraphQL schema elements.
  • Artifact updates for the local/test schema used in this repo. The artifacts can be updated by running bundle exec rake schema_artifacts:dump.
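To make the naming customization concrete, here is a loose sketch of the idea behind SchemaElementNames (simplified and hypothetical; the real class is richer):

```ruby
# Simplified sketch of customizable schema element naming: a lookup that
# maps canonical element names to user-chosen overrides, falling back to
# the canonical name when no override is configured.
class ElementNames
  def initialize(overrides = {})
    @overrides = overrides
  end

  def name_for(canonical)
    @overrides.fetch(canonical, canonical)
  end
end

names = ElementNames.new("startsWith" => "beginsWith")
names.name_for("startsWith") # => "beginsWith"
names.name_for("equalTo")    # => "equalTo"
```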

See the substring schema definition PR for a complete example.

Step 3: Implement Query Translation

Next, implement the logic to translate from GraphQL to the appropriate datastore query form. For this step, the changes usually include:

  • Updates to the core query engine logic. Where to make changes depends on the kind of functionality you're adding.
  • Multiple levels of comprehensive test coverage:
    • The acceptance tests exercise the new GraphQL feature end-to-end, and are the ultimate demonstration that your new feature works. We intentionally do not follow a "one assertion per test" rule with these tests; instead, we optimize for test speed by running multiple GraphQL queries after indexing some documents.
    • The integration tests still hit the datastore "for real", but do not exercise the GraphQL layer. Instead, these tests directly build and execute a DatastoreQuery.
    • The unit tests also directly build a DatastoreQuery. However, instead of executing the DatastoreQuery, we inspect the body of the produced query to verify it is correct.
  • For new filtering predicates, be sure to consider what impact your change may have on shard routing and search index expressions. Otherwise, the queries may target the wrong shards or indices!
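As a rough, hypothetical sketch of what filter-to-datastore-query translation involves (this is not ElasticGraph's actual query engine code), a `contains` predicate might be turned into a wildcard clause like so:

```ruby
# Hypothetical translation of a substring (`contains`) filter predicate
# into an Elasticsearch/OpenSearch `wildcard` query clause. Escaping and
# structure are greatly simplified relative to the real query engine.
def substring_filter_clause(field, substring)
  # Escape wildcard metacharacters so user input is matched literally.
  escaped = substring.gsub(/([?*\\])/) { "\\#{::Regexp.last_match(1)}" }
  { "wildcard" => { field => { "value" => "*#{escaped}*" } } }
end

substring_filter_clause("name", "graph")
# => {"wildcard"=>{"name"=>{"value"=>"*graph*"}}}
```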

See the substring query translation PR for a complete example.

Step 4: Update Documentation

Finally, add user-facing documentation to the ElasticGraph website to help users understand and use the new feature. This could take the form of a brand new page and/or updates to an existing page. As you work on the updates, run the following so you can view the site locally in your browser (at http://localhost:4000/elasticgraph/):

bundle exec rake site:serve

We aim to include working examples throughout our docs, so please add one or more example queries demonstrating usage of the new feature. Example GraphQL queries are defined under config/site/examples/*/queries and then included in a documentation page using {% include copyable_code_snippet.html language="graphql" data="..." %}.

All example queries are validated as part of the CI build against the example schema provided by elasticgraph new when bootstrapping a project, to verify that they return no errors and return some data. To try an example query out locally, run:

ELASTICGRAPH_GEMS_PATH=`pwd` bundle exec elasticgraph new tmp/demo_app
cd tmp/demo_app
bundle exec rake boot_locally

You may need to update the schema or factories provided in the project template so that the new query feature is available and produces matching results.

See the substring documentation PR for a complete example.

Maintenance Tasks

Common codebase maintenance tasks are documented in the maintainer's runbook.

Communications

Issues

Anyone from the community is welcome (and encouraged!) to raise issues via GitHub Issues.

Discussions

Design discussions and proposals take place on GitHub discussions. We advocate an asynchronous, written discussion model - so write up your thoughts and invite the community to join in!

In addition, we have a discord channel (#elasticgraph) on the Block Open Source Discord server for synchronous communication. Discord is best for questions and general conversation.

Continuous Integration

Build and test cycles are run on every commit to every branch on GitHub Actions.

Contribution

We review contributions to the codebase via GitHub's Pull Request mechanism. We have the following guidelines to ease your experience and help our leads respond quickly to your valuable work:

  • Start by proposing a change either on Discord (most appropriate for small change requests or bug fixes) or in Discussions (most appropriate for design and architecture considerations, proposing a new feature, or where you'd like insight and feedback).
  • Cultivate consensus around your ideas; the project leads will help you pre-flight how beneficial the proposal might be to the project. Developing early buy-in will help others understand what you're looking to do, and give you a greater chance of your contributions making it into the codebase! No one wants to see work done in an area that's unlikely to be incorporated into the codebase.
  • Fork the repo into your own namespace/remote.
  • Work in a dedicated feature branch. Atlassian wrote a great description of this workflow.
  • When you're ready to submit a pull request:
    • Squash your commits into a single one (or an appropriately small number of commits), and rebase atop the upstream main branch. This limits the potential for merge conflicts during review and helps keep the audit trail clean. A good writeup of how this is done is here; if you're having trouble, feel free to ask a member of the community for help, or leave the commits as-is and flag that you'd like rebasing assistance in your PR! We're here to support you.
    • Please run script/quick_build and fix any failures (it'll be faster to get your change merged if it already passes the build!)
      • If you're not sure how to fix the failures (and an AI agent isn't helping), feel free to submit what you have and we'll recommend the fix.
    • Open a PR in the project to bring in the code from your feature branch.
    • The maintainers noted in the CODEOWNERS file will review your PR and optionally open a discussion about its contents before moving forward.
    • Remain responsive to follow-up questions, be open to making requested changes, and... You're a contributor!
  • Remember to respect everyone in our development community. Guidelines are established in our Code of Conduct.