From f3edf6b50e7ec133248ee9d820a9f1b88860fd4a Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Pawe=C5=82=20Lachowski?= Date: Wed, 6 May 2026 11:37:14 +0200 Subject: [PATCH 1/3] RDoc-3819 Verify & improve SEO of existing guides. --- guides/employing-data-archival-guide.mdx | 65 ++++-- ...ma-validation-to-standardize-your-data.mdx | 178 ++++++---------- guides/how-is-my-database-today.mdx | 56 ++--- ...g-red-flags-in-index-definitions-guide.mdx | 39 +++- ...es-using-data-subscriptions-in-ravendb.mdx | 51 +++-- guides/ravendb-client-certificates.mdx | 193 +++++++++--------- ...search-with-ravendb-python-and-fastapi.mdx | 107 ++++++---- guides/spatial-search-in-ravendb.mdx | 150 +++++++------- ...e-the-ai-tidal-wave-with-ravendb-genai.mdx | 47 +++-- guides/transactional-outbox.mdx | 13 +- ...emote-attachments-to-cut-storage-costs.mdx | 4 +- .../vibe-coding-with-ravendb-and-context7.mdx | 9 +- guides/zabbix-setup-guide.mdx | 83 +++++--- 13 files changed, 542 insertions(+), 453 deletions(-) diff --git a/guides/employing-data-archival-guide.mdx b/guides/employing-data-archival-guide.mdx index 14bff44889..f461ae9ceb 100644 --- a/guides/employing-data-archival-guide.mdx +++ b/guides/employing-data-archival-guide.mdx @@ -4,7 +4,7 @@ tags: [data-governance, csharp, use-case] icon: "real-time-statistics" image: "https://ravendb.net/wp-content/uploads/2025/10/employing-data-archival-article.svg" publishedAt: 2025-11-17 -description: "How to set up and use data archival in RavenDB" +description: "Use RavenDB Data Archival to exclude historical documents from indexes and queries, keeping performance high without deleting data you need to retain." author: "Paweł Lachowski" proficiencyLevel: "Expert" --- @@ -25,7 +25,7 @@ Queries providing different financial analytics (e.g., previous month summaries) You start to investigate and realise it’s not just a spike caused by traffic or a software bug slowing you down. It’s the sheer volume of data that you need to index. -You can’t just delete older data with [expiration](https://docs.ravendb.net/7.1/studio/database/settings/document-expiration/) to free space. It’s prohibited by the legal restrictions. Being forced to keep large amounts of required but rarely used documents, only to suffer through slower queries as a result, feels like a torture for your business. +You can’t just delete older data with [expiration](https://docs.ravendb.net/studio/database/settings/document-expiration/) to free space. It’s prohibited by the legal restrictions. Being forced to keep large amounts of required but rarely used documents, only to suffer through slower queries as a result, feels like a torture for your business. ## What is the problem? @@ -39,8 +39,8 @@ from index "InvoicesByIssueDate" where issue_date >= "2025-06-25T10:21:56.1500Z" A query (not so complex) alone took over 1200ms. Exploring the data by the user takes a lot of time, reducing the quality of service and making interacting with your application frustrating. - - +RavenDB Studio query result showing over 1200ms execution time on the InvoicesByIssueDate index before archival +RavenDB Studio index view showing a large number of entries due to unarchived historical invoice documents Not mentioning overhead of the rest of the system. Forget about instant filtering \- you’re indexing all of the data, even though your users don’t care about the old stuff. This creates a huge problem as all those files from seven years ago are still being viewed and filtered during the query. 
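If you want to confirm the slowdown from code rather than from the studio, you can ask the server for query statistics. Below is a minimal sketch, assuming the C# client, an initialized `store`, and the same `InvoicesByIssueDate` index used in the query above:

```csharp
using System;
using System.Linq;
using Raven.Client.Documents.Session;

using (var session = store.OpenSession())
{
    // Run the same slow query and capture the server-side statistics for it
    var results = session.Advanced
        .RawQuery<object>(@"from index ""InvoicesByIssueDate""
                            where issue_date >= ""2025-06-25T10:21:56.1500Z""")
        .Statistics(out QueryStatistics stats)
        .ToList();

    // DurationInMs is the server-side execution time - the number we want to shrink
    Console.WriteLine($"Took {stats.DurationInMs} ms for {stats.TotalResults} results");
}
```

Capturing these numbers before and after enabling archival gives you a concrete measure of the improvement.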
@@ -63,7 +63,10 @@ To trigger this feature, you can take one of two approaches: from code or the st **If you’re using RavenDB Cloud,** make sure to enable Data Archival in the Product Features section before continuing. - +RavenDB Cloud portal Product Features section with the Data Archival toggle enabled + +### From Code + To enable and trigger data archival from your application, we need to define which collection should be archived and set up the archival configuration. That’s how we do that: ```csharp showLineNumbers @@ -93,6 +96,8 @@ This code triggers archival in the Invoices database. We set it to check every h Note: You can make a synchronous version by deleting await in the apply configuration and changing store.Maintenance.SendAsync to just store.Maintenance.Send. +### From Studio + Or we can trigger it from the studio easily, we just need to: 1. Go to Settings \=\> Data Archival. @@ -101,14 +106,16 @@ Or we can trigger it from the studio easily, we just need to: 4. Optionally, toggle to customize the maximum number of documents the archiving task will process in a single run. 5. Click Save to apply your settings. - +RavenDB Studio Data Archival settings panel showing the frequency and maximum documents per run configuration options But this is only the beginning. We have only triggered our archival, but now we need to tell RavenDB which files we want to archive. ## Sorting Through the Piles: Choosing What to Archive Now that our archivist is working, they need to know what we want to archive, and we have numerous invoices for them to review. Now we just need to tell them to archive what we need. For our invoices, we want to archive those older than 90 days. -We can take different approaches to achieving that. We will start with the simplest. First, let's do it in the studio, and later we will make it from code. +We can take different approaches to achieving that. We will start with the simplest. First, let’s do it in the studio, and later we will make it from code. + +### Using Studio Patch Let’s enter the studio, select our database, and then the Documents tab. In there, we want to enter the Patch menu. In there, we will use a patch on files to archive files older than 90 days. This patch would look like this: @@ -117,14 +124,16 @@ from "Invoices" update { var archiveDate = new Date(this.issue_date); archiveDate.setDate(archiveDate.getDate() + 90); - + var archiveDateString = archiveDate.toISOString(); archived.archiveAt(this, archiveDateString); } ``` This code calculates our gap 90 days from the `issue_date` of each file. Then it converts it to an ISO string, allowing us to use and archive it using this calculated value. RavenDB will process all documents in the Invoices collection. - +RavenDB Studio Patch view executing a patch on the Invoices collection to set the archive-at date 90 days from each document's issue date +### Using Code + Alternatively, we can do it with patchByQuery from code: ```csharp showLineNumbers @@ -150,6 +159,8 @@ store.Operations.Send(patchByQueryOp); With this update, we take the `issue_date` of our invoices, add 90 days to it, and convert it to an ISO string. The data archivist that we set up before will see these entries and schedule them for archiving. +### At Write Time + If you want to add an archiving flag to the document when it arrives at the database, you can do that with this additional code after adding this document. For example, it would look like this. 
```csharp showLineNumbers @@ -163,9 +174,9 @@ using (var session = store.OpenSession()) // ... other properties }; - session.Store(newCompany); + session.Store(newInvoice); - var metadata = session.Advanced.GetMetadataFor(newCompany); + var metadata = session.Advanced.GetMetadataFor(newInvoice); // Set the archival date to 90 days in the future (in UTC) var archiveDate = SystemTime.UtcNow.AddDays(90); @@ -179,13 +190,13 @@ using (var session = store.OpenSession()) How does archival affect this problem? After setting up the archival, documents only 90 days old or newer will be made available by default for all indexes. Because indexes are interconnected with other parts of the system, this affects your whole system. After archiving our files, changes are made to the metadata. - +RavenDB document metadata view showing the @archived flag set to true after the archival process ran And the index becomes smaller, containing fewer files. - +RavenDB Studio index view showing a reduced entry count after old invoices were archived and excluded from indexing This will affect queries by reducing the time they need to search through data. After all, a big part of the documents isn’t in the indexes. - +RavenDB Studio query result showing significantly reduced execution time on the InvoicesByIssueDate index after archival Query performance improved, but the data is still in the database. Perfect of two worlds. ## Archive behaviour @@ -194,17 +205,24 @@ With added archival to documents, they will be ignored by default during specifi When you have archival enabled during index creation, you can choose the behavior the index should take with archived files. This allows you to choose if you want to exclude those documents, include them, or query only on archived ones. +### Indexes + In indexes, you do that at the bottom of the index in the configuration. - +RavenDB Studio index configuration panel showing the archived document behavior dropdown with options to exclude, include, or query only archived documents + +### Subscriptions + Or if you are using subscriptions, you can change this setting directly under subscription RQL. - -More information about behaviour can be found in our documentation under this [link](https://docs.ravendb.net/7.1/data-archival/archived-documents-and-other-features). +RavenDB Studio subscription task editor showing the archived document behavior setting below the RQL query +More information about behaviour can be found in [Archived Documents and Other Features](https://docs.ravendb.net/data-archival/archived-documents-and-other-features) in our documentation. ## Taking documents out -If you need to unarchive a document, it can be done in one of two simple ways. This can be done either in the studio using a patch: +If you need to unarchive a document, it can be done in one of two simple ways. + +### From Studio ``` from Invoices as o @@ -214,6 +232,8 @@ update { } ``` +### From Code + Or in code using this auto-index patchByQuery function: ``` @@ -235,6 +255,11 @@ Note: Don’t try to just remove `@archived: true` manually in the document. Thi ## Summary -Data archival gives you more control over your database. By excluding unnecessary documents from queries, it makes them run smoother. If you’re also looking to reduce storage costs for binary data, check out our guide on [using remote attachments to cut storage costs](https://docs.ravendb.net/guides/using-remote-attachments-to-cut-storage-costs). 
But there’s so much more to RavenDB, like GenAI support, which you can read about [here](https://ravendb.net/articles/survive-the-ai-tidal-wave-with-ravendb-genai) and so many other features. +- Data Archival flags documents as archived, keeping them in the database while excluding them from indexes and queries by default, so you retain data without paying the performance cost. +- Enable archival via `ConfigureDataArchivalOperation` in code or through Settings > Data Archival in RavenDB Studio. +- Mark individual documents or whole collections for archival by setting the `@archive-at` metadata field, either at write time or retroactively with a patch operation. +- Archived documents can be selectively included in specific indexes or subscriptions, and can be unarchived at any time using `archived.unarchive(this)`. + +Data archival gives you more control over your database. By excluding unnecessary documents from queries, it makes them run smoother. If you’re also looking to reduce storage costs for binary data, check out our guide on [using remote attachments to cut storage costs](https://docs.ravendb.net/guides/using-remote-attachments-to-cut-storage-costs). But there’s so much more to RavenDB, like [GenAI support](https://ravendb.net/articles/survive-the-ai-tidal-wave-with-ravendb-genai) and so many other features. -Interested in testing this feature before taking a production licence? Grab the developer license dedicated to testing under this link [here](https://ravendb.net/dev). Any questions about this feature or just want to hang out and talk with the RavenDB team? Join our Discord Community Server \- invitation link is [here](https://discord.com/invite/ravendb). +Interested in testing this feature before taking a production licence? Grab the [RavenDB developer license](https://ravendb.net/dev) dedicated to testing. Any questions about this feature or just want to hang out and talk with the RavenDB team? Join our [RavenDB Discord Community Server](https://discord.com/invite/ravendb). diff --git a/guides/employing-schema-validation-to-standardize-your-data.mdx b/guides/employing-schema-validation-to-standardize-your-data.mdx index 92e5f1c7c7..3e6c695211 100644 --- a/guides/employing-schema-validation-to-standardize-your-data.mdx +++ b/guides/employing-schema-validation-to-standardize-your-data.mdx @@ -1,11 +1,24 @@ --- -title: "Employing Schema Validation to standardize your data" +title: "Employing Schema Validation to Standardize Your Data" tags: [data-governance, csharp, use-case] icon: "document-schema" -description: "Learn more about configuring schema validation in RavenDB to incrase trust in your data across teams." +description: "Configure JSON schema validation in RavenDB to enforce data consistency across teams. Learn how to audit documents against a schema or block invalid writes at the database level." 
publishedAt: 2026-02-01 author: "Paweł Lachowski" -proficiencyLevel: "Expert" +proficiencyLevel: "Intermediate" +see_also: + - title: "Schema Validation Overview" + link: "documents/schema-validation/overview" + source: "docs" + path: "Documents > Schema Validation" + - title: "Schema Validation Configuration" + link: "documents/schema-validation/configuration" + source: "docs" + path: "Documents > Schema Validation" + - title: "Auditing Document Compliance: API" + link: "documents/schema-validation/auditing-document-compliance/auditing-document-compliance_api" + source: "docs" + path: "Documents > Schema Validation > Auditing Document Compliance" --- import Admonition from '@theme/Admonition'; @@ -16,7 +29,7 @@ import LanguageSwitcher from "@site/src/components/LanguageSwitcher"; import LanguageContent from "@site/src/components/LanguageContent"; import Image from "@theme/IdealImage"; -When working with any type of data, things usually go smoothly when a single team owns the entire system. When an organisation adds new teams that work on the same data, it can quickly become a mess without proper governance. You may find yourself being frustrated that the other team is saving as a string what breaks your system from time to time. Data must be tracked, and dataset version changes must be enforced precisely. You need to keep the same data model between all documents. Thankfully, we now have a tool to handle those situation and you don’t need to prepare for the worst by preparing boundaries. This tool is a schema validation. +When working with any type of data, things usually go smoothly when a single team owns the entire system. When an organisation adds new teams that work on the same data, it can quickly become a mess without proper governance. You may find yourself being frustrated that the other team is saving a value as a string that ends up breaking your system from time to time. Data must be tracked, and dataset version changes must be enforced precisely. You need to keep the same data model between all documents. Thankfully, we now have a tool to handle those situations. This tool is schema validation. Schema validation ensures your documents conform to the defined data model. Whether you need a single check after a model change or migration, or to run it continuously, RavenDB’s JSON schema validation keeps your data model predictable and consistent, even in the era of rapid development, increasing trust in your data across multiple teams. @@ -26,28 +39,40 @@ Your team just doesn’t have time to prepare for all possible errors caused by You can also use schema validation after migration or for general big data changes. You can just run schema validation and check all the documents. Just create a schema, validate it in schema validation, and run the process to avoid stress if everything transfers without data corruption. -On the other hand, if you can pick a more restricted way and block non-fitting documents. It reduces the manual work you'll need later and shortens planning time for possibilities you don’t want. +On the other hand, you can pick a more restricted approach and block non-fitting documents outright, reducing the manual work you’ll need later and shortening planning time for edge cases you don’t want. But how does it work in RavenDB? Let’s look into it. 
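Before we do, it helps to picture the exact drift we are trying to catch. Here is a hypothetical pair of documents from the same Orders collection; the second one stores `Freight` as a string instead of a number, which is easy to miss in day-to-day work but breaks every consumer that expects a numeric field:

```json
[
  { "Company": "companies/1-A", "Freight": 12.5 },
  { "Company": "companies/2-A", "Freight": "12.5" }
]
```

A schema that declares `Freight` as a `number` will flag the second document during an audit, or block it outright at write time.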
## How it works in RavenDB -Mode 1: Non-blocking validation +### Mode 1: Non-blocking validation + You define a schema and validate documents occasionally using an audit operation, or rely on an index that collects schema errors. Neither approach blocks writes. Audit operations check documents when you execute them. The index-based approach also allows all documents, matching or not, to be stored, and simply records any issues for later querying. -Mode 2: Strict validation +### Mode 2: Strict validation + You enable schema validation on every write. In this mode, RavenDB blocks any document that does not match the schema, ensuring that only valid data is accepted. These two modes give you flexibility: you can use schema validation as a diagnostic tool that never interferes with writes, or enforce it strictly to guarantee schema correctness at all times. It gives you a guarantee that no invalid document will make it into the database. -The best thing is that you don’t need to write any line of code to define how the database should react to finding schema errors. If any issues are found with your documents, you will receive a clear response with documents ID and the exact reason they failed to verify against the schema. +| | Mode 1: Audit (non-blocking) | Mode 2: Strict (blocking) | +|---|---|---| +| **Blocks writes?** | No | Yes | +| **How errors surface** | Audit operation results or index query | Write error returned immediately | +| **When to trigger** | On demand or via background index | Every document write | +| **Best for** | Post-migration checks, diagnostics, gradual rollout | Enforcing a guaranteed data contract | +| **Requires schema enabled?** | No (index can carry its own schema) | Yes | + +The best thing is that you don’t need to write any line of code to define how the database should react to finding schema errors. If any issues are found with your documents, you will receive a clear response with each document's ID and the exact reason it failed to verify against the schema. + +See [Write Validation: API](https://docs.ravendb.net/documents/schema-validation/write-validation/write-validation_api) for the full list of available JSON Schema restrictions. ## Quick usage from Studio ### Audit Operation -About the modes we mentioned, we can divide the audit operation into two modes. One works automatically, and the other requires you to manually trigger it. When schema validation is enabled, RavenDB will check all modified documents after enabling it. There is few different triggers for schema validation, and those are: +Schema validation in Studio operates in two ways: it can run automatically on every write, or you can trigger it manually. When enabled, RavenDB will check all modified documents. There are a few different triggers for schema validation, and those are: * Inserting documents via session and via bulk insert. * Inserting documents via patch by query @@ -57,24 +82,24 @@ About the modes we mentioned, we can divide the audit operation into two modes. If the schema is triggered, it will block the document from being stored and return an error. -If you want to input only a few documents or use schema validation as a one-time scan, you can disable schema validation while leaving global schema validation enabled. You can also trigger a collection-wide scan to test all documents, not just new ones. We will use a basic schema to scan if all our Orders have the Company and Employee fields. Click on the testing button and then on the bottom-right corner, press run test. 
It will return all errors in the table. - - +If you want to input only a few documents or use schema validation as a one-time scan, you can disable schema validation while leaving global schema validation enabled. You can also trigger a collection-wide scan to test all documents, not just new ones. We will use a basic schema to scan whether all our Orders have the Company and Employee fields. Click on the testing button and then on the bottom-right corner, press run test. It will return all errors in the table. +RavenDB Studio schema validation audit results table showing validation errors per document +Audit result detail showing an Orders document missing the required Employee field As you can see, one of our orders is missing the Employee field. ### Index -Once your schema is ready, we can also create an index to store all documents that do not fit the schema in one place. With index, prepared you can just query your documents: +Once your schema is ready, we can also create an index to store all documents that do not fit the schema in one place. With the index prepared, you can just query your documents: - -And after clicking eye icon in preview you can learn more information about this document. +RavenDB index query results listing documents that failed schema validation +And after clicking the eye icon in preview you can learn more information about this document. - +Document preview panel showing the schema validation error details for a specific document ## How to use it with code? ### Audit Operation -Of course, we can do the same checks with code. We can run a quick scan of all documents from code. To do that, we can use such code. +Of course, we can do the same checks with code. We can run a quick scan of all documents from code. To do that, we can use code like this: ```csharp var parameters = new StartSchemaValidationOperation.Parameters @@ -100,10 +125,10 @@ foreach (var error in result.Errors) ``` After that, we get a response like this. - +Console output of the StartSchemaValidationOperation showing validated document count and error messages ### Index -Of course, querying from code is an option too; you just need to prepare the required index beforehand. To query such an index, you can usea script like this one: +Of course, querying from code is an option too; you just need to prepare the required index beforehand. To query such an index, you can use a script like this one: ```csharp using (var session = store.OpenAsyncSession()) @@ -131,7 +156,7 @@ using (var session = store.OpenAsyncSession()) } ``` -How do you create such index? Can we make it from code? Let’s get into configuration. +How do you create such an index? Can we make it from code? Let’s get into configuration. ## Configure it from studio @@ -139,8 +164,8 @@ How do you create such index? Can we make it from code? Let’s get into configu To configure schema validation in Studio, first enter the database you want to set it for, then navigate to Settings \> Document Schema. Add a new schema. This gives us two fields. One to select the collection and another to write our schema in. - -After you selected collection, you can proceed to write your schema. Schemas are built using [JSON Schema](https://json-schema.org/), a popular vocabulary for JSON files. Basic schema on orders collection can look like this: +RavenDB Studio Settings > Document Schema page with the Add Schema form open showing collection selector and schema editor fields +After you select a collection, you can proceed to write your schema. 
Schemas are built using [JSON Schema](https://json-schema.org/), a popular vocabulary for JSON files. Basic schema on orders collection can look like this: ```json { @@ -233,7 +258,7 @@ This schema is simple. What it does is define the fields Company and Employee mu } ``` -There are many different properties that full list of constants is available [here](https://docs.ravendb.net/documents/schema-validation/write-validation/write-validation_api#available-constraints). Now what it checks is not only if the Company and Employee fields are ther,e but also if Freight is a number, Lines with nested fields, and more. To make it even more strict, we can enforce additionalProperties false to tell the schema to catch any writes for documents that have any additional field other than those present in the schema. +There are many different properties; the full list of constraints is available in the [Write Validation: API](/documents/schema-validation/write-validation/write-validation_api#available-constraints) reference. Now what it checks is not only if the Company and Employee fields are there, but also if Freight is a number, Lines with nested fields, and more. To make it even more strict, we can enforce additionalProperties false to tell the schema to catch any writes for documents that have any additional field other than those present in the schema. ### Index @@ -254,7 +279,7 @@ select new We need to add Errors to stored values. We can easily do that in the fields section. - +RavenDB Studio index editor Fields section with the Errors field added to stored values If you want to make this index to only gather errors from one collection, change the first line to: ``` @@ -263,14 +288,14 @@ from doc in docs.ChosenCollection //change ChosenCollection to your collection We can save the index, and if we have any documents not matching the schema in database, we can notice they are added to the index. - +RavenDB index list view showing the Orders_WithValidation index with schema-invalid documents indexed ### Dedicated index schema Because our schema is enabled, writes will be blocked. But we can create an index that does not require enabling the schema. We switch to a different section on the same bar where we were selecting Fields to add the Errors field, and switch to Schema Definitions. There, we can write a definition dedicated to this index. In there, we select our desired collection and input our schema. - +RavenDB Studio index Schema Definitions tab with collection selector and dedicated schema definition editor After saving it if you didn’t do it before, you can delete or disable the schema in the **database** setting. Now let’s do this from code. ## Configure it from code @@ -372,99 +397,11 @@ await store.Maintenance.SendAsync(new ConfigureSchemaValidationOperation(configu ``` -If we wanted to use this only as scans we manually trigger, we would change the schema definition to enabled so it doesn’t block our documents. 
Full code in the end looks like this: - -```csharp -var schemaJson = @"{ - ""additionalProperties"": true, - - ""properties"": { - ""Company"": { - ""type"": ""string"" - }, - - ""Freight"": { - ""type"": ""number"", - ""minimum"": 0 - }, - - ""Lines"": { - ""type"": ""array"", - ""items"": { - ""type"": ""object"", - ""additionalProperties"": true, - ""properties"": { - ""Product"": { ""type"": ""string"" }, - ""Quantity"": { ""type"": ""integer"", ""minimum"": 1 }, - ""PricePerUnit"": { ""type"": ""number"" }, - ""Discount"": { ""type"": ""number"", ""minimum"": 0, ""maximum"": 1 } - }, - ""required"": [""Product"", ""Quantity"", ""PricePerUnit""] - } - }, - - ""OrderedAt"": { - ""type"": ""string"", - ""pattern"": ""^\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}\\.\\d{7}$"" - }, - - ""RequireAt"": { - ""type"": ""string"", - ""pattern"": ""^\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}\\.\\d{7}$"" - }, - - ""ShipTo"": { - ""type"": ""object"", - ""additionalProperties"": true, - ""properties"": { - ""City"": { ""type"": ""string"" }, - ""Country"": { ""type"": ""string"" }, - ""Line1"": { ""type"": ""string"" }, - ""Line2"": { ""type"": [""string"", ""null""] } - } - }, - - ""ShipVia"": { - ""type"": ""string"" - }, - - ""ShippedAt"": { - ""type"": [""string"", ""null""], - ""pattern"": ""^\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}\\.\\d{7}$"" - } - }, - - ""required"": [ - ""Company"", - ""Freight"", - ""Lines"", - ""OrderedAt"", - ""RequireAt"", - ""ShipTo"", - ""ShipVia"" - ] -}"; - -var configuration = new SchemaValidationConfiguration -{ - Disabled = false, // Enables global schema validation - ValidatorsPerCollection = new Dictionary - { - { "Orders", new SchemaDefinition - { - Disabled = false, // Enables this schema definition - Schema = schemaJson - } - } - } -}; - -await store.Maintenance.SendAsync(new ConfigureSchemaValidationOperation(configuration)); -``` +To use this only as manually triggered scans without blocking writes, set `Disabled = true` on the `SchemaDefinition` instead. ### Index -After making a schema, we can also make an index from code. We start by defining our schema and index map. Nex,t we connect the schema to a specific collection. We are still using the same example so orders collection. We will use a simple short schema here. +After making a schema, we can also make an index from code. We start by defining our schema and index map. Next, we connect the schema to a specific collection. We are still using the same example so orders collection. We will use a simple short schema here. ```csharp string schemaDefinition = @"{ @@ -523,4 +460,11 @@ public class IndexResult This code creates an index with dedicated index schema validation, allowing you to write even incorrect documents while still allowing you to query it. All of that from code, making it quicker to set up on multiple devices instead of going into the studio on each of them. -Interested in RavenDB? Grab the developer license dedicated for testing under this link [here](https://ravendb.net/dev), or get a free cloud database [here](https://ravendb.net/cloud). If you have questions about this feature, or want to hang out and talk with the RavenDB team, join our Discord Community Server \- invitation link is [here](https://discord.com/invite/ravendb). +## Summary + +- Schema validation enforces a consistent data model across all documents in a collection, catching type mismatches and missing required fields before they cause problems in production. +- Two modes give you flexibility. 
Non-blocking audit mode lets you scan existing data without interrupting writes, while strict mode rejects any document that does not conform to the schema at write time. +- No code required to get started, as the Studio's Document Schema settings let you define a JSON Schema, run collection-wide audits, and review errors in a table without touching the API. +- A dedicated validation index can be configured to collect schema violations from all indexed documents, letting you query non-conforming records at any time without blocking writes. + +Interested in RavenDB? [Grab a free developer license](https://ravendb.net/dev) for testing, or [start a free cloud database](https://ravendb.net/cloud). If you have questions about this feature or want to talk with the RavenDB team, [join our Discord Community Server](https://discord.com/invite/ravendb). diff --git a/guides/how-is-my-database-today.mdx b/guides/how-is-my-database-today.mdx index 63afc8a7e1..aa1c079fd4 100644 --- a/guides/how-is-my-database-today.mdx +++ b/guides/how-is-my-database-today.mdx @@ -3,7 +3,7 @@ title: "How is my database today?" tags: [monitoring, troubleshooting, clusters] icon: "cluster-dashboard" publishedAt: 2026-03-20 -description: "Learn tools you can use to check state of your database." +description: "Learn how to use RavenDB Studio's Cluster Dashboard, Traffic Watch, Admin Logs, and other built-in tools to diagnose and investigate database performance issues." author: "Paweł Lachowski" proficiencyLevel: "Beginner" --- @@ -49,11 +49,11 @@ The notification centre in the top-right corner of the studio collects these ale When something is really going badly, it might look like that. - +RavenDB Studio notification centre showing a disk slow warning
This is an obvious example, but even small notifications can provide useful insight into the system you are working with. We now know that our disk is slow, which can mean a few things. We might suspect that the Database Backup task is slowing down the rest of the system, so we can check its details.

-
+RavenDB Studio backup task details showing a stuck backup file upload
Here we can see that the backup file upload is stuck. From here, we can investigate further, gather information for support, or potentially resolve the problem ourselves. Let’s look at the dashboard and check our resources. @@ -71,9 +71,9 @@ Problems it identifies: The Cluster Dashboard is the view that RavenDB Studio opens by default. It provides a modular overview of the server and cluster, allowing you to quickly check the current state of the system. -The dashboard ([link](https://docs.ravendb.net/7.2/studio/cluster/cluster-dashboard/cluster-dashboard-overview) to documentation) is composed of widgets, which can be added or removed from the bottom-right corner. +The [Cluster Dashboard](https://docs.ravendb.net/7.2/studio/cluster/cluster-dashboard/cluster-dashboard-overview) is composed of widgets, which can be added or removed from the bottom-right corner. - +RavenDB Cluster Dashboard displaying CPU, traffic, IO, and GC metric widgets
These widgets display real-time metrics such as CPU usage, traffic, IO activity, and garbage collection statistics. By scanning these metrics, you can quickly determine whether the cluster is under unusual load or a core resource is being heavily used. That brings us back to the [“Know your system”](#know-your-system) part of this guide. @@ -81,15 +81,15 @@ Because the dashboard updates in real time, it is often the fastest way to under While scrolling through the widgets, we eventually reach the memory statistics and notice that the instance is using a large amount of memory. High memory usage can affect many parts of the system. This suggests that memory might be contributing to the slowdown. - +RavenDB Cluster Dashboard memory statistics widget showing only 46 MB available for processing
As you can see, only 46 MB of memory is *Available for processing*, which may be very little depending on the machine. At the same time, we can see that Average Request Time spiked moments ago. - +RavenDB Cluster Dashboard average request time widget showing a recent spike
We can also look at other metrics to see if the rest of our systems are fine. For example, we can look at the IO Stats widget. - +RavenDB Cluster Dashboard IO Stats widget displaying disk read and write activity
Knowing that this machine can handle only about 500 IOPS outside burst capacity, we can expect performance to drop significantly once the burst credits are exhausted. Being aware of your machine’s limits and “normal numbers” within those widgets can give you much information about the state of your instance. @@ -107,13 +107,13 @@ Problems it identifies: Sometimes the issue is not related to workload but to the cluster itself. A node might be unavailable, recovering, or not participating correctly in cluster operations. Even when a node is missing, the rest of the cluster may continue running. -The Cluster Status view ([link](https://docs.ravendb.net/7.2/studio/cluster/cluster-view/) to documentation) helps you inspect the health of individual nodes and their topology. It shows how nodes relate to each other and whether any of them are unavailable or experiencing problems. By reviewing this view, you can quickly notice situations where a node is down or under heavy memory pressure. +The [Cluster Status view](https://docs.ravendb.net/7.2/studio/cluster/cluster-view/) helps you inspect the health of individual nodes and their topology. It shows how nodes relate to each other and whether any of them are unavailable or experiencing problems. By reviewing this view, you can quickly notice situations where a node is down or under heavy memory pressure. For example, if the cluster has nodes in different geographical locations, the failure of one node may change response times for users located closer to that server. For users who are away from the rest of the servers, or if [load balancing](https://docs.ravendb.net/7.2/client-api/configuration/load-balance/read-balance-behavior/) is used, it can affect your database behaviour \- e.g., the fastest node can change. - +RavenDB Studio Cluster Status view showing node topology and individual node health -If you see your nodes having uneven workloads, you might want to change the behaviour of load balancing that you can learn about [here](https://docs.ravendb.net/7.2/client-api/configuration/load-balance/load-balance-behavior/). To further investigate differences in node behavior, you can use Cluster Debug, which provides detailed insights into the cluster’s communication status. +If you see your nodes having uneven workloads, you might want to change the [load balance behavior](https://docs.ravendb.net/7.2/client-api/configuration/load-balance/load-balance-behavior/). To further investigate differences in node behavior, you can use Cluster Debug, which provides detailed insights into the cluster’s communication status. ## Traffic Watch @@ -133,11 +133,11 @@ Traffic Watch allows you to observe incoming HTTP requests in real time. It show If something is wrong and requests hit the database repeatedly every second it might be generating problems. RavenDB can also highlight HTTP requests that generate error codes like 4xx or 5xx. - +RavenDB Studio Traffic Watch showing live HTTP requests with status codes and request and response sizes You can also see the request and response sizes for those requests. If those are significant and appear multiple times, they can affect your system. If you need even more information, you can look into Admin Logs instead. -If you want, you can also save Traffic Watch onto a disk, but it might result in a slight performance change. More about Traffic Watch, you can read in this article [here](https://ravendb.net/articles/what-requests-hit-my-cloud-cluster). 
+If you want, you can also save Traffic Watch onto a disk, but it might result in a slight performance change. More about Traffic Watch, you can read [What Requests Hit My Cloud Cluster](https://ravendb.net/articles/what-requests-hit-my-cloud-cluster). ## Admin Logs @@ -153,15 +153,15 @@ Problems it identifies: Some problems happen deeper inside the server and are not visible through metrics. Internal tasks, background processes, or unexpected errors may leave their traces only in the logs. -Admin Logs ([link](https://docs.ravendb.net/7.2/studio/server/debug/admin-logs/) to documentation) provide detailed information about what the database is doing internally. By reviewing them, you can observe server activity and identify operations that may be consuming unusually high amounts of resources. So if your query sometimes doesn’t respond and Traffic Watch doesn’t give you enough information, Admin Logs are a great place to check. +[Admin Logs](https://docs.ravendb.net/7.2/studio/server/debug/admin-logs/) provide detailed information about what the database is doing internally. By reviewing them, you can observe server activity and identify operations that may be consuming unusually high amounts of resources. So if your query sometimes doesn’t respond and Traffic Watch doesn’t give you enough information, Admin Logs are a great place to check. -In addition to performance-related information, the logs can reveal errors, warnings, or unexpected behavior that might not immediately appear in the UI. This makes them especially problematic when you don’t know exactly what happened in the database that caused the issue. It is also a useful tool if you need to check later what led to problems, as you can download your logs to the disk after setup. +In addition to performance-related information, the logs can reveal errors, warnings, or unexpected behavior that might not immediately appear in the UI. This makes them especially useful when you don’t know exactly what happened in the database that caused the issue. It is also a useful tool if you need to check later what led to problems, as you can download your logs to the disk after setup. - +RavenDB Studio Admin Logs view showing detailed internal server log entries
-While logs require more careful reading than dashboards or metrics, they often contain the most precise description of what the server is actually doing at a given moment. Because it’s so detailed, it can be overwhelming and make it harder for less experienced users to pick out what actually matters. More about Admin Logs can be found in [this](https://ravendb.net/articles/what-requests-hit-my-cloud-cluster) article. +While logs require more careful reading than dashboards or metrics, they often contain the most precise description of what the server is actually doing at a given moment. Because it’s so detailed, it can be overwhelming and make it harder for less experienced users to pick out what actually matters. More about Admin Logs can be found in the [What Requests Hit My Cloud Cluster](https://ravendb.net/articles/what-requests-hit-my-cloud-cluster) article. -You can also send logs to outside monitoring like Grafana or Zabbix. They allow you to set up alerts, correlate metrics, and gather other useful information in one place. If you want to learn more about external monitoring, you can look into [*this*](#external-monitoring) part of the guide. +You can also send logs to outside monitoring like Grafana or Zabbix. They allow you to set up alerts, correlate metrics, and gather other useful information in one place. If you want to learn more about external monitoring, you can look into the [External Monitoring](#external-monitoring) section of this guide. ## Cluster Debug @@ -179,11 +179,11 @@ More complex issues sometimes involve cluster coordination itself. Problems with Cluster Debug exposes internal information about the cluster's state and Raft command queue. It can help identify nodes that are behind or situations where cluster commands are stuck. For example, if you post many commands at once, such as inserting files with operations, you can cause desynchronisation between nodes. - +RavenDB Cluster Debug view showing Raft command queue and per-node synchronization status
You can also check each node's queue size to see whether it is desynchronized, and review the status of your nodes. Scrolling down, you can also see the whole log entry for Raft operations that happen on each node.

-
+RavenDB Cluster Debug log showing Raft operation entries for each cluster node

## RavenDB Cloud Alerts and Metrics

@@ -201,23 +201,23 @@ If you are using RavenDB Cloud, it also provides its own basic monitoring for al

The first part of RavenDB Cloud monitoring is the Metrics and Cluster Health screens. The first metric you might look at to notice a problem is the Nodes Availability widget. It is placed on the Cluster Health screen and provides overall information about whether something is wrong. If it is degraded, it is a first signal to look deeper into the system.

-
+RavenDB Cloud Cluster Health screen showing the Nodes Availability widget in a degraded state
Then you can review the metrics, suggestions, and incident history to pinpoint the problem. When we examine our test database, we can see it is only slightly degraded, but we will investigate either way.

-
+RavenDB Cloud metrics and incident history showing a slightly degraded database instance
The incident history shows that our problem is High IO usage. Looking further into the suggestions, we see that we should consider upgrading our instance.

-
+RavenDB Cloud suggestions panel recommending an instance upgrade due to high IO usage
To decide whether we want to proceed, we can go to the Metrics screen and review the data to see whether we really need to upgrade. In our case, the database was calm most of the time, so depending on whether the problem recurs, we can either upgrade or check whether another part of our system is generating the high I/O usage. Either way, we know that we now need to look into IO.

-
+RavenDB Cloud Metrics screen showing IO usage over time with a high IO spike
Additionally, if the database is struggling, for example when incoming data starts to fill the instance or when it auto-scales, we will get a notification sent to the email address connected to this database.

-
+Email notification from RavenDB Cloud alerting about a database instance event

## External Monitoring

@@ -237,10 +237,10 @@ As we said, it’s about preventing future problems. External monitoring can hel

For example, if we receive reports of problems at certain hours, we can check the monitoring to investigate the issue later. Without it, we would need to dig into the logs just to learn that something is wrong, or be there while the problem happens to notice it, which isn’t really a realistic scenario. In the worst case, you might not even know something is wrong for days. Meanwhile, with monitoring, we can sit down after an incident, or even be notified, and look at what is wrong. It’s basically a dashboard, but it allows for more customisation.

-
+External monitoring dashboard showing RavenDB disk usage trends over time

In our case, the disk is under the most pressure. Now we have a lead, and we can go to the database and use the previous tools to identify the cause. More about monitoring can be found in the articles we mentioned.

-Now that you know where to look, you might need a way to fix your problems or look even deeper into them. If you need help with investigating high CPU usage, you may be interested in [this](https://ravendb.net/articles/how-to-troubleshoot-ravendbs-high-cpu-usage) article. Or maybe you are having trouble with high memory usage; we have an [article](https://ravendb.net/articles/master-ravendb-troubleshoot-fix-high-memory-usage) about that, too.
+Now that you know where to look, you might need a way to fix your problems or look even deeper into them. If you need help with investigating high CPU usage, you may be interested in [How to Troubleshoot High CPU Usage in RavenDB](https://ravendb.net/articles/how-to-troubleshoot-ravendbs-high-cpu-usage). Or maybe you are having trouble with high memory usage; we have a guide on [troubleshooting high memory usage in RavenDB](https://ravendb.net/articles/master-ravendb-troubleshoot-fix-high-memory-usage), too.

-Interested in RavenDB? Grab the developer license dedicated for testing under this link [here](https://ravendb.net/dev?_gl=1*x014vj*_gcl_au*NTAyODIzOTk4LjE3NzAwMTY5NTcuMzYwMzg1MDIwLjE3NzMwNDQ3NjcuMTc3MzA0NDc2Nw..), or get a free cloud database [here](https://ravendb.net/cloud?_gl=1*s4ciyf*_gcl_au*NTAyODIzOTk4LjE3NzAwMTY5NTcuMzYwMzg1MDIwLjE3NzMwNDQ3NjcuMTc3MzA0NDc2Nw..). If you have questions about this feature, or want to hang out and talk with the RavenDB team, join our Discord Community Server \- invitation link is [here](https://discord.com/invite/ravendb).
\ No newline at end of file
+Interested in RavenDB? Grab the [developer license for testing](https://ravendb.net/dev), or get a [free RavenDB Cloud database](https://ravendb.net/cloud). If you have questions about this feature, or want to hang out and talk with the RavenDB team, join the [RavenDB Discord Community Server](https://discord.com/invite/ravendb).
\ No newline at end of file diff --git a/guides/master-ravendb-spotting-red-flags-in-index-definitions-guide.mdx b/guides/master-ravendb-spotting-red-flags-in-index-definitions-guide.mdx index 9e5d563686..31a46645e5 100644 --- a/guides/master-ravendb-spotting-red-flags-in-index-definitions-guide.mdx +++ b/guides/master-ravendb-spotting-red-flags-in-index-definitions-guide.mdx @@ -2,7 +2,7 @@ title: "Master RavenDB: Spotting red flags in index definitions" tags: [indexes, perf-tuning, deep-dive] icon: "ravendb-etl" -description: "Learn about index definitions and thier optimization." +description: "Discover 7 common RavenDB index anti-patterns, including misusing LoadDocument and storing too many fields, and learn how to fix each one for better query performance." publishedAt: 2025-09-17 image: "https://ravendb.net/wp-content/uploads/2025/09/spotting-red-flags-article-image.png" author: "Paweł Lachowski" @@ -75,7 +75,7 @@ Storing fields in RavenDB basically means keeping certain data from your indexed This is why you don’t just mark every field as “stored” thinking it’ll make everything fast. The more fields you store, the heavier your indexes get, and the less truly fast your queries become. Store only what you really need for quick access or query efficiency. Making all fields ‘fast’ makes none truly fast. -You can find more information about storing fields here: [documentation](https://docs.ravendb.net/6.2/indexes/storing-data-in-index/). +You can find more information about storing fields here: [storing data in an index](https://docs.ravendb.net/indexes/storing-data-in-index/). ## Misusing LoadDocument in Indexes @@ -92,17 +92,21 @@ select new } ``` +### When LoadDocument causes problems + If only a few documents reference a particular document, an index using `LoadDocument` in its definition works great. However, problems arise when many documents point to the same document, or a small set of them, and that document is frequently updated. Every time the referenced document changes, all the documents referencing it must be re-indexed. Suddenly, a single small change can generate a tremendous amount of indexing work. Consequences include slower index updates, leading to delayed queries that rely on fresh data. Also, all this extra work generates high IO demands, leading to longer request durations and potential system instability. +### How to avoid the issue + Sometimes, this situation comes from trying to apply relational database thinking to a document database. It’s advised to understand effective document modeling rather than forcing relational patterns. Rethink your queries or data models. -## Not using fanout +## Missing fanout for nested data -Fanout is a [helpful method](https://docs.ravendb.net/6.2/indexes/indexing-nested-data#fanout-index---multiple-index-entries-per-document) in indexing that allows you to query your lists stored inside documents without working around them. If you need to filter list elements individually or compute something per item, then most probably you should use fanout. Let’s look at an example to understand it. Consider this example document: +A [fanout index](https://docs.ravendb.net/indexes/indexing-nested-data#fanout-index---multiple-index-entries-per-document) is a technique that allows you to query lists stored inside documents without working around them. If you need to filter list elements individually or compute something per item, then most probably you should use fanout. Let’s look at an example to understand it. 
Consider this example document: ``` { @@ -114,6 +118,8 @@ Fanout is a [helpful method](https://docs.ravendb.net/6.2/indexes/indexing-neste } ``` +### Indexing without fanout + This is what an index would look like without using fanout. See example below ``` @@ -135,6 +141,8 @@ Price = (Price:3, Price:5) This is troublesome, you want to query by elements in the list but it is treated as one object. If we use a query to find orders with apples costing more than 4$, it would indicate that the document with ID: “orders/1-A” meets this requirement, because “Apple” is in the product list, and the price list has at least one value greater than 4$. +### Using fanout correctly + This problem is easily solved by fanout. Fanout is for indexing a list of entities within a single document. If you are looking for a similar concept from dotnet BCL, SelectMany would be the method to compare with. We can use fanout like this: ``` @@ -167,13 +175,15 @@ And this one would return a negative response, as it should. But if we’d have order that matches this query, it works properly: - +RavenDB Studio showing a correct fanout query result filtering nested product list items individually ## Map Reduce With Unique Values -Sometimes you want to aggregate your data \- e.g. all profit generated from each customer, but you hold each purchase in separate documents. That is where [Map-Reduce](https://docs.ravendb.net/6.0/studio/database/indexes/create-map-reduce-index/) can help you. It aggregates all the data you want from your query under one key and combines them. +Sometimes you want to aggregate your data \- e.g. all profit generated from each customer, but you hold each purchase in separate documents. That is where [Map-Reduce](https://docs.ravendb.net/studio/database/indexes/create-map-reduce-index/) can help you. It aggregates all the data you want from your query under one key and combines them. The problem appears when you select the wrong field as a key. If you group your data using values that are unique, then you don’t really aggregate anything, because every document is isolated in its own group. That makes your index work more without actually doing anything. +### Grouping by a unique key + Map Reduce with unique values would look like this: ``` @@ -197,6 +207,8 @@ select new { In this example, you might notice we are using OrderID as a key, which will not be grouped in any way, as each OrderID is unique and never repeats. This just adds extra steps with something that changes nothing for this case. +### Grouping by a shared key + To make Map-Reduce effective, always pick a key that multiple documents can share. For example, grouping purchases by Company lets you sum all orders from one company. If instead, you group by OrderId, every document stands alone, and your index wastes resources without providing aggregation. As a rule of thumb, ask yourself: Will more than one document end up in this group? If not, reconsider your key choice. Good map reduce would look like this: @@ -225,6 +237,8 @@ In this example we sort using company grouping. We sum prices of Orders per comp When you need to process or combine a few fields into one field, you use ‘`let`’ in your index. This can also cause trouble if you are not careful. 
+### Chaining multiple let statements + When you are still in the development phase, or you just need multiple variables to store something at runtime, you may create indexes with fragments that resemble this: ``` @@ -249,14 +263,21 @@ let data = new { } ``` +### Grouping them into one object + That way, all of this is just one object, and it is treated as only one step. Of course, using one or two ‘`let`’ is completely normal, but if you are writing the fifth `let` in your index, you should stop for a moment and consider grouping them. ## Summary -Indexes in RavenDB are a basic but powerful tool. Like any tool, they work best when used correctly. Now that you understand them better, you may want to [explore](https://ravendb.net/articles/master-ravendb-projections-performance) queries and their performance, or [learn](https://ravendb.net/articles/enable-ai-search-in-your-web-app-in-5-minutes) about the vector search function that RavenDB offers. +Indexes in RavenDB are a basic but powerful tool. Like any tool, they work best when used correctly. This guide covered common index definition problems like: + +- Indexing too many fields bloats the index and increases I/O. Only the fields used for filtering or sorting belong in the index; everything else can be fetched via projections. +- Misusing LoadDocument causes mass re-indexing when a frequently updated document is referenced by many others, resulting in high I/O pressure and stale query results. +- Skipping fanout for nested arrays produces incorrect query results, because without it all list values are merged into a single index entry and cannot be matched individually. +- Choosing a unique key in Map-Reduce means no actual aggregation happens. Every document ends up in its own group, so the reduce step does nothing useful. -If you want to hang out with the RavenDB team to chat about this feature and meet our community, here is our [Discord \- RavenDB’s Developers Community](https://discord.gg/ravendb) server. +Now that you understand them better, you may want to read about [queries and performance](https://ravendb.net/articles/master-ravendb-projections-performance), or explore [vector search in RavenDB](https://ravendb.net/articles/enable-ai-search-in-your-web-app-in-5-minutes). -[image1]: \ No newline at end of file +If you want to hang out with the RavenDB team to chat about this feature and meet our community, here is our [Discord \- RavenDB’s Developers Community](https://discord.gg/ravendb) server. \ No newline at end of file diff --git a/guides/processing-invoices-using-data-subscriptions-in-ravendb.mdx b/guides/processing-invoices-using-data-subscriptions-in-ravendb.mdx index 3f2c1b4bc8..50304d8b94 100644 --- a/guides/processing-invoices-using-data-subscriptions-in-ravendb.mdx +++ b/guides/processing-invoices-using-data-subscriptions-in-ravendb.mdx @@ -1,7 +1,7 @@ --- title: "Processing Invoices Using Data Subscriptions" tags: [demo, background-tasks, csharp, use-case] -description: "Read about Processing Invoices Using Data Subscriptions" +description: "See how RavenDB Data Subscriptions let you offload invoice processing to a background worker in C#, with PDF generation and attachment storage included." 
publishedAt: 2024-12-08 author: "Egor Shamanaev" image: "https://ravendb.net/wp-content/uploads/2024/12/processing-invoices-article-cover.jpg" @@ -15,31 +15,31 @@ import CodeBlock from '@theme/CodeBlock'; import LanguageSwitcher from "@site/src/components/LanguageSwitcher"; import LanguageContent from "@site/src/components/LanguageContent"; -In this article we will tackle the problem of processing invoices in asynchorious manner using the [RavenDB Data subscriptions](https://ravendb.net/docs/article-page/6.0/csharp/client-api/data-subscriptions/what-are-data-subscriptions) feature. +In this article we will tackle the problem of processing invoices in an asynchronous manner using the [RavenDB Data subscriptions](https://ravendb.net/docs/article-page/latest/csharp/client-api/data-subscriptions/what-are-data-subscriptions) feature. -We will create a [data subscription](https://ravendb.net/docs/article-page/6.0/csharp/client-api/data-subscriptions/what-are-data-subscriptions) on `Orders` collection, and use a [Subscription Worker](https://ravendb.net/docs/article-page/6.0/csharp/client-api/data-subscriptions/consumption/how-to-consume-data-subscription) to process the newly added `Orders` documents, in this particullar article we are going to process in an ongoing fashion, but since `Subscriptions` state are persisted, it can be a process that runs in a timely fashion, like overnight or weekend. +We will create a [data subscription](https://ravendb.net/docs/article-page/latest/csharp/client-api/data-subscriptions/what-are-data-subscriptions) on `Orders` collection, and use a [Subscription Worker](https://ravendb.net/docs/article-page/latest/csharp/client-api/data-subscriptions/consumption/how-to-consume-data-subscription) to process the newly added `Orders` documents. In this particular article we are going to process in an ongoing fashion, but since `Subscriptions` state is persisted, it can be a process that runs on a schedule, like overnight or on weekends. -In the subscription batch processing we will calculate the overall products cost, prepare the invoice PDF file and store it as an [Attachment](https://ravendb.net/docs/article-page/6.0/csharp/document-extensions/attachments/what-are-attachments) to the `Invoices` document. +In the subscription batch processing we will calculate the overall products cost, prepare the invoice PDF file and store it as an [Attachment](https://ravendb.net/docs/article-page/latest/csharp/document-extensions/attachments/what-are-attachments) to the `Invoices` document. Additional subscription can be defined for processing the `Invoices` documents, and sending an email with an `Attachment` that was created. ## Intro -Typically after paying for your goods on online store you would get the confirmation right away, but the invoice will be sent as a separate email afterwards. Did you ever wonder why it works this way? The reason is that the store want to confirm the purchase immediately, and do the actual processing of the order in the background, so you can return to the shop homepage and possible purchase even more stuff. +Typically after paying for your goods on an online store you would get the confirmation right away, but the invoice will be sent as a separate email afterwards. Did you ever wonder why it works this way? The reason is that the store wants to confirm the purchase immediately, and do the actual processing of the order in the background, so you can return to the shop homepage and possibly purchase even more. 
-That's from the business point of view, but what with user experience point of view? +That's from the business point of view, but what about user experience? -In case we would do the invoice processing in a sync manner in will likely bring to online store website responsiveness, long resonse time. Think about waiting few minutes for order confirmation, it may make the customer refresh or even close the page, which will cancel the order. +Processing invoices synchronously would hurt online store responsiveness, leading to long response times. Think about waiting a few minutes for order confirmation. The customer may refresh or even close the page, which would cancel the order. ## Breakdown -After customer added a new order, and got confirmation with some order identifier, we would want to start the invoce processing. +After a customer adds a new order and gets confirmation with an order identifier, we want to start the invoice processing. -Lets break down the invoice processing into steps: +Let's break down the invoice processing into steps: 1. Get & start processing newly added `Order` document 2. Load the list of ordered products 3. Calculate order total sum 4. Generate PDF and save it to memory stream -5. Mark the `Order` document with `InvoiceCreated=true` (so it will get proceesed only once) +5. Mark the `Order` document with `InvoiceCreated=true` (so it will get processed only once) 6. Add the PDF stream as attachment to the order document 7. Save changes to the RavenDB database @@ -93,7 +93,7 @@ await DocumentStore.Subscriptions.CreateAsync(new SubscriptionCreationOptions ``` -The subscription worker (`subscription.Run()` method) will receive batch of `Orders` documents each time, inside this batch we will process each `Orders` document, and prepare a invoice pdf, please see the full code below. +The subscription worker (`subscription.Run()` method) will receive a batch of `Orders` documents each time. Inside this batch we will process each `Orders` document and prepare an invoice PDF. Please see the full code below. ```csharp showLineNumbers await using var subscription = DocumentStore.Subscriptions.GetSubscriptionWorker(_subsName); @@ -156,23 +156,23 @@ await subscription.Run(async batch => }); ``` -The subscription worker is opened using default [SubscriptionWorkerOptions](https://ravendb.net/docs/article-page/6.0/csharp/client-api/data-subscriptions/consumption/api-overview#subscriptionworkeroptions), it means the subscription worker [strategy](https://ravendb.net/docs/article-page/6.0/csharp/client-api/data-subscriptions/consumption/how-to-consume-data-subscription#worker-interplay) is `OpenIfFree`. +The subscription worker is opened using default [SubscriptionWorkerOptions](https://ravendb.net/docs/article-page/latest/csharp/client-api/data-subscriptions/consumption/api-overview#subscriptionworkeroptions), which means the subscription worker [strategy](https://ravendb.net/docs/article-page/latest/csharp/client-api/data-subscriptions/consumption/how-to-consume-data-subscription#worker-interplay) is `OpenIfFree`. -The code here is simply creation of subscription worker, additional example can be found in [Subscription Consumption Examples](https://ravendb.net/docs/article-page/6.0/csharp/client-api/data-subscriptions/consumption/examples#data-subscriptions-subscription-consumption-examples) documentation article. There is also example project with the full code attached (or put link to git?) +The code here is simply creation of a subscription worker. 
Additional examples can be found in the [Subscription Consumption Examples](https://ravendb.net/docs/article-page/latest/csharp/client-api/data-subscriptions/consumption/examples#data-subscriptions-subscription-consumption-examples) documentation article. -In the batch processing code, we check for `order.InvoiceCreated` and if its `true` we skip the item, the reason for that is in case of [subscription connection failover](https://ravendb.net/docs/article-page/6.0/csharp/client-api/data-subscriptions/what-are-data-subscriptions#how-the-worker-communicates-with-the-server) we might get the item second time, thus we might get `Order` that we already created a invoce for it. +In the batch processing code, we check `order.InvoiceCreated` and if it is `true` we skip the item. The reason for that is that in case of [subscription connection failover](https://ravendb.net/docs/article-page/latest/csharp/client-api/data-subscriptions/what-are-data-subscriptions#how-the-worker-communicates-with-the-server) we might receive an item a second time, meaning we could get an `Order` that we already created an invoice for. -After checking the value of `order.InvoiceCreated`, we open a [Session](https://ravendb.net/docs/article-page/6.0/csharp/client-api/session/what-is-a-session-and-how-does-it-work) and load the related `Products` that were ordered. +After checking the value of `order.InvoiceCreated`, we open a [Session](https://ravendb.net/docs/article-page/latest/csharp/client-api/session/what-is-a-session-and-how-does-it-work) and load the related `Products` that were ordered. -Then we set the `order.InvoiceCreated` value to `true` we have to set this property to `true` the reason for that is that after we add the invoice, the `Order` document will be updated, setting `order.InvoiceCreated` value to `true` will make sure the updated `Order` document won't match the `Subscription` criteria. +Then we set `order.InvoiceCreated` to `true`. We have to set this property to `true` because after we add the invoice, the `Order` document will be updated, and setting `order.InvoiceCreated` to `true` ensures the updated `Order` document will no longer match the `Subscription` criteria. -The session in subscription is bound to the processing node, we can be sure that editing the `Order` document in subscrition session we are editing the document on same server that is processing the subscription. +The session in a subscription is bound to the processing node, so we can be sure that editing the `Order` document in the subscription session edits the document on the same server that is processing the subscription. -Afterwards, in case there are producs, we create an `Invoice` document, and call the `CreateInvoiceForOrderAsync()` method which will prepare the PDF file and return it (see the method code below), then we load the PDF file into a `MemoryStream` and store it as an attachment of the created `Invoice` document. -The last step is to call the `session.SaveChangesAsync()` method which will persist the changes into the database. +Afterwards, if there are products, we create an `Invoice` document and call the `CreateInvoiceForOrderAsync()` method, which prepares the PDF file and returns it (see the method code below). We then load the PDF file into a `MemoryStream` and store it as an attachment of the created `Invoice` document. +The last step is to call `session.SaveChangesAsync()`, which persists the changes to the database. 
-## Additions -The code of `CreateInvoiceForOrderAsync()` method, calculates the overall products costs, prepares the invoice PDF document and returns the PDF document `byte[]`. +## PDF Generation Implementation +The `CreateInvoiceForOrderAsync()` method calculates the overall product costs, prepares the invoice PDF document, and returns it as a `byte[]`. ```csharp showLineNumbers private Task CreateInvoiceForOrderAsync(string invoiceId, Order order, Dictionary products) @@ -226,4 +226,11 @@ private Task CreateInvoiceForOrderAsync(string invoiceId, Order order, D return Task.FromResult(memStream.ToArray()); } -``` \ No newline at end of file +``` + +## Summary + +- Why async invoice processing matters: Synchronous invoice generation hurts store responsiveness and risks order cancellations. Offloading it to a background worker keeps confirmation instant. +- Subscription setup: A RavenDB Data Subscription on the `Orders` collection filters for documents where `InvoiceCreated = false`, ensuring each order is picked up exactly once. +- Batch processing: The subscription worker loads related `Products`, calculates the order total, generates a PDF using iText, and stores it as an attachment on the corresponding `Invoice` document. +- Reliability and scheduling: Subscription state is persisted in RavenDB, so the worker handles connection failovers safely and can run continuously or on a timed schedule such as overnight. diff --git a/guides/ravendb-client-certificates.mdx b/guides/ravendb-client-certificates.mdx index e19f1c2cc8..6fef336a82 100644 --- a/guides/ravendb-client-certificates.mdx +++ b/guides/ravendb-client-certificates.mdx @@ -1,11 +1,24 @@ --- title: "RavenDB Client Certificates with Vault-Backed Key Reuse" -tags: [security, csharp] +tags: [security, administration, deployment] icon: "guides" description: "RavenDB client certificates that renew automatically without re-registration using vault-backed key reuse patterns." publishedAt: 2026-02-12 -author: "Paweł Lachowski" +author: "Omer Ratsaby" proficiencyLevel: "Expert" +see_also: + - title: "Security Overview" + link: "server/security/overview" + source: "docs" + path: "Server > Security" + - title: "Certificate Management" + link: "server/security/authentication/certificate-management" + source: "docs" + path: "Server > Security > Authentication" + - title: "Setup Wizard Overview" + link: "start/installation/setup-wizard/overview" + source: "docs" + path: "Start > Installation > Setup Wizard" --- import Admonition from '@theme/Admonition'; @@ -19,17 +32,17 @@ import Image from "@theme/IdealImage"; ## What are RavenDB client certificates and why does key reuse matter? Certificate rotation is supposed to be routine. In practice, it often turns into an operational tax: -**new cert → re-registration → rollout coordination → “why is prod locked out?”** +**new cert → re-registration → rollout coordination → "why is prod locked out?"** This article shows a cleaner model for **RavenDB client authentication**: Register a client certificate once, then renew it as often as you want by keeping the same private key across renewals. No re-registering needed. -The certificate metadata can change (validity dates, serial number, thumbprint), but as long as it’s issued by the **same authority** and renewed in a way that reuses the **same key pair**, RavenDB continues to accept it. 
+The certificate metadata can change (validity dates, serial number, thumbprint), but as long as it's issued by the **same authority** and renewed in a way that reuses the **same key pair**, RavenDB continues to accept it.

### Terminology (Quick Primer)

**Private Key**
-The private key defines the certificate’s identity.
+The private key defines the certificate's identity.
If two certificates use the *same* private key and are signed by the same issuer, they represent the same identity, even if their thumbprint or validity dates differ.

**Issuer (Certificate Authority)**
@@ -40,22 +53,22 @@ Renewal without re-registration only works when the same issuer (or trusted chai
A CSR is generated from your private key and contains the public key plus identification details.
When a CA signs a CSR, it issues a certificate tied to that same key.
Re-using the *same CSR* produces new certificates with the *same identity key*.

-We’ll walk through three vault patterns, each showing a different way to renew a certificate while preserving the same identity key:
+We'll walk through three vault patterns, each showing a different way to renew a certificate while preserving the same identity key:

-- **Azure Key Vault** \- Uses the native *reuseKey: true* capability to generate new certificate versions that keep the private key intact.
-- **HashiCorp Vault**: Does not provide automatic key reuse, so we generate and store the private key and CSR ourselves. Renewal simply means asking Vault to sign that same CSR again.
-- **AWS Private CA with Secrets Manager**: Follows the same model as HashiCorp Vault: store the private key and CSR once, then instruct PCA to issue fresh certificates by signing that same CSR, producing new certificates with the same identity key.
+- **Azure Key Vault**: Uses the native *reuseKey: true* capability to generate new certificate versions that keep the private key intact.
+- **HashiCorp Vault**: Does not provide automatic key reuse, so we generate and store the private key and CSR ourselves. Renewal simply means asking Vault to sign that same CSR again.
+- **AWS Private CA + Secrets Manager**: Follows the same model as HashiCorp Vault: store the private key and CSR once, then instruct PCA to issue fresh certificates by signing that same CSR, producing new certificates with the same identity key.

## How does RavenDB's register-once model work?

When a certificate is renewed using the **same private key** and signed by the **same issuer**, it represents the same identity in every meaningful sense.
-The metadata may change \- validity dates, serial number, thumbprint \- but the underlying identity and the authority vouching for it remain constant.
+The metadata may change - validity dates, serial number, thumbprint - but the underlying identity and the authority vouching for it remain constant.

A useful way to think about this is the renewal of an official ID:
You may receive a *new* driving license, but because it is issued by the same government authority and tied to the same person, every system that trusted the old license continues to trust the new one automatically.

Certificates work the same way.
-The identity (the key) stays the same, the issuer stays the same \- so trust carries forward without re-registration, re-provisioning, or operational churn.
+The identity (the key) stays the same, the issuer stays the same - so trust carries forward without re-registration, re-provisioning, or operational churn.
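The claim is easy to verify for yourself. Below is a minimal standalone sketch - not part of the vault walkthroughs that follow - using Python's `cryptography` package; the file names are placeholders for any two PEM-encoded certificates you want to compare:

```python showLineNumbers
# pip install cryptography
from cryptography import x509
from cryptography.hazmat.primitives import serialization


def load_cert(path: str) -> x509.Certificate:
    with open(path, "rb") as f:
        return x509.load_pem_x509_certificate(f.read())


def public_key_der(cert: x509.Certificate) -> bytes:
    # DER-encoded SubjectPublicKeyInfo: identical bytes mean identical key pair.
    return cert.public_key().public_bytes(
        serialization.Encoding.DER,
        serialization.PublicFormat.SubjectPublicKeyInfo,
    )


original = load_cert("client.crt")          # placeholder file names
renewed = load_cert("client-renewed.crt")

# Same key pair + same issuer = same identity, even when thumbprints differ.
print("same key:   ", public_key_der(original) == public_key_der(renewed))
print("same issuer:", original.issuer == renewed.issuer)
```

If both checks print `True`, the two files represent one identity in the sense used throughout this article, and RavenDB will treat them as such.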
RavenDB uses X.509 client certificates as its authentication and authorization mechanism. In the normal workflow, you: @@ -66,13 +79,13 @@ In the normal workflow, you: Where things traditionally become painful is during renewal. Most public-key infrastructure (PKI) systems generate a new key pair when issuing a renewed certificate. -From RavenDB’s perspective, that means a new identity entirely \- which forces you to go back, re-register the certificate, and roll out the updated identity across every environment and automation touchpoint. +From RavenDB's perspective, that means a new identity entirely - which forces you to go back, re-register the certificate, and roll out the updated identity across every environment and automation touchpoint. -But as you’ll see next, if you renew in a way that preserves the **same key** and the **same issuer**, RavenDB no longer requires re-registration \- and certificate rotation becomes a seamless, fully operational practice. +But as you'll see next, if you renew in a way that preserves the **same key** and the **same issuer**, RavenDB no longer requires re-registration - and certificate rotation becomes a seamless, fully operational practice. ## How do I renew certificates without re-registration? Three providers, one principle -In the next sections we’ll make this idea concrete with three full, end to end walkthroughs: one for **Azure Key Vault**, one for **HashiCorp Vault** and one for **AWS Private CA \+ Secrets Manager.** +In the next sections we'll make this idea concrete with three full, end to end walkthroughs: one for **Azure Key Vault**, one for **HashiCorp Vault** and one for **AWS Private CA + Secrets Manager.** Each tab follows the same pattern: 1. create (or configure) an issuer @@ -81,9 +94,9 @@ In the next sections we’ll make this idea concrete with three full, end to end 4. renew the certificate while RavenDB continues to accept it without re-registration. -To follow along you’ll need a secured RavenDB node reachable over HTTPS (with an existing admin/client certificate that can register new certs), CLI access to the relevant platform (az for Azure, vault for HCV, aws CLI for AWS), and basic tooling like curl, openssl, and jq installed on your workstation. +To follow along you'll need a secured RavenDB node reachable over HTTPS (with an existing admin/client certificate that can register new certs), CLI access to the relevant platform (az for Azure, vault for HCV, aws CLI for AWS), and basic tooling like curl, openssl, and jq installed on your workstation. -We’ll also assume your RavenDB node is already secured with Let’s Encrypt, using a setup package you created through the RavenDB Setup Wizard. These walkthroughs work just as well for self-signed setups. For instructions on securing your RavenDB node and generating the setup package, please check [here](https://docs.ravendb.net/start/installation/setup-wizard/overview). +We'll also assume your RavenDB node is already secured with Let's Encrypt, using a setup package you created through the RavenDB Setup Wizard. These walkthroughs work just as well for self-signed setups. For instructions on securing your RavenDB node and generating the setup package, please check [here](https://docs.ravendb.net/start/installation/setup-wizard/overview). 
@@ -91,13 +104,13 @@ Azure Key Vault is, by far, the most straightforward platform for demonstrating That single switch transforms certificate renewal from an operational headache into a predictable, frictionless update: RavenDB continues trusting the renewed certificate without ever seeing a new registration request. -Below is the full flow: define a policy, create the certificate, register it once in RavenDB, then generate a brand-new version and watch RavenDB accept it immediately \- different thumbprint, same identity. +Below is the full flow: define a policy, create the certificate, register it once in RavenDB, then generate a brand-new version and watch RavenDB accept it immediately - different thumbprint, same identity. -So let’s switch over to the terminal and get started. We’ll be demonstrating everything in a Linux environment. +So let's switch over to the terminal and get started. We'll be demonstrating everything in a Linux environment. -

1\. Create an Azure Key Vault certificate policy with `reuseKey: true`

+

1. Create an Azure Key Vault certificate policy with `reuseKey: true`

-Create an Azure Key Vault certificate policy that we’ll supply to Key Vault for generating the certificate: +Create an Azure Key Vault certificate policy that we'll supply to Key Vault for generating the certificate: ```json showLineNumbers # policy.json @@ -124,18 +137,18 @@ Create an Azure Key Vault certificate policy that we’ll supply to Key Vault fo - **`reuseKey: true`** The magic ingredient. Azure issues a *new* certificate version but reuses the *same* private key. RavenDB sees a familiar key pair from the same issuer and continues to authenticate it immediately. -- **`exportable: true` \+ `contentType: application/x-pkcs12`** - Key Vault must allow you to export a PFX. RavenDB accepts PFXs directly \- so this ensures a seamless integration. +- **`exportable: true` + `contentType: application/x-pkcs12`** + Key Vault must allow you to export a PFX. RavenDB accepts PFXs directly - so this ensures a seamless integration. - **`subject: "CN=vault-demo.ravendb.run"`** - Same certificate’s common name we used while generating our setup package. + Same certificate's common name we used while generating our setup package. - **`kty: "RSA" and keySize: 2048`** Chooses the RSA algorithm with a 2048-bit key, a standard and broadly compatible choice for TLS client certificate -

2\. Issue the certificate and export the PFX

+

2. Issue the certificate and export the PFX

```bash showLineNumbers $ export VAULT_NAME="my-akv" @@ -159,25 +172,25 @@ $ base64 -d client.pfx.b64 > client.pfx At this point, you hold the PFX containing the identity you will **register exactly once** in RavenDB. -
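Before registering, it can be handy to note the certificate's thumbprint so you can compare it after renewal. A small optional sketch using Python's `cryptography` package - it assumes the passphrase-less PFX produced by the export above:

```python showLineNumbers
# pip install cryptography
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.serialization import pkcs12

with open("client.pfx", "rb") as f:
    _key, cert, _chain = pkcs12.load_key_and_certificates(f.read(), password=None)

# The SHA-1 fingerprint is the thumbprint shown in RavenDB Studio.
print(cert.fingerprint(hashes.SHA1()).hex().upper())
```

Keep the value around - after the renewal in step 4, rerun this against the renewed PFX and watch the thumbprint change while authentication keeps working.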

**3\. Register the PFX in RavenDB a single time**

+

3. Register the PFX in RavenDB a single time

To register the certificate through **RavenDB Studio**, navigate to: **Manage Server → Certificates → Manage Certificates → Upload Client Certificate** In the upload dialog, fill in the following fields: -- **Name** \- A friendly identifier for this client certificate (e.g., ravendb-client). +- **Name** - A friendly identifier for this client certificate (e.g., ravendb-client). -- **Security Clearance** \- The permission level you want this identity to have (e.g., ClusterAdmin). +- **Security Clearance** - The permission level you want this identity to have (e.g., ClusterAdmin). -- **Path to certificate file** \- The PFX file you bundled (e.g., client.pfx). +- **Path to certificate file** - The PFX file you bundled (e.g., client.pfx). -- **Passphrase (optional)** \- Only required if the PFX was created with one. +- **Passphrase (optional)** - Only required if the PFX was created with one. -- **Expiration (optional)** \- If you want RavenDB to treat the certificate as expired earlier than its actual validity period. +- **Expiration (optional)** - If you want RavenDB to treat the certificate as expired earlier than its actual validity period. RavenDB Studio upload client certificate dialog showing Name, Security Clearance, and certificate file fields -

**Alternatively, register the certificate via the CLI**

+

Alternatively, register the certificate via the CLI

For automation or scripted deployments, you can perform the same registration using the admin REST API: @@ -200,9 +213,9 @@ We have now introduced this identity to the cluster. **We will not do this again Every future renewal will be accepted automatically. -Let’s try it by going through the renewal process. +Let's try it by going through the renewal process. -

4\. Issue a new certificate version, export it, and prepare it for use

+

4. Issue a new certificate version, export it, and prepare it for use

```bash showLineNumbers
# Create a second version of the certificate - same key, new validity window
@@ -225,9 +238,9 @@ We created a fresh certificate **version** in Azure Key Vault. Because `reuseKey

After exporting this renewed PFX, it is immediately ready for use as a client certificate.

At this point, your application or deployment pipeline can simply retrieve the latest certificate version as part of its startup or container initialization process. The rotation logic now lives entirely in the Vault - your service just asks for "the newest version," and RavenDB will accept it automatically because the key and issuer remain the same.
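That startup-time fetch is only a few lines in any language. Here is a sketch in Python using the `azure-identity` and `azure-keyvault-secrets` packages; the vault and certificate names match this walkthrough. Key Vault exposes the full PFX of a certificate through its secrets endpoint, base64-encoded when the content type is `application/x-pkcs12`:

```python showLineNumbers
# pip install azure-identity azure-keyvault-secrets
import base64

from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

client = SecretClient(
    vault_url="https://my-akv.vault.azure.net",
    credential=DefaultAzureCredential(),
)

# With no version specified, get_secret returns the latest version -
# i.e. whatever the most recent renewal produced.
pfx_bytes = base64.b64decode(client.get_secret("ravendb-client").value)

with open("client.pfx", "wb") as f:
    f.write(pfx_bytes)
```

Because registration happened once, whatever this writes to disk will authenticate against RavenDB, no matter how many renewals have happened since.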

5\. Use the renewed certificate to authenticate \- without re-registering

+

5. Use the renewed certificate to authenticate - without re-registering

```bash showLineNumbers $ curl --cert client-renewed.pfx: --cert-type P12 \ @@ -240,9 +253,9 @@ The renewed certificate successfully authenticates without performing a registra This confirms the key reuse mechanism worked: despite being a new version with a new thumbprint, the public/private key pair is identical, so the system treats it as the same client identity. -

6\. List certificates on the server and observe both versions under the same identity

+

6. List certificates on the server and observe both versions under the same identity

-You can also inspect the registered certificates- and verify that both versions of the renewed certificate appear \- directly through the RavenDB Studio.
+You can also inspect the registered certificates - and verify that both versions of the renewed certificate appear - directly through RavenDB Studio.

Navigate to: **Manage Server → Certificates → Manage Certificates**

@@ -292,13 +305,13 @@ HashiCorp Vault does not offer a reuseKey flag like Azure Key Vault.

But we can achieve the exact same behavior by generating the private key ourselves once, storing it, and instructing Vault to sign the same CSR on every renewal.

-This gives us a new certificate with a new thumbprint and new dates \- but the **same key pair** \- meaning we can renew indefinitely without re-registering in RavenDB.
+This gives us a new certificate with a new thumbprint and new dates - but the **same key pair** - meaning we can renew indefinitely without re-registering in RavenDB.

-This walkthrough follows the same pattern: generate the anchor key \+ CSR, have Vault sign it, register the PFX once, then renew by signing the same CSR again.
+This walkthrough follows the same pattern: generate the anchor key + CSR, have Vault sign it, register the PFX once, then renew by signing the same CSR again.

-So let’s switch over to the terminal and get started. We’ll be demonstrating everything in a Linux environment.
+So let's switch over to the terminal and get started. We'll be demonstrating everything in a Linux environment.

-

1\. Spin up Vault, enable the PKI engine, and set a max TTL

+

1. Spin up Vault, enable the PKI engine, and set a max TTL

```bash showLineNumbers $ export VAULT_ADDR='http://127.0.0.1:8200' @@ -316,7 +329,7 @@ $ vault secrets tune -max-lease-ttl=8760h pki A development Vault instance is running locally, and the PKI secrets engine is ready to issue certificates with a long TTL window. -

2\. Generate the long-lived RSA private key and CSR

+

2. Generate the long-lived RSA private key and CSR

```bash showLineNumbers
$ openssl genrsa -out client.key 2048
$ openssl req -new -key client.key \
@@ -327,9 +340,9 @@ $ openssl req -new -key client.key \

We generate the private key once and never rotate it.

Since certificate renewal usually happens only after a long time window, it’s often wise to store the private key and CSR in the vault for future use. It lets you retrieve them later and hand the same CSR back to the root CA for re-signing, producing a fresh certificate without changing the key. It also means the anchor material (key \+ CSR) is protected and accessed using the same security, authentication, and authorization model that Vault already enforces.

So before we go any further, let’s tuck both the private key and the CSR safely into the vault \- ready for the day we’ll need them again:

```bash showLineNumbers
$ vault secrets enable -path=kv kv-v2
$ vault kv put kv/ravendb/client \
@@ -341,9 +354,9 @@ $ vault kv put kv/ravendb/client \

The CSR is tied to this key, and Vault will sign this CSR every time we want a renewal.

This is how we reproduce the “same-key renewal” behavior that Azure Key Vault gets with reuseKey: true.
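When a renewal is due, the retrieval side is just as small if your automation is written in Python. A sketch using the `hvac` client - the mount point and path match the `vault kv put` above, while the field names (`private_key`, `csr`) are assumptions about how you keyed the secret:

```python showLineNumbers
# pip install hvac
import hvac

# Replace with your Vault address and a token allowed to read kv/ravendb/client.
client = hvac.Client(url="http://127.0.0.1:8200", token="...")

secret = client.secrets.kv.v2.read_secret_version(
    path="ravendb/client",
    mount_point="kv",
)

# The same CSR we stored at setup time - hand it back to pki/sign/ravendb-client
# whenever a renewal is due.
csr_pem = secret["data"]["data"]["csr"]
private_key_pem = secret["data"]["data"]["private_key"]
```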

3\. Create a self-signed root CA inside Vault and define a client certificate role

+

3. Create a self-signed root CA inside Vault and define a client certificate role

```bash showLineNumbers $ vault write pki/root/generate/internal \ @@ -364,7 +377,7 @@ $ vault write pki/roles/ravendb-client \ The role tells Vault to issue a **client** certificate with the correct EKUs and key usages, and we allow any CN since the CSR already defines it. -

4\. Issue the first certificate by signing the saved CSR, then package it as a PFX

+

4. Issue the first certificate by signing the saved CSR, then package it as a PFX

```bash showLineNumbers $ vault write -format=json pki/sign/ravendb-client \ @@ -382,24 +395,24 @@ $ openssl pkcs12 -export \ ``` -Vault did **not** generate a key \- it only signed the CSR we provided. Therefore the certificate uses the *same key pair* we generated in step 1\. Packaging it as a PFX makes it ready for RavenDB registration. +Vault did **not** generate a key - it only signed the CSR we provided. Therefore the certificate uses the *same key pair* we generated in step 1. Packaging it as a PFX makes it ready for RavenDB registration. -
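If you would rather do the packaging step in Python than shell out to openssl, the `cryptography` package can build the same PFX. The file names below (`client.key`, `client.crt`, `ca.crt`) are assumptions that mirror this walkthrough - adjust them to wherever you saved the signed certificate and issuing CA:

```python showLineNumbers
# pip install cryptography
from cryptography import x509
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.serialization import pkcs12

with open("client.key", "rb") as f:
    key = serialization.load_pem_private_key(f.read(), password=None)
with open("client.crt", "rb") as f:
    cert = x509.load_pem_x509_certificate(f.read())
with open("ca.crt", "rb") as f:
    ca = x509.load_pem_x509_certificate(f.read())

pfx = pkcs12.serialize_key_and_certificates(
    name=b"ravendb-client",
    key=key,
    cert=cert,
    cas=[ca],
    # No passphrase, matching the openssl exports in this guide.
    encryption_algorithm=serialization.NoEncryption(),
)

with open("client.pfx", "wb") as f:
    f.write(pfx)
```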

**5\. Register the client certificate in RavenDB a single time**

+

5. Register the client certificate in RavenDB a single time

To register the certificate through **RavenDB Studio**, navigate to: **Manage Server → Certificates → Manage Certificates → Upload Client Certificate** In the upload dialog, fill in the following fields: -- **Name** \- A friendly identifier for this client certificate (e.g., ravendb-client). +- **Name** - A friendly identifier for this client certificate (e.g., ravendb-client). -- **Security Clearance** \- The permission level you want this identity to have (e.g., ClusterAdmin). +- **Security Clearance** - The permission level you want this identity to have (e.g., ClusterAdmin). -- **Path to certificate file** \- The PFX file you bundled (e.g., client.pfx). +- **Path to certificate file** - The PFX file you bundled (e.g., client.pfx). -- **Passphrase (optional)** \- Only required if the PFX was created with one. +- **Passphrase (optional)** - Only required if the PFX was created with one. -- **Expiration (optional)** \- If you want RavenDB to treat the certificate as expired earlier than its actual validity period. +- **Expiration (optional)** - If you want RavenDB to treat the certificate as expired earlier than its actual validity period. RavenDB Studio upload client certificate dialog showing Name, Security Clearance, and certificate file fields

Alternatively, register the certificate via the CLI

@@ -424,9 +437,9 @@ We have now introduced this identity to the cluster. **We will not do this again Every future renewal will be accepted automatically. -Let’s try it by going through the renewal process. +Let's try it by going through the renewal process. -

6\. Renew the certificate by signing the exact same CSR again

+

6. Renew the certificate by signing the exact same CSR again

```bash showLineNumbers # fetch the client.csr we stored earlier in the vault @@ -449,7 +462,7 @@ $ openssl pkcs12 -export \ Vault issued a brand-new certificate with a new validity period and new thumbprint. But because it signed the same CSR, the underlying private key is unchanged. -

7\. Use the renewed certificate to authenticate \- without re-registering

+

7. Use the renewed certificate to authenticate - without re-registering

```bash showLineNumbers $ curl --cert client-renewed.pfx: --cert-type P12 \ @@ -460,9 +473,9 @@ $ curl --cert client-renewed.pfx: --cert-type P12 \ The renewed certificate successfully authenticates without performing a registration step again. The identity (the key pair) is the same, so authentication succeeds. -

8\. List certificates on the server and observe both versions under the same identity

+

8. List certificates on the server and observe both versions under the same identity

-You can also inspect the registered certificates- and verify that both versions of the renewed certificate appear \- directly through the RavenDB Studio.
+You can also inspect the registered certificates - and verify that both versions of the renewed certificate appear - directly through RavenDB Studio.

Navigate to: **Manage Server → Certificates → Manage Certificates**

@@ -502,20 +515,20 @@ $ curl --cert client-renewed.pfx: --cert-type P12 \

You will find two entries named ravendb-client, each with a different thumbprint. They appear separately because they are separate certificate files.

-But both certificates authenticate successfully because they share the same key pair \- meaning they represent one logical identity from an authentication perspective.
+But both certificates authenticate successfully because they share the same key pair - meaning they represent one logical identity from an authentication perspective.
-AWS does not provide a `reuseKey` mechanism like Azure Key Vault, and ACM (the public certificate service) explicitly **rotates the key pair** during renewal \- which breaks our goal. +AWS does not provide a `reuseKey` mechanism like Azure Key Vault, and ACM (the public certificate service) explicitly **rotates the key pair** during renewal - which breaks our goal. However, AWS **Private CA (ACM PCA)** behaves much closer to HashiCorp Vault: *we bring our own private key*, generate a CSR once, and then repeatedly ask PCA to sign that same CSR whenever we need a new certificate. -We already set up the necessary AWS services earlier \- a Private CA in ACTIVE state and AWS Secrets Manager available for storing the key/CSR anchor. +We already set up the necessary AWS services earlier - a Private CA in ACTIVE state and AWS Secrets Manager available for storing the key/CSR anchor. -So let’s switch over to the terminal and get started. We’ll be demonstrating everything in a Linux environment. +So let's switch over to the terminal and get started. We'll be demonstrating everything in a Linux environment. -

1\. Generate the long-lived private key and CSR

+

1. Generate the long-lived private key and CSR

```bash showLineNumbers $ openssl genrsa -out client.key 2048 @@ -526,13 +539,13 @@ $ openssl req -new -key client.key \ We generate the private key once and never rotate it. - The CSR is bound to this key, and PCA will sign this same CSR each time we renew. This is the AWS equivalent of Azure Key Vault’s reuseKey: true. + The CSR is bound to this key, and PCA will sign this same CSR each time we renew. This is the AWS equivalent of Azure Key Vault's reuseKey: true. -Since certificate renewal usually happens only after a long time window, it’s often wise to store the private key and CSR in the Secret Manager for future use. It lets you retrieve them later and hand the same CSR back to the root CA for re-signing, producing a fresh certificate without changing the key. It also means the anchor material (key \+ CSR) is protected and accessed using the same security, authentication and authorization model that Secret Manager already enforces, while keeping everything under the Secret Manager’s existing security and access-control model. +Since certificate renewal usually happens only after a long time window, it's often wise to store the private key and CSR in the Secret Manager for future use. It lets you retrieve them later and hand the same CSR back to the root CA for re-signing, producing a fresh certificate without changing the key. It also means the anchor material (key + CSR) is protected and accessed using the same security, authentication and authorization model that Secret Manager already enforces, while keeping everything under the Secret Manager's existing security and access-control model. -So before we go any further, let’s tuck both the private key and the CSR safely into the vault \- ready for the day we’ll need them again: +So before we go any further, let's tuck both the private key and the CSR safely into the vault - ready for the day we'll need them again: -

2\. Store the private key \+ CSR in AWS Secrets Manager for future renewals

+

2. Store the private key + CSR in AWS Secrets Manager for future renewals

```bash showLineNumbers $ export AWS_REGION="us-east-1" @@ -552,7 +565,7 @@ Secrets Manager securely stores the two critical pieces: our private key and our Future renewals simply fetch the CSR back, meaning no manual handling months or years later. -
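An automated renewal job would do that fetch programmatically. A sketch with `boto3`, under the assumption that the secret was stored as JSON with `private_key` and `csr` fields - the secret name below is a placeholder for whatever you passed to `create-secret`:

```python showLineNumbers
# pip install boto3
import json

import boto3

sm = boto3.client("secretsmanager", region_name="us-east-1")

secret = json.loads(
    sm.get_secret_value(SecretId="ravendb/client-anchor")["SecretString"]
)

# Write the CSR back to disk so it can be handed to acm-pca issue-certificate.
with open("client.csr", "w") as f:
    f.write(secret["csr"])
```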

3\. Issue the first certificate using PCA by signing the saved CSR

+

3. Issue the first certificate using PCA by signing the saved CSR

```bash showLineNumbers $ export CA_ARN="arn:aws:acm-pca:us-east-1:123456789012:certificate-authority/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" @@ -573,11 +586,11 @@ $ CERT_ARN="$( Why these parameters matter: -- \--csr fileb://client.csr We sign our CSR, so PCA uses our key. -- \--idempotency-token "$(uuidgen)" idempotency-token is a PCA parameter that ensures each request is treated as a unique issuance event. uuidgen is a standard Linux command that generates a random UUID. Using a new UUID each time prevents PCA from returning the same certificate ARN if the command is repeated, guaranteeing that renewal actually produces a **new certificate**. +- --csr fileb://client.csr We sign our CSR, so PCA uses our key. +- --idempotency-token "$(uuidgen)" idempotency-token is a PCA parameter that ensures each request is treated as a unique issuance event. uuidgen is a standard Linux command that generates a random UUID. Using a new UUID each time prevents PCA from returning the same certificate ARN if the command is repeated, guaranteeing that renewal actually produces a **new certificate**. - validity simply defines the certificate lifetime. -
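The same issuance can be scripted end to end in Python with `boto3`, which is convenient once renewals run on a schedule. This is a sketch, not the walkthrough's own tooling - the CA ARN is the placeholder from above:

```python showLineNumbers
# pip install boto3
import uuid

import boto3

CA_ARN = "arn:aws:acm-pca:us-east-1:123456789012:certificate-authority/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"

pca = boto3.client("acm-pca", region_name="us-east-1")

with open("client.csr", "rb") as f:
    csr = f.read()

# A fresh idempotency token per call, mirroring the uuidgen trick above.
response = pca.issue_certificate(
    CertificateAuthorityArn=CA_ARN,
    Csr=csr,
    SigningAlgorithm="SHA256WITHRSA",
    Validity={"Value": 365, "Type": "DAYS"},
    IdempotencyToken=uuid.uuid4().hex,
)
cert_arn = response["CertificateArn"]

# Wait for issuance to complete, then fetch the leaf certificate and chain.
pca.get_waiter("certificate_issued").wait(
    CertificateAuthorityArn=CA_ARN, CertificateArn=cert_arn
)
issued = pca.get_certificate(CertificateAuthorityArn=CA_ARN, CertificateArn=cert_arn)
cert_pem, chain_pem = issued["Certificate"], issued["CertificateChain"]
```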

4\. Retrieve the issued certificate \+ chain, then package everything as a PFX

+

4. Retrieve the issued certificate + chain, then package everything as a PFX

```bash showLineNumbers $ aws acm-pca get-certificate \ @@ -600,22 +613,22 @@ $ openssl pkcs12 -export \ PCA signed our CSR, producing a leaf certificate and chain. We bundled them with our private key to form a PFX suitable for RavenDB. -

5\. Register the client certificate in RavenDB a single time

+

5. Register the client certificate in RavenDB a single time

To register the certificate through **RavenDB Studio**, navigate to: **Manage Server → Certificates → Manage Certificates → Upload Client Certificate** In the upload dialog, fill in the following fields: -- **Name** \- A friendly identifier for this client certificate (e.g., ravendb-client). +- **Name** - A friendly identifier for this client certificate (e.g., ravendb-client). -- **Security Clearance** \- The permission level you want this identity to have (e.g., ClusterAdmin). +- **Security Clearance** - The permission level you want this identity to have (e.g., ClusterAdmin). -- **Path to certificate file** \- The PFX file you bundled (e.g., client.pfx). +- **Path to certificate file** - The PFX file you bundled (e.g., client.pfx). -- **Passphrase (optional)** \- Only required if the PFX was created with one. +- **Passphrase (optional)** - Only required if the PFX was created with one. -- **Expiration (optional)** \- If you want RavenDB to treat the certificate as expired earlier than its actual validity period. +- **Expiration (optional)** - If you want RavenDB to treat the certificate as expired earlier than its actual validity period. RavenDB Studio upload client certificate dialog showing Name, Security Clearance, and certificate file fields

Alternatively, register the certificate via the CLI

@@ -641,9 +654,9 @@ We have now introduced this identity to the cluster. **We will not do this again Every future renewal will be accepted automatically. -Let’s try it by going through the renewal process. +Let's try it by going through the renewal process. -

6\. Renew the certificate by signing the exact same CSR again

+

6. Renew the certificate by signing the exact same CSR again

```bash showLineNumbers # fetch the client.csr we stored earlier in the sm @@ -686,9 +699,9 @@ $ openssl pkcs12 -export \ -passout pass:'' ``` -We achieved a completely new certificate (new thumbprint, new dates) but using the *same key pair* we generated at step 1\. +We achieved a completely new certificate (new thumbprint, new dates) but using the *same key pair* we generated at step 1. -

7\. Use the renewed certificate to authenticate \- without re-registering

+

7. Use the renewed certificate to authenticate - without re-registering

```bash showLineNumbers $ curl --cert client-renewed.pfx: --cert-type P12 \ @@ -699,9 +712,9 @@ $ curl --cert client-renewed.pfx: --cert-type P12 \ The renewed certificate successfully authenticates without performing a registration step again. The identity (the key pair) is the same, so authentication succeeds. -

8\. List certificates on the server and observe both versions under the same identity

+

8. List certificates on the server and observe both versions under the same identity

-You can also inspect the registered certificates- and verify that both versions of the renewed certificate appear \- directly through the RavenDB Studio.
+You can also inspect the registered certificates - and verify that both versions of the renewed certificate appear - directly through RavenDB Studio.

Navigate to: **Manage Server → Certificates → Manage Certificates**

@@ -740,24 +753,22 @@ $ curl --cert client-renewed.pfx: --cert-type P12 \

You will find two entries named ravendb-client, each with a different thumbprint. They appear separately because they are separate certificate files.

-But both certificates authenticate successfully because they share the same key pair \- meaning they represent one logical identity from an authentication perspective.
+But both certificates authenticate successfully because they share the same key pair - meaning they represent one logical identity from an authentication perspective.
## Conclusion

-The real strength of RavenDB’s security model is that it makes a notoriously painful process \- certificate rotation \- practical and predictable. By anchoring identity to the **key pair** and the **issuer** rather than the certificate file, RavenDB turns renewal into a non-event:
+The real strength of RavenDB's security model is that it makes a notoriously painful process - certificate rotation - practical and predictable. By anchoring identity to the **key pair** and the **issuer** rather than the certificate file, RavenDB turns renewal into a non-event:

As long as a renewed certificate is signed by the same authority and uses the same private key, the cluster immediately recognizes it as the same identity.

This makes renewals - whether done in Azure Key Vault, HashiCorp Vault, or AWS Private CA - silently slide into place without re-registration, downtime, or reconfiguration. You keep a strong certificate-based identity while eliminating the operational churn that usually comes with PKI. The result is tight security without the recurring overhead.

-That’s what “secure-by-default” should feel like: simple, predictable, and something you don’t have to refight every time a certificate expires.
+That's what "secure-by-default" should feel like: simple, predictable, and something you don't have to refight every time a certificate expires.

-If you want to dive deeper, explore RavenDB’s security documentation:
-[https://docs.ravendb.net/server/security/overview](https://docs.ravendb.net/server/security/overview)
+If you want to dive deeper, explore the [RavenDB security documentation](https://docs.ravendb.net/server/security/overview).

-And if you’d like to discuss approaches, get help, or share feedback, join the RavenDB community on Discord:
-[https://discord.com/invite/ravendb](https://discord.com/invite/ravendb)
+And if you'd like to discuss approaches, get help, or share feedback, join the [RavenDB community on Discord](https://discord.com/invite/ravendb).
diff --git a/guides/semantic-search-with-ravendb-python-and-fastapi.mdx b/guides/semantic-search-with-ravendb-python-and-fastapi.mdx
index 4c113efe86..d7ec824dbb 100644
--- a/guides/semantic-search-with-ravendb-python-and-fastapi.mdx
+++ b/guides/semantic-search-with-ravendb-python-and-fastapi.mdx
@@ -1,11 +1,24 @@
---
title: "Semantic Search with RavenDB, Python, and FastAPI"
tags: [python, fastapi, ai, demo, use-case]
-description: "Read about Semantic Search with RavenDB and Python"
+description: "Build a semantic search API with RavenDB, Python, and FastAPI. Learn manual and automatic embedding generation with OpenAI, and run vector similarity search in under 50 lines of application code."
publishedAt: 2025-07-22 image: "https://ravendb.net/wp-content/uploads/2025/07/Semantic-Search-cover.png" author: "Paweł Lachowski" proficiencyLevel: "Expert" +see_also: + - title: "Vector Search Overview" + link: "ai-integration/vector-search/overview" + source: "docs" + path: "AI Integration > Vector Search" + - title: "Embeddings Generation Overview" + link: "ai-integration/generating-embeddings/overview" + source: "docs" + path: "AI Integration > Generating Embeddings" + - title: "AI Connection Strings" + link: "ai-integration/connection-strings/overview" + source: "docs" + path: "AI Integration > Connection Strings" --- import Admonition from '@theme/Admonition'; @@ -22,7 +35,7 @@ Semantic search is a more intuitive way for your users to find the content they To achieve this, your data needs to be vectorized into AI model embeddings, as they are a digital representation of the meaning of your data within a specific model. -To reduce data logistics and the amount of code to be written and maintained, RavenDB offers automatic embedding generation. If you don’t need another custom solution for that and want to ship faster, this is the way. Automatic embedding generation with RavenDB handles your data logistics out of the box, simplifying app development. +To reduce data logistics and the amount of code to be written and maintained, RavenDB offers automatic embedding generation. If you don't need another custom solution for that and want to ship faster, this is the way. Automatic embedding generation with RavenDB handles your data logistics out of the box, simplifying app development. In this article, we will create a sample FastAPI application to show you how vector search works. We will implement both *manual* and *automatic* embedding generation. @@ -30,17 +43,17 @@ In this article, we will create a sample FastAPI application to show you how vec ### Application -Using FastAPI, we can quickly build a web AI search endpoint to demonstrate how semantic search works. We’ll use a built-in OpenAPI interface to picture that. +Using FastAPI, we can quickly build a web AI search endpoint to demonstrate how semantic search works. We'll use a built-in OpenAPI interface to picture that. -See, we query for “Cheese” and we get all kinds of cheese products from our database: +See, we query for "Cheese" and we get all kinds of cheese products from our database: - - -Under the hood, the application translates our query term “Cheese” to an embedding (vector) on the fly and compares other vectors within the database, finding the closest “meanings”. Let’s show how to build that. +FastAPI Swagger UI with a 'Cheese' query entered into the search endpoint +FastAPI Swagger UI response showing cheese products returned by RavenDB vector search +Under the hood, the application translates our query term "Cheese" to an embedding (vector) on the fly and compares other vectors within the database, finding the closest "meanings". Let's show how to build that. Without the query implementation, our application looks like this: -``` +```python showLineNumbers from fastapi import FastAPI from pydantic import BaseModel from ravendb import DocumentStore @@ -76,9 +89,9 @@ if __name__ == "__main__": uvicorn.run("app:app", host="127.0.0.1", port=8000, reload=True) ``` -To make it work, we need a few Python packages:. 
+To make it work, we need a few Python packages: -``` +```bash pip install fastapi pip install uvicorn pip install ravendb @@ -88,9 +101,9 @@ pip install ravendb - uvicorn: ASGI web server - ravendb: Python SDK for RavenDB -This fragment sets up RavenDB. It connects to a database running on our local machine with the Northwind [sample data set](https://ravendb.net/docs/article-page/7.0/csharp/studio/database/tasks/create-sample-data). +This fragment sets up RavenDB. It connects to a database running on our local machine with the Northwind [sample data set](https://ravendb.net/docs/studio/database/tasks/create-sample-data). -```py +```python showLineNumbers # RavenDB setup document_store = DocumentStore( urls=["http://127.0.0.1:8080"], @@ -99,9 +112,9 @@ document_store = DocumentStore( document_store.initialize() ``` -Then, we create the Product class to represent the JSON document schema for documents in the Products collection of the [“Northwind”](https://ravendb.net/docs/studio/database/tasks/create-sample-data) database. This allows us to work with Product documents as objects. +Then, we create the Product class to represent the JSON document schema for documents in the Products collection of the ["Northwind"](https://ravendb.net/docs/studio/database/tasks/create-sample-data) database. This allows us to work with Product documents as objects. -``` +```python showLineNumbers # Northwind Product schema class Product(BaseModel): name: str @@ -115,9 +128,9 @@ class Product(BaseModel): reorder_level: int ``` -Next, we define endpoints to query products. We’ll add the detailed query logic once we have embeddings set up. +Next, we define endpoints to query products. We'll add the detailed query logic once we have embeddings set up. -``` +```python showLineNumbers @app.get("/products") async def search_products(query: str): ... @@ -125,7 +138,7 @@ async def search_products(query: str): The last step is to use [Uvicorn](https://www.uvicorn.org/) to launch this app. It handles your web requests, allowing them to reach our application. Uvicorn serves as a simple bridge between the network and your application. -``` +```python showLineNumbers if __name__ == "__main__": import uvicorn uvicorn.run("app:app", host="127.0.0.1", port=8000, reload=True) @@ -133,9 +146,9 @@ if __name__ == "__main__": ### Vector Search with manual embeddings generation -We can generate embeddings manually, but it requires more effort on our part. We need to take care of data logistics and structure, and we may still be missing some more advanced functionalities, for example, caching or chunking. Let’s show how to generate embeddings using the popular OpenAI embedding model ["text-embedding-3-small"](https://platform.openai.com/docs/guides/embeddings). +We can generate embeddings manually, but it requires more effort on our part. We need to take care of data logistics and structure, and we may still be missing some more advanced functionalities, for example, caching or chunking. Let's show how to generate embeddings using the popular OpenAI embedding model ["text-embedding-3-small"](https://platform.openai.com/docs/guides/embeddings). -``` +```python showLineNumbers import openai def get_embedding(text: str) -> list[float]: @@ -148,7 +161,7 @@ embedding = get_embedding("Example product name") We need to install and use the *openai* Python package. This allows us to interact with AI models from OpenAI. Next, we generate embeddings with `get_embedding` for the entire Products collection. 
Then we put the vector in the vector\_embedding field. We will query all Products, add the field `vector_embedding`, and then call `save_changes`. -``` +```python showLineNumbers with document_store.open_session() as session: products = list(session.query(object_type=Product)) for product in products: @@ -157,11 +170,11 @@ with document_store.open_session() as session: session.save_changes() ``` -Note: The Northwind database is relatively small, allowing us to query and update the documents directly. However, for larger datasets, you’d need to explore a more effective strategy. If that’s your case, see how to use automatic embeddings generation later in this article. +Note: The Northwind database is relatively small, allowing us to query and update the documents directly. However, for larger datasets, you'd need to explore a more effective strategy. If that's your case, see how to use automatic embeddings generation later in this article. Products now have an embedding field, allowing us to run RavenDB Vector Search. But to compare vectors, we also need an embedding vector for the query search term. We use the same function to generate embeddings for incoming search terms on the fly using the same method (`get_embedding`). -``` +```python showLineNumbers query_embedding = get_embedding("search term") with document_store.open_session() as session: @@ -194,32 +207,43 @@ This way we need to: … and we still lack features like caching repetitive search terms and text chunking. +| Aspect | **Manual embeddings** | **Automatic embeddings** | +|---|---|---| +| **Embedding generation** | Your code calls OpenAI directly | RavenDB calls the model via a configured task | +| **Initial data vectorization** | You loop through documents and save vectors yourself | RavenDB handles the backfill automatically | +| **Query-time vectorization** | Your code calls `get_embedding()` on each request | `vector_search_text_using_task` handles it internally | +| **Caching repeated queries** | Not included; you build it yourself | Built in | +| **Chunking support** | Not included; you build it yourself | Built in | +| **Switching AI models** | Requires code changes and re-vectorization | Change the connection string in Studio | +| **Application code size** | ~50 lines including OpenAI setup | ~30 lines, no AI client code | +| **Best for** | Full control over embedding logic | Faster delivery, less maintenance overhead | + ### Vector search with automatic embeddings -Now, let’s try automatic embeddings generation in RavenDB. The code will be the same as starting one, but reduced by manual communication with OpenAI. We just need to add the embeddings generation in RavenDB Studio. +Now, let's try automatic embeddings generation in RavenDB. The code will be the same as starting one, but reduced by manual communication with OpenAI. We just need to add the embeddings generation in RavenDB Studio. Adding automatic embeddings generation starts in the AI Hub. We will automate the communication with the external AI model. To do so, we need to define which model we want to use. -### AI Connection String +#### AI Connection String Create a new [AI connection string](https://ravendb.net/docs/ai-integration/connection-strings/connection-strings-overview) in RavenDB Studio: - +RavenDB Studio form for creating a new AI connection string, with the OpenAI provider selected Define your custom name and identifier and pick the service you want to use; we chose OpenAI. 
Then, in the new fields, select the endpoint & model, and paste your API key. Other fields are optional and not currently relevant to us. You can test the connection to ensure everything works properly. -### AI Task +#### AI Task -We can connect to the OpenAI model, but we need to create a task that generates embeddings. Go back to the AI Hub and choose ‘AI Tasks’. Create a new embeddings generation task and fill in its name and identifier. Select our new connection string. We select the ‘Products’ collection and type ‘Name’ for a path just beneath the collection. Just save it, and it’s ready. +We can connect to the OpenAI model, but we need to create a task that generates embeddings. Go back to the AI Hub and choose 'AI Tasks'. Create a new embeddings generation task and fill in its name and identifier. Select our new connection string. We select the 'Products' collection and type 'Name' for a path just beneath the collection. Just save it, and it's ready. - -Look how easy and short it is. Using RavenDB, we don’t need to worry about adding new fields; query logic is already thinner, and all connections to AI are already handled without the need for additional stuff. +RavenDB Studio form for a new embeddings generation task, targeting the Products collection Name field +The task configuration is minimal: just a name, a connection string, the target collection, and the field path. RavenDB handles everything else: scheduling, API calls, and storing the resulting vectors. -### Application code +#### Application code All the required parts for your own embeddings generation and adding the field can be removed, making the code smaller. -``` +```python showLineNumbers from fastapi import FastAPI from pydantic import BaseModel from ravendb import DocumentStore @@ -231,7 +255,7 @@ document_store = DocumentStore( ) document_store.initialize() -# Northwind Product schema based on your example +# Northwind Product schema class Product(BaseModel): name: str supplier: str @@ -263,7 +287,7 @@ if __name__ == "__main__": uvicorn.run("app:app", host="127.0.0.1", port=8000, reload=True) ``` -Look how easy and short it is. Using RavenDB, we don’t need to worry about adding new fields; the query endpoint logic is already more straightforward, and connection to the AI model is already handled without the need for additional stuff. +Compared to the manual approach, the application code no longer needs to manage an OpenAI client, generate embeddings per document, or store them in a separate field. The `vector_search_text_using_task` call replaces the full manual pipeline: RavenDB resolves the task, vectorizes the query on the fly, and returns ranked results. This way we get: @@ -274,12 +298,19 @@ This way we get: Works perfectly: - - +FastAPI Swagger UI showing a semantic search query returning relevant products using automatic RavenDB embeddings +FastAPI Swagger UI response body with product results returned by automatic vector search In the studio, RavenDB created a separate collection for embeddings and cached terms: - -Everything’s working on its own, and our semantic search can be delivered much quicker. +RavenDB Studio collections view showing an auto-generated embeddings collection and cached query terms +Everything's working on its own, and our semantic search can be delivered much quicker. ## Summary -Now that you know how to handle Vector Search with RavenDB, you may have some cool projects to share\! In this case, we would like to invite you to our Discord \- RavenDB Developers Community. 
Check it out [here](https://discord.com/invite/ravendb). +In this article, you built a semantic search endpoint using RavenDB, Python, and FastAPI, twice. + +- Manual approach: You vectorized the Products collection yourself using `openai.embeddings.create`, stored the vectors in a document field, and queried them with `vector_search`. This gives you full control but requires managing API calls, data updates, and lacks built-in caching. +- Automatic approach: You configured an embeddings generation task in RavenDB Studio and called `vector_search_text_using_task` in a single line. RavenDB handles vectorization, storage, caching, and query-time embedding, with no OpenAI client code needed in the application. + +For most projects, the automatic approach is the better starting point: less code, easier to maintain, and caching included out of the box. + +Built something with RavenDB? Share it with the community on [Discord](https://discord.com/invite/ravendb). diff --git a/guides/spatial-search-in-ravendb.mdx b/guides/spatial-search-in-ravendb.mdx index ff62822ca0..8f4c3c0c22 100644 --- a/guides/spatial-search-in-ravendb.mdx +++ b/guides/spatial-search-in-ravendb.mdx @@ -5,7 +5,7 @@ icon: "spatial-map-view" description: "Learn how to implement spatial search in RavenDB with Python. Covers radius queries, custom polygon shapes, WKT syntax, and reverse point-in-polygon lookup using the Flat Finder demo." author: "Paweł Lachowski" publishedAt: 2026-03-11 -proficiencyLevel: "Expert" +proficiencyLevel: "Intermediate" keywords: ["spatial search", "geospatial query", "radius search", "polygon search", "WKT", "BoundingBox", "QuadPrefixTree", "GeoHashPrefixTree", "RavenDB Python"] see_also: - title: "Indexing Spatial Data" @@ -31,7 +31,7 @@ import LanguageContent from "@site/src/components/LanguageContent"; import Image from "@theme/IdealImage"; -RavenDB offers native spatial search without external plugins. Store point coordinates or polygon shapes in your documents and query by radius, custom shapes, or reverse point-in-polygon lookup — using the same indexing workflow you use for all other fields. This guide walks through all three query types with a working Python demo. +RavenDB offers native spatial search without external plugins. Store point coordinates or polygon shapes in your documents and query by radius, custom shapes, or reverse point-in-polygon lookup, using the same indexing workflow you use for all other fields. This guide walks through all three query types with a working Python demo. Spatial data tends to create an impression that you need to learn a dedicated tool from scratch before you can do anything useful with it. The technical concepts like storing coordinates and polygons may look simple at first, but once you dig a little deeper, they can become surprisingly hard to navigate through. Geospatial tools often do not help newcomers get started smoothly due to convoluted technical documentation. @@ -40,13 +40,13 @@ Yet many applications only require a handful of straightforward operations. Your These are simple ideas that do not require advanced infrastructure, and RavenDB keeps this approachable. You can store points or shapes directly in your documents and query them with the same workflow you use for other fields. This makes spatial search feel like a natural extension of the database rather than a separate feature you need to learn from scratch. 
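+To make that concrete, here is a minimal sketch of what "storing a point directly in a document" can look like with the RavenDB Python client. The class, field names, and database name are illustrative assumptions, not code from the demo:
+
+```py
+from ravendb import DocumentStore
+
+# Illustrative model: a flat described by plain numeric coordinates.
+# No geospatial plugin or special column type is involved.
+class Flat:
+    def __init__(self, name: str, latitude: float, longitude: float):
+        self.name = name
+        self.latitude = latitude
+        self.longitude = longitude
+
+store = DocumentStore(urls=["http://127.0.0.1:8080"], database="Flats")
+store.initialize()
+
+with store.open_session() as session:
+    # Stored like any other document; the two float fields are all
+    # a spatial index needs to treat this document as a point.
+    session.store(Flat("Studio near the Louvre", 48.8606, 2.3376))
+    session.save_changes()
+```
+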
-## Why Use RavenDB for Spatial Search
+## Why use RavenDB for spatial search

-RavenDB offers native spatial search without external plugins, unlike many other popular databases. RavenDB is designed to make it as fast as possible to go from inserting data to querying it. Once your data is stored in the correct format, you can query it immediately.
+Most databases require a dedicated geospatial extension or a separate service to handle spatial search. In RavenDB, spatial search is built in. Once your documents store coordinates or shapes in the correct format, you can query them immediately using the same session and index workflow you already use for everything else.

-While the system is easy to learn, we also focused on keeping it as precise as possible. This even lets you point to specific rooms in bigger buildings. RavenDB supports three systems for geographical data, and two of them also work with Cartesian systems (more about precision and the systems we use later in this guide \- to skip there, you can click [here](#cartesian-and-geographical)).
+The system is also designed for precision. RavenDB supports three indexing strategies for spatial data, two of which work with Cartesian systems as well as geographical ones, giving you accuracy down to roughly 2.5 meters. That is precise enough to distinguish individual rooms in a building. More on the strategies and how they differ [here](#cartesian-and-geographical).

-Alright, it’s flexible, precise, and easy to use, but what can we do with it?
+So it’s built in, precise, and fits naturally into your existing workflow. What can you actually do with it?

## Spatial search capabilities

@@ -56,11 +56,11 @@ Another subfeature is [*querying by shape*](#part-2-search-by-shape). You define

You can also [*search using polygons*](#part-3-reverse-search) defined in the same way as the second method. If you pre-define shapes like districts, you can point to a location and get back the district it’s in.

-Respectively, as we allow indexing points and WKT shapes, you can achieve anything you need with spatial indexes \- even shape-by-shape search. RavenDB's spatial search capability is ready for any use case, both simple and more advanced.
+Because we index both points and WKT shapes, you can achieve anything you need with spatial indexes, including shape-by-shape search. RavenDB's spatial search capability is ready for any use case, both simple and more advanced.

All those usages depend on how you use your indexes and queries, and there is, of course, more you can do.

-### Dynamic vs Static index
+### Dynamic vs static index

Queries over spatial data behave the same as regular RavenDB queries, so we can either let RavenDB generate an auto-index (dynamic) or create a predefined index (static) manually. Dynamic indexes are good if you are satisfied with the default search options, and you are fine with the first query being slower (while the index is being created).

@@ -72,11 +72,11 @@ But what do we have under the hood?

## How this system works

-This section discusses the technical details of RavenDB and the components that make it work. If you prefer to view the demo first, click this [link](#demo---how-we-use-spatial-search).
+This section discusses the technical details of RavenDB and the components that make it work. If you prefer to view the demo first, click this [link](#demo-how-we-use-spatial-search).

-### Cartesian and Geographical
+### Cartesian and geographical

-RavenDB has two systems \- Cartesian and geographical. 
Both systems mostly support the same strategies. Geographical, that is the default one, can work with three indexing strategies. +RavenDB has two systems: Cartesian and geographical. Both systems mostly support the same strategies. Geographical, that is the default one, can work with three indexing strategies. * BoundingBox * QuadPrefixTree @@ -90,13 +90,19 @@ QuadPrefixTree uses a hierarchical grid that becomes more detailed with each lev GeoHashPrefixTree works similarly, but instead of grid cells, it uses geohashes. A geohash is a compact string that represents a location. Each additional character increases precision and zooms in on a smaller part of the map. RavenDB stores these prefixes in the index. When you perform a spatial search, RavenDB converts your search region into a set of geohash prefixes and immediately filters the entries that match them. Since nearby locations produce similar prefixes, this system is very efficient for geographical data. It is tied to Earth's coordinate system, so it is used only in geographic mode, not in Cartesian mode. -All three strategies support the same spatial search features. The difference lies in how they use space and how they query your data. BoundingBox keeps things simple and light. QuadPrefixTree provides precision at the grid level. GeoHashPrefixTree provides compact indexing that naturally works with real-world coordinates. When it comes to precision, both QuadPrefixTree and GeoHashPrefixTree have around 2.5 meters of precision, but QuadPrefixTree uses more bytes. Why GeoHash is more compact? It needs only 9 levels of depth to reach same precision QuadPrefixTree reaches in 23 levels. +All three strategies support the same spatial search features. The difference lies in how they use space and how they query your data. BoundingBox keeps things simple and light. QuadPrefixTree provides precision at the grid level. GeoHashPrefixTree provides compact indexing that naturally works with real-world coordinates. When it comes to precision, both QuadPrefixTree and GeoHashPrefixTree have around 2.5 meters of precision, but QuadPrefixTree uses more bytes. Why GeoHash is more compact? It needs only 9 levels of depth to reach the same precision QuadPrefixTree reaches in 23 levels. + +| Strategy | Coordinate systems | Precision | Depth levels | Best for | +|---|---|---|---|---| +| BoundingBox | Geographic + Cartesian | Rectangle bounds | N/A | General use, simple shapes, fastest queries | +| QuadPrefixTree | Geographic + Cartesian | ~2.5 m | 23 | Large areas, dense point clusters | +| GeoHashPrefixTree | Geographic only | ~2.5 m | 9 | Real-world coordinates, most compact storage | ### How to describe point and polygon The first thing we need to understand is how we describe points. We do that with basic coordinates. In both Cartesian and geographic systems, you input longitude and latitude to describe where your point is. The same numbers work regardless of the indexing strategy. A point is simply a location with two values. -To define a shape, we use a “[well-known text](https://en.wikipedia.org/wiki/Well-known_text_representation_of_geometry)” markup language called in short WKT. If you are not familiar with it, let us briefly describe how it works. You describe your geometry with a simple keyword followed by a list of coordinates. The structure is built to be readable by, so even a long shape is easy to understand at a glance. 
+To define a shape, we use a “[well-known text](https://en.wikipedia.org/wiki/Well-known_text_representation_of_geometry)” markup language called in short WKT. If you are not familiar with it, let us briefly describe how it works. You describe your geometry with a simple keyword followed by a list of coordinates. The structure is built to be human-readable, so even a long shape is easy to understand at a glance. We use mainly two shapes: @@ -104,28 +110,33 @@ We use mainly two shapes: A closed shape defined by a list of coordinates. The first coordinate must match the last one to indicate that the shape is closed. -Example: +```wkt POLYGON((10 10, 20 10, 20 20, 10 20, 10 10)) +``` -When defining a polygon, list the points in counterclockwise order. That way, the system understands that the area inside the shape is what you want. If you list the points clockwise instead, some systems may interpret it the opposite way, selecting everything outside the polygon instead of the area inside it. +When defining a polygon, list the points in counterclockwise order. That way, the system understands that the area inside the shape is what you want. If you list the points clockwise instead, some systems may interpret it the opposite way, selecting everything outside the polygon instead of the area inside it. A polygon can also contain a hole. This is done by adding a second set of coordinates, this time clockwise, after the outer boundary. -Example with a hole: +```wkt POLYGON((0 0, 10 0, 10 10, 0 10, 0 0), (3 3, 7 3, 7 7, 3 7, 3 3)) +``` * CIRCLE -RavenDB accepts circles and format works similarly to the polygon syntax. -Example: -CIRCLE(12.4924 41.8902, 5\) +RavenDB accepts circles; the format works similarly to the polygon syntax. + +```wkt +CIRCLE(12.4924 41.8902, 5) +``` + Here, the first part describes the center, and the second part, divided by a comma, describes the radius. WKT lets you describe your shapes consistently. When you put a polygon or a circle into the index, RavenDB converts that shape into whatever structure your indexing strategy needs. You only provide coordinates. After that, the index handles precision, grid subdivision, and comparisons. Once your geometry is stored, you can run spatial queries on it. You can check if a point is inside a polygon. You can search within a radius. You can define districts or regions and return the area that contains a specific coordinate. Everything begins with a WKT definition. -## Demo \- How we use Spatial Search +## Demo: how we use spatial search Now that we have covered the theory, let’s look at the demo and see how to build it with RavenDB. The examples below use **RavenDB 7.2** and the **RavenDB Python client** with **FastAPI**. The demo is coded in Python and is available to clone via this [link](https://github.com/poissoncorp/samples.flats). @@ -141,7 +152,7 @@ Now let's dive into the app and see what it's capable of. We will go through thr Let’s start with the first function, searching by radius. -### Part 1: Search in radius +### Part 1: search in radius This option is selected by default on the top left. All you need to do is click anywhere and select a radius if the default search radius does not suit you. This gives us points within the radius; in this demo, those are flats in Paris. @@ -153,7 +164,7 @@ As you can see, a search by radius excludes any other flats outside the radius, But how do we map those flats on a map? 
-### Part 1 Setup +### Part 1 setup Let’s walk through the code and find the piece responsible for the radius search. The part we are looking for looks like this. @@ -196,16 +207,15 @@ First, we define the endpoint and open a session in order to communicate with Ra Flats_SpatialIndex definition in RavenDB Studio ```py - query = session.query_index_type(Flats_SpatialIndex, Flat) - - query = query.within_radius_of( - "location", - request.radius_km, - request.latitude, - request.longitude, - SpatialUnits.KILOMETERS, - ) +query = session.query_index_type(Flats_SpatialIndex, Flat) +query = query.within_radius_of( + "location", + request.radius_km, + request.latitude, + request.longitude, + SpatialUnits.KILOMETERS, +) ``` Additionally, we provide information for our query. Location selects the field with spatial data in, latitude and longitude define the center of the radius, meanwhile, `radius_km` tells RavenDB how big the radius is. Last line `SpatialUnits.KILOMETERS` simply tells RavenDB to use kilometers. @@ -226,17 +236,17 @@ To keep everything sorted, we let the user select price or distance, then query ```py results = list(query) - response = [] - for flat in results: - metadata = session.advanced.get_metadata_for(flat) - distance = metadata.get("@spatial", {}).get("Distance") - response.append(convert_to_response(flat, distance)) - return response +response = [] +for flat in results: + metadata = session.advanced.get_metadata_for(flat) + distance = metadata.get("@spatial", {}).get("Distance") + response.append(convert_to_response(flat, distance)) +return response ``` …we execute the query and retrieve the distance to the selected point from the metadata, so we can display it alongside the flat data. This is all the code you need to search by radius. Fast and easy without the need for any advanced concepts, just a simple query by field generated by the index. All we had to do was provide the coordinates and the desired range. Let’s go and query by any shape. -### Part 2: Search by shape +### Part 2: search by shape The second option in our demo allows us to search not by distance from a point, but by a custom shape. @@ -248,7 +258,7 @@ This is useful when the area you care about is not a perfect circle. Maybe you w Let’s walk through how this works in code. -### Part 2 Setup +### Part 2 setup The code used for the polygon includes elements we have encountered before, so we will mainly stop at places where new content is introduced. Part 2 code looks like this. @@ -308,8 +318,8 @@ We are again using our predefined spatial index. What changes is how we filter t ```py query = query.spatial( -"location", -lambda criteria: criteria.within(request.wkt) + "location", + lambda criteria: criteria.within(request.wkt) ) ``` @@ -318,10 +328,10 @@ We use RavenDB’s spatial search to return points within our polygon. We simply Just like in Part 1 before, we give the user sorting options, but with a twist. 
```py - if request.sort_by == "price": - query = query.order_by("price_per_month") - elif request.center_latitude is not None and request.center_longitude is not None: - query = query.order_by_distance("location", request.center_latitude, request.center_longitude) +if request.sort_by == "price": + query = query.order_by("price_per_month") +elif request.center_latitude is not None and request.center_longitude is not None: + query = query.order_by_distance("location", request.center_latitude, request.center_longitude) ``` A polygon does not have a defined “center”, so if we want to sort by distance, we must explicitly provide one. Next, we get our center and run our query. @@ -329,15 +339,15 @@ A polygon does not have a defined “center”, so if we want to sort by distanc ```py results = list(query) - response = [] - for flat in results: - distance = None - if request.center_latitude is not None and request.center_longitude is not None: - metadata = session.advanced.get_metadata_for(flat) - distance = metadata.get("@spatial", {}).get("Distance") - response.append(convert_to_response(flat, distance)) +response = [] +for flat in results: + distance = None + if request.center_latitude is not None and request.center_longitude is not None: + metadata = session.advanced.get_metadata_for(flat) + distance = metadata.get("@spatial", {}).get("Distance") + response.append(convert_to_response(flat, distance)) - return response +return response ``` Just like in Part 1, RavenDB stores spatial calculation results in metadata. We do not compute the distance ourselves. We simply read what the database already calculated for us. RavenDB returns only flats that are inside the polygon, then sends them to the client. @@ -351,27 +361,27 @@ As you can see, those are districts that we had ready underneath it all. We can ```py @app.get("/api/districts") async def get_all_districts(): -with store.open_session() as session: + with store.open_session() as session: ``` And then we simply query our districts and create a response for our app. ```py - with store.open_session() as session: - districts = list(session.query(object_type=District)) - - return [{ - "name": district.name, - "arrondissement_number": district.arrondissement_number, - "description": district.description, - "population": district.population, - "boundary_wkt": district.boundary_wkt - } for district in districts] +with store.open_session() as session: + districts = list(session.query(object_type=District)) + + return [{ + "name": district.name, + "arrondissement_number": district.arrondissement_number, + "description": district.description, + "population": district.population, + "boundary_wkt": district.boundary_wkt + } for district in districts] ``` Then we can just connect search by shape type of code, and we can search all flats in the selected district. -### Part 3: Reverse search +### Part 3: reverse search You might have noticed that if we click on any flat, the district (4th Arrondissement - Hôtel-de-Ville on the image) it is in is shown. @@ -389,7 +399,7 @@ The only thing that distinguishes this index from a non-spatial one is a single But how do we do this? -### Part 3 Setup +### Part 3 setup The whole code that searches the district from a point for us looks like this: @@ -430,7 +440,7 @@ We start with our endpoint and session. 
```py @app.post("/api/district/by-point", response_model=DistrictResponse) async def get_district_by_point(request: GetDistrictByPointRequest): -with store.open_session() as session: + with store.open_session() as session: ``` Then we query using a spatial index for districts with a point written in WKT. RavenDB will take its coordinates and convert them into a point in space. @@ -444,8 +454,8 @@ Then we find the districts that contain the flat, in other words, we look for a ```py query = query.spatial( -"boundary", -lambda criteria: criteria.within(point_wkt) + "boundary", + lambda criteria: criteria.within(point_wkt) ) ``` @@ -476,4 +486,4 @@ raise HTTPException( As you can see, RavenDB lets you query spatial data and use spatial search capabilities without becoming an expert in a completely new technology. Want to easily add more advanced capabilities to your application? You may consider combining semantic search with your spatial search to query by the best-matching description. You can find more about [semantic search in RavenDB with Python and FastAPI](/guides/semantic-search-with-ravendb-python-and-fastapi/) in the companion guide. -Interested in RavenDB? Grab the [free developer license](https://ravendb.net/dev) for testing, or get a [free cloud database](https://ravendb.net/cloud). If you have questions about this feature, or want to hang out and talk with the RavenDB team, join our Discord Community Server \- invitation link is [here](https://discord.com/invite/ravendb). +Interested in RavenDB? Grab the [free developer license](https://ravendb.net/dev) for testing, or get a [free cloud database](https://ravendb.net/cloud). If you have questions about this feature, or want to hang out and talk with the RavenDB team, join our Discord Community Server. The invitation link is [here](https://discord.com/invite/ravendb). diff --git a/guides/survive-the-ai-tidal-wave-with-ravendb-genai.mdx b/guides/survive-the-ai-tidal-wave-with-ravendb-genai.mdx index 904dd0afbf..efc6f2c043 100644 --- a/guides/survive-the-ai-tidal-wave-with-ravendb-genai.mdx +++ b/guides/survive-the-ai-tidal-wave-with-ravendb-genai.mdx @@ -1,7 +1,7 @@ --- title: "Survive the AI Tidal Wave with RavenDB GenAI" tags: [ai, demo, deep-dive, use-case] -description: "Read about Survive the AI Tidal Wave with RavenDB GenAI on the RavenDB.net news section" +description: "Learn how to use RavenDB's GenAI feature to automatically summarize, prioritize, and enrich documents. Step-by-step walkthrough with code examples and screenshots." publishedAt: 2025-07-07 author: "Gracjan Sadowicz" image: "https://ravendb.net/wp-content/uploads/2025/06/article-cover-genai.png" @@ -30,7 +30,7 @@ This way, AI enthusiasm went to the moon. It’s very accessible – to embrace But in reality, it’s so simple only in abstract high-level system design. The real fact is that AI isn’t simple. For decades, it was a separate branch of computer science. The engineering know-how behind it started to spread super-quickly, yet still very recently. It requires specialized knowledge, skills, approaches, and tools, which are in demand more than ever. -#### AI isn’t that trivial +### AI isn’t that trivial Tempted by its power, companies let developers dive into the world of embeddings, LLMs, tokens, quantization, and so on. This whole new world (at least to most of us) holds nuances and know-how with its learning curve. 
@@ -47,15 +47,15 @@ RavenDB takes the burden of AI off your shoulders, keeping all AI data logistics The surge for building custom AI solutions is no longer needed. **Basic prompting skills are enough**. You can take advantage of the AI opportunity and empower your system in any way imaginable, with almost no work around it. Everything stays in the database\! -One of our new tools is a powerful RavenDB **GenAI** feature. It’s super easy to set up and requires minimal coding to work. You just define the context, write a prompt, and it works, without leaving the RavenDB Studio. Any developer can set it up ridiculously fast. +One of our new tools is a powerful RavenDB [**GenAI**](https://docs.ravendb.net/7.2/ai-integration/gen-ai-integration/overview) feature. It's super easy to set up and requires minimal coding to work. You just define the context, write a prompt, and it works, without leaving the RavenDB Studio. Any developer can set it up ridiculously fast. -It’s not just an AI connector; it’s the entire process, complete with a ready-to-use template. It reads, processes, and updates documents you attach it to. It can even be configured to re-process documents upon changes, letting you create intelligent processes. You can use any model, even offline ones, so when ChatGPT goes down, you’re still up. +It’s not just an AI connector; it’s the entire process, complete with a ready-to-use template. It reads, processes, and updates documents you attach it to. It can even be configured to re-process documents upon changes, letting you create intelligent processes. You can use any model, even offline ones, so when ChatGPT goes down, you're still up. In this article, we will introduce you to GenAI. We will show you how easy it is to set up and inspire you on how broad its usage can be. We will use an example from our demo, Intelligent Support Desk. ## Part 1: Application - +Intelligent Support Desk demo application showing the helpdesk ticket queue
Quick, high-quality support is essential in a business flow, but it requires well-defined prioritization that differentiates “ASAP” from low-priority tickets, proper escalation procedures, and good searchability. Analyzing a **pile of issues** can be very time-consuming: it involves reading the conversation, understanding the problem, assigning a priority, and assessing whether there’s a need for escalation.

@@ -69,17 +69,17 @@ The second feature used here is automatic embedding generation, making all ticke

Let’s see how it works:

-
+Helpdesk application demonstrating AI-powered semantic search results with ticket priority labels
AI search gives us the power to query by meaning, rather than just by tags or exact phrases, so we can quickly find the tickets we are looking for without guessing keywords. The application summarizes the current ticket state, so we don’t need to read the whole conversation to figure out what’s happening. Additionally, all tickets are automatically prioritized and escalated, following the rules we defined.

-
+Helpdesk ticket detail view showing AI-generated priority, summary, and escalation status
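+Since the demo's backend is in Python, consuming these AI-written fields is just a normal query. The sketch below is a hypothetical illustration: the class, field values, and database name are assumptions, not code taken from the repo:
+
+```python
+from ravendb import DocumentStore
+
+# Hypothetical model mirroring the ticket fields the AI fills in.
+class HelpdeskTicket:
+    def __init__(self, title=None, summary=None, priority=None, needs_sales=None):
+        self.title = title
+        self.summary = summary
+        self.priority = priority
+        self.needs_sales = needs_sales
+
+store = DocumentStore(urls=["http://127.0.0.1:8080"], database="Helpdesk")
+store.initialize()
+
+with store.open_session() as session:
+    # `needs_sales` was written by GenAI; the app just filters on it.
+    escalations = list(
+        session.query(object_type=HelpdeskTicket).where_equals("needs_sales", True)
+    )
+```
+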
But how does it work under the hood? Let's dive into it.

## Part 2: How does it work?

-Here’s the link to the application repo for the code-curious readers: [https://github.com/poissoncorp/ravendb-demo-explorer](https://github.com/poissoncorp/ravendb-demo-explorer)
+Here's the link to the application repo for the code-curious readers: [https://github.com/poissoncorp/ravendb-demo-explorer](https://github.com/poissoncorp/ravendb-demo-explorer)

### App breakdown

@@ -128,16 +128,16 @@ In the big picture, it’s just fetching tickets to your browser. We’re trimmi

Returning to the ticket, as you might have seen in the schema, it has a summary, priority, and needs\_sales flag. It’s prepared to be filled in not by humans but by AI.

-To achieve that with RavenDB, first, we created an OpenAI connection string to the LLM so we could use it and the new task.
+To achieve that with RavenDB, first, we created an [OpenAI connection string](https://docs.ravendb.net/7.2/ai-integration/connection-strings/open-ai) to the LLM so we could use it and the new task.

-
+RavenDB Studio showing the OpenAI connection string configuration screen for the GenAI ongoing task
Then, while creating a task, we needed to define which collection (HelpdeskTickets) would be processed, and the **context** we want to give to the AI, to help it **understand the job** it’s dealing with.

We wanted to provide the conversation history to the model so it knows what’s happening inside. The model also needs the customer's name, to tell customer messages apart from support team replies. The ticket title helps as well; it’s another bit of information for understanding the issue fully.

We wrote a script that collects all this data from our document, based on a helpful template available in the ***(i)*** pop-up:

-
+RavenDB Studio GenAI task wizard showing the document context configuration with the helper template popup
In the wizard's next step, we provided AI with two crucial things: a **prompt** and an **output schema** in which we wanted to receive data *from* the AI. The prompt is written in natural language (English in our case), so you **don’t need to write any code** here. Our prompt looked like this:

@@ -157,7 +157,7 @@ Also, indicate if the conversation needs an intervention from one of our sales.

As we said, the second significant element is the JSON schema. We can define a sample object here \- it serves as an example for AI, showing what response is expected. We selected a sample object, a more convenient approach for people who are less experienced with AI or who don’t need manual precision:

-
+RavenDB Studio GenAI task wizard showing output schema configuration using a sample JSON object

```json
{
@@ -176,29 +176,29 @@ Object.assign(this, $output);

After saving, it automatically started generating results. A single prompt handled all summaries, priorities, tags, and sales detections. See:

-
+RavenDB Studio GenAI ongoing task status view showing active document processing progress
Look how nice it is. - +RavenDB document view showing a helpdesk ticket with AI-generated summary, priority, tags, and sales escalation flag
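+In plain JSON terms, the portion GenAI writes back onto the ticket looks roughly like this (the field names follow the schema above; the values are invented for illustration):
+
+```python
+# Roughly the shape of the AI-generated fields on a ticket document.
+generated_fields = {
+    "summary": "Customer cannot log in after the latest update; fix promised within 24h.",
+    "priority": "high",
+    "tags": ["login", "regression"],
+    "needs_sales": False,
+}
+```
+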
Then we got an idea: why shouldn’t the issues be searchable by the meaning of their generated summaries, rather than just by matched text?

-We quickly set up an embeddings generation task on the summary field, which GenAI has generated.
+We quickly set up an [embeddings generation task](https://docs.ravendb.net/7.2/ai-integration/generating-embeddings/embeddings-generation-task) on the summary field that GenAI generates.

We prepared another OpenAI connection string \- now to the embeddings model. See:

-
+RavenDB Studio showing a second OpenAI connection string configured for the text embeddings model
Then we rolled out the second task – this one is for the embeddings generation. Embeddings can be thought of as a numerical representation of the meaning of the text. The task is also responsible for automatic query term “translation” and caching, so your costs are kept under control. Here is the task; we set it up to generate embeddings on the Summary field:

-
+RavenDB Studio embeddings generation task configured to create vector embeddings from the ticket Summary field
-It works\! Now, (of course, after UI work, but that’s a detail 😏) we can search by summary meaning\! +It works\! Now, (of course, after UI work, but that's a detail 😏) we can search by summary meaning\! - +Helpdesk application showing semantic search results returned by vector similarity of ticket summaries ### Building smart, minimum code @@ -210,6 +210,13 @@ The best part is that this requires **only the fun part of coding to ship**. We ## Where to go from here? -If you are unfamiliar with RavenDB, we encourage you to explore what we can do for you. GenAI isn’t our only feature that will make your life easier. Another thing you might be interested in is [native vector search](https://ravendb.net/articles/new-in-7-0-ravendbs-vector-search) or [automatic embeddings generation](https://ravendb.net/articles/embeddings-generation-with-ravendb). +Here's the short version of what we covered: + +- RavenDB's GenAI removes the need for a custom AI pipeline. You define a context, write a prompt in plain English, and the feature reads, processes, and updates your documents automatically. +- One natural-language prompt with a JSON output schema replaced what would normally be four separate processing steps, with RavenDB patching the document fields directly from the AI response. +- GenAI writes the summaries, the embeddings task indexes them, and suddenly you have semantic search across your entire ticket history without touching a single query. +- You don't need to understand how AI works under the hood to use it. GenAI is configured like any other RavenDB Ongoing Task, supports commercial and offline models, and the whole setup lives inside RavenDB Studio. + +If you are unfamiliar with RavenDB, we encourage you to explore what we can do for you. GenAI isn’t our only feature that will make your life easier. Another thing you might be interested in is [native vector search](https://docs.ravendb.net/7.2/ai-integration/vector-search/overview) or [automatic embeddings generation](https://docs.ravendb.net/7.2/ai-integration/generating-embeddings/overview). If you and your team have **gained some free time** by letting AI do the work, you may have a moment to chat with us on our Discord. RavenDB Team and our community are chronically online there. Check out our [Discord server](https://discord.com/invite/ravendb). diff --git a/guides/transactional-outbox.mdx b/guides/transactional-outbox.mdx index d23ea2b26d..40b2098fb2 100644 --- a/guides/transactional-outbox.mdx +++ b/guides/transactional-outbox.mdx @@ -67,7 +67,7 @@ But in RavenDB particularly, there’s another way of achieving the same goal, u ## Implementing the outbox with data subscriptions -We want to connect and save our documents to both collections as a single atomic transaction. To achieve that, we want to use the saveChanges() after storing both documents, to wrap both changes into a single transaction, as described in this [article](/7.2/client-api/faq/transaction-support#acid-for-document-operations). +We want to connect and save our documents to both collections as a single atomic transaction. To achieve that, we want to use the saveChanges() after storing both documents, to wrap both changes into a single transaction, as described in this [article](/client-api/faq/transaction-support#acid-for-document-operations). 
```csharp var invoice = new Invoice @@ -174,7 +174,7 @@ await worker.Run(async batch => }); ``` -In the end whole part of this code looks like this: +Here is the complete code: ```csharp showLineNumbers using (var store = new DocumentStore()) @@ -278,7 +278,7 @@ await worker.Run(async batch => }); ``` -In the end whole part of this code looks like this: +Here is the complete code: ```csharp showLineNumbers using (var store = new DocumentStore()) @@ -409,6 +409,13 @@ And for RabbitMQ we can just look into the GUI. RabbitMQ management GUI showing messages published via RavenDB Queue ETL This way, with RavenDB, you can address the actual need of publishing an event after invoice creation in **minutes**. +## Summary + +- Treating the invoice and its outbox message as a single atomic commit eliminates a whole class of distributed system bugs. Partial failure (data saved, event not sent) is harder to recover from than total failure because it produces silent inconsistencies rather than visible errors. +- Data subscriptions track acknowledgment per worker, so if your application crashes mid-batch, RavenDB resumes from where processing stopped. You get at-least-once delivery guarantees without writing any retry logic yourself. +- Queue ETL moves the reliability boundary into the database server. Because RavenDB owns both the data and the delivery, it journals processed documents and skips re-sending them after a restart, removing a whole category of application-level state management. +- Queue ETL is the right default unless your messages need custom logic such as conditional publishing, payload enrichment, or fan-out to multiple queues. Starting with ETL and switching to data subscriptions only when needed keeps the codebase smaller and the failure surface narrower. + RavenDB ETLs can be used in many different ways. If you are interested in ETL to AWS SQS, [read about RavenDB Amazon SQS ETL](https://ravendb.net/articles/new-in-7-0-ravendb-and-amazon-sqs-etl). You might also be interested in [The Library of Ravens sample repository](https://github.com/ravendb/samples-library), which uses ETL to handle timeouts. [Get a free RavenDB developer license](https://ravendb.net/dev) for testing, or [start a free RavenDB Cloud database](https://ravendb.net/cloud). If you have questions about this feature, or want to hang out and talk with the RavenDB team, [join the RavenDB Discord community](https://discord.com/invite/ravendb). diff --git a/guides/using-remote-attachments-to-cut-storage-costs.mdx b/guides/using-remote-attachments-to-cut-storage-costs.mdx index 0ac3cd196e..e9abf1ef45 100644 --- a/guides/using-remote-attachments-to-cut-storage-costs.mdx +++ b/guides/using-remote-attachments-to-cut-storage-costs.mdx @@ -253,7 +253,7 @@ Then after one minute that has been requested with `DateTime.UtcNow.AddMinutes(1 When moving to remote storage, you would probably also want to move old attachments. To do that we can use a [set-based patch operation](https://docs.ravendb.net/7.2/client-api/operations/patching/set-based#updating-all-documents): -```csharp +```sql from @all_docs update { @@ -289,4 +289,4 @@ Remote attachments in RavenDB let you offload binary data to Amazon S3, Azure Bl For the full API reference, see the [remote attachments configuration documentation](https://docs.ravendb.net/document-extensions/attachments/configure-remote-attachments). 
To learn about storing remote attachments programmatically, see [store remote attachments](https://docs.ravendb.net/document-extensions/attachments/store-attachments/store-attachments-remote). For how remote attachments interact with replication, backups, and subscriptions, see [attachments and other features](https://docs.ravendb.net/document-extensions/attachments/attachments-and-other-features). -Interested in RavenDB? Grab the developer license dedicated for testing under this link [here](https://ravendb.net/dev), or get a free cloud database [here](https://ravendb.net/cloud). If you have questions about this feature, or want to hang out and talk with the RavenDB team, join our Discord Community Server \- invitation link is [here](https://discord.com/invite/ravendb). +Interested in RavenDB? [Grab a free developer license](https://ravendb.net/dev) for testing, or [start a free RavenDB Cloud database](https://ravendb.net/cloud). If you have questions about this feature, or want to hang out and talk with the RavenDB team, [join the RavenDB Discord Community](https://discord.com/invite/ravendb). diff --git a/guides/vibe-coding-with-ravendb-and-context7.mdx b/guides/vibe-coding-with-ravendb-and-context7.mdx index 532032b2b2..cc01a74389 100644 --- a/guides/vibe-coding-with-ravendb-and-context7.mdx +++ b/guides/vibe-coding-with-ravendb-and-context7.mdx @@ -26,7 +26,7 @@ import LanguageSwitcher from "@site/src/components/LanguageSwitcher"; import LanguageContent from "@site/src/components/LanguageContent"; import Image from "@theme/IdealImage"; -As AI tools become more capable, so-called “vibe coding” becomes a practical option in certain situations. It can be especially useful for fast PoC development and prototyping. When treated as just another tool in our toolbox, it fits naturally into a modern workflow. One of its strongest advantages is lowering the barrier to entry for technologies you are not yet comfortable with. With AI assistance, you can quickly prototype an idea and evaluate whether a chosen technology is appropriate before committing more time to it. +Vibe coding is what happens when you stop writing every line yourself and start describing what you want instead. You steer the direction, the AI writes the code, and you review what comes back. It works especially well for prototyping when the goal is to test an idea quickly rather than build something perfect. Overall, AI has improved steadily over time. Models are faster and can handle more context; they have become a standard to provide web search access to find missing information. Many of these LLMs can now be integrated into the IDE, helping you stay focused on the code rather than constantly switching applications. In most cases, this makes work faster and smoother, and the user enters a flow state. @@ -138,4 +138,9 @@ We enable email sending in the config and test it using a local SMTP server. We With an AI assistant at our side, we built a working RavenDB e-commerce prototype — complete with a product listing page, a sales analytics dashboard, and automated order email notifications — in a single session. Context7 kept the AI grounded in accurate RavenDB documentation throughout, reducing the need to search for context and keeping the workflow uninterrupted. 
-From here you can extend the prototype with additional RavenDB features, use it as a basis for evaluating whether RavenDB fits your project's requirements, or explore RavenDB's built-in AI integration capabilities for embeddings and vector search in production applications. +## Summary + +- MCP shifts documentation lookup off your plate. Once Context7 is configured, the AI fetches RavenDB docs on its own when it hits an unfamiliar [client API](/docs/client-api/what-is-a-document-store). You do not need to invoke it explicitly or paste reference material into the chat. +- Vibe coding is a surprisingly honest technology evaluator. Building a prototype this way reveals which parts of a database are intuitive to describe in plain language and which require precise API knowledge, giving you a low-cost signal before committing to a technology. +- Token efficiency matters more than it looks. Mid-session web searches fill the context window with noise. MCP-sourced documentation is targeted and compact, so the AI spends fewer tokens finding context and more time generating useful code. +- Review AI-generated queries before going further. [RavenDB queries](/docs/querying/overview) and [index definitions](/docs/indexes/what-are-indexes) produced during vibe coding are a good starting point, but validate them against your actual data shape and index configuration before treating them as production-ready. diff --git a/guides/zabbix-setup-guide.mdx b/guides/zabbix-setup-guide.mdx index 86a3ad55bc..04114cd731 100644 --- a/guides/zabbix-setup-guide.mdx +++ b/guides/zabbix-setup-guide.mdx @@ -2,7 +2,7 @@ title: "Set up Zabbix Monitoring for RavenDB Cloud" tags: [monitoring] icon: "real-time-statistics" -description: "How to set up your Zabbix to monitor your RavenDB Cloud" +description: "Step-by-step guide to deploying Zabbix in Docker and connecting it to a RavenDB Cloud instance using SNMPv3 for real-time database monitoring." publishedAt: 2025-11-16 image: "https://ravendb.net/wp-content/uploads/2025/11/zabbix-article-image.svg" author: "Paweł Lachowski" @@ -19,7 +19,7 @@ import Image from "@theme/IdealImage"; After setting up RavenDB and watching data flow smoothly between your app and a database, you realize that storing information is only half the story. The other half is keeping it nice and safe. That curiosity leads to monitoring. Tools designed to keep an eye on systems, servers, and networks in real time. -To monitor your database, one option is Zabbix. It’s a handy tool that keeps an eye on your database and lets you know when something’s not quite right. It tracks performance and storage metrics, so you can spot problems early and keep everything running smoothly. Let’s see how to set it up. +To monitor your database, one option is Zabbix. It's a handy tool that keeps an eye on your database and lets you know when something's not quite right. It tracks performance and storage metrics, so you can spot problems early and keep everything running smoothly. Let's see how to set it up. @@ -28,26 +28,31 @@ Before we start deploying Zabbix in Docker, make sure you have the following pre * **Docker** installed and running on your machine (this article was tested with Docker version 28.1.1) * **RavenDB Cloud** instance -If you encounter any problems installing or starting Docker, refer to the [official Docker documentation](https://docs.docker.com/get-started/). RavenDB Cloud instance is available [here](https://ravendb.net/cloud). In this article, we are using Windows. 
Remember to change commands if needed according to your operating system.
+If you encounter any problems installing or starting Docker, refer to the [official Docker documentation](https://docs.docker.com/get-started/). A RavenDB Cloud instance is available at [ravendb.net/cloud](https://ravendb.net/cloud). In this article, we are using Windows. Remember to adapt the commands to your operating system if needed.
+

## Setup

-To set it all up, we will use a few Docker commands. Docker will make it easier for us to set up everything. Let’s start by creating a dedicated network for all Zabbix components. Our Zabbix environment is made of several pieces that collect, process, and present monitoring data. The core components are: server, PostgreSQL database (stores configuration/history), Web frontend (GUI), agent (collects/forwards data)
+To set it all up, we will use a few Docker commands. Docker makes it easier to set everything up. Let's start by creating a dedicated network for all Zabbix components. Our Zabbix environment is made of several pieces that collect, process, and present monitoring data. The core components are the server, a PostgreSQL database (stores configuration/history), the web frontend (GUI), and an agent (collects/forwards data).

### Create the Docker network

Open your terminal and run the following command:

-``` showLineNumbers
+```bash showLineNumbers
docker network create --subnet 172.20.0.0/16 --ip-range 172.20.240.0/20 zabbix-net
```

This creates an isolated Docker network named *zabbix-net*, ensuring all containers can communicate internally without interfering with other services.

+### Deploy the containers
+
Next, we want to create the Zabbix containers inside Docker. We will need four containers, each set up with its own command. The first container holds a PostgreSQL server, which stores our data and the Zabbix config files. We create it using the following command:

-``` showLineNumbers
+```powershell showLineNumbers
docker run --name postgres-server -t `
-e POSTGRES_USER="zabbix" `
-e POSTGRES_PASSWORD="zabbix_pwd" `
@@ -59,7 +64,7 @@ docker run --name postgres-server -t `

Next, we need an SNMP trap agent to collect the logs. SNMP is the protocol we use for communication, and our agent will receive its trap notifications. We also need to expose port 162 so the traps can reach us. We will use the Alpine version of the Zabbix SNMP traps container.

-``` showLineNumbers
+```bash showLineNumbers
docker run --name zabbix-snmptraps -t \
-v /zbx_instance/snmptraps:/var/lib/zabbix/snmptraps:rw \
-v /var/lib/zabbix/mibs:/usr/share/snmp/mibs:ro \
@@ -71,7 +76,7 @@ docker run --name zabbix-snmptraps -t \

Then we need a Zabbix server as the main component for the Zabbix logic. It will connect the previous two containers, obtain data from the Zabbix traps, and store it in the PostgreSQL server.

-``` showLineNumbers
+```powershell showLineNumbers
docker run --name zabbix-server-pgsql -t `
-e DB_SERVER_HOST="postgres-server" `
-e POSTGRES_USER="zabbix" `
@@ -85,10 +90,9 @@ docker run --name zabbix-server-pgsql -t `
-d zabbix/zabbix-server-pgsql:alpine-7.4-latest
```

-
And the last thing we need is a web interface so we can interact with everything. It's added with the last container. 
-``` showLineNumbers +```powershell showLineNumbers docker run --name zabbix-web-nginx-pgsql -t ` -e ZBX_SERVER_HOST="zabbix-server-pgsql" ` -e DB_SERVER_HOST="postgres-server" ` @@ -105,37 +109,45 @@ docker run --name zabbix-web-nginx-pgsql -t ` Those four containers are the whole Zabbix environment. Now we need to connect it to our RavenDB Cloud instance. -Let’s first log in to Zabbix with default credentials. You can enter webUI clicking at port next to the web container or by searching [`http://localhost:80`](http://localhost:80). +### Log in to Zabbix - -You should be greeted with sight like that: +Let's first log in to Zabbix with default credentials. You can enter webUI clicking at port next to the web container or by navigating to `http://localhost:80`. - +Docker Desktop showing the Zabbix web container with a port link to open the web UI +You should see something like this: + +Zabbix login screen with username and password fields There, you want to input default login credentials: Username: Admin Password: zabbix -Once we log in, it’s worth changing the default credentials to something more secure. Let’s select User Settings on the left, then the profile tab and change your password. +Once we log in, it's worth changing the default credentials to something more secure. Let's select User Settings on the left, then the profile tab and change your password. + +Zabbix User Settings profile tab with the Change Password option +### Import the RavenDB template - -Let’s add a RavenDB template we will use later. Templates can be downloaded from [Zabbix community github](https://github.com/zabbix/community-templates/blob/main/Databases/RavenDB/template_ravendb_server/6.0/template_ravendb_server.yaml). With the file downloaded, enter Data Collection and Templates. On the top right corner click import and select our file. +Let's add a RavenDB template we will use later. Templates can be downloaded from the [Zabbix community GitHub](https://github.com/zabbix/community-templates/blob/main/Databases/RavenDB/template_ravendb_server/6.0/template_ravendb_server.yaml). With the file downloaded, enter Data Collection and Templates. On the top right corner click import and select our file. - +Zabbix Data Collection Templates page with the Import button in the top-right corner ## Connecting to RavenDB -Now that we have the base for what we want to do let’s connect your RavenDB Cloud. First we need to turn on the monitoring product feature. You can do it in the Manage page of your chosen RavenDB Cloud instance. +### Enable SNMP monitoring - +Now that we have the base for what we want to do let's connect your RavenDB Cloud. First we need to turn on the monitoring product feature. You can do it in the Manage page of your chosen RavenDB Cloud instance. + +RavenDB Cloud instance Manage page showing the Product Features section In the product features, find the Monitoring option and enable it. We need SNMP credentials to connect with Zabbix. Just follow the configuration menu that you can open by pressing the button that will appear under where the enable button was. - +RavenDB Cloud Monitoring configuration panel showing the SNMP credentials section Inside, you want to copy three things: authentication username, authentication password and privacy password. RavenDB is designed to be safe, so to ensure that we use SNMPv3, we use two passwords. Also, add IP you will be connecting to on top. 
-
+RavenDB Cloud SNMP credentials showing authentication username, authentication password, and privacy password fields

+### Configure the Zabbix host
+
Now we go back to Zabbix and click on Monitoring in the left bar, then Hosts. Then, in the top right corner of the screen, hit Create Host.

-
+Zabbix Monitoring Hosts page with the Create Host button in the top-right corner

Here is what you need to do (in brackets, you have the corresponding RavenDB Cloud configuration names):
1. Add Host Name
@@ -150,22 +162,31 @@ What you need to do here is (In brackets, you have RavenDB Cloud configuration n
10. Select AES128 and add your Privacy passphrase (Privacy password)
11. Save

-
+Zabbix Create Host form configured with RavenDB host name, template, SNMP interface, and SNMPv3 authPriv credentials
-That’s it. If done correctly, you should be able to go into the latest data, select everything, and at the bottom of the page, click execute now. After a few seconds and a refresh, you should have data in your Zabbix. You can check all monitored metrics at this page.
+That's it. If done correctly, you should be able to go into Latest Data, select everything, and at the bottom of the page, click Execute Now. After a few seconds and a refresh, you should have data in your Zabbix. You can check all monitored metrics on this page.
+
+### Add a trigger

-Let’s go to data collection and hosts. There are click triggers at your new host. You can see that we already have some triggers inside from the template. If they satisfy your needs, you are done; if not, let’s add a new one. On the top right corner of the screen, press Create Trigger.
+Let's go to Data Collection and Hosts. There, click Triggers on your new host. You can see that we already have some triggers from the template. If they satisfy your needs, you are done; if not, let's add a new one. In the top right corner of the screen, press Create Trigger.

-
+Zabbix Create Trigger form showing the expression editor

We can add an expression in two ways: either use the editor by pressing Add, or just type it in. We want a trigger that informs us when our database has been down for the last 30 minutes. Our expression looks like this.

-``` showLineNumbers
+```text showLineNumbers
nodata(/RavenDB4/server.uptime,1800)=1
```

-This gives you a trigger; now, you can add an option to send a message to your email address. If you want to learn how to do it you can check Zabbix documentation [here](https://www.zabbix.com/documentation/3.4/en/manual/config/notifications/media/email).
+This gives you a trigger; now you can add an option to send a message to your email address. If you want to learn how to do it, you can check the [Zabbix email notification documentation](https://www.zabbix.com/documentation/3.4/en/manual/config/notifications/media/email).

## Summary

-Setting up monitoring data with Zabbix can be a useful tool to monitor work of your database. If you are interested in monitoring your on-premise database, you can try to use Datadog. You can check the connection tutorial [here](https://ravendb.net/articles/leverage-ravendb-observability-with-datadog).
+Here's what you built and why each piece matters:
+
+- RavenDB Cloud does not broadcast SNMP data by default. You have to explicitly enable the Monitoring feature per instance in the Manage page. That same step also generates the SNMP credentials Zabbix needs, so skipping it means the host in Zabbix will never receive any data. 
+- The Zabbix server reads trap data through a shared volume, not a network call. The `--volumes-from zabbix-snmptraps` flag shares the SNMP container's volumes directly with the server container, which is why both must run on the same Docker host.
+- RavenDB Cloud enforces two separate SNMP passwords by design. One authenticates the sender (SHA1), the other encrypts the payload (AES128). Both are generated for you in the Cloud portal and both are required when configuring `authPriv` security level in Zabbix.
+- The imported RavenDB template already ships with triggers. Before writing a custom trigger, check the template defaults first. The `nodata()` example in this guide is only needed when the built-in triggers do not cover your specific alerting requirements.
+
+If you are interested in monitoring your on-premises database, you can use Datadog instead. You can check the [RavenDB with Datadog tutorial](https://ravendb.net/articles/leverage-ravendb-observability-with-datadog).
 
-Interested in RavenDB? Grab the developer license dedicated to testing under this link [here](https://ravendb.net/dev) or get a free cloud database [here](https://ravendb.net/cloud). Any questions about this feature or just want to hang out and talk with the RavenDB team? Join our Discord Community Server \- invitation link is [here](https://discord.com/invite/ravendb).
+Interested in RavenDB? Grab a [free RavenDB developer license](https://ravendb.net/dev) or spin up a [free RavenDB Cloud database](https://ravendb.net/cloud). Any questions about this feature or just want to hang out and talk with the RavenDB team? Join the [RavenDB Discord Community Server](https://discord.com/invite/ravendb).

From b7c916bc4477a8281187e1f1feba898063ae58b5 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Pawe=C5=82=20Lachowski?=
Date: Wed, 6 May 2026 15:47:57 +0200
Subject: [PATCH 2/3] Small fixes to transactional outbox and vibe coding guides.

---
 guides/transactional-outbox.mdx | 4 ++--
 guides/vibe-coding-with-ravendb-and-context7.mdx | 10 ++++------
 2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/guides/transactional-outbox.mdx b/guides/transactional-outbox.mdx
index 40b2098fb2..c0af40d3b9 100644
--- a/guides/transactional-outbox.mdx
+++ b/guides/transactional-outbox.mdx
@@ -42,7 +42,7 @@ In this article, we will dive into the world of atomic transactions, queues, and

 Publishing events based on data changes and hoping everything works out is only good until the first failure. We have [fallacies of distributed computing](https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing) to remind us of the basic problems of distributed systems. The database might be ACID-compliant, but those guarantees end at the database. If the data is saved in the database but the event is not sent due to a crash, network issue, or power outage, we just lose the data.

-When it happens, we can end up with situations where the invoice exists, but no microservice responds to it. For example, in an e-commerce system, this can mean that payment is accepted, but no realisation process is started. The order is never prepared or shipped, and from the user’s perspective, it simply disappears. This is how technical problems turn into lost trust.
+When it happens, we can end up with situations where the invoice exists, but no microservice responds to it. For example, in an e-commerce system, this can mean that payment is accepted, but no fulfillment process is started.
The order is never prepared or shipped, and from the user’s perspective, it simply disappears. This is how technical problems turn into lost trust.
 
 At the same time, we can have the opposite situation: the queue receives our event, but the database doesn’t create an invoice. That’s how minor problems can change into real legal difficulties. We can’t have partial failure at all; it is worse than complete failure. Then how is it possible we have so many online services that work?
 
@@ -50,7 +50,7 @@ At the same time, we can have the opposite situation: the queue receives our eve
 
-The reason systems like this can work at all is that they stop treating event sending or document creation separately. The moment an invoice is created is also the exact moment an event must be captured. There is no acceptable state in which only one of those happens.
+The reason systems like this can work at all is that they stop treating event sending and document creation as separate operations. The moment an invoice is created is also the exact moment an event must be captured. There is no acceptable state in which only one of those happens.
 
-This is how the concept of the [transactional outbox](https://en.wikipedia.org/wiki/Inbox_and_outbox_pattern#The_outbox_pattern) pattern was born. Instead of sending an event directly to the queue, the system first records the intent to publish that event together with the business data. The invoice and its corresponding event are stored in the database as part of the same transaction. If the transaction succeeds, both exist. If it fails, neither does. In practice, this usually means introducing an outbox as a separate collection and queuing its documents via a data subscription. We create documents simultaneously in both collections. In RavenDB, it would look like this:
+This is how the concept of the [transactional outbox](https://microservices.io/patterns/data/transactional-outbox.html) pattern was born. Instead of sending an event directly to the queue, the system first records the intent to publish that event together with the business data. The invoice and its corresponding event are stored in the database as part of the same transaction. If the transaction succeeds, both exist. If it fails, neither does. In practice, this usually means introducing an outbox as a separate collection and queuing its documents via a data subscription. We create documents simultaneously in both collections. In RavenDB, it would look like this:
 
 ```csharp
 using (var session = store.OpenAsyncSession())
diff --git a/guides/vibe-coding-with-ravendb-and-context7.mdx b/guides/vibe-coding-with-ravendb-and-context7.mdx
index cc01a74389..376563af21 100644
--- a/guides/vibe-coding-with-ravendb-and-context7.mdx
+++ b/guides/vibe-coding-with-ravendb-and-context7.mdx
@@ -138,9 +138,7 @@ We enable email sending in the config and test it using a local SMTP server. We
 
 With an AI assistant at our side, we built a working RavenDB e-commerce prototype — complete with a product listing page, a sales analytics dashboard, and automated order email notifications — in a single session. Context7 kept the AI grounded in accurate RavenDB documentation throughout, reducing the need to search for context and keeping the workflow uninterrupted.
 
-## Summary
-
-- MCP shifts documentation lookup off your plate. Once Context7 is configured, the AI fetches RavenDB docs on its own when it hits an unfamiliar [client API](/docs/client-api/what-is-a-document-store). You do not need to invoke it explicitly or paste reference material into the chat.
-- Vibe coding is a surprisingly honest technology evaluator.
Building a prototype this way reveals which parts of a database are intuitive to describe in plain language and which require precise API knowledge, giving you a low-cost signal before committing to a technology. -- Token efficiency matters more than it looks. Mid-session web searches fill the context window with noise. MCP-sourced documentation is targeted and compact, so the AI spends fewer tokens finding context and more time generating useful code. -- Review AI-generated queries before going further. [RavenDB queries](/docs/querying/overview) and [index definitions](/docs/indexes/what-are-indexes) produced during vibe coding are a good starting point, but validate them against your actual data shape and index configuration before treating them as production-ready. +- **MCP shifts documentation lookup off your plate.** Once Context7 is configured, the AI fetches RavenDB docs on its own when it hits an unfamiliar [client API](/docs/client-api/what-is-a-document-store). You do not need to invoke it explicitly or paste reference material into the chat. +- **Vibe coding is a surprisingly honest technology evaluator.** Building a prototype this way reveals which parts of a database are intuitive to describe in plain language and which require precise API knowledge, giving you a low-cost signal before committing to a technology. +- **Token efficiency matters more than it looks.** Mid-session web searches fill the context window with noise. MCP-sourced documentation is targeted and compact, so the AI spends fewer tokens finding context and more time generating useful code. +- **Review AI-generated queries before going further.** [RavenDB queries](/docs/querying/overview) and [index definitions](/docs/indexes/what-are-indexes) produced during vibe coding are a good starting point, but validate them against your actual data shape and index configuration before treating them as production-ready. From 7f4d4b1de85d90159c732402d330291982f89db2 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Pawe=C5=82=20Lachowski?= Date: Wed, 6 May 2026 16:04:06 +0200 Subject: [PATCH 3/3] Fixes to broken links --- .../employing-schema-validation-to-standardize-your-data.mdx | 2 +- guides/transactional-outbox.mdx | 2 +- guides/vibe-coding-with-ravendb-and-context7.mdx | 4 ++-- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/guides/employing-schema-validation-to-standardize-your-data.mdx b/guides/employing-schema-validation-to-standardize-your-data.mdx index 3e6c695211..8a9f6c3438 100644 --- a/guides/employing-schema-validation-to-standardize-your-data.mdx +++ b/guides/employing-schema-validation-to-standardize-your-data.mdx @@ -258,7 +258,7 @@ This schema is simple. What it does is define the fields Company and Employee mu } ``` -There are many different properties; the full list of constraints is available in the [Write Validation: API](/documents/schema-validation/write-validation/write-validation_api#available-constraints) reference. Now what it checks is not only if the Company and Employee fields are there, but also if Freight is a number, Lines with nested fields, and more. To make it even more strict, we can enforce additionalProperties false to tell the schema to catch any writes for documents that have any additional field other than those present in the schema. +There are many different properties; the full list of constraints is available in the [Write Validation: API](https://docs.ravendb.net/documents/schema-validation/write-validation/write-validation_api) reference. 
Now it checks not only that the Company and Employee fields are present, but also that Freight is a number, that Lines contains the expected nested fields, and more. To make it even stricter, we can set `additionalProperties: false` to make the schema reject any write whose document contains fields beyond those defined in the schema.
 
 ### Index
 
diff --git a/guides/transactional-outbox.mdx b/guides/transactional-outbox.mdx
index c0af40d3b9..d961e3f3dc 100644
--- a/guides/transactional-outbox.mdx
+++ b/guides/transactional-outbox.mdx
@@ -67,7 +67,7 @@ But in RavenDB particularly, there’s another way of achieving the same goal, u
 
 ## Implementing the outbox with data subscriptions
 
-We want to connect and save our documents to both collections as a single atomic transaction. To achieve that, we want to use the saveChanges() after storing both documents, to wrap both changes into a single transaction, as described in this [article](/client-api/faq/transaction-support#acid-for-document-operations).
+We want to connect and save our documents to both collections as a single atomic transaction. To achieve that, we call SaveChanges() once after storing both documents, wrapping both changes into a single transaction, as described in the [transaction support article](https://docs.ravendb.net/client-api/faq/transaction-support#acid-for-document-operations).
 
 ```csharp
 var invoice = new Invoice
diff --git a/guides/vibe-coding-with-ravendb-and-context7.mdx b/guides/vibe-coding-with-ravendb-and-context7.mdx
index 376563af21..17331b9343 100644
--- a/guides/vibe-coding-with-ravendb-and-context7.mdx
+++ b/guides/vibe-coding-with-ravendb-and-context7.mdx
@@ -138,7 +138,7 @@ We enable email sending in the config and test it using a local SMTP server. We
 
 With an AI assistant at our side, we built a working RavenDB e-commerce prototype — complete with a product listing page, a sales analytics dashboard, and automated order email notifications — in a single session. Context7 kept the AI grounded in accurate RavenDB documentation throughout, reducing the need to search for context and keeping the workflow uninterrupted.
 
-- **MCP shifts documentation lookup off your plate.** Once Context7 is configured, the AI fetches RavenDB docs on its own when it hits an unfamiliar [client API](/docs/client-api/what-is-a-document-store). You do not need to invoke it explicitly or paste reference material into the chat.
+- **MCP shifts documentation lookup off your plate.** Once Context7 is configured, the AI fetches RavenDB docs on its own when it hits an unfamiliar [client API](https://docs.ravendb.net/client-api/what-is-a-document-store). You do not need to invoke it explicitly or paste reference material into the chat.
 - **Vibe coding is a surprisingly honest technology evaluator.** Building a prototype this way reveals which parts of a database are intuitive to describe in plain language and which require precise API knowledge, giving you a low-cost signal before committing to a technology.
 - **Token efficiency matters more than it looks.** Mid-session web searches fill the context window with noise. MCP-sourced documentation is targeted and compact, so the AI spends fewer tokens finding context and more time generating useful code.
-- **Review AI-generated queries before going further.** [RavenDB queries](/docs/querying/overview) and [index definitions](/docs/indexes/what-are-indexes) produced during vibe coding are a good starting point, but validate them against your actual data shape and index configuration before treating them as production-ready. +- **Review AI-generated queries before going further.** [RavenDB queries](https://docs.ravendb.net/querying/overview) and [index definitions](https://docs.ravendb.net/indexes/what-are-indexes) produced during vibe coding are a good starting point, but validate them against your actual data shape and index configuration before treating them as production-ready.