Conversation
- move translation-specific code out of the service - remove references to Machine class in the interface
pmachapman
left a comment
There was a problem hiding this comment.
Looks good. Just a couple of points directly related to the PR:
- Would you be able to update
.vscode/launch.jsonto match the new docker/project config? - I think the README.md should also be updated.
- Should
Serval.Shared.Contractsbe a global using in the projects that use it? I notice it is used by a lot of different classes in Echo, Serval.Machine.Shared, and Serval.Machine.Shared.Tests. (Maybe alsoServal.WordAlignment.ContractsandServal.Translation.Contractsin these projects too?)
While reviewing the code, I noticed cleanups we could make, but perhaps for sanity these should be dealt with separately:
- Not all of the new Serval projects have
<Import Project="../../../AssemblyInfo.props" />. - The user of tabs and spaces in inconsistent in the csproj files (most have tabs, some have spaces, some have both).
- Use
System.Linq'sToAsyncEnumerable()instead of our own implementation(s). - Use global constants for
"nmt"and"smt_transfer". - Use collection expressions instead of
.ToList()and.ToArray(). - Use
[]instead ofArray.Empty<>()ornew List<>(). - Updating one line function bodies to lamba syntax.
- Change
namespace Microsoft.Extensions.DependencyInjection;to a namespace that matches the folder the file is in. - Updating spelling of Cancelled to Canceled in code (OperationCancelledExceptionFilter) and comments.
- I notice some
Dtos andContracts are classes, while most are records. Is this accidental? - Updating
Microsoft.*10.0.3 dependencies to 10.0.5 (I notice some packages are 10.0.3, and others are 10.0.5). - And probably updating other dependencies too...
(and so up to you if we address these now, or in a later PR. I listed them here rather than where they occur so the PR doesn't get swamped).
@pmachapman partially reviewed 518 files and all commit messages, and made 8 comments.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on ddaspit and Enkidu93).
src/Machine/src/Serval.Machine.Shared/Services/SmtTransferEngineService.cs line 62 at r1 (raw file):
await _engines.DeleteAsync(e => e.EngineId == engineId, cancellationToken); await _trainSegmentPairs.DeleteAllAsync(p => p.TranslationEngineRef == engineId, cancellationToken); await _buildJobService.DeleteEngineAsync(engineId, cancellationToken);
Since this is now cancelable, shouldn't the engine be deleted last, i.e.
await _trainSegmentPairs.DeleteAllAsync(p => p.TranslationEngineRef == engineId, cancellationToken);
await _buildJobService.DeleteEngineAsync(engineId, cancellationToken);
await _engines.DeleteAsync(e => e.EngineId == engineId, cancellationToken);Code quote:
await _engines.DeleteAsync(e => e.EngineId == engineId, cancellationToken);
await _trainSegmentPairs.DeleteAllAsync(p => p.TranslationEngineRef == engineId, cancellationToken);
await _buildJobService.DeleteEngineAsync(engineId, cancellationToken);src/Machine/src/Serval.Machine.Shared/Services/StatisticalEngineService.cs line 22 at r1 (raw file):
private readonly IClearMLQueueService _clearMLQueueService = clearMLQueueService; public string Type => EngineType.Statistical.ToString().ToCamelCase();
I notice that this is the only place that ToCamelCase() is used. In this case, I tihnk ToLowerInvariant() will have the same effect, but I assume you were wanting to use ToCamelCase() for SmtTransfer (in the two occurrences elsewhere of public EngineType EngineType => EngineType.SmtTransfer;)?
Code quote:
public string Type => EngineType.Statistical.ToString().ToCamelCase();src/Serval/src/Serval.Shared/Configuration/IServiceCollectionExtensions.cs line 25 at r1 (raw file):
services.AddHangfire(c => c.SetDataCompatibilityLevel(CompatibilityLevel.Version_170)
We should update to CompatibilityLevel.Version_180. We could even remove this line, given there is just one Hangfire database now.
Code quote:
c.SetDataCompatibilityLevel(CompatibilityLevel.Version_170)src/Serval/src/Serval.Shared.Contracts/UsfmVersificationErrorContract.cs line 13 at r1 (raw file):
InvalidChapterNumber, InvalidVerseNumber, }
I wonder if we shouldn't just use SIL.Machine.Corpora.UsfmVersificationErrorType, especially as this enum is not exposed via any external facing API with the new architecture? It would reduce some boilerplate.
Code quote:
public enum UsfmVersificationErrorType
{
MissingChapter,
MissingVerse,
ExtraVerse,
InvalidVerseRange,
MissingVerseSegment,
ExtraVerseSegment,
InvalidChapterNumber,
InvalidVerseNumber,
}src/Serval/src/Serval.Translation/Features/Engines/GetWordGraph.cs line 6 at r1 (raw file):
{ public required IReadOnlyList<string> SourceTokens { get; init; } public required float InitialStateScore { get; init; }
Should we make this a double? I notice it is a downcast from the contract's public required double InitialStateScore { get; set; }.
Code quote:
public required float InitialStateScore { get; init; }src/Serval/src/Serval.Translation/Features/Engines/StartBuild.cs line 20 at r1 (raw file):
: IRequest<StartBuildResponse>; public record struct StartBuildResponse(
I notice this is the only Response you use record struct. Just checking that was deliberate.
Code quote:
public record struct StartBuildResponse(src/Serval/test/Serval.ApiServer.IntegrationTests/Utils.cs line 1 at r1 (raw file):
namespace Serval.ApiServer;
Should this file be deleted?
Code quote:
namespace Serval.ApiServer;
Modular Monolith Rearchitecture
Core Architecture Change
The branch converts Serval from a distributed microservice architecture — communicating via gRPC and MassTransit message bus — to a modular monolith where all engine services run in-process as directly-called services.
Foundational Refactoring (
c93af884,24cb9fdb)TranslationEngineServiceV1,ServalTranslationEngineServiceV1, and all gRPC-based platform service implementations (V1 suffix). Replaced with direct in-process equivalents (PlatformService,TranslationEngineService).EngineCancelBuildConsumer,EngineCreateConsumer,EngineDeleteConsumer, etc.) and build lifecycle event consumers (TranslationBuildCanceledConsumer,TranslationBuildCompletedConsumer, etc.).5af9ab97): TheSIL.ServiceToolkitshared library was absorbed into the main Serval codebase.New Assembly Structure
SIL.DataAccess.Abstractions(37c5c2dc): Extracted data access interfaces into a dedicated abstractions assembly.Serval.DataFiles.Contracts,Serval.Shared.Contracts,Serval.Translation.Contracts: Created dedicated contracts assemblies to hold shared models and service interfaces.Vertical Slice Architecture (
46f32165,89c5125b)EngineServiceclass with individual vertical slice feature handlers underFeatures/Engines/:CreateEngine,DeleteEngine,GetEngine,GetAllEnginesStartBuild,CancelBuildTranslate,GetWordGraph,TrainSegment,UpdateEngine,GetModelDownloadUrlIEvent/IEventHandlerandIRequest/IRequestHandlerinterfaces to support the CQRS-style dispatch.EngineServiceFactoryfor both Translation and WordAlignment modules.Naming & Structural Cleanup
IServalBuilder→IServalConfigurator(58d5b659): All builder/builder-extensions classes were renamed toIServalConfigurator/IServalConfiguratorExtensionsacross every module.bbe34048): All contract-layer model classes were renamed to use theContractsuffix (e.g.,ParallelCorpusContract,MonolingualCorpusContract).Contracts/→Dtos/folders in all Serval modules (Translation, WordAlignment, DataFiles, Webhooks, Shared).Models/subfolders were moved to the flatServal.Shared.Contractsnamespace.Data Access (
abcf3c4a)Service Refactoring
ParallelCorpusService(71434bb7): Moved translation-specific corpus logic out intoPretranslationService; removedSIL.Machinetype references from the interface, making it more generic.PlatformService(Translation & WordAlignment): FormerPlatformServiceV1gRPC server now replaced with a direct in-processPlatformService.Test Updates
3e172909,89c5125b: Fixed unit and E2E test suite to align with the new architecture — including newEnginesFeatureTests(2,800+ lines) replacing the oldEngineServiceTests, expandedParallelCorpusServiceTests, and updated integration test harness (ServalWebApplicationFactory).Benefits
Removing gRPC and MassTransit eliminates significant operational complexity — no separate message broker to deploy, configure, or monitor. A single database instead of many reduces infrastructure overhead and simplifies cross-module queries. In-process service calls are faster and easier to debug than distributed message passing, and vertical slice feature handlers make the codebase easier to navigate and extend. Overall, the refactor trades distribution complexity for simplicity without sacrificing modularity while also removing around 5,000 lines of code.
This change is