pod network integration (Shadow mode)#4205
Conversation
|
All contributors have signed the CLA ✍️ ✅ |
aff7b2c to
dad9d3d
Compare
|
This pull request has been marked as stale because it has been inactive a while. Please update this pull request or it will be automatically closed. |
|
This pull request has been marked as stale because it has been inactive a while. Please update this pull request or it will be automatically closed. |
99c8b62 to
78a3e71
Compare
| [[package]] | ||
| name = "jsonwebtoken" | ||
| version = "9.3.1" | ||
| source = "registry+https://github.com/rust-lang/crates.io-index" | ||
| checksum = "5a87cc7a48537badeae96744432de36f4be2b4a34a05a5ef32e9dd8a1c169dde" | ||
| dependencies = [ | ||
| "base64 0.22.1", | ||
| "js-sys", | ||
| "pem", | ||
| "ring", | ||
| "serde", | ||
| "serde_json", | ||
| "simple_asn1", | ||
| ] |
Check warning
Code scanning / Trivy
jsonwebtoken: jsonwebtoken has Type Confusion that leads to potential authorization bypass Medium
e32884c to
7f0f75c
Compare
Pod Integration: Design Decisions (more context apart from PR description)Shadow Mode ArchitecturePod integration runs in shadow mode, so bid submission failures are logged as warnings but never block the protocol flow. Haircut-Based Score Differentiation (fo rthe e2e Tests)Since i'm using 2 instances of baseline solvers to simulate competition for the e2e tests; they produce the same flowTwo solvers competing for the same order find identical routes → identical scores. SolutionUse
This reliably produces different scores for testing pod winner selection without complex liquidity setup. Nonce/Pending TX HandlingCurrent IssuePod network rejects new TXs when previous ones from same sender are pending: Approach
Deterministic Test AddressesE2E tests use deterministic addresses derived from Anvil's default mnemonic. Pre-funding these on pod network:
Both funded with ~10 ETH on pod network. Config Structure[pod]
endpoint = "http://cow.pod.network:8545"
auction-contract = "0xeDD0670497E00ded712a398563Ea938A29dD28c7"Wallet uses existing |
eec3eab to
62e7fc4
Compare
|
I have read the CLA Document and I hereby sign the CLA |
cad881f to
b61e69d
Compare
There was a problem hiding this comment.
Code Review
This pull request implements integration with the pod network to support decentralized solver competitions. Key changes include upgrading the Alloy dependency suite to version 1.8.3, introducing deterministic solution hashing within the winner-selection crate, and adding a pod_flow to the driver for bid submission and local arbitration. The PR also includes new E2E tests, local configurations, and account recovery logic for the pod network. Feedback was provided regarding the fragility of the bid-fetching process, where a single malformed bid currently causes the entire batch to fail; it is recommended to collect individual errors instead of aborting the process.
| let bidder_solve_response: dto::SolveResponse = | ||
| match serde_json::from_slice(bid.data.as_slice()) { | ||
| Ok(resp) => resp, | ||
| Err(e) => { | ||
| tracing::error!(error = %e, "failed to deserialize SolveResponse"); | ||
| return Err(anyhow::Error::new(e)); | ||
| } | ||
| }; |
There was a problem hiding this comment.
Aborting the entire bid fetching process due to a single malformed bid is fragile. However, simply skipping the error is insufficient. According to our rules, when fetching a batch of items where individual fetches can fail, the API response should explicitly indicate which items failed and provide error details for each failure, rather than silently ignoring them.
let mut errors = Vec::new();
for bid in bids {
match serde_json::from_slice::<dto::SolveResponse>(bid.data.as_slice()) {
Ok(resp) => {
for solution in resp.solutions {
participants.push(Bid::new(solution));
}
}
Err(e) => {
tracing::error!(error = %e, "failed to deserialize SolveResponse");
errors.push((bid.id, e.to_string()));
}
}
}References
- When fetching a batch of items where individual fetches can fail, do not silently ignore errors. The API response should explicitly indicate which items failed and provide error details for each failure.
…ebase to latest main branch, get test and services compiling again
…rbosity - Add test-pod-verbose justfile target for detailed test output - Replace [pod] log prefixes with structured tracing spans (pod_flow, pod_submit_bid, pod_fetch_bids, pod_local_arbitration) - Move auction_id and solver context to span fields instead of repeating in each log - Reduce log noise: change balance fetch failure from error to warn, make signer type selection debug-level - Add debug-level payload hex logging,
… new pod network version
…ount of skipped bids
… driver/autopilot wrappers
…te related structures
53b940c to
61e1846
Compare
jmg-duarte
left a comment
There was a problem hiding this comment.
Can't we split this by doing a merge over the winsel changes and their downstream friends?
IMO this is way too much to be merged and released confidently in one go
| pub solver: Address, | ||
| pub solution_id: u64, |
There was a problem hiding this comment.
These don't need to be public, arguably the key doesn't either
| impl<S> From<&Solution<S>> for SolutionKey { | ||
| fn from(solution: &Solution<S>) -> Self { | ||
| Self { | ||
| solver: solution.solver(), | ||
| solution_id: solution.id(), | ||
| } | ||
| } |
There was a problem hiding this comment.
This makes other constructions of SolutionKey { .. } inconsistent, either use the same for all of them, or remove this one and use the same approach as the others (2nd is better IMO)
| /// Result of [`Arbitrator::arbitrate_paired_and_rejoin`]. | ||
| #[derive(Debug)] | ||
| pub struct Rejoined<T> { |
There was a problem hiding this comment.
The comment should rather explain what this represents, the current one is more like a footnote
| /// Sorting before arbitration is what lets independent observers | ||
| /// (autopilot, driver, third-party verifiers) reach the same | ||
| /// tie-breaking decision on the same logical solution set. | ||
| pub fn arbitrate_paired<T>( |
There was a problem hiding this comment.
- What is supposed to be the shape of T?
- I see this is only used by arbitrate_paired_and_rejoin, we can drop the pub
| /// A non-zero `orphans` count means two inputs shared the same | ||
| /// `SolutionKey`. Callers should log and alert on this, not treat it | ||
| /// as fatal. | ||
| pub fn arbitrate_paired_and_rejoin<T>( |
There was a problem hiding this comment.
I see this only being used with Bid<BigPayload, Unscored>; I think we can lock in the type here and extend to generics later if needed
| pub fn arbitrate_paired<T>( | ||
| &self, | ||
| items: Vec<(T, Solution<Unscored>)>, | ||
| context: &AuctionContext, | ||
| ) -> (Ranking, HashMap<SolutionKey, T>) { | ||
| let mut paired = items; | ||
| paired.sort_by_cached_key(|(_, solution)| solution.canonical_hash()); | ||
| let mut by_key = HashMap::with_capacity(paired.len()); | ||
| let mut solutions = Vec::with_capacity(paired.len()); | ||
| for (item, solution) in paired { | ||
| by_key.insert(SolutionKey::from(&solution), item); | ||
| solutions.push(solution); | ||
| } | ||
| (self.arbitrate(solutions, context), by_key) | ||
| } |
There was a problem hiding this comment.
The move to paired just to make items mut is kinda weird, you can also avoid the extra collection muts with an iterator
| pub fn arbitrate_paired<T>( | |
| &self, | |
| items: Vec<(T, Solution<Unscored>)>, | |
| context: &AuctionContext, | |
| ) -> (Ranking, HashMap<SolutionKey, T>) { | |
| let mut paired = items; | |
| paired.sort_by_cached_key(|(_, solution)| solution.canonical_hash()); | |
| let mut by_key = HashMap::with_capacity(paired.len()); | |
| let mut solutions = Vec::with_capacity(paired.len()); | |
| for (item, solution) in paired { | |
| by_key.insert(SolutionKey::from(&solution), item); | |
| solutions.push(solution); | |
| } | |
| (self.arbitrate(solutions, context), by_key) | |
| } | |
| pub fn arbitrate_paired<T>( | |
| &self, | |
| mut items: Vec<(T, Solution<Unscored>)>, | |
| context: &AuctionContext, | |
| ) -> (Ranking, HashMap<SolutionKey, T>) { | |
| items.sort_by_cached_key(|(_, solution)| solution.canonical_hash()); | |
| let (by_key, solutions) = items | |
| .into_iter() | |
| .map(|(item, solution)| ((SolutionKey::from(&solution), item), solution)) | |
| .collect(); | |
| (self.arbitrate(solutions, context), by_key) | |
| } |
| let mut orphans = 0; | ||
| let mut rejoin = |s: Solution<Ranked>| -> Option<(T, Solution<Ranked>)> { | ||
| let key = SolutionKey::from(&s); | ||
| match by_key.remove(&key) { | ||
| Some(t) => Some((t, s)), | ||
| None => { | ||
| orphans += 1; | ||
| None | ||
| } | ||
| } | ||
| }; | ||
|
|
||
| let filtered_out = ranking | ||
| .filtered_out | ||
| .into_iter() | ||
| .filter_map(&mut rejoin) | ||
| .collect(); | ||
| let ranked = ranking.ranked.into_iter().filter_map(&mut rejoin).collect(); |
There was a problem hiding this comment.
There must be a simpler way of doing this, there's mutation and pure things leading to collects, looks very weird to me
The FnMut + captured mutable state sounds like an accident waiting to happen as well
Can this made simpler?
| }; | ||
|
|
||
| /// Auction arbitrator responsible for selecting winning solutions. | ||
| #[derive(Clone)] |
There was a problem hiding this comment.
This Clone is not actually needed because the SolverArbitrator also doesn't need one
| /// Clearing prices keyed by token address. Not used by arbitration; | ||
| /// included so `canonical_hash` fingerprints them. | ||
| prices: HashMap<Address, U256>, |
There was a problem hiding this comment.
So, can't we make cannonical_hash's signature be fn(&self, &HashMap<Address, U256>) instead?
Or store the hash of the prices at creation time instead of carrying them around?
| impl<P: Clone, State: Clone> Clone for Bid<P, State> { | ||
| fn clone(&self) -> Self { | ||
| Self { | ||
| payload: self.payload.clone(), | ||
| state: self.state.clone(), | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Even though it's not 1:1, we can use derive(Clone) here instead
Description
Closes #4126. Supersedes #4052.
Adds an opt-in shadow path in the driver that mirrors each auction onto the pod network. After the driver scores its own best solution, a
tokio::spawn-ed task submits it as a bid to a pod auction contract, waits for the deadline, fetches everyone's bids, and runs a local copy of the same arbitration. Pod failures never block, delay, or mutate the response to autopilot.If
[pod]is omitted from the driver TOML, nothing pod-shaped is allocated.Highlights
crates/driver/src/infra/pod.PodManager(inflow.rs) owns the provider, the reusableAuctionClient, and the local arbitrator; it exposes a singlespawn().Competition::solveinvokes it once and is otherwise untouched.winner_selection::Solution::canonical_hashis shared by autopilot and driver, so independent observers tie-break identically on the same logical solution.max_winners(used by the local arbitrator) is configurable from the[pod]block; defaults to 10.pod_getRecoveryTargetTx→recover(tx_hash, nonce)on the0x50d…0003precompile and retry once. Seeinfra/pod/recovery.rsfor the production-mode caveats (TODO(production-pod)).What's new
winner-selection—Solution::canonical_hash,Arbitrator::arbitrate_paired, publicSolutionKey.infra::pod(config,flow,recovery) anddomain::competition::solver_winner_selection.[pod] endpoint, auction-contract-address, max-winners(max-winnersdefaults to 10). Pod is opt-in.pod_test_basic,pod_test_multi_order,pod_test_multi_solver(gated by--run-ignored ignored-only; need live pod RPC).How to test