Release/v1.0.0 #18

Merged
crypticsaiyan merged 51 commits into main from release/v1.0.0 on Dec 14, 2025

crypticsaiyan (Owner) commented Dec 14, 2025

Description

Release v1.0.0 - Merging the first stable release of InFoundry from the dev branch to main. This release includes the complete InFoundry platform with AI-powered infrastructure generation capabilities.

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to change)
  • 📚 Documentation update
  • 🔧 Configuration change
  • ♻️ Refactor (no functional changes)
  • 🏗️ Infrastructure/IaC change

Related Issues

Closes #

Changes Made

  • Initial release of InFoundry platform
  • MCP Server implementation for AI-assisted infrastructure generation
  • Kestra orchestration pipelines for end-to-end IaC workflows
  • Next.js UI dashboard for pipeline management and visualization
  • Oumi integration for model training and inference
  • Comprehensive documentation (architecture, blueprints, contributing guidelines)
  • Example files for service profiles, architecture plans, and telemetry
  • Test suite for MCP server, Kestra pipelines, and Oumi services

Infrastructure Changes (if applicable)

  • Terraform files modified
  • New cloud resources added
  • Security groups/IAM policies changed
  • Cost impact assessed

IaC Details

  • Kestra pipeline definitions (9 workflows)
  • MCP server TypeScript configuration
  • Cloud training notebook for model fine-tuning

Testing

  • Unit tests pass
  • Integration tests pass
  • Manual testing completed
  • IaC validation (terraform validate) passes

Test Evidence

  • tests/test_generate_training_data.py - Training data generation tests
  • tests/test_kestra.js - Kestra pipeline tests
  • tests/test_mcp_server.mjs - MCP server integration tests
  • tests/test_oumi_serve.py - Oumi serving tests

Checklist

  • My code follows the project's coding standards
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings or errors
  • I have added tests that prove my fix/feature works
  • New and existing unit tests pass locally
  • Any dependent changes have been merged and published

CodeRabbit Review

  • I have addressed all CodeRabbit suggestions
  • Critical security/performance issues resolved
  • IaC best practices followed (for infrastructure changes)

Screenshots (if applicable)

image

Additional Notes

This is the first official release of InFoundry. The platform enables:

  • Repository ingestion and analysis
  • Telemetry-driven architecture proposals
  • Graph visualization of infrastructure
  • Automated IaC generation and validation
  • PR creation and validation workflows
  • Model evaluation pipelines

This PR will be automatically reviewed by CodeRabbit 🐰

Summary by CodeRabbit

  • New Features

    • Visual Architecture Editor and Dashboard for building/exporting graphs
    • Service Configuration Generator for multi-service profiles
    • End-to-end pipeline orchestration (ingest → propose → render → generate IaC → validate → PR → evaluate)
    • AI-powered architecture recommendations and a local model server option
    • Automated IaC generation, validation, and pull-request creation
    • Real-time pipeline execution and step monitoring in the UI
  • Documentation

    • Project rebranded to InFoundry; expanded README, CHANGELOG, ROADMAP, SECURITY, CONTRIBUTING, and governance docs
    • Added issue and pull-request templates
  • Tests & CI

    • New integration and unit tests for model server, pipelines, and UI workflows; updated CI workflows
  • Chores

    • Expanded ignore rules and license switched to MIT

crypticsaiyan and others added 30 commits December 11, 2025 15:02
…d examples

Set up the initial Kestra orchestration foundation for InFoundry, including
starter flow definitions and example artifacts. These are minimal boilerplate
pipelines meant to establish structure and prepare the repo for full
orchestration development.

Added:
- `orchestrator/kestra_pipelines/` containing initial boilerplate flows
  (ingest-repo, ingest-telemetry, propose-architecture, generate-iac,
  validate-pr, test-deploy, evaluate, end-to-end).
- `examples/kestra/` with sample metrics, summaries, and mock flow inputs
  to support early testing and pipeline validation.
- `kestra/upload_flows.sh` script to manually or programmatically upload
  flows to a running Kestra instance.
- Scaffold for CI integration (future `deploy-kestra-flows.yml`) so flows
  can eventually be deployed from Git.

Notes:
- These flows are placeholders and do not yet contain full logic or
  production-ready tasks—only initialization of naming conventions,
  structure, and workflow boundaries.
- Real integrations (Oumi, Cline, GitHub, CodeRabbit) will be added in
  subsequent commits.

This commit establishes the initial Kestra orchestration layer for the
project and prepares the repository for future flow expansion.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
feat(kestra): initialize orchestration flows
- Add training data generator with 500 architecture examples
- Add SFT training script using Oumi framework with LoRA
- Add inference script for testing trained model
- Add model serving endpoint with FastAPI
- Add Colab notebooks for cloud training
- Add synth data generation scripts
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Docstrings generation was requested by @crypticsaiyan.

* #2 (comment)

The following files were modified:

* `oumi/generate_training_data.py`
* `oumi/generate_with_oumi.py`
* `oumi/oumi_synth.py`
* `oumi/serve.py`
* `oumi/train_sft.py`
feat(oumi): Add Oumi-based model training pipeline for cloud architecture recommendations
…ture

- Add training data generator with 500 rule-based examples
- Add SFT training script using Oumi with LoRA (Qwen2.5-1.5B-Instruct)
- Add inference script for testing trained model
- Add FastAPI server with Oumi model integration
- Add Colab notebook for cloud training on T4 GPU
- Upload trained adapter to HuggingFace (crypticsayan/infoundry-architect)
- Update README with installation and usage instructions

Model outputs JSON matching sample_architecture_plan.json schema
with pattern, components, topology, scaling_strategy, and rationale.
feat(oumi): Add Oumi-based model training pipeline for cloud architecture
- Add infoundry-mcp-server with 4 tools (analyze_repo, propose_architecture, generate_iac, validate_iac)
- Add GitHub Action for architecture issue automation
- Update cline_integration.py with real CLI calls
- Add test script for MCP protocol testing
- Add author association check for issue comment triggers
- Sanitize issue body input to prevent shell injection
- Replace execSync with spawnSync to prevent command injection
- Fix path comparison for repo root exclusion
- Update Terraform templates with complete IAM roles and required fields
- Handle tflint ENOENT gracefully with skipped state
- Use portable paths in test.mjs with import.meta.url
- Target dev branch and use unique timestamped branch names for PRs
- Prevent command injection with spawnSync and path validation
- Sanitize GitHub Actions inputs
- Remove default DB credentials from Terraform
- Fix git workflow for PR creation
- Add proper error handling and fallback logic
feat(cline): Add MCP server for autonomous cloud architecture generation
- Add PR template with standardized sections and CodeRabbit checklist
- Add coderabbit.yml workflow that triggers on all PRs
- Implement IaC validation: terraform fmt, validate, TFLint, tfsec
- Block merge on Terraform validation failures
- Update CONTRIBUTING.md with CodeRabbit review guidelines
feat(ci): add CodeRabbit review workflow with IaC validation gate
…model

01-ingest-repo:
- Add GitHub Languages API integration for accurate language detection
- Implement optimized service detection using fast marker scanning
- Extract databases, ports, frameworks from package.json/Dockerfile
- Add comprehensive error handling with fallback mechanisms

02-ingest-telemetry:
- Improve metric parsing for latency and error rate extraction
- Add better null handling for missing telemetry data

03-propose-architecture:
- Integrate trained Oumi model (crypticsayan/infoundry-architect)
- Call local Oumi serve.py server via HTTP for fast inference
- Add oumi_server_url input with Docker bridge IP default
- Remove heavy pip install, use lightweight API calls instead
- Maintain heuristic fallback when Oumi server unavailable

This enables the full InFoundry pipeline: repo analysis → telemetry
ingestion → AI-powered architecture recommendations in sub-second time.
feat(kestra): enhance repository analysis and integrate trained Oumi …
feat(render-graph): integrate Kestra graph output with React Flow dashboard
…tion

## Kestra AI Agent Integration
- Add AI Agent to 05-generate-iac.yaml for architecture summarization
- Add AI Agent to 05-generate-iac.yaml for IaC configuration decisions
- Add AI Agent to 09-evaluate.yaml for intelligent evaluation
- Configure Ollama (qwen2.5:7b) as local LLM provider
- Implement heuristic fallback when AI is unavailable
- Fix AI output passing using textOutput property

## Pipeline Improvements
- Rename 06-create-pr → 07-create-pr (correct order)
- Rename 07-validate-pr → 08-validate-pr
- Rename 08-test-deploy → 06-validate-iac
- Add 00-full-iac-pipeline.yaml for orchestrating the full flow
- Add conditional PR creation (only if validation passes)

## Terraform Validation (06-validate-iac)
- Use hashicorp/terraform:1.6 Docker image
- Run init, validate, fmt, and plan commands
- Add -no-color flag for clean output
- Make fmt and plan non-blocking (warnings only)
- Add default values to required variables

## Other Fixes
- Add target_folder input to 07-create-pr.yaml
- Use kv() for GITHUB_TOKEN instead of secret()
- Strip markdown code fences from AI output
- Add sample files for testing pipelines
feat(kestra): Complete AI-powered IaC generation pipeline with Ollama integration
## Pipeline Updates
- Update 00-end-to-end.yaml with complete 9-step flow
- Add conditional execution with runIf and skip_* flags
- Add log_summary step and comprehensive outputs
- Update namespace.yaml with Ollama config and pipeline overview
feat(kestra): Update pipeline orchestration with complete 9-step flow
…sualization

## Frontend Pipeline Integration
- Add /pipeline page with execution form and real-time status polling
- Add StepProgressBar component with 9-step visual indicator
- Add StepOutputCard with expandable output sections
- Add PipelineForm for all Kestra flow input parameters
- Add Pipeline link to Header navigation

## Rich Output Viewers
- Add JsonViewer: collapsible tree with syntax highlighting and copy
- Add TableViewer: sortable columns with pagination
- Add GraphViewer: mini ReactFlow for architecture diagrams (horizontal layout)
- Add PRViewer: GitHub PR details with file changes
- Add LogViewer: searchable logs with error highlighting

## API Routes
- Add /api/kestra/execute: trigger pipeline execution
- Add /api/kestra/status/[executionId]: poll execution status
- Add /api/kestra/file: fetch file contents from Kestra storage

## Kestra Pipeline Updates
- 04-render-graph: Change to horizontal (left-to-right) node layout
- 06-validate-iac: Exit with error if validation fails (gate pipeline)

## Configuration
- Add jsconfig.json for @/ path alias
- Add lib/kestra.js for Kestra API utilities
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
crypticsaiyan and others added 18 commits December 14, 2025 10:08
## Security Improvements
- Pin status API route to Node.js runtime (required for Buffer)
- URL-encode executionId in API requests to prevent injection
- Add 10s fetch timeout with AbortController
- Return generic error messages to clients, log details server-side

## DRY Refactoring - Single Source of Truth
- Remove duplicate STATE_MAP from status route, import from lib/kestra.js
- Remove duplicate PIPELINE_STEPS from page.jsx, import from lib/kestra.js
- Remove duplicate STEPS from StepProgressBar, import from lib/kestra.js
- Update lib/kestra.js to use Next.js API proxy routes (removes NEXT_PUBLIC env var)

## Accessibility
- Add :focus-visible styles for LogViewer controls (searchInput, clearBtn, autoScrollBtn)
- Add :focus-within on searchBox container for keyboard navigation
- Use consistent #3b82f6 blue for all focus indicators
Docstrings generation was requested by @crypticsaiyan.

* #11 (comment)

The following files were modified:

* `ui/app/api/kestra/execute/route.js`
* `ui/app/api/kestra/file/route.js`
* `ui/app/api/kestra/status/[executionId]/route.js`
* `ui/app/pipeline/page.jsx`
* `ui/components/Header.jsx`
* `ui/components/PipelineForm.jsx`
* `ui/components/StepOutputCard.jsx`
* `ui/components/StepProgressBar.jsx`
* `ui/components/viewers/GraphViewer.jsx`
* `ui/components/viewers/JsonViewer.jsx`
* `ui/components/viewers/LogViewer.jsx`
* `ui/components/viewers/PRViewer.jsx`
* `ui/components/viewers/TableViewer.jsx`
* `ui/lib/kestra.js`
…262f

📝 Add docstrings to `feat/ui-integration`
feat(ui): Add Kestra pipeline execution interface with rich output visualization
…generation

- Refactor MCP server to exactly 9 tools matching Kestra pipeline:
  ingest_repo, ingest_telemetry, propose_architecture, render_graph,
  generate_iac, validate_iac, create_pr, validate_pr, evaluate
- Fix Terraform templates to use proper multi-line HCL format (no semicolons)
- Fix create_pr to accept both JSON string and object for files param
- Fix Oumi API call to use correct /v1/chat/completions endpoint
- Add ingest_repo support for GitHub URLs (auto-clone)
- Add mcp.json.example for Cline configuration
- Add test-all-tools.mjs for testing all 9 workflow steps
- Remove non-functional cline-architect.yml workflow
- Update README with 9-step workflow documentation
feat(mcp-server): Align MCP Tools with Kestra Workflow
Add component palette, AWS service names, and graph transfer to dashboard
- Add component palette with 21 AWS services for drag-and-drop architecture building
- Add full AWS service name mapping (Rds → Amazon RDS, etc.)
- Add 'Open in Dashboard' button in pipeline graph viewer with localStorage transfer
- Remove prompt input and placeholder links from homepage
- Simplify header/footer with only functional links
- Add suppressHydrationWarning to fix browser extension errors
refactor(ui): Add Architecture Builder with AWS component palette and UI cleanup
feat(ui): add Service Configuration Generator
Tests:
- Add unit tests for oumi/serve.py and generate_training_data.py (72 tests)
- Add JS tests for ui/lib/kestra.js
- Add MCP server integration test

Fixes:
- Add missing `import json` in oumi/generate_training_data.py
- Update license to MIT

Cleanup:
- Remove redundant test files and unused scripts
- Remove compute_reward.py and simulate_load.py
- Update CI workflow to run actual tests
- Add requirements.txt
chore: refactor CI workflow and add test coverage for Oumi and MCP server
- Add docs/architecture.md with system architecture overview, component
  diagrams, data flow documentation, and integration points
- Add docs/blueprint.md with complete system design, project structure,
  component specifications, API contracts, and data schemas
- Add docs/runlocally.md with step-by-step local development guide,
  service setup instructions, and troubleshooting tips
- Add orchestrator/kestra_pipelines/README.md documenting all 9 pipeline
  steps and AI agent capabilities
- Add ui/.env.example for environment configuration
- Add CHANGELOG.md, ROADMAP.md, and SECURITY.md for project governance
- Add GitHub issue templates for bug reports and feature requests
docs: Add comprehensive documentation for InFoundry architecture and local development
coderabbitai Bot (Contributor) commented Dec 14, 2025

Caution

Review failed

The pull request is closed.

Pre-merge checks and finishing touches

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Release/v1.0.0' is directly related to the PR's objective of merging the v1.0.0 stable release into main, clearly identifying this as a major release milestone. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cbed885 and c033194.

📒 Files selected for processing (1)
  • ui/jsconfig.json (1 hunks)

coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 6

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

🟠 Major comments (20)
LICENSE-1-21 (1)

1-21: Update license headers in the notebook and review Apache dependencies.

You're right to flag this license switch—it's a bigger deal than just updating one file. I found some stuff that needs attention:

  1. The notebook at oumi/InFoundry_Cloud_Training.ipynb still has old Apache License 2.0 headers in it that don't match your new MIT license. Those need to be updated or removed.

  2. Your dependencies include Apache-2.0 licensed packages (visible in infoundry-mcp-server/package-lock.json and ui/package-lock.json). The good news: MIT is compatible with Apache 2.0, so you're not breaking anything. But it's worth being aware of.

  3. Python dependencies (fastapi, uvicorn, pydantic, etc.) are all permissive licenses, so no red flags there.

Since there's no CONTRIBUTORS file, make sure your team actually agreed to this license change before merging.

ui/app/configure/page.jsx-1-28 (1)

1-28: Upgrade React to version 19 — your Next.js 15 App Router needs it.

Your project is running Next.js 15.5.9 with React 18.3.1, but that's a mismatch. Next.js 15 requires React 19 for the App Router (which your component is using with the "use client" directive). The React 18 compatibility only works with the older Pages Router.

This mismatch can lead to hydration errors, unexpected behavior in client components, and compatibility issues with Next.js 15 features. To fix it, upgrade React and react-dom to version 19 in your ui/package.json:

cd ui
npm install react@19 react-dom@19 --save-exact

If you're using TypeScript, also update the type definitions:

npm install @types/react@19 @types/react-dom@19 --save-dev --save-exact
oumi/run_inference.py-34-44 (1)

34-44: Watch out for the relative path to trained_model.

Your adapter_model points to "./trained_model" which will break if someone runs this script from a different directory. Consider making this an absolute path or at minimum add a comment explaining it must be run from the oumi directory.

You could either:

  1. Make it relative to the script location:
from pathlib import Path

SCRIPT_DIR = Path(__file__).parent
MODEL_PATH = SCRIPT_DIR / "trained_model"

config = InferenceConfig(
    model=ModelParams(
        model_name="Qwen/Qwen2.5-1.5B-Instruct",
        adapter_model=str(MODEL_PATH),
        trust_remote_code=True,
    ),
    # ...
)
  2. Or add a clear usage note in the docstring:
"""
Simple inference script for the trained Oumi model.
Uses the EXACT same prompt format as training for proper output.

Usage: Must be run from the oumi/ directory:
    cd oumi
    python run_inference.py
"""
ui/app/api/kestra/execute/route.js-70-77 (1)

70-77: Add a timeout to the fetch call to prevent hanging.

Fetch calls without timeouts can hang indefinitely if the Kestra server becomes unresponsive. This is especially important for user-facing APIs where you want to fail fast rather than leave requests hanging.

You can wrap the fetch with a timeout using AbortController:

+    const controller = new AbortController();
+    const timeoutId = setTimeout(() => controller.abort(), 30000); // 30 second timeout
+    
     const response = await fetch(kestraUrl, {
       method: 'POST',
       headers: {
         ...getAuthHeaders(),
         // Note: Don't set Content-Type, fetch will set it with boundary for FormData
       },
       body: formData,
+      signal: controller.signal,
     });
+    
+    clearTimeout(timeoutId);

Then update your error handling to catch AbortError:

   } catch (error) {
     console.error('Error triggering pipeline:', error);
+    if (error.name === 'AbortError') {
+      return NextResponse.json(
+        { error: 'Request timeout', message: 'Kestra server did not respond in time' },
+        { status: 504 }
+      );
+    }
     return NextResponse.json(
       { error: 'Internal server error', message: error.message },
       { status: 500 }
     );
   }

Committable suggestion skipped: line range outside the PR's diff.

tests/test_mcp_server.mjs-64-110 (1)

64-110: Consider using a proper test framework instead of hard-coded timeouts.

Right now you're just firing off requests with setTimeout and hoping they complete in order within your 7-second window. There are no assertions, no failure detection, and no clear pass/fail criteria. This makes it more of a "smoke test" than a real integration test.

For a v1.0.0 release, consider migrating to a proper test framework (like Jest or Mocha) with:

  • Actual assertions to verify responses
  • Await/async flow instead of setTimeout chains
  • Proper error handling and test failure detection
  • Clear pass/fail exit codes

Would you like me to help generate a proper test structure using a test framework?

tests/test_mcp_server.mjs-43-43 (1)

43-43: Empty catch block silently swallows parsing errors.

This empty catch means any JSON parsing errors will be silently ignored, making debugging super painful. At minimum, log the error so you know when responses are malformed.

Apply this diff:

-  } catch { }
+  } catch (err) {
+    console.error('Failed to parse response:', line, err);
+  }
scripts/generate-pr.sh-8-12 (1)

8-12: Actually use OUTPUT_FILE (right now it’s ignored). As-is, ./generate-pr.sh dev PR.md won’t create PR.md.

 BASE_BRANCH="${1:-dev}"
 OUTPUT_FILE="${2:-}"
+
+if [ -z "$OUTPUT_FILE" ]; then
+  echo "Error: output-file is required (arg 2)" >&2
+  exit 1
+fi
...
 echo "# Running Cline CLI..." >&2
+
+if ! command -v cline >/dev/null 2>&1; then
+  echo "# cline not found, falling back to basic template..." >&2
+  CLINE_OK=0
+else
+  CLINE_OK=1
+fi
 
 # Run Cline with the context file
-cline -y "Read $CONTEXT_FILE and generate a PR description following the template. Output only markdown." --mode act 2>/dev/null || {
+if [ "$CLINE_OK" -eq 1 ]; then
+  cline -y "Read $CONTEXT_FILE and generate a PR description following the template. Output only markdown." --mode act > "$OUTPUT_FILE" 2> /dev/null && exit_code=0 || exit_code=$?
+else
+  exit_code=1
+fi
+
+if [ "${exit_code:-1}" -ne 0 ]; then
   echo "# Cline failed, falling back to basic template..." >&2
-  cat << ENDPR
+  cat > "$OUTPUT_FILE" << ENDPR
 # Pull Request
 ...
 ENDPR
-}
+fi
+
+echo "# Wrote PR description to $OUTPUT_FILE" >&2

Also applies to: 64-93

ui/components/StepProgressBar.module.css-40-59 (1)

40-59: Add :focus-visible styles for keyboard accessibility. Hover-only feedback is rough for non-mouse users.

 .step:hover {
   border-color: var(--color-text-secondary);
   transform: scale(1.1);
 }
+
+.step:focus-visible {
+  outline: 2px solid var(--color-text-secondary);
+  outline-offset: 2px;
+}

Also applies to: 122-126

orchestrator/kestra_pipelines/04-render-graph.yaml-221-248 (1)

221-248: Validate components is a list (strings), otherwise you can get weird behavior. If upstream accidentally outputs a string, you’ll iterate chars.

 components = arch.get("components", [])
+if not isinstance(components, list):
+    print("WARNING: components is not a list; coercing to list")
+    components = [str(components)]
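
To see why the type check matters, here is a minimal standalone sketch. `normalize_components` is a hypothetical helper name used for illustration and is not part of the pipeline; it only shows how a bare string differs from a list once it is iterated.

```python
def normalize_components(value):
    """Return components as a list of strings, whatever shape upstream produced."""
    if value is None:
        return []
    if isinstance(value, str):
        # A bare string would otherwise be iterated character by character.
        return [value]
    if isinstance(value, (list, tuple)):
        return [str(item) for item in value]
    # Any other type (dict, number, ...) is coerced to a single entry.
    return [str(value)]


arch = {"components": "rds"}
print(list(arch["components"]))                  # ['r', 'd', 's']
print(normalize_components(arch["components"]))  # ['rds']
```

Wrapping the string in a list, rather than rejecting it, keeps the render step running while still producing a sensible single node.
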
orchestrator/kestra_pipelines/08-validate-pr.yaml-58-110 (1)

58-110: Handle missing repository + add a real timeout outcome (don’t leave it as “pending”). Right now, after 5 minutes you can still return "status": "pending" with no explanation.

-          elif not github_token:
+          elif not repository:
+              result["status"] = "skipped"
+              result["message"] = "No repository configured"
+          elif not github_token:
               result["status"] = "skipped"
               result["message"] = "No GitHub token configured"
           else:
               def github_api(endpoint):
                   url = f"https://api.github.com/repos/{repository}{endpoint}"
                   headers = {
+                      "User-Agent": "infoundry-kestra",
                       "Authorization": f"token {github_token}",
-                      "Accept": "application/vnd.github.v3+json"
+                      "Accept": "application/vnd.github+json"
                   }
...
-                  except:
-                      return None
+                  except Exception as e:
+                      return {"_error": str(e)}
...
               for attempt in range(max_attempts):
                   pr = github_api(f"/pulls/{pr_number}")
-                  if pr:
+                  if pr and not pr.get("_error"):
...
                   if attempt < max_attempts - 1:
                       time.sleep(10)
+
+              if result["status"] == "pending":
+                  result["status"] = "timeout"
+                  result["message"] = "Timed out waiting for checks to complete"
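
The polling behaviour suggested above can be sketched in isolation as follows. This is an illustration under assumptions, not the workflow's actual code: `wait_for_checks` and `fetch_status` are hypothetical names, and 30 attempts at 10 seconds each is simply one way to arrive at the 5-minute window mentioned in the comment.

```python
import time


def wait_for_checks(fetch_status, max_attempts=30, delay_seconds=10, sleep=time.sleep):
    """Poll fetch_status() until it reports a final state or attempts run out.

    fetch_status is a hypothetical callable returning "pending", "success",
    or "failure"; sleep is injectable so the loop can be tested without waiting.
    """
    result = {"status": "pending", "attempts": 0}
    for attempt in range(max_attempts):
        result["attempts"] = attempt + 1
        status = fetch_status()
        if status in ("success", "failure"):
            result["status"] = status
            return result
        if attempt < max_attempts - 1:
            sleep(delay_seconds)
    # Never hand "pending" back to the caller: say explicitly that we gave up.
    result["status"] = "timeout"
    result["message"] = "Timed out waiting for checks to complete"
    return result


# Example: checks finish on the third poll (no real waiting in this demo).
states = iter(["pending", "pending", "success"])
print(wait_for_checks(lambda: next(states), sleep=lambda _: None))
# {'status': 'success', 'attempts': 3}
```

Returning a distinct "timeout" status lets downstream steps tell "checks never finished" apart from "checks are still running".
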
ui/components/viewers/LogViewer.jsx-20-38 (1)

20-38: Don’t let a circular log object crash the whole viewer. JSON.stringify() throws on circular refs.

 export default function LogViewer({ logs, title, maxHeight = 300 }) {
+  const safeStringify = (v) => {
+    try { return typeof v === 'string' ? v : JSON.stringify(v); }
+    catch { return String(v); }
+  };
...
     if (Array.isArray(logs)) {
       return logs.map((log, idx) => ({
         id: idx,
-        text: typeof log === 'string' ? log : JSON.stringify(log),
+        text: safeStringify(log),
         level: log.level,
         timestamp: log.timestamp,
       }));
     }
     
-    return [{ id: 0, text: JSON.stringify(logs, null, 2) }];
+    return [{ id: 0, text: safeStringify(logs) }];
   }, [logs]);

Committable suggestion skipped: line range outside the PR's diff.

orchestrator/kestra_pipelines/09-evaluate.yaml-24-29 (1)

24-29: Hardcoded AI endpoint and API key will break in any environment outside your Docker setup. The baseUrl http://172.17.0.1:11434/v1 (Docker bridge IP) and apiKey "ollama" are tied to your local machine. Use Kestra's secret injection (secret('key_name')) or workflow inputs instead so this works in CI/CD, production, or on a teammate's machine.

orchestrator/kestra_pipelines/07-create-pr.yaml-73-78 (1)

73-78: Big correctness gap: only uploads top-level files, and updates will 422 without sha
This will break the moment your zip has folders (common for Terraform), and it can also fail if TARGET_FOLDER already has files from base_branch. You should os.walk() + preserve relative paths, and include sha when overwriting. Also don’t ignore the PUT result.

Also applies to: 123-136
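
A rough sketch of the recursive upload plan described above, with no network calls. `build_upload_plan` and `existing_shas` are hypothetical names for illustration; the payload fields (`message`, `content`, `branch`, `sha`) follow the GitHub contents API, where `sha` is required when overwriting an existing file.

```python
import base64
import os
import tempfile


def build_upload_plan(local_root, target_folder, existing_shas, branch, message):
    """Return (repo_path, payload) pairs for every file under local_root.

    existing_shas is a hypothetical dict of repo path -> blob sha for files
    already on the branch (for example, gathered beforehand via a GET request).
    """
    prefix = target_folder.strip("/")
    plan = []
    for dirpath, dirnames, filenames in os.walk(local_root):
        dirnames.sort()  # deterministic traversal order
        for name in sorted(filenames):
            full_path = os.path.join(dirpath, name)
            # Keep the path relative to the extracted root, with forward slashes.
            rel_path = os.path.relpath(full_path, local_root).replace(os.sep, "/")
            repo_path = f"{prefix}/{rel_path}" if prefix else rel_path
            with open(full_path, "rb") as handle:
                content = base64.b64encode(handle.read()).decode("ascii")
            payload = {"message": message, "content": content, "branch": branch}
            if repo_path in existing_shas:
                # Overwriting an existing file requires its current blob sha.
                payload["sha"] = existing_shas[repo_path]
            plan.append((repo_path, payload))
    return plan


# Demo with a nested layout, as is common for Terraform modules.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "modules", "vpc"))
    for rel in ("main.tf", "modules/vpc/main.tf"):
        with open(os.path.join(root, *rel.split("/")), "w") as handle:
            handle.write("# placeholder\n")
    plan = build_upload_plan(root, "infra", {"infra/main.tf": "abc123"}, "feature", "Add IaC")
    for repo_path, payload in plan:
        print(repo_path, "sha" in payload)
```

Each payload would then be sent with a PUT to the contents endpoint for its `repo_path`, and the response status should be checked rather than ignored.
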

ui/components/StepProgressBar.jsx-48-57 (1)

48-57: Add type="button" to avoid accidental form submits
If this component ever ends up inside a <form>, clicking a step will submit by default.

@@
               <button
+                type="button"
                 className={`
                   ${styles.step}
                   ${styles[state]}
                   ${isActive ? styles.active : ''}
                 `}
ui/components/StepProgressBar.jsx-32-46 (1)

32-46: Connector completion logic is using the wrong step’s state
Right now the connector to step i turns “completed” when step i is completed; usually you want it completed when step i-1 is completed.

@@
         {STEPS.map((step, index) => {
           const state = getStepState(step.id);
+          const prevState = index > 0 ? getStepState(STEPS[index - 1].id) : null;
           const isActive = currentStep === step.id;
@@
               {index > 0 && (
                 <div 
                   className={`${styles.connector} ${
-                    state === 'completed' ? styles.connectorCompleted : ''
+                    prevState === 'completed' ? styles.connectorCompleted : ''
                   }`}
                 />
               )}
ui/components/ServiceConfigGenerator.jsx-101-130 (1)

101-130: Duplicate service names silently overwrite in servicesObj (data loss).
If two services share the same name, you end up with 1 entry in JSON and no warning. You probably want to block duplicates or surface an error.

   const generatedJson = useMemo(() => {
     const servicesObj = {};
     const allLanguages = new Set();
+    const seen = new Set();
+    const duplicateNames = new Set();

     services.forEach(service => {
       if (service.name) {
+        if (seen.has(service.name)) duplicateNames.add(service.name);
+        seen.add(service.name);
         servicesObj[service.name] = {
           path: service.path || '.',
           has_dockerfile: service.has_dockerfile,
           language: service.language || 'unknown',
           framework: service.framework || null,
           ports: service.ports,
           databases: service.databases,
           queues: service.queues,
         };
         if (service.language) {
           allLanguages.add(service.language);
         }
       }
     });

     const primaryLanguage = services[0]?.language || 'unknown';

     return {
       services: servicesObj,
       service_count: Object.keys(servicesObj).length,
       languages: Array.from(allLanguages),
       primary_language: primaryLanguage,
+      ...(duplicateNames.size ? { warnings: { duplicate_service_names: Array.from(duplicateNames) } } : {}),
     };
   }, [services]);
orchestrator/kestra_pipelines/01-ingest-repo.yaml-19-45 (1)

19-45: Add validation to fail fast if the repo URL can't be parsed (don't let null/null reach the GitHub API).

Right now, if the URL doesn't match either pattern, owner and repo stay None, but the workflow keeps going anyway. The fetch_languages task (lines 47–54) just builds a request to https://api.github.com/repos/null/null/languages, which fails with a cryptic 404 instead of a clear error message.

Add a check after the pattern loop to error out early with a helpful message:

           owner, repo = None, None
           for pattern in patterns:
               match = re.search(pattern, url)
               if match:
                   owner, repo = match.groups()
                   break
+
+          if not owner or not repo:
+              raise ValueError(f"Unsupported repo_url format: {url}")
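
The fail-fast check above can be tried in isolation with a small sketch like the one below. The two regular expressions are hypothetical stand-ins, since the actual patterns in 01-ingest-repo.yaml are not shown here, and `parse_repo_url` is an illustrative name.

```python
import re

# Hypothetical patterns; the real ones in the workflow may differ.
PATTERNS = [
    r"github\.com[:/]([^/]+)/([^/]+?)(?:\.git)?/?$",   # HTTPS or SSH GitHub URL
    r"^([A-Za-z0-9_.-]+)/([A-Za-z0-9_.-]+)$",          # "owner/repo" shorthand
]


def parse_repo_url(url):
    """Return (owner, repo), or raise ValueError for unsupported input."""
    for pattern in PATTERNS:
        match = re.search(pattern, url.strip())
        if match:
            return match.groups()
    # Fail here, before a request to .../repos/None/None/languages is ever built.
    raise ValueError(f"Unsupported repo_url format: {url}")


print(parse_repo_url("https://github.com/example-org/example-repo.git"))
try:
    parse_repo_url("not a repository")
except ValueError as err:
    print(err)
```

Raising inside the parsing step means the Kestra task fails with a message naming the bad input, instead of surfacing later as a 404 from the API.
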
ui/components/viewers/GraphViewer.jsx-123-156 (1)

123-156: Node style from input data isn’t applied to ReactFlow nodes (only used for color).
Right now you do data: { style: node.style }, but ReactFlow expects node.style at the top level. If the pipeline outputs node styling, it won’t render.

     const graphNodes = (data.nodes || []).map(node => ({
       id: node.id,
       type: 'infrastructureNode',
       position: node.position || { x: 0, y: 0 },
+      style: node.style || {},
       data: {
         label: node.data?.label || node.id,
         icon: node.data?.icon || 'box',
+        type: node.data?.type,
         style: node.style || {},
       },
     }));
ui/app/dashboard/page.jsx-255-279 (1)

255-279: Node placement will be wrong under zoom/pan—you need ReactFlow coordinate conversion.

The current code uses getBoundingClientRect() to calculate position, but that only gives you viewport/DOM coordinates. When someone zooms or pans the canvas, your nodes end up in the wrong spot because you're not accounting for the canvas transform.

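Use useReactFlow().screenToFlowPosition() instead; it handles all the zoom/pan math for you. Import useReactFlow from @xyflow/react and pass screenToFlowPosition the raw clientX/clientY values (no getBoundingClientRect subtraction needed). One caveat: useReactFlow only works inside a ReactFlowProvider (or in a child of <ReactFlow>), so if DashboardPage renders <ReactFlow> itself, wrap the page in a ReactFlowProvider first.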

 import {
   ReactFlow,
   Controls,
   Background,
   useNodesState,
   useEdgesState,
   addEdge,
   Panel,
   Handle,
   Position,
+  useReactFlow,
 } from "@xyflow/react";

 export default function DashboardPage() {
+  const { screenToFlowPosition } = useReactFlow();
   ...
   const onPaneClick = useCallback((event) => {
     if (!selectedPaletteItem) return;
-
-    const bounds = event.target.getBoundingClientRect();
-    const position = {
-      x: event.clientX - bounds.left - 70,
-      y: event.clientY - bounds.top - 40,
-    };
+    const position = screenToFlowPosition({ x: event.clientX, y: event.clientY });

     const newNode = {
infoundry-mcp-server/src/index.ts-222-253 (1)

222-253: Missing timeout on the Oumi fetch call ⏱️

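Hey, so when you call fetch to the Oumi server (line 226), there's no timeout specified. If the Oumi server is hanging or unresponsive, this could block indefinitely. In Node.js 17.3 or later, you can pass AbortSignal.timeout() as the fetch signal to add a timeout; when it fires, the fetch rejects and the error lands in the surrounding catch block.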

       try {
         const prompt = `Recommend cloud architecture for: ${serviceCount} services, ${cloudProvider}, ${hasDatabase ? 'with database' : 'no database'}, ${hasQueue ? 'with queue' : 'no queue'}`;
         
         const response = await fetch("http://localhost:8000/v1/chat/completions", {
           method: "POST",
           headers: { "Content-Type": "application/json" },
           body: JSON.stringify({
             model: "oumi",
             messages: [{ role: "user", content: prompt }],
           }),
+          signal: AbortSignal.timeout(30000), // 30 second timeout
         });
🟡 Minor comments (18)
requirements.txt-4-6 (1)

4-6: Consider whether the MCP server code is ready for Pydantic v2 compatibility.

You've pinned pydantic>=2.0.0, which has breaking changes from v1. The existing code in oumi/serve.py uses simple BaseModel definitions that are compatible with v2, but the infoundry-mcp-server directory doesn't have Python implementation yet. Once you add code there, make sure to use Pydantic v2 patterns—if you start with v1 syntax (like class Config: or @validator decorators), you'll hit compatibility issues when this dependency gets installed. Stick to v2 patterns like ConfigDict and field_validator from the start to avoid headaches later.

README.md-24-24 (1)

24-24: Update the placeholder repository URL.

Yo, you've got your-org/infoundry as a placeholder here. Should probably be crypticsaiyan/infoundry based on what I see in the namespace config and PR objectives.

Apply this diff:

-git clone https://github.com/your-org/infoundry.git
+git clone https://github.com/crypticsaiyan/infoundry.git
.github/workflows/coderabbit.yml-32-50 (1)

32-50: Add quotes around variable expansions to prevent word splitting.

Hey! Shellcheck is flagging some unquoted variables that could cause issues if file names have spaces or special characters. On lines 35 and 37, ${{ github.base_ref }} should be quoted, and the $CHANGED_FILES variable on line 41 should also be quoted.

This is a pretty common shell scripting gotcha - unquoted variables can split on spaces and expand globs, which might give you unexpected behavior.

Apply these fixes:

       - name: Check for Terraform Changes
         id: tf-changes
         run: |
           # Get list of changed files
           if [ "${{ github.event_name }}" == "pull_request" ]; then
-            CHANGED_FILES=$(git diff --name-only origin/${{ github.base_ref }}...HEAD)
+            CHANGED_FILES=$(git diff --name-only "origin/${{ github.base_ref }}"...HEAD)
           else
             CHANGED_FILES=$(git diff --name-only HEAD~1)
           fi
           
           # Check if any .tf files are changed
-          TF_CHANGED=$(echo "$CHANGED_FILES" | grep -E '\.tf$' || true)
+          TF_CHANGED=$(echo "${CHANGED_FILES}" | grep -E '\.tf$' || true)
           
           if [ -n "$TF_CHANGED" ]; then
.github/PULL_REQUEST_TEMPLATE.md-33-35 (1)

33-35: Add language identifier to the code block.

Your code block on line 33 should specify a language for proper syntax highlighting. Since this is meant for Terraform output, add terraform or at least text as the language identifier.

Apply this diff:

-```
+```terraform
 terraform plan output or summary here

orchestrator/kestra_pipelines/README.md-23-23 (1)

23-23: Fix the typo: "strictly reasoning" → "strict reasoning".

Small typo in your AI capabilities description that should be corrected for clarity.

-This demonstrates the power of Kestra's AI capabilities to not just process data, but to strictly reasoning about it and driving the workflow logic.
+This demonstrates the power of Kestra's AI capabilities to not just process data, but to strict reasoning about it and driving the workflow logic.

Actually, on second thought, the whole phrase would be more natural as "to perform strict reasoning about it and drive the workflow logic" (adding "perform" and changing "driving" to "drive").

infoundry-mcp-server/package.json-21-25 (1)

21-25: Update TypeScript and @types/node to their latest 5.x versions.

Your dev dependencies are running some older versions. TypeScript 5.3.0 is now six minor versions behind the latest 5.x release (5.9), and @types/node 20.10.0 is nine minor versions behind (20.19.26). These packages get updates pretty frequently with bug fixes and improvements, so bumping them up is a good idea.

ui/components/PipelineForm.jsx-64-66 (1)

64-66: Validation restricts to GitHub.com only.

Your repo URL validation (line 64) requires github.com in the URL, which blocks users from using GitLab, Bitbucket, or self-hosted Git instances. If InFoundry is meant to work with any Git repo, this validation is too strict.

Either:

  1. Remove the GitHub-specific check and just validate URL format, or
  2. Document that only GitHub is supported in v1.0.0

If GitHub-only is intentional for now, update the label/placeholder to make it clear:

           <label htmlFor="repo_url" className={styles.label}>
-            Repository URL <span className={styles.required}>*</span>
+            GitHub Repository URL <span className={styles.required}>*</span>
           </label>

Committable suggestion skipped: line range outside the PR's diff.

ui/components/Button.jsx-20-24 (1)

20-24: Add explicit type="button" default to prevent accidental form submissions.

Without an explicit type attribute, buttons default to type="submit" in HTML. This can cause surprising behavior if your Button is used inside a form but isn't meant to submit it. This is a classic gotcha!

Apply this diff:

 export default function Button({
   children,
   variant = "primary",
   size = "md",
   icon,
   iconPosition = "right",
   className = "",
+  type = "button",
   ...props
 }) {
   const classNames = [
     styles.button,
     styles[variant],
     styles[size],
     className,
   ].filter(Boolean).join(" ");

   return (
-    <button className={classNames} {...props}>
+    <button type={type} className={classNames} {...props}>
       {icon && iconPosition === "left" && <span className={styles.icon}>{icon}</span>}
       {children}
       {icon && iconPosition === "right" && <span className={styles.icon}>{icon}</span>}
     </button>
   );
 }
orchestrator/kestra_pipelines/00-end-to-end.yaml-155-176 (1)

155-176: Log summary step hardcodes success checkmarks for all steps.

Your final log summary (lines 165-173) prints checkmarks for all 9 steps, even though some might be skipped (like when skip_pr or skip_validation are true) or failed (with allowFailure: true). This could be confusing—the summary says everything passed, but actually some steps didn't run or failed.

Consider making the print script conditional or dynamic based on actual execution results. At minimum, add a disclaimer:

       print("Steps completed:")
-      print("  ✅ 01. Repository ingested")
-      print("  ✅ 02. Telemetry collected")
+      print("  ✅ 01. Repository ingested")
+      print("  ✅ 02. Telemetry collected (may have failed)")
       ...
+      print("")
+      print("Note: Some steps may have been skipped or failed based on configuration.")

Committable suggestion skipped: line range outside the PR's diff.
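
One way to make the summary dynamic is to build each line from the step's actual status. This is only a sketch: the status strings ("SUCCESS", "FAILED", "SKIPPED"), the icons, and the `format_summary` helper are all assumptions, and the flow would still need to pass in the real task states.

```python
# Hypothetical status values; in the real flow these would come from the
# states or outputs of the earlier tasks.
ICONS = {"SUCCESS": "✅", "FAILED": "❌", "SKIPPED": "⏭️"}


def format_summary(steps: list[tuple[str, str]]) -> str:
    """Build one summary line per step, with an icon matching its status."""
    lines = ["Steps completed:"]
    for index, (label, status) in enumerate(steps, start=1):
        icon = ICONS.get(status, "❓")  # unknown statuses get a neutral marker
        lines.append(f"  {icon} {index:02d}. {label} ({status.lower()})")
    return "\n".join(lines)


print(format_summary([
    ("Repository ingested", "SUCCESS"),
    ("Telemetry collected", "FAILED"),
    ("PR validated", "SKIPPED"),
]))
```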

orchestrator/kestra_pipelines/02-ingest-telemetry.yaml-6-11 (1)

6-11: Input -> env wiring is fine, but guard against empty strings.

If inputs.services is "" (or ends with a comma), you’ll generate a "" service entry.

 env:
-  SERVICES: "{{ inputs.services ?? 'backend,frontend' }}"
+  SERVICES: "{{ (inputs.services ?? 'backend,frontend') }}"

And in the script:

- services = [s.strip() for s in services_input.split(",")]
+ services = [s.strip() for s in services_input.split(",") if s.strip()]
+ if not services:
+     services = ["backend", "frontend"]

Also applies to: 66-68
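
The filtering logic is easy to check in isolation. Here is a minimal sketch, assuming a hypothetical `parse_services` helper and the same `backend,frontend` fallback used in the env default:

```python
from typing import Optional

DEFAULT_SERVICES = ["backend", "frontend"]


def parse_services(services_input: Optional[str]) -> list[str]:
    """Split a comma-separated list, dropping blank entries."""
    services = [s.strip() for s in (services_input or "").split(",") if s.strip()]
    # Fall back to the defaults when nothing usable was provided.
    return services or list(DEFAULT_SERVICES)
```

With this, an empty string, a lone comma, or a trailing comma can no longer produce a `""` service entry.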

orchestrator/kestra_pipelines/08-validate-pr.yaml-65-77 (1)

65-77: Update the Accept header to the current format and add the X-GitHub-Api-Version header.

The current code uses the legacy application/vnd.github.v3+json Accept header, but GitHub now recommends application/vnd.github+json. Also add the X-GitHub-Api-Version: 2022-11-28 header to explicitly pin the API version rather than relying on defaults. The token-based Authorization you're using is fine for this use case; you don't necessarily need Bearer auth.

Also applies to: 85-90
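
As a rough illustration, the header set described above could be assembled like this. The `github_headers` helper and the placeholder URL are assumptions, and the request is only constructed, never sent.

```python
import urllib.request


def github_headers(token: str) -> dict[str, str]:
    """Headers for a GitHub REST call with a pinned API version."""
    return {
        "Accept": "application/vnd.github+json",
        "X-GitHub-Api-Version": "2022-11-28",
        # Token-style auth, matching what the pipeline already uses.
        "Authorization": f"token {token}",
    }


# Placeholder URL for illustration; building the Request does not send it.
request = urllib.request.Request(
    "https://api.github.com/repos/OWNER/REPO/pulls",
    headers=github_headers("example-token"),
)
```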

oumi/train_sft.py-79-79 (1)

79-79: Remove the unnecessary f-string prefix.

Hey! Your linter's yelling at you here. This line:

print(f"  Training Type: SFT with LoRA")

...has an f prefix but no {variables} inside it. It's just a regular string pretending to be fancy. 😅

-    print(f"  Training Type: SFT with LoRA")
+    print("  Training Type: SFT with LoRA")

Tiny thing, but clean code = happy code!

ui/README.md-38-43 (1)

38-43: Markdownlint: list indentation under “Variables”
Looks like the bullets are indented 1 space too far; easy cleanup to satisfy MD007.

docs/architecture.md-13-31 (1)

13-31: Markdownlint cleanups: fence languages + blank lines around tables
Super minor, but adding code-fence languages (probably text) and blank lines around tables will keep the doc tooling quiet.

Also applies to: 46-53, 65-80, 84-95, 170-200

ui/app/pipeline/page.jsx-12-12 (1)

12-12: Tiny nit: comment got glued to POLL_INTERVAL
Not a functional bug, but this will look weird / fail lint in some setups.

orchestrator/kestra_pipelines/05-generate-iac.yaml-15-25 (1)

15-25: Same "defaults" typo as the other pipeline 📝

Hey, same issue here as in 03-propose-architecture.yaml—Kestra uses default, not defaults for input default values. Lines 18 and 24 have this typo.

   - id: cloud_provider
     type: STRING
     required: true
-    defaults: "aws"
+    default: "aws"
     description: Target cloud provider (aws, gcp, azure)

   - id: project_name
     type: STRING
     required: true
-    defaults: "infoundry"
+    default: "infoundry"
     description: Project name for resource naming
oumi/serve.py-140-142 (1)

140-142: Overly broad string matching for service count detection

Hey, this pattern matching is going to give you some false positives! Checking "5" in prompt or "6" in prompt will match things like:

  • "Budget: $500/month" → triggers kubernetes
  • "Running on EC2 m5.xlarge" → triggers kubernetes
  • "API v6 endpoint" → triggers kubernetes

You probably want to be more precise about detecting service counts:

-    elif "kubernetes" in prompt_lower or "5" in prompt or "6" in prompt:
+    elif "kubernetes" in prompt_lower or "5 services" in prompt_lower or "6 services" in prompt_lower:
         pattern = "kubernetes"
         components = ["eks_cluster", "alb", "rds", "elasticache"]

Or better yet, use regex to match patterns like "5 services", "6 microservices", etc.
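
A regex-based version might look something like this. It's a sketch: the accepted wordings ("services", "microservices") and the helper names are assumptions, and the 5-or-6 rule simply mirrors the existing condition.

```python
import re
from typing import Optional

# Matches phrases such as "5 services" or "6 microservices".
SERVICE_COUNT_RE = re.compile(r"\b(\d+)\s+(?:micro)?services?\b", re.IGNORECASE)


def extract_service_count(prompt: str) -> Optional[int]:
    """Return the number stated before "services", or None if absent."""
    match = SERVICE_COUNT_RE.search(prompt)
    return int(match.group(1)) if match else None


def wants_kubernetes(prompt: str) -> bool:
    """Mirror the existing rule: explicit keyword, or 5 to 6 services."""
    count = extract_service_count(prompt)
    return "kubernetes" in prompt.lower() or count in (5, 6)
```

Because the number has to be followed by "services", strings like "Budget: $500/month" or "API v6 endpoint" no longer trigger the kubernetes branch.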

oumi/serve.py-271-272 (1)

271-272: Don't use bare except: — it catches too much!

Okay so here's the thing about bare except: — it catches everything, including KeyboardInterrupt (Ctrl+C), SystemExit, and other things you probably don't want to silently swallow. It's considered a Python anti-pattern.

Also, silently passing on exceptions (except: pass) makes debugging a nightmare. At minimum, you should log something.

-    except:
-        pass
+    except Exception:
+        pass  # Ollama not available, which is fine

Or even better, log it at debug level so you can troubleshoot if needed.
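
For the debug-logging variant, something along these lines would work. The `is_available` wrapper and the probe callable are hypothetical stand-ins for the actual Ollama check in serve.py.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def is_available(probe: Callable[[], object]) -> bool:
    """Run an availability probe; return False and log the reason if it raises."""
    try:
        probe()
        return True
    except Exception as exc:  # not a bare except: Ctrl+C still propagates
        logger.debug("Ollama not available: %s", exc)
        return False
```

Since KeyboardInterrupt and SystemExit don't inherit from Exception, they pass straight through instead of being swallowed.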

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
orchestrator/kestra_pipelines/06-validate-iac.yaml (1)

48-54: Initialize deploy_output in result dict for consistency.

The script conditionally adds deploy_output (lines 77, 97), but never initializes it in the result dict (lines 48–54). This means downstream consumers that expect this key might fail.

Consider initializing it upfront:

           result = {
               "environment": environment,
               "started_at": datetime.utcnow().isoformat(),
               "deploy_status": "pending",
               "smoke_tests": {"passed": 0, "failed": 0, "tests": []},
+              "deploy_output": "",
               "success": False
           }

Also applies to: 77-77, 97-97
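
A small sketch of the upfront initialization, wrapped in a hypothetical `new_result` helper. Note one swap that is not part of the suggestion above: it uses `datetime.now(timezone.utc)` in place of `datetime.utcnow()`, which is deprecated as of Python 3.12.

```python
from datetime import datetime, timezone


def new_result(environment: str) -> dict:
    """Create the result dict with every key present from the start."""
    return {
        "environment": environment,
        "started_at": datetime.now(timezone.utc).isoformat(),
        "deploy_status": "pending",
        "smoke_tests": {"passed": 0, "failed": 0, "tests": []},
        "deploy_output": "",  # always present, filled in later if a deploy runs
        "success": False,
    }
```

Downstream consumers can then read `result["deploy_output"]` without checking whether the key exists.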

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c2c7693 and cbed885.

📒 Files selected for processing (1)
  • orchestrator/kestra_pipelines/06-validate-iac.yaml (1 hunks)
🔇 Additional comments (1)
orchestrator/kestra_pipelines/06-validate-iac.yaml (1)

35-40: Good catch on the sys import fix!

The previous review flagged a missing sys import that would cause a NameError when calling sys.exit(1). I can see this has been addressed—line 36 now includes import sys, so the pipeline will cleanly exit on validation failure as intended. 👍

@github-actions


🐰 CodeRabbit Review Reminder

This PR will be automatically reviewed by CodeRabbit. Please ensure:

  • All CodeRabbit suggestions have been addressed
  • Critical issues are resolved before requesting human review
  • IaC changes follow security best practices

Tip: Reply to CodeRabbit comments with @coderabbitai to get clarifications or request re-reviews.


Maintainer approval is required after addressing CodeRabbit feedback.

@crypticsaiyan crypticsaiyan merged commit f0de2ef into main Dec 14, 2025
5 of 6 checks passed
@coderabbitai

coderabbitai Bot commented Dec 14, 2025


Hey there! 👋

I see the automated reminder is working as expected! This message was triggered by the workflow you've set up in .github/workflows/coderabbit.yml to remind everyone about the review process.

Just a heads up - I'm replying to this latest comment without context from earlier in the conversation. But here's the deal:

This is your workflow's way of saying "Hey, CodeRabbit will be reviewing this PR soon!" Since this is a pretty hefty release (v1.0.0 with tons of new features, MCP server, Kestra pipelines, UI dashboard, and more), I'll be doing a thorough review once triggered.

What happens next:

  1. When the review runs, I'll go through all the changes
  2. I'll flag any issues (critical, suggestions, or nitpicks)
  3. You can chat with me using @coderabbitai in the comments
  4. The IaC validation gate will run to check your Terraform code

The checklist in that reminder is basically your to-do list after my review is complete. Think of it like a video game checkpoint - gotta clear all the CodeRabbit feedback before moving to the final boss (maintainer approval) 🎮

If you want me to start reviewing now, just reply with @coderabbitai review!

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.
