Add Cloud Run Vertex AI backend proxy#20

Merged
Bhagat-Atul merged 1 commit into benevolentbandwidth:main from eman-cickusic:add-gcp-vertex-proxy
Jun 9, 2026

Conversation

@eman-cickusic

Contributor

Summary

Adds a small production-ready FastAPI backend under backend/ so the Android app can move off user-supplied AI API keys to a GCP-hosted proxy.

  • Accepts OCR-extracted ingredient text (POST /analyze), plus optional product name / barcode / locale.
  • Calls Gemini through Vertex AI using the Cloud Run service-account identity (Application Default Credentials) — no user API keys, no service-account JSON files committed.
  • Returns NOVA classification, ingredient analysis, ultra-processed markers, and likely allergens. The response shape is aligned to the app's existing Kotlin contract (NovaClassification + IngredientListAnalysis + AllergenDetection) for drop-in future integration.
  • Includes a Cloud Run-compatible Dockerfile and full deploy instructions in backend/README.md.

Endpoints

  • GET /healthz → {status:ok}
  • POST /analyze → combined nova / ingredients / allergens JSON.
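As a rough illustration of how a client might address POST /analyze, here is a minimal sketch that builds the request without sending it. The JSON field names (`ingredients_text`, `product_name`, `barcode`, `locale`) and the base URL are assumptions for illustration only; the real schema lives in the backend's Pydantic models, which this page does not show.

```python
import json
import urllib.request

MAX_INGREDIENT_CHARS = 20_000  # mirrors the 20k-char input cap described in this PR


def build_analyze_request(base_url, ingredients_text, product_name=None,
                          barcode=None, locale=None):
    """Build (but do not send) a POST /analyze request.

    All JSON field names here are hypothetical, not the service's actual schema.
    """
    # Client-side mirror of the server cap; the server itself answers 422.
    if len(ingredients_text) > MAX_INGREDIENT_CHARS:
        raise ValueError("ingredients_text exceeds the 20k-char cap")

    payload = {"ingredients_text": ingredients_text}
    optional = {"product_name": product_name, "barcode": barcode, "locale": locale}
    # Only send optional fields the caller actually supplied.
    payload.update({k: v for k, v in optional.items() if v is not None})

    return urllib.request.Request(
        url=base_url.rstrip("/") + "/analyze",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Building the request separately keeps the sketch testable without a deployed service.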

Robustness

  • Pydantic validation, 20k-char input cap, CORS (open for dev, README flags prod restriction), graceful 502 on model/timeout failure, JSON-fence fallback parsing.
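The JSON-fence fallback mentioned above could look roughly like the following. This is a sketch of the general technique, not the code in this PR: `parse_model_json` is a hypothetical name, and the assumption is that the model sometimes wraps its JSON answer in a Markdown code fence.

```python
import json
import re

FENCE = "`" * 3  # a Markdown code fence, built this way to avoid nesting fences here
_FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_model_json(raw):
    """Parse model output as JSON, falling back to the first fenced block."""
    # First attempt: the whole reply is already valid JSON.
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass

    # Fallback: look for a fenced block and parse only its contents.
    match = _FENCED_BLOCK.search(raw)
    if match is None:
        raise ValueError("model output contained no parseable JSON")
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError as exc:
        raise ValueError("fenced block was not valid JSON") from exc
```

A caller could catch the `ValueError` and translate it into the 502 response described above, so malformed model output never reaches the app as a 500.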

Testing

  • pytest (8 tests, Gemini call mocked — no live creds needed): all pass.
  • Local uvicorn smoke: /healthz ok; /analyze returns 502 without ADC (expected); overlong input → 422.
  • Docker build not run here (no daemon available); Dockerfile is standard slim-Python + uvicorn on $PORT.

Deploy (see backend/README.md)

gcloud run deploy ultraprocessed-ai-proxy \
  --source backend \
  --project b2-ultra-processed \
  --region us-east1 \
  --service-account up-app-service@b2-ultra-processed.iam.gserviceaccount.com \
  --allow-unauthenticated \
  --set-env-vars GCP_PROJECT_ID=b2-ultra-processed,GCP_LOCATION=us-east1,GEMINI_MODEL=gemini-2.5-flash

--allow-unauthenticated is for initial testing only; protect before production (App Check / Firebase Auth / API Gateway).
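For illustration, the three variables passed via --set-env-vars could be read at startup along these lines. `ProxyConfig` and `load_config` are hypothetical names, and the default values are an assumption that simply mirrors the deploy command above; the backend's actual configuration code may differ.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ProxyConfig:
    project_id: str
    location: str
    model: str


def load_config(env=None):
    """Read service settings from the environment (or a dict, for tests)."""
    env = os.environ if env is None else env

    project_id = env.get("GCP_PROJECT_ID")
    if not project_id:
        # Fail fast at startup instead of on the first request.
        raise RuntimeError("GCP_PROJECT_ID is not set")

    return ProxyConfig(
        project_id=project_id,
        # Assumed defaults, copied from the deploy command in this PR.
        location=env.get("GCP_LOCATION", "us-east1"),
        model=env.get("GEMINI_MODEL", "gemini-2.5-flash"),
    )
```

Keeping the location and model in environment variables means either can be changed with a redeploy rather than a code change.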

Notes

  • No Android/Kotlin code changed; the classifier architecture is untouched. Wiring the app to the Cloud Run URL is intended follow-up once the service is deployed.
  • Vertex regional availability for gemini-2.5-flash varies; README documents a global/us-central1 fallback if us-east1 rejects the model.

🤖 Generated with Claude Code

FastAPI service under backend/ that accepts OCR-extracted ingredient text and
calls Gemini via Vertex AI using the Cloud Run service-account identity (ADC).
No user API keys, no service-account JSON. Returns NOVA classification,
ingredient analysis, ultra-processed markers, and likely allergens in a shape
aligned to the app's existing Kotlin contract. Includes Dockerfile, tests with
a mocked Gemini call, and Cloud Run deploy instructions.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@Bhagat-Atul Bhagat-Atul left a comment

Collaborator

I have reviewed the changes. I have a couple of concerns about duplicate contracts/prompts, which I would like to discuss in the upcoming call.

@Bhagat-Atul Bhagat-Atul merged commit 88bf0d5 into benevolentbandwidth:main Jun 9, 2026
1 of 2 checks passed