Trilagen · Secure AI Agent Engineering

Production Cutover Plan

Security & AI Agent Platform

Dev / Staging → Production · AWS 506997029654 · us-east-1 · v1.0 · 24 Jun 2026

§ Document Control

| Field | Value |
| --- | --- |
| Version / Status | 1.0 · For approval |
| Date | 24 June 2026 |
| Cutover lead | Satish Kandagadla, Trilagen Engineering |
| Target environment | trilagen-prod (506997029654), us-east-1 |
| Promotion path | dev / staging → prod |

01 Purpose & Scope

This is the controlled runbook for promoting the Trilagen Security and AI Agent platform from dev / staging into production. It defines what is cut over, who owns each step, the exact sequence, how success is validated, and how the change is rolled back if a gate fails.

All work targets one production landing zone: AWS account 506997029654 in us-east-1, deployed exclusively through the trilagen-prod CLI profile. The default profile points at a different account and must never be used.
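
A guard along these lines could wrap every command in the window so that an unset or wrong profile stops the run early. It is a minimal sketch: the function name is hypothetical, and it checks only the AWS_PROFILE environment variable, so a --profile flag on an individual command would bypass it.

```shell
# Expected production profile, taken from the plan above.
EXPECTED_PROFILE="trilagen-prod"

# require_prod_profile (hypothetical helper): succeed only when AWS_PROFILE
# names the production profile. An unset or empty AWS_PROFILE would fall back
# to the default profile, so it is rejected as well.
require_prod_profile() {
  if [ "${AWS_PROFILE:-}" != "$EXPECTED_PROFILE" ]; then
    echo "refusing to run: AWS_PROFILE='${AWS_PROFILE:-}' (expected $EXPECTED_PROFILE)" >&2
    return 1
  fi
}
```

Typical use is as a prefix, for example: require_prod_profile && aws sts get-caller-identity.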

1.1 — In scope · Security project

| System | Subdomain | Function | Status |
| --- | --- | --- | --- |
| SailPoint ISC | isc.trilagen.bot | Identity governance co-pilot (Identity Security Cloud) | LIVE |
| Secure Vault | vault.trilagen.bot | Secrets / secure document handling | LIVE |
| AWS Agent | aws.trilagen.bot | Cross-account AWS ops via STS AssumeRole | LIVE |

1.2 — In scope · AI Agent portfolio

| System | Subdomain | Function | Status |
| --- | --- | --- | --- |
| Agent Hub | hub.trilagen.bot | Portal listing & launching agents (JWT hand-off) | LIVE |
| Diagram / Transcript | diagram.trilagen.bot | Transcript → Lucidchart process diagrams | LIVE |
| SOW Builder | sow.trilagen.bot | Statement-of-Work drafting & review | LIVE |
| Salesforce | salesforce.trilagen.bot | Salesforce data co-pilot | BETA |
| Candidate Evaluation | candidate.trilagen.bot | Resume / candidate scoring | BETA |
| PMO | pmo.trilagen.bot | Project / portfolio assistant | BETA |
| OpenAir | openair.trilagen.bot | NetSuite OpenAir PSA assistant | DEV |
| Legal Review | legal.trilagen.bot | Contract / legal document review | DEV |

1.3 — Out of scope

  • Net-new feature development — only promotion of already-tested builds.
  • Changes to the shared Cognito user pool schema or identity provider config.
  • Client-tenant data migration — agents are stateless apart from TTL-expiring conversation history.
  • The non-prod account (878112346062), except as an AssumeRole trust target for the AWS Agent.

02 Target Architecture

Every agent follows the same production reference pattern, which keeps the cutover repeatable and rollback predictable.

2.1 — Shared production platform

| Component | Production value |
| --- | --- |
| AWS account | 506997029654 |
| CLI profile | trilagen-prod (export AWS_PROFILE=trilagen-prod) |
| Region | us-east-1 |
| Cognito user pool | us-east-1_cYeALmJr6 (trilagen-hub-prod) |
| Cognito app client | 3bp1ppq928aom54kveof08hn7o |
| Allowed email domains | trilagen.com, trilagen.ai |
| Route53 hosted zone | Z02186353SN03930GPGID (trilagen.bot) |
| CloudFront alias zone | Z2FDTNDATAQYW2 (fixed AWS value) |
| SAM artifact bucket | trilagen-sam-deployments-prod |

2.2 — Per-agent stack pattern

  • Backend (SAM): Lambda (Node.js 18, 300 s timeout) running the agentic loop, API Gateway /chat, DynamoDB conversation table with TTL, IAM role. Stack trilagen-<agent>-<env>.
  • Frontend (CloudFormation): private S3 (OAC-only) + CloudFront + per-subdomain ACM cert (DNS-validated) + Route53 A-alias to <agent>.trilagen.bot. Stack trilagen-<agent>-frontend-<env>.
  • Secrets: API keys / OAuth in Secrets Manager (trilagen-<agent>/<env>), read by the Lambda at runtime — never plaintext env vars or CFN params.

Environment is a stack parameter (dev | staging | prod), so the same templates produce isolated stacks. The cutover is a controlled promotion of the same artifacts, not a re-architecture.
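
The naming rules in this pattern can be captured as small helpers so that stack, secret, and host names are derived rather than typed by hand. This is a sketch only; the function names are hypothetical, and the hostname helper covers just the production pattern because the plan does not state how non-prod hostnames are formed.

```shell
# Naming helpers for the per-agent pattern. <agent> is the system slug
# (for example "isc") and <env> is one of dev | staging | prod.
backend_stack()  { printf 'trilagen-%s-%s\n' "$1" "$2"; }
frontend_stack() { printf 'trilagen-%s-frontend-%s\n' "$1" "$2"; }
secret_id()      { printf 'trilagen-%s/%s\n' "$1" "$2"; }

# Production hostname only; no environment argument on purpose.
agent_host()     { printf '%s.trilagen.bot\n' "$1"; }

# valid_env: reject anything outside the three documented environments.
valid_env() {
  case "$1" in
    dev|staging|prod) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, backend_stack isc prod prints trilagen-isc-prod and secret_id isc prod prints trilagen-isc/prod.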

03 Cutover Strategy

Executed agent-by-agent in dependency-ordered waves, with a go/no-go gate after each wave.

3.1 — Sequencing

| Wave | Scope | Rationale |
| --- | --- | --- |
| 0 · Platform | Cognito, zone, SAM bucket | Foundation; verify-only, no change. |
| 1 · Hub | hub.trilagen.bot | Entry point + JWT hand-off must be live first. |
| 2 · Security | ISC, Vault, AWS Agent | Highest-sensitivity systems cut first. |
| 3 · AI (Live) | Diagram, SOW | Production-grade, already validated in staging. |
| 4 · AI (Beta/Dev) | Salesforce, Candidate, PMO, OpenAir, Legal | Lower-risk; promoted last, may stay feature-flagged. |

3.2 — Approach

  • Blue/green at the edge: validate a new CloudFront distribution before repointing the Route53 alias, so rollback is a near-instant alias revert.
  • Backend stacks updated in place via reviewed SAM change sets.
  • Secrets seeded in Secrets Manager before the backend deploy.
  • Explicit go/no-go gate before each subsequent wave.
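
The alias repoint at the heart of the blue/green step could look roughly like the following. It is a sketch under assumptions: the function names are hypothetical, and the distribution domain used in the usage note is a placeholder. The same call with the prior distribution's domain would serve as the rollback.

```shell
# alias_change_batch <record-name> <cloudfront-domain>
# Prints a Route53 change batch that UPSERTs an A-alias record pointing at a
# CloudFront distribution. Z2FDTNDATAQYW2 is the fixed CloudFront alias zone.
alias_change_batch() {
  cat <<EOF
{
  "Comment": "Repoint $1 to $2",
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "$1",
        "Type": "A",
        "AliasTarget": {
          "HostedZoneId": "Z2FDTNDATAQYW2",
          "DNSName": "$2",
          "EvaluateTargetHealth": false
        }
      }
    }
  ]
}
EOF
}

# repoint_alias <record-name> <cloudfront-domain>
# Applies the change batch to the trilagen.bot hosted zone. Not called here.
repoint_alias() {
  aws route53 change-resource-record-sets \
    --hosted-zone-id Z02186353SN03930GPGID \
    --change-batch "$(alias_change_batch "$1" "$2")"
}
```

Usage would be along the lines of repoint_alias isc.trilagen.bot d111111abcdef8.cloudfront.net. The DNS change is only one half of the switch: CloudFront answers for a hostname only when that hostname is configured as an alternate domain name on the target distribution, so that association needs to be in place as well.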

3.3 — Cutover window

| Item | Detail |
| --- | --- |
| Proposed window | Saturday 04:00–08:00 ET (low-traffic) |
| Expected duration | ~3 h active + 1 h validation buffer |
| First-time frontend stacks | 10–15 min each (ACM validation + CloudFront rollout) |
| Freeze | No merges to release branch from T-24h until sign-off |

04 Roles & Responsibilities

| Role | Owner | Responsibility |
| --- | --- | --- |
| Cutover Lead | Satish Kandagadla | Owns runbook, calls go/no-go, final sign-off. |
| Platform Engineer | Engineering on-call | Executes deploys, manages Secrets Manager. |
| Security Reviewer | Security workstream | Validates ISC, Vault, AWS Agent IAM & least-privilege. |
| Validation / QA | Engineering | Runs smoke tests, confirms auth + chat round-trip. |
| Comms | Cutover Lead | Stakeholder updates at start, per-wave, completion. |

05 Pre-Cutover Readiness Checklist

All items green before the window opens (T-24h review).

| # | Prerequisite | Owner |
| --- | --- | --- |
| 1 | AWS CLI, SAM CLI, Node.js 18+ installed & verified | Platform Eng |
| 2 | AWS_PROFILE=trilagen-prod authenticates to 506997029654 | Platform Eng |
| 3 | All agent builds pass in staging; release branch tagged & frozen | Engineering |
| 4 | Prod secrets gathered: Anthropic key, ISC secret, Lucid OAuth, SF/OpenAir creds | Security |
| 5 | Cognito pool confirmed live with correct allowed domains | Platform Eng |
| 6 | Route53 zone reachable; no conflicting records for target subdomains | Platform Eng |
| 7 | SAM artifact bucket exists and is writable | Platform Eng |
| 8 | Rollback plan reviewed; previous stack template versions identified | Cutover Lead |
| 9 | Stakeholders notified; freeze in effect | Comms |
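
Checklist item 1 lends itself to a scripted check. The sketch below covers only the tooling prerequisite; the helper names are hypothetical, and the remaining items still need their own verification.

```shell
# node_major: extract the major version from a string such as "v18.19.1".
node_major() {
  printf '%s\n' "$1" | sed -e 's/^v//' -e 's/\..*$//'
}

# node_ok: succeed when the given version string is Node.js 18 or newer.
node_ok() {
  [ "$(node_major "$1")" -ge 18 ]
}

# have_tools: succeed only when every named command is on PATH.
have_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || { echo "missing: $tool" >&2; return 1; }
  done
}

# check_tooling: checklist item 1 in one call. Defined here, not run.
check_tooling() {
  have_tools aws sam node || return 1
  node_ok "$(node --version)" || { echo "Node.js 18+ required" >&2; return 1; }
}
```

Running check_tooling at the T-24h review gives a single pass/fail exit status for the workstation that will execute the cutover.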

06 Cutover Runbook

Run all commands with export AWS_PROFILE=trilagen-prod and export AWS_REGION=us-east-1. Replace <agent> with each system's slug.

6.1 — Wave 0 · Platform verification (T-0:00)

  1. Confirm identity: aws sts get-caller-identity → Account = 506997029654.
  2. Confirm Cognito: aws cognito-idp describe-user-pool --user-pool-id us-east-1_cYeALmJr6.
  3. Confirm bucket: aws s3 ls s3://trilagen-sam-deployments-prod.
  4. Gate G0: platform confirmed → proceed.
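
The Wave 0 steps could be chained into one gate function, as sketched below. The function names are hypothetical, and the hosted-zone lookup is an addition that is not one of the numbered steps; it is included because Wave 0 lists the zone in its scope. Every call is read-only.

```shell
PROD_ACCOUNT="506997029654"

# account_matches <actual>: compare an account ID against the production one.
account_matches() {
  [ "$1" = "$PROD_ACCOUNT" ]
}

# gate_g0: run the Wave 0 checks in order and stop at the first failure.
gate_g0() {
  actual="$(aws sts get-caller-identity --query Account --output text)" || return 1
  account_matches "$actual" || { echo "wrong account: $actual" >&2; return 1; }
  aws cognito-idp describe-user-pool --user-pool-id us-east-1_cYeALmJr6 >/dev/null || return 1
  aws s3 ls s3://trilagen-sam-deployments-prod >/dev/null || return 1
  # Assumed extra check: the trilagen.bot hosted zone is reachable.
  aws route53 get-hosted-zone --id Z02186353SN03930GPGID >/dev/null || return 1
  echo "G0 passed"
}
```

A non-zero exit from gate_g0 would mean the gate is not cleared and the window should not proceed to Wave 1.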

6.2 — Wave 1 · Agent Hub (T+0:15)

  1. Deploy hub frontend stack (S3 + CloudFront + ACM + Route53).
  2. Upload hub assets; aws cloudfront create-invalidation --distribution-id <id> --paths "/*".
  3. Verify https://hub.trilagen.bot loads and Cognito login renders.
  4. Gate G1: hub live + login works → proceed.

6.3 — Waves 2–4 · Per-agent deploy (repeat per system)

  1. Install Lambda deps: cd lambda && npm install && cd ..
  2. Seed secrets: create/update trilagen-<agent>/prod with API keys & OAuth.
  3. Build backend: cd infrastructure && sam build
  4. Deploy backend with Environment=prod, Cognito pool/client, domains, SecretId — review change set before executing.
  5. Capture ApiUrl from stack outputs.
  6. Deploy frontend stack trilagen-<agent>-frontend-prod (first run 10–15 min).
  7. Wire API URL into frontend, upload to S3, invalidate CloudFront.
  8. Smoke test (§7) before moving to the next agent.

Per-wave gate (G2 / G3 / G4): all agents in the wave pass smoke tests → proceed.
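
Steps 2 to 5 of the per-agent sequence might be scripted roughly as follows. Treat it as a sketch under assumptions: the function names are hypothetical, the parameter names (CognitoUserPoolId, CognitoClientId, SecretId) are guesses that must be matched to the actual SAM template, the allowed-domains parameter is left out because its format depends on the template, and CAPABILITY_IAM may need to be CAPABILITY_NAMED_IAM if the template names its roles.

```shell
# seed_secret <agent> <json-file>: create the secret, or add a new version
# when it already exists. The file holds the API keys / OAuth values.
seed_secret() {
  if aws secretsmanager describe-secret --secret-id "trilagen-$1/prod" >/dev/null 2>&1; then
    aws secretsmanager put-secret-value --secret-id "trilagen-$1/prod" --secret-string "file://$2"
  else
    aws secretsmanager create-secret --name "trilagen-$1/prod" --secret-string "file://$2"
  fi
}

# param_overrides <agent>: space-separated Key=Value pairs for the deploy.
# Parameter names are assumptions, not confirmed by the plan.
param_overrides() {
  printf '%s ' \
    "Environment=prod" \
    "CognitoUserPoolId=us-east-1_cYeALmJr6" \
    "CognitoClientId=3bp1ppq928aom54kveof08hn7o" \
    "SecretId=trilagen-$1/prod"
}

# deploy_backend <agent>: build, then deploy with an interactive change set
# review (--confirm-changeset prompts before executing).
deploy_backend() {
  ( cd infrastructure &&
    sam build &&
    sam deploy \
      --stack-name "trilagen-$1-prod" \
      --s3-bucket trilagen-sam-deployments-prod \
      --capabilities CAPABILITY_IAM \
      --confirm-changeset \
      --parameter-overrides $(param_overrides "$1") )
}

# api_url <agent>: read the ApiUrl output from the deployed backend stack.
api_url() {
  aws cloudformation describe-stacks \
    --stack-name "trilagen-$1-prod" \
    --query "Stacks[0].Outputs[?OutputKey=='ApiUrl'].OutputValue" \
    --output text
}
```

For one agent the order would be seed_secret isc ./isc-prod.json, then deploy_backend isc, then api_url isc, with the JSON file path being a placeholder.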

6.4 — Order within waves

| Wave | Agents (in order) |
| --- | --- |
| 2 · Security | 1. SailPoint ISC · 2. Secure Vault · 3. AWS Agent (deploy cross-account roles first) |
| 3 · AI Live | 1. Diagram / Transcript · 2. SOW Builder |
| 4 · AI Beta/Dev | 1. Salesforce · 2. Candidate · 3. PMO · 4. OpenAir · 5. Legal Review |

AWS Agent: deploy the cross-account IAM role stack in both the main (506997029654) and sub (878112346062) accounts before the backend, because the Lambda assumes a role in the target account via STS.
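
The trust relationship behind the AWS Agent's AssumeRole call can be illustrated with a generated trust policy and a quick assume check. This is a sketch: the function names and the role ARNs in the usage note are hypothetical, and the real policy lives in the cross-account role stack.

```shell
# trust_policy <trusted-role-arn>: trust policy for the role in the target
# account, allowing exactly one principal to assume it.
trust_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "$1" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# check_assume <role-arn>: try to assume the role and print the resulting
# assumed-role ARN, which shows the account the session resolves to.
check_assume() {
  aws sts assume-role \
    --role-arn "$1" \
    --role-session-name cutover-trust-check \
    --query 'AssumedRoleUser.Arn' --output text
}
```

As an example with made-up role names, trust_policy arn:aws:iam::506997029654:role/example-agent-lambda-role prints the policy that a role in 878112346062 would carry. Note that check_assume exercises the trust for whichever identity runs it, so it confirms the Lambda's path only when run as the trusted principal.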

07 Validation & Smoke Tests

Run for every agent immediately after deploy. All must pass to clear the wave gate.

| # | Test | Pass criterion |
| --- | --- | --- |
| 1 | Frontend loads over HTTPS at <agent>.trilagen.bot | 200, valid ACM cert, no mixed content |
| 2 | Cognito login with a trilagen.com account | JWT issued, app screen renders |
| 3 | Hub launch hand-off | Opens authenticated via #token=… (no re-login) |
| 4 | Chat round-trip | Message → Lambda → model → response within timeout |
| 5 | Secrets resolve at runtime | No "missing credential" errors in CloudWatch |
| 6 | Conversation persistence | Item written to DynamoDB with TTL set |
| 7 | Integration call (per agent) | ISC / Lucid / SF / OpenAir API returns 200 |
| 8 | CloudWatch error scan | No ERROR / Throttle in first 15 min |

Security checks: confirm ISC least-privilege scopes, that Vault secret access is logged, and that the AWS Agent AssumeRole trust resolves only to the intended accounts.
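
Tests 1, 6, and 8 are the easiest to script. The helpers below are a sketch with hypothetical names; log group and table names are passed in because the plan does not list them. The TTL helper checks only that TTL is enabled on the table, not that a given item carries a TTL value, and the scan counts ERROR events only, so Throttle would need a second pattern.

```shell
# http_status <url>: print the HTTP status code of a GET request. curl fails
# on an invalid certificate by default, which covers part of test 1.
http_status() {
  curl -sS -o /dev/null -w '%{http_code}' "$1"
}

# status_ok <code>: test 1 expects exactly 200.
status_ok() {
  [ "$1" = "200" ]
}

# since_ms <now-epoch-seconds> <minutes>: start of the scan window in epoch
# milliseconds, the unit that filter-log-events expects.
since_ms() {
  echo $(( ($1 - $2 * 60) * 1000 ))
}

# error_count <log-group>: number of events matching ERROR in the last 15 min.
error_count() {
  aws logs filter-log-events \
    --log-group-name "$1" \
    --filter-pattern ERROR \
    --start-time "$(since_ms "$(date +%s)" 15)" \
    --query 'length(events)' --output json
}

# ttl_status <table>: prints ENABLED when TTL is active on the table.
ttl_status() {
  aws dynamodb describe-time-to-live \
    --table-name "$1" \
    --query 'TimeToLiveDescription.TimeToLiveStatus' --output text
}
```

A per-agent check might read status_ok "$(http_status https://isc.trilagen.bot)", followed by error_count and ttl_status with that agent's log group and table.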

08 Rollback Plan

Rollback is per-agent and fast — the edge is blue/green and stacks are versioned.

8.1 — Triggers

  • Any wave gate fails and cannot be remediated within 30 minutes.
  • Authentication broken (no JWT, or hub hand-off fails) for a Live agent.
  • Sustained Lambda 5xx / secret-resolution failure after deploy.

8.2 — Procedure

  1. Frontend: revert Route53 alias / CloudFront origin to prior distribution, or re-upload previous build and invalidate /*.
  2. Backend: redeploy the previous SAM template version for trilagen-<agent>-prod.
  3. Secrets: restore the previous version label if a bad secret was written.
  4. DynamoDB: no rollback needed (additive, TTL-bound; no destructive migrations).
  5. Confirm health on prior version; log the failure for post-mortem.

A single failure does not force a full-platform rollback: only the affected agent reverts while the rest proceed.
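
Two parts of the procedure can be sketched in shell: the 30-minute remediation trigger and the secret restore in step 3. The function names are hypothetical. The restore relies on the AWSCURRENT and AWSPREVIOUS staging labels that Secrets Manager maintains, and assumes the bad write was the most recent one.

```shell
# past_remediation_window <minutes>: first trigger in 8.1. Succeeds once a
# failed gate has gone unremediated for more than 30 minutes.
past_remediation_window() {
  [ "$1" -gt 30 ]
}

# version_with_stage <secret-id> <stage>: version ID carrying the label.
version_with_stage() {
  aws secretsmanager list-secret-version-ids \
    --secret-id "$1" \
    --query "Versions[?contains(VersionStages, '$2')].VersionId | [0]" \
    --output text
}

# restore_previous_secret <agent>: move AWSCURRENT back onto the version
# currently labelled AWSPREVIOUS.
restore_previous_secret() {
  sid="trilagen-$1/prod"
  cur="$(version_with_stage "$sid" AWSCURRENT)" || return 1
  prev="$(version_with_stage "$sid" AWSPREVIOUS)" || return 1
  if [ -z "$prev" ] || [ "$prev" = "None" ]; then
    echo "no AWSPREVIOUS version for $sid" >&2
    return 1
  fi
  aws secretsmanager update-secret-version-stage \
    --secret-id "$sid" \
    --version-stage AWSCURRENT \
    --move-to-version-id "$prev" \
    --remove-from-version-id "$cur"
}
```

For example, restore_previous_secret isc would revert trilagen-isc/prod to its prior value without touching any other agent's secret.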

09 Risk Register

| Risk | Impact | Likelihood | Mitigation |
| --- | --- | --- | --- |
| Deploy to wrong AWS account | High | Low | get-caller-identity gate at T-0; AWS_PROFILE enforced every step. |
| ACM validation delay on first deploy | Medium | Medium | Pre-create certs at T-24h; budget 10–15 min each. |
| Secret missing / incorrect at runtime | High | Medium | Seed & verify secrets before backend deploy; smoke test 5. |
| Cognito domain misconfig blocks login | High | Low | Wave-1 hub login gate before any agent. |
| Cross-account AssumeRole trust gap | High | Medium | Deploy role stacks in both accounts first; Security review. |
| CloudFront serves stale build | Low | Medium | Invalidate /* after upload; verify content hash. |
| External API rate limit | Medium | Low | Stagger Wave 4; validate one call per integration. |
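
The "verify content hash" mitigation for stale CloudFront content could be done by comparing the SHA-256 of a local build file with the copy served at the edge. A minimal sketch, with hypothetical function names and placeholder paths in the usage note:

```shell
# sha256_of: hash stdin with whichever tool is installed (sha256sum on most
# Linux systems, shasum on macOS) and print only the hex digest.
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum | cut -d' ' -f1
  else
    shasum -a 256 | cut -d' ' -f1
  fi
}

# served_matches <local-file> <url>: succeed when the file served at the URL
# is byte-identical to the local build artifact.
served_matches() {
  [ "$(sha256_of < "$1")" = "$(curl -sS "$2" | sha256_of)" ]
}
```

After the invalidation completes, something like served_matches ./dist/index.html https://isc.trilagen.bot/index.html would confirm the edge is serving the new build.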

10 Communications Plan

| When | Audience | Message |
| --- | --- | --- |
| T-24h | Stakeholders | Freeze in effect; window confirmed |
| T-0 (start) | Eng + stakeholders | Cutover started; hub going first |
| Per wave | Engineering | Wave N complete, gate GN passed/failed |
| Completion | All | Platform live; URLs + status summary |
| On rollback | Eng + Lead | Agent reverted; root cause + next attempt |

11 Approval & Sign-off

This cutover proceeds only with the approvals below.

| Role | Name | Approval / Date |
| --- | --- | --- |
| Cutover Lead | Satish Kandagadla | |
| Platform Engineering | | |
| Security | | |
| Product / Owner | | |

A Appendix · Environment & URL Reference

| Key | Value |
| --- | --- |
| Production account | 506997029654 |
| Non-prod / sub account | 878112346062 (AssumeRole target) |
| Region / profile | us-east-1 / trilagen-prod |
| Cognito pool / client | us-east-1_cYeALmJr6 / 3bp1ppq928aom54kveof08hn7o |
| Route53 zone | Z02186353SN03930GPGID |
| CloudFront alias zone | Z2FDTNDATAQYW2 |
| SAM artifact bucket | trilagen-sam-deployments-prod |
| Stack naming | trilagen-<agent>-prod / trilagen-<agent>-frontend-prod |
| Secret naming | trilagen-<agent>/prod |

Production URLs

| Agent | URL |
| --- | --- |
| Hub | https://hub.trilagen.bot |
| SailPoint ISC | https://isc.trilagen.bot |
| Secure Vault | https://vault.trilagen.bot |
| AWS Agent | https://aws.trilagen.bot |
| Diagram / Transcript | https://diagram.trilagen.bot |
| SOW Builder | https://sow.trilagen.bot |
| Salesforce | https://salesforce.trilagen.bot |
| Candidate Evaluation | https://candidate.trilagen.bot |
| PMO | https://pmo.trilagen.bot |
| OpenAir | https://openair.trilagen.bot |
| Legal Review | https://legal.trilagen.bot |