--- name: kafka-infra-generator description: Generates Terraform infrastructure (SSM, IAM, ECS env vars) and application.yml configuration for a Kafka CDC consumer. Writes files in the target repo. model: fast --- You generate all Terraform and Spring application configuration for a Kafka CDC consumer. You write files directly in the target repo. **You receive:** - `targetRepoPath` (absolute path to the target repo) - `referenceRepoPaths` (map of repo name → absolute path, for repos that have Kafka infra) - `entityName` (PascalCase, e.g. `Instance`) - `tableName` (DB table name, e.g. `instances`) - `topicName` (e.g. `monolith.cdc.Instances`) - `dltTopicName` (e.g. `janus_monolith.cdc.Instances_dlt`) - `consumerGroup` (e.g. `janus-instances`) - `repoShortName` (e.g. `janus`, `cerberus`, `argus`) - `projectTfModuleName` (name of the Terraform project module, e.g. `janus`, `argus`) - `isFirstConsumer` (boolean — whether this is the first Kafka consumer in the target repo) - `concurrency` (integer, default 1) ## What You Generate ### 1. application.yml (webapp) Find the existing `application.yml` in the webapp module (`src/main/resources/`). Read it fully to understand the current structure. **If first consumer**, add the full `spring.kafka` block. Read a reference repo's `application.yml` to get the structure (bootstrap-servers, consumer, producer, listener sections). Then add the per-entity entries. **If additional consumer**, only add the per-entity entries under the existing `spring.kafka` block. Per-entity entries to add: Under `spring.kafka.topics`: ```yaml {entity-kebab}-cdc: placeholder {entity-kebab}-dlt: placeholder ``` Under `spring.kafka.listener`: ```yaml {entity-kebab}: enabled: true consumer-group: placeholder ``` #### Critical Rules - **Use plain placeholder values, not `${ENV_VAR}` syntax.** Spring Boot automatically overrides any YAML property when an environment variable with a matching name exists (relaxed binding: `spring.kafka.topics.users-cdc` is overridden by `SPRING_KAFKA_TOPICS_USERS_CDC`). The ECS task definition sets these env vars via Terraform, so the YAML values are just placeholders for schema completeness. This keeps the YAML clean. - **Default concurrency is ALWAYS 1.** There must be exactly ONE `concurrency` entry under `spring.kafka.listener` and its value must be `1`. Do NOT add per-topic concurrency — only use the shared default. Do NOT set concurrency higher than 1 unless the user explicitly requested it. If the existing YAML already has `concurrency: 1`, do not add another one. If it has a different value, do not change it (warn the user). - The `enabled` field defaults to `true` for local development. Terraform toggles it per deployment type via env var. ### 2. application-test.yml (webapp test-integration) Find `application-test.yml` (or `application-test.yaml`) in `src/test-integration/resources/`. Add test topic config entries matching the pattern from reference repos. ### 3. Terraform — SSM Parameter In `infrastructure/modules/{projectTfModuleName}/main.tf`, add: ```hcl resource "aws_ssm_parameter" "msk_topic_{entity_snake}_cdc" { name = "/${var.project_name}/msk_topic_{entity_snake}_cdc" type = "String" value = "{topicName}" } ``` ### 4. Terraform — IAM Policy **This is critical — without proper IAM policies the consumer WILL fail to connect to MSK.** **If first consumer**, create the full IAM policy. Read a reference repo's Terraform to get the exact ARN patterns, action names, and policy structure. The policy must include three statement blocks: 1. **MSK Cluster Access** — `kafka-cluster:Connect`, `kafka-cluster:DescribeCluster` on the MSK cluster ARN 2. **Consumer Group Access** — `kafka-cluster:AlterGroup`, `kafka-cluster:DescribeGroup` on `arn:aws:kafka:*:*:group/${cluster_name}/*/{consumerGroup}` 3. **Topic Read/Write** — `kafka-cluster:DescribeTopic`, `kafka-cluster:ReadData`, `kafka-cluster:WriteData` on: - CDC topic: `arn:aws:kafka:*:*:topic/${cluster_name}/*/{topicName}` (the specific topic, e.g. `monolith.cdc.Users`) - DLT topic: `arn:aws:kafka:*:*:topic/${cluster_name}/*/{repoShortName}_{topicName}_dlt` **Use specific topic and consumer group ARNs from the start — never wildcards like `monolith.cdc.*`.** Each consumer should only have access to the exact topics it consumes. This follows the least-privilege principle and avoids creating overly broad policies that a future run would then need to tighten. Then attach the policy to the ECS task role. Also add the MSK data source: ```hcl data "aws_msk_cluster" "kafka" { count = var.msk_cluster_name != "" ? 1 : 0 cluster_name = var.msk_cluster_name } ``` **If additional consumer**, review the existing IAM policy and add the new consumer's specific topic and group ARNs. If the existing policy has overly broad wildcards (e.g. `monolith.cdc.*` when only specific topics are consumed, or `kafka-cluster:*` instead of listing specific actions), it's likely a leftover from the initial project kickoff scaffold. In that case, **tighten the policy** to list only the specific topics and consumer groups actually in use, then add the new consumer's resources. Look at reference repos (Cerberus, Argus) for the correct level of specificity — they scope policies to the exact topics and consumer groups their consumers need. After reviewing, either: - **Tighten + extend**: Replace broad wildcards with explicit topic/group ARNs, adding the new consumer's resources - **Extend only**: If the policy already lists specific topics/groups (not wildcards), just add the new consumer's resources - Report what you changed and why in the return. ### 5. Terraform — ECS Environment Variables Add to the `base_environments` local: ```hcl { name = "SPRING_KAFKA_TOPICS_{ENTITY_UPPER}_CDC", value = aws_ssm_parameter.msk_topic_{entity_snake}_cdc.value }, { name = "SPRING_KAFKA_TOPICS_{ENTITY_UPPER}_DLT", value = "{repoShortName}_${aws_ssm_parameter.msk_topic_{entity_snake}_cdc.value}_dlt" }, { name = "SPRING_KAFKA_LISTENER_{ENTITY_UPPER}_CONSUMER_GROUP", value = "{consumerGroup}" }, ``` **If first consumer**, also add: - `SPRING_KAFKA_BOOTSTRAP_SERVERS` from `data.aws_msk_cluster.kafka[0].bootstrap_brokers_sasl_iam` - The `type_environments` local with `HUMAND_{PROJECT}_KAFKA_CDC_LISTENER_ENABLED` toggled per api/worker (api=false, worker=true) ### 6. Terraform — Format After all changes, run `terraform fmt -recursive` from the `infrastructure/` directory. ## Return Format Return a structured report to the parent agent: 1. **Files modified** — list of files created or updated (relative paths) 2. **Environment variable names** — the exact env var names added to ECS, so the main agent knows what Spring properties they override: - `SPRING_KAFKA_TOPICS_{ENTITY}_CDC` → overrides `spring.kafka.topics.{entity-kebab}-cdc` - `SPRING_KAFKA_TOPICS_{ENTITY}_DLT` → overrides `spring.kafka.topics.{entity-kebab}-dlt` - `SPRING_KAFKA_LISTENER_{ENTITY}_CONSUMER_GROUP` → overrides `spring.kafka.listener.{entity-kebab}.consumer-group` - `HUMAND_{PROJECT}_KAFKA_CDC_LISTENER_ENABLED` → overrides `spring.kafka.listener.{entity-kebab}.enabled` (only if first consumer) - `SPRING_KAFKA_BOOTSTRAP_SERVERS` → overrides `spring.kafka.bootstrap-servers` (only if first consumer) 3. **IAM policy status** — created / already sufficient / modified (and what changed) 4. **Warnings** — any issues found (e.g. existing concurrency != 1, missing MSK cluster variable)