docs(acp): add StatefulSet application DR solution #129
tossmilestone wants to merge 1 commit into main
Walkthrough
New documentation file describing an Active‑Passive cross‑cluster disaster recovery solution for stateful applications. It covers architecture, component roles (Primary/Standby, GSLB, VolSync, GitOps), configuration templates, GitOps and VolSync setup, and step‑by‑step procedures for migrations, failover, and failback with risks and checks.
Sequence Diagram(s)
sequenceDiagram
participant User
participant GSLB
participant GitOps
participant PrimaryCluster
participant VolSync
participant StandbyCluster
User->>GSLB: Resolve service DNS
GSLB->>PrimaryCluster: Route traffic to Primary
Note right of PrimaryCluster: Primary handles reads/writes
PrimaryCluster->>VolSync: Replicate PVs (async rsync-tls)
VolSync->>StandbyCluster: Stream replicated data
GitOps->>StandbyCluster: Sync standby manifests (kustomize patches)
alt Emergency failover
GSLB->>StandbyCluster: Switch DNS to Standby
StandbyCluster->>User: Serve traffic (read/write)
end
alt Failback
PrimaryCluster->>VolSync: Re-sync data back
GitOps->>PrimaryCluster: Reapply primary manifests
GSLB->>PrimaryCluster: Route traffic back
end
Estimated code review effort: 3 (Moderate), ~20 minutes
Pre-merge checks: 3 passed
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/en/solutions/acp/StatefulSet_Application_DR.md`:
- Around line 291-295: Replace the indented command block for the
ReplicationSource status check with a fenced shell code block (```bash ... ```)
so it no longer triggers MD046, and update the explanatory text that references
trigger.manual to describe it as a spec field (e.g., "trigger.manual is a spec
field") rather than a label; apply the same fenced-block and wording correction
to the later occurrence that spans the other referenced lines and ensure the
command shown is the kubectl -n <application-namespace> get ReplicationSource
rs-pvc-01 -o jsonpath='{.status}' example.
- Around line 135-148: The example only creates a headless Service
(metadata.name: my-stateful-app-headless with clusterIP: None and selector app:
my-stateful-app) but omits the client-facing exposure needed for GSLB traffic
switching; update the doc to either add a separate external
Service/Ingress/Gateway example (a Service with a clusterIP and/or type:
LoadBalancer, or an Ingress/Gateway, that targets the same Pods as the headless
Service) or add a clear sentence after the headless Service block explicitly
stating that a separate client-facing Service/Ingress/Gateway is required for
external traffic and must target the same selector (app: my-stateful-app) to
enable the traffic-switching flow.
- Around line 174-178: The Parameter tables (the block starting with
"**Parameters**:" that lists **application-namespace** and **psk.txt**) need
blank lines inserted immediately before and after the table to satisfy MD058 and
ensure proper rendering; update the markdown around the "**Parameters**:" table
and the other similar tables (the ones containing **application-namespace** /
**psk.txt** entries) by adding one empty line above the table header and one
empty line below the table end so each table is separated from surrounding
paragraphs.
- Around line 40-45: The diagram incorrectly shows a StatefulSet Pod in the
passive cluster; remove or hide the Pod node (PodS / "StatefulSet Pod
<br>replicas=0") from the "Secondary Cluster Passive" subgraph so the visual
matches the text (replicas=0), leaving only the Service (SvcS), Destination
PVC/Snapshot (PVCS), and VolSync ReplicationDestination (VSR_Dest) nodes in that
subgraph.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e347fff8-7779-4eed-b814-323c0b74c76b
📒 Files selected for processing (1)
docs/en/solutions/acp/StatefulSet_Application_DR.md
subgraph Secondary[Secondary Cluster Passive]
  SvcS[Service]
  PodS[StatefulSet Pod <br>replicas=0]
  PVCS[(Destination PVC / Snapshot)]
  VSR_Dest[VolSync ReplicationDestination]
end
Remove the standby Pod from the passive-cluster diagram.
Lines 40-45 show a StatefulSet Pod in the secondary cluster even though the rest of the doc defines the standby side as replicas=0. That contradiction makes the target topology ambiguous for readers following the runbook.
apiVersion: v1
kind: Service
metadata:
  name: my-stateful-app-headless
  namespace: <application-namespace>
spec:
  clusterIP: None
  selector:
    app: my-stateful-app
  ports:
    - name: tcp
      port: 80
      targetPort: 8080
Call out the client-facing Service separately from the headless Service.
This sample only defines a headless Service (clusterIP: None), which is fine for StatefulSet identity but does not match the GSLB traffic-switching flow described earlier. Please either add the external Service/Ingress/Gateway example here or explicitly state that a separate client-facing exposure resource is required.
**Parameters**:
| **Parameter** | **Explanation** |
|:--|:--|
| **application-namespace** | The namespace of secret, should same as application |
| **psk.txt**| This field adheres to the format expected by stunnel: `<id>:<at least 32 hex digits>`. <br></br>for example, `1:23b7395fafc3e842bd8ac0fe142e6ad1`. |
Add blank lines around the parameter tables.
These tables currently violate MD058, and some Markdown renderers handle them poorly when they are not separated from surrounding paragraphs. A blank line before and after each table will make the page render more reliably.
Also applies to: 207-216, 247-257
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)
[warning] 175-175: Tables should be surrounded by blank lines
(MD058, blanks-around-tables)
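For readers following this thread, the two parameters in the quoted table can be pictured as a Secret manifest. This is only a sketch: the Secret name `volsync-psk` is a hypothetical placeholder, and the key value is the sample from the table, not a key to reuse.

```yaml
# Sketch of the pre-shared-key Secret that the parameters table describes.
# The name "volsync-psk" is an assumption; use whatever name your VolSync resources reference.
apiVersion: v1
kind: Secret
metadata:
  name: volsync-psk
  namespace: <application-namespace>  # same namespace as the application
type: Opaque
stringData:
  # stunnel PSK format: <id>:<at least 32 hex digits>
  psk.txt: "1:23b7395fafc3e842bd8ac0fe142e6ad1"
```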
50e0a99 to 425569e
♻️ Duplicate comments (4)
docs/en/solutions/acp/StatefulSet_Application_DR.md (4)
174-178: ⚠️ Potential issue | 🟡 Minor
Add blank lines before each parameters table (MD058).
At Line 175, Line 208, and Line 248, the table starts immediately after **Parameters**:. Markdownlint MD058 expects a blank line before the table header.
Also applies to: 207-216, 247-257
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/en/solutions/acp/StatefulSet_Application_DR.md` around lines 174 - 178, The tables after each "**Parameters**:" heading start immediately on the next line and violate MD058; insert a single blank line between every "**Parameters**:" line and its following table header (the lines beginning with "| **Parameter** | **Explanation** |" / the table rows such as "| **application-namespace** | ..." and "| **psk.txt**| ...") so each Parameters section has one blank line before the table.
40-43: ⚠️ Potential issue | 🟡 Minor
Remove the standby Pod node from the passive-cluster diagram.
At Line 40–43, showing StatefulSet Pod <br>replicas=0 in the passive subgraph conflicts with the “no running workload” standby model and can mislead readers during runbook execution.
Suggested doc patch
     subgraph Secondary[Secondary Cluster Passive]
         SvcS[Service]
    -    PodS[StatefulSet Pod <br>replicas=0]
         PVCS[(Destination PVC / Snapshot)]
         VSR_Dest[VolSync ReplicationDestination]
     end
    @@
    -GSLB -. "Standby Traffic <br>(0 Weight)" .-> SvcS
    +GSLB -. "Standby Traffic <br>(0 Weight)" .-> SvcS
    @@
    -SvcP --> PodP
    +SvcP --> PodP
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/en/solutions/acp/StatefulSet_Application_DR.md` around lines 40 - 43, The passive-cluster diagram incorrectly includes the standby Pod node "PodS[StatefulSet Pod <br>replicas=0]" which contradicts the intended "no running workload" standby model; remove the PodS node (and its label "StatefulSet Pod <br>replicas=0") from the Secondary subgraph so only Service (SvcS), Destination PVC/Snapshot (PVCS), and VolSync ReplicationDestination (VSR_Dest) remain, ensuring the diagram reflects a passive cluster with no pods running.
135-148: ⚠️ Potential issue | 🟠 Major
Document a separate client-facing Service/Ingress for GSLB traffic switching.
At Line 135–148, only a headless Service is defined (clusterIP: None). That is fine for StatefulSet identity, but it does not represent the client-facing endpoint required by the GSLB flow described earlier.
Suggested doc patch
     apiVersion: v1
     kind: Service
     metadata:
       name: my-stateful-app-headless
    @@
         - name: tcp
           port: 80
           targetPort: 8080
    +---
    +apiVersion: v1
    +kind: Service
    +metadata:
    +  name: my-stateful-app
    +  namespace: <application-namespace>
    +spec:
    +  selector:
    +    app: my-stateful-app
    +  ports:
    +    - name: tcp
    +      port: 80
    +      targetPort: 8080
Or add one explicit sentence after the block: “A separate client-facing Service/Ingress/Gateway is required for GSLB and must target app: my-stateful-app.”
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/en/solutions/acp/StatefulSet_Application_DR.md` around lines 135 - 148, The current manifest only defines a headless Service (clusterIP: None) for StatefulSet identity but lacks a client-facing endpoint for GSLB; update the doc by adding either a separate Service/Ingress/Gateway example that selects app: my-stateful-app (i.e., a normal ClusterIP/LoadBalancer/Ingress/Gateway resource) or insert one clear sentence immediately after the headless Service block stating: “A separate client-facing Service/Ingress/Gateway is required for GSLB and must target app: my-stateful-app.” Reference the headless Service block (name: my-stateful-app-headless / selector: app: my-stateful-app) when adding the new Service/Ingress/Gateway guidance.
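The suggested patch adds a plain ClusterIP Service, which on its own is reachable only from inside the cluster. If the GSLB needs an externally reachable address, one possible variant is a `type: LoadBalancer` Service. The sketch below assumes that approach and reuses the name and ports from the patch; an Ingress or Gateway may fit the platform better.

```yaml
# Hypothetical client-facing Service for GSLB; not taken from the document under review.
apiVersion: v1
kind: Service
metadata:
  name: my-stateful-app
  namespace: <application-namespace>
spec:
  type: LoadBalancer        # assumption: the cluster can provision an external address
  selector:
    app: my-stateful-app    # same Pods as the headless Service
  ports:
    - name: tcp
      port: 80
      targetPort: 8080
```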
291-295: ⚠️ Potential issue | 🟡 Minor
Use a proper fenced block and call trigger.manual a spec field.
At Line 291–295, the command block is still indented, which triggers MD046. At Line 313, trigger.manual is described as a “label,” but in your YAML it is a spec field.
Suggested doc patch
    -  Check synchronization from the `ReplicationSource`.
    -
    -    ```bash
    -    kubectl -n <application-namespace> get ReplicationSource rs-pvc-01 -o jsonpath='{.status}'
    -    ```
    +Check synchronization from the `ReplicationSource`.
    +
    +```bash
    +kubectl -n <application-namespace> get ReplicationSource rs-pvc-01 -o jsonpath='{.status}'
    +```
    @@
    -- Create a One-Time `ReplicationSource` with the `trigger.manual` label on the primary cluster.
    +- Create a One-Time `ReplicationSource` with the `trigger.manual` spec field on the primary cluster.
Also applies to: 313-313
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/en/solutions/acp/StatefulSet_Application_DR.md` around lines 291 - 295, Remove the indentation for the bash snippet so it uses a proper fenced code block (```bash ... ```) around the kubectl command for ReplicationSource and update the text that mentions trigger.manual to describe it as a spec field rather than a label (i.e., change "trigger.manual label" to "trigger.manual spec field"); ensure references to ReplicationSource and trigger.manual are corrected consistently (e.g., the Get command block and the "Create a One-Time ReplicationSource" sentence).
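To illustrate why `trigger.manual` is a spec field and not a label, here is a minimal sketch of a one-time `ReplicationSource`. The PVC name `pvc-01`, the trigger value `sync-001`, the Secret name `volsync-psk`, and the address placeholder are assumptions for illustration, not values taken from the document under review.

```yaml
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: rs-pvc-01
  namespace: <application-namespace>
  # Note: trigger.manual does not belong under metadata.labels.
spec:
  sourcePVC: pvc-01            # hypothetical PVC name
  trigger:
    manual: sync-001           # arbitrary string; changing it requests a new sync
  rsyncTLS:
    keySecret: volsync-psk     # hypothetical Secret holding psk.txt
    address: <replication-destination-address>
    copyMethod: Snapshot
```

When a manual sync finishes, VolSync records the trigger value in `.status.lastManualSync`, which the `jsonpath='{.status}'` check would surface.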
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9aaeeecf-2453-4f9a-9f56-81069ae4dfe9
📒 Files selected for processing (1)
docs/en/solutions/acp/StatefulSet_Application_DR.md
Add StatefulSet application DR solution with VolSync. Signed-off-by: Xiaoxi He <xxhe@alauda.io>
425569e to 4b96dd6
🧹 Nitpick comments (3)
docs/en/solutions/acp/StatefulSet_Application_DR.md (3)
178-178: Use standard line break syntax.
The <br></br> pair is not valid HTML, because br is a void element and takes no closing tag. Use <br/> (self-closing) instead; Markdown's trailing double-space line break is not an option here, since a table row must stay on a single line.
✏️ Proposed fix
    -| **psk.txt**| This field adheres to the format expected by stunnel: `<id>:<at least 32 hex digits>`. <br></br>for example, `1:23b7395fafc3e842bd8ac0fe142e6ad1`. |
    +| **psk.txt**| This field adheres to the format expected by stunnel: `<id>:<at least 32 hex digits>`. <br/>For example, `1:23b7395fafc3e842bd8ac0fe142e6ad1`. |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/en/solutions/acp/StatefulSet_Application_DR.md` at line 178, Replace the non-standard <br></br> usage in the table cell that describes **psk.txt** with a self-closing HTML break tag (<br/>) so the example `1:23b7395fafc3e842bd8ac0fe142e6ad1` renders consistently; update the table cell text containing "**psk.txt**| This field adheres to the format expected by stunnel: `<id>:<at least 32 hex digits>`. <br></br>for example, ..." accordingly.
403-403: Consider using "before" instead of "prior to".
The phrase "prior to a disaster" can be simplified to "before a disaster" for more concise writing.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/en/solutions/acp/StatefulSet_Application_DR.md` at line 403, Replace the phrase "prior to a disaster" in the sentence that reads "The storage and computational resources sit idle prior to a disaster, inaccessible to the external network, and incapable of offloading query pressure like an Active-Active architecture would." with the simpler "before a disaster" so the sentence reads "...sit idle before a disaster..."; update the document text in StatefulSet_Application_DR.md accordingly.
80-80: Consider simplifying "In order to construct".
The phrase "In order to construct" can be shortened to "To construct" for more concise writing.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/en/solutions/acp/StatefulSet_Application_DR.md` at line 80, Replace the verbose phrase "In order to construct the \"Active-Passive\" setup" with the concise "To construct the \"Active-Passive\" setup" in the sentence that describes using Kustomize overrides to set `replicas` (the line mentioning GitOps Base Configuration Sync and Kustomize overrides manipulating `replicas` so Primary is `N` and Secondary is `replicas=0`).
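The sentence under review describes the replica split only in prose. A standby overlay could look roughly like the sketch below, assuming a `../../base` directory layout and a StatefulSet named `my-stateful-app`; both are hypothetical and not confirmed by the document.

```yaml
# overlays/standby/kustomization.yaml (hypothetical path)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
replicas:
  - name: my-stateful-app  # assumed StatefulSet name
    count: 0               # standby runs no Pods until failover
```

The primary overlay would set `count` to the desired `N` instead.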
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a25e62d4-791b-4fae-a089-b14154f78d96
📒 Files selected for processing (1)
docs/en/solutions/acp/StatefulSet_Application_DR.md
Add StatefulSet application DR solution with VolSync.