Conversation

dhtclk
Collaborator

@dhtclk dhtclk commented Sep 23, 2025

Summary

Adding ClickHouse Cloud Disaster Recovery Guide


vercel bot commented Sep 23, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
clickhouse-docs Ready Ready Preview Comment Oct 7, 2025 11:25am
3 Skipped Deployments
Project Deployment Preview Comments Updated (UTC)
clickhouse-docs-jp Ignored Ignored Oct 7, 2025 11:25am
clickhouse-docs-ru Ignored Ignored Preview Oct 7, 2025 11:25am
clickhouse-docs-zh Ignored Ignored Preview Oct 7, 2025 11:25am

@leticiawebb
Contributor

Tracking comments in ClickHouse Cloud Disaster Recovery - Public Docs. Adding more directed comments in the review.

Contributor

@leticiawebb leticiawebb left a comment

@aashishkohli updated PR for comments. PTAL.

title: 'Disaster recovery'
description: 'This guide provides an overview of disaster recovery.'
doc_type: 'guide'
---
Contributor

Reading this document in the broader scope of backups in our docs:

  • Features > Backups > Overview: move to Guides > Backups > Review and Restore Backups
  • Features > Backups > Configurable Backups: move to Guides > Backups > Configure Backup Schedules
  • Features > Backups > Export Backups to your Own Cloud Account: move to Guides > Export Backups

This document would then move to Reference > Data Resiliency. See the comments below for cross-references.

Add a new document under Features > Backups. The new document should cover:

  • Review and Restore Backups
  • Configure Backup Schedules
  • Export Backups

These should have brief descriptions and links to the Guide pages above.


It is helpful to cover some definitions first.

**RPO (Recovery Point Objective)**: The maximum acceptable data loss, measured in time, following a disruptive event. Example: an RPO of 30 minutes means that after a failure the database should be restorable to data no more than 30 minutes old. The achievable RPO depends on how frequently backups are taken.
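
To make the relationship between backup frequency and RPO concrete, here is a minimal Python sketch. It assumes the simplest possible model, in which the worst-case data loss equals the backup interval (a failure just before the next backup) and backup duration is ignored. The function names are illustrative and not part of any ClickHouse API.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    """Upper bound on lost data, assuming a failure just before the next backup.

    Simplifying assumption: backups are instantaneous, so the newest backup
    is never older than one full interval.
    """
    return backup_interval


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Return True if the backup schedule can satisfy the target RPO."""
    return worst_case_data_loss(backup_interval) <= rpo


# A 24-hour schedule cannot meet a 30-minute RPO; a 6-hour schedule meets a 6-hour RPO.
print(meets_rpo(timedelta(hours=24), timedelta(minutes=30)))
print(meets_rpo(timedelta(hours=6), timedelta(hours=6)))
```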
Contributor

TIP:
Customers should perform periodic backup restore testing to understand the specific RTO for their service size and configuration.
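
A restore drill of this kind can be timed with a small harness. The sketch below is only an illustration: `placeholder_restore` is a hypothetical stand-in for whatever actually triggers and waits for a restore (console, API, or SQL), and the measured wall-clock time is just one input into an RTO estimate.

```python
import time
from typing import Callable


def measure_restore_seconds(
    restore: Callable[[], None],
    clock: Callable[[], float] = time.monotonic,
) -> float:
    """Run one restore drill and return the elapsed wall-clock seconds."""
    started = clock()
    restore()  # must block until the restored service is usable
    return clock() - started


def placeholder_restore() -> None:
    # Hypothetical stand-in: replace with the real restore procedure
    # and a check that the restored service answers queries.
    time.sleep(0.01)


elapsed = measure_restore_seconds(placeholder_restore)
print(f"Restore drill took {elapsed:.2f} s")
```

Recording this number for each drill, alongside the service size at the time, gives a history to compare against the target RTO.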


**Default backups**: By default, ClickHouse Cloud takes a backup of your service every 24 hours. These backups are stored in the same region as the service, in ClickHouse's CSP (cloud service provider) storage bucket. If the data in the primary service becomes corrupted, the backup can be restored to a new service.

**External backups (in customer's own storage bucket)**: Enterprise Tier customers can export backups to object storage in their own account, either in the same region or in another region. Cross-cloud backup export support is coming soon. Data transfer charges apply to cross-region and cross-cloud backups.
Contributor

INFO:
This feature is not currently available in PCI/HIPAA or encrypted services.


**Configurable backups**: Customers can configure backups to happen at a higher frequency, up to every 6 hours, to improve the RPO. Customers can also configure longer retention.
Contributor

INFO:
The size of the database plays a significant role in how quickly a backup completes and when the next backup starts if backups are scheduled close to each other. Customers should monitor the first few backups and test recovery to verify the correct RTO/RPO for the service.
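
The interaction between backup duration and schedule can be sketched as follows. This is a rough model under stated assumptions, not a description of how ClickHouse Cloud schedules backups: a backup is assumed to capture data as of its start time, to become usable only once it finishes, and to delay the next backup if it overruns the scheduled interval. All names are illustrative.

```python
from datetime import timedelta


def effective_backup_interval(scheduled: timedelta, duration: timedelta) -> timedelta:
    """Time between backup starts, assuming an overrunning backup delays the next one."""
    return max(scheduled, duration)


def worst_case_backup_age(scheduled: timedelta, duration: timedelta) -> timedelta:
    """Oldest the newest usable backup can be under the assumptions above.

    Just before the next backup completes, the newest usable backup holds
    data from one effective interval plus one backup duration ago.
    """
    return effective_backup_interval(scheduled, duration) + duration


# A 6-hour schedule with 2-hour backups: data may be up to 8 hours old.
print(worst_case_backup_age(timedelta(hours=6), timedelta(hours=2)))
# A 6-hour schedule with 8-hour backups: the schedule slips, up to 16 hours old.
print(worst_case_backup_age(timedelta(hours=6), timedelta(hours=8)))
```

Plugging in backup durations observed from the first few backups shows whether the configured frequency actually delivers the intended RPO.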


### Primary service data corruption {#primary-service-data-corruption}

In this case, the data can be restored from the backup to another service in the same region. The backup could be up to 24 hours old under the default backup policy, or up to 6 hours old with configurable backups set to a 6-hour frequency.
### Primary region downtime {#primary-region-downtime}

Customers in the Enterprise Tier can export backups to their own cloud provider bucket. If you are concerned about regional failures, we recommend exporting backups to a different region. Keep in mind that cross-region data transfer charges will apply.
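
As a rough illustration of a cross-region export, the Python sketch below assembles a `BACKUP DATABASE ... TO S3(...)` statement. It assumes the `BACKUP ... TO S3` SQL syntax from ClickHouse's backup documentation is available for the service; the database name, bucket URL, and credential placeholders are hypothetical, and the bucket is assumed to live in a region other than the service's.

```python
import re


def quote_literal(value: str) -> str:
    """Quote a value as a single-quoted SQL string literal."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def build_backup_statement(
    database: str, bucket_url: str, access_key_id: str, secret_access_key: str
) -> str:
    """Build a BACKUP statement targeting a customer-owned S3 bucket."""
    # Only accept plain identifiers so the database name needs no quoting.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", database):
        raise ValueError(f"unsupported database name: {database!r}")
    target = ", ".join(
        quote_literal(part) for part in (bucket_url, access_key_id, secret_access_key)
    )
    return f"BACKUP DATABASE {database} TO S3({target})"


# Hypothetical bucket in a different region from the service.
statement = build_backup_statement(
    "analytics",
    "https://example-bucket.s3.eu-west-1.amazonaws.com/backups/2025-10-07",
    "<key id>",
    "<key secret>",
)
print(statement)
```

In practice, credentials should come from a secrets store rather than being embedded in source code.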
3 participants