Author: Ankit Meshram, Associate Engineer – CloudDevOps
Protecting your data is an important step toward meeting business and regulatory compliance requirements. Even durable resources are susceptible to threats such as application bugs that cause accidental deletion or corruption. According to a report highlighted by PC World, up to 75% of data loss is caused by human error, making it the single greatest cause of data loss in the workplace. It happens every single day, at businesses of all sizes around the world.
Need more proof?
- AMAG Pharmaceuticals ran into a problem with data stored on Google Drive. When a folder relating to HR activities was moved within the company’s Drive, it stopped syncing properly and all of its files seemed to disappear. After checking everywhere, including the trash bin and desktop, the data appeared to be completely gone.
- A 2016 report by the UK’s Information Commissioner’s Office found that human error accounted for the vast majority (nearly two thirds) of data loss and data breach events reported to the agency.
- At data centers specifically, 70% of data incidents are caused by accidental human error, according to research by Uptime Institute.
- In a 2015 survey of more than 400 IT professionals, human error was cited as the top cause of data loss, higher than all other causes, including hardware failure, data corruption and natural disasters.
Building and managing your own backup workflows across all your applications in a compliant and consistent manner can be complex and costly. Cloud backup, sometimes referred to as online backup or remote backup, is the process of backing up data to cloud-based servers. When you back up your data to the cloud, you store a copy of that data on one or more remote servers owned and managed by a third-party cloud service provider. Typically, cloud service providers charge fees based on factors such as the amount of storage space or the number of servers required, available server bandwidth, and the number of users who access those servers. AWS Backup removes the need for costly custom solutions or manual processes by providing a fully managed, policy-based data protection solution.
What is AWS Backup?
AWS Backup is a fully managed service that centralizes and automates data protection across AWS services like Amazon Simple Storage Service (S3), Amazon FSx, Amazon Elastic Compute Cloud (EC2), and Amazon Relational Database Service (RDS), and hybrid workloads like VMware on premises, VMware Cloud on AWS, and VMware Cloud on AWS Outposts.
AWS Backup offers a cost-effective, fully managed, policy-based service that further simplifies data protection at scale. AWS Backup offers advanced features such as lifecycle policies to transition backups to a low-cost storage tier. It also includes backup storage and encryption independent from its source data, audit and compliance reporting capabilities with AWS Backup Audit Manager, and delete protection with AWS Backup Vault Lock.
Components of AWS Backup
1. Backups:
A backup or recovery point, represents the content of a resource, such as an Amazon Elastic Block Store (Amazon EBS) volume or Amazon DynamoDB table, at a specified time. It is a term that refers generally to the different backups in AWS services, such as Amazon EBS snapshots and DynamoDB backups.
2. Backup Vault:
In AWS Backup, a backup vault is a container that stores and organizes your backups. When creating a backup vault, you must specify the AWS Key Management Service (AWS KMS) encryption key that encrypts some of the backups placed in this vault. Encryption for other backups is managed by their source AWS services.
New backup vaults can be created in each AWS Region where AWS Backup is available. Enable delete protection on your backup vaults using AWS Backup Vault Lock to prevent backups from being deleted or altered, whether accidentally or by malicious actors. AWS Backup stores your continuous backups and periodic snapshots in the backup vault of your preference and lets you browse and restore them as your requirements dictate.
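As a sketch of how that delete protection might be wired up in Terraform (the vault name and retention values here are hypothetical, not from the article), the aws_backup_vault_lock_configuration resource locks a vault:

```hcl
# Hypothetical sketch: lock a backup vault so recovery points cannot be
# deleted before their minimum retention period has elapsed.
resource "aws_backup_vault" "locked" {
  name = "demo_locked_vault" # hypothetical name
}

resource "aws_backup_vault_lock_configuration" "locked" {
  backup_vault_name   = aws_backup_vault.locked.name
  changeable_for_days = 3   # grace period before the lock becomes immutable
  min_retention_days  = 7   # backups must be kept at least this long
  max_retention_days  = 365 # and no longer than this
}
```

Once `changeable_for_days` elapses, the lock is immutable and not even the account root user can shorten retention or delete locked recovery points.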

3. Backup Plan:
A backup plan is a policy expression that defines when and how you want to back up your AWS resources. You assign resources to backup plans, and AWS Backup then automatically creates and retains backups for those resources according to the plan.
Each backup rule consists of the following elements:
- Backup frequency: The backup frequency determines how often AWS Backup creates a snapshot backup. Using the console, you can choose a frequency of every hour, every 12 hours, daily, weekly, or monthly, or supply a cron expression for schedules as frequent as hourly; the AWS Backup CLI offers the same hourly granularity.
- Backup vault: Backups are stored in the specified backup vault. You can create your own backup vault, use a previously created one, or store your backups in the default AWS Backup vault.
- Backup window: Backup windows consist of the time that the backup window begins and the duration of the window in hours. Backup jobs are started within this window. If you are unsure what backup window to use, you can choose to use the default backup window that AWS Backup recommends.
- Transition to cold storage: Specifies when to transition the backup copy to cold storage and when to expire (delete) the copy. Backups transitioned to cold storage must be stored in cold storage for a minimum of 90 days. You can’t change this value after a copy has transitioned to cold storage.
- Retention period: Tell AWS Backup how long to store your backups. AWS Backup automatically deletes your backups at the end of this period to save storage costs for you. AWS Backup can retain snapshots between 1 day and 100 years (or indefinitely, if you do not enter a retention period), and continuous backups between 1 and 35 days.
Backup plans are composed of one or more backup rules. Each backup rule is composed of –
- A backup schedule, which includes the backup frequency (your Recovery Point Objective [RPO]), the backup window, and the backup vault in which to place the created recovery points

- A lifecycle rule that specifies when to transition a backup from one storage tier to another and when to expire the recovery point

- The resource assignments, which specify the resource types to be backed up.

- The tags to be added to backups upon creation.

For example, a backup plan might have a daily backup rule and a monthly backup rule. The daily rule backs up resources every day at midnight and retains the backups for one month. The monthly rule takes a backup at the beginning of every month and retains the backups for one year.
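That daily-plus-monthly example can be sketched in Terraform roughly as follows (the plan and rule names are hypothetical, and a vault defined elsewhere is assumed):

```hcl
# Hypothetical sketch of a plan with a daily rule (kept one month)
# and a monthly rule (kept one year).
resource "aws_backup_plan" "tiered" {
  name = "daily_and_monthly"

  rule {
    rule_name         = "daily_at_midnight"
    target_vault_name = aws_backup_vault.example.name # assumes a vault defined elsewhere
    schedule          = "cron(0 0 * * ? *)"           # every day at 00:00 UTC

    lifecycle {
      delete_after = 30 # retain daily backups for one month
    }
  }

  rule {
    rule_name         = "monthly_on_the_first"
    target_vault_name = aws_backup_vault.example.name
    schedule          = "cron(0 0 1 * ? *)"           # 00:00 UTC on the 1st of each month

    lifecycle {
      delete_after = 365 # retain monthly backups for one year
    }
  }
}
```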
4. Backup Jobs:
Once a backup is scheduled, you can monitor the status and other details of backup, restore, and copy activity. Backup job statuses include pending, running, aborted, completed, and failed.

5. Backup Audit Manager:
AWS Backup Audit Manager provides built-in compliance controls and allows you to customize those controls to define your data protection policies. It is designed to automatically detect violations of your defined data protection policies and prompt you to take corrective actions. With AWS Backup Audit Manager, you can continuously evaluate backup activity and generate audit reports that can help you demonstrate compliance with regulatory requirements. These reports also provide you with more visibility into your backup activities, helping you monitor your operational posture and identify failures that may need further action.

Creating a backup using AWS Backup via Terraform
Let’s try to create a backup for an EC2 instance. Below are the steps to create the AWS backup for an EC2 instance using Terraform:
1. Let’s create each of the sub-resources using Terraform code. First, we need to create a backup vault.
resource "aws_backup_vault" "example" {
  name        = "demo_backup_vault"
  kms_key_arn = "<<provide kms arn>>"

  tags = {
    Name = "backup_vault"
  }
}
2. Next, we will create the backup plan, which schedules the backup.
resource "aws_backup_plan" "example" {
  name = "demo_backup_plan"

  rule {
    rule_name         = "demo_backup_rule"
    target_vault_name = aws_backup_vault.example.name
    schedule          = "cron(0 12 * * ? *)"
    start_window      = 60
    completion_window = 300

    lifecycle {
      cold_storage_after = 0
      delete_after       = 90
    }

    copy_action {
      # ARN of a backup vault created in another Region (defined elsewhere)
      destination_vault_arn = aws_backup_vault.cross_region.arn
    }
  }
}
Here, the backup plan is scheduled according to cron(0 12 * * ? *) (AWS cron fields are minutes, hours, day-of-month, month, day-of-week, year), which fires at 12:00 (noon) UTC every day.
start_window is the number of minutes after the scheduled time within which the backup job must start; jobs are started within this window.
completion_window is the number of minutes during which your backup must complete, after which AWS Backup cancels the job.
cold_storage_after specifies how many days after creation to tier your backups to cold storage.
delete_after specifies how many days to store your backups. AWS Backup automatically deletes your backups at the end of this period to save storage costs for you.
destination_vault_arn (set inside a copy_action block) copies your backup to another Region: create a backup vault in that Region and supply its ARN.
3. Now we will create an IAM role that the backup service will assume, and attach the access policy, which is similar to an S3 bucket policy or the resource policies used by other services.
resource "aws_iam_role" "example" {
  name = "example"

  assume_role_policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": ["sts:AssumeRole"],
      "Effect": "Allow",
      "Principal": {
        "Service": ["backup.amazonaws.com"]
      }
    }
  ]
}
POLICY
}
resource "aws_iam_role_policy_attachment" "example" {
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSBackupServiceRolePolicyForBackup"
  role       = aws_iam_role.example.name
}
4. Finally, we will associate the resources carrying certain tags that are to be backed up.
resource "aws_backup_selection" "example" {
  iam_role_arn = aws_iam_role.example.arn
  name         = "demo_backup_selection"
  plan_id      = aws_backup_plan.example.id

  selection_tag {
    type  = "STRINGEQUALS"
    key   = "backup"
    value = "true"
  }
}
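For this selection to match anything, the resource must carry the tag. A minimal, hypothetical EC2 instance that would be picked up (the AMI ID and instance type are placeholders, not from the article):

```hcl
# Hypothetical instance; the AMI ID and instance type are placeholders.
resource "aws_instance" "app" {
  ami           = "ami-0123456789abcdef0" # placeholder AMI ID
  instance_type = "t3.micro"

  tags = {
    Name   = "app-server"
    backup = "true" # matches the selection_tag in aws_backup_selection.example
  }
}
```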
Once terraform apply completes, all EC2 instances, DynamoDB tables, EFS file systems, and RDS databases that have the tag key “backup” with value “true” will be backed up on the next schedule (daily at 12:00 UTC). Once done, you can see the job status in the console.
Restore a backup
After a resource has been backed up at least once, it is considered protected and is available to be restored using AWS Backup. Let’s try to understand the backup restoration process with a simple example. Lets restore an EC2 Instance Using AWS Backup Service. Follow below mentioned steps to restore an EC2 Instance :
1. Navigate to the backup vault that was selected in the backup plan and select the latest completed backup.

2. To restore the EC2 instance, select the recovery point ARN and choose Restore.

3. Restoring from the recovery point ARN brings you to a Restore backup screen that shows the configuration for the EC2 instance, based on the backed-up AMI and all the attached EBS volumes.
In the Network settings pane, accept the defaults or specify options for the Instance type, Virtual Private Cloud (VPC), Subnet, Security groups, and Instance IAM role settings.

4. Check for your restored backup job under Restore jobs in the AWS Backup console.

5. Once the Restore job is completed, navigate to the Amazon EC2 Dashboard and select Instances in the left navigation pane to see the restored EC2 instance. The EC2 instance is restored using the backup of the AMI and the attached EBS volume.

Cross-account and Cross-Region Backup
Using AWS Backup, you can back up to multiple AWS accounts on demand or automatically as part of a scheduled backup plan. Use a cross-account backup if you want to securely copy your backups to one or more AWS accounts in your organization for operational or security reasons. If your original backup is inadvertently deleted, you can copy the backup from its destination account to its source account, and then start the restore. Before you can do this, you must have two accounts that belong to the same organization in the AWS Organizations service.
In your destination account, you must create a backup vault. Then, you assign a customer managed key to encrypt backups in the destination account, and a resource-based access policy to allow AWS Backup to access the resources you would like to copy. In the source account, if your resources are encrypted with a customer managed key, you must share this customer managed key with the destination account. You can then create a backup plan and choose a destination account that is part of your organizational unit in AWS Organizations.
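The destination vault’s resource-based access policy can be sketched in Terraform as follows (the source account ID is a placeholder, and a vault in the destination account is assumed to be defined elsewhere):

```hcl
# Hypothetical sketch: allow a source account to copy backups into
# this destination vault.
resource "aws_backup_vault_policy" "cross_account" {
  backup_vault_name = aws_backup_vault.example.name # assumes a vault in the destination account

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Sid       = "AllowCopyFromSourceAccount"
      Effect    = "Allow"
      Principal = { AWS = "arn:aws:iam::111122223333:root" } # placeholder source account ID
      Action    = "backup:CopyIntoBackupVault"
      Resource  = "*"
    }]
  })
}
```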

Alternatives to AWS Backup:
1. Amazon Data Lifecycle Manager (DLM):
You can use Amazon Data Lifecycle Manager to automate the creation, retention, and deletion of EBS snapshots and EBS-backed AMIs. When you automate snapshot and AMI management, it helps you to:
- Protect valuable data by enforcing a regular backup schedule.
- Create standardized AMIs that can be refreshed at regular intervals.
- Retain backups as required by auditors or internal compliance.
- Reduce storage costs by deleting outdated backups.
- Create disaster recovery backup policies that back up data to isolated accounts.
DLM provides basic EBS volume backups and management of the associated snapshots. It is really easy to configure: give it a policy name, a tag to use, a schedule name, and a schedule, and away you go. Simple, right? Well, yes, but it is somewhat limited and there are some complications. Firstly, the tag needs to exist before you can create the policy (console only; with the CLI, the tag does not need to exist). Secondly, it runs daily or at sub-daily intervals: every 2, 3, 4, 6, 12, or 24 hours. If you want weekly or monthly backups, sorry.
DLM does not back up instances; it backs up volumes. DLM doesn’t do restores. To restore data, you are on your own and need to use either the console or the CLI.
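A DLM policy of the kind described above might look like this in Terraform (the role reference and tag values are hypothetical):

```hcl
# Hypothetical sketch: snapshot tagged EBS volumes every 24 hours
# and keep the 14 most recent snapshots.
resource "aws_dlm_lifecycle_policy" "daily_volumes" {
  description        = "Daily EBS volume snapshots"
  execution_role_arn = aws_iam_role.dlm.arn # assumes a DLM service role defined elsewhere
  state              = "ENABLED"

  policy_details {
    resource_types = ["VOLUME"]

    target_tags = {
      Snapshot = "true" # hypothetical tag; must already exist on the volumes (console)
    }

    schedule {
      name = "daily"

      create_rule {
        interval      = 24
        interval_unit = "HOURS"
        times         = ["23:45"] # snapshot window start, UTC
      }

      retain_rule {
        count = 14 # keep two weeks of daily snapshots
      }
    }
  }
}
```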
2. AWS Snapshots:
AWS snapshots (such as Amazon EBS snapshots) are cloud-based backups that help protect your data in the event of a disaster. Data is automatically backed up to Amazon S3, making it easy to restore your files if they are lost or corrupted. Snapshots can be used to create new resources or to recover lost data.
They are incremental, so only changed data is stored in each snapshot. In addition, Snapshot integrates with Amazon CloudWatch, making it easy to monitor your backup process and ensure that your data is always safe.
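Creating a one-off snapshot of a volume is a single resource in Terraform (the volume reference here is hypothetical):

```hcl
# Hypothetical sketch: a manual, incremental snapshot of an EBS volume.
resource "aws_ebs_snapshot" "manual" {
  volume_id = aws_ebs_volume.data.id # assumes a volume defined elsewhere

  tags = {
    Name = "manual-snapshot"
  }
}
```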
3. Amazon S3 Glacier:
Amazon S3 Glacier is a low-cost cloud-archive storage service that provides secure and durable storage for data archiving and online backup. To keep costs low, S3 Glacier provides three storage classes with retrieval times ranging from a few milliseconds to hours. S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive provide additional options based on how quickly you need to restore the data. With S3 Glacier, you can reliably store large or small amounts of data at significant savings compared to on-premises solutions. S3 Glacier is well suited for storing backup data with long or indefinite retention requirements and for long-term data archiving. S3 Glacier provides the following storage classes:
- S3 Glacier Instant Retrieval for archiving data that might be needed once per quarter and needs to be restored quickly (milliseconds).
- S3 Glacier Flexible Retrieval for archiving data that might infrequently need to be restored, once or twice per year, within a few hours.
- S3 Glacier Deep Archive for archiving long-term backup cycle data that might infrequently need to be restored within 12 hours.
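Tiering backup objects into these classes can be sketched with an S3 lifecycle configuration (the bucket reference and prefix are hypothetical):

```hcl
# Hypothetical sketch: transition backup objects to Glacier tiers over time.
resource "aws_s3_bucket_lifecycle_configuration" "archive" {
  bucket = aws_s3_bucket.backups.id # assumes a bucket defined elsewhere

  rule {
    id     = "tier-backups-to-glacier"
    status = "Enabled"

    filter {
      prefix = "backups/" # placeholder prefix
    }

    transition {
      days          = 90
      storage_class = "GLACIER" # S3 Glacier Flexible Retrieval
    }

    transition {
      days          = 365
      storage_class = "DEEP_ARCHIVE"
    }
  }
}
```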
With just a few clicks in the AWS Management Console, customers can create a policy that defines how frequently backups are created and how long they are stored. Customers can then assign these policies to their AWS resources, and AWS Backup automatically handles the rest by automatically scheduling backup actions for the assigned AWS resources, orchestrating across AWS services, and managing their retention period.
AWS Backup is a very user-friendly and convenient AWS service that allows customers to easily manage backups in one unified tool. It is a fully managed, automated backup service first released by Amazon Web Services in January 2019 and updated with new features since. With this policy-based service, you can automatically back up data from multiple AWS services in your cloud environments, as well as from your on-premises servers with the additional help of AWS Storage Gateway. It needs no setup beyond defining your intended backup plans, and it acts as a single pane of glass for managing all backup-related activities.