Backing Up Oracle to Amazon S3

  • Posted on: 15 December 2014
  • By: David La Motta

This blog post shows how we use Integra to back up an Oracle database to Amazon S3, running on any storage hardware. We won't zero in on a specific storage vendor, because doing so would defeat the core purpose behind Integra.  Instead, the workflow we create will apply to any storage vendor, so making this work with what you have in your IT shop is simpler than you may think.  If we don't have the storage provider you require, please contact us so we can work with you on that.

There are many different types of Oracle backups, but the one we're going to do here falls in the category of a User-Managed Open Database Backup.  Broadly speaking, we will put Oracle into hot backup mode, take a storage-level snapshot, take Oracle back out of hot backup mode, and then capture the archive logs generated after that snapshot.  All files (control, data, archive logs) will be persisted on-prem via the storage snapshot, but they will also be uploaded to Amazon S3 for cloud protection.  In terms of Integra, we use 5 Providers as you see below, and our workflow consists of 18 steps from start to finish.
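The sequence just described can be sketched as an ordered list of steps. The SQL statements are standard Oracle hot-backup commands; the snapshot and upload step names are hypothetical placeholders standing in for the provider actions, not Integra's own API.

```python
def build_backup_steps():
    """Return the ordered steps of a user-managed open (hot) backup.

    The two ALTER statements are real Oracle SQL; the other entries are
    illustrative names for the storage and cloud actions in the workflow.
    """
    return [
        "ALTER DATABASE BEGIN BACKUP",       # put datafiles into hot backup mode
        "take_storage_snapshot",             # storage-level snapshot while in backup mode
        "ALTER DATABASE END BACKUP",         # take datafiles out of backup mode
        "ALTER SYSTEM ARCHIVE LOG CURRENT",  # force a log switch so redo is archived
        "upload_to_s3",                      # placeholder for the cloud-upload action
    ]
```

The ordering is what matters: the snapshot must happen while the database is in backup mode, and the archive-log switch must happen after the snapshot so the logs needed for recovery are captured.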

The first thing you will need to do in order to get this example going is configure 5 Providers.  If you are new to Integra, you can see how easily it installs.  With that done, you can come back here and configure the Providers as you see below.  Don't forget you'll need an Integra license, so if you do not have one please request a 30-day trial license from us.  We configure:

  1. Amazon: for uploading to S3
  2. Archive: this provider allows us to remove archive logs from on-prem storage based on a retention policy.  We do this to save space, since all archive logs are now in the cloud.
  3. Oracle: a key provider, since we are using it to enter / exit hot backup mode and trigger an archive log creation
  4. Plugin: this provider allows us to execute OS commands where we are running Integra.  We use it to mount the storage, and to unmount it when we are done backing up to the cloud.
  5. Storage: this provider will take snapshots, create volumes, remove volumes, etc.
     Storage could be Tegile, Nimble, Pure, SolidFire, NetApp, EMC, Hitachi, IBM, etc. -- basically any storage array that has the ability to take a snapshot, create a container from that snapshot and expose it to the filesystem.  In essence that means any storage array, or at least any of the storage arrays you would use to store your Oracle data.

We now move to the actions for each of the providers.  As an automation architect, you will need to provide some input so Integra can operate on your behalf.  This input is mostly credentials for Oracle and S3, how many days you want to retain the archive logs locally, and so on.  I've sorted the actions by provider, so it is easier to see the things that each provider will be doing.

Integra has the ability to inline variables in any text-based input field.  This is a topic of its own that needs special coverage, but it is worth giving you a glimpse of it here because you will need it in order to make this workflow succeed.  Some actions in Integra return values, which you can pass dynamically to other actions.  So, for instance, if an action called getContainers returns an object called containerDetails and you decide to pipe that output into a new action, the receiving action can use fields from containerDetails.

For example, in order to delete a volume, we first need to find it.  But that volume is created automatically by Integra after we take a snapshot of--say--the archive logs volume.  To make things even more interesting, that volume name is dynamic--we don't even know what it is going to be, except that it will contain a few identifiers.  So, when you configure these actions, you can make use of a variable called ${name}.  Why ${name}, you may ask?  The answer, again, is that the different actions in play here return containerDetails objects, which in turn contain a field called name.  If you pipe the output of getContainers to any other action, Integra allows you to simply reference ${name} (or any other field in containerDetails, for that matter) and at runtime the value will be replaced with that of the output that was piped in.  As mentioned, the topic of variables merits a post of its own, so you can rest assured that it's coming.  In the meantime, if you are trying this example out and would like more clarification, reach out to us in our Google+ community to get some help.
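To make the ${name} mechanism concrete, here is a minimal sketch of ${field} substitution against a piped-in containerDetails object. This is our simplified model of variable inlining, not Integra's actual engine, and the container name shown is invented for the example.

```python
import re

def substitute(template, piped):
    """Replace ${field} tokens in template with values from the piped object.

    Unknown fields are left untouched, so a template stays inspectable
    even when the upstream action hasn't supplied every value.
    """
    def lookup(match):
        key = match.group(1)
        return str(piped.get(key, match.group(0)))
    return re.sub(r"\$\{(\w+)\}", lookup, template)

# A containerDetails-style object piped in from a getContainers-like action:
container_details = {"name": "oradata-snap-clone-01", "size": "200G"}
command = substitute("delete volume ${name}", container_details)
# command is now "delete volume oradata-snap-clone-01"
```

At runtime, each receiving action would resolve its templates against whatever object was piped into it, which is exactly how the delete-volume step finds the dynamically named clone.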

You will also notice in the image above that we are doing an incremental upload to S3.  As you may imagine, this means that after the initial upload, only those files that have been modified will be uploaded again.  All of the cloud providers in Integra offer this functionality, saving you bandwidth, time and money.
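Incremental upload boils down to comparing each local file against a record of what was uploaded last time. A minimal sketch, assuming size and modification time are enough to detect change (real S3 clients often compare checksums/ETags instead):

```python
def files_to_upload(local_files, remote_index):
    """Return files that are new or changed since the last upload.

    local_files:  maps filename -> (size, mtime) as found on disk now.
    remote_index: maps filename -> (size, mtime) recorded at last upload.
    """
    changed = []
    for name, (size, mtime) in local_files.items():
        if remote_index.get(name) != (size, mtime):
            changed.append(name)
    return sorted(changed)
```

Only the returned files are transferred, which is where the bandwidth, time, and cost savings come from on every run after the first.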

Now that all of our actions are configured and saved, it is time to put them to use.  The next image shows what the end-to-end workflow for backing up Oracle to Amazon S3 looks like.  There are a couple of steps that we could have run in parallel, such as the upload to S3.  If you have bandwidth to spare, steps 12 and 13 can be given the same priority.  This means when the workflow runs, you will be uploading all data files and the archive logs simultaneously to S3.
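Giving steps 12 and 13 the same priority amounts to launching both uploads concurrently. A minimal sketch with Python's thread pool, where upload is a stand-in for the real S3 transfer of one volume's files:

```python
from concurrent.futures import ThreadPoolExecutor

def upload(volume):
    # Stand-in for the real S3 upload of one volume's files.
    return f"uploaded {volume}"

# Same priority means both uploads run at the same time rather than in sequence.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(upload, ["datafiles", "archive-logs"]))
```

With spare bandwidth, the wall-clock time of the workflow drops to roughly the longer of the two transfers instead of their sum.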

We talked about pipes and variables just a few moments ago.  In the picture above, this becomes evident.  Steps 2 and 5 are creating brand new storage snapshots of both Oracle's data and logs volumes.  We pipe that information into steps 6 and 7, respectively, to create new volumes.  It is steps 6 and 7 that demultiplex their outputs to several other actions:  step 6 sends its output to steps 8, 10 and 15; and step 7 sends its output to steps 9, 11 and 16.  Pipes and dynamic values are precisely a couple of the features that make Integra so flexible and powerful; without them, dynamic values would be impossible to use and the platform would simply be too rigid to be of any use.

The clean-storage workflow is optional, but we show it in this example so you can see what executing a workflow on failure looks like.  If things were to fail, we have the ability to run other workflows to clean up after ourselves, so to speak.  The clean-storage workflow does precisely that, unmounting the cloned volumes from the filesystem and deleting the volumes that were mounted earlier in the workflow.
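The run-on-failure behavior can be modeled as a try/except around the ordered steps: if any step raises, the cleanup actions run before the failure is reported. This is our simplification for illustration, not Integra's actual engine.

```python
def run_with_cleanup(steps, cleanup):
    """Run workflow steps in order; on any failure, run the cleanup
    actions (e.g. unmount and delete cloned volumes) and re-raise."""
    try:
        for step in steps:
            step()
    except Exception:
        for undo in cleanup:
            undo()
        raise
```

The important property is that the cleanup runs even on a mid-workflow failure, so a half-finished backup never leaves stray clones mounted on the host.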

Following Integra's 4-step tenets (Providers, Actions, Workflows and Schedules), the last bit remaining is to create the schedule that will run the protect-oracle workflow.  The screen below shows what our schedule looks like, set to run every day at 1:00 AM until December 31st, 2014.
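The cadence of that schedule (daily at 1:00 AM through December 31st, 2014) can be expressed as a simple generator. This is just an illustration of the run times, not Integra's scheduler.

```python
from datetime import datetime, timedelta

def daily_runs(start, end, hour=1):
    """Yield each daily run time at the given hour from start through end."""
    run = start.replace(hour=hour, minute=0, second=0, microsecond=0)
    if run < start:
        run += timedelta(days=1)  # the first run today has already passed
    while run <= end:
        yield run
        run += timedelta(days=1)

# Creating the schedule on 15 December 2014, running through year end:
runs = list(daily_runs(datetime(2014, 12, 15), datetime(2014, 12, 31, 23, 59)))
```

From the posting date to the end of the year, that works out to seventeen nightly runs of the protect-oracle workflow.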

That's it!  After the workflow is executed, your Oracle files will be sitting pretty in your Amazon S3 bucket; data files and archive logs are now in the cloud.

Last but not least, we want to show you what the Integra mobile self-service portal looks like.  Being able to tap into your workflows from anywhere is invaluable; now, with Integra you can run this backup workflow from your mobile device while you are away from the office.  After selecting the workflow, you can hit the Execute button in the upper-right corner of the screen.  As you may suspect, this kicks off the workflow and you're done.  The mobile self-service portal has a rich roadmap, so watch this space for exciting changes that will only make your life as an automation architect much more pleasant.

We hope you have enjoyed this quick post on backing up Oracle to Amazon S3.  You can make use of any of the other application providers Integra currently offers and follow the same example to protect those applications.  This means you could protect MySQL, Sybase, DB2, MaxDB or Postgres instead of Oracle.  You could even add Azure instead of Amazon S3 if that is your cloud storage of choice, or use both for double cloud protection.   Regardless, if there is an Integra provider you don't see, don't fret!  Contact us and let us know.