Backblaze B2 and the S3 Compatible API on Cloudflare

Backblaze B2 and the S3 Compatible API on Cloudflare

In May 2020, Backblaze, a founding Bandwidth Alliance partner announced S3 compatible APIs for their B2 Cloud Storage service. As a refresher, the Bandwidth Alliance is a group of forward-thinking cloud and networking companies that are committed to discounting or waiving data transfer fees for shared customers. Backblaze has been a proud partner since 2018. We are excited to see Backblaze introduce a new level of compatibility in their Cloud Storage service.

History of the S3 API

First let’s dive into the history of the S3 API and why it’s important for Cloudflare users.

Prior to 2006, before the mass migration to the Cloud, if you wanted to store content for your company you needed to build your own expensive and highly available storage platform that was large enough to store all your existing content with enough growth headroom for your business. AWS launched to help eliminate this model by renting their physical computing and storage infrastructure.

Amazon Simple Storage Service (S3) led the market by offering a scalable and resilient tool for storing unlimited amounts of data without building it yourself. It could be integrated into any application but there was one catch: you couldn’t use any existing standard such as WebDAV, FTP or SMB: your application needed to interface with Amazon’s bespoke S3 API.

Fast forward to 2020 and the storage provider landscape has become highly competitive with many providers capable of providing petabyte (and exabyte) scale content storage at extremely low cost-per-gigabyte. However, Amazon S3 has remained a dominant player despite heavy competition and not being the most cost-effective player.

The broad adoption of the S3 API by developers in their codebases and internal systems has transformed the S3 API into what WebDAV promised us to be: de facto standard HTTP File Storage API.

Engineering costs of changing storage providers

With many code bases and legacy applications being entrenched in the S3 API, the process to switch to a more cost-effective storage provider is not so easy. Companies need to consider the cost of engineer time programming a new storage API while also physically moving their data.

This engineering overhead has led many storage providers to natively support the S3 API, leveling the playing field and allowing companies to focus on picking the most cost-effective provider.

First-mile bandwidth costs and the Bandwidth Alliance

Cloudflare caches content in Points of Presence located in more than 200 cities around the world. This cached content is then handed to your Internet service provider (ISP) over low cost and often free Internet exchange connections in the same facility using mutual fibre optic cables. This cost saving is fairly well understood as the benefit of content delivery networks and has become highly commoditized.

What is less well understood is the first-mile cost of moving data from a storage provider to the content delivery network. Typically storage providers expect traffic to route via the Internet and will charge the consumer per-gigabyte of data transmitted. This is not the case for Cloudflare as we also share facilities and mutual fibre optic cables with many storage providers.

Backblaze B2 and the S3 Compatible API on Cloudflare

These shared interconnects created an opportunity to waive the cost of first-mile bandwidth between Cloudflare and many providers and is what prompted us to create the Bandwidth Alliance.

Media and entertainment companies serving user-generated content have a continuous supply of new content being moved over the first-mile from the storage provider to the content delivery network. The first-mile bandwidth cost adds up and using a Bandwidth Alliance partner such Backblaze can entirely eliminate it.

Using the S3 API in Cloudflare Workers

The Solutions Engineering team at Cloudflare is tasked with providing strategic technical guidance for our enterprise customers.

It’s not uncommon for developers to connect Cloudflare’s global network directly to their storage provider and directly serve content such as live and on-demand video without an intermediate web server.

For security purposes engineers typically use Cloudflare Workers to sign each uncached request using the S3 API. Cloudflare Workers allows anyone to deploy code to our global network of over 200+ Points of Presence in seconds and is built on top of Service Workers.

We’ve tested Backblaze B2’s S3 Compatible API in Cloudflare Workers using the same code tested for Amazon S3 buckets and it works perfectly by changing the target endpoint.

Creating a S3 Compatible Worker script

Here’s how it is done using Cloudflare Worker’s CLI tool Wrangler:

Generate a new project in Wrangler using a template intended for use with Amazon S3:

wrangler generate <projectname> https://github.com/obezuk/worker-signed-s3-template

This template uses aws4fetch. A fast, lightweight implementation of an S3 Compatible signing library that is commonly used in Service Worker environments like Cloudflare Workers.

The template creates an index.js file with a standard request signing implementation:

import { AwsClient } from 'aws4fetch'

const aws = new AwsClient({
    "accessKeyId": AWS_ACCESS_KEY_ID,
    "secretAccessKey": AWS_SECRET_ACCESS_KEY,
    "region": AWS_DEFAULT_REGION
});

addEventListener('fetch', function(event) {
    event.respondWith(handleRequest(event.request))
});

async function handleRequest(request) {
    var url = new URL(request.url);
    url.hostname = AWS_S3_BUCKET;
    var signedRequest = await aws.sign(url);
    return await fetch(signedRequest, { "cf": { "cacheEverything": true } });
}

Environment Variables

Modify your wrangler.toml file to use your Backblaze B2 API Key ID and Secret:

[env.dev]
vars = { AWS_ACCESS_KEY_ID = "<BACKBLAZE B2 keyId>", 
AWS_SECRET_ACCESS_KEY = "<BACKBLAZE B2 secret>", 
AWS_DEFAULT_REGION = "", 
AWS_S3_BUCKET = "<BACKBLAZE B2 bucketName>.<BACKBLAZE B2 S3 Endpoint>"}

AWS_S3_BUCKET environment variable will be the combination of your bucket name, period and S3 Endpoint. For a Backblaze B2 Bucket named example-bucket and S3 Endpoint s3.us-west-002.backblazeb2.com use example-bucket.s3.us-west-002.backblazeb2.com

AWS_DEFAULT_REGION environment variable is interpreted from your S3 Endpoint. I use us-west-002.

We recommend using Secret Environment variables to store your AWS_SECRET_ACCESS_KEY content when using this script in production.

Preview your Cloudflare Worker

Next run wrangler preview --env dev to enter a preview window of your Worker script. My bucket contained a static website containing adaptive streaming video content stored in a Backblaze B2 bucket.

Backblaze B2 and the S3 Compatible API on Cloudflare
Note: We permit caching of third party video content only for enterprise domains. Free/Pro/Biz users wanting to serve video content via Cloudflare may use Stream which delivers an end-to-end video delivery service.

Backblaze B2’s compatibility for the S3 API is an exciting update that has made their storage platform highly compatible with existing code bases and legacy systems. And, as a special offer to Cloudflare blog readers, Backblaze will pay the migration costs for transferring your data from S3 to Backblaze B2 (click here for more detail). With the cost of migration covered and compatibility for your existing workflows, it is now easier than ever to switch to a Bandwidth Alliance partner and save on first-mile costs. By doing so, you can slash your cloud bills, gain flexibility, and make no compromises to your performance.

To learn more, join us on May 14th for a webinar focused on getting you ultra fast worldwide content delivery.

Source: Cloudflare