How to cache ssm.getParameter calls in a Lambda function

SSM Parameter Store

Storing secrets in the Systems Manager Parameter Store enables you to decouple the application logic from the secrets. This means the person, or process,
doing deployments does not need to know the value of the secret, only the name of the parameter. This allows securely storing things like access keys to
third-party systems, database passwords, and encryption keys. With the correct permissions, the Lambda function can read the value of the parameter by making
a getParameter call to the SSM service.

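const AWS = require("aws-sdk");
const ssm = new AWS.SSM();

// reading a secret value from the SSM Parameter Store
// (await requires this to run inside an async function)
const secretValue = await ssm.getParameter({
	Name: process.env.PARAMETER, // name of the parameter, not its value
	WithDecryption: true, // return the decrypted value of a SecureString
}).promise();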

Environment variables are a similar construct that allows passing arbitrary values to a function. But everybody who can deploy the function by definition
needs to know those values, which might not be acceptable for secrets. Separate storage for secrets allows better decoupling.

// a secret is passed as an environment variable
const secretValue = process.env.SECRET_VALUE

But while environment variables scale with the function, SSM does not. It has a maximum throughput, defined in transactions per second (TPS), which is only
40 by default. Since Lambda can handle traffic several orders of magnitude higher than that, this can be a bottleneck.

AWS reacted like a good salesman, adding an option to increase the limit
to 1000 TPS for a higher cost. This should be enough for almost all use-cases.

But why pay for something that you can solve yourself?

How to add parameters

To add a parameter, go to the AWS Systems Manager service and select the Parameter Store item under Application Management. This lists all the parameters defined
in the region and you can add new ones.

The Parameter Store is a simple key-value store. Each parameter has a name and an associated value. The SecureString type is a String encrypted with KMS. For
this type, the WithDecryption parameter of getParameter controls whether the call returns the decrypted value or only the ciphertext.

The examples here use a secret with the name test-secret that is a SecureString.

(Figure: An SSM parameter)

By default, the Lambda function cannot read this value. Its execution role needs a policy that allows the ssm:GetParameter action on the parameter.
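A policy granting that access could look roughly like the sketch below. It builds the policy document as a plain JavaScript object so it can be printed as JSON. The helper names parameterArn and readParameterPolicy, the region, and the account ID are hypothetical placeholders, not values from this article. A SecureString encrypted with a customer-managed KMS key would likely also need kms:Decrypt on that key, which is not shown here.

```javascript
// Build the ARN of a parameter. Names with a leading slash (for example
// "/app/db") appear in the ARN without doubling the slash.
const parameterArn = (region, accountId, name) =>
	`arn:aws:ssm:${region}:${accountId}:parameter/${name.replace(/^\//, "")}`;

// Minimal IAM policy document that allows reading a single parameter.
const readParameterPolicy = (arn) => ({
	Version: "2012-10-17",
	Statement: [{
		Effect: "Allow",
		Action: "ssm:GetParameter",
		Resource: arn,
	}],
});

// Hypothetical region and account ID; substitute your own.
const examplePolicy = readParameterPolicy(
	parameterArn("us-east-1", "123456789012", "test-secret")
);
console.log(JSON.stringify(examplePolicy, null, 2));
```

Scoping the Resource to a single parameter ARN keeps the function from reading other secrets in the same account.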

Finally, the function needs to know the name of the parameter. Passing the name in an environment variable (PARAMETER in the code here) avoids the need to
hardcode this value and allows deploying identical environments.

This is all the function needs to access the secret. Whenever the value is needed, just read the name from the environment, then issue a getParameter call:

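const AWS = require("aws-sdk");
const ssm = new AWS.SSM();

module.exports.handler = async (event) => {
	// one call to the SSM service per invocation
	const response = await ssm.getParameter({
		Name: process.env.PARAMETER,
		WithDecryption: true,
	}).promise();
	// the value itself is nested inside the response
	const secretValue = response.Parameter.Value;
};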

Caching SSM calls

The above code issues the ssm.getParameter call every time the function is run. This limits the scalability of the function to the maximum TPS of the
SSM service, which is by default 40. Let’s see how we could do better and issue a lot fewer requests to the service!

Lambda reuses function instances. The first request starts the code in a new instance, and subsequent calls can reach the same running instance. This enables
per-instance storage, as all the variables that are declared outside the handler function are still accessible during the next call.

const variable = ""; // <- shared between calls

module.exports.handler = async (event) => {
	const localVariable = ""; // <- per-request
};
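To make the difference between the two scopes visible, here is a small sketch that counts invocations. The name invocationCount is made up for illustration, and the handler is called directly here to simulate two requests reaching the same instance.

```javascript
// Module scope: initialized once, when the instance starts.
let invocationCount = 0;

// Handler scope: runs again for every request.
const handler = async (event) => {
	invocationCount += 1;
	return { invocationCount };
};

// Simulate two requests that land on the same instance.
const simulateTwoCalls = async () => {
	const first = await handler({});
	const second = await handler({});
	return [first.invocationCount, second.invocationCount];
};
```

On a freshly started instance, simulateTwoCalls resolves to [1, 2]. A new instance would start counting from zero again, which is why this storage is per instance and not global.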

This allows an easy caching mechanism. Initialize an object that is shared between the calls, and in the handler function use it to interface with the SSM.
The cache object can then store the last retrieved value and can decide when to refresh it.

Here’s a caching function that implements this:

const cacheSsmGetParameter = (params, cacheTime) => {
	let lastRefreshed = undefined;
	let lastResult = undefined;
	let queue = Promise.resolve();
	return () => {
		// serialize async calls
		const res = queue.then(async () => {
			const currentTime = new Date().getTime();
			// check if cache is fresh enough
			if (lastResult === undefined ||
				lastRefreshed + cacheTime < currentTime) {
				// refresh the value
				lastResult = await ssm.getParameter(params).promise();
				lastRefreshed = currentTime;
			}
			return lastResult;
		});
		queue = res.catch(() => {});
		return res;
	};
};

This builds on the async function serializer pattern: although fetching the parameter is asynchronous and returns a Promise, the queue lets only one call
run at a time. When the value has to be fetched from SSM, that happens only once, even if multiple calls are waiting for the result.
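The effect of the serialization can be checked without touching AWS. The sketch below repeats the caching logic from cacheSsmGetParameter but takes the client as an argument, and uses a stub in place of the real SSM client. The names makeFakeSsm, cacheGetParameter, and demo are hypothetical and exist only for this illustration.

```javascript
// Stub standing in for the AWS SDK v2 SSM client; it only counts calls.
const makeFakeSsm = () => {
	const fake = { calls: 0 };
	fake.getParameter = (params) => ({
		promise: async () => {
			fake.calls += 1;
			return { Parameter: { Name: params.Name, Value: `value-${fake.calls}` } };
		},
	});
	return fake;
};

// Same caching logic as above, with the client injected.
const cacheGetParameter = (client, params, cacheTime) => {
	let lastRefreshed;
	let lastResult;
	let queue = Promise.resolve();
	return () => {
		// each call waits for the previous one to settle
		const res = queue.then(async () => {
			const currentTime = Date.now();
			if (lastResult === undefined || lastRefreshed + cacheTime < currentTime) {
				lastResult = await client.getParameter(params).promise();
				lastRefreshed = currentTime;
			}
			return lastResult;
		});
		// keep the queue alive even if a fetch fails
		queue = res.catch(() => {});
		return res;
	};
};

const demo = async () => {
	const fake = makeFakeSsm();
	const getParam = cacheGetParameter(fake, { Name: "test-secret" }, 30 * 1000);
	// three callers arrive before the first fetch has finished
	const results = await Promise.all([getParam(), getParam(), getParam()]);
	return { calls: fake.calls, values: results.map((r) => r.Parameter.Value) };
};
```

With three concurrent callers, the stub is hit only once and all three receive the same cached result. Without the queue, each caller would see an empty cache and issue its own request.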

The function returns another function that fetches the parameter’s value:

const cacheSsmGetParameter = (params, cacheTime) => {
	// ...
};

const getParam = cacheSsmGetParameter({
	Name: process.env.PARAMETER,
	WithDecryption: true,
}, 30 * 1000); // 30 seconds

module.exports.handler = async (event) => {
	const param = await getParam();
	return param;
};

Cache time

Cache time is a tradeoff between freshness and efficiency. The longer a value stays in the cache, the longer the function takes to pick up changes to the
parameter, but the fewer calls it makes to SSM.

Also, don’t forget that this caching is per instance. While the Lambda service reuses instances whenever possible, it also liberally starts new ones, and each
of those comes with its own separate memory. This makes cache time calculations dependent on the number of instances as well as the function invocations per second.

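Let’s say a huge application uses 1000 instances at a given time. This should be a reasonable upper limit. If every instance refreshes its cache once per
cache period, the instances together make 1000 / cacheTime calls per second. Staying within the 40 TPS limit of the standard throughput of SSM then requires
a cache time of at least 1000 / 40 = 25 seconds.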
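The same arithmetic as a small helper, in case you want to plug in your own numbers. The function name minCacheTimeSeconds is made up for this sketch, and it assumes every instance refreshes exactly once per cache period, which is a simplification.

```javascript
// Smallest cache time (in seconds) that keeps `instances` function
// instances, each refreshing once per cache period, within `tpsLimit`.
const minCacheTimeSeconds = (instances, tpsLimit) => instances / tpsLimit;

console.log(minCacheTimeSeconds(1000, 40)); // 25
console.log(minCacheTimeSeconds(1000, 1000)); // 1
```

Cold starts add extra calls on top of this, since a new instance always fetches the value once, so it makes sense to leave some headroom above the computed minimum.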

Based on this back-of-the-envelope calculation, a cache time between 30 seconds and 5 minutes should be a good value.

Source: Advanced Web Machinery
