background-shape
feature-image

As organizations continue to migrate their applications to the cloud, serverless architectures have become increasingly popular due to their scalability, cost-effectiveness, and ease of deployment. However, with the benefits of serverless comes the challenge of ensuring data consistency and minimizing data loss.

One solution to this challenge is the Outbox Pattern. This pattern ensures that any changes made to the system are recorded in a reliable data store, making it possible to recover data in the event of an outage. In this blog post, we will explore how the Outbox Pattern can be used to eliminate data loss in AWS serverless architectures.

What is the Outbox Pattern?

The Outbox Pattern is a design pattern used to ensure data consistency in distributed systems. It involves using a reliable messaging system to capture changes made to the system, and recording these changes in an outbox table within a database. This table acts as a buffer between the application and the messaging system, ensuring that any messages that fail to be delivered are not lost.

When a change is made to the system, such as creating a new item or updating an existing one, a message is sent to the messaging system. The message contains details of the change, such as the type of change and the affected data. The messaging system then delivers the message to the appropriate destination, such as another service or a database.

If the message fails to be delivered, it is stored in the outbox table. The application can then periodically check the outbox table for any messages that need to be re-sent, ensuring that no data is lost in the event of an outage.

Implementing the Outbox Pattern in AWS

To implement the Outbox Pattern in AWS, we will use Lambda functions to handle the creation and updating of items, and Amazon Simple Queue Service (SQS) to capture changes made to the system.

We will also use Amazon DynamoDB to store the outbox table, which will be used to buffer any messages that fail to be delivered. Finally, we will use the AWS SDK for JavaScript to interact with DynamoDB and SQS.

Let’s take a look at the code:

Creating the Outbox Table

First, we will create the outbox table. This table will have three columns: id, data, and processed. The id column will contain a unique identifier for each message, while the data column will contain the message payload. The processed column will indicate whether the message has been successfully processed or not.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
const AWS = require('aws-sdk');
const dynamodb = new AWS.DynamoDB();

const params = {
    TableName: 'outbox',
    KeySchema: [
        { AttributeName: 'id', KeyType: 'HASH' }
    ],
    AttributeDefinitions: [
        { AttributeName: 'id', AttributeType: 'S' },
        { AttributeName: 'processed', AttributeType: 'BOOL' }
    ],
    ProvisionedThroughput: {
        ReadCapacityUnits: 1,
        WriteCapacityUnits: 1
    },
    GlobalSecondaryIndexes: [
        {
            IndexName: 'processed-index',
            KeySchema: [
                { AttributeName: 'processed', KeyType: 'HASH' }
            ],
            Projection: {
                ProjectionType: 'ALL'
            },
            ProvisionedThroughput: {
                ReadCapacityUnits: 1,
                WriteCapacityUnits: 1
            }
        }
    ]
};

dynamodb.createTable(params, (err, data) => {
    if (err) {
        console.error(err);
    } else {
        console.log(data);
    }
});

This code uses the AWS SDK to create the outbox table in DynamoDB. We define the table name, key schema, attribute definitions, and provisioned throughput. We also create a global secondary index on the processed column, which will be used to query the table for messages that need to be re-sent.

Capturing Changes with SQS

Next, we will use SQS to capture changes made to the system. When a change is made, such as creating a new item or updating an existing one, we will send a message to an SQS queue.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
const AWS = require('aws-sdk');
const sqs = new AWS.SQS();

const params = {
    QueueUrl: 'https://sqs.us-east-1.amazonaws.com/123456789012/outbox-queue'
};

const data = {
    type: 'create',
    id: '123456',
    name: 'John Doe'
};

sqs.sendMessage({ ...params, MessageBody: JSON.stringify(data) }, (err, data) => {
    if (err) {
        console.error(err);
    } else {
        console.log(data);
    }
});

This code uses the AWS SDK to send a message to an SQS queue. We define the queue URL and the message payload, which contains details of the change made to the system.

Handling Messages with Lambda

Finally, we will use Lambda functions to handle the messages captured by SQS. When a message is received, the Lambda function will process the message, update the system accordingly, and mark the message as processed in the outbox table.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
const AWS = require('aws-sdk');
const dynamodb = new AWS.DynamoDB();
const sqs = new AWS.SQS();

exports.handler = async (event) => {
    const message = JSON.parse(event.Records[0].body);
    const params = {
        TableName: 'outbox',
        Item: {
            id: { S: event.Records[0].messageId },
            data: { S: event.Records[0].body },
            processed: { BOOL: false }
        }
    };
    try {
        await dynamodb.putItem(params).promise();
        console.log(`Saved message ${event.Records[0].messageId} to outbox`);
        // Process message and update system accordingly
        await processMessage(message);
        // Mark message as processed in outbox table
        const updateParams = {
            TableName: 'outbox',
            Key: { id: { S: event.Records[0].messageId } },
            UpdateExpression: 'SET processed = :processed',
            ExpressionAttributeValues: { ':processed': { BOOL: true } }
        };
        await dynamodb.updateItem(updateParams).promise();
        console.log(`Marked message ${event.Records[0].messageId} as processed`);
    } catch (err) {
        console.error(err);
        // If message fails to be processed, re-queue it
        await sqs.sendMessage({ QueueUrl: params.QueueUrl, MessageBody: event.Records[0].body }).promise();
        console.log(`Re-queued message ${event.Records[0].messageId}`);
    }
};

This code defines a Lambda function that processes messages received from the SQS queue. We parse the message payload, store the message in the outbox table, and process the message. If processing is successful, we mark the message as processed in the outbox table. If processing fails, we re-queue the message.

Conclusion

The Outbox Pattern provides a reliable way to handle data consistency and minimize data loss in distributed systems. By capturing changes made to the system in a messaging system and buffering any failed messages in an outbox table, we can ensure that data is not lost in the event of an outage.

In this blog post, we have explored how the Outbox Pattern can be implemented in AWS serverless architectures using Lambda functions, SQS, and DynamoDB. With this implementation, we can eliminate data loss and ensure reliable data consistency in our applications.