DynamoDB local in Docker

dev.to - Jan 25

Here is a quick post showing how to run DynamoDB locally when you want to test without connecting to the cloud. As I mentioned in a previous blog post, DynamoDB Local stores the table items in a SQLite database. Yes, a NoSQL database stored in a SQL one... which says a lot about the power of SQL.

I'll run DynamoDB Local in a Docker container, and define aliases to access it with AWS CLI and SQLite:

# Start DynamoDB local with a SQLite file (not in memory)
docker run --rm -d --name dynamodb -p 8000:8000 amazon/dynamodb-local -jar DynamoDBLocal.jar -sharedDb -dbPath /home/dynamodblocal

# alias to run `sqlite3` on this file
alias sql='docker exec -it dynamodb \
 sqlite3 /home/dynamodblocal/shared-local-instance.db \
'

# alias to run the AWS CLI linked to the DynamoDB endpoint, exposing the current directory as /aws (the container's working directory)
alias aws='docker run --rm -it --link dynamodb:dynamodb -v $PWD:/aws \
 -e AWS_DEFAULT_REGION=xx -e AWS_ACCESS_KEY_ID=xx -e AWS_SECRET_ACCESS_KEY=xx \
 public.ecr.aws/aws-cli/aws-cli --endpoint-url http://dynamodb:8000 \
'
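Since `docker run -d` returns as soon as the container is created, the first AWS CLI call may hit a port that is not listening yet. Here is a minimal sketch of a readiness check. The `wait_for_port` helper is hypothetical (it is not part of Docker or the AWS CLI) and relies on bash's `/dev/tcp` pseudo-device, so it assumes bash:

```shell
# Hypothetical helper: wait until host:port accepts TCP connections.
# Arguments: host, port, number of attempts (default 30, one per second).
wait_for_port() {
  local host=$1 port=$2 tries=${3:-30} i=0
  while [ "$i" -lt "$tries" ]; do
    # The subshell opens a connection and closes it on exit;
    # status 0 means something is listening on the port.
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

For example, `wait_for_port localhost 8000 && aws dynamodb list-tables` only calls the CLI once the published port answers.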

Create table

I create a table from the create-table example, and query the internal SQLite DB:

aws dynamodb create-table \
    --table-name MusicCollection \
    --attribute-definitions AttributeName=Artist,AttributeType=S AttributeName=SongTitle,AttributeType=S \
    --key-schema AttributeName=Artist,KeyType=HASH AttributeName=SongTitle,KeyType=RANGE \
    --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5 \
    --tags Key=Owner,Value=blueTeam

sql -line -echo "select * from dm;"


Insert Items

I insert some data from the batch-write-item example:

cat > request-items.json <<'JSON'
{
    "MusicCollection": [
        {
            "PutRequest": {
                "Item": {
                    "Artist": {"S": "No One You Know"},
                    "SongTitle": {"S": "Call Me Today"},
                    "AlbumTitle": {"S": "Somewhat Famous"}
                }
            }
        },
        {
            "PutRequest": {
                "Item": {
                    "Artist": {"S": "Acme Band"},
                    "SongTitle": {"S": "Happy Day"},
                    "AlbumTitle": {"S": "Songs About Life"}
                }
            }
        },
        {
            "PutRequest": {
                "Item": {
                    "Artist": {"S": "No One You Know"},
                    "SongTitle": {"S": "Scared of My Shadow"},
                    "AlbumTitle": {"S": "Blue Sky Blues"}
                }
            }
        }
    ]
}
JSON

aws dynamodb batch-write-item \
    --request-items file://request-items.json \
    --return-consumed-capacity INDEXES \
    --return-item-collection-metrics SIZE
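A batch write is not all-or-nothing: the response carries an `UnprocessedItems` map that is empty only when every request was applied. Here is a small sketch that checks it. The `batch_write_checked` function is a hypothetical wrapper of mine, it assumes the `aws` alias above is defined in the same interactive shell, and it strips whitespace because the alias runs Docker with `-t`, which can add carriage returns to the output:

```shell
# Hypothetical wrapper: run batch-write-item on a request file and
# fail if DynamoDB reports items it did not process.
batch_write_checked() {
  local leftover
  leftover=$(aws dynamodb batch-write-item \
      --request-items "file://$1" \
      --query 'UnprocessedItems' --output json | tr -d ' \r\n\t')
  if [ "$leftover" = "{}" ]; then
    echo "all items written"
  else
    echo "unprocessed: $leftover" >&2
    return 1
  fi
}
```

Usage: `batch_write_checked request-items.json`.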


Here is what is stored in the SQLite table:
[Screenshot: contents of the SQLite table after the batch write]

Transact Write

Here is an update of 'Happy Day' and a delete of 'Call Me Today', as in the transact-write-items example:


cat > transact-items.json <<'JSON'
[
    {
        "Update": {
            "Key": {
                "Artist": {"S": "Acme Band"},
                "SongTitle": {"S": "Happy Day"}
            },
            "UpdateExpression": "SET AlbumTitle = :newval",
            "ExpressionAttributeValues": {
                ":newval": {"S": "Updated Album Title"}
            },
            "TableName": "MusicCollection",
            "ConditionExpression": "attribute_not_exists(Rating)"
        }
    },
    {
        "Delete": {
            "Key": {
                "Artist": {"S": "No One You Know"},
                "SongTitle": {"S": "Call Me Today"}
            },
            "TableName": "MusicCollection",
            "ConditionExpression": "attribute_not_exists(Rating)"
        }
    }
]

JSON

aws dynamodb transact-write-items \
 --transact-items file://transact-items.json \
 --return-consumed-capacity TOTAL \
 --return-item-collection-metrics SIZE



Scan

I have 2 items remaining: 'Scared of My Shadow', which has not been touched, and 'Happy Day', whose album title has been updated to 'Updated Album Title':

aws dynamodb scan --table-name MusicCollection --output table

[Screenshot: scan output for MusicCollection]

Testing failures

I put back the initial data with 'Call Me Today', 'Happy Day' and 'Scared of My Shadow':

aws dynamodb batch-write-item --request-items file://request-items.json

I have these 3 items:

aws dynamodb scan --table-name MusicCollection --output text | grep ALBUMTITLE

ALBUMTITLE      Songs About Life
ALBUMTITLE      Somewhat Famous
ALBUMTITLE      Blue Sky Blues

To simulate something going wrong, I lock the row for 'Call Me Today' ('Somewhat Famous'), which is the one that my Transact Write should delete:

sql
 begin transaction;
 select rangeKey from MusicCollection;
 update MusicCollection set rangeKey='x' where rangeKey like '%Today%';
 select rangeKey from MusicCollection;
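As an aside, SQLite has no row locks: an open write transaction locks the whole database file against other writers, which is presumably what the Transact Write runs into while this session stays open. The same blocking can be reproduced on a scratch SQLite file, without DynamoDB Local. This is only a sketch under assumptions: `lock_demo` and the table `t` are made up for the illustration, a `sqlite3` binary with the `.shell` dot-command must be on the PATH, and the one-second sleep is a crude way to let the first session take its lock:

```shell
# Hypothetical demo: one sqlite3 session holds a write transaction open
# while a second session tries to write to the same file.
lock_demo() {
  local db
  db=$(mktemp)
  sqlite3 "$db" "create table t(rangeKey text); insert into t values('Call Me Today');"

  # Session 1 (background): update a row, keep the transaction open
  # for 2 seconds, then roll back.
  sqlite3 "$db" <<'SQL' &
begin transaction;
update t set rangeKey='x' where rangeKey like '%Today%';
.shell sleep 2
rollback;
SQL

  sleep 1   # give session 1 time to take the write lock

  # Session 2: this write should be refused while session 1 holds the lock.
  sqlite3 "$db" "delete from t;" 2>&1 || true

  wait      # let session 1 finish its rollback
  sqlite3 "$db" "select rangeKey from t;"
  rm -f "$db"
}
```

Calling `lock_demo` should print an error mentioning `database is locked` (the exact wording depends on the SQLite version) followed by the untouched `Call Me Today` row.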

I try my Transact Write:

aws dynamodb transact-write-items --transact-items file://transact-items.json

It fails with:

An error occurred (InternalFailure) when calling the TransactWriteItems operation (reached max retries: 2): The request processing has failed because of an unknown error, exception or failure.

and all is back to normal:

aws dynamodb scan --table-name MusicCollection --output text | grep ALBUMTITLE

ALBUMTITLE      Songs About Life
ALBUMTITLE      Somewhat Famous
ALBUMTITLE      Blue Sky Blues
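Rather than comparing the two scans by eye, the before/after check can be scripted. Here is a minimal sketch, assuming the `aws` alias above is defined in the same interactive shell (aliases are not expanded in script files by default). Both function names are hypothetical, and the `tr -d '\r'` is there because the alias runs Docker with `-t`, which can add carriage returns:

```shell
# Hypothetical helper: album titles currently in a table, one per line, sorted.
album_titles() {
  aws dynamodb scan --table-name "$1" \
      --query 'Items[].AlbumTitle.S' --output text |
    tr -d '\r' | tr '\t' '\n' | sort
}

# Hypothetical check: run a transact write and succeed only if the
# album titles are the same before and after it.
transact_left_table_unchanged() {
  local table=$1 file=$2 before after
  before=$(album_titles "$table")
  # The transaction is expected to fail here, so its status is ignored.
  aws dynamodb transact-write-items \
      --transact-items "file://$file" >/dev/null 2>&1 || true
  after=$(album_titles "$table")
  [ "$before" = "$after" ]
}
```

Usage: `transact_left_table_unchanged MusicCollection transact-items.json && echo "nothing changed"`. It only compares album titles, which is enough for this test case but not a full item comparison.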

This is a simple test of atomicity. But it runs on software that is different from the real DynamoDB.
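Another failure that can be provoked without touching SQLite is a failed condition: both operations in transact-items.json require `attribute_not_exists(Rating)`, so giving one of the items a Rating should cancel the whole transaction. On the real service this is reported as a TransactionCanceledException; I have not verified what DynamoDB Local returns, so take this as a sketch. The `add_rating` function is a hypothetical helper and assumes the `aws` alias above is defined in the same interactive shell:

```shell
# Hypothetical helper: add a numeric Rating attribute to one item.
# Arguments: artist, song title, rating.
add_rating() {
  aws dynamodb update-item \
    --table-name MusicCollection \
    --key "{\"Artist\":{\"S\":\"$1\"},\"SongTitle\":{\"S\":\"$2\"}}" \
    --update-expression 'SET Rating = :r' \
    --expression-attribute-values "{\":r\":{\"N\":\"$3\"}}"
}
```

After `add_rating "Acme Band" "Happy Day" 5`, running the same `transact-write-items` command again should leave 'Call Me Today' in place, because the delete is in the same transaction as the update whose condition no longer holds.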

Can we test race conditions?

Why did I do that? The Transact API was subject to discussions:

DynamoDB is a closed source managed service with no possibility to look at the internals. The behavior I see looks good, but how can we be sure?

By "sure", I mean:

  • that it works as designed, and documented, without any bug
  • that my understanding of this documentation is correct
  • that a simple test case can be reproduced later

If you read my blog, you know that this is my way of learning and explaining. Many times I come back to a blog post from the past and copy-paste the simple test to see if something has changed with a new version.

So, how can we do the same with proprietary software, running on a platform you don't have access to, and for which there's no documentation about the internals?

You can read the documentation and trust it, like Alex:

You can stress test and see what happens, like Rick:

I can try on DynamoDB Local, as suggested by Pete:

So... that's what I did. But now I have to think about the possible ways to reproduce race conditions in a small test case. I do that with Open Source software, like PostgreSQL or YugabyteDB. I even did that for proprietary software like Oracle Database, because you can download it and run the same software that they use for their managed services. But for AWS services that are not running an Open Source product, this is impossible.

Of course, there's also experience. You can trust Alex, Rick and Pete, as they have troubleshot AWS customers' problems. But in my experience, I've never learned a lot about the internals when troubleshooting production, because there's no time to go into the details. On the other hand, I've learned a lot when reproducing those issues in a lab, building test cases for support requests, preparing demos, leading training workshops, or simply investigating out of curiosity. You know a topic when you can explain it, and you get the fundamentals when you can demo it.
