# Thoughts on DynamoDB
Oct 6, 2018

I recently built a serverless project that used DynamoDB. Learning to use a distributed NoSQL database is never a simple task. DynamoDB places a strong emphasis on being intentional with the data you read and write: it limits almost every operation, and when those limits are reached, requests are throttled (potentially taking your service down) or more capacity is added at an increased cost. These are my most important takeaways.
## Good
DynamoDB has three features that save a considerable amount of work. Entire companies have been built around tools that [poorly] solved these problems.
### Global Tables
Write data in one AWS Region and the changes propagate to other regions. No need to write replication code. Consider the alternative of managing your own replication strategy.
### Streams
Run custom code whenever any record changes. Such code can be unit tested and can be as decoupled from your product as you want it to be. Streams don’t impact the performance of the live database.[^1]
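A stream consumer is just a function over change records. A minimal sketch of one, written in the shape of an AWS Lambda handler (the event layout follows the DynamoDB Streams record format; the `sku` and `price` attributes are hypothetical):

```python
def handler(event, context=None):
    """Collect the SKUs of items whose price changed in this batch."""
    changed = []
    for record in event.get("Records", []):
        # Streams also deliver INSERT and REMOVE events; we only want updates.
        if record["eventName"] != "MODIFY":
            continue
        old = record["dynamodb"]["OldImage"]
        new = record["dynamodb"]["NewImage"]
        if old.get("price") != new.get("price"):
            changed.append(new["sku"]["S"])
    return changed
```

Because the handler is a pure function of the event, it can be unit tested with hand-built events and no AWS account.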
### Time to Live (TTL)
Expire old data automatically.
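TTL works off a numeric item attribute holding a Unix epoch timestamp in seconds; you pick the attribute name when enabling TTL on the table. A minimal sketch, assuming an attribute named `expires_at`:

```python
import time

def with_ttl(item: dict, days: int, attr: str = "expires_at") -> dict:
    """Return a copy of item with a TTL attribute set `days` from now.
    DynamoDB TTL expects a Number holding seconds since the Unix epoch."""
    expiry = int(time.time()) + days * 24 * 60 * 60
    return {**item, attr: {"N": str(expiry)}}
```

Once the attribute's timestamp passes, DynamoDB deletes the item in the background at no extra cost.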
### Scalability
The capacity to scale DynamoDB is [mostly] limited by money. Configure Auto Scaling to handle variable loads and DAX to speed up reads. No pages, no manual intervention: scalability becomes a non-event.
## Bad
Performance and scalability come at a cost, especially when reading data.
### Queries
DynamoDB has no easy way to answer this question:

> Which products are more expensive than $1000?

Why? A Query requires an equality comparison on the partition key; range comparisons are only allowed on the sort key.
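In API terms: a Query's `KeyConditionExpression` must pin the partition key with `=`, and operators like `>` are only valid on the sort key. A sketch of a query that *does* work, built against a hypothetical `Orders` table keyed by `customer_id` (partition) and `order_total` (sort):

```python
def orders_for_customer(customer_id: str, min_total: int) -> dict:
    """Build parameters for a valid Query: equality on the partition key,
    a range condition on the sort key."""
    return {
        "TableName": "Orders",
        "KeyConditionExpression": "customer_id = :c AND order_total > :t",
        "ExpressionAttributeValues": {
            ":c": {"S": customer_id},
            ":t": {"N": str(min_total)},
        },
    }
```

A condition like `price > :p`, with no partition key to pin, has no valid Query form at all, which is what forces the Scan described below.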
### Indexes
Indexes are nothing more than managed clones of the table with different keys. They cost almost as much as the original table.
### Limits Everywhere
- Want to know which products are more expensive than $1000? You’re gonna need a Scan to read every item in the table. This will not only cost you, but it may also take down your service.
- Want to run a common bulk operation such as “Delete all accounts from the EU”? Be prepared to write code to stagger the deletions or your service may go down.
- Want to read all the orders for a customer? Consider limiting how many you read.
- Proper key selection is fundamental, or your service may go partially down during high-traffic periods.
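For the bulk-delete case above, staggering mostly means chunking the keys into `BatchWriteItem`-sized requests, since that operation accepts at most 25 write requests per call. A sketch, using a hypothetical `Accounts` table:

```python
def delete_batches(keys: list, table: str = "Accounts", batch_size: int = 25):
    """Yield BatchWriteItem request bodies, one per chunk of keys.
    BatchWriteItem accepts at most 25 write requests per call."""
    for i in range(0, len(keys), batch_size):
        chunk = keys[i:i + batch_size]
        yield {
            "RequestItems": {
                table: [{"DeleteRequest": {"Key": k}} for k in chunk]
            }
        }
```

A real version would also retry any `UnprocessedItems` the API returns and pause between calls to stay under the table's write capacity.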
### Time to Live (TTL)
The deletion of expired data is not too responsive [by design]. In my experience, TTL deletes items about ten minutes late.
## Conclusion
DynamoDB is a managed, globally available service constrained by cost. It has problems similar to those of comparable databases. It should be used under the right circumstances [like any other tool].
[^1]: Streams are like SQL triggers. Only better [grin].