Dynamo in 5

- April 21, 2021

DynamoDB (AWS NoSQL Database)

DynamoDB is a NoSQL database that stores data in key-value and document formats.

How is a NoSQL database (DB) different from a SQL DB?
No rigid structure other than the mandatory HASH key (the partition key, which is all or part of the primary key). During table creation, only the key schema is specified.
Data normalization is followed in SQL, data denormalization in NoSQL (use as few tables as possible).
The user is charged for the reads and writes the table serves, so identifying the access patterns up front helps keep costs down.
In SQL the focus is on structure, while in NoSQL the focus is on the data and on cheap, easy access.

Let me explain this using the problem statement below:

In my iTunes library, I want to identify the genres I often listen to and the artists I am always pulled towards. I would also like to identify how many times I have played each of these artists and their genres.

So from the above problem statement you can identify my access patterns. Some to look at:

->How many artists have been played more than 50 times?
->What genre did they belong to?
->What genre do I like most?
->What are the names of the artists?
->What artists have I never played? And many more.

So once I define and list my access patterns, we can identify the attributes we require as the HASH key and RANGE key. The primary key can be a simple key (HASH key only) or a composite key (HASH key plus RANGE key).

Let's get down to the basic terminology in DynamoDB (NoSQL DB) and SQL DB

DynamoDB: TABLE ITEM ATTRIBUTE
SQL: TABLE ROW COLUMN

In the above example our table will be named: Music
Attributes: Artist, Genre, Played.
The key schema of the DynamoDB table would then use Artist as the partition key and Genre as the sort key.
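
To make the key schema concrete, here is a minimal sketch of the parameters I would expect to pass when creating the Music table. It only builds a Python dict and does not talk to AWS. The on-demand billing mode and the key_names helper are my own assumptions for illustration, not something the example above requires.

```python
# Sketch of create_table parameters for the Music table (nothing is sent to AWS).
music_table = {
    "TableName": "Music",
    # Only key attributes are declared up front; Played needs no definition.
    "AttributeDefinitions": [
        {"AttributeName": "Artist", "AttributeType": "S"},
        {"AttributeName": "Genre", "AttributeType": "S"},
    ],
    "KeySchema": [
        {"AttributeName": "Artist", "KeyType": "HASH"},   # partition key
        {"AttributeName": "Genre", "KeyType": "RANGE"},   # sort key
    ],
    "BillingMode": "PAY_PER_REQUEST",  # assumption: on-demand billing
}


def key_names(params):
    """Hypothetical helper: return (partition_key, sort_key) from a KeySchema."""
    by_type = {k["KeyType"]: k["AttributeName"] for k in params["KeySchema"]}
    return by_type["HASH"], by_type.get("RANGE")


print(key_names(music_table))
```

With boto3 installed and credentials configured, a dict like this could be passed as boto3.client("dynamodb").create_table(**music_table).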

The ideal condition in a database model is to have a primary key that supports all, or at least the important, access patterns for a workload, but this is not always possible. The other possibilities include:
Scan for the item:
Scanning means reading the entire table while looking for an item. This is acceptable for small tables and infrequent access patterns. A full table scan increases latency and the throughput required, especially at scale.
Change the primary key:
A table's primary key cannot be changed in place. To use a key that suits all access patterns, we need to create a new table with the new partition key and migrate the data. We also need to make sure that changing the primary key does not impact the existing access patterns.
Secondary indexes:
LSI (local secondary index):
must be created during table creation and shares the table's partition key, with a different sort key.
A table can have 5 LSIs.
An LSI is stored in the same partitions as the base table items it indexes and holds a copy of the key attributes plus any attributes projected into it.
LSIs consume read and write capacity from the base table.
LSIs support eventually consistent and strongly consistent reads.
GSI (global secondary index):
has its own primary key and schema, separate from the main table.
GSIs consume additional storage and throughput but improve efficiency for access patterns that aren't fully supported by the table's partition key and sort key.
A table can have 20 GSIs by default.
GSIs maintain their own copy of the indexed data. A GSI has its own provisioned throughput capacity.
GSIs support eventually consistent reads only.
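
As an illustration of how a GSI could serve the "artists played more than 50 times in a genre" pattern, here is a rough sketch of an index definition and a matching query, written as plain Python dicts in the low-level request format. The index name GenrePlayedIndex, the choice of Genre and Played as index keys, and the "Jazz" value are assumptions I made up for this example.

```python
# Hypothetical GSI for the Music table: Genre as partition key, Played as sort key.
genre_index = {
    "IndexName": "GenrePlayedIndex",  # made-up name
    "KeySchema": [
        {"AttributeName": "Genre", "KeyType": "HASH"},
        {"AttributeName": "Played", "KeyType": "RANGE"},
    ],
    "Projection": {"ProjectionType": "ALL"},  # copy every attribute into the index
}

# Query parameters that read from the index instead of the base table.
query_params = {
    "TableName": "Music",
    "IndexName": genre_index["IndexName"],
    "KeyConditionExpression": "#g = :g AND #p > :n",
    # Name placeholders avoid any clash with DynamoDB reserved words.
    "ExpressionAttributeNames": {"#g": "Genre", "#p": "Played"},
    "ExpressionAttributeValues": {
        ":g": {"S": "Jazz"},
        ":n": {"N": "50"},  # numbers travel as strings in the low-level format
    },
}

print(query_params["KeyConditionExpression"])
```

For the index to be created, Played would also have to be declared as a number ("N") in the table's attribute definitions, since it is now a key attribute of the index.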
AWS provides SDKs to help developers build applications with DynamoDB in different programming languages. Developers need to be familiar with the DynamoDB APIs and the SDK.

Summary:

Table is a collection of items. An item is a collection of attributes.
The attribute selected as the partition key determines which partition an item is stored in, and the sort key determines the order in which items are stored within that partition.
The partition key and sort key together comprise the primary key, which is designed to support the workload's main access patterns.
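
The roles of the two keys can be pictured with a small in-memory toy model. This is only a sketch: DynamoDB's real hash function and partition count are internal, so the MD5 stand-in, NUM_PARTITIONS, and the sample items below are all assumptions for illustration.

```python
import hashlib
from bisect import insort

NUM_PARTITIONS = 4  # assumption: DynamoDB manages the real partition count


def partition_for(partition_key: str) -> int:
    """Stand-in hash; DynamoDB's internal hash function is not exposed."""
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS


partitions = [dict() for _ in range(NUM_PARTITIONS)]


def put_item(artist, genre, played):
    # The partition key (Artist) picks the partition.
    bucket = partitions[partition_for(artist)]
    # Items sharing a partition key are kept ordered by the sort key (Genre).
    insort(bucket.setdefault(artist, []), (genre, played))


# Made-up sample items.
put_item("Miles Davis", "Jazz", 72)
put_item("Miles Davis", "Fusion", 15)
put_item("Miles Davis", "Bebop", 8)

items = partitions[partition_for("Miles Davis")]["Miles Davis"]
print([genre for genre, _ in items])  # ['Bebop', 'Fusion', 'Jazz']
```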

We can add secondary indexes if we need to support additional access patterns.

When to scan and when to index?
A scan is slower and uses more read capacity than a query based on the primary key or a secondary index, but creating and maintaining an index uses additional read and write capacity.

For an uncommon access pattern --> a scan might be an ideal solution
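
To see why a scan costs more than a keyed query, here is a toy comparison that counts how many items each approach has to read. It is an in-memory sketch with made-up sample data, not DynamoDB itself. The query and scan functions are hypothetical stand-ins for the real operations.

```python
# Made-up sample data: Artist -> list of (Genre, Played)
table = {
    "Miles Davis": [("Jazz", 72)],
    "Nina Simone": [("Jazz", 55), ("Soul", 12)],
    "Daft Punk": [("Electronic", 0)],
    "Radiohead": [("Rock", 31)],
}


def query(artist):
    """Key lookup: touches only the items stored under one partition key."""
    items = table.get(artist, [])
    return items, len(items)


def scan(predicate):
    """Full scan: every item is read first, then filtered."""
    examined, matches = 0, []
    for artist, rows in table.items():
        for genre, played in rows:
            examined += 1
            if predicate(artist, genre, played):
                matches.append((artist, genre, played))
    return matches, examined


_, query_reads = query("Nina Simone")
never_played, scan_reads = scan(lambda a, g, p: p == 0)
print(query_reads, scan_reads, never_played)
# 2 5 [('Daft Punk', 'Electronic', 0)]
```

The scan returns a single item but still reads all five, which mirrors how DynamoDB charges a scan for the data it reads rather than the data it returns. For a rare question like "what have I never played?" that cost may be acceptable.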

To get familiar with AWS DynamoDB concepts:

Start with the console: create tables, items, indexes and streams, and then move on to alarms, capacity, health, news and support.
Command line: use the CLI to control DynamoDB from the command line (one-time operations and automation via scripts).
DynamoDB API: the low-level API works over HTTP(S) requests and responses, which need correct formatting, a valid digital signature, and binary data encoded in base64 format.
AWS SDK: lets the user focus on app logic in a familiar programming language. The SDK:
constructs low-level DynamoDB API requests and handles the responses
generates a cryptographic signature for each request
forwards requests to the DynamoDB endpoint and retrieves responses
implements basic retry logic in case of errors
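
To give a feel for what the SDK takes off your hands, here is a rough sketch of the pieces of a low-level GetItem request built by hand. The helper names build_get_item_request and binary_attribute are hypothetical, the sample key values are made up, and the signing step is deliberately left out, so this request would not be accepted as-is.

```python
import base64
import json


def build_get_item_request(table_name, key):
    """Hypothetical helper: headers and JSON body for a low-level GetItem call."""
    headers = {
        "Content-Type": "application/x-amz-json-1.0",
        "X-Amz-Target": "DynamoDB_20120810.GetItem",
        # An Authorization header carrying a Signature Version 4 signature is
        # also required; computing it is omitted here and is what the SDK
        # does for you.
    }
    body = json.dumps({"TableName": table_name, "Key": key})
    return headers, body


def binary_attribute(raw: bytes):
    """Binary values are sent base64-encoded under the "B" type tag."""
    return {"B": base64.b64encode(raw).decode("ascii")}


headers, body = build_get_item_request(
    "Music",
    {"Artist": {"S": "Miles Davis"}, "Genre": {"S": "Jazz"}},
)
print(headers["X-Amz-Target"])
print(body)
print(binary_attribute(b"cover"))
```

With an SDK, the same read is a single method call, and the formatting, signing, and retries happen behind the scenes.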
