If you’re studying for the AWS Cloud Practitioner exam, there are a few Amazon S3 (Simple Storage Service) facts that you should know and understand. This post will guide you through how to use S3 in AWS environments and which use cases it’s the right fit for.
First things first:
What is Amazon S3?
Amazon S3 is an object storage service that is built to be scalable, highly available, secure, and performant. This means that customers of all sizes and industries can use it to store and protect any amount of data for a range of use cases, such as websites, mobile apps, backup and restore, archive, enterprise applications, IoT devices, and big data analytics.
These characteristics make Amazon S3 an excellent data store, but if you’re looking for a database service, you’ll want to look at services like Amazon RDS, DynamoDB, or even customer-managed databases running on EC2 instances.
What’s the difference between data stores and databases?
In general terms, data stores can contain a wide range of data files and formats, while databases store structured data and offer Database Management Systems (DBMSs) that let you query, insert, and update that data. For example, you could ask a database to pull all records for a certain customer, and it would find that data for you. A data store like Amazon S3, on the other hand, is a great place to store your database backups as files, which could then be used to restore databases should anything happen. It can also store images, text files, logs, archives, and so on.
Amazon S3 can also be used by other AWS resources of your choosing, so you can build applications in services like EC2 and Lambda, and those applications can access data from S3 (like images and other files). Even though your resources live inside VPCs and S3 buckets don’t, you can create VPC endpoints that give your VPC private access to your S3 data without opening it up to the outside world. If you’re not sure what this means or you’d like to try this hands-on, take a look at our Hands-On Lab.
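To make this more concrete, here’s a minimal sketch (in Python, using the boto3 SDK) of how application code running on EC2 or Lambda might read a file out of S3. The bucket and key names are made up for illustration, and on EC2 or Lambda the credentials would normally come from an attached IAM role. If a gateway VPC endpoint for S3 is in place, this same call is routed privately rather than over the public internet.

```python
import boto3

# Create an S3 client. On EC2 or Lambda, credentials come from the
# attached IAM role rather than hard-coded keys.
s3 = boto3.client("s3")

# Hypothetical bucket and key names, just for illustration.
response = s3.get_object(Bucket="my-app-assets", Key="images/logo.png")
image_bytes = response["Body"].read()

print(f"Downloaded {len(image_bytes)} bytes")
```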
Now that we have a better understanding of Amazon S3, let’s dive into how it’s structured and laid out. If you need more of an understanding of what Amazon S3 is and how it works, take our Hands-On Lab called Introduction to Amazon S3.
How is Amazon S3 structured?
S3 has 3 main “levels” of structure. The first and highest level is the Bucket.
Buckets
Highest level
Buckets are used to group objects together
You can apply bucket-wide settings that affect everything stored in that bucket
Buckets can be used to host static websites
Bucket names:
Must be unique across all of AWS – if I use the bucket name linux-academy, you can’t use that name unless I delete that bucket
Must be 3 to 63 characters in length
Can only contain lowercase letters, numbers, and hyphens (periods are also allowed, but aren’t recommended)
Must not be formatted as an IP address (e.g., 192.168.1.1)
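As a quick study aid, here’s a rough sketch of those naming rules in Python. This isn’t AWS’s official validation (AWS enforces a few extra rules, such as no consecutive periods), but it captures the rules listed above.

```python
import ipaddress
import re

# 3-63 characters: lowercase letters, numbers, hyphens (and periods),
# starting and ending with a letter or number.
BUCKET_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")

def looks_like_valid_bucket_name(name: str) -> bool:
    if not BUCKET_NAME_RE.fullmatch(name):
        return False  # wrong length or disallowed characters
    try:
        ipaddress.ip_address(name)
        return False  # formatted like an IP address (e.g., 192.168.1.1)
    except ValueError:
        return True   # not an IP address, so the name looks fine

print(looks_like_valid_bucket_name("linux-academy"))  # True
print(looks_like_valid_bucket_name("Linux academy"))  # False (uppercase and space)
print(looks_like_valid_bucket_name("192.168.1.1"))    # False (IP-formatted)
```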
Folders
“Subfolders” of Buckets
Helps even more with organization
Folders are really just key name prefixes; they don’t do anything functionally apart from giving you more organization and structure. For example, if I have a bucket XYZ, I could create folders 2019 and 2018 to store images from those years.
Objects
Objects are what AWS calls the files stored in S3
Supports different formats (jpg, mov, txt, gzip, etc)
So instead of calling something an “image” in S3, you can call it an object.
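Putting those three levels together, here’s a short boto3 sketch that uploads a local file into a “folder” in a bucket and then lists what’s in that folder. The bucket name and file name are hypothetical.

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical bucket name; remember it must be globally unique.
bucket = "xyz-example-images"

# Upload a local file as an object. The "2019/" part of the key is what the
# console shows as a folder; under the hood it's just a key name prefix.
s3.upload_file(Filename="cat.jpg", Bucket=bucket, Key="2019/cat.jpg")

# List everything "inside" the 2019 folder by filtering on that prefix.
listing = s3.list_objects_v2(Bucket=bucket, Prefix="2019/")
for obj in listing.get("Contents", []):
    print(obj["Key"], obj["Size"])
```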
Using Amazon S3 Buckets in Regions
Buckets that you create in S3 have to be created in a specific region, such as N. Virginia. This means that your data is physically stored in the N. Virginia region, which can make a big difference if, say, all of your customers are in the Middle East, because data will have a long distance to travel.
So picking the region for your buckets can be important. Think about where your customers are geographically located, and get as close to them as possible. Of course, using a CDN like CloudFront can help alleviate the distance issues, especially if your customers are all over the globe.
AWS also offers replication features – Cross-Region Replication and, more recently, Same-Region Replication – so replicating data is another option to get it closer to customers and to create backups.
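For example, to serve customers in the Middle East you could create the bucket in a nearby region. Here’s a minimal boto3 sketch; the bucket name is made up, the me-south-1 (Bahrain) region is assumed to be enabled on your account, and for us-east-1 (N. Virginia) you would simply omit the LocationConstraint.

```python
import boto3

# Pick a region close to your customers.
region = "me-south-1"  # Middle East (Bahrain)
s3 = boto3.client("s3", region_name=region)

# Outside of us-east-1, the region is specified as a LocationConstraint.
s3.create_bucket(
    Bucket="xyz-example-middle-east",  # hypothetical, globally unique name
    CreateBucketConfiguration={"LocationConstraint": region},
)
```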
Using S3 Object Versioning
A neat feature offered by Amazon S3 is called Object Versioning, and it allows you to keep multiple versions of objects.
This can be helpful for several use cases such as:
Reverting – you uploaded a bad version and need to revert back
Recovery – you accidentally deleted a newer version of the object, but you still have access to the prior version
Temporary version – you can test a new version and if you don’t like it, you can always revert back
Set up and use object versioning in a real AWS console with our Hands-On Lab called S3 Data Management with Versioning and Lifecycle Rules.
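If you’d like to see roughly what that looks like in code, here’s a small boto3 sketch that turns on versioning for a hypothetical bucket, uploads the same key twice, and then lists the versions S3 kept.

```python
import boto3

s3 = boto3.client("s3")
bucket = "xyz-example-images"  # hypothetical bucket name

# Turn on versioning; from now on, overwriting or deleting an object
# keeps the older versions around instead of discarding them.
s3.put_bucket_versioning(
    Bucket=bucket,
    VersioningConfiguration={"Status": "Enabled"},
)

# Upload the same key twice; each upload creates a new version.
s3.put_object(Bucket=bucket, Key="site/logo.png", Body=b"version one")
s3.put_object(Bucket=bucket, Key="site/logo.png", Body=b"version two")

# List all versions of that key; an older one can be fetched or restored
# by passing its VersionId to get_object or copy_object.
versions = s3.list_object_versions(Bucket=bucket, Prefix="site/logo.png")
for version in versions.get("Versions", []):
    print(version["VersionId"], version["IsLatest"])
```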
What to know about Amazon S3 Storage Classes
One of the key advantages of Amazon S3 is its durability and availability. But in some cases, you may not need the level of durability or availability that comes by default, and instead, you’d rather save money.
S3 makes it possible to do that thanks to storage classes. While you don’t have to memorize these word for word for the exam, you must understand the key differences between the storage classes so that you know which one is best to use in a given scenario.
Note: While we do our best to update facts, make sure to double-check them straight from AWS’ website in case they recently changed something:
Standard (default)
Most expensive storage class
Designed for general, all-purpose storage
99.999999999% (“11 nines”) object durability
99.99% object availability
Supports SSL for data in transit and encryption of data at rest
Standard-IA (IA = Infrequent Access)
Designed for objects that you do not access frequently but must be immediately available when accessed (uses multiple Availability Zones)
99.999999999% object durability
99.90% object availability
Less expensive than the standard storage class
Great for disaster recovery, backups
One Zone-IA (IA = Infrequent Access)
Designed for objects that you do not access frequently but must be immediately available when accessed (uses only one Availability Zone)
99.999999999% object durability (but data will be lost if the Availability Zone is destroyed)
99.50% object availability
~20% less expensive than the Standard-IA storage class
Intelligent-Tiering
Designed to optimize costs by automatically moving data to the most cost-effective tier based on your usage
Monitors objects for access and moves them to the infrequent access tier after 30 days of inactivity (and back to the frequent access tier when they’re accessed again)
99.999999999% (“11 nines”) object durability
99.90% object availability
Pricing depends on the tier each object is currently in, plus a small monthly monitoring and automation fee
A great option if you don’t know which storage class will be best for certain objects
Glacier & Deep Archive
Designed for long-term archival storage
May take several hours to retrieve objects
99.999999999% object durability
The cheapest storage classes in S3, with Deep Archive being the lowest-cost option
Deep Archive has a longer minimum storage duration than Glacier and longer retrieval times (standard retrievals can take up to 12 hours)
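You pick the storage class per object when you upload it (Standard is the default if you don’t specify one). Here’s a small boto3 sketch; the bucket and file names are made up.

```python
import boto3

s3 = boto3.client("s3")
bucket = "xyz-example-backups"  # hypothetical bucket name

# Choose the storage class at upload time. Valid values include STANDARD,
# STANDARD_IA, ONEZONE_IA, INTELLIGENT_TIERING, GLACIER, and DEEP_ARCHIVE.
s3.upload_file(
    Filename="db-backup.sql.gz",
    Bucket=bucket,
    Key="backups/db-backup.sql.gz",
    ExtraArgs={"StorageClass": "STANDARD_IA"},
)
```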
Amazon S3 Object Lifecycles
What if you need objects to start in a certain storage class, but then transfer to a different one after some time has gone by?
Enter Object Lifecycles:
They can transition objects to different storage classes
They can delete objects after a certain period
Some great use cases for object lifecycles are:
Delete log files that aren’t needed after a certain number of days
Move files that you accessed frequently at first but no longer need as often – yet have to keep for compliance – into cheaper storage
To get hands-on experience with lifecycle policies, launch our Hands-On Lab called Creating a basic Amazon S3 lifecycle policy.
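For reference, here’s roughly what a lifecycle rule like the ones described above could look like in boto3. The bucket name, prefix, and time periods are just examples.

```python
import boto3

s3 = boto3.client("s3")
bucket = "xyz-example-logs"  # hypothetical bucket name

# Move log files to Standard-IA after 30 days, to Glacier after 90 days,
# and delete them after a year.
s3.put_bucket_lifecycle_configuration(
    Bucket=bucket,
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-then-delete-logs",
                "Filter": {"Prefix": "logs/"},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": 365},
            }
        ]
    },
)
```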
What Next?
Continue referencing this post to refresh your memory, especially as you get closer to your exam date. Make sure you take the labs so that you can experience first-hand how this all works. Even though you won’t have to complete any hands-on scenarios in the exam (it’s a multiple-choice exam), you still need to develop the skills to complete tasks on the job, so that’s an important step.
Finally, if you haven’t yet, sign up for a 7-day free trial so you can start our AWS Cloud Practitioner certification prep course. This course will teach you everything you need to know to pass the exam, from start to finish. It also includes quizzes and a practice exam at the end that mimics the real exam so you can simulate what it will be like.
Good luck, and go get certified!