Understanding The Heart of AWS S3

Behrouz Kashani
6 min read · Feb 26, 2022

Introduction

Amazon S3, or Simple Storage Service, is one of the most useful and powerful AWS services for storing data. Many other AWS services rely on S3 directly or indirectly. S3 is more than a storage service: it can also serve static websites, acting as a simple but capable web server.

Amazon S3 provides developers and IT teams with secure, durable, and highly scalable cloud storage. It is easy-to-use object storage with a simple web service interface for storing, retrieving, and managing files and folders, and you can also control your files from the command line.

It is also worth mentioning that S3 offers virtually unlimited storage: a single object can be anywhere from 0 bytes to several terabytes, a bucket can grow to petabytes, and S3 is among the cheapest storage options on the internet.

Object Storage

As mentioned, S3 stores data as objects, in contrast to other storage architectures such as:

  • File system: stores data as files in a file hierarchy
  • Block storage: stores data as blocks within sectors and tracks

S3 Object and Bucket

Each file in S3 is treated as an object, which consists of:

  • Key: the name of the object, used like a file name
  • Value: the data itself, made up of a sequence of bytes
  • Version ID: if versioning is enabled, this field contains a unique version ID
  • Metadata: additional attributes, as key-value pairs, that can be attached to the file

A bucket, on the other hand, works like a top-level folder. Each bucket can store an unlimited number of objects, with one rule: the bucket name must be unique across all of S3. It is more like a subdomain inside the S3 namespace.
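To make the anatomy of an object concrete, here is a minimal Python sketch that assembles the arguments for an upload. `Bucket`, `Key`, `Body`, and `Metadata` are real parameters of the boto3 `put_object` call; the helper name `build_put_request`, the bucket name, and the key are made up for illustration.

```python
def build_put_request(bucket, key, text, metadata=None):
    """Assemble keyword arguments for boto3's put_object (hypothetical helper)."""
    return {
        "Bucket": bucket,                  # the bucket name, unique across S3
        "Key": key,                        # the object's name inside the bucket
        "Body": text.encode("utf-8"),      # the value: a sequence of bytes
        "Metadata": dict(metadata or {}),  # user-defined key-value pairs
    }


request = build_put_request(
    "example-bucket-name",       # assumed bucket name
    "reports/2022/summary.txt",  # assumed key; the slashes only look like folders
    "hello S3",
    {"author": "me"},
)
print(request["Key"], len(request["Body"]), "bytes")

# With boto3 installed and credentials configured, the upload would be:
#   import boto3
#   response = boto3.client("s3").put_object(**request)
#   response.get("VersionId")  # only present when versioning is enabled
```

The Version ID is not something you send. S3 assigns it and returns it in the response when versioning is on.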

Storage Classes

S3 offers several storage classes that differ in pricing, availability, and use cases. You don't always need the Standard class, for example when you store files that are only needed once a year. Choosing among these classes helps IT teams and developers find the right architecture.

  • Standard: The default S3 class. It is very fast, with eleven 9s (99.999999999%) of durability and 99.99% availability, and it stores data across at least three availability zones (AZs).
  • Standard Infrequent Access (IA): Still fast, and cheaper if you access files less than once a month. It costs about 50% less than Standard, with reduced availability, and an additional retrieval fee applies.
  • One Zone IA: The same as Standard IA, but hosted in only one zone. Availability is 99.5%. It is about 20% cheaper than Standard IA, but your data can be lost if that zone is destroyed. A retrieval fee also applies.
  • Glacier: A popular S3 class for files and data you rarely need. It is very cheap, but retrieving data from Glacier can take from minutes to hours.
  • Glacier Deep Archive: The cheapest storage class, but data retrieval can take up to 12 hours.

Now let’s compare all the classes together to have a better understanding.
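One rough way to line the classes up is a small lookup table plus a rule of thumb for choosing between them. The sketch below is in Python. The dictionary keys are the values S3 accepts for the `StorageClass` parameter, and the notes only restate the list above. The function `pick_storage_class` and its thresholds (especially the 5-hour cut-off for Glacier) are my own assumptions, not an AWS recommendation.

```python
# Values S3 accepts for the StorageClass parameter, paired with the
# trade-offs quoted in the list above.
STORAGE_CLASSES = {
    "STANDARD":     "default, fast, 99.99% availability",
    "STANDARD_IA":  "cheaper below about one access a month, retrieval fee",
    "ONEZONE_IA":   "single zone, 99.5% availability, retrieval fee",
    "GLACIER":      "very cheap, retrieval takes minutes to hours",
    "DEEP_ARCHIVE": "cheapest, retrieval can take up to 12 hours",
}

GLACIER_WAIT_HOURS = 5        # assumption: "minutes to hours" read as <= 5 hours
DEEP_ARCHIVE_WAIT_HOURS = 12  # taken from the list above


def pick_storage_class(reads_per_year, max_wait_hours=0, single_zone_ok=False):
    """Hypothetical rule of thumb for choosing a class."""
    if reads_per_year <= 1 and max_wait_hours >= DEEP_ARCHIVE_WAIT_HOURS:
        return "DEEP_ARCHIVE"
    if reads_per_year <= 1 and max_wait_hours >= GLACIER_WAIT_HOURS:
        return "GLACIER"
    if reads_per_year < 12:  # accessed less than once a month
        return "ONEZONE_IA" if single_zone_ok else "STANDARD_IA"
    return "STANDARD"


for name, note in STORAGE_CLASSES.items():
    print(f"{name:<13} {note}")
print(pick_storage_class(reads_per_year=1, max_wait_hours=24))
```

The chosen name can then be passed as `StorageClass` when uploading an object.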

S3 Encryption

Three types of encryption are available in S3:

  • Encryption in Transit: Traffic between your host and S3 is protected with SSL/TLS.
  • Server-Side Encryption: Amazon encrypts the data for you on its side. There are three ways to do server-side encryption:
      • SSE-S3 (also written SSE-AES): S3 manages the keys itself and encrypts with the AES-256 algorithm.
      • SSE-KMS: You manage the keys through an AWS service called AWS KMS.
      • SSE-C: You provide and manage the key yourself.
  • Client-Side Encryption: This is the old-school approach, but it is still sometimes used. You encrypt the data before uploading it to S3 and decrypt it yourself after retrieval.
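The three server-side modes differ mainly in which extra arguments you send with an upload. The Python sketch below maps each mode to the corresponding boto3 `put_object` parameters (`ServerSideEncryption`, `SSEKMSKeyId`, `SSECustomerAlgorithm`, `SSECustomerKey`). The helper `sse_args`, the mode labels, and the key ID are hypothetical; the S3-managed mode is labelled SSE-S3 here, which is the same thing as SSE-AES.

```python
import os


def sse_args(mode, kms_key_id=None, customer_key=None):
    """Extra put_object keyword arguments for a server-side encryption mode."""
    if mode == "SSE-S3":
        # S3 owns and manages the key.
        return {"ServerSideEncryption": "AES256"}
    if mode == "SSE-KMS":
        args = {"ServerSideEncryption": "aws:kms"}
        if kms_key_id:
            args["SSEKMSKeyId"] = kms_key_id  # otherwise a default KMS key is used
        return args
    if mode == "SSE-C":
        # You supply the key on every request and must keep it yourself.
        if customer_key is None or len(customer_key) != 32:
            raise ValueError("SSE-C needs a 32-byte (256-bit) key")
        return {"SSECustomerAlgorithm": "AES256", "SSECustomerKey": customer_key}
    raise ValueError(f"unknown mode: {mode}")


print(sse_args("SSE-S3"))
print(sse_args("SSE-KMS", kms_key_id="hypothetical-key-id"))
print(sorted(sse_args("SSE-C", customer_key=os.urandom(32))))

# Usage with boto3 (assumed bucket and key names):
#   s3.put_object(Bucket="example-bucket-name", Key="secret.txt",
#                 Body=b"data", **sse_args("SSE-S3"))
```

With SSE-C the same key has to be sent again when downloading the object, so losing the key means losing the data.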

Data Consistency

Concept of CRR

Cross-Region Replication (CRR) is an S3 feature that, when enabled, automatically replicates your data to a bucket in another region. As a result, you get better durability and a basis for disaster recovery.

To enable CRR, versioning must be enabled on both the source and destination buckets. Another benefit of CRR is that you can replicate data to a bucket in another AWS account.
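As a rough illustration, this Python sketch builds the two pieces of configuration CRR needs: the versioning status for both buckets and a replication rule for the source bucket. The dictionary shapes follow the boto3 `put_bucket_versioning` and `put_bucket_replication` calls. The bucket names, the rule ID, and the role ARN are placeholders, and the IAM role (which S3 assumes to copy objects) is a requirement not covered above.

```python
VERSIONING_ON = {"Status": "Enabled"}  # needed on source and destination


def replication_config(role_arn, destination_bucket, rule_id="replicate-everything"):
    """Build a ReplicationConfiguration that copies every object (sketch)."""
    return {
        "Role": role_arn,  # IAM role S3 assumes when replicating
        "Rules": [
            {
                "ID": rule_id,
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix matches all objects
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": f"arn:aws:s3:::{destination_bucket}"},
            }
        ],
    }


config = replication_config(
    "arn:aws:iam::123456789012:role/example-replication-role",  # placeholder
    "example-backup-bucket",                                    # placeholder
)
print(config["Rules"][0]["Destination"]["Bucket"])

# With boto3, roughly (use a client in each bucket's own region):
#   s3.put_bucket_versioning(Bucket="example-source-bucket",
#                            VersioningConfiguration=VERSIONING_ON)
#   s3.put_bucket_versioning(Bucket="example-backup-bucket",
#                            VersioningConfiguration=VERSIONING_ON)
#   s3.put_bucket_replication(Bucket="example-source-bucket",
#                             ReplicationConfiguration=config)
```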

Versioning

Versioning is one of the most important and coolest features in S3. When you have data that keeps being updated, but you don't want to lose track of previous versions or rename files and deal with lots of duplicates in your bucket, versioning comes to the rescue.

When you upload a new version, it is identified by a Version ID, and you can use that ID to access that specific version of the file.

One point to keep in mind: once you enable versioning, you cannot disable it again. You can only suspend it from then on, and the existing files still keep their versions.

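Also, a plain delete does not remove older versions. It places a delete marker on top of the version stack, and the earlier versions stay in the bucket. To permanently remove a particular version, you have to delete it explicitly by its Version ID.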
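The behaviour is easier to see in a toy model. The Python class below is an in-memory imitation I wrote for illustration. It is not how S3 is implemented and `ToyVersionedBucket` is not part of any SDK. It shows three things: every upload gets its own Version ID, an old version stays readable by that ID, and a plain delete only hides the object behind a marker.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class Version:
    version_id: str
    body: Optional[bytes]  # None means this entry is a delete marker


class ToyVersionedBucket:
    """In-memory imitation of a bucket with versioning enabled."""

    def __init__(self):
        self._versions = {}  # key -> list of Version, newest last

    def _push(self, key, body):
        version = Version(uuid.uuid4().hex, body)
        self._versions.setdefault(key, []).append(version)
        return version.version_id

    def put(self, key, body):
        """Store a new version and return its Version ID."""
        return self._push(key, body)

    def delete(self, key):
        """A plain delete stacks a delete marker; older versions remain."""
        return self._push(key, None)

    def get(self, key, version_id=None):
        versions = self._versions.get(key, [])
        if version_id is None:
            candidates = versions[-1:]  # only the newest entry counts
        else:
            candidates = [v for v in versions if v.version_id == version_id]
        if not candidates or candidates[0].body is None:
            raise KeyError(key)
        return candidates[0].body


bucket = ToyVersionedBucket()
v1 = bucket.put("report.txt", b"draft")
v2 = bucket.put("report.txt", b"final")
latest = bucket.get("report.txt")     # newest version
first = bucket.get("report.txt", v1)  # older version, fetched by its ID
bucket.delete("report.txt")           # hides the object, keeps both versions
print(latest, first, bucket.get("report.txt", v2))
```

After the delete, asking for `report.txt` without a Version ID fails in this model, while both earlier versions can still be fetched by ID.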

S3 Life Cycle Management

S3 also provides automation to make managing files easier. If files keep being uploaded and you don't want to lose them, but you rarely need them after a while, you can set up Lifecycle Rules that automatically move them to cheaper classes (like IA or Glacier).

In this example, files move to Glacier after 7 days and are automatically deleted from storage after 365 days. You can create multiple rules based on your conditions.
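A rule like that can be written as a small configuration document. The Python sketch below builds one in the shape boto3's `put_bucket_lifecycle_configuration` expects. The helper `lifecycle_rules`, the rule ID, and the bucket name in the usage comment are assumptions made for illustration.

```python
def lifecycle_rules(glacier_after_days=7, delete_after_days=365, prefix=""):
    """Build a lifecycle configuration: archive first, then expire (sketch)."""
    if glacier_after_days >= delete_after_days:
        raise ValueError("objects must be archived before they expire")
    return {
        "Rules": [
            {
                "ID": "archive-then-expire",   # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # empty prefix = whole bucket
                "Transitions": [
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"}
                ],
                "Expiration": {"Days": delete_after_days},
            }
        ]
    }


config = lifecycle_rules()
print(config["Rules"][0]["Transitions"], config["Rules"][0]["Expiration"])

# With boto3 installed and credentials configured:
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket-name", LifecycleConfiguration=config)
```

Passing a `prefix` such as `"logs/"` limits the rule to keys that start with it, which is how different rules can target different groups of files.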

Presigned URL

Last but not least is a very useful S3 feature called the Presigned URL, which developers use a lot. As you may have noticed, buckets block public access by default. You could tick a couple of checkboxes to make the whole bucket or a single object public and send out the links, or turn your bucket into a web server so that all the files are accessible. That is sometimes appropriate, but it can leave your bucket exposed.

Another way is to generate a Presigned URL. This is a temporary URL, generated by the bucket owner (or anyone granted access through IAM), that makes an object accessible for a set amount of time. After that, access expires automatically.

You can easily create a Presigned URL through the AWS CLI or the SDKs.

In this example, the URL will stop working after 300 seconds.
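Here is a minimal Python sketch of the SDK route. `generate_presigned_url` with `"get_object"`, `Params`, and `ExpiresIn` is the real boto3 client method. The wrapper `presign_download`, the bucket and key names, and the `FakeS3Client` stand-in (which signs nothing and exists only so the snippet runs without AWS credentials) are assumptions for illustration.

```python
def presign_download(s3_client, bucket, key, expires_in=300):
    """Return a temporary download URL that expires after expires_in seconds."""
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )


class FakeS3Client:
    """Stand-in for a boto3 S3 client; it builds a dummy URL and signs nothing."""

    def generate_presigned_url(self, operation, Params, ExpiresIn):
        return (
            f"https://{Params['Bucket']}.s3.example.invalid/{Params['Key']}"
            f"?operation={operation}&expires={ExpiresIn}"
        )


url = presign_download(FakeS3Client(), "example-bucket-name", "photo.jpg")
print(url)

# Real usage, with boto3 installed and credentials configured:
#   import boto3
#   url = presign_download(boto3.client("s3"), "your-bucket", "your-key")
```

On the command line, `aws s3 presign` with `--expires-in 300` does the same job. Anyone holding the URL can download the object until it expires, so treat it like a short-lived password.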

Wrapup

S3 is one of the most powerful and cheapest services in the AWS world and can be used in many cases. You just need to know how to configure it and work with it. AWS recently redesigned the user interface, which now looks better and offers a better UX.

There are many videos on YouTube that show how to create and manage S3 buckets and objects.

I tried to explain the important points about S3, and I hope you enjoyed it.
