Journey to the center of AWS S3 (Simple Storage Service)

As you may have guessed from the title, this article was inspired by a book by the French writer Jules Verne. The Simple Storage Service has so many features that I was tempted to take a journey through them, including encryption, CORS and resilience.

Taken together, the main features of AWS S3 were designed to offer great reliability: your data is stored with 99.99% availability and 99.999999999% durability. But what does that mean?

“As long as this heart goes on beating, I can’t admit that any creature endowed with will-power should ever despair.” ― Jules Verne, Journey to the Center of the Earth

S3 is a cloud-based service provided by AWS that offers object storage through a web services interface. Objects are stored in buckets, which are similar to folders but have some additional features that are explained below. Bucket names live in a universal namespace and must be globally unique, e.g. https://bucketname.s3.sa-east-1.amazonaws.com.
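To make the naming rule concrete, here is a minimal sketch that builds a virtual-hosted-style URL from a bucket name, a region and an object key. The function name `bucket_url` and the bucket `my-example-bucket` are my own illustrations, and the name check is a simplified version of the real bucket naming rules, not the full list.

```python
import re
from urllib.parse import quote


def bucket_url(bucket: str, region: str, key: str = "") -> str:
    """Build a virtual-hosted-style URL for a bucket (and optional object key)."""
    # Simplified check: 3 to 63 characters, lowercase letters, digits, dots and
    # hyphens, starting and ending with a letter or digit.
    if not re.fullmatch(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]", bucket):
        raise ValueError(f"invalid bucket name: {bucket!r}")
    # quote() keeps "/" so keys that look like paths stay readable.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


print(bucket_url("my-example-bucket", "sa-east-1", "reports/q1.csv"))
```

Because the bucket name becomes part of a hostname, two accounts cannot own the same name, which is why it has to be unique across all of AWS.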

S3 is a simple key-value store
It is object-based storage: objects are stored in multiple facilities, the service is designed to sustain the concurrent loss of 2 facilities, and each object stored in S3 has the following details:

  • Key – The name of the object;
  • Value – The sequence of bytes that contains the data;
  • Version ID – Used for versioning;
  • Metadata – Additional information about your data; you can use these fields to describe the object being stored.
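The four details above can be pictured with a toy in-memory model. This is only a sketch to show how key, value, version ID and metadata relate to each other. `StoredObject` and `ToyBucket` are names I made up, and the real service obviously does far more than append to a Python list.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StoredObject:
    key: str                 # name of the object
    value: bytes             # the data itself
    version_id: str          # identifies one specific version of the key
    metadata: dict = field(default_factory=dict)  # extra user-defined fields


class ToyBucket:
    """In-memory stand-in for a versioned bucket (illustration only)."""

    def __init__(self):
        self._versions = {}  # key -> list of StoredObject, oldest first

    def put(self, key, value, metadata=None):
        obj = StoredObject(key, value, uuid.uuid4().hex, dict(metadata or {}))
        self._versions.setdefault(key, []).append(obj)
        return obj

    def get(self, key, version_id=None):
        versions = self._versions[key]
        if version_id is None:
            return versions[-1]  # latest version wins
        for obj in versions:
            if obj.version_id == version_id:
                return obj
        raise KeyError(version_id)


bucket = ToyBucket()
first = bucket.put("report.txt", b"draft")
bucket.put("report.txt", b"final", {"author": "me"})
print(bucket.get("report.txt").value)                    # latest version
print(bucket.get("report.txt", first.version_id).value)  # older version
```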

Bucket specific configuration

  • Bucket policies – The JSON policies you can define to control access to the bucket;
  • Access Control Lists – Determine some level of security for the files;
  • CORS – Cross-Origin Resource Sharing is a way for an application loaded in one domain to interact with resources in a different domain;
  • Transfer Acceleration – A service that speeds up transfers to S3, which helps when you are uploading lots of files.
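As a rough illustration of two of these settings, the sketch below builds a bucket policy that allows public reads and a CORS configuration that lets one origin issue GET requests. Bucket policies are JSON documents, so I build them as Python dictionaries and serialize them. The bucket name, the origin and the helper names are assumptions for the example, and nothing here is sent to AWS.

```python
import json


def public_read_policy(bucket: str) -> dict:
    """Bucket policy allowing anyone to read objects in the given bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadGetObject",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


def cors_configuration(origin: str) -> dict:
    """CORS rules letting pages served from `origin` GET objects."""
    return {
        "CORSRules": [
            {
                "AllowedOrigins": [origin],
                "AllowedMethods": ["GET"],
                "AllowedHeaders": ["*"],
                "MaxAgeSeconds": 3000,
            }
        ]
    }


print(json.dumps(public_read_policy("my-example-bucket"), indent=2))
print(json.dumps(cors_configuration("https://www.example.com"), indent=2))
```

With boto3 installed and credentials configured, dictionaries shaped like these are what you would hand to `put_bucket_policy` (as a JSON string) and `put_bucket_cors`.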

99.999999999% Durability
This means that if you store 10 million objects there, you can expect on average to lose a single object once every 10 thousand years. Is that safe enough for you? I’m afraid future generations will have a lot to see in there!
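The arithmetic behind that claim is short enough to check by hand. The sketch below assumes the durability figure is an annual per-object number and uses exact fractions to avoid floating-point noise. The function name is mine.

```python
from fractions import Fraction

# 100% - 99.999999999% = 0.000000001%, i.e. 1 object in 10^11 per year.
ANNUAL_LOSS_RATE = Fraction(1, 10**11)


def years_per_lost_object(object_count: int) -> Fraction:
    """Average number of years between losses for a given number of objects."""
    expected_losses_per_year = object_count * ANNUAL_LOSS_RATE
    return 1 / expected_losses_per_year


print(years_per_lost_object(10_000_000))  # 10 million objects
```

Ten million objects times a loss rate of one in 10^11 gives one expected loss per 10^4 years, which is where the "10 thousand years" comes from.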

99.99% Availability
This is the share of time the system is designed to be available for you to access your files, so there is a 0.01% chance that you won’t be able to reach them…
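To get a feeling for what such a percentage means in practice, it helps to convert it into minutes of possible downtime per year. The sketch below does that for the availability figures quoted in this article, assuming a 365-day year. The function name is my own.

```python
from decimal import Decimal

MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a 365-day year


def downtime_minutes_per_year(availability_percent: str) -> Decimal:
    """Minutes per year a service may be unreachable at a given availability."""
    unavailable = (Decimal(100) - Decimal(availability_percent)) / Decimal(100)
    return unavailable * MINUTES_PER_YEAR


for pct in ("99.99", "99.9", "99.5"):
    print(f"{pct}% -> {downtime_minutes_per_year(pct)} minutes per year")
```

At 99.99% that is a bit under an hour per year, while 99.5% allows for almost two days.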

“If we may die at any moment, we may also be saved at any moment. So let us be prepared to seize the slightest opportunity.” ― Jules Verne, Journey to the Center of the Earth

Read after write consistency
This is another characteristic of S3 to remember. It means that when you upload a new object to S3, it is available and consistent for reading as soon as the upload finishes.

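Consistency for overwrite puts and deletes
Originally S3 offered only eventual consistency here: if you changed or deleted an object that was already there, a read issued right afterwards could still return the old version until the change had fully replicated inside the S3 system, which took a little time. Since December 2020, S3 provides strong read-after-write consistency for overwrite PUTs and DELETEs as well, so that propagation delay no longer applies.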

Replication between different availability zones
AWS also states that data in the Amazon S3 Standard, S3 Standard-IA, and Amazon Glacier storage classes is automatically distributed across a minimum of three physical Availability Zones (AZs) that are typically miles apart within an AWS Region.

Compliance and international standards
S3 also complies with the international standards PCI-DSS, HIPAA/HITECH, FedRAMP, EU Data Protection Directive, and FISMA.

Queries through the filesystem
S3 also lets you run queries against your files with simple SQL expressions, using S3 Select or Athena.
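As a hedged sketch of what such a query might look like with S3 Select, the helper below assembles the parameters for a request against a CSV file that has a header row. The bucket, the key, the column names and the helper `select_request` are all hypothetical. The code only builds the dictionary, so it runs without AWS access.

```python
def select_request(bucket: str, key: str, sql: str) -> dict:
    """Parameters for an S3 Select query over a CSV object with a header row."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": sql,
        # Use the first CSV line as column names, return rows as JSON.
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"JSON": {}},
    }


params = select_request(
    "my-example-bucket",
    "reports/sales.csv",
    "SELECT s.region, s.total FROM S3Object s WHERE s.region = 'south'",
)
print(params["Expression"])
```

Assuming boto3 is installed and credentials are configured, these parameters could be passed as `boto3.client("s3").select_object_content(**params)`.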

S3-IA (Infrequently Accessed)
This class is for data you access less frequently but still need quickly when you do, let’s say a report you check once a quarter. The advantage is a lower storage price, in exchange for a retrieval fee when you read the data. Availability is 99.9% and durability is 99.999999999%, across multiple availability zones.

S3 One Zone IA
Same as S3-IA, but the files are kept in a single availability zone. You still get 99.999999999% durability, but availability drops to 99.5%, and it costs 20% less than S3-IA.

Reduced Redundancy Storage
This class offers 99.99% availability and 99.99% durability. As the name says, it has lower redundancy, and it is designed for data that can be recreated in case of a loss.

Glacier
It’s probably the cheapest storage solution AWS offers. It’s used for archiving files, and a file takes 3 to 5 hours to restore when requested.
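To compare the classes side by side, the sketch below collects the availability and durability figures quoted in this article into a small table and filters it by the minimum you are willing to accept. Glacier is left out because the text gives no such figures for it. The class identifiers follow the storage class names used by the S3 API, and the helper name is my own.

```python
from decimal import Decimal
from typing import NamedTuple


class StorageClass(NamedTuple):
    name: str
    availability: Decimal  # percent, as quoted in this article
    durability: Decimal    # percent, as quoted in this article


CLASSES = [
    StorageClass("STANDARD", Decimal("99.99"), Decimal("99.999999999")),
    StorageClass("STANDARD_IA", Decimal("99.9"), Decimal("99.999999999")),
    StorageClass("ONEZONE_IA", Decimal("99.5"), Decimal("99.999999999")),
    StorageClass("REDUCED_REDUNDANCY", Decimal("99.99"), Decimal("99.99")),
]


def classes_meeting(min_availability: str, min_durability: str) -> list:
    """Names of the classes that meet both minimum percentages."""
    return [
        c.name
        for c in CLASSES
        if c.availability >= Decimal(min_availability)
        and c.durability >= Decimal(min_durability)
    ]


print(classes_meeting("99.9", "99.999999999"))
```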

“Enough. When science has spoken, it is for us to hold our peace.” ― Jules Verne, Journey to the Center of the Earth

How you are charged when using S3
That’s a very important point, so when you are designing how to use S3 for your business you must take the following factors into account:

  • Storage space in GB;
  • Requests (Copy, Get, Put, etc);
  • Storage management prices (Tags, analytics, inventory);
  • Data Transfers;
  • Transfer acceleration;
  • CloudFront, if used.
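A back-of-the-envelope estimate can combine a few of these factors. In the sketch below every price is a made-up placeholder, not a real AWS rate, so check the pricing page for your region before trusting any number. The field names and the function are mine, and only storage, requests and outbound transfer are covered.

```python
from dataclasses import dataclass
from decimal import Decimal

# HYPOTHETICAL prices in USD, for illustration only.
PLACEHOLDER_PRICES = {
    "storage_per_gb": Decimal("0.023"),
    "put_per_1000": Decimal("0.005"),
    "get_per_1000": Decimal("0.0004"),
    "transfer_out_per_gb": Decimal("0.09"),
}


@dataclass
class MonthlyUsage:
    storage_gb: int
    put_requests: int
    get_requests: int
    transfer_out_gb: int


def estimate_monthly_cost(usage: MonthlyUsage, prices: dict) -> Decimal:
    """Sum storage, request and outbound transfer charges for one month."""
    return (
        usage.storage_gb * prices["storage_per_gb"]
        + Decimal(usage.put_requests) / 1000 * prices["put_per_1000"]
        + Decimal(usage.get_requests) / 1000 * prices["get_per_1000"]
        + usage.transfer_out_gb * prices["transfer_out_per_gb"]
    )


usage = MonthlyUsage(storage_gb=100, put_requests=10_000,
                     get_requests=100_000, transfer_out_gb=10)
print(estimate_monthly_cost(usage, PLACEHOLDER_PRICES))
```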

Website publishing
You can also publish your serverless website: prepare your beautiful kit of HTML files, save them in a bucket, make the files public, and enable static website hosting. S3 will then provide you with a URL for your new website, “easy peasy lemon squeezy”, as a friend of mine used to say.
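For the curious, here is a small sketch of the two pieces of information involved: the website configuration (which page is the index, which is the error page) and the content type each file should be uploaded with so browsers render it properly. The file names and helper names are assumptions, and the code only prepares data without uploading anything.

```python
import mimetypes


def website_configuration(index: str = "index.html",
                          error: str = "error.html") -> dict:
    """Static website settings: the index page and the error page."""
    return {
        "IndexDocument": {"Suffix": index},
        "ErrorDocument": {"Key": error},
    }


def upload_plan(filenames: list) -> list:
    """Pair each file with the Content-Type it should be stored with."""
    plan = []
    for name in filenames:
        content_type, _ = mimetypes.guess_type(name)
        # Fall back to a generic binary type when the extension is unknown.
        plan.append((name, content_type or "application/octet-stream"))
    return plan


print(website_configuration())
print(upload_plan(["index.html", "error.html", "style.css"]))
```

Assuming boto3 and valid credentials, a dictionary shaped like the first one is what `put_bucket_website` takes as its `WebsiteConfiguration`.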

Sources: 
https://en.wikipedia.org/wiki/Amazon_S3
https://aws.amazon.com/s3/
https://aws.amazon.com/s3/storage-classes/