Version: 1.0

S3 (Simple Storage Service)

Storage classes

  • Standard
    • High availability and durability (>= 3 AZ)
    • Designed for frequent access
    • Suitable for most workloads
  • Standard-IA
    • For data accessed less frequently but needing rapid access when requested
  • One Zone Infrequent Access
    • Like Standard-IA, but data is stored in a single AZ
    • Costs about 20% less than Standard-IA
  • Intelligent-Tiering
    • For data with unknown or changing access patterns: sometimes accessed frequently, sometimes not
  • Glacier
    • Glacier Instant Retrieval: long-term archive storage with instant retrieval
    • Glacier Flexible Retrieval: for archive data that does not require immediate access but needs the flexibility to retrieve large data sets at no cost, such as backup or disaster recovery. Retrieval takes from minutes up to 12 hours:
      • Expedited (1 to 5 minutes)
      • Standard (3 to 5 hours)
      • Bulk (5 to 12 hours)
    • Glacier Deep Archive: the cheapest storage class, designed for customers that retain data sets for 7-10 years or longer to meet customer needs and regulatory compliance requirements
      • Standard (12 hours)
      • Bulk (48 hours)
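
The retrieval tiers listed above can be sketched as a small validator that builds a restore request. This is a minimal sketch: `RETRIEVAL_TIERS` and `build_restore_request` are hypothetical names, and the returned dict is assumed to follow the `RestoreRequest` shape that S3's restore operation accepts.

```python
# Retrieval tiers from the notes above, keyed by S3 storage class identifier.
RETRIEVAL_TIERS = {
    "GLACIER": {"Expedited", "Standard", "Bulk"},  # Glacier Flexible Retrieval
    "DEEP_ARCHIVE": {"Standard", "Bulk"},          # Glacier Deep Archive
}


def build_restore_request(storage_class: str, tier: str, days: int) -> dict:
    """Return a RestoreRequest-shaped dict for an archived object.

    `days` is how long the temporary restored copy should stay available.
    """
    allowed = RETRIEVAL_TIERS.get(storage_class)
    if allowed is None:
        raise ValueError(f"{storage_class} is not an archive class in this sketch")
    if tier not in allowed:
        raise ValueError(f"{tier} is not available for {storage_class}")
    if days < 1:
        raise ValueError("days must be at least 1")
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}


if __name__ == "__main__":
    print(build_restore_request("GLACIER", "Bulk", 7))
```

With boto3, a dict like this would typically be passed as the `RestoreRequest` argument of `restore_object`.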

Versioning

  • Enabled at the bucket level
  • Overwriting the same key creates a new version (think 1, 2, 3; actual version IDs are opaque strings)
  • It is best practice to version your buckets
    • Protect against unintended deletes (ability to restore a version)
    • Easy roll back to previous version
  • Any object uploaded before versioning was enabled has the version ID null
  • You can suspend versioning
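
To make the delete and restore behaviour concrete, here is a toy in-memory model of a versioned bucket. It is only a sketch and not the S3 API: `VersionedBucket` and its methods are hypothetical, and the `v1`, `v2` IDs stand in for S3's opaque version IDs.

```python
import itertools


class VersionedBucket:
    """Toy in-memory model of a versioned bucket (not the S3 API)."""

    def __init__(self):
        self._versions = {}  # key -> list of (version_id, body); body None = delete marker
        self._counter = itertools.count(1)

    def put(self, key, body):
        version_id = f"v{next(self._counter)}"  # real S3 version IDs are opaque strings
        self._versions.setdefault(key, []).append((version_id, body))
        return version_id

    def delete(self, key):
        # A plain delete only adds a delete marker; older versions remain.
        return self.put(key, None)

    def get(self, key, version_id=None):
        versions = self._versions.get(key, [])
        if version_id is None:
            if not versions or versions[-1][1] is None:
                raise KeyError(key)  # missing, or hidden by a delete marker
            return versions[-1][1]
        for vid, body in versions:
            if vid == version_id and body is not None:
                return body
        raise KeyError((key, version_id))

    def delete_version(self, key, version_id):
        # Permanently removing a delete marker makes the previous version current again.
        self._versions[key] = [v for v in self._versions[key] if v[0] != version_id]


if __name__ == "__main__":
    bucket = VersionedBucket()
    first = bucket.put("a.txt", b"one")
    bucket.put("a.txt", b"two")
    marker = bucket.delete("a.txt")
    print(bucket.get("a.txt", first))   # older version is still readable
    bucket.delete_version("a.txt", marker)
    print(bucket.get("a.txt"))          # latest version is visible again
```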

Management Lifecycle Rules

  • Set of rules to move data between different tiers, to save storage cost
  • Can be used in conjunction with versioning
  • Can be applied to the current versions and previous versions
  • Example: General Purpose --> Infrequent Access --> Glacier
  • Actions:
    • Transition actions: configure objects to transition to another storage class
    • Expiration actions: configure objects to expire (be deleted) after some time
  • Rules can be created for a certain prefix (s3://bucketname/test/*)
  • Rules can be created for certain object tags
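
The General Purpose --> Infrequent Access --> Glacier example can be sketched as a lifecycle rule document. This is an illustrative sketch: `build_lifecycle_rule`, the rule ID, and the day counts are assumptions, and the dict is assumed to follow the rule shape that S3's lifecycle configuration accepts.

```python
import json


def build_lifecycle_rule(rule_id, prefix, ia_after_days, glacier_after_days,
                         expire_after_days, noncurrent_expire_days=None):
    """Build one lifecycle rule: Standard -> Standard-IA -> Glacier -> expire."""
    if not 0 < ia_after_days < glacier_after_days < expire_after_days:
        raise ValueError("day thresholds must be positive and strictly increasing")
    rule = {
        "ID": rule_id,
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},  # a tag filter could be used here instead
        # Transition actions: move current versions to cheaper storage classes.
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_after_days, "StorageClass": "GLACIER"},
        ],
        # Expiration action: delete current versions after this many days.
        "Expiration": {"Days": expire_after_days},
    }
    if noncurrent_expire_days is not None:
        # Applies to previous (noncurrent) versions when versioning is enabled.
        rule["NoncurrentVersionExpiration"] = {"NoncurrentDays": noncurrent_expire_days}
    return rule


if __name__ == "__main__":
    config = {"Rules": [build_lifecycle_rule("archive-test", "test/", 30, 90, 365, 30)]}
    print(json.dumps(config, indent=2))
```

With boto3, a configuration like this would typically be passed as `LifecycleConfiguration` to `put_bucket_lifecycle_configuration`.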

Object Lock

  • Compliance mode
  • Governance mode
  • Legal Hold ??

S3 Replication (Cross Region Replication)

  • Copies objects from a bucket in one region to a bucket in another region (the source bucket is not moved)
  • Must enable versioning (source and destination)
  • Buckets must be in different AWS regions
  • Can be in different accounts
  • Copying is async
  • Must give proper IAM permissions to S3
  • Only new objects are replicated; to replicate existing objects, use S3 Batch Replication
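
A replication setup along these lines can be sketched as a configuration document. This is a hedged sketch: `build_replication_config`, the rule ID, and the ARNs are hypothetical, and the dict is assumed to match the replication configuration shape S3 accepts. Both buckets are assumed to already have versioning enabled.

```python
def build_replication_config(role_arn, destination_bucket_arn,
                             replicate_delete_markers=False):
    """Build a single-rule replication configuration covering every object."""
    return {
        "Role": role_arn,  # IAM role that S3 assumes to copy objects
        "Rules": [
            {
                "ID": "replicate-everything",  # hypothetical rule name
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {},  # empty filter: the rule applies to all objects
                "DeleteMarkerReplication": {
                    "Status": "Enabled" if replicate_delete_markers else "Disabled"
                },
                "Destination": {"Bucket": destination_bucket_arn},
            }
        ],
    }


if __name__ == "__main__":
    config = build_replication_config(
        "arn:aws:iam::111122223333:role/example-replication-role",  # placeholder ARN
        "arn:aws:s3:::example-destination-bucket",                  # placeholder ARN
    )
    print(config["Rules"][0]["DeleteMarkerReplication"])
```

With boto3, a dict like this would typically be passed as `ReplicationConfiguration` to `put_bucket_replication` on the source bucket.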

MFA delete

  • Ensures users cannot delete objects from a bucket unless they provided their MFA code

ETags

  • How do you verify that a file has already been uploaded to S3?
  • Names work, but how can you be sure the file is exactly the same?
  • For this, you can use AWS ETags:
    • For simple uploads (less than 5GB), it's the MD5 hash
    • For multipart uploads, it's more complicated; no need to know the algorithm
  • Using an ETag, we can verify the integrity of files
  • Replication note: delete markers aren't replicated by default

Requester Pays

  • The requester instead of the bucket owner pays the cost of the request and the data download from the bucket
  • The requester must be authenticated in AWS (cannot be anonymous)

Event Notifications

  • React to events that are happening in S3 (create, delete, ... object)
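  • Permissions are required: the target needs an access policy that allows S3 to send to it (SNS topic policy, SQS queue policy, or Lambda resource policy)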
  • Targets
    • SNS
    • SQS
    • Lambda Function
  • EventBridge can also be used
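
For the Lambda target, a handler might read the bucket and key from each notification record roughly as follows. This is a sketch: `handler` and the sample event are illustrative, and only the fields it reads are assumed to be present. Object keys arrive URL-encoded in the event, so they are decoded before use.

```python
from urllib.parse import unquote_plus


def handler(event, context=None):
    """Collect (event name, bucket, key) from an S3 event notification."""
    results = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        results.append({
            "event": record["eventName"],
            "bucket": s3["bucket"]["name"],
            # Keys are URL-encoded in notifications; spaces arrive as "+".
            "key": unquote_plus(s3["object"]["key"]),
        })
    return results


if __name__ == "__main__":
    sample_event = {  # trimmed, illustrative event
        "Records": [{
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "reports/my+file%281%29.txt", "size": 1024},
            },
        }]
    }
    print(handler(sample_event))
```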

Performance

  • Multi-Part upload
  • S3 Transfer Acceleration
  • Byte Range Fetch
  • S3 Select & Glacier Select
  • S3 Batch Operations: perform bulk operations on existing S3 objects with a single request, for example:
    • Modify object metadata
    • Copy objects between S3 buckets
    • Encrypt un-encrypted objects
    • Modify ACLs, tags
    • Restore objects from S3 Glacier
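  • Prefix: spread reads across multiple prefixes; request rate limits apply per prefix (3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second)
  • SSE-KMS comes with limits: uploads and downloads count against KMS request quotas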
  • S3 Analytics ??
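
Byte range fetches and multipart uploads both rely on splitting an object into fixed-size chunks. The sketch below computes the HTTP `Range` header values for such a split; `byte_ranges` is a hypothetical helper, and issuing the requests in parallel is left to the caller.

```python
def byte_ranges(total_size: int, chunk_size: int) -> list:
    """Split an object of total_size bytes into HTTP Range header values."""
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size must be > 0")
    ranges = []
    for start in range(0, total_size, chunk_size):
        end = min(start + chunk_size, total_size) - 1  # Range ends are inclusive
        ranges.append(f"bytes={start}-{end}")
    return ranges


if __name__ == "__main__":
    print(byte_ranges(10, 4))
    # ['bytes=0-3', 'bytes=4-7', 'bytes=8-9']
```

Each value can be sent as the `Range` header of a separate GET request, so that chunks download in parallel and a failed chunk can be retried on its own.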

Security

Encryption

  • At rest: Server-side encryption

    • SSE-S3:
      • Encryption using keys handled, managed and owned by AWS
      • Object is encrypted server-side
      • Encryption type is AES-256
      • To request it explicitly, set header "x-amz-server-side-encryption": "AES256"
      • Enabled by default for new buckets and new objects
    • SSE-KMS
      • Encryption using keys handled and managed by AWS KMS
      • ??
    • SSE-C: encryption using keys provided by the customer
  • At rest: Client-side encryption

    • You encrypt your files before uploading them to S3
  • In transit

    • SSL/TLS
    • HTTPS
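
The three server-side options differ mainly in the request headers sent with an upload. The sketch below builds those headers; `sse_headers` is a hypothetical helper and the key material is a placeholder. For SSE-C, the request is assumed to go over HTTPS, since the key itself travels in a header.

```python
import base64
import hashlib


def sse_headers(mode, kms_key_id=None, customer_key=None):
    """Return the upload headers for the chosen server-side encryption mode."""
    if mode == "SSE-S3":
        return {"x-amz-server-side-encryption": "AES256"}
    if mode == "SSE-KMS":
        headers = {"x-amz-server-side-encryption": "aws:kms"}
        if kms_key_id:  # omit to fall back to the AWS managed key
            headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
        return headers
    if mode == "SSE-C":
        if customer_key is None or len(customer_key) != 32:
            raise ValueError("SSE-C needs a 32-byte (256-bit) key")
        prefix = "x-amz-server-side-encryption-customer-"
        return {
            prefix + "algorithm": "AES256",
            prefix + "key": base64.b64encode(customer_key).decode(),
            # MD5 of the key lets S3 check that the key arrived intact.
            prefix + "key-MD5": base64.b64encode(hashlib.md5(customer_key).digest()).decode(),
        }
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    print(sse_headers("SSE-S3"))
    print(sse_headers("SSE-KMS", kms_key_id="alias/example-key"))  # placeholder key alias
```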

ACLs

  • Work at the individual object level

Bucket policies

  • Work at the level of the entire bucket
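
As an example of a bucket-wide rule, a bucket policy can enforce encryption in transit by denying any request that does not use HTTPS. This is a sketch: `deny_insecure_transport_policy` and the bucket name are hypothetical.

```python
import json


def deny_insecure_transport_policy(bucket_name: str) -> dict:
    """Build a bucket policy that rejects requests made without TLS."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(deny_insecure_transport_policy("example-bucket"), indent=2))
```

With boto3, the JSON string would typically be passed as `Policy` to `put_bucket_policy`.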

S3 Object Lambda

  • Used to transform an object before it is returned to the calling application

Pricing

You don't pay for data:

  • In to S3 from the internet
  • Out to EC2 in the same region
  • Out to CloudFront