For my test preparation I signed up for the AWS Certified Solutions Architect training from A Cloud Guru. I paid $30 for the course, which was a great deal imho, and it provided the information needed to pass the exam.
IAM (Identity and Access Management)
IAM lets you manage users and control their level of access to the AWS Console. IAM is global and does not apply to a specific region.
IAM allows you to do the following:
- Single location to control your AWS account.
- Granular access control.
- Identity Federation, tying AWS to AD etc.
- 2FA - Two factor authentication for accounts.
- Manage password policy.
- For compliance IAM supports PCI DSS.
- Users - Your end users with each account having its specific permissions. New users have NO permissions assigned when first created.
- Access Types
- Programmatic Access - AWS API, CLI, SDK
- AWS Management Console Access - AWS Management Console
- Groups - A collection of users where permissions can be applied to the group. Each user in the group will inherit the permissions of the group.
- Roles - A set of permissions that you create and then assign to AWS resources (for example an EC2 instance) so they can access other services.
- Policies - Documents that define which actions a user, group or role can perform on a resource(s). Written in JSON.
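To give a feel for what a policy looks like in JSON, here is a minimal sketch that builds a read-only S3 policy in Python and prints it. The bucket name is made up for illustration, and the two actions are just one plausible choice, not something taken from the course.

```python
import json

# Hypothetical bucket name, used only for illustration.
BUCKET = "example-study-notes-bucket"

# A policy is a JSON document made of one or more statements.
read_only_policy = {
    "Version": "2012-10-17",  # policy language version string defined by AWS
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                f"arn:aws:s3:::{BUCKET}",    # the bucket itself (for ListBucket)
                f"arn:aws:s3:::{BUCKET}/*",  # the objects inside it (for GetObject)
            ],
        }
    ],
}

policy_json = json.dumps(read_only_policy, indent=2)
print(policy_json)
```

Attached to a user, group or role, a document like this would grant read access to that one bucket and nothing else, which fits the point that new users start with no permissions.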
S3 (Simple Storage Service)
S3 is an object-based storage service where data is spread across multiple systems and locations. Objects consist of the following:
- Version ID
- Access Control Lists
- Files can be from 0 Bytes to 5TB in size.
- Unlimited storage.
- Files are stored in Buckets.
- Universal namespace, so bucket names must be globally unique (they become part of a DNS name). An HTTP 200 response means the document/file upload was successful.
- Availability SLA 99.9%
- Durability 11x9's (99.999999999%) - protection against data loss
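To put the 11x9's figure in perspective, here is a small back-of-the-envelope calculation. It assumes the durability number is an annual per-object figure (which is how AWS states it) and that losses are independent, so treat it as an illustration rather than a guarantee.

```python
from fractions import Fraction

# 11 nines of durability = 99.999999999% chance an object survives a year,
# i.e. an annual loss probability of 1 in 10**11 per object.
ANNUAL_LOSS_PROBABILITY = Fraction(1, 10**11)


def expected_years_per_lost_object(object_count: int) -> Fraction:
    """Average number of years between single-object losses."""
    expected_losses_per_year = object_count * ANNUAL_LOSS_PROBABILITY
    return 1 / expected_losses_per_year


# With ten million objects stored you would expect to lose one object
# roughly every ten thousand years.
print(expected_years_per_lost_object(10_000_000))  # 10000
```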
Tiered Storage Available
- S3 Standard
- S3/IA (Infrequently Accessed) - accessed less but allows for rapid access when needed.
- S3 One Zone-IA - infrequently accessed data kept in a single Availability Zone; best for data that can be recreated if lost, like images.
- Glacier - very inexpensive and used for archival only. Retrieval options are Expedited, Standard or Bulk. A Standard retrieval takes 3-5 hours.
|Feature||S3 Standard||S3 Standard IA||S3 One Zone-IA||Glacier|
|Durability||99.999999999%||99.999999999%||99.999999999%||99.999999999%|
|Availability||99.99%||99.9%||99.5%||N/A|
|Minimum Object Size||N/A||128KB||N/A|
|Minimum Storage Duration||N/A||30 days||90 days|
|Retrieval Fee||N/A||per GB retrieved||per GB retrieved|
|Concurrent facility fault tolerances||2||2||1|
|SSL Support||Yes||Yes||Yes|
|First byte latency||Milliseconds||Milliseconds||Milliseconds|
|Lifecycle Management Policies||Yes||Yes||Yes|
|Considerations||Retrieval fee with objects, most suitable for infrequently accessed data||Not available for real-time access, must restore objects before you can access them, restoring objects can take 3-5 hours.|
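Lifecycle policies are what move objects between these storage classes as they age. Below is a minimal sketch of such a rule. The dictionary follows the shape that, as far as I know, boto3's `put_bucket_lifecycle_configuration` accepts, but the rule ID, prefix and day counts are made up, and `storage_class_at` is a hypothetical helper that only simulates the outcome locally.

```python
# Hypothetical rule: the ID, prefix and day counts are illustrative choices.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-old-backups",
            "Filter": {"Prefix": "backups/"},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
        }
    ]
}


def storage_class_at(age_days: int, rule: dict) -> str:
    """Return the storage class an object would be in at a given age."""
    current = "STANDARD"
    for transition in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


rule = lifecycle_configuration["Rules"][0]
for age in (10, 45, 200):
    print(age, storage_class_at(age, rule))
```

No AWS call is made here; the point is only to show how a single rule walks an object from Standard to IA to Glacier.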
- Versioning stores all versions of an object, including all writes, even if you delete the object.
- Lifecycle management can be used in conjunction with versioning.
- Good for backups.
- Once enabled, versioning cannot be disabled, only suspended.
- Integrates with Lifecycle rules.
- MFA Delete provides additional level of security.
- Access Control and Bucket Policies
- By Default buckets are private and all objects stored inside them are private.
- Read after Write consistency for PUTS of new Objects.
- Eventual Consistency for overwrite PUTS and DELETES (can take some time to propagate).
S3 Cross Region Replication
- Versioning must be enabled on both the source and destination buckets.
- The source and destination buckets must be in different regions.
- Existing files in a bucket are not automatically replicated. All subsequent updated files will be replicated automatically.
- Delete markers are replicated.
- Deleting individual versions or delete markers will not be replicated.
Read the S3 FAQ before taking the exam. S3 makes up a large percentage of the test questions.
- By default, all newly created buckets are PRIVATE
- You can setup access control to your buckets using:
- Bucket Policies
- Access Control Lists
- S3 buckets can be configured to create access logs which log all requests made to the S3 bucket. The logs can be delivered to another bucket.
- In Transit (to and from bucket): SSL/TLS
- At Rest
- Server Side Encryption
- S3 Managed Keys - SSE-S3
- AWS Key Management Service, Managed Keys - SSE-KMS
- Server Side Encryption with Customer Provided Keys SSE-C
- Client Side Encryption
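The three server side options differ mainly in who holds the key. Here is a minimal sketch of how that shows up when uploading an object. The parameter names follow boto3's `put_object` keyword arguments as I understand them, `encryption_params` is a hypothetical helper, and the key values are placeholders.

```python
def encryption_params(mode: str, key: str = "") -> dict:
    """Return extra upload parameters for a server side encryption mode."""
    if mode == "SSE-S3":
        # S3 creates and manages the keys for you.
        return {"ServerSideEncryption": "AES256"}
    if mode == "SSE-KMS":
        # Keys are managed in KMS; a specific key ID is optional.
        params = {"ServerSideEncryption": "aws:kms"}
        if key:
            params["SSEKMSKeyId"] = key
        return params
    if mode == "SSE-C":
        # You supply the key with every request; S3 does not store it.
        if not key:
            raise ValueError("SSE-C requires a customer provided key")
        return {"SSECustomerAlgorithm": "AES256", "SSECustomerKey": key}
    raise ValueError(f"unknown encryption mode: {mode}")


print(encryption_params("SSE-S3"))
print(encryption_params("SSE-KMS", key="placeholder-key-id"))
```

With client side encryption none of this applies: you encrypt the data yourself before it ever reaches S3.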
Storage Gateway is a service that connects an on-premises software appliance with cloud-based storage to provide seamless and secure integration between an organization's on-premises IT environment and AWS's storage infrastructure. The service enables you to securely store data in the AWS cloud for scalable and cost-effective storage. The appliance runs on either VMware ESXi or Microsoft Hyper-V.
Four Types of Storage Gateways
- File Gateway (NFS) - For flat files, stored directly on S3.
- Volume Gateway (iSCSI)
- Stored Volumes - Entire dataset is stored on site and is asynchronously backed up to S3.
- Cached Volumes - Entire dataset is stored on S3 and the most frequently accessed data is cached on site.
- Virtual Tape Gateway (VTL) - Used for backup and uses popular backup applications like NetBackup, Backup Exec, Veeam etc.
Types of Snowballs:
- Snowball is a petabyte-scale data transport solution that uses secure appliances to transfer large amounts of data into and out of AWS.
- Snowball Edge is a 100TB data transfer device with on-board storage and compute capabilities.
- Snowmobile is an Exabyte-scale data transfer service used to move extremely large amounts of data to AWS. You can transfer 100PB per Snowmobile.
S3 Transfer Acceleration utilises the CloudFront Edge Network to accelerate your uploads to S3. Instead of uploading directly to your S3 bucket, you get a distinct URL and upload to an edge location, which then transfers the file to S3.
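For reference, the distinct URL is the bucket name on the `s3-accelerate` endpoint. A minimal sketch follows; `accelerate_endpoint` is a hypothetical helper and the bucket name is made up. The check for periods reflects my understanding that bucket names containing dots cannot be used with acceleration.

```python
def accelerate_endpoint(bucket: str) -> str:
    """Build the accelerated upload endpoint for a bucket."""
    if "." in bucket:
        # Assumption: acceleration requires a bucket name without periods.
        raise ValueError("bucket names with periods are not supported")
    return f"https://{bucket}.s3-accelerate.amazonaws.com"


print(accelerate_endpoint("my-study-bucket"))
```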
CloudFront is a content delivery network (CDN) offered by Amazon Web Services. Content delivery networks provide a globally-distributed network of proxy servers which cache content, such as web videos or other bulky media, closer to consumers, thus improving access speed for downloading the content.
CloudFront Key Terminology
- Edge Location - the location where content will be cached. This is separate from an AWS Region/AZ.
- Origin - the origin of all the files that the CDN will distribute. This can be an S3 Bucket, an EC2 Instance, an Elastic Load Balancer or Route53.
- Distribution - the name given to the CDN, which consists of a collection of Edge Locations.
- Web Distribution - typically used for websites.
- RTMP - used for media streaming.
Amazon Elastic Compute Cloud (EC2) is a web service that provides resizable compute capacity in the cloud.
EC2 Pricing Options
- On Demand - allows you to pay a fixed rate by the hour (or by the second) with no commitment.
- Reserved - provides you with a capacity reservation and offers a significant discount on the hourly charge for an instance. 1 year or 3 year terms.
- Spot - enables you to bid whatever price you want for instance capacity, providing for even greater savings if your applications have flexible start and end times.
- Dedicated Hosts - physical EC2 server dedicated for your use. Dedicated Hosts can help you reduce costs by allowing you to use your existing server-bound software licenses.
EC2 Instance Types
|F1||Field Programmable Gate Array||Genomics research, financial analytics, realtime video processing, big data etc.|
|I3||High Speed Storage||NoSQL DBs, Data Warehousing etc.|
|G3||Graphics Intensive||Video Encoding/3D Application Streaming|
|H1||High Disk Throughput||MapReduce-based workloads, distributed file systems such as HDFS and MapR-FS|
|T2||Lowest Cost, General Purpose||Web Server/Small DBs|
|D2||Dense Storage||Fileservers/Data Warehousing/Hadoop|
|R4||Memory Optimized||Memory Intensive Apps/DBs|
|M5||General Purpose||Application Servers|
|C5||Compute Optimized||CPU Intensive Apps/DBs|
|P3||Graphics/General Purpose GPU||Machine Learning, Bitcoin Mining etc.|
|X1||Memory Optimized||SAP HANA/Apache Spark etc.|
Amazon EBS allows you to create storage volumes and attach them to Amazon EC2 instances. Once attached, you can create a file system on top of these volumes, run a database, or use them in any other way you would use a block device. Volumes are placed in a specific Availability Zone, where they are automatically replicated to protect you from the failure of a single component.
EBS Volume Types
- General Purpose SSD (GP2)
- balances both price and performance
- Ratio of 3 IOPS per GB with up to 10,000 IOPS, which is reached for volumes at 3334 GiB and above. Smaller volumes have the ability to burst up to 3000 IOPS for extended periods of time.
- Provisioned IOPS SSD (IO1)
- Designed for I/O intensive applications such as large relational or NoSQL databases.
- Use if you need more than 10,000 IOPS.
- Can provision up to 20,000 IOPS per volume.
- Throughput Optimized HDD (ST1)
- Big data
- Data warehousing
- Log processing
- Cannot be a boot volume
- Cold HDD (SC1)
- Lowest Cost Storage for infrequently accessed workloads
- File Server
- Cannot be a boot volume
- Magnetic (Standard)
- Lowest cost per gigabyte of all EBS volume types that is bootable. Magnetic volumes are ideal for workloads where data is accessed infrequently, and applications where the lowest storage cost is important.
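The 3 IOPS per GB ratio for GP2 is easy to turn into a quick calculation. This is a sketch using the 10,000 IOPS cap from these notes; the 100 IOPS floor is not in the notes and is my recollection of the AWS documentation, so treat it as an assumption. `gp2_baseline_iops` is a hypothetical helper name.

```python
GP2_IOPS_PER_GIB = 3
GP2_MAX_IOPS = 10_000  # cap quoted in these notes
GP2_MIN_IOPS = 100     # assumption: minimum baseline from the AWS docs


def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS for a GP2 volume of the given size."""
    return min(max(GP2_MIN_IOPS, GP2_IOPS_PER_GIB * size_gib), GP2_MAX_IOPS)


for size in (20, 1000, 3334, 5000):
    print(size, gp2_baseline_iops(size))
```

A 3334 GiB volume is the first size where 3 IOPS per GiB exceeds the cap, which is where that number comes from. Anything that needs more than the cap is IO1 territory.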
Volumes & Snapshots
- Volumes exist on EBS:
- Virtual Hard Disk
- Snapshots exist on S3
- Snapshots are point in time copies of Volumes
- Snapshots are incremental, which means that only the blocks that have changed since your last snapshot are moved to S3. The first snapshot may take some time to create.
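The incremental behaviour can be pictured with a toy model where a volume is a dictionary of numbered blocks. This is only a sketch of the idea, not how EBS is actually implemented, and `take_snapshot` is a hypothetical function.

```python
def take_snapshot(volume: dict, previous_state: dict) -> dict:
    """Return only the blocks that differ from the state at the last snapshot."""
    return {
        index: data
        for index, data in volume.items()
        if previous_state.get(index) != data
    }


volume = {0: "boot", 1: "app", 2: "logs-v1"}

# First snapshot: nothing to compare against, so every block is copied.
first = take_snapshot(volume, {})
state_at_first = dict(volume)

# Only block 2 changes before the second snapshot.
volume[2] = "logs-v2"
second = take_snapshot(volume, state_at_first)

print(len(first), second)  # 3 {2: 'logs-v2'}
```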
Snapshots of Root Device Volumes
- To create a snapshot for EBS volumes that serve as root devices, you should stop the instance before taking the snapshot (best practice)
- However you can take a snapshot while the instance is running (performance hit)
- You can create AMI's from EBS-backed instances and snapshots
- You can change EBS volume sizes on the fly, including changing the size and storage type
- Volumes will ALWAYS be in the same availability zone as the EC2 instance
- To move an EC2 volume from one AZ/Region to another, take a snapshot or an image of it then copy it to the new AZ/Region
Volumes vs Snapshots - Security
- Snapshots of encrypted volumes are encrypted automatically
- Volumes restored from encrypted snapshots are encrypted automatically
- You can share snapshots but only if they are unencrypted
- These snapshots can be shared with other AWS accounts or made public
AMI Types (EBS vs Instance Store)
- All AMIs are categorized as either backed by Amazon EBS or backed by instance store
- For EBS Volumes the root device for an instance launched from the AMI is an Amazon EBS volume created from an Amazon EBS snapshot
- For Instance Store Volumes the root device for an instance launched from the AMI is an instance store volume created from a template stored in Amazon S3
- Instance Store Volumes are sometimes called Ephemeral Storage
- Instance store backed instances cannot be stopped. If the underlying host fails, you will lose the data
- EBS backed instances can be stopped. You will not lose the data on the instance if it is stopped
- You can reboot both, you will not lose the data
- By default, both ROOT volumes will be deleted on termination, however, with EBS volumes, you can tell AWS to keep the root device volume
Security Groups
- All Inbound traffic is blocked by default
- All Outbound traffic is allowed by default
- Changes to Security Groups take effect immediately
- You can have any number of EC2 instances within a security group
- You can have multiple security groups attached to EC2 instances
- Security Groups are STATEFUL
- If you create an inbound rule allowing traffic in, that traffic is automatically allowed back out again
- You cannot block specific IP addresses using Security Groups, instead use Network Access Control Lists.
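The "allow rules only" behaviour is easier to see in a small model. The class below is a hypothetical sketch, not an AWS API: inbound traffic is denied unless some allow rule matches, and since there is no deny rule there is no way to single out one IP address to block. Statefulness (return traffic being allowed automatically) is not modelled here.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address, ip_network


@dataclass
class SecurityGroup:
    # Each rule is a (port, network) pair that ALLOWS traffic.
    inbound_rules: list = field(default_factory=list)

    def allow_inbound(self, port: int, cidr: str) -> None:
        self.inbound_rules.append((port, ip_network(cidr)))

    def permits_inbound(self, source_ip: str, port: int) -> bool:
        # Traffic passes only if at least one allow rule matches.
        addr = ip_address(source_ip)
        return any(port == p and addr in net for p, net in self.inbound_rules)


web = SecurityGroup()
print(web.permits_inbound("203.0.113.10", 443))  # False: blocked by default
web.allow_inbound(443, "0.0.0.0/0")
print(web.permits_inbound("203.0.113.10", 443))  # True: rule now matches
print(web.permits_inbound("203.0.113.10", 22))   # False: no rule for port 22
```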
Elastic Load Balancers - 3 Types
- Application Load Balancer - best suited for load balancing of HTTP and HTTPS traffic. They operate at Layer 7 and are application aware. They are intelligent and you can create advanced request routing, sending specified requests to specific web servers.
- Network Load Balancer - best suited for load balancing of TCP traffic where extreme performance is required. Operating at the connection level (Layer 4), Network Load Balancers are capable of handling millions of requests per second while maintaining ultra-low latencies.
- Classic Load Balancer - the legacy Elastic Load Balancers. You can load balance HTTP/HTTPS applications and use Layer 7 specific features such as X-Forwarded-For headers and sticky sessions. You can also use strict Layer 4 load balancing for applications that rely purely on the TCP protocol.
- A 504 Error means the gateway has timed out, i.e. the application is not responding within the idle timeout period.
- If you need the IPv4 address of your end user, look for the X-Forwarded-For header
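Behind a load balancer your application sees the balancer's address as the source, so the end user's address has to be read from the header. A minimal sketch, assuming the headers arrive as a plain dictionary and that the left-most entry is the original client (each proxy appends its own address to the right). `client_ip` is a hypothetical helper.

```python
from typing import Optional


def client_ip(headers: dict) -> Optional[str]:
    """Return the end user's IP from X-Forwarded-For, if present."""
    # Header names are case-insensitive, so normalise the keys first.
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get("x-forwarded-for")
    if not value:
        return None
    # Format: "client, proxy1, proxy2"; the client is the first entry.
    return value.split(",")[0].strip()


print(client_ip({"X-Forwarded-For": "203.0.113.7, 10.0.0.12"}))
```

Keep in mind that a client can send this header itself, so the value is useful for logging but should not be trusted blindly.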