Automated Static Site Deployment in AWS Using Terraform

pirx
May 2, 2022

This is a quick and reusable way to deploy an AWS-hosted environment for static sites generated by the likes of Hugo and Jekyll. The code is written in Terraform, which allows us to set up (and tear down, if needed) all the necessary components just by running a few commands.

I found several good articles and examples online. However, there was not a single source that had everything I needed. My requirements were:

  • S3 + Cloudfront to host the static files and serve/cache the content, respectively
  • The S3 bucket remains private, with only Cloudfront having read-only access to it
  • Can serve more than a single page — not just /index.html but also /posts/, etc.
  • Aside from the domain’s hosted zone, let Terraform configure (and destroy!) everything for me.

Everything I learned led me to write my own Terraform module, terraform-aws-static-site, making it easier to integrate across projects.

There are instructions on the GitHub page on how to use it. Essentially, you clone it somewhere and reference it in your own Terraform project. For example, you could end up with a file structure like this:

your_project/
├── terraform-aws-static-site/   # This module!
├── main.tf                      # Your own code
├── output.tf                    # Your own code
└── variables.tf                 # Your own code

At the minimum, you’d have this in your main.tf:

provider "aws" {
  region = "us-east-1"
}

module "my_static_site" {
  source          = "./terraform-aws-static-site"
  name            = "my_site"
  domain          = "example.com"
  subdomains      = ["www"]
  route53_zone_id = "EXAMPLE12345"
}

The module accepts more options, so be sure to consult the README.

The above code sets up your site at https://example.com, with the subdomain URL https://www.example.com simply redirecting to the apex domain. (Accessing via http:// also redirects to https://.)

From the your_project/ directory, you'd then run:

terraform init   # Initialize the providers, modules etc.
terraform plan # Review the changes that will be applied
terraform apply # Start the deployment!

Wait a few minutes, and if there are no errors, your environment should be all set! (Later, terraform destroy will tear down everything the module created.)

As a final step, sync your static files into the S3 bucket, which has the same name as your site’s domain. For example, if your static files are in the ~/my_site/public directory:

aws s3 sync ~/my_site/public s3://example.com/ --delete

At this point, you should be able to access your site at https://example.com. If not, give it a few more minutes for the DNS records to propagate.

Done! The next section breaks down the different parts of the module.

Module Details

The module configures these AWS resources in an automated fashion:

Origin Access Identity (OAI)

The OAI serves as the CloudFront distribution’s access point into the S3 bucket. Setting this up keeps the bucket private, with CloudFront as the only principal able to read from it directly.

resource "aws_cloudfront_origin_access_identity" "static_site" {
  comment = var.domain
}

S3 bucket to store the source files

resource "aws_s3_bucket" "static_site" {
  bucket = var.domain
}

Pretty straightforward; the bucket name is the same as the apex domain.

Bucket ACL, IAM policies, public access block

resource "aws_s3_bucket_acl" "static_site" {
  bucket = aws_s3_bucket.static_site.id
  acl    = "private"
}

data "aws_iam_policy_document" "read_static_site_bucket" {
  statement {
    actions   = ["s3:GetObject"]
    resources = ["${aws_s3_bucket.static_site.arn}/*"]

    principals {
      type        = "AWS"
      identifiers = [aws_cloudfront_origin_access_identity.static_site.iam_arn]
    }
  }

  statement {
    actions   = ["s3:ListBucket"]
    resources = [aws_s3_bucket.static_site.arn]

    principals {
      type        = "AWS"
      identifiers = [aws_cloudfront_origin_access_identity.static_site.iam_arn]
    }
  }
}

resource "aws_s3_bucket_policy" "read_static_site" {
  bucket = aws_s3_bucket.static_site.id
  policy = data.aws_iam_policy_document.read_static_site_bucket.json
}

resource "aws_s3_bucket_public_access_block" "static_site" {
  bucket                  = aws_s3_bucket.static_site.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = false
}

I grouped all these resources together as they all pertain to access control. Essentially, these configurations make the bucket private with only the OAI having read-only access to it.
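To make the effect concrete, here is roughly what the generated bucket policy renders to for example.com. The OAI ID is made up for illustration; the real one is generated by AWS:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity E2EXAMPLE1ABCD"
      },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example.com/*"
    },
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity E2EXAMPLE1ABCD"
      },
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::example.com"
    }
  ]
}
```

Note that no statement grants anything to "*" — only the OAI principal can read.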

TLS certificate

Now this is where it gets interesting. This block creates the TLS certificate for the website.

resource "aws_acm_certificate" "static_site" {
  provider                  = aws.acm
  domain_name               = var.domain
  subject_alternative_names = [for s in var.subdomains : "${s}.${var.domain}"]
  validation_method         = "DNS"

  lifecycle {
    create_before_destroy = true
  }
}

By default, the certificate is created only for the apex domain. However, the module accepts subdomains, and any that are defined get added to the certificate as Subject Alternative Names (SANs).

Before AWS issues the certificate, it needs to verify that we actually own the domain! Hence the validation_method = "DNS" line. It tells ACM to validate ownership by looking for very specific DNS records in the domain’s zone. The next section sets these up.
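For illustration, a validation record typically looks something like this (the record name and target here are made up; ACM generates the real values):

```
_3c5b2e8a9f.example.com.  CNAME  _7f1d4a2b6c.acm-validations.aws.
```

ACM polls for this record, and once it resolves, the certificate is issued.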

Certificate validation

For each domain and subdomain in the certificate, ACM provides a CNAME record name and value. These serve as a kind of shared secret between ACM and the DNS provider. Once these records exist in the domain’s zone, ACM can validate ownership, issue the certificate, and renew it automatically.

The configuration below takes care of adding these records to our domain’s zone in Route 53.

resource "aws_route53_record" "validation" {
  for_each = {
    for dvo in aws_acm_certificate.static_site.domain_validation_options : dvo.domain_name => {
      name   = dvo.resource_record_name
      record = dvo.resource_record_value
      type   = dvo.resource_record_type
    }
  }

  allow_overwrite = true
  name            = each.value.name
  records         = [each.value.record]
  ttl             = 60
  type            = each.value.type
  zone_id         = var.route53_zone_id
}

And on the ACM side, this resource performs the validation:

resource "aws_acm_certificate_validation" "static_site" {
  provider                = aws.acm
  certificate_arn         = aws_acm_certificate.static_site.arn
  validation_record_fqdns = [for record in aws_route53_record.validation : record.fqdn]
}

For more information, see the AWS docs on DNS Validation.

DNS records

This section creates the DNS records for our static site — one resource for the apex domain, which is an A record, and another for the subdomains, which are CNAME records.

resource "aws_route53_record" "static_site" {
  zone_id = var.route53_zone_id
  name    = var.domain
  type    = "A"

  alias {
    name                   = aws_cloudfront_distribution.static_site.domain_name
    zone_id                = aws_cloudfront_distribution.static_site.hosted_zone_id
    evaluate_target_health = false
  }
}

resource "aws_route53_record" "static_site_subdomains" {
  for_each = toset(var.subdomains)
  zone_id  = var.route53_zone_id
  name     = each.value
  type     = "CNAME"
  ttl      = 300
  records  = [var.domain]
}

You might have noticed that the CloudFront distribution is already referenced here. Don’t worry: Terraform builds a dependency graph and figures out which resources to create first (most of the time).

Cloudfront distribution

Finally, we get to the chunky but essential Cloudfront distribution resource.

resource "aws_cloudfront_distribution" "static_site" {
  origin {
    domain_name = aws_s3_bucket.static_site.bucket_regional_domain_name
    origin_id   = var.domain

    s3_origin_config {
      origin_access_identity = aws_cloudfront_origin_access_identity.static_site.cloudfront_access_identity_path
    }
  }

  aliases             = concat([var.domain], [for s in var.subdomains : "${s}.${var.domain}"])
  enabled             = true
  is_ipv6_enabled     = false
  comment             = "CDN for ${var.domain}"
  default_root_object = "index.html"

  default_cache_behavior {
    allowed_methods  = ["GET", "HEAD", "OPTIONS"]
    cached_methods   = ["GET", "HEAD"]
    target_origin_id = var.domain

    forwarded_values {
      query_string = false
      cookies {
        forward = "none"
      }
    }

    function_association {
      event_type   = "viewer-request"
      function_arn = aws_cloudfront_function.static_site.arn
    }

    viewer_protocol_policy = "redirect-to-https"
    min_ttl                = var.cache_ttl.min
    default_ttl            = var.cache_ttl.default
    max_ttl                = var.cache_ttl.max
  }

  price_class = var.price_class

  viewer_certificate {
    acm_certificate_arn      = aws_acm_certificate_validation.static_site.certificate_arn
    ssl_support_method       = "sni-only"
    minimum_protocol_version = "TLSv1"
  }

  custom_error_response {
    error_caching_min_ttl = 300
    error_code            = 404
    response_code         = 404
    response_page_path    = var.html_404
  }

  restrictions {
    geo_restriction {
      restriction_type = var.block_ofac_countries ? "blacklist" : "none"
      locations        = var.block_ofac_countries ? var.ofac_countries : []
    }
  }
}

Here, the pieces we worked so hard to configure come together: the DNS records, the OAI, and the TLS certificate.

The aws_cloudfront_distribution resource has a lot more options to offer. To see all the possibilities, refer to the Terraform docs.

There is one last thing we haven’t covered: the aws_cloudfront_function.

Cloudfront function

The final piece of the puzzle!

Traditionally, a CloudFront distribution only gave us the option of serving a single-page website whose index page is defined via the default_root_object option (see above). This means that CloudFront alone could not serve a static site with sub-pages, such as /blog or /about, on top of the root page. To make this possible, we would need to either

  • Make our S3 bucket a website in itself (therefore, public)
  • Use another service such as Lambda@Edge to handle rewrite and redirect rules

However, in May 2021 AWS introduced CloudFront Functions, which now let us deploy rewrite and redirect rules directly in CloudFront, and at a fraction of the cost of Lambda@Edge! Hurrah!

Back to the Terraform code:

resource "aws_cloudfront_function" "static_site" {
  name    = "${replace(var.domain, ".", "_")}_index_rewrite"
  runtime = "cloudfront-js-1.0"
  comment = "index.html rewrite for S3 origin"
  publish = true
  code    = templatefile("${path.module}/function.js.tftpl", { domain = var.domain, subdomains = var.subdomains })
}

This configuration points to function.js.tftpl, a templated JavaScript file. The file contains this code:

function handler(event) {
  var request = event.request;
  var host = request.headers.host.value;
  var uri = request.uri;

%{ for sub in subdomains }
  // Redirect to apex domain if using ${sub} subdomain.
  if (host.startsWith("${sub}.")) {
    var newurl = 'https://${domain}';
    var response = {
      statusCode: 302,
      statusDescription: 'Found',
      headers: { "location": { "value": newurl } }
    };
    return response;
  }
%{ endfor ~}

  // Check whether the URI is missing a file name.
  if (uri.endsWith('/')) {
    request.uri += 'index.html';
  }
  // Check whether the URI is missing a file extension.
  else if (!uri.includes('.')) {
    request.uri += '/index.html';
  }

  return request;
}

This does two things:

  • If our site is hit using one of the subdomains we defined (e.g. https://www.example.com), simply redirect to the apex URL (https://example.com)
  • Each request with a trailing / gets rewritten so a request for the underlying index.html is made instead. This mimics the behavior of a traditional HTTP server such as Apache or Nginx.
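The rewrite half of the function can be sketched as a standalone Node.js snippet. Here, rewriteUri is a hypothetical helper name for illustration (CloudFront calls handler directly), but the branching mirrors the template above:

```javascript
// Sketch of the index.html rewrite rules, outside the CloudFront runtime.
// rewriteUri is an illustrative name, not part of the actual function.
function rewriteUri(uri) {
  // A trailing slash means a directory index was requested.
  if (uri.endsWith('/')) {
    return uri + 'index.html';
  }
  // No file extension: treat it as a directory and append /index.html.
  if (!uri.includes('.')) {
    return uri + '/index.html';
  }
  // Anything with an extension (e.g. /style.css) passes through untouched.
  return uri;
}

console.log(rewriteUri('/posts/'));    // → /posts/index.html
console.log(rewriteUri('/about'));     // → /about/index.html
console.log(rewriteUri('/style.css')); // → /style.css
```

This is exactly the behavior an Apache or Nginx `index index.html` directive would give you.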

And that’s about it! Hope this post has been useful for some.

Originally published at https://pirx.io on May 2, 2022.


pirx

Security engineer when not distracted by other things