DNS and CI/CD Pipelines

Automated SSL provisioning, blue-green deployments, canary releases, and zero-downtime migrations — DNS as a deployment primitive.

DNS isn’t just configuration — it’s a deployment tool. The ability to redirect traffic by changing a record opens up powerful deployment patterns that are impossible (or at least much harder) at other layers. When you combine DNS management APIs with CI/CD pipelines, you get automated SSL provisioning, instant traffic shifting, graceful rollbacks, and zero-downtime migrations.

Let’s explore how modern teams use DNS as a first-class component of their deployment infrastructure.

Automated SSL Certificate Provisioning

The most common DNS + CI/CD integration is automated TLS certificate provisioning via Let’s Encrypt (or similar ACME providers). The dns-01 challenge proves domain ownership by creating a specific TXT record.

How DNS-01 Challenge Works

1. Your automation requests a certificate for example.com
2. ACME server provides a challenge token
3. You create a TXT record: _acme-challenge.example.com → token
4. ACME server queries DNS to verify the record exists
5. Certificate is issued
6. You clean up the TXT record
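One detail worth knowing: the TXT record value in step 3 is not the raw token. Per RFC 8555, it is the base64url-encoded SHA-256 of the key authorization (the token joined to your ACME account key's thumbprint with a dot), with padding stripped. A minimal sketch of the computation; the input values below are made up for illustration:

```python
import base64
import hashlib

def dns01_txt_value(token: str, account_thumbprint: str) -> str:
    """Compute the TXT record value for an ACME dns-01 challenge.

    The key authorization is "<token>.<account-key-thumbprint>"; the
    record value is the base64url-encoded SHA-256 of that string
    (RFC 8555, section 8.4), with '=' padding stripped.
    """
    key_authorization = f"{token}.{account_thumbprint}"
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")

# Hypothetical token and thumbprint, purely for illustration:
print(dns01_txt_value("evaGxfADs6pSRb2LAv9IZf17Dt3juxGJ-PCt92wr-oA",
                      "9jg46WB3rR_AHD-EBXdN7cBkH1WOu0tA3M9fm21mqTI"))
```

Tools like certbot do this for you; it only matters if you script the ACME flow yourself.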

Automated with certbot + Cloudflare

# Install the Cloudflare DNS plugin
$ pip install certbot-dns-cloudflare

# Create credentials file
$ cat > /etc/letsencrypt/cloudflare.ini << EOF
dns_cloudflare_api_token = YOUR_API_TOKEN
EOF
$ chmod 600 /etc/letsencrypt/cloudflare.ini

# Request certificate with DNS validation
$ certbot certonly \
  --dns-cloudflare \
  --dns-cloudflare-credentials /etc/letsencrypt/cloudflare.ini \
  -d example.com \
  -d "*.example.com"   # Wildcard requires DNS validation!

In a CI/CD Pipeline

# .github/workflows/ssl-renewal.yml
name: SSL Certificate Renewal

on:
  schedule:
    - cron: '0 3 1,15 * *'  # 1st and 15th of each month

jobs:
  renew:
    runs-on: ubuntu-latest
    steps:
      - name: Install certbot
        run: |
          pip install certbot certbot-dns-cloudflare

      - name: Create credentials
        run: |
          echo "dns_cloudflare_api_token = ${{ secrets.CLOUDFLARE_API_TOKEN }}" > cf.ini
          chmod 600 cf.ini

      - name: Renew certificate
        run: |
          certbot certonly --dns-cloudflare \
            --dns-cloudflare-credentials cf.ini \
            --dns-cloudflare-propagation-seconds 30 \
            -d example.com -d "*.example.com" \
            --non-interactive --agree-tos \
            -m admin@example.com

      - name: Deploy certificate
        run: |
          # Upload to your server, load balancer, CDN, etc.
          ./scripts/deploy-cert.sh

The --dns-cloudflare-propagation-seconds 30 flag tells certbot to wait 30 seconds after creating the TXT record before asking the ACME server to verify, giving the new record time to appear on all of the provider's authoritative nameservers.
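A fixed sleep is the simple approach; a more robust pipeline polls DNS until the record is actually visible before triggering verification. Here is a sketch with the lookup function injected, so it works with whatever DNS client your pipeline already has (dnspython, a subprocess around dig, and so on):

```python
import time

def wait_for_txt(lookup, name, expected, timeout=120, interval=5):
    """Poll `lookup(name)` until `expected` appears among the returned
    TXT strings, or raise TimeoutError.

    `lookup` is any callable taking a DNS name and returning a list of
    TXT record strings; injecting it keeps this sketch client-agnostic.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if expected in lookup(name):
            return True
        time.sleep(interval)
    raise TimeoutError(f"{name} did not show {expected!r} within {timeout}s")
```

For the most reliable signal, point the lookup at the zone's authoritative nameservers rather than a caching resolver.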

Wildcard Certificates

DNS-01 is the only ACME challenge type that supports wildcard certificates (*.example.com). HTTP-01 can’t verify wildcards because there’s no specific URL to serve the challenge on. This makes DNS automation essential for any organization using wildcard certs.

Blue-Green Deployments with DNS

Blue-green deployment maintains two identical production environments. At any time, one (“blue”) serves live traffic while the other (“green”) is idle or being prepared for the next release. DNS switches traffic between them.

Architecture

                    DNS
                     │
            ┌────────┴────────┐
            │                 │
      ┌─────▼─────┐   ┌──────▼────┐
      │   BLUE    │   │   GREEN   │
      │ (active)  │   │ (standby) │
      │ 10.0.1.10 │   │ 10.0.2.10 │
      └───────────┘   └───────────┘

Implementation

#!/bin/bash
# blue-green-deploy.sh

CF_API="https://api.cloudflare.com/client/v4"
CF_TOKEN="${CF_TOKEN:?set CF_TOKEN in the environment}"
ZONE_ID="your-zone-id"
DOMAIN="app.example.com"
BLUE_IP="10.0.1.10"
GREEN_IP="10.0.2.10"

# Step 1: Determine current active environment
# Query the zone's authoritative nameserver directly to bypass caches
# (substitute the nameserver actually assigned to your zone)
CURRENT_IP=$(dig +short $DOMAIN @ns1.cloudflare.com)

if [ "$CURRENT_IP" = "$BLUE_IP" ]; then
  NEW_IP=$GREEN_IP
  NEW_ENV="green"
else
  NEW_IP=$BLUE_IP
  NEW_ENV="blue"
fi

echo "Switching from $CURRENT_IP to $NEW_IP ($NEW_ENV)"

# Step 2: Deploy new version to inactive environment
./deploy.sh $NEW_ENV

# Step 3: Run smoke tests against inactive environment
if ! ./smoke-test.sh $NEW_IP; then
  echo "Smoke tests failed! Aborting."
  exit 1
fi

# Step 4: Switch DNS (low TTL is critical!)
RECORD_ID=$(curl -s "${CF_API}/zones/${ZONE_ID}/dns_records?name=${DOMAIN}&type=A" \
  -H "Authorization: Bearer ${CF_TOKEN}" | jq -r '.result[0].id')

curl -X PUT "${CF_API}/zones/${ZONE_ID}/dns_records/${RECORD_ID}" \
  -H "Authorization: Bearer ${CF_TOKEN}" \
  -H "Content-Type: application/json" \
  -d "{\"type\":\"A\",\"name\":\"${DOMAIN}\",\"content\":\"${NEW_IP}\",\"ttl\":60}"

echo "DNS switched to $NEW_ENV ($NEW_IP)"
echo "Rollback: re-run this script to switch back"

Critical requirement: The TTL for the A record must be low (60–300 seconds). If your TTL is 3600 seconds, it could take an hour for all users to see the new environment. During that time, traffic splits between old and new — which might be fine, or might be catastrophic if the database schema changed.

Rollback

The beauty of blue-green with DNS: rollback is just switching the record back. No redeployment needed. The old environment is still running with the previous version.

Canary Releases via Weighted DNS

Weighted DNS sends a percentage of traffic to a new version, letting you test in production with minimal blast radius.

Route 53 Weighted Routing

# Send 90% of traffic to stable, 10% to canary
$ aws route53 change-resource-record-sets \
  --hosted-zone-id Z1234567890 \
  --change-batch '{
    "Changes": [
      {
        "Action": "UPSERT",
        "ResourceRecordSet": {
          "Name": "app.example.com",
          "Type": "A",
          "SetIdentifier": "stable",
          "Weight": 90,
          "TTL": 60,
          "ResourceRecords": [{"Value": "10.0.1.10"}]
        }
      },
      {
        "Action": "UPSERT",
        "ResourceRecordSet": {
          "Name": "app.example.com",
          "Type": "A",
          "SetIdentifier": "canary",
          "Weight": 10,
          "TTL": 60,
          "ResourceRecords": [{"Value": "10.0.2.10"}]
        }
      }
    ]
  }'

Gradual Rollout Script

#!/bin/bash
# canary-rollout.sh — Gradually shift traffic to canary

ZONE_ID="Z1234567890"
WEIGHTS=(10 25 50 75 100)
STABLE_IP="10.0.1.10"
CANARY_IP="10.0.2.10"

for canary_weight in "${WEIGHTS[@]}"; do
  stable_weight=$((100 - canary_weight))

  echo "Setting traffic split: stable=${stable_weight}% canary=${canary_weight}%"

  aws route53 change-resource-record-sets \
    --hosted-zone-id $ZONE_ID \
    --change-batch "{
      \"Changes\": [
        {
          \"Action\": \"UPSERT\",
          \"ResourceRecordSet\": {
            \"Name\": \"app.example.com\",
            \"Type\": \"A\",
            \"SetIdentifier\": \"stable\",
            \"Weight\": ${stable_weight},
            \"TTL\": 60,
            \"ResourceRecords\": [{\"Value\": \"${STABLE_IP}\"}]
          }
        },
        {
          \"Action\": \"UPSERT\",
          \"ResourceRecordSet\": {
            \"Name\": \"app.example.com\",
            \"Type\": \"A\",
            \"SetIdentifier\": \"canary\",
            \"Weight\": ${canary_weight},
            \"TTL\": 60,
            \"ResourceRecords\": [{\"Value\": \"${CANARY_IP}\"}]
          }
        }
      ]
    }"

  echo "Waiting 5 minutes and checking error rates..."
  sleep 300

  # Check error rate (your monitoring system here)
  ERROR_RATE=$(curl -s "https://monitoring.example.com/api/error-rate?env=canary")
  if (( $(echo "$ERROR_RATE > 5.0" | bc -l) )); then
    echo "Error rate ${ERROR_RATE}% exceeds threshold! Rolling back."
    # Set canary weight to 0
    aws route53 change-resource-record-sets \
      --hosted-zone-id $ZONE_ID \
      --change-batch "..." # Full rollback batch
    exit 1
  fi
done

echo "Canary rollout complete. Canary is now handling 100% of traffic."

DNS Canary Limitations

DNS-based canary releases have a key limitation: the traffic split is per-resolver, not per-request. A resolver caches the weighted response for the TTL duration, so all users behind that resolver go to the same backend until the TTL expires. This makes the split approximate, not precise.

For exact traffic splitting, use a load balancer or service mesh. DNS-based canary works best for coarse-grained shifts (10% → 50% → 100%) rather than precise per-request routing.
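You can see how approximate the split is with a quick simulation: within one TTL window, each resolver draws a single weighted answer and every user behind it inherits that choice. This is a sketch under simplifying assumptions (it ignores TTL expiry and treats all resolvers in a run as equal-sized):

```python
import random

def simulate_weighted_dns(n_resolvers, users_per_resolver, canary_weight):
    """Model one TTL window of weighted DNS: each resolver receives a
    single weighted answer and all of its users hit that backend until
    the cache expires. Returns the observed canary traffic percentage."""
    canary_users = 0
    for _ in range(n_resolvers):
        if random.random() < canary_weight / 100:
            canary_users += users_per_resolver
    return 100 * canary_users / (n_resolvers * users_per_resolver)

random.seed(1)
# Many small resolvers: observed split tracks the configured 10% closely
print(simulate_weighted_dns(20000, 5, 10))
# A few big resolvers: the same 10% weight can land far off target
print(simulate_weighted_dns(8, 12500, 10))
```

The second case is the one that bites in practice: a handful of large ISP resolvers can push your "10% canary" to 0% or 25% for an entire TTL window.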

Zero-Downtime Domain Migrations

Moving your domain to a new hosting provider without downtime requires careful DNS orchestration.

The Migration Playbook

Day -7:  Lower TTL to 300 seconds (5 minutes)
Day -1:  Verify low TTL has propagated
Day  0:  Execute the migration
Day +1:  Monitor and verify
Day +7:  Raise TTL back to 3600

Step-by-Step

# Day -7: Lower the TTL
# Change your A record TTL from 3600 to 300

# Day -1: Verify the low TTL is being served
$ dig example.com +short            # Check answer
$ dig +noall +answer example.com    # Second field of the answer is the TTL
# Against the authoritative nameserver this should show 300; against a
# recursive resolver it shows the remaining cached TTL (300 or less).

# Day 0: The migration

# 1. Set up the new server and verify it works
$ curl -H "Host: example.com" http://NEW_SERVER_IP/
# Should return your site

# 2. Switch the DNS record
$ curl -X PUT "${CF_API}/zones/${ZONE_ID}/dns_records/${RECORD_ID}" \
  -H "Authorization: Bearer ${CF_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{"type":"A","name":"example.com","content":"NEW_SERVER_IP","ttl":300}'

# 3. Keep old server running! Users with cached records still hit it.
# Wait at least 2x the old TTL before decommissioning.

# 4. Monitor from multiple locations
$ for resolver in 8.8.8.8 1.1.1.1 9.9.9.9 208.67.222.222; do
    echo -n "$resolver: "
    dig example.com @$resolver +short
  done

# Day +7: Raise TTL back
# Only after confirming everything is stable
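The "wait at least 2x the old TTL" rule in step 3 is easy to fumble under deadline pressure, so it is worth computing rather than eyeballing. A tiny helper, with illustrative names:

```python
from datetime import datetime, timedelta

def safe_decommission_time(switch_time, old_ttl_seconds, safety_factor=2):
    """Conservative earliest time to stop the old server.

    A resolver that cached the record just before the DNS switch keeps
    serving the old IP for up to the old TTL; the safety factor adds
    headroom for sticky or misbehaving caches.
    """
    return switch_time + timedelta(seconds=old_ttl_seconds * safety_factor)

switched = datetime(2024, 6, 1, 9, 0)   # hypothetical switch time
print("Decommission no earlier than:",
      safe_decommission_time(switched, 3600))  # old TTL was 3600s
```

With a 3600-second old TTL and a factor of 2, that is two hours after the switch, which is exactly why the playbook lowers the TTL a week in advance.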

Migration with CNAME Intermediary

For platforms that provide a CNAME target, you can use an intermediary pattern:

; Before: pointing directly to old server
example.com.    IN A    OLD_SERVER_IP

; Step 1: Add CNAME for www (if not already)
www.example.com.    IN CNAME    old-server.hosting.com.

; Step 2: Update CNAME to new platform
www.example.com.    IN CNAME    your-site.new-platform.app.

; Step 3: Handle the root domain. A true CNAME is not allowed at the
; zone apex, so use your provider's ALIAS/ANAME/CNAME-flattening record.
example.com.    IN ALIAS    your-site.new-platform.app.

The advantage: future hosting changes only require updating the CNAME target, and the hosting provider can change their IPs without breaking your DNS.

DNS in Feature Flag Systems

DNS can function as a lightweight feature flag mechanism:

# Feature enabled: points to feature-enabled server
feature-x.internal.example.com.    IN A    10.0.1.100

# Feature disabled: points to feature-disabled server (or doesn't exist)
# Application checks: can I resolve feature-x.internal.example.com?

This is niche but effective for infrastructure-level feature flags — enabling/disabling entire backends, routing to A/B test clusters, or toggling between service versions.

More practically, combine DNS with application-level feature flags:

import socket

def is_feature_enabled(feature_name):
    """Check if a feature is enabled via DNS.

    Caveats: each call is a blocking resolver lookup, and flag flips
    propagate only as fast as the record's TTL (resolvers cache
    NXDOMAIN responses too).
    """
    try:
        socket.gethostbyname(f"{feature_name}.flags.internal.example.com")
        return True
    except socket.gaierror:
        return False

# Usage
if is_feature_enabled("new-checkout"):
    serve_new_checkout()
else:
    serve_old_checkout()

Key Takeaways

  • DNS-01 ACME challenges enable automated wildcard certificate provisioning
  • Blue-green deployments with DNS give you instant traffic switching and rollback
  • Weighted DNS enables canary releases, but traffic splitting is approximate (per-resolver, not per-request)
  • Zero-downtime migrations require the TTL-lowering playbook: lower → wait → migrate → verify → raise
  • Keep old servers running after DNS changes until 2x the old TTL has elapsed
  • DNS-based feature flags are lightweight but limited — best for infrastructure-level toggles
  • Low TTLs are essential for any DNS-based deployment strategy

You’ve completed Part 7: Practical DNS for Developers. You now have hands-on skills for setting up, debugging, and automating DNS in real-world workflows. Finally, Part 8: The Future of DNS looks ahead — decentralized naming, AI agents, encrypted client hello, post-quantum cryptography, and where the Domain Name System is headed next.