Levenshtein Distance and Distance Similarity Functions

Russian scientist Vladimir Levenshtein introduced the Levenshtein Distance in 1965. The algorithm produces the number of edits (i.e., insertions, deletions, and substitutions) required to change one string into another. Consider the distance between “Steven” and “Stephen”:

Step-1: Replace the "v" with a "p"
Step-2: Insert an "h" after the "p"

It takes two edits to change “Steven” to “Stephen”. The distance is 2.

The distance similarity expands on the distance algorithm by creating a percentage from the number of edits (distance algorithm’s output). The percentage indicates how similar the words are to one another. We needed two edits to change Steven to Stephen in the example above. That could be expressed by saying Steven is 71% similar to Stephen.

It’s calculated with the following formula:

[distance_similarity] = 100 - ([distance] / [length_of_longest_string] * 100)
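
Applying the formula to the example above gives 100 - (2 / 7 * 100) ≈ 71.4%. The following is a minimal Python sketch of the same calculation (the helper name is my own, not part of any library):

def distance_similarity(distance, left, right):
    # Convert an edit distance into a percentage similarity
    longest = max(len(left), len(right))
    return 100 - (distance / longest * 100)

# "Steven" -> "Stephen" takes 2 edits; the longer string has 7 characters
print(round(distance_similarity(2, 'Steven', 'Stephen'), 1))  # 71.4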

Apache Spark includes an implementation of the Levenshtein Distance function. To implement the distance similarity, your code needs to perform the extra calculation itself:

# Import the levenshtein function
from pyspark.sql.functions import levenshtein

# Create a single-row DataFrame with the two values I want to compare
df0 = spark.createDataFrame([('kitten', 'sitting',)], ['l', 'r'])

# Compare the two values; the distance between "kitten" and "sitting" is 3
df0.select(levenshtein('l', 'r').alias('d')).collect()

The following list of name pairs, saved as sample.csv, can be enriched with their distance similarity:

foo,bar
Suzie Chappman,Suzy Chappman
Billy Wonka,William Wonka
Kelly Schmitt,Kelly Schmitt
Devon Parker,Devon Jones
Kylie LeBlanc,Stacy LeBow

In the following, we use PySpark to establish the similarity between each pair of names:

from pyspark.sql import SparkSession
from pyspark.sql.functions import levenshtein, length, when, col

def spark_session_get():
    # getOrCreate() reuses an existing session (e.g., in a notebook) or builds a new one
    app_name = "Sketch distance similarity between two words"
    return SparkSession.builder.appName(app_name).getOrCreate()

def source_get(spark, source_file_path, format='csv', header=True):
    return spark.read.load(source_file_path, header=header, format=format)

def source_enrich_distance_similarity(source_raw):
    # Capture each name's length and the edit distance between the pair
    source_with_distance = source_raw.withColumn('foo_length', length(source_raw.foo)) \
        .withColumn('bar_length', length(source_raw.bar)) \
        .withColumn('distance', levenshtein(source_raw.foo, source_raw.bar))

    # similarity = 100 - (distance / length_of_longest_string * 100)
    return source_with_distance \
        .withColumn('similarity', 100 - (col('distance') / when(col('bar_length') >= col('foo_length'), col('bar_length')).otherwise(col('foo_length')) * 100))

spark = spark_session_get()
source_file_path = './sample.csv'

source_raw = source_get(spark, source_file_path)
source_enriched = source_enrich_distance_similarity(source_raw)
source_enriched.show()

Analysts can build on this example to establish the similarity between various words and phrases.

Want Coffee with your Workload Simulation?

Hey, train wreck, this isn't your station

Coffee is an OLTP (online transaction processing) workload simulator for SQL Server, Azure SQL Database, and Azure SQL Managed Instance that mimics the activity of a point-of-sale (POS) system. It simulates the handling of hundreds of orders created, updated, and retrieved hourly from dozens of terminals scattered throughout a restaurant.

The motivation for Coffee came from several projects needing to evaluate database features under load. Throughout those projects I wished I had a modifiable, simple to use, affordable, and scalable tool that would run a set of OLTP workloads against a simple schema, enabling all sorts of fun things:

  1. Generate test data, useful for testing visualization and reporting tools
  2. Gauge the performance impact of security features like Always Encrypted or Transparent Data Encryption (TDE)
  3. Evaluate different network, database, or system configurations
  4. Perform mock administration tasks (e.g., failing over a replica, modifying a file group) with a live system.

What’s Coffee in a Nutshell?

Servers and kitchen staff place, update, and ring up orders across several terminals. The simulator mimics this behavior by concurrently executing order create, update, and retrieve actions. Action execution is distributed over several threads run in parallel. Each thread deliberately delays triggering an action by a random interval within a set range. The delay keeps all actions from executing simultaneously and mimics the ad hoc nature of the workload. The end result is a system that mimics the usage pattern of servers and kitchen staff.

What can host a Coffee database?

The project was initially developed with SQL Server 2014 (version 12.x) in mind. However, it has been used with versions of SQL Server through 2019 as well as Azure SQL Database.

How does Coffee work?

Coffee is written in Windows PowerShell. The project’s repository is hosted on GitHub and includes a README that outlines the application’s design, describes usage, and identifies dependencies.

Users interact with the system through a command-line interface. Coffee ships with several scripts described in the project’s README. One of the most important of these is the launcher script, which initiates workloads. When executed, the launcher idles until the start of the next minute and then launches the write, read, and update controllers.

Coffee Execution

The write, read, and update controllers spawn workload threads that generate load against our database. The whole application runs in a single PowerShell process.

Engineers can adjust the workload volume and concurrency from the launcher script. The volume of work is the number of create, read, and update actions to trigger. The concurrency of work describes how many threads are created for each type of action: read, update, and create. By default, Coffee creates, updates, and reads 35,000, 35,000, and 30,000 orders respectively, with each controller spawning 5 threads for a total of 15 threads. Because each thread gets its own connection, you will see 15 sessions for Coffee’s PID when running a simulation with default settings.

Once the simulation completes, you will be left with the number of orders you asked the write controller to create, 35,000 by default.

I purposely kept the database’s physical model simple and intuitive to make it easy for developers to manipulate and query. The database has four tables, all in the “dbo” schema:

  • dbo.customer, captures the restaurant’s customers.
  • dbo.sustenance, contains the restaurant’s menu items.
  • dbo.order, contains the restaurant’s orders.
  • dbo.order_detail, hosts the dishes purchased with a given order.

The tables are related as follows:

Coffee Schema Diagram
This is the physical data model for Coffee.

The data generated as part of a simulation remains once the simulation completes.

This data comes in handy when testing visualization and reporting tools, partitioning schemas, or different SQL commands.

Lastly, Coffee saves runtime metrics for each executed simulation in a pair of files: test summary and test detail. The test summary file captures metrics by workload controller. These metrics include controller start and end date and time, total run time, and number of threads.

The test detail file captures metrics for each action executed as part of a given simulation. The metrics include the action’s type, duration, number of errors encountered, worker thread ID, and start time.

Each file includes the name of the machine executing the simulation and the simulation’s start date and time. Engineers can use this data in concert with additional metrics to gauge system health.

Conclusions

Engineers can leverage Coffee whenever they need (a) sample data or (b) to gauge system behavior in the context of a condition or system state change.

This project is far from a polished solution. Despite the many areas for improvement, Coffee remains one of my favorite pet projects, and a tool I find myself using again and again in my work. I use Coffee with cloud and on-premises installations of SQL Server. I use it with cloud-based DBaaS solutions like Azure SQL Database. I use it in presentations and training classes. I use it to generate test data when exploring data analysis and visualization tools. For these reasons, Coffee is a project I thought worth sharing.

Append to a Static List in a YAML CloudFormation Template

When writing CloudFormation stack templates, I sometimes need to create a list combining things defined at runtime and static values.

Imagine you have a template that contains a mapping, which enumerates IAM roles by environment. You want to grant permission to these roles as well as one or more Lambda execution roles. Can you create a list comprised of the static values defined in your map with references to roles created as part of your stack?

The FindInMap intrinsic function returns a list when the mapped value is a list, such as in our example. The Join function creates a string composed of the elements of a list separated by a given delimiter.

You can join the list returned from the FindInMap function, producing a comma-delimited string. You can then join that string with a second list of values, one that can include references to resources created in the template. Where a property expects a list (such as the Resource element of an IAM policy), the combined string can be split back into a list with the Split function, as the full template below shows.

!Join
  - ","
  - - !Join [",", !FindInMap ["MyMap", "foo", "thing"]]
    - !Ref "Thinger"

The following shows a CloudFormation stack template using this technique, juxtaposed with an instance of the provisioned resource.

AWS CloudFormation Append Value to List
A role definition in a CloudFormation stack template, shown alongside an instance of the provisioned resource. The role’s definition includes a list of ARNs combining a static list provided by a mapping with a Lambda execution role. The provisioned role reflects the complete list.

Notice the provisioned resource reflects the union of the two lists. The following is the complete template:

Description: Sample Stack
Parameters:
  Thinger:
    Type: "String"
    Default: "arn:aws:s3:::f2c9"
Mappings:
  MyMap:
    foo:
      thing:
        - "arn:aws:s3:::0b50"
        - "arn:aws:s3:::e256"
        - "arn:aws:s3:::4159"
      thang:
        - "arn:aws:s3:::8199"
        - "arn:aws:s3:::d9f1"
        - "arn:aws:s3:::bc2b"
    bar:
      thing:
        - "arn:aws:s3:::bd69"
        - "arn:aws:s3:::eb00"
        - "arn:aws:s3:::0f55"
      thang:
        - "arn:aws:s3:::5ebc"
        - "arn:aws:s3:::4ccb"
        - "arn:aws:s3:::85c2"
Resources:
  Something:
    Type: "AWS::IAM::Role"
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: "Allow"
            Principal:
              Service:
                - "lambda.amazonaws.com"
            Action: "sts:AssumeRole"
      Policies:
        - PolicyName: ExecuteSubmitFilePolicy
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action:
                  - logs:CreateLogGroup
                  - logs:CreateLogStream
                  - logs:PutLogEvents
                Resource: !Split
                  - ","
                  - !Join
                    - ","
                    - - !Join [",", !FindInMap ["MyMap", "foo", "thing"]]
                      - !Ref "Thinger"
Outputs:
  UnifiedList:
    Value: !Join
      - ","
      - - !Join [",", !FindInMap ["MyMap", "foo", "thing"]]
        - !Ref "Thinger"

The utility of this technique is debatable. That said, it’s a handy pattern for combining two lists in a CloudFormation stack template.

KeyLookup Exception Thrown When Calling awsglue.utils.getResolvedOptions() Locally

I was locally testing a PySpark script I’d written for an AWS Glue job when I ran across an error relating to the call to getResolvedOptions(). The call was generating a KeyLookup exception. The problem was the argv parameter I supplied.

When a Glue job executes, parameters are passed to the script through sys.argv. Typically, you pass sys.argv to getResolvedOptions(args, options) along with the options you want to tease out of the list – see Accessing Parameters Using getResolvedOptions for details.

You can mimic this behavior when running this script locally:

from pprint import pprint as pp
from awsglue.utils import getResolvedOptions

argv = ['whatevs', '--JOB_NAME=ThisIsMySickJobName']
args = getResolvedOptions(argv, ['JOB_NAME'])

pp(args)

The following is me running a script that contains the above code locally:

Screen Shot Calling getResolvedOptions() with Fabricated Arguments

The trick is that the values in the list passed as the argv parameter need to follow the pattern:

--KEY=VALUE

For example…

--JOB_NAME=ThisIsMySickJobName
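
Inside a deployed Glue job, the same call reads the real arguments from sys.argv, as described in the AWS documentation referenced above. A minimal sketch:

import sys
from awsglue.utils import getResolvedOptions

# In a deployed Glue job, the service populates sys.argv with --JOB_NAME and any
# job parameters you defined, so no fabricated argument list is required
args = getResolvedOptions(sys.argv, ['JOB_NAME'])
print(args['JOB_NAME'])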

Configure SQL Server to Run as a Managed Service Account (MSA)

Most security standards ask administrators to periodically cycle passwords, including those for service accounts. To ease that burden, Microsoft released Managed Service Accounts (MSAs) with Windows Server 2008 R2 and Windows 7. An MSA is an Active Directory account associated with a specific computer on the domain. The account’s password is complex and managed by the domain.

For this example, I’m using the old sandbox.local domain. I’ve got two machines on the domain running Windows Server 2016: a domain controller named sbx-dc01 and a server named sbx-misc-dbs02. sbx-misc-dbs02 is going to host a default instance of SQL Server. We’ll be creating an MSA named svc-dbs02-eng02, installing it on our database server (sbx-misc-dbs02), and configuring SQL Server to run as that account.

Execute the following instructions to provision the MSA and configure a SQL Server instance to use it:

Before You Start: The instructions in this section need to be run by someone with rights to create and install an MSA.

  1. From the target machine, launch a PowerShell terminal as an Administrator.
  2. Update and run the following command script.
New-ADServiceAccount -Name <service-account-name> -DNSHostName <fully-qualified-service-account-name> -Enabled $True
Add-ADComputerServiceAccount -Identity <target-machine-name> -ServiceAccount <service-account-name>
Set-ADServiceAccount -Identity <service-account-name> -PrincipalsAllowedToRetrieveManagedPassword <distinguished-name-target-machine>
Install-ADServiceAccount -Identity <service-account-name>

Before use, change anything in angle brackets to the relevant values for your environment, for example:

New-ADServiceAccount -Name svc-dbs02-eng02 -DNSHostName svc-dbs02-eng02.sandbox.local -Enabled $true
Add-ADComputerServiceAccount -Identity sbx-misc-dbs02 -ServiceAccount svc-dbs02-eng02
Set-ADServiceAccount -Identity 'svc-dbs02-eng02' -PrincipalsAllowedToRetrieveManagedPassword 'CN=SBX-MISC-DBS02,CN=Computers,DC=sandbox,DC=local'
Install-ADServiceAccount -Identity 'svc-dbs02-eng02'

NOTE: We call Test-ADServiceAccount in the screenshot above. While not required, you may call this cmdlet to verify the account was installed. This is especially helpful if you’re creating the account for someone else to use.

  1. Open SQL Server Configuration Manager.
  2. Select SQL Server Services from the left pane, right-click the service you want to configure in the right pane, and select Properties.
  3. Select the Log On tab.
  4. Enter the service account name (domain\accountName) followed by a dollar sign ($) into the Account Name field. Do not include a password.
  5. Click Apply.
  6. Click Yes when prompted to restart the service.

The SQL Server service should start successfully, and when it does, your instance should be running as the MSA you created.

One caution: remember to make the service account assignment from SQL Server Configuration Manager. Configuration Manager assigns the permissions the account needs to successfully run the SQL Server process.

Mocking .NET Objects in Pester Scripts

You have written several functions in a PowerShell script or module. You want to use Pester, a unit testing framework for PowerShell, to create unit tests for those functions. Unfortunately, several of the functions call methods of .NET objects, and you can’t mock those method calls. As a result, the methods are executed when you run your test script.

To illustrate the fix we’ll use the following example:

function do-thing {
    Param ($servername)

    [Reflection.Assembly]::LoadWithPartialName("Microsoft.SqlServer.smo") `
        | Out-Null

    $server = New-Object 'Microsoft.SqlServer.Management.Smo.Server' `
        -ArgumentList $servername

    $Server.ConnectionContext.ConnectTimeout = 5
    $Server.ConnectionContext.Connect()
    $server.Databases
}

When `do-thing` is executed the following occurs:

  1. A Microsoft.SqlServer.Management.Smo.Server object named $server is created
  2. The $server object’s Connect() method is called
  3. The $server object’s Databases collection is returned

This function will blow up at step 2 unless it’s actually connecting to a live SQL Server instance, an external element. It’s a generally accepted best practice that unit tests should be independent of external elements, components outside the unit being tested. You typically control interactions with external components through the use of mocks.

If you’re running PowerShell 5.x or greater, you can employ a mock object to solve this problem. According to Margaret Rouse, a mock object “…is a simulated object that mimics the behavior of the smallest testable parts of an application in controlled ways.” You’ll start by creating a mock class, which you’ll instantiate into a mock object. The following is an example of this concept in practice:

describe 'do-thing' {
    class fake_smo_connnection_context {
        [int] $ConnectTimeout
        [void] connect(){ }
    }

    class fake_smo_server {
        [string[]] $databases = @('foo', 'bar')
        [fake_smo_connnection_context] $ConnectionContext

        fake_smo_server() {
            $this.ConnectionContext = `
                New-Object 'fake_smo_connnection_context'
        }
    }

    context 'when the server exists' {

        Mock 'New-Object' { New-Object 'fake_smo_server' } `
            -ParameterFilter {
                $TypeName -and  
                $TypeName -eq 'Microsoft.SqlServer.Management.Smo.Server'
            }

        it 'should complete successfully' {
            do-thing -servername 'whatever'
        }

        it 'should return databases' {
            $databases = do-thing -servername 'whatever'
            $databases.Count | Should -Be 2
        }
    }

    context 'when the server does not exist' {

        it 'should throw an exception' {
            { do-thing -servername 'whatever' } | Should -Throw
        }
    }
}

We’ve created two classes:

  1. fake_smo_connnection_context, a class to mock the server’s ConnectionContext object
  2. fake_smo_server, a class to mock the server object

fake_smo_connnection_context has a single method named Connect() that does nothing and a single property, ConnectTimeout. It mocks the ConnectionContext type. We’re not creating a complete facsimile of the type we want to mimic; we’re only creating the parts of the server object needed to unit test this function. Since fake_smo_server contains a fake_smo_connnection_context, we declare fake_smo_connnection_context first. We do this because you can’t declare a property of type fake_smo_connnection_context until that type has been defined. Lastly, the classes are defined in the describe block because the context and it child-blocks have visibility into the parent describe block. I tend to define mock classes one scope up from where I’ll use them.

If you’re familiar with classes but aren’t familiar with them in PowerShell, check out the post PowerShell v5 Classes & Concepts by Michael Willis. If you’re not familiar with classes at all, …I’m impressed that you read this far. You’ll want to search around for an introduction to classes in PowerShell. However, if you’re in a pinch, there is an alternative.

If you can’t create a mock object, you can wrap the method calls in a PowerShell function and then simply not test the wrapper. The following is the function with a wrapper:

function run-wrapper {
    Param ($server)

    $Server.ConnectionContext.ConnectTimeout = 5
    $Server.ConnectionContext.Connect()
    $server.Databases
}
function do-thing {
    Param ($servername)

    [Reflection.Assembly]::LoadWithPartialName("Microsoft.SqlServer.smo") `
        | Out-Null

    $server = New-Object 'Microsoft.SqlServer.Management.Smo.Server' `
        -ArgumentList $servername

    run-wrapper -Server $server
}

The code in this post was tested with Pester 4.4.0 and PowerShell 5.1.17134. Pester 4.x changed the way assertions are passed: they are now passed as parameters. If you are running Pester 3.x, some of the code in this article will not work.

Access Images Stored in AWS S3

I built a very simple image storage solution for a client. The solution stored imagery in an S3 bucket, and the images were retrieved by various components referencing each image’s S3 URL. One of the developers asked for guidance on how to access imagery from the bucket.

Step-1: Create the Bucket

AWS provides a fantastic article for folks new to S3: Getting Started with Amazon Simple Storage Service. The piece includes guidance on creating an S3 bucket through the AWS console. I’ll create the bucket using the AWS CLI:

aws s3 mb s3://nameofmybucket

Step-2: Grant Read Access to the Bucket

For this example, I’m going to make objects in the bucket publicly accessible. To make that change, head over to the bucket’s Permissions tab in the console and add the following bucket policy, replacing <your-bucket-name> with your bucket’s name:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::<your-bucket-name>/*"
        },
        {
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::<your-bucket-name>"
        }
    ]
}
Next, configure cross-origin access:
  1. Select the CORS configuration block.
  2. Add the following configuration to the CORS configuration editor and click Save.
<?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
    <CORSRule>
        <AllowedOrigin>*</AllowedOrigin>
        <AllowedMethod>GET</AllowedMethod>
        <MaxAgeSeconds>5000</MaxAgeSeconds>
        <ExposeHeader>x-amz-request-id</ExposeHeader>
        <ExposeHeader>x-requested-with</ExposeHeader>
        <ExposeHeader>Content-Type</ExposeHeader>
        <ExposeHeader>Content-Length</ExposeHeader>
        <ExposeHeader>x-amz-server-side-encryption</ExposeHeader>
        <AllowedHeader>*</AllowedHeader>
    </CORSRule>
</CORSConfiguration>

What did you just do? You’ve removed the default safety measures AWS employs to prevent public access to your bucket, and you’ve created a bucket policy (which affects the whole bucket) that permits anyone to list the bucket’s contents and read files from the bucket. Lastly, you’ve enabled cross-origin access to the bucket.

CORS is a security feature built into modern browsers that prohibits a site from accessing content from a site on a different domain. You can read more about CORS here.
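
If you’d rather script these changes than click through the console, the following is a minimal boto3 sketch that applies an equivalent bucket policy and CORS rule. The bucket name is the placeholder used throughout this post, and depending on your account you may also need to relax the bucket’s Block Public Access settings first:

import json
import boto3

bucket_name = 'nameofmybucket'  # placeholder bucket name
s3 = boto3.client('s3')

# Bucket policy permitting public reads and listing (same intent as the JSON above)
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Principal": "*", "Action": "s3:GetObject",
         "Resource": f"arn:aws:s3:::{bucket_name}/*"},
        {"Effect": "Allow", "Principal": "*", "Action": "s3:ListBucket",
         "Resource": f"arn:aws:s3:::{bucket_name}"},
    ],
}
s3.put_bucket_policy(Bucket=bucket_name, Policy=json.dumps(policy))

# CORS rule allowing GET requests from any origin (mirrors the XML above)
s3.put_bucket_cors(
    Bucket=bucket_name,
    CORSConfiguration={
        'CORSRules': [{
            'AllowedOrigins': ['*'],
            'AllowedMethods': ['GET'],
            'AllowedHeaders': ['*'],
            'MaxAgeSeconds': 5000,
        }]
    },
)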

Step-3: Create Your Web Page

The following is my 3rd-grade-quality web page:

<html>

<head>
    <title>Web Page with Images Hosted on S3</title>
    <script>
        function downloadArtwork() {
            const imageUrl = "https://nameofmybucket.s3.us-east-2.amazonaws.com/funny_cats.jpg";
            const requestType = 'GET';
            const isAsyncOperation = true;

            // Get the image from S3
            const request = new XMLHttpRequest();

            // Initiate image retrieval
            request.open(requestType, imageUrl, isAsyncOperation);

            // Handle the data you get back from the retrieve call
            request.onload = function () {
                let binary = "";

                // New image object
                const image = new Image();
                const response = request.responseText;

                // Convert the gobbly-gook you get into something your
                // browser can render
                for (let i = 0; i < response.length; i++) {
                    binary += 
                        String.fromCharCode(response.charCodeAt(i) & 0xff);
                }

                image.src = 'data:image/jpeg;base64,' + btoa(binary);

                // Link the image data to the image tag/node in your page
                const imageFromS3 = 
                    document.getElementById('exampleImageFromS3');
                imageFromS3.src = image.src;
            }

            request.overrideMimeType('text/plain; charset=x-user-defined');
            request.send();
        }
    </script>
</head>

<body onload="downloadArtwork()">
    <h1>Option-1: Reference the file host in S3</h1> <img
        src="https://nameofmybucket.s3.us-east-2.amazonaws.com/funny_cats.jpg" alt="Using 'img'">
    <h1>Option-2: Download the file from S3</h1> <img src="#" id="exampleImageFromS3" alt="using JavaScript" />
</body>

</html>

The page accesses the funny_cats.jpg image from my S3 bucket in one of two ways:

  • Option-1: Link directly to the image in the bucket
  • Option-2: Use JavaScript to retrieve the image and add it to the page

And then there are the gotchas…

When I first created this sketch, I didn’t have CORS enabled on my bucket. I wanted to see the headers coming back from S3 without the browser and page in the way, so I mimicked the pull with cURL:

curl -H "Origin: https://whatever.net" \
-H "Access-Control-Request-Method: GET" \
-H "Access-Control-Request-Headers: X-Requested-With" \
-X OPTIONS \
--verbose \
https://nameofmybucket.s3.us-east-2.amazonaws.com/funny_cats.jpg

You’re looking for the Access-Control-Allow-Origin header on the response and an HTTP response code of 200.
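
The same preflight check can be scripted. Here is a minimal sketch using the requests library, with the placeholder bucket URL from earlier in this post:

import requests

url = 'https://nameofmybucket.s3.us-east-2.amazonaws.com/funny_cats.jpg'  # placeholder URL

# Mimic the browser's CORS preflight request
response = requests.options(url, headers={
    'Origin': 'https://whatever.net',
    'Access-Control-Request-Method': 'GET',
    'Access-Control-Request-Headers': 'X-Requested-With',
})

# A 200 status and an Access-Control-Allow-Origin header indicate CORS is configured
print(response.status_code)
print(response.headers.get('Access-Control-Allow-Origin'))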

You can also see the headers associated with a page’s HTTP calls in Chrome:

  1. Open Developer Tools: (Ellipsis) > More Tools > Developer Tools.
  2. Select the Network tab.
  3. Select the name of the image file from the pane on the left in the console to see the headers associated with its request/response.

Assigning a Custom Domain Name to an AWS API Gateway

I wrote a solution that included a REST API implemented with API Gateway, which necessitated the use of a custom domain. I found a few resources while researching how best to implement it, but I didn’t find anything that was both accurate and succinct. I’ve created this article for that purpose.

This article provides step-by-step instructions for adding a custom domain name to an API Gateway using the web console – as it existed on or around the first quarter of 2020.

A few assumptions…

  • I start the instructions assuming you’ve logged into the AWS console.
  • I assume you have an API already.
  • The DNS name added in the directions is “api.mycompany.com”. This is a fictional name. I assume you’ll replace this value with whatever DNS name you’re assigning to the API.

Before you start…

  • You’ll need a user in an AWS account with rights to perform this action.
  • You must load the certificate into the same AWS region as the one hosting the API.
  • Your certificate needs to employ an RSA key size of 1024 or 2048 bits.

Execute the following instructions to create a custom domain name for an API Gateway:

  1. Load the api.mycompany.com certificate into AWS Certificate Manager in your hosting region e.g., US-East-2.
    1. Navigate to the AWS Certificate Manager service from the AWS console.
    2. If this is your first time using ACM, click the Get started button under Provision certificates.
    3. Choose Import a certificate.
    4. Paste the PEM encoded certificate to the Certificate body text area.
    5. Paste the PEM encoded private key into the Certificate private key text area.
    6. Click Review and import.
    7. Click import.
  2. Create custom domain name in AWS API Gateway.
    1. Navigate to the Amazon API Gateway service from the AWS console.
    2. Select Custom Domain Names from the menu on the left side of the page.
    3. Click the + Create Custom Domain Name button.
    4. Select HTTP.
    5. Enter the domain name into the Domain Name field e.g., api.mycompany.com.
    6. Select TLS 1.2 from the Security Policy option group.
    7. Select Regional from the Endpoint Configuration.
    8. Select api.mycompany.com from the ACM Certificate drop down.
    9. Click Save.
    10. Click Edit.
    11. Click Add mapping.
    12. Enter “/” in the Path field.
    13. Select the “My-API-Name” from the Destination drop down.
    14. Click Save.
      Certificate Configuration
  3. From the newly created custom domain name, create a mapping to the deployed API’s stage.
  4. Create CNAME record for api.mycompany.com to Target Domain Name in new custom domain name.

When you first create the base path mapping, you might be tempted to connect to an endpoint using the target domain name. That won’t work. The target domain name is meant to be the target of your CNAME record; it’s not accessible independently. Once the DNS record has been updated, give the change a few minutes to propagate. You can then attempt to access your endpoint via cURL or Postman:

Call API Using Custom Domain Name via Postman
curl --location \
--request POST 'https://api.mycompany.com/v1/things/stuff' \
--header 'Content-Type: application/json' \
--data-raw '{
	"thingId": "fed8b3c1341ea9388dcbc8f260e4a2177907a7f1"
}'

The DNS change took between 5 and 20 minutes to take effect for me. If you’re having problems after following these instructions and giving DNS 20 (or more) minutes to update, something likely went wrong.

Generating a Uniquifier for Your Resources in CloudFormation

I don’t generally name CloudFormation resources explicitly. However, once in a while, I want to explicitly name a resource, and I want that name to be unique across stacks. This lets me deploy multiple instances of the stack without worrying about naming collisions. Oh, and I don’t want the unique portion of the name to change each time I update the stack. This is important: if the unique portion changed, an S3 bucket (for example) would be replaced with each stack update, and I don’t want that.

One quick-and-dirty way to accomplish this is to leverage the stack id(entifier). Consider the CloudFormation template:

---
Outputs:
  MyBucket:
    Value: !Select [6, !Split [ "-", !Ref "AWS::StackId" ]]
    Export:
      Name: "MyBucket"
Resources:
  MyBucket:
    Type: "AWS::S3::Bucket"
    Properties:
      BucketName: !Sub
        - "mybucket-${Uniquifier}"
        - Uniquifier: !Select [6, !Split [ "-", !Ref "AWS::StackId" ]]

I’m deploying the stack with something like the following (run in bash):

aws cloudformation deploy \
--stack-name "KewlStackAdamGaveMe" \
--template-file "<full-path-to-template-file>" \
--capabilities CAPABILITY_IAM

You’ll end up with a stack and S3 bucket that looks like the following:

Deployed Cloudformation Stack

I’m using the last 12 characters of the stack id, but you can use the whole thing if you’d like to. Keep in mind the naming rules for S3 buckets. Either way, you get the gist of how I’m creating a unique name that stays unique across stack updates.
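
To see why index 6 lands on the last block of the stack id, here is a quick Python sketch of the same split-and-select logic. The stack id below is made up, and the trick assumes a region name with two hyphens (like us-east-2):

# Hypothetical stack id; the GUID portion is fabricated for illustration
stack_id = ("arn:aws:cloudformation:us-east-2:123456789012:"
            "stack/KewlStackAdamGaveMe/3a5d8e60-1db0-11ec-9621-0242ac130002")

# Mimic !Split ["-", !Ref "AWS::StackId"]
parts = stack_id.split("-")

# Mimic !Select [6, ...]; with a two-hyphen region this is the GUID's last block
print(parts[6])  # 0242ac130002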