Exhibit H. Detailed Scope and Methodology
Having determined four clusters to be optimal, we conducted further analyses to
determine which four-cluster grouping to use. We generated 200 grouping
solutions for k=4 (one for each seed), and calculated the average number of
towers in each group. We then selected the solution whose number of towers for
each group was closest to the average. Thirty-four of the 200 solutions had
exactly the same minimum distance to the average group size.[33] We picked one
of the listed solutions for use in the cost and safety comparisons. Summary
statistics for the groups in the selected solution can be seen in tables H-2, H-3,
and H-4.
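The selection step described above can be sketched in a few lines of Python. This is an illustration only, not the code used for the analysis: the function names are hypothetical, the three size vectors in the usage example are made-up toy values, and the sketch assumes that group labels have already been aligned across solutions so that "group 1" means the same thing in every solution (the text does not say how that alignment was done).

```python
from math import isclose
from statistics import mean


def chebyshev(a, b):
    """Largest absolute coordinate-wise difference between two size vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Square root of the summed squared coordinate-wise differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def closest_to_average(size_vectors, distance=chebyshev):
    """Return (mean group sizes, smallest distance, indices of tied solutions).

    size_vectors holds one tuple of group sizes per candidate solution.
    Assumption: groups are in the same order in every tuple.
    """
    n_groups = len(size_vectors[0])
    mean_sizes = [mean(v[g] for v in size_vectors) for g in range(n_groups)]
    dists = [distance(v, mean_sizes) for v in size_vectors]
    best = min(dists)
    # Several solutions can share the minimum distance, as the text reports.
    ties = [i for i, d in enumerate(dists) if isclose(d, best, abs_tol=1e-9)]
    return mean_sizes, best, ties


# Toy usage with three hypothetical solutions (not data from the analysis).
toy_solutions = [(25, 50, 75, 100), (32, 38, 62, 104), (33, 32, 43, 111)]
toy_means, toy_best, toy_ties = closest_to_average(toy_solutions)
print(toy_means, toy_best, toy_ties)
```

When more than one index comes back in the tie list, some further rule is needed to pick a single solution; the sketch deliberately leaves that choice to the caller.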
Hierarchical Clustering
We also clustered towers using agglomerative hierarchical clustering. In
hierarchical clustering, towers are folded into increasingly larger groups based on
the similarity and dissimilarity of individual observations and groups, measured
using specified metrics. We used Ward’s linkage and Gower’s distance measure to
accommodate the mixed nature of our data. The merging is tracked in a graph
called a dendrogram,[34] and once the process is complete, the analyst can “cut”
the observations into groups at any of the agglomerative steps. The height of
each vertical line in the dendrogram of the clustering process corresponds to a
dissimilarity measure, with a larger vertical distance implying greater dissimilarity.
We cut the dendrogram at the height below which the increases in dissimilarity from one merge to the next dropped off.
This produced four groups or clusters. Tables H-5, H-6, and H-7 report summary
statistics for the groups overall and by tower type.
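Two pieces of the procedure above lend themselves to a short sketch: Gower's distance for mixed numeric and categorical data, and a rule for choosing where to cut the dendrogram. The Python below is a minimal illustration under stated assumptions, not the analysis code. The feature names (ops_per_hour, runway_config), the toy tower records, the numeric range, and the merge heights are all hypothetical. The "largest jump between successive merge heights" rule is one common heuristic and is only an assumed reading of the cut criterion described in the text. The linkage step itself (Ward's method) is not implemented here.

```python
def gower_distance(a, b, numeric_ranges):
    """Gower's distance between two records with equal feature weights.

    a, b: dicts with the same keys.
    numeric_ranges: {feature: max - min over the data} for numeric features;
    any feature not listed there is treated as categorical.
    """
    total = 0.0
    for key in a:
        if key in numeric_ranges:
            rng = numeric_ranges[key]
            # Numeric feature: absolute difference scaled by the feature's range.
            total += abs(a[key] - b[key]) / rng if rng else 0.0
        else:
            # Categorical feature: 0 if the values match, 1 otherwise.
            total += 0.0 if a[key] == b[key] else 1.0
    return total / len(a)


def cut_at_largest_jump(merge_heights):
    """Return (cut height, number of clusters) for the largest gap in heights.

    merge_heights: ascending dissimilarities at which successive merges occur,
    one per merge, so n observations give n - 1 heights.
    """
    gaps = [hi - lo for lo, hi in zip(merge_heights, merge_heights[1:])]
    i = gaps.index(max(gaps))
    cut = (merge_heights[i] + merge_heights[i + 1]) / 2
    # Merges above the cut are undone; each undone merge adds one cluster.
    n_clusters = len(merge_heights) - i
    return cut, n_clusters


# Hypothetical toy records and heights, for illustration only.
tower_a = {"ops_per_hour": 14, "runway_config": "single"}
tower_b = {"ops_per_hour": 44, "runway_config": "parallel"}
ranges = {"ops_per_hour": 60}
print(gower_distance(tower_a, tower_b, ranges))
print(cut_at_largest_jump([0.05, 0.08, 0.12, 0.60, 0.70, 0.85]))
```

Because each feature's contribution lies between 0 and 1, the Gower distance also lies between 0 and 1, which is what lets numeric and categorical tower characteristics be combined in a single dissimilarity measure.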
We note that, compared with the k-means method, the hierarchical method
clusters more strongly on runway configuration than on traffic density.
However, both methods grouped towers with single runways together. In both
cases, this group consisted of 12 FAA-staffed towers and 71 FCTs that averaged
nearly 14 operations per hour. While relationships between the other groups
produced by the two methods are not as obvious, table H-8 indicates
considerable overlap between them. It shows the distribution of towers across
groupings. Each number on the diagonal indicates the number of towers that
were sorted into the same group under the two methods. For example, table H-8
reports that 28 towers were sorted into Group 1 under both methods. The off-
diagonal numbers track the towers which the two methods sorted into different
[33] We considered both Euclidean and Chebyshev’s distance functions, where each group is a “coordinate.” For
example, if the number of towers in groups 1 through 4 is (25, 50, 75, 100) for a candidate solution and the group
means across the 200 solutions are (30, 40, 60, 105), the Chebyshev distance is the maximum absolute difference
between a group’s size and its mean (in this case, 15 = 75 - 60).
[34] A dendrogram is a tree diagram that links all observations according to the hierarchical clustering criteria.
Observations linked lower on the dendrogram are more similar to each other than those linked higher up. Larger
vertical distances on the figure represent more dissimilarity from one linkage to the next. Horizontal differences do
not have an interpretation or meaning.