This document is intended as a supplement to the paper Finds from Miletus XXXII: Clay Rings from the Sanctuary of Dionysos in Miletus, Archäologischer Anzeiger 2020/1 (Steinmann 2020).
Abstract: During the excavation of the sanctuary of Dionysos in Miletus in the 1970s, a large number of unidentifiable objects were found: »Peculiar and as of now not explicable for the editor [Willi Real] are numerous fragments of flat rings […]. They are reminiscent of the rings used for modern coal stoves. […] Hitherto no interpretation has been found.« In the course of a re-examination of the excavation’s findings since 2017, it has been possible to find similar rings from different places in the Mediterranean. It is plausible that these until now unidentified objects are stacking rings used in potters' workshops, an isolated and unique find for 5th-century Miletus. In this article, the rings will be compared and classified, followed by an assessment of their functionality as well as a discussion of their context.
(Preliminary reports of the excavation: Müller-Wiener 1977 and Müller-Wiener 1979; finds: Real 1977, 105.)
This GitHub repository provides all the original data used in said paper. This document contains all the code used to generate the relevant graphs and findings. It is not intended as an explanation of methodology – as this is already provided in the paper itself – but as additional transparency.
The table provided in the publication (Steinmann 2020, fig. 16) is available as a .csv file (in the “data-raw” subdirectory) as well as a dataset (StackRMiletus) that is provided in this package.
All measurements were gathered by the author during a research stay in Miletus in 2017, funded by the Research School of the Ruhr-University in Bochum. Subsequent work was accomplished during a stipend of the Gerda Henkel Foundation I received for my PhD project “The Sanctuary of Dionysos in the Sacral Landscape of Miletus” at the Ruhr-University Bochum and since 2020 at the University of Hamburg. The table includes the following data (a sample of the inventoried rings):
| | Inv | clay | applications | color_appl | slip | markings | markings_type | height | min_DM | max_DM | width | thickness | shape | fragments |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
59 | MK73.C5.3 | reddish yellow | none | NA | none | 1 | tools | 1.80 | 10.0 | 12.5 | 1.25 | 0.39 | straight | 1 |
1 | MK75.N16.1 | pink | stains | red | beige | 0 | none | 1.30 | 11.6 | 14.2 | 1.30 | 0.51 | straight | 9 |
20 | MK75.N16.10 | pink | none | NA | none | 0 | none | 1.10 | 6.8 | 11.8 | 2.50 | 0.34 | concave | 2 |
66 | MK75.N16.11 | pink | none | NA | none | 1 | Theta | 1.30 | 12.2 | 15.2 | 1.50 | 0.49 | straight | 1 |
67 | MK75.N16.12 | pink | stains | NA | none | 1 | Theta (half), tools | 1.48 | 14.6 | 18.0 | 1.70 | 0.49 | concave | 1 |
14 | MK75.N16.2 | pink | stains | black | beige | 1 | tools | 1.00 | 10.0 | 12.2 | 1.10 | 0.50 | concave | 1 |
65 | MK75.N16.3 | pink | none | NA | none | 0 | none | 1.17 | 11.6 | 12.4 | 0.40 | 0.48 | convex | 1 |
9 | MK75.N16.4 | light reddish brown | none | NA | yellowish | 0 | none | 1.50 | 11.4 | 14.6 | 1.60 | 0.57 | concave | 3 |
13 | MK75.N16.5 | pink | stains | dark red | none | 0 | none | 1.00 | 10.0 | 12.4 | 1.20 | 0.44 | concave | 3 |
6 | MK75.N16.6 | light reddish brown | streaks | light red | yellowish | 0 | none | 1.20 | 11.6 | 14.8 | 1.60 | 0.34 | straight | 4 |
2 | MK75.N16.7 | light reddish brown | stains | dark red | none | 0 | none | 1.30 | 11.4 | 13.8 | 1.20 | 0.50 | straight | 3 |
41 | MK75.N16.8 | pink | none | NA | none | 0 | none | 1.10 | 7.0 | 13.0 | 3.00 | 0.35 | concave | 1 |
18 | MK75.N16.9 | reddish yellow | streaks | light red | none | 0 | none | 1.70 | 6.8 | 13.2 | 3.20 | 0.35 | concave | 3 |
The first step in the paper is to look at the distribution of measurements with a density graph of the numeric variables. Thickness and height are multiplied by 10 (i.e. converted from cm to mm) so that their distributions remain visible next to the larger variables. All measurements were originally recorded in cm.
library(dplyr)     # provides %>% and transmute()
library(reshape2)  # provides melt()
library(ggridges)
library(ggplot2)
# custom_palette() is not defined in this document; it is assumed to be
# supplied by this package.
StackRMiletus %>%
  transmute(
    thickness_mm = thickness * 10,  # cm to mm
    height_mm = height * 10,        # cm to mm
    width_cm = width,
    min_DM_cm = min_DM,
    max_DM_cm = max_DM) %>%
  melt() %>%                        # long format: one row per measurement
  ggplot(aes(x = value, y = variable, fill = variable)) +
  geom_density_ridges(scale = 4, bw = "SJ", alpha = 0.8) +
  scale_fill_manual(values = custom_palette(5)) +
  theme(legend.position = "none",
        panel.background = element_blank(),
        axis.title = element_blank()) +
  labs(title = "Numerical variables across all rings")
As explained in the paper, hdbscan (McInnes, Healy, and Astels 2017) from the dbscan-package (Hahsler, Piekenbrock, and Doran 2017) is used to cluster the rings according to minimum and maximum diameter. First, let hdbscan identify clusters for the original data:
library(dbscan)
#>
#> Attaching package: 'dbscan'
#> The following object is masked from 'package:stats':
#>
#> as.dendrogram
hdbcluster <- StackRMiletus %>%
  select(min_DM, max_DM) %>%
  hdbscan(minPts = 5)
The outcome (i.e. cluster membership and membership probability) is then added to the original data.frame:
StackRMiletus$HDBScan_Cluster <- as.factor(hdbcluster$cluster)
StackRMiletus$membership_prob <- hdbcluster$membership_prob
The result can be visualized using ggplot2, with the diameters on the x and y axes, cluster membership as point color, membership probability as opacity, and the shape of the profile as point shape:
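The plot described above might be sketched roughly as follows. This is a minimal illustration, not the code used for the paper: the small `rings` data.frame is an invented stand-in for `StackRMiletus` after the clustering step, and the axis labels and alpha range are assumptions.

```r
library(ggplot2)

# Toy stand-in for StackRMiletus after the clustering step. All values are
# invented for illustration and are not taken from the published table.
rings <- data.frame(
  min_DM = c(6.8, 7.0, 10.0, 10.2, 11.4, 11.6),
  max_DM = c(11.8, 13.0, 12.2, 12.4, 14.6, 14.8),
  shape = c("concave", "concave", "straight", "concave", "straight", "convex"),
  HDBScan_Cluster = factor(c(0, 0, 1, 1, 2, 2)),
  membership_prob = c(0, 0, 0.9, 1, 0.8, 1)
)

# Diameters on the axes, cluster as color, membership probability as
# opacity, profile shape as point shape.
cluster_plot <- ggplot(rings, aes(x = min_DM, y = max_DM,
                                  color = HDBScan_Cluster,
                                  alpha = membership_prob,
                                  shape = shape)) +
  geom_point(size = 3) +
  scale_alpha(range = c(0.3, 1)) +  # keep low-probability points visible
  labs(x = "Minimum diameter (cm)", y = "Maximum diameter (cm)",
       color = "Cluster", alpha = "Membership probability",
       shape = "Profile shape")

cluster_plot
```

With the real data, `rings` would simply be replaced by `StackRMiletus`, which already carries the `HDBScan_Cluster` and `membership_prob` columns added above.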
To generate samples based on the actual data, we need summary statistics first:
datastructure <- data.frame(matrix(ncol = 5, nrow = 4))
colnames(datastructure) <- c("height", "min_DM", "max_DM",
                             "width", "thickness")
rownames(datastructure) <- c("range", "mean", "sd", "cv")
for (col in colnames(datastructure)) {
  # Local names avoid masking the base functions range(), sd() and mean().
  col_range <- range(StackRMiletus[, col])
  col_sd <- sd(StackRMiletus[, col])
  col_mean <- mean(StackRMiletus[, col])
  datastructure["range", col] <- col_range[2] - col_range[1]
  datastructure["mean", col] <- col_mean
  datastructure["sd", col] <- col_sd
  # Coefficient of variation in percent
  datastructure["cv", col] <- col_sd / col_mean * 100
}
The resulting summary statistics look like this:
| | height | min_DM | max_DM | width | thickness |
---|---|---|---|---|---|
range | 1.200 | 9.000 | 6.600 | 3.100 | 0.260 |
mean | 1.120 | 10.103 | 13.234 | 1.566 | 0.441 |
sd | 0.248 | 1.826 | 1.511 | 0.750 | 0.073 |
cv | 22.120 | 18.070 | 11.420 | 47.878 | 16.570 |
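The same four statistics can also be computed without an explicit loop. The following is only a sketch on invented toy values; `describe()` is a hypothetical helper and not part of this package. The `cv` row is the coefficient of variation, i.e. `sd / mean * 100`, which for the width column above gives roughly 0.750 / 1.566 * 100 ≈ 47.9.

```r
# Toy measurements in cm; invented for illustration only.
toy <- data.frame(
  height = c(1.0, 1.2, 1.4),
  width  = c(1.0, 2.0, 3.0)
)

# Hypothetical helper (not part of the package): width of the range, mean,
# standard deviation and coefficient of variation (in percent).
describe <- function(x) {
  c(range = diff(range(x)),
    mean  = mean(x),
    sd    = sd(x),
    cv    = sd(x) / mean(x) * 100)
}

# One column per variable, one row per statistic.
toy_summary <- sapply(toy, describe)
round(toy_summary, 3)
```

The result is a matrix with the same layout as the table above, so single values can be read with e.g. `toy_summary["sd", "width"]`.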
Set up an empty data.frame to record the results of the silhouette tests:
tests_compare <- data.frame(matrix(ncol = 5, nrow = 1000))
colnames(tests_compare) <- c("N", "n_cluster", "avg_sil_width",
"avg_sil_width_without_0", "n_noise")
Based on the summary statistics, the data.frame created above can be filled with 1000 simulated distributions that use the mean and standard deviation of the original data as a basis. First, 3000 simulated distributions are generated, from which exactly 1000 are then randomly sampled: hdbscan sometimes returns no clusters, which would produce NA values, and generating a surplus allows these runs to be removed beforehand.
library(cluster)
tests_dummy <- lapply(seq_len(3000), function(x) {
  # Simulate 67 rings: minimum diameter and width are drawn from normal
  # distributions based on the observed mean and standard deviation.
  testsample <- data.frame(matrix(nrow = 67,
                                  ncol = 2))
  colnames(testsample) <- c("min_DM", "max_DM")
  testsample[, "min_DM"] <- rnorm(
    n = 67,
    mean = datastructure["mean", "min_DM"],
    sd = datastructure["sd", "min_DM"]
  )
  width <- rnorm(
    n = 67,
    mean = datastructure["mean", "width"],
    sd = datastructure["sd", "width"]
  )
  # The maximum diameter follows from the minimum diameter and the width.
  testsample[, "max_DM"] <- (width * 2) + testsample[, "min_DM"]
  test_dist <- dist(testsample, method = "euclidean")
  test_hdbcluster <- hdbscan(testsample, minPts = 5)
  sil_hdbscan <- silhouette(test_hdbcluster$cluster,
                            test_dist,
                            do.clus.stat = TRUE,
                            do.n.k = TRUE)
  # Cluster 0 is the noise label; the cluster count assumes it is present.
  x <- list(
    N = nrow(testsample),
    n_cluster = length(unique(test_hdbcluster$cluster)) - 1,
    n_noise = length(which(test_hdbcluster$cluster == 0)),
    avg_sil_width = NA,
    avg_sil_width_without_0 = NA
  )
  if (x$n_cluster >= 1) {
    mat <- as.data.frame(sil_hdbscan[1:67, 1:3])
    x$avg_sil_width <- mean(mat$sil_width)
    # Logical indexing keeps all rows when a run contains no noise points;
    # mat[-which(...), ] would drop every row in that case.
    mat <- mat[mat$cluster != 0, ]
    x$avg_sil_width_without_0 <- mean(mat$sil_width)
  }
  # Runs without clusters still contain NA and are marked for removal.
  if (any(is.na(x))) {
    return(NA)
  } else {
    return(x)
  }
})
# Drop the failed runs, then sample exactly 1000 of the remaining ones.
tests_dummy <- tests_dummy[!is.na(tests_dummy)]
tests_compare <- tests_dummy[sample(seq_along(tests_dummy),
                                    replace = FALSE,
                                    size = nrow(tests_compare))]
tests_compare <- bind_rows(tests_compare)
We might as well take a quick look at the silhouette plot for the clusters found in the original data. The silhouette first needs a distance matrix of the original data as well.
# Named miletus_dist to avoid masking the base function dist().
miletus_dist <- StackRMiletus %>%
  select(min_DM, max_DM) %>%
  dist(method = "euclidean")
sil_hdbscan <- silhouette(hdbcluster$cluster,
                          miletus_dist,
                          do.clus.stat = TRUE,
                          do.n.k = TRUE)
Range of SC | Interpretation |
---|---|
0.71-1.0 | A strong structure has been found (min. green line) |
0.51-0.70 | A reasonable structure has been found (min. blue line) |
0.26-0.50 | The structure is weak and could be artificial (min. red line) |
≤ 0.25 | No substantial structure has been found |
(see: Spector (2011))
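To illustrate how an average silhouette width relates to the table above, here is a small sketch on invented data. `interpret_sil()` is a hypothetical helper, not part of this package, and the exact cut points between the table rows (e.g. what happens between 0.50 and 0.51) are an assumption.

```r
library(cluster)

# Two well-separated toy groups; values invented for illustration only.
toy <- data.frame(
  min_DM = c(6.8, 7.0, 7.2, 11.4, 11.6, 11.8),
  max_DM = c(11.8, 12.0, 12.2, 14.4, 14.6, 14.8)
)
toy_cluster <- c(1, 1, 1, 2, 2, 2)

# Silhouette widths per point, then their average.
toy_sil <- silhouette(toy_cluster, dist(toy, method = "euclidean"))
avg_width <- mean(toy_sil[, "sil_width"])

# Hypothetical helper mapping an average silhouette width to the table rows.
interpret_sil <- function(sc) {
  if (sc > 0.70) {
    "strong structure"
  } else if (sc > 0.50) {
    "reasonable structure"
  } else if (sc > 0.25) {
    "weak structure, could be artificial"
  } else {
    "no substantial structure"
  }
}

interpret_sil(avg_width)
```

Because the two toy groups lie far apart relative to their internal spread, the average width falls into the top row of the table; real data will usually be less clear-cut.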
Results from the silhouette test for the original data are compiled in a separate data.frame:
miletus_data_sil <- data.frame(matrix(ncol = 5, nrow = 1))
colnames(miletus_data_sil) <- colnames(tests_compare)
miletus_data_sil$N[1] <- nrow(StackRMiletus)
miletus_data_sil$n_cluster[1] <- length(unique(hdbcluster$cluster)) - 1
miletus_data_sil$n_noise[1] <- length(which(hdbcluster$cluster == 0))
mat <- as.data.frame(sil_hdbscan[1:67, 1:3])
miletus_data_sil$avg_sil_width[1] <- mean(mat$sil_width)
mat <- mat[mat$cluster != 0, ]  # drop noise points (cluster 0); safe even if there are none
miletus_data_sil$avg_sil_width_without_0[1] <- mean(mat$sil_width)
These values can then be compared to the distribution of the corresponding values from the generated samples, shown here as density graphs:
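Such a comparison could look roughly like the following base-R sketch. Both the simulated values and the observed value are invented stand-ins, not results from the paper, and the share computed at the end is an additional illustration rather than a step taken in the publication.

```r
set.seed(42)  # only to make this illustration repeatable

# Invented stand-ins: 1000 simulated average silhouette widths and one
# observed value. Neither is taken from the paper.
simulated <- rnorm(1000, mean = 0.45, sd = 0.08)
observed <- 0.60

# Density of the simulated values with the observed value marked.
plot(density(simulated), main = "Simulated avg. silhouette width")
abline(v = observed, lty = 2)

# Share of simulated runs that reach at least the observed value.
share_at_least <- mean(simulated >= observed)
share_at_least
```

With the real objects, `simulated` would correspond to a column of `tests_compare` (for example `avg_sil_width`) and `observed` to the matching entry of `miletus_data_sil`.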
For the plot in Fig. 12, all numerical variables were scaled using scale() to ease visual comparison:
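As a brief illustration of what scale() does, the sketch below centers each column of an invented toy data.frame on its mean and divides it by its standard deviation, so that variables measured on different orders of magnitude become directly comparable in one boxplot. The values are not taken from the dataset.

```r
# Toy measurements in cm; invented for illustration only.
toy <- data.frame(
  height = c(1.0, 1.2, 1.4),
  min_DM = c(7.0, 10.0, 13.0)
)

# scale() returns a matrix; convert back to a data.frame for plotting.
toy_scaled <- as.data.frame(scale(toy))

# After scaling, every column has mean 0 and standard deviation 1.
round(colMeans(toy_scaled), 10)
sapply(toy_scaled, sd)

boxplot(toy_scaled, main = "Scaled toy variables")
```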
However, a look at the original boxplot might be interesting as well:
As I am unsure of copyright issues in making the data available online, I did not incorporate the comparisons with rings from Monaco (2000) and Cracolici (2003).