Khipu Cord Analysis

In this section, I review the statistical nature of cords: how they are attached, what twists they have, the ratio of main pendants to subsidiaries, and so on.

If you are unfamiliar with cord construction concepts such as twist and knots, I suggest you start with the Khipu Legend page.

This page is dry and somewhat unexciting, but it’s a necessary step to understanding what is needed to draw and analyze the khipus. I believe it is at the cord and cord cluster level where most of the detailed investigation of khipus will lie in Phase 3.

This page is the output of an executable Jupyter notebook. The long prologue of imports and setup has been extracted to a startup file for brevity. The first task is to complete initialization by doing final setup and importing the Python Khipu OODB (Object Oriented Database) and summary files.

# Initialize plotly
plotly.offline.init_notebook_mode(connected = True);
# Khipu Imports
import khipu_kamayuq as kamayuq  # A Khipu Maker is known (in Quechua) as a Khipu Kamayuq
import khipu_qollqa as kq        # A Khipu Qollqa is a warehouse that holds khipu (and other databases)
all_khipus = [aKhipu for aKhipu in kamayuq.fetch_all_khipus().values()]
khipu_summary_df = kq.fetch_khipu_summary()

# Since this is a presentation version, turn off any idiot lights
import warnings
warnings.filterwarnings('ignore')

Primary Cords

Let’s start with primary cords, the main cord to which all the other cords are attached. I’m particularly interested in their lengths, since that will have an impact on rendering…

pcord_df = kq.primary_cord_df.loc[:,['khipu_id', 'pcord_length', 'twist', 'notes']].sort_values(by='pcord_length', ascending=False)
pcord_df.twist = pcord_df.twist.fillna('')
pcord_df.notes = pcord_df.notes.fillna('')

        | khipu_id | pcord_length | twist | notes
1000096 | 1000096  | 513.5        | U     | About 20.0 cm of the main cord is tied in one large knot.
1000541 | 1000541  | 304.0        | U     | Final twist: Braided
1000118 | 1000118  | 300.5        | Z     | Main cord is Z-plyed BB:W which is S-wrapped with BB cord.
1000508 | 1000508  | 280.5        |       |
1000333 | 1000333  | 243.5        | U     |
...     | ...      | ...          | ...   | ...
1000289 | 1000289  | NaN          |       | Connected to pendant #13 of UR117A.
1000293 | 1000293  | NaN          |       | All of the top cords are looped through the attachments of their group pendants.
1000295 | 1000295  | NaN          |       | #NAME?
1000328 | 1000328  | NaN          |       | Between 15 - 28.5 cm, primary cord bears knots:
1000452 | 1000452  | NaN          |       | 0.0-1.5 cm ravelled end to thread-wrap (AB)

count    662.000000
mean      57.646752
std       47.536571
min        0.000000
25%       29.000000
50%       46.500000
75%       73.375000
max      513.500000
Name: pcord_length, dtype: float64
sns.distplot(pcord_df.pcord_length.values, bins=60, color="#588bbe", kde=False, rug=False).set_title('Histogram of Primary Cord Lengths');

Note that a significant number of primary cords (20) have a recorded length of 0. One khipu is over 500 cm long (513.5 cm), which is about 17 feet.
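The checks behind those two observations (zero-length cords, and the centimetre-to-feet conversion) can be sketched in plain Python. This is only an illustration, not the notebook's own code: the function name `summarize_lengths` and the sample list are hypothetical, and only 513.5 and 304.0 are taken from the table above.

```python
import math

CM_PER_FOOT = 30.48  # exact by definition of the international foot

def summarize_lengths(lengths_cm):
    """Separate missing, zero, and measured lengths, and report the longest cord."""
    measured = [x for x in lengths_cm if x is not None and not math.isnan(x)]
    longest_cm = max(measured) if measured else None
    return {
        "measured": len(measured),
        "missing": len(lengths_cm) - len(measured),
        "zero_length": sum(1 for x in measured if x == 0),
        "longest_cm": longest_cm,
        "longest_ft": None if longest_cm is None else longest_cm / CM_PER_FOOT,
    }

# Illustrative values only: 513.5 and 304.0 appear in the table, the rest are made up.
sample_lengths = [513.5, 304.0, 0.0, 0.0, float("nan"), 46.5]
length_summary = summarize_lengths(sample_lengths)
print(length_summary)
```

Keeping zero-length and missing measurements as separate counts matters for rendering, since a recorded 0 and an unrecorded length would presumably need different fallbacks.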

Cord Clusters

When you look at khipus, you notice that there are a lot of pendant cord clusters: 6 cords of white, 4 cords of black and white mottled, that sort of thing. Do these clusters and colors constitute some sort of basic structure?

Let’s do a simple investigation for now. First, let’s build the cluster table; what we’re trying to plot is the number of cords per cluster versus colors. To keep things simple, let’s take the grey-scale value of each cord and average it across the cord cluster to get an average intensity, and also count the number of Ascher cord colors in the cluster. The prebuilt database awaits:

cluster_summary_df = kq.fetch_cluster_summary()
     | Unnamed: 0 | khipu_id | khipu_name | cluster_id | cords_per_cluster | colors_per_cluster | intensity | cluster_colors_set | cluster_colors_all | banded | seriated | mean_cord_value | frequency
0    | 6052 | 1000661 | UR294  | 1019839 | 5  | 5 | 0.437930 | 'LB', 'MB', 'NB', 'W', 'W:BG:MB' | 'LB', 'MB', 'MB', 'MB', 'NB', 'W', 'W:BG:MB' | False | True | 47.0 | 1
1    | 6051 | 1000660 | UR293  | 1019837 | 4  | 4 | 0.586729 | '0B', 'W', 'W-MB-MB', 'W:MB' | '0B', 'W', 'W-MB-MB', 'W:MB' | False | True | 142.0 | 1
2    | 8186 | 6000092 | UR292A | 6004423 | 1  | 1 | 0.956863 | 'W' | 'W' | True | False | 20.0 | 441
3    | 8183 | 6000092 | UR292A | 6004420 | 6  | 5 | 0.550804 | 'AB:KB', 'GG', 'KB', 'W', 'W:MB' | 'AB:KB', 'GG', 'KB', 'W', 'W', 'W:MB' | False | True | 1.0 | 1
4    | 8184 | 6000092 | UR292A | 6004421 | 15 | 8 | 0.502118 | 'AB', 'GB', 'KB', 'MB', 'W', 'W-YB', 'W:GB', 'YB' | 'AB', 'GB', 'GB', 'GB', 'GB', 'KB', 'KB', 'KB', 'MB', 'W', 'W', 'W-YB', 'W:GB', 'YB', 'YB' | False | True | 1.0 | 1
...  | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ...
8187 | 6057 | 6000001 | AS002  | 6000016 | 6  | 2 | 0.207314 | 'DB', 'LB' | 'DB', 'DB', 'DB', 'LB', 'LB', 'LB' | False | True | 19.0 | 2
8188 | 6056 | 6000000 | AS001  | 6000003 | 1  | 1 | 0.272706 | 'MB' | 'MB' | True | False | 93.0 | 104
8189 | 6055 | 6000000 | AS001  | 6000002 | 6  | 1 | 0.272706 | 'MB' | 'MB', 'MB', 'MB', 'MB', 'MB', 'MB' | True | False | 13.0 | 20
8190 | 6054 | 6000000 | AS001  | 6000001 | 1  | 1 | 0.272706 | 'MB' | 'MB' | True | False | 101.0 | 104
8191 | 6053 | 6000000 | AS001  | 6000000 | 6  | 1 | 0.272706 | 'MB' | 'MB', 'MB', 'MB', 'MB', 'MB', 'MB' | True | False | 16.0 | 20

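The cluster table is prebuilt, so the derivation of its intensity and color-count columns is not shown here. As a rough sketch of the idea described above (grey-scale per cord, averaged per cluster, plus a count of distinct colors), something like the following would work. Everything named here is an assumption for illustration: the `ASSUMED_RGB` palette is invented (only the 'W' entry is picked so that it reproduces the 0.956863 seen in the table), the Rec. 601 luma weights are one common grey-scale conversion and not necessarily the one used to build the database, and `summarize_clusters` is a hypothetical helper.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical RGB values for a few Ascher color codes; NOT the database's real palette.
ASSUMED_RGB = {"W": (244, 244, 244), "MB": (110, 70, 40), "LB": (170, 130, 90)}

def grey_level(rgb):
    """Return a 0-1 grey-scale value using Rec. 601 luma weights (an assumed choice)."""
    r, g, b = rgb
    return (0.299 * r + 0.587 * g + 0.114 * b) / 255.0

def summarize_clusters(cords):
    """cords: iterable of (cluster_id, color_code) pairs, one pair per cord."""
    by_cluster = defaultdict(list)
    for cluster_id, color in cords:
        by_cluster[cluster_id].append(color)
    return {
        cluster_id: {
            "cords_per_cluster": len(colors),
            "colors_per_cluster": len(set(colors)),
            "intensity": mean(grey_level(ASSUMED_RGB[c]) for c in colors),
        }
        for cluster_id, colors in by_cluster.items()
    }

# Two made-up clusters: an all-white pair and a mixed brown triple.
sample_cords = [(1, "W"), (1, "W"), (2, "MB"), (2, "LB"), (2, "MB")]
cluster_stats = summarize_clusters(sample_cords)
print(cluster_stats)
```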
from collections import Counter  # normally provided by the startup file

# Tally how many clusters have each cord count, sorted by cord count
cords_per_cluster = list(cluster_summary_df['cords_per_cluster'])
cord_cluster_counter = Counter(cords_per_cluster)
cord_count = sorted(cord_cluster_counter.items())

# Tally how many clusters have each color count, sorted by color count
colors_per_cluster = list(cluster_summary_df['colors_per_cluster'])
color_cluster_counter = Counter(colors_per_cluster)
color_count = sorted(color_cluster_counter.items())
# Largest number of colors found in any single cluster
max(num_colors for num_colors, occurrences in color_count)
sns.distplot(cluster_summary_df.cords_per_cluster.values, bins=160, color="#588bbe", 
             kde=False, rug=False, hist_kws={'log':True}).set_title('Histogram of Cords Per Cluster (Log10)');

sns.distplot(cluster_summary_df.colors_per_cluster.values, bins=30, color="#588bbe", 
             kde=False, rug=False, hist_kws={'log':True}).set_title('Histogram of Colors Per Cluster (Log10)');

# Marker size grows with how often a cluster pattern occurs (minimum size 10)
cluster_summary_df['marker_size'] = [10+aValue*2 for aValue in cluster_summary_df.frequency.values]
fig = px.scatter(cluster_summary_df, x="cords_per_cluster", y="intensity", log_x=True,
                 size="marker_size", color="colors_per_cluster", 
                 labels={"cords_per_cluster": "Cords per Cluster (log10)", "intensity": "Average GreyScale (0-1) per Cluster"},
                 title="<b>Cords per Cluster vs Average GreyScale (0-1) per Cluster</b> Size is frequency of occurrence",
                 width=1200, height=750)
fig.show()

In the Python code, I had set a cord’s is_white predicate to True when its average intensity reached 0.7. After looking at the graphic above, I raised that threshold to 0.75. Beyond that, I can see that true white is close to, but not exactly, 1.0, and that true white clusters are used up to about 10 cords per cluster. I note that the number of white cords drops significantly after 7; perhaps the old saw that we can easily remember up to 7 things is the reason? After that the true colors come out, so to speak. So let’s see what the reverse has to tell us:

# Marker size grows with how often a cluster pattern occurs (minimum size 10)
cluster_summary_df['marker_size'] = [10+aValue*2 for aValue in cluster_summary_df.frequency.values]
fig = px.scatter(cluster_summary_df, x="cords_per_cluster", y="colors_per_cluster", log_x=True, log_y=True,
                 size="marker_size",
                 labels={"cords_per_cluster": "Cords per Cluster (log10)", "colors_per_cluster": "#Ascher Colors per Cluster (log10)"},
                 title="<b>Cords per Cluster (log10) vs Colors per Cluster (log10)</b> Size is frequency of occurrence",
                 width=1200, height=750)
fig.show()
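Returning to the is_white threshold discussed earlier, here is a minimal sketch of such a predicate and of a tally of white clusters by cluster size. It is not the actual khipu_kamayuq implementation: the `>=` comparison, the constant name `WHITE_THRESHOLD`, the helper `white_clusters_by_size`, and the choice to apply the cutoff to a cluster's average intensity rather than to a single cord are all assumptions. The sample pairs are copied from rows shown in the cluster table.

```python
from collections import Counter

WHITE_THRESHOLD = 0.75  # revised cutoff from the text; the >= comparison is an assumption

def is_white(intensity, threshold=WHITE_THRESHOLD):
    """Treat a grey-scale intensity (0-1) as white when it meets the threshold."""
    return intensity >= threshold

def white_clusters_by_size(clusters):
    """clusters: iterable of (cords_per_cluster, intensity) pairs.

    Returns a Counter mapping cluster size to the number of white clusters of that size.
    """
    return Counter(size for size, intensity in clusters if is_white(intensity))

# (cords_per_cluster, intensity) pairs taken from rows of the cluster table.
sample_clusters = [(1, 0.956863), (5, 0.437930), (4, 0.586729), (6, 0.272706), (1, 0.272706)]
white_counts = white_clusters_by_size(sample_clusters)
print(white_counts)
```

Run over the full cluster table, a tally like this is one way to check the impression that white clusters thin out beyond about 7 cords.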


As a final preliminary EDA on cord clusters, let’s “riff” on a theme first investigated by Jon Clindaniel: mean cord value per cluster across the number of colors in the cluster.

# Here marker size is simply the number of cords in the cluster
cluster_summary_df['marker_size'] = cluster_summary_df.cords_per_cluster.values
fig = px.scatter(cluster_summary_df, x="colors_per_cluster", y="mean_cord_value", log_x=False, log_y=False,
                 size="marker_size",
                 labels={"colors_per_cluster": "#Ascher Colors per Cluster", "mean_cord_value": "Mean Cord Value per Cluster"},
                 title="<b>Mean Cord Value per Cluster vs Colors per Cluster</b> Size is number of cords in cluster",
                 width=1200, height=750)
fig.show()
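A numeric companion to this scatter plot is to average the mean cord value over all clusters sharing the same number of colors. The sketch below does this with an unweighted mean, which is my simplification and not necessarily the approach taken in Clindaniel's work; `mean_value_by_color_count` is a hypothetical helper, and the sample pairs are copied from the ten rows visible in the cluster table, so the resulting numbers are illustrative only.

```python
from collections import defaultdict
from statistics import mean

def mean_value_by_color_count(rows):
    """rows: iterable of (colors_per_cluster, mean_cord_value) pairs, one per cluster.

    Returns {number_of_colors: unweighted mean of the clusters' mean cord values}.
    """
    grouped = defaultdict(list)
    for n_colors, value in rows:
        grouped[n_colors].append(value)
    return {n_colors: mean(values) for n_colors, values in sorted(grouped.items())}

# (colors_per_cluster, mean_cord_value) pairs from the rows shown in the cluster table.
sample_rows = [(5, 47.0), (4, 142.0), (1, 20.0), (5, 1.0), (8, 1.0),
               (2, 19.0), (1, 93.0), (1, 13.0), (1, 101.0), (1, 16.0)]
value_by_colors = mean_value_by_color_count(sample_rows)
print(value_by_colors)
```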