Khipu Cords - An Exploratory Data Analysis

In this section, I review the statistical nature of cords: how they are attached, what twists they have, what the ratio of main pendants to subsidiaries is, and so on.

If you are unfamiliar with cord construction concepts such as twists and knots, I suggest you start with the Khipu Legend page.

This page is dry and somewhat unexciting, but it's a necessary step toward understanding what is needed to draw and analyze the khipus. I believe that most of the detailed investigation of khipus in Phase 3 will take place at the cord and cord cluster level.

This page is the output of an executable Jupyter notebook. The long prologue of imports and setup has been extracted to a startup file for brevity. The first task is to complete initialization by doing final setup and importing the Python Khipu OODB (Object Oriented Database) and summary files.

In [1]:
# Initialize plotly
plotly.offline.init_notebook_mode(connected = True);
# Khipu Imports
sys.path.insert(0, '../../classes')  # Load external khipu classes contained in this directory
import khipu_kamayuq as kamayuq  # A Khipu Maker is known (in Quechua) as a Khipu Kamayuq
import khipu_qollqa as kq        # A Khipu Qollca is a warehouse that holds khipu (and other databases)
all_khipus = list(kamayuq.fetch_all_khipus().values())  # Every khipu object in the OODB
khipu_summary_df = kq.fetch_khipu_summary()

Let's start with primary cords: the main cord to which all of a khipu's other cords are attached. I'm particularly interested in their lengths, since that will have an impact on rendering...

In [2]:
pcord_df = kq.primary_cord_df.loc[:,['khipu_id', 'pcord_length', 'twist', 'notes']].sort_values(by='pcord_length', ascending=False)
pcord_df.twist = pcord_df.twist.fillna('')
pcord_df.notes = pcord_df.notes.fillna('')
pcord_df.pcord_length.describe()  # Summary statistics shown below

count    489.000000
mean      57.262986
std       48.641912
min        0.000000
25%       29.000000
50%       46.500000
75%       71.000000
max      513.500000
Name: pcord_length, dtype: float64
In [3]:
# sns.distplot is deprecated in recent seaborn releases; histplot draws the same plain histogram
sns.histplot(pcord_df.pcord_length.values, bins=60, color="#588bbe", kde=False).set_title('Histogram of Primary Cord Lengths');

Note that a significant number of primary cords (20) have a recorded length of 0. One khipu is over 500 cm long - about 17 feet.
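
To make the two observations above concrete, here is a minimal sketch of how the zero-length count and the centimeter-to-feet conversion could be checked. The `lengths_cm` list is made-up sample data rather than values from the khipu database (only the 513.5 maximum is taken from the summary above), and the helper names are hypothetical.

```python
CM_PER_FOOT = 30.48  # Exact by definition: 12 inches * 2.54 cm


def count_zero_lengths(lengths_cm):
    """Count entries whose recorded length is exactly 0."""
    return sum(1 for length in lengths_cm if length == 0)


def cm_to_feet(length_cm):
    """Convert a length in centimeters to feet."""
    return length_cm / CM_PER_FOOT


# Hypothetical sample; only the 513.5 cm maximum comes from the summary above.
lengths_cm = [0.0, 29.0, 46.5, 0.0, 71.0, 513.5]

zero_count = count_zero_lengths(lengths_cm)
longest_ft = cm_to_feet(max(lengths_cm))
print(f"{zero_count} zero-length cords; longest is {longest_ft:.1f} ft")
```

In the notebook itself, the equivalent check would presumably be something like `(pcord_df.pcord_length == 0).sum()`.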

When you look at khipus, you notice that there are a lot of pendant cord clusters - 6 cords of white, 4 cords of black and white mottled, that sort of thing. Do these clusters and colors constitute some sort of basic structure?

Let's do a simple investigation for now. First, let's build the cluster table. What we're trying to plot is the number of cords per cluster versus colors. To keep things simple, let's take the grey-scale value of each cord and average it across the cord cluster to get an average intensity, and/or count the number of Ascher cord colors in the cluster. The prebuilt database awaits:

In [4]:
cluster_summary_df = kq.fetch_cluster_summary()
cluster_summary_df  # Display the table shown below
index | Unnamed: 0 | khipu_id | khipu_name | cluster_id | cords_per_cluster | colors_per_cluster | intensity | cluster_colors_set | cluster_colors_all | banded | seriated | mean_cord_value | frequency
0 | 5627 | 1000661 | UR294 | 1019839 | 5 | 5 | 0.437930 | 'LB', 'MB', 'MB:W:BG', 'NB', 'W' | 'LB', 'MB', 'MB', 'MB', 'MB:W:BG', 'NB', 'W' | False | True | 47.0 | 1
1 | 5619 | 1000658 | UR291A | 1019818 | 10 | 8 | 0.643034 | 'AB:KB', 'AB:NB', 'GL:W:AB', 'KB:NB', 'W', 'W-... | 'AB:KB', 'AB:NB', 'GL:W:AB', 'KB:NB', 'W', 'W'... | False | True | 3.0 | 1
2 | 5612 | 1000658 | UR291A | 1019811 | 11 | 7 | 0.603955 | 'AB', 'GL', 'KB', 'KB:NB', 'MB:NB', 'NB', 'W' | 'AB', 'GL', 'KB', 'KB:NB', 'MB:NB', 'NB', 'W',... | False | True | 23.0 | 1
3 | 5613 | 1000658 | UR291A | 1019812 | 10 | 6 | 0.632355 | 'GL', 'KB', 'NB', 'W', 'W-BG', 'W:NB' | 'GL', 'KB', 'NB', 'NB', 'W', 'W', 'W', 'W', 'W... | False | True | 13.0 | 1
4 | 5614 | 1000658 | UR291A | 1019813 | 11 | 6 | 0.606289 | 'GL', 'KB', 'NB', 'W', 'W-BG', 'W:NB' | 'GL', 'KB', 'NB', 'NB', 'W', 'W', 'W', 'W-BG',... | False | True | 4.0 | 1
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ...
5623 | 9 | 1000166 | AS010 | 1005295 | 3 | 1 | 0.956863 | 'W' | 'W', 'W', 'W' | True | False | 15.0 | 88
5624 | 10 | 1000166 | AS010 | 1005296 | 1 | 1 | 0.235059 | 'LB' | 'LB' | True | False | 0.0 | 22
5625 | 11 | 1000166 | AS010 | 1005297 | 1 | 1 | 0.956863 | 'W' | 'W' | True | False | 10.0 | 190
5626 | 12 | 1000166 | AS010 | 1005298 | 1 | 1 | 0.235059 | 'LB' | 'LB' | True | False | 2.0 | 22
5627 | 0 | 1000166 | AS010 | 1005286 | 1 | 1 | 0.595961 | 'LB:W' | 'LB:W' | True | False | 33.0 | 10

5628 rows × 13 columns
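
The intensity column can be read as an average of per-color grey-scale values. As a minimal sketch (not the actual implementation behind the prebuilt table), the snippet below assumes that each Ascher color code maps to a single grey-scale value, that compound codes such as 'LB:W' are split on ':' or '-' and averaged, and that a cluster's intensity is the mean over its cords. The two grey values are read off the single-color rows in the table above; the function names are hypothetical.

```python
import re
from statistics import mean

# Grey-scale values read off the single-color clusters in the table above.
GREY_VALUE = {"W": 0.956863, "LB": 0.235059}


def cord_intensity(color_code):
    """Average grey value of one cord.

    Assumption: compound codes are split on ':' or '-' and their parts averaged.
    """
    parts = re.split(r"[:-]", color_code)
    return mean(GREY_VALUE[part] for part in parts)


def cluster_summary(cord_colors):
    """Return (cords_per_cluster, colors_per_cluster, intensity) for one cluster."""
    intensity = mean(cord_intensity(code) for code in cord_colors)
    return len(cord_colors), len(set(cord_colors)), round(intensity, 6)


print(cluster_summary(["W", "W", "W"]))  # (3, 1, 0.956863)
print(cluster_summary(["LB:W"]))         # (1, 1, 0.595961)
```

Under these assumptions the sketch reproduces the intensity shown for the 'LB:W' cluster in the last row of the table, which suggests that mixed-color cords are averaged in a similar way, though the table alone does not confirm it.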

In [5]:
cluster_summary_df['marker_size'] = [10+aValue*2 for aValue in cluster_summary_df.frequency.values]
fig = px.scatter(cluster_summary_df, x="cords_per_cluster", y="intensity", log_x=True,
                 size="marker_size", color="colors_per_cluster", 
                 labels={"cords_per_cluster": "Cords per Cluster (log10)", "intensity": "Average GreyScale (0-1) per Cluster"},
                 title="<b>Cords per Cluster vs Average GreyScale (0-1) per Cluster</b> Size is frequency of occurrence",
                 width=1200, height=750)
fig.show()  # Render the figure; the assignment alone displays nothing

In the Python code, I had set a cord's is_white predicate to True when its average intensity reached 0.7. After looking at the graphic above, I raised that threshold to 0.75. Beyond that, I can see that true white is not 1.0 but close to it, and that true white clusters are used up to about 10 cords per cluster. I note that the number of white cords drops significantly after 7 cords per cluster. Perhaps the old saw that we can easily remember up to 7 things is the reason? After that the true colors come out, so to speak. So let's see what the reverse has to tell us:

In [6]:
cluster_summary_df['marker_size'] = [10+aValue*2 for aValue in cluster_summary_df.frequency.values]
fig = px.scatter(cluster_summary_df, x="cords_per_cluster", y="colors_per_cluster", log_x=True, log_y=True,
                 size="marker_size",  # The title says size encodes frequency, so pass the column computed above
                 labels={"cords_per_cluster": "Cords per Cluster (log10)", "colors_per_cluster": "#Ascher Colors per Cluster (log10)"},
                 title="<b>Cords per Cluster(log10) vs Colors per Cluster (log10)</b> Size is frequency of occurrence",
                 width=1200, height=750)
fig.show()  # Render the figure; the assignment alone displays nothing
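
Returning to the is_white threshold discussed above, a minimal sketch of such a predicate, and of tallying white clusters by cluster size, might look like the following. The `>=` comparison, the function names, and the sample rows are assumptions for illustration; only the 0.7 and 0.75 thresholds and the intensity values of 'W', 'LB', and 'LB:W' come from the text and table above.

```python
from collections import Counter

WHITE_THRESHOLD = 0.75  # Raised from 0.7 after inspecting the first scatter plot


def is_white(intensity, threshold=WHITE_THRESHOLD):
    """True if an average grey-scale intensity counts as white (assumed >= test)."""
    return intensity >= threshold


def white_clusters_by_size(clusters):
    """Tally white clusters by cords_per_cluster.

    clusters: iterable of (cords_per_cluster, intensity) pairs.
    """
    return Counter(size for size, intensity in clusters if is_white(intensity))


# Hypothetical rows; the 0.72 entry is invented to show the effect of the threshold change.
sample = [(3, 0.956863), (1, 0.956863), (1, 0.235059), (1, 0.595961), (3, 0.72)]

counts = white_clusters_by_size(sample)
print(sorted(counts.items()))  # [(1, 1), (3, 1)]
```

With the old threshold of 0.7, the invented 0.72 cluster would also have counted as white, which is the kind of borderline case the threshold change is meant to exclude.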