Khipu Cord Analysis

In this section, I will review the statistical nature of cords. How are they attached, what twists do they have, what is the main pendant to subsidiary ratio, etc.

If you are unaware of cord construction concepts such as twist, knots, etc. I suggest you start with the Khipu Legend page.

This page is dry and somewhat unexciting, but it’s a necessary step to understanding what is needed to draw and analyze the khipus. I believe it is at the cord and cord cluster level where most of the detailed investigation of khipus will lie in Phase 3.

This page is the output of an executable Jupyter notebook. The long prologue of imports and setup has been extracted to a startup file for brevity. The first task is to complete initialization by doing final setup and importing the Python Khipu OODB (Object Oriented Database) and summary files.

Code

# Initialize plotly
plotly.offline.init_notebook_mode(connected = True);
# Khipu Imports
import khipu_kamayuq as kamayuq  # A Khipu Maker is known (in Quechua) as a Khipu Kamayuq
import khipu_qollqa as kq        # A Khipu Qollca is a warehouse that holds khipu (and other databases)
all_khipus = [aKhipu for aKhipu in kamayuq.fetch_all_khipus().values()]
khipu_summary_df = kq.fetch_khipu_summary()

# Since this a presentation version, turn off any idiot lights
import warnings
warnings.filterwarnings('ignore')

Primary Cords

Let’s first start with primary cords, the main cord that all the cords are attached to. I’m particularly interested in their lengths, since that will have an impact on rendering…

Code

pcord_df = kq.primary_cord_df.loc[:,['khipu_id', 'pcord_length', 'twist', 'notes']].sort_values(by='pcord_length', ascending=False)
pcord_df.twist = pcord_df.twist.fillna('')
pcord_df.notes = pcord_df.notes.fillna('')
display_dataframe(pcord_df)

pcord_df.pcord_length.describe()

	khipu_id	pcord_length	twist	notes
primary_cord_index
1000096	1000096	513.5	U	About 20.0 cm of the main cord is tied in one large knot.
1000541	1000541	304.0	U	Final twist: Braided
1000118	1000118	300.5	Z	Main cord is Z-plyed BB:W which is S-wrapped with BB cord.
1000508	1000508	280.5
1000333	1000333	243.5	U
...	...	...	...	...
1000289	1000289	NaN		Connected to pendant #13 of UR117A.
1000293	1000293	NaN		All of the top cords are looped through the attachments of their group pendants.
1000295	1000295	NaN		#NAME?
1000328	1000328	NaN		Between 15 - 28.5 cm, primary cord bears knots:
1000452	1000452	NaN		0.0-1.5 cm ravelled end to thread-wrap (AB)

count    662.000000
mean      57.646752
std       47.536571
min        0.000000
25%       29.000000
50%       46.500000
75%       73.375000
max      513.500000
Name: pcord_length, dtype: float64

Code

prep_plot()
sns.distplot(pcord_df.pcord_length.values, bins=60, color="#588bbe", kde=False, rug=False).set_title('Histogram of Primary cord Lengths');

Note that a significant number of cords (20) have 0 length. One khipu is over 500 cm in size - 17 feet long.

Cord Clusters

When you look at khipus you notice that there’s a lot of pendant cord clusters - 6 cords of white, 4 cords of black and white mottled, that sort of thing. Do these clusters and colors constitute some sort of basic structure?

Let’s do a simple investigation for now. First let’s build the cluster table - what we’re trying to plot is number of cords per cluster versus colors. To make things simple, let’s get the grey-scale value of a cord, and then average that across the cord cluster to get an average intensity and or the number of Ascher cord colors for the cluster. The prebuilt database awaits:

Code

cluster_summary_df = kq.fetch_cluster_summary()
cluster_summary_df

	Unnamed: 0	khipu_id	khipu_name	cluster_id	cords_per_cluster	colors_per_cluster	intensity	cluster_colors_set	cluster_colors_all	banded	seriated	mean_cord_value	frequency
0	6052	1000661	UR294	1019839	5	5	0.437930	'LB', 'MB', 'NB', 'W', 'W:BG:MB'	'LB', 'MB', 'MB', 'MB', 'NB', 'W', 'W:BG:MB'	False	True	47.0	1
1	6051	1000660	UR293	1019837	4	4	0.586729	'0B', 'W', 'W-MB-MB', 'W:MB'	'0B', 'W', 'W-MB-MB', 'W:MB'	False	True	142.0	1
2	8186	6000092	UR292A	6004423	1	1	0.956863	'W'	'W'	True	False	20.0	441
3	8183	6000092	UR292A	6004420	6	5	0.550804	'AB:KB', 'GG', 'KB', 'W', 'W:MB'	'AB:KB', 'GG', 'KB', 'W', 'W', 'W:MB'	False	True	1.0	1
4	8184	6000092	UR292A	6004421	15	8	0.502118	'AB', 'GB', 'KB', 'MB', 'W', 'W-YB', 'W:GB', 'YB'	'AB', 'GB', 'GB', 'GB', 'GB', 'KB', 'KB', 'KB', 'MB', 'W', 'W', 'W-YB', 'W:GB', 'YB', 'YB'	False	True	1.0	1
...	...	...	...	...	...	...	...	...	...	...	...	...	...
8187	6057	6000001	AS002	6000016	6	2	0.207314	'DB', 'LB'	'DB', 'DB', 'DB', 'LB', 'LB', 'LB'	False	True	19.0	2
8188	6056	6000000	AS001	6000003	1	1	0.272706	'MB'	'MB'	True	False	93.0	104
8189	6055	6000000	AS001	6000002	6	1	0.272706	'MB'	'MB', 'MB', 'MB', 'MB', 'MB', 'MB'	True	False	13.0	20
8190	6054	6000000	AS001	6000001	1	1	0.272706	'MB'	'MB'	True	False	101.0	104
8191	6053	6000000	AS001	6000000	6	1	0.272706	'MB'	'MB', 'MB', 'MB', 'MB', 'MB', 'MB'	True	False	16.0	20

Code

cords_per_cluster = list(cluster_summary_df['cords_per_cluster'])
cord_cluster_counter = Counter(cords_per_cluster)
cord_count = sorted(cord_cluster_counter.most_common(), key=lambda x: x[0])

colors_per_cluster = list(cluster_summary_df['colors_per_cluster'])
color_cluster_counter = Counter(colors_per_cluster)
color_count = sorted(color_cluster_counter.most_common(), key=lambda x: x[0])
max([num_colors for num_colors, occurences in color_count])

Code

prep_plot(3)
sns.distplot(cluster_summary_df.cords_per_cluster.values, bins=160, color="#588bbe", 
             kde=False, rug=False, hist_kws={'log':True}).set_title('Histogram of Cords Per Cluster (Log10)');

Code

prep_plot(3)
sns.distplot(cluster_summary_df.colors_per_cluster.values, bins=30, color="#588bbe", 
             kde=False, rug=False, hist_kws={'log':True}).set_title('Histogram of Colors Per Cluster (Log10)');

Code

cluster_summary_df['marker_size'] = [10+aValue*2 for aValue in cluster_summary_df.frequency.values]
fig = px.scatter(cluster_summary_df, x="cords_per_cluster", y="intensity", log_x=True,
                 size="marker_size", color="colors_per_cluster", 
                 labels={"cords_per_cluster": "Cords per Cluster (log10)", "intensity": "Average GreyScale (0-1) per Cluster"},
                 hover_data=['cluster_colors_set'], 
                 title="<b>Cords per Cluster vs Average GreyScale (0-1) per Cluster</b> Size is frequency of occurence",
                 width=1200, height=750);
fig.update_layout(showlegend=True).show()

In the python code, I had set a cord’s is_white predicate to True if it’s average intensity is .7. Looking at the above graphic I changed it to .75. Beyond that I can see true white is not 1.0 but close, and that true white clusters are used until about 10 cords per cluster. I note that the number of white cords drops significantly after 7 - perhaps the old saw about we can easily remember up to 7 things is the reason? After that the true colors come out so to speak. So let’s see what the reverse has to tell us:

Code

cluster_summary_df['marker_size'] = [10+aValue*2 for aValue in cluster_summary_df.frequency.values]
fig = px.scatter(cluster_summary_df, x="cords_per_cluster", y="colors_per_cluster", log_x = True, log_y = True,
                 size="marker_size",color="intensity", 
                 labels={"cords_per_cluster": "Cords per Cluster (log10)", "colors_per_cluster": "#Ascher Colors per Cluster (log10)"},
                 hover_data=['cluster_colors_set'], 
                 title="<b>Cords per Cluster(log10) vs Colors per Cluster (log10)</b> Size is frequency of occurence",
                 width=1200, height=750);
fig.update_layout(showlegend=True).show()

Hmmnn…..

As a final preliminary EDA on cord clusters, let’s “riff” on an a theme first investigated by Jon Clindaniel - mean cord value per cluster across the number of colors in the cluster.

Code

cluster_summary_df['marker_size'] = [aValue for aValue in cluster_summary_df.cords_per_cluster.values]
fig = px.scatter(cluster_summary_df, x="colors_per_cluster", y="mean_cord_value", log_x = False, log_y = False,
                 size="marker_size",
                 color="seriated", 
                 labels={"colors_per_cluster": "#Ascher Colors per Cluster", "mean_cord_value": "Mean Cord Value per Cluster", },
                 hover_data=['khipu_name','cluster_colors_set'], 
                 title="<b>Mean Cord Value per Cluster vs Colors per Cluster</b> Size is number of cords in cluster",
                 width=1200, height=750);
fig.update_layout(showlegend=True).show()

Cord Attachments

Let’s look at the distribution of cords by attachment type (up, down, recto, etc…)

Pendant Cords

We can examine how khipus “Fan-Out” by looking at their branching structure. First let’s review top level primary “pendant” cords vs subsidiary cords attached to pendant cords. Then let’s review fan-out, the amount of braching that occurs. At this macro level, I’ll define fan-out as number of pendant cords divided by total cords (#pendant_cords + #subsidiary_cords).

Code

pendant_cords_df = kq.cord_df[kq.cord_df.pendant_from < 2000000]
pendant_cords_df.attachment_type.value_counts()

khipu_ids = pendant_cords_df.drop_duplicates(subset=['khipu_id']).khipu_id.values

khipu_pendants_df = pendant_cords_df.set_index('khipu_id')
r_cords = khipu_pendants_df[khipu_pendants_df.attachment_type=='R'].index.value_counts()
v_cords = khipu_pendants_df[khipu_pendants_df.attachment_type=='V'].index.value_counts()
rv_cords = khipu_pendants_df[khipu_pendants_df.attachment_type.isin(['R','V'])].index.value_counts()
u_cords = khipu_pendants_df[khipu_pendants_df.attachment_type=='U'].index.value_counts()
total_cords = khipu_pendants_df[khipu_pendants_df.attachment_type.isin(['R','V', 'U'])].index.value_counts()
khipu_names = [kq.kfg_name_from_id(aKhipu_ID) for aKhipu_ID in khipu_ids]

pendant_attachment_df = pd.DataFrame({'khipu_id': khipu_ids, 'name': khipu_names,
                                      'r_cords':r_cords, 'v_cords':v_cords, 'u_cords':u_cords, 'rv_cords':rv_cords, 'total_cords':total_cords})
pendant_attachment_df.r_cords = pendant_attachment_df.r_cords.fillna(0)
pendant_attachment_df.v_cords = pendant_attachment_df.v_cords.fillna(0)
pendant_attachment_df.u_cords = pendant_attachment_df.u_cords.fillna(0)
pendant_attachment_df.total_cords = pendant_attachment_df.total_cords.fillna(0)

pendant_attachment_df

R    12812
U    11635
V    11246
?      301
Name: attachment_type, dtype: int64

	khipu_id	name	r_cords	v_cords	u_cords	rv_cords	total_cords
1000000	1000000	UR019	3.0	8.0	2.0	11.0	13
1000001	1000001	UR008	0.0	6.0	0.0	6.0	6
1000002	1000002	UR020	13.0	9.0	0.0	22.0	22
1000003	1000003	UR018	15.0	12.0	0.0	27.0	27
1000004	1000004	UR010	63.0	91.0	3.0	154.0	157
...	...	...	...	...	...	...	...
1000656	1000656	UR289	4.0	31.0	0.0	35.0	35
1000657	1000657	UR290	8.0	0.0	0.0	8.0	8
1000658	1000658	UR291A	0.0	166.0	0.0	166.0	166
1000660	1000660	UR293	0.0	4.0	0.0	4.0	4
1000661	1000661	UR294	0.0	5.0	0.0	5.0	5

Code

prep_plot(3)
sns.distplot(khipu_summary_df.fanout_ratio.values, bins=100, color="#588bbe", 
             kde=False, rug=False, hist_kws={'log':True}).set_title('Histogram of Khipu Fan-Out Ratio\'s (Log10)');
khipu_summary_df.fanout_ratio.describe()

Text(0.5, 1.0, "Histogram of Khipu Fan-Out Ratio's (Log10)")

count    650.000000
mean       1.424814
std        0.980953
min        1.000000
25%        1.000000
50%        1.085476
75%        1.500000
max       14.250000
Name: fanout_ratio, dtype: float64

Distribution of Up Cords vs Recto/Verso Cords

It will be useful later to evaluate up cord distribution on the narrative-to-accounting khipu spectrum. Meanwhile, let’s warm up with this:

Code

fig = px.scatter(khipu_summary_df, x="num_top_cords", y="rv_cords", log_x=True,
                 size="num_cords", color="num_cords", 
                 labels={"num_top_cords": "#U/Top Cords (Log10)", "rv_cords": "#Recto+Verso Cords"},
                 hover_data=['kfg_name'], title="<b>Top Cords (Log10) vs Recto+Verso Cords</b>",
                 width=1200, height=750);
fig.update_layout(showlegend=True).show()

Distribution of Recto vs Verso Cords

Code

fig = px.scatter(khipu_summary_df, x="verso_cords", y="recto_cords", log_x=True,
                 size="num_cords", color="num_cords", 
                 labels={"verso_cords": "#Verso Cords (log10)", "recto_cords": "#Recto Cords"},
                 hover_data=['kfg_name'], title="<b>Verso Cords (log10) vs Recto Cords by Khipu</b>",
                 width=1200, height=750);
(fig.add_shape(type="line", x0=29, y0=1, x1=830, y1=410,
              line=dict(color="MediumPurple",width=4,dash="dot"))
    .add_shape(type="line", x0=29, y0=1, x1=3, y1=330,
              line=dict(color="MediumPurple",width=4,dash="dot"))
    .update_shapes(dict(xref='x', yref='y'))
    .update_layout(showlegend=True).show()
)

Note the curious relationship between rectos and versos suggested by the dashed purple lines. Is this simply an artifact of log mapping or is there something there… To me, this indicates that versos and rectos may be involved in some sort of subdivision - for example north vs south, or east vs west.

Cord Twist (aka Ply/Spin)

Each cord has one of two ply/spin’s - affectionately known as S or Z twist depending on the diagonal’s direction in the string. Let’s plot that distribution.

Code

fig = px.scatter(khipu_summary_df, x="num_s_cords", y="num_z_cords", log_x=True,
                 size="num_cords", color="num_cords", 
                 labels={"num_s_cords": "#S Cords (Log10)", "num_z_cords": "#Z Cords"},
                 hover_data=['kfg_name'], title="<b>S Cords (Log10) vs Z Cords</b>",
                 width=1200, height=750);
fig.update_layout(showlegend=True).show()

Mixed Twist Khipus

Urton in Sign of the Inka Khipu spends quite some time on mixed twist khipus. I note that mixed twist khipus occur in only approximately 5% of the khipus in the Harvard database.

Code


print(f"{khipu_summary_df[khipu_summary_df.num_z_cords > 0].shape[0]} khipus exist with Z cords")
print(f"{khipu_summary_df.num_z_cords.sum()} Z cords exist - about {100*khipu_summary_df.num_z_cords.sum()/khipu_summary_df.num_cords.sum()} % of all cords")

mixed_ply_df = khipu_summary_df[(khipu_summary_df.num_z_cords > 0) & (khipu_summary_df.num_s_cords > 0)].sort_values('num_z_cords',ascending=False)
mixed_ply_df = mixed_ply_df[['khipu_id', 'kfg_name', 'num_cord_clusters', 
       'benford_match', 'num_cords', 'num_s_cords', 'num_z_cords',
       'num_total_cords','num_ascher_colors']]
print(f"{mixed_ply_df.shape[0]} khipus have mixed twist construction")
display_dataframe(mixed_ply_df)

80 khipus exist with Z cords
2912 Z cords exist - about 5.226130653266332 % of all cords
34 khipus have mixed twist construction

	khipu_id	kfg_name	num_cord_clusters	benford_match	num_cords	num_s_cords	num_z_cords	num_total_cords	num_ascher_colors
71	1000629	JC005	6	0.652365	149	1	148	149	28
235	6000092	UR292A	9	0.513569	56	3	52	56	16
94	6000079	UR054	115	0.788527	115	6	47	115	2
74	1000314	UR1095	10	0.913865	212	171	40	212	5
135	1000310	UR1087	2	0.948618	97	61	36	97	6
...	...	...	...	...	...	...	...	...	...
51	1000329	UR119	49	0.893283	304	301	1	304	22
37	1000275	UR089	9	0.804845	288	286	1	288	37
17	1000020	UR003	15	0.645978	758	749	1	758	36
11	1000344	UR1084	26	0.840853	319	298	1	319	13
631	1000339	UR096	3	0.936845	57	45	1	57	26

Cord Knots

Knot Distribution on Recto/Verso Cords

What about the distribution of Recto/Verso and S/Z Twist Knots

Code

from khipu_cord import fetch_cord
r_cords_df = khipu_pendants_df[khipu_pendants_df.attachment_type=='R']
v_cords_df = khipu_pendants_df[khipu_pendants_df.attachment_type=='V']

SR_cord_df = r_cords_df[r_cords_df.twist == 'S']
SR_cords = [fetch_cord(aCordID) for aCordID in SR_cord_df.cord_id.values]
SR_knots = [aCord.num_knots() for aCord in SR_cords] 

ZR_cord_df = r_cords_df[r_cords_df.twist == 'Z']
ZR_cords = [fetch_cord(aCordID) for aCordID in ZR_cord_df.cord_id.values]
ZR_knots = [aCord.num_knots() for aCord in ZR_cords] 

SV_cord_df = v_cords_df[v_cords_df.twist == 'S']
SV_cords = [fetch_cord(aCordID) for aCordID in SV_cord_df.cord_id.values]
SV_knots = [aCord.num_knots() for aCord in SV_cords] 

ZV_cord_df = v_cords_df[v_cords_df.twist == 'Z']
ZV_cords = [fetch_cord(aCordID) for aCordID in ZV_cord_df.cord_id.values]
ZV_knots = [aCord.num_knots() for aCord in ZV_cords]

Code

prep_plot(3)
p1 = sns.distplot(SR_knots, bins=29, kde=False, rug=True, color = "#588bbe", hist_kws={'log':True});
p1 = sns.distplot(ZR_knots, bins=24, kde=False, rug=True, color = "yellow", hist_kws={'log':True})
p1.set_title('Histogram of Knot Counts on Recto Cords\n S (blue) vs Z (yellow) with LOG SCALE on Y AXIS');

Code

prep_plot(3)
p1 = sns.distplot(SV_knots, bins=26,  kde=False, rug=True, color = "#588bbe", hist_kws={'log':True})
p1 = sns.distplot(ZV_knots, bins=23,  kde=False, rug=True, color = "yellow",  hist_kws={'log':True})
p1.set_title('Histogram of Knot Counts on Verso Cords\n S (blue) vs Z (yellow) with LOG SCALE on Y AXIS');

Code

# plt.figure(figsize=(14,8))
# plt.grid(linestyle='dotted', linewidth=.75, axis='y')

fig = px.scatter(mixed_ply_df, x="num_s_cords", y="num_z_cords", 
                 size="num_cords", color="num_cords", 
                 labels={"num_s_cords": "#S Twists", "num_z_cords": "#Z Twists"},
                 hover_data=['kfg_name'], title="<b>Mixed Twist Khipus: S Twist Cords vs Z Twist Cords by Khipu</b>",
                 width=1200, height=750);
fig.update_layout(showlegend=True).show()

Z-Cords vs S-Cords Knot Distribution

Urton suggests that perhaps the Z twist cords might be narrative in nature. I’m curious what the number of knots distributed over Z cords looks like: vs S cords

Code

from khipu_cord import fetch_cord
from collections import Counter
S_cord_df = kq.cord_df[kq.cord_df.twist == 'S']
S_cords = [fetch_cord(aCordID) for aCordID in list(S_cord_df.cord_id.values)]
S_knots = [aCord.num_knots() for aCord in S_cords] 

print("S Cord Knot Counter:")
knot_counter = Counter(S_knots)
for key in sorted(knot_counter):
    print(f"{key}: {knot_counter[key]}")
print("\n")

Z_cord_df = kq.cord_df[kq.cord_df.twist == 'Z']
Z_cords = [fetch_cord(aCordID) for aCordID in list(Z_cord_df.cord_id.values)]
Z_knots = [aCord.num_knots() for aCord in Z_cords] 

print("Z Cord Knot Counter:")
knot_counter = Counter(Z_knots)
for key in sorted(knot_counter):
    print(f"{key}: {knot_counter[key]}")

S Cord Knot Counter:
0: 13643
1: 16710
2: 4863
3: 2713
4: 1951
5: 1573
6: 1093
7: 955
8: 808
9: 689
10: 550
11: 330
12: 249
13: 192
14: 150
15: 129
16: 81
17: 55
18: 52
19: 39
20: 20
21: 18
22: 8
23: 2
24: 7
25: 4
26: 3
27: 2
29: 2


Z Cord Knot Counter:
0: 644
1: 1009
2: 541
3: 538
4: 303
5: 187
6: 138
7: 126
8: 104
9: 94
10: 51
11: 56
12: 33
13: 48
14: 15
15: 16
16: 8
17: 6
19: 4
20: 2
21: 1
23: 3
24: 1

Code

hist_data = [S_knots, Z_knots]
group_labels = ['S_Cords', 'Z_Cords']
colors = ['#835AF1', '#B8F7D4']

# Create plotly distplot with custom bin_size
(
ff.create_distplot(hist_data, group_labels,  colors=colors, bin_size=1, 
                         show_rug=False,show_curve=False)
    .update_layout(title_text='<b>Number of Knots by Cord Twist</b>', # title of plot
                     xaxis_title_text='Number of Knots', # xaxis label
                     yaxis_title_text='Count', # yaxis label
                     width=1200,
                     #bargap=0.2, # gap between bars of adjacent location coordinates
                     #bargroupgap=0.1 # gap between bars of the same location coordinates
                     )
     .show()
)

That ends our quick EDA of cords.