Previously
- Reading Kinney Word Contingencies 11.4.17
- Which e-version texts did Kinney use -- GE to ask HC
Base texts for experiments
- HUGH'S SET
- GABRIEL'S SET
- WANTED
- ?? Brett's Set
We would like to reproduce the results in Kinney's chapter of Craig and Kinney 2009.
Decisions
- Which Texts?
- King Lear Q1 1608 vs F1 1623
- Which Electronic Versions?
- From Brett G Hirsch - available only as "TEXT" not "XML" for 'King Lear'
- Which Tools?
- Preprocessing -- Manual processing? Craig and Whipp 2010 ?
- Statistical Analysis -- IA ?? Which Version ?? Source Code ??
- Which Tests?
- Word Frequencies
- "Word in Proximity" frequencies
- PCA
- Shannon and SJ Entropy ....
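The first two tests and the Shannon entropy item can be sketched in a few lines of Python. This is only a rough sketch, not the Intelligent Archive's method or Kinney's procedure: the tokenizer is a deliberate simplification, the function names are hypothetical, and "word in proximity" is read here as "words within a fixed window of a target word", which is an assumption.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase, keep runs of letters and apostrophes only.
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(tokens):
    # Raw counts per word form.
    return Counter(tokens)


def proximity_counts(tokens, target, window=5):
    # Count words found within `window` tokens either side of each
    # occurrence of `target` (one possible reading of "word in proximity").
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


def shannon_entropy(counts):
    # Shannon entropy (in bits) of the distribution given by the counts.
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Running the same functions over the Q1 and F1 texts would give two frequency tables that can be compared directly, or fed into PCA later.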
Intelligent Archive
- Java Source Code ??
- "Processing" logic vs "function word semantic" logic etc ...
Clarify
- Typograph vs orthograph
- Homophone vs homograph
Alternative Approaches
- IA focuses on words
- DTW/MSA focuses on letters
- We want to assess...
- Changes in word frequencies
- Changes in word form
- Changes in ...
DTW Pairwise Alignments Revisited
- Focused on Letters not words...
- King Lear Q1 F1 Brett Text files... (w some markup)
- Strip out XML entities (lemmatisation etc) stripMarkup.sh 20.4.17
- Script pw_align.py 20.4.17
- Alternatives to DTW? -- maybe Genetic Algorithms ??
- e.g. Simple Genetic Algorithm in 15 lines of Python
- GA fitness functions could be multiobjective -- selecting for both alignment of letters and words ...
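For reference, a letter-level DTW alignment can be sketched as below. This is not the contents of pw_align.py (not reproduced in these notes); it is a textbook DTW, assuming a local cost of 0 for matching letters and 1 otherwise, and assuming both inputs are non-empty. The name `dtw_align` is hypothetical.

```python
def dtw_align(a, b):
    # Classic dynamic time warping over two letter sequences.
    # Assumption: local cost is 0 for identical letters, 1 otherwise.
    # Assumption: both a and b are non-empty.
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = 0 if a[i - 1] == b[j - 1] else 1
            cost[i][j] = d + min(cost[i - 1][j - 1],  # advance both
                                 cost[i - 1][j],      # advance a only
                                 cost[i][j - 1])      # advance b only
    # Trace back from the end to recover the warping path as
    # (index in a, index in b) pairs.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path
```

For example, `dtw_align("kinge", "king")` gives a distance of 1, with the final "e" warped onto the "g". Note that DTW lets one letter match a run of repeated letters at no cost, so doubled letters (e.g. "warre" vs "ware") are invisible to it; that may or may not be what we want for spelling variants.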
Visualization
- King Lear Q1 F1 Brett Text files... (w some markup)
- Use ascii values as colors and layout as square grid... enabling image arithmetic ...
[Figure panels: Q1, F1, Q1-F1, F1-Q1]
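The grid idea can be sketched without any imaging library. The sketch below makes several assumptions: character codes above 255 are capped at 255, the grid is padded with 0 to fill the square, both grids have the same side length, and subtraction is clipped at 0 so that Q1-F1 and F1-Q1 give different images. The function names are hypothetical.

```python
import math


def text_to_grid(text, side=None, pad=0):
    # Map each character to its code (capped at 255) and lay the codes
    # out row by row on a square grid, padding the tail with `pad`.
    codes = [min(ord(ch), 255) for ch in text]
    if side is None:
        side = math.isqrt(len(codes))
        if side * side < len(codes):
            side += 1
    codes = codes[: side * side]
    codes += [pad] * (side * side - len(codes))
    return [codes[r * side:(r + 1) * side] for r in range(side)]


def grid_subtract(g1, g2):
    # Cell-wise g1 - g2, clipped at 0 (assumption), for equal-sized grids.
    return [[max(x - y, 0) for x, y in zip(row1, row2)]
            for row1, row2 in zip(g1, g2)]
```

Each grid can then be written out as a greyscale image with whatever imaging tool is to hand. Without prior alignment, a single inserted letter shifts every later cell, so the difference images are only informative when the two texts are aligned first.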