Experiment 1 -- "One King Lear?"
- 1ii) Compare Q-only with Q-common using PCA and NSC
Principal Component Analysis (PCA) of 200 function words (F-words)
- Lr Q/F only/common
- Include also: Merry Wives of Windsor (Q1), Pericles (Q1) and Henry V (Folio)
- DECISION: Use 2D plots in future -- this will also fix the bugs in the "fake" legends on the 3D plots.
PCA of Other S. Plays
- We now have XML for several other S. plays, so extend the PCA to these data sets as well as Lr Q/F (full text).
- Plot only the centroids of the 1st/2nd principal components of each play's 200 function-word feature vectors.
- Tentative results were discussed
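A minimal sketch of the centroid step described above, assuming each play has already been reduced to a matrix of 200 function-word frequencies (one row per text window). The function name `pca_centroids`, the play labels and the random toy data are illustrative assumptions, not the project's actual code or results.

```python
import numpy as np

def pca_centroids(features, labels, n_components=2):
    """Project rows of `features` onto the top principal components and
    return {label: centroid of that label's projected rows}."""
    X = np.asarray(features, dtype=float)
    X = X - X.mean(axis=0)                # centre each function-word column
    # SVD of the centred data: the rows of Vt are the principal axes
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    scores = X @ Vt[:n_components].T      # shape (n_windows, n_components)
    labels = np.asarray(labels)
    return {str(lab): scores[labels == lab].mean(axis=0)
            for lab in np.unique(labels)}

# Toy data standing in for real counts (assumption):
# 3 texts, 10 windows each, 200 function-word features per window.
rng = np.random.default_rng(0)
features = np.vstack([rng.normal(loc=m, size=(10, 200)) for m in (0.0, 0.5, 1.0)])
labels = ["Lr-Q"] * 10 + ["Lr-F"] * 10 + ["H5-F"] * 10

centroids = pca_centroids(features, labels)
for play, (pc1, pc2) in centroids.items():
    print(f"{play}: PC1={pc1:.2f} PC2={pc2:.2f}")
```

Each centroid is a single (PC1, PC2) point, so one marker per play can go straight onto a 2D scatter plot.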
Experiment 2 -- "Find-Your-Partner"
Pairing Speeches in XML
- A previous attempt to pair up Q/F speeches in the Lr Common XML files led to the identification of Numerically Mismatched Speech Pairs 22.11.17
- To pair speeches in other plays, where we do not extend the XML with "add resp=" markup, we would need a Speech Lookup Table (LUT), or heuristics based on metrics like Edit Distance or Mutual Information, to find the matches.
- In a sense this is what Experiment 2 "Find-Your-Partner" seeks to do...
- DECISION: To make Experiment 2 tractable -- use entire speech sections instead of 2000-word windows.
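As an illustration of the Edit Distance heuristic mentioned above, here is a minimal sketch of a word-level Levenshtein distance between two whole speeches, normalised by the longer speech. Tokenising on whitespace and lower-casing are assumptions, the helper names are hypothetical, and the example strings are toy inputs rather than quotations from the editions.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (here: lists of words)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,              # delete x
                curr[j - 1] + 1,          # insert y
                prev[j - 1] + (x != y),   # substitute x with y (free if equal)
            ))
        prev = curr
    return prev[-1]

def normalised_distance(q_speech, f_speech):
    """Word-level edit distance scaled to [0, 1] by the longer speech."""
    q, f = q_speech.lower().split(), f_speech.lower().split()
    longest = max(len(q), len(f))
    return edit_distance(q, f) / longest if longest else 0.0

# Toy strings (assumption): one word substituted out of five.
print(normalised_distance("nothing can come of nothing",
                          "nothing will come of nothing"))
```

Working on whole speeches keeps the number of comparisons at (speeches in Q) x (speeches in F), which is what makes the approach tractable compared with sliding windows.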
Code
Results
- We can clearly see the two speeches inserted in F for Lear ("Nothing?") and Cordelia ("Nothing.") (columns 26 and 27).
- We can also see that Lear's second speech is noticeably different in Q and F (row 17) using the JSD metric.
- Likewise the preceding speech by Gloucester (row 16).
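For reference, a minimal sketch of the Jensen-Shannon divergence (JSD) between two speeches, assuming it is computed over relative word frequencies with no smoothing. The notes do not record the tokenisation or vocabulary actually used, so those choices and the function name `jsd` are assumptions.

```python
from collections import Counter
from math import log2

def jsd(tokens_p, tokens_q):
    """Jensen-Shannon divergence (base 2, so the result lies in [0, 1])
    between the relative word frequencies of two token lists."""
    p, q = Counter(tokens_p), Counter(tokens_q)
    n_p, n_q = sum(p.values()), sum(q.values())
    if n_p == 0 or n_q == 0:
        raise ValueError("both speeches must contain at least one token")
    total = 0.0
    for word in p.keys() | q.keys():
        pw, qw = p[word] / n_p, q[word] / n_q
        mw = (pw + qw) / 2                # mixture distribution
        if pw:
            total += 0.5 * pw * log2(pw / mw)
        if qw:
            total += 0.5 * qw * log2(qw / mw)
    return total

same = jsd("speak again".split(), "speak again".split())
disjoint = jsd(["nothing"], ["something"])
print(same, disjoint)
```

Identical speeches score 0 and speeches with no words in common score 1, so a speech present in only one text stands out as a row or column of high values.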
Discussion
- Visualisation, even of this speech-based approach, will not scale.
- Revisit what this experiment seeks to do.
- DECISION: Rather than visualising the entire analysis, simply report for each speech in Q what the best matching speech in F is, and vice versa.
- This will enable automatic pairing of Q-F speeches in other S. plays, rather than the manual approach we used for Lr LR-Q1-F-mapping-20.6.17
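The reporting step in the decision above could look roughly like this: given a matrix of distances between every Q speech (rows) and every F speech (columns), list the best match in each direction and mark the mutual pairs. The function name `best_matches` and the toy matrix are assumptions for illustration, not measured results.

```python
def best_matches(dist):
    """dist[i][j] is the distance between Q speech i and F speech j.
    Returns (q_to_f, f_to_q, mutual): the best F index for each Q speech,
    the best Q index for each F speech, and the pairs that pick each other."""
    n_q, n_f = len(dist), len(dist[0])
    q_to_f = [min(range(n_f), key=lambda j, i=i: dist[i][j]) for i in range(n_q)]
    f_to_q = [min(range(n_q), key=lambda i, j=j: dist[i][j]) for j in range(n_f)]
    mutual = [(i, j) for i, j in enumerate(q_to_f) if f_to_q[j] == i]
    return q_to_f, f_to_q, mutual

# Toy distances (assumption): 3 Q speeches, 4 F speeches,
# where F speech 2 has no close partner in Q.
dist = [
    [0.1, 0.9, 0.8, 0.9],
    [0.9, 0.2, 0.7, 0.8],
    [0.8, 0.9, 0.6, 0.1],
]
q_to_f, f_to_q, mutual = best_matches(dist)
print("Q -> F:", q_to_f)
print("F -> Q:", f_to_q)
print("mutual pairs:", mutual)
```

A speech whose best match is not mutual (F speech 2 in the toy matrix) is a candidate for an insertion or deletion between the two texts and is worth checking by hand.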