Semantic Drift between the Testaments: Using Collocation Analysis to Find Theological Significance
Matt Munson

Theological Background
  Use of the Old Testament in the New
    Similarities
    Differences
    Relative Meaning
  But re-use goes beyond quotations
  What about the similarities, differences, and relative meanings of individual words?
  Can we detect theological significance even here?

Linguistic Background - Collocations
  Firth: "You shall know a word by the company it keeps!"
  Harris: "If we consider words or morphemes A and B to be more different in meaning than A and C, then we will often find that the distributions of A and B are more different than the distributions of A and C. In other words, difference of meaning correlates with difference of distribution."

My Hypothesis
  Linguistic: By comparing the collocation word fields of the same target word in the Septuagint and the New Testament, one can detect which words have changed in meaning the most from one Testament to the other.
  Theological: Further investigation of how the collocation fields have changed will lead to insights concerning the theological changes from the LXX to the NT.

My Method
  Lemmatized Greek Texts
  Collocation span of 4L and 4R
  Co-occurrence counts
  Log likelihood significance
  Cosine similarity of log likelihood tables
  Comparison of log likelihood and cosine similarity tables for each lemma

Lemmatized Greek Texts
  Highly Inflected Language
    Nouns: 8 distinct forms
    Verbs: would you believe over 200 forms?
  Not lemmatizing would make each of these forms appear to the computer to be a unique word
  Could be interesting but not enough data to overcome atomization

Collocation Span of 4L and 4R
  L4   L3        L2     L1      Lemma  R1       R2       R3   R4
  ἐν   ἀρχή      ποιέω  ὁ       θεός   ὁ        οὐρανός  καί  ὁ
  ὁ    ἄβυσσος   καί    πνεῦμα  θεός   ἐπιφέρω  ἐπάνω    ὁ    ὕδωρ
  Experiments have shown this to be the most effective span

Co-occurrence Counts
  Simple counts of how often a collocate occurs in the given span of the target
  Example: 'ἔντιμος' 1, 'ἀπόδεκτος' 1, 'προευαγγελίζομαι' 1, 'γεννάω' 11, 'κιθάρα' 1, 'ὀλίγος' 1, 'πρό' 4, 'ἀνοίγω' 2, 'ἐπιποθέω' 2, 'ἀστεῖος' 1, 'ἔμπροσθεν' 6, 'μετάνοια' 7, 'ἐκπορεύομαι' 2, 'ὅτε' 9, 'οἰκτιρμός' 2, 'Ῥαιφάν' 1, 'ὅτι' 122

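The counting step can be sketched in a few lines of Python. This is only a minimal illustration, not the code behind the slides: the function name `cooccurrence_counts` is hypothetical, and the sketch assumes the text is a flat list of lemmas with no sentence or verse boundaries. The nine sample lemmas are the first example row from the collocation span slide.

```python
from collections import Counter, defaultdict

def cooccurrence_counts(lemmas, span=4):
    """Count collocates within `span` tokens to the left and right of each target.

    Assumption: `lemmas` is one flat list; windows are only clipped at the
    start and end of the list, not at sentence or verse boundaries.
    """
    counts = defaultdict(Counter)
    for i, target in enumerate(lemmas):
        start = max(0, i - span)
        # Left context plus right context, excluding the target itself.
        window = lemmas[start:i] + lemmas[i + 1:i + 1 + span]
        counts[target].update(window)
    return counts

# Lemmatized sample: the first 4L/4R example row for the target θεός.
tokens = ["ἐν", "ἀρχή", "ποιέω", "ὁ", "θεός", "ὁ", "οὐρανός", "καί", "ὁ"]
table = cooccurrence_counts(tokens)
print(table["θεός"])
```

In this sample ὁ is counted three times for θεός and every other collocate once.
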
Log Likelihood Significance I
  Significant collocation is regular collocation between two items, such that they co-occur more often than their respective frequencies. (Léon, 14)
  Log-likelihood measures the strength of association between words by comparing how often each word occurs on its own with how often they occur together.
  It is also appropriate for sparse data.
  This measures syntagmatic relationships.
  More Information: TU Darmstadt LinguisticsWeb

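The slides do not say which log-likelihood formulation was used. As one common possibility, the sketch below assumes Dunning's G2 statistic over a 2x2 contingency table of target and collocate occurrences. The function name and the cell names `k11` to `k22` are illustrative, not taken from the original analysis.

```python
import math

def log_likelihood(k11, k12, k21, k22):
    """G2 statistic for a 2x2 contingency table (assumed formulation).

    k11: positions where target and collocate co-occur
    k12: target occurs without the collocate
    k21: collocate occurs without the target
    k22: neither occurs
    """
    total = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = rows[i] * cols[j] / total
                g2 += o * math.log(o / expected)
    return 2.0 * g2

# Independent words give a score of 0; strong association gives a large score.
print(log_likelihood(10, 10, 10, 10))
print(log_likelihood(10, 0, 0, 10))
```

Because the statistic compares observed counts with the counts expected under independence, it stays usable when some cells are very small, which fits the point about sparse data.
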
Log Likelihood Significance II
  Tables
  θεός 1317
  Lemma              Count  Log likelihood
  ἔντιμος            1      0,03803416
  ἀπόδεκτος          1      0,11466575
  προευαγγελίζομαι   1      0,18522827
  γεννάω             11     0,08205591
  κιθάρα             1      0,05429049
  ὀλίγος             1      0,10219456
  πρό                4      0,00225585
  ἀνοίγω             2      0,18766275
  ἐπιποθέω           2      0,09092744
  ἀστεῖος            1      0,11466575
  ἔμπροσθεν          6      0,06828326
  μετάνοια           7      0,51665527
  ἐκπορεύομαι        2      0,00614629
  ὅτε                9      0,00729741
  οἰκτιρμός          2      0,18778194
  Ῥαιφάν             1      0,18522827
  ὅτι                122    0,030284

Cosine Similarity of Log-Likelihood Tables
  Cosine similarity is often used to measure the similarity between word frequency lists.
  I used it to compare log likelihood tables, which have the same form as frequency lists.
  I compared all the tables in the LXX to each other and all in the NT to each other.
  I also compared the same lemmata in each Testament to each other.
  This measures paradigmatic relationships.

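A minimal sketch of cosine similarity over two such tables follows, assuming each table is a dictionary that maps a collocate to its log-likelihood score and that a collocate missing from a table counts as 0. The function name and the sample scores are hypothetical.

```python
import math

def cosine_similarity(table_a, table_b):
    """Cosine of the angle between two sparse score tables (dicts)."""
    shared = table_a.keys() & table_b.keys()
    # Only collocates present in both tables contribute to the dot product.
    dot = sum(table_a[k] * table_b[k] for k in shared)
    norm_a = math.sqrt(sum(v * v for v in table_a.values()))
    norm_b = math.sqrt(sum(v * v for v in table_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty table is treated as similar to nothing
    return dot / (norm_a * norm_b)

# Hypothetical scores, for illustration only.
a = {"κύριος": 3.0, "υἱός": 4.0}
b = {"κύριος": 4.0, "υἱός": 3.0}
c = {"χάρις": 2.0}
print(cosine_similarity(a, b))  # shared collocates, different weights
print(cosine_similarity(a, c))  # no shared collocates
```

Two lemmata score close to 1 when they keep the same company with similar strength, which is why the measure is read here as paradigmatic.
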
Cosine Similarity Results
  Within each Testament ('θεός')
  Lemma      Cosine similarity
  'εἰμί'     0,98700233
  'πᾶς'      0,98665867
  'καί'      0,9866551
  'ἐν'       0,98611097
  'πιστεύω'  0,98221732
  'οὕτω'     0,98221732
  'περί'     0,98221732
  'γράφω'    0,98221732
  'ὄνομα'    0,98221732
  'λαλέω'    0,98221732
  'χάρις'    0,98221732
  'χείρ'     0,98221732
  'πόλις'    0,98221732
  'λαμβάνω'  0,98221732
  'οὗτος'    0,98221732
  'λόγος'    0,98221732

  Between the Testaments
  Lemma      Cosine similarity
  καί        0,99546961
  οὔτε       0,97747377
  ἐν         0,95708394
  πᾶς        0,95482188
  ὁ          0,95312389
  ὅς         0,94789635
  οὗτος      0,94640289
  αὐτός      0,9461037
  σύ         0,94513681
  ἐγώ        0,93073733
  εἰς        0,92295005
  εἰμί       0,92174644
  Σαλμών     0,92024309
  ἐπί        0,91145861
  ὅτι        0,89938138
  γάρ        0,89611732
  ἀπό        0,89527621

Compare Log Likelihood Tables
  This will show which collocates occur more significantly with the lemma in the LXX and which in the NT.
  Positive means more significantly in the LXX, negative in the NT.
  Syntagmatic comparison
  Will show change in usage but not change in meaning directly

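The sign convention suggests a simple subtraction, LXX score minus NT score, for each collocate. The sketch below assumes exactly that and treats a collocate missing from one Testament as having a score of 0. The slides do not confirm either detail, and the function name and sample scores are hypothetical.

```python
def compare_tables(lxx, nt):
    """Return (collocate, LXX score minus NT score) pairs, largest first.

    Assumption: a collocate absent from one table has a score of 0 there.
    """
    collocates = lxx.keys() | nt.keys()
    diff = {c: lxx.get(c, 0.0) - nt.get(c, 0.0) for c in collocates}
    return sorted(diff.items(), key=lambda item: item[1], reverse=True)

# Hypothetical scores for one target lemma, for illustration only.
lxx_scores = {"κύριος": 80.0, "χάρις": 1.0}
nt_scores = {"κύριος": 5.0, "χάρις": 4.5, "βασιλεία": 7.0}
for collocate, value in compare_tables(lxx_scores, nt_scores):
    print(collocate, value)
```

Collocates at the top of the sorted list lean toward the LXX and those at the bottom toward the NT. The same subtraction could be applied to two cosine similarity tables.
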
Compare Log Likelihood Tables - Results
  'θεός'
  Lemma        English               LL difference
  κύριος       Lord                  75,9100978
  ἐγώ          I                     21,0948238
  σύ           You                   20,605072
  λατρεύω      To serve              9,54964005
  εὐλογητός    Blessed               9,03718071
  ὅτι          That, which           8,88005679
  υἱός         Son                   8,10690664
  Ἰσραήλ       Israel                6,6088925
  εὐλογέω      To bless              6,01655046
  ἕτερος       Other                 5,36895205
  εἰρήνη       Peace                 -1,70287265
  ἐνώπιον      Before, in front of   -2,07034527
  θέλημα       Will                  -2,35896724
  εὐχαριστέω   To give thanks        -2,60862474
  δοξάζω       To magnify, extol     -3,50148557
  χάρις        Gift, grace           -3,71370302
  βασιλεία     Kingdom               -7,38014298
  αὐτός        He, she, it, self     -12,1297082
  καί          And                   -32,6091537
  ὁ            The                   -116,551495

Compare Cosine Similarity Tables
  This will show which other lemmata each lemma attracts in each Testament.
  The value will be positive if they are more attracted in the LXX, negative if in the NT.
  Paradigmatic comparison
  These comparisons should suggest meaning change

Compare Cosine Similarity Tables - Results
  'θεός'
  Lemma         English                        CS difference
  συνίημι       To come together, understand   0,87504707
  δοξάζω        To magnify, extol              0,8714031
  ταπεινόω      To lower, to abase             0,83253347
  φέρω          To carry                       0,81786621
  σπέρμα        Seed                           0,81385463
  ἀρχή          Beginning, power               0,80058442
  ἐπερωτάω      To consult                     0,80056964
  Δαυίδ         David                          0,78433087
  πρεσβύτερος   Elder                          0,7824249
  γλῶσσα        Tongue                         0,77929021
  καιρός        Time, season                   -0,55061329
  ἕτερος        Other                          -0,56250249
  οὖν           And so                         -0,58666004
  ἐμός          Mine (possessive)              -0,58994683
  ἵνα           In order to                    -0,62106431
  βάλλω         To throw                       -0,63303536
  ὥρα           Part of a day, hour            -0,64563456
  μᾶλλον        More                           -0,68514535
  χάρις         Gift, grace                    -0,76401283
  περιπατέω     To walk (about), to live       -0,97335415

Next Steps
  Finish comparison of LL and CS tables
  Include other information in analysis
    POS Information
    Semantic dependencies
      Could help to account for Greek sentence structure
  Remove information from the analysis
    Stop words
    Certain parts of speech (e.g., adverbs, particles)
  Close-reading analysis of the results