Machine Translation: a Perspective Pushpak Bhattacharyya, CSE Dept., IIT Bombay South Asian University, Delhi, 17th Dec, 2013
Introduction • Machine Translation (MT) is a technique to translate texts from one natural language to another natural language using a machine • Translated text should have two desired properties: – Adequacy: Meaning should be conveyed correctly – Fluency: Text should be fluent in the target language • Translation between distant languages is a difficult task – Handling Language Divergence is a major challenge
Perspectivising NLP: Areas of AI and their inter-dependencies
(Diagram: Search, Logic, Machine Learning, NLP, Vision, Knowledge Representation, Planning, Robotics and Expert Systems, shown with their inter-dependencies.)
MT is part of NLP: a useful view — the NLP Trinity
(Diagram: three axes. Problem: morphology, POS tagging, chunking, parsing, semantics, discourse and coreference, with increasing complexity of processing. Language: English, Hindi, Marathi, French. Algorithm: HMM, MEMM, CRF. Also: language typology.)
Why is MT difficult
Ambiguity
Stages of processing
• Phonetics and phonology
• Morphology
• Lexical analysis
• Syntactic analysis
• Semantic analysis
• Pragmatics
• Discourse
Morphology Ambiguity
• Word formation rules from root words
• Verb morphology: GNPTAM (gender, number, person, tense, aspect, modality)
  – jaayeMge: we/they will_go
• Noun morphology: number and direct/oblique information
  – ladke: boys / boy+case-marker
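The GNPTAM analysis above can be sketched as longest-suffix-first lookup in a suffix table. The suffix table and feature values below are illustrative toys, not the rules of any actual Hindi analyser:

```python
# Toy Hindi verb morphology sketch (hypothetical suffix table, not a real analyser).
SUFFIXES = {
    "yeMge": {"person": 3, "number": "plural", "tense": "future"},
    "egaa":  {"person": 3, "number": "singular", "tense": "future"},
}

def analyse(word):
    """Strip the longest known suffix and return (root, GNPTAM features)."""
    for suf in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suf):
            return word[: -len(suf)], SUFFIXES[suf]
    return word, {}

root, feats = analyse("jaayeMge")   # "we/they will go"
print(root, feats["tense"])          # jaa future
```

Real analysers also handle stem changes and ambiguous suffixes (a single surface form can analyse in several ways, which is exactly the ambiguity discussed above).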
Lexical Ambiguity First step: part of Speech Disambiguation • Dog as a noun (animal) • Dog as a verb (to pursue)
Sense Disambiguation • Dog (as animal) • Dog (as a very detestable person)
Needs word relationships in a context
• The chair emphasised the need for adult education
• Very common in day-to-day communication:
  – Satellite channel ad: Watch what you want, when you want (two senses of watch)
  – e.g., ground-breaking ceremony / ground-breaking research
Ambiguity of Multiwords
• The grandfather kicked the bucket after suffering from cancer.
• This job is a piece of cake.
• Put the sweater on.
• He is the dark horse of the match.
Google translations of the above sentences (idiomatic meaning lost):
• दादा कैंसर से पीड़ित होने के बाद बाल्टी लात मारी। (back-translation: grandfather, after suffering from cancer, kicked a bucket)
• इस काम के केक का एक टुकड़ा है। (this job has a piece of cake)
• स्वेटर पर रखो। (place [it] on the sweater)
• वह मैच के अंधेरे घोड़ा है। (he is the dark-coloured horse of the match)
Ambiguity of Named Entities
• Bengali: চঞ্চল সরকার বাড়িতে আছে — machine output: "Government is restless at home." (*) Correct: "Chanchal Sarkar is at home."
• Hindi: दैनिक दबंग दुनिया — machine output: "everyday bold world"; actually the name of a Hindi newspaper in Indore
• High degree of overlap between NEs and MWEs
• Treat them differently: transliterate, do not translate
Structural Ambiguity
• Scope
  1. The old men and women were taken to safe locations: (old (men and women)) vs. ((old men) and women)
  2. No smoking areas will allow hookas inside
• Prepositional Phrase Attachment
  – I saw the boy with a telescope (who has the telescope?)
  – I saw the mountain with a telescope (world knowledge: a mountain cannot be an instrument of seeing)
  – I saw the boy with the pony-tail (world knowledge: a pony-tail cannot be an instrument of seeing)
• Very common: newspaper headline "20 years later, BMC pays father 20 lakhs for causing son's death"
Semantic Role ambiguity • John gave a book to Mary • Give action: Agent: John, Object: Book, Recipient: Mary • Challenge: ambiguity in semantic role labeling – (Eng) Visiting aunts can be a nuisance – (Hin) aapko mujhe mithaai khilaanii padegii (ambiguous in Marathi and Bengali too; not in Dravidian languages)
Pragmatics (fraught with ambiguity) • Very hard problem • Model user intention – Tourist (in a hurry, checking out of the hotel, motioning to the service boy): Boy, go upstairs and see if my sandals are under the divan. Do not be late. I just have 15 minutes to catch the train. – Boy (running upstairs and coming back panting): yes sir, they are there. • World knowledge – WHY INDIA NEEDS A SECOND OCTOBER (ToI, 2/10/07)
Discourse (again fraught with ambiguity)
Processing of a sequence of sentences. Mother to John: "John, go to school. It is open today. Should you bunk? Father will be very angry."
• Ambiguity of open; bunk what?
• Why will the father be angry? Requires a complex chain of reasoning and application of world knowledge
• Ambiguity of father: father as parent or father as headmaster
Ambiguity of dialogue situations
John was returning from school dejected – today was the math test.
He couldn't control the class.
Teacher shouldn't have made him responsible.
After all, he is just a janitor.
Textual Humour (1/2) 1. Teacher (angrily): did you miss the class yesterday? Student: not much 2. A man coming back to his parked car sees the sticker "Parking fine". He goes and thanks the policeman for appreciating his parking skill. 3. Son: mother, I broke the neighbour's lamp shade. Mother: then we have to give them a new one. Son: no need, aunty said the lamp shade is irreplaceable. 4. Ram: I got a Jaguar car for my unemployed youngest son. Shyam: That's a great exchange! 5. Shane Warne should bowl maiden overs, instead of bowling maidens over
Textual Humour (2/2)
• It is not hard to meet the expenses nowadays; you find them everywhere
• Teacher: What do you think is the capital of Ethiopia?
  Student: What do you think?
  Teacher: I do not think, I know.
  Student: I do not think I know.
Why is MT difficult?
Language divergence
Why is MT difficult: Language Divergence • One of the main complexities of MT: Language Divergence • Languages have different ways of expressing meaning – Lexico-Semantic Divergence – Structural Divergence
Our work on English-IL Language Divergence with illustrations from Hindi (Dave, Parikh, Bhattacharyya, Journal of MT, 2002)
Languages differ in expressing thoughts: Agglutination • Finnish: “istahtaisinkohan” • English: "I wonder if I should sit down for a while" Analysis: • ist + "sit", verb stem • ahta + verb derivation morpheme, "to do something for a while" • isi + conditional affix • n+ 1st person singular suffix • ko + question particle • han a particle for things like reminder (with declaratives) or "softening" (with questions and imperatives)
Language Divergence Theory: Lexico-Semantic Divergences (few examples) • Conflational divergence – F: vomir; E: to be sick – E: stab; H: chure se maaranaa (knife-with hit) – S: Utrymningsplan; E: escape plan • Categorial divergence – Change is in POS category: – The play is on_PREP – Khel chal_rahaa_haai_VM
Language Divergence Theory: Structural Divergences
• SVO → SOV
  – E: Peter plays basketball
  – H: piitar basketball kheltaa haai
• Head swapping divergence – E: Prime Minister of India – H: bhaarat ke pradhaan mantrii (India-of Prime Minister)
Language Divergence Theory: Syntactic Divergences (few examples) • Constituent Order divergence – E: Singh, the PM of India, will address the nation today; H: bhaarat ke pradhaan mantrii, singh, … (India-of PM, Singh…) • Adjunction Divergence – E: She will visit here in the summer; H: vah yahaa garmii meM aayegii (she here summer-in will come) • Preposition-Stranding divergence – E: Who do you want to go with?; H: kisake saath aap jaanaa chaahate ho? (who with…)
Vauquois Triangle
Kinds of MT systems (point of entry from source to the target text):
(Diagram: levels of analysis rise from raw text to deep understanding, with a translation path across at each level.)
• Levels, bottom to top: graphemic level (text), tagged text, morpho-syntactic level, syntagmatic level (C-structures: constituent), syntactico-functional level (F-structures: functional), logico-semantic level (SPA-structures: semantic & predicate-argument, multilevel description), interlingual level (semantico-linguistic interlingua, ontological interlingua; deep understanding level)
• Translation paths: direct translation (text), semi-direct translation (tagged text), syntactic transfer (surface), syntactic transfer (deep), multilevel transfer, semantic transfer, conceptual transfer; mixing levels is possible
Illustration of transfer: SVO → SOV
Source parse (SVO): [S [NP [N John]] [VP [V eats] [NP [N bread]]]]
After transfer (SOV): [S [NP [N John]] [VP [NP [N bread]] [V eats]]]
Universality hypothesis Universality hypothesis: At the level of “deep meaning”, all texts are the “same”, whatever the language.
Understanding the Analysis-Transfer-Generation over Vauquois triangle (1/4)
H1.1: सरकार_ने चुनावो_के_बाद मुंबई_में करों_के_माध्यम_से अपने राजस्व_को बढ़ाया |
T1.1: sarkaar ne chunaawo ke baad Mumbai me karoM ke maadhyam se apne raajaswa ko badhaayaa
G1.1: Government_(ergative) elections_after Mumbai_in taxes_through its revenue_(accusative) increased
E1.1: The Government increased its revenue after the elections through taxes in Mumbai
Understanding the Analysis-Transfer-Generation over Vauquois triangle (2/4)

Entity  | English        | Hindi
--------|----------------|----------------------------
Subject | The Government | सरकार (sarkaar)
Verb    | increased      | बढ़ाया (badhaayaa)
Object  | its revenue    | अपने राजस्व (apne raajaswa)
Understanding the Analysis-Transfer-Generation over Vauquois triangle (3/4)

Adjunct      | English                 | Hindi
-------------|-------------------------|----------------------------------------------
Instrumental | through taxes in Mumbai | मुंबई_में करों_के_माध्यम_से (mumbai me karoM ke maadhyam se)
Temporal     | after the elections     | चुनावो_के_बाद (chunaawo ke baad)
Understanding the Analysis-Transfer-Generation over Vauquois triangle (4/4)
Adjunct positions in English: P0 [The Government] P1 [increased] P2 [its revenue] P3
E1.2: After the elections, the Government increased its revenue through taxes in Mumbai
E1.3: The Government increased its revenue through taxes in Mumbai after the elections
More flexibility in Hindi generation
Adjunct positions in Hindi: P0 [sarkaar_ne (the govt)] P1 [badhaayaa (increased)] P2
H1.2: चुनावो_के_बाद सरकार_ने मुंबई_में करों_के_माध्यम_से अपने राजस्व_को बढ़ाया |
T1.2: elections_after government_(erg) Mumbai_in taxes_through its revenue increased.
H1.3: चुनावो_के_बाद मुंबई_में करों_के_माध्यम_से सरकार_ने अपने राजस्व_को बढ़ाया |
T1.3: elections_after Mumbai_in taxes_through government_(erg) its revenue increased.
H1.4: चुनावो_के_बाद मुंबई_में करों_के_माध्यम_से अपने राजस्व_को सरकार_ने बढ़ाया |
T1.4: elections_after Mumbai_in taxes_through its revenue government_(erg) increased.
H1.5: मुंबई_में करों_के_माध्यम_से चुनावो_के_बाद सरकार_ने अपने राजस्व_को बढ़ाया |
T1.5: Mumbai_in taxes_through elections_after government_(erg) its revenue increased.
Dependency tree of the Hindi sentence
H1.1: सरकार_ने चुनावो_के_बाद मुंबई_में करों_के_माध्यम_से अपने राजस्व_को बढ़ाया
Transfer over dependency tree
Descending transfer • नृ पायते सं हासनासीनो वानरः • Behaves-like-king sitting-on-throne monkey • A monkey sitting on the throne (of a king) behaves like a king
Ascending transfer: FinnishEnglish • istahtaisinkohan "I wonder if I should sit down for a while" • ist + "sit", verb stem • ahta + verb derivation morpheme, "to do something for a while" • isi + conditional affix • n+ 1st person singular suffix • ko + question particle • han a particle for things like reminder (with declaratives) or "softening" (with questions and imperatives)
Interlingua Based MT
(Vauquois triangle diagram repeated; entry at the interlingual level.)
MT: EnConversion + Deconversion
(Diagram: Hindi and English texts are analysed into the interlingua (UNL); generation then produces Chinese, French, etc.)
Challenges of interlingua generation
Mother of NLP problems: extract meaning from a sentence! Almost all NLP problems are sub-problems:
• Named Entity Recognition
• POS tagging
• Chunking
• Parsing
• Word Sense Disambiguation
• Multiword identification
• and the list goes on...
Semantic Graph: John eats rice with a spoon
(Diagram: semantic graph built from universal words, semantic relations and attributes.)
Repository of 42 semantic relations and 84 attribute labels
System Architecture
(Diagram: pipeline. NER, Stanford Dependency Parser, Clause Marker, XLE Parser, WSD and Feature Generation feed a Simplifier; the resulting simple sentences pass through parallel Simple Sentence Enconverters, which perform relation generation and attribute generation; a Merger assembles the final output.)
Results: relation generation (chart)
HinD Architecture: Deconversion = Transfer + Generation
Manual Evaluation Guidelines
Fluency of the given translation:
(4) Perfect: good grammar
(3) Fair: easy to understand but flawed grammar
(2) Acceptable: broken, understandable with effort
(1) Nonsense: incomprehensible
Adequacy: how much meaning of the reference sentence is conveyed in the translation:
(4) All: no loss of meaning
(3) Most: most of the meaning is conveyed
(2) Some: some of the meaning is conveyed
(1) None: hardly any meaning is conveyed
Results

         | Geometric Avg | Arithmetic Avg | Std Dev | Pearson Cor. (BLEU) | Pearson Cor. (Fluency)
---------|---------------|----------------|---------|---------------------|-----------------------
BLEU     | 0.34          | 0.41           | 0.25    | 1.00                | 0.59
Fluency  | 2.54          | 2.71           | 0.89    | 0.59                | 1.00
Adequacy | 2.84          | 3.00           | 0.89    | 0.50                | 0.68

(Bar chart: fluency vs adequacy, number of sentences at each fluency score 1-4, broken down by adequacy score 1-4.)
• Good correlation between Fluency and BLEU
• Strong correlation between Fluency and Adequacy
• Can do large-scale evaluation using Fluency alone
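The correlation figures above are Pearson coefficients. A minimal sketch over hypothetical per-sentence scores (the actual evaluation data is not reproduced here):

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-sentence scores, for illustration only.
fluency  = [4, 3, 2, 4, 1, 3]
adequacy = [4, 3, 3, 4, 1, 2]
print(round(pearson(fluency, adequacy), 2))
```

A high coefficient between the two human measures is what licenses the shortcut of evaluating with fluency alone.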
Summary: interlingua based MT
• English to Hindi
• Rule governed
• High level of linguistic expertise needed
• Takes a long time to build (since 1996)
• But produces great insight, resources and tools
Transfer Based MT: Marathi-Hindi
(Vauquois triangle diagram repeated; see earlier.)
Indian Language to Indian Language Machine Translation (ILILMT) • Bidirectional Machine Translation System • Developed for nine Indian language pairs • Approach: – Transfer based – Modules developed using both rule based and statistical approach
Architecture of ILILMT System
(Diagram: Source Text → Analysis: Morphological Analyzer, POS Tagger, Chunker, Vibhakti Computation, Named Entity Recognizer, Word Sense Disambiguation → Transfer: Agreement Feature Transfer, Lexical Transfer → Generation: Interchunk, Intrachunk, Word Generator → Target Text.)
M-H MT system: Evaluation
– Subjective evaluation based on machine translation quality
– Accuracy calculated based on scores given by linguists
Score 5: correct translation
Score 4: understandable with minor errors
Score 3: understandable with major errors
Score 2: not understandable
Score 1: nonsense translation
Accuracy is computed from S5, S4, S3 and N, where S5 = number of score-5 sentences, S4 = number of score-4 sentences, S3 = number of score-3 sentences, N = total number of sentences.
Evaluation of Marathi to Hindi MT System
• Module-wise evaluation
  – Evaluated on 500 web sentences
(Bar chart: module-wise precision and recall for Morph Analyzer, POS Tagger, Chunker, Vibhakti Compute, WSD, Lexical Transfer and Word Generator.)
Evaluation of Marathi to Hindi MT System (contd.)
• Subjective evaluation of translation quality
  – Evaluated on 500 web sentences
  – Accuracy calculated from the scores assigned for translation quality
  – Accuracy: 65.32%
• Result analysis:
  – Morph analyzer, POS tagger and chunker give more than 90% precision, but the transfer, WSD and generator modules are below 80%, which degrades MT quality.
  – Morph disambiguation, parsing, transfer grammar and function-word disambiguation modules are also required to improve accuracy.
Important challenge of M-H translation: morphology processing of kridanta
Ganesh Bhosale, Subodh Kembhavi, Archana Amberkar, Supriya Mhatre, Lata Popale and Pushpak Bhattacharyya, Processing of Participle (Krudanta) in Marathi, International Conference on Natural Language Processing (ICON 2011), Chennai, December 2011.
Kridantas can be in multiple POS categories

Nouns:
Verb                     | Noun
वाच {vaach} {read}        | वाचणे {vaachaNe} {reading}
उतर {utara} {climb down}  | उतरण {utaraN} {downward slope}

Adjectives:
Verb              | Adjective
चाव {chav} {bite} | चावणारा {chaavaNaara} {one who bites}
खा {khaa} {eat}   | खाल्लेले {khallele} {something that is eaten}
Kridantas derived from verbs (contd.)

Adverbs:
Verb           | Adverb
पळ {paL} {run} | पळताना {paLataanaa} {while running}
बस {bas} {sit} | बसून {basun} {after sitting}
Kridanta Types

Kridanta Type          | Example                                                                                                        | Aspect
-----------------------|----------------------------------------------------------------------------------------------------------------|-----------
णे {Ne-kridanta}       | vaachNyaasaaThee pustak de. (for-reading book give) "Give me a book for reading."                              | Perfective
ला {laa-kridanta}      | lekh vaachalyaavar saaMgen. (article after-reading will-tell) "I will tell you that after reading the article." | Perfective
ताना {taanaa-kridanta} | pustak vaachtaanaa te lakShaat aale. (book while-reading it in-mind came) "I noticed it while reading the book." | Durative
लेला {lelaa-kridanta}  | kaal vaachlele pustak de. (yesterday read book give) "Give me the book that (I/you) read yesterday."            | Perfective
ऊन {Un-kridanta}       | pustak vaachun parat kar. (book after-reading back do) "Return the book after reading it."                      | Completive
णारा {Naaraa-kridanta} | pustake vaachNaaRyaalaa dnyaan miLte. (books to-the-one-who-reads knowledge gets) "The one who reads books gets knowledge." | Stative
वे {ve-kridanta}       | he pustak pratyekaane vaachaave. (this book everyone should-read) "Everyone should read this book."             | Stative
ता {taa-kridanta}      | to pustak vaachtaa vaachtaa zopee gelaa. (he book while-reading to-sleep went) "He fell asleep while reading a book." | Inceptive
Participial Suffixes in Other Agglutinative Languages
Kannada: muridiruwaa kombe jennu esee (broken branch throw) "Throw away the broken branch." Similar to the lelaa form frequently used in Marathi.
Participial Suffixes in Other Agglutinative Languages (cont.)
Telugu: ame padutunnappudoo nenoo panichesanoo (she singing I worked) "I worked while she was singing." Similar to the taanaa form frequently used in Marathi.
Participial Suffixes in Other Agglutinative Languages (cont.)
Turkish: hazirlanmis plan (prepare-past plan) "The plan which has been prepared." Equivalent Marathi form: lelaa.
Morphological Processing of Kridanta forms (cont.)
Fig. Morphotactics FSM for Kridanta Processing
Accuracy of Kridanta Processing: Direct Evaluation
(Bar chart: precision and recall, roughly in the 0.86-0.98 range, for the Ne, laa, Naaraa, lelaa, taanaa, taa, Un and ve kridanta types.)
Summary of M-H transfer based MT
• Marathi and Hindi are close cousins
• Relatively easier problem to solve
• Will interlingua be better?
• Web sentences being used to test the performance
• Rule governed
• Needs high level of linguistic expertise
• Will be an important contribution to IL MT
Statistical Machine Translation
(Many masters and PhD students: Ananthakrishnan, Avishek, Hansraj, Mitesh, Anoop, Abhishek, Rajen; publication in ACL 2009)
(Vauquois triangle diagram repeated; see earlier.)
Czech-English data
• [nesu] "I carry"
• [ponese] "He will carry"
• [nese] "He carries"
• [nesou] "They carry"
• [yedu] "I drive"
• [plavou] "They swim"

To translate:
• I will carry.
• They drive.
• He swims.
• They will drive.
Hindi-English data
• [DhotA huM] "I carry"
• [DhoegA] "He will carry"
• [DhotA hAi] "He carries"
• [Dhote hAi] "They carry"
• [chalAtA huM] "I drive"
• [tErte hEM] "They swim"
Bangla-English data
• [bai] "I carry"
• [baibe] "He will carry"
• [bay] "He carries"
• [bay] "They carry"
• [chAlAi] "I drive"
• [sAMtrAy] "They swim"

To translate (repeated):
• I will carry.
• They drive.
• He swims.
• They will drive.
Foundation
• Data driven approach
• Goal: find the English sentence e, given foreign language sentence f, for which p(e|f) is maximum
• Translations are generated on the basis of a statistical model
• Parameters are estimated from bilingual parallel corpora
SMT: Language Model
• To detect good English sentences
• The probability of an English sentence w1 w2 ... wn can be written as
  Pr(w1 w2 ... wn) = Pr(w1) * Pr(w2|w1) * ... * Pr(wn|w1 w2 ... wn-1)
• Here Pr(wn | w1 w2 ... wn-1) is the probability that word wn follows the word string w1 w2 ... wn-1 (the n-gram model probability)
• Trigram model approximation: Pr(wn | w1 ... wn-1) ≈ Pr(wn | wn-2 wn-1)
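The chain-rule decomposition above can be sketched with maximum-likelihood trigram estimates. The toy corpus is invented for illustration, and real systems add smoothing for unseen n-grams:

```python
from collections import Counter

# Minimal trigram LM sketch: MLE counts from a toy corpus (no smoothing).
corpus = ["<s> <s> the government increased its revenue </s>",
          "<s> <s> the government increased taxes </s>"]

tri, bi = Counter(), Counter()
for sent in corpus:
    w = sent.split()
    for i in range(2, len(w)):
        tri[(w[i - 2], w[i - 1], w[i])] += 1   # count of w1 w2 w3
        bi[(w[i - 2], w[i - 1])] += 1          # count of the history w1 w2

def p(w3, w1, w2):
    """Pr(w3 | w1 w2) by maximum likelihood."""
    return tri[(w1, w2, w3)] / bi[(w1, w2)] if bi[(w1, w2)] else 0.0

print(p("increased", "the", "government"))  # 1.0: always follows "the government"
```

Multiplying such conditional probabilities along a sentence gives the language-model score used to rank candidate translations for fluency.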
SMT: Translation Model
• P(f|e): probability of f given the hypothesised English translation e
• How to assign values to P(f|e)?
  – At the sentence level: sentences are infinite, so it is not possible to estimate pairs (e, f) for all sentences
  – So work at the word level: introduce a hidden variable a that represents alignments between the individual words in the sentence pair
Alignment
• If the string e = e1^l = e1 e2 ... el has l words, and the string f = f1^m = f1 f2 ... fm has m words,
• then the alignment a can be represented by a series a1^m = a1 a2 ... am of m values, each between 0 and l, such that:
  – if the word in position j of the f-string is connected to the word in position i of the e-string, then aj = i
  – if it is not connected to any English word, then aj = 0
Example of alignment
English: Ram went to school
Hindi: raama paathashaalaa gayaa
Alignment: Ram → raama, went → gayaa, school → paathashaalaa; to is unaligned
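The alignment series defined above can be stored directly as a list of indices. This sketch encodes the Ram/raama example:

```python
# IBM-style alignment series: for each position j in the Hindi string f,
# a[j] holds the 1-based position of the aligned English word in e,
# or 0 when the Hindi word has no English counterpart.
e = ["Ram", "went", "to", "school"]       # e_1 .. e_l  (l = 4)
f = ["raama", "paathashaalaa", "gayaa"]   # f_1 .. f_m  (m = 3)
a = [1, 4, 2]   # raama->Ram, paathashaalaa->school, gayaa->went

pairs = [(fj, e[aj - 1] if aj else "NULL") for fj, aj in zip(f, a)]
print(pairs)   # [('raama', 'Ram'), ('paathashaalaa', 'school'), ('gayaa', 'went')]
```

Note that the series indexes the f-string, so English words like "to" that generate no Hindi word simply never appear in a.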
Translation Model: Exact expression

Pr(f|e) = \sum_a Pr(m|e) \prod_{j=1}^{m} Pr(a_j \mid a_1^{j-1}, f_1^{j-1}, m, e) \, Pr(f_j \mid a_1^{j}, f_1^{j-1}, m, e)

• Pr(m|e): choose the length of the foreign language string given e
• Pr(a_j | ...): choose the alignment given e and m
• Pr(f_j | ...): choose the identity of the foreign word given e, m, a
• Five models (Model-1 through Model-5) for estimating the parameters in the expression [2]
Proof of Translation Model: Exact expression

Pr(f \mid e) = \sum_a Pr(f, a \mid e)                     ; marginalization over a
Pr(f, a \mid e) = \sum_m Pr(f, a, m \mid e)               ; marginalization over m
Pr(f, a, m \mid e) = Pr(m \mid e) \, Pr(f, a \mid m, e)
                   = Pr(m \mid e) \prod_{j=1}^{m} Pr(f_j, a_j \mid a_1^{j-1}, f_1^{j-1}, m, e)
                   = Pr(m \mid e) \prod_{j=1}^{m} Pr(a_j \mid a_1^{j-1}, f_1^{j-1}, m, e) \, Pr(f_j \mid a_1^{j}, f_1^{j-1}, m, e)

Since m is fixed for a particular f, the sum over m collapses:

Pr(f, a \mid e) = Pr(m \mid e) \prod_{j=1}^{m} Pr(a_j \mid a_1^{j-1}, f_1^{j-1}, m, e) \, Pr(f_j \mid a_1^{j}, f_1^{j-1}, m, e)
Alignment: fundamental and ubiquitous
• Spell checking
• Translation
• Transliteration
• Speech to text
• Text to speech
EM for word alignment from sentence alignment: example

English (1): three rabbits        (a b)
French  (1): trois lapins         (w x)
English (2): rabbits of Grenoble  (b c d)
French  (2): lapins de Grenoble   (x y z)

Initial probabilities (each cell denotes t(a|w), t(a|x), etc.; uniform):

  | a   | b   | c   | d
w | 1/4 | 1/4 | 1/4 | 1/4
x | 1/4 | 1/4 | 1/4 | 1/4
y | 1/4 | 1/4 | 1/4 | 1/4
z | 1/4 | 1/4 | 1/4 | 1/4
The counts in IBM Model 1
Works by maximizing P(f|e) over the entire corpus. For IBM Model 1 we get the following relationship:

c(w_f \mid w_e; f, e) = \frac{t(w_f \mid w_e)}{t(w_f \mid w_{e_0}) + \dots + t(w_f \mid w_{e_l})} \times \#(w_f \text{ in } f) \times \#(w_e \text{ in } e)

where
• c(w_f | w_e; f, e) is the fractional count of the alignment of w_f with w_e in f and e
• t(w_f | w_e) is the probability of w_f being the translation of w_e
• #(w_f in f) is the count of w_f in f, and #(w_e in e) is the count of w_e in e
Example of expected count

c[a|w; (a b)(w x)] = [ t(a|w) / ( t(a|w) + t(a|x) ) ] × #(a in 'a b') × #(w in 'w x')
                   = [ (1/4) / (1/4 + 1/4) ] × 1 × 1 = 1/2
"Counts" (fractional counts from each sentence pair):

Pair (a b) / (w x):
  | a   | b   | c | d
w | 1/2 | 1/2 | 0 | 0
x | 1/2 | 1/2 | 0 | 0
y | 0   | 0   | 0 | 0
z | 0   | 0   | 0 | 0

Pair (b c d) / (x y z):
  | a | b   | c   | d
w | 0 | 0   | 0   | 0
x | 0 | 1/3 | 1/3 | 1/3
y | 0 | 1/3 | 1/3 | 1/3
z | 0 | 1/3 | 1/3 | 1/3
Revised probability: example

t_revised(a|w) = (1/2) / [ (1/2 + 1/2 + 0 + 0) from (a b)(w x) + (0 + 0 + 0 + 0) from (b c d)(x y z) ]
Revised probabilities table

  | a   | b    | c   | d
w | 1/2 | 1/4  | 0   | 0
x | 1/2 | 5/12 | 1/3 | 1/3
y | 0   | 1/6  | 1/3 | 1/3
z | 0   | 1/6  | 1/3 | 1/3
"Revised counts":

Pair (a b) / (w x):
  | a   | b   | c | d
w | 1/2 | 3/8 | 0 | 0
x | 1/2 | 5/8 | 0 | 0
y | 0   | 0   | 0 | 0
z | 0   | 0   | 0 | 0

Pair (b c d) / (x y z):
  | a | b   | c   | d
w | 0 | 0   | 0   | 0
x | 0 | 5/9 | 1/3 | 1/3
y | 0 | 2/9 | 1/3 | 1/3
z | 0 | 2/9 | 1/3 | 1/3
Re-revised probabilities table

  | a   | b      | c   | d
w | 1/2 | 3/16   | 0   | 0
x | 1/2 | 85/144 | 1/3 | 1/3
y | 0   | 1/9    | 1/3 | 1/3
z | 0   | 1/9    | 1/3 | 1/3

Continue until convergence; notice that the (b, x) binding gets progressively stronger; b = rabbits, x = lapins.
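The E/M iterations above can be reproduced mechanically. A minimal sketch on the toy corpus, using exact fractions; the count normalisation follows the tables above (for each English word in a pair, a unit of fractional count is distributed over that pair's French words):

```python
from fractions import Fraction
from collections import defaultdict

# Toy corpus from the slides: a = three, b = rabbits, c = of, d = Grenoble;
# w = trois, x = lapins, y = de, z = Grenoble.
corpus = [(["a", "b"], ["w", "x"]),
          (["b", "c", "d"], ["x", "y", "z"])]

eng = sorted({e for es, _ in corpus for e in es})
fre = sorted({f for _, fs in corpus for f in fs})

# Initialise t(f|e) uniformly: 1/4 in every cell, as in the first table.
t = {(f, e): Fraction(1, len(fre)) for f in fre for e in eng}

def em_iteration(t):
    # E-step: fractional counts, proportional to the current t(f|e).
    count = defaultdict(Fraction)
    for es, fs in corpus:
        for e in es:
            denom = sum(t[(f, e)] for f in fs)
            for f in fs:
                count[(f, e)] += t[(f, e)] / denom
    # M-step: renormalise the counts for each English word.
    new_t = {}
    for e in eng:
        total = sum(count[(f, e)] for f in fre)
        for f in fre:
            new_t[(f, e)] = count[(f, e)] / total if total else Fraction(0)
    return new_t

for _ in range(2):          # two iterations reproduce the "re-revised" table
    t = em_iteration(t)

print(t[("x", "b")], t[("w", "b")])   # 85/144 3/16
```

After two iterations t(x|b) = 85/144 and t(w|b) = 3/16, matching the re-revised table, and further iterations keep strengthening the rabbits/lapins binding.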
Derivation of EM based Alignment Expressions
V_E: vocabulary of language L1 (say English); V_F: vocabulary of language L2 (say Hindi)

E1: what is in a name?
F1: नाम में क्या है? (naam meM kya hai?) Gloss: name in what is?

E2: That which we call rose, by any other name will smell as sweet.
F2: जिसे हम गुलाब कहते हैं, और भी किसी नाम से उसकी खुशबू समान मीठा होगी (jise hum gulab kahte hai, aur bhi kisi naam se uski khushbu samaan mitha hogii) Gloss: that which we rose call, any other name by its smell as sweet

Vocabulary mapping:
V_E: what, is, in, a, name, that, which, we, call, rose, by, any, other, will, smell, as, sweet
V_F: naam, meM, kya, hai, jise, hum, gulab, kahte, aur, bhi, kisi, se, uski, khushbu, samaan, mitha, hogii
Key Notations
(Thanks to Sachin Pawar for helping with the maths formulae processing)
• Hidden variables and parameters
• Likelihoods: data likelihood L(D; Θ), data log-likelihood LL(D; Θ), expected data log-likelihood E(LL(D; Θ))
• Constraint and Lagrangian
• Differentiating w.r.t. P_ij
• Final E and M steps
18-Dec-2013, SMT Tutorial, ICON-2013
Remember: SMT
What is a good translation?
• Faithful to source
• Fluent in target

ê = argmax_e P(e) · P(f|e)

where P(e) models fluency and P(f|e) models faithfulness.
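The argmax above can be sketched as scoring each candidate by the product of a language-model (fluency) term and a translation-model (faithfulness) term. The candidates and scores below are made up purely for illustration:

```python
# Noisy-channel decoding sketch: pick e^ = argmax_e P(e) * P(f|e).
# Candidate translations and probabilities are hypothetical.
candidates = {
    "He carries":      {"lm": 0.6, "tm": 0.3},    # fluent and faithful
    "Him carry":       {"lm": 0.1, "tm": 0.4},    # faithful but disfluent
    "The sky is blue": {"lm": 0.7, "tm": 0.001},  # fluent but unfaithful
}

best = max(candidates, key=lambda e: candidates[e]["lm"] * candidates[e]["tm"])
print(best)  # He carries
```

The product rewards only candidates that score well on both terms, which is exactly why the decomposition separates fluency from faithfulness.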
Case markers and morphology are crucial in E-H MT
• Order-of-magnitude facelift in fluency and fidelity
• Determined by the combination of suffixes and semantic relations on the English side
• Augment the aligned corpus of the two languages with the correspondence of English suffixes and semantic relations with Hindi suffixes and case markers
Semantic relations + suffixes → case markers + inflections
I
ate
mangoes
I {