Machine Translation: a Perspective Pushpak Bhattacharyya, CSE Dept., IIT Bombay South Asian University, Delhi, 17th Dec, 2013
Introduction • Machine Translation (MT) is a technique to translate texts from one natural language to another natural language using a machine • Translated text should have two desired properties: – Adequacy: Meaning should be conveyed correctly – Fluency: Text should be fluent in the target language • Translation between distant languages is a difficult task – Handling Language Divergence is a major challenge
Perspectivising NLP: Areas of AI and their inter-dependencies
(Diagram: Search, Logic, Machine Learning, NLP, Vision, Knowledge Representation, Planning, Robotics and Expert Systems, shown with their inter-dependencies.)
MT is part of NLP: a useful view — the NLP Trinity
(Diagram: three axes. Problem: morphology, POS tagging, chunking, parsing, semantics, discourse and coreference, with increasing complexity of processing. Language: English, Hindi, Marathi, French. Algorithm: HMM, MEMM, CRF. Also: language typology.)
Why is MT difficult
Ambiguity
Stages of processing
• Phonetics and phonology
• Morphology
• Lexical analysis
• Syntactic analysis
• Semantic analysis
• Pragmatics
• Discourse
Morphology Ambiguity
• Word formation rules from root words
• Verb morphology: GNPTAM (gender, number, person, tense, aspect, modality)
  – jaayeMge: we/they will_go
• Noun morphology: number and direct/oblique information
  – ladke: boys / boy+case-marker
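The GNPTAM analysis above can be sketched as longest-suffix-first lookup in a suffix table. The suffix table and feature values below are illustrative toys, not the rules of any actual Hindi analyser:

```python
# Toy Hindi verb morphology sketch (hypothetical suffix table, not a real analyser).
SUFFIXES = {
    "yeMge": {"person": 3, "number": "plural", "tense": "future"},
    "egaa":  {"person": 3, "number": "singular", "tense": "future"},
}

def analyse(word):
    """Strip the longest known suffix and return (root, GNPTAM features)."""
    for suf in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suf):
            return word[: -len(suf)], SUFFIXES[suf]
    return word, {}

root, feats = analyse("jaayeMge")   # "we/they will go"
print(root, feats["tense"])          # jaa future
```

Real analysers also handle stem changes and ambiguous suffixes (a single surface form can analyse in several ways, which is exactly the ambiguity discussed above).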
Lexical Ambiguity First step: part of Speech Disambiguation • Dog as a noun (animal) • Dog as a verb (to pursue)
Sense Disambiguation • Dog (as animal) • Dog (as a very detestable person)
Needs word relationships in a context
• The chair emphasised the need for adult education
• Very common in day-to-day communication:
  – Satellite channel ad: Watch what you want, when you want (two senses of watch)
  – e.g., ground-breaking ceremony / ground-breaking research
Ambiguity of Multiwords
• The grandfather kicked the bucket after suffering from cancer.
• This job is a piece of cake.
• Put the sweater on.
• He is the dark horse of the match.
Google translations of the above sentences (idiomatic meaning lost):
• दादा कैंसर से पीड़ित होने के बाद बाल्टी लात मारी। (back-translation: grandfather, after suffering from cancer, kicked a bucket)
• इस काम के केक का एक टुकड़ा है। (this job has a piece of cake)
• स्वेटर पर रखो। (place [it] on the sweater)
• वह मैच के अंधेरे घोड़ा है। (he is the dark-coloured horse of the match)
Ambiguity of Named Entities
• Bengali: চঞ্চল সরকার বাড়িতে আছে — machine output: "Government is restless at home." (*) Correct: "Chanchal Sarkar is at home."
• Hindi: दैनिक दबंग दुनिया — machine output: "everyday bold world"; actually the name of a Hindi newspaper in Indore
• High degree of overlap between NEs and MWEs
• Treat them differently: transliterate, do not translate
Structural Ambiguity
• Scope
  1. The old men and women were taken to safe locations: (old (men and women)) vs. ((old men) and women)
  2. No smoking areas will allow hookas inside
• Prepositional Phrase Attachment
  – I saw the boy with a telescope (who has the telescope?)
  – I saw the mountain with a telescope (world knowledge: a mountain cannot be an instrument of seeing)
  – I saw the boy with the pony-tail (world knowledge: a pony-tail cannot be an instrument of seeing)
• Very common: newspaper headline "20 years later, BMC pays father 20 lakhs for causing son's death"
Semantic Role ambiguity • John gave a book to Mary • Give action: Agent: John, Object: Book, Recipient: Mary • Challenge: ambiguity in semantic role labeling – (Eng) Visiting aunts can be a nuisance – (Hin) aapko mujhe mithaai khilaanii padegii (ambiguous in Marathi and Bengali too; not in Dravidian languages)
Pragmatics (fraught with ambiguity) • Very hard problem • Model user intention – Tourist (in a hurry, checking out of the hotel, motioning to the service boy): Boy, go upstairs and see if my sandals are under the divan. Do not be late. I just have 15 minutes to catch the train. – Boy (running upstairs and coming back panting): yes sir, they are there. • World knowledge – WHY INDIA NEEDS A SECOND OCTOBER (ToI, 2/10/07)
Discourse (again fraught with ambiguity)
Processing of a sequence of sentences. Mother to John: "John, go to school. It is open today. Should you bunk? Father will be very angry."
• Ambiguity of open; bunk what?
• Why will the father be angry? Requires a complex chain of reasoning and application of world knowledge
• Ambiguity of father: father as parent or father as headmaster
Ambiguity of dialogue situations
John was returning from school dejected – today was the math test.
He couldn't control the class.
Teacher shouldn't have made him responsible.
After all, he is just a janitor.
Textual Humour (1/2) 1. Teacher (angrily): did you miss the class yesterday? Student: not much 2. A man coming back to his parked car sees the sticker "Parking fine". He goes and thanks the policeman for appreciating his parking skill. 3. Son: mother, I broke the neighbour's lamp shade. Mother: then we have to give them a new one. Son: no need, aunty said the lamp shade is irreplaceable. 4. Ram: I got a Jaguar car for my unemployed youngest son. Shyam: That's a great exchange! 5. Shane Warne should bowl maiden overs, instead of bowling maidens over
Textual Humour (2/2)
• It is not hard to meet the expenses nowadays; you find them everywhere
• Teacher: What do you think is the capital of Ethiopia?
  Student: What do you think?
  Teacher: I do not think, I know.
  Student: I do not think I know.
Why is MT difficult?
Language divergence
Why is MT difficult: Language Divergence • One of the main complexities of MT: Language Divergence • Languages have different ways of expressing meaning – Lexico-Semantic Divergence – Structural Divergence
Our work on English-IL Language Divergence with illustrations from Hindi (Dave, Parikh, Bhattacharyya, Journal of MT, 2002)
Languages differ in expressing thoughts: Agglutination • Finnish: “istahtaisinkohan” • English: "I wonder if I should sit down for a while" Analysis: • ist + "sit", verb stem • ahta + verb derivation morpheme, "to do something for a while" • isi + conditional affix • n+ 1st person singular suffix • ko + question particle • han a particle for things like reminder (with declaratives) or "softening" (with questions and imperatives)
Language Divergence Theory: Lexico-Semantic Divergences (few examples) • Conflational divergence – F: vomir; E: to be sick – E: stab; H: chure se maaranaa (knife-with hit) – S: Utrymningsplan; E: escape plan • Categorial divergence – Change is in POS category: – The play is on_PREP – Khel chal_rahaa_haai_VM
Language Divergence Theory: Structural Divergences
• SVO → SOV
  – E: Peter plays basketball
  – H: piitar basketball kheltaa haai
• Head swapping divergence – E: Prime Minister of India – H: bhaarat ke pradhaan mantrii (India-of Prime Minister)
Language Divergence Theory: Syntactic Divergences (few examples) • Constituent Order divergence – E: Singh, the PM of India, will address the nation today; H: bhaarat ke pradhaan mantrii, singh, … (India-of PM, Singh…) • Adjunction Divergence – E: She will visit here in the summer; H: vah yahaa garmii meM aayegii (she here summer-in will come) • Preposition-Stranding divergence – E: Who do you want to go with?; H: kisake saath aap jaanaa chaahate ho? (who with…)
Vauquois Triangle
Kinds of MT systems (point of entry from source to the target text):
(Diagram: levels of analysis rise from raw text to deep understanding, with a translation path across at each level.)
• Levels, bottom to top: graphemic level (text), tagged text, morpho-syntactic level, syntagmatic level (C-structures: constituent), syntactico-functional level (F-structures: functional), logico-semantic level (SPA-structures: semantic & predicate-argument, multilevel description), interlingual level (semantico-linguistic interlingua, ontological interlingua; deep understanding level)
• Translation paths: direct translation (text), semi-direct translation (tagged text), syntactic transfer (surface), syntactic transfer (deep), multilevel transfer, semantic transfer, conceptual transfer; mixing levels is possible
Illustration of transfer: SVO → SOV
Source parse (SVO): [S [NP [N John]] [VP [V eats] [NP [N bread]]]]
After transfer (SOV): [S [NP [N John]] [VP [NP [N bread]] [V eats]]]
Universality hypothesis Universality hypothesis: At the level of “deep meaning”, all texts are the “same”, whatever the language.
Understanding the Analysis-Transfer-Generation over Vauquois triangle (1/4)
H1.1: सरकार_ने चुनावो_के_बाद मुंबई_में करों_के_माध्यम_से अपने राजस्व_को बढ़ाया |
T1.1: sarkaar ne chunaawo ke baad Mumbai me karoM ke maadhyam se apne raajaswa ko badhaayaa
G1.1: Government_(ergative) elections_after Mumbai_in taxes_through its revenue_(accusative) increased
E1.1: The Government increased its revenue after the elections through taxes in Mumbai
Understanding the Analysis-Transfer-Generation over Vauquois triangle (2/4)

Entity  | English        | Hindi
--------|----------------|----------------------------
Subject | The Government | सरकार (sarkaar)
Verb    | increased      | बढ़ाया (badhaayaa)
Object  | its revenue    | अपने राजस्व (apne raajaswa)
Understanding the Analysis-Transfer-Generation over Vauquois triangle (3/4)

Adjunct      | English                 | Hindi
-------------|-------------------------|----------------------------------------------
Instrumental | through taxes in Mumbai | मुंबई_में करों_के_माध्यम_से (mumbai me karoM ke maadhyam se)
Temporal     | after the elections     | चुनावो_के_बाद (chunaawo ke baad)
Understanding the Analysis-Transfer-Generation over Vauquois triangle (4/4)
Adjunct positions in English: P0 [The Government] P1 [increased] P2 [its revenue] P3
E1.2: After the elections, the Government increased its revenue through taxes in Mumbai
E1.3: The Government increased its revenue through taxes in Mumbai after the elections
More flexibility in Hindi generation
Adjunct positions in Hindi: P0 [sarkaar_ne (the govt)] P1 [badhaayaa (increased)] P2
H1.2: चुनावो_के_बाद सरकार_ने मुंबई_में करों_के_माध्यम_से अपने राजस्व_को बढ़ाया |
T1.2: elections_after government_(erg) Mumbai_in taxes_through its revenue increased.
H1.3: चुनावो_के_बाद मुंबई_में करों_के_माध्यम_से सरकार_ने अपने राजस्व_को बढ़ाया |
T1.3: elections_after Mumbai_in taxes_through government_(erg) its revenue increased.
H1.4: चुनावो_के_बाद मुंबई_में करों_के_माध्यम_से अपने राजस्व_को सरकार_ने बढ़ाया |
T1.4: elections_after Mumbai_in taxes_through its revenue government_(erg) increased.
H1.5: मुंबई_में करों_के_माध्यम_से चुनावो_के_बाद सरकार_ने अपने राजस्व_को बढ़ाया |
T1.5: Mumbai_in taxes_through elections_after government_(erg) its revenue increased.
Dependency tree of the Hindi sentence
H1.1: सरकार_ने चुनावो_के_बाद मुंबई_में करों_के_माध्यम_से अपने राजस्व_को बढ़ाया
Transfer over dependency tree
Descending transfer • नृ पायते सं हासनासीनो वानरः • Behaves-like-king sitting-on-throne monkey • A monkey sitting on the throne (of a king) behaves like a king
Ascending transfer: FinnishEnglish • istahtaisinkohan "I wonder if I should sit down for a while" • ist + "sit", verb stem • ahta + verb derivation morpheme, "to do something for a while" • isi + conditional affix • n+ 1st person singular suffix • ko + question particle • han a particle for things like reminder (with declaratives) or "softening" (with questions and imperatives)
Interlingua Based MT
(Vauquois triangle diagram repeated; entry at the interlingual level.)
MT: EnConversion + Deconversion
(Diagram: Hindi and English texts are analysed into the interlingua (UNL); generation then produces Chinese, French, etc.)
Challenges of interlingua generation
Mother of NLP problems: extract meaning from a sentence! Almost all NLP problems are sub-problems:
• Named Entity Recognition
• POS tagging
• Chunking
• Parsing
• Word Sense Disambiguation
• Multiword identification
• and the list goes on...
Semantic Graph: John eats rice with a spoon
(Diagram: semantic graph built from universal words, semantic relations and attributes.)
Repository of 42 semantic relations and 84 attribute labels
System Architecture
(Diagram: pipeline. NER, Stanford Dependency Parser, Clause Marker, XLE Parser, WSD and Feature Generation feed a Simplifier; the resulting simple sentences pass through parallel Simple Sentence Enconverters, which perform relation generation and attribute generation; a Merger assembles the final output.)
Results: relation generation (chart)
HinD Architecture: Deconversion = Transfer + Generation
Manual Evaluation Guidelines
Fluency of the given translation:
(4) Perfect: good grammar
(3) Fair: easy to understand but flawed grammar
(2) Acceptable: broken, understandable with effort
(1) Nonsense: incomprehensible
Adequacy: how much meaning of the reference sentence is conveyed in the translation:
(4) All: no loss of meaning
(3) Most: most of the meaning is conveyed
(2) Some: some of the meaning is conveyed
(1) None: hardly any meaning is conveyed
Results

         | Geometric Avg | Arithmetic Avg | Std Dev | Pearson Cor. (BLEU) | Pearson Cor. (Fluency)
---------|---------------|----------------|---------|---------------------|-----------------------
BLEU     | 0.34          | 0.41           | 0.25    | 1.00                | 0.59
Fluency  | 2.54          | 2.71           | 0.89    | 0.59                | 1.00
Adequacy | 2.84          | 3.00           | 0.89    | 0.50                | 0.68

(Bar chart: fluency vs adequacy, number of sentences at each fluency score 1-4, broken down by adequacy score 1-4.)
• Good correlation between Fluency and BLEU
• Strong correlation between Fluency and Adequacy
• Can do large-scale evaluation using Fluency alone
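The correlation figures above are Pearson coefficients. A minimal sketch over hypothetical per-sentence scores (the actual evaluation data is not reproduced here):

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-sentence scores, for illustration only.
fluency  = [4, 3, 2, 4, 1, 3]
adequacy = [4, 3, 3, 4, 1, 2]
print(round(pearson(fluency, adequacy), 2))
```

A high coefficient between the two human measures is what licenses the shortcut of evaluating with fluency alone.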
Summary: interlingua based MT
• English to Hindi
• Rule governed
• High level of linguistic expertise needed
• Takes a long time to build (since 1996)
• But produces great insight, resources and tools
Transfer Based MT: Marathi-Hindi
(Vauquois triangle diagram repeated; see earlier.)
Indian Language to Indian Language Machine Translation (ILILMT) • Bidirectional Machine Translation System • Developed for nine Indian language pairs • Approach: – Transfer based – Modules developed using both rule based and statistical approach
Architecture of ILILMT System
(Diagram: Source Text → Analysis: Morphological Analyzer, POS Tagger, Chunker, Vibhakti Computation, Named Entity Recognizer, Word Sense Disambiguation → Transfer: Agreement Feature Transfer, Lexical Transfer → Generation: Interchunk, Intrachunk, Word Generator → Target Text.)
M-H MT system: Evaluation
– Subjective evaluation based on machine translation quality
– Accuracy calculated based on scores given by linguists
Score 5: correct translation
Score 4: understandable with minor errors
Score 3: understandable with major errors
Score 2: not understandable
Score 1: nonsense translation
Accuracy is computed from S5, S4, S3 and N, where S5 = number of score-5 sentences, S4 = number of score-4 sentences, S3 = number of score-3 sentences, N = total number of sentences.
Evaluation of Marathi to Hindi MT System
• Module-wise evaluation
  – Evaluated on 500 web sentences
(Bar chart: module-wise precision and recall for Morph Analyzer, POS Tagger, Chunker, Vibhakti Compute, WSD, Lexical Transfer and Word Generator.)
Evaluation of Marathi to Hindi MT System (contd.)
• Subjective evaluation of translation quality
  – Evaluated on 500 web sentences
  – Accuracy calculated from the scores assigned for translation quality
  – Accuracy: 65.32%
• Result analysis:
  – Morph analyzer, POS tagger and chunker give more than 90% precision, but the transfer, WSD and generator modules are below 80%, which degrades MT quality.
  – Morph disambiguation, parsing, transfer grammar and function-word disambiguation modules are also required to improve accuracy.
Important challenge of M-H translation: morphology processing of kridanta
Ganesh Bhosale, Subodh Kembhavi, Archana Amberkar, Supriya Mhatre, Lata Popale and Pushpak Bhattacharyya, Processing of Participle (Krudanta) in Marathi, International Conference on Natural Language Processing (ICON 2011), Chennai, December 2011.
Kridantas can be in multiple POS categories

Nouns:
Verb                     | Noun
वाच {vaach} {read}        | वाचणे {vaachaNe} {reading}
उतर {utara} {climb down}  | उतरण {utaraN} {downward slope}

Adjectives:
Verb              | Adjective
चाव {chav} {bite} | चावणारा {chaavaNaara} {one who bites}
खा {khaa} {eat}   | खाल्लेले {khallele} {something that is eaten}
Kridantas derived from verbs (contd.)

Adverbs:
Verb           | Adverb
पळ {paL} {run} | पळताना {paLataanaa} {while running}
बस {bas} {sit} | बसून {basun} {after sitting}
Kridanta Types

Kridanta Type          | Example                                                                                                        | Aspect
-----------------------|----------------------------------------------------------------------------------------------------------------|-----------
णे {Ne-kridanta}       | vaachNyaasaaThee pustak de. (for-reading book give) "Give me a book for reading."                              | Perfective
ला {laa-kridanta}      | lekh vaachalyaavar saaMgen. (article after-reading will-tell) "I will tell you that after reading the article." | Perfective
ताना {taanaa-kridanta} | pustak vaachtaanaa te lakShaat aale. (book while-reading it in-mind came) "I noticed it while reading the book." | Durative
लेला {lelaa-kridanta}  | kaal vaachlele pustak de. (yesterday read book give) "Give me the book that (I/you) read yesterday."            | Perfective
ऊन {Un-kridanta}       | pustak vaachun parat kar. (book after-reading back do) "Return the book after reading it."                      | Completive
णारा {Naaraa-kridanta} | pustake vaachNaaRyaalaa dnyaan miLte. (books to-the-one-who-reads knowledge gets) "The one who reads books gets knowledge." | Stative
वे {ve-kridanta}       | he pustak pratyekaane vaachaave. (this book everyone should-read) "Everyone should read this book."             | Stative
ता {taa-kridanta}      | to pustak vaachtaa vaachtaa zopee gelaa. (he book while-reading to-sleep went) "He fell asleep while reading a book." | Inceptive
Participial Suffixes in Other Agglutinative Languages
Kannada: muridiruwaa kombe jennu esee (broken branch throw) "Throw away the broken branch." Similar to the lelaa form frequently used in Marathi.
Participial Suffixes in Other Agglutinative Languages (cont.)
Telugu: ame padutunnappudoo nenoo panichesanoo (she singing I worked) "I worked while she was singing." Similar to the taanaa form frequently used in Marathi.
Participial Suffixes in Other Agglutinative Languages (cont.)
Turkish: hazirlanmis plan (prepare-past plan) "The plan which has been prepared." Equivalent Marathi form: lelaa.
Morphological Processing of Kridanta forms (cont.)
Fig. Morphotactics FSM for Kridanta Processing
Accuracy of Kridanta Processing: Direct Evaluation
(Bar chart: precision and recall, roughly in the 0.86-0.98 range, for the Ne, laa, Naaraa, lelaa, taanaa, taa, Un and ve kridanta types.)
Summary of M-H transfer based MT
• Marathi and Hindi are close cousins
• Relatively easier problem to solve
• Will interlingua be better?
• Web sentences being used to test the performance
• Rule governed
• Needs high level of linguistic expertise
• Will be an important contribution to IL MT
Statistical Machine Translation
(Many masters and PhD students: Ananthakrishnan, Avishek, Hansraj, Mitesh, Anoop, Abhishek, Rajen; publication in ACL 2009)
(Vauquois triangle diagram repeated; see earlier.)
Czech-English data
• [nesu] "I carry"
• [ponese] "He will carry"
• [nese] "He carries"
• [nesou] "They carry"
• [yedu] "I drive"
• [plavou] "They swim"

To translate:
• I will carry.
• They drive.
• He swims.
• They will drive.
Hindi-English data
• [DhotA huM] "I carry"
• [DhoegA] "He will carry"
• [DhotA hAi] "He carries"
• [Dhote hAi] "They carry"
• [chalAtA huM] "I drive"
• [tErte hEM] "They swim"
Bangla-English data
• [bai] "I carry"
• [baibe] "He will carry"
• [bay] "He carries"
• [bay] "They carry"
• [chAlAi] "I drive"
• [sAMtrAy] "They swim"

To translate (repeated):
• I will carry.
• They drive.
• He swims.
• They will drive.
Foundation
• Data driven approach
• Goal: find the English sentence e, given foreign language sentence f, for which p(e|f) is maximum
• Translations are generated on the basis of a statistical model
• Parameters are estimated from bilingual parallel corpora
SMT: Language Model
• To detect good English sentences
• The probability of an English sentence w1 w2 ... wn can be written as
  Pr(w1 w2 ... wn) = Pr(w1) * Pr(w2|w1) * ... * Pr(wn|w1 w2 ... wn-1)
• Here Pr(wn | w1 w2 ... wn-1) is the probability that word wn follows the word string w1 w2 ... wn-1 (the n-gram model probability)
• Trigram model approximation: Pr(wn | w1 ... wn-1) ≈ Pr(wn | wn-2 wn-1)
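The chain-rule decomposition above can be sketched with maximum-likelihood trigram estimates. The toy corpus is invented for illustration, and real systems add smoothing for unseen n-grams:

```python
from collections import Counter

# Minimal trigram LM sketch: MLE counts from a toy corpus (no smoothing).
corpus = ["<s> <s> the government increased its revenue </s>",
          "<s> <s> the government increased taxes </s>"]

tri, bi = Counter(), Counter()
for sent in corpus:
    w = sent.split()
    for i in range(2, len(w)):
        tri[(w[i - 2], w[i - 1], w[i])] += 1   # count of w1 w2 w3
        bi[(w[i - 2], w[i - 1])] += 1          # count of the history w1 w2

def p(w3, w1, w2):
    """Pr(w3 | w1 w2) by maximum likelihood."""
    return tri[(w1, w2, w3)] / bi[(w1, w2)] if bi[(w1, w2)] else 0.0

print(p("increased", "the", "government"))  # 1.0: always follows "the government"
```

Multiplying such conditional probabilities along a sentence gives the language-model score used to rank candidate translations for fluency.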
SMT: Translation Model
• P(f|e): probability of f given the hypothesised English translation e
• How to assign values to P(f|e)?
  – At the sentence level: sentences are infinite, so it is not possible to estimate pairs (e, f) for all sentences
  – So work at the word level: introduce a hidden variable a that represents alignments between the individual words in the sentence pair
Alignment
• If the string e = e1^l = e1 e2 ... el has l words, and the string f = f1^m = f1 f2 ... fm has m words,
• then the alignment a can be represented by a series a1^m = a1 a2 ... am of m values, each between 0 and l, such that:
  – if the word in position j of the f-string is connected to the word in position i of the e-string, then aj = i
  – if it is not connected to any English word, then aj = 0
Example of alignment
English: Ram went to school
Hindi: raama paathashaalaa gayaa
Alignment: Ram → raama, went → gayaa, school → paathashaalaa; to is unaligned
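The alignment series defined above can be stored directly as a list of indices. This sketch encodes the Ram/raama example:

```python
# IBM-style alignment series: for each position j in the Hindi string f,
# a[j] holds the 1-based position of the aligned English word in e,
# or 0 when the Hindi word has no English counterpart.
e = ["Ram", "went", "to", "school"]       # e_1 .. e_l  (l = 4)
f = ["raama", "paathashaalaa", "gayaa"]   # f_1 .. f_m  (m = 3)
a = [1, 4, 2]   # raama->Ram, paathashaalaa->school, gayaa->went

pairs = [(fj, e[aj - 1] if aj else "NULL") for fj, aj in zip(f, a)]
print(pairs)   # [('raama', 'Ram'), ('paathashaalaa', 'school'), ('gayaa', 'went')]
```

Note that the series indexes the f-string, so English words like "to" that generate no Hindi word simply never appear in a.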
Translation Model: Exact expression

Pr(f|e) = \sum_a Pr(m|e) \prod_{j=1}^{m} Pr(a_j \mid a_1^{j-1}, f_1^{j-1}, m, e) \, Pr(f_j \mid a_1^{j}, f_1^{j-1}, m, e)

• Pr(m|e): choose the length of the foreign language string given e
• Pr(a_j | ...): choose the alignment given e and m
• Pr(f_j | ...): choose the identity of the foreign word given e, m, a
• Five models (Model-1 through Model-5) for estimating the parameters in the expression [2]
Proof of Translation Model: Exact expression

Pr(f \mid e) = \sum_a Pr(f, a \mid e)                     ; marginalization over a
Pr(f, a \mid e) = \sum_m Pr(f, a, m \mid e)               ; marginalization over m
Pr(f, a, m \mid e) = Pr(m \mid e) \, Pr(f, a \mid m, e)
                   = Pr(m \mid e) \prod_{j=1}^{m} Pr(f_j, a_j \mid a_1^{j-1}, f_1^{j-1}, m, e)
                   = Pr(m \mid e) \prod_{j=1}^{m} Pr(a_j \mid a_1^{j-1}, f_1^{j-1}, m, e) \, Pr(f_j \mid a_1^{j}, f_1^{j-1}, m, e)

Since m is fixed for a particular f, the sum over m collapses:

Pr(f, a \mid e) = Pr(m \mid e) \prod_{j=1}^{m} Pr(a_j \mid a_1^{j-1}, f_1^{j-1}, m, e) \, Pr(f_j \mid a_1^{j}, f_1^{j-1}, m, e)
Alignment: fundamental and ubiquitous
• Spell checking
• Translation
• Transliteration
• Speech to text
• Text to speech
EM for word alignment from sentence alignment: example

English (1): three rabbits        (a b)
French  (1): trois lapins         (w x)
English (2): rabbits of Grenoble  (b c d)
French  (2): lapins de Grenoble   (x y z)

Initial probabilities (each cell denotes t(a|w), t(a|x), etc.; uniform):

  | a   | b   | c   | d
w | 1/4 | 1/4 | 1/4 | 1/4
x | 1/4 | 1/4 | 1/4 | 1/4
y | 1/4 | 1/4 | 1/4 | 1/4
z | 1/4 | 1/4 | 1/4 | 1/4
The counts in IBM Model 1
Works by maximizing P(f|e) over the entire corpus. For IBM Model 1 we get the following relationship:

c(w_f \mid w_e; f, e) = \frac{t(w_f \mid w_e)}{t(w_f \mid w_{e_0}) + \dots + t(w_f \mid w_{e_l})} \times \#(w_f \text{ in } f) \times \#(w_e \text{ in } e)

where
• c(w_f | w_e; f, e) is the fractional count of the alignment of w_f with w_e in f and e
• t(w_f | w_e) is the probability of w_f being the translation of w_e
• #(w_f in f) is the count of w_f in f, and #(w_e in e) is the count of w_e in e
Example of expected count

c[a|w; (a b)(w x)] = [ t(a|w) / ( t(a|w) + t(a|x) ) ] × #(a in 'a b') × #(w in 'w x')
                   = [ (1/4) / (1/4 + 1/4) ] × 1 × 1 = 1/2
"Counts" (fractional counts from each sentence pair):

Pair (a b) / (w x):
  | a   | b   | c | d
w | 1/2 | 1/2 | 0 | 0
x | 1/2 | 1/2 | 0 | 0
y | 0   | 0   | 0 | 0
z | 0   | 0   | 0 | 0

Pair (b c d) / (x y z):
  | a | b   | c   | d
w | 0 | 0   | 0   | 0
x | 0 | 1/3 | 1/3 | 1/3
y | 0 | 1/3 | 1/3 | 1/3
z | 0 | 1/3 | 1/3 | 1/3
Revised probability: example

t_revised(a|w) = (1/2) / [ (1/2 + 1/2 + 0 + 0) from (a b)(w x) + (0 + 0 + 0 + 0) from (b c d)(x y z) ]
Revised probabilities table

  | a   | b    | c   | d
w | 1/2 | 1/4  | 0   | 0
x | 1/2 | 5/12 | 1/3 | 1/3
y | 0   | 1/6  | 1/3 | 1/3
z | 0   | 1/6  | 1/3 | 1/3
"Revised counts":

Pair (a b) / (w x):
  | a   | b   | c | d
w | 1/2 | 3/8 | 0 | 0
x | 1/2 | 5/8 | 0 | 0
y | 0   | 0   | 0 | 0
z | 0   | 0   | 0 | 0

Pair (b c d) / (x y z):
  | a | b   | c   | d
w | 0 | 0   | 0   | 0
x | 0 | 5/9 | 1/3 | 1/3
y | 0 | 2/9 | 1/3 | 1/3
z | 0 | 2/9 | 1/3 | 1/3
Re-revised probabilities table

  | a   | b      | c   | d
w | 1/2 | 3/16   | 0   | 0
x | 1/2 | 85/144 | 1/3 | 1/3
y | 0   | 1/9    | 1/3 | 1/3
z | 0   | 1/9    | 1/3 | 1/3

Continue until convergence; notice that the (b, x) binding gets progressively stronger; b = rabbits, x = lapins.
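The E/M iterations above can be reproduced mechanically. A minimal sketch on the toy corpus, using exact fractions; the count normalisation follows the tables above (for each English word in a pair, a unit of fractional count is distributed over that pair's French words):

```python
from fractions import Fraction
from collections import defaultdict

# Toy corpus from the slides: a = three, b = rabbits, c = of, d = Grenoble;
# w = trois, x = lapins, y = de, z = Grenoble.
corpus = [(["a", "b"], ["w", "x"]),
          (["b", "c", "d"], ["x", "y", "z"])]

eng = sorted({e for es, _ in corpus for e in es})
fre = sorted({f for _, fs in corpus for f in fs})

# Initialise t(f|e) uniformly: 1/4 in every cell, as in the first table.
t = {(f, e): Fraction(1, len(fre)) for f in fre for e in eng}

def em_iteration(t):
    # E-step: fractional counts, proportional to the current t(f|e).
    count = defaultdict(Fraction)
    for es, fs in corpus:
        for e in es:
            denom = sum(t[(f, e)] for f in fs)
            for f in fs:
                count[(f, e)] += t[(f, e)] / denom
    # M-step: renormalise the counts for each English word.
    new_t = {}
    for e in eng:
        total = sum(count[(f, e)] for f in fre)
        for f in fre:
            new_t[(f, e)] = count[(f, e)] / total if total else Fraction(0)
    return new_t

for _ in range(2):          # two iterations reproduce the "re-revised" table
    t = em_iteration(t)

print(t[("x", "b")], t[("w", "b")])   # 85/144 3/16
```

After two iterations t(x|b) = 85/144 and t(w|b) = 3/16, matching the re-revised table, and further iterations keep strengthening the rabbits/lapins binding.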
Derivation of EM based Alignment Expressions
V_E: vocabulary of language L1 (say English); V_F: vocabulary of language L2 (say Hindi)

E1: what is in a name?
F1: नाम में क्या है? (naam meM kya hai?) Gloss: name in what is?

E2: That which we call rose, by any other name will smell as sweet.
F2: जिसे हम गुलाब कहते हैं, और भी किसी नाम से उसकी खुशबू समान मीठा होगी (jise hum gulab kahte hai, aur bhi kisi naam se uski khushbu samaan mitha hogii) Gloss: that which we rose call, any other name by its smell as sweet

Vocabulary mapping:
V_E: what, is, in, a, name, that, which, we, call, rose, by, any, other, will, smell, as, sweet
V_F: naam, meM, kya, hai, jise, hum, gulab, kahte, aur, bhi, kisi, se, uski, khushbu, samaan, mitha, hogii
Key Notations
(Thanks to Sachin Pawar for helping with the maths formulae processing)
• Hidden variables and parameters
• Likelihoods: data likelihood L(D; Θ), data log-likelihood LL(D; Θ), expected data log-likelihood E(LL(D; Θ))
• Constraint and Lagrangian
• Differentiating w.r.t. P_ij
• Final E and M steps
18-Dec-2013, SMT Tutorial, ICON-2013
Remember: SMT
What is a good translation?
• Faithful to source
• Fluent in target

ê = argmax_e P(e) · P(f|e)

where P(e) models fluency and P(f|e) models faithfulness.
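The argmax above can be sketched as scoring each candidate by the product of a language-model (fluency) term and a translation-model (faithfulness) term. The candidates and scores below are made up purely for illustration:

```python
# Noisy-channel decoding sketch: pick e^ = argmax_e P(e) * P(f|e).
# Candidate translations and probabilities are hypothetical.
candidates = {
    "He carries":      {"lm": 0.6, "tm": 0.3},    # fluent and faithful
    "Him carry":       {"lm": 0.1, "tm": 0.4},    # faithful but disfluent
    "The sky is blue": {"lm": 0.7, "tm": 0.001},  # fluent but unfaithful
}

best = max(candidates, key=lambda e: candidates[e]["lm"] * candidates[e]["tm"])
print(best)  # He carries
```

The product rewards only candidates that score well on both terms, which is exactly why the decomposition separates fluency from faithfulness.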
Case markers and morphology are crucial in E-H MT
• Order-of-magnitude facelift in fluency and fidelity
• Determined by the combination of suffixes and semantic relations on the English side
• Augment the aligned corpus of the two languages with the correspondence of English suffixes and semantic relations with Hindi suffixes and case markers
Semantic relations + suffixes → case markers + inflections
I
ate
mangoes
I {