The “Naturalness” of Software: A Research Vision
Abram Hindle, Earl Barr, Zhendong Su, Mark Gabel, and Premkumar Devanbu
    public class FunctionCall {
        public static void funct1 () {
            System.out.println ("Inside funct1");
        }
        public static void main (String[] args) {
            int val;
            System.out.println ("Inside main");
            funct1();
            System.out.println ("About to call funct2");
            val = funct2(8);
            System.out.println ("funct2 returned a value of " + val);
            System.out.println ("About to call funct2 again");
            val = funct2(-3);
            System.out.println ("funct2 returned a value of " + val);
        }
        public static int funct2 (int param) {
            System.out.println ("Inside funct2 with param " + param);
            return param * 2;
        }
    }
English, Tamil, German
Can be rich, powerful, expressive
..but “in nature” are mostly simple, repetitive, boring
➡ Statistical Models
Two Examples
A speech recognizer example: “European Central Fish”?
Another speech recognizer example: “fish++”?

Repetition ➡ Mathematical Models ➡ Useful Software
Is software really repetitive?
The “Uniqueness” of Code
Mark Gabel, Zhendong Su
“A Study of the Uniqueness of Source Code”, Gabel and Su, ACM SIGSOFT FSE 2010
How Redundant is Code?
How much code? 6,000 projects (C, C++, Java); 430,000,000 LOC
How long? Token sequences of length 6-77
(1) How matched? Exact match, or within 1-4 edits
(2) How matched? Raw tokens, or with renamed identifiers
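The measurement idea can be sketched as follows (an illustration only, not Gabel and Su's actual tooling): slide a fixed-length token window over a fragment and ask what fraction of its windows also occur, exactly, somewhere in a reference corpus.

```python
def redundancy(tokens, corpus_tokens, length):
    """Fraction of `length`-token windows of `tokens` that also
    appear somewhere in `corpus_tokens` (exact-match criterion)."""
    corpus_windows = {
        tuple(corpus_tokens[i:i + length])
        for i in range(len(corpus_tokens) - length + 1)
    }
    windows = [tuple(tokens[i:i + length])
               for i in range(len(tokens) - length + 1)]
    if not windows:
        return 0.0
    hits = sum(1 for w in windows if w in corpus_windows)
    return hits / len(windows)

# Toy example: a target fragment vs. a one-line "corpus".
corpus = "for ( i = 0 ; i < n ; i ++ )".split()
target = "for ( i = 0 ; i < 10 ; i ++ )".split()
print(redundancy(target, corpus, 3))  # 8 of 11 windows match, ~0.73
```

Renaming identifiers before windowing (mapping every identifier to a placeholder) yields the study's second, looser matching criterion.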
[Figure: Non-Uniqueness (Redundancy) in a Large Java Corpus — percent redundancy (0-100) vs. length of candidate code fragment in tokens (5-80), for exact tokens and with identifiers renamed.]
Software is really repetitive. How can we use this?
How has the “naturalness” (repetitive structure) of natural language been exploited?
Large Corpora ➡ Language Models ➡ Speech Recognition, Translation, etc.
Language Models
For any utterance U, 0 ≤ p(U) ≤ 1.
If Ua is uttered more often than Ub, then p(Ua) > p(Ub).
p(“European Central Fish”) < p(“European Central Bank”)
p(for(i = 0; i < 10; fish++)) < p(for(i = 0; i < 10; i++))
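A toy illustration of the inequality: a maximum-likelihood model over whole utterances just uses relative frequency, so the more frequently uttered string gets the higher probability. The corpus counts below are invented.

```python
from collections import Counter

# Hypothetical corpus of whole utterances; an MLE "language model"
# uses relative frequency: p(U) = count(U) / total.
corpus = (["European Central Bank"] * 99) + ["European Central Fish"]
counts = Counter(corpus)
total = sum(counts.values())

def p(utterance):
    return counts[utterance] / total

print(p("European Central Bank"))  # 0.99
print(p("European Central Fish"))  # 0.01
```

Real models assign probabilities to unseen utterances too, by decomposing them into smaller events, as the n-gram models later in the talk do.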
History of Language Models in NLP
• Initially, “Rationalist Methods” based on linguistic and logical theories...
• ...then enter the “Empiricist” approach:
Most “natural” utterances are repetitive, simple
✓ Large on-line speech corpora
✓ Faster computers
➡ Good, high-quality language models
➡ Rapid, revolutionary advances

“Every time I fire a linguist, the performance of our recognizer goes up” (Fred Jelinek)
Language Models: a Revolution in NLP
The design and estimation of language models is at the heart of modern NLP.
Good language models have been used for:
➡ Speech recognition
➡ Natural language translation
➡ Document summarization
➡ Document retrieval
But what about code? and “code language models”?
Exploiting Code Language Models Suggest the next token for developers
Complete the current token for developers
Assistive (speech, gesture) coding
Summarization and retrieval as translation
Fast, “good guess” static analysis
Search-based Software Engineering
Building a Language Model
Model Design ➡ Estimation Algorithm, run over a Large Text Corpus (Training) ➡ Statistical Model
(Model estimated using frequency of occurrence!)
Then: Statistical Model + Large Text Corpus (Test) ➡ Evaluation ➡ Model Quality
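The pipeline above can be sketched end to end: estimate a unigram model from a training corpus by frequency of occurrence, then score a held-out test corpus by per-word surprise (cross-entropy, defined a few slides later). The two corpora are invented; add-one smoothing stands in for the fancier estimation methods real models need for unseen words.

```python
import math
from collections import Counter

# Training and test corpora (invented for illustration).
train = "the cat sat on the mat".split()
test = "the dog sat on the mat".split()

# Estimation: unigram counts, with add-one smoothing so an unseen
# test word ("dog") does not get probability zero.
counts = Counter(train)
vocab = set(train) | set(test)

def p(word):
    return (counts[word] + 1) / (len(train) + len(vocab))

# Evaluation: average surprise (in bits) per test word.
quality = -sum(math.log2(p(w)) for w in test) / len(test)
print(f"cross-entropy: {quality:.2f} bits/word")  # ~2.56
```

A better model (more data, richer context) would lower this number on the same test corpus.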
What a Language Model Does
“..of the European Central Bank”
Language Model assigns: p(of), p(the), p(European), p(Central), p(Bank)
Real-world language models are vastly more complex,
almost always face data sparsity,
and rely on novel, NLP-specific estimation methods.
Evaluating Language Model Quality
A good model is one that finds the words it encounters not “too surprising”:
frequently encountered language events are assigned higher probability,
infrequent language events are assigned lower probability.
....measured using “Cross-Entropy”
Background: Cross Entropy
Is the language model a good description of a given program (e.g., the FunctionCall example above)?
Low cross-entropy ➡ the model describes the code well.
High cross-entropy ➡ the model describes the code poorly.
Measuring Goodness: Cross Entropy
For a document with n words, where p(e_i) is the probability the model assigns to word e_i:

    H = -(1/n) * Σ_{i=1..n} log p(e_i)

Higher if the model assigns low probability to frequent events;
lower if the model assigns high probability to frequent events.
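The formula is a direct average of per-word surprise; a minimal transcription, assuming the model's per-word probabilities are already given:

```python
import math

def cross_entropy(probs):
    """H = -(1/n) * sum(log2 p(e_i)) over the n words of a document,
    where p(e_i) is the probability the model assigned to word i."""
    n = len(probs)
    return -sum(math.log2(p) for p in probs) / n

# A model that finds a document unsurprising (high per-word probability)...
print(cross_entropy([0.5, 0.5, 0.5, 0.5]))      # 1.0 bit per word
# ...versus one that finds the same document surprising.
print(cross_entropy([0.01, 0.01, 0.01, 0.01]))  # ~6.64 bits per word
```

Base-2 logs give the answer in bits, matching the “3-4 bits” and “five bits” figures later in the talk.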
What language model gives low cross-entropy?
n-gram models
• Intuition: Local Context Helps
• Examples (NL, then code):
  • “multiple choice question”
  • “item = item→next”
More context helps more!
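A minimal sketch of the idea: a trigram model (“use just the previous two tokens”) trained by counting over a toy token stream. Real models add smoothing for unseen contexts; the code fragment below is invented.

```python
from collections import Counter, defaultdict

def train_trigrams(tokens):
    """Count, for each two-token context, which token follows it."""
    model = defaultdict(Counter)
    for a, b, c in zip(tokens, tokens[1:], tokens[2:]):
        model[(a, b)][c] += 1
    return model

def suggest(model, a, b):
    """Most frequent continuation of the context (a, b), if any."""
    following = model[(a, b)]
    return following.most_common(1)[0][0] if following else None

code = "item = item -> next ; item = item -> next ;".split()
model = train_trigrams(code)
print(suggest(model, "item", "->"))  # 'next'
```

Longer contexts (4-grams, 5-grams, ...) sharpen the counts, which is why cross-entropy keeps dropping with n in the results that follow.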
n-gram models of code: Experimental Results
[Figure: n-gram cross entropy (0-10 bits) vs. model order (1-gram through 8-gram), for English and for code (Java and C datasets); code levels off around 3-4 bits, well below English.]
The Skeptic Asks... Is it just that C, Java, Python... are simpler than English?
➡ Do cross-project testing!
➡ Train on one project, test on the others
➡ If it’s all “in the language”, entropy should be similar
Train on one project, test on the others.
[Figure: cross entropy (2-14) per corpus project (Ant, Batik, Cassandra, Eclipse, Log4j, Lucene, Maven2, Maven3, Xalan-J, Xerces2), with each project's “self” cross entropy shown for comparison.]
Train on one Ubuntu application domain, test on the others.
[Figure: cross entropy (2.5-6.0) per corpus category (Admin, Doc, Graphics, Interpreters, Mail, Net, Sound, Tex, Text, Web), with each category's “self” cross entropy shown for comparison and numeric point labels (e.g., 15, 86, 135).]
The “Naturalness” Vision
Suggest the next token for developers
Complete the current token for developers
Assistive (speech, gesture) coding
Summarization and retrieval as translation
Stupid, statistical, static analysis
Search-based Software Engineering
Suggesting Tokens
What token could appear here? (Eclipse's suggestion engine uses type, scope, etc.)
What token has most often appeared here? Use just the previous two tokens!
Do n-grams help?
Eclipse Suggestion Engine + Suggestion from Language Model
➡ Merge Algorithm ➡ Evaluation on a Test Set (existing code)
➡ Additional benefit from the Language Model?
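The slides do not spell out the merge algorithm; one plausible sketch, purely illustrative, interleaves the two ranked suggestion lists best-first and drops duplicates. Both suggestion lists below are hypothetical.

```python
from itertools import zip_longest

def merge(eclipse_suggestions, lm_suggestions, limit=10):
    """Interleave two ranked suggestion lists, best-first, dropping
    duplicates. (Illustrative only; not the authors' actual algorithm.)"""
    merged, seen = [], set()
    for pair in zip_longest(eclipse_suggestions, lm_suggestions):
        for s in pair:
            if s is not None and s not in seen:
                seen.add(s)
                merged.append(s)
    return merged[:limit]

eclipse = ["toString", "hashCode", "equals"]   # hypothetical engine output
lm = ["toString", "println", "valueOf"]        # hypothetical n-gram output
print(merge(eclipse, lm))
# ['toString', 'hashCode', 'println', 'equals', 'valueOf']
```

Evaluation then replays existing code token by token and counts how often the correct next token appears in the merged list but not in Eclipse's alone.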
How many more correct suggestions?
Merging the language model's ranked suggestions (Suggestion1, Suggestion2, ... Suggestion10) into Eclipse's list:
Language Models ALWAYS improve performance.
[Figure: percent gain over Eclipse (0-120%) and raw gain in suggestion count (0-4000) vs. suggestion length in tokens (3-15); improved performance at every token length.]
N-gram suggestions always add value to the native Eclipse suggestion engine, in a very large trial.
Natural language: can be rich, powerful, expressive, but mostly simple, repetitive, boring ➡ Statistical Models
The “Naturalness” Vision
Suggest the next token for developers
Complete the current token for developers
Assistive (speech, gesture) coding ?????
Summarization and retrieval as translation
????? Fast, “good guess” static analysis
Search-based Software Engineering
Assisted Coding
Dasher++ (Rachel Aurand, graduate student)
Eclipse
The “Naturalness” Vision Suggest the next token for developers
Complete the current token for developers
Assistive (speech, gesture) coding
Summarization and retrieval as translation
Fast, “good guess” static analysis
Search-based Software Engineering
Noisy Channel Model
“Comment allez vous?”
He's trying to speak English, but it got systematically “messed up” into French.
Maybe he's saying: “Do you comment all your code?”
Oh, it must be: “Fine, thank you, how are you?”
What was the most likely English sentence he was trying to say?

    p(E | F) = p(F | E) . p(E) / p(F)

Maximize the numerator over E to get the best translation.
Where do the probability distributions come from?
p(F | E): the most likely way “it got messed up” (a joint distribution from an aligned corpus)
p(E): an English language model
p(F): a normalizing constant
Noisy Channel Model, for code
Toast.makeText(context, “hello”, 5).show();
He's trying to speak English, but it comes out as funny-sounding code.
Maybe his code means: “Make me some toast?”
Oh, it must be: “Pop up a message window!”
What was the most likely English summary of this code?

    p(E | C) = p(C | E) . p(E) / p(C)

Where do the probability distributions come from?
p(C | E): a code-English joint (aligned) corpus
p(E): a “domain-specific” English language model
p(C): a normalizing constant
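A toy decoder for this equation: pick the English summary E maximizing p(C | E) · p(E), since the denominator p(C) is constant over E. All probabilities below are invented for illustration; in practice p(C | E) would be estimated from an aligned code-English corpus and p(E) from an English language model.

```python
# Candidate English summaries E for the code C = Toast.makeText(...).show()
p_english = {                       # p(E): English language model (invented)
    "Pop up a message window": 0.6,
    "Make me some toast": 0.4,
}
p_code_given_english = {            # p(C | E): channel model for this C (invented)
    "Pop up a message window": 0.5,
    "Make me some toast": 0.001,
}

def decode():
    """argmax over E of p(C | E) * p(E) -- Bayes without the denominator."""
    return max(p_english, key=lambda e: p_code_given_english[e] * p_english[e])

print(decode())  # 'Pop up a message window'
```

The same argmax structure, with the roles of code and English swapped, drives the “retrieval as translation” direction.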
The “Naturalness” Vision Suggest or Complete next tokens
Assistive (speech, gesture) coding
Summarization and Retrieval as Translation
Learn and Enforce Coding Conventions
Syntax Errors
Machine Translation for Porting
Fast, “good guess” static analysis
Search-based Software Engineering