Designing Electronic Voting


Author: Raymond Lionel Gardner


Tartu University Faculty of Mathematics and Informatics Institute of Computer Science Chair of Software Engineering

Oleg Mürk

Designing Electronic Voting Bachelor’s Thesis

Supervisor: Helger Lipmaa, PhD

Author: ........................ "..." June 2001
Supervisor: .................... "..." June 2001
Head of the Chair: ............. "..." June 2001

Tartu 2001

Contents

Introduction
    Aims of the Thesis
    Acknowledgement
    Notation
1 System Analysis
    1.1 Domain Model
    1.2 Generic Requirements
    1.3 Conventional Elections
    1.4 Trust
    1.5 On Revoking Ballots
    1.6 E-voting Requirements
        1.6.1 Functional Requirements
        1.6.2 Non-functional Requirements
2 System Design
    2.1 Theoretical Basis
        2.1.1 Model of the Real World
        2.1.2 Electronic Voting Scheme
        2.1.3 Public Key Infrastructure
        2.1.4 Time-stamping
        2.1.5 Bulletin Board
        2.1.6 Threshold Encryption and Signature
        2.1.7 Implementations of EVS
        2.1.8 On the Freedom of Choice
    2.2 Designing Framework
        2.2.1 Real World Model
        2.2.2 Computing Device
        2.2.3 Software
        2.2.4 Threshold Trust
        2.2.5 Connection
        2.2.6 PKI
        2.2.7 Time-stamping
        2.2.8 Summary
    2.3 Design for Bulletin Board
        2.3.1 Some Simple Ideas
        2.3.2 Synchronous Environment
        2.3.3 Asynchronous Environment
        2.3.4 Practical Solutions
    2.4 Design Pattern for E-voting System
        2.4.1 Computing Result
        2.4.2 Meta Process
        2.4.3 Design for Single Authority EVS
        2.4.4 Design for Multiple Authority EVS
        2.4.5 Conclusions
3 Summary
Resümee (In Estonian)
Bibliography

Introduction

Recently, the topic of implementing electronic voting (e-voting) has become very popular: multiple workshops have been held, firms offer corresponding services, real attempts at e-voting have taken place, and the media is eagerly covering the topic.

The main purpose of electronic elections is to allow voters to vote from as many locations as possible, ideally from their personal computing devices. An intermediate option would be to deploy specialized computers (kiosks) everywhere, as ATMs (automated teller machines) currently are. The communication medium would probably be the Internet or something similar. The justification is that voting would be more convenient, which would increase voter turnout. One might also expect that in the future e-voting would become less expensive than conventional voting. At present e-voting is viewed as a complement to conventional elections because, for instance, not all people have access to computers and the Internet (or the skills to use them).

Despite its tempting simplicity, this problem is much more complex than it seems at first. The main issues are security and reliability. The problem of organizing e-voting consists roughly of three parts:

- Solving the problem mathematically, which includes formulating a model of the real world (e.g. formalizing the notion of trust), stating requirements, and finally finding a mathematical construction and proving that it satisfies these requirements. Such a construction is called an electronic voting scheme (EVS): a collection of protocols and algorithms that implement e-voting within the formulated model of the real world. I will call all this theoretical activity.

- Provided there is an EVS, it must be implemented. In particular, the real world model that was used must be implemented. Besides that, an EVS usually has a relatively simple structure (while nevertheless being complex mathematically), which assumes some inputs and produces some outputs. It does not consider the process of preparing input data and consuming output data. Also, e-voting must somehow be integrated into the existing conventional voting process. A real implementation must consider the whole iterative process of organizing elections. I will call this technical activity.

- Finally, e-voting will inevitably differ from conventional elections: the voter must perform different actions, there are different (and probably bigger) security threats, different demographic groups have different levels of access to the Internet, etc. For this reason politicians and sociologists must evaluate the impact of e-voting on the democratic process, decide whether it is useful at all, and suggest what should be changed. Besides that, laws must be changed to accommodate e-voting within the conventional voting process. I will call this political activity.

Theoretical activity belongs to the field of cryptography and has been going on for at least twenty years. The most influential papers in this field are, in my personal opinion, [Cha81], [Ben87], [BT94], and [CGS97]. The reader can find a partial overview of this topic in my semester work [Myr00]. Basically, it can be said that solutions of acceptable security and complexity exist, although there is enough room for further advances.

A number of firms provide e-voting solutions. The best known of them are probably [VoteHere.net] and [Election.com]. The first provides at least some description of its technology, which is based on [CGS97], a good cryptographic construction. The second has received more media attention, but does not present any description of its technology on its web site (which is a disadvantage, in my opinion).

A number of workshops concentrating on the political and technical aspects have been conducted: the National Workshop on Internet Voting [IPI], the Voting Integrity Project [VIP], and the California Internet Voting Task Force [CIVTF]. Their major finding is that although there is enough theoretical basis for implementing e-voting, technologically it is not possible to make systems secure enough. The biggest problem is the insecurity of conventional personal computers and the Internet. At the same time they propose using e-voting kiosks in the near future.

Aims of the Thesis

This thesis can be viewed as a continuation of my semester work [Myr00], where I dealt with the theoretical problems of e-voting. In this work I will concentrate on the technical aspect: I will try to formulate requirements for the system and outline a system design. Software engineering ideology and notation (UML) will be followed throughout the text.

Although it is clear that the risks of voting from ordinary PCs over the Internet are too high, it is still interesting to design an e-voting system and see where and why these risks arise.

Acknowledgement

I would like to express my gratitude to Helger Lipmaa for introducing me to this subject, for motivating me to deal with it, and for pulling me into the Estonian e-voting project [LipMy01].

Notation

As a potential reader might have a theoretical computer science (and not software engineering) background, I will briefly describe the notation used. Two types of UML diagrams are used: static structure and activity. The notation is used and interpreted quite freely, which should be normal from the viewpoint of UML. Some other types of diagrams are also used, but their meaning should be evident or will be explained separately.

A static structure diagram sample is presented in Figure 1. It depicts a Factory (a class) that produces Cars (a dashed line signifies dependency or direction of flow). Cars have names (an attribute) and the operations Car::Start() and Car::Stop() (methods). Each car has at most one Owner, and each owner can have many cars (an arrow with "1" and "*" signifies a one-to-many relationship). A Car consists of Wheels (a diamond signifies aggregation). A Car is a kind of Beeper, so it can Beep() (a triangle signifies generalization; an interface signifies a set of methods that some class should implement).

An activity diagram sample is presented in Figure 2. It is supposed to describe the state transitions and flow of a process. The upper black dot signifies the beginning of the process (initial state); the one at the bottom is the final state. Bubbles signify either an activity or a state, and arrows depict transitions. On this diagram Work1 and Work2 are performed in parallel, and the state Complete is reached when both of these activities have completed.

A system architecture diagram sample is presented in Figure 3. Here a three-dimensional bar depicts a subsystem, a simple rectangle depicts a process (I also interpret it as a user), a rounded rectangle signifies an object (or data), a cylinder depicts a datastore, and a grey bar between database and computer signifies a boundary. Lines and arrows are used freely to signify relationships and directions of dataflow.

[Figure 1: UML Static Structure Diagram Sample. Elements: Factory, Car (+name, +start(), +stop()), Owner, Wheel, interface Beeper (+beep()).]

[Figure 2: UML Activity Diagram Sample. Activities and states: Prepare, Work1, Work2, Complete.]

[Figure 3: System Architecture Diagram Sample. Elements: Programmer, Computer, Database, Program.]

Chapter 1

System Analysis

In this chapter I will try to describe the problem of e-voting in detail and formulate requirements for an e-voting system. These will be the basis for designing and implementing such a system.

1.1 Domain Model

The basic entities involved in elections are depicted in Figure 1.1.

[Figure 1.1: Domain model. Entities: Person (+id), Election (+name), Voter, Ballot Type, Option (+name).]

Person: Any person that may participate in some election. I assume that each person has a unique identifier.

Election: A specific election. I assume that elections are identified by a unique name.

Voter: A person participating in an election. Many voters can participate in an election, and each person can be a voter in many elections. A voter is identified by the person's identifier and the name of the election.

Ballot Type: Different voters can be presented with different ballot types at an election. A ballot type consists of some number of options, among which the voter has to select one.

Option: One option belonging to some ballot type. Options are identified by names. Option names are unique within the corresponding ballot type.

At a real election a voter might be required to answer multiple questions. In my model this can be modelled with multiple simultaneous elections. If it is important that the voter answers every question, this can easily be enforced with technical or administrative methods.

In addition I present the following definitions:

Definition 1.1.1 Ballot: an option from some ballot type chosen by a voter.

Definition 1.1.2 Tally: a set of ballots from the voters of some election. Each voter can have at most one ballot.

Definition 1.1.3 Election Result: calculated from the tally; for each ballot type it states how many times each option was selected.
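The domain model and Definitions 1.1.1 to 1.1.3 can be illustrated with a minimal Python sketch. It is only an illustration under assumptions: the class and function names, and the `name` field of `BallotType`, are hypothetical and do not come from the thesis, and ballots are kept in the clear together with the voter, which a real e-voting scheme must of course avoid.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class BallotType:
    name: str                 # hypothetical label, not part of the thesis model
    options: tuple            # option names, unique within this ballot type


@dataclass(frozen=True)
class Voter:
    person_id: str            # unique identifier of the person
    election: str             # unique name of the election
    ballot_type: BallotType   # ballot type assigned to this voter


@dataclass(frozen=True)
class Ballot:
    voter: Voter
    option: str               # the option chosen by the voter


def election_result(tally):
    """Return {ballot type name: Counter mapping option -> times selected}."""
    seen = set()
    result = {}
    for ballot in tally:
        voter = ballot.voter
        key = (voter.person_id, voter.election)
        if key in seen:       # Definition 1.1.2: at most one ballot per voter
            raise ValueError("voter %s has more than one ballot" % (key,))
        seen.add(key)
        ballot_type = voter.ballot_type
        if ballot.option not in ballot_type.options:
            raise ValueError("option %r is not on the voter's ballot type" % ballot.option)
        # Start every option at zero so unselected options appear in the result.
        counts = result.setdefault(
            ballot_type.name, Counter({name: 0 for name in ballot_type.options}))
        counts[ballot.option] += 1
    return result
```

Keeping one counter per ballot type mirrors Definition 1.1.3, where the result is stated separately for each ballot type.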

1.2 Generic Requirements

Any election system (either conventional or electronic) must at least satisfy the following requirements:

Functional Requirements: The system must allow forming lists of eligible voters and of the needed ballot types, and assigning each voter the corresponding ballot type. Each voter must have the opportunity to cast at most one vote by selecting one option from his ballot type. In the end the election result must be calculated.

Freedom of Choice: Nobody can affect the voter's choice; the voter should make his decision himself. It is a complex problem to define such a requirement formally, but basically there are two options:

- Privacy: Nobody can learn how a person voted without cooperating with him.
- Incoercibility: Nobody can learn how a person voted even if cooperating with him is possible. This includes that the voter cannot prove how he voted. Of course, the voter could just tell a coercer how he voted, but the coercer would not have any means of verifying this claim.

The interested reader can find a longer discussion of this subject in [Myr00].

1.3 Conventional Elections

It is important to stress that e-voting is viewed as a complement to conventional non-electronic voting systems. It is therefore reasonable to review how such systems function.

[Figure 1.2: Conventional elections. Entities: Election Organizer, Intermediate Organizer, Voting Location.]

The structure of the system is usually hierarchical. Normally there is one organization that is responsible for organizing the election. I call it the Election Organizer. At the other end there is some number of hierarchy leaves, where people can actually vote. I call them Voting Locations. Voting locations and the election organizer communicate through intermediate organizers, which group some number of voting locations (usually on a geographical basis). Different ballot types can be used at different voting locations. Each voter is assigned to one main voting location (close to his residence), where he can normally vote. In such a setting it is possible to prevent voters from overvoting (voting more than once).

Different voting technologies can be used there. One possibility is to use a paper ballot, on which the options are presented and the voter has to select one of them. After that the ballot is cast into a sealed box. Later all ballots are taken out of the sealed box and counted. The counting process can be automated, for example by means of optical scanning. All of these technologies assure that after the voter has cast his ballot, it is not possible to link the voter's identity to his ballot, which ensures privacy. In fact, incoercibility is also guaranteed, because the voter casts his ballot in a private voting booth, which implies that he cannot prove how he voted (of course, in the real world such claims are always relative to how determined the adversary is). It is important to point out that after a voter has cast his vote, it is no longer possible to remove that vote from the tally, although removal might be useful if it later turns out that the voter was not eligible to vote.

If a voter does not want to vote at his main voting location, there are procedures for absentee voting, which allow him to vote from a broader range of locations (e.g. even from home). In this case special measures must be taken to prevent the voter from voting more than once (overvoting). One possible solution is that the voter puts his ballot into a clean envelope, seals it, and then puts this envelope into a second one, on which he writes his identification. After that all envelopes from one voter must arrive at the same place, where it can be ensured that he did not vote more than once. Also, if the voter is not prevented from voting at his main voting location, it must be ensured that he did not use both voting procedures. This implies that the best place for gathering a voter's envelopes is his main voting location. If it is decided that the voter's envelope should be counted, the external envelope is removed and the inner clean envelope (which cannot be linked to the voter's identity) is put into a bigger pool of clean envelopes of other voters. After that the voters' ballots can be processed without compromising their privacy. If the voter is obliged to fill in his ballot in a private voting booth, incoercibility is also guaranteed, but if the ballot can be filled in anywhere (e.g. at home and then sent by mail), someone could have been watching how the voter filled it in.

Election results at each voting location are passed up the hierarchy to the intermediate organizers and finally to the main organizer (separately for each ballot type). At every level the information passed from the lower nodes is summed up and then passed to the parent node.

The integrity of the process is ensured by the presence of observers, whose function is to verify that everything is performed as required (votes are counted correctly, voter privacy is maintained). The distributed nature of the system ensures that violations of the voting process at some node will not poison the whole voting system and thus will have only a limited effect on the election result.

A separate problem is the compilation of the lists of people who can vote at each voting location. Probably the best solution is to have a database of all people from which lists of voters can be generated, but such a database does not always exist.
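The summation of results up the hierarchy described above can be sketched as a simple recursion. This is a minimal sketch under assumptions: the nested-dictionary representation of the hierarchy and the function name `sum_results` are hypothetical, chosen only for illustration.

```python
from collections import Counter


def sum_results(node):
    """Sum per-ballot-type option counts for one node of the hierarchy.

    A voting location (leaf) is {"counts": {ballot_type: {option: n}}}.
    An organizer (inner node) is {"children": [node, ...]}.
    """
    if "children" not in node:
        # Leaf: report the local counts, separately for each ballot type.
        return {bt: Counter(counts) for bt, counts in node["counts"].items()}
    total = {}
    for child in node["children"]:
        for bt, counts in sum_results(child).items():
            # Add the child's counts to the running sum for this ballot type.
            total.setdefault(bt, Counter()).update(counts)
    return total
```

Because every node only sums what its children report, a fault at one voting location changes the totals by at most that location's counts, which is the limited-effect property mentioned above.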

1.4 Trust

An e-voting system must be trusted by all entities whom decisions made using it may concern. In the case of a country, the system must be trusted by all citizens, the government, and the organizers themselves (hopefully), and it must also be approved by the international community.

The problem of verifying that a system corresponds to its requirements is very common in the field of software engineering. Requirements can usually be divided into the following groups:

Functional: Which functions the system should be able to perform.

Efficiency: Limitations on time, memory, and communication channel bandwidth requirements.

Dependability: To what extent one can rely (depend) on the system, including:

- Reliability: Limits on a statistical measure of the frequency of faults. Related to availability.
- Availability: Minimum acceptable percentage of time during which the system performs correctly. Related to reliability.
- Safety: Limits on the measure of loss in the case of big failures.
- Security: Which functions the system should prevent from being performed. In general, such requirements are application specific, but the most usual ones are:
    - Access Control: Policy defining which system users can perform which operations.
    - Confidentiality: Policy defining which system users are supposed to see which data.
    - Integrity: Policy defining how data present in the system can be entered, modified, and deleted.

Most requirements can be assigned meaningful numeric measures, but in practice it is almost never possible to measure them directly. Functional and efficiency requirements can be measured to some extent by testing, but dependability, and security in particular, can hardly be measured directly at all (it is also very rare that systems can be built so that their correctness can be proven formally). A typical solution is to evaluate the parameters indirectly, by measuring the quality of the process of creating and maintaining the system, which of course gives very vague results. The following aspects are usually taken into consideration:

- Whether best practices and common sense are used.
- To what extent the system has been subject to testing and formal verification.
- The presence of a continuous process of improving the system's quality.

In the case of security, important additional criteria are:

- For how long the system has withstood real or theoretical attacks by other interested parties.
- The presence of a continuous process of prevention, detection, and reaction to possible attacks.

In general, specialists proficient in one field are not able to verify systems from other fields. A general way to ensure the quality of a system is to have specialists other than those who created it review it and express an opinion. By approving the system, the verifiers become partially responsible for it themselves.

Figure 1.3 describes the components of a hypothetical e-voting system and the specialists who are responsible for them and who must be trusted to some extent.

Theory: Scientists are responsible for developing the corresponding theory, the quality of which must be ensured by peer review.

Hardware and Software: Engineers are supposed to provide hardware and software, which must be certified.

Servers: The e-voting system consists of one or more server computers, which are set up and maintained by the operators. The integrity of the servers could be ensured by the presence of observers.

E-voting system: The e-voting system itself is set up and maintained by the organizers. The integrity of the whole process could also be ensured by the presence of observers.

Network: The network connecting the servers and the voters' computing devices is set up and maintained by the providers.

Voters' computers: Voters' computing devices are set up and maintained by administrators. In many cases voters are the administrators of their own computers.

The careful reader has probably noticed that the network and the voters' computers do not have certifiers. This mostly reflects the existing situation. It is very hard to certify in any sensible way a network that spans the whole world (the Internet). Although users' computers could be certified, this is not done on a large scale at the moment. Also, in the case of a contemporary personal computer (PC) it might not be a very sensible activity, because the voter would be able to misconfigure his computer right after certification. So certification of a voter's computer makes sense only to the voter himself (and maybe to the computer's administrator), but not to the rest of the world.

[Figure 1.3: Components of hypothetical e-voting system. Components and responsible specialists: Theory (Scientists), Hardware and Software (Engineers), Servers (Operators), E-voting system (Organizers), Network (Providers), Voters' computers (Administrators).]

1.5 On Revoking Ballots

Before stating the requirements for an e-voting system, one more issue of integrating conventional elections and e-voting must be considered. Revoking ballots means removing some voters' ballots from the tally before counting it. It is useful when two or more facilities for casting a ballot are available simultaneously. There are two ways to prevent overvoting in such a situation:

- When a voter is casting his ballot, it is checked whether he has already done so using the other facility.
- After the election is over, it is checked whether someone voted using both facilities, and then this voter's ballot must be revoked from one of the two tallies.

As it is not possible to revoke ballots in conventional voting, introducing e-voting would require one of the following:

- E-voting allows revoking ballots.
- E-voting and conventional voting do not take place at the same time, but sequentially.
- A database records for each voter whether he has already voted, and an online connection is maintained between this database and all voting locations.

It is clear that the first option is the most preferable, as it requires the least resources (no database is needed) and is the most convenient (conventional election and e-voting can take place at the same time).
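The preferred option, checking after the election and revoking on the electronic side, can be sketched as follows. This is a minimal sketch under assumptions: the function name and the data layout (electronic ballots indexed by voter identifier, conventional voters as a set of identifiers) are hypothetical, and in a real electronic voting scheme the link between voter and ballot would be protected cryptographically.

```python
def revoke_double_voters(e_ballots, conventional_voters):
    """Split electronic ballots into those to count and those to revoke.

    e_ballots: dict mapping voter id -> electronic ballot.
    conventional_voters: set of ids of voters who also voted conventionally.
    Conventional ballots cannot be revoked, so the electronic one is dropped.
    """
    revoked = sorted(v for v in e_ballots if v in conventional_voters)
    kept = {v: b for v, b in e_ballots.items() if v not in conventional_voters}
    return kept, revoked
```

The list of revoked voters is returned explicitly so that it can be archived together with the result.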

1.6 E-voting Requirements

The requirements for an e-voting system can now be stated.

1.6.1 Functional Requirements

The system must support the following activities:

Organizer's use cases: The election organizer must be able to perform the following operations. Figure 1.4 illustrates the main process of the e-voting system from the viewpoint of the organizer.

- Prepare election: First, the election information must be prepared:
    - Enter election parameters: It must be possible to create a new election record and enter general parameters such as the name and the time period during which the election should take place.
    - Enter voter list: After that the voter list must be entered somehow. Besides that, some form of voter identification must be provided. This is reasonably simple if there exists a database of all voters, from which the list can be retrieved with a corresponding query. On the other hand, if voters are supposed to register for the election (as in the USA), a separate software module is required to fulfil this requirement.
    - Enter ballot types: Further, the system should support entering different ballot types and options.
    - Map voters to ballot types: Finally, the system should allow creating a mapping between voters and ballot types. Normally it would be possible to derive this mapping from supplementary information about the voters, if it is available, for example if ballot types are assigned based on where the voter lives.
- Conduct election: Once the election information has been prepared, it should be possible to conduct the election. It is conceivable that the system would start and stop the election automatically, based on the time period entered when creating the election information.
    - Start election
    - Pause election
    - Stop election
- Finish election: After the election has been conducted, its outcome must be determined.
    - Revoke voters: The system must allow the ballots of some voters to be left uncounted. This is an optional but highly desirable requirement, which was discussed in Section 1.5.
    - Compute result: After that the result must be computed: for each ballot type it must be calculated how many times each option was selected.
    - Output result: Finally, the election result must leave the system and enter the external world. One option is to print it on paper. The other option is to produce a digitally signed document.
- Archive election: As the last step, the election information must be archived to durable media.
    - Save election information: The following information must be archived:
        - Election information: voter list, ballot types, mapping between voters and ballot types.
        - Binary representation of the cast votes, or any other information based on which the election result was computed (and can be recomputed), if possible.
        - The list of voters whose ballots were revoked when computing the election result.
        - The official election result as computed before.
    - Verify archived information: It must be possible to verify that all archived information is correct, in particular that the archived election result matches the other archived information.
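The requirement that the archived result must match the other archived information can be sketched as a recomputation. This is a minimal sketch under strong assumptions: the archive layout and key names are hypothetical, and the votes are stored as plain (voter, option) pairs, whereas a real system would archive an encrypted or otherwise protected representation.

```python
def recompute_result(archive):
    """Recompute the result from archived votes, skipping revoked voters."""
    counts = {bt: {option: 0 for option in options}
              for bt, options in archive["ballot_types"].items()}
    revoked = set(archive["revoked"])
    for voter, option in archive["votes"].items():
        if voter in revoked:
            continue                      # revoked ballots are left uncounted
        ballot_type = archive["mapping"][voter]
        counts[ballot_type][option] += 1  # KeyError signals an invalid option
    return counts


def verify_archive(archive):
    """True if the official result equals the recomputed one.

    Assumes the official result lists every option, including those with 0 votes.
    """
    return recompute_result(archive) == archive["result"]
```

A mismatch does not say which part of the archive is wrong, only that the archived pieces are inconsistent with each other.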

[Figure 1.4: Voting process. Activities: Preparing election, Conducting election, Revoking voters, Computing result, Archiving.]

Voter's use cases: The voter must be able to perform the following operations:

- Select election: As there might be many elections taking place simultaneously, the voter must be able to select which election he wants to work with.
- See ballot and make a choice: After having selected an election, the voter must be able to see the options and select one of them.
- Submit ballot: Finally he must be able to submit his ballot to the system.
- Change or delete ballot: Optionally it might be allowed to change or even delete a previously submitted ballot.

Access Control: It should be possible to define, for each system operation, which users can perform it.
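The use cases and the access control requirement can be combined in a small rule table that says who may perform an operation and in which phase of the election. This is a minimal sketch under assumptions: the state names, role names, and the particular rules are hypothetical illustrations and are not prescribed by the thesis.

```python
from enum import Enum, auto


class State(Enum):
    PREPARING = auto()
    RUNNING = auto()
    PAUSED = auto()
    STOPPED = auto()
    COMPUTED = auto()
    ARCHIVED = auto()


# operation -> (role allowed, states in which it is allowed, resulting state)
RULES = {
    "start":          ("organizer", {State.PREPARING, State.PAUSED}, State.RUNNING),
    "pause":          ("organizer", {State.RUNNING}, State.PAUSED),
    "stop":           ("organizer", {State.RUNNING, State.PAUSED}, State.STOPPED),
    "revoke_voter":   ("organizer", {State.STOPPED}, State.STOPPED),
    "compute_result": ("organizer", {State.STOPPED}, State.COMPUTED),
    "archive":        ("organizer", {State.COMPUTED}, State.ARCHIVED),
    "submit_ballot":  ("voter", {State.RUNNING}, State.RUNNING),
}


def perform(state, role, operation):
    """Check role and election state, then return the next state."""
    allowed_role, allowed_states, next_state = RULES[operation]
    if role != allowed_role:
        raise PermissionError("%s may not perform %s" % (role, operation))
    if state not in allowed_states:
        raise ValueError("%s is not allowed in state %s" % (operation, state.name))
    return next_state
```

Keeping the policy in one table makes it possible to review, for each operation, which users can perform it, without reading the rest of the code.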

1.6.2 Non-functional Requirements

Convenience and Usability: The system must be convenient to use, especially for voters. This includes:

- Simplicity: Performing the voting process should be as simple as possible. The graphical representation of the ballot should be intuitive (probably it should look like a paper ballot).
- Mobility: The ability to vote from as many places and devices as possible.
- Revisability: The ability to change or delete a previously submitted vote.
- Efficiency: It should be possible to complete the voting process as fast as possible. Software delays (time during which the software is performing actions, not waiting for user input) should not be longer than (say) 1 second, as in conventional web applications. The final step of submitting the vote should not last longer than (say) 1 minute.

Efficiency: The efficiency of the voter's software has already been discussed. The efficiency of the organizer's software should be as follows:

- Election information preparation should behave like a usual application, where software delays should not be longer than (say) 1 second.
- The process of conducting elections should be efficient enough to ensure the required speed of the voter's actions.
- The final stages of the election, such as computing the result, archiving, or verifying election data, should not last more than a couple of hours.

Freedom of Choice: Besides the functional requirements, the system must satisfy the freedom of choice requirement, which was discussed in Section 1.2.

- Privacy: Is considered a minimum requirement and is necessary.
- Incoercibility: In general, incoercibility is very useful, but in the case of e-voting it can never be ensured completely: if people can vote at any location (including at home), it is possible that someone will be looking over the voter's shoulder while he votes. So it is probably enough to ensure that coercion cannot take place on a larger scale than is possible by being physically present when someone is voting.

Dependability: It is clear that an e-voting system must be much more reliable than conventional software. There are two ways in which the system might malfunction:

- Some operations cannot be performed (the voter cannot vote, the result cannot be computed).
- The system appears to work correctly, but does not (the voter's software silently selects a different option, some voters' ballots are silently omitted, the result is computed incorrectly).

The second type of fault is clearly more dangerous.

Chapter 2

System Design

In this chapter I will describe different options for implementing an e-voting system. I will also try to compare them and evaluate the risks. First, an overview of the theoretical basis will be given, and then the design of the different parts of the system will be described. It is important to stress that the design of such a system means not only software design, but also the design of the organization and the process, and possibly also of hardware.

As was already mentioned, the main problem of implementing e-voting is security. Security is almost always relative: it can be broken with some investment of resources (money and/or time). Although the benefits of breaking an e-voting system cannot be measured completely (for instance in money), it can still be argued that an adversary would not spend more resources on breaking e-voting than he would gain from breaking it. This implies that elections of different importance require different minimum levels of security.

2.1 Theoretical Basis

Before designing the real system, the e-voting problem must be analysed and solved mathematically. In fact, this problem belongs to the field of cryptography. This text gives an overview of the topic. Those interested in more details can find them in [Myr00] (see footnote 1).

Footnote 1: This implies that references to sources will only be given when necessary. The interested reader can find them in [Myr00].

2.1.1 Model of the Real World

First, a simplified model of the real world must be introduced, within which possible solutions can be described and evaluated. The following terms are introduced:

Actor An actor is any entity (person, computer) that is assumed to have an identity. Actors are able to perform polynomially bounded randomised computations and have means of keeping information secret. Actors are also assumed to have synchronized clocks: there should be an upper limit on the difference in clock value and speed.

Connection Actors are also able to communicate with each other. Messages can be sent between addresses, but not between actors (there is no identification). Communication is dependable (sent messages are always received, in the order of sending) and synchronous (the time it takes to deliver a message from sender to receiver has an upper bound). At the same time, communication is neither anonymous nor private. This means that it might be possible to find out which actor sent or received some message, and any actor might be able to read any message. I call such a connection public. In principle one could also think of private, untappable communication with reliable identification (see [Myr00]), which is actually very useful when aiming at incoercibility, but such an assumption is very hard to realize in practice (every voter would need such a communication channel), so I will omit it.

Threshold Trust Constructions are often formed out of a number of similar components (actors) so that their properties hold as long as the number of components that fail to fulfil some requirement stays within a given bound.

For instance, critical services are often constructed to be reliable as long as at least a given number of the components are working correctly and are able to communicate (equivalently, as long as no more than the remaining number of components fail). Security is often ensured as long as no more than a given number of components perform some specific operations (are "dishonest").

The motivation for this approach is that such a requirement might be simpler to ensure than the requirement that one specific component does not fail.

2.1.2 Electronic Voting Scheme

The aim of the theoretical activity is to define and design a simple construction that abstracts the real-world problem of organizing e-voting. Such a construction is called an Electronic Voting Scheme (EVS). The following types of actors are defined:

Voter An actor that will vote in the election.

Authority An actor that will organize the election.

One and the same actor can belong to more than one set. It is assumed that:

There is one ballot type with L options, which are known to all actors.

All actors know the identities of the authorities and have means of communicating with them (e.g. know the addresses at which they send and receive messages).

All actors know the identities of all voters.

Some of the voters have selected one option and intend to cast it.

An electronic voting scheme is a set of protocols (algorithms) for actors, which allow voters to send their ballot with the selected option to the authorities, and allow the authorities to compute the election result and make it available to all actors. An EVS must satisfy the following requirements:

Correctness The election result must be computed correctly based on all ballots submitted by the voters.

Freedom of Choice Either privacy or incoercibility, as discussed previously.

It is very typical to require an EVS to be verifiable in order to ensure correctness: the computation results of the authorities must be verifiable by any actor. This usually means that the authorities must provide a computational "proof" of correctness of the election result. Each EVS should define:

The means of actor identification.

To what extent actors must be trusted to perform according to the prescribed protocols. Basically, there are two options:

– Assume that the actor is "honest", i.e. follows the protocols.

– Assume threshold trust towards actors (usually applied only to authorities).

2.1.3 Public Key Infrastructure

A situation in which actors need to communicate secretly and with reliable identification, but can use only a public communication channel without reliable identification, is very typical. In order to address it, a framework called Public Key Infrastructure (PKI) has been developed. The basis for PKI is public key cryptography, which defines a set of interfaces with special properties that must be satisfied by corresponding implementations. These interfaces are depicted in Figure 2.1.

[Figure 2.1: Encryption and signature algorithms. UML class diagram: a Key Pair consists of one Public Key and one Private Key (each with a binary attribute); the interfaces are Key Generator (Generate()), Digest Algorithm (Compute()), Encryption Algorithm (Encrypt(), Decrypt()), and Signature Algorithm (Sign(), Verify()).]

Digest Algorithm::Compute() takes as input any binary sequence and produces a "hash" of fixed length (say 128 bits). The digest algorithm is supposed to be collision resistant: it should be infeasible to find two different binary sequences having the same hash.

Key Generator::Generate() is an algorithm that randomly generates a pair of keys of a specified bit length. These keys are used as inputs to the Encryption Algorithm and Signature Algorithm methods.

Encryption Algorithm::Encrypt() takes as input a Public Key and a binary sequence of any length (called plaintext) and produces a binary sequence of comparable length (called ciphertext). Encryption Algorithm::Decrypt() takes as input a Private Key and performs the reverse transformation from ciphertext to plaintext (which was encrypted using the public key from the same key pair). It is required that an actor who knows the public key (but not the private key), some number of ciphertexts, and the corresponding plaintexts is not able to derive any information about the plaintext corresponding to some other ciphertext.

Signature Algorithm::Sign() takes as input a Private Key and a binary sequence of fixed length and produces another binary sequence of fixed length called a signature. The length of the input and output sequences is comparable to the length of the key. Signature Algorithm::Verify() takes as input a Public Key, a binary sequence, and a signature, and checks that this signature was produced from that binary sequence with the corresponding private key. It is required that an actor who knows the public key (but not the private key), some number of binary sequences, and their corresponding signatures is not able to produce a signature for any other binary sequence. In order to sign binary sequences of any length, a digest is computed from them and then signed.
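The last point, signing a digest instead of the message itself, can be sketched as follows. This is a minimal illustration only, assuming textbook RSA as the Signature Algorithm and SHA-256 as the Digest Algorithm; neither choice comes from the text, the function names are hypothetical, and the key is built from two well-known small primes, so it offers no security.

```python
import hashlib

# Toy RSA parameters for illustration only: both primes are small, well-known
# Mersenne primes, so this key offers no security whatsoever.
P = 2**61 - 1
Q = 2**31 - 1
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def digest(message: bytes) -> int:
    """Map a message of any length to a fixed-size value (reduced mod N here)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sign the digest of the message with the private exponent."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Check the signature against the digest using only the public key."""
    return pow(signature, E, N) == digest(message)
```

Because only the fixed-length digest enters the signing operation, the length of the message no longer matters, which is the property the text relies on.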

There exist cryptographic algorithms that satisfy these interfaces, provided that actors are assumed to have only polynomially limited computational power (see footnote 2). Now, if two actors each generate a key pair, exchange their public keys somehow, and keep their private keys secret, they gain the ability to communicate secretly and with identification: when one actor (the sender) wants to send a message to the other (the receiver), the sender signs it with his own private key, encrypts it with the receiver's public key, and then sends the resulting message to the receiver. No other actor who might learn the resulting message, but does not know the receiver's private key, can derive any information about the original message. At the same time, the receiver is able to verify the sender's signature, which could have been produced only by some actor knowing the sender's private key (which is supposed to be kept secret). Some encryption algorithms have an interesting property: an actor is able to prove to someone else, who knows the message as transmitted, that he sent a given message, without revealing any information about his private key.

Despite the beauty of this solution, there is one problem: the actors need to exchange their public keys somehow. This cannot be done over the public connection, because it is not possible to establish whom a public key came from. In fact, in such a setting this problem is not solvable at all: at some moment, communication with reliable identification is necessary. So at best one may require such communication to take place only once.

Footnote 2: It should be stressed that there also exist many variations that do not exactly fit into this description. Also, not all encryption and signature algorithms have a complementary counterpart in the sense of sharing the same key pair. This means that there exist encryption algorithms without a complementary signature algorithm that could use the same key pair, and vice versa.

PKI provides the following construction to deal with this problem:

All actors are assumed to have an identity, which has a unique identifier that can be represented as a binary string.

A special construction called a certificate is introduced. It consists of:

– The subject's public key (in binary representation).

– The subject's identity (the binary representation of its identifier).

– Some optional attributes, explained later.

– The issuer's identity.

– A signature over all preceding items, verifiable with the issuer's public key.

A certificate is interpreted as a statement that binds the contained public key to the subject's identity. If someone holding such a certificate has reasons to trust the issuer of the certificate and knows the issuer's public key, he has reason to believe that the specified public key belongs to the subject.

A certificate may allow (trust) or forbid (not trust) the subject to issue certificates himself. An actor who is trusted to issue certificates is called a Certification Authority (CA). A certificate may also be limited to some field of activity. Such information can be recorded in the attributes of the certificate.

If an actor has some number of certificates, one can think of a graph in which nodes are identities and arcs signify certificates connecting the issuer to the subject. Besides that, some nodes have associated certificates, which are assumed to belong to the corresponding identities. Different nodes, certificates, and arcs can have different levels of trustworthiness. Derivations can then be made on this graph (transitive closure).

The simplest form of such a graph is a tree: there is one root CA, which might certify some number of intermediate CAs, which finally certify all interested actors. It is assumed that everybody trusts the root CA.
In order to get into this framework, one would have to prove his identity to one of the CAs (which cannot be done within our model and so must be done externally) and provide his public key. After that, a certificate would be generated that would be trusted by everybody.

It is important to point out that this whole framework holds only as long as an actor is willing to keep his private key secret. Nothing prevents him from revealing it to someone else. Also, in real life someone might steal someone else's private key, and for this reason CAs are supposed to provide means of checking whether a certificate is still valid. This can be accomplished, for instance, by providing a certificate database (containing either valid, revoked, or both kinds of certificates) which can be queried online, or by periodically publishing certificate revocation lists (CRLs).
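The derivation on the certificate graph can be sketched as a simple traversal from the root CA. The sketch below is an assumption-laden illustration: the `Certificate` record and the `may_issue` attribute are hypothetical names, signature verification and revocation checks are assumed to have been done beforehand, and trust is treated as all-or-nothing rather than graded.

```python
from collections import deque
from typing import List, NamedTuple, Set


class Certificate(NamedTuple):
    issuer: str       # identity of the issuer
    subject: str      # identity the public key is bound to
    may_issue: bool   # attribute: is the subject trusted to act as a CA?


def trusted_subjects(root: str, certificates: List[Certificate]) -> Set[str]:
    """Return the identities whose key binding can be derived from the root CA."""
    trusted = {root}
    issuers = deque([root])        # identities allowed to issue certificates
    seen_issuers = {root}
    while issuers:
        issuer = issuers.popleft()
        for cert in certificates:
            if cert.issuer != issuer:
                continue
            trusted.add(cert.subject)
            # Only subjects certified as CAs extend the chain of trust.
            if cert.may_issue and cert.subject not in seen_issuers:
                seen_issuers.add(cert.subject)
                issuers.append(cert.subject)
    return trusted
```

A certificate issued by an actor who is not itself certified as a CA contributes nothing to the result, which mirrors the allow/forbid attribute described above.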

2.1.4 Time-stamping

Time-stamping is a service complementary to PKI, which allows binding an arbitrary message to a moment in time. This is done by creating an additional time certificate message. At least two flavours of time-stamping exist:

Absolute Allows determining a reasonably small interval within which the message received its time certificate. Such a construction is of use only when the actors' clocks are synchronous.

Relative Allows determining, for any two messages having time certificates, which of them received its time certificate earlier.

A time-stamping service (TSS) is supposed to be implemented by one trusted actor or by several actors with threshold trust. Actors forming such a service are called Time Stamping Authorities (TSAs). Ideally, time certificates should allow comparing them without the need to contact the TSS (offline verification).

Observing the time certificate of a message proves that the message was created before the moment in time associated with this certificate. If a signed message incorporates the time certificate of any message (e.g. an empty one), one can conclude that the signed message was created after the moment in time associated with that certificate.

One of the most important applications of time-stamping arises when someone's certificate is revoked (e.g. due to a private key leak). In such a situation, a time certificate can be used to prove that a message was signed before the certificate was revoked.
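Relative time-stamping can be illustrated with a hash-linked log kept by a single trusted TSA. This is only one possible construction and is an assumption of this sketch, not something prescribed by the text; the class and function names are hypothetical, and the TSA's signature on each certificate is omitted.

```python
import hashlib
from typing import List, NamedTuple


class TimeCertificate(NamedTuple):
    index: int          # position in the TSA's log
    message_hash: str   # digest of the time-stamped message
    link: str           # digest chaining this entry to all earlier ones


class LinkingTSA:
    """Hypothetical single trusted TSA issuing hash-linked time certificates."""

    def __init__(self) -> None:
        self.log: List[TimeCertificate] = []

    def stamp(self, message: bytes) -> TimeCertificate:
        previous = self.log[-1].link if self.log else ""
        message_hash = hashlib.sha256(message).hexdigest()
        link = hashlib.sha256((previous + message_hash).encode()).hexdigest()
        cert = TimeCertificate(len(self.log), message_hash, link)
        self.log.append(cert)
        return cert


def stamped_earlier(a: TimeCertificate, b: TimeCertificate) -> bool:
    """Relative time-stamping: did `a` receive its certificate before `b`?"""
    return a.index < b.index


def chain_is_consistent(log: List[TimeCertificate]) -> bool:
    """Recompute every link; rewriting any earlier entry breaks the chain."""
    previous = ""
    for position, cert in enumerate(log):
        expected = hashlib.sha256((previous + cert.message_hash).encode()).hexdigest()
        if cert.index != position or cert.link != expected:
            return False
        previous = cert.link
    return True
```

Since each link depends on all earlier entries, the order of two certificates can be checked from the published log alone, which is in the spirit of the offline verification mentioned above.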

2.1.5 Bulletin Board

Often a service called a Bulletin Board (BB) is assumed to exist. It is supposed to allow each actor to send signed messages to it. After that, every actor should eventually be able to see all messages in the order in which they were sent. The sender of a message should know for certain whether sending the message succeeded. These requirements are called atomicity. Usually, events are also defined at which the bulletin board starts and stops receiving messages. If the ordering of messages is not important, the construction is called a reliable store.

The dynamics of a bulletin board can be described by latency (how long it takes after a message was sent for it to become readable by everyone) and monotonicity (the guarantee that if someone has seen a message on the bulletin board, then at each successive read this message will be visible to every reader). Monotonicity for one specific reader could be called read repeatability.

For the purpose of proving time-outs (in order to accuse some actor of not participating) it is very desirable that the bulletin board be able to tell with reasonable precision at which moment a message was sent. This property could be called absolute time-stamping (as opposed to the relative ordering that the bulletin board provides in any case).

In a modification of the bulletin board called atomic multicast, the service is supposed to forward messages to some number of subscribers. Note that this is not the same as an actor sending a message to a group of actors himself because, for instance, there is then no guarantee that one and the same message is sent to everybody. If the ordering of messages is not important, the construction is called reliable multicast. It should be clear that atomic multicast can be implemented with a bulletin board by polling it periodically, although this might not be as efficient.

A simplification of the bulletin board (and of atomic multicast) is to maintain a separate single-writer, multiple-reader bulletin board for each actor. In this situation the ordering of messages can be determined for each actor separately. Even in such a setting, one actor can prove that his message was sent after some other message by some other actor by including the hash of the latter in the former.

It is relatively simple to implement a bulletin board if there is one actor whom everybody trusts. Otherwise, a system consisting of multiple actors with threshold trust must be devised.

It should be clear that full-blown time-stamping exists iff a full-blown bulletin board exists. Still, from the viewpoint of efficiency they have different profiles: a bulletin board requires much more storage, while time-stamping might be required to process many more messages and to exist for a longer period of time.

For the purpose of e-voting it is enough to have a single-writer bulletin board for each actor with reasonable latency, repeatable read, and absolute time-stamping. Monotonicity is desirable, but is not a requirement. The number of messages on each bulletin board would be fairly small.
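The single-writer variant, including the trick of proving ordering across boards by embedding a hash, can be sketched as below. It is a minimal in-memory illustration assuming one trusted board operator; signatures, latency, and persistence are left out, and all names are hypothetical.

```python
import hashlib
from typing import Dict, List, NamedTuple, Optional


class Post(NamedTuple):
    writer: str
    body: bytes
    posted_at: float            # absolute time-stamp assigned by the board
    refers_to: Optional[str]    # hash of a post on some other board, if any


def post_hash(post: Post) -> str:
    return hashlib.sha256(repr(post).encode()).hexdigest()


class BulletinBoards:
    """One append-only, single-writer board per actor (trusted operator assumed)."""

    def __init__(self) -> None:
        self._boards: Dict[str, List[Post]] = {}

    def append(self, writer: str, body: bytes, now: float,
               refers_to: Optional[str] = None) -> Post:
        post = Post(writer, body, now, refers_to)
        self._boards.setdefault(writer, []).append(post)
        return post

    def read(self, writer: str) -> List[Post]:
        # Return a copy so that readers cannot alter the board.
        return list(self._boards.get(writer, []))


def sent_after(later: Post, earlier: Post) -> bool:
    """A post embedding the hash of another post must have been written after it."""
    return later.refers_to == post_hash(earlier)
```

Because posts are only ever appended, repeated reads of the same board return a growing prefix-consistent list, which corresponds to the repeatable read property required above.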

2.1.6 Threshold Encryption and Signature

Conventional encryption and signature algorithms can be extended in such a way that the private key is split into parts (shares), which are given to different actors, so that decryption and signing can be performed only if at least a threshold number of share holders decide to do so. Such a construction is very useful in the context of threshold trust. Figure 2.2 depicts the relevant constructions.

[Figure 2.2: Threshold encryption and signature algorithms. UML class diagram: Public Key, Private Key, and Private Key Share (each with a binary attribute); Public Output (binary attribute, Verify()); the interfaces Threshold Key Generator (Generate(), Reconstruct Public Key(), Reconstruct Private Key()), Threshold Encryption Algorithm (Encrypt(), Partial Decrypt(), Reconstruct()), and Threshold Signature Algorithm (Partial Sign(), Reconstruct(), Verify()); Partial Decryption and Partial Signature (each with binary, proof of correctness, and Verify()).]

It is assumed that each of the participating actors has his own private key and that the corresponding certificate is available to all others. It is also assumed that the actors communicate through the bulletin board (or, even better, with atomic multicast).

Each of the actors is supposed to execute Threshold Key Generator::Generate(), which:

– Takes as input the actor's key pair.

– Outputs a Private Key Share (supposed to be kept secret) and also a Public Output with the actor's signature (supposed to be made available to everyone).

Public Outputs can be verified with the Verify() method. Those that do not pass this verification should be ignored from then on.

Threshold Key Generator::Reconstruct Public Key() can construct a Public Key based on the Public Outputs present on the bulletin board. The public key can be used by the conventional Encryption Algorithm to encrypt and by the conventional Signature Algorithm to verify a signature.

In principle, a conventional Private Key can be reconstructed with the help of Threshold Key Generator::Reconstruct Private Key(), although this is not usually done.

Threshold Encryption Algorithm::Encrypt() takes as input a Public Key and works exactly like the conventional Encryption Algorithm.

In order to decrypt a ciphertext, a threshold number of actors must each execute Threshold Encryption Algorithm::Partial Decrypt(), which:

– Takes as input the ciphertext and the actor's Private Key Share.

– Produces a Partial Decryption.

A Partial Decryption consists of:

– Binary information.

– A computational proof of correctness, which can be verified with Verify(), which takes as input the Public Outputs present on the bulletin board. Those partial decryptions that do not pass this verification should be ignored from then on.

Finally, the plaintext can be reconstructed from the Partial Decryptions of a threshold number of actors with the help of Threshold Encryption Algorithm::Reconstruct().

In order to sign a plaintext, a threshold number of actors must each execute Threshold Signature Algorithm::Partial Sign(), which:

– Takes as input the plaintext and the actor's Private Key Share.

– Produces a Partial Signature.

A Partial Signature consists of:

– Binary information.

– A computational proof of correctness, which can be verified with Verify(), which takes as input the Public Outputs present on the bulletin board. Those partial signatures that do not pass this verification should be ignored from then on.

Finally, the signature can be constructed from the Partial Signatures of a threshold number of actors with the help of Threshold Signature Algorithm::Reconstruct().

Threshold Signature Algorithm::Verify() takes as input a Public Key and works exactly like the conventional Signature Algorithm.

It is important to point out that:

Both threshold decryption and threshold signature operations can be performed only by a threshold number of actors or more, or by all actors whose Public Output passes Verify() (which is relevant if their number is less than the threshold).

The result of a decryption operation cannot be incorrect, owing to the proofs of correctness of the partial decryptions (the result of a signature operation can always be verified directly). This can also be used in the case
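The idea of splitting a private key into shares, any threshold number of which suffice, can be illustrated with Shamir's secret sharing. This is a different and much simpler technique than the threshold schemes described above: a dealer knows the whole secret, there are no Public Outputs or proofs of correctness, and the secret is reconstructed explicitly instead of being used through partial operations. The function names and the field modulus are assumptions of this sketch.

```python
import secrets
from typing import List, Tuple

PRIME = 2**127 - 1  # field modulus (a Mersenne prime); secrets must be smaller


def split(secret: int, threshold: int, shares: int) -> List[Tuple[int, int]]:
    """Split `secret` so that any `threshold` of the `shares` points recover it."""
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coefficients = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        result = 0
        for coefficient in reversed(coefficients):   # Horner's rule
            result = (result * x + coefficient) % PRIME
        return result

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def reconstruct(points: List[Tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 recovers the constant term."""
    secret = 0
    for i, (x_i, y_i) in enumerate(points):
        numerator, denominator = 1, 1
        for j, (x_j, _) in enumerate(points):
            if i != j:
                numerator = (numerator * -x_j) % PRIME
                denominator = (denominator * (x_i - x_j)) % PRIME
        # Division in the field is multiplication by the modular inverse.
        inverse = pow(denominator, PRIME - 2, PRIME)
        secret = (secret + y_i * numerator * inverse) % PRIME
    return secret
```

For example, `split(key, threshold=3, shares=5)` yields five shares of which any three passed to `reconstruct` return `key`; fewer than three shares leave the secret undetermined.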

2.1.7 Implementations of EVS

In this section I will outline possible implementations of an electronic voting scheme. The simplest solution is the Single Trusted Authority Solution, in which there is one authority (actor) whom everybody trusts. In this case every voter signs a representation of his ballot with his private key, encrypts it with the authority's public key, and sends it to the authority. Later the authority decrypts these votes, verifies the signatures, computes the result, and makes the signed result available to all others. In such a situation the authority is trusted to:

Accept ballots from all voters (see footnote 3).

Count them correctly.

Not use single decrypted ballots (intermediate results of the computation) in any other operation, and discard them.

Footnote 3: Multiple ballots could be accepted and the latest of them used, which would allow modifying a vote. Also, there could be a ballot of a special form that requires the authority not to count it, which would enable a voter to delete a previously submitted ballot.

Despite the naivety of this solution, it can be made reasonably safe in real life.

There exist many other more or less secure solutions, but the best of them (such statements are always subjective) follow a pattern that is usually called the Multiple Authority Solution. The algorithms and data structures involved are depicted in Figure 2.3.

[Figure 2.3: Multiple authority EVS. UML class diagram: the interfaces Voter's Algorithm (Generate Ballot()), Authority's Algorithm (Setup(), Compute()), and Consumer's Algorithm (Compute Result(), Verify Result()); Voter's Ballot (encrypted information, proof of correctness, signature, Verify()); Authority's Setup Information (information, signature, Verify()); Authority's Output (computation result, proof of correctness, signature, Verify()).]

It is assumed that there are several authorities and that each actor has a pair of public and private keys and the certificates of all other actors. It is also assumed that actors communicate through the bulletin board (it would be good if the authorities could communicate through atomic multicast).

First, the authorities jointly execute a setup phase, during which each authority must execute the algorithm Authority's Algorithm::Setup(), which:

– Takes as input the authority's key pair.

– Might communicate with other authorities.

– Outputs a piece of private information that the authority should keep secret, and also public Authority's Setup Information with the authority's signature.

The correctness of the setup information can be verified with Authority's Setup Information::Verify(). Setup information that does not pass verification should be ignored from then on. The resulting setup information must be available to all actors.

Further, during the voting phase each voter can generate his ballot using Voter's Algorithm::Generate Ballot(), which:

– Takes as input the voter's key pair, the selected option, and the Authority's Setup Information from all authorities.

– Outputs a Voter's Ballot.

A Voter's Ballot consists of:

– Encrypted information about the voter's choice.

– A computational proof of correctness, which can be used to check that the encrypted information was formed correctly without the need to decrypt it.

– The voter's signature.

A voter's ballot can be verified with Voter's Ballot::Verify(). Ballots that do not pass this verification should be ignored from then on. All voters' ballots should be made available to all actors.

After that, during the tallying phase each authority executes Authority's Algorithm::Compute(), which:

– Takes as input the authority's key pair, the private information generated during the setup phase, the Authority's Setup Information from all authorities, and the voters' ballots.

– Verifies the voters' ballots using Voter's Ballot::Verify() and selects one correct ballot for each eligible voter. All authorities must produce exactly the same list of ballots.

– Produces an Authority's Output.

An Authority's Output consists of:

– A computation result, explained later.

– A proof of correctness of the computation result.

– The authority's signature.

An authority's output can be verified with Authority's Output::Verify(). Outputs that do not pass verification should not be used further.

Finally, every actor can execute Consumer's Algorithm::Compute Result(), which:

– Takes as input the Authority's Setup Information from all authorities, and the Authority's Output from exactly a threshold number of different authorities.

– Computes the election result, telling how many times each option was selected. This step can take a long time.

This result must be verified with Consumer's Algorithm::Verify Result(), which:

– Takes as input the Authority's Setup Information from all authorities, the voters' ballots, the Authority's Output from exactly a threshold number of different authorities, and the computed result.

– Verifies the voters' ballots using Voter's Ballot::Verify() and selects one correct ballot for each eligible voter. The selected ballots must be exactly the same as those selected by the authorities.

– Verifies the Authority's Output of those authorities with the Verify() method.

– Verifies that the election result was computed correctly from the Authority's Output of those authorities.

Result verification is remarkably faster than result computation.
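The step "selects one correct ballot for each eligible voter" has to be deterministic, since all authorities and every verifier must arrive at exactly the same list. A minimal sketch follows, assuming the latest-ballot policy and the special "do not count" ballot mentioned in footnote 3; the `PostedBallot` record, the use of an empty payload as the withdrawal marker, and the tie-breaking rule are hypothetical choices made for illustration.

```python
from typing import Dict, Iterable, NamedTuple, Set


class PostedBallot(NamedTuple):
    voter: str
    posted_at: float    # time-stamp assigned by the bulletin board
    payload: bytes      # encrypted choice; an empty payload models a withdrawal
    valid: bool         # outcome of Voter's Ballot::Verify(), assumed precomputed


def select_ballots(posted: Iterable[PostedBallot],
                   eligible: Set[str]) -> Dict[str, PostedBallot]:
    """Pick the latest valid ballot of each eligible voter, deterministically."""
    selected: Dict[str, PostedBallot] = {}
    for ballot in posted:
        if not ballot.valid or ballot.voter not in eligible:
            continue
        current = selected.get(ballot.voter)
        # Ties on the time-stamp are broken by payload so that the outcome
        # does not depend on the order in which ballots are read.
        if current is None or (ballot.posted_at, ballot.payload) > (
                current.posted_at, current.payload):
            selected[ballot.voter] = ballot
    # A withdrawn vote (empty payload) is dropped from the tally input.
    return {voter: b for voter, b in selected.items() if b.payload}
```

Because the rule depends only on the board contents and not on reading order, two parties applying it to the same bulletin board obtain identical selections.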


It is of crucial importance that every actor sees the same Authority's Setup Information from every authority and the same Voter's Ballot from every voter (which might not be the case if some actor could send different information to different actors). This condition implies that all information should be posted to the bulletin board. It is important to point out that even if this condition does not hold, a wrong election result cannot be computed, because some verification would fail.

As voters' ballots are sent to the bulletin board, it is possible to allow voters to send multiple ballots, of which the latest would be selected. There could also be a ballot of a separate form that requires the authorities not to count it. This would allow deleting a previously cast vote, although this might not be secret.

The following also holds:

The election result can be computed as long as there are at least a threshold number of authorities that follow the algorithms.

The election result can never be computed incorrectly (to be more precise, this can happen only with negligibly low probability).

Information on the choices of individual voters can be extracted if at least a threshold number of authorities that produced correct Authority's Setup Information decide to do so. All authorities that produced correct Authority's Setup Information can together do the same (this is relevant when the number of such authorities is less than the threshold).

All this implies that no actor should proceed further if the number of authorities that produced correct Authority's Setup Information is less than the threshold.

The careful reader has probably already noticed the similarity of this construction to threshold encryption and signature. An example of such an EVS would be variations of [CGS97] with shared public key (threshold) generation in the setup phase (see [Myr00]). Another option is, for instance, [CFSY96].
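How an election result can be computed from encrypted ballots without decrypting any single ballot can be sketched with additively homomorphic ("exponential") ElGamal, in the spirit of homomorphic schemes such as [CGS97]. The sketch makes strong simplifying assumptions that are not part of the text: a single key holder instead of threshold authorities, yes/no votes only, no proofs of correctness or signatures, and toy group parameters that are far too small to be secure.

```python
import secrets
from typing import List, Tuple

# Toy group parameters (far too small for real use): P = 2*Q + 1 with Q prime,
# and G = 4 generates the subgroup of order Q.
P, Q, G = 467, 233, 4

Ciphertext = Tuple[int, int]


def keygen() -> Tuple[int, int]:
    private = secrets.randbelow(Q - 1) + 1
    return private, pow(G, private, P)


def encrypt(public: int, vote: int) -> Ciphertext:
    """Encrypt a vote (0 or 1) as (G^r, public^r * G^vote)."""
    r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P), (pow(public, r, P) * pow(G, vote, P)) % P


def combine(ballots: List[Ciphertext]) -> Ciphertext:
    """Multiplying ciphertexts component-wise adds the underlying votes."""
    a, b = 1, 1
    for c1, c2 in ballots:
        a, b = (a * c1) % P, (b * c2) % P
    return a, b


def decrypt_tally(private: int, total: Ciphertext, max_votes: int) -> int:
    c1, c2 = total
    # c1 has order dividing Q, so c1^(Q - private) is the inverse of c1^private.
    g_sum = (c2 * pow(c1, Q - private, P)) % P
    # Recover the exponent by exhaustive search; this is the slow step.
    for count in range(max_votes + 1):
        if pow(G, count, P) == g_sum:
            return count
    raise ValueError("tally out of range")
```

Only the product of all ballots is ever decrypted, so individual choices stay hidden from anyone who sees just the tally step. The exhaustive search in `decrypt_tally` also hints at why result computation can take much longer than verifying a claimed result.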

2.1.8 On the Freedom of Choice

As already mentioned, the freedom of choice requirement implies either privacy or incoercibility, where the former is obligatory and the latter is optional. All patterns for EVS described in Section 2.1.7 ensure privacy, but do not guarantee incoercibility.

The simplest way for a voter to prove how he voted is to reveal his private key. Fortunately, in real life private keys are associated with the voter's identity and might be used to access different services such as banks, which implies that an actor would have no motivation to reveal his private key. But this does not solve the problem, because it is possible to prove how a voter voted without revealing the private key (the private key is only used to sign). See [Myr00] for more details.

Generally, in an environment with only public (tappable) communication, incoercibility is provably not possible. If PKI is introduced and it is assumed that the voter does not want to reveal any information about his private key, it might be possible to devise an EVS that would satisfy incoercibility, but I am not aware of any such construction.

Another option is based on the observation that incoercibility is not achievable because the voter can see intermediate computation results and is able to sign any message. This leads to a solution with a specialized device, which would ask the voter which option to use, perform all computations, sign the result, and then pass it over to the voter, so that he would not be able to see intermediate results and would not be able to sign arbitrary messages. Unfortunately, it is very hard to imagine that such a solution would justify itself economically.

2.2 Designing Framework

Electronic voting schemes are built upon the real world model and on supporting services like PKI and the bulletin board. In this section I will consider options for implementing these prerequisites. In short, they can be called computer and network security. The bulletin board will be covered in a separate section because it is more application-specific. Once such a framework is implemented, one can concentrate on the essential problems of implementing e-voting.

2.2.1 Real World Model

Previously, a real world abstraction was introduced in terms of actors capable of performing randomised computations. In reality, actors are often represented by people (or even organizations) who use their computing devices to perform computations. Software must also be written for these devices. For this reason it seems sensible to introduce the following requirements:

Computing Device The computing device must be:

Correct It must execute the instructions of the provided software correctly.

Untappable The computing device's memory should not be readable by anybody other than the device's owner. This includes both volatile and permanent memory.

Randomised The computing device should provide a source of random bits for use in randomised cryptographic algorithms.

Synchronized Clock All computing devices should have synchronized clocks.

Software The software used must be correct: it must correspond to the algorithms and specifications. Besides that, most people do not write their software on their own. This implies that correct software must somehow be delivered to the computing device.

2.2.2 Computing Device In general, computing devices are designed to be correct (and lately untappability has become a widely accepted requirement). In order to prevent computer from being secure the following must be done: Gain access to the computer either virtually or physically 4 . Execute software instructions. Computer cannot be viewed without software running on it (OS, application software, shared libraries), which might have occasional or intentional bugs, which enable external entities to manipulate computer, including executing any software instructions and inspecting permanent or volatile memory. In the context of contemporary personal computer, the following problems exist: General purpose software is written with emphasis on functional requirements and not so much dependability (because the former, not the latter gives profit, unless the latter is critical). As a result, contemporary software systems are flooded with security vulnerabilities. Another problem related to contemporary ways of software distribution: lots of programs are installed, and usually installation programs have full access to the system and can modify any feature of it, including introducing backdoors for unauthorized access to the system from outside. Although software firms might not have motivation to do such things themselves, it is enough to have one of their employees to do that. Many people can gain virtual or physical access to the computer. In general the following can be done to ensure computing device to be correct and untappable: Use minimal, fixed, verified set of necessary software (including operating system). 4

There also exist ideas how to wiretap computer from distance by measuring magnetic field, etc.

36

Use refined access control system, which gives minimum needed rights to the installed software by default. Limit access (including physical) to only relevant personnel. It is generally agreed that contemporary PC is quite insecure (not enough correct and untappable). At the same it is probably possible to create secure computing device, because most of the problems exist due to historical reasons or lack of economical motivation. Also, the smaller and more specialized the system is, the simpler it is to make it secure. A clear borderline should be drawn between the limited number of computers used by organizers and almost unlimited number of computers used by the voters. The amount of resource that can be invested into securing organizers’ computers is orders of magnitude higher than for voters’ computers, which should generally be used ”as is”. What regards securing voters’ computers, two trends should be mentioned. First, recently multiple different handheld mobile devices have become affordable to large masses. Designing such device from scratch gives a good opportunity to implement robust security from the ground up. Also some devices could be created with fixed set of preinstalled unmodifiable software, which would increase security of such device a lot. In practice such devices are not principally more secure than usual PCs. Another idea is to have a tamperproof device (called smartcard) having a processor and memory chip, but otherwise being not self-contained, and keep there some well managed (even better, fixed) software and data (secrets). The latter could be in secret even from the owner of the card himself (e.g. private key). Smartcards have an interface through which other devices can communicate with them. Smartcards are activated by entering PIN code, which is usually passed through the device to which smartcard is attached. 
Smartcards certainly have their own security threats (see [Sch99]). The most notable is that, once the PIN code has been entered, the device to which the smartcard is attached can manipulate the card in an uncontrolled way (sign any messages) and can also reveal the PIN to anyone else. This implies that the device to which the smartcard is attached must itself be rather secure. Another problem is that most existing computers are not equipped with smartcard readers, and it will probably take a long time before they become widely adopted.

Besides being correct and untappable, the computing device has to be randomised. Randomness can be obtained from a special physical device or from a cryptographic primitive (algorithm) called a pseudorandom bit generator. The latter must still be seeded with a small piece of random information, which is usually collected from the behaviour of the computing device, which in turn depends on the (unpredictable) actions performed by the user.
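The two sources of randomness mentioned above can be illustrated with a small Python sketch. It is an illustration under assumptions, not a recommended generator: `fresh_nonce` and `expand_seed` are hypothetical names, the operating system's generator (reached through the standard `secrets` module) stands in for the seed collected from device behaviour, and the SHA-256 counter construction only shows how a short seed determines a longer output.

```python
import hashlib
import secrets


def fresh_nonce(num_bytes: int = 32) -> bytes:
    """Return random bytes from the operating system's generator,
    which is itself seeded from unpredictable device events."""
    return secrets.token_bytes(num_bytes)


def expand_seed(seed: bytes, num_bytes: int) -> bytes:
    """Toy pseudorandom bit generator: stretch a short seed by hashing
    seed || counter with SHA-256. Illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < num_bytes:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:num_bytes])


seed = fresh_nonce(16)             # the small random piece of information
stream = expand_seed(seed, 100)    # longer output fully determined by the seed
```

The output of `expand_seed` is only as unpredictable as its seed, which is why the quality of the seed matters.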

The best that can be done to keep computing devices' clocks synchronous is to periodically use the Network Time Protocol ([NTP]) to synchronize them with some time server's clock. Simple NTP (SNTP), suitable for ordinary computers, provides an accuracy of 1 second, which should be sufficient. Care must also be taken to avoid bugs when dealing with time zones.
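One common way to avoid time zone bugs is to keep every timestamp with an explicit offset and to compare instants rather than wall-clock readings. The following Python sketch shows this; the function names and the polling-window scenario are illustrative assumptions, not part of any described system.

```python
from datetime import datetime, timedelta, timezone


def now_utc() -> datetime:
    """Current time as a time-zone-aware UTC timestamp."""
    return datetime.now(timezone.utc)


def within_poll_window(ts: datetime, opens: datetime, closes: datetime) -> bool:
    """Check opens <= ts < closes, refusing timestamps without an offset."""
    for value in (ts, opens, closes):
        if value.utcoffset() is None:
            raise ValueError("naive timestamp: the time zone must be explicit")
    # Aware datetimes are compared as instants, whatever their offsets are.
    return opens <= ts < closes


opens = datetime(2001, 6, 1, 9, 0, tzinfo=timezone.utc)
closes = datetime(2001, 6, 1, 17, 0, tzinfo=timezone.utc)
local = timezone(timedelta(hours=3))  # assumed local offset of a voter's computer
ballot_time = datetime(2001, 6, 1, 12, 30, tzinfo=local)  # 09:30 UTC
accepted = within_poll_window(ballot_time, opens, closes)
```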

2.2.3 Software

Software correctness follows directly from the quality of the software development process. In order to increase trust in software, the development process, source code, and supplementary documentation must be reviewed and certified by external trusted parties. Once there is trusted source code from which the software can be built, it must be delivered to and deployed on the computing device. A problem arises here: the fact that the source code was certified does not in any way imply that the binary distributable received was built from that source. The solution is to require the software publisher and the certifiers to sign the distributable and in this way express their trust in the software with respect to a specified purpose. In this case each certifier would need to receive the source code, inspect it, build the binary distributable, and finally sign it. This implies that all certifiers must be able to produce exactly the same distributable, which means that the build tool (compiler) must be deterministic (which build tools hopefully are, or at least can be made quite easily). It is natural to expect that a framework for expressing trust by signing a binary distributable should be a part of PKI. We will see later to what extent this is supported now. Another option is to require certifiers to sign the source code, distribute it, and expect end-users to build it themselves, which is quite unrealistic and also time-consuming.
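The requirement that all certifiers produce exactly the same distributable can be checked by comparing cryptographic digests of their builds. The sketch below is a minimal illustration: the function names and certifier labels are hypothetical, SHA-256 is an assumed choice of hash, and the signing step itself is left out.

```python
import hashlib
from typing import Dict, List


def digest(distributable: bytes) -> str:
    """SHA-256 digest of a binary distributable."""
    return hashlib.sha256(distributable).hexdigest()


def builds_agree(builds: Dict[str, bytes]) -> bool:
    """True if every certifier's build is byte-for-byte identical."""
    return len({digest(blob) for blob in builds.values()}) == 1


def dissenting_certifiers(builds: Dict[str, bytes], reference: bytes) -> List[str]:
    """Names of certifiers whose build differs from the publisher's binary."""
    ref = digest(reference)
    return sorted(name for name, blob in builds.items() if digest(blob) != ref)


published = b"\x7fELF...voting-client"           # stand-in for the real binary
builds = {
    "certifier-a": b"\x7fELF...voting-client",
    "certifier-b": b"\x7fELF...voting-client",
    "certifier-c": b"\x7fELF...voting-client!",  # non-deterministic or tampered build
}
```

Here `dissenting_certifiers(builds, published)` names the one certifier whose build differs; only certifiers whose digest matches would go on to sign the distributable.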

2.2.4 Threshold Trust

The first evident rule is that each component must be made as reliable and secure as possible. The following component failure scenarios and countermeasures could be considered:

Random Failure Failure events of different components must be made independent (in the sense of probability theory). In this case the probability that more than $t$ of the $n$ components fail is

$$\sum_{k=t+1}^{n} \binom{n}{k} p^{k} (1-p)^{n-k},$$

where $p$ is the probability of failure of one component. For suitable $n$ and $t$ the probability of failure of threshold trust is much smaller than $p$. Note that $t$ cannot be too small, because in the case of $t = 0$ the probability of failure would be $1 - (1-p)^{n}$.

On the other extreme, if the events of failure are completely dependent (i.e. if one component fails, all components fail), no advantage is gained as compared to the case $n = 1$.
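The effect of independence can be checked numerically. The sketch below assumes a service built from n components that fails once more than t of them fail, with each component failing independently with probability p; the symbols n, t, p and the function name are assumptions of this example, and the values are chosen only for illustration.

```python
from math import comb


def failure_probability(n: int, t: int, p: float) -> float:
    """Probability that more than t of n independent components fail,
    each failing with probability p (binomial tail)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(t + 1, n + 1))


single = failure_probability(1, 0, 0.1)      # one component on its own
threshold = failure_probability(3, 1, 0.1)   # tolerate one failure out of three
no_tolerance = failure_probability(5, 0, 0.1)  # any single failure breaks the service
```

With these values the threshold system fails less often than a single component, while tolerating no failures at all (t = 0) makes things worse than a single component.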

Active Adversary It must be ensured that each component has to be broken separately from the others, so that an adversary needs proportionally more resources to break several components.

Colluding Components Actors which control components (or indeed are them) should not wish to cooperate with each other to break the service. This is largely a political issue of selecting actors.

The described countermeasures require components to be "independent". The following could be done to ensure independence:
– Independent implementations and manufacturers (hardware, OS, libraries, software).
– Independent resources:
  – Physical location
  – Power supply
  – Network
– Different operators and organizations.

An interesting issue arises in the context of reliability (where the service keeps working as long as enough components function correctly and can communicate) when components can be fragmented into two or more segments that cannot communicate with each other (e.g. because of network failures). In such a situation components in each segment would consider the components in the other segment failed, as it is not possible to decide whether the connection has failed or a component is intentionally not communicating. Now, if there are two segments each containing enough components, each of them could form the service on its own, leading to a "split mind" situation. If this is an issue, protocols and algorithms should be designed to avoid it. The simplest way is to require that a segment can form the service only if it can contact more than half of all components.
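A strict-majority rule is one simple guard against the "split mind" situation: since two disjoint segments cannot both contain more than half of all components, at most one segment may continue to act as the service. The sketch below illustrates the rule; the function names are hypothetical.

```python
from typing import List


def has_majority(reachable: int, total: int) -> bool:
    """True if a segment reaches strictly more than half of all components."""
    return 2 * reachable > total


def segments_allowed_to_operate(segment_sizes: List[int], total: int) -> List[int]:
    """Sizes of the segments that may continue to form the service."""
    return [size for size in segment_sizes if has_majority(size, total)]


even_split = segments_allowed_to_operate([5, 5], total=10)    # neither side may act
uneven_split = segments_allowed_to_operate([6, 4], total=10)  # only the larger side
```

The price of this rule is availability: after an even split the service stops entirely instead of diverging.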


2.2.5 Connection

The connection between computing devices must be dependable: it should be possible to send a message from one address to another without failure. There are two ways of assessing system dependability: how the system acts on average (reliability) and what can happen in the worst case, especially if some entity is interested in bringing the system down (safety, security).

The Internet is the first and probably the only candidate for implementing the connection. It provides functionality for sending messages between nodes (computing devices) that have IP addresses. The Internet can be viewed as a collection of interconnected local networks (segments) and consists of the following basic components, which are built one upon another. The layer names below refer to OSI (Open System Interconnect), an ISO standard reference model defining seven layers of any network implementation; in practice nobody follows it precisely, but it is a good reference model, see [OSI].

Physical link layer Physical devices providing functionality for sending packets (messages of limited size) locally within one segment, which may have its own address system. (OSI physical and link layers.)

Network layer Provides means of sending packets between any IP addresses. Special computing devices called routers are used to join segments and to find a suitable path for each packet. The packet sender does not get trustworthy information about whether a packet has reached its destination. Nothing is done if sending a packet fails. Packets can be received in an order different from the order of sending. (OSI network layer.)

Transport layer Provides means of establishing a "virtual" connection between any two IP addresses, over which messages of any size can be sent in both directions. Messages are split into packets, which are sent separately. The receiver is supposed to acknowledge the receipt of each packet. Packets that do not reach the destination (which are not acknowledged by the receiver) are resent. As a result, the sender has good evidence of whether a message has reached the target. The order of messages (in one direction) is also guaranteed to remain the same. (OSI transport layer.)

Domain Name System (DNS) IP addresses are numeric and hard to memorize. To solve this problem, each Internet node can be assigned one or more symbolic names, which are easier to remember. DNS provides a service of mapping symbolic names to IP addresses. The DNS service is implemented as a worldwide distributed hierarchical set of computers, each of which keeps a part of this information and is supposed to know where to get the rest.

Applications Networked applications making use of the previously described components.

On average the reliability of the Internet can be said to be acceptable, but in the presence of an active adversary the Internet is by no means dependable. In what follows, the following context is assumed (although most of the arguments apply to any situation): an application server (implemented by a limited number of nodes) to which multiple clients (nodes) connect. When attacking in this context, the following direct aims can be set:
– Prevent nodes from communicating with each other (clients from connecting to the server).
– Modify transmitted information.
– Create an illusion of communication with a fake address (either DNS or IP).

The attacks themselves can be classified into the following groups:
– Damage, modify, fake, or overload components of the infrastructure:
  – Links
  – Routers
  – DNS servers
  – Nodes (application server, client)

Although the Internet was originally designed "to resist nuclear attack" (that is, there should be multiple independent paths without common links between any two nodes), at the current moment it has become mostly hierarchical with a relatively low level of redundancy, both in segment connections and in DNS. As a result, for most node pairs it is possible to find an intermediate link or server (router or DNS server) whose removal would disconnect these nodes from one another. It should also be relatively simple to disconnect a specific node from most of the others by breaking a link or an intermediate server close enough to it. In addition, if an attacker has penetrated some link or intermediate server (or the attacker is in fact the operator of the component), he would be able to imitate communication with any IP address "behind" him. Finally, there exist effective methods of overloading the communications infrastructure, known as Denial of Service (DoS) attacks, discussed later. It is important to point out that most components of the infrastructure belong to some specific entity, which must be trusted in order to rely on the connection.

– Damage, modify, or fake data of:
  – Local routing (within one segment)
  – Routers
  – DNS
  As a result packets would be sent to wrong destinations, or would not reach their targets at all.
– Attack protocols of any layer (link, network, transport) in order to break, modify, or fake connections.

The following weaknesses of the infrastructure are usually exploited:
– The ability to gain physical or virtual access to infrastructure components, possibly with the help of so-called "social engineering", which targets human beings instead of overcoming technical or computational protection methods.
– Bugs in the underlying software: operating system, networking components, protocol implementations, application software. Probably the most important of them is the so-called "buffer overflow" error, where the memory area right after a buffer is overwritten when data that is too long is written to the buffer without proper size checking.
– Finally, the basic protocols (ARP, ICMP, RIP, IP, TCP, DNS, etc.) themselves have security vulnerabilities, which can be used against the aims of the infrastructure.

As an example, let us consider an attack that overloads infrastructure components, called Denial of Service (DoS). The general idea is to send more "garbage" information than a link, an intermediate server, or an application server can process, with the aim of consuming some kind of resource: bandwidth, computational power, or, for instance, memory. As a result "legitimate" users would not "get through". Attacks can be (and usually are) initiated from multiple nodes whose total resources are higher than those of the victim (in this case the attack is called Distributed DoS). Attacks can be application specific in such a way that the attacker needs far fewer resources to generate the garbage than the victim needs to process it, for instance if the victim tries to decrypt the messages sent. As a result fewer resources are needed to bring the component down. One could ask why the attacker would have more resources than the victim.
On the one hand, a state's elections are an important event, so a lot of resources could be spent to bring them down. On the other hand, the attacker is always one step ahead of the victim's operators: first they set up some resources, and then the attacker has a chance to gain enough resources to run a DoS attack. Finally, as experience shows, it is relatively simple to break into multiple Internet nodes and manipulate them externally. Protocol vulnerabilities can also be exploited to direct heavy traffic towards a specific node. Although there is no complete cure for this problem, partial solutions and guidelines exist which force the attacker to spend many more resources to launch the attack, see [Ero00].

In general, the following countermeasures could be taken to prevent the described attacks:
– Proper development process of protocols, software, and hardware.
– Proper infrastructure surrounding Internet components, including a refined access control policy and attack prevention, detection, and response.
– Systematic redundancy of connections (bandwidth, independent paths) and nodes (fail-over and load-balancing clusters).
– Legal measures with heavy punishments for network disruption, which implies that it should be possible to trace attacks back to their originators.

The first two of the described items are directly related to the previously discussed problem of correctness of software and computing devices. In general, only global measures can raise the dependability of the Internet substantially.

At the same time, the problem of identification can be completely solved within PKI with the help of such cryptographic constructions as Diffie-Hellman key exchange, message authentication codes, symmetric encryption, and so on. In short, provided that two connection endpoints have certificates and are able to establish a connection, it is possible to establish a communication channel which would provide:

Authenticity It is clear which PKI identity sent the received data.
Integrity Data cannot be modified between the connection endpoints.
Confidentiality Data cannot be wiretapped between the connection endpoints.
Non-repudiation The ability to prove that the received data was sent by the corresponding PKI identity.
There exist implementations of this approach for the network layer ([IPSec]), the transport layer ([TLS]), and the DNS system ([DNSEXT]). These protocols also help raise the quality of connection establishment, because non-repudiation helps prove that some component behaved incorrectly (e.g. provided incorrect information), but only post factum. Also, this measure will be effective only when most Internet nodes start following these protocols, which is not the case at the current moment.
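As a small illustration of one of the building blocks named above, the following Python sketch uses a message authentication code to protect integrity and authenticity between two endpoints that already share a secret key. The key is generated randomly here as a stand-in for one agreed through an authenticated key exchange, and `protect` and `verify` are hypothetical helper names. A MAC alone gives neither confidentiality nor non-repudiation, since both endpoints can compute the same tag.

```python
import hashlib
import hmac
import secrets

TAG_LEN = 32  # length of an HMAC-SHA256 tag in bytes


def protect(key: bytes, message: bytes) -> bytes:
    """Prefix the message with an HMAC-SHA256 tag computed under the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest() + message


def verify(key: bytes, packet: bytes) -> bytes:
    """Return the message if its tag is valid, otherwise raise ValueError."""
    tag, message = packet[:TAG_LEN], packet[TAG_LEN:]
    expected = hmac.new(key, message, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("message authentication failed")
    return message


shared_key = secrets.token_bytes(32)  # assumed to come from a key exchange
packet = protect(shared_key, b"ballot received")
```

Any modification of `packet` in transit, or verification under a different key, makes `verify` raise an error instead of returning the message.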

Besides that, network was required to be synchronous: there should exist upper bound on how long does it take from the moment message is sent until the message is received. With some simplifications, one segment of a network (one Ethernet segment) can be considered synchronous. The main concern is that the speed of data transmission degrades as a function of network load, which opens doors to effective DoS attacks. This can be relieved to some extent by isolating the segment from the external world, but it does not help with internal adversaries. In the case of Internet it is not possible to give useful and sensible upper bound - Internet is completely asynchronous, whereas failing connection can be interpreted as lasting for especially long time. Although it was assumed that connection is synchronous, most of constructions described before are not sensitive to that, although they can ”freeze” until the messages are delivered. The most remarkable exception is the bulletin board, which is often used as a replacement for the connection itself. The conclusion would be that existing connection is not dependable enough, multiple single points of failure exist, and it can be broken on purpose quite easily. Only global measures can improve situation substantially. At the same time, if connection can be established, communication authenticity, integrity, and even privacy and non-repudiation can be achieved.

2.2.6 PKI

PKI is a framework that facilitates establishing the correspondence between identities and public keys. Its basic functionality can usually be split into:
– Creation and dissemination of identity certificates.
– Dissemination of certificate revocation information.

Roughly, PKI consists of three parts:
– Standards defining data formats and algorithms.
– Software supporting these standards (both for authorities and for clients).
– Organizations functioning as certification authorities.

Probably the most popular PKI standard is X.509v3 ([X.509]), which defines formats for certificates and certificate revocation lists (CRLs). There exists standard software for managing a certification authority and corresponding client software, which provides the following functionality (the list is not complete):
– Certificate retrieval and exchange.
– Usage of CRLs, which can be automatically retrieved from certificate distribution points (CDPs).
– Certificate verification.
– Encryption and decryption.
– Signing and signature verification.

A hierarchical (tree) topology of CAs is widely adopted. One example of both certification authority and client software is the Microsoft Windows 2000 platform, which has built-in support for PKI. There also exist organizations that act as international certification authorities. They typically issue personal certificates of different levels of trustworthiness (implied by different quality of identity verification), as well as DNS name certificates (given to the owner of a DNS name) and code publisher's certificates, which can be used to sign code distributables (see later). One example of such an authority is [VeriSign].

Still, multiple problems exist:
– There is no universally accepted certification authority.
– There is no uniform, convenient, and accepted naming convention that would enable assigning a unique identifier to any person or organization in the world.
– Existing standards are rather loose (many features are optional), and as a result there are interoperability problems between software from different vendors and also between different CAs, including:
  – Not all vendors use a hierarchical topology of CAs.
  – Support for certificate revocation (especially automatic revocation through distribution points) is implemented in many different incompatible ways.
– There is the problem of how certificates of root CAs reach client computers. Currently they are pre-installed with the operating system, which requires additional trust in the operating system vendor.

The conclusion would be that existing PKI is rather underdeveloped and further progress is needed.

PKI requires that owners of private keys keep them secret. For most personal computer users this means keeping them on the file system, (hopefully) protected by file system access control and a pass phrase. Also, the private key is loaded into random access memory when it is used. All this works provided that the computing device's untappability requirement is satisfied. A much better solution is to keep the private key on a specialized tamperproof device (smartcard), which would perform cryptographic operations on its own without revealing the private key to the computer. As already mentioned, smartcards have their own security concerns, and smartcard readers are not widely adopted yet.

In the context of e-voting another problem arises: why would some state (the organizer of the elections) trust a root CA which is located in South Africa or the USA? A solution would be to have a local intermediate certification authority functioning within that state. Only certificates which are "under" that authority would be allowed to be involved in the elections, and at the same time interoperability with the external world would be ensured.

Besides conventional PKI functionality, an infrastructure for code deployment to computing devices is desirable:
– A software distributable may carry a signature from the software publisher and also from external parties, who certify that this software is correct with respect to a specified purpose. Certifiers should be able to assign different levels of trust to software.
– The operating system should provide ways to define a policy of whether to install (run) a code distributable based on its signatures. The simplest way is to ask the user every time some distributable is installed (run), which would become annoying very fast (especially when code is downloaded automatically from the Internet) and would lead to a situation where the user always says yes without thinking. It is a separate research problem what such a policy would look like and who would create and enter it (it is naive to expect end-users to do that correctly).

It is reasonable to expect such a framework to be (an integral) part of the operating system.
Contemporary code signing is rather underdeveloped: for instance, Microsoft's Authenticode (a code signing framework on the Windows platform) enables a software publisher to sign the code and in this way assert that the software is safe (whatever that means), but there are no means to state correctness with respect to some purpose (at best one can add a free-form string), and there can be only one signature. In short, the publisher's identity is bound to the code without any direct legal implications. Microsoft's forthcoming .NET platform ([.NET], now in beta) promises to provide a more refined policy engine on top of Authenticode. Naturally, other vendors have implemented similar constructions in their platforms, for instance in Java 2. Still, they all appear to lack support for multiple signatures and for expressing trust with respect to a specified purpose, i.e. they do not support software certification.
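To make the idea of purpose-bound certification by several parties more concrete, here is a hypothetical Python sketch of an installation policy. It does not model any existing code signing framework: the `Certification` record, the field names, and the rule (enough trusted certifiers vouching for the requested purpose at a sufficient level) are all assumptions, and cryptographic verification of the signatures is assumed to have happened beforehand.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Certification:
    certifier: str  # identity of the party that signed the distributable
    purpose: str    # purpose for which the software is certified
    level: int      # trust level assigned by the certifier (higher is better)


def may_install(certs: Iterable[Certification], trusted: FrozenSet[str],
                purpose: str, min_level: int, min_certifiers: int) -> bool:
    """Allow installation only if enough distinct trusted certifiers
    vouch for this purpose at the required trust level."""
    approving = {c.certifier for c in certs
                 if c.certifier in trusted
                 and c.purpose == purpose
                 and c.level >= min_level}
    return len(approving) >= min_certifiers


trusted = frozenset({"election-committee", "university-lab", "audit-firm"})
certs = [
    Certification("election-committee", "e-voting client", 3),
    Certification("university-lab", "e-voting client", 2),
    Certification("unknown-vendor", "e-voting client", 3),
]
decision = may_install(certs, trusted, "e-voting client", min_level=2, min_certifiers=2)
```

Such a policy would be set by the election organizers or the platform rather than answered by the end-user for every installation.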

2.2.7 Time-stamping

The need for time-stamping was identified years ago. For a long time there were either relatively inefficient solutions (for relative time-stamping) or solutions that required a common unconditionally trusted third party (for absolute time-stamping). Quite recently, in 1998, a practical and efficient solution for relative time-stamping was developed (see [BLLV98]). The solution is still single-authority, but the authority cannot cheat undetectably, although it can ignore someone based on his identity. Consequently, the dependability and scalability of the service might not be sufficient for every application. Processing a time certificate in general requires communication with the time-stamping authority (TSA). Implementing a time-stamping service that works with respect to threshold trust is a topic of ongoing research. A compendium of information on this subject can be found at the home page of the time-stamping project Cuculus [Cuculus]. As time-stamping is relevant to e-voting mostly in the context of the bulletin board, I shall return to this topic when discussing its design options.

2.2.8 Summary

The existing framework of hardware, software, network, and PKI has grown evolutionarily over the years. It is generally agreed that the contemporary infrastructure is not dependable enough, although at the same time it is considered good enough for accessing Internet banks and shops. Many of the existing problems can be circumvented if the framework is carefully rebuilt from scratch with specific requirements in mind. Special attention should be paid to the basic constituents of the framework and to the people or organizations responsible for them:
– Hardware manufacturing and deployment.
– Software platform development and deployment (one can use code signing only after one has installed software which supports code signing).
– Network components.
– Certification authorities.

Most of the information in this section has been taken from the Internet, and most of the resources are not self-contained enough to be cited, so I cite only the most relevant one: [Rub00].

2.3 Design for Bulletin Board

In this section I will describe different ideas and options for implementing the bulletin board. (This section should be viewed as a collection of rather raw knowledge and ideas that may need further development.) I remind the reader that for the purpose of e-voting it is enough to have, for each actor, a single-writer multiple-reader bulletin board with reasonable latency, repeatable read, and absolute time-stamping. Monotonicity is desirable, but is not a requirement.

One option is to implement the bulletin board with one authority. In the current context an authority means a set of tightly coupled server computers located at one physical location and sharing a common PKI identity. Such a construction maps directly to the notion of an actor in the previously described model of the real world. Different solutions might require different levels of trust in the authority. The simplest one (in the flavour of conventional information systems) would require the trust to be unconditional. In principle one could think of a solution where the authority would have to prove the correctness of its actions, so that a failure would be detectable. As already mentioned, time-stamping and the bulletin board are very similar services. My intuition suggests that one round of the time-stamping construction [BLLV98] could be adapted quite easily to the needs of the bulletin board. Of course, the previously mentioned problems still remain: the possibility that the authority will ignore someone, and possibly unacceptable dependability and scalability. If a solution for time-stamping is developed that overcomes these problems, it could probably be adapted quite easily to the needs of the bulletin board. All this can be a topic of further research.

Another option is to implement the bulletin board with multiple distributed authorities (in the sense of the previous paragraph) with respect to threshold trust. It turns out that in such a setting, from both the theoretical and the practical point of view, the approach differs quite a lot depending on whether the environment is synchronous or asynchronous. In a synchronous environment all actors have synchronous clocks (as defined before) and the connection between them is also synchronous. An asynchronous environment fails to satisfy at least one of these properties (usually the second). In practice, a synchronous environment can be approximated with a well-supervised, reliable communication channel isolated from the external world (e.g. an Ethernet segment). The Internet, on the other hand, is clearly asynchronous. It is worth pointing out that the single-authority solution does not depend much on whether the environment is synchronous or not: a user either succeeds in establishing a connection to the service or does not.

Theoretical (and probably also practical) activity in this field has lasted for years. Good overviews of theoretical meta facts can be found in [Kest95] and [Fi00], whereas [Fi85] is the paper where some of them were first proven. Relevant keywords are distributed consensus, Byzantine agreement, and reliable broadcast. Quite a lot of practical work has been done by Michael Reiter (see [Reiter]). In the rest of this section I will first consider different implementation options for the synchronous and asynchronous cases and then try to formulate practical solutions to the bulletin board problem from the viewpoint of organizing e-voting.

2.3.1 Some Simple Ideas

In this subsection I present some simple ideas that will be of use later. The first idea is about how one can create a single-writer bulletin board out of a single-writer reliable store. Provided that there are means for reliably storing messages without enforcing their ordering, any of the following could be done to ensure ordering:
– Use any form of time-stamping to prove the ordering of the messages.
– When sending the next message, incorporate in it a hash of the previous message. This requires the sender to remember (or retrieve) the last message sent so far.
– If message ordering is up to the sender (the only entity interested in ordering the messages is the sender himself), the message could just incorporate a sequence number (which requires maintaining a counter) or even the current value of the sender's clock.

Based on this it is sensible to concentrate on implementing a reliable store, out of which a single-writer bulletin board can be created. The second idea is related to absolute time-stamping:
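The hash-linking variant can be sketched in a few lines of Python. This is an illustrative toy under assumptions: messages are modelled as pairs (hash of the previous message, payload), SHA-256 is an assumed hash function, signatures are omitted, and the names `append` and `recover_order` are hypothetical. The point is only that a reader can rebuild the sending order from a store that returns messages in arbitrary order.

```python
import hashlib
from typing import Dict, Iterable, List, Optional, Tuple

Message = Tuple[bytes, bytes]  # (hash of the previous message, payload)
GENESIS = b"\x00" * 32         # link value used by the very first message


def message_hash(msg: Message) -> bytes:
    prev, payload = msg
    return hashlib.sha256(prev + payload).digest()


def append(last: Optional[Message], payload: bytes) -> Message:
    """Create the next message, linked to the last message sent so far."""
    prev = GENESIS if last is None else message_hash(last)
    return (prev, payload)


def recover_order(store: Iterable[Message]) -> List[bytes]:
    """Rebuild the sending order from an unordered store of messages."""
    by_prev: Dict[bytes, Message] = {}
    for msg in store:
        if msg[0] in by_prev:
            raise ValueError("two messages claim the same predecessor")
        by_prev[msg[0]] = msg
    ordered, cursor = [], GENESIS
    while cursor in by_prev:
        msg = by_prev.pop(cursor)
        ordered.append(msg[1])
        cursor = message_hash(msg)
    if by_prev:
        raise ValueError("some messages are not linked to the chain")
    return ordered


m1 = append(None, b"first")
m2 = append(m1, b"second")
m3 = append(m2, b"third")
order = recover_order([m3, m1, m2])  # the store returns messages in any order
```

A missing message shows up as a gap: the messages after it can no longer be linked to the start of the chain.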

Assume that we have $2t+1$ numbers $x_1 \le x_2 \le \dots \le x_{2t+1}$ in ascending order, of which at most $t$ are assumed to be "incorrect", but we do not know which. In this case $x_{t+1}$ belongs to the interval starting with the smallest and ending with the biggest "correct" number in the sequence. The reasoning is quite simple: if $x_{t+1}$ is "correct", then the implication is trivial; if $x_{t+1}$ is "incorrect", then it is surrounded in the sequence by correct numbers, because on each side of it there are $t$ numbers of which at most $t-1$ are incorrect. This finishes the proof.

This can be applied in the following situations:
– Accessing time servers to synchronize a clock. If $2t+1$ servers are accessed, attacks of at most $t$ servers can be resisted.
– Assume that there are time-stamping servers which append their current time to every message sent to them, sign it, and send it back. Now if one accesses $2t+1$ servers, one can get a trustworthy absolute time certificate as long as at most $t$ servers are malicious (or have wrong clocks).
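The median argument translates directly into code. The sketch below assumes that readings from 2t+1 servers are available as plain numbers (for instance seconds since some epoch) and that at most t of them are wrong; the function name and the sample values are hypothetical.

```python
from typing import Sequence


def trusted_time(readings: Sequence[float], t: int) -> float:
    """Return the median of 2t+1 readings. If at most t readings are wrong,
    the result lies between the smallest and the largest correct reading."""
    if len(readings) != 2 * t + 1:
        raise ValueError("exactly 2t+1 readings are required")
    return sorted(readings)[t]


# Three honest servers agree to within a second, two malicious ones lie.
estimate = trusted_time([991353600, 991353601, 991353599, 5, 10**12], t=2)
```

Whatever values the two faulty servers report, the estimate stays inside the range spanned by the three correct readings.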

2.3.2 Synchronous Environment

In the synchronous environment, from the theoretical point of view, a reliable store is possible for any number of faulty authorities: correct information will be present at all correct authorities. At the same time, the construction presented when proving this fact in [Fi00] is completely impractical. My idea for implementing a single-writer bulletin board would be as follows:
– Implement a reliable write-once multiple-read register.
– Maintain an array of such registers $R_1, R_2, \dots$
– In order to send a new message, find the highest index $i$ of a filled register and then write to register $R_{i+1}$ a new signed message containing the index $i+1$ and a hash of $R_i$.
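The posting procedure can be illustrated with a single-process stand-in for the array of write-once registers. This sketch only shows the data flow: the class and function names are hypothetical, the record layout (index, hash link, payload separated by "|") is an assumption, signing is omitted, and a real register would be implemented jointly by multiple authorities rather than by one in-memory object.

```python
import hashlib
from typing import Dict, Optional


class WriteOnceRegisters:
    """In-memory stand-in for an array of write-once, multiple-read registers."""

    def __init__(self) -> None:
        self._cells: Dict[int, bytes] = {}

    def highest_filled(self) -> int:
        """Highest index of a filled register, or 0 if none is filled."""
        return max(self._cells, default=0)

    def read(self, index: int) -> Optional[bytes]:
        return self._cells.get(index)

    def write(self, index: int, value: bytes) -> None:
        if index in self._cells:
            raise ValueError("register already written")
        self._cells[index] = value


def post(board: WriteOnceRegisters, payload: bytes) -> int:
    """Write the next message: new index, hash of the previous register, payload."""
    last = board.highest_filled()
    previous = board.read(last) or b""  # empty for the very first message
    link = hashlib.sha256(previous).hexdigest().encode()
    index = last + 1
    board.write(index, str(index).encode() + b"|" + link + b"|" + payload)
    return index


board = WriteOnceRegisters()
first_index = post(board, b"ballot A")
second_index = post(board, b"ballot B")
```

Because each register can be written only once and each message commits to its predecessor, a reader can detect both overwritten and reordered messages.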

Ideas for implementing such a register by multiple authorities with threshold trust can be found in [MR98], section 6. In order to prove time-outs (i.e. that someone did not write a message during some period of time), time-stamping should be employed. The previously described approach of accessing several time-stamping servers could be applied here. Note that bare time-stamping by the writer does not solve the problem: the writer could first ask for a time certificate, wait for an unlimited period of time, and only then send the message. This implies that time-stamping must be performed by the authorities implementing the service. A register implemented with such an approach has the following properties:

In order to write one message, this message (or at least its digest) must be transmitted times over the network. This limits throughput of the system quite a lot. Quite large cryptographical overhead is needed to perform authenticated communication.

The following limitation holds , which implies or be malicious).

.

, For instance if (at most out of 10 authorities can fail

These limitations make this specific construction quite useless. I did not have time for a thorough investigation and so cannot tell whether more practical constructions exist for synchronous networks.

2.3.3 Asynchronous Environment From the theoretical point of view bulletin board implemented by multiple authorities with threshold trust cannot be implemented deterministically in principle if even one authority can crash (let alone malicious behaviour). The reason behind it is that it is not possible to decide whether some authority is intentionally unresponsive or the connection is just ”slow”. See [Fi00] for more. For this reason only partial solutions to this problem exist. One option is to use protocols from synchronous world, but now time-outing cannot be used, as there are no upper bounds on transmission duration. As a result these protocols can ”freeze” for a period of time of indefinite length. As a solution to these problems Rampart toolkit is suggested in [Rei95], [Rei96], and [Rei94]. The basis of the toolkit is so called group membership protocol, which allows removing and adding authorities to the group if more than of group members decide so. This allows removing unresponsive, failed, or malicious group members without making difference between them. Atomic multicast protocol is executed upon it. After reading previously mentioned papers, I have made the following observations:

In order to write one message, this message (or at least its digest) must be transmitted times over the network. This limits throughput of the system quite a lot. Quite large cryptographical overhead is needed to perform authenticated communication.

The system can survive only the fragmentation of less than group members. The system "freezes" if, for instance, the group is split into two equal parts by a network failure. The group of correct authorities can "melt down" to any size, for instance to less than .

As a group membership change is a relatively expensive operation, an attack could be mounted in which an authority periodically pretends to be unresponsive, gets thrown out of the group, becomes responsive again, gets invited back into the group, and so on.

In order to raise efficiency, messages are broadcast by one selected group member (for instance the one with the smallest ID), and if he fails, he is voted out. For this reason it makes sense for an attacker to concentrate attacks on the group member with the smallest ID.

Quite many issues are left open or need further adaptation to the needs of the bulletin board.

For these reasons this construction seems rather impractical for use on the "wild" Internet, but it might be of interest on a more protected and synchronous local network. I have to acknowledge that these papers are rather complex and leave many details open, so there is always a chance that I did not fully understand them.

2.3.4 Practical Solutions

The following applications of a bulletin board can be identified with respect to e-voting:

Setup phase of a multiple authority EVS and also threshold key generation (referred to below as the setup phase).
Collecting voters' ballots (the collecting phase).
Result computation in a multiple authority EVS (the computing phase).

Despite the semi-theoretical solutions described above, it appears to me that a "practical hack" is more appropriate in the current situation.

The setup phase requires a high-quality bulletin board service (atomicity), but it can be restarted without any loss and its information throughput is quite low. This phase could be performed on a synchronous network. For this reason I would suggest implementing the bulletin board with a single authority, which broadcasts signed messages to all participants. At the end each participant signs the transcript of the messages delivered to it, and the phase is considered complete if no participant objects that he did not receive all messages and the signed transcripts match. The setup phase outcome should be made available to voters with the help of administrative methods. Another option is to try to adapt Rampart, although it is questionable whether the effort would be justified.

The collecting phase must be performed on an asynchronous network, and the amounts of data transmitted are measured in gigabytes. A voter should be sure that if he gets confirmation that the operation succeeded, then his ballot will be counted. At the same time he might not always get this confirmation, because the network connection might go down exactly at the moment when the bulletin board sends the last confirmation message to the voter. The ordering of sent ballots is up to the voter. This service cannot be restarted: it must function without failures. While collecting votes, it is sufficient for the bulletin board to be write-only. For this reason I would suggest having independent authorities that collect the sent ballots. In order to submit a ballot, the voter would have to contact servers and send his ballot there. At the end the ballot lists would be signed by the authorities, written to durable media, and physically transported to the place where the computing phase takes place. The ordering of a voter's sent ballots could be ensured by time-stamping or even by including the voter's computer time in the ballot (probably not such a good idea).

The computing phase does not actually require the bulletin board. All that is needed is to collect the self-verifying output of all authorities. This can be ensured with administrative measures.
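The completion check suggested above for the setup phase can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names are hypothetical, and an HMAC under a per-participant key stands in for a real digital signature, which in practice would be verified against the participant's public key certificate.

```python
import hashlib
import hmac


def transcript_digest(messages):
    """Hash an ordered list of byte-string messages into one digest."""
    h = hashlib.sha256()
    for m in messages:
        # Length-prefix each message so that message boundaries are unambiguous.
        h.update(len(m).to_bytes(8, "big"))
        h.update(m)
    return h.digest()


def sign(key, digest):
    """Stand-in for a digital signature (HMAC; an assumption of this sketch)."""
    return hmac.new(key, digest, hashlib.sha256).digest()


def setup_phase_complete(keys, reports):
    """Return True if every participant signed the same transcript and none objected.

    keys:    participant id -> key used to check that participant's signature
    reports: participant id -> (transcript digest, signature, objection flag)
    """
    if set(reports) != set(keys):
        return False  # some participant has not reported yet
    digests = set()
    for pid, (digest, signature, objects) in reports.items():
        if objects:
            return False  # participant claims it did not receive all messages
        if not hmac.compare_digest(sign(keys[pid], digest), signature):
            return False  # signature on the transcript does not verify
        digests.add(digest)
    return len(digests) == 1  # all signed transcripts match


# Hypothetical example with three participants who saw the same messages.
messages = [b"setup message 1", b"setup message 2"]
keys = {"A1": b"key-1", "A2": b"key-2", "A3": b"key-3"}
d = transcript_digest(messages)
reports = {pid: (d, sign(key, d), False) for pid, key in keys.items()}
complete = setup_phase_complete(keys, reports)
```

If any participant reports a different transcript, objects, or has not reported at all, the check fails and the setup phase can simply be restarted, which the text notes is possible without loss.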

2.4 Design Pattern for E-voting System

In this section a design pattern for an e-voting system is outlined. It is independent of the EVS type: both single and multiple authority EVSes can be "plugged" into this pattern. This will be demonstrated in later sections. To begin with, let us assume that the infrastructure is already in place.

The system architecture is outlined in Figure 2.4. The following participants are depicted there:

Organizer: An organization responsible for conducting elections. It is identified by its public key, which has been threshold generated and registered within the PKI; the private key shares are kept separately and securely and are used only when needed.
Voter: Any person willing to participate in elections. A voter is identified by his public key certificate.
Observer: Any entity having a public key certificate that is invited to observe and verify the voting process to ensure its integrity.
Consumer: Any entity interested in learning the election result.

The following components are presented in the figure:

Election Information Management: Subsystem that helps to produce voter lists, ballot types, and a mapping between them.
Voting BB: Bulletin board servers for collecting voters' ballots. Each bulletin board server is identified by its own public key certificate.
EVS: An instance of some electronic voting scheme.
Entry Server: Server with a well-known name that provides voters with everything they need to participate in the election: the voting client software to use; election information; the voting bulletin boards' addresses and public key certificates; the observers' public key certificates; and the EVS information generated at the setup phase.
Revocation Information Management: Subsystem that helps to create lists of voter identities that should not be counted.
Archive: A permanent store where all election information is saved after the election is over.

The thick line in the figure marks the boundary between the "managed" and the "external" world.

Figure 2.4: Design Pattern Architecture Outline. (Elements shown: Election Information Management, Entry Server, Voter(s), Organizer, Voting BB, Revocation Information Management, EVS, Observer(s), Consumer(s), Archive.)

The process according to which the system is supposed to act is depicted in Figure 2.5. The following steps are present there:

Selecting Observers: The process starts with selecting observers for the specific elections.
Preparing Election Information: Election information is prepared. Once the information is ready, it is signed by the organizer.
Preparing Bulletin Boards: Multiple bulletin board servers are set up, possibly at different geographic locations. Some of the observers are supposed to observe these servers. The servers' public key certificates are collected.
EVS Setup: The electronic voting scheme setup phase is executed, resulting in some public information that must be made available to voters.
Populating Entry Server: The entry server is populated with all relevant data. The integrity of all information is assured by the organizer's signatures.
Initiating Voting: After that voting can be initiated by activating the bulletin board servers.
Voters Voting: Then for some period of time voters are given an opportunity to vote. This is done as follows:

The voter contacts the entry server and downloads the voting software.
The voting software is deployed on the voter's computer.
Signatures on the software distributable are verified.

Figure 2.5: Main Process of the Design Pattern. (Steps shown: Selecting Observer(s), Preparing Election Information, Preparing BB(s), EVS Setup, Populating Entry Server, Initiating Voting, Preparing Voter Revocation Information, Voters Voting, Computing Election Result, Stopping Voting, Observers Approving Election Result, Archiving Election Data, EVS Shutdown.)

The voting software contacts the entry server again, checks that the voter is eligible to vote, and retrieves the voter's ballot type, the bulletin board addresses and certificates, and the EVS setup information.
After that the ballot is presented to the voter, who can choose one option. Finally he can cast his ballot (or cancel the activity).
The voting software forms the voter's ballot according to the EVS algorithms and setup information. After that the ballot is signed with the voter's private key.
Finally the bulletin board servers are contacted and the ballot is submitted to them. There should be a lower limit on how many servers must be successfully contacted. The voter is notified of success or failure.

Stopping Voting: Finally voting is stopped by closing the bulletin board servers.
Preparing Voter Revocation Information: After the election is over, voter revocation information is prepared. Once the information is ready, it is signed by the organizer.
Computing Election Result: The election result is computed with the help of the EVS. This will be explained later in a separate subsection.
Observers Approving Election Result: Observers are given an opportunity to verify the correctness of the EVS result. After that they are given a chance to sign the result to express that they are satisfied with everything.
Archiving Election Data: Once the EVS result is approved by a sufficient number of observers, the election data can be stored in the permanent archive, where it is accessible to the consumers.
EVS Shutdown: Finally, the electronic voting scheme is shut down. In particular, this means destroying secret information generated during the EVS setup phase that may compromise election security if it leaks. It might be necessary to postpone this phase for a reasonable amount of time.
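The last voter step, submitting the signed ballot to several bulletin board servers with a lower limit on successful contacts, can be sketched as follows. The names `submit_ballot` and `min_confirmations` are assumptions of this sketch, and the servers are modelled as plain callables instead of network endpoints.

```python
def submit_ballot(signed_ballot, servers, min_confirmations):
    """Send the ballot to every server; succeed if enough of them confirm.

    servers: list of callables that take the ballot bytes and return True on a
    confirmed write; a callable may raise ConnectionError to model an outage.
    """
    confirmations = 0
    for send in servers:
        try:
            if send(signed_ballot):
                confirmations += 1
        except ConnectionError:
            # This also covers the case where the ballot was stored but the
            # confirmation was lost; such a server is not counted as a success.
            continue
    return confirmations >= min_confirmations


# Hypothetical in-memory servers used only for illustration.
stored = []


def healthy(ballot):
    stored.append(ballot)
    return True


def down(ballot):
    raise ConnectionError("server unreachable")


ok = submit_ballot(b"signed ballot", [healthy, down, healthy], min_confirmations=2)
```

The voter is told that voting succeeded only when the lower limit is reached, which matches the requirement that a confirmed ballot must be counted even if some servers later fail.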

2.4.1 Computing Result

The process of computing the election result is depicted in Figure 2.6. First the EVS takes as input the raw ballot lists from the bulletin board servers (which are additionally signed by the organizer), the list of revoked voters, and the contents of the entry server (why the latter is needed will become apparent later) and produces the election result. After that the observers verify and sign the EVS output. All data is then signed by the organizer once more and finally time-stamped. This forms the complete election report, which can be stored in the archive.

Figure 2.6: Result Computation in the Design Pattern. (Elements shown: Entry Server, Entry Server Contents, List of Revoked Voters, Voting BB(s), Raw Ballot List(s), EVS, EVS Output, Observer(s), Signature(s), Organizer Signing and Timestamping, Full Report, Archive.)

Figure 2.7 depicts the components of election data and their interrelationships (a dashed arrow from A to B means that A includes a hash of B):

The EVS output is supposed to contain hashes of the raw ballot lists, the list of revoked voters, and the entry server contents that were used in the computation.
The voter's client software is supposed to include in the ballot hashes of the parts of election data that were used.
The observers' signatures include a hash of the EVS output by definition.
Finally the full report contains hashes of the observers' signatures.

Figure 2.7: Election Data in the Design Pattern. (Elements shown: Ballot(s); Entry Server Contents (organizer signed); Raw Ballot List(s) (BB server and organizer signed); EVS Output; List of Revoked Voters (organizer signed); Observer Signature(s); Full Report (organizer signed and timestamped).)

Figure 2.8: Meta Process of the Design Pattern. (Steps shown: Setting up Infrastructure, Organizer's Key Setup, Conducting Elections, Organizer's Key Shutdown, Maintaining Archive.)

This approach helps to glue the different parts of election data together: none of the components can be modified after the full report is complete. In addition, this structure commits the organizer to some decisions: as the bulletin board servers' and observers' certificates are present at the entry server, it can be determined at once which bulletin board server outputs were used, which observers have signed the result, and which have not. The approach also guarantees that all ballots were formed from the same entry server contents. Finally, after time-stamping, the organizer's public key can be revoked without any harm.
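The hash links between the parts of election data described in this subsection can be sketched as follows. This is an illustration only: the field names, the JSON encoding, and the use of an HMAC as a stand-in for an observer's signature are assumptions of the sketch, not part of the design itself.

```python
import hashlib
import hmac
import json


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def encode(obj) -> bytes:
    """Deterministic byte encoding of a JSON-compatible object."""
    return json.dumps(obj, sort_keys=True).encode()


def make_evs_output(result, entry_server, raw_ballot_lists, revoked):
    """The EVS output carries hashes of every input used in the computation."""
    return {
        "result": result,
        "entry_server": digest(entry_server),
        "raw_ballot_lists": [digest(b) for b in raw_ballot_lists],
        "revoked_voters": digest(revoked),
    }


def observer_signature(key: bytes, evs_output) -> bytes:
    """Stand-in signature (HMAC) computed over the hash of the EVS output."""
    evs_hash = digest(encode(evs_output)).encode()
    return hmac.new(key, evs_hash, hashlib.sha256).digest()


def make_full_report(signatures):
    """The full report contains hashes of the observers' signatures."""
    return {"observer_signatures": [digest(s) for s in signatures]}


def inputs_match(evs_output, entry_server, raw_ballot_lists, revoked):
    """Check that the given inputs are the ones the EVS output commits to."""
    expected = make_evs_output(
        evs_output["result"], entry_server, raw_ballot_lists, revoked
    )
    return expected == evs_output


# Hypothetical election data used only for illustration.
entry = b"entry server contents"
lists = [b"ballots from BB server 1", b"ballots from BB server 2"]
revoked = b"list of revoked voters"
out = make_evs_output({"yes": 2, "no": 1}, entry, lists, revoked)
sigs = [observer_signature(k, out) for k in (b"observer-1", b"observer-2")]
report = make_full_report(sigs)
```

Changing any raw ballot list changes a hash inside the EVS output, which changes what the observers signed, which in turn changes the hashes in the full report, so a modification anywhere becomes visible at the top of the chain.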


2.4.2 Meta Process

Previously we assumed that the e-voting system infrastructure was already in place. In fact it must be supported by a process of its own, the meta process, which is depicted in Figure 2.8. The system's lifetime starts with setting up the infrastructure. This is followed by an iterative process of threshold generating the organizer's public key, registering it at the PKI, conducting some number of elections, and finally revoking the organizer's public key and destroying the private key shares. Threshold key generation should be performed using the bulletin board constructions described in section 2.3. It is sensible to keep the private key shares on smartcards. In parallel with that, the data archive should be maintained.

2.4.3 Design for Single Authority EVS

In this subsection I outline a design for a single authority EVS that can be plugged into the e-voting system design pattern. It is sensible to generate the single authority EVS's public key in a threshold manner, with the private key shares kept separately and securely, for instance on smartcards. The bulletin board constructions described in section 2.3 should be used.

Computing the result in such a setting is described in Figure 2.9:

Ballot list creator: Takes as input the raw ballot lists, the list of revoked voters, and the entry server contents and generates the list of ballots that should be counted. The signatures on the ballots must be verified and removed. As a result the eligible ballot list does not bear any direct links to voters' identities. This operation is deterministic and could be duplicated on multiple computers to ensure that the list is formed correctly. Finally this list should be signed by the organizer.
Result computer: Takes as input the eligible ballot list and the private key shares. Using them, it is possible to decrypt each ballot separately and compute the election result. At the end the election result is signed using the private key shares. This operation is the weakest point of security: individual decrypted ballots must not leak outside the computer. This implies that the computer should be as secure as possible. Also, it should not have any means of communication with the external world (hard disk, network) besides the one through which the ballot list is entered and the computed result is returned. Even better, this computer could be implemented as a specialized device. This operation is deterministic, so it could also be duplicated on many computers to ensure that the result is computed correctly, although the correctness of the computation cannot be verified directly. Signature generation is not deterministic, so signatures should not be compared, but they can be verified directly.

Figure 2.9: Result Computation in Single Authority EVS. (Elements shown: List of Revoked Voters, Entry Server Contents, Raw Ballot List(s), Ballot List Creator, Eligible Ballot List, Result Computer, EVS Output.)

Figure 2.10: Election Data in Single Authority EVS. (Elements shown: Ballot(s); Entry Server Contents (organizer signed); Raw Ballot List(s) (server and organizer signed); Eligible Ballot List (organizer signed); List of Revoked Voters (organizer signed); EVS Output (signed with EVS key).)

Figure 2.10 presents the data structures involved and their relationships. It is very similar to the one in subsection 2.4.1 and does not need further explanation.
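The ballot list creator described above can be sketched as a small deterministic function. Several details are assumptions of this sketch and are not fixed by the design: the tuple layout of a raw ballot, the rule that a voter's ballot with the latest time-stamp is the one counted, and sorting the output so that its order carries no link to voters. The `verify` callable is a hypothetical stand-in for real signature verification.

```python
def create_eligible_ballot_list(raw_ballot_lists, eligible, revoked, verify):
    """Deterministically derive the list of ballots to be counted.

    raw_ballot_lists: lists of (voter_id, timestamp, ballot, signature) tuples
    eligible:         set of voter ids allowed to vote (from entry server contents)
    revoked:          set of voter ids whose ballots must not be counted
    verify:           callable(voter_id, timestamp, ballot, signature) -> bool
    """
    latest = {}
    for ballot_list in raw_ballot_lists:
        for voter_id, timestamp, ballot, signature in ballot_list:
            if voter_id not in eligible or voter_id in revoked:
                continue
            if not verify(voter_id, timestamp, ballot, signature):
                continue
            # Assumption: the ballot with the latest time-stamp wins; ties are
            # broken by ballot bytes so that the outcome stays deterministic.
            key = (timestamp, ballot)
            if voter_id not in latest or key > latest[voter_id]:
                latest[voter_id] = key
    # Drop identities and signatures, and sort so that the output is identical
    # on every machine and its order reveals nothing about the voters.
    return sorted(ballot for _, ballot in latest.values())


def toy_verify(voter_id, timestamp, ballot, signature):
    """Toy check used only in this example; not a real signature scheme."""
    return signature == b"sig:" + voter_id.encode()


bb1 = [("alice", 1, b"enc-A1", b"sig:alice"), ("bob", 2, b"enc-B", b"sig:bob")]
bb2 = [
    ("alice", 5, b"enc-A2", b"sig:alice"),
    ("carol", 3, b"enc-C", b"forged"),
    ("dave", 4, b"enc-D", b"sig:dave"),
]
voters = {"alice", "bob", "carol", "dave"}
eligible_ballots = create_eligible_ballot_list([bb1, bb2], voters, {"dave"}, toy_verify)
```

Because the function depends only on its inputs, running it on several computers and comparing the outputs is a direct way to check that the list was formed correctly, as the text suggests.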

2.4.4 Design for Multiple Authority EVS

In this subsection I outline a design for a multiple authority EVS that can be plugged into the e-voting system design pattern. Multiple authority setup starts with defining the authorities and assigning them public keys. As these keys are needed only for internal communication, they could be certified by the organizer rather than by a real CA. After that the EVS setup phase can be executed. The bulletin board constructions described in section 2.3 should be used. It is sensible to keep the private information generated during the setup on smartcards.

The process of computing the result is depicted in Figure 2.11: each authority takes as input the raw ballot lists, the list of revoked voters, and the entry server contents and produces a signed authority's output. After that the election result can be computed from the authorities' outputs. Figure 2.12 again presents the data structures involved and their relationships. It is very similar to the one in subsection 2.4.1 and does not need further explanation.

Figure 2.11: Result Computation in Multiple Authority EVS. (Elements shown: List of Revoked Voters, Entry Server Contents, Raw Ballot List(s), Authorities, Authority's Outputs, Result Computer, EVS Output.)

Figure 2.12: Election Data in Multiple Authority EVS. (Elements shown: Ballot(s); Entry Server Contents (organizer signed); Raw Ballot List(s) (server and organizer signed); Authority's Output(s) (authority signed); List of Revoked Voters (organizer signed); EVS Output (organizer signed).)

2.4.5 Conclusions

The main conclusion of this section is that it is possible to build a generic e-voting service design pattern into which both single authority and multiple authority electronic voting schemes can easily be plugged in a similar way. The advantage of a single authority EVS is its speed; its disadvantages are the lack of verifiability and the presence of a single point of failure, which must be heavily secured. The advantages of a multiple authority EVS are its complete verifiability and the lack of single points of failure: everything is done with respect to threshold trust. Its disadvantage is its relative inefficiency. At the same time, in many contexts both approaches probably have comparable and sufficient security. A multiple authority EVS becomes more secure at the cost of a much bigger investment of resources.


Chapter 3

Summary

In this work the following tasks have been accomplished:

Formulated detailed requirements for an e-voting system.
Described the theoretical basis for e-voting from the viewpoint of software engineering.
Analysed the feasibility of a generic security framework upon which an e-voting system could be built.
Investigated options for designing a bulletin board service.
Proposed a general design pattern for implementing an e-voting system.
Described how theoretical single and multiple authority electronic voting schemes fit into this pattern.

The main conclusions are:

The only serious security threat is posed by the existing framework of personal computers and the Internet. Unfortunately it is enough to prevent implementing secure e-voting in the near future.
Although theoretically a single authority electronic voting scheme is much weaker than its multiple authority counterpart, in practice they are of comparable quality, with different advantages and disadvantages.

Further directions of work could be:

Increasing the reliability of the Internet and the security of personal computing devices.
Further development of PKI and code signing.
Further research in the field of time-stamping and bulletin board construction.
Refinement of the e-voting system design. Currently the design pattern is very high-level and lots of technical details are missing. Also the whole process of e-voting system maintenance should be much more refined.
Evaluating the precise financial and computational resources needed to create and maintain an e-voting system.

Elektrooniliste valimiste kavandamine (Designing Electronic Voting)

Oleg Mürk

Summary in Estonian (Resümee), translated

Elections are a very important social act in every democratic country. Recently the idea of conducting elections electronically, figuratively speaking "from a home computer and over the Internet", has become very popular around the world. The main motive is the convenience of voting, which could raise citizens' voter turnout, which is typically rather low. In the longer term it could also help disabled and elderly people. In addition, one may hope that this way of conducting elections will in time become cheaper than ordinary elections. At present, however, e-voting is regarded mainly as a convenient additional option alongside ordinary elections.

Preparing for e-voting broadly consists of three aspects: theoretical, technical, and political. The aim of the theoretical activity is to create an abstraction of the real world and to find within it the basic constructions from which a real e-voting system could later be built. Building the system itself belongs to the technical activity. Finally, the system has to be introduced into the existing democratic system, which is the politicians' playground.

This bachelor's thesis can be regarded as a continuation of the semester work that dealt with the theoretical aspects of e-voting, and its topic is the technical side of implementing e-voting. The problem is approached from the viewpoint of software engineering. First the existing situation is analysed and the requirements for the system are formulated. This is followed by an overview of the theoretical foundations.

An important prerequisite for implementing e-voting is a secure environment consisting of secure computers, a reliable network connection, and some other components. This work investigates its existence and attainability and shows that the present situation still leaves much to be desired.

Another important component in implementing e-voting is the so-called bulletin board, which is the basis for secure communication between computers. The work investigates the possibilities of implementing such a construction efficiently and points out its connections with time-stamping. The conclusion is reached that very good solutions do not exist yet and that the topic needs further research.

Finally, a possible architecture of an e-voting system is sketched and it is shown how different theoretical constructions could be used in it. The main conclusion of the work is that the main technical obstacle to implementing e-voting is the insecurity and unreliability of the Internet and of existing computers.

Bibliography

[Ben87] J. Benaloh. Verifiable Secret-Ballot Elections. Ph.D. Thesis presented at Yale University, New Haven, CT (Dec. 1987). (Available as TR-561, Yale University, Department of Computer Science, New Haven, CT (Sep. 1987).)

[BLLV98] Ahto Buldas, Peeter Laud, Helger Lipmaa, Jan Villemson. Time-Stamping with Binary Linking Schemes. In Hugo Krawczyk, editor, Advances in Cryptology - CRYPTO '98, volume 1462 of Lecture Notes in Computer Science, pages 486-501. Springer-Verlag, 1998. http://www.tml.hut.fi/ helger/papers/bllv98/

[BT94] J. Benaloh and D. Tuinstra. Receipt-Free Secret-Ballot Elections (extended abstract). In Proc. 26th ACM Symposium on the Theory of Computing (STOC), pp. 544-553. ACM, 1994.

[CIVTF] California Internet Voting Task Force. http://www.ss.ca.gov/executive/ivote/

[CFSY96] R. Cramer, M. Franklin, B. Schoenmakers, M. Yung. Multi-authority secret ballot elections with linear work. In Advances in Cryptology - CRYPTO '96, volume 1070 of Lecture Notes in Computer Science, pages 72-83, Berlin, 1996. Springer-Verlag.

[CGS97] R. Cramer, R. Gennaro, B. Schoenmakers. A Secure and Optimally Efficient Multi-Authority Election Scheme. European Transactions of Telecommunications, 8:481-489, 1997.

[Cha81] D. Chaum. Untraceable Electronic Mail, Return Addresses, and Digital Pseudonyms. Communications of the ACM, 24(2):84-86, 1981.

[Cuculus] Time-Stamping Project Cuculus. http://www.tml.hut.fi/ helger/cuculus/

[DNSEXT] IETF DNS Extensions (dnsext) Working Group. http://ietf.org/html.charters/dnsext-charter.html

[Election.com] http://www.election.com/

[Ero00] Pasi Eronen. Denial of service in public key protocols. In Proceedings of the Helsinki University of Technology Seminar on Network Security (Fall 2000), to appear in TML laboratory report series, December 2000. http://www.cs.hut.fi/ peronen/publications/

[Fi85] M. Fischer, N. Lynch, and M. Paterson. Impossibility of distributed consensus with one faulty process. Journal of the ACM, 32(2), pp. 374-382, 1985.

[Fi00] Michael J. Fischer. The Consensus Problem in Unreliable Distributed Systems (A Brief Survey). Proc. Int. Conf. on Foundations of Computations Theory, 2000. http://citeseer.nj.nec.com/326938.html

[LipMy01] Helger Lipmaa, Oleg Mürk. E-valimiste realiseerimisvõimaluste analüüs. In Estonian. http://www.just.ee/oldjust/JM/lipmaamyrk.pdf

[IPI] National Workshop on Internet Voting. Conducted by Internet Policy Institute. Sponsored by the USA National Science Foundation. http://www.netvoting.org/

[IPSec] IETF IP Security Protocol (ipsec) Working Group. http://ietf.org/html.charters/ipsec-charter.html

[Kest95] Lawrence Kesteloot. Fault-Tolerant Distributed Consensus. 1995. http://tofu.alt.net/ lk/290.paper/290.paper.html

[MR98] D. Malkhi and M. Reiter. Byzantine quorum systems. Distributed Computing 11(4):203-213, 1998. A preliminary version appears in Proceedings of the 29th ACM Symposium on Theory of Computing, May 1997. http://www.bell-labs.com/user/reiter/#Quorums

[Myr00] Oleg Mürk. Electronic Voting Schemes. Semester work. http://www.math.ut.ee/ olegm/my papers.english.html

[.NET] Microsoft .NET Platform. http://www.microsoft.com/net/

[NTP] Time WWW server. http://www.eecis.udel.edu/ ntp/

[OSI] OSI (Open System Interconnect) reference model (definition). http://webopedia.internet.com/TERM/O/OSI.html

[Rei94] M. K. Reiter. Secure agreement protocols: Reliable and atomic group multicast in Rampart. In Proceedings of the 2nd ACM Conference on Computer and Communication Security, pages 68-80, November 1994. http://www.bell-labs.com/user/reiter/#Rampart

[Rei95] M. K. Reiter. The Rampart toolkit for building high-integrity services. In Theory and Practice in Distributed Systems (Lecture Notes in Computer Science 938), pages 99-110, Springer-Verlag, 1995. http://www.bell-labs.com/user/reiter/#Rampart

[Rei96] M. K. Reiter. A secure group membership protocol. IEEE Transactions on Software Engineering 22(1):31-42, January 1996. http://www.bell-labs.com/user/reiter/#Rampart

[Reiter] Michael Reiter's Homepage. http://www.bell-labs.com/user/reiter/

[Rub00] Avi Rubin. Security Considerations for Remote Electronic Voting over the Internet. http://avirubin.com/e-voting.security.html

[Sch99] B. Schneier and A. Shostack. Breaking Up Is Hard to Do: Modelling Security Threats for Smart Cards. USENIX Workshop on Smart Card Technology, USENIX Press, 1999, pp. 175-185. http://www.counterpane.com/smart-card-threats.html

[TLS] IETF Transport Layer Security (tls) Working Group. http://ietf.org/html.charters/tls-charter.html

[VeriSign] http://www.verisign.com

[VoteHere.net] http://votehere.net/

[VIP] Voting Integrity Project. http://www.voting-integrity.org/projects/votingtechnology/

[X.509] IETF Public-Key Infrastructure (X.509) Working Group. http://ietf.org/html.charters/pkix-charter.html