Proceedings

16th IEEE International Conference on High Performance Computing and Communications (HPCC 2014)

11th IEEE International Conference on Embedded Software and Systems (ICESS 2014)

6th International Symposium on Cyberspace Safety and Security (CSS 2014)

Organized by FEMTO-ST Institute, Ecole Centrale Paris, Ecole des Mines de Paris

Sponsored by IEEE, IEEE Computer Society, IEEE Technical Committee on Scalable Computing

20-22 August 2014, Paris, France

Los Alamitos, California; Washington; Tokyo

2014 IEEE International Conference on High Performance Computing and Communications (HPCC), 2014 IEEE 6th International Symposium on Cyberspace Safety and Security (CSS) and 2014 IEEE 11th International Conference on Embedded Software and Systems (ICESS)

HPCC 2014 Table of Contents

Message from the HPCC/ICESS/CSS 2014 General Chairs ..... xxv
Message from the HPCC 2014 Program Chairs ..... xxvi
Message from the ICESS 2014 Program Chairs ..... xxvii
Message from CSS 2014 Program Chairs ..... xxviii
HPCC 2014 Organizing Committee ..... xxix
ICESS 2014 Organizing Committee ..... xxxvii
CSS 2014 Organizing Committee ..... xl

IEEE International Conference on High Performance Computing and Communications (HPCC 2014)

HPCC DAr 1: Distributed Architecture

Enabling PGAS Productivity with Hardware Support for Shared Address Mapping: A UPC Case Study ..... 1
    Olivier Serres, Abdullah Kayi, Ahmad Anbar, and Tarek El-Ghazawi
HoL-Blocking Avoidance Routing Algorithms in Direct Topologies ..... 11
    Roberto Peñaranda Cebrian, Crispín Gómez Requena, María Engracia Gómez Requena, Pedro López Rodríguez, and Jose Duato Marín
Analyzing the Optimal Voltage/Frequency Pair in Fault-Tolerant Caches ..... 19
    Vicente Lorente, Alejandro Valero, Salvador Petit, Pierfrancesco Foglia, and Julio Sahuquillo

HPCC DAr 2: Distributed Architecture

Dynamic WCET Estimation for Real-Time Multicore Embedded Systems Supporting DVFS ..... 27
    José Luis March, Salvador Petit, Julio Sahuquillo, Houcine Hassan, and José Duato
A Flexible and Scalable Affinity Lock for the Kernel ..... 34
    Benlong Zhang, Junbin Kang, Tianyu Wo, Yuda Wang, and Renyu Yang
Remapping NUCA: Improving NUCA Cache's Power Efficiency ..... 38
    Hui Wang, Chunrong Lai, Yicong Huang, Shih-Lien Lu, Rui Wang, Zhongzhi Luan, and Depei Qian
An Energy-Efficient Multi-GPU Supercomputer ..... 42
    David Rohr, Sebastian Kalcher, Matthias Bach, Abdulqadir A. Alaqeeli, Hani M. Alzaid, Dominic Eschweiler, Volker Lindenstruth, Sakhar B. Alkhereyf, Ahmad Alharthi, Abdulelah Almubarak, Ibraheem Alqwaiz, and Riman Bin Suliman

HPCC DAl 1: Distributed Algorithms

SCADOPT: An Open-Source HPC Framework for Solving PDE Constrained Optimization Problems Using AD ..... 46
    Kim Feldhoff, Martin Flehmig, Ulf Markwardt, Wolfgang E. Nagel, Maria Schütte, and Andrea Walther
Accelerated Solution of Helmholtz Equation with Iterative Krylov Methods on GPU ..... 54
    Abal-Kassim Cheik Ahamed and Frédéric Magoulès
Spectral Domain Decomposition Method for Natural Lighting and Medieval Glass Rendering ..... 62
    Guillaume Gbikpi-Benissan, Rémi Cerise, Patrick Callet, and Frédéric Magoulès
A Synchronous Parallel Max-Flow Algorithm for Real-World Networks ..... 68
    Guojing Cong

HPCC DAl 2: Distributed Algorithms

Benefit of Unbalanced Traffic Distribution for Improving Local Optimization Efficiency in Network-on-Chip ..... 76
    Weiwei Fu, Mingmin Yuan, Tianzhou Chen, Qingsong Shi, Li Liu, and Minghui Wu
Research on Mahalanobis Distance Algorithm Optimization Based on OpenCL ..... 84
    Qingchun Xie, Yunquan Zhang, Haipeng Jia, and Yongquan Lu
HSR: Hierarchical Source Routing Model for Network-on-Chip ..... 92
    Mingmin Yuan, Weiwei Fu, Tianzhou Chen, and Minghui Wu

HPCC DAl 3: Distributed Algorithms

An Exploration on Quantity and Layout of Wireless Nodes for Hybrid Wireless Network-on-Chip ..... 100
    Mingmin Yuan, Weiwei Fu, Tianzhou Chen, and Minghui Wu
Acceleration of Stereo-Matching on Multi-core CPU and GPU ..... 108
    Tian Xu, Paul Cockshott, and Susanne Oehler
A Technique for the Long Term Preservation of Finite Element Meshes ..... 116
    Peter Iványi

HPCC DAl 4: Distributed Algorithms

Parallel Sub-structuring Methods for Solving Sparse Linear Systems on a Cluster of GPUs ..... 121
    Abal-Kassim Cheik Ahamed and Frédéric Magoulès
Fast and Green Computing with Graphics Processing Units for Solving Sparse Linear Systems ..... 129
    Abal-Kassim Cheik Ahamed, Alban Desmaison, and Frédéric Magoulès
Coupling and Simulation of Fluid-Structure Interaction Problems for Automotive Sun-Roof on Graphics Processing Unit ..... 137
    Liang S. Lai, Choi-Hong Lai, Abal-Kassim Cheik Ahamed, and Frédéric Magoulès

HPCC DAl 5: Distributed Algorithms

Comparison of Xeon Phi and Kepler GPU Performance for Finite Element Numerical Integration ..... 145
    Krzysztof Banaś and Filip Kruzel
Efficient Work-Stealing with Blocking Deques ..... 149
    Liu Chi, Song Ping, Liu Yi, and Hao Qinfen
Optimizing Cache Locality for Irregular Data Accesses on Many-Core Intel Xeon Phi Accelerator Chip ..... 153
    Nhat-Phuong Tran, Dong Hoon Choi, and Myungho Lee

HPCC DAl 6: Distributed Algorithms

LU Factorization of Small Matrices: Accelerating Batched DGETRF on the GPU ..... 157
    Tingxing Dong, Azzam Haidar, Piotr Luszczek, James Austin Harris, Stanimire Tomov, and Jack Dongarra
GPU Acceleration of Newton's Method for Large Systems of Polynomial Equations in Double Double and Quad Double Arithmetic ..... 161
    Jan Verschelde and Xiangcheng Yu
An Adaptive Task Granularity Based Scheduling for Task-centric Parallelism ..... 165
    Jianmin Bi, Xiaofei Liao, Yu Zhang, Chencheng Ye, Hai Jin, and Laurence T. Yang

HPCC CCWS 1: Cloud Computing and Web Services

An Energy-Efficient VM Placement in Cloud Datacenter ..... 173
    Fei Teng, Danting Deng, Lei Yu, and Frédéric Magoulès
Reducing Memory in Software-Based Thread-Level Speculation for JavaScript Virtual Machine Execution of Web Applications ..... 181
    Jan Kasper Martinsen, Håkan Grahn, Anders Isberg, and Henrik Sundström
Algorithms for Balanced Graph Bi-partitioning ..... 185
    Jigang Wu, Guiyuan Jiang, Lili Zheng, and Suiping Zhou
Optimizing the Topologies of Virtual Networks for Cloud-Based Big Data Processing ..... 189
    Cong Xu, Jiahai Yang, Hui Yu, Haizhuo Lin, and Hui Zhang

HPCC CCWS 2: Cloud Computing and Web Services

Accelerating the Massive VMs Booting Up ..... 197
    Dayang Zheng, Hai Jin, Xiaofei Liao, and Yu Zhang
Performance Driven Cloud Resource Provisioning ..... 205
    Jay Kiruthika and Souheil Khaddaj
The HPS3 Service: Reduction of Cost and Transfer Time for Storing Data on Clouds ..... 213
    Jorge Veiga, Guillermo L. Taboada, Xoán C. Pardo, and Juan Touriño
Securing Cloud Users at Runtime via a Market Mechanism: A Case for Federated Identity ..... 221
    Giannis Tziakouris, Carlos Joseph Mera Gómez, and Rami Bahsoon

HPCC CCWS 3: Cloud Computing and Web Services

Cost-Effective Virtual Machine Image Replication Management for Cloud Data Centers ..... 229
    Dian Shen, Fang Dong, Junxue Zhang, and Junzhou Luo
ZDLC-Based Modelling and Simulation of Enterprise Systems ..... 237
    B. Makoond, A. Elias, S. Ross-Talbot, S. Khaddaj, and S. Franczuk
Virtual Machine Scheduling Considering Both Computing and Cooling Energy ..... 244
    Xiang Li, Xiaohong Jiang, and Yanzhang He

HPCC CCWS 4: Cloud Computing and Web Services

Cloud Energy Broker: Towards SLA-Driven Green Energy Planning for IaaS Providers ..... 248
    Md Sabbir Hasan, Yousri Kouki, Thomas Ledoux, and Jean Louis Pazat
Enabling Prioritized Cloud I/O Service in Hadoop Distributed File System ..... 256
    Tsozen Yeh and Yifeng Sun
Implementation of the KVM Hypervisor on Several Cloud Platforms: Tuning the Apache CloudStack Agent ..... 260
    Fernando Gomez Folgar, Antonio Garcia Loureiro, Tomas Fernandez Pena, J. Isaac Zablah, and Natalia Seoane

HPCC CCWS 5: Cloud Computing and Web Services

Harnessing Memory Page Distribution for Network-Efficient Live Migration ..... 264
    Kashifuddin Qazi, Yang Li, and Andrew Sohn
Service Deployment in Cloud ..... 268
    Amel Haji, Asma Ben Letaifa, and Sami Tabbane
MOBBS: A Multi-tiered Block Storage System for Virtual Machines Using Object-Based Storage ..... 272
    Sixiang Ma, Haopeng Chen, Heng Lu, Bin Wei, and Pujiang He

HPCC SEC 1: Scientific and Engineering Computing

Improving the Scalability of a Hurricane Forecast System in Mixed-Parallel Environments ..... 276
    Thiago Santos Quirino, Javier Delgado, and Xuejin Zhang
CESMTuner: An Auto-tuning Framework for the Community Earth System Model ..... 282
    Ding Nan, Xue Wei, Ji Xu, Xu Haoyu, and Song Zhenya
The Virtual Open Page Buffer for Multi-core and Multi-thread Processors ..... 290
    Hongwei Zhou, Rangyu Deng, Zefu Dai, Xiaobo Yan, Ying Zhang, and Caixia Sun

HPCC SEC 2: Scientific and Engineering Computing

On the Performance of the WRF Numerical Model over Complex Terrain on a High Performance Computing Cluster ..... 298
    Nicholas Christakis, Theodoros Katsaounis, George Kossioris, and Michael Plexousakis
Power Consumption Analysis of Parallel Algorithms on GPUs ..... 304
    Frédéric Magoulès, Abal-Kassim Cheik Ahamed, Alban Desmaison, Jean-Christophe Léchenet, François Mayer, Haifa Ben Salem, and Thomas Zhu
targetDP: an Abstraction of Lattice Based Parallelism with Portable Performance ..... 312
    Alan Gray and Kevin Stratford

HPCC SEC 3: Scientific and Engineering Computing

Communication Optimal Least Squares Solver ..... 316
    Pawan Kumar
FLLOP: A Massively Parallel Solver Combining FETI Domain Decomposition Method and Quadratic Programming ..... 320
    Vaclav Hapla, Martin Cermak, Alexandros Markopoulos, and David Horak
Performance Implication of Multicore Cache Locking on General-Purpose Processors ..... 328
    Matthew Loach and Wei Zhang

HPCC SEC 4: Scientific and Engineering Computing

SRFTL: An Adaptive Superblock-Based Real-Time Flash Translation Layer for NAND Flash Memory ..... 332
    Xin Li, Zhaoyan Shen, Lei Ju, and Zhiping Jia
Exploiting Hybrid SPM-Cache Architectures to Reduce Energy Consumption for Embedded Computing ..... 340
    Wei Zhang and Lan Wu
Texture-Directed Mobile GPU Power Management for Closed-Source Games ..... 348
    Beilei Sun, Xi Li, Jiachen Song, Zhinan Cheng, Yuan Xu, and Xuehai Zhou

HPCC DAT 1: Distributed Applications and Technologies

Predicting Performance of Hybrid Master/Worker Applications Using Model-Based Regression Trees ..... 355
    Abel Castellanos, Andreu Moreno, Joan Sorribes, and Tomàs Margalef
Leveraging Hierarchical Data Locality in Parallel Programming Models ..... 363
    Ahmad Anbar, Engin Kayraklioglu, Olivier Serres, and Tarek El-Ghazawi
Trajectory Pattern Mining over a Cloud-Based Framework for Urban Computing ..... 367
    Albino Altomare, Eugenio Cesario, Carmela Comito, Fabrizio Marozzo, and Domenico Talia
GPU Maps for the Space of Computation in Triangular Domain Problems ..... 375
    Cristobal A. Navarro and Nancy Hitschfeld

HPCC DAT 2: Distributed Applications and Technologies

Look before You Leap: Using the Right Hardware Resources to Accelerate Applications ..... 383
    Jie Shen, Ana Lucia Varbanescu, and Henk Sips
An Integrated Hardware-Software Approach to Task Graph Management ..... 392
    Nina Engelhardt, Tamer Dallou, Ahmed Elhossini, and Ben Juurlink
A Metadata Update Strategy for Large Directories in Wide-Area File Systems ..... 400
    Guoliang Liu, Zhenjun Liu, Liuying Ma, Shuai Zhang, Jing Huang, and Xiuguo Bao

HPCC DAT 3: Distributed Applications and Technologies

Modelling and Stochastic Simulation of Synthetic Biological Boolean Gates ..... 404
    Daven Sanassy, Harold Fellermann, Natalio Krasnogor, Savas Konur, Laurentiu M. Mierla, Marian Gheorghe, Christophe Ladroue, and Sara Kalvala
High Performance Simulations of Kernel P Systems ..... 409
    Mehmet E. Bakir, Savas Konur, Marian Gheorghe, Ionut Niculescu, and Florentin Ipate
Optimizing GPU Virtualization with Address Mapping and Delayed Submission ..... 413
    Xiaolin Wang, Hanbing Wang, Yan Sang, Zhenlin Wang, and Yingwei Luo
Buffer on Last Level Cache for CPU and GPGPU Data Sharing ..... 417
    Licheng Yu, Tianzhou Chen, Minghui Wu, and Li Liu

HPCC MCN 1: Mobile Computing and Networking

Conflict-Free Opportunistic Centralized Time Slot Assignment in Cognitive Radio Sensor Networks ..... 421
    Ons Mabrouk, Pascale Minet, Hanen Idoudi, and Leila Saidane
Network Aware and Power-Based Resource Allocation in Mobile Ad Hoc Computational Grid ..... 428
    Sayed Chhattan Shah, Sajjad Hussain Chauhdary, Muhammad Bilal, and Myong-Soon Park
An Inter-frame Correlation Based Error Concealment of Immittance Spectral Coefficients for Mobile Speech and Audio Codecs ..... 436
    Yuhong Yang, Shaolong Dong, Ruimin Hu, Yanye Wang, Li Gao, and Maosheng Zhang

HPCC MCN 2: Mobile Computing and Networking

Performance Analysis for New Call Bounding Scheme with SFR in LTE-Advanced Networks ..... 442
    Mahammad A. Safwat, Hesham M. El-Badawy, Ahmad Yehya, and H. El-Motaafy
Adaptive Detection for STBCS in IEEE802.11AC ..... 452
    Debasish Ghose, Smriti Kana Roy, Hung-Ta Pai, and Chun-Yi Wei

HPCC MCN 3: Mobile Computing and Networking

On Delivery Delay-Constrained Throughput and End-to-End Delay in MANETs ..... 456
    Yujian Fang, Yuezhi Zhou, Xiaohong Jiang, and Yaoxue Zhang
Source Misrouting in King Topologies ..... 464
    E. Stafford, C. Martinez, Jose Luis Bosque, Fernando Vallejo, Cristobal Camarero, Borja Perez, and Ramón Beivide
Avoiding Tree Saturation in the Face of Many Hotspots with Few Buffers ..... 472
    Bradley C. Kuszmaul and William H. Kuszmaul

HPCC MCN 4: Mobile Computing and Networking

Simultaneous Optical Path-Setup for Reconfigurable Photonic Networks in Tiled CMPs ..... 482
    Paolo Grani and Sandro Bartolini
Packet Storage at Multi-gigabit Rates Using Off-the-Shelf Systems ..... 486
    Victor Moreno, Pedro M. Santiago Del Río, Javier Ramos, José Luis García-Dorado, Ivan Gonzalez, Francisco J. Gomez-Arribas, and Javier Aracil
SyncSnap: Synchronized Live Memory Snapshots of Virtual Machine Networks ..... 490
    Bin Shi, Bo Li, Lei Cui, Jieyu Zhao, and Jianxin Li
A Multi-layer Hierarchical Inter-cloud Connectivity Model for Sequential Packet Inspection of Tenant Sessions Accessing BI as a Service ..... 498
    Hussain Al-Aqrabi, Lu Liu, Richard Hill, and Nick Antonopoulos

HPCC SCUC 1: Security, Collaborative and Ubiquitous Computing

Developing Scalable Agents in Blueprint ..... 506
    Alex Muscar
Host-Based Card Emulation: Development, Security, and Ecosystem Impact Analysis ..... 514
    Mouhannad Alattar and Mohammed Achemlal
A Pairing-Free Certificateless Authenticated Group Key Agreement Protocol ..... 518
    Gu Xiaozhuo, Xu Taizhong, Zhou Weihua, and Wang Yongming

HPCC SCUC 2: Security, Collaborative and Ubiquitous Computing

CGK: A Collaborative Group Key Management Scheme ..... 522
    Fatma Hendaoui, Hamdi Eltaief, Habib Youssef, and Abdelbasset Trad
A Provisioning Service for Automatic Command Line Applications Deployment in Computing Clouds ..... 526
    Evgeny Pyshkin and Andrey Kuznetsov
CGSIL: Collaborative Geo-clustering Search-Based Indoor Localization ..... 530
    Thong Minh Doan, Han Nguyen Dinh, Nam Tuan Nguyen, and An Truong Pham

IEEE International Conference on Embedded Software and Systems (ICESS 2014)

ICESS 1: Energy Measurement and Management

Characterizing Energy Consumption of Real-Time and Media Benchmarks on Hybrid SPM-Caches ..... 534
    Lan Wu, Yiqiang Ding, and Wei Zhang
Learning Based Power Management for Periodic Real-Time Tasks ..... 542
    Fakhruddin Muhammad Mahbub Ul Islam and Man Lin
Energy Consumption Estimation of Software Components Based on Program Flowcharts ..... 550
    Patrick Heinrich, Hannes Bergler, and Dirk Eilers
An Operation Scenario Model for Energy Harvesting Embedded Systems and an Algorithm to Maximize the Operation Quality ..... 554
    Kazumi Aono, Atsushi Iwata, Hideki Takase, Kazuyoshi Takagi, and Naofumi Takagi

ICESS 2: Platforms and Systems

Modeling Basic Aspects of Cyber-Physical Systems, Part II (Extended Abstract) ..... 558
    Yingfu Zeng, Chad Rose, Paul Brauner, Walid Taha, Jawad Masood, Roland Philippsen, Marcia O'Malley, and Robert Cartwright
An FPGA Based Resources Efficient Solution for the OmniVision Digital VGA Cameras Family ..... 566
    Elmar Yusifli, Reda Yahiaoui, Saeed Mian Qaisar, and Tijani Gharbi
Design and Implementation of Low-Power Location Tracking System Based on IEEE 802.11 ..... 570
    Sanghyun Son, Yongsu Jeon, and Yunju Baek

ICESS 3: Architecture and Systems

"CERE": A CachE Recommendation Engine: Efficient Evolutionary Cache Hierarchy Design Space Exploration ..... 574
    Gabriel Yessin, Abdel-Hameed A. Badawy, Vikram Narayana, David Mayhew, and Tarek El-Ghazawi
Online Data Allocation for Hybrid Memories on Embedded Tele-health Systems ..... 582
    Meikang Qiu, Longbin Chen, Yongxin Zhu, Jingtong Hu, and Xiao Qin
Formulating Optimized Storage and Memory Space Specifications for Linux Network Embedded Systems ..... 588
    Kleomenis Tsiligkos and Apostolos Meliones

ICESS 4: Real-Time Scheduling

Scheduling Analysis of TDMA-Constrained Tasks: Illustration with Software Radio Protocols ..... 593
    Shuai Li, Stéphane Rubini, Frank Singhoff, and Michel Bourdellès
Efficient Online Benefit-Aware Multiprocessor Scheduling Using an Online Choice of Approximation Algorithms ..... 603
    Behnaz Sanati and Albert M.K. Cheng
Dynamic Reservation-Based Mixed-Criticality Task Set Scheduling ..... 611
    Zheng Li, Shangping Ren, and Gang Quan
Minimal Schedulability Testing Interval for Real-Time Periodic Tasks with Arbitrary Release Offsets ..... 619
    Yu Jiang, Qiang Zhou, Xingliang Zou, and Albert M.K. Cheng

ICESS 5: Network Protocols

Vulnerability Analysis of Clock Synchronization Protocol Using Stochastic Petri Net ..... 623
    Shen Jiajun and Feng Dongqin
Contiki80211: An IEEE 802.11 Radio Link Layer for the Contiki OS ..... 629
    Ioannis Glaropoulos, Vladimir Vukadinovic, and Stefan Mangold

ICESS 6: Hardware/Software Co-Design

Planning and Optimization of Resources Deployment: Application to Crisis Management ..... 633
    Jason Mahdjoub and Francis Rousseaux
Monitoring Lick Responses in Animal Behavioral Experiments Using a PSoC ..... 641
    Qingshan Shan, David Bullock, Christian J. Sumner, and Trevor M. Shackleton
Embedded Face Detection Application Based on Local Binary Patterns ..... 649
    Laurentiu Acasandrei and Angel Barriga

ICESS 7: Energy-Efficient Scheduling and Resource Allocation

Voltage Island Aware Energy Efficient Scheduling of Real-Time Tasks on Multi-core Processors ..... 653
    Jun Liu and Jinhua Guo
Energy Efficient Dynamic Core Allocation for Video Decoding in Embedded Multicore Architectures ..... 661
    Rajesh Kumar Pal, Kolin Paul, and Sanjiva Prasad
BATS: An Energy-Efficient Approach to Real-Time Scheduling and Synchronization ..... 669
    Jun Wu

ICESS 8: System on Chip (SoC) and Multicore Systems

CABSR: Congestion Agent Based Source Routing for Network-on-Chip ..... 677
    Mingmin Yuan, Weiwei Fu, Tianzhou Chen, Wei Hu, and Minghui Wu
On Cache-Aware Task Partitioning for Multicore Embedded Real-Time Systems ..... 685
    Aaron Lindsay and Binoy Ravindran
Task Migration for Energy Saving in Real-Time Multiprocessor Systems ..... 693
    Gang Zeng, Yutaka Matsubara, Hiroyuki Tomiyama, and Hiroaki Takada

ICESS 9: Embedded OS

Deadline-Aware Interrupt Coalescing in Controller Area Network (CAN) ..... 701
    Christian Herber, Andre Richter, Thomas Wild, and Andreas Herkersdorf
SmartMig: A Case for Page Migration and Self-Interleaving for On-Chip Distributed Memory Systems ..... 709
    Weiwei Fu, Mingmin Yuan, Tianzhou Chen, Qingsong Shi, Li Liu, and Minghui Wu
A Temporal Partition-Based Linux CPU Scheduler ..... 713
    Xingliang Zou, Albert M.K. Cheng, Yu Li, and Yu Jiang
A Novel Fault Diagnosis in Reversible Logic Circuit ..... 717
    Bikromadittya Mondal and Susanta Chakraborty

ICESS 10: Hardware/Software Co-Design

A Locality-Preserving Write Buffer Design for Page-Mapping Multichannel SSDs ..... 721
    Sheng-Min Huang and Li-Pin Chang
The RESCUE Approach - Towards Compositional Hardware/Software Co-verification ..... 729
    Paula Herber
XGRID: A Scalable Many-Core Embedded Processor ..... 733
    Volkan Gunes and Tony Givargis
Advanced DSP Based Narrowband PLC Modem for Smart Grids Applications ..... 737
    Mohamed Chaker Bali and Chiheb Rebai

ICESS 11: Embedded Security

A Process for the Detection of Design-Level Hardware Trojans Using Verification Methods ..... 741
    Christian Krieg, Michael Rathmair, and Florian Schupfer
An Efficient Admission Control Algorithm for Virtual Sensor Networks ..... 747
    Sawand M. Ajmal, Stefano Paris, Zonghua Zhang, and Farid Naït-Abdesselam
Wireless Video Sensor Network Platform and Its Application for Public Safety ..... 755
    Hyuntae Cho, Yunju Baek, and Chong-Min Kyung


The 6th International Symposium on Cyberspace Safety and Security (CSS 2014) CSS 1: Full Paper Track UI-Dressing to Detect Phishing .................................................................................................................759 Luigi Lo Iacono, Hoai Viet Nguyen, Tobias Hirsch, Maurice Baiers, and Sebastian Möller EP2AC: An Efficient Privacy-Preserving Data Access Control Scheme for Data-Oriented Wireless Sensor Networks ...........................................................................................767 Piyi Yang and Tanveer A Zia Snake: An End-to-End Encrypted Online Social Network ........................................................................775 Alessandro Barenghi, Michele Beretta, Alessandro Di Federico, and Gerardo Pelosi

CSS 2 Robust Edge Based Image Steganography through Pixel Intensity Adjustment ......................................783 Saiful Islam and Phalguni Gupta Online Taint Propagation Analysis with Precise Pointer-to Analysis for Detecting Bugs in Binaries ..................................................................................................................790 Gen Li, Ying Zhang, Shuang-Xi Wang, and Kailu Data Interception through Broken Concurrency in Kernel Land ...............................................................797 Julian L. Rrushi

CSS 3 Out-of-Band Authentication Model with Hashcash Brute-Force Prevention .............................................806 George Violaris and Ioanna Dionysiou A Secure Two-Phase Data Deduplication Scheme ..................................................................................814 Pierre Meye, Philippe Raïpin, Frédéric Tronel, and Emmanuelle Anceaume Bivariate Non-parametric Anomaly Detection ...........................................................................................822 Christian Callegari, Stefano Giordano, and Michele Pagano

CSS 4: Short Paper Track Security Mechanisms for a Cooperative Firewall ......................................................................................826 Hammad Kabir, Raimo Kantola, and Jesús Llorente Santos Virtual Firewall Performance as a Waypoint on a Software Defined Overlay Network .....................................................................................................................................................831 Casimer Decusatis and Peter Mueller


Machine Learning Based Cross-Site Scripting Detection in Online Social Network .....................................................................................................................................................835 Rui Wang, Xiaoqi Jia, Qinlei Li, and Shengzhi Zhang

CSS 5 Asynchronous Covert Communication Using BitTorrent Trackers ...........................................................839 Mathieu Cunche, Mohamed-Ali Kaafar, and Roksana Boreli Cloud Federation? We Are Not Ready Yet ...............................................................................................843 Jacques Bou Abdo, Jacques Demerjian, Hakima Chaouchi, Kabalan Barbar, Guy Pujolle, and Talar Atechian Proof of Retrieval and Ownership Protocols for Images through SPIHT Compression .............................................................................................................................................847 Fatema Rashid, Ali Miri, and Isaac Woungang

Workshops AHPCN: 6th International Symposium on Advances of High Performance Computing and Networking Online Performance Analysis: An Event-Based Workflow Design towards Exascale ......................................................................................................................................851 Michael Wagner, Tobias Hilbrich, and Holger Brunst Analysis of Header Usage Patterns of HTTP Request Messages ...........................................................859 Maria Carla Calzarossa and Luisa Massari Comparison of the Predictive Powers of Phenotypes Combined by Anthropometric Index and Triglyceride for Hypertension Diagnosis Based on Data Mining ..........................................................................................................................................866 Bum Ju Lee and Jong Yeol Kim A Speculative Mechanism for Barrier Synchronization .............................................................................870 Meng Jinglei, Chen Tianzhou, Pan Ping, Yao Jun, and Wu Minghui

AHPCN 2 Extending K-Scope Fortran Source Code Analyzer with Visualization of Performance Profiling Data and Remote Parsing of Source Code .......................................................878 Masaaki Terai, Peter Bryzgalov, Toshiyuki Maeda, and Kazuo Minami Task-Based Parallelization of Unstructured Meshes Assembly Using D&C Strategy ....................................................................................................................................................886 Eric Petit, Loïc Thébault, Nathalie Möller, Quang Dinh, and William Jalby


A Performance Analysis of Long-Term Archiving Techniques .................................................................890 Martín Vigil, Christian Weinert, Kjell Braden, Denise Demirel, and Johannes Buchmann

Archi 1: First International Workshop on Computing System Architectures Simulation of Asynchronous Iterative Algorithms Using SimGrid .............................................................902 Charles-Emile Ramamonjisoa, Lilia Ziane Khodja, David Laiymani, Arnaud Giersch, and Raphaël Couturier Hybrid Ontology-Based Matching for Distributed Discovery of SWS in P2P Systems ....................................................................................................................................................908 Adel Boukhadra, Karima Benatchba, and Amar Balla Analyses on Performance of Gromacs in Hybrid MPI+OpenMP+CUDA Cluster .....................................916 Ce Li, Wenbo Chen, Yang Zhang, and Qifeng Bai

Archi 2: First International Workshop on Computing System Architectures Optical Interconnects between Microprocessor and Memories ................................................................924 Daxin Luo, Yaoda Liu, Xiaoying Liu, Bin Zhang, Gang Li, Qi Liao, Qinfen Hao, and Zhulin Wei Exploiting the Inter-cluster Record Reuse for Stream Processors ...........................................................928 Ying Zhang, Gen Li, Caixia Sun, Hongwei Zhou, and Fayuan Wang Mobile Computers as Scientific Computing Machines ..............................................................................934 WA Smit and BM Herbst

ALG&MOD: First International Workshop on Algorithmic and Modeling New Bounds of a Measure in Information Theory ....................................................................................939 Mihaela-Alexandra Popescu, Oana Slusanschi, Alexandru-Corneliu Olteanu, and Florin Pop A Semantic Rule-Based Approach Towards Process Mining for Personalised Adaptive Learning .....................................................................................................................................941 Kingsley Okoye, Abdel-Rahman H. Tawil, Usman Naeem, Rabih Bashroush, and Elyes Lamine SignalPU: A Programming Model for DSP Applications on Parallel and Heterogeneous Clusters ....................................................................................................................949 Farouk Mansouri, Sylvain Huet, and Dominique Houzet


App: First International Workshop on HPC Applications Hide-as-you-Type: An Approach to Natural Language Steganography through Sentence Modification .................................................................................................................957 Charles A. Clarke, Eckhard Pfluegel, and Dimitris Tsaptsinos Experience Report State-Replication-Based Matching System ...............................................................965 Yiqun Ding, Fan Li, Bo Zhou, Wei Li, Xinyu Wang, and Tong Wu Real-Time Environmental Monitoring for Cloud-Based Hydrogeological Modeling with HydroGeoSphere ...............................................................................................................971 Andrei Lapin, Eryk Schiller, Peter Kropf, Oliver Schilling, Philip Brunner, Almerima Jamakovic-Kapic, Torsten Braun, and Sergio Maffioletti

AMDA 1: First International Workshop on Advances in Memory and Data Access Ex-Tmem: Extending Transcendent Memory with Non-volatile Memory for Virtual Machines ..................................................................................................................................978 Vimalraj Venkatesan, Wei Qingsong, and Y.C. Tay A Bloom Filter Bank Based Hash Table for High Speed Packet Processing ...........................................986 Nicola Bonelli, Christian Callegari, Stefano Giordano, and Gregorio Procissi A Compiler Translate Directive-Based Language to Optimized CUDA ....................................................994 Feng Li, Hong An, Weihao Liang, Xiaoqiang Li, Yichao Cheng, and Xia Jiang

AMDA 2 Exploiting the Fine Grain SSD Internal Parallelism for OLTP and Scientific Workloads ...............................................................................................................................................1002 Soraya Zertal A Novel Approach for Fair and Secure Resource Allocation in Storage Cloud Architectures Based on DRF Mechanism ...............................................................................................1010 Maha Jebalia, Asma Ben Letaïfa, Mohamed Hamdi, and Sami Tabbane

O&S: First International Workshop on Optimization and Scheduling Core Affinity Code Block Schedule to Reduce Inter-core Data Synchronization of SpMT ..................................................................................................................................................1014 John Ye, Songyuan Li, Tianzhou Chen, Minghui Wu, and Li Liu


M2M2 1: 6th International Workshop on Multicore and Multithreaded Architectures and Algorithms Fast and Accurate Code Placement of Embedded Software for Hybrid On-Chip Memory Architecture ................................................................................................................1020 Zimeng Zhou, Lei Ju, Zhiping Jia, and Xin Li Dual-Page Mode: Exploring Parallelism in MLC Flash SSDs .................................................................1028 Yimo Du, Youtao Zhang, and Nong Xiao A Dynamically Adaptive Approach for Speculative Loop Execution in SMT Architectures ...........................................................................................................................................1036 Meirong Li and Yinliang Zhao

M2M2 2 Embedded Multicore Processors and SIMD Instructions for Emotional-Based Mobile Robotic Agents ............................................................................................................................1044 Francisco Almenar Pedros, Carlos Domínguez, Juan-Miguel Martínez, Houcine Hassan, and Pedro López Security Effectiveness and a Hardware Firewall for MPSoCs ................................................................1052 Miltos D. Grammatikakis, Kyprianos Papadimitriou, Polydoros Petrakis, Antonis Papagrigoriou, George Kornaros, Ioannis Christoforakis, and Marcello Coppola Skeleton Paradigm for Developing E-Science Applications on Distributed Platforms .................................................................................................................................................1060 Mohamed Ben Belgacem and Nabil Abdennadher

WCT 1: First International Workshop on Cloud Technologies A Coalitional Game-Theoretic Approach for QoS-Based and Secure Data Storage in Cloud Environment ................................................................................................................1068 Maha Jebalia, Asma Ben Letaïfa, Mohamed Hamdi, and Sami Tabbane Selective Task Scheduling for Time-Targeted Workflow Execution on Cloud ........................................1075 In-Yong Jung and Chang-Sung Jeong Service Level Agreement (SLA)-Based Resource Management for Improving Cloud Services ........................................................................................................................................1080 Kaiqi Xiong Cost-Optimized Resource Provision for Cloud Applications ...................................................................1088 Yuxi Shen, Haopeng Chen, Lingxuan Shen, Cheng Mei, and Xing Pu


WCT 2 Trusted Platforms to Secure Mobile Cloud Computing ...........................................................................1096 Samia Bouzefrane and Le Vinh Thinh Clustering-Based Query Result Authentication for Encrypted Databases in Cloud ...................................................................................................................................................1104 Miyoung Jang, Min Yoon, Deulnyeok Youn, and Jae-Woo Chang Cloud Brokerage Model for Resource Pricing and Refund .....................................................................1111 Mohammad Aazam and Eui-Nam Huh Analysis and Detection of DoS Attacks in Cloud Computing by Using QSE Algorithm .................................................................................................................................................1117 Pallavali Radha Krishna Reddy and Samia Bouzefrane

WCT 3 Design and Implementation of a New Load Estimation Strategy in Cloud .............................................1125 Utpal Biswas, Sourav Banerjee, Prateep Bhattacharjee, and Mayukh Dey A Density-Aware Data Encryption Scheme for Outsourced Databases in Cloud Computing ..............................................................................................................................................1129 Min Yoon, Miyoung Jang, Young-Sung Shin, and Jae-Woo Chang Migrating Scientific Workflows to the Cloud: Through Graph-Partitioning, Scheduling and Peer-to-Peer Data Sharing ...........................................................................................1137 Satish Narayana Srirama and Jaagup Viil Towards an Easy-to-Use Web Application Server and Cloud PaaS for Web Development Education ..........................................................................................................................1145 Philipp Brune, Michael Leiser, and Erica Janke

GPU: First International Workshop on Graphical Processing Unit On Implementing Sparse Matrix Multi-vector Multiplication on GPUs ....................................................1149 Walid Abu-Sufah and Khalid Ahmad Flexible Parallelized Empirical Mode Decomposition in CUDA for Hilbert Huang Transform ....................................................................................................................................1157 Kevin P.Y. Huang, Charles H.P. Wen, and Herming Chiueh JolokiaC++: An Annotation Based Compiler Framework for GPGPUs ..................................................1166 Vibha Patel, Sanjeev Aggarwal, and Amey Karkare GPU Accelerated 3D Image Deformation Using Thin-Plate Splines ......................................................1174 Weixin Luo, Xuan Yang, Xiaoxiao Nan, and Bingfeng Hu


WNet 1: Workshop on Wireless Network Technologies Two New Multicast Algorithms in 3D Mesh and Torus Networks ...........................................................1182 Hovhannes A. Harutyunyan and Shegjian Wang Optimizing a Calibration Software for Radio Astronomy .........................................................................1190 Souley Madougou, Ana Lucia Varbanescu, and Rob Van Nieuwpoort Deterministic Blocker Tag Detection Scheme by Comparing Expected and Observed Slot Status in UHF RFID Inventory Management Systems ............................................1198 Ryo Hattori, Kentaroh Toyoda, and Iwao Sasase Improving Vertical Handover over Heterogeneous Technologies Using a Cross Layer Framework ....................................................................................................................................1202 Mariem Thaalbi and Nabil Tabbane

WNet 2 Throughput Enhancement in Cooperative Wireless Ad Hoc Networks ..................................................1209 Muhammad Khalil Afzal, Byung-Seo Kim, and Sung Won Kim Bounding the Worst-Case Execution Time of Static NUCA Caches ......................................................1213 Yiqiang Ding and Wei Zhang Concurrent Moving-Based Connection Restoration Scheme between Actors to Ensure the Continuous Connectivity in WSANs .................................................................................1217 Yuya Tamura, Takuma Koga, Shinichiro Hara, Kentaroh Toyoda, and Iwao Sasase

PPCSS 1: 6th International Symposium on Cyberspace Safety and Security Workshop Privacy Risks in Publication of Taxi GPS Data .......................................................................................1221 Peipei Sui, Tianyu Wo, Zhangle Wen, and Xianxian Li Security Evaluation for Cyber Situational Awareness .............................................................................1229 Igor Kotenko and Elena Doynikova

PPCSS 2 NoteLocker: Simple Secure Storage Service .........................................................................................1237 Petros Zaris and Harald Gjermundrød Assessing and Managing ICT Risk with Partial Information ...................................................................1245 Fabrizio Baiardi, Fabio Corò, Federico Tonelli, Alessandro Bertolini, Roberto Bertolotti, and Daniela Pestonesi What Private Information Are You Disclosing? A Privacy-Preserving System Supervised by Yourself ...........................................................................................................................1253 Alberto Huertas Celdrán, Manuel Gil Pérez, Félix J. García Clemente, and Gregorio Martínez Pérez


Efficient Privacy Preserving Multicast DNS Service Discovery ..............................................................1261 Daniel Kaiser and Marcel Waldvogel

EMCA: Workshop on Embedded Multi-core Computing and Applications An Embedded-Based Distributed Private Cloud: Power Quality Event Classification ...........................................................................................................................................1269 Xiang-Yao Zheng, Chia-Pang Chen, and Joe-Air Jiang Conductor Temperature Estimation Using the Hadoop MapReduce Framework for Smart Grid Applications .....................................................................................................................1275 Sheng-Kai Pan, Chia-Pang Chen, and Joe-Air Jiang Parallel Subcircuit Extraction Algorithm on GPGPUs .............................................................................1280 Che-Lun Hung, Hsiao-Hsi Wang, Chun-Ting Fu, and Chia-Shin Ou

ETD: First International Workshop on HPC-CFD in Energy/Transport Domains Parallel 3D Sweep Kernel with PARSEC ................................................................................................1285 Salli Moustafa, Mathieu Faverge, Laurent Plagne, and Pierre Ramet Numerical Verification of Large Scale CFD Simulations: One Way to Prepare the Exascale Challenge ..........................................................................................................................1287 Christophe Denis Task-Based Programming for Seismic Imaging: Preliminary Results ....................................................1291 Lionel Boillot, George Bosilca, Emmanuel Agullo, and Henri Calandra Author Index ..........................................................................................................................................1299



A Process for the Detection of Design-Level Hardware Trojans Using Verification Methods

Christian Krieg, Michael Rathmair and Florian Schupfer
Institute of Computer Technology, Vienna University of Technology, Vienna, Austria
[email protected], {rathmair|schupfer}@ict.tuwien.ac.at

Abstract—Hardware Trojans have emerged as a serious threat in the past years. Several methods to detect possible hardware Trojans have been published, most of them aiming at detection during post-fabrication tests. Nevertheless, hardware Trojans are more likely to be inserted at design level, as the resources required to do so are much lower than those required at fabrication. At design level, verification methods have been shown to serve for Trojan detection. In this paper, we propose a design process that utilizes verification methods for hardware Trojan detection and that can be integrated into a state-of-the-art design flow for embedded systems. We outline the fundamentals of verification methods and then go into the details of each step in the process. We identify assets and attackers, and outline which methods are suited to defend against which type of attack.

978-1-4799-6123-8/14 $31.00 © 2014 IEEE DOI 10.1109/HPCC.2014.112

I. INTRODUCTION

Hardware Trojans have been under intense research in the past years. They are digital and/or analog/mixed-signal systems that serve a shadow purpose besides their specified functionality. This additional functionality is inserted maliciously and is unspecified and undocumented. In order to remain stealthy during fabrication and functional tests, hardware Trojans incorporate an activation mechanism, a so-called trigger. A trigger mainly relies on the rare occurrence of values and sequences and aims at activating malicious functionality when the system is already deployed [1]–[3].

In former threat models, a malicious manufacturer is assumed who inserts extra functionality into physical designs. Therefore, detection methods focus on how to detect malicious functionality added this way. During functional tests, test vectors are applied to the circuit under test which take into consideration the activation strategy of a Trojan circuit. This way, specific regions are activated or rare signals are stimulated to force the activation of malicious circuitry. Concurrently, a side-channel analysis (SCA) is performed to measure the impact of malicious circuitry on parameters such as power consumption, leakage currents and timing. The results of the SCA are then compared to a simulation model of the circuit which is assumed to be Trojan-free, a so-called golden model. If they differ too much, a Trojan is to be suspected [3].

Applying SCA for Trojan detection is a promising method for environments where Trojan-free reference models are available. However, the assumption of Trojan absence in such models must be justifiable. In [4], a hardware security life cycle is presented, which is a good basis for security assurance in hardware systems. However, it lacks a specific methodology to detect design-level hardware Trojans. Register transfer level (RTL) verification is recognized in [5] as a means to detect design-level Trojans. Logic encryption is identified in [6] to protect various representations of a design. While this approach is effective against outside attackers, it is inapplicable to detecting design-level Trojans inserted by a malicious designer. Verification at design level is fairly common in recent design flows [4], [7]. Therefore, in this paper we propose a design process aiming at detecting malicious hardware structures and/or behavior at design level, inserted by a malicious designer. Different levels of abstraction are considered. The proposed design process makes extensive use of formal verification methods in order to find extra functionality in hardware designs, thus enabling assessment of Trojan absence in golden models.

II. VERIFICATION OF HW-DESIGNS

In contrast to validation, which is used to ensure that the system fulfills its intended purpose (“building the right system”), verification is a method which checks whether an implementation of a system corresponds to its specification (“building the system right”). Verification of hardware systems can be either complete or incomplete. Recent verification methods imply high computational complexity because of the exponential growth of the state space (“state space explosion”). Therefore, simplifications are performed in order to reduce complexity (design partitioning) and to assess the number of states that are subject to a verification process (reachability analysis). Simulative and formal verification methods can then be applied to the simplified design. Simulative verification methods work on the principle of falsification and check for the presence of errors (instead of their absence). They are applicable to entire systems, as implementation details are neglected. Therefore, errors that have been identified are relative to the degree of abstraction. Due to computational complexity, formal verification methods are applicable only to subsystems. Formal verification methods work on the principle of (mathematical) proof to show the absence of errors in an abstraction of the design. Formal methods are complete relative to both the specification and the abstraction [8].
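The contrast between falsification (simulative verification) and proof (formal verification) can be made concrete with a small sketch. The Python example below is an illustration under stated assumptions, not part of the proposed process: `spec`, `impl`, the 8-bit width, and the trigger value `0xA7` are all hypothetical, and exhaustive enumeration merely stands in for a formal proof, which is feasible here only because the toy input space has 256 values.

```python
from typing import Iterable, Optional

WIDTH = 8  # assumed toy data width (hypothetical)


def spec(x: int) -> int:
    """Specified behavior: increment modulo 2**WIDTH."""
    return (x + 1) % (1 << WIDTH)


def impl(x: int) -> int:
    """Hypothetical implementation that deviates for one rare input value."""
    if x == 0xA7:  # assumed trigger value, chosen only for illustration
        return 0
    return (x + 1) % (1 << WIDTH)


def simulate(vectors: Iterable[int]) -> Optional[int]:
    """Falsification: return the first input on which impl and spec differ."""
    for x in vectors:
        if impl(x) != spec(x):
            return x
    return None  # no error observed, which does not prove absence of errors


def exhaustive_check() -> Optional[int]:
    """Stand-in for a proof: cover the complete input space of the toy design."""
    return simulate(range(1 << WIDTH))
```

A handful of directed test vectors such as `simulate([0, 1, 2, 127, 255])` observes no error, whereas `exhaustive_check()` returns the triggering input. This mirrors why rarely activated Trojan triggers tend to escape simulation.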

A. Design Partitioning

Design partitioning is used to reduce the complexity of a given design in order to enable formal verification. A system is divided into functional subparts which are verified separately (fig. 1). If all subparts pass verification, and all interconnections among all subparts pass verification too, then the system as a whole can be seen as verified. One approach to partition a given design into less complex subparts is to determine the cones of all latches and primary outputs as shown in [9].

These cones can be treated as partitions of the design. Based on the structural overlap of the fan-in cones, latches can be further partitioned by grouping uncorrelated latches [10].

[Figure 1: Design Partitioning. Block labels: Implementation, Design Partitioning, Partitions.]

[Figure 3: Structural Checking. Block labels: Implementation, Netlist Analysis, Structural Constraints, Property Generation, Properties p1, p2, ..., pn, check (|=), result yes/no.]
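The cone-based partitioning of Section II-A can be sketched as a backward traversal of the netlist. The following Python sketch is illustrative only: the toy netlist, the signal names, and the rule that two cones are "uncorrelated" when they share no signal are assumptions made for this example and are not taken from [9] or [10].

```python
from typing import Dict, List, Set

# Hypothetical toy netlist: each driven signal maps to the signals feeding its gate.
# Primary inputs (in1..in3) and the latch output q0 are not keys, so the
# backward traversal stops when it reaches them.
NETLIST: Dict[str, List[str]] = {
    "n1": ["in1", "in2"],
    "n2": ["in2", "q0"],
    "d0": ["n1", "n2"],    # data input of the latch whose output is q0
    "out1": ["n1"],
    "out2": ["in3", "q0"],
}
ROOTS: List[str] = ["d0", "out1", "out2"]  # latch inputs and primary outputs


def fan_in_cone(root: str, netlist: Dict[str, List[str]]) -> Set[str]:
    """Collect every signal that can influence `root` (including root itself)."""
    cone: Set[str] = set()
    stack = [root]
    while stack:
        sig = stack.pop()
        if sig in cone:
            continue
        cone.add(sig)
        stack.extend(netlist.get(sig, []))  # cut points have no fan-in entry
    return cone


def partition_by_cones(roots: List[str],
                       netlist: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """One partition per latch input or primary output."""
    return {root: fan_in_cone(root, netlist) for root in roots}


def uncorrelated(a: str, b: str, cones: Dict[str, Set[str]]) -> bool:
    """Assumed criterion: two roots are uncorrelated if their cones do not overlap."""
    return cones[a].isdisjoint(cones[b])
```

With this netlist, the cones of `out1` and `out2` do not overlap and could be verified independently, while `d0` shares signals with both and would be grouped with them under the assumed criterion.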

B. Reachability Analysis

decision diagram (ROBDD) representation. The ROBDD represents a canonical form, which is a unique representation of the functionality for a fixed variable order. Both ROBDDs are then checked for equivalence (fig. 4). Besides ROBDD-based equivalence checking, also satisfiability (SAT)-based equivalence checking is used due to good scalability properties. [14] Equivalence Checking can be applied to identify modifications during synthesis or modifications of the implementation. Therefore, the risk of malicious synthesis tools or intruders that compromise the implementation can be mitigated.

Reachability analysis is a search problem in a directed system state graph. The result of a reachability analysis is a set of target states satisfied by starting at the initial state and reached by repeatedly applying valid state transitions [11]. Reachability requirements are specified in order to constrain the reachability of states, such as “A state must be reachable”. If states do not satisfy these constraints, they are suspect to implement Trojan behavior. With reachability analysis, states can be identified which are hard or unable to reach per specification. Such states can indicate a possible Trojan state which is entered after a trigger event occurred (fig. 2). [12]

E. Model Checking

Structural checking operates at low levels of abstraction, i.e., gate or transistor level. A netlist at the respective level is analyzed such that all connections between signals are identified [13]. Structural constraints are defined in a way that restrict interconnections between signals (e.g., “only two branches are allowed from a signal”, “a signal must not connect partition 1 with partition 2”, etc.). The concept of structural checking is depicted in fig. 3. This way, modifications of netlists can be detected. Structural checking is closely related to design rule checking [12].

Model checking is used to check specifications if they satisfy a given set of properties. Formally, model checking algorithms operate on a finite transition representation (M ) of a hardware function and check whether a property (p) holds on this model (M |= p) [15], [16]. Properties are formulated in propositional temporal logic (PTL), which are expressions checked on a path of the model (path formulas) or satisfied in a single state (state formulas). Model checking can be applied to identify if an implementation incorporates malicious behavior. Malicious behavior is defined by Trojan properties, which means that for any possible scenario a property has to be defined, which then is checked against the model (fig. 5).

D. Equivalence Checking

F. Threat Model

Equivalence Checking is used to verify if an implementation is functionally equal to its specification. In this context, the term specification refers to a representation of the design at a higher level of abstraction (e.g., RTL), which by a synthesis process is mapped to a representation at a lower level of abstraction (e.g., gate level), the implementation. Therefore, equivalence checking aims at the correct implementation of a specification, or, the correct translation of a representation from a higher level of abstraction to a lower level of abstraction. To prove functional equivalence, both the specification and the implementation are translated into a reduced, ordered binary

For our design process we assume a malicious designer, which adds extra functionality to the RTL design of a system. The added functionality is not specified and not documented, and serves a shadow purpose. We therefore call it malicious functionality. Malicious functionality can be either structural or behavioral. In order to detect malicious functionality, the RTL description of the design has to be fully available, as white-box verification is performed. In our approach, thirdparty intellectual property cores (3PIPs) are verifiable only if they are available as RTL description. Therefore, we assume

C. Structural Checking

Specification M

Implementation

Implementation Synthesis

Model Transformation ?

Reachability Constraints

p1 , p2 , ..., pn Property Generation

|=

yes/ no

?



Properties

yes/no

Figure 4: Equivalence Checking.

Figure 2: Reachability Analysis.

742

M

Implementation

Test Specification

?

Property Generation

tx

ld p1 , p2 , ..., pn

|=

tx snt

stop

tx

Model Transformation

idle

yes/ no

ld

shift tx

tx snt

load

tx

Properties

(a) Formal

“The transmission unit reads a data register, and sends each data bit over a single transmission line. A load signal indicates that the data in the data register are valid. A tx signal indicates that the transmission line is ready. An snt signal indicates that the last bit was sent.”

(b) Informal

Figure 5: Model Checking. Figure 7: System specification of the example UART.

a security policy which restricts the use of 3PIPs to RTL. Also, we assume a specification to be trustworthy, i.e., it is carefully checked against the security policy of the design house. A compromised specification can not be identified by verification. Instead, a specification is assumed to undergo thorough validation. III.

B. Test Specification On the basis of the system specification, the test specification is created which takes into account requirements regarding the implementation of the system. These can be functional or non-functional requirements. In our case, we describe functional requirements that specify malicious behavior which is not desirable and therefore constrains the implementation. Malicious behavior is derived from the security policy of the design house, as well as of known or potential attacks to hardware/software system designs. Based on a known potential attack where extra states are added during implementation [17], we specify for the UART transmitter unit the only legitimate transitions. This is possible because a formal specification is at hand. Figure 8 shows the property specifications in linear time logic (LTL). The properties shown in fig. 8 describe the valid transitions of the FSM which specifies the UART transmission unit. All transitions deviating from this behavior are detected during verification (or, more precisely, during model checking), indicating possible Trojan behavior.

III. VERIFICATION PROCESS

In this section, we describe a state-of-the-art approach to system specification and implementation and how testing and verification are embedded into this process. We identify possible attack vectors and investigate how verification methods can help in detecting and preventing the inclusion of malicious structures and behavior. Figure 6 illustrates this process, highlighting which method is used in which step. The figure shows which step of the process is performed when and by whom, and visualizes the inputs and outputs of each subprocess. We illustrate the verification process with a simple example in order to show the applicability of our approach. A simple universal asynchronous receiver/transmitter (UART) transmission unit is to be designed, which sends data over a serial interface.

C. System Implementation

The system is implemented based on the system specification. In this step, potentially malicious structures and behavior can be inserted by a malicious designer. The resulting document is a description of the system at arbitrary levels of abstraction, ranging from RTL to the physical level. The subprocess of implementation incorporates several steps of synthesis, where each step maps a representation of the design to a lower level of abstraction. In this context, we call the representation at the higher level of abstraction a specification, and the representation at the lower level of abstraction an implementation. Thus, a single representation of the design can be both a specification and an implementation, depending only on the context in which it is treated. To refer explicitly to the specification in which the system behavior is specified, we use the term system specification (cf. section III-A). In our threat model (cf. section II-F) we assume a malicious designer (or, in this context, a malicious implementer). The malicious designer adds extra functionality to the UART transmission unit, which enables it to covertly

A. System Specification

The system specification is created on the basis of requirements and includes a precise description of the system behavior, as well as properties that are required from the system implementation [8]. Behavior is described in an (executable) behavioral model, whereas the properties that are required from the implementation are specified in the test specification (see section III-B). As an example, the transmission unit of a UART is to be designed. The function of a UART transmission unit is to send the contents of a register bit by bit over a transmission line. In order to keep complexity low, we reduce the functionality of the transmitter to demonstrate the design process. The informal specification of the UART transmission unit is as follows: “The transmission unit reads a data register, and sends each data bit over a single transmission line. A load signal indicates that the data in the data register are valid. A tx signal indicates that the transmission line is ready. An snt signal indicates that the last bit was sent.” Informal specifications are inherently ambiguous, incomplete, and poorly structured, which complicates verification [8]. Therefore, we formalize the specification using a finite state machine (FSM), which is depicted in fig. 7a. The formal specification of the UART transmission unit (fig. 7a) provides an exact description of the system behavior. Although implementation details have not been anticipated, design decisions have already been made in determining the system states and the signals that influence transitions in the FSM.
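The formal specification can be read as an executable behavioral model. The following is a minimal Python sketch of such a model, not the authors' model. The four states come from the specification; the transition guards (`ld` leaves idle, `tx` leaves load, `tx` together with `snt` leaves shift, `tx` leaves stop) are assumptions made for illustration, since the exact edge labels of fig. 7a are not reproduced here. The names `State`, `next_state`, and `run` are hypothetical.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    LOAD = "load"
    SHIFT = "shift"
    STOP = "stop"


def next_state(state, ld, tx, snt):
    """One step of the transmitter FSM.

    The guards below are ASSUMED for illustration; only the state set
    and the signal names (ld, tx, snt) are taken from the specification.
    """
    if state is State.IDLE:
        return State.LOAD if ld else State.IDLE
    if state is State.LOAD:
        return State.SHIFT if tx else State.LOAD
    if state is State.SHIFT:
        return State.STOP if (tx and snt) else State.SHIFT
    # Remaining case: State.STOP
    return State.IDLE if tx else State.STOP


def run(inputs, start=State.IDLE):
    """Apply a sequence of (ld, tx, snt) tuples and return the visited states."""
    trace = [start]
    for ld, tx, snt in inputs:
        trace.append(next_state(trace[-1], ld, tx, snt))
    return trace
```

Under these assumed guards, every state either holds or advances to its single successor, which matches the shape of the legitimate transitions that the test specification allows.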

G ((state = idle) -> X (state = idle | state = load))
G ((state = load) -> X (state = load | state = shift))
G ((state = shift) -> X (state = shift | state = stop))
G ((state = stop) -> X (state = stop | state = idle))

Figure 8: Specification of properties that constrain the system with regard to malicious behavior. Legitimate transitions of the FSM are specified.
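To make the meaning of these four properties concrete, the sketch below checks them over a finite recorded state sequence. This is only an approximation, because LTL is defined over infinite paths and a model checker examines all paths of the model rather than one trace. The names `ALLOWED` and `first_violation` are hypothetical.

```python
# Legitimate successors per state, transcribed from the four properties.
ALLOWED = {
    "idle": {"idle", "load"},
    "load": {"load", "shift"},
    "shift": {"shift", "stop"},
    "stop": {"stop", "idle"},
}


def first_violation(trace, allowed=ALLOWED):
    """Return the index i of the first step trace[i] -> trace[i + 1]
    that violates a property, or None if no step does.

    Steps that start in a state not covered by any property are not
    flagged, because each property only constrains its own source state.
    """
    for i in range(len(trace) - 1):
        current, successor = trace[i], trace[i + 1]
        if current in allowed and successor not in allowed[current]:
            return i
    return None
```

For example, a trace that jumps from idle into an unspecified state is reported at index 0, whereas a trace that follows idle, load, shift, stop, idle passes.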


(Figure 6 diagram. Roles: System Architect, Implementer, Verification Engineer. Steps: 1. The function of the system is specified (System Specification). 2. System tests and verifications are specified (Test Specification: behavioral constraints, reachability constraints, structural constraints). 3. The specification is implemented (System Design (RTL)). 4. Tests are planned (Test Plan: design partitioning, model transformation, reachability analysis, property generation). 5. Tests are performed (structural checking, equivalence checking, model checking). 6. Results are evaluated (structural evaluation, equivalence evaluation, property evaluation). 7. The specification is corrected. 8. The implementation is corrected. Further artifacts: Test Documentation (golden reference), Test Evaluation Documentation, Errors, Corrected Specification, Corrected Implementation.)

Figure 6: Process of Testing and Verification.1

tation. Two states are added in order to covertly transmit data over the UART. In fig. 9b, illustrative VHDL code is listed, which shows the mechanism for entering the extra states. Code which implements data transmission is deliberately omitted in favor of understanding.

case stNext is
  when st_idle =>
    if t = '1' then stNext [...]

[...] X (state = st_shift | state = st_stop));
LTLSPEC G ( (state = st_stop) -> X (state = st_stop | state = st_idle));
LTLSPEC G ( (state = st_idle) -> X (state = st_idle | state = st_load));

E. Verification and Testing

Having completed the preparatory tasks, the tests and verifications can now be performed. To detect Trojan functionality that has been inserted at different steps in the design flow, the following verification methods are applied to the implementation. To prove that the implementation is functionally equivalent to the specification, an equivalence check is performed. This should reveal compromised synthesis tools or alterations of a synthesized design. Model checking is used to check whether potential Trojan behavior has been added to the design. Properties that have been generated in the test planning phase of the process and describe Trojan behavior are checked against the ROBDD representation of the system model. If Trojan behavior is detected in the design, the model checking tool generates a counterexample, which highlights the location of possible malicious inclusions. Model checking aims at revealing malicious components that have been inserted by malicious designers. To check whether the implementation was altered at a low level of abstraction (e.g., gate or transistor level), it is submitted to structural checking. In structural checking, the interconnections between signals are checked against structural constraints that are specified in the test specification (see section III-D). The results of each verification are stored in the test documentation.
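The model checking step can be illustrated with a small explicit-state search. This is a sketch under stated assumptions, not the symbolic ROBDD-based procedure of an actual model checker. It explores the reachable states of an implementation's transition graph breadth-first and returns the shortest path that ends in a transition outside the legitimate set. The transition graphs `CLEAN` and `TROJAN` are hypothetical, and the second extra state name `st_tshift` is invented for illustration.

```python
from collections import deque

# Legitimate successors per state, using implementation-level state names.
ALLOWED = {
    "st_idle": {"st_idle", "st_load"},
    "st_load": {"st_load", "st_shift"},
    "st_shift": {"st_shift", "st_stop"},
    "st_stop": {"st_stop", "st_idle"},
}

# Hypothetical transition graphs of two implementations.
CLEAN = {
    "st_idle": ["st_idle", "st_load"],
    "st_load": ["st_load", "st_shift"],
    "st_shift": ["st_shift", "st_stop"],
    "st_stop": ["st_stop", "st_idle"],
}
TROJAN = dict(CLEAN)
TROJAN["st_idle"] = ["st_idle", "st_load", "st_tload"]  # extra edge into a Trojan state
TROJAN["st_tload"] = ["st_tshift"]                      # st_tshift is an assumed name
TROJAN["st_tshift"] = ["st_idle"]


def find_counterexample(transitions, initial, allowed):
    """Breadth-first search over reachable states.

    Returns the state sequence from `initial` up to and including the
    first illegitimate transition found, or None if all reachable
    transitions out of specified states are legitimate.
    """
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for successor in transitions.get(state, ()):
            if state in allowed and successor not in allowed[state]:
                # Rebuild the path initial -> ... -> state -> successor.
                path = [successor, state]
                node = parent[state]
                while node is not None:
                    path.append(node)
                    node = parent[node]
                return list(reversed(path))
            if successor not in parent:
                parent[successor] = state
                queue.append(successor)
    return None
```

Run on the clean graph, the search finds nothing; on the graph with the extra states it returns a path that ends where the unspecified state is entered, which plays the role of the counterexample described above.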

Figure 11: Test Planning. The system implementation is transformed to the input language of the NuSMV model checking tool. This working example also includes the system properties from fig. 10.

Our attack scenario assumes a malicious designer who adds extra states to the FSM representation of the example UART transmission module. For demonstration purposes, we perform model checking in order to verify whether the model of the implementation satisfies the behavioral constraints that have been specified in the test specification (cf. section III-B). As expected, the NuSMV model checker produces a counterexample because the implementation does not correspond to its specification. Figure 12 lists the output of the NuSMV model checking tool. Lines 10 to 12 of fig. 12 reveal that the model of the implementation does not behave as specified. As proof, a counterexample is generated, as shown in lines 15 to 30. Line 28 shows that the specification is violated when the (unspecified and maliciously added) state tload is entered.

F. Evaluation and Correction

The test documentation is handed over to the verification engineer. For every reported incident, the verification engineer locates the cause and investigates why verification failed. Any incident can be traced back to either a specification error or an implementation error. The verification engineer tracks each incident back to its root cause and documents it in the test evaluation documentation. The evaluation documentation is passed to both the system architect and the implementer, who subsequently remove the discrepancies detected during verification by correcting the design and/or the specification. The corrected versions of the specification and the implementation are again submitted to verification and subsequently evaluated. This is an iterative

LTLSPEC G ( (state = st_idle) -> X ( state = st_idle | state = st_load) )
LTLSPEC G ( (state = st_load) -> X ( state = st_load | state = st_shift) )
LTLSPEC G ( (state = st_shift) -> X ( state = st_shift | state = st_stop) )
LTLSPEC G ( (state = st_stop) -> X ( state = st_stop | state = st_idle) )

Figure 10: Test Planning. The system properties are mapped to the implementation (state and signal names) and to the tool chain that is used for verification (syntax).
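The mapping described in this caption can be sketched as a small generator that turns the abstract legitimate-transition table into tool-specific property strings. This is an illustrative sketch only: the function name `to_ltlspec`, the `st_` prefix parameter, and the table layout are assumptions, and the output merely mirrors the `LTLSPEC` syntax shown in the figure.

```python
# Abstract legitimate transitions, as given in the test specification.
ALLOWED_SUCCESSORS = {
    "idle": ["idle", "load"],
    "load": ["load", "shift"],
    "shift": ["shift", "stop"],
    "stop": ["stop", "idle"],
}


def to_ltlspec(allowed, prefix="st_", var="state"):
    """Map each source state to one LTLSPEC line.

    `prefix` renames abstract states to implementation state names and
    `var` is the name of the state variable in the model; both are
    assumptions for this sketch.
    """
    specs = []
    for source, successors in allowed.items():
        alternatives = " | ".join(f"{var} = {prefix}{s}" for s in successors)
        specs.append(
            f"LTLSPEC G ( ({var} = {prefix}{source}) -> X ({alternatives}) )"
        )
    return specs
```

Calling `to_ltlspec(ALLOWED_SUCCESSORS)` yields one property per specified state, which could then be appended to the transformed system model before it is handed to the model checker.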


REFERENCES

-- specification G (state = st_load -> X (state = st_load | state = st_shift)) IN uart_tx is true
-- specification G (state = st_shift -> X (state = st_shift | state = st_stop)) IN uart_tx is true
-- specification G (state = st_stop -> X (state = st_stop | state = st_idle)) IN uart_tx is true
-- specification G (state = st_idle -> X (state = st_idle | state = st_load)) IN uart_tx is false
-- as demonstrated by the following execution sequence
Trace Description: LTL Counterexample
Trace Type: Counterexample
-> State: 1.1
State: 1.2
State: 1.3
State: 1.4