Author: Gilbert Shaw
Report

Team: System Modeling Div. Automotive Electronics 1/9

Vilmos Zsombori

Transformation of C code to Matlab/Simulink models Approach based on parsing

1. Introduction

1.1. Motivation

In the modern development process for automotive embedded software, graphical executable specifications are beginning to replace textual specifications and are accepted by both manufacturers and suppliers. Their advantages include ease of understanding and communication, fast development with executable specifications in modeling and simulation tools, and automatic code generation, as the commercial tools become more and more mature. The overall aim of this project is to transform existing control functions written in C into block diagram models in Matlab/Simulink. The amount of source code to be transformed, the need for a uniform transformation, and the existence of strict transformation patterns based on the C/C++ language structures motivated an exploration of a tool that could assist engineers in the transformation work. The approach presented here is based on a third-party parser generator called JavaCC, which is used to develop a Java-based processor for the C/C++ grammar.

1.2. Approach

A simulation model created in Simulink is stored in a text file (the Simulink model file, “mdl”) that defines the whole structure of the target system. The goal is to develop a translator that processes the given C files and automatically generates the corresponding Simulink model files. Because Matlab makes it possible to build Simulink models from the command line, a thorough understanding of the model file structure is not needed. Instead, we generate an intermediate Matlab script file whose execution at the Matlab command line produces the desired Simulink model.

1.3. Restrictions

C code usually relies on a commercial preprocessor to process the source files before they are passed to an analyzer. The C preprocessor directives specify actions for (1) macro substitution, (2) conditional compilation and (3) source file inclusion. Only after a successful run of the preprocessor can the source code be passed on to any kind of parsing. The developed parser assumes that the C source file has been preprocessed.

2. The translator

2.1. Overview

Sun Microsystems has developed many supporting packages that extend the power of the Java programming language. Here, I refer to JavaCC (Java Compiler Compiler) [4], which is used to generate parsers in Java. With JavaCC, I developed a universal C/C++ parser, which runs on any computer platform with a Java Virtual Machine. The parser can be used for the lexical and syntactic analysis of any source code written in C/C++. During the analysis, a parse tree and a symbol table are constructed. Furthermore, output actions are implemented according to the transformation patterns and the modeling guidelines, and these actions are attached to specific classes of the grammar. The parse tree constructed during parsing consists of nodes that are instances of these classes and encapsulate the appropriate transformation actions. Hence, the translation is carried out by traversing the parse tree, using the information from the symbol table and applying the encapsulated actions.
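The idea of parse tree nodes that encapsulate their own transformation actions can be illustrated with a minimal Java sketch. All names here (Node, emit, FunctionNode, StatementNode, ParseTreeDemo) are hypothetical and not taken from the translator itself, and the mapping of a function to a subsystem block is an assumption made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class: every parse tree node carries its own output action.
abstract class Node {
    protected final List<Node> children = new ArrayList<>();

    Node add(Node child) {
        children.add(child);
        return this;
    }

    // The encapsulated transformation action: append script lines for this node.
    abstract void emit(String prefix, List<String> script);
}

// Assumption: a function becomes a subsystem that contains its statements.
final class FunctionNode extends Node {
    private final String name;

    FunctionNode(String name) { this.name = name; }

    @Override
    void emit(String prefix, List<String> script) {
        String path = prefix + "/" + name;
        script.add("add_block('built-in/Subsystem', '" + path + "')");
        for (Node child : children) {
            child.emit(path, script); // traverse the tree, applying each action
        }
    }
}

// Assumption: a statement is represented by a single placeholder subsystem.
final class StatementNode extends Node {
    private final String label;

    StatementNode(String label) { this.label = label; }

    @Override
    void emit(String prefix, List<String> script) {
        script.add("add_block('built-in/Subsystem', '" + prefix + "/" + label + "')");
    }
}

final class ParseTreeDemo {
    // Traverses the tree from the root and collects the generated script lines.
    static List<String> translate(Node root, String model) {
        List<String> script = new ArrayList<>();
        root.emit(model, script);
        return script;
    }
}
```

In this sketch, changing the translation target only requires changing the bodies of the emit methods; the tree structure and the traversal stay the same.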

2.2. General architecture

Figure 1. General architecture of the translator

2.3. The parser – How JavaCC works

JavaCC is a compiler generator that accepts language specifications in a BNF-like format as input. The generated parser contains the core components of a compiler for the specified language: a lexical analyzer and a syntax analyzer. Figure 2 shows the overall structure of a parser generated by JavaCC.

Figure 2. Generation of JavaCC Parser

2.3.1. Lexical Analysis with JavaCC

The lexical analyzer in JavaCC is called the “TokenManager”. The TokenManager groups characters from an input stream into tokens according to specific rules. Each rule specified in the TokenManager is associated with an expression kind [2]:

SKIP: Throws away the matched expression.
MORE: Continues taking the next matched expression to build up a longer expression.
TOKEN: Creates a token from the matched expression and sends it to the syntax analyzer.
SPECIAL_TOKEN: Creates a token from the matched expression and, unlike TOKEN, only optionally sends it to the syntax analyzer.

The TokenManager is a state machine that moves between different lexical states to classify tokens. Figure 3 illustrates a sample state machine of the lexical analyzer. When the analyzer starts, it is in the Default State, which waits for input. If the input is the character "A", it moves to State A. From State A, if the input is the character "A", it stays in the same state; if the input is the character "B" or "C", it moves to the corresponding state. If the state machine encounters an unspecified situation, such as the character "E" in State C, it generates a lexical error. A complete lexical definition of the C/C++ language is given in [2]. The following code segment is a portion of the lexical definitions for the C/C++ lexical analyzer in JavaCC:

SKIP : { " " | "\n" | … }

TOKEN : { … | … }

TOKEN : { … | … | … | … | … }

TOKEN : { … | … | … }

Figure 3. State Machine Features in Lexical Analyzer
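The state machine behaviour described above can be sketched in Java as a small transition table. This is an illustration only, not the TokenManager that JavaCC generates: the class and method names are hypothetical, and only the transitions named in the text are modeled, so States B and C are assumed to have no outgoing transitions.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the sample lexical state machine.
final class LexicalStateMachine {
    enum State { DEFAULT, A, B, C }

    private static final Map<State, Map<Character, State>> TRANSITIONS =
            new EnumMap<>(State.class);

    static {
        for (State s : State.values()) {
            TRANSITIONS.put(s, new HashMap<>());
        }
        TRANSITIONS.get(State.DEFAULT).put('A', State.A); // Default --A--> A
        TRANSITIONS.get(State.A).put('A', State.A);       // A --A--> A
        TRANSITIONS.get(State.A).put('B', State.B);       // A --B--> B
        TRANSITIONS.get(State.A).put('C', State.C);       // A --C--> C
    }

    // Consumes the input one character at a time and returns the final state.
    // Any character without a defined transition is reported as a lexical error.
    static State run(String input) {
        State state = State.DEFAULT;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            State next = TRANSITIONS.get(state).get(c);
            if (next == null) {
                throw new IllegalStateException(
                        "Lexical error: '" + c + "' in state " + state + " at position " + i);
            }
            state = next;
        }
        return state;
    }
}
```

For example, LexicalStateMachine.run("AAB") ends in State B, while LexicalStateMachine.run("AACE") throws, because no transition is defined for "E" in State C.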

2.3.2. Syntax Analysis with JavaCC

The syntax analyzer in JavaCC is a recursive-descent LL(k) parser. This type of parser uses k lookahead tokens to choose among a set of mutually exclusive productions that recognize the language being parsed. By default, JavaCC’s syntax analyzer sets k to 1, but developers can override the number of lookahead tokens with an arbitrary value so that productions are matched correctly. LL(k) parsers allow only right recursion in productions [5]. Consider a commonly used syntax for expressions in C/C++ that calls itself recursively:

expression : expression operators expression
           | "(" expression ")"
           | … ;

Because left-recursive productions are not allowed in LL parsers, the syntax must be restructured so that the parser can recognize the production correctly with a limited number of lookahead tokens. To this end, the tokens that distinguish the mutually exclusive cases of a production should be placed at the beginning of each case. In the expression example, we use the leading token of each case, such as the character "(", to separate the production into two mutually exclusive cases. The expression production is then structured as shown:

expression : … (operator expression)+
           | "(" expression ")" ;

This restructuring is always required in LL(k) parsers, and it is often not trivial to derive the required changes to the grammar structure from the target language’s BNF specification. The conversion of the syntax to right recursion should therefore be carried out with care. A thorough syntactic description of the C/C++ grammar is given in [1]. The following code segment is a portion of the grammar definitions for the C/C++ syntax analyzer in JavaCC:

void translation_unit() : {}
{
  ( external_declaration() )*
}

void external_declaration() : {}
{
  function_definition()
| declaration()
}

void function_definition() : {}
{
  declaration_specifiers() declarator() declaration_list() compound_statement()
| declaration_specifiers() declarator() compound_statement()
| declarator() declaration_list() compound_statement()
| declarator() compound_statement()
}

void selection_statement() : {}
{
  … "(" expression() ")" statement()
| … "(" expression() ")" statement() … statement()
| … "(" expression() ")" statement()
}
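The restructuring of the expression production into right recursion can be illustrated with a hand-written recursive-descent recognizer in Java. This is a sketch under stated assumptions, not the code that JavaCC generates: the grammar used here (expression : primary ( operator expression )? and primary : "(" expression ")" | identifier) is a simplified, hypothetical variant, identifiers are single letters, and operator precedence is ignored.

```java
// Hypothetical LL(1) recognizer for a simplified right-recursive expression grammar.
final class ExpressionParser {
    private final String input;
    private int pos = 0;

    private ExpressionParser(String input) {
        this.input = input.replace(" ", "");
    }

    // Returns a fully parenthesized form that shows how the input was grouped.
    static String parse(String text) {
        ExpressionParser p = new ExpressionParser(text);
        String tree = p.expression();
        if (p.pos != p.input.length()) {
            throw new IllegalArgumentException("Unexpected '" + p.peek() + "' at " + p.pos);
        }
        return tree;
    }

    // expression : primary ( operator expression )?
    // The recursion is on the right, so one lookahead token decides each step.
    private String expression() {
        String left = primary();
        if (isOperator(peek())) {
            char op = input.charAt(pos++);
            String right = expression();
            return "(" + left + op + right + ")";
        }
        return left;
    }

    // primary : "(" expression ")" | identifier
    // The two cases are mutually exclusive on their first token.
    private String primary() {
        char c = peek();
        if (c == '(') {
            pos++;
            String inner = expression();
            expect(')');
            return inner;
        }
        if (Character.isLetter(c)) {
            pos++;
            return String.valueOf(c);
        }
        throw new IllegalArgumentException("Unexpected '" + c + "' at " + pos);
    }

    private char peek() {
        return pos < input.length() ? input.charAt(pos) : '\0';
    }

    private static boolean isOperator(char c) {
        return c == '+' || c == '-' || c == '*' || c == '/';
    }

    private void expect(char c) {
        if (peek() != c) {
            throw new IllegalArgumentException("Expected '" + c + "' at " + pos);
        }
        pos++;
    }
}
```

Because the recursion is on the right, ExpressionParser.parse("a-b-c") groups the input as (a-(b-c)). A real C grammar needs additional productions to obtain the correct precedence and left associativity, which is one reason the conversion has to be done with care.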

2.4. The script generator

A simulation model created in Simulink is stored in a text file (the Simulink model file, “mdl”) that defines the whole structure of the target system. Because Matlab makes it possible to build Simulink models from the command line, a thorough understanding of the model file structure is not needed. Instead, we generate an intermediate Matlab script file whose execution at the Matlab command line produces the desired Simulink model. The complete list of model construction commands, as well as the model and block parameters, is described in [6].

if ( expression ) { statement1 } else { statement2 }

add_block(…) { blocks for the logical expression }
add_block('built-in/Subsystem', '{prefix}/statement1')
add_block('built-in/Subsystem', '{prefix}/statement2')
add_block('built-in/Logical Operator', '…/not', 'Operation', 'NOT')
add_line('{prefix}', '{logical exprn out}/1', 'statement1/Enable')
add_line('{prefix}', '{logical exprn out}/1', 'not/1')
add_line('{prefix}', 'not/1', 'statement2/Enable')

Figure 4. Transformation pattern and intermediate Matlab script.
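The way an output action might produce the script lines of Figure 4 can be sketched in Java as follows. The class IfElseScript and its parameters are hypothetical; the sketch assumes that the blocks for the logical expression have already been emitted, and that the NOT block is placed directly under the given prefix, which the figure leaves elided.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical output action for the if-then-else pattern of Figure 4.
final class IfElseScript {
    // prefix:  path of the enclosing system (assumed parameter)
    // condOut: name of the block whose first output carries the logical
    //          expression (assumed parameter)
    static List<String> generate(String prefix, String condOut) {
        List<String> s = new ArrayList<>();
        // One enabled subsystem per branch.
        s.add(String.format("add_block('built-in/Subsystem', '%s/statement1')", prefix));
        s.add(String.format("add_block('built-in/Subsystem', '%s/statement2')", prefix));
        // Assumption: the NOT block lives directly under the prefix.
        s.add(String.format(
                "add_block('built-in/Logical Operator', '%s/not', 'Operation', 'NOT')", prefix));
        // The condition enables statement1; its negation enables statement2.
        s.add(String.format("add_line('%s', '%s/1', 'statement1/Enable')", prefix, condOut));
        s.add(String.format("add_line('%s', '%s/1', 'not/1')", prefix, condOut));
        s.add(String.format("add_line('%s', 'not/1', 'statement2/Enable')", prefix));
        return s;
    }
}
```

Calling IfElseScript.generate("model/f", "cond") returns six Matlab commands that could be written line by line into the intermediate script file.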

3. The translation process

The translator allows source code written in C/C++ to be converted into Simulink models. The translation process consists of three phases: parsing, script generation and model generation. The process is sketched in Figure 5.

In the first phase, the parser looks for predefined patterns in the input sequence. When a pattern is found (for example, the pattern of a function definition or a selection statement), a new object of the corresponding class (a Node subclass) is instantiated. These objects make up the parse tree. The hierarchy of scopes, global scope versus local scope (structure definition, function definition or compound statement), is also created in this phase. A symbol table containing the declared types and variables is built for each scope.

During the second phase, the parse tree is traversed in order and the encapsulated actions are applied, gradually building up a new Matlab script file that contains the commands needed for model generation. The script is saved under the same name as the source file.

The third phase is the generation of the model from the script file, which is accomplished simply by launching the script at the Matlab command line. The new Simulink model is stored under the same name as the source file and the script file.
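The hierarchy of scopes with one symbol table per scope can be sketched in Java as a chain of tables. The class Scope and its methods are hypothetical, and types are represented as plain strings for brevity.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical scope with its own symbol table and a link to the enclosing scope.
final class Scope {
    private final Scope parent; // null for the global scope
    private final Map<String, String> symbols = new HashMap<>();

    Scope(Scope parent) {
        this.parent = parent;
    }

    // Records a declared variable (or type name) in this scope only.
    void declare(String name, String type) {
        symbols.put(name, type);
    }

    // Looks in this scope first, then in the enclosing scopes.
    // Returns null when the name is not declared anywhere.
    String lookup(String name) {
        for (Scope s = this; s != null; s = s.parent) {
            String type = s.symbols.get(name);
            if (type != null) {
                return type;
            }
        }
        return null;
    }
}
```

A compound statement would create a new Scope whose parent is the scope of the enclosing function, so that local declarations shadow global ones while the traversal can still resolve global names.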

Figure 5. The process of a conversion from a C/C++ source file to a Simulink model.

The developed parser assumes that the C source file has been preprocessed. This is currently accomplished with a C compiler (MinGW) by issuing the following command:

gcc -P -E … > …

The whole translation process is then carried out automatically by issuing the following command:

java C2Model -cp [path]C2Model.jar

4. Results

The source code used for testing is the entire sign.c file. The whole transformation process takes no more than 3 minutes. A few fragments of the resulting output model are presented below:

Figure 6. Bus systems in the output model, created from a structure.

Figure 7. Conditional operations.

Figure 8. If-then-else structure.

Figure 9. Functions.

5. Conclusion

A new automated translation method, based on parsing and a third-party compiler generator, has been presented; it transforms source code written in C/C++ into Matlab/Simulink models. The method yields fast and error-free operation and has proven capable of handling large source files without human intervention. Although some issues remain concerning the organization of the output models (localization and positioning), the results at this stage are encouraging. By switching the implemented actions, the parser can be adapted to any “bit-by-bit transformation”. However, the focus of this approach is the source code itself, not its logic and functionality. For this reason, it cannot address simplification and re-engineering, which are among the essential objectives of the entire project; the developed tool can therefore only assist the transformation work.

References

[1] Jutta Degener, ANSI C Yacc grammar, 1995. URL: http://www.lysator.liu.se/c/ANSI-C-grammar-y.html

[2] Jutta Degener, ANSI C grammar, Lex specification, 1995. URL: http://www.lysator.liu.se/c/ANSI-C-grammar-l.html

[3] Succi, Wong, The application of JavaCC to develop a C/C++ preprocessor, ACM SIGAPP Applied Computing Review, Vol. 7, Issue 3, Fall 1999.

[4] Java Compiler Compiler™ (JavaCC™) – The Java Parser Generator. URL: https://javacc.dev.java.net/doc/docindex.html

[5] A. Aho, R. Sethi, J. Ullman, Compilers: Principles, Techniques, and Tools, Addison-Wesley, 1986.

[6] MathWorks™ – Simulink Reference. URL: http://www.mathworks.com/access/helpdesk/help/toolbox/simulink/slref/slref.html