An Introduction to the Clang API
Mark Wilson, Senior Software Engineer, Integrated Computer Solutions, Inc.

Author: Beatrice Miller

Agenda

● Quick introduction to Clang and LLVM
● Clang usage and error reporting
● Using Clang with Qt
● Basic parser and Clang terminology
● Using the Clang API to highlight text
● Questions/Feedback

About Clang

• One of many front ends to the LLVM compiler infrastructure
• LLVM once stood for Low Level Virtual Machine; the acronym is no longer meaningful, and LLVM is now the full name
• Designed to compile C, C++, Objective-C, and Objective-C++ to machine code
• Apple is the primary developer of Clang; Clang is the official compiler for the Apple SDK
• Clang is the default compiler on FreeBSD
• Clang is highly compatible with GCC
• Clang is a production-quality compiler for C++98
• C++11 support is advancing rapidly
• The Clang API gives programmatic insight into any C++ code base, which makes tool development relatively easy

Clang Diagnostics are Simpler

GCC diagnostic for a simple error:

    file.cc:7:1: error: expected ';' before '}' token

The same error output by Clang:

    file.cc:6:11: error: expected ';' after expression
        i += 8
              ^
              ;

Even for Dreaded Template Errors

    #include <string>
    #include <map>

    int main(int argc, char** argv)
    {
        std::map<std::string, int> aMap;
        aMap[1] = "clang";
    }

Typical Template Error Diagnostic

    try.cc: In function 'int main(int, char**)':
    try.cc:9:11: error: invalid user-defined conversion from 'int' to 'const key_type& {aka const std::basic_string<char>&}' [-fpermissive]
    In file included from /usr/include/c++/4.7/string:55:0,
                     from try.cc:1:
    /usr/include/c++/4.7/bits/basic_string.tcc:214:5: note: candidate is: std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const _CharT*, const _Alloc&) [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]
    /usr/include/c++/4.7/bits/basic_string.tcc:214:5: note: no known conversion for argument 1 from 'int' to 'const char*'
    try.cc:9:11: error: invalid conversion from 'int' to 'const char*' [-fpermissive]
    In file included from /usr/include/c++/4.7/string:55:0,
                     from try.cc:1:
    /usr/include/c++/4.7/bits/basic_string.tcc:214:5: error: initializing argument 1 of 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const _CharT*, const _Alloc&) [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]' [-fpermissive]
    try.cc:9:15: error: invalid conversion from 'const char*' to 'std::map<std::basic_string<char>, int>::mapped_type {aka int}' [-fpermissive]

Clang Template Error Diagnostic

    try.cc:9:9: error: no viable overloaded operator[] for type 'std::map<std::string, int>'
        aMap[1] = "clang";
        ~~~~^~

Using Clang with Qt

Since Clang and GCC are compatible, you can build and link against a Qt installation that was built with GCC. Tell qmake to use the Clang C and C++ compilers:

    qmake QMAKE_CC=clang QMAKE_CXX=clang++

Basic Compiler/Parser Terminology

• A compiler's function is to transform one form of code into another, e.g. C++ source => x86 assembly
• The compiler scans the stream of characters that make up the code and tokenizes it into:
  – Numeric/string literals
  – Punctuation
  – Language keywords
  – Identifiers
  – Comments
• Tokenization produces a stream of tokens, which is parsed to:
  – Ensure correct syntax
  – Discover the inherent structure of the program
  – Build an Abstract Syntax Tree (AST) representation of the source
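The tokenization step can be illustrated with a small sketch. This is not how Clang's lexer is implemented; it is a minimal stand-in that assumes a tiny language subset (// comments, decimal integers, double-quoted strings without escapes, single-character punctuation, and a handful of keywords). The names TokenKind, Token, and tokenize are hypothetical and exist only for this example.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical token categories, matching the five listed above.
enum class TokenKind { Literal, Punctuation, Keyword, Identifier, Comment };

struct Token {
    TokenKind kind;
    std::string text;
};

// Splits `src` into tokens. Assumption: only the small subset described
// in the lead-in is handled; anything unrecognized counts as punctuation.
std::vector<Token> tokenize(const std::string& src) {
    static const std::set<std::string> keywords = {"int", "return", "if", "else", "while"};
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < src.size()) {
        const unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        const std::size_t start = i;
        TokenKind kind = TokenKind::Punctuation;
        if (c == '/' && i + 1 < src.size() && src[i + 1] == '/') {
            while (i < src.size() && src[i] != '\n') ++i;  // comment runs to end of line
            kind = TokenKind::Comment;
        } else if (std::isdigit(c)) {
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
            kind = TokenKind::Literal;
        } else if (c == '"') {
            ++i;                                           // opening quote
            while (i < src.size() && src[i] != '"') ++i;
            if (i < src.size()) ++i;                       // closing quote
            kind = TokenKind::Literal;
        } else if (std::isalpha(c) || c == '_') {
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) ++i;
            const std::string word = src.substr(start, i - start);
            kind = keywords.count(word) ? TokenKind::Keyword : TokenKind::Identifier;
        } else {
            ++i;                                           // single-character punctuation
        }
        out.push_back({kind, src.substr(start, i - start)});
    }
    return out;
}
```

Calling tokenize on the text `int x = 42; // answer` yields six tokens: a keyword, an identifier, punctuation, a literal, punctuation, and a comment. A parser would consume this stream to check syntax and build the AST.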

Abstract Syntax Tree

Libclang provides a cursor that follows the AST in top-down order.

[Figure: example tree with nodes numbered 1 to 8]

Clang Provides Tool Infrastructure APIs

• LibClang
  – C API, stable, allows bindings to other languages (e.g., Python)
  – Simpler, but less control over the AST
• Clang Plugins
  – Dynamic libraries loaded by the compiler at runtime
  – Complete control over the AST
  – Good for generating artifacts at compile time
• LibTooling
  – C++ interface for writing stand-alone tools
  – Provides a common way to parse Clang command-line options

libclang

• Libclang is a stable C interface to the Clang compiler
• The entire API is available through clang-c/Index.h
• Provides the ability to iterate through program structure via cursors
• Prefer libclang over the C++ interface unless you need full control over program structure:
  – More stable
  – Better backwards compatibility
  – Much simpler
• Clang's tooling APIs are great for tool writing:
  – Syntax checking (clang-check)
  – Automatic fixing of compile errors (clang-fixit)
  – Automatic code formatting (clang-format)

More Terminology

Translation Unit – The basic unit of compilation in C++: a single source file plus any header files it includes, directly or indirectly.

Index – A set of translation units that may link into an executable or library. An index may contain many translation units.

Cursor – A “pointer” to an element in the AST. Cursors are hierarchical, e.g., parameters are children of a function.

Libclang Data Types

Primary libclang data types:

• CXTranslationUnit
• CXIndex
• CXCursor
• CXCursorKind
• CXToken
• CXType
• CXTypeKind
• CXSourceLocation
• CXSourceRange

Some Code – A Simple Syntax-Aware Mini “IDE”

• Highlights keywords, literals, punctuation, and comments with color
• Read-only

Code

We need a CXIndex:

    index_ = clang_createIndex(0, 0);

When the user selects a file, we create a CXTranslationUnit:

    // Produce object code (-c), parse file as C++ (-x c++)
    const char* args[] = { "-c", "-x", "c++" };
    transUnit_ = clang_parseTranslationUnit(index_, path_.toStdString().c_str(),
                                            args, 3, 0, 0, CXTranslationUnit_None);

Visiting the Source Code

We obtain the cursor that represents the translation unit itself (the root of the AST) and start visiting its children via a user-defined visitor function:

    CXCursor startCursor = clang_getTranslationUnitCursor(transUnit_);
    clang_visitChildren(startCursor, visitor, this);

The Visitor

A Clang visitor function has the signature

    CXChildVisitResult visitor(
        CXCursor cursor,          // the current source cursor
        CXCursor parent,          // the parent of the current cursor, if there is one
        CXClientData clientData   // pointer to arbitrary user data
    )

CXChildVisitResult is one of three enumerators:

1. CXChildVisit_Break: terminates the cursor traversal
2. CXChildVisit_Continue: continues the cursor traversal with the next sibling of the cursor just visited, without visiting its children
3. CXChildVisit_Recurse: recursively traverses the children of this cursor, using the same visitor and client data
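To make the three return values concrete, here is a small model of the traversal written against a plain tree instead of libclang. It is a sketch of the behavior described above, not libclang's implementation: Node, VisitResult, Visitor, and visitChildren are hypothetical stand-ins for CXCursor, CXChildVisitResult, the visitor callback, and clang_visitChildren.

```cpp
#include <functional>
#include <string>
#include <vector>

// Stand-ins for the libclang types; these are NOT the real declarations.
enum class VisitResult { Break, Continue, Recurse };

struct Node {
    std::string name;
    std::vector<Node> children;
};

using Visitor = std::function<VisitResult(const Node& node, const Node& parent)>;

// Visits the children of `parent` in order. Returns true if the visitor
// asked for Break, which stops the whole traversal at once.
bool visitChildren(const Node& parent, const Visitor& visitor) {
    for (const Node& child : parent.children) {
        switch (visitor(child, parent)) {
        case VisitResult::Break:
            return true;
        case VisitResult::Continue:
            break;  // skip this child's subtree, go on to the next sibling
        case VisitResult::Recurse:
            if (visitChildren(child, visitor)) return true;
            break;
        }
    }
    return false;
}
```

In this model, a visitor that returns Continue for a function node never sees that function's parameters, while Recurse descends into them before moving to the next sibling. A highlighter that must inspect every node would return Recurse throughout.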

Token Information

To highlight tokens in text, we need to know:

• CXSourceLocation – file, line, column, and offset
• CXSourceRange – start and end locations of the token
• CXTokenKind – keyword, literal, punctuation, identifier, or comment

Use a QTextCursor to move through the source in the QTextEdit object and highlight the code text.
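One way to organize the highlighting step is to turn each token's kind and range into a "span" (start position, length, color) before touching the editor. The sketch below shows that idea with standard C++ only; it is not the code from the presentation's IDE. TokenInfo, HighlightSpan, colorFor, buildSpans, and the color names are all assumptions made for illustration, and identifiers are left uncolored because the mini IDE highlights only keywords, literals, punctuation, and comments.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for the five token categories; not the libclang enum itself.
enum class TokenKind { Punctuation, Keyword, Identifier, Literal, Comment };

// What the highlighter needs per token: its category and its half-open
// character range [begin, end) within the source text.
struct TokenInfo {
    TokenKind kind;
    std::size_t begin;
    std::size_t end;
};

struct HighlightSpan {
    std::size_t position;  // where the selection starts
    std::size_t length;    // how many characters to color
    std::string color;     // hypothetical color name
};

// Arbitrary color scheme chosen for this sketch; empty means "no highlight".
std::string colorFor(TokenKind kind) {
    switch (kind) {
    case TokenKind::Keyword:     return "blue";
    case TokenKind::Literal:     return "red";
    case TokenKind::Punctuation: return "gray";
    case TokenKind::Comment:     return "green";
    case TokenKind::Identifier:  return "";
    }
    return "";
}

// Converts tokens into spans, dropping uncolored and empty tokens.
std::vector<HighlightSpan> buildSpans(const std::vector<TokenInfo>& tokens) {
    std::vector<HighlightSpan> spans;
    for (const TokenInfo& t : tokens) {
        const std::string color = colorFor(t.kind);
        if (color.empty() || t.end <= t.begin) continue;
        spans.push_back({t.begin, t.end - t.begin, color});
    }
    return spans;
}
```

In a Qt front end, each span could then be applied by positioning a QTextCursor at `position`, extending the selection by `length`, and merging a character format with the chosen color. One caveat to check: offsets reported by the compiler are likely byte offsets, which match editor character positions only for plain ASCII source.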

Source Code for IDE

ftp://ftp.ics.com/pub/pickup/clangfollowup.zip

References

• LLVM – http://llvm.org/
• Clang site – http://clang.llvm.org/
• Building Clang – http://clang.llvm.org/get_started.html (version 3.4 as of this presentation)
• Clang documentation – http://clang.llvm.org/docs/index.html
• Clang Doxygen – http://clang.llvm.org/doxygen/index.html
• Refactoring with Clang – http://www.youtube.com/watch?v=yuIOGfcOH0k
