ADVANTAGES OF ENEA OSE®: The Architectural Advantages of Enea OSE in Telecom Applications

white paper

Brian Gildon, Product Marketing Manager
The days are long gone when a real-time operating system (RTOS) was simply a small kernel providing basic services such as task scheduling and reliable inter-task communications. Today’s real-time operating systems are expected to perform a wide variety of functions, ranging from managing real-time communications to providing a reliable foundation for higher level applications. In essence, today’s real-time operating systems provide the foundation for complete software platforms that are increasingly purpose-built for specific applications.

What many embedded developers may not realize is that RTOS software architecture plays a pivotal role in meeting the specific needs of a particular application. Factors such as whether the RTOS is monolithic or based on a microkernel, whether it uses sockets, and whether it enforces a specific programming model can make a significant difference in the types of applications that best suit the RTOS. Enea OSE® has been designed from the ground up for the fault-tolerant distributed systems commonly found in telecommunications equipment, from mobile phones to radio base stations. This paper discusses the OSE architecture and design philosophy, and how that architecture benefits telecommunications-oriented applications.

The Modular, Layered Concept

Figure 1: Layered Embedded Design Approach

Most embedded systems are built on a modular, layered architecture with well-defined interfaces (Figure 1). At the highest level of the system is the “application” layer, which contains the user-written application software. At the lowest level is the embedded hardware, which can range from a single processor to a multicore/multiprocessor design spanning multiple blades. Just above the hardware is the firmware, which consists of the board support package, drivers, and related “glue” code.

The RTOS sits between the firmware and application. OSE specifically utilizes a layered architecture, including the “kernel”, the “Core Basic Services layer” and the “Core Extensions layer” (Figure 2).

The OSE kernel provides basic services such as preemptive priority-based scheduling and direct, asynchronous message passing for inter-task communication and synchronization. It also provides a memory management package, which utilizes the processor’s MMU hardware to provide memory protection, and manages special physical memories such as flash.

The Core Basic Services layer offers optional, configurable packages for services such as file system management, device driver management, heap management, and run mode / freeze mode debugging. In addition, this layer provides extensive C/C++ runtime support with a fully reentrant function library.

Moving one level higher, the Core Extensions layer offers optional services such as communication protocol stacks (IPv4/v6, SNMP, DHCP, etc.) and interprocess communications (IPC). Enea’s LINX IPC software, for example, allows tasks running on different processors (or cores) to utilize the same message-based communications model that OSE uses for intertask communications on a single processor.


Enea is a global software and services company focused on solutions for communication-driven products. With 40 years of experience, Enea is a world leader in the development of software platforms with extreme demands on high availability and performance. Enea’s expertise in real-time operating systems and high-availability middleware shortens development cycles, brings down product costs and increases system reliability. Enea’s vertical solutions cover telecom handsets and infrastructure, medtech, industrial automation, automotive and mil/aero. Enea has 750 employees and is listed on the Nasdaq OMX Nordic Exchange Stockholm AB. For more information, please visit enea.com or contact us at [email protected].


PROPERTY OF ENEA

Figure 2: OSE’s Modular, Layered Architecture

OSE’s direct, asynchronous message passing architecture offers many unique advantages for distributed systems:

- Inherently modular, distributed architecture
- Simple, intuitive model that is easy to learn
- Consistent application design simplifies long-term maintenance
- No shared memory among applications
- Task (process) ownership is never shared
- Messages may be traced and monitored
- Messages may be used for synchronization

The Core Extensions layer also provides a dynamic run time loader for upgrading and hot swapping application software. The run time loader is used to load ELF files, typically through OSE load modules. Load modules are relocatable program units that can be loaded into a running system and dynamically bound to that system. A loadable module can be uploaded, rebuilt, and quickly downloaded while the remainder of the system continues to run. This capability is especially valuable for applications that need to be field upgradeable.

The Core Extensions layer is separated from user applications by the “Core API”, a programming interface that allows OSE users to take advantage of its services without having to master internal RTOS complexities. However, users who possess a more detailed knowledge of the RTOS infrastructure may also utilize the “Core SPI”, or System Programming Interface, to tackle tasks such as developing device drivers or industry-specific application platforms.

The Enea OSE Message Passing Philosophy

OSE is optimized for distributed, fault-tolerant systems, and built on an event-driven, communicating state machine model. OSE’s natural programming model is based on passing direct, asynchronous messages between tasks (or “processes” in OSE terminology). This model tends to promote, though not strictly enforce, consistent behavior when coding applications. This makes training new programmers, designing new programs, and maintaining existing programs far more straightforward than with traditional RTOSes, which have no explicit programming model whatsoever.

Direct, asynchronous message passing is a simple and intuitive, loosely coupled approach that provides transparency in data transfers from process to process. Processes that send messages do not have to wait for information from the process that receives the message.

Figure 3: OSE Message Passing Design

Thus, the sending process cannot fail, even if the receiving process fails or becomes inaccessible. This loosely coupled approach naturally lends itself to distributed, fault-tolerant, multiprocessor (and multicore) systems.

These advantages are more easily understood when the OSE kernel services are examined in more detail.

The Enea OSE Kernel Design

Today’s modern RTOSes are complex pieces of software that provide a wide array of services, including network communications, file system management, and dynamic application loading. They are typically architected in a modular, scalable fashion, which allows services to be added or removed as necessary. The kernel is the most significant part of the RTOS, as it is responsible for managing hardware and software resources. The most important services provided by the kernel are:

- Process management
- Process scheduling
- Interprocess communications
- Interprocess synchronization
- Memory management
- Dynamic memory allocation
- Memory protection
- Demand paging
- Error handling

Process Management – Process Scheduling

Process management involves the coordination, prioritization, execution, and synchronization of processes that comprise and support applications


running on the hardware. OSE manages application process execution through priority-based preemptive scheduling. The governing principle is that the highest priority process ready to run should always be the process that is running.

OSE manages processes in a different manner than many other RTOSes. For example, the OSE process scheduler manages hardware interrupts via an interrupt process. In this way, all hardware interrupts are managed in the same fundamental way as other software processes, thereby maintaining programming model consistency and simplifying system level debugging and troubleshooting. By contrast, many RTOSes cannot respond to hardware interrupts using their internal task/process schedulers. Instead, they require programmers to use an external interrupt service routine, which can sometimes complicate system-level debugging.

OSE manages processes that must run on a periodic, repeating basis using the same philosophy. The process scheduler reserves a separate range of high process priorities for these periodic processes, also known as timer-interrupt processes. Each timer-interrupt process has its own OSE control block (Figure 4), which makes it easy to support system and process level debugging. Most commercial RTOSes require an external timer to indirectly force application software to run on a periodic, repeating basis. The timer-interrupt priority is set above the priority of other tasks/processes running. The net result is that most RTOSes require the application software to be organized and executed in strikingly different ways, depending on whether the processes are periodic, triggered by the RTOS scheduler, or triggered by hardware interrupts. Under OSE, all application software is managed in a consistent fashion using the same programming model.

Process Management – Interprocess Communications

Preemptive multitasking requires interprocess communication and synchronization in order to prevent processes from conveying corrupted information or otherwise interfering with one another. To minimize the impact on hardware resources, OSE uses direct message passing for interprocess communications and synchronization (Figure 5). Its processes send messages “directly” to other processes, but without actually copying the message. Instead, OSE transfers a pointer (to the message buffer) from the sender process to the recipient process, while obliterating the pointer value stored by the sender process (for mutual exclusion).

Figure 5: Enea OSE’s Direct, Asynchronous Message Passing Model

By contrast, many RTOSes employ an indirect message model that uses intermediate mechanisms (such as message queues) to buffer messages between processes (Figure 6). The problem with this approach is that it often requires the application to create the message queue through which messages are sent and received. In other cases, a message may be copied many times: once when it is sent from the sending process to the message queue, and a second time when the receiving process fetches the message. For short messages, the impact on system resources is minimal. However, for longer messages, which are common in telecom applications, the performance burden is often unacceptable.


Figure 4: Typical RTOS Priority Schema (left) vs. Enea OSE Priority Process (right)



In extreme cases, RTOSes may be constructed with “queues of queues” (Figure 7), further exacerbating the performance burden. In the example in Figure 7, there are three queues: one for processes waiting to send messages, one for processes waiting to receive messages, and one central message queue. Managing this hierarchy adds complexity and can significantly degrade performance in communications-intensive applications.

Figure 6: Typical RTOS Indirect Message Queue Architecture


Figure 7: “Queues of Queues” Required in Some RTOSes

Process Management – Interprocess Synchronization

RTOSes use a variety of interprocess communication and synchronization mechanisms, including message queues, pipes, semaphores, mailboxes, event groups and asynchronous signals. Semaphores are often the mechanism of choice for basic synchronization. However, semaphores are subject to unbounded priority inversions and deadlocks, and can be difficult to debug. Mutexes solve the problem of unbounded priority inversions, but are still subject to deadlocks and debugging difficulty. Moreover, because mutexes work by changing thread priorities, they are not optimal for highly distributed, heterogeneous multicore/multiprocessor environments such as those common to telecommunications applications. OSE supports semaphores and mutexes (for applications with ultra-tight time constraints), but the preferred method for interprocess synchronization is direct, asynchronous message passing. To illustrate this concept, take the simple case of two processes attempting to share a printer resource, as shown in Figure 8.

OSE uses a Resource Monitor Process to manage priorities, completely hiding all information about the shared resource. The requestor processes asynchronously send “print” messages directly to the Resource Monitor Process, avoiding intermediate mechanisms such as message queues. The Resource Monitor processes any message it receives as soon as it is ready. If it is already processing another message, OSE automatically builds a dedicated message queue as a linked list of message buffers, without any involvement from the programmer. In this manner, all messages are handled one at a time, even if they arrive in asynchronous bursts. Unlike semaphores and mutexes, which are difficult (or impossible) to distribute, this mutual exclusion approach works in single processor, multiprocessor, and multicore environments.

Figure 8.

Memory Management – Dynamic Memory Allocation

Like many traditional RTOSes, OSE can borrow buffers of RAM for temporary or permanent use. OSE’s primary memory allocation mechanism is called a “memory pool.” Most RTOSes manage this memory like a traditional heap. The problem with this approach is that heaps tend to fragment when the required buffer sizes are larger than the underlying memory page size. In contrast, OSE’s memory pools are non-fragmenting. Available buffer sizes are limited to no more than eight standard buffer sizes per pool, where the buffer sizes are user-defined. Figure 9 shows an OSE memory pool from which memory buffers can be allocated in one of eight different buffer sizes, as required.

Figure 9: OSE Memory Pool for Buffers Allocation

OSE uses a learning algorithm to determine how many buffers of each buffer size are needed by the application software. Initially, all of the pool’s memory is available for allocation. When memory buffers are requested, they are taken from the pool’s memory, starting at one end of the pool. Each buffer taken from the pool is permanently assigned its initially allocated size (Figure 9). Once allocated, buffers do not merge or split. Instead, when they are freed, they are recycled with their original size using a set of “free buffer lists”, one per buffer size (Figure 10).

Figure 10: Enea OSE Memory Pool Buffer Reuse via Free Lists

When a buffer is freed, it is inserted into the free buffer list corresponding to its buffer size. Whenever an OSE “alloc” request is made, it first checks for the availability of a buffer in the free buffer list of the appropriate size. If no buffers of the required size are available in the free list, OSE creates a new buffer in the unused portion of the pool. In this way, OSE “learns” the worst-case buffer needs of an application for each buffer size and keeps sufficient buffers available to meet those needs in its free buffer lists.

white paper 5 In Figure 11, there are three blocks Memory Management of OSE processes. Each block has its – Demand Paging own memory pool. Block C is separated OSE gives programmers the option from Blocks A and B by a memory proof tightly controlling RAM usage in tection barrier enforced by processor applications where RAM is in short hardware, depicted as a vertical “brick wall”. supply. In feature-rich mobile phones, If a process in Block C tries to write for example, RAM is one of the most anywhere into Domain 1, hardware will significant contributors to overall cost. report this to OSE, which will handle OSE’s demand paging reduces RAM the error. Similarly, if a process in Blocks requirements by allowing programmers A or B tries to write into Domain 2, to store their programs and data in hardware will cooperate with OSE to NAND flash, and then copy only the handle the error. needed pages into RAM for execution. If a process in Block A would like to This allows designers to replace a large communicate with a process in Block B portion of their RAM with much cheaper or C, OSE (which is aware of the memory NAND flash. protection barrier between the blocks) will handle the message-based comError Handling munication in a manner that is transpaMost traditional RTOSes return a e of “0” rent to the programmer. when a kernel call is successful. When Within each separate memory the call fails, the return is a non-zero address space, OSE’s memory manager error code that is referenced in a Memory Management can make sections of memory nondocument or a header file. Often, the – Memory Protection cacheable or read-only accessible, software that manages the error codes The OSE kernel allows its processes to and assign other MMUsupported must be put into the same task or process be collected into “blocks”, where each attributes. But its main job is to give that is called by the RTOS service. 
This block of processes can be assigned its separate blocks (or groups of blocks) of can make the combined code difficult own separate (non-fragmenting) RAM processes their own separate, protected to read and maintain, as depicted in memory pool (Figure 11). This compart- memory address space. During the Figure 12. mentalization prevents problems in one development and test phase, where OSE uses an error handling schema memory pool (such as a memory leak) bugs are more prevalent, and may similar to that employed in the C++ occurs from affecting other blocks. throw stray pointers, programmers may language. Error information is not If the target CPU has a memory elect to have the memory manager delivered to the calling OSE process. management unit (“MMU”), OSE can take provide full process separation and Instead, when OSE detects an error advantage of the MMU to establish hard­ address space protection. However, while answering a service request, it ware-enforced, RTOS-aware memory this protection can have significant simply stops execution of the request­ protection between OSE blocks (or overhead, so developers may elect to ing process and switches to a separate The groups Architectural Advantages of aEnea OSE in Telecomdisable Applications PROPERTY OF ENEA of blocks) within single memory protection between piece of code associated with that proprocessor. This allows separate blocks some blocks in the final product. For cess, known as a “Process Error Handler”. If the has ato memory management unit (“MMU”), canistake (ortarget groupsCPU of blocks) have their own example, in FigureOSE 11, there no memory A separate handler is written for advantage the MMU to establish hardware-enforced, RTOS-aware memory separateof memory address spaces. OSE protection between Blocks A and B. each OSE application software process, protection groups of blocks) within a single processor. 
Error Handling
Most traditional RTOSes return a value of "0" when a kernel call is successful. When the call fails, the return is a non-zero error code that is referenced in a document or a header file. Often, the software that manages the error codes must be placed in the same task or process that calls the RTOS service. This can make the combined code difficult to read and maintain, as depicted in Figure 12.

OSE uses an error handling schema similar to that employed in the C++ language. Error information is not delivered to the calling OSE process. Instead, when OSE detects an error while answering a service request, it simply stops execution of the requesting process and switches to a separate piece of code associated with that process, known as a "Process Error Handler". A separate handler is written for each OSE application software process, and contains the code needed to identify and remedy the error. If the handler reports to OSE that it has fixed the error, OSE allows the stopped process to continue running. OSE error handling also incorporates an error escalation sequence as part of the core kernel architecture, as seen in Figure 13.
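The division of labor described above, normal logic in the process body and recovery logic in a per-process handler with escalation behind it, can be mimicked in a few lines of Python. This is a control-flow sketch only, not OSE's C error handler API; all names are invented:

```python
def handle_error(process, error, handlers):
    """Sketch of the escalation sequence of Figure 13.

    Each handler returns True if it repaired the error; the first success
    lets the stopped process resume.  If no handler below the system level
    succeeds, the per-CPU System-Level Error Handler makes a last-ditch
    recovery attempt such as a board reset.
    """
    for level in ("process", "block"):
        handler = handlers.get(level)
        if handler and handler(process, error):
            return f"{process} resumed (fixed at {level} level)"
    return "system-level handler: board reset"

# A process handler that can only repair one simple error class.
handlers = {"process": lambda proc, err: err == "bad_param"}

assert handle_error("proc_a", "bad_param", handlers) == \
    "proc_a resumed (fixed at process level)"
assert handle_error("proc_a", "pool_corrupt", handlers) == \
    "system-level handler: board reset"
```

The point of the structure is that the process body never contains recovery code: the kernel, not the application, decides which handler runs and whether the process resumes.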

Figure 11: Enea OSE Blocks and Memory Protection.

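Assuming the domain layout shown in Figure 11 (Blocks A and B sharing Domain 1, Block C alone in Domain 2 behind the "brick wall"), the hardware check can be modeled in Python. This toy model only illustrates the rule; in OSE the enforcement is done by the MMU and the kernel, not by application code:

```python
# Invented model of Figure 11's layout: block -> protection domain.
DOMAIN_OF_BLOCK = {"A": 1, "B": 1, "C": 2}

def write(block, target_domain, memory, addr, value):
    """MMU-style check: a write outside the block's own domain traps to OSE."""
    if DOMAIN_OF_BLOCK[block] != target_domain:
        return "trap: reported to OSE error handling"
    memory[addr] = value
    return "ok"

domain1, domain2 = {}, {}
assert write("A", 1, domain1, 0x100, 42) == "ok"    # A and B share Domain 1,
assert write("B", 1, domain1, 0x104, 43) == "ok"    # so both may write it
assert write("C", 1, domain1, 0x200, 7) != "ok"     # C cannot touch Domain 1
assert write("A", 2, domain2, 0x300, 9) != "ok"     # A cannot touch Domain 2
```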



Figure 13: Enea OSE Error Escalation Sequence.

If the Process Error Handler reports that it has not been successful in dealing with the error, OSE will escalate the error to a higher-level error handler.


If that handler, too, is not able to handle the error, OSE will escalate the error to a System-Level Error Handler. There is one System-Level Error Handler for each processor running OSE, and it is the central error handler for "last ditch" recovery attempts such as a board reset or alerting another node to take over as system master.

From a software architectural perspective, the OSE approach to error handling creates a clean separation between the normal activity of processes and the error handling activity processes may need from time to time. Normal, desirable process activities are coded in the main body of the process code, while error-handling and recovery activities are coded separately in the Process Error Handler. Separating process code from error handling code greatly simplifies development and maintenance. It also facilitates error escalation logic that allows the system to correct errors at different levels of abstraction before triggering the "last ditch" System Error Handler, all without affecting the structure of the normal process code.

Figure 12: Typical RTOS Error Handling Interspersed Code (left) vs. OSE Error Detection Schema (right).

Distributed Systems Support
Enea® LINX is a suite of interprocess communications (IPC) services that builds upon OSE's message passing architecture. Enea LINX effectively extends OSE's transparent IPC services from multiple processes running on a single CPU to multiple processes running on multiple CPUs, and even on multiple operating systems. These services make complex distributed systems easier to conceptualize, model, partition, and scale.

Enea LINX operates as an OSE service in the Core Extensions layer. It permits direct, asynchronous message passing among OSE (and other operating system) instances in both homogeneous and heterogeneous processing environments, and across a wide variety of interconnects.

Since OSE applications are concerned only with the passing of messages from process to process, it is not necessary for applications to understand the structure of a distributed system. In fact, the location of a given process's communication partner(s) is transparent to the application. Enea LINX takes care of the message delivery, regardless of where the receiving process physically resides (i.e., in a different instance of the operating system, or on another processor). From a distributed systems point of view, this approach is merely a transparent extension of OSE's simple, straightforward message passing architecture.

In addition to simplifying distributed design, LINX enhances availability by providing a means of "watching over" processes spread across multiple processors or cores. LINX can send and receive "heartbeat" messages to periodically check the integrity of remote hardware and communication links. When critical processes "die" or become inaccessible, LINX can notify interested processes on any processor, taking the burden off of the application software. This service greatly simplifies the design of distributed, fault-tolerant systems.
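The heartbeat supervision just described reduces to a simple timeout check. The sketch below is a Python model of that logic, not the LINX API; the function name and peer identifiers are invented:

```python
def check_peers(last_heartbeat, now, timeout, notify):
    """Flag peers whose last heartbeat is older than the timeout.

    `notify` is called once per dead peer, standing in for the LINX
    notification sent when a supervised remote process becomes
    inaccessible; interested processes then react without the
    application polling the link itself.
    """
    dead = [peer for peer, t in last_heartbeat.items() if now - t > timeout]
    for peer in dead:
        notify(peer)
    return dead

notifications = []
last_seen = {"cpu1/proc_a": 9.5, "cpu2/proc_b": 4.0}   # seconds
dead = check_peers(last_seen, now=10.0, timeout=2.0,
                   notify=notifications.append)
assert dead == ["cpu2/proc_b"]          # no heartbeat for 6 s
assert notifications == ["cpu2/proc_b"]
```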





High Availability
High availability is a baseline requirement for many telecommunications systems and is a necessity in NGN, IP-based communications deployments. In many applications, systems must achieve 99.999% (or "five nines") uptime in order to deliver the continuous, high-quality service that customers have come to expect from the PSTN-based systems of the past.

OSE is designed from the ground up with high availability in mind. OSE's memory-protected, message passing architecture facilitates the design of modular, compartmentalized applications that prevent errant or malicious processes from corrupting the kernel and other application processes. OSE's run-time loader enables applications to be field-upgraded and hot-swapped without shutting down running systems. OSE's advanced multi-level error handling capabilities enhance availability for every load module, whether at the process, block, or system level. And the Enea LINX IPC services extend these high availability (HA) capabilities to multiple instances of OSE running on different processors, cores, and blades. All of these characteristics contribute to OSE's strength as a "five nines", highly available, fault-tolerant, real-time operating system.

Conclusion
RTOSes have evolved over time to perform more application-specific functions. Unlike most traditional RTOSes, OSE was designed specifically with distributed, fault-tolerant telecommunications systems in mind. OSE's message-passing architecture and its general approach to process and memory management, process scheduling, error handling, and distributed communications have made it the choice for millions of telecommunications applications worldwide. When stability, high availability, development simplicity, maintainability, and performance are the primary criteria for RTOS selection, OSE stands alone, and its track record speaks for itself.

Enea®, Enea OSE®, Netbricks®, Polyhedra® and Zealcore® are registered trademarks of Enea AB and its subsidiaries. Enea OSEck, Enea OSE® Epsilon, Enea® Element, Enea® Optima, Enea® Optima Log Analyzer, Enea® Black Box Recorder, Enea® LINX, Enea® Accelerator, Polyhedra® Flashlite, Enea® dSPEED Platform, Enea® System Manager, Accelerating Network Convergence™, Device Software Optimized™ and Embedded for Leaders™ are unregistered trademarks of Enea AB or its subsidiaries. Any other company, product or service names mentioned above are the registered or unregistered trademarks of their respective owners. WP46 012009. © Enea AB 2009.