ON-BOARD SOFTWARE MAINTENANCE FOR GOCE, ESA’S GRAVITY MISSION

C. Steiger(1), G. Lautenschläger(2), A. Weigl(3), R. Eilenberger(4), P. Väisänen(5), E. Maestroni(6), A. da Costa(7), P. P. Emanuelli(8)

(1) ESA/ESOC, Robert-Bosch Str. 5, 64293 Darmstadt, Germany
(2) EADS Astrium GmbH, 88039 Friedrichshafen, Germany
(3) EADS Astrium GmbH, 88039 Friedrichshafen, Germany
(4) EADS Astrium GmbH, 88039 Friedrichshafen, Germany
(5) Space Systems Finland Ltd, Kappelitie 6, 02200 Espoo, Finland
(6) Rhea System (c/o ESA/ESOC)
(7) Vitrociset (c/o ESA/ESOC)
(8) ESA/ESOC, Robert-Bosch Str. 5, 64293 Darmstadt, Germany

ABSTRACT

ESA’s Gravity Field and Steady-State Ocean Circulation Explorer (GOCE) was launched in March 2009 and is to measure the Earth’s gravity field with unprecedented accuracy. The specifics of the mission require an extremely low orbit of 260 km altitude and a highly sophisticated S/C design and flight software. A significant number of on-board software maintenance (OBSM) activities have been carried out to correct software problems found after launch. Analysis of these activities shows (i) how common it is to patch the running software in RAM, (ii) the importance of having dedicated facilities for installing sets of patches on the platform software, and (iii) the relevance of high-level services for updating software parameters rather than having to resort to OBSM in these cases. S/C design limitations like the absence of a service mode for the redundant platform processor module became apparent when a major anomaly on the nominal platform computer occurred in February 2010. Following an intense commissioning phase, GOCE is now stable in its routine operations phase, with all flight software problems corrected and the nominal mission designed to last up to April 2011, with further extension possible.

1. THE GOCE MISSION

The Gravity Field and Steady-State Ocean Circulation Explorer (GOCE) is part of the European Space Agency’s Living Planet Programme. The mission objective is to provide data for establishing a model of the Earth’s gravity field with unprecedented accuracy, determining the geoid with an accuracy of 1-2 cm at a spatial resolution better than 100 km. The spacecraft was built by an international consortium led by Thales Alenia Space Italy, with development of the spacecraft platform under responsibility of EADS Astrium Germany. GOCE was launched on 17th March 2009 and is controlled from ESA’s European Space Operations Centre (ESOC) in Darmstadt, Germany.

Figure 1. Artist’s impression of the GOCE spacecraft (top), top and side view (bottom).

To achieve its scientific objectives, GOCE orbits the Earth at an exceptionally low altitude of about 260 km in a sun-synchronous orbit. GOCE has an aerodynamic shape (see Fig. 1) and employs a sophisticated drag-free attitude and orbit control system (DFACS), using an ion propulsion assembly (IPA) in closed loop to continuously counteract the drag caused by the Earth’s atmosphere. High-resolution features of the gravity field are measured with the Electrostatic Gravity Gradiometer (EGG), which employs the most precise accelerometers ever flown in space, whereas low-resolution data is obtained by measuring GOCE’s drag-free orbit with a scientific GNSS receiver, the Satellite-to-Satellite Tracking Instrument (SSTI). Both units are used in the DFACS control loop, making the GOCE spacecraft as a whole the gravity measurement device, with no clear distinction between platform and payload.

General information on the mission can be found in [1][2]. More details about the scientific objectives of GOCE can be found in [3][4][5]. A detailed description of the GOCE spacecraft is given in [6].

2. THE GOCE FLIGHT SOFTWARE AND OBSM APPROACH

The mission specifics of GOCE led to a complex S/C design, in turn resulting in complex flight software. Apart from the Platform Application Software (PASW) implementing the drag-free control and other system-level functions, the following units used in the DFACS control loop have their own sophisticated flight software: the star trackers (STR), the EGG, the SSTI and the IPA. For each unit, ground has the means to modify the flight software both in its permanent storage area and in the memory in which the software is executing. Tab. 1 gives an overview of the specifics of the GOCE on-board software and processors.

The PASW resides in the EEPROMs of the two processor modules in the Control and Data Management Unit (CDMU); see Fig. 2 for an overview. It is only possible to access the EEPROM of the currently active processor module. An EEPROM consists of two banks, each of which holds a full image of the PASW in compressed form. To allow for installing PASW patches without having to replace the compressed image, additional chains of patches can be put into EEPROM. At boot-up, these patches are applied in RAM after decompression of the PASW image into RAM and prior to PASW start-up in RAM. A ‘flip-flop’ approach with 2 patch areas (one in each EEPROM bank) was used throughout the mission whenever a new PASW patch arrived (see Fig. 3).

To activate a PASW patch installed in EEPROM, a restart of the PASW is needed, which corresponds to putting the S/C into safe mode (i.e. stopping science operations). In general this is not done owing to the large overhead and risk linked to this operation; if a patch is to be made active straight away, it is applied in PASW RAM. The PASW has ample memory margin for installation of software patches, both in EEPROM (27% free space) and RAM (16% free space).

The mass memory of the spacecraft is managed by the PASW, with the data being stored in a 4 Gbit memory module (1 for each CDMU side, see Fig. 2).

The SSTI and EGG as the GOCE payloads (providing the scientific measurement data) are used in the DFACS control loop; therefore SSTI/EGG OBSM activities are handled as for other platform units:

• IPA and EGG: these units are cold redundant. As it is not possible to have both chains switched on at the same time, any OBSM activity that cannot be done on the running software requires an interruption of drag-free mode, implying a stop of science operations and a decay of the S/C orbit.

• SSTI and STR: the two SSTIs and the three STRs can be switched on in parallel. The relevant unit can be taken out of the control loop to perform OBSM activities.

Table 1. Overview of GOCE’s on-board software

The responsibility for developing new GOCE flight software versions or patches after launch is with the S/C manufacturer and its subcontractors. ESA/ESOC is in charge of performing the installation of the software patches in-flight. The interface between ESOC and industry concerning OBSM deliveries is clearly specified in a dedicated interface control document.
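The EEPROM patch-chain scheme described in this section can be illustrated with a small model. The following is a minimal sketch only: the class and function names (`Patch`, `Eeprom`, `boot`), the way the active bank is recorded, and the byte-level patch format are assumptions made for illustration and are not taken from the GOCE flight software. Decompression of the PASW image is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Patch:
    address: int  # hypothetical RAM offset at which the patch bytes are written
    data: bytes


@dataclass
class Eeprom:
    # One patch area per EEPROM bank; `active` marks the bank whose chain
    # is applied at the next boot (how this is recorded on GOCE is not known here).
    patch_areas: list = field(default_factory=lambda: [[], []])
    active: int = 0

    def install(self, new_patch: Patch) -> None:
        """Reinstall the full chain plus the new patch in the inactive bank, then flip."""
        target = 1 - self.active
        self.patch_areas[target] = list(self.patch_areas[self.active]) + [new_patch]
        self.active = target


def boot(image: bytes, eeprom: Eeprom) -> bytearray:
    """Copy the (already decompressed) image into RAM and apply the active patch chain."""
    ram = bytearray(image)
    for patch in eeprom.patch_areas[eeprom.active]:
        ram[patch.address:patch.address + len(patch.data)] = patch.data
    return ram


eeprom = Eeprom()
eeprom.install(Patch(2, b"\xAA"))        # chain of 1 patch, written to bank 1
eeprom.install(Patch(4, b"\xBB\xCC"))    # chain of 2 patches, written to bank 0
ram = boot(bytes(8), eeprom)             # patches only take effect at (re)boot
```

In this model the previously active chain stays untouched in the other bank until the next installation, which is one plausible reason for using two patch areas; the source does not state the motivation explicitly.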

Figure 2. Memory layout of the Control and Data Management Unit (CDMU). Only one CDMU side is in use at any point in time. The memories of the processor module board not in use cannot be accessed.

Figure 3. Illustration of approach used for applying PASW patches in EEPROM (active patch chain highlighted). Upon installation of a new PASW patch, the full patch chain is reinstalled in the EEPROM bank not holding the currently active set of patches.

3. FLIGHT OPERATIONS OVERVIEW

GOCE flight operations from ESOC are characterised by very short ground station contact times (5 min duration on average), with 8 contacts taken per day. Pass operations are automated; only the contacts during normal working hours are manned. While nominal routine operations are straightforward (the acquisition of science data does not require complex operations, but rather consists of letting the S/C fly in drag-free mode), the high S/C complexity means that anomalies may quickly become rather demanding and involved, as experienced throughout the first year of the mission.

Commissioning of GOCE lasted from launch on 17th March 2009 up to the start of the routine operations phase in September 2009. Owing to the need to commission the complex subsystems and units required to perform drag-free mode, GOCE was injected at an altitude higher than the one foreseen for science operations. During commissioning, the orbit was lowered to the desired altitude by not compensating the atmospheric drag. Fig. 4 gives an overview of the S/C altitude from launch up to reaching the altitude for routine science operations in mid-September 2009. The main activities since launch were the following:

• Initial commissioning (17/03/2009 to 04/05/2009): following successful launch and completion of the launch and early orbit phase, major activities were the commissioning of the ion propulsion system (see Fig. 4, label 1) and the first gradiometer switch ON, while the orbit was left to decay throughout.

• Drag-free mode checkout (05/05/2009 to 22/06/2009): drag-free mode was commissioned, with all units used in the DFACS control loop for the first time. This checkout was performed in two steps due to flight software problems (see Fig. 4, labels 2 and 3).

• Decay to science altitude (23/06/2009 to 14/09/2009): after completion of commissioning activities the orbit decay was resumed (Fig. 4, label 4) up to reaching an altitude of 259.6 km in September 2009 (Fig. 4, label 5). From that point onwards, the altitude was maintained in drag-free mode.

Reference [7] contains more detailed information on the flight operations approach and events in the first year of the mission.

Figure 4. Altitude of GOCE from launch on 17/03/2009 up to stop of orbit decay on 14/09/2009

4. ON-BOARD SOFTWARE MAINTENANCE ACTIVITIES

4.1 Statistics on OBSM Activities

Tab. 2 gives an overview of the number of OBSM activities for each part of the flight software, distinguishing whether a fix was done in permanent memory or directly on the running software in RAM (either patching the code or data area):

• PASW: no new PASW version was installed in-flight, but 13 different patches of the PASW code were applied, along with an update of the default values of the thermal control tables by patch. As the redundant CDMU processor module cannot be accessed, the PASW patches were initially only installed in permanent memory of CDMU-A.

• STR: a parameter only updatable by patch had to be tuned in-flight.

• SSTI: replacement of the full application SW to correct 2 problems found in commissioning.

• IPA and EGG: no OBSM activities needed.

Table 2. Overview of OBSM activities performed

As can be seen in Tab. 2, patching the running software in RAM was a frequent activity. In particular in the case of PASW code corrections, installation of the patch in EEPROM and activation by reloading the software is not desirable owing to the large overhead implied by this activity. The nature of several PASW problems was such that an immediate fix in RAM was needed, i.e. it was not sufficient to apply the patch in PASW EEPROM and have it active at the next safe mode entry. There were also two cases in which data in PASW RAM was left in an inconsistent state due to PASW code errors, and had to be patched by ground to recover from the anomaly.

Fig. 5 shows the size of the various PASW patches performed. The average size of a PASW patch applied amounts to 769 bytes, with the largest patch needed for an update of a default OBCP (patch #1). Though no updates of large portions of code were needed, some of the problems corrected were major; the size of a PASW patch was not related to its importance. All OBSM activities except for the STR patch and the PASW thermal tables update can be classified as corrective maintenance. Concerning the discovery of these anomalies:

• 2 were found shortly before launch in the simulations campaign at ESOC,

• 3 were found as a byproduct of ground testing performed by industry after launch (testing for investigation of other anomalies found in commissioning),

• the remaining 10 were discovered in-flight, with six of these having a major impact on the mission (e.g. safe mode entry).

Commissioning also required an update of a significant number of software parameters (e.g. the gains of the attitude controller for certain DFACS modes). Thanks to the PASW offering high-level services for updating any such PASW parameters, the so-called Standard Parameter Load Interface (SPLIF), PASW OBSM was in virtually all cases limited to genuine updates of PASW code.
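The practical difference between a high-level parameter load and a raw memory patch can be sketched as follows. This is an illustrative model only: the source does not describe how SPLIF is implemented, and the parameter names, offsets and encodings in `PARAM_TABLE` are hypothetical. The point it shows is that a named parameter service hides addresses and byte encodings from the operator, whereas a memory patch requires ground to know both.

```python
import struct

# Hypothetical on-board parameter table: name -> (RAM offset, struct format).
# Neither the names nor the offsets are GOCE values.
PARAM_TABLE = {
    "att_ctrl_gain_kp": (0x10, ">f"),
    "att_ctrl_gain_kd": (0x14, ">f"),
}


def raw_memory_patch(ram: bytearray, address: int, data: bytes) -> None:
    """Low-level patch: the caller must supply the exact address and encoded bytes."""
    ram[address:address + len(data)] = data


def load_parameter(ram: bytearray, name: str, value: float) -> None:
    """High-level load: the table resolves the name to an address and an encoding."""
    if name not in PARAM_TABLE:
        raise KeyError(f"unknown parameter: {name}")
    address, fmt = PARAM_TABLE[name]
    raw_memory_patch(ram, address, struct.pack(fmt, value))


def read_parameter(ram: bytearray, name: str) -> float:
    """Read a parameter back through the same table."""
    address, fmt = PARAM_TABLE[name]
    return struct.unpack_from(fmt, ram, address)[0]


ram = bytearray(32)
load_parameter(ram, "att_ctrl_gain_kp", 0.5)
```

A request for a name that is not in the table is rejected before any memory is touched, which is a kind of check a raw patch interface cannot offer.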

Figure 5. Size in bytes of each of the 14 PASW patches

4.2 OBSM in Commissioning (March to July 2009)

Fig. 6 gives an overview of the timing concerning the detection and correction of software problems after launch. Not unexpectedly, most problems were discovered in the intense first few months of commissioning up to July 2009, during which many activities were performed for the first time in-flight.

The following software anomalies found in this phase had a major impact:

• Loss of attitude control #1: during recovery of the first safe mode of the mission (caused by inadequate DFACS controller gains, see [7]), a flaw in the PASW handling of ground commands to the fine magnetic torquer software object led to a situation in which the coarse magnetic torquers used by the DFACS were switched off, and thus the S/C attitude was not controlled until ground intervened 1 orbit later. This anomaly had not been found during ground testing as it only occurs in case of very specific timing of commands sent with respect to on-board activities.

• Loss of attitude control #2: the first eclipse of the mission in mid-April 2009 triggered a fallback to the lowest DFACS mode due to faulty tuning of FDIR. When ground initiated the transition to the next higher DFACS mode in the course of the recovery, it was seen that the DFACS was ‘stuck’ in the mode transition with no attitude control being exercised. As the transition timeout surveillance, which would normally have protected against this condition, was not working due to another PASW problem, ground recovered by manually commanding a fallback. The anomaly was found to have been caused by a flag used in the mode transition, which had been left in an inconsistent state following the first fallback. This anomaly escaped detection in ground testing as such a series of several failures had not been simulated.

• Safe mode #2: when commissioning the drag-free modes, a rapid divergence in the S/C pointing errors was observed during the first K2 calibration of the mission, an activity which includes taking the EGG accelerometer heads out of the DFACS control loop one by one. In this case, the root cause was a combination of wrong parameter default settings and coding errors in the PASW algorithm for compensation of the accelerometer biases.

Apart from the fixes applied in PASW RAM (whenever required for proceeding with the commissioning activities), two major OBSM activities took place to install the patches in CDMU-A EEPROM on 28th May 2009 and 6th August 2009. Besides PASW-related anomalies, a parameter in the STR flight software had to be updated by patch, preventing the frequent loss of tracking observed in the first few weeks of the mission. In addition, a new version of the SSTI application software was installed on both SSTIs.

Figure 6. Overview of when flight software problems were discovered and corrected. Most problems were found in the first few months after launch. A star (*) next to an activity indicates that the patch was also installed in RAM.

4.3 OBSM in Routine: Contingencies in October 2009 and the CDMU-A Anomaly

Following completion of S/C commissioning and resumption of the orbit decay at the end of June 2009, an OBSM-wise quiet phase was entered, lasting up to the end of the summer. In routine, the following problems were noteworthy:

• EDAC trap handler correction: in early September 2009 it turned out that the PASW EDAC trap handler had a flaw, leading to the PASW RAM scrubbing task not correcting any single bit errors beyond the first one. Correcting this problem on the running PASW turned out to be tricky, as it involved replacing a routine (the EDAC trap handler) which was being accessed by the software all the time. The PUS service 6 memory patch functionality for PASW RAM had to be updated by patch to allow for installing the EDAC trap handler patch, as the service had originally not allowed for atomic patching of PASW instructions (the jump instruction to the trap handler in this case).

• Contingencies in October 2009: the mission experienced its most demanding anomalies since launch when the following series of events took place in mid-October 2009:

  - On 16th October 2009, the flight software of the ion propulsion system stopped working (the root cause is still unknown; however, this anomaly happened only this one time in 9 months of IPA usage), leading to a fallback to Fine Pointing Mode (FPM) and thus resumption of the orbit decay. Attempts to restart the IPA on 17th and 18th October failed, with some of the commands sent from the DFACS to the IPA seemingly not reaching the unit. This was eventually discovered to be due to the DFACS/IPA communications management having been left in an inconsistent state following the IPA flight software crash. This was worked around on 20th October by resetting some state variables in PASW RAM by patch. The relevant correction of the PASW code was eventually implemented in February 2010.

  - In the evening of 18th October, mass memory playback data was found to be corrupted, limiting visibility on S/C status to real-time telemetry received during the ground station passes. Following one week of troubleshooting, the corruption could be recovered by resetting the state variables of some playback-data-related buffer management by PASW RAM patch, though the root cause of the corruption was not understood at the time. Only after reoccurrence of the anomaly in November 2009 could a problem in the PASW code be identified and eventually fixed by patch in February 2010. Interestingly, it is now clear that the root cause of the playback corruption anomaly was not related to the IPA problems experienced just 2 days before.

• CDMU-A anomaly in February 2010: following more than 3 months of quiet routine operations, GOCE experienced a major anomaly in February 2010, when a switchover to the redundant platform computer (CDMU-B) occurred after two unsuccessful autonomous retries to restart the PASW on CDMU-A. At the end of February, an attempt to manually switch back from CDMU-B to CDMU-A was unsuccessful. While the root cause could not be determined with certainty, it is suspected that the floating point unit of processor module A has failed. The S/C has therefore been configured for an extended stay on CDMU-B. Several interesting OBSM-related issues have come up in the course of this anomaly:

  - The lack of the possibility to switch the unused CDMU processor module to a ‘service mode’ to access its memory meant that no PASW corrections could be installed on CDMU-B during commissioning. When the system switched over to CDMU-B, the default flight software was running with none of the corrections applied after launch active, rendering the recovery on CDMU-B more difficult.

  - Another consequence of not having a service mode for the unused processor module is the difficulty of diagnosing the potential failure of PM-A, which can only be accessed by triggering a CDMU reconfiguration back to the A-side, requiring a large overhead and offering limited visibility in case the PASW does not start up nominally on PM-A (as was the case in the attempt at the end of February 2010).

  - The PASW error logs (recorded when the PASW stops anomalously) were overwritten in the case of consecutive PASW restarts on the same CDMU side. Thus, out of the three PASW crashes on CDMU-A in the course of the anomaly, only a single crash log could be recovered, making the diagnosis of the anomaly more difficult.

5. CONCLUSIONS AND OUTLOOK

5.1 OBSM for GOCE: Conclusions

GOCE is characterized by a high complexity of the S/C, with no clear decoupling between platform and payload and a sophisticated attitude and orbit control system. As a result, the complexity of the platform flight software is significant. It is therefore not surprising that on-board software maintenance played an important role in the first year of the mission. Several flight software problems had to be corrected, in particular in the first few months of commissioning. While it may seem difficult to draw general conclusions from the experience gained on GOCE, the following can be noted:

• Installation of patches in RAM on the running software was a rather common activity, which is in line with the experience gained on other ESA missions. It is important that the flight software offers the right means to apply RAM patches. In particular, the patching of code instructions in RAM should be atomic; this was unfortunately not the case on GOCE, requiring an in-flight update of the memory patch service.

• The flexibility offered by the PASW patching facilities in permanent memory (allowing the installation of chains of patches to be applied on top of the compressed default PASW image) turned out to be important. The operational overhead of replacing the default PASW image for each of the problems would have been very large.

• The presence of high-level services (GOCE SPLIF) for updating PASW parameters proved to be very useful, limiting the OBSM needs to updates of PASW code.

• The H/W inability to access the redundant processor module memories prevented the PASW on CDMU-B from being updated, such that the default PASW version was running after the CDMU-A anomaly in February 2010, with none of the essential patches installed after launch active. It also prevented an assessment of the PM-A health status following the CDMU-A anomaly. A ‘service mode’ for the redundant platform processor module (as implemented on other ESA missions) would have been most useful.
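Why atomicity matters when patching a routine that is in constant use can be shown with a toy model. The sketch below is not the GOCE memory patch service: the memory layout, the word values and the assumption that a single-word write is atomic are all hypothetical. It contrasts overwriting a live routine word by word with staging the new routine in free memory and then switching a single jump word. The callback `on_step` stands in for the routine being invoked between two patch writes.

```python
OLD = ["old0", "old1", "old2"]
NEW = ["new0", "new1", "new2"]


def make_memory():
    # Word 0 holds the jump target; the old routine sits at 1..3; 4..6 are free.
    return [1] + list(OLD) + [None] * 3


def routine_seen(mem):
    """Words that would be executed if the routine were invoked right now."""
    start = mem[0]
    return mem[start:start + 3]


def patch_in_place(mem, on_step):
    """Overwrite the live routine word by word (not atomic)."""
    for i, word in enumerate(NEW):
        mem[1 + i] = word
        on_step(mem)


def patch_by_redirect(mem, on_step):
    """Stage the new routine in free memory, then switch one jump word."""
    for i, word in enumerate(NEW):
        mem[4 + i] = word
        on_step(mem)
    mem[0] = 4  # single-word write, assumed atomic: the only step that changes behaviour
    on_step(mem)


def observe(patch):
    """Record what the routine looks like after every individual write."""
    seen = []
    mem = make_memory()
    patch(mem, lambda m: seen.append(tuple(routine_seen(m))))
    return seen


in_place_views = observe(patch_in_place)
redirect_views = observe(patch_by_redirect)
```

In this model `in_place_views` contains mixed old/new routines, while every entry of `redirect_views` is either the complete old routine or the complete new one.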

5.2 Current Status and Outlook

GOCE has been in routine operations since the end of October 2009, satisfying its objective of delivering high-quality, complete science data. The first set of GOCE products is to be issued in mid-2010. At the time of writing, several months have passed without discovery of new on-board software problems, with the flight software proving to be most stable in routine operations. Thus, the extensive commissioning and OBSM activities performed in 2009 have evidently been successful. The nominal GOCE mission ends in April 2011. Despite the setback of the CDMU-A anomaly in February 2010, the S/C status is in general very good with a large margin in consumables, such that an extension of the mission beyond its nominal lifetime seems feasible.

ACRONYMS

ASW      Application Software
CDMU     Control and Data Management Unit
CPU      Central Processing Unit
DFACS    Drag Free Attitude and Orbit Control System
DFM      Drag Free Mode
EDAC     Error Detection and Correction
EEPROM   Electrically Erasable Programmable Read Only Memory
EGG      Electrostatic Gravity Gradiometer
ESOC     European Space Operations Centre
FCT      Flight Control Team
FDIR     Failure Detection, Isolation and Recovery
FPM      Fine Pointing Mode
GOCE     Gravity Field and Steady-State Ocean Circulation Explorer
IPA      Ion Propulsion Assembly
LEOP     Launch and Early Orbit Phase
OBCP     On-board Control Procedure
OBS      On-board Software
OBSM     On-board Software Maintenance
PASW     Platform Application Software
PM       Processor Module
PROM     Programmable Read Only Memory
PUS      Packet Utilisation Standard
RAM      Random Access Memory
RM       Reconfiguration Module
SGM      Safeguard Memory
SPLIF    Standard Parameter Load Interface
SSTI     Satellite-to-Satellite Tracking Instrument

REFERENCES

[1] Haagmans, R., Floberghagen, R., Pieper, B., ESA’s Gravity Mission GOCE, ESA, Noordwijk, The Netherlands, 2006.
[2] European Space Agency, GOCE Mission Web Site, http://www.esa.int/goce
[3] Drinkwater, M., Haagmans, R., Kern, M., Muzi, D., Floberghagen, R., “GOCE – Obtaining a Portrait of Earth’s Most Intimate Features,” ESA Bulletin, No. 133, February 2008, pp. 4 to 13.
[4] Floberghagen, R., Drinkwater, M., Haagmans, R., Kern, M., “GOCE’s Measurements of the Gravity Field and Beyond,” ESA Bulletin, No. 133, February 2008, pp. 24 to 31.
[5] Drinkwater, M.R., Haagmans, R., Muzi, D., Popescu, A., Floberghagen, R., Kern, M., and Fehringer, M., “The GOCE Gravity Mission: ESA’s First Core Earth Explorer,” 3rd International GOCE User Workshop, ESA SP-627, ESA, Frascati, Italy, 2007.
[6] Fehringer, M., André, G., Lamarre, D., Maeusli, D., “A Jewel in ESA’s Crown – GOCE and its Gravity Measurement Systems,” ESA Bulletin, No. 133, February 2008, pp. 14 to 23.
[7] Steiger, C., Piñeiro, J., Emanuelli, P.P., “Operating GOCE, the European Space Agency’s Low-flying Gravity Mission,” SpaceOps 2010, Huntsville, USA.