THE DATA MANAGEMENT GROUP

Author: Lucas Webster


Data Management

P761 20 Years of Transplantation – Department of Hematology and Oncology of the Teaching Hospital in Pilsen, Czech Republic
A. Jungova1,*, S. Vokurka1, M. Karas1, M. Hrabetova1, K. Steinerova1, P. Jindra1
1 Hemato-oncology department, CHARLES UNIVERSITY HOSPITAL PILSEN, Pilsen, Czech Republic

Introduction: The Hematology and Oncology Department was established in 1994. The first transplantation (Tx) was performed in 1991, but Tx have been performed routinely only since 1994. In the past 20 years, 735 allogeneic Tx and 1059 autologous Tx have been performed in the department. The number of transplantations has been stable over the last 10 years, with a yearly average of around 41 allogeneic Tx (range 35-53) and 51 autologous Tx (range 40-85). The representation of individual diagnoses among allogeneic Tx has changed in an interesting but easily explained way.

Materials (or patients) and methods: Retrospective analysis of internal database data on the group of patients who were transplanted from 1994 to 2014.

Results: Over the last 20 years, 735 allogeneic Tx have been performed in the Department of Hematology and Oncology of the Teaching Hospital in Pilsen. 424 patients were men and 311 were women (ratio 1.3:1). In the first decade, the most common indication for Tx was chronic myeloid leukemia (CML, 37%), followed by acute myeloid leukemia (AML, 22% of all transplants). The third most common diagnosis was acute lymphoblastic leukemia (ALL, 11%), followed by chronic lymphocytic leukemia (CLL, 9%), MDS (5%), lymphomas (5%) and multiple myeloma (3%). With the introduction of highly specific hematological therapies (tyrosine kinase inhibitors, other specific immunotherapy...), the major indications for allogeneic Tx changed rapidly. In the second decade, CML, originally the main diagnosis, dropped to about 6th place (only 3% of all transplants) and remained an indication for Tx only in case of immunotherapy failure. During this period, AML became the main diagnosis (45%); the second most common diagnosis was ALL (13%), closely followed by CLL (12%). Lymphomas accounted for around 8% of all transplants and MDS for about 6%, while multiple myeloma remained at about the same level (4%).

Conclusion: The Department of Hematology and Oncology of the Teaching Hospital in Pilsen performed approximately 80 transplantations a year. Over the past 20 years, there has been a significant change in transplant strategy with respect to the ranking of diagnoses. With the development of new drugs in hematology (specific immunotherapy, tyrosine kinase inhibitors and others), there has been a steep decrease in allogeneic transplantations in patients with CML, and acute leukemias are now coming to the foreground. The trend in the Czech Republic closely reflected trends in Europe.

Disclosure of Interest: None declared.

P762 Optimizing data collection at a BMT unit
C. Roepstorff1,*, H. Petersen1, A.-M. Berthelsen1
1 Haematology, RIGSHOSPITALET, Copenhagen, Denmark

Introduction: Accurate data collection is essential for the validity of scientific data entered in databases (DB). In an earlier study we identified how difficult it is to enter accurate data into local and international databases. We therefore decided to examine which tools we have available to collect the data. We frequently respond to requests from CIBMTR, EBMT and the local DB, and we enter patients into clinical research protocols, all tasks for which data collection is necessary. The patient file exists in a paper version and an electronic version. The paper version consists of documents which either have not been converted or could not be converted into an electronic version. The patient paper file already contains a document for listing the data essential for data registration, but since the electronic patient file was implemented, awareness of paper documents has declined.

Currently some data are collected solely by the data managers (DM) and extracted from patient files at the hospital and from national patient registries. Some data, however, must be reviewed and interpreted by a physician in order to extract valid data. The treating physician therefore needs an overview of which data are essential to record in the patient file. The BMT unit is preparing for JACIE accreditation, and according to standard B9.1: "The Clinical Program must furnish evidence of its own periodic data audits to determine if data are accurate for evaluation of patient outcomes", e.g. outcome analysis on day +100, +180 and annually.

Aim: To optimize data collection in order to increase the validity of data entered to CIBMTR, EBMT, DB and CRFs. To identify which data are to be collected and entered in CIBMTR, EBMT, DB and Clinical Research Files (CRF). To create an overview of which data are to be collected for the DB and CRFs, to ease data collection at patient visits.

Materials (or patients) and methods: We reviewed forms from the CIBMTR and EBMT and CRFs from the clinical research team, and listed which essential data are needed on day +100, +180 and annually. We identified and reviewed existing paper documents in the patient paper file which had previously been created to give the treating physician an overview of which data are to be collected and a place to enter them. By looking into our electronic patient file system, we discovered the possibility of creating standard notes.

Results: The existing paper document has now been changed into a schedule giving an overview of which data are to be collected on day +100, +180 and annually. The schedule is placed in the patient paper file, and data collection is divided into data collected by the treating physician and data collected by the data manager. Within the electronic patient file we created standard notes. The physician can click on the relevant day of visit and view which data are to be entered on the specific dates of patient visits. From here the DM can extract the data. To implement the new schedules and notes in the patient file, we held a meeting with the BMT physicians.

Conclusion: We will evaluate the effect of the standard notes 3 months after the implementation.

Disclosure of Interest: None declared.

P763 STANDARD FOR THE GLOBAL REGISTRY IDENTIFIER FOR DONORS (GRID)
P. Distler1,*, P. Ashford1, I. Britton2, L. Foeken3, M. Maiers4, S. Querol5
1 ICCBBA, San Bernardino, United States, 2 Diagnostic and Therapeutic Services, NHS Blood and Transplant, Leeds, United Kingdom, 3 WMDA, Leiden, Netherlands, 4 Bioinformatics Research, National Marrow Donor Program, Minneapolis, United States, 5 Cell Therapy Service and Cord Blood Bank, Banc Sang i Teixits, Barcelona, Spain

Introduction: Given the global nature of the work done by hematopoietic stem cell donor registries, a system to uniquely identify potential donors on a global scale is needed to facilitate communication and prevent errors in the identification of donors. A standard machine-readable format for a Global Registry Identifier for Donors (GRID) that can be used by electronic process control systems, as well as a standard format for the human-readable version, is under development.
Materials (or patients) and methods: WMDA has signed a Memorandum of Understanding with ICCBBA to assign and manage the register of registry identifiers and to support the development of associated standards documents. Representatives from WMDA and ICCBBA have identified three phases for the project:
Phase 1 – Registry identifier allocation rules, GRID format, GRID eye-readable presentation, and GRID data structure for electronic transfer are defined. Guidance for registries on mapping local identifiers to the GRID is developed.
Phase 2 – EMDIS and BMDW databases and messaging standards support GRID. Use of GRID in communication between registries is recommended but not required.
Phase 3 – GRID is a mandatory field in communication with EMDIS and BMDW databases and in communication between registries. GRID is used as the key donor identifier on search reports. GRID is used on product labels when a donor identifier is required.

Results: Phase 1 of the project is in progress. The format for the GRID will be a 4-character (all numeric) registry identifier assigned by ICCBBA, followed by a 15-character (numeric or uppercase alpha) donor identifier assigned by the registry, for a total of 19 characters. In the human-readable form, the alphanumeric sequence will be divided into blocks of 4,4,4,4,3 to assist in manual transcription. Leading zeroes will be used for donor identifiers shorter than 15 characters. In the human-readable form, a modulus 37-2 check character will be displayed. This can be used to ensure correct data entry should the number be entered into a computer system via a keyboard. In electronic communications, the characters &,3 will be used to identify the number as containing a Hematopoietic Stem Cell Registry GRID. Code 128 will be used if the GRID is to be represented in a linear bar code, and Data Matrix will be used if the GRID is represented in a two-dimensional bar code.
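The GRID layout described in the Results can be illustrated with a minimal Python sketch. The function names are hypothetical, and two points are assumptions rather than statements from the abstract: that the check character follows the ISO 7064 MOD 37-2 scheme, and that it is computed over all 19 characters. The registry and donor identifiers used in the usage note are the ones given as an example in this abstract's conclusion.

```python
import string

# Character set of ISO 7064 MOD 37-2: digits, uppercase letters, then '*'.
ALPHABET = string.digits + string.ascii_uppercase + "*"


def build_grid(registry_id: str, donor_id: str) -> str:
    """Return the 19-character GRID: 4-digit registry id + zero-padded donor id."""
    if len(registry_id) != 4 or any(c not in string.digits for c in registry_id):
        raise ValueError("registry identifier must be 4 digits")
    if not 1 <= len(donor_id) <= 15 or any(c not in ALPHABET[:36] for c in donor_id):
        raise ValueError("donor identifier must be 1-15 digits or uppercase letters")
    return registry_id + donor_id.rjust(15, "0")  # pad with leading zeroes


def human_readable(grid: str) -> str:
    """Split the 19 characters into blocks of 4,4,4,4,3."""
    return " ".join(grid[i:i + 4] for i in range(0, len(grid), 4))


def mod37_2_check(text: str) -> str:
    """ISO 7064 MOD 37-2 check character (assumed scheme) for a 0-9/A-Z string."""
    p = 0
    for ch in text:
        p = ((p + ALPHABET.index(ch)) * 2) % 37
    return ALPHABET[(38 - p) % 37]
```

Calling `human_readable(build_grid("3054", "A12345"))` reproduces the block layout `3054 0000 0000 0A12 345`; `mod37_2_check` would then supply the extra character shown only in the human-readable form.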

Conclusion: It is anticipated that implementation of a GRID following the 3-phase project plan will take 3-5 years. The GRID has been designed to allow registries to insert their current donor identifiers (alphanumeric characters only) into the GRID to facilitate adoption of the new identifier. For example, suppose a registry has been assigned the identifier 3054 and has assigned the identifier A12345 to a donor. The GRID would be 3054 0000 0000 0A12 345. It is believed that the GRID will improve electronic communication, traceability, and accuracy in identifying potential donors by standardizing systems across the globe.

Disclosure of Interest: None declared.

P764 HML 1.0: Reporting NGS-based HLA & KIR genotyping using MIRING principles
R. P. Milius1,*, M. Heuer1, D. Valiga1, K. Doroschak1, C. Kennedy1, J. Schneider1, J. Pollack1, J. Hollenbach2, S. Mack3, M. Maiers1
1 Bioinformatics Research, NMDP, Minneapolis, 2 Department of Neurology, University of California, San Francisco, 3 Children's Hospital Oakland Research Institute, Oakland, United States

Introduction: HLA is a complex and critical variable to consider when evaluating allogeneic Haematopoietic Stem Cell (HSC) Transplantation. The field lacks rigorous data standards for electronic reporting of HLA genotyping experiments. Next Generation Sequencing (NGS) based genotyping of HLA and KIR poses new challenges for reporting these results. Recently, a set of principles and guidelines for the Minimum Information for Reporting Immunogenomic NGS Genotyping (MIRING) [1] has been developed through a broad collaboration of histocompatibility and immunogenetics clinicians, researchers, instrument manufacturers and software developers [2]. Using these principles, we have extended the XML-based Histoimmunogenetic Markup Language (HML) [3] to include reporting of NGS-based genotyping of HLA and KIR.
Materials (or patients) and methods: HML has been undergoing development to accommodate NGS-based genotyping for the past year, in an effort led by the National Marrow Donor Program through a series of meetings and discussions with the HLA Information Exchange Data Format Standards Group and with the Immunogenomic NGS Data Consortium, a community of registries, clinical and research laboratories, and industry partners focused on identifying and addressing specific data-reporting requirements for NGS-based genotyping. In Sep 2014, a Data Standards Hackathon (DaSH) was held to develop new ways to exchange NGS data for HLA and KIR. During this event MIRING was further refined, and substantial new requirements were identified for HML 1.0.

Results: HML 1.0 retains an overall structure similar to previous versions, but with notable changes. As before, reporting of primary data is separated from allele assignment. NGS is supported by new XML structures that capture all NGS data and metadata required to produce a genotyping result, including analysis-dependent (dynamic) and method-dependent (static) components. Pointers to external locations are used to refer to registered methodologies, raw NGS reads, and reference standards. A separate component describing consensus sequences and variants was created specifically to accommodate NGS data, but could be used for other methodologies if desired. This component includes metadata describing the sequences, such as reference sequences, phasing information, expected copy number, sequence block continuity, and other metadata. Reporting of allele assignment with full genotypic and allelic ambiguity is achieved through the use of GL Strings [5].
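To make the role of GL Strings more concrete, the following Python sketch splits a GL String into nested lists, one nesting level per delimiter. It assumes the delimiter precedence of the published GL String grammar (`^` between loci, `|` between ambiguous genotypes, `+` between gene copies, `~` between alleles in phase, `/` between ambiguous alleles). The function names and the example genotype are illustrative only and are not taken from the HML schema.

```python
# GL String delimiters, from loosest to tightest binding (assumed precedence).
DELIMITERS = ["^", "|", "+", "~", "/"]


def parse_gl_string(text, level=0):
    """Recursively split a GL String into nested lists, one level per delimiter."""
    if level == len(DELIMITERS):
        return text  # a single allele name
    parts = text.split(DELIMITERS[level])
    return [parse_gl_string(part, level + 1) for part in parts]


def alleles(node):
    """Flatten a parsed GL String into the list of allele names it mentions."""
    if isinstance(node, str):
        return [node]
    return [name for child in node for name in alleles(child)]


# Hypothetical two-locus genotype with one allelic ambiguity at HLA-A.
example = "HLA-A*02:01/HLA-A*02:02+HLA-A*03:01^HLA-B*07:02+HLA-B*44:02"
parsed = parse_gl_string(example)
```

Here `parsed` has two top-level entries (one per locus), and the ambiguity at the first HLA-A copy appears as a two-element allele list at the innermost level.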

Conclusion: HML 1.0 has been finalized and can be found on schemas.nmdp.org. Reporting of NGS-based genotyping of HLA and KIR conforming to MIRING principles is possible using a combination of HML and semantic and syntactic validation tools. As NGS technologies become the de facto standard for HLA and KIR genotyping in both research and clinical settings, standards describing technical specifications for exchanging the genotyping results and the surrounding metadata become ever more important. We are working with healthcare standards organizations (e.g., HL7) to include the data and metadata described in HML 1.0 in clinical messages and structured documents.

References:
1 – miring.immunogenomics.org
2 – ngs.immunogenomics.org
3 – bioinformatics.bethematchclinical.org/HLA-Resources/HML/
4 – dash.immunogenomics.org
5 – Milius et al, Tissue Antigens. 2013 Aug; 82(2):106-12. doi: 10.1111/tan.12150.

Disclosure of Interest: None declared.

P765 HLA 5-locus matching predictions for Southwest Asia
H.-P. Eberhard1, S. Morsch2, A. Schmidt3, C. Müller1,*
1 ZKRD, Ulm, 2 SKD, Birkenfeld, 3 DKMS, Tübingen, Germany

Introduction: Prognostic matching algorithms like HapLogic and OptiMatch are sophisticated statistical tools to evaluate the chances of a search for an unrelated hematopoietic cell donor.

Materials (or patients) and methods: High quality HLA-A, -B, -C, -DRB1 and -DQB1 haplotype frequencies (HF) are the essential basis for matching predictions between patients and potential donors. This study highlights the impact of population-specific HF on the match prognoses for donors from Southwest Asia, i.e. from the Middle East and Turkey (ASSW). The HF have been estimated from two subsets of the German donor pool: (1) all donors with molecular assignments at least for HLA-A and -B (ALL); (2) the subset of (1) explicitly assigned to ASSW origin.
The matching predictions have been assessed using high resolution confirmatory typing results of donors without previously complete high resolution data. We used weighted linear regression as well as the logarithmic score (u(x)) for the whole model and for the predictions up to 10% probability (u10(x)), i.e. the most difficult searches. This assessment has been applied to the 10/10 (full HLA match) and the 9/10 (1 HLA mismatch) predictions.

Results: The unrestricted sample of 4.4 million donors yielded 134,000 HF, compared to 25,000 HF for the 122,000 ASSW donors. Altogether 617 5-locus HLA high resolution typings of ASSW donors have been used for the validation. The goodness of fit (R²) of the unrestricted 10/10 model is 0.95, compared to 0.98 for the ASSW model (Figure 1). Logarithmic scores u(x) and u10(x) were 0.225 and 0.13 for all donors and 0.179 and 0.098 for ASSW donors. The picture for the 9/10 model is similar, except that the R² is slightly higher.

Conclusion: Despite the relatively small number of HLA types available for the validation, this study underlines the importance of population-specific HF for matching predictions. Unfortunately, the application of the results of this and similar studies is severely hampered by two factors: first, the lack of adequate a-priori donor HLA types for every patient population, which prevents the estimation of HF, and second, the absence of substantial numbers of a-posteriori donor high resolution HLA types for the validation of the estimated HF.

Disclosure of Interest: None declared.

P766 Why everything (or nothing) seems to work in the treatment of steroid-refractory chronic Graft-versus-Host Disease (SR-cGVHD): a systematic review and critical appraisal of the literature
J. Olivieri1,*, L. Manfredi2, L. Postacchini2, S. Tedesco2, A. Gabrielli2, P. Leoni1, A. Bacigalupo3, A. Rambaldi4, A. Olivieri1, G. Pomponio5
1 Hematology, 2 Clinical Medicine, Marche Polytechnic University, Ancona, 3 Hematology Unit 2, San Martino Hospital, Genoa, 4 USC Ematologia, Ospedale Papa Giovanni XXIII, Bergamo, 5 Clinical Medicine, AOUOORR Regional Hospital, Ancona, Italy

Introduction: SR-cGVHD represents an unmet challenge in allogeneic transplantation. Although research has been very active, the lack of fully developed research methods has made it difficult to conduct clinical trials suitable for drug regulatory approval, leading to the absence of any FDA-approved medication for SR-cGVHD. To bridge this gap, in 2006 the NIH cGVHD Consensus Project provided recommendations (NIH-cGCP-R) for the design of clinical trials. The uptake of NIH-cGCP-R in the recent literature, and their impact on the quality of clinical research on SR-cGVHD, have not yet been evaluated.
Materials (or patients) and methods: We aimed to identify recurrent methodological deviations from NIH-cGCP-R in the recent literature on SR-cGVHD; to ascertain whether these deviations introduced systematic biases; and to evaluate whether a significant methodological improvement over time could be observed, as a possible consequence of the NIH-cGCP-R. A Medline search for non-randomized studies (ST) on systemic treatment of SR-cGVHD published between 1998 and 2013 yielded 2901 hits: 152 were pertinent to our topic, relating to 51 interventions (INT). We excluded 42 ST with fewer than 3 subjects and 28 ST whose INT had fewer than 3 published reports in the analyzed timeframe; the remaining 82 ST, relating to 9 INT (Extracorporeal photopheresis, Rituximab, Mycophenolate, Imatinib, Mesenchymal Stem Cells, Methotrexate, Pentostatin, Sirolimus, Thalidomide), were further analyzed. We generated a 61-item checklist from (1) to evaluate adherence to NIH-cGCP-R. Four investigators applied the checklist and extracted efficacy and toxicity data from all 82 ST, with high concordance. Chi-square or Fisher's exact tests were used to compare item adherence across cut-off dates. Meta-analysis was performed to measure the pooled effect size for global response rate (GRR), and meta-regression analyses were employed to study the effect of covariates on GRR.

Results: The NIH-cGCP-R had a modest impact on the methodological pattern of the ST over time, without overall significant changes before and after the arbitrary cut-off date of Jan 1, 2008. Moving the cut-off date to each year from 2006 to 2013 did not change the results. Better adherence to NIH-cGCP-R in the critical subset of GRR determination (GRR-DET) was significantly associated with a lower GRR of the evaluated INT. This subset included: use of calendar-driven data collection and objective measures to define response, the requirement of a magnitude of change for partial response, specification of a pre-defined timing for response evaluation, and reporting of a measure of response duration. Adherence to NIH-cGCP-R in the GRR-DET subset was significantly higher after 2008.

Conclusion: An overestimation bias is evident in the reported GRR of recent SR-cGVHD ST: it is thus likely that exaggeration of treatment effects led to the accumulation of an array of compounds of unknown true efficacy, thereby contributing to a jammed drug development process. The global uptake of NIH-cGCP-R in the recent literature was limited; however, it was significant in improving the methodology of response determination, thereby reducing overestimation bias. Although a minor change, this may have a strong impact in lowering false expectations about newly tested INT, thus preventing further patients from being subjected to ineffective treatments.

References: (1) Martin PJ et al, BBMT 2006;12:491-505.

Disclosure of Interest: None declared.

P767 Survival Analysis is Robust to Informed Censoring in Studies with Historical Controls
M. Al-Khabori1,*, I. Al-Zakawani2, A. AlManiri3, S. Ganguly4
1 Hematology, 2 Pharmacy, Sultan Qaboos University Hospital, 3 Health Sector, The Research Council, 4 Family Medicine and Public Health, Sultan Qaboos University Hospital, Muscat, Oman

Introduction: Comparison to historical controls when concurrent controls are not available remains a common strategy for retrospective comparative studies. Nevertheless, the probability of censoring differs between the current and the historical cohorts. We hypothesized that this informed censoring would bias the survival probability estimate in favor of the cohort with the shorter follow up time.

Materials (or patients) and methods: We simulated a cohort of 500 patients using a Weibull distribution with 50% probability of failure and a maximum follow up time of 5 units. We then derived the second cohort, in which the maximum follow up time was set to 50% of that of the original cohort.
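The simulation set-up of this abstract (a Weibull cohort, a second cohort with halved follow-up, and Kaplan-Meier estimates at 1 and 2 units) can be sketched in plain Python as follows. The Weibull shape, the random seed and the function names are assumptions made for illustration; the scale is chosen so that the probability of failure within the maximum follow-up is 50%, which is one possible reading of the description.

```python
import math
import random


def simulate_cohort(n, max_follow_up, rng, shape=1.5):
    """Draw Weibull failure times; censor administratively at max_follow_up."""
    # Scale chosen so that P(failure before max_follow_up) = 0.5 (assumption).
    scale = max_follow_up / math.log(2) ** (1 / shape)
    cohort = []
    for _ in range(n):
        t = rng.weibullvariate(scale, shape)
        cohort.append((min(t, max_follow_up), t <= max_follow_up))  # (time, failed)
    return cohort


def truncate(cohort, max_follow_up):
    """Derive a cohort with shorter follow-up: later failures become censored."""
    return [(min(t, max_follow_up), failed and t <= max_follow_up)
            for t, failed in cohort]


def kaplan_meier(cohort, at):
    """Kaplan-Meier survival probability at time `at`."""
    surv = 1.0
    at_risk = len(cohort)
    # Process failures before censored observations at tied times.
    for t, failed in sorted(cohort, key=lambda r: (r[0], not r[1])):
        if t > at:
            break
        if failed:
            surv *= 1.0 - 1.0 / at_risk
        at_risk -= 1
    return surv


rng = random.Random(0)
first = simulate_cohort(500, 5.0, rng)
second = truncate(first, 2.5)
estimates = {t: (kaplan_meier(first, t), kaplan_meier(second, t)) for t in (1.0, 2.0)}
```

In this sketch the two estimates at each time point coincide exactly, because truncating follow-up at 2.5 units leaves every risk set before that time unchanged.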
Failure events that occurred beyond the maximum follow up of 2.5 units in the first cohort were changed to censored events in the second cohort. The Kaplan-Meier survival probability was estimated at 1 and 2 units of follow up time for both cohorts.

Results: The total numbers of failure events in the first and second cohorts were 242 and 98, respectively. The survival probabilities at 1 and 2 units were 94% and 85% for the first cohort and 94% and 85% for the second cohort, with overlapping Kaplan-Meier survival curves indicating no bias in the survival probability estimates.

Conclusion: Survival analysis is robust to informed censoring in studies comparing current outcomes to historical controls. The results should be validated using data from prospectively collected datasets in the field of malignant hematology and bone marrow transplantation.

Disclosure of Interest: None declared.

P768 Predicting the odds of Day-100 Treatment Related Mortality using Supervised Machine Learning in Acute and Chronic Myeloid Leukemia patients who underwent Allogeneic Stem Cell Transplantation: 1042 cases
T. A. Elhassan1,*, N. Chaudhri1, G. Aldawsari1, F. Alsharif1, H. Alzahrani1, S. Yousuf Mohamed1, W. Rasheed1, A. Hanbali1, S. Osman Ahmed1, M. Shaheen1, F. Alfraih1, F. Hussain1, F. Almohareb1, M. Aljurf1
1 Oncology Center, KING FAISAL SPECIALIST HOSPITAL & RESEARCH CENTER, Riyadh, Saudi Arabia

Introduction: Transplant data are becoming more complex, and conventional statistical modeling sometimes fails to hold. Supervised machine learning (a field of Artificial Intelligence), on the other hand, provides state-of-the-art algorithms for model building by learning from data instances using both input and output measures. Generalized Boosting Model (GBM), Support Vector Machine (SVM) and Artificial Neural Networks (ANN) are some examples of these algorithms. GBM is an optimization of Decision Trees (DT) that uses boosting algorithms to combine weak DT classifiers into a single strong classifier in a stage-wise fashion. SVM is a technique for constructing an optimal separating hyperplane between two classes in cases where the classes are not linearly separable. ANN is a type of learner that extracts linear combinations of the input features and models the target as a non-linear function of these features.

Materials (or patients) and methods: Supervised machine learning predictive models (GBM, SVM and ANN) were compared to Logistic Regression (LR) in predicting day-100 Treatment Related Mortality (TRM). Models were trained using 70% of the dataset, while 30% of the data was kept for validation. The predictive ability of each model was assessed using the Area Under the receiver operating characteristic Curve (AUC). AUC was obtained to measure the in-sample fit and the out-of-sample fit using the training and validation data sets, respectively. Five-fold cross validation was used to avoid model overfitting and ensure model stability. A decision matrix was used to adjust for TRM non-proportionality using prior probabilities and appropriate decision consequences. Average profit summary statistics were computed to assess model performance. In this study, 1,042 patients with Acute and Chronic Myeloid Leukemia who underwent allogeneic transplantation between 1997 and 2013 were analyzed. Baseline characteristics such as age group (<20, 20-40 and >40), donor-sex mismatch, disease stage (early, intermediate and advanced) and graft type (BM vs. PB) were used in building the predictive model. However, HLA mismatch and type of conditioning were not included, because more than 95% of the patients were HLA-identical and received myeloablative conditioning. Analysis was performed using SAS Miner 13.1 (Figure 1).

Results: The machine learning based models achieved an AUC of 70%, 69% and 61% using GBM, ANN and SVM respectively, compared to 67% for LR, in the training data. For the validation data, the AUC was 65%, 61% and 60% for SVM, GBM and ANN respectively, compared to 62% for LR. However, all models achieved an accuracy of 85% in the training and validation data sets (Table 1).

Table 1. Model performance assessment measures (average of 5-fold cross validation)

Model   AUC (ROC index)                  Average profit
        Training data   Validation data  Training data   Validation data
GBM     0.7             0.65             1.9             1.6
ANN     0.69            0.61             1.6             1.5
SVM     0.61            0.6              1.4             1.4
LR      0.67            0.62             1.6             1.5
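For readers unfamiliar with the AUC reported in Table 1, the short Python sketch below computes it as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, and shows a simple 70/30 split. It is an illustration only: the function names and the seed are hypothetical, and this is not the SAS Miner procedure used in the study.

```python
import random


def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def train_validation_split(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of the rows and split them into training and validation parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading the values between 0.6 and 0.7 in the table.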

[P768] Figure 1. Model comparison using SAS Miner 13.1.

Conclusion: Machine learning based models showed performance comparable to LR. However, GBM showed an improved performance, which could be enhanced further by identifying more predictive risk factors that contribute to the prediction of day-100 TRM, thereby improving the model's deterministic ability.

Disclosure of Interest: None declared.

P769 Estimation of reaction rates for the MAPK/ERK pathway via Western-blotting data
V. Purutcuoglu1,*, T. Seker2
1 Department of Statistics, 2 Central Laboratory, Molecular Biology-Biotechnology R and D Center, Middle East Technical University, Ankara, Turkey

Introduction: Stochastic modelling of biochemical systems enables us to explain the random features of biological processes such as transcription and translation when the exact numbers of molecules in reactions are known. In this study, we model a realistically complex MAPK/ERK (mitogen-activated protein kinase/extracellular signal-regulated kinase) pathway via stochastic approaches, estimate its reaction rate constants and evaluate the results biologically in detail. MAPK/ERK is one of the main signalling pathways regulating growth control in eukaryotic cells and involves many genes, whose major components are the Ras, Raf, MEK and ERK proteins. Because of its importance in cell proliferation, differentiation and apoptosis, it has also been studied intensively in oncogene research. Here, we use a western blotting dataset which has measurements for 5 proteins at 8 time points. For such very sparse data, we apply a quasi list of reactions which describes the system by the EGFR degradation and the RKIP protein, besides the MEK protein, in the regulation of the system.

Materials (or patients) and methods: In modelling, we use the Euler-Maruyama approximation, which is given by the following expression:

\[ \Delta Y_t = \mu(Y_t, \Theta)\,\Delta t + \beta(Y_t, \Theta)\,\Delta W_t , \]

where $\Delta Y_t$ denotes the change in state at time $t$, and $\mu$ and $\beta$ denote the mean and the variance terms, respectively, which depend on the state $Y$ and the reaction rates $\Theta$. Finally, $\Delta W_t$ denotes the independent and identically distributed Brownian vector at time $t$. Accordingly, in order to infer $\Theta$ for partially observed data, we implement the particle filtering approach within the Bayesian framework. This method enables us to estimate both the missing observations and $\Theta$ while decreasing the high correlation structure in the calculations, resulting in more accurate inference. In the computation, we use western blotting data whose proteins are harvested via the pull-down method for COS cells. We also equate the measured optical intensities to a fixed number of molecules and scale the rest proportionally to that value, in order to convert the available data into a structure suitable for statistical analyses. Finally, observations are assumed to have no measurement error, meaning that their uncertainties are caused only by the stochasticity of protein interactions.

Results: From our findings, we see that the estimated structure of the system agrees with the biological knowledge that the EGF degradation is crucial and that the RKIP gene has a significant role in the activation of this pathway. Moreover, as we obtain estimates of the reaction rates, we can interpret the importance of each reaction; for example, we can evaluate the speed of production of the target gene c-Fos in the nucleus.

Conclusion: We observe that the suggested methods for inferring the MAPK/ERK pathway are successful in constructing the system from very sparse data and enable us to critically assess each reaction in the pathway by interpreting it biologically in detail. These methods can, additionally, be used for any other complex system. On the other hand, the high correlation between genes causes high computational demand.
To reduce the underlying cost, we are considering the use of the inhomogeneous Poisson process, whose model parameters can be inferred by a mixture of the Gibbs sampling approach with other Bayesian techniques.

Acknowledgement: The authors would like to thank TUBITAK for its financial support through the grant (Project No: TBAG: 112T772).

Disclosure of Interest: None declared.
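The Euler-Maruyama scheme named in the methods of P769 can be illustrated with a one-dimensional Python sketch. Everything specific here is an assumption for illustration: the immigration-death system, its rate values, and the choice of the square root of the variance term as the coefficient multiplying the Brownian increment (the usual convention for a chemical Langevin equation, which the abstract does not spell out). It does not reproduce the authors' MAPK/ERK model or their particle filter.

```python
import math
import random


def euler_maruyama(drift, diffusion, y0, theta, dt, n_steps, rng):
    """Simulate Y_{t+dt} = Y_t + drift(Y_t, theta)*dt + diffusion(Y_t, theta)*dW_t."""
    path = [y0]
    y = y0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment, N(0, dt)
        y = y + drift(y, theta) * dt + diffusion(y, theta) * dw
        path.append(y)
    return path


# Hypothetical immigration-death system: molecules appear at rate k1
# and each molecule decays at rate k2.
def drift(y, theta):
    k1, k2 = theta
    return k1 - k2 * y


def diffusion(y, theta):
    k1, k2 = theta
    return math.sqrt(max(k1 + k2 * y, 0.0))  # square root of the variance term


path = euler_maruyama(drift, diffusion, y0=10.0, theta=(5.0, 0.1),
                      dt=0.01, n_steps=1000, rng=random.Random(1))
```

Setting the diffusion term to zero reduces the scheme to the ordinary Euler method for the deterministic rate equation, which is a convenient sanity check before adding noise.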
