Cancer Diagnostic: An automated method for determining HER2/neu status in breast cancer

by Niels Teilmann Berg
Kongens Lyngby 2008
IMM-Master Thesis-2008

Summary

Worldwide, breast cancer is the fifth most common cause of cancer death and is the leading type of cancer in women. Breast cancer is predominantly a western problem, but as the western lifestyle spreads, the number of breast cancer cases rises worldwide. About 20% of all breast cancer cases have an amplification of the HER2/neu gene. This gene contributes to the growth and division of cells and is normally present in healthy cells. When the gene is amplified in a cancer, both growth and cell division happen faster, making the cancer spread more aggressively. Determining the HER2/neu status is therefore important both for making a prognosis and as a target for therapy.

In this thesis a method for automatically determining the HER2/neu status using image analysis is proposed and tested. The input images are captured from immunohistochemistry-stained tumor biopsies. The staining colors cell membranes brown if the cells have an amplification of HER2/neu. The intensity of these brown membranes is used as a quantitative measure of the HER2/neu status. Four factors that influence the images or the staining are analyzed for their effect on the brown intensity: the two most common staining reagents, Dako’s HercepTest and Ventana’s Pathway®; images captured with two different cameras, the PixeLINK A686 and the Olympus DP 71; the slice thickness; and unfocused images. It is shown that the brown intensity found by the method can be used as a measure of the HER2/neu status, and that the effects of the factors are either inconsequential or can be accommodated by modifying the process.


Resumé

Worldwide, breast cancer is the fifth most deadly type of cancer and the type that kills the most women. Breast cancer predominantly occurs in western countries, but as the western lifestyle spreads, the number of breast cancer cases rises worldwide. About 20% of all breast cancer cases have an amplification of the HER2/neu gene. This gene helps control the growth and division of cells and is normally present in healthy cells. If a mutation causes an amplification of this gene, the cell will both grow and divide faster than normal, which in turn makes the cancer spread more aggressively and faster. Determining the HER2/neu status is therefore an important part of the prognosis and can be used when the treatment of the patient is decided.

In this thesis a method that automatically determines the HER2/neu status by means of image analysis has been developed and tested. The method uses images of immunohistochemistry-stained cancer biopsies to determine the status. The staining of the biopsy turns the cell membranes brown if the cell has amplified HER2/neu. The intensity of this brown color is used as a measure of the HER2/neu status. Four factors, all of which can affect either the image or the staining, and their effect on the brown intensity have been examined. These factors are: the two most widely used staining methods, Dako’s HercepTest and Ventana’s Pathway®; the difference between images captured with two different cameras, the PixeLINK A686 and the Olympus DP 71; the effect of the tissue thickness; and finally the effect of unfocused images. The thesis shows that the brown intensity found by the method can be used as a measure of the HER2/neu status, and that the factors either have a very small influence on the intensity or can be accommodated by modifying the method.


Preface

This document is a master thesis written at the Department of Mathematical Modeling (IMM), Technical University of Denmark, in collaboration with Visiopharm A/S, Hørsholm.

A big thanks goes to my supervisor, Professor Bjarne Kjær Ersbøll, for always being helpful and for keeping me going. I would like to thank Professor of Pathology Jan P.A. Baak, Head of the Department of Molecular and Quantitative Pathology at Stavanger University Hospital, Norway, and MD Mogens Vyberg at the Institute of Pathology at Aalborg Hospital for sharing the data material upon which this thesis is based. Special thanks go to the staff of the Research and Development department at Visiopharm, Steen Frost Tofthøj, Kim Anders Bjerrum and Thomas Ebstrup, as well as CTO Johan Doré, for helping me when Imaging Utilities or C++ wasn’t behaving as I would have it. Thanks also go to Ph.D. Michael Grunkin for helping me develop the method and verify the results, and to Henrik Stolpe for explaining how microscopes and cameras work. Finally, big thanks go to Caroline E. Rasmussen for putting up with me through the final weeks of this thesis.

Niels T. Berg, s021778 12. February 2008



Abbreviations

HER2/neu - Human Epidermal growth factor Receptor 2 gene
HER2 - Human Epidermal growth factor Receptor 2 gene
IHC - Immunohistochemistry
FISH - Fluorescent In Situ Hybridization
CISH - Chromogenic In Situ Hybridization
SISH - Silver In Situ Hybridization
ANOVA - Analysis of variance


Content

1 Introduction
2 Cancer
  2.1 What is cancer?
    2.1.1 Therapy
    2.1.2 The cause for cancer
    2.1.3 Cancer statistics
  2.2 Breast cancer
    2.2.1 Breast cancer statistics
    2.2.2 Classification of breast cancer
  2.3 HER2/neu positive breast cancer
    2.3.1 Testing for HER2/neu overexpression
3 The study data material
4 Segmentation of IHC-sample images
  4.1 The Minimum error thresholding algorithm
    4.1.1 Threshold selection
    4.1.2 Iterative threshold selection
    4.1.3 Implementation of the iterative algorithm
  4.2 Pre-processing
  4.3 Input to the segmentation
    4.3.1 Creation of grayscale image highlighting membrane regions
    4.3.2 Creation of grayscale image highlighting brown regions
  4.4 The segmentation method
  4.5 Post-processing
  4.6 Testing segmentation methods
  4.7 Statistical Model
5 Analysis of the segmentation method
  5.1 Analysis of different protocols
  5.2 Analysis of the effect of tissue thickness
  5.3 Analysis of the differences between two cameras
  5.4 Analysis of the effect of the focus
  5.5 Summary of the analysis
6 Discussion
7 Conclusion
8 The program
9 Future work
10 Bibliography
11 Figures

1 Appendix
  1.1 C++ code
    1.1.1 Main()
    1.1.2 Initialize()
    1.1.3 DetectBrownMembranes(CImage* RGB, bool method, int band, double& Output_Chrom_Blue)
    1.1.4 Pre_process.h
    1.1.5 Pre_Process(CImage* RGB)
    1.1.6 DetectBackground.h
    1.1.7 DetectBackground(CImage* RGB, double Means[3])
    1.1.8 Kittle.h
    1.1.9 Kittle(int* Hist, int Min, int Max)
    1.1.10 MakeGrayScale.h
    1.1.11 MakeMembrane(CImage* RGB, CImage* Image);
    1.1.12 MakeBlueChrom(CImage* Chrom, CImage* Image);
    1.1.13 MakeRedChrom(CImage* Chrom, CImage* Image);
    1.1.14 MakeInvertBlue(CImage* RGB, CImage* Image);
    1.1.15 Method.h
    1.1.16 Mulit(CImage* Membranes, CImage* Brown, CImage* RGB)
    1.1.17 Inter(CImage* Membranes, CImage* Brown, CImage* RGB)
  1.2 Output of dataset A
  1.3 GraphPad Prism 5 ANOVA of combinations
  1.4 Output from dataset C
  1.5 GraphPad Prism 5 calculations for the analysis of different protocols
  1.6 Output from dataset D
  1.7 GraphPad Prism 5 calculations for the analysis of the effect of tissue thickness
  1.8 Output from dataset E
  1.9 GraphPad Prism 5 calculations for the analysis of the differences between two cameras
  1.10 Output dataset F
  1.11 GraphPad Prism 5 calculations for the analysis of the affect of the focus


1 Introduction

Breast cancer is one of the most common types of cancer in the developed countries, and many people diagnosed with breast cancer die. In recent years scientists have discovered that tumors, in breast cancer and in other cancers, can have different characteristics and that these characteristics influence the prognosis and treatment of the disease. Just as a characteristic can affect the prognosis negatively, it can also affect it positively: scientists have tailored drugs that only affect cells with a specific characteristic, and these drugs are much more effective than less specific drugs. It is therefore important to learn early in the treatment of a cancer patient whether a tumor has any such characteristics.

One of the characteristics breast cancer can have is the overexpression of the Human Epidermal growth factor Receptor 2 gene, also known as the HER2/neu gene. HER2/neu is a gene that codes for growth and division of cells. A tumor with this characteristic tends to be more aggressive, spreading and growing faster than normal, which results in a worse prognosis, but a drug has been developed that specifically treats tumors with this characteristic. The drug, called Herceptin®, improves the chance of survival by up to 50%. To test for this characteristic a pathologist has to stain and then examine a biopsy taken from the tumor. This is a complicated process that is prone to human error. The most common procedure involves staining the outer cell membrane and then determining the amount of stain absorbed in the process. This process is called immunohistochemistry (IHC).

The aim of this thesis is twofold: to create a method based on image analysis that can automatically determine the HER2/neu status of a tissue sample, and then to test it. First a method will be developed. It uses images of IHC-stained samples as input and calculates the mean intensity of the brown color in the membranes as output. The method should be able to find the brown membranes by locating the membranes and the brown regions using image analysis. It is expected that the quantitative output can be used as a measure of the HER2/neu status.

The second aim is to test the sensitivity of the method. To analyze this, four influencing factors have to be examined: the effect of different staining reagents or protocols, the thickness of the tissue, the effect of two different cameras and the effect of the focus level with which the images are captured. Testing these four factors will give a good indication of how flexible the method is.

The staining protocols examined are the two most commonly used, Dako’s HercepTest® and Ventana’s Pathway® protocol. As they differ in the way they stain the tissue, it is crucial to analyze the effect they have on the output. As the thickness of the tissue affects the intensity of the color, it will probably also affect the output of the method; in what way will be analyzed using tissue of three different thicknesses. Digital cameras do not always produce the same images. There can be differences in the color reproduction, and this will affect the method. Two cameras, the PixeLINK A686 and the Olympus DP 71, will be tested. Lastly, the effect of unfocused input images on the output will be examined. When capturing images through a microscope one can easily lose focus, and it is therefore important to analyze the effect of images that are slightly out of focus.
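As a rough illustration of the output described above, the following Python sketch averages a brown-intensity value over the pixels that are flagged both as membrane and as brown. It is not the thesis implementation, which is written in C++. The function name, the list-of-lists image representation, the two boolean masks and the example values are all assumptions made for this illustration, and how the masks are actually obtained is left to the segmentation chapter.

```python
from typing import Sequence


def mean_brown_membrane_intensity(
    brown: Sequence[Sequence[float]],
    membrane_mask: Sequence[Sequence[bool]],
    brown_mask: Sequence[Sequence[bool]],
) -> float:
    """Mean of `brown` over pixels flagged in BOTH masks (0.0 if there are none).

    Hypothetical helper: `brown` holds one brown-intensity value per pixel,
    and the two masks mark membrane pixels and brown pixels respectively.
    """
    total = 0.0
    count = 0
    for row_values, row_membrane, row_brown in zip(brown, membrane_mask, brown_mask):
        for value, is_membrane, is_brown in zip(row_values, row_membrane, row_brown):
            if is_membrane and is_brown:
                total += value
                count += 1
    return total / count if count else 0.0


# Tiny made-up 2x3 image: three pixels are both membrane and brown.
brown = [[10.0, 200.0, 180.0],
         [20.0, 220.0, 30.0]]
membrane = [[False, True, True],
            [False, True, True]]
is_brown = [[False, True, True],
            [False, True, False]]

result = mean_brown_membrane_intensity(brown, membrane, is_brown)
print(result)  # (200 + 180 + 220) / 3 = 200.0
```

Returning 0.0 when no pixel qualifies is a choice made for this sketch only; a real pipeline might instead report such an image as not scorable.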


2 Cancer

The following chapter gives a short introduction to cancer, to breast cancer and then, in greater detail, to HER2/neu positive breast cancer. This will provide the reader with the knowledge necessary to comprehend the background for this thesis.

2.1 What is cancer?

Cancer normally materializes as tumors. A tumor is an abnormal mass of tissue within the body. Tumors can be either malignant, and thus dangerous to the host body, or benign, and thus of no immediate danger to the host body. For a tumor to be malignant it has to possess one or more of the following three properties: being aggressive (it grows and divides abnormally), invasive (it spreads to and destroys nearby tissue) or metastatic (it spreads to other parts of the body) (Cancer - Wikipedia). If a tumor is malignant it is called cancer.

Cancer is not a single disease but a group of diseases, all characterized by uncontrollable and abnormal growth of some type of cell. A cancer is named after the tissue from which it originates. A patient can, for instance, have breast cancer in the liver if the cancer cells originate from the breast but have metastasized to the liver. The patient would still be diagnosed with breast cancer, although the tumor was found in another organ. There are more than a hundred different types of cancer (What is cancer). The main types of cancer are listed below:

• Carcinoma - cancer that originates from epithelial cells, which for instance can be found in the skin or inside glands.
• Sarcoma - cancer that originates in bone, cartilage, fat, muscle, blood vessels, or other connective or supportive tissue.
• Leukemia - cancer that originates in blood-forming tissue such as the bone marrow.
• Lymphoma and myeloma - cancers that originate in the cells of the immune system.
• Central nervous system cancers - cancers that begin in the tissues of the brain and spinal cord.

Figure 2-1 Normal cell growth and abnormal cell growth leading to a tumor (What is cancer)

Tumors, and thereby cancer, start in cells that for some reason have become abnormal, either through a mutation or due to damage. Normal cells throughout the body grow and divide in a controlled way, thereby producing the cells needed to keep the body healthy. When a cell has reached a certain age or becomes damaged, it is programmed to die and is replaced by a new cell (What is cancer).


This normal process sometimes goes awry due to changes in the genetic material (DNA). This can result in mutations that affect the cell’s growth, longevity and/or division. When the cells no longer die and keep dividing, an unwanted accumulation of cells is created, see Figure 2-1. Such an accumulation is called a tumor. If the cells of a tumor have one of the three properties mentioned above, the tumor is called cancer (What is cancer). Not all types of cancer cause a solid tumor. Some, like leukemia, are formed in the blood or in organs that produce blood. Here the cancer cells flow with the blood and never form a tumor.

2.1.1 Therapy

There are a number of ways to treat cancer, ranging from drugs to radiation therapy to surgery (Cancer - Wikipedia). The therapy depends on the type of cancer diagnosed, as not all types of cancer respond to the same kind of therapy. Some types of cancer, like breast cancer, can be divided into a number of subcategories, each with a specific method of treatment. Normally the different therapies are used in combination, depending on the type of cancer. The types of therapy are surgical removal, chemotherapy, radiation therapy and immunotherapy. There are also a number of alternative therapies available, which are not described here.

• Chemotherapy is the use of drugs to remove or inhibit cancer cells. This group of drugs targets rapidly dividing cells. The drugs can therefore have severe side effects, as some healthy tissue, like the intestine, also contains rapidly dividing cells and can be damaged as well.
• Radiation therapy is the use of ionizing radiation to fight the cancer. The radiation damages the DNA of the target tissue, thereby preventing the cells in the tissue from dividing and growing. This damages both healthy cells and cancer cells, but as healthy cells regenerate faster they can usually recuperate between therapy sessions. Still, the goal is to reduce the damage to healthy tissue while maximizing the damage to cancer tissue.
• Immunotherapy is a therapy in which the body’s own immune system is made to fight the cancer. The immune system does not normally target cancer cells, as these are derived from normal cells and therefore do not express any antigen that triggers cell destruction. Herceptin®, which is an antibody to HER2/neu, is an example of this therapy. This is a new therapy that shows a lot of promise, as it only targets the cancer cells, thereby minimizing the damage to healthy tissue (Cancer - Wikipedia).

The optimal therapy depends on the cancer, the general health of the patient, the development of the tumor and its location. The development of a tumor is classified in stages from 0 to IV, where 0 means that the tumor is confined to the site of origin, also called in situ, and IV means that the tumor has metastasized to other organs or throughout the body (Cancer staging - Wikipedia).

2.1.2 The cause for cancer

There is no single cause for cancer, but a multitude. Research indicates that the causes can be environmental, inherited or, most likely, a combination of the two. Environmental factors that cause tumors are called carcinogens. Examples of carcinogens are radiation, some viral infections and various chemicals. It is well known that exposure to radiation or tobacco smoke causes cancer, but scientists keep finding new carcinogens. Mostly a carcinogen causes a specific type of cancer; tobacco smoke, for example, increases the risk of getting lung cancer but not breast cancer. Some types of cancer are caused by hereditary defects in the DNA. For instance, some forms of breast cancer can be caused by inherited mutations of the genes BRCA1 and BRCA2 (Cancer - Wikipedia). In many cases it is a complex combination of factors that causes the cancer to develop. This also explains why some people develop cancer after having been exposed to a carcinogen while others do not.

2.1.3 Cancer statistics

If all forms of cancer are seen as one disease, then cancer is the cause of some 8% of all deaths worldwide. In the developed countries cancer is the second most common cause of death (WHO | The top 10 causes of death). As cancer is a multitude of diseases, it would not be correct to view them as a single disease; on WHO’s list of the top 10 causes of death in the world the different types of cancer therefore feature independently. Even so, cancer types are placed high on the list, especially in the developed countries. Here cancer types occupy four of the top ten spots, and lung and bronchus cancer is the cause of a third of all deaths. That cancer causes more deaths in the developed countries is attributable to two facts. First, the risk of getting cancer increases with age, and people in developed countries live longer than people in other countries. Second, the use of industrial pollutants, many of which are carcinogens, is much higher in the developed countries (WHO | The top 10 causes of death).

2.2 Breast cancer

A cancer is called breast cancer when it originates from cells in the breast. The breast is mostly made of lobules (milk-producing glands), ducts (tiny tubes that carry the milk from the lobules to the nipple) and stroma (fatty tissue and connective tissue surrounding the ducts and lobules, blood vessels and lymphatic vessels) (ACS :: What is Breast Cancer), see Figure 2-2. Most breast cancers originate in the epithelial cells that line either the lobules or the ducts. If the cancer started in the lobular cells it is classified as lobular carcinoma, and if it started in the ducts as ductal carcinoma. Of the two forms of breast cancer, ductal carcinoma is the most common, accounting for 80% of all breast cancer cases (ACS :: What is Breast Cancer). It is predominantly women who get breast cancer, but men can get it as well, although rarely.

Figure 2-2 The structure of a healthy breast


2.2.1 Breast cancer statistics

It is estimated that 178,480 women and 2,030 men got breast cancer in the US in 2007 (Breast Cancer Home page). This makes it the most common type of cancer in women in the US. It is estimated that 40,460 women and 450 men died of the disease in the US in 2007. This makes it the third most common cause of cancer death in the US and the second most common among women in the US. The National Cancer Institute estimates that 12.7% of all women born in the US today will be diagnosed with breast cancer (Probability of Breast Cancer in American Women).

Figure 2-3 Breast cancer incidences worldwide: age-standardized rates (world population).

Though the disease is predominantly a western problem, see Figure 2-3, the number of incidences in developing countries has increased in the last decade. This can be attributed to the facts that people in all parts of the world live longer and that more and more people have a modern western lifestyle (Bray, McCarron, & Parkin, 2004). The number of women worldwide who get breast cancer has risen so much that it is now the type of cancer that causes the most deaths among women. It kills more women than lung cancer, liver cancer and stomach cancer, even though these types have a higher fatality rate.

2.2.2 Classification of breast cancer

When it is established that a patient has a tumor in the breast, the first thing to be determined is whether the tumor is benign or malignant. This is done by taking a biopsy of the tumor. The biopsy is then examined in detail to give a clearer picture of the tumor. A pathologist will categorize the tumor based on a histological examination. This categorization determines what kind of cells the tumor originated from. As mentioned above, most carcinomas are ductal carcinomas, i.e. the tumor originates from the ductal cells of the breast.

Next the pathologist will grade the tumor. The grading shows how much the tumor looks like normal tissue and helps to give a more accurate prognosis. The grading is normally done using the Bloom-Richardson grade, which is a grade from 1 to 3. A high score means a more aggressive and fast-growing tumor, while a low score indicates a slow-growing cancer with little risk of spreading. The rate of survival after 5 years is 95% if the tumor is graded 1, while it is 50% if the tumor is graded 3. If the tumor is still in situ, this grading is not performed (ACS :: How Is Breast Cancer Diagnosed?).

Breast cancer can also be classified by its protein and gene expression status. Knowledge about this status can be used to predict the prognosis of a patient and to choose the optimal treatment (Breast Cancer - Wikipedia). The protein expression status tells whether or not the cell surface has estrogen or progesterone receptors. Cell surface receptors are proteins that are embedded in the cell membrane, thereby creating a signaling pathway between the surroundings and the inside of the cell. The signaling starts when a circulating substance, for instance the hormone estrogen, binds to the receptor. This binding starts a cascade of signaling pathways which leads to a cellular response, for instance cell growth or division. If breast cancer cells have estrogen receptors (ER-positive) and/or progesterone receptors (PR-positive), the prognosis tends to be better than for breast cancers without these receptors. Furthermore, this type of breast cancer responds better to hormone therapies. Approximately 66% of all breast cancers have at least one of these receptors, and the percentage increases with age (ACS :: How Is Breast Cancer Diagnosed?).

Gene expression status is based on the expression of some specific genes. The most common gene to check for is the HER2/neu gene. Breast cancer with an amplification of this gene grows and spreads more aggressively than other breast cancer types. Newly developed medicine specifically targets cells with an amplification of HER2/neu, increasing the chance of a successful treatment. HER2/neu is explained in depth in section 2.3.

The tumor is also classified in accordance with the TNM system. TNM stands for Tumor, lymph Node and Metastasis. This system describes how much the cancer has spread. It uses the size of the tumor, local lymph node invasion and metastases to other parts of the body as a measure of how invasive the cancer is. A high score in this system gives the patient a worse prognosis (Cancer staging - Wikipedia).

When the tumor has been classified, a treatment is planned based on the knowledge obtained through the classification. This increases the patient’s chance of success.

2.3 HER2/neu positive breast cancer

HER2/neu is a growth-promoting gene found normally in all cells throughout the body. The gene codes for a protein that is also called HER2/neu. This protein is located on the membrane of the cell and is involved in the signaling pathway that leads to the growth or division of the cell. A protein located in this way is also called a receptor, since it receives messages from the outside of the cell and transmits them to the inside.


In the 1980s scientists first discovered a link between cancer and the human gene HER2/neu. In 1987 Dennis Slamon found that some breast cancers had an amplification of the HER2/neu gene. Amplification means that there are more copies of the gene than normal. A normal healthy cell has two copies of the HER2/neu gene, whereas a cancer cell showing amplification has multiple copies. This leads to an overexpression of the HER2/neu protein, which in turn leads to abnormal regulation of cell growth. Cells with an amplification of the HER2/neu gene can have up to 100 times overexpression of the HER2/neu protein. As Slamon found, this makes the cancers more aggressive, spreading and growing faster, giving an overall worse prognosis. He found that approximately 20-25% of all breast cancers show an overexpression of HER2/neu. Since then, the expression of HER2/neu has been an important indicator when determining the prognosis of a patient.

[Figure: normal HER2/neu expression compared with HER2/neu amplification and overexpression. With normal expression, HER2/neu proteins mediate normal cell growth and division. In HER2/neu-positive breast cancer, HER2/neu gene amplification results in 10 to 100 fold overexpression of the HER2/neu protein; persistent overexpression drives the tumor growth and leads to uncontrolled cell division and aggressive tumor growth.]

After making this discovery Slamon teamed up with the medical company Genentech to develop an antibody against HER2/neu that could block the HER2/neu protein from receiving signals, thereby preventing the abnormal growth. This antibody only affects cells that have an overexpression of HER2/neu, leaving normal healthy cells alone. The antibody is called trastuzumab. It received U.S. Food and Drug Administration approval in 1998 and is now marketed as the drug Herceptin® (HER2, 2008). It has been shown that a combination therapy of Herceptin® and chemotherapy greatly improves the response to the treatment. Another positive effect of Herceptin® is that the risk of recurrence of the cancer is much smaller when the drug is administered. The side effects of Herceptin® are very limited, making it a very useful tool when treating breast cancer (Tubs & Hsi, 2004).

2.3.1 Testing for HER2/neu overexpression

As the HER2/neu status has become an important factor when making the prognosis and planning the treatment, it has become routine to determine the status as soon as possible. The two most common ways to do this are immunohistochemistry (IHC) and Fluorescent In Situ Hybridization (FISH). The second quantifies the number of HER2/neu genes inside the cells, that is, the amplification of the gene; see Figure 2-4. Two newer methods, Chromogenic In Situ Hybridization (CISH) and Silver In Situ Hybridization (SISH), are becoming more and more popular but are still not widely used. All the methods require a biopsy sample to be removed from the tumor.

Figure 2-4 IHC and FISH targets for HER2/neu testing (DAKO, 2007)

Immunohistochemistry: This method measures the level of the HER2/neu protein on the cell membranes, also called the expression of the protein. Initially a tissue sample is taken from a biopsy and stained with an antibody which binds to the HER2/neu protein on the surface of the cell. A marker that binds to this antibody is then applied. The marker is an enzyme that turns brown when it binds to the antibody. The color change can be measured in a microscope. The result is that the membranes of cells with an overexpression of HER2/neu become dark brown. The deeper the brown color and the more complete the membrane staining, the larger the expression of HER2/neu (Immunohistochemistry - Wikipedia). In the staining process it is normal to perform a second staining, a counterstain, to emphasize the nuclei of all the cells. This stain is normally blue or light blue. Both stains will also stain some of the cytoplasm of the cells, turning it either light brown or very light blue. The cytoplasm is the area between the membrane and the nucleus. The two most common staining protocols are Dako’s HercepTest® and Ventana’s Pathway®, but the FDA has approved some home-mixed protocols for use in laboratories (Tubs & Hsi, 2004). Table 1 shows the HercepTest® scoring system and Figure 2-5 shows samples stained with it.

Table 1 The HercepTest® protocol's guide for scoring

When determining the HER2/neu score the pathologist looks at how dark brown the membrane stain is and how many cells show complete membrane staining. The sample is then scored from 0+ to 3+ as shown in Table 1. For the scores 0+ and 1+ the HER2/neu expression is considered negative, for the score 3+ it is considered positive and for the score 2+ it is considered weakly positive. A negative result means that the HER2/neu gene is not amplified, while a positive score means that it is amplified and that Herceptin® treatment should be considered for the patient. A score of 2+ requires a FISH test to confirm whether or not the HER2/neu gene is amplified. A pathologist scoring a sample biopsy first identifies the area where the tumor is located and then makes a detailed examination of that particular area. This is done to make certain that the area examined is actually the relevant tumor and not unwanted artifacts. Artifacts include areas where the tissue has become folded or has been exposed to heat, and the edge of the biopsy. All these artifacts can show elevated staining and can lead to a faulty score if they are not discarded. When evaluating the score the pathologist should disregard any stained cytoplasm. Only the level of membrane staining should be evaluated.

Figure 2-5 Images showing breast cancer samples stained with the HercepTest® protocol showing all four scores

Fluorescent In Situ Hybridization: This method quantifies the number of HER2/neu genes, also called the gene amplification. This is done by adding a fluorescent probe to the sample, which binds to the parts of the chromosomes that have a similar sequence. In the case of HER2/neu testing it binds to the HER2/neu gene. The fluorescent signal from the probe can then be quantified using a fluorescence microscope. A tumor is classified as positive (HER2/neu amplified) if there are more than two signals per cell, as shown in Figure 2-6 (Fluorescent in situ hybridization - Wikipedia).


Figure 2-6 Image showing a breast cancer sample stained with a fluorescent marker. The red signals represent HER2/neu genes and the green the chromosomes

Chromogenic In Situ Hybridization: This method is similar to FISH in that it quantifies the amplification of the HER2/neu gene. In contrast to FISH the probe is not fluorescent, and the method therefore does not require a fluorescence microscope when the probe signals are quantified. This makes CISH less complex than FISH, but it still produces a very accurate result (Madrid & Lo, 2004).

Silver In Situ Hybridization: This method is like CISH in that it quantifies the amplification of the HER2/neu gene with a probe that can be seen in a normal bright-field microscope. The difference is that the probe in SISH is metallic, which makes the probe signals resilient to fading and photobleaching (Immunohistochemistry - In Situ Hybridization).

The in situ hybridization methods are the most precise, since it is the actual number of copies of the HER2/neu gene that is quantified, but they cost approximately 10 times as much to perform and the staining procedure is more complex. The FISH method also requires a fluorescence microscope, which makes it even more costly and complex to perform. Since CISH and SISH are not yet widely used in laboratories, the following concentrates on the FISH and IHC methods. Both of these methods require a highly trained pathologist to score a sample. The FISH method is the simplest to score since it only involves counting the signals, but even so it takes a trained professional to distinguish overlapping cells and overlapping signals. Scoring an IHC staining is more difficult, as it involves deciding whether the membranes are completely stained and how dark the stain is. Again the pathologist has to differentiate between overlapping cells, but also has to identify and disregard artificial edges and folds in the tissue sample. For both methods it is crucial that the person performing the scoring is competent and experienced. It is therefore of the utmost importance that laboratories adhere to a strict self-testing regime and that they are monitored and tested by a centralized organization (Tubs & Hsi, 2004). The results from IHC testing should not fall outside the expected percentage of positive cases (15-20%). If a laboratory falls outside this range, its procedures should be examined (DAKO, 2007).


3 The study data material

The data used in this thesis comes from two different sources. These have been obtained through Visiopharm’s contacts in the medical research community. The first set of data, used to develop the method, comes from the Department of Molecular and Quantitative Pathology at Stavanger University Hospital, Norway. This data consists of images from 30 tumors stained using Dako’s HercepTest® protocol. Five images per tumor have been captured, giving a total of 150 images. The HER2/neu status of all cases is known and ranges from 2+ to 3+.

The second set of data comes from the Institute of Pathology at Aalborg Hospital. This data has been compiled for a large study whose purpose is to determine the optimal method for identifying those breast cancer patients that can benefit from treatment with Herceptin®. This is one of the largest studies ever made that compares the different methods available for determining the degree of HER2/neu. The data consists of two slides with a number of tissue samples on each. The samples are all tumors found by a pathologist and cut from a larger tissue biopsy. The data consists of 20 tumor samples: thirteen tumors of score Neg¹, one of 2+ and six of 3+. The samples have been stained with the Dako HercepTest® protocol. From each tissue sample 5 images have been captured using an Olympus DP 71 camera, giving a total of 100 images.

These two datasets have been combined into one large dataset, A. This set has been used to develop the method and to test whether it is able to score the tumor samples correctly. The data from Aalborg Hospital has also been used separately and is contained in dataset B. This subset has been used in the analysis of the effect of camera and focus level, along with datasets E and F.

Dataset C consists of 18 tissue samples from the Institute of Pathology at Aalborg Hospital. The samples have been cut from 9 biopsies, with two samples per biopsy. One sample of each pair has been stained with Ventana’s Pathway® protocol, the other with Dako’s HercepTest® protocol. One of the Pathway® samples has become folded, making that pair unusable. In total that leaves sixteen tumor samples, eight stained with the Pathway® protocol and eight stained with the HercepTest® protocol. The dataset consists of 5 images from each tumor case captured with an Olympus DP 71 camera.

Dataset D also consists of samples from Aalborg Pathology Department and is used to measure the effect of the slice thickness. This dataset consists of 24 samples, all stained with the Dako protocol. There are eight samples 1 µm thick, eight 3 µm thick and eight 8 µm thick. From each sample 5 images have been captured with an Olympus DP 71 camera, giving a dataset that consists of 120 images.

To test whether the choice of camera affects the method, a dataset E has been acquired using the samples in dataset B from Aalborg Hospital. The images in this dataset have been captured using a PixeLINK A686 camera. As the focus might affect the method, a dataset F has been acquired using the samples in dataset B from Aalborg Hospital. These images are all slightly out of focus and have been captured with an Olympus DP 71 camera.

¹ The institute does not distinguish between 0+ and 1+ but scores these as HER2/neu-negative, or Neg.


4 Segmentation of IHC-sample images

In this chapter a method to score an IHC-stained sample is proposed. The method is based on the observation that the brown membrane stain gets darker as the score increases. Samples of score 0+ show no stain, samples of score 1+ show a light to medium stain, samples of score 2+ show a medium dark stain and samples of score 3+ show a dark brown stain, see Figure 2-5. The expectation is that this difference in nuance can be measured and used to score IHC-stained samples automatically, giving a more reliable and consistent result than is achievable by a human observer. In this thesis the two scores 0+ and 1+ have been combined into one score, Neg. This is done to match the scores of the data material from Aalborg Pathology Department.

As the tumor cells are not homogeneously distributed in a tissue sample, a pathologist will use more than one field of view to determine the score. To emulate this, each tumor score is computed using five images. As a pathologist looks at all membranes in a field of view, the method uses the whole region defined as brown membrane to compute a sub-output. This sub-output is computed as the mean intensity of the chromatic blue color. When the five sub-outputs have been computed, one from each of the five images, the final output is calculated as the mean of the sub-outputs. Using the mean ensures that the output of the method is similar to what a pathologist would base a score upon and also makes the method less sensitive to noise.

Figure 4-1 A diagram of the proposed method: Pre-processing, Segmentation, Post-processing, Calculate output

The method consists of four steps, shown in Figure 4-1. Each step is described in detail below. To be as useful as possible the method needs to be easy to use and robust with respect to user errors. The method therefore incorporates a pre-processing step that performs a background-correcting white balance. This scheme first finds the background and then normalizes the pixel values based on it. It is described in chapter 4.2.

When the pre-processing has been performed, the next step is the actual segmentation of the image. In this thesis two segmentation methods are proposed and analyzed. The input to each method is two grayscale images computed from the original RGB-image. One image highlights the brown regions in the RGB-image and the other highlights the membranes. How to compute the image that highlights the membranes is described in chapter 4.3.1, and how to compute the image that highlights the brown regions is explained in the chapter following it. The first method splits the segmentation into two problems: segmenting the membranes (lines) and segmenting the brown areas in the image. Both segmentations can be solved by thresholding the two input images separately. Thresholding means that the desired region is defined as the pixels whose value exceeds, or is less than, some critical value, the threshold. As the overall segmentation the method uses the intersection of the two regions. The other method combines the two input images by multiplying them pixel by pixel. This creates an image that highlights the brown membranes, which makes it easy to define the region of interest using thresholding. The two methods are described in chapter 4.4.

As there can be some noise in the image, a post-processing step that removes small regions is performed. This step is described in chapter 4.5. Lastly the output is calculated as the mean of the chromatic blue within the region defined as brown membrane. The chromatic blue gives a good indication of how brown a region is, as the amount of blue changes with the intensity of the brown color: there is little blue in very dark brown and some blue in lighter brown. As both the pre-processing and the two segmentation problems are solved using the same thresholding algorithm, this algorithm is explained first.

4.1 The Minimum error thresholding algorithm

One of the simplest ways to solve a thresholding problem is to use a histogram-based thresholding algorithm. These algorithms are simple to implement and have a low computation time. They have been the subject of several studies and a multitude of variations have been developed. Glasbey (Glasbey, 1993) and Sezgin & Sankur (Sezgin & Sankur, 2004) have compared the effectiveness of some of these thresholding algorithms, and both have determined that the minimum error thresholding algorithm proposed by Kittler & Illingworth (Kittler & Illingworth, 1986) is one of the best, both with regard to reaching the optimal threshold and with regard to computation time. This algorithm has therefore been chosen as the thresholding algorithm used in this thesis. Since the optimal solution to an image segmentation depends on the image one is trying to segment, it is hard to say whether this algorithm is optimal for the problems considered in this thesis. However, since both Glasbey (1993) and Sezgin & Sankur (2004) have performed tests on images or histograms similar to the ones found in this thesis, it is reasonable to expect that the algorithm will also perform well on the given problem. Kittler & Illingworth (1986) proposed two algorithms; the second is an iterative version of the first. Since the iterative algorithm is the faster of the two and, with a few exceptions, reaches an acceptable threshold, it has been used in this thesis.

4.1.1 Threshold selection

The algorithm proposed by Kittler & Illingworth (1986) exploits the fact that the histogram of a grayscale image can be viewed as an estimate of the probability density function of the mixed population comprising the gray levels of object and background pixels. These two populations can be assumed to be normally distributed with distinct means and standard deviations. Using this assumption the population parameters can be inferred from the gray level histogram. The underlying idea of the method is to optimize a criterion function related to the average pixel classification error rate (Kittler & Illingworth, 1986).


As mentioned above, Kittler & Illingworth (1986) proposed two methods. Both will be described, as the second is an iterative version of the first. It is the second method that has been implemented in this thesis.

Let the pixels of an image have the gray level value $g$, where $g$ can assume values in $[0, n]$. The distribution of $g$ can then be summed up as the histogram $h(g)$. The histogram can be viewed as an estimate of the probability density function $p(g)$ containing the gray levels of both background and object pixels. $p(g)$ can be defined as

$$p(g) = \sum_{i=1}^{2} P_i \, p(g \mid i) \tag{1}$$

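where $p(g \mid i)$ is the probability density function for one of the two objects. If an arbitrary threshold $T$ is chosen, the two ensuing pixel populations can be modeled by a normal density $h(g \mid i, T)$ with the parameters $\mu_i(T)$, $\sigma_i(T)$ and a prior probability $P_i(T)$ defined as

$$P_i(T) = \sum_{g=a}^{b} h(g) \tag{2}$$

$$\mu_i(T) = \frac{\sum_{g=a}^{b} h(g)\, g}{P_i(T)} \tag{3}$$

$$\sigma_i^2(T) = \frac{\sum_{g=a}^{b} \left(g - \mu_i(T)\right)^2 h(g)}{P_i(T)} \tag{4}$$

where

$$a = \begin{cases} 0 & i = 1 \\ T + 1 & i = 2 \end{cases} \tag{5}$$

and

$$b = \begin{cases} T & i = 1 \\ n & i = 2 \end{cases} \tag{6}$$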

Using the two models for the normal densities $h(g \mid i, T)$, $i = 1, 2$, the conditional probability $e(g, T)$ of gray level $g$ belonging to the correct object is given by

$$e(g, T) = \frac{h(g \mid i, T) \cdot P_i(T)}{h(g)}, \qquad i = \begin{cases} 1 & g \le T \\ 2 & g > T \end{cases} \tag{7}$$

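This can be simplified, as $h(g)$ is independent of both $i$ and $T$ and can therefore be ignored. Taking the logarithm of the remaining numerator, multiplying by $-2$ and dropping the additive constant $\log 2\pi$ of the normal density gives

$$\varepsilon(g, T) = \left(\frac{g - \mu_i(T)}{\sigma_i(T)}\right)^2 + 2 \log \sigma_i(T) - 2 \log P_i(T), \qquad i = \begin{cases} 1 & g \le T \\ 2 & g > T \end{cases} \tag{8}$$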

This can be regarded as an alternative index of correct classification performance (Kittler & Illingworth, 1986). For the whole image the average performance figure can be described by the criterion function

$$J(T) = \sum_{g} h(g) \cdot \varepsilon(g, T) \tag{9}$$

The criterion function for a given $T$ indirectly reflects the amount of overlap between the object and background populations. If $J(T)$ is computed while $T$ is varied over all gray levels, $J(T)$ will change, reflecting the change in the overlap between the density functions. The lower the value of $J(T)$, the smaller the overlap and consequently the smaller the classification error. The threshold $T$ that yields the lowest value of $J(T)$ is the minimum error threshold, and the selection of the optimal threshold $\tau$ can be described as

$$J(\tau) = \min_{T} J(T) \tag{10}$$

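The criterion function $J(T)$ can be rewritten as

$$J(T) = \sum_{g=0}^{T} h(g) \left[ \left(\frac{g - \mu_1(T)}{\sigma_1(T)}\right)^2 + 2 \log \sigma_1(T) - 2 \log P_1(T) \right] + \sum_{g=T+1}^{n} h(g) \left[ \left(\frac{g - \mu_2(T)}{\sigma_2(T)}\right)^2 + 2 \log \sigma_2(T) - 2 \log P_2(T) \right] \tag{11}$$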

Substituting 2, 3 and 4 into 11 gives

$$J(T) = 1 + 2 \left[ P_1(T) \log \sigma_1(T) + P_2(T) \log \sigma_2(T) \right] - 2 \left[ P_1(T) \log P_1(T) + P_2(T) \log P_2(T) \right] \tag{12}$$

The computation of this function is straightforward, and its minimum can be found relatively easily as it is a smooth function. This gives a simple solution to the threshold problem, but it should be noted that the models $h(g \mid i, \tau)$, $i = 1, 2$, will be biased estimates of the true components of the mixture of normal distributions. The cause of this is that the tails of the distributions are truncated when the histogram is partitioned. Kittler & Illingworth assume that the effect of this bias is small enough to be ignored (Kittler & Illingworth, 1986).
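As an illustration of the non-iterative selection, the sketch below evaluates the criterion function of equation 12 for every candidate threshold and returns the one with the lowest value. It is not the thesis implementation: the function name `minimumErrorThreshold` is hypothetical, the histogram is normalized internally so that it sums to one, natural logarithms are used, and thresholds that leave a population empty or with zero variance are skipped (an assumption made here to keep the logarithms defined).

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Returns the threshold T that minimizes J(T) of equation 12 for the
// histogram hist[0..n], or -1 if no threshold gives two populations
// with non-zero variance.
int minimumErrorThreshold(const std::vector<double>& hist) {
    const int n = static_cast<int>(hist.size()) - 1;
    double total = 0.0;
    for (double h : hist) total += h;
    if (n < 1 || total <= 0.0) return -1;

    int best = -1;
    double bestJ = std::numeric_limits<double>::infinity();
    for (int T = 0; T < n; ++T) {
        // Zeroth, first and second moments of population 1 (g <= T)
        // and population 2 (g > T), using the normalized histogram.
        double P[2] = {0, 0}, S1[2] = {0, 0}, S2[2] = {0, 0};
        for (int g = 0; g <= n; ++g) {
            const int i = g <= T ? 0 : 1;
            const double h = hist[g] / total;
            P[i] += h;
            S1[i] += h * g;
            S2[i] += h * g * g;
        }
        if (P[0] <= 0.0 || P[1] <= 0.0) continue;

        double J = 1.0;
        bool valid = true;
        for (int i = 0; i < 2; ++i) {
            const double mu = S1[i] / P[i];
            const double var = S2[i] / P[i] - mu * mu;
            if (var <= 1e-12) { valid = false; break; }
            // 2 * P_i * log(sigma_i) - 2 * P_i * log(P_i)
            J += 2.0 * P[i] * std::log(std::sqrt(var)) - 2.0 * P[i] * std::log(P[i]);
        }
        if (valid && J < bestJ) { bestJ = J; best = T; }
    }
    return best;
}
```

The moments are recomputed from scratch for every threshold to keep the sketch close to equations 2 to 4; cumulative sums would avoid the inner loop.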


Figure 4-2 Three population models for thresholds $T = 100$, $T = 115$ and $T = 130$. The black line defines the histogram of an IHC-image. The red and green lines are the population density functions. The blue area represents the binarisation error. The optimal threshold is found as $T = 115$

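4.1.2 Iterative threshold selection

This method builds upon the result of the method described above but is computationally less intensive. It uses dynamic clustering to iteratively minimize $J(T)$ as defined above. The method uses the fact that the criterion function value can be decreased for any starting threshold $T$ by assigning the pixel values to one of the classes according to the Bayes minimum error rule defined in terms of the parameters $\mu_i(T)$, $\sigma_i(T)$ and $P_i(T)$. The rule is defined below:

$$\text{if} \quad \left(\frac{g - \mu_1(T)}{\sigma_1(T)}\right)^2 + 2 \left[ \log \sigma_1(T) - \log P_1(T) \right] \;\lessgtr\; \left(\frac{g - \mu_2(T)}{\sigma_2(T)}\right)^2 + 2 \left[ \log \sigma_2(T) - \log P_2(T) \right] \quad \text{then} \quad \begin{matrix} g \to 1 \\ g \to 2 \end{matrix} \tag{13}$$

That is, $g$ is assigned to population 1 if the left-hand side is the smaller and to population 2 if it is the larger.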

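The gray level at which the two sides are equal can be found as the solution to the following quadratic equation

$$g^2 \left[ \frac{1}{\sigma_1^2(T)} - \frac{1}{\sigma_2^2(T)} \right] - 2g \left[ \frac{\mu_1(T)}{\sigma_1^2(T)} - \frac{\mu_2(T)}{\sigma_2^2(T)} \right] + \frac{\mu_1^2(T)}{\sigma_1^2(T)} - \frac{\mu_2^2(T)}{\sigma_2^2(T)} + 2 \left[ \log \sigma_1(T) - \log \sigma_2(T) \right] - 2 \left[ \log P_1(T) - \log P_2(T) \right] = 0 \tag{14}$$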

Repeating this procedure with the new threshold $T$ reduces the criterion function value even further. The algorithm terminates when the threshold becomes constant.

Step 1. Choose an arbitrary initial threshold $T$;
Step 2. Calculate the parameters $\mu_i(T)$, $\sigma_i(T)$ and $P_i(T)$, $i = 1, 2$;
Step 3. Calculate a new threshold by solving equation 14;
Step 4. If the new threshold is the same as the old one, terminate the algorithm; otherwise go to Step 2.

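According to Niemistö, the mean of the histogram is a good initial threshold and can be used instead of an arbitrary one (Niemistö, 2004). Niemistö further shortens equation 14 to:

$$g^2 \left[ \frac{1}{\sigma_1^2(T)} - \frac{1}{\sigma_2^2(T)} \right] - 2g \left[ \frac{\mu_1(T)}{\sigma_1^2(T)} - \frac{\mu_2(T)}{\sigma_2^2(T)} \right] + \left[ \frac{\mu_1^2(T)}{\sigma_1^2(T)} - \frac{\mu_2^2(T)}{\sigma_2^2(T)} + \log \frac{\sigma_1^2(T)\, P_2^2(T)}{\sigma_2^2(T)\, P_1^2(T)} \right] = 0 \tag{15}$$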

The integer part of the solution to this quadratic equation gives the new threshold. Denoting the coefficient of $g^2$ by $w_0$, the coefficient of $-2g$ by $w_1$ and the constant term by $w_2$, the solution is found as

$$T = \frac{w_1 + \sqrt{w_1^2 - w_0 w_2}}{w_0} \tag{16}$$

Again the algorithm needs to run until convergence. This solution is the one that has been implemented in this thesis. Both forms of the algorithm fail to converge if the quadratic equation has no real solution, and the solution is still biased, as it is for the non-iterative algorithm.

4.1.3 Implementation of the iterative algorithm

The iterative algorithm has been implemented in C++ using the following notation taken from (Niemistö, 2004). Niemistö makes use of the fact that the two populations are parts of a whole. Therefore only sums over the whole histogram and over one of the populations need to be computed. The quantities for the second population can then be obtained as the part of the whole that is not in the first population. The histogram is defined as $y_0, y_1, \ldots, y_n$, where $y_i$ is the number of pixels in the image with gray level $i$ and $n$ is the maximum gray level in the image, in this case 255. The threshold is denoted $t$. The following sums and definitions are used:

$$A_j = \sum_{i=0}^{j} y_i \quad \text{for } j = 0, 1, \ldots, n \tag{17}$$

$$B_j = \sum_{i=0}^{j} i\, y_i \quad \text{for } j = 0, 1, \ldots, n \tag{18}$$

$$C_j = \sum_{i=0}^{j} i^2 y_i \quad \text{for } j = 0, 1, \ldots, n \tag{19}$$

$$\mu_t = \frac{B_t}{A_t} \tag{20}$$

$$v_t = \frac{B_n - B_t}{A_n - A_t} \tag{21}$$

$$p_t = \frac{A_t}{A_n} \tag{22}$$

$$q_t = \frac{A_n - A_t}{A_n} \tag{23}$$

$$\sigma_t^2 = \frac{C_t}{A_t} - \mu_t^2 \tag{24}$$

$$\tau_t^2 = \frac{C_n - C_t}{A_n - A_t} - v_t^2 \tag{25}$$

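Using the above definitions, equation 15 becomes

$$x^2 \left[ \frac{1}{\sigma^2} - \frac{1}{\tau^2} \right] - 2x \left[ \frac{\mu}{\sigma^2} - \frac{v}{\tau^2} \right] + \left[ \frac{\mu^2}{\sigma^2} - \frac{v^2}{\tau^2} + \log \frac{\sigma^2 q^2}{\tau^2 p^2} \right] = 0 \tag{26}$$

Defining the three terms as $w_0$, $w_1$ and $w_2$ gives

$$w_0 = \frac{1}{\sigma^2} - \frac{1}{\tau^2} \tag{27}$$

$$w_1 = \frac{\mu}{\sigma^2} - \frac{v}{\tau^2} \tag{28}$$

$$w_2 = \frac{\mu^2}{\sigma^2} - \frac{v^2}{\tau^2} + \log \frac{\sigma^2 q^2}{\tau^2 p^2} \tag{29}$$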

The threshold is then found using equation 16. The definitions are then recalculated and a new threshold computed until the algorithm converges. If equation 26 has no solution the algorithm fails. This has been solved by using the mean of the histogram as the threshold if at any point the threshold goes outside the pixel range $[0, 255]$ of a grayscale image. This problem occurs rarely, and as the mean can be used as an approximation of the true threshold and is very easy to find, it has been applied in this thesis.

During the testing of the algorithm it was discovered that the threshold could fluctuate between two successive integers, so that the algorithm would never converge. This has been solved by checking whether the difference between the new threshold and the old threshold lies within ±3. If this happens five successive times, and if the difference between the new threshold and the one found two iterations back also lies within ±3, the algorithm terminates.

During the development of the segmentation method it was found useful to be able to limit the range within which the algorithm searches for the threshold. It is therefore possible to restrict the range to $[min, max]$. Doing this limits the sums 17, 18 and 19 to this range:

$$A_{max} = \sum_{i=min}^{max} y_i, \qquad B_{max} = \sum_{i=min}^{max} i\, y_i, \qquad C_{max} = \sum_{i=min}^{max} i^2 y_i$$

The actual C++ code has been implemented in the file Kittler.cpp and can be seen in appendix 1.1.8.
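For orientation, a compact sketch of the iterative scheme built from equations 16 to 29 is given here. It is not the code from Kittler.cpp: the function name `iterativeMinErrorThreshold` and the `maxIter` parameter are hypothetical, natural logarithms are assumed, the search range is fixed to the full histogram, and the stop rule is simplified to "threshold unchanged or equal to the one two iterations back" instead of the ±3 check described above. The fallback to the histogram mean follows the text.

```cpp
#include <cmath>
#include <vector>

// Iterative minimum error thresholding on hist[0..n].
// Returns the converged threshold, or the histogram mean as a fallback.
int iterativeMinErrorThreshold(const std::vector<double>& hist, int maxIter = 100) {
    const int n = static_cast<int>(hist.size()) - 1;
    if (n < 1) return 0;

    // Cumulative sums A_j, B_j and C_j (equations 17 to 19).
    std::vector<double> A(n + 1), B(n + 1), C(n + 1);
    double a = 0.0, b = 0.0, c = 0.0;
    for (int i = 0; i <= n; ++i) {
        a += hist[i];
        b += i * hist[i];
        c += double(i) * i * hist[i];
        A[i] = a; B[i] = b; C[i] = c;
    }
    if (A[n] <= 0.0) return 0;
    const int meanT = static_cast<int>(B[n] / A[n]);  // initial threshold

    int t = meanT, prev = -1;
    for (int iter = 0; iter < maxIter; ++iter) {
        if (A[t] <= 0.0 || A[n] - A[t] <= 0.0) return meanT;
        const double mu = B[t] / A[t];                                 // eq. 20
        const double nu = (B[n] - B[t]) / (A[n] - A[t]);               // eq. 21
        const double p = A[t] / A[n];                                  // eq. 22
        const double q = (A[n] - A[t]) / A[n];                         // eq. 23
        const double sigma2 = C[t] / A[t] - mu * mu;                   // eq. 24
        const double tau2 = (C[n] - C[t]) / (A[n] - A[t]) - nu * nu;   // eq. 25
        if (sigma2 <= 0.0 || tau2 <= 0.0) return meanT;

        const double w0 = 1.0 / sigma2 - 1.0 / tau2;                   // eq. 27
        const double w1 = mu / sigma2 - nu / tau2;                     // eq. 28
        const double w2 = mu * mu / sigma2 - nu * nu / tau2
                        + std::log((sigma2 * q * q) / (tau2 * p * p)); // eq. 29
        const double disc = w1 * w1 - w0 * w2;
        if (disc < 0.0 || w0 == 0.0) return meanT;   // no usable real solution

        const double x = (w1 + std::sqrt(disc)) / w0;                  // eq. 16
        if (!(x >= 0.0 && x <= n)) return meanT;     // outside the gray level range
        const int next = static_cast<int>(x);        // integer part
        if (next == t || next == prev) return next;  // converged or two-value cycle
        prev = t;
        t = next;
    }
    return t;
}
```

Restricting the search to a range $[min, max]$, as described above, would amount to building the cumulative sums over that range only.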


4.2 Pre-processing

The algorithm performs one pre-processing step before the segmentation. This step ensures that the image is correctly white-balanced and that the background intensity is the same in all images, which gives more dependable and consistent results. White balance is performed to ensure that the white colors in the image are reproduced correctly (Color balance - Wikipedia). This affects all three color-bands, making the colors look as they do when seen by the naked eye.

Figure 4-3 An IHC-image captured with a wrong white balance and the same image after the algorithm has performed the white balance

Two images of the same field of view can differ greatly in intensity, since both the light intensity of the microscope and the exposure time of the camera can be changed manually. The pre-processing step normalizes this by setting all background pixels, which should be white, to 90% of total white. If the step is not performed it is impossible to interpret the colors, as there is no guarantee that they are consistent and can be compared. It can also make it difficult to perform the segmentation in the next step. The background-correcting algorithm used here is described below:

Step 1. Create a new image using the intensity of the original image
Step 2. Find a threshold $t_1$ between objects and background using the algorithm described in chapter 4.1 with the range $[0, 255]$
Step 3. Find a new threshold $t_2$ using the range $[t_1, 255]$
Step 4. Segment the image using the threshold $t_2$
Step 5. Remove regions with less than $\frac{\text{total \# pixels}}{1000}$ pixels as noise
Step 6. Compute the mean of the background for each color-band
Step 7. Compute the normalizing factor for each color-band, here set to 90% of total white, so $\frac{226}{\text{mean}_{\text{band}}}$
Step 8. Multiply all pixels in each color-band by the respective normalizing factor

Using the intensity to find the background reduces some of the complexity of the problem. As the images are captured on a bright-field microscope, where the sample is lit from below using normal white illumination, the background is bright, i.e. has a high intensity. The tissue of the sample absorbs and diffracts some of the light and will therefore have a lower intensity. This turns the problem into a binary problem, where the brightest regions are to be separated from the darker regions. As the illumination should be even across the image, this problem can be solved with thresholding. The intensity of an image is defined as the mean of the three color-bands.

$$Intensity = \frac{Intensity_{Red} + Intensity_{Green} + Intensity_{Blue}}{3} \tag{30}$$

Figure 4-4 The intensity of the IHC-image shown in Figure 4-3. The blue arrow points to background, the yellow arrow to transparent cells, and the red arrow to stained cancer cells

As can be seen in Figure 4-4, the samples contain some transparent tissue. This tissue contains no cells with an overexpression of the HER2/neu protein. The cytoplasm of these cells may have absorbed some of the blue stain used to enhance the nuclei, turning it very light blue. Since these regions and the background have almost the same intensity they are very hard to separate. To solve this problem the thresholding algorithm described in chapter 4.1 is used twice, the first time with the range $[0, 255]$ and the second time with the range $[t_1, 255]$. This reduces the amount of light blue cytoplasm that is included in the background region without reducing the amount of real background. When the threshold $t_2$ has been found, the image is segmented using $t_2$: pixels with a value above $t_2$ are defined as background, while pixels with a value below or equal to $t_2$ are defined as objects. When the background has been found, the mean of the color-specific intensity within this region is computed for each of the three color-bands. These means are then used to compute the normalizing factors. The factor is set to 90% of total white, meaning that pixels with a value equal to the mean will be reset to 226. This value has been chosen as it ensures that not too many values exceed the maximum gray level 255. This ends the pre-processing step and the segmentation can be performed.
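The normalization part of the pre-processing (steps 6 to 8) might look roughly like the sketch below. It is an illustration under stated assumptions rather than the thesis code: the threshold `t2` is passed in instead of being computed, the noise removal of step 5 is omitted, values are rounded and clamped to 255, and the `Rgb` struct and the function names `intensity` and `normalizeBackground` are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// One RGB pixel with 8-bit channels (illustrative type).
struct Rgb { std::uint8_t r, g, b; };

// Intensity of a pixel: mean of the three color-bands (equation 30).
double intensity(const Rgb& p) { return (p.r + p.g + p.b) / 3.0; }

// Scales each color-band so that the mean background value of that band
// becomes `target` (226 in the thesis). Pixels with an intensity above
// `t2` are treated as background. Returns false if no usable background
// is found, in which case the image is left unchanged.
bool normalizeBackground(std::vector<Rgb>& image, double t2, double target = 226.0) {
    std::array<double, 3> sum = {0.0, 0.0, 0.0};
    std::size_t count = 0;
    for (const Rgb& p : image) {
        if (intensity(p) > t2) {
            sum[0] += p.r; sum[1] += p.g; sum[2] += p.b;
            ++count;
        }
    }
    if (count == 0 || sum[0] == 0.0 || sum[1] == 0.0 || sum[2] == 0.0) return false;

    // Normalizing factor per band: target / mean of the background.
    std::array<double, 3> factor;
    for (int k = 0; k < 3; ++k) factor[k] = target / (sum[k] / count);

    // Round to nearest and clamp to the maximum gray level 255.
    auto scale = [](std::uint8_t v, double f) {
        return static_cast<std::uint8_t>(std::min(255.0, v * f + 0.5));
    };
    for (Rgb& p : image) {
        p.r = scale(p.r, factor[0]);
        p.g = scale(p.g, factor[1]);
        p.b = scale(p.b, factor[2]);
    }
    return true;
}
```

Clamping is one possible way to handle the few values that would exceed 255 after scaling; the thesis does not state how these are treated.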

4.3 Input to the segmentation

The two proposed segmentation methods both use two grayscale images as input, one that highlights membranes and one that highlights brown regions. Each of the two images is created from the original RGB-image. In the first method an individual threshold is computed for each of these images using the thresholding algorithm described in chapter 4.1. The first image is segmented into a membrane and a non-membrane region, while the second image is segmented into a brown and a non-brown region. The intersection of the membrane region and the brown region is used to segment the brown membranes in the original image. In the second method the two grayscale images are multiplied pixel by pixel to create one image in which brown membranes have a high intensity. A threshold is then computed for this image, and the region defined by a high intensity is used to segment the brown membranes in the original image. In the following two chapters the methods used to compute the two grayscale images are explained.
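The difference between the two strategies can be illustrated with the sketch below. It assumes that both input images are stored as flat arrays with values in [0, 255] in which high values mark membranes and brown regions respectively, and that the thresholds have already been computed. The names `Gray`, `segmentByIntersection` and `segmentByProduct`, and the rescaling of the product by 255, are illustrative choices and not taken from the thesis.

```cpp
#include <cstddef>
#include <vector>

// Grayscale image stored as a flat array with values in [0, 255].
using Gray = std::vector<double>;

// First method: threshold each input image separately and keep the
// pixels that lie in both the membrane region and the brown region.
std::vector<bool> segmentByIntersection(const Gray& membrane, double tMembrane,
                                        const Gray& brown, double tBrown) {
    std::vector<bool> mask(membrane.size(), false);
    for (std::size_t i = 0; i < membrane.size() && i < brown.size(); ++i)
        mask[i] = membrane[i] > tMembrane && brown[i] > tBrown;
    return mask;
}

// Second method: multiply the two images pixel by pixel (rescaled here
// to stay in [0, 255]) and apply a single threshold to the product.
std::vector<bool> segmentByProduct(const Gray& membrane, const Gray& brown,
                                   double tProduct) {
    std::vector<bool> mask(membrane.size(), false);
    for (std::size_t i = 0; i < membrane.size() && i < brown.size(); ++i)
        mask[i] = membrane[i] * brown[i] / 255.0 > tProduct;
    return mask;
}
```

In the first variant a pixel must exceed both thresholds on its own, whereas in the second a very strong response in one image can partly compensate for a weaker response in the other.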

4.3.1 Creation of grayscale image highlighting membrane regions

The cell membranes can be seen as lines or curves in the image. There exist a number of filters that can enhance curves. Many of them, e.g. gradient filters, are sensitive to noise. In this thesis an approach using local approximating polynomials is used. The approach has been devised by Michael Grunkin and makes use of the fact that the coefficients of local approximating polynomials can be interpreted as convolution filters (Grunkin, 2004). The approach will be described in 1D but can easily be extended to 2D.

Figure 4-5 An IHC-image and the corresponding image after it has been filtered with a Local Linear filter

Assume that a signal $g(x)$ has been sampled with the sample density $\Delta x$. The goal is to approximate the sampled function with a polynomial of order $P$ inside a window of a given size. The window can be seen as the interval $x \in [-K\Delta x; K\Delta x]$. The polynomial can then be described as:

$$f_P(x) = \sum_{i=0}^{P} \theta_i x^i = \Theta^T Z(x) \tag{31}$$

using the vector notation

$$\Theta^T = [\theta_0, \theta_1, \ldots, \theta_P] \tag{32}$$

$$Z(x)^T = [1, x, \ldots, x^P] \tag{33}$$


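The coefficients of the polynomial are estimated using the least-squares estimator, which minimizes the objective functional:

$$S(\Theta) = \sum_{i=-K}^{K} \left( g(i \cdot \Delta x) - f(i \cdot \Delta x) \right)^2 = \sum_{i=-K}^{K} \left( g(i \cdot \Delta x) - \Theta^T Z(i \cdot \Delta x) \right)^2 \tag{34}$$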

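If the objective functional is differentiated with respect to \(\Theta\) and \(S'(\Theta) = 0\) is solved, one gets the normal equation:

\[ \Theta = \left[ \sum_{i=-K}^{K} Z(i \cdot \Delta x)\, Z(i \cdot \Delta x)^T \right]^{-1} \sum_{i=-K}^{K} g(i \cdot \Delta x)\, Z(i \cdot \Delta x) \tag{35} \]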

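Using the following notation for the inverse coefficient matrix:

\[ B = \left[ \sum_{i=-K}^{K} Z(i \cdot \Delta x)\, Z(i \cdot \Delta x)^T \right]^{-1} = \begin{bmatrix} b_{00} & b_{01} & \cdots & b_{0P} \\ b_{10} & b_{11} & \cdots & b_{1P} \\ \vdots & \vdots & \ddots & \vdots \\ b_{P0} & b_{P1} & \cdots & b_{PP} \end{bmatrix} \tag{36} \]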

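Using the notation from above, a second order polynomial with a support window \([-K; K]\) and \(\Delta x = 1\) gives:

\[ \Theta^T = [\theta_0, \theta_1, \theta_2], \qquad Z(x)^T = [1, x, x^2] \tag{37} \]

and

\[ B = \left[ \sum_{i=-K}^{K} Z(i)\, Z(i)^T \right]^{-1} = \left[ \sum_{i=-K}^{K} \begin{bmatrix} 1 & i & i^2 \\ i & i^2 & i^3 \\ i^2 & i^3 & i^4 \end{bmatrix} \right]^{-1} \tag{38} \]

Combining this gives:

\[ B = \begin{bmatrix} 2K+1 & 0 & \frac{1}{3}K(K+1)(2K+1) \\ 0 & \frac{1}{3}K(K+1)(2K+1) & 0 \\ \frac{1}{3}K(K+1)(2K+1) & 0 & \frac{1}{15}K(K+1)(2K+1)(3K^2+3K-1) \end{bmatrix}^{-1} = \begin{bmatrix} \frac{3(3K^2+3K-1)}{(2K+3)(2K+1)(2K-1)} & 0 & \frac{-15}{(2K+3)(2K+1)(2K-1)} \\ 0 & \frac{3}{K(K+1)(2K+1)} & 0 \\ \frac{-15}{(2K+3)(2K+1)(2K-1)} & 0 & \frac{45}{K(K+1)(2K+3)(2K+1)(2K-1)} \end{bmatrix} \tag{39} \]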

Using this and equation 35 the following estimators for the coefficients of the polynomial can be computed:

\[ \theta_2 = \sum_{i=-K}^{K} \frac{15\left(3i^2 - K(K+1)\right)}{K(K+1)(2K+1)(2K-1)(2K+3)}\, g(i) \]

\[ \theta_1 = \sum_{i=-K}^{K} \frac{3i}{K(K+1)(2K+1)}\, g(i) \tag{40} \]

\[ \theta_0 = \sum_{i=-K}^{K} \frac{3\left(3K^2 + 3K - 1 - 5i^2\right)}{(2K+1)(2K-1)(2K+3)}\, g(i) \]

Each of these estimators is a convolution mask. This can be generalized as follows. Define a convolution filter with support in the interval \([-K; K]\) as:

\[ \Phi = [\phi_{-K}, \phi_{-K+1}, \ldots, \phi_0, \ldots, \phi_{K-1}, \phi_K] \tag{41} \]

Convolving this filter with the signal gives:

\[ G_\Phi(x) = (\Phi \otimes g)(x) = \sum_{i=-K}^{K} g(x + i \cdot \Delta x) \cdot \phi_i \tag{42} \]

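Each coefficient of the polynomial can be expressed as a convolution mask, \(\Theta_0, \Theta_1, \ldots, \Theta_P\), each containing \(2K + 1\) filter coefficients as shown in equation 41. The filter coefficients can be computed as:

\[ \Theta_m(j) = \left( \frac{1}{\Delta x} \right)^m \sum_{i=0}^{P} B_{mi}\, j^i = \left( \frac{1}{\Delta x} \right)^m \left[ B\, Z(j) \right]_m, \qquad m = 0, \ldots, P \text{ and } j = -K, \ldots, K \tag{43} \]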

One application of a local approximation to a sampled signal is smoothing and differentiation. To compute the i'th order derivative of the signal at the center of the window in which the signal is approximated, the following is used:

\[ \frac{d^i f(x)}{dx^i} = w(i) \cdot \theta_i(x), \qquad w(i) = \begin{cases} i! & , i > 0 \\ 1 & , i = 0 \end{cases} \tag{44} \]

\(\theta_i\) is the result of applying the filter kernel \(\Theta_i\) to the signal in the window \(x \in [-K; K]\):

\[ \theta_i(x) = (\Theta_i \otimes g)(x) = \sum_{j=-K}^{K} g(x + j \cdot \Delta x) \cdot \Theta_i(j) \tag{45} \]

Using this, the zero-order derivative yields the predicted value in the center of the window, which corresponds to smoothing of the signal. When this is extended to 2D, the Laplacian can be approximated. The Laplacian is often used as a measure of the local image curvature. The Laplacian is obtained as

\[ \nabla^2 f(x, y) = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 2 \cdot \left( \theta_{20}(x, y) + \theta_{02}(x, y) \right) \tag{46} \]

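Instead of only looking at these two derivatives, one could consider the Hessian matrix:

\[ H(x, y) = \begin{bmatrix} \theta_{20}(x, y) & \theta_{11}(x, y) \\ \theta_{11}(x, y) & \theta_{02}(x, y) \end{bmatrix} \tag{47} \]

Based on this the eigenvalues may be computed as:

\[ \begin{bmatrix} \lambda_1(x, y) \\ \lambda_2(x, y) \end{bmatrix} = \frac{1}{2} \begin{bmatrix} \theta_{20}(x, y) + \theta_{02}(x, y) + \sqrt{\left( \theta_{20}(x, y) - \theta_{02}(x, y) \right)^2 + 4\,\theta_{11}(x, y)^2} \\ \theta_{20}(x, y) + \theta_{02}(x, y) - \sqrt{\left( \theta_{20}(x, y) - \theta_{02}(x, y) \right)^2 + 4\,\theta_{11}(x, y)^2} \end{bmatrix} \]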

The sum of the eigenvalues, \(\lambda_1(x, y) + \lambda_2(x, y) = \theta_{20}(x, y) + \theta_{02}(x, y)\), equals the Laplacian in equation 46 up to the constant factor 2. The two filters defined above represent a non-linear combination of the directional derivatives. The first eigenvalue can be interpreted as the maximum local curvature, while the second is the curvature in the direction perpendicular to the first. It is the first eigen filter that is used in this thesis to emphasize the membranes in the image. This whole chapter is based on a conversation with Michael Grunkin and his paper on the subject of local approximating polynomials (Grunkun, 2004).
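The eigenvalue computation can be sketched per pixel as follows, assuming the three second order coefficients have already been computed for that pixel. The function name `hessianEigenvalues` is hypothetical.

```cpp
#include <cmath>
#include <utility>

// Eigenvalues of the symmetric 2x2 matrix [[t20, t11], [t11, t02]],
// returned as (lambda1, lambda2) with lambda1 >= lambda2.
std::pair<double, double> hessianEigenvalues(double t20, double t11, double t02) {
    const double trace = t20 + t02;
    const double diff = t20 - t02;
    const double root = std::sqrt(diff * diff + 4.0 * t11 * t11);
    return {0.5 * (trace + root), 0.5 * (trace - root)};
}
```

The first element of the returned pair corresponds to the first eigen filter response; the sum of both elements always equals t20 + t02.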


4.3.2 Creation of grayscale image highlighting brown regions

The color brown is defined as a combination of red, green and blue, with most red and least blue. What makes brown special and difficult to segment is that the three colors can have almost any intensity as long as they adhere to this rule, so brown can be anything from bright, almost red, to very dark, almost black. The intensity span, both in total and for each individual color-band, is so large that the intensity, total or color-specific, cannot be used to locate the brown regions. Both of the proposed segmentation methods need as input a grayscale image in which the brown regions are highlighted. Below, three ways to create such an image are suggested.

Converting either the red or the blue color-band to the chromaticity color-space could be used. In the chromatic color-space a color is represented by its proportion of the total instead of, as in the RGB color-space, by its intensity. To convert a color to the chromatic color-space, the intensity of the color is divided by the sum of the three intensities. Chromatic red is thus defined as:

\[ \mathit{Chromatic}_{red} = \frac{\mathit{Intensity}_{red}}{\mathit{Intensity}_{red} + \mathit{Intensity}_{green} + \mathit{Intensity}_{blue}} \tag{48} \]

Since red is the dominant color in brown, the red chromaticity will have a high intensity in brown regions. For the blue color the opposite is true, as there is little blue in the brown color resulting in a low intensity in regions with a brown color. Both colors can be used to highlight the brown regions but there are problems with both.

Figure 4-6 An image in the RGB-space (left), followed by its chromatic red band (middle) and its chromatic blue band (right). Notice how the brown regions result in high and low intensity respectively

Using the chromatic red is an excellent method for locating brown, as the proportion of red is large in brown regions. As can be seen in Figure 4-6, brown regions have a high intensity in the chromatic red color-space while the background and the blue nuclei have a low intensity. It is therefore straightforward to segment the brown tissue from the rest using the Kittler & Illingworth (1986) threshold algorithm. The method is not without problems though. The major problem when using chromatic red is that brown can be very dark, almost black. The intensity of each color in the RGB color-space will then be low and almost level, resulting in a low red chromaticity. These dark brown/black regions are a problem because they are not included when the thresholding is performed, which biases the results. See Figure 4-7.

Figure 4-7 Close-up of the RGB-image and the corresponding chromatic red image. Notice that the dark membrane in the RGB-image has a low intensity when converted to chromatic red

Using the chromatic blue solves the problem with dark areas. As there is little blue in brown, light or dark, the intensity of chromatic blue will be low in these regions. This makes it possible to include both dark and light brown when performing the thresholding. Using chromatic blue causes another problem, however: tissue that is very light brown or grayish contains little blue and is therefore included as brown. This tissue is unwanted and biases the result. See Figure 4-8.

Figure 4-8 An RGB-image and the corresponding chromatic blue image. The yellow arrows point to the unwanted light brown tissue that is included as brown when using the chromatic blue

A third method (from now on called MaxInt-Blue) is not to use the chromatic colors at all, but to use the following equation to compute the pixel intensity:

\[ \mathit{Pixel\ intensity} = \frac{255 - \mathit{Intensity}_{blue}}{255} \tag{49} \]

The result is that pixels with a low blue intensity get a high intensity. As the blue intensity is low in brown regions, both light and dark, these get a high intensity. This method gives good results, but there is a problem: some blue regions have a low blue intensity and thus get a high intensity when converted. With either of the proposed segmentation methods these blue regions can be included in the brown region, giving a biased result. See Figure 4-9.

Figure 4-9 An RGB-image and the corresponding image created with the above mentioned method. Notice how the blue nucleus gets a high intensity compared to the background

As none of the three methods of computing a grayscale image is obviously better than the others, all three will be used as inputs. The three methods of conversions are implemented in C++ in the file MakeMask.h in the functions MakeRedChrom, MakeBlueChrom and MakeInvertBlue. See appendix 1.1.10.

4.4 The segmentation method

To make a segmentation method that is as close as possible to what a pathologist would do, the two grayscale images that highlight the membranes and the brown regions have to be combined. Two methods have been examined, the Intersection method and the Multiplication method. Both methods use the grayscale images defined above to define the regions of interest. In the Intersection method, a threshold is computed for each of the two images and the intersection of the regions defined in the two images is used as the region of interest. In the Multiplication method the two images are multiplied pixel by pixel. The output grayscale image has a high intensity where both input images have a high intensity, i.e. in regions with brown membranes. The region defined by a threshold computed for this image defines the brown membrane regions. As the initial threshold found by the algorithm is too low, a second threshold is computed. The first threshold \(t_1\) is computed in the range [0, 255], then the second threshold \(t_2\) is computed in the range \([t_1, 255]\). This ensures that only the pixels with the highest intensity are included in the regions of interest and gives a result that shows less variation.
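The two-stage thresholding can be sketched independently of the threshold algorithm itself. In the sketch below the selector is passed in as a function; `meanLevel` is only a stand-in used to make the example runnable and is not the Kittler & Illingworth algorithm used in the thesis. All names are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <functional>

using Histogram = std::array<unsigned long, 256>;

// Any threshold selector restricted to the gray-level range [lo, hi].
using ThresholdFn = std::function<int(const Histogram&, int, int)>;

struct TwoStageThreshold {
    int t1;  // threshold computed in [0, 255]
    int t2;  // threshold computed in [t1, 255]
};

TwoStageThreshold twoStageThreshold(const Histogram& hist, const ThresholdFn& select) {
    const int t1 = select(hist, 0, 255);
    const int t2 = select(hist, t1, 255);
    return {t1, t2};
}

// Stand-in selector for demonstration only: the mean gray level in [lo, hi].
// It is NOT the threshold algorithm of the thesis.
int meanLevel(const Histogram& hist, int lo, int hi) {
    unsigned long long count = 0;
    unsigned long long weighted = 0;
    for (int v = lo; v <= hi; ++v) {
        const unsigned long long n = hist[static_cast<std::size_t>(v)];
        count += n;
        weighted += n * static_cast<unsigned long long>(v);
    }
    return count == 0 ? lo : static_cast<int>(weighted / count);
}
```

With equal pixel counts at the gray levels 0, 100 and 200, the stand-in selector gives a first threshold of 100 and a second of 150, so only the brightest group remains above the final threshold.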

Flowchart of the two segmentation methods:

Intersection method: Create grayscale images (image highlighting brown, image highlighting membranes) -> Threshold each image -> Compute intersection of the two regions defined by the two thresholds -> Segmentation done -> Calculating output (mean of the chromatic blue)

Multiplication method: Create grayscale images (image highlighting brown, image highlighting membranes) -> Multiply the two images to create one image that highlights brown membranes -> Threshold -> Segmentation done -> Calculating output (mean of the chromatic blue)

Both methods have been implemented in C++ in the Method.h file as the functions Multi and Inter. These can be seen in the appendix 1.1.15.
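The difference between the two methods can be illustrated with the sketch below. It is not the code of the functions Multi and Inter in Method.h. It assumes two 8-bit grayscale images of equal size stored row by row, thresholds that are supplied by the caller, "high intensity" meaning strictly above the threshold, and a product that is rescaled to the range [0, 255]; all names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Gray = std::vector<std::uint8_t>;  // 8-bit grayscale image, row-major
using Mask = std::vector<bool>;          // true = region of interest

// Pixels strictly above the threshold form the region.
Mask thresholdMask(const Gray& image, int threshold) {
    Mask out(image.size());
    for (std::size_t i = 0; i < image.size(); ++i) {
        out[i] = image[i] > threshold;
    }
    return out;
}

// Intersection method: a pixel must be above its threshold in both images.
Mask intersectionMask(const Gray& membrane, int tMembrane,
                      const Gray& brown, int tBrown) {
    Mask out(membrane.size());
    for (std::size_t i = 0; i < membrane.size(); ++i) {
        out[i] = membrane[i] > tMembrane && brown[i] > tBrown;
    }
    return out;
}

// Multiplication method, first step: pixel-wise product rescaled to [0, 255].
// The result is thresholded afterwards with thresholdMask.
Gray multiplyImages(const Gray& membrane, const Gray& brown) {
    Gray out(membrane.size());
    for (std::size_t i = 0; i < membrane.size(); ++i) {
        out[i] = static_cast<std::uint8_t>(membrane[i] * brown[i] / 255);
    }
    return out;
}
```

In both cases only pixels that are bright in the membrane image and in the brown image survive; the methods differ in whether the combination happens after thresholding (intersection) or before it (multiplication).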

4.5 Post-processing

Some noise is expected in the image, arising both when the image is captured and when the segmentation is performed. To give results that are as accurate as possible this noise has to be removed. This is done by removing every region defined as brown membrane that is smaller than 1/10,000 of the total number of pixels. This threshold seems reasonable, as it means that in an image of size 1360x1024, regions smaller than 139 pixels are removed.
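The size filter can be sketched as a connected-component pass over the segmentation mask. The thesis does not state which connectivity is used, so 4-connectivity is an assumption here, and the function names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Smallest region size that is kept: 1/10,000 of the image area.
std::size_t minRegionSize(std::size_t width, std::size_t height) {
    return width * height / 10000;
}

// Clears every 4-connected region of `mask` with fewer than minSize pixels.
// `mask` is row-major with width * height elements.
void removeSmallRegions(std::vector<bool>& mask, int width, int height,
                        std::size_t minSize) {
    std::vector<bool> visited(mask.size(), false);
    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};
    for (int start = 0; start < width * height; ++start) {
        if (!mask[start] || visited[start]) continue;
        std::vector<int> region{start};  // pixels of the current region
        visited[start] = true;
        // Breadth-first growth: `region` doubles as the work queue.
        for (std::size_t head = 0; head < region.size(); ++head) {
            const int x = region[head] % width;
            const int y = region[head] / width;
            for (int d = 0; d < 4; ++d) {
                const int nx = x + dx[d];
                const int ny = y + dy[d];
                if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                const int n = ny * width + nx;
                if (mask[n] && !visited[n]) {
                    visited[n] = true;
                    region.push_back(n);
                }
            }
        }
        if (region.size() < minSize) {
            for (int p : region) mask[p] = false;
        }
    }
}
```

For a 1360x1024 image `minRegionSize` returns 139, matching the limit quoted in the text.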

4.6 Testing segmentation methods

Both methods have been tested with all the input images using dataset A; the outputs can be seen in appendix 1.2. For a method to work it should be able to separate the output into groups matching those of the input. To test this, an ANOVA followed by a Bonferroni test has been performed on the output. An ANOVA calculates whether there is a significant difference between the means of the groups involved. As the ANOVA only shows whether at least one of the groups differs from the rest, a Bonferroni test has also been conducted. This test checks whether there is a significant difference between the individual groups. All combinations show a clear difference between the groups, so none of them can be discarded. It is assumed that the groups have a Gaussian distribution and that they have the same variance. Full results of the analyses can be seen in appendix 1.3.

Figure 4-10 Scatter plot of the output when using the Multiplication method and the MaxInt-Blue input image (x-axis: score, with the groups Neg, 2+ and 3+; y-axis: intensity value). The lines show the mean and the standard deviation

It should be noted that some overlap between the groups is to be expected, as the end-point scores are based on a human evaluation. This introduces a bias and results in a larger variation within the groups. Samples with a score of Neg also have very little or no brown in them. The regions found by the methods are consequently very small, just above the limit of what is considered noise. This makes the methods sensitive to noise in Neg samples.

A visual inspection of the regions defined by the methods as brown membranes shows that the combinations Multiplication/Chromatic blue and Intersection/MaxInt-blue are the ones that best define the brown membranes. The two combinations Multiplication/Chromatic red and Multiplication/MaxInt-blue define too much; both have a tendency to include dark blue regions. The last combination, Intersection/Chromatic red, is too conservative and does not define enough. As it is difficult to define precisely what should be included unless one has been trained as a pathologist, this assessment is only the untrained judgment of the author of this thesis.

Table 2 The green regions are the regions defined by the methods as brown membranes. Images for a sample of score Neg: original image, Multi/Chrom Blue, Multi/Chrom Red, Multi/MaxInt-Blue, Inter/Chrom Red, Inter/MaxInt-Blue

Table 3 The green regions are the regions defined by the methods as brown membranes. Images for a sample of score 2+: original image, Multi/Chrom Blue, Multi/Chrom Red, Multi/MaxInt-Blue, Inter/Chrom Red, Inter/MaxInt-Blue

Table 4 The green regions are the regions defined by the methods as brown membranes. Images for a sample of score 3+: original image, Multi/Chrom Blue, Multi/Chrom Red, Multi/MaxInt-Blue, Inter/Chrom Red, Inter/MaxInt-Blue

Given the information gained from the analysis of the combinations of method/input, it is impossible to conclude which combination is the best. All combinations will therefore be used when testing the robustness of the overall method. In the following chapters all combinations of method/input will be referred to simply as the method, as the text would be too cumbersome to read otherwise. If one combination performs differently from the rest it will be mentioned.

4.7 Statistical Model

To predict the score from the computed brown intensity of a sample, a model has to be generated. In this chapter such a model is computed and tested. To develop the model a test-set has been generated. This set consists of the output from twenty-four samples, eight from each score. The samples have all been picked randomly.

Table 5 The samples used in the test-set

Neg          2+                  3+
MB 4         H 11726 05 B 40x    H 10222 05 B 40x
MB 5         H 11814 05 A 40x    H 10952 03 A 40x
MB 6         H 12664 06 B 40x    H 11722 05 A 40x
MB 7         H 12907 05 B 40x    H 17844 06 C 40x
MB 8         H 6527 06 B 40x     H 22475 05 H 40x
MB 9         H 8212 06 A 40x     MB III 11
MB III 1     H 8865 05 F 40x     MB III 7
MB III 13    MB III 12           MB III 9

To create the model the output is plotted in a scatter plot, with the score on the x-axis and the computed intensity-measure on the y-axis. The model is then generated by fitting a curve through the output and using the function for this curve as the model, see Figure 4-11. To find the model that fits best, a number of functions have been tested and compared using GraphPad Prism 5. The functions tested are: straight line, second order polynomial, third order polynomial, semi-log line (x-axis linear, y-axis log), semi-log line (x-axis log, y-axis linear), log-log (both axes log) and Gaussian. The group Neg, consisting of the scores 0+ and 1+, has been put at the x-value 1. The result of the test shows that all six combinations are best represented by a linear model, see Table 6.

Table 6

Combination          Model-type       Model
Multi/Chrom blue     Straight line    a = −0.07104, b = 0.4426
Multi/Chrom red      Straight line    a = −0.0778, b = 0.4697
Multi/MaxInt-blue    Straight line    a = −0.07369, b = 0.4621
Inter/Chrom blue     Straight line    a = −0.0648, b = 0.4141
Inter/Chrom red      Straight line    a = −0.07476, b = 0.4709
Inter/MaxInt-blue    Straight line    a = −0.01420, b = 0.3457

The models computed above can be used to predict the score of the sample from which the output is computed. To check how well the models predict the score, they have been used to predict the score of the whole of dataset A. The score is computed by solving the equation defined by the model with respect to x. The result is Neg for x below 1.5, 2+ for x between 1.5 and 2.5, and 3+ for x above 2.5.

Table 7

Combination          True    False
Multi/Chrom blue     48      2 (2 2+ -> Neg)
Multi/Chrom red      44      5 (3 2+ -> Neg; 2 3+ -> 2+)
Multi/MaxInt-blue    44      6 (1 Neg -> 2+; 3 2+ -> Neg; 2 3+ -> 2+)
Inter/Chrom blue     37      13 (3 Neg -> 2+; 9 2+ -> 3+; 1 2+ -> Neg)
Inter/Chrom red      45      4 (1 Neg -> 2+; 1 2+ -> Neg; 1 2+ -> 3+; 1 3+ -> 2+)
Inter/MaxInt-blue    45      6 (1 Neg -> 2+; 3 2+ -> Neg; 1 2+ -> 3+)

As can be seen from Table 7 the best combination, given the test-set, is Multiplication/Chromatic blue. Intersection/Chromatic blue is, with thirteen misclassifications, the worst, and as it has at least twice as many misclassifications as any of the others it has been discarded as a viable combination. The other combinations will all be used in the analyses performed in chapter 5.

Figure 4-11 Scatter plot of the output with the fitted model (x-axis: score; y-axis: intensity measure; series: Neg, 2+, 3+ and the model)

5 Analysis of the segmentation method

In the last chapter it was shown that it is possible to develop a method that automatically scores IHC-stained tumors using images captured with a microscope. For this technique to be useful in practice, its limits and robustness have to be analyzed. In this thesis four factors that may affect the acquisition of the input images have been analyzed. These factors are:

- Protocol: the Dako HercepTest® vs. the Ventana Pathway® protocol
- The thickness of the tissue sample
- Camera: the method is used on images captured with two different cameras
- Focus: it is tested whether the method can handle images slightly out of focus

The effect of these four factors on the output will be analyzed and compared to the relevant output or model. To ensure that there is still separation between the scores, the output for each new factor is analyzed using either ANOVA or a t-test at a 99% confidence level. This ensures that the method is still able to separate the means of the scores and that the difference between the means is significant. If this is not the case, it must be concluded that the factor affects the images in such a way that it is impossible to score them using this method.

If there is still separation between the scores, the correlation between the output from images with the factor applied and the output from images without it is calculated. If the coefficient of determination is large, the output from the factor images closely resembles that of the original images, and there is a chance that the factor can be encompassed in the model.

If the scores are separated and there is a significant correlation, a regression line is fitted to the factor output. This line is then compared to the model to check that the model can be used to predict scores from such an output. First the slopes are compared by calculating a two-tailed P-value testing the null hypothesis that the slopes are identical (the lines are parallel). This is done at a 95% confidence level. If the slopes are identical, the y-intercepts are compared, again by calculating a two-tailed P-value at a 95% confidence level. If both the slope coefficient and the y-intercept are identical, there is no cause to change the model and it can be concluded that the factor has little or no influence on the output calculated by the method. If the lines differ but the separation and correlation tests above were both successful, it is concluded that the factor should either be avoided or, if this is not possible, that a new model for this factor should be calculated.

If the output fits the model, a last test is performed to see how well the model predicts the scores of the samples. This is done in order to find the best combination. All these tests have been performed using GraphPad Prism 5 and the output of each test can be seen in the appendix.

It was unfortunately not possible to capture images with a controlled wrong white-balance or images with an increased or decreased light-intensity. As the method should compensate for these two factors, it would have been useful to have data showing how well this compensation works. It would be possible to create the images artificially by altering the intensity of one or more of the color-bands, but it was deemed more important to analyze the factors that could be controlled.

5.1 Analysis of different protocols

As both Dako's HercepTest® and Ventana's Pathway® protocols are widely used throughout the pathology community, it is vital to know whether the method works on images stained with either. There is an obvious difference between the two protocols, as the stain of the Pathway® protocol is much darker than the stain of the HercepTest®.

Figure 5-1 Two images of the same 3+ sample stained with the Pathway® protocol (left) and the HercepTest® protocol (right). It is apparent that the membrane-staining is much darker when using the Pathway® protocol

This difference will affect the output of the method, as the darker brown contains less chromatic blue. The effect is that the output from samples stained with the Pathway® protocol will be lower than that from samples stained with the HercepTest® protocol, especially for samples of the scores 2+ and 3+, as these contain the most brown. The difference for samples of the scores 0+ or 1+ will not be as pronounced, as these contain little or no brown.

To check that the method can correctly separate the output of samples stained with the Pathway® protocol, a two-tailed t-test has been performed. A t-test is used because the dataset only contains samples of two scores, Neg and 3+. The test shows that there is a good separation between the groups. However, as the dataset contains no samples of the score 2+, one cannot fully conclude that the method works on samples stained with the Pathway® protocol.

Comparing the output from the Pathway® and HercepTest® samples using the correlation shows that there is a very significant correlation between the two. The coefficient of determination is higher than 0.85 for all combinations, meaning that at least 85% of the variation in X can be explained by Y. Performing a linear regression on the results from dataset E and comparing this with the model computed in chapter 4.7 shows that the slope of the Pathway® line is somewhat steeper than that of the model line. This is as expected, as the darker membranes of the Pathway® test give a lower chromatic blue intensity. Using GraphPad Prism 5 to compare the model with the regression line computed from the data shows that these two lines are almost certainly not the same. But as the data is very limited, it is impossible to conclude that this is so.

Figure 5-2 Comparison of the output computed from Pathway stained samples and the model (panel: Multi/Chrom blue; x-axis: score; series: Model, Pathway)

Predicting the scores of the Pathway® results with the models computed in chapter 4.7 gives surprisingly good results. The Multiplication/Chromatic blue model classifies all samples correctly, while the others each misclassify one Neg sample as a 2+ sample. But as there are no samples of the score 2+, it is not possible to conclude how well the models perform. Based on the amount of data available it is therefore not possible to decide whether the computed model can determine the actual score of a Pathway® sample or whether a new model has to be computed to predict Pathway® scores. With a more complete dataset this could have been tested.

Given that there is a significant difference between the means of the scores Neg and 3+, and given the high correlation, the conclusion is that the method seems able to separate samples stained with the Pathway® protocol. As there are no samples of the score 2+ in the dataset, it is not possible to tell whether this group can be separated from the others. But given the high correlation between the outputs, there is a good chance that outputs from Pathway® samples scored 2+ will behave like their counterparts stained with HercepTest®. It must also be concluded that two different models are needed to predict the score of a sample, because the regression lines are so different. Using the existing model to predict the score did not cause any problems with the given dataset, but considering the limited size of this dataset it is hard to justify any other conclusion.

5.2 Analysis of the effect of tissue thickness

When a biopsy is analyzed, a slice is first cut from it. It is this slice that the pathologist looks at and evaluates. Dako's HercepTest(TM) Interpretation Manual recommends that the slices have a thickness of 3-4 μm (DAKO, 2007). The tissue slices are cut from the biopsy using a microtome. This machine is able to cut these very thin slices with a precision better than 0.05 μm if calibrated properly. As the calibration is difficult to perform precisely, or if it has not been done within a given period, the slices can vary in thickness.

Figure 5-3 Images of a 3+ sample at a 1 µm(left), 3 µm(middle) and 8 µm(right) thickness. It is easy to see the difference in the color-intensity

The colors captured by the camera are generated when white light from the microscope is absorbed or refracted as it hits the tissue. Since the amount of absorbed light depends on the length of tissue the light has to travel through, it can be hypothesized that the thickness affects the intensity of the brown regions. To analyze this, samples of 1 µm and 8 µm thickness are compared to the model, which has been built using samples of 3 µm thickness.

Figure 5-4 The regression lines computed from the output of two different thickness' compared with the model. It is clearly seen that the thickness affect the intensity of the brown. (panel: Nonlin fit of Multi/Chrom blue; x-axis: score; y-axis: intensity value; series: 1 um, 8 um, Model)

Computing the regression lines for the output from slices with the thicknesses 1 µm and 8 µm and comparing these with the model confirms that the thickness affects the brown intensity. In the images from the thick slices there is more color, and this affects the output intensity. While the scores 2+ and 3+ have a lower output intensity compared to the model, due to the more intense brown, the score Neg has for most of the combinations a higher output intensity, due to a more intense blue that is included in the regions of interest. The opposite is the case for the thin slices. Here the scores 2+ and 3+ have a higher output intensity, due to the lower intensity of the brown color, while the score Neg has a lower or equal output intensity. This results in a steep slope for the thick slice and a shallower slope for the thin slice, as can be seen in Figure 5-4.

As the intensity depends on the thickness, it would be useful if the method could be modified in such a way that the dependency was removed. Assuming that the thickness of the tissue is known, this is feasible. If the intensity is plotted as a function of the thickness, a curve can be fitted to the output. The function of this curve can then be used to compute the intensity as if the slice had a thickness of 3 µm. For all combinations the curve that best fits the output is a straight line. It should be noted that even though this is the curve that fits best, the \(R^2\)-value, which tells how good the fit is, is very low for all the combinations.

\[ \mathit{intensity} = a_{thick} \cdot \mathit{thickness} + b_{thick} \tag{50} \]

Figure 5-5 The intensity plotted as a function of the thickness. (panel: Slope Multi/Chrom blue; x-axis: thickness; y-axis: intensity value)

The normalized intensity can be computed by isolating \(b_{thick}\) in equation 50 and then equating the expression at the thickness 3 µm with the expression at the thickness x:

\[ \mathit{intensity}_3 - a_{thick} \cdot \mathit{thickness}_3 = \mathit{intensity}_x - a_{thick} \cdot \mathit{thickness}_x \]

\[ \mathit{intensity}_3 = \mathit{intensity}_x - a_{thick} \cdot \mathit{thickness}_x + a_{thick} \cdot \mathit{thickness}_3 = \mathit{intensity}_x + a_{thick} \left( \mathit{thickness}_3 - \mathit{thickness}_x \right) \tag{51} \]

This makes it possible to normalize the intensity to a thickness of 3 µm. The computed results are only as good as the fit of the curve, and since the fit of the curve is bad, the results are too. This method has therefore been discarded in this thesis.

Instead of converting the intensity, the model can be modified to take the thickness into account. This can be done by fitting a line to the outputs from the 1 µm and 8 µm slices. The function that best fits the output is a straight line. As the model is also a straight line, it should be possible to make the slope coefficient and the y-intercept of the model depend on the slice thickness. Making the y-intercept and the slope coefficient dependent on the slice thickness would make the overall model able to predict the score no matter what thickness the tissue slice has.

Figure 5-6 Plot of the y-intercept(left) and the slope coefficient(right) vs. the slice thickness

Plotting the y-intercept and the slope coefficient against the slice thickness gives the graphs shown above. The models that best fit these are two quadratic equations. These are used to calculate the coefficients of the model. That gives the following model:

\[ \mathit{score} = \frac{\mathit{intensity} - b_{intercept}}{a_{slope}} \tag{52} \]

\[ \mathit{score} = \frac{\mathit{intensity} - \left( b_{2\_intercept} \cdot \mathit{thickness}^2 + b_{1\_intercept} \cdot \mathit{thickness} + b_{0\_intercept} \right)}{b_{2\_slope} \cdot \mathit{thickness}^2 + b_{1\_slope} \cdot \mathit{thickness} + b_{0\_slope}} \tag{53} \]

The quadratic equations for each combination are shown in Table 8.

Table 8 The quadratic equations used to compute the y-intercept and slope coefficient. x is the slice thickness and y gives either the y-intercept or the slope coefficient.

Combination          y-intercept                                 Slope coefficient
Multi/Chrom blue     y = 0.4131 + 0.01108x − 0.00042x²           y = −0.03661 − 0.01338x + 0.000635x²
Multi/Chrom red      y = 0.4151 + 0.01832x − 0.00004143x²        y = −0.03422 − 0.01617x + 0.000548x²
Multi/MaxInt-blue    y = 0.4032 + 0.01849x + 0.0003771x²         y = −0.03034 − 0.01616x + 0.000571x²
Inter/Chrom red      y = 0.3609 + 0.02369x − 0.00198x²           y = −0.02606 − 0.01613x + 0.00107x²
Inter/MaxInt-blue    y = 0.4301 + 0.004461x + 0.003047x²         y = −0.03836 − 0.01084x + 0.00043x²

Using this improved model it is possible to predict the scores of a sample no matter what its thickness is, as long as the thickness is known. Trying to predict the scores of the samples in dataset D gives very good results. The combination Multiplication/Chromatic blue has no misclassifications while the Intersection/Chromatic red combination has two and the other three have one each.


Both techniques require a method for measuring the thickness if they are to work on live samples. Such a method has not been developed in this thesis, but an idea of how it could be done is presented here. The thickness could be measured using the focus-measure. This will have a certain value when the focus is either below or above the slice and will change when the focus is on the slice. The thickness can then be measured as the distance traveled between the first and the last fluctuation in the focus-measure. Whether this will work in practice is not known, but the theory seems promising.

In this chapter it has been shown that the tissue thickness affects the intensity-measure used to score the samples. Two techniques to circumvent this problem have been shown. Modifying the intensity directly turned out to be problematic, so instead the model for predicting the score was modified to cope with the thickness. As both techniques depend on a method for measuring the thickness of the tissue, such a method has been proposed.

5.3 Analysis of the differences between two cameras

There is a variety of cameras on the market with different qualities. The quality of a camera can affect the images acquired with it, and it is therefore necessary to test how this affects the method. The two cameras used in this test were a PixeLINK A686 and an Olympus DP 71. Both are mid-range cameras, but while the PixeLINK is in the low end of the range the DP 71 is in the high end. The DP 71 has a CCD-chip while the PixeLINK has a CMOS-chip. A CMOS-chip is cheaper and is prone to static noise due to the architecture of the chip. There is also a difference in the color reproduction of the two cameras. The DP 71 reproduces colors more correctly, with better temperature and tint.

Figure 5-7 Image captured with an Olympus DP 71 (left) and with a PixeLINK (right). Notice the difference in the color temperature and the tint of the images.

It should be noted that there are some magnification differences between the images captured with the two cameras. This is attributable to the relation between the C-mount on the microscope and the size of the chip. A C-mount is an adapter that connects the camera to the microscope. The chip sizes of the two cameras are different, and as it wasn’t possible to obtain a C-mount that matched the PixeLINK camera, images acquired with it have a smaller field of view, giving them a larger magnification.


This may introduce some bias: as the field of view is smaller, the number of membranes is smaller, leading to higher variation in the output.

First it was tested whether the method was able to separate the scores when the input images had been captured with a PixeLINK camera. As there is only one sample with the score 2+, it is impossible to perform an ANOVA on the output to verify this. Performing a t-test on the difference between the scores Neg and 3+ shows a significant difference. This partially proves that the method is able to separate samples with different scores.

Figure 5-8 Scatter plot of the output from dataset B and E computed with the combination Multiplication/Chromatic red. (Plot title: T-test Multi/CRed; x-axis: Score/Camera, with the groups Neg, 2+ and 3+ for the DP 71 and the PixeLINK; y-axis: Intensity value, 0.0 to 0.5.)

Computing the correlation between the outputs shows that there is no great difference between the outputs of the two cameras. All the combinations have a coefficient of determination above 0.80, so they must be considered highly correlated. Since the outputs are this strongly correlated, it can be assumed that the outputs from 2+ samples will be as well, and that it therefore is possible to separate all three scores.


Figure 5-9 The regression line of the output from dataset E compared to the model. (Plot title: Inter/MaxInt-Blue; series: DP 71 Inter/MaxInt-Blue and PixeLINK Inter/MaxInt-Blue; x-axis: Score.)

Computing the regression line for the output from dataset E and comparing it to the model computed in chapter 4.7 shows that the lines are very alike. This is further proof that the method is able to cope with images captured with different cameras. Trying to predict the scores of the samples acquired with the PixeLINK camera further underlines this: only the Intersection/Chromatic red combination has a misclassification.

It can be concluded that the method isn’t sensitive to which camera is used to capture the input images, and that it isn’t necessary to make separate models for different cameras.

5.4 Analysis of the effect of the focus

As a slice of tissue is not smooth, the focus can be lost when moving the field of view to a different spot on the slice. If this is not corrected by re-focusing, the result is unfocused images. Since this can easily happen, it is relevant to analyze the effect unfocused images have on the method. An unfocused image is blurred, which makes it more difficult to define the membranes and makes colors less distinctive. This may introduce a greater variation in the output.


Figure 5-10 A 3+ sample captured in focus (left) and out of focus (right)

First a two-tailed t-test is performed to ensure that there still is a separation between the scores. This shows that there is an excellent separation between the scores Neg and 3+, as expected. Again there is only one sample with the score 2+, so it is impossible to test whether the method is able to separate this score from the others. A scatter plot of the output shows that the single 2+ sample is close to the Neg group. As there is only one value, it can’t be concluded whether this is a coincidence or not.
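The test statistic behind such a comparison of two groups can be sketched as below. The thesis does not state which variant of the t-test was used, so Welch's unequal-variance form is chosen here purely as an assumption; the function names are hypothetical, and turning the statistic into a two-tailed p-value additionally requires the t-distribution, which is left out.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Arithmetic mean of a non-empty sample.
static double Mean(const std::vector<double>& v)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i) sum += v[i];
    return sum / v.size();
}

// Unbiased sample variance (divides by n - 1); needs at least two values.
static double SampleVariance(const std::vector<double>& v, double mean)
{
    double ss = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        const double d = v[i] - mean;
        ss += d * d;
    }
    return ss / (v.size() - 1);
}

// Welch's t statistic for two independent samples. Returns 0 when either
// sample has fewer than two values or both variances are zero.
double WelchT(const std::vector<double>& a, const std::vector<double>& b)
{
    if (a.size() < 2 || b.size() < 2) return 0.0;
    const double meanA = Mean(a);
    const double meanB = Mean(b);
    const double se2 = SampleVariance(a, meanA) / a.size()
                     + SampleVariance(b, meanB) / b.size();
    if (se2 == 0.0) return 0.0;
    return (meanA - meanB) / std::sqrt(se2);
}
```

A large absolute value of the statistic corresponds to a clear separation between the two groups, for example between the Neg and the 3+ outputs.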

Figure 5-11 The output from un-focused images. (Plot title: Multi/MaxInt-blue; x-axis: Score, with the groups Neg, 2+ and 3+; y-axis: Intensity value, 0.20 to 0.40.)

Computing the correlation between the focused and the unfocused output shows that these are highly correlated, which was expected. The coefficient of determination is above 0.88 for all combinations. As can be seen from the images in Figure 5-10, the brown membranes are still present in the unfocused image; they are just somewhat larger than when the image is focused, so it is not surprising that the correlation is as good as it is. Computing the regression line for the unfocused output and comparing it to the model shows that these are not significantly different at a confidence level of 95%. This is again no surprise considering the results above.


Using the models to predict the score of the unfocused output gives reasonable results; the worst combination, Intersection/MaxInt-blue, has three misclassifications. This is acceptable considering the effect on the images.

Looking at the results of this analysis it can be concluded that the method handles images out of focus reasonably well, but that it performs better on focused images. The method will not fail if one or two images are out of focus, but images should be focused to obtain optimal results.

5.5 Summary of the analysis

The effect of the four factors has been analyzed, and the conclusion is that the method is able to separate the scores when any one of the factors is applied. Two factors, the protocol and the thickness, require either that the model is modified or that a new model is made. The analysis of the two protocols showed that a separate model for each would probably be the best solution, but as the available data material was limited, a model for the Pathway® protocol wasn’t constructed. The thickness analysis showed that the thickness of the tissue has a great influence on the output of the method. This influence has been overcome by modifying the model so that its constants depend on the thickness. In all, the method is very versatile.


6 Discussion

Worldwide, the number of breast cancer cases is rising. For a patient to receive the optimal treatment the tumor needs to be examined in detail. The number of tests performed on a tumor is rising as scientists keep finding new characteristics of tumors that either help with the prognosis or can be used as a target for therapy. All these examinations have to be performed in a pathological laboratory by a trained pathologist. As both the number of cases and the number of tests that should be performed are rising, laboratories become overloaded. By making the process of determining the HER2/neu status less complicated and more automated, some of this pressure will be eased.

The aim of this thesis was to develop a method for automatic determination of the HER2/neu score in breast cancer samples, to make the process less complicated and easier to perform. This has been achieved, but what are the ramifications of such a method? If this method were implemented in a pathological laboratory, the scoring would become more uniform; it would be faster, as it only takes about 30 seconds to analyze a sample of five images; it would be easier to determine the score of a sample; and the laboratory would be able to score more cases. It wouldn’t entirely eliminate the need for a trained pathologist, since someone has to choose the five fields of view, but the level of experience required and the amount of work needed on each case would be reduced. It would also mean that smaller laboratories that don’t have a large number of cases could perform the scoring, as it would be easier to do. In all, pathological laboratories would become more efficient and some of the overall pressure would be eased.

The negative effect of implementing this method could be that the level of expertise and carefulness employed today would be lost. As the pathologist wouldn’t look at as many samples close up and wouldn’t have to determine the score himself, he wouldn’t gain as much experience as now, perhaps making him unable to determine the score. Since the method is only as good as its input, it is vital that the pathologist is careful when choosing the fields of view. Some of this carefulness could be lost if the expertise fades. The positive aspects of an implementation outweigh the negative, though, and it would help the pathological laboratories to have a tool like this.


7 Conclusion

In this thesis a method that uses image analysis to automatically determine the HER2/neu score has been proposed. The method uses images captured from immunohistochemistry stained tissue samples to determine the score based on the intensity of the brown color in the membranes. It has been proven that the computed brown intensity is able to separate samples of different scores and therefore can be used as a quantitative measure for the scores.

The sensitivity of the method has been tested, and of the four factors examined it was found that two have no or inconsequential influence on the method while the two others do have an influence on it. The two factors without influence are the camera and the focus level. From the analysis of the cameras it can be concluded that both the PixeLINK A686 and the Olympus DP 71 cameras can be used with the method. As these two cameras represent each end of the quality scale for good entry-level cameras, it is likely that all cameras on this level will work with the method. It has been shown that the focus level has little influence on the output, but that focused images produce more accurate results.

The two factors that do influence the output significantly are the thickness of the tissue and the protocol used to stain the tissue. Both factors have a significant impact on the output of the method, but in both cases there is still a separation between the samples of different scores. The method can therefore be used to score samples of different thickness or stained with different protocols, as long as the factor in question is taken into account. Due to the limited size of the dataset stained with the Pathway® protocol it hasn’t been possible to create a model that can predict the score of samples stained this way, but given a sufficiently large dataset this should be doable. The influence of the thickness has been removed by making the predicting model dependent on the thickness. This of course requires that the thickness is known or that a method for measuring it is developed. Such a method has been outside the scope of this thesis, but a way of doing it has been proposed.

Due to the available microscope equipment it hasn’t been possible to test the effect of a change in the light intensity or how an incorrectly performed white balance affects the method. This is unfortunate, as these two factors could affect the method. It also means that the effect of the pre-process step hasn’t been tested.

Given the knowledge obtained from the analyses above, it is not possible to discard any of the highlighting method/input image combinations, as they all performed well. One combination, Multiplication/Chromatic blue, performed above the rest, though. This combination had the fewest misclassifications both at the initial testing and through all tests.

The conclusion is that it is possible to determine the HER2/neu score of a sample automatically and that this method is both robust and versatile.


8 The program

The program Cancer Diagnostic is the implementation of the developed method. It is a very simple program, as it is intended to be integrated into VIS, the image analysis program developed by Visiopharm A/S. Before it can be executed, the program vcredist_x86.exe has to be run. This ensures that executables built with Microsoft Visual Studio can run on the computer. When Cancer Diagnostic is executed the user is asked to select five settings. The settings are:

• Show each image with regions of interest defined?
• Select method: Multiplication or Intersection
• Select the input image that highlights brown: Chromatic blue, Chromatic red or MaxInt-blue
• Locate the folder containing the database
  o This is chosen from a browsable window
• Select the depth of the database. This is the number of folders from the database folder to the folder in which the images are situated

The first setting is whether to show the images with the defined regions of interest. If selected, each image in the database is shown with the regions of interest defined. As long as the image window is open the program is paused.

The next two settings choose the method/input image combination. The combination Intersection/Chromatic blue is not available, as it has been proven to be unstable.

The last two settings select where the database is located and its depth. VIS has very good handling of databases, and as this program is meant to be integrated into VIS, the selection is very simple. First browse to and select the folder in which the database is situated. Then select the depth of the database. VIS has three levels in a database: Study|Study unit|Measurement. The measurement folder contains an image. So if the program is used on a normal VIS-structured database, the depth level has to be three.

When all settings have been set and the database selected, the program starts. If the show images option has been disabled, the program will run until there are no more images in the database. The user will then be asked where to save the two output files. These are simple .txt files, one with the names and scores of each sample in the database and one with the quantitative measure of each sample. When the data has been written to the two files the program closes.

Levels for the datasets used in this thesis:

• Dataset A: 3
• Dataset B: 3
• Dataset C: 4 if the images from both the HercepTest® and Pathway® protocols should be included; otherwise select the protocol folder and then use 3
• Dataset D: 4 if samples of all three thicknesses should be analyzed; otherwise select the thickness folder and then use 3
• Dataset E: 3
• Dataset F: 3

It should be noted that the program assumes that there are five images per sample. If this is not the case, the output will be incorrect.


9 Future work

Some aspects involved in the automatic determination of HER2/neu scores have not been addressed in this thesis, either because the data/material to do so wasn’t available or because it was outside the scope of the thesis. All of these can have a significant impact on the method, and further work should be invested in them.

Things that haven’t been possible due to the available data/material are a thorough analysis of the Pathway® protocol and of the effect of a change in the light intensity or an incorrectly performed white balance. In the analysis of the two protocols it was concluded that a specific model was needed for each protocol. A model for the Pathway® protocol has not been created, but needs to be if the method is to be used with this protocol. The effect of light intensity/white balance hasn’t been analyzed, as it was impossible to change the color temperature or the light intensity without the available microscope self-correcting these changes. Using a less sophisticated microscope should make this possible.

Aspects found to be outside the scope of this thesis include creating a method to measure the thickness of a tissue slice and analyzing whether the method is able to separate samples as either FISH-positive or FISH-negative based on the measured brown intensity. The last aspect especially could have some promising implications.

Another aspect found to be outside the scope is to make the method completely autonomous. As the method is now, it needs five input images, and these have to be acquired by a human. To make the method totally autonomous, this step would have to be performed by the computer as well.

The program used to compute the intensity and the score of the samples is very simple. It is currently controlled through the command line. Some form of graphical interface would be preferable, but was considered outside the scope of the thesis.


10 Bibliography

ACS :: What is Breast Cancer. (n.d.). Retrieved 12 27, 2007, from American Cancer Society: http://www.cancer.org/docroot/CRI/content/CRI_2_4_1X_What_is_breast_cancer_5.asp?sitearea=
ACS :: How Is Breast Cancer Diagnosed? (n.d.). Retrieved 12 30, 2007, from American Cancer Society: http://www.cancer.org/docroot/CRI/content/CRI_2_4_3X_How_is_breast_cancer_diagnosed_5.asp
Bray, F., McCarron, P., & Parkin, D. M. (2004, August 26). The changing global patterns of female breast cancer incidence and mortality. Breast Cancer Res., pp. 6:229-239.
Breast Cancer - Wikipedia. (n.d.). Retrieved 12 12, 2007, from Wikipedia, the free encyclopedia: http://en.wikipedia.org/wiki/Breast_cancer
Breast Cancer Home page. (n.d.). Retrieved 12 27, 2007, from National Cancer Institute: http://www.cancer.gov/cancertopics/types/breast
Cancer - Wikipedia. (n.d.). Retrieved 12 13, 2007, from Wikipedia, the free encyclopedia: http://en.wikipedia.org/wiki/Cancer
Cancer staging - Wikipedia. (n.d.). Retrieved 12 14, 2007, from Wikipedia, the free encyclopedia: http://en.wikipedia.org/wiki/Cancer_staging
Color balance - Wikipedia. (n.d.). Retrieved January 16, 2008, from Wikipedia, the free encyclopedia: http://en.wikipedia.org/wiki/White_balance
DAKO. (2007, March 8). HercepTest(TM) Interpretation Manual. Retrieved November 5, 2007, from Dako: http://pri.dako.com/28630_herceptest_interpretation_manual.pdf
Fluorescent in situ hybridization - Wikipedia. (n.d.). Retrieved 01 02, 2008, from Wikipedia, the free encyclopedia: http://en.wikipedia.org/wiki/Fluorescent_in_situ_hybridization
Glasbey, C. A. (1993, November 6). An Analysis of Histogram-Based Thresholding Algorithms. Graphical Models and Image Processing, pp. 532-537.
Grunkun, M. (2004). Local approximating polynomials 1D and 2D. Visiopharm A/S.
HER2. (2008). Retrieved 01 02, 2008, from American Association for Cancer Research: http://www.aacr.org/home/public--media/for-the-media/fact-sheets/cancer-concepts/her2.aspx
Immunohistochemistry - In Situ Hybridization. (n.d.). Retrieved 01 11, 2008, from Immunoportal.com.
Immunohistochemistry - Wikipedia. (n.d.). Retrieved 01 02, 2008, from Wikipedia, the free encyclopedia: http://en.wikipedia.org/wiki/Immunohistochemistry
Kittler, J., & Illingworth, J. (1986). Minimum Error Thresholding. Pattern Recognition, pp. 41-47.
Madrid, M. A., & Lo, R. W. (2004, July). Chromogenic in situ hybridization (CISH): a novel alternative in screening archival breast cancer tissue samples for HER2/neu status. Breast Cancer Research, pp. 593-600.
Niemistö, A. (2004, October 27). A Comparison of Nonparametric Histogram-Based Thresholding Algorithms. Presentation for 8002202 Digital Image Processing III. http://www.cs.tut.fi/~ant/histthresh/ThreshComp.pdf
Probability of Breast Cancer in American Women. (n.d.). Retrieved 12 27, 2007, from National Cancer Institute: http://www.cancer.gov/cancertopics/factsheet/Detection/probability-breast-cancer
Sezgin, M., & Sankur, B. (2004, January). Survey over image thresholding techniques and quantitative performance evaluation. Journal of Electronic Imaging, pp. 146-165.
Tubs, R. R., & Hsi, E. D. (2004, March). Guidelines for HER2 testing in the UK. Journal of Clinical Pathology, pp. 241-242.
What is cancer. (n.d.). Retrieved 12 11, 2007, from National Cancer Institute: http://www.cancer.gov/cancertopics/what-is-cancer
WHO | The top 10 causes of death. (n.d.). Retrieved 12 22, 2007, from WHO: http://www.who.int/mediacentre/factsheets/fs310/en/index.html

11 Figures

Figure 2-1 Normal cell growth and abnormal cell growth leading to a tumor (What is cancer)
Figure 2-2 The structure of a healthy breast
Figure 2-3 Breast cancer incidences worldwide: age-standardized rates (world population)
Figure 2-4 IHC and FISH targets for HER2/neu testing (DAKO, 2007)
Figure 2-5 Images showing breast cancer samples stained with the HercepTest® protocol showing all four scores
Figure 2-7 Image showing a breast cancer sample stained with a fluorescent marker. The red signals represent HER2/neu genes and the green the chromosomes
Figure 4-1 A diagram of the proposed method
Figure 4-2 Three population models for thresholds T = 100, T = 115 and T = 130. The black line defines the histogram of an IHC-image. The red and green lines are the population density functions. The blue area represents the binarisation error. The optimal threshold is found as T = 115
Figure 4-3 An IHC-image captured with a wrong white-balance and the same image after the algorithm has performed white-balance
Figure 4-4 The intensity of the IHC-image shown on Figure 4-3. The blue arrow points to background, the yellow arrow to transparent cells, and the red arrow to stained cancer cells
Figure 4-5 An IHC-image and the corresponding image after it has been filtered with a Local Linear filter
Figure 4-6 The images above show an image in the RGB-space (left), with the chromatic red band (middle) and the chromatic blue band (right) following. Notice how the brown regions result in high and low intensity respectively
Figure 4-7 Close-up of the RGB-image and the corresponding chromatic red image. Notice that the dark membrane on the RGB-image has a low intensity when converted to chromatic red
Figure 4-8 An RGB-image and the corresponding chromatic blue image. The yellow arrows point to the unwanted light brown tissue that is included as brown when using the chromatic blue
Figure 4-9 An RGB-image and the corresponding image created with the above mentioned method. Notice how the blue nucleus gets a high intensity compared to the background
Figure 4-10 Scatter plot of the output when using the Multiplication method and the MaxInt-Blue input image. The lines show the mean and the standard deviation
Figure 4-11 Scatter plot of the output with the fitted model
Figure 5-1 Two images of the same 3+ sample stained with the Pathway® protocol (left) and the HercepTest® protocol (right). It is apparent that the membrane-staining is much darker when using the Pathway® protocol
Figure 5-2 Comparison of the output computed from Pathway stained samples and the model
Figure 5-3 Images of a 3+ sample at a 1 µm (left), 3 µm (middle) and 8 µm (right) thickness. It is easy to see the difference in the color-intensity
Figure 5-4 The regression lines computed from the output of two different thicknesses compared with the model. It is clearly seen that the thickness affects the intensity of the brown
Figure 5-5 The intensity plotted as a function of the thickness
Figure 5-6 Plot of the y-intercept (left) and the slope coefficient (right) vs. the slice thickness
Figure 5-7 Image captured with an Olympus DP 71 (left) and with a PixeLINK (right). Notice the difference in the color temperature and the tint of the images
Figure 5-8 Scatter plot of the output from dataset B and E computed with the combination Multiplication/Chromatic red
Figure 5-9 The regression line of the output from dataset E compared to the model
Figure 5-10 A 3+ sample captured in focus (left) and out of focus (right)
Figure 5-11 The output from un-focused images


1 Appendix

1.1 C++ code

1.1.1 Main()
#include #include #include #include #include #include #include
#include "DecideScore.h"
#include "MakeGrayScale.h"
#include "Utility.h"
#include "DetectBackground.h"
#include "Kittler.h"
#include "Pre_process.h"
#include "Method.h"

using namespace std; void main() { if(!Initialize()) cout > answer; if(answer == 'y') { cout Monadic(*Image,0,0,Max,DIV,0,0); Image->Monadic(*Image,0,0,255,MUL|SATURATE,0,0); Image->GetPixelRange(&Min,&Max); }

1.1.14 MakeInvertBlue(CImage* RGB, CImage* Image);
void MakeInvertBlue(CImage* RGB, CImage* Image)
{
    int pixels;
    int Rows = Image->Rows;
    int Cols = Image->Cols;

    pixels = Rows*Cols;
    //Put the blue band of the RGB image into Image
    Image->Unary(*RGB,2,0,PASS,0,0);
    //Computes (255-pixelintensity)/255
    for(int r=0; rGetPixel(0,r,c,0)); } } }

1.1.15 Method.h
#ifndef __METHOD_H__
#define __METHOD_H__
#include #include #include #include #include #include #include
#include "Utility.h"
#include "Kittler.h"
using namespace std;
void Multi(CImage* Membranes, CImage* Brown, CImage* RGB);
void Inter(CImage* Membranes, CImage* Brown, CImage* RGB);
#endif


1.1.16 Multi(CImage* Membranes, CImage* Brown, CImage* RGB)
void Multi(CImage* Membranes, CImage* Brown, CImage* RGB)
{
    int pixels = Membranes->Rows*Membranes->Cols;
    const int Bins = 256;
    int Hist[Bins], Thres;
    double Min, Max;

    //Multiplies the membrane image with the brown image, pixel by pixel
    Membranes->Dyadic(*Membranes,0,0,*Brown,0,0,MUL,0,0);
    //Normalizes image in range [0;255]
    Membranes->GetPixelRange(&Min,&Max);
    if(Min<0)
        Membranes->Monadic(*Membranes,0,0,abs(Min),ADD,0,0);
    Membranes->GetPixelRange(&Min,&Max);
    if(Max>1)
        Membranes->Monadic(*Membranes,0,0,Max,DIV,0,0);
    Membranes->Monadic(*Membranes,0,0,255,MUL|SATURATE,0,0);
    //Create histogram of image
    CreateHist(*Membranes, Hist);
    //Finds optimal threshold
    Thres = Kittle(Hist,0,255);
    Thres = Kittle(Hist,Thres,255);
    //Marks every pixel above the threshold in the mask of the RGB image
    for(int i=0; i<pixels; i++)
    {
        if(Membranes->GetPixel(i)>Thres)
        {
            RGB->SetMaskPixel(i,1);
        }
    }
}

1.1.17 Inter(CImage* Membranes, CImage* Brown, CImage* RGB)
void Inter(CImage* Membranes, CImage* Brown, CImage* RGB)
{
    int pixels = RGB->Rows*RGB->Cols;
    const int Bins = 256;
    int Hist[Bins], Thres;

    //Create histogram of image
    CreateHist(*Membranes, Hist);
    //Finds optimal threshold
    Thres = Kittle(Hist,0,255);
    Thres = Kittle(Hist,Thres,255);
    for(int i=0; i<pixels; i++)
    {
        if(Membranes->GetPixel(i)>Thres)
            Membranes->SetMaskPixel(i,1);
    }
    //Create histogram of image
    CreateHist(*Brown, Hist);
    //Finds optimal threshold
    Thres = Kittle(Hist,0,255);
    Thres = Kittle(Hist,Thres,255);
    for(int i=0; i<pixels; i++)
    {
        if(Brown->GetPixel(i)>Thres)
            Brown->SetMaskPixel(i,1);
    }
    //Finds the intersection of the two previously defined areas. This is brown membranes
    for(int i=0; i<pixels; i++)
    {
        if(Brown->GetMaskPixel(i)==1 && Membranes->GetMaskPixel(i)==1)
        {
            RGB->SetMaskPixel(i,1);
        }
    }
}
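The difference between the two highlighting methods can also be illustrated without the VIS `CImage` class. The sketch below is a simplification under stated assumptions: images are plain `std::vector<double>` buffers with non-negative values, the thresholds are passed in by the caller and not found with the Kittler algorithm, and the function names `MultiMask` and `InterMask` are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Multiplication: multiply the two images pixel by pixel, rescale the
// product to [0;255] and threshold the result once.
std::vector<bool> MultiMask(const std::vector<double>& membranes,
                            const std::vector<double>& brown,
                            double threshold)
{
    const std::size_t n = std::min(membranes.size(), brown.size());
    std::vector<double> product(n);
    double maxVal = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        product[i] = membranes[i] * brown[i];
        maxVal = std::max(maxVal, product[i]);
    }
    std::vector<bool> mask(n, false);
    for (std::size_t i = 0; i < n; ++i) {
        const double scaled = maxVal > 0.0 ? 255.0 * product[i] / maxVal : 0.0;
        mask[i] = scaled > threshold;
    }
    return mask;
}

// Intersection: threshold each image separately and keep only the pixels
// that are selected in both.
std::vector<bool> InterMask(const std::vector<double>& membranes,
                            const std::vector<double>& brown,
                            double membraneThreshold,
                            double brownThreshold)
{
    const std::size_t n = std::min(membranes.size(), brown.size());
    std::vector<bool> mask(n, false);
    for (std::size_t i = 0; i < n; ++i)
        mask[i] = membranes[i] > membraneThreshold && brown[i] > brownThreshold;
    return mask;
}
```

In the multiplication variant a single threshold acts on the combined evidence, whereas the intersection variant requires each image to pass its own threshold.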

1.2 Output of dataset A Multi Chrom blue Multi Chrom red Multi MaxInt-blue Inter Chrom red Inter MaxInt-blue Inter Chrom blue Develop set

Multi Chrom blue Neg 0,355259 0,354608 0,336909 0,374578 0,358807 0,363239 0,391631 0,375795 0,388875 0,383364 0,386884 0,393726 0,387311

2+ 0,277878 0,297969 0,34877 0,33297 0,304839 0,329211 0,326667 0,346377 0,316109 0,305815 0,282942 0,321222 0,287324 0,305376 0,267625 0,316847

3+ 0,189128 0,207709 0,244335 0,207958 0,172381 0,190417 0,232793 0,212327 0,204738 0,13956 0,133817 0,195421 0,261904 0,211258 0,19989 0,169057 0,245902 0,194529 0,262672 0,239666 0,251666

Multi Chrom red Neg 0,372924 0,384265 0,362815 0,389711 0,372731 0,380556 0,415327 0,385771 0,402658 0,410285 0,40299 0,418383 0,4038

2+ 0,319083 0,320743 0,37254 0,357664 0,313003 0,340605 0,34949 0,355982 0,339677 0,320806 0,290992 0,342039 0,301202 0,32765 0,27527 0,341315

3+ 0,193465 0,210993 0,250138 0,211928 0,174354 0,194917 0,240376 0,218082 0,209013 0,137038 0,134041 0,199919 0,279366 0,215568 0,204942 0,175525 0,251794 0,19927 0,276835 0,248362 0,257855

Multi MaxInt-blue Neg 0,369074 0,389477 0,349266 0,379803 0,375019 0,389859 0,410131 0,387541 0,40998 0,407255 0,408379 0,40177 0,40097

2+ 0,324938 0,309388 0,377636 0,352399 0,311655 0,334089 0,334938 0,35468 0,3404 0,316048 0,290774 0,338631 0,306207 0,321419 0,29017 0,338826

3+ 0,207659 0,214581 0,248702 0,216634 0,184936 0,21123 0,245963 0,218297 0,213404 0,157911 0,175827 0,219034 0,287656 0,217344 0,207792 0,188805 0,258153 0,207653 0,283905 0,252491 0,267025

Inter Chrom red Neg 0,323551 0,345655 0,313442 0,341553 0,328055 0,338473 0,382998 0,336938 0,334184 0,333389 0,342286 0,348253 0,341814

2+ 0,276332 0,300061 0,328806 0,31341 0,296423 0,323124 0,311217 0,326843 0,309513 0,295664 0,268457 0,315187 0,270577 0,286234 0,25151 0,320648

3+ 0,180793 0,202206 0,236198 0,197708 0,166581 0,186324 0,225386 0,209785 0,196484 0,138776 0,147454 0,186954 0,227183 0,206021 0,197425 0,160181 0,225794 0,172117 0,240418 0,237949 0,220049


Inter MaxInt-blue Neg 0,380894 0,440408 0,357373 0,410255 0,380708 0,403866 0,431042 0,398017 0,413873 0,436525 0,42352 0,437538 0,42128

2+ 0,31459 0,288078 0,383717 0,34816 0,303193 0,329905 0,33714 0,357442 0,327592 0,303469 0,285025 0,328307 0,296936 0,300541 0,274764 0,327426

3+ 0,211983 0,214878 0,251924 0,221127 0,184835 0,215937 0,246502 0,225025 0,219946 0,153584 0,161131 0,21077 0,276305 0,220361 0,22314 0,199704 0,25343 0,220801 0,309388 0,276834 0,275988

Inter Chrom blue Neg 0,326932 0,323102 0,313551 0,324503 0,324926 0,326911 0,372472 0,327152 0,323132 0,325743 0,329907 0,329778 0,330459

2+ 0,306276 0,299737 0,315475 0,314855 0,291457 0,322788 0,307436 0,325622 0,310226 0,295529 0,246392 0,312326 0,258407 0,300013 0,226872 0,313668

3+ 0,171869 0,194929 0,226399 0,192488 0,159973 0,17196 0,215383 0,194603 0,19046 0,127414 0,13119 0,18205 0,269767 0,19735 0,182678 0,131585 0,202347 0,144134 0,161848 0,184886 0,178244

Develop set

Score  Multi/Chrom blue  Multi/Chrom red  Multi/MaxInt-blue  Inter/Chrom red  Inter/MaxInt-blue  Inter/Chrom blue
1  0,355259  0,372924  0,369074  0,323551  0,380894  0,326932
1  0,354608  0,384265  0,389477  0,345655  0,440408  0,323102
1  0,336909  0,362815  0,349266  0,313442  0,357373  0,313551
1  0,374578  0,389711  0,379803  0,341553  0,410255  0,324503
1  0,358807  0,372731  0,375019  0,328055  0,380708  0,324926
1  0,363239  0,380556  0,389859  0,338473  0,403866  0,326911
1  0,391631  0,415327  0,410131  0,382998  0,431042  0,372472
1  0,375795  0,385771  0,387541  0,336938  0,398017  0,327152
2  0,304839  0,313003  0,311655  0,296423  0,303193  0,323132
2  0,329211  0,340605  0,334089  0,323124  0,329905  0,325743
2  0,326667  0,34949  0,334938  0,311217  0,33714  0,329907
2  0,346377  0,355982  0,35468  0,326843  0,357442  0,329778
2  0,287324  0,301202  0,306207  0,270577  0,296936  0,330459
2  0,305376  0,32765  0,321419  0,286234  0,300541  0,306276
2  0,267625  0,27527  0,29017  0,25151  0,274764  0,299737
2  0,316847  0,341315  0,338826  0,320648  0,327426  0,315475
3  0,189128  0,193465  0,207659  0,180793  0,211983  0,314855
3  0,207709  0,210993  0,214581  0,202206  0,214878  0,291457
3  0,244335  0,250138  0,248702  0,236198  0,251924  0,322788
3  0,212327  0,218082  0,218297  0,209785  0,225025  0,307436
3  0,204738  0,209013  0,213404  0,196484  0,219946  0,325622
3  0,262672  0,276835  0,283905  0,240418  0,309388  0,310226
3  0,239666  0,248362  0,252491  0,237949  0,276834  0,295529
3  0,251666  0,257855  0,267025  0,220049  0,275988  0,246392

1.3 GraphPad Prism 5 ANOVA of combinations

1way ANOVA of Multi Chrom blue
1way ANOVA of Multi Chrom red
1way ANOVA of Multi MaxInt-blue
1way ANOVA of Inter Chrom red
1way ANOVA of Inter MaxInt-blue
1way ANOVA of Inter Chrom blue
Nonlin fit of Develop set:Table of results

1way ANOVA of Multi Chrom blue

Table Analyzed                                  Multi Chrom blue

One-way analysis of variance
  P value                                       < 0.0001
  P value summary                               ***
  Are means signif. different? (P < 0.05)       Yes
  Number of groups                              3
  F                                             143.9
  R squared                                     0.8596

Bartlett's test for equal variances
  Bartlett's statistic (corrected)              7.362
  P value                                       0.0252
  P value summary                               *
  Do the variances differ signif. (P < 0.05)    Yes

ANOVA Table                    SS        df   MS
  Treatment (between columns)  0.2360    2    0.1180
  Residual (within columns)    0.03855   47   0.0008201
  Total                        0.2745    49

Bonferroni's Multiple Comparison Test  Mean Diff.  t      Significant? P < 0.01?  Summary  99% CI of diff
  Neg vs 2+                            0.06266     5.859  Yes                     ***      0.03116 to 0.09415
  2+ vs 3+                             0.1025      10.79  Yes                     ***      0.07454 to 0.1305

1way ANOVA of Multi Chrom red

Table Analyzed                                  Multi/Chrom red

One-way analysis of variance
  P value                                       < 0.0001
  P value summary                               ***
  Are means signif. different? (P < 0.05)       Yes
  Number of groups                              3
  F                                             146.2
  R squared                                     0.8615

Bartlett's test for equal variances
  Bartlett's statistic (corrected)              8.841
  P value                                       0.0120
  P value summary                               *
  Do the variances differ signif. (P < 0.05)    Yes

ANOVA Table                    SS        df   MS
  Treatment (between columns)  0.2815    2    0.1408
  Residual (within columns)    0.04526   47   0.0009629
  Total                        0.3268    49

Bonferroni's Multiple Comparison Test  Mean Diff.  t      Significant? P < 0.01?  Summary  99% CI of diff
  Neg vs 2+                            0.06322     5.457  Yes                     ***      0.02909 to 0.09735
  2+ vs 3+                             0.1157      11.24  Yes                     ***      0.08541 to 0.1461

1way ANOVA of Multi MaxInt-blue

Table Analyzed                                  Multi/MaxInt-blue

One-way analysis of variance
  P value                                       < 0.0001
  P value summary                               ***
  Are means signif. different? (P < 0.05)       Yes
  Number of groups                              3
  F                                             159.8
  R squared                                     0.8718

Bartlett's test for equal variances
  Bartlett's statistic (corrected)              5.567
  P value                                       0.0618
  P value summary                               ns
  Do the variances differ signif. (P < 0.05)    No

ANOVA Table                    SS        df   MS
  Treatment (between columns)  0.2433    2    0.1216
  Residual (within columns)    0.03578   47   0.0007613
  Total                        0.2790    49

Bonferroni's Multiple Comparison Test  Mean Diff.  t      Significant? P < 0.01?  Summary  99% CI of diff
  Neg vs 2+                            0.06302     6.117  Yes                     ***      0.03267 to 0.09337
  2+ vs 3+                             0.1045      11.42  Yes                     ***      0.07757 to 0.1315

1way ANOVA of Inter Chrom red

Table Analyzed                                  Inter/Chrom red

One-way analysis of variance
  P value                                       < 0.0001
  P value summary                               ***
  Are means signif. different? (P < 0.05)       Yes
  Number of groups                              3
  F                                             149.8
  R squared                                     0.8644

Bartlett's test for equal variances
  Bartlett's statistic (corrected)              4.760
  P value                                       0.0925
  P value summary                               ns
  Do the variances differ signif. (P < 0.05)    No

ANOVA Table                    SS        df   MS
  Treatment (between columns)  0.1844    2    0.09220
  Residual (within columns)    0.02893   47   0.0006156
  Total                        0.2133    49

Bonferroni's Multiple Comparison Test  Mean Diff.  t      Significant? P < 0.01?  Summary  99% CI of diff
  Neg vs 2+                            0.03965     4.280  Yes                     ***      0.01236 to 0.06694
  2+ vs 3+                             0.1014      12.32  Yes                     ***      0.07719 to 0.1257

1way ANOVA of Inter MaxInt-blue

Table Analyzed                                  Inter/MaxInt-blue

One-way analysis of variance
  P value                                       < 0.0001
  P value summary                               ***
  Are means signif. different? (P < 0.05)       Yes
  Number of groups                              3
  F                                             130.5
  R squared                                     0.8474

Bartlett's test for equal variances
  Bartlett's statistic (corrected)              2.610
  P value                                       0.2711
  P value summary                               ns
  Do the variances differ signif. (P < 0.05)    No

ANOVA Table                    SS        df   MS
  Treatment (between columns)  0.2744    2    0.1372
  Residual (within columns)    0.04940   47   0.001051
  Total                        0.3238    49

Bonferroni's Multiple Comparison Test  Mean Diff.  t      Significant? P < 0.01?  Summary  99% CI of diff
  Neg vs 2+                            0.09126     7.539  Yes                     ***      0.05560 to 0.1269
  2+ vs 3+                             0.09183     8.535  Yes                     ***      0.06014 to 0.1235
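The Inter MaxInt-blue ANOVA can be recomputed from the Neg, 2+ and 3+ values listed earlier in this appendix. The following is a minimal sketch in plain Python, not the GraphPad Prism procedure itself: the helper names `parse` and `one_way_anova` are hypothetical, the decimal commas of the listing are converted on input, and no P value or Bonferroni comparison is computed. Under these assumptions the sums of squares, F and R squared should land close to the reported 0.2744, 0.04940, 130.5 and 0.8474.

```python
def parse(values):
    """Parse a space-separated list of numbers written with decimal commas."""
    return [float(v.replace(",", ".")) for v in values.split()]


# Inter MaxInt-blue values copied from the appendix listing (Neg, 2+, 3+).
GROUPS = {
    "Neg": parse("0,380894 0,440408 0,357373 0,410255 0,380708 0,403866 0,431042 "
                 "0,398017 0,413873 0,436525 0,42352 0,437538 0,42128"),
    "2+": parse("0,31459 0,288078 0,383717 0,34816 0,303193 0,329905 0,33714 "
                "0,357442 0,327592 0,303469 0,285025 0,328307 0,296936 0,300541 "
                "0,274764 0,327426"),
    "3+": parse("0,211983 0,214878 0,251924 0,221127 0,184835 0,215937 0,246502 "
                "0,225025 0,219946 0,153584 0,161131 0,21077 0,276305 0,220361 "
                "0,22314 0,199704 0,25343 0,220801 0,309388 0,276834 0,275988"),
}


def mean(xs):
    return sum(xs) / len(xs)


def one_way_anova(groups):
    """Return the quantities of a one-way ANOVA table for a dict of groups."""
    values = [v for g in groups.values() for v in g]
    grand = mean(values)
    # Between-group (treatment) and within-group (residual) sums of squares.
    ss_treatment = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_residual = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    df_treatment = len(groups) - 1
    df_residual = len(values) - len(groups)
    f = (ss_treatment / df_treatment) / (ss_residual / df_residual)
    return {
        "ss_treatment": ss_treatment,
        "ss_residual": ss_residual,
        "df_treatment": df_treatment,
        "df_residual": df_residual,
        "F": f,
        "R_squared": ss_treatment / (ss_treatment + ss_residual),
    }


result = one_way_anova(GROUPS)
for name, value in result.items():
    print(f"{name}: {value:.4g}")
```

The same function can be applied to the other five measures once their group values are typed in the same way.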

1way ANOVA of Inter Chrom blue

Table Analyzed                                  Inter/Chrom blue

One-way analysis of variance
  P value                                       < 0.0001
  P value summary                               ***
  Are means signif. different? (P < 0.05)       Yes
  Number of groups                              3
  F                                             135.3
  R squared                                     0.8521

Bartlett's test for equal variances
  Bartlett's statistic (corrected)              9.237
  P value                                       0.0099
  P value summary                               **
  Do the variances differ signif. (P < 0.05)    Yes

ANOVA Table                    SS        df   MS
  Treatment (between columns)  0.2125    2    0.1063
  Residual (within columns)    0.03690   47   0.0007851
  Total                        0.2494    49

Bonferroni's Multiple Comparison Test  Mean Diff.  t      Significant? P < 0.01?  Summary  99% CI of diff
  Neg vs 2+                            0.03243     3.100  Yes                     **       0.001610 to 0.06325
  2+ vs 3+                             0.1152      12.39  Yes                     ***      0.08780 to 0.1426

Nonlin fit of Develop set:Table of results

Comparison of Fits (identical for all six data sets)
  Comparison of Fits           Can't calculate
  Null hypothesis              Equation 1
  Alternative hypothesis       Equation 2
  Conclusion (alpha = 0.05)    Models have the same DF
  Preferred model              Equation 1

Equation 1: best-fit values, standard errors and 95% confidence intervals
Data set           YIntercept  Slope     Std. Error YIntercept  Std. Error Slope  95% CI YIntercept  95% CI Slope
Multi/Chrom blue   0.4426      -0.07104  0.01591                0.006515          0.4096 to 0.4756   -0.08455 to -0.05752
Multi/Chrom red    0.4697      -0.07779  0.01759                0.007164          0.4332 to 0.5062   -0.09265 to -0.06293
Multi/MaxInt-blue  0.4621      -0.07369  0.01595                0.006538          0.4290 to 0.4951   -0.08725 to -0.06013
Inter/Chrom red    0.4141      -0.06483  0.01583                0.006510          0.3813 to 0.4470   -0.07834 to -0.05133
Inter/MaxInt-blue  0.4709      -0.07476  0.02006                0.008232          0.4293 to 0.5125   -0.09184 to -0.05769
Inter/Chrom blue   0.3457      -0.01420  0.01056                0.004768          0.3238 to 0.3676   -0.02409 to -0.004314

Equation 1: Goodness of Fit
Data set           Degrees of Freedom  R² (unweighted)  Weighted Sum of Squares (1/Y²)  Absolute Sum of Squares  Sy.x
Multi/Chrom blue   22                  0.8575           0.1684                          0.01252                  0.08749
Multi/Chrom red    22                  0.8597           0.1866                          0.01466                  0.09210
Multi/MaxInt-blue  22                  0.8708           0.1546                          0.01212                  0.08382
Inter/Chrom red    22                  0.8114           0.1874                          0.01411                  0.09229
Inter/MaxInt-blue  22                  0.8241           0.2348                          0.01974                  0.1033
Inter/Chrom blue   22                  0.2900           0.07956                         0.007762                 0.06014

Equation 1: Normality of Residuals (D'Agostino & Pearson omnibus) and Runs test
Data set           K2      P value  Points above curve  Points below curve  Number of runs  P value (runs test)  Deviation from Model
Multi/Chrom blue   0.4187  0.8111   13                  11                  12              0.4334               Not Significant
Multi/Chrom red    0.4864  0.7841   10                  14                  12              0.4715               Not Significant
Multi/MaxInt-blue  0.3230  0.8509   12                  12                  14              0.7368               Not Significant
Inter/Chrom red    1.796   0.4074   11                  13                  12              0.4334               Not Significant
Inter/MaxInt-blue  1.514   0.4692   13                  11                  12              0.4334               Not Significant
Inter/Chrom blue   10.69   0.0048   11                  13                  7               0.0101               Significant

Equation 1: Dependency
Data set           YIntercept  Slope
Multi/Chrom blue   0.8987      0.8987
Multi/Chrom red    0.9008      0.9008
Multi/MaxInt-blue  0.8983      0.8983
Inter/Chrom red    0.8971      0.8971
Inter/MaxInt-blue  0.8980      0.8980
Inter/Chrom blue   0.8646      0.8646

Equation 2: all reported values are identical to those of Equation 1.

Number of points
Data set           Analyzed  Outliers (not excluded, Q=1.0%)
Multi/Chrom blue   24        0
Multi/Chrom red    24        0
Multi/MaxInt-blue  24        0
Inter/Chrom red    24        0
Inter/MaxInt-blue  24        0
Inter/Chrom blue   24        1
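The fitted straight lines translate a score into an expected intensity through intensity = YIntercept + Slope · score. The short sketch below evaluates that relation for scores 1, 2 and 3 with the best-fit values reported in the table of results; the names `FITS` and `predict` are hypothetical and only serve the illustration. For the Multi and Inter MaxInt-blue measures the results agree with the "Model" values listed in the output from dataset C (for example 0,37156, 0,30052 and 0,22948 for Multi Chrom blue).

```python
# Best-fit (YIntercept, Slope) pairs copied from the Develop set table of results.
FITS = {
    "Multi/Chrom blue": (0.4426, -0.07104),
    "Multi/Chrom red": (0.4697, -0.07779),
    "Multi/MaxInt-blue": (0.4621, -0.07369),
    "Inter/Chrom red": (0.4141, -0.06483),
    "Inter/MaxInt-blue": (0.4709, -0.07476),
    "Inter/Chrom blue": (0.3457, -0.01420),
}


def predict(intercept, slope, score):
    """Expected intensity for a given HER2/neu score under the straight-line model."""
    return intercept + slope * score


# Model intensity for each measure at scores 1, 2 and 3.
MODEL = {
    name: [predict(intercept, slope, score) for score in (1, 2, 3)]
    for name, (intercept, slope) in FITS.items()
}

for name, values in MODEL.items():
    print(name, " ".join(f"{v:.5f}" for v in values))
```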

1.4 Output from dataset C Multi Chrom blue Multi Chrom blue Multi Chrom red Multi Chrom red Multi MaxInt-blue Multi MaxInt-blue Inter Chrom red Inter Chrom red Inter MaxInt-blue Inter MaxInt-blue

Multi Chrom blue

Pathway 3+   0,133071 0,108079 0,167818
Pathway Neg  0,33891 0,350272 0,34782 0,351997 0,346451

Pathway   HercepTest
0,133071  0,169057
0,108079  0,245902
0,167818  0,194529
0,33891   0,354608
0,350272  0,336909
0,34782   0,374578
0,351997  0,358807
0,346451  0,363239

Multi Chrom blue
Score  Model    Pathway
1      0,37156  0,33891
1               0,350272
1               0,34782
1               0,351997
1               0,346451
2      0,30052
3      0,22948  0,133071
3               0,108079
3               0,167818

Multi Chrom red

Pathway 3+   0,136216 0,101914 0,179516
Pathway Neg  0,345124 0,360953 0,361296 0,361613 0,371639

Pathway   HercepTest
0,136216  0,175525
0,101914  0,251794
0,179516  0,19927
0,345124  0,384265
0,360953  0,362815
0,361296  0,389711
0,361613  0,372731
0,371639  0,380556

Multi Chrom red
Score  Model    Pathway
1      0,39191  0,345124
1               0,360953
1               0,361296
1               0,361613
1               0,371639
2      0,31412
3      0,23633  0,136216
3               0,101914
3               0,179516

Multi MaxInt-blue

Pathway 3+   0,177145 0,155767 0,19226
Pathway Neg  0,341728 0,365271 0,358671 0,361599 0,37853

Pathway   HercepTest
0,177145  0,188805
0,155767  0,258153
0,19226   0,207653
0,341728  0,389477
0,365271  0,349266
0,358671  0,379803
0,361599  0,375019
0,37853   0,389859

Multi MaxInt-blue
Score  Model    Pathway
1      0,38841  0,341728
1               0,365271
1               0,358671
1               0,361599
1               0,37853
2      0,31472
3      0,24103  0,177145
3               0,155767
3               0,19226

Inter Chrom red

Pathway 3+   0,137345 0,099809 0,177542
Pathway Neg  0,333404 0,318106 0,339244 0,328179 0,33857

Pathway   HercepTest
0,137345  0,160181
0,099809  0,225794
0,177542  0,172117
0,333404  0,345655
0,318106  0,313442
0,339244  0,341553
0,328179  0,328055
0,33857   0,338473

Inter Chrom red
Score  HercepTest  Pathway
1      0,33674     0,333404
1                  0,318106
1                  0,339244
1                  0,328179
1                  0,33857
2      0,29834
3      0,2155      0,137345
3                  0,099809
3                  0,177542

Inter MaxInt-blue

Pathway 3+   0,201969 0,14883 0,234582
Pathway Neg  0,345781 0,372439 0,378261 0,368678 0,386404

Pathway   HercepTest
0,201969  0,199704
0,14883   0,25343
0,234582  0,220801
0,345781  0,440408
0,372439  0,357373
0,378261  0,410255
0,368678  0,380708
0,386404  0,403866

Inter MaxInt-blue
Score  HercepTest  Pathway
1      0,39614     0,345781
1                  0,372439
1                  0,378261
1                  0,368678
1                  0,386404
2      0,32138
3      0,24662     0,201969
3                  0,14883
3                  0,234582

1.5 GraphPad Prism 5 calculations for the analysis of different protocols

t test of Multi Chrom blue
Correlation of Multi Chrom blue:Tabular results
Correlation of Multi Chrom blue:XY Data
t test of Multi Chrom red
Correlation of Multi Chrom red:Tabular results
Correlation of Multi Chrom red:XY Data
t test of Multi MaxInt-blue
Correlation of Multi MaxInt-blue:Tabular results
Correlation of Multi MaxInt-blue:XY Data
t test of Inter Chrom red
Correlation of Inter Chrom red:Tabular results
Correlation of Inter Chrom red:XY Data
t test of Inter MaxInt-blue
Correlation of Inter MaxInt-blue:Tabular results
Correlation of Inter MaxInt-blue:XY Data
Linear reg. of Multi Chrom blue:Tabular results
Linear reg. of Multi Chrom red:Tabular results
Linear reg. of Multi MaxInt-blue:Tabular results
Linear reg. of Inter MaxInt-blue:Tabular results
Nonlin fit of Inter Chrom red:Table of results

t test of Multi Chrom blue

Table Analyzed                                Multi Chrom blue
Column A vs Column B                          Pathway 3+ vs Pathway Neg

Unpaired t test
  P value                                     < 0.0001
  P value summary                             ***
  Are means signif. different? (P < 0.05)     Yes
  One- or two-tailed P value?                 Two-tailed
  t, df                                       t=16.21 df=6

How big is the difference?
  Mean ± SEM of column A                      0.1363 ± 0.01732 N=3
  Mean ± SEM of column B                      0.3471 ± 0.002259 N=5
  Difference between means                    -0.2108 ± 0.01300
  95% confidence interval                     -0.2426 to -0.1789
  R squared                                   0.9777

F test to compare variances
  F,DFn, Dfd                                  35.28, 2, 4
  P value                                     0.0058
  P value summary                             **
  Are variances significantly different?      Yes
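The t statistic and the variance ratio of the Multi Chrom blue comparison can be re-derived from the eight Pathway values of dataset C. The sketch below assumes the standard pooled-variance (equal-variance) unpaired t test, which is consistent with df = 6 for groups of 3 and 5 values; the function name `unpaired_t` is hypothetical and no P value is computed. Under that assumption the result should be close to the reported t = 16.21 and F = 35.28.

```python
import math

# Multi Chrom blue values for the Pathway protocol (decimal commas converted).
pathway_3plus = [0.133071, 0.108079, 0.167818]
pathway_neg = [0.33891, 0.350272, 0.34782, 0.351997, 0.346451]


def unpaired_t(a, b):
    """Pooled-variance unpaired t test; returns |t|, df and the variance ratio."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((x - mean_b) ** 2 for x in b)
    df = na + nb - 2
    pooled_var = (ss_a + ss_b) / df
    se_diff = math.sqrt(pooled_var * (1 / na + 1 / nb))
    t = abs(mean_a - mean_b) / se_diff
    var_a, var_b = ss_a / (na - 1), ss_b / (nb - 1)
    # The F test places the larger variance in the numerator.
    f_ratio = max(var_a, var_b) / min(var_a, var_b)
    return t, df, f_ratio


t_value, df, f_ratio = unpaired_t(pathway_3plus, pathway_neg)
print(f"t={t_value:.2f} df={df} F={f_ratio:.2f}")
```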

Correlation of Multi Chrom blue:Tabular results

                                              HercepTest
Number of XY Pairs                            8
Pearson r                                     0.9274
95% confidence interval                       0.6426 to 0.9870
P value (two-tailed)                          0.0009
P value summary                               ***
Is the correlation significant? (alpha=0.05)  Yes
R squared                                     0.8600

Correlation of Multi Chrom blue:XY Data

Pathway   HercepTest
0,133071  0,169057
0,108079  0,245902
0,167818  0,194529
0,33891   0,354608
0,350272  0,336909
0,34782   0,374578
0,351997  0,358807
0,346451  0,363239
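The Pearson coefficient between the two staining protocols follows directly from the eight Pathway/HercepTest pairs above. This is a minimal sketch with a hypothetical helper `pearson_r`; it does not reproduce Prism's confidence interval or P value. It should return a value close to the reported r = 0.9274 (R squared 0.8600).

```python
import math

# Multi Chrom blue XY data (Pathway = X, HercepTest = Y), decimal commas converted.
pathway = [0.133071, 0.108079, 0.167818, 0.33891,
           0.350272, 0.34782, 0.351997, 0.346451]
herceptest = [0.169057, 0.245902, 0.194529, 0.354608,
              0.336909, 0.374578, 0.358807, 0.363239]


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(pathway, herceptest)
print(f"r={r:.4f} R squared={r * r:.4f}")
```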

t test of Multi Chrom red

Table Analyzed                                Multi/Chrom red
Column A vs Column B                          Pathway 3+ vs Pathway Neg

Unpaired t test
  P value                                     < 0.0001
  P value summary                             ***
  Are means signif. different? (P < 0.05)     Yes
  One- or two-tailed P value?                 Two-tailed
  t, df                                       t=12.73 df=6

How big is the difference?
  Mean ± SEM of column A                      0.1392 ± 0.02245 N=3
  Mean ± SEM of column B                      0.3601 ± 0.004254 N=5
  Difference between means                    -0.2209 ± 0.01735
  95% confidence interval                     -0.2634 to -0.1785
  R squared                                   0.9643

F test to compare variances
  F,DFn, Dfd                                  16.72, 2, 4
  P value                                     0.0228
  P value summary                             *
  Are variances significantly different?      Yes

Correlation of Multi Chrom red:Tabular results

                                              HercepTest
Number of XY Pairs                            8
Pearson r                                     0.9249
95% confidence interval                       0.6324 to 0.9866
P value (two-tailed)                          0.0010
P value summary                               ***
Is the correlation significant? (alpha=0.05)  Yes
R squared                                     0.8555

Correlation of Multi Chrom red:XY Data

Pathway   HercepTest
0,136216  0,175525
0,101914  0,251794
0,179516  0,19927
0,345124  0,384265
0,360953  0,362815
0,361296  0,389711
0,361613  0,372731
0,371639  0,380556

t test of Multi MaxInt-blue

Table Analyzed                                Multi/MaxInt-blue
Column A vs Column B                          Pathway 3+ vs Pathway Neg

Unpaired t test
  P value                                     < 0.0001
  P value summary                             ***
  Are means signif. different? (P < 0.05)     Yes
  One- or two-tailed P value?                 Two-tailed
  t, df                                       t=16.83 df=6

How big is the difference?
  Mean ± SEM of column A                      0.1751 ± 0.01059 N=3
  Mean ± SEM of column B                      0.3612 ± 0.005927 N=5
  Difference between means                    -0.1861 ± 0.01106
  95% confidence interval                     -0.2132 to -0.1590
  R squared                                   0.9793

F test to compare variances
  F,DFn, Dfd                                  1.914, 2, 4
  P value                                     0.5222
  P value summary                             ns
  Are variances significantly different?      No

Correlation of Multi MaxInt-blue:Tabular results

                                              HercepTest
Number of XY Pairs                            8
Pearson r                                     0.9332
95% confidence interval                       0.6674 to 0.9881
P value (two-tailed)                          0.0007
P value summary                               ***
Is the correlation significant? (alpha=0.05)  Yes
R squared                                     0.8709

Correlation of Multi MaxInt-blue:XY Data

Pathway   HercepTest
0,177145  0,188805
0,155767  0,258153
0,19226   0,207653
0,341728  0,389477
0,365271  0,349266
0,358671  0,379803
0,361599  0,375019
0,37853   0,389859

t test of Inter Chrom red

Table Analyzed                                Inter/Chrom red
Column A vs Column B                          Pathway 3+ vs Pathway Neg

Unpaired t test
  P value                                     < 0.0001
  P value summary                             ***
  Are means signif. different? (P < 0.05)     Yes
  One- or two-tailed P value?                 Two-tailed
  t, df                                       t=11.24 df=6

How big is the difference?
  Mean ± SEM of column A                      0.1382 ± 0.02244 N=3
  Mean ± SEM of column B                      0.3315 ± 0.003899 N=5
  Difference between means                    -0.1933 ± 0.01720
  95% confidence interval                     -0.2353 to -0.1512
  R squared                                   0.9547

F test to compare variances
  F,DFn, Dfd                                  19.88, 2, 4
  P value                                     0.0167
  P value summary                             *
  Are variances significantly different?      Yes

Correlation of Inter Chrom red:Tabular results

                                              HercepTest
Number of XY Pairs                            8
Pearson r                                     0.9128
95% confidence interval                       0.5831 to 0.9843
P value (two-tailed)                          0.0016
P value summary                               **
Is the correlation significant? (alpha=0.05)  Yes
R squared                                     0.8331

Correlation of Inter Chrom red:XY Data

Pathway   HercepTest
0,137345  0,160181
0,099809  0,225794
0,177542  0,172117
0,333404  0,345655
0,318106  0,313442
0,339244  0,341553
0,328179  0,328055
0,33857   0,338473

t test of Inter MaxInt-blue

Table Analyzed                                Inter/MaxInt-blue
Column A vs Column B                          Pathway 3+ vs Pathway Neg

Unpaired t test
  P value                                     0.0001
  P value summary                             ***
  Are means signif. different? (P < 0.05)     Yes
  One- or two-tailed P value?                 Two-tailed
  t, df                                       t=8.591 df=6

How big is the difference?
  Mean ± SEM of column A                      0.1951 ± 0.02499 N=3
  Mean ± SEM of column B                      0.3703 ± 0.006823 N=5
  Difference between means                    -0.1752 ± 0.02039
  95% confidence interval                     -0.2251 to -0.1253
  R squared                                   0.9248

F test to compare variances
  F,DFn, Dfd                                  8.048, 2, 4
  P value                                     0.0792
  P value summary                             ns
  Are variances significantly different?      No

Correlation of Inter MaxInt-blue:Tabular results

                                              HercepTest
Number of XY Pairs                            8
Pearson r                                     0.8777
95% confidence interval                       0.4534 to 0.9777
P value (two-tailed)                          0.0042
P value summary                               **
Is the correlation significant? (alpha=0.05)  Yes
R squared                                     0.7704

Correlation of Inter MaxInt-blue:XY Data

Pathway   HercepTest
0,201969  0,199704
0,14883   0,25343
0,234582  0,220801
0,345781  0,440408
0,372439  0,357373
0,378261  0,410255
0,368678  0,380708
0,386404  0,403866

Linear reg. of Multi Chrom blue:Tabular results

                                  Model              Pathway
Best-fit values
  Slope                           -0.07104 ± 0.0000  -0.1054 ± 0.006502
  Y-intercept when X=0.0          0.4426 ± 0.0000    0.4525 ± 0.01300
  X-intercept when Y=0.0          6.230              4.294
  1/slope                         -14.08             -9.489
95% Confidence Intervals
  Slope                           Perfect line       -0.1213 to -0.08947
  Y-intercept when X=0.0          Perfect line       0.4207 to 0.4843
  X-intercept when Y=0.0          Perfect line       3.933 to 4.773
Goodness of Fit
  r²                              1.000              0.9777
  Sy.x                            0.0000             0.01781
Is slope significantly non-zero?
  F                               Perfect line       262.7
  DFn, DFd                        1.000, 1.000       1.000, 6.000
  P value                                            < 0.0001
  Deviation from zero?                               Significant
Data
  Number of X values              3                  8
  Maximum number of Y replicates  1                  1
  Total number of values          3                  8
  Number of missing values        6                  1
Runs test
  Points above line               Perfect line       4
  Points below line                                  4
  Number of runs                                     4
  P value (runs test)                                0.3714
  Deviation from linearity                           Not Significant

Linear reg. of Multi Chrom red:Tabular results

                                  Model              Pathway
Best-fit values
  Slope                           -0.07779 ± 0.0000  -0.1105 ± 0.008675
  Y-intercept when X=0.0          0.4697 ± 0.0000    0.4706 ± 0.01735
  X-intercept when Y=0.0          6.038              4.260
  1/slope                         -12.86             -9.053
95% Confidence Intervals
  Slope                           Perfect line       -0.1317 to -0.08923
  Y-intercept when X=0.0          Perfect line       0.4281 to 0.5130
  X-intercept when Y=0.0          Perfect line       3.821 to 4.892
Goodness of Fit
  r²                              1.000              0.9643
  Sy.x                            0.0000             0.02376
Is slope significantly non-zero?
  F                               Perfect line       162.1
  DFn, DFd                        1.000, 1.000       1.000, 6.000
  P value                                            < 0.0001
  Deviation from zero?                               Significant
Data
  Number of X values              3                  8
  Maximum number of Y replicates  1                  1
  Total number of values          3                  8
  Number of missing values        6                  1
Runs test
  Points above line               Perfect line       5
  Points below line                                  3
  Number of runs                                     4
  P value (runs test)                                0.4286
  Deviation from linearity                           Not Significant

Linear reg. of Multi MaxInt-blue:Tabular results

                                  Model                     Pathway
Best-fit values
  Slope                           -0.07369 ± 0.000000004268  -0.09305 ± 0.005528
  Y-intercept when X=0.0          0.4621 ± 0.00000000922     0.4542 ± 0.01106
  X-intercept when Y=0.0          6.271                      4.881
  1/slope                         -13.57                     -10.75
95% Confidence Intervals
  Slope                           Perfect line               -0.1066 to -0.07952
  Y-intercept when X=0.0          Perfect line               0.4272 to 0.4813
  X-intercept when Y=0.0          Perfect line               4.463 to 5.435
Goodness of Fit
  r²                              1.000                      0.9793
  Sy.x                            0.000000006036             0.01514
Is slope significantly non-zero?
  F                               Perfect line               283.4
  DFn, DFd                        1.000, 1.000               1.000, 6.000
  P value                                                    < 0.0001
  Deviation from zero?                                       Significant
Data
  Number of X values              3                          8
  Maximum number of Y replicates  1                          1
  Total number of values          3                          8
  Number of missing values        6                          1
Runs test
  Points above line               Perfect line               5
  Points below line                                          3
  Number of runs                                             6
  P value (runs test)                                        0.9286
  Deviation from linearity                                   Not Significant

Linear reg. of Inter MaxInt-blue:Tabular results

                                  HercepTest                 Pathway
Best-fit values
  Slope                           -0.07476 ± 0.000000008637  -0.08759 ± 0.01020
  Y-intercept when X=0.0          0.4709 ± 0.00000001866     0.4579 ± 0.02039
  X-intercept when Y=0.0          6.299                      5.228
  1/slope                         -13.38                     -11.42
95% Confidence Intervals
  Slope                           Perfect line               -0.1125 to -0.06264
  Y-intercept when X=0.0          Perfect line               0.4080 to 0.5078
  X-intercept when Y=0.0          Perfect line               4.419 to 6.650
Goodness of Fit
  r²                              1.000                      0.9248
  Sy.x                            0.00000001221              0.02792
Is slope significantly non-zero?
  F                               Perfect line               73.81
  DFn, DFd                        1.000, 1.000               1.000, 6.000
  P value                                                    0.0001
  Deviation from zero?                                       Significant
Data
  Number of X values              3                          8
  Maximum number of Y replicates  1                          1
  Total number of values          3                          8
  Number of missing values        6                          1

Nonlin fit of Inter Chrom red:Table of results HercepTest

Pathway

Comparison of Fits Null hypothesis Alternative hypothesis P value Conclusion (alpha = 0.05) Preferred model F (DFn, DFd) 2 parameters different for each data set Best-fit values B0 B1 B2 Std. Error B1 B2 95% Confidence Intervals B1 B2 Goodness of Fit Degrees of Freedom R² Absolute Sum of Squares Sy.x Normality of Residuals D'Agostino & Pearson omnibus K2 P value Runs test Points above curve Points below curve Number of runs P value (runs test) Deviation from Model Constraints B0

Global (shared) 2 parameters same for all data sets 2 parameters different for each data set 0.0408 Reject null hypothesis 2 parameters different for each data set 5.230 (2,7)

Perfect fit = 0.3307 0.02826 -0.02222

= 0.3307 0.03328 -0.03248 0.01596 0.005732 -0.005767 to 0.07232 -0.04650 to -0.01845

1 1.000 0.0

6 0.9547 0.003326 0.02355 1.859 0.3948 4 4 7 0.9714 Not Significant

B0 = 0.3307

B0 = 0.3307

2 parameters same for all data sets Best-fit values


B0 B1 B2 Std. Error B1 B2 95% Confidence Intervals B1 B2 Goodness of Fit Degrees of Freedom R² Absolute Sum of Squares Sy.x Normality of Residuals D'Agostino & Pearson omnibus K2 P value Runs test Points above curve Points below curve Number of runs P value (runs test) Deviation from Model Constraints B0 B1 B2 Number of points Analyzed

= 0.3307 0.03605 -0.03107

= 0.3307 0.03605 -0.03107

0.03605 -0.03107

0.01758 0.006410

0.01758 0.006410

0.01758 0.006410

-0.003716 to 0.07582 -0.04557 to -0.01657

-0.003716 to 0.07582 -0.04557 to -0.01657

-0.003716 to 0.07582 -0.04557 to -0.01657

0.5359 0.003564

0.9355 0.004733

9 0.8993 0.008297 0.03036

5.384 0.0678 3 0 1 Significant

3 5 6 0.9286 Not Significant

B0 = 0.3307 B1 is shared B2 is shared

B0 = 0.3307 B1 is shared B2 is shared

3

8

1.6 Output from dataset D

Multi Chrom blue
Model Multi Chrom blue
Multi Chrom red
Model Multi Chrom red
Multi MaxInt-blue
Model Multi MaxInt-blue
Inter Chrom red
Model Inter Chrom red
Inter MaxInt-blue
Model Inter MaxInt-blue

Multi Chrom blue

Score  1 µm      3 µm      8 µm      Model
1      0,378438  0,383364  0,330932  0,37156
1      0,364009  0,386884  0,4031
1      0,375216  0,393726  0,38011
1      0,378534  0,387311  0,381268
2      0,328524  0,316847  0,253418  0,30052
3      0,2668    0,262672  0,140328  0,22948
3      0,267258  0,239666  0,176974
3      0,291534  0,251666  0,188276


6 5 7 0.7381 Not Significant

Model Multi Chrom blue

Thickness  Slope     y-intercept
1          -0,04936  0,4238
3          -0,07104  0,4426
8          -0,103    0,4749

Multi Chrom red

Score  1 µm      3 µm      8 µm      Model
1      0,38773   0,410285  0,393526  0,39191
1      0,37071   0,40299   0,449424
1      0,382024  0,418383  0,436885
1      0,387079  0,4038    0,448687
2      0,347263  0,341315  0,289522  0,31412
3      0,273283  0,276835  0,146159  0,23633
3      0,273408  0,248362  0,184098
3      0,298248  0,257855  0,196947

Model Multi Chrom red

Thickness  y-intercept  Slope
1          0,4334       -0,04984
3          0,4697       -0,07779
8          0,559        -0,1285

Multi MaxInt-blue

Score  1 µm      3 µm      8 µm      Model
1      0,388021  0,407255  0,433265  0,38841
1      0,36265   0,408379  0,468372
1      0,374624  0,40177   0,430408
1      0,37727   0,40097   0,492152
2      0,334438  0,338826  0,29906   0,31472
3      0,275777  0,283905  0,163857  0,24103
3      0,276367  0,252491  0,223576
3      0,298701  0,267025  0,246087

Model Multi MaxInt-blue

Thickness  y-intercept  Slope
1          0,4221       -0,04593
3          0,4621       -0,07369
8          0,5753       -0,1231

Inter Chrom red

Score  1 µm      3 µm      8 µm      Model
1      0,340553  0,333389  0,327266  0,34927
1      0,334366  0,342286  0,363319
1      0,342016  0,348253  0,335751
1      0,337706  0,341814  0,317325
2      0,322727  0,320648  0,256208  0,28444
3      0,2487    0,240418  0,135863  0,21961
3      0,247429  0,237949  0,174837
3      0,270323  0,220049  0,176261

Model Inter Chrom red

Thickness  y-intercept  Slope
1          0,3826       -0,04112
3          0,4141       -0,06483
8          0,4234       -0,08667

Inter MaxInt-blue

Score  1 µm      3 µm      8 µm      Model
1      0,402186  0,436525  0,497492  0,39614
1      0,375123  0,42352   0,503026
1      0,389627  0,437538  0,526567
1      0,391102  0,42128   0,543216
2      0,326269  0,327426  0,280206  0,32138
3      0,284894  0,309388  0,148932  0,24662
3      0,285707  0,276834  0,244535
3      0,301675  0,275988  0,252766

Model Inter MaxInt-blue

Thickness  y-intercept  Slope
1          0,4376       -0,04963
3          0,4709       -0,07476
8          0,6608       -0,1527

1.7 GraphPad Prism 5 calculations for the analysis of the effect of tissue thickness

Nonlin fit of Multi Chrom blue:Table of results
Nonlin fit of Slope Multi Chrom blue:Table of results
Nonlin fit of Model Multi Chrom blue:Table of results
Nonlin fit of Multi Chrom red:Table of results
Nonlin fit of Slope Multi Chrom red:Table of results
Nonlin fit of Multi MaxInt-blue:Table of results
Nonlin fit of Slope Multi MaxInt-blue:Table of results
Nonlin fit of Inter Chrom red:Table of results
Nonlin fit of Slope Inter Chrom red:Table of results
Nonlin fit of Inter MaxInt-blue:Table of results
Nonlin fit of Slope Inter MaxInt-blue:Table of results
Nonlin fit of Model Multi Chrom red:Table of results
Nonlin fit of Model Multi MaxInt-blue:Table of results
Nonlin fit of Model Inter Chrom red:Table of results
Nonlin fit of Model Inter MaxInt-blue:Table of results

Nonlin fit of Multi Chrom blue:Table of results 1 um

8um

Comparison of Fits

Model Can't calculate


Null hypothesis Alternative hypothesis P value Conclusion (alpha = 0.05) Preferred model F (DFn, DFd) Straight line Best-fit values YIntercept Slope Std. Error YIntercept Slope 95% Confidence Intervals YIntercept Slope Goodness of Fit Degrees of Freedom R² Absolute Sum of Squares Sy.x Second order polynomial (quadratic) Best-fit values B0 B1 B2 Std. Error B0 B1 B2 95% Confidence Intervals B0 B1 B2 Goodness of Fit Degrees of Freedom R² Absolute Sum of Squares Sy.x Number of points Analyzed

Straight line Second order polynomial (quadratic) 0.7406 Do not reject null hypothesis Straight line 0.1225 (1,5)

Straight line Second order polynomial (quadratic) 0.5849 Do not reject null hypothesis Straight line 0.3404 (1,5)

0.4238 -0.04936

0.4749 -0.1030

0.007674 0.003669

0.02141 0.01024

0.4051 to 0.4426 -0.05833 to -0.04038

0.4225 to 0.5273 -0.1280 to -0.07793

6 0.9679 0.0005553 0.009620

6 0.9440 0.004324 0.02684

0.4426 -0.07104

1 1.000 0.0

Perfect fit 0.4118 -0.03382 -0.003901

0.5298 -0.1737 0.01777

0.03543 0.04455 0.01114

0.09684 0.1218 0.03046

0.3207 to 0.5029 -0.1484 to 0.08072 -0.03255 to 0.02475

0.2808 to 0.7788 -0.4868 to 0.1393 -0.06054 to 0.09608

5 0.9687 0.0005420 0.01041

5 0.9476 0.004048 0.02845

0 1.000 0.0

8

8

3

Intensity

Straight line Best-fit values YIntercept Slope Std. Error YIntercept Slope 95% Confidence Intervals YIntercept Slope

Perfect fit Straight line

Perfect fit
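The "Comparison of Fits" rows for the 1 um and 8um columns compare a straight line with a second order polynomial. The reported F values are consistent with the usual extra sum-of-squares F test, F = ((SS_null - SS_alt) / (DF_null - DF_alt)) / (SS_alt / DF_alt), evaluated on the absolute sums of squares and degrees of freedom in the table. The sketch below assumes that formula; the function name `extra_ss_f` is hypothetical and the P value is not computed. Because the inputs are the rounded values printed in the table, the results only agree with the reported 0.1225 and 0.3404 to about three decimals.

```python
def extra_ss_f(ss_null, df_null, ss_alt, df_alt):
    """Extra sum-of-squares F statistic for two nested least-squares fits."""
    numerator = (ss_null - ss_alt) / (df_null - df_alt)
    denominator = ss_alt / df_alt
    return numerator / denominator


# Straight line (null) versus quadratic (alternative), Multi Chrom blue.
f_1um = extra_ss_f(ss_null=0.0005553, df_null=6, ss_alt=0.0005420, df_alt=5)
f_8um = extra_ss_f(ss_null=0.004324, df_null=6, ss_alt=0.004048, df_alt=5)

print(f"1 um: F={f_1um:.4f} (1,5)")
print(f"8 um: F={f_8um:.4f} (1,5)")
```

A small F, as here, means the extra quadratic term removes little residual variation, which matches the "Do not reject null hypothesis" conclusion and the preference for the straight line.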

Nonlin fit of Slope Multi Chrom blue:Table of results

Comparison of Fits Null hypothesis Alternative hypothesis P value Conclusion (alpha = 0.05) Preferred model F (DFn, DFd)

Straight line Second order polynomial (quadratic)

Straight line One site competition 0.9186 Do not reject null hypothesis Straight line 0.01066 (1,24)

0.3402 -0.007218 0.02457 0.005132 0.2896 to 0.3909 -0.01779 to 0.003354


0.4426 -0.07104 8.636e-018

Goodness of Fit Degrees of Freedom R² Absolute Sum of Squares Sy.x Normality of Residuals D'Agostino & Pearson omnibus K2 P value Runs test Points above curve Points below curve Number of runs P value (runs test) Deviation from Model One site competition Best-fit values Bottom Top LogEC50 EC50 Std. Error Bottom Top LogEC50 95% Confidence Intervals Bottom Top LogEC50 EC50 Goodness of Fit Degrees of Freedom R² Absolute Sum of Squares Sy.x Normality of Residuals D'Agostino & Pearson omnibus K2 P value Runs test Points above curve Points below curve Number of runs P value (runs test) Deviation from Model Number of points Analyzed

25 0.07332 0.1387 0.07449 4.305 0.1162 13 14 4 < 0.0001 Significant

0.2818 0.3314 3.541 3473 0.02687 0.02730 1.614 0.2263 to 0.3373 0.2751 to 0.3878 0.2102 to 6.871 1.622 to 7.435e+006 24 0.07373 0.1386 0.07601 4.340 0.1142 13 14 4 < 0.0001 Significant 27

Nonlin fit of Model Multi Chrom blue:Table of results Slope Comparison of Fits Null hypothesis Alternative hypothesis P value Conclusion (alpha = 0.05) Preferred model F (DFn, DFd) Straight line Best-fit values YIntercept Slope Std. Error YIntercept Slope 95% Confidence Intervals

y-intercept

Can't calculate Can't calculate Straight line Straight line Second order polynomial (quadratic) Second order polynomial (quadratic) Perfect fit Perfect fit Second order polynomial (quadratic) Second order polynomial (quadratic)

-0.06938 -0.001272

0.4434 0.0009269

0.03663 0.007375

0.03530 0.007108


YIntercept Slope Goodness of Fit Degrees of Freedom R² Absolute Sum of Squares Sy.x Normality of Residuals D'Agostino & Pearson omnibus K2 P value Runs test Points above curve Points below curve Number of runs P value (runs test) Deviation from Model

-0.5348 to 0.3960 -0.09498 to 0.09243

-0.005163 to 0.8919 -0.08939 to 0.09124

1 0.02890 0.001414 0.03761

1 0.01672 0.001314 0.03624

2 1 3 1.0000 Not Significant

1 2 3 1.0000 Not Significant

Second order polynomial (quadratic) Best-fit values B0 B1 B2 Std. Error B0 B1 B2 95% Confidence Intervals B0 B1 B2 Goodness of Fit Degrees of Freedom R² Absolute Sum of Squares Sy.x Normality of Residuals D'Agostino & Pearson omnibus K2 P value Runs test Points above curve Points below curve Number of runs P value (runs test) Deviation from Model Number of points Analyzed

Perfect fit

Perfect fit

-0.008306 -0.04580 0.004745

0.3845 0.04384 -0.004573

0 1.000 0.0

0 1.000 0.0

3

3

Nonlin fit of Multi Chrom red:Table of results 1 um Comparison of Fits Null hypothesis Alternative hypothesis P value Conclusion (alpha = 0.05) Preferred model F (DFn, DFd) Straight line Best-fit values YIntercept Slope Std. Error YIntercept Slope 95% Confidence Intervals

Straight line Second order polynomial (quadratic) 0.2434 Do not reject null hypothesis Straight line 1.748 (1,5)

8um Straight line Second order polynomial (quadratic) 0.6316 Do not reject null hypothesis Straight line 0.2603 (1,5)

Model Can't calculate Straight line Second order polynomial (quadratic) Perfect fit Straight line

Perfect fit 0.4334 -0.04984

0.5590 -0.1285

0.009263 0.004429

0.01971 0.009422


0.4697 -0.07779

YIntercept Slope Goodness of Fit Degrees of Freedom R² Absolute Sum of Squares Sy.x Normality of Residuals D'Agostino & Pearson omnibus K2 P value Runs test Points above curve Points below curve Number of runs P value (runs test) Deviation from Model Second order polynomial (quadratic) Best-fit values B0 B1 B2 Std. Error B0 B1 B2 95% Confidence Intervals B0 B1 B2 Goodness of Fit Degrees of Freedom R² Absolute Sum of Squares Sy.x Normality of Residuals D'Agostino & Pearson omnibus K2 P value Runs test Points above curve Points below curve Number of runs P value (runs test) Deviation from Model Number of points Analyzed

0.4107 to 0.4561 -0.06068 to -0.03900

0.5108 to 0.6072 -0.1515 to -0.1054

6 0.9548 0.0008091 0.01161

6 0.9687 0.003662 0.02470

1.595 0.4506

1.556 0.4594

4 4 5 0.6286 Not Significant

5 3 4 0.4286 Not Significant

0.3855 0.01187 -0.01550

0.6036 -0.1858 0.01441

0.03727 0.04686 0.01172

0.08980 0.1129 0.02824

0.2897 to 0.4813 -0.1086 to 0.1323 -0.04563 to 0.01464

0.3727 to 0.8344 -0.4761 to 0.1044 -0.05820 to 0.08702

5 0.9665 0.0005995 0.01095

5 0.9703 0.003481 0.02638

0.6037 0.7394

2.121 0.3463

5 3 5 0.7143 Not Significant

6 2 4 0.6429 Not Significant

8

8

Ambiguous

1 1.000 0.0

0.4697 ~ -0.07779 ~ 1.901e-013

0 1.000 0.0

3

Nonlin fit of Slope Multi Chrom red:Table of results
Intensity

Comparison of Fits
  Null hypothesis: Straight line
  Alternative hypothesis: Second order polynomial (quadratic)
  P value: 0.9378
  Conclusion (alpha = 0.05): Do not reject null hypothesis
  Preferred model: Straight line
  F (DFn, DFd): 0.006209 (1,24)

Straight line
  Best-fit values: YIntercept 0.3447; Slope -0.003247
  Std. Error: YIntercept 0.02855; Slope 0.005963
  95% Confidence Intervals: YIntercept 0.2859 to 0.4036; Slope -0.01553 to 0.009037
  Goodness of Fit: Degrees of Freedom 25; R² 0.01172; Absolute Sum of Squares 0.1873; Sy.x 0.08655

Second order polynomial (quadratic)
  Best-fit values: B0 0.3408; B1 -0.0005724; B2 -0.0002826
  Std. Error: B0 0.05768; B1 0.03449; B2 0.003587
  95% Confidence Intervals: B0 0.2218 to 0.4599; B1 -0.07175 to 0.07061; B2 -0.007685 to 0.007120
  Goodness of Fit: Degrees of Freedom 24; R² 0.01198; Absolute Sum of Squares 0.1872; Sy.x 0.08833

Number of points Analyzed: 27

Nonlin fit of Multi MaxInt-blue:Table of results 1 um Comparison of Fits Null hypothesis Alternative hypothesis P value Conclusion (alpha = 0.05) Preferred model F (DFn, DFd) Straight line Best-fit values YIntercept Slope Std. Error YIntercept Slope 95% Confidence Intervals YIntercept Slope Goodness of Fit Degrees of Freedom R² Absolute Sum of Squares Sy.x Normality of Residuals D'Agostino & Pearson omnibus K2 P value Runs test Points above curve Points below curve Number of runs P value (runs test) Deviation from Model

Straight line Second order polynomial (quadratic) 0.7133 Do not reject null hypothesis Straight line 0.1513 (1,5)

Model Can't calculate Straight line Second order polynomial (quadratic) Perfect fit Straight line

8um Straight line Second order polynomial (quadratic) 0.4029 Do not reject null hypothesis Straight line 0.8344 (1,5)

Perfect fit 0.4221 -0.04593

0.4621 -0.07369

0.5753 -0.1231

0.008539 0.004082

0.02780 0.01329

0.4012 to 0.4430 -0.05592 to -0.03594

0.5073 to 0.6434 -0.1556 to -0.09055

6 0.9547 0.0006875 0.01070

1 1.000 0.0

6 0.9346 0.007285 0.03485

0.6842 0.7103

2.390 0.3027

4 4 5 0.6286 Not Significant

4 4 6 0.8857 Not Significant


Second order polynomial (quadratic) Best-fit values B0 B1 B2 Std. Error B0 B1 B2 95% Confidence Intervals B0 B1 B2 Goodness of Fit Degrees of Freedom R² Absolute Sum of Squares Sy.x Normality of Residuals D'Agostino & Pearson omnibus K2 P value Runs test Points above curve Points below curve Number of runs P value (runs test) Deviation from Model Number of points Analyzed

Perfect fit 0.4072 -0.02677 -0.004810

0.4621 -0.07369 -7.451e-009

0.6821 -0.2606 0.03455

0.03932 0.04944 0.01237

0.1203 0.1512 0.03783

0.3061 to 0.5083 -0.1539 to 0.1003 -0.03660 to 0.02698

0.3729 to 0.9913 -0.6494 to 0.1281 -0.06270 to 0.1318

5 0.9561 0.0006673 0.01155

0 1.000 0.0

5 0.9440 0.006243 0.03534

0.5562 0.7572

0.6142 0.7356

4 4 5 0.6286 Not Significant

5 3 6 0.9286 Not Significant

8

3

8

Nonlin fit of Slope Multi MaxInt-blue:Table of results
Intensity

Comparison of Fits
  Null hypothesis: Straight line
  Alternative hypothesis: Second order polynomial (quadratic)
  P value: 0.9623
  Conclusion (alpha = 0.05): Do not reject null hypothesis
  Preferred model: Straight line
  F (DFn, DFd): 0.002287 (1,24)

Straight line
  Best-fit values: YIntercept 0.3338; Slope 0.001307
  Std. Error: YIntercept 0.02737; Slope 0.005717
  95% Confidence Intervals: YIntercept 0.2774 to 0.3902; Slope -0.01047 to 0.01308
  Goodness of Fit: Degrees of Freedom 25; R² 0.002087; Absolute Sum of Squares 0.1721; Sy.x 0.08298

Second order polynomial (quadratic)
  Best-fit values: B0 0.3361; B1 -0.0002492; B2 0.0001644
  Std. Error: B0 0.05530; B1 0.03306; B2 0.003439
  95% Confidence Intervals: B0 0.2219 to 0.4502; B1 -0.06849 to 0.06800; B2 -0.006933 to 0.007262
  Goodness of Fit: Degrees of Freedom 24; R² 0.002182; Absolute Sum of Squares 0.1721; Sy.x 0.08469

Number of points Analyzed: 27

Nonlin fit of Inter Chrom red:Table of results 1 um Straight line Best-fit values YIntercept Slope Std. Error YIntercept Slope 95% Confidence Intervals YIntercept Slope Goodness of Fit Degrees of Freedom R² Absolute Sum of Squares Sy.x Number of points Analyzed

Model

8um

Perfect fit 0.3826 -0.04112

0.4141 -0.06483

0.4234 -0.08667

0.009983 0.004773

0.01550 0.007412

0.3582 to 0.4070 -0.05280 to -0.02944

0.3854 to 0.4613 -0.1048 to -0.06853

6 0.9252 0.0009397 0.01251

1 1.000 0.0

6 0.9580 0.002266 0.01943

8

3

8

Nonlin fit of Slope Inter Chrom red:Table of results
Intensity

Comparison of Fits
  Null hypothesis: Straight line
  Alternative hypothesis: Second order polynomial (quadratic)
  P value: 0.9495
  Conclusion (alpha = 0.05): Do not reject null hypothesis
  Preferred model: Straight line
  F (DFn, DFd): 0.004095 (1,24)

Straight line
  Best-fit values: YIntercept 0.3128; Slope -0.006451
  Std. Error: YIntercept 0.02049; Slope 0.004280
  95% Confidence Intervals: YIntercept 0.2706 to 0.3550; Slope -0.01527 to 0.002365
  Goodness of Fit: Degrees of Freedom 25; R² 0.08331; Absolute Sum of Squares 0.09647; Sy.x 0.06212

Second order polynomial (quadratic)
  Best-fit values: B0 0.3105; B1 -0.004892; B2 -0.0001647
  Std. Error: B0 0.04140; B1 0.02475; B2 0.002574
  95% Confidence Intervals: B0 0.2251 to 0.3960; B1 -0.05598 to 0.04620; B2 -0.005478 to 0.005149
  Goodness of Fit: Degrees of Freedom 24; R² 0.08347; Absolute Sum of Squares 0.09646; Sy.x 0.06340

Number of points Analyzed: 27

Nonlin fit of Inter MaxInt-blue:Table of results 1 um Comparison of Fits Null hypothesis Alternative hypothesis P value Conclusion (alpha = 0.05) Preferred model F (DFn, DFd) Straight line Best-fit values YIntercept Slope Std. Error YIntercept Slope 95% Confidence Intervals YIntercept Slope Goodness of Fit Degrees of Freedom R² Absolute Sum of Squares Sy.x Second order polynomial (quadratic) Best-fit values B0 B1 B2 Std. Error B0 B1 B2 95% Confidence Intervals B0 B1 B2 Goodness of Fit Degrees of Freedom R² Absolute Sum of Squares Sy.x Number of points Analyzed

Straight line Second order polynomial (quadratic) 0.2714 Do not reject null hypothesis Straight line 1.527 (1,5)

Model Can't calculate Straight line Second order polynomial (quadratic) Perfect fit Straight line

8 um Straight line Second order polynomial (quadratic) 0.1002 Do not reject null hypothesis Straight line 4.053 (1,5)

Perfect fit 0.4376 -0.04963

0.4709 -0.07476

0.6608 -0.1527

0.008721 0.004169

0.03924 0.01876

0.4163 to 0.4590 -0.05983 to -0.03943

0.5648 to 0.7568 -0.1986 to -0.1068

6 0.9594 0.0007171 0.01093

1 1.000 0.0

6 0.9169 0.01451 0.04918

Perfect fit 0.4805 -0.1048 0.01387

0.4709 -0.07476 1.490e-008

0.9275 -0.4962 0.08629

0.03567 0.04485 0.01122

0.1363 0.1713 0.04286

0.3888 to 0.5722 -0.2202 to 0.01048 -0.01498 to 0.04271

0.5772 to 1.278 -0.9368 to -0.05570 -0.02391 to 0.1965

5 0.9689 0.0005493 0.01048

0 1.000 0.0

5 0.9541 0.008016 0.04004

8

3

8
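The F values in the "Comparison of Fits" rows can be checked by hand from the goodness-of-fit rows. The sketch below assumes that Prism compares the two nested models with the extra sum-of-squares F test. The function name `extra_ss_f` is hypothetical, and the inputs are the rounded sums of squares and degrees of freedom printed above for Inter MaxInt-blue, so small deviations from the printed F values are expected.

```python
def extra_ss_f(ss_simple, df_simple, ss_complex, df_complex):
    """Extra sum-of-squares F statistic for two nested least-squares fits.

    ss_* are the absolute sums of squares, df_* the residual degrees of
    freedom of the simpler (straight line) and the more complex
    (quadratic) model.
    """
    numerator = (ss_simple - ss_complex) / (df_simple - df_complex)
    denominator = ss_complex / df_complex
    return numerator / denominator


# Inter MaxInt-blue, 1 um: straight line (SS 0.0007171, DF 6)
# versus quadratic (SS 0.0005493, DF 5). Reported F (1,5) = 1.527.
f_1um = extra_ss_f(0.0007171, 6, 0.0005493, 5)

# Inter MaxInt-blue, 8 um: straight line (SS 0.01451, DF 6)
# versus quadratic (SS 0.008016, DF 5). Reported F (1,5) = 4.053.
f_8um = extra_ss_f(0.01451, 6, 0.008016, 5)

print(f"1 um: F = {f_1um:.3f}")
print(f"8 um: F = {f_8um:.3f}")
```

A small F means the quadratic term removes little residual variance relative to the noise, which is consistent with the "Do not reject null hypothesis" conclusions in these tables.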


Nonlin fit of Slope Inter MaxInt-blue:Table of results
Intensity

Comparison of Fits
  Null hypothesis: Straight line
  Alternative hypothesis: Second order polynomial (quadratic)
  P value: 0.9790
  Conclusion (alpha = 0.05): Do not reject null hypothesis
  Preferred model: Straight line
  F (DFn, DFd): 0.0007094 (1,24)

Straight line
  Best-fit values: YIntercept 0.3396; Slope 0.004339
  Std. Error: YIntercept 0.03263; Slope 0.006815
  95% Confidence Intervals: YIntercept 0.2724 to 0.4069; Slope -0.009699 to 0.01838
  Goodness of Fit: Degrees of Freedom 25; R² 0.01596; Absolute Sum of Squares 0.2446; Sy.x 0.09891

Second order polynomial (quadratic)
  Best-fit values: B0 0.3412; B1 0.003306; B2 0.0001092
  Std. Error: B0 0.06592; B1 0.03941; B2 0.004099
  95% Confidence Intervals: B0 0.2051 to 0.4772; B1 -0.07804 to 0.08466; B2 -0.008352 to 0.008570
  Goodness of Fit: Degrees of Freedom 24; R² 0.01599; Absolute Sum of Squares 0.2446; Sy.x 0.1010

Number of points Analyzed: 27

Nonlin fit of Model Multi Chrom red:Table of results y-intercept Comparison of Fits Null hypothesis Alternative hypothesis P value Conclusion (alpha = 0.05) Preferred model F (DFn, DFd) Straight line Best-fit values YIntercept Slope Std. Error

Slope

Can't calculate Can't calculate Straight line Straight line Second order polynomial (quadratic) Second order polynomial (quadratic) Perfect fit Perfect fit Second order polynomial (quadratic) Second order polynomial (quadratic)

0.4157 0.01793

-0.04127 -0.01103


YIntercept Slope 95% Confidence Intervals YIntercept Slope Goodness of Fit Degrees of Freedom R² Absolute Sum of Squares Sy.x

0.0003198 6.440e-005

0.004227 0.0008511

0.4116 to 0.4197 0.01711 to 0.01875

-0.09498 to 0.01244 -0.02184 to -0.0002119

1 1.000 1.078e-007 0.0003284

1 0.9941 1.884e-005 0.004340

Second order polynomial (quadratic) Best-fit values B0 B1 B2 Std. Error B0 B1 B2 95% Confidence Intervals B0 B1 B2 Goodness of Fit Degrees of Freedom R² Absolute Sum of Squares Sy.x Number of points Analyzed

Perfect fit

Perfect fit

0.4151 0.01832 -4.143e-005

-0.03422 -0.01617 0.0005476

0 1.000 0.0

0 1.000 0.0

3

3

Nonlin fit of Model Multi MaxInt-blue:Table of results y-intercept Second order polynomial (quadratic) Best-fit values B0 B1 B2 Std. Error B0 B1 B2 95% Confidence Intervals B0 B1 B2 Goodness of Fit Degrees of Freedom R² Absolute Sum of Squares Sy.x Number of points Analyzed

Slope

Perfect fit

Perfect fit

0.4032 0.01849 0.0003771

-0.03034 -0.01616 0.0005711

0 1.000 0.0

0 1.000 0.0

3

3

Nonlin fit of Model Inter Chrom red:Table of results y-intercept Second order polynomial (quadratic) Best-fit values B0 B1 B2

Slope

Perfect fit

Perfect fit

0.3609 0.02369 -0.001984

-0.02606 -0.01613 0.001070


Std. Error B0 B1 B2 95% Confidence Intervals B0 B1 B2 Goodness of Fit Degrees of Freedom R² Absolute Sum of Squares Sy.x Number of points Analyzed

0 1.000 0.0

0 1.000 0.0

3

3

Nonlin fit of Model Inter MaxInt-blue:Table of results y-intercept Second order polynomial (quadratic) Best-fit values B0 B1 B2 Std. Error B0 B1 B2 95% Confidence Intervals B0 B1 B2 Goodness of Fit Degrees of Freedom R² Absolute Sum of Squares Sy.x Number of points Analyzed

Slope

Perfect fit

Perfect fit

0.4301 0.004461 0.003047

-0.03836 -0.01084 -0.0004319

0 1.000 0.0

0 1.000 0.0

3

3

1.8 Output from dataset E

T-test of Multi CBlue
T-test Multi CRed
T-test Multi MaxInt-Blue
T-test Inter CRed
T-test Inter MaxInt-Blue

T-test of Multi CBlue

PixeLINK 0,295027 0,200381 0,231031 0,2133 0,245779 0,231195 0,244532 0,351663 0,370132 0,355764 0,366164 0,364068 0,380144 0,376316 0,367179 0,347998 0,354875 0,359695 0,377319 0,364993

DP 71 0,316847 0,169057 0,245902 0,194529 0,262672 0,239666 0,251666 0,354608 0,336909 0,374578 0,358807 0,363239 0,391631 0,375795 0,388875 0,383364 0,386884 0,393726 0,387311 0,277878

Neg 0,351663 0,370132 0,355764 0,366164 0,364068 0,380144 0,376316 0,367179 0,347998 0,354875 0,359695 0,377319 0,364993

2+ 0,295027

3+ 0,200381 0,231031 0,2133 0,245779 0,231195 0,244532

T-test Multi CRed

PixeLINK 0,304537 0,20281 0,235918 0,216701 0,252237 0,237207 0,251751 0,370834 0,386871 0,373503 0,388403 0,378687 0,387527 0,40081 0,380915 0,379834 0,368403 0,389328 0,385942 0,392996

DP 71 0,341315 0,175525 0,251794 0,19927 0,276835 0,248362 0,257855 0,384265 0,362815 0,389711 0,372731 0,380556 0,415327 0,385771 0,402658 0,410285 0,40299 0,418383 0,4038 0,319083

Neg 0,370834 0,386871 0,373503 0,388403 0,378687 0,387527 0,40081 0,380915 0,379834 0,368403 0,389328 0,385942 0,392996

2+ 0,304537

3+ 0,20281 0,235918 0,216701 0,252237 0,237207 0,251751

T-test Multi MaxInt-Blue

PixeLINK 0,346719 0,226844 0,247612 0,222853 0,25581 0,238075 0,255568 0,375727 0,414928 0,381285 0,407101 0,384464 0,415681 0,40989 0,402038 0,401378 0,410475 0,403868 0,409482 0,418282

DP 71 0,338826 0,188805 0,258153 0,207653 0,283905 0,252491 0,267025 0,389477 0,349266 0,379803 0,375019 0,389859 0,410131 0,387541 0,40998 0,407255 0,408379 0,40177 0,40097 0,324938

Neg 0,375727 0,414928 0,381285 0,407101 0,384464 0,415681 0,40989 0,402038 0,401378 0,410475 0,403868 0,409482 0,418282

2+ 0,346719

3+ 0,226844 0,247612 0,222853 0,25581 0,238075 0,255568


T-test Inter CRed

PixeLINK 0,298078 0,189542 0,216764 0,203741 0,236774 0,223908 0,289483 0,317506 0,329391 0,32266 0,336955 0,360152 0,372288 0,443859 0,331382 0,334377 0,324739 0,400159 0,325111 0,403479

DP 71 0,320648 0,160181 0,225794 0,172117 0,240418 0,237949 0,220049 0,345655 0,313442 0,341553 0,328055 0,338473 0,382998 0,336938 0,334184 0,333389 0,342286 0,348253 0,341814 0,276332

Neg 0,317506 0,329391 0,32266 0,336955 0,360152 0,372288 0,443859 0,331382 0,334377 0,324739 0,400159 0,325111 0,403479

2+ 0,298078

3+ 0,189542 0,216764 0,203741 0,236774 0,223908 0,289483

T-test Inter MaxInt-Blue

PixeLINK 0,332693 0,212003 0,23346 0,217316 0,258158 0,243054 0,256118 0,384564 0,428905 0,392897 0,415039 0,399478 0,414571 0,416877 0,409979 0,407388 0,418565 0,415797 0,405946 0,423806

DP 71 0,327426 0,199704 0,25343 0,220801 0,309388 0,276834 0,275988 0,440408 0,357373 0,410255 0,380708 0,403866 0,431042 0,398017 0,413873 0,436525 0,42352 0,437538 0,42128 0,31459

Neg 0,384564 0,428905 0,392897 0,415039 0,399478 0,414571 0,416877 0,409979 0,407388 0,418565 0,415797 0,405946 0,423806

2+ 0,332693

3+ 0,212003 0,23346 0,217316 0,258158 0,243054 0,256118
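The PixeLINK and DP 71 columns hold one value per image for each camera, so the camera comparison can be illustrated with a Pearson correlation and a paired t statistic. The sketch below uses the Multi CBlue values listed in this section, with decimal commas written as points. Treating the two columns as row-wise pairs and using a paired test are assumptions made for illustration. This section does not state which t test variant was run in Prism, and the function names are hypothetical.

```python
import math
import statistics

# Multi CBlue intensities per image (decimal commas written as points).
pixelink = [0.295027, 0.200381, 0.231031, 0.2133, 0.245779, 0.231195,
            0.244532, 0.351663, 0.370132, 0.355764, 0.366164, 0.364068,
            0.380144, 0.376316, 0.367179, 0.347998, 0.354875, 0.359695,
            0.377319, 0.364993]
dp71 = [0.316847, 0.169057, 0.245902, 0.194529, 0.262672, 0.239666,
        0.251666, 0.354608, 0.336909, 0.374578, 0.358807, 0.363239,
        0.391631, 0.375795, 0.388875, 0.383364, 0.386884, 0.393726,
        0.387311, 0.277878]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def paired_t(xs, ys):
    """Paired t statistic and its degrees of freedom (assumed test design)."""
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1


r = pearson_r(pixelink, dp71)
t_stat, dof = paired_t(pixelink, dp71)
print(f"Pearson r = {r:.4f}")
print(f"paired t = {t_stat:.4f} with {dof} degrees of freedom")
```

A correlation close to 1 together with a small absolute t value would indicate that the two cameras rank the images alike and show no systematic offset.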

1.9 GraphPad Prism 5 calculations for the analysis of the differences between two cameras

Linear reg. of Multi CBlue:Tabular results
Linear reg. of Multi CRed:Tabular results
Linear reg. of Multi MaxInt-Blue:Tabular results
Linear reg. of Inter CRed:Tabular results
Linear reg. of Inter MaxInt-Blue:Tabular results
t test of T-test of Multi CBlue
Correlation of T-test of Multi CBlue:Tabular results
t test of T-test Multi CRed
Correlation of T-test Multi CRed:Tabular results
t test of T-test Multi MaxInt-Blue
Correlation of T-test Multi MaxInt-Blue:Tabular results
t test of T-test Inter CRed
Correlation of T-test Inter CRed:Tabular results
t test of T-test Inter MaxInt-Blue
Correlation of T-test Inter MaxInt-Blue:Tabular results

Linear reg. of Multi CBlue:Tabular results DP 71 Multi/CBlue Best-fit values Slope Y-intercept when X=0.0 X-intercept when Y=0.0 1/slope 95% Confidence Intervals Slope Y-intercept when X=0.0 X-intercept when Y=0.0 Goodness of Fit r² Sy.x Is slope significantly non-zero? F DFn, DFd P value Deviation from zero? Data Number of X values Maximum number of Y replicates Total number of values Number of missing values Runs test Points above line Points below line Number of runs P value (runs test) Deviation from linearity

-0.07104 0.4426 6.230 -14.08

± ±

PixeLINK Multi/CBlue 0.0000 -0.06834 0.0000 0.4326 6.331 -14.63

± ±

0.003070 0.005784

Perfect Perfect Perfect

line -0.07479 line 0.4205 line 5.919

to to to

-0.06189 0.4448 6.827

1.000 0.0000

0.9649 0.01249 495.5 1.000 1.000, < line Significant

1.000, Perfect 3 1 3 17

20 1 20 0

Perfect line

11 9 12 0.7731 Not Significant

18.00 0.0001
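The PixeLINK slope and intercept above can be recomputed with an ordinary least-squares fit to the Multi CBlue values from section 1.8. The sketch assumes that the X values code the HER2/neu classes as Neg = 1, 2+ = 2 and 3+ = 3. This coding is not stated in the output and was chosen because it brings the fit close to the reported values. The function name `fit_line` is hypothetical.

```python
# PixeLINK Multi CBlue intensities in the order of section 1.8:
# one 2+ image, six 3+ images, thirteen Neg images.
intensity = [0.295027,
             0.200381, 0.231031, 0.2133, 0.245779, 0.231195, 0.244532,
             0.351663, 0.370132, 0.355764, 0.366164, 0.364068, 0.380144,
             0.376316, 0.367179, 0.347998, 0.354875, 0.359695, 0.377319,
             0.364993]

# Assumed class coding: Neg = 1, 2+ = 2, 3+ = 3.
score = [2] + [3] * 6 + [1] * 13


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(score, intensity)
print(f"slope = {slope:.5f}")               # reported: -0.06834
print(f"intercept = {intercept:.4f}")       # reported: 0.4326
print(f"x-intercept = {-intercept / slope:.3f}")  # reported: 6.331
```

The negative slope reflects that a stronger HER2/neu staining gives a lower intensity value in this measure.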

Linear reg. of Multi CRed:Tabular results DP 71 Multi/CRed Best-fit values Slope -0.07779 Y-intercept when X=0.0 0.4697 X-intercept when Y=0.0 6.038 1/slope -12.86 95% Confidence Intervals Slope Perfect Y-intercept when X=0.0 Perfect X-intercept when Y=0.0 Perfect Goodness of Fit r² 1.000 Sy.x 0.0000 Is slope significantly non-zero? F DFn, DFd 1.000, P value

± ±

PixeLINK Multi/CRed 0.0000 -0.07538 0.0000 0.4586 6.084 -13.27

± ±

0.003146 0.005928

line -0.08199 line 0.4462 line 5.719

to to to

-0.06877 0.4711 6.518

0.9696 0.01280 574.0 1.000 1.000,