Quality assurance in open source projects

Jaana Matila, Miikka Kuutila & Teemu Kaukoranta

Abstract
1. Introduction
2. Social structure
2.1 Managing quality assurance tasks and issues
3. Tools for quality assurance
3.1 Peer reviews
3.2 Testing
3.3 Bug Reporting
4. Conclusions
References
Abstract

The onion model separates open source project organizations into different layers: the core, the developers, the active users, and the passive users. While the model may have its shortcomings, mainly because open source projects tend to be very different from each other, we find that it is also a good model for representing how quality assurance tasks are carried out. The main concepts that we discuss are peer reviews, testing procedures, and bug reports. In peer reviews we find that while the difference between the core and the developers may be small, it still exists. Individual peer reviews are usually carried out by a small number of reviewers who are typically more experienced than the author, and usually each developer only does a small number of reviews a month, because developers only review patches that they feel they have expertise about. The number of reviews per month is higher for core members, which highlights the fact that members of the core have more expertise and experience than the non-core developers. A lot of testing and bug reporting responsibility often falls upon the user base of the software. While there are testing procedures that are carried out by the development team, often it is the users who use the beta versions of the software and then report the bugs to the development team.
1. Introduction

This is a short article written for the course Open Source Software Development, in the Department of Information Processing Science at the University of Oulu. This article discusses quality assurance in open source projects, and tries to answer which social groups, as defined by the onion model, are associated with which quality assurance tasks. Quality assurance can be of critical importance to a software project. A PR Newswire report (2012) analyzed 45 big open source projects: on average a project had over 800,000 lines of code, and the defect density was 0.45 defects per thousand lines of code. The defect density of open source code was measured to be better than the software industry average. When choosing our topic, quality assurance in open source software projects caught our attention: because the nature of open source projects differs from that of traditional closed source projects, quality assurance seems like it could be a big challenge. There has been a lot of research about the social structure of different open source projects. Open source projects are very varied, however, and so the social structures can be very different. There have also been a number of case studies about quality assurance in open source projects. Through a literature review, we want to understand how quality assurance is carried out in practice, and how the size and the social structure of the project affect quality assurance. We also want to understand how social structures tie into quality assurance responsibilities and tasks. Specifically, we focus on how different issues and tasks are presented in different layers of open source project organizations. We first discuss the social structures of open source projects, focusing largely on the onion model.
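The defect density figure cited above is simple arithmetic over project size and defect count. A minimal sketch follows; the defect count is invented purely to illustrate the 0.45 defects per KLOC figure, and only the roughly 800,000-line project size comes from the report.

```python
# Defect density is measured in defects per thousand lines of code (KLOC).
# The defect count below is a hypothetical value chosen for illustration.
lines_of_code = 800_000
defects_found = 360

defect_density = defects_found / (lines_of_code / 1000)
print(f"{defect_density:.2f} defects per KLOC")  # 0.45 defects per KLOC
```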
We then go through different quality assurance tasks and concepts, such as code ownership and peer reviews, and see how these relate to the different layers of the previously presented social structure.
2. Social structure

The social structure of open source projects is often described as an onion, since most projects can be considered to consist of multiple layers of users. In the absolute middle there are the core developers: a small group of people who contribute big parts of the code and are responsible for overseeing the project as a whole. Typically the size of the core group is kept small, because a high level of interaction is required, and that would be difficult if the core were too large (Aberdour, 2007). One case study about the Apache Web Server found that a core of 15 developers was responsible for over 80% of the code for new features (Mockus et al., 2002).
The “co-developers”, “contributing developers”, or just “developers”, surround the core. Typically all submissions made by the co-developers are reviewed by the core developers. Since the level of interaction required is much smaller than in the core, this group can be quite large. Co-developers contribute sporadically, often helping with bug fixes or by reviewing changes (Aberdour, 2007; Crowston et al., 2004; Crowston et al., 2003). Surrounding this developer group are the two layers of users: active and passive. Active users are those who do not contribute to the project by writing code, but they can report bugs, make feature requests, or write documentation. The passive users are the group of people who just use the software (Aberdour, 2007; Crowston et al., 2004; Crowston et al., 2003). As is typical of open source projects, it is difficult to present a single social structure model for how quality assurance is carried out. For example, in the Ubuntu and LibreOffice communities QA can be considered a separate layer, but such a phenomenon cannot be observed in the Plone and KDE communities. In all communities, people participating in quality assurance seem to communicate among themselves and directly with members who concern themselves with other tasks. It is also challenging to find any consistent patterns of team growth, due to spikes in metrics such as team size (Barham, 2014).
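The layered structure described above can be pictured as a simple classifier over contribution activity. The sketch below is a toy model: the commit threshold separating core developers from co-developers is an invented placeholder, not part of the onion model itself.

```python
# Toy sketch of the onion model's four layers. The threshold of 100
# commits is hypothetical; real projects draw the line very differently.
def onion_layer(commits: int, bug_reports: int) -> str:
    """Classify a community member into one of the four onion layers."""
    if commits > 100:          # heavy, sustained code contribution
        return "core developer"
    if commits > 0:            # sporadic code contribution
        return "co-developer"
    if bug_reports > 0:        # contributes without writing code
        return "active user"
    return "passive user"      # only uses the software

print(onion_layer(commits=250, bug_reports=3))  # core developer
print(onion_layer(commits=0, bug_reports=5))    # active user
```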
2.1 Managing quality assurance tasks and issues

Managing quality assurance in open source projects involves distributing quality assurance tasks and responsibilities to the project's core developers, co-developers, and active users. Mockus et al. (2002) did a case study of two open source projects and reported extensively on their quality assurance methods. In Apache's case, core developers tend to work on problems identified in the areas of code with which they are most familiar. In this smaller project the architecture of Apache allows for “code ownership”, where core developers usually take on maintaining or developing a core part of the server. This ownership does not grant any special powers and is not official. Additional features are bolted onto the core and have people overseeing their development. Mockus et al. (2002) also report on a big open source project, Mozilla, where there were some differences in managing quality assurance tasks. Mozilla is operated by paid staff. This staff develops and maintains the tools used for bug reporting, and thus gives the development community the means to track and fix bugs. These tools include problem reporting via mail and a database where the reported problems are stored. In the Mozilla project code ownership is enforced, and the code owner is responsible for “fielding bug reports, enhancement requests, patch submissions and so on.” In both Apache's and Mozilla's cases the core developers are responsible for most of the bug fixes, and are thus the biggest part of quality assurance. In Mozilla's case the project resembles more conventional ways of software development, as the whole process is more defined and the policies regarding quality are stricter.
So in Mockus et al. (2002) both of the studied projects have code ownership: in one project it is enforced, and in the other it is more loosely applied. These findings support the onion model, where core developers are seen to be responsible for the overview of the project. Code ownership is a big part of quality assurance in both the Apache and Mozilla projects. In Van Krogh, Spaeth & Lakhani (2003) one interviewed developer describes the situation such that core developers work on issues that are necessary for the longevity of the project, while the other developers focus more on tasks that are easy or interesting to them.
3. Tools for quality assurance

According to ASQ (2014): “Quality assurance: The planned and systematic activities implemented in a quality system so that quality requirements for a product or service will be fulfilled.” Quality assurance models support companies with standards and rules that can provide proof of quality. In open source software projects the traditional approach to quality can have drawbacks due to the distributed nature of development. Open source software development (OSSD) uses peer review techniques in addition to making the source code available. Code availability in OSSD projects makes it easier for anyone to look through the code, review it from their own perspective, and detect more bugs in the source (Khanjani et al., 2011). Quality is supported by the use of different tools in OSS projects: Otte et al. (2008) stated that 87.2% of projects use source code control tools, many use bug tracking tools and mailing lists, and only 36.5% of projects use test support tools. Projects that apply bug tracking tools have much higher defect reporting quality, and in general larger projects use more supporting tools than smaller ones, because they benefit from them more. The study by Khanjani et al. (2011) stated that quality assurance in open source development depends critically on two factors: how the project handles code reviews and how testing is done. It is important that the developers in the project understand the importance of code modularity, project management, and test process management (Aberdour, 2007). The project management needs to be experienced and provide good leadership for the project, and the whole community has to know about the quality aspects in order to achieve the improvements needed (Otte et al., 2008). Developing high quality OSS depends on having a good, stable, and large community, which keeps development steady, enables building new features, and makes debugging more effective.
Otte et al. (2008) stated that in their study the biggest change in quality assurance practices comes with project size. In their results, one third of respondents applied QA practices, but as a project grows bigger QA becomes more important. Large projects benefit from an independent QA team. Such a team performs checks and verifies that the agreed processes or guidelines are followed, which can be done with the help of quality assessment tools. In smaller projects quality is still important, and projects have different kinds of code review sessions and structured test approaches, which also include defect handling and test management.
3.1 Peer reviews

A software peer review is a process where the author of a software change submits it to the community and interested individuals give feedback. The author and reviewers revise and discuss the patch until it is rejected or accepted (Rouse, 2010). The peer review process has been widely studied, and most people agree that it is a great way of finding defects in one's code (Wiegers, 2002). However, industry adoption of the technique has been relatively low due to the required time commitment and corresponding increase in costs (Rigby et al., 2014). Open source projects break this trend, as most large and successful open source projects use some kind of peer review mechanism and have embraced it as one of their more important tools of quality control (Rigby et al., 2014). Linus's Law states that “given enough eyeballs, all bugs are shallow” (Raymond, 1999), which is obviously a strong argument in favor of peer reviews. Interestingly, it has been found that on average 2.35 reviewers respond to peer review requests in the Linux project, and similar numbers were found in other projects (Lee & Cole, 2003; Rigby & German, 2006). Overall the number of developers who do peer reviews is quite high, with most developers doing a few peer reviews a month. There is also a small percentage at the top who participate in a larger number of reviews. Both of these numbers relate to the fact that developers tend to review changes that they have expertise in, which naturally limits the number of changes they can review (Rigby et al., 2014). So far this paper has discussed the numbers of reviewers in peer review situations, but it is the expertise of the reviewers, not their number, that has long been seen as the most important factor in predicting reviewer efficacy (Porter et al., 1998; Sauer et al., 2000). A large number of reviews tend to come from the core developers, i.e. the experts (Asundi & Jayant, 2007; Rigby & German, 2006).
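Statistics like the 2.35 reviewers per request above are obtained by mining review threads. A minimal sketch of how such averages could be computed follows; the data structure and numbers are invented for illustration and do not reflect the actual tooling of the cited studies.

```python
from collections import Counter

# Hypothetical (patch, reviewer) pairs mined from a mailing-list archive.
reviews = [
    ("patch-1", "alice"), ("patch-1", "bob"),
    ("patch-2", "alice"), ("patch-2", "carol"), ("patch-2", "dave"),
    ("patch-3", "bob"),
]

# Average number of reviewers responding per review request.
reviewers_per_patch = Counter(patch for patch, _ in reviews)
avg_reviewers = sum(reviewers_per_patch.values()) / len(reviewers_per_patch)
print(f"average reviewers per patch: {avg_reviewers:.2f}")  # 2.00

# Per-developer review load, to spot the small top group of heavy reviewers.
review_load = Counter(reviewer for _, reviewer in reviews)
print(review_load.most_common())
```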
Some projects also include peer reviews from people outside the core project team. Because these reviewers have no personal stake in the code, and authors can turn a blind eye to flaws in their own code, such outside reviews can even raise the software quality (Aberdour, 2007). Often the reviewers have been with the project longer than the authors, and they also have more active work expertise (Rigby et al., 2014). Overall it can be concluded that peer reviews are a very powerful tool for open source projects. Open source projects are typically quite meritocratic (Gacek & Arief, 2004), and peer review lends itself to this culture perfectly: each patch is reviewed by a small group of experienced developers who are interested in the subject and have strong expertise in it.
3.2 Testing

Mockus et al. (2002) present two case studies, Mozilla and Apache. In both cases pre-release testing is done at least partly manually. In Apache's case individual developers test the changes on a local copy of the source code in a manner similar to unit testing. After this they either commit directly to the repository or post the changes for peer review on a core developers' mailing list. In the case of Mozilla (Mockus et al., 2002), which is significantly larger than Apache, commits are tested with daily builds of the software. If a build fails, “people get hassled until they fix the bits they broke”. Bugs found during this smoke test are also posted daily, so that the developers are aware of them in the latest versions. In addition, there are six product area test teams responsible for maintaining test cases and test plans for various parts of the project. OSS development time includes a lot of testing: the results of Otte et al. (2008) showed that the average testing time compared to the whole development time was 38.6%, and over half of the projects followed a structured testing approach. This suggests that the importance of testing has been growing; an earlier study by Zhao et al. (2003) stated that the majority of projects used only 20% of their development time for testing, and most of the testing responsibility was shifted to users. OSS project approaches lean more toward field testing and user reviews, relying on users' willingness to use a free product that includes some bugs. Formal testing techniques and automation are usually expensive and require sponsorship, so the user base is often the only choice for OSS projects (Aberdour, 2007). In the study by Otte et al. (2008), the project respondents stated that testing still needs a lot of improvement in areas like process efficiency, code reviewing, and the use of tools for automated testing. Those tools should also be OSS products, to lower the barrier for communities to use them.
When a project has resources and sponsorship, it usually mixes all of these techniques together: structured manual testing, regression test automation, and informal user testing, to achieve coverage of different types of errors (Aberdour, 2007).
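The daily-build smoke testing described for Mozilla above can be sketched as a small driver that runs each check and collects failures for the daily report. The check commands below are placeholder POSIX utilities standing in for real build and test steps, not Mozilla's actual tooling.

```python
import subprocess

# Hypothetical nightly smoke-test loop in the spirit of the daily builds
# described above: run each check against the latest build, then collect
# the failures so they can be posted to the developer community.
smoke_checks = {
    "build": ["true"],    # placeholder: the build step succeeds
    "startup": ["true"],  # placeholder: the application starts
    "render": ["false"],  # placeholder: a deliberately failing check
}

failures = [name for name, cmd in smoke_checks.items()
            if subprocess.run(cmd).returncode != 0]

if failures:
    print("smoke test failures to report:", failures)  # ['render']
```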
3.3 Bug Reporting

Many OSS projects use open bug reporting tools, and bigger open source projects, such as Mozilla, can get hundreds of bug reports per day. Many projects have their own system or tools for bug reporting: Ubuntu has its own Launchpad system for bug tracking, and Mozilla has the Bugzilla system. It is important for users to follow good bug reporting etiquette in how a bug is reported and described, so that the reported bug can be easily noticed and fixed. Ubuntu, for example, has a good documentation page on how users should report bugs to the developer community. When the system and user base get bigger, more and more bugs will never get fixed. When a project has more users, there will be users without development experience, and a bug report might be about some unwanted behaviour which the user sees as a bug (Chilana et al., 2010). Mockus et al. (2002) report on problem reporting in open source projects. In Apache's case there were almost 4000 problem reports, but the top problem reporter submitted about 5% of them. Also, of the top 15 problem reporters, only three were core developers. In the Mozilla project there were almost 7000 people who reported problems. This shows that the wider community beyond the developer layer is active and mainly responsible for system testing. Chilana et al. (2010) suggest that for better user participation in bug reporting, projects should provide users with different ways to express the range of unwanted behaviours they find, and give them feedback on whether other community members have the same issue. Chilana et al. (2010) also suggested that if tools could automatically identify violations of personal expectations in bug report descriptions, users could get feedback and learn how to describe issues properly so that their bugs would get fixed. This could encourage users to refine their reports and help them think of other ways to resolve their individual issues.
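The etiquette point above, that a report must contain enough structured detail to be actionable, can be sketched as a simple completeness check. The required fields below are illustrative and do not match the actual schema of Launchpad or Bugzilla.

```python
# Minimal sketch of a bug report completeness check. The field names are
# invented for illustration, in the spirit of the reporting guidelines
# mentioned above.
REQUIRED_FIELDS = ("summary", "version", "steps_to_reproduce",
                   "expected_behaviour", "actual_behaviour")

def missing_fields(report: dict) -> list:
    """Return the required fields that are absent or empty."""
    return [field for field in REQUIRED_FIELDS if not report.get(field)]

report = {
    "summary": "Crash when opening a large file",
    "version": "2.4.1",
    "steps_to_reproduce": "Open a file larger than 2 GB",
}
print(missing_fields(report))  # ['expected_behaviour', 'actual_behaviour']
```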
4. Conclusions

Open source communities have been divided into subgroups and modeled with the onion model. In this model the groups are layered. In the core are the core developers, typically a small group of developers who are more concerned with the longevity of the project. Developers or “co-developers” surround the core of the onion, and are generally more motivated by personal interests. The two outer layers surrounding the developers are divided into passive and active users. Active users differ from passive users in that they contribute to the project with bug reports. Code ownership is a big part of quality assurance. It distributes responsibilities from bigger entities into smaller pieces, and distributing these responsibilities works differently in different projects. In both projects Mockus et al. (2002) observed, core developers who “owned” part of the code were also responsible for most of the bug fixes. Project size affects many things in developing an OSS project and its quality. Bigger projects might have different methods than smaller ones. Bigger projects might even have their own quality assurance team, while smaller ones might lean strongly on the user base and field testing, having just a core team until the project gets noticed more or gains a more loyal user base. Developing a high quality OSS project relies strongly on a large sustainable community, rapid code development, and effective debugging (Aberdour, 2007).
Aberdour (2007) stated that when the project's development cycle is short and fast, it keeps the code developers and reviewers interested and motivated, and it may result in new features arriving more quickly and with better quality. Overall we found that the QA tasks seem to relate to the different layers of the onion model as one might expect. The core developers do a lot of peer reviews since they have the most expertise about the system, and the user base is an integral part of the bug reporting process. Since open source projects are so varied in size, age, and complexity, it is hard to draw any conclusions that can be applied to each and every open source project. However, just like the onion model, we feel that our conclusions can describe many open source projects on a general level. One challenge that could be tackled in future research is dividing open source projects by variables such as size and age, and seeing how QA tasks are divided. Our conclusions are mostly based on very large projects, which can be considered a big limitation of our research. Many of our references can also be considered quite old by software engineering research standards, especially considering that the communities around open source projects, and the tools available to open source developers, have probably just kept improving and growing over the years. Comparisons between open source projects with commercial support, open source projects that are completely run by volunteers, and perhaps more traditional closed source projects could be interesting. Nowadays many companies have moved their focus to releasing often, which could mean that some testing is done on users. Comparing this method to the onion model, and its theory of active users doing the testing, could prove to be a fruitful topic of research.
References

Aberdour, M. (2007). Achieving quality in open-source software. IEEE Software, 24(1), 58-64.

American Society for Quality: ASQ (2014). "Quality assurance vs quality control." http://asq.org/learnaboutquality/qualityassurancequalitycontrol/overview/overview.html

Asundi, J., & Jayant, R. (2007, January). Patch review processes in open source software development communities: A comparative case study. In System Sciences, 2007. HICSS 2007. 40th Annual Hawaii International Conference on (pp. 166c-166c). IEEE.

Barham, A. (2014). The position of quality assurance contributors in free/libre open source software communities.

Chilana, P. K., Ko, A. J., & Wobbrock, J. O. (2010, September). Understanding expressions of unwanted behaviors in open bug reporting. In Visual Languages and Human-Centric Computing (VL/HCC), 2010 IEEE Symposium on (pp. 203-206). IEEE.

Crowston, K., Annabi, H., Howison, J., & Masango, C. (2004, November). Effective work practices for software engineering: free/libre open source software development. In Proceedings of the 2004 ACM Workshop on Interdisciplinary Software Engineering Research (pp. 18-26). ACM.

Crowston, K., & Howison, J. (2003). The social structure of open source software development teams.

Gacek, C., & Arief, B. (2004). The many meanings of open source. IEEE Software, 21(1), 34-40.

Khanjani, A., & Sulaiman, R. (2011). The process of quality assurance under open source software development. In Computers & Informatics (ISCI), 2011 IEEE Symposium on (pp. 548-552). doi: 10.1109/ISCI.2011.5958975

Lee, G. K., & Cole, R. E. (2003). From a firm-based to a community-based model of knowledge creation: The case of the Linux kernel development. Organization Science, 14(6), 633-649.

Mockus, A., Fielding, R. T., & Herbsleb, J. D. (2002). Two case studies of open source software development: Apache and Mozilla. ACM Transactions on Software Engineering and Methodology (TOSEM), 11(3), 309-346.

Otte, T., Moreton, R., & Knoell, H. D. (2008). Applied quality assurance methods under the open source development model. In Computer Software and Applications, 2008. COMPSAC '08. 32nd Annual IEEE International (pp. 1247-1252). IEEE.

Open source code quality on par with proprietary code in 2011 Coverity scan report. (2012, Feb 23). PR Newswire. Retrieved from http://search.proquest.com/docview/922915982?accountid=11365

Porter, A., Siy, H., Mockus, A., & Votta, L. (1998). Understanding the sources of variation in software inspections. ACM Transactions on Software Engineering and Methodology (TOSEM), 7(1), 41-79.

Raymond, E. (1999). The cathedral and the bazaar. Knowledge, Technology & Policy, 12(3), 23-49.

Rigby, P. C., German, D. M., Cowen, L., & Storey, M. A. (2014). Peer review on open source software projects: Parameters, statistical models, and theory. ACM Transactions on Software Engineering and Methodology, 34.

Rigby, P. C., & German, D. M. (2006). A preliminary examination of code review processes in open source projects. University of Victoria, Canada, Tech. Rep. DCS-305-IR.

Rouse, M. (2010). Peer review. Retrieved from http://searchsoftwarequality.techtarget.com/definition/peerreview

Sauer, C., Jeffery, D. R., Land, L., & Yetton, P. (2000). The effectiveness of software development technical reviews: A behaviorally motivated program of research. IEEE Transactions on Software Engineering, 26(1), 1-14.

Van Krogh, G., Spaeth, S., & Lakhani, K. R. (2003). Community, joining, and specialization in open source software innovation: a case study. Research Policy, 32(7), 1217-1241.

Wiegers, K. E. (2002). Peer reviews in software: A practical guide. Boston: Addison-Wesley.

Zhao, L., & Elbaum, S. (2003). Quality assurance under the open source development model. Journal of Systems and Software, 66(1), 65-75.