.. This document is © Martin F. Krafft
   It is available under the terms of the Creative Commons
   Attribution-NonCommercial-ShareAlike Licence 2.5

==========================================================================
Method diffusion in large open-source projects
==========================================================================
Voluntary adoption of improved development methods in Debian
--------------------------------------------------------------------------

.. |event| replace:: Ph.D. transfer, Lero @UL, Ireland
.. |talkdate| replace:: 23 Nov 2007
.. |author| replace:: Martin F. Krafft
.. |authoremail| replace:: phd@martin-krafft.net
.. |footer| replace:: More info: http://martin-krafft.net/phd/

.. container:: author

  |author| <|authoremail|>

  * `Ph.D. student`_, `Lero`_, `CSIS`_, `University of Limerick`_, Ireland
  * Developer with the `Debian`_, `Zope`_, and `Plone`_ projects
  * Author of the book `The Debian System — Concepts and Techniques`_

.. _Debian: http://debian.org/
.. _Zope: http://zope.org/
.. _Plone: http://plone.org/
.. _Ph.D. student: http://martin-krafft.net/phd/
.. _Lero: http://lero.ie/
.. _CSIS: http://www.csis.ul.ie/
.. _University of Limerick: http://ul.ie/
.. _The Debian System — Concepts and Techniques: http://debiansystem.info/

.. container:: event

  |event|

  |talkdate|

Research question
=================

.. class:: center vcenter
..

  What are the factors which lead Debian developers to adopt new methods and
  more efficient workflows?


Terminology
===========

:Innovation: "new processes, reorganisation of production leading to
  increased efficiency, …" [Kline and Rosenberg, 1986]
:Method: a tool, technique, part of a process, a means to an end
:Diffusion: "the process in which an innovation is communicated through
  certain channels over time among the members of a social system"
  [Rogers, 2003]
:Adoption: "a decision to make full use of an innovation as the best course
  of action available" [Rogers, 2003]
:Volunteer: "one who enters into, or offers for, any service of his own free
  will" [Merriam Webster (sorry)]
:Network externalities: "the effects on a user of a product or service of
  others using the same or compatible products or services" [`about.com `_]

Overview
========

.. class:: current

* Debian

  * Background information
  * An example workflow

* Diffusions
* Research approach
* Contribution
* Progress to date

The Debian Project
==================

.. class:: floatright
.. image:: ui/debian/blue-swirl.png

* Founded 1993
* 1000+/2000+ volunteers, globally distributed (52 countries)
* Possibly the largest FLOSS project
* 100% Free
* Produces Debian GNU/Linux:

  * 20'000+ packages
  * 11 architectures
  * ~150 derivatives

..

  "one of the largest software systems in the world, probably the largest"
  [Amor-Iglesias et al., 2005]

My role in Debian
=================

* Involved since 1997
* Developer since 2002
* Well-known and respected (lead user [cf. von Hippel, 1986])
* Concentrate on process improvement and quality assurance

..

  I am in a position to study the community from the inside (cultural insider)

The Debian System — Concepts and Techniques
===========================================

.. container:: floatright

  .. image:: book-osp.png

  .. raw:: html

  .. image:: book-nsp.png

* Published June 2005, English, 608 pp.
* Translated to German, Japanese, French (Chinese & Spanish in preparation)
* Sold ~15'000 copies
* Covers

  * The project and ideological topics
  * Archives, packages, and packaging
  * System administration and security
  * Documentation, bug reporting, forums

..

  .. class:: center

    http://debiansystem.info

The typical Debian developer
============================

These traits are drawn from experience, discussions, and speculation:

* Volunteer
* Hacker [Levy, 1984; Coleman, 2005]
* Professional user of Debian in advanced/complex settings
* Young (~26 years)
* Academic background
* Technophile
* Perfectionist

..

  Decisions often appear unreasonable from a management perspective

Problems with current workflows
===============================

* Non-integrated (e.g. no bug tracking system integration)
* Repetitive
* Error-prone
* Boring
* Bazaar of cathedrals

..

  Developers are doing what the computer could be doing more efficiently

Why we are still doing it the old way
=====================================

* No incentive to change
* Large choice of incompatible methods
* Great inertia: processes are highly complex and intertwined
* Yak shaving awareness

..

  * There is no authority which can prescribe the *One True Approach*
  * There is no *One True Approach*, but a healthy competition of approaches
  * Nobody knows how to drive diffusions of such approaches

What can be done?
=================

* Identify the factors which lead developers to adopt
* Publish a framework structuring those factors
* Let proponents of the various approaches use these factors prescriptively
  to engineer their methods for a higher chance of adoption
* This preserves competition rather than trying to push a *One True Approach*

Overview
========

* Debian

.. class:: current

* Diffusions

  * Frameworks
  * Rogers' *Diffusion of Innovations*
  * Problems with existing frameworks
  * A new framework?

* Research approach
* Contribution
* Progress to date

What's (in) a framework?
========================

* Structuring/logical grouping of factors
* Orthogonality
* Well-defined domains of variables
* Facilitates comparisons and assessments

Framework example: Rogers' elements of diffusion
================================================

:innovation: relative advantage, compatibility, complexity, verifiability,
  visibility
:communication: hard vs. soft information, mass vs. interpersonal medium,
  homophily
:time: knowledge, persuasion, decision, implementation, (confirmation)
:social system: structure, communication arrangement, social norms, opinion
  leaders, types of decisions, consequences

.. class:: no-space-top

Select problems:

* overlaps (e.g. relative advantage/compatibility; communication)
* simplistic (individual decisions, no network externalities)
* process vs. variance theory

Problems with existing frameworks
=================================

I have tried [Rogers, 2003], [Wejnert, 2002], [Frambach and Schillewaert,
2002], [Gallivan, 2001], [Chau & Tam, 1997], [Saga & Zmud, 1994], [Fichman,
1992], [Davis, 1986], [Kwon & Zmud, 1987], [Tornatzky & Klein, 1982], and
a few others…

Common issues:

* no prior knowledge of innovation required
* low degree of network externalities
* (lack of) orthogonality
* authoritarian decisions

Need for a new framework
========================

It seems I need a specific framework for diffusion in

* voluntary social systems
* with members making "unreasonable" choices
* taking part in complex processes
* having great network externalities

But: One does not discard a framework by looking at it; one discards a model
when a better one has taken its place [Kuhn, 1970]

Thus, a *bottom-up approach*.

Overview
========

* Debian
* Diffusions

.. class:: current

* Research approach
* Contribution
* Progress to date

Research approach
=================

Four phases of research:

#. Collection of factors (grounded theory)
#. Community survey to sort and augment the factor set
#. Design of an initial framework and Delphi approach to improve it
#. Application and verification of the framework

..

  "Waterfall model": paramount to stay adaptive

Phase 1: collection of factors
==============================

Sources:

* *Experience*
* Interviews
* Forums and mailing lists
* Wiki
* Literature

Strauss & Corbin [1998], because of:

* axial coding
* experience does not prevent objectivity
* encourages use of non-technical literature/resources

Phase 2: community survey
=========================

Goals:

* Assess factor relevance
* Probe individual adoption behaviour and perception of diffusions
* Demographics

Survey strategy:

* Console-based, interactive script
* Intuitive use, unrestricted navigation
* Allows for comments to be submitted
* Uses version control to track and integrate responses
* Authenticates responses with a cryptographic signature

Phase 3: framework design
=========================

#. Initial ordering of factors into framework
#. Delphi study to improve framework and realign factors

   * ~30 experts from different fields, compensated
   * Anonymous or not?
   * Use simple, known tools: mailing lists, wiki, chat

Possible outcome: an existing framework fits

Phase 4: application and verification
=====================================

* Descriptive use

  * Two classes of methods:

    * package build helpers (done)
    * patch management systems/version control systems (ongoing)

* Prescriptive use left for further research

I have `collected a large number of other classes I could study `_.

Overview
========

* Debian
* Diffusions
* Research approach

.. class:: current

* Contribution
* Progress to date

Contribution
============

* A framework to assess diffusions in Debian

  * (no suitable framework seems to exist)
  * may suit other FLOSS projects; if not, it can be used to compare projects

* Improved workflows in Debian
* A new survey technique suitable for hackers
* Possible new insights into volunteer management

Overview
========

* Debian
* Diffusions
* Research approach
* Contribution

.. class:: current

* Progress to date

Progress to date
================

I have been considering the subject since 2003. The output so far has been
mostly non-academic: in addition to numerous discussions and seven
interviews, I

* conceptualised `a new upload process `_
* delivered presentations on `existing workflows `_ and
  `workflow improvements `_
* delivered a presentation on
  `improving cooperation in volunteer projects `_
* held `an online tutorial about using version control for packaging `_
* `presented the use of version control for packaging `_ and worked out a
  `complete workflow for Debian packaging with Git `_
* `presented my research objective `_ various times

Thank you …
===========

.. container::

  … for your attention!

.. container:: author

  |author| <|authoremail|>

  * http://phd.martin-krafft.net/
  * http://people.debian.org/~madduck/

.. include:: licence.en.rst
.. include:: common.inc

.. footer:: |footer|