Dear @everyone,
TL;DR summary: What are your thoughts on how we define the term “open source”; open source practices not always being compliant with definitions; and/or the implications of that? I’ll share some of mine, and would like to learn from you!
Full post begins:
Background
I am writing this post wearing my hats as a GOSH community member and someone passionate about the cultural freedoms underlying open source software, hardware, and other intellectual objects.
Over the past year, I’ve been thinking lots about the conversation around the definition of open source (software, hardware, and other things). I have occasionally participated in that conversation, for example during NASA TOPS meetings or with people in the Turing Way community.
To start things off, I want to describe where I come from intellectually, and acknowledge that the following perspective comes with possible benefits and limitations:
In my view, the fundamental ethical underpinnings of the Open Source Hardware Definition or the Open Source Definition originate from the Free Software Definition and the Definition of Free Cultural Works. My personal opinion is that:
- “Free” means freedom/liberty instead of “free of charge” or “non-commercial”. This idea is more easily expressed in some other languages such as libre or 自由.
- Here, freedom refers to the freedoms to use, study, remix, and share open source material without restrictions (including doing so for money). In practice, because of the prevailing legal environment, this means that an open source license is required to accompany that material in order to formally enshrine those four freedoms.
- It is important to have this underlying ethical motivation, because it enables almost all other benefits that we talk about. Over the long term, standing behind and advocating for these freedoms could eventually lead to a rising tide that lifts all boats.
While I hold this opinion strongly, I certainly do not think my views are authoritative!
My worry
Documenting all instances of this would be a research project unto itself. But for now, my subjective view is that there has been - in recent years - an increasing frequency and intensity of conversations around (a) re-defining the term “open source”; (b) allowing things or activities to be called “open source” when they don’t formally meet the definition; or (c) defining what “open source” means in domains where it has not been previously applied.
I am worried that these often-commendable efforts seriously risk diluting the meaning of “open source”, and that such a dilution would create net-negative outcomes for advocacy and institutional reform.
To give you one symptom of this dilution: I sometimes read news articles reporting on wars around the world in which researchers have made use of “open source intelligence”. In one case, experts used Google Earth satellite imagery to deduce what’s happening on the ground in highly inaccessible places. In fact, Google Earth is closed source, and the satellite images it allows you to see are similarly proprietary. From what I can tell, some call this “open source intelligence” because the images are (relatively) “easy” to access and mostly “free of charge”. I happen to like “easy” and “free of charge”, but that doesn’t mean this intelligence data (or the tool with which to access it) is open source based on my perspective as described above.
As someone who is currently an academic researcher, I have also seen “open source” peer-reviewed papers published with free-of-charge (and no paywall) view-only access with a non-commercial and/or no-derivatives license. There are clear benefits of not making others pay a huge amount of money to read academic papers, but again that doesn’t make them open source.
These are, in my view, two examples of how the meaning of “open source” has been diluted.
I think this dilution has three general origins:
Well-meaning but non-open source software
Recently, we have seen efforts to define open source “artificial intelligence (AI)” technologies such as large language models or generative AI (I acknowledge that AI is itself a problematic term!). These well-meaning efforts sometimes state that the definition of “open source AI” should include ethical guidelines on what people should and shouldn’t do with the technology. For example: “Don’t use this AI technology to create misinformation!” or “Don’t use it to discriminate against people based on gender, race, etc.!”
These limitations have a history in software communities, where some well-meaning developers have proposed similar changes to the definition of open source software. These efforts often take the form of “ethical” licenses, such as the Hippocratic License by the Ethical Source initiative. Again, they add ethical guidelines to the definition of “open source”.
Another form of this well-meaning movement is the attempt to keep software kind of “open source” but fend off Big Tech companies. This is probably most notably embodied in the fair-code movement, which uses licenses that are mostly open source but restrict commercial use; in licenses that keep the code view-only for a long time; or in the Anti-Capitalist Software License. Strictly speaking, fair-code (like other initiatives such as “Post Open”) does state that it is not open source per se, but colloquially many people now refer to software using this model as “open” or “open source”.
UPDATE Feb 2024: For completeness, a couple more of these non-open source licenses are Equalicense and Polyform. To their credit, they do not claim to be open source.
“Big tent” but non-open source models
In my view, some open source tech communities have an unfortunate history of gatekeeping; being hostile to beginners (no, just saying “RTFM” is not nice); being highly judgemental towards people who do not exclusively use or develop 100% open source software; or outright discrimination against diverse demographics who do not conform to the white, male (and often bearded and beer-guzzling!) nerdy programmer stereotype.
To this day, this is still an endemic problem, and I strongly believe we have a collective responsibility to make open source communities more diverse and inclusive of all humans at any stage of the open source journey. I am buoyed by communities such as GOSH, OSHWA or the Software Freedom Conservancy, which have made great strides in this regard.
At the same time, I am concerned that - sometimes - in our commendable efforts to make our communities a “big tent” that is more friendly and open (no pun intended), we unnecessarily overstretch the definition of open source, or call practices/outputs which do not fit the open source definition open source.
“Openwashing”
For the purposes of this thread, I refer to openwashing as organisations (usually companies) labelling something as “open source” when it is really not. Typically, doing so attaches the nice, warm, fuzzy feeling to their products and services without the pro-social responsibilities of meeting the open source definition.
These openwashed technologies sometimes have source code (or source code equivalents) published with view-only permissions or other restrictions, often under one of the non-open source licenses mentioned above. A very recent case is that of Meta’s Llama 2 large language model, which was originally advertised as an “open source” AI model. It is, in fact, not open source, and this caused much confusion, as discussed near the end of this report in Ars Technica. Similar things have happened in the software world.
Why do I think this is a problem?
First of all, words matter, and they have meanings. Indeed meanings may change over time and no one person can or should dictate them, but these changes come with consequences, sometimes undesirable ones. I am afraid that diluting the meaning of “open source” would create undesirable consequences.
For example, in my opinion, the word “democratise” is a victim of such dilution. The (in)famous ride-hailing service Uber calls itself “democratising”, even as it exploits workers, monopolises transportation pushing out public transit, and often makes transit more costly for passengers.
Similarly, the word “sustainability” has also been heavily diluted.
What do “democratise” and “sustainability” even mean any more? Even when some uses are well-meaning, the indiscriminate use of these terms has made them less useful than before, because it’s hard to know exactly what you’re talking about.
When we want a term to mean everything, we risk making it mean nothing. I fear the same thing happening to the term “open source”.
Another problem is that if the meaning of open source is diluted, then we might still have the nice, warm, fuzzy feelings associated with the term, but it will no longer come with the original rights and freedoms that are foundational to its definition; and the responsibilities to those freedoms that creators of open source technologies should bear. This ambiguity would create openings for exploitation by powerful actors, leading to not only less user freedom, but also fracturing the community that we work so hard to sustain.
All of this means that when we say we are developing or using open source technologies, or that we are in an open source community, it will no longer be clear what we are really about. We will only be a warm and fuzzy community, instead of a warm and fuzzy and open source community.
What can we do???
I appreciate some of the writings by Drew DeVault, who often reflects on cultural issues in open source communities. In one post about the definition of open source, Drew eloquently argued for why keeping the term well-defined is useful:
As language is defined by its usage, some may argue that they are as entitled as anyone else to put forward an alternative usage. This is how language evolves. They are not wrong… [but] I argue that the mainstream definition of open source, that forwarded by the [Open Source Initiative], is a useful term that is worth preserving in its current form. It is useful to quickly understand the essential values and rights associated with a piece of software as easily as stating that it is “open source”.
Drew also wrote a closely-related essay on how this problem manifests in software licenses.
With that in mind, I acknowledge that there is - as some like to call it - a “spectrum”, or a path, from completely closed source to more “open”. We should be highly encouraging of those who make progress on that journey, even if what they create does not formally meet the definition of open source software or hardware.
For instance, there might be those who only publish some of the code they write as open source software; others might share only a PDF schematic of their hardware design; or some may publish code while forgetting to attach an open source license to it. In my view, the above are tangibly better than completely closed source, and we should celebrate those who have made this progress. At the same time we do not have to call all of the above “open source” to achieve that inclusivity.
I think the key to the success of this approach is to avoid an “open source good, non-open source evil” dichotomy, and refrain from passing judgement on those who have not gone all the way in one step. Instead of saying: “You’re a horrible person because your thing is not 100% open source!!”, we can say: “Congratulations on taking a concrete step towards sharing more of your hard work! Here are some more things you can do to make it fully open source.”
In my view, this is compatible with keeping a well-constrained definition of open source, which would sit past a certain point on the closed-to-open spectrum. This way, we can celebrate movement from one end to the other without overly diluting the meaning of “open source”.
A related good practice would be to consistently use the term “open source” only when we mean it as it is formally defined. In contrast, “open” or “open-source” (with a “-”) are genericised terms which do not have formal definitions, and we should be thoughtful in how we use them. For example, we might say software released under the Hippocratic License is more “open”, but let’s not call it “open source”.
This post is a rough summary of the random worries in my head about the risk of diluting the definition of open source technologies. I think of it as 拋磚引玉, an idiom I learned as a child, meaning “to offer one’s own relatively worthless words […] to attract others’ more valuable contributions”. I hope my worthless words will provoke your valuable contributions!
Listening to the ongoing conversations, I am grateful to the people who have shared their perspectives on how we should define open source. I have learned much, and discovered things I hadn’t considered. In this spirit, I recognise that I may be stepping into a highly emotive minefield and ask that any responses to this thread be maximally respectful, empathetic, and assume goodwill! When in doubt, please err on the side of extra niceness!
Thank you for reading.