Originally designed as a decentralized ecosystem, the Web has undergone a significant centralization in recent years. In order to regain control over our digital self, over the digital aspects of our lives, we need to understand how we arrived at this point and how we can get back on track. This article explains the history of decentralization in a Web context, and details Tim Berners-Lee’s role in the continued battle for a free and open Web. The challenges and solutions are not purely technical in nature, but rather fit into a larger socio-economic puzzle, to which all of us are invited to contribute. Let us take back the Web for good, and leverage its full potential as envisioned by its creator.
As an inventor, you might envision a purpose and destiny for your creation—yet its ultimate place in history is determined by how other people use it. John Pemberton aimed to cure morphine addicts when he started brewing the potion now known as Coca-Cola, Noah McVicker’s Play-Doh originally served as a wall-cleaner before it became a children’s toy, and Alfred Nobel had to establish a prestigious prize to save his name from perpetual association with the military purposes of dynamite.
Admirably, Tim Berners-Lee never even intended to control his own invention: his former employer CERN released the World Wide Web software openly in 1993, and he gave the Web a decentralized design so that no one can limit what others can say. This unprecedented openness has inspired large-scale permissionless innovation and unbounded creativity, provided a voice to more than half of the world’s population, and revolutionized communication, education, and business. However, a consequence of this unrestricted ability is that anyone can even create things that go against the spirit of the Web, such as illegal materials and—ironically—platforms whose primary goal is centralization.
Looking back at the Web landscape of not long ago
In and of itself, the concept of centralization does not pose a problem: there are good reasons for bringing people and things together. The situation becomes problematic when we are robbed of our choices, deceived into thinking there would be just one access gate to a space that we in fact collectively own. Some time ago, it seemed unimaginable that a fundamentally open platform like the Web would become the foundation for closed spaces, where we trade our personal data for a fraction of the freedoms that are actually already ours. A majority of Web users today find themselves confined to the boundaries of a handful of influential social networks for their daily interactions. Such networks gather opinions from all over the world, only to condense that richness into one space, where they simultaneously act as the director and the judge of the resulting stream that scrolls across our screens.
Because this change happened so suddenly, we might need a reminder that the Web landscape looked quite different not even that long ago. In 2008, Iranian blogger Hossein Derakhshan was sentenced to 20 years in jail, primarily because of blog posts he had written. He and many others were able to state their critical opinions because they had the Web as an open platform, so they did not depend on anyone’s permission to publish their words.
From critical readers to passive viewers
Crucially, the Web’s hyperlinking mechanism lets blogs point to each other, again without requiring any form of permission. This allows for a decentralized value network between authors, where readers remain in active and conscious control of their next steps. Yet when Derakhshan was eventually released in 2014, he came back to an entirely different Web: critical readers had transformed into passive viewers, as if watching television. The Web for which he had sacrificed his personal freedom seemed to have lost an integral part of its own. While the core technological foundations of the Web had not changed, the way people were using it had become unrecognizable after only six years.
Of course, social media are not our enemies here: they should be credited with lowering the barrier for the online publication of short texts and photos by anyone. Unfortunately, they operate under a winner-takes-all strategy, each striving to become the dominant portal instead of interoperating with the rest of the Web. In contrast to blogs, we typically cannot interact with posts in one network from within another: we need to either move the people or the data. This famous “walled gardens” problem of social media has significantly worsened since 2008 because some gardens have grown huge, and so have their walls. A major problem is that access to the dominant networks invariably means giving up control over our personal data, as the gate to the garden only opens in exchange for our digital belongings. That personal data can then be leveraged to influence us, without our being aware of it, with absurdly personalized advertising for brands, products, and even political agendas. Furthermore, once there, people tend to form small conversational circles within each garden, an effect that is further amplified by the inward focus of social media platforms and their algorithms that favor maximizing engagement over diversity. The resulting filter bubble isolates us into our own echo chambers, whereas the purpose of the Web, and the claim of social media, has always been to connect.
Three challenges for the Web
Unsurprisingly, these problems were reflected in three challenges for the Web that Tim Berners-Lee put forward in 2017:
- taking back control of our personal data;
- preventing the spread of misinformation;
- realizing transparency for political advertising.
Clearly, it is undesirable to tackle these challenges through centralized solutions, for instance by appointing an authority for personal data, news, and advertising. This would create yet another single point of failure, which—even assuming the best of intentions—would always be more vulnerable to abuse. The core issue is ultimately not caused by any single social network, but by the hyper-centralization of data and people, and therefore power. We want control, but we want to put that control in the hands of every person, as a right they can choose to exercise over their data.
From the above, it is clear that our primary obstacles are not technological; hence Tim Berners-Lee’s call to “assemble the brightest minds from business, technology, government, civil society, the arts, and academia to tackle the threats to the Web’s future”. At the same time, computer scientists and engineers need to meet the technological burden of proof that decentralized personal data networks can scale globally, and that they can provide people with a better experience than centralized platforms.
We will therefore start this chapter with a technological perspective on decentralization, highlighting Tim Berners-Lee’s role in the continuing fight to keep the Web open and decentralized. After a historical overview of power struggles on the Web, we will zoom in on the changes that decentralization requires, and examine what a healthier ecosystem could look like. As a concrete implementation of these principles, we will study the Solid project. At the end, a discussion of open challenges will lead us to an outlook on the future.
A short history of (de-)centralization and the Web
Decentralization has not always been a question of personal data, as the forces causing centralization have been a moving target. Every time a threat had been addressed, an even bigger one superseded it. Understanding these threats brings insights into the different faces of decentralization.
Decentralization as the unspoken assumption
Decentralized systems, which do not require a central mediator to function, were already around at the time the Web was invented. Most notably, the Internet was increasingly gaining traction as a large-scale decentralized network. Email was even more decentralized than the traditional postal mail service it mimicked, since different mail servers would directly exchange messages with each other. Long forgotten protocols such as the Network News Transfer Protocol (NNTP) allowed for the decentralized exchange of news articles. In short, decentralization was not some crazy new idea, but rather the spirit of the time.
Therefore, when Tim Berners-Lee set out to design a new hypertext system in 1989, it was presumed to be decentralized, in contrast to documentation systems of the time, but in alignment with many others. The main novelty of the Web was its universality, its independence of, among others, hardware and software; decentralization was simply the unspoken assumption. This is reflected in the original article introducing the Web, which emphasizes universal readability across operating systems, but does not mention the term “decentralization” at all.
The only component with centralized roots in the Web’s architectural design is the Domain Name System (DNS), which resolves the domain name part of a Web address (such as example.org) to a physical machine on the Internet. This was not as much of an issue back in the days when the number of domains was relatively small and domain ownership would be stationary. Nowadays, millions of domain names frequently change hands, thereby breaking existing links in possibly malicious ways. By manipulating DNS, governments can block or alter access to existing websites. Tim Berners-Lee has indicated that, in hindsight, a more decentralized naming system might have been preferred. Apart from that, the Web contained all ingredients to thrive in a decentralized way.
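The dependence of every link on name resolution can be illustrated with a toy model. The following is a minimal sketch, not how DNS is actually implemented: the `registry` dictionary and the `resolve` helper are hypothetical stand-ins for the naming system, and the addresses come from ranges reserved for documentation.

```python
from urllib.parse import urlsplit

# Hypothetical name table standing in for DNS (an assumption for illustration).
# The addresses are taken from ranges reserved for documentation.
registry = {"example.org": "192.0.2.10"}

def resolve(url, table):
    """Return the address a link would lead to, or None if the name is unknown."""
    host = urlsplit(url).hostname
    return table.get(host)

link = "https://example.org/article"

# The link initially leads to the original machine.
before = resolve(link, registry)

# Whoever controls the table can redirect every existing link
# without touching the pages that contain those links.
registry["example.org"] = "198.51.100.7"
redirected = resolve(link, registry)

# Removing the entry makes the same link unreachable.
del registry["example.org"]
blocked = resolve(link, registry)
```

The link itself never changes in this sketch; only the table does. That is why a change in domain ownership, or an intervention in name resolution, can silently alter or break links that were published long before.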
The race for our desktop
A first wave of centralization resulted as collateral damage from the browser war of the late nineties, in which companies competed to become the sole vendor of the software through which we access the Web. The Web’s design principle of universality demanded readability on any platform, so the emergence of multiple browsers was a blessing—except that they strived for market dominance rather than mutually beneficial co-existence. The Netscape browser and Microsoft’s Internet Explorer tried to seduce each other’s users through new features, with the latter reaching over 90% of Internet-connected desktop computers at its peak.
While competition by itself can be positive, these features came at the cost of incompatibility across browsers and therefore directly endangered the Web’s universality. Websites would carry badges such as “best viewed in Internet Explorer”, since a consistent experience across platforms could not be guaranteed. This also meant that developers were limited by the functionality and quirks of a single browser that, after establishing market dominance, became sloppy with its updates. People who did not want to use a particular browser—or could not install it because they owned a different kind of computer—would be unable to access such websites fully or at all. The resulting de-facto browser monopoly infringed on people’s preference for device and operating system, centralizing the Web’s decision process in one company that thereby slowed down the rate of innovation.
The World Wide Web Consortium (W3C) was founded by Tim Berners-Lee with a mission of compatibility, enabling cross-browser consistency through recommendations that specify the correct workings of Web technologies. While W3C standardization is administratively centralized, it incorporates feedback from a decentralized network of members through a consensus-driven process. A problem by the early 2000s was that Internet Explorer deviated from W3C recommendations at crucial points, forcing developers to follow either the actual standards or the most popular browser’s incorrect implementation thereof.
Fortunately, pressure from Firefox and Safari during a second browser war eventually forced Microsoft onto a more standards-oriented course. Since 2010, no single browser has gained more than two thirds of global market share anymore, meaning that standards compatibility is now in the interest of browser vendors and Web developers alike. The balkanization of the Web through centralized browser development has thereby largely been averted.
The race for our searches
Microsoft’s short-lived victory after the first browser war quickly turned out to be insignificant, since the centralization battle had gradually shifted to other fields. While each browser was quarreling to become the default application, search engines were racing to become the main entry point. Soon after, it did not matter anymore what software you were using to browse; what mattered was who gave you the directions of where to browse next. After all, no immediate income could be generated from free browser development, whereas companies would gladly pay for a prime spot in one of the major search engines’ rankings.
The early search engine landscape featured several competitors, such as AltaVista and Lycos, but it took Google only a couple of years to become the most popular by far. The centralization of search meant that one company gained an overly strong influence on what content people would access, based on the ranking of search results for given terms. Even assuming the best of intentions and ignoring paid advertising, the fact that one algorithm makes decisions for a large number of people leads to an information bias, as there clearly exists no single objective way to rank the “best” webpages on any topic. External attempts to manipulate these algorithms started to occur, first through relatively simple interventions such as misleading keywords, later through advanced Search Engine Optimization (SEO) techniques that aimed to improve website rankings in various (and sometimes dubious) ways.
The advent of search engines also brought the first online monetization of user-generated data. Our search terms contribute to a detailed profile of what we need in our private and professional lives. Search engines might know more about some aspects of our lives than our close friends. This profile determines the personalization of our search results and the ads we are shown, encouraging us to visit websites and buy things we otherwise might not have. While personalization has helpful effects for many people, the problem is that we are left without choice or control. We are directed to the big search engines, which, due to their large-scale accumulation of data, deliver us a great search experience. Yet these search engines do not provide us with options for how we want to pay for their services, as most of them only accept our personal data. Furthermore, we are not informed about, let alone given control over, how exactly our data influences our search results. The increasing personalization gave rise to the first filter bubbles, wherein we are more likely to see results similar to those we previously clicked on.
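How click history can narrow what we see can be sketched with a toy ranking function. This is only an illustration under stated assumptions: the result list, the topic labels, and the `personalize` function are hypothetical, and real search engines rely on far more elaborate and undisclosed signals.

```python
from collections import Counter

# Hypothetical search results as (title, topic) pairs; not real data.
results = [
    ("Solar subsidies explained", "energy"),
    ("Local election guide", "politics"),
    ("Wind farm opens", "energy"),
    ("Museum reopens", "culture"),
]

def personalize(results, clicked_topics):
    """Order results by how often the user clicked each topic before."""
    weight = Counter(clicked_topics)
    # sorted() is stable, so results with equal weight keep their original order.
    return sorted(results, key=lambda r: -weight[r[1]])

# A user who mostly clicked on energy stories in the past.
history = ["energy", "energy", "culture"]
ranked = [title for title, _ in personalize(results, history)]
```

In this sketch, topics the user has clicked before rise to the top, while a topic that was never clicked sinks to the bottom, which makes it even less likely to be clicked next time.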
The race for our personal data and identity
While the reign of Google still continues, social media have discovered an even more powerful way of collecting and marketing our personal data. The social Web revolution of the 2000s encouraged people to be present online, which drove many of us to various places to share blog posts, bookmarks, photos, videos, and more. Some years later, social media companies created centralized platforms to take over many of these features, which until then were spread out across multiple providers. These platforms store our personal data and request far-reaching usage rights in exchange for their services, all of which operate within their own walled garden. Like search engines, the main service of social networks consists of a linear list of content, ranked by factors and algorithms we can only minimally influence. In contrast to search, a social feed is generated without any input terms from our side, like a television that no longer requires a remote. The ensuing show is meticulously personalized based on data we leave on social network platforms, combined with traces from our browsing history picked up, without our conscious consent, by social trackers on third-party websites. In his 2018 Dertouzos distinguished lecture, Tim Berners-Lee remarked that political advertising has been banned from television in the UK because of concerns about the impact of such a direct medium. By that logic, he continued, we should be much more concerned about the heavily personalized political advertising that current social media platforms enable and allow. Even if we refrain from explicitly sharing certain sensitive traits, seemingly insignificant pieces of other data can be combined into reliable predictors of highly personal information such as sexual orientation, ethnicity, and religious or political views, which are subsequently used to target us.
As in the previous two centralization races, a subtle force is exerted upon us: we feel pressured to be part of the large networks, because not joining means missing out on the digital traces of our friends’ and family members’ lives. Often the easiest way for grandparents to see their grandchildren’s latest pictures is to create a Facebook or Instagram account. This is how the digital memory of a large part of today’s generation ends up in one space, often beyond control of those that are part of the memories. The centralization of our online activities has turned so extreme that some Facebook users have become unaware of their ability to access the rest of the Internet. This paradox has sadly become a reality in many countries, where Facebook’s Internet.org initiative provides a severely constrained version of the Web that further reduces people’s options, in blatant violation of Net Neutrality.
Meanwhile, another race is happening in the background, namely the battle to become our identity provider. An increasing number of websites are gradually replacing their own login systems with authentication tied to large platforms such as Google or Facebook. For people with an existing account, the “Log in with Facebook” buttons are a convenience. For those without, they create additional pressure to join. And in both cases, such buttons are yet another way of tracking our online activities. This centralization of identity takes away our freedom to assume the persona we want—be it anonymous, pseudonymous, or just ourselves—without needing to expose data we consider our own.
- Derakhshan, H. (2015), “The Web We Have to Save”, 14 July, available at: https://medium.com/matter/the-web-we-have-to-save-2eb1fe15a426.
- “Break down these walls” (2008), The Economist, available at: https://www.economist.com/node/10880516.
- Pariser, E. (2011), The Filter Bubble, Penguin Books.
- Berners-Lee, T. (2017), “Three challenges for the Web, according to its inventor”, Web Foundation, 12 March, available at: https://webfoundation.org/2017/03/web-turns-28-letter/.
- Rosenthal, D. (2018), “It Isn’t About The Technology”, 11 January, available at: https://blog.dshr.org/2018/01/it-isnt-abouttechnology.html.
- Berners-Lee, T. (2018), “The Web is under threat. Join us and fight for it”, Web Foundation, 12 March, available at: https://webfoundation.org/2018/03/web-birthday-29/.
- Berners-Lee, T. (2005), “Universality of the Web”, 23 March, available at: https://www.w3.org/2005/Talks/0323-yorkshire-tbl/slide5-2.html.
- Berners-Lee, T., Cailliau, R., Groff, J.F. and Pollermann, B. (1992), “World-wide web: the information universe”, Electronic Networking, Vol. 2 No. 1.
- Gustafson, A. (2008), “Beyond DOCTYPE: Web Standards, Forward Compatibility, and IE8”, 21 January, available at: https://alistapart.com/article/beyonddoctype.
- Berjon, R. (2018), “Advertising’s War on Consent”, 19 March, available at: https://berjon.com/advertising-war-on-consent/.
- Berners-Lee, T. (2018), “From Utopia to Dystopia in 29 Short Years”, 18 May, available at: https://www.csail.mit.edu/news/utopiadystopia-29-short-years.
- Kosinski, M., Stillwell, D. and Graepel, T. (2013), “Private traits and attributes are predictable from digital records of human behavior”, Proceedings of the National Academy of Sciences, National Academy of Sciences, Vol. 110 No. 15, pp. 5802–5805.
- Samarajiva, R. (2014), “More Facebook users than Internet users in South East Asia?”, 30 August, available at: http://lirneasia.net/2014/08/more-facebook-users-than-internet-users-in-south-east-asia/.
- Verborgh, R. (2017), “Paradigm shifts for the decentralized Web”, 20 December, available at: https://ruben.verborgh.org/blog/2017/12/20/paradigm-shifts-for-the-decentralized-web/.
- “Solid” (n.d.), available at: https://solid.mit.edu/.
- Berners-Lee, T. (2006), “Linked Data”, 27 July, available at: https://www.w3.org/DesignIssues/LinkedData.html.
- Berners-Lee, T. and O’Hara, K. (2013), “The read–write Linked Data Web”, Philosophical Transactions of the Royal Society A, Vol. 371 No. 1987.
- Berners-Lee, T., Hendler, J. and Lassila, O. (2001), “The Semantic Web”, Scientific American, Vol. 284 No. 5, pp. 34–43.
- Capadisli, S. and Guy, A. (Eds.). (2017), Linked Data Notifications, Recommendation, World Wide Web Consortium, available at: https://www.w3.org/TR/ldn/.
- Capadisli, S., Guy, A., Verborgh, R., Lange, C., Auer, S. and Berners-Lee, T. (2017), “Decentralised Authoring, Annotations and Notifications for a Read–Write Web with dokieli”, in Proceedings of the 17th International Conference on Web Engineering, pp. 469–481, available at: https://csarven.ca/dokieli-rww.
- Zuckerman, E. (2017), “Mastodon is big in Japan. The reason why is… uncomfortable”, 18 August, available at: http://www.ethanzuckerman.com/blog/2017/08/18/mastodon-is-big-in-japan-the-reason-why-is-uncomfortable/.
- Barabási, A.-L. and Albert, R. (1999), “Emergence of Scaling in Random Networks”, Science, Vol. 286, pp. 509–512.