Big Data

Can Open Source save AI?


Do excuse the clickbait headline, but isn’t everything we write these days done in order to drive some algorithm, somewhere? As it happens, I did just attend a very interesting event; and it was, topically enough, about open source and AI. But am I writing about it just because it was interesting, and I wanted to share some thoughts? Or is it all about the SEO, plus some behavioural psychology tricks I need to apply to guarantee measurable clicks, thus pushing it up the rankings of social sites and indeed, looking good on internal, aggregated dashboards? It’s like our robot overlords have already won, and all we have left to do is welcome them. 

But I digress. Returning to our sheep (as they say in French, and I will return to the question), there was much to learn from the launch of OpenUK’s latest research on the economic impact of open source software (OSS) on UK industry, and more broadly, its GVA – Gross Value Add. OpenUK is a relatively recent national industry body, formed directly to “move open technologies – not only OSS but open data, open standards and open innovation – onto the UK radar,” according to its CEO and opening speaker, Amanda Brock. 

OpenUK’s public purpose is to develop UK leadership and global collaboration in open technology, which essentially means stimulating the symbiosis between UK organisations and open technology. Power to OpenUK’s elbow, that’s what I say — I recommend interested parties take a look at the research (led by chief research officer, Dr Jennifer Barth) and act on its findings. In a nutshell, OSS brings over £13 billion of value to the UK, some 27% of the UK tech sector’s GVA contribution, against plans to invest £327 million. By my reckoning, that’s roughly a 41x planned return on investment. 
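As a back-of-the-envelope check on that multiple (taking “over £13 billion” as roughly £13.4 billion, an assumption on my part, against the £327 million of planned investment):

```python
# Rough sanity check of the headline ratio quoted in the OpenUK research.
# Assumption: "over £13 billion" is read as approximately £13.4bn.
oss_value_gbp = 13.4e9       # OSS contribution to UK GVA
planned_invest_gbp = 327e6   # planned investment

multiple = oss_value_gbp / planned_invest_gbp
print(f"Planned return multiple: ~{multiple:.0f}x")  # prints "~41x"
```

A crude ratio, of course — the spend and the value don’t map one-to-one — but it shows where the “41x” comes from.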

I know it’s not as simple as that, in that the spend goes into a global pool of developers, innovators, providers and others. But nonetheless — and Amanda made this point — many of the solutions built on top of OSS end up being US-based, including UK-founded companies such as Weaveworks (GitOps) and Snyk (developer security). UK investors are traditionally reticent compared to those in the Bay Area, and as a result need a clearer understanding of what OSS brings. And conversely, OSS creates more opportunities for skills development and the creation of new enterprises, furthering the goals of our multi-island nation on the global stage. 

The Jeff Goldblum-sized fly in the ointment is AI, which has come out of seemingly nowhere to be this year’s hot topic. Not quite true — we’ve heard a lot about AI in recent times — but it did look like it was going the same way as 3D televisions, before Midjourney and ChatGPT came along. Not ironically, this landed right in the middle of both the OpenUK research cycle (which had to spawn a second research report mid-way) and UK legislation on AI (which has had to be rewritten in flight to take large-scale models into account). 

AI is a significant area for the open technology world, first in terms of software (the most used AI platform, TensorFlow, is open source), but then also for data. Wikipedia was founded on open principles, both using open source and releasing its open data on an open content platform, so it was no coincidence that its founder Jimmy Wales was in attendance. The recent developments in generative AI directly relate to the availability of open data sources — “50% of ChatGPT input is Wikipedia,” says Jimmy, who is cool with this. “That’s what it’s for.”

So, to the question, can Openness save AI? The answer is no, not by itself, but it can go some way to providing the tools we need to deliver it, in a way that will benefit society in general (and therefore the UK in particular), moving the technology into the hands of the many. One reason is that, like OSS, the AI genie is out of the bottle. “We can’t assume there are six companies we can regulate,” says Jimmy, pointing to the millions of hobbyist developers that are already playing with Midjourney via Discord, or writing their own versions of generative AI software. AI can learn from the OSS world, the power of individual responsibility — we can’t blame the tools, but we can legislate against what people are creating, he suggests. “You could always use Photoshop to create an image; it just wouldn’t look very real – it’s now going to look more real.”

That’s not to say we can do without general legislation at a corporate and national level, but it needs to be aimed at the consequences of AI, rather than its inevitable, more general use. “The one thing that’s inevitable is that governments are going to regulate – if that’s too top-down, it’s going to be too hard. But the opposite approach, individual responsibility with the right level of governance, bottom-up and principles-based, that’s the better approach,” says Amanda. As highlighted by Chris Yiu, Director of Public Policy at Meta, this goes with the transparency and openness that are (the clue’s in the name) mainstays of OSS. If the AI genie has spawned lots of little genies, we can use them as a network of peers to create a more solid result. 

I can agree, as long as responsibility and openness are applied at all stages of the delivery cycle — there’s a lot to unpack about “the right level of governance” across data collection and management, cybersecurity and access management, process best practices and jurisdictional questions (what’s legal in one country may not be in another, and may be unethical in both). For example, if I used data from the Strava open API to build a picture of people likely to suffer medical issues and then published it, who would be responsible? Or if I created the code and left it lying around?

It does strike me that post-Brexit Britain is in a unique position to set a different agenda from either the EU, which is looking at top-down regulation, or the US, which has a habit of playing a bit faster and looser with privacy than we might like. At which point, organisations such as OpenUK might find themselves with their work cut out — it’s one thing to advocate for broader acceptance of OSS, but quite another to find yourself structurally positioned among the most important players in a newly created, yet critical space. That’s a good problem to have, but not one to be taken lightly. 

We have time to get this right. Nobody in the room felt AI was a runaway train: even though examples exist of AI-driven challenges, they remain the exception rather than the norm (as Chris Yiu put it, “We are a long way off anything approaching super-intelligence.”) Nonetheless, we already need independent organisations who get this stuff to advise on the best way forward, working with policymakers. Perhaps open source models, and the open method of creating new ones, can indeed counter the worst potential vagaries of AI; and right now, we need all the help we can get as we work out a new understanding of the impact of the information age, both in the UK and beyond. 

At which point, we can keep our robots where they need to be, to a sigh of relief from even the most fearful about our AI-embracing future.