How Swapfiets Uses Active Metadata to Democratize Access to Data

Driving Self-service, Secure, Trustworthy Data with Atlan
  • Swapfiets, the world’s first bicycle subscription service, sought to launch a governance program to meet growing internal demand for data
  • Choosing Atlan, Swapfiets launched their governance program with a self-service catalog, offering metrics and source definitions across six business domains
  • Having successfully launched the catalog, Swapfiets’ data team now routinely runs governance committees to strengthen cross-functional collaboration, and is driving a culture of data savviness through training and enablement

Founded in the Netherlands in 2014, Swapfiets is on a mission to make cities more livable, introducing the world to the concept of “Bicycle as a Service” with a monthly subscription to a bicycle with repair services included. Now with over 270,000 subscribers across eight countries, Swapfiets’ achievement of its sustainability goals, and its continued growth, depend on reliable, easily accessible data.

“I’m really passionate about data. Data is so diverse, and it has not only a technical part, but also a governance, or more abstract part, as well,” shared Lisa Smits, Manager of Swapfiets’ Data & Analytics team. “That’s how I came to Swapfiets, and that’s how I came to lead this team.”

Drawing upon a diverse professional background spanning supply chain management to pricing analysis, and with the help of her team and colleagues, Lisa has helped usher in a new era of data savviness and collaboration across Swapfiets, building and growing a Data Governance practice and applying best-of-breed technology like Snowflake and Atlan.

“Data is an important part of measuring where we are now, and what we need to do to grow further. We have initiatives like optimizing the parts we use and being more sustainable, and that’s where my team, directly and indirectly, contributes to the whole company.”

Lisa Smits, Data Analytics Manager

Once structured as one large team, Swapfiets has since broken up its data function into three specialized parts. First is Data Engineering, responsible for creating and maintaining Swapfiets’ data platform and driving data quality and availability. Next, Data Science creates complex data products through advanced modeling, such as demand forecasting and customer lifetime value. Finally, the analytics team led by Lisa is responsible for Swapfiets’ BI environments and activating data products. Powering this data team evolution is a quickly modernizing data stack, built on Snowflake as the data warehouse.

“We migrated to Snowflake. It’s more flexible to scale up and down, it takes a lot less time for our engineers to set up, and permissions management was a big thing that was easier to do. This transitioned us into being a more mature data team,” Lisa shared.

Swapfiets also adopted Fivetran, automating the pipelines once run on manual scripts, and dbt, using its semantic layer as a decentralized metrics store. Rounding out this modernization was the adoption of Airflow and Tableau.

“We matured in one big step and really specialized,” Lisa explained. “We focused on having tools to buy, versus build, making a conscious decision on paying a bit more, but saving so much time in the end, resulting in lower costs.”

While Swapfiets was maturing significantly across technical and organizational domains, tribal knowledge about their data assets persisted. Initially, the team created a Google Sheet with a list of 25 metrics and definitions. But without a simple interface to access these metrics, or understand the context around them, the volume of questions posed to the Data & Analytics team increased.

“We got a lot of questions from people, like ‘How is that calculated?’ or ‘We see this number in this dashboard, but in this dashboard, it’s completely different. How does that work?’,” Lisa explained. “There was no ownership, no transparency, and no central place for us to write down terms, or explain what metrics meant and how they were calculated. It took a lot of time to answer these ad-hoc things.”

In most cases, there was a perfectly valid reason for differences between dashboards and assets, but absent documentation, trust was eroding, leading Lisa and her team toward a new focus on data governance.

“To lose trust is easy, but to gain that trust back is hard. That’s when we started taking this seriously. We wanted to have a high-quality, trustworthy BI environment, and that’s why we started looking into our options for governance, and into Atlan.”

Lisa Smits, Data Analytics Manager

Lisa and her team began by searching for glossary solutions in order to increase the transparency of their data assets and reduce the volume of questions from Swapfiets’ data consumers. But as she and her team learned more about the Active Metadata Management market, their expectations for the value that such a platform could drive grew.

With a flexible user interface, data consumers could discover, understand, and apply data in a self-service manner, and with automated lineage, Swapfiets’ data practitioners could better understand how data flowed through their modernizing stack and how it was being utilized. And with Atlan Insights, a metadata-based query builder, even more of Lisa’s colleagues would be empowered to utilize data.

Rather than just a glossary of terms, Lisa and her team envisioned a one-stop shop for their colleagues who used data every day, and for those who should, but hadn’t yet.

“You have one place where you can store the terms, where you can look at all of your data assets, where you have the descriptions, and where you can see the owners. If you have a question, you can immediately send it via Slack. It’s all connected in one place, and I think having that complete picture, not only governance but also self-service, was what really stood out to us about Atlan.”

Lisa Smits, Data Analytics Manager

After choosing Atlan, Lisa and her team began by integrating Redshift (since replaced by Snowflake), dbt, and Tableau, quickly enabling lineage across their most critical technologies. With their data estate effectively crawled, and data assets visible, they began the process of assigning owners and creating a business glossary.

Using the definitions they had already agreed upon with their business colleagues and stored in Google Sheets, Lisa’s team built the baseline of a new business glossary, then worked with their colleagues in Data Engineering to document the sources of each asset. While using pre-existing definitions saved valuable time, the process of enriching data assets and documenting their sources was still a cumbersome one. In Atlan Playbooks, rule-based bulk automations for tasks like documentation, Swapfiets found a way to significantly accelerate enrichment.

Utilizing Playbooks, Swapfiets built an automation ensuring that each time a new asset is introduced into their metrics store, its pre-existing descriptions from upstream systems are automatically passed through to Atlan. These descriptions are then verified, rather than written from scratch, significantly accelerating their documentation rollout.

“Content creation was a lot of work, but when you introduced Playbooks, it was amazing for us,” Lisa shared. “We now have 125 metrics, and it improved our time spent on documentation.”

Furthering their use of Playbooks, Lisa’s team quickly moved to secure these newly available data assets, automatically tagging sensitive or personally identifiable information, and using personas within Atlan to ensure sensitive data was available only to those authorized to view it, or contribute to documenting it.

Seeking to deepen collaboration with subject matter experts, and to make data governance a routine part of the way Swapfiets operates, Lisa attended an Atlan Masterclass with WeWork to learn how best to build and maintain a business glossary.

Lisa’s team maintains a single business glossary, divided into business domains that align subject matter experts with partners on the data team. Six such business domains exist within the glossary, across teams like Operations, Human Resources, Product, and Consumer Direct. And for each of these six business domains, Lisa’s team arranged cross-functional governance committees composed of data analysts, data owners, and data stewards that meet on a monthly basis to discuss their progress.

“In that monthly meeting, we focus on three governance topics, which are ownership, documentation, and quality. For documentation and ownership, Atlan is really handy. It’s also handy for quality to find out where in the data lifecycle quality issues arise,” Lisa shared. “During the meeting, we write out action points and commit to tasks like verifying more terms.”

To track progress against these commitments, Lisa’s team uses Atlan’s reporting functionality as a heads-up display of how many terms and data sources have been verified or enriched, and at what pace. Further simplifying the process, subject matter experts use filters in Atlan to display the terms they’re responsible for, giving them a list of relevant terms and data sources they’re tasked with enriching, or to mark unverified but important terms and data sources as draft for future enrichment.

While Swapfiets continues to find new ways to apply Atlan and grow their governance practice, the value yielded from self-service and their metrics catalog, alone, has been significant.

“Colleagues frequently ask ‘What does this mean?’ or ‘How do you define this metric?’ With an integrated workflow in Slack, we send them to the Atlan glossary to self-serve their question. It’s been a great time saving for us,” Lisa explained.

For questions that can’t be answered by self-service, the Data & Analytics team benefits from clear, assigned ownership for each data asset, making it simple to direct questions to subject matter experts, or to tag them directly into a Slack thread.

By standardizing the process for defining metrics, and automating their initial enrichment using Atlan Playbooks, Lisa’s team has saved many hours of effort, so far. But more noteworthy to their team is the fact that they are now assured their process will be followed each time a new metric is introduced.

“The whole standardization of all the metrics saves between 10 and 30 minutes for every metric. So if you multiply that by 125, it’s a nice number, but it’s not like, ‘Oh, we saved five weeks this year.’ It’s just nice that we don’t have to think about this anymore. It’s not in our process anymore while we create a metric. It’s already taken care of.”

Lisa Smits, Data Analytics Manager
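
As a rough check on the figures in the quote above, the sketch below multiplies the 10 to 30 minutes saved per metric by the 125 metrics in the catalog. The 40-hour work week used to convert hours into weeks is an assumption made here for illustration and is not stated by Swapfiets.

```python
# Back-of-the-envelope estimate using the figures from the quote above.
METRICS = 125               # metrics in the catalog (from the article)
MINUTES_SAVED = (10, 30)    # minutes saved per metric, low and high (from the article)
HOURS_PER_WORK_WEEK = 40    # assumption, not stated in the article


def hours_saved(metrics: int, minutes_per_metric: int) -> float:
    """Total hours saved for a given per-metric saving."""
    return metrics * minutes_per_metric / 60


low_hours = hours_saved(METRICS, MINUTES_SAVED[0])
high_hours = hours_saved(METRICS, MINUTES_SAVED[1])

print(f"{low_hours:.1f} to {high_hours:.1f} hours")
print(
    f"{low_hours / HOURS_PER_WORK_WEEK:.1f} to "
    f"{high_hours / HOURS_PER_WORK_WEEK:.1f} work weeks"
)
```

Under that assumption the total lands between roughly half a work week and a little over one and a half, which is consistent with Lisa’s remark that the saving is welcome but well short of five weeks.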

With a catalog of metrics in place, and a sustainable process to further enrich it, Lisa and her team began a series of programs designed to increase data savviness across the Swapfiets organization and further reduce the burden of service on their organization.

Data Literacy Program

Foundational to how Swapfiets enables self-service is their Data Literacy Program. Part of this program is a mandatory training for Atlan & Tableau. This teaches new Swapfiets employees about Atlan and its possibilities, and how to utilize Swapfiets’ existing Tableau dashboards. When complete, users are then given access to Atlan and Tableau. In this program, the Analytics team also offers a SQL-Atlan training for their Data Champions.

“For self-service questions, we like teaching people the basics of querying. And with Atlan, you have the visual query builder. Even if people are not super data literate, they can play around with the query builder and it’s much easier to understand than something like ‘SELECT * FROM’. If they need to filter, it’s actually called filter instead of the technical ‘WHERE’ statement. It just speaks to them a lot more.”

Lisa Smits, Data Analytics Manager
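
To illustrate the distinction Lisa draws, here is a minimal sketch of the SQL that a single “filter” step stands for. It uses Python’s built-in sqlite3 module with an in-memory database. The `subscriptions` table, its columns, its rows, and the `select_with_filter` helper are all hypothetical, invented for this example, and it does not represent how Atlan’s query builder is implemented.

```python
import sqlite3

# Hypothetical table and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE subscriptions (id INTEGER, country TEXT, active INTEGER)")
conn.executemany(
    "INSERT INTO subscriptions VALUES (?, ?, ?)",
    [(1, "NL", 1), (2, "DE", 1), (3, "NL", 0)],
)


def select_with_filter(table: str, column: str, value):
    """Build and run the SQL a 'filter' step stands for: a WHERE clause on one column."""
    # Table and column names are fixed by this example; the value is bound as a parameter.
    sql = f"SELECT * FROM {table} WHERE {column} = ?"
    return sql, conn.execute(sql, (value,)).fetchall()


sql, rows = select_with_filter("subscriptions", "country", "NL")
print(sql)   # the statement a visual "filter on country" would correspond to
print(rows)  # only the rows whose country matches
```

The point of the sketch is only that “filter” and `WHERE` express the same operation; a visual builder lets a user pick the column and value without writing the statement by hand.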

With the combination of the Data Literacy Program and Atlan Insights, Tableau is now reserved for long-term data products and visualization, with the bulk of custom querying done in Atlan. This potentially lowers Tableau licensing costs, and the visual query tool lowers the technical barriers to data analysis.

Data Champions

For Swapfiets employees with a higher technical aptitude or ambition, or for data stewards responsible for updating their glossary, a more advanced level of training dives deeper into visualization in Tableau, and how to explore assets and translate them into queries in Atlan.

Using fine-grained access controls in Atlan, Lisa’s team ensures these Data Champions are able to conduct self-service analysis at a deeper level, without the risk of accessing and applying improper or restricted data.

“We don’t have any PII in Tableau anymore, because access management is easier in Atlan. Now, it’s transparent how many assets have PII data, who can access those, when they are used, and who queries which tables,” Lisa shared.

Extract Champions

The newest addition to Swapfiets’ democratization efforts is Extract Champions, data-savvy colleagues from each business domain who are now empowered to help Lisa’s team when requests for data extracts occur.

In Swapfiets’ data support channel on Slack, users initiate a workflow to request data that is unavailable in Tableau, or too advanced for self-service. With one click of an emoticon, Extract Champions from the relevant data domain are tagged, accept the task, and begin querying the data using Atlan Insights for delivery to the requester.

“When they’re done, they just click the ‘Finish’ button on the ticket, and we don’t even have to look at the ticket. So it’s a win-win. I’m so enthusiastic about this, it’s been great,” Lisa shared.

Furthering the value Extract Champions yield from Atlan, Lisa and her team realized the repetitive nature of some requests, and worked to automate them using Atlan Scheduled Queries.

“Atlan helped us schedule extracts to people outside of Atlan, which helped an Extract Champion who’s in charge of communication with B2B customers,” Lisa explained. “And now he’s scheduled his manual work, and Atlan automated what he would usually share via email. He doesn’t have to reach out to us anymore.”

Maintaining the incredible momentum of Swapfiets’ Data Governance Program is a focus on change management, constant communication, and driving cultural change.

Governance Committees continue to meet on a monthly basis, ensuring each business domain keeps their enrichment and ownership responsibilities top-of-mind. Updates and newsletters are routinely sent to stakeholders, educating them on topics like quality, ownership, security, or accessibility, and updating them on crucial topics like how documentation is progressing against their goals.

But the most noteworthy change, beyond how quickly Swapfiets’ technology has modernized and how well their Data Governance Program has been stewarded, is a cultural shift. Where Lisa’s team was once responsible for answering questions about data and connecting the dots for data consumers, that ownership is shifting. Stewards and Data Owners from across their organization now feel empowered to field questions about their data, and responsible for the quality and availability of these assets.

“We have the data support channel, and we could answer most questions, but we won’t anymore. We tag the data owner or steward and they will look into it. This results in more motivation to look for sustainable solutions and think about Data Governance when they’re playing their part contributing to Swapfiets’ mission. I think it’s created a culture where data is something that affects all of us, and not just the data team. That’s been the biggest change we’ve accomplished, so far.”

Lisa Smits, Data Analytics Manager

Photo by Ernest Ojeh on Unsplash