Data trusts: what are they and how do they work?


  • Anouk Ruhaak
  • Economic democracy
  • Future of Work

How do we, the general public, gain greater control over the estimated 2.5 quintillion bytes of data that are recorded, stored, processed and analysed every day?

For the moment, we have little say over what can be collected, accessed and used, and by whom. Nor do we enjoy much agency over the ways social platforms study and steer our behaviour. Take Uber: if Uber does something that you, a regular user, do not like, that is not something Uber views as up for discussion. Your only recourse is to delete the app, and that act of defiance is unlikely to have much impact. That is, if you can afford to delete it at all: what if Uber were your only way to get to work?

In this article, I put forward the concept of data trusts as a way to claw back some control over the digital utilities that we rely on in our everyday lives.

A data trust is a structure whereby data is placed under the control of a board of trustees with a responsibility to look after the interests of the beneficiaries — you, me, society.

Using them offers all of us the chance of a greater say in how our data is collected, accessed and used by others. This goes further than limiting data collection and access to protect our privacy; it promotes the beneficial use of data and ensures those benefits are widely felt across society.

In a sense, data trusts are to the data economy what trade unions are to the labour economy.

Who’s in control?

Any inquiry into the appropriate collection and flow of data should attempt to answer these questions:

  • Collection: who can collect and who can decide over future collection?
  • Access: who can access and who can decide over future access?
  • Use: who can use and who can decide over future use?

The first question acknowledges that the very act of recording data can have far-reaching consequences. First, it is hard to erase data once it has been collected, so collection always implies use (at a minimum, the storage of data). Second, the act of recording can itself be viewed as a violation of our autonomy: we behave differently when we know we are on camera, or when we assume our everyday conversations are on the record (an ever more reasonable assumption).

The second and third questions determine how information is used and distributed once it is collected. In addition to determining who has access to data and can use it today, we need to know who can make as-yet-unspecified decisions about future access and use. For example, you may have access to data about you without enjoying the right to decide who else can access it. Alternatively, it could be entirely up to you to decide who can use data about you, or some specific dataset, with the right to revoke that use whenever you so desire.
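To make the distinction between holding a right and deciding over it concrete, here is a minimal sketch in Python. It is purely illustrative: the DataRight class and every name in it are invented for this article, and rights over data are in reality legal and social arrangements, not software.

```python
from dataclasses import dataclass, field

# Illustrative only: a toy model of the distinction drawn above between
# holding a right (to collect, access or use data) today and holding the
# power to decide who gets that right in the future.

@dataclass
class DataRight:
    kind: str                                         # "collect", "access" or "use"
    holders: set[str] = field(default_factory=set)    # who may exercise the right now
    deciders: set[str] = field(default_factory=set)   # who may grant or revoke it

    def grant(self, decider: str, new_holder: str) -> None:
        if decider not in self.deciders:
            raise PermissionError(f"{decider} cannot decide over '{self.kind}'")
        self.holders.add(new_holder)

    def revoke(self, decider: str, holder: str) -> None:
        if decider not in self.deciders:
            raise PermissionError(f"{decider} cannot decide over '{self.kind}'")
        self.holders.discard(holder)

# You may hold access to data about you ...
access = DataRight(kind="access", holders={"you"}, deciders={"platform"})
# ... while the platform, not you, decides who else gets that access:
access.grant(decider="platform", new_holder="advertiser")
# access.grant(decider="you", new_holder="your doctor")  # would raise PermissionError
```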

Clearly, the power to decide who can collect, access and use data matters more than merely holding collection, access and use rights today. That raises the question: who gets to make those decisions about our data? Often, the de facto answer is 'a corporation', be it Google, Facebook or Amazon. Most of the sensors collecting data are under corporate control, and most of the resulting data is held by corporations as well. Especially in jurisdictions without explicit data protection legislation, this has meant that corporations decide what data is collected, who can access and use it, and for what purpose.

Even when data is collected within the context of a public project (e.g. smart cities), it is often the consulting corporation that decides what is collected, who can access it, how it is used, and by whom, with little public oversight. That is a problem. The director of a corporation has a fiduciary responsibility to act in the interests of their shareholders. Their job is not to ensure your privacy or to make data available for the public good, but to make money. In fact, even when a company's shareholders decide they do want to put those values above the need to turn a profit, we cannot trust that they will continue to do so in the future. What happens to their good intentions when the corporation is sold?

Privacy legislation coming into force today solves part of the problem by handing individuals the right to decide how data about them is shared, and what may be collected in the first place. However, our ability to exercise these rights depends on whether those decisions are made freely. Unfortunately, our reliance on a handful of social media platforms and digital services has resulted in power imbalances that undermine any meaningful notion of consent. Our ability to freely choose how and when we share our data breaks down when the 'choice' is between surrendering data about ourselves and social exclusion, or even unemployment (as when we opt out of workplace surveillance). Without a real way to opt out, our consent is meaningless.

Meanwhile, the enforcement of privacy legislation leaves much to be desired: many enforcement bodies rely on complaints rather than pre-emptive audits, and are severely understaffed.

In relation to the questions posed above, data protection laws give us the rights we need to grant and revoke access to and use of data. However, without addressing the underlying power imbalances, we remain ill-equipped to exercise those rights.

How to level the playing field?

Three alternative solutions have been proposed to level the playing field. Some look to antitrust law to break up Big Tech, the idea being that many smaller tech companies would allow for more choice between services. This solution is flawed. For one, services like search and social media benefit from network effects: having large datasets to train on means search recommendations get better, and having all your friends in one place means you don't need five apps to contact them all. I would argue those are things we like and might lose if Big Tech were broken up. What we want is to be able to leave Facebook and still talk to our friends, not to have many Facebooks. At the same time, more competition could well make things worse: when many services must compete for your attention, it is in their interest to make those services as addictive as possible. That cannot be the desired outcome.

Instead of creating more competition, some argue we should simply nationalise Big Tech. This strategy leaves us with two important questions: which government should do the nationalising? And do we want a government in control of data about us?

Finally, we could decide to divorce those who wish to use data from those who control its use. Personal Data Stores (e.g. Solid or MyData) aim to do just that. By placing the data with the internet user, rather than the service provider, they hope to put the user back in control. This approach holds a lot of merit. However, it fails to account for our limited ability to decide how we want to share data. Do we have enough knowledge and insight to weigh our options? And even if we did, do we really want to spend our time making those decisions?

Data Trusts

As with personal data stores, by placing data in a data trust we separate the data users from those who control the data. The difference is that with a trust, we avoid placing the entire burden of decision-making on the individual. Moreover, by pooling data from various sources together in a data trust, we unlock the ability for a data trustee to negotiate on behalf of the collective, rather than an individual.

A data trust is created when someone, or a lot of someones, hands over their data assets or data rights to a trustee. That trustee can be a person or an organisation, who will then hold and govern that data on behalf of a group of beneficiaries, and will do so for a specific purpose. The beneficiaries could be those who handed the data to the trust, or anyone else (including society at large). Importantly, the trustee has a fiduciary responsibility to look out for the interests of the beneficiaries, much like your doctor has a fiduciary responsibility to do what is best for you. That also means the trustee is not allowed to have a profit motive or, more generally, a conflicting interest in the data or data rights under its custody.

One important feature of a data trust is that the trustee can decide who has access to the data under the trust's control and who can use it. Importantly, if a data user fails to comply with the terms and conditions, the trustee can revoke access. To return to the Uber example: instead of you leaving Uber in protest, a trustee can threaten to revoke access to the data of many. Such a threat carries far more weight than the act of a single user.
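Continuing the toy model above (again purely illustrative, with invented names; a real data trust is a legal arrangement, not a piece of software), the pooling of data rights and the trustee's collective revocation threat might look like this:

```python
from dataclasses import dataclass, field

@dataclass
class DataTrust:
    purpose: str                                           # the purpose the trust serves
    trustee: str                                           # person or organisation with fiduciary duties
    beneficiaries: set[str] = field(default_factory=set)   # whose interests the trustee must serve
    settlors: set[str] = field(default_factory=set)        # those who handed over data rights
    grants: dict[str, str] = field(default_factory=dict)   # data user -> agreed terms

    def join(self, settlor: str) -> None:
        """A new member places their data rights under the trust's control."""
        self.settlors.add(settlor)

    def grant_access(self, data_user: str, terms: str) -> None:
        """The trustee admits a data user under explicit terms and conditions."""
        self.grants[data_user] = terms

    def revoke_access(self, data_user: str) -> None:
        """On a breach of terms, the trustee withdraws the pooled data of
        all settlors at once, a threat no single departing user can pose."""
        self.grants.pop(data_user, None)

trust = DataTrust(purpose="fair terms for ride-hailing data",
                  trustee="independent trustee",
                  beneficiaries={"riders"})
for rider in ("alice", "bob", "carol"):
    trust.join(rider)
trust.grant_access("ride_hailing_co", terms="no individual profiling")
trust.revoke_access("ride_hailing_co")  # the data of alice, bob and carol is withdrawn together
```

The point of the sketch is the last line: revocation withdraws everyone's pooled data at once, which is what gives the trustee bargaining power that no individual user has.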

The Road Ahead

How do we get from here to a world in which our data is governed by data trusts? Needless to say, there is still a lot to figure out. How do trustees make decisions about data collection and access? How do we make sure we can continue to trust the trust? Are data trusts possible within our current regulatory environment, and to what extent does the answer depend on the jurisdiction you are in?

We will not find the answers to these and many other remaining questions just by theorising. Instead, we need to test various models in real-world scenarios. As a Mozilla Fellow, I hope to contribute to this effort by considering the usefulness of a data trust in two specific scenarios:

  • Data donation platform: AlgorithmWatch is looking to build a data donation platform that allows browser users to donate data on their usage of specific services (e.g. YouTube or Facebook). That data is then used to understand how those platforms target users, or what ads they are served. Could this data sit in a trust? Who would the trustee be? Who would we want to access and use this data?
  • Health data: CoverUS, a US-based health startup, is looking to help its members collect health data about themselves and use it to gain better access to health services. We want to find out whether a data trust could hold and govern this data.

It is my hope that by studying the concept of a data trust in these specific contexts, I will learn more about the pilot partners' incentives and constraints for participating in a trust, and gain a better understanding of the design requirements for a data trust. I further hope to gain more insight into the regulatory and policy requirements for data trusts to work.


Anouk Ruhaak is a Mozilla Fellow. Follow her and get in touch about this work on Twitter @anoukruhaak

This blog was originally posted on Medium on November 11th 2019, and is republished with the permission of the author.



Join the discussion

4 Comments


  • Hi Anouk, my organisation, Cordial.World, is in the process of finding funding to build a data trust to support long-COVID research. Your article is simultaneously provoking and helpful. Thank you for writing.

  • Anouk, This is a really intriguing idea and one worth following up.


    My gloss on this would be to explore how existing representative bodies (like the trades unions you mention) could develop their current sector- or community-specific services to include the data trustee role. This would provide vital context for the role. For trade unions, obviously, the context in which they would act as trustee would be the employment relationship. Here they could champion the data subject's rights and the wider legal and ethical issues, such as data minimisation and privacy by design and by default. I suspect that without that additional contextualisation, the idea might be too big to put into practice.

  • This is really interesting; it does seem that we need to find some kind of solution to the problems people see with private companies holding their personal data.


    The comparison with trade unions is an interesting one too as it helps us to think about the problems with trade unions and what we might have to do to avoid the same problems with Data Trusts. The main problems as I see them are:


    1) Trade Unions are fundamentally politicised, so their decisions about who leads them, what is decided and what they choose to act on lack the credibility that a non-politically aligned organisation would have.


    2) Trade Unions are obliged to act in the interests of their members above all else and, as ever with interests, that means making hard decisions about what is in the short-term interest and what is in the longer-term interest. For example, people might object to the use of their data to inform the development of private medicines that in the long run may prove to be of enormous value to humanity.


    3) Trade Union leaders and convenors are often criticised for not representing the mainstream of the membership. This means that when they speak supposedly on behalf of their 'members', the views of the actual membership are often far more diverse. Our voices are rarely as harmonious as some would have them be.


    I don’t want to be seen as majoring too much on what the author may have intended simply as an analogy to help us understand what was being proposed, but I think the analogy is a particularly useful one and might help us think about what we need to do to dream up a system that recognises this complexity.

    • Indeed, really interesting. I would expect that consumer rights organisations could also be interested in extending their role in the context of the 'transactional internet' and be willing to act as a data trust.
