Is Small Data > Big Data?

Dear Reader, I am going to start by doing something that I generally avoid: I am going to make an assumption. Normally that is quite a risky thing to do, but it’s a weak assumption and, if you are reading this post, I believe you know one or two things about technology trends. So here it is: “I assume you know about big data”. I told you it was a weak assumption; big data is all over the place. This is illustrated not least by the fact that it is reaching the peak of the Gartner Hype Cycle.

Many interesting businesses are developing big data services, and I need to disclose that I myself am currently working on the commercialisation of two of them. Often, though, the ratio of insight to noise is pretty low. You might remember, for example, the case of the so-called “Twitter hedge fund”. In addition, big data services are often built on models that bring incremental value to advertising services. The possible issue is that the individuals who generate the large amounts of data processed by these services rarely capture any of the value.

Having said that I rarely make assumptions, I am going to make another, far riskier one: you have never heard of ‘small data’. It’s a space where entrepreneurial opportunities abound to such a degree that, if you will pardon the pun, small data could end up being bigger than big data. Small data is the core data that defines us (name, age, physical address, email address, etc.) and the data that we generate (online) across different applications, social or not: our browsing history, our Foursquare check-ins, the photos we’ve uploaded to Flickr, the tracking of our regular (or rare, if you are anything like me) runs, etc. In other words, it is the data that, aggregated across many verticals, many platforms and millions of citizens and consumers, leads to big data.

Perfect storm for Small Data?

Many businesses are starting to develop services that help individuals make sense of, extract and add value to their personal data while, and this is the important part, keeping the data “generator”, as opposed to owner (more on this further down), at the centre of the value creation exercise rather than sidelining them. There is a hive of innovation in small data. Claro Partners has developed an interesting way to make sense of the ‘creative chaos’ that characterises a nascent industry. Other organisations might rationalise this wealth of initiatives differently, but Claro sees them as nine clusters. Here are four of them:

  1. Discovery: The data you generate when you are discovering and sharing, for example, a new restaurant (Foursquare), a nice cake (Pinterest) or some content (StumbleUpon).
  2. Crowd-generated data: The data that is accumulated by many people contributing, for example when trying to avoid traffic jams or keeping a road map up to date: Waze in the US, or its European equivalent based in London, Navmii. It can also be data from personal experience, for example sharing outcomes and events to do with medical treatments.
  3. Self-tracking: Self-explanatory, I suppose? There are different versions of self-tracking: your overall health data, as with Quentiq; your mood data, as with Moodtracker, used by patients suffering from bipolar disorder; or even your behaviour. RescueTime, for example, helps you cut out inefficiencies by measuring what you actually do during the day!
  4. Identity: Naturally, a very rich cluster with different versions of service around the identity data we generate. You have identity measurement, protection of your identity, confirmation of your identity (connect via Facebook, for example) or even “identity monetisation” with a service like Empire Avenue.

There are other crucial clusters, such as Access Management or the ‘Internet of Things’. Even without mentioning all the services and clusters, I hope this illustrates, dear reader, the sheer volume of initiatives currently happening in the “small data” arena.

As someone who’s been involved in many early-stage businesses, I sadly know that many of these initiatives will fail; that’s the nature of things. However, that should not discourage any entrepreneur from looking at this space. I believe that there are a number of underlying forces, “here and now”, that support the development of new small data services. A PEST analysis would prove my point, but without getting into the complexities of such an analysis I would like to highlight a few of the forces that are supporting such endeavours:

  1. Nature of data: As mentioned, the type of data we generate is rich: your health data, your driving experience data, data from your electrical appliances (Nest, anyone?). It is far more profound than just browsing data and photos.
  2. Volume of data: We are generating more and more data from more and more devices and applications.
  3. Ambient connectivity: When was the last time you were not connected to a network? Whilst there are some hurdles from a pricing point of view (e.g. data roaming), one is rarely out of coverage, which often means no latency and no hurdle to uploading data, even if this incurs a roaming charge.
  4. Usability: In my days developing new services online and doing wireframes, we constantly used the “soccer mum” as a reference point to make sure a very busy person would not have any problem learning very quickly how to use the service. Small data services are getting easier and easier to use and are consequently getting more attractive to an ever-growing user base.
  5. Mashability: The ability to aggregate data from different sources at the user level will certainly boost the value these services provide to the “data generator”. What if you could mash the data from your physical exercise with the amount of food that you’ve eaten and derive some insights?
  6. Legislation: This is a slightly tricky one that I will develop shortly but different legal frameworks are being pushed that could support growth in the small data sector. The UK government’s ‘midata’ initiative is an example of such a framework that would possibly lead to the development of so-called Personal Data Stores, à la in the US, or Mydex in the UK.

The legal bit

So, all looking good, right? There is a wealth of opportunities, there are underlying forces supporting new ventures and, happily, there is even the legal bit. Lawyers (and some people would argue lawmakers) are often seen to put hurdles in the way of doing business, but it would seem that current legislation is supporting small data services, right? Well, a little bit of caution and scrutiny is needed here.

The development of the legal framework is clearly grounded in big data services that use personal data to extract value. I don’t think it would be contentious to say that Google and Facebook are two of the most prominent companies that extract humongous value from personal data while providing a free service to their users. But there is growing resentment around the privacy issue and, naturally, regulators are waking up to this and want to ensure that the “rights” of end users are protected. The issue is that regulators are possibly looking at it from a big data point of view and haven’t grasped the impact it will have on small data initiatives. In other words, they are trying to solve yesterday’s issue rather than helping next Friday to happen. Doc Searls puts this more eloquently than I do.

And next Friday is small data. Yesterday is big data and advertising-centred models. Admittedly, big data services are not yesterday’s services (I am using Doc’s analogy here), but they are more and more mainstream, unlike small data services.

So I believe the term “privacy” should be replaced by “ownership of data that is generated in a social context”. Not catchy, granted, but that is the real issue with the current development of the legal framework for small data and “privacy” regulation. By trying to limit what can be done with this “data generated in a social context”, the regulator could actually prevent a huge wave of value creation by small data innovators.

What about this killer app then?

Good point… I didn’t mention it, and thanks for reading all the way to here! I think that, despite the regulatory risk, this “privacy” angle could be the killer app. Or, to be more precise, the “enabler app”. Indeed, in small data this is the part that consumers understand the most, even if they are still happy to give away all their personal data and waive all rights to it just for a bar of chocolate. However, the early adopters are leading the way, and the likes of Allow, mii or Abine, in the States, to name but a few, are those who will gain traction first. They probably need to be careful in their positioning, for example by avoiding scaremongering tactics. It will not be an easy ride, and some will fail to find the right model from both a service and a financial perspective. But “ownership of data that is generated in a social context” is so prominent an issue that people will start to use “privacy” services and then move on to wider “small data” services.

I hope this proved insightful and, should you have some thoughts, I would love to hear them!

MAVEN>’s Guest Poster:

Hervé Humbert focuses on the commercialisation of innovative, technology-based services. He leads the business development of big data services that monetise and extract insights from online, financial or mobile data sources. His interest in small data stems from his recognition of the need to put end users at the centre of services that use their personal data. His Twitter handle is: @hervehumbert

Images designed & kindly shared by Claro Partners: @claropartners




