Q&A: Facebook on data, privacy, and global health
As Facebook launches a set of new maps for disease prevention, Devex speaks with its Data for Good team about privacy concerns, public trust, and the biggest misconception about its work.
By Vince Chadwick // 27 May 2019

GENEVA: Facebook announced new Disease Prevention Maps last week, intended to help researchers and health organizations reach the right people and predict disease outbreaks.

By training its computers to analyze publicly and commercially available satellite images and demographic data (not Facebook user data), the social media company says it has created high-resolution population density maps that are now three times more detailed than any other source.

At the same time, the company is sharing its network coverage maps, and movement maps of those who have opted to allow Facebook to track their phone’s location, with new partners with whom it signs data licensing agreements. This data can be used for modelling how a disease could spread and where to preposition treatments.

“Facebook is not just hoarding data for the purpose of monetizing or showing ads. It’s very important for humanity that these datasets exist.”
Alex Pompe, Data for Good, Facebook

Devex spoke with Alex Pompe and Laura McGorman from Facebook’s Data for Good team to learn how this builds on the company’s work on disaster maps, their response to privacy concerns, and how they are trying to encourage more private sector players to share valuable data.

The conversation has been edited for length and clarity.

What’s new about the latest announcement?

McGorman: Disaster Maps has existed since 2017, and we now have over 40 partners that we work with there. Through that, movement maps and network coverage maps were shared with disaster relief organizations only. What is new on movement maps and network coverage maps is that we are creating them in new environments, we are sharing them with new partners, and they are also generated slightly differently for disease prevention use cases, because the window of a disease outbreak is longer than the window of observation for a natural disaster. The demographic estimates for our high-resolution population density maps are brand new.
Our partners said: “It’s great that you guys have built some of the world’s most accurate population density maps; we know that can be helpful in things like vaccination campaigns. But guess what? It’s not enough just to know where the people and the structures are; we really need to know estimates of where our target populations are. Because we are not just trying to reach people with a vaccine, we are trying to reach children under 5 with a vaccine.”

So Disease Prevention Maps include both new products (our demographic estimates for population density) and new partners (bringing on health partners with whom we now share movement data and network coverage data in entirely new settings).

What do you say to criticism that the information in the population density maps could be misused by bad actors?

McGorman: In situations where there are large ongoing conflicts, we simply won’t generate a population density map for that environment. Examples for Africa include Sudan, Somalia, and others.

The other thing to clarify is that these datasets are not dynamic. We’re relying on census data and satellite imagery data. We are not creating some sort of daily snapshot, the way we do with the Facebook datasets, on where women are today. This is an estimate of how the female population is distributed across a country.

So I would argue that organizations trying to target populations (sadly, they do exist) really would want to see insights that are far more dynamic. We are just trying to share insights that humanitarian organizations can use alongside census data to see, more efficiently, where people in certain target populations are likely to reside.

How about the movement and network coverage maps? How do you protect people’s privacy in the data?

McGorman: Similar rules apply for Facebook data, but it’s even more narrow in terms of where we generate.
We only generate Facebook datasets for areas with natural disasters or areas with public health emergencies, which are very subnational. The datasets generated for movement data will be for an area of France where there is a measles outbreak, and they will only be generated for a short period of time and then be deprecated from our mapping portal. So the scale of generating Facebook data is significantly narrower.

Pompe: For the Facebook user-derived datasets, if there are ever fewer than 10 individuals in one 0.6-kilometer tile, we don’t surface the raw metrics, and we introduce random noise into the count and then we do nearest-neighbor smoothing.

We open-source that methodology in an effort to try and stimulate other private sector partners like us to not be as scared. Like, “if this is what Facebook’s doing, at least that is out in the open and we can be better than that or at that level.”

The other thing is through partners, like the World Health Organization. We would rely on them to report if this is entirely too sensitive a location. We only generate the data based on their request. It isn’t sitting there in a database and then we just send it to them. It’s only generated based on request and within where the outbreak, crisis, or research study is happening. The data is only kept for 90 days and then it is removed.

One of the most common requests that we get from NGOs is: “We are trying to reach refugees or displaced populations that have crossed a border.” We don’t generate those. We let them know. They might initially have some pushback, like, “this is really urgently needed so we can reach those populations,” but then when we explain the humanitarian risk involved in surfacing that, we get no pushback.

Since this is being made available for free, they are in the position that they have to accept when we say we can’t do something. It’s running the risk and reward for sensitive data. We try and make that a dialogue.
We don’t want to be unilateral in proclaiming that. We say “no” often, probably 10-20% of the time.

Not everyone is on Facebook. How do you ensure your maps are representative and include those most in need of assistance?

McGorman: That is probably the most frequent question from our partners, because they care, of course, about making the right decisions in these contexts.

So with Facebook data, which is the movement maps and the network coverage maps, we did an analysis with Stanford University where we layered our Facebook population from the Disaster Maps baseline with the population density maps, which are census data and satellite imagery, and we measured how well they fit against each other.

What we found, which was not surprising, was that in pretty much every region of the world, bar sub-Saharan Africa, the goodness of fit was 0.8. What that tells you is that where there is a sufficient distribution of smartphone users or connectivity in a country, the Facebook population is going to look very much like the population at large.

At the same time (and this is where we are going to have to learn from feedback from our partners), it may be a challenge to use our movement maps to better understand the spread of Ebola in DRC [Democratic Republic of the Congo], where there are very few smartphone users. We have partners that are currently looking at that. We have generated movement maps for various Ebola outbreaks, but it remains to be seen if our maps can really fill a gap there.

Pompe: The feedback from those partners where we have worked in those places is that our representativeness is certainly down, but there is a lack of other datasets available. The entire tide has sunk down, so they are grateful.

What were your impressions from the WHO’s 72nd World Health Assembly?

Pompe: There is a call to the private sector to become more involved in these sorts of data efforts, and I think that’s good.
The main thing I would dispel, if I had the chance, would be this notion that the data is all sitting there and the company has it and it’s all just a matter of making it available. None of those things are actually true. Facebook isn’t just sitting on top of all this data that’s readily consumable for a health application. You need to write code to bring that together in a scalable way that doesn’t take three days to run before you get it.

When people are calling for the private sector to be contributing datasets, what they actually should be complementing that with is a call for Facebook and others to be describing the privacy-preserving methodology that they use (we open-source that), because that takes the majority of the time.

Ten percent of the effort is the technical, data science time to build these datasets and test them; 90% of it is related to ensuring privacy, including on the legal side, making sure that in each country where we might generate this data or share it with a partner, we are fulfilling data privacy requirements. And then making sure that we go beyond that.

Of course, Facebook has had several high-profile failings in the past relating to data privacy when data is shared with third parties. So even in something as morally unambiguous as a disease outbreak, where data can be shared for the purposes of saving lives, that isn’t just “open the door and let’s go crazy with this.” We need to be really diligent about it.

The interpretation of the European Union’s General Data Protection Regulation in this space is relatively unknown. We haven’t seen any of these court cases come to fruition or fines be levied. So we are operating in an area of risk as a company for the purposes of social good.

That’s actually the thing that is preventing private entities from contributing to these sorts of efforts. It’s not for lack of will or technical resources.
In fact, they are in a much better position to contribute data to these things than money, and it’s more impactful. But the barrier to do so is the uncertainty around privacy legislation.

What we are trying to demonstrate with these types of datasets is that Facebook is not just hoarding data for the purpose of monetizing or showing ads. It’s very important for humanity that these datasets exist, and hopefully we deserve the public’s trust that we can preserve the privacy so that they can contribute to some of society’s biggest problems.
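
For readers who want a concrete picture of the privacy steps Pompe describes for the Facebook user-derived datasets (holding back tiles with fewer than 10 individuals, adding random noise to counts, then nearest-neighbor smoothing), here is a minimal Python sketch. It is one possible reading of the interview, not Facebook's published methodology. The function name `privatize`, the Gaussian noise, the `noise_sd` parameter, the order of the steps, and the 3x3 averaging window are all assumptions made for illustration. Each grid cell stands in for one 0.6-kilometer tile.

```python
import random

MIN_PEOPLE = 10  # threshold quoted in the interview


def privatize(counts, noise_sd=1.0, seed=None):
    """Return a privacy-protected copy of a 2D grid of per-tile counts.

    Hypothetical sketch: the noise distribution, step order, and
    smoothing window are assumptions, not Facebook's actual method.
    """
    rng = random.Random(seed)
    rows, cols = len(counts), len(counts[0])

    # Suppress tiles below the threshold (None) and add random noise
    # to every remaining count, clamping at zero.
    noisy = [
        [
            None if c < MIN_PEOPLE else max(0.0, c + rng.gauss(0, noise_sd))
            for c in row
        ]
        for row in counts
    ]

    # Smooth each surviving tile by averaging it with the surviving
    # tiles in its 3x3 neighborhood (the tile itself is included).
    smoothed = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            if noisy[r][c] is None:
                out_row.append(None)
                continue
            window = [
                noisy[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if noisy[rr][cc] is not None
            ]
            out_row.append(sum(window) / len(window))
        smoothed.append(out_row)
    return smoothed


# Made-up counts for a 3x3 block of tiles; three tiles fall below 10.
grid = [[3, 40, 55], [12, 9, 60], [25, 30, 8]]
protected = privatize(grid, noise_sd=2.0, seed=0)
```

In this sketch the three tiles with fewer than 10 people come back as `None`, and every other tile reports a noisy, neighbor-averaged value rather than its raw count.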
Vince Chadwick is a contributing reporter at Devex. A law graduate from Melbourne, Australia, he was social affairs reporter for The Age newspaper, before covering breaking news, the arts, and public policy across Europe, including as a reporter and editor at POLITICO Europe. He was long-listed for International Journalist of the Year at the 2023 One World Media Awards.