A firehose of sensitive data from your vehicle is flowing to a group of companies you’ve probably never heard of
By: Jon Keegan and Alfred Ng
Today’s cars are akin to smartphones, with apps connected to the internet that collect huge amounts of data, some of which is highly personal.
Most drivers have no idea what data is being transmitted from their vehicles, let alone who exactly is collecting, analyzing, and sharing that data, and with whom. A recent survey of drivers by the Automotive Industries Association of Canada found that only 28 percent of respondents had a clear understanding of the types of data their vehicle produced, and the same percentage said they had a clear understanding of who had access to that data.
Welcome to the world of connected vehicle data, an ecosystem of dozens of businesses you never knew existed.
The Markup has identified 37 companies that are part of the rapidly growing connected vehicle data industry that seeks to monetize such data in an environment with few regulations governing its sale or use.
While many of these companies stress they are using aggregated or anonymized data, the unique nature of location and movement data increases the potential for violations of user privacy.
The connected vehicle data market is still in its early days, but analysts predict it will be worth anywhere from $300 billion to $800 billion by 2030.
This nascent industry faces challenges, as it is under pressure to reap profits in order to attract and satisfy investors; at the same time, the disclosure of sensitive and potentially identifying information from smartphones has prompted U.S. lawmakers to threaten sweeping crackdowns on the collection, transfer, and sale of location data, an effort that could create barriers for the industry as it grows.
Nevertheless, the race is on to gather massive amounts of data points about drivers to feed the growing market for this information.
How the Data Flows
The following is a typical data flow scenario for a vehicle with a factory-installed cellular connection.
Once a driver gets into a car, dozens of sensors emit data points that flow to the car’s computer: The driver door is unlocked; a passenger is in the driver’s seat; the internal cabin temperature is 86° F; the sunroof is opened; the ignition button is pressed; a trip has started from this location.
These data points are processed by the car’s computers and transmitted via cellular radio back to the car manufacturer’s servers.
As the trip continues, additional information is collected: the vehicle location and speed, whether the brakes are applied, which song is playing on the entertainment system, whether the headlights are on or the oil level is low.
The data then begins its own journey from the car manufacturer to companies known as “vehicle data hubs” and on through the connected vehicle data marketplace.
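To make the scenario above concrete, here is a minimal sketch in Python of how a car's computer might bundle sensor readings into a single message for upload. Everything in it is an assumption made for illustration: the signal names, the `SensorEvent` and `build_payload` names, the vehicle identifier, and the JSON layout are hypothetical and do not reflect any manufacturer's actual format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class SensorEvent:
    # Hypothetical fields; real telemetry formats vary by manufacturer.
    signal: str       # name of the reading, e.g. "sunroof_open"
    value: object     # the reading itself (bool, number, string)
    timestamp: float  # seconds since the trip record began


def build_payload(vehicle_id, events):
    """Bundle buffered events into one JSON message for upload."""
    return json.dumps({
        "vehicle_id": vehicle_id,
        "events": [asdict(event) for event in events],
    })


# Events modeled on the examples in the scenario above.
events = [
    SensorEvent("driver_door_unlocked", True, 0.0),
    SensorEvent("cabin_temperature_f", 86, 1.5),
    SensorEvent("sunroof_open", True, 3.0),
    SensorEvent("ignition_pressed", True, 4.2),
]
payload = build_payload("EXAMPLE-VEHICLE-1", events)
```

Even this toy message shows why the stream is sensitive: each reading is tied to a vehicle identifier and a time, so a sequence of them describes one person's routine.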
The 37 companies identified by The Markup do not make up the total universe of industry players. But the products they create and services they provide illustrate how the industry works and the breadth of its reach.
The companies each play a unique role and some play multiple roles. They fall into several categories:
Vehicle Data Hubs
Awash in vehicle data, most car manufacturers, or OEMs—original equipment manufacturers—found themselves in an unfamiliar role. “What has given rise to the industry is that most OEMs have recognized that they are better at making cars than they are at processing and handling data,” said Andrew Jackson, research director at PTOLEMUS Consulting Group, which studies the connected vehicle industry.
This created an opening for a new kind of third-party data company, vehicle data hubs, which are at the center of the connected vehicle data market.
Vehicle data hubs ingest vehicle and movement data from several different sources: from OEMs, from other connected vehicle data providers, directly from vehicles using aftermarket hardware (such as an onboard diagnostic [OBD] dongle), or from smartphone apps. The companies normalize the data and offer it to customers in the form of a dashboard or insights derived from analysis or other data products.
In the case of car manufacturers, each captures and stores data differently, creating obstacles to analyzing data across the industry. Hubs solve this problem by gathering data from dozens of car manufacturers and other sources and consolidating it in one place for analysis.
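The normalization step described above can be sketched as follows. This is an illustrative Python example under stated assumptions, not any hub's real pipeline: the two source formats ("oem_a" and "oem_b"), their field names, and the common schema are all invented for the example.

```python
def normalize_oem_a(record):
    # Hypothetical OEM A: speed in km/h, position as separate fields.
    return {
        "vehicle_id": record["vin_hash"],
        "speed_kph": float(record["speedKmh"]),
        "lat": record["lat"],
        "lon": record["lng"],
    }


def normalize_oem_b(record):
    # Hypothetical OEM B: speed in mph, position as a [lat, lon] pair.
    lat, lon = record["position"]
    return {
        "vehicle_id": record["id"],
        "speed_kph": round(record["speed_mph"] * 1.609344, 1),
        "lat": lat,
        "lon": lon,
    }


# One converter per source; adding a source means adding one entry.
NORMALIZERS = {"oem_a": normalize_oem_a, "oem_b": normalize_oem_b}


def normalize(source, record):
    """Convert a source-specific record into the shared schema."""
    return NORMALIZERS[source](record)


a = normalize("oem_a", {"vin_hash": "a1", "speedKmh": "50", "lat": 47.61, "lng": -122.33})
b = normalize("oem_b", {"id": "b7", "speed_mph": 60, "position": [40.71, -74.0]})
```

Once records from different sources share field names and units, they can be stored in one database and analyzed together, which is the consolidation the hubs sell.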
Andrea Amico is founder and CEO of Privacy4Cars, an automotive data privacy company. Amico said of vehicle data hubs, “So, there’s many sources out there. Their business proposition is collect all this data, create massive databases, try to standardize this data as much as possible and then literally sell it. So that’s their business model.”
Many vehicle data hubs market their massive troves of data for applications including insurance, traffic management, electric vehicle infrastructure planning, fleet management, advertising, mapping, city planning, and location intelligence. Many also promote their data as crucial to the future application of autonomous vehicles.
When used to produce insights, the data is usually aggregated. The vehicle data may also be made available through an application programming interface (API), which allows customers to integrate the data into their own apps and services.
Among the notable companies in the vehicle data hub space are INRIX, CARUSO, Verisk, LexisNexis, Otonomo, and Wejo.
INRIX has been around since 2005 and offers parking, traffic, and navigation data to transportation agencies, OEMs, and software developers looking to add mobility features.
INRIX CTO and data protection officer Mark Daymond disputed the vehicle data hub categorization. “INRIX does not transfer or exchange raw data to data customers. We analyze anonymous and aggregated data and create products out of it, then distribute those products to customers. Identities of individuals are irrelevant for our business,” Daymond said in an email to The Markup.
CARUSO offers a data marketplace for European vehicle data. Its “data catalog” section of its API documentation lists 245 distinct vehicle data points. The company did not respond to a request for comment.
Verisk spokesperson Alberto Canal said that the data in the Verisk Data Exchange “… is subject to advance safeguards and entails consumer consent at multiple points of the process.” Canal said that vehicle data is only shared with insurers for underwriting purposes after consent is granted by the consumer.
Jennifer Grigas Richman, director, external communications, at LexisNexis Risk Solutions, said, “LexisNexis Risk Solutions prides itself on the responsible use of data and devotes enormous resources and time to protecting consumers’ privacy and their personal information.”
A closer look at Otonomo and Wejo illustrates the huge amount of data under their control and the potential value of the information.
Otonomo is a publicly traded company based in Tel Aviv. Founded in 2015, it was valued at $1.4 billion at the time of its initial public offering (via a SPAC) in August 2021. It boasts on its website that it draws data from 50 million vehicles, “tracking” 330 billion miles, and ingests 4.1 billion data points per day.
In its Q1 2022 financial results, Otonomo said it has contracts with 23 OEMs, and in April it acquired The Floow, a “connected insurance technology” provider. Otonomo reported revenue of just over $1 million for the quarter, with a $15 million loss.
Wejo, founded in 2014, is a publicly traded vehicle data hub based in Manchester, England. Wejo claims that its data represents “one in every 28 vehicles in the USA” and contains 16.2 trillion data points and 76.7 billion journeys with accuracy down to 3 meters, with a “1-3 second capture rate.”
Wejo says it has partnerships with 24 OEMs and fleet providers and reported revenue of $568,000 with a loss of $40 million in Q1 2022.
Wejo’s investors include GM, Microsoft, and defense and intelligence contractor Palantir.
Wejo declined to comment.
Many of these companies stress the steps they take to protect driver privacy. These protections generally come in two forms: anonymizing or aggregating driver data, and clear consent controls. But because movement and location data is so sensitive, the risk of violating user privacy remains high.
High Mobility is a vehicle data hub that “enables data connections from cars to services, with user consent,” according to Risto Vahtra, CEO and founder.
The company lists 57 categories of data points including “Trips,” “Seats,” “Driver Fatigue,” and “Heart Rate” among the items available in its data catalog. Its API documentation describes 660 distinct data points, though not all of these data points are used by all OEMs.
Vahtra told The Markup in an email, “Out of about 660 data items, perhaps half are supported in a production environment and not from all OEMs that we have agreements with.”
Vahtra said that the company does not collect vehicle data. “High Mobility does not collect, store, manipulate or store vehicle data. Our solution instead is designed to securely link services, cars and people.”
Bennett Cyphers, a staff technologist at the Electronic Frontier Foundation, said, “The more different ways you’re being measured in your vehicle, the more likely it is that someone can take a stream of data and use the characteristics of all of those different data points to fingerprint a particular user or a particular vehicle.”
Vahtra agreed about this potential risk. “This is definitely a risk for anonymized data. For personalized data this may be true, but not all this data is shared with neutral platforms and third parties.” He said that the one party who does have access to this granular data is the car manufacturers. “The OEMs themselves indeed have access to this vast amount of data points.”
Cyphers said the amount of personal data collected in combination with a lack of regulations for its sale and use is troubling. “When you see the volume of data that’s up for sale, and the lack of regulation in the vast majority of American states regarding how companies can use data, it seems like a match made in privacy hell.”
Anonymization and Aggregation
Otonomo is one example of the dozens of companies that market their attempts at keeping information anonymous. Otonomo describes its platform as having “privacy and security by design” and notes the use of patented “data blurring” technology to protect user privacy. It says it is in compliance with the EU’s General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA). It also has an “Otonomo Driver Pledge” page promising drivers the ability to easily grant or revoke access to personal data, customer transparency about sharing data, and adherence to security best practices.
Despite those assurances, in 2021 Motherboard discovered precise, individual vehicle data in free samples on Otonomo’s site. Recently, Otonomo found itself the target of a class action lawsuit filed in California Superior Court for the County of San Francisco by a California BMW owner who alleged that he never granted the company permission to collect and sell his personal data. Otonomo had the suit removed to federal district court in California and sought to have the case dismissed, arguing that the plaintiff did grant permission for the car manufacturer to collect vehicle data and that Otonomo did not attach any device to his vehicle as alleged in his lawsuit. Otonomo also argued that tracking people and tracking vehicles are not the same thing. The lawsuit is pending.
Otonomo did not respond to The Markup’s requests for comment, but in a response to Motherboard’s coverage of individual vehicle data in its free samples, Otonomo spokesperson Jodi Joseph Asiag said, “Privacy is at the core of our platform, technology and vision.”
Regarding assurances of anonymized location data, Cyphers noted, “It is not possible to minimize individualized location data traces whenever you have several different data points about a person’s location or a vehicle’s location over time. It doesn’t matter what else you do to the data, it’s not going to be anonymized because people’s location traces are extremely unique.”
Aggregated data can be safer, but the specific methods used in the process matter. “It’s still very difficult to expose aggregated location data that you know is reflective of the movements of real people in a way that’s privacy safe. But it’s possible. It can be done,” Cyphers said.
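One common approach to aggregation, sketched here in Python, is to snap locations to a coarse grid and publish only cells that contain at least a minimum number of distinct vehicles. This is an illustration of the general idea, not the method used by any company named in this article, and the function name, threshold, and grid precision are assumptions. As Cyphers notes, the specific methods matter; this sketch alone does not guarantee privacy.

```python
from collections import defaultdict


def aggregate_counts(points, min_vehicles=5, precision=2):
    """Count distinct vehicles per coarse grid cell, dropping sparse cells.

    points: iterable of (vehicle_id, lat, lon) tuples.
    precision: decimal places kept; 2 places is roughly a 1 km cell.
    """
    cells = defaultdict(set)
    for vehicle_id, lat, lon in points:
        cell = (round(lat, precision), round(lon, precision))
        cells[cell].add(vehicle_id)
    # Suppress any cell with too few vehicles to hide in a crowd.
    return {
        cell: len(ids)
        for cell, ids in cells.items()
        if len(ids) >= min_vehicles
    }


# Five vehicles near one intersection, plus one vehicle alone elsewhere.
points = [(f"v{i}", 47.6101 + i * 0.0001, -122.3301) for i in range(5)]
points.append(("v9", 40.0, -75.0))
counts = aggregate_counts(points)
```

In this example the lone vehicle's cell is withheld because publishing a count of one would reveal where that single car was.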
One area where the car-as-smartphone metaphor breaks down is users’ ability to grant or revoke permissions for apps to access personal data. Both Apple and Google have built fine-grained controls to review and grant permissions and have strengthened prompts to explain what data will be shared with third parties.
For most cars, progress toward this level of controls is lagging. Comparing the privacy control panels found on smartphones to the typical car, “[t]hat stuff does not exist,” said Privacy4Cars’s Amico. Users must consent to a number of different terms of service, either on the OEM’s smartphone app or on the car dashboard.
Amico said there should be a clear division of consent controls for optional features, versus essential features. “Whenever the data is necessary for the safe function of the vehicle, you should disclose it, you should get consent.”
EFF’s Cyphers echoed this concern, saying that drivers should know exactly what data they are granting permission for and how it will be used. “If you opted into sharing location data for the purpose of accessing a navigation program on your car’s screen, your location data should only be used for the purpose of delivering that service. You can’t grant consent for one thing that you want and then have the car company use that for something else, like selling it to a data broker.”
A new federal privacy bill known as the American Data Privacy and Protection Act was voted out of the House Committee on Energy & Commerce last week. It would ensure that clear user consent is obtained for each data processing purpose, for services offered through “nontraditional devices such as cars.”
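The idea of consent tied to a specific purpose can be sketched in a few lines of Python. This is a hypothetical illustration of the principle Cyphers describes, not a representation of the bill's requirements or of any car maker's software; the `ConsentRegistry` class, its methods, and the purpose labels are invented for the example.

```python
class ConsentRegistry:
    """Tracks which purposes a driver has approved for each data type."""

    def __init__(self):
        # Maps (driver_id, data_type) to the set of approved purposes.
        self._grants = {}

    def grant(self, driver_id, data_type, purpose):
        self._grants.setdefault((driver_id, data_type), set()).add(purpose)

    def revoke(self, driver_id, data_type, purpose):
        self._grants.get((driver_id, data_type), set()).discard(purpose)

    def is_allowed(self, driver_id, data_type, purpose):
        # Consent for one purpose never implies consent for another.
        return purpose in self._grants.get((driver_id, data_type), set())


registry = ConsentRegistry()
registry.grant("driver-1", "location", "navigation")

can_navigate = registry.is_allowed("driver-1", "location", "navigation")
can_sell = registry.is_allowed("driver-1", "location", "data_sale")
```

Here the driver's opt-in to navigation permits that use only; a request to use the same location data for sale is refused unless the driver separately grants it.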
Privacy as a Feature
There are signs that car manufacturers are following Apple’s path.
Recently, Porsche announced it was rolling out new fine-grained privacy controls in its luxury Taycan model. In a press release announcing the strategic elevation of user privacy, Porsche’s chief privacy officer and director of group privacy Christian Völkel wrote, “The customer is given full transparency and control over data processing in the vehicle, with simple controls for privacy settings. ‘The customer is in the driver’s seat.’ ”
This article was originally published on The Markup and was republished under the Creative Commons Attribution-NonCommercial-NoDerivatives license.