Taking Back Control of Our Personal Data

By Todd Mozer



March 11, 2019


As digital assistant technologies get smarter, there will be more concern about what private data they collect, store and share over the internet.

As more and more devices designed to watch and listen to us flood the market, there is rising concern about how the personal data they collect gets used. Facebook has admitted to monetizing our personal data and the personal data of our friends. Google has been duplicitous in its tracking of users even when privacy settings are set to not track, and more recently has admitted to placing microphones in products without informing consumers. In no realm is the issue of privacy more relevant than in today’s voice assistants. A recent PC Magazine survey of over 2,000 people found that privacy was the top concern for smart home devices (more important than cost!). The issue becomes more complex as personal assistant devices are increasingly equipped with IP cameras and various other sensors to watch, track and listen to us.

Admittedly, people get a lot in return for giving up their privacy. We’ve become numb to privacy policies. It is almost expected that we will tap “accept” on End User License Agreements (EULAs) without reading through terms that are intentionally designed to be difficult to read. We get free software, free services, music and audio feeds, discounted rates and other valuable benefits. What we don’t fully realize is what we are giving up, because big data companies don’t openly share that with us. We fall into two main categories of consumers: those who trust the giants of industry to do what’s right, and those who don’t play (and lose out on the benefits). For example, I love having smart speakers in my home, but many of my friends won’t get them -- precisely because of these very fair privacy concerns.

Our legislators are trying to address privacy concerns, but they need to fully understand how data is used before writing legislation, and they need help from the tech community on how to deal with these issues. Europe has adopted the new General Data Protection Regulation (GDPR) to address concerns related to businesses’ handling and protection of user data. This is a step in the right direction, as it provides clear rules to companies that handle personal data and offers clearly defined monetary penalties for failure to protect the data of citizens. However, is it enough to slap fines on these companies for failing to comply? What happens when they think the market value of the infringement will be greater than the penalty? Or when human errors are made, as happened when a European user requested their data and was accidentally given someone else’s?

In this day and age, many of our daily tasks require us to share personal data. Whether we are talking to our AI assistants, using our smartphone to navigate around traffic or making a credit card transaction at a local retailer, we’re sharing data. There are benefits to sharing this data, and as time goes on people will see even more of them, but there are many problems too.

AI personal assistants provide examples of how sharing data can benefit or hurt the end user. The more we use these devices the more they learn about us. As they become capable of recognizing who we are by face or voice and get to know and memorize our histories, preferences and needs, these systems will evolve from devices that answer simple questions into proactive, accurate helpers that offer useful recommendations, unprompted reminders, improved home security -- assistance we users find truly helpful. But they can also reveal private information that we don’t want others to know, which is what happened when a girl’s shopping habits triggered advertising for baby products before her parents knew she was pregnant.

As digital assistant technologies get smarter, there will be more concern about what private data they collect, store and share over the internet. One solution to the privacy problem would be to keep whatever the device learns on the device, by moving the AI processing out of the cloud to the edge (on device). If our assistants never send personal data to the cloud, the risk of that data being stored or shared there goes away. This becomes a more practical solution as embedded AI becomes more powerful. The biggest disadvantage may be that every new device would have to re-learn who we are.

Another possible solution is more collaboration among data companies. Some of the AI assistant companies are starting to work together; for example, Cortana and Alexa are playing friendly. This current approach, however, is about transferring the baton from one assistant to another so that multiple assistants can be accessed from a single device. It’s unlikely this approach will be widely adopted, and even if it were, it would result in inefficiency in the collection and use of our personal data, because each of the AI assistant providers would have to build their own profile of who we are based on the data they collect. However, because companies will want to eventually monetize the data they collect, sharing done right could actually benefit us.

Could more sharing of our data and preferences improve AI assistants?

Sharing data does create the best and most consistent user experience. It allows consumers to switch AI brands or buy new devices without losing any of the benefits in user experience and device knowledge of who we are. Each assistant having its own unique data and profile for its users skirts the privacy issue and misses the advantages of big data. That doesn’t seem to be the right way for the industry to advance. Ideally, we would create a system where there is shared knowledge of who we are, but we, as individuals, need to control and manage that knowledge.

For example, Google knows that the most frequent search I do on Google Maps is for vegetarian restaurants. Google has probably figured out that I’m a vegetarian. Quite ironically, I found “meat” in my shopping cart at Amazon. Somehow, Alexa thought I asked it to add “meat” to my shopping cart. Perhaps if Alexa had the same knowledge of me as Google it would have made a different and better decision. Likewise, Amazon knows a lot about my shopping details. It knows what pills I take, my shoe size and a whole lot of other things that might be helpful to the Google Assistant. A shared knowledge base would benefit me, but I want to be able to control and oversee this shared database or profile.

Improved privacy without losing the advantages of shared data is achievable through a combination of private devices with embedded intelligence and cloud devices restricted and governed by legislation and industry standards.

Legislation should enable user-controlled personal-data sharing systems, so that users can choose their own mix of private devices with embedded intelligence and cloud devices that offer more general and standard protections. Of course, the legislation is the hard part, since our government tends to work more for the industries that fund reelections and bills than for the people. Nevertheless, here are some thoughts on how privacy legislation could work:

  1. Companies can’t retain an individual’s private information. That’s it! There is no good reason for them to have it. Period. They can, however, collect it and provide it to the owner.
  2. Every person has and controls their “Shared Profile” or SP. The SP contains any and all information collected from any company. It is an item list categorized by type (clothing sizes, restaurants, etc.).
  3. The SP is divided into “confidential” (not publicly available) and “non-confidential” (publicly available to any company) sections. The SP owner has direct control over their data and decides the following:
    1. Which categories reside in the “non-confidential” SP vs. the “confidential” SP
    2. Which items are included in those public categories inside the “non-confidential” SP
    3. Which companies can access information in the “confidential” SP and what specific information they have access to
  4. Companies can access the “non-confidential” SP and utilize it for their purposes, but they can’t disclose or sell anything in it to others. It’s our data, not theirs!
  5. Companies can create new categories or items within categories on our SP; however, data added by companies is automatically placed in the “confidential” section. The user gets notified about new information and they can monitor, screen, edit and move such information into the “non-confidential” SP, making it useful for other companies to access.
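
As a rough illustration, the rules above could be modeled along the following lines. This is only a sketch under stated assumptions, not a proposed standard: the names `SharedProfile`, `Item`, `company_add`, `publish`, `grant` and `read` are all hypothetical, “confidential” is treated as the restricted section and “non-confidential” as the section open to any company, and access grants are simplified to whole categories rather than individual items. Rules 1 and 4 (no retention, no resale) are legal obligations that code like this cannot enforce.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

CONFIDENTIAL = "confidential"          # visible only to companies the owner approves
NON_CONFIDENTIAL = "non-confidential"  # visible to any company


@dataclass
class Item:
    value: str
    section: str = CONFIDENTIAL
    added_by: str = "owner"


@dataclass
class SharedProfile:
    """Hypothetical model of one person's Shared Profile (SP)."""
    owner: str
    categories: Dict[str, List[Item]] = field(default_factory=dict)
    grants: Dict[str, Set[str]] = field(default_factory=dict)  # company -> categories
    notifications: List[str] = field(default_factory=list)

    def company_add(self, company: str, category: str, value: str) -> None:
        """Rule 5: company-added data starts out confidential; the owner is notified."""
        self.categories.setdefault(category, []).append(
            Item(value, CONFIDENTIAL, company)
        )
        self.notifications.append(f"{company} added {value!r} to {category!r}")

    def publish(self, category: str, value: str) -> None:
        """Rules 3 and 5: the owner moves an item into the non-confidential section."""
        for item in self.categories.get(category, []):
            if item.value == value:
                item.section = NON_CONFIDENTIAL

    def grant(self, company: str, category: str) -> None:
        """Rule 3.3 (simplified): the owner lets one company read a confidential category."""
        self.grants.setdefault(company, set()).add(category)

    def read(self, company: str, category: str) -> List[str]:
        """Rules 3.3 and 4: return only the items this company is allowed to see."""
        granted = category in self.grants.get(company, set())
        return [
            item.value
            for item in self.categories.get(category, [])
            if granted or item.section == NON_CONFIDENTIAL
        ]


# Example: one company learns a preference; another sees it only after the owner publishes it.
profile = SharedProfile(owner="todd")
profile.company_add("Google", "diet", "vegetarian")
hidden = profile.read("Amazon", "diet")    # empty: the item is still confidential
profile.publish("diet", "vegetarian")
visible = profile.read("Amazon", "diet")   # the item is now open to any company
```

In this sketch the default is the important design choice: anything a company adds is hidden from every other company until the owner either publishes it or grants access explicitly.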

Ultimately, this concept is a shared format designed to give companies equal access to data while providing users full ownership and control of their personal data and a clear understanding of what is shared or made public. No longer would the user have to deal with individual companies and complex EULAs, and no longer would they need to fear that their devices are carrying out unscrupulous behavior because of the vagueness of our legal system and the less-than-transparent ways big-data companies use our data.

With a little creative thinking and a shift in regulation, we the people can change the future of data collection and take back ownership and control of our personal data.

Todd Mozer is the CEO of Sensory. He holds over a dozen patents in speech technology and has been involved in previous startups that reached IPO or were acquired by public companies. Todd holds an MBA from Stanford University, and has technical experience in machine learning, semiconductors, speech recognition, computer vision, and embedded software.
