Stay Lean

Do I need this data to provide the value I’m trying to deliver?

1. What data do you have?

Understanding what data you collect is easier said than done – there is almost always more than you think.

Evaluate data from the user’s perspective: what puts their privacy and security at risk? And how can you categorize your data in ways that make sense for collection while honoring user privacy?

Example: Server Logs

IP addresses and timestamps are standard in most online exchanges and often overlooked in data evaluations. This data poses a security risk because it can be used to identify people and their actions.
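One way to lower the risk in server logs is to coarsen these fields before they are stored. The sketch below is a minimal illustration, not a complete anonymization scheme: the function names, the prefix lengths (/24 for IPv4, /48 for IPv6), and the hourly rounding are all assumptions chosen for the example. Truncated addresses are harder to tie to one person, but they can still narrow down a location or network.

```python
import ipaddress
from datetime import datetime

def truncate_ip(ip: str) -> str:
    """Zero the host part of an address.

    Assumed prefix lengths: /24 for IPv4, /48 for IPv6.
    """
    addr = ipaddress.ip_address(ip)
    prefix = 24 if addr.version == 4 else 48
    # strict=False lets us pass a host address and get its enclosing network.
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)

def coarsen_timestamp(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)

# Example: what a log line would keep after coarsening.
print(truncate_ip("203.0.113.77"))
print(coarsen_timestamp(datetime(2024, 5, 1, 14, 37, 12)))
```

Whether hourly timestamps and truncated addresses are still useful depends on what the logs are for; if they are only needed for aggregate traffic counts, this level of detail is often enough.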

2. How do you collect your data?

Multiple organizations may have access to parts of your data, which makes your digital footprint larger than you expect.

Review where you collect or receive sensitive data. This allows you to manage who has access, what is collected, and how your data practices are communicated to your customers or supporters.

Example: Outsourcing

Most organizations use third-party services for managing customer emails, surveys, events, and questions. Vendor templates often collect more data than you need by default. This means that you and the vendor may both absorb unnecessary risk.
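If you import vendor submissions into your own systems, an allowlist keeps only the fields you have decided you need. This is a small sketch under assumed names: `ALLOWED_FIELDS`, `minimize`, and the field names are hypothetical and do not reflect any particular vendor's format.

```python
# Hypothetical allowlist: the only fields this organization has a purpose for.
ALLOWED_FIELDS = {"email", "first_name"}

def minimize(submission: dict) -> dict:
    """Return a copy of the submission containing only allowlisted fields."""
    return {key: value for key, value in submission.items() if key in ALLOWED_FIELDS}

# Example: a vendor form that collects more than is needed by default.
raw = {
    "email": "ada@example.org",
    "first_name": "Ada",
    "last_name": "Lovelace",
    "phone": "555-0100",
}
print(minimize(raw))
```

Filtering on import only limits what you store. The vendor may still hold the full record, so removing the extra fields from the vendor's template is the stronger fix where that is possible.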

3. Why are you collecting this data?

Many organizations collect large amounts of data, much of which is never used.

Ask two important questions:

  1. Is there a valid purpose for collecting sensitive data?
  2. Is there an alternative way to achieve the purpose without that data?

This process often yields innovative privacy-sensitive designs.

Example: First & Last Name

Most sign-up forms ask for first and last name. A valid reason to have this data might be to increase email open rates through personalized emails. An equally valuable alternative that minimizes risk would be to collect first name only.

4. When do you delete data?

Sometimes you need data indefinitely. But often, the value of the data diminishes over time.

Delete sensitive data when it is no longer relevant, or de-identify it as much as possible.

Example: Survey Results

In many cases, a raw dataset containing sensitive data is no longer needed or used within 3 to 6 months of its collection. But it won’t go away until someone takes action to delete it.

Resources