March 16, 2024
Decoding the data dilemma: Strategies for effective data deletion in the age of AI
Businesses today have a tremendous opportunity to use data in new ways, but they must also look at what data they keep and how they use it to avoid potential legal issues. Even with the growth in generative AI, organizations remain responsible not only for safeguarding their data, particularly personal data, but also for strategically managing and deleting older information that carries more risk than business value.

Forrester predicts a doubling of unstructured data in 2024, driven in part by AI. But the evolving data landscape and the escalating cost of breaches and privacy violations call for a critical look at how to create an effective and robust data retention and deletion strategy.

While the expected volume of data is growing, so is the cost of data breaches and privacy violations. Ransomware criminals are taking over highly sensitive medical and government databases; recent incidents include hacks of Australia’s courts, a Kentucky healthcare company, 23andMe and large enterprises like Infosys, Boeing and security provider Okta. These breaches are getting more expensive too: IBM found that the average total cost of a breach was $4.45M in 2023, a 15% jump over 2020.

To manage data effectively, organizations need to craft a policy to delete obsolete data. With gen AI, executives may ask whether anything should ever be deleted, given future opportunities. But the longer a company stores data, the more opportunities there are for a data breach or for fines for violations of privacy law. The first step to minimize this risk is to take a comprehensive look at how a company is using its data, along with the nuanced considerations and tangible benefits of a data retention strategy.

Organizations often find themselves compelled to delete obsolete data due to legal requirements that are core to data protection laws.
Regulations mandate the retention of personal data only for as long as necessary, driving companies to establish retention policies with periods that vary across business areas. Along with reducing legal liability, deleting obsolete data can reduce storage costs.

The best way to identify which data can be considered obsolete, and which data will add ongoing business value, is to start with a data map that outlines the sources and types of incoming data, which fields are included and which systems or servers the data is stored on. A comprehensive data map ensures a company knows where personal data lives, the types of personal data processed, which types of protected or special category data are processed, the intended data processing purposes and the geographic locations of processing and applicable systems. A meaningful data inventory and classification is the foundation for a solid privacy program and helps provide the data lineage needed to understand how data flows through a company’s systems.

Once a company has a map of its corpus of data, legal and technical teams can work with business stakeholders to determine how valuable specific data might be, what sort of regulatory restrictions apply to storing that data and the potential ramifications if that data is leaked, breached or retained longer than necessary.

Most business stakeholders will naturally be reluctant to delete anything, especially when technology is changing so quickly. The deletion and retention conversation needs to focus on what’s most useful for the business. As an example, imagine a data analytics team at a financial institution that wants to ensure lending eligibility models are trained on as much data as possible. Unfortunately, that approach is counter to the intention of data protection and privacy laws.
The reality is that given how much interest rates, lending practices and consumers’ individual circumstances have changed, data from 20 years ago may not provide an accurate assessment of today’s consumers. That company may be better off focusing on other sources of recent data, like updated credit information, to determine an accurate risk score.

The current commercial real estate market really brings this challenge to light. Many risk-prediction models were trained on pre-pandemic data, before the systemic shift to online shopping and remote work. To reduce the chance of inaccurate predictions, discuss with business stakeholders how data becomes stale and less valuable over time and which data is most reflective of today’s world.

To help decide how long to keep data, start with affirmative legal obligations around maintaining financial records or sector-specific regulations around transactions that entail personal data. Look at statute of limitations periods to determine how long to keep data if it’s needed to defend against a potential lawsuit, and only keep personal data that’s needed for a potential litigation defense, such as transaction logs or evidence of user consent, rather than every piece of data on individual users.

When it’s time to clear out less valuable information, data can be deleted manually based on the retention period for each data type defined in the retention schedule. Automating the process via a purge policy improves reliability.

It’s also possible to use a deidentification process to remove identifiable personal data, or to use fully anonymized data, but this adds new challenges. Truly deidentified data generally falls under exemptions in data protection laws, but doing this correctly requires stripping out so much value that there’s not much left to use. Deidentifying requires stripping out unique and direct identifiers like an SSN and name, but also indirect identifiers, including information like customer IP addresses.
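
To make the moving parts concrete, here is a minimal Python sketch of how a retention schedule, an automated purge and a deidentification step might fit together. Everything in it is an assumption for illustration: the data types, the retention periods, the field names and the `Record` structure are hypothetical, and removing a handful of fields this way is not, by itself, enough to meet any legal deidentification standard.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention schedule: data type -> retention period in days.
# Real periods come from legal review, not from this example.
RETENTION_DAYS = {
    "transaction_log": 7 * 365,
    "marketing_profile": 2 * 365,
}

# Hypothetical field names treated as direct or indirect identifiers.
IDENTIFIER_FIELDS = {"name", "ssn", "email", "ip_address"}


@dataclass
class Record:
    data_type: str
    created: date
    fields: dict


def is_expired(record: Record, today: date) -> bool:
    """True when the record is older than the retention period for its type."""
    limit = timedelta(days=RETENTION_DAYS[record.data_type])
    return today - record.created > limit


def deidentify(record: Record) -> Record:
    """Return a copy of the record with the identifier fields removed."""
    kept = {k: v for k, v in record.fields.items() if k not in IDENTIFIER_FIELDS}
    return Record(record.data_type, record.created, kept)


def purge(records: list, today: date, keep_deidentified: bool = False) -> list:
    """Keep current records; drop expired ones, or keep a deidentified copy."""
    result = []
    for record in records:
        if not is_expired(record, today):
            result.append(record)
        elif keep_deidentified:
            result.append(deidentify(record))
    return result


if __name__ == "__main__":
    today = date(2024, 3, 16)
    records = [
        Record("marketing_profile", date(2020, 1, 1),
               {"name": "A. Person", "ip_address": "203.0.113.7", "segment": "smb"}),
        Record("transaction_log", date(2023, 6, 1),
               {"name": "A. Person", "amount": 120}),
    ]
    # The marketing profile is past its two-year period and is dropped;
    # the transaction log is still inside its seven-year period and is kept.
    for record in purge(records, today):
        print(record.data_type, record.fields)
```

Passing `keep_deidentified=True` keeps the expired marketing profile with only its `segment` field, which shows the trade-off in miniature: the record survives, but most of what made it useful is gone.
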
For example, to meet the HIPAA standard for safe harbor protection, an organization must remove a list of 18 identifiers. An organization may want to try this approach to maintain the performance of an analytics or AI model. But it’s important to discuss the pros and cons with stakeholders first.

The biggest mistake enterprises make in addressing obsolete data is rushing the process and skipping over those in-depth conversations. Project owners need to resist the urge to expedite and recognize that the right feedback from multiple groups is essential. Companies should work across legal, privacy and security teams, along with business leaders, to get feedback on what data is essential to keep, and to avoid a retention policy and schedule that inadvertently deletes something the company needs. It’s easier to shorten retention periods over time and retain less personal data, but once data is gone, it’s gone, so measure twice and cut once.

As we’ve outlined above, there are several considerations in addressing obsolete data, including foundational data mapping and lineage, defining retention period criteria and working out how to implement these policies efficiently. Navigating the intricacies of data deletion requires a strategic and informed approach. By understanding the legal, cybersecurity and financial implications, organizations can develop a robust data retention strategy that not only complies with regulations but also effectively safeguards their digital assets.

Seth Batey is data protection officer and senior managing privacy counsel at Fivetran.