Data Retention: A Product Design Question

Written by Hetal Desai

3 min read

Indian data protection and privacy law does not prescribe a universal retention timeline, yet the core principle is that personal data may be retained only for as long as it is necessary to fulfil the specific purpose for which it was collected. Sections 8(7) and 8(8) of the Digital Personal Data Protection Act, 2023, read with Rule 8 of the Digital Personal Data Protection Rules, 2025, operationalise this principle through pre-erasure notice requirements, minimum retention thresholds for specified data and logs, and defined retention windows for certain classes of e-commerce entities, social media intermediaries, and online gaming platforms, thereby embedding deletion into statutory structure rather than policy discretion.
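To make the principle concrete, purpose-bound retention can be expressed as data-category metadata from which erasure and pre-erasure notice dates are computed deterministically. A minimal sketch in Python; the categories, retention periods, and notice windows below are illustrative assumptions, not figures drawn from the Act or the Rules:

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass(frozen=True)
class RetentionRule:
    purpose: str
    retention_days: int   # illustrative period, not a statutory figure
    notice_days: int      # illustrative pre-erasure notice window

# Hypothetical category-to-rule mapping; the values are assumptions for the
# sketch, not periods prescribed by the Act or the Rules.
RULES = {
    "order_history": RetentionRule("order fulfilment",
                                   retention_days=365, notice_days=7),
    "support_tickets": RetentionRule("grievance handling",
                                     retention_days=180, notice_days=7),
}

def erasure_schedule(category, last_activity):
    """Compute (notice_date, erasure_date) so that deletion is a
    deterministic output of the schema, not a matter of discretion."""
    rule = RULES[category]
    erase_on = last_activity + timedelta(days=rule.retention_days)
    notify_on = erase_on - timedelta(days=rule.notice_days)
    return notify_on, erase_on
```

The point of the sketch is that once each category carries its rule, both the pre-erasure notice and the deletion itself can be scheduled by the system rather than left to manual review.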

Retention therefore becomes an evidentiary issue as much as a policy issue, since regulators assess whether backend systems technically enforce limitation instead of merely describing it in privacy notices.

How are regulators interpreting this globally where no standard timeline exists? Select case studies:

Finland (Verkkokauppa.com, 2024): An administrative fine of approximately EUR 856,000 was imposed for infringement of Articles 5(1)(e) and 25(2) of the GDPR. The company retained customer account data indefinitely, arguing that users could request deletion at any time. The regulator rejected this argument, holding that user-initiated erasure does not legitimise unlimited backend retention.

In the Verkkokauppa matter, customers could not complete purchases as guests by providing only payment and delivery details: account creation was mandatory, expanding data capture, and no defined storage period existed for personal data collected during purchases, so necessity was never re-evaluated against operational purpose.

France (PAP, 2024): The CNIL imposed a EUR 100,000 penalty on PAP for, inter alia, breaches of Articles 5(1)(e) and 28 of the GDPR. Personal data continued to reside in systems without automated expiry or deletion controls, despite documented policies. The findings included weak password requirements, insecure transmission of credentials, excessive and unenforced retention periods, incomplete and inaccurate privacy disclosures, processor contracts lacking mandatory GDPR clauses, and plaintext storage of passwords and identifiers, all of which demonstrated that retention risk is intertwined with access control, contract governance, and secure architecture.


Across both rulings, regulators assessed disclosures, internal documentation, and observable system behaviour in parallel, because accountability requires alignment between stated retention logic and actual database persistence, log storage, backup design, and processor-level replication. Processor governance failures also expose weaknesses in digital and SaaS contract structuring, where data processing clauses fail to impose enforceable deletion, audit, and flow-down obligations on vendors.
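One way to read this operationally: stated retention logic can be checked against actual persistence by comparing each record's age with the declared period for its category. A toy audit in Python; the declared periods and field names are hypothetical, chosen only to illustrate the gap regulators looked for:

```python
from datetime import datetime, timezone, timedelta

# Declared retention policy, in days (illustrative values, not statutory ones).
DECLARED = {"order_history": 365, "marketing_prefs": 90}

def find_shadow_persistence(rows, now):
    """Flag records that outlive the declared period for their category --
    the mismatch between stated retention logic and actual persistence."""
    violations = []
    for row in rows:
        limit = timedelta(days=DECLARED[row["category"]])
        if now - row["created_at"] > limit:
            violations.append(row["id"])
    return violations

now = datetime(2025, 6, 1, tzinfo=timezone.utc)
rows = [
    {"id": "a1", "category": "order_history",
     "created_at": datetime(2023, 1, 1, tzinfo=timezone.utc)},  # ~2.4 years old
    {"id": "b2", "category": "marketing_prefs",
     "created_at": datetime(2025, 5, 1, tzinfo=timezone.utc)},  # 31 days old
]
```

A real audit would run the same comparison against backups, log stores, and processor copies, since those are precisely the layers where declared policy and actual persistence diverged in the cases above.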


What does this mean for product and system design?


- Each data category must be mapped to a clearly defined purpose at the schema or metadata level, so that systems can programmatically determine when purpose exhaustion occurs.
- Retention and expiry logic must be embedded in database design and application workflows rather than left to manual review cycles.
- Deletion and anonymisation must operate as automated default states, triggered by purpose completion, account closure, or statutory expiry.
- Deletion commands must propagate across production environments, backups, log stores, analytics layers, and processor environments to prevent shadow persistence that contradicts declared policy.
- System-generated deletion logs must be preserved in a form that enables regulatory verification without reintroducing excessive retention of personal data.
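The requirements above can be sketched as a retention sweep over a store whose schema carries expiry metadata, with a deletion log that records only non-personal tombstone data. A minimal illustration, assuming an in-memory store; the field names and log format are hypothetical:

```python
from datetime import datetime, timezone

# Hypothetical store: every record carries purpose and expiry metadata at the
# schema level, so erasure can be decided programmatically, not manually.
records = [
    {"id": 1, "category": "order_history", "purpose": "order fulfilment",
     "expires_at": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"id": 2, "category": "order_history", "purpose": "order fulfilment",
     "expires_at": datetime(2030, 1, 1, tzinfo=timezone.utc)},
]
deletion_log = []  # tombstones only: identifiers and timestamps, no personal data

def retention_sweep(store, log, now):
    """Deletion as the default state: drop expired records and log the erasure."""
    kept = []
    for rec in store:
        if rec["expires_at"] <= now:
            log.append({"record_id": rec["id"], "category": rec["category"],
                        "erased_at": now.isoformat()})
        else:
            kept.append(rec)
    return kept

records = retention_sweep(records, deletion_log,
                          now=datetime(2025, 6, 1, tzinfo=timezone.utc))
```

In a real system the same sweep would have to fan out to backups, log stores, analytics layers, and processor environments, since a purge confined to the primary database leaves exactly the shadow persistence that contradicts declared policy.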


Where datasets feed algorithmic decision systems, retention design must additionally align with AI/ML governance controls, because prolonged storage of historical personal data expands bias exposure, model drift risk, and regulatory scrutiny.

Retention compliance ultimately depends on whether architecture enforces necessity in real time: regulators evaluate technical implementation against statutory purpose limitation, and organisations that treat deletion as an afterthought risk regulatory findings that documentation alone cannot cure.