Is Privacy-First Underwriting for NTC Borrowers Achievable in India?

Written by Hetal Desai

Indian statute does not define underwriting in the context of retail or MSME lending, but in practice, it refers to the structured assessment of repayment capacity, probability of default, loss severity, and portfolio concentration prior to sanction and pricing.

Historically, underwriting was documentation-centric and verification-based. Decision-making relied on credit bureau reports from entities such as TransUnion CIBIL, together with income proof and bank statements. This framework favoured formally employed and collateral-backed borrowers. New-to-Credit (‘NTC’) borrowers, including gig workers, unorganised-sector income entrants, and income earners outside formal credit channels, remained largely excluded, because the absence of a formal credit history translated into unquantified risk.

With fintech scale and smartphone penetration, underwriting in the digital lending ecosystem has shifted toward extracting signals from transactional and behavioural data. New inputs now include bank statement parsing, GST returns for merchants, platform performance data, cash flow regularity, device metadata and behavioural proxies. Alternative data supplements traditional inputs and helps bridge information gaps for thin-file or NTC consumers, while also reducing acquisition costs in digital consumer lending.

Lenders increasingly deploy machine learning models that analyse bank transactions, UPI flows, and GST data (sourced, for instance, through the Account Aggregator framework) to derive insights about income consistency and repayment capacity. However, these approaches remain dependent on the existence of meaningful digital financial activity.
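
One way to make "income consistency" concrete is a simple dispersion measure over monthly inflows. The sketch below is illustrative only: the function name, the three-month minimum and the scoring rule (one minus the coefficient of variation) are assumptions made for this example, not a description of any lender's model.

```python
from statistics import mean, pstdev

MIN_MONTHS = 3  # hypothetical minimum history needed before scoring


def income_consistency(monthly_inflows):
    """Score inflow regularity in [0, 1]; None when there is too little data.

    The score is 1 minus the coefficient of variation (standard deviation
    divided by mean) of monthly inflows, floored at 0. A perfectly steady
    inflow scores 1.0; highly irregular inflows score close to 0.
    """
    if len(monthly_inflows) < MIN_MONTHS:
        return None  # not enough history to say anything
    average = mean(monthly_inflows)
    if average <= 0:
        return None  # no usable inflows at all
    variation = pstdev(monthly_inflows) / average
    return max(0.0, 1.0 - variation)


if __name__ == "__main__":
    salaried = [30000, 30000, 30000, 30000]   # steady monthly credit
    gig_worker = [5000, 40000, 0, 22000]      # irregular inflows
    thin_file = [12000]                       # one month of history only

    print(income_consistency(salaried))
    print(income_consistency(gig_worker))
    print(income_consistency(thin_file))
```

Returning `None` rather than a score mirrors the dependency noted above: where there is no usable inflow history, there is nothing for the model to measure.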

Underwriting where formal financial activity is minimal

The primary limitation arises where data is sparse or entirely absent, including individuals with no formal credit history, irregular or cash-based income, and limited interaction with formal financial systems. In such cases, even well-designed consent frameworks and advanced models do not produce reliable underwriting outcomes, as the constraint lies in the absence of usable inputs rather than analytical capability.

Individuals may be willing to share additional forms of data where it increases their chances of accessing credit, including microfinance or small-ticket loans, but such data is often inconsistently available and collected through fragmented or informal mechanisms. As a result, even expanded data strategies do not fully resolve the underlying problem of data absence. Privacy-preserving approaches such as federated learning and differential privacy offer a different direction by enabling analysis without direct data sharing.

However, the effectiveness of privacy-preserving approaches such as federated learning and differential privacy depends on the existence of relevant data and financial footprints within the ecosystem. Where such a footprint is weak or non-existent, these systems have limited utility.
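
To illustrate why sparse data limits these techniques, the sketch below applies the Laplace mechanism, a standard differential privacy building block, to an average over a cohort of borrowers. The function names, clipping bounds and epsilon value are assumptions chosen for illustration. The noise needed to protect any single record has scale (upper - lower) / (n × epsilon), so it shrinks as the cohort grows and dominates the result when the cohort is small.

```python
import random


def noise_scale(n, lower, upper, epsilon):
    """Laplace noise scale for a mean of n values clipped to [lower, upper]."""
    if n <= 0 or upper <= lower or epsilon <= 0:
        raise ValueError("need n > 0, upper > lower and epsilon > 0")
    # Changing one record moves the clipped mean by at most (upper - lower) / n.
    sensitivity = (upper - lower) / n
    return sensitivity / epsilon


def dp_mean(values, lower, upper, epsilon, rng=random):
    """Differentially private mean (cohort size n is treated as public)."""
    n = len(values)
    scale = noise_scale(n, lower, upper, epsilon)
    clipped = [min(max(v, lower), upper) for v in values]
    # The difference of two exponential draws with mean `scale`
    # follows a Laplace distribution centred on 0 with that scale.
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return sum(clipped) / n + noise


if __name__ == "__main__":
    # Hypothetical bounds: monthly inflows clipped to 0..50,000 rupees.
    for cohort_size in (10, 10_000):
        print(cohort_size, noise_scale(cohort_size, 0, 50_000, epsilon=1.0))
```

With these assumed bounds and epsilon, a cohort of 10 borrowers carries a noise scale of 5,000 rupees, against 5 rupees for a cohort of 10,000. The privacy guarantee is the same in both cases, but the small-cohort estimate is far less useful for underwriting.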

Consent driven by the need for credit

In a layered lending stack involving multiple data sources and third parties, operationalising and monitoring compliance under the Digital Personal Data Protection Act, 2023 presents practical challenges.

The World Bank’s Study on Alternative Data in Credit Risk Assessment recognises that alternative data may include transactional records, utility payments, app usage, mobile money transactions, and e-commerce participation, and recommends a risk-based approach combined with consumer-permissioned, secure data-sharing. It also highlights that consent in alternative credit models is often difficult to interpret and may become effectively coerced where access to credit is contingent on agreement.

This becomes particularly relevant in the NTC context, where borrowers may prioritise access to credit over negotiating data use terms. As a result, consent for behavioural or non-essential data cannot be treated as fully voluntary in the conventional sense, especially where it is embedded within onboarding flows.

In this context, a defensible approach for NTC microfinance products is a structured implementation of purpose limitation, as behavioural data is continuous and does not map neatly to a single purpose in the way financial data does. Addressing this requires system-level design choices:

Segregation of behavioural data by use case (for example, fraud detection versus credit assessment), each with a defined purpose
Prohibition on repurposing data collected for one function toward another without fresh consent
Restriction to behaviour-based signals that are demonstrably necessary for credit risk, with exclusion of weak or proxy indicators, particularly those correlated with socio-economic or personal traits, at the feature engineering stage
Replacement of continuous tracking with time-bound or event-based data use
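
The design choices listed above can be sketched as a purpose-bound access check. This is a minimal illustration under stated assumptions: the signal names, the `ALLOWED_SIGNALS` allow-list, the `ConsentGrant` record and the `may_use` function are all hypothetical, and nothing here is prescribed by the DPDP Act or drawn from a real lending system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical allow-list: each purpose may read only the signals judged
# necessary for it. Weak or proxy indicators are simply never listed.
ALLOWED_SIGNALS = {
    "fraud_detection": {"device_fingerprint", "login_velocity"},
    "credit_assessment": {"cash_flow_regularity", "repayment_reminder_response"},
}


@dataclass(frozen=True)
class ConsentGrant:
    """Consent for one purpose, valid for a fixed window (time-bound use)."""
    borrower_id: str
    purpose: str
    granted_at: datetime
    valid_for: timedelta

    def is_active(self, now):
        return self.granted_at <= now < self.granted_at + self.valid_for


def may_use(signal, purpose, grants, now):
    """Allow a signal only for a listed purpose with an active grant.

    Consent given for one purpose never unlocks another, so repurposing
    requires a fresh grant.
    """
    if signal not in ALLOWED_SIGNALS.get(purpose, set()):
        return False
    return any(g.purpose == purpose and g.is_active(now) for g in grants)


if __name__ == "__main__":
    granted = datetime(2024, 1, 1, tzinfo=timezone.utc)
    grants = [ConsentGrant("b1", "fraud_detection", granted, timedelta(days=30))]
    now = datetime(2024, 1, 10, tzinfo=timezone.utc)

    print(may_use("device_fingerprint", "fraud_detection", grants, now))    # True
    print(may_use("device_fingerprint", "credit_assessment", grants, now))  # False
    later = now + timedelta(days=60)
    print(may_use("device_fingerprint", "fraud_detection", grants, later))  # False
```

Keeping the allow-list and the grant window in one checked path means a feature cannot reach a credit model unless someone has explicitly listed it for that purpose and the borrower's consent for that purpose is still live.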

For many NTC borrowers, consent is simply part of getting access to credit, not a considered decision on data use. Commercially, therefore, a privacy-aligned framework means less data to analyse, less scope for targeting and pre-qualification, and a harder trade-off between conversion efficiency and a privacy-first design for the digital lending ecosystem.
