Problem
SecurityAsset.containsSensitiveData: boolean exists in the schema and is shown in the UI on the OAuth-grants card, but nothing populates it. It's seeded by the demo script and otherwise empty. The asset model has DATA_RESOURCE, WORKSPACE, VAULT, REPOSITORY types and dataAssetTypes filtering in internal/bootstrap/security_overview.go, so the scaffolding is there; the classifier is missing.
DSPM (Data Security Posture Management) is the fastest-growing SSPM adjacency: every buyer wants to know which findings touch sensitive data and which OAuth grants can exfiltrate it. Even a lightweight version is differentiating.
Goals
Lightweight data classifier that samples content in Drive / SharePoint / Box / GitHub for well-known sensitive-data patterns (PII, PHI, PCI, secrets).
Asset-level enrichment: populate SecurityAsset.containsSensitiveData + a new sensitivityScore and dataLabels set.
Risk-aware overlay: every existing finding's risk score boosts when the affected asset carries sensitive data; OAuth-grant risk likewise.
Data-flow heatmap: where in the SaaS estate does sensitive data live, who can read it, and who already has?
Findings filter: "show me everything touching customer PII."
Non-goals
Not building a full DSPM with policy-as-code, fine-grained classification taxonomies, or ML-based content classification — start with regex/dictionary classifiers + provider-native labels where available.
Not scanning every file on every sync — sampling + provider-native classification labels (Google DLP, M365 sensitivity labels) where they exist.
Not transferring file contents off the customer's tenant — all classification happens server-side in Aperio's runtime.
Proposed design
Schema extensions
Classifier pipeline
internal/dspm/ Go package:
Sampling worker: for each SecurityAsset of type DATA_RESOURCE/WORKSPACE/VAULT/REPOSITORY, sample N files (configurable) per scan cycle.
Provider-native first: if Google DLP / M365 Sensitivity Labels / Box Shield labels already exist on the object, take them as authoritative and skip regex.
Regex/dictionary classifier with built-in patterns:
PII: SSN, NI number, EU national IDs, passport numbers, IBAN/SWIFT
Secrets: trufflehog-style detector list
sensitivityScore = f(unique labels, hit density, asset exposure).
Connectors with sampling
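Whatever the provider, a connector has to pick at most N files under a size cap from a listing it usually sees one page at a time. A minimal sketch of that shared step, assuming reservoir sampling is acceptable; `FileMeta`, `Sampler`, and `Offer` are hypothetical names used only for illustration and do not exist in the codebase:

```go
package main

import (
	"fmt"
	"math/rand"
)

// FileMeta is a hypothetical, provider-agnostic view of one listed file.
type FileMeta struct {
	ID        string
	SizeBytes int64
}

// Sampler keeps a uniform random sample of at most N eligible files from
// a listing of unknown length (reservoir sampling, Algorithm R).
type Sampler struct {
	N        int
	MaxBytes int64
	rng      *rand.Rand
	seen     int // eligible files offered so far
	picked   []FileMeta
}

func NewSampler(n int, maxBytes int64, seed int64) *Sampler {
	return &Sampler{N: n, MaxBytes: maxBytes, rng: rand.New(rand.NewSource(seed))}
}

// Offer considers one file; call it for every entry on every listing page.
// Files over the size cap are skipped and never counted.
func (s *Sampler) Offer(f FileMeta) {
	if s.N <= 0 || f.SizeBytes > s.MaxBytes {
		return
	}
	s.seen++
	if len(s.picked) < s.N {
		s.picked = append(s.picked, f)
		return
	}
	// Replace a random slot with probability N/seen.
	if j := s.rng.Intn(s.seen); j < s.N {
		s.picked[j] = f
	}
}

// Sample returns the files chosen so far.
func (s *Sampler) Sample() []FileMeta { return s.picked }

func main() {
	s := NewSampler(3, 1<<20, 42) // 3 files, 1 MiB cap
	for i := 0; i < 100; i++ {
		s.Offer(FileMeta{ID: fmt.Sprintf("file-%d", i), SizeBytes: int64(i) * 50_000})
	}
	for _, f := range s.Sample() {
		fmt.Println(f.ID, f.SizeBytes)
	}
}
```

Because the sampler only needs one pass and constant memory, it can sit behind any paginated listing call without buffering the full file list.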
drive.files.list with size cap; files.export for Docs to plain text; abide by Drive Activity API rate limits.
Risk boost integration
Existing finding risk-score calculators (packages/shared/src/risk-scoring.ts) get a multiplier when the linked SecurityAsset.sensitivityScore > threshold.
OAuth grant risk likewise: an app holding a CRITICAL scope on a containsSensitiveData = true asset bumps to MAX risk.
UI surface
/security/data: sensitivity heatmap across the SaaS estate, by provider + asset type + data label.
containsSensitiveData; once the classifier runs, those flags become real.
Phasing
containsSensitiveData actually populated
/security/data heatmap
Open questions
References
SecurityAsset (already has containsSensitiveData, type enum, dataAssetTypes filtering); internal/bootstrap/security_overview.go's data-asset awareness; packages/shared/src/risk-scoring.ts.
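To make the classifier pipeline, the sensitivityScore formula, and the risk boost from the proposed design more concrete, here is a rough end-to-end sketch in Go. Everything in it is an assumption for illustration: the label names, the three regexes, the weights inside `SensitivityScore`, and the `BoostRisk` signature are hypothetical. The real risk calculators live in packages/shared/src/risk-scoring.ts (TypeScript), so the boost is shown in Go only to keep the example in one language.

```go
package main

import (
	"fmt"
	"math"
	"regexp"
	"sort"
)

// Illustrative patterns only. Real detectors would need validation
// (checksums, context keywords) to keep false positives down.
var patterns = map[string]*regexp.Regexp{
	"PII_US_SSN":        regexp.MustCompile(`\b\d{3}-\d{2}-\d{4}\b`),
	"PII_IBAN":          regexp.MustCompile(`\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b`),
	"SECRET_AWS_KEY_ID": regexp.MustCompile(`\bAKIA[0-9A-Z]{16}\b`),
}

// Result summarizes what the classifier found in one asset's samples.
type Result struct {
	Labels []string // sorted unique labels that matched
	Hits   int      // total matches across all patterns
}

// Classify runs every pattern over every sampled text.
func Classify(samples []string) Result {
	counts := map[string]int{}
	for _, s := range samples {
		for label, re := range patterns {
			counts[label] += len(re.FindAllString(s, -1))
		}
	}
	var r Result
	for label, n := range counts {
		if n > 0 {
			r.Labels = append(r.Labels, label)
			r.Hits += n
		}
	}
	sort.Strings(r.Labels)
	return r
}

// SensitivityScore is one hypothetical shape for
// f(unique labels, hit density, asset exposure), returning 0..100.
// exposure runs from 0 (private) to 1 (publicly shared).
func SensitivityScore(r Result, sampled int, exposure float64) float64 {
	if sampled == 0 || len(r.Labels) == 0 {
		return 0
	}
	labelPart := math.Min(float64(len(r.Labels))/3, 1)       // saturates at 3 labels
	density := math.Min(float64(r.Hits)/float64(sampled), 1) // hits per sampled file, capped
	base := 100 * (0.6*labelPart + 0.4*density)
	return base * (0.5 + 0.5*exposure)
}

// BoostRisk applies the multiplier only above the threshold, capped at 100.
func BoostRisk(risk, sensitivity, threshold, multiplier float64) float64 {
	if sensitivity > threshold {
		return math.Min(risk*multiplier, 100)
	}
	return risk
}

func main() {
	samples := []string{
		"Employee SSN 123-45-6789 on file",
		"key AKIAABCDEFGHIJKLMNOP",
		"meeting notes",
	}
	r := Classify(samples)
	score := SensitivityScore(r, len(samples), 1.0)
	fmt.Println(r.Labels, r.Hits)
	fmt.Printf("sensitivity=%.1f boosted=%.1f\n", score, BoostRisk(40, score, 50, 1.5))
}
```

Keeping classification, scoring, and boosting as three small pure functions would let provider-native labels skip `Classify` and feed `SensitivityScore` directly, which matches the "provider-native first" step.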