Data Masking

1 Overview

Description

Data masking is a crucial technique used to protect sensitive information when interacting with Large Language Models (LLMs). It involves replacing identifiable or confidential data within prompts with placeholder text, ensuring that such information is not exposed to third-party models. This approach is particularly important for safeguarding personally identifiable information (PII) and other sensitive details.

Sensitive data in a prompt is identified using pattern matching and machine learning algorithms. Each identified item is replaced with a generic placeholder, such as <Person_0> for a name or <Organization_1> for a company name. The LLM then processes the masked prompt and never sees the original values. After the LLM responds, the placeholders in the response can be replaced with the original values if necessary, so that the generated output remains relevant and accurate. AI Core supports data masking as a feature.
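The mask, process, and unmask cycle described above can be illustrated with a minimal Python sketch. This is not the AI Core implementation or its API: the regular expressions, the `mask` and `unmask` function names, and the per-type placeholder numbering are all assumptions made for illustration, and a real masking service would also use machine learning to detect entities such as names and organizations.

```python
import re

# Hypothetical detection rules for illustration only; a real service
# would combine patterns like these with ML-based entity recognition.
PATTERNS = {
    "Email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "Phone": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def mask(prompt):
    """Replace sensitive values with placeholders.

    Returns the masked prompt and a placeholder-to-original mapping.
    """
    mapping = {}   # placeholder -> original value
    counters = {}  # entity label -> next free index (assumed numbering scheme)

    for label, pattern in PATTERNS.items():
        def replace(match, label=label):
            value = match.group(0)
            # Reuse the same placeholder when a value occurs more than once.
            for placeholder, original in mapping.items():
                if original == value:
                    return placeholder
            index = counters.get(label, 0)
            counters[label] = index + 1
            placeholder = f"<{label}_{index}>"
            mapping[placeholder] = value
            return placeholder

        prompt = pattern.sub(replace, prompt)
    return prompt, mapping


def unmask(text, mapping):
    """Restore the original values in text that contains placeholders."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text


if __name__ == "__main__":
    masked, mapping = mask("Contact alice@example.com or call +49 170 1234567.")
    print(masked)  # Contact <Email_0> or call <Phone_0>.
    # Stand-in for an LLM response that echoes a placeholder.
    print(unmask("I will write to <Email_0>.", mapping))
```

Only the masked prompt would be sent to the model. The mapping stays on the caller's side, which is what makes it possible to restore the original values in the response.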

Expected Outcome

After you implement data masking, sensitive information is protected and is not exposed during data processing and analysis. This helps maintain data privacy and security, which supports compliance with data protection regulations and standards.