我每天都在使用 ChatGPT、Claude、Gemini 等强大的在线 AI 模型。它们确实非常有用,但我反复遇到同一个令人不安的时刻:当我准备粘贴某段文字时,突然意识到其中包含了真实的姓名、客户的公司名称、内部项目代号、同事的电话号码,或是不应该发送给外部服务的医疗信息。
最直接的应对方式是在 Word 或 Excel 中使用"查找与替换"。但这并不管用。现代 AI 模型会读取上下文,极有可能推断出占位符代表的真实内容。把"张伟"替换成"某人",并不能真正保护张伟的隐私。
我需要的是一个本地 AI —— 完全在我自己的机器上运行、不向外部发送任何数据、并且足够智能以理解上下文 —— 在文字离开我的设备之前,将敏感内容剥离干净。PrivacyVeil 就是这样一个工具。
PrivacyVeil 是一款本地优先的个人隐私脱敏工具。你将文本粘贴进去,它会把敏感实体替换为占位符(如 [PERSON_1]、[EMAIL_1] 等),然后你将脱敏后的版本发送给任意在线 AI。当 AI 回复后,将回复粘贴回来,PrivacyVeil 会自动还原原始内容,使答复在语境中仍然清晰可读。
所有处理均在本机完成,数据不会发送至任何云服务。
英文界面
中文界面
安装 PrivacyVeil 前,请确认您的系统满足以下所有要求。
| 要求 | 最低配置 | 推荐配置 |
|---|---|---|
| 内存(RAM) | 4 GB | 8 GB 或更多 |
| 可用磁盘空间 | 3 GB | 5 GB 以上 |
| macOS 版本 | macOS 12 Monterey | macOS 13 Ventura 或更新版本 |
| CPU | Apple Silicon 或 Intel | Apple Silicon(M1/M2/M3) |
内存说明: 默认 AI 模型(`phi3`)运行时约占用 2.3 GB 内存。4 GB 是硬性最低要求。若您有 8 GB 或更多内存,也可使用 `llama3.2:3b` 或 `mistral` 以获得更高的检测质量。
| 软件 | 要求版本 | 查看方法 | 安装方法 |
|---|---|---|---|
| macOS | 12 Monterey 或更新版本 | Apple 菜单 → 关于本机 | 系统更新 |
| Python | 3.11 或 3.12 | `python3 --version` | `brew install python@3.11` |
| Homebrew | 任意近期版本 | `brew --version` | brew.sh |
| Ollama | 0.1.48 或更新版本 | `ollama --version` | ollama.com/download |
| phi3 模型 | (任意) | `ollama list` | `ollama pull phi3` |
Python 版本说明: 不支持 Python 3.10 及更早版本。Python 3.13 尚未经过测试,建议使用 3.11 或 3.12。
Ollama 说明: 启动 PrivacyVeil 前,Ollama 必须在后台运行。最简便的方式是安装后打开 Ollama 应用程序,它会以菜单栏图标的形式常驻后台。
`install.sh` 脚本将自动处理所有步骤:Python 环境、依赖安装、Ollama 检测、模型下载及服务启动。

```bash
chmod +x install.sh
./install.sh
```

浏览器将自动打开 http://localhost:8000。
如果您希望自行控制每个步骤,请按以下顺序操作。
1. 克隆仓库
```bash
git clone https://github.com/johnwoth/privacyveil.git
cd privacyveil
```

2. 创建 Python 虚拟环境

```bash
python3.11 -m venv venv
source venv/bin/activate
```

3. 安装 Python 依赖

```bash
pip install -r requirements.txt
```

安装的包:`fastapi`、`uvicorn[standard]`、`httpx`、`pydantic`、`python-multipart`。

4. 安装 Ollama

从 ollama.com/download 下载并运行安装程序。安装完成后,打开 Ollama 应用,它将出现在菜单栏中。

5. 下载 AI 模型

```bash
ollama pull phi3
```

约下载 2.3 GB,仅需执行一次。

6. 启动 PrivacyVeil

```bash
source venv/bin/activate
uvicorn main:app --port 8000
```

在浏览器中打开 http://localhost:8000。
PrivacyVeil 并行运行两个检测层:
**第一层 —— 正则规则(始终启用)** 通过模式匹配即时识别结构化个人信息:电子邮件地址、电话号码、信用卡号(Luhn 校验)、社会保险号(SSN)、API 密钥、Bearer 令牌、GitHub Token、Google API 密钥、IP 地址、GPS 坐标,以及包含凭证的 URL。
**第二层 —— 本地 AI(可选,需要 Ollama)** 本地运行的语言模型识别正则规则无法捕获的上下文相关实体:人名、公司名称、内部项目名称、医疗诊断、药物名称、自然语言描述的地址等。AI 层仅处理正则层尚未覆盖的内容。
如果 Ollama 离线或未安装,第一层仍会正常运行,应用不会中断,界面将显示当前所处模式。
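第一层中"信用卡号(Luhn 校验)"这类规则,大致可以用几行 Python 勾勒出来(仅为示意,正则与函数名均非项目源码):

```python
import re

# 疑似卡号:13~19 位数字,允许空格或连字符分隔(示意用正则)
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")

def luhn_valid(number: str) -> bool:
    """Luhn 校验:从右往左,每隔一位乘 2,大于 9 则减 9,总和能被 10 整除即有效。"""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_card_numbers(text: str) -> list[str]:
    """先用正则粗筛,再用 Luhn 过滤掉碰巧长得像卡号的数字串。"""
    return [
        m.group()
        for m in CARD_RE.finditer(text)
        if luhn_valid(re.sub(r"[ -]", "", m.group()))
    ]
```

正则负责"长得像",Luhn 负责"算得对",两步结合可以显著降低误报。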
- 将文本粘贴到输入框。
- 点击**开始脱敏**,脱敏结果将显示在**脱敏结果**框中。
- 复制脱敏结果,粘贴到 ChatGPT、Claude、Gemini 或其他任意 AI 服务中。
- 获得 AI 回复后,将回复粘贴回输入框。
- 点击**还原答复**,PrivacyVeil 将把占位符替换回原始内容。
- 点击**清除**以开始新会话,同时删除会话映射文件。
模式说明
| 模式 | 占位符格式 | 会话文件 | 适用场景 |
|---|---|---|---|
| 可还原(默认) | `[PERSON_1]`、`[EMAIL_2]` | 保存至 `sessions/` | 需要还原 AI 回复内容时 |
| 不可还原 | `[REDACTED_PERSON]` | 不保存 | 永久脱敏,无需还原 |
手动添加脱敏词
如果 AI 层漏检了某个词,点击脱敏记录面板中的 + 添加脱敏词,输入原始词语,系统将自动分配占位符。
| 方法 | 路径 | 说明 |
|---|---|---|
| `POST` | `/mask` | 脱敏文本。请求体:`{"text": "…", "mode": "reversible", "session_id": null, "model": null}` |
| `POST` | `/unmask` | 还原脱敏文本。请求体:`{"masked_text": "…", "session_id": "…"}` |
| `GET` | `/session/{id}` | 查看指定会话的占位符 → 原始内容映射 |
| `DELETE` | `/session/{id}` | 删除指定会话的映射 |
| `GET` | `/health` | Ollama 可达状态及可用模型列表 |
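基于上表的请求体,可以直接从脚本调用这两个接口。下面用标准库 `urllib` 演示(项目本身依赖 httpx,此处为保持示例自包含而改用标准库;响应中的字段名为推测,请以实际返回为准):

```python
import json
import urllib.request

BASE = "http://localhost:8000"  # PrivacyVeil 默认监听地址

def mask_payload(text: str, mode: str = "reversible") -> dict:
    """构造 POST /mask 的请求体,字段与上表一一对应。"""
    return {"text": text, "mode": mode, "session_id": None, "model": None}

def post_json(path: str, payload: dict) -> dict:
    """向本机 PrivacyVeil 发送 JSON POST 请求并解析返回的 JSON。"""
    req = urllib.request.Request(
        BASE + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# 完整往返(需服务已启动;out 中的字段名为推测):
# out = post_json("/mask", mask_payload("张伟的邮箱是 zw@example.com"))
# back = post_json("/unmask", {"masked_text": out["masked_text"],
#                              "session_id": out["session_id"]})
```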
- 无出站网络请求,仅与 `localhost:11434`(本地 Ollama 服务器)通信。
- 会话文件存储在 `./sessions/` 目录,内含占位符到原始内容的映射。使用完毕后请删除,或在界面中点击**清除**。
- 日志中不包含任何原始敏感内容,仅记录占位符标签。
- 若 Ollama 离线,正则层仍独立运行,应用不会中断,也不会将数据发送至其他地方作为回退。
编辑 `config.py` 修改默认值:

| 变量 | 默认值 | 说明 |
|---|---|---|
| `OLLAMA_URL` | `http://localhost:11434` | Ollama 服务器地址 |
| `OLLAMA_MODEL` | `phi3` | 未在界面选择模型时的默认回退模型 |
| `APP_PORT` | `8000` | 服务监听端口 |
| `SESSION_DIR` | `./sessions` | 会话映射文件存储目录 |
| `OLLAMA_TIMEOUT` | `30.0` | AI 层超时秒数 |
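按照上表,`config.py` 大致形如下面的示意(字段取自上表,并非项目源码):

```python
# config.py 示意:各默认值与上表一一对应
OLLAMA_URL = "http://localhost:11434"   # Ollama 服务器地址
OLLAMA_MODEL = "phi3"                   # 未在界面选择模型时的默认回退模型
APP_PORT = 8000                         # 服务监听端口
SESSION_DIR = "./sessions"              # 会话映射文件存储目录
OLLAMA_TIMEOUT = 30.0                   # AI 层超时秒数
```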
Apache License 2.0。详见 LICENSE。
Every day I use ChatGPT, Claude, Gemini, and other powerful online AI models. They are genuinely useful — but I kept running into the same uncomfortable moment: I was about to paste something and realised it contained a real name, a client's company, an internal project code, a colleague's phone number, or a medical detail I had no business sharing with an external service.
The naive fix is find-and-replace in Word or Excel. But that does not work. A modern AI model reads the surrounding context and can easily infer what the placeholder stood for. Replacing "John Smith" with "Person A" does not protect John Smith.
What I needed was a local AI — one that runs entirely on my own machine, never phones home, and is smart enough to understand context — to strip the sensitive parts before the text ever leaves my device. PrivacyVeil is that tool.
PrivacyVeil is a local-first PII masking proxy. You paste text into it, it replaces sensitive entities with tokens ([PERSON_1], [EMAIL_1], etc.), and you paste the sanitised version into any online AI. When the AI replies, you paste the reply back and PrivacyVeil restores the original values — so the answer makes sense in context.
All processing happens on your machine. No data is sent to any cloud service.
English UI
Chinese UI
Before installing PrivacyVeil, ensure your system meets every requirement below.
| Requirement | Minimum | Recommended |
|---|---|---|
| RAM | 4 GB | 8 GB or more |
| Free disk space | 3 GB | 5 GB+ |
| macOS version | macOS 12 Monterey | macOS 13 Ventura or later |
| CPU | Apple Silicon or Intel | Apple Silicon (M1/M2/M3) |
RAM note: The default AI model (`phi3`) requires approximately 2.3 GB of RAM when running. 4 GB total RAM is the hard minimum. If you have 8 GB or more, you can also use `llama3.2:3b` or `mistral` for higher detection quality.
| Software | Required version | How to check | How to install |
|---|---|---|---|
| macOS | 12 Monterey or later | Apple menu → About This Mac | System update |
| Python | 3.11 or 3.12 | `python3 --version` | `brew install python@3.11` |
| Homebrew | Any recent version | `brew --version` | brew.sh |
| Ollama | 0.1.48 or later | `ollama --version` | ollama.com/download |
| phi3 model | (any) | `ollama list` | `ollama pull phi3` |
Python version: Python 3.10 and earlier are not supported. Python 3.13 has not been tested. Stick to 3.11 or 3.12.
Ollama note: Ollama must be running in the background before you start PrivacyVeil. The quickest way is to open the Ollama app after installation — it runs as a menu-bar item.
The `install.sh` script handles everything: Python environment, dependencies, Ollama check, model download, and server launch.

```bash
chmod +x install.sh
./install.sh
```

Your browser will open automatically at http://localhost:8000.
Use this if you prefer to control each step yourself.
1. Clone the repository
```bash
git clone https://github.com/johnwoth/privacyveil.git
cd privacyveil
```

2. Create a Python virtual environment

```bash
python3.11 -m venv venv
source venv/bin/activate
```

3. Install Python dependencies

```bash
pip install -r requirements.txt
```

Packages installed: `fastapi`, `uvicorn[standard]`, `httpx`, `pydantic`, `python-multipart`.

4. Install Ollama

Download from ollama.com/download and run the installer. Once installed, open the Ollama app — it will appear in your menu bar.

5. Pull the AI model

```bash
ollama pull phi3
```

This downloads ~2.3 GB. It only needs to be done once.

6. Start PrivacyVeil

```bash
source venv/bin/activate
uvicorn main:app --port 8000
```

Open http://localhost:8000 in your browser.
PrivacyVeil runs two detection layers in parallel:
**Layer 1 — Regex (always on)** Instantly catches structured PII using pattern matching: email addresses, phone numbers, credit card numbers (Luhn-validated), Social Security Numbers, API keys, Bearer tokens, GitHub tokens, Google API keys, IP addresses, GPS coordinates, and URLs containing credentials.
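The "Luhn-validated" part of Layer 1 can be sketched in a few lines of Python (an illustration only; the regex and function names are not the project's actual code):

```python
import re

# Candidate card numbers: 13-19 digits, optionally separated by spaces or hyphens
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")

def luhn_valid(number: str) -> bool:
    """Luhn check: right to left, double every second digit (subtract 9 if > 9);
    the number is valid when the total is divisible by 10."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_card_numbers(text: str) -> list[str]:
    """Regex does the coarse match; Luhn filters out digit runs that merely look like cards."""
    return [
        m.group()
        for m in CARD_RE.finditer(text)
        if luhn_valid(re.sub(r"[ -]", "", m.group()))
    ]
```

The regex decides "looks like a card"; Luhn decides "checksums like a card". Combining the two keeps false positives low.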
**Layer 2 — Local AI (optional, requires Ollama)** A locally-running language model detects context-aware entities that patterns cannot: personal names, company names, internal project names, medical diagnoses, medications, addresses written in natural language, and more. The AI layer only sees text that the regex layer has not already handled.
If Ollama is offline or not installed, Layer 1 still runs and the app never blocks. The UI shows which mode is active.
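The Ollama check behind this mode switch can be reproduced in a few lines. This sketch queries Ollama's `GET /api/tags` endpoint directly; the helper name and return shape are illustrative, not what PrivacyVeil's `/health` actually returns:

```python
import json
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default Ollama address

def ollama_status(base: str = OLLAMA_URL) -> dict:
    """Report whether Ollama is reachable and which models are installed,
    using Ollama's GET /api/tags endpoint."""
    try:
        with urllib.request.urlopen(base + "/api/tags", timeout=2) as resp:
            models = [m["name"] for m in json.load(resp).get("models", [])]
        return {"ollama": "up", "models": models}
    except (urllib.error.URLError, OSError, json.JSONDecodeError):
        return {"ollama": "down", "models": []}
```

Because the failure case returns a normal value instead of raising, a caller can fall back to regex-only mode without any special error handling.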
- Paste your text into the **Input** box.
- Click **Mask Text**. The sanitised version appears in **Masked Output**.
- Copy the masked output and paste it into ChatGPT, Claude, Gemini, or any other AI service.
- When the AI replies, paste its response back into the **Input** box.
- Click **Restore Answer**. PrivacyVeil swaps the tokens back to the original values.
- Click **Clear** to start a new session and delete the session mapping file.
Modes
| Mode | Token format | Session file | Use when |
|---|---|---|---|
| Reversible (default) | `[PERSON_1]`, `[EMAIL_2]` | Saved to `sessions/` | You need to restore the AI's answer |
| Irreversible | `[REDACTED_PERSON]` | Not saved | Permanent redaction; no restoration needed |
Add word manually
If the detection layers miss a word, click **+ Add word to mask** in the masked-items panel. Type the original word and a token will be assigned automatically.
| Method | Path | Description |
|---|---|---|
| `POST` | `/mask` | Mask text. Body: `{"text": "…", "mode": "reversible", "session_id": null, "model": null}` |
| `POST` | `/unmask` | Restore masked text. Body: `{"masked_text": "…", "session_id": "…"}` |
| `GET` | `/session/{id}` | View the token → original mapping for a session |
| `DELETE` | `/session/{id}` | Delete a session's mapping |
| `GET` | `/health` | Ollama reachability status and list of available models |
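The request bodies above can be driven from a script. This sketch uses only the documented fields, built with the standard library so it is self-contained (the project itself uses httpx); the response field names `masked_text` and `session_id` are assumptions, so check the actual JSON you get back:

```python
import json
import urllib.request

BASE = "http://localhost:8000"  # default PrivacyVeil address

def mask_payload(text: str, mode: str = "reversible") -> dict:
    # Field-for-field the body documented for POST /mask
    return {"text": text, "mode": mode, "session_id": None, "model": None}

def unmask_payload(masked_text: str, session_id: str) -> dict:
    # Field-for-field the body documented for POST /unmask
    return {"masked_text": masked_text, "session_id": session_id}

def post_json(path: str, payload: dict) -> dict:
    """POST a JSON body to the local PrivacyVeil server and parse the JSON reply."""
    req = urllib.request.Request(
        BASE + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Full round trip (server must be running; response field names are assumptions):
# out = post_json("/mask", mask_payload("Email John at john@example.com"))
# back = post_json("/unmask", unmask_payload(out["masked_text"], out["session_id"]))
```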
- No outbound network calls except to `localhost:11434` (your local Ollama server).
- Session files in `./sessions/` store the token-to-original mapping. Delete them when you are done, or click **Clear** in the UI.
- Logs never contain the original sensitive values — only token labels.
- If Ollama is offline, the regex layer runs standalone. The app never blocks or sends data elsewhere as a fallback.
Edit `config.py` to change defaults:

| Variable | Default | Description |
|---|---|---|
| `OLLAMA_URL` | `http://localhost:11434` | Ollama server address |
| `OLLAMA_MODEL` | `phi3` | Fallback model when none is selected in the UI |
| `APP_PORT` | `8000` | Port the server listens on |
| `SESSION_DIR` | `./sessions` | Directory for session mapping files |
| `OLLAMA_TIMEOUT` | `30.0` | Seconds before the AI layer times out |
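Given the table above, `config.py` presumably looks roughly like this sketch (values taken from the table; not the actual file):

```python
# config.py sketch: defaults matching the table above
OLLAMA_URL = "http://localhost:11434"   # Ollama server address
OLLAMA_MODEL = "phi3"                   # fallback model when none is selected in the UI
APP_PORT = 8000                         # port the server listens on
SESSION_DIR = "./sessions"              # directory for session mapping files
OLLAMA_TIMEOUT = 30.0                   # seconds before the AI layer times out
```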
Apache License 2.0. See LICENSE.

