{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ ] } , { "cell_type": "code", "metadata": { "dotnet_interactive": { "language": "fsharp" }, "polyglot_notebook": { "kernelName": "fsharp" } }, "execution_count": null, "outputs": [], "source": [ "#r \"nuget: FSharp.Data,8.1.8\"\n", "\n", "Formatter.SetPreferredMimeTypesFor(typeof\u003cobj\u003e, \"text/plain\")\n", "Formatter.Register(fun (x: obj) (writer: TextWriter) -\u003e fprintfn writer \"%120A\" x)\n" ] } , { "cell_type": "markdown", "metadata": {}, "source": [ "# Anonymizing JSON\n", "\n", "[![Binder](../img/badge-binder.svg)](https://mybinder.org/v2/gh/diffsharp/diffsharp.github.io/master?filepath=tutorials/JsonAnonymizer.ipynb)\u0026emsp;\n", "[![Script](../img/badge-script.svg)](https://fsprojects.github.io/FSharp.Data//tutorials/JsonAnonymizer.fsx)\u0026emsp;\n", "[![Notebook](../img/badge-notebook.svg)](https://fsprojects.github.io/FSharp.Data//tutorials/JsonAnonymizer.ipynb)\n", "\n", "This tutorial shows how to implement an anonymizer for a JSON document (represented using\n", "the [JsonValue](https://fsprojects.github.io/FSharp.Data/reference/fsharp-data-jsonvalue.html) type discussed in the [JSON parser article](JsonValue.html)).\n", "This functionality is not directly available in the FSharp.Data package, but it is\n", "easy to implement by recursively walking over the JSON document.\n", "\n", "If you want to use the JSON anonymizer in your code, you can copy the\n", "[source from GitHub](https://github.com/fsharp/FSharp.Data/blob/master/docs/content/tutorials/JsonAnonymizer.fsx) and include it in your project. 
If you use these\n", "functions often and would like to see them in the FSharp.Data package, please submit\n", "a [feature request](https://github.com/fsharp/FSharp.Data/issues).\n", "\n", "*DISCLAIMER*: Don\u0027t use this for sensitive data, as it\u0027s just a sample.\n", "\n" ] } , 
{ "cell_type": "code", "metadata": { "dotnet_interactive": { "language": "fsharp" }, "polyglot_notebook": { "kernelName": "fsharp" } }, "execution_count": 2, "outputs": [], "source": [ "open System\n", "open System.Globalization\n", "open FSharp.Data\n", "\n", "type JsonAnonymizer(?propertiesToSkip, ?valuesToSkip) =\n", "\n", "    let propertiesToSkip = Set.ofList (defaultArg propertiesToSkip [])\n", "    let valuesToSkip = Set.ofList (defaultArg valuesToSkip [])\n", "\n", "    let rng = Random()\n", "\n", "    let digits = [| \u00270\u0027 .. \u00279\u0027 |]\n", "    let lowerLetters = [| \u0027a\u0027 .. \u0027z\u0027 |]\n", "    let upperLetters = [| \u0027A\u0027 .. \u0027Z\u0027 |]\n", "\n", "    let getRandomChar (c: char) =\n", "        if Char.IsDigit c then\n", "            digits.[rng.Next(10)]\n", "        elif Char.IsLetter c then\n", "            if Char.IsLower c then\n", "                lowerLetters.[rng.Next(26)]\n", "            else\n", "                upperLetters.[rng.Next(26)]\n", "        else\n", "            c\n", "\n", "    let randomize (str: string) =\n", "        String(str.ToCharArray() |\u003e Array.map getRandomChar)\n", "\n", "    let isType testType typ =\n", "        match typ with\n", "        | Runtime.StructuralTypes.InferedType.Primitive(typ, _, _, _) -\u003e typ = testType\n", "        | _ -\u003e false\n", "\n", "    let rec anonymize json =\n", "        match json with\n", "        | JsonValue.String s when valuesToSkip.Contains s -\u003e json\n", "        | JsonValue.String s -\u003e\n", "            let typ =\n", "                Runtime.StructuralInference.inferPrimitiveType\n", "                    Runtime.StructuralInference.defaultUnitsOfMeasureProvider\n", "                    Runtime.StructuralInference.InferenceMode\u0027.ValuesOnly\n", "                    CultureInfo.InvariantCulture\n", "                    s\n", "                    None\n", "\n", "            (if typ |\u003e isType typeof\u003cGuid\u003e then\n", "                 Guid.NewGuid().ToString()\n", "             elif\n", "                 typ |\u003e isType typeof\u003cRuntime.StructuralTypes.Bit0\u003e\n", "                 || typ |\u003e isType typeof\u003cRuntime.StructuralTypes.Bit1\u003e\n", "             then\n", "                 s\n", "             elif typ |\u003e isType typeof\u003cDateTime\u003e then\n", "                 s\n", "             else\n", "                 let prefix, s =\n", "                     if s.StartsWith \"http://\" then\n", "                         \"http://\", s.Substring(\"http://\".Length)\n", "                     elif s.StartsWith \"https://\" then\n", "                         \"https://\", s.Substring(\"https://\".Length)\n", "                     else\n", "                         \"\", s\n", "\n", "                 prefix + randomize s)\n", "            |\u003e JsonValue.String\n", "        | JsonValue.Number d -\u003e\n", "            let typ =\n", "                Runtime.StructuralInference.inferPrimitiveType\n", "                    Runtime.StructuralInference.defaultUnitsOfMeasureProvider\n", "                    Runtime.StructuralInference.InferenceMode\u0027.ValuesOnly\n", "                    CultureInfo.InvariantCulture\n", "                    (d.ToString())\n", "                    None\n", "\n", "            if\n", "                typ |\u003e isType typeof\u003cRuntime.StructuralTypes.Bit0\u003e\n", "                || typ |\u003e isType typeof\u003cRuntime.StructuralTypes.Bit1\u003e\n", "            then\n", "                json\n", "            else\n", "                d.ToString() |\u003e randomize |\u003e Decimal.Parse |\u003e JsonValue.Number\n", "        | JsonValue.Float f -\u003e f.ToString() |\u003e randomize |\u003e Double.Parse |\u003e JsonValue.Float\n", "        | JsonValue.Boolean _\n", "        | JsonValue.Null -\u003e json\n", "        | JsonValue.Record props -\u003e\n", "            props\n", "            |\u003e Array.map (fun (key, value) -\u003e\n", "                let newValue =\n", "                    if propertiesToSkip.Contains key then\n", "                        value\n", "                    else\n", "                        anonymize value\n", "\n", "                key, newValue)\n", "            |\u003e JsonValue.Record\n", "        | JsonValue.Array array -\u003e array |\u003e Array.map anonymize |\u003e JsonValue.Array\n", "\n", "    member _.Anonymize json = anonymize json\n", "\n", "let json = JsonValue.Load(__SOURCE_DIRECTORY__ + \"../../data/TwitterStream.json\")\n", "\n", "printfn \"%O\" json\n", "\n", "let anonymizedJson = (JsonAnonymizer [ \"lang\" ]).Anonymize json\n", "printfn \"%O\" anonymizedJson\n" ] } , 
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Related articles\n", "\n", "* API Reference: [JsonValue](https://fsprojects.github.io/FSharp.Data/reference/fsharp-data-jsonvalue.html)\n", "\n", "* [JSON Parser](../library/JsonValue.html) - a tutorial that introduces\n", "[JsonValue](https://fsprojects.github.io/FSharp.Data/reference/fsharp-data-jsonvalue.html) for working with JSON values dynamically.\n", "\n", "* [JSON Type Provider](../library/JsonProvider.html) - discusses the F# type provider\n", "that provides type-safe access to JSON data.\n", "\n" ] } ], "metadata": { "kernelspec": { "display_name": ".NET (F#)", "language": "F#", "name": ".net-fsharp" }, "language_info": { "file_extension": ".fs", "mimetype": "text/x-fsharp", "name": "polyglot-notebook", "pygments_lexer": "fsharp" }, "polyglot_notebook": { "kernelInfo": { "defaultKernelName": "fsharp", "items": [ { "aliases": [], "languageName": "fsharp", "name": "fsharp" } ] } } }, "nbformat": 4, "nbformat_minor": 2 }