[DLP] Implemented deidentifying and reidentifying of table using fpe #10234
Merged
Avani-Thakker merged 16 commits into GoogleCloudPlatform:main from Avani-Thakker-Crest:dlp_deidentify_reidentify_table_fpe on Jul 10, 2023
Changes from 13 commits
Commits
a192093 Implemented deid table with fpe with test cases. (Avani-Thakker-Crest)
c9c5429 resolved conflicts (Avani-Thakker-Crest)
00b7c0a Merge branch 'main' into dlp_deidentify_reidentify_table_fpe (Avani-Thakker-Crest)
f3a1355 Implemented deid and reid of table field using FPE (Avani-Thakker-Crest)
dd63995 Merge branch 'main' into dlp_deidentify_reidentify_table_fpe (Avani-Thakker-Crest)
969446e Refactored as per the review comments (Avani-Thakker-Crest)
e563ee3 Unique employee ids (Avani-Thakker-Crest)
1a92751 Merge branch 'main' into dlp_deidentify_reidentify_table_fpe (Avani-Thakker-Crest)
90c8f96 removed unused imports (Avani-Thakker-Crest)
8642fb3 Correct calling parser as per the parameter changes in the sample code (Avani-Thakker-Crest)
48ebd3f Passing bytes as input to the sample deid_table_fpe (Avani-Thakker-Crest)
174acde Passing bytes as input to the sample reid_table_fpe (Avani-Thakker-Crest)
fa65322 Merge branch 'main' into dlp_deidentify_reidentify_table_fpe (Avani-Thakker-Crest)
5d15876 Created separate file for deid and reid table fpe samples . Removed t… (Avani-Thakker-Crest)
9fc025c Created separate test file for deid and reid table fpe samples . Remo… (Avani-Thakker-Crest)
4deab3b Removed unused imports (Avani-Thakker-Crest)
@@ -11,7 +11,7 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import base64
import os
import shutil
import tempfile

@@ -595,3 +595,54 @@ def test_deidentify_table_with_multiple_crypto_hash(
    assert 'string_value: "abbyabernathy1"' not in out
    assert "my userid is abbyabernathy1" in out
    assert "[email protected]" not in out


def test_deidentify_and_reidentify_table_with_fpe(capsys: pytest.CaptureFixture) -> None:
    table_data = {
        "header": ["employee_id", "date", "compensation"],
        "rows": [
            ["11111", "2015", "$10"],
            ["22222", "2016", "$20"],
            ["33333", "2016", "$15"],
        ]
    }

    deid.deidentify_table_with_fpe(
        GCLOUD_PROJECT,
        table_data["header"],
        table_data["rows"],
        ["employee_id"],
        alphabet='NUMERIC',
        wrapped_key=base64.b64decode(WRAPPED_KEY),
        key_name=KEY_NAME,
    )

    out, _ = capsys.readouterr()
    assert "11111" not in out
    assert "22222" not in out

    response = out.split(":")[1:]

    deid_col_id = response.index(' "employee_id"\n}\nheaders {\n name')

    total_columns = len(table_data['header'])
    # Number of data rows (not the length of the first row).
    total_rows = len(table_data['rows'])

    deid_emp_ids = [response[i].split("\n")[0][2:-1] for i in
                    range(deid_col_id + total_columns, len(response), total_columns)]

    for i in range(total_rows):
        table_data['rows'][i][deid_col_id - 1] = deid_emp_ids[i]

    deid.reidentify_table_with_fpe(
        GCLOUD_PROJECT,
        table_data["header"],
        table_data["rows"],
        ["employee_id"],
        alphabet='NUMERIC',
        wrapped_key=base64.b64decode(WRAPPED_KEY),
        key_name=KEY_NAME,
    )

    out, _ = capsys.readouterr()
    assert "11111" in out
    assert "22222" in out
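
The test above recovers the de-identified employee IDs by splitting the captured output on ":" and stepping through the pieces by column count. A possibly less fragile alternative is sketched below. It assumes the sample prints the table in protobuf text format, with each cell appearing as `string_value: "..."` in row-major order (the `string_value` form appears in the assertions of the preceding test; the ordering is an assumption). The helper name `extract_column`, the `SAMPLE_OUT` text, and the de-identified values in it are hypothetical and only for illustration; they are not part of this PR.

```python
import re
from typing import List


def extract_column(out: str, num_columns: int, col_index: int) -> List[str]:
    """Return the values of one column from text-format table output.

    Assumes every cell is printed as `string_value: "<value>"` and that
    cells appear row by row, so column `col_index` is every
    `num_columns`-th match starting at `col_index`.
    """
    values = re.findall(r'string_value: "([^"]*)"', out)
    return values[col_index::num_columns]


# Hypothetical captured output; the employee IDs are made-up placeholders.
SAMPLE_OUT = """Table after de-identification: headers {
  name: "employee_id"
}
headers {
  name: "date"
}
headers {
  name: "compensation"
}
rows {
  values {
    string_value: "90511"
  }
  values {
    string_value: "2015"
  }
  values {
    string_value: "$10"
  }
}
rows {
  values {
    string_value: "57392"
  }
  values {
    string_value: "2016"
  }
  values {
    string_value: "$20"
  }
}
"""

deid_emp_ids = extract_column(SAMPLE_OUT, num_columns=3, col_index=0)
```

In the test, `out` from `capsys.readouterr()` would take the place of `SAMPLE_OUT`, and the header names are skipped automatically because they are printed as `name:` rather than `string_value:`.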