Large files can't be anonymized

description

Hi Frank,

Using Anonimize.exe (for XML files), I am not able to anonymize XML files larger than 300 MB.
Can you help me solve this issue?

Best,
Evelien

comments

FrankvdnThillart wrote Dec 18, 2015 at 8:45 AM

Hi Evelien,

I have the following suggestions:
  • Try anonymizing with only one anonymization rule at a time in the file AnonymizeCSV.txt to see which rule is actually the showstopper.
  • Remove any date anonymizations. This anonymization remembers which date was replaced by which random value, and that cache may have become so large that it is swapped to disk, which slows everything down.
  • If the above suggestions do not work, I'll need a test file so I can test it myself, but I expect a date anonymization to be the cause.
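
To illustrate why a date cache of this kind can grow with file size, here is a minimal Python sketch. It is not the tool's actual code: the class `DateAnonymizer`, its parameters, and the random-shift strategy are all hypothetical, chosen only to show the "remember every replacement" pattern described above.

```python
import random
from datetime import date, timedelta


class DateAnonymizer:
    """Hypothetical sketch: replaces each distinct date with a random one
    and remembers the mapping so repeated dates stay consistent."""

    def __init__(self, seed=None, max_shift_days=365):
        self._rng = random.Random(seed)
        self._max_shift = max_shift_days
        self._cache = {}  # original date -> replacement date

    def anonymize(self, original: date) -> date:
        # A new cache entry is created for every distinct input value,
        # so memory use grows with the number of distinct dates seen.
        if original not in self._cache:
            shift = self._rng.randint(-self._max_shift, self._max_shift)
            self._cache[original] = original + timedelta(days=shift)
        return self._cache[original]

    def cache_size(self) -> int:
        return len(self._cache)


if __name__ == "__main__":
    anonymizer = DateAnonymizer(seed=42)
    first = anonymizer.anonymize(date(2015, 12, 18))
    second = anonymizer.anonymize(date(2015, 12, 18))
    print(first == second)          # same input, same replacement
    print(anonymizer.cache_size())  # one distinct date seen so far
```

With plain dates the cache stays small, but if the anonymized values are full timestamps with many distinct values, a mapping like this could hold one entry per value in a large file.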
Kind regards,

Frank

PS:
If you send me a message via the CodePlex system, you have to allow people to send you messages in your profile settings if you want an answer.

Evelien2308 wrote Dec 29, 2015 at 9:19 AM

Hi Frank,

Unfortunately, your suggestions did not work. No date field was part of the anonymization rules, and excluding the rule that caused the error did not help either.

I am not able to send you a test file, since it is production data that has not been depersonalized. I attached an already anonymized file; however, it is about one third the size of the files I experience problems with.

Kind regards,
Evelien
PS: thanks for the tip :)

FrankvdnThillart wrote Dec 30, 2015 at 9:32 PM

Evelien, please call me.