Why blurring sensitive information is a bad idea (dheera.net)
100 points by soundsop on Nov 25, 2010 | 41 comments


WHAT... why would you completely black out the number, when you could instead use random coloured squares that look like blurring, so someone can go through all the effort of decoding your white noise and think in the end they have your number... when they don't ;)


I don't think he is stressing how easy this is with credit card numbers. The sample space he suggests generating is far too large... You can usually identify the first several digits simply from the issuing organization, as they all use standardized prefixes, and the remaining digits must pass a checksum algorithm. So generating a bunch of valid CC numbers is quite trivial. Matching expiration dates and CCV numbers to them... different story.
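For reference, the checksum used on card numbers is the Luhn algorithm. A minimal sketch in Python follows; the function names are my own, and 4111111111111111 is just a widely used test number, not anything from the article:

```python
def luhn_checksum(digits: str) -> int:
    """Return the Luhn sum modulo 10; 0 means the number passes the check."""
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as adding the two digits of the product
        total += d
    return total % 10


def is_valid(number: str) -> bool:
    """True if the string is all digits and passes the Luhn check."""
    return number.isdigit() and luhn_checksum(number) == 0


def check_digit(partial: str) -> str:
    """Compute the final digit that makes partial + digit pass the check."""
    return str((10 - luhn_checksum(partial + "0")) % 10)
```

Since one digit in ten passes, the checksum only shrinks the search space by a factor of 10; it does nothing to pin down the expiration date or CCV.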

But I wonder what the limits of this attack's effectiveness are. I usually swirl around randomly with a smear tool to blur things out...


Bank of America uses a horrible method for generating debit card numbers. It's a standard prefix + account number + sequence number + check digit. If you have stolen someone's BofA debit card number then you can easily guess the replacement card's number (just increment the sequence number and recalculate the check digit). From there you just need to guess the expiration date (a comparatively trivial task).


Also they generate your PIN by running your account number through a DES encryption round. This method is known to be vulnerable to attack:

http://www.cl.cam.ac.uk/~mkb23/research/PIN-Cracking.pdf


Chase PINs are 3210 by default - guess how many people still have that PIN.


Either you're skipping over a lot of information in the process of how that number's generated, or that's not how they do it anymore (and not how they've done it for at least the last couple of years).


I got a replacement BofA debit card a few months ago, and this is exactly how they did it — only the last two digits are different.

This may be only in Washington and Idaho though, as there are several different legacy BofA backends as a result of M&A bullshit.


But that's for debit cards - I think most banks include the account number in a debit card number. You would still need the CCV number from the back of the card for the attack to work.


I haven't noticed any other bank which had the same practice.

Also, not all online merchants use CCV. Also consider the risk of creating fake physical CCs, no address or CCV necessary.


Just checked: my credit union debit card number includes my account number.


CCVs aren't always required for transactions.


CCV proves the cardholder was present at the time of the transaction. Online merchants are never allowed to store CCV numbers.


edit: CCV proves that you at one time had access to the CCV number.

Online merchants are supposed to comply with PCI-DSS - not store your CCV ever, never transmit your number unencrypted, never store cardholder information unencrypted, plus tons of management controls and audit controls over the same.

In practice, let's just say lazy programming is everywhere. I've seen many people who handle online transactions and violate PCI-DSS to some degree, including storing CCV numbers.


They can ask for them though.


...indeed, and they often do.


I'd expect random swirls to be reliable, or any obfuscation that introduces a reasonable amount of entropy. This whole attack relies on an almost-exact replication of the original blurred image. If you do something a computer can't easily reproduce over and over again, or something that looks the same no matter what the obfuscated content is (like blacking out), this attack cannot work.


Actually swirls are very reversible - http://news.bbc.co.uk/1/hi/7384834.stm (Paedophile caught from video with swirled obfuscation of his face.)

You're definitely right about adding entropy though, but why bother? Just blacking it out guarantees how much information is available - zero.


That's not a random swirl. Interesting, though.


As seen in the brilliant 2008 Underhanded C Contest, sometimes even masking isn't enough: http://underhanded.xcott.com/?p=12


Also be sure to strip EXIF data since it can contain an original thumbnail. Not all image editors update the thumbnail with changes.


Good point. I usually just display the pic and snap a screenshot, and all the metadata is gone. If it is scaled, that further obfuscates blurred things, too.


Actually, when you really need to decode a blurred or mosaiced image, you can use even more tricks, especially when the images are screenshots. You can take a known pattern (digits), blur it with every available option of the most popular image editing software, and then run a massive number of comparisons to see which result matches. It's massively CPU intensive, but I am sure people who need it can do it.
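A toy sketch of that compare-everything approach, in plain Python. The 3x5 bitmap "font", the one-pixel glyph spacing and the function names are all assumptions for illustration; a real attempt would have to reproduce the actual font, layout and editor filter:

```python
from itertools import product

# Toy 3x5 bitmap glyphs (hypothetical font, for illustration only).
GLYPHS = {
    "0": ("111", "101", "101", "101", "111"),
    "1": ("010", "110", "010", "010", "111"),
    "2": ("111", "001", "111", "100", "111"),
    "3": ("111", "001", "111", "001", "111"),
    "4": ("101", "101", "111", "001", "001"),
    "5": ("111", "100", "111", "001", "111"),
    "6": ("111", "100", "111", "101", "111"),
    "7": ("111", "001", "001", "001", "001"),
    "8": ("111", "101", "111", "101", "111"),
    "9": ("111", "101", "111", "001", "001"),
}


def render(text):
    """Render digits as rows of 0/255 pixels, one blank column after each glyph."""
    rows = [[] for _ in range(5)]
    for ch in text:
        for y, line in enumerate(GLYPHS[ch]):
            rows[y].extend(255 * int(bit) for bit in line)
            rows[y].append(0)
    return rows


def mosaic(img, tile):
    """Reduce each tile x tile block to its mean value (one value per block)."""
    height, width = len(img), len(img[0])
    out = []
    for y0 in range(0, height, tile):
        out_row = []
        for x0 in range(0, width, tile):
            block = [img[y][x]
                     for y in range(y0, min(y0 + tile, height))
                     for x in range(x0, min(x0 + tile, width))]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


def distance(a, b):
    """Sum of squared differences between two mosaics of equal shape."""
    return sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def recover(target, length, tile):
    """Mosaic every digit string of the given length; return the closest match."""
    candidates = ("".join(c) for c in product("0123456789", repeat=length))
    return min(candidates, key=lambda s: distance(mosaic(render(s), tile), target))
```

With this toy font, `recover(mosaic(render("4071"), 2), length=4, tile=2)` gives back "4071". The search is exponential in the number of hidden digits, which is where the CPU cost comes from, and it only works because the mosaic is deterministic.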

Blacking out the section entirely is the only proper way, since you really want to be sure you are destroying the information in the image, not just dispersing it.

Even then, if you are removing a single digit it can be partially recovered by observing kerning statistics, etc.


What? Are you talking about proportional fonts?


I'm wondering if the copyright and year are accurate...I recall reading something like this a few years ago, complete with pictures of sample checks.


It looks like this is the original source, but it's from 2007, not 2010. This 2007 Slashdot article links to the same URL: http://it.slashdot.org/article.pl?sid=07/01/07/1352242. Maybe the current year gets auto-added by whatever CMS he's using? Either that or it's been updated.

Incidentally, while I was looking for that link, I found an implementation in the form of a Photoshop filter: http://tlrobinson.net/blog/2008/10/08/recovering-censored-te...


This attack is, for what it's worth, at least 4 years old (probably older).


According to SearchYC this has been posted twice

3 years ago (3 comments) http://news.ycombinator.com/item?id=79405

9 months ago (no comments) http://news.ycombinator.com/item?id=1115919


The attack came out at least while I was in college, so 1998-2002.


"Identify the exact size and offset, in pixels, of the mosaic tiles used to blur the original image (easy)"

I don't see that this is easy. Surely you have to test a number of offsets and sizes of text? And without knowing the digits, this is not going to be totally accurate.


You don't even have to color over, or blur, or do any of that hard stuff. Just select the region, and press "CTRL-X", save and quit. No reason to do it any other way.


People blur to maintain the general look of the original image. Having black boxes everywhere is jarring.

Honestly, I don't think the lesson has to be "don't blur"... it can just be "blur enough". If I blur something out, I just use a radius big enough to erase all of the information.


> People blur to maintain the general look of the original image. Having black boxes everywhere is jarring.

In that case, why not erase the numbers and replace them with random digits?


Effort? Way more work than putting in a black box.


Deconvolution is quick with Fourier transforms. In infinite precision, convolution by any kernel whose Fourier transform has no zeros (which includes Gaussian blurs) is invertible. In practice, quantization and stopping the convolution at the edge of the blur region complicate things. The blur itself removes no information; if you want to convince yourself it's safe, you need to argue from numerical analysis.
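A minimal 1-D sketch of the idea, in plain Python. It assumes a circular (wrap-around) blur with a known kernel and no quantization, which is the idealized case; the kernel values and function names are made up for illustration, and a naive O(n^2) DFT stands in for a real FFT:

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform; the inverse is scaled by 1/n."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out


def circular_blur(signal, kernel):
    """Circular convolution: multiply the two spectra, then transform back."""
    spec = [s * h for s, h in zip(dft(signal), dft(kernel))]
    return [v.real for v in dft(spec, inverse=True)]


def deblur(blurred, kernel):
    """Undo the blur by dividing spectra; needs a kernel spectrum with no zeros."""
    spec = [b / h for b, h in zip(dft(blurred), dft(kernel))]
    return [v.real for v in dft(spec, inverse=True)]


# Illustrative data: the kernel's spectrum is 0.6 + 0.4*cos(theta), never zero.
signal = [0, 0, 255, 255, 0, 255, 0, 0]
kernel = [0.6, 0.2, 0, 0, 0, 0, 0, 0.2]
blurred = circular_blur(signal, kernel)
restored = deblur(blurred, kernel)
```

Here `restored` matches `signal` up to floating-point error. Rounding `blurred` to 8-bit values first adds an error that the division amplifies wherever the kernel's spectrum is small, which is the numerical-analysis part of the argument.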

Check out "High Quality Motion Deblurring from a Single Image", for some quite impressive photo reconstruction done by actually fitting a spatially varying set of blur kernels to an image.


Here's a photo of my credit card with the number blurred out: http://imgur.com/Q4hqZ

I eagerly await your impending purchases.


So, how much is "enough"?


Here's a tip: if you do blur, don't use mosaic. Use the blur tool.


No, don't use blur. Ever.

Black it out. And I don't mean the stupid PDF trick where they draw a black box over it (but you can still copy/paste the number from underneath). I mean actually black it out. Print it out, draw on it with a marker, and scan it again if you have to make sure.

But I've seen so many blurred out numbers that I could just about figure out with my eyes, let alone a computer program that could decode it algorithmically. And yes, standard (and non-standard, i.e. Photoshop) algorithms are fairly well known and can be tested against known data. Also, people really can tell what font was used. That gives them more than enough information to decode it.

But why are you giving people information in the first place? Never, ever try to distort information you should be destroying.


Usually it's easier to extract exact numbers from a blurred image.


Yep, mosaic (often drastically) reduces the resolution of a portion of the image. The information that's been removed can't easily be recovered.


Unless the blur tool is randomised, it may yield to the same tactic, no?



