 | 2013-05-23 The Ticket |
 | 2013-04-24 Who is leeching my Facebook |
 | 2013-04-15 Time for a compact Coolpix |
 | 2013-04-15 Selling my Peavey amp |
 | 2013-03-28 encfs Helper Revisited |
 | 2013-03-15 c conf 1.17 |
 | 2013-02-25 Flitsers op de Nederlandse wegen |
 | 2013-01-14 OTG Revisited |
 | 2013-01-11 Kalk revisited |
 | 2012-12-12 God is coming |
 | 2012-11-29 Strangely disturbing |
 | 2012-11-27 How to sneeze on a motorbike |
 | 2012-11-23 Een aanslag uit de toekomst |
 | 2012-11-22 More voluntary layoffs |
 | 2012-10-17 Browser fingerprint |
 | 2012-10-15 Site layout revamped |
 | 2012-10-14 Crossroads performing well |
 | 2012-10-08 The saddle seat |
 | 2012-09-30 Sudo Exploit |
 | 2012-09-26 Technomona is live |
 | 2012-09-19 A Perl to HTML Prettyprinter |
 | 2012-09-17 Playing with Moose |
 | 2012-08-22 How do you export photos from Adobe Photoshop Elements |
 | 2012-08-21 Security hole at the Empire State Building |
 | 2012-07-15 Matrix Sunglasses |
 | 2012-07-12 Two Morgans |
 | 2012-07-12 Review of the BMW R1200RT Motorbike |
 | 2012-06-13 Passwords: maak het hackers moeilijk |
 | 2012-06-13 Mijndomein foutje |
 | 2012-06-07 YOLNT |
 | 2012-06-06 Human Origins |
 | 2012-05-23 Me on Facebook |
 | 2012-05-16 Voluntary Layoffs |
 | 2012-05-07 RBS Bike Tour |
 | 2012-04-27 Room with a view |
 | 2012-04-23 Browser cookies and Javascript revisited |
 | 2012-02-17 Schrodingers Cat |
 | 2012-02-14 KPN heeft problemen |
 | 2012-01-12 Memebase forever |
 | 2012-01-11 Strange squares |
 | 2011-12-22 TVV Ondernemingsportaalnl.com zuigt ezel |
 | 2011-12-08 Dilbert vs Skype |
 | 2011-11-29 The uncanny resilience of bulshytt |
 | 2011-11-23 Another silly Trojan attempt |
 | 2011-10-29 ACTA is coming our way |
 | 2011-10-28 Burgernet in the Netherlands |
 | 2011-10-27 Facepalm art |
 | 2011-10-26 Do not drag this image |
 | 2011-10-22 Off The Grid Challenge |
 | 2011-10-12 PI like a boss |
 | 2011-10-07 Once upon a time |
 | 2011-07-13 Dutch eticket system for trains |
 | 2011-07-12 Is Hell exothermic or endothermic |
 | 2011-04-27 Optical Illusions |
 | 2011-04-19 Odd lyrics |
 | 2011-04-16 Band Revival at MON |
 | 2011-03-13 Protests in the Middle East and you |
 | 2011-03-10 Mac OSX Hotkey for locking your system |
 | 2011-02-12 dnspb 0.06 is out |
 | 2011-02-08 Would I buy this fridge |
 | 2011-02-06 InstaYouth |
 | 2011-02-05 The Thinker is back |
 | 2011-01-17 Math challenge |
 | 2011-01-11 Zero tolerance and zero intelligence |
 | 2011-01-05 My interest income in 1991 |
 | 2011-01-01 Your horoscope by Eddie |
 | 2010-12-22 New York City Tours might be half price for you |
 | 2010-12-20 Weather Forecast |
 | 2010-12-14 World Economy Collapse explained in 3 minutes |
 | 2010-12-13 The Salvation Army and its choice of toys |
 | 2010-12-08 Elizabeth thinks highly of me |
 | 2010-12-06 Should I trust my government with my data |
 | 2010-12-05 Announcing dnspb |
 | 2010-12-03 Realistic piechart |
 | 2010-11-26 Crossroads 2.71 is out |
 | 2010-11-24 8 bit Starwars |
 | 2010-11-17 Six to eight black men |
 | 2010-11-16 Canada wants backdoors and data and everything |
 | 2010-11-11 Autumn storm over the Netherlands |
 | 2010-10-08 USA wants backdoors to everything |
 | 2010-10-05 Sudoku solver in Perl |
 | 2010-10-02 Finally wrote up a Syscheck page |
 | 2010-09-28 Neon sign fail |
 | 2010-09-27 The Renault Eco Team |
 | 2010-09-23 Crossroads 2.68 is out |
 | 2010-09-20 How to suppress Flash cookies |
 | 2010-09-15 Meanwhile on Facebook |
 | 2010-09-09 The Yes Men Fix The World |
 | 2010-09-07 ed is not dead |
 | 2010-08-26 Installing Perl modules in a non root environment |
 | 2010-08-22 Magic self leviation |
 | 2010-08-20 Google Chrome does not support offline Gmail |
 | 2010-08-19 The number 48 |
 | 2010-08-12 Welsh trout mini HOWTO |
 | 2010-08-04 Fooling a NetCache proxy into fetching forbidden files |
 | 2010-07-30 The world will end on May 21, 2011 |
 | 2010-07-28 Hiding or showing a textbox with image animation using JQuery |
 | 2010-07-27 Manipulating browser cookies using Javascript |
 | 2010-07-25 Survival of the fittest book |
 | 2010-07-23 Pastafarians in Spain |
 | 2010-07-22 You have two sheep |
 | 2010-07-09 Highway bank fire |
 | 2010-07-08 Setting up a remote git repository |
 | 2010-07-06 Bye bye trusted old Macbook |
 | 2010-06-28 John Cleese on Football |
 | 2010-06-23 ABN Amro and the Pathetic Customer Service Dept. |
 | 2010-06-22 Wally does not like criticism |
 | 2010-06-14 Soccermatch Netherlands vs Denmark |
 | 2010-06-13 Lazy Cat |
 | 2010-06-08 Reading public Buzz using the Google API |
 | 2010-06-07 A Personal Letter from Steve Martin |
 | 2010-06-05 Sushi Saturday |
 | 2010-06-04 Suppressing the Enter key with Javascript |
 | 2010-05-31 Temporal spacial anomaly on the Dutch highway |
 | 2010-05-23 Greenhost will not log your traffic |
 | 2010-05-10 Jarlsberg Webapp Exploits |
 | 2010-05-04 A Thought Experiment |
 | 2010-05-03 SafeEdit information updated |
 | 2010-05-01 Microproxy now supports ftp |
 | 2010-04-30 What could get Data angry |
 | 2010-04-29 Lego Mindstorm solving the Rubik Cube |
 | 2010-04-28 Crossroads 2.65 is out |
 | 2010-04-17 Goggomobil in its natural habitat |
 | 2010-04-14 Bacon Time |
 | 2010-04-11 104 More friends to connect with |
 | 2010-04-10 Bacteria infested radio reporter |
 | 2010-04-07 The Kubat STAR |
 | 2010-03-30 Homework Essay |
 | 2010-03-29 C++ mutexes again |
 | 2010-03-20 Weird Eyechart |
 | 2010-03-15 Microproxy 1.01 |
 | 2010-03-05 Microproxy |
 | 2010-03-03 Sven Kramer and the wrong lane |
 | 2010-02-26 Endearing Babe Magnet |
 | 2010-02-17 Speed of light measured using chocolate and a microwave |
 | 2010-02-17 Never again expires after 65 years |
 | 2010-02-16 encfs on the Mac |
 | 2010-02-15 Hyves.nl and sexual predators |
 | 2010-02-10 Funny textbook |
 | 2010-02-09 DNS failing after sleep wake cycle |
 | 2010-02-06 Blast from the past |
 | 2010-01-28 Simple and straight Perl HTTP::Proxy |
 | 2010-01-15 Avatar the Movie |
 | 2010-01-08 Slightly NSFW Linux Ad |
 | 2010-01-07 WTF |
 | 2010-01-05 Stop Software Patents in the EU |
 | 2009-12-05 HammerServer 1.02 |
 | 2009-11-28 Perls Automagical Autoloading |
 | 2009-10-07 Office Poster |
 | 2009-10-06 The nr 1 Nerdjoke |
 | 2009-10-04 WoW Startscript for my Mac |
 | 2009-09-27 HammerServer section is online |
 | 2009-09-26 The BING HQ |
 | 2009-09-26 Digging a WOW Tunnel |
 | 2009-06-29 Wee Todd |
 | 2009-06-23 The On Off Switch Revisited |
 | 2009-06-22 Meatspace |
 | 2009-05-30 My old houses |
 | 2009-05-11 LOLcats are funny |
 | 2009-05-11 Civic Duty WIN |
 | 2009-05-10 Vote for the baby, Sky Radio promo FAIL |
 | 2009-05-05 My secure data center |
 | 2009-02-15 My Valentine is sending me a dot exe |
 | 2009-02-05 MacPorts trash: .mp_123456 savefiles cleaning |
 | 2009-02-01 Truecrypt 6 on Linux and the ext3 filesystem |
 | 2009-01-28 www versus nl.youtube.com |
 | 2009-01-27 Songsmith and The Police |
 | 2009-01-25 My own Ministery of Silly Walks |
 | 2009-01-09 CoolIris Mini HOWTO |
 | 2008-11-04 UDP and DNS balancing |
 | 2008-11-02 Life in graphs |
 | 2008-11-01 Skeined yet? |
 | 2008-10-30 New Crossroads on the horizon |
 | 2008-10-28 Thread safe or not |
 | 2008-10-15 WOW patch 3 on a case sensitive MacOSX filesystem |
 | 2008-10-15 Surprising C++ optimizations |
 | 2008-10-14 Weird system message |
 | 2008-10-08 Data mining against terrorism does not work |
 | 2008-09-16 Crossroads at the top of Freshmeat.net |
 | 2008-09-09 Stupid spammers at Computable |
 | 2008-09-06 Spam prevention with Postfix and Postgrey |
 | 2008-09-03 The Gnomish Flying Machine |
 | 2008-08-27 Bank customer data on eBay |
 | 2008-08-26 Mutexes in C++ Threads |
 | 2008-08-22 4M dataloss in the UK last year |
 | 2008-08-21 Dropping spam with Postfix and Spamassassin |
 | 2008-08-18 Bayes and the War on Photography |
 | 2008-08-13 Good marital advice |
 | 2008-08-12 Squid proxy for personal usage |
 | 2008-08-11 Posix threads in C++ |
 | 2008-08-09 Crossroads mailing list |
 | 2008-08-08 Crossroads 2.00 is out |
 | 2008-08-01 Fail Pics |
 | 2008-07-14 The Fish Dance |
 | 2008-07-01 Big Bother and Massive Data Storage |
 | 2008-06-30 MMV One of omitted Unix tools |
 | 2008-06-08 Even anonymous breadcrumbs can give you away |
 | 2008-05-29 Crossroads in Argentina |
 | 2008-05-20 The Party at the Company Outing |
 | 2008-05-19 Crossroads 1.80 is out |
 | 2008-05-18 Where does technical innovation really come from |
 | 2008-05-16 Corporate bs generator |
 | 2008-05-15 Even the Vatican has to adapt |
 | 2008-05-12 Big Brother is watching your dog |
 | 2008-05-09 666 all over the place |
 | 2008-04-17 Security and privacy are incompatible |
 | 2008-04-16 The Hallmark E Card |
 | 2008-04-15 Crosroads Solaris port is out |
 | 2008-04-04 Identity theft can cost you dearly |
 | 2008-04-03 Crossroads can already do that |
 | 2008-03-31 A dagerous safari |
 | 2008-03-28 Why some Java J2EE projects are inefficient |
 | 2008-03-26 The Hummingbird |
 | 2008-03-25 The Easter delusion |
 | 2008-03-18 McAfee detects mass hack of 200.000 webpages |
 | 2008-03-17 More predictive statistics |
 | 2008-03-10 Backwards conclusions even on Slashdot |
 | 2008-02-18 A fractal photograph |
 | 2008-02-15 Kaprekar revisited |
 | 2008-02-14 Kaprekar numbers |
 | 2008-02-12 A tale of the criminal ineptitude |
 | 2008-02-10 Irritating Selfregistered users in PHPBB |
 | 2008-02-08 B2B Spam in the Netherlands |
 | 2008-02-06 Surprising iSight Capture |
 | 2008-02-05 Breadcrumbs at WickedLasers.com |
 | 2008-01-29 iSight Capture Utility |
 | 2008-01-28 The Male Brain |
 | 2008-01-26 Searching for the next Uri Geller |
 | 2008-01-24 Opt in for b2b spam |
 | 2008-01-14 Bokito Revisited |
 | 2008-01-13 Top Crossroads User |
 | 2008-01-12 World of Warcraft Dancing |
 | 2008-01-12 Justice dispensed better late than never |
 | 2008-01-11 Jeremy Clarkson and Identity Theft |
 | 2008-01-10 Terrorism in the Netherlands |
 | 2007-12-07 The mind and bodysnatchers are among us |
 | 2007-12-05 Bruce Schneier and Hildo |
 | 2007-12-04 Bye bye, good Christian soul |
 | 2007-12-03 Confusing mail message |
 | 2007-11-30 Medion MD 85276 reviewed |
 | 2007-11-29 Recent cases of data exposure |
 |
2007-11-20 Bayes bites |
|
As I'm re-reading my notes on privacy and huge data collections, a favorite
story springs to mind. It's about Bayesian statistics. The idea for
the text below is from John A. Paulos' book A Mathematician Reads
the Newspaper (I can highly recommend reading this).
Imagine a city of 10 million people, where a brutal crime is
committed by one of the citizens. Fortunately, the killer left their
fingerprints at the murder scene, so the detectives have a perfect
starting point for their investigations.
Another fortunate thing is that the City has collected the
fingerprints of all citizens in a huge database, with the purpose of
speeding up and helping the fight against crime. Great! One might
expect that the murder will be solved swiftly...
So the detectives fire up their systems and let the computer
compare the crime scene prints against the set of the prints of all
citizens. The computers start crunching numbers, everyone holds
their breath, and... finally... after 3 weeks of humming, the computer
finds a match. A suspect is arrested and brought to trial.
The court hears all involved parties, and all subject matter
specialists. Just to be sure of the evidence, the judge asks an
expert: "How good is that fingerprint matching system?" The expert
answers: "Your honor, it's the newest-bestest system one can have, all
singing and dancing. The chance of error is very small indeed: it will
correctly identify a fingerprint in 99.9999% of the cases! There's
a chance of only one in a million that the machine says it's a match
while in fact it isn't."
The jury is convinced. Despite the fact that the suspect
continues to plead not guilty (but hey, that's to be expected), a
guilty verdict is returned. The jurors' motivation is: "The defendant
simply must be the one who did it. Why, the machine is almost
error-proof! The chance of a bad decision is only one in a million.
That's good enough for the court. We, the jury, are convicting the
defendant."
The truth here is that the suspect probably didn't do it. If the
fingerprint matching machine makes an error once in a million
comparisons, then statistically it would spit out about 10 suspects, since
there are 10 million citizens. The chance that any randomly chosen
suspect who is identified by the machine has committed the crime is
therefore roughly 10%. The chance that the John Doe who was arrested and
convicted is actually not guilty is about 90%.
The error that was made by the jurors is a typical one. The "one
in a million" error chance suggests that the fingerprinting method is
pretty accurate. Which it may or may not be - but in any case, that's
not the question here. Bayes described his theorem using the following
phrasing: "What are the odds that X happens, given the fact that Y
is already observed?" The "given that" clause is paramount: the
jurors should have asked themselves, "What are the odds that the
defendant is not guilty, given the fact that his fingerprints
matched?"
The answer to this question is straightforward: we expect
10 people's fingerprints to match, and only one of them is the perp -
so there's a 90% chance that the defendant is not guilty. The
given-that clause will typically decrease the size of the reference
population by selecting out a group with an already known factor.
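The "given that" question can also be worked out directly with Bayes' theorem. Below is a minimal Python sketch, not part of the original story: the function name `posterior_guilty` and its parameters are made up for illustration, and it assumes exactly one culprit in the city, a machine that always matches the real culprit, and a one-in-a-million false match rate per innocent citizen. Because it counts the culprit on top of the roughly 10 expected innocent matches, it lands slightly below the rounded 10% used above, at about 9%.

```python
def posterior_guilty(population, false_match_rate, true_match_rate=1.0):
    """P(guilty | match), assuming exactly one culprit among `population` people.

    All parameter names are illustrative assumptions, not from the source text.
    """
    prior_guilty = 1.0 / population
    prior_innocent = 1.0 - prior_guilty
    # Total probability of a match:
    # P(match) = P(match | guilty) P(guilty) + P(match | innocent) P(innocent)
    p_match = true_match_rate * prior_guilty + false_match_rate * prior_innocent
    # Bayes' theorem: P(guilty | match) = P(match | guilty) P(guilty) / P(match)
    return true_match_rate * prior_guilty / p_match


if __name__ == "__main__":
    p = posterior_guilty(population=10_000_000, false_match_rate=1e-6)
    print(f"P(guilty | match)     = {p:.3f}")
    print(f"P(not guilty | match) = {1 - p:.3f}")
```

Either way, the conclusion is the same: a match alone leaves the defendant far more likely innocent than guilty.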
I always try to illustrate this for myself as a matrix of
categories and (sub)populations. We have two factors here: (a) a
person is guilty or not, and (b) a person's fingerprints match or they
don't. These categories can be illustrated in a simple 2x2 matrix,
with a third row and column for totals:
| | Guilty | Not Guilty | Totals |
| Fingerprints match | 1 | 9 | 10 |
| Fingerprints don't match | 0 | 9.999.990 | 9.999.990 |
| Totals | 1 | 9.999.999 | 10.000.000 |
I think that this representation nicely summarizes all of the
above, though admittedly it takes some time to get used to such
illustrations. The bottom line of this example is that large data
collections (such as a database of fingerprints of all citizens)
may not be useful at all in determining culprits when used as the sole
source of information. On the contrary: "very precise" methods and
large datasets may look good but produce very bad results. Every
method, however precise, has a chance of producing false positives
(see also the Schiphol
entry), and even a small chance of false positives will yield
quite large numbers of distinct cases when the reference population is
large enough.
As a last remark: in case you're wondering how precise fingerprint
comparisons are: current methods are approx. 98.5% accurate or better,
which is way less than the hypothetical 99.9999% used in this
example. That's another reason why I think that Japan's plans to
fingerprint all foreigners make no sense.
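To get a feel for how the number of false matches grows with the size of the database and with the error rate, here is a small Python sketch. It is an illustration only: the function name `expected_false_matches` is made up, and reading "98.5% accurate" as a 1.5% false match rate per comparison is my assumption, since the figure above doesn't say how the accuracy is split between false positives and false negatives.

```python
def expected_false_matches(population, false_match_rate, culprits=1):
    """Expected number of innocent people whose prints are reported as a match."""
    return (population - culprits) * false_match_rate


if __name__ == "__main__":
    # Assumption: "98.5% accurate" is read as a 1.5% false match rate.
    rates = {"hypothetical 99.9999%": 1e-6, "assumed 98.5%": 0.015}
    for label, rate in rates.items():
        for population in (100_000, 10_000_000):
            n = expected_false_matches(population, rate)
            print(f"{label:>22}, {population:>10,} people: ~{n:,.0f} innocent matches")
```

Under that assumption, a database of 10 million people would flag on the order of 150,000 innocent citizens for a single crime scene print.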
 | 2007-11-19 Japan starts fingerprinting foreigners |
 | 2007-11-14 Privacy, Yahoo and the Strange World |
 | 2007-11-14 Privacy, Fall through algorithms, and Securing data |
 | 2007-11-07 European airlines to retain data |
 | 2007-11-03 BloggEd |
 | 2007-10-30 Wilders and Marktplaats.nl |
 | 2007-10-28 The goldplated Mac |
 | 2007-10-26 More morons |
 | 2007-10-26 Dilbert nails it again |
 | 2007-10-23 Rough yet funny |
 | 2007-10-05 Another silly Trojan mail |
 | 2007-10-01 So ugly it is beautiful |
 | 2007-09-28 Here is a nickel kid |
 | 2007-09-23 Spy Shredder |
 | 2007-08-29 Web svn view 1.08 |
 | 2007-08-24 Caught in THE Process |
 | 2007-08-21 Stupid Trojan attack |
 | 2007-08-21 Back in 1994 |
 | 2007-08-20 A girly iPod |
 | 2007-08-17 Crossroads for RDP connections |
 | 2007-08-15 Firewall art |
 | 2007-08-14 jpeginfo |
 | 2007-08-13 Good People |
 | 2007-08-07 The Real Crossroads |
 | 2007-07-30 BBC Documentaries in the Netherlands |
 | 2007-07-12 No problems with Crossroads so far |
 | 2007-07-11 Politically correct ad nauseam |
 | 2007-07-02 Waka Waka Poem |
 | 2007-07-02 Voyage of the rubber ducks |
 | 2007-06-28 The On Off Switch |
 | 2007-06-27 No free lunch |
 | 2007-06-25 Crossroads web interface |
 | 2007-06-25 Blinkenlights |
 | 2007-06-21 There is no silver bullet |
 | 2007-06-18 Motto of the week |
 | 2007-06-18 Do not feed the troll |
 | 2007-06-17 Which programming language are you |
 | 2007-06-13 Crossroads support request |
 | 2007-06-12 Bokito glasses |
 | 2007-06-07 Apache mod_proxy balancer description |
 | 2007-06-05 A ticketnumber is not support |
 | 2007-06-05 403 Hammertime |
 | 2007-06-04 Playground Fun |
 | 2007-05-24 Ascii man |
 | 2007-05-07 Cannot find the damn server |
 | 2007-05-02 The BFG200 |
 | 2007-04-27 Crossroads Top User |
 | 2007-03-30 Crossroads Usage |
 | 2007-03-25 The guy with the dark motorhelmet |
 | 2007-03-22 The Process and The Result |
 | 2007-03-21 Quotes attributed to Jos |
 | 2007-03-20 A really nice comment about Crossroads |
 | 2007-03-18 Kubat in the air |