Your tax dollars at work. Russia warned the US about Tsarnaev going to Dagestan for training as a terrorist, and the TSA put a flag that he was to be detained immediately upon his return. But they spelled his name differently, entering it into their database as Tsarnayev.
Since the lookup on his name didn't hit, he wasn't detained when he returned. And because he wasn't detained, the Boston Marathon bombing happened.
There's an algorithm that's been around for about a century called Soundex. It has been implemented in every major database system and most programming languages. It takes any word and translates it into a four-character code: the first letter of the word followed by three digits that encode the consonant sounds after it. Tsarnaev is T265. Tsarnayev is also T265.
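To show how little it takes, here is a minimal sketch of American Soundex in Python. This is my own illustration, not what any watch-list system actually runs; the `soundex` function and `_CODES` table are names I made up for the example, and it assumes plain ASCII names.

```python
# Digit assigned to each consonant group in American Soundex.
# Vowels, H, W and Y get no digit at all.
_CODES = {}
for digit, letters in enumerate(("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1):
    for letter in letters:
        _CODES[letter] = str(digit)


def soundex(name: str) -> str:
    """Return the four-character Soundex code for name ('' if it has no letters)."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]                   # the first letter is kept as-is
    prev = _CODES.get(letters[0], "")   # but its digit still suppresses a repeat
    for c in letters[1:]:
        if c in "HW":
            continue                    # H and W do not separate equal codes
        digit = _CODES.get(c, "")
        if digit and digit != prev:
            code += digit
        prev = digit                    # a vowel resets prev, so a repeat after it counts
    return (code + "000")[:4]           # pad with zeros, cut to four characters


print(soundex("Tsarnaev"))   # T265
print(soundex("Tsarnayev"))  # T265
```

Both spellings land on the same code, so a lookup keyed on the Soundex code rather than the exact string would have matched.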
Granted, matching on similar names will generate more false positives. But delaying someone for a few minutes to rectify a false positive is a lot better than letting a known terrorist through due to a false negative. The Watch and No Fly lists have been a horrible implementation since day one: when you stop four-year-olds and Senator Ted Kennedy because of these lists, there's a problem. And there's been no discussion of fixing their implementation, because that would 'leak vital information to the terrorists'.
This is the problem with large numbers. If I develop a system that is 99.9% accurate, you're going to say 'cool!' But if that is a facial recognition system in a city the size of Phoenix (population 1,445,632 as of the 2010 census), the remaining 0.1% means it'll incorrectly identify roughly 1,445 people. And when you're talking about actual life and death cases, that's unacceptable. You MUST have something in place to handle the outliers and to account for both false positives and false negatives.
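The arithmetic is worth spelling out. A quick sketch, assuming every resident gets scanned exactly once and the error rate is the same for everyone (`expected_errors` is just a name for this example):

```python
def expected_errors(population: int, accuracy: float) -> float:
    """Expected number of wrong identifications if everyone is checked once."""
    return population * (1 - accuracy)


phoenix_2010 = 1_445_632
print(int(expected_errors(phoenix_2010, 0.999)))  # 1445
```

The percentage looks great; the head count doesn't.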
http://news.slashdot.org/story/14/03/26/2235230/tsa-missed-boston-bomber-because-his-name-was-misspelled-in-a-database
http://www.nbcnews.com/news/investigations/russia-warned-u-s-about-tsarnaev-spelling-issue-let-him-n60836
http://www.theverge.com/2014/3/26/5549206/us-airport-security-missed-boston-bomber-because-a-database