Desi: April 2009 Archives

I recently launched a new web tool called DesiFilter.

Like a lot of folks from immigrant communities, I tend to be hyper-aware of names from my culture. If I'm watching a movie, part of my brain goes "hey, wow!" when I see that the gaffer's backup caterer is named Banerjee or Patel or Khan.

DesiFilter sample results

South Asian American community journalists and bloggers will regularly do the same--scanning long lists of names to find community members involved in larger news stories. So I built a tool to help out, based on a list of over 26,000 uniquely South Asian first and last names I collected and hand-edited. (The word "Desi" is often used interchangeably with South Asian in diaspora.)

You just give DesiFilter a URL or a bunch of text, and it'll find and highlight possible South Asian names. Commercial name ethnicity matching tools have been around for a while, and are used for things like targeted marketing and political campaigning. I believe this is the first freely available tool of its kind that handles South Asian names.

It wasn't particularly hard to build; the tech side (powered by Perl's Regexp::Assemble) was a breeze compared to the difficult task of collecting and refining name lists. South Asian names come from all over, so I ended up making a lot of awkward decisions to maximize usability in majority-Anglo countries, including throwing out most Anglo and many Portuguese names common in South Asia to minimize false positives. This means, for example, that it'll fail to identify John Abraham as a South Asian name. Short of a hard-to-build-and-visualize system of weights, I can't think of a much better solution.
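To give a rough idea of the matching step, here is a minimal sketch in Python. It is not DesiFilter's actual code: the real tool is built on Perl's Regexp::Assemble, while this uses a plain alternation from Python's standard `re` module. The four-name list, the function names, and the `[[...]]` highlight markers are all assumptions made up for illustration.

```python
import re

# Tiny illustrative list (an assumption); the real tool uses a hand-edited
# list of over 26,000 names.
SAMPLE_NAMES = ["Banerjee", "Patel", "Khan", "Chatterjee"]


def build_name_pattern(names):
    """Combine the names into one case-insensitive, whole-word regex."""
    # Sort longest first so a longer name wins over a shorter one that
    # happens to be its prefix.
    escaped = sorted((re.escape(n) for n in names), key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def highlight_names(text, pattern, open_mark="[[", close_mark="]]"):
    """Wrap every possible match in markers (hypothetical highlight style)."""
    return pattern.sub(lambda m: open_mark + m.group(0) + close_mark, text)


pattern = build_name_pattern(SAMPLE_NAMES)
print(highlight_names("Catering by R. Patel and John Abraham", pattern))
# prints: Catering by R. [[Patel]] and John Abraham
```

Note that "John Abraham" goes unmarked here for the same reason described above: a name that isn't on the list can't be matched, no matter how the pattern is assembled.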

DesiFilter got some big love on Sepia Mutiny. I'm currently working on some features to make it more useful to the folks over at the South Asian Journalists Association.

I was looking at my ballot for the 2009 election for members of the board of Amnesty International USA, and was surprised to see that 5 of the 12 candidates had South Asian names:

There's still a large enough disconnect between mainstream South Asian communities and mainstream social justice movements that stuff like this brings a smile to my face.


Anirvan Chatterjee is a San Francisco Bay Area tech geek and bibliophile.


About this Archive

This page is an archive of entries in the Desi category from April 2009.
