Docstring of the day: a mystery
I like to think I’m pretty good about writing docstrings that will be comprehensible to my successor when I leave a job or get hit by a bus, but sometimes I get a bit silly, and write riddles like…
Here’s the code to go along with it:
If you guessed that this code strips diacritics from the given string, giving the closest possible ASCII representation of any foreign characters, then congratulations, you clever so-and-so. I was inspired by a similar routine in Google Refine. They use a sort of home-baked lookup table, rather than the pithier approach of Unicode equivalence. Schmucks! It turns out that normalizing foreign characters is a helpful step in clustering swathes of messy free text (something I’m doing a lot of right now at Sciencescape).
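For anyone who wants to try the Unicode-equivalence trick, here is a minimal sketch of the idea, assuming Python's standard `unicodedata` module. The function name `strip_diacritics` is made up for illustration and is not necessarily what the snippet above looked like.

```python
import unicodedata


def strip_diacritics(text):
    """Return text with combining marks removed (hypothetical helper)."""
    # NFKD decomposition splits e.g. "é" into "e" + U+0301 (combining acute accent).
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep the base characters and drop the combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_diacritics("Crème brûlée"))  # Creme brulee
```

One caveat with this approach: letters such as "ø" or "ß" have no Unicode decomposition, so they pass through unchanged. That is the kind of case a lookup table can cover, so the two approaches are probably best seen as complementary.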