During our conference on "Text Mining in Historical Science" last week, several themes and problems around text mining kept recurring. As they came up in one presentation after another, the conference's Twitter crowd started collecting them as mottos: some serious, some half-funny, some sad… In the general discussion at the end of the conference, we used these themes to wrap up our thoughts on text mining in historical science. Posted here (somewhat out of context), they may nevertheless be useful to researchers planning text mining projects.
- Never underestimate pre-processing!
- Duplicates ("doublets") are evil.
- Don’t trust (closed source) developers.
[...]