Working in New Domains

As a data scientist, you probably have expertise in one domain, but sometimes your strong foundations in math, stats, machine learning, and computer science will be needed on another project in your company. You may be an image processing guru, but another manager would like you to explore some NLP applications. So how should you approach a new domain?

Well, you conclude, logically of course, that you already know the basics: the machine learning, the libraries, the prototyping language, the bread and butter. You just need to apply them to this new domain. You may start with something basic: some simple feature extraction piped through one of the classic algorithms (SVM, k-means, etc.). But I think the best place to start is to implement a paper in that domain.

It won’t be easy, but it’ll save time. A good paper should outline everything you need to do at a high level; all that’s left is to implement it yourself (again, not saying it will be easy, but at least you have direction). And if you pick a recent paper, your work will reflect recent, possibly state-of-the-art research in that domain. To find one, learn the terms your new domain uses for what you are trying to do. Identifying where text is in an image? Text detection. Transcribing what the text actually says? Text recognition. Now search for those terms on arXiv or Google Scholar and see what comes up!
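If you would rather script the search than click around, here is a minimal sketch that uses arXiv's public query API and nothing beyond the Python standard library. The helper names (`build_arxiv_query`, `fetch_titles`), the choice to search all fields for the exact phrase, and sorting by submission date are my own assumptions for illustration, not the only way to do it.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

ARXIV_API = "http://export.arxiv.org/api/query"  # arXiv's public query endpoint
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}  # results come back as an Atom feed


def build_arxiv_query(phrase, max_results=10):
    """Build a query URL that searches all fields for an exact phrase, newest first.

    Hypothetical helper: the name and defaults are illustrative.
    """
    params = {
        "search_query": f'all:"{phrase}"',
        "start": 0,
        "max_results": max_results,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urlencode(params)}"


def fetch_titles(phrase, max_results=10):
    """Fetch the feed for a phrase and return the paper titles (needs network access)."""
    with urlopen(build_arxiv_query(phrase, max_results)) as response:
        feed = ET.fromstring(response.read())
    # Collapse the line breaks arXiv inserts into long titles.
    return [" ".join(entry.findtext("atom:title", default="", namespaces=ATOM_NS).split())
            for entry in feed.findall("atom:entry", ATOM_NS)]


# Print the search URLs for the two example terms; no network call happens here.
for term in ("text detection", "text recognition"):
    print(build_arxiv_query(term))
```

Calling `fetch_titles("text detection")` should give you a list of recent titles to skim. Sorting by date favors recency over influence, so it is still worth cross-checking a promising paper on Google Scholar to see whether others have built on it.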