This came across my desk this morning: theanalysisofdata.com. The author is Guy Lebanon, a former professor of Computer Science who, according to his website, is now the Director of Product Innovation at Netflix. The text appears to be very rigorous and is notable because it goes all the way from first principles to limit theorems that rely on measure theory. Here’s a link to his volume on Probability:
http://theanalysisofdata.com/probability/0_2.html
The following passage from the preface caught my eye:
Probability theory is a wide field. This book focuses on the parts of probability that are most relevant for statistics and machine learning …. Probability textbooks are typically either elementary or advanced. This book strikes a balance by attempting to avoid measure theory where possible, but resorting to measure theory and other advanced material in a few places where they are essential ….
I am not aware of a single textbook that covers the material from probability theory that is necessary and sufficient for an in-depth understanding of statistics and machine learning. This book represents my best effort in that direction.
I often run into students who want to work in Data Science, but who say they are only interested in learning material that is “practical” or “useful.” What has always amazed me about probability and statistics (and mathematics in general) is that basic, seemingly elementary questions often have answers that require a substantial mathematical background. In fact, many topics that students now see as abstract theory were first developed to answer very practical questions. I would suggest that Lebanon’s inclusion of measure theory in his text bears out this thought in the context of data science.
Just as musicians must spend 10,000 hours* practicing on their instruments and learning music theory before they can be creative and establish an individual voice, if you’re ambitious and want to be a “data science genius,” you had better dedicate at least 2,000 of your 10,000 hours to mathematics!
* Yes, I know the 10,000-hour figure is a matter of correlation rather than causation. Generally speaking, though, if you want to be good at something, you had better be prepared to spend some serious time on it!