In my last post, I wrote about learning statistics for data science. In this post, I want to talk about mathematics, which is often mentioned in tandem with statistics as a core data science skill. The necessity of learning statistics is clear. It is the only way to learn about correlation and causation. However, I think math has a different story. I have always loved math and have always been good at it. Here, I am going to demystify math as it is mentioned in the data science context.
Math for data science reminds me of Andrew Ng’s famous course on machine learning and how he taught it from a mathematical point of view. It was the only course that taught machine learning in this way, and I guess it still is. I can also remember course assignments where students were supposed to write algorithms from scratch in MATLAB. For me, learning the math was a joy, but the assignments seemed weird and irrelevant, much like reinventing the wheel. The other disadvantage of the course, in my eyes, was that the weekly contents were not independent. To follow week 5, you had to start from the very beginning to make sense of the math formulas if your mind was not warmed up.
My first question is: do we need math to get a basic understanding of what we are doing, or do we need it only for fine-tuning algorithms? Is it a basic part of data science or a fancy one? Do we need to delve deep into every algorithm we learn, or would a general understanding and knowledge of its applications suffice? My second question is: which math topics do we need to learn, and how? Some people might recommend taking a linear algebra course, but I doubt that course instructors teach from the perspective a data scientist needs. If you have taken a math course for data science purposes, please share your experience.
If you are a data scientist and you studied math, these are probably serious questions for you, because you have likely spent a lot of time thinking about how your formal education helps in a data science project.
I look forward to hearing your ideas about the use of math in data science.
Hamideh Iraj is a big data and data science researcher. She writes on a wide range of topics including Big Data, Data Science, Information Technology, Education and MOOCs.