Data Science: Some Blunt Advice From a Data Science Novice


It was 2017 when I first heard the term machine learning from my senior, Muhammad Awais. He told me about applications and state-of-the-art ML projects and techniques, and recommended a course called Machine Learning. I had always wanted to find something I was deeply interested in, and this fancy term “Machine Learning” lifted my spirits. It felt like something interesting, something worth trying.

I started the ML course immediately and spent a lot of time on it. While my peers were studying for exams, I was hitting my head against those cost functions :). As I made progress in the course, I felt that I was not grasping the concepts deeply. I understood everything, but only at the surface level. Anyhow, I finished the course, but I could not find anything practical in it. The course includes a lot of mathematical equations (which are extremely important), but I felt that I was not going anywhere.

I was having a hard time visualizing all the things that were taught during the class. I wanted something practical, so I started implementing ML algorithms in Python and participating in Kaggle competitions. After spending around 6 months on Kaggle I became very comfortable with Python. Though I performed quite well in Kaggle competitions and won hackathons, deep inside I was not content with what I was doing. Here is why:

Machine Learning requires a very solid understanding of mathematics, and as a second-year university student I had not yet been introduced to some core mathematical concepts. When I first learned Linear Algebra during my undergrad, I found it very interesting. The idea of thinking in high dimensions and in terms of transformations is amazing. I was excited about the course, and I already knew some concepts of linear algebra such as eigenvectors, dot products, span, etc. All these concepts are so beautiful that anyone can easily fall in love with them. I don’t want to complain, but the way mathematics is taught at “prestigious” universities (by most of the teachers, not all) is awful :). It’s all computation (which is important), but teaching mathematics without visual intuition is just awful :) Okay, no complaints!

I realized that even though I am an engineer, I did not understand even 0.1% of mathematics. I started to work on my mathematics alongside writing code.

Data Science is a combination of many fields. If you search Quora or a social media group for how to get started, you will get a long list of prerequisites that will scare you and make you feel that you are not suited for data science. That may not be the case.

I am going to give some blunt advice on how to get started in data science. This article is for people who are just getting started in the field. These are my personal opinions and you have the right to differ :)

Here you go:

Don’t Rush

Move slowly. Yes, you read that right: while the world is rushing to learn every new emerging technology, you need to move slowly.

You don’t need to complete 4 online MOOCs in a month; instead, focus on just ONE. Don’t jump to another topic before you are fully comfortable with the one you are studying. Otherwise, within a few months everything will be a mess, and it will hit you hard in the long run.

Pause and ponder. Think a lot about the concept, discuss it with your friends, write an article about it. Don’t move an inch until you have a solid understanding of it.

There is a lot to each concept. Acquiring only surface knowledge will not take you anywhere. Don’t fall into the illusion that you have fully grasped the concept. Have doubts about everything you study. Don’t blindly trust any influencer. Keep hitting your head against it until it makes complete sense to you. Trust your intuition, but be teachable.

Trust me, you have more time than you think :). If you really want to become a Data Scientist, then move slowly.

Stay Longer

It’s hard to get started, but it is easy to give up. We need to understand that learning is a lifelong process. If you have decided that you want to become a Data Scientist, then be consistent with your learning. Spend some hours each week learning mathematics. It would be great to always have one featured Kaggle competition in progress. Work on the fundamentals of programming and databases.

Instead Of Width Go Into The Depth

I cannot emphasize this point enough. With so much information available online, it is very easy to get overwhelmed and distracted. For God’s sake, don’t chase course completions just for the sake of a certificate. No one cares how many certificates you have. Don’t get me wrong: Coursera, edX and other platforms are great sources of learning and it’s good to have certificates, but sometimes we miss the core and essence of the subject in our hurry to get one.

Go into the depth of each topic and ask yourself why it makes sense. Do justice to the discoverer or inventor of the concept. Try to see the real beauty of the concept. Be doubtful about everything you study.

Your curiosity will guide you to the next level.

I have no special talent. I am only passionately curious.

Albert Einstein

Don’t Post Your Online Certificates on Social Media

You have the right to disagree, but I believe that posting certificates online gives us the psychological satisfaction of having achieved something. We feel that because we have completed a specialization, we must know the subject well, but in reality it is just the tip of the iceberg. It’s fine to have them on your LinkedIn profile, but sharing them to get appreciation for learning does not look good to me. I also used to post my certificates online :)

Work On Your Mathematics

You need to work on your maths. There is no other way. If you don’t love maths, then data science will be trial and error for you. Any junior programmer can learn to train machine learning models in Python in a matter of weeks, even if they have never coded in Python before.

But How Much Mathematics?

There is a lot more to it than we realize. I think these are some of the topics we need to master to become real data scientists.

  1. Linear Algebra
  2. Calculus 1
  3. Calculus 2
  4. Calculus 3
  5. Differential Equations
  6. Probability and Statistics
  7. Machine Learning

This is a lot, but we have time (hopefully). Again, I will emphasize: don’t rush, and take your time to grasp the fundamentals of each subject. Think deeply. Trust me, if your basics in these topics are very strong, then any further learning in Machine Learning will be more fruitful and enjoyable. Everything in Machine Learning, and in science generally, is built on fundamental concepts. If we understand the basics really well, it becomes easier to learn the concepts that look intimidating.

Don’t Compare Yourself With Anyone

You are not behind anyone and you are not ahead of anyone. You are in a good position as long as you are improving. Try to improve yourself and lift others. This is the only way to grow.

Build Projects Which Excite You

Build projects that you are interested in. Share them on GitHub and Kaggle. Write articles on Medium. Try to answer questions on Stack Exchange. Ask other people to review your code, and improve as much as you can. The Kaggle community is amazing: people regularly share their code there and discuss interesting topics. Every aspiring Data Scientist should participate in Kaggle competitions.

Don’t Jump To Deep Learning

Don’t jump straight to Deep Learning without understanding Machine Learning. I made this mistake. Machine Learning is a combination of many different fields. I think the best way to start is to take a basic course that gives you an idea of what Machine Learning is and why on earth it is useful. Then start coding some basic models, preferably in Python or R, to get a feel for how it works in the real world. Learn some basic concepts in detail, such as train/test split, evaluation metrics, overfitting/underfitting, regularization, etc.
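To make those basic concepts a little more concrete, here is a minimal sketch in plain Python (standard library only). Everything in it is my own illustrative assumption rather than a recipe from any course: the toy data (y = 3x plus noise), the helper names train_test_split, fit_ridge_slope and mean_squared_error, and the penalty values. It splits the data, fits a one-parameter linear model with an L2 (ridge) penalty, and compares the training error with the error on held-out data.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_ridge_slope(rows, l2=0.0):
    """Fit y ~ w * x (no intercept) by minimizing
    sum((y - w*x)**2) + l2 * w**2, which has the closed form below."""
    sxy = sum(x * y for x, y in rows)
    sxx = sum(x * x for x, _ in rows)
    return sxy / (sxx + l2)


def mean_squared_error(rows, w):
    """Evaluation metric: average squared prediction error."""
    return sum((y - w * x) ** 2 for x, y in rows) / len(rows)


# Hypothetical toy data: y = 3x + Gaussian noise.
rng = random.Random(42)
data = []
for _ in range(200):
    x = rng.uniform(-1.0, 1.0)
    data.append((x, 3.0 * x + rng.gauss(0.0, 0.5)))

train, test = train_test_split(data)

# Try a few regularization strengths and record the results.
results = {}
for l2 in (0.0, 1.0, 100.0):
    w = fit_ridge_slope(train, l2)
    results[l2] = (w, mean_squared_error(train, w), mean_squared_error(test, w))
    print(f"l2={l2:6.1f}  w={w:.3f}  "
          f"train_mse={results[l2][1]:.3f}  test_mse={results[l2][2]:.3f}")
```

Two things are worth noticing when you run it. The unregularized fit always has the lowest training error, because that is exactly what least squares minimizes, but only the test error tells you how the model behaves on data it has not seen. And a very large penalty shrinks the slope toward zero, which is underfitting. Real projects would normally use a library for all of this; writing it by hand once is just a way to see what the library does.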

Once you are comfortable with ML then move to DL if you find it interesting.

Find Mentors

There are a lot of people in the data science community who regularly share their work and knowledge on LinkedIn, Kaggle and other platforms. Connect with them and ask them questions. It is usually assumed that people with high profiles are hard to reach, but believe me, it’s the opposite. If you are deeply interested, you will easily find good mentors who will guide you, and you can learn a lot from them.

Enjoy The Process

Most importantly, enjoy the process of getting better. The satisfaction you get when you know that you are growing is priceless. Learning is good for your mental health and overall well-being.

Ending Notes

These are my thoughts and you have the right to disagree. Constructive criticism is always welcome. Thank you so much for reading ☺