The Role of Statistics in the Field of Data Science

Why do we need to have sufficient knowledge of Statistics? What concepts in Statistics are useful in Data Science?

Data Science extracts meaningful information from large amounts of data, also known as big data. It combines various tools, algorithms, and machine learning principles to discover what lies behind the raw data. This is where Statistics comes in: sufficient knowledge of Statistics helps us choose proper methods for collecting, analyzing, and presenting data effectively. Ultimately, it allows us to understand the data more deeply.

The fundamental concepts in Statistics that are useful in Data Science are probability distributions, statistical significance, hypothesis testing, and regression. Looking forward to learning more about these on the 7th day of our FTW Workshop! 😉
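To make hypothesis testing and statistical significance a little more concrete, here is a minimal sketch of a permutation test written with only the Python standard library. The function name `permutation_test`, the exam scores, and the number of shuffles are all made up for illustration; this is one simple way to test a difference in means, not the only one.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_shuffles=10_000, seed=0):
    """Estimate a two-sided p-value for the difference in group means.

    Null hypothesis: both groups come from the same distribution,
    so the group labels are interchangeable.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_shuffles):
        # Randomly reassign the values to the two groups.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1

    # Fraction of shuffles at least as extreme as the observed difference.
    return extreme / n_shuffles


# Hypothetical exam scores for two study groups (illustrative data only).
scores_a = [82, 85, 88, 90, 91, 87, 84, 89]
scores_b = [70, 72, 75, 68, 74, 71, 73, 69]

p_value = permutation_test(scores_a, scores_b)
print(f"p-value: {p_value:.4f}")
```

A small p-value (commonly below 0.05) suggests that a difference this large would rarely appear if the two groups really were the same, which is what we mean by a statistically significant result.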

Why do we need to explore the data? Why do we need to clean the data?

As aspiring data scientists, we need to explore the data to get oriented and familiarize ourselves with the dataset as we try to learn the story behind it. Exploration also helps us uncover initial patterns, characteristics, and points of interest. We need to clean the data so that inaccurate, incomplete, or unreasonable values are identified and handled. The result is quality data, which leads to better results and better decisions.
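The idea of exploring and then cleaning a dataset can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: the records, the helper `is_valid_age`, and the 0 to 120 age bounds are hypothetical, and a real project would likely use a library such as pandas.

```python
from statistics import mean, median

# Hypothetical raw records (illustrative data only): one age is
# missing (None) and two are unreasonable (negative or far too large).
raw_records = [
    {"name": "Ana", "age": 29},
    {"name": "Ben", "age": None},
    {"name": "Cara", "age": 34},
    {"name": "Dan", "age": -5},
    {"name": "Eli", "age": 41},
    {"name": "Fay", "age": 230},
]


def is_valid_age(age, low=0, high=120):
    """Treat missing or out-of-range ages as invalid (assumed bounds)."""
    return age is not None and low <= age <= high


# Explore: how many records are there, and how many look problematic?
n_missing = sum(1 for r in raw_records if r["age"] is None)
n_invalid = sum(1 for r in raw_records if not is_valid_age(r["age"]))
print(f"records: {len(raw_records)}, missing: {n_missing}, invalid: {n_invalid}")

# Clean: keep only the records that pass the validity check.
clean_records = [r for r in raw_records if is_valid_age(r["age"])]
clean_ages = [r["age"] for r in clean_records]

# Compare summaries before and after cleaning.
known_ages = [r["age"] for r in raw_records if r["age"] is not None]
print(f"mean age before cleaning: {mean(known_ages):.1f}")
print(f"mean age after cleaning:  {mean(clean_ages):.1f}")
print(f"median age after cleaning: {median(clean_ages)}")
```

Comparing the mean before and after cleaning shows how a couple of unreasonable values can distort a summary, which is why cleaning comes before drawing conclusions.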
