Data Science? What is it?

A brief explanation of what the buzzword “Data Science” means and what should come to mind when you hear it.

Additionally, I’ll list some sources where you can kick off 2021 learning it.

If we were to begin by analyzing the etymology of this so-called “sexiest word of the 21st century”, we ought to understand what “Data” is and what “Science” is. Well, let’s start with “Science”, which I believe people are more familiar with, or at least used to.

If we do a quick search on Wikipedia, we’re going to find that:

  • “Science” comes from Latin scientia, which means “Knowledge”

So, at this point, you might be wondering: “So we could say that Data Science is actually Data Knowledge?” Yep, my dear reader!

Our goal as Data Scientists is to organize knowledge to be able to explain things and make predictions, in this case, not about the whole Universe, but most of the time, about the Data we are dealing with. In a nutshell, science is all about discovery and building knowledge.

YOU: All right! Gotcha! But what about “Data”? What exactly do you mean by that?

I’m certain that you wouldn’t be satisfied if I told you that Data is any sort of information, right?

Alright! Let me show you a spreadsheet that my dad uses in his tire retail store, or better, a translation of it.

That Excel spreadsheet contains data, such as the customer’s name, address, and phone number, information about the car, the service performed, the total (in reais), and so on. As you can imagine, Data is everywhere! Literally!
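To make the idea of “rows of data” a bit more concrete, here is a minimal Python sketch of what a few lines of such a spreadsheet could look like once loaded into a program. Every name and value in it is invented for illustration (it is not my dad’s real data), and the field names and the function `revenue_by_service` are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: customers, cars, services, and totals are all made up.
rows = [
    {"customer": "Ana", "car": "Fiat Uno", "service": "tire change", "total_brl": 480.0},
    {"customer": "Bruno", "car": "VW Gol", "service": "alignment", "total_brl": 120.0},
    {"customer": "Carla", "car": "Chevrolet Onix", "service": "tire change", "total_brl": 520.0},
]


def revenue_by_service(records):
    """Sum the total (in reais) for each kind of service."""
    totals = defaultdict(float)
    for record in records:
        totals[record["service"]] += record["total_brl"]
    return dict(totals)


summary = revenue_by_service(rows)
print(summary)
```

Even a tiny summary like this one (how much each kind of service brought in) is already a small piece of knowledge extracted from the data.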

While you’re browsing the Web, data is being collected from you. All those Google ads that pop up as soon as you access an online shop, invading your inbox and hijacking your chance to browse in peace: where do you think they come from? From your actions online! Practically every website you visit collects your data, either when you sign up for its newsletter (the moment you give your e-mail address) or when you accept its cookies.

Have you ever thought about how Netflix makes predictions about what you would like to watch based on what you have already watched? That’s all about Machine Learning! But don’t worry about it right now.

Drew Conway’s Venn Diagram (http://drewconway.com/)

Since the goal is to introduce you to the idea of Data Science, let me take a step back and give you a visual overview of what Data Science is supposed to be. If you look at this Venn Diagram (the one from your math classes on Set Theory, which I’m sure rings a bell), you are going to see that Data Science is actually a very interdisciplinary field. So let’s break this diagram into parts.

Hacking Skills: Here, you are expected to have skills such as Python or R programming, Excel, SQL, manipulating large datasets, and some algorithmic thinking.

Math and Statistics Knowledge: This is crucial to understanding what the data is really trying to “tell” you. As we have said, Data Science is not only about data itself but also about science. Therefore, our main focus is to build knowledge and draw insights from the data.

Subjects such as Exploratory Analysis, Linear Regression, Inference, Hypothesis Testing, Data Mining, and Machine Learning all draw on statistical knowledge.
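To give a taste of one of these subjects, here is a minimal sketch of simple Linear Regression using ordinary least squares in plain Python. The numbers are invented (a made-up relation between tires sold and the total in reais), and the function name `fit_line` is hypothetical; in practice you would probably reach for a statistics library instead.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel out).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: number of tires sold and the total charged (in reais).
tires_sold = [1, 2, 3, 4]
total_brl = [250.0, 500.0, 750.0, 1000.0]

slope, intercept = fit_line(tires_sold, total_brl)
predicted_for_five = slope * 5 + intercept
print(slope, intercept, predicted_for_five)
```

The fitted line summarizes the pattern in the data (here, roughly how much each extra tire adds to the total) and lets us make a prediction for a case we have not seen yet.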

Substantive Expertise: Last but not least, substantive expertise is certainly what will make you feel that learning DS is an almost impossible task. But don’t be fooled: even though we might feel lost, as though we were drowning in an infinite flood of information, tons of online resources will prepare you for this exciting journey. As published by the Harvard Business Review back in 2012, the dominant trait among data scientists is curiosity! Being curious about the world. Only by being curious will you be able to go beneath the surface of a problem and think more creatively. That being said, let’s check some of the sources where you can start learning Data Science in 2021.

Data Science Podcasts

Undoubtedly, podcasts are a great option, whether you want to take it slow at first and get more familiar with such a vast field or you want to stay up to date with the latest trends on the topic.

Real Python, which, by the way, is an awesome site for learning Data Science, has put together a fantastic list with great options. You can check it in the link below.

Besides that, if you are from Brazil or perhaps speak Portuguese, there are great podcasts as well, such as:

  • Hipsters Ponto Tech
  • Data Hackers
  • Pizza de Dados
  • Cabeça de Lab

Pages and Courses

However, it might be that you want to start 2021 off by striking through “Start to Learn Statistics and Data Science in 2021” and raising the stakes a bit. So here is some good content you should definitely take a look at.

  • Khan Academy: It’s definitely one of the best platforms for learning almost any school subject, from the very basics up to AP courses. There you will find the Math and Statistics content you need to excel at DS. All courses are available in Portuguese as well.
  • Coursera | EdX | Udemy: These platforms are incredible when it comes to learning basically any skill.
  • Probability Course: This is more than a website; in fact, it is an entire book! Available for free! I used it during my first course in Probability and I highly recommend it.
  • Sigmoidal: If you speak Portuguese, this platform is a must! I’m currently enrolled in their Python course for beginners, “Python do Zero”, and it is unique! They have a very strong approach based on projects, portfolio, and personal branding.

YouTube Channels

  • Roger Peng: He is one of the hosts of the podcast “Not So Standard Deviations”. All of his content is worth it.
  • Siraj Raval: I actually discovered him on GitHub and found out he has an awesome YouTube channel with plenty of content.
  • Joma Tech: Joma’s channel is full of content. There you will find lots of stuff related to programming, technology, data science, and much more.
  • Ken Jee: If you are seeking a Data Science career, you can’t miss him either.
  • Carlos Melo: Seriously, in my opinion, the best YouTube channel in this field in Brazil. Carlos is the founder of Sigmoidal and, more than just a Data Scientist, he is an astonishing self-taught filmmaker.
  • Alura Cursos Online: Consume any of their content and you're good. Trust me.

Books for Beginners

  • Data Science from Scratch: As the name suggests, if you’ve never heard about DS before, it is a good place to start. Be careful to check whether you are getting the 2nd edition (which is available only in English so far).
  • Data Science for Business (Data Science para Negócios): As you will notice after some time spent on the topic, a Data Scientist is expected to know something, or more, about business.
  • Head First Python: A Brain-Friendly Guide (Use a cabeça! Python)
  • Probability and Statistics for Engineering and the Sciences (Probabilidade e Estatística para Engenharia e Ciências):

As we’ve seen, a Data Scientist is supposed to have a strong statistical background. This book is available in English and Portuguese. I’ve used it alongside some others and it is pretty good, but you can choose any other one you like.

All things considered, if you consume half of this content, or even all of it, you will be more than ready to walk this long path on your own.

I hope you enjoyed it! Good luck on your journey!

Take care!

Statistics and Data Science undergraduate student. Here I'll share some of what I've been learning with you. Hope you like it!
