In short, the competition is more fierce, and different skills are needed to stand out. For the last 5–10 years, data science has attracted newcomers from around the world to build a career in the “21st century’s sexiest job”.
I recently read the following mind-boggling statistics in a KDNuggets article where ML researcher Mihail Eric analyzed the data roles being hired for at every company coming out of Y-Combinator since 2012.
Here’s the gist of what he found out in two sentences.
There are 70% more open roles at companies in data engineering as compared to data science. As we train the next generation of data and machine learning practitioners, let’s place more emphasis on engineering skills.
“This may sound boring and unsexy, but old-school software engineering with a bend toward data may be what we really need right now.”
So does that mean you shouldn’t study data science? No. What it means is that competition is going to be tougher. There are going to be fewer positions available for what is looking to be an abundance of newcomers to the market trained to do data science.
To stand out, you need to get your hands dirty and build the skills that really matter. And this means not just technical skills, but collaboration, leadership, and problem-solving abilities.
To help you get there, I want to share a few platforms, each emphasizing different skill sets and knowledge, that prepare you for the real world.
The article is divided into two parts:
- Competitive and collaborative platforms to hone your skills
- New resources to augment specific skills
While Kaggle is a well-known platform for Data Science competitions, there are many more platforms worth knowing and exploring if you are interested in acquiring the skills that you need on the job.
1. DrivenData
DrivenData works on projects at the intersection of data science and social impact, in areas like international development, health, education, research and conservation, and public services. Their mission is to give more organizations access to the capabilities of data science and engage more data scientists with social challenges where their skills can make a difference.
The site has a section dedicated to competitions. The datasets listed on DrivenData come from nonprofits and range from wildlife preservation to public health. If you want to apply your skills to meaningful problems, DrivenData can be your place to get your hands dirty.
2. Omdena
Omdena is, to our knowledge, the first platform built around collaboration rather than competition. It hosts two-month real-world projects that go from problem scoping to data collection, preprocessing, ML modeling, and deployment. Our mission is to help organizations (startups, NGOs, social enterprises) build real-world solutions while data scientists and engineers from around the world make an impact, build up 21st-century job skills, and network in a global community.
In an Omdena project, you join a selected team of up to 50 collaborators from diverse backgrounds. Projects cover both business and impact-driven problems, such as rooftop detection for solar power, detecting bias in articles, and preventing malaria infections. If you are looking for a full-scale real-world experience, Omdena is the right platform for you.
3. CrowdANALYTIX
CrowdANALYTIX converts business challenges into analytics competitions and addresses the need for analytical solutions requiring predictive analytics, descriptive analytics, estimations, and business hypothesis validation.
The platform also hosts a community blog that has excellent resources, including interviews and reference materials. If you are looking for business-oriented competitions, this might be your platform.
4. DataCamp
DataCamp offers a beginner-friendly approach to real-world problems.
DataCamp’s approach is to build the confidence to code on your own. Projects let you apply your skills using tools like Jupyter Notebook and complete a data analysis from start to finish, all in a risk-free environment.
You can apply your coding skills to solve open-ended problems without step-by-step tasks. If you get stuck, follow the live code-along videos to see how an expert instructor finds one of the many possible solutions.
5. InnoCentive
InnoCentive is an open innovation and crowdsourcing company that mainly focuses on problems dealing with life sciences.
The crowd can either be external (i.e., their network of over 380,000 problem solvers) or internal (i.e., an organization’s employees, partners, or customers). Awards, typically monetary, are given for submissions that meet the requirements set out in the Challenge description. The average award amount for a Challenge is $20,000 but some offer awards of over $100,000.
As a solver, you can contribute to tackling some of the world’s most pressing problems.
6. CodaLab
CodaLab is an open-source web-based platform that enables researchers, developers, and data scientists to collaborate to advance research fields where machine learning and advanced computation are used. CodaLab helps solve many common problems in data-oriented research through its online community, where people can share worksheets and participate in competitions.
Experiments can then be easily copied, reworked, and edited by other collaborators in order to advance the state-of-the-art in data-driven research and machine learning.
7. Zindi
Zindi is a data science competition platform with the mission of building the data science ecosystem in Africa. They connect organizations with the thriving African data science community to solve the world’s most pressing challenges using machine learning and AI.
8. Analytics Vidhya
Data Science Competitions to Compete, Win, Practice, Learn, And Build your Data Science Portfolio!
Analytics Vidhya provides a community-based knowledge portal for analytics and data science professionals. In addition to providing great learning resources for data science, it hosts hackathons, which are real-life industry problems in the form of contests. You can either participate in the challenges or sponsor a hackathon. Most companies that organize hackathons on Analytics Vidhya also offer job opportunities to the top scorers.
9. AIcrowd
The data science challenge platform AIcrowd hosts multiple open data science challenges each year. The challenges cover image classification problems, text recognition, reinforcement learning, adversarial attacks, image segmentation, resource allocation optimization, and many other areas across multiple domains. They were awarded over $100,000 from Amazon and Nvidia for their 2017 challenge called “Learning to Run.”
Hackathons that Data Scientists Should Participate In
10. HackerEarth
HackerEarth is a good place for beginners. Programmers from all over the world come together there to solve problems in a wide range of computer science domains, such as algorithms, machine learning, and artificial intelligence, and to practice different programming paradigms like functional programming.
11. MachineHack
MachineHack is an online platform for machine learning competitions.
It is a growing platform with a mission to support the ever-growing data science community and to help young aspirants learn and improve their skills in the field of analytics. They host tough business problems that can be solved with machine learning and data science.
New Resources to Become a Data Scientist
Free university courses in Machine Learning from Top Universities (updated 2023)
The courses range from Stanford University (Andrew Ng and others) to summer schools, and all are available, whether updated or archived.
- Massachusetts Institute of Technology
- Stanford University (Andrew Ng and others)
- University of California, Berkeley
- Carnegie Mellon University
Find all courses here.
Whether you are preparing for a machine learning or deep learning interview or are still studying, this is worth keeping at hand. Even senior data scientists will need it.
A four-page data science cheat sheet that covers core machine learning concepts and algorithms, from models to reinforcement learning.
- Linear and Logistic Regression
- Decision Trees and Random Forest
- Dimension Reduction (PCA, LDA, Factor Analysis)
Github Repo: http://bit.ly/3pGO5Hi
11 Recommended Data Science Books (Free & Paid) You Should Read
- Probabilistic Programming & Bayesian Methods for Hackers
- Elements of Statistical Learning
- Algorithm Design
- Introduction to Linear Algebra
- Linear Algebra Done Right
- Head First Statistics: A Brain-Friendly Guide
- Probability and Random Processes
- Python Machine Learning by Example
- Pattern Recognition and Machine Learning
- Python for Data Analysis
- Ace the Data Science Interview
Bonus: Free courses from the Intel® AI Academy
The Intel® AI Academy offers free courses for software developers, data scientists, and students. For beginners and advanced developers alike, the Intel® AI Developer Program teaches about AI, how to make deep learning faster on Intel® hardware, and how to advance research.
Course durations vary from 4 to 12 weeks, and topics span theory, software, and hardware.
Check all of them here.
Enjoy the learning journey. If this was helpful, feel free to leave a comment and share. 🙂