Introduction
Kaggle started its life as a platform for machine learning competitions, where data scientists and machine learning specialists/learners could participate and collaborate to solve a specific problem.
However, it quickly developed into much more as additional services were integrated, making it one of the largest data platforms available to the public. Kaggle's popularity picked up rapidly during the early years, and the platform reached 1M registered users in 2017.
Features and services
One reason Kaggle stays on top of its game is its ongoing improvement and steady addition of new features. As more services were added and popularity picked up, an increasing number of users and organisations joined the platform. Kaggle currently offers the following services:
- Machine learning competitions: organisations and companies post problems that participants compete to solve, often offering rewards to the top-ranked entries.
- Kaggle Kernels: a workbench in the cloud for data science and machine learning. It allows users to share and discuss code snippets on a large array of topics in Python and R.
- Public data sets platform: one of the most useful features for machine learning and data science. Users can create and share data sets of any format (for example spreadsheets and images). The data sets cover a wide array of topics, from sports statistics to images of cats and dogs.
- Kaggle Learn: short-form AI education on multiple topics, including Python, machine learning, Pandas, and others.
- Jobs board: the services above created a community of data scientists and AI experts, so it made sense to offer a service where employers can post job vacancies in the fields of data science and machine learning.
Viaboxx on Kaggle
As a company, we're always keeping an eye on what's new and interesting, and Kaggle seems to be the perfect portal for monitoring innovation in the machine learning world. That's why Viaboxx is registered as an organisation on Kaggle. Between the competitions and the "kernels" posted by users, it is always interesting to look through others' ideas and see which can be useful for our projects.
For example, we used a "cats and dogs" data set from Kaggle to conduct our own analysis of different open source CNNs, comparing their performance on several metrics, including accuracy and loss. We went through some interesting challenges and came up with even more interesting results, and the knowledge gathered from that experience was applied to one of our ongoing research projects. For more details about Viaboxx's experience with Deep Learning on the cloud please take a look at this and this blog post.
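To give a feel for what comparing models on accuracy and loss means in practice, here is a minimal Python sketch. It is not the code from our analysis: the model names, labels, and predicted probabilities are made up for illustration, and binary cross-entropy is assumed as the loss for a two-class (cat vs. dog) problem.

```python
import math


def accuracy(labels, probs, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    correct = sum(1 for y, p in zip(labels, probs) if (p >= threshold) == bool(y))
    return correct / len(labels)


def binary_cross_entropy(labels, probs, eps=1e-12):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)


def compare_models(labels, predictions):
    """Return accuracy and loss for each model's predicted probabilities."""
    return {
        name: {
            "accuracy": accuracy(labels, probs),
            "loss": binary_cross_entropy(labels, probs),
        }
        for name, probs in predictions.items()
    }


# Hypothetical data: 1 = dog, 0 = cat; values are the predicted P(dog).
labels = [1, 0, 1, 1, 0]
predictions = {
    "model_a": [0.9, 0.2, 0.8, 0.4, 0.1],  # confident, but one mistake
    "model_b": [0.6, 0.4, 0.7, 0.6, 0.3],  # always right, but hesitant
}

results = compare_models(labels, predictions)
for name, metrics in results.items():
    print(f"{name}: accuracy={metrics['accuracy']:.2f}, loss={metrics['loss']:.3f}")
```

With these made-up numbers, the model with the higher accuracy also has the higher loss, because loss additionally penalises low-confidence predictions. That is one reason to look at both metrics when comparing networks.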
Kaggle and practical applications
Machine learning in general is rapidly spreading into many sectors, including medicine, finance, logistics, and even sports analysis. Kaggle competitions have also had their share of impact. For example, the live leaderboard has helped fuel the competition to innovate, and that has helped improve existing best practices in deep learning and even create new ones.
Other useful applications came in the form of improved gesture recognition for Microsoft Kinect, assistance in HIV research, traffic prediction, and improvements in the search for the Higgs particle at CERN.
Conclusion
With its diverse utilities and ever-evolving community, Kaggle is the go-to platform when it comes to machine learning and data science. The sheer amount of raw data and ready-to-consume information forms a rich environment for learning and growth. At Viaboxx, we recommend Kaggle for learners and experts alike.