Back in 2017, in Nantes (France), working from a client's warehouse (formerly a garage), we started a side project: a price monitoring solution for e-Commerce.
We spent around 6 months testing most of the crawling solutions out there. We unsuccessfully tried to scale Scrapy to huge volumes, before realizing that it would be 1) unmaintainable and 2) extremely costly.
So, after 6 months of R&D, came that moment when you throw everything in the garbage.
To handle the thousands of simultaneous connections required, we used Kafka, ZooKeeper, and HBase; we developed our own "Work Balancer" to handle workload distribution, along with our own crawler, super light and blazing fast.
In 2018, we shifted our roadmap from e-Commerce to Data Engineering.
Thanks to our WorkBalancer / Crawler combo, we're able to quickly deploy scraping projects that scale and are maintainable. That's when we teamed up with EP (now part of our board). The French prop-tech startup was looking to build knowledge on top of a global aggregation of the French real-estate market.
While keeping the platform "fully customizable", we developed an interface allowing EP's Data Team to set up their own Data Pipes. To handle complex scenarios, we released Workflows.
In 2020, we extensively used the platform ourselves, acting as a "Data Agency". We made huge improvements to the Keecode to simplify our Data Templates.
Spoiler alert: nice features may well be coming for the Data Agencies out there.
High school buddies since 2008, François and Vincent have worked together ever since, from their Computer Science MSc to becoming indie developers. François is a specialist in parallel computing and is in charge of our infrastructure, while Vincent focuses on the product.