Data Challenge Winner: Q&A with Christian Ayala Lauroba

A conversation with the lead of the first-place team in the AI4FoodSecurity Data Challenge.

Radiant Earth
Radiant Earth Insights


Hosted on ESA’s AI4EO platform, the AI4FoodSecurity data challenge brought together participants worldwide to find the best machine learning/AI solutions for crop identification using Planet Fusion data and Sentinel-1 and -2 data. The challenge covered two areas of interest, Germany and South Africa, with high-quality cadastral data on field boundaries and crop types as ground truth input. It was organized by Planet, TUM/DLR, and Radiant Earth from 4 October to 19 December 2021.

One hundred eighty-eight teams competed for a chance to win one of the fantastic prizes, which included internships, platform subscriptions, and scholarships. In this Q&A, we sat down with Christian Ayala Lauroba from Spain, the leader of the team that won first place in both the Germany and South Africa tracks of the AI4FoodSecurity data challenge, to talk about his journey to becoming a machine learning engineer and winning the contest. Christian’s team members are Javier Lasheras, Rubén Sesma, and Christian Gutierrez.

Christian, a Machine Learning Engineer in Tracasa’s R&D department, is a doctoral candidate in AI applied to EO at the Public University of Navarre in Spain.

“We deeply researched the state-of-the-art crop classification methods in satellite imagery . . . Then we pondered the best strategy to pre-process the data to enable fast experimentation.” — Christian Ayala Lauroba

Congratulations on winning the AI4FoodSecurity Data Challenge! What inspired you to get involved in this field? How did you become interested in machine learning? Tell us about your machine learning journey.

I received a BSc degree in computer science from the Public University of Navarre (UPNA), Navarre, Spain, in 2018 and started my professional journey as a web developer. However, I soon realized that this kind of job did not make me happy, since all the projects were similar (buttons, dropdown lists, interactions with databases). At the same time, I was enrolled in a master’s degree in computer science, which introduced me to the field of artificial intelligence. I was then offered the chance to join a newly formed R&D department focused mainly on applying AI techniques to remote sensing use cases. I found this job particularly interesting since my childhood dream was to work in the space industry, so being offered the chance to work with satellites was a step toward that dream. Nowadays, I combine my job as a senior researcher in this department with pursuing a Ph.D. in AI applied to Earth observation.

Where did you learn about the AI4FoodSecurity Data Challenge, and what made you decide to participate?

I have always been interested in AI competitions and have competed in many on Kaggle and DrivenData. A year ago, I learned that a new platform called AI4EO, focused on remote sensing scenarios, was about to emerge as an initiative of ESA’s Φ-lab. Since then, our R&D department has taken part in every competition, and I must note that we are the only team that has reached the podium in every one. This competition was particularly interesting because we had barely worked with time series before. Moreover, the Common Agricultural Policy (CAP) is a hot topic today, so building CAP-related applications may be valuable to our company.

Your winning algorithm outperformed 102 teams (for the Germany track) and 86 teams (for the South Africa track). How did you approach the problem, and what do you think set you apart?

First, we deeply researched the state-of-the-art crop classification methods for satellite imagery (mainly Sentinel-1 and Sentinel-2). Then we pondered the best strategy to pre-process the data to enable fast experimentation, which was of paramount importance for trying as many approaches as possible. It is also very important to define a strong and reliable cross-validation setup to validate or discard approaches before tweaking hyperparameters. That said, there is no winning recipe; our extensive experience in remote sensing probably set us apart from the other competitors.
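A reliable cross-validation setup for field-level crop classification usually keeps all observations from the same field in a single fold, so a model is never validated on fields it has already seen during training. As a minimal sketch (the function name and fold logic here are illustrative, not the team’s actual code):

```python
import random
from collections import defaultdict

def grouped_kfold(field_ids, k=5, seed=42):
    """Assign sample indices to k folds so that every observation
    from a given field lands in the same fold, preventing the model
    from being validated on parcels it trained on."""
    rng = random.Random(seed)
    fields = sorted(set(field_ids))
    rng.shuffle(fields)
    # Round-robin assignment of shuffled fields to folds.
    fold_of_field = {f: i % k for i, f in enumerate(fields)}
    folds = defaultdict(list)
    for idx, f in enumerate(field_ids):
        folds[fold_of_field[f]].append(idx)
    return [folds[i] for i in range(k)]
```

Each returned fold is a list of sample indices; training on k−1 folds and validating on the held-out one then gives a leakage-free estimate to compare approaches against.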

Were you familiar with using machine learning on satellite imagery before this competition? How does this differ from common problems in computer vision?

Our R&D department has wide experience applying artificial intelligence techniques to remote sensing imagery (e.g., satellite, aerial, drone), and we have tackled many use cases such as super-resolution, object detection, and semantic segmentation. Note that remote sensing imagery is handled very differently from natural images such as those in ImageNet: the difference lies not only in the number of channels but also in how the images are pre-processed. Therefore, this competition may be difficult for computer vision experts who have never dealt with remote sensing imagery.
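One concrete example of that pre-processing difference: natural-image pipelines typically normalize three 8-bit RGB channels with fixed ImageNet statistics, whereas multispectral time series are usually standardized band by band from the data’s own statistics. A minimal sketch, assuming a NumPy array of shape (time, bands, height, width); the function name is illustrative:

```python
import numpy as np

def normalize_per_band(cube, eps=1e-8):
    """Standardize a multispectral time-series cube of shape
    (time, bands, height, width) band by band, since satellite
    bands span very different value ranges (unlike 8-bit RGB)."""
    mean = cube.mean(axis=(0, 2, 3), keepdims=True)
    std = cube.std(axis=(0, 2, 3), keepdims=True)
    return (cube - mean) / (std + eps)
```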

What unexpected insights into the data have you discovered?

We discovered that Sentinel-2 data made no difference when using Planet Fusion data. This unexpected behavior may be due to the nature of Planet’s new product, which, despite being high-resolution, gap-filled, and cloudless, is harmonized with Sentinel-2.

Any challenges you would like to share?

It was really difficult to fuse data from multiple sensors in a way that could be scaled up easily. We decided to use multiple encoders and aggregate the features they extracted before a classification head. However, this approach did not work, since the model focused on one of the sensors and disregarded valuable information from the others. We then came up with the idea of a multi-task training scheme in which the model has to learn both how to make the most of each sensor individually and how to aggregate the extracted features accurately.
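The multi-task idea can be illustrated with a toy forward pass: each sensor gets its own encoder and its own auxiliary classification head, and a third head classifies the concatenated (fused) features, so the total loss rewards every branch and the network cannot simply ignore one sensor. This is only a NumPy sketch with made-up dimensions and random weights, not the team’s actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(x, w):
    # Toy "encoder": a linear projection followed by ReLU.
    return np.maximum(x @ w, 0.0)

def softmax_xent(logits, label):
    # Numerically stable softmax cross-entropy for a single sample.
    z = logits - logits.max()
    p = np.exp(z) / np.exp(z).sum()
    return -np.log(p[label] + 1e-12)

n_classes = 3
x_planet = rng.normal(size=8)  # hypothetical Planet Fusion features
x_s1 = rng.normal(size=4)      # hypothetical Sentinel-1 features

w_planet = rng.normal(size=(8, 6))
w_s1 = rng.normal(size=(4, 6))
head_planet = rng.normal(size=(6, n_classes))  # auxiliary head 1
head_s1 = rng.normal(size=(6, n_classes))      # auxiliary head 2
head_fused = rng.normal(size=(12, n_classes))  # main fused head

f_planet = encoder(x_planet, w_planet)
f_s1 = encoder(x_s1, w_s1)
f_fused = np.concatenate([f_planet, f_s1])

label = 1
# Multi-task objective: each sensor branch must be predictive on its
# own, and the fused representation must be predictive as well.
total_loss = (softmax_xent(f_planet @ head_planet, label)
              + softmax_xent(f_s1 @ head_s1, label)
              + softmax_xent(f_fused @ head_fused, label))
```

In a real training loop the three terms would be backpropagated jointly (possibly with weights per task), which is what pushes every encoder to stay informative.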

Machine learning is a fast-growing field. How do you stay up-to-date with the latest technological developments?

Nowadays, there are many ways to stay up-to-date with the latest artificial intelligence developments. For example, I would recommend the Twitter and LinkedIn communities, where talented people share state-of-the-art research in an easy-to-follow, insightful way. Moreover, competing in, or just delving into the forums of, Kaggle and DrivenData competitions is a good way of learning new things.

Any advice for beginner data scientists who would like to participate in data competitions?

First, I would recommend spending time exploring the data. It might be boring (I know), but it will give you valuable insights. Then, review the state of the art and implement some of its ideas. It is of paramount importance to build a strong cross-validation scheme before you start tweaking hyperparameters. If possible, join forces with other competitors, especially those with different fields of expertise (e.g., a Sentinel-2 expert with a Sentinel-1 expert). The most important advice is to never run out of ideas; even if they sound crazy beforehand, try them out.
