written by

Rémy Ghalayini

# AI and Drones: Part II Algorithms

## What is an algorithm?

An algorithm is a set of instructions that tells a machine which actions to perform. In data science, an algorithm is usually based on mathematical functions that process data, perform calculations, and predict outcomes, with the results often visualized in graphs.

That’s why data scientists often have a good understanding of mathematics, as it allows them to create more robust algorithms. Machine learning algorithms are typically divided into two groups: supervised and unsupervised learning algorithms. The main distinction between the two approaches is the use of labeled datasets. Supervised learning uses labeled input and output data, while unsupervised learning does not. Unsupervised algorithms work on their own to discover the inherent structure of unlabeled data.

## Examples of algorithms

There are many algorithms, and each can perform different functions, depending on the business scenario and the desired output. We often hear about ‘regression analysis’ in finance and insurance, which falls under the ‘supervised learning’ category. This algorithm uses statistical processes to estimate the relationship between a dependent variable and one or more independent variables. By fitting a model closely to the data according to a specific mathematical criterion, such as least squares, it lets us predict future scenarios, provided the relationship seen in past data continues to hold.
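As an illustration, here is a minimal sketch of a simple linear regression fitted by ordinary least squares. The function name `fit_line` and the data (flight time versus battery used) are made up for this example and are not taken from a real drone.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: flight time (minutes) and battery used (percent).
flight_minutes = [5, 10, 15, 20, 25]
battery_used = [11, 21, 29, 41, 50]

slope, intercept = fit_line(flight_minutes, battery_used)

# Use the fitted line to predict an unseen case: a 30-minute flight.
predicted = slope * 30 + intercept
print(f"battery used = {slope:.2f} * minutes + {intercept:.2f}")
print(f"predicted use for a 30-minute flight: {predicted:.1f}%")
```

Here the independent variable is the flight time and the dependent variable is the battery used. The prediction is only as trustworthy as the assumption that the relationship stays linear outside the observed range.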

Another popular algorithm, ‘K-means clustering,’ helps companies with customer segmentation, which is the process of dividing customers into groups based on common characteristics so companies can market to each group effectively and appropriately. It falls under the ‘unsupervised learning’ category: it analyses customer attributes, such as age, expenditure, location, and so on, and creates clusters of customers with similar attributes. Imagine having tens of thousands of customers; manually doing this work would take ages. But once a machine has been taught how to do it, updating the segmentation every week takes only a few minutes.
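The idea can be sketched in a few lines of plain Python. This is a simplified version of K-means, not a production implementation; the customer data (age, monthly spend) and the helper names are assumptions made for illustration.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    distances = [squared_distance(point, c) for c in centroids]
    return distances.index(min(distances))


def kmeans(points, k, max_iterations=100, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    for _ in range(max_iterations):
        # Assignment step: put every point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
        if new_centroids == centroids:  # nothing moved, so we have converged
            break
        centroids = new_centroids
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Hypothetical customers: (age, monthly spend). Note that no labels are given.
customers = [(22, 150), (25, 180), (27, 160), (48, 620), (52, 640), (55, 600)]

centroids, labels = kmeans(customers, k=2)
for customer, label in zip(customers, labels):
    print(customer, "-> cluster", label)
```

In practice we would usually rescale the features first, so that a large-valued attribute such as spend does not dominate a small-valued one such as age.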

In drone operations, we don’t usually use regression or K-means clustering. When we capture large amounts of imagery, a common challenge is automatically detecting what thousands or millions of images contain. Humans can differentiate between objects at a glance and tell whether we are looking at a cat, a dog, or a car. A computer cannot do that out of the box, because to a machine an image is only a collection of pixels, that is, a grid of color values. As a result, we have to teach the computer how to identify objects in an image.
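To make this concrete, here is a small sketch of what a grayscale image looks like to a computer. The 4x4 grid and its values are made up for illustration; a real drone image has millions of pixels, usually with three color channels each.

```python
# A hypothetical 4x4 grayscale image: 0 is black, 255 is white.
# To a person this is a bright square on a dark background;
# to the machine it is only a grid of numbers.
image = [
    [0,   0,   0, 0],
    [0, 255, 255, 0],
    [0, 255, 255, 0],
    [0,   0,   0, 0],
]

height = len(image)
width = len(image[0])
pixel_count = height * width
mean_brightness = sum(sum(row) for row in image) / pixel_count

print(f"{width}x{height} image, {pixel_count} pixels")
print(f"mean brightness: {mean_brightness}")
```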

Typically, coders do this using convolutional neural networks (CNNs), a type of artificial neural network used in image recognition and processing and specifically designed to process pixel data. The main advantage of a CNN is that it learns the essential features automatically during training, without anyone having to define them by hand. For example, given many labeled pictures of cats and dogs, it learns the distinctive features of each class by itself.
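The building block of a CNN is the convolution: a small grid of weights, called a kernel, slides over the image and produces a ‘feature map’. The sketch below shows that single operation in plain Python. In a real CNN the kernel values are learned during training; here we hand-pick a vertical-edge kernel, and the image values and the function name `convolve2d` are our own assumptions.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    Like most deep learning libraries, this does not flip the kernel,
    so strictly speaking it computes a cross-correlation.
    """
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kernel_h):
                for dj in range(kernel_w):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical image: dark on the left, bright on the right.
image = [[0, 0, 0, 9, 9, 9] for _ in range(5)]

# Hand-picked kernel that responds to dark-to-bright vertical edges.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, kernel)
for row in feature_map:
    print(row)
```

The feature map is zero where the image is uniform and large where the kernel overlaps the edge. A CNN stacks many such kernels in layers, so that later layers respond to increasingly complex patterns.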

Classification builds on CNN results to classify the objects detected in images and produce a map-like image as the final product. We can use classification for crop monitoring, mapping soil characteristics, mapping differences in forest cover, land cover change detection, natural disaster assessment, water resource applications, wetland mapping, and urban and regional planning. It has been widely used in remote sensing with satellite imagery and has become more popular with drone imagery.
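One way to picture the ‘map-like image’ is as a grid in which every pixel holds a class instead of a color. The sketch below assumes a model has already produced one score per class for each pixel; the class names, the scores, and the function `to_class_map` are hypothetical.

```python
from collections import Counter

CLASSES = ["water", "vegetation", "built-up"]

# Hypothetical model output for a 2x3 image: one score per class, per pixel.
scores = [
    [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.1, 0.8, 0.1]],
    [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6], [0.2, 0.2, 0.6]],
]


def to_class_map(scores):
    """For every pixel, keep the index of the class with the highest score."""
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in scores
    ]


class_map = to_class_map(scores)
for row in class_map:
    print([CLASSES[index] for index in row])

# Count how many pixels fall in each class, e.g. to estimate land cover area.
pixel_counts = Counter(CLASSES[index] for row in class_map for index in row)
print(dict(pixel_counts))
```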

## Data models

For almost all algorithms, and specifically for those in the supervised learning category, a training dataset is labeled manually beforehand and then used to train the data model. An example would be to take a few thousand images, each featuring either a cat or a dog, and manually go through them, labeling each one by highlighting what it contains. Once we run the algorithm, a model is prepared through a training process in which it is required to make predictions and is corrected when those predictions are wrong. The training process continues until the model achieves a desired level of accuracy on the training data.
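The predict-and-correct loop described above can be sketched with a perceptron, one of the simplest trainable models. It stands in here for the far larger models used on real images. The two numeric features per animal, their values, and the function names are assumptions made for illustration.

```python
def predict(weights, bias, features):
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train(samples, labels, target_accuracy=1.0, max_epochs=100, learning_rate=0.1):
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        correct = 0
        for features, label in zip(samples, labels):
            error = label - predict(weights, bias, features)
            if error == 0:
                correct += 1
            else:
                # Wrong prediction: nudge the model toward the right answer.
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
        # Stop once the desired accuracy on the training data is reached.
        if correct / len(samples) >= target_accuracy:
            break
    return weights, bias


# Hypothetical, manually labeled training data.
# Features: two made-up measurements scaled to the range 0..1.
# Labels: 0 = cat, 1 = dog.
samples = [(0.9, 0.2), (0.3, 0.8), (0.8, 0.3), (0.2, 0.9), (0.85, 0.1), (0.4, 0.7)]
labels = [0, 1, 0, 1, 0, 1]

weights, bias = train(samples, labels)
predictions = [predict(weights, bias, s) for s in samples]
print("predictions:", predictions)
print("labels:     ", labels)
```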

Once a data model is created, it is tested against another labeled dataset, one it has not seen during training, to check its accuracy. If the model correctly predicts the class of 95% of the items in the test dataset, it has an accuracy of 95%. High accuracy is imperative because we use data models to make predictions and decisions; if the model is inaccurate, it is unusable. That’s why the quality of the input data is crucial when training data models: a model is only as good as the data feeding it.
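The accuracy calculation itself is simple arithmetic: the number of correct predictions divided by the number of test items. A short sketch, with a made-up test set of 20 labels and a made-up list of predictions containing one mistake:

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical labeled test set of 20 images.
test_labels = ["cat", "dog"] * 10

# Hypothetical model output: identical to the labels except for one mistake.
test_predictions = list(test_labels)
test_predictions[3] = "cat"  # the true label at this position is "dog"

score = accuracy(test_predictions, test_labels)
print(f"accuracy: {score:.0%}")
```

With 19 of 20 predictions correct, the accuracy is 19 / 20 = 95%.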

This blog post gave a short description of the algorithms used in drone operations. Many other algorithms can be used, but the ones discussed here are the most widely used in today’s drone and geospatial industry.

#### In part III, we will discuss the practical application and use cases for how AI and data models are used to analyze drone data.

First published on 2022-01-19