Design

The ultralytics-inference package to be implemented in this repository is intended to provide a simple, thin wrapper around the ultralytics Python package for running inference on images and video on a variety of compute resources ("targets").

The initial targets are:

  • Local (Docker)
  • Amazon SageMaker Processing

This document describes the proposed design of the ultralytics-inference package.

Resources

🐳 Docker

Docker is a platform for developing, shipping, and running applications in containers. Containers allow a developer to package up an application with all of the parts it needs, such as libraries and other dependencies, and ship it all out as one package.

☁️ Amazon SageMaker Processing

Amazon SageMaker Processing is a feature of Amazon SageMaker that lets you run arbitrary workloads on fully managed infrastructure. SageMaker Processing uses a Docker image to specify the processing job. Typically, processing jobs are used to preprocess data before training, to postprocess the output of a training job, or to evaluate a model. In this case, we are interested in using SageMaker Processing to run inference on images and video.
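
For illustration only, launching such a processing job with the sagemaker Python SDK might look roughly like the sketch below. The role ARN, instance type, S3 paths, and command arguments are placeholders, and in practice the image would typically be mirrored to Amazon ECR; none of these are decisions made by this design.

```python
from sagemaker.processing import Processor, ProcessingInput, ProcessingOutput

# All values below are placeholders; real values would come from the
# ultralytics-inference configuration and CLI described later in this document.
processor = Processor(
    image_uri="docker.io/ultralytics/ultralytics",  # typically mirrored to Amazon ECR
    role="arn:aws:iam::123456789012:role/SageMakerProcessingRole",  # placeholder role
    instance_count=1,
    instance_type="ml.g4dn.xlarge",
    entrypoint=["yolo"],
)

processor.run(
    arguments=["predict", "model=yolov8n.pt", "source=/opt/ml/processing/input",
               "project=/opt/ml/processing/output"],
    inputs=[ProcessingInput(
        source="s3://my-bucket/staging/inputs",        # staged inputs (placeholder)
        destination="/opt/ml/processing/input",
    )],
    outputs=[ProcessingOutput(
        source="/opt/ml/processing/output",
        destination="s3://my-bucket/staging/outputs",  # collected results (placeholder)
    )],
)
```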

🚀 ultralytics

ultralytics is a Python package that provides a simple and efficient way to run inference on images and video. The ultralytics package currently supports image classification, object detection, and image segmentation. Multiple object tracking is also supported for video.

ultralytics supports a wide range of models, including YOLOv5, YOLOv8, and many others. The complete list of supported models can be found in the ultralytics documentation.
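
For reference, running inference directly with the ultralytics package is already straightforward; the model name and source below are arbitrary examples.

```python
from ultralytics import YOLO

# Load a pretrained detection model (any supported model could be used here).
model = YOLO("yolov8n.pt")

# Run inference on an image or video; one Results object is returned per image/frame.
results = model.predict(source="bus.jpg", save=True)

for result in results:
    print(result.boxes)  # detected bounding boxes
```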

ultralytics provides a Docker image, docker.io/ultralytics/ultralytics, that includes the dependencies needed to run inference on images and video with the ultralytics Python package. The image can use the NVIDIA container runtime to take advantage of GPU acceleration.
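
For example, the image might be run locally along these lines; the mounted paths and arguments are placeholders (drop --gpus all on machines without an NVIDIA GPU):

```bash
# Placeholder paths; the actual mounts would be derived from the configuration file.
docker run --rm --gpus all \
  -v "$PWD/inputs:/data/inputs" \
  -v "$PWD/outputs:/data/outputs" \
  docker.io/ultralytics/ultralytics \
  yolo predict model=yolov8n.pt source=/data/inputs project=/data/outputs
```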

Proposed design

The ultralytics-inference package will provide a simple, high-level interface for running inference on images and video using the ultralytics package. The package will be designed to be extensible, so that additional targets can be added in the future.

The package will delegate all the heavy lifting to the ultralytics package, and will focus on providing a simple, consistent interface for running inference on different targets. In this vein, the ultralytics-inference package will focus primarily on the following tasks:

  • Setting up the environment for running inference (copying files, setting up the model, etc.)
  • Abstracting the details of running inference on different targets (a sketch of this abstraction follows the list)
  • Running inference on images and video
  • Collecting and returning the results of the inference
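
One possible shape for the target abstraction is sketched below. The InferenceTarget protocol and its method names are illustrative assumptions, not an existing API.

```python
from pathlib import Path
from typing import Protocol


class InferenceTarget(Protocol):
    """Hypothetical interface that each target (local, sagemaker, ...) would implement."""

    def stage_inputs(self, inputs: list[Path], staging_location: str) -> None:
        """Copy input files/directories to the target-specific staging location."""
        ...

    def run(self, image: str, command: list[str], staging_location: str) -> None:
        """Execute the inference command inside the given Docker image."""
        ...

    def collect_outputs(self, staging_location: str, outputs: list[Path]) -> None:
        """Copy results from the staging location back to the requested output location(s)."""
        ...
```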

From the user's perspective, the process will be:

  1. Install the ultralytics-inference package
  2. Write a configuration file specifying an inference task (an example appears after this list), providing:
    1. the task name
    2. a list of inputs
    3. a list of outputs
    4. the Docker image to use
    5. the command to run (e.g., yolo predict ...)
  3. Run an ultralytics-inference command from the command line, specifying:
    1. the configuration file
    2. a target (e.g., local, sagemaker)
    3. a target-specific staging location (e.g., a local directory, an S3 bucket) for the inputs and outputs
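
An example configuration file and invocation might look roughly as follows. The YAML schema, the {inputs}/{outputs} placeholder substitution, and the CLI flag names are illustrative assumptions, not a finalized interface.

```yaml
# inference.yaml -- hypothetical schema
task: detect-vehicles
inputs:
  - data/images/
  - data/videos/traffic.mp4
outputs:
  - results/
image: docker.io/ultralytics/ultralytics
command: yolo predict model=yolov8n.pt source={inputs} project={outputs}
```

```bash
# Run locally (hypothetical CLI; flag names are assumptions)
ultralytics-inference run --config inference.yaml --target local --staging /tmp/staging

# Run on SageMaker Processing, staging inputs and outputs in S3
ultralytics-inference run --config inference.yaml --target sagemaker --staging s3://my-bucket/staging
```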

The ultralytics-inference command will (see the local-target sketch after this list):

  1. Copy the inputs to the staging location
  2. Set up the target environment for running inference
  3. Run the inference task on the target
  4. Collect the results and copy them to the desired output location(s)
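
To make these steps concrete, here is a minimal sketch of how the local target might implement them, assuming the hypothetical InferenceTarget interface above and invoking Docker via subprocess:

```python
import shutil
import subprocess
from pathlib import Path


class LocalTarget:
    """Hypothetical local (Docker) target; paths and flags are illustrative only."""

    def stage_inputs(self, inputs: list[Path], staging_location: str) -> None:
        staging = Path(staging_location) / "inputs"
        staging.mkdir(parents=True, exist_ok=True)
        for item in inputs:
            if item.is_dir():
                shutil.copytree(item, staging / item.name, dirs_exist_ok=True)
            else:
                shutil.copy2(item, staging / item.name)

    def run(self, image: str, command: list[str], staging_location: str) -> None:
        # Mount the staging directory into the container and run the inference command.
        # Drop "--gpus all" on machines without an NVIDIA GPU.
        subprocess.run(
            ["docker", "run", "--rm", "--gpus", "all",
             "-v", f"{staging_location}:/staging",
             image, *command],
            check=True,
        )

    def collect_outputs(self, staging_location: str, outputs: list[Path]) -> None:
        results = Path(staging_location) / "outputs"
        for destination in outputs:
            shutil.copytree(results, destination, dirs_exist_ok=True)
```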