Adding Custom Vision object detection sample #952
Open
jwood803
wants to merge
11
commits into
dotnet:main
from
jwood803:cv-onnx
Changes from 9 commits
Commits
7838714 Adding weather recognition sample (jwood803)
60dadb4 Added README (jwood803)
da73a99 Update readme (jwood803)
49a306f Remove old sample and add new sample (jwood803)
f579c7d Add new README (jwood803)
e5ec0db Handle multiple objects (jwood803)
a6f63c9 Calculate multiple bounding boxes correctly (jwood803)
77967f1 Update readme (jwood803)
854cdde Output section (jwood803)
54bc31c Save images to same location (jwood803)
8a39271 Update readme (jwood803)
165 changes: 165 additions & 0 deletions
165
samples/csharp/end-to-end-apps/StopSignDetection_ONNX/README.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,165 @@ | ||
# Object Detection - Console App Sample | |
|
||
| ML.NET version | API type | Status | App Type | Data type | Scenario | ML Task | Algorithms | | ||
|----------------|-------------|------------|-------------|-------------|------------------|---------------|-----------------------------------| | ||
| v1.7.1 | Dynamic API | Up-to-date | End-End app | image files | Object detection | Deep Learning | ONNX: Custom Vision | | ||
|
||
## Problem | ||
|
||
Object detection is one of the main applications of deep learning: a model not only classifies objects in an image but also shows where each object is by drawing a bounding box around it. For deep learning scenarios, you can either use a pre-trained model or train your own model. This sample uses an object detection model exported from [Custom Vision](https://www.customvision.ai). | |
|
||
## How the sample works | ||
|
||
This sample consists of a single console application that builds an ML.NET pipeline from an ONNX model downloaded from Custom Vision, runs predictions on every image in the "test" folder, and draws the predicted bounding boxes on each image. | |
|
||
## ONNX | ||
|
||
The Open Neural Network Exchange ([ONNX](http://onnx.ai/)) is an open format for representing deep learning models. With ONNX, developers can move models between state-of-the-art tools and choose the combination that is best for them. ONNX is developed and supported by a community of partners, including Microsoft. | |
|
||
## Model input and output | ||
|
||
In order to parse the prediction output of the ONNX model, we need to understand the format (or shape) of the input and output tensors. To do this, we'll start by using [Netron](https://netron.app/), a GUI visualizer for neural networks and machine learning models, to inspect the model. | ||
|
||
Below is an example of what we'd see upon opening this sample's model with Netron: | ||
|
||
 | ||
|
||
From the output above, we can see the ONNX model has the following input/output formats: | ||
|
||
### Input: 'image_tensor' 3x320x320 | ||
|
||
The first thing to notice is that the **input tensor's name** is **'image_tensor'**. We'll need this name later when we define the **input** parameter of the estimator pipeline. | |
|
||
We can also see that the **shape of the input tensor** is **3x320x320**. This tells us that the image passed into the model should be 320 pixels high by 320 pixels wide. The '3' indicates that the image should have three color channels in BGR order: blue, green, and red, respectively. | |
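
To make the shape concrete, here is a minimal sketch of how a single value in a 3x320x320 tensor could be addressed when the tensor is stored as one flat array. It assumes a channel-first (planar) layout, where all values of one channel come before the values of the next. `TensorLayout` and `FlatIndex` are hypothetical names used only for illustration and are not part of the sample.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the sample.
public static class TensorLayout
{
    public const int Channels = 3;
    public const int Height = 320;
    public const int Width = 320;

    // Returns the position of (channel, y, x) in a flat array,
    // assuming a channel-first (planar) layout.
    public static int FlatIndex(int channel, int y, int x)
    {
        if (channel < 0 || channel >= Channels) throw new ArgumentOutOfRangeException(nameof(channel));
        if (y < 0 || y >= Height) throw new ArgumentOutOfRangeException(nameof(y));
        if (x < 0 || x >= Width) throw new ArgumentOutOfRangeException(nameof(x));

        // Skip whole channels first, then whole rows, then columns.
        return channel * Height * Width + y * Width + x;
    }
}
```

Under these assumptions the tensor holds 3 * 320 * 320 = 307,200 values, so `TensorLayout.FlatIndex(0, 0, 0)` is 0 and `TensorLayout.FlatIndex(2, 319, 319)` is 307199, the last element.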
|
||
### Output | ||
|
||
We can see that the ONNX model has three outputs: | ||
- **detected_classes**: An array of indexes that corresponds to the **labels.txt** file of what classes have been detected in the image. The labels are the tags that are added when uploading images to the Custom Vision service. | ||
- **detected_boxes**: An array of floats that are normalized to the input image. There will be a set of four items in the array for each bounding box. | ||
- **detected_scores**: An array of scores for each detected class. | ||
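
The three outputs are parallel arrays: entry `i` of the classes and scores, together with the four floats starting at position `4 * i` of the boxes, describe one detection. The sketch below shows one way to combine them into a single object per detection. The `Detection` record and `DetectionParser` class are hypothetical names introduced for illustration; the box order (left, top, right, bottom) follows how this sample's `Program.cs` reads the values.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type for illustration only; not part of the sample.
public record Detection(string Label, float Score, float Left, float Top, float Right, float Bottom);

public static class DetectionParser
{
    // Combines the three parallel output arrays into one record per detection.
    // Assumes each box occupies four consecutive floats: left, top, right, bottom.
    public static List<Detection> Parse(long[] classes, float[] boxes, float[] scores, string[] labels)
    {
        if (boxes.Length != classes.Length * 4 || scores.Length != classes.Length)
            throw new ArgumentException("Output arrays do not describe the same number of detections.");

        var detections = new List<Detection>(classes.Length);
        for (int i = 0; i < classes.Length; i++)
        {
            int offset = i * 4; // start of the i-th box in the flat array
            detections.Add(new Detection(
                labels[classes[i]], // class index -> line of labels.txt
                scores[i],
                boxes[offset], boxes[offset + 1], boxes[offset + 2], boxes[offset + 3]));
        }
        return detections;
    }
}
```

For example, calling `DetectionParser.Parse(new long[] { 0 }, new[] { 0.1f, 0.2f, 0.5f, 0.6f }, new[] { 0.9f }, new[] { "stop-sign" })` yields a single `Detection` labeled `stop-sign`. An image with no detections produces an empty list rather than an error.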
|
||
## Solution | ||
|
||
**The project in this solution targets .NET 6. To run this sample, you must install the .NET 6 SDK. To do this, either:** | |
|
||
1. Manually install the SDK by going to the [.NET 6.0 download page](https://dotnet.microsoft.com/en-us/download/dotnet/6.0) and downloading the latest installer in the **SDK** column. | |
2. Or, if you're using Visual Studio, use Visual Studio 2022 with a .NET workload installed; .NET 6 is not supported in Visual Studio 2019. | |
|
||
## Code Walkthrough | ||
|
||
Create a class that defines the data schema to use while loading data into an `IDataView`. ML.NET supports the `Bitmap` type for images, so we'll define a `Bitmap` property decorated with the `ImageTypeAttribute` and pass in the height and width dimensions we got by [inspecting the model](#model-input-and-output), as shown below. | |
|
||
```csharp | ||
namespace StopSignDetection_ONNX | |
{ | ||
public struct ImageSettings | ||
{ | ||
public const int imageHeight = 320; | ||
public const int imageWidth = 320; | ||
} | ||
|
||
public class StopSignInput | ||
{ | ||
[ImageType(ImageSettings.imageHeight, ImageSettings.imageWidth)] | ||
public Bitmap Image { get; set; } | ||
} | ||
} | ||
``` | ||
|
||
### ML.NET: Configure the model | ||
|
||
The first step is to create an empty `DataView` to obtain the schema of the data to use when configuring the model. | ||
|
||
```csharp | ||
var data = context.Data.LoadFromEnumerable(new List<StopSignInput>()); | |
``` | ||
|
||
Next, we can use the input and output tensor names we got by [inspecting the model](#model-input-and-output) to define the **input** and **output** parameters of the ONNX model, and use them to define the estimator pipeline. Usually, when dealing with deep neural networks, you must adapt the images to the format expected by the network. For this reason, the code below resizes each image to the model's input size and extracts its pixel values into the `image_tensor` column. Since our model has multiple outputs, we use the overload of **ApplyOnnxModel** that takes a string array of output column names. | |
|
||
```csharp | ||
var pipeline = context.Transforms.ResizeImages(resizing: ImageResizingEstimator.ResizingKind.Fill, outputColumnName: "image_tensor", imageWidth: ImageSettings.imageWidth, imageHeight: ImageSettings.imageHeight, inputColumnName: nameof(StopSignInput.Image)) | ||
.Append(context.Transforms.ExtractPixels(outputColumnName: "image_tensor")) | ||
.Append(context.Transforms.ApplyOnnxModel(outputColumnNames: new string[] { "detected_boxes", "detected_scores", "detected_classes" }, | ||
inputColumnNames: new string[] { "image_tensor" }, modelFile: "./Model/model.onnx")); | ||
``` | ||
|
||
Last, create the model by fitting the `DataView`. | ||
|
||
```csharp | ||
var model = pipeline.Fit(data); | ||
``` | ||
|
||
## Create a PredictionEngine | ||
|
||
After the model is configured, create a `PredictionEngine`, and then pass each image to the engine to detect objects using the model. | |
|
||
The **Console** app uses `CreatePredictionEngine` to create a single `PredictionEngine` and reuses it for every test image. A `PredictionEngine` is not thread-safe, which is fine here because the images are processed one at a time on a single thread. | |
|
||
|
||
```csharp | ||
var predictionEngine = context.Model.CreatePredictionEngine<StopSignInput, StopSignPrediction>(model); | ||
``` | ||
|
||
## Detect objects in an image | ||
|
||
When obtaining the prediction for images in the `test` directory, we get a `long` array in the `PredictedLabels` property, a `float` array in the `BoundingBoxes` property, and a `float` array in the `Scores` property. We load each test image into a `FileStream`, parse it into a `Bitmap` object, and pass that `Bitmap` to the prediction engine as input. | |
|
||
We use the `Chunk` method to split the flat `BoundingBoxes` array into one group of four values per predicted box, and use those values to draw the bounding boxes on the image. To get the label for each box, we use the corresponding entry in `PredictedLabels` as an index into the lines of the `labels.txt` file. | |
|
||
```csharp | ||
var labels = File.ReadAllLines("./Model/labels.txt"); | |
|
||
var testFiles = Directory.GetFiles("./test"); | ||
|
||
Bitmap testImage; | ||
|
||
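foreach (var image in testFiles) | |
{ | |
// Path of the annotated copy saved at the end of each iteration | |
var predictedImage = $"{image}-predicted.jpg"; | |
|
||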
using (var stream = new FileStream(image, FileMode.Open)) | ||
{ | ||
testImage = (Bitmap)Image.FromStream(stream); | ||
} | ||
|
||
var prediction = predictionEngine.Predict(new StopSignInput { Image = testImage }); | ||
|
||
// Each box is four floats; a fixed chunk size also avoids dividing by zero when nothing is detected | |
var boundingBoxes = prediction.BoundingBoxes.Chunk(4).ToArray(); | |
|
||
var originalWidth = testImage.Width; | ||
var originalHeight = testImage.Height; | ||
|
||
for (int i = 0; i < boundingBoxes.Count(); i++) | ||
{ | ||
var boundingBox = boundingBoxes.ElementAt(i); | ||
|
||
var left = boundingBox[0] * originalWidth; | ||
var top = boundingBox[1] * originalHeight; | ||
var right = boundingBox[2] * originalWidth; | ||
var bottom = boundingBox[3] * originalHeight; | ||
|
||
var x = left; | ||
var y = top; | ||
var width = Math.Abs(right - left); | ||
var height = Math.Abs(top - bottom); | ||
|
||
var label = labels[prediction.PredictedLabels[i]]; | ||
|
||
using var graphics = Graphics.FromImage(testImage); | ||
|
||
graphics.DrawRectangle(new Pen(Color.Red, 3), x, y, width, height); | ||
graphics.DrawString(label, new Font(FontFamily.Families[0], 32f), Brushes.Red, x + 5, y + 5); | ||
} | ||
|
||
if (File.Exists(predictedImage)) | ||
{ | ||
File.Delete(predictedImage); | ||
} | ||
|
||
testImage.Save(predictedImage); | ||
} | ||
``` | ||
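
The coordinate arithmetic in the loop above can be isolated into a small helper, sketched below. It scales a normalized box to pixel coordinates the same way the sample does, and additionally clamps the values to the range 0 to 1 so that a slightly out-of-range prediction stays inside the image. The clamping step, the `BoxScaler` class, and the `ToPixels` method are assumptions made for illustration; the sample itself does not clamp.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the sample.
public static class BoxScaler
{
    // Converts a normalized box (values in 0..1) into pixel x, y, width, height
    // for an image of the given size.
    public static (float X, float Y, float Width, float Height) ToPixels(
        float left, float top, float right, float bottom, int imageWidth, int imageHeight)
    {
        // Clamp first so the rectangle never extends past the image edges.
        float l = Math.Clamp(left, 0f, 1f) * imageWidth;
        float t = Math.Clamp(top, 0f, 1f) * imageHeight;
        float r = Math.Clamp(right, 0f, 1f) * imageWidth;
        float b = Math.Clamp(bottom, 0f, 1f) * imageHeight;

        // Use the smaller coordinate as the origin in case the corners arrive swapped.
        return (Math.Min(l, r), Math.Min(t, b), Math.Abs(r - l), Math.Abs(b - t));
    }
}
```

For a 640 x 480 image, `BoxScaler.ToPixels(0.25f, 0.5f, 0.75f, 1.0f, 640, 480)` returns `(160, 240, 320, 240)`, which can be passed directly to `Graphics.DrawRectangle`.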
|
||
## Output | ||
|
||
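For this object detection scenario, the application saves a new image with the bounding boxes and labels drawn onto it. The new image is written next to the original test image, with `-predicted.jpg` appended to the file name. If a predicted image already exists when the console application runs, the application deletes it before saving the new one. | |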
|
||
 |
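
Because the predicted images are saved into the same folder as the test images, a later run that lists every file in `./test` would also pick up the previously generated `-predicted.jpg` files and run predictions on them. One possible way to skip them is sketched below; `TestFileFilter` and `ExcludePredicted` are hypothetical names and are not part of the sample.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of the sample.
public static class TestFileFilter
{
    // Suffix the sample appends to the input path when saving an annotated image.
    private const string PredictedSuffix = "-predicted.jpg";

    // Returns only the files that were not produced by an earlier run.
    public static IEnumerable<string> ExcludePredicted(IEnumerable<string> files) =>
        files.Where(f => !f.EndsWith(PredictedSuffix, StringComparison.OrdinalIgnoreCase));
}
```

In the sample this could be applied as `TestFileFilter.ExcludePredicted(Directory.GetFiles("./test"))` before the `foreach` loop.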
25 changes: 25 additions & 0 deletions
25
samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX.sln
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
| ||
Microsoft Visual Studio Solution File, Format Version 12.00 | ||
# Visual Studio Version 17 | ||
VisualStudioVersion = 17.1.32228.430 | ||
MinimumVisualStudioVersion = 10.0.40219.1 | ||
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "StopSignDetection_ONNX", "StopSignDetection_ONNX\StopSignDetection_ONNX.csproj", "{37A33ADD-47A7-4B09-B323-CB9BCBC86851}" | ||
EndProject | ||
Global | ||
GlobalSection(SolutionConfigurationPlatforms) = preSolution | ||
Debug|Any CPU = Debug|Any CPU | ||
Release|Any CPU = Release|Any CPU | ||
EndGlobalSection | ||
GlobalSection(ProjectConfigurationPlatforms) = postSolution | ||
{37A33ADD-47A7-4B09-B323-CB9BCBC86851}.Debug|Any CPU.ActiveCfg = Debug|Any CPU | ||
{37A33ADD-47A7-4B09-B323-CB9BCBC86851}.Debug|Any CPU.Build.0 = Debug|Any CPU | ||
{37A33ADD-47A7-4B09-B323-CB9BCBC86851}.Release|Any CPU.ActiveCfg = Release|Any CPU | ||
{37A33ADD-47A7-4B09-B323-CB9BCBC86851}.Release|Any CPU.Build.0 = Release|Any CPU | ||
EndGlobalSection | ||
GlobalSection(SolutionProperties) = preSolution | ||
HideSolutionNode = FALSE | ||
EndGlobalSection | ||
GlobalSection(ExtensibilityGlobals) = postSolution | ||
SolutionGuid = {0FCF9329-4869-4595-94F9-56E4055DA8D4} | ||
EndGlobalSection | ||
EndGlobal |
1 change: 1 addition & 0 deletions
1
...les/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/Model/labels.txt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
stop-sign |
Binary file added
BIN
+10.7 MB
...les/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/Model/model.onnx
Binary file not shown.
78 changes: 78 additions & 0 deletions
78
samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/Program.cs
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,78 @@ | ||
using Microsoft.ML; | ||
using Microsoft.ML.Transforms.Image; | ||
using StopSignDetection_ONNX; | ||
using System.Drawing; | ||
|
||
var context = new MLContext(); | ||
|
||
var data = context.Data.LoadFromEnumerable(new List<StopSignInput>()); | ||
|
||
// Create pipeline | ||
var pipeline = context.Transforms.ResizeImages(resizing: ImageResizingEstimator.ResizingKind.Fill, outputColumnName: "image_tensor", imageWidth: ImageSettings.imageWidth, imageHeight: ImageSettings.imageHeight, inputColumnName: nameof(StopSignInput.Image)) | ||
.Append(context.Transforms.ExtractPixels(outputColumnName: "image_tensor")) | ||
.Append(context.Transforms.ApplyOnnxModel(outputColumnNames: new string[] { "detected_boxes", "detected_scores", "detected_classes" }, | ||
inputColumnNames: new string[] { "image_tensor" }, modelFile: "./Model/model.onnx")); | ||
|
||
// Fit and create prediction engine | ||
var model = pipeline.Fit(data); | ||
|
||
var predictionEngine = context.Model.CreatePredictionEngine<StopSignInput, StopSignPrediction>(model); | ||
|
||
var labels = File.ReadAllLines("./Model/labels.txt"); | ||
|
||
var testFiles = Directory.GetFiles("./test"); | ||
|
||
Bitmap testImage; | ||
|
||
foreach (var image in testFiles) | ||
{ | ||
// Load test image into memory | ||
var predictedImage = $"{image}-predicted.jpg"; | ||
|
||
using (var stream = new FileStream(image, FileMode.Open)) | ||
{ | ||
testImage = (Bitmap)Image.FromStream(stream); | ||
} | ||
|
||
// Predict on test image | ||
var prediction = predictionEngine.Predict(new StopSignInput { Image = testImage }); | ||
|
||
// Split the flat array into one four-value box (left, top, right, bottom) per detection. | |
// A fixed chunk size also avoids dividing by zero when nothing is detected. | |
var boundingBoxes = prediction.BoundingBoxes.Chunk(4).ToArray(); | |
|
||
var originalWidth = testImage.Width; | ||
var originalHeight = testImage.Height; | ||
|
||
// Draw boxes and predicted label | ||
for (int i = 0; i < boundingBoxes.Count(); i++) | ||
{ | ||
var boundingBox = boundingBoxes.ElementAt(i); | ||
|
||
var left = boundingBox[0] * originalWidth; | ||
var top = boundingBox[1] * originalHeight; | ||
var right = boundingBox[2] * originalWidth; | ||
var bottom = boundingBox[3] * originalHeight; | ||
|
||
var x = left; | ||
var y = top; | ||
var width = Math.Abs(right - left); | ||
var height = Math.Abs(top - bottom); | ||
|
||
// Get predicted label from labels file | ||
var label = labels[prediction.PredictedLabels[i]]; | ||
|
||
// Draw bounding box and add label to image | ||
using var graphics = Graphics.FromImage(testImage); | ||
|
||
graphics.DrawRectangle(new Pen(Color.Red, 3), x, y, width, height); | ||
graphics.DrawString(label, new Font(FontFamily.Families[0], 32f), Brushes.Red, x + 5, y + 5); | ||
} | ||
|
||
// Save the prediction image, but delete it if it already exists before saving | ||
if (File.Exists(predictedImage)) | ||
{ | ||
File.Delete(predictedImage); | ||
} | ||
|
||
testImage.Save(predictedImage); | ||
} |
35 changes: 35 additions & 0 deletions
35
...d-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/StopSignDetection_ONNX.csproj
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
<Project Sdk="Microsoft.NET.Sdk"> | ||
|
||
<PropertyGroup> | ||
<OutputType>Exe</OutputType> | ||
<TargetFramework>net6.0</TargetFramework> | ||
<ImplicitUsings>enable</ImplicitUsings> | ||
<Nullable>enable</Nullable> | ||
</PropertyGroup> | ||
|
||
<ItemGroup> | ||
<None Remove="Model\a889f48fdc1c45b5af840c5df4210a04.ONNX.zip" /> | ||
</ItemGroup> | ||
|
||
<ItemGroup> | ||
<PackageReference Include="Microsoft.ML" Version="1.7.1" /> | ||
<PackageReference Include="Microsoft.ML.ImageAnalytics" Version="1.7.1" /> | ||
<PackageReference Include="Microsoft.ML.OnnxTransformer" Version="1.7.1" /> | ||
</ItemGroup> | ||
|
||
<ItemGroup> | ||
<None Update="Model\labels.txt"> | ||
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory> | ||
</None> | ||
<None Update="Model\model.onnx"> | ||
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory> | ||
</None> | ||
<None Update="test\stop-sign-multiple-test.jpg"> | ||
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory> | ||
</None> | ||
<None Update="test\stop-sign-test.jpg"> | ||
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory> | ||
</None> | ||
</ItemGroup> | ||
|
||
</Project> |
17 changes: 17 additions & 0 deletions
17
...les/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/StopSignInput.cs
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
using Microsoft.ML.Transforms.Image; | ||
using System.Drawing; | ||
|
||
namespace StopSignDetection_ONNX | ||
{ | ||
public struct ImageSettings | ||
{ | ||
public const int imageHeight = 320; | ||
public const int imageWidth = 320; | ||
} | ||
|
||
public class StopSignInput | ||
{ | ||
[ImageType(ImageSettings.imageHeight, ImageSettings.imageWidth)] | ||
public Bitmap Image { get; set; } | ||
} | ||
} |
16 changes: 16 additions & 0 deletions
16
...sharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/StopSignPrediction.cs
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
using Microsoft.ML.Data; | ||
|
||
namespace StopSignDetection_ONNX | ||
{ | ||
public class StopSignPrediction | ||
{ | ||
[ColumnName("detected_classes")] | ||
public long[] PredictedLabels { get; set; } | ||
|
||
[ColumnName("detected_boxes")] | ||
public float[] BoundingBoxes { get; set; } | ||
|
||
[ColumnName("detected_scores")] | ||
public float[] Scores { get; set; } | ||
} | ||
} |
Binary file added
BIN
+485 KB
.../StopSignDetection_ONNX/StopSignDetection_ONNX/test/stop-sign-multiple-test.jpg
Binary file added
BIN
+3.51 MB
...-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/test/stop-sign-test.jpg
Binary file added
BIN
+209 KB
...sharp/end-to-end-apps/StopSignDetection_ONNX/assets/object-detection-output.jpg
Binary file added
BIN
+58.1 KB
samples/csharp/end-to-end-apps/StopSignDetection_ONNX/assets/onnx-input.jpg