ML.NET Workflow Summary | Proving Ground Apps

Introduction

LunchBoxML implements Microsoft’s ML.NET to provide Grasshopper users with the ability to train, save, and test Machine Learning models using a variety of algorithms including regression, binary classification, and multiclass classification.

This page describes the general workflow pattern to follow when using the ML.NET components:

Data Preparation

LunchBoxML includes a set of ‘Data Set’ components for preparing Grasshopper data for use in training and testing models. Data sets can be created within the Grasshopper definition or read in from external text files.

You can think of data sets as a way to organize and structure your data into a table with labeled fields. The labels and associated data are essential for using supervised learning algorithms such as regression and binary classification methods.

**Example:** A basic dataset preparation for organizing the X, Y, and Z coordinates of a point collection. Each coordinate ‘column’ receives a label ‘header’ for use in training.
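As a rough illustration of the idea outside Grasshopper, the labeled table can be sketched in plain Python. This is not the LunchBoxML API: the names `points`, `build_data_set`, and `to_csv_text` are hypothetical, and the comma-delimited layout is only an assumption about what an external text file might look like.

```python
import csv
import io

# Hypothetical point collection: (x, y, z) tuples standing in for Grasshopper points.
points = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (0.0, 1.0, 3.0)]

# One label ('header') per coordinate column.
headers = ["X", "Y", "Z"]


def build_data_set(points, headers):
    """Return the table as a list of rows, each mapping header -> value."""
    return [dict(zip(headers, point)) for point in points]


def to_csv_text(rows, headers):
    """Serialize the labeled rows as delimited text with the header line first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=headers, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


data_set = build_data_set(points, headers)
csv_text = to_csv_text(data_set, headers)
```

Each row carries its values under the same labels, which is what lets a trainer later pick out which columns are inputs and which column is the value to predict.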

Model Creation

After defining your data set, you can determine which kind of trainer suits your data and intended use:

Regression trainers are useful for analyzing relationships between numeric inputs and their output – think defining predictive ‘fit curves’ or other forms of numeric forecasting.
Binary classification trainers are useful in determining true/false or yes/no classification outputs given a set of inputs.
Multiclass classification trainers are useful in determining a variety of classifications given a set of inputs – like determining if a set of inputs should receive a particular name or identifier.
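One way to think about the choice is to look at the values in the column you want to predict. The sketch below is a rule of thumb in plain Python, not part of LunchBoxML or ML.NET; `suggest_task` is a hypothetical helper, and it assumes that numeric labels always mean regression (integer class IDs would need to be handled separately).

```python
def suggest_task(labels):
    """Suggest a trainer family from the values of the output column."""
    values = list(labels)
    if not values:
        raise ValueError("labels must not be empty")

    # True/False outputs map directly to binary classification.
    if all(isinstance(v, bool) for v in values):
        return "binary classification"

    # Numeric outputs (excluding bools, which are ints in Python) suggest regression.
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "regression"

    # Non-numeric outputs: two distinct names is still a yes/no style problem.
    if len(set(values)) <= 2:
        return "binary classification"

    return "multiclass classification"
```

For instance, a column of Z coordinates would come back as regression, while a column of names such as "wall", "floor", and "roof" would come back as multiclass classification.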

Trainer Type Selection

Trainer types refer to the specific algorithm that should be implemented by the Model trainer. The selection of your trainer type will depend greatly on the nuances of your specific model and the goal for your prediction.

Experiment with different trainer types for regression, binary classification, and multiclass classification to discover what best suits your use case.

**Example:** A basic regression example that implements a ‘Light GBM’ trainer type to train a model on X, Y input features and the resultant Z labels.
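To make the idea of training on X, Y features and Z labels concrete, here is a minimal sketch in plain Python. It deliberately swaps in an ordinary least-squares plane fit (z = a*x + b*y + c) rather than Light GBM, which is a gradient-boosted tree method and far more involved. The function names and the sample rows are assumptions made for illustration.

```python
def solve_linear(matrix, vector):
    """Solve matrix * x = vector by Gaussian elimination with partial pivoting."""
    n = len(vector)
    aug = [list(matrix[i]) + [vector[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: the points may be collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


def fit_plane(rows):
    """Least-squares fit of z = a*x + b*y + c from rows keyed by 'X', 'Y', 'Z'."""
    # Accumulate the normal equations (A^T A) w = A^T z for features [x, y, 1].
    ata = [[0.0] * 3 for _ in range(3)]
    atz = [0.0] * 3
    for row in rows:
        features = (row["X"], row["Y"], 1.0)
        for i in range(3):
            atz[i] += features[i] * row["Z"]
            for j in range(3):
                ata[i][j] += features[i] * features[j]
    return solve_linear(ata, atz)


# Hypothetical training rows that lie exactly on z = 2x + 3y + 1.
training_rows = [
    {"X": 0.0, "Y": 0.0, "Z": 1.0},
    {"X": 1.0, "Y": 0.0, "Z": 3.0},
    {"X": 0.0, "Y": 1.0, "Z": 4.0},
    {"X": 1.0, "Y": 1.0, "Z": 6.0},
    {"X": 2.0, "Y": 1.0, "Z": 8.0},
]
a, b, c = fit_plane(training_rows)
```

A plane can only capture a linear trend. Tree-based trainers such as Light GBM can follow curved or irregular surfaces, which is one reason to experiment with several trainer types.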

Model Testing

After you have trained your Model, you can then ‘test’ the model against new data. The predictive effectiveness of the model will depend greatly on both the quantity and quality of your model’s training data as well as the selection of an appropriate trainer type.

**Example:** A basic regression tester that uses a structured data set of X and Y coordinates to predict a Z coordinate value. The construction of the data set mirrors the structure of the data used in the trainer in prior examples.
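The testing step can be sketched in plain Python as follows. The `model` coefficients here are hypothetical stand-ins for a trained regression model, and `predict_z` and `rmse` are illustrative helpers rather than LunchBoxML components. The test rows reuse the same ‘X’ and ‘Y’ labels as the training data, which is the mirroring the example describes.

```python
import math

# Hypothetical trained model: coefficients of z = a*x + b*y + c.
model = {"a": 2.0, "b": 3.0, "c": 1.0}


def predict_z(model, row):
    """Predict Z for one row that carries the same 'X' and 'Y' labels as training."""
    return model["a"] * row["X"] + model["b"] * row["Y"] + model["c"]


def rmse(predicted, actual):
    """Root-mean-square error between predictions and known values."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(total / len(actual))


# New data the model has not seen; only the input columns are supplied.
test_rows = [{"X": 0.5, "Y": 0.5}, {"X": 2.0, "Y": 1.0}]
predictions = [predict_z(model, row) for row in test_rows]
```

When known Z values are available for the test rows, an error measure such as `rmse(predictions, known_z)` gives a single number for comparing trainer types against each other.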

**Example:** The visual output of the regression trainer and tester. The random collection of points represents the training inputs for X, Y, and Z. The surface represents the results of testing the regression model to predict a surface condition within the 3D collection of random points.
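A predicted surface like this is typically produced by sampling the model over a regular grid of X, Y values. The following plain-Python sketch shows that sampling step under stated assumptions: `predict_z` is a hypothetical stand-in for a trained regression model, and the ranges and grid density are arbitrary.

```python
def linspace(low, high, count):
    """Return `count` evenly spaced values from low to high inclusive."""
    if count < 2:
        raise ValueError("count must be at least 2")
    step = (high - low) / (count - 1)
    return [low + i * step for i in range(count)]


def grid_points(x_range, y_range, count):
    """Return count * count (x, y) pairs covering both ranges, row by row."""
    xs = linspace(x_range[0], x_range[1], count)
    ys = linspace(y_range[0], y_range[1], count)
    return [(x, y) for y in ys for x in xs]


def predict_z(x, y):
    """Hypothetical stand-in for a trained regression model."""
    return 2.0 * x + 3.0 * y + 1.0


# Sample the model over the extents of the training points to get surface points.
surface = [(x, y, predict_z(x, y)) for x, y in grid_points((0.0, 2.0), (0.0, 1.0), 3)]
```

The resulting (x, y, z) triples are ordered row by row, so they can be passed to any routine that builds a surface from a grid of points.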

Saving and Loading Models

After you have trained a model, that model can be reused in your Grasshopper definitions as a pre-trained model. This means that your definitions need not re-train your models upon each execution (which can be computationally expensive!).

By right-clicking on your Model trainer, you will see an option to save the trainer to a ZIP file.
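The save-once, reuse-later pattern can be sketched in plain Python. This is only a conceptual illustration: the archive layout below (a single `model.json` entry) is an assumption and does not match the format ML.NET writes, and `save_model` and `load_model` are hypothetical helpers.

```python
import json
import os
import tempfile
import zipfile


def save_model(model, path):
    """Write the model parameters into a ZIP archive as a single JSON entry."""
    with zipfile.ZipFile(path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("model.json", json.dumps(model))


def load_model(path):
    """Read the model parameters back from the ZIP archive."""
    with zipfile.ZipFile(path) as archive:
        return json.loads(archive.read("model.json"))


# Hypothetical trained model: coefficients of z = a*x + b*y + c.
model = {"a": 2.0, "b": 3.0, "c": 1.0}

# Save once after training, then reload without training again.
path = os.path.join(tempfile.mkdtemp(), "model.zip")
save_model(model, path)
restored = load_model(path)
```

Loading the archive is far cheaper than fitting the model again, which is the benefit of keeping a pre-trained model alongside a definition.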
