Configure automated machine learning experiments in Python


In this guide, learn how to define various configuration settings of your automated machine learning experiments with the Azure Machine Learning SDK. Automated machine learning picks an algorithm and hyperparameters for you and generates a model ready for deployment. There are several options that you can use to configure automated machine learning experiments.

To view examples of automated machine learning experiments, see Tutorial: Train a classification model with automated machine learning or Train models with automated machine learning in the cloud.


Configuration options available in automated machine learning:

  • Select your experiment type: Classification, Regression, or Time Series Forecasting
  • Data source, formats, and fetch data
  • Choose your compute target: local or remote
  • Automated machine learning experiment settings
  • Run an automated machine learning experiment
  • Explore model metrics
  • Register and deploy model

If you prefer a no-code experience, you can also create your automated machine learning experiments in Azure Machine Learning studio.

Prerequisites

For this article, you need:

  • An Azure Machine Learning workspace. To create the workspace, see Create an Azure Machine Learning workspace.

  • The Azure Machine Learning Python SDK installed. To install the SDK, you can either:

    • Create a compute instance, which automatically installs the SDK and is preconfigured for ML workflows. See Create and manage an Azure Machine Learning compute instance for more information.

    • Install the automl package yourself, which includes the default installation of the SDK.

Select your experiment type

Before you begin your experiment, you should determine the kind of machine learning problem you are solving. Automated machine learning supports task types of classification, regression, and forecasting. Learn more about task types.

The following code uses the task parameter in the AutoMLConfig constructor to specify the experiment type as classification.
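A minimal sketch of that configuration, assuming a placeholder TabularDataset named training_data and a placeholder target column named label, might look like this:

```python
from azureml.train.automl import AutoMLConfig

# Minimal sketch: training_data and "label" are placeholders for your own
# TabularDataset and target column.
automl_config = AutoMLConfig(
    task="classification",
    training_data=training_data,
    label_column_name="label",
)
```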

Data source and format

Automated machine learning supports data that resides on your local desktop or in the cloud such as Azure Blob Storage. The data can be read into a Pandas DataFrame or an Azure Machine Learning TabularDataset. Learn more about datasets.

Requirements for training data in machine learning:

  • Data must be in tabular form.
  • The value to predict, target column, must be in the data.

For remote experiments, training data must be accessible from the remote compute. AutoML only accepts Azure Machine Learning TabularDatasets when working on a remote compute.

Azure Machine Learning datasets expose functionality to:

  • Easily transfer data from static files or URL sources into your workspace.
  • Make your data available to training scripts when running on cloud compute resources. See How to train with datasets for an example of using the Dataset class to mount data to your remote compute target.

The following code creates a TabularDataset from a web URL. See Create a TabularDataset for code examples on how to create datasets from other sources like local files and datastores.
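A rough sketch of that pattern, using an illustrative placeholder URL, might look like this:

```python
from azureml.core import Dataset

# Placeholder URL; substitute the location of your own delimited data.
data_url = "https://example.com/data/train.csv"
training_data = Dataset.Tabular.from_delimited_files(path=data_url)
```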

For local compute experiments, we recommend pandas dataframes for faster processing times.

Training, validation, and test data

You can specify separate training data and validation data sets directly in the AutoMLConfig constructor. Learn more about how to configure data splits and cross validation for your AutoML experiments.
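As a sketch, assuming placeholder TabularDatasets train_ds and valid_ds, the two options look like this:

```python
from azureml.train.automl import AutoMLConfig

# Option 1: provide an explicit validation set.
automl_config = AutoMLConfig(
    task="classification",
    training_data=train_ds,
    validation_data=valid_ds,
    label_column_name="label",
)

# Option 2: let automated ML cross-validate instead of holding out a validation set.
automl_config = AutoMLConfig(
    task="classification",
    training_data=train_ds,
    n_cross_validations=5,
    label_column_name="label",
)
```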

If you do not explicitly specify a validation_data or n_cross_validations parameter, automated ML applies default techniques to determine how validation is performed. This determination depends on the number of rows in the dataset assigned to your training_data parameter.

Training data size | Validation technique
Larger than 20,000 rows | Train/validation data split is applied. The default is to take 10% of the initial training data set as the validation set. In turn, that validation set is used for metrics calculation.
Smaller than 20,000 rows | Cross-validation approach is applied. The default number of folds depends on the number of rows. If the dataset is less than 1,000 rows, 10 folds are used. If the rows are between 1,000 and 20,000, then three folds are used.

At this time, you need to provide your own test data for model evaluation. For a code example of bringing your own test data for model evaluation see the Test section of this Jupyter notebook.

Compute to run experiment

Next, determine where the model will be trained. An automated machine learning training experiment can run on the following compute options. Learn the pros and cons of local and remote compute options.

  • Your local machine such as a local desktop or laptop – Generally when you have a small dataset and you are still in the exploration stage. See this notebook for a local compute example.

  • A remote machine in the cloud – Azure Machine Learning Managed Compute is a managed service that enables you to train machine learning models on clusters of Azure virtual machines.

    See this notebook for a remote example using Azure Machine Learning Managed Compute.

  • An Azure Databricks cluster in your Azure subscription. You can find more details in Set up an Azure Databricks cluster for automated ML. See this GitHub site for examples of notebooks with Azure Databricks.

Configure your experiment settings

There are several options that you can use to configure your automated machine learning experiment. These parameters are set by instantiating an AutoMLConfig object. See the AutoMLConfig class for a full list of parameters.

Some examples include:

  1. Classification experiment using AUC_weighted as the primary metric, an experiment timeout of 30 minutes, and 2 cross-validation folds (see the configuration sketches after this list).

  2. The following example is a regression experiment set to end after 60 minutes with five cross-validation folds.

  3. Forecasting tasks require extra setup, see the Autotrain a time-series forecast model article for more details.
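The following sketches correspond to examples 1 and 2 above; the datasets and column names are placeholders, not part of the original article.

```python
from azureml.train.automl import AutoMLConfig

# Example 1 sketch: classification optimizing AUC_weighted, 30-minute timeout,
# 2 cross-validation folds.
classification_config = AutoMLConfig(
    task="classification",
    primary_metric="AUC_weighted",
    experiment_timeout_minutes=30,
    n_cross_validations=2,
    training_data=train_ds,
    label_column_name="label",
)

# Example 2 sketch: regression ending after 60 minutes with 5 cross-validation folds.
regression_config = AutoMLConfig(
    task="regression",
    experiment_timeout_minutes=60,
    n_cross_validations=5,
    training_data=train_ds,
    label_column_name="target",
)
```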

Supported models

Automated machine learning tries different models and algorithms during the automation and tuning process. As a user, there is no need for you to specify the algorithm.

The three different task parameter values determine the list of algorithms, or models, to apply. Use the allowed_models or blocked_models parameters to further modify iterations with the available models to include or exclude.
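For instance, a sketch of excluding (or restricting to) particular algorithms could look like the following; the model names shown are illustrative, and the dataset and label column are placeholders.

```python
from azureml.train.automl import AutoMLConfig

automl_config = AutoMLConfig(
    task="classification",
    training_data=train_ds,
    label_column_name="label",
    blocked_models=["KNN", "LinearSVM"],
    # or restrict the search instead:
    # allowed_models=["LightGBM", "XGBoostClassifier"],
)
```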

The following table summarizes the supported models by task type.

Note

If you plan to export your auto ML created models to an ONNX model, only those algorithms indicated with an * are able to be converted to the ONNX format. Learn more about converting models to ONNX.
Also note, ONNX only supports classification and regression tasks at this time.

Classification | Regression | Time Series Forecasting
Logistic Regression* | Elastic Net* | Elastic Net
Light GBM* | Light GBM* | Light GBM
Gradient Boosting* | Gradient Boosting* | Gradient Boosting
Decision Tree* | Decision Tree* | Decision Tree
K Nearest Neighbors* | K Nearest Neighbors* | K Nearest Neighbors
Linear SVC* | LARS Lasso* | LARS Lasso
Support Vector Classification (SVC)* | Stochastic Gradient Descent (SGD)* | Stochastic Gradient Descent (SGD)
Random Forest* | Random Forest* | Random Forest
Extremely Randomized Trees* | Extremely Randomized Trees* | Extremely Randomized Trees
Xgboost* | Xgboost* | Xgboost
Averaged Perceptron Classifier | Online Gradient Descent Regressor | Auto-ARIMA
Naive Bayes* | Fast Linear Regressor | Prophet
Stochastic Gradient Descent (SGD)* |  | ForecastTCN
Linear SVM Classifier* |  | 

Primary Metric

The primary metric parameter determines the metric to be used during model training for optimization. The available metrics you can select are determined by the task type you choose, and the following table shows valid primary metrics for each task type.

Choosing a primary metric for automated machine learning to optimize depends on many factors. We recommend your primary consideration be to choose a metric which best represents your business needs. Then consider if the metric is suitable for your dataset profile (data size, range, class distribution, etc.).

Learn about the specific definitions of these metrics in Understand automated machine learning results.

Classification | Regression | Time Series Forecasting
accuracy | spearman_correlation | spearman_correlation
AUC_weighted | normalized_root_mean_squared_error | normalized_root_mean_squared_error
average_precision_score_weighted | r2_score | r2_score
norm_macro_recall | normalized_mean_absolute_error | normalized_mean_absolute_error
precision_score_weighted |  | 

Primary metrics for classification scenarios

Post-thresholded metrics, like accuracy, average_precision_score_weighted, norm_macro_recall, and precision_score_weighted, may not optimize as well for datasets that are very small, have very large class skew (class imbalance), or when the expected metric value is very close to 0.0 or 1.0. In those cases, AUC_weighted can be a better choice for the primary metric. After automated machine learning completes, you can choose the winning model based on the metric best suited to your business needs.

Metric | Example use case(s)
accuracy | Image classification, Sentiment analysis, Churn prediction
AUC_weighted | Fraud detection, Image classification, Anomaly detection/spam detection
average_precision_score_weighted | Sentiment analysis
norm_macro_recall | Churn prediction
precision_score_weighted | 

Primary metrics for regression scenarios

Metrics like r2_score and spearman_correlation can better represent the quality of the model when the scale of the value to predict covers many orders of magnitude. For instance, salary estimation, where many people have a salary of $20k to $100k, but the scale goes very high with some salaries in the $100M range.

normalized_mean_absolute_error and normalized_root_mean_squared_error would, in this case, treat a $20k prediction error the same for a worker with a $30k salary as for a worker making $20M. In reality, predicting only $20k off from a $20M salary is very close (a small 0.1% relative difference), whereas $20k off from $30k is not close (a large 67% relative difference). normalized_mean_absolute_error and normalized_root_mean_squared_error are useful when the values to predict are in a similar scale.

Metric | Example use case(s)
spearman_correlation | 
normalized_root_mean_squared_error | Price prediction (house/product/tip), Review score prediction
r2_score | Airline delay, Salary estimation, Bug resolution time
normalized_mean_absolute_error | 

Primary metrics for time series forecasting scenarios

See regression notes, above.

Metric | Example use case(s)
spearman_correlation | 
normalized_root_mean_squared_error | Price prediction (forecasting), Inventory optimization, Demand forecasting
r2_score | Price prediction (forecasting), Inventory optimization, Demand forecasting
normalized_mean_absolute_error | 

Data featurization

In every automated machine learning experiment, your data is automatically scaled and normalized to help certain algorithms that are sensitive to features that are on different scales. This scaling and normalization is referred to as featurization. See Featurization in AutoML for more detail and code examples.

When configuring your experiments in your AutoMLConfig object, you can enable/disable the setting featurization. The following table shows the accepted settings for featurization in the AutoMLConfig object.

Featurization Configuration | Description
'featurization': 'auto' | Indicates that, as part of preprocessing, data guardrails and featurization steps are performed automatically. Default setting.
'featurization': 'off' | Indicates featurization step shouldn't be done automatically.
'featurization': 'FeaturizationConfig' | Indicates customized featurization step should be used. Learn how to customize featurization.
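As a sketch of the third option, a customized FeaturizationConfig can be passed directly to AutoMLConfig; the column name, dataset, and label column below are placeholders.

```python
from azureml.automl.core.featurization import FeaturizationConfig
from azureml.train.automl import AutoMLConfig

featurization_config = FeaturizationConfig()
# Treat a placeholder column as numeric rather than letting automated ML infer its type.
featurization_config.add_column_purpose("some_column", "Numeric")

automl_config = AutoMLConfig(
    task="classification",
    training_data=train_ds,
    label_column_name="label",
    featurization=featurization_config,  # or "auto" (default) / "off"
)
```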

Note

Automated machine learning featurization steps (feature normalization, handling missing data, converting text to numeric, etc.) become part of the underlying model. When using the model for predictions, the same featurization steps applied during training are applied to your input data automatically.

Ensemble configuration

Ensemble models are enabled by default, and appear as the final run iterations in an AutoML run. Currently VotingEnsemble and StackEnsemble are supported.

Voting implements soft-voting, which uses weighted averages. Stacking uses a two-layer implementation: the first layer has the same models as the voting ensemble, and the second-layer model is used to find the optimal combination of the models from the first layer.

If you are using ONNX models, or have model-explainability enabled, stacking is disabled and only voting is utilized.

Ensemble training can be disabled by using the enable_voting_ensemble and enable_stack_ensemble boolean parameters.

To alter the default ensemble behavior, there are multiple default arguments that can be provided as kwargs in an AutoMLConfig object.

Important

The following parameters aren't explicit parameters of the AutoMLConfig class.

  • ensemble_download_models_timeout_sec: During VotingEnsemble and StackEnsemble model generation, multiple fitted models from the previous child runs are downloaded. If you encounter this error: AutoMLEnsembleException: Could not find any models for running ensembling, then you may need to provide more time for the models to be downloaded. The default value is 300 seconds for downloading these models in parallel and there is no maximum timeout limit. Configure this parameter with a higher value than 300 secs, if more time is needed.

    Note

    If the timeout is reached and there are models downloaded, the ensembling proceeds with as many models as it has downloaded. It's not required that all the models finish downloading within that timeout.

The following parameters only apply to StackEnsemble models:

  • stack_meta_learner_type: the meta-learner is a model trained on the output of the individual heterogeneous models. Default meta-learners are LogisticRegression for classification tasks (or LogisticRegressionCV if cross-validation is enabled) and ElasticNet for regression/forecasting tasks (or ElasticNetCV if cross-validation is enabled). This parameter can be one of the following strings: LogisticRegression, LogisticRegressionCV, LightGBMClassifier, ElasticNet, ElasticNetCV, LightGBMRegressor, or LinearRegression.

  • stack_meta_learner_train_percentage: specifies the proportion of the training set (when choosing train and validation type of training) to be reserved for training the meta-learner. Default value is 0.2.

  • stack_meta_learner_kwargs: optional parameters to pass to the initializer of the meta-learner. These parameters and parameter types mirror the parameters and parameter types from the corresponding model constructor, and are forwarded to the model constructor.

The following code shows an example of specifying custom ensemble behavior in an AutoMLConfig object.
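A sketch of such a configuration, using the kwargs described above (the dataset and label column are placeholders), might look like this:

```python
from azureml.train.automl import AutoMLConfig

automl_config = AutoMLConfig(
    task="regression",
    training_data=train_ds,
    label_column_name="target",
    ensemble_download_models_timeout_sec=600,   # allow more time to download fitted models
    stack_meta_learner_type="ElasticNet",
    stack_meta_learner_train_percentage=0.3,
)
```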

Exit criteria

There are a few options you can define in your AutoMLConfig to end your experiment.

Criteria | Description
No criteria | If you do not define any exit parameters, the experiment continues until no further progress is made on your primary metric.
After a length of time | Use experiment_timeout_minutes in your settings to define how long, in minutes, your experiment should continue to run. To help avoid experiment timeout failures, there is a minimum of 15 minutes, or 60 minutes if your row-by-column size exceeds 10 million.
A score has been reached | Use experiment_exit_score to complete the experiment after a specified primary metric score has been reached.
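For example, a sketch combining a time limit with an exit score (the dataset, label column, and score are placeholder values) might look like this:

```python
from azureml.train.automl import AutoMLConfig

automl_config = AutoMLConfig(
    task="classification",
    primary_metric="AUC_weighted",
    experiment_timeout_minutes=30,   # stop after 30 minutes
    experiment_exit_score=0.95,      # or stop once AUC_weighted reaches 0.95
    training_data=train_ds,
    label_column_name="label",
)
```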

Run experiment

For automated ML, you create an Experiment object, which is a named object in a Workspace used to run experiments.

Submit the experiment to run and generate a model. Pass the AutoMLConfig to the submit method to generate the model.
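A sketch of creating the experiment and submitting the configuration, assuming a workspace config file is present and using a placeholder experiment name:

```python
from azureml.core import Experiment, Workspace

ws = Workspace.from_config()                    # reads config.json for the workspace
experiment = Experiment(ws, "automl-example")   # placeholder experiment name
run = experiment.submit(automl_config, show_output=True)
```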

Note

Dependencies are first installed on a new machine. It may take up to 10 minutes before output is shown. Setting show_output to True results in output being shown on the console.

Multiple child runs on clusters

Automated ML experiment child runs can be performed on a cluster that is already running another experiment. However, the timing depends on how many nodes the cluster has, and if those nodes are available to run a different experiment.

Each node in the cluster acts as an individual virtual machine (VM) that can accomplish a single training run; for automated ML this means a child run. If all the nodes are busy, the new experiment is queued. But if there are free nodes, the new experiment will run automated ML child runs in parallel in the available nodes/VMs.

To help manage child runs and when they can be performed, we recommend you create a dedicated cluster per experiment, and match the number of max_concurrent_iterations of your experiment to the number of nodes in the cluster. This way, you use all the nodes of the cluster at the same time with the number of concurrent child runs/iterations you want.

Configure max_concurrent_iterations in your AutoMLConfig object. If it is not configured, then by default only one concurrent child run/iteration is allowed per experiment.
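As a sketch, assuming a dedicated four-node AmlCompute cluster referenced by a placeholder compute_target variable:

```python
from azureml.train.automl import AutoMLConfig

automl_config = AutoMLConfig(
    task="classification",
    training_data=train_ds,
    label_column_name="label",
    compute_target=compute_target,    # a dedicated AmlCompute cluster with 4 nodes
    max_concurrent_iterations=4,      # matches the node count of the cluster
)
```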

Explore models and metrics

You can view your training results in a widget or inline if you are in a notebook. See Track and evaluate models for more details.

See Evaluate automated machine learning experiment results for definitions and examples of the performance charts and metrics provided for each run.

To get a featurization summary and understand what features were added to a particular model, see Featurization transparency.

Note

The algorithms automated ML employs have inherent randomness that can cause slight variation in a recommended model's final metrics score, like accuracy. Automated ML also performs operations on data such as train-test split, train-validation split, or cross-validation when necessary. So if you run an experiment with the same configuration settings and primary metric multiple times, you'll likely see variation in each experiment's final metrics score due to these factors.

Register and deploy models

For details on how to download or register a model for deployment to a web service, see how and where to deploy a model.

Model interpretability

Model interpretability allows you to understand why your models made predictions, and the underlying feature importance values. The SDK includes various packages for enabling model interpretability features, both at training and inference time, for local and deployed models.

See the how-to for code samples on how to enable interpretability features specifically within automated machine learning experiments.

For general information on how model explanations and feature importance can be enabled in other areas of the SDK outside of automated machine learning, see the concept article on interpretability.

Note

The ForecastTCN model is not currently supported by the Explanation Client. This model will not return an explanation dashboard if it is returned as the best model, and does not support on-demand explanation runs.

Troubleshooting

  • Recent upgrade of AutoML dependencies to newer versions breaks compatibility: As of version 1.13.0 of the SDK, models won't be loaded in older SDKs due to incompatibility between the older versions pinned in our previous packages and the newer versions we pin now. You will see errors such as:

    • Module not found errors, for example: No module named 'sklearn.decomposition._truncated_svd'
    • Import errors, for example: ImportError: cannot import name 'RollingOriginValidator'
    • Attribute errors, for example: AttributeError: 'SimpleImputer' object has no attribute 'add_indicator'

    To work around this issue, take either of the following two steps depending on your AutoML SDK training version:

    • If your AutoML SDK training version is greater than 1.13.0, you need pandas 0.25.1 and scikit-learn 0.22.1. If there is a version mismatch, upgrade scikit-learn and/or pandas to the correct versions.

    • If your AutoML SDK training version is less than or equal to 1.12.0, you need pandas 0.23.4 and scikit-learn 0.20.3. If there is a version mismatch, downgrade scikit-learn and/or pandas to the correct versions.

  • Failed deployment: For versions <= 1.18.0 of the SDK, the base image created for deployment may fail with the following error: 'ImportError: cannot import name cached_property from werkzeug'.

    The following steps can work around the issue:

    1. Download the model package
    2. Unzip the package
    3. Deploy using the unzipped assets
  • Forecasting R2 score is always zero: This issue arises if the training data provided has time series that contains the same value for the last n_cv_splits + forecasting_horizon data points. If this pattern is expected in your time series, you can switch your primary metric to normalized root mean squared error.

  • TensorFlow: As of version 1.5.0 of the SDK, automated machine learning does not install TensorFlow models by default. To install TensorFlow and use it with your automated ML experiments, install tensorflow==1.12.0 via CondaDependencies.

  • Experiment Charts: Binary classification charts (precision-recall, ROC, gain curve, etc.) shown in automated ML experiment iterations have not been rendering correctly in the user interface since 4/12. Chart plots currently show inverse results, where better performing models are shown with lower results. A resolution is under investigation.

  • Databricks cancel an automated machine learning run: When you use automated machine learning capabilities on Azure Databricks, to cancel a run and start a new experiment run, restart your Azure Databricks cluster.

  • Databricks >10 iterations for automated machine learning: In automated machine learning settings, if you have more than 10 iterations, set show_output to False when you submit the run.

  • Databricks widget for the Azure Machine Learning SDK and automated machine learning: The Azure Machine Learning SDK widget isn't supported in a Databricks notebook because the notebooks can't parse HTML widgets. You can view the run in the portal by using Python code in your Azure Databricks notebook cell.

  • automl_setup fails:

    • On Windows, run automl_setup from an Anaconda Prompt. Use this link to install Miniconda.
    • Ensure that conda 64-bit is installed, rather than 32-bit by running the conda info command. The platform should be win-64 for Windows or osx-64 for Mac.
    • Ensure that conda 4.4.10 or later is installed. You can check the version with the command conda -V. If you have a previous version installed, you can update it by using the command: conda update conda.
    • Linux - gcc: error trying to exec 'cc1plus'
      • If the gcc: error trying to exec 'cc1plus': execvp: No such file or directory error is encountered, install build essentials using the command sudo apt-get install build-essential.
      • Pass a new name as the first parameter to automl_setup to create a new conda environment. View existing conda environments using conda env list and remove them with conda env remove -n .
  • automl_setup_linux.sh fails: If automl_setup_linux.sh fails on Ubuntu Linux with the error unable to execute 'gcc': No such file or directory:

    1. Make sure that outbound ports 53 and 80 are enabled. On an Azure VM, you can do this from the Azure portal by selecting the VM and clicking on Networking.
    2. Run the command: sudo apt-get update
    3. Run the command: sudo apt-get install build-essential --fix-missing
    4. Run automl_setup_linux.sh again
  • configuration.ipynb fails:

    • For local conda, first ensure that automl_setup has successfully run.
    • Ensure that the subscription_id is correct. Find the subscription_id in the Azure portal by selecting All Services and then Subscriptions. The characters '<' and '>' should not be included in the subscription_id value. For example, subscription_id = '12345678-90ab-1234-5678-1234567890abcd' has the valid format.
    • Ensure Contributor or Owner access to the Subscription.
    • Check that the region is one of the supported regions: eastus2, eastus, westcentralus, southeastasia, westeurope, australiaeast, westus2, southcentralus.
    • Ensure access to the region using the Azure portal.
  • import AutoMLConfig fails: There were package changes in automated machine learning version 1.0.76, which require the previous version to be uninstalled before updating to the new version. If ImportError: cannot import name AutoMLConfig is encountered after upgrading from an SDK version before v1.0.76 to v1.0.76 or later, resolve the error by running pip uninstall azureml-train-automl and then pip install azureml-train-automl. The automl_setup.cmd script does this automatically.

  • workspace.from_config fails: If the call ws = Workspace.from_config() fails:

    1. Ensure that the configuration.ipynb notebook has run successfully.
    2. If the notebook is being run from a folder that is not under the folder where the configuration.ipynb was run, copy the folder aml_config and the file config.json that it contains to the new folder. Workspace.from_config reads the config.json for the notebook folder or its parent folder.
    3. If a new subscription, resource group, workspace, or region, is being used, make sure that you run the configuration.ipynb notebook again. Changing config.json directly will only work if the workspace already exists in the specified resource group under the specified subscription.
    4. If you want to change the region, change the workspace, resource group, or subscription. Workspace.create will not create or update a workspace if it already exists, even if the region specified is different.
  • Sample notebook fails: If a sample notebook fails with an error that property, method, or library does not exist:

    • Ensure that the correct kernel has been selected in the Jupyter Notebook. The kernel is displayed in the top right of the notebook page. The default is azure_automl. The kernel is saved as part of the notebook. So, if you switch to a new conda environment, you will have to select the new kernel in the notebook.
      • For Azure Notebooks, it should be Python 3.6.
      • For local conda environments, it should be the conda environment name that you specified in automl_setup.
    • Ensure the notebook is for the SDK version that you are using. You can check the SDK version by executing azureml.core.VERSION in a Jupyter Notebook cell. You can download previous version of the sample notebooks from GitHub by clicking the Branch button, selecting the Tags tab and then selecting the version.
  • import numpy fails in Windows: Some Windows environments see an error loading numpy with the latest Python version 3.6.8. If you see this issue, try with Python version 3.6.7.

  • import numpy fails: Check the TensorFlow version in the automated ml conda environment. Supported versions are < 1.13. Uninstall TensorFlow from the environment if version is >= 1.13. You may check the version of TensorFlow and uninstall as follows:

    1. Start a command shell, activate conda environment where automated ml packages are installed.
    2. Enter pip freeze and look for tensorflow, if found, the version listed should be < 1.13
    3. If the listed version is not a supported version, pip uninstall tensorflow in the command shell and enter y for confirmation.
  • Run fails with jwt.exceptions.DecodeError: Exact error message: jwt.exceptions.DecodeError: It is required that you pass in a value for the 'algorithms' argument when calling decode().

    For versions <= 1.17.0 of the SDK, installation might result in an unsupported version of PyJWT. Check PyJWT version in the automated ml conda environment. Supported versions are < 2.0.0. You may check the version of PyJWT as follows:

    1. Start a command shell, activate conda environment where automated ml packages are installed.
    2. Enter pip freeze and look for PyJWT, if found, the version listed should be < 2.0.0

    If the listed version is not a supported version:

    1. Consider upgrading to the latest version of AutoML SDK: pip install -U azureml-sdk[automl].
    2. If that is not viable, uninstall PyJWT from the environment and install the right version as follows:
      • pip uninstall PyJWT in the command shell and enter y for confirmation.
      • Install using pip install 'PyJWT<2.0.0'.

Next steps

  • Learn more about how and where to deploy a model.

  • Learn more about how to train a regression model with Automated machine learning or how to train using Automated machine learning on a remote resource.

  • Learn how to train multiple models with AutoML in the Many Models Solution Accelerator.

This section provides background information on the automatic memory management feature of Oracle Database, and includes instructions for enabling this feature. The following topics are covered:

About Automatic Memory Management

The simplest way to manage instance memory is to allow the Oracle Database instance to automatically manage and tune it for you. To do so (on most platforms), you set only a target memory size initialization parameter (MEMORY_TARGET) and optionally a maximum memory size initialization parameter (MEMORY_MAX_TARGET). The instance then tunes to the target memory size, redistributing memory as needed between the system global area (SGA) and the instance program global area (instance PGA). Because the target memory initialization parameter is dynamic, you can change the target memory size at any time without restarting the database. The maximum memory size serves as an upper limit so that you cannot accidentally set the target memory size too high, and so that enough memory is set aside for the Oracle Database instance in case you do want to increase total instance memory in the future. Because certain SGA components either cannot easily shrink or must remain at a minimum size, the instance also prevents you from setting the target memory size too low.

If you create your database with Database Configuration Assistant (DBCA) and choose the basic installation option, automatic memory management is enabled. If you choose advanced installation, Database Configuration Assistant (DBCA) enables you to select automatic memory management.

Note:

You cannot enable automatic memory management if the LOCK_SGA initialization parameter is TRUE. See Oracle Database Reference for information about this parameter.

Enabling Automatic Memory Management

If you did not enable automatic memory management upon database creation (either by selecting the proper options in DBCA or by setting the appropriate initialization parameters for the CREATE DATABASE SQL statement), you can enable it at a later time. Enabling automatic memory management involves a shutdown and restart of the database.

To enable automatic memory management

  1. Start SQL*Plus and connect to the database as SYSDBA.

    See 'About Database Administrator Security and Privileges' and 'Database Administrator Authentication' for instructions.

  2. Calculate the minimum value for MEMORY_TARGET as follows:

    1. Determine the current sizes of SGA_TARGET and PGA_AGGREGATE_TARGET by entering the following SQL*Plus command:

      SQL*Plus displays the values of all initialization parameters with the string TARGET in the parameter name.

    2. Run the following query to determine the maximum instance PGA allocated since the database was started:

    3. Compute the maximum value between the query result from step 2b and PGA_AGGREGATE_TARGET. Add SGA_TARGET to this value.

    For example, if SGA_TARGET is 272M and PGA_AGGREGATE_TARGET is 90M as shown above, and if the maximum PGA allocated is determined to be 120M, then MEMORY_TARGET should be at least 392M (272M + 120M).

  3. Choose the value for MEMORY_TARGET that you want to use.

    This can be the minimum value that you computed in step 2, or you can choose to use a larger value if you have enough physical memory available.

  4. For the MEMORY_MAX_TARGET initialization parameter, decide on a maximum amount of memory that you would want to allocate to the database for the foreseeable future. That is, determine the maximum value for the sum of the SGA and instance PGA sizes. This number can be larger than or the same as the MEMORY_TARGET value that you chose in the previous step.

  5. Do one of the following:

    • If you started your Oracle Database instance with a server parameter file, which is the default if you created the database with the Database Configuration Assistant (DBCA), enter the following command:

      where n is the value that you computed in Step 4.

      The SCOPE=SPFILE clause sets the value only in the server parameter file, and not for the running instance. You must include this SCOPE clause because MEMORY_MAX_TARGET is not a dynamic initialization parameter.

    • If you started your instance with a text initialization parameter file, manually edit the file so that it contains the following statements:

      where n is the value that you determined in Step 4, and m is the value that you determined in step 3.

    Note:

    In a text initialization parameter file, if you omit the line for MEMORY_MAX_TARGET and include a value for MEMORY_TARGET, the database automatically sets MEMORY_MAX_TARGET to the value of MEMORY_TARGET. If you omit the line for MEMORY_TARGET and include a value for MEMORY_MAX_TARGET, the MEMORY_TARGET parameter defaults to zero. After startup, you can then dynamically change MEMORY_TARGET to a nonzero value, provided that it does not exceed the value of MEMORY_MAX_TARGET.
  6. Shut down and restart the database.

    See Chapter 3, 'Starting Up and Shutting Down' for instructions.

  7. If you started your Oracle Database instance with a server parameter file, enter the following commands:

    where n is the value that you determined in step 3.

Note:

The preceding steps instruct you to set SGA_TARGET and PGA_AGGREGATE_TARGET to zero so that the sizes of the SGA and instance PGA are tuned up and down as required, without restrictions. You can omit the statements that set these parameter values to zero and leave either or both of the values as positive numbers. In this case, the values act as minimum values for the sizes of the SGA or instance PGA.

See Also:

  • Oracle Database SQL Language Reference for information on the ALTER SYSTEM SQL statement

Monitoring and Tuning Automatic Memory Management

The dynamic performance view V$MEMORY_DYNAMIC_COMPONENTS shows the current sizes of all dynamically tuned memory components, including the total sizes of the SGA and instance PGA.

The view V$MEMORY_TARGET_ADVICE provides tuning advice for the MEMORY_TARGET initialization parameter.

The row with the MEMORY_SIZE_FACTOR of 1 shows the current size of memory, as set by the MEMORY_TARGET initialization parameter, and the amount of DB time required to complete the current workload. In previous and subsequent rows, the results show a number of alternative MEMORY_TARGET sizes. For each alternative size, the database shows the size factor (the multiple of the current size), and the estimated DB time to complete the current workload if the MEMORY_TARGET parameter were changed to the alternative size. Notice that for a total memory size smaller than the current MEMORY_TARGET size, estimated DB time increases. Notice also that in this example, there is nothing to be gained by increasing total memory size beyond 450MB. However, this situation might change if a complete workload has not yet been run.

Enterprise Manager provides an easy-to-use graphical memory advisor to help you select an optimal size for MEMORY_TARGET. See Oracle Database 2 Day DBA for details.

See Also:

  • Oracle Database Reference for more information about these dynamic performance views

  • Oracle Database Performance Tuning Guide for a definition of DB time.




