How To Apply Random Forest Algorithm In PMML Format
SkLearn2PMML
Python library for converting Scikit-Learn pipelines to PMML.
Features
This library is a thin wrapper around the JPMML-SkLearn command-line application.
For a list of supported Estimator and Transformer types, please refer to JPMML-SkLearn supported packages.
Prerequisites
- Python 2.7, 3.4 or newer.
- Java 1.8 or newer. The Java executable must be available on the system path.
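One way to confirm the Java prerequisite from Python is to look up the executable on the system path. The sketch below uses only the standard library. The helper names find_java and java_version_banner are illustrative and not part of SkLearn2PMML, and the code assumes Python 3.7 or newer (for the capture_output argument of subprocess.run).

```python
import shutil
import subprocess


def find_java():
    """Return the full path of the `java` executable on the system path, or None."""
    return shutil.which("java")


def java_version_banner(java_path):
    """Return the first line printed by `java -version` (Java writes it to stderr)."""
    result = subprocess.run(
        [java_path, "-version"],
        capture_output=True,
        text=True,
        check=False,
    )
    output = (result.stderr or result.stdout).strip()
    return output.splitlines()[0] if output else ""


if __name__ == "__main__":
    java_path = find_java()
    if java_path is None:
        print("No java executable found on the system path")
    else:
        print(java_path, "->", java_version_banner(java_path))
```

If find_java() returns None, the conversion step described under Usage is unlikely to work until Java is installed and added to the path.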
Installation
Installing a release version from PyPI:
pip install sklearn2pmml
Alternatively, installing the latest snapshot version from GitHub:
pip install --upgrade git+https://github.com/jpmml/sklearn2pmml.git
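To check which version ended up installed, the distribution metadata can be queried from Python. A minimal sketch, assuming Python 3.8 or newer (for importlib.metadata); installed_version is a hypothetical helper name, not something the library provides:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution="sklearn2pmml"):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("sklearn2pmml is not installed in this environment")
    else:
        print("sklearn2pmml", found)
```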
Usage
A typical workflow can be summarized as follows:
- Create a PMMLPipeline object, and populate it with pipeline steps as usual. Class sklearn2pmml.pipeline.PMMLPipeline extends class sklearn.pipeline.Pipeline with the following functionality:
  - If the PMMLPipeline.fit(X, y) method is invoked with a pandas.DataFrame or pandas.Series object as the X argument, then its column names are used as feature names. Otherwise, feature names default to "x1", "x2", .., "x{number_of_features}".
  - If the PMMLPipeline.fit(X, y) method is invoked with a pandas.Series object as the y argument, then its name is used as the target name (for supervised models). Otherwise, the target name defaults to "y".
- Fit and validate the pipeline as usual.
- Optionally, compute and embed verification data into the PMMLPipeline object by invoking the PMMLPipeline.verify(X) method with a small but representative subset of training data.
- Convert the PMMLPipeline object to a PMML file in the local filesystem by invoking the utility method sklearn2pmml.sklearn2pmml(pipeline, pmml_destination_path).
Developing a simple decision tree model for the classification of iris species:
import pandas

from sklearn.tree import DecisionTreeClassifier
from sklearn2pmml import sklearn2pmml
from sklearn2pmml.pipeline import PMMLPipeline

# Load the data; every column except "Species" is a feature
iris_df = pandas.read_csv("Iris.csv")
iris_X = iris_df[iris_df.columns.difference(["Species"])]
iris_y = iris_df["Species"]

pipeline = PMMLPipeline([
    ("classifier", DecisionTreeClassifier())
])
pipeline.fit(iris_X, iris_y)

# Convert the fitted pipeline to a PMML file
sklearn2pmml(pipeline, "DecisionTreeIris.pmml", with_repr = True)
Developing a more elaborate logistic regression model for the same:
import pandas

from sklearn_pandas import DataFrameMapper
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn2pmml import sklearn2pmml
from sklearn2pmml.decoration import ContinuousDomain
from sklearn2pmml.pipeline import PMMLPipeline

iris_df = pandas.read_csv("Iris.csv")
iris_X = iris_df[iris_df.columns.difference(["Species"])]
iris_y = iris_df["Species"]

pipeline = PMMLPipeline([
    # Declare the four measurements as continuous fields and impute missing values
    ("mapper", DataFrameMapper([
        (["Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"], [ContinuousDomain(), SimpleImputer()])
    ])),
    ("pca", PCA(n_components = 3)),
    ("selector", SelectKBest(k = 2)),
    ("classifier", LogisticRegression(multi_class = "ovr"))
])
pipeline.fit(iris_X, iris_y)

# Embed verification data computed from a small sample of the training data
pipeline.verify(iris_X.sample(n = 15))

sklearn2pmml(pipeline, "LogisticRegressionIris.pmml", with_repr = True)
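The same workflow should carry over to a random forest by using sklearn.ensemble.RandomForestClassifier as the final pipeline step. The sketch below is an illustration and not part of the original README: the function names, the output file name RandomForestIris.pmml and the hyperparameter values are assumptions, and Iris.csv is expected to have the same layout as in the examples above. The second helper reads a PMML file back using only the standard library and reports its data fields, its top-level model elements and the number of TreeModel elements it contains. PMML represents ensembles with the MiningModel element, so that is the element one would expect to find for a random forest.

```python
import xml.etree.ElementTree as ET

# Top-level PMML children that are not model elements
NON_MODEL_ELEMENTS = {
    "Header", "MiningBuildTask", "DataDictionary",
    "TransformationDictionary", "Extension",
}


def export_random_forest_iris(csv_path="Iris.csv", pmml_path="RandomForestIris.pmml"):
    """Fit a random forest on the Iris data and convert it to a PMML file."""
    # Third-party imports are local so that this module can still be imported
    # on a machine where the prerequisites are not installed yet.
    import pandas
    from sklearn.ensemble import RandomForestClassifier
    from sklearn2pmml import sklearn2pmml
    from sklearn2pmml.pipeline import PMMLPipeline

    iris_df = pandas.read_csv(csv_path)
    iris_X = iris_df[iris_df.columns.difference(["Species"])]
    iris_y = iris_df["Species"]

    # n_estimators and random_state are arbitrary illustrative values
    pipeline = PMMLPipeline([
        ("classifier", RandomForestClassifier(n_estimators=10, random_state=13))
    ])
    pipeline.fit(iris_X, iris_y)
    sklearn2pmml(pipeline, pmml_path, with_repr=True)
    return pipeline


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def summarize_pmml(pmml_path):
    """Return the data field names, top-level model elements and TreeModel count."""
    root = ET.parse(pmml_path).getroot()
    fields = []
    models = []
    for child in root:
        name = local_name(child.tag)
        if name == "DataDictionary":
            fields = [
                field.get("name")
                for field in child
                if local_name(field.tag) == "DataField"
            ]
        elif name not in NON_MODEL_ELEMENTS:
            models.append(name)
    tree_count = sum(1 for el in root.iter() if local_name(el.tag) == "TreeModel")
    return {"fields": fields, "models": models, "tree_count": tree_count}
```

With the prerequisites in place, calling export_random_forest_iris() and then summarize_pmml("RandomForestIris.pmml") gives a quick sanity check that the file was written and that the number of trees matches n_estimators. The helper ignores the XML namespace because the PMML schema version in the namespace URI depends on the converter release.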
Documentation
Up-to-date:
- Benchmarking Scikit-Learn against JPMML-Evaluator in Java and Python environments
- Extending Scikit-Learn with outlier detector transformer type
- Analyzing Scikit-Learn feature importances via PMML
- Training Scikit-Learn based TF(-IDF) plus XGBoost pipelines
- Converting Scikit-Learn based TF(-IDF) pipelines to PMML documents
- Converting Scikit-Learn based Imbalanced-Learn (imblearn) pipelines to PMML documents
- Extending Scikit-Learn with date and datetime features
- Extending Scikit-Learn with feature specifications
- Converting logistic regression models to PMML documents
- Stacking Scikit-Learn, LightGBM and XGBoost models
- Converting Scikit-Learn hyperparameter-tuned pipelines to PMML documents
- Extending Scikit-Learn with GBDT plus LR ensemble (GBDT+LR) model type
- Converting Scikit-Learn based TPOT automated machine learning (AutoML) pipelines to PMML documents
- Converting Scikit-Learn based LightGBM pipelines to PMML documents
- Extending Scikit-Learn with business rules (BR) model type
Slightly outdated:
- Converting Scikit-Learn to PMML
De-installation
Uninstalling:
pip uninstall sklearn2pmml
License
SkLearn2PMML is licensed under the terms and conditions of the GNU Affero General Public License, Version 3.0.
If you would like to use SkLearn2PMML in a proprietary software project, then it is possible to enter into a licensing agreement which makes SkLearn2PMML available under the terms and conditions of the BSD 3-Clause License instead.
Additional information
SkLearn2PMML is developed and maintained by Openscoring Ltd, Estonia.
Interested in using Java PMML API software in your company? Please contact info@openscoring.io
Source: https://github.com/jpmml/sklearn2pmml
Posted by: wilsonobblet.blogspot.com