{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Pickling Models for Persistence\n",
    "\n",
    "This notebook demonstrates simple pickling of both single-GPU and multi-GPU cuML models for persistence"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import warnings\n",
    "warnings.filterwarnings(\"ignore\", category=FutureWarning)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Single GPU Model Pickling\n",
    "\n",
    "All single-GPU estimators are pickleable. The following example demonstrates the creation of a synthetic dataset, training, and pickling of the resulting model for storage. Trained single-GPU models can also be used to distribute the inference on a Dask cluster, which the `Distributed Model Pickling` section below demonstrates."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from cuml.datasets import make_blobs\n",
    "\n",
    "X, y = make_blobs(n_samples=50,\n",
    "                  n_features=10,\n",
    "                  centers=5,\n",
    "                  cluster_std=0.4,\n",
    "                  random_state=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from cuml.cluster import KMeans\n",
    "\n",
    "model = KMeans(n_clusters=5)\n",
    "\n",
    "model.fit(X)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pickle\n",
    "\n",
    "pickle.dump(model, open(\"kmeans_model.pkl\", \"wb\"))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model = pickle.load(open(\"kmeans_model.pkl\", \"rb\"))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.cluster_centers_"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Distributed Model Pickling"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The distributed estimator wrappers inside of the `cuml.dask` are not intended to be pickled directly. The Dask cuML estimators provide a function `get_combined_model()`, which returns the trained single-GPU model for pickling. The combined model can be used for inference on a single-GPU, and the `ParallelPostFit` wrapper from the [Dask-ML](https://ml.dask.org/meta-estimators.html) library can be used to perform distributed inference on a Dask cluster."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from dask.distributed import Client\n",
    "from dask_cuda import LocalCUDACluster\n",
    "\n",
    "cluster = LocalCUDACluster()\n",
    "client = Client(cluster)\n",
    "client"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from cuml.dask.datasets import make_blobs\n",
    "\n",
    "n_workers = len(client.scheduler_info()[\"workers\"].keys())\n",
    "\n",
    "X, y = make_blobs(n_samples=5000, \n",
    "                  n_features=30,\n",
    "                  centers=5, \n",
    "                  cluster_std=0.4, \n",
    "                  random_state=0,\n",
    "                  n_parts=n_workers*5)\n",
    "\n",
    "X = X.persist()\n",
    "y = y.persist()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from cuml.dask.cluster import KMeans\n",
    "\n",
    "dist_model = KMeans(n_clusters=5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dist_model.fit(X)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pickle\n",
    "\n",
    "single_gpu_model = dist_model.get_combined_model()\n",
    "pickle.dump(single_gpu_model, open(\"kmeans_model.pkl\", \"wb\"))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "single_gpu_model = pickle.load(open(\"kmeans_model.pkl\", \"rb\"))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "single_gpu_model.cluster_centers_"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exporting cuML Random Forest models for inferencing on machines without GPUs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Starting with cuML version 21.06, you can export cuML Random Forest models and run predictions with them on machines without an NVIDIA GPUs. The [Treelite](https://github.com/dmlc/treelite) package defines an efficient exchange format that lets you portably move the cuML Random Forest models to other machines. We will refer to the exchange format as \"checkpoints.\"\n",
    "\n",
    "Here are the steps to export the model:\n",
    "\n",
    "1. Call `to_treelite_checkpoint()` to obtain the checkpoint file from the cuML Random Forest model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from cuml.ensemble import RandomForestClassifier as cumlRandomForestClassifier\n",
    "from sklearn.datasets import load_iris\n",
    "import numpy as np\n",
    "\n",
    "X, y = load_iris(return_X_y=True)\n",
    "X, y = X.astype(np.float32), y.astype(np.int32)\n",
    "clf = cumlRandomForestClassifier(max_depth=3, random_state=0, n_estimators=10)\n",
    "clf.fit(X, y)\n",
    "\n",
    "checkpoint_path = './checkpoint.tl'\n",
    "# Export cuML RF model as Treelite checkpoint\n",
    "clf.convert_to_treelite_model().to_treelite_checkpoint(checkpoint_path)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "2. Copy the generated checkpoint file `checkpoint.tl` to another machine on which you'd like to run predictions.\n",
    "\n",
    "3. On the target machine, install Treelite by running `pip install treelite` or `conda install -c conda-forge treelite`. The machine does not need to have an NVIDIA GPUs and does not need to have cuML installed.\n",
    "\n",
    "4. You can now load the model from the checkpoint, by running the following on the target machine:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import treelite\n",
    "\n",
    "# The checkpoint file has been copied over\n",
    "checkpoint_path = './checkpoint.tl'\n",
    "tl_model = treelite.Model.deserialize(checkpoint_path)\n",
    "out_prob = treelite.gtil.predict(tl_model, X, pred_margin=True)\n",
    "print(out_prob)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3.6.9 64-bit",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.9"
  },
  "nbsphinx": {
   "execute": "never"
  },
  "vscode": {
   "interpreter": {
    "hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}