lincc-frameworks · delucchi-cmu · Feb 5, 2025
diff --git a/docs/conf.py b/docs/conf.py
@@ -41,3 +41,6 @@
 html_logo = "_static/lincc-fw.png"
 html_title = "LINCC Frameworks"
 html_theme = "sphinx_book_theme"
+html_theme_options = {
+    "show_toc_level": 2,
+}
diff --git a/docs/notebooks/fibad/fibad_demo.ipynb b/docs/notebooks/fibad/fibad_demo.ipynb
@@ -4,7 +4,18 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# Fibad Demonstration"
+    "# Demo: Fibad\n",
+    "\n",
+    "**Author**: Drew\n",
+    "\n",
+    "**Last updated**: Feb 4, 2025\n",
+    "\n",
+    "In this notebook, we'll walk through an end-to-end demonstration of using FIBAD utilities to perform image-based inference. The steps are:\n",
+    "\n",
+    "* fetch the images to train, evaluate, and perform inference on\n",
+    "* train a model\n",
+    "* run inference with the model\n",
+    "* explore the results"
    ]
   },
   {
@@ -14,11 +25,14 @@
     "## Download a sample HSC dataset\n",
     "\n",
     "This dataset is comprised of approximately 993 cutouts from the Hyper Suprime Cam survey.\n",
-    "The cutouts were requested to be 8 arcsecs on a side.\n",
-    "For consistency, we crop each image to [96, 96] pixels at runtime.\n",
-    "For each object 3 bands have been acquired, I, R, G.\n",
     "\n",
-    "Once unzipped there will be a .fits file for each (object id, band) in the `./data/hsc_8asec_1000` directory.\n"
+    "Some characteristics of the cutouts:\n",
+    "\n",
+    "* 8 arcsecs on a side\n",
+    "* cropped to [96, 96] pixels at runtime for consistency\n",
+    "* 3 bands acquired for each object: `I`, `R`, `G`\n",
+    "\n",
+    "We'll create a local directory, `./data/hsc_8asec_1000`, and unzip the archive into it, producing a .fits file for each (object id, band) combination.\n"
    ]
   },
   {
@@ -85,7 +99,31 @@
    "source": [
     "## Configuration and Training\n",
     "\n",
-    "First we import fibad and create a new fibad object, instantiated (implicitly), with the default configuration file."
+    "The flow to get a model trained is straightforward:\n",
+    "\n",
+    "1. create a new instance\n",
+    "2. configure that instance\n",
+    "\n",
+    "    * where's the data?\n",
+    "    * what shape is the data?\n",
+    "    * what kind of model do you want?\n",
+    "    * what are your model hyperparameters?\n",
+    "\n",
+    "3. train that model\n",
+    "\n",
+    "Reusing the configuration is easy, too. After training, the fibad instance stores a copy of its configuration in `./results/<timestamped_directory>/runtime_config.toml`.\n",
+    "\n",
+    "1. If running in another notebook, instantiate a fibad object like so:\n",
+    "\n",
+    "    ```\n",
+    "    new_fibad_instance = fibad.Fibad(config_file='./results/<timestamped_directory>/runtime_config.toml')\n",
+    "    ```\n",
+    "\n",
+    "2. Or from the command line on an HPC system:\n",
+    "\n",
+    "    ```\n",
+    "    >> fibad train --runtime-config ./results/<timestamped_directory>/runtime_config.toml\n",
+    "    ```"
    ]
   },
   {
@@ -104,24 +142,8 @@
    "source": [
     "import fibad\n",
     "\n",
-    "f = fibad.Fibad()"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "For this demo, we'll make a few adjustments to the default configuration settings that the `fibad` object was instantiated with. By accessing the `.config` attribute of the fibad instance, we can modify any configuration value. \n",
+    "f = fibad.Fibad()\n",
     "\n",
-    "Here we specify the location of our sample data, the data set class, the model to train, number of epochs for training and the batch size."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 153,
-   "metadata": {},
-   "outputs": [],
-   "source": [
     "# Specify the location of the data to use for training\n",
     "f.config[\"general\"][\"data_dir\"] = \"./data/hsc_8asec_1000_bu\"\n",
     "\n",
@@ -133,73 +155,10 @@
     "\n",
     "# Set the number of epochs and batch size for training.\n",
     "f.config[\"train\"][\"epochs\"] = 1\n",
-    "f.config[\"data_loader\"][\"batch_size\"] = 32"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "We call the `.train()` method to train the model"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 154,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "[2025-01-30 21:48:31,624 fibad.data_sets.hsc_data_set:INFO] Processed 993 objects for pruning\n",
-      "[2025-01-30 21:48:31,625 fibad.data_sets.hsc_data_set:INFO] Checking file dimensions to determine standard cutout size...\n",
-      "[2025-01-30 21:48:31,628 fibad.data_sets.hsc_data_set:INFO] HSC Data set loader has 993 objects\n",
-      "[2025-01-30 21:48:31,632 fibad.data_sets.hsc_data_set:INFO] test split contains 199 items\n",
-      "[2025-01-30 21:48:31,633 fibad.data_sets.hsc_data_set:INFO] train split contains 596 items\n",
-      "[2025-01-30 21:48:31,633 fibad.data_sets.hsc_data_set:INFO] validate split contains 198 items\n",
-      "[2025-01-30 21:48:31,644 fibad.models.model_registry:INFO] Using criterion: torch.nn.CrossEntropyLoss with default arguments.\n",
-      "2025-01-30 21:48:31,646 ignite.distributed.auto.auto_dataloader INFO: Use data loader kwargs for dataset '<fibad.data_sets.hsc': \n",
-      "\t{'sampler': <torch.utils.data.sampler.SubsetRandomSampler object at 0x7f98004e1ca0>, 'batch_size': 32, 'num_workers': 2, 'pin_memory': True}\n",
-      "2025-01-30 21:48:31,647 ignite.distributed.auto.auto_dataloader INFO: Use data loader kwargs for dataset '<fibad.data_sets.hsc': \n",
-      "\t{'sampler': <torch.utils.data.sampler.SubsetRandomSampler object at 0x7f9800949cd0>, 'batch_size': 32, 'num_workers': 2, 'pin_memory': True}\n"
-     ]
-    },
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "2025/01/30 21:48:31 INFO mlflow.system_metrics.system_metrics_monitor: Started monitoring system metrics.\n",
-      "[2025-01-30 21:48:31,690 fibad.pytorch_ignite:INFO] Training model on device: cuda\n",
-      "[2025-01-30 21:48:31,691 fibad.pytorch_ignite:INFO] Total epochs: 1\n",
-      "[2025-01-30 21:48:35,026 fibad.pytorch_ignite:INFO] Total training time: 3.34[s]\n",
-      "[2025-01-30 21:48:35,027 fibad.pytorch_ignite:INFO] Latest checkpoint saved as: /home/drew/code/fibad/results/20250130-214831-train-cn8s/checkpoint_epoch_1.pt\n",
-      "[2025-01-30 21:48:35,028 fibad.pytorch_ignite:INFO] Best metric checkpoint saved as: /home/drew/code/fibad/results/20250130-214831-train-cn8s/checkpoint_1_loss=-1077.9679.pt\n",
-      "2025/01/30 21:48:35 INFO mlflow.system_metrics.system_metrics_monitor: Stopping system metrics monitoring...\n",
-      "2025/01/30 21:48:35 INFO mlflow.system_metrics.system_metrics_monitor: Successfully terminated system metrics monitoring!\n",
-      "[2025-01-30 21:48:35,692 fibad.train:INFO] Finished Training\n"
-     ]
-    }
-   ],
-   "source": [
-    "f.train()"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "The output of the training will be stored in a time-stamped directory under the `./results/`. By default, a copy of the final configuration used in training is persisted as `runtime_config.toml`. To run fibad again with the same configuration, you can reference the runtime_config.toml file.\n",
-    "\n",
-    "If running in another notebook, instantiate a fibad object like so:\n",
-    "```\n",
-    "new_fibad_instance = fibad.Fibad(config_file='./results/<timestamped_directory>/runtime_config.toml')\n",
-    "```\n",
+    "f.config[\"data_loader\"][\"batch_size\"] = 32\n",
     "\n",
-    "Or from the command line on an HPC system:\n",
-    "```\n",
-    ">> fibad train --runtime-config ./results/<timestamped_directory>/runtime_config.toml\n",
-    "```"
+    "# Call the train method to train the model\n",
+    "f.train()"
    ]
   },
   {

diff --git a/docs/notebooks/nested/review_demo.ipynb b/docs/notebooks/nested/review_demo.ipynb
@@ -5,7 +5,11 @@
    "id": "9d8ebc3c-ac7e-4ce4-ae37-15d54c7afb7d",
    "metadata": {},
    "source": [
-    "# LINCC Frameworks `nested-pandas`/`nested-dask` Review Demo\n"
+    "# Demo: `nested-pandas` and `nested-dask`\n",
+    "\n",
+    "**Author**: Doug\n",
+    "\n",
+    "**Last updated**: Feb 3, 2025\n"
    ]
   },
   {

diff --git a/docs/requirements.txt b/docs/requirements.txt
@@ -1,10 +1,7 @@
-
 ipykernel
 ipython
-jupytext
 nbconvert
 nbsphinx
 sphinx
-sphinx-autoapi
 sphinx-copybutton
 sphinx-book-theme