Commit

edit cross reference (#36)
1. check table and code cross-reference for: ch-dask, ch-dask-dataframe, ch-data-science, ch-intro, ch-mpi, ch-mpi-large-model, ch-ray-cluster, ch-ray-core, ch-ray-data, ch-ray-train-tune
leverage-point authored Apr 20, 2024
1 parent b01fe1d commit 99d3a51
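
The renaming applied in this commit (a `tab-` prefix for table labels and a `code-` prefix for code-block labels) lends itself to a mechanical check. The following is a minimal sketch, not part of the commit: it assumes the Markdown cell text has already been extracted from the notebooks into one string, and the regular expressions, the function name `check_cross_references`, and the sample text are illustrative assumptions.

```python
import re

# Illustrative patterns (assumptions, not taken from the repository):
# a {numref} role with a bare label, and a ":name:" directive option.
NUMREF_RE = re.compile(r"\{numref\}`([^`<>]+)`")
NAME_RE = re.compile(r"^\s*:name:\s*(\S+)", re.MULTILINE)


def check_cross_references(text, prefixes=("tab-", "code-")):
    """Return (dangling references, labels lacking an expected prefix)."""
    labels = set(NAME_RE.findall(text))
    refs = set(NUMREF_RE.findall(text))
    # References that point at no ":name:" label in the given text.
    dangling = sorted(refs - labels)
    # Labels that do not follow the prefix convention.
    unprefixed = sorted(label for label in labels if not label.startswith(prefixes))
    return dangling, unprefixed


# Hypothetical Markdown cell in which the reference was renamed but the label was not.
sample = "\n".join([
    "{numref}`code-mpi-hello` uses a simple example.",
    ":caption: hello.py",
    ":name: mpi-hello",
])

dangling, unprefixed = check_cross_references(sample)
print(dangling)    # references with no matching :name: label
print(unprefixed)  # labels without a tab-/code- prefix
```

Because a label such as `tab-uri-schemes` may be defined in a different notebook from the one that references it, a check like this would probably need to run over the concatenated text of all chapters rather than file by file.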
Showing 7 changed files with 30 additions and 30 deletions.
4 changes: 2 additions & 2 deletions ch-dask-dataframe/read-write.ipynb
@@ -7,10 +7,10 @@
"(sec-dask-dataframe-read-write)=\n",
"# Reading and Writing Data\n",
"\n",
"Dask DataFrame supports almost all of the data read and write operations available in pandas, including reading and writing text files, Parquet, HDF, JSON, and other file formats on local storage, NFS, HDFS, or S3. {numref}`dask-read-write-operations` lists several common read and write operations.\n",
"Dask DataFrame supports almost all of the data read and write operations available in pandas, including reading and writing text files, Parquet, HDF, JSON, and other file formats on local storage, NFS, HDFS, or S3. {numref}`tab-dask-read-write-operations` lists several common read and write operations.\n",
"\n",
"```{table} Examples of Dask DataFrame read and write operations\n",
":name: dask-read-write-operations\n",
":name: tab-dask-read-write-operations\n",
"| 	| CSV 	| Parquet 	| HDF 	|\n",
"|---	|--------------	|------------------	|--------------	|\n",
"| Read 	| [`read_csv()`](https://docs.dask.org/en/stable/generated/dask.dataframe.read_csv.html) 	| [`read_parquet()`](https://docs.dask.org/en/stable/generated/dask.dataframe.read_parquet.html) 	| [`read_hdf()`](https://docs.dask.org/en/stable/generated/dask.dataframe.read_hdf.html) 	|\n",
8 changes: 4 additions & 4 deletions ch-mpi/collective.ipynb
@@ -36,11 +36,11 @@
"```\n",
"### Example 1: Broadcast\n",
"\n",
"{numref}`mpi-broadcast-py` demonstrates how to broadcast a NumPy array to all processes.\n",
"{numref}`code-mpi-broadcast-py` demonstrates how to broadcast a NumPy array to all processes.\n",
"\n",
"```{code-block} python\n",
":caption: broadcast.py\n",
":name: mpi-broadcast-py\n",
":name: code-mpi-broadcast-py\n",
"\n",
"import numpy as np\n",
"from mpi4py import MPI\n",
@@ -104,11 +104,11 @@
"\n",
"### Example 2: Scatter\n",
"\n",
"{numref}`mpi-scatter` demonstrates how to use Scatter to distribute data across all processes.\n",
"{numref}`code-mpi-scatter` demonstrates how to use Scatter to distribute data across all processes.\n",
"\n",
"```{code-block} python\n",
":caption: scatter.py\n",
":name: mpi-scatter\n",
":name: code-mpi-scatter\n",
"\n",
"from mpi4py import MPI\n",
"import numpy as np\n",
4 changes: 2 additions & 2 deletions ch-mpi/mpi-hello-world.ipynb
@@ -31,11 +31,11 @@
"\n",
"## Example: Hello World\n",
"\n",
"{numref}`mpi-hello` uses a simple example to demonstrate MPI programming.\n",
"{numref}`code-mpi-hello` uses a simple example to demonstrate MPI programming.\n",
"\n",
"```{code-block} python\n",
":caption: hello.py\n",
":name: mpi-hello\n",
":name: code-mpi-hello\n",
"\n",
"from mpi4py import MPI\n",
"\n",
20 changes: 10 additions & 10 deletions ch-mpi/point-to-point.ipynb
@@ -22,11 +22,11 @@
"\n",
"For example, we can send a Python object. Python objects are serialized with [pickle](https://docs.python.org/3/library/pickle.html#module-pickle) during communication.\n",
"\n",
"{numref}`mpi-send-py-object` demonstrates how to send a Python object.\n",
"{numref}`code-mpi-send-py-object` demonstrates how to send a Python object.\n",
"\n",
"```{code-block} python\n",
":caption: send-py-object.py\n",
":name: mpi-send-py-object\n",
":name: code-mpi-send-py-object\n",
"\n",
"from mpi4py import MPI\n",
"\n",
@@ -71,11 +71,11 @@
"\n",
"Or send a NumPy `ndarray`:\n",
"\n",
"{numref}`mpi-send-np` demonstrates how to send a NumPy `ndarray`.\n",
"{numref}`code-mpi-send-np` demonstrates how to send a NumPy `ndarray`.\n",
"\n",
"```{code-block} python\n",
":caption: send-np.py\n",
":name: mpi-send-np\n",
":name: code-mpi-send-np\n",
"\n",
"from mpi4py import MPI\n",
"import numpy as np\n",
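@@ -139,11 +139,11 @@
"\n",
"Now let's build a Master-Worker example with `size` processes in total. The first `size-1` processes act as Workers and generate random data; the last process (rank `size-1`) acts as the Master, receives the data, and prints its size.\n",
"\n",
"{numref}`mpi-master-worker` demonstrates how data is sent and received between the Master and Worker processes.\n",
"{numref}`code-mpi-master-worker` demonstrates how data is sent and received between the Master and Worker processes.\n",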
"\n",
"```{code-block} python\n",
":caption: master-worker.py\n",
":name: mpi-master-worker\n",
":name: code-mpi-master-worker\n",
"\n",
"from mpi4py import MPI\n",
"import numpy as np\n",
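@@ -221,11 +221,11 @@
"\n",
"Suppose `size` processes take part in the computation. First determine the number of rectangles each process needs to handle (`N/size`). Each process computes the sum of the areas of its own rectangles and sends it to the Master process. The first process acts as the Master: it receives the data sent by the Workers and sums all the rectangle areas to approximate $\\pi$.\n",
"\n",
"{numref}`mpi-rectangle-pi` demonstrates approximating $\\pi$ with rectangles.\n",
"{numref}`code-mpi-rectangle-pi` demonstrates approximating $\\pi$ with rectangles.\n",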
"\n",
"```{code-block} python\n",
":caption: rectangle-pi.py\n",
":name: mpi-rectangle-pi\n",
":name: code-mpi-rectangle-pi\n",
"\n",
"import math\n",
"import time\n",
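@@ -338,11 +338,11 @@
"\n",
"A non-blocking call returns a [`Request`](https://mpi4py.readthedocs.io/en/stable/reference/mpi4py.MPI.Request.html#mpi4py.MPI.Request) handle immediately; the programmer then works with the `Request`, for example by waiting for the data transfer it represents to complete. Non-blocking methods are prefixed with an uppercase `I` or a lowercase `i`: those with the uppercase `I` prefix are buffer-based, while those with the lowercase `i` prefix are not. The parameters of [`isend`](https://mpi4py.readthedocs.io/en/stable/reference/mpi4py.MPI.Comm.html#mpi4py.MPI.Comm.isend) are much the same as those of [`send`](https://mpi4py.readthedocs.io/en/stable/reference/mpi4py.MPI.Comm.html#mpi4py.MPI.Comm.send), except that `isend` returns a `Request`. The `Request` class provides a `wait` method; explicitly calling `wait()` waits for the data transfer to complete. Blocking code written with `send` can be rewritten in a non-blocking style as `isend` + [`Request.wait()`](https://mpi4py.readthedocs.io/en/stable/reference/mpi4py.MPI.Request.html#mpi4py.MPI.Request.wait).\n",
"\n",
"{numref}`mpi-non-blocking` shows an example of non-blocking communication.\n",
"{numref}`code-mpi-non-blocking` shows an example of non-blocking communication.\n",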
"\n",
"```{code-block} python\n",
":caption: non-blocking.py\n",
":name: mpi-non-blocking\n",
":name: code-mpi-non-blocking\n",
"\n",
"from mpi4py import MPI\n",
"\n",
4 changes: 2 additions & 2 deletions ch-mpi/remote-memory-access.ipynb
@@ -63,11 +63,11 @@
"2. Data synchronization\n",
"3. Data reads and writes\n",
"\n",
"{numref}`mpi-rma-lock` shows an example; its code is saved as `rma-lock.py`.\n",
"{numref}`code-mpi-rma-lock` shows an example; its code is saved as `rma-lock.py`.\n",
"\n",
"```{code-block} python\n",
":caption: rma-lock.py\n",
":name: mpi-rma-lock\n",
":name: code-mpi-rma-lock\n",
"\n",
"import numpy as np\n",
"from mpi4py import MPI\n",
12 changes: 6 additions & 6 deletions ch-ray-data/data-load-inspect-save.ipynb
@@ -211,10 +211,10 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Methods for reading other file formats (CSV, TFRecord, etc.) are shown in {numref}`ray-data-read-files`.\n",
"Methods for reading other file formats (CSV, TFRecord, etc.) are shown in {numref}`tab-ray-data-read-files`.\n",
"\n",
"```{table} Ray Data data-reading methods\n",
":name: ray-data-read-files\n",
":name: tab-ray-data-read-files\n",
"| 	| Parquet 	| Text 	| CSV 	| TFRecord 	| Binary 	|\n",
"|:----:	|:--------------:	|:-----------:	|:----------:	|:----------------:	|---------------------	|\n",
"| Method 	| [`read_parquet()`](https://docs.ray.io/en/latest/data/api/doc/ray.data.read_parquet.html) 	| [`read_text()`](https://docs.ray.io/en/latest/data/api/doc/ray.data.read_text.html) 	| [`read_csv()`](https://docs.ray.io/en/latest/data/api/doc/ray.data.read_csv.html) 	| [`read_tfrecords()`](https://docs.ray.io/en/latest/data/api/doc/ray.data.read_tfrecords.html) 	| [`read_binary_files()`](https://docs.ray.io/en/latest/data/api/doc/ray.data.read_binary_files.html) 	|\n",
@@ -647,10 +647,10 @@
"\n",
"When using HDFS, S3, or another file system, Ray Data follows the URI and file system scheme conventions described in {numref}`tab-uri-schemes`; the scheme should be stated explicitly in the URI.\n",
"\n",
"{numref}`ray-data-save` lists several APIs for saving a `Dataset` in different file formats.\n",
"{numref}`tab-ray-data-save` lists several APIs for saving a `Dataset` in different file formats.\n",
"\n",
"```{table} Writing a Dataset to a file system\n",
":name: ray-data-save\n",
":name: tab-ray-data-save\n",
"| 	| Parquet 	| CSV 	| JSON 	| TFRecord 	|\n",
"|:----:	|:--------------:	|:-----------:	|:----------:	|:----------------:	|\n",
"| Method 	| [`Dataset.write_parquet()`](https://docs.ray.io/en/latest/data/api/doc/ray.data.Dataset.write_parquet.html) 	| [`Dataset.write_csv()`](https://docs.ray.io/en/latest/data/api/doc/ray.data.Dataset.write_csv.html) 	| [`Dataset.write_json()`](https://docs.ray.io/en/latest/data/api/doc/ray.data.Dataset.write_json.html) 	| [`Dataset.write_tfrecords()`](https://docs.ray.io/en/latest/data/api/doc/ray.data.Dataset.write_tfrecords.html) 	|\n",
@@ -754,10 +754,10 @@
"source": [
"### Converting to Other Framework Formats\n",
"\n",
"We can convert Ray Data data into a single-machine pandas DataFrame or a distributed Dask DataFrame, as shown in {numref}`ray-data-convert-other-library`.\n",
"We can convert Ray Data data into a single-machine pandas DataFrame or a distributed Dask DataFrame, as shown in {numref}`tab-ray-data-convert-other-library`.\n",
"\n",
"```{table} Converting a Dataset to the data formats of other frameworks\n",
":name: ray-data-convert-other-library\n",
":name: tab-ray-data-convert-other-library\n",
"| 	| pandas 	| Dask 	| Spark 	| \n",
"|:----:	|:--------------:	|:-----------:	|:----------:	|\n",
"| Method 	| [`Dataset.to_pandas()`](https://docs.ray.io/en/latest/data/api/doc/ray.data.Dataset.to_pandas.html) 	| [`Dataset.to_dask()`](https://docs.ray.io/en/latest/data/api/doc/ray.data.Dataset.to_dask.html) 	| [`Dataset.to_spark()`](https://docs.ray.io/en/latest/data/api/doc/ray.data.Dataset.to_spark.html) 	|\n",
8 changes: 4 additions & 4 deletions ch-ray-data/preprocessor.ipynb
@@ -258,10 +258,10 @@
"\n",
"### Categorical Variables\n",
"\n",
"Machine learning models cannot accept categorical variables, so they need to be converted. {numref}`categorical-data-preprocessor` lists several Preprocessors for handling categorical variables.\n",
"Machine learning models cannot accept categorical variables, so they need to be converted. {numref}`tab-categorical-data-preprocessor` lists several Preprocessors for handling categorical variables.\n",
"\n",
"```{table} Preprocessors for categorical variables\n",
":name: categorical-data-preprocessor\n",
":name: tab-categorical-data-preprocessor\n",
"\n",
"| Preprocessor 	| Variable type 	 | Example 	|\n",
"|:-----------------:\t|:--------:\t|:----------------------------------: |\n",
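@@ -272,10 +272,10 @@
"\n",
"### Numerical Variables\n",
"\n",
"The transformations below convert the data to suit a particular machine learning model. {numref}`numerical-data-preprocessor` lists several Preprocessors for handling numerical variables.\n",
"The transformations below convert the data to suit a particular machine learning model. {numref}`tab-numerical-data-preprocessor` lists several Preprocessors for handling numerical variables.\n",
"\n",
"```{table} Preprocessors for numerical variables\n",
":name: numerical-data-preprocessor\n",
":name: tab-numerical-data-preprocessor\n",
"\n",
"| Preprocessor 	| Variable type 	| Computation 	| Notes 	|\n",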
"|--------------------\t|----------------------\t|--------------------------------------------\t|----------------------------------------------------------\t|\n",