diff --git a/_toc.yml b/_toc.yml
index b189c73..696d2f3 100644
--- a/_toc.yml
+++ b/_toc.yml
@@ -2,13 +2,13 @@ root: index
subtrees:
- numbered: 2
entries:
- - file: ch-intro/index
+ - file: ch-parallel-computing/index
entries:
- - file: ch-intro/computer-architecture
- - file: ch-intro/serial-parallel
- - file: ch-intro/thread-process
- - file: ch-intro/parallel-program-design
- - file: ch-intro/performance-metrics
+ - file: ch-parallel-computing/computer-architecture
+ - file: ch-parallel-computing/serial-parallel
+ - file: ch-parallel-computing/thread-process
+ - file: ch-parallel-computing/parallel-program-design
+ - file: ch-parallel-computing/performance-metrics
- file: ch-data-science/index
entries:
- file: ch-data-science/data-science-lifecycle
@@ -57,6 +57,10 @@ subtrees:
- file: ch-ray-ml/ray-train
- file: ch-ray-ml/ray-tune
- file: ch-ray-ml/ray-serve
+ - file: ch-modin-xorbits/index
+ entries:
+ - file: ch-modin-xorbits/modin
+ - file: ch-modin-xorbits/xorbits
- file: ch-mpi/index
entries:
- file: ch-mpi/mpi-intro
diff --git a/ch-modin-xorbits/modin.ipynb b/ch-modin-xorbits/modin.ipynb
index e69de29..d2c9b8c 100644
--- a/ch-modin-xorbits/modin.ipynb
+++ b/ch-modin-xorbits/modin.ipynb
@@ -0,0 +1,332 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "(sec-modin)=\n",
+ "# Modin\n",
+ "\n",
+ "Modin is a framework dedicated to accelerating pandas DataFrames. It partitions large datasets so that DataFrame operations are distributed across multiple cores or a cluster, using Ray or Dask as the distributed execution engine under the hood. When installing Modin, you therefore also install a matching execution engine (Ray, Dask, or [unidist](https://github.com/modin-project/unidist/)), for example `pip install \"modin[ray]\"` or `pip install \"modin[dask]\"`. Modin uses Ray as its default execution engine.\n",
+ "\n",
+ "## API Compatibility\n",
+ "\n",
+ "Dask DataFrame actually differs from pandas DataFrame in quite a few ways, and many pandas workflows cannot be migrated to Dask DataFrame quickly. Modin places more weight on pandas compatibility: users only need to `import modin.pandas as pd`, and the vast majority of pandas workflows migrate to Modin quickly.\n",
+ "\n",
+ "Dask DataFrame partitions large datasets only along the row axis (by index) and does not record how much data each partition holds. Modin partitions data along both dimensions and preserves row and column labels: it supports row indexing with `iloc()`; because it records the size of each data block, it can support `median()` and `quantile()`; and it supports row/column transformations such as `pivot()` and `transpose()`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 33,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import os\n",
+ "\n",
+ "import sys\n",
+ "sys.path.append(\"..\")\n",
+ "from utils import nyc_flights\n",
+ "\n",
+ "folder_path = nyc_flights()\n",
+ "file_path = os.path.join(folder_path, \"*.csv\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ ":::{note}\n",
+ "Modin's API stays as close to pandas as possible. For example, pandas' `read_csv()` can only read a single file, not a wildcard such as `*.csv`. Modin adds a few extra APIs: it extends `read_csv()` with a `read_csv_glob()` method that accepts wildcards such as `*.csv` and is well suited to large datasets. These additional APIs live in `modin.experimental.pandas`.\n",
+ ":::"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 34,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "Date 1991-01-11 00:00:00\n",
+ "DayOfWeek 5\n",
+ "DepTime 1303.0\n",
+ "CRSDepTime 1215\n",
+ "ArrTime 1439.0\n",
+ "CRSArrTime 1336\n",
+ "UniqueCarrier US\n",
+ "FlightNum 121\n",
+ "TailNum NaN\n",
+ "ActualElapsedTime 96.0\n",
+ "CRSElapsedTime 81\n",
+ "AirTime NaN\n",
+ "ArrDelay 63.0\n",
+ "DepDelay 48.0\n",
+ "Origin EWR\n",
+ "Dest PIT\n",
+ "Distance 319.0\n",
+ "TaxiIn NaN\n",
+ "TaxiOut NaN\n",
+ "Cancelled 0\n",
+ "Diverted 0\n",
+ "Name: 3, dtype: object"
+ ]
+ },
+ "execution_count": 34,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "import modin.experimental.pandas as pd\n",
+ "df = pd.read_csv_glob(file_path, parse_dates={'Date': [0, 1, 2]})\n",
+ "df.iloc[3]"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 35,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "0.0"
+ ]
+ },
+ "execution_count": 35,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df['ArrDelay'].median()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "If an API has not yet been implemented in Modin, Modin falls back to pandas, which also guarantees compatibility. The drawback is equally clear: converting a Modin DataFrame to a pandas DataFrame incurs extra overhead, and if that DataFrame is distributed across multiple nodes, converting back to pandas gathers the data into the memory of a single machine, which may well overwhelm it.\n",
+ "\n",
+ "## Eager Execution\n",
+ "\n",
+ "Modin executes eagerly, just like pandas. Users do not need to call `.compute()` to trigger computation the way Dask requires.\n",
+ "\n",
+ "Modin also does not rely on Dask DataFrame-style data type inference. On the flight data from {numref}`sec-dask-dataframe-read-write`, Dask DataFrame's `tail()` raises an exception, whereas Modin yields the same semantics as pandas."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 36,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ " Date DayOfWeek DepTime CRSDepTime ArrTime CRSArrTime \\\n",
+ "1555982 1994-12-27 2 1721.0 1715 1930.0 1945 \n",
+ "1555983 1994-12-28 3 1715.0 1715 1934.0 1945 \n",
+ "1555984 1994-12-29 4 1715.0 1715 1941.0 1945 \n",
+ "\n",
+ " UniqueCarrier FlightNum TailNum ActualElapsedTime ... AirTime \\\n",
+ "1555982 DL 149 NaN 129.0 ... NaN \n",
+ "1555983 DL 149 NaN 139.0 ... NaN \n",
+ "1555984 DL 149 NaN 146.0 ... NaN \n",
+ "\n",
+ " ArrDelay DepDelay Origin Dest Distance TaxiIn TaxiOut Cancelled \\\n",
+ "1555982 -15.0 6.0 JFK ATL 760.0 NaN NaN 0 \n",
+ "1555983 -11.0 0.0 JFK ATL 760.0 NaN NaN 0 \n",
+ "1555984 -4.0 0.0 JFK ATL 760.0 NaN NaN 0 \n",
+ "\n",
+ " Diverted \n",
+ "1555982 0 \n",
+ "1555983 0 \n",
+ "1555984 0 \n",
+ "\n",
+ "[3 rows x 21 columns]"
+ ]
+ },
+ "execution_count": 36,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df.tail(3)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Execution Engines\n",
+ "\n",
+ "Modin supports the Ray, Dask, and unidist distributed execution engines: it can exploit the multiple cores of a single machine or run on a cluster. Taking Ray as an example, users can submit jobs to a Ray cluster; after initializing the Ray runtime in code with `ray.init(address=\"auto\")`, the job runs on the Ray cluster.\n",
+ "\n",
+ "Modin uses Ray as its default execution backend. The backend can also be set with the environment variable `MODIN_ENGINE`: on the command line, `export MODIN_ENGINE=dask`; or in a Jupyter Notebook:\n",
+ "\n",
+ "```python\n",
+ "import modin.config as modin_cfg\n",
+ "modin_cfg.Engine.put(\"ray\")\n",
+ "```\n",
+ "\n",
+ "unidist is an execution backend implemented by the Modin team itself, and it supports MPI. To use unidist with MPI, set `UNIDIST_BACKEND` in addition to `MODIN_ENGINE`:\n",
+ "\n",
+ "```shell\n",
+ "export MODIN_ENGINE=unidist\n",
+ "export UNIDIST_BACKEND=mpi\n",
+ "```"
+ ]
+ },
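+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As a minimal sketch (assuming the Dask backend was installed via `pip install \"modin[dask]\"`), the engine can also be switched programmatically; the engine should be set before the first Modin operation:\n",
+ "\n",
+ "```python\n",
+ "import modin.config as modin_cfg\n",
+ "\n",
+ "modin_cfg.Engine.put(\"dask\")  # set the engine before any DataFrame is created\n",
+ "\n",
+ "import modin.pandas as pd  # subsequent operations run on the Dask backend\n",
+ "```"
+ ]
+ },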
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "dispy",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.11.7"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/ch-modin-xorbits/xorbits.ipynb b/ch-modin-xorbits/xorbits.ipynb
new file mode 100644
index 0000000..ed29831
--- /dev/null
+++ b/ch-modin-xorbits/xorbits.ipynb
@@ -0,0 +1,322 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "(sec-xorbits)=\n",
+ "# Xorbits\n",
+ "\n",
+ "Xorbits is a distributed computing framework for data science. Functionally it resembles Dask and Modin: it accelerates pandas DataFrames and NumPy. Like Dask and Modin, it partitions large datasets and executes on each partition with pandas or NumPy. Under the hood it relies on its own Actor programming framework, [Xoscar](https://github.com/xorbitsai/xoscar), rather than on Ray or Dask.\n",
+ "\n",
+ "## Xorbits Clusters\n",
+ "\n",
+ "Before computing, Xorbits needs to initialize a cluster. On a single machine you can simply call `xorbits.init()`. If you have a cluster, start it as follows: first launch a management process (the Supervisor), then start a Worker on each compute node:\n",
+ "\n",
+ "```shell\n",
+ "# start the Supervisor on the management node first\n",
+ "xorbits-supervisor -H <supervisor_ip> -p <supervisor_port> -w <web_port>\n",
+ "\n",
+ "# start a Worker on every compute node\n",
+ "xorbits-worker -H <worker_ip> -p <worker_port> -s <supervisor_ip>:<supervisor_port>\n",
+ "```\n",
+ "\n",
+ "Here `<supervisor_ip>` and `<supervisor_port>` are the IP address and port of the management node, and `<web_port>` is the dashboard port, through which clients also connect to the cluster. `<worker_ip>` and `<worker_port>` are the IP address and port of each compute node. Once the Supervisor and Workers are running, connect to the cluster in code with `xorbits.init(\"<supervisor_ip>:<web_port>\")`, and computation scales out across the cluster.\n",
+ "\n",
+ "## API Compatibility\n",
+ "\n",
+ "In terms of pandas DataFrame compatibility, Modin > Xorbits > Dask DataFrame; in terms of performance, Xorbits > Dask DataFrame > Modin.\n",
+ "\n",
+ "Xorbits also partitions data along both dimensions, preserving row and column labels, and provides the vast majority of the pandas API, such as `iloc()` and `median()`."
+ ]
+ },
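+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As a sketch (the addresses are placeholders for a cluster started as above), a client session is initialized like this:\n",
+ "\n",
+ "```python\n",
+ "import xorbits\n",
+ "\n",
+ "# single machine: start a local session\n",
+ "xorbits.init()\n",
+ "\n",
+ "# cluster: connect through the Supervisor's dashboard port instead\n",
+ "# xorbits.init(\"<supervisor_ip>:<web_port>\")\n",
+ "```"
+ ]
+ },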
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import os\n",
+ "\n",
+ "import sys\n",
+ "sys.path.append(\"..\")\n",
+ "from utils import nyc_flights\n",
+ "from utils import nyc_taxi\n",
+ "\n",
+ "taxi_path = nyc_taxi()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "/Users/luweizheng/miniconda3/envs/dispy/lib/python3.11/site-packages/xorbits/_mars/deploy/oscar/session.py:1953: UserWarning: No existing session found, creating a new local session now.\n",
+ " warnings.warn(warning_msg)\n",
+ "2024-05-09 10:13:12,400 xorbits._mars.deploy.oscar.local 43280 WARNING Web service started at http://0.0.0.0:54965\n"
+ ]
+ },
+ {
+ "data": {
+ "application/vnd.jupyter.widget-view+json": {
+ "model_id": "d59ff71355b24395bb3c2a76bc0220ba",
+ "version_major": 2,
+ "version_minor": 0
+ },
+ "text/plain": [
+ " 0%| | 0.00/100 [00:00, ?it/s]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/plain": [
+ "VendorID 1\n",
+ "tpep_pickup_datetime 2023-01-01 00:03:48\n",
+ "tpep_dropoff_datetime 2023-01-01 00:13:25\n",
+ "passenger_count 0.0\n",
+ "trip_distance 1.9\n",
+ "RatecodeID 1.0\n",
+ "store_and_fwd_flag N\n",
+ "PULocationID 138\n",
+ "DOLocationID 7\n",
+ "payment_type 1\n",
+ "fare_amount 12.1\n",
+ "extra 7.25\n",
+ "mta_tax 0.5\n",
+ "tip_amount 0.0\n",
+ "tolls_amount 0.0\n",
+ "improvement_surcharge 1.0\n",
+ "total_amount 20.85\n",
+ "congestion_surcharge 0.0\n",
+ "airport_fee 1.25\n",
+ "Name: 3, dtype: object"
+ ]
+ },
+ "execution_count": 2,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "import xorbits\n",
+ "import xorbits.pandas as pd\n",
+ "\n",
+ "df = pd.read_parquet(taxi_path, use_arrow_dtype=False)\n",
+ "df.iloc[3]"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "VendorID int64\n",
+ "tpep_pickup_datetime datetime64[us]\n",
+ "tpep_dropoff_datetime datetime64[us]\n",
+ "passenger_count float64\n",
+ "trip_distance float64\n",
+ "RatecodeID float64\n",
+ "store_and_fwd_flag object\n",
+ "PULocationID int64\n",
+ "DOLocationID int64\n",
+ "payment_type int64\n",
+ "fare_amount float64\n",
+ "extra float64\n",
+ "mta_tax float64\n",
+ "tip_amount float64\n",
+ "tolls_amount float64\n",
+ "improvement_surcharge float64\n",
+ "total_amount float64\n",
+ "congestion_surcharge float64\n",
+ "airport_fee float64\n",
+ "dtype: object"
+ ]
+ },
+ "execution_count": 3,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df.dtypes"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "application/vnd.jupyter.widget-view+json": {
+ "model_id": "574c784460ae4be89a4c190919c74c1a",
+ "version_major": 2,
+ "version_minor": 0
+ },
+ "text/plain": [
+ " 0%| | 0.00/100 [00:00, ?it/s]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/plain": [
+ "1.79"
+ ]
+ },
+ "execution_count": 4,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df['trip_distance'].median()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Deferred Execution\n",
+ "\n",
+ "Xorbits uses a computational graph much like Dask's: every computation is first converted into a graph before it is executed. Unlike Dask, however, Xorbits does not require an explicit `compute()` call to trigger computation; this approach is known as deferred execution. Xorbits builds the computational graph behind the scenes, but only executes it when it encounters an operation that must present data to the user, such as `print()`. This keeps Xorbits' semantics much closer to pandas and NumPy. Computation can also be triggered manually with `xorbits.run(df)`.\n",
+ "\n",
+ "Take the visualization below as an example: `gb_time` is merely a pointer into the computational graph, not the actual data, but when plotly needs the result of `gb_time`, Xorbits triggers the computation."
+ ]
+ },
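+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "A minimal sketch of this behavior (assuming a local session, independent of the taxi data used above):\n",
+ "\n",
+ "```python\n",
+ "import xorbits\n",
+ "import xorbits.pandas as pd\n",
+ "\n",
+ "xorbits.init()  # local session\n",
+ "s = pd.Series([1, 2, 3])\n",
+ "total = s.sum()     # only builds the computational graph\n",
+ "xorbits.run(total)  # manually triggers execution\n",
+ "print(total)        # printing would also trigger execution\n",
+ "```"
+ ]
+ },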
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "df['PU_dayofweek'] = df['tpep_pickup_datetime'].dt.dayofweek\n",
+ "df['PU_hour'] = df['tpep_pickup_datetime'].dt.hour\n",
+ "gb_time = df.groupby(by=['PU_dayofweek', 'PU_hour'], as_index=False).agg(count=('PU_dayofweek', 'count'))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ " \n",
+ " "
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "import plotly.express as px\n",
+ "import plotly.io as pio\n",
+ "pio.renderers.default = \"notebook\"\n",
+ "\n",
+ "b = px.bar(\n",
+ " gb_time,\n",
+ " x='PU_hour',\n",
+ " y='count',\n",
+ " color='PU_dayofweek',\n",
+ " color_continuous_scale='sunset_r',\n",
+ ")\n",
+ "b.show()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "dispy",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.11.7"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/ch-intro/computer-architecture.md b/ch-parallel-computing/computer-architecture.md
similarity index 100%
rename from ch-intro/computer-architecture.md
rename to ch-parallel-computing/computer-architecture.md
diff --git a/ch-intro/index.md b/ch-parallel-computing/index.md
similarity index 54%
rename from ch-intro/index.md
rename to ch-parallel-computing/index.md
index 3b8739f..231a10e 100644
--- a/ch-intro/index.md
+++ b/ch-parallel-computing/index.md
@@ -1,4 +1,4 @@
-# 引言
+# Parallel Computing Basics
```{tableofcontents}
```
\ No newline at end of file
diff --git a/ch-intro/parallel-program-design.md b/ch-parallel-computing/parallel-program-design.md
similarity index 100%
rename from ch-intro/parallel-program-design.md
rename to ch-parallel-computing/parallel-program-design.md
diff --git a/ch-intro/performance-metrics.md b/ch-parallel-computing/performance-metrics.md
similarity index 100%
rename from ch-intro/performance-metrics.md
rename to ch-parallel-computing/performance-metrics.md
diff --git a/ch-intro/serial-parallel.md b/ch-parallel-computing/serial-parallel.md
similarity index 100%
rename from ch-intro/serial-parallel.md
rename to ch-parallel-computing/serial-parallel.md
diff --git a/ch-intro/thread-process.md b/ch-parallel-computing/thread-process.md
similarity index 100%
rename from ch-intro/thread-process.md
rename to ch-parallel-computing/thread-process.md
diff --git a/conf.py b/conf.py
index 2e69396..53f6ac4 100644
--- a/conf.py
+++ b/conf.py
@@ -49,6 +49,9 @@
}
html_static_path = ["_static"]
html_css_files = ["custom.css"]
+html_js_files = [
+ "https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.6/require.min.js",
+]
html_title = 'Python 数据科学加速'
latex_engine = 'pdflatex'
myst_enable_extensions = ['colon_fence', 'dollarmath', 'linkify', 'substitution', 'tasklist']