Scalable Data Science with Python
1. Parallel Computing Basics
1.1. Modern Computer Architecture
1.2. Serial Execution vs. Parallel Execution
1.3. Threads and Processes
1.4. Parallel Programming Design Methods
1.5. Performance Metrics
2. Data Science
2.1. Data Science Lifecycle
2.2. Machine Learning
2.3. Deep Learning
2.4. Hyperparameter Optimization
2.5. Ecosystem and Content
3. Dask
3.1. Dask Overview
3.2. Getting Started with Dask DataFrame
3.3. Scaling Dask to a Cluster
3.4. GPU
3.5. Task Graph and Data Partitioning
4. Dask DataFrame
4.1. Reading and Writing Data
4.2. Indexing
4.3. map_partitions
4.4. Shuffle
4.5. Data Analysis with Dask
5. Machine Learning with Dask
5.1. Data Preprocessing
5.2. Hyperparameter Tuning
5.3. Distributed Machine Learning
6. Ray
6.1. Ray Overview
6.2. Ray Remote Functions
6.3. Distributed Object Storage
6.4. Ray Remote Classes
7. Ray Data
7.1. Ray Data Overview
7.2. Data Loading, Inspection, and Saving
7.3. Data Transformation
7.4. Preprocessor
8. MPI for Python
8.1. MPI Overview
8.2. MPI Hello World
8.3. Point-to-Point Communication
8.4. Collective Communication
8.5. Remote Memory Access
9. References
Index