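A translation and simplification of the official tutorial: CUDA C++ Programming Guide
The aim is mainly to simplify the official guide and make it more accessible. The content of the original documentation is not altered, though some extra explanation is added.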
The articles in this series will be published here:
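- CUDA C++ Programming Guide Notes - Chapter 1: Introduction and Programming Model
- CUDA C++ Programming Guide Notes - Chapter 2: Programming Interface
- CUDA C++ Programming Guide Notes - Chapter 3: GPU Hardware Implementation
- CUDA C++ Programming Guide Notes - Chapter 4: Performance Guidelines
- CUDA C++ Programming Guide Notes - Chapter 5: C++ Language Extensions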
Outline of the original guide:
- Introduction is a general introduction to CUDA.
- Programming Model outlines the CUDA programming model.
- Programming Interface describes the programming interface.
- Hardware Implementation describes the hardware implementation.
- Performance Guidelines gives some guidance on how to achieve maximum performance.
- CUDA-Enabled GPUs lists all CUDA-enabled devices.
- C++ Language Extensions is a detailed description of all extensions to the C++ language.
- Cooperative Groups describes synchronization primitives for various groups of CUDA threads.
- CUDA Dynamic Parallelism describes how to launch and synchronize one kernel from another.
- Virtual Memory Management describes how to manage the unified virtual address space.
- Stream Ordered Memory Allocator describes how applications can order memory allocation and deallocation.
- Graph Memory Nodes describes how graphs can create and own memory allocations.
- Mathematical Functions lists the mathematical functions supported in CUDA.
- C++ Language Support lists the C++ features supported in device code.
- Texture Fetching gives more details on texture fetching.
- Compute Capabilities gives the technical specifications of various devices, as well as more architectural details.
- Driver API introduces the low-level driver API.
- CUDA Environment Variables lists all the CUDA environment variables.
- Unified Memory Programming introduces the Unified Memory programming model.