Course Outline

Overview of the Chinese AI GPU Ecosystem

  • Comparison of Huawei Ascend, Biren, and Cambricon MLU.
  • How the CUDA programming model differs from CANN, the Biren SDK, and BANGPy.
  • Industry trends and vendor ecosystems.

Preparing for Migration

  • Assessing your CUDA codebase.
  • Identifying target platforms and SDK versions.
  • Toolchain installation and environment setup.

Code Translation Techniques

  • Porting CUDA memory access and kernel logic.
  • Mapping compute grid/thread models.
  • Evaluating automated versus manual translation options.
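
The grid/thread mapping listed above can be sketched without any vendor SDK. The following is a minimal Python model, assuming a one-dimensional CUDA-style launch; the names `grid_size` and `global_indices` are illustrative and not part of any toolkit. It shows the flat index (block index times block size, plus thread index) and the bounds guard that a port would typically need to preserve, whatever the target platform calls its blocks and threads.

```python
from typing import Iterator, Tuple


def grid_size(n_elements: int, block_size: int) -> int:
    """Number of blocks needed to cover n_elements (ceiling division)."""
    if n_elements < 0 or block_size <= 0:
        raise ValueError("n_elements must be >= 0 and block_size must be > 0")
    return (n_elements + block_size - 1) // block_size


def global_indices(n_elements: int, block_size: int) -> Iterator[Tuple[int, int, int]]:
    """Yield (block_idx, thread_idx, global_idx) for every in-range thread.

    Mirrors the common CUDA idiom:
        i = blockIdx.x * blockDim.x + threadIdx.x; if (i < n) { ... }
    """
    for block_idx in range(grid_size(n_elements, block_size)):
        for thread_idx in range(block_size):
            global_idx = block_idx * block_size + thread_idx
            if global_idx < n_elements:  # bounds guard for the last, partial block
                yield block_idx, thread_idx, global_idx
```

For example, `grid_size(1000, 256)` gives 4 blocks, and the last block has only 232 threads that pass the guard.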

Platform-Specific Implementations

  • Utilizing Huawei CANN operators and custom kernels.
  • Navigating the Biren SDK conversion pipeline.
  • Rebuilding models using BANGPy (Cambricon).

Cross-Platform Testing and Optimization

  • Profiling execution on each target platform.
  • Comparing memory tuning and parallel execution across platforms.
  • Tracking performance and iterating.
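
As a rough illustration of tracking performance across iterations, the sketch below times a callable with warm-up runs and reports simple statistics. It is a host-side harness using only the Python standard library; the name `benchmark` and its parameters are assumptions for illustration. On an accelerator, the callable would normally need to include the platform's own device synchronization, since kernel launches are often asynchronous, and a vendor profiler remains the better tool for per-kernel detail.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(fn: Callable[[], object], warmup: int = 3, repeats: int = 10) -> Dict[str, float]:
    """Time fn() over several runs and return median/min/max in seconds."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    for _ in range(warmup):  # warm-up runs are executed but not measured
        fn()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return {
        "median_s": statistics.median(samples),
        "min_s": min(samples),
        "max_s": max(samples),
    }
```

Running the same harness against each target platform, with the same inputs, gives comparable numbers to record between tuning passes.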

Managing Mixed GPU Environments

  • Hybrid deployments involving multiple architectures.
  • Implementing fallback strategies and device detection.
  • Employing abstraction layers to enhance code maintainability.
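
Device detection with a fallback can be sketched as an ordered list of probes. The example below is a minimal sketch, not a vendor-endorsed method: the module names in `DEFAULT_MODULES` are hypothetical placeholders, to be replaced by whatever package the installed SDK actually provides, and `select_backend` is an illustrative name. Keeping the probes behind one function is also a small example of an abstraction layer, since the rest of the code only sees a backend name.

```python
import importlib.util
from typing import Callable, Dict, Iterable, Optional

# Hypothetical placeholder module names (assumptions, not real package names).
DEFAULT_MODULES = {
    "ascend": "example_ascend_runtime",
    "biren": "example_biren_runtime",
    "cambricon": "example_cambricon_runtime",
}


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def select_backend(
    preferred: Iterable[str],
    probes: Optional[Dict[str, Callable[[], bool]]] = None,
    fallback: str = "cpu",
) -> str:
    """Return the first backend in `preferred` whose probe succeeds, else `fallback`."""
    if probes is None:
        probes = {
            name: (lambda module=module: module_available(module))
            for name, module in DEFAULT_MODULES.items()
        }
    for name in preferred:
        probe = probes.get(name)
        if probe is None:
            continue  # unknown backend name: skip it
        try:
            if probe():
                return name
        except Exception:
            continue  # a failing probe is treated as "not available"
    return fallback
```

A caller might write `select_backend(["ascend", "cambricon", "biren"])` and branch on the returned name, with `"cpu"` as the safe default when no accelerator runtime is found.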

Case Studies and Best Practices

  • Porting vision and NLP models to Ascend or Cambricon.
  • Retrofitting inference pipelines on Biren clusters.
  • Handling version mismatches and API gaps.

Summary and Next Steps

Requirements

  • Experience in programming with CUDA or GPU-based applications.
  • Understanding of GPU memory models and compute kernels.
  • Familiarity with AI model deployment or acceleration workflows.

Audience

  • GPU programmers.
  • System architects.
  • Porting specialists.

Duration

  • 21 hours.
