HelloCuda Series: CUDA Thrust Basics
- Thrust: The C++ Parallel Algorithms Library. Its two main components are:
- thrust::device_vector
- thrust::host_vector
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/copy.h>        // thrust::copy
#include <thrust/functional.h>  // thrust::plus
#include <thrust/sort.h>
#include <thrust/reduce.h>
#include <iostream>

void thrustBasic() {
    int host_data[] = {1, 2, 3, 4, 5};

    // Transfer data to device using Thrust
    thrust::device_vector<int> d_data(host_data, host_data + 5);

    // Sort data on the device
    thrust::sort(d_data.begin(), d_data.end());

    // Sum of the data on the device
    int sum = thrust::reduce(d_data.begin(), d_data.end(), 0, thrust::plus<int>());
    std::cout << "Sum of sorted data: " << sum << std::endl;

    // Transfer data back to host
    thrust::copy(d_data.begin(), d_data.end(), host_data);

    std::cout << "Sorted data: ";
    for (int i = 0; i < 5; ++i) {
        std::cout << host_data[i] << " ";
    }
    std::cout << std::endl;
}

// Entry point so the file builds into a runnable binary
int main() {
    thrustBasic();
    return 0;
}
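The listing above feeds a raw array into a device_vector, so thrust::host_vector never actually appears in it. As a minimal sketch of how the two containers work together, the variant below copies between them by plain assignment. The helper names sortOnDevice and sumOnDevice are hypothetical and exist only for this illustration.

```cuda
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/functional.h>
#include <thrust/sort.h>
#include <thrust/reduce.h>
#include <iostream>

// Hypothetical helper: sort on the GPU and return the result in host memory.
thrust::host_vector<int> sortOnDevice(const thrust::host_vector<int>& h_in) {
    thrust::device_vector<int> d_data = h_in;    // host -> device copy
    thrust::sort(d_data.begin(), d_data.end());  // executes on the device
    thrust::host_vector<int> h_out = d_data;     // device -> host copy
    return h_out;
}

// Hypothetical helper: sum the elements on the GPU.
int sumOnDevice(const thrust::host_vector<int>& h_in) {
    thrust::device_vector<int> d_data = h_in;    // host -> device copy
    return thrust::reduce(d_data.begin(), d_data.end(), 0, thrust::plus<int>());
}

int main() {
    int raw[] = {5, 3, 1, 4, 2};
    thrust::host_vector<int> h_data(raw, raw + 5);

    thrust::host_vector<int> h_sorted = sortOnDevice(h_data);

    std::cout << "Sorted data: ";
    for (size_t i = 0; i < h_sorted.size(); ++i) {
        std::cout << h_sorted[i] << " ";
    }
    std::cout << std::endl;
    std::cout << "Sum: " << sumOnDevice(h_data) << std::endl;
    return 0;
}
```

Assigning one container to the other is what triggers the transfer, so no explicit cudaMemcpy call is needed. Each helper here performs its own host-to-device copy for simplicity; code that runs several algorithms on the same data would normally keep a single device_vector alive and reuse it.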
Caveat:
nvcc -Wno-deprecated-gpu-targets -rdc=true hello_world.cu -arch=sm_61 -o hello_world && ./hello_world
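The -arch=sm_61 flag (sm_61 is my GPU's architecture) is required when compiling Thrust code. Without it, the program fails at runtime with the following error: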
terminate called after throwing an instance of 'thrust::THRUST_200802_SM_520_NS::system::system_error'
what(): radix_sort: failed on 1st step: cudaErrorInvalidDeviceFunction: invalid device function
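The value sm_61 is specific to my GPU. If you are unsure which value fits your own card, one way to find out is to ask the CUDA runtime for the device's compute capability. The sketch below assumes the GPU of interest is device 0, and archFlag is a hypothetical helper that only formats the result as a compiler flag.

```cuda
#include <cuda_runtime.h>
#include <cstdio>
#include <string>

// Hypothetical helper: format a compute capability as an nvcc -arch flag.
std::string archFlag(int major, int minor) {
    return "-arch=sm_" + std::to_string(major) + std::to_string(minor);
}

int main() {
    const int device = 0;  // assumption: the GPU of interest is device 0
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }
    // prop.major and prop.minor hold the compute capability, e.g. 6 and 1.
    std::printf("%s: compute capability %d.%d, compile with %s\n",
                prop.name, prop.major, prop.minor,
                archFlag(prop.major, prop.minor).c_str());
    return 0;
}
```

This small program uses only the runtime API and launches no kernels, so it should build with a plain nvcc invocation. The flag it prints can then replace sm_61 in the compile command above.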