Need the best result
GPU EXPERIMENT:
You will create an array A with 5000 random chars on the host
Transfer A to the device
On the device, launch 10 threads
Each thread will (insertion) sort its own chunk, such as Thread 0 sorts from A[0] to A[499], Thread 1 sorts from A[500] to A[999]
Transfer A back to the host
Measure how long the whole process takes, including transferring A to the device and from the device back to the host
Print each chunk
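
The steps above could be sketched roughly as follows. This is only a minimal sketch, not a tuned solution, and it makes several assumptions that the task does not state: "random chars" are taken to be lowercase letters generated with a fixed seed, the 10 threads are launched in a single block, and the timing uses CUDA events around the two copies and the kernel. The names `sortChunks`, `CUDA_CHECK`, `N`, `NUM_THREADS` and `CHUNK` are made up for illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define N 5000                    // array size from the task
#define NUM_THREADS 10            // thread count from the task
#define CHUNK (N / NUM_THREADS)   // 500 elements per thread

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                            \
    do {                                                            \
        cudaError_t err_ = (call);                                  \
        if (err_ != cudaSuccess) {                                  \
            fprintf(stderr, "CUDA error: %s at %s:%d\n",            \
                    cudaGetErrorString(err_), __FILE__, __LINE__);  \
            exit(EXIT_FAILURE);                                     \
        }                                                           \
    } while (0)

// Each thread insertion-sorts its own chunk: [tid*chunk, tid*chunk + chunk).
__global__ void sortChunks(char *a, int chunk, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    int start = tid * chunk;
    if (start >= n) return;
    int end = min(start + chunk, n);
    for (int i = start + 1; i < end; ++i) {
        char key = a[i];
        int j = i - 1;
        while (j >= start && a[j] > key) {
            a[j + 1] = a[j];
            --j;
        }
        a[j + 1] = key;
    }
}

int main()
{
    static char h_a[N];
    srand(1234);  // assumption: fixed seed so runs are repeatable
    for (int i = 0; i < N; ++i)
        h_a[i] = (char)('a' + rand() % 26);  // assumption: lowercase letters

    char *d_a = nullptr;
    CUDA_CHECK(cudaMalloc(&d_a, N * sizeof(char)));

    cudaEvent_t start, stop;
    CUDA_CHECK(cudaEventCreate(&start));
    CUDA_CHECK(cudaEventCreate(&stop));

    // Timed region: host-to-device copy, kernel, device-to-host copy.
    CUDA_CHECK(cudaEventRecord(start));
    CUDA_CHECK(cudaMemcpy(d_a, h_a, N * sizeof(char), cudaMemcpyHostToDevice));
    sortChunks<<<1, NUM_THREADS>>>(d_a, CHUNK, N);
    CUDA_CHECK(cudaGetLastError());
    CUDA_CHECK(cudaMemcpy(h_a, d_a, N * sizeof(char), cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaEventRecord(stop));
    CUDA_CHECK(cudaEventSynchronize(stop));

    float ms = 0.0f;
    CUDA_CHECK(cudaEventElapsedTime(&ms, start, stop));

    // Print each sorted chunk on its own line.
    for (int t = 0; t < NUM_THREADS; ++t)
        printf("Chunk %d: %.*s\n", t, CHUNK, h_a + t * CHUNK);
    printf("Elapsed (copy in + sort + copy out): %.3f ms\n", ms);

    CUDA_CHECK(cudaEventDestroy(start));
    CUDA_CHECK(cudaEventDestroy(stop));
    CUDA_CHECK(cudaFree(d_a));
    return 0;
}
```

Assuming the file is saved as `sort_chunks.cu` (a hypothetical name), it should build with `nvcc sort_chunks.cu -o sort_chunks`. The first CUDA call in a process also pays a one-time context setup cost; here that lands on `cudaMalloc`, which sits outside the timed region, so whether to count it is a choice the task leaves open.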
Hi, it seems like simple CUDA code, and the insertion sort can be coded in CUDA without much pain. Let me know if you're interested.
I can measure performance as you describe
Hello, I am a computer science student and I am good at CUDA programming. My bachelor's thesis was on CUDA, and I also completed a university project in CUDA C. Hire me to get your job done well at low cost.
Hello, click on the [CHAT] button below so that I can ask a few questions concerning the GPU code using CUDA, C programming project.
I have read all the provided instructions, and I am the right person to work on this task. I provide exceptional quality services on time, leaving you fully satisfied, and this is what your money is worth.