{"id":9178,"date":"2025-01-06T11:51:14","date_gmt":"2025-01-06T10:51:14","guid":{"rendered":"https:\/\/solidt.eu\/site\/?p=9178"},"modified":"2025-01-06T11:51:16","modified_gmt":"2025-01-06T10:51:16","slug":"c-running-computations-on-gpu","status":"publish","type":"post","link":"https:\/\/solidt.eu\/site\/c-running-computations-on-gpu\/","title":{"rendered":"C# Running computations on GPU"},"content":{"rendered":"\n<p>Example uses NuGet package ComputeSharp 3.1.0<\/p>\n\n\n\n<div style=\"height: 250px; position:relative; margin-bottom: 50px;\" class=\"wp-block-simple-code-block-ace\"><pre class=\"wp-block-simple-code-block-ace\" style=\"position:absolute;top:0;right:0;bottom:0;left:0\" data-mode=\"csharp\" data-theme=\"monokai\" data-fontsize=\"14\" data-lines=\"Infinity\" data-showlines=\"true\" data-copy=\"false\">using System.Diagnostics;\nusing ComputeSharp;\n\nnamespace GpuTestConsole\n{\n    public class Program\n    {\n        static void Main(string[] args)\n        {\n            var sw = new Stopwatch();\n\n            \/\/ Input arrays\n            float[] input1 = { 1, 2, 3, 4 };\n            float[] input2 = { 5, 6, 7, 8 };\n            float[] input3 = { 9, 10, 11, 12 };\n\n            \/\/ Output array\n            float[] output = new float[input1.Length];\n\n            var gpu = GraphicsDevice.GetDefault();\n            Console.WriteLine($\"GPU: {gpu.Name}\");\n            Console.WriteLine($\"Compute units: {gpu.ComputeUnits}\");\n\n            \/\/ Allocate buffers once\n            using var inputBuffer = gpu.AllocateReadOnlyBuffer&lt;float>(input1.Length);\n            using var outputBuffer = gpu.AllocateReadWriteBuffer&lt;float>(output.Length);\n\n            \/\/ Create the kernel (without running it yet)\n            var kernel = new AddKernel(inputBuffer, inputBuffer, outputBuffer);\n\n            \/\/ First computation\n            inputBuffer.CopyFrom(input1);\n            kernel.InputBuffer2.CopyFrom(input2); \/\/ Dynamically set 
second input buffer\n\n            sw.Start();\n            gpu.For(input1.Length, kernel);\n\n            \/\/ Copy result\n            outputBuffer.CopyTo(output);\n            Console.WriteLine($\"Result 1: {string.Join(\", \", output)}  ({sw.ElapsedMilliseconds} ms)\");\n\n            \/\/ Second computation with new inputs\n            inputBuffer.CopyFrom(input3);\n            kernel.InputBuffer2.CopyFrom(input1);\n\n            sw.Restart();\n            gpu.For(input1.Length, kernel);\n\n            outputBuffer.CopyTo(output);\n            Console.WriteLine($\"Result 2: {string.Join(\", \", output)}  ({sw.ElapsedMilliseconds} ms)\");\n        }\n    }\n\n    [GeneratedComputeShaderDescriptor]\n    [ThreadGroupSize(DefaultThreadGroupSizes.X)]\n    public partial struct AddKernel : IComputeShader\n    {\n        public ReadOnlyBuffer&lt;float> InputBuffer1;\n        public ReadOnlyBuffer&lt;float> InputBuffer2;\n        public ReadWriteBuffer&lt;float> OutputBuffer;\n\n        public AddKernel(ReadOnlyBuffer&lt;float> input1, ReadOnlyBuffer&lt;float> input2, ReadWriteBuffer&lt;float> output)\n        {\n            InputBuffer1 = input1;\n            InputBuffer2 = input2;\n            OutputBuffer = output;\n        }\n\n        \/*\n         Points to watch in Execute\n\n        Thread Safety:\n            The GPU runs the Execute method in parallel, once per element.\n            Avoid shared variables (globals) and make sure each thread only uses its own index.\n\n        Thread Index:\n            Use ThreadIds.X for the 1D index. 
For 2D or 3D computations you can also use ThreadIds.Y and ThreadIds.Z.\n            Always check that the thread index is within the bounds of the buffer to prevent out-of-bounds errors.\n\n        Buffer Types:\n            Use ReadOnlyBuffer&lt;T> for buffers that are not modified.\n            Use ReadWriteBuffer&lt;T> for buffers that are both read and written.\n\n        Debugging:\n            Errors in GPU shaders can be hard to debug. Check that all index calculations are correct and that buffers are large enough.\n\n        Performance optimization:\n            Avoid unnecessary synchronization between CPU and GPU.\n            Combine multiple small computations into one kernel to reduce overhead.\n         *\/\n\n        public void Execute()\n        {\n            int i = ThreadIds.X;\n            OutputBuffer[i] = InputBuffer1[i] + InputBuffer2[i];\n        }\n    }\n}\n<\/pre><\/div>\n","protected":false},"excerpt":{"rendered":"<p>Example uses NuGet package ComputeSharp 
3.1.0<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-9178","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/solidt.eu\/site\/wp-json\/wp\/v2\/posts\/9178","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/solidt.eu\/site\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/solidt.eu\/site\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/solidt.eu\/site\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/solidt.eu\/site\/wp-json\/wp\/v2\/comments?post=9178"}],"version-history":[{"count":1,"href":"https:\/\/solidt.eu\/site\/wp-json\/wp\/v2\/posts\/9178\/revisions"}],"predecessor-version":[{"id":9179,"href":"https:\/\/solidt.eu\/site\/wp-json\/wp\/v2\/posts\/9178\/revisions\/9179"}],"wp:attachment":[{"href":"https:\/\/solidt.eu\/site\/wp-json\/wp\/v2\/media?parent=9178"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/solidt.eu\/site\/wp-json\/wp\/v2\/categories?post=9178"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/solidt.eu\/site\/wp-json\/wp\/v2\/tags?post=9178"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}