How do I create threads on different processor cores?

Let's say I have a C# program that does something computationally expensive, like encoding a list of WAV files into MP3s. Ordinarily I would encode the files one at a time, but let's say I want the program to figure out how many CPU cores I have and spin up an encoding thread on each core. So, when I run the program on a quad-core CPU, the program figures out it is a quad-core CPU, figures out there are four cores to work with, then spawns four encoding threads, each of which runs on its own CPU. How would I do this?

And would it be any different if the cores were spread out across multiple physical CPUs? As in, if I had a machine with two quad-core CPUs on it, are there any special considerations, or are the eight cores across the two dies considered equal in Windows?

+50
multithreading c# windows
Aug 28 '08 at 14:11
10 answers

Do not worry about it.

Use the thread pool instead. The thread pool is a framework mechanism (actually a class) that you can ask for a new thread.

When you ask for a new thread, it will either give you a new one or queue the job until a thread frees up. That way, the framework is in charge of deciding whether or not it should create more threads, depending on the number of available CPUs.

Edit: Also, as others have already mentioned, the OS is responsible for distributing the threads among the different CPUs.
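
For the WAV-to-MP3 scenario in the question, a minimal sketch of this thread-pool approach could look like the following; the file names and the EncodeToMp3 method are placeholders standing in for a real encoder:

    using System;
    using System.Threading;

    class ThreadPoolEncodingSketch
    {
        static void Main()
        {
            string[] wavFiles = { "01.wav", "02.wav", "03.wav", "04.wav" }; // placeholder names

            using (var remaining = new CountdownEvent(wavFiles.Length))
            {
                foreach (string file in wavFiles)
                {
                    // Queue one work item per file; the pool decides how many
                    // threads actually run at once on this machine.
                    ThreadPool.QueueUserWorkItem(state =>
                    {
                        EncodeToMp3((string)state);
                        remaining.Signal();
                    }, file);
                }

                remaining.Wait(); // block until every file has been encoded
            }
        }

        // Stand-in for a real WAV -> MP3 encoder.
        static void EncodeToMp3(string path)
        {
            Console.WriteLine("Encoding " + path + " on pool thread " +
                              Thread.CurrentThread.ManagedThreadId);
        }
    }

The point being: nothing in this code mentions cores at all; the pool and the OS between them decide where the work runs.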

+49
Aug 28 '08 at 14:13

This is not necessarily as simple as using a thread pool.

By default, the thread pool allocates multiple threads per CPU. Since every thread that gets involved in the work you are doing has a cost (task-switching overhead, use of the CPU's very limited L1, L2, and possibly L3 caches, etc.), the optimal number of threads to use is at most the number of available CPUs - unless each thread is requesting services from other machines, as with a highly scalable web service. In some cases, particularly ones that involve more hard-disk reading and writing than CPU activity, you can actually be better off with one thread than with multiple threads.

For most applications, and certainly for encoding WAV and MP3 files, you should limit the number of worker threads to the number of available CPUs. Here is some C# code to find the number of processors:

    int processors = 1;
    string processorsStr = System.Environment.GetEnvironmentVariable("NUMBER_OF_PROCESSORS");
    if (processorsStr != null)
        processors = int.Parse(processorsStr);

Unfortunately, it is not even as simple as limiting yourself to the number of CPUs. You also have to take into account the performance of the hard-disk controller(s) and the disk(s) themselves.

The only way you can really find the optimal number of threads is trial and error. This is particularly true when you are working with hard disks, web services, and the like. With hard disks, you might be better off not using all four cores of your quad-core CPU. On the other hand, with some web services you might be better off making 10 or even 100 requests per CPU.
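
As a rough illustration of that starting point, here is a sketch that sizes a simple worker pool from the CPU count and lets you tune the number up or down by measurement; the file names and the encoding step are placeholders:

    using System;
    using System.Collections.Concurrent;
    using System.Threading;

    class WorkerCountSketch
    {
        static void Main(string[] args)
        {
            // Start with one worker per logical CPU, then adjust based on measurements.
            int workerCount = args.Length > 0 ? int.Parse(args[0]) : Environment.ProcessorCount;

            var files = new ConcurrentQueue<string>(
                new[] { "01.wav", "02.wav", "03.wav", "04.wav" }); // placeholder names

            var workers = new Thread[workerCount];
            for (int i = 0; i < workerCount; i++)
            {
                workers[i] = new Thread(() =>
                {
                    // Each worker keeps pulling files from the shared queue until it is empty.
                    while (files.TryDequeue(out string file))
                    {
                        Console.WriteLine("Encoding " + file); // stand-in for the real encoder
                    }
                });
                workers[i].Start();
            }

            foreach (var worker in workers)
                worker.Join();
        }
    }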

+18
Feb 20 '09 at 1:54

Where managed threads are concerned, this is more complex than it is with native threads. That is because CLR threads are not directly tied to a native OS thread. In other words, the CLR can switch a managed thread from one native thread to another as it sees fit. The Thread.BeginThreadAffinity function is provided to lock a managed thread to a particular native OS thread. At that point, you could experiment with the native APIs to give the underlying native thread processor affinity. As everyone here has said, though, this is not a good idea. In fact, there is documentation suggesting that threads can get less processing time if they are restricted to a single processor or core.

You can also look into the System.Diagnostics.Process class. There you can find a method for listing a process's threads as a collection of ProcessThread objects. That class has members for setting ProcessorAffinity, and even for setting a preferred ("ideal") processor - not sure exactly what that does.
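
As a hedged sketch only (and not something the answers here recommend), setting affinity through those ProcessThread members might look like this. Note that these are OS threads, not managed threads, so to target a specific managed thread you would still need Thread.BeginThreadAffinity plus a thread-id lookup, as the last answer on this page demonstrates:

    using System;
    using System.Diagnostics;

    class ProcessThreadAffinitySketch
    {
        static void Main()
        {
            Process current = Process.GetCurrentProcess();

            foreach (ProcessThread osThread in current.Threads)
            {
                Console.WriteLine("OS thread " + osThread.Id);

                // Restrict this OS thread to core 0 (bit mask 0b0001).
                osThread.ProcessorAffinity = new IntPtr(1);

                // Or merely hint at a preferred core without forcing it.
                osThread.IdealProcessor = 0;
            }
        }
    }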

Disclaimer: I had a similar problem where I thought the CPU(s) were not being fully used, and I looked into a lot of this; but based on everything I read, it turned out not to be a good idea, as backed up by the comments posted here. It is still interesting to experiment with, though.

+8
Aug 28 '08 at 14:49

You shouldn't have to worry about doing this yourself. I have multithreaded .NET applications running on dual-core machines, and no matter how the threads are started, whether via the ThreadPool or manually, I see a nice even distribution of work across all cores.

+2
Aug 28 '08 at 14:16

You can do this from inside your program.

However, you should not try to do it, since the operating system is the best candidate for managing this stuff. I mean, a user-mode program should not try to do it.

That said, it can sometimes be done (by a truly advanced user) to achieve load balancing, or even to reproduce a genuine multi-core, multi-threaded problem (data consistency / code caching issues...), since different threads would really be executing on different processors.

Having said that, if you still want to do it, it can be done as follows. I am giving you pseudo-code for Windows, but it could just as easily be done on Linux.

    #define MAX_CORE 256
    processor_mask[MAX_CORE] = {0};
    core_number = 0;

    Call GetLogicalProcessorInformation();
    // From here we calculate core_number and also populate the processor_mask[]
    // array, which is used later on to run different threads on different cores.

    for (j = 0; j < THREAD_POOL_SIZE; j++)
        Call SetThreadAffinityMask(hThread[j], processor_mask[j]);
        // hThread is the array of thread handles.

    // Now, if your number of threads is higher than the actual number of cores,
    // you can reset the counter (j) once it reaches core_number.

After calling the above procedure, the threads would always execute as follows:

    Thread1  -> Core1
    Thread2  -> Core2
    Thread3  -> Core3
    Thread4  -> Core4
    Thread5  -> Core5
    Thread6  -> Core6
    Thread7  -> Core7
    Thread8  -> Core8
    Thread9  -> Core1
    Thread10 -> Core2
    ...............

See the MSDN documentation for more information on these concepts.
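
A loose C# translation of the same idea, strictly as an illustration (it assumes at most 64 logical cores for the bit mask, and in managed code you would also want Thread.BeginThreadAffinity around the pinned work, as an earlier answer explains):

    using System;
    using System.Runtime.InteropServices;
    using System.Threading;

    class NativeAffinitySketch
    {
        [DllImport("kernel32.dll")]
        static extern IntPtr GetCurrentThread(); // pseudo-handle for the calling thread

        [DllImport("kernel32.dll")]
        static extern UIntPtr SetThreadAffinityMask(IntPtr hThread, UIntPtr dwThreadAffinityMask);

        static void Main()
        {
            int cores = Environment.ProcessorCount;

            for (int i = 0; i < 8; i++) // spawn more threads than cores to show the wrap-around
            {
                int core = i % cores; // reset once we run past the last core
                var thread = new Thread(() =>
                {
                    // Pin the calling OS thread to a single core, round-robin style.
                    SetThreadAffinityMask(GetCurrentThread(), new UIntPtr(1UL << core));
                    Console.WriteLine("Thread pinned to core " + core);
                    // ... the actual work would go here ...
                });
                thread.Start();
            }
        }
    }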

+2
Aug 28 '13 at 23:28

Where each thread goes is generally handled by the OS itself... so spawn 4 threads on a quad-core system and the OS will decide which cores to run each one on, which will usually end up as 1 thread on each core.

+1
Aug 28 '08 at 14:13

It is the operating system's job to split threads across different cores, and it will do so automatically when your threads are using a lot of CPU time. Don't worry about it. As for finding out how many cores your user has, try Environment.ProcessorCount in C#.
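
For completeness, a tiny sketch of that property, as a simpler alternative to parsing the NUMBER_OF_PROCESSORS environment variable shown in an earlier answer:

    using System;

    class ProcessorCountSketch
    {
        static void Main()
        {
            int cores = Environment.ProcessorCount; // logical processors visible to the process
            Console.WriteLine("Logical processors: " + cores);
        }
    }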

+1
Aug 28 '08 at 14:13

One of the reasons you should not (as has been said) try to do this yourself is that you simply don't have enough information to do it properly, particularly going into the future with NUMA and so on.

If you have a thread that is ready to run and there's a core sitting idle, the kernel will run your thread on it - don't worry about it.

+1
Aug 28 '08 at 14:29

You cannot really do this, since only the operating system has the privileges to do it. If you were to handle it yourself, it would make the application hard to code, because then you would also have to take care of inter-thread communication and critical sections; for every application you would need to create your own semaphores or mutexes... things for which the operating system provides a general solution instead of everyone doing it themselves.

+1
Jun 18 '18

Although I agree with most of the answers here, I think it is worth adding a new consideration: SpeedStep technology.

When running a CPU-intensive, single-threaded task on a multi-core system - in my case a Xeon E5-2430 with 6 real cores (12 with hyper-threading) under Windows Server 2012 - the work got spread across all 12 cores, using about 8.33% of each core and never triggering a speed increase. The CPU stayed at 1.2 GHz.

When I set the thread's affinity to a specific core, it used ~100% of that core, causing the CPU to reach its maximum of 2.5 GHz, which more than doubled the performance.

This is the program I used, which simply loops and increments a variable. When called with -a, it will set the affinity to core 1. The affinity part was based on this post.

    using System;
    using System.Diagnostics;
    using System.Linq;
    using System.Runtime.InteropServices;
    using System.Threading;

    namespace Esquenta
    {
        class Program
        {
            private static int numThreads = 1;
            static bool affinity = false;

            static void Main(string[] args)
            {
                if (args.Contains("-a"))
                {
                    affinity = true;
                }
                if (args.Length < 1 || !int.TryParse(args[0], out numThreads))
                {
                    numThreads = 1;
                }
                Console.WriteLine("numThreads:" + numThreads);
                for (int j = 0; j < numThreads; j++)
                {
                    var param = new ParameterizedThreadStart(EsquentaP);
                    var thread = new Thread(param);
                    thread.Start(j);
                }
            }

            static void EsquentaP(object numero_obj)
            {
                int i = 0;
                DateTime ultimo = DateTime.Now;
                if (affinity)
                {
                    Thread.BeginThreadAffinity();
                    CurrentThread.ProcessorAffinity = new IntPtr(1);
                }
                try
                {
                    while (true)
                    {
                        i++;
                        if (i == int.MaxValue)
                        {
                            i = 0;
                            var lps = int.MaxValue / (DateTime.Now - ultimo).TotalSeconds / 1000000;
                            Console.WriteLine("Thread " + numero_obj + " " + lps.ToString("0.000") + " M loops/s");
                            ultimo = DateTime.Now;
                        }
                    }
                }
                finally
                {
                    Thread.EndThreadAffinity();
                }
            }

            [DllImport("kernel32.dll")]
            public static extern int GetCurrentThreadId();

            [DllImport("kernel32.dll")]
            public static extern int GetCurrentProcessorNumber();

            private static ProcessThread CurrentThread
            {
                get
                {
                    int id = GetCurrentThreadId();
                    return Process.GetCurrentProcess().Threads.Cast<ProcessThread>().Single(x => x.Id == id);
                }
            }
        }
    }

And the results:

[image: results]

The processor speed, as shown by the task manager, is similar to what CPU-Z reports:

[image: CPU speed as shown by Task Manager and CPU-Z]

+1
Apr 01 '15 at 13:25


