Particle: optimize threading for many particles and many cores

The maximum of 256 particles per task was outdated and led to too much thread
contention. Instead, define a low fixed number of tasks per thread.

On an i7-7700HQ, creating 4 million particles went down from 31s to 4s.

Thanks to Oscar Abad, Sav Martin, Zebus3d, Sebastián Barschkis and Martin Felke
for testing and advice.

Differential Revision: https://developer.blender.org/D4910
Juan Gea 2019-05-21 16:30:03 +02:00 committed by Brecht Van Lommel
parent 9e82e48937
commit fbae1c9ed5

@@ -471,8 +471,6 @@ void psys_thread_context_init(ParticleThreadContext *ctx, ParticleSimulationData
   ctx->ma = give_current_material(sim->ob, sim->psys->part->omat);
 }
-#define MAX_PARTICLES_PER_TASK \
-  256 /* XXX arbitrary - maybe use at least number of points instead for better balancing? */
 BLI_INLINE int ceil_ii(int a, int b)
 {
@@ -486,7 +484,7 @@ void psys_tasks_create(ParticleThreadContext *ctx,
                        int *r_numtasks)
 {
   ParticleTask *tasks;
-  int numtasks = ceil_ii((endpart - startpart), MAX_PARTICLES_PER_TASK);
+  int numtasks = min_ii(BLI_system_thread_count() * 4, endpart - startpart);
   float particles_per_task = (float)(endpart - startpart) / (float)numtasks, p, pnext;
   int i;
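
To illustrate the effect of the change, here is a minimal standalone sketch that compares the old and new task counts and splits a particle range into contiguous slices. It is not the Blender code: the helper names (`old_task_count`, `new_task_count`, `task_range`) are hypothetical, the thread count is passed in as a parameter instead of calling `BLI_system_thread_count()`, and the slicing uses integer arithmetic rather than the float step `particles_per_task` seen in the diff.

```c
/* Integer ceiling division, in the spirit of ceil_ii() from the diff. */
static int ceil_div(int a, int b)
{
  return (a + b - 1) / b;
}

static int min_int(int a, int b)
{
  return (a < b) ? a : b;
}

/* Old scheme: one task per 256 particles (the removed MAX_PARTICLES_PER_TASK). */
static int old_task_count(int num_particles)
{
  return ceil_div(num_particles, 256);
}

/* New scheme: at most 4 tasks per thread, never more tasks than particles.
 * num_threads stands in for BLI_system_thread_count() (assumption). */
static int new_task_count(int num_particles, int num_threads)
{
  return min_int(num_threads * 4, num_particles);
}

/* Half-open range [*r_begin, *r_end) handled by task i out of numtasks.
 * Integer arithmetic is used here so the slices cover the range exactly;
 * the diff itself advances a float cursor by particles_per_task. */
static void task_range(int startpart, int endpart, int numtasks, int i, int *r_begin, int *r_end)
{
  long long total = (long long)endpart - startpart;
  *r_begin = startpart + (int)(total * i / numtasks);
  *r_end = startpart + (int)(total * (i + 1) / numtasks);
}
```

With 4 million particles, the old scheme yields 15625 tasks, while the new one yields 32 tasks if the system reports 8 threads, each task covering 125000 particles. Fewer and larger tasks mean fewer trips through the shared task scheduler, which is consistent with the reduced contention described in the commit message.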