memory problem on process fork
Assignee: Alex Norton
For some reason, on fo.usa the scheduler is unable to fork all of the agents.
Updated by Bob Gobeille about 1 year ago
- Priority changed from High to Urgent
Changing to urgent since fo.usa sched is now dead.
Updated by Alex Norton about 1 year ago
- Status changed from In Progress to Resolved
Basically, fo.usa uses enough hosts that the number of agents being started at the same time exceeded the amount of memory allocated to the scheduler. The only time that all 60+ agents are started at the same time is during the start up test.
The scheduler now uses a random back off when a fork fails. The thread that the agent is being forked from will wait between 1 and 5 seconds and then try the fork again. Since this currently only happens on scheduler start up, this seems like a reasonable action to take.
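For anyone reading this later, here is a rough sketch of the retry idea in Python. It is not the scheduler code from the commit. The function name `fork_with_backoff`, the `max_attempts` cap, and the injectable `fork` and `sleep` parameters are all assumptions made for illustration; only the 1 to 5 second random wait comes from the description above.

```python
import os
import random
import time


def fork_with_backoff(fork=os.fork, min_wait=1.0, max_wait=5.0,
                      max_attempts=10, sleep=time.sleep):
    """Call fork(), retrying after a random wait whenever it raises OSError.

    min_wait/max_wait mirror the 1 to 5 second window described in this
    ticket. max_attempts is a hypothetical safety cap, not something the
    ticket specifies.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            # os.fork() returns 0 in the child and the child's pid in the parent.
            return fork()
        except OSError:
            # A failed fork (for example ENOMEM or EAGAIN) surfaces as OSError.
            if attempt == max_attempts:
                raise
            # Random wait so that threads forking at the same time do not
            # all retry at the same moment.
            sleep(random.uniform(min_wait, max_wait))
```

Because each thread picks its own random delay, the retries spread out over a few seconds instead of all hitting the memory limit together again.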
Should be fixed with svn 5845