rtime.felk.cvut.cz Git - sojka/nv-tegra/linux-3.10.git/commit
gpu: nvgpu: implement per-channel watchdog
author Deepak Nibade <dnibade@nvidia.com>
Mon, 31 Aug 2015 09:00:35 +0000 (14:30 +0530)
committer mobile promotions <svcmobile_promotions@nvidia.com>
Thu, 5 Nov 2015 07:19:33 +0000 (23:19 -0800)
commit 5db0de02f70f23687fa6990a5ddb93e8b283ac03
tree 48213d0fe240d41d15af24d3bf135ade8e15998f
parent b30d97e1adf3f6ce85b4bf36c15896811f54c7b5

Implement a per-channel watchdog timer according to the following rules:
- start the timer when the first job is submitted on a channel, or
  on any later submit if no timer is already running
- cancel the timer when a job completes
- restart the timer if any incomplete job remains
  in the channel's queue
- trigger the appropriate recovery method as part of the timeout
  handling mechanism
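
The timer rules above can be sketched as a small user-space model.
This is only an illustration under stated assumptions, not the code
in channel_gk20a.c: struct wdt_channel and the wdt_* helpers are
hypothetical names, and time is passed in as plain milliseconds
instead of using kernel timers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified channel state; not the driver's real struct. */
struct wdt_channel {
	size_t pending_jobs;      /* jobs submitted but not yet completed */
	bool timer_running;       /* true while the watchdog is armed */
	unsigned long expires_ms; /* absolute expiry time of the armed timer */
	unsigned long timeout_ms; /* watchdog period */
};

static void wdt_start(struct wdt_channel *ch, unsigned long now_ms)
{
	ch->timer_running = true;
	ch->expires_ms = now_ms + ch->timeout_ms;
}

static void wdt_stop(struct wdt_channel *ch)
{
	ch->timer_running = false;
}

/* Rule 1: arm on submit only if no timer is running (covers the first job). */
static void wdt_on_submit(struct wdt_channel *ch, unsigned long now_ms)
{
	ch->pending_jobs++;
	if (!ch->timer_running)
		wdt_start(ch, now_ms);
}

/* Rules 2 and 3: cancel on completion, re-arm if jobs are still queued. */
static void wdt_on_job_complete(struct wdt_channel *ch, unsigned long now_ms)
{
	assert(ch->pending_jobs > 0);
	ch->pending_jobs--;
	wdt_stop(ch);
	if (ch->pending_jobs > 0)
		wdt_start(ch, now_ms);
}

/* Rule 4 entry point: an armed timer that has expired needs timeout handling. */
static bool wdt_expired(const struct wdt_channel *ch, unsigned long now_ms)
{
	return ch->timer_running && now_ms >= ch->expires_ms;
}
```

In this model a second submit does not push the deadline out, while
each completion gives the remaining jobs a fresh full period.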

Handle the timeout as follows:
- get the timed-out channel and its job data
- disable activity on all engines
- check whether the fence is really pending
- get information on the failing engine
- if no engine is failing, just abort the channel
- if an engine is failing, trigger the recovery
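
The decision at the end of the timeout handling steps can be sketched
as a pure function. This is a hedged illustration, not the handler in
fifo_gk20a.c: enum wdt_action, struct timeout_probe, NO_ENGINE and
wdt_decide() are hypothetical names. It also assumes that a fence
which turns out not to be pending means the job finished and no
action is needed; the commit message does not spell that case out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NO_ENGINE (-1) /* hypothetical marker: no failing engine found */

enum wdt_action {
	WDT_IGNORE,         /* assumed: fence already signalled, nothing to do */
	WDT_ABORT_CHANNEL,  /* fence pending, but no engine is failing */
	WDT_RECOVER_ENGINE  /* fence pending and an engine is failing */
};

/*
 * Hypothetical snapshot of what the handler learned about the timed-out
 * job. It is assumed to be gathered after activity on all engines has
 * been disabled, so the values cannot change while being inspected.
 */
struct timeout_probe {
	bool fence_pending;  /* is the job's fence really still pending? */
	int failing_engine;  /* engine id, or NO_ENGINE */
};

static enum wdt_action wdt_decide(const struct timeout_probe *p)
{
	assert(p != NULL);

	if (!p->fence_pending)
		return WDT_IGNORE;
	if (p->failing_engine == NO_ENGINE)
		return WDT_ABORT_CHANNEL;
	return WDT_RECOVER_ENGINE;
}
```

Keeping the decision separate from the side effects makes each branch
of the list above easy to check in isolation.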

Also add a flag "ch_wdt_enabled" to enable or disable the channel
watchdog mechanism. The watchdog can also be disabled using the
global flag "timeouts_enabled".

Set the watchdog timeout to 5 s using the macro
NVGPU_CHANNEL_WATCHDOG_DEFAULT_TIMEOUT_MS
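
How the two flags and the default timeout might combine can be
sketched as below. The flag and macro names come from this commit
message; the value 5000 follows from "5s" and the _MS suffix.
struct wdt_config and wdt_should_arm() are hypothetical, and where
the flags actually live in the driver is not stated here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* 5 s expressed in milliseconds, as described in the commit message. */
#define NVGPU_CHANNEL_WATCHDOG_DEFAULT_TIMEOUT_MS 5000

/* Hypothetical container for the two flags named in the commit message. */
struct wdt_config {
	bool timeouts_enabled; /* global switch for timeouts */
	bool ch_wdt_enabled;   /* switch for the channel watchdog */
};

/* The watchdog is armed only if neither flag disables it. */
static bool wdt_should_arm(const struct wdt_config *cfg)
{
	assert(cfg != NULL);
	return cfg->timeouts_enabled && cfg->ch_wdt_enabled;
}
```

Clearing either flag is enough to keep the watchdog from being armed.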

Bug 200133289

Change-Id: I401cf14dd34a210bc429f31bd5216a361edf1237
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/797072
(cherry picked from commit 2d4bcbae629bfdee6b7886c9c2bf2932c3ef8245)
Reviewed-on: http://git-master/r/793638
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/815931
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
drivers/gpu/nvgpu/gk20a/channel_gk20a.c
drivers/gpu/nvgpu/gk20a/channel_gk20a.h
drivers/gpu/nvgpu/gk20a/fifo_gk20a.c
drivers/gpu/nvgpu/gk20a/fifo_gk20a.h
drivers/gpu/nvgpu/gk20a/gk20a.c
drivers/gpu/nvgpu/gk20a/gk20a.h