Results from managed_memory_test on the Jetson Pro board suggest that
CUDA kernel launch times for a 1MB buffer are higher than for a 2MB buffer.
The CPU cache flush accounts for a major portion of the launch time. For
buffers of 2MB and above, the cache flush is performed by set/ways, which
explains why 2MB launch times are lower than 1MB launch times.
This patch lowers the threshold to 1MB on T124-based platforms to
reduce the CUDA kernel launch times for 1MB buffers.
Bug 1627912
Change-Id: I5dbca6be06f80cc1f2ed02913380c05769fc45c1
Signed-off-by: Sri Krishna chowdary <schowdary@nvidia.com>
Reviewed-on: http://git-master/r/726134
(cherry picked from commit 9fc6e7917a07e476ccc4213e486871c608564ac3)
Reviewed-on: http://git-master/r/740186
Reviewed-by: Yogesh Kini <ykini@nvidia.com>
Reviewed-by: Krishna Reddy <vdumpa@nvidia.com>
#define NVMAP_CARVEOUT_KILLER_RETRY_TIME 100 /* msecs */
-/* this is basically the L2 cache size */
-#ifdef CONFIG_DENVER_CPU
+/* this is basically the L2 cache size but may be tuned as per requirement */
+#if defined(CONFIG_DENVER_CPU)
size_t cache_maint_inner_threshold = SZ_2M * 8;
+#elif defined(CONFIG_ARCH_TEGRA_12x_SOC)
+size_t cache_maint_inner_threshold = SZ_1M;
#else
size_t cache_maint_inner_threshold = SZ_2M;
#endif
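
For illustration only, the effect of the threshold can be sketched as below. This is a minimal sketch, not the nvmap code: the enum, the macros and choose_flush_method() are hypothetical names, and the assumption that buffers below the threshold are flushed by address range is not stated in the commit message. It only shows why a 1MB buffer takes the set/ways path once the threshold drops from 2MB to 1MB.

```c
#include <stddef.h>

/* Hypothetical names for illustration; these are not nvmap symbols. */
#define SKETCH_SZ_1M ((size_t)1 << 20)
#define SKETCH_SZ_2M ((size_t)2 << 20)

enum flush_method {
	FLUSH_BY_RANGE,    /* assumed: walk the buffer's address range */
	FLUSH_BY_SET_WAYS  /* flush the whole inner cache by set/ways */
};

/*
 * Pick the cache maintenance method for a buffer of `size` bytes.
 * Buffers at or above the threshold use set/ways, as described in
 * the commit message; smaller buffers are assumed to use a
 * range-based flush.
 */
static enum flush_method choose_flush_method(size_t size, size_t threshold)
{
	return size >= threshold ? FLUSH_BY_SET_WAYS : FLUSH_BY_RANGE;
}
```

With a 2MB threshold, choose_flush_method(SKETCH_SZ_1M, SKETCH_SZ_2M) selects the range-based path; with the 1MB threshold this patch sets for T124, the same buffer selects set/ways.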