Do not destroy fftw plans if they were not created

diff --git a/README.md b/README.md

index 31e6e9b98dad018d135111fa67c918efda94d9c2..856a3ead71ef61bfad88bf3dfc73c53aaad4c929 100644
--- a/README.md
+++ b/README.md
@@ -6,28 +6,30 @@ achieve the needed performance we try various ways of parallelization
  of the algorithm including execution on the GPU. The aim is also to
  modify the code according to the PRedictable Execution Model (PREM).
  
-Stable version of the tracker is available from [CTU server][2],
-development happens at [Github][3].
+Stable version of the tracker is available from a [CTU server][2],
+development happens at GitHub [here][wsh] and [here][3].
  
  [1]: http://hercules2020.eu/
  [2]: http://rtime.felk.cvut.cz/gitweb/hercules2020/kcf.git
+[wsh]: https://github.com/wentasah/kcf
  [3]: https://github.com/Shanigen/kcf
  
  ## Prerequisites
  
-The code depends on OpenCV 2.4 (3.0+ for CUDA-based version) library
-and cmake is used for building. Depending on the version to be
-compiled you have to have [FFTW][4], [CUDA][5] or [OpenMP][6]
-installed.
+The code depends on OpenCV 2.4 library
+and [CMake][13] (optionally with [Ninja][8]) is used for building.
+Depending on the version to be compiled you need to have development
+packages for [FFTW][4], [CUDA][5] or [OpenMP][6] installed.
  
-SSE instructions were used in the original code and these are only
-supported on x86 architecture. Thanks to the [SSE2NEON][7] code, we
-now support both ARM and x86 architectures.
+On TX2, the following command should install what's needed:
+``` shellsession
+$ apt install cmake ninja-build libopencv-dev libfftw3-dev
+```
  
  [4]: http://www.fftw.org/
  [5]: https://developer.nvidia.com/cuda-downloads
  [6]: http://www.openmp.org/
-[7]: https://github.com/jratcliff63367/sse2neon
+[13]: https://cmake.org/
  
  ## Compilation
  
@@ -46,7 +48,8 @@ versions in them. If prerequisites of some builds are missing, the
  build system, which is useful when building naively on TX2, because
  builds with `ninja` are faster (better parallelized) than with `make`.
  
-To build only a specific version run `make <version>`, for example:
+To build only a specific version run `make <version>`. For example,
+CUDA-based version can be compiled with:
  
  ``` shellsession
  $ make cufft
@@ -78,7 +81,8 @@ $ make -C build
  $ git submodule update --init
  $ mkdir build
  $ cd build
-$ cmake [options] ..
+$ cmake [options] ..  # see the tables below
+$ make
  ```
  
  The `cmake` options below allow to select, which version to build.
@@ -98,14 +102,11 @@ With all of these FFT version additional options can be added:
  |Option| Description |
  | --- | --- |
  | `-DASYNC=ON` | Use C++ `std::async` to run computations for different scales in parallel. This doesn't work with `BIG_BATCH` mode.|
-| `-DOPENMP=ON` | This option can only be used with CPU versions of the tracker. In normal mode it will run computations for differenct scales in parallel. In the case of the big batch mode it will parallelize the feature extraction  and the search for maximal response for differenct scales. If Fftw version is used with big batch mode it will also parallelize Ffftw's plans.|
-| `-DBIG_BATCH=ON` | Concatenate matrices of different scales to one big matrix and perform all computations on this matrix. This mode doesn't work for OpenCV FFT.|
-| `-DCUDA_DEBUG=ON` | This mode adds CUDA error checking for all kernels and CUDA runtime libraries. Only works with cuFFT version.|
+| `-DBIG_BATCH=ON` | Concatenate matrices of different scales to one big matrix and perform all computations on this matrix. This mode doesn't work with `OpenCV` FFT.|
+| `-DOPENMP=ON` | Parallelize certain operation with OpenMP. This can only be used with `OpenCV` or `fftw` FFT implementations. By default it runs computations for differenct scales in parallel. With `-DBIG_BATCH=ON` it parallelizes the feature extraction and the search for maximal response for differenct scales. With `fftw`, Ffftw's plans will execute in parallel.|
+| `-DCUDA_DEBUG=ON` | Adds calls cudaDeviceSynchronize after every CUDA function and kernel call.|
+| `-DOpenCV_DIR=/opt/opencv-3.3/share/OpenCV` | Compile against a custom OpenCV version. |
  
-Finally call make:
-```
-$ make
-```
  
  ### Compilation for non-TX2 CUDA
  
@@ -164,7 +165,7 @@ top_left_y, width, height".
  | --visualize, -v[delay_ms] | Visualize the output, optionally with specified delay. If the delay is 0 the program will wait for a key press. |
  | --output, -o <output.txt>     | Specify name of output file. |
  | --debug, -d                           | Generate debug output. |
-| --fit, -f[W[xH]] | Specifies the dimension to which the extracted patch should be scaled. It should be divisible by 4. No dimension is the same as `128x128`, a single dimension `W` will result in patch size of `W`×`W`. |
+| --fit, -f[W[xH]] | Specifies the dimension to which the extracted patch should be scaled. It should be divisible by 4. No dimension or zero rounds the dimensions to the nearest smaller power of 2, a single dimension `W` will result in patch size of `W`×`W`. |
  
  
  ## Authors
@@ -172,13 +173,13 @@ top_left_y, width, height".
  
  Original C++ implementation of KCF tracker was written by Tomas Vojir
  [here][12] and is reimplementation of algorithm presented in
-"High-Speed Tracking with Kernelized Correlation Filters" paper [1].
+"High-Speed Tracking with Kernelized Correlation Filters" paper \[1].
  
  [12]: https://github.com/vojirt/kcf/blob/master/README.md
  
  ## References
  
-[1] João F. Henriques, Rui Caseiro, Pedro Martins, Jorge Batista,
+\[1] João F. Henriques, Rui Caseiro, Pedro Martins, Jorge Batista,
  “High-Speed Tracking with Kernelized Correlation Filters“, IEEE
  Transactions on Pattern Analysis and Machine Intelligence, 2015