Show locations of detected maxima in visual debug mode

diff --git a/README.md b/README.md
index cc4ef07eee3d6b4f57acea885590416491b08a1d..9b44bb5f737b257a8910ed9e58b483925610fa11 100644
--- a/README.md
+++ b/README.md
@@ -1,28 +1,53 @@
  # KCF tracker – parallel and PREM implementations
  
  The goal of this project is modify KCF tracker for use in the
-[HERCULES](http://hercules2020.eu/) project, where it will run on
-NVIDIA TX2 board. To achieve the needed performance we try various
-ways of parallelization of the algorithm including execution on the
-GPU. The aim is also to modify the code according to the PRedictable
-Execution Model (PREM).
+[HERCULES][1] project, where it will run on an NVIDIA TX2 board. To
+achieve the needed performance, we try various ways of parallelizing
+the algorithm, including execution on the GPU. The aim is also to
+modify the code according to the PRedictable Execution Model (PREM).
+
+A stable version of the tracker is available from a [CTU server][2];
+development happens on [GitHub][iig].
+
+[1]: http://hercules2020.eu/
+[2]: http://rtime.felk.cvut.cz/gitweb/hercules2020/kcf.git
+[iig]: https://github.com/CTU-IIG/kcf
+[3]: https://github.com/Shanigen/kcf
+
+<!-- markdown-toc start - Don't edit this section. Run M-x markdown-toc-refresh-toc -->
+**Table of Contents**
+
+- [Prerequisites](#prerequisites)
+- [Compilation](#compilation)
+    - [Compile all supported versions](#compile-all-supported-versions)
+    - [Using cmake gui](#using-cmake-gui)
+    - [Command line](#command-line)
+- [Running](#running)
+    - [Options](#options)
+- [Automated testing](#automated-testing)
+- [Authors](#authors)
+- [References](#references)
+- [License](#license)
+
+<!-- markdown-toc end -->
  
-Stable version of the tracker is available from [CTU
-server](http://rtime.felk.cvut.cz/gitweb/hercules2020/kcf.git.),
-development happens at [Github](https://github.com/Shanigen/kcf.).
  
  ## Prerequisites
  
-The code depends on OpenCV 2.4 (3.0+ for CUDA-based version) library
-and cmake is used for building. Depending on the version to be
-compiled you have to have [FFTW](http://www.fftw.org/),
-[CUDA](https://developer.nvidia.com/cuda-downloads) or
-[OpenMP](http://www.openmp.org/) installed.
+The code depends on the OpenCV library (version 2.4 or 3.x). [CMake][13]
+(optionally with [Ninja][8]) is used for building. Depending on the
+version to be compiled, you need to have development packages for
+[FFTW][4], [CUDA][5] or [OpenMP][6] installed.
+
+On TX2, the following command should install what's needed:
+``` shellsession
+$ apt install cmake ninja-build libopencv-dev libfftw3-dev
+```
  
-SSE instructions were used in the original code and these are only
-supported on x86 architecture. Thanks to the
-[SSE2NEON](https://github.com/jratcliff63367/sse2neon) code, we now
-support both ARM and x86 architectures.
+[4]: http://www.fftw.org/
+[5]: https://developer.nvidia.com/cuda-downloads
+[6]: http://www.openmp.org/
+[13]: https://cmake.org/
  
  ## Compilation
  
@@ -37,17 +62,19 @@ $ make -k
  
  This will create several `build-*` directories and compile different
  versions in them. If prerequisites of some builds are missing, the
-`-k` option ensures that the errors are ignored. This uses
-[Ninja](https://ninja-build.org/) build system, which is useful when
-building naively on TX2, because builds with `ninja` are faster
-(better parallelized) than with `make`.
+`-k` option ensures that the errors are ignored. This uses the [Ninja][8]
+build system, which is useful when building natively on TX2, because
+builds with `ninja` are faster (better parallelized) than with `make`.
  
-To build only a specific version run `make <version>`, for example:
+To build only a specific version, run `make <version>`. For example,
+the CUDA-based version can be compiled with:
  
  ``` shellsession
-make cufft
+$ make cufft
  ```
  
+[8]: https://ninja-build.org/
+
  ### Using cmake gui
  
  ```shellsession
@@ -72,7 +99,8 @@ $ make -C build
  $ git submodule update --init
  $ mkdir build
  $ cd build
-$ cmake [options] ..
+$ cmake [options] ..  # see the tables below
+$ make
  ```
  
  The `cmake` options below allow to select, which version to build.
@@ -91,33 +119,23 @@ With all of these FFT version additional options can be added:
  
  |Option| Description |
  | --- | --- |
-| `-DASYNC=ON` | Use C++ `std::async` to run computations for different scales in parallel. This doesn't work with `BIG_BATCH` mode.|
-| `-DOPENMP=ON` | This option can only be used with CPU versions of the tracker. In normal mode it will run computations for differenct scales in parallel. In the case of the big batch mode it will parallelize the feature extraction  and the search for maximal response for differenct scales. If Fftw version is used with big batch mode it will also parallelize Ffftw's plans.|
-| `-DBIG_BATCH=ON` | Concatenate matrices of different scales to one big matrix and perform all computations on this matrix. This mode doesn't work for OpenCV FFT.|
-| `-DCUDA_DEBUG=ON` | This mode adds CUDA error checking for all kernels and CUDA runtime libraries. Only works with cuFFT version.|
-
-Finally call make:
-```
-$ make
-```
+| `-DBIG_BATCH=ON` | Concatenate matrices of different scales to one big matrix and perform all computations on this matrix. This improves performance of GPU FFT offloading. |
+| `-DOPENMP=ON` | Parallelize certain operations with OpenMP. With `-DBIG_BATCH=OFF`, it runs computations for different scales in parallel; with `-DBIG_BATCH=ON`, it parallelizes the feature extraction, which runs on the CPU. With `fftw`, FFTW's plans will execute in parallel.|
+| `-DCUDA_DEBUG=ON` | Adds a call to `cudaDeviceSynchronize` after every CUDA function and kernel call.|
+| `-DOpenCV_DIR=/opt/opencv-3.3/share/OpenCV` | Compile against a custom OpenCV version. |
+| `-DASYNC=ON` | Use C++ `std::async` to run computations for different scales in parallel. This mode of parallelization was present in the original implementation. Here, it is superseded by `-DOPENMP`. This doesn't work with `BIG_BATCH` mode.|
  
-### Compilation for non-TX2 CUDA
-
-The CuFFT version is set up to run on NVIDIA Jetson TX2. If you want
-to run it on different architecture, change the `--gpu-architecture
-sm_62` NVCC flag in **/src/CMakeLists.txt** to your architecture of
-NVIDIA GPU. To find what SM variation you architecture has look
-[here](http://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/).
+See also the top-level `Makefile` for other useful cmake parameters,
+such as extra compiler flags.
  
  ## Running
  
-No matter which method is used to compile the code, the results will
-be `kcf_vot` binary.
+No matter which method is used to compile the code, the result will be
+a `kcf_vot` binary.
  
  It operates on an image sequence created according to [VOT 2014
-methodology](http://www.votchallenge.net/). You can find some image
-sequences in [vot2016
-datatset](http://www.votchallenge.net/vot2016/dataset.html).
+methodology][10]. You can find some image sequences in the [vot2016
+dataset][11].
  
  The binary can be run as follows:
  
@@ -147,27 +165,54 @@ By default the program generates file `output.txt` containing the
  bounding boxes of the tracked object in the format "top_left_x,
  top_left_y, width, height".
  
+[10]: http://www.votchallenge.net/
+[11]: http://www.votchallenge.net/vot2016/dataset.html
+
  ### Options
  
  | Options | Description |
  | ------- | ----------- |
+| --fit, -f[W[xH]] | Specifies the dimensions to which the extracted patches should be scaled. Best performance is achieved for powers of two; the smaller the number, the higher the performance but the worse the accuracy. No dimension or zero rounds the dimensions to the nearest smaller power of 2; a single dimension `W` will result in a patch size of `W`×`W`. The numbers should be divisible by 4. |
  | --visualize, -v[delay_ms] | Visualize the output, optionally with specified delay. If the delay is 0 the program will wait for a key press. |
  | --output, -o <output.txt>     | Specify name of output file. |
  | --debug, -d                           | Generate debug output. |
-| --fit, -f[W[xH]] | Specifies the dimension to which the extracted patch should be scaled. It should be divisible by 4. No dimension is the same as `128x128`, a single dimension `W` will result in patch size of `W`×`W`. |
+
+## Automated testing
+
+The tracker comes with a test suite based on the [vot2016 dataset][11].
+You can run the test suite as follows:
+
+    make vot2016  # This downloads the dataset (about 1 GB of data)
+    make test
+
+The above commands run all tests in parallel and display the results
+in a table. If you want to measure performance, do not run multiple
+tests together. This can be achieved by:
+
+    make build.ninja
+    ninja -j1 test
+
+You can test only a subset of builds or image sequences by setting the
+`BUILDS`, `TESTSEQ` or `TESTFLAGS` make variables. For instance:
+
+    make build.ninja BUILDS="cufft cufft-big fftw" TESTSEQ="bmx ball1"
+    ninja test
+
+
  
  
  ## Authors
  * Vít Karafiát, Michal Sojka
  
-Original C++ implementation of KCF tracker was written by Tomas Vojir
-[here](https://github.com/vojirt/kcf/blob/master/README.md) and is
-reimplementation of algorithm presented in "High-Speed Tracking with
-Kernelized Correlation Filters" paper [1].
+The [original C++ implementation of the KCF tracker][12] was written by
+Tomas Vojir and is a reimplementation of the algorithm presented in the
+"High-Speed Tracking with Kernelized Correlation Filters" paper \[1].
+
+[12]: https://github.com/vojirt/kcf/blob/master/README.md
  
  ## References
  
-[1] João F. Henriques, Rui Caseiro, Pedro Martins, Jorge Batista,
+\[1] João F. Henriques, Rui Caseiro, Pedro Martins, Jorge Batista,
  “High-Speed Tracking with Kernelized Correlation Filters“, IEEE
  Transactions on Pattern Analysis and Machine Intelligence, 2015
  
@@ -188,3 +233,7 @@ ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
  WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
  ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
  OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+
+<!-- Local Variables: -->
+<!-- markdown-toc-user-toc-structure-manipulation-fn: cdr -->
+<!-- End: -->
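
Note on the `--fit` option documented in the diff above: with no dimension (or zero), the README says the patch dimensions are rounded "to the nearest smaller power of 2". A minimal Python sketch of that rounding follows. It assumes the phrase means "the largest power of two not exceeding the dimension"; the tracker's actual C++ code may treat exact powers of two or very small sizes differently. The function names `floor_power_of_two` and `fit_size` are hypothetical and do not come from the tracker's source.

```python
def floor_power_of_two(n: int) -> int:
    """Return the largest power of two that does not exceed n.

    Assumption: this is what "nearest smaller power of 2" means in the
    README; the tracker itself may round differently.
    """
    if n < 1:
        raise ValueError("dimension must be positive")
    # bit_length() is the position of the highest set bit (1-based),
    # so shifting 1 by (bit_length - 1) keeps only that bit.
    return 1 << (n.bit_length() - 1)


def fit_size(width: int, height: int) -> tuple:
    """Round both patch dimensions down to a power of two (hypothetical helper)."""
    return floor_power_of_two(width), floor_power_of_two(height)


print(fit_size(150, 100))  # (128, 64)
```

Any power of two from 4 upwards is also divisible by 4, so sizes produced this way satisfy the divisibility recommendation as long as the input dimensions are at least 4.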