2 \section{Image Processing}
4 Note: The chapter describes functions for image processing and
5 analysis. Most of the functions work with 2d arrays of pixels. We refer
6 to the arrays as "images"; however, they do not have to be of type
7 \cross{IplImage}, they may be \cross{CvMat} or \cross{CvMatND} as well.
9 \subsection{Gradients, Edges and Corners}
11 \cvfunc{Sobel}\label{Sobel}
13 Calculates first, second, third or mixed image derivatives using extended Sobel operator
26 int aperture\_size=3 );
30 \cvarg{src}{Source image of type CvArr*}
31 \cvarg{dst}{Destination image}
32 \cvarg{xorder}{Order of the derivative x}
33 \cvarg{yorder}{Order of the derivative y}
34 \cvarg{aperture\_size}{Size of the extended Sobel kernel, must be 1, 3, 5 or 7}
37 In all cases except 1, an $\texttt{aperture\_size} \times
38 \texttt{aperture\_size}$ separable kernel will be used to calculate the
39 derivative. For $\texttt{aperture\_size} = 1$ a $ 3 \times 1$ or $ 1 \times 3$
40 kernel is used (Gaussian smoothing is not done). There is also special
41 value \texttt{CV\_SCHARR} (-1) that corresponds to $3\times3$ Scharr
42 filter that may give more accurate results than $3\times3$ Sobel. Scharr
51 for x-derivative or transposed for y-derivative.
53 The function `cvSobel` calculates the image derivative by convolving the image with the appropriate kernel:
56 \texttt{dst}(x,y) = \frac{d^{xorder+yorder} \texttt{src}}{dx^{xorder} \cdot dy^{yorder}}
59 The Sobel operators combine Gaussian smoothing and differentiation
60 so the result is more or less robust to the noise. Most often,
61 the function is called with (\texttt{xorder} = 1, \texttt{yorder} = 0,
62 \texttt{aperture\_size} = 3) or (\texttt{xorder} = 0, \texttt{yorder} = 1,
63 \texttt{aperture\_size} = 3) to calculate first x- or y- image
64 derivative. The first case corresponds to
72 kernel and the second one corresponds to
85 kernel, depending on the image origin (\texttt{origin} field of
86 \texttt{IplImage} structure). No scaling is done, so the destination image
87 usually has larger by absolute value numbers than the source image. To
88 avoid overflow, the function requires 16-bit destination image if the
89 source image is 8-bit. The result can be converted back to 8-bit using
90 \cross{ConvertScale} or \cross{ConvertScaleAbs} functions. Besides 8-bit images
91 the function can process 32-bit floating-point images. Both source and
92 destination must be single-channel images of equal size or ROI size.
94 \cvfunc{Laplace}\label{Laplace}
96 Calculates Laplacian of the image
105 int aperture\_size=3 );
109 \cvarg{src}{Source image}
110 \cvarg{dst}{Destination image}
111 \cvarg{aperture\_size}{Aperture size (it has the same meaning as in \cross{Sobel})}
114 The function \texttt{cvLaplace} calculates Laplacian of the source image by summing second x- and y- derivatives calculated using Sobel operator:
117 \texttt{dst}(x,y) = \frac{d^2 \texttt{src}}{dx^2} + \frac{d^2 \texttt{src}}{dy^2}
120 Specifying \texttt{aperture\_size} = 1 gives the fastest variant that is equal to convolving the image with the following kernel:
122 \[ \vecthreethree {0}{1}{0}{1}{-4}{1}{0}{1}{0} \]
124 Similar to \cross{Sobel} function, no scaling is done and the same combinations of input and output formats are supported.
126 \cvfunc{Canny}\label{Canny}
127 Implements Canny algorithm for edge detection
130 void cvCanny( const CvArr* image,
138 int aperture\_size=3 );
142 \cvarg{image}{Single-channel input image}
143 \cvarg{edges}{Single-channel image to store the edges found by the function}
144 \cvarg{threshold1}{The first threshold}
145 \cvarg{threshold2}{The second threshold}
146 \cvarg{aperture\_size}{Aperture parameter for Sobel operator (see \cross{Sobel})}
149 The function \texttt{cvCanny} finds the edges on the input image \texttt{image} and marks them in the output image \texttt{edges} using the Canny algorithm. The smallest of \texttt{threshold1} and \texttt{threshold2} is used for edge linking, the largest - to find initial segments of strong edges.
151 \cvfunc{PreCornerDetect}\label{PreCornerDetect}
152 Calculates feature map for corner detection
155 void cvPreCornerDetect(
161 int aperture\_size=3 );
165 \cvarg{image}{Input image}
166 \cvarg{corners}{Image to store the corner candidates}
167 \cvarg{aperture\_size}{Aperture parameter for Sobel operator (see \cross{Sobel})}
170 The function \texttt{cvPreCornerDetect} calculates the function
173 D_x^2 D_{yy} + D_y^2 D_{xx} - 2 D_x D_y D_{xy}
176 where $D_?$ denotes one of the first image derivatives and $D_{??}$ denotes a second image derivative.
178 The corners can be found as local maximums of the function:
181 // assume that the image is floating-point
182 IplImage* corners = cvCloneImage(image);
183 IplImage* dilated_corners = cvCloneImage(image);
184 IplImage* corner_mask = cvCreateImage( cvGetSize(image), 8, 1 );
185 cvPreCornerDetect( image, corners, 3 );
186 cvDilate( corners, dilated_corners, 0, 1 );
187 cvSubS( corners, dilated_corners, corners );
188 cvCmpS( corners, 0, corner_mask, CV_CMP_GE );
189 cvReleaseImage( &corners );
190 cvReleaseImage( &dilated_corners );
193 \cvfunc{CornerEigenValsAndVecs}\label{CornerEigenValsAndVecs}
194 Calculates eigenvalues and eigenvectors of image blocks for corner detection
197 void cvCornerEigenValsAndVecs( \par const CvArr* image,\par CvArr* eigenvv,\par int block\_size,\par int aperture\_size=3 );
202 \cvarg{image}{Input image}
203 \cvarg{eigenvv}{Image to store the results. It must be 6 times wider than the input image}
204 \cvarg{block\_size}{Neighborhood size (see discussion)}
205 \cvarg{aperture\_size}{Aperture parameter for Sobel operator (see \cross{Sobel})}
208 For every pixel The function \texttt{cvCornerEigenValsAndVecs} considers \texttt{block\_size} $\times$ \texttt{block\_size} neigborhood S(p). It calcualtes covariation matrix of derivatives over the neigborhood as:
212 \sum_{S(p)}(dI/dx)^2 & \sum_{S(p)}(dI/dx \cdot dI/dy)^2 \\
213 \sum_{S(p)}(dI/dx \cdot dI/dy)^2 & \sum_{S(p)}(dI/dy)^2
217 After that it finds eigenvectors and eigenvalues of the matrix and stores them into destination image in form
218 $(\lambda_1, \lambda_2, x_1, y_1, x_2, y_2)$ where
220 \item[$\lambda_1, \lambda_2$]eigenvalues of $M$; not sorted
221 \item[$x_1, y_1$]eigenvector corresponding to $\lambda_1$
222 \item[$x_2, y_2$]eigenvector corresponding to $\lambda_2$
225 \cvfunc{CornerMinEigenVal}\label{CornerMinEigenVal}
226 Calculates minimal eigenvalue of gradient matrices for corner detection
229 void cvCornerMinEigenVal(
237 int aperture\_size=3 );
241 \cvarg{image}{Input image}
242 \cvarg{eigenval}{Image to store the minimal eigen values. Should have the same size as \texttt{image}}
243 \cvarg{block\_size}{Neighborhood size (see discussion of \cross{CornerEigenValsAndVecs})}
244 \cvarg{aperture\_size}{Aperture parameter for Sobel operator (see \cross{Sobel}).}
245 % format. In the case of floating-point input format this parameter is the number of the fixed float filter used for differencing
248 The function \texttt{cvCornerMinEigenVal} is similar to \cross{CornerEigenValsAndVecs} but it calculates and stores only the minimal eigen value of derivative covariation matrix for every pixel, i.e. $min(\lambda_1, \lambda_2)$ in terms of the previous function.
250 \cvfunc{CornerHarris}\label{CornerHarris}
258 CvArr* harris\_responce,
262 int aperture\_size=3,
269 \cvarg{image}{Input image}
270 \cvarg{harris\_responce}{Image to store the Harris detector responces. Should have the same size as \texttt{image}}
271 \cvarg{block\_size}{Neighborhood size (see discussion of \cross{CornerEigenValsAndVecs})}
272 \cvarg{aperture\_size}{Aperture parameter for Sobel operator (see \cross{Sobel}).}
273 % format. In the case of floating-point input format this parameter is the number of the fixed float filter used for differencing
274 \cvarg{k}{Harris detector free parameter. See the formula below.}
277 The function \texttt{cvCornerHarris} runs the Harris edge detector on image. Similarly to \cross{CornerMinEigenVal} and \cross{CornerEigenValsAndVecs}, for each pixel it calculates $2\times2$ gradient covariation matrix $M$ over $\texttt{block\_size} \times \texttt{block\_size}$ neighborhood. Then, it stores
280 det(M) - k \, trace(M)^2
283 to the destination image. Corners in the image can be found as local maxima of the destination image.
285 \cvfunc{FindCornerSubPix}\label{FindCornerSubPix}
286 Refines corner locations
289 void cvFindCornerSubPix(
293 CvPoint2D32f* corners,
301 CvTermCriteria criteria );
305 \cvarg{image}{Input image}
306 \cvarg{corners}{Initial coordinates of the input corners and refined coordinates on output}
307 \cvarg{count}{Number of corners}
308 \cvarg{win}{Half sizes of the search window. For example, if \texttt{win} =(5,5) then 5*2+1 $\times$ 5*2+1 = 11 $\times$ 11 search window is used}
309 \cvarg{zero\_zone}{Half size of the dead region in the middle of the search zone over which the summation in formulae below is not done. It is used sometimes to avoid possible singularities of the autocorrelation matrix. The value of (-1,-1) indicates that there is no such size}
310 \cvarg{criteria}{Criteria for termination of the iterative process of corner refinement. That is, the process of corner position refinement stops either after certain number of iteration or when a required accuracy is achieved. The `criteria` may specify either of or both the maximum number of iteration and the required accuracy}
313 The function \texttt{cvFindCornerSubPix} iterates to find the sub-pixel accurate location of corners, or radial saddle points, as shown in on the picture below.
315 \includegraphics[width=1.0\textwidth]{pics/cornersubpix.png}
317 Sub-pixel accurate corner locator is based on the observation that every vector from the center $q$ to a point $p$ located within a neighborhood of $q$ is orthogonal to the image gradient at $p$ subject to image and measurement noise. Consider the expression:
320 \epsilon_i = {DI_{p_i}}^T \cdot (q - p_i)
323 where ${DI_{p_i}}$ is the image gradient at the one of the points $p_i$ in a neighborhood of $q$. The value of $q$ is to be found such that $\epsilon_i$ is minimized. A system of equations may be set up with $\epsilon_i$ set to zero:
326 \sum_i(DI_{p_i} \cdot {DI_{p_i}}^T) - \sum_i(DI_{p_i} \cdot {DI_{p_i}}^T \cdot p_i)
329 where the gradients are summed within a neighborhood ("search window") of $q$. Calling the first gradient term $G$ and the second gradient term $b$ gives:
335 The algorithm sets the center of the neighborhood window at this new center $q$ and then iterates until the center keeps within a set threshold.
337 \cvfunc{GoodFeaturesToTrack}\label{GoodFeaturesToTrack}
338 Determines strong corners on image
341 void cvGoodFeaturesToTrack(
345 CvArr* eig\_image, CvArr* temp\_image
347 CvPoint2D32f* corners
351 double quality\_level
355 const CvArr* mask=NULL
366 \cvarg{image}{The source 8-bit or floating-point 32-bit, single-channel image}
367 \cvarg{eig\_image}{Temporary floating-point 32-bit image of the same size as \texttt{image}}
368 \cvarg{temp\_image}{Another temporary image of the same size and same format as \texttt{eig\_image}}
369 \cvarg{corners}{Output parameter. Detected corners}
370 \cvarg{corner\_count}{Output parameter. Number of detected corners}
371 \cvarg{quality\_level}{Multiplier for the maxmin eigenvalue; specifies minimal accepted quality of image corners.}
372 \cvarg{min\_distance}{Limit, specifying minimum possible distance between returned corners; Euclidian distance is used.}
373 \cvarg{mask}{Region of interest. The function selects points either in the specified region or in the whole image if the mask is NULL.}
374 \cvarg{block\_size}{Size of the averaging block, passed to underlying \cross{CornerMinEigenVal} or \cross{CornerHarris} used by the function}
375 \cvarg{use\_harris}{If nonzero, Harris operator (\cross{CornerHarris}) is used instead of default \cross{CornerMinEigenVal}}
376 \cvarg{k}{Free parameter of Harris detector; used only if ($\texttt{use\_harris} != 0$)}
379 The function \texttt{cvGoodFeaturesToTrack} finds corners with big
380 eigenvalues in the image. The function first calculates the minimal
381 eigenvalue for every source image pixel using \cross{CornerMinEigenVal}
382 function and stores them in \texttt{eig\_image}. Then it performs
383 non-maxima suppression (only local maxima in $3\times 3$ neighborhood
384 remain). The next step is rejecting the corners with the minimal
386 $\texttt{quality\_level} \cdot max(\texttt{eig\_image}(x,y))$
388 Finally, the function ensures that all the corners found are distanced
389 enough one from another by considering the corners (the most strongest
390 corners are considered first) and checking that the distance between
391 the newly considered feature and the features considered earlier
392 is larger than \texttt{min\_distance}. So, the function removes the
393 features than are too close to the stronger features.
395 \cvfunc{ExtractSURF}\label{ExtractSURF}
397 Extracts Speeded Up Robust Features from image
400 void cvExtractSURF( \par const CvArr* image,\par const CvArr* mask,\par CvSeq** keypoints,\par CvSeq** descriptors,\par CvMemStorage* storage,\par CvSURFParams params );
404 \cvarg{image}{The input 8-bit grayscale image.}
405 \cvarg{mask}{The optional input 8-bit mask. The features are only found in the areas that contain more than 50\% of non-zero mask pixels.}
406 \cvarg{keypoints}{The output parameter; double pointer to the sequence of keypoints. This will be the sequence of CvSURFPoint structures:}
408 typedef struct CvSURFPoint
410 CvPoint2D32f pt; // position of the feature within the image
411 int laplacian; // -1, 0 or +1. sign of the laplacian at the point.
412 // can be used to speedup feature comparison
413 // (normally features with laplacians of different signs can not match)
414 int size; // size of the feature
415 float dir; // orientation of the feature: 0..360 degrees
416 float hessian; // value of the hessian (can be used to approximately estimate the feature strengths;
417 // see also params.hessianThreshold)
421 \cvarg{descriptors}{The optional output parameter; double pointer to the sequence of descriptors; Depending on the params.extended value, each element of the sequence will be either 64-element or 128-element floating-point (\texttt{CV\_32F}) vector. If the parameter is NULL, the descriptors are not computed. }
422 \cvarg{storage}{Memory storage where keypoints and descriptors will be stored.}
423 \cvarg{params}{Various algorithm parameters put to the structure CvSURFParams: }
425 typedef struct CvSURFParams
427 int extended; // 0 means basic descriptors (64 elements each),
428 // 1 means extended descriptors (128 elements each)
429 double hessianThreshold; // only features with keypoint.hessian larger than that are extracted.
430 // good default value is ~300-500 (can depend on the average
431 // local contrast and sharpness of the image).
432 // user can further filter out some features based on their hessian values
433 // and other characteristics
434 int nOctaves; // the number of octaves to be used for extraction.
435 // With each next octave the feature size is doubled (3 by default)
436 int nOctaveLayers; // The number of layers within each octave (4 by default)
440 CvSURFParams cvSURFParams(double hessianThreshold, int extended=0); // returns default parameters
444 The function cvExtractSURF finds robust features in the image, as
447 . For each feature it returns its location, size,
448 orientation and optionally the descriptor, basic or extended. The function
449 can be used for object tracking and localization, image stitching etc. See
450 \texttt{find\_obj.cpp} demo in OpenCV samples directory.
452 \cvfunc{GetStarKeypoints}\label{GetStarKeypoints}
454 Retrieves keypoints using StarDetector algorithm
457 CvSeq* cvGetStarKeypoints( \par const CvArr* image,\par CvMemStorage* storage,\par CvStarDetectorParams params=cvStarDetectorParams() );
461 \cvarg{image}{The input 8-bit grayscale image.}
462 \cvarg{storage}{Memory storage where keypoints will be stored.}
463 \cvarg{params}{Various algorithm parameters put to the structure CvStarDetectorParams:}
465 typedef struct CvStarDetectorParams
467 int maxSize; // maximal size of the features detected. The following values
468 // of the parameter are supported:
469 // 4, 6, 8, 11, 12, 16, 22, 23, 32, 45, 46, 64, 90, 128
470 int responseThreshold; // threshold for the approximatd laplacian,
471 // used to eliminate weak features
472 int lineThresholdProjected; // another threshold for laplacian to eliminate edges
473 int lineThresholdBinarized; // another threshold for the feature scale to eliminate edges
474 int suppressNonmaxSize; // linear size of a pixel neighborhood for non-maxima suppression
476 CvStarDetectorParams;
480 The function cvGetStarKeypoints extracts keypoints that are local
481 scale-space extremas. The scale-space is constructed by computing
482 approximate values of laplacians with different sigma's at each
483 pixel. Instead of using pyramids, a popular approach to save computing
484 time, all the laplacians are computed at each pixel of the original
485 high-resolution image. But each approximate laplacian value is computed
486 in O(1) time regardless of the sigma, thanks to the use of integral
487 images. The algorithm is based on the paper
490 of square, hexagon or octagon it uses 8-end star shape, hence the name,
491 consisting of overlapping upright and tilted squares.
493 Each computed feature is represented by the following structure:
496 typedef struct CvStarKeypoint
498 CvPoint pt; // coordinates of the feature
499 int size; // feature size, see CvStarDetectorParams::maxSize
500 float response; // the approximated laplacian value at that point.
504 inline CvStarKeypoint cvStarKeypoint(CvPoint pt, int size, float response);
507 Below is the small usage sample:
513 int main(int argc, char** argv)
515 const char* filename = argc > 1 ? argv[1] : "lena.jpg";
516 IplImage* img = cvLoadImage( filename, 0 ), *cimg;
517 CvMemStorage* storage = cvCreateMemStorage(0);
518 CvSeq* keypoints = 0;
523 cvNamedWindow( "image", 1 );
524 cvShowImage( "image", img );
525 cvNamedWindow( "features", 1 );
526 cimg = cvCreateImage( cvGetSize(img), 8, 3 );
527 cvCvtColor( img, cimg, CV_GRAY2BGR );
529 keypoints = cvGetStarKeypoints( img, storage, cvStarDetectorParams(45) );
531 for( i = 0; i < (keypoints ? keypoints->total : 0); i++ )
533 CvStarKeypoint kpt = *(CvStarKeypoint*)cvGetSeqElem(keypoints, i);
535 cvCircle( cimg, kpt.pt, r, CV_RGB(0,255,0));
536 cvLine( cimg, cvPoint(kpt.pt.x + r, kpt.pt.y + r),
537 cvPoint(kpt.pt.x - r, kpt.pt.y - r), CV_RGB(0,255,0));
538 cvLine( cimg, cvPoint(kpt.pt.x - r, kpt.pt.y + r),
539 cvPoint(kpt.pt.x + r, kpt.pt.y - r), CV_RGB(0,255,0));
541 cvShowImage( "features", cimg );
546 \subsection{Sampling, Interpolation and Geometrical Transforms}
548 \cvfunc{SampleLine}\label{SampleLine}
549 Reads raster line to buffer
562 int connectivity=8 );
567 \cvarg{image}{Image to sample the line from}
568 \cvarg{pt1}{Starting the line point}
569 \cvarg{pt2}{Ending the line point}
570 \cvarg{buffer}{Buffer to store the line points; must have enough size to store
571 $max( |\texttt{pt2.x} - \texttt{pt1.x}|+1, |\texttt{pt2.y} - \texttt{pt1.y}|+1 )$
572 points in case of 8-connected line and
573 $ (|\texttt{pt2.x}-\texttt{pt1.x}|+|\texttt{pt2.y}-\texttt{pt1.y}|+1) $
574 in case of 4-connected line.}
575 \cvarg{connectivity}{The line connectivity, 4 or 8}
578 The function \texttt{cvSampleLine} implements a particular case of application of line iterators. The function reads all the image points lying on the line between \texttt{pt1} and \texttt{pt2}, including the ending points, and stores them into the buffer.
580 \cvfunc{GetRectSubPix}\label{GetRectSubPix}
582 Retrieves pixel rectangle from image with sub-pixel accuracy
585 void cvGetRectSubPix(
591 CvPoint2D32f center );
595 \cvarg{src}{Source image}
596 \cvarg{dst}{Extracted rectangle}
597 \cvarg{center}{Floating point coordinates of the extracted rectangle center within the source image. The center must be inside the image.}
600 The function \texttt{cvGetRectSubPix} extracts pixels from \texttt{src}:
603 dst(x, y) = src(x + \texttt{center.x} - (width(\texttt{dst})-1)*0.5, y + \texttt{center.y} - (height(\texttt{dst} )-1)*0.5)
606 where the values of pixels at non-integer coordinates are retrieved
607 using bilinear interpolation. Every channel of multiple-channel
608 images is processed independently. Whereas the rectangle center
609 must be inside the image, the whole rectangle may be partially
610 occluded. In this case, the replication border mode is used to get
611 pixel values beyond the image boundaries.
613 \cvfunc{GetQuadrangleSubPix}\label{GetQuadrangleSubPix}
615 Retrieves pixel quadrangle from image with sub-pixel accuracy
618 void cvGetQuadrangleSubPix(
624 const CvMat* map\_matrix );
629 \cvarg{src}{Source image}
630 \cvarg{dst}{Extracted quadrangle}
631 \cvarg{map\_matrix}{The transformation $2 \times 3$ matrix $[A|b]$ (see the discussion)}
634 The function \texttt{cvGetQuadrangleSubPix} extracts pixels from \texttt{src} at sub-pixel accuracy and stores them to \texttt{dst} as follows:
637 dst(x, y)= src( A_{11} x' + A_{12} y' + b_1, A_{21} x' + A_{22} y' + b_2)
643 x'=\frac{x-(width(dst)-1)}{2},
644 y'=\frac{y-(height(dst)-1)}{2}
650 \texttt{map\_matrix} = \begin{bmatrix}
651 A_{11} & A_{12} & b_1\\
652 A_{21} & A_{22} & b_2
656 where the values of pixels at non-integer coordinates are retrieved using bilinear interpolation. When the function needs pixels outside of the image, it uses replication border mode to reconstruct the values. Every channel of multiple-channel images is processed independently.
659 \cvfunc{Resize}\label{Resize}
669 int interpolation=CV\_INTER\_LINEAR );
674 \cvarg{src}{Source image}
675 \cvarg{dst}{Destination image}
676 \cvarg{interpolation}{Interpolation method:
678 \cvarg{CV\_INTER\_NN}{nearest-neigbor interpolation}
679 \cvarg{CV\_INTER\_LINEAR}{bilinear interpolation (used by default)}
680 \cvarg{CV\_INTER\_AREA}{resampling using pixel area relation. It is preferred method for image decimation that gives moire-free results. In case of zooming it is similar to \texttt{CV\_INTER\_NN} method.}
681 \cvarg{CV\_INTER\_CUBIC}{bicubic interpolation.}
685 The function \texttt{cvResize} resizes image \texttt{src} so that it fits exactly to \texttt{dst}. If ROI is set, the function considers the ROI as supported as usual.
687 \cvfunc{WarpAffine}\label{WarpAffine}
689 Applies affine transformation to the image
698 const CvMat* map\_matrix,
700 int flags=CV\_INTER\_LINEAR+CV\_WARP\_FILL\_OUTLIERS,
702 CvScalar fillval=cvScalarAll(0) );
707 \cvarg{src}{Source image}
708 \cvarg{dst}{Destination image}
709 \cvarg{map\_matrix}{$2\times 3$ transformation matrix}
710 \cvarg{flags}{A combination of interpolation method and the following optional flags:
712 \cvarg{CV\_WARP\_FILL\_OUTLIERS}{fill all the destination image pixels. If some of them correspond to outliers in the source image, they are set to `fillval`}
713 \cvarg{CV\_WARP\_INVERSE\_MAP}{indicates that \texttt{matrix} is inverse
714 transform from destination image to source and, thus, can be used
715 directly for pixel interpolation. Otherwise, the function finds
716 the inverse transform from \texttt{map\_matrix}.}}
718 \cvarg{fillval}{A value used to fill outliers}
721 The function \texttt{cvWarpAffine} transforms source image using the specified matrix:
724 dst(x',y') = src(x,y)
734 \end{bmatrix} = \texttt{map\_matrix} \cdot \begin{bmatrix}
738 \end{bmatrix} & \mbox{if CV\_WARP\_INVERSE\_MAP is not set}\\
742 \end{bmatrix} = \texttt{map\_matrix} \cdot \begin{bmatrix}
746 \end{bmatrix}& \mbox{otherwise}
750 The function is similar to \cross{GetQuadrangleSubPix} but they are not exactly the same. \cross{WarpAffine} requires input and output image have the same data type, has larger overhead (so it is not quite suitable for small images) and can leave part of destination image unchanged. While \cross{GetQuadrangleSubPix} may extract quadrangles from 8-bit images into floating-point buffer, has smaller overhead and always changes the whole destination image content.
752 To transform a sparse set of points, use \cross{Transform} function from cxcore.
754 \cvfunc{GetAffineTransform}\label{GetAffineTransform}
756 Calculates affine transform from 3 corresponding points
759 CvMat* cvGetAffineTransform(
761 const CvPoint2D32f* src,
763 const CvPoint2D32f* dst,
765 CvMat* map\_matrix );
769 \cvarg{src}{ Coordinates of 3 triangle vertices in the source image}
770 \cvarg{dst}{ Coordinates of the 3 corresponding triangle vertices in the destination image}
771 \cvarg{map\_matrix}{ Pointer to the destination $2 \times 3$ matrix}
774 The function cvGetAffineTransform calculates the matrix of an affine transform such that:
799 \cvfunc{2DRotationMatrix}\label{2DRotationMatrix}
801 Calculates affine matrix of 2d rotation
804 CvMat* cv2DRotationMatrix(
812 CvMat* map\_matrix );
816 \cvarg{center}{Center of the rotation in the source image}
817 \cvarg{angle}{The rotation angle in degrees. Positive values mean couter-clockwise rotation (the coordiate origin is assumed at top-left corner)}
818 \cvarg{scale}{Isotropic scale factor}
819 \cvarg{map\_matrix}{Pointer to the destination $2\times 3$ matrix}
822 The function \texttt{cv2DRotationMatrix} calculates matrix:
826 a & \beta & (1-a) \cdot \texttt{center.x} - \beta \cdot \texttt{center.y} \\
827 \beta - 1 & a & \beta \cdot \texttt{center.x} - (1-a) \cdot \texttt{center.y}
834 a = \texttt{scale} \cdot cos(\texttt{angle}), \beta = \texttt{scale} \cdot sin(\texttt{angle})
837 The transformation maps the rotation center to itself. If this is not the purpose, the shift should be adjusted.
839 \cvfunc{WarpPerspective}\label{WarpPerspective}
841 Applies perspective transformation to the image
844 void cvWarpPerspective(
850 const CvMat* map\_matrix,
852 int flags=CV\_INTER\_LINEAR+CV\_WARP\_FILL\_OUTLIERS,
854 CvScalar fillval=cvScalarAll(0) );
859 \cvarg{src}{Source image}
860 \cvarg{dst}{Destination image}
861 \cvarg{map\_matrix}{$3\times 3$ transformation matrix}
862 \cvarg{flags}{A combination of interpolation method and the following optional flags:
864 \cvarg{CV\_WARP\_FILL\_OUTLIERS}{fill all the destination image pixels. If some of them correspond to outliers in the source image, they are set to `fillval`}
865 \cvarg{CV\_WARP\_INVERSE\_MAP}{indicates that \texttt{matrix} is inverse transform from destination image to source and, thus, can be used directly for pixel interpolation. Otherwise, the function finds the inverse transform from \texttt{map\_matrix}}
867 \cvarg{fillval}{A value used to fill outliers}
870 The function \texttt{cvWarpPerspective} transforms source image using the specified matrix:
877 \end{bmatrix} = \texttt{map\_matrix} \cdot \begin{bmatrix}
881 \end{bmatrix} & \mbox{if CV\_WARP\_INVERSE\_MAP is not set}\\
885 \end{bmatrix} = \texttt{map\_matrix} \cdot \begin{bmatrix}
889 \end{bmatrix}& \mbox{otherwise}
893 For a sparse set of points use \cross{PerspectiveTransform} function from cxcore.
896 \cvfunc{GetPerspectiveTransform}\label{GetPerspectiveTransform}
898 Calculates perspective transform from 4 corresponding points
901 CvMat* cvGetPerspectiveTransform(
903 const CvPoint2D32f* src,
905 const CvPoint2D32f* dst,
907 CvMat* map\_matrix );
911 \cvarg{src}{Coordinates of 4 quadrangle vertices in the source image}
912 \cvarg{dst}{Coordinates of the 4 corresponding quadrangle vertices in the destination image}
913 \cvarg{map\_matrix}{Pointer to the destination $3\times 3$ matrix}
916 The function \texttt{cvGetPerspectiveTransform} calculates matrix of perspective transform such that:
941 \cvfunc{Remap}\label{Remap}
943 Applies generic geometrical transformation to the image
956 int flags=CV\_INTER\_LINEAR+CV\_WARP\_FILL\_OUTLIERS,
958 CvScalar fillval=cvScalarAll(0) );
963 \cvarg{src}{Source image}
964 \cvarg{dst}{Destination image}
965 \cvarg{mapx}{The map of x-coordinates (32fC1 image)}
966 \cvarg{mapy}{The map of y-coordinates (32fC1 image)}
967 \cvarg{flags}{A combination of interpolation method and the following optional flag(s):
969 \cvarg{CV\_WARP\_FILL\_OUTLIERS}{fill all the destination image pixels. If some of them correspond to outliers in the source image, they are set to `fillval`}
971 \cvarg{fillval}{A value used to fill outliers}
974 The function \texttt{cvRemap} transforms source image using the specified map:
977 \texttt{dst}(x,y) = \texttt{src}(\texttt{mapx}(x,y),\texttt{mapy}(x,y))
980 Similar to other geometrical transformations, some interpolation method (specified by user) is used to extract pixels with non-integer coordinates.
982 \cvfunc{LogPolar}\label{LogPolar}
984 Remaps image to log-polar space
997 int flags=CV\_INTER\_LINEAR+CV\_WARP\_FILL\_OUTLIERS );
1002 \cvarg{src}{Source image}
1003 \cvarg{dst}{Destination image}
1004 \cvarg{center}{The transformation center, where the output precision is maximal}
1005 \cvarg{M}{Magnitude scale parameter. See below}
1006 \cvarg{flags}{A combination of interpolation method and the following optional flags:
1008 \cvarg{CV\_WARP\_FILL\_OUTLIERS}{fill all the destination image pixels. If some of them correspond to outliers in the source image, they are set to zeros}
1009 \cvarg{CV\_WARP\_INVERSE\_MAP}{See below}
1013 The function \texttt{cvLogPolar} transforms source image using the following transformation:
1015 Forward transformation (\texttt{CV\_WARP\_INVERSE\_MAP} is not set):
1018 dst(\phi,\rho) = src(x,y)
1021 Inverse transformation (\texttt{CV\_WARP\_INVERSE\_MAP} is set):
1024 dst(x,y) = src(\phi,\rho)
1030 \rho = M \cdot \log{\sqrt{x^2 + y^2}},
1034 The function emulates the human "foveal" vision and can be used for fast scale and rotation-invariant template matching, for object tracking etc.
1036 % ===== Example. Log-polar transformation. =====
1039 #include <highgui.h>
1041 int main(int argc, char** argv)
1045 if( argc == 2 && (src=cvLoadImage(argv[1],1) != 0 )
1047 IplImage* dst = cvCreateImage( cvSize(256,256), 8, 3 );
1048 IplImage* src2 = cvCreateImage( cvGetSize(src), 8, 3 );
1049 cvLogPolar( src, dst, cvPoint2D32f(src->width/2,src->height/2), 40, CV_INTER_LINEAR+CV_WARP_FILL_OUTLIERS );
1050 cvLogPolar( dst, src2, cvPoint2D32f(src->width/2,src->height/2), 40, CV_INTER_LINEAR+CV_WARP_FILL_OUTLIERS+CV_WARP_INVERSE_MAP );
1051 cvNamedWindow( "log-polar", 1 );
1052 cvShowImage( "log-polar", dst );
1053 cvNamedWindow( "inverse log-polar", 1 );
1054 cvShowImage( "inverse log-polar", src2 );
1061 And this is what the program displays when \texttt{opencv/samples/c/fruits.jpg} is passed to it
1063 \includegraphics[width=0.4\textwidth]{pics/logpolar.jpg}
1064 \includegraphics[width=0.4\textwidth]{pics/inv_logpolar.jpg}
1066 \subsection{Morphological Operations}
1068 \cvfunc{CreateStructuringElementEx}\label{CreateStructuringElementEx}
1070 Creates structuring element
1073 IplConvKernel* cvCreateStructuringElementEx(
1089 \cvarg{cols}{Number of columns in the structuring element}
1090 \cvarg{rows}{Number of rows in the structuring element}
1091 \cvarg{anchor\_x}{Relative horizontal offset of the anchor point}
1092 \cvarg{anchor\_y}{Relative vertical offset of the anchor point}
1093 \cvarg{shape}{Shape of the structuring element; may have the following values:}
1095 \cvarg{CV\_SHAPE\_RECT}{a rectangular element}
1096 \cvarg{CV\_SHAPE\_CROSS}{a cross-shaped element}
1097 \cvarg{CV\_SHAPE\_ELLIPSE}{an elliptic element}
1098 \cvarg{CV\_SHAPE\_CUSTOM}{a user-defined element. In this case the parameter \texttt{values} specifies the mask, that is, which neighbors of the pixel must be considered}
1100 \cvarg{values}{Pointer to the structuring element data, a plane array, representing row-by-row scanning of the element matrix. Non-zero values indicate points that belong to the element. If the pointer is \texttt{NULL}, then all values are considered non-zero, that is, the element is of a rectangular shape. This parameter is considered only if the shape is \texttt{CV\_SHAPE\_CUSTOM} }
1103 The function CreateStructuringElementEx allocates and fills the structure \texttt{IplConvKernel}, which can be used as a structuring element in the morphological operations.
1105 \cvfunc{ReleaseStructuringElement}\label{ReleaseStructuringElement}
1107 Deletes structuring element
1110 void cvReleaseStructuringElement( IplConvKernel** element );
1114 \cvarg{element}{Pointer to the deleted structuring element}
1117 The function \texttt{cvReleaseStructuringElement} releases the structure \texttt{IplConvKernel} that is no longer needed. If \texttt{*element} is \texttt{NULL}, the function has no effect.
1119 \cvfunc{Erode}\label{Erode}
1121 Erodes image by using arbitrary structuring element
1130 IplConvKernel* element=NULL,
1136 \cvarg{src}{Source image}
1137 \cvarg{dst}{Destination image}
1138 \cvarg{element}{Structuring element used for erosion. If it is \texttt{NULL}, a $3\times 3$ rectangular structuring element is used}
1139 \cvarg{iterations}{Number of times erosion is applied}
1142 The function \texttt{cvErode} erodes the source image using the specified structuring element that determines the shape of a pixel neighborhood over which the minimum is taken:
1145 \min_{(x',y') \, in \, \texttt{element}}src(x+x',y+y')
1148 The function supports the in-place mode. Erosion can be applied several (\texttt{iterations}) times. For color images, each channel is processed independently.
1150 \cvfunc{Dilate}\label{Dilate}
1152 Dilates image by using arbitrary structuring element
1161 IplConvKernel* element=NULL,
1167 \cvarg{src}{Source image}
1168 \cvarg{dst}{Destination image}
1169 \cvarg{element}{Structuring element used for dilation. If it is \texttt{NULL}, a $3\times 3$ rectangular structuring element is used}
1170 \cvarg{iterations}{Number of times dilation is applied}
1173 The function \texttt{cvDilate} dilates the source image using the specified structuring element that determines the shape of a pixel neighborhood over which the maximum is taken:
1176 \max_{(x',y') \, in \, \texttt{element}}src(x+x',y+y')
1179 The function supports the in-place mode. Dilation can be applied several (\texttt{iterations}) times. For color images, each channel is processed independently.
1181 \cvfunc{MorphologyEx}\label{MorphologyEx}
1183 Performs advanced morphological transformations
1186 void cvMorphologyEx(
1194 IplConvKernel* element,
1202 \cvarg{src}{Source image}
1203 \cvarg{dst}{Destination image}
1204 \cvarg{temp}{Temporary image, required in some cases}
1205 \cvarg{element}{Structuring element}
1206 \cvarg{operation}{Type of morphological operation, one of:}
1208 \cvarg{CV\_MOP\_OPEN}{opening}
1209 \cvarg{CV\_MOP\_CLOSE}{closing}
1210 \cvarg{CV\_MOP\_GRADIENT}{morphological gradient}
1211 \cvarg{CV\_MOP\_TOPHAT}{"top hat"}
1212 \cvarg{CV\_MOP\_BLACKHAT}{"black hat"}
1214 \cvarg{iterations}{Number of times erosion and dilation are applied}
1217 The function \texttt{cvMorphologyEx} can perform advanced morphological transformations using erosion and dilation as basic operations.
1222 dst=open(src,element)=dilate(erode(src,element),element)
1228 dst=close(src,element)=erode(dilate(src,element),element)
1231 Morphological gradient:
1234 dst=morph\_grad(src,element)=dilate(src,element)-erode(src,element)
1240 dst=tophat(src,element)=src-open(src,element)
1246 dst=blackhat(src,element)=close(src,element)-src
1249 The temporary image \texttt{temp} is required for morphological gradient and, in case of in-place operation, for "top hat" and "black hat".
1251 \subsection{Filters and Color Conversion}
1253 \cvfunc{Smooth}\label{Smooth}
1255 Smooths the image in one of several ways
1264 int smoothtype=CV\_GAUSSIAN,
1270 double param3=0, double param4=0 );
1274 \cvarg{src}{The source image}
1275 \cvarg{dst}{The destination image}
1276 \cvarg{smoothtype}{Type of the smoothing:}
1278 \cvarg{CV\_BLUR\_NO\_SCALE (simple blur with no scaling)}{summation over a pixel $\texttt{param1}\times \texttt{param2}$ neighborhood. If the neighborhood size may vary, one may precompute integral image with \cross{Integral} function}
1279 \cvarg{CV\_BLUR (simple blur)}{summation over a pixel $\texttt{param1}\times \texttt{param2}$ neighborhood with subsequent scaling by $1/(\texttt{param1} \cdot \texttt{param2})$}
1280 \cvarg{CV\_GAUSSIAN (gaussian blur)}{convolving image with $\texttt{param1}\times \texttt{param2}$ Gaussian kernel}
1281 \cvarg{CV\_MEDIAN (median blur)}{finding median of $\texttt{param1}\times \texttt{param1}$ neighborhood (i.e. the neighborhood is square)}
1282 \cvarg{CV\_BILATERAL (bilateral filter)}{applying bilateral $\texttt{param1}\times \texttt{param2}$ filtering with color sigma=\texttt{param3} and space sigma=\texttt{param4}. \texttt{param1} and \texttt{param2} must be equal (square). Information about bilateral filtering can be found at
1283 \url{http://www.dai.ed.ac.uk/CVonline/LOCAL\_COPIES/MANDUCHI1/Bilateral\_Filtering.html}
1286 \cvarg{param1}{The first parameter of smoothing operation}
1287 \cvarg{param2}{The second parameter of smoothing operation. In case of simple scaled/non-scaled and Gaussian blur if \texttt{param2} is zero, it is set to \texttt{param1}}
1288 \cvarg{param3}{In case of Gaussian parameter this parameter may specify Gaussian $\sigma$ (standard deviation). If it is zero, it is calculated from the kernel size:
1290 \sigma = 0.3 (n/2 - 1) + 0.8 \quad \text{where} \quad n=
1292 \mbox{\texttt{param1} for horizontal kernel}\\
1293 \mbox{\texttt{param2} for vertical kernel}
1297 Using standard sigma for small kernels ($3\times 3$ to $7\times 7$) gives better speed. If \texttt{param3} is not zero, while \texttt{param1} and \texttt{param2} are zeros, the kernel size is calculated from the sigma (to provide accurate enough operation).}
1300 The function \texttt{cvSmooth} smooths image using one of several methods. Every of the methods has some features and restrictions listed below
1302 Blur with no scaling works with single-channel images only and supports accumulation of 8-bit to 16-bit format (similar to \cross{Sobel} and \cross{Laplace}) and 32-bit floating point to 32-bit floating-point format.
1304 Simple blur and Gaussian blur support 1- or 3-channel, 8-bit and 32-bit floating point images. These two methods can process images in-place.
1306 Median and bilateral filters work with 1- or 3-channel 8-bit images and can not process images in-place.
1308 \cvfunc{Filter2D}\label{Filter2D}
1310 Convolves image with the kernel
1319 const CvMat* kernel,
1321 CvPoint anchor=cvPoint(-1,-1));
1325 \cvarg{src}{The source image}
1326 \cvarg{dst}{The destination image}
1327 \cvarg{kernel}{Convolution kernel, single-channel floating point matrix. If you want to apply different kernels to different channels, split the image using cvSplit into separate color planes and process them individually}
1328 \cvarg{anchor}{The anchor of the kernel that indicates the relative position of a filtered point within the kernel. The anchor shoud lie within the kernel. The special default value (-1,-1) means that it is at the kernel center}
1331 The function \texttt{cvFilter2D} applies arbitrary linear filter to the image. In-place operation is supported. When the aperture is partially outside the image, the function interpolates outlier pixel values from the nearest pixels that is inside the image.
1333 \cvfunc{CopyMakeBorder}\label{CopyMakeBorder}
1335 Copies image and makes border around it
1338 void cvCopyMakeBorder(
1348 CvScalar value=cvScalarAll(0) );
1352 \cvarg{src}{The source image}
1353 \cvarg{dst}{The destination image}
1354 \cvarg{offset}{Coordinates of the top-left corner (or bottom-left in case of images with bottom-left origin) of the destination image rectangle where the source image (or its ROI) is copied. Size of the rectanlge matches the source image size/ROI size}
1355 \cvarg{bordertype}{Type of the border to create around the copied source image rectangle:
1357 \cvarg{IPL\_BORDER\_CONSTANT}{border is filled with the fixed value, passed as last parameter of the function.}
1358 \cvarg{IPL\_BORDER\_REPLICATE}{the pixels from the top and bottom rows, the left-most and right-most columns are replicated to fill the border.}
1360 (The other two border types from IPL, \texttt{IPL\_BORDER\_REFLECT} and \texttt{IPL\_BORDER\_WRAP}, are currently unsupported)}
1361 \cvarg{value}{Value of the border pixels if \texttt{bordertype} is \texttt{IPL\_BORDER\_CONSTANT}}
1364 The function \texttt{cvCopyMakeBorder} copies the source 2D array into interior of destination array and makes a border of the specified type around the copied area. The function is useful when one needs to emulate border type that is different from the one embedded into a specific algorithm implementation. For example, morphological functions, as well as most of other filtering functions in OpenCV, internally use replication border type, while the user may need zero border or a border, filled with 1's or 255's.
1366 \cvfunc{Integral}\label{Integral}
1368 Calculates integral images
1379 CvArr* tilted\_sum=NULL );
1383 \cvarg{image}{The source image, $W\times H$, 8-bit or floating-point (32f or 64f) image}
1384 \cvarg{sum}{The integral image, $(W+1)\times (H+1)$, 32-bit integer or double precision floating-point (64f)}
1385 \cvarg{sqsum}{The integral image for squared pixel values, $(W+1)\times (H+1)$, double precision floating-point (64f)}
1386 \cvarg{tilted\_sum}{The integral for the image rotated by 45 degrees, $(W+1)\times (H+1)$, the same data type as \texttt{sum}}
1389 The function \texttt{cvIntegral} calculates one or more integral images for the source image as following:
1392 \texttt{sum}(X,Y) = \sum_{x<X,y<Y} \texttt{image}(x,y)
1396 \texttt{sqsum}(X,Y) = \sum_{x<X,y<Y} \texttt{image}(x,y)^2
1400 \texttt{tilted\_sum}(X,Y) = \sum_{y<Y,abs(x-X)<y} \texttt{image}(x,y)
1403 Using these integral images, one may calculate sum, mean, standard deviation over arbitrary up-right or rotated rectangular region of the image in a constant time, for example:
1406 \sum_{x_1<=x<x_2, \, y_1<=y<y_2} = \texttt{sum}(x_2,y_2)-\texttt{sum}(x_1,y_2)-\texttt{sum}(x_2,y_1)+\texttt{sum}(x_1,x_1)
1409 It makes possible to do a fast blurring or fast block correlation with variable window size etc. In case of multi-channel images sums for each channel are accumulated independently.
1411 \cvfunc{CvtColor}\label{CvtColor}
1413 Converts image from one color space to another
1426 \cvarg{src}{The source 8-bit (8u), 16-bit (16u) or single-precision floating-point (32f) image}
1427 \cvarg{dst}{The destination image of the same data type as the source one. The number of channels may be different}
1428 \cvarg{code}{Color conversion operation that can be specifed using \texttt{CV\_ \textit{src\_color\_space} 2 \textit{dst\_color\_space}} constants (see below)}
1431 The function \texttt{cvCvtColor} converts input image from one color
1432 space to another. The function ignores \texttt{colorModel} and
1433 \texttt{channelSeq} fields of \texttt{IplImage} header, so the
1434 source image color space should be specified correctly (including
1435 order of the channels in case of RGB space, e.g. BGR means 24-bit
1436 format with $B_0, G_0, R_0, B_1, G_1, R_1, ...$ layout
1437 whereas RGB means 24-format with $R_0, G_0, B_0, R_1, G_1, B_1, ...$
1440 The conventional range for R,G,B channel values is:
1443 \item 0..255 for 8-bit images
1444 \item 0..65535 for 16-bit images and
1445 \item 0..1 for floating-point images.
1448 Of course, in case of linear transformations the range can be
1449 arbitrary, but in order to get correct results in case of non-linear
1450 transformations, the input image should be scaled if necessary.
1452 The function can do the following transformations:
1455 \item Transformations within RGB space like adding/removing alpha channel, reversing the channel order, conversion to/from 16-bit RGB color (R5:G6:B5 or R5:G5:B5) color, as well as conversion to/from grayscale using:
1457 \text{RGB[A] to Gray:} Y \leftarrow 0.299 \cdot R + 0.587 \cdot G + 0.114 \cdot B
1461 \text{Gray to RGB[A]:} R \leftarrow Y, G \leftarrow Y, B \leftarrow Y, A \leftarrow 0
1464 The conversion from a RGB image to gray is done with:
1466 cvCvtColor(src ,bwsrc, CV_RGB2GRAY)
1469 \item RGB $\leftrightarrow$ CIE XYZ.Rec 709 with D65 white point (\texttt{CV\_BGR2XYZ, CV\_RGB2XYZ, CV\_XYZ2BGR, CV\_XYZ2RGB}):
1478 0.412453 & 0.357580 & 0.180423\\
1479 0.212671 & 0.715160 & 0.072169\\
1480 0.019334 & 0.119193 & 0.950227
1497 3.240479 & -1.53715 & -0.498535\\
1498 -0.969256 & 1.875991 & 0.041556\\
1499 0.055648 & -0.204043 & 1.057311
1508 $X$, $Y$ and $Z$ cover the whole value range (in case of floating-point images $Z$ may exceed 1).
1510 \item RGB$\leftrightarrow$YCrCb JPEG (a.k.a. YCC) (\texttt{CV\_BGR2YCrCb, CV\_RGB2YCrCb, CV\_YCrCb2BGR, CV\_YCrCb2RGB})
1511 \[ Y \leftarrow 0.299 \cdot R + 0.587 \cdot G + 0.114 \cdot B \]
1512 \[ Cr \leftarrow (R-Y) \cdot 0.713 + delta \]
1513 \[ Cb \leftarrow (B-Y) \cdot 0.564 + delta \]
1514 \[ R \leftarrow Y + 1.403 \cdot (Cr - delta) \]
1515 \[ G \leftarrow Y - 0.344 \cdot (Cr - delta) - 0.714 \cdot (Cb - delta) \]
1516 \[ B \leftarrow Y + 1.773 \cdot (Cb - delta) \]
1521 128 & \mbox{for 8-bit images}\\
1522 32768 & \mbox{for 16-bit images}\\
1523 0.5 & \mbox{for floating-point images}
1526 Y, Cr and Cb cover the whole value range.
1528 \item RGB<=>HSV (\texttt{CV\_BGR2HSV, CV\_RGB2HSV, CV\_HSV2BGR, CV\_HSV2RGB})
1529 In case of 8-bit and 16-bit images
1530 R, G and B are converted to floating-point format and scaled to fit 0..1 range
1531 \[ V \leftarrow max(R,G,B) \]
1533 \[ S \leftarrow \fork{\frac{V-min(R,G,B)}{V}}{if $V \neq 0$}{0}{otherwise} \]
1534 \[ H \leftarrow \forkthree
1535 {{60(G - B)}/{S}}{if $V=R$}
1536 {{120+60(B - R)}/{S}}{if $V=G$}
1537 {{240+60(R - G)}/{S}}{if $V=B$} \]
1538 if $H<0$ then $H \leftarrow H+360$
1540 On output $0 \leq V \leq 1$, $0 \leq S \leq 1$, $0 \leq H \leq 360$.
1542 The values are then converted to the destination data type:
1545 \[ V \leftarrow 255 V, S \leftarrow 255 S, H \leftarrow H/2 \text{(to fit to 0...255)} \]
1546 \item[16-bit images (currently not supported)]
1547 \[ V <- 65535 V, S <- 65535 S, H <- H \]
1548 \item[32-bit images]
1549 H, S, V are left as is
1552 \item RGB $\leftrightarrow$ HLS (\texttt{CV\_BGR2HLS, CV\_RGB2HLS, CV\_HLS2BGR, CV\_HLS2RGB}).
1553 In case of 8-bit and 16-bit images
1554 R, G and B are converted to floating-point format and scaled to fit 0..1 range.
1555 \[ V_{max} \leftarrow {max}(R,G,B) \]
1556 \[ V_{min} \leftarrow {min}(R,G,B) \]
1557 \[ L \leftarrow \frac{V_{max} - V_{min}}{2} \]
1558 \[ S \leftarrow \fork
1559 {\frac{V_{max} - V_{min}}{V_{max} + V_{min}}}{if $L < 0.5$}
1560 {\frac{V_{max} - V_{min}}{2 - (V_{max} + V_{min})}}{if $L \ge 0.5$} \]
1561 \[ H \leftarrow \forkthree
1562 {{60(G - B)}/{S}}{if $V_{max}=R$}
1563 {{120+60(B - R)}/{S}}{if $V_{max}=G$}
1564 {{240+60(R - G)}/{S}}{if $V_{max}=B$} \]
1565 if $H<0$ then $H \leftarrow H+360$
1566 On output $0 \leq V \leq 1$, $0 \leq S \leq 1$, $0 \leq H \leq 360$.
1568 The values are then converted to the destination data type:
1571 \[ V \leftarrow 255 V, S \leftarrow 255 S, H \leftarrow H/2 \text{(to fit to 0...255)} \]
1572 \item[16-bit images (currently not supported)]
1573 \[ V <- 65535 V, S <- 65535 S, H <- H \]
1574 \item[32-bit images]
1575 H, S, V are left as is
1578 \item RGB $\leftrightarrow$ CIE L*a*b* (\texttt{CV\_BGR2Lab, CV\_RGB2Lab, CV\_Lab2BGR, CV\_Lab2RGB})
1579 In case of 8-bit and 16-bit images
1580 R, G and B are converted to floating-point format and scaled to fit 0..1 range
1581 \[ \vecthree{X}{Y}{Z} \leftarrow \vecthreethree
1582 {0.412453}{0.357580}{0.180423}
1583 {0.212671}{0.715160}{0.072169}
1584 {0.019334}{0.119193}{0.950227}
1586 \vecthree{R}{G}{B} \]
1587 \[ X \leftarrow X/X_n, \text{where} X_n = 0.950456 \]
1588 \[ Z \leftarrow Z/Z_n, \text{where} Z_n = 1.088754 \]
1589 \[ L \leftarrow \fork
1590 {116*Y^{1/3}-16}{for $Y>0.008856$}
1591 {903.3*Y}{for $Y \le 0.008856$} \]
1592 \[ a \leftarrow 500 (f(X)-f(Y)) + delta \]
1593 \[ b \leftarrow 200 (f(Y)-f(Z)) + delta \]
1596 {t^{1/3}}{for $t>0.008856$}
1597 {7.787 t+16/116}{for $t<=0.008856$} \]
1599 \[ delta = \fork{128}{for 8-bit images}{0}{for floating-point images} \]
1600 On output $0 \leq L \leq 100$, $-127 \leq a \leq 127$, $-127 \leq b \leq 127$
1602 The values are then converted to the destination data type:
1605 \[L \leftarrow L*255/100, a \leftarrow a + 128, b \leftarrow b + 128\]
1606 \item[16-bit images] currently not supported
1607 \item[32-bit images]
1608 L, a, b are left as is
1611 \item RGB $\leftrightarrow$ CIE L*u*v* (\texttt{CV\_BGR2Luv, CV\_RGB2Luv, CV\_Luv2BGR, CV\_Luv2RGB})
1612 In case of 8-bit and 16-bit images
1613 R, G and B are converted to floating-point format and scaled to fit 0..1 range
1614 \[ \vecthree{X}{Y}{Z} \leftarrow \vecthreethree
1615 {0.412453}{0.357580}{0.180423}
1616 {0.212671}{0.715160}{0.072169}
1617 {0.019334}{0.119193}{0.950227}
1619 \vecthree{R}{G}{B} \]
1620 \[ L \leftarrow \fork
1621 {116 Y^{1/3}}{for $Y>0.008856$}
1622 {903.3 Y}{for $Y<=0.008856$} \]
1623 \[ u' \leftarrow 4*X/(X + 15*Y + 3 Z) \]
1624 \[ v' \leftarrow 9*Y/(X + 15*Y + 3 Z) \]
1625 \[ u \leftarrow 13*L*(u' - u_n) \quad \text{where} \quad u_n=0.19793943 \]
1626 \[ v \leftarrow 13*L*(v' - v_n) \quad \text{where} \quad v_n=0.46831096 \]
1627 On output $0 \leq L \leq 100$, $-134 \leq u \leq 220$, $-140 \leq v \leq 122$.
1629 The values are then converted to the destination data type:
1632 \[L \leftarrow 255/100 L, u \leftarrow 255/354 (u + 134), v \leftarrow 255/256 (v + 140) \]
1633 \item[16-bit images] currently not supported
1634 \item[32-bit images] L, u, v are left as is
1637 The above formulae for converting RGB to/from various color spaces have been taken from multiple sources on Web, primarily from
1639 document at Charles Poynton site.
1641 \item Bayer $\rightarrow$ RGB (\texttt{CV\_BayerBG2BGR, CV\_BayerGB2BGR, CV\_BayerRG2BGR, CV\_BayerGR2BGR, CV\_BayerBG2RGB, CV\_BayerGB2RGB, CV\_BayerRG2RGB, CV\_BayerGR2RGB}) Bayer pattern is widely used in CCD and CMOS cameras. It allows to get color picture out of a single plane where R,G and B pixels (sensors of a particular component) are interleaved like this:
1643 \newcommand{\R}{\color{red}R}
1644 \newcommand{\G}{\color{green}G}
1645 \newcommand{\B}{\color{blue}B}
1649 \definecolor{BackGray}{rgb}{0.8,0.8,0.8}
1650 \begin{array}{ c c c c c }
1652 \G&\colorbox{BackGray}{\B}&\colorbox{BackGray}{\G}&\B&\G\\
1659 The output RGB components of a pixel are interpolated from 1, 2 or
1660 4 neighbors of the pixel having the same color. There are several
1661 modifications of the above pattern that can be achieved by shifting
1662 the pattern one pixel left and/or one pixel up. The two letters
1664 in the conversion constants
1665 \texttt{CV\_Bayer} $ C_1 C_2 $ \texttt{2BGR}
1667 \texttt{CV\_Bayer} $ C_1 C_2 $ \texttt{2RGB}
1668 indicate the particular pattern
1669 type - these are components from the second row, second and third
1670 columns, respectively. For example, the above pattern has very
1674 \cvfunc{Threshold}\label{Threshold}
1676 Applies fixed-level threshold to array elements
1689 int threshold\_type );
1693 \cvarg{src}{Source array (single-channel, 8-bit of 32-bit floating point)}
1694 \cvarg{dst}{Destination array; must be either the same type as \texttt{src} or 8-bit}
1695 \cvarg{threshold}{Threshold value}
1696 \cvarg{max\_value}{Maximum value to use with \texttt{CV\_THRESH\_BINARY} and \texttt{CV\_THRESH\_BINARY\_INV} thresholding types}
1697 \cvarg{threshold\_type}{Thresholding type (see the discussion)}
1700 The function \texttt{cvThreshold} applies fixed-level thresholding
1701 to single-channel array. The function is typically used to get
1702 bi-level (binary) image out of grayscale image \cross{CmpS} could
1703 be also used for this purpose) or for removing a noise, i.e. filtering
1704 out pixels with too small or too large values. There are several
1705 types of thresholding the function supports that are determined by
1706 \texttt{threshold\_type}:
1709 \cvarg{CV\_THRESH\_BINARY}{\[ \texttt{dst}(x,y) = \fork{\texttt{max\_value}}{if $\texttt{src}(x,y) > \texttt{threshold}$}{0}{otherwise} \]}
1710 \cvarg{CV\_THRESH\_BINARY\_INV}{\[ \texttt{dst}(x,y) = \fork{0}{if $\texttt{src}(x,y) > \texttt{threshold}$}{\texttt{max\_value}}{otherwise} \]}
1711 \cvarg{CV\_THRESH\_BINARY\_TRUNC}{\[ \texttt{dst}(x,y) = \fork{\texttt{threshold}}{if $\texttt{src}(x,y) > \texttt{threshold}$}{\texttt{src}(x,y)}{otherwise} \]}
1712 \cvarg{CV\_THRESH\_BINARY\_TOZERO}{\[ \texttt{dst}(x,y) = \fork{\texttt{src}(x,y)}{if $\texttt{src}(x,y) > \texttt{threshold}$}{0}{otherwise} \]}
1713 \cvarg{CV\_THRESH\_BINARY\_TOZERO\_INV}{\[ \texttt{dst}(x,y) = \fork{0}{if $\texttt{src}(x,y) > \texttt{threshold}$}{\texttt{src}(x,y)}{otherwise} \]}
1716 \includegraphics[width=0.5\textwidth]{pics/threshold.png}
1718 \cvfunc{AdaptiveThreshold}\label{AdaptiveThreshold}
1720 Applies adaptive threshold to array
1723 void cvAdaptiveThreshold(
1725 const CvArr* src,\par CvArr* dst,\par double max\_value,\par
1726 int adaptive\_method=CV\_ADAPTIVE\_THRESH\_MEAN\_C,\par
1727 int threshold\_type=CV\_THRESH\_BINARY,\par
1728 int block\_size=3,\par double param1=5 );
1733 \cvarg{src}{Source image}
1734 \cvarg{dst}{Destination image}
1735 \cvarg{max\_value}{Maximum value that is used with \texttt{CV\_THRESH\_BINARY} and \texttt{CV\_THRESH\_BINARY\_INV}}
1736 \cvarg{adaptive\_method}{Adaptive thresholding algorithm to use: \texttt{CV\_ADAPTIVE\_THRESH\_MEAN\_C} or \texttt{CV\_ADAPTIVE\_THRESH\_GAUSSIAN\_C} (see the discussion)}
1737 \cvarg{threshold\_type}{Thresholding type; must be one of
1739 \cvarg{CV\_THRESH\_BINARY}
1740 \cvarg{CV\_THRESH\_BINARY\_INV}
1742 \cvarg{block\_size}{The size of a pixel neighborhood that is used to calculate a threshold value for the pixel: 3, 5, 7, ..}
1743 \cvarg{param1}{The method-dependent parameter. For the methods \texttt{CV\_ADAPTIVE\_THRESH\_MEAN\_C} and \texttt{CV\_ADAPTIVE\_THRESH\_GAUSSIAN\_C} it is a constant subtracted from mean or weighted mean (see the discussion), though it may be negative}
1746 The function \texttt{cvAdaptiveThreshold} transforms grayscale image to binary image according to the formulae:
1749 \cvarg{CV\_THRESH\_BINARY}{\[ dst(x,y) = \fork{\texttt{max\_value}}{if $src(x,y) > T(x,y)$}{0}{otherwise} \]}
1750 \cvarg{CV\_THRESH\_BINARY\_INV}{\[ dst(x,y) = \fork{0}{if $src(x,y) > T(x,y)$}{\texttt{max\_value}}{otherwise} \]}
1753 where $T(x,y)$ is a threshold calculated individually for each pixel.
1755 For the method \texttt{CV\_ADAPTIVE\_THRESH\_MEAN\_C} it is a mean of $\texttt{block\_size} \times \texttt{block\_size}$ pixel neighborhood, subtracted by \texttt{param1}.
1757 For the method \texttt{CV\_ADAPTIVE\_THRESH\_GAUSSIAN\_C} it is a weighted sum (gaussian) of $\texttt{block\_size} \times \texttt{block\_size}$ pixel neighborhood, subtracted by \texttt{param1}.
1759 \subsection{Pyramids and the Applications}
1761 \cvfunc{PyrDown}\label{PyrDown}
1766 void cvPyrDown(\par const CvArr* src,\par CvArr* dst,\par int filter=CV\_GAUSSIAN\_5x5 );
1770 \cvarg{src}{The source image}
1771 \cvarg{dst}{The destination image, should have 2x smaller width and height than the source}
1772 \cvarg{filter}{Type of the filter used for convolution; only \texttt{CV\_GAUSSIAN\_5x5} is currently supported}
1775 The function \texttt{cvPyrDown} performs downsampling step of Gaussian pyramid decomposition. First it convolves source image with the specified filter and then downsamples the image by rejecting even rows and columns.
1777 \cvfunc{PyrUp}\label{PyrUp}
1782 void cvPyrUp(\par const CvArr* src,\par CvArr* dst,\par int filter=CV\_GAUSSIAN\_5x5 );
1786 \cvarg{src}{The source image}
1787 \cvarg{dst}{The destination image, should have 2x larger width and height than the source}
1788 \cvarg{filter}{Type of the filter used for convolution; only \texttt{CV\_GAUSSIAN\_5x5} is currently supported}
1791 The function \texttt{cvPyrUp} performs up-sampling step of Gaussian pyramid decomposition. First it upsamples the source image by injecting even zero rows and columns and then convolves result with the specified filter multiplied by 4 for interpolation. So the destination image is four times larger than the source image.
1793 \cvfunc{PyrSegmentation}\label{PyrSegmentation}
1795 Implements image segmentation by pyramids
1798 void cvPyrSegmentation(\par IplImage* src,\par IplImage* dst,\par
1799 CvMemStorage* storage,\par CvSeq** comp,\par
1800 int level,\par double threshold1,\par double threshold2 );
1804 \cvarg{src}{The source image}
1805 \cvarg{dst}{The destination image}
1806 \cvarg{storage}{Storage; stores the resulting sequence of connected components}
1807 \cvarg{comp}{Pointer to the output sequence of the segmented components}
1808 \cvarg{level}{Maximum level of the pyramid for the segmentation}
1809 \cvarg{threshold1}{Error threshold for establishing the links}
1810 \cvarg{threshold2}{Error threshold for the segments clustering}
1813 The function \texttt{cvPyrSegmentation} implements image segmentation by pyramids. The pyramid builds up to the level \texttt{level}. The links between any pixel \texttt{a} on level \texttt{i} and its candidate father pixel \texttt{b} on the adjacent level are established if
1814 $p(c(a),c(b))<threshold1$.
1815 After the connected components are defined, they are joined into several clusters.
1816 Any two segments A and B belong to the same cluster, if $p(c(A),c(B))<threshold2$.
1817 The input image has only one channel, then $p(c^1,c^2)=|c^1-c^2|$.
1818 If the input image has three channels (red, green and blue), then
1820 p(c^1,c^2) = 0.30 (c^1_r - c^2_r) +
1821 0.59 (c^1_g - c^2_g) +
1822 0.11 (c^1_b - c^2_b).
1825 There may be more than one connected component per a cluster. The images \texttt{src} and \texttt{dst} should be 8-bit single-channel or 3-channel images or equal size.
1827 \subsection{Connected Components and Contour Retrieval}
1829 \cvstruct{CvConnectedComp}\label{CvConnectedComp}
1834 typedef struct CvConnectedComp
1836 double area; /* area of the segmented component */
1837 CvScalar value; /* average color of the connected component */
1838 CvRect rect; /* ROI of the segmented component */
1839 CvSeq* contour; /* optional component boundary
1840 (the contour might have child contours corresponding to the holes) */
1845 \cvfunc{FloodFill}\label{FloodFill}
1847 Fills a connected component with given color
1850 void cvFloodFill(\par CvArr* image,\par CvPoint seed\_point,\par CvScalar new\_val,\par
1851 CvScalar lo\_diff=cvScalarAll(0),\par CvScalar up\_diff=cvScalarAll(0),\par
1852 CvConnectedComp* comp=NULL,\par int flags=4,\par CvArr* mask=NULL );
1857 \#define CV\_FLOODFILL\_FIXED\_RANGE (1 << 16)
1858 \#define CV\_FLOODFILL\_MASK\_ONLY (1 << 17)
1862 \cvarg{image}{Input 1- or 3-channel, 8-bit or floating-point image. It is modified by the function unless \texttt{CV\_FLOODFILL\_MASK\_ONLY} flag is set (see below)}
1863 \cvarg{seed\_point}{The starting point}
1864 \cvarg{new\_val}{New value of repainted domain pixels}
1865 \cvarg{lo\_diff}{Maximal lower brightness/color difference between the currently observed pixel and one of its neighbor belong to the component or seed pixel to add the pixel to component. In case of 8-bit color images it is packed value}
1866 \cvarg{up\_diff}{Maximal upper brightness/color difference between the currently observed pixel and one of its neighbor belong to the component or seed pixel to add the pixel to component. In case of 8-bit color images it is packed value}
1867 \cvarg{comp}{Pointer to structure the function fills with the information about the repainted domain}
1868 \cvarg{flags}{The operation flags. Lower bits contain connectivity value, 4 (by default) or 8, used within the function. Connectivity determines which neighbors of a pixel are considered. Upper bits can be 0 or combination of the following flags:
1870 \cvarg{CV\_FLOODFILL\_FIXED\_RANGE}{if set the difference between the current pixel and seed pixel is considered, otherwise difference between neighbor pixels is considered (the range is floating)}
1871 \cvarg{CV\_FLOODFILL\_MASK\_ONLY}{if set, the function does not fill the image (`new\_val` is ignored), but the fills mask (that must be non-NULL in this case)}
1873 \cvarg{mask}{Operation mask, should be singe-channel 8-bit image, 2 pixels wider and 2 pixels taller than `image`. If not NULL, the function uses and updates the mask, so user takes responsibility of initializing `mask` content. Floodfilling can't go across non-zero pixels in the mask, for example, an edge detector output can be used as a mask to stop filling at edges. Or it is possible to use the same mask in multiple calls to the function to make sure the filled area do not overlap. ''Note'': because mask is larger than the filled image, pixel in `mask` that corresponds to $(x,y)$ pixel in `image` will have coordinates $(x+1,y+1)$ }
1876 The function \texttt{cvFloodFill} fills a connected component starting from the seed point with the specified color. The connectivity is determined by the closeness of pixel values. The pixel at $(x,y)$ is considered to belong to the repainted domain if:
1880 \item[grayscale image, floating range] \[
1881 src(x',y')-\texttt{lo\_diff} <= src(x,y) <= src(x',y')+\texttt{up\_diff} \]
1883 \item[grayscale image, fixed range] \[
1884 src(seed.x,seed.y)-\texttt{lo\_diff}<=src(x,y)<=src(seed.x,seed.y)+\texttt{up\_diff} \]
1886 \item[color image, floating range]
1887 \[ src(x',y')_r-\texttt{lo\_diff}_r<=src(x,y)_r<=src(x',y')_r+\texttt{up\_diff}_r \]
1888 \[ src(x',y')_g-\texttt{lo\_diff}_g<=src(x,y)_g<=src(x',y')_g+\texttt{up\_diff}_g \]
1889 \[ src(x',y')_b-\texttt{lo\_diff}_b<=src(x,y)_b<=src(x',y')_b+\texttt{up\_diff}_b \]
1891 \item[color image, fixed range]
1892 \[ src(seed.x,seed.y)_r-\texttt{lo\_diff}_r<=src(x,y)_r<=src(seed.x,seed.y)_r+\texttt{up\_diff}_r \]
1893 \[ src(seed.x,seed.y)_g-\texttt{lo\_diff}_g<=src(x,y)_g<=src(seed.x,seed.y)_g+\texttt{up\_diff}_g \]
1894 \[ src(seed.x,seed.y)_b-\texttt{lo\_diff}_b<=src(x,y)_b<=src(seed.x,seed.y)_b+\texttt{up\_diff}_b \]
1897 where $src(x',y')$ is value of one of pixel neighbors. That is, to be added to the connected component, a pixel's color/brightness should be close enough to:
1899 \item color/brightness of one of its neighbors that are already referred to the connected component in case of floating range
1900 \item color/brightness of the seed point in case of fixed range.
1903 \cvfunc{FindContours}\label{FindContours}
1905 Finds contours in binary image
1908 int cvFindContours(\par CvArr* image,\par CvMemStorage* storage,\par CvSeq** first\_contour,\par
1909 int header\_size=sizeof(CvContour),\par int mode=CV\_RETR\_LIST,\par
1910 int method=CV\_CHAIN\_APPROX\_SIMPLE,\par CvPoint offset=cvPoint(0,\par0) );
1914 \cvarg{image}{The source 8-bit single channel image. Non-zero pixels are treated as 1's, zero pixels remain 0's - that is image treated as `binary`. To get such a binary image from grayscale, one may use \cross{Threshold}, \cross{AdaptiveThreshold} or \cross{Canny}. The function modifies the source image content}
1915 \cvarg{storage}{Container of the retrieved contours}
1916 \cvarg{first\_contour}{Output parameter, will contain the pointer to the first outer contour}
1917 \cvarg{header\_size}{Size of the sequence header, $\ge \texttt{sizeof(CvChain)}$ if $\texttt{method} =\texttt{CV\_CHAIN\_CODE}$,
1918 and $\ge \texttt{sizeof(CvContour)}$ otherwise}
1919 \cvarg{mode}{Retrieval mode}
1921 \cvarg{CV\_RETR\_EXTERNAL}{retrive only the extreme outer contours}
1922 \cvarg{CV\_RETR\_LIST}{retrieve all the contours and puts them in the list}
1923 \cvarg{CV\_RETR\_CCOMP}{retrieve all the contours and organizes them into two-level hierarchy: top level are external boundaries of the components, second level are bounda boundaries of the holes}
1924 \cvarg{CV\_RETR\_TREE}{retrieve all the contours and reconstructs the full hierarchy of nested contours}
1926 \cvarg{method}{Approximation method (for all the modes, except \texttt{CV\_LINK\_RUNS}, which uses built-in approximation)}
1928 \cvarg{CV\_CHAIN\_CODE}{output contours in the Freeman chain code. All other methods output polygons (sequences of vertices).}
1929 \cvarg{CV\_CHAIN\_APPROX\_NONE}{translate all the points from the chain code into points;}
1930 \cvarg{CV\_CHAIN\_APPROX\_SIMPLE}{compress horizontal, vertical, and diagonal segments, that is, the function leaves only their ending points;}
1931 \cvarg{CV\_CHAIN\_APPROX\_TC89\_L1,CV\_CHAIN\_APPROX\_TC89\_KCOS}{apply one of the flavors of Teh-Chin chain approximation algorithm.}
1932 \cvarg{CV\_LINK\_RUNS}{use completely different contour retrieval algorithm via linking of horizontal segments of 1's. Only \texttt{CV\_RETR\_LIST} retrieval mode can be used with this method.}
1934 \cvarg{offset}{Offset, by which every contour point is shifted. This is useful if the contours are extracted from the image ROI and then they should be analyzed in the whole image context}
1937 The function \texttt{cvFindContours} retrieves contours from the
1938 binary image and returns the number of retrieved contours. The
1939 pointer \texttt{first\_contour} is filled by the function. It will
1940 contain pointer to the first most outer contour or \texttt{NULL} if no
1941 contours is detected (if the image is completely black). Other
1942 contours may be reached from \texttt{first\_contour} using
1943 \texttt{h\_next} and \texttt{v\_next} links. The sample in
1944 \cross{DrawContours} discussion shows how to use contours for
1945 connected component detection. Contours can be also used for shape
1946 analysis and object recognition - see \texttt{squares.c} in OpenCV
1949 \cvfunc{StartFindContours}\label{StartFindContours}
1951 Initializes contour scanning process
1954 CvContourScanner cvStartFindContours(\par CvArr* image,\par CvMemStorage* storage,\par
1955 int header\_size=sizeof(CvContour),\par
1956 int mode=CV\_RETR\_LIST,\par
1957 int method=CV\_CHAIN\_APPROX\_SIMPLE,\par
1958 CvPoint offset=cvPoint(0,\par0) );
1962 \cvarg{image}{The source 8-bit single channel binary image}
1963 \cvarg{storage}{Container of the retrieved contours}
1964 \cvarg{header\_size}{Size of the sequence header, >=sizeof(CvChain) if \texttt{method} =CV\_CHAIN\_CODE, and >=sizeof(CvContour) otherwise}
1965 \cvarg{mode}{Retrieval mode; see \cross{FindContours}}
1966 \cvarg{method}{Approximation method. It has the same meaning as in \cross{FindContours}, but \texttt{CV\_LINK\_RUNS} can not be used here}
1967 \cvarg{offset}{ROI offset; see \cross{FindContours}}
1970 The function \texttt{cvStartFindContours} initializes and returns pointer to the contour scanner. The scanner is used further in \cross{FindNextContour} to retrieve the rest of contours.
1972 \cvfunc{FindNextContour}\label{FindNextContour}
1974 Finds next contour in the image
1977 CvSeq* cvFindNextContour( \par CvContourScanner scanner );
1981 \cvarg{scanner}{Contour scanner initialized by The function \texttt{cvStartFindContours} }
1984 The function \texttt{cvFindNextContour} locates and retrieves the next contour in the image and returns pointer to it. The function returns NULL, if there is no more contours.
1986 \cvfunc{SubstituteContour}\label{SubstituteContour}
1988 Replaces retrieved contour
1991 void cvSubstituteContour( \par CvContourScanner scanner, \par CvSeq* new\_contour );
1995 \cvarg{scanner}{Contour scanner initialized by \cross{StartFindContours} }
1996 \cvarg{new\_contour}{Substituting contour}
1999 The function \texttt{cvSubstituteContour} replaces the retrieved
2000 contour, that was returned from the preceding call of The function
2001 \cross{FindNextContour} and stored inside the contour scanner
2002 state, with the user-specified contour. The contour is inserted
2003 into the resulting structure, list, two-level hierarchy, or tree,
2004 depending on the retrieval mode. If the parameter \texttt{new\_contour}
2005 is \texttt{NULL}, the retrieved contour is not included into the
2006 resulting structure, nor all of its children that might be added
2007 to this structure later.
2009 \cvfunc{EndFindContours}\label{EndFindContours}
2011 Finishes scanning process
2014 CvSeq* cvEndFindContours( \par CvContourScanner* scanner );
2018 \cvarg{scanner}{Pointer to the contour scanner}
2021 The function \texttt{cvEndFindContours} finishes the scanning process and returns the pointer to the first contour on the highest level.
2023 % XXX missing PyrMeanShiftFiltering
2024 % XXX missing Watershed
2026 \subsection{Image and Contour moments}
2028 \cvfunc{Moments}\label{Moments}
2030 Calculates all moments up to third order of a polygon or rasterized shape
2033 void cvMoments( \par const CvArr* arr,\par CvMoments* moments,\par int binary=0 );
2037 \cvarg{arr}{Image (1-channel or 3-channel with COI set) or polygon (CvSeq of points or a vector of points)}
2038 \cvarg{moments}{Pointer to returned moment state structure}
2039 \cvarg{binary}{(For images only) If the flag is non-zero, all the zero pixel values are treated as zeroes, all the others are treated as 1's}
2042 The function \texttt{cvMoments} calculates spatial and central moments up to the third order and writes them to \texttt{moments}. The moments may be used then to calculate gravity center of the shape, its area, main axises and various shape characeteristics including 7 Hu invariants.
2044 \cvfunc{GetSpatialMoment}\label{GetSpatialMoment}
2046 Retrieves spatial moment from moment state structure
2049 double cvGetSpatialMoment( \par CvMoments* moments, \par int x\_order, \par int y\_order );
2053 \cvarg{moments}{The moment state, calculated by \cross{Moments}}
2054 \cvarg{x\_order}{x order of the retrieved moment, $\texttt{x\_order} >= 0$}
2055 \cvarg{y\_order}{y order of the retrieved moment, $\texttt{y\_order} >= 0$ and $\texttt{x\_order} + \texttt{y\_order} <= 3$}
2058 The function \texttt{cvGetSpatialMoment} retrieves the spatial moment, which in case of image moments is defined as:
2061 M_{x\_order, \, y\_order} = \sum_{x,y} (I(x,y) \cdot x^{x\_order} \cdot y^{y\_order})
2064 where $I(x,y)$ is the intensity of the pixel $(x, y)$.
2066 \cvfunc{GetCentralMoment}\label{GetCentralMoment}
2068 Retrieves central moment from moment state structure
2071 double cvGetCentralMoment( \par CvMoments* moments,\par int x\_order,\par int y\_order );
2075 \cvarg{moments}{Pointer to the moment state structure}
2076 \cvarg{x\_order}{x order of the retrieved moment, $\texttt{x\_order} >= 0$}
2077 \cvarg{y\_order}{y order of the retrieved moment, $\texttt{y\_order} >= 0$ and $\texttt{x\_order} + \texttt{y\_order} <= 3$}
2080 The function \texttt{cvGetCentralMoment} retrieves the central moment, which in case of image moments is defined as:
2083 \mu_{x\_order, \, y\_order} = \sum_{x,y} (I(x,y) \cdot (x-x_c)^{x\_order} \cdot (y-y_c)^{y\_order})
2086 where $x_c,y_c$ are coordinates of the gravity center:
2089 x_c=\frac{M_{10}}{M_{00}}, y_c=\frac{M_{01}}{M_{00}}
2092 \cvfunc{GetNormalizedCentralMoment}\label{GetNormalizedCentralMoment}
2094 Retrieves normalized central moment from moment state structure
2097 double cvGetNormalizedCentralMoment( \par CvMoments* moments,\par int x\_order,\par int y\_order );
2101 \cvarg{moments}{Pointer to the moment state structure}
2102 \cvarg{x\_order}{x order of the retrieved moment, $\texttt{x\_order} >= 0$}
2103 \cvarg{y\_order}{y order of the retrieved moment, $\texttt{y\_order} >= 0$ and $\texttt{x\_order} + \texttt{y\_order} <= 3$}
2106 The function \texttt{cvGetNormalizedCentralMoment} retrieves the normalized central moment:
2109 \eta_{x\_order, \, y\_order} = \frac{\mu_{x\_order, \, y\_order}}{M_{00}^{(y\_order+x\_order)/2+1}}
2112 \cvfunc{GetHuMoments}\label{GetHuMoments}
2114 Calculates seven Hu invariants
2117 void cvGetHuMoments( CvMoments* moments, CvHuMoments* hu\_moments );
2121 \cvarg{moments}{Pointer to the moment state structure}
2122 \cvarg{hu\_moments}{Pointer to Hu moments structure}
2125 The function \texttt{cvGetHuMoments} calculates seven Hu invariants that are defined as:
2128 h1=\eta_{20}+\eta_{02}\\
2129 h2=(\eta_{20}-\eta_{02})^{2}+4\eta_{11}^{2}\\
2130 h1=\eta_{20}+\eta_{02}\\
2131 h2=(\eta_{20}-\eta_{02})^{2}+4\eta_{11}^{2}\\
2132 h3=(\eta_{30}-3\eta_{12})^{2}+ (3\eta_{21}-\eta_{03})^{2}\\
2133 h4=(\eta_{30}+\eta_{12})^{2}+ (\eta_{21}+\eta_{03})^{2}\\
2134 h5=(\eta_{30}-3\eta_{12})(\eta_{30}+\eta_{12})[(\eta_{30}+\eta_{12})^{2}-3(\eta_{21}+\eta_{03})^{2}]+(3\eta_{21}-\eta_{03})(\eta_{21}+\eta_{03})[3(\eta_{30}+\eta_{12})^{2}-(\eta_{21}+\eta_{03})^{2}]\\
2135 h6=(\eta_{20}-\eta_{02})[(\eta_{30}+\eta_{12})^{2}- (\eta_{21}+\eta_{03})^{2}]+4\eta_{11}(\eta_{30}+\eta_{12})(\eta_{21}+\eta_{03})\\
2136 h7=(3\eta_{21}-\eta_{03})(\eta_{21}+\eta_{03})[3(\eta_{30}+\eta_{12})^{2}-(\eta_{21}+\eta_{03})^{2}]-(\eta_{30}-3\eta_{12})(\eta_{21}+\eta_{03})[3(\eta_{30}+\eta_{12})^{2}-(\eta_{21}+\eta_{03})^{2}]\\
2140 where $\eta_{i,j}$ are normalized central moments of $2^{nd}$ and $3^{rd}$ orders.
2141 These values are proved to be invariants to the image scale, rotation, and reflection except the seventh one, whose sign is changed by reflection.
2143 \subsection{Special Image Transforms}
2145 \cvfunc{HoughLines2}\label{HoughLines2}
2147 Finds lines in binary image using Hough transform
2150 CvSeq* cvHoughLines2( \par CvArr* image,\par void* line\_storage,\par int method,\par double rho,\par double theta,\par int threshold,\par double param1=0,\par double param2=0 );
2154 \cvarg{image}{The input 8-bit single-channel binary image. In case of probabilistic method the image is modified by the function}
2155 \cvarg{line\_storage}{The storage for the lines detected. It can
2156 be a memory storage (in this case a sequence of lines is created in
2157 the storage and returned by the function) or single row/single column
2158 matrix (CvMat*) of a particular type (see below) to which the lines'
2159 parameters are written. The matrix header is modified by the function
2160 so its \texttt{cols} or \texttt{rows} will contain a number of lines
2161 detected. If \texttt{line\_storage} is a matrix and the actual number
2162 of lines exceeds the matrix size, the maximum possible number of lines
2163 is returned (in case of standard hough transform the lines are sorted
2164 by the accumulator value)}
2165 \cvarg{method}{The Hough transform variant, one of:
2167 \cvarg{CV\_HOUGH\_STANDARD}{classical or standard Hough transform. Every line is represented by two floating-point numbers $(\rho, \theta)$, where $\rho$ is a distance between (0,0) point and the line, and $\theta$ is the angle between x-axis and the normal to the line. Thus, the matrix must be (the created sequence will be) of \texttt{CV\_32FC2} type.}
2168 \cvarg{CV\_HOUGH\_PROBABILISTIC}{probabilistic Hough transform (more efficient in case if picture contains a few long linear segments). It returns line segments rather than the whole lines. Every segment is represented by starting and ending points, and the matrix must be (the created sequence will be) of \texttt{CV\_32SC4} type.}
2169 \cvarg{CV\_HOUGH\_MULTI\_SCALE}{multi-scale variant of classical Hough transform. The lines are encoded the same way as in \texttt{CV\_HOUGH\_STANDARD}.}
2171 \cvarg{rho}{Distance resolution in pixel-related units}
2172 \cvarg{theta}{Angle resolution measured in radians}
2173 \cvarg{threshold}{Threshold parameter. A line is returned by the function if the corresponding accumulator value is greater than \texttt{threshold}}
2174 \cvarg{param1}{The first method-dependent parameter:
2176 \item For classical Hough transform it is not used (0).
2177 \item For probabilistic Hough transform it is the minimum line length.
2178 \item For multi-scale Hough transform it is divisor for distance resolution $\rho$. (The coarse distance resolution will be $\rho$ and the accurate resolution will be $(\rho / \texttt{param1})$).
2180 \cvarg{param2}{The second method-dependent parameter:
2182 \item For classical Hough transform it is not used (0).
2183 \item For probabilistic Hough transform it is the maximum gap between line segments lieing on the same line to treat them as the single line segment (i.e. to join them).
2184 \item For multi-scale Hough transform it is divisor for angle resolution $\theta$. (The coarse angle resolution will be $\theta$ and the accurate resolution will be $(\theta / \texttt{param2})$).
2188 The function \texttt{cvHoughLines2} implements a few variants of Hough transform for line detection.
2190 % ===== Example. Detecting lines with Hough transform. =====
2192 /* This is a standalone program. Pass an image name as a first parameter
2193 of the program. Switch between standard and probabilistic Hough transform
2194 by changing "#if 1" to "#if 0" and back */
2196 #include <highgui.h>
2199 int main(int argc, char** argv)
2202 if( argc == 2 && (src=cvLoadImage(argv[1], 0))!= 0)
2204 IplImage* dst = cvCreateImage( cvGetSize(src), 8, 1 );
2205 IplImage* color_dst = cvCreateImage( cvGetSize(src), 8, 3 );
2206 CvMemStorage* storage = cvCreateMemStorage(0);
2209 cvCanny( src, dst, 50, 200, 3 );
2210 cvCvtColor( dst, color_dst, CV_GRAY2BGR );
2212 lines = cvHoughLines2( dst,
2221 for( i = 0; i < MIN(lines->total,100); i++ )
2223 float* line = (float*)cvGetSeqElem(lines,i);
2224 float rho = line[0];
2225 float theta = line[1];
2227 double a = cos(theta), b = sin(theta);
2228 double x0 = a*rho, y0 = b*rho;
2229 pt1.x = cvRound(x0 + 1000*(-b));
2230 pt1.y = cvRound(y0 + 1000*(a));
2231 pt2.x = cvRound(x0 - 1000*(-b));
2232 pt2.y = cvRound(y0 - 1000*(a));
2233 cvLine( color_dst, pt1, pt2, CV_RGB(255,0,0), 3, 8 );
2236 lines = cvHoughLines2( dst,
2238 CV_HOUGH_PROBABILISTIC,
2244 for( i = 0; i < lines->total; i++ )
2246 CvPoint* line = (CvPoint*)cvGetSeqElem(lines,i);
2247 cvLine( color_dst, line[0], line[1], CV_RGB(255,0,0), 3, 8 );
2250 cvNamedWindow( "Source", 1 );
2251 cvShowImage( "Source", src );
2253 cvNamedWindow( "Hough", 1 );
2254 cvShowImage( "Hough", color_dst );
2261 This is the sample picture the function parameters have been tuned for:
2263 \includegraphics[width=0.5\textwidth]{pics/building.jpg}
2265 And this is the output of the above program in case of probabilistic Hough transform (\texttt{\#if 0} case):
2267 \includegraphics[width=0.5\textwidth]{pics/houghp.png}
2269 \cvfunc{HoughCircles}\label{HoughCircles}
2271 Finds circles in grayscale image using Hough transform
2274 CvSeq* cvHoughCircles( \par CvArr* image,\par void* circle\_storage,\par int method,\par double dp,\par double min\_dist,\par double param1=100,\par double param2=100,\par int min\_radius=0,\par int max\_radius=0 );
2278 \cvarg{image}{The input 8-bit single-channel grayscale image}
2279 \cvarg{circle\_storage}{The storage for the circles detected. It can be a memory storage (in this case a sequence of circles is created in the storage and returned by the function) or single row/single column matrix (CvMat*) of type \texttt{CV\_32FC3}, to which the circles' parameters are written. The matrix header is modified by the function so its \texttt{cols} or \texttt{rows} will contain a number of lines detected. If \texttt{circle\_storage} is a matrix and the actual number of lines exceeds the matrix size, the maximum possible number of circles is returned. Every circle is encoded as 3 floating-point numbers: center coordinates (x,y) and the radius}
2280 \cvarg{method}{Currently, the only implemented method is \texttt{CV\_HOUGH\_GRADIENT}, which is basically 21HT, described in Yuen03.}
2281 \cvarg{dp}{Resolution of the accumulator used to detect centers of the circles. For example, if it is 1, the accumulator will have the same resolution as the input image, if it is 2 - accumulator will have twice smaller width and height, etc}
2282 \cvarg{min\_dist}{Minimum distance between centers of the detected circles. If the parameter is too small, multiple neighbor circles may be falsely detected in addition to a true one. If it is too large, some circles may be missed}
2283 \cvarg{param1}{The first method-specific parameter. In case of \texttt{CV\_HOUGH\_GRADIENT} it is the higher threshold of the two passed to \cross{Canny} edge detector (the lower one will be twice smaller)}
2284 \cvarg{param2}{The second method-specific parameter. In case of \texttt{CV\_HOUGH\_GRADIENT} it is accumulator threshold at the center detection stage. The smaller it is, the more false circles may be detected. Circles, corresponding to the larger accumulator values, will be returned first}
2285 \cvarg{min\_radius}{Minimum circle radius}
2286 \cvarg{max\_radius}{Maximum circle radius}
2289 The function \texttt{cvHoughCircles} finds circles in grayscale image using some modification of Hough transform.
2291 % ===== Example. Detecting circles with Hough transform. =====
2294 #include <highgui.h>
2297 int main(int argc, char** argv)
2300 if( argc == 2 && (img=cvLoadImage(argv[1], 1))!= 0)
2302 IplImage* gray = cvCreateImage( cvGetSize(img), 8, 1 );
2303 CvMemStorage* storage = cvCreateMemStorage(0);
2304 cvCvtColor( img, gray, CV_BGR2GRAY );
2305 // smooth it, otherwise a lot of false circles may be detected
2306 cvSmooth( gray, gray, CV_GAUSSIAN, 9, 9 );
2307 CvSeq* circles = cvHoughCircles( gray,
2315 for( i = 0; i < circles->total; i++ )
2317 float* p = (float*)cvGetSeqElem( circles, i );
2319 cvPoint(cvRound(p[0]),cvRound(p[1])),
2324 cvPoint(cvRound(p[0]),cvRound(p[1])),
2329 cvNamedWindow( "circles", 1 );
2330 cvShowImage( "circles", img );
2336 \cvfunc{DistTransform}\label{DistTransform}
2338 Calculates distance to closest zero pixel for all non-zero pixels of source image
2341 void cvDistTransform( \par const CvArr* src,\par CvArr* dst,\par int distance\_type=CV\_DIST\_L2,\par int mask\_size=3,\par const float* mask=NULL,\par CvArr* labels=NULL );
2345 \cvarg{src}{Source 8-bit single-channel (binary) image}
2346 \cvarg{dst}{Output image with calculated distances (32-bit floating-point, single-channel)}
2347 \cvarg{distance\_type}{Type of distance; can be \texttt{CV\_DIST\_L1, CV\_DIST\_L2, CV\_DIST\_C} or \texttt{CV\_DIST\_USER}}
2348 \cvarg{mask\_size}{Size of distance transform mask; can be 3 or 5. In case of \texttt{CV\_DIST\_L1} or \texttt{CV\_DIST\_C} the parameter is forced to 3, because $3\times 3$ mask gives the same result as $5\times 5 $ yet it is faster}
2349 \cvarg{mask}{User-defined mask in case of user-defined distance, it consists of 2 numbers (horizontal/vertical shift cost, diagonal shift cost) in case of $3\times 3$ mask and 3 numbers (horizontal/vertical shift cost, diagonal shift cost, knight's move cost) in case of $5\times 5$ mask}
2350 \cvarg{labels}{The optional output 2d array of labels of integer type and the same size as \texttt{src} and \texttt{dst}}
2353 The function \texttt{cvDistTransform} calculates the approximated
2354 distance from every binary image pixel to the nearest zero pixel.
2355 For zero pixels the function sets the zero distance, for others it
2356 finds the shortest path consisting of basic shifts: horizontal,
2357 vertical, diagonal or knight's move (the latest is available for
2358 $5\times 5$ mask). The overal distance is calculated as a sum of these
2359 basic distances. Because the distance function should be symmetric,
2360 all the horizontal and vertical shifts must have the same cost (that
2361 is denoted as \texttt{a}), all the diagonal shifts must have the
2362 same cost (denoted \texttt{b}), and all knight?s moves must have
2363 the same cost (denoted \texttt{c}). For \texttt{CV\_DIST\_C} and
2364 \texttt{CV\_DIST\_L1} types the distance is calculated precisely,
2365 whereas for \texttt{CV\_DIST\_L2} (Euclidian distance) the distance
2366 can be calculated only with some relative error ($5\times 5$ mask
2367 gives more accurate results), OpenCV uses the values suggested in
2371 \begin{tabular}{| c | c | c |}
2373 \texttt{CV\_DIST\_C} & $(3\times 3)$ & a = 1, b = 1\\ \hline
2374 \texttt{CV\_DIST\_L1} & $(3\times 3)$ & a = 1, b = 2\\ \hline
2375 \texttt{CV\_DIST\_L2} & $(3\times 3)$ & a=0.955, b=1.3693\\ \hline
2376 \texttt{CV\_DIST\_L2} & $(5\times 5)$ & a=1, b=1.4, c=2.1969\\ \hline
2379 And below are samples of distance field (black (0) pixel is in the middle of white square) in case of user-defined distance:
2381 User-defined $3 \times 3$ mask (a=1, b=1.5)
2383 \begin{tabular}{| c | c | c | c | c | c | c |}
2385 4.5 & 4 & 3.5 & 3 & 3.5 & 4 & 4.5\\ \hline
2386 4 & 3 & 2.5 & 2 & 2.5 & 3 & 4\\ \hline
2387 3.5 & 2.5 & 1.5 & 1 & 1.5 & 2.5 & 3.5\\ \hline
2388 3 & 2 & 1 & & 1 & 2 & 3\\ \hline
2389 3.5 & 2.5 & 1.5 & 1 & 1.5 & 2.5 & 3.5\\ \hline
2390 4 & 3 & 2.5 & 2 & 2.5 & 3 & 4\\ \hline
2391 4.5 & 4 & 3.5 & 3 & 3.5 & 4 & 4.5\\ \hline
2394 User-defined $5 \times 5$ mask (a=1, b=1.5, c=2)
2396 \begin{tabular}{| c | c | c | c | c | c | c |}
2398 4.5 & 3.5 & 3 & 3 & 3 & 3.5 & 4.5\\ \hline
2399 3.5 & 3 & 2 & 2 & 2 & 3 & 3.5\\ \hline
2400 3 & 2 & 1.5 & 1 & 1.5 & 2 & 3\\ \hline
2401 3 & 2 & 1 & & 1 & 2 & 3\\ \hline
2402 3 & 2 & 1.5 & 1 & 1.5 & 2 & 3\\ \hline
2403 3.5 & 3 & 2 & 2 & 2 & 3 & 3.5\\ \hline
2404 4 & 3.5 & 3 & 3 & 3 & 3.5 & 4\\ \hline
2408 Typically, for fast coarse distance estimation \texttt{CV\_DIST\_L2},
2409 $3\times 3$ mask is used, and for more accurate distance estimation
2410 \texttt{CV\_DIST\_L2}, $5\times 5$ mask is used.
2412 When the output parameter \texttt{labels} is not \texttt{NULL}, for
2413 every non-zero pixel the function also finds the nearest connected
2414 component consisting of zero pixels. The connected components
2415 themselves are found as contours in the beginning of the function.
2417 In this mode the processing time is still O(N), where N is the number of
2418 pixels. Thus, the function provides a very fast way to compute approximate
2419 Voronoi diagram for the binary image.
2421 % XXX missing Inpaint
2423 \subsection{Histograms}
2425 \cvfunc{CvHistogram}\label{CvHistogram}
2427 Muti-dimensional histogram
2430 typedef struct CvHistogram
2434 float thresh[CV\_MAX\_DIM][2]; /* for uniform histograms */
2435 float** thresh2; /* for non-uniform histograms */
2436 CvMatND mat; /* embedded matrix header for array histograms */
2441 \cvfunc{CreateHist}\label{CreateHist}
2446 CvHistogram* cvCreateHist(\par int dims,\par int* sizes,\par int type,\par float** ranges=NULL,\par int uniform=1 );
2450 \cvarg{dims}{Number of histogram dimensions}
2451 \cvarg{sizes}{Array of histogram dimension sizes}
2452 \cvarg{type}{Histogram representation format: \texttt{CV\_HIST\_ARRAY} means that histogram data is represented as an multi-dimensional dense array CvMatND; \texttt{CV\_HIST\_SPARSE} means that histogram data is represented as a multi-dimensional sparse array CvSparseMat}
2453 \cvarg{ranges}{Array of ranges for histogram bins. Its meaning depends on the \texttt{uniform} parameter value. The ranges are used for when histogram is calculated or backprojected to determine, which histogram bin corresponds to which value/tuple of values from the input image[s]}
2454 \cvarg{uniform}{Uniformity flag; if not 0, the histogram has evenly
2455 spaced bins and for every $0<=i<cDims$ \texttt{ranges[i]}
2456 is array of two numbers: lower and upper boundaries for the i-th
2457 histogram dimension.
2458 The whole range [lower,upper] is split then
2459 into \texttt{dims[i]} equal parts to determine \texttt{i-th} input
2460 tuple value ranges for every histogram bin. And if \texttt{uniform=0},
2461 then \texttt{i-th} element of \texttt{ranges} array contains
2462 \texttt{dims[i]+1} elements:
2463 $\texttt{lower}_0, \texttt{upper}_0,
2464 \texttt{lower}_1, \texttt{upper}_1 = \texttt{lower}_2,
2466 \texttt{upper}_{dims[i]-1} $
2468 $\texttt{lower}_j$ and $\texttt{upper}_j$
2470 boundaries of \texttt{i-th} input tuple value for \texttt{j-th}
2471 bin, respectively. In either case, the input values that are beyond
2472 the specified range for a histogram bin, are not counted by
2473 \cross{CalcHist} and filled with 0 by \cross{CalcBackProject}}
2476 The function \texttt{CreateHist} creates a histogram of the specified size and returns the pointer to the created histogram. If the array \texttt{ranges} is 0, the histogram bin ranges must be specified later via the function \texttt{cvSetHistBinRanges}, though \cross{CalcHist} and \cross{CalcBackProject} may process 8-bit images without setting bin ranges, they assume equally spaced in 0..255 bins.
2478 \cvfunc{SetHistBinRanges}\label{SetHistBinRanges}
2480 Sets bounds of histogram bins
2483 void cvSetHistBinRanges( \par CvHistogram* hist,\par float** ranges,\par int uniform=1 );
2487 \cvarg{hist}{Histogram}
2488 \cvarg{ranges}{Array of bin ranges arrays, see \cross{CreateHist}}
2489 \cvarg{uniform}{Uniformity flag, see \cross{CreateHist}}
2492 The function \texttt{cvSetHistBinRanges} is a stand-alone function for setting bin ranges in the histogram. For more detailed description of the parameters \texttt{ranges} and \texttt{uniform} see \cross{CalcHist} function, that can initialize the ranges as well. Ranges for histogram bins must be set before the histogram is calculated or backproject of the histogram is calculated.
2495 \cvfunc{ReleaseHist}\label{ReleaseHist}
2500 void cvReleaseHist( CvHistogram** hist );
2504 \cvarg{hist}{Double pointer to the released histogram}
2507 The function \texttt{cvReleaseHist} releases the histogram (header and the data). The pointer to histogram is cleared by the function. If \texttt{*hist} pointer is already \texttt{NULL}, the function does nothing.
2509 \cvfunc{ClearHist}\label{ClearHist}
2514 void cvClearHist( CvHistogram* hist );
2518 \cvarg{hist}{Histogram}
2521 The function \texttt{ClearHist} sets all histogram bins to 0 in case of dense histogram and removes all histogram bins in case of sparse array.
2523 \cvfunc{MakeHistHeaderForArray}\label{MakeHistHeaderForArray}
2525 Makes a histogram out of array
2528 CvHistogram* cvMakeHistHeaderForArray( \par int dims,\par int* sizes,\par CvHistogram* hist,\par float* data,\par float** ranges=NULL,\par int uniform=1 );
2532 \cvarg{dims}{Number of histogram dimensions}
2533 \cvarg{sizes}{Array of histogram dimension sizes}
2534 \cvarg{hist}{The histogram header initialized by the function}
2535 \cvarg{data}{Array that will be used to store histogram bins}
2536 \cvarg{ranges}{Histogram bin ranges, see \cross{CreateHist}}
2537 \cvarg{uniform}{Uniformity flag, see \cross{CreateHist}}
2540 The function \texttt{cvMakeHistHeaderForArray} initializes the histogram, which header and bins are allocated by user. No \cross{ReleaseHist} need to be called afterwards. Only dense histograms can be initialized this way. The function returns \texttt{hist}.
2542 \cvfunc{QueryHistValue\_nD}\label{QueryHistValue_nD}
2544 Queries value of histogram bin
2547 \#define cvQueryHistValue\_1D( hist, idx0 ) \
2548 cvGetReal1D( (hist)->bins, (idx0) )
2549 \#define cvQueryHistValue\_2D( hist, idx0, idx1 ) \
2550 cvGetReal2D( (hist)->bins, (idx0), (idx1) )
2551 \#define cvQueryHistValue\_3D( hist, idx0, idx1, idx2 ) \
2552 cvGetReal3D( (hist)->bins, (idx0), (idx1), (idx2) )
2553 \#define cvQueryHistValue\_nD( hist, idx ) \
2554 cvGetRealND( (hist)->bins, (idx) )
2558 \cvarg{hist}{Histogram}
2559 \cvarg{idx0, idx1, idx2, idx3}{Indices of the bin}
2560 \cvarg{idx}{Array of indices}
2563 The macros \texttt{QueryHistValue\_nD} return the value of the specified bin of 1D, 2D, 3D or N-D histogram. In case of sparse histogram the function returns 0, if the bin is not present in the histogram, and no new bin is created.
2565 \cvfunc{GetHistValue\_nD}\label{GetHistValue_nD}
2567 Returns pointer to histogram bin
2570 \#define cvGetHistValue\_1D( hist, idx0 ) \
2571 ((float*)(cvPtr1D( (hist)->bins, (idx0), 0 ))
2572 \#define cvGetHistValue\_2D( hist, idx0, idx1 ) \
2573 ((float*)(cvPtr2D( (hist)->bins, (idx0), (idx1), 0 ))
2574 \#define cvGetHistValue\_3D( hist, idx0, idx1, idx2 ) \
2575 ((float*)(cvPtr3D( (hist)->bins, (idx0), (idx1), (idx2), 0 ))
2576 \#define cvGetHistValue\_nD( hist, idx ) \
2577 ((float*)(cvPtrND( (hist)->bins, (idx), 0 ))
2581 \cvarg{hist}{Histogram}
2582 \cvarg{idx0, idx1, idx2, idx3}{Indices of the bin}
2583 \cvarg{idx}{Array of indices}
2586 The macros \texttt{GetHistValue} return pointer to the specified bin of 1D, 2D, 3D or N-D histogram. In case of sparse histogram the function creates a new bin and sets it to 0, unless it exists already.
2589 \cvfunc{GetMinMaxHistValue}\label{GetMinMaxHistValue}
2591 Finds minimum and maximum histogram bins
2594 void cvGetMinMaxHistValue( \par const CvHistogram* hist,\par float* min\_value,\par float* max\_value,\par int* min\_idx=NULL,\par int* max\_idx=NULL );
2599 \cvarg{hist}{Histogram}
2600 \cvarg{min\_value}{Pointer to the minimum value of the histogram}
2601 \cvarg{max\_value}{Pointer to the maximum value of the histogram}
2602 \cvarg{min\_idx}{Pointer to the array of coordinates for minimum}
2603 \cvarg{max\_idx}{Pointer to the array of coordinates for maximum}
2606 The function \texttt{cvGetMinMaxHistValue} finds the minimum and
2607 maximum histogram bins and their positions. Any of output arguments is
2608 optional. Among several extremums with the same value the ones with
2609 minimum index (in lexicographical order) In case of several maximums
2610 or minimums the earliest in lexicographical order extrema locations
2613 \cvfunc{NormalizeHist}\label{NormalizeHist}
2615 Normalizes histogram
2618 void cvNormalizeHist( CvHistogram* hist, double factor );
2622 \cvarg{hist}{Pointer to the histogram}
2623 \cvarg{factor}{Normalization factor}
2626 The function \texttt{cvNormalizeHist} normalizes the histogram bins by scaling them, such that the sum of the bins becomes equal to \texttt{factor}.
2628 \cvfunc{ThreshHist}\label{ThreshHist}
2630 Thresholds histogram
2633 void cvThreshHist( CvHistogram* hist, double threshold );
2637 \cvarg{hist}{Pointer to the histogram}
2638 \cvarg{threshold}{Threshold level}
2641 The function \texttt{cvThreshHist} clears histogram bins that are below the specified threshold.
2643 \cvfunc{CompareHist}\label{CompareHist}
2645 Compares two dense histograms
2648 double cvCompareHist( \par const CvHistogram* hist1,\par const CvHistogram* hist2,\par int method );
2652 \cvarg{hist1}{The first dense histogram}
2653 \cvarg{hist2}{The second dense histogram}
2654 \cvarg{method}{Comparison method, one of:
2656 \cvarg{CV\_COMP\_CORREL}{Correlation}
2657 \cvarg{CV\_COMP\_CHISQR}{Chi-Square}
2658 \cvarg{CV\_COMP\_INTERSECT}{Intersection}
2659 \cvarg{CV\_COMP\_BHATTACHARYYA}{Bhattacharyya distance}
2663 The function \texttt{CompareHist} compares two dense histograms using the specified method as following $H_1$ denotes the first histogram, $H_2$ the second):
2666 \item[Correlation (method=CV\_COMP\_CORREL)]
2669 {\sum_I (H'_1(I) \cdot H'_2(I))}
2670 {\sqrt{\sum_I(H'_1(I)^2) \cdot \sum_I(H'_2(I)^2)}}
2674 H'_k(I) = \frac{H_k(I) - 1}{N \cdot \sum_J H_k(J)}
2676 where N is the number of histogram bins.
2678 \item[Chi-Square (method=CV\_COMP\_CHISQR)]
2679 \[ d(H_1,H_2) = \sum_I \frac{H_1(I)-H_2(I)}{H_1(I)+H_2(I)} \]
2681 \item[Intersection (method=CV\_COMP\_INTERSECT)]
2682 \[ d(H_1,H_2) = \sum_I \min (H_1(I), H_2(I)) \]
2684 \item[Bhattacharyya distance (method=CV\_COMP\_BHATTACHARYYA)]
2685 \[ d(H_1,H_2) = \sqrt{1 - \sum_I \sqrt{H_1(I) \cdot H_2(I)}} \]
2689 The function returns $d(H_1, H_2)$ value.
2691 Note: the method \texttt{CV\_COMP\_BHATTACHARYYA} only works with normalized histograms.
2693 To compare sparse histogram or more general sparse configurations of weighted points, consider using \cross{CalcEMD2} function.
2695 \cvfunc{CopyHist}\label{CopyHist}
2700 void cvCopyHist( const CvHistogram* src, CvHistogram** dst );
2704 \cvarg{src}{Source histogram}
2705 \cvarg{dst}{Pointer to destination histogram}
2708 The function \texttt{CopyHist} makes a copy of the histogram. If the
2709 second histogram pointer \texttt{*dst} is NULL, a new histogram of the
2710 same size as \texttt{src} is created. Otherwise, both histograms must
2711 have equal types and sizes. Then the function copies the source histogram
2712 bins values to destination histogram and sets the same bin values ranges
2715 \cvfunc{CalcHist}\label{CalcHist}
2717 Calculates histogram of image(s)
2720 void cvCalcHist( \par IplImage** image,\par CvHistogram* hist,\par int accumulate=0,\par const CvArr* mask=NULL );
2724 \cvarg{image}{Source images (though, you may pass CvMat** as well)}
2725 \cvarg{hist}{Pointer to the histogram}
2726 \cvarg{accumulate}{Accumulation flag. If it is set, the histogram is not cleared in the beginning. This feature allows user to compute a single histogram from several images, or to update the histogram online}
2727 \cvarg{mask}{The operation mask, determines what pixels of the source images are counted}
2730 The function \texttt{CalcHist} calculates the histogram of one or more
2731 single-channel images. The elements of a tuple that is used to increment
2732 a histogram bin are taken at the same location from the corresponding
2735 % ===== Sample. Calculating and displaying 2D Hue-Saturation histogram of a color image =====
2738 #include <highgui.h>
2740 int main( int argc, char** argv )
2743 if( argc == 2 && (src=cvLoadImage(argv[1], 1))!= 0)
2745 IplImage* h_plane = cvCreateImage( cvGetSize(src), 8, 1 );
2746 IplImage* s_plane = cvCreateImage( cvGetSize(src), 8, 1 );
2747 IplImage* v_plane = cvCreateImage( cvGetSize(src), 8, 1 );
2748 IplImage* planes[] = { h_plane, s_plane };
2749 IplImage* hsv = cvCreateImage( cvGetSize(src), 8, 3 );
2750 int h_bins = 30, s_bins = 32;
2751 int hist_size[] = {h_bins, s_bins};
2752 /* hue varies from 0 (~0 deg red) to 180 (~360 deg red again) */
2753 float h_ranges[] = { 0, 180 };
2754 /* saturation varies from 0 (black-gray-white) to
2755 255 (pure spectrum color) */
2756 float s_ranges[] = { 0, 255 };
2757 float* ranges[] = { h_ranges, s_ranges };
2759 IplImage* hist_img =
2760 cvCreateImage( cvSize(h_bins*scale,s_bins*scale), 8, 3 );
2762 float max_value = 0;
2765 cvCvtColor( src, hsv, CV_BGR2HSV );
2766 cvCvtPixToPlane( hsv, h_plane, s_plane, v_plane, 0 );
2767 hist = cvCreateHist( 2, hist_size, CV_HIST_ARRAY, ranges, 1 );
2768 cvCalcHist( planes, hist, 0, 0 );
2769 cvGetMinMaxHistValue( hist, 0, &max_value, 0, 0 );
2772 for( h = 0; h < h_bins; h++ )
2774 for( s = 0; s < s_bins; s++ )
2776 float bin_val = cvQueryHistValue_2D( hist, h, s );
2777 int intensity = cvRound(bin_val*255/max_value);
2778 cvRectangle( hist_img, cvPoint( h*scale, s*scale ),
2779 cvPoint( (h+1)*scale - 1, (s+1)*scale - 1),
2780 CV_RGB(intensity,intensity,intensity),
2785 cvNamedWindow( "Source", 1 );
2786 cvShowImage( "Source", src );
2788 cvNamedWindow( "H-S Histogram", 1 );
2789 cvShowImage( "H-S Histogram", hist_img );
2796 \cvfunc{CalcBackProject}\label{CalcBackProject}
2798 Calculates back projection
2801 void cvCalcBackProject( \par IplImage** image,\par CvArr* back\_project,\par const CvHistogram* hist );
2805 \cvarg{image}{Source images (though you may pass CvMat** as well)}
2806 \cvarg{back\_project}{Destination back projection image of the same type as the source images}
2807 \cvarg{hist}{Histogram}
2810 The function \texttt{cvCalcBackProject} calculates the back project of the histogram. For each tuple of pixels at the same position of all input single-channel images the function puts the value of the histogram bin, corresponding to the tuple, to the destination image. In terms of statistics, the value of each output image pixel is probability of the observed tuple given the distribution (histogram). For example, to find a red object in the picture, one may do the following:
2813 \item Calculate a hue histogram for the red object assuming the image contains only this object. The histogram is likely to have a strong maximum, corresponding to red color.
2814 \item Calculate back projection of a hue plane of input image where the object is searched, using the histogram. Threshold the image.
2815 \item Find connected components in the resulting picture and choose the right component using some additional criteria, for example, the largest connected component.
2818 That is the approximate algorithm of Camshift color object tracker, except for the 3rd step, instead of which CAMSHIFT algorithm is used to locate the object on the back projection given the previous object position.
2820 \cvfunc{CalcBackProjectPatch}\label{CalcBackProjectPatch}
2822 Locates a template within image by histogram comparison
2825 void cvCalcBackProjectPatch( \par IplImage** image,\par CvArr* dst,\par CvSize patch\_size,\par CvHistogram* hist,\par int method,\par float factor );
2829 \cvarg{image}{Source images (though, you may pass CvMat** as well)}
2830 \cvarg{dst}{Destination image}
2831 \cvarg{patch\_size}{Size of patch slid though the source image}
2832 \cvarg{hist}{Histogram}
2833 \cvarg{method}{Compasion method, passed to \cross{CompareHist} (see description of that function)}
2834 \cvarg{factor}{Normalization factor for histograms, will affect normalization scale of destination image, pass 1. if unsure}
2837 The function \texttt{cvCalcBackProjectPatch} calculates back projection by comparing histograms of the source image patches with the given histogram. Taking measurement results from some image at each location over ROI creates an array \texttt{image}. These results might be one or more of hue, \texttt{x} derivative, \texttt{y} derivative, Laplacian filter, oriented Gabor filter, etc. Each measurement output is collected into its own separate image. The \texttt{image} image array is a collection of these measurement images. A multi-dimensional histogram \texttt{hist} is constructed by sampling from the \texttt{image} image array. The final histogram is normalized. The \texttt{hist} histogram has as many dimensions as the number of elements in \texttt{image} array.
2839 Each new image is measured and then converted into an \texttt{image} image array over a chosen ROI. Histograms are taken from this \texttt{image} image in an area covered by a "patch" with anchor at center as shown in the picture below. The histogram is normalized using the parameter \texttt{norm\_factor} so that it may be compared with \texttt{hist}. The calculated histogram is compared to the model histogram; \texttt{hist} uses The function \texttt{cvCompareHist} with the comparison method=\texttt{method}). The resulting output is placed at the location corresponding to the patch anchor in the probability image \texttt{dst}. This process is repeated as the patch is slid over the ROI. Iterative histogram update by subtracting trailing pixels covered by the patch and adding newly covered pixels to the histogram can save a lot of operations, though it is not implemented yet.
2841 % ===== Back Project Calculation by Patches =====
2842 \includegraphics[width=0.5\textwidth]{pics/backprojectpatch.png}
2844 \cvfunc{CalcProbDensity}\label{CalcProbDensity}
2846 Divides one histogram by another
2849 void cvCalcProbDensity( \par const CvHistogram* hist1,\par const CvHistogram* hist2,\par CvHistogram* dst\_hist,\par double scale=255 );
2853 \cvarg{hist1}{first histogram (the divisor)}
2854 \cvarg{hist2}{second histogram}
2855 \cvarg{dst\_hist}{destination histogram}
2856 \cvarg{scale}{scale factor for the destination histogram}
2859 The function \texttt{CalcProbDensity} calculates the object probability density from the two histograms as:
2862 \texttt{dist\_hist}(I)=
2864 {0}{if $\texttt{hist1}(I)=0$}
2865 {\texttt{scale}}{if $\texttt{hist1}(I) \ne 0$ and $\texttt{hist2}(I) > \texttt{hist1}(I)$}
2866 {\frac{\texttt{hist2}(I) \cdot \texttt{scale}}{\texttt{hist1}(I)}}{if $\texttt{hist1}(I) \ne 0$ and $\texttt{hist2}(I) \le \texttt{hist1}(I)$}
2869 So the destination histogram bins are within less than scale.
2871 \cvfunc{EqualizeHist}\label{EqualizeHist}
2873 Equalizes histogram of grayscale image
2876 void cvEqualizeHist( const CvArr* src, CvArr* dst );
2880 \cvarg{src}{The input 8-bit single-channel image}
2881 \cvarg{dst}{The output image of the same size and the same data type as \texttt{src}}
2884 The function \texttt{cvEqualizeHist} equalizes histogram of the input image using the following algorithm:
2887 \item calculate histogram $H$ for src.
2888 \item normalize histogram, so that the sum of histogram bins is 255.
2889 \item compute integral of the histogram:
2891 H'_i = \sum_{0 \le j \le i} H(j)
2893 \item transform the image using $H'$ as a look-up table: $\texttt{dst}(x,y) = H'(\texttt{src}(x,y))$
2896 The algorithm normalizes brightness and increases contrast of the image.
2898 \subsection{Matching}
2900 \cvfunc{MatchTemplate}\label{MatchTemplate}
2902 Compares template against overlapped image regions
2905 void cvMatchTemplate( \par const CvArr* image,\par const CvArr* templ,\par CvArr* result,\par int method );
2909 \cvarg{image}{Image where the search is running. It should be 8-bit or 32-bit floating-point}
2910 \cvarg{templ}{Searched template; must be not greater than the source image and the same data type as the image}
2911 \cvarg{result}{A map of comparison results; single-channel 32-bit floating-point.
2912 If \texttt{image} is $W \times H$ and
2913 \texttt{templ} is $w \times h$ then \texttt{result} must be $(W-w+1) \times (H-h+1)$}
2914 \cvarg{method}{Specifies the way the template must be compared with image regions (see below)}
2917 The function \texttt{cvMatchTemplate} is similiar to
2918 \cross{CalcBackProjectPatch}. It slids through \texttt{image}, compares
2919 overlapped patches of size $w \times h$ against \texttt{templ}
2920 using the specified method and stores the comparison results to
2921 \texttt{result}. Here are the formulae for the different comparison
2922 methods one may use ($I$ denotes \texttt{image}, $T$ \texttt{template},
2923 $R$ \texttt{result}. The summation is done over template and/or the
2924 image patch: $x' = 0...w-1, y' = 0...h-1$
2926 % \texttt{x'=0..w-1, y'=0..h-1}):
2929 \item[method=CV\_TM\_SQDIFF]
2930 \[ R(x,y)=\sum_{x',y'} (T(x',y')-I(x+x',y+y'))^2 \]
2932 \item[method=CV\_TM\_SQDIFF\_NORMED]
2934 {\sum_{x',y'} (T(x',y')-I(x+x',y+y'))^2}
2935 {\sqrt{\sum_{x',y'}T(x',y')^2 \cdot \sum_{x',y'} I(x+x',y+y')^2}}
2938 \item[method=CV\_TM\_CCORR]
2939 \[ R(x,y)=\sum_{x',y'} (T(x',y') \cdot I(x+x',y+y')) \]
2941 \item[method=CV\_TM\_CCORR\_NORMED]
2943 {\sum_{x',y'} (T(x',y') \cdot I'(x+x',y+y'))}
2944 {\sqrt{\sum_{x',y'}T(x',y') \cdot \sum_{x',y'} I(x+x',y+y')}}
2947 \item[method=CV\_TM\_CCOEFF]
2948 \[ R(x,y)=\sum_{x',y'} (T'(x',y') \cdot I(x+x',y+y')) \]
2953 T'(x',y')=T(x',y') - 1/(w \cdot h) \cdot \sum_{x'',y''} T(x'',y'')\\
2954 I'(x+x',y+y')=I(x+x',y+y') - \frac{1}{(w \cdot h) \cdot \sum_{x'',y''} I(x+x'',y+y'')}
2958 \item[method=CV\_TM\_CCOEFF\_NORMED]
2960 { \sum_{x',y'} (T'(x',y') \cdot I'(x+x',y+y')) }
2961 { \sqrt{\sum_{x',y'}T'(x',y')^2 \cdot \sum_{x',y'} I'(x+x',y+y')^2} }
2965 After the function finishes comparison, the best matches can be found as global minimums (CV\_TM\_SQDIFF) or maximums (CV\_TM\_CCORR and CV\_TM\_CCOEFF) using \cross{MinMaxLoc} function. In case of color image and template summation in both numerator and each sum in denominator is done over all the channels (and separate mean values are used for each channel).
2967 \cvfunc{MatchShapes}\label{MatchShapes}
2972 double cvMatchShapes( \par const void* object1,\par const void* object2,\par int method,\par double parameter=0 );
2976 \cvarg{object1}{First contour or grayscale image}
2977 \cvarg{object2}{Second contour or grayscale image}
2978 \cvarg{method}{Comparison method, one of
2979 \texttt{CV\_CONTOUR\_MATCH\_I1},
2980 \texttt{CV\_CONTOURS\_MATCH\_I2}
2982 \texttt{CV\_CONTOURS\_MATCH\_I3}}
2983 \cvarg{parameter}{Method-specific parameter (is not used now)}
2986 The function \texttt{cvMatchShapes} compares two shapes. The 3 implemented methods all use Hu moments (see \cross{GetHuMoments}) ($A$ is \texttt{object1}, $B$ is \texttt{object2}):
2989 \item[method=CV\_CONTOUR\_MATCH\_I1]
2990 \[ I_1(A,B) = \sum_{i=1...7} \left| \frac{1}{m^A_i} - \frac{1}{m^B_i} \right| \]
2992 \item[method=CV\_CONTOUR\_MATCH\_I2]
2993 \[ I_2(A,B) = \sum_{i=1...7} \left| m^A_i - m^B_i \right| \]
2995 \item[method=CV\_CONTOUR\_MATCH\_I3]
2996 \[ I_3(A,B) = \sum_{i=1...7} \frac{ \left| m^A_i - m^B_i \right| }{ \left| m^A_i \right| } \]
3003 m^A_i = sign(h^A_i) \cdot \log{h^A_i}
3004 m^B_i = sign(h^B_i) \cdot \log{h^B_i}
3008 and $h^A_i, h^B_i$ are Hu moments of $A$ and $B$ respectively.
3010 \cvfunc{CalcEMD2}\label{CalcEMD2}
3012 Computes "minimal work" distance between two weighted point configurations
3015 float cvCalcEMD2( \par const CvArr* signature1,\par const CvArr* signature2,\par int distance\_type,\par CvDistanceFunction distance\_func=NULL,\par const CvArr* cost\_matrix=NULL,\par CvArr* flow=NULL,\par float* lower\_bound=NULL,\par void* userdata=NULL );
3019 typedef float (*CvDistanceFunction)(const float* f1, const float* f2, void* userdata);
3023 \cvarg{signature1}{First signature, $\texttt{size1}\times \texttt{dims}+1$ floating-point matrix. Each row stores the point weight followed by the point coordinates. The matrix is allowed to have a single column (weights only) if the user-defined cost matrix is used}
3024 \cvarg{signature2}{Second signature of the same format as \texttt{signature1}, though the number of rows may be different. The total weights may be different, in this case an extra "dummy" point is added to either \texttt{signature1} or \texttt{signature2}}
3025 \cvarg{distance\_type}{Metrics used; \texttt{CV\_DIST\_L1, CV\_DIST\_L2}, and \texttt{CV\_DIST\_C} stand for one of the standard metrics; \texttt{CV\_DIST\_USER} means that a user-defined function \texttt{distance\_func} or pre-calculated \texttt{cost\_matrix} is used}
3026 \cvarg{distance\_func}{The user-defined distance function. It takes coordinates of two points and returns the distance between the points}
3027 \cvarg{cost\_matrix}{The user-defined $\texttt{size1}\times \texttt{size2}$ cost matrix. At least one of \texttt{cost\_matrix} and \texttt{distance\_func} must be NULL. Also, if a cost matrix is used, lower boundary (see below) can not be calculated, because it needs a metric function}
3028 \cvarg{flow}{The resultant $\texttt{size1} \times \texttt{size2}$ flow matrix: $\texttt{flow}_{i,j}$ is a flow from $i$ th point of \texttt{signature1} to $j$ th point of \texttt{signature2}}
3029 \cvarg{lower\_bound}{Optional input/output parameter: lower boundary of distance between the two signatures that is a distance between mass centers. The lower boundary may not be calculated if the user-defined cost matrix is used, the total weights of point configurations are not equal, or there is the signatures consist of weights only (i.e. the signature matrices have a single column). User ''must'' initialize \texttt{*lower\_bound}. If the calculated distance between mass centers is greater or equal to \texttt{*lower\_bound} (it means that the signatures are far enough) the function does not calculate EMD. In any case \texttt{*lower\_bound} is set to the calculated distance between mass centers on return. Thus, if user wants to calculate both distance between mass centers and EMD, \texttt{*lower\_bound} should be set to 0}
3030 \cvarg{userdata}{Pointer to optional data that is passed into the user-defined distance function}
3033 The function \texttt{cvCalcEMD2} computes earth mover distance and/or
3034 a lower boundary of the distance between the two weighted point
3035 configurations. One of the application desctibed in \cite{RubnerSept98} is
3036 multi-dimensional histogram comparison for image retrieval. EMD is a
3037 transportation problem that is solved using some modification of simplex
3038 algorithm, thus the complexity is exponential in the worst case, though,
3039 it is much faster in average. In case of a real metric the lower boundary
3040 can be calculated even faster (using linear-time algorithm) and it can
3041 be used to determine roughly whether the two signatures are far enough
3042 so that they cannot relate to the same object.
3044 \section{Structural Analysis}
3046 \subsection{Contour Processing Functions}
3048 \cvfunc{ApproxChains}\label{ApproxChains}
3050 Approximates Freeman chain(s) with polygonal curve
3053 CvSeq* cvApproxChains( \par CvSeq* src\_seq,\par CvMemStorage* storage,\par int method=CV\_CHAIN\_APPROX\_SIMPLE,\par double parameter=0,\par int minimal\_perimeter=0,\par int recursive=0 );
3057 \cvarg{src\_seq}{Pointer to the chain that can refer to other chains}
3058 \cvarg{storage}{Storage location for the resulting polylines}
3059 \cvarg{method}{Approximation method (see the description of the function \cross{FindContours})}
3060 \cvarg{parameter}{Method parameter (not used now)}
3061 \cvarg{minimal\_perimeter}{Approximates only those contours whose perimeters are not less than \texttt{minimal\_perimeter}. Other chains are removed from the resulting structure}
3062 \cvarg{recursive}{If not 0, the function approximates all chains that access can be obtained to from \texttt{src\_seq} by \texttt{h\_next} or \texttt{v\_next links}. If 0, the single chain is approximated}
3065 This is a stand-alone approximation routine. The function \texttt{cvApproxChains} works exactly in the same way as \cross{FindContours} with the corresponding approximation flag. The function returns pointer to the first resultant contour. Other approximated contours, if any, can be accessed via \texttt{v\_next} or \texttt{h\_next} fields of the returned structure.
3067 \cvfunc{StartReadChainPoints}\label{StartReadChainPoints}
3069 Initializes chain reader
3072 void cvStartReadChainPoints( CvChain* chain, CvChainPtReader* reader );
3075 The function \texttt{cvStartReadChainPoints} initializes a special reader
3077 \cvfunc{ReadChainPoint}\label{ReadChainPoint}
3079 Gets next chain point
3082 CvPoint cvReadChainPoint( CvChainPtReader* reader );
3086 \cvarg{reader}{Chain reader state}
3089 The function \texttt{cvReadChainPoint} returns the current chain point and updates the reader position.
3091 \cvfunc{ApproxPoly}\label{ApproxPoly}
3093 Approximates polygonal curve(s) with desired precision
3096 CvSeq* cvApproxPoly( \par const void* src\_seq,\par int header\_size,\par CvMemStorage* storage,\par int method,\par double parameter,\par int parameter2=0 );
3100 \cvarg{src\_seq}{Sequence of array of points}
3101 \cvarg{header\_size}{Header size of approximated curve[s]}
3102 \cvarg{storage}{Container for approximated contours. If it is NULL, the input sequences' storage is used}
3103 \cvarg{method}{Approximation method; only \texttt{CV\_POLY\_APPROX\_DP} is supported, that corresponds to Douglas-Peucker algorithm}
3104 \cvarg{parameter}{Method-specific parameter; in case of \texttt{CV\_POLY\_APPROX\_DP} it is a desired approximation accuracy}
3105 \cvarg{parameter2}{If case if \texttt{src\_seq} is sequence it means whether the single sequence should be approximated or all sequences on the same level or below \texttt{src\_seq} (see \cross{FindContours} for description of hierarchical contour structures). And if \texttt{src\_seq} is array CvMat* of points, the parameter specifies whether the curve is closed (\texttt{parameter2}!=0) or not (\texttt{parameter2} =0)}
3108 The function \texttt{cvApproxPoly} approximates one or more curves and returns the approximation result[s]. In case of multiple curves approximation the resultant tree will have the same structure as the input one (1:1 correspondence).
3110 \cvfunc{BoundingRect}\label{BoundingRect}
3112 Calculates up-right bounding rectangle of point set
3115 CvRect cvBoundingRect( CvArr* points, int update=0 );
3119 \cvarg{points}{2D point set, either a sequence or vector (\texttt{CvMat}) of points}
3120 \cvarg{update}{The update flag. See below.}
3123 The function \texttt{cvBoundingRect} returns the up-right bounding rectangle for 2d point set.
3124 Here is list of possible combination of the flag values and type of \texttt{points}:
3126 \begin{tabular}{|c|c|p{3in}|}
3128 update & points & action \\ \hline
3129 0 & \texttt{CvContour\*} & the bounding rectangle is not calculated, but it is taken from \texttt{rect} field of the contour header.\\ \hline
3130 1 & \texttt{CvContour\*} & the bounding rectangle is calculated and written to \texttt{rect} field of the contour header.\\ \hline
3131 0 & \texttt{CvSeq\*} or \texttt{CvMat\*} & the bounding rectangle is calculated and returned.\\ \hline
3132 1 & \texttt{CvSeq\*} or \texttt{CvMat\*} & runtime error is raised.\\ \hline
3135 \cvfunc{ContourArea}\label{ContourArea}
3137 Calculates area of the whole contour or contour section
3140 double cvContourArea( \par const CvArr* contour, \par CvSlice slice=CV\_WHOLE\_SEQ );
3144 \cvarg{contour}{Contour (sequence or array of vertices)}
3145 \cvarg{slice}{Starting and ending points of the contour section of interest, by default area of the whole contour is calculated}
3148 The function \texttt{cvContourArea} calculates area of the whole contour
3149 or contour section. In the latter case the total area bounded by the
3150 contour arc and the chord connecting the 2 selected points is calculated
3151 as shown on the picture below:
3153 \includegraphics[width=0.5\textwidth]{pics/contoursecarea.png}
3155 Orientation of the contour affects the area sign, thus the function may return \emph{negative} result. Use \texttt{fabs()} function from C runtime to get the absolute value of area.
3157 \cvfunc{ArcLength}\label{ArcLength}
3159 Calculates contour perimeter or curve length
3162 double cvArcLength( \par const void* curve,\par CvSlice slice=CV\_WHOLE\_SEQ,\par int is\_closed=-1 );
3166 \cvarg{curve}{Sequence or array of the curve points}
3167 \cvarg{slice}{Starting and ending points of the curve, by default the whole curve length is calculated}
3168 \cvarg{is\_closed}{Indicates whether the curve is closed or not. There are 3 cases:
3170 \item $\texttt{is\_closed} =0$ the curve is assumed to be unclosed.
3171 \item $\texttt{is\_closed}>0$ the curve is assumed to be closed.
3172 \item $\texttt{is\_closed}<0$ if curve is sequence, the flag \texttt{CV\_SEQ\_FLAG\_CLOSED} of \texttt{((CvSeq*)curve)->flags} is checked to determine if the curve is closed or not, otherwise (curve is represented by array (CvMat*) of points) it is assumed to be unclosed.
3176 The function \texttt{cvArcLength} calculates length or curve as sum of lengths of segments between subsequent points
3178 \cvfunc{CreateContourTree}\label{CreateContourTree}
3180 Creates hierarchical representation of contour
3183 CvContourTree* cvCreateContourTree( /par const CvSeq* contour,\par CvMemStorage* storage,\par double threshold );
3187 \cvarg{contour}{Input contour}
3188 \cvarg{storage}{Container for output tree}
3189 \cvarg{threshold}{Approximation accuracy}
3192 The function \texttt{cvCreateContourTree} creates binary tree representation for the input \texttt{contour} and returns the pointer to its root. If the parameter \texttt{threshold} is less than or equal to 0, the function creates full binary tree representation. If the threshold is greater than 0, the function creates representation with the precision \texttt{threshold}: if the vertices with the interceptive area of its base line are less than \texttt{threshold}, the tree should not be built any further. The function returns the created tree.
3194 \cvfunc{ContourFromContourTree}\label{ContourFromContourTree}
3196 Restores contour from tree
3199 CvSeq* cvContourFromContourTree( \par const CvContourTree* tree,\par CvMemStorage* storage,\par CvTermCriteria criteria );
3203 \cvarg{tree}{Contour tree}
3204 \cvarg{storage}{Container for the reconstructed contour}
3205 \cvarg{criteria}{Criteria, where to stop reconstruction}
3208 The function \texttt{cvContourFromContourTree} restores the contour from its binary tree representation. The parameter \texttt{criteria} determines the accuracy and/or the number of tree levels used for reconstruction, so it is possible to build approximated contour. The function returns reconstructed contour.
3210 \cvfunc{MatchContourTrees}\label{MatchContourTrees}
3212 Compares two contours using their tree representations
3215 double cvMatchContourTrees( \par const CvContourTree* tree1,\par const CvContourTree* tree2,\par int method,\par double threshold );
3219 \cvarg{tree1}{First contour tree}
3220 \cvarg{tree2}{Second contour tree}
3221 \cvarg{method}{Similarity measure, only \texttt{CV\_CONTOUR\_TREES\_MATCH\_I1} is supported}
3222 \cvarg{threshold}{Similarity threshold}
3225 The function \texttt{cvMatchContourTrees} calculates the value of the matching measure for two contour trees. The similarity measure is calculated level by level from the binary tree roots. If at the certain level difference between contours becomes less than \texttt{threshold}, the reconstruction process is interrupted and the current difference is returned.
3227 \subsection{Computational Geometry}
3229 \cvfunc{MaxRect}\label{MaxRect}
3231 Finds bounding rectangle for two given rectangles
3234 CvRect cvMaxRect( \par const CvRect* rect1,\par const CvRect* rect2 );
3238 \cvarg{rect1}{First rectangle}
3239 \cvarg{rect2}{Second rectangle}
3242 The function \texttt{cvMaxRect} finds minimum area rectangle that contains both input rectangles inside:
3244 \includegraphics[width=0.5\textwidth]{pics/maxrect.png}
3246 \cvfunc{CvBox2D}\label{CvBox2D}
3251 typedef struct CvBox2D
3253 CvPoint2D32f center; /* center of the box */
3254 CvSize2D32f size; /* box width and length */
3255 float angle; /* angle between the horizontal axis
3256 and the first side (i.e. length) in radians */
3261 \cvfunc{PointSeqFromMat}\label{PointSeqFromMat}
3263 Initializes point sequence header from a point vector
3266 CvSeq* cvPointSeqFromMat( \par int seq\_kind,\par const CvArr* mat,\par CvContour* contour\_header,\par CvSeqBlock* block );
3270 \cvarg{seq\_kind}{Type of the point sequence: point set (0), a curve (\texttt{CV\_SEQ\_KIND\_CURVE}), closed curve (\texttt{CV\_SEQ\_KIND\_CURVE+CV\_SEQ\_FLAG\_CLOSED}) etc.}
3271 \cvarg{mat}{Input matrix. It should be continuous 1-dimensional vector of points, that is, it should have type \texttt{CV\_32SC2} or \texttt{CV\_32FC2}}
3272 \cvarg{contour\_header}{Contour header, initialized by the function}
3273 \cvarg{block}{Sequence block header, initialized by the function}
3276 The function \texttt{cvPointSeqFromMat} initializes sequence
3277 header to create a "virtual" sequence which elements reside in
3278 the specified matrix. No data is copied. The initialized sequence
3279 header may be passed to any function that takes a point sequence
3280 on input. No extra elements could be added to the sequence,
3281 but some may be removed. The function is a specialized variant of
3282 \cross{MakeSeqHeaderForArray} and uses
3283 the latter internally. It returns pointer to the initialized contour
3284 header. Note that the bounding rectangle (field \texttt{rect} of
3285 \texttt{CvContour} strucuture is not initialized by the function. If
3286 you need one, use \cross{BoundingRect}.
3288 Here is the simple usage example.
3293 CvMat* vector = cvCreateMat( 1, 3, CV_32SC2 );
3295 CV_MAT_ELEM( *vector, CvPoint, 0, 0 ) = cvPoint(100,100);
3296 CV_MAT_ELEM( *vector, CvPoint, 0, 1 ) = cvPoint(100,200);
3297 CV_MAT_ELEM( *vector, CvPoint, 0, 2 ) = cvPoint(200,100);
3299 IplImage* img = cvCreateImage( cvSize(300,300), 8, 3 );
3302 cvDrawContours( img,
3303 cvPointSeqFromMat(CV_SEQ_KIND_CURVE+CV_SEQ_FLAG_CLOSED,
3309 0, 3, 8, cvPoint(0,0));
3312 \cvfunc{CvBox2D}\label{CvBox2D}
3317 typedef struct CvBox2D
3319 CvPoint2D32f center; /* center of the box */
3320 CvSize2D32f size; /* box width and length */
3321 float angle; /* angle between the horizontal axis
3322 and the first side (i.e. length) in radians */
3327 \cvfunc{BoxPoints}\label{BoxPoints}
3332 void cvBoxPoints( \par CvBox2D box,\par CvPoint2D32f pt[4] );
3337 \cvarg{pt}{Array of vertices}
3340 The function \texttt{cvBoxPoints} calculates vertices of the input 2d box. Here is the function code:
3343 void cvBoxPoints( CvBox2D box, CvPoint2D32f pt[4] )
3345 float a = (float)cos(box.angle)*0.5f;
3346 float b = (float)sin(box.angle)*0.5f;
3348 pt[0].x = box.center.x - a*box.size.height - b*box.size.width;
3349 pt[0].y = box.center.y + b*box.size.height - a*box.size.width;
3350 pt[1].x = box.center.x + a*box.size.height - b*box.size.width;
3351 pt[1].y = box.center.y - b*box.size.height - a*box.size.width;
3352 pt[2].x = 2*box.center.x - pt[0].x;
3353 pt[2].y = 2*box.center.y - pt[0].y;
3354 pt[3].x = 2*box.center.x - pt[1].x;
3355 pt[3].y = 2*box.center.y - pt[1].y;
3359 \cvfunc{FitEllipse}\label{FitEllipse}
3361 Fits ellipse to set of 2D points
3364 CvBox2D cvFitEllipse2( \par const CvArr* points );
3368 \cvarg{points}{Sequence or array of points}
3371 The function \texttt{cvFitEllipse} calculates ellipse that fits best
3372 (in least-squares sense) to a set of 2D points. The meaning of the
3373 returned structure fields is similar to those in \cross{Ellipse} except
3374 that \texttt{size} stores the full lengths of the ellipse axises,
3377 \cvfunc{FitLine}\label{FitLine}
3379 Fits line to 2D or 3D point set
3382 void cvFitLine( \par const CvArr* points,\par int dist\_type,\par double param,\par double reps,\par double aeps,\par float* line );
3386 \cvarg{points}{Sequence or array of 2D or 3D points with 32-bit integer or floating-point coordinates}
3387 \cvarg{dist\_type}{The distance used for fitting (see the discussion)}
3388 \cvarg{param}{Numerical parameter (\texttt{C}) for some types of distances, if 0 then some optimal value is chosen}
3389 \cvarg{reps, aeps}{Sufficient accuracy for radius (distance between the coordinate origin and the line) and angle, respectively, 0.01 would be a good defaults for both. is used}
3390 \cvarg{line}{The output line parameters. In case of 2d fitting it is array of 4 floats \texttt{(vx, vy, x0, y0)} where \texttt{(vx, vy)} is a normalized vector collinear to the line and \texttt{(x0, y0)} is some point on the line. In case of 3D fitting it is array of 6 floats \texttt{(vx, vy, vz, x0, y0, z0)} where \texttt{(vx, vy, vz)} is a normalized vector collinear to the line and \texttt{(x0, y0, z0)} is some point on the line}
3393 The function \texttt{cvFitLine} fits line to 2D or 3D point set by minimizing $\sum_i \rho(r_i)$ where $r_i$ is distance between $i$ th point and the line and $\rho(r)$ is a distance function, one of:
3397 \item[dist\_type=CV\_DIST\_L2]
3398 \[ \rho(r) = r^2/2 \quad \text{(the simplest and the fastest least-squares method)} \]
3400 \item[dist\_type=CV\_DIST\_L1]
3403 \item[dist\_type=CV\_DIST\_L12]
3404 \[ \rho(r) = 2 \cdot (\sqrt{1 + \frac{r^2}{2}} - 1) \]
3406 \item[dist\_type=CV\_DIST\_FAIR]
3407 \[ \rho\left(r\right) = C^2 \cdot \left( \frac{r}{C} - \log{\left(1 + \frac{r}{C}\right)}\right) \quad \text{where} \quad C=1.3998 \]
3409 \item[dist\_type=CV\_DIST\_WELSCH]
3410 \[ \rho\left(r\right) = \frac{C^2}{2} \cdot \left( 1 - \exp{\left(-\left(\frac{r}{C}\right)^2\right)}\right) \quad \text{where} \quad C=2.9846 \]
3412 \item[dist\_type=CV\_DIST\_HUBER]
3415 {C \cdot (r-C/2)}{otherwise} \quad \text{where} \quad C=1.345
3419 \cvfunc{ConvexHull2}\label{ConvexHull2}
3421 Finds convex hull of point set
3424 CvSeq* cvConvexHull2( \par const CvArr* input,\par void* hull\_storage=NULL,\par int orientation=CV\_CLOCKWISE,\par int return\_points=0 );
3428 \cvarg{points}{Sequence or array of 2D points with 32-bit integer or floating-point coordinates}
3429 \cvarg{hull\_storage}{The destination array (CvMat*) or memory storage (CvMemStorage*) that will store the convex hull. If it is array, it should be 1d and have the same number of elements as the input array/sequence. On output the header is modified so to truncate the array downto the hull size. If hull\_storage is NULL then the convex hull will be stored in the same storage as the input sequence}
3430 \cvarg{orientation}{Desired orientation of convex hull: \texttt{CV\_CLOCKWISE} or \texttt{CV\_COUNTER\_CLOCKWISE}}
3431 \cvarg{return\_points}{If non-zero, the points themselves will be stored in the hull instead of indices if \texttt{hull\_storage} is array, or pointers if \texttt{hull\_storage} is memory storage}
3434 The function \texttt{cvConvexHull2} finds convex hull of 2D point set using Sklansky?s algorithm. If \texttt{hull\_storage} is memory storage, the function creates a sequence containing the hull points or pointers to them, depending on \texttt{return\_points} value and returns the sequence on output. If \texttt{hull\_storage} is a CvMat, the function returns NULL.
3436 % ===== Example. Building convex hull for a sequence or array of points =====
3439 #include "highgui.h"
3442 #define ARRAY 0 /* switch between array/sequence method by replacing 0<=>1 */
3444 void main( int argc, char** argv )
3446 IplImage* img = cvCreateImage( cvSize( 500, 500 ), 8, 3 );
3447 cvNamedWindow( "hull", 1 );
3450 CvMemStorage* storage = cvCreateMemStorage();
3455 int i, count = rand()%100 + 1, hullcount;
3458 CvSeq* ptseq = cvCreateSeq( CV_SEQ_KIND_GENERIC|CV_32SC2,
3464 for( i = 0; i < count; i++ )
3466 pt0.x = rand() % (img->width/2) + img->width/4;
3467 pt0.y = rand() % (img->height/2) + img->height/4;
3468 cvSeqPush( ptseq, &pt0 );
3470 hull = cvConvexHull2( ptseq, 0, CV_CLOCKWISE, 0 );
3471 hullcount = hull->total;
3473 CvPoint* points = (CvPoint*)malloc( count * sizeof(points[0]));
3474 int* hull = (int*)malloc( count * sizeof(hull[0]));
3475 CvMat point_mat = cvMat( 1, count, CV_32SC2, points );
3476 CvMat hull_mat = cvMat( 1, count, CV_32SC1, hull );
3478 for( i = 0; i < count; i++ )
3480 pt0.x = rand() % (img->width/2) + img->width/4;
3481 pt0.y = rand() % (img->height/2) + img->height/4;
3484 cvConvexHull2( &point_mat, &hull_mat, CV_CLOCKWISE, 0 );
3485 hullcount = hull_mat.cols;
3488 for( i = 0; i < count; i++ )
3491 pt0 = *CV_GET_SEQ_ELEM( CvPoint, ptseq, i );
3495 cvCircle( img, pt0, 2, CV_RGB( 255, 0, 0 ), CV_FILLED );
3499 pt0 = **CV_GET_SEQ_ELEM( CvPoint*, hull, hullcount - 1 );
3501 pt0 = points[hull[hullcount-1]];
3504 for( i = 0; i < hullcount; i++ )
3507 CvPoint pt = **CV_GET_SEQ_ELEM( CvPoint*, hull, i );
3509 CvPoint pt = points[hull[i]];
3511 cvLine( img, pt0, pt, CV_RGB( 0, 255, 0 ));
3515 cvShowImage( "hull", img );
3517 int key = cvWaitKey(0);
3518 if( key == 27 ) // 'ESC'
3522 cvClearMemStorage( storage );
3531 \cvfunc{CheckContourConvexity}\label{CheckContourConvexity}
3533 Tests contour convex
3536 int cvCheckContourConvexity( const CvArr* contour );
3540 \cvarg{contour}{Tested contour (sequence or array of points)}
3543 The function \texttt{cvCheckContourConvexity} tests whether the input contour is convex or not. The contour must be simple, i.e. without self-intersections.
3545 \cvstruct{CvConvexityDefect}\label{CvConvexityDefect}
3547 Structure describing a single contour convexity detect
3550 typedef struct CvConvexityDefect
3552 CvPoint* start; /* point of the contour where the defect begins */
3553 CvPoint* end; /* point of the contour where the defect ends */
3554 CvPoint* depth_point; /* the farthest from the convex hull point within the defect */
3555 float depth; /* distance between the farthest point and the convex hull */
3556 } CvConvexityDefect;
3559 % ===== Picture. Convexity defects of hand contour. =====
3560 \includegraphics[width=0.5\textwidth]{pics/defects.png}
3562 \cvfunc{ConvexityDefects}\label{ConvexityDefects}
3564 Finds convexity defects of contour
3567 CvSeq* cvConvexityDefects( \par const CvArr* contour,\par const CvArr* convexhull,\par CvMemStorage* storage=NULL );
3571 \cvarg{contour}{Input contour}
3572 \cvarg{convexhull}{Convex hull obtained using \cross{ConvexHull2} that should contain pointers or indices to the contour points, not the hull points themselves, i.e. \texttt{return\_points} parameter in \cross{ConvexHull2} should be 0}
3573 \cvarg{storage}{Container for output sequence of convexity defects. If it is NULL, contour or hull (in that order) storage is used}
3576 The function \texttt{ConvexityDefects} finds all convexity defects of the input contour and returns a sequence of the CvConvexityDefect structures.
3578 \cvfunc{PointPolygonTest}\label{PointPolygonTest}
3580 Point in contour test
3583 double cvPointPolygonTest( \par const CvArr* contour,\par CvPoint2D32f pt,\par int measure\_dist );
3587 \cvarg{contour}{Input contour}
3588 \cvarg{pt}{The point tested against the contour}
3589 \cvarg{measure\_dist}{If it is non-zero, the function estimates distance from the point to the nearest contour edge}
3592 The function \texttt{cvPointPolygonTest} determines whether the
3593 point is inside contour, outside, or lies on an edge (or coinsides
3594 with a vertex). It returns positive, negative or zero value,
3595 correspondingly. When $\texttt{measure\_dist} =0$, the return value
3596 is +1, -1 and 0, respectively. When $\texttt{measure\_dist} \ne 0$,
3597 it is a signed distance between the point and the nearest contour
3600 Here is the sample output of the function, where each image pixel is tested against the contour.
3602 \includegraphics[width=0.5\textwidth]{pics/pointpolygon.png}
3604 \cvfunc{MinAreaRect2}\label{MinAreaRect2}
3606 Finds circumscribed rectangle of minimal area for given 2D point set
3609 CvBox2D cvMinAreaRect2( \par const CvArr* points,\par CvMemStorage* storage=NULL );
3613 \cvarg{points}{Sequence or array of points}
3614 \cvarg{storage}{Optional temporary memory storage}
3617 The function \texttt{cvMinAreaRect2} finds a circumscribed rectangle of the minimal area for 2D point set by building convex hull for the set and applying rotating calipers technique to the hull.
3619 % ===== Picture. Minimal-area bounding rectangle for contour =====
3620 \includegraphics[width=0.5\textwidth]{pics/minareabox.png}
3622 \cvfunc{MinEnclosingCircle}\label{MinEnclosingCircle}
3624 Finds circumscribed circle of minimal area for given 2D point set
3627 int cvMinEnclosingCircle( \par const CvArr* points,\par CvPoint2D32f* center,\par float* radius );
3631 \cvarg{points}{Sequence or array of 2D points}
3632 \cvarg{center}{Output parameter. The center of the enclosing circle}
3633 \cvarg{radius}{Output parameter. The radius of the enclosing circle}
3636 The function \texttt{cvMinEnclosingCircle} finds the minimal circumscribed
3637 circle for 2D point set using iterative algorithm. It returns nonzero
3638 if the resultant circle contains all the input points and zero otherwise
3639 (i.e. algorithm failed).
3641 \cvfunc{CalcPGH}\label{CalcPGH}
3643 Calculates pair-wise geometrical histogram for contour
3646 void cvCalcPGH( const CvSeq* contour, CvHistogram* hist );
3650 \cvarg{contour}{Input contour. Currently, only integer point coordinates are allowed}
3651 \cvarg{hist}{Calculated histogram; must be two-dimensional}
3654 The function \texttt{cvCalcPGH} calculates
3655 2D pair-wise geometrical histogram (PGH), described in
3657 for the contour. The algorithm considers every pair of the contour
3658 edges. The angle between the edges and the minimum/maximum distances
3659 are determined for every pair. To do this each of the edges in turn
3660 is taken as the base, while the function loops through all the other
3661 edges. When the base edge and any other edge are considered, the minimum
3662 and maximum distances from the points on the non-base edge and line of
3663 the base edge are selected. The angle between the edges defines the row
3664 of the histogram in which all the bins that correspond to the distance
3665 between the calculated minimum and maximum distances are incremented
3666 (that is, the histogram is transposed relatively to [Iivarninen97]
3667 definition). The histogram can be used for contour matching.
3669 \subsection{Planar Subdivisions}
3671 \cvstruct{CvSubdiv2D}\label{CvSubdiv2D}
3676 #define CV_SUBDIV2D_FIELDS() \
3679 int is_geometry_valid; \
3680 CvSubdiv2DEdge recent_edge; \
3681 CvPoint2D32f topleft; \
3682 CvPoint2D32f bottomright;
3684 typedef struct CvSubdiv2D
3686 CV_SUBDIV2D_FIELDS()
3691 Planar subdivision is a subdivision of a plane into a set of
3692 non-overlapped regions (facets) that cover the whole plane. The above
3693 structure describes a subdivision built on 2d point set, where the points
3694 are linked together and form a planar graph, which, together with a few
3695 edges connecting exterior subdivision points (namely, convex hull points)
3696 with infinity, subdivides a plane into facets by its edges.
3698 For every subdivision there exists dual subdivision there facets and
3699 points (subdivision vertices) swap their roles, that is, a facet is
3700 treated as a vertex (called virtual point below) of dual subdivision and
3701 the original subdivision vertices become facets. On the picture below
3702 original subdivision is marked with solid lines and dual subdivision
3705 \includegraphics[width=0.5\textwidth]{pics/subdiv.png}
3707 OpenCV subdivides plane into triangles using Delaunay's
3708 algorithm. Subdivision is built iteratively starting from a dummy
3709 triangle that includes all the subdivision points for sure. In this
3710 case the dual subdivision is Voronoi diagram of input 2d point set. The
3711 subdivisions can be used for 3d piece-wise transformation of a plane,
3712 morphing, fast location of points on the plane, building special graphs
3713 (such as NNG,RNG) etc.
3715 \cvstruct{CvQuadEdge2D}\label{CvQuadEdge2D}
3717 Quad-edge of planar subdivision
3720 /* one of edges within quad-edge, lower 2 bits is index (0..3)
3721 and upper bits are quad-edge pointer */
3722 typedef long CvSubdiv2DEdge;
3724 /* quad-edge structure fields */
3725 #define CV_QUADEDGE2D_FIELDS() \
3727 struct CvSubdiv2DPoint* pt[4]; \
3728 CvSubdiv2DEdge next[4];
3730 typedef struct CvQuadEdge2D
3732 CV_QUADEDGE2D_FIELDS()
3738 Quad-edge is a basic element of subdivision, it contains four edges (e, eRot, reversed e and reversed eRot):
3740 \includegraphics[width=0.5\textwidth]{pics/quadedge.png}
3742 \cvstruct{CvSubdiv2DPoint}\label{CvSubdiv2DPoint}
3744 Point of original or dual subdivision
3747 #define CV_SUBDIV2D_POINT_FIELDS()\
3749 CvSubdiv2DEdge first; \
3752 #define CV_SUBDIV2D_VIRTUAL_POINT_FLAG (1 << 30)
3754 typedef struct CvSubdiv2DPoint
3756 CV_SUBDIV2D_POINT_FIELDS()
3761 \cvfunc{Subdiv2DGetEdge}\label{Subdiv2DGetEdge}
3763 Returns one of edges related to given
3766 CvSubdiv2DEdge cvSubdiv2DGetEdge( CvSubdiv2DEdge edge, CvNextEdgeType type );
3771 #define cvSubdiv2DNextEdge( edge ) cvSubdiv2DGetEdge( edge, CV_NEXT_AROUND_ORG )
3775 \cvarg{edge}{Subdivision edge (not a quad-edge)}
3776 \cvarg{type}{Specifies, which of related edges to return, one of:}
3778 \cvarg{CV\_NEXT\_AROUND\_ORG}{next around the edge origin (\texttt{eOnext} on the picture above if \texttt{e} is the input edge)}
3779 \cvarg{CV\_NEXT\_AROUND\_DST}{next around the edge vertex (\texttt{eDnext})}
3780 \cvarg{CV\_PREV\_AROUND\_ORG}{previous around the edge origin (reversed \texttt{eRnext})}
3781 \cvarg{CV\_PREV\_AROUND\_DST}{previous around the edge destination (reversed \texttt{eLnext})}
3782 \cvarg{CV\_NEXT\_AROUND\_LEFT}{next around the left facet (\texttt{eLnext})}
3783 \cvarg{CV\_NEXT\_AROUND\_RIGHT}{next around the right facet (\texttt{eRnext})}
3784 \cvarg{CV\_PREV\_AROUND\_LEFT}{previous around the left facet (reversed \texttt{eOnext})}
3785 \cvarg{CV\_PREV\_AROUND\_RIGHT}{previous around the right facet (reversed \texttt{eDnext})}
3789 The function \texttt{cvSubdiv2DGetEdge} returns one the edges related to the input edge.
3791 \cvfunc{Subdiv2DRotateEdge}\label{Subdiv2DRotateEdge}
3793 Returns another edge of the same quad-edge
3796 CvSubdiv2DEdge cvSubdiv2DRotateEdge( \par CvSubdiv2DEdge edge,\par int rotate );
3800 \cvarg{edge}{Subdivision edge (not a quad-edge)}
3801 \cvarg{type}{Specifies, which of edges of the same quad-edge as the input one to return, one of:
3803 \cvarg{0}{the input edge (\texttt{e} on the picture above if \texttt{e} is the input edge)}
3804 \cvarg{1}{the rotated edge (\texttt{eRot})}
3805 \cvarg{2}{the reversed edge (reversed \texttt{e} (in green))}
3806 \cvarg{3}{the reversed rotated edge (reversed \texttt{eRot} (in green))}
3810 The function \texttt{cvSubdiv2DRotateEdge} returns one the edges of the same quad-edge as the input edge.
3812 \cvfunc{Subdiv2DEdgeOrg}\label{Subdiv2DEdgeOrg}
3817 CvSubdiv2DPoint* cvSubdiv2DEdgeOrg( \par CvSubdiv2DEdge edge );
3821 \cvarg{edge}{Subdivision edge (not a quad-edge)}
3824 The function \texttt{cvSubdiv2DEdgeOrg} returns the edge
3825 origin. The returned pointer may be NULL if the edge is from dual
3826 subdivision and the virtual point coordinates are not calculated
3827 yet. The virtual points can be calculated using function
3828 \cross{CalcSubdivVoronoi2D}.
3830 \cvfunc{Subdiv2DEdgeDst}\label{Subdiv2DEdgeDst}
3832 Returns edge destination
3835 CvSubdiv2DPoint* cvSubdiv2DEdgeDst( \par CvSubdiv2DEdge edge );
3839 \cvarg{edge}{Subdivision edge (not a quad-edge)}
3842 The function \texttt{cvSubdiv2DEdgeDst} returns the edge destination. The
3843 returned pointer may be NULL if the edge is from dual subdivision and
3844 the virtual point coordinates are not calculated yet. The virtual points
3845 can be calculated using function \cross{CalcSubdivVoronoi2D}.
3847 \cvfunc{CreateSubdivDelaunay2D}\label{CreateSubdivDelaunay2D}
3849 Creates empty Delaunay triangulation
3852 CvSubdiv2D* cvCreateSubdivDelaunay2D( \par CvRect rect,\par CvMemStorage* storage );
3856 \cvarg{rect}{Rectangle that includes all the 2d points that are to be added to subdivision}
3857 \cvarg{storage}{Container for subdivision}
3860 The function \texttt{CreateSubdivDelaunay2D} creates an empty Delaunay
3861 subdivision, where 2d points can be added further using function
3862 \cross{SubdivDelaunay2DInsert}. All the points to be added must be within
3863 the specified rectangle, otherwise a runtime error will be raised.
3865 \cvfunc{SubdivDelaunay2DInsert}\label{SubdivDelaunay2DInsert}
3867 Inserts a single point to Delaunay triangulation
3870 CvSubdiv2DPoint* cvSubdivDelaunay2DInsert( \par CvSubdiv2D* subdiv,\par CvPoint2D32f pt);
3874 \cvarg{subdiv}{Delaunay subdivision created by function \cross{CreateSubdivDelaunay2D}}
3875 \cvarg{pt}{Inserted point}
3878 The function \texttt{cvSubdivDelaunay2DInsert} inserts a single point to subdivision and modifies the subdivision topology appropriately. If a points with same coordinates exists already, no new points is added. The function returns pointer to the allocated point. No virtual points coordinates is calculated at this stage.
3880 \cvfunc{Subdiv2DLocate}\label{Subdiv2DLocate}
3882 Inserts a single point to Delaunay triangulation
3885 CvSubdiv2DPointLocation cvSubdiv2DLocate( \par CvSubdiv2D* subdiv,\par CvPoint2D32f pt,\par CvSubdiv2DEdge* edge,\par CvSubdiv2DPoint** vertex=NULL );
3889 \cvarg{subdiv}{Delaunay or another subdivision}
3890 \cvarg{pt}{The point to locate}
3891 \cvarg{edge}{The output edge the point falls onto or right to}
3892 \cvarg{vertex}{Optional output vertex double pointer the input point coinsides with}
3895 The function \texttt{cvSubdiv2DLocate} locates input point within subdivision. There are 5 cases:
3898 \item point falls into some facet. The function returns \\texttt{CV\_PTLOC\_INSIDE} and \texttt{*edge} will contain one of edges of the facet.
3899 \item point falls onto the edge. The function returns \\texttt{CV\_PTLOC\_ON\_EDGE} and \texttt{*edge} will contain this edge.
3900 \item point coinsides with one of subdivision vertices. The function returns \\texttt{CV\_PTLOC\_VERTEX} and \texttt{*vertex} will contain pointer to the vertex.
3901 \item point is outside the subdivsion reference rectangle. The function returns \\texttt{CV\_PTLOC\_OUTSIDE\_RECT} and no pointers is filled.
3902 \item one of input arguments is invalid. Runtime error is raised or, if silent or "parent" error processing mode is selected, \\texttt{CV\_PTLOC\_ERROR} is returnd.
3905 \cvfunc{FindNearestPoint2D}\label{FindNearestPoint2D}
3907 Finds the closest subdivision vertex to given point
3910 CvSubdiv2DPoint* cvFindNearestPoint2D( \par CvSubdiv2D* subdiv,\par CvPoint2D32f pt );
3914 \cvarg{subdiv}{Delaunay or another subdivision}
3915 \cvarg{pt}{Input point}
3918 The function \texttt{cvFindNearestPoint2D} is another function that
3919 locates input point within subdivision. It finds subdivision vertex that
3920 is the closest to the input point. It is not necessarily one of vertices
3921 of the facet containing the input point, though the facet (located using
3922 \cross{Subdiv2DLocate}) is used as a starting
3923 point. The function returns pointer to the found subdivision vertex.
3925 \cvfunc{CalcSubdivVoronoi2D}\label{CalcSubdivVoronoi2D}
3927 Calculates coordinates of Voronoi diagram cells
3930 void cvCalcSubdivVoronoi2D( \par CvSubdiv2D* subdiv );
3934 \cvarg{subdiv}{Delaunay subdivision, where all the points are added already}
3937 The function \texttt{cvCalcSubdivVoronoi2D} calculates coordinates
3938 of virtual points. All virtual points corresponding to some vertex of
3939 original subdivision form (when connected together) a boundary of Voronoi
3942 \cvfunc{ClearSubdivVoronoi2D}\label{ClearSubdivVoronoi2D}
3944 Removes all virtual points
3947 void cvClearSubdivVoronoi2D( CvSubdiv2D* subdiv );
3951 \cvarg{subdiv}{Delaunay subdivision}
3954 The function \texttt{ClearSubdivVoronoi2D} removes all virtual points. It
3955 is called internally in \cross{CalcSubdivVoronoi2D} if the subdivision
3956 was modified after previous call to the function.
3958 There are a few other lower-level functions that work with planar
3959 subdivisions, see cv.h and the sources. Demo script delaunay.c that
3960 builds Delaunay triangulation and Voronoi diagram of random 2d point
3961 set can be found at opencv/samples/c.
3963 \section{Motion Analysis and Object Tracking Reference}
3965 \subsection{Accumulation of Background Statistics}
3967 \cvfunc{Acc}\label{Acc}
3969 Adds frame to accumulator
3972 void cvAcc( \par const CvArr* image,\par CvArr* sum,\par const CvArr* mask=NULL );
3976 \cvarg{image}{Input image, 1- or 3-channel, 8-bit or 32-bit floating point. (each channel of multi-channel image is processed independently)}
3977 \cvarg{sum}{Accumulator of the same number of channels as input image, 32-bit or 64-bit floating-point}
3978 \cvarg{mask}{Optional operation mask}
3981 The function \texttt{cvAcc} adds the whole image \texttt{image} or its selected region to accumulator \texttt{sum}:
3983 \[ \texttt{sum}(x,y) \leftarrow \texttt{sum}(x,y) + \texttt{image}(x,y) \quad \text{if} \quad \texttt{mask}(x,y) \ne 0 \]
3985 \cvfunc{SquareAcc}\label{SquareAcc}
3987 Adds the square of source image to accumulator
3990 void cvSquareAcc( \par const CvArr* image,\par CvArr* sqsum,\par const CvArr* mask=NULL );
3994 \cvarg{image}{Input image, 1- or 3-channel, 8-bit or 32-bit floating point (each channel of multi-channel image is processed independently)}
3995 \cvarg{sqsum}{Accumulator of the same number of channels as input image, 32-bit or 64-bit floating-point}
3996 \cvarg{mask}{Optional operation mask}
3999 The function \texttt{cvSquareAcc} adds the input image \texttt{image} or its selected region, raised to power 2, to the accumulator \texttt{sqsum}:
4001 \[ \texttt{sqsum}(x,y) \leftarrow \texttt{sqsum}(x,y) + \texttt{image}(x,y)^2 \quad \text{if} \quad \texttt{mask}(x,y) \ne 0 \]
4003 \cvfunc{MultiplyAcc}\label{MultiplyAcc}
4005 Adds product of two input images to accumulator
4008 void cvMultiplyAcc( \par const CvArr* image1,\par const CvArr* image2,\par CvArr* acc,\par const CvArr* mask=NULL );
4012 \cvarg{image1}{First input image, 1- or 3-channel, 8-bit or 32-bit floating point (each channel of multi-channel image is processed independently)}
4013 \cvarg{image2}{Second input image, the same format as the first one}
4014 \cvarg{acc}{Accumulator of the same number of channels as input images, 32-bit or 64-bit floating-point}
4015 \cvarg{mask}{Optional operation mask}
4018 The function \texttt{cvMultiplyAcc} adds product of 2 images or thier selected regions to accumulator \texttt{acc}:
4020 \[ \texttt{acc}(x,y) \leftarrow \texttt{acc}(x,y) + \texttt{image1}(x,y) \cdot \texttt{image2}(x,y) \quad \text{if} \quad \texttt{mask}(x,y) \ne 0 \]
4022 \cvfunc{RunningAvg}\label{RunningAvg}
4024 Updates running average
4027 void cvRunningAvg( \par const CvArr* image,\par CvArr* acc,\par double alpha,\par const CvArr* mask=NULL );
4031 \cvarg{image}{Input image, 1- or 3-channel, 8-bit or 32-bit floating point (each channel of multi-channel image is processed independently)}
4032 \cvarg{acc}{Accumulator of the same number of channels as input image, 32-bit or 64-bit floating-point}
4033 \cvarg{alpha}{Weight of input image}
4034 \cvarg{mask}{Optional operation mask}
4037 The function \texttt{cvRunningAvg} calculates weighted sum of input image
4038 \texttt{image} and the accumulator \texttt{acc} so that \texttt{acc}
4039 becomes a running average of frame sequence:
4041 \[ \texttt{acc}(x,y) \leftarrow (1-\alpha) \cdot \texttt{acc}(x,y) + \alpha \cdot \texttt{image}(x,y) \quad \text{if} \quad \texttt{mask}(x,y) \ne 0 \]
4043 where $\alpha$ (\texttt{alpha}) regulates update speed (how fast accumulator forgets about previous frames).
4045 \subsection{Motion Templates}
4047 \cvfunc{UpdateMotionHistory}\label{UpdateMotionHistory}
4049 Updates motion history image by moving silhouette
4052 void cvUpdateMotionHistory( \par const CvArr* silhouette,\par CvArr* mhi,\par double timestamp,\par double duration );
4056 \cvarg{silhouette}{Silhouette mask that has non-zero pixels where the motion occurs}
4057 \cvarg{mhi}{Motion history image, that is updated by the function (single-channel, 32-bit floating-point)}
4058 \cvarg{timestamp}{Current time in milliseconds or other units}
4059 \cvarg{duration}{Maximal duration of motion track in the same units as \texttt{timestamp}}
4062 The function \texttt{cvUpdateMotionHistory} updates the motion history image as following:
4065 \texttt{mhi}(x,y)=\forkthree
4066 {\texttt{timestamp}}{if $\texttt{silhouette}(x,y) \ne 0$}
4067 {0}{if $\texttt{silhouette}(x,y) = 0$ and $\texttt{mhi} < (\texttt{timestamp} - \texttt{duration})$}
4068 {\texttt{mhi}(x,y)}{otherwise}
4070 That is, MHI pixels where motion occurs are set to the current timestamp, while the pixels where motion happened far ago are cleared.
4072 \cvfunc{CalcMotionGradient}\label{CalcMotionGradient}
4074 Calculates gradient orientation of motion history image
4077 void cvCalcMotionGradient( \par const CvArr* mhi,\par CvArr* mask,\par CvArr* orientation,\par double delta1,\par double delta2,\par int aperture\_size=3 );
4081 \cvarg{mhi}{Motion history image}
4082 \cvarg{mask}{Mask image; marks pixels where motion gradient data is correct. Output parameter}
4083 \cvarg{orientation}{Motion gradient orientation image; contains angles from 0 to ~360 degrees }
4084 \cvarg{delta1, delta2}{The function finds minimum ($m(x,y)$) and maximum ($M(x,y)$) mhi values over each pixel $(x,y)$ neighborhood and assumes the gradient is valid only if
4086 \min(\texttt{delta1} , \texttt{delta2} ) \le M(x,y)-m(x,y) \le \max(\texttt{delta1} ,\texttt{delta2} ).
4088 \cvarg{aperture\_size}{Aperture size of derivative operators used by the function: CV\_SCHARR, 1, 3, 5 or 7 (see \cross{Sobel})}
4091 The function \texttt{cvCalcMotionGradient} calculates the derivatives $Dx$ and $Dy$ of \texttt{mhi} and then calculates gradient orientation as:
4094 \texttt{orientation}(x,y)=\arctan{\frac{Dy(x,y)}{Dx(x,y)}}
4097 where both $Dx(x,y)$ and $Dy(x,y)$ signs are taken into account (as in \cross{CartToPolar} function). After that \texttt{mask} is filled to indicate where the orientation is valid (see \texttt{delta1} and \texttt{delta2} description).
4099 \cvfunc{CalcGlobalOrientation}\label{CalcGlobalOrientation}
4101 Calculates global motion orientation of some selected region
4104 double cvCalcGlobalOrientation( \par const CvArr* orientation,\par const CvArr* mask,\par const CvArr* mhi,\par double timestamp,\par double duration );
4108 \cvarg{orientation}{Motion gradient orientation image; calculated by the function \cross{CalcMotionGradient}}
4109 \cvarg{mask}{Mask image. It may be a conjunction of valid gradient mask, obtained with \cross{CalcMotionGradient} and mask of the region, whose direction needs to be calculated}
4110 \cvarg{mhi}{Motion history image}
4111 \cvarg{timestamp}{Current time in milliseconds or other units, it is better to store time passed to \cross{UpdateMotionHistory} before and reuse it here, because running \cross{UpdateMotionHistory} and \cross{CalcMotionGradient} on large images may take some time}
4112 \cvarg{duration}{Maximal duration of motion track in milliseconds, the same as in \cross{UpdateMotionHistory}}
4115 The function \texttt{cvCalcGlobalOrientation} calculates the general
4116 motion direction in the selected region and returns the angle between
4117 0 degrees and 360 degrees . At first the function builds the orientation histogram
4118 and finds the basic orientation as a coordinate of the histogram
4119 maximum. After that the function calculates the shift relative to the
4120 basic orientation as a weighted sum of all orientation vectors: the more
4121 recent is the motion, the greater is the weight. The resultant angle is
4122 a circular sum of the basic orientation and the shift.
4124 \cvfunc{SegmentMotion}\label{SegmentMotion}
4126 Segments whole motion into separate moving parts
4129 CvSeq* cvSegmentMotion( \par const CvArr* mhi,\par CvArr* seg\_mask,\par CvMemStorage* storage,\par double timestamp,\par double seg\_thresh );
4133 \cvarg{mhi}{Motion history image}
4134 \cvarg{seg\_mask}{Image where the mask found should be stored, single-channel, 32-bit floating-point}
4135 \cvarg{storage}{Memory storage that will contain a sequence of motion connected components}
4136 \cvarg{timestamp}{Current time in milliseconds or other units}
4137 \cvarg{seg\_thresh}{Segmentation threshold; recommended to be equal to the interval between motion history "steps" or greater}
4140 The function \texttt{cvSegmentMotion} finds all the motion segments and
4141 marks them in \texttt{seg\_mask} with individual values each (1,2,...). It
4142 also returns a sequence of \cross{CvConnectedComp}
4143 structures, one per each motion components. After than the
4144 motion direction for every component can be calculated with
4145 \cross{CalcGlobalOrientation} using extracted mask of the particular
4146 component \cross{Cmp}.
4148 \subsection{Object Tracking}
4150 \cvfunc{MeanShift}\label{MeanShift}
4152 Finds object center on back projection
4155 int cvMeanShift( \par const CvArr* prob\_image,\par CvRect window,\par CvTermCriteria criteria,\par CvConnectedComp* comp );
4159 \cvarg{prob\_image}{Back projection of object histogram (see \cross{CalcBackProject})}
4160 \cvarg{window}{Initial search window}
4161 \cvarg{criteria}{Criteria applied to determine when the window search should be finished}
4162 \cvarg{comp}{Resultant structure that contains converged search window coordinates (\texttt{comp->rect} field) and sum of all pixels inside the window (\texttt{comp->area} field)}
4165 The function \texttt{cvMeanShift} iterates to find the object center
4166 given its back projection and initial position of search window. The
4167 iterations are made until the search window center moves by less than
4168 the given value and/or until the function has done the maximum number
4169 of iterations. The function returns the number of iterations made.
4171 \cvfunc{CamShift}\label{CamShift}
4173 Finds object center, size, and orientation
4176 int cvCamShift( \par const CvArr* prob\_image,\par CvRect window,\par CvTermCriteria criteria,\par CvConnectedComp* comp,\par CvBox2D* box=NULL );
4180 \cvarg{prob\_image}{Back projection of object histogram (see \cross{CalcBackProject})}
4181 \cvarg{window}{Initial search window}
4182 \cvarg{criteria}{Criteria applied to determine when the window search should be finished}
4183 \cvarg{comp}{Resultant structure that contains converged search window coordinates (\texttt{comp->rect} field) and sum of all pixels inside the window (\texttt{comp->area} field)}
4184 \cvarg{box}{Circumscribed box for the object. If not \texttt{NULL}, contains object size and orientation}
4187 The function \texttt{cvCamShift} implements CAMSHIFT object tracking algrorithm
4189 First, it finds an object center using \cross{MeanShift} and, after that, calculates the object size and orientation. The function returns number of iterations made within \cross{MeanShift}.
4191 \cross{CvCamShiftTracker} class declared in cv.hpp implements color object tracker that uses the function.
4193 \cvfunc{SnakeImage}\label{SnakeImage}
4195 Changes contour position to minimize its energy
4198 void cvSnakeImage( \par const IplImage* image,\par CvPoint* points,\par int length,\par float* alpha,\par float* beta,\par float* gamma,\par int coeff\_usage,\par CvSize win,\par CvTermCriteria criteria,\par int calc\_gradient=1 );
4202 \cvarg{image}{The source image or external energy field}
4203 \cvarg{points}{Contour points (snake)}
4204 \cvarg{length}{Number of points in the contour}
4205 \cvarg{alpha}{Weight[s] of continuity energy, single float or array of \texttt{length} floats, one per each contour point}
4206 \cvarg{beta}{Weight[s] of curvature energy, similar to \texttt{alpha}}
4207 \cvarg{gamma}{Weight[s] of image energy, similar to \texttt{alpha}}
4208 \cvarg{coeff\_usage}{Variant of usage of the previous three parameters:
4210 \cvarg{CV\_VALUE}{indicates that each of \texttt{alpha, beta, gamma} is a pointer to a single value to be used for all points;}
4211 \cvarg{CV\_ARRAY}{indicates that each of \texttt{alpha, beta, gamma} is a pointer to an array of coefficients different for all the points of the snake. All the arrays must have the size equal to the contour size.}
4213 \cvarg{win}{Size of neighborhood of every point used to search the minimum, both \texttt{win.width} and \texttt{win.height} must be odd}
4214 \cvarg{criteria}{Termination criteria}
4215 \cvarg{calc\_gradient}{Gradient flag. If not 0, the function calculates gradient magnitude for every image pixel and consideres it as the energy field, otherwise the input image itself is considered}
4218 The function \texttt{cvSnakeImage} updates snake in order to minimize its
4219 total energy that is a sum of internal energy that depends on contour
4220 shape (the smoother contour is, the smaller internal energy is) and
4221 external energy that depends on the energy field and reaches minimum at
4222 the local energy extremums that correspond to the image edges in case
4225 The parameter \texttt{criteria.epsilon} is used to define the minimal
4226 number of points that must be moved during any iteration to keep the
4227 iteration process running.
4229 If at some iteration the number of moved points is less
4230 than \texttt{criteria.epsilon} or the function performed
4231 \texttt{criteria.max\_iter} iterations, the function terminates.
4233 \subsection{Optical Flow}
4235 \cvfunc{CalcOpticalFlowHS}\label{CalcOpticalFlowHS}
4237 Calculates optical flow for two images
4240 void cvCalcOpticalFlowHS( \par const CvArr* prev,\par const CvArr* curr,\par int use\_previous,\par CvArr* velx,\par CvArr* vely,\par double lambda,\par CvTermCriteria criteria );
4244 \cvarg{prev}{First image, 8-bit, single-channel}
4245 \cvarg{curr}{Second image, 8-bit, single-channel}
4246 \cvarg{use\_previous}{Uses previous (input) velocity field}
4247 \cvarg{velx}{Horizontal component of the optical flow of the same size as input images, 32-bit floating-point, single-channel}
4248 \cvarg{vely}{Vertical component of the optical flow of the same size as input images, 32-bit floating-point, single-channel}
4249 \cvarg{lambda}{Lagrangian multiplier}
4250 \cvarg{criteria}{Criteria of termination of velocity computing}
4253 The function \texttt{cvCalcOpticalFlowHS} computes flow for every pixel of the first input image using Horn and Schunck algorithm
4256 \cvfunc{CalcOpticalFlowLK}\label{CalcOpticalFlowLK}
4258 Calculates optical flow for two images
4261 void cvCalcOpticalFlowLK( \par const CvArr* prev,\par const CvArr* curr,\par CvSize win\_size,\par CvArr* velx,\par CvArr* vely );
4265 \cvarg{prev}{First image, 8-bit, single-channel}
4266 \cvarg{curr}{Second image, 8-bit, single-channel}
4267 \cvarg{win\_size}{Size of the averaging window used for grouping pixels}
4268 \cvarg{velx}{Horizontal component of the optical flow of the same size as input images, 32-bit floating-point, single-channel}
4269 \cvarg{vely}{Vertical component of the optical flow of the same size as input images, 32-bit floating-point, single-channel}
4272 The function \texttt{cvCalcOpticalFlowLK} computes flow for every pixel of the first input image using Lucas and Kanade algorithm
4275 \cvfunc{CalcOpticalFlowBM}\label{CalcOpticalFlowBM}
4277 Calculates optical flow for two images by block matching method
4280 void cvCalcOpticalFlowBM( \par const CvArr* prev,\par const CvArr* curr,\par CvSize block\_size,\par CvSize shift\_size,\par CvSize max\_range,\par int use\_previous,\par CvArr* velx,\par CvArr* vely );
4284 \cvarg{prev}{First image, 8-bit, single-channel}
4285 \cvarg{curr}{Second image, 8-bit, single-channel}
4286 \cvarg{block\_size}{Size of basic blocks that are compared}
4287 \cvarg{shift\_size}{Block coordinate increments}
4288 \cvarg{max\_range}{Size of the scanned neighborhood in pixels around block}
4289 \cvarg{use\_previous}{Uses previous (input) velocity field}
4290 \cvarg{velx}{Horizontal component of the optical flow of
4292 \left\lfloor \frac{\texttt{prev->width} - \texttt{block\_size.width}}{\texttt{shiftSize.width}} \right\rfloor
4294 \left\lfloor \frac{\texttt{prev->height} - \texttt{block\_size.height}}{\texttt{shiftSize.height}} \right\rfloor
4296 size, 32-bit floating-point, single-channel}
4297 \cvarg{vely}{Vertical component of the optical flow of the same size \texttt{velx}, 32-bit floating-point, single-channel}
4300 The function \texttt{CalcOpticalFlowBM} calculates optical
4301 flow for overlapped blocks $\texttt{block\_size.width} \times
4302 \texttt{block\_size.height}$ pixels each, thus the velocity
4303 fields are smaller than the original images. For every block
4304 in \texttt{prev} the functions tries to find a similar block in
4305 \texttt{curr} in some neighborhood of the original block or shifted by
4306 (velx(x0,y0),vely(x0,y0)) block as has been calculated by previous
4307 function call (if \texttt{use\_previous=1})
4309 \cvfunc{CalcOpticalFlowPyrLK}\label{CalcOpticalFlowPyrLK}
4311 Calculates optical flow for a sparse feature set using iterative Lucas-Kanade method in pyramids
4314 void cvCalcOpticalFlowPyrLK( \par const CvArr* prev,\par const CvArr* curr,\par CvArr* prev\_pyr,\par CvArr* curr\_pyr,\par const CvPoint2D32f* prev\_features,\par CvPoint2D32f* curr\_features,\par int count,\par CvSize win\_size,\par int level,\par char* status,\par float* track\_error,\par CvTermCriteria criteria,\par int flags );
4318 \cvarg{prev}{First frame, at time \texttt{t}}
4319 \cvarg{curr}{Second frame, at time \texttt{t + dt} }
4320 \cvarg{prev\_pyr}{Buffer for the pyramid for the first frame. If the pointer is not \texttt{NULL} , the buffer must have a sufficient size to store the pyramid from level \texttt{1} to level \texttt{level} ; the total size of \texttt{(image\_width+8)*image\_height/3} bytes is sufficient}
4321 \cvarg{curr\_pyr}{Similar to \texttt{prev\_pyr}, used for the second frame}
4322 \cvarg{prev\_features}{Array of points for which the flow needs to be found}
4323 \cvarg{curr\_features}{Array of 2D points containing calculated new positions of inputfeatures}\cvarg{}{in the second image}
4324 \cvarg{count}{Number of feature points}
4325 \cvarg{win\_size}{Size of the search window of each pyramid level}
4326 \cvarg{level}{Maximal pyramid level number. If \texttt{} , pyramids are not used (single level), if \texttt{1} , two levels are used, etc}
4327 \cvarg{status}{Array. Every element of the array is set to \texttt{1} if the flow for the corresponding feature has been found, \texttt{} otherwise}
4328 \cvarg{error}{Array of double numbers containing difference between patches around the original and moved points. Optional parameter; can be \texttt{NULL }}
4329 \cvarg{criteria}{Specifies when the iteration process of finding the flow for each point on each pyramid level should be stopped}
4330 \cvarg{flags}{Miscellaneous flags:
4332 \cvarg{CV\_LKFLOW\_PYR\_A\_READY}{pyramid for the first frame is precalculated before the call}
4333 \cvarg{CV\_LKFLOW\_PYR\_B\_READY}{ pyramid for the second frame is precalculated before the call}
4334 \cvarg{CV\_LKFLOW\_INITIAL\_GUESSES}{array B contains initial coordinates of features before the function call}
4338 The function \texttt{cvCalcOpticalFlowPyrLK} implements sparse iterative version of Lucas-Kanade optical flow in pyramids
4340 . It calculates coordinates of the feature points on the current video
4341 frame given their coordinates on the previous frame. The function finds
4342 the coordinates with sub-pixel accuracy.
4344 Both parameters \texttt{prev\_pyr} and \texttt{curr\_pyr} comply with the
4345 following rules: if the image pointer is 0, the function allocates the
4346 buffer internally, calculates the pyramid, and releases the buffer after
4347 processing. Otherwise, the function calculates the pyramid and stores
4348 it in the buffer unless the flag \texttt{CV\_LKFLOW\_PYR\_A[B]\_READY}
4349 is set. The image should be large enough to fit the Gaussian pyramid
4350 data. After the function call both pyramids are calculated and the
4351 readiness flag for the corresponding image can be set in the next call
4352 (i.e., typically, for all the image pairs except the very first one
4353 \texttt{CV\_LKFLOW\_PYR\_A\_READY} is set).
4355 \subsection{Estimators}
4357 \cvfunc{CvKalman}\label{CvKalman}
4362 typedef struct CvKalman
4364 int MP; /* number of measurement vector dimensions */
4365 int DP; /* number of state vector dimensions */
4366 int CP; /* number of control vector dimensions */
4368 /* backward compatibility fields */
4370 float* PosterState; /* =state_pre->data.fl */
4371 float* PriorState; /* =state_post->data.fl */
4372 float* DynamMatr; /* =transition_matrix->data.fl */
4373 float* MeasurementMatr; /* =measurement_matrix->data.fl */
4374 float* MNCovariance; /* =measurement_noise_cov->data.fl */
4375 float* PNCovariance; /* =process_noise_cov->data.fl */
4376 float* KalmGainMatr; /* =gain->data.fl */
4377 float* PriorErrorCovariance;/* =error_cov_pre->data.fl */
4378 float* PosterErrorCovariance;/* =error_cov_post->data.fl */
4379 float* Temp1; /* temp1->data.fl */
4380 float* Temp2; /* temp2->data.fl */
4383 CvMat* state_pre; /* predicted state (x'(k)):
4384 x(k)=A*x(k-1)+B*u(k) */
4385 CvMat* state_post; /* corrected state (x(k)):
4386 x(k)=x'(k)+K(k)*(z(k)-H*x'(k)) */
4387 CvMat* transition_matrix; /* state transition matrix (A) */
4388 CvMat* control_matrix; /* control matrix (B)
4389 (it is not used if there is no control)*/
4390 CvMat* measurement_matrix; /* measurement matrix (H) */
4391 CvMat* process_noise_cov; /* process noise covariance matrix (Q) */
4392 CvMat* measurement_noise_cov; /* measurement noise covariance matrix (R) */
4393 CvMat* error_cov_pre; /* priori error estimate covariance matrix (P'(k)):
4394 P'(k)=A*P(k-1)*At + Q)*/
4395 CvMat* gain; /* Kalman gain matrix (K(k)):
4396 K(k)=P'(k)*Ht*inv(H*P'(k)*Ht+R)*/
4397 CvMat* error_cov_post; /* posteriori error estimate covariance matrix (P(k)):
4398 P(k)=(I-K(k)*H)*P'(k) */
4399 CvMat* temp1; /* temporary matrices */
4408 The structure \texttt{CvKalman} is used to keep Kalman filter
4409 state. It is created by \cross{CreateKalman} function, updated
4410 by \cross{KalmanPredict} and \cross{KalmanCorrect} functions
4411 and released by \cross{ReleaseKalman} functions. Normally, the
4412 structure is used for standard Kalman filter (notation and the
4413 formulae below are borrowed from the excellent Kalman tutorial
4417 x_k=A \cdot x_{k-1}+B \cdot u_k+w_k
4425 x_k (x{k-1}) & \text{state of the system at the moment $k$ $(k-1)$ }\\
4426 z_k & \text{measurement of the system state at the moment $k$}\\
4427 u_k & \text{external control applied at the moment $k$}
4431 $w_k$ and $v_k$ are normally-distributed process and measurement noise, respectively:
4442 $Q$ process noise covariance matrix, constant or variable,
4444 $R$ measurement noise covariance matrix, constant or variable
4446 In case of standard Kalman filter, all the matrices: A, B, H, Q and R are initialized once after \cross{CvKalman} structure is allocated via \cross{CreateKalman}. However, the same structure and the same functions may be used to simulate extended Kalman filter by linearizing extended Kalman filter equation in the current system state neighborhood, in this case A, B, H (and, probably, Q and R) should be updated on every step.
4448 \cvfunc{CreateKalman}\label{CreateKalman}
4450 Allocates Kalman filter structure
4453 CvKalman* cvCreateKalman( \par int dynam\_params,\par int measure\_params,\par int control\_params=0 );
4457 \cvarg{dynam\_params}{dimensionality of the state vector}
4458 \cvarg{measure\_params}{dimensionality of the measurement vector}
4459 \cvarg{control\_params}{dimensionality of the control vector}
4462 The function \texttt{cvCreateKalman} allocates \cross{CvKalman} and all its matrices and initializes them somehow.
4464 \cvfunc{ReleaseKalman}\label{ReleaseKalman}
4466 Deallocates Kalman filter structure
4469 void cvReleaseKalman( \par CvKalman** kalman );
4473 \cvarg{kalman}{double pointer to the Kalman filter structure}
4476 The function \texttt{cvReleaseKalman} releases the structure \cross{CvKalman} and all underlying matrices.
4478 \cvfunc{KalmanPredict}\label{KalmanPredict}
4480 Estimates subsequent model state
4483 const CvMat* cvKalmanPredict( \par CvKalman* kalman, \par const CvMat* control=NULL );
4486 #define cvKalmanUpdateByTime cvKalmanPredict
4490 \cvarg{kalman}{Kalman filter state}
4491 \cvarg{control}{Control vector $u_k$, should be NULL iff there is no external control (\texttt{control\_params} =0)}
4494 The function \texttt{cvKalmanPredict} estimates the subsequent stochastic model state by its current state and stores it at \texttt{kalman->state\_pre}:
4498 x'_k=A \cdot x_k+B \cdot u_k\\
4499 P'_k=A \cdot P_{k-1}+A^T + Q
4505 \begin{tabular}{l p{5 in}}
4506 $x'_k$ & is predicted state \texttt{kalman->state\_pre},\\
4507 $x_{k-1}$ & is corrected state on the previous step \texttt{kalman->state\_post}
4508 (should be initialized somehow in the beginning, zero vector by default),\\
4509 $u_k$ & is external control (\texttt{control} parameter),\\
4510 $P'_k$ & is priori error covariance matrix \texttt{kalman->error\_cov\_pre}\\
4511 $P_{k-1}$ & is posteriori error covariance matrix on the previous step \texttt{kalman->error\_cov\_post}
4512 (should be initialized somehow in the beginning, identity matrix by default),
4515 The function returns the estimated state.
4517 \cvfunc{KalmanCorrect}\label{KalmanCorrect}
4522 const CvMat* cvKalmanCorrect( CvKalman* kalman, const CvMat* measurement );
4526 #define cvKalmanUpdateByMeasurement cvKalmanCorrect
4530 \cvarg{kalman}{Pointer to the structure to be updated}
4531 \cvarg{measurement}{Pointer to the structure CvMat containing the measurement vector}
4534 The function \texttt{cvKalmanCorrect} adjusts stochastic model state on the basis of the given measurement of the model state:
4538 K_k=P'_k \cdot H^T \cdot (H \cdot P'_k \cdot H^T+R)^{-1}\\
4539 x_k=x'_k+K_k \cdot (z_k-H \cdot x'_k)\\
4540 P_k=(I-K_k \cdot H) \cdot P'_k
4546 \begin{tabular}{l p{4 in}}
4547 $z_k$ & given measurement (\texttt{mesurement} parameter)\\
4548 $K_k$ & Kalman "gain" matrix.
4551 The function stores adjusted state at \texttt{kalman->state\_post} and returns it on output.
4553 % ===== Example. Using Kalman filter to track a rotating point =====
4556 #include "highgui.h"
4559 int main(int argc, char** argv)
4562 const float A[] = { 1, 1, 0, 1 };
4564 IplImage* img = cvCreateImage( cvSize(500,500), 8, 3 );
4565 CvKalman* kalman = cvCreateKalman( 2, 1, 0 );
4566 /* state is (phi, delta_phi) - angle and angle increment */
4567 CvMat* state = cvCreateMat( 2, 1, CV_32FC1 );
4568 CvMat* process_noise = cvCreateMat( 2, 1, CV_32FC1 );
4569 /* only phi (angle) is measured */
4570 CvMat* measurement = cvCreateMat( 1, 1, CV_32FC1 );
4574 cvRandInit( &rng, 0, 1, -1, CV_RAND_UNI );
4576 cvZero( measurement );
4577 cvNamedWindow( "Kalman", 1 );
4581 cvRandSetRange( &rng, 0, 0.1, 0 );
4582 rng.disttype = CV_RAND_NORMAL;
4584 cvRand( &rng, state );
4586 memcpy( kalman->transition_matrix->data.fl, A, sizeof(A));
4587 cvSetIdentity( kalman->measurement_matrix, cvRealScalar(1) );
4588 cvSetIdentity( kalman->process_noise_cov, cvRealScalar(1e-5) );
4589 cvSetIdentity( kalman->measurement_noise_cov, cvRealScalar(1e-1) );
4590 cvSetIdentity( kalman->error_cov_post, cvRealScalar(1));
4591 /* choose random initial state */
4592 cvRand( &rng, kalman->state_post );
4594 rng.disttype = CV_RAND_NORMAL;
4598 #define calc_point(angle) \
4599 cvPoint( cvRound(img->width/2 + img->width/3*cos(angle)), \
4600 cvRound(img->height/2 - img->width/3*sin(angle)))
4602 float state_angle = state->data.fl[0];
4603 CvPoint state_pt = calc_point(state_angle);
4605 /* predict point position */
4606 const CvMat* prediction = cvKalmanPredict( kalman, 0 );
4607 float predict_angle = prediction->data.fl[0];
4608 CvPoint predict_pt = calc_point(predict_angle);
4609 float measurement_angle;
4610 CvPoint measurement_pt;
4612 cvRandSetRange( &rng,
4614 sqrt(kalman->measurement_noise_cov->data.fl[0]),
4616 cvRand( &rng, measurement );
4618 /* generate measurement */
4619 cvMatMulAdd( kalman->measurement_matrix, state, measurement, measurement );
4621 measurement_angle = measurement->data.fl[0];
4622 measurement_pt = calc_point(measurement_angle);
4625 #define draw_cross( center, color, d ) \
4626 cvLine( img, cvPoint( center.x - d, center.y - d ), \
4627 cvPoint( center.x + d, center.y + d ), \
4629 cvLine( img, cvPoint( center.x + d, center.y - d ), \
4630 cvPoint( center.x - d, center.y + d ), \
4634 draw_cross( state_pt, CV_RGB(255,255,255), 3 );
4635 draw_cross( measurement_pt, CV_RGB(255,0,0), 3 );
4636 draw_cross( predict_pt, CV_RGB(0,255,0), 3 );
4637 cvLine( img, state_pt, predict_pt, CV_RGB(255,255,0), 3, 0 );
4639 /* adjust Kalman filter state */
4640 cvKalmanCorrect( kalman, measurement );
4642 cvRandSetRange( &rng,
4644 sqrt(kalman->process_noise_cov->data.fl[0]),
4646 cvRand( &rng, process_noise );
4647 cvMatMulAdd( kalman->transition_matrix,
4652 cvShowImage( "Kalman", img );
4653 code = cvWaitKey( 100 );
4655 if( code > 0 ) /* break current simulation by pressing a key */
4658 if( code == 27 ) /* exit by ESCAPE */
4666 \cvfunc{CvConDensation}\label{CvConDensation}
4671 typedef struct CvConDensation
4673 int MP; //Dimension of measurement vector
4674 int DP; // Dimension of state vector
4675 float* DynamMatr; // Matrix of the linear Dynamics system
4676 float* State; // Vector of State
4677 int SamplesNum; // Number of the Samples
4678 float** flSamples; // array of the Sample Vectors
4679 float** flNewSamples; // temporary array of the Sample Vectors
4680 float* flConfidence; // Confidence for each Sample
4681 float* flCumulative; // Cumulative confidence
4682 float* Temp; // Temporary vector
4683 float* RandomSample; // RandomVector to update sample set
4684 CvRandState* RandS; // Array of structures to generate random vectors
4688 The structure \texttt{CvConDensation} stores CONditional DENSity propagATION tracker state. The information about the algorithm can be found at \url{http://www.dai.ed.ac.uk/CVonline/LOCAL\_COPIES/ISARD1/condensation.html}.
4690 \cvfunc{CreateConDensation}\label{CreateConDensation}
4692 Allocates ConDensation filter structure
4695 CvConDensation* cvCreateConDensation( \par int dynam\_params,\par int measure\_params,\par int sample\_count );
4699 \cvarg{dynam\_params}{Dimension of the state vector}
4700 \cvarg{measure\_params}{Dimension of the measurement vector}
4701 \cvarg{sample\_count}{Number of samples}
4704 The function \texttt{cvCreateConDensation} creates \texttt{CvConDensation} structure and returns pointer to the structure.
4706 \cvfunc{ReleaseConDensation}\label{ReleaseConDensation}
4708 Deallocates ConDensation filter structure
4711 void cvReleaseConDensation( CvConDensation** condens );
4715 \cvarg{condens}{Pointer to the pointer to the structure to be released}
4718 The function \texttt{cvReleaseConDensation} releases the structure \cross{CvConDensation}) and frees all memory previously allocated for the structure.
4720 \cvfunc{ConDensInitSampleSet}\label{ConDensInitSampleSet}
4722 Initializes sample set for ConDensation algorithm
4725 void cvConDensInitSampleSet( CvConDensation* condens, \par CvMat* lower\_bound, \par CvMat* upper\_bound );
4729 \cvarg{condens}{Pointer to a structure to be initialized}
4730 \cvarg{lower\_bound}{Vector of the lower boundary for each dimension}
4731 \cvarg{upper\_bound}{Vector of the upper boundary for each dimension}
4734 The function \texttt{cvConDensInitSampleSet} fills the samples arrays in the structure \cross{CvConDensation} with values within specified ranges.
4736 \cvfunc{ConDensUpdateByTime}\label{ConDensUpdateByTime}
4738 Estimates subsequent model state
4741 void cvConDensUpdateByTime( \par CvConDensation* condens );
4745 \cvarg{condens}{Pointer to the structure to be updated}
4748 The function \texttt{cvConDensUpdateByTime} estimates the subsequent stochastic model state from its current state.
4750 \section{Pattern Recognition}
4752 \subsection{Object Detection}
4754 The object detector described below has been initially proposed by Paul Viola
4756 and improved by Rainer Lienhart
4758 . First, a classifier (namely a \emph{cascade of boosted classifiers working with haar-like features}) is trained with a few hundreds of sample views of a particular object (i.e., a face or a car), called positive examples, that are scaled to the same size (say, 20x20), and negative examples - arbitrary images of the same size.
4760 After a classifier is trained, it can be applied to a region of interest
4761 (of the same size as used during the training) in an input image. The
4762 classifier outputs a "1" if the region is likely to show the object
4763 (i.e., face/car), and "0" otherwise. To search for the object in the
4764 whole image one can move the search window across the image and check
4765 every location using the classifier. The classifier is designed so that
4766 it can be easily "resized" in order to be able to find the objects of
4767 interest at different sizes, which is more efficient than resizing the
4768 image itself. So, to find an object of an unknown size in the image the
4769 scan procedure should be done several times at different scales.
4771 The word "cascade" in the classifier name means that the resultant
4772 classifier consists of several simpler classifiers (\emph{stages}) that
4773 are applied subsequently to a region of interest until at some stage the
4774 candidate is rejected or all the stages are passed. The word "boosted"
4775 means that the classifiers at every stage of the cascade are complex
4776 themselves and they are built out of basic classifiers using one of four
4777 different \texttt{boosting} techniques (weighted voting). Currently
4778 Discrete Adaboost, Real Adaboost, Gentle Adaboost and Logitboost are
4779 supported. The basic classifiers are decision-tree classifiers with at
4780 least 2 leaves. Haar-like features are the input to the basic classifers,
4781 and are calculated as described below. The current algorithm uses the
4782 following Haar-like features:
4784 \includegraphics[width=0.5\textwidth]{pics/haarfeatures.png}
4786 The feature used in a particular classifier is specified by its shape (1a, 2b etc.), position within the region of interest and the scale (this scale is not the same as the scale used at the detection stage, though these two scales are multiplied). For example, in case of the third line feature (2c) the response is calculated as the difference between the sum of image pixels under the rectangle covering the whole feature (including the two white stripes and the black stripe in the middle) and the sum of the image pixels under the black stripe multiplied by 3 in order to compensate for the differences in the size of areas. The sums of pixel values over a rectangular regions are calculated rapidly using integral images (see below and \cross{Integral} description).
4788 To see the object detector at work, have a look at HaarFaceDetect demo.
4790 The following reference is for the detection part only. There
4791 is a separate application called \texttt{haartraining} that can
4792 train a cascade of boosted classifiers from a set of samples. See
4793 \texttt{opencv/apps/haartraining} for details.
4795 \cvfunc{CvHaarFeature, CvHaarClassifier, CvHaarStageClassifier, CvHaarClassifierCascade}
4796 \label{CvHaarFeature}
4797 \label{CvHaarClassifier}
4798 \label{CvHaarStageClassifier}
4799 \label{CvHaarClassifierCascade}
4801 Boosted Haar classifier structures
4804 #define CV_HAAR_FEATURE_MAX 3
4806 /* a haar feature consists of 2-3 rectangles with appropriate weights */
4807 typedef struct CvHaarFeature
4809 int tilted; /* 0 means up-right feature, 1 means 45--rotated feature */
4811 /* 2-3 rectangles with weights of opposite signs and
4812 with absolute values inversely proportional to the areas of the rectangles.
4813 if rect[2].weight !=0, then
4814 the feature consists of 3 rectangles, otherwise it consists of 2 */
4819 } rect[CV_HAAR_FEATURE_MAX];
4823 /* a single tree classifier (stump in the simplest case) that returns the response for the feature
4824 at the particular image location (i.e. pixel sum over subrectangles of the window) and gives out
4825 a value depending on the responce */
4826 typedef struct CvHaarClassifier
4828 int count; /* number of nodes in the decision tree */
4830 /* these are "parallel" arrays. Every index \texttt{i}
4831 corresponds to a node of the decision tree (root has 0-th index).
4833 left[i] - index of the left child (or negated index if the left child is a leaf)
4834 right[i] - index of the right child (or negated index if the right child is a leaf)
4835 threshold[i] - branch threshold. if feature responce is <= threshold, left branch
4836 is chosen, otherwise right branch is chosed.
4837 alpha[i] - output value correponding to the leaf. */
4838 CvHaarFeature* haar_feature;
4846 /* a boosted battery of classifiers(=stage classifier):
4847 the stage classifier returns 1
4848 if the sum of the classifiers' responces
4849 is greater than \texttt{threshold} and 0 otherwise */
4850 typedef struct CvHaarStageClassifier
4852 int count; /* number of classifiers in the battery */
4853 float threshold; /* threshold for the boosted classifier */
4854 CvHaarClassifier* classifier; /* array of classifiers */
4856 /* these fields are used for organizing trees of stage classifiers,
4857 rather than just stright cascades */
4862 CvHaarStageClassifier;
4864 typedef struct CvHidHaarClassifierCascade CvHidHaarClassifierCascade;
4866 /* cascade or tree of stage classifiers */
4867 typedef struct CvHaarClassifierCascade
4869 int flags; /* signature */
4870 int count; /* number of stages */
4871 CvSize orig_window_size; /* original object size (the cascade is trained for) */
4873 /* these two parameters are set by cvSetImagesForHaarClassifierCascade */
4874 CvSize real_window_size; /* current object size */
4875 double scale; /* current scale */
4876 CvHaarStageClassifier* stage_classifier; /* array of stage classifiers */
4877 CvHidHaarClassifierCascade* hid_cascade; /* hidden optimized representation of the cascade,
4878 created by cvSetImagesForHaarClassifierCascade */
4880 CvHaarClassifierCascade;
4883 All the structures are used for representing a cascaded of boosted Haar classifiers. The cascade has the following hierarchical structure:
4900 The whole hierarchy can be constructed manually or loaded from a file or an embedded base using function \cross{LoadHaarClassifierCascade}.
4903 \cvfunc{LoadHaarClassifierCascade}\label{LoadHaarClassifierCascade}
4905 Loads a trained cascade classifier from file or the classifier database embedded in OpenCV
4908 CvHaarClassifierCascade* cvLoadHaarClassifierCascade( \par const char* directory,\par CvSize orig\_window\_size );
4912 \cvarg{directory}{Name of directory containing the description of a trained cascade classifier}
4913 \cvarg{orig\_window\_size}{Original size of objects the cascade has been trained on. Note that it is not stored in the cascade and therefore must be specified separately}
4916 The function \texttt{cvLoadHaarClassifierCascade} loads a trained cascade
4917 of haar classifiers from a file or the classifier database embedded in
4918 OpenCV. The base can be trained using \texttt{haartraining} application
4919 (see opencv/apps/haartraining for details).
4921 \textbf{The function is obsolete}. Nowadays object detection classifiers are stored in XML or YAML files, rather than in directories. To load cascade from a file, use cvLoad function.
4923 \cvfunc{ReleaseHaarClassifierCascade}\label{ReleaseHaarClassifierCascade}
4925 Releases haar classifier cascade
4928 void cvReleaseHaarClassifierCascade( \par CvHaarClassifierCascade** cascade );
4932 \cvarg{cascade}{Double pointer to the released cascade. The pointer is cleared by the function}
4935 The function \texttt{cvReleaseHaarClassifierCascade} deallocates the cascade that has been created manually or loaded using \cross{LoadHaarClassifierCascade} or \cross{Load}.
4937 \cvfunc{HaarDetectObjects}\label{HaarDetectObjects}
4939 Detects objects in the image
4942 typedef struct CvAvgComp
4944 CvRect rect; /* bounding rectangle for the object (average rectangle of a group) */
4945 int neighbors; /* number of neighbor rectangles in the group */
4951 CvSeq* cvHaarDetectObjects( \par const CvArr* image,\par CvHaarClassifierCascade* cascade,\par CvMemStorage* storage,\par double scale\_factor=1.1,\par int min\_neighbors=3,\par int flags=0,\par CvSize min\_size=cvSize(0,\par0) );
4955 \cvarg{image}{Image to detect objects in}
4956 \cvarg{cascade}{Haar classifier cascade in internal representation}
4957 \cvarg{storage}{Memory storage to store the resultant sequence of the object candidate rectangles}
4958 \cvarg{scale\_factor}{The factor by which the search window is scaled between the subsequent scans, for example, 1.1 means increasing window by 10\% }
4959 \cvarg{min\_neighbors}{Minimum number (minus 1) of neighbor rectangles that makes up an object. All the groups of a smaller number of rectangles than \texttt{min\_neighbors}-1 are rejected. If \texttt{min\_neighbors} is 0, the function does not any grouping at all and returns all the detected candidate rectangles, which may be useful if the user wants to apply a customized grouping procedure}
4960 \cvarg{flags}{Mode of operation. Currently the only flag that may be specified is \texttt{CV\_HAAR\_DO\_CANNY\_PRUNING}. If it is set, the function uses Canny edge detector to reject some image regions that contain too few or too much edges and thus can not contain the searched object. The particular threshold values are tuned for face detection and in this case the pruning speeds up the processing}
4961 \cvarg{min\_size}{Minimum window size. By default, it is set to the size of samples the classifier has been trained on ($\sim 20\times 20$ for face detection)}
4964 The function \texttt{cvHaarDetectObjects} finds rectangular regions in the given image that are likely to contain objects the cascade has been trained for and returns those regions as a sequence of rectangles. The function scans the image several times at different scales (see \cross{SetImagesForHaarClassifierCascade}). Each time it considers overlapping regions in the image and applies the classifiers to the regions using \cross{RunHaarClassifierCascade}. It may also apply some heuristics to reduce number of analyzed regions, such as Canny prunning. After it has proceeded and collected the candidate rectangles (regions that passed the classifier cascade), it groups them and returns a sequence of average rectangles for each large enough group. The default parameters (\texttt{scale\_factor} =1.1, \texttt{min\_neighbors} =3, \texttt{flags} =0) are tuned for accurate yet slow object detection. For a faster operation on real video images the settings are: \texttt{scale\_factor} =1.2, \texttt{min\_neighbors} =2, \texttt{flags} =\texttt{CV\_HAAR\_DO\_CANNY\_PRUNING}, \texttt{min\_size} =\textit{minimum possible face size} (for example, $\sim$ 1/4 to 1/16 of the image area in case of video conferencing).
4966 % ===== Example. Using cascade of Haar classifiers to find objects (e.g. faces). =====
4969 #include "highgui.h"
4971 CvHaarClassifierCascade* load_object_detector( const char* cascade_path )
4973 return (CvHaarClassifierCascade*)cvLoad( cascade_path );
4976 void detect_and_draw_objects( IplImage* image,
4977 CvHaarClassifierCascade* cascade,
4980 IplImage* small_image = image;
4981 CvMemStorage* storage = cvCreateMemStorage(0);
4985 /* if the flag is specified, down-scale the input image to get a
4986 performance boost w/o loosing quality (perhaps) */
4989 small_image = cvCreateImage( cvSize(image->width/2,image->height/2), IPL_DEPTH_8U, 3 );
4990 cvPyrDown( image, small_image, CV_GAUSSIAN_5x5 );
4994 /* use the fastest variant */
4995 faces = cvHaarDetectObjects( small_image, cascade, storage, 1.2, 2, CV_HAAR_DO_CANNY_PRUNING );
4997 /* draw all the rectangles */
4998 for( i = 0; i < faces->total; i++ )
5000 /* extract the rectanlges only */
5001 CvRect face_rect = *(CvRect*)cvGetSeqElem( faces, i, 0 );
5002 cvRectangle( image, cvPoint(face_rect.x*scale,face_rect.y*scale),
5003 cvPoint((face_rect.x+face_rect.width)*scale,
5004 (face_rect.y+face_rect.height)*scale),
5005 CV_RGB(255,0,0), 3 );
5008 if( small_image != image )
5009 cvReleaseImage( &small_image );
5010 cvReleaseMemStorage( &storage );
5013 /* takes image filename and cascade path from the command line */
5014 int main( int argc, char** argv )
5017 if( argc==3 && (image = cvLoadImage( argv[1], 1 )) != 0 )
5019 CvHaarClassifierCascade* cascade = load_object_detector(argv[2]);
5020 detect_and_draw_objects( image, cascade, 1 );
5021 cvNamedWindow( "test", 0 );
5022 cvShowImage( "test", image );
5024 cvReleaseHaarClassifierCascade( &cascade );
5025 cvReleaseImage( &image );
5032 \cvfunc{SetImagesForHaarClassifierCascade}\label{SetImagesForHaarClassifierCascade}
5034 Assigns images to the hidden cascade
5037 void cvSetImagesForHaarClassifierCascade( \par CvHaarClassifierCascade* cascade,\par const CvArr* sum,\par const CvArr* sqsum,\par const CvArr* tilted\_sum,\par double scale );
5041 \cvarg{cascade}{Hidden Haar classifier cascade, created by \cross{CreateHidHaarClassifierCascade}}
5042 \cvarg{sum}{Integral (sum) single-channel image of 32-bit integer format. This image as well as the two subsequent images are used for fast feature evaluation and brightness/contrast normalization. They all can be retrieved from input 8-bit or floating point single-channel image using The function \cross{Integral}}
5043 \cvarg{sqsum}{Square sum single-channel image of 64-bit floating-point format}
5044 \cvarg{tilted\_sum}{Tilted sum single-channel image of 32-bit integer format}
5045 \cvarg{scale}{Window scale for the cascade. If \texttt{scale} =1, original window size is used (objects of that size are searched) - the same size as specified in \cross{LoadHaarClassifierCascade} (24x24 in case of \texttt{default\_face\_cascade}), if \texttt{scale} =2, a two times larger window is used (48x48 in case of default face cascade). While this will speed-up search about four times, faces smaller than 48x48 cannot be detected}
5048 The function \texttt{cvSetImagesForHaarClassifierCascade} assigns images and/or window scale to the hidden classifier cascade. If image pointers are NULL, the previously set images are used further (i.e. NULLs mean "do not change images"). Scale parameter has no such a "protection" value, but the previous value can be retrieved by \cross{GetHaarClassifierCascadeScale} function and reused again. The function is used to prepare cascade for detecting object of the particular size in the particular image. The function is called internally by \cross{HaarDetectObjects}, but it can be called by user if there is a need in using lower-level function \cross{RunHaarClassifierCascade}.
5050 \cvfunc{RunHaarClassifierCascade}\label{RunHaarClassifierCascade}
5052 Runs cascade of boosted classifier at given image location
5055 int cvRunHaarClassifierCascade( \par CvHaarClassifierCascade* cascade,\par CvPoint pt,\par int start\_stage=0 );
5059 \cvarg{cascade}{Haar classifier cascade}
5060 \cvarg{pt}{Top-left corner of the analyzed region. Size of the region is a original window size scaled by the currenly set scale. The current window size may be retrieved using \cross{GetHaarClassifierCascadeWindowSize} function}
5061 \cvarg{start\_stage}{Initial zero-based index of the cascade stage to start from. The function assumes that all the previous stages are passed. This feature is used internally by \cross{HaarDetectObjects} for better processor cache utilization}
5064 The function \texttt{cvRunHaarHaarClassifierCascade} runs Haar classifier
5065 cascade at a single image location. Before using this function the
5066 integral images and the appropriate scale (window size) should be set
5067 using \cross{SetImagesForHaarClassifierCascade}. The function returns
5068 positive value if the analyzed rectangle passed all the classifier stages
5069 (it is a candidate) and zero or negative value otherwise.
5071 \section{Camera Calibration and 3D Reconstruction}
5073 \subsection{Pinhole Camera Model, Distortion}
5075 The functions in this section use so-called pinhole camera model. That
5076 is, a scene view is formed by projecting 3D points into the image plane
5077 using perspective transformation.
5080 s \quad m' = A [R|t] M'
5086 s \vecthree{u}{v}{1} = \vecthreethree
5091 r_{11} & r_{12} & r{13} & t_1 \\
5092 r_{21} & r_{22} & r{23} & t_2 \\
5093 r_{31} & r_{32} & r{33} & t_3
5095 \begin{bmatrix}X\\Y\\Z\\1 \end{bmatrix}
5098 Where $(X, Y, Z)$ are coordinates of a 3D point in the world
5099 coordinate space, $(u, v)$ are coordinates of point projection
5100 in pixels. $A$ is called a camera matrix, or matrix of
5101 intrinsic parameters. $(cx, cy)$ is a principal point (that is
5102 usually at the image center), and $fx, fy$ are focal lengths
5103 expressed in pixel-related units. Thus, if an image from camera is
5104 up-sampled/down-sampled by some factor, all these parameters should
5105 be scaled (multiplied/divided, respectively) by the same factor. The
5106 matrix of intrinsic parameters does not depend on the scene viewed and,
5107 once estimated, can be re-used (as long as the focal length is fixed (in
5108 case of zoom lens)). The joint rotation-translation matrix $[R|t]$
5109 is called a matrix of extrinsic parameters. It is used to describe the
5110 camera motion around a static scene, or vice versa, rigid motion of an
5111 object in front of still camera. That is, $[R|t]$ translates
5112 coordinates of a point $(X, Y, Z)$ to some coordinate system,
5113 fixed with respect to the camera. The transformation above is equivalent
5114 to the following (when $z \ne 0$):
5118 \vecthree{x}{y}{z} = R \vecthree{X}{Y}{Z} + t\\
5126 Real lens usually have some distortion, which major components are
5127 radial distorion and slight tangential distortion. So, the above model
5132 \vecthree{x}{y}{z} = R \vecthree{X}{Y}{Z} + t\\
5135 x'' = x' (1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + 2 p_1 x' y' + p_2(r^2 + 2 x'^2) \\
5136 y'' = y' (1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + p_1 (r^2 + 2 y'^2) + 2 p_2 x' y' \\
5137 \text{where} \quad r^2 = x'^2 + y'^2 \\
5143 $k_1$, $k_2$, $k_3$ are radial distortion coefficients, $p_1$, $p_2$ are tangential distortion coefficients.
5144 Higher-order coefficients are not considered in OpenCV.
5145 The distortion coefficients also do not depend on the scene viewed, thus they are intrinsic camera parameters.
5146 \emph{And they remain the same regardless of the captured image resolution.}
5147 That is, if, for example, a camera has been calibrated on images of $320
5148 \times 240$ resolution, absolutely the same distortion coefficients can
5149 be used for images of $640 \times 480$ resolution from the same camera (while $fx$,
5150 $fy$, $cx$ and $cy$ need to be scaled appropriately).
5152 The functions below use the above model to
5155 \item Project 3D points to the image plane given intrinsic and extrinsic parameters
5156 \item Compute extrinsic parameters given intrinsic parameters, a few 3D points and their projections.
5157 \item Estimate intrinsic and extrinsic camera parameters from several views of a known calibration pattern (i.e. every view is described by several 3D-2D point correspodences).
5160 \subsection{Camera Calibration}
5162 \cvfunc{ProjectPoints2}\label{ProjectPoints2}
5164 Projects 3D points to image plane
5167 void cvProjectPoints2( \par const CvMat* object\_points,\par const CvMat* rotation\_vector,\par const CvMat* translation\_vector,\par const CvMat* intrinsic\_matrix,\par const CvMat* distortion\_coeffs,\par CvMat* image\_points,\par CvMat* dpdrot=NULL,\par CvMat* dpdt=NULL,\par CvMat* dpdf=NULL,\par CvMat* dpdc=NULL,\par CvMat* dpddist=NULL );
5171 \cvarg{object\_points}{The array of object points, 3xN or Nx3, where N is the number of points in the view}
5172 \cvarg{rotation\_vector}{The rotation vector, 1x3 or 3x1}
5173 \cvarg{translation\_vector}{The translation vector, 1x3 or 3x1}
5174 \cvarg{intrinsic\_matrix}{The camera matrix $A = \vecthreethree{fx}{0}{cx}{0}{fy}{cy}{0}{0}{1} $}
5175 \cvarg{distortion\_coeffs}{The vector of distortion coefficients, 4x1 or 1x4 $k_1, k_2, k_3, k_4$. If it is \texttt{NULL}, all distortion coefficients are considered 0's}
5176 \cvarg{image\_points}{The output array of image points, 2xN or Nx2, where N is the total number of points in the view}
5177 \cvarg{dpdrot}{Optional Nx3 matrix of derivatives of image points with respect to components of the rotation vector}
5178 \cvarg{dpdt}{Optional Nx3 matrix of derivatives of image points w.r.t. components of the translation vector}
5179 \cvarg{dpdf}{Optional Nx2 matrix of derivatives of image points w.r.t. $fx$ and $fy$}
5180 \cvarg{dpdc}{Optional Nx2 matrix of derivatives of image points w.r.t. $cx$ and $cy$}
5181 \cvarg{dpddist}{Optional Nx4 matrix of derivatives of image points w.r.t. distortion coefficients}
5184 The function \texttt{cvProjectPoints2} computes projections of 3D
5185 points to the image plane given intrinsic and extrinsic camera
5186 parameters. Optionally, the function computes jacobians - matrices
5187 of partial derivatives of image points as functions of all the
5188 input parameters w.r.t. the particular parameters, intrinsic and/or
5189 extrinsic. The jacobians are used during the global optimization
5190 in \cross{CalibrateCamera2} and
5191 \cross{FindExtrinsicCameraParams2}. The
5192 function itself is also used to compute back-projection error for with
5193 current intrinsic and extrinsic parameters.
5195 Note, that with intrinsic and/or extrinsic parameters set to special
5196 values, the function can be used to compute just extrinsic transformation
5197 or just intrinsic transformation (i.e. distortion of a sparse set
5200 \cvfunc{FindHomography}\label{FindHomography}
5202 Finds perspective transformation between two planes
5205 void cvFindHomography( \par const CvMat* src\_points,\par const CvMat* dst\_points,\par CvMat* homography \par
5206 int method=0, \par double ransacReprojThreshold=0, \par CvMat* mask=NULL);
5210 \cvarg{src\_points}{Point coordinates in the original plane, 2xN, Nx2, 3xN or Nx3 array (the latter two are for representation in homogenious coordinates), where N is the number of points}
5211 \cvarg{dst\_points}{Point coordinates in the destination plane, 2xN, Nx2, 3xN or Nx3 array (the latter two are for representation in homogenious coordinates)}
5212 \cvarg{homography}{Output 3x3 homography matrix}
5213 \cvarg{method}{ The method used to computed homography matrix. One of:
5215 \cvarg{0}{regular method using all the point pairs}
5216 \cvarg{CV\_RANSAC}{RANSAC-based robust method}
5217 \cvarg{CV\_LMEDS}{Least-Median robust method}
5219 \cvarg{ransacReprojThreshold}{The maximum allowed reprojection error to treat a point pair as an inlier. The parameter is only used in RANSAC-based homography estimation. E.g. if \texttt{dst\_points} coordinates are measured in pixels with pixel-accurate precision, it makes sense to set this parameter somewhere in the range $1...3$. }
5220 \cvarg{mask}{The optional output mask set by a robust method (\texttt{CV\_RANSAC} or \texttt{CV\_LMEDS}).}
5223 The function \texttt{cvFindHomography} finds perspective transformation $H$ between the source and the destination planes:
5226 s_i \vecthree{x'_i}{y'_i}{1} \sim H \vecthree{x_i}{y_i}{1}
5229 So that the back-projection error is minimized:
5233 \left( x'_i-\frac{h_{11} x_i + h_{12} y_i + h_{13}}{h_{31} x_i + h_{32} y_i + h_{33}} \right)^2+
5234 \left( y'_i-\frac{h_{21} x_i + h_{22} y_i + h_{23}}{h_{31} x_i + h_{32} y_i + h_{33}} \right)^2
5237 If the parameter method is set to the default value 0, the function
5238 uses all the point pairs and estimates the best suitable homography
5239 matrix. However, if there can not all the points pairs ($src\_points_i$,
5240 $dst\_points_i$) fit the rigid perspective transformation (i.e. there
5241 can be outliers), it is still possible to estimate the correct
5242 transformation using one of the robust methods available. Both
5243 methods, \texttt{CV\_RANSAC} and \texttt{CV\_LMEDS}, try many different random subsets
5244 of the corresponding point pairs (of 5 pairs each), estimate
5245 homography matrix using this subset using simple least-square
5246 algorithm and then compute quality/goodness of the computed homography
5247 (which is the number of inliers for RANSAC or the median reprojection
5248 error for LMeDs). The best subset is then used to produce the initial
5249 estimate of the homography matrix and the mask of inliers/outliers.
5251 Regardless of the method, robust or not, the computed homography
5252 matrix is refined further (using inliers only in case of a robust
5253 method) with Levenberg-Marquardt method in order to reduce the
5254 reprojection error even more.
5256 The method \texttt{CV\_RANSAC} can handle practically any ratio of outliers,
5257 but it needs the threshold to distinguish inliers from outliers.
5258 The method \texttt{CV\_LMEDS} does not need any threshold, but it works
5259 correctly only when there are more than 50\% of inliers. Finally,
5260 if you are sure in the computed features and there can be only some
5261 small noise, but no outliers, the default method could be the best
5264 The function is used to find initial intrinsic and extrinsic matrices.
5265 Homography matrix is determined up to a scale, thus it is normalized
5266 to make $h_{33} =1$.
5268 \cvfunc{CalibrateCamera2}\label{CalibrateCamera2}
5270 Finds intrinsic and extrinsic camera parameters using calibration pattern
5273 void cvCalibrateCamera2( \par const CvMat* object\_points,\par const CvMat* image\_points,\par const CvMat* point\_counts,\par CvSize image\_size,\par CvMat* intrinsic\_matrix,\par CvMat* distortion\_coeffs,\par CvMat* rotation\_vectors=NULL,\par CvMat* translation\_vectors=NULL,\par int flags=0 );
5277 \cvarg{object\_points}{The joint matrix of object points, 3xN or Nx3, where N is the total number of points in all views}
5278 \cvarg{image\_points}{The joint matrix of corresponding image points, 2xN or Nx2, where N is the total number of points in all views}
5279 \cvarg{point\_counts}{Vector containing numbers of points in each particular view, 1xM or Mx1, where M is the number of a scene views}
5280 \cvarg{image\_size}{Size of the image, used only to initialize intrinsic camera matrix}
5281 \cvarg{intrinsic\_matrix}{The output camera matrix $A = \vecthreethree{fx}{0}{cx}{0}{fy}{cy}{0}{0}{1} $. If \texttt{CV\_CALIB\_USE\_INTRINSIC\_GUESS} and/or \texttt{CV\_CALIB\_FIX\_ASPECT\_RATION} are specified, some or all of \texttt{fx, fy, cx, cy} must be initialized}
5282 \cvarg{distortion\_coeffs}{The output 4x1 or 1x4 vector of distortion coefficients $k_1, k_2, k_3, k_4$}
5283 \cvarg{rotation\_vectors}{The output 3xM or Mx3 array of rotation vectors (compact representation of rotation matrices, \cross{Rodrigues2})}
5284 \cvarg{translation\_vectors}{The output 3xM or Mx3 array of translation vectors}
5285 \cvarg{flags}{Different flags, may be 0 or combination of the following values:
5287 \cvarg{CV\_CALIB\_USE\_INTRINSIC\_GUESS}{\texttt{intrinsic\_matrix} contains valid initial values of \texttt{fx, fy, cx, cy} that are optimized further. Otherwise, \texttt{(cx, cy)} is initially set to the image center (\texttt{image\_size} is used here), and focal distances are computed in some least-squares fashion. Note, that if intrinsic parameters are known, there is no need to use this function. Use \cross{FindExtrinsicCameraParams2} instead.}
5288 \cvarg{CV\_CALIB\_FIX\_PRINCIPAL\_POINT}{The principal point is not changed during the global optimization, it stays at the center and at the other location specified (when \texttt{CV\_CALIB\_USE\_INTRINSIC\_GUESS} is set as well)}
5289 \cvarg{CV\_CALIB\_FIX\_ASPECT\_RATIO}{The optimization procedure consider only one of \texttt{fx} and \texttt{fy} as independent variable and keeps the aspect ratio \texttt{fx/fy} the same as it was set initially in \texttt{intrinsic\_matrix}. In this case the actual initial values of \texttt{(fx, fy)} are either taken from the matrix (when \texttt{CV\_CALIB\_USE\_INTRINSIC\_GUESS} is set) or estimated somehow (in the latter case \texttt{fx, fy} may be set to arbitrary values, only their ratio is used).}
5290 \cvarg{CV\_CALIB\_ZERO\_TANGENT\_DIST}{Tangential distortion coefficients are set to zeros and do not change during the optimization.}}
5294 The function \texttt{cvCalibrateCamera2} estimates intrinsic camera
5295 parameters and extrinsic parameters for each of the views. The
5296 coordinates of 3D object points and their correspondent 2D projections
5297 in each view must be specified. That may be achieved by using an
5298 object with known geometry and easily detectable feature points.
5299 Such an object is called calibration rig or calibration pattern,
5300 and OpenCV has built-in support for a chessboard as a calibration
5301 rig (see \cross{FindChessboardCornerGuesses}). Currently, initialization
5302 of inrtrinsic parameters (when \texttt{CV\_CALIB\_USE\_INTRINSIC\_GUESS}
5303 is not set) is only implemented for planar calibration rigs
5304 (z-coordinates of object points must be all 0's or all 1's). 3D
5305 rigs can still be used as long as initial \texttt{intrinsic\_matrix}
5306 is provided. After the initial values of intrinsic and extrinsic
5307 parameters are computed, they are optimized to minimize the total
5308 back-projection error - the sum of squared differences between the
5309 actual coordinates of image points and the ones computed using
5310 \cross{ProjectPoints2}.
5312 Note: if you're using a non-square (=non-NxN) grid and
5313 \cross{FindChessboardCorners} for calibration, and cvCalibrateCamera2 returns
5314 bad values (i.e. zero distortion coefficients, an image center of
5315 (w/2-0.5,h/2-0.5), and / or large differences between $fx$ and $fy$ (ratios of
5316 10:1 or more)), then you've probaby used pattern\_size=cvSize(rows,cols),
5317 but should use pattern\_size=cvSize(cols,rows) in \cross{FindChessboardCorners}.
5319 \cvfunc{FindExtrinsicCameraParams2}\label{FindExtrinsicCameraParams2}
5321 Finds extrinsic camera parameters for particular view
5324 void cvFindExtrinsicCameraParams2( \par const CvMat* object\_points,\par const CvMat* image\_points,\par const CvMat* intrinsic\_matrix,\par const CvMat* distortion\_coeffs,\par CvMat* rotation\_vector,\par CvMat* translation\_vector );
5328 \cvarg{object\_points}{The array of object points, 3xN or Nx3, where N is the number of points in the view}
5329 \cvarg{image\_points}{The array of corresponding image points, 2xN or Nx2, where N is the number of points in the view}
5330 \cvarg{intrinsic\_matrix}{The output camera matrix $A = \vecthreethree{fx}{0}{cx}{0}{fy}{cy}{0}{0}{1} $}
5331 \cvarg{distortion\_coeffs}{The output 4x1 or 1x4 vector of distortion coefficients $k_1, k_2, k_3, k_4$. If it is NULL, all distortion coefficients are considered 0's}
5332 \cvarg{rotation\_vector}{The output 3x1 or 1x3 rotation vector (compact representation of a rotation matrix, cvRodrigues2}
5333 \cvarg{translation\_vector}{The output 3x1 or 1x3 translation vector}
5336 The function \texttt{cvFindExtrinsicCameraParams2} estimates extrinsic camera parameters using known intrinsic parameters and extrinsic parameters for each view. The coordinates of 3D object points and their correspondent 2D projections must be specified. This function also minimizes back-projection error.
5338 % XXX missing StereoCalibrate
5339 % XXX missing StereoRectify
5340 % XXX misssing StereoRectifyUncalibrated
5342 \cvfunc{Rodrigues2}\label{Rodrigues2}
5344 Converts rotation matrix to rotation vector or vice versa
5347 int cvRodrigues2( \par const CvMat* src,\par CvMat* dst,\par CvMat* jacobian=0 );
5351 \cvarg{src}{The input rotation vector (3x1 or 1x3) or rotation matrix (3x3)}
5352 \cvarg{dst}{The output rotation matrix (3x3) or rotation vector (3x1 or 1x3), respectively}
5353 \cvarg{jacobian}{Optional output Jacobian matrix, 3x9 or 9x3 - partial derivatives of the output array components w.r.t the input array components}
5356 The function \texttt{cvRodrigues2} converts a rotation vector to rotation matrix or vice versa. Rotation vector is a compact representation of rotation matrix. Direction of the rotation vector is the rotation axis and the length of the vector is the rotation angle around the axis. The rotation matrix $R$, corresponding to the rotation vector $r$, is computed as following:
5360 \theta \leftarrow norm(r)\\
5361 r \leftarrow r/\theta\\
5362 R = \cos{\theta} I + (1-\cos{\theta}) r r^T + \sin{\theta}
5370 Inverse transformation can also be done easily as
5382 Rotation vector is a convenient representation of a rotation matrix
5383 as a matrix with only 3 degrees of freedom. The representation is
5384 used in the global optimization procedures inside
5385 \cross{FindExtrinsicCameraParams2}
5386 and \cross{CalibrateCamera2}.
5388 \cvfunc{Undistort2}\label{Undistort2}
5390 Transforms image to compensate lens distortion
5393 void cvUndistort2( \par const CvArr* src,\par CvArr* dst,\par const CvMat* intrinsic\_matrix,\par const CvMat* distortion\_coeffs );
5397 \cvarg{src}{The input (distorted) image}
5398 \cvarg{dst}{The output (corrected) image}
5399 \cvarg{intrinsic\_matrix}{The output camera matrix $A = \vecthreethree{fx}{0}{cx}{0}{fy}{cy}{0}{0}{1} $}
5400 \cvarg{distortion\_coeffs}{The output 4x1 or 1x4 vector of distortion coefficients $k_1, k_2, k_3, k_4$.}
5403 The function \texttt{cvUndistort2} transforms the image to compensate
5404 radial and tangential lens distortion. The camera matrix and
5405 distortion parameters can be determined using
5406 \cross{CalibrateCamera2}. For every
5407 pixel in the output image the function computes coordinates of the
5408 corresponding location in the input image using the formulae in the
5409 section beginning. Then, the pixel value is computed using bilinear
5410 interpolation. If the resolution of images is different from what
5411 was used at the calibration stage, $fx, fy, cx$ and $cy$
5412 need to be adjusted appropriately, while the distortion coefficients
5415 \cvfunc{InitUndistortMap}\label{InitUndistortMap}
5417 Computes undistorion map
5420 void cvInitUndistortMap( \par const CvMat* intrinsic\_matrix,\par const CvMat* distortion\_coeffs,\par CvArr* mapx,\par CvArr* mapy );
5424 \cvarg{intrinsic\_matrix}{The output camera matrix $A = \vecthreethree{fx}{0}{cx}{0}{fy}{cy}{0}{0}{1} $}
5425 \cvarg{distortion\_coeffs}{The output 4x1 or 1x4 vector of distortion coefficients $k_1, k_2, k_3, k_4$.}
5426 \cvarg{mapx}{The output array of x-coordinates of the map}
5427 \cvarg{mapy}{The output array of y-coordinates of the map}
5430 The function \texttt{cvInitUndistortMap} pre-computes the undistortion map - coordinates of the corresponding pixel in the distorted image for every pixel in the corrected image. Then, the map (together with input and output images) can be passed to \cross{Remap} function.
5432 % XXX missing InitUndistortRectifyMap
5433 % XXX missing UndistortPoints
5435 \cvfunc{FindChessboardCorners}\label{FindChessboardCorners}
5437 Finds positions of internal corners of the chessboard
5440 int cvFindChessboardCorners( \par const void* image,\par CvSize pattern\_size,\par CvPoint2D32f* corners,\par int* corner\_count=NULL,\par int flags=CV\_CALIB\_CB\_ADAPTIVE\_THRESH );
5444 \cvarg{image}{Source chessboard view; it must be 8-bit grayscale or color image}
5445 \cvarg{pattern\_size}{The number of inner corners '''per''' chessboard row and column}
5446 ( pattern\_size = cvSize(points\_per\_row,points\_per\_colum) = cvSize(columns,rows) )
5447 \cvarg{corners}{The output array of corners detected}
5448 \cvarg{corner\_count}{The output corner counter. If it is not NULL, the function stores there the number of corners found}
5449 \cvarg{flags}{Various operation flags, can be 0 or a combination of the following values:
5451 \cvarg{CV\_CALIB\_CB\_ADAPTIVE\_THRESH}{use adaptive thresholding to convert the image to black-n-white, rather than a fixed threshold level (computed from the average image brightness).}
5452 \cvarg{CV\_CALIB\_CB\_NORMALIZE\_IMAGE}{normalize the image using \cross{NormalizeHist} before applying fixed or adaptive thresholding.}
5453 \cvarg{CV\_CALIB\_CB\_FILTER\_QUADS}{use additional criteria (like contour area, perimeter, square-like shape) to filter out false quads that are extracted at the contour retrieval stage.}
5457 The function \texttt{cvFindChessboardCorners} attempts to determine
5458 whether the input image is a view of the chessboard pattern and
5459 locate internal chessboard corners. The function returns non-zero
5460 value if all the corners have been found and they have been placed
5461 in a certain order (row by row, left to right in every row),
5462 otherwise, if the function fails to find all the corners or reorder
5463 them, it returns 0. For example, a regular chessboard has 8 x 8
5464 squares and 7 x 7 internal corners, that is, points, where the black
5465 squares touch each other. The coordinates detected are approximate,
5466 and to determine their position more accurately, the user may use
5467 the function \cross{FindCornerSubPix}.
5469 \cvfunc{DrawChessBoardCorners}\label{DrawChessBoardCorners}
5471 Renders the detected chessboard corners
5474 void cvDrawChessboardCorners( \par CvArr* image,\par CvSize pattern\_size,\par CvPoint2D32f* corners,\par int count,\par int pattern\_was\_found );
5478 \cvarg{image}{The destination image; it must be 8-bit color image}
5479 \cvarg{pattern\_size}{The number of inner corners '''per''' chessboard row and column. ( pattern\_size = cvSize(points\_per\_row,points\_per\_colum) = cvSize(columns,rows) )}
5480 \cvarg{corners}{The array of corners detected}
5481 \cvarg{count}{The number of corners}
5482 \cvarg{pattern\_was\_found}{Indicates whether the complete board was found $(\ne 0)$ or not $(=0)$. One may just pass the return value \cross{FindChessboardCorners} here}
5485 The function \texttt{cvDrawChessboardCorners} draws the individual chessboard corners detected (as red circles) in case if the board was not found $(\texttt{pattern\_was\_found} =0)$ or the colored corners connected with lines when the board was found $(\texttt{pattern\_was\_found} \ne 0)$.
5488 \cvfunc{RQDecomp3x3}\label{RQDecomp3x3}
5490 Computes RQ decomposition of 3x3 matrices
5493 void cvRQDecomp3x3( \par const CvMat *matrixM,\par CvMat *matrixR,\par CvMat *matrixQ,\par CvMat *matrixQx=NULL,\par CvMat *matrixQy=NULL,\par CvMat *matrixQz=NULL,\par CvPoint3D64f *eulerAngles=NULL);
5497 \cvarg{matrixM}{The 3x3 input matrix M}
5498 \cvarg{matrixR}{The output 3x3 upper-triangular matrix R}
5499 \cvarg{matrixQ}{The output 3x3 orthogonal matrix Q}
5500 \cvarg{matrixQx}{Optional 3x3 rotation matrix around x-axis}
5501 \cvarg{matrixQy}{Optional 3x3 rotation matrix around y-axis}
5502 \cvarg{matrixQz}{Optional 3x3 rotation matrix around z-axis}
5503 \cvarg{eulerAngles}{Optional 3-point containing the three Euler angles of rotation}
5506 The function \texttt{cvRQDecomp3x3} computes a RQ decomposition using Givens rotations. This function is used in \cross{DecomposeProjectionMatrix} to decompose the left 3x3 submatrix of a projection matrix into a calibration and a rotation matrix.
5508 It optionally returns three rotation matrices, one for each axis, and the three Euler angles that could be used in OpenGL.
5511 \cvfunc{DecomposeProjectionMatrix}\label{DecomposeProjectionMatrix}
5513 Computes RQ decomposition of 3x3 matrices
5516 void cvDecomposeProjectionMatrix( \par const CvMat *projMatr,\par CvMat *calibMatr,\par CvMat *rotMatr,\par CvMat *posVect,\par CvMat *rotMatrX=NULL,\par CvMat *rotMatrY=NULL,\par CvMat *rotMatrZ=NULL,\par CvPoint3D64f *eulerAngles=NULL);
5520 \cvarg{projMatr}{The 3x4 input projection matrix P}
5521 \cvarg{calibMatr}{The output 3x3 internal calibration matrix K}
5522 \cvarg{rotMatr}{The output 3x3 external rotation matrix R}
5523 \cvarg{posVect}{The output 4x1 external homogenious position vector C}
5524 \cvarg{rotMatrX}{Optional 3x3 rotation matrix around x-axis}
5525 \cvarg{rotMatrY}{Optional 3x3 rotation matrix around y-axis}
5526 \cvarg{rotMatrZ}{Optional 3x3 rotation matrix around z-axis}
5527 \cvarg{eulerAngles}{Optional 3-point containing the three Euler angles of rotation}
5530 The function \texttt{cvDecomposeProjectionMatrix} computes a decomposition of a projection matrix into a calibration and a rotation matrix and the position of the camera.
5532 It optionally returns three rotation matrices, one for each axis, and the three Euler angles that could be used in OpenGL.
5535 \subsection{Pose Estimation}
5538 \cvfunc{CreatePOSITObject}\label{CreatePOSITObject}
5540 Initializes structure containing object information
5543 CvPOSITObject* cvCreatePOSITObject( \par CvPoint3D32f* points,\par int point\_count );
5547 \cvarg{points}{Pointer to the points of the 3D object model}
5548 \cvarg{point\_count}{Number of object points}
5551 The function \texttt{cvCreatePOSITObject} allocates memory for the object structure and computes the object inverse matrix.
5553 The preprocessed object data is stored in the structure \cross{CvPOSITObject}, internal for OpenCV, which means that the user cannot directly access the structure data. The user may only create this structure and pass its pointer to the function.
5555 Object is defined as a set of points given in a coordinate system. The function \cross{POSIT} computes a vector that begins at a camera-related coordinate system center and ends at the \texttt{points[0]} of the object.
5557 Once the work with a given object is finished, the function \cross{ReleasePOSITObject} must be called to free memory.
5559 \cvfunc{POSIT}\label{POSIT}
5561 Implements POSIT algorithm
5564 void cvPOSIT( \par CvPOSITObject* posit\_object,\par CvPoint2D32f* image\_points,\par double focal\_length,\par CvTermCriteria criteria,\par CvMatr32f rotation\_matrix,\par CvVect32f translation\_vector );
5568 \cvarg{posit\_object}{Pointer to the object structure}
5569 \cvarg{image\_points}{Pointer to the object points projections on the 2D image plane}
5570 \cvarg{focal\_length}{Focal length of the camera used}
5571 \cvarg{criteria}{Termination criteria of the iterative POSIT algorithm}
5572 \cvarg{rotation\_matrix}{Matrix of rotations}
5573 \cvarg{translation\_vector}{Translation vector}
5576 The function \texttt{cvPOSIT} implements POSIT algorithm. Image coordinates are given in a camera-related coordinate system. The focal length may be retrieved using camera calibration functions. At every iteration of the algorithm new perspective projection of estimated pose is computed.
5578 Difference norm between two projections is the maximal distance between corresponding points. The parameter \texttt{criteria.epsilon} serves to stop the algorithm if the difference is small.
5580 \cvfunc{ReleasePOSITObject}\label{ReleasePOSITObject}
5582 Deallocates 3D object structure
5585 void cvReleasePOSITObject( \par CvPOSITObject** posit\_object );
5589 \cvarg{posit\_object}{Double pointer to \texttt{CvPOSIT} structure}
5592 The function \texttt{cvReleasePOSITObject} releases memory previously allocated by the function \cross{CreatePOSITObject}.
5595 \cvfunc{CalcImageHomography}\label{CalcImageHomography}
5597 Calculates homography matrix for oblong planar object (e.g. arm)
5600 void cvCalcImageHomography( \par float* line,\par CvPoint3D32f* center,\par float* intrinsic,\par float* homography );
5604 \cvarg{line}{the main object axis direction (vector (dx,dy,dz))}
5605 \cvarg{center}{object center ((cx,cy,cz))}
5606 \cvarg{intrinsic}{intrinsic camera parameters (3x3 matrix)}
5607 \cvarg{homography}{output homography matrix (3x3)}
5610 The function \texttt{cvCalcImageHomography} calculates the homography matrix for the initial image transformation from image plane to the plane, defined by 3D oblong object line (See \_\_Figure 6-10\_\_ in OpenCV Guide 3D Reconstruction Chapter).
5613 \subsection{Epipolar Geometry}
5615 \cvfunc{FindFundamentalMat}\label{FindFundamentalMat}
5617 Calculates fundamental matrix from corresponding points in two images
5620 int cvFindFundamentalMat( \par const CvMat* points1,\par const CvMat* points2,\par CvMat* fundamental\_matrix,\par int method=CV\_FM\_RANSAC,\par double param1=1.,\par double param2=0.99,\par CvMat* status=NULL);
5624 \cvarg{points1}{Array of the first image points of \texttt{2xN, Nx2, 3xN} or \texttt{Nx3} size (where \texttt{N} is number of points). Multi-channel \texttt{1xN} or \texttt{Nx1} array is also acceptable. The point coordinates should be floating-point (single or double precision)}
5625 \cvarg{points2}{Array of the second image points of the same size and format as \texttt{points1}}
5626 \cvarg{fundamental\_matrix}{The output fundamental matrix or matrices. The size should be 3x3 or 9x3 (7-point method may return up to 3 matrices)}
5627 \cvarg{method}{Method for computing the fundamental matrix
5629 \cvarg{CV\_FM\_7POINT}{for 7-point algorithm. $N = 7$}
5630 \cvarg{CV\_FM\_8POINT}{for 8-point algorithm. $N \ge 8$}
5631 \cvarg{CV\_FM\_RANSAC}{for RANSAC algorithm. $N \ge 8$}
5632 \cvarg{CV\_FM\_LMEDS}{for LMedS algorithm. $N \ge 8$}
5634 \cvarg{param1}{The parameter is used for RANSAC or LMedS methods only. It is the maximum distance from point to epipolar line in pixels, beyond which the point is considered an outlier and is not used for computing the final fundamental matrix. Usually it is set to 0.5 or 1.0}
5635 \cvarg{param2}{The parameter is used for RANSAC or LMedS methods only. It denotes the desirable level of confidence that the matrix is correct}
5636 \cvarg{status}{The optional output array of N elements, every element of which is set to 0 for outliers and to 1 for the other points. The array is computed only in RANSAC and LMedS methods. For other methods it is set to all 1's}
5639 The epipolar geometry is described by the following equation:
5643 where $F$ is fundamental matrix, $p_1$ and $p_2$ are corresponding points in the first and the second images, respectively.
5645 The function \texttt{cvFindFundamentalMat} calculates fundamental
5646 matrix using one of four methods listed above and returns the number
5647 of fundamental matrices found (1 or 3) and 0, if no matrix is found.
5649 The calculated fundamental matrix may be passed further to
5650 \texttt{cvComputeCorrespondEpilines} that finds epipolar lines
5651 corresponding to the specified points.
5653 %===== Example. Estimation of fundamental matrix using RANSAC algorithm =====
5655 int point_count = 100;
5659 CvMat* fundamental_matrix;
5661 points1 = cvCreateMat(1,point_count,CV_32FC2);
5662 points2 = cvCreateMat(1,point_count,CV_32FC2);
5663 status = cvCreateMat(1,point_count,CV_8UC1);
5665 /* Fill the points here ... */
5666 for( i = 0; i < point_count; i++ )
5668 points1->data.fl[i*2] = <x,,1,i,,>;
5669 points1->data.fl[i*2+1] = <y,,1,i,,>;
5670 points2->data.fl[i*2] = <x,,2,i,,>;
5671 points2->data.fl[i*2+1] = <y,,2,i,,>;
5674 fundamental_matrix = cvCreateMat(3,3,CV_32FC1);
5675 int fm_count = cvFindFundamentalMat( points1,points2,fundamental_matrix,
5676 CV_FM_RANSAC,1.0,0.99,status );
5679 \cvfunc{ComputeCorrespondEpilines}\label{ComputeCorrespondEpilines}
5681 For points in one image of stereo pair computes the corresponding epilines in the other image
5684 void cvComputeCorrespondEpilines( \par const CvMat* points,\par int which\_image,\par const CvMat* fundamental\_matrix,\par CvMat* correspondent\_lines);
5688 \cvarg{points}{The input points. \texttt{2xN, Nx2, 3xN} or \texttt{Nx3} array (where \texttt{N} number of points). Multi-channel \texttt{1xN} or \texttt{Nx1} array is also acceptable}
5689 \cvarg{which\_image}{Index of the image (1 or 2) that contains the \texttt{points}}
5690 \cvarg{fundamental\_matrix}{Fundamental matrix}
5691 \cvarg{correspondent\_lines}{Computed epilines, \texttt{3xN} or \texttt{Nx3} array}
5694 For every point in one of the two images of stereo-pair the function
5695 \texttt{ComputeCorrespondEpilines} finds equation of a line that
5696 contains the corresponding point (i.e. projection of the same 3D
5697 point) in the other image. Each line is encoded by a vector of 3
5698 elements $l = \vecthree{a}{b}{c}$ so that:
5700 \[ l^T \vecthree{x}{y}{1} = 0 \]
5702 \[ a x + b y + c = 0 \]
5704 From the fundamental matrix definition (see \cross{FindFundamentalMatrix}
5705 discussion), line $l_1$ for a point $p_1$ in the first image
5706 $(\texttt{which\_image} =1)$ can be computed as:
5710 and the line $l_1$ for a point $p_2$ in the second image $(\texttt{which\_image} =1)$ can be computed as:
5714 Line coefficients are defined up to a scale. They are normalized $(a^2+b^2=1)$ are stored into \texttt{correspondent\_lines}.
5716 \cvfunc{ConvertPointsHomogenious}\label{ConvertPointsHomogenious}
5718 Convert points to/from homogenious coordinates
5721 void cvConvertPointsHomogenious( \par const CvMat* src,\par CvMat* dst );
5725 \cvarg{src}{The input point array, \texttt{2xN, Nx2, 3xN, Nx3, 4xN or Nx4 (where \texttt{N} is the number of points)}. Multi-channel \texttt{1xN} or \texttt{Nx1} array is also acceptable}
5726 \cvarg{dst}{The output point array, must contain the same number of points as the input; The dimensionality must be the same, 1 less or 1 more than the input, and also within 2..4}
5729 The function \texttt{cvConvertPointsHomogenious} converts 2D or 3D points from/to homogenious coordinates, or simply copies or transposes the array. In case if the input array dimensionality is larger than the output, each point coordinates are divided by the last coordinate:
5733 (x,y[,z],w) -> (x',y'[,z'])\\
5737 z' = z/w \quad \text{(if output is 3D)}
5741 If the output array dimensionality is larger, an extra 1 is appended to each point. Otherwise, the input array is simply copied (with optional tranposition) to the output.
5743 \textbf{Note} because the function accepts a large variety of array layouts, it may report an error when input/output array dimensionality is ambiguous. It is always safe to use the function with number of points $\texttt{N} \ge 5$, or to use multi-channel \texttt{Nx1} or \texttt{1xN} arrays.
5745 % XXX missing: CvStereoBMState
5746 % XXX missing: CreateStereoBMState
5747 % XXX missing: ReleaseStereoBMState
5748 % XXX missing: FindStereoCorrespondenceBM
5749 % XXX missing: CvStereoGCState
5750 % XXX missing: CreateStereoGCState
5751 % XXX missing: ReleaseStereoGCState
5752 % XXX missing: FindStereoCorrespondenceGC
5753 % XXX missing: ReprojectImageTo3D
5755 \section{Bibliography}
5757 This bibliography provides a list of publications that were might be useful to the OpenCV users. This list is not complete; it serves only as a starting point.
5759 1. '''[Borgefors86]''' Gunilla Borgefors, "Distance Transformations in Digital Images". Computer Vision, Graphics and Image Processing 34, 344-371 (1986).
5760 1. '''[Bouguet00]''' Jean-Yves Bouguet. Pyramidal Implementation of the Lucas Kanade Feature Tracker.<<BR>> The paper is included into OpenCV distribution ([[attachment:algo\_tracking.pdf]])
5761 1. '''[Bradski98]''' G.R. Bradski. Computer vision face tracking as a component of a perceptual user interface. In Workshop on Applications of Computer Vision, pages 214?219, Princeton, NJ, Oct. 1998.<<BR>> Updated version can be found at http://www.intel.com/technology/itj/q21998/articles/art\_2.htm.<<BR>> Also, it is included into OpenCV distribution ([[attachment:camshift.pdf]])
5762 1. '''[Bradski00]''' G. Bradski and J. Davis. Motion Segmentation and Pose Recognition with Motion History Gradients. IEEE WACV'00, 2000.
5763 1. '''[Burt81]''' P. J. Burt, T. H. Hong, A. Rosenfeld. Segmentation and Estimation of Image Region Properties Through Cooperative Hierarchical Computation. IEEE Tran. On SMC, Vol. 11, N.12, 1981, pp. 802-809.
5764 1. '''[Canny86]''' J. Canny. A Computational Approach to Edge Detection, IEEE Trans. on Pattern Analysis and Machine Intelligence, 8(6), pp. 679-698 (1986).
5765 1. '''[Davis97]''' J. Davis and Bobick. The Representation and Recognition of Action Using Temporal Templates. MIT Media Lab Technical Report 402, 1997.
5766 1. '''[DeMenthon92]''' Daniel F. DeMenthon and Larry S. Davis. Model-Based Object Pose in 25 Lines of Code. In Proceedings of ECCV '92, pp. 335-343, 1992.
5767 1. '''[Fitzgibbon95]''' Andrew W. Fitzgibbon, R.B.Fisher. A Buyer?s Guide to Conic Fitting. Proc.5th British Machine Vision Conference, Birmingham, pp. 513-522, 1995.
5768 1. '''[Ford98]''' Adrian Ford, Alan Roberts. Colour Space Conversions. http://www.poynton.com/PDFs/coloureq.pdf
5769 1. '''[Horn81]''' Berthold K.P. Horn and Brian G. Schunck. Determining Optical Flow. Artificial Intelligence, 17, pp. 185-203, 1981.
5770 1. '''[Hu62]''' M. Hu. Visual Pattern Recognition by Moment Invariants, IRE Transactions on Information Theory, 8:2, pp. 179-187, 1962.
5771 1. '''[Iivarinen97]''' Jukka Iivarinen, Markus Peura, Jaakko Srel, and Ari Visa. Comparison of Combined Shape Descriptors for Irregular Objects, 8th British Machine Vision Conference, BMVC'97.<<BR>>http://www.cis.hut.fi/research/IA/paper/publications/bmvc97/bmvc97.html
5772 1. '''[Jahne97]''' B. Jahne. Digital Image Processing. Springer, New York, 1997.
5773 1. '''[Lucas81]''' Lucas, B., and Kanade, T. An Iterative Image Registration Technique with an Application to Stereo Vision, Proc. of 7th International Joint Conference on Artificial Intelligence (IJCAI), pp. 674-679.
5774 1. '''[Kass88]''' M. Kass, A. Witkin, and D. Terzopoulos. Snakes: Active Contour Models, International Journal of Computer Vision, pp. 321-331, 1988.
5775 1. '''[Lienhart02]''' Rainer Lienhart and Jochen Maydt. An Extended Set of Haar-like Features for Rapid Object Detection. IEEE ICIP 2002, Vol. 1, pp. 900-903, Sep. 2002.<<BR>> This paper, as well as the extended technical report, can be retrieved at http://www.lienhart.de/Publications/publications.html
5776 1. '''[Matas98]''' J.Matas, C.Galambos, J.Kittler. Progressive Probabilistic Hough Transform. British Machine Vision Conference, 1998.
5777 1. '''[Rosenfeld73]''' A. Rosenfeld and E. Johnston. Angle Detection on Digital Curves. IEEE Trans. Computers, 22:875-878, 1973.
5778 1. '''[RubnerJan98]''' Y. Rubner. C. Tomasi, L.J. Guibas. Metrics for Distributions with Applications to Image Databases. Proceedings of the 1998 IEEE International Conference on Computer Vision, Bombay, India, January 1998, pp. 59-66.
5779 1. '''[RubnerSept98]''' Y. Rubner. C. Tomasi, L.J. Guibas. The Earth Mover?s Distance as a Metric for Image Retrieval. Technical Report STAN-CS-TN-98-86, Department of Computer Science, Stanford University, September 1998.
5780 1. '''[RubnerOct98]''' Y. Rubner. C. Tomasi. Texture Metrics. Proceeding of the IEEE International Conference on Systems, Man, and Cybernetics, San-Diego, CA, October 1998, pp. 4601-4607. http://robotics.stanford.edu/~rubner/publications.html
5781 1. '''[Serra82]''' J. Serra. Image Analysis and Mathematical Morphology. Academic Press, 1982.
5782 1. '''[Schiele00]''' Bernt Schiele and James L. Crowley. Recognition without Correspondence Using Multidimensional Receptive Field Histograms. In International Journal of Computer Vision 36 (1), pp. 31-50, January 2000.
5783 1. '''[Suzuki85]''' S. Suzuki, K. Abe. Topological Structural Analysis of Digital Binary Images by Border Following. CVGIP, v.30, n.1. 1985, pp. 32-46.
5784 1. '''[Teh89]''' C.H. Teh, R.T. Chin. On the Detection of Dominant Points on Digital Curves. - IEEE Tr. PAMI, 1989, v.11, No.8, p. 859-872.
5785 1. '''[Trucco98]''' Emanuele Trucco, Alessandro Verri. Introductory Techniques for 3-D Computer Vision. Prentice Hall, Inc., 1998.
5786 1. '''[Viola01]''' Paul Viola and Michael J. Jones. Rapid Object Detection using a Boosted Cascade of Simple Features. IEEE CVPR, 2001.<<BR>> The paper is available online at http://www.ai.mit.edu/people/viola/
5787 1. '''[Welch95]''' Greg Welch, Gary Bishop. An Introduction To the Kalman Filter. Technical Report TR95-041, University of North Carolina at Chapel Hill, 1995.<<BR>> Online version is available at http://www.cs.unc.edu/~welch/kalman/kalmanIntro.html
5788 1. '''[Williams92]''' D. J. Williams and M. Shah. A Fast Algorithm for Active Contours and Curvature Estimation. CVGIP: Image Understanding, Vol. 55, No. 1, pp. 14-26, Jan., 1992. http://www.cs.ucf.edu/~vision/papers/shah/92/WIS92A.pdf.
5789 1. '''[Yuen03]''' H.K. Yuen, J. Princen, J. Illingworth and J. Kittler. Comparative study of Hough Transform methods for circle finding.<<BR>>http://www.sciencedirect.com/science/article/B6V09-48TCV4N-5Y/2/91f551d124777f7a4cf7b18325235673
5790 1. '''[Yuille89]''' A.Y.Yuille, D.S.Cohen, and P.W.Hallinan. Feature Extraction from Faces Using Deformable Templates in CVPR, pp. 104-109, 1989.
5791 1. '''[Zhang96]''' Z. Zhang. Parameter Estimation Techniques: A Tutorial with Application to Conic Fitting, Image and Vision Computing Journal, 1996.
5792 1. '''[Zhang99]''' Z. Zhang. Flexible Camera Calibration By Viewing a Plane From Unknown Orientations. International Conference on Computer Vision (ICCV'99), Corfu, Greece, pages 666-673, September 1999.
5793 1. '''[Zhang00]''' Z. Zhang. A Flexible New Technique for Camera Calibration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(11):1330-1334, 2000.