<para>Specify if you want to do full cache simulation. By default,
only instruction read accesses will be counted ("Ir").
With cache simulation, further event counters are enabled:
- Cache misses on instruction reads ("I1mr"/"I2mr"),
- data read accesses ("Dr") and related cache misses ("D1mr"/"D2mr"),
- data write accesses ("Dw") and related cache misses ("D1mw"/"D2mw").
+ Cache misses on instruction reads ("I1mr"/"ILmr"),
+ data read accesses ("Dr") and related cache misses ("D1mr"/"DLmr"),
+ data write accesses ("Dw") and related cache misses ("D1mw"/"DLmw").
For more information, see <xref linkend="cg-manual"/>.
</para>
</listitem>
</term>
<listitem>
<para>Specify whether write-back behavior should be simulated, allowing
- to distinguish L2 caches misses with and without write backs.
+ to distinguish LL cache misses with and without write backs.
The cache model of Cachegrind/Callgrind does not specify write-through
vs. write-back behavior, and this also is not relevant for the number
of generated miss counts. However, with explicit write-back simulation
it can be decided whether a miss triggers not only the loading of a new
cache line, but also if a write back of a dirty cache line had to take
- place before. The new dirty miss events are I2dmr, D2dmr, and D2dmw,
+ place before. The new dirty miss events are ILdmr, DLdmr, and DLdmw,
for misses because of instruction read, data read, and data write,
respectively. As they produce two memory transactions, they should
account for a doubled time estimation in relation to a normal miss.
bad access behavior). The new counters are defined in a way such
that worse behavior results in higher cost.
AcCost1 and AcCost2 are counters showing bad temporal locality
- for L1 and L2 caches, respectively. This is done by summing up
+ for L1 and LL caches, respectively. This is done by summing up
reciprocal values of the numbers of accesses of each cache line,
multiplied by 1000 (as only integer costs are allowed). E.g. for
a given source line with 5 read accesses, a value of 5000 AcCost
means that for every access, a new cache line was loaded and directly
evicted afterwards without further accesses. Similarly, SpLoss1/2
- shows bad spatial locality for L1 and L2 caches, respectively. It
+ shows bad spatial locality for L1 and LL caches, respectively. It
gives the <emphasis>spatial loss</emphasis> count of bytes which
were loaded into cache but never accessed. It pinpoints at code
accessing data in a way such that cache space is wasted. This hints
</listitem>
</varlistentry>
- <varlistentry id="opt.L2" xreflabel="--L2">
+ <varlistentry id="opt.LL" xreflabel="--LL">
<term>
- <option><![CDATA[--L2=<size>,<associativity>,<line size> ]]></option>
+ <option><![CDATA[--LL=<size>,<associativity>,<line size> ]]></option>
</term>
<listitem>
- <para>Specify the size, associativity and line size of the level 2
+ <para>Specify the size, associativity and line size of the last-level
cache.</para>
</listitem>
</varlistentry>
</sect1>
+<sect1 id="cl-manual.monitor-commands" xreflabel="Callgrind Monitor Commands">
+<title>Callgrind Monitor Commands</title>
+<para>The Callgrind tool provides monitor commands handled by the Valgrind
+gdbserver (see <xref linkend="manual-core.gdbserver-commandhandling"/>).
+</para>
+
+<itemizedlist>
+ <listitem>
+ <para><varname>ct.dump [&lt;dump_hint&gt;]</varname> requests a dump of the
+ profile data.</para>
+ </listitem>
+
+ <listitem>
+ <para><varname>ct.zero</varname> resets the profile data
+ counters to zero.</para>
+ </listitem>
+
+ <listitem>
+ <para>It would be nice to have more Callgrind monitor
+ commands, e.g. to toggle collection and to start instrumentation.
+ </para>
+ </listitem>
+
+</itemizedlist>
+</sect1>
+
<sect1 id="cl-manual.clientrequests" xreflabel="Client request reference">
<title>Callgrind specific client requests</title>