Jan Kiszka [Mon, 13 Jun 2016 09:47:22 +0000 (11:47 +0200)]
arm: Use asm-defines.h for struct per_cpu members
Port the logic over from x86 and also drop CHECK_ASSUMPTION here.
The only slightly ugly detail: the PERCPU_SIZE_SHIFT define is now
duplicated in both asm/percpu.h instances because there is no good
generic header yet to hold it. Can be cleaned up later on.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
cyng93 [Sat, 11 Jun 2016 00:40:46 +0000 (08:40 +0800)]
Documentation: More BananaPi documentation
This patch includes more details about how to set up Jailhouse on a BananaPi-M1 board.
The documentation covers:
1. Installation of Bananian (the official BananaPi OS) on BananaPi
2. Modifying U-boot configuration on BananaPi to run Jailhouse.
3. Updating Bananian to a newer kernel so that Jailhouse works.
- Compiling Kernel.
- Installing Kernel.
4. Installing Jailhouse on BananaPi.
5. Simple demo/test: Running Jailhouse with Freertos-cell on BananaPi.
Signed-off-by: CHING-YI NG <cyng93@gmail.com>
[Jan: removed external media link showing FUSE selection - not needed]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 11 Jun 2016 17:00:13 +0000 (19:00 +0200)]
x86: Add missing include to amd_iommu.h
Reported by header-check script: We need this in the header due to the
use of struct jailhouse_memory. Consequently, we can remove the include
from the corresponding amd_iommu.c.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Xuguo Wang [Tue, 31 May 2016 03:44:16 +0000 (11:44 +0800)]
inmates/lib: cmdline.c
There is no point in checking for *p == 0 in the while loop.
After skipping over the blanks, check for the parameter: if it is
found, return true; otherwise continue with the next parameter.
On reaching the end of the cmdline, return false.
Signed-off-by: Xuguo Wang <huddy1985@gmail.com>
[Jan: also removed curly braces]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Georg Schiesser [Fri, 20 May 2016 18:07:06 +0000 (20:07 +0200)]
tools: fix missing hardware-check after make install
Add the new hardware-check script to the HELPERS, such that
"make install" will install it properly, just like the other
scripts, into: $DESTDIR/usr/local/libexec/jailhouse/
Signed-off-by: Georg Schiesser <georg.schiesser@opentech.at>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 18 May 2016 23:17:13 +0000 (01:17 +0200)]
tools: Add hardware feature check
The hypervisor itself is not very helpful when it comes to analyzing
feature deficits of the target platform. This adds another extension
script to the jailhouse command which checks the hardware using the
same key criteria that the hypervisor itself applies.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Georg Schiesser [Tue, 10 May 2016 21:40:10 +0000 (23:40 +0200)]
tools: fix gcc sign-compare warnings
Cosmetic change to avoid multiple gcc sign-compare warnings between
signed int argc and unsigned int arg_num, both being small and
non-negative. Alternatively, we could use unsigned int argc or
disable the warning with gcc -Wno-sign-compare.
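The warning class can be illustrated with a minimal sketch. The helper name has_arg is hypothetical and not taken from the tools sources; it only shows why the comparison warns and how an explicit cast, valid because both values are known to be non-negative, avoids it.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: returns true if argument index arg_num exists.
 * argc is signed (as passed to main), arg_num is unsigned.
 */
static bool has_arg(int argc, unsigned int arg_num)
{
	/*
	 * Writing "arg_num < argc" warns under -Wsign-compare because argc
	 * is implicitly converted to unsigned. Guard against negative
	 * values first, then cast explicitly.
	 */
	return argc > 0 && arg_num < (unsigned int)argc;
}
```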
Signed-off-by: Georg Schiesser <georg.schiesser@opentech.at>
[Jan: reordered lines for visual pleasure]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Georg Schiesser [Tue, 10 May 2016 21:39:42 +0000 (23:39 +0200)]
configs: fixed typo in e1000-demo pio_bitmap
The pio_bitmap initialization incorrectly assigns overlapping ranges to
different values, similar to commit 886ca63f. As Jan pointed out:
"Fortunately, it was harmless because succeeding initializations
overwrote this exceeding one."
see also: https://gcc.gnu.org/onlinedocs/gcc/Designated-Inits.html
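A minimal sketch of the effect, using an illustrative array rather than the real e1000-demo bitmap: with GNU C range designators, a later initializer silently overrides an earlier one where the ranges overlap, which is why the typo was harmless.

```c
#include <stdint.h>

/*
 * Illustrative bitmap, not the actual e1000-demo configuration.
 * Range designators ("[a ... b] = v") are a GNU C extension.
 */
static const uint8_t pio_bitmap_example[8] = {
	[0 ... 5] = 0xff,	/* too wide: also covers indices 4 and 5 */
	[4 ... 7] = 0x00,	/* the later initializer wins for 4 and 5 */
};
```

Building with -Woverride-init (part of -Wextra in GCC) reports such overlaps at compile time.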
Signed-off-by: Georg Schiesser <georg.schiesser@opentech.at>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 7 May 2016 17:02:28 +0000 (19:02 +0200)]
x86: Block DMA from unlisted devices in AMD IOMMU
Invalid device table entries in the AMD IOMMU mean that those devices
are actually allowed to perform DMA requests and issue interrupts. We
have to avoid this case because only devices listed in a config are
permitted to do so. We already achieve this effect when removing an
existing device from the table, but we have to ensure it also for any
unlisted device.
Devices with IDs not covered by any table are blocked by the IOMMU, see
AMD I/O Virtualization Technology spec, 2.2.2.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Mon, 9 May 2016 17:55:45 +0000 (19:55 +0200)]
x86: Use safer pattern with AMD IOMMU to block DMA requests
The AMD IOMMU spec is not 100% clear if a device table entry with V=1
but TV=0 implies that DMA requests from that device are blocked. Play
safe and use the pattern that Linux uses as well: TV=1, Mode=0 and IW as
well as IR cleared.
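The blocking pattern can be sketched as follows. The bit positions are assumed from the device table entry layout used by the Linux amd_iommu driver and should be checked against the AMD IOMMU specification; the EX_ macro names and the function name are illustrative, not Jailhouse identifiers.

```c
#include <stdint.h>

/* Assumed DTE bit positions (first 64-bit word of the entry). */
#define EX_DTE_VALID			(1ULL << 0)	/* V */
#define EX_DTE_TRANSLATION_VALID	(1ULL << 1)	/* TV */
#define EX_DTE_MODE_SHIFT		9
#define EX_DTE_MODE_MASK		(0x7ULL << EX_DTE_MODE_SHIFT)
#define EX_DTE_IR			(1ULL << 61)	/* read permission */
#define EX_DTE_IW			(1ULL << 62)	/* write permission */

/* Build an entry that blocks DMA: V=1, TV=1, Mode=0, IR=0, IW=0. */
static uint64_t ex_dte_block_dma(void)
{
	return EX_DTE_VALID | EX_DTE_TRANSLATION_VALID;
}
```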
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
driver: fix unsigned long overflow in leave_hypervisor
When shutting down the hypervisor, in the leave_hypervisor
function, the Linux driver touches every hypervisor page, to
ensure all pages are mapped. However, the current implementation
assumes hv_core_and_percpu_size is aligned to PAGE_SIZE. This may
not be the case, if PAGE_SIZE is different on the hypervisor side.
This can cause an unsigned long overflow, leading to an infinite
loop of touching successive pages starting from hypervisor_mem.
The loop will be broken as soon as Linux tries to touch an invalid
page, leading to a kernel crash.
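The failure mode can be reproduced with a small sketch. The loop shape is an assumption made for illustration and not the driver's actual code; max_pages exists only so that the demonstration terminates.

```c
#include <stddef.h>

/*
 * Buggy pattern: subtracting page_size from an unsigned size that is
 * not a multiple of page_size wraps around instead of reaching 0.
 */
static size_t touch_pages_buggy(unsigned long size, unsigned long page_size,
				size_t max_pages)
{
	size_t touched = 0;

	while (size != 0 && touched < max_pages) {
		touched++;		/* stands in for reading one page */
		size -= page_size;	/* wraps if size < page_size */
	}
	return touched;
}

/* Safe pattern: derive the page count up front, rounding up. */
static size_t touch_pages_safe(unsigned long size, unsigned long page_size)
{
	size_t pages = (size + page_size - 1) / page_size;
	size_t touched = 0;

	for (size_t n = 0; n < pages; n++)
		touched++;		/* stands in for reading one page */
	return touched;
}
```

With a size of 0x5000 and a 64K page size, the buggy loop only stops at the artificial cap, while the safe variant touches exactly one page.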
Signed-off-by: Antonios Motakis <antonios.motakis@huawei.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
core: map the zero page to the full hypervisor memory region
During initialization, in init_early, the hypervisor maps the
memory used by the hypervisor with empty pages for the root cell.
However, if the root cell tries to access the region used by the
hypervisor, this is only safe if both sides agree on PAGE_SIZE.
It is a long shot to try to guess the granularity used by the
root cell; the safest bet is to map the full range that has been
allocated for the hypervisor to use.
Signed-off-by: Antonios Motakis <antonios.motakis@huawei.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Tue, 1 Mar 2016 22:31:31 +0000 (23:31 +0100)]
x86: Unify AMD page tables for CPU and IOMMU
This exploits AMD's architecture feature that you can reuse the nested
page tables also for the IOMMU.
Both tables have the same depth (4), share the same address fields, the
valid bit - but all other bits are separate. Therefore, we need to
enhance the NPT paging handlers so that they fold both bit sets into an
entry.
The rewards are saving of several lines of code as well as a bunch of
hypervisor pages (typically some dozen).
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Add iommu_pending_faults() for amd_iommu. This looks into
Hardware Event Register first, and then loops over the event log
printing what's in it. This way, we don't miss errors that happen
when event logging is unavailable.
Signed-off-by: Valentine Sinitsyn <valentine.sinitsyn@gmail.com>
[Jan: Cleanups]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Add functions to read event logs AMD IOMMU provides and print their
contents. The latter is rather basic, but decoding all possible log
entries is hairy, so we'd better wait and collect stats which
problems occur most often.
Jan Kiszka [Wed, 15 Jul 2015 19:34:47 +0000 (00:34 +0500)]
x86: Add iommu_commit_config() for amd_iommu
Implement functions to apply configuration for an IOMMU.
In case something goes wrong, we need to trigger an NMI, which
amd_iommu_init_fault_nmi() configures.
Based on patch by Valentine Sinitsyn.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 15 Jul 2015 19:13:05 +0000 (00:13 +0500)]
x86: Add device management functions for amd_iommu
Implement iommu_add_pci_device() for amd_iommu.
Basically, this is all about filling a DTE entry. However, there is no way
to allocate device tables sparsely with the AMD IOMMU. To save some memory,
Device Table Segmentation (Revision 2.6 and up) is used whenever possible,
and this adds some infrastructure.
Based on patch by Valentine Sinitsyn.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Add basic infrastructure (heavily influenced by Linux amd_iommu driver)
to submit commands to AMD IOMMU command buffer. For now, having only
INVALIDATE_IOMMU_PAGES and COMPLETION_WAIT seems to be sufficient.
Signed-off-by: Valentine Sinitsyn <valentine.sinitsyn@gmail.com>
[Jan: Cleanups, simplification of draining]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Tue, 1 Mar 2016 06:15:38 +0000 (07:15 +0100)]
x86: Extend bit range returned by x86_64_get_flags
In order to support also the AMD IOMMU with x86_64_paging, we extend
the set of bits returned by get_flags handler. We now include all bits
ignored by the MMU, which includes the bits relevant for the AMD IOMMU.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 13 Mar 2015 17:41:02 +0000 (22:41 +0500)]
core, configs, tools: Add AMD-specific fields to struct jailhouse_iommu
For AMD, we also need to store the PCI address, capability offset and
IOMMU feature bits coming from ACPI (overwriting what the hardware
reports) in the cell configuration file.
Based on patches by Valentine Sinitsyn.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 24 Feb 2016 09:19:54 +0000 (10:19 +0100)]
x86: Filter out physical address that can't be handled by DMAR units
Make sure that we do not try to program DMAR page tables with physical
addresses beyond the supported range (39 or 48 bits, depending on the
page table levels).
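A minimal sketch of such a range check, assuming 4K pages with 9 address bits resolved per level (which yields the 39 and 48 bits named above); the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Each page table level resolves 9 address bits on top of the 12-bit
 * page offset: 3 levels -> 39 bits, 4 levels -> 48 bits.
 */
static bool ex_dmar_addr_in_range(uint64_t phys_addr, unsigned int levels)
{
	unsigned int addr_bits = 12 + 9 * levels;

	/* Any bit set at or above addr_bits is out of range. */
	return (phys_addr >> addr_bits) == 0;
}
```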
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 20 Feb 2016 18:10:22 +0000 (19:10 +0100)]
x86: Account for DMAR units with multi-page register sets
The fault reporting registers we use may be placed in a 2nd or even 3rd
page. Account for such cases by using the MMIO region size now provided
via the system config.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 20 Feb 2016 18:09:49 +0000 (19:09 +0100)]
core, configs, tools: Prepare for variable IOMMU register set sizes
Introduce a size field to struct jailhouse_iommu and fill it via the
config generator. The information can be retrieved from the ACPI tables
for AMD. On Intel, we need to study the Linux mappings, thus we need to
demand that DMAR is enabled now while retrieving system information.
Based on patches by Valentine Sinitsyn.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
For both AMD and Intel, we need to store not only base address but also
a size to map the complete MMIO region. Moreover, AMD requires a number
of PCI device parameters for the IOMMU. Introduce struct jailhouse_iommu
that will encapsulate all required data.
Based on patches by Valentine Sinitsyn.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 3 Feb 2016 17:55:23 +0000 (18:55 +0100)]
inmates: e1000-demo: Enable queues explicitly
Newer NICs require us to enable the RX and TX queue. Although they
should be on after reset, at least the I350 refuses to work otherwise.
As the related bit is harmless or even unused on older NICs, do this
unconditionally (just like ipxe does).
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Tue, 26 Jan 2016 08:27:40 +0000 (09:27 +0100)]
x86: Make debug UART port configurable via system config
We already allow to enable a VGA console via the system config, so let's
make the UART port configurable this way as well: phys_start will hold
the port, and flags must not have JAILHOUSE_MEM_IO set, in order to
differentiate us from the memory-mapped VGA console. And by leaving
phys_start at 0, we can even turn off the console now.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Mon, 25 Jan 2016 17:20:37 +0000 (18:20 +0100)]
core, driver: Pass rounded-up core size in hypervisor header
Hypervisor and root kernel may have different ideas about PAGE_SIZE.
This will cause wrong hypervisor core size calculations as seen on arm64
with 64K Linux PAGE_SIZE.
Avoid this trap by moving the round-up into the hypervisor code, passing
a ready-to-be-used size value in the header.
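To illustrate the mismatch, here is a small sketch with a hypothetical helper: the same core size rounds up to different values depending on which side's PAGE_SIZE is used, so only one side should do the rounding.

```c
/*
 * Round size up to the next multiple of page_size.
 * page_size must be a power of two.
 */
static unsigned long ex_page_align_up(unsigned long size,
				      unsigned long page_size)
{
	return (size + page_size - 1) & ~(page_size - 1);
}
```

For a core size of 0x12345 bytes, a 4K granule gives 0x13000 while a 64K granule gives 0x20000.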
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Daniel Sangorrin [Thu, 21 Jan 2016 01:31:26 +0000 (10:31 +0900)]
vga: Add support for VGA text buffer output on x86
Hypervisor messages are useful for debugging and are
typically handed out to the serial port. Unfortunately, x86
computers often lack a serial port. This patch allows
hypervisor messages to be redirected to a screen by leveraging
the traditional VGA text buffer mode.
Signed-off-by: Daniel Sangorrin <daniel.sangorrin@toshiba.co.jp>
[Jan: avoid row_line writeback in panic case, remove redundant braces]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 9 Jan 2016 06:15:59 +0000 (07:15 +0100)]
configs: Update Banana Pi configs to make use of unaligned MMIO regions
Split up the MMIO page 0x1c20000 on the Allwinner A20 into CCU,
interrupt controller, GPIOs and the timer. GPIOs are further broken up
to allow assigning port H to the gic-demo cell, along with the CCU (to
control the UART timing).
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 8 Jan 2016 18:18:34 +0000 (19:18 +0100)]
core: Add support for sub-page MMIO regions
This allows to specify memory regions for MMIO accesses that do not
start or end on page boundaries. Instead of mapping full pages into the
cell, sub-page MMIO requires to intercept the page accesses, validate
all parameters against the target memory region and then perform the
access in hypervisor context, provided the validation was successful.
As the access can now fail in hypervisor context, we need to be more
picky: besides read/write permissions, alignment and access widths can
be checked additionally. These attributes are specified via the
JAILHOUSE_MEM_IO_* flags.
Sub-page MMIO is surely not a fast path. It not only requires world
switches between cell and hypervisor, the current implementation also
uses dynamic mappings. This is easier to implement than a static mapping
scheme, but surely not faster. We may revisit this design later on,
ideally towards a 1:1 mapping scheme.
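The validation step described above can be sketched as follows. The EX_ flag values, the struct and the function are hypothetical stand-ins modelled on the idea of the JAILHOUSE_MEM_IO_* flags; they are not the hypervisor's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values for illustration only. */
#define EX_MEM_READ		0x01
#define EX_MEM_WRITE		0x02
#define EX_MEM_IO_UNALIGNED	0x04
#define EX_MEM_IO_8		0x10
#define EX_MEM_IO_16		0x20
#define EX_MEM_IO_32		0x40

struct ex_subpage_region {
	uint64_t start;		/* need not be page-aligned */
	uint64_t size;		/* need not be a page multiple */
	uint32_t flags;
};

static bool ex_access_allowed(const struct ex_subpage_region *r,
			      uint64_t addr, unsigned int width,
			      bool is_write)
{
	uint32_t width_flag;

	switch (width) {
	case 1: width_flag = EX_MEM_IO_8; break;
	case 2: width_flag = EX_MEM_IO_16; break;
	case 4: width_flag = EX_MEM_IO_32; break;
	default: return false;
	}

	/* The access must lie fully inside the region. */
	if (addr < r->start || width > r->size ||
	    addr - r->start > r->size - width)
		return false;
	/* Read/write permission. */
	if (!(r->flags & (is_write ? EX_MEM_WRITE : EX_MEM_READ)))
		return false;
	/* Permitted access width. */
	if (!(r->flags & width_flag))
		return false;
	/* Natural alignment, unless unaligned accesses are allowed. */
	if ((addr & (width - 1)) && !(r->flags & EX_MEM_IO_UNALIGNED))
		return false;

	return true;
}
```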
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Thu, 7 Jan 2016 17:21:55 +0000 (18:21 +0100)]
core: Remove memory regions check
Most of the checks will be removed when adding sub-page memory region
support. We rather need some offline validation outside the hypervisor
eventually.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Thu, 7 Jan 2016 17:10:20 +0000 (18:10 +0100)]
arm: Remove useless warning from arm_mmio_perform_access
This function is only called with size 1, 2 or 4. This is ensured by
arch_handle_dabt, the only (indirect) caller, which generates the size
accordingly (1 << sas) and filters out sizes > 4.
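A minimal sketch of that size derivation, with a hypothetical function name: the size is computed from the syndrome access size field, and anything above 4 bytes is rejected before the MMIO access is performed.

```c
/*
 * Returns the access size in bytes for a given SAS field value,
 * or 0 if the size is not handled (larger than 4 bytes).
 */
static unsigned int ex_dabt_access_size(unsigned int sas)
{
	unsigned int size = 1U << sas;

	return size > 4 ? 0 : size;
}
```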
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Tue, 11 Aug 2015 07:20:41 +0000 (09:20 +0200)]
configs: Add cache region to x86 demo cells
Assuming we have more than 4 units of L3 cache on systems that support
L3 partitioning, assign the first 2 units (e.g. 2 MB on a Xeon D 1540)
to apic-demo, the 3rd to tiny-demo. Also the non-root Linux config gets
the first 2 units (it cannot run in parallel to the other demos). All
this is for testing the management logic and will later be used to
benchmark the partitioning.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Tue, 11 Aug 2015 07:05:24 +0000 (09:05 +0200)]
x86: Introduce Cache Allocation Technology support for Intel CPUs
CAT is a CPU feature first added to Xeon D and certain Xeon E5 v3
processors. It so far allows to specify access restrictions to the L3
cache, including complete isolation between different entities.
This adds CAT control to Jailhouse on a per-cell level. The user is free
to specify a contiguous access mask for each cell, use that mask
exclusively (typical case), share any overlaps with the root cell
(JAILHOUSE_CACHE_ROOTSHARED), or simply use the root cell mask. If
nothing else is specified, the root cell uses the full cache (until
non-root cells shrink it).
Due to the hardware-induced requirement to have a contiguous bitmask,
shrinking the root mask on cell creation and extending it again on
destruction is not trivial. Not at all.
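The contiguity requirement itself is easy to express. The following sketch uses a hypothetical helper name and is not taken from the Jailhouse sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if mask consists of exactly one run of adjacent 1 bits. */
static bool ex_mask_is_contiguous(uint64_t mask)
{
	uint64_t lowest, sum;

	if (mask == 0)
		return false;

	lowest = mask & -mask;	/* isolate the lowest set bit */
	sum = mask + lowest;	/* the carry ripples through the lowest run */

	/* If any mask bit survives, there was a second run above a gap. */
	return (sum & mask) == 0;
}
```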
When creating a new cell, we may punch a hole into the root mask. In
that case, we also remove the lower half from the root mask and
accumulate those bits in a "freed mask" for reuse once the hole closes
again. And if we are unlucky, adding a cell empties the current root
mask. Then we have to look into the freed mask and switch to it if it's
non-empty.
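The hole-punching step can be sketched as below. This is a simplified illustration under stated assumptions: the struct and function names are hypothetical, cell_mask is assumed to be non-zero and contiguous, and the case where the root mask becomes empty and the freed mask takes over is not covered.

```c
#include <stdint.h>

struct ex_cat_split {
	uint64_t root;	/* new root cell mask */
	uint64_t freed;	/* bits parked for later reuse */
};

static struct ex_cat_split ex_punch_hole(uint64_t root_mask,
					 uint64_t cell_mask)
{
	struct ex_cat_split s;
	uint64_t remaining = root_mask & ~cell_mask;
	/* Root bits located below the lowest bit of the new cell. */
	uint64_t below = remaining & ((cell_mask & -cell_mask) - 1);
	uint64_t above = remaining & ~below;

	if (below && above) {
		/* A hole was punched: keep the upper part, park the lower. */
		s.root = above;
		s.freed = below;
	} else {
		/* The cell took bits from one end, the rest stays contiguous. */
		s.root = remaining;
		s.freed = 0;
	}
	return s;
}
```

For a root mask of 0xff and a cell mask of 0x18, the root cell keeps 0xe0 and 0x07 goes to the freed mask.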
When restoring the root mask on cell destruction, we choose a simple
algorithm that first collects all released bits in the freed mask, then
tries to merge that mask bit-wise with the current root cell mask. On
success we restart the freed mask walk to ensure that all contiguous
bits are merged.
One may wonder why we do not reallocate masks completely dynamically and
automatically on each reconfiguration, instead of requiring explicit
allocation via the config. The reason is that we do not want to
invalidate cache allocations of those cells that are not involved in a
reconfiguration.
A lot of complication for a mechanism that looked so simple at
first sight. Let's just hope that there is a noteworthy benefit in
restricting CAT bitmasks in hardware this way.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Tue, 11 Aug 2015 06:58:38 +0000 (08:58 +0200)]
core, tools: Introduce cache regions to the cell configuration
Allow to specify regions of caches so that the hypervisor can partition
their usage accordingly whenever the hardware supports this.
The specification of their start location and sizes depends on the
architecture-specific partitioning support. So far, only L3 cache types
are definable, either as unified cache or further partitioned into code
and data (to cater for Intel's CAT and CDP). As with memory regions, caches
are usually taken from the root cell on non-root cell creation, but they
can also be declared as shared with the root cell.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 10 Jan 2016 08:42:43 +0000 (09:42 +0100)]
inmates: arm: Make LED blinking in gic-demo optional
This is both a test/demo case for command line parsing on ARM and a
feature to control the LED signal in the gic-demo on Banana Pi. The
green LED will now only blink if "blinking_led" is specified as inmate
command line option.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Mon, 4 Jan 2016 10:31:49 +0000 (11:31 +0100)]
inmates: x86: Add optional cache pollution to apic-demo
When "pollute_cache" is specified as command line parameter of the
apic-demo, the demo will fill each cache line with a pattern in each
measurement loop. Up to 512 KB of cache can be polluted this way.
This allows to test L3 cache partitioning features of recent Intel CPUs:
The cache pollution will dirty the L1 and L2 data caches so that the
next loop iteration will access L3. If that cache is shared, latencies
will rise as other cells use the cache as well.
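The pollution loop can be sketched roughly as follows. The 64-byte line size, the buffer and the function name are assumptions made for illustration; the sketch writes one byte per cache line, which is enough to dirty each line, and is not the apic-demo's actual code.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_CACHE_LINE_SIZE	64		/* assumed line size */
#define EX_POLLUTE_SIZE		(512 * 1024)	/* 512 KB, as stated above */

static uint8_t ex_pollute_buf[EX_POLLUTE_SIZE];

/* Dirty every cache line of the buffer; returns the number of lines. */
static size_t ex_pollute_cache(uint8_t pattern)
{
	/* volatile keeps the compiler from dropping the stores */
	volatile uint8_t *buf = ex_pollute_buf;
	size_t lines = 0;

	for (size_t off = 0; off < EX_POLLUTE_SIZE; off += EX_CACHE_LINE_SIZE) {
		buf[off] = pattern;
		lines++;
	}
	return lines;
}
```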
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>