rtime.felk.cvut.cz Git - jailhouse.git/log

arm: Switch to generic UART mapping

Start using the generic UART mapping by the Linux driver. For this the
VExpress config has to gain physical base and size information of the
debug UART.

This removes the tedious need to adjust UART_BASE_VIRT in platform.h
according to the Linux configuration.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

core/driver: Add support for mapping the debug UART from the driver

If the debug UART is memory-mapped, we can only access it prior to
switching to hypervisor mappings if the driver supports us in this. By
adding a debug_uart memory region to the system configuration, we tell
the driver about the mapping need. In turn, the driver reports the
virtual address via an additional header field. The mapping can be
released on the Linux side right after enabling the hypervisor.

Provided the virtual address of the UART mapping as chosen by Linux does
not conflict with our remapping region, this mapping can safely be
replicated into the hypervisor address space so that we don't need to
adjust the UART access after enabling our own mapping.
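
The conflict condition above boils down to a range-overlap test. A minimal
sketch, assuming a hypothetical remapping window (REMAP_BASE and REMAP_SIZE
are made-up values, and both helper names are illustrative, not the ones used
in the tree):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bounds of the hypervisor's remapping region (assumed values). */
#define REMAP_BASE 0x00100000UL
#define REMAP_SIZE 0x00100000UL

/*
 * Return true if [a, a + a_size) and [b, b + b_size) share at least one
 * byte. Address wrap-around is not handled in this sketch.
 */
bool ranges_overlap(uintptr_t a, uintptr_t a_size,
                    uintptr_t b, uintptr_t b_size)
{
    return a < b + b_size && b < a + a_size;
}

/*
 * The Linux-chosen UART mapping may be replicated into the hypervisor
 * address space only if it stays clear of the remapping window.
 */
bool uart_mapping_is_replicable(uintptr_t uart_virt, uintptr_t uart_size)
{
    return !ranges_overlap(uart_virt, uart_size, REMAP_BASE, REMAP_SIZE);
}
```

A caller would presumably refuse to use the driver-provided mapping when
uart_mapping_is_replicable() returns false.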

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

core: Redefine PAGE_FLAG_UNCACHED to PAGE_FLAG_DEVICE

All (x86) users of this page flag map devices into the hypervisor
address space. We will do the same for ARM when mapping the debug UART.
For this we need a generic flag with the same semantics. As uncached is
different from device mappings, redefine the semantics of the UNCACHED
flag for this purpose.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: Remove unused phys_base from uart_chip

This field is write-only, the UART driver is only interested in the
virtual address.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

core: Refactor arch-specific section definition

Require all archs to define ARCH_SECTIONS via asm/section.h, at least an
empty one. Include this unconditionally in the hypervisor layout.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: Stop misusing JAILHOUSE_MEM_DMA for marking MMIO

Introduce JAILHOUSE_MEM_IO so that archs that need to tag MMIO regions
have a proper flag. Apply it on ARM.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: Adjust UART_BASE_VIRT according to local test configuration

It's almost pointless to tune this constant as it is highly dependent on
the local kernel config. However, this one helps local testing until we
have a better solution for getting the UART mapped for the hypervisor
during early setup.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

inmates: arm: Fix and improve build

Introduce and use DECLARE_TARGETS just like x86 does. This prevents
unconditional rebuilding of the inmates on every make. Also move the
filtering of "-include asm/unified.h" into reusable Makefile.lib.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

configs: Adjust VExpress configs for smaller reservations

Move the hypervisor to the top of the 2G memory. The gic and uart demo cells
are placed below, each given much less space than before - they don't
need more.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

configs: Add NIC and MMC to VExpress config

These two come at least with the Fast Model of the Versatile Express.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: Improve output on fatal traps

Dump the exception class but drop the CPU ID - that one will be printed
anyway by panic_park.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: config examples

These config files are used to run the root cell (vexpress.c) and the
inmates examples (vexpress-*-demo) on the vexpress platform.
vexpress-linux-demo can be used to create a cell for an SMP linux on CPUs
2 and 3, by assigning it the UART3 IRQ.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: add copyright headers]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: basic inmates demos

This patch adds the necessary libraries for writing simple inmates on arm,
and provides two demos.
It attempts to use the same layout as x86, and allows setting the device
base addresses with platform-specific includes, in order to avoid
including kconfig.h.

Only the vexpress platform with GICv2 and GICv3 is supported for the
moment.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: Use config.mk instead of kbuild, use mmio accessors,
avoid integer overflow, add copyright headers]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: exit statistics

This patch implements the counters that report the number of VM exits,
accessible by the driver. It also adds three statistics for the ARM
side: the number of IRQs injected, the number of IPIs injected, and the
number of GIC maintenance IRQs received.
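
A minimal sketch of such per-CPU counters, assuming hypothetical names (the
enum values and struct cpu_stats below are illustrative, not the identifiers
used by the patch):

```c
#include <stdint.h>

/* Hypothetical counter indices; the last three mirror the ARM statistics. */
enum stat_id {
    STAT_VMEXITS_TOTAL,
    STAT_IRQS_INJECTED,
    STAT_IPIS_INJECTED,
    STAT_MAINTENANCE_IRQS,
    STAT_COUNT
};

/* One instance per CPU, so no locking is needed to update it. */
struct cpu_stats {
    uint32_t counter[STAT_COUNT];
};

/* Bump a single counter. */
void stats_bump(struct cpu_stats *s, enum stat_id id)
{
    s->counter[id]++;
}

/* Read a counter, e.g. on behalf of the driver. */
uint32_t stats_read(const struct cpu_stats *s, enum stat_id id)
{
    return s->counter[id];
}
```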

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: GICv2: handle SPI routing

GICv2 is limited to 8 CPUs and uses independent routing bits, whereas
GICv3 (with ARE enabled) uses the MPIDR encoding (aff3.aff2.aff1.aff0)
for routing SPIs.
Before handling SPIs, the GICv2 backend has to probe its banked view of
the distributor to know which CPU interface it is accessing. After that,
the implementation is roughly the same as for GICv3, but the
GICD_ITARGETSR registers are used instead of IROUTER.
Because the guest isn't supposed to rely on the CPU interface number
being coherent with the CPU logical ID, we don't have to translate it to
a virtual ID before handling route accesses inside SMP cells.
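
For illustration, GICD_ITARGETSR packs four 8-bit CPU target fields into each
32-bit register, starting at distributor offset 0x800. A minimal sketch of
the field arithmetic follows; the helper names and the idea of filtering
against a per-cell CPU interface mask are assumptions, not the patch's code:

```c
#include <stdint.h>

#define GICD_ITARGETSR_BASE 0x800u

/* Byte offset of the 32-bit GICD_ITARGETSRn register that holds `irq`. */
uint32_t itargetsr_offset(unsigned int irq)
{
    return GICD_ITARGETSR_BASE + (irq / 4) * 4;
}

/* Bit position of the 8-bit target field for `irq` in that register. */
unsigned int itargetsr_shift(unsigned int irq)
{
    return (irq % 4) * 8;
}

/*
 * Replace the target field of `irq` inside `reg`, keeping only the CPU
 * interfaces that belong to the cell (cell_cpu_mask is hypothetical).
 */
uint32_t itargetsr_update(uint32_t reg, unsigned int irq,
                          uint8_t requested, uint8_t cell_cpu_mask)
{
    unsigned int shift = itargetsr_shift(irq);
    uint32_t field = (uint32_t)(requested & cell_cpu_mask) << shift;

    return (reg & ~(0xffu << shift)) | field;
}
```

With a cell that owns only CPU interface 2, a request to target interfaces 2
and 3 for IRQ 34 would leave just bit 2 set in the third byte of the register.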

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: switch to mmio accessor]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: add support for GICv2

This patch implements the following GICv2 features:
- Remap GICC to GICV in the cells to provide a virtual interface
- Guest SGI filtering and hyp SGI handling
- IRQ injection

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: switch to mmio accessor]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: GIC: factor some GICv3 functions into gic_common

Some functions are abstract enough to be used by the GICv2 backend.
This patch moves them into the common code.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: restore kernel on setup failure

This patch implements two cases:
- When an error occurs before setting up EL2, there is nothing much
  to do except restore the linux registers stored in the per_cpu
  data.
- When it happens after EL2 setup, arch_cpu_restore copies the saved
  registers on the stack, and continues into arch_shutdown_self

When it happens during the MMU setup, chances of recovering a clean
state are pretty thin anyway. The bootstrap vectors could be used to
catch and dump a minimal context (which would require a raw_printk
implementation), but we cowardly ignore this case for the moment.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: fix memcpy size in cpu_return_el1]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: implement hypervisor shutdown

When an HV_DISABLE hypercall is issued on all root CPUs by the driver,
the core `shutdown' function executes the following operations:
- Suspend all non-root cells (all the CPUs are taken to hyp idle mode),
- call arch_shutdown_cpu for all those CPUs,
- call arch_shutdown.
Once the master CPU (the first to take the shutdown lock) did this, the
other root CPUs don't actually perform any operation.

This patch lets the arch_shutdown and arch_shutdown_cpu set a boolean
that is considered by the cores right before returning to EL1: for the
cells' CPUs, arch_shutdown_cpu will trigger a return to arch_reset_self,
that will clean up EL1 and EL2. On the root cpus, the exit handler
checks this boolean and calls the shutdown function.

Once inside arch_shutdown_self, the principle is the same as with the
hypervisor initialisation:
- Create identity mappings of the trampoline page and the stack,
- Jump to the physical address of the shutdown function,
- Disable the MMU,
- Reset the vectors,
- Return to EL1

This patch does not handle hosts using PSCI yet: they will need to issue
a final SMC on secondary CPUs in order to park themselves at EL3, since
the hypervisor won't exist anymore to emulate the wakeup call.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: moved arch_shutdown_cpu & arch_shutdown to control.c]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: irqchip: add hypervisor shutdown

Shutting down the GIC on the root cell consists of re-enabling direct
access to the CPU interface.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: switch to mmio accessors]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: save the linux hyp-stub vectors

This patch stores the hypervisor stub vectors before installing EL2, in
order to reset them on shutdown. It assumes that they are the same on all
CPUs.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: ignore writes to the ACTLR register

The Auxiliary Control Register may be used on some platforms to disable
memory coherency between the cores, for instance when unplugging a CPU.
This patch ensures that ACTLR is never modified, by trapping its accesses
with the HCR.TAC bit.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: add platform-dependent SMP operations

Hotplugging CPUs on ARM is quite difficult, as each platform uses its
own system. The use of PSCI emulation will greatly simplify this, but
on many platforms, we still have to define a series of specific SMP
operations wrapped around the kernel hotplug implementation.

This patch adds support for the vexpress hotplug system:
- When the root cell attempts to unplug a CPU, to give it to a new
  cell, it is put in a WFI loop, which is left when Jailhouse sends
  a synchronising IPI to all CPUs that need to be parked.
- When re-assigning a CPU to the root cell, the simplest return path
  is through the kernel's secondary entry, whose address is stored in
  the system flags register.

Because the kernel only writes the flag register once, plugging CPUs in
the host cannot be accomplished by waiting for a trapped MMIO. Moreover,
such a trap would be missed on hypervisor shutdown, since CPU0 may
return to bare EL1 before secondary CPUs. On some platforms, it may be
necessary to park secondary CPUs outside of the hypervisor on shutdown,
by copying a minimal spin code in a reserved location...

This patch also attempts to combine both classical and PSCI boot methods
in SMP guests: secondary CPUs are held in the psci_emulate_spin handler,
and can be woken up by both a PSCI call and a trapped access to the
vexpress mbox.
The same applies for hotplugging secondary CPUs in the guests, but the
mailbox method only waits for an IPI.

PSCI in the host is not currently supported: it would require a call to
the actual CPU_OFF handler when shutting down the whole hypervisor.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: switch to mmio accessor]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: PSCI emulation

By adding a new field 'guest_mbox' in the cpu_datas, this patch allows
the guests to issue PSCI HVC calls. Currently, only PSCI_CPU_ON,
PSCI_CPU_OFF, and PSCI_VERSION are handled.
A call to CPU_OFF enters the suspend mode through arch_reset_self. When
a core calls CPU_ON, the hypervisor wakes up the other core, which will
take its return address from the guest_mbox, wipe its registers and go
back to EL1. The context argument to PSCI_CPU_ON is currently ignored,
since the whole core is reset.
This patch also traps SMC instructions in order to catch PSCI
requests issued this way. All other SMC calls are forwarded to
EL3.
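
The dispatch described above can be sketched as follows. The function IDs and
return codes are the SMC32 values from the PSCI 0.2 specification; assuming
the emulation uses those IDs is a guess, and struct guest_mbox as well as
psci_emulate() are hypothetical stand-ins for the real structures:

```c
#include <stdint.h>

/* SMC32 function IDs and return codes as defined by PSCI 0.2. */
#define PSCI_VERSION       0x84000000u
#define PSCI_CPU_OFF       0x84000002u
#define PSCI_CPU_ON        0x84000003u
#define PSCI_SUCCESS       0
#define PSCI_NOT_SUPPORTED (-1)

/* Hypothetical per-CPU mailbox read by the woken core. */
struct guest_mbox {
    uint32_t entry;          /* address the target resumes at */
    int wake_requested;
};

/*
 * Returns 1 if `fn` was a PSCI request handled here (result in *result),
 * or 0 if the call is unknown and should be forwarded or rejected by the
 * caller. The context argument of CPU_ON is ignored, as in the text.
 */
int psci_emulate(uint32_t fn, uint32_t entry,
                 struct guest_mbox *target, int32_t *result)
{
    switch (fn) {
    case PSCI_VERSION:
        *result = 2;         /* major 0 in bits [31:16], minor 2 below */
        return 1;
    case PSCI_CPU_ON:
        target->entry = entry;
        target->wake_requested = 1;
        *result = PSCI_SUCCESS;
        return 1;
    case PSCI_CPU_OFF:
        *result = PSCI_SUCCESS;  /* caller would then enter suspend */
        return 1;
    default:
        return 0;
    }
}
```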

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: GIC: handle distributor accesses

This patch adds the handling of MMIO accesses to the GICv3 distributor.
By restricting the SPI masks to the cell's configuration, it makes sure
that a cell does not touch other cells' SPIs when writing the common
registers.
Except for the routing and SGIR registers, most of the code should be
common to both GICv2 and v3.
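
The masking idea can be sketched as below, assuming a hypothetical per-cell
bitmap of owned SPIs for the 32 interrupts covered by one register. Both
helper names are illustrative:

```c
#include <stdint.h>

/*
 * Merge a guest write into a shared read/write register: bits outside
 * the cell's SPI mask keep their current value.
 */
uint32_t gicd_masked_write(uint32_t current, uint32_t written,
                           uint32_t cell_spi_mask)
{
    return (current & ~cell_spi_mask) | (written & cell_spi_mask);
}

/*
 * For write-1-to-set or write-1-to-clear registers (such as the
 * set-enable and clear-enable banks), dropping foreign bits from the
 * written value is sufficient, since zero bits have no effect.
 */
uint32_t gicd_filter_w1(uint32_t written, uint32_t cell_spi_mask)
{
    return written & cell_spi_mask;
}
```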

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: switch to mmio accessor]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: irqchip: add SPI configuration in cell_init and cell_exit

This patch enables the routing of SPIs to the new cell's first CPU. When
destroyed, all SPIs are re-routed to the root cell.
An exhaustive implementation would save the targets of each IRQ before
transferring it to a new cell. Since linux does not currently route SPIs
to secondary CPUs and the root cell is not supposed to use devices that
will be assigned to guests anyway, it should be safe to route everything
to CPU0.

This patch follows the core configuration and the IOAPIC implementation,
which only allow using the first 64 SPIs.
A future patch will need to change this minimal bitmap size to 988.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: switch to mmio accessors]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: GIC: filter redistributor accesses

Since each cell has its own set of CPU ids, they can't access the
redistributor associated to their MPIDR. Instead, the MMIO accesses are
translated to their physical redistributor, and a read to the ID
register returns the virtual affinity value.

It is a bit more expensive than simply mapping the redistributor to the
cell, but the guest rarely needs to reconfigure its PPIs and IPIs, so
this patch shouldn't introduce any significant performance loss.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: attribute virtual IDs to the cell cpus

To handle SMP guests, the cells need to be assigned virtual CPU IDs
through the VMPIDR register. For the moment, those IDs are simply
generated incrementally on each CPU.

This change will allow using the same guest code in different cells.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: add "arm_" prefix, uninline arm_cpu_virt2phys]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: mmio emulation skeleton

This patch adds the necessary code for handling MMIO accesses. The trap
handler fills a mmio_access struct according to the fields in the ESR,
and passes it to all relevant sub-handlers.
If all return UNHANDLED, the access is considered invalid and the CPU is
put into failed mode.
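
A minimal sketch of this flow, assuming the data abort syndrome layout of the
ARM architecture (ISV bit 24, SAS bits 23:22, SSE bit 21, SRT bits 19:16, WnR
bit 6). The struct fields, handler names and address ranges are hypothetical:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum mmio_result { MMIO_UNHANDLED, MMIO_HANDLED };

struct mmio_access {
    uint32_t addr;
    bool is_write;
    bool sign_extend;
    unsigned int size;   /* access width in bytes */
    unsigned int reg;    /* guest register transferred */
};

/* Fill `a` from the syndrome; fails if the syndrome is not valid (ISV=0). */
bool mmio_decode(uint32_t iss, uint32_t fault_addr, struct mmio_access *a)
{
    if (!(iss & (1u << 24)))
        return false;
    a->addr = fault_addr;
    a->is_write = (iss >> 6) & 1;
    a->sign_extend = (iss >> 21) & 1;
    a->size = 1u << ((iss >> 22) & 0x3);
    a->reg = (iss >> 16) & 0xf;
    return true;
}

typedef enum mmio_result (*mmio_handler_t)(struct mmio_access *);

/* Two demo sub-handlers claiming made-up 4K windows. */
enum mmio_result demo_gic_handler(struct mmio_access *a)
{
    return (a->addr >> 12) == 0x2c001 ? MMIO_HANDLED : MMIO_UNHANDLED;
}

enum mmio_result demo_uart_handler(struct mmio_access *a)
{
    return (a->addr >> 12) == 0x1c090 ? MMIO_HANDLED : MMIO_UNHANDLED;
}

/* Try each sub-handler; UNHANDLED means the caller should fail the CPU. */
enum mmio_result mmio_dispatch(struct mmio_access *a,
                               const mmio_handler_t *handlers, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (handlers[i](a) == MMIO_HANDLED)
            return MMIO_HANDLED;
    return MMIO_UNHANDLED;
}
```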

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: switch to mmio accessors, use asm/mmio.h]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: better error reporting and panic dump

This patch adds exhaustive handling of hypervisor errors, and the
ability to stop and park CPUs after dumping their EL1 context, when they
encounter an unhandled trap.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: Adjustments to recent control subsystem changes]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: Complete paging invalidations

This patch is based on the original version by Jean-Philippe Brucker. It
fills different paging stubs:
- the arch_flush_cell_vcpu_caches stub, which is used by the core via
  config_commit each time the memory is remapped. It allows
  invalidating the TLBs on all affected CPUs of the cell.
- the arch_paging_flush_cpu_caches function is used to flush the
  hypervisor page table entries when using the PAGE_MAP_COHERENT flag
  (useful for IOMMU, not currently in use on the arm side.)
- the arch_paging_flush_page_tlbs function is used to invalidate a TLB
  entry after modifying the hypervisor paging structures. It must ignore
  accesses done from the initial setup code at EL1, which are committed
  once at EL2 with a TLBIALLH, just before enabling the MMU.

arch_config_commit has nothing to do so far. This will change once an
IOMMU is supported.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: disable caches on cell reset

This patch allows entering new guests with caches disabled. By cleaning
the data caches, it makes sure that the recently written guest code and
data are present in memory before returning to an environment with only
a stage-2 MMU.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: flush and enable the caches at initialisation

Before jumping to EL2, which has its MMU disabled, the data caches need
to be cleaned, in order to be coherent with the EL1 context.
This patch implements the complete data cache flush by set/way, and
enables the EL2 caches if possible.
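
Set/way maintenance iterates over every (level, way, set) triple and encodes
it into the operand of the cache instruction. A minimal sketch of that
encoding, assuming the 32-bit CCSIDR layout (LineSize in bits 2:0,
Associativity in bits 12:3, NumSets in bits 27:13). The struct and function
names are illustrative, and the privileged instruction itself is left out:

```c
#include <stdint.h>

struct cache_geometry {
    unsigned int line_shift;  /* log2(line size in bytes) */
    unsigned int ways;
    unsigned int sets;
};

/* Decode the geometry fields of a 32-bit CCSIDR value. */
struct cache_geometry ccsidr_decode(uint32_t ccsidr)
{
    struct cache_geometry g;

    g.line_shift = (ccsidr & 0x7) + 4;
    g.ways = ((ccsidr >> 3) & 0x3ff) + 1;
    g.sets = ((ccsidr >> 13) & 0x7fff) + 1;
    return g;
}

/* ceil(log2(n)) for n >= 1. */
static unsigned int log2_ceil(unsigned int n)
{
    unsigned int bits = 0;

    while ((1u << bits) < n)
        bits++;
    return bits;
}

/*
 * Operand for a set/way operation: way in the top bits, set above the
 * line offset, 0-based cache level in bits [3:1].
 */
uint32_t setway_operand(const struct cache_geometry *g, unsigned int level,
                        unsigned int way, unsigned int set)
{
    unsigned int way_bits = log2_ceil(g->ways);
    uint32_t op = ((uint32_t)set << g->line_shift) | (level << 1);

    if (way_bits)             /* avoid an undefined shift by 32 */
        op |= (uint32_t)way << (32 - way_bits);
    return op;
}
```

A full clean would loop over all levels reported by CLIDR, and for each one
over every way and set, issuing the clean instruction with this operand.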

Please note that the hypervisor always assumes that the kernel sets up
its memory in a coherent way between the cores, which means that all the
relevant memory regions (i.e. the Jailhouse code and data) are supposed
to be cacheable and inner-shareable, so that only a clean is needed
before turning the caches off.
In the short section where the MMU is off, the hypervisor doesn't write
anything to memory.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: clear the banked and system registers on reset

This patch allows booting new guests with an almost empty context.
The reset function is still missing Performance Monitor, debug, SIMD and
float registers.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: implement the cell destruction

Give the CPUs back to Linux when a cell is destroyed, after resetting its
whole context.

This patch uses the old vexpress CPU hotplug system in Linux: the
secondary startup function address is kept in the system flags, so the
cpu_reset function will simply jump there after resetting the CPU state.
A future patch will add PSCI hotplug support, and once the hypervisor
has access to a device tree, this spin function will need to be set
dynamically in a `struct hotplug_ops'.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: switch to mmio accessors]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: GIC: reset the CPU interface before running a new guest

All pending interrupts need to be cleared before running a new guest.
This patch resets the list registers, the software pending queue, and
the GIC hypervisor config registers.
Since the suspend loop was entered through the IRQ handler, we also need
to deactivate that active IPI.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: switch to mmio accessors]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: implement the cell creation

This patch implements most CPU handling functions needed to set up and start
a new cell.
- When the core calls suspend_cpu, an SGI is sent to this CPU.
  Since all IRQs are taken directly to the hypervisor, the guest will be
  interrupted and execution will continue into psci_suspend.
- resume_cpu will simply call psci_cpu_on and return to the IRQ
  handling loop.
- reset_cpu will resume the CPU into arch_reset_self.
  After resetting all relevant registers and devices, execution will
  continue into the newly created guest, by calling vmresume with a
  clean set of registers.

To keep this patch light, most of the reset code is still missing.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: minimal PSCI implementation

This is not the final thing: it can only be used internally for
suspending, resuming and parking CPUs while reconfiguring the cells.
Using this base, a trivial PSCI 0.2 emulation can be added by implementing
the appropriate trap hooks.

The mailbox in the per_cpu structure is used to store the address and
context where psci_cpu_off returns.
CPUs doing reconfiguration on the cells can use the psci_suspend and
psci_resume wrappers to store all affected cores.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: GICv3: filter the guests' SGIs

In GICv3, IPIs are sent by writing the system register `ICC_SGIR'.
This patch moderates those writes by injecting the IPIs into the
appropriate cells, and issues a hypervisor IPI to let the cell's CPUs
fill their list registers.

Since there shouldn't be many cases where Jailhouse needs to emulate
system register accesses, this patch keeps it simple by calling the
GICv3 function directly from the trap handler, without abstracting it
through irqchip.
However, this change adds an ungraceful ifdef, since the GICv2 and v3
headers are mutually exclusive for the moment.
In GICv2, the SGIR register is 32bit and will be handled directly in the
gic-common.c code, using an MMIO trap of the distributor accesses.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: skip instructions that fail their condition check

On some implementations, instructions may trap before the PE was able to
check their condition code. This patch adds the ability to check it
before emulating something that wouldn't be executed. In thumb mode, the IT
state has to be updated when skipping instructions.

Most of this patch is copied and adapted from Linux and KVM.
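
For illustration only, the condition check can be written as a plain switch
over the 4-bit condition field and the saved NZCV flags. This is not the
table-based code taken from Linux and KVM, the function name is hypothetical,
and the IT state update for thumb mode is not covered:

```c
#include <stdbool.h>
#include <stdint.h>

#define PSR_N (1u << 31)
#define PSR_Z (1u << 30)
#define PSR_C (1u << 29)
#define PSR_V (1u << 28)

/*
 * Return true if an instruction with condition field `cond` would
 * execute given the flags in `psr`.
 */
bool arm_cond_passes(unsigned int cond, uint32_t psr)
{
    bool n = psr & PSR_N, z = psr & PSR_Z;
    bool c = psr & PSR_C, v = psr & PSR_V;
    bool result;

    cond &= 0xf;
    switch (cond >> 1) {
    case 0: result = z; break;              /* EQ / NE */
    case 1: result = c; break;              /* CS / CC */
    case 2: result = n; break;              /* MI / PL */
    case 3: result = v; break;              /* VS / VC */
    case 4: result = c && !z; break;        /* HI / LS */
    case 5: result = n == v; break;         /* GE / LT */
    case 6: result = !z && n == v; break;   /* GT / LE */
    default: return true;                   /* AL and 0b1111 */
    }
    /* Odd condition codes negate the preceding even one. */
    return (cond & 1) ? !result : result;
}
```

A trap handler would skip the emulation and just advance the PC when this
returns false.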

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: read/write the banked registers

When emulating instructions, the trap handler will need to access the
cell registers according to the guest's processor mode when the trap
occurred, which is stored inside the saved PSR.
This patch allows reading and writing the banked registers directly.
If HSR reports a load into r14 and the mode was IRQ for instance, the
hypervisor will need to write something into LR_IRQ instead of the LR
saved on the stack.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: GICv3: handle IRQs

The GIC IRQ handler loops over the ack register to get all pending IRQs.
It then dispatches them either in the common SGI handler, or injects
them into the cell.
A first attempt to directly inject an IRQ by writing to a free list
register is done. If it fails, the IRQ is appended to the pending list,
and an attempt will be made later on, once a maintenance interrupt is
received.
Injection in the GIC is a little bit expensive for the moment, because
it needs to iterate over all list registers that have a valid interrupt,
to check that there will be no duplication. This could be optimized by
only checking the `active' GIC register for SPIs and PPIs.
A future patch will also add proper handling of the maintenance bits in
the vGIC.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: store the pending virtual interrupts

This patch introduces a pending_irq structure to provide a level of
abstraction, in order to store the interrupts waiting to be injected in
the cell. They are allocated as a static array of 256 IRQs for each CPU,
which should be more than enough. Insertion finds the first available slot
and builds a linked list of pending vIRQs.

Two cases justify the need for this structure:
- The GIC has a limited number of list registers for injecting virtual
  interrupts. Once they are full, software must store the pending ones
  itself, and use the GIC's maintenance IRQ to be informed when they are
  available again.
  In jailhouse, this case should be very rare since IRQs are directly
  injected, but it must be taken into account nonetheless.
- IPIs sent by a core need to be stored somewhere to let the other CPUs
  inject them into their own list registers.

The GIC backend will need to call irqchip_inject_pending when receiving
a maintenance IRQ or a synchronisation SGI in order to clean the list.
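
A minimal sketch of such a pending list, assuming a first-free-slot
allocation in a fixed array of 256 entries as described above. The field and
function names are hypothetical, and the locking a real per-CPU queue would
need against remote insertions is omitted:

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING_IRQS 256

struct pending_irq {
    uint32_t virt_id;
    int in_use;
    struct pending_irq *next;
};

/* One queue per CPU; a zero-initialised queue is empty. */
struct pending_queue {
    struct pending_irq slots[MAX_PENDING_IRQS];
    struct pending_irq *first;
    struct pending_irq *last;
};

/* Take the first free slot and append it; returns -1 when full. */
int pending_insert(struct pending_queue *q, uint32_t virt_id)
{
    for (size_t i = 0; i < MAX_PENDING_IRQS; i++) {
        struct pending_irq *p = &q->slots[i];

        if (p->in_use)
            continue;
        p->virt_id = virt_id;
        p->in_use = 1;
        p->next = NULL;
        if (q->last)
            q->last->next = p;
        else
            q->first = p;
        q->last = p;
        return 0;
    }
    return -1;
}

/* Remove the oldest entry, e.g. once a list register is free again. */
int pending_pop(struct pending_queue *q, uint32_t *virt_id)
{
    struct pending_irq *p = q->first;

    if (!p)
        return -1;
    *virt_id = p->virt_id;
    q->first = p->next;
    if (!q->first)
        q->last = NULL;
    p->in_use = 0;
    return 0;
}
```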

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: IRQ handling skeleton

Since IRQs taken to HYP use a different vector, the trap handler needs to
be aware of the exit context. To this end, the patch adds an 'exit_reason'
field to the struct registers.
The structure is still passed to the dispatcher as a pointer to the stack,
but care must be taken to ignore the exit field when restoring the user
registers.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: GICv3 initialisation

Assuming there is a GIC distributor at address GICD_BASE, this patch
checks its version and calls the gic init function. Linux's kconfig header
is used to guess the base address of the distributor.
Ideally, a device tree would be passed to the hypervisor in the root
cell's config, which would allow removing all constant base addresses.

The patch also assumes that most of the GIC has been set up by Linux prior
to the hypervisor installation, and only initialises the vGIC.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: use mmio accessors]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: GIC initialisation skeleton

Since the GIC uses MMIOs, its initialisation must be done at EL2. This
is why arch_cpu_init first calls irqchip_init on the master CPU, to map
the devices, and then irqchip_cpu_init on all CPUs.

The aim of this patch is to allow support for both GICv2 and GICv3. It
abstracts the GIC operations by using `struct irqchip_ops', and fills it
with the right device hooks after detecting which irqchip is available.
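
The abstraction can be sketched as a table of function pointers selected from
the architecture revision field (bits 7:4) of the distributor's PIDR2
register. The members of struct irqchip_ops and the stub backends below are
assumptions for illustration, not the actual definitions:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the hooks an irqchip backend provides. */
struct irqchip_ops {
    int (*init)(void);
    int (*cpu_init)(unsigned int cpu);
};

/* Stub backends standing in for the real GICv2 and GICv3 code. */
static int gicv2_init(void) { return 0; }
static int gicv2_cpu_init(unsigned int cpu) { (void)cpu; return 0; }
static int gicv3_init(void) { return 0; }
static int gicv3_cpu_init(unsigned int cpu) { (void)cpu; return 0; }

const struct irqchip_ops gicv2_ops = { gicv2_init, gicv2_cpu_init };
const struct irqchip_ops gicv3_ops = { gicv3_init, gicv3_cpu_init };

/* Pick the backend from the ArchRev field; NULL if unsupported. */
const struct irqchip_ops *irqchip_select(uint32_t pidr2)
{
    switch ((pidr2 >> 4) & 0xf) {
    case 2:
        return &gicv2_ops;
    case 3:
        return &gicv3_ops;
    default:
        return NULL;
    }
}
```

The master CPU would call ops->init() once, after which every CPU calls
ops->cpu_init() for its own interface.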

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: Pass through init_late

The only thing we need to do in init_late for the moment is to trigger the
root cell memory mapping to be built. This allows running a complete
hypervisor setup from the driver and returning to the kernel normally.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: dispatch hypercalls

Initial code for handling hypervisor traps. Only the non-banked registers
need to be saved in the low-level handler, the rest of the context won't
be overwritten.
The per-cpu data is loaded from TPIDR_EL2 and the general-purpose
registers are saved directly on the stack and supplied as 'struct
registers' to the dispatcher.
The latter then inspects the ESR value and calls the core accordingly.
The return value and the general-purpose registers are passed back to the
driver by retrieving 'struct registers' from the stack, before doing the
final ERET.
This patch allows handling all status querying hypercalls.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: check architecture features

Verify that virtualization is actually supported before going any further
in the initialisation process.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: setup stage 2 MMU for the cells

This patch adds the necessary MMU setup code for the cells. They use the
same paging functions as the hypervisor, but their flags are slightly
different.

As an improvement, it would be good to use only two levels of page
tables on 32bit instead of three. This would limit the memory accessible
from EL1 to 16GB instead of the current 256.
This doesn't really matter for the moment: since the core handles virtual
addresses with unsigned longs, LPAE cannot be used.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: adjustments to recent paging subsystem changes]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: initialise hypervisor stage 1 MMU

This patch enables the EL2 stage-1 MMU, after the core initialisation of
all the paging structures needed by the hypervisor.
The arm backend also needs to map the needed devices for MMIOs. In order
to stay compatible with the linux ioremaps (which is quite dodgy, cf.
ecfa8d1a), the UART is still accessed through high memory, but the GIC
should be accessed at its real address.

Some temporary mappings allow the mmu setup code to run at its physical
address while enabling the translations. Given the current hypervisor
configuration, there shouldn't be any conflict with existing mappings.
Once the PE runs at EL2, the HTTBR is installed, setup_mmu jumps back to
the virtual addresses, and the identity mappings are deleted.

This patch attempts to make most of the process 64bit-compatible. Only a
small bit of assembly is needed, which calls phys2hvirt and hvirt2phys
to translate the lr and sp addresses. Since these functions consist of
simple additions, they are currently harmless. This code would blow up
if they needed to dereference some pointers one day, but it should be
safe for the time being.
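
A minimal sketch of why such helpers are harmless while the translations are
being switched: with the offset passed in as an argument, they reduce to
register arithmetic and touch no memory. The explicit page_offset parameter
is an assumption of this sketch, since the real functions may obtain the
offset differently:

```c
/*
 * page_offset is the (hypothetical) difference between the hypervisor's
 * virtual and physical base addresses.
 */
unsigned long phys2hvirt(unsigned long phys, unsigned long page_offset)
{
    return phys + page_offset;
}

unsigned long hvirt2phys(unsigned long virt, unsigned long page_offset)
{
    return virt - page_offset;
}
```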

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: adjustments to recent paging subsystem changes, add copyright header]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

core: add the ability to use arch-specific linker scripts

This patch adds an optional include inside the hypervisor's linker
script in order to add sections specific to the architecture.
For instance on ARM, a trampoline section will need to be added to
safely enable and disable the EL2 MMU. To avoid overlapping
complications, it will be less than one page, and aligned.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: implement the paging callbacks

This patch adds the ability to implement EL2 stage-1 and EL1 stage-2
MMU, using the struct paging defined by the core.
In the future, it may be useful to separate the stage1 and stage2
functions, like the x86 implementation, in order to use less translation
levels for the IPA->PA translation. For the moment, we keep the
hv_paging = arm_paging definition.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: adjustments to recent paging subsystem changes]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: Implement per-cpu accessors

TPIDR_EL2 already holds a reference to the current per_cpu data
structure. Return it from this_cpu_data and build the other accessors on
top.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: hyp vectors installation

The EL2 installation is done in two steps:
- First, an HVC is issued to jump into the kernel stub and install the
  bootstrap vectors.
- Then, a second HVC allows the setup code to switch to physical address
  space.
Execution continues at EL2. Once the whole initialisation is done, the
final vectors are installed, and arch_cpu_activate_vmm will do an ERET
to jump back to the kernel, which is now a guest.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: implement the debug routines for the pl011 UART

This patch uses the kernel config to detect which UART is available.
Currently, only the vexpress platform is implemented.
It assumes that the first UART uses the default, fixed VA->PA mapping
set by the kernel for the vexpress low-level printk.

This is far from ideal: a clean implementation would either need to
communicate the uart address from the driver, or postpone all debug
printks until the hypervisor is able to use its own mappings, as linux
does when earlyprintk is disabled.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: removed kernel config dependencies, switched to mmio accessors]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: implement atomic bitops

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: spinlock implementation

Instead of attempting to reinvent the wheel, this patch copies the
ticket spinlock implementation from Linux, which will allow for optimal
locking between heterogeneous cores.
It contains a few subtleties, such as a preload instruction that ensures
that the cache line is immediately loaded in exclusive state.
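
The ticket idea itself can be sketched in portable C. This is only an
illustration using C11 atomics, not the ARM exclusive-access assembly that the
patch copies from Linux, and it has no equivalent of the preload instruction.
All names ending in _sketch are hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical lock layout: two counters, as in any ticket lock. */
struct ticket_lock_sketch {
	atomic_uint next;	/* next ticket to hand out */
	atomic_uint owner;	/* ticket that currently holds the lock */
};

void ticket_lock_sketch_acquire(struct ticket_lock_sketch *lock)
{
	/* Take a ticket; waiters are served in the order they arrive. */
	unsigned int ticket =
		atomic_fetch_add_explicit(&lock->next, 1,
					  memory_order_relaxed);

	/* Spin until our ticket is the one being served. */
	while (atomic_load_explicit(&lock->owner,
				    memory_order_acquire) != ticket)
		;
}

void ticket_lock_sketch_release(struct ticket_lock_sketch *lock)
{
	/* Pass the lock on to the holder of the next ticket. */
	atomic_fetch_add_explicit(&lock->owner, 1, memory_order_release);
}
```

Because every waiter spins on its own ticket number, the lock is handed over
in FIFO order regardless of how fast the individual cores are.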

Big endian is not supported for the moment, but as soon as we have a
macro that declares the use of this mode, the change will be trivial
here.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: add SMP barriers and utilities

This patch adds a few simple macros that allow the C code to use
specific ARM instructions.
The memory_barrier helper is only used to commit the changes of cpu_init
on the last core before allowing the others to return to EL1.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: implement some base functions

memcpy and phys_processor_id implementations are required before going
any further. This patch introduces a very trivial version of those
functions.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: provide an interface for accessing system registers

To avoid using inline asm directly in the setup and control code, to
provide clear names for system registers, and to allow easier
refactoring for a future arm64 port, this patch introduces some useful
macros for accessing the core system registers.

For a 64bit port, a couple of wrappers would still need to be added to
modify system registers that are of different sizes on the two
architectures (e.g. HCR)

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: hypervisor entry point

Each CPU saves its general-purpose registers on the stack, switches to
the hypervisor stack and saves the return context in the per-cpu data.
After that, it jumps to the core entry which will do the necessary
initialisation to setup EL2.

Clusters are not supported yet: they will require the entry code to
fetch the total number of possible CPUs from the header, in order to
deduce an efficient base address for the per-cpu data.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: build with virtualisation support

Most armv7-compatible toolchains still need an additional flag to
recognise instructions such as ERET or an MSR to banked registers.
This patch allows including the virt flag in files that require it.
It also forces the hypervisor image to only use the ARM instruction
set.
Support for Thumb2 hypervisor and kernel will be added later.
Guests should still be able to run in Thumb2, as long as they can be
entered in ARM mode.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

core: Extend poor man's config system to makefiles

Generate hypervisor/config.mk from the optional config.h the user may
leave in hypervisor/include/jailhouse. Only boolean switches of the form

#define CONFIG_OPTION 1

are supported and will be translated to CONFIG_OPTION=y. Any other line
in config.h is ignored. The kernel's CONFIG variables are unset to remove
any collision possibilities. The architecture makefile can then include
the config.mk via

include $(CONFIG_MK)
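
The translation rule can be illustrated with a small C helper. This is only a
sketch of the rule described above, not the mechanism the build system uses;
the function name and the check for a CONFIG_ prefix are assumptions.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: if line has the form "#define CONFIG_<NAME> 1",
 * write "CONFIG_<NAME>=y" to out and return 1. Any other line is
 * ignored and 0 is returned.
 */
int config_line_to_mk_sketch(const char *line, char *out, size_t out_len)
{
	char name[64];
	char value[16];
	char extra[2];

	/* Exactly two tokens must follow "#define". */
	if (sscanf(line, " #define %63s %15s %1s", name, value, extra) != 2)
		return 0;
	/* Assumption: only CONFIG_-prefixed switches with value 1 count. */
	if (strncmp(name, "CONFIG_", 7) != 0 || strcmp(value, "1") != 0)
		return 0;

	snprintf(out, out_len, "%s=y", name);
	return 1;
}
```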

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

x86: Use symbolic registers for tuning apic_reserved_bits

No functional changes.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

x86: Limit number of extended LVTs on AMD

If future hardware should decide to report more than 4 LVTs, our
apic_reserved_bits array would overflow when enabling all of them in
apic_cpu_init.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

tools: add missing free statement in jailhouse tool

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

core: Fix asm-defines.h dependencies tracking

Previously, a change to headers like <asm/percpu.h> didn't trigger
a rebuild of asm-defines.s (thus asm-defines.h). This is fixed now.

Signed-off-by: Valentine Sinitsyn <valentine.sinitsyn@gmail.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

docs: inter-cell: Add pointer to root-cell test code

Adding a section on how to play with inter-cell communication and where
to find the Linux test code.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

inmates: x86: Account for rounding of region address in map_range

When feeding in a start address that is not huge-page aligned, we will
let the mapping start earlier, thus also have to extend the size.
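
A minimal sketch of the adjustment, assuming 2 MiB huge pages and that the
size is additionally rounded up to whole huge pages. The names are
hypothetical and do not mirror the inmate library.

```c
#include <stdint.h>

/* Assumption: 2 MiB huge pages. */
#define HUGE_PAGE_SIZE_SKETCH	(2ULL * 1024 * 1024)
#define HUGE_PAGE_MASK_SKETCH	(~(HUGE_PAGE_SIZE_SKETCH - 1))

struct range_sketch {
	uint64_t start;
	uint64_t size;
};

struct range_sketch align_range_sketch(uint64_t start, uint64_t size)
{
	struct range_sketch r;
	/* Distance by which the start is moved down to the page boundary. */
	uint64_t offset = start & ~HUGE_PAGE_MASK_SKETCH;

	r.start = start & HUGE_PAGE_MASK_SKETCH;
	/* Extend the size by that distance, then round up to full pages. */
	r.size = (size + offset + HUGE_PAGE_SIZE_SKETCH - 1) &
		HUGE_PAGE_MASK_SKETCH;
	return r;
}
```

For example, a request for 0x2000 bytes at 0x3ff000 crosses a huge-page
boundary. Without adding the offset, only the page at 0x200000 would be
mapped and the tail of the region would be missed.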

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

docs: add documentation for inter-cell communication

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

configs: qemu: add a virtual PCI device to qemu config

This adds a virtual PCI device of type ivshmem to the root-cell on qemu.
Starting a cell with an ivshmem at the same bdf, referring to the same
memory, will allow the two cells to connect.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
[Jan: adjusted memory regions to avoid overlappings]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

inmates, configs: add ivshmem virtual PCI device demo code and config

New demo to demonstrate ivshmem virtual PCI devices for shared memory
inter-cell communication.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
[Jan: adjusted region location]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

pci_ivshmem: add PCI shared memory device

This patch adds support for a virtio shared memory device which can be
used for inter-cell communication.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

core: add load barrier for serialization

Introduce a memory load barrier function.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

pci: refactor error path of pci_cell_init()

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

pci: add list for virtual PCI devices to cell, makes lookup faster

Add a list of virtual PCI devices to a cell, that way we do not need to go
through all PCI devices when looking for virtual ones. This lookup is on
the mmio path, which should be fast.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

pci: add a bunch of defines to pci.h

Add a few general defines to pci.h. They will be used in a later commit
by the ivshmem virtual PCI device but are not specific to it.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

pci: apic: make function non-static

Make the function pci_translate_msi_vector available for code outside of
arch/x86/pci.c.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

core: cell-config: add field to PCI device struct to refer to memory

Add shmem_region to struct jailhouse_pci_device to be able to refer to
the shared memory region used by the virtual PCI device used for
inter-cell communication.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

driver: add/remove virtual PCI devices to/from root-cell

Introduce a new PCI device class and make Linux discover these virtual
PCI devices on enable and remove them on disable.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

driver, pci: add functions to add/del devices to/from root-cell

This patch introduces functions to add or remove PCI devices in the root
cell. The functions are not called yet, the patch prepares for virtual
PCI devices.
The original idea was to remove PCI devices from Linux on cell create
and re-add them on cell destroy. But in this case Linux could reprogram
the BARs so we ignore regular PCI devices for now.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

pci: ignore PCI bridge violations in Linux to allow bus rescans

When adding new devices Linux will access the PCI root bridge. Introduce
a HACK that will ignore the access and not treat it as a violation.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

pci: move define from pci.c to general pci header

This prepares for using the define in other C-files.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

apic: extract irq validation logic from iommu_map_interrupt

Introducing a new function apic_irq_dest_in_cell that makes sure the
destination cpu of the interrupt belongs to the given cell.
So far this code was part of iommu_map_interrupt but these checks are
required in other places as well.
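
The kind of check being factored out can be sketched as follows. The cell
representation (a 64-bit CPU mask) and all names are assumptions made for
illustration; the real function works on Jailhouse's own cell and interrupt
message structures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cell: bit n set means CPU n belongs to the cell. */
struct cell_sketch {
	uint64_t cpu_mask;
};

/* Returns true only if the destination CPU is owned by the cell. */
bool irq_dest_in_cell_sketch(const struct cell_sketch *cell,
			     unsigned int dest_cpu)
{
	/* CPUs beyond the mask width can never belong to the cell. */
	if (dest_cpu >= 64)
		return false;
	return (cell->cpu_mask >> dest_cpu) & 1;
}
```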

Signed-off-by: Henning Schild <henning.schild@siemens.com>
[Jan: trivial formatting adjustments]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

x86 apic: make sure to only send valid irqs

struct apic_irq_message has a valid field, make sure to call
apic_send_irq only in cases where this bit is set. This protects us
against sending invalid interrupts during handover if the Linux setup
should ever be broken.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
[Jan: extended commit description]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

core: add memory flag ROOTSHARED and use it

This patch introduces a memory flag to express shared memory with the
root cell. Memory regions with that flag will stay mapped in the root
cell on non-root-cell creation and do not need to be mapped back into
the root cell on non-root-cell destruction.
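
The effect of the flag can be sketched as a simple predicate. The flag value,
the struct layout and the function name are hypothetical and only illustrate
the decision described above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag value, chosen for this sketch only. */
#define MEM_ROOTSHARED_SKETCH	0x0080

struct mem_region_sketch {
	uint64_t phys_start;
	uint64_t size;
	uint32_t flags;
};

/*
 * True if the region has to be taken away from the root cell when a
 * non-root cell is created (and mapped back when it is destroyed).
 * Regions shared with the root cell simply stay mapped.
 */
bool must_unmap_from_root_sketch(const struct mem_region_sketch *mem)
{
	return !(mem->flags & MEM_ROOTSHARED_SKETCH);
}
```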

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

Revert "core: Validate CPU ID before using it"

This reverts commit c1eb94e35d03a71bb528fe1a6ab5304e840f91b0.

This attempt to protect us from out-of-bounds CPU IDs is futile: already
arch_entry requires a usable per-cpu data structure to save and - in
case of an error - restore Linux state. It's the driver's duty to
prevent hypervisor invocations on out-of-bounds CPUs.

Moreover the patch contained a regression as it passed an uninitialized
member of cpu_data to cpu_id_valid.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

driver: Catch unconfigured CPUs earlier

As the driver allocates per-CPU buffers according to the system
configuration, calling into the hypervisor with a CPU outside that range
can cause a crash instead of a proper error return. Catch this case
earlier, already in the driver.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

core: Account for 64-bit overflow in open-coded div_u64_u64

We run into an endless loop when trying to divide something which has
bit 63 set. In this case, tmp_div will overrun and never become larger
than the dividend. Detect this case and terminate the inner loop
properly.
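
A minimal sketch of a shift-and-subtract division with such a guard. It is
written from the description above and is not the hypervisor's actual code;
the function name and the handling of a zero divisor are assumptions.

```c
#include <stdint.h>

uint64_t div_u64_u64_sketch(uint64_t dividend, uint64_t divisor)
{
	uint64_t result = 0;

	/* Assumption: division by zero simply yields 0 in this sketch. */
	if (divisor == 0)
		return 0;

	while (dividend >= divisor) {
		uint64_t tmp_div = divisor;
		uint64_t tmp_res = 1;

		/*
		 * Double tmp_div while it still fits into the dividend,
		 * but stop once bit 63 is set: another shift would
		 * overflow and the loop would never terminate.
		 */
		while (!(tmp_div & (1ULL << 63)) &&
		       dividend >= (tmp_div << 1)) {
			tmp_div <<= 1;
			tmp_res <<= 1;
		}
		dividend -= tmp_div;
		result += tmp_res;
	}
	return result;
}
```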

Reported-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

tools: Support restrictive terminals with cell stats

Some terminals may not support use_default_colors or curs_set and throw
an exception at us. Seen with serial consoles. However, the stats
command can still work if they fail, so ignore those errors.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

x86: Use BITS_PER_LONG instead of hard-coded value

PERCPU_SIZE_SHIFT relied on long being 64 bits in size, which is always
true in Jailhouse but is also a potential breakage point. This is now fixed
by using the BITS_PER_LONG macro.
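
How PERCPU_SIZE_SHIFT is computed is not shown in this log. The sketch below
assumes a count-leading-zeros formula to show why the width of long matters;
the names are hypothetical and __builtin_clzl is a GCC/Clang builtin.

```c
#include <limits.h>

/* Width of long in bits, derived instead of hard-coding 64. */
#define BITS_PER_LONG_SKETCH	((unsigned int)(sizeof(long) * CHAR_BIT))

/*
 * Smallest shift such that (1UL << shift) >= size.
 * Only valid for size >= 2, since __builtin_clzl(0) is undefined.
 */
unsigned int size_shift_sketch(unsigned long size)
{
	return BITS_PER_LONG_SKETCH -
		(unsigned int)__builtin_clzl(size - 1);
}
```

With a literal 64 in place of BITS_PER_LONG_SKETCH, the result would be off
by 32 on a platform where long is 32 bits wide.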

Signed-off-by: Valentine Sinitsyn <valentine.sinitsyn@gmail.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

inmates: set BITS_PER_LONG for printk-core

printk-core.c uses BITS_PER_LONG and is included into inmate printk. In
order to use that code correctly, inmates need that define.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

core: Mark hexdigit string static

ARM inmates have trouble processing a non-static version, and it also
makes no sense to have this constant on the stack.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

x86: Handle Extended APIC Register Space better

Extended APIC Register Space is an AMD feature which was initially supported
in Jailhouse with a quick hack. This patch fixes it: the presence of Extended
APIC Register Space is now determined as per APMv2, Sect. 16.3.4, and the
number of Extended Local Vector Tables is also determined at runtime.

As all of this is now done inside apic.c, there is no need to keep
apic_reserved_bits[] public anymore.

Signed-off-by: Valentine Sinitsyn <valentine.sinitsyn@gmail.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

x86: Fix mask of reserved bits for error LVT

Only bit 16 (mask) is writable, not bits 17 and 18. Was copied & pasted
from timer LVT.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

x86: Bounds check for apic_reserved_bits[] array

The apic_accessing_reserved_bits() function doesn't sanitize its 'reg'
argument. A malicious cell can use this fact to perform an out-of-bounds read
in the apic_reserved_bits[] array in host mode, which should never happen.

With this patch, all APIC register bits not listed in apic_reserved_bits[]
are considered reserved.
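
A minimal sketch of such a check, with a made-up table. The mask values and
names are hypothetical; only the pattern (bounds check first, then mask
lookup) reflects the description above.

```c
#include <stdbool.h>
#include <stdint.h>

#define ARRAY_SIZE_SKETCH(a)	(sizeof(a) / sizeof((a)[0]))

/* Hypothetical table: one mask of reserved bits per register index. */
static const uint32_t reserved_bits_sketch[] = {
	0xffffffff,	/* register 0: every bit reserved */
	0x00ffffff,	/* register 1: only the top byte is writable */
	0xfffeff00,	/* register 2 */
};

bool accessing_reserved_bits_sketch(unsigned int reg, uint32_t val)
{
	/* Registers not covered by the table count as fully reserved. */
	if (reg >= ARRAY_SIZE_SKETCH(reserved_bits_sketch))
		return true;
	return (val & reserved_bits_sketch[reg]) != 0;
}
```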

Signed-off-by: Valentine Sinitsyn <valentine.sinitsyn@gmail.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

vtd: intremap: fix calculation of maximum amount of msi(x) vectors

Fix typo that prevented MSI-X remapping regions from being freed correctly.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

core: remove redundant test on error-code

The global error-state is checked before a call to cpu_init and checked
again in it. Because this happens under a lock and the global state is
never touched in cpu_init - or any sub-call -, the second check is
redundant and will always be true, thus remove it.

Signed-off-by: Benjamin Block <bebl@mageta.org>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

core: Explain asm-defines generation code origins

Explicitly mention code that was taken from the Linux kernel in the
corresponding source files' licensing headers.

Signed-off-by: Valentine Sinitsyn <valentine.sinitsyn@gmail.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>