rtime.felk.cvut.cz Git - jailhouse.git/log

ci: Use script for building all configurations

This will ease the maintenance when we start to use it for the Coverity
build as well.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

ci: Beautify travis script

Adjust whitespace, comment installation steps, use pushd/popd for
switching directories.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

inmates: Switch IOAPIC demo from power button to timer interrupt source

This simplifies testing as no more manual triggering is required.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

x86: Return an initialized value from AMD iommu_get_remapped_root_int

This is just to please code scanners, the function isn't called yet
(iommu_cell_emulates_ir always returns false).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

x86: Clean up and document cpu_suspended setting in arch_panic_stop

Document why we manipulate cpu_suspended outside of the per-cpu lock and
drop the superfluous memory barrier. Nothing has to be ordered here, we
just do a full stop and try to avoid that some other CPU will wait
infinitely on us to finish "suspension".

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

x86: Avoid theoretical race between CPU suspension and arch_resume_cpu

Conceptually, we avoid this race by synchronizing on cpu_suspended in
arch_suspend_cpu. However, to ease the analysis by both humans and code
scanners, let's apply the lock around the manipulation. Lock acquisition
also includes the required memory barrier so that we can drop the
explicit one.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

tools: jailhouse: Fully initialize cell_id structure

This mostly helps code checkers to stop believing we are copying
uninitialized data around, even if it is semantically unused.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

driver: Fix uninitialized return code of jailhouse_cell_create

Found by Coverity: In case no CPUs of a new cell need to be offlined, we
left err uninitialized. Fix this.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

ci: Add Coverity scan

This only processes x86 code so far as Coverity also relies on binary
outputs to at least trigger the scan. We will have to decide to develop
a workaround or switch to a matrix build (including redundant
environment setups).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

inmates: Power-up PHY on E1000 takeover

Clear the power-down bit in the PHY control register in case the
previous user turned it off. Linux does so since about 3.15.

Note that we do not try to reset the PHY. Getting it running again with
the proper link speed turned out to be too complicated (too many PHY
variants) for this little demo.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

configs: Prepare f2a8xm-hd3 cell to multi-IOAPIC

The board has two IOAPICs, and now that Jailhouse supports more than
one of these chips, they can be safely added to the config.

Signed-off-by: Valentine Sinitsyn <valentine.sinitsyn@gmail.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

tools: Close files after use in config generator

Just to be clean and to avoid piling up unused resources. In some cases
we already did so, in one we were using the with statement. Now the
remaining ones perform the close as well.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
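
Illustration (not part of the patch): a minimal sketch of the two ways
to make sure a file is closed after use. The helper names and the idea
of reading a whole file at once are assumptions made for this example;
they are not taken from the config generator.

```python
def read_with_statement(path):
    """Read a file; the with statement closes it even if read() raises."""
    with open(path, 'r') as f:
        return f.read()


def read_with_explicit_close(path):
    """Equivalent form with an explicit close in a finally clause."""
    f = open(path, 'r')
    try:
        return f.read()
    finally:
        f.close()
```

Both variants release the file handle on the error path as well, which
a bare open()/read() sequence does not guarantee.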

tools: config create: add assertions in DMAR parser

Add assertions for some of the constraints noted in the VT-d manual.
This might help to detect invalid ACPI tables.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

tools: config create: add PCI-PCI bridge support to DMAR parser

Implement "PCI Sub-hierarchy" scope in DMAR parser.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

tools: config create: Comments and style, no functional changes

Add comments to help make some sense out of the scope type numbers.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
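
Illustration (not part of the patch): the scope type numbers mentioned
in the DMAR parser commits can be read as follows. This is a sketch
under the assumption that the device scope entry follows the layout of
the VT-d specification (type, length, two reserved bytes, enumeration
ID, start bus, then device/function pairs). The function and constant
names are hypothetical and do not mirror the config generator.

```python
import struct

# Device scope types as defined by the VT-d specification.
SCOPE_PCI_ENDPOINT = 1
SCOPE_PCI_SUB_HIERARCHY = 2   # a PCI-PCI bridge and everything behind it
SCOPE_IOAPIC = 3
SCOPE_MSI_CAPABLE_HPET = 4

# type, length, reserved, enumeration id, start bus number
SCOPE_HEADER = struct.Struct('<BBHBB')


def parse_device_scope(data, offset=0):
    """Parse one device scope entry and return its fields as a dict."""
    scope_type, length, _reserved, enum_id, start_bus = \
        SCOPE_HEADER.unpack_from(data, offset)
    path_bytes = length - SCOPE_HEADER.size
    if path_bytes < 0 or path_bytes % 2 != 0:
        raise ValueError('invalid device scope length %d' % length)
    path = []
    for pos in range(offset + SCOPE_HEADER.size, offset + length, 2):
        device, function = struct.unpack_from('<BB', data, pos)
        path.append((device, function))
    return {'type': scope_type, 'length': length, 'enum_id': enum_id,
            'start_bus': start_bus, 'path': path}
```

For a sub-hierarchy entry the path ends at a bridge, so a consumer
would additionally have to walk the buses behind that bridge.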

tools: config create: break out of pci device iteration after first hit

The list of PCI devices contains only one entry per bdf, break out of
loop after finding it.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

tools: config create: do not use the class file of pci devs anymore

The class file just contains the class code. Since we started also using
the file containing the whole PCI config space, we might as well get the
class information from there and copy/access fewer input files.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
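
Illustration (not part of the patch): a sketch of how the class code
can be taken from a raw config space dump instead of a separate class
file. It assumes the standard PCI header layout, where the programming
interface, subclass and base class occupy offsets 0x09 to 0x0b. The
function names are hypothetical.

```python
def class_code_from_config(config_space):
    """Return the 24-bit class code (base class, subclass, prog-if)."""
    if len(config_space) < 0x0c:
        raise ValueError('config space dump too short')
    return int.from_bytes(config_space[0x09:0x0c], 'little')


def is_pci_pci_bridge(config_space):
    """Base class 0x06 (bridge) with subclass 0x04 (PCI-to-PCI)."""
    return class_code_from_config(config_space) >> 8 == 0x0604
```

The returned value has the same form as the content of the sysfs class
file, for example 0x060400 for a PCI-PCI bridge.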

tools: config-create: fix pep8 style violations

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

i8042: remove unreachable condition from if statement

The size check was already done earlier in the function.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

driver: use min macro from Linux instead of defining another one

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

x86: Switch to ticket spinlocks

ARM already has it, x86 should gain it as well: To avoid the risk of
unfair lock assignment or even starvation in excessive contention
scenarios, switch to the ticket-based spinlock algorithm that also Linux
uses. Our implementation is a condensed version of the kernel as we do
not have to take para-virtual optimizations and instrumentations into
account.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
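
Illustration (not part of the patch): the ticket idea can be modelled
in a few lines. This is a single-threaded sketch of the bookkeeping
only, not the hypervisor's implementation; the class name and the
16-bit counter width are assumptions. In real code, taking a ticket
has to be one atomic fetch-and-add, and waiters spin until their
ticket comes up.

```python
TICKET_MASK = 0xffff  # illustrative counter width (assumption)


class TicketLock:
    """Model of a ticket lock: waiters are served in arrival order."""

    def __init__(self):
        self.next_ticket = 0   # handed to the next arriving waiter
        self.now_serving = 0   # ticket that currently owns the lock

    def take_ticket(self):
        # Must be an atomic fetch-and-add in a real implementation.
        ticket = self.next_ticket
        self.next_ticket = (self.next_ticket + 1) & TICKET_MASK
        return ticket

    def may_enter(self, ticket):
        # A waiter spins until this becomes true for its ticket.
        return ticket == self.now_serving

    def release(self):
        self.now_serving = (self.now_serving + 1) & TICKET_MASK
```

Because each waiter only enters when its own number is served, a late
arrival cannot overtake an earlier one, which is the fairness property
the commit message refers to.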

x86: Account for multiple IOAPICs per cell

Finally overcome the limitation of only one IOAPIC per cell, thus also
per system. We either look up the IOAPIC from the cell array based on
its physical address or we iterate over all IOAPICs of a cell when
needed - that's all. A good sign that we achieved this is the removal of
the IOAPIC_BASE_ADDR constant.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
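
Illustration (not part of the patch): looking up an IOAPIC by its
physical base address, with on-demand creation, might be modelled as
below. All class and method names are hypothetical, and the limit on
the number of entries is an assumption for the example.

```python
class PhysIoapic:
    """State of one physical IOAPIC, identified by its base address."""

    def __init__(self, base_addr):
        self.base_addr = base_addr
        self.shadow_redir_table = {}   # pin -> last value written


class IoapicTable:
    """Fixed-capacity array of IOAPIC states with lookup by address."""

    def __init__(self, max_ioapics):
        self.max_ioapics = max_ioapics
        self.entries = []

    def find(self, base_addr):
        for ioapic in self.entries:
            if ioapic.base_addr == base_addr:
                return ioapic
        return None

    def get_or_create(self, base_addr):
        ioapic = self.find(base_addr)
        if ioapic is None:
            if len(self.entries) >= self.max_ioapics:
                raise ValueError('too many IOAPICs')
            ioapic = PhysIoapic(base_addr)
            self.entries.append(ioapic)
        return ioapic
```

Entries are never removed in this model, matching the statement in the
earlier commit that physical IOAPIC states stay with the hypervisor
until it is disabled.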

x86: Introduce per-cell IOAPIC state

This introduces per-cell IOAPIC static and dynamic information. It
replaces related cell fields with a reference to an array of cell_ioapic
structures. As we do not want to keep a large array for every cell, even
for those that do not use the IOAPIC (typically all non-root cells), the
array is stored in a page allocated on demand during cell creation.

Using this abstraction obsoletes ioapic_find_config and moves us a bit
further away from the assumption that there is only a single IOAPIC.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

x86: Prepare ioapic_shutdown for multiple IOAPICs

Iterate over all physical IOAPICs during shutdown to write their
shadow states into the hardware.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

x86: Set up physical IOAPIC on cell creation

In preparation to support multiple IOAPICs, instantiate their physical
state phys_ioapic only on demand during cell creation. For simplicity
reasons, those instances will not be released on cell destruction again.
That means, once created, physical IOAPIC states and mappings stay with
the hypervisor until it is disabled again.

Note: Parts of the code keep their single-IOAPIC restrictions for now.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

x86: Introduce phys_ioapic abstraction

This structure will keep static and dynamic information about a physical
IOAPIC in a system. The three global variables ioapic_lock, ioapic and
shadow_redir_table are moved over, and an array of phys_ioapic
structures takes over their place. There is still only a single instance
supported, but once we have more, the physical base address will be used
to differentiate between them and also look them up from the array.

Internal functions of the IOAPIC subsystem are converted to make use of
the abstraction.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

core: Make for_each_[non_root_]cell globally available

We are going to use the for_each_cell iterator in the IOAPIC module. To
remain consistent, export both of them.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

x86: Filter out unsupported numbers of irqchips

So far we only support a single IOAPIC per cell on x86. Soon this number
will be increased significantly, but a limit will remain. Filter out any
unsupported configurations during cell-specific IOAPIC setup.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

tools: Extend config generator to process multiple IOAPICs

As a first step towards full support of more than one IOAPIC, extend the
config generator to process multiple IOAPIC entries in the DMAR table.
It uses the MADT ("APIC") table to collect further information about the
found IOAPICs and lists them all in the irqchips array.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

x86: Only hand over IOAPICs pins to the root cell that are in use

Use the bitmap of currently assigned IOAPIC pins to hand them over to
the root cell, not those that are initially assigned. That makes a
difference when shutting down the hypervisor while some pins are still
owned by a non-root cell. During startup, both bitmaps are identical.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

x86: Fix error roll-back for vtd

If we fail the hypervisor setup before vtd_init_unit was run, we must
not try to restore anything during iommu_shutdown. This happened so far
and caused Linux crashes as well as spurious NMI injections.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

x86: Move vtd_init_fault_nmi from cell creation to config commit

This avoids that we change the DMAR unit settings before the setup
process succeeded. Will help to fix the roll-back on errors.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

driver: Avoid vmalloc(0) on creation of cells without PCI devices

The kernel does not like this pattern and may throw warnings.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

Revert "x86/tools/inmates: Account for 32-bit PM timers"

This reverts commit 6cd05b8f9b3f97998d7a4c857584dbfc5ef901f9.

Another way of dealing with 32-bit PM timers is to just pretend they
were 24 bits long. That is what an earlier patch does for jailhouse, so
this one is not required anymore.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Conflicts:
inmates/lib/x86/inmate.h
inmates/lib/x86/timing.c

[Jan: remove pm_timer_init also from ivshmem-demo]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

inmates: x86: mask pm_timer to 24bits

Operate any pm_timer in 24bit mode, even if it is 32bit capable. Linux
also just looks at the lower 24.
That simplifies the code and we can deal with 24bit timers where the
ACPI tables claim they are 32bit wide.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
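
Illustration (not part of the patch): masking every reading to 24 bits
makes the wrap-around arithmetic identical for 24-bit and 32-bit
timers. The sketch below assumes the standard ACPI PM timer frequency
of 3.579545 MHz; the function names are hypothetical and not those of
the inmate library.

```python
PM_TIMER_HZ = 3579545          # ACPI PM timer frequency
PM_TIMER_MASK = 0x00ffffff     # treat every timer as 24 bits wide


def pm_timer_read(raw):
    """Reduce a raw register value to the lower 24 bits."""
    return raw & PM_TIMER_MASK


def pm_timer_delta(prev, now):
    """Ticks elapsed between two masked readings, across one wrap."""
    return (now - prev) & PM_TIMER_MASK


def ticks_to_ns(ticks):
    """Convert PM timer ticks to nanoseconds."""
    return ticks * 1000000000 // PM_TIMER_HZ
```

A 24-bit counter at this frequency wraps after about 4.7 seconds, so
the delta is only meaningful if readings are taken more often than
that.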

configs: Remove obsolete chromebook config

This was never completed and most likely will never be. Drop it.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

configs: Adjust QEMU config to use VGA instead of Cirrus

VGA became the standard video adapter in QEMU 2.2. Adjust the config
accordingly.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

Convert to TODO.md

Make this file markdown-friendly as well.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

Update TODO

Remove recently completed or obsoleted items, add details on next steps
about inter-cell communication.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

docs: Add CONTRIBUTING.md

Specify the contribution cycle in form of a checklist and a sketched
integration process. Also list people with specific responsibility areas
that should be involved on their topics.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

tooling: Detect too old make version

Massaged version of Hans' original patch: Since d0ca500b we depend on
make >= 3.82. That can be a problem for oldish distributions. Better
catch it early.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
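
Illustration (not part of the patch): the version requirement can be
expressed as a numeric tuple comparison. This sketch parses the first
line printed by "make --version"; it is not how the patch itself
performs the check, and the function names are hypothetical.

```python
import re


def parse_make_version(version_output):
    """Extract (major, minor) from 'GNU Make X.Y...' or return None."""
    match = re.search(r'GNU Make (\d+)\.(\d+)', version_output)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2)))


def make_is_new_enough(version_output, minimum=(3, 82)):
    """True if the reported version is at least the required one."""
    version = parse_make_version(version_output)
    return version is not None and version >= minimum
```

Comparing integer tuples avoids the pitfalls of comparing version
strings character by character.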

README: Add link to Travis CI

Link to our continuous integration service, including build status
visualization that github renders for us when displaying the README.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

README: Add information about Banana Pi setup

Describe how to set up and run Jailhouse with inmates on the Banana Pi
board. This is currently our physical reference for ARM systems.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

README: Adjust to markdown format

Perform some reformatting so that we can present the README as markdown
file for nicer visualization on github. Also prepare for ARM addition
and adjust the kernel version requirement of x86 at this chance.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

ci: Add Travis CI infrastructure

Based on Roger Meier's proposal, this adds support for testing Jailhouse
builds on Travis CI (travis-ci.org). The major differences to Roger's
approach are:
- Linux kernels are pre-built and pushed as archive to a webserver
- all target variants (x86, Banana Pi, Versatile Express) are built in
a single run to limit archive downloads
- required kernel and Jailhouse configs become part of our tree

The kernel archive can be generated via ci/gen-kernel-build.sh in an
environment comparable to the Travis CI VMs. See ci/README.md for more
information.

CC: Roger Meier <r.meier@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: Clear virtual GICs before handing them over to Linux during setup

Previous users of the virtual GICs may have left them with pending
interrupts or raised priority levels. Fix this up before starting Linux
under Jailhouse control. Otherwise we risk to inject spurious interrupts
or stall interrupt delivery to Linux.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: Implement PSCI_AFFINITY_INFO_32

Linux uses it to check if a CPU is really dead and at least dumps
warnings on the console if this function fails. It is mandatory to
implement according to the spec.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: Wait for CPU to stop in arch_suspend_cpu

The semantics of arch_suspend_cpu are synchronous, i.e. it has to wait
until the target CPU was actually suspended. Extend the ARM
implementation accordingly.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: Clean up psci_cpu_stopped usage

psci_cpu_stopped returns a bool, so let's use it like this.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

inmates: arm: Enhance gic-demo with latency statistics

Original version by Johann Pfefferl: This transfers the apic-demo to
ARM by letting the timer tick at 10 Hz and print jitter statistics on
each event. In addition, this also lets the green LED on the Banana Pi
blink.

CC: Johann Pfefferl <johann.pfefferl@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: Open clock gate on UART setup

Add the infrastructure to open a clock gate on UART configuration. This
is particularly helpful if Linux drivers close the gate when releasing
the device.

For now the assumption is that a clock gate can be described by a single
bit in a specific register.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: Add support for Banana Pi board

The Banana Pi is a cheap ARMv7 board with a dual-core Cortex-A7, thus
with virtualization support. Upstream U-boot and kernel work fine -
ideal conditions. We just lack some IOMMU on that board, but it remains
handy for testing purposes.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: Rework return to EL1 path

Refactor cpu_return_el1 to cpu_prepare_return_el1, moving the differing
parts depending on the return mode to the caller site. Ensure that we
return to Linux passing the proper error code - it's now available to
arch_cpu_restore.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

core: Pass return code to arch_cpu_restore

Some architectures, so far ARM, may prefer to jump directly to the
target Linux context from arch_cpu_restore. In this case we need to have
the return code at hand as well. Extend the parameter list accordingly
and document the possibility that arch_cpu_restore does not return to
the caller.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: Account for irqchip_cell_exit being called before irqchip_init

If the hypervisor setup procedure fails before irqchip_init was called,
arch_shutdown will still invoke irqchip_cell_exit. If we run this
function, we'll crash at the latest when trying to access the not yet
mapped GIC. Leave irqchip_cell_exit early in this case.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: Switch to generic UART mapping

Start using the generic UART mapping by the Linux driver. For this the
VExpress config has to gain physical base and size information of the
debug UART.

This removes the tedious need to adjust UART_BASE_VIRT in platform.h
according to the Linux configuration.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

core/driver: Add support for mapping the debug UART from the driver

If the debug UART is memory-mapped, we can only access it prior to
switching to hypervisor mappings if the driver supports us in this. By
adding a debug_uart memory region to the system configuration, we tell
the driver about the mapping need. In turn, the driver reports the
virtual address via an additional header field. The mapping can be
released on Linux side right after enabling the hypervisor.

Provided the virtual address of the UART mapping as chosen by Linux does
not conflict with our remapping region, this mapping can safely be
replicated into the hypervisor address space so that we don't need to
adjust the UART access after enabling our own mapping.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

core: Redefine PAGE_FLAG_UNCACHED to PAGE_FLAG_DEVICE

All (x86) users of this page flag map devices into the hypervisor
address space. We will do the same for ARM when mapping the debug UART.
For this we need a generic flag with the same semantics. As uncached is
different from device mappings, redefine the semantics of the UNCACHED
flag for this purpose.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: Remove unused phys_base from uart_chip

This field is write-only, the UART driver is only interested in the
virtual address.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

core: Refactor arch-specific section definition

Require all archs to define ARCH_SECTIONS via asm/section.h, at least an
empty one. Include this unconditionally in the hypervisor layout.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: Stop misusing JAILHOUSE_MEM_DMA for marking MMIO

Introduce JAILHOUSE_MEM_IO so that archs that need to tag MMIO regions
have a proper flag. Apply it on ARM.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: Adjust UART_BASE_VIRT according to local test configuration

It's almost pointless to tune this constant as it is highly dependent on
the local kernel config. However, this one helps local testing until we
have a better solution for getting the UART mapped for the hypervisor
during early setup.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

inmates: arm: Fix and improve build

Introduce and use DECLARE_TARGETS just like x86 does. This prevents
unconditional rebuilding of the inmates on every make. Also move the
filtering of "-include asm/unified.h" into reusable Makefile.lib.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

configs: Adjust VExpress configs for smaller reservations

Move the hypervisor at the top of 2G memory. The gic and uart demo cells
are placed below, each given much less space than before - they don't
need more.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

configs: Add NIC and MMC to VExpress config

These two come at least with the Fast Model of the Versatile Express.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: Improve output on fatal traps

Dump the exception class but drop the CPU ID - that one will be printed
anyway by panic_park.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: config examples

These config files are used to run the root cell (vexpress.c) and the
inmates examples (vexpress-*-demo) on the vexpress platform.
vexpress-linux-demo can be used to create a cell for an SMP linux on CPUs
2 and 3, by assigning it the UART3 IRQ.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: add copyright headers]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: basic inmates demos

This patch adds the necessary libraries for writing simple inmates on arm,
and provides two demos.
It attempts to use the same layout as x86, and allows to set the devices
base addresses with platform-specific includes, in order to avoid
including kconfig.h

Only the vexpress platform with GICv2 and GICv3 is supported for the
moment.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: Use config.mk instead of kbuild, use mmio accessors,
avoid integer overflow, add copyright headers]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: exit statistics

This patch implements the counters that report the number of VM exits,
accessible by the driver. It also adds three statistics for the ARM
side: the number of IRQs injected, the number of IPIs injected, and the
number of GIC maintenance IRQs received.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: GICv2: handle SPI routing

GICv2 is limited to 8 CPUs and uses independent routing bits, whereas
GICv3 (with ARE enabled) uses the MPIDR encoding (aff3.aff2.aff1.aff0)
for routing SPIs.
Before handling SPIs, the GICv2 backend has to probe its banked view of
the distributor to know which CPU interface it is accessing. After that,
the implementation is roughly the same as for GICv3, but GICD_ITARGETSR
are used instead of IROUTER.
Because the guest isn't supposed to rely on the CPU interface number
being coherent with the CPU logical ID, we don't have to translate it to
a virtual ID before handling route accesses inside SMP cells.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: switch to mmio accessor]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
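
Illustration (not part of the patch): the GICv2 routing registers can
be modelled as below. The sketch relies on the GICv2 architecture,
where GICD_ITARGETSR starts at distributor offset 0x800 and holds one
byte per interrupt, each byte being a bitmask of CPU interfaces, and
where the banked GICD_ITARGETSR0 reads back the bit of the CPU
interface doing the read. The function names are hypothetical.

```python
GICD_ITARGETSR = 0x800   # offset of the first target register


def itargetsr_location(irq):
    """Return (register offset, bit shift) of the target byte for irq."""
    return GICD_ITARGETSR + (irq // 4) * 4, (irq % 4) * 8


def set_spi_target(reg_value, irq, cpu_interface):
    """Return the register value with irq routed to one CPU interface."""
    _, shift = itargetsr_location(irq)
    cleared = reg_value & ~(0xff << shift)
    return (cleared | ((1 << cpu_interface) << shift)) & 0xffffffff


def probe_cpu_interface(itargetsr0_value):
    """Derive the local CPU interface number from banked ITARGETSR0."""
    mask = itargetsr0_value & 0xff
    if mask == 0 or mask & (mask - 1):
        raise ValueError('expected exactly one target bit')
    return mask.bit_length() - 1
```

With GICv3 and ARE enabled, the same routing decision would be written
as an affinity value into IROUTER instead of a per-byte bitmask.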

arm: add support for GICv2

This patch implements the following GICv2 features:
- Remap GICC to GICV in the cells to provide a virtual interface
- Guest SGI filtering and hyp SGI handling
- IRQ injection

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: switch to mmio accessor]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: GIC: factor some GICv3 functions into gic_common

Some functions are abstract enough to be used by the GICv2 backend.
This patch moves them into the common code.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: restore kernel on setup failure

This patch implements two cases:
- When an error occurs before setting up EL2, there is nothing much
  to do except restore the Linux registers stored in the per_cpu
  data.
- When it happens after EL2 setup, arch_cpu_restore copies the saved
  registers on the stack, and continues into arch_shutdown_self

When it happens during the MMU setup, chances of recovering a clean
state are pretty thin anyway. The bootstrap vectors could be used to
catch and dump a minimal context (which would require a raw_printk
implementation), but we cowardly ignore this case for the moment.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: fix memcpy size in cpu_return_el1]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: implement hypervisor shutdown

When an HV_DISABLE hypercall is issued on all root CPUs by the driver,
the core `shutdown' function executes the following operations:
- Suspend all non-root cells (all the CPUs are taken to hyp idle mode),
- call arch_shutdown_cpu for all those CPUs,
- call arch_shutdown.
Once the master CPU (the first to take the shutdown lock) did this, the
other root CPUs don't actually perform any operation.

This patch lets the arch_shutdown and arch_shutdown_cpu set a boolean
that is considered by the cores right before returning to EL1: for the
cells' CPUs, arch_shutdown_cpu will trigger a return to arch_reset_self,
that will clean up EL1 and EL2. On the root cpus, the exit handler
checks this boolean and calls the shutdown function.

Once inside arch_shutdown_self, the principle is the same as with the
hypervisor initialisation:
- Create identity mappings of the trampoline page and the stack,
- Jump to the physical address of the shutdown function,
- Disable the MMU,
- Reset the vectors,
- Return to EL1

This patch does not handle hosts using PSCI yet: they will need to issue
a final SMC on secondary CPUs in order to park themselves at EL3, since
the hypervisor won't exist anymore to emulate the wakeup call.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: moved arch_shutdown_cpu & arch_shutdown to control.c]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: irqchip: add hypervisor shutdown

Shutting down the GIC on the root cell consists of re-enabling direct
access to the CPU interface.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: switch to mmio accessors]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: save the linux hyp-stub vectors

This patch stores the hypervisor stub vectors before installing EL2, in
order to reset them on shutdown. It assumes that they are the same on all
CPUs.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: ignore writes to the ACTLR register

The Auxiliary Control Register may be used on some platforms to disable
memory coherency between the cores, for instance when unplugging a CPU.
This patch ensures that ACTLR is never modified, by trapping its accesses
with the HCR.TAC bit.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: add platform-dependent SMP operations

Hotplugging CPUs on ARM is quite difficult, as each platform uses its
own system. The use of PSCI emulation will greatly simplify this, but
on many platforms, we still have to define a series of specific SMP
operations wrapped around the kernel hotplug implementation.

This patch adds support for the vexpress hotplug system:
- When the root cell attempts to unplug a CPU, to give it to a new
  cell, it is put in a WFI loop, which is left when Jailhouse sends
  a synchronising IPI to all CPUs that need to be parked.
- When re-assigning a CPU to the root cell, the simplest return path
  is through the kernel's secondary entry, whose address is stored in
  the system flags register.

Because the kernel only writes the flag register once, plugging CPUs in
the host cannot be accomplished by waiting for a trapped MMIO. Moreover,
such a trap would be missed on hypervisor shutdown, since CPU0 may
return to bare EL1 before secondary CPUs. On some platforms, it may be
necessary to park secondary CPUs outside of the hypervisor on shutdown,
by copying a minimal spin code in a reserved location...

This patch also attempts to combine both classical and PSCI boot methods
in SMP guests: secondary CPUs are held in the psci_emulate_spin handler,
and can be woken up by both a PSCI call and a trapped access to the
vexpress mbox.
The same applies for hotplugging secondary CPUs in the guests, but the
mailbox method only waits for an IPI.

PSCI in the host is not currently supported: it would require a call to
the actual CPU_OFF handler when shutting down the whole hypervisor.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: switch to mmio accessor]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: PSCI emulation

By adding a new field 'guest_mbox' in the cpu_datas, this patch allows
the guests to issue PSCI HVC calls. Currently, only PSCI_CPU_ON,
PSCI_CPU_OFF, and PSCI_VERSION are handled.
A call to CPU_OFF enters the suspend mode through arch_reset_self. When
a core calls CPU_ON, the hypervisor wakes up the other core, which will
take its return address from the guest_mbox, wipe its registers and go
back to EL1. The context argument to PSCI_CPU_ON is currently ignored,
since the whole core is reset.
This patch also traps SMC instructions in order to catch PSCI requests
issued this way. All other SMC calls are forwarded to
EL3.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
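
Illustration (not part of the patch): the three handled calls can be
modelled as a small dispatcher. The function IDs and return codes are
the SMC32 values of the PSCI 0.2 specification; whether the patch uses
exactly these IDs, and the reported version value, are assumptions.
The GuestCpu class and psci_dispatch are hypothetical names, with
guest_mbox standing in for the field named in the commit message.

```python
PSCI_VERSION = 0x84000000
PSCI_CPU_OFF = 0x84000002
PSCI_CPU_ON = 0x84000003

PSCI_SUCCESS = 0
PSCI_INVALID_PARAMETERS = -2
PSCI_ALREADY_ON = -4

REPORTED_VERSION = 2   # major 0, minor 2 (assumption)


class GuestCpu:
    def __init__(self, running=False):
        self.running = running
        self.guest_mbox = None   # entry address passed to CPU_ON


def psci_dispatch(cpus, caller_id, function_id, target_id=None, entry=None):
    """Handle one call; return None if it is not a handled PSCI call."""
    if function_id == PSCI_VERSION:
        return REPORTED_VERSION
    if function_id == PSCI_CPU_OFF:
        # The real call does not return to the caller on success.
        cpus[caller_id].running = False
        return PSCI_SUCCESS
    if function_id == PSCI_CPU_ON:
        target = cpus.get(target_id)
        if target is None:
            return PSCI_INVALID_PARAMETERS
        if target.running:
            return PSCI_ALREADY_ON
        target.guest_mbox = entry   # woken core resumes at this address
        target.running = True
        return PSCI_SUCCESS
    return None   # not handled here, e.g. an SMC to be forwarded
```

The context argument of CPU_ON is left out of the model because the
commit message states that it is currently ignored.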

arm: GIC: handle distributor accesses

This patch adds the handling of MMIO accesses to the GICv3 distributor.
By restricting the SPI masks to the cell's configuration, it makes sure
that they do not touch the other cells' SPIs when writing the common
registers.
Except for the routing and SGIR registers, most of the code should be
common to both GICv2 and v3.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: switch to mmio accessor]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: irqchip: add SPI configuration in cell_init and cell_exit

This patch enables the routing of SPIs to the new cell's first CPU. When
destroyed, all SPIs are re-routed to the root cell.
An exhaustive implementation would save the targets of each IRQ before
transferring it to a new cell. Since linux does not currently route SPIs
to secondary CPUs and the root cell is not supposed to use devices that
will be assigned to guests anyway, it should be safe to route everything
to CPU0.

This patch follows the core configuration and the IOAPIC implementation,
which only allows to use the first 64 SPIs.
A future patch will need to change this minimal bitmap size to 988.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: switch to mmio accessors]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
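
Illustration (not part of the patch): the figure of 988 follows from
the GIC interrupt ID space, in which IDs 0 to 1019 are usable and the
first 32 are reserved for SGIs and PPIs. The sketch below works this
out and sizes a bitmap accordingly; the 32-bit word size and the
helper name are assumptions.

```python
GIC_USABLE_IDS = 1020    # interrupt IDs 0..1019
GIC_FIRST_SPI = 32       # IDs 0..15 are SGIs, 16..31 are PPIs
MAX_SPIS = GIC_USABLE_IDS - GIC_FIRST_SPI


def bitmap_words(num_bits, bits_per_word=32):
    """Number of words needed to hold num_bits bits."""
    return (num_bits + bits_per_word - 1) // bits_per_word
```

With 32-bit words, the current 64 SPIs fit into 2 words, while the
full range needs 31.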

arm: GIC: filter redistributor accesses

Since each cell has its own set of CPU ids, they can't access the
redistributor associated to their MPIDR. Instead, the MMIO accesses are
translated to their physical redistributor, and a read to the ID
register returns the virtual affinity value.

It is a bit more expensive than simply mapping the redistributor to the
cell, but the guest rarely needs to reconfigure its PPIs and IPIs, so
this patch shouldn't introduce any significant performance loss.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: attribute virtual IDs to the cell cpus

To handle SMP guests, the cells need to be assigned virtual CPU IDs
through the VMPIDR register. For the moment, those IDs are simply
generated incrementally on each CPU.

This change will allow to use the same guest code in different cells.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: add "arm_" prefix, uninline arm_cpu_virt2phys]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
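
Illustration (not part of the patch): incremental assignment of
virtual CPU IDs can be modelled as a simple enumeration. The function
names are hypothetical, and assigning IDs in ascending order of the
physical ID is an assumption of this sketch.

```python
def assign_virtual_ids(physical_cpu_ids):
    """Map virtual IDs 0, 1, 2, ... to a cell's physical CPU IDs."""
    return {virt: phys
            for virt, phys in enumerate(sorted(physical_cpu_ids))}


def virt2phys(mapping, virt_id):
    """Translate a virtual CPU ID; None if the cell has no such CPU."""
    return mapping.get(virt_id)
```

A cell running on physical CPUs 2 and 3 then sees CPUs 0 and 1, which
is what allows the same guest code to run in different cells.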

arm: mmio emulation skeleton

This patch adds the necessary code for handling MMIO accesses. The trap
handler fills a mmio_access struct according to the fields in the ESR,
and passes it to all relevant sub-handlers.
If all return UNHANDLED, the access is considered invalid and the CPU is
put into failed mode.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: switch to mmio accessors, use asm/mmio.h]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
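
Illustration (not part of the patch): the dispatch described above can
be sketched as a loop over sub-handlers. The constant values, the
MmioAccess fields and the example region handler are assumptions for
this model and do not reproduce the hypervisor's structures.

```python
TRAP_UNHANDLED = 0
TRAP_HANDLED = 1


class MmioAccess:
    """Decoded access, as a trap handler would fill it from the ESR."""

    def __init__(self, addr, is_write, size, val=0):
        self.addr = addr
        self.is_write = is_write
        self.size = size
        self.val = val


def make_region_handler(base, size, registers):
    """Build a sub-handler emulating registers in [base, base + size)."""
    def handler(access):
        if not base <= access.addr < base + size:
            return TRAP_UNHANDLED
        if access.is_write:
            registers[access.addr - base] = access.val
        else:
            access.val = registers.get(access.addr - base, 0)
        return TRAP_HANDLED
    return handler


def dispatch_mmio(access, handlers):
    """Offer the access to each sub-handler until one claims it."""
    for handler in handlers:
        if handler(access) == TRAP_HANDLED:
            return TRAP_HANDLED
    return TRAP_UNHANDLED   # caller treats the access as invalid
```

An unhandled result is the case in which the commit message puts the
CPU into failed mode.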

arm: better error reporting and panic dump

This patch adds exhaustive handling of hypervisor errors, and the
ability to stop and park CPUs after dumping their EL1 context, when they
encounter an unhandled trap.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: Adjustments to recent control subsystem changes]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: Complete paging invalidations

This patch is based on the original version by Jean-Philippe Brucker. It
fills different paging stubs:
- the arch_flush_cell_vcpu_caches stub, which is used by the core via
  config_commit each time the memory is remapped. It allows the TLBs to
  be invalidated on all affected CPUs of the cell.
- the arch_paging_flush_cpu_caches function is used to flush the
  hypervisor page table entries when using the PAGE_MAP_COHERENT flag
  (useful for IOMMU, not currently in use on the arm side.)
- the arch_paging_flush_page_tlbs function is used to invalidate a TLB
  entry after modifying the hypervisor paging structures. It must ignore
  accesses done from the initial setup code at EL1, which are committed
  once at EL2 with a TLBIALLH, just before enabling the MMU.

arch_config_commit has nothing to do so far. This will change once an
IOMMU is supported.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: disable caches on cell reset

This patch allows new guests to be entered with caches disabled. By
cleaning the data caches, it makes sure that the recently written guest
code and data are present in memory before returning to an environment
with only a stage-2 MMU.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: flush and enable the caches at initialisation

Before jumping to EL2, which has its MMU disabled, the data caches need
to be cleaned, in order to be coherent with the EL1 context.
This patch implements the complete data cache flush by set/way, and
enables the EL2 caches if possible.

Please note that the hypervisor always assumes that the kernel sets up
its memory in a coherent way between the cores, which means that all the
relevant memory regions (i.e. the Jailhouse code and data) are supposed
to be cacheable and inner-shareable, so that only a clean is needed
before turning the caches off.
In the short section where the MMU is off, the hypervisor doesn't write
anything to memory.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: clear the banked and system registers on reset

This patch allows new guests to be booted with an almost empty context.
The reset function is still missing Performance Monitor, debug, SIMD and
float registers.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: implement the cell destruction

Give the CPUs back to Linux when a cell is destroyed, after resetting its
whole context.

This patch uses the old vexpress CPU hotplug system in Linux: the
secondary startup function address is kept in the system flags, so the
cpu_reset function will simply jump there after resetting the CPU state.
A future patch will add PSCI hotplug support, and once the hypervisor
has access to a device tree, this spin function will need to be set
dynamically in a `struct hotplug_ops'.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: switch to mmio accessors]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: GIC: reset the CPU interface before running a new guest

All pending interrupts need to be cleared before running a new guest.
This patch resets the list registers, the software pending queue, and
the GIC hypervisor config registers.
Since the suspend loop was entered through the IRQ handler, we also need
to deactivate that active IPI.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: switch to mmio accessors]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: implement the cell creation

This patch implements most CPU handling functions needed to set up and
start a new cell.
- When the core calls suspend_cpu, an SGI is sent to this CPU.
  Since all IRQs are taken directly to the hypervisor, the guest will be
  interrupted and execution will continue into psci_suspend.
- resume_cpu will simply call psci_cpu_on and return to the IRQ
  handling loop.
- reset_cpu will resume the CPU into arch_reset_self.
  After resetting all relevant registers and devices, execution will
  continue into the newly created guest, by calling vmresume with a
  clean set of registers.

To keep this patch light, most of the reset code is still missing.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: minimal PSCI implementation

This is not the final thing: it can only be used internally for
suspending, resuming and parking CPUs while reconfiguring the cells.
Using this base, a trivial PSCI 0.2 emulation can be added by implementing
the appropriate trap hooks.

The mailbox in the per_cpu structure is used to store the address and
context where psci_cpu_off returns.
CPUs doing reconfiguration on the cells can use the psci_suspend and
psci_resume wrappers to store all affected cores.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: GICv3: filter the guests' SGIs

In GICv3, IPIs are sent by writing the system register `ICC_SGIR'.
This patch moderates those writes by injecting the IPIs into the
appropriate cells, and issues a hypervisor IPI to let the cell's CPUs
fill their list registers.

Since there shouldn't be many cases where Jailhouse needs to emulate
system register accesses, this patch keeps it simple by calling the
GICv3 function directly from the trap handler, without abstracting it
through irqchip.
However, this change adds an ungraceful ifdef, since the GICv2 and v3
headers are mutually exclusive for the moment.
In GICv2, the SGIR register is 32bit and will be handled directly in the
gic-common.c code, using an MMIO trap of the distributor accesses.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: skip instructions that fail their condition check

On some implementations, instructions may trap before the PE was able to
check their condition code. This patch adds the ability to check it
before emulating something that wouldn't be executed. In thumb mode, the IT
state has to be updated when skipping instructions.
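
For reference, the ARM condition check and the IT-state advance can be
sketched as below. This is a standalone illustration, not the code
adapted from Linux and KVM: the function names are made up here, and
it_advance works on the 8-bit ITSTATE value rather than on the split IT
bits as they are stored in the CPSR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PSR_N (UINT32_C(1) << 31)
#define PSR_Z (UINT32_C(1) << 30)
#define PSR_C (UINT32_C(1) << 29)
#define PSR_V (UINT32_C(1) << 28)

/*
 * Evaluate a 4-bit ARM condition code against the flags in a PSR value.
 * The upper three bits select the base test, the lowest bit inverts it,
 * except for 0b111x, which always passes.
 */
static bool arm_cond_passes(uint32_t cond, uint32_t psr)
{
	bool n = (psr & PSR_N) != 0, z = (psr & PSR_Z) != 0;
	bool c = (psr & PSR_C) != 0, v = (psr & PSR_V) != 0;
	bool result;

	cond &= 0xf;
	switch (cond >> 1) {
	case 0: result = z; break;		/* EQ / NE */
	case 1: result = c; break;		/* CS / CC */
	case 2: result = n; break;		/* MI / PL */
	case 3: result = v; break;		/* VS / VC */
	case 4: result = c && !z; break;	/* HI / LS */
	case 5: result = n == v; break;		/* GE / LT */
	case 6: result = !z && n == v; break;	/* GT / LE */
	default: return true;			/* AL and 0b1111 */
	}
	return (cond & 1) ? !result : result;
}

/*
 * Advance the 8-bit Thumb ITSTATE past one instruction: bits [7:5] hold
 * the base condition, bits [4:0] the mask. When the low three mask bits
 * are clear, the IT block ends and the whole state is reset.
 */
static uint8_t it_advance(uint8_t itstate)
{
	if ((itstate & 0x07) == 0)
		return 0;
	return (uint8_t)((itstate & 0xe0) | ((itstate << 1) & 0x1f));
}
```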

Most of this patch is copied and adapted from Linux and KVM.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: read/write the banked registers

When emulating instructions, the trap handler will need to access the
cell registers according to the guest's processor mode when the trap
occurred, which is stored inside the saved PSR.
This patch allows the banked registers to be read and written directly.
If HSR reports a load into r14 and the mode was IRQ for instance, the
hypervisor will need to write something into LR_IRQ instead of the LR
saved on the stack.
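
The selection logic can be modelled as in the sketch below. It keeps the
banks in a plain struct so that it runs anywhere; on real hardware the
banked copies would be accessed with dedicated instructions instead.
The struct layout and the reg_slot helper are assumptions for this
example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PSR_MODE_MASK	0x1fu
#define MODE_FIQ	0x11u
#define MODE_IRQ	0x12u
#define MODE_SVC	0x13u
#define MODE_ABT	0x17u
#define MODE_UND	0x1bu

/* Assumed in-memory model of the guest register banks. */
struct guest_regs {
	uint32_t usr[15];	/* r0-r14 as saved on the stack */
	uint32_t fiq[7];	/* r8_fiq-r14_fiq */
	uint32_t irq[2];	/* sp_irq, lr_irq */
	uint32_t svc[2];	/* sp_svc, lr_svc */
	uint32_t abt[2];	/* sp_abt, lr_abt */
	uint32_t und[2];	/* sp_und, lr_und */
};

/*
 * Return the storage backing register 'reg' (0-14) for the mode found in
 * the saved PSR. FIQ banks r8-r14; IRQ, SVC, ABT and UND bank r13 and
 * r14 only. Everything else uses the common copy.
 */
static uint32_t *reg_slot(struct guest_regs *regs, uint32_t psr,
			  unsigned int reg)
{
	uint32_t *bank;

	if (reg > 14)
		return NULL;

	switch (psr & PSR_MODE_MASK) {
	case MODE_FIQ:
		return reg >= 8 ? &regs->fiq[reg - 8] : &regs->usr[reg];
	case MODE_IRQ: bank = regs->irq; break;
	case MODE_SVC: bank = regs->svc; break;
	case MODE_ABT: bank = regs->abt; break;
	case MODE_UND: bank = regs->und; break;
	default:
		return &regs->usr[reg];
	}
	return reg >= 13 ? &bank[reg - 13] : &regs->usr[reg];
}
```

A load into r14 trapped in IRQ mode then resolves to the lr_irq slot,
leaving the LR saved on the stack untouched.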

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: GICv3: handle IRQs

The GIC IRQ handler loops over the ack register to get all pending IRQs.
It then dispatches them either in the common SGI handler, or injects
them into the cell.
A first attempt to directly inject an IRQ by writing to a free list
register is done. If it fails, the IRQ is appended to the pending list,
and an attempt will be made later on, once a maintenance interrupt is
received.
Injection in the GIC is a little bit expensive for the moment, because
it needs to iterate over all list registers that have a valid interrupt,
to check that there will be no duplication. This could be optimized by
only checking the `active' GIC register for SPIs and PPIs.
A future patch will also add proper handling of the maintenance bits in
the vGIC.
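
The injection attempt described above might be modelled as follows. The
list registers are represented by a small array here, and the number of
registers, the struct and the result codes are assumptions for this
sketch; the real code reads and writes the GIC's list registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_LRS 4	/* assumed; the hardware reports the real count */

enum inject_result { INJECTED, ALREADY_PRESENT, NO_FREE_LR };

/* Simplified stand-in for one GIC list register. */
struct list_reg {
	bool valid;
	uint32_t irq_id;
};

/*
 * Scan all list registers: bail out if the IRQ is already queued in a
 * valid one, remember the first free slot, and use it if no duplicate
 * was found. NO_FREE_LR tells the caller to fall back to the pending
 * list and retry on the next maintenance interrupt.
 */
static enum inject_result inject_irq(struct list_reg lrs[NUM_LRS],
				     uint32_t irq_id)
{
	int free_lr = -1;
	int i;

	for (i = 0; i < NUM_LRS; i++) {
		if (!lrs[i].valid) {
			if (free_lr < 0)
				free_lr = i;
			continue;
		}
		if (lrs[i].irq_id == irq_id)
			return ALREADY_PRESENT;
	}

	if (free_lr < 0)
		return NO_FREE_LR;

	lrs[free_lr].valid = true;
	lrs[free_lr].irq_id = irq_id;
	return INJECTED;
}
```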

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: store the pending virtual interrupts

This patch introduces a pending_irq structure to provide a level of
abstraction, in order to store the interrupts waiting to be injected in
the cell. They are allocated as a static array of 256 IRQs for each CPU,
which should be more than enough. Insertion finds the first available slot
and builds a linked list of pending vIRQs.
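
A minimal sketch of such a pool, assuming a doubly linked list threaded
through a static array, is shown below. The field and function names
are hypothetical and only illustrate the first-free-slot insertion
described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING_IRQS 256

/* Hypothetical layout; field names are assumptions for illustration. */
struct pending_irq {
	uint32_t virt_id;
	int in_use;
	struct pending_irq *next;
	struct pending_irq *prev;
};

/* One pool per CPU: static slots plus the head of the pending list. */
struct pending_pool {
	struct pending_irq slots[MAX_PENDING_IRQS];
	struct pending_irq *first;
};

/* Take the first free slot and append it to the tail of the list. */
static int pending_insert(struct pending_pool *pool, uint32_t virt_id)
{
	struct pending_irq *slot = NULL, *tail;
	size_t i;

	for (i = 0; i < MAX_PENDING_IRQS; i++) {
		if (!pool->slots[i].in_use) {
			slot = &pool->slots[i];
			break;
		}
	}
	if (!slot)
		return -1;	/* all 256 slots are busy */

	slot->virt_id = virt_id;
	slot->in_use = 1;
	slot->next = NULL;
	slot->prev = NULL;

	if (!pool->first) {
		pool->first = slot;
	} else {
		for (tail = pool->first; tail->next; tail = tail->next)
			;
		tail->next = slot;
		slot->prev = tail;
	}
	return 0;
}

/* Unlink the oldest entry and return its ID, or -1 if the list is empty. */
static int64_t pending_pop(struct pending_pool *pool)
{
	struct pending_irq *head = pool->first;

	if (!head)
		return -1;

	pool->first = head->next;
	if (pool->first)
		pool->first->prev = NULL;
	head->in_use = 0;
	return head->virt_id;
}
```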

Two cases justify the need for this structure:
- The GIC has a limited number of list registers for injecting virtual
  interrupts. Once they are full, software must store the pending ones
  itself, and use the GIC's maintenance IRQ to be informed when they are
  available again.
  In Jailhouse, this case should be very rare since IRQs are directly
  injected, but it must be taken into account nonetheless.
- IPIs sent by a core need to be stored somewhere to let the other CPUs
  inject them into their own list registers.

The GIC backend will need to call irqchip_inject_pending when receiving
a maintenance IRQ or a synchronisation SGI in order to clean the list.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: IRQ handling skeleton

Since IRQs taken to HYP use a different vector, the trap handler needs to
be aware of the exit context. To this end, the patch adds an 'exit_reason'
field to the struct registers.
The structure is still passed to the dispatcher as a pointer to the stack,
but care must be taken to ignore the exit field when restoring the user
registers.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

arm: GICv3 initialisation

Assuming there is a GIC distributor at address GICD_BASE, this patch
checks its version and calls the gic init function. Linux's kconfig header
is used to guess the base address of the distributor.
Ideally, a device tree would be passed to the hypervisor in the root
cell's config, allowing all constant base addresses to be removed.

The patch also assumes that most of the GIC has been set up by Linux prior
to the hypervisor installation, and only initialises the vGIC.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: use mmio accessors]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>