Jan Kiszka [Mon, 2 Feb 2015 12:41:29 +0000 (13:41 +0100)]
x86: Clean up and document cpu_suspended setting in arch_panic_stop
Document why we manipulate cpu_suspended outside of the per-cpu lock and
drop the superfluous memory barrier. Nothing has to be ordered here: we
just do a full stop and try to keep other CPUs from waiting forever for
us to finish "suspension".
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Mon, 2 Feb 2015 12:27:43 +0000 (13:27 +0100)]
x86: Avoid theoretical race between CPU suspension and arch_resume_cpu
Conceptually, we avoid this race by synchronizing on cpu_suspended in
arch_suspend_cpu. However, to ease the analysis by both humans and code
scanners, let's apply the lock around the manipulation. Lock acquisition
also includes the required memory barrier so that we can drop the
explicit one.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Mon, 2 Feb 2015 09:15:28 +0000 (10:15 +0100)]
ci: Add Coverity scan
This only processes x86 code so far as Coverity also relies on binary
outputs to at least trigger the scan. We will have to decide to develop
a workaround or switch to a matrix build (including redundant
environment setups).
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 1 Feb 2015 10:38:18 +0000 (11:38 +0100)]
inmates: Power-up PHY on E1000 takeover
Clear the power-down bit in the PHY control register in case the
previous user turned it off. Linux does so since about 3.15.
Note that we do not try to reset the PHY. Getting it running again with
the proper link speed turned out to be too complicated (too many PHY
variants) for this little demo.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 28 Jan 2015 09:58:58 +0000 (10:58 +0100)]
tools: Close files after use in config generator
Just to be clean and to avoid piling up unused resources. In some cases
we already did so, and in one we were using the with statement. Now the
remaining ones perform the close as well.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Henning Schild [Tue, 27 Jan 2015 14:05:30 +0000 (15:05 +0100)]
tools: config create: do not use the class file of pci devs anymore
The class file just contains the classcode. Since we started also using
the file containing the whole PCI config space we might as well get the
class information from there and copy/access fewer input files.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Tue, 27 Jan 2015 10:09:21 +0000 (11:09 +0100)]
x86: Switch to ticket spinlocks
ARM already has it, x86 should gain it as well: To avoid the risk of
unfair lock assignment or even starvation in excessive contention
scenarios, switch to the ticket-based spinlock algorithm that Linux also
uses. Our implementation is a condensed version of the kernel's, as we do
not have to take para-virtual optimizations and instrumentations into
account.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
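As a rough illustration of the ticket-based algorithm named in the commit above, here is a minimal sketch using C11 atomics. It is not the Jailhouse or the Linux implementation; the names ticket_lock, ticket_lock_init, ticket_lock_acquire and ticket_lock_release are assumptions made for this example only.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical ticket lock; all names are illustrative. */
struct ticket_lock {
	atomic_uint next;   /* next ticket to hand out */
	atomic_uint owner;  /* ticket that may enter the critical section */
};

static inline void ticket_lock_init(struct ticket_lock *l)
{
	atomic_init(&l->next, 0);
	atomic_init(&l->owner, 0);
}

static inline void ticket_lock_acquire(struct ticket_lock *l)
{
	/* Drawing a ticket atomically puts waiters into FIFO order. */
	unsigned int ticket =
		atomic_fetch_add_explicit(&l->next, 1, memory_order_relaxed);

	/* Spin until our ticket is served; acquire orders the critical section. */
	while (atomic_load_explicit(&l->owner, memory_order_acquire) != ticket)
		;
}

static inline void ticket_lock_release(struct ticket_lock *l)
{
	unsigned int cur =
		atomic_load_explicit(&l->owner, memory_order_relaxed);

	/* The lock must be held: at least one ticket is outstanding. */
	assert(atomic_load_explicit(&l->next, memory_order_relaxed) != cur);

	/* Only the holder writes owner, so a plain increment is enough. */
	atomic_store_explicit(&l->owner, cur + 1, memory_order_release);
}
```

Because every waiter spins on its own ticket number, the lock is granted in arrival order, which is what rules out starvation under heavy contention.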
Jan Kiszka [Mon, 26 Jan 2015 12:25:48 +0000 (13:25 +0100)]
x86: Account for multiple IOAPICs per cell
Finally overcome the limitation of only one IOAPIC per cell, thus also
per system. We either look up the IOAPIC from the cell array based on
its physical address or we iterate over all IOAPICs of a cell when
needed - that's all. A good sign that we achieved this is the removal of
the IOAPIC_BASE_ADDR constant.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
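The two access patterns described in the commit above (look up one IOAPIC by physical address, or iterate over all IOAPICs of a cell) could look roughly like this. The structure cell_ioapic_sketch and both helper functions are hypothetical simplifications, not the actual cell_ioapic definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified per-cell IOAPIC record. */
struct cell_ioapic_sketch {
	uint64_t base_addr;   /* physical base address of the IOAPIC */
	uint64_t pin_bitmap;  /* pins currently assigned to the cell */
};

/* Pattern 1: find the entry that matches a physical base address. */
static struct cell_ioapic_sketch *
find_ioapic(struct cell_ioapic_sketch *ioapics, size_t count, uint64_t base)
{
	assert(ioapics != NULL || count == 0);

	for (size_t n = 0; n < count; n++)
		if (ioapics[n].base_addr == base)
			return &ioapics[n];
	return NULL;
}

/* Pattern 2: visit every IOAPIC of the cell, here to count assigned pins. */
static unsigned int
count_assigned_pins(const struct cell_ioapic_sketch *ioapics, size_t count)
{
	unsigned int pins = 0;

	for (size_t n = 0; n < count; n++)
		/* Clear the lowest set bit until the bitmap is empty. */
		for (uint64_t map = ioapics[n].pin_bitmap; map; map &= map - 1)
			pins++;
	return pins;
}
```

With the base address as the lookup key, no code path needs a single hard-coded IOAPIC address any longer.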
Jan Kiszka [Mon, 26 Jan 2015 10:02:36 +0000 (11:02 +0100)]
x86: Introduce per-cell IOAPIC state
This introduces per-cell IOAPIC static and dynamic information. It
replaces related cell fields with a reference to an array of cell_ioapic
structures. As we do not want to keep a large array for every cell, even
for those that do not use the IOAPIC (typically all non-root cells), the
array is stored in a page allocated on demand during cell creation.
Using this abstraction obsoletes ioapic_find_config and moves us a bit
further away from the assumption that there is only a single IOAPIC.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Mon, 26 Jan 2015 09:07:58 +0000 (10:07 +0100)]
x86: Set up physical IOAPIC on cell creation
In preparation to support multiple IOAPICs, instantiate their physical
state phys_ioapic only on demand during cell creation. For simplicity
reasons, those instances will not be released on cell destruction again.
That means, once created, physical IOAPIC states and mappings stay with
the hypervisor until it is disabled again.
Note: Parts of the code keep their single-IOAPIC restrictions for now.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 25 Jan 2015 20:57:19 +0000 (21:57 +0100)]
x86: Introduce phys_ioapic abstraction
This structure will keep static and dynamic information about a physical
IOAPIC in a system. The three global variables ioapic_lock, ioapic and
shadow_redir_table are moved over, and an array of phys_ioapic
structures takes over their place. There is still only a single instance
supported, but once we have more, the physical base address will be used
to differentiate between them and also look them up from the array.
Internal functions of the IOAPIC subsystem are converted to make use of
the abstraction.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 24 Jan 2015 07:54:48 +0000 (08:54 +0100)]
x86: Filter out unsupported numbers of irqchips
So far we only support a single IOAPIC per cell on x86. Soon this number
will be increased significantly, but a limit will remain. Filter out any
unsupported configurations during cell-specific IOAPIC setup.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 24 Jan 2015 06:41:25 +0000 (07:41 +0100)]
tools: Extend config generator to process multiple IOAPICs
As a first step towards full support of more than one IOAPIC, extend the
config generator to process multiple IOAPIC entries in the DMAR table.
It uses the MADT ("APIC") table to collect further information about the
IOAPICs found and lists them all in the irqchips array.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 25 Jan 2015 09:28:31 +0000 (10:28 +0100)]
x86: Only hand over IOAPIC pins to the root cell that are in use
Use the bitmap of currently assigned IOAPIC pins to hand them over to
the root cell, not those that are initially assigned. That makes a
difference when shutting down the hypervisor while some pins are still
owned by a non-root cell. During startup, both bitmaps are identical.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 28 Jan 2015 07:01:04 +0000 (08:01 +0100)]
x86: Fix error roll-back for vtd
If we fail the hypervisor setup before vtd_init_unit was run, we must
not try to restore anything during iommu_shutdown. This happened so far
and caused Linux crashes as well as spurious NMI injections.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Another way of dealing with 32-bit PM timers is to just pretend they
were 24 bits long. That is what an earlier patch does for Jailhouse, so
this one is not required anymore.
Henning Schild [Wed, 26 Nov 2014 10:12:07 +0000 (11:12 +0100)]
inmates: x86: mask pm_timer to 24bits
Operate any pm_timer in 24bit mode, even if it is 32bit capable. Linux
also just looks at the lower 24.
That simplifies the code and we can deal with 24bit timers where the
ACPI tables claim they were 32bit wide.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
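A minimal sketch of the 24-bit treatment described in the commit above: the difference of two raw readings is masked to the lower 24 bits, so a wrap of the 24-bit counter is handled the same way whether or not the hardware counts further. The names PM_TIMER_MASK and pm_timer_delta are assumptions for this example.

```c
#include <assert.h>
#include <stdint.h>

#define PM_TIMER_MASK 0x00ffffffu  /* lower 24 bits of the counter */

static_assert(PM_TIMER_MASK == (1u << 24) - 1, "mask must cover 24 bits");

/* Ticks elapsed between two raw readings, treating the counter as 24-bit. */
static inline uint32_t pm_timer_delta(uint32_t prev_raw, uint32_t now_raw)
{
	/* Unsigned subtraction wraps; the mask discards bits 24 and up. */
	return (now_raw - prev_raw) & PM_TIMER_MASK;
}
```

The result is only meaningful if fewer than 2^24 ticks pass between the two readings.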
Jan Kiszka [Fri, 9 Jan 2015 19:15:21 +0000 (20:15 +0100)]
docs: Add CONTRIBUTING.md
Specify the contribution cycle in form of a checklist and a sketched
integration process. Also list people with specific responsibility areas
that should be involved on their topics.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 9 Jan 2015 17:59:42 +0000 (18:59 +0100)]
tooling: Detect too old make version
Massaged version of Hans' original patch: Since d0ca500b we depend on
make >= 3.82. That can be a problem for oldish distributions. Better
catch it early.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 2 Jan 2015 13:37:43 +0000 (14:37 +0100)]
README: Adjust to markdown format
Perform some reformatting so that we can present the README as markdown
file for nicer visualization on github. Also prepare for ARM addition
and adjust the kernel version requirement of x86 at this chance.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Thu, 1 Jan 2015 12:58:08 +0000 (13:58 +0100)]
ci: Add Travis CI infrastructure
Based on Roger Meier's proposal, this adds support for testing Jailhouse
builds on Travis CI (travis-ci.org). The major differences to Roger's
approach are:
- Linux kernels are pre-built and pushed as archive to a webserver
- all target variants (x86, Banana Pi, Versatile Express) are built in
a single run to limit archive downloads
- required kernel and Jailhouse configs become part of our tree
The kernel archive can be generated via ci/gen-kernel-build.sh in an
environment comparable to the Travis CI VMs. See ci/README.md for more
information.
CC: Roger Meier <r.meier@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 26 Dec 2014 10:52:04 +0000 (11:52 +0100)]
arm: Clear virtual GICs before handing them over to Linux during setup
Previous users of the virtual GICs may have left them with pending
interrupts or raised priority levels. Fix this up before starting Linux
under Jailhouse control. Otherwise we risk injecting spurious interrupts
or stalling interrupt delivery to Linux.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 20 Dec 2014 11:11:44 +0000 (12:11 +0100)]
arm: Implement PSCI_AFFINITY_INFO_32
Linux uses it to check if a CPU is really dead and at least dumps
warnings on the console if this function fails. It is mandatory to
implement according to the spec.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 19 Dec 2014 15:25:42 +0000 (16:25 +0100)]
arm: Wait for CPU to stop in arch_suspend_cpu
The semantics of arch_suspend_cpu are synchronous, i.e. it has to wait
until the target CPU is actually suspended. Extend the ARM
implementation accordingly.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Mon, 15 Dec 2014 17:02:49 +0000 (18:02 +0100)]
inmates: arm: Enhance gic-demo with latency statistics
Original version by Johann Pfefferl: This transfers the apic-demo to
ARM by letting the timer tick at 10 Hz and print jitter statistics on
each event. In addition, this also lets the green LED on the Banana Pi
blink.
CC: Johann Pfefferl <johann.pfefferl@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 14 Dec 2014 17:28:34 +0000 (18:28 +0100)]
arm: Open clock gate on UART setup
Add the infrastructure to open a clock gate on UART configuration. This
is particularly helpful if Linux drivers close the gate when releasing
the device.
For now the assumption is that a clock gate can be described by a single
bit in a specific register.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
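Under the stated assumption that a clock gate is one bit in one register, opening it reduces to a read-modify-write. The sketch below uses a hypothetical clock_gate_sketch structure and a plain volatile pointer instead of the hypervisor's MMIO accessors.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a clock gate: one bit in one register. */
struct clock_gate_sketch {
	volatile uint32_t *reg;  /* mapped gate register */
	unsigned int bit;        /* bit number that opens the gate */
};

static inline void clock_gate_open(const struct clock_gate_sketch *gate)
{
	assert(gate->bit < 32);

	/* Set only the gate bit and leave all other gates untouched. */
	*gate->reg |= UINT32_C(1) << gate->bit;
}
```

Opening an already open gate is harmless, so the UART setup can call this unconditionally.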
Jan Kiszka [Wed, 12 Nov 2014 12:01:43 +0000 (13:01 +0100)]
arm: Add support for Banana Pi board
The Banana Pi is a cheap ARMv7 board with a dual-core Cortex-A7, thus
with virtualization support. Upstream U-boot and kernel work fine -
ideal conditions. We just lack some IOMMU on that board, but it remains
handy for testing purposes.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Tue, 25 Nov 2014 08:06:42 +0000 (09:06 +0100)]
arm: Rework return to EL1 path
Refactor cpu_return_el1 to cpu_prepare_return_el1, moving the differing
parts depending on the return mode to the caller site. Ensure that we
return to Linux passing the proper error code - it's now available to
arch_cpu_restore.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Tue, 25 Nov 2014 08:02:45 +0000 (09:02 +0100)]
core: Pass return code to arch_cpu_restore
Some architectures, so far ARM, may prefer to jump directly to the
target Linux context from arch_cpu_restore. In this case we need to have
the return code at hand as well. Extend the parameter list accordingly
and document the possibility that arch_cpu_restore does not return to
the caller.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 21 Nov 2014 20:00:28 +0000 (21:00 +0100)]
arm: Account for irqchip_cell_exit being called before irqchip_init
If the hypervisor setup procedure fails before irqchip_init was called,
arch_shutdown will still invoke irqchip_cell_exit. If we run this
function, we'll crash at the latest when trying to access the not yet
mapped GIC. Leave irqchip_cell_exit early in this case.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 12 Oct 2014 15:50:12 +0000 (17:50 +0200)]
arm: Switch to generic UART mapping
Start using the generic UART mapping by the Linux driver. For this the
VExpress config has to gain physical base and size information of the
debug UART.
This removes the tedious need to adjust UART_BASE_VIRT in platform.h
according to the Linux configuration.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 12 Oct 2014 14:52:31 +0000 (16:52 +0200)]
core/driver: Add support for mapping the debug UART from the driver
If the debug UART is memory-mapped, we can only access it prior to
switching to hypervisor mappings if the driver supports us in this. By
adding a debug_uart memory region to the system configuration, we tell
the driver about the mapping need. In turn, the driver reports the
virtual address via an additional header field. The mapping can be
released on the Linux side right after enabling the hypervisor.
Provided the virtual address of the UART mapping as chosen by Linux does
not conflict with our remapping region, this mapping can safely be
replicated into the hypervisor address space so that we don't need to
adjust the UART access after enabling our own mapping.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 12 Oct 2014 14:13:49 +0000 (16:13 +0200)]
core: Redefine PAGE_FLAG_UNCACHED to PAGE_FLAG_DEVICE
All (x86) users of this page flag map devices into the hypervisor
address space. We will do the same for ARM when mapping the debug UART.
For this we need a generic flag with the same semantics. As uncached is
different from device mappings, redefine the semantics of the UNCACHED flag
for this purpose.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Mon, 29 Sep 2014 10:49:34 +0000 (12:49 +0200)]
arm: Adjust UART_BASE_VIRT according to local test configuration
It's almost pointless to tune this constant as it is highly dependent on
the local kernel config. However, this one helps local testing until we
have a better solution for getting the UART mapped for the hypervisor
during early setup.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Mon, 29 Sep 2014 10:37:29 +0000 (12:37 +0200)]
inmates: arm: Fix and improve build
Introduce and use DECLARE_TARGETS just like x86 does. This prevents
unconditional rebuilding of the inmates on every make. Also move the
filtering of "-include asm/unified.h" into reusable Makefile.lib.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 28 Sep 2014 19:43:54 +0000 (21:43 +0200)]
configs: Adjust VExpress configs for smaller reservations
Move the hypervisor to the top of 2G memory. The gic and uart demo cells
are placed below, each given much less space than before - they don't
need more.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
These config files are used to run the root cell (vexpress.c) and the
inmates examples (vexpress-*-demo) on the vexpress platform.
vexpress-linux-demo can be used to create a cell for an SMP linux on CPUs
2 and 3, by assigning it the UART3 IRQ.
This patch adds the necessary libraries for writing simple inmates on arm,
and provides two demos.
It attempts to use the same layout as x86, and allows setting the device
base addresses with platform-specific includes, in order to avoid
including kconfig.h.
Only the vexpress platform with GICv2 and GICv3 is supported for the
moment.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: Use config.mk instead of kbuild, use mmio accessors,
avoid integer overflow, add copyright headers]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
This patch implements the counters that report the number of VM exits,
accessible by the driver. It also adds three statistics for the ARM
side: the number of IRQs injected, the number of IPIs injected, and the
number of GIC maintenance IRQs received.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
GICv2 is limited to 8 CPUs and uses independent routing bits, whereas
GICv3 (with ARE enabled) uses the MPIDR encoding (aff3.aff2.aff1.aff0)
for routing SPIs.
Before handling SPIs, the GICv2 backend has to probe its banked view of
the distributor to know which CPU interface it is accessing. After that,
the implementation is roughly the same as for GICv3, but GICD_ITARGETSR
are used instead of IROUTER.
Because the guest isn't supposed to rely on the CPU interface number
being coherent with the CPU logical ID, we don't have to translate it to
a virtual ID before handling route accesses inside SMP cells.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: switch to mmio accessor]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
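To make the routing difference described above concrete, the following sketch computes where a route is written and what is written for each GIC version. The distributor offsets are taken from the GIC architecture specifications; the helper names are assumptions, and none of this is the code from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define GICD_ITARGETSR 0x0800u  /* GICv2: byte-wide target fields */
#define GICD_IROUTER   0x6000u  /* GICv3: 64-bit routing registers */

/* GICv2: one byte per interrupt, one bit per CPU interface (at most 8). */
static inline uint32_t gicv2_target_offset(unsigned int irq)
{
	return GICD_ITARGETSR + irq;
}

static inline uint8_t gicv2_target_mask(unsigned int cpu_if)
{
	assert(cpu_if < 8);
	return (uint8_t)(1u << cpu_if);
}

/* GICv3 with ARE: one 64-bit register per SPI, holding an MPIDR-style affinity. */
static inline uint32_t gicv3_router_offset(unsigned int irq)
{
	return GICD_IROUTER + 8 * irq;
}

static inline uint64_t gicv3_router_value(uint8_t aff3, uint8_t aff2,
					   uint8_t aff1, uint8_t aff0)
{
	/* Aff3 sits in bits [39:32], Aff2..Aff0 in bits [23:0]. */
	return ((uint64_t)aff3 << 32) | ((uint64_t)aff2 << 16) |
	       ((uint64_t)aff1 << 8) | aff0;
}
```

Only SPIs (interrupt IDs 32 and above) are routed this way. The GICv2 mask depends on the CPU interface number, which is why the backend has to probe its banked distributor view first.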
This patch implements the following GICv2 features:
- Remap GICC to GICV in the cells to provide a virtual interface
- Guest SGI filtering and hyp SGI handling
- IRQ injection
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: switch to mmio accessor]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
This patch implements two cases:
- When an error occurs before setting up EL2, there is nothing much
to do except restore the Linux registers stored in the per_cpu
data.
- When it happens after EL2 setup, arch_cpu_restore copies the saved
registers on the stack, and continues into arch_shutdown_self.
When it happens during the MMU setup, chances of recovering a clean
state are pretty thin anyway. The bootstrap vectors could be used to
catch and dump a minimal context (which would require a raw_printk
implementation), but we cowardly ignore this case for the moment.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: fix memcpy size in cpu_return_el1]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
When an HV_DISABLE hypercall is issued on all root CPUs by the driver,
the core `shutdown' function executes the following operations:
- Suspend all non-root cells (all the CPUs are taken to hyp idle mode),
- call arch_shutdown_cpu for all those CPUs,
- call arch_shutdown.
Once the master CPU (the first to take the shutdown lock) did this, the
other root CPUs don't actually perform any operation.
This patch lets the arch_shutdown and arch_shutdown_cpu set a boolean
that is considered by the cores right before returning to EL1: for the
cells' CPUs, arch_shutdown_cpu will trigger a return to arch_reset_self,
that will clean up EL1 and EL2. On the root cpus, the exit handler
checks this boolean and calls the shutdown function.
Once inside arch_shutdown_self, the principle is the same as with the
hypervisor initialisation:
- Create identity mappings of the trampoline page and the stack,
- Jump to the physical address of the shutdown function,
- Disable the MMU,
- Reset the vectors,
- Return to EL1
This patch does not handle hosts using PSCI yet: they will need to issue
a final SMC on secondary CPUs in order to park themselves at EL3, since
the hypervisor won't exist anymore to emulate the wakeup call.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: moved arch_shutdown_cpu & arch_shutdown to control.c]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
This patch stores the hypervisor stub vectors before installing EL2, in
order to reset them on shutdown. It assumes that they are the same on all
CPUs.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
The Auxiliary Control Register may be used on some platforms to disable
memory coherency between the cores, for instance when unplugging a CPU.
This patch ensures that ACTLR is never modified, by trapping its accesses
with the HCR.TAC bit.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Hotplugging CPUs on ARM is quite difficult, as each platform uses its
own system. The use of PSCI emulation will greatly simplify this, but
on many platforms, we still have to define a series of specific SMP
operations wrapped around the kernel hotplug implementation.
This patch adds support for the vexpress hotplug system:
- When the root cell attempts to unplug a CPU, to give it to a new
cell, it is put in a WFI loop, which is left when Jailhouse sends
a synchronising IPI to all CPUs that need to be parked.
- When re-assigning a CPU to the root cell, the simplest return path
is through the kernel's secondary entry, whose address is stored in
the system flags register.
Because the kernel only writes the flag register once, plugging CPUs in
the host cannot be accomplished by waiting for a trapped MMIO. Moreover,
such a trap would be missed on hypervisor shutdown, since CPU0 may
return to bare EL1 before secondary CPUs. On some platforms, it may be
necessary to park secondary CPUs outside of the hypervisor on shutdown,
by copying a minimal spin code in a reserved location...
This patch also attempts to combine both classical and PSCI boot methods
in SMP guests: secondary CPUs are held in the psci_emulate_spin handler,
and can be woken up by both a PSCI call and a trapped access to the
vexpress mbox.
The same applies for hotplugging secondary CPUs in the guests, but the
mailbox method only waits for an IPI.
PSCI in the host is not currently supported: it would require a call to
the actual CPU_OFF handler when shutting down the whole hypervisor.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: switch to mmio accessor]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
By adding a new field 'guest_mbox' in the cpu_datas, this patch allows
the guests to issue PSCI HVC calls. Currently, only PSCI_CPU_ON,
PSCI_CPU_OFF, and PSCI_VERSION are handled.
A call to CPU_OFF enters the suspend mode through arch_reset_self. When
a core calls CPU_ON, the hypervisor wakes up the other core, which will
take its return address from the guest_mbox, wipe its registers and go
back to EL1. The context argument to PSCI_CPU_ON is currently ignored,
since the whole core is reset.
This patch also traps SMC instructions in order to catch the PSCI
requests issued this way. All other SMC calls are forwarded to
EL3.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
This patch adds the handling of MMIO accesses to the GICv3 distributor.
By restricting the SPI masks to the cell's configuration, it makes sure
that they do not touch the other cells' SPIs when writing the common
registers.
Except for the routing and SGIR registers, most of the code should be
common to both GICv2 and v3.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: switch to mmio accessor]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
arm: irqchip: add SPI configuration in cell_init and cell_exit
This patch enables the routing of SPIs to the new cell's first CPU. When
destroyed, all SPIs are re-routed to the root cell.
An exhaustive implementation would save the targets of each IRQ before
transferring it to a new cell. Since linux does not currently route SPIs
to secondary CPUs and the root cell is not supposed to use devices that
will be assigned to guests anyway, it should be safe to route everything
to CPU0.
This patch follows the core configuration and the IOAPIC implementation,
which only allow using the first 64 SPIs.
A future patch will need to change this minimal bitmap size to 988.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: switch to mmio accessors]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Since each cell has its own set of CPU ids, they can't access the
redistributor associated to their MPIDR. Instead, the MMIO accesses are
translated to their physical redistributor, and a read to the ID
register returns the virtual affinity value.
It is a bit more expensive than simply mapping the redistributor to the
cell, but the guest rarely needs to reconfigure its PPIs and IPIs, so
this patch shouldn't introduce any significant performance loss.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
To handle SMP guests, the cells need to be assigned virtual CPU IDs
through the VMPIDR register. For the moment, those IDs are simply
generated incrementally on each CPU.
This change will allow using the same guest code in different cells.
This patch adds the necessary code for handling MMIO accesses. The trap
handler fills a mmio_access struct according to the fields in the ESR,
and passes it to all relevant sub-handlers.
If all return UNHANDLED, the access is considered invalid and the CPU is
put into failed mode.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: switch to mmio accessors, use asm/mmio.h]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
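The dispatch scheme described above (decode the access once, offer it to the sub-handlers, treat it as invalid if nobody claims it) can be sketched as follows. The type mmio_access_sketch, the handler signature and demo_id_reg_handler are hypothetical, and this simplified loop stops at the first handler that claims the access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum trap_result { TRAP_UNHANDLED, TRAP_HANDLED };

/* Hypothetical decoded access; the patch fills its structure from the ESR. */
struct mmio_access_sketch {
	uint64_t addr;
	bool is_write;
	unsigned int size;  /* access width in bytes */
	uint32_t val;       /* value written, or value to return on a read */
};

typedef enum trap_result (*mmio_handler_t)(struct mmio_access_sketch *);

/* Example sub-handler: claims reads of one made-up register address. */
static enum trap_result demo_id_reg_handler(struct mmio_access_sketch *access)
{
	if (access->addr != 0x1000 || access->is_write)
		return TRAP_UNHANDLED;
	access->val = 0x2a;
	return TRAP_HANDLED;
}

/*
 * Returns false when no handler claimed the access. The caller would then
 * consider the access invalid and put the CPU into failed mode.
 */
static bool mmio_dispatch(struct mmio_access_sketch *access,
			  const mmio_handler_t *handlers, size_t count)
{
	assert(access->size == 1 || access->size == 2 || access->size == 4);

	for (size_t n = 0; n < count; n++)
		if (handlers[n](access) == TRAP_HANDLED)
			return true;
	return false;
}
```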
This patch adds exhaustive handling of hypervisor errors, and the
ability to stop and park CPUs after dumping their EL1 context, when they
encounter an unhandled trap.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: Adjustments to recent control subsystem changes]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 30 Jul 2014 14:23:19 +0000 (15:23 +0100)]
arm: Complete paging invalidations
This patch is based on the original version by Jean-Philippe Brucker. It
fills different paging stubs:
- the arch_flush_cell_vcpu_caches stub, which is used by the core via
config_commit each time the memory is remapped. It invalidates the
TLBs on all affected CPUs of the cell.
- the arch_paging_flush_cpu_caches function is used to flush the
hypervisor page table entries when using the PAGE_MAP_COHERENT flag
(useful for IOMMU, not currently in use on the arm side.)
- the arch_paging_flush_page_tlbs function is used to invalidate a TLB
entry after modifying the hypervisor paging structures. It must ignore
accesses done from the initial setup code at EL1, which are committed
once at EL2 with a TLBIALLH, just before enabling the MMU.
arch_config_commit has nothing to do so far. Will change when an IOMMU is
supported.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
This patch allows entering new guests with caches disabled. By cleaning
the data caches, it makes sure that the recently written guest code and
data are present in memory before returning to an environment with only
a stage-2 MMU.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
arm: flush and enable the caches at initialisation
Before jumping to EL2, which has its MMU disabled, the data caches need
to be cleaned, in order to be coherent with the EL1 context.
This patch implements the complete data cache flush by set/way, and
enables the EL2 caches if possible.
Please note that the hypervisor always assumes that the kernel sets up
its memory in a coherent way between the cores, which means that all the
relevant memory regions (i.e. the Jailhouse code and data) are supposed
to be cacheable and inner-shareable, so that only a clean is needed
before turning the caches off.
In the short section where the MMU is off, the hypervisor doesn't write
anything to memory.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
arm: clear the banked and system registers on reset
This patch allows booting new guests with an -almost- empty context.
The reset function is still missing Performance Monitor, debug, SIMD and
float registers.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Give the CPUs back to Linux when a cell is destroyed, after resetting its
whole context.
This patch uses the old vexpress CPU hotplug system in Linux: the
secondary startup function address is kept in the system flags, so the
cpu_reset function will simply jump there after resetting the CPU state.
A future patch will add PSCI hotplug support, and once the hypervisor
has access to a device tree, this spin function will need to be set
dynamically in a `struct hotplug_ops'.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: switch to mmio accessors]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
arm: GIC: reset the CPU interface before running a new guest
All pending interrupts need to be cleared before running a new guest.
This patch resets the list registers, the software pending queue, and
the GIC hypervisor config registers.
Since the suspend loop was entered through the IRQ handler, we also need
to deactivate that active IPI.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: switch to mmio accessors]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
This patch implements most CPU handling functions needed to set up and start
a new cell.
- When the core calls suspend_cpu, an SGI is sent to this CPU.
Since all IRQs are taken directly to the hypervisor, the guest will be
interrupted and execution will continue into psci_suspend.
- resume_cpu will simply call psci_cpu_on and return to the IRQ
handling loop.
- reset_cpu will resume the CPU into arch_reset_self.
After resetting all relevant registers and devices, execution will
continue into the newly created guest, by calling vmresume with a
clean set of registers.
To keep this patch light, most of the reset code is still missing.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
This is not the final thing: it can only be used internally for
suspending, resuming and parking CPUs while reconfiguring the cells.
Using this base, a trivial PSCI 0.2 emulation can be added by implementing
the appropriate trap hooks.
The mailbox in the per_cpu structure is used to store the address and
context where psci_cpu_off returns.
CPUs doing reconfiguration on the cells can use the psci_suspend and
psci_resume wrappers to store all affected cores.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
In GICv3, IPIs are sent by writing the system register `ICC_SGIR'.
This patch moderates those writes by injecting the IPIs into the
appropriate cells, and issues a hypervisor IPI to let the cell's CPUs
fill their list registers.
Since there shouldn't be many cases where Jailhouse needs to emulate
system register accesses, this patch keeps it simple, by calling directly
the GICv3 function from the trap handler, without abstracting it through
irqchip.
However, this change adds an ungraceful ifdef, since the GICv2 and v3
headers are mutually exclusive for the moment.
In GICv2, the SGIR register is 32bit and will be handled directly in the
gic-common.c code, using an MMIO trap of the distributor accesses.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
arm: skip instructions that fail their condition check
On some implementations, instructions may trap before the PE was able to
check their conditional code. This patch adds the ability to check it
before emulating something that wouldn't be executed. In thumb mode, the IT
state has to be updated when skipping instructions.
Most of this patch is copied and adapted from Linux and KVM.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
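For illustration, the condition check mentioned above can be written as a small function over the N, Z, C and V flags of the saved PSR. This sketch uses a switch statement rather than the lookup table found in Linux and KVM, does not cover the Thumb IT state update, and the name arm_condition_passes is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PSR_N (1u << 31)
#define PSR_Z (1u << 30)
#define PSR_C (1u << 29)
#define PSR_V (1u << 28)

/* Returns true if an instruction with 4-bit condition 'cond' would execute. */
static bool arm_condition_passes(unsigned int cond, uint32_t psr)
{
	bool n = psr & PSR_N, z = psr & PSR_Z;
	bool c = psr & PSR_C, v = psr & PSR_V;
	bool result;

	assert(cond <= 0xf);

	switch (cond >> 1) {
	case 0: result = z; break;             /* EQ / NE */
	case 1: result = c; break;             /* CS / CC */
	case 2: result = n; break;             /* MI / PL */
	case 3: result = v; break;             /* VS / VC */
	case 4: result = c && !z; break;       /* HI / LS */
	case 5: result = n == v; break;        /* GE / LT */
	case 6: result = !z && n == v; break;  /* GT / LE */
	default: return true;                  /* AL and the 0xf encoding */
	}

	/* Each odd condition code negates the even one before it. */
	return (cond & 1) ? !result : result;
}
```

A trap handler would skip the instruction when this returns false, instead of emulating an access the guest never architecturally performed.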
When emulating instructions, the trap handler will need to access the
cell registers according to the guest's processor mode when the trap
occurred, which is stored inside the saved PSR.
This patch allows reading and writing the banked registers directly.
If HSR reports a load into r14 and the mode was IRQ for instance, the
hypervisor will need to write something into LR_IRQ instead of the LR
saved on the stack.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
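The r14 example above boils down to selecting a register copy by the mode bits of the saved PSR. The sketch below models the banked registers as plain structure fields, which is an assumption made for readability; real code has to use the banked-register access instructions. The structure guest_ctx_sketch and the function select_lr are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PSR_MODE_MASK 0x1fu

enum {
	MODE_USR = 0x10, MODE_FIQ = 0x11, MODE_IRQ = 0x12, MODE_SVC = 0x13,
	MODE_ABT = 0x17, MODE_UND = 0x1b, MODE_SYS = 0x1f,
};

/* Hypothetical guest context; the fields stand in for hardware registers. */
struct guest_ctx_sketch {
	uint32_t usr_lr;  /* LR as saved on the hypervisor stack */
	uint32_t lr_irq, lr_svc, lr_abt, lr_und, lr_fiq;
};

/* Pick the r14 instance that the guest sees in the mode recorded in 'psr'. */
static uint32_t *select_lr(struct guest_ctx_sketch *ctx, uint32_t psr)
{
	assert(ctx != NULL);

	switch (psr & PSR_MODE_MASK) {
	case MODE_IRQ: return &ctx->lr_irq;
	case MODE_SVC: return &ctx->lr_svc;
	case MODE_ABT: return &ctx->lr_abt;
	case MODE_UND: return &ctx->lr_und;
	case MODE_FIQ: return &ctx->lr_fiq;
	default:       return &ctx->usr_lr;  /* USR and SYS share one LR */
	}
}
```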
The GIC IRQ handler loops over the ack register to get all pending IRQs.
It then dispatches them either in the common SGI handler, or injects
them into the cell.
A first attempt to directly inject an IRQ by writing to a free list
register is done. If it fails, the IRQ is appended to the pending list,
and an attempt will be made later on, once a maintenance interrupt is
received.
Injection in the GIC is a little bit expensive for the moment, because
it needs to iterate over all list registers that have a valid interrupt,
to check that there will be no duplication. This could be optimized by
only checking the `active' GIC register for SPIs and PPIs.
A future patch will also add proper handling of the maintenance bits in
the vGIC.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
This patch introduces a pending_irq structure to provide a level of
abstraction, in order to store the interrupts waiting to be injected in
the cell. They are allocated as a static array of 256 IRQs for each CPU,
which should be more than enough. Insertion finds the first available slot
and builds a linked list of pending vIRQs.
Two cases justify the need for this structure:
- The GIC has a limited number of list registers for injecting virtual
interrupts. Once they are full, software must store the pending ones
itself, and use the GIC's maintenance IRQ to be informed when they are
available again.
In jailhouse, this case should be very rare since IRQs are directly
injected, but it must be taken into account nonetheless.
- IPIs sent by a core need to be stored somewhere to let the other CPUs
inject them into their own list registers.
The GIC backend will need to call irqchip_inject_pending when receiving
a maintenance IRQ or a synchronisation SGI in order to clean the list.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
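A simplified model of the pending-IRQ store described above: a fixed array of 256 slots per CPU, insertion into the first free slot, and a linked list that keeps arrival order. All names are hypothetical, and the locking a real per-CPU queue would need against remote senders is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING_IRQS 256

struct pending_irq_sketch {
	uint32_t virt_id;
	int in_use;
	struct pending_irq_sketch *next;
};

/* One instance per CPU; must start out zeroed. */
struct pending_queue_sketch {
	struct pending_irq_sketch slots[MAX_PENDING_IRQS];
	struct pending_irq_sketch *first;
	struct pending_irq_sketch *last;
};

/* Returns 0 on success, -1 if all slots are taken. */
static int pending_insert(struct pending_queue_sketch *q, uint32_t virt_id)
{
	struct pending_irq_sketch *slot = NULL;

	/* Find the first available slot in the static array. */
	for (size_t n = 0; n < MAX_PENDING_IRQS; n++)
		if (!q->slots[n].in_use) {
			slot = &q->slots[n];
			break;
		}
	if (!slot)
		return -1;

	slot->virt_id = virt_id;
	slot->in_use = 1;
	slot->next = NULL;

	/* Append to the list so that older IRQs are injected first. */
	if (q->last)
		q->last->next = slot;
	else
		q->first = slot;
	q->last = slot;
	return 0;
}

/* Removes the oldest entry; returns 0 and stores its ID, or -1 if empty. */
static int pending_pop(struct pending_queue_sketch *q, uint32_t *virt_id)
{
	struct pending_irq_sketch *head = q->first;

	assert(virt_id != NULL);
	if (!head)
		return -1;

	*virt_id = head->virt_id;
	q->first = head->next;
	if (!q->first)
		q->last = NULL;
	head->in_use = 0;
	return 0;
}
```

On a maintenance IRQ or a synchronisation SGI, a backend would pop entries for as long as free list registers remain.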
Since IRQs taken to HYP use a different vector, the trap handler needs to
be aware of the exit context. To this end, the patch adds an 'exit_reason'
field to the struct registers.
The structure is still passed to the dispatcher as a pointer to the stack,
but care must be taken to ignore the exit field when restoring the user
registers.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
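One way to picture the layout described above is a register frame whose first word is the exit reason, followed by the user registers. The restore path then has to start one word into the frame. The field order, the constant values and the helper names below are assumptions for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_USR_REGS 14  /* assumption: r0-r12 and lr */

enum exit_reason_sketch { EXIT_REASON_TRAP = 1, EXIT_REASON_IRQ = 2 };

struct registers_sketch {
	uint32_t exit_reason;        /* written by the vector stub */
	uint32_t usr[NUM_USR_REGS];  /* guest registers to restore */
};

/* The restore code skips exactly one word to reach the user registers. */
static_assert(offsetof(struct registers_sketch, usr) == sizeof(uint32_t),
	      "exit_reason must directly precede the user registers");

static bool exit_is_irq(const struct registers_sketch *regs)
{
	return regs->exit_reason == EXIT_REASON_IRQ;
}

/* Copies only the user registers and deliberately ignores exit_reason. */
static void copy_user_regs(uint32_t dst[NUM_USR_REGS],
			   const struct registers_sketch *regs)
{
	for (size_t n = 0; n < NUM_USR_REGS; n++)
		dst[n] = regs->usr[n];
}
```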
Assuming there is a GIC distributor at address GICD_BASE, this patch
checks its version and calls the GIC init function. Linux's kconfig header
is used to guess the base address of the distributor.
Ideally, a device tree would be passed to the hypervisor in the root
cell's config, allowing all constant base addresses to be removed.
The patch also assumes that most of the GIC has been set up by Linux prior
to the hypervisor installation, and only initialises the vGIC.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
[Jan: use mmio accessors]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>