rtime.felk.cvut.cz Git - jailhouse.git/log
Jan Kiszka [Fri, 7 Feb 2014 09:42:29 +0000 (10:42 +0100)]
core: Fix page_map_get_guest_page for non-identically mapped guests

We were missing a final gphys-to-phys translation before mapping the
guest page.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 5 Feb 2014 16:50:03 +0000 (17:50 +0100)]
TODO: Update

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 5 Feb 2014 08:21:55 +0000 (09:21 +0100)]
x86: Enable hugepages in all hypervisor, EPT and VT-d page tables

Enable hugepage creation by adding the required sizes and
callbacks to the 64-bit paging mode. When deriving the paging modes of
EPT and VT-d, we now need to take their capabilities into account and
have to clear page_size at those levels that are not supported by the
underlying hardware.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
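
The derivation step this commit describes, copying the 64-bit paging mode and clearing page_size at levels the hardware cannot map directly, might look roughly like the sketch below. The capability flags, the reduced paging_level struct and derive_paging are hypothetical names for illustration and are not taken from the Jailhouse sources.

```c
#include <stddef.h>
#include <stdint.h>

#define SZ_4K 0x1000ULL
#define SZ_2M 0x200000ULL
#define SZ_1G 0x40000000ULL

/* Hypothetical capability bits reported by EPT or a VT-d unit. */
#define CAP_2M_PAGES (1u << 0)
#define CAP_1G_PAGES (1u << 1)

/* Reduced to the one field that matters for this sketch. */
struct paging_level {
	uint64_t page_size;   /* 0: no terminal entries at this level */
};

/* 4-level x86-64 paging, root level first. */
const struct paging_level x86_64_paging[4] = {
	{ 0 }, { SZ_1G }, { SZ_2M }, { SZ_4K },
};

/* Copy a paging mode and drop hugepage sizes the hardware lacks. */
void derive_paging(struct paging_level *dst, const struct paging_level *src,
		   size_t levels, unsigned int caps)
{
	for (size_t i = 0; i < levels; i++) {
		dst[i] = src[i];
		if (src[i].page_size == SZ_1G && !(caps & CAP_1G_PAGES))
			dst[i].page_size = 0;
		if (src[i].page_size == SZ_2M && !(caps & CAP_2M_PAGES))
			dst[i].page_size = 0;
	}
}
```

With caps set to CAP_2M_PAGES only, the derived mode keeps 2 MiB and 4 KiB terminal entries and loses the 1 GiB level.
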
Jan Kiszka [Tue, 4 Feb 2014 17:03:24 +0000 (18:03 +0100)]
core: Add support for creating page tables with hugepages

When adding support for generating hugepages during page_map_create, we
also need to address the issue of overwriting or splitting up such
pages. When partially unmapping a hugepage, we need to break it up
first, then unmap the included pages. This break-up may fail when we are
short on memory, thus page_map_destroy may actually fail now, and we
have to take this into account on the caller side.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
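
To illustrate why a partial unmap can now fail, here is a minimal sketch of breaking up a hugepage before unmapping one of its small pages. The data layout, the table pool and all function names are assumptions made for this example. The real page_map_destroy works on hardware page table entries.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SMALL_PAGE_SIZE 0x1000ULL
#define PAGES_PER_HUGE  512

struct small_table {
	uint64_t phys[PAGES_PER_HUGE];
	int present[PAGES_PER_HUGE];
};

struct huge_entry {
	int present;
	uint64_t phys;              /* base while mapped as one hugepage */
	struct small_table *sub;    /* set once the hugepage is broken up */
};

/* Hypothetical pool of spare page tables; may run empty. */
struct table_pool {
	struct small_table *tables;
	size_t free_count;
};

static struct small_table *pool_alloc(struct table_pool *pool)
{
	if (pool->free_count == 0)
		return NULL;
	return &pool->tables[--pool->free_count];
}

/* Replace one hugepage mapping by 512 equivalent small mappings. */
int split_hugepage(struct huge_entry *entry, struct table_pool *pool)
{
	struct small_table *sub;

	if (entry->sub)
		return 0;           /* already broken up */
	sub = pool_alloc(pool);
	if (!sub)
		return -ENOMEM;     /* short on memory: caller must cope */
	for (size_t i = 0; i < PAGES_PER_HUGE; i++) {
		sub->phys[i] = entry->phys + i * SMALL_PAGE_SIZE;
		sub->present[i] = entry->present;
	}
	entry->sub = sub;
	return 0;
}

/* Partial unmap: break up first, then clear the affected small page. */
int unmap_small_page(struct huge_entry *entry, struct table_pool *pool,
		     size_t index)
{
	int err = split_hugepage(entry, pool);

	if (err)
		return err;
	entry->sub->present[index] = 0;
	return 0;
}
```

The error return of unmap_small_page is the point of the sketch: callers of the destroy path have to check it.
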
Jan Kiszka [Tue, 4 Feb 2014 18:56:44 +0000 (19:56 +0100)]
x86: Check VMX features earlier

Factor out vmx_check_features and call it as early as vmx_init. This
will soon be required, when we access VMX MSRs during common init.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Tue, 4 Feb 2014 22:28:15 +0000 (23:28 +0100)]
core: Drop unused parameters from page_map_virt2phys/create/destroy

The number of paging levels as well as flags for non-terminal table
entries are now encoded via the paging structures and their callbacks.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Tue, 4 Feb 2014 21:35:04 +0000 (22:35 +0100)]
core: Switch to table-driven page table construction and interpretation

Switch page table creation and interpretation to a new, fully
table-driven scheme. It is much more regular and also more flexible when
it comes to supporting more paging modes, specifically on x86 (32-bit
paging, PAE etc.) in order to extend MMIO support. It also lays the
foundation for creating hugepages, which will reduce TLB pressure and
memory usage. So far, only reading of hugepages is supported.

A paging mode is now defined via an array of paging structures. An array
entry represents a page table level, starting with the root level. Each
paging structure contains a number of handlers to set or get entries at
the corresponding level. It also contains a page size value which is
non-zero in case the page table level supports terminal entries that
point to a physical page address. This implies that the final element in
the paging structure array must have a non-zero page size field.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
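
The table-driven scheme from this commit message can be sketched as a small, self-contained model. Everything here is illustrative: struct pte is a simplified stand-in for a hardware entry, the two-level layout with 16 KiB "hugepages" is invented to keep the tables tiny, and the explicit levels argument is a simplification of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define ENTRIES 4                 /* tiny tables keep the sketch short */
#define INVALID_PHYS ((uint64_t)-1)

/* Simplified entry; real hardware packs this into one 64-bit word. */
struct pte {
	int present;
	int terminal;             /* maps a page rather than a table */
	uint64_t phys;            /* page address if terminal */
	struct pte *next;         /* next-level table otherwise */
};

/* One array element per page table level, root level first. */
struct paging {
	uint64_t page_size;       /* 0: level has no terminal entries */
	struct pte *(*get_entry)(struct pte *table, uint64_t virt);
};

static struct pte *root_get_entry(struct pte *table, uint64_t virt)
{
	return &table[(virt >> 14) & (ENTRIES - 1)];
}

static struct pte *leaf_get_entry(struct pte *table, uint64_t virt)
{
	return &table[(virt >> 12) & (ENTRIES - 1)];
}

/* Two levels: 16 KiB "hugepages" at the root, 4 KiB pages below. */
const struct paging demo_paging[] = {
	{ 0x4000, root_get_entry },
	{ 0x1000, leaf_get_entry },
};

/* Generic walker: all level-specific knowledge lives in the array. */
uint64_t demo_virt2phys(const struct paging *mode, size_t levels,
			struct pte *root, uint64_t virt)
{
	struct pte *table = root;

	for (size_t i = 0; i < levels && table; i++) {
		struct pte *entry = mode[i].get_entry(table, virt);

		if (!entry->present)
			return INVALID_PHYS;
		if (mode[i].page_size && entry->terminal)
			return entry->phys + (virt & (mode[i].page_size - 1));
		table = entry->next;
	}
	return INVALID_PHYS;
}
```

Supporting another paging mode then means supplying another array; the walker itself stays unchanged.
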
Jan Kiszka [Sun, 26 Jan 2014 18:26:28 +0000 (19:26 +0100)]
core: Introduce paging_structures abstraction

This structure shall eventually hold a reference to both the paging
hierarchy and how it is read or manipulated. So far, we only
encapsulate the root table reference and update all sites that deal with
page tables.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 26 Jan 2014 11:03:15 +0000 (12:03 +0100)]
core: Reduce dependencies of jailhouse/entry.h

Will help to include jailhouse/paging.h from files where this created
circular dependencies so far.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 5 Feb 2014 08:34:26 +0000 (09:34 +0100)]
x86: Make use of mmio_read32/64_field

Minor cleanup: Use the new mmio field accessor to obtain values from the
xAPIC and the DMAR units.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Thu, 30 Jan 2014 22:24:19 +0000 (23:24 +0100)]
x86: Drop no longer used includes

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Ivan Kolchin [Fri, 31 Jan 2014 05:43:03 +0000 (09:43 +0400)]
x86: Fix a bug with writing zeros to register IOTLB_REG containing RsvdP field

Register IOTLB_REG contains RsvdP fields. Their values must be preserved
on writes in accordance with the specification. Unsafe accesses to
registers of this kind were replaced with the new helpers.

Signed-off-by: Ivan Kolchin <ivan.kolchin@siemens.com>
[Jan: style fix of VTD_IOTLB_R_MASK]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Ivan Kolchin [Wed, 29 Jan 2014 12:42:02 +0000 (16:42 +0400)]
core: Add functions to read/write field values of 32/64-bit registers

These functions serve the following aims: making register descriptions
uniform, allowing field values to be read and written without open-coded
bit operations, making the code more readable, and preventing accidental
changes of register content.

Signed-off-by: Ivan Kolchin <ivan.kolchin@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
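
A field accessor of the kind described here could be sketched as follows. The names reg_read32_field and reg_write32_field are hypothetical and deliberately differ from the actual Jailhouse helpers, whose signatures are not shown in this log. The sketch assumes a non-zero, contiguous mask and relies on the GCC/Clang builtin __builtin_ctz.

```c
#include <stdint.h>

/* Extract the field selected by mask, shifted down to bit 0. */
static inline uint32_t reg_read32_field(const volatile uint32_t *reg,
					uint32_t mask)
{
	return (*reg & mask) >> __builtin_ctz(mask);
}

/*
 * Read-modify-write: only the bits inside mask change, all other bits
 * (for example reserved-and-preserved fields) keep their old value.
 */
static inline void reg_write32_field(volatile uint32_t *reg, uint32_t mask,
				     uint32_t value)
{
	uint32_t old = *reg;

	*reg = (old & ~mask) | ((value << __builtin_ctz(mask)) & mask);
}
```

Writing 0x3c to the field 0x0000ff00 of a register holding 0xa5a500f0 leaves 0xa5a53cf0, so the surrounding bits survive the write.
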
Jan Kiszka [Thu, 30 Jan 2014 11:55:25 +0000 (12:55 +0100)]
driver: Rename source file

It's a single module, and it will likely stay that way, so give it a
more specific name.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Thu, 30 Jan 2014 08:46:18 +0000 (09:46 +0100)]
TODO: Update

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Thu, 30 Jan 2014 11:16:55 +0000 (12:16 +0100)]
core/driver: Switch hypervisor to fixed virtual address layout

Now that the driver always puts us at the same virtual address, we can
compile this into the hypervisor as well. On x86, we switch to the code
model "kernel", i.e. all virtual addresses have the higher 32 bits set.
This allows dropping -fpic, -fpie and the global offset table. And the
entry field in the header is now absolute.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Thu, 30 Jan 2014 11:27:02 +0000 (12:27 +0100)]
core: Fix output of hypervisor text start

Though we reduced the header size in dfe32d1ba6, the text segment is
still 16-byte aligned. Better introduce an explicit mark for the start
instead of relying on address calculations.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 29 Jan 2014 15:27:16 +0000 (16:27 +0100)]
driver: Map hypervisor at fixed virtual address

First step to overcome relocation of the hypervisor and to stabilize its
configuration footprint after loading: We reserve a fixed virtual
address range from the kernel and simply map the hypervisor there. The
address is, of course, architecture specific and may even require
adjustments per target. But the advantages of having a stable
configuration in memory that can rather easily be checked after setup
and the simplifications in the hypervisor code when it will always have
the same virtual address outweigh this.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 29 Jan 2014 15:11:01 +0000 (16:11 +0100)]
arm: Fix build

More blunt hacks to keep it building.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 29 Jan 2014 12:50:22 +0000 (13:50 +0100)]
driver: Fault-in hypervisor core pages before shutting down

Linux tends to apply changes to kernel mappings lazily to mm structs. If
this hits us in the middle of the world switch during shutdown, we will
triple-fault. Avoid this by touching all hypervisor core and per-CPU
pages in the IPI handler before triggering the hypercall.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
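
Touching every page of a region, as this commit does before the hypercall, can be sketched in plain C. DEMO_PAGE_SIZE and touch_pages are hypothetical names, and the driver's real loop runs in kernel context over the hypervisor mapping.

```c
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096u

/*
 * Read one byte from every page of [base, base + size). The volatile
 * qualifier keeps the compiler from dropping the otherwise unused
 * reads, so each page is really accessed and faulted in if needed.
 */
size_t touch_pages(const volatile unsigned char *base, size_t size)
{
	size_t touched = 0;

	for (size_t off = 0; off < size; off += DEMO_PAGE_SIZE) {
		(void)base[off];
		touched++;
	}
	return touched;
}
```
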
Jan Kiszka [Wed, 29 Jan 2014 12:20:37 +0000 (13:20 +0100)]
core: Allow dummy-read to hypervisor core region from Linux cell

In order to support the shutdown process which may have to fault-in the
hypervisor mapping into Linux address space, allow read access to the
physical region that contains the hypervisor core and the per-CPU data
structures. We simply expose an empty (zeroed) page to Linux.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 29 Jan 2014 09:28:25 +0000 (10:28 +0100)]
core: Mark page_map_hvirt2phys argument const

Allows to pass in pointers to constant data without generating warnings.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 26 Jan 2014 18:40:46 +0000 (19:40 +0100)]
core: Move TEMPORARY_MAPPING_BASE define to generic header

The temporary mapping region is always located at the beginning of the
remapping pool. That's already encoded into generic code, so move the
symbolic address over as well.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 26 Jan 2014 18:13:03 +0000 (19:13 +0100)]
core: Get rid of unusual term "foreign"

In order to avoid using unusual terms for well-known things, rename
page_map_get_foreign_page to page_map_get_guest_page. Also avoid the
term foreign in related constants, calling the mapping target area now
temporary mapping region.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 26 Jan 2014 10:40:10 +0000 (11:40 +0100)]
core: Factor out FOREIGN_MAPPING_CPU_BASE

Encapsulate the calculation for the start of the per-CPU mapping region.
We use a macro to avoid having to include percpu.h from paging.h which
would create circular dependencies soon.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
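
Why a macro helps here: its body is only expanded where it is used, so the header defining it does not need the full per-CPU type. The following sketch shows the idea with invented values. The base address, the page count per CPU and the cpu_id field name are assumptions, not the constants from the Jailhouse headers.

```c
#include <stdint.h>

/* Hypothetical layout constants for illustration only. */
#define DEMO_PAGE_SIZE          0x1000UL
#define DEMO_PAGES_PER_CPU      16UL
#define DEMO_MAPPING_BASE       0x100000UL

/*
 * Works with any structure that has a cpu_id member. The header that
 * defines this macro never needs to see the structure's definition.
 */
#define DEMO_MAPPING_CPU_BASE(cpu_data) \
	(DEMO_MAPPING_BASE + \
	 (cpu_data)->cpu_id * DEMO_PAGE_SIZE * DEMO_PAGES_PER_CPU)

/* Would live in a separate per-CPU header in a real code base. */
struct demo_per_cpu {
	unsigned int cpu_id;
};
```

For CPU 2, the macro yields 0x100000 + 2 * 0x10000 = 0x120000.
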
Jan Kiszka [Tue, 28 Jan 2014 16:41:41 +0000 (17:41 +0100)]
core/driver: Consolidate bss_start/end header fields to core_size

We only need to know how large the hypervisor core is, i.e. the part
that is loaded into RAM during initialization. That this size is derived
from the end of the bss section can be seen in our linker script.
bss_start was completely unused so far.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 24 Jan 2014 16:15:26 +0000 (17:15 +0100)]
x86: Clean up xAPIC-related magics and register accesses

Instead of open-coding, use mmio_read/write for accessing xAPIC
registers. Wrap the address calculations with the helper macro and
use a symbolic constant for the xAPIC ID shift.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Tue, 21 Jan 2014 13:52:50 +0000 (14:52 +0100)]
x86: Reduce APIC MMIO value variable to 32-bit length

This avoids evaluating the higher 32 bits on writes. We only support
32-bit writes, so we must only look at the lower 32 bit of the input
register.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 19 Jan 2014 15:29:43 +0000 (16:29 +0100)]
x86: Factor out common_exception_entry

Less duplication, no functional changes.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 18 Jan 2014 21:37:31 +0000 (22:37 +0100)]
x86: Fix SIPI processing

Make sure that we only deliver a SIPI vector when there is actually one
pending. We park the CPU while in wait-for-SIPI state. If we receive an
IPI before a SIPI was defined, x86_handle_events will deliver a random
SIPI vector. Avoid this by encoding SIPI availability via the
sipi_vector fields.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
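
Encoding "no SIPI pending" in the vector field itself might look like this sketch. The struct, the SIPI_NONE marker and both functions are hypothetical; the real x86_handle_events deals with more event types and with locking, which is omitted here.

```c
#include <stdbool.h>

#define SIPI_NONE (-1)    /* hypothetical marker: no SIPI pending */

struct cpu_state {
	bool wait_for_sipi;
	int sipi_vector;  /* SIPI_NONE or a vector in 0..255 */
};

/* Sender side: record the vector for a CPU parked in wait-for-SIPI. */
void post_sipi(struct cpu_state *cpu, unsigned char vector)
{
	if (cpu->wait_for_sipi)
		cpu->sipi_vector = vector;
}

/*
 * Target side: called on any wake-up IPI. Returns the vector to start
 * from, or SIPI_NONE if the wake-up was not caused by a SIPI.
 */
int handle_events(struct cpu_state *cpu)
{
	int vector = cpu->sipi_vector;

	if (!cpu->wait_for_sipi || vector == SIPI_NONE)
		return SIPI_NONE;
	cpu->sipi_vector = SIPI_NONE;
	cpu->wait_for_sipi = false;
	return vector;
}
```

An unrelated IPI arriving before any SIPI now leaves the CPU parked instead of starting it at an undefined vector.
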
Jan Kiszka [Sat, 18 Jan 2014 10:53:41 +0000 (11:53 +0100)]
core: Replace __start with &hypervisor_header

These naturally point to the same address, so we can replace the extra
__start symbol.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 17 Jan 2014 16:07:06 +0000 (17:07 +0100)]
x86: Fix stand-alone inclusion of vmx.h and vtd.h

This ensures compliance with our (yet unwritten) rule that all headers
should allow stand-alone inclusion. Exceptions are headers used also by
guests or the driver.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 17 Jan 2014 13:27:06 +0000 (14:27 +0100)]
core: Introduce per-page TLB flushing

Reduce the overhead of MMIO parsing specifically by introducing a
per-page TLB flush. Restrict the existing global one to x86, that's
where it is used so far.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 17 Jan 2014 10:21:35 +0000 (11:21 +0100)]
core: Fix some minor style issues

Found by checkpatch. No functional changes.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 17 Jan 2014 09:44:30 +0000 (10:44 +0100)]
core: Add compiler barrier semantic to cpu_relax

Will eventually help to get rid of volatile for several synchronization
variables by enforcing a re-read in busy-wait loops:

        while (state_var == STATE)
                cpu_relax();

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
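
A compiler barrier inside the relax helper could be written as below for GCC or Clang. This is an architecture-neutral sketch: demo_cpu_relax contains only the barrier, whereas an x86 version would typically also issue a pause instruction. The bounded wait_while helper is hypothetical and exists only to make the loop testable.

```c
/*
 * The "memory" clobber tells the compiler that memory may have changed,
 * so values cached in registers must be re-read after this point.
 */
static inline void demo_cpu_relax(void)
{
	__asm__ __volatile__("" : : : "memory");
}

/*
 * Busy-wait while *state_var equals state, giving up after max_spins
 * iterations. Returns the number of iterations spent waiting. The
 * variable is not volatile; the barrier forces the re-read.
 */
unsigned long wait_while(const int *state_var, int state,
			 unsigned long max_spins)
{
	unsigned long spins = 0;

	while (*state_var == state && spins < max_spins) {
		demo_cpu_relax();
		spins++;
	}
	return spins;
}
```
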
Jan Kiszka [Thu, 16 Jan 2014 19:47:08 +0000 (20:47 +0100)]
core: Properly translate guest physical addresses in page_map_get_foreign_page

We were incorrectly using a fixed page table offset for translating
physical addresses read from guest page tables to host physical
addresses. This approach totally neglected guest address space limits
and fragmentations. Fix it by asking the architecture to translate a
guest physical address. With VMX, we simply walk the EPT table of the
caller's CPU.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Thu, 16 Jan 2014 17:54:12 +0000 (18:54 +0100)]
core: Reintroduce page_map_virt2phys

This was quick: We actually need this function to obtain guest->host
physical address translations during page_map_get_foreign_page.
Reintroduce it, but with adjusted interface:

First, the only page table offset we need is the one of the hypervisor
because this function will only be used for walking page tables that are
fully mapped into the hypervisor address space.

Second, to align its interface to the companion functions
page_map_create and page_map_destroy, introduce a variable level
parameter.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Thu, 16 Jan 2014 17:49:11 +0000 (18:49 +0100)]
core: Pass caller's per_cpu to page_map_get_foreign_page

Replace mapping_region and page_table_offset parameters with the
caller's per_cpu struct. All information can be obtained from it, and we
will need it for the upcoming changes.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Thu, 16 Jan 2014 17:11:02 +0000 (18:11 +0100)]
core: Fix comment typo

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Thu, 16 Jan 2014 16:26:29 +0000 (17:26 +0100)]
inmates: Add cell_status support to apic-demo

Demonstrate the usage of the cell_status field in the comm region by
performing a self-shutdown after the first shutdown request.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Thu, 16 Jan 2014 15:57:29 +0000 (16:57 +0100)]
core: Allow a cell to declare itself terminated

This provides a communication channel from the cell to the hypervisor to
signal cell termination, either regular one or after a failure. The cell
can set its cell_status field in the comm region for this purpose. The
hypervisor will take this field additionally into account when trying to
destroy a cell that requires managed shutdown.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 12 Jan 2014 19:56:41 +0000 (20:56 +0100)]
inmates: Demonstrate shutdown rejection in apic-demo

Enable managed shutdown in the apic-demo and reject the first request to
demonstrate the mechanism.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Tue, 14 Jan 2014 19:26:13 +0000 (20:26 +0100)]
core: Introduce cell shutdown control interface

Using the messaging feature of the comm region, this introduces an
interface for cells to control their shutdown. The hypervisor sends a
shutdown request to a cell before destroying it. The cell can either
reject or accept the request in its reply message.

Cells can be excluded from this procedure and will continue to be
destructed immediately by setting the flag JAILHOUSE_CELL_UNMANAGED_EXIT
in their configuration file. For now we mark both demo cells as
unmanaged.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Tue, 14 Jan 2014 18:31:12 +0000 (19:31 +0100)]
core: Introduce messaging interface via comm region

This adds an interface to send messages from the hypervisor to cells and
receive answers back. Messages and replies currently consist of plain
integer values. Inline helpers are provided (for x86 only so far) to
ensure proper ordering of comm region field updates.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
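
The ordering concern mentioned in this commit can be illustrated with C11 atomics. This is a sketch under assumptions: the two-field demo_comm_region, the field names and MSG_NONE are invented for the example, and the original inline helpers are x86-specific rather than based on stdatomic.h.

```c
#include <stdatomic.h>
#include <stdint.h>

#define MSG_NONE 0u

/* Hypothetical two-word mailbox shared by hypervisor and cell. */
struct demo_comm_region {
	_Atomic uint32_t msg_to_cell;
	_Atomic uint32_t reply_from_cell;
};

/* Hypervisor side: clear the old reply first, then publish the message. */
void send_message(struct demo_comm_region *cr, uint32_t msg)
{
	atomic_store_explicit(&cr->reply_from_cell, MSG_NONE,
			      memory_order_relaxed);
	atomic_store_explicit(&cr->msg_to_cell, msg, memory_order_release);
}

/* Cell side: returns MSG_NONE while nothing is pending. */
uint32_t poll_message(struct demo_comm_region *cr)
{
	return atomic_load_explicit(&cr->msg_to_cell, memory_order_acquire);
}

/* Cell side: consume the message first, then publish the reply. */
void send_reply(struct demo_comm_region *cr, uint32_t reply)
{
	atomic_store_explicit(&cr->msg_to_cell, MSG_NONE,
			      memory_order_relaxed);
	atomic_store_explicit(&cr->reply_from_cell, reply,
			      memory_order_release);
}

/* Hypervisor side: returns MSG_NONE until the cell has answered. */
uint32_t poll_reply(struct demo_comm_region *cr)
{
	return atomic_load_explicit(&cr->reply_from_cell,
				    memory_order_acquire);
}
```

The release store is always the last write of each step, so a reader that observes the new message or reply also observes the cleared counterpart field.
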
Jan Kiszka [Tue, 14 Jan 2014 18:16:15 +0000 (19:16 +0100)]
core: Rename arch-specific hypercall headers

asm/jailhouse.h is too generic, these headers actually define the
arch-specific hypercall interface parts of Jailhouse. Rename them but
keep "jailhouse" as part of the name to avoid potential clashes over the
generic "asm/hypercall.h".

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Tue, 14 Jan 2014 17:40:54 +0000 (18:40 +0100)]
core: Prepare for error handling in shutdown

We are going to ask all non-Linux cells if they permit destruction. In
case they refuse, we need to handle this error and report it to all CPUs
that requested a shutdown via the hypercall.

Prepare the code for this by introducing a per-CPU shutdown_state. Its
default state is SHUTDOWN_NONE, thus no shutdown in progress. It is
set for all CPUs by the one that leads a shutdown either to
SHUTDOWN_STARTED or a negative error code. Then each CPU passing through
shutdown() can check a) if it is leading the shutdown and b) if an
already started shutdown failed and the saved error has to be reported
back.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
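
The per-CPU state machine described above might be modelled as follows. This is a simplified sketch: locking is left out, cells_verdict stands in for the outcome of asking the cells, and resetting the state after reporting an error is an assumption of this example rather than something the commit message states.

```c
#include <errno.h>
#include <stddef.h>

#define SHUTDOWN_NONE    0
#define SHUTDOWN_STARTED 1

struct demo_per_cpu {
	int shutdown_state;   /* NONE, STARTED or a negative error code */
};

/*
 * Called by every CPU that requests a shutdown. cells_verdict is 0 if
 * all cells agree, or a negative error code. Only the leader, i.e. the
 * first CPU to arrive, evaluates it.
 */
int demo_shutdown(struct demo_per_cpu *cpus, size_t num_cpus,
		  size_t this_cpu, int cells_verdict)
{
	struct demo_per_cpu *me = &cpus[this_cpu];
	int err;

	if (me->shutdown_state == SHUTDOWN_NONE) {
		/* We lead the shutdown: publish the outcome to all CPUs. */
		int state = cells_verdict < 0 ? cells_verdict
					      : SHUTDOWN_STARTED;

		for (size_t i = 0; i < num_cpus; i++)
			cpus[i].shutdown_state = state;
	}
	if (me->shutdown_state == SHUTDOWN_STARTED)
		return 0;

	/* A started shutdown failed: report the saved error once. */
	err = me->shutdown_state;
	me->shutdown_state = SHUTDOWN_NONE;
	return err;
}
```
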
Jan Kiszka [Sun, 12 Jan 2014 19:47:51 +0000 (20:47 +0100)]
core: Introduce communication region

The communication region consists of memory shared between cells and the
hypervisor. One page is reserved for each cell so far, and an empty
jailhouse_comm_region structure is defined. The comm region is mapped
into a cell by defining a memory region with JAILHOUSE_MEM_COMM_REGION
set in its flags.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 12 Jan 2014 19:34:32 +0000 (20:34 +0100)]
core: Rename jailhouse_memory::access_flags to flags

We will encode more than access types in them. No functional changes.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Tue, 14 Jan 2014 16:09:13 +0000 (17:09 +0100)]
inmates: Perform initialization of UART console

Needed if we don't write to the same one as the hypervisor.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 12 Jan 2014 15:23:51 +0000 (16:23 +0100)]
x86: Park CPU on guest originated faults

If a guest CPU violates a cell boundary or causes an unhandled vmexit,
only park the CPU in non-root mode afterwards. This allows recovering
the CPU, e.g. when its cell is destroyed. Any fault we see in root mode
remains fatal, and we will continue to stop using the CPU from then on.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 12 Jan 2014 15:07:29 +0000 (16:07 +0100)]
x86: Introduce panic_halt

In case we can recover from a panic situation, either by destroying the
guest cell or by re-initializing the CPU from within that cell, we do
not need to enter the fatal stop state. Provide panic_halt for those
scenarios. It will park the faulted CPU in wait-for-SIPI state instead,
keeping virtualization enabled.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 12 Jan 2014 18:36:04 +0000 (19:36 +0100)]
x86: Convert global wait_lock into per-CPU control_lock

The wait_lock actually only protects CPU-specific fields, thus does not
need to be global. Convert it, and also document what it protects
precisely and under which conditions.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 12 Jan 2014 18:03:05 +0000 (19:03 +0100)]
x86: Remove mmio_lock

There is no state shared between CPUs when running mmio_parse. Also, we
no longer modify the hypervisor page table in a way that may require
synchronization: page_map_get_foreign_page only modifies a single PTE
in the given CPU's foreign mapping region. Since d401dba795 we no longer
unmap the foreign pages, only overwrite the mapping. And since
3c01f4192c we guarantee that all page directories are present for the
foreign mapping region. So let's get rid of this unneeded lock.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 12 Jan 2014 14:42:22 +0000 (15:42 +0100)]
configs: Rename folder

Same logic as for inmates: There is more than one config file kept here.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 12 Jan 2014 14:34:23 +0000 (15:34 +0100)]
inmates: Rename folder

We keep many of them here, at least more than one so far.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 12 Jan 2014 14:23:11 +0000 (15:23 +0100)]
Update README according to config changes, extend it

Reflect the split-up and renaming of inmate config files, and also
describe cell destruction and Jailhouse disabling while at it.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 12 Jan 2014 14:16:04 +0000 (15:16 +0100)]
inmates/configs: Use separate configs for tiny-demo and apic-demo

Split the inmate configs into two files, allowing the tiny-demo to run in
parallel to the apic-demo on a different CPU and at a different RAM
location. The tiny-demo is now on CPU 2 and uses the second UART. Note
that both cells use the PM timer, but shared access to it is harmless as
it is read-only.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 12 Jan 2014 12:02:13 +0000 (13:02 +0100)]
inmates: Make printk UART base configurable by inmates

This allows to use different UARTs for different inmates. For now, both
included examples continue to use the same.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Ivan Kolchin [Mon, 13 Jan 2014 12:48:45 +0000 (16:48 +0400)]
x86: Workable configuration for Celsius-W420

Valid config-file was created and tested.
Test PC has configuration:
  - Ubuntu 13.10
  - 4 GB RAM

Signed-off-by: Ivan Kolchin <ivan.kolchin@siemens.com>
Jan Kiszka [Sun, 12 Jan 2014 14:32:18 +0000 (15:32 +0100)]
gitignore: Exclude all *.bin files

We don't want to track any binary file at all.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 12 Jan 2014 15:12:12 +0000 (16:12 +0100)]
x86: Don't switch off virtualization on panic stops

This CPU is lost until the next system reset. Disabling VMX mode doesn't
buy us anything, so drop that line of code.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 12 Jan 2014 14:57:10 +0000 (15:57 +0100)]
x86: Loop over hlt in shutdown state

If an NMI should hit us while waiting for Linux to get this CPU online
again, we should avoid resuming execution of the hypervisor code. The
active hypervisor page table may not allow us to cause much damage, but
it's better to not rely on this.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 12 Jan 2014 11:52:02 +0000 (12:52 +0100)]
x86: Add barrier after vmx_state to VMXOFF

This plugs potential races between vmx_cpu_exit switching VMX mode off
and vmx_schedule_vmexit still accessing the VMCS of the same CPU.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 12 Jan 2014 11:16:49 +0000 (12:16 +0100)]
x86: Fix state test in vmx_schedule_vmexit

Obviously broken. This could cause hypervisor faults, e.g. when sending
NMI IPIs to a CPU in panic state.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 11 Jan 2014 18:17:22 +0000 (19:17 +0100)]
driver: Register reboot notifier to shut down Jailhouse

Provided the cell shutdown can be performed without problems, this
allows to clean up Jailhouse and all running cells before system reboot
or power-down.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 11 Jan 2014 12:02:59 +0000 (13:02 +0100)]
core: Suspend non-Linux cells during shutdown

Just like we already do during cell destruction, we also have to suspend
non-Linux cells that are still running during a hypervisor shutdown. This
excludes race scenarios in cells with multiple CPUs.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 11 Jan 2014 11:05:51 +0000 (12:05 +0100)]
x86: Rename apic_deliver_[logical_dest]ipi -> apic_send_[logical_dest]ipi

Delivery takes place on the target CPU, sending is rather what we do
here. No functional changes.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 11 Jan 2014 10:58:03 +0000 (11:58 +0100)]
x86: Move CPU control logic from apic.c to control.c

A lot of APIC-unrelated CPU execution control piled up in apic.c.
Factor it out from the APIC core logic and move it over to x86/control.c
in order to keep duties separated.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 11 Jan 2014 10:19:03 +0000 (11:19 +0100)]
x86: Make sure that cpu suspension take precedence over INIT processing

If arch_suspend_cpu sent by one CPU races with an INIT signal sent by
another, make sure that we process the suspension first. Cell management
has higher priority than intra-cell events.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 11 Jan 2014 09:09:08 +0000 (10:09 +0100)]
core: Remove unused page_map_virt2phys

This might have been useful for reverse-walking hypervisor-provided
guest page tables, but there is still no use case in sight. Can be
reintroduced when needed.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 10 Jan 2014 19:57:56 +0000 (20:57 +0100)]
x86: Flush DMAR unit caches after setting root pointer

This is required according to section 6.6 in the VT-d architecture
specification. Failing to do so can cause spurious faults after
enabling the hypervisor.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 11 Jan 2014 08:44:48 +0000 (09:44 +0100)]
x86: Factor out vtd_flush_dmar_caches

This introduces a more flexible per-unit DMAR cache flush. The scope of
the flush can be specified by the caller. We will reuse it to perform a
global flush.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 10 Jan 2014 19:35:32 +0000 (20:35 +0100)]
config: Add graphic RAM of h87i to region list

At least the GPU requires access to its graphic RAM that is taken from
main memory.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 10 Jan 2014 17:25:52 +0000 (18:25 +0100)]
x86: Wait for DMAR units to enable translation

Better spin until the hardware reports DMA translation as active before
resuming initialization. In case the hardware is slow, this avoids
starting into isolated mode without all isolation fully active.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
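
Spinning on the translation-enable status could be sketched like this. The register is modelled as a plain volatile word passed by pointer. DEMO_GSTS_TES mirrors the TES bit (bit 31) of the VT-d global status register, while the function name and the returned poll count are additions of this sketch.

```c
#include <stdint.h>

#define DEMO_GSTS_TES (1u << 31)  /* translation enable status */

/*
 * Poll the status register until the hardware reports translation as
 * active. Returns how many polls saw the bit still cleared. The loop
 * is unbounded, as in the commit: it relies on the hardware eventually
 * setting the bit.
 */
unsigned long wait_for_translation_enabled(const volatile uint32_t *gsts_reg)
{
	unsigned long polls = 0;

	while (!(*gsts_reg & DEMO_GSTS_TES))
		polls++;
	return polls;
}
```
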
Jan Kiszka [Fri, 10 Jan 2014 17:22:41 +0000 (18:22 +0100)]
x86: Rename VTD_GSTS_TE -> VTD_GSTS_TES

Aligns it with the spec's notation. No functional change.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 10 Jan 2014 16:27:46 +0000 (17:27 +0100)]
core: Introduce arch_shutdown

Introduce an arch-specific shutdown function to be called once when the
hypervisor is disabled. Use it on x86 for invoking vtd_shutdown. This
avoids calling vtd_shutdown multiple times, which was harmless
but conceptually incorrect.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Thu, 9 Jan 2014 08:27:57 +0000 (09:27 +0100)]
x86: Reject hypercalls issued by userspace contexts

Reject any hypercall issued by userspace contexts, thus enabling cells to
establish proper access control to Jailhouse services.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
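
One way to tell userspace callers apart is to look at the requested privilege level in the low two bits of the guest's CS selector. The sketch below assumes exactly that check; the commit message does not say how the hypervisor detects userspace, and handle_hypercall and dispatch are placeholder names.

```c
#include <errno.h>
#include <stdint.h>

#define SELECTOR_RPL_MASK 0x3   /* low two selector bits: privilege level */

/* Placeholder for the real hypercall dispatcher. */
static long dispatch(unsigned long code)
{
	return (long)code;
}

/*
 * Only callers running at privilege level 0 (guest kernel) may issue
 * hypercalls; everything else is rejected with -EPERM.
 */
long handle_hypercall(uint16_t guest_cs_selector, unsigned long code)
{
	if ((guest_cs_selector & SELECTOR_RPL_MASK) != 0)
		return -EPERM;
	return dispatch(code);
}
```

A guest kernel can then expose Jailhouse services to its own userspace through whatever access control it sees fit.
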
10 years agox86: Factor out vmx_handle_hypercall
Jan Kiszka [Thu, 9 Jan 2014 08:27:19 +0000 (09:27 +0100)]
x86: Factor out vmx_handle_hypercall

Move hypercall handling from vmx_handle_exit into a separate function
for better code structuring. No functional changes.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
10 years agoTODO: DMAR support for non-Linux cells completed
Jan Kiszka [Sun, 5 Jan 2014 16:14:25 +0000 (17:14 +0100)]
TODO: DMAR support for non-Linux cells completed

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
10 years agoarm: Fix build
Jan Kiszka [Sun, 5 Jan 2014 13:03:55 +0000 (14:03 +0100)]
arm: Fix build

Add stubs for new arch-specific functions.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
10 years agoDocument VT-d disabling in Linux kernel
Jan Kiszka [Tue, 31 Dec 2013 08:42:33 +0000 (09:42 +0100)]
Document VT-d disabling in Linux kernel

We do not support emulating VT-d for the Linux kernel, so we have to
require disabling it when Jailhouse is used.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
10 years agox86: Clear APIC on INIT emulation and shutdown
Jan Kiszka [Fri, 27 Dec 2013 23:18:56 +0000 (00:18 +0100)]
x86: Clear APIC on INIT emulation and shutdown

Make sure no problematic part of the previous APIC state leaks across
an INIT signal or the shutdown of non-Linux cells. We mask all LVTs,
then ack any interrupts that are marked in the APIC's ISR and finally
drain all interrupts that may be still held in IRR. This is done by
clearing TPR first, thus releasing interrupts that may have been blocked
by a raised task priority. We then briefly enable local interrupts so
that the dummy handler installed for vectors 32-255 can accept and ack
them.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
10 years agox86: Factor out set_idt_int_gate
Jan Kiszka [Sun, 5 Jan 2014 12:23:31 +0000 (13:23 +0100)]
x86: Factor out set_idt_int_gate

Encapsulate common code to set an interrupt gate in the IDT. Use a
symbolic constant to represent its flags.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
10 years agox86: Introduce enable_irq and disable_irq inline helpers
Jan Kiszka [Sun, 5 Jan 2014 12:20:57 +0000 (13:20 +0100)]
x86: Introduce enable_irq and disable_irq inline helpers

To be used soon for APIC interrupt draining.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
10 years agox86: Factor out interrupt_entry macro
Jan Kiszka [Sun, 5 Jan 2014 12:18:35 +0000 (13:18 +0100)]
x86: Factor out interrupt_entry macro

Move the interrupt entry code into an assembly macro so that it can be
reused for other interrupts than the NMI.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
10 years agocore: Enforce cell creation and hypervisor shutdown to run over Linux
Jan Kiszka [Sun, 29 Dec 2013 21:55:10 +0000 (22:55 +0100)]
core: Enforce cell creation and hypervisor shutdown to run over Linux

The rule is already encoded for cell destruction; also establish it
formally for cell creation and Jailhouse shutdown: the code is not yet
prepared to take those commands over a non-Linux cell. This may change
one day (if there is a use case), but for now play safe and reject rule
violations, now with the error code -EPERM.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
10 years agocore: Suspend cell before its destruction
Jan Kiszka [Sun, 29 Dec 2013 21:43:19 +0000 (22:43 +0100)]
core: Suspend cell before its destruction

To prevent CPU parking from racing with INIT/SIPI signals sent by not
yet parked CPUs of the same cell, stop all cell CPUs first.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
10 years agox86: Release VT-d resources on cell destruction
Jan Kiszka [Fri, 27 Dec 2013 14:02:17 +0000 (15:02 +0100)]
x86: Release VT-d resources on cell destruction

Corresponding to VMX, implement vtd_map_memory_region,
vtd_unmap_memory_region and vtd_cell_exit so that we can drop all VT-d
related resources of a cell on its destruction and, according to the
configuration, reassign them back to Linux.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
10 years agocore: Generic memory region handling on cell destruction
Jan Kiszka [Fri, 27 Dec 2013 13:10:49 +0000 (14:10 +0100)]
core: Generic memory region handling on cell destruction

Most of vmx_remap_to_linux and also the memory region loop in
vmx_cell_exit are generic and can be reused for adding VT-d support and,
later on, different architectures. So move the generic bits to the core
and provide arch_map_memory_region and arch_unmap_memory_region instead.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
10 years agocore: Introduce jailhouse_cell_pci_devices helper
Jan Kiszka [Fri, 27 Dec 2013 12:12:27 +0000 (13:12 +0100)]
core: Introduce jailhouse_cell_pci_devices helper

Given a cell configuration, this function returns a pointer to the first
PCI device. Use it instead of open-coding the address calculation.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
10 years agocore/driver: Introduce jailhouse_cell_cpu_set helper
Jan Kiszka [Fri, 27 Dec 2013 00:10:19 +0000 (01:10 +0100)]
core/driver: Introduce jailhouse_cell_cpu_set helper

Given a cell configuration, this function returns a pointer to the CPU
set. Use it instead of open-coding the address calculation.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
10 years agocore: Introduce jailhouse_cell_pio_bitmap helper
Jan Kiszka [Thu, 26 Dec 2013 23:35:57 +0000 (00:35 +0100)]
core: Introduce jailhouse_cell_pio_bitmap helper

Given a cell configuration, this function returns a pointer to the PIO
bitmap. Use it instead of open-coding the address calculation.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
10 years agocore/driver: Introduce jailhouse_cell_mem_regions helper
Jan Kiszka [Thu, 26 Dec 2013 23:27:33 +0000 (00:27 +0100)]
core/driver: Introduce jailhouse_cell_mem_regions helper

Given a cell configuration, this function returns a pointer to the first
memory region. Use it instead of open-coding the address calculation.
This requires marking several pointers to jailhouse_memory as
referencing a constant data structure.
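
The accessor pattern shared by these helpers can be sketched as follows,
assuming a configuration blob that consists of a fixed header followed
by a CPU set and an array of memory regions. All structure and function
names here are hypothetical and the layout is simplified; the real
Jailhouse definitions may differ:

```c
#include <stdint.h>

/* Hypothetical, simplified memory region descriptor. */
struct example_memory {
	uint64_t phys_start;
	uint64_t virt_start;
	uint64_t size;
	uint64_t flags;
};

/* Hypothetical fixed-size header; variable-sized data follows it. */
struct example_cell_desc {
	char name[32];
	uint32_t cpu_set_size;		/* size of the CPU set in bytes */
	uint32_t num_memory_regions;
};

/* The CPU set is assumed to start right behind the header. */
static inline const unsigned long *
example_cell_cpu_set(const struct example_cell_desc *cell)
{
	return (const unsigned long *)((const char *)cell + sizeof(*cell));
}

/* The memory regions are assumed to follow the CPU set. */
static inline const struct example_memory *
example_cell_mem_regions(const struct example_cell_desc *cell)
{
	return (const struct example_memory *)
		((const char *)example_cell_cpu_set(cell) +
		 cell->cpu_set_size);
}
```

Returning pointers to const data matches the note above: callers that
only walk the configuration have to declare their pointers as const as
well.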

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
10 years agox86: Reconfigure DMAR units on cell creation
Jan Kiszka [Mon, 23 Dec 2013 17:58:40 +0000 (18:58 +0100)]
x86: Reconfigure DMAR units on cell creation

Remove devices that shall be assigned to a newly created cell from
Linux. Then we can initialize the new cell regularly via vtd_cell_init.
We use a register-based cache flush on Linux cell shrinking, requesting
domain granularity for the flush. As the page table will be new and all
reassigned devices are disabled before entry of vtd_cell_init, there is
no need to flush caches after the creation.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
10 years agox86: Officially focus vmx_cell_shrink on the Linux cell
Jan Kiszka [Mon, 23 Dec 2013 23:27:13 +0000 (00:27 +0100)]
x86: Officially focus vmx_cell_shrink on the Linux cell

Let's make it clear: vmx_cell_shrink shall only be used to shrink the
Linux cell by a given config. Rename the function and reduce its
parameter list accordingly.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
10 years agox86: Minor cleanup of DMAR unit enabling
Jan Kiszka [Mon, 23 Dec 2013 23:14:15 +0000 (00:14 +0100)]
x86: Minor cleanup of DMAR unit enabling

No need to check per unit if it is enabled, we require all of them to be
off initially: if the first one is on, all are on.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
10 years agox86: Require DMAR units with CM=0
Jan Kiszka [Mon, 23 Dec 2013 22:58:51 +0000 (23:58 +0100)]
x86: Require DMAR units with CM=0

According to the spec, all hardware implementations have to support
caching mode 0, i.e. they do not cache invalid remapping table entries.
We target real hardware and will make use of this property, so validate
this assumption during startup.
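
A sketch of the validation, assuming the Caching Mode flag sits in bit 7
of the 64-bit Capability register as described in the VT-d
specification. The function name and the -EIO error code are chosen for
illustration only:

```c
#include <errno.h>
#include <stdint.h>

#define VTD_CAP_CM	(1ull << 7)	/* Capability register: Caching Mode */

/*
 * Hypothetical startup check: refuse units that report CM=1, i.e. that
 * may cache not-present or invalid remapping table entries.
 */
static int vtd_check_caching_mode(uint64_t cap)
{
	return (cap & VTD_CAP_CM) ? -EIO : 0;
}
```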

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
10 years agox86: Limit domain IDs to maximum that all DMAR units support
Jan Kiszka [Mon, 23 Dec 2013 14:27:49 +0000 (15:27 +0100)]
x86: Limit domain IDs to maximum that all DMAR units support

The VT-d spec says that we must not use the same DID for different
domains, namely for different DMA page tables. So reject any cell with
an ID larger than the range that all DMAR units support.
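
As a rough sketch of how the common limit could be derived: the VT-d
Capability register encodes the number of supported domains in its ND
field (bits 2:0) as 2^(4 + 2 * ND). The helper names below are
hypothetical, and using the cell ID directly as domain ID is an
assumption based on the description above:

```c
#include <stdint.h>

#define VTD_CAP_ND_MASK	0x7ull	/* Capability register: Number of Domains */

/* Number of domain IDs a single unit supports, decoded from CAP.ND. */
static unsigned int vtd_num_domains(uint64_t cap)
{
	return 1u << (4 + 2 * (unsigned int)(cap & VTD_CAP_ND_MASK));
}

/* Smallest domain count across all units; ~0u if there are no units. */
static unsigned int vtd_common_domain_limit(const uint64_t *caps,
					    unsigned int units)
{
	unsigned int limit = ~0u, n, i;

	for (i = 0; i < units; i++) {
		n = vtd_num_domains(caps[i]);
		if (n < limit)
			limit = n;
	}
	return limit;
}

/* A cell is acceptable only if every unit can represent its ID. */
static int vtd_cell_id_valid(unsigned int cell_id, unsigned int limit)
{
	return cell_id < limit;
}
```

With one unit reporting ND=2 (256 domains) and another ND=1 (64
domains), the common limit is 64, so cell IDs 0 to 63 are accepted.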

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
10 years agox86: Replace wbinvd on vtd_cell_init with focused flush_cache calls
Jan Kiszka [Mon, 23 Dec 2013 13:12:08 +0000 (14:12 +0100)]
x86: Replace wbinvd on vtd_cell_init with focused flush_cache calls

Instead of forcing a complete cache flush, only mark those root entries
and context entries invalid that are actually changed. The DMA page
table creation was already marked as coherent.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
10 years agocore: Add support for cache-coherent changes to page tables
Jan Kiszka [Mon, 23 Dec 2013 13:06:23 +0000 (14:06 +0100)]
core: Add support for cache-coherent changes to page tables

Extend page_map_create and page_map_destroy with a parameter that
controls cache flushes after page table changes. This feature is useful
for changes on VT-d tables, therefore only related invocations will be
tagged with PAGE_MAP_COHERENT, the rest remains PAGE_MAP_NON_COHERENT.
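
The effect of the new parameter can be sketched like this. Only the two
flag names come from the description above; the enum type, the helper
functions and the counting stand-in for the cache flush are assumptions
made to keep the example self-contained:

```c
#include <stddef.h>
#include <stdint.h>

enum page_map_coherent {
	PAGE_MAP_COHERENT,
	PAGE_MAP_NON_COHERENT,
};

/* Counts flush requests so that the effect of the flag is observable. */
static unsigned int example_flush_calls;

/*
 * Stand-in for an architecture-specific cache line flush of
 * [addr, addr + size). It only records that it was called.
 */
static void example_flush_cache(const void *addr, size_t size)
{
	(void)addr;
	(void)size;
	example_flush_calls++;
}

/*
 * Hypothetical page table update: write the entry and, if requested,
 * flush it from the CPU cache so that a reader that does not snoop
 * the cache sees the new value in memory.
 */
static void example_set_pte(uint64_t *pte, uint64_t value,
			    enum page_map_coherent coherent)
{
	*pte = value;
	if (coherent == PAGE_MAP_COHERENT)
		example_flush_cache(pte, sizeof(*pte));
}
```

Callers that modify tables walked only by the CPU would pass
PAGE_MAP_NON_COHERENT and skip the flush cost entirely.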

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>