rtime.felk.cvut.cz Git - jailhouse.git/log
jailhouse.git
8 years ago Documentation: Add how-to for non-root Linux cells
Jan Kiszka [Sun, 24 May 2015 09:40:22 +0000 (11:40 +0200)]
Documentation: Add how-to for non-root Linux cells

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
8 years ago tools: Add bash completion for Linux loader command
Jan Kiszka [Sun, 10 May 2015 21:54:27 +0000 (23:54 +0200)]
tools: Add bash completion for Linux loader command

Teach the new subcommand "cell linux" to the bash completion script.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
8 years ago configs: Add a linux-x86-demo cell configuration
Jan Kiszka [Sun, 24 May 2015 08:39:28 +0000 (10:39 +0200)]
configs: Add a linux-x86-demo cell configuration

This demonstrates non-root Linux booting. It is targeting the QEMU
reference setup but can easily be tailored for physical setups as well.
The config contains an ivshmem device to demonstrate both PCI device
discovery and inter-cell communication. Of the four available CPUs in
the QEMU setup, 3 are assigned to the cell to show that SMP works.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
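
The CPU assignment described above can be pictured as a bitmap with one bit
per CPU. The following is only a sketch: which three of the four CPUs the
configuration actually assigns is an assumption here, and `cpu_bitmap` is a
hypothetical helper, not part of the Jailhouse tools.

```python
def cpu_bitmap(cpus):
    """Build a bitmap with one bit set per CPU number."""
    bitmap = 0
    for cpu in cpus:
        bitmap |= 1 << cpu
    return bitmap

# Assumption: CPU 0 stays with the root cell, CPUs 1-3 go to the Linux cell.
linux_cell_cpus = cpu_bitmap([1, 2, 3])
print(hex(linux_cell_cpus))
```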
8 years ago tools, inmates: Add "cell linux" subcommand to jailhouse tool
Jan Kiszka [Sun, 24 May 2015 08:10:22 +0000 (10:10 +0200)]
tools, inmates: Add "cell linux" subcommand to jailhouse tool

This adds support for loading and booting paravirtualized x86 Linux
kernels in non-root cells. The jailhouse tool is extended for this
purpose with a new subcommand "cell linux" that accepts the cell
configuration, the kernel image and an optional initrd as input. Also a
kernel command line can be specified. The script then creates the cell,
unless it already exists, and loads the kernel, the initrd, a special
boot loader and the required parameters for that loader into the cell
RAM. Finally, it
starts the cell.

The interface between the python helper and the boot loader inmate is
based on the kernel's boot_params structure with a custom setup_data
extension. The former is initialized by the python helper, specifically
to inform Linux about the location of its initrd and the command line.
It also
contains an e820 list to report the memory layout. The setup_data is
filled by the boot loader with information about the PM timer address
and the available CPUs as well as their physical APIC IDs. For that
purpose, the Linux cell requires a communication region.

Although the loader script is currently x86-only, extension to ARM is
surely feasible as well.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
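
As a rough illustration of the e820 list mentioned above, the sketch below
packs memory-map entries in the layout the x86 Linux boot protocol uses
(64-bit address, 64-bit size, 32-bit type, packed). The region addresses and
the helper name `pack_e820` are made up for illustration and are not taken
from the jailhouse tool.

```python
import struct

# One e820 entry: u64 addr, u64 size, u32 type, little-endian, 20 bytes.
E820_ENTRY = struct.Struct("<QQI")
E820_RAM = 1
E820_RESERVED = 2

def pack_e820(regions):
    """Serialize (addr, size, type) tuples into a flat e820 table."""
    return b"".join(E820_ENTRY.pack(addr, size, typ)
                    for addr, size, typ in regions)

# Hypothetical cell layout: 1 MB of low RAM plus a larger high region.
table = pack_e820([
    (0x0000_0000, 0x0010_0000, E820_RAM),
    (0x0020_0000, 0x03e0_0000, E820_RAM),
])
print(len(table) // E820_ENTRY.size, "entries,", len(table), "bytes")
```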
9 years ago tools: Generalize python script patching during installation
Jan Kiszka [Sun, 24 May 2015 08:07:12 +0000 (10:07 +0200)]
tools: Generalize python script patching during installation

Rename patch_datadir_var to patch_dirvar and add a parameter to specify
which variable to patch. This will allow using it for libexecdir as well.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago inmates: Add infrastructure for inmates that serve as tools
Jan Kiszka [Sun, 24 May 2015 08:00:02 +0000 (10:00 +0200)]
inmates: Add infrastructure for inmates that serve as tools

We will add an x86 inmate that will support the booting of Linux in
non-root cells. This lays the foundation for such tools, including their
installation into $(libexecdir)/jailhouse.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago configs: Extend inmates memory in qemu config
Jan Kiszka [Mon, 27 Apr 2015 18:42:13 +0000 (20:42 +0200)]
configs: Extend inmates memory in qemu config

Reduce the hypervisor memory to 6 MB, which is still plenty, so that we
can create more or larger inmates. Reorder and extend the description
accordingly.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago x86: Ignore writes to the xAPIC ID register
Jan Kiszka [Sat, 2 May 2015 10:40:05 +0000 (12:40 +0200)]
x86: Ignore writes to the xAPIC ID register

Writing to the APIC ID register is legal in xAPIC mode but is ignored by
recent CPU models. Linux, for example, performs such a write on boot-up,
and ignoring it is both cheap and helpful to keep para-virtualization
needs low.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago x86: Implement standard hypervisor detection protocol
Jan Kiszka [Fri, 1 May 2015 11:00:09 +0000 (13:00 +0200)]
x86: Implement standard hypervisor detection protocol

This provides cpuid-based Jailhouse detection conforming to the protocol
also used by other major hypervisors: set bit 31 of ecx for function
0x01, provide a signature via function 0x40000000 and a so far empty
feature set via function 0x40000001.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
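
From the guest side, the detection protocol described above could be walked
roughly as follows. This is a sketch only: `fake_cpuid` stands in for the
real instruction, and the signature bytes and the value returned in eax are
assumptions, not taken from this commit.

```python
import struct

HYPERVISOR_PRESENT = 1 << 31      # ecx bit 31 of cpuid function 0x01
SIGNATURE_FUNCTION = 0x40000000

def hypervisor_signature(cpuid):
    """Return the 12-byte signature, or None if no hypervisor is flagged.

    `cpuid` is any callable mapping a function number to
    (eax, ebx, ecx, edx).
    """
    _, _, ecx, _ = cpuid(0x01)
    if not ecx & HYPERVISOR_PRESENT:
        return None
    _, ebx, ecx, edx = cpuid(SIGNATURE_FUNCTION)
    return struct.pack("<III", ebx, ecx, edx)

def fake_cpuid(function):
    # Stand-in for the real instruction; signature bytes are an assumption.
    sig = struct.unpack("<III", b"Jailhouse\0\0\0")
    table = {
        0x01: (0, 0, HYPERVISOR_PRESENT, 0),
        0x40000000: (0x40000001,) + sig,
        0x40000001: (0, 0, 0, 0),   # so far empty feature set
    }
    return table[function]

print(hypervisor_signature(fake_cpuid))
```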
9 years ago x86: Always intercept cpuid
Jan Kiszka [Fri, 1 May 2015 10:12:27 +0000 (12:12 +0200)]
x86: Always intercept cpuid

Refactor vmx_handle_cpuid to vcpu_handle_cpuid and ensure that both VMX
and SVM use it for emulating guest cpuid invocations. That means SVM has
to intercept it now.

We will need this to reliably indicate the presence of Jailhouse to our
inmates.

CC: Valentine Sinitsyn <valentine.sinitsyn@gmail.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago core: ivshmem: Simplify pci_ivshmem_cfg_read
Jan Kiszka [Mon, 18 May 2015 07:44:17 +0000 (09:44 +0200)]
core: ivshmem: Simplify pci_ivshmem_cfg_read

Masking of the returned value is already done by the callers. So we just
need to shift the DWORD according to the access address.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
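
The shift mentioned above can be sketched in a few lines. The function name
`cfg_read` and the sample row value are illustrative only; the real code
operates on the device's emulated config space.

```python
def cfg_read(dword, address):
    """Shift a 32-bit config-space row so the addressed byte is in bits 7:0.

    Masking to the access size is left to the caller.
    """
    return dword >> ((address & 0x3) * 8)

row = 0x11223344
# The caller applies the size mask, here for a 16-bit read at offset 0x06.
print(hex(cfg_read(row, 0x06) & 0xffff))
```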
9 years ago core: ivshmem: Use generic BAR emulation
Jan Kiszka [Mon, 18 May 2015 06:51:27 +0000 (08:51 +0200)]
core: ivshmem: Use generic BAR emulation

Simplify the code by relying on the PCI core to emulate BAR writes. This
just requires proper settings of the bar_mask fields of ivshmem devices
in configs.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago configs: Add bar_mask fields for ivshmem devices
Jan Kiszka [Mon, 18 May 2015 06:57:38 +0000 (08:57 +0200)]
configs: Add bar_mask fields for ivshmem devices

Will be used when moving BAR emulation to the PCI core.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago core: ivshmem: Refactor pci_ivshmem_cfg_write
Jan Kiszka [Mon, 18 May 2015 05:22:09 +0000 (07:22 +0200)]
core: ivshmem: Refactor pci_ivshmem_cfg_write

We can simplify this by passing in the bias-shifted row value to be written
and the access byte-mask. Then pci_ivshmem_cfg_write just needs to
combine the new value with those of the other bytes as needed, and we
can drop all the size-specific dispatching.

This also lays the foundation for reusing generic BAR emulation.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
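
A minimal sketch of the merge step described above, assuming the value has
already been shifted to its position within the dword row and the byte-mask
has one bit per byte. The names `byte_mask_to_bits` and `merge_row` are
hypothetical.

```python
def byte_mask_to_bits(byte_mask):
    """Expand a 4-bit byte-enable mask into a 32-bit bit mask."""
    bits = 0
    for byte in range(4):
        if byte_mask & (1 << byte):
            bits |= 0xff << (byte * 8)
    return bits

def merge_row(old_row, shifted_value, byte_mask):
    """Combine the written bytes with the untouched bytes of a dword row."""
    bits = byte_mask_to_bits(byte_mask)
    return (old_row & ~bits & 0xffffffff) | (shifted_value & bits)

# 16-bit write of 0xbeef to the upper half of a row.
print(hex(merge_row(0x11223344, 0xbeef << 16, 0b1100)))
```

Because the size only shows up in the byte-mask, 8-, 16- and 32-bit writes
all take the same path.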
9 years ago core: Add basic BAR write emulation for physical PCI devices
Jan Kiszka [Sun, 17 May 2015 09:50:04 +0000 (11:50 +0200)]
core: Add basic BAR write emulation for physical PCI devices

This enables cells to explore the size of PCI device resources by writing
1's to base address registers and then reading back which bits got
modified. We so far didn't support this because Linux in the root cell
already retrieved the sizes before Jailhouse ran and other cells could
have been customized to use preconfigured information.

However, adding this feature only increases the code by a few tens of
lines while making life for preexisting inmate OSes, including Linux,
significantly easier. Moreover, we will save some code again when
switching ivshmem's BAR emulation to this version.

Note that this does NOT allow cells to remap PCI device resources in
their address space. That would require more effort with limited
benefits. Given that we preconfigure all BARs, neither Linux nor other
OSes have a need to change them. Any attempt to do so will simply have
no effect.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
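
The sizing procedure a cell performs can be sketched with a toy model of one
32-bit memory BAR. The class `EmulatedBar`, its fields and the sample
address and mask are assumptions for illustration; they only mirror the
behaviour described above, not the hypervisor's data structures.

```python
class EmulatedBar:
    """Toy model of one 32-bit memory BAR with a preconfigured address."""

    def __init__(self, base, bar_mask):
        self.bar_mask = bar_mask  # writable address bits
        self.shadow = base        # value returned on reads

    def write(self, value):
        # Only bits inside bar_mask are stored; the rest keep their value.
        keep = self.shadow & ~self.bar_mask & 0xffffffff
        self.shadow = (value & self.bar_mask) | keep

    def read(self):
        return self.shadow

def probe_size(bar):
    """Classic PCI sizing: write all 1's, read back, restore the old value."""
    original = bar.read()
    bar.write(0xffffffff)
    readback = bar.read()
    bar.write(original)
    # Drop the four low type bits of a memory BAR, then negate.
    return (~(readback & 0xfffffff0) + 1) & 0xffffffff

bar = EmulatedBar(base=0xfe000000, bar_mask=0xffffc000)
print(hex(probe_size(bar)))
```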
9 years ago core, tools: Add BAR masks to jailhouse_pci_device
Jan Kiszka [Sun, 17 May 2015 09:06:50 +0000 (11:06 +0200)]
core, tools: Add BAR masks to jailhouse_pci_device

Add a new field per BAR to the PCI device configuration. It allows
masking the modifiable part of a BAR before storing writes. This will
support BAR write emulation that is required to make PCI resource sizes
explorable by cells.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago core: pci: Rework config space header write moderation
Jan Kiszka [Sun, 17 May 2015 08:47:33 +0000 (10:47 +0200)]
core: pci: Rework config space header write moderation

Switch to a more powerful array-based write access control for the PCI
config space header. The array consists of tuples, each controlling the
access to one dword row. Access can be denied, permitted or emulated as
read-only, thus ignored. As before, a mask selects the bytes of the row
for which the access type applies.

This new model allows us to properly describe which registers of the bridge
header we effectively want to freeze as read-only so that Linux can
rescan buses.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
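
An array-based access table of this kind might look like the sketch below.
The rows, masks and access types are illustrative only and do not reproduce
the hypervisor's tables. Treating a write that touches bytes outside the
row's mask as denied is an assumption of this sketch.

```python
from enum import Enum

class Access(Enum):
    DENY = 0
    PERMIT = 1
    RDONLY = 2   # emulated as read-only: the write is silently ignored

# One (byte mask, access type) tuple per dword row of the header.
EXAMPLE_WRITE_ACCESS = [
    (0b1111, Access.RDONLY),   # row 0x00
    (0b0011, Access.PERMIT),   # row 0x04, lower two bytes only
    (0b1111, Access.DENY),     # row 0x08
]

def moderate_write(table, address, size):
    """Classify a config-space header write of `size` bytes at `address`."""
    row, offset = divmod(address, 4)
    if row >= len(table):
        return Access.DENY
    requested = ((1 << size) - 1) << offset
    mask, access = table[row]
    # The row's access type only applies to the bytes selected by its mask.
    if requested & ~mask:
        return Access.DENY
    return access

print(moderate_write(EXAMPLE_WRITE_ACCESS, 0x04, 2))
```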
9 years ago tools: root cell template: add delay IO port to root cell whitelist
Henning Schild [Tue, 19 May 2015 17:03:26 +0000 (19:03 +0200)]
tools: root cell template: add delay IO port to root cell whitelist

Port 0x80 is used by some device drivers to delay IO operations. Put it
on the default whitelist in our root-cell template.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago config-create: remove unused variable
Henning Schild [Mon, 18 May 2015 10:47:52 +0000 (12:47 +0200)]
config-create: remove unused variable

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago x86: vmx: Clear IA32_DEBUGCTL only on hard reset
Jan Kiszka [Mon, 18 May 2015 10:53:59 +0000 (12:53 +0200)]
x86: vmx: Clear IA32_DEBUGCTL only on hard reset

According to the spec, this MSR (like most) remains unchanged on INIT. As
it's cheap to conform to this, follow that rule.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago core: ivshmem: Improve error reporting
Jan Kiszka [Mon, 18 May 2015 06:49:52 +0000 (08:49 +0200)]
core: ivshmem: Improve error reporting

The warning in ivshmem_update_msix is actually fatal (callers will fail
the CPU when we return an error code), and we need some additional
reporting on MMIO accesses. Without the latter, we just get a register
dump and no information about where the problem was detected.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago core: pci: Adjust error report wording in pci_msix_access_handler
Jan Kiszka [Mon, 18 May 2015 07:32:15 +0000 (09:32 +0200)]
core: pci: Adjust error report wording in pci_msix_access_handler

The error scope is broader, also includes (unaligned) reads. Moreover,
the access is not on the BAR but the MSI-X table or PBA.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago core: ivshmem: Fix comment
Jan Kiszka [Mon, 18 May 2015 05:12:54 +0000 (07:12 +0200)]
core: ivshmem: Fix comment

The value is 0 or 1, depending on the ID assigned during
ivshmem_connect_cell.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago core: ivshmem: Remove superfluous ivshmem endpoint checks
Jan Kiszka [Mon, 18 May 2015 04:47:44 +0000 (06:47 +0200)]
core: ivshmem: Remove superfluous ivshmem endpoint checks

An ivshmem PCI device always has a valid ivshmem_endpoint pointer, that
is ensured by ivshmem_connect_cell, called during device initialization.
And there is nothing that invalidates the pointer during device
lifetime. So we can remove related NULL checks.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago core: ivshmem: Fix BAR range in ivshmem_cfg_write32
Jan Kiszka [Mon, 18 May 2015 04:43:07 +0000 (06:43 +0200)]
core: ivshmem: Fix BAR range in ivshmem_cfg_write32

The last BAR is number 5.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago core: pci: Skip architecture hooks on virtual device addition/removal
Jan Kiszka [Sun, 17 May 2015 08:39:19 +0000 (10:39 +0200)]
core: pci: Skip architecture hooks on virtual device addition/removal

arch_pci_add_device and arch_pci_remove_device acted as nops for virtual
PCI devices so far, and there is no change in sight. So stop calling the
hooks from pci_add/remove_virtual_device, drop related checks from the
vtd code and rename functions that work on physical devices to clarify
their scope.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago core: pci: Fix error forwarding from pci_add_device
Jan Kiszka [Sun, 17 May 2015 08:34:10 +0000 (10:34 +0200)]
core: pci: Fix error forwarding from pci_add_device

Properly forward the error that arch_pci_add_device returned.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago x86: Do not reset PAT and MTRR_DEF_TYPE on INIT
Jan Kiszka [Fri, 15 May 2015 17:20:18 +0000 (19:20 +0200)]
x86: Do not reset PAT and MTRR_DEF_TYPE on INIT

That is also not done by real hardware, and it can easily confuse
inmates.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago driver: Prevent disabling when there are offlined CPUs
Jan Kiszka [Fri, 15 May 2015 07:57:41 +0000 (09:57 +0200)]
driver: Prevent disabling when there are offlined CPUs

If Linux itself has taken some of the CPUs offline, i.e. not for passing
them to other cells, and we then disable the hypervisor, those CPUs will
not be released. Attempts to online them again later on will fail. Reject
disable requests in such a case.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago inmates, configs: Add x86 SMP demo
Jan Kiszka [Fri, 15 May 2015 06:49:22 +0000 (08:49 +0200)]
inmates, configs: Add x86 SMP demo

This is a simple demo for SMP startup and IPI signaling.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago inmates: x86: Add basic SMP support
Jan Kiszka [Thu, 14 May 2015 14:26:52 +0000 (16:26 +0200)]
inmates: x86: Add basic SMP support

Under Jailhouse, all the cell CPUs are started in parallel. To enable
SMP inmates, the entry code records their number and their APIC IDs (up
to the current limit of 255). Only the first CPU arriving at the entry
check will call inmate_main, the others are parked in halt state.

Inmates can use the recorded parameters to pick up all CPUs by sending
them regular INIT/SIPI signals. We use the entry path for this case as
well: ap_entry is introduced as an alternative entry function pointer.
If it is non-NULL, the CPU will bypass the SMP startup procedure and
call that function.

The library is extended to provide a boot-up barrier and a single-CPU
wakeup service. It also adds a simple IPI service.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago x86: Report number of CPUs via communication region
Jan Kiszka [Fri, 15 May 2015 06:20:35 +0000 (08:20 +0200)]
x86: Report number of CPUs via communication region

Append a field to the x86-specific part of the communication region to
inform non-root cells about the number of CPUs they can expect to show
up during boot.

We can generalize this when ARM has a need as well, but it's more likely
that it will use device trees instead (which are underdeveloped on x86).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago inmates: Build library archive and link it implicitly
Jan Kiszka [Sat, 9 May 2015 14:57:08 +0000 (16:57 +0200)]
inmates: Build library archive and link it implicitly

Kbuild already comes with support for building lib.a archives from a set
of objects. Use this to build inmate libraries for x86, here in 64 and
32-bit form, and for ARM. Link against the correct libraries implicitly
so that the demos no longer have to state their dependencies explicitly.

This will also allow using the inmate libraries from folders other than
the demos because the library objects are now only built once.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago inmates: Define entry point in linker script
Jan Kiszka [Sat, 9 May 2015 14:51:34 +0000 (16:51 +0200)]
inmates: Define entry point in linker script

Export the reset address as symbols and define them as entry point of
our inmates in the linker scripts. We will bundle the headers together
with the other library objects in archives, and defining entry points
will ensure that the related sections will be included in the final
binary. This will simplify the inmate rules significantly.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago build: Simplify dependency rule
Jan Kiszka [Sat, 9 May 2015 13:54:04 +0000 (15:54 +0200)]
build: Simplify dependency rule

$(addprefix) makes no sense here.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago inmates: x86: Fix write_msr for passing u64 variables as value
Jan Kiszka [Fri, 15 May 2015 06:51:34 +0000 (08:51 +0200)]
inmates: x86: Fix write_msr for passing u64 variables as value

The assembler complains about incompatible constraints otherwise.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago x86: Fix and clean up APIC ICR write handling
Jan Kiszka [Fri, 15 May 2015 06:09:40 +0000 (08:09 +0200)]
x86: Fix and clean up APIC ICR write handling

apic_handle_icr_write so far expects the hi_val in the format that
corresponds to the APIC mode in use. Internally, it then normalizes it
into x2APIC-mode format. That complicates the usage and actually
enabled the bug that x2apic_handle_write did not convert the
cell-provided value into the required format.

Simplify and fix things by changing the API of apic_handle_icr_write to
accept the destination only in x2APIC format. That's much easier because
both callers can hard-code the conversion (none or shift by 24 bits) as
they know the input format.

The only side effect is that apic_send_ipi will now report errors with
ICR.hi always in x2APIC format, independent of the delivery path.
Probably even an advantage.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
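
The two hard-coded conversions mentioned above can be pictured as follows:
in xAPIC mode the destination sits in bits 31:24 of the ICR high dword,
while in x2APIC mode the full 32-bit value is the destination. The function
names and the `handle_icr_write` stand-in are hypothetical and do not mirror
the hypervisor's signatures.

```python
def xapic_dest_to_x2apic(icr_hi):
    """xAPIC keeps the destination in bits 31:24 of the ICR high dword."""
    return (icr_hi >> 24) & 0xff

def x2apic_dest(icr_hi):
    """x2APIC already uses the full 32-bit value as the destination."""
    return icr_hi

def handle_icr_write(lo_val, dest):
    """Stand-in for the common handler, which only sees x2APIC-format IDs."""
    return {"vector": lo_val & 0xff, "dest": dest}

# Each caller knows its input format and converts before calling the handler.
via_mmio = handle_icr_write(0x000040fd, xapic_dest_to_x2apic(0x03000000))
via_msr = handle_icr_write(0x000040fd, x2apic_dest(0x00000003))
print(via_mmio == via_msr)
```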
9 years ago x86: Flush pending events when reprogramming the VT-d error interrupt
Jan Kiszka [Fri, 15 May 2015 05:57:29 +0000 (07:57 +0200)]
x86: Flush pending events when reprogramming the VT-d error interrupt

There seems to be the risk of in-flight error events still using the
address and data registers while we reprogram them. In practice, this
shouldn't happen on a correctly configured system because all valid
interrupt sources are silenced at this point. Nevertheless, play safe,
just like Linux does.

However, there is no reason to also read back after unmasking (like
Linux does) because the hardware injects pending events when the mask is
cleared.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago tools: config-create: Fix IOMMU unit number of IOAPICs
Jan Kiszka [Wed, 13 May 2015 17:22:24 +0000 (19:22 +0200)]
tools: config-create: Fix IOMMU unit number of IOAPICs

IOAPICs under the control of IOMMUs with unit number >= 1 were not
described correctly in the generated configs due to a stupid naming
mistake that Python cannot report.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago x86: Clarify precedence of "&" over "?"
Jan Kiszka [Wed, 13 May 2015 06:50:24 +0000 (08:50 +0200)]
x86: Clarify precedence of "&" over "?"

Style adjustment, suggested by cppcheck.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago tools: config-create: Filter out invalid MSI-X capabilities
Jan Kiszka [Tue, 12 May 2015 06:24:17 +0000 (08:24 +0200)]
tools: config-create: Filter out invalid MSI-X capabilities

Avoid passing zero as MSI-X address to the hypervisor; it will only
crash. Instead, disable this capability by leaving related fields cleared.

See also http://thread.gmane.org/gmane.linux.jailhouse/3056.

Reported-by: Yijun Zhu <zhuyijun@huawei.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago x86: Consistent argument ordering for vtd_init_ir_emulation
Jan Kiszka [Mon, 11 May 2015 18:30:05 +0000 (20:30 +0200)]
x86: Consistent argument ordering for vtd_init_ir_emulation

Aligns it with other functions taking both unit_no and reg_base.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago x86: Report IOMMU number on fault events
Jan Kiszka [Mon, 11 May 2015 18:25:15 +0000 (20:25 +0200)]
x86: Report IOMMU number on fault events

Can help in case some device was assigned to the wrong unit.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago Bump version number
Jan Kiszka [Mon, 11 May 2015 15:14:59 +0000 (17:14 +0200)]
Bump version number

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago config-collect: fix filename typo
Henning Schild [Mon, 11 May 2015 10:58:00 +0000 (12:58 +0200)]
config-collect: fix filename typo

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago driver: Fix image loading at unaligned addresses
Jan Kiszka [Sun, 10 May 2015 13:27:08 +0000 (15:27 +0200)]
driver: Fix image loading at unaligned addresses

Make sure that images are loaded at the correct location if the target
address is not aligned on a page boundary.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
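
The general technique for copying to a target that does not start on a page
boundary is to carry the in-page offset into the first chunk. The sketch
below shows that arithmetic only. It is not the driver's code, and the page
size and helper name are assumptions.

```python
PAGE_SIZE = 4096

def split_into_page_chunks(target_address, length):
    """Yield (page_base, offset_in_page, chunk_len) for a copy of `length`."""
    address = target_address
    remaining = length
    while remaining > 0:
        offset = address % PAGE_SIZE
        chunk = min(PAGE_SIZE - offset, remaining)
        yield address - offset, offset, chunk
        address += chunk
        remaining -= chunk

# An image of 0x1400 bytes loaded at an address 0xe00 bytes into a page.
chunks = list(split_into_page_chunks(0x100e00, 0x1400))
for page_base, offset, chunk_len in chunks:
    print(hex(page_base), hex(offset), hex(chunk_len))
```

Only the first chunk has a non-zero offset; dropping it would place the
image at the start of the page instead of at the requested address.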
9 years ago arm: Clean up hypervisor stage 1 memory attributes
Jan Kiszka [Sat, 9 May 2015 06:00:41 +0000 (08:00 +0200)]
arm: Clean up hypervisor stage 1 memory attributes

Of the many attributes defined, some probably wrong, only 3 are actually
used: normal memory, device and non-cacheable. Validate those and drop
the rest. We can re-add more as needed.

See ARM ARM B4.1.104.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago arm: Fix stage 2 memory attributes
Jan Kiszka [Sat, 9 May 2015 05:54:53 +0000 (07:54 +0200)]
arm: Fix stage 2 memory attributes

The definition of memory attributes for stage 2 translations was wrong.
These attributes consist only of 4 bits, but the defines covered 8. Set
the proper values for those two types we use: normal memory and devices.

See ARM ARM B3.6.2 and B3.8.5 for details.

This fixes the enforcement of read-only or write-only cell memory
regions.

Reported-and-tested-by: Philipp Rosenberger <ilu@linutronix.de>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago x86: Update CPUID vmexit counter in vmx_handle_cpuid
Jan Kiszka [Thu, 7 May 2015 18:43:13 +0000 (20:43 +0200)]
x86: Update CPUID vmexit counter in vmx_handle_cpuid

Forgotten so far.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago core: Make field structs of pci_msix_registers and pci_msix_vector anonymous
Jan Kiszka [Thu, 7 May 2015 17:34:01 +0000 (19:34 +0200)]
core: Make field structs of pci_msix_registers and pci_msix_vector anonymous

"field" provides no additional information to the reader, and all
affected sub-fields have unique names, so remove this.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago core: Disable non-root PCI devices on shutdown
Jan Kiszka [Thu, 7 May 2015 17:27:12 +0000 (19:27 +0200)]
core: Disable non-root PCI devices on shutdown

We already disable PCI devices that are removed when a cell is
destroyed, but we should also do this on hypervisor shutdown so that
those devices do not annoy Linux later on with unexpected activity.

The change is bigger as it re-indents the shutdown loop to maintain
readability.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago core: Do not program MSI-X vectors that are masked
Jan Kiszka [Thu, 7 May 2015 17:10:20 +0000 (19:10 +0200)]
core: Do not program MSI-X vectors that are masked

Test for both function-level and vector-level masking before updating an
MSI-X interrupt mapping. Otherwise, we risk letting cells stumble over
stale but masked vector entries.

All accesses to a vector table entry now cause a mapping update. The
vector control dword is always cached to simplify testing it.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
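
The two masking levels can be tested as in the sketch below. The bit
positions follow the PCI MSI-X capability layout (Function Mask in Message
Control, Mask Bit in each entry's Vector Control). The constant and function
names are made up for this example.

```python
MSIX_CTRL_FMASK = 1 << 14     # Message Control: Function Mask
MSIX_VECTOR_MASKED = 1 << 0   # Vector Control: per-vector Mask Bit

def vector_is_masked(message_control, vector_control):
    """True if the function or the individual vector is masked."""
    if message_control & MSIX_CTRL_FMASK:
        return True
    return bool(vector_control & MSIX_VECTOR_MASKED)

def maybe_update_mapping(message_control, vector_control, update):
    """Call `update` only for vectors that may actually fire."""
    if vector_is_masked(message_control, vector_control):
        return False
    update()
    return True

programmed = []
maybe_update_mapping(0x8000, 0x0, lambda: programmed.append("vector 0"))
maybe_update_mapping(0x8000, 0x1, lambda: programmed.append("vector 1"))
maybe_update_mapping(0xc000, 0x0, lambda: programmed.append("vector 2"))
print(programmed)
```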
9 years ago core: Break up pci_msix_vector control field
Jan Kiszka [Thu, 7 May 2015 17:08:02 +0000 (19:08 +0200)]
core: Break up pci_msix_vector control field

Avoid testing the masked bit via a magic value.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago x86: Fix vtd int-remap region release
Jan Kiszka [Thu, 7 May 2015 16:14:53 +0000 (18:14 +0200)]
x86: Fix vtd int-remap region release

Tiny mistake, but it had the effect of only releasing the first MSI or
MSI-X vector of a PCI device on removal. The succeeding ones remained
both active for vtd and occupied for vtd_reserve_int_remap_region.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago x86: Reject xAPIC accesses while in x2APIC mode
Jan Kiszka [Wed, 6 May 2015 07:12:05 +0000 (09:12 +0200)]
x86: Reject xAPIC accesses while in x2APIC mode

If the APIC is in x2APIC mode, accesses via MMIO do not work (the APIC
behaves as if disabled). If Jailhouse executes them, it can be tricked to
access x2APIC registers that are invalid, causing a hypervisor-side #GP.
Prevent this by bailing out early.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
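
The early bail-out can be sketched as a mode check ahead of the MMIO
emulation. Bit 10 of IA32_APIC_BASE is the architectural x2APIC enable
bit. The handler shape and return values below are assumptions for
illustration.

```python
APIC_BASE_EXTD = 1 << 10   # IA32_APIC_BASE: x2APIC mode enabled

def handle_xapic_mmio(apic_base_msr, register, emulate):
    """Reject MMIO accesses early when the APIC runs in x2APIC mode."""
    if apic_base_msr & APIC_BASE_EXTD:
        return "rejected"
    return emulate(register)

# Sample MSR values: enabled xAPIC at 0xfee00000 without and with x2APIC.
xapic_mode = 0xfee00800
x2apic_mode = 0xfee00c00
print(handle_xapic_mmio(xapic_mode, 0x300, lambda reg: "emulated"))
print(handle_xapic_mmio(x2apic_mode, 0x300, lambda reg: "emulated"))
```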
9 years ago inmates: x86: Enable MTRRs during start to avoid disabled caches
Jan Kiszka [Wed, 6 May 2015 05:43:47 +0000 (07:43 +0200)]
inmates: x86: Enable MTRRs during start to avoid disabled caches

Since fe8fac80d7, emulation of the MTRR enable bit works. That has no
effect on KVM so far, but we effectively run with the handbrake on when
on real hardware.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago x86: Allow access to Focus Processor Checking bit in APIC SVR
Jan Kiszka [Fri, 1 May 2015 13:04:27 +0000 (15:04 +0200)]
x86: Allow access to Focus Processor Checking bit in APIC SVR

The Intel manual says: "In Pentium 4 and Intel Xeon processors, this bit
is reserved and should be cleared to 0." It apparently refers to the
first Xeon series here, not newer ones that support IA32e. Linux has set
this bit on x86-64 unconditionally for more than a decade. There are no
availability restrictions mentioned for AMD at all.

So let's release this bit to the cells because it cannot cause any harm
to the system or the hypervisor.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago x86: Hand over the APIC in soft-disabled state
Jan Kiszka [Sat, 2 May 2015 10:34:39 +0000 (12:34 +0200)]
x86: Hand over the APIC in soft-disabled state

This brings the Spurious-Interrupt Vector Register into its well-defined
reset state before handing the APIC over. Avoids surprises for cells and
the need for additional explanations.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago x86: Enable APIC for clearing
Jan Kiszka [Sat, 2 May 2015 10:24:21 +0000 (12:24 +0200)]
x86: Enable APIC for clearing

The cell may have turned it off, and then our attempts to clear pending
interrupts will be in vain.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago Revert "x86: Make FS_BASE MSR restoration VMX-specific"
Jan Kiszka [Wed, 6 May 2015 05:28:52 +0000 (07:28 +0200)]
Revert "x86: Make FS_BASE MSR restoration VMX-specific"

This reverts commit ee283bcf1818076662d897d489260f09d2b46c6c.

Loading the FS selector with 0 in arch_cpu_restore clears the base on
real hardware. Thus we have to reload it and can't apply this
optimization.

This bug caused crashes of the jailhouse tool.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago core: Add BSD 2-Clause license to hypercall headers
Jan Kiszka [Fri, 20 Feb 2015 08:51:42 +0000 (09:51 +0100)]
core: Add BSD 2-Clause license to hypercall headers

This allows using our types, inline functions etc. for interacting with
the hypervisor from within differently licensed cells.

Contributions came from Valentine, Jean-Philippe, Henning and me. I'm
signing off for Henning as well in the name of Siemens.

CC: Valentine Sinitsyn <valentine.sinitsyn@gmail.com>
CC: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Valentine Sinitsyn <valentine.sinitsyn@gmail.com>
9 years ago arm: Remove ancient compiler bug test via __asmeq
Jan Kiszka [Mon, 4 May 2015 17:38:37 +0000 (19:38 +0200)]
arm: Remove ancient compiler bug test via __asmeq

This macro was once copied in from the Linux kernel. There it tries to
catch buggy gcc 3.x versions that didn't follow the specified register
assignments (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=15089).

This bug is now 10 years old, fixed, and affected compilers that weren't
even aware of the virt extensions for ARMv7 that we depend on anyway. So
let's remove it.

This also removes a GPL'ed line of code and thus enables dual-licensing
of the file.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago core: Add BSD 2-Clause license to configuration format header
Jan Kiszka [Fri, 20 Feb 2015 08:46:32 +0000 (09:46 +0100)]
core: Add BSD 2-Clause license to configuration format header

This avoids having to distribute configuration files for target systems
under GPL terms. It also allows processing those files with differently
licensed management tools.

Contributions came from Valentine, Henning and me. I'm signing off for
Henning as well in the name of Siemens.

CC: Valentine Sinitsyn <valentine.sinitsyn@gmail.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Valentine Sinitsyn <valentine.sinitsyn@gmail.com>
9 years ago tools: Add BSD 2-Clause license to configuration file template
Jan Kiszka [Fri, 20 Feb 2015 08:34:56 +0000 (09:34 +0100)]
tools: Add BSD 2-Clause license to configuration file template

This avoids having to distribute configuration files for target systems
under GPL terms.

Contributions came from Valentine, Henning and me. I'm signing off for
Henning as well in the name of Siemens.

CC: Valentine Sinitsyn <valentine.sinitsyn@gmail.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Valentine Sinitsyn <valentine.sinitsyn@gmail.com>
9 years ago driver: Add BSD 2-Clause license to user space interface header
Jan Kiszka [Fri, 20 Feb 2015 08:30:59 +0000 (09:30 +0100)]
driver: Add BSD 2-Clause license to user space interface header

This enables the development of alternatively licensed management
front-ends.

Contributions came from me only.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago Improve documentation of license application
Jan Kiszka [Tue, 14 Apr 2015 05:44:34 +0000 (07:44 +0200)]
Improve documentation of license application

This prepares for deviations from our GPLv2 default license and explains
both the why and the how better.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago x86: clarify svm.h origins
Valentine Sinitsyn [Tue, 5 May 2015 19:24:44 +0000 (00:24 +0500)]
x86: clarify svm.h origins

Add the specific source file and copyrights for data structures in the
svm.h header file.

Signed-off-by: Valentine Sinitsyn <valentine.sinitsyn@gmail.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago x86: Add copyright information to files with Linux roots
Jan Kiszka [Mon, 23 Feb 2015 15:35:00 +0000 (16:35 +0100)]
x86: Add copyright information to files with Linux roots

Some x86 headers and a Makefile have more or less significant roots in
the Linux kernel without declaring this properly so far. Fix it.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago tools: Add copyright information to jailhouse-completion.bash
Jan Kiszka [Fri, 20 Feb 2015 11:24:19 +0000 (12:24 +0100)]
tools: Add copyright information to jailhouse-completion.bash

This file was contributed under the default license of Jailhouse.
Better state this explicitly.

CC: Benjamin Block <bebl@mageta.org>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Benjamin Block <bebl@mageta.org>
9 years ago driver: Avoid deprecated usage of cpumask API
Jan Kiszka [Sat, 25 Apr 2015 07:02:28 +0000 (09:02 +0200)]
driver: Avoid deprecated usage of cpumask API

We used the legacy API so far, and that will be removed in 4.1.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago x86: Do not call vmload/vmsave on every VM exit
Jan Kiszka [Sun, 5 Apr 2015 09:55:07 +0000 (11:55 +0200)]
x86: Do not call vmload/vmsave on every VM exit

Benchmarks indicate that we can gain about 160 cycles per VM exit &
reentry by only saving/restoring MSR_GS_BASE. We don't touch the other
states that vmload/vmsave deals with.

Specifically, we don't depend on a valid TR/TSS while in root mode
because Jailhouse neither has a userspace nor uses the IST for
interrupts or exceptions, thus does not try to access the TSS.

We still need to perform vmload on handover (actually, we only need to
load MSR_GS_BASE, but vmload is simpler) and after VCPU reset. And as we
no longer save the full state, also for shutdown, we need to pull the
missing information for arch_cpu_restore directly from the registers.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Make FS_BASE MSR restoration VMX-specific
Jan Kiszka [Sun, 5 Apr 2015 08:52:32 +0000 (10:52 +0200)]
x86: Make FS_BASE MSR restoration VMX-specific

SVM does not touch this MSR on VM exit, thus does not require the
restoration done in arch_cpu_restore so far. Make it VMX-specific so
that we can drop a few lines of code.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Remove write-only linux_sysenter_* fields
Jan Kiszka [Sun, 5 Apr 2015 07:21:36 +0000 (09:21 +0200)]
x86: Remove write-only linux_sysenter_* fields

The vendor code reads the state directly from the MSRs during setup.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Make SYSENTER MSR restoration VMX-specific
Jan Kiszka [Sun, 5 Apr 2015 07:19:33 +0000 (09:19 +0200)]
x86: Make SYSENTER MSR restoration VMX-specific

SVM does not overwrite these MSRs on VM exit, thus does not require the
restoration done in arch_cpu_restore so far. Make them VMX-specific so
that we can drop a few lines of code.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Remove unneeded MSR restoring from SVM's vcpu_deactivate_vmm
Jan Kiszka [Sun, 5 Apr 2015 06:58:30 +0000 (08:58 +0200)]
x86: Remove unneeded MSR restoring from SVM's vcpu_deactivate_vmm

None of these MSRs is modified by Jailhouse after VM exit, thus they
still contain the state the Linux guest left behind.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Refactor SVM version of vcpu_activate_vmm
Jan Kiszka [Sat, 4 Apr 2015 11:27:59 +0000 (13:27 +0200)]
x86: Refactor SVM version of vcpu_activate_vmm

We can reduce the assembly required in vcpu_activate_vmm by reordering
svm_vmexit to svm_vmentry, i.e. pulling the VM entry logic to the front.
Moreover, RAX can be loaded directly. There is furthermore no need to
declare clobbered variables as we won't return from the assembly block,
which is already declared via __builtin_unreachable.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Simplify error exit of svm_parse_mov_to_cr and svm_handle_cr
Jan Kiszka [Sun, 5 Apr 2015 14:03:34 +0000 (16:03 +0200)]
x86: Simplify error exit of svm_parse_mov_to_cr and svm_handle_cr

No need to maintain a return code variable when we can simply return
false directly.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Drop constant return values from SVM functions
Jan Kiszka [Sun, 5 Apr 2015 13:58:41 +0000 (15:58 +0200)]
x86: Drop constant return values from SVM functions

vmcb writing cannot fail on AMD, thus neither vmcb_setup nor
svm_set_cell_config can. Simply remove the error codes and related
handling.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Simplify descriptor reset in svm_vcpu_reset
Jan Kiszka [Sun, 5 Apr 2015 07:55:59 +0000 (09:55 +0200)]
x86: Simplify descriptor reset in svm_vcpu_reset

Reduce boilerplate code by using constants for common reset states.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Simplify set_svm_segment_from_dtr
Jan Kiszka [Sun, 5 Apr 2015 07:45:17 +0000 (09:45 +0200)]
x86: Simplify set_svm_segment_from_dtr

By using set_svm_segment_from_segment for ldtr, we can remove the
condition from set_svm_segment_from_dtr.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Simplify set_svm_segment_from_segment
Jan Kiszka [Sun, 5 Apr 2015 07:36:44 +0000 (09:36 +0200)]
x86: Simplify set_svm_segment_from_segment

No need to complain: segment.access_rights is generic as it simply holds
bits 8..23 of the second descriptor dword. The additional invalid bit
used by VMX only can be ignored by SVM - and it is already, even when
leaving out the explicit test.
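
A minimal sketch of the bit layout described above, not the Jailhouse code
itself. The function and constant names are hypothetical, and the position of
the VMX-only flag (bit 16) is taken from the VMX access-rights format rather
than from this commit. It only shows why a 16-bit view of the access rights
drops that flag without an explicit test:

```python
# Illustrative only: these names are hypothetical, not Jailhouse identifiers.

# Assumption: the VMX-only "segment unusable" flag sits at bit 16, i.e. just
# above the 16 generic access-rights bits.
VMX_UNUSABLE_BIT = 1 << 16


def access_rights_from_descriptor(desc_high):
    """Return bits 8..23 of the second (high) descriptor dword."""
    return (desc_high >> 8) & 0xFFFF


def generic_access_rights(vmx_access_rights):
    """Keep the 16 generic bits; the VMX-only flag falls away implicitly."""
    return vmx_access_rights & 0xFFFF


if __name__ == "__main__":
    # High dword of a flat 32-bit code segment descriptor.
    print(hex(access_rights_from_descriptor(0x00CF9B00)))
    # The VMX-only flag does not survive the 16-bit mask.
    print(hex(generic_access_rights(0xC09B | VMX_UNUSABLE_BIT)))
```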

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Drop PERCPU_VMCB and VMCB_RAX
Jan Kiszka [Sat, 4 Apr 2015 21:19:13 +0000 (23:19 +0200)]
x86: Drop PERCPU_VMCB and VMCB_RAX

We can calculate PERCPU_VMCB_RAX directly and save the two intermediate
steps.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Drop local guest_regs variable from SVM version of vcpu_handle_exit
Jan Kiszka [Sat, 4 Apr 2015 15:51:51 +0000 (17:51 +0200)]
x86: Drop local guest_regs variable from SVM version of vcpu_handle_exit

No need to cache it. It can be derived from cpu_data now.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Remove guest registers parameter from svm_handle_msr_write
Jan Kiszka [Sat, 4 Apr 2015 15:50:44 +0000 (17:50 +0200)]
x86: Remove guest registers parameter from svm_handle_msr_write

We can retrieve them from the per-cpu data structure now.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Pass vmcb instead of cpu_data to some internal SVM functions
Jan Kiszka [Sat, 4 Apr 2015 15:32:14 +0000 (17:32 +0200)]
x86: Pass vmcb instead of cpu_data to some internal SVM functions

update_efer, svm_parse_mov_to_cr and svm_handle_apic_access have no use
for cpu_data and rather convert it into a vmcb reference directly. So
pass that one instead to save some statements.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Rename x86_parse_mov_to_cr to svm_parse_mov_to_cr
Jan Kiszka [Sat, 4 Apr 2015 15:29:00 +0000 (17:29 +0200)]
x86: Rename x86_parse_mov_to_cr to svm_parse_mov_to_cr

This function is SVM-specific.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Cache vmcb instead of cpu_data in SVM's vcpu_vendor_get_execution_state
Jan Kiszka [Sat, 4 Apr 2015 15:24:08 +0000 (17:24 +0200)]
x86: Cache vmcb instead of cpu_data in SVM's vcpu_vendor_get_execution_state

Easier to read.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Remove guest registers parameter from svm_handle_cr
Jan Kiszka [Sat, 4 Apr 2015 15:22:11 +0000 (17:22 +0200)]
x86: Remove guest registers parameter from svm_handle_cr

We can retrieve them from the per-cpu data structure now.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Remove traces of cpuid interception from SVM
Jan Kiszka [Sat, 4 Apr 2015 12:57:39 +0000 (14:57 +0200)]
x86: Remove traces of cpuid interception from SVM

There is no foreseeable need to intercept cpuid on AMD. On Intel, we
are not asked if we want to, so we have to execute it on behalf of the
cell. But here we can simply let it happen.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Drop some unneeded local variables from SVM functions
Jan Kiszka [Fri, 3 Apr 2015 10:08:01 +0000 (12:08 +0200)]
x86: Drop some unneeded local variables from SVM functions

No need to maintain cpu_data or even vmcb as local variable if they are
only used once.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Drop local guest_regs variable from VMX version of vcpu_handle_exit
Jan Kiszka [Sat, 4 Apr 2015 11:46:40 +0000 (13:46 +0200)]
x86: Drop local guest_regs variable from VMX version of vcpu_handle_exit

No need to cache it. It can be derived from cpu_data now.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Factor out vmx_handle_cpuid
Jan Kiszka [Sat, 4 Apr 2015 11:45:48 +0000 (13:45 +0200)]
x86: Factor out vmx_handle_cpuid

Shortens vcpu_handle_exit and improves readability.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Remove parameters from vmx_handle_cr
Jan Kiszka [Sat, 4 Apr 2015 11:38:30 +0000 (13:38 +0200)]
x86: Remove parameters from vmx_handle_cr

Guest registers can be retrieved inline.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Remove guest registers parameter from vcpu_handle_msr_read/write
Jan Kiszka [Sat, 4 Apr 2015 06:22:49 +0000 (08:22 +0200)]
x86: Remove guest registers parameter from vcpu_handle_msr_read/write

The function only works against the current CPU, thus should avoid
taking the misleading parameter. The necessary reference can be obtained
from the per-cpu data structure now.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Remove guest registers parameter from vcpu_handle_mmio_access
Jan Kiszka [Sat, 4 Apr 2015 06:20:33 +0000 (08:20 +0200)]
x86: Remove guest registers parameter from vcpu_handle_mmio_access

The function only works against the current CPU, thus should avoid
taking the misleading parameter. The necessary reference can be obtained
from the per-cpu data structure now.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Remove unused guest registers parameter from vcpu_handle_io_access
Jan Kiszka [Sat, 4 Apr 2015 06:14:21 +0000 (08:14 +0200)]
x86: Remove unused guest registers parameter from vcpu_handle_io_access

All filter functions obtain the reference themselves now.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Remove guest registers and cell parameters from x86_pci_config_handler
Jan Kiszka [Sat, 4 Apr 2015 06:02:21 +0000 (08:02 +0200)]
x86: Remove guest registers and cell parameters from x86_pci_config_handler

The function only works against the current CPU, thus should avoid
taking the misleading parameters. Guest registers are no longer required,
and the cell reference can be obtained inline.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Rework RAX register accessors of PCI layer
Jan Kiszka [Sat, 4 Apr 2015 05:53:18 +0000 (07:53 +0200)]
x86: Rework RAX register accessors of PCI layer

Stop requiring that the guest registers are passed down to the
accessors. Access handlers always work over the issuing CPU, thus can
obtain the register state themselves. Rename the accessors to make it
clear that they work against guest registers.

This allows dropping the guest_regs parameters from
data_port_in/out_handler.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Remove guest registers parameter from i8042_access_handler
Jan Kiszka [Fri, 3 Apr 2015 18:04:44 +0000 (20:04 +0200)]
x86: Remove guest registers parameter from i8042_access_handler

The function only works against the current CPU, thus should avoid
taking the misleading parameter. The necessary reference can be obtained
from the per-cpu data structure now.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years agox86: Remove guest registers and cpu_data parameters from apic_mmio_access
Jan Kiszka [Fri, 3 Apr 2015 17:58:25 +0000 (19:58 +0200)]
x86: Remove guest registers and cpu_data parameters from apic_mmio_access

The function only works on the current CPU, thus should avoid taking
misleading parameters. The necessary references can be obtained inline.

With the parameters no longer needed, the callers
svm/vmx_handle_apic_access can drop some of them as well.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>