Jan Kiszka [Wed, 13 May 2015 17:22:24 +0000 (19:22 +0200)]
tools: config-create: Fix IOMMU unit number of IOAPICs
IOAPICs under the control of IOMMUs with unit number >= 1 were not
described correctly in the generated configs due to a stupid naming
mistake that Python cannot report.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 9 May 2015 06:00:41 +0000 (08:00 +0200)]
arm: Clean up hypervisor stage 1 memory attributes
Of the many attributes defined, some probably wrong, only 3 are actually
used: normal memory, device and non-cacheable. Validate those and drop
the rest. We can re-add more as needed.
See ARM ARM B4.1.104.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 9 May 2015 05:54:53 +0000 (07:54 +0200)]
arm: Fix stage 2 memory attributes
The definition of memory attributes for stage 2 translations was wrong.
These attributes consist of only 4 bits, but the defines covered 8. Set
the proper values for those two types we use: normal memory and devices.
See ARM ARM B3.6.2 and B3.8.5 for details.
This fixes the enforcement of read-only or write-only cell memory
regions.
Reported-and-tested-by: Philipp Rosenberger <ilu@linutronix.de>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
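One plausible reading of why the oversized defines broke read-only and write-only regions can be sketched in C. The field positions below (MemAttr in bits [5:2], access permissions HAP in bits [7:6] of a stage 2 descriptor) and all macro names are assumptions for illustration, not the definitions used in the Jailhouse sources:

```c
#include <stdint.h>

/* Assumed stage 2 descriptor layout (lower attributes):
 *   bits [5:2]  MemAttr[3:0]  memory type
 *   bits [7:6]  HAP[2:1]      access permissions
 * All names here are hypothetical. */
#define S2_MEMATTR_SHIFT	2
#define S2_MEMATTR_MASK		(UINT64_C(0xf) << S2_MEMATTR_SHIFT)
#define S2_HAP_SHIFT		6
#define S2_HAP_MASK		(UINT64_C(0x3) << S2_HAP_SHIFT)
#define S2_HAP_RO		(UINT64_C(0x1) << S2_HAP_SHIFT)
#define S2_HAP_WO		(UINT64_C(0x2) << S2_HAP_SHIFT)

/* Illustrative 4-bit encodings for the two types the commit mentions. */
#define S2_MEMATTR_DEVICE	UINT64_C(0x1)
#define S2_MEMATTR_NORMAL_WB	UINT64_C(0xf)

/* Place a memory attribute into the descriptor, confined to its 4 bits. */
static inline uint64_t s2_memattr(uint64_t attr)
{
	return (attr << S2_MEMATTR_SHIFT) & S2_MEMATTR_MASK;
}

/* Would an unmasked attribute value spill into the permission bits? */
static inline int s2_attr_clobbers_hap(uint64_t attr)
{
	return ((attr << S2_MEMATTR_SHIFT) & S2_HAP_MASK) != 0;
}
```

Under these assumptions, any 8-bit value with one of its upper four bits set lands in the permission field once shifted into place, while a 4-bit value never does.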
Jan Kiszka [Thu, 7 May 2015 17:27:12 +0000 (19:27 +0200)]
core: Disable non-root PCI devices on shutdown
We already disable PCI devices that are removed when a cell is
destroyed, but we should also do this on hypervisor shutdown so that
those devices do not annoy Linux later on with unexpected activity.
The change is bigger as it re-indents the shutdown loop to maintain
readability.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Thu, 7 May 2015 17:10:20 +0000 (19:10 +0200)]
core: Do not program MSI-X vectors that are masked
Test for both function-level and vector-level masking before updating an
MSI-X interrupt mapping. Otherwise, we risk letting cells stumble over
stale but masked vector entries.
All accesses to a vector table entry now cause a mapping update. The
vector control dword is always cached to simplify testing it.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
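The two masking levels tested by the MSI-X commit above can be sketched as follows. The bit positions follow the PCI MSI-X capability layout (Function Mask in bit 14 of Message Control, Mask Bit in bit 0 of the per-vector control dword); the structure and function names are hypothetical and not taken from the Jailhouse sources:

```c
#include <stdbool.h>
#include <stdint.h>

#define MSIX_CTRL_FMASK		(1u << 14)	/* Message Control: Function Mask */
#define MSIX_VECTOR_CTRL_MASKED	(1u << 0)	/* Vector Control: Mask Bit */

/* Hypothetical cached copy of one MSI-X table entry (four dwords). */
struct msix_vector {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
	uint32_t ctrl;		/* cached vector control dword */
};

/* A vector is masked if either the whole function or the entry is masked. */
static inline bool msix_vector_masked(uint16_t msg_ctrl,
				      const struct msix_vector *vec)
{
	return (msg_ctrl & MSIX_CTRL_FMASK) ||
	       (vec->ctrl & MSIX_VECTOR_CTRL_MASKED);
}
```

A caller would skip reprogramming the interrupt mapping whenever this returns true, and redo the check on every write to the table entry.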
Jan Kiszka [Thu, 7 May 2015 16:14:53 +0000 (18:14 +0200)]
x86: Fix vtd int-remap region release
Tiny mistake, but it had the effect of only releasing the first MSI or
MSI-X vector of a PCI device on removal. The succeeding ones remained
both active for vtd and occupied for vtd_reserve_int_remap_region.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 6 May 2015 07:12:05 +0000 (09:12 +0200)]
x86: Reject xAPIC accesses while in x2APIC mode
If the APIC is in x2APIC mode, accesses via MMIO are not working (APIC
behaves like disabled). If Jailhouse executes them, it can be tricked to
access x2APIC registers that are invalid, causing a hypervisor-side #GP.
Prevent this by bailing out early.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
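A minimal sketch of the early bail-out described in the xAPIC commit above, assuming the mode is read from the IA32_APIC_BASE MSR (bit 10 enables x2APIC mode, bit 11 is the global enable). The handler name and its return convention are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

#define APIC_BASE_EXTD	(UINT64_C(1) << 10)	/* x2APIC mode enabled */
#define APIC_BASE_EN	(UINT64_C(1) << 11)	/* APIC globally enabled */

static inline bool apic_in_x2apic_mode(uint64_t apic_base_msr)
{
	return (apic_base_msr & APIC_BASE_EXTD) != 0;
}

/* Hypothetical MMIO access hook: returns 0 if the access may proceed,
 * -1 if it has to be rejected before any register is touched. */
static inline int xapic_check_mmio_access(uint64_t apic_base_msr)
{
	if (apic_in_x2apic_mode(apic_base_msr))
		return -1;
	return 0;
}
```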
Jan Kiszka [Wed, 6 May 2015 05:43:47 +0000 (07:43 +0200)]
inmates: x86: Enable MTRRs during start to avoid disabled caches
Since fe8fac80d7, emulation of the MTRR enable bit works. That has no
effect on KVM so far, but we effectively run with the handbrake on on
real hardware.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 1 May 2015 13:04:27 +0000 (15:04 +0200)]
x86: Allow access to Focus Processor Checking bit in APIC SVR
The Intel manual says: "In Pentium 4 and Intel Xeon processors, this bit
is reserved and should be cleared to 0." It apparently refers to the
first Xeon series here, not newer ones that support IA32e. Linux has
been setting this bit unconditionally on x86-64 for more than a decade.
No availability restrictions are mentioned for AMD at all.
So let's release this bit to the cells because it cannot cause any harm
to the system or the hypervisor.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 2 May 2015 10:34:39 +0000 (12:34 +0200)]
x86: Hand over the APIC in soft-disabled state
This brings the Spurious-Interrupt Vector Register into its well-defined
reset state before handing the APIC over. Avoids surprises for cells and
the need for additional explanations.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
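For reference, the Spurious-Interrupt Vector Register bits touched by the two APIC commits above can be written down as follows. The bit positions and the reset value 0xFF follow the Intel manual; the macro and function names are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

#define APIC_SVR_VECTOR_MASK	0x000000ffu	/* spurious vector number */
#define APIC_SVR_SW_ENABLE	(1u << 8)	/* APIC software enable */
#define APIC_SVR_FOCUS_BIT	(1u << 9)	/* Focus Processor Checking bit */
#define APIC_SVR_RESET_VALUE	0x000000ffu	/* state after reset */

/* The APIC is soft-disabled as long as the software enable bit is clear. */
static inline bool apic_soft_disabled(uint32_t svr)
{
	return !(svr & APIC_SVR_SW_ENABLE);
}
```

Writing `APIC_SVR_RESET_VALUE` before handover leaves the APIC soft-disabled, so a cell has to set the enable bit itself, just as it would after a hardware reset.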
Jan Kiszka [Mon, 4 May 2015 17:38:37 +0000 (19:38 +0200)]
arm: Remove ancient compiler bug test via __asmeq
This macro was once copied in from the Linux kernel. There it tries to
catch buggy gcc 3.x versions that didn't follow the specified register
assignments (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=15089).
This bug is now 10 years old, fixed, and affected compilers that weren't
even aware of the virt extensions for ARMv7 that we depend on anyway. So
let's remove it.
This also removes a GPL'ed line of code, thus enables a dual-licensing
of the file.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 20 Feb 2015 08:46:32 +0000 (09:46 +0100)]
core: Add BSD 2-Clause license to configuration format header
This avoids having to distribute configuration files for target systems
under GPL terms. It also allows to process those files with differently
licensed management tools.
Contributions came from Valentine, Henning and me. I'm signing off for
Henning as well in the name of Siemens.
Jan Kiszka [Sun, 5 Apr 2015 09:55:07 +0000 (11:55 +0200)]
x86: Do not call vmload/vmsave on every VM exit
Benchmarks indicate that we can gain about 160 cycles per VM exit &
reentry by only saving/restoring MSR_GS_BASE. We don't touch the other
states that vmload/vmsave deals with.
Specifically, we don't depend on a valid TR/TSS while in root mode
because Jailhouse neither runs in userspace nor uses the IST for
interrupts or exceptions, and thus never tries to access the TSS.
We still need to perform vmload on handover (actually, we only need to
load MSR_GS_BASE, but vmload is simpler) and after VCPU reset. And as we
no longer save the full state, also for shutdown, we need to pull the
missing information for arch_cpu_restore directly from the registers.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 5 Apr 2015 08:52:32 +0000 (10:52 +0200)]
x86: Make FS_BASE MSR restoration VMX-specific
SVM does not touch this MSR on VM exit, thus does not require the
restoration done in arch_cpu_restore so far. Make it VMX-specific so
that we can drop a few lines of code.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 5 Apr 2015 07:19:33 +0000 (09:19 +0200)]
x86: Make SYSENTER MSR restoration VMX-specific
SVM does not overwrite these MSRs on VM exit, thus does not require the
restoration done in arch_cpu_restore so far. Make them VMX-specific so
that we can drop a few lines of code.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 4 Apr 2015 11:27:59 +0000 (13:27 +0200)]
x86: Refactor SVM version of vcpu_activate_vmm
We can reduce the assembly required in vcpu_activate_vmm by reordering
svm_vmexit to svm_vmentry, i.e. pulling the VM entry logic to the front.
Moreover, RAX can be loaded directly. There is furthermore no need to
declare clobbered variables as we won't return from the assembly block,
which is already declared via __builtin_unreachable.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 5 Apr 2015 07:36:44 +0000 (09:36 +0200)]
x86: Simplify set_svm_segment_from_segment
No need to complain: segment.access_rights is generic as it simply holds
bits 8..23 of the second descriptor dword. The additional invalid bit
used by VMX only can be ignored by SVM - and it is already, even when
leaving out the explicit test.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
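The remark about the VMX-only invalid bit can be illustrated with a small conversion sketch. It assumes that the generic access rights hold descriptor bits 8..23 (so bits 0..7 and 12..15 carry attributes, bits 8..11 the high limit bits), that VMX marks unusable segments in bit 16, and that the SVM attribute field is the usual packed 12-bit form. The function name is hypothetical:

```c
#include <stdint.h>

#define VMX_AR_UNUSABLE	(UINT32_C(1) << 16)	/* VMX-only "invalid" flag */

/* Pack generic access rights into a 12-bit SVM-style attribute field:
 * keep bits 0..7 (type, S, DPL, P) and move bits 12..15 (AVL, L, D/B, G)
 * down to bits 8..11. Bit 16 is dropped without any explicit test. */
static inline uint16_t svm_attrib_from_access_rights(uint32_t access_rights)
{
	return (uint16_t)(((access_rights & 0xf000) >> 4) |
			  (access_rights & 0x00ff));
}
```

Because the masks never include bit 16, the result is the same whether or not the invalid flag is set, which is why no separate check is needed.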
Jan Kiszka [Sat, 4 Apr 2015 15:32:14 +0000 (17:32 +0200)]
x86: Pass vmcb instead of cpu_data to some internal SVM functions
update_efer, svm_parse_mov_to_cr and svm_handle_apic_access have no use
for cpu_data and rather convert it into a vmcb reference directly. So
pass that one instead to save some statements.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 4 Apr 2015 12:57:39 +0000 (14:57 +0200)]
x86: Remove traces of cpuid interception from SVM
There is no foreseeable need to intercept cpuid on AMD. On Intel, we
are not asked if we want to, so we have to execute it on behalf of the
cell. But here we can simply let it happen.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 4 Apr 2015 06:22:49 +0000 (08:22 +0200)]
x86: Remove guest registers parameter from vcpu_handle_msr_read/write
The function only works against the current CPU, thus should avoid to
take the misleading parameter. The necessary reference can be obtained
from the per-cpu data structure now.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 4 Apr 2015 06:20:33 +0000 (08:20 +0200)]
x86: Remove guest registers parameter from vcpu_handle_mmio_access
The function only works against the current CPU, thus should avoid to
take the misleading parameter. The necessary reference can be obtained
from the per-cpu data structure now.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 4 Apr 2015 06:02:21 +0000 (08:02 +0200)]
x86: Remove guest registers and cell parameters from x86_pci_config_handler
The function only works against the current CPU, thus should avoid to
take the misleading parameters. Guest registers are no longer required,
and the cell reference can be obtained inline.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 4 Apr 2015 05:53:18 +0000 (07:53 +0200)]
x86: Rework RAX register accessors of PCI layer
Stop requiring that the guest registers are passed down to the
accessors. Access handlers always work over the issuing CPU, thus can
obtain the register state themselves. Rename the accessors to make it
clear that they work against guest registers.
This allows to drop the guest_regs parameters from
data_port_in/out_handler.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 3 Apr 2015 18:04:44 +0000 (20:04 +0200)]
x86: Remove guest registers parameter from i8042_access_handler
The function only works against the current CPU, thus should avoid to
take the misleading parameter. The necessary reference can be obtained
from the per-cpu data structure now.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 3 Apr 2015 13:33:25 +0000 (15:33 +0200)]
x86: Remove parameters from x2apic_handle_read/write
The function only works against the current CPU, thus should avoid to
take the misleading parameters. We can retrieve the per-cpu data
structure and the guest registers in the function now.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Mon, 6 Apr 2015 18:19:34 +0000 (20:19 +0200)]
x86: Remove guest registers parameter from vcpu_handle_xsetbv
The function only works against the current CPU, thus should avoid to
take the misleading parameter. The necessary reference can be obtained
from the per-cpu data structure now.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 3 Apr 2015 13:03:22 +0000 (15:03 +0200)]
x86: Remove guest registers parameter from vcpu_handle_hypercall
The function only works against the current CPU, thus should avoid to
take the misleading parameter. The necessary reference can be obtained
from the per-cpu data structure now.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 3 Apr 2015 13:02:19 +0000 (15:02 +0200)]
x86: Remove guest registers parameter from vcpu_deactivate_vmm
The function only works against the current CPU, thus should avoid to
take the misleading parameter. The necessary reference can be obtained
from the per-cpu data structure now.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 3 Apr 2015 12:47:52 +0000 (14:47 +0200)]
x86: Remove guest registers parameter from vcpu_reset
The function only works against the current CPU, thus should avoid to
take the misleading parameter. The necessary reference can be obtained
from the per-cpu data structure now.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 3 Apr 2015 12:26:08 +0000 (14:26 +0200)]
x86: Enable direct access to per-cpu guest registers
Now that the guest registers are saved at the same location on the
per-cpu stack for both Intel and AMD, we can enable direct access via
the per-cpu data structure. This will allow to drop the guest registers
parameter from most functions.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 3 Apr 2015 11:46:28 +0000 (13:46 +0200)]
x86: Reorder stack layout in svm_vmexit
Push the guest registers first so that they end up at the same location
on the stack as on Intel. This will allow to address them generically
via the per_cpu structure.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 3 Apr 2015 17:21:32 +0000 (19:21 +0200)]
x86: Allow index-based guest register access without type casts
Convert struct registers into a union and provide a by_index array for
index-based access. This is used by various handlers that parse guest
instructions and so far use a blunt type cast on the structure.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
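The union-based layout described in the commit above might look roughly like this. The field order (reverse of the x86-64 register encoding, with an unused slot where RSP would sit), the constant and the helper are assumptions for illustration; the anonymous struct requires C11:

```c
#include <stdint.h>

#define NUM_GUEST_REGS	16	/* hypothetical constant */

union registers {
	struct {
		uint64_t r15, r14, r13, r12, r11, r10, r9, r8;
		uint64_t rdi, rsi, rbp;
		uint64_t unused;	/* slot of RSP, kept elsewhere */
		uint64_t rbx, rdx, rcx, rax;
	};
	uint64_t by_index[NUM_GUEST_REGS];
};

/* Map an instruction-encoded register number (0 = RAX ... 15 = R15)
 * to its storage slot without casting the structure. */
static inline uint64_t *guest_reg(union registers *regs, unsigned int reg)
{
	return &regs->by_index[(NUM_GUEST_REGS - 1) - (reg & 15)];
}
```

An instruction parser that decoded register number 1 (RCX) would then read `*guest_reg(regs, 1)` instead of casting the register block to an array.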
Jan Kiszka [Sat, 4 Apr 2015 11:07:03 +0000 (13:07 +0200)]
x86: Retrieve vcpu_mmio_intercept from vcpu_handle_mmio_access
Analogously to vcpu_handle_io_access, define the vendor callback
vcpu_vendor_get_mmio_intercept and call it from vcpu_handle_mmio_access
instead of passing it to the handler. For consistency reasons, rename
vcpu_pf_intercept to vcpu_mmio_intercept.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 4 Apr 2015 10:23:09 +0000 (12:23 +0200)]
x86: Retrieve vcpu_io_intercept from vcpu_handle_io_access
Convert the vendor-specific functions into vcpu_vendor_get_io_intercept
and invoke that one from vcpu_handle_io_access. That offloads this
burden from the callers of vcpu_handle_io_access and takes us further
towards consistent vendor callbacks for such purposes.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 3 Apr 2015 17:51:51 +0000 (19:51 +0200)]
x86: Remove cpu_data parameter from vcpu_park
The function only works against the current CPU, thus should avoid to
take the misleading parameter. The implementations can obtain the
reference inline as needed.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 3 Apr 2015 09:06:54 +0000 (11:06 +0200)]
x86: Block write access to MTRR registers
Linux does not try to rewrite them on CPU hotplug if they are identical
to other CPUs' registers, and our non-root cells have no business
touching them either. This effectively freezes MTRRs after handover and
ensures consistent states for both the hypervisor and all cells across
all CPUs.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 3 Apr 2015 08:48:19 +0000 (10:48 +0200)]
x86: Emulate MTRR enable/disable
We assume that cells will only flip the enabled flag of
IA32_MTRR_DEF_TYPE, leaving the rest of the register in default state
(the one found during handover). SVM already implemented this but
emulated the disabled state by modifying the host PAT.
This approach works less invasively by only changing the effective guest
PAT to 0 in case MTRRs are off. And it provides this for Intel as well.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
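The emulation rule from the commit above (a guest PAT of 0 while MTRRs are disabled) can be sketched as a pure function. It assumes the enable flag is bit 11 of IA32_MTRR_DEF_TYPE and that the cell's own PAT value is kept in a shadow variable; the names are hypothetical:

```c
#include <stdint.h>

#define MTRR_DEF_TYPE_ENABLE	(UINT64_C(1) << 11)	/* E flag */

/* Compute the PAT value the guest should effectively run with.
 * A PAT of 0 selects the uncacheable type for all eight entries, which
 * approximates the behaviour of disabled MTRRs. */
static inline uint64_t effective_guest_pat(uint64_t pat_shadow,
					   uint64_t mtrr_def_type)
{
	if (mtrr_def_type & MTRR_DEF_TYPE_ENABLE)
		return pat_shadow;
	return 0;
}
```

Since the shadow value is left untouched, the cell's original PAT setting comes back as soon as it sets the enable flag again.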
Jan Kiszka [Thu, 2 Apr 2015 08:15:40 +0000 (10:15 +0200)]
x86: Maintain PAT shadow
For emulating the MTRR-disabled state, we will have to modify the
effective guest PAT state soon. This prepares for it by keeping PAT in
a shadow per-cpu field and intercepting accesses to the MSR.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 29 Mar 2015 10:19:47 +0000 (12:19 +0200)]
x86: Switch between host and guest PAT
Do not allow the guest to mess with the PAT MSR in a way that also
affects the host. This may cause the host to run in uncached mode,
slowing it down, or, even worse, access MMIO with caches enabled, which
will cause inconsistencies.
On Intel, we have to require and enable the related save/restore
feature. On AMD, we need to intercept the MSR accesses and map them on
the g_pat field of the VMCB.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 28 Mar 2015 11:02:04 +0000 (12:02 +0100)]
x86: Prevent interference by Intel perf counters
Make it simple but safe: Disable perf counters during setup and prevent
cells from modifying the corresponding MSR. This saves us from having
to switch the MSR during vmentry/exit, but it also blocks perf & friends
while Jailhouse is active.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>