rtime.felk.cvut.cz Git - jailhouse.git/log
Jan Kiszka [Wed, 30 Jul 2014 18:54:38 +0000 (20:54 +0200)]
core/configs/tools: Remove ACPI support from hypervisor

There is no more user of the ACPI table lookup. Remove this code as well
as the config memory region in the configuration files. Update the
config generator accordingly.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 30 Jul 2014 17:27:46 +0000 (19:27 +0200)]
x86: Obtain DMAR unit addresses from system configuration

This removes the last ACPI user and simplifies the hypervisor code.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 30 Jul 2014 16:15:43 +0000 (18:15 +0200)]
configs/tools: Describe DMAR units in config files

This prepares to switch from ACPI parsing to config file based DMAR unit
discovery. For simplicity reasons, we limit the number of supported DMAR
units to 8. Can be extended or made dynamic when needed.

Update the h87i config accordingly.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 30 Jul 2014 16:26:50 +0000 (18:26 +0200)]
core: Obtain MMCONFIG region via system configuration

This is a step towards overcoming ACPI in the hypervisor completely.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 30 Jul 2014 14:11:30 +0000 (16:11 +0200)]
configs/tools: Describe MMCONFIG region in config files

This prepares to switch from ACPI parsing to config file based MMCONFIG
discovery.

Update the qemu-vm and h87i configs accordingly.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 30 Jul 2014 18:49:54 +0000 (20:49 +0200)]
configs: Remove Celsius H700 config

No longer in use, already started to rot.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 30 Jul 2014 15:15:46 +0000 (17:15 +0200)]
configs: Require Q35 machine model for QEMU-based test setup

With the introduction of config-based MMCONFIG parameters, it becomes
impossible to have one QEMU config for both its PC machines. Restrict
us to the one that will soon gain VT-d support: Q35.

Update README to reflect these requirements and changes.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 30 Jul 2014 08:46:10 +0000 (10:46 +0200)]
configs: Adjust qemu-vm regarding ACPI region size

QEMU 2.1 faced another change in the ACPI region size. Account for it.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Mon, 28 Jul 2014 12:10:11 +0000 (14:10 +0200)]
tools: config-create: Initialize PCIDevice.num_caps already on construction

No need to postpone this until we know where the caps are located -
their number will stay the same.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Mon, 4 Aug 2014 08:41:52 +0000 (10:41 +0200)]
core: Disable PCI devices on removal

Switch off any bus master, MMIO and PIO dispatching when removing a
device from a cell. Also try to suppress INTx signals (not all devices
may respect this).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
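
The effect of this change on a device's command register can be sketched as follows. The bit positions are those of the standard PCI command register; the helper name `pci_cmd_disable` is hypothetical and not taken from the Jailhouse sources.

```c
#include <stdint.h>

/* Standard PCI command register bits (config space offset 0x04). */
#define PCI_CMD_IO        (1u << 0)   /* respond to port I/O accesses */
#define PCI_CMD_MEM       (1u << 1)   /* respond to MMIO accesses */
#define PCI_CMD_MASTER    (1u << 2)   /* allow bus mastering (DMA) */
#define PCI_CMD_INTX_OFF  (1u << 10)  /* suppress INTx signalling */

/* Hypothetical helper: command value to write when a device is removed. */
static uint16_t pci_cmd_disable(uint16_t cmd)
{
    cmd &= (uint16_t)~(PCI_CMD_IO | PCI_CMD_MEM | PCI_CMD_MASTER);
    cmd |= PCI_CMD_INTX_OFF;
    return cmd;
}
```

All other command bits are left untouched, so only dispatching and INTx signalling change.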
Jan Kiszka [Mon, 4 Aug 2014 07:43:06 +0000 (09:43 +0200)]
core: Introduce PCI device iterator

We will have more of these iterations in the future, so let's make them
more handy to use.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
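
An iterator of this kind is typically a small `for` wrapper macro. The sketch below is an assumption about its general shape: the structure layouts, the macro name `for_each_pci_device` and the bus-in-high-byte BDF encoding are illustrative only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified structures. */
struct pci_device {
    uint16_t bdf;               /* bus in bits 15:8, devfn in bits 7:0 */
};

struct cell {
    struct pci_device *devices;
    size_t num_devices;
};

/* Walk all devices of a cell; n is the running index. */
#define for_each_pci_device(dev, cell, n)               \
    for ((n) = 0, (dev) = (cell)->devices;              \
         (n) < (cell)->num_devices;                     \
         (n)++, (dev)++)

/* Example user: count the devices of a cell that sit on a given bus. */
static unsigned int count_devices_on_bus(const struct cell *cell, uint8_t bus)
{
    const struct pci_device *dev;
    unsigned int matches = 0;
    size_t n;

    for_each_pci_device(dev, cell, n)
        if ((dev->bdf >> 8) == bus)
            matches++;
    return matches;
}
```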
Jan Kiszka [Mon, 4 Aug 2014 08:29:49 +0000 (10:29 +0200)]
x86: Remove presence checks from vtd_add/remove_pci_device

Runtime ownership tracking at generic PCI layer level protects us from
being called redundantly.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Mon, 4 Aug 2014 06:29:26 +0000 (08:29 +0200)]
core: Fix PCI device runtime ownership tracking

Trigger PCI device addition and removal from the PCI core and update the
cell field in the device state in order to track active ownership. The
vtd module now only provides callbacks to update its tables when adding
or removing a device.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Tue, 29 Jul 2014 18:34:24 +0000 (20:34 +0200)]
core: Introduce PCI device state

We will have to store some runtime state information for PCI
devices, specifically their owner. Allocate these states as an array
during cell creation and release them on cell destruction.

We can already use the structure to keep a reference to the cell the
device belongs to. This avoids having to pass this around over multiple
hops. It will also be used soon to encode runtime ownership by setting
or clearing the reference.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Tue, 29 Jul 2014 18:45:23 +0000 (20:45 +0200)]
core: Only perform PCI config space writes on PCI_ACCESS_PERFORM

If we emulate a config space write, we may be able to skip the physical
access completely. To model this, rename PCI_ACCESS_EMULATE to
PCI_ACCESS_DONE which signals to the caller of the moderation functions
that no physical access should be performed.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
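
The contract between a moderation function and its caller can be sketched like this. `PCI_ACCESS_PERFORM` and `PCI_ACCESS_DONE` are named in the message above; `PCI_ACCESS_REJECT`, the register numbers and both function names are assumptions made for illustration.

```c
#include <stdint.h>

/* PERFORM and DONE come from the commit message; REJECT is assumed. */
enum pci_access { PCI_ACCESS_REJECT, PCI_ACCESS_PERFORM, PCI_ACCESS_DONE };

static unsigned int hw_writes;  /* counts simulated physical accesses */

/* Hypothetical policy: emulate register 0x10, reject 0x3c, pass the rest. */
static enum pci_access moderate_write(uint16_t reg)
{
    if (reg == 0x3c)
        return PCI_ACCESS_REJECT;
    if (reg == 0x10)
        return PCI_ACCESS_DONE;
    return PCI_ACCESS_PERFORM;
}

/* Caller side: returns 0 on success, -1 if the access was rejected. */
static int cfg_write(uint16_t reg)
{
    switch (moderate_write(reg)) {
    case PCI_ACCESS_PERFORM:
        hw_writes++;            /* only now touch the hardware */
        return 0;
    case PCI_ACCESS_DONE:
        return 0;               /* fully emulated, no physical access */
    default:
        return -1;
    }
}
```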
Jan Kiszka [Sun, 3 Aug 2014 19:11:37 +0000 (21:11 +0200)]
core: Pass value directly to pci_cfg_write_moderate

Convert pass-by-reference to pass-by-value for the value
pci_cfg_write_moderate should handle. Reason: either we will emulate and
write in the context of the moderation, or we let the original value
pass as-is.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Mon, 28 Jul 2014 05:52:03 +0000 (07:52 +0200)]
core: Introduce generic PCI config space access functions

If available, these functions use MMIO-based access. Otherwise, they
call the arch-specific access functions.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
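
A minimal sketch of such a dispatch, assuming a memory-mapped window with 4 KiB per function indexed by BDF. All names here (`mmcfg_base`, `arch_read_config`, `pci_read_config32`) are hypothetical, and the arch path is only a stand-in that counts its invocations.

```c
#include <stddef.h>
#include <stdint.h>

static uint8_t *mmcfg_base;      /* NULL when no MMCONFIG mapping exists */
static unsigned int arch_reads;  /* counts fallback accesses */

/* Stand-in for the arch-specific (e.g. PIO-based) access path. */
static uint32_t arch_read_config(uint16_t bdf, uint16_t reg)
{
    (void)bdf;
    (void)reg;
    arch_reads++;
    return 0xffffffffu;
}

/* Prefer MMIO if a mapping is available, otherwise fall back.
 * reg is expected to be 4-byte aligned. */
static uint32_t pci_read_config32(uint16_t bdf, uint16_t reg)
{
    if (mmcfg_base) {
        const volatile uint32_t *p = (const volatile uint32_t *)
            (mmcfg_base + ((size_t)bdf << 12) + reg);
        return *p;
    }
    return arch_read_config(bdf, reg);
}
```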
Jan Kiszka [Mon, 28 Jul 2014 05:45:04 +0000 (07:45 +0200)]
x86: Clean up PCI defines

Most of them became unused. Use self-describing definitions for the
remaining ones.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Mon, 28 Jul 2014 05:32:03 +0000 (07:32 +0200)]
x86: Factor out arch_pci_read/write_config

These PIO-based access functions will be reused for providing
hypervisor-internal config access.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Mon, 28 Jul 2014 04:47:38 +0000 (06:47 +0200)]
core: Pass combined register address to pci_cfg_read/write_moderate

Fold reg_base and reg_bias parameters of the access moderation functions
into a single address value. It simplifies most callers.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 27 Jul 2014 17:38:48 +0000 (19:38 +0200)]
core/configs/tools: Switch PCI configuration format to single BDF value

There is no value in splitting up the PCI device address in the config
format into bus and devfn. Fold them into a single value that can be
matched more easily and can also easily be split up again via new helper
macros whenever needed.

This generates some work for locally maintained config files, sorry.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
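
For illustration, a combined BDF value and its helper macros could look like the sketch below. The encoding (bus in bits 15:8, device in bits 7:3, function in bits 2:0) is the conventional PCI one; the macro names are assumptions and may differ from those in the commit.

```c
#include <stdint.h>

/* Fold bus, device and function into one 16-bit value. */
#define PCI_BDF(bus, dev, fn) \
    ((uint16_t)(((bus) << 8) | ((dev) << 3) | (fn)))

/* Split a BDF value up again. */
#define PCI_BUS(bdf)    ((uint8_t)((bdf) >> 8))
#define PCI_DEVFN(bdf)  ((uint8_t)((bdf) & 0xff))
#define PCI_DEV(bdf)    ((uint8_t)(((bdf) >> 3) & 0x1f))
#define PCI_FN(bdf)     ((uint8_t)((bdf) & 0x07))
```

Matching a device then becomes a single integer comparison instead of two.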
Jan Kiszka [Fri, 1 Aug 2014 15:21:22 +0000 (17:21 +0200)]
x86: Restrict vtd_init_fault_nmi to 8-bit APIC IDs

There is more code in Jailhouse that is restricted to 8-bit IDs, so
let's simplify vtd_init_fault_nmi correspondingly.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 27 Jul 2014 17:31:12 +0000 (19:31 +0200)]
x86: Micro-refactor vtd_add_device_to_cell/vtd_remove_device_from_cell

Keep the pointer of referenced root table entry low-word instead of its
value. This simplifies its manipulation later on.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 27 Jul 2014 17:19:22 +0000 (19:19 +0200)]
core: Privatize PCI_CONFIG_HEADER_SIZE define

No use case outside of hypervisor/pci.c

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 27 Jul 2014 16:27:01 +0000 (18:27 +0200)]
core: Provide MMIO accessors for all sizes

Use a macro to define size-specific MMIO accessors and add support for
8 and 16 bit.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
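
One way to generate size-specific accessors from a single macro is sketched below. The macro and function names are assumptions, not the identifiers used in the hypervisor.

```c
#include <stdint.h>

/* Generate mmio_read<N>() and mmio_write<N>() for one access width. */
#define DEFINE_MMIO_ACCESSORS(size)                                     \
static inline uint##size##_t mmio_read##size(const volatile void *addr) \
{                                                                       \
    return *(const volatile uint##size##_t *)addr;                      \
}                                                                       \
static inline void mmio_write##size(volatile void *addr,                \
                                    uint##size##_t value)               \
{                                                                       \
    *(volatile uint##size##_t *)addr = value;                           \
}

DEFINE_MMIO_ACCESSORS(8)
DEFINE_MMIO_ACCESSORS(16)
DEFINE_MMIO_ACCESSORS(32)
DEFINE_MMIO_ACCESSORS(64)
```

The `volatile` qualifier keeps the compiler from merging or dropping accesses, which matters for device registers.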
Jan Kiszka [Sun, 27 Jul 2014 15:59:26 +0000 (17:59 +0200)]
x86: Drop useless conditions from data_port_in/out_handler

Size must be 4 if it's not 1 or 2.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 27 Jul 2014 15:42:24 +0000 (17:42 +0200)]
core: Fix calculation of MMCONFIG region size

The PCI Firmware Specification says: "For PCI-X and PCI Express
platforms utilizing the enhanced configuration access method, the base
address of the memory mapped configuration space always corresponds to
bus number 0 (regardless of the start bus number decoded by the host
bridge) [...]." So drop the start bus from the size calculation.

Moreover, we had an off-by-one regarding end bus to size translation.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
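
The corrected calculation can be sketched as follows, assuming the usual layout of 256 functions per bus with 4 KiB of config space each. The function name `mmconfig_size` is hypothetical.

```c
#include <stdint.h>

#define PCI_CFG_SPACE_SIZE  4096u  /* config space per function */
#define PCI_FUNCS_PER_BUS   256u   /* 32 devices x 8 functions */

/*
 * The region always starts at bus 0, so only the end bus matters.
 * The "+ 1" accounts for end_bus being inclusive.
 */
static uint64_t mmconfig_size(uint8_t end_bus)
{
    return ((uint64_t)end_bus + 1) * PCI_FUNCS_PER_BUS * PCI_CFG_SPACE_SIZE;
}
```

A single bus thus needs 1 MiB and the full range of 256 buses needs 256 MiB.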
Jan Kiszka [Sun, 27 Jul 2014 06:54:06 +0000 (08:54 +0200)]
configs: Adjust MSI capability address of HDA devices

Real hardware locates MSI at 0x60, only current QEMU has this at 0x50
(patch posted).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 27 Jul 2014 06:41:25 +0000 (08:41 +0200)]
inmates: pci-demo: Clear STATESTS before triggering the MSI

STATESTS may still have pending events which could prevent a MSI
delivery after controller reset. That reset doesn't clear them, so do
this explicitly.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Tue, 29 Jul 2014 17:50:29 +0000 (19:50 +0200)]
x86: Fix error path in arch_cell_create

Regression of 46ab6c2f1e: Properly return the error if vtd_cell_init
fails.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 26 Jul 2014 06:35:10 +0000 (08:35 +0200)]
Update TODO

Recently added APIC topics are addressed now. Also add new ideas
regarding error handling and cell watchdogs / message timeouts.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 3 Aug 2014 17:55:31 +0000 (19:55 +0200)]
x86: Block write access to IA32_APIC_BASE MSR

The hypervisor depends on a consistent APIC mode. So prevent a cell
from messing it up. As the APIC is kept in the same state across cell
assignments, no cell has a need to change it.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 26 Jul 2014 06:17:46 +0000 (08:17 +0200)]
x86: Filter LVT delivery modes

Do not allow cells to program anything other than Fixed or NMI mode. NMIs
will still be swallowed by the hypervisor NMI interception path, so perf
& Co. remain broken.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
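
Such a filter can be sketched as a check on the delivery mode field, which occupies bits 10:8 of an LVT entry (000b is Fixed, 100b is NMI). The function and macro names below are assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

#define LVT_DLVR_MASK   (7u << 8)  /* delivery mode field, bits 10:8 */
#define LVT_DLVR_FIXED  (0u << 8)
#define LVT_DLVR_NMI    (4u << 8)

/* Accept only Fixed and NMI; SMI, INIT, ExtINT etc. are refused. */
static bool lvt_delivery_mode_allowed(uint32_t lvt)
{
    uint32_t mode = lvt & LVT_DLVR_MASK;

    return mode == LVT_DLVR_FIXED || mode == LVT_DLVR_NMI;
}
```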
Jan Kiszka [Sat, 26 Jul 2014 05:45:22 +0000 (07:45 +0200)]
x86: Filter writes to reserved APIC register bits

Set up a bitmap for all xAPIC/x2APIC registers that marks reserved bits
(or complete registers). Check to-be-written values against this bitmap
before executing accesses in hypervisor context.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
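
The check against a reserved-bits table can be sketched like this. The table contents are made up for illustration and do not reflect the real APIC register layout; the names are hypothetical as well.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_REGS 4

/* Illustrative values only: a set bit marks a reserved bit. */
static const uint32_t reserved_bits[NUM_REGS] = {
    0xffffffffu,  /* register 0: reserved as a whole */
    0xffffff00u,  /* register 1: only the low byte is writable */
    0x00000000u,  /* register 2: no reserved bits */
    0xfff0f000u,  /* register 3: mixed */
};

/* A write is acceptable if it sets no reserved bit of a known register. */
static bool apic_write_allowed(unsigned int reg, uint32_t value)
{
    if (reg >= NUM_REGS)
        return false;
    return (value & reserved_bits[reg]) == 0;
}
```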
Jan Kiszka [Sat, 26 Jul 2014 05:44:13 +0000 (07:44 +0200)]
x86: Move ICR access dispatching into x2apic_handle_write

More code sharing, specifically when AMD support is added.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 26 Jul 2014 05:42:03 +0000 (07:42 +0200)]
x86: Refactor x2apic_handle_read/write

Use the relative APIC register number internally and prepare the
interface of x2apic_handle_write for returning access errors.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Thu, 24 Jul 2014 17:16:59 +0000 (19:16 +0200)]
x86: Prevent getting stuck while trying to clear the APIC

If some interrupt source (typically a level-triggered IOAPIC pin)
continuously sends messages to the APIC we are trying to clear from
pending bits in ISR and IRR, we will get stuck in the hypervisor in an
interrupt storm.

Avoid this by limiting the number of handled interrupts to the number of
vectors we have. When reaching this limit, simply raise TPR to break out
of the loop. It's cleared again on exit from apic_clear, and the code
booting the CPU can handle the then pending interrupt itself. That's
almost like real hardware would behave (low-prio IRR bits may remain set
due to a stuck high-prio interrupt). However, only buggy SMP cells will
eventually be able to trigger this; IOAPIC pins of exiting cells will
soon be masked to prevent this scenario.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 23 Jul 2014 07:19:27 +0000 (09:19 +0200)]
core: Moderate access to PCI capabilities

Make use of the capability configuration and permit write access only to
explicitly configured capabilities. Read access is harmless as it is
free of side effects.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
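
A minimal sketch of this write policy, assuming each configured capability carries an ID, a start offset, a length and a writable flag. The structure layout and the function name are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one configured capability. */
struct pci_cap {
    uint8_t id;
    uint8_t start;   /* offset in config space */
    uint8_t len;     /* length in bytes */
    bool writable;   /* may the cell write to it? */
};

/* Permit a write only if it lies fully inside a writable capability. */
static bool cap_write_allowed(const struct pci_cap *caps, size_t num_caps,
                              uint16_t reg, unsigned int size)
{
    size_t n;

    for (n = 0; n < num_caps; n++)
        if (reg >= caps[n].start &&
            reg + size <= (unsigned int)caps[n].start + caps[n].len)
            return caps[n].writable;
    return false;
}
```

Reads need no such lookup because they are passed through unconditionally.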
Jan Kiszka [Sun, 20 Jul 2014 09:45:37 +0000 (11:45 +0200)]
tools/configs: Describe PCI capabilities in config files

Instead of parsing the PCI config spaces in the hypervisor, offload this
to the configuration generator. It will produce a (logically) linked
list of capabilities per device, their ID, start and length. Each
capability can furthermore be marked as writable by the cell.

Note that identical capability lists shared between multiple devices
will automatically be folded into a single one. The user can duplicate and
customize them individually in a manual post-processing step.

Configurations are updated for QEMU, the H87i and the pci-demo cell.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 23 Jul 2014 12:04:27 +0000 (14:04 +0200)]
tools: config-create: Sort listed directories

This will specifically ensure that PCI devices will be added to the
configuration file in a sorted manner which helps with manual
post-processing.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Mon, 21 Jul 2014 07:45:38 +0000 (09:45 +0200)]
tools: config-create: Do not bail out on missing ACPI tables

Rather warn and give the user a chance to post-process the config file.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 20 Jul 2014 22:13:15 +0000 (00:13 +0200)]
tools: config-create: Simplify optional file handling and generator mode

Stop passing exceptions from input_open to its callers: An optional file
can be returned as empty (derived from /dev/null), same on failures
during collector creation. The typical mode for input_open is 'r', so we
can save passing this explicitly in most cases. Same for optional which
is generally False.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 20 Jul 2014 21:25:44 +0000 (23:25 +0200)]
tools: config-create: Differentiate between optional and mandatory files during collect

Bail out if a required file is missing but continue without irritating
warnings if an optional one cannot be found.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 20 Jul 2014 21:10:46 +0000 (23:10 +0200)]
tools: config-create: Cleanup typos and style in collector template

No functional changes.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 20 Jul 2014 18:11:08 +0000 (20:11 +0200)]
tools: config-create: Get rid of record switch in input_open

We no longer invoke input_open if the file is part of a recorded
directory. So this switch became obsolete.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 20 Jul 2014 18:01:10 +0000 (20:01 +0200)]
tools: config-create: Return empty dir list when running in generator mode

No point in returning valid results here. The files in this directory
may not be accessible for normal users, and then the collector
generation will fail prematurely.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Henning Schild [Thu, 24 Jul 2014 14:07:15 +0000 (16:07 +0200)]
tools: config-create: Account for larger sets of CPUs

Reserves as many cpu set words as actually required. Format the
initialization in hex to improve readability with large numbers of CPUs.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
[Jan: fixed trailing whitespace]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
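
The sizing rule can be sketched in C as below. The generator itself is a Python script; the 64-bit word size and both function names are assumptions made for this illustration.

```c
#include <stdint.h>

#define BITS_PER_WORD 64u  /* assumed width of one CPU set word */

/* Number of words needed to hold one bit per CPU. */
static unsigned int cpu_set_words(unsigned int num_cpus)
{
    return (num_cpus + BITS_PER_WORD - 1) / BITS_PER_WORD;
}

/* Value of word `index` when CPUs 0..num_cpus-1 are all in the set. */
static uint64_t cpu_set_word(unsigned int num_cpus, unsigned int index)
{
    unsigned int first = index * BITS_PER_WORD;

    if (num_cpus <= first)
        return 0;
    if (num_cpus - first >= BITS_PER_WORD)
        return UINT64_MAX;
    return (UINT64_C(1) << (num_cpus - first)) - 1;
}
```

Printed in hex, a 72-CPU set reads as `0xffffffffffffffff, 0xff`, which is easier to check by eye than the decimal equivalents.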
Jan Kiszka [Mon, 28 Jul 2014 04:31:38 +0000 (06:31 +0200)]
x86: Remove forgotten release of pci_lock

Regression of 8c20eff688: lock is taken/released inside the
data_port_in/out_handlers now.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 23 Jul 2014 18:24:13 +0000 (20:24 +0200)]
core: Allow writing to more PCI config space registers

Enable write access to the Master Latency Timer (non-functional for
PCIe) and the Cacheline Size (no misuse potential identified).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 23 Jul 2014 17:56:02 +0000 (19:56 +0200)]
configs: Update ACPI location and size of QEMU VM

The ACPI region size grew to 16K with latest QEMU. Adjust the config.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Tue, 22 Jul 2014 19:08:13 +0000 (21:08 +0200)]
configs: Tune memory regions of H87i

ACPI was off by one page, and two regions overlapped, exposing too much
access.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Tue, 22 Jul 2014 12:29:45 +0000 (14:29 +0200)]
configs: Enable MMCONFIG moderation for QEMU Q35 and H87i

Exclude the MMCONFIG regions from memory regions so that Jailhouse can
intercept accesses. Not yet converted is the H700 config.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Tue, 22 Jul 2014 16:57:06 +0000 (18:57 +0200)]
x86: Increase remapping region size

If we have to map the MMCONFIG space of a complete PCI section, we run
out of remapping space. 4 means 128K pages, more than enough for now.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Tue, 22 Jul 2014 16:53:16 +0000 (18:53 +0200)]
core: Don't match MMIO accesses if there is no MMCONFIG space

Without this additional check, we will use addr >= (u64)-1 as upper
boundary if there is no MMCONFIG, which is almost always true.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
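
One way a missing-region check can let a range test match almost everything is unsigned wrap-around. The sketch below is an assumption about the general failure mode, not a copy of the hypervisor code; the structure, the names and the example addresses are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

struct mmcfg {
    uint64_t start;
    uint64_t size;   /* 0 means: no MMCONFIG region configured */
};

/* Range test without a presence check. */
static bool mmcfg_match_unguarded(const struct mmcfg *m, uint64_t addr)
{
    /* With start == 0 and size == 0, this wraps to UINT64_MAX. */
    uint64_t end = m->start + m->size - 1;

    return addr >= m->start && addr <= end;
}

/* Same test, but never match if there is no region at all. */
static bool mmcfg_match(const struct mmcfg *m, uint64_t addr)
{
    if (m->size == 0)
        return false;
    return mmcfg_match_unguarded(m, addr);
}
```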
Jan Kiszka [Tue, 22 Jul 2014 16:48:19 +0000 (18:48 +0200)]
core: Fix error handling of MMCONFIG setup

If we have an MMCONFIG region, we must either successfully map it or
fail the initialization. Succeeding without setting up pci_space will
cause crashes later on when accessing it on behalf of a cell.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Tue, 22 Jul 2014 16:37:40 +0000 (18:37 +0200)]
x86: Enable unwinding from exception handler

Preserve the .eh_frame section for the linked hypervisor object and
only remove it from the binary. Then add .cfi directives to the
exception entry code. This enables using a debugger for unwinding from
the exception handler to the causing function and beyond (not perfect
due to missing stack frames, though).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Mon, 21 Jul 2014 21:03:17 +0000 (23:03 +0200)]
x86: Minimize scope of pci_lock

Move PIO accesses and the related spinlock into data_port_in/out_handler
in order to reduce the lock contention time.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Mon, 21 Jul 2014 18:22:08 +0000 (20:22 +0200)]
core: Rework PCI config space access handling

Move more logic into generic code by extending the write handler to
pci_cfg_write_moderate and introducing pci_cfg_read_moderate. These
handlers are responsible for any config space access, including to
unowned or non-existent devices. They can reject the access, return an
emulated value on read or a real value to be written to hardware, or
they instruct the caller to perform the access directly.

We already pass a reference to the issuing cell to the access handlers.
It stays unused for now but will be needed by succeeding changes. So
add it now to avoid changing API and callers once again later on.

This commit lays the foundation for capability access moderation and,
specifically, MSI/MSI-X emulation.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Mon, 21 Jul 2014 18:37:45 +0000 (20:37 +0200)]
x86: Reject PCI config space writes to unowned devices

Align the PIO path with MMIO accesses: We should report any write to a
non-existing or unowned device as a failure rather than silently
swallowing it.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 20 Jul 2014 10:29:08 +0000 (12:29 +0200)]
configs: Clean up PIO bitmaps

Remove unneeded access permissions to PIC1 from all configs. DMA and IDE
access is only relevant to QEMU in PIIX2 mode, so drop this from real
machines and the config template. ACPI access is also not needed during
typical operation, nor is it safe.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sun, 20 Jul 2014 09:49:53 +0000 (11:49 +0200)]
core: Simplify config field accessors

Express config field accessors by using the accessor of the previous
field. This removes duplicate statements for field sizes.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Tue, 22 Jul 2014 12:29:10 +0000 (14:29 +0200)]
Update TODO list

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 23 Jul 2014 15:49:30 +0000 (17:49 +0200)]
x86: Re-park CPU while in wait-for-SIPI state

We may receive IPIs, e.g. to stop the CPU, while in wait-for-SIPI state.
In this case, we must park the CPU again before leaving
x86_handle_events. Currently, we resume CPU execution erroneously.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Henning Schild [Tue, 15 Jul 2014 11:35:30 +0000 (13:35 +0200)]
tools: config-create: close output file before exiting from script

Maybe not strictly required in that case, but open should always be
followed by close.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Henning Schild [Tue, 15 Jul 2014 11:27:17 +0000 (13:27 +0200)]
tools: config-create: set default root cell name to RootCell

The cell name could end up as an empty string because it was derived
from optional input files. In fact just giving the root cell a fixed
default name seems to make more sense than to generate a name or require
users to provide one.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Henning Schild [Mon, 21 Jul 2014 14:40:14 +0000 (16:40 +0200)]
tools: cell-list: make python3 compatible

Avoid python2 only functions.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Henning Schild [Fri, 11 Jul 2014 14:19:27 +0000 (16:19 +0200)]
tools: config-create: refuse to generate config if jailhouse is enabled

The input files used by the configuration generator might look different
on a system where jailhouse is enabled. Think missing cpus, pci devices etc.
Refuse to work with potentially corrupt input data.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
[Jan: add python2 compatibility]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Henning Schild [Fri, 11 Jul 2014 12:56:13 +0000 (14:56 +0200)]
tools: config-create: fix cpu counting

Count the number of cpus based on another file to detect offline ones as
well. (cpu*/topology/core_id does not exist for offline cpus)

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 18 Jul 2014 08:21:40 +0000 (10:21 +0200)]
inmates: Add PCI demo using an Intel HDA

This demonstrates the setup of a PCI device including MSI.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Sat, 19 Jul 2014 14:05:06 +0000 (16:05 +0200)]
inmates: Add delay_us service to timing library

This performs a busy-wait for the specified microseconds.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Fri, 18 Jul 2014 07:50:21 +0000 (09:50 +0200)]
inmates: Add PCI services to inmates framework

Provide library services for PCI config space access, bus scanning,
capability scanning (non-extended only so far) and MSI vector
programming (MSI-X to be added later).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Thu, 17 Jul 2014 19:24:58 +0000 (21:24 +0200)]
inmates: Add primitive 32-bit demo

This only demonstrates the capability to build inmates which come up in
32-bit mode. Note that not all library services support this already.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Thu, 17 Jul 2014 19:21:22 +0000 (21:21 +0200)]
inmates: Halt on return from inmate_main

Remove the need for explicit cli;hlt from the inmates, moving this to
header code.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Thu, 17 Jul 2014 18:50:33 +0000 (20:50 +0200)]
inmates: Add IOAPIC demo

Simple demonstration and test for using the IOAPIC within a non-root
cell: Rob the ACPI IRQ and wait for events on this line, e.g. a power
button push. Read the warning before using it.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Thu, 17 Jul 2014 18:30:24 +0000 (20:30 +0200)]
inmates: Add IOAPIC service to inmates framework

This provides the functionality for programming an assigned IOAPIC pin to
send standard IRQs to the caller's CPU. Just enough for basic IOAPIC
usage.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Thu, 17 Jul 2014 18:28:54 +0000 (20:28 +0200)]
inmates: Provide missing PIO accessors

Add inw, outw and outl.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Thu, 17 Jul 2014 15:18:21 +0000 (17:18 +0200)]
inmates: Add memory services to inmates framework

This adds a primitive memory allocator (without release) and a page
mapper (without unmap) to the inmates library. MMIO accessors are also
included. Those used for intercepted resources are encoded in assembly
to ensure that only supported instructions are used. With these
services, inmates can now access memory-mapped devices.

The allocator uses the lower memory starting from the first page.
Document this as well as the remaining memory layout.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
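
An allocator without release is commonly a bump allocator. The sketch below illustrates that idea on a static buffer standing in for the inmate's low memory; the names, the heap size and the offset-based alignment are assumptions, not the library's actual interface.

```c
#include <stddef.h>
#include <stdint.h>

#define HEAP_SIZE 4096u

static unsigned char heap[HEAP_SIZE];  /* stand-in for low memory */
static size_t heap_pos;                /* next free offset */

/*
 * Hand out `size` bytes at an offset aligned to `align` (a power of two).
 * There is no way to free; returns NULL once the heap is exhausted.
 */
static void *bump_alloc(size_t size, size_t align)
{
    size_t start = (heap_pos + align - 1) & ~(align - 1);

    if (start > HEAP_SIZE || size > HEAP_SIZE - start)
        return NULL;
    heap_pos = start + size;
    return &heap[start];
}
```

The lack of a release path keeps the state down to a single offset, which suits small demo inmates.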
Jan Kiszka [Wed, 16 Jul 2014 20:07:36 +0000 (22:07 +0200)]
inmates: Truly terminate apic-demo on second attempt

More consistent demo: Reject the first shutdown request and keep running
until the second arrives. Then terminate directly.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 16 Jul 2014 19:54:08 +0000 (21:54 +0200)]
inmates: Factor out timing services

Move the APIC timer services together with PM timer access into a timing
library module.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 16 Jul 2014 19:49:25 +0000 (21:49 +0200)]
inmates: Factor out interrupt library services

This simplifies registering interrupt handlers and also moves the EOI
ACK into library code. Only 64-bit support so far. Still, we need to fix
the definition of s64/u64 and make read/write_msr compatible with 32-bit
builds.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 16 Jul 2014 11:46:49 +0000 (13:46 +0200)]
inmates: Map Comm Region always at 0x100000 for inmates framework

Standardize mapping and access to the Comm Region within the inmates
framework. Reduces the work to be done for new inmates. We will move it
higher once paging services are available so that larger inmates can be
created.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Jan Kiszka [Wed, 16 Jul 2014 11:05:42 +0000 (13:05 +0200)]
inmates: Refactor folder structure

Move common code into inmates/lib and showcases into inmates/demos to
prepare for a reusable and extensible inmates framework. Also split
along architecture dependencies, as we will get code for non-x86 as well
one day.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago x86: Handle more SIB cases in MMIO instruction parser
Jan Kiszka [Mon, 21 Jul 2014 06:48:07 +0000 (08:48 +0200)]
x86: Handle more SIB cases in MMIO instruction parser

This adds, among other things, support for using r12 as address register
in MMIO accesses. And it actually simplifies the code. We can ignore SS
and index in MOD 0 as these only affect the memory address, which we
obtain differently.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
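
For readers unfamiliar with the encoding, the fields mentioned above can be sketched as follows. This is an illustration only, not the hypervisor's parser: the struct and function names are hypothetical. In 32/64-bit addressing, an rm value of 4 with MOD != 3 means that a SIB byte follows, and r12 shares those low three bits, which is why using r12 as a plain address register requires SIB handling. For instance, `mov %eax,(%r12)` assembles to 41 89 04 24 (REX prefix, opcode, ModRM, SIB).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical containers for the decoded bit fields. */
struct modrm_fields {
	uint8_t mod;	/* bits 7:6 */
	uint8_t reg;	/* bits 5:3 */
	uint8_t rm;	/* bits 2:0 */
};

struct sib_fields {
	uint8_t ss;	/* bits 7:6, scale factor */
	uint8_t index;	/* bits 5:3 */
	uint8_t base;	/* bits 2:0 */
};

static inline struct modrm_fields decode_modrm(uint8_t byte)
{
	struct modrm_fields m = {
		.mod = (uint8_t)(byte >> 6),
		.reg = (uint8_t)((byte >> 3) & 7),
		.rm = (uint8_t)(byte & 7),
	};
	return m;
}

static inline struct sib_fields decode_sib(uint8_t byte)
{
	struct sib_fields s = {
		.ss = (uint8_t)(byte >> 6),
		.index = (uint8_t)((byte >> 3) & 7),
		.base = (uint8_t)(byte & 7),
	};
	return s;
}

/* A SIB byte follows ModRM when rm == 4 and the operand is in memory. */
static inline bool modrm_has_sib(struct modrm_fields m)
{
	return m.mod != 3 && m.rm == 4;
}
```

A parser that obtains the faulting address by other means only has to skip the SIB byte correctly; it never needs to evaluate ss or index.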
9 years ago x86: Fix emulated X2APIC ID reading
Jan Kiszka [Fri, 18 Jul 2014 16:30:22 +0000 (18:30 +0200)]
x86: Fix emulated X2APIC ID reading

The xAPIC reports its ID in different bits than the x2APIC. Account for
this when emulating x2APIC accesses by calling the read_id handler.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
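
To illustrate the layout difference: in xAPIC mode the 8-bit APIC ID is held in bits 31:24 of the ID register, while x2APIC reports a 32-bit ID starting at bit 0. A minimal sketch of the conversion follows; `xapic_id_to_x2apic` is a hypothetical helper name and not the hypervisor's read_id handler.

```c
#include <stdint.h>

/* In xAPIC mode, the APIC ID occupies bits 31:24 of the ID register. */
#define XAPIC_ID_SHIFT	24

/*
 * Hypothetical helper: convert a raw xAPIC ID register value into the
 * x2APIC layout, where the ID starts at bit 0.
 */
static inline uint32_t xapic_id_to_x2apic(uint32_t xapic_id_reg)
{
	return xapic_id_reg >> XAPIC_ID_SHIFT;
}
```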
9 years ago x86: Add support for REX.B to MMIO instruction parser
Jan Kiszka [Thu, 17 Jul 2014 18:48:12 +0000 (20:48 +0200)]
x86: Add support for REX.B to MMIO instruction parser

REX.B is not relevant for us in any of the supported modes because we
obtain the memory address - which it influences by selecting the address
register - differently. Therefore, we can ignore this bit, extending the
set of supported MMIO instructions.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
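
As background, a REX prefix in 64-bit mode has the bit layout 0100WRXB. The sketch below shows how the individual bits can be tested; the helper names are hypothetical and not taken from the hypervisor sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* REX prefixes occupy the opcode range 0x40..0x4f: 0100WRXB. */
static inline bool is_rex_prefix(uint8_t byte)
{
	return (byte & 0xf0) == 0x40;
}

/* REX.W: 64-bit operand size. */
static inline bool rex_w(uint8_t rex)
{
	return (rex & 0x08) != 0;
}

/* REX.R: extends the ModRM reg field. */
static inline bool rex_r(uint8_t rex)
{
	return (rex & 0x04) != 0;
}

/* REX.X: extends the SIB index field. */
static inline bool rex_x(uint8_t rex)
{
	return (rex & 0x02) != 0;
}

/* REX.B: extends ModRM rm or SIB base, i.e. selects the address register. */
static inline bool rex_b(uint8_t rex)
{
	return (rex & 0x01) != 0;
}
```

Since REX.B only changes which register supplies the address, a parser that does not compute the address from registers can accept the prefix without acting on that bit.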
9 years ago x86: Fix name of sib.base
Jan Kiszka [Thu, 17 Jul 2014 18:47:22 +0000 (20:47 +0200)]
x86: Fix name of sib.base

It's called "base" in the spec.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago x86: Fix argument widths of hypercall ABI
Jan Kiszka [Tue, 8 Jul 2014 06:06:21 +0000 (08:06 +0200)]
x86: Fix argument widths of hypercall ABI

The x86 hypercall ABI defined 64-bit arguments and return codes so far.
However, our interface header took and returned only 32 bits. This
slipped through unnoticed because usually no physical addresses beyond
4G are passed to the Cell Create hypercall, the only place where it
practically matters.

Fix the issue and extend the ABI to also support 32-bit callers. We
define hypercall code and return value to be 32 bits; argument widths
now correspond to the caller's mode: 64 bits in IA-32e mode, 32 bits
otherwise. While the root cell still has to be in 64-bit mode, non-root
cells in other modes are now fine to invoke the hypercalls as well.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
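
The width rule described above can be sketched as follows. This is a minimal illustration under assumptions, not the hypervisor's actual code: `hypercall_arg` and `hypercall_code` are hypothetical helpers, and the caller's mode is passed in as a plain flag.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: truncate a raw argument register to the width
 * that matches the caller's mode (64 bits in IA-32e mode, 32 otherwise).
 */
static inline uint64_t hypercall_arg(uint64_t reg, bool ia32e_mode)
{
	uint64_t mask = ia32e_mode ? UINT64_MAX : UINT32_MAX;

	return reg & mask;
}

/* The hypercall code is 32 bits wide regardless of the caller's mode. */
static inline uint32_t hypercall_code(uint64_t reg)
{
	return (uint32_t)reg;
}
```

Masking in one place keeps stale upper register halves of a 32-bit caller from being interpreted as part of an address.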
9 years ago x86: Clear APIC on every SIPI event
Jan Kiszka [Mon, 7 Jul 2014 17:01:00 +0000 (19:01 +0200)]
x86: Clear APIC on every SIPI event

The current logic only ensures that we clear the APIC when the CPU
enters the virtual wait-for-SIPI state. However, this does not cover the
case when we transfer a CPU from the root to a non-root cell. We only
stop the CPU for this, and reset it directly via a pseudo SIPI. This
change moves the clearing to the point where we are about to deliver the
SIPI.

The change has the positive side effect of moving potentially costly
APIC clearing out of the control_lock.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago x86: Simplify IPI processing with logical APIC IDs
Jan Kiszka [Mon, 7 Jul 2014 16:11:14 +0000 (18:11 +0200)]
x86: Simplify IPI processing with logical APIC IDs

We now have ffsl available, so we can avoid inverting the mask and
looking for zero bits.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
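
The approach can be illustrated with a small sketch that walks the set bits of a destination mask directly. It is not the hypervisor's code: `first_set_bit` and `collect_targets` are hypothetical names, and the GCC/Clang builtin `__builtin_ctzl` stands in for the hypervisor's own ffsl.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for ffsl: index of the lowest set bit.
 * The result is undefined for mask == 0, so callers must check first.
 */
static inline unsigned int first_set_bit(unsigned long mask)
{
	return (unsigned int)__builtin_ctzl(mask);
}

/*
 * Walk all CPUs selected by a logical destination mask, lowest bit first.
 * Stores the bit numbers in out[] and returns how many were found.
 */
static inline size_t collect_targets(unsigned long dest_mask,
				     unsigned int *out, size_t max)
{
	size_t n = 0;

	while (dest_mask != 0 && n < max) {
		out[n++] = first_set_bit(dest_mask);
		dest_mask &= dest_mask - 1;	/* clear the bit just handled */
	}
	return n;
}
```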
9 years ago tools: config-create: include PM timer in root cell config
Henning Schild [Mon, 7 Jul 2014 15:48:51 +0000 (17:48 +0200)]
tools: config-create: include PM timer in root cell config

Add the PM timer to configurations created with jailhouse config create.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago tools: Add configuration generator
Henning Schild [Mon, 7 Jul 2014 10:46:00 +0000 (12:46 +0200)]
tools: Add configuration generator

Add a helper script to generate a configuration for the root cell.
The script can also generate another script to collect all the necessary
files on a remote machine.
Both scripts can be accessed through the jailhouse command.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago configs: Drop Celsius W420
Jan Kiszka [Mon, 7 Jul 2014 10:14:09 +0000 (12:14 +0200)]
configs: Drop Celsius W420

No machine of this type in reach, so we cannot update to recent config
extensions. Also, we will soon focus on auto-generated configs.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago Update TODO list
Jan Kiszka [Sun, 6 Jul 2014 20:29:14 +0000 (22:29 +0200)]
Update TODO list

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago configs: Add Q35 machine support to QEMU VM
Jan Kiszka [Sun, 6 Jul 2014 14:41:17 +0000 (16:41 +0200)]
configs: Add Q35 machine support to QEMU VM

Add required PCI devices to the QEMU config so that it works with both
the default i440FX and the newer Q35 machine. This is transitional until
Q35 gains VT-d support, then we will drop i440FX bits.

Open the whole 0xC0xx port range for PCI devices to be more tolerant
regarding ordering or other changes.

While at it, drop the unneeded permission to talk to the first legacy
PIC.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
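
Opening a port range in a port I/O permission bitmap can be sketched as below. The sketch assumes one bit per port with a set bit meaning "access denied", as in the VMX I/O bitmaps; the helper names and the bitmap size macro are hypothetical and do not reproduce the config file format.

```c
#include <stdbool.h>
#include <stdint.h>

/* One bit per port for the full 64K port space (assumed layout). */
#define PIO_BITMAP_SIZE	(0x10000 / 8)

/* Permit access to all ports in [first, last] by clearing their bits. */
static inline void pio_allow_range(uint8_t *bitmap, unsigned int first,
				   unsigned int last)
{
	unsigned int port;

	for (port = first; port <= last; port++)
		bitmap[port / 8] &= (uint8_t)~(1u << (port % 8));
}

/* A cleared bit means the port may be accessed. */
static inline bool pio_allowed(const uint8_t *bitmap, unsigned int port)
{
	return (bitmap[port / 8] & (1u << (port % 8))) == 0;
}
```

Starting from an all-ones bitmap, `pio_allow_range(bitmap, 0xc000, 0xc0ff)` would open the whole 0xC0xx range while leaving neighboring ports denied.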
9 years ago inmates/configs: Pick up PM timer address from Comm Region
Jan Kiszka [Sun, 6 Jul 2014 09:32:14 +0000 (11:32 +0200)]
inmates/configs: Pick up PM timer address from Comm Region

Instead of probing it, use the information that is now provided via the
Communication Region. This requires mapping the region into the
tiny-demo cell as well.

We can now drop all explicit port permissions from the inmate's cell
configurations as this is now done automatically by the hypervisor.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
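
Once an inmate knows the PM timer port, turning raw counter values into time is straightforward. The following sketch leaves out the port read itself and only shows the arithmetic. It assumes the standard ACPI PM timer frequency of 3.579545 MHz and a 24-bit counter (some platforms provide 32 bits); all names are hypothetical.

```c
#include <stdint.h>

#define PM_TIMER_HZ	3579545ULL	/* ACPI PM timer frequency */
#define NS_PER_SEC	1000000000ULL
#define PM_TIMER_MASK	0x00ffffffUL	/* assumption: 24-bit counter */

/* Ticks elapsed between two raw reads, tolerating one counter wrap. */
static inline uint32_t pm_timer_delta(uint32_t start, uint32_t end)
{
	return (end - start) & PM_TIMER_MASK;
}

/* Convert a tick count into nanoseconds. */
static inline uint64_t pm_ticks_to_ns(uint32_t ticks)
{
	return (uint64_t)ticks * NS_PER_SEC / PM_TIMER_HZ;
}
```

With a 24-bit counter the timer wraps roughly every 4.7 seconds, so measured intervals have to stay below that for the delta to be meaningful.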
9 years ago x86: Provide PM timer access to all cells
Jan Kiszka [Sun, 6 Jul 2014 09:06:07 +0000 (11:06 +0200)]
x86: Provide PM timer access to all cells

Export the PM timer address via the Communication Region to non-root
cells and allow access to that port for all cells. This is safe as the
PM timer hardware is specified to be read-only.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago configs: Specify PM timer addresses
Jan Kiszka [Sun, 6 Jul 2014 09:29:47 +0000 (11:29 +0200)]
configs: Specify PM timer addresses

Specify the location of the PM timer. For QEMU VMs, we already use the
address of the upcoming release 2.1.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago x86: Allow to specify the PM timer address via the system configuration
Jan Kiszka [Sun, 6 Jul 2014 09:37:48 +0000 (11:37 +0200)]
x86: Allow to specify the PM timer address via the system configuration

This enables the hypervisor to forward the information to non-root cells
and to permit access to the resource. We could also parse the ACPI table
in the hypervisor, but this approach is much simpler.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago x86: Drop redundant vmx_invept
Jan Kiszka [Sat, 5 Jul 2014 06:48:59 +0000 (08:48 +0200)]
x86: Drop redundant vmx_invept

No need to also call invept in vmx_cell_init. We already perform this
for all cpus involved in a cell creation (root and new cell cpus) via
arch_config_commit.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
9 years ago x86: Enable RDTSCP for the cells
Jan Kiszka [Mon, 30 Jun 2014 15:01:58 +0000 (17:01 +0200)]
x86: Enable RDTSCP for the cells

If the CPU supports RDTSCP, we must enable this feature for cell usage or
they will receive unexpected #UD.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
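
The decision can be sketched as a pure function. RDTSCP support is reported in CPUID leaf 0x80000001, EDX bit 27, and the corresponding VMX switch is bit 3 ("enable RDTSCP") of the secondary processor-based VM-execution controls. The function and macro names below are hypothetical, and a real implementation would additionally have to honor the VMX capability MSRs.

```c
#include <stdint.h>

#define CPUID_EXT_FEATURES	0x80000001u	/* extended feature leaf */
#define CPUID_EDX_RDTSCP	(1u << 27)	/* RDTSCP available */
#define SECONDARY_EXEC_RDTSCP	(1u << 3)	/* VMX: enable RDTSCP */

/*
 * Return the secondary execution controls with RDTSCP enabled if the
 * EDX value of CPUID leaf CPUID_EXT_FEATURES reports the instruction.
 */
static inline uint32_t enable_rdtscp_if_supported(uint32_t secondary_ctls,
						  uint32_t cpuid_ext_edx)
{
	if (cpuid_ext_edx & CPUID_EDX_RDTSCP)
		secondary_ctls |= SECONDARY_EXEC_RDTSCP;
	return secondary_ctls;
}
```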