aboutsummaryrefslogtreecommitdiff
path: root/hw/ppc/spapr_drc.c
AgeCommit message (Collapse)Author
2021-01-06spapr: Add drc_ prefix to the DRC realize and unrealize functionsGreg Kurz
Use a less generic name for an easier experience with tools such as cscope or grep. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20201218103400.689660-6-groug@kaod.org> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-01-06spapr: Introduce spapr_drc_reset_all()Greg Kurz
No need to expose the way DRCs are traversed outside of spapr_drc.c. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20201218103400.689660-4-groug@kaod.org> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-01-06spapr: Fix reset of transient DR connectorsGreg Kurz
Documentation of object_property_iter_init() clearly stipulates that "it is forbidden to modify the property list while iterating". But this is exactly what we do when resetting transient DR connectors during CAS. The call to spapr_drc_reset() can finalize the hot-unplug sequence of a PHB or a PCI bridge, both of which will then in turn destroy their PCI DRCs. This could potentially invalidate the iterator. It is pure luck that this haven't caused any issues so far. Change spapr_drc_reset() to return true if it caused a device to be removed. Restart from scratch in this case. This can potentially increase the overall DRC reset time, especially with a high maxmem which generates a lot of LMB DRCs. But this kind of setup is rare, and so is the use case of rebooting a guest while doing hot-unplug. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20201218103400.689660-3-groug@kaod.org> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-01-06spapr: Call spapr_drc_reset() for all DRCs at CASGreg Kurz
Non-transient DRCs are either in the empty or the ready state, which means spapr_drc_reset() doesn't change their state. It is thus not needed to do any checking. Call spapr_drc_reset() unconditionally and squash spapr_drc_transient() into its only user, spapr_drc_needed(). Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20201218103400.689660-2-groug@kaod.org> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-01-06spapr: Fix DR properties of the root nodeGreg Kurz
Section 13.5.2 of LoPAPR mandates various DR related indentifiers for all hot-pluggable entities to be exposed in the "ibm,drc-indexes", "ibm,drc-power-domains", "ibm,drc-names" and "ibm,drc-types" properties of their parent node. These properties are created with spapr_dt_drc(). PHBs and LMBs are both children of the machine. Their DR identifiers are thus supposed to be exposed in the afore mentioned properties of the root node. When PHB hot-plug support was added, an extra call to spapr_dt_drc() was introduced: this overwrites the existing properties, previously populated with the LMB identifiers, and they end up containing only PHB identifiers. This went unseen so far because linux doesn't care, but this is still not conformant with LoPAPR. Fortunately spapr_dt_drc() is able to handle multiple DR entity types at the same time. Use that to handle DR indentifiers for PHBs and LMBs with a single call to spapr_dt_drc(). While here also account for PMEM DR identifiers, which were forgotten when NVDIMM hot-plug support was added. Also add an assert to prevent further misuse of spapr_dt_drc(). With -m 1G,maxmem=2G,slots=8 passed on the QEMU command line we get: Without this patch: /proc/device-tree/ibm,drc-indexes 0000001f 20000001 20000002 20000003 20000000 20000005 20000006 20000007 20000004 20000009 20000008 20000010 20000011 20000012 20000013 20000014 20000015 20000016 20000017 20000018 20000019 2000000a 2000000b 2000000c 2000000d 2000000e 2000000f 2000001a 2000001b 2000001c 2000001d 2000001e These are the DRC indexes for the 31 possible PHBs. With this patch: /proc/device-tree/ibm,drc-indexes 0000002b 90000000 90000001 90000002 90000003 90000004 90000005 90000006 90000007 20000001 20000002 20000003 20000000 20000005 20000006 20000007 20000004 20000009 20000008 20000010 20000011 20000012 20000013 20000014 20000015 20000016 20000017 20000018 20000019 2000000a 2000000b 2000000c 2000000d 2000000e 2000000f 2000001a 2000001b 2000001c 2000001d 2000001e 80000004 80000005 80000006 80000007 And now we also have the 4 ((2G - 1G) / 256M) LMBs and the 8 (slots) PMEMs. Fixes: 3998ccd09298 ("spapr: populate PHB DRC entries for root DT node") Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <160794479566.35245.17809158217760761558.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-12-14spapr: spapr_drc_attach() cannot failGreg Kurz
All users are passing &error_abort already. Document the fact that spapr_drc_attach() should only be passed a free DRC, which is supposedly the case if appropriate checking is done earlier. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20201201113728.885700-5-groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-10-28spapr: Clarify why DR connectors aren't user creatableGreg Kurz
DR connector is a device that emulates a firmware abstraction used by PAPR compliant guests to manage hotplug/dynamic-reconfiguration of PHBs, PCI devices, memory, and CPUs. It is internally created by the spapr platform and requires to be owned by either the machine (PHBs, CPUs, memory) or by a PHB (PCI devices). Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <160250199940.765467.6896806997161856576.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-10-09spapr: Simplify error handling in prop_get_fdt()Greg Kurz
Use the return value of visit_check_struct() and visit_check_list() for error checking instead of local_err. This allows to get rid of the error propagation overhead. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20200914123505.612812-10-groug@kaod.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-10-09spapr: Add a return value to spapr_drc_attach()Greg Kurz
As recommended in "qapi/error.h", return true on success and false on failure. This allows to reduce error propagation overhead in the callers. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20200914123505.612812-9-groug@kaod.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-07-21qom: Change object_get_canonical_path_component() not to mallocMarkus Armbruster
object_get_canonical_path_component() returns a malloced copy of a property name on success, null on failure. 19 of its 25 callers immediately free the returned copy. Change object_get_canonical_path_component() to return the property name directly. Since modifying the name would be wrong, adjust the return type to const char *. Drop the free from the 19 callers become simpler, add the g_strdup() to the other six. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20200714160202.3121879-4-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Li Qiang <liq3ea@gmail.com>
2020-07-10error: Eliminate error_propagate() with Coccinelle, part 1Markus Armbruster
When all we do with an Error we receive into a local variable is propagating to somewhere else, we can just as well receive it there right away. Convert if (!foo(..., &err)) { ... error_propagate(errp, err); ... return ... } to if (!foo(..., errp)) { ... ... return ... } where nothing else needs @err. Coccinelle script: @rule1 forall@ identifier fun, err, errp, lbl; expression list args, args2; binary operator op; constant c1, c2; symbol false; @@ if ( ( - fun(args, &err, args2) + fun(args, errp, args2) | - !fun(args, &err, args2) + !fun(args, errp, args2) | - fun(args, &err, args2) op c1 + fun(args, errp, args2) op c1 ) ) { ... when != err when != lbl: when strict - error_propagate(errp, err); ... when != err ( return; | return c2; | return false; ) } @rule2 forall@ identifier fun, err, errp, lbl; expression list args, args2; expression var; binary operator op; constant c1, c2; symbol false; @@ - var = fun(args, &err, args2); + var = fun(args, errp, args2); ... when != err if ( ( var | !var | var op c1 ) ) { ... when != err when != lbl: when strict - error_propagate(errp, err); ... when != err ( return; | return c2; | return false; | return var; ) } @depends on rule1 || rule2@ identifier err; @@ - Error *err = NULL; ... when != err Not exactly elegant, I'm afraid. The "when != lbl:" is necessary to avoid transforming if (fun(args, &err)) { goto out } ... out: error_propagate(errp, err); even though other paths to label out still need the error_propagate(). For an actual example, see sclp_realize(). Without the "when strict", Coccinelle transforms vfio_msix_setup(), incorrectly. I don't know what exactly "when strict" does, only that it helps here. The match of return is narrower than what I want, but I can't figure out how to express "return where the operand doesn't use @err". For an example where it's too narrow, see vfio_intx_enable(). Silently fails to convert hw/arm/armsse.c, because Coccinelle gets confused by ARMSSE being used both as typedef and function-like macro there. Converted manually. Line breaks tidied up manually. One nested declaration of @local_err deleted manually. Preexisting unwanted blank line dropped in hw/riscv/sifive_e.c. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-35-armbru@redhat.com>
2020-07-10qom: Crash more nicely on object_property_get_link() failureMarkus Armbruster
Pass &error_abort instead of NULL where the returned value is dereferenced or asserted to be non-null. Drop a now redundant assertion. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200707160613.848843-24-armbru@redhat.com>
2020-07-10qapi: Use returned bool to check for failure, Coccinelle partMarkus Armbruster
The previous commit enables conversion of visit_foo(..., &err); if (err) { ... } to if (!visit_foo(..., errp)) { ... } for visitor functions that now return true / false on success / error. Coccinelle script: @@ identifier fun =~ "check_list|input_type_enum|lv_start_struct|lv_type_bool|lv_type_int64|lv_type_str|lv_type_uint64|output_type_enum|parse_type_bool|parse_type_int64|parse_type_null|parse_type_number|parse_type_size|parse_type_str|parse_type_uint64|print_type_bool|print_type_int64|print_type_null|print_type_number|print_type_size|print_type_str|print_type_uint64|qapi_clone_start_alternate|qapi_clone_start_list|qapi_clone_start_struct|qapi_clone_type_bool|qapi_clone_type_int64|qapi_clone_type_null|qapi_clone_type_number|qapi_clone_type_str|qapi_clone_type_uint64|qapi_dealloc_start_list|qapi_dealloc_start_struct|qapi_dealloc_type_anything|qapi_dealloc_type_bool|qapi_dealloc_type_int64|qapi_dealloc_type_null|qapi_dealloc_type_number|qapi_dealloc_type_str|qapi_dealloc_type_uint64|qobject_input_check_list|qobject_input_check_struct|qobject_input_start_alternate|qobject_input_start_list|qobject_input_start_struct|qobject_input_type_any|qobject_input_type_bool|qobject_input_type_bool_keyval|qobject_input_type_int64|qobject_input_type_int64_keyval|qobject_input_type_null|qobject_input_type_number|qobject_input_type_number_keyval|qobject_input_type_size_keyval|qobject_input_type_str|qobject_input_type_str_keyval|qobject_input_type_uint64|qobject_input_type_uint64_keyval|qobject_output_start_list|qobject_output_start_struct|qobject_output_type_any|qobject_output_type_bool|qobject_output_type_int64|qobject_output_type_null|qobject_output_type_number|qobject_output_type_str|qobject_output_type_uint64|start_list|visit_check_list|visit_check_struct|visit_start_alternate|visit_start_list|visit_start_struct|visit_type_.*"; expression list args; typedef Error; Error *err; @@ - fun(args, &err); - if (err) + if (!fun(args, &err)) { ... } A few line breaks tidied up manually. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200707160613.848843-19-armbru@redhat.com>
2020-07-02Clean up some calls to ignore Error objects the right wayMarkus Armbruster
Receiving the error in a local variable only to free it is less clear (and also less efficient) than passing NULL. Clean up. Cc: Daniel P. Berrange <berrange@redhat.com> Cc: Jerome Forissier <jerome@forissier.org> CC: Greg Kurz <groug@kaod.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <20200630090351.1247703-4-armbru@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-06-15qdev: Convert bus-less devices to qdev_realize() with CoccinelleMarkus Armbruster
All remaining conversions to qdev_realize() are for bus-less devices. Coccinelle script: // only correct for bus-less @dev! @@ expression errp; expression dev; @@ - qdev_init_nofail(dev); + qdev_realize(dev, NULL, &error_fatal); @ depends on !(file in "hw/core/qdev.c") && !(file in "hw/core/bus.c")@ expression errp; expression dev; symbol true; @@ - object_property_set_bool(OBJECT(dev), true, "realized", errp); + qdev_realize(DEVICE(dev), NULL, errp); @ depends on !(file in "hw/core/qdev.c") && !(file in "hw/core/bus.c")@ expression errp; expression dev; symbol true; @@ - object_property_set_bool(dev, true, "realized", errp); + qdev_realize(DEVICE(dev), NULL, errp); Note that Coccinelle chokes on ARMSSE typedef vs. macro in hw/arm/armsse.c. Worked around by temporarily renaming the macro for the spatch run. Signed-off-by: Markus Armbruster <armbru@redhat.com> Acked-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200610053247.1583243-57-armbru@redhat.com>
2020-05-15qom: Drop @errp parameter of object_property_del()Markus Armbruster
Same story as for object_property_add(): the only way object_property_del() can fail is when the property with this name does not exist. Since our property names are all hardcoded, failure is a programming error, and the appropriate way to handle it is passing &error_abort. Most callers do that, the commit before previous fixed one that didn't (and got the error handling wrong), and the two remaining exceptions ignore errors. Drop the @errp parameter. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200505152926.18877-19-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2020-05-15qdev: Unrealize must not failMarkus Armbruster
Devices may have component devices and buses. Device realization may fail. Realization is recursive: a device's realize() method realizes its components, and device_set_realized() realizes its buses (which should in turn realize the devices on that bus, except bus_set_realized() doesn't implement that, yet). When realization of a component or bus fails, we need to roll back: unrealize everything we realized so far. If any of these unrealizes failed, the device would be left in an inconsistent state. Must not happen. device_set_realized() lets it happen: it ignores errors in the roll back code starting at label child_realize_fail. Since realization is recursive, unrealization must be recursive, too. But how could a partly failed unrealize be rolled back? We'd have to re-realize, which can fail. This design is fundamentally broken. device_set_realized() does not roll back at all. Instead, it keeps unrealizing, ignoring further errors. It can screw up even for a device with no buses: if the lone dc->unrealize() fails, it still unregisters vmstate, and calls listeners' unrealize() callback. bus_set_realized() does not roll back either. Instead, it stops unrealizing. Fortunately, no unrealize method can fail, as we'll see below. To fix the design error, drop parameter @errp from all the unrealize methods. Any unrealize method that uses @errp now needs an update. This leads us to unrealize() methods that can fail. Merely passing it to another unrealize method cannot cause failure, though. Here are the ones that do other things with @errp: * virtio_serial_device_unrealize() Fails when qbus_set_hotplug_handler() fails, but still does all the other work. On failure, the device would stay realized with its resources completely gone. Oops. Can't happen, because qbus_set_hotplug_handler() can't actually fail here. Pass &error_abort to qbus_set_hotplug_handler() instead. * hw/ppc/spapr_drc.c's unrealize() Fails when object_property_del() fails, but all the other work is already done. On failure, the device would stay realized with its vmstate registration gone. Oops. Can't happen, because object_property_del() can't actually fail here. Pass &error_abort to object_property_del() instead. * spapr_phb_unrealize() Fails and bails out when remove_drcs() fails, but other work is already done. On failure, the device would stay realized with some of its resources gone. Oops. remove_drcs() fails only when chassis_from_bus()'s object_property_get_uint() fails, and it can't here. Pass &error_abort to remove_drcs() instead. Therefore, no unrealize method can fail before this patch. device_set_realized()'s recursive unrealization via bus uses object_property_set_bool(). Can't drop @errp there, so pass &error_abort. We similarly unrealize with object_property_set_bool() elsewhere, always ignoring errors. Pass &error_abort instead. Several unrealize methods no longer handle errors from other unrealize methods: virtio_9p_device_unrealize(), virtio_input_device_unrealize(), scsi_qdev_unrealize(), ... Much of the deleted error handling looks wrong anyway. One unrealize methods no longer ignore such errors: usb_ehci_pci_exit(). Several realize methods no longer ignore errors when rolling back: v9fs_device_realize_common(), pci_qdev_unrealize(), spapr_phb_realize(), usb_qdev_realize(), vfio_ccw_realize(), virtio_device_realize(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200505152926.18877-17-armbru@redhat.com>
2020-05-15qom: Drop parameter @errp of object_property_add() & friendsMarkus Armbruster
The only way object_property_add() can fail is when a property with the same name already exists. Since our property names are all hardcoded, failure is a programming error, and the appropriate way to handle it is passing &error_abort. Same for its variants, except for object_property_add_child(), which additionally fails when the child already has a parent. Parentage is also under program control, so this is a programming error, too. We have a bit over 500 callers. Almost half of them pass &error_abort, slightly fewer ignore errors, one test case handles errors, and the remaining few callers pass them to their own callers. The previous few commits demonstrated once again that ignoring programming errors is a bad idea. Of the few ones that pass on errors, several violate the Error API. The Error ** argument must be NULL, &error_abort, &error_fatal, or a pointer to a variable containing NULL. Passing an argument of the latter kind twice without clearing it in between is wrong: if the first call sets an error, it no longer points to NULL for the second call. ich9_pm_add_properties(), sparc32_ledma_realize(), sparc32_dma_realize(), xilinx_axidma_realize(), xilinx_enet_realize() are wrong that way. When the one appropriate choice of argument is &error_abort, letting users pick the argument is a bad idea. Drop parameter @errp and assert the preconditions instead. There's one exception to "duplicate property name is a programming error": the way object_property_add() implements the magic (and undocumented) "automatic arrayification". Don't drop @errp there. Instead, rename object_property_add() to object_property_try_add(), and add the obvious wrapper object_property_add(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200505152926.18877-15-armbru@redhat.com> [Two semantic rebase conflicts resolved]
2020-05-15qom: Clean up inconsistent use of gchar * vs. char *Markus Armbruster
Uses of gchar * in qom/object.h: * ObjectProperty member @name Functions that take a property name argument all use char *. Change the member to match. * ObjectProperty member @type Functions that take a property type argument or return it all use char *. Change the member to match. * ObjectProperty member @description Functions that take a property description argument all use char *. Change the member to match. * object_resolve_path_component() parameter @part Path components are property names. Most callers pass char * arguments. Change the parameter to match. Adjust the few callers that pass gchar * to pass char *. * Return value of object_get_canonical_path_component(), object_get_canonical_path() Most callers convert their return values right back to char *. Change the return value to match. Adjust the few callers where that would add a conversion to gchar * to use char * instead. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200505152926.18877-3-armbru@redhat.com>
2020-03-16qom/object: enable setter for uint typesFelipe Franciosi
Traditionally, the uint-specific property helpers only offer getters. When adding object (or class) uint types, one must therefore use the generic property helper if a setter is needed (and probably duplicate some code writing their own getters/setters). This enhances the uint-specific property helper APIs by adding a bitwise-or'd 'flags' field and modifying all clients of that API to set this paramater to OBJ_PROP_FLAG_READ. This maintains the current behaviour whilst allowing others to also set OBJ_PROP_FLAG_WRITE (or use the more convenient OBJ_PROP_FLAG_READWRITE) in the future (which will automatically install a setter). Other flags may be added later. Signed-off-by: Felipe Franciosi <felipe@nutanix.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-21spapr: Fix handling of unplugged devices during CAS and migrationGreg Kurz
We already detect if a device is being hot plugged before CAS to trigger a CAS reboot and during migration to migrate the state of the associated DRC. But hot unplugging a device is also an asynchronous operation that requires the guest to take action. This means that if the guest is migrated after the hot unplug event was sent but before it could release the device with RTAS, the destination QEMU doesn't know about the pending unplug operation and doesn't actually remove the device when the guest finally releases it. Similarly, if the unplug request is fired before CAS, the guest isn't notified of the change, just like with hotplug. It ends up booting with the device still present in the DT and configures it, just like it was never removed. Even weirder, since the event is still queued, it will be eventually processed when some other unrelated event is posted to the guest. Enhance spapr_drc_transient() to also return true if an unplug request is pending. This fixes the issue at CAS with a CAS reboot request and causes the DRC state to be migrated. Some extra care is still needed to inform the destination that an unplug request is pending : migrate the unplug_requested field of the DRC in an optional subsection. This might break backwards migration, but this is still better than ending with an inconsistent guest. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <158169248798.3465937.1108351365840514270.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-02-21spapr: Don't use spapr_drc_needed() in CAS codeGreg Kurz
We currently don't support hotplug of devices between boot and CAS. If this happens a CAS reboot is triggered. We detect this during CAS using the spapr_drc_needed() function which is essentially a VMStateDescription .needed callback. Even if the condition for CAS reboot happens to be the same as for DRC migration, it looks wrong to piggyback a migration helper for this. Introduce a helper with slightly more explicit name and use it in both CAS and DRC migration code. Since a subsequent patch will enhance this helper to cover the case of hot unplug, let's go for spapr_drc_transient(). While here convert spapr_hotplugged_dev_before_cas() to the "transient" wording as well. This doesn't change any behaviour. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <158169248180.3465937.9531405453362718771.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-02-21spapr: Add NVDIMM device supportShivaprasad G Bhat
Add support for NVDIMM devices for sPAPR. Piggyback on existing nvdimm device interface in QEMU to support virtual NVDIMM devices for Power. Create the required DT entries for the device (some entries have dummy values right now). The patch creates the required DT node and sends a hotplug interrupt to the guest. Guest is expected to undertake the normal DR resource add path in response and start issuing PAPR SCM hcalls. The device support is verified based on the machine version unlike x86. This is how it can be used .. Ex : For coldplug, the device to be added in qemu command line as shown below -object memory-backend-file,id=memnvdimm0,prealloc=yes,mem-path=/tmp/nvdimm0,share=yes,size=1073872896 -device nvdimm,label-size=128k,uuid=75a3cdd7-6a2f-4791-8d15-fe0a920e8e9e,memdev=memnvdimm0,id=nvdimm0,slot=0 For hotplug, the device to be added from monitor as below object_add memory-backend-file,id=memnvdimm0,prealloc=yes,mem-path=/tmp/nvdimm0,share=yes,size=1073872896 device_add nvdimm,label-size=128k,uuid=75a3cdd7-6a2f-4791-8d15-fe0a920e8e9e,memdev=memnvdimm0,id=nvdimm0,slot=0 Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> [Early implementation] Message-Id: <158131058078.2897.12767731856697459923.stgit@lep8c.aus.stglabs.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-01-06vmstate: replace DeviceState with VMStateIfMarc-André Lureau
Replace DeviceState dependency with VMStateIf on vmstate API. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Acked-by: Halil Pasic <pasic@linux.ibm.com>
2019-08-21ppc: fix memory leak in spapr_dt_drc()Shivaprasad G Bhat
Leaking the drc_name while preparing the DT properties. Fixing that. Also, remove the const qualifier from spapr_drc_name(). Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Message-Id: <156335159028.82682.5404622104535818162.stgit@lep8c.aus.stglabs.ibm.com> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-16Include hw/qdev-properties.h lessMarkus Armbruster
In my "build everything" tree, changing hw/qdev-properties.h triggers a recompile of some 2700 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). Many places including hw/qdev-properties.h (directly or via hw/qdev.h) actually need only hw/qdev-core.h. Include hw/qdev-core.h there instead. hw/qdev.h is actually pointless: all it does is include hw/qdev-core.h and hw/qdev-properties.h, which in turn includes hw/qdev-core.h. Replace the remaining uses of hw/qdev.h by hw/qdev-properties.h. While there, delete a few superfluous inclusions of hw/qdev-core.h. Touching hw/qdev-properties.h now recompiles some 1200 objects. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Daniel P. Berrangé" <berrange@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20190812052359.30071-22-armbru@redhat.com>
2019-08-16Include migration/vmstate.h lessMarkus Armbruster
In my "build everything" tree, changing migration/vmstate.h triggers a recompile of some 2700 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). hw/hw.h supposedly includes it for convenience. Several other headers include it just to get VMStateDescription. The previous commit made that unnecessary. Include migration/vmstate.h only where it's still needed. Touching it now recompiles only some 1600 objects. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20190812052359.30071-16-armbru@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16Include sysemu/reset.h a lot lessMarkus Armbruster
In my "build everything" tree, changing sysemu/reset.h triggers a recompile of some 2600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). The main culprit is hw/hw.h, which supposedly includes it for convenience. Include sysemu/reset.h only where it's needed. Touching it now recompiles less than 200 objects. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190812052359.30071-9-armbru@redhat.com>
2019-06-12spapr: Clean up spapr_drc_populate_dt()David Gibson
This makes some minor cleanups to spapr_drc_populate_dt(), renaming it to the shorter and more idiomatic spapr_dt_drc() along the way. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Acked-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12spapr: Use CamelCase properlyDavid Gibson
The qemu coding standard is to use CamelCase for type and structure names, and the pseries code follows that... sort of. There are quite a lot of places where we bend the rules in order to preserve the capitalization of internal acronyms like "PHB", "TCE", "DIMM" and most commonly "sPAPR". That was a bad idea - it frequently leads to names ending up with hard to read clusters of capital letters, and means they don't catch the eye as type identifiers, which is kind of the point of the CamelCase convention in the first place. In short, keeping type identifiers look like CamelCase is more important than preserving standard capitalization of internal "words". So, this patch renames a heap of spapr internal type names to a more standard CamelCase. In addition to case changes, we also make some other identifier renames: VIOsPAPR* -> SpaprVio* The reverse word ordering was only ever used to mitigate the capital cluster, so revert to the natural ordering. VIOsPAPRVTYDevice -> SpaprVioVty VIOsPAPRVLANDevice -> SpaprVioVlan Brevity, since the "Device" didn't add useful information sPAPRDRConnector -> SpaprDrc sPAPRDRConnectorClass -> SpaprDrcClass Brevity, and makes it clearer this is the same thing as a "DRC" mentioned in many other places in the code This is 100% a mechanical search-and-replace patch. It will, however, conflict with essentially any and all outstanding patches touching the spapr code. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26spapr: add hotplug hooks for PHB hotplugGreg Kurz
Hotplugging PHBs is a machine-level operation, but PHBs reside on the main system bus, so we register spapr machine as the handler for the main system bus. Provide the usual pre-plug, plug and unplug-request handlers. Move the checking of the PHB index to the pre-plug handler. It is okay to do that and assert in the realize function because the pre-plug handler is always called, even for the oldest machine types we support. Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> (Fixed interrupt controller phandle in "interrupt-map" and TCE table size in "ibm,dma-window" FDT fragment, Greg Kurz) Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <155059672926.1466090.13612804072190051439.stgit@bahia.lab.toulouse-stg.fr.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26spapr: create DR connectors for PHBsMichael Roth
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <155059670389.1466090.10015601248906623076.stgit@bahia.lab.toulouse-stg.fr.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26spapr/drc: Drop spapr_drc_attach() fdt argumentGreg Kurz
All DRC subtypes have been converted to generate the FDT fragment at configure connector time instead of attach time. The fdt and fdt_offset arguments of spapr_drc_attach() aren't needed anymore. Drop them and make the implementation of the dt_populate() method mandatory. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <155059667853.1466090.16527852453054217565.stgit@bahia.lab.toulouse-stg.fr.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26spapr/pci: Generate FDT fragment at configure connector timeGreg Kurz
Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <155059667346.1466090.326696113231137772.stgit@bahia.lab.toulouse-stg.fr.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26spapr: Generate FDT fragment for CPUs at configure connector timeGreg Kurz
Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <155059666839.1466090.3833376527523126752.stgit@bahia.lab.toulouse-stg.fr.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26spapr: Generate FDT fragment for LMBs at configure connector timeGreg Kurz
Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <155059666331.1466090.6766540766297333313.stgit@bahia.lab.toulouse-stg.fr.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26spapr_drc: Allow FDT fragment to be added laterGreg Kurz
The current logic is to provide the FDT fragment when attaching a device to a DRC. This works perfectly fine for our current hotplug support, but soon we will add support for PHB hotplug which has some constraints, that CPU, PCI and LMB devices don't seem to have. The first constraint is that the "ibm,dma-window" property of the PHB node requires the IOMMU to be configured, ie, spapr_tce_table_enable() has been called, which happens during PHB reset. It is okay in the case of hotplug since the device is reset before the hotplug handler is called. On the contrary with coldplug, the hotplug handler is called first and device is only reset during the initial system reset. Trying to create the FDT fragment on the hotplug path in this case, would result in somthing like this: ibm,dma-window = < 0x80000000 0x00 0x00 0x00 0x00 >; This will cause linux in the guest to panic, by simply removing and re-adding the PHB using the drmgr command: page = alloc_pages_node(nid, GFP_KERNEL, get_order(sz)); if (!page) panic("iommu_init_table: Can't allocate %ld bytes\n", sz); The second and maybe more problematic constraint is that the "interrupt-map" property needs to reference the interrupt controller node using the very same phandle that SLOF has already exposed to the guest. QEMU requires SLOF to call the private KVMPPC_H_UPDATE_DT hcall at some point to know about this phandle. With the latest QEMU and SLOF, this happens when SLOF gets quiesced. This means that if the PHB gets hotplugged after CAS but before SLOF quiesce, then we're sure that the phandle is not known when the hotplug handler is called. The FDT is only needed when the guest first invokes RTAS to configure the connector actually, long after SLOF quiesce. Let's postpone the creation of FDT fragments for PHBs to rtas_ibm_configure_connector(). Since we only need this for PHBs, introduce a new method in the base DRC class for that. DRC subtypes will be converted to use it in subsequent patches. Allow spapr_drc_attach() to be passed a NULL fdt argument if the method is available. When all DRC subtypes have been converted, the fdt argument will eventually disappear. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <155059665823.1466090.18358845122627355537.stgit@bahia.lab.toulouse-stg.fr.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-12hw/ppc/spapr_drc: Replace error_setg(&error_abort) by error_report() + abort()Philippe Mathieu-Daudé
Use error_report() + abort() instead of error_setg(&error_abort), as suggested by the "qapi/error.h" documentation: Please don't error_setg(&error_fatal, ...), use error_report() and exit(), because that's more obvious. Likewise, don't error_setg(&error_abort, ...), use assert(). Use abort() instead of the suggested assert() because the error message already got displayed. Suggested-by: Eric Blake <eblake@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-05-04qobject: Replace qobject_incref/QINCREF qobject_decref/QDECREFMarc-André Lureau
Now that we can safely call QOBJECT() on QObject * as well as its subtypes, we can have macros qobject_ref() / qobject_unref() that work everywhere instead of having to use QINCREF() / QDECREF() for QObject and qobject_incref() / qobject_decref() for its subtypes. The replacement is mechanical, except I broke a long line, and added a cast in monitor_qmp_cleanup_req_queue_locked(). Unlike qobject_decref(), qobject_unref() doesn't accept void *. Note that the new macros evaluate their argument exactly once, thus no need to shout them. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20180419150145.24795-4-marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Rebased, semantic conflict resolved, commit message improved] Signed-off-by: Markus Armbruster <armbru@redhat.com>
2018-02-09qdict qlist: Make most helper macros functionsMarkus Armbruster
The macro expansions of qdict_put_TYPE() and qlist_append_TYPE() need qbool.h, qnull.h, qnum.h and qstring.h to compile. We include qnull.h and qnum.h in the headers, but not qbool.h and qstring.h. Works, because we include those wherever the macros get used. Open-coding these helpers is of dubious value. Turn them into functions and drop the includes from the headers. This cleanup makes the number of objects depending on qapi/qmp/qnum.h from 4551 (out of 4743) to 46 in my "build everything" tree. For qapi/qmp/qnull.h, the number drops from 4552 to 21. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180201111846.21846-10-armbru@redhat.com>
2017-11-20spapr: reset DRCs after devicesGreg Kurz
A DRC with a pending unplug request releases its associated device at machine reset time. In the case of LMB, when all DRCs for a DIMM device have been reset, the DIMM gets unplugged, causing guest memory to disappear. This may be very confusing for anything still using this memory. This is exactly what happens with vhost backends, and QEMU aborts with: qemu-system-ppc64: used ring relocated for ring 2 qemu-system-ppc64: qemu/hw/virtio/vhost.c:649: vhost_commit: Assertion `r >= 0' failed. The issue is that each DRC registers a QEMU reset handler, and we don't control the order in which these handlers are called (ie, a LMB DRC will unplug a DIMM before the virtio device using the memory on this DIMM could stop its vhost backend). To avoid such situations, let's reset DRCs after all devices have been reset. Reported-by: Mallesh N. Koti <mallesh@linux.vnet.ibm.com> Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08spapr_drc: pass object ownership to parent/ownerMichael Roth
DRC objects attach themselves to an owner as a child property. unref afterward to allow them to be finalized when their owner is finalized. Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08spapr_drc: add unrealize method to physical DRC classGreg Kurz
When hot-unplugging a PHB, all its PCI DRC connectors get unrealized. This patch adds an unrealize method to the physical DRC class, in order to undo registrations performed in realize_physical(). Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08spapr_drc: use g_strdup_printf() instead of snprintf()Greg Kurz
Passing a stack allocated buffer of arbitrary length to snprintf() without checking the return value can cause the resultant strings to be silently truncated. Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08hw/ppc: CAS reset on early device hotplugDaniel Henrique Barboza
This patch is a follow up on the discussions made in patch "hw/ppc: disable hotplug before CAS is completed" that can be found at [1]. At this moment, we do not support CPU/memory hotplug in early boot stages, before CAS. When a hotplug occurs, the event is logged in an internal RTAS event log queue and an IRQ pulse is fired. In regular conditions, the guest handles the interrupt by executing check_exception, fetching the generated hotplug event and enabling the device for use. In early boot, this IRQ isn't caught (SLOF does not handle hotplug events), leaving the event in the rtas event log queue. If the guest executes check_exception due to another hotplug event, the re-assertion of the IRQ ends up de-queuing the first hotplug event as well. In short, a device hotplugged before CAS is considered coldplugged by SLOF. This leads to device misbehavior and, in some cases, guest kernel Ooops when trying to unplug the device. A proper fix would be to turn every device hotplugged before CAS as a colplugged device. This is not trivial to do with the current code base though - the FDT is written in the guest memory at ppc_spapr_reset and can't be retrieved without adding extra state (fdt_size for example) that will need to managed and migrated. Adding the hotplugged DT in the middle of CAS negotiation via the updated DT tree works with CPU devs, but panics the guest kernel at boot. Additional analysis would be necessary for LMBs and PCI devices. There are questions to be made in QEMU/SLOF/kernel level about how we can make this change in a sustainable way. With Linux guests, a fix would be the kernel executing check_exception at boot time, de-queueing the events that happened in early boot and processing them. However, even if/when the newer kernels start fetching these events at boot time, we need to take care of older kernels that won't be doing that. This patch works around the situation by issuing a CAS reset if a hotplugged device is detected during CAS: - the DRC conditions that warrant a CAS reset is the same as those that triggers a DRC migration - the DRC must have a device attached and the DRC state is not equal to its ready_state. With that in mind, this patch makes use of 'spapr_drc_needed' to determine if a CAS reset is needed. - In the middle of CAS negotiations, the function 'spapr_hotplugged_dev_before_cas' goes through all the DRCs to see if there are any DRC that requires a reset, using spapr_drc_needed. If that happens, returns '1' in 'spapr_h_cas_compose_response' which will set spapr->cas_reboot to true, causing the machine to reboot. No changes are made for coldplug devices. [1] http://lists.nongnu.org/archive/html/qemu-devel/2017-08/msg02855.html Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08hw/ppc/spapr_drc.c: change spapr_drc_needed to use drc->devDaniel Henrique Barboza
This patch makes a small fix in 'spapr_drc_needed' to change how we detect if a DRC has a device attached. Previously it used dr_entity_sense for this, which works for physical DRCs. However, for logical DRCs, it didn't cover the case where a logical DRC has a drc->dev but the state is LOGICAL_UNUSABLE (e.g. a hotplugged CPU before CAS). In this case, the dr_entity_sense of this DRC returns UNUSABLE and the code was considering that there were no dev attached, making spapr_drc_needed return 'false' when in fact we would like to migrate the DRC. Changing it to check for drc->dev instead works for all DRC types. Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-08-22spapr: Allow configure-connector to be called multiple timesBharata B Rao
In case of in-kernel memory hot unplug, when the guest is not able to remove all the LMBs that are requested for removal, it will add back any LMBs that have been successfully removed. The DR Connectors of these LMBs wouldn't have been unconfigured and hence the addition of these LMBs will result in configure-connector call being issued on LMB DR connectors that are already in configured state. Such configure-connector calls will fail resulting in a DIMM which is partially unplugged. This however worked till recently before we overhauled the DRC implementation in QEMU. Commit 9d4c0f4f0a71e: "spapr: Consolidate DRC state variables" is the first commit where this problem shows up as per git bisect. Ideally guest shouldn't be issuing configure-connector call on an already configured DR connector. However for now, work around this in QEMU by allowing configure-connector to be called multiple times for all types of DR connectors. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> [dwg: Corrected buglet that would have initialized fdt pointers ready for reading on a device not present at reset] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-08-09spapr_drc: abort if object_property_add_child() failsGreg Kurz
object_property_add_child() can only fail in two cases: - the child already has a parent, which shouldn't happen since the DRC was allocated a few lines above - the parent already has a child with the same name, which would mean the caller tries to create a DRC that already exists In both case, this is a QEMU bug and we should abort. Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-07-29spapr_drc: fix realize and unrealizeGreg Kurz
If object_property_add_alias() returns an error in realize(), we should propagate it to the caller and certainly not unref the DRC. Same thing goes for unrealize(). Since object_property_del() is the last call, we can even get rid of the intermediate Error *. And finally, unrealize() should undo all registrations performed by realize(). Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-07-24qapi: Use QNull for a more regular visit_type_null()Markus Armbruster
Make visit_type_null() take an @obj argument like its buddies. This helps keep the next commit simple. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com>