13 files changed, 2219 insertions, 0 deletions
diff --git a/Documentation/powerpc/00-INDEX b/Documentation/powerpc/00-INDEX
new file mode 100644
index 00000000..5620fb5a
--- /dev/null
+++ b/Documentation/powerpc/00-INDEX
@@ -0,0 +1,23 @@
+Index of files in Documentation/powerpc.  If you think something about
+Linux/PPC needs an entry here, needs correction or you've written one
+please mail me.
+                                        Cort Dougan (cort@fsmlabs.com)
+
+00-INDEX
+	- this file
+cpu_features.txt
+	- info on how we support a variety of CPUs with minimal compile-time
+	options.
+eeh-pci-error-recovery.txt
+	- info on PCI Bus EEH Error Recovery
+hvcs.txt
+	- IBM "Hypervisor Virtual Console Server" Installation Guide
+mpc52xx.txt
+	- Linux 2.6.x on MPC52xx family
+sound.txt
+	- info on sound support under Linux/PPC
+zImage_layout.txt
+	- info on the kernel images for Linux/PPC
+qe_firmware.txt
+	- describes the layout of firmware binaries for the Freescale QUICC
+	  Engine and the code that parses and uploads the microcode therein.
diff --git a/Documentation/powerpc/bootwrapper.txt b/Documentation/powerpc/bootwrapper.txt
new file mode 100644
index 00000000..d60fced5
--- /dev/null
+++ b/Documentation/powerpc/bootwrapper.txt
@@ -0,0 +1,141 @@
+The PowerPC boot wrapper
+------------------------
+Copyright (C) Secret Lab Technologies Ltd.
+
+PowerPC image targets compresses and wraps the kernel image (vmlinux) with
+a boot wrapper to make it usable by the system firmware.  There is no
+standard PowerPC firmware interface, so the boot wrapper is designed to
+be adaptable for each kind of image that needs to be built.
+
+The boot wrapper can be found in the arch/powerpc/boot/ directory.  The
+Makefile in that directory has targets for all the available image types.
+The different image types are used to support all of the various firmware
+interfaces found on PowerPC platforms.  OpenFirmware is the most commonly
+used firmware type on general purpose PowerPC systems from Apple, IBM and
+others.  U-Boot is typically found on embedded PowerPC hardware, but there
+are a handful of other firmware implementations which are also popular.  Each
+firmware interface requires a different image format.
+
+The boot wrapper is built from the makefile in arch/powerpc/boot/Makefile and
+it uses the wrapper script (arch/powerpc/boot/wrapper) to generate target
+image.  The details of the build system is discussed in the next section.
+Currently, the following image format targets exist:
+
+   cuImage.%:		Backwards compatible uImage for older version of
+			U-Boot (for versions that don't understand the device
+			tree).  This image embeds a device tree blob inside
+			the image.  The boot wrapper, kernel and device tree
+			are all embedded inside the U-Boot uImage file format
+			with boot wrapper code that extracts data from the old
+			bd_info structure and loads the data into the device
+			tree before jumping into the kernel.
+			  Because of the series of #ifdefs found in the
+			bd_info structure used in the old U-Boot interfaces,
+			cuImages are platform specific.  Each specific
+			U-Boot platform has a different platform init file
+			which populates the embedded device tree with data
+			from the platform specific bd_info file.  The platform
+			specific cuImage platform init code can be found in
+			arch/powerpc/boot/cuboot.*.c.  Selection of the correct
+			cuImage init code for a specific board can be found in
+			the wrapper structure.
+   dtbImage.%:		Similar to zImage, except device tree blob is embedded
+			inside the image instead of provided by firmware.  The
+			output image file can be either an elf file or a flat
+			binary depending on the platform.
+			  dtbImages are used on systems which do not have an
+			interface for passing a device tree directly.
+			dtbImages are similar to simpleImages except that
+			dtbImages have platform specific code for extracting
+			data from the board firmware, but simpleImages do not
+			talk to the firmware at all.
+			  PlayStation 3 support uses dtbImage.  So do Embedded
+			Planet boards using the PlanetCore firmware.  Board
+			specific initialization code is typically found in a
+			file named arch/powerpc/boot/<platform>.c; but this
+			can be overridden by the wrapper script.
+   simpleImage.%:	Firmware independent compressed image that does not
+			depend on any particular firmware interface and embeds
+			a device tree blob.  This image is a flat binary that
+			can be loaded to any location in RAM and jumped to.
+			Firmware cannot pass any configuration data to the
+			kernel with this image type and it depends entirely on
+			the embedded device tree for all information.
+			  The simpleImage is useful for booting systems with
+			an unknown firmware interface or for booting from
+			a debugger when no firmware is present (such as on
+			the Xilinx Virtex platform).  The only assumption that
+			simpleImage makes is that RAM is correctly initialized
+			and that the MMU is either off or has RAM mapped to
+			base address 0.
+			  simpleImage also supports inserting special platform
+			specific initialization code to the start of the bootup
+			sequence.  The virtex405 platform uses this feature to
+			ensure that the cache is invalidated before caching
+			is enabled.  Platform specific initialization code is
+			added as part of the wrapper script and is keyed on
+			the image target name.  For example, all
+			simpleImage.virtex405-* targets will add the
+			virtex405-head.S initialization code (This also means
+			that the dts file for virtex405 targets should be
+			named (virtex405-<board>.dts).  Search the wrapper
+			script for 'virtex405' and see the file
+			arch/powerpc/boot/virtex405-head.S for details.
+   treeImage.%;		Image format for used with OpenBIOS firmware found
+			on some ppc4xx hardware.  This image embeds a device
+			tree blob inside the image.
+   uImage:		Native image format used by U-Boot.  The uImage target
+			does not add any boot code.  It just wraps a compressed
+			vmlinux in the uImage data structure.  This image
+			requires a version of U-Boot that is able to pass
+			a device tree to the kernel at boot.  If using an older
+			version of U-Boot, then you need to use a cuImage
+			instead.
+   zImage.%:		Image format which does not embed a device tree.
+			Used by OpenFirmware and other firmware interfaces
+			which are able to supply a device tree.  This image
+			expects firmware to provide the device tree at boot.
+			Typically, if you have general purpose PowerPC
+			hardware then you want this image format.
+
+Image types which embed a device tree blob (simpleImage, dtbImage, treeImage,
+and cuImage) all generate the device tree blob from a file in the
+arch/powerpc/boot/dts/ directory.  The Makefile selects the correct device
+tree source based on the name of the target.  Therefore, if the kernel is
+built with 'make treeImage.walnut simpleImage.virtex405-ml403', then the
+build system will use arch/powerpc/boot/dts/walnut.dts to build
+treeImage.walnut and arch/powerpc/boot/dts/virtex405-ml403.dts to build
+the simpleImage.virtex405-ml403.
+
+Two special targets called 'zImage' and 'zImage.initrd' also exist.  These
+targets build all the default images as selected by the kernel configuration.
+Default images are selected by the boot wrapper Makefile
+(arch/powerpc/boot/Makefile) by adding targets to the $image-y variable.  Look
+at the Makefile to see which default image targets are available.
+
+How it is built
+---------------
+arch/powerpc is designed to support multiplatform kernels, which means
+that a single vmlinux image can be booted on many different target boards.
+It also means that the boot wrapper must be able to wrap for many kinds of
+images on a single build.  The design decision was made to not use any
+conditional compilation code (#ifdef, etc) in the boot wrapper source code.
+All of the boot wrapper pieces are buildable at any time regardless of the
+kernel configuration.  Building all the wrapper bits on every kernel build
+also ensures that obscure parts of the wrapper are at the very least compile
+tested in a large variety of environments.
+
+The wrapper is adapted for different image types at link time by linking in
+just the wrapper bits that are appropriate for the image type.  The 'wrapper
+script' (found in arch/powerpc/boot/wrapper) is called by the Makefile and
+is responsible for selecting the correct wrapper bits for the image type.
+The arguments are well documented in the script's comment block, so they
+are not repeated here.  However, it is worth mentioning that the script
+uses the -p (platform) argument as the main method of deciding which wrapper
+bits to compile in.  Look for the large 'case "$platform" in' block in the
+middle of the script.  This is also the place where platform specific fixups
+can be selected by changing the link order.
+
+In particular, care should be taken when working with cuImages.  cuImage
+wrapper bits are very board specific and care should be taken to make sure
+the target you are trying to build is supported by the wrapper bits.
diff --git a/Documentation/powerpc/cpu_features.txt b/Documentation/powerpc/cpu_features.txt
new file mode 100644
index 00000000..ae09df87
--- /dev/null
+++ b/Documentation/powerpc/cpu_features.txt
@@ -0,0 +1,56 @@
+Hollis Blanchard <hollis@austin.ibm.com>
+5 Jun 2002
+
+This document describes the system (including self-modifying code) used in the
+PPC Linux kernel to support a variety of PowerPC CPUs without requiring
+compile-time selection.
+
+Early in the boot process the ppc32 kernel detects the current CPU type and
+chooses a set of features accordingly. Some examples include Altivec support,
+split instruction and data caches, and if the CPU supports the DOZE and NAP
+sleep modes.
+
+Detection of the feature set is simple. A list of processors can be found in
+arch/powerpc/kernel/cputable.c. The PVR register is masked and compared with
+each value in the list. If a match is found, the cpu_features of cur_cpu_spec
+is assigned to the feature bitmask for this processor and a __setup_cpu
+function is called.
+
+C code may test 'cur_cpu_spec[smp_processor_id()]->cpu_features' for a
+particular feature bit. This is done in quite a few places, for example
+in ppc_setup_l2cr().
+
+Implementing cpufeatures in assembly is a little more involved. There are
+several paths that are performance-critical and would suffer if an array
+index, structure dereference, and conditional branch were added. To avoid the
+performance penalty but still allow for runtime (rather than compile-time) CPU
+selection, unused code is replaced by 'nop' instructions. This nop'ing is
+based on CPU 0's capabilities, so a multi-processor system with non-identical
+processors will not work (but such a system would likely have other problems
+anyways).
+
+After detecting the processor type, the kernel patches out sections of code
+that shouldn't be used by writing nop's over it. Using cpufeatures requires
+just 2 macros (found in arch/powerpc/include/asm/cputable.h), as seen in head.S
+transfer_to_handler:
+
+	#ifdef CONFIG_ALTIVEC
+	BEGIN_FTR_SECTION
+		mfspr	r22,SPRN_VRSAVE		/* if G4, save vrsave register value */
+		stw	r22,THREAD_VRSAVE(r23)
+	END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
+	#endif /* CONFIG_ALTIVEC */
+
+If CPU 0 supports Altivec, the code is left untouched. If it doesn't, both
+instructions are replaced with nop's.
+
+The END_FTR_SECTION macro has two simpler variations: END_FTR_SECTION_IFSET
+and END_FTR_SECTION_IFCLR. These simply test if a flag is set (in
+cur_cpu_spec[0]->cpu_features) or is cleared, respectively. These two macros
+should be used in the majority of cases.
+
+The END_FTR_SECTION macros are implemented by storing information about this
+code in the '__ftr_fixup' ELF section. When do_cpu_ftr_fixups
+(arch/powerpc/kernel/misc.S) is invoked, it will iterate over the records in
+__ftr_fixup, and if the required feature is not present it will loop writing
+nop's from each BEGIN_FTR_SECTION to END_FTR_SECTION.
diff --git a/Documentation/powerpc/eeh-pci-error-recovery.txt b/Documentation/powerpc/eeh-pci-error-recovery.txt
new file mode 100644
index 00000000..9d4e33df
--- /dev/null
+++ b/Documentation/powerpc/eeh-pci-error-recovery.txt
@@ -0,0 +1,334 @@
+
+
+                      PCI Bus EEH Error Recovery
+                      --------------------------
+                           Linas Vepstas
+                       <linas@austin.ibm.com>
+                          12 January 2005
+
+
+Overview:
+---------
+The IBM POWER-based pSeries and iSeries computers include PCI bus
+controller chips that have extended capabilities for detecting and
+reporting a large variety of PCI bus error conditions.  These features
+go under the name of "EEH", for "Extended Error Handling".  The EEH
+hardware features allow PCI bus errors to be cleared and a PCI
+card to be "rebooted", without also having to reboot the operating
+system.
+
+This is in contrast to traditional PCI error handling, where the
+PCI chip is wired directly to the CPU, and an error would cause
+a CPU machine-check/check-stop condition, halting the CPU entirely.
+Another "traditional" technique is to ignore such errors, which
+can lead to data corruption, both of user data or of kernel data,
+hung/unresponsive adapters, or system crashes/lockups.  Thus,
+the idea behind EEH is that the operating system can become more
+reliable and robust by protecting it from PCI errors, and giving
+the OS the ability to "reboot"/recover individual PCI devices.
+
+Future systems from other vendors, based on the PCI-E specification,
+may contain similar features.
+
+
+Causes of EEH Errors
+--------------------
+EEH was originally designed to guard against hardware failure, such
+as PCI cards dying from heat, humidity, dust, vibration and bad
+electrical connections. The vast majority of EEH errors seen in
+"real life" are due to either poorly seated PCI cards, or,
+unfortunately quite commonly, due to device driver bugs, device firmware
+bugs, and sometimes PCI card hardware bugs.
+
+The most common software bug, is one that causes the device to
+attempt to DMA to a location in system memory that has not been
+reserved for DMA access for that card.  This is a powerful feature,
+as it prevents what; otherwise, would have been silent memory
+corruption caused by the bad DMA.  A number of device driver
+bugs have been found and fixed in this way over the past few
+years.  Other possible causes of EEH errors include data or
+address line parity errors (for example, due to poor electrical
+connectivity due to a poorly seated card), and PCI-X split-completion
+errors (due to software, device firmware, or device PCI hardware bugs).
+The vast majority of "true hardware failures" can be cured by
+physically removing and re-seating the PCI card.
+
+
+Detection and Recovery
+----------------------
+In the following discussion, a generic overview of how to detect
+and recover from EEH errors will be presented. This is followed
+by an overview of how the current implementation in the Linux
+kernel does it.  The actual implementation is subject to change,
+and some of the finer points are still being debated.  These
+may in turn be swayed if or when other architectures implement
+similar functionality.
+
+When a PCI Host Bridge (PHB, the bus controller connecting the
+PCI bus to the system CPU electronics complex) detects a PCI error
+condition, it will "isolate" the affected PCI card.  Isolation
+will block all writes (either to the card from the system, or
+from the card to the system), and it will cause all reads to
+return all-ff's (0xff, 0xffff, 0xffffffff for 8/16/32-bit reads).
+This value was chosen because it is the same value you would
+get if the device was physically unplugged from the slot.
+This includes access to PCI memory, I/O space, and PCI config
+space.  Interrupts; however, will continued to be delivered.
+
+Detection and recovery are performed with the aid of ppc64
+firmware.  The programming interfaces in the Linux kernel
+into the firmware are referred to as RTAS (Run-Time Abstraction
+Services).  The Linux kernel does not (should not) access
+the EEH function in the PCI chipsets directly, primarily because
+there are a number of different chipsets out there, each with
+different interfaces and quirks. The firmware provides a
+uniform abstraction layer that will work with all pSeries
+and iSeries hardware (and be forwards-compatible).
+
+If the OS or device driver suspects that a PCI slot has been
+EEH-isolated, there is a firmware call it can make to determine if
+this is the case. If so, then the device driver should put itself
+into a consistent state (given that it won't be able to complete any
+pending work) and start recovery of the card.  Recovery normally
+would consist of resetting the PCI device (holding the PCI #RST
+line high for two seconds), followed by setting up the device
+config space (the base address registers (BAR's), latency timer,
+cache line size, interrupt line, and so on).  This is followed by a
+reinitialization of the device driver.  In a worst-case scenario,
+the power to the card can be toggled, at least on hot-plug-capable
+slots.  In principle, layers far above the device driver probably
+do not need to know that the PCI card has been "rebooted" in this
+way; ideally, there should be at most a pause in Ethernet/disk/USB
+I/O while the card is being reset.
+
+If the card cannot be recovered after three or four resets, the
+kernel/device driver should assume the worst-case scenario, that the
+card has died completely, and report this error to the sysadmin.
+In addition, error messages are reported through RTAS and also through
+syslogd (/var/log/messages) to alert the sysadmin of PCI resets.
+The correct way to deal with failed adapters is to use the standard
+PCI hotplug tools to remove and replace the dead card.
+
+
+Current PPC64 Linux EEH Implementation
+--------------------------------------
+At this time, a generic EEH recovery mechanism has been implemented,
+so that individual device drivers do not need to be modified to support
+EEH recovery.  This generic mechanism piggy-backs on the PCI hotplug
+infrastructure,  and percolates events up through the userspace/udev
+infrastructure.  Following is a detailed description of how this is
+accomplished.
+
+EEH must be enabled in the PHB's very early during the boot process,
+and if a PCI slot is hot-plugged. The former is performed by
+eeh_init() in arch/powerpc/platforms/pseries/eeh.c, and the later by
+drivers/pci/hotplug/pSeries_pci.c calling in to the eeh.c code.
+EEH must be enabled before a PCI scan of the device can proceed.
+Current Power5 hardware will not work unless EEH is enabled;
+although older Power4 can run with it disabled.  Effectively,
+EEH can no longer be turned off.  PCI devices *must* be
+registered with the EEH code; the EEH code needs to know about
+the I/O address ranges of the PCI device in order to detect an
+error.  Given an arbitrary address, the routine
+pci_get_device_by_addr() will find the pci device associated
+with that address (if any).
+
+The default arch/powerpc/include/asm/io.h macros readb(), inb(), insb(),
+etc. include a check to see if the i/o read returned all-0xff's.
+If so, these make a call to eeh_dn_check_failure(), which in turn
+asks the firmware if the all-ff's value is the sign of a true EEH
+error.  If it is not, processing continues as normal.  The grand
+total number of these false alarms or "false positives" can be
+seen in /proc/ppc64/eeh (subject to change).  Normally, almost
+all of these occur during boot, when the PCI bus is scanned, where
+a large number of 0xff reads are part of the bus scan procedure.
+
+If a frozen slot is detected, code in 
+arch/powerpc/platforms/pseries/eeh.c will print a stack trace to 
+syslog (/var/log/messages).  This stack trace has proven to be very 
+useful to device-driver authors for finding out at what point the EEH 
+error was detected, as the error itself usually occurs slightly 
+beforehand.
+
+Next, it uses the Linux kernel notifier chain/work queue mechanism to
+allow any interested parties to find out about the failure.  Device
+drivers, or other parts of the kernel, can use
+eeh_register_notifier(struct notifier_block *) to find out about EEH
+events.  The event will include a pointer to the pci device, the
+device node and some state info.  Receivers of the event can "do as
+they wish"; the default handler will be described further in this
+section.
+
+To assist in the recovery of the device, eeh.c exports the
+following functions:
+
+rtas_set_slot_reset() -- assert the  PCI #RST line for 1/8th of a second
+rtas_configure_bridge() -- ask firmware to configure any PCI bridges
+   located topologically under the pci slot.
+eeh_save_bars() and eeh_restore_bars(): save and restore the PCI
+   config-space info for a device and any devices under it.
+
+
+A handler for the EEH notifier_block events is implemented in
+drivers/pci/hotplug/pSeries_pci.c, called handle_eeh_events().
+It saves the device BAR's and then calls rpaphp_unconfig_pci_adapter().
+This last call causes the device driver for the card to be stopped,
+which causes uevents to go out to user space. This triggers
+user-space scripts that might issue commands such as "ifdown eth0"
+for ethernet cards, and so on.  This handler then sleeps for 5 seconds,
+hoping to give the user-space scripts enough time to complete.
+It then resets the PCI card, reconfigures the device BAR's, and
+any bridges underneath. It then calls rpaphp_enable_pci_slot(),
+which restarts the device driver and triggers more user-space
+events (for example, calling "ifup eth0" for ethernet cards).
+
+
+Device Shutdown and User-Space Events
+-------------------------------------
+This section documents what happens when a pci slot is unconfigured,
+focusing on how the device driver gets shut down, and on how the
+events get delivered to user-space scripts.
+
+Following is an example sequence of events that cause a device driver
+close function to be called during the first phase of an EEH reset.
+The following sequence is an example of the pcnet32 device driver.
+
+    rpa_php_unconfig_pci_adapter (struct slot *)  // in rpaphp_pci.c
+    {
+      calls
+      pci_remove_bus_device (struct pci_dev *) // in /drivers/pci/remove.c
+      {
+        calls
+        pci_destroy_dev (struct pci_dev *)
+        {
+          calls
+          device_unregister (&dev->dev) // in /drivers/base/core.c
+          {
+            calls
+            device_del (struct device *)
+            {
+              calls
+              bus_remove_device() // in /drivers/base/bus.c
+              {
+                calls
+                device_release_driver()
+                {
+                  calls
+                  struct device_driver->remove() which is just
+                  pci_device_remove()  // in /drivers/pci/pci_driver.c
+                  {
+                    calls
+                    struct pci_driver->remove() which is just
+                    pcnet32_remove_one() // in /drivers/net/pcnet32.c
+                    {
+                      calls
+                      unregister_netdev() // in /net/core/dev.c
+                      {
+                        calls
+                        dev_close()  // in /net/core/dev.c
+                        {
+                           calls dev->stop();
+                           which is just pcnet32_close() // in pcnet32.c
+                           {
+                             which does what you wanted
+                             to stop the device
+                           }
+                        }
+                     }
+                   which
+                   frees pcnet32 device driver memory
+                }
+     }}}}}}
+
+
+    in drivers/pci/pci_driver.c,
+    struct device_driver->remove() is just pci_device_remove()
+    which calls struct pci_driver->remove() which is pcnet32_remove_one()
+    which calls unregister_netdev()  (in net/core/dev.c)
+    which calls dev_close()  (in net/core/dev.c)
+    which calls dev->stop() which is pcnet32_close()
+    which then does the appropriate shutdown.
+
+---
+Following is the analogous stack trace for events sent to user-space
+when the pci device is unconfigured.
+
+rpa_php_unconfig_pci_adapter() {             // in rpaphp_pci.c
+  calls
+  pci_remove_bus_device (struct pci_dev *) { // in /drivers/pci/remove.c
+    calls
+    pci_destroy_dev (struct pci_dev *) {
+      calls
+      device_unregister (&dev->dev) {        // in /drivers/base/core.c
+        calls
+        device_del(struct device * dev) {    // in /drivers/base/core.c
+          calls
+          kobject_del() {                    //in /libs/kobject.c
+            calls
+            kobject_uevent() {               // in /libs/kobject.c
+              calls
+              kset_uevent() {                // in /lib/kobject.c
+                calls
+                kset->uevent_ops->uevent()   // which is really just
+                a call to
+                dev_uevent() {               // in /drivers/base/core.c
+                  calls
+                  dev->bus->uevent() which is really just a call to
+                  pci_uevent () {            // in drivers/pci/hotplug.c
+                    which prints device name, etc....
+                 }
+               }
+               then kobject_uevent() sends a netlink uevent to userspace
+               --> userspace uevent
+               (during early boot, nobody listens to netlink events and
+               kobject_uevent() executes uevent_helper[], which runs the
+               event process /sbin/hotplug)
+           }
+         }
+         kobject_del() then calls sysfs_remove_dir(), which would
+         trigger any user-space daemon that was watching /sysfs,
+         and notice the delete event.
+
+
+Pro's and Con's of the Current Design
+-------------------------------------
+There are several issues with the current EEH software recovery design,
+which may be addressed in future revisions.  But first, note that the
+big plus of the current design is that no changes need to be made to
+individual device drivers, so that the current design throws a wide net.
+The biggest negative of the design is that it potentially disturbs
+network daemons and file systems that didn't need to be disturbed.
+
+-- A minor complaint is that resetting the network card causes
+   user-space back-to-back ifdown/ifup burps that potentially disturb
+   network daemons, that didn't need to even know that the pci
+   card was being rebooted.
+
+-- A more serious concern is that the same reset, for SCSI devices,
+   causes havoc to mounted file systems.  Scripts cannot post-facto
+   unmount a file system without flushing pending buffers, but this
+   is impossible, because I/O has already been stopped.  Thus,
+   ideally, the reset should happen at or below the block layer,
+   so that the file systems are not disturbed.
+
+   Reiserfs does not tolerate errors returned from the block device.
+   Ext3fs seems to be tolerant, retrying reads/writes until it does
+   succeed. Both have been only lightly tested in this scenario.
+
+   The SCSI-generic subsystem already has built-in code for performing
+   SCSI device resets, SCSI bus resets, and SCSI host-bus-adapter
+   (HBA) resets.  These are cascaded into a chain of attempted
+   resets if a SCSI command fails. These are completely hidden
+   from the block layer.  It would be very natural to add an EEH
+   reset into this chain of events.
+
+-- If a SCSI error occurs for the root device, all is lost unless
+   the sysadmin had the foresight to run /bin, /sbin, /etc, /var
+   and so on, out of ramdisk/tmpfs.
+
+
+Conclusions
+-----------
+There's forward progress ...
+
+
diff --git a/Documentation/powerpc/firmware-assisted-dump.txt b/Documentation/powerpc/firmware-assisted-dump.txt
new file mode 100644
index 00000000..3007bc98
--- /dev/null
+++ b/Documentation/powerpc/firmware-assisted-dump.txt
@@ -0,0 +1,270 @@
+
+                   Firmware-Assisted Dump
+                   ------------------------
+                       July 2011
+
+The goal of firmware-assisted dump is to enable the dump of
+a crashed system, and to do so from a fully-reset system, and
+to minimize the total elapsed time until the system is back
+in production use.
+
+- Firmware assisted dump (fadump) infrastructure is intended to replace
+  the existing phyp assisted dump.
+- Fadump uses the same firmware interfaces and memory reservation model
+  as phyp assisted dump.
+- Unlike phyp dump, fadump exports the memory dump through /proc/vmcore
+  in the ELF format in the same way as kdump. This helps us reuse the
+  kdump infrastructure for dump capture and filtering.
+- Unlike phyp dump, userspace tool does not need to refer any sysfs
+  interface while reading /proc/vmcore.
+- Unlike phyp dump, fadump allows user to release all the memory reserved
+  for dump, with a single operation of echo 1 > /sys/kernel/fadump_release_mem.
+- Once enabled through kernel boot parameter, fadump can be
+  started/stopped through /sys/kernel/fadump_registered interface (see
+  sysfs files section below) and can be easily integrated with kdump
+  service start/stop init scripts.
+
+Comparing with kdump or other strategies, firmware-assisted
+dump offers several strong, practical advantages:
+
+-- Unlike kdump, the system has been reset, and loaded
+   with a fresh copy of the kernel.  In particular,
+   PCI and I/O devices have been reinitialized and are
+   in a clean, consistent state.
+-- Once the dump is copied out, the memory that held the dump
+   is immediately available to the running kernel. And therefore,
+   unlike kdump, fadump doesn't need a 2nd reboot to get back
+   the system to the production configuration.
+
+The above can only be accomplished by coordination with,
+and assistance from the Power firmware. The procedure is
+as follows:
+
+-- The first kernel registers the sections of memory with the
+   Power firmware for dump preservation during OS initialization.
+   These registered sections of memory are reserved by the first
+   kernel during early boot.
+
+-- When a system crashes, the Power firmware will save
+   the low memory (boot memory of size larger of 5% of system RAM
+   or 256MB) of RAM to the previous registered region. It will
+   also save system registers, and hardware PTE's.
+
+   NOTE: The term 'boot memory' means size of the low memory chunk
+         that is required for a kernel to boot successfully when
+         booted with restricted memory. By default, the boot memory
+         size will be the larger of 5% of system RAM or 256MB.
+         Alternatively, user can also specify boot memory size
+         through boot parameter 'fadump_reserve_mem=' which will
+         override the default calculated size. Use this option
+         if default boot memory size is not sufficient for second
+         kernel to boot successfully.
+
+-- After the low memory (boot memory) area has been saved, the
+   firmware will reset PCI and other hardware state.  It will
+   *not* clear the RAM. It will then launch the bootloader, as
+   normal.
+
+-- The freshly booted kernel will notice that there is a new
+   node (ibm,dump-kernel) in the device tree, indicating that
+   there is crash data available from a previous boot. During
+   the early boot OS will reserve rest of the memory above
+   boot memory size effectively booting with restricted memory
+   size. This will make sure that the second kernel will not
+   touch any of the dump memory area.
+
+-- User-space tools will read /proc/vmcore to obtain the contents
+   of memory, which holds the previous crashed kernel dump in ELF
+   format. The userspace tools may copy this info to disk, or
+   network, nas, san, iscsi, etc. as desired.
+
+-- Once the userspace tool is done saving dump, it will echo
+   '1' to /sys/kernel/fadump_release_mem to release the reserved
+   memory back to general use, except the memory required for
+   next firmware-assisted dump registration.
+
+   e.g.
+     # echo 1 > /sys/kernel/fadump_release_mem
+
+Please note that the firmware-assisted dump feature
+is only available on Power6 and above systems with recent
+firmware versions.
+
+Implementation details:
+----------------------
+
+During boot, a check is made to see if firmware supports
+this feature on that particular machine. If it does, then
+we check to see if an active dump is waiting for us. If yes
+then everything but boot memory size of RAM is reserved during
+early boot (See Fig. 2). This area is released once we finish
+collecting the dump from user land scripts (e.g. kdump scripts)
+that are run. If there is dump data, then the
+/sys/kernel/fadump_release_mem file is created, and the reserved
+memory is held.
+
+If there is no waiting dump data, then only the memory required
+to hold CPU state, HPTE region, boot memory dump and elfcore
+header, is reserved at the top of memory (see Fig. 1). This area
+is *not* released: this region will be kept permanently reserved,
+so that it can act as a receptacle for a copy of the boot memory
+content in addition to CPU state and HPTE region, in the case a
+crash does occur.
+
+  o Memory Reservation during first kernel
+
+  Low memory                                        Top of memory
+  0      boot memory size                                       |
+  |           |                       |<--Reserved dump area -->|
+  V           V                       |   Permanent Reservation V
+  +-----------+----------/ /----------+---+----+-----------+----+
+  |           |                       |CPU|HPTE|  DUMP     |ELF |
+  +-----------+----------/ /----------+---+----+-----------+----+
+        |                                           ^
+        |                                           |
+        \                                           /
+         -------------------------------------------
+          Boot memory content gets transferred to
+          reserved area by firmware at the time of
+          crash
+                   Fig. 1
+
+  o Memory Reservation during second kernel after crash
+
+  Low memory                                        Top of memory
+  0      boot memory size                                       |
+  |           |<------------- Reserved dump area ----------- -->|
+  V           V                                                 V
+  +-----------+----------/ /----------+---+----+-----------+----+
+  |           |                       |CPU|HPTE|  DUMP     |ELF |
+  +-----------+----------/ /----------+---+----+-----------+----+
+        |                                                    |
+        V                                                    V
+   Used by second                                    /proc/vmcore
+   kernel to boot
+                   Fig. 2
+
+Currently the dump will be copied from /proc/vmcore to a
+a new file upon user intervention. The dump data available through
+/proc/vmcore will be in ELF format. Hence the existing kdump
+infrastructure (kdump scripts) to save the dump works fine with
+minor modifications.
+
+The tools to examine the dump will be same as the ones
+used for kdump.
+
+How to enable firmware-assisted dump (fadump):
+-------------------------------------
+
+1. Set config option CONFIG_FA_DUMP=y and build kernel.
+2. Boot into linux kernel with 'fadump=on' kernel cmdline option.
+3. Optionally, user can also set 'fadump_reserve_mem=' kernel cmdline
+   to specify size of the memory to reserve for boot memory dump
+   preservation.
+
+NOTE: If firmware-assisted dump fails to reserve memory then it will
+   fallback to existing kdump mechanism if 'crashkernel=' option
+   is set at kernel cmdline.
+
+Sysfs/debugfs files:
+------------
+
+Firmware-assisted dump feature uses sysfs file system to hold
+the control files and debugfs file to display memory reserved region.
+
+Here is the list of files under kernel sysfs:
+
+ /sys/kernel/fadump_enabled
+
+    This is used to display the fadump status.
+    0 = fadump is disabled
+    1 = fadump is enabled
+
+    This interface can be used by kdump init scripts to identify if
+    fadump is enabled in the kernel and act accordingly.
+
+ /sys/kernel/fadump_registered
+
+    This is used to display the fadump registration status as well
+    as to control (start/stop) the fadump registration.
+    0 = fadump is not registered.
+    1 = fadump is registered and ready to handle system crash.
+
+    To register fadump echo 1 > /sys/kernel/fadump_registered and
+    echo 0 > /sys/kernel/fadump_registered for un-register and stop the
+    fadump. Once the fadump is un-registered, the system crash will not
+    be handled and vmcore will not be captured. This interface can be
+    easily integrated with kdump service start/stop.
+
+ /sys/kernel/fadump_release_mem
+
+    This file is available only when fadump is active during
+    second kernel. This is used to release the reserved memory
+    region that are held for saving crash dump. To release the
+    reserved memory echo 1 to it:
+
+    echo 1  > /sys/kernel/fadump_release_mem
+
+    After echo 1, the content of the /sys/kernel/debug/powerpc/fadump_region
+    file will change to reflect the new memory reservations.
+
+    The existing userspace tools (kdump infrastructure) can be easily
+    enhanced to use this interface to release the memory reserved for
+    dump and continue without 2nd reboot.
+
+Here is the list of files under powerpc debugfs:
+(Assuming debugfs is mounted on /sys/kernel/debug directory.)
+
+ /sys/kernel/debug/powerpc/fadump_region
+
+    This file shows the reserved memory regions if fadump is
+    enabled otherwise this file is empty. The output format
+    is:
+    <region>: [<start>-<end>] <reserved-size> bytes, Dumped: <dump-size>
+
+    e.g.
+    Contents when fadump is registered during first kernel
+
+    # cat /sys/kernel/debug/powerpc/fadump_region
+    CPU : [0x0000006ffb0000-0x0000006fff001f] 0x40020 bytes, Dumped: 0x0
+    HPTE: [0x0000006fff0020-0x0000006fff101f] 0x1000 bytes, Dumped: 0x0
+    DUMP: [0x0000006fff1020-0x0000007fff101f] 0x10000000 bytes, Dumped: 0x0
+
+    Contents when fadump is active during second kernel
+
+    # cat /sys/kernel/debug/powerpc/fadump_region
+    CPU : [0x0000006ffb0000-0x0000006fff001f] 0x40020 bytes, Dumped: 0x40020
+    HPTE: [0x0000006fff0020-0x0000006fff101f] 0x1000 bytes, Dumped: 0x1000
+    DUMP: [0x0000006fff1020-0x0000007fff101f] 0x10000000 bytes, Dumped: 0x10000000
+        : [0x00000010000000-0x0000006ffaffff] 0x5ffb0000 bytes, Dumped: 0x5ffb0000
+
+NOTE: Please refer to Documentation/filesystems/debugfs.txt on
+      how to mount the debugfs filesystem.
+
+
+TODO:
+-----
+ o Need to come up with the better approach to find out more
+   accurate boot memory size that is required for a kernel to
+   boot successfully when booted with restricted memory.
+ o The fadump implementation introduces a fadump crash info structure
+   in the scratch area before the ELF core header. The idea of introducing
+   this structure is to pass some important crash info data to the second
+   kernel which will help second kernel to populate ELF core header with
+   correct data before it gets exported through /proc/vmcore. The current
+   design implementation does not address a possibility of introducing
+   additional fields (in future) to this structure without affecting
+   compatibility. Need to come up with the better approach to address this.
+   The possible approaches are:
+	1. Introduce version field for version tracking, bump up the version
+	whenever a new field is added to the structure in future. The version
+	field can be used to find out what fields are valid for the current
+	version of the structure.
+	2. Reserve the area of predefined size (say PAGE_SIZE) for this
+	structure and have unused area as reserved (initialized to zero)
+	for future field additions.
+   The advantage of approach 1 over 2 is we don't need to reserve extra space.
+---
+Author: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
+This document is based on the original documentation written for phyp
+assisted dump by Linas Vepstas and Manish Ahuja.
diff --git a/Documentation/powerpc/hvcs.txt b/Documentation/powerpc/hvcs.txt
new file mode 100644
index 00000000..a730ca5a
--- /dev/null
+++ b/Documentation/powerpc/hvcs.txt
@@ -0,0 +1,567 @@
+===========================================================================
+				   HVCS
+	IBM "Hypervisor Virtual Console Server" Installation Guide
+			  for Linux Kernel 2.6.4+
+		    Copyright (C) 2004 IBM Corporation
+
+===========================================================================
+NOTE:Eight space tabs are the optimum editor setting for reading this file.
+===========================================================================
+
+	       Author(s) :  Ryan S. Arnold <rsa@us.ibm.com>
+		       Date Created: March, 02, 2004
+		       Last Changed: August, 24, 2004
+
+---------------------------------------------------------------------------
+Table of contents:
+
+	1.  Driver Introduction:
+	2.  System Requirements
+	3.  Build Options:
+		3.1  Built-in:
+		3.2  Module:
+	4.  Installation:
+	5.  Connection:
+	6.  Disconnection:
+	7.  Configuration:
+	8.  Questions & Answers:
+	9.  Reporting Bugs:
+
+---------------------------------------------------------------------------
+1. Driver Introduction:
+
+This is the device driver for the IBM Hypervisor Virtual Console Server,
+"hvcs".  The IBM hvcs provides a tty driver interface to allow Linux user
+space applications access to the system consoles of logically partitioned
+operating systems (Linux and AIX) running on the same partitioned Power5
+ppc64 system.  Physical hardware consoles per partition are not practical
+on this hardware so system consoles are accessed by this driver using
+firmware interfaces to virtual terminal devices.
+
+---------------------------------------------------------------------------
+2. System Requirements:
+
+This device driver was written using 2.6.4 Linux kernel APIs and will only
+build and run on kernels of this version or later.
+
+This driver was written to operate solely on IBM Power5 ppc64 hardware
+though some care was taken to abstract the architecture dependent firmware
+calls from the driver code.
+
+Sysfs must be mounted on the system so that the user can determine which
+major and minor numbers are associated with each vty-server.  Directions
+for sysfs mounting are outside the scope of this document.
+
+---------------------------------------------------------------------------
+3. Build Options:
+
+The hvcs driver registers itself as a tty driver.  The tty layer
+dynamically allocates a block of major and minor numbers in a quantity
+requested by the registering driver.  The hvcs driver asks the tty layer
+for 64 of these major/minor numbers by default to use for hvcs device node
+entries.
+
+If the default number of device entries is adequate then this driver can be
+built into the kernel.  If not, the default can be over-ridden by inserting
+the driver as a module with insmod parameters.
+
+---------------------------------------------------------------------------
+3.1 Built-in:
+
+The following menuconfig example demonstrates selecting to build this
+driver into the kernel.
+
+	Device Drivers  --->
+		Character devices  --->
+			<*> IBM Hypervisor Virtual Console Server Support
+
+Begin the kernel make process.
+
+---------------------------------------------------------------------------
+3.2 Module:
+
+The following menuconfig example demonstrates selecting to build this
+driver as a kernel module.
+
+	Device Drivers  --->
+		Character devices  --->
+			<M> IBM Hypervisor Virtual Console Server Support
+
+The make process will build the following kernel modules:
+
+	hvcs.ko
+	hvcserver.ko
+
+To insert the module with the default allocation execute the following
+commands in the order they appear:
+
+	insmod hvcserver.ko
+	insmod hvcs.ko
+
+The hvcserver module contains architecture specific firmware calls and must
+be inserted first, otherwise the hvcs module will not find some of the
+symbols it expects.
+
+To override the default use an insmod parameter as follows (requesting 4
+tty devices as an example):
+
+	insmod hvcs.ko hvcs_parm_num_devs=4
+
+There is a maximum number of dev entries that can be specified on insmod.
+We think that 1024 is currently a decent maximum number of server adapters
+to allow.  This can always be changed by modifying the constant in the
+source file before building.
+
+NOTE: The length of time it takes to insmod the driver seems to be related
+to the number of tty interfaces the registering driver requests.
+
+In order to remove the driver module execute the following command:
+
+	rmmod hvcs.ko
+
+The recommended method for installing hvcs as a module is to use depmod to
+build a current modules.dep file in /lib/modules/`uname -r` and then
+execute:
+
+modprobe hvcs hvcs_parm_num_devs=4
+
+The modules.dep file indicates that hvcserver.ko needs to be inserted
+before hvcs.ko and modprobe uses this file to smartly insert the modules in
+the proper order.
+
+The following modprobe command is used to remove hvcs and hvcserver in the
+proper order:
+
+modprobe -r hvcs
+
+---------------------------------------------------------------------------
+4. Installation:
+
+The tty layer creates sysfs entries which contain the major and minor
+numbers allocated for the hvcs driver.  The following snippet of "tree"
+output of the sysfs directory shows where these numbers are presented:
+
+	sys/
+	|-- *other sysfs base dirs*
+	|
+	|-- class
+	|   |-- *other classes of devices*
+	|   |
+	|   `-- tty
+	|       |-- *other tty devices*
+	|       |
+	|       |-- hvcs0
+	|       |   `-- dev
+	|       |-- hvcs1
+	|       |   `-- dev
+	|       |-- hvcs2
+	|       |   `-- dev
+	|       |-- hvcs3
+	|       |   `-- dev
+	|       |
+	|       |-- *other tty devices*
+	|
+	|-- *other sysfs base dirs*
+
+For the above examples the following output is a result of cat'ing the
+"dev" entry in the hvcs directory:
+
+	Pow5:/sys/class/tty/hvcs0/ # cat dev
+	254:0
+
+	Pow5:/sys/class/tty/hvcs1/ # cat dev
+	254:1
+
+	Pow5:/sys/class/tty/hvcs2/ # cat dev
+	254:2
+
+	Pow5:/sys/class/tty/hvcs3/ # cat dev
+	254:3
+
+The output from reading the "dev" attribute is the char device major and
+minor numbers that the tty layer has allocated for this driver's use.  Most
+systems running hvcs will already have the device entries created or udev
+will do it automatically.
+
+Given the example output above, to manually create a /dev/hvcs* node entry
+mknod can be used as follows:
+
+	mknod /dev/hvcs0 c 254 0
+	mknod /dev/hvcs1 c 254 1
+	mknod /dev/hvcs2 c 254 2
+	mknod /dev/hvcs3 c 254 3
+
+Using mknod to manually create the device entries makes these device nodes
+persistent.  Once created they will exist prior to the driver insmod.
+
+Attempting to connect an application to /dev/hvcs* prior to insertion of
+the hvcs module will result in an error message similar to the following:
+
+	"/dev/hvcs*: No such device".
+
+NOTE: Just because there is a device node present doesn't mean that there
+is a vty-server device configured for that node.
+
+---------------------------------------------------------------------------
+5. Connection
+
+Since this driver controls devices that provide a tty interface a user can
+interact with the device node entries using any standard tty-interactive
+method (e.g. "cat", "dd", "echo").  The intent of this driver however, is
+to provide real time console interaction with a Linux partition's console,
+which requires the use of applications that provide bi-directional,
+interactive I/O with a tty device.
+
+Applications (e.g. "minicom" and "screen") that act as terminal emulators
+or perform terminal type control sequence conversion on the data being
+passed through them are NOT acceptable for providing interactive console
+I/O.  These programs often emulate antiquated terminal types (vt100 and
+ANSI) and expect inbound data to take the form of one of these supported
+terminal types but they either do not convert, or do not _adequately_
+convert, outbound data into the terminal type of the terminal which invoked
+them (though screen makes an attempt and can apparently be configured with
+much termcap wrestling.)
+
+For this reason kermit and cu are two of the recommended applications for
+interacting with a Linux console via an hvcs device.  These programs simply
+act as a conduit for data transfer to and from the tty device.  They do not
+require inbound data to take the form of a particular terminal type, nor do
+they cook outbound data to a particular terminal type.
+
+In order to ensure proper functioning of console applications one must make
+sure that once connected to a /dev/hvcs console that the console's $TERM
+env variable is set to the exact terminal type of the terminal emulator
+used to launch the interactive I/O application.  If one is using xterm and
+kermit to connect to /dev/hvcs0 when the console prompt becomes available
+one should "export TERM=xterm" on the console.  This tells ncurses
+applications that are invoked from the console that they should output
+control sequences that xterm can understand.
+
+As a precautionary measure an hvcs user should always "exit" from their
+session before disconnecting an application such as kermit from the device
+node.  If this is not done, the next user to connect to the console will
+continue using the previous user's logged in session which includes
+using the $TERM variable that the previous user supplied.
+
+Hotplug add and remove of vty-server adapters affects which /dev/hvcs* node
+is used to connect to each vty-server adapter.  In order to determine which
+vty-server adapter is associated with which /dev/hvcs* node a special sysfs
+attribute has been added to each vty-server sysfs entry.  This entry is
+called "index" and showing it reveals an integer that refers to the
+/dev/hvcs* entry to use to connect to that device.  For instance cating the
+index attribute of vty-server adapter 30000004 shows the following.
+
+	Pow5:/sys/bus/vio/drivers/hvcs/30000004 # cat index
+	2
+
+This index of '2' means that in order to connect to vty-server adapter
+30000004 the user should interact with /dev/hvcs2.
+
+It should be noted that due to the system hotplug I/O capabilities of a
+system the /dev/hvcs* entry that interacts with a particular vty-server
+adapter is not guaranteed to remain the same across system reboots.  Look
+in the Q & A section for more on this issue.
+
+---------------------------------------------------------------------------
+6. Disconnection
+
+As a security feature to prevent the delivery of stale data to an
+unintended target the Power5 system firmware disables the fetching of data
+and discards that data when a connection between a vty-server and a vty has
+been severed.  As an example, when a vty-server is immediately disconnected
+from a vty following output of data to the vty the vty adapter may not have
+enough time between when it received the data interrupt and when the
+connection was severed to fetch the data from firmware before the fetch is
+disabled by firmware.
+
+When hvcs is being used to serve consoles this behavior is not a huge issue
+because the adapter stays connected for large amounts of time following
+almost all data writes.  When hvcs is being used as a tty conduit to tunnel
+data between two partitions [see Q & A below] this is a huge problem
+because the standard Linux behavior when cat'ing or dd'ing data to a device
+is to open the tty, send the data, and then close the tty.  If this driver
+manually terminated vty-server connections on tty close this would close
+the vty-server and vty connection before the target vty has had a chance to
+fetch the data.
+
+Additionally, disconnecting a vty-server and vty only on module removal or
+adapter removal is impractical because other vty-servers in other
+partitions may require the usage of the target vty at any time.
+
+Due to this behavioral restriction disconnection of vty-servers from the
+connected vty is a manual procedure using a write to a sysfs attribute
+outlined below, on the other hand the initial vty-server connection to a
+vty is established automatically by this driver.  Manual vty-server
+connection is never required.
+
+In order to terminate the connection between a vty-server and vty the
+"vterm_state" sysfs attribute within each vty-server's sysfs entry is used.
+Reading this attribute reveals the current connection state of the
+vty-server adapter.  A zero means that the vty-server is not connected to a
+vty.  A one indicates that a connection is active.
+
+Writing a '0' (zero) to the vterm_state attribute will disconnect the VTERM
+connection between the vty-server and target vty ONLY if the vterm_state
+previously read '1'.  The write directive is ignored if the vterm_state
+read '0' or if any value other than '0' was written to the vterm_state
+attribute.  The following example will show the method used for verifying
+the vty-server connection status and disconnecting a vty-server connection.
+
+	Pow5:/sys/bus/vio/drivers/hvcs/30000004 # cat vterm_state
+	1
+
+	Pow5:/sys/bus/vio/drivers/hvcs/30000004 # echo 0 > vterm_state
+
+	Pow5:/sys/bus/vio/drivers/hvcs/30000004 # cat vterm_state
+	0
+
+All vty-server connections are automatically terminated when the device is
+hotplug removed and when the module is removed.
+
+---------------------------------------------------------------------------
+7. Configuration
+
+Each vty-server has a sysfs entry in the /sys/devices/vio directory, which
+is symlinked in several other sysfs tree directories, notably under the
+hvcs driver entry, which looks like the following example:
+
+	Pow5:/sys/bus/vio/drivers/hvcs # ls
+	.  ..  30000003  30000004  rescan
+
+By design, firmware notifies the hvcs driver of vty-server lifetimes and
+partner vty removals but not the addition of partner vtys.  Since an HMC
+Super Admin can add partner info dynamically we have provided the hvcs
+driver sysfs directory with the "rescan" update attribute which will query
+firmware and update the partner info for all the vty-servers that this
+driver manages.  Writing a '1' to the attribute triggers the update.  An
+explicit example follows:
+
+	Pow5:/sys/bus/vio/drivers/hvcs # echo 1 > rescan
+
+Reading the attribute will indicate a state of '1' or '0'.  A one indicates
+that an update is in process.  A zero indicates that an update has
+completed or was never executed.
+
+Vty-server entries in this directory are a 32 bit partition unique unit
+address that is created by firmware.  An example vty-server sysfs entry
+looks like the following:
+
+	Pow5:/sys/bus/vio/drivers/hvcs/30000004 # ls
+	.   current_vty   devspec       name          partner_vtys
+	..  index         partner_clcs  vterm_state
+
+Each entry is provided, by default with a "name" attribute.  Reading the
+"name" attribute will reveal the device type as shown in the following
+example:
+
+	Pow5:/sys/bus/vio/drivers/hvcs/30000003 # cat name
+	vty-server
+
+Each entry is also provided, by default, with a "devspec" attribute which
+reveals the full device specification when read, as shown in the following
+example:
+
+	Pow5:/sys/bus/vio/drivers/hvcs/30000004 # cat devspec
+	/vdevice/vty-server@30000004
+
+Each vty-server sysfs dir is provided with two read-only attributes that
+provide lists of easily parsed partner vty data: "partner_vtys" and
+"partner_clcs".
+
+	Pow5:/sys/bus/vio/drivers/hvcs/30000004 # cat partner_vtys
+	30000000
+	30000001
+	30000002
+	30000000
+	30000000
+
+	Pow5:/sys/bus/vio/drivers/hvcs/30000004 # cat partner_clcs
+	U5112.428.103048A-V3-C0
+	U5112.428.103048A-V3-C2
+	U5112.428.103048A-V3-C3
+	U5112.428.103048A-V4-C0
+	U5112.428.103048A-V5-C0
+
+Reading partner_vtys returns a list of partner vtys.  Vty unit address
+numbering is only per-partition-unique so entries will frequently repeat.
+
+Reading partner_clcs returns a list of "converged location codes" which are
+composed of a system serial number followed by "-V*", where the '*' is the
+target partition number, and "-C*", where the '*' is the slot of the
+adapter.  The first vty partner corresponds to the first clc item, the
+second vty partner to the second clc item, etc.
+
+A vty-server can only be connected to a single vty at a time.  The entry,
+"current_vty" prints the clc of the currently selected partner vty when
+read.
+
+The current_vty can be changed by writing a valid partner clc to the entry
+as in the following example:
+
+	Pow5:/sys/bus/vio/drivers/hvcs/30000004 # echo U5112.428.10304
+	8A-V4-C0 > current_vty
+
+Changing the current_vty when a vty-server is already connected to a vty
+does not affect the current connection.  The change takes effect when the
+currently open connection is freed.
+
+Information on the "vterm_state" attribute was covered earlier on the
+chapter entitled "disconnection".
+
+---------------------------------------------------------------------------
+8. Questions & Answers:
+===========================================================================
+Q: What are the security concerns involving hvcs?
+
+A: There are three main security concerns:
+
+	1. The creator of the /dev/hvcs* nodes has the ability to restrict
+	the access of the device entries to certain users or groups.  It
+	may be best to create a special hvcs group privilege for providing
+	access to system consoles.
+
+	2. To provide network security when grabbing the console it is
+	suggested that the user connect to the console hosting partition
+	using a secure method, such as SSH or sit at a hardware console.
+
+	3. Make sure to exit the user session when done with a console or
+	the next vty-server connection (which may be from another
+	partition) will experience the previously logged in session.
+
+---------------------------------------------------------------------------
+Q: How do I multiplex a console that I grab through hvcs so that other
+people can see it:
+
+A: You can use "screen" to directly connect to the /dev/hvcs* device and
+setup a session on your machine with the console group privileges.  As
+pointed out earlier by default screen doesn't provide the termcap settings
+for most terminal emulators to provide adequate character conversion from
+term type "screen" to others.  This means that curses based programs may
+not display properly in screen sessions.
+
+---------------------------------------------------------------------------
+Q: Why are the colors all messed up?
+Q: Why are the control characters acting strange or not working?
+Q: Why is the console output all strange and unintelligible?
+
+A: Please see the preceding section on "Connection" for a discussion of how
+applications can affect the display of character control sequences.
+Additionally, just because you logged into the console using and xterm
+doesn't mean someone else didn't log into the console with the HMC console
+(vt320) before you and leave the session logged in.  The best thing to do
+is to export TERM to the terminal type of your terminal emulator when you
+get the console.  Additionally make sure to "exit" the console before you
+disconnect from the console.  This will ensure that the next user gets
+their own TERM type set when they login.
+
+---------------------------------------------------------------------------
+Q: When I try to CONNECT kermit to an hvcs device I get:
+"Sorry, can't open connection: /dev/hvcs*"What is happening?
+
+A: Some other Power5 console mechanism has a connection to the vty and
+isn't giving it up.  You can try to force disconnect the consoles from the
+HMC by right clicking on the partition and then selecting "close terminal".
+Otherwise you have to hunt down the people who have console authority.  It
+is possible that you already have the console open using another kermit
+session and just forgot about it.  Please review the console options for
+Power5 systems to determine the many ways a system console can be held.
+
+OR
+
+A: Another user may not have a connectivity method currently attached to a
+/dev/hvcs device but the vterm_state may reveal that they still have the
+vty-server connection established.  They need to free this using the method
+outlined in the section on "Disconnection" in order for others to connect
+to the target vty.
+
+OR
+
+A: The user profile you are using to execute kermit probably doesn't have
+permissions to use the /dev/hvcs* device.
+
+OR
+
+A: You probably haven't inserted the hvcs.ko module yet but the /dev/hvcs*
+entry still exists (on systems without udev).
+
+OR
+
+A: There is not a corresponding vty-server device that maps to an existing
+/dev/hvcs* entry.
+
+---------------------------------------------------------------------------
+Q: When I try to CONNECT kermit to an hvcs device I get:
+"Sorry, write access to UUCP lockfile directory denied."
+
+A: The /dev/hvcs* entry you have specified doesn't exist where you said it
+does?  Maybe you haven't inserted the module (on systems with udev).
+
+---------------------------------------------------------------------------
+Q: If I already have one Linux partition installed can I use hvcs on said
+partition to provide the console for the install of a second Linux
+partition?
+
+A: Yes granted that your are connected to the /dev/hvcs* device using
+kermit or cu or some other program that doesn't provide terminal emulation.
+
+---------------------------------------------------------------------------
+Q: Can I connect to more than one partition's console at a time using this
+driver?
+
+A: Yes.  Of course this means that there must be more than one vty-server
+configured for this partition and each must point to a disconnected vty.
+
+---------------------------------------------------------------------------
+Q: Does the hvcs driver support dynamic (hotplug) addition of devices?
+
+A: Yes, if you have dlpar and hotplug enabled for your system and it has
+been built into the kernel the hvcs drivers is configured to dynamically
+handle additions of new devices and removals of unused devices.
+
+---------------------------------------------------------------------------
+Q: For some reason /dev/hvcs* doesn't map to the same vty-server adapter
+after a reboot.  What happened?
+
+A: Assignment of vty-server adapters to /dev/hvcs* entries is always done
+in the order that the adapters are exposed.  Due to hotplug capabilities of
+this driver assignment of hotplug added vty-servers may be in a different
+order than how they would be exposed on module load.  Rebooting or
+reloading the module after dynamic addition may result in the /dev/hvcs*
+and vty-server coupling changing if a vty-server adapter was added in a
+slot between two other vty-server adapters.  Refer to the section above
+on how to determine which vty-server goes with which /dev/hvcs* node.
+Hint; look at the sysfs "index" attribute for the vty-server.
+
+---------------------------------------------------------------------------
+Q: Can I use /dev/hvcs* as a conduit to another partition and use a tty
+device on that partition as the other end of the pipe?
+
+A: Yes, on Power5 platforms the hvc_console driver provides a tty interface
+for extra /dev/hvc* devices (where /dev/hvc0 is most likely the console).
+In order to get a tty conduit working between the two partitions the HMC
+Super Admin must create an additional "serial server" for the target
+partition with the HMC gui which will show up as /dev/hvc* when the target
+partition is rebooted.
+
+The HMC Super Admin then creates an additional "serial client" for the
+current partition and points this at the target partition's newly created
+"serial server" adapter (remember the slot).  This shows up as an
+additional /dev/hvcs* device.
+
+Now a program on the target system can be configured to read or write to
+/dev/hvc* and another program on the current partition can be configured to
+read or write to /dev/hvcs*.  Now you have a tty conduit between two
+partitions.
+
+---------------------------------------------------------------------------
+9. Reporting Bugs:
+
+The proper channel for reporting bugs is either through the Linux OS
+distribution company that provided your OS or by posting issues to the
+PowerPC development mailing list at:
+
+linuxppc-dev@lists.ozlabs.org
+
+This request is to provide a documented and searchable public exchange
+of the problems and solutions surrounding this driver for the benefit of
+all users.
diff --git a/Documentation/powerpc/kvm_440.txt b/Documentation/powerpc/kvm_440.txt
new file mode 100644
index 00000000..c02a003f
--- /dev/null
+++ b/Documentation/powerpc/kvm_440.txt
@@ -0,0 +1,41 @@
+Hollis Blanchard <hollisb@us.ibm.com>
+15 Apr 2008
+
+Various notes on the implementation of KVM for PowerPC 440:
+
+To enforce isolation, host userspace, guest kernel, and guest userspace all
+run at user privilege level. Only the host kernel runs in supervisor mode.
+Executing privileged instructions in the guest traps into KVM (in the host
+kernel), where we decode and emulate them. Through this technique, unmodified
+440 Linux kernels can be run (slowly) as guests. Future performance work will
+focus on reducing the overhead and frequency of these traps.
+
+The usual code flow is started from userspace invoking an "run" ioctl, which
+causes KVM to switch into guest context. We use IVPR to hijack the host
+interrupt vectors while running the guest, which allows us to direct all
+interrupts to kvmppc_handle_interrupt(). At this point, we could either
+- handle the interrupt completely (e.g. emulate "mtspr SPRG0"), or
+- let the host interrupt handler run (e.g. when the decrementer fires), or
+- return to host userspace (e.g. when the guest performs device MMIO)
+
+Address spaces: We take advantage of the fact that Linux doesn't use the AS=1
+address space (in host or guest), which gives us virtual address space to use
+for guest mappings. While the guest is running, the host kernel remains mapped
+in AS=0, but the guest can only use AS=1 mappings.
+
+TLB entries: The TLB entries covering the host linear mapping remain
+present while running the guest. This reduces the overhead of lightweight
+exits, which are handled by KVM running in the host kernel. We keep three
+copies of the TLB:
+ - guest TLB: contents of the TLB as the guest sees it
+ - shadow TLB: the TLB that is actually in hardware while guest is running
+ - host TLB: to restore TLB state when context switching guest -> host
+When a TLB miss occurs because a mapping was not present in the shadow TLB,
+but was present in the guest TLB, KVM handles the fault without invoking the
+guest. Large guest pages are backed by multiple 4KB shadow pages through this
+mechanism.
+
+IO: MMIO and DCR accesses are emulated by userspace. We use virtio for network
+and block IO, so those drivers must be enabled in the guest. It's possible
+that some qemu device emulation (e.g. e1000 or rtl8139) may also work with
+little effort.
diff --git a/Documentation/powerpc/mpc52xx.txt b/Documentation/powerpc/mpc52xx.txt
new file mode 100644
index 00000000..0d540a31
--- /dev/null
+++ b/Documentation/powerpc/mpc52xx.txt
@@ -0,0 +1,39 @@
+Linux 2.6.x on MPC52xx family
+-----------------------------
+
+For the latest info, go to http://www.246tNt.com/mpc52xx/
+
+To compile/use :
+
+  - U-Boot:
+     # <edit Makefile to set ARCH=ppc & CROSS_COMPILE=... ( also EXTRAVERSION
+        if you wish to ).
+     # make lite5200_defconfig
+     # make uImage
+
+     then, on U-boot:
+     => tftpboot 200000 uImage
+     => tftpboot 400000 pRamdisk
+     => bootm 200000 400000
+
+  - DBug:
+     # <edit Makefile to set ARCH=ppc & CROSS_COMPILE=... ( also EXTRAVERSION
+        if you wish to ).
+     # make lite5200_defconfig
+     # cp your_initrd.gz arch/ppc/boot/images/ramdisk.image.gz
+     # make zImage.initrd
+     # make
+
+     then in DBug:
+     DBug> dn -i zImage.initrd.lite5200
+
+
+Some remarks :
+ - The port is named mpc52xxx, and config options are PPC_MPC52xx. The MGT5100
+   is not supported, and I'm not sure anyone is interesting in working on it
+   so. I didn't took 5xxx because there's apparently a lot of 5xxx that have
+   nothing to do with the MPC5200. I also included the 'MPC' for the same
+   reason.
+ - Of course, I inspired myself from the 2.4 port. If you think I forgot to
+   mention you/your company in the copyright of some code, I'll correct it
+   ASAP.
diff --git a/Documentation/powerpc/ptrace.txt b/Documentation/powerpc/ptrace.txt
new file mode 100644
index 00000000..f2a7a391
--- /dev/null
+++ b/Documentation/powerpc/ptrace.txt
@@ -0,0 +1,150 @@
+GDB intends to support the following hardware debug features of BookE
+processors:
+
+4 hardware breakpoints (IAC)
+2 hardware watchpoints (read, write and read-write) (DAC)
+2 value conditions for the hardware watchpoints (DVC)
+
+For that, we need to extend ptrace so that GDB can query and set these
+resources. Since we're extending, we're trying to create an interface
+that's extendable and that covers both BookE and server processors, so
+that GDB doesn't need to special-case each of them. We added the
+following 3 new ptrace requests.
+
+1. PTRACE_PPC_GETHWDEBUGINFO
+
+Query for GDB to discover the hardware debug features. The main info to
+be returned here is the minimum alignment for the hardware watchpoints.
+BookE processors don't have restrictions here, but server processors have
+an 8-byte alignment restriction for hardware watchpoints. We'd like to avoid
+adding special cases to GDB based on what it sees in AUXV.
+
+Since we're at it, we added other useful info that the kernel can return to
+GDB: this query will return the number of hardware breakpoints, hardware
+watchpoints and whether it supports a range of addresses and a condition.
+The query will fill the following structure provided by the requesting process:
+
+struct ppc_debug_info {
+       unit32_t version;
+       unit32_t num_instruction_bps;
+       unit32_t num_data_bps;
+       unit32_t num_condition_regs;
+       unit32_t data_bp_alignment;
+       unit32_t sizeof_condition; /* size of the DVC register */
+       uint64_t features; /* bitmask of the individual flags */
+};
+
+features will have bits indicating whether there is support for:
+
+#define PPC_DEBUG_FEATURE_INSN_BP_RANGE		0x1
+#define PPC_DEBUG_FEATURE_INSN_BP_MASK		0x2
+#define PPC_DEBUG_FEATURE_DATA_BP_RANGE		0x4
+#define PPC_DEBUG_FEATURE_DATA_BP_MASK		0x8
+
+2. PTRACE_SETHWDEBUG
+
+Sets a hardware breakpoint or watchpoint, according to the provided structure:
+
+struct ppc_hw_breakpoint {
+        uint32_t version;
+#define PPC_BREAKPOINT_TRIGGER_EXECUTE  0x1
+#define PPC_BREAKPOINT_TRIGGER_READ     0x2
+#define PPC_BREAKPOINT_TRIGGER_WRITE    0x4
+        uint32_t trigger_type;       /* only some combinations allowed */
+#define PPC_BREAKPOINT_MODE_EXACT               0x0
+#define PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE     0x1
+#define PPC_BREAKPOINT_MODE_RANGE_EXCLUSIVE     0x2
+#define PPC_BREAKPOINT_MODE_MASK                0x3
+        uint32_t addr_mode;          /* address match mode */
+
+#define PPC_BREAKPOINT_CONDITION_MODE   0x3
+#define PPC_BREAKPOINT_CONDITION_NONE   0x0
+#define PPC_BREAKPOINT_CONDITION_AND    0x1
+#define PPC_BREAKPOINT_CONDITION_EXACT  0x1	/* different name for the same thing as above */
+#define PPC_BREAKPOINT_CONDITION_OR     0x2
+#define PPC_BREAKPOINT_CONDITION_AND_OR 0x3
+#define PPC_BREAKPOINT_CONDITION_BE_ALL 0x00ff0000	/* byte enable bits */
+#define PPC_BREAKPOINT_CONDITION_BE(n)  (1<<((n)+16))
+        uint32_t condition_mode;     /* break/watchpoint condition flags */
+
+        uint64_t addr;
+        uint64_t addr2;
+        uint64_t condition_value;
+};
+
+A request specifies one event, not necessarily just one register to be set.
+For instance, if the request is for a watchpoint with a condition, both the
+DAC and DVC registers will be set in the same request.
+
+With this GDB can ask for all kinds of hardware breakpoints and watchpoints
+that the BookE supports. COMEFROM breakpoints available in server processors
+are not contemplated, but that is out of the scope of this work.
+
+ptrace will return an integer (handle) uniquely identifying the breakpoint or
+watchpoint just created. This integer will be used in the PTRACE_DELHWDEBUG
+request to ask for its removal. Return -ENOSPC if the requested breakpoint
+can't be allocated on the registers.
+
+Some examples of using the structure to:
+
+- set a breakpoint in the first breakpoint register
+
+  p.version         = PPC_DEBUG_CURRENT_VERSION;
+  p.trigger_type    = PPC_BREAKPOINT_TRIGGER_EXECUTE;
+  p.addr_mode       = PPC_BREAKPOINT_MODE_EXACT;
+  p.condition_mode  = PPC_BREAKPOINT_CONDITION_NONE;
+  p.addr            = (uint64_t) address;
+  p.addr2           = 0;
+  p.condition_value = 0;
+
+- set a watchpoint which triggers on reads in the second watchpoint register
+
+  p.version         = PPC_DEBUG_CURRENT_VERSION;
+  p.trigger_type    = PPC_BREAKPOINT_TRIGGER_READ;
+  p.addr_mode       = PPC_BREAKPOINT_MODE_EXACT;
+  p.condition_mode  = PPC_BREAKPOINT_CONDITION_NONE;
+  p.addr            = (uint64_t) address;
+  p.addr2           = 0;
+  p.condition_value = 0;
+
+- set a watchpoint which triggers only with a specific value
+
+  p.version         = PPC_DEBUG_CURRENT_VERSION;
+  p.trigger_type    = PPC_BREAKPOINT_TRIGGER_READ;
+  p.addr_mode       = PPC_BREAKPOINT_MODE_EXACT;
+  p.condition_mode  = PPC_BREAKPOINT_CONDITION_AND | PPC_BREAKPOINT_CONDITION_BE_ALL;
+  p.addr            = (uint64_t) address;
+  p.addr2           = 0;
+  p.condition_value = (uint64_t) condition;
+
+- set a ranged hardware breakpoint
+
+  p.version         = PPC_DEBUG_CURRENT_VERSION;
+  p.trigger_type    = PPC_BREAKPOINT_TRIGGER_EXECUTE;
+  p.addr_mode       = PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE;
+  p.condition_mode  = PPC_BREAKPOINT_CONDITION_NONE;
+  p.addr            = (uint64_t) begin_range;
+  p.addr2           = (uint64_t) end_range;
+  p.condition_value = 0;
+
+- set a watchpoint in server processors (BookS)
+
+  p.version         = 1;
+  p.trigger_type    = PPC_BREAKPOINT_TRIGGER_RW;
+  p.addr_mode       = PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE;
+  or
+  p.addr_mode       = PPC_BREAKPOINT_MODE_EXACT;
+
+  p.condition_mode  = PPC_BREAKPOINT_CONDITION_NONE;
+  p.addr            = (uint64_t) begin_range;
+  /* For PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE addr2 needs to be specified, where
+   * addr2 - addr <= 8 Bytes.
+   */
+  p.addr2           = (uint64_t) end_range;
+  p.condition_value = 0;
+
+3. PTRACE_DELHWDEBUG
+
+Takes an integer which identifies an existing breakpoint or watchpoint
+(i.e., the value returned from PTRACE_SETHWDEBUG), and deletes the
+corresponding breakpoint or watchpoint..
diff --git a/Documentation/powerpc/qe_firmware.txt b/Documentation/powerpc/qe_firmware.txt
new file mode 100644
index 00000000..2031ddb3
--- /dev/null
+++ b/Documentation/powerpc/qe_firmware.txt
@@ -0,0 +1,295 @@
+	   Freescale QUICC Engine Firmware Uploading
+	   -----------------------------------------
+
+(c) 2007 Timur Tabi <timur at freescale.com>,
+    Freescale Semiconductor
+
+Table of Contents
+=================
+
+  I - Software License for Firmware
+
+  II - Microcode Availability
+
+  III - Description and Terminology
+
+  IV - Microcode Programming Details
+
+  V - Firmware Structure Layout
+
+  VI - Sample Code for Creating Firmware Files
+
+Revision Information
+====================
+
+November 30, 2007: Rev 1.0 - Initial version
+
+I - Software License for Firmware
+=================================
+
+Each firmware file comes with its own software license.  For information on
+the particular license, please see the license text that is distributed with
+the firmware.
+
+II - Microcode Availability
+===========================
+
+Firmware files are distributed through various channels.  Some are available on
+http://opensource.freescale.com.  For other firmware files, please contact
+your Freescale representative or your operating system vendor.
+
+III - Description and Terminology
+================================
+
+In this document, the term 'microcode' refers to the sequence of 32-bit
+integers that compose the actual QE microcode.
+
+The term 'firmware' refers to a binary blob that contains the microcode as
+well as other data that
+
+	1) describes the microcode's purpose
+	2) describes how and where to upload the microcode
+	3) specifies the values of various registers
+	4) includes additional data for use by specific device drivers
+
+Firmware files are binary files that contain only a firmware.
+
+IV - Microcode Programming Details
+===================================
+
+The QE architecture allows for only one microcode present in I-RAM for each
+RISC processor.  To replace any current microcode, a full QE reset (which
+disables the microcode) must be performed first.
+
+QE microcode is uploaded using the following procedure:
+
+1) The microcode is placed into I-RAM at a specific location, using the
+   IRAM.IADD and IRAM.IDATA registers.
+
+2) The CERCR.CIR bit is set to 0 or 1, depending on whether the firmware
+   needs split I-RAM.  Split I-RAM is only meaningful for SOCs that have
+   QEs with multiple RISC processors, such as the 8360.  Splitting the I-RAM
+   allows each processor to run a different microcode, effectively creating an
+   asymmetric multiprocessing (AMP) system.
+
+3) The TIBCR trap registers are loaded with the addresses of the trap handlers
+   in the microcode.
+
+4) The RSP.ECCR register is programmed with the value provided.
+
+5) If necessary, device drivers that need the virtual traps and extended mode
+   data will use them.
+
+Virtual Microcode Traps
+
+These virtual traps are conditional branches in the microcode.  These are
+"soft" provisional introduced in the ROMcode in order to enable higher
+flexibility and save h/w traps If new features are activated or an issue is
+being fixed in the RAM package utilizing they should be activated.  This data
+structure signals the microcode which of these virtual traps is active.
+
+This structure contains 6 words that the application should copy to some
+specific been defined.  This table describes the structure.
+
+	---------------------------------------------------------------
+	| Offset in |                  | Destination Offset | Size of |
+	|   array   |     Protocol     |   within PRAM      | Operand |
+	--------------------------------------------------------------|
+	|     0     | Ethernet         |      0xF8          | 4 bytes |
+	|           | interworking     |                    |         |
+	---------------------------------------------------------------
+	|     4     | ATM              |      0xF8          | 4 bytes |
+	|           | interworking     |                    |         |
+	---------------------------------------------------------------
+	|     8     | PPP              |      0xF8          | 4 bytes |
+	|           | interworking     |                    |         |
+	---------------------------------------------------------------
+	|     12    | Ethernet RX      |      0x22          | 1 byte  |
+	|           | Distributor Page |                    |         |
+	---------------------------------------------------------------
+	|     16    | ATM Globtal      |      0x28          | 1 byte  |
+	|           | Params Table     |                    |         |
+	---------------------------------------------------------------
+	|     20    | Insert Frame     |      0xF8          | 4 bytes |
+	---------------------------------------------------------------
+
+
+Extended Modes
+
+This is a double word bit array (64 bits) that defines special functionality
+which has an impact on the softwarew drivers.  Each bit has its own impact
+and has special instructions for the s/w associated with it.  This structure is
+described in this table:
+
+	-----------------------------------------------------------------------
+	| Bit #  |     Name     |   Description                               |
+	-----------------------------------------------------------------------
+	|   0    | General      | Indicates that prior to each host command   |
+	|        | push command | given by the application, the software must |
+	|        |              | assert a special host command (push command)|
+	|        |              | CECDR = 0x00800000.                         |
+	|        |              | CECR = 0x01c1000f.                          |
+	-----------------------------------------------------------------------
+	|   1    | UCC ATM      | Indicates that after issuing ATM RX INIT    |
+	|        | RX INIT      | command, the host must issue another special|
+	|        | push command | command (push command) and immediately      |
+	|        |              | following that re-issue the ATM RX INIT     |
+	|        |              | command. (This makes the sequence of        |
+	|        |              | initializing the ATM receiver a sequence of |
+	|        |              | three host commands)                        |
+	|        |              | CECDR = 0x00800000.                         |
+	|        |              | CECR = 0x01c1000f.                          |
+	-----------------------------------------------------------------------
+	|   2    | Add/remove   | Indicates that following the specific host  |
+	|        | command      | command: "Add/Remove entry in Hash Lookup   |
+	|        | validation   | Table" used in Interworking setup, the user |
+	|        |              | must issue another command.                 |
+	|        |              | CECDR = 0xce000003.                         |
+	|        |              | CECR = 0x01c10f58.                          |
+	-----------------------------------------------------------------------
+	|   3    | General push | Indicates that the s/w has to initialize    |
+	|        | command      | some pointers in the Ethernet thread pages  |
+	|        |              | which are used when Header Compression is   |
+	|        |              | activated.  The full details of these       |
+	|        |              | pointers is located in the software drivers.|
+	-----------------------------------------------------------------------
+	|   4    | General push | Indicates that after issuing Ethernet TX    |
+	|        | command      | INIT command, user must issue this command  |
+	|        |              | for each SNUM of Ethernet TX thread.        |
+	|        |              | CECDR = 0x00800003.                         |
+	|        |              | CECR = 0x7'b{0}, 8'b{Enet TX thread SNUM},  |
+	|        |              |        1'b{1}, 12'b{0}, 4'b{1}              |
+	-----------------------------------------------------------------------
+	| 5 - 31 |     N/A      | Reserved, set to zero.                      |
+	-----------------------------------------------------------------------
+
+V - Firmware Structure Layout
+==============================
+
+QE microcode from Freescale is typically provided as a header file.  This
+header file contains macros that define the microcode binary itself as well as
+some other data used in uploading that microcode.  The format of these files
+do not lend themselves to simple inclusion into other code.  Hence,
+the need for a more portable format.  This section defines that format.
+
+Instead of distributing a header file, the microcode and related data are
+embedded into a binary blob.  This blob is passed to the qe_upload_firmware()
+function, which parses the blob and performs everything necessary to upload
+the microcode.
+
+All integers are big-endian.  See the comments for function
+qe_upload_firmware() for up-to-date implementation information.
+
+This structure supports versioning, where the version of the structure is
+embedded into the structure itself.  To ensure forward and backwards
+compatibility, all versions of the structure must use the same 'qe_header'
+structure at the beginning.
+
+'header' (type: struct qe_header):
+	The 'length' field is the size, in bytes, of the entire structure,
+	including all the microcode embedded in it, as well as the CRC (if
+	present).
+
+	The 'magic' field is an array of three bytes that contains the letters
+	'Q', 'E', and 'F'.  This is an identifier that indicates that this
+	structure is a QE Firmware structure.
+
+	The 'version' field is a single byte that indicates the version of this
+	structure.  If the layout of the structure should ever need to be
+	changed to add support for additional types of microcode, then the
+	version number should also be changed.
+
+The 'id' field is a null-terminated string(suitable for printing) that
+identifies the firmware.
+
+The 'count' field indicates the number of 'microcode' structures.  There
+must be one and only one 'microcode' structure for each RISC processor.
+Therefore, this field also represents the number of RISC processors for this
+SOC.
+
+The 'soc' structure contains the SOC numbers and revisions used to match
+the microcode to the SOC itself.  Normally, the microcode loader should
+check the data in this structure with the SOC number and revisions, and
+only upload the microcode if there's a match.  However, this check is not
+made on all platforms.
+
+Although it is not recommended, you can specify '0' in the soc.model
+field to skip matching SOCs altogether.
+
+The 'model' field is a 16-bit number that matches the actual SOC. The
+'major' and 'minor' fields are the major and minor revision numbers,
+respectively, of the SOC.
+
+For example, to match the 8323, revision 1.0:
+     soc.model = 8323
+     soc.major = 1
+     soc.minor = 0
+
+'padding' is necessary for structure alignment.  This field ensures that the
+'extended_modes' field is aligned on a 64-bit boundary.
+
+'extended_modes' is a bitfield that defines special functionality which has an
+impact on the device drivers.  Each bit has its own impact and has special
+instructions for the driver associated with it.  This field is stored in
+the QE library and available to any driver that calles qe_get_firmware_info().
+
+'vtraps' is an array of 8 words that contain virtual trap values for each
+virtual traps.  As with 'extended_modes', this field is stored in the QE
+library and available to any driver that calles qe_get_firmware_info().
+
+'microcode' (type: struct qe_microcode):
+	For each RISC processor there is one 'microcode' structure.  The first
+	'microcode' structure is for the first RISC, and so on.
+
+	The 'id' field is a null-terminated string suitable for printing that
+	identifies this particular microcode.
+
+	'traps' is an array of 16 words that contain hardware trap values
+	for each of the 16 traps.  If trap[i] is 0, then this particular
+	trap is to be ignored (i.e. not written to TIBCR[i]).  The entire value
+	is written as-is to the TIBCR[i] register, so be sure to set the EN
+	and T_IBP bits if necessary.
+
+	'eccr' is the value to program into the ECCR register.
+
+	'iram_offset' is the offset into IRAM to start writing the
+	microcode.
+
+	'count' is the number of 32-bit words in the microcode.
+
+	'code_offset' is the offset, in bytes, from the beginning of this
+	structure where the microcode itself can be found.  The first
+	microcode binary should be located immediately after the 'microcode'
+	array.
+
+	'major', 'minor', and 'revision' are the major, minor, and revision
+	version numbers, respectively, of the microcode.  If all values are 0,
+	then these fields are ignored.
+
+	'reserved' is necessary for structure alignment.  Since 'microcode'
+	is an array, the 64-bit 'extended_modes' field needs to be aligned
+	on a 64-bit boundary, and this can only happen if the size of
+	'microcode' is a multiple of 8 bytes.  To ensure that, we add
+	'reserved'.
+
+After the last microcode is a 32-bit CRC.  It can be calculated using
+this algorithm:
+
+u32 crc32(const u8 *p, unsigned int len)
+{
+	unsigned int i;
+	u32 crc = 0;
+
+	while (len--) {
+	   crc ^= *p++;
+	   for (i = 0; i < 8; i++)
+		   crc = (crc >> 1) ^ ((crc & 1) ? 0xedb88320 : 0);
+	}
+	return crc;
+}
+
+VI - Sample Code for Creating Firmware Files
+============================================
+
+A Python program that creates firmware binaries from the header files normally
+distributed by Freescale can be found on http://opensource.freescale.com.
diff --git a/Documentation/powerpc/sound.txt b/Documentation/powerpc/sound.txt
new file mode 100644
index 00000000..df23d95e
--- /dev/null
+++ b/Documentation/powerpc/sound.txt
@@ -0,0 +1,81 @@
+            Information about PowerPC Sound support
+=====================================================================
+
+Please mail me (Cort Dougan, cort@fsmlabs.com) if you have questions,
+comments or corrections.
+
+Last Change: 6.16.99
+
+This just covers sound on the PReP and CHRP systems for now and later
+will contain information on the PowerMac's.
+
+Sound on PReP has been tested and is working with the PowerStack and IBM
+Power Series onboard sound systems which are based on the cs4231(2) chip.
+The sound options when doing the make config are a bit different from
+the default, though.
+
+The I/O base, irq and dma lines that you enter during the make config
+are ignored and are set when booting according to the machine type.
+This is so that one binary can be used for Motorola and IBM machines
+which use different values and isn't allowed by the driver, so things
+are hacked together in such a way as to allow this information to be
+set automatically on boot.
+
+1. Motorola PowerStack PReP machines
+
+  Enable support for "Crystal CS4232 based (PnP) cards" and for the
+  Microsoft Sound System.  The MSS isn't used, but some of the routines
+  that the CS4232 driver uses are in it.
+
+  Although the options you set are ignored and determined automatically
+  on boot these are included for information only:
+
+  (830) CS4232 audio I/O base 530, 604, E80 or F40
+  (10) CS4232 audio IRQ 5, 7, 9, 11, 12 or 15
+  (6) CS4232 audio DMA 0, 1 or 3
+  (7) CS4232 second (duplex) DMA 0, 1 or 3
+
+  This will allow simultaneous record and playback, as 2 different dma
+  channels are used.
+
+  The sound will be all left channel and very low volume since the
+  auxiliary input isn't muted by default.  I had the changes necessary
+  for this in the kernel but the sound driver maintainer didn't want
+  to include them since it wasn't common in other machines.  To fix this
+  you need to mute it using a mixer utility of some sort (if you find one
+  please let me know) or by patching the driver yourself and recompiling.
+
+  There is a problem on the PowerStack 2's (PowerStack Pro's) using a
+  different irq/drq than the kernel expects.  Unfortunately, I don't know
+  which irq/drq it is so if anyone knows please email me.
+
+  Midi is not supported since the cs4232 driver doesn't support midi yet.
+
+2. IBM PowerPersonal PReP machines
+
+  I've only tested sound on the Power Personal Series of IBM workstations
+  so if you try it on others please let me know the result.  I'm especially
+  interested in the 43p's sound system, which I know nothing about.
+
+  Enable support for "Crystal CS4232 based (PnP) cards" and for the
+  Microsoft Sound System.  The MSS isn't used, but some of the routines
+  that the CS4232 driver uses are in it.
+
+  Although the options you set are ignored and determined automatically
+  on boot these are included for information only:
+
+  (530) CS4232 audio I/O base 530, 604, E80 or F40
+  (5) CS4232 audio IRQ 5, 7, 9, 11, 12 or 15
+  (1) CS4232 audio DMA 0, 1 or 3
+  (7) CS4232 second (duplex) DMA 0, 1 or 3
+  (330) CS4232 MIDI I/O base 330, 370, 3B0 or 3F0
+  (9) CS4232 MIDI IRQ 5, 7, 9, 11, 12 or 15
+
+  This setup does _NOT_ allow for recording yet.
+
+  Midi is not supported since the cs4232 driver doesn't support midi yet.
+
+2. IBM CHRP
+
+  I have only tested this on the 43P-150.  Build the kernel with the cs4232
+  set as a module and load the module with irq=9 dma=1 dma2=2 io=0x550
diff --git a/Documentation/powerpc/transactional_memory.txt b/Documentation/powerpc/transactional_memory.txt
new file mode 100644
index 00000000..c907be41
--- /dev/null
+++ b/Documentation/powerpc/transactional_memory.txt
@@ -0,0 +1,175 @@
+Transactional Memory support
+============================
+
+POWER kernel support for this feature is currently limited to supporting
+its use by user programs.  It is not currently used by the kernel itself.
+
+This file aims to sum up how it is supported by Linux and what behaviour you
+can expect from your user programs.
+
+
+Basic overview
+==============
+
+Hardware Transactional Memory is supported on POWER8 processors, and is a
+feature that enables a different form of atomic memory access.  Several new
+instructions are presented to delimit transactions; transactions are
+guaranteed to either complete atomically or roll back and undo any partial
+changes.
+
+A simple transaction looks like this:
+
+begin_move_money:
+  tbegin
+  beq   abort_handler
+
+  ld    r4, SAVINGS_ACCT(r3)
+  ld    r5, CURRENT_ACCT(r3)
+  subi  r5, r5, 1
+  addi  r4, r4, 1
+  std   r4, SAVINGS_ACCT(r3)
+  std   r5, CURRENT_ACCT(r3)
+
+  tend
+
+  b     continue
+
+abort_handler:
+  ... test for odd failures ...
+
+  /* Retry the transaction if it failed because it conflicted with
+   * someone else: */
+  b     begin_move_money
+
+
+The 'tbegin' instruction denotes the start point, and 'tend' the end point.
+Between these points the processor is in 'Transactional' state; any memory
+references will complete in one go if there are no conflicts with other
+transactional or non-transactional accesses within the system.  In this
+example, the transaction completes as though it were normal straight-line code
+IF no other processor has touched SAVINGS_ACCT(r3) or CURRENT_ACCT(r3); an
+atomic move of money from the current account to the savings account has been
+performed.  Even though the normal ld/std instructions are used (note no
+lwarx/stwcx), either *both* SAVINGS_ACCT(r3) and CURRENT_ACCT(r3) will be
+updated, or neither will be updated.
+
+If, in the meantime, there is a conflict with the locations accessed by the
+transaction, the transaction will be aborted by the CPU.  Register and memory
+state will roll back to that at the 'tbegin', and control will continue from
+'tbegin+4'.  The branch to abort_handler will be taken this second time; the
+abort handler can check the cause of the failure, and retry.
+
+Checkpointed registers include all GPRs, FPRs, VRs/VSRs, LR, CCR/CR, CTR, FPCSR
+and a few other status/flag regs; see the ISA for details.
+
+Causes of transaction aborts
+============================
+
+- Conflicts with cache lines used by other processors
+- Signals
+- Context switches
+- See the ISA for full documentation of everything that will abort transactions.
+
+
+Syscalls
+========
+
+Performing syscalls from within transaction is not recommended, and can lead
+to unpredictable results.
+
+Syscalls do not by design abort transactions, but beware: The kernel code will
+not be running in transactional state.  The effect of syscalls will always
+remain visible, but depending on the call they may abort your transaction as a
+side-effect, read soon-to-be-aborted transactional data that should not remain
+invisible, etc.  If you constantly retry a transaction that constantly aborts
+itself by calling a syscall, you'll have a livelock & make no progress.
+
+Simple syscalls (e.g. sigprocmask()) "could" be OK.  Even things like write()
+from, say, printf() should be OK as long as the kernel does not access any
+memory that was accessed transactionally.
+
+Consider any syscalls that happen to work as debug-only -- not recommended for
+production use.  Best to queue them up till after the transaction is over.
+
+
+Signals
+=======
+
+Delivery of signals (both sync and async) during transactions provides a second
+thread state (ucontext/mcontext) to represent the second transactional register
+state.  Signal delivery 'treclaim's to capture both register states, so signals
+abort transactions.  The usual ucontext_t passed to the signal handler
+represents the checkpointed/original register state; the signal appears to have
+arisen at 'tbegin+4'.
+
+If the sighandler ucontext has uc_link set, a second ucontext has been
+delivered.  For future compatibility the MSR.TS field should be checked to
+determine the transactional state -- if so, the second ucontext in uc->uc_link
+represents the active transactional registers at the point of the signal.
+
+For 64-bit processes, uc->uc_mcontext.regs->msr is a full 64-bit MSR and its TS
+field shows the transactional mode.
+
+For 32-bit processes, the mcontext's MSR register is only 32 bits; the top 32
+bits are stored in the MSR of the second ucontext, i.e. in
+uc->uc_link->uc_mcontext.regs->msr.  The top word contains the transactional
+state TS.
+
+However, basic signal handlers don't need to be aware of transactions
+and simply returning from the handler will deal with things correctly:
+
+Transaction-aware signal handlers can read the transactional register state
+from the second ucontext.  This will be necessary for crash handlers to
+determine, for example, the address of the instruction causing the SIGSEGV.
+
+Example signal handler:
+
+    void crash_handler(int sig, siginfo_t *si, void *uc)
+    {
+      ucontext_t *ucp = uc;
+      ucontext_t *transactional_ucp = ucp->uc_link;
+
+      if (ucp_link) {
+        u64 msr = ucp->uc_mcontext.regs->msr;
+        /* May have transactional ucontext! */
+#ifndef __powerpc64__
+        msr |= ((u64)transactional_ucp->uc_mcontext.regs->msr) << 32;
+#endif
+        if (MSR_TM_ACTIVE(msr)) {
+           /* Yes, we crashed during a transaction.  Oops. */
+   fprintf(stderr, "Transaction to be restarted at 0x%llx, but "
+                           "crashy instruction was at 0x%llx\n",
+                           ucp->uc_mcontext.regs->nip,
+                           transactional_ucp->uc_mcontext.regs->nip);
+        }
+      }
+
+      fix_the_problem(ucp->dar);
+    }
+
+
+Failure cause codes used by kernel
+==================================
+
+These are defined in <asm/reg.h>, and distinguish different reasons why the
+kernel aborted a transaction:
+
+ TM_CAUSE_RESCHED       Thread was rescheduled.
+ TM_CAUSE_FAC_UNAV      FP/VEC/VSX unavailable trap.
+ TM_CAUSE_SYSCALL       Currently unused; future syscalls that must abort
+                        transactions for consistency will use this.
+ TM_CAUSE_SIGNAL        Signal delivered.
+ TM_CAUSE_MISC          Currently unused.
+
+These can be checked by the user program's abort handler as TEXASR[0:7].
+
+
+GDB
+===
+
+GDB and ptrace are not currently TM-aware.  If one stops during a transaction,
+it looks like the transaction has just started (the checkpointed state is
+presented).  The transaction cannot then be continued and will take the failure
+handler route.  Furthermore, the transactional 2nd register state will be
+inaccessible.  GDB can currently be used on programs using TM, but not sensibly
+in parts within transactions.
diff --git a/Documentation/powerpc/zImage_layout.txt b/Documentation/powerpc/zImage_layout.txt
new file mode 100644
index 00000000..048e0150
--- /dev/null
+++ b/Documentation/powerpc/zImage_layout.txt
@@ -0,0 +1,47 @@
+          Information about the Linux/PPC kernel images
+=====================================================================
+
+Please mail me (Cort Dougan, cort@fsmlabs.com) if you have questions,
+comments or corrections.
+
+This document is meant to answer several questions I've had about how
+the PReP system boots and how Linux/PPC interacts with that mechanism.
+It would be nice if we could have information on how other architectures
+boot here as well.  If you have anything to contribute, please
+let me know.
+
+
+1. PReP boot file
+
+  This is the file necessary to boot PReP systems from floppy or
+  hard drive.  The firmware reads the PReP partition table entry
+  and will load the image accordingly.
+
+  To boot the zImage, copy it onto a floppy with dd if=zImage of=/dev/fd0h1440
+  or onto a PReP hard drive partition with dd if=zImage of=/dev/sda4
+  assuming you've created a PReP partition (type 0x41) with fdisk on
+  /dev/sda4.
+
+  The layout of the image format is:
+
+  0x0     +------------+
+          |            | PReP partition table entry
+          |            |
+  0x400   +------------+
+          |            | Bootstrap program code + data
+          |            |
+          |            |
+          +------------+
+          |            | compressed kernel, elf header removed
+          +------------+
+          |            | initrd (if loaded)
+          +------------+
+          |            | Elf section table for bootstrap program
+          +------------+
+
+
+2. MBX boot file
+
+  The MBX boards can load an elf image, and relocate it to the
+  proper location in memory - it copies the image to the location it was
+  linked at.