ryan.arnold/bernie.ogden/cbuild2.git


Cbuildv2 is a bourne shell rewrite of the existing cbuild system as
used by Linaro. While being oriented to the Linaro way of doing
things, Cbuildv2 should be usable by others by just reconfiguring.

Configuring Cbuildv2:
---------------------
  While it is possible to run Cbuildv2 from it's source tree, this
isn't recommended. The best practice is to create a new build
directory, and configure Cbuildv2 in that directory. That makes it
easy to change branches, or even delete subdirectories. There
are defaults for all paths, which is to create them under the same
directory configure is run in.

  There are several directories that Cbuildv2 needs. These are the
following:

  * local snapshots - This is where all the downloaded sources get
    stored. The default is the current directory under snapshots.
  * local build - This is where all the executables get built. The
    default is to have builds go in a top level directory which is the
    full hostname of the build machine. Under this a directory is
    created for the target architecture.
  * remote snapshots - This is where remote tarballs are stored. This
    currently defaults to cbuild.validation.linaro.org, which is
    accessible via HTTP to the general public.

  If configure is executed without any parameters. the defaults are
used. It is also possible to change those values at configure
time. For example:

$CBUILD-PATH/configure
--with-local-snapshots=$CBUILD_PATH/snapshots 
--with-local-builds=$CBUILD-PATH/destdir
--with-remote-snapshots=cbuild@toolchain64.lab:/home/cbuild/var/snapshots/

  This changes the 3 primary paths, including changing the remote host
to use rsync or ssh to download tarballs instead of HTTP. You can
execute ./configure --help to get the full list of configure time
parameters.

  The configure process produces a host.conf file, with the default
settings. This file is read by cbuildv2 at runtime, so it's possible to
change the values and rerun cbuild2.sh to use the new values. Each
toolchain component also has a config file. The default version is
copied at build time to the build tree. It is also possible to edit
this file, usually called something like gcc.conf, to change the
toolchain component specific values used when configuring the
component.

  You can see what values were used to configure each component by
looking in the top of the config.log file, which is produced at
configure time. For example:

	  head ${hostname}/${target}/${component}/config.log

Default Behaviours:
------------------
  There are several behaviours that may not be obvious, so they're
documented here. One relates to GCC builds. When built natively, GCC
only builds itself once, and is fully functional. When cross
compiling, this works differently. A fully functional GCC can't be
built without a sysroot, so a minimum compiler is built to build the C
library. This is called stage1. This is then used to compiler the C
library. One this library is installed, then stage2 of GCC can be
built.

  When building an entire toolchain using the 'all' option to --build,
all components are built in the correct order, including both GCC
stages. It is possible to specify building only GCC, in this case the
stage1 configure flags are used if the C library isn't installed. If
the C library is installed, then the stage2 flags are used. A message
is displayed to document the automatic configure decision.

Specifying the source works like this:
-------------------------------------
It's possible to specify a full tarball name, minus the compression
part, which will be found dynamically. If the word 'tar' appears in
the suppplied name, it's only looked for on the remote directory. If
the name is a URL for bzr, sv, http, or git, the the source are
instead checked out from a remote repository instead with the
appropriate source code control system.

It's also possible to specify an 'alias', ie ... 'gcc-4.8' instead. In
this case, the remote snapshot directory is checked first. If the name
is not unique, but matches are found, and error is generated. If the
name isn't found at all, then a URL for the source repository is
extracted from the sources.conf file, and the code is checkout out.

There are a few ways if listing the files to see what is
available. There ar two primary remote directories where files that
can be downloaded are stored. These are 'snapshots' or
'infrastructure'. Infrastructure is usualy only installed once per
host, and contains the other packages needed to build GCC, like
gmp. Snapshots is the primary location of all source tarballs. To list
all the available snapshots, you can do this"

    "cbuild2.sh --list snapshots".

You can also run cbuildv2 with the --interactive options, which will
display a subset of all the packages that matches the supplied
string. To build a specific component, use the --build option to
cbuildv2. the --target option is also used for cross builds. For
example:

"cbuild2.sh --target arm-none-linux-gnueabihf gcc-linaro-4.8.2013.07-1"

This would fetch the source tarball for this release, build anything
it needs to compile, the binutils for example, and then build these
sources. You can also specify a URL to a source repository
instead. For example:

"cbuild2.sh --target arm-none-linux-gnueabihf git://git.linaro.org/toolchain/eglibc.git"

To build an entire cross toolchain, the simplest way is to let
cbuildv2 control all the steps. Although it is also possible to do each
step separately. To build the full toolchain, do this:

"cbuild2.sh --target arm-none-linux-gnueabihf --build all"

---------------------------------------------------------------------
Older NOTES

x86_64-none-linux-gnu
arm-none-linux-gnueabi
arm-none-linux-gnueabihf
armeb-none-linux-gnueabihf
aarch64-none-linux-gnu
aarch64-none-elf
aarch64_be-none-elf
aarch64_be-none-linux-gnu

Toolchain Components
 * gcc (gcc, g++, objc, fortran)
 * gas
 * ld (gold)
 * libc (newlib,eglibc,glibc)
 * dependant libs (gmp, mpc, mpfr, libelf)

Build with stock Ubuntu toolchains
 * Linaro GCC Maintenance (currently Linaro GCC 4.7)
 * FSF GCC Previous (currently FSF GCC 4.7)
 * FSF GCC Current (currently FSF GCC 4.8)

Build with trunk/master
 * Linaro GCC Development (currently Linaro GCC 4.8)
 * FSF GCC Trunk (currently FSF GCC 4.9)

Variables
 * build machine toolchain versions of all components (gcc, gas, ld, libc)
 * configure options

Features:
---------
 * Build with downloaded binaries or build them all locally
 * Has default configuration for all components
 * All info recorded into an SQL database
 * Ability to mix & match all toolchain components
 * Test locally or remotely
 * Queue jobs for LAVA
 * Lists possible versions and components for build

cbuild2 command line arguments:
-------------------------------
 * --build (architecture for the build machine, default native)
 * --target (architecture for the target machine, default native)
 * --snapshots XXX (URL of remote host or local directory)
 * --libc {newlib,eglibc,glibc} (C library to use)
 * --list {gcc,binutils,libc} (list possible values for component versions)
 * --set {gcc,binutils,libc,latest}=XXX (change config file setting)
 * --binutils (binutils version to use, default $PATH)
 * --gcc (gcc version to use, default $PATH)
 * --config XXX (alternate config file)
 * --clean (clean a previous build, default is to start where it left off)
 * --dispatch (run on LAVA build farm, probably remote)
 * --sysroot XXX (specify path to alternate sysroot)


General ideas
-------------
Use a shell script as config file, ala *.conf and set global
variables or custom functions. A good naming convention to avoid
collisions is important. It should be able to be download from a
remote host. 

Rather than Makefiles, it should use bourne shell scripts for more
functionality. Both Android and Chromium use this technique.

Each project that needs to be built should have a file that lists it's
major build time dependancies. There should be defaults for all
targets, but values can be overridden with local config files. The
config data could also potentially live in a database. The config data
is recorded into the database for each build and test run.

Cross building G++ is complex, there is a stage 1 build of the
compiler, which is only used to build the C library. Then the C
library is compiled. After that G++ can be built.
 
Analysis
--------

The main goal is to have the appropriate data accessible to support
charting it in various ways to assist in better understanding of the
quality of each toolchain component. This will assist developers in
determining if their changes improve or degrade the component's
quality. This will also allow others to get an overview of each
component's quality, which will support product releases and
management planning.

 * Plot a test run of a specific version of a component across all
   supported platforms
 * Plot a test case of a component across all supported platforms
 * Plot a test case of a component across a list of versions
 * Plot a component's PASS percentage across all supported platforms
 * Plot a component's PASS percentage across all supported platforms
   and several versions.
 * Plot a component's FAIL percentage across all supported platforms
 * Plot a component's FAIL percentage across all supported platforms
   and several versions.
 * Plot all test states as a percentage of total test results
 * Plot all test states of the actual totals for each
 * 
CoreMark
 * Compiler and version
 * Operating Speed in Mhz
 * CoreMark/Mhz
 * CoreMark
 * CoreMark/Core
 * Parallel Execution
 * EEMBC

EEMBC
 * TODO

Spec CPU 2000
 * CINT
  * Benchmark
  * Reference Time
  * Base Runtime
  * Base Ratio
  * Runtime
  * Ratio
 * CFP
  * Benchmark
  * Reference Time
  * Base Runtime
  * Base Ratio
  * Runtime
  * Ratio

Spec CPU 2006
 * Benchmark
 * Base Seconds
 * Base Ratio
 * Peak Seconds
 * Peak Ratio

Notes on Bourne Shell Scripting
-------------------------------

  These scripts use a few techniques in many places that relate to
shell functions. One is heavy use of bourne shell functions to reduce
duplication, and make the code better organized. Any string echo'd by
a function becomes it's return value. Bourne shell supports 2 types of
return values though. One is the string returned by the function. This
is used whenever the called function returns data. This is captured by
the normal shell commands like this: values="`call_function $1`". The
other type of return value is a single integer. Much like system
calls, these scripts all return 0 for success, and 1 for errors. This
enables the calling function to trap errors, and handle them in a
clean fashion.

  A few good habits to mention, always enclose a sub shell execution
in double quotes. If the returned string contains spaces, this
preserves the data, otherwise it'll get truncated.

  Another good habit is to always prepend a character when doing
string comparisons. If one of the two strings is undefined, the script
will abort. So always using "test x${foo} = xbar" prevents that.