Age | Commit message | Author |
|
The extension is optional; partial support exists in atomic.cl, so it can
be added back if that becomes a requirement.
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
GP: Amended to include cpu.h instead of clc.h, which defines abs() in
terms of __builtin_abs().
|
|
Previously, the build assumed that version 1.2 of the OpenCL headers was
installed, and failed if it was not.
Now CMake checks this and errors out before the build.
OpenCL header versions greater than 1.2 should also be acceptable.
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
|
|
whitespace
|
|
|
|
This implementation passes the Khronos test_buffer_migrate test of the
buffers test.
However, a real test requires devices with different memory affinities
between which to migrate the buffers. A CPUDevice-only implementation
on heterogeneous cores does not currently provide these.
Nevertheless, this patch implements the API, and the backend has a
nominal implementation which will pre-allocate device buffers when called.
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
|
|
This implements an aspect of the clCreateSubDevices() API, per the v1.2 spec:
"A program binary (compiled binary, library binary or executable binary)
built for a parent device can be used by all its sub-devices.
If a program binary has not been built for a sub-device, the
program binary associated with the parent device will be used."
Previously, each program or kernel object was created for a device. If
a device was partitioned into sub-devices, the sub-devices may not have
had an associated program/kernel object (unless one was explicitly built).
Now, when shamrock queries a device for a kernel or program, it will search
the device hierarchy and look for a parent device for which a program/kernel
object may have been created.
This patch enables the Khronos v1.2 test_device_partition to PASS.
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
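
The parent-device fallback described above can be sketched roughly as follows. This is a minimal illustration only: the `Device` and `Program` types and the `findBinary()` method are hypothetical stand-ins, not shamrock's actual classes.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a device; only the parent link matters here.
struct Device {
    Device* parent = nullptr;  // nullptr for a root (unpartitioned) device
};

// Hypothetical stand-in for a program object holding per-device binaries.
struct Program {
    std::map<const Device*, std::string> binaries;

    // Return the binary built for `dev`, or for its nearest ancestor
    // if none was built for `dev` itself. Returns nullptr if nothing
    // was built anywhere in the hierarchy.
    const std::string* findBinary(const Device* dev) const {
        for (const Device* d = dev; d != nullptr; d = d->parent) {
            auto it = binaries.find(d);
            if (it != binaries.end()) return &it->second;
        }
        return nullptr;
    }
};
```

A binary built explicitly for a sub-device is found first, so it takes precedence over the parent's binary, which matches the quoted spec text.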
|
|
|
|
|
|
|
|
Memory for a temporary array of context devices was not getting freed.
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
|
|
|
|
This function now implements the following aspect of the spec:
"CL_INVALID_DEVICE if device is not in the list of devices associated with
kernel..."
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
Device data for Buffer objects created via clCreateBuffer()
with the CL_MEM_COPY_HOST_PTR flag for CPUDevices should be allocated
only once in global device memory, and shared between the CPUDevices.
Previously, shamrock was creating a brand new allocation for each
device buffer, for the same MemObject. This was causing the
test_device_partition Khronos test to fail (for device fission).
This is now fixed, by enabling sharing of device data.
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
If a sub-device cannot be further sub-divided, the v1.2 spec allows
the implementation to return CL_DEVICE_PARTITION_FAILED, or CL_INVALID_VALUE
depending on one's interpretation.
The Khronos v1.2 device_partition test fails in either case (which appears
to be a bug).
The Khronos v2.0 device_partition test was modified to accept the
return value of CL_DEVICE_PARTITION_FAILED, which is a logical fix.
This patch allows a similarly modified (fixed) Khronos v1.2 test to
pass the test subroutine code which recursively subdivides devices.
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
|
|
|
|
Used by clients to determine how a (sub)device was partitioned by
clCreateSubDevices()
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
Devices were previously being created during enumeration even when the client
was not asking for the devices.
This is fixed.
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
Take cue from dsp.h, which has the abs() function working.
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
Previously, comparison of Args objects between Kernel objects returned
false if the Args this pointers were not the same.
For two different devices with two separate kernels, it is still possible the
Args are the same, if their data members are the same.
So, replaced the != operator with a comparison method, which avoids
inheriting the C++ != operator which was comparing this pointers.
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
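
The difference between identity and value comparison can be sketched as below. The `Arg` struct and its `sameAs()` method are hypothetical simplifications; the real class carries more state than is shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified kernel argument.
struct Arg {
    std::size_t size = 0;
    std::vector<unsigned char> value;  // raw bytes of the argument

    // Value comparison: two distinct Arg objects with identical
    // data members are considered the same argument.
    bool sameAs(const Arg& other) const {
        return size == other.size && value == other.value;
    }
};
```

Comparing addresses would report a mismatch for two kernels on different devices even when their arguments hold the same data, which is the case the commit message describes.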
|
|
This is a WIP patch beginning the addition of the v1.2 device fission
feature.
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
This should allow building on 64-bit systems, and is benign on 32-bit
systems per Khronos 'test_printf' tests.
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
This implementation is quite simple, and avoids creation of special
OpenCL kernels in favor of using a GNU host builtin, which is expected
to be sufficiently optimized.
A possible optimization worth exploring would be to use OpenMP on the
host to dispatch the pattern filling across the host cores.
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
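
A host-side pattern fill along these lines might look like the sketch below. The message does not name the builtin that was used, so `std::memcpy` stands in for it here, and `fillPattern()` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Fill `size` bytes starting at `dst + offset` with repeated copies of
// `pattern`. As with clEnqueueFillBuffer(), offset and size are expected
// to be multiples of pattern_size.
void fillPattern(void* dst, const void* pattern, std::size_t pattern_size,
                 std::size_t offset, std::size_t size) {
    assert(pattern_size > 0);
    assert(offset % pattern_size == 0 && size % pattern_size == 0);
    unsigned char* out = static_cast<unsigned char*>(dst) + offset;
    for (std::size_t done = 0; done < size; done += pattern_size) {
        std::memcpy(out + done, pattern, pattern_size);
    }
}
```

The loop iterations are independent of each other, which is what would make the OpenMP dispatch mentioned above a plausible follow-up.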
|
|
Khronos test 'test_buffers' requires new v1.2 API symbols to compile.
This is in preparation for developing and validating the new v1.2
buffer functions (clEnqueueFillBuffer(), clEnqueueMigrateMemObjects()).
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
Default to current clUnloadCompiler() function.
No Khronos v1.2 test available.
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
Simply check the platform ID, and call the existing
clGetExtensionFunctionAddress() v1.1 API.
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
This adds an OpenCL C file taken from pocl, with some minor tweaks.
Per the Khronos v1.2 test_printf test case, this enables all of the 57
sub tests to pass, with two exceptions:
*** Testing printf for vector ***
0)testing printf("%2.2v4hlf",(1.0f,2.0f,3.0f,4.0f))
*** FAILED ***
4)testing printf("%v2ld",(12345678,98765432))
*** FAILED ***
Some debugging indicates a possible issue involving va_args and floating
point types, which becomes apparent when passing vectors of floats to a
variadic function.
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
OpenCL v1.2 defines the builtin popcount, which counts the number of
one bits in the argument.
Extra care is taken to avoid sign extension in casting from the smaller
signed types (char,short) to uint, the argument of __builtin_popcount().
This commit enables the Khronos conformance test to PASS:
% test_integer_ops popcount
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
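
The sign-extension pitfall can be illustrated with the sketch below, which assumes a GCC or Clang host compiler. The helper names `popcount_char()` and `popcount_short()` are hypothetical and are not taken from the OpenCL C source in the commit.

```cpp
#include <cassert>

// Converting a negative signed char directly to unsigned int sign-extends,
// so every upper bit becomes 1 and the count is too high.
// Converting to the same-width unsigned type first keeps only the
// original bits, which are then zero-extended.
inline int popcount_char(signed char x) {
    return __builtin_popcount(
        static_cast<unsigned int>(static_cast<unsigned char>(x)));
}

inline int popcount_short(short x) {
    return __builtin_popcount(
        static_cast<unsigned int>(static_cast<unsigned short>(x)));
}
```

For example, `popcount_char(-1)` counts the 8 bits of the char rather than all bits of the widened value.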
|
|
Add a check for this new flag.
Fix alignment on subBuffer check per TI OpenCL code sync.
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
The v1.2 spec updates clSetEventCallback() to add notifications on
CL_SUBMITTED and CL_RUNNING states.
Updated worker thread to set CL_RUNNING status on the event when it starts.
Updated clSetEventCallback() to allow CL_SUBMITTED and CL_RUNNING states.
Updated test commandqueue to remove failure check for CL_SUBMITTED state.
This change allows the following Khronos v1.2 tests to PASS:
% test_events callbacks
% test_events callback_simultaneous
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
Previously, shamrock was not able to handle the case where negative status
codes set by clSetUserEventStatus() cause termination of currently queued
dependent events.
This is now handled allowing the following Khronos tests to pass:
% test_events test_userevents
% test_events userevents_multithreaded
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
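
One way to picture the termination of dependent events is the sketch below. It is an assumption-laden model, not shamrock's code: the `Event` struct is hypothetical, and passing the same negative code on to dependents is an illustrative choice rather than something the message states.

```cpp
#include <cassert>
#include <vector>

// Hypothetical event model. Status values follow the OpenCL convention:
// 3 = CL_QUEUED, 0 = CL_COMPLETE, negative = abnormal termination.
struct Event {
    int status = 3;
    std::vector<Event*> dependents;  // events that wait on this one

    void setStatus(int s) {
        status = s;
        if (s < 0) {
            // Terminate everything that was waiting on this event,
            // following the dependency chain transitively.
            for (Event* e : dependents) {
                if (e->status >= 0) e->setStatus(s);
            }
        }
    }
};
```

A non-negative status leaves dependents untouched, so they proceed through the normal queue states.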
|
|
The C++ standard says the result of assigning negative values to an
enum of positive values is compiler dependent.
Since event objects can have negative status (indicating abnormal termination),
this commit explicitly promotes the Status type to int.
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
Refresh of the commandqueue.cpp code based on latest TI bug fixes,
as preparation for adding new OCL v1.2 features.
Updates taken from: http://git.ti.com/opencl/ti-opencl
commit:
6ffe5906d0f78c1b0398f9460f3af6df978603dd Merge branch 'hotfix/ctrl-c-fix'
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
Per the OCL 1.2 spec, build (compile, link and build) APIs must call the
provided callback even if the builds fail.
Previously, this was only being done when the builds succeeded, and also
was being done once per device (which the spec does not require).
This has been changed so that notifications now occur for *any* build result.
This was validated using the following test_compiler Khronos test cases:
% test_compiler simple_compile_with_callback
% test_compiler simple_link_with_callback
% test_compiler execute_after_simple_compile_and_link_with_callbacks
% test_compiler simple_library_with_callback
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
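
The notification change can be sketched as follows. `buildAndNotify()` and the `Notify` type are hypothetical names used only to show the control flow: the callback fires once per call, whether or not the build succeeded.

```cpp
#include <cassert>
#include <functional>

// Hypothetical callback type; the real OpenCL callback receives the
// program handle and user data instead of a success flag.
using Notify = std::function<void(bool success)>;

// Run the build for all devices, then notify exactly once,
// regardless of the outcome.
bool buildAndNotify(const std::function<bool()>& buildAllDevices,
                    const Notify& notify) {
    const bool ok = buildAllDevices();
    if (notify) notify(ok);  // called on success and on failure
    return ok;
}
```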
|
|
Extra check for execution_status < 0, which is valid.
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
Coal::Event::User type events have no command queue parents. The code
was dereferencing a NULL pointer.
This is fixed.
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
clEnqueueBarrierWithWaitList() is a new OpenCL v1.2 API, which adds
a list of dependent events to the barrier being enqueued.
The new API was also added to ICD table.
This commit allows the Khronos barrier event tests to PASS:
- % test_events event_enqueue_barrier_with_event_list
- % test_events out_of_order_event_enqueue_barrier_single_queue
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
clEnqueueMarkerWithWaitList() is a new OpenCL v1.2 API, which adds
a list of dependent events to the marker being enqueued.
Semantics of clEnqueueMarker() were also modified (error check) to the
updated spec.
The new API was also added to ICD table.
This commit enables passing the following Khronos conformance tests:
- % test_events event_enqueue_marker
- % test_events event_enqueue_marker_with_event_list
- % test_events out_of_order_event_enqueue_marker_single_queue
- % test_events out_of_order_event_enqueue_marker_multi_queue
- % test_events out_of_order_event_enqueue_marker_multi_queue_multi_device
=> PASSED (by default, since only one device).
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
Per the v1.2 spec for clBuildProgram() and clCompileProgram():
"Returns: CL_INVALID_OPERATION if there are kernel objects attached to program."
Note this causes some Khronos v1.2 tests to fail, on their second
re-build (without previously releasing kernels):
% test_compiler options_build_macro
% test_compiler options_build_macro_existence
% test_compiler options_include_directory
These tests pass on their first build, which validates the main objective,
but they fail because they appear to violate the OCL v1.2 spec with regard
to re-building a program with kernels already attached.
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
Added a few build options to clBuildProgram() for v1.2 spec.
This enables the following Khronos v1.2 conformance tests to pass:
% test_compiler options_build_optimizations
% test_compiler options_denorm_cache
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
OCL v1.2 allows querying of the type of binary program created.
This is a partial implementation, pending a method to distinguish
LLVM modules as libraries vs executables vs neither.
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
Added:
clGetProgramInfo( program, CL_PROGRAM_BINARY_SIZES, ...)
clGetProgramInfo( program, CL_PROGRAM_BINARIES, ...)
to return binaries for library program objects.
Previously, this was only returning binaries for executables.
This enables the Khronos v1.2 conformance test to pass:
- % test_compiler execute_after_serialize_reload_library
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
Handle the "-create-library" option to clLinkProgram(), which enables
creation of library objects, which can later be linked together into
an executable.
This enables the following Khronos v1.2 conformance tests to pass:
- % test_compiler execute_after_simple_library_with_link
- % test_compiler simple_library_only
- % test_compiler simple_library_with_callback
- % test_compiler simple_library_with_link
- % test_compiler multiple_libraries
- % test_compiler multiple_files_multiple_libraries
=> PASSED.
- % test_compiler multiple_embedded_headers
=> PASSES*: Up until 256 programs, then runs out of memory.
Found that LLVMLinkModules() takes ~20MB per link!
- % test_compiler multi_file_libraries
=> PASSES*: *Up to 128 libraries.
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|
|
v1.2 adds the ability to load header file source into program objects,
then pass those program objects into clCompileProgram().
Without the ability to compile the program into LLVM IR, or to have
a functional VFS in clang (?), this forces us to create temporary
actual header files on the real filesystem for clang to locate.
This commit causes the header source to be written into a temporary
/tmp/.shamrock directory, and then adds the "-I /tmp/.shamrock" include
path to the compile options.
This was validated using Khronos tests:
% test_compiler simple_embedded_header_compile
% test_compiler simple_embedded_header_link
% test_compiler execute_after_embedded_header_link
% test_compiler multiple_embedded_headers
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
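
The temporary-header approach can be sketched roughly as below. The helper names `writeHeader()` and `withIncludePath()` are hypothetical, and the sketch assumes the target directory already exists; creating and cleaning up the directory is left out.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Write one embedded header's source to `dir/name`.
// Returns false if the file cannot be opened or written.
bool writeHeader(const std::string& dir, const std::string& name,
                 const std::string& source) {
    std::ofstream out(dir + "/" + name);
    if (!out) return false;
    out << source;
    out.flush();
    return static_cast<bool>(out);
}

// Append an include path for `dir` to the existing compile options.
std::string withIncludePath(const std::string& options,
                            const std::string& dir) {
    return options.empty() ? "-I " + dir : options + " -I " + dir;
}
```

With the directory from the message, `withIncludePath("", "/tmp/.shamrock")` yields the "-I /tmp/.shamrock" option that is added to the compile options.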
|
|
Separate the Program::build() method into separate ::compile() and ::link()
phases.
Implement basic clCompileProgram() and clLinkProgram() APIs, and export
via ICD table.
This commit allows the following Khronos v1.2 tests to pass:
% test_compiler simple_link_only
% test_compiler simple_link_with_callback
% test_compiler two_file_link
% test_compiler multiple_files
The last test validates linking 256 files together.
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
|