Age | Commit message | Author
2015-07-14Remove 64 bit atomics extensions from CPU Device.ocl_1_2Gil Pitney
The extension is optional, partial support exists in atomic.cl, so can be added back if that becomes a requirement. Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-07-14fixed the crash issue for convert builtin functionShow Liu
GP: Amended to include cpu.h instead of clc.h, which defines abs() in terms of __builtin_abs().
2015-07-09Implemented CMake check for opencl-headers minimum version 1.2 installedGil Pitney
Previously, it was assumed the OpenCL headers version 1.2 were installed, and if not, the build failed. Now CMake will check this and error out before the build. OpenCL header versions greater than 1.2 should also be acceptable. Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-07-02Added the declaration for the 64 bits atomic operationsShow Liu
2015-07-02Fixed atomic operations failing on 64-bit arguments and discarded the whitespaceShow Liu
2015-07-02Fixed the atomic operation crash issuesShow Liu
2015-06-30Implement clEnqueueMigrateMemObjects() v1.2 APIGil Pitney
This implementation passes the Khronos test_buffer_migrate test of the buffers test. However, a real test requires devices of different memory affinities, between which to migrate the buffers, which in the case of a CPUDevice-only implementation on heterogeneous cores, we don't currently have. Nevertheless, this patch implements the API, and the backend has a nominal implementation which will pre-allocate device buffers when called. Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-06-26Fixed the mad_sat() builtin function crash issueShow Liu
2015-06-24Sub-devices inherit the parent's kernel and program device dependent structs.Gil Pitney
This implements an aspect of the clCreateSubDevices() API, per the v1.2 spec: "A program binary (compiled binary, library binary or executable binary) built for a parent device can be used by all its sub-devices. If a program binary has not been built for a sub-device, the program binary associated with the parent device will be used." Previously, each program or kernel object was created for a device. If a device was partitioned into sub-devices, the sub-devices may not have had an associated program/kernel object (unless one was explicitly built). Now, when shamrock queries a device for a kernel or program, it will search the device hierarchy and look for a parent device for which a program/kernel object may have been created. This patch enables the Khronos v1.2 test_device_partition to PASS. Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
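The hierarchy search described above can be sketched in plain C. This is only an illustration under assumed names: `struct device`, its fields, and `find_program()` are hypothetical and do not come from the shamrock source, which uses C++ classes.

```c
#include <stddef.h>

/* Hypothetical types: names are illustrative, not shamrock's classes. */
struct device {
    struct device *parent;   /* NULL for a root device */
    const void    *program;  /* per-device program data, NULL if never built */
};

/* Return the program data for `dev`, or for the nearest ancestor that has
 * one. Returns NULL if nothing was built anywhere up the hierarchy. */
static const void *find_program(const struct device *dev)
{
    for (; dev != NULL; dev = dev->parent)
        if (dev->program != NULL)
            return dev->program;
    return NULL;
}
```

A sub-device that was explicitly built for still wins, because the walk starts at the queried device before moving to its parents.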
2015-06-23fixed the mul_hi() builtin function crash issueShow Liu
2015-06-23fixed for rotate() crash issue when argument is uint typeShow Liu
2015-06-23added "-cl-std=CL1.2" cflags supportShow Liu
2015-06-17clCreateProgramWithBinary(): fix memory leak.Gil Pitney
Memory for a temporary array of context devices was not getting freed. Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-06-16fixed the upsample() builtin function crash issueShow Liu
2015-06-16fixed the clz() builtin function crash issueShow Liu
2015-06-16Fix clGetKernelWorkGroupInfo per v1.2 specGil Pitney
This function now implements this aspect of the spec for this function: "CL_INVALID_DEVICE if device is not in the list of devices associated with kernel..." Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-06-12Allow CPUBuffers for CPUDevices to share copied host pointer dataGil Pitney
Device data allocated for Buffer objects allocated via clCreateBuffer() using the CL_MEM_COPY_HOST_PTR flag for CPUDevices should be allocated only once in global device memory, and shared between the CPUDevices. Previously, shamrock was creating a brand new allocation for each device buffer, for the same MemObject. This was causing the test_device_partition Khronos test to fail (for device fission). This is now fixed, by enabling sharing of device data. Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-06-05clCreateSubDevices(): Return CL_DEVICE_PARTITION_FAILED if numCPUs() == 1Gil Pitney
If a sub-device cannot be further sub-divided, the v1.2 spec allows the implementation to return CL_DEVICE_PARTITION_FAILED, or CL_INVALID_VALUE depending on one's interpretation. The Khronos v1.2 device_partition test fails in either case (which appears to be a bug). The Khronos v2.0 device_partition test was modified to accept the return value of CL_DEVICE_PARTITION_FAILED, which is a logical fix. This patch allows a similarly modified (fixed) Khronos v1.2 test to pass the test subroutine code which recursively subdivides devices. Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
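A minimal sketch of the return-code choice discussed above, assuming a hypothetical helper `partition_equally()` that is not taken from the shamrock source. The error values match the OpenCL 1.2 headers; the handling of an out-of-range unit count is an assumption, since the commit only states the `numCPUs() == 1` case.

```c
/* Values as defined in the OpenCL 1.2 headers. */
#define CL_SUCCESS                  0
#define CL_DEVICE_PARTITION_FAILED  (-18)
#define CL_INVALID_VALUE            (-30)

/* Hypothetical helper: split a device with `num_cpus` compute units into
 * sub-devices of `units_per_sub` units each. */
static int partition_equally(unsigned num_cpus, unsigned units_per_sub,
                             unsigned *num_sub_devices)
{
    /* A single-CPU device cannot be subdivided any further. */
    if (num_cpus <= 1)
        return CL_DEVICE_PARTITION_FAILED;

    /* Assumption: a nonsensical unit count is reported as an invalid value. */
    if (units_per_sub == 0 || units_per_sub > num_cpus)
        return CL_INVALID_VALUE;

    *num_sub_devices = num_cpus / units_per_sub;
    return CL_SUCCESS;
}
```

Checking the single-CPU case first means a recursive subdivision loop ends with CL_DEVICE_PARTITION_FAILED regardless of the requested unit count.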
2015-06-05fixed the sub_sat() builtin function crash issueShow Liu
2015-06-05fixed the add_sat() builtin function crash issueShow Liu
2015-06-04Implement CL_DEVICE_PARTITION_TYPE case of clGetDeviceInfo()Gil Pitney
Used by clients to determine how a (sub)device was partitioned by clCreateSubDevices() Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-06-02Fix possible CPUDevice object mem leak in clCreateSubDevicesGil Pitney
Devices were previously being created during enumeration even when client was not asking for the devices. This is fixed. Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-05-29Fix abs() builtin function for CPU.Gil Pitney
Take cue from dsp.h, which has the abs() function working. Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-05-29relaxed comparison of Args objects in Kernel objectsGil Pitney
Previously, comparison of Args objects between Kernel objects returned false if the Args this pointers were not the same. For two different devices with two separate kernels, it is still possible the Args are the same, if their data members are the same. So, replaced the != operator with a comparison method, which avoids inheriting the C++ != operator which was comparing this pointers. Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-05-14Implemented clCreateSubDevices() PARTITION_EQUALLY capability.Gil Pitney
This is a WIP patch beginning the addition of the v1.2 device fission feature. Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-05-14printf.c: Added cl_khr_int64 feature to printf builtinGil Pitney
This (should) allow building on 64 bit systems, and is benign on 32 bit systems per Khronos 'test_printf' tests. Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-05-04Implemented clEnqueueFillBuffer() v1.2 API.Gil Pitney
This implementation is quite simple, and avoids creation of special OpenCL kernels in favor of using a GNU host builtin, which it is expected would be sufficiently optimized. A possible optimization worth exploring would be to use OpenMP on the host to dispatch the pattern filling across the host cores. Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
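For illustration, a host-side pattern fill might look like the sketch below. The commit does not say which GNU builtin is used, so plain `memcpy()` stands in for it here, and `fill_pattern()` is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical host-side fill: repeat `pattern` (pattern_size bytes) across
 * `size` bytes of `buf`, starting at byte `offset`. Assumes, as the OpenCL
 * 1.2 API requires of its caller, that offset and size are multiples of
 * pattern_size. */
static void fill_pattern(void *buf, size_t offset, size_t size,
                         const void *pattern, size_t pattern_size)
{
    unsigned char *dst = (unsigned char *)buf + offset;

    for (size_t done = 0; done < size; done += pattern_size)
        memcpy(dst + done, pattern, pattern_size);
}
```

Each loop iteration is independent of the others, which is what would make the OpenMP dispatch mentioned above straightforward to try.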
2015-04-29Stub out new v1.2 functions, sufficient to allow building of test_buffersGil Pitney
Khronos test 'test_buffers' requires new v1.2 API symbols to compile. This is in preparation for developing and validating the new v1.2 buffer functions (clEnqueueFillBuffer(), clEnqueueMigrateMemObjects()). Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-04-29Added v1.2 API clUnloadPlatformCompiler()Gil Pitney
Default to current clUnloadCompiler() function. No Khronos v1.2 test available. Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-04-29Added v1.2 clGetExtensionFunctionAddressForPlatform() APIGil Pitney
Simply check the platform ID, and call the existing clGetExtensionFunctionAddress() v1.1 API. Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-04-29Blank line removed from file.Gil Pitney
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-04-29Added printf builtin for v1.2Gil Pitney
This adds an OpenCL C file taken from pocl, with some minor tweaks. Per the Khronos v1.2 test_printf test case, this enables all of the 57 sub tests to pass, with two exceptions:
*** Testing printf for vector ***
0)testing printf("%2.2v4hlf",(1.0f,2.0f,3.0f,4.0f)) *** FAILED ***
4)testing printf("%v2ld",(12345678,98765432)) *** FAILED ***
Some debugging indicates a possible issue involving va_args and floating point types, which becomes apparent when passing vectors of floats to a variadic function. Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-04-21Add popcount builtin function for OpenCL v1.2Gil Pitney
OpenCL v1.2 defines the builtin popcount, which counts the number of one bits in the argument. Extra care is taken to avoid sign extension in casting from the smaller signed types (char,short) to uint, the argument of __builtin_popcount(). This commit enables the Khronos conformance test to PASS: % test_integer_ops popcount Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
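The sign-extension hazard mentioned above can be shown with a small C sketch. The helper names are hypothetical and this is not the shamrock code. It assumes a GCC or Clang compiler providing `__builtin_popcount()` and a 32-bit `unsigned int`.

```c
#include <stdint.h>

/* Hypothetical helper names; a sketch, not the shamrock implementation. */

/* Naive version: the int8_t is converted straight to unsigned int, so a
 * negative value is sign-extended and the upper bits are counted too. */
static int popcount_char_naive(int8_t x)
{
    return __builtin_popcount((unsigned int)x);
}

/* Careful version: narrow to the unsigned type of the same width first,
 * so the widening to unsigned int zero-extends. */
static int popcount_char(int8_t x)
{
    return __builtin_popcount((unsigned int)(uint8_t)x);
}

/* Same idea for 16-bit inputs. */
static int popcount_short(int16_t x)
{
    return __builtin_popcount((unsigned int)(uint16_t)x);
}
```

For a char holding -1 (all eight bits set), the naive version counts every bit of the widened value, while the careful version reports 8.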
2015-04-20clEnqueueMapBuffer: Add new CL_MAP_WRITE_INVALIDATE_REGION flagGil Pitney
Add a check for this new flag. Fix alignment on subBuffer check per TI OpenCL code sync. Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-04-16Updated callback handling for events to notify on Submitted, Running statesGil Pitney
The v1.2 spec updates clSetEventCallback() to add notifications on CL_SUBMITTED and CL_RUNNING states. Updated worker thread to set CL_RUNNING status on the event when it starts. Updated clSetEventCallback() to allow CL_SUBMITTED and CL_RUNNING states. Updated test commandqueue to remove failure check for CL_SUBMITTED state. This change allows the following Khronos v1.2 tests to PASS: % test_events callbacks % test_events callback_simultaneous Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
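As a rough sketch, the relaxed argument check could look like this in C. `check_callback_state()` is a hypothetical name, not the shamrock function. The constants carry the values from the OpenCL headers and are redefined locally so the snippet stands alone.

```c
/* Values as defined in the OpenCL headers. */
#define CL_SUCCESS        0
#define CL_INVALID_VALUE  (-30)
#define CL_COMPLETE       0
#define CL_RUNNING        1
#define CL_SUBMITTED      2
#define CL_QUEUED         3

/* Hypothetical validation of the callback type passed to
 * clSetEventCallback(): v1.2 accepts three states, not only CL_COMPLETE. */
static int check_callback_state(int command_exec_callback_type)
{
    switch (command_exec_callback_type) {
    case CL_SUBMITTED:
    case CL_RUNNING:
    case CL_COMPLETE:
        return CL_SUCCESS;
    default:
        return CL_INVALID_VALUE;   /* includes CL_QUEUED and negatives */
    }
}
```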
2015-04-15Implement the concept of "event termination" required by clSetUserEventStatus()Gil Pitney
Previously, shamrock was not able to handle the case where negative status codes set by clSetUserEventStatus() cause termination of currently queued dependent events. This is now handled allowing the following Khronos tests to pass: % test_events test_userevents % test_events userevents_multithreaded Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
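One way to picture event termination is the C sketch below. The `struct event` layout and `set_status()` are hypothetical, and handing the same negative code on to every dependent event is an assumption made for illustration, not necessarily what shamrock does. The status field is a plain `int` so that negative codes are representable.

```c
#include <stddef.h>

/* Hypothetical event record. Non-negative values mirror the OpenCL
 * execution states (0 = complete ... 3 = queued); negative values mean
 * abnormal termination. */
struct event {
    int            status;
    struct event **dependents;    /* events waiting on this one */
    size_t         n_dependents;
};

/* Set a status on `ev`. A negative status terminates every event that is
 * waiting on it, directly or indirectly. Assumes the dependency graph is
 * acyclic, as event wait lists are. */
static void set_status(struct event *ev, int status)
{
    ev->status = status;
    if (status >= 0)
        return;
    for (size_t i = 0; i < ev->n_dependents; i++)
        set_status(ev->dependents[i], status);
}
```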
2015-04-15Change Event Status type from enum to int to allow portable negative numbersGil Pitney
The C++ standard says the result of assigning negative values to an enum of positive values is compiler dependent. Since event objects can have negative status (indicating abnormal termination), this commit explicitly promotes the Status type to int. Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-04-14Update event handling code per latest TI OpenCL public git repoGil Pitney
Refresh of the commandqueue.cpp code based on latest TI bug fixes, as preparation for adding new OCL v1.2 features. Updates taken from: http://git.ti.com/opencl/ti-opencl commit: 6ffe5906d0f78c1b0398f9460f3af6df978603dd Merge branch 'hotfix/ctrl-c-fix' Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-04-13Moved build notifications out of the program object to the API levelGil Pitney
Per the OCL 1.2 spec, build (compile, link and build) APIs must call the provided callback even if the builds fail. Previously, this was only being done when the builds succeeded, and also was being done once per device (which the spec does not require). This has been changed so that notifications now occur for *any* build result. This was validated using the following test_compiler Khronos test cases: % test_compiler simple_compile_with_callback % test_compiler simple_link_with_callback % test_compiler execute_after_simple_compile_and_link_with_callbacks % test_compiler simple_library_with_callback Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
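The change in notification behaviour can be sketched as follows. All names here (`build_with_notify()`, `failing_build()`, `count_notify()`) are hypothetical stand-ins. Only the value -11, which is CL_BUILD_PROGRAM_FAILURE, comes from the OpenCL headers.

```c
#include <stddef.h>

typedef void (*build_notify_fn)(void *program, void *user_data);

/* Hypothetical API-level wrapper: run the build, then notify exactly once,
 * whether the build succeeded or failed. */
static int build_with_notify(void *program, int (*do_build)(void *program),
                             build_notify_fn notify, void *user_data)
{
    int err = do_build(program);

    if (notify != NULL)
        notify(program, user_data);   /* called for any build result */
    return err;
}

/* Example stubs: a build that always fails, and a callback that counts. */
static int failing_build(void *program)
{
    (void)program;
    return -11;   /* CL_BUILD_PROGRAM_FAILURE */
}

static void count_notify(void *program, void *user_data)
{
    (void)program;
    ++*(int *)user_data;
}
```

Calling `build_with_notify(NULL, failing_build, count_notify, &calls)` returns the failure code and still increments `calls` once.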
2015-04-10Updated error check for clSetUserEventStatus per v1.2 spec.Gil Pitney
Extra check for execution_status < 0, which is valid. Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-04-10Fixed bug in clWaitForEvents causing crash when event is Event::UserGil Pitney
Coal::Event::User type events have no command queue parents. The code was dereferencing a NULL pointer. This is fixed. Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-04-10Implemented clEnqueueBarrierWithWaitList(), and updated clEnqueueBarrier()Gil Pitney
clEnqueueBarrierWithWaitList() is a new OpenCL v1.2 API, which adds a list of dependent events to the barrier being enqueued. The new API was also added to the ICD table. This commit allows the Khronos barrier event tests to PASS:
- % test_events event_enqueue_barrier_with_event_list
- % test_events out_of_order_event_enqueue_barrier_single_queue
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-04-09Implement clEnqueueMarkerWithWaitList(), and updated clEnqueueMarker()Gil Pitney
clEnqueueMarkerWithWaitList() is a new OpenCL v1.2 API, which adds a list of dependent events to the marker being enqueued. Semantics of clEnqueueMarker() were also modified (error check) to the updated spec. The new API was also added to the ICD table. This commit enables passing the following Khronos conformance tests:
- % test_events event_enqueue_marker
- % test_events event_enqueue_marker_with_event_list
- % test_events out_of_order_event_enqueue_marker_single_queue
- % test_events out_of_order_event_enqueue_marker_multi_queue
- % test_events out_of_order_event_enqueue_marker_multi_queue_multi_device => PASSED (by default, since only one device).
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-04-07Update clBuildProgram() and clCompileProgram() to prevent re-buildingGil Pitney
Per the v1.2 spec for clBuildProgram() and clCompileProgram(): "Returns: CL_INVALID_OPERATION if there are kernel objects attached to program." Note this causes some Khronos v1.2 tests to fail, on their second re-build (without previously releasing kernels): % test_compiler options_build_macro % test_compiler options_build_macro_existence % test_compiler options_include_directory These tests pass on their first build, which validates the main objective, but they fail because they appear to violate the OCL v1.2 spec in regards to re-building a program with kernels already attached. Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
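A minimal sketch of the re-build guard, using a hypothetical `struct program` and `build_program()` rather than the shamrock classes. The error value matches the OpenCL headers.

```c
/* Values as defined in the OpenCL headers. */
#define CL_SUCCESS            0
#define CL_INVALID_OPERATION  (-59)

/* Hypothetical program record. */
struct program {
    unsigned attached_kernels;   /* kernel objects created from this program */
    unsigned build_count;        /* number of successful builds */
};

/* Refuse to (re)build while kernel objects are still attached. */
static int build_program(struct program *p)
{
    if (p->attached_kernels > 0)
        return CL_INVALID_OPERATION;
    p->build_count++;
    return CL_SUCCESS;
}
```

This mirrors the test behaviour described above: a first build succeeds, and a second build fails until the kernels created in between are released.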
2015-04-07Added new build options for v1.2Gil Pitney
Added a few build options to clBuildProgram() for v1.2 spec. This enables following Khronos v1.2 conformance tests to pass: % test_compiler options_build_optimizations % test_compiler options_denorm_cache Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-04-02Implemented CL_PROGRAM_BINARY_TYPE of the clGetProgramBuildInfo() APIGil Pitney
OCL v1.2 allows querying of the type of binary program created. This is a partial implementation, pending a method to distinguish LLVM modules as libraries vs executables vs neither. Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-03-31Update clGetProgramInfo() for library program objects.Gil Pitney
Added: clGetProgramInfo( program, CL_PROGRAM_BINARY_SIZES, ...) clGetProgramInfo( program, CL_PROGRAM_BINARIES, ...) to return binaries for library program objects. Previously, this was only returning binaries for executables. This enables the Khronos v1.2 conformance test to pass: - % test_compiler execute_after_serialize_reload_library Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-03-31Libraries: Implement the concept of building program librariesGil Pitney
Handle the "-create-library" option to clLinkProgram(), which enables creation of library objects, which can later be linked together into an executable. This enables the following Khronos v1.2 conformance tests to pass:
- % test_compiler execute_after_simple_library_with_link
- % test_compiler simple_library_only
- % test_compiler simple_library_with_callback
- % test_compiler simple_library_with_link
- % test_compiler multiple_libraries
- % test_compiler multiple_files_multiple_libraries => PASSED.
- % test_compiler multiple_embedded_headers => PASSES*: Up until 256 programs, then runs out of memory. Found LLVMLinkModules() is taking ~20MB per link!
- % test_compiler multi_file_libraries => PASSES*: Up to 128 libraries.
Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
2015-03-20Implement "Embedded Header" feature of the v1.2 Separate Compiling and LinkingGil Pitney
v1.2 adds the ability to load header file source into program objects, then pass those program objects into clCompileProgram(). Without the ability to compile the program into LLVM IR, or to have a functional VFS in clang (?), this forces us to create temporary actual header files on the real filesystem for clang to locate. This commit causes the header source to be written into a temporary /tmp/.shamrock directory, and then adds the "-I /tmp/.shamrock" include path to the compile options. This was validated using Khronos tests: % test_compiler simple_embedded_header_compile % test_compiler simple_embedded_header_link % test_compiler execute_after_embedded_header_link % test_compiler multiple_embedded_headers Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
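The temporary-file workaround described above can be sketched in C on a POSIX system. Both helpers are hypothetical and are not the shamrock code. The sketch assumes flat header names (no subdirectories) and leaves cleanup of the directory out.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: write an embedded header's source text to
 * <dir>/<name>, creating <dir> if needed. Returns 0 on success, -1 on error. */
static int write_embedded_header(const char *dir, const char *name,
                                 const char *source)
{
    char path[512];
    FILE *f;
    int n;

    if (mkdir(dir, 0700) != 0 && errno != EEXIST)
        return -1;
    n = snprintf(path, sizeof path, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    if (fputs(source, f) == EOF) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}

/* Hypothetical helper: append "-I <dir>" to the user's compile options.
 * Returns the length of the result, or -1 if it does not fit in `out`. */
static int add_include_path(const char *options, const char *dir,
                            char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "%s -I %s", options, dir);

    return (n >= 0 && (size_t)n < out_size) ? n : -1;
}
```

With the directory from the commit, `add_include_path("-cl-std=CL1.2", "/tmp/.shamrock", ...)` produces `-cl-std=CL1.2 -I /tmp/.shamrock`.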
2015-03-18Implement the basics of separate Compilation and Linking (v1.2 feature)Gil Pitney
Separate the Program::build() method into separate ::compile() and link() phases. Implement basic clCompileProgram() and clLinkProgram() APIs, and export via ICD table. This commit allows the following Khronos v1.2 tests to pass: % test_compiler simple_link_only % test_compiler simple_link_with_callback % test_compiler two_file_link % test_compiler multiple_files The last test validates linking 256 files together. Signed-off-by: Gil Pitney <gil.pitney@linaro.org>