ocl.c


Log

Author Commit Date CI Message
ckolivas e7fdadfc 2012-02-15T14:52:29 Automatically choose phatk kernel for bitalign non-gcn ATI cards, and then only select poclbm if SDK2.6 is detected.
ckolivas 6a785946 2012-02-15T14:47:02 Make SDK 2.6 warning and advice big and bold.
Con Kolivas 23c01bc7 2012-02-13T13:19:04 Make output buffer write only as per Diapolo's suggestion.
Con Kolivas b2b5083b 2012-02-13T12:22:35 Microoptimise phatk kernel on return code.
Con Kolivas fd05341a 2012-02-13T10:39:26 Do not loop indefinitely setting poclbm kernel to load a binary.
Con Kolivas d689cfbd 2012-02-13T10:06:26 Try to load a binary if we've defaulted to the poclbm kernel on SDK2.6
Con Kolivas 3057b701 2012-02-13T09:59:29 Use the poclbm kernel on SDK2.6 with bitalign devices only if there is no binary available.
Con Kolivas 2c33f122 2012-02-13T08:34:44 Whitelist ATI SDK 2.6 to use the poclbm kernel by default.
Con Kolivas fb99c8d5 2012-02-12T21:38:45 The longstanding generation of a zero sized binary appears to be due to the OpenCL library putting the binary in a RANDOM SLOT amongst 4 possible binary locations. Iterate over each of them after building from source till the real binary is found and use that.
Con Kolivas 56907db2 2012-02-12T18:21:30 Fix harmless warnings with -Wsign-compare to allow cgminer to build with -W.
Con Kolivas 405a2120 2012-02-11T20:11:18 Remove unnecessary check for opt_debug on every invocation of applog at LOG_DEBUG and place the check in applog().
Con Kolivas 60c70145 2012-02-11T16:41:41 Retain cl program after successfully loading a binary image.
Con Kolivas 55bd031d 2012-02-11T16:38:55 Variable unused after this so remove setting it.
Con Kolivas 1c1b8bec 2012-02-11T15:58:07 BFI INT patching is not necessarily true on binary loading of files and not true on ATI SDK2.6+. Report bitalign instead.
ckolivas f2d5db0c 2012-02-10T16:45:35 Use only working kernels by default.
ckolivas 59d3d011 2012-02-10T14:33:40 Implement diablo kernel support and try to make it work.
ckolivas 95a989da 2012-02-10T13:18:16 Conflicting entries of cl_kernel may have been causing problems, and automatically chosen kernel type was not being passed on. Rename the enum to cl_kernels and store the chosen kernel in each clState.
ckolivas e6cf96ad 2012-02-10T10:28:45 Allow much longer filenames for kernels to load properly.
ckolivas 4822cca7 2012-02-10T10:23:06 Allow different kernels to be used by different devices and fix the logic fail of overcorrecting on last commit with !strstr.
Con Kolivas 196e8a0f 2012-02-10T09:10:57 Fix kernel selection process and build error.
Philip Kaufmann 47a09cea 2012-02-09T15:15:03 added OpenCL >= 1.1 detection code, in preparation of OpenCL 1.1 global offset parameter support
ckolivas cb7145b1 2012-02-08T13:45:56 Add basic build ability with diakgcn and put all kernel names in configure.ac to avoid changing them in multiple places.
ckolivas 6776b0ea 2012-02-10T16:45:35 Use only working kernels by default.
ckolivas 2270b4e0 2012-02-10T14:33:40 Implement diablo kernel support and try to make it work.
ckolivas 02c94272 2012-02-10T13:18:16 Conflicting entries of cl_kernel may have been causing problems, and automatically chosen kernel type was not being passed on. Rename the enum to cl_kernels and store the chosen kernel in each clState.
ckolivas 35ea31b1 2012-02-10T10:28:45 Allow much longer filenames for kernels to load properly.
ckolivas 8af2365e 2012-02-10T10:23:06 Allow different kernels to be used by different devices and fix the logic fail of overcorrecting on last commit with !strstr.
Con Kolivas 2b23805e 2012-02-10T09:10:57 Fix kernel selection process and build error.
Philip Kaufmann ed7210af 2012-02-09T15:15:03 added OpenCL >= 1.1 detection code, in preparation of OpenCL 1.1 global offset parameter support
ckolivas a6c6866a 2012-02-08T13:45:56 Add basic build ability with diakgcn and put all kernel names in configure.ac to avoid changing them in multiple places.
ckolivas 53c1e9ae 2012-02-04T15:15:57 Allow the OpenCL platform ID to be chosen with --gpu-platform.
ckolivas a4f47812 2012-02-04T14:47:23 Iterate over all platforms displaying their information and number of devices when --ndevs is called.
Con Kolivas ebaa2be1 2012-02-03T18:19:39 Update poclbm kernel for better performance on GCN and new SDKs with bitalign support when not BFI INT patching. Update phatk kernel to work properly for non BFI INT patched kernels, providing support for phatk to run on GCN and non-ATI cards.
Con Kolivas 82af288e 2012-01-29T22:57:29 Revert "Fix various harmless warnings." This reverts commit a4b67f030fc0c7e2b18e79114a441c1e1617d5f8.
Con Kolivas a4b67f03 2012-01-29T21:06:17 Fix various harmless warnings.
Con Kolivas b8f845b4 2012-01-29T16:43:38 Display information about the opencl platform with verbose enabled.
ckolivas 5d5584f8 2012-01-29T16:31:03 Explicitly check for nvidia in opencl platform strings as well.
Con Kolivas a3d90f84 2012-01-29T11:01:17 Default to poclbm kernel on Tahiti (7970) since phatk does not work, even though performance is sub-standard so that at least it will mine successfully by default.
Con Kolivas 31f6e8c7 2012-01-28T17:06:28 Unset prog_built after it is patched because it needs rebuilding.
Con Kolivas 1e503549 2012-01-28T16:29:19 Retain cl program after every possible place we might build the program.
Con Kolivas 25caca90 2012-01-28T16:26:53 Revert "Don't explicitly retain the cl program as it is of no benefit to do so and may lead to problems when trying to release the program." This reverts commit 32910463a3124265b56aca48a6c0fbb107ccfb70. Turns out this does help.
Con Kolivas 32910463 2012-01-26T20:53:35 Don't explicitly retain the cl program as it is of no benefit to do so and may lead to problems when trying to release the program.
Con Kolivas d18d5564 2012-01-26T20:39:35 Do not attempt to build the program that becomes the kernel twice. This could have been leading to failures on initialising cl.
Con Kolivas c87460b3 2012-01-26T19:42:57 Typo.
Con Kolivas 2ecabd85 2012-01-26T19:38:15 Some opencl compilers have issues with no spaces after -D in the compiler options.
Con Kolivas 77e9b1c2 2012-01-26T13:06:39 Use calloced stack memory for CompilerOptions to ensure sprintf writes to the beginning of the char.
Con Kolivas d7aac254 2012-01-26T11:44:42 Whitelist 79x0 cards to prefer no vectors as they perform better without.
Con Kolivas 3d4cfce8 2012-01-24T20:23:44 Instead of using the BFI_INT patching hack on any device reporting cl_amd_media_ops, create a whitelist of devices that need it. This should enable GCN architectures (ATI 79xx cards) to work properly.
Con Kolivas 6442c1ab 2012-01-22T20:36:57 Style police.
Con Kolivas 0719d407 2012-01-22T17:09:06 Clean up on failure to load a binary kernel.
Con Kolivas fb0c580b 2011-10-15T13:29:44 Go to kernel build should we fail to clCreateProgramWithBinary instead of failing on that device. Should fix the windows problems with devices not initialising.
Con Kolivas 2053de6d 2011-09-06T10:11:34 Add the directory name from the arguments cgminer was called from as well to allow it running from a relative pathname.
Con Kolivas 5848c110 2011-08-29T00:16:58 Confusion over the variable name for number of devices was passing a bogus value which likely was causing the zero sized binary issue.
Con Kolivas 3567b69e 2011-08-26T10:20:02 Remove fragile source patching for bitalign, vectors et al. and simply pass it with the compiler options.
Con Kolivas 3d5f5554 2011-08-25T14:42:03 Allow a custom kernel path to be entered on the command line.
Con Kolivas 413d9709 2011-08-25T13:59:46 Make cgminer look in the install directory for the .cl files making make install work correctly.
Con Kolivas 48180b69 2011-08-25T13:10:53 Fail gracefully if unable to open the opencl files.
Con Kolivas 6d10ef2f 2011-08-22T10:17:23 Bump version numbers of kernels to indicate slightly different versions.
Con Kolivas 4beade37 2011-08-18T22:42:37 Retain the program immediately after it's created from source.
Con Kolivas 082e20df 2011-08-18T22:34:03 Explicitly tell the compiler to retain the program to minimise the chance of the zero sized binary errors.
Con Kolivas 0f782ba6 2011-08-17T15:47:18 Update poclbm kernel to FF sized mask and only check that range.
Con Kolivas c40f51c7 2011-08-17T15:06:59 Move to cgminer style buffer return and file naming convention and fix a compiler warning.
Phateus d15d225a 2011-08-16T23:19:46 Changed phatk version to 2.2
Con Kolivas 42d49ffd 2011-08-15T20:28:25 Revert "Restart threads by abstracting out the clcontext initialisation and using that instead of probing all cards." This reverts commit 8f186e61e250e71bd606cabb52795eaa0c9ad423.
Con Kolivas cf543507 2011-08-15T20:27:02 Revert "Preinitialise the devices only once on startup." This reverts commit 071a0ad2f156ab492ebea6c5a60a1e49a62466de.
Con Kolivas b1289a01 2011-08-15T20:26:46 Revert "Move the non cl_ variables into the cgpu info struct to allow creating a new cl state on reinit, preserving known GPU variables." This reverts commit 28880d0dc7c601ee4479921502b66e913e38e36d.
Con Kolivas 28880d0d 2011-08-13T20:54:20 Move the non cl_ variables into the cgpu info struct to allow creating a new cl state on reinit, preserving known GPU variables. Create a new context from scratch in initCQ in case something was corrupted to maximise our chance of successfully creating a new worker thread.
Con Kolivas 071a0ad2 2011-08-12T13:00:25 Preinitialise the devices only once on startup.
Con Kolivas 8f186e61 2011-07-30T16:59:54 Restart threads by abstracting out the clcontext initialisation and using that instead of probing all cards.
Con Kolivas 4365896b 2011-07-29T10:17:36 Release the command queue created after we've copied the binary data.
Con Kolivas 283d5d23 2011-07-29T10:09:24 Create a command queue from the program created from source which allows us to flush the command queue in the hope it will not generate a zero sized binary any more.
Con Kolivas 2e37e337 2011-07-24T10:58:03 Out of order command queue may fail on osx. Try without if it fails.
Con Kolivas 4cd12aa8 2011-07-24T09:04:56 Fix harmless warning.
Con Kolivas a9e1a255 2011-07-23T15:15:46 Make it possible to select the choice of kernel on the command line.
Con Kolivas 116a9dc0 2011-07-23T14:17:25 Update phatk kernel to one with new parameters for slightly less overhead again. Make the queue kernel parameters call a function pointer to select phatk or poclbm.
Con Kolivas 1c67f606 2011-07-21T10:07:29 Sometimes the cl compiler generates zero sized binaries and only a reboot seems to fix it.
Con Kolivas 7b13812e 2011-07-21T09:58:28 Kernels are safely flushed in a way that allows out of order execution to work.
Con Kolivas a7707a26 2011-07-18T10:42:24 Rename the poclbm file to ensure a new binary is built since.
Con Kolivas eea05c05 2011-07-15T13:04:25 Update kernel with a shorter output path, and use 4k output buffer to match OS page sizes.
Con Kolivas 857902a1 2011-07-12T22:23:03 Commit a new phatk kernel renamed to force new binary building and add proper support in makefiles.
Con Kolivas 0c910673 2011-07-10T00:30:12 Set max preferred size to 256 to prevent lying cards from crashing when no worksize is set.
Con Kolivas 826cc480 2011-07-08T11:58:04 Opcode should be ULL.
Rusty Russell efebee5a 2011-07-06T16:47:29 Fix the case where there are no GPUs, and exit if they give errors. If there are no GPUs, set nDevs to 0 not -1 (status is set to an unhelpful -1001 here on my laptop, so we can't rely on a particular status value). Also, if nDevs is -1, exit rather than screwing up later.
Ycros a636a674 2011-07-05T21:31:41 Merge branch 'cgminer' of git://github.com/ckolivas/cgminer into cgminer
Ycros 52d6e7ca 2011-07-05T21:31:24 Fixed fread issues under Windows.
Con Kolivas a93b22c6 2011-07-05T17:34:54 Make it possible to build without GPU mining by picking up HAVE_OPENCL from config.h.
Ycros 5d301c8b 2011-07-02T10:22:09 Make a binary load failure build from source.
Con Kolivas 821da37c 2011-07-04T13:49:28 Add hardware name to binary kernel name allowing for unique kernels for different cards on the same machine.
Con Kolivas 13b43cfa 2011-07-03T00:28:51 Update copyright and authors.
Con Kolivas 594b38b8 2011-07-02T13:46:17 Fix redefinition of gnu source.
Ycros ec831917 2011-06-25T04:43:37 Build on windows using mingw32.
Con Kolivas 4d730577 2011-06-30T10:36:19 Build binaries with unique filenames from the kernel generated and save them. Try to load this cached binary if it matches on next kernel instantiation. This speeds up start-up dramatically, and has a unique kernel binary for different kernel configurations.
Con Kolivas 2b6e8416 2011-06-29T23:38:16 Use a buffer of up to 512 * 4 integers when retrieving work from the GPU. This allows each local thread id to have one slot to put any positive results into, thus making overlapping results far less likely. Thus races will be much rarer, allowing more threads. It should also pick up blocks close to each other more reliably and hopefully decrease the number of rejects and opencl errors. Do the search over the buffer entirely in a separate thread to allow the GPU to stay as busy as possible. Detach threads from themselves to prevent unlucky event where dereferencing occurs by freeing the data that stores the thread info.
Con Kolivas 60f0bb19 2011-06-30T15:47:17 Temporarily back out binary building till it's working more reliably.
Con Kolivas 973b2199 2011-06-30T08:58:07 Tidy.
Con Kolivas 3aa5be4f 2011-07-01T01:14:43 Reinstate binary kernel loading with fixes. Build binaries with unique filenames from the kernel generated and save them. Try to load this cached binary if it matches on next kernel instantiation. This speeds up start-up dramatically, and has a unique kernel binary for different kernel configurations.
Con Kolivas a095f0fa 2011-06-30T14:30:10 Broke source generated program. Fix.
Con Kolivas 6374e0fa 2011-06-28T21:11:04 Import the phatk kernel. Enable it only for hardware with amd media ops for now since it crashes nvidia et al. Fallback to the poclbm kernel for the rest. Try harder to avoid stale blocks around longpoll detecting new blocks.
Con Kolivas 2dbb3944 2011-06-27T22:05:03 Base was being set wrongly meaning we were repeating searches and the rate was actually lower than displayed :( Tweak Ma with new changes. Change default vectors to 2 since it's faster than 4 even when 4 is reported as preferred.
Con Kolivas 06f39506 2011-06-26T08:49:50 Fix typo which prevented BFI INT patch working on multi-GPUs.