ocl.c


Log

Author Commit Date Message
Con Kolivas 2643ad1b 2013-04-25T00:16:09 Use only the one jump in ocl.c to bypass binary saves for osx opencl.
Con Kolivas 0a8f5849 2013-04-25T00:09:09 Initialise variables not set on OSX in ocl.c.
Con Kolivas 9aae2256 2013-04-24T23:53:44 Bypass attempting to read and save binary files on OSX to avoid crashes on >1 GPU.
Con Kolivas 57e5bfbb 2013-04-21T09:36:49 Set default ocl work size for scrypt to 256.
ckolivas 6ffba7e9 2013-04-17T22:03:00 Convert error getting device IDs in ocl code to info log level only since multiple platforms may be installed and the error is harmless there.
ckolivas a797898f 2013-04-17T22:01:38 Unnecessary extra array in ocl code.
Kano ed480de9 2013-04-03T09:57:16 LTC text typo
Con Kolivas 132ee4c9 2013-03-21T14:56:07 Do not scan other gpu platforms if one is specified.
Con Kolivas 584fc013 2013-03-15T22:31:46 Use a new algorithm for choosing a thread concurrency when none or no shader value is specified for scrypt.
Con Kolivas d0f18e83 2013-03-15T22:00:52 Do not round up the bufsize to the maximum allocable with scrypt.
Con Kolivas 3c3fbdce 2013-03-15T21:48:48 Remove the rounding-up of the scrypt padbuffer which was not effectual and counter-productive on devices with lots of ram, limiting thread concurrencies and intensities.
Con Kolivas 1c6d8a36 2013-03-15T19:43:38 bufsize is an unsigned integer, make it so for debug.
Con Kolivas 767d6df1 2013-03-09T16:35:06 Whitelist AMD APP SDK 2.8 for diablo kernel.
Con Kolivas 87b62bde 2013-03-09T16:19:00 Cope with the highest opencl platform not having usable devices.
Con Kolivas 266d3127 2013-02-10T15:07:49 Make the numbuf larger to accept larger scrypt parameters.
Con Kolivas 69494c12 2012-12-10T15:38:21 BeaverCreek doesn't like BFI INT patching.
Con Kolivas 25c39c96 2012-10-15T12:31:57 Ease the checking on allocation of padbuffer8 in the hope it works partially anyway on an apparently failed call.
Con Kolivas cc3b693c 2012-10-07T12:25:19 Minor warning fixes.
Con Kolivas 40b747ba 2012-10-07T10:00:02 Put scrypt warning on separate line to avoid 0 being shown on windows as bufsize.
Con Kolivas d91af893 2012-08-28T18:08:39 Use correct sdk version detection for SDK 2.7
Con Kolivas 69983b77 2012-08-28T17:19:38 Revert "Pick worksize 256 with Cypress if none is specified." This reverts commit 482322a4b7add8458bee946ffb247a9a587fc25f. Worksize 256 was only helpful on cypress with ultra-low memory speeds with old SDKs and the new kernels require higher memory clocks, having the opposite net effect.
Con Kolivas 4fbe5bed 2012-08-23T23:25:32 OpenCL 1.0 does not have native atomic_add and extremely slow support with atom_add so detect opencl1.0 and use a non-atomic workaround.
Con Kolivas 482322a4 2012-08-23T12:47:28 Pick worksize 256 with Cypress if none is specified.
Con Kolivas be06cf70 2012-08-23T12:44:42 Give warning with sdk2.7 and phatk as well.
Con Kolivas cce19d90 2012-08-23T12:42:10 Whitelist sdk2.7 for diablo kernel as well.
Con Kolivas fc44b6d7 2012-08-05T15:32:44 Use different variables for command line specified lookup gap and thread concurrency to differentiate user defined versus auto chosen values.
Con Kolivas 97aa6ea4 2012-07-29T19:13:45 Fix build error without scrypt enabled.
Con Kolivas 43752ee5 2012-07-26T16:12:45 Limit thread concurrency for scrypt to 5xshaders if shaders is specified.
Con Kolivas da1b996a 2012-07-26T16:10:21 Simplify repeated use of gpus[gpu]. in ocl.c
Con Kolivas ea10b08d 2012-07-25T22:02:14 Find the nearest power of 2 maximum alloc size for the scrypt buffer that can successfully be allocated and is large enough to accommodate the thread concurrency chosen, thus mapping it to an intensity.
Con Kolivas 9a6c082a 2012-07-24T20:27:37 Make the thread concurrency and lookup gap options hidden on the command line and autotune parameters with a newly parsed --shaders option.
Con Kolivas 3a0d60cf 2012-07-23T21:30:30 Always create the largest possible padbuffer for scrypt kernels even if not needed for thread_concurrency, giving us some headroom for intensity levels.
Con Kolivas d8f81c18 2012-07-23T17:51:57 Use the detected maximum allocable memory on a GPU to determine the optimal scrypt settings when lookup_gap and thread_concurrency parameters are not given.
Con Kolivas 89eb1fa3 2012-07-23T17:41:31 Check the maximum allocable memory size per opencl device.
Con Kolivas 5087ff90 2012-07-23T16:37:13 Add debugging output if buffer allocation fails for scrypt and round up bufsize to a multiple of 256.
Con Kolivas 1711b4eb 2012-07-22T00:58:09 Display size of scrypt buffer used in debug.
Con Kolivas 39f7d2fa 2012-07-21T17:31:06 Allow lookup gap and thread concurrency to be passed per device and store details in kernel binary filename.
Con Kolivas 7d53fba1 2012-07-21T02:49:50 Reinstate GPU only opencl device detection.
Con Kolivas d13a3f1d 2012-07-21T02:47:27 Decrease lookup gap to 1. Does not seem to help in any way being 2.
Con Kolivas d72add9a 2012-07-20T16:16:18 Send correct values to scrypt kernel to get it finally working.
Con Kolivas 3e61db10 2012-07-18T21:58:27 Create command queue before compiling program in opencl.
Con Kolivas 471daecb 2012-07-16T17:05:08 Initialise mdplatform.
Con Kolivas 07292f73 2012-07-16T17:05:08 Initialise mdplatform.
Con Kolivas ffd21f8d 2012-07-15T13:40:11 Find the gpu platform with the most devices and use that if no platform option is passed.
Con Kolivas f99ac0ca 2012-07-15T13:31:03 Allow more platforms to be probed if first does not return GPUs.
Con Kolivas 428d5e5d 2012-07-16T13:22:35 Limit scrypt to 1 vector.
Con Kolivas a9a0bba1 2012-07-16T11:53:18 Set the correct data for cldata and prepare for pad8 fixes.
Con Kolivas 04edf4bf 2012-07-15T13:40:56 Temporarily set opencl to use all devices to allow debugging of scrypt kernel rapidly.
Con Kolivas 53e9c61c 2012-07-15T13:40:11 Find the gpu platform with the most devices and use that if no platform option is passed.
Con Kolivas 884f83f3 2012-07-15T13:31:03 Allow more platforms to be probed if first does not return GPUs.
Con Kolivas 243d005b 2012-07-14T16:21:27 Set scrypt settings and buffer size in ocl.c code to be future modifiable.
Con Kolivas aabc7233 2012-07-14T00:30:25 Make sure goffset is set for scrypt and drop padbuffer8 to something manageable for now.
Con Kolivas e0296c41 2012-07-13T21:35:25 Set up buffer8 for scrypt.
Con Kolivas 0f43eb5e 2012-07-13T20:35:44 Don't test nonce with sha and various fixes for scrypt.
Con Kolivas b085c338 2012-07-13T20:28:36 Make scrypt buffers and midstate compatible with cgminer.
Con Kolivas dd740caa 2012-07-13T19:02:43 Provide initial support for the scrypt kernel to compile with and mine scrypt with the --scrypt option.
Philip Kaufmann f479be07 2012-04-27T08:29:56 add goffset support for diakgcn with -v 1 and update kernel version
Con Kolivas 9a3ae266 2012-04-27T10:22:53 Add support for latest ATI SDK on windows.
Con Kolivas bb319883 2012-04-25T11:41:35 Detect poorly performing combination of SDK and phatk kernel and add verbose warning at startup.
Con Kolivas 9175e4f2 2012-04-23T17:56:31 Display all OpenCL devices when -n is called as well to allow debugging of differential mapping of OpenCL to ADL.
Con Kolivas 6274fbe7 2012-03-30T09:32:42 Change the preferred vector width to 1 for Tahiti only, not all poclbm kernels.
Con Kolivas 621bcca7 2012-03-27T22:10:17 Use global offset parameter to diablo and poclbm kernel ONLY for 1 vector kernels.
Con Kolivas 39395eb1 2012-03-27T19:58:51 Use poclbm preferentially on Tahiti now regardless of SDK.
Con Kolivas edb070c8 2012-02-24T13:31:29 Fixes.
Con Kolivas fb077c6d 2012-02-24T13:27:15 Pass vectors * worksize to kernel to avoid one op.
Con Kolivas 709c4cd8 2012-02-23T20:24:32 Use diablo kernel on all future SDKs for Tahiti and set preferred vector width to 1 on poclbm kernel only.
ckolivas dfcb98de 2012-02-23T00:45:40 Use the SDK and hardware information to choose good performing default kernels.
ckolivas d3ad87f5 2012-02-22T20:13:23 Allow writing of multiple worksizes to the configuration file.
ckolivas 1b1fa5cd 2012-02-22T20:08:29 Allow writing of multiple vector sizes to the configuration file.
ckolivas 994cd775 2012-02-22T20:01:09 Allow writing of multiple kernels to the configuration file.
ckolivas 93efb726 2012-02-22T19:38:01 Allow multiple different kernels to be chosen per device.
ckolivas a54f7606 2012-02-22T19:00:44 Fix multiple work size entry.
Con Kolivas 26c59fbf 2012-02-22T16:59:28 Allow the worksize to be set per-device.
Con Kolivas deff55c6 2012-02-22T16:54:06 Allow different vectors to be set per device.
Con Kolivas bf3a9f94 2012-02-22T14:42:20 Unintentionally dropped the device name from the binary filenames. Reinstate.
Con Kolivas 5d23d70f 2012-02-22T14:14:26 As all kernels will be new versions it's an opportunity to change the .bin format and make it simpler. Specifying bitalign is redundant and long can be l.
Con Kolivas d1cddf8b 2012-02-21T22:23:07 Update licensing to GPL V3.
Con Kolivas 00290a3e 2012-02-21T21:31:31 Select diablo kernel on all but GCN+SDK 2.6.
Con Kolivas e9c3d730 2012-02-19T18:32:56 Tahiti prefers worksize 64 with poclbm.
Con Kolivas 30936f17 2012-02-18T23:28:41 No need to expressly retain the opencl program now that the zero binary issue is fixed.
Con Kolivas 810ad045 2012-02-18T23:16:08 More copyright updates.
Con Kolivas 22d3034e 2012-02-18T23:13:45 Show error code on any opencl failure status.
Con Kolivas be9db9ce 2012-02-18T23:00:21 Copyright updates.
Con Kolivas 0b6e35cd 2012-02-18T22:49:49 Add detection for version 898.1 SDK as well but only give SDK 2.6 warning once on startup instead of with each device initialisation.
Con Kolivas 67c4ada1 2012-02-16T01:10:11 Provide warning on each startup about sdk 2.6 and decrease poclbm kernel selection to LOG_INFO.
Con Kolivas b4c86ba6 2012-02-16T00:48:34 Give SDK 2.6 warning only on building a kernel for !GCN bitalign devices.
Con Kolivas 728e3d43 2012-02-16T00:43:05 Revert "Automatically choose phatk kernel for bitalign non-gcn ATI cards, and then only select poclbm if SDK2.6 is detected." This reverts commit e7fdadfc8fc388f68772d5a4c2740da60287c889. Broke kernel loading.
ckolivas e7fdadfc 2012-02-15T14:52:29 Automatically choose phatk kernel for bitalign non-gcn ATI cards, and then only select poclbm if SDK2.6 is detected.
ckolivas 6a785946 2012-02-15T14:47:02 Make SDK 2.6 warning and advice big and bold.
Con Kolivas 23c01bc7 2012-02-13T13:19:04 Make output buffer write only as per Diapolo's suggestion.
Con Kolivas b2b5083b 2012-02-13T12:22:35 Microoptimise phatk kernel on return code.
Con Kolivas fd05341a 2012-02-13T10:39:26 Do not loop indefinitely setting poclbm kernel to load a binary.
Con Kolivas d689cfbd 2012-02-13T10:06:26 Try to load a binary if we've defaulted to the poclbm kernel on SDK2.6
Con Kolivas 3057b701 2012-02-13T09:59:29 Use the poclbm kernel on SDK2.6 with bitalign devices only if there is no binary available.
Con Kolivas 2c33f122 2012-02-13T08:34:44 Whitelist ATI SDK 2.6 to use the poclbm kernel by default.
Con Kolivas fb99c8d5 2012-02-12T21:38:45 The longstanding generation of a zero sized binary appears to be due to the OpenCL library putting the binary in a RANDOM SLOT amongst 4 possible binary locations. Iterate over each of them after building from source till the real binary is found and use that.
Con Kolivas 56907db2 2012-02-12T18:21:30 Fix harmless warnings with -Wsign-compare to allow cgminer to build with -W.
Con Kolivas 405a2120 2012-02-11T20:11:18 Remove unnecessary check for opt_debug on every invocation of applog at LOG_DEBUG and place the check in applog().
Con Kolivas 60c70145 2012-02-11T16:41:41 Retain cl program after successfully loading a binary image.
Con Kolivas 55bd031d 2012-02-11T16:38:55 Variable unused after this so remove setting it.