kmx git

Commit	Date	Message
9f726400	2011-07-02T21:42:14	Logic error in the queueing of work ended up generating more stale blocks. There is a small chance that a longpoll is signalled right at the start which would lead to a deadlock so check for first work before restart.
48caf248	2011-07-02T09:39:43	Check for stale block after failed submission as well.
09104ce3	2011-07-02T00:13:13	Flag the work back to just thread 0 used by all the threads to avoid lots of queued older work for each thread.
9fe21064	2011-06-25T08:18:29	Fixed up using config.h instead of cpuminer-config.h.
edd0591e	2011-07-02T13:29:31	Make the number of queued work items configurable and default to 2.
594b38b8	2011-07-02T13:46:17	Fix redefinition of gnu source.
131f60a5	2011-07-02T13:06:51	Move queueing of one request to separate function in preparation for variable length queues.
ffdffe77	2011-07-02T12:12:35	Make sure the work gets attributed to the correct gpu. Add an fflush to stderr to minimise garbled output when multiple threads write at once.
86e40ed9	2011-07-02T09:44:29	Stale block control makes it possible to make 2 threads per gpu the default again.
bed69215	2011-07-01T23:45:15	Get rid of the requirement for a static struct that needs locking to cache work. Make it possible to use the thread id for getting work again. Flag the getwork() function when we have a new block to explicitly discard any cached work when a new block is detected. Store the header of each new work and compare it to blocks we're about to submit to decide if they're stale due to a new block and don't try to submit them. This should significantly decrease the number of rejected blocks.
e2fb3e84	2011-07-01T20:34:22	Queueing all kernel parameters dramatically reduces stale block rates.
7ae9afc4	2011-07-01T14:16:41	Profile points and warning clean ups.
b54a3425	2011-07-01T13:58:43	Change default number of threads back to 1. The 2nd just increases the time taken to complete a work item thus increasing stale blocks, despite increasing the rate slightly.
d100281d	2011-06-30T14:21:34	Make sure correct thread id is in work struct and correct cpu is set for per-cpu data.
998d8d45	2011-06-30T11:30:37	Postcalc hash is already its own thread so work can be submitted synchronously from that.
2b6e8416	2011-06-29T23:38:16	Use a buffer of up to 512 * 4 integers when retrieving work from the GPU. This allows each local thread id to have one slot to put any positive results into, thus making overlapping results far less likely. Thus races will be much rarer, allowing more threads. It should also pick up blocks close to each other more reliably and hopefully decrease the number of rejects and opencl errors. Do the search over the buffer entirely in a separate thread to allow the GPU to stay as busy as possible. Detach threads from themselves to prevent an unlucky event where dereferencing occurs by freeing the data that stores the thread info.
6af84770	2011-06-29T11:30:06	Add spaces to make output clearer.
88d9d631	2011-06-30T23:36:57	Use two separate curl instances for submit and get and use separate threads for each to prevent one blocking the other.
72baac08	2011-06-30T21:55:39	Clearly delineate the cpus from the gpus for their local data.
142576a9	2011-06-30T20:50:52	We already have gpu/cpu from id, so use that. Likely the current convoluted code is wrong and leading to segfaults!
18f8b0f9	2011-06-30T16:30:05	Submit work async is still unreliable and only used for cpu mining, so back it out for now.
d5d4d1da	2011-06-30T14:41:01	Don't want to free the work data out of the transient structs.
f1114992	2011-06-26T09:07:52	Implement a potentially variable number of threads per gpu, setting it to 2 for now.
08f56f5f	2011-06-26T08:55:53	Set default CPU threads to 0 if GPU mining.
295ef0f9	2011-06-25T21:47:16	Discard accumulated work when longpoll indicates a new block.
f44e8fac	2011-06-25T20:56:17	Curl appears to be not thread safe so only have one curl open at a time.
343ae851	2011-06-25T20:38:40	Intensity 5 is too high for a normal desktop causing unacceptable lag so change the default to 4.
88e2cf7b	2011-06-25T20:22:23	Initialise libcurl properly.
656b485d	2011-06-25T18:58:59	Make the worksize and vector width configurable.
ead1281b	2011-06-25T18:27:56	Cleanup of return codes.
e1dd27c5	2011-06-29T11:19:43	Ensure that we don't overflow due to 32 bit limitations.
f6486efb	2011-06-25T13:40:42	Make the getting of work asynchronous from the mining threads requests by always having one work item queued. This prevents drops in hash rates when getting work from a pool that is slow to respond. Use a local static struct work in get_work that is used to queue one extra work item.
0cef8f8d	2011-06-25T12:50:15	Default scan timeout of 5 seconds is way too short leading to abandoning blocks too early and being seen as an "inefficient" miner. Increase it to 60.
b38a02bd	2011-06-29T11:14:16	Make the log time hash rate a rolling exponential average so it doesn't fluctuate so dramatically.
d2cb012f	2011-06-25T10:07:29	Detach the thread once created so we don't have to explicitly try and join it.
08a78210	2011-06-29T10:12:00	Make the log show what the thread is: cpu or gpu and what number.
f490143a	2011-06-29T09:22:21	Add local thread count to info, store hw error count, and make share submission debug only.
e016d0c8	2011-06-28T23:41:57	Increase maximum intensity configurable to 14.
dfc52fd5	2011-06-28T21:46:09	Make sure we can have gpu and cpu threads running.
24a28e29	2011-06-28T21:28:50	Make it possible to run as a pure cpu miner by setting gpu threads to 0.
e1d01d06	2011-06-28T11:18:26	Minor fixes.
6374e0fa	2011-06-28T21:11:04	Import the phatk kernel. Enable it only for hardware with amd media ops for now since it crashes nvidia et al. Fallback to the poclbm kernel for the rest. Try harder to avoid stale blocks around longpoll detecting new blocks.
948b514c	2011-06-27T12:02:47	The buffer needs to be flushed before enqueueing the kernel again. Further optimise the mining loop by removing the need_work bool.
a45c54aa	2011-06-27T11:31:05	Make postcalc_hash asynchronous as well.
378d18f8	2011-06-27T10:15:03	Submit all work asynchronously via a submit_work thread.
612c3a45	2011-06-27T09:32:12	Curl doesn't like multiple instances so go back to one instance.
f0dcd127	2011-06-27T09:17:13	Show which cpu mining thread when giving affinity message.
58f6bf42	2011-06-26T16:21:58	Prevent 32bit overflow of local_mhashes as well.
00de8225	2011-06-26T15:28:33	Upper limit should be -hashes.
c29a4322	2011-06-26T13:45:38	Only update the hashmeter once per second from gpu mining threads.
063adc64	2011-06-26T12:59:15	Implement runtime selectable numbers of GPU threads and rename CPU threads option.
b6ae1db8	2011-06-26T10:53:16	The submit_lock is not required nor helpful.
d1c0cccd	2011-06-26T09:09:07	Show correct GPU from thread number.
b7a17753	2011-06-25T09:56:37	Make a separate thread for work submission that returns immediately so that miner threads aren't kept waiting when submitting results to slow pools.
e8f4eead	2011-06-24T16:24:53	Use total mhashes as a counter to prevent 32 bit overflows.
f7926088	2011-06-24T10:55:05	Limit intensity to 10. Anything larger overflows. Simplify test for new work.
feb8cfc8	2011-06-24T09:51:54	applog fixes.
b19ee2f5	2011-06-24T09:39:33	Make sure a GPU doesn't work on a block longer than opt_scantime.
26546ad5	2011-06-24T09:17:09	Make the optimisations per-gpu card and update code to work properly with multiple cards.
852e78e7	2011-06-23T22:09:49	Fix mutex unlocking with only one thread and opt_log_interval.
70f73576	2011-06-23T21:58:46	Make the output display the 5 second and total average Mhash/s. Make the log interval configurable.
debe7776	2011-06-23T21:23:46	Use cpu_from_thr_id when binding threads.
4cd5f47e	2011-06-23T21:09:22	Revert "Multiple compiler warning fixes." This reverts commit a5cbfbde2610e9f60e14b41a4e0595bcb34c772a. Broke.
88761e6c	2011-06-23T21:04:29	Multiple compiler warning fixes.
19eea906	2011-06-23T17:50:37	Implement code detecting max work size and optimal vector width. Use this to patch the kernel to suit the ideal values for the card. Then use these values when invoking the kernel.
237a5067	2011-06-23T15:28:12	Skip trying to start thread of GPUs that don't successfully initcl().
14ca8883	2011-06-23T14:59:17	Update help.
c08be809	2011-06-23T14:56:27	Fix the setting of number of processors. Add scan intensity variable.
2ab6180d	2011-06-23T10:34:40	Reset count once all threads are started to avoid slow rate being shown initially. Update copyright notice and comments.
932ff72f	2011-06-22T23:35:23	The gpuminer thread uses very little cpu and needs to keep the gpu busy with as few delays as possible. Don't nice it.
f54d2cc0	2011-06-22T23:07:30	Make poclbm use 4 vectors and decrease worksize to keep pipelines fullish. Make it possible to have 0 CPU threads and update docs. Fix counter with no cpu threads.
66240481	2011-06-22T15:39:27	Fix deref.
fa2f6b19	2011-06-22T13:54:06	Unwind.
79fec01a	2011-06-22T12:27:57	Remove the input buffer and just pass args to the kernel as per plugin design.
f117675a	2011-06-22T10:15:23	Optimise work loop to make cl calls asynchronous where possible.
f05270b8	2011-06-22T01:19:19	Optimise loop and make debug debug only.
91e5cef3	2011-06-22T00:13:46	Actually get first BFI_INT patch working.
910e6943	2011-06-19T22:21:51	Increase baseline threads to 1<<22. Make total counter regularly update every 5 seconds. Only write the blank buffer when it needs to be blanked.
6b77d850	2011-06-17T14:00:41	Fixes.
ce3382ca	2011-06-14T16:26:34	Don't run gpu thread idle prio.
dde70397	2011-06-14T10:32:54	Merge gpumining from oclmine. Unstable.
7062557e	2011-06-12T01:19:04	Implement global hash rate counter through mutex lock protected data. Make the output easier to read. Don't do hashmeter updates if no output is requested. Remove redundant output when using a single thread.
4d6be0c1	2011-06-09T03:47:07	Fix number-of-threads init logic on Windows
8e0e2493	2011-06-08T22:30:10	only read processor count via sysconf on non-Windows platforms
262b98ca	2011-06-09T11:45:06	Linux + x86_64 optimisations. Add likely() macro. Optimise a few obvious code paths with likely/unlikely. Change algo to sse2_amd64 by default. Move priority change to worker threads only. Detect number of CPUs and set default number of threads == CPUs. Add scheduling policy change to worker threads to SCHED_IDLE first and fallback to SCHED_BATCH on linux. Don't error when failing to set priority. Add CPU affinity and bind worker threads to CPUs when number of threads is a multiple of number of CPUs. Update NEWS with changes.
51817422	2011-06-14T14:09:10	Cope with older linux kernel headers that don't have the newer scheduling policies defined.
0a8ac14c	2011-06-12T14:26:26	Forgot the else.
4f8045c2	2011-06-12T11:40:15	Only increase solutions count when confirmed true.
ce750e42	2011-06-12T01:35:08	Add a solution counter to the output.
69529c38	2011-03-24T14:09:49	Support full URL, in X-Long-Polling header
46819af3	2011-03-21T20:50:59	--user/--pass fixes Also, some newline fixes (applog callers do not need newlines in strings)
81352ca4	2011-03-21T04:27:02	Support --user and --pass, as alternative to --userpass
2fd9d544	2011-03-21T04:02:13	Convert remaining [f]print to applog(). Also, remove a few superfluous printouts.
144cf62d	2011-03-21T03:45:26	Avoid potential for div-by-zero, when calculating max-nonce
d49d6392	2011-03-21T03:42:57	cpu-miner.c: Remove newline from applog() call
24afd617	2011-03-18T17:24:16	Introduce more standardized logging (incl. optional syslog). Also, improve portability of alloca.
7a87bee9	2011-03-18T02:53:13	Add long polling support
6818c692	2011-03-17T23:22:10	Improve max nonce auto-adjustment with some basic algebra.
2d49a9a5	2011-03-17T22:02:28	Introduce ability to interrupt hash scanners in the middle of scanning.
0258fae4	2011-03-14T23:36:28	Fix Windows build, that broke with yasm integration
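
Commit bed69215 above describes storing the header of each new work item and comparing it with results that are about to be submitted, so that results from an older block are not sent. A minimal sketch of that idea in C follows. It is not the code from cpu-miner.c: the names work_sketch, note_new_work, work_is_stale and HEADER_CMP_LEN are hypothetical, and comparing only the first 36 bytes (the version field plus the previous-block hash of a Bitcoin block header) is an assumption about which part of the header identifies the block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumption: the first 36 bytes of the work data (4-byte version plus
 * 32-byte previous-block hash) change only when a new block appears. */
#define HEADER_CMP_LEN 36

/* Hypothetical stand-in for the miner's work item, not the real struct. */
struct work_sketch {
    unsigned char data[128];
};

/* Header prefix of the most recently fetched work. A multithreaded miner
 * would need to guard this with a lock; that is left out of the sketch. */
static unsigned char current_header[HEADER_CMP_LEN];

/* Remember the header prefix of freshly fetched work. */
static void note_new_work(const struct work_sketch *w)
{
    assert(w != NULL);
    memcpy(current_header, w->data, HEADER_CMP_LEN);
}

/* Treat a result as stale when its header prefix no longer matches the
 * stored one, i.e. it was computed against an older block. */
static bool work_is_stale(const struct work_sketch *w)
{
    assert(w != NULL);
    return memcmp(current_header, w->data, HEADER_CMP_LEN) != 0;
}
```

In this sketch a caller would invoke note_new_work() whenever new work arrives and check work_is_stale() just before submitting a result, skipping the submission when it returns true.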

thodg/cgminer/cpu-miner.c