From: Chris Duncan
Date: Tue, 7 Jan 2025 13:54:18 +0000 (-0800)
Subject: Log some more benchmarks. Switch back to PowGl for blocks, with the plan to later...
X-Git-Url: https://zoso.dev/?a=commitdiff_plain;h=a7914e49c045ae973b8ba1fac8ebb63ae614b578;p=libnemo.git

Log some more benchmarks. Switch back to PowGl for blocks, with the plan to later make PowGpu available behind a flag.
---

diff --git a/benchmarks.md b/benchmarks.md
index 3b0706f..389caa7 100644
--- a/benchmarks.md
+++ b/benchmarks.md
@@ -1,15 +1,14 @@
+PASS Original PoW module: Time to calculate proof-of-work for a send block 16 times
 Total: 89756 ms
 Average: 5609.75 ms
 Harmonic: 2092.567565254879 ms
 Geometric: 3612.112662613675 ms
-PASS Original PoW module: Time to calculate proof-of-work for a send block 16 times
-
+PASS Customized PoW: Time to calculate proof-of-work for a send block 16 times
 Total: 33240 ms
 Average: 2077.5 ms
 Harmonic: 1328.5635414262717 ms
 Geometric: 1663.110986923899 ms
-PASS Customized PoW: Time to calculate proof-of-work for a send block 16 times
 
 How much faster?
 Total: 56156 ms
@@ -17,6 +16,14 @@
 Average: 3532 ms
 Harmonic: 764 ms
 Geometric: 1949 ms
 
+Another PowGl test:
+Total: 22831.300000041723 ms
+Average: 3805.2166666736207 ms
+Harmonic: 928.6432328540742 ms
+Geometric: 2500.810238375608 ms
+Minimum: 193 ms
+Maximum: 8361 ms
+
 The proof-of-work equation for Nano cryptocurrency is defined as `blake2b(nonce||blockhash)>=threshold` where blake2b is the hash function configured for an 8-byte output, nonce is a random 8-byte value, || is concatenation, blockhash is a 32-byte value, and threshold is 0xfffffff800000000. My code currently finds valid nonces on a gaming GPU without issue but only because the initial search space is so large due to using a workgroup size of 256 and a dispatch of (256, 256, 256), so the probability of finding a nonce in the first pass is extremely high. However, this does not perform well on less powerful hardware like smartphones.
 Please alter my code to perform the nonce search in `main()` in a loop until a valid nonce is found with the idea that a smaller workgroup size and/or smaller dispatch dimensions can allow weak hardware to nonetheless iterate through nonces to search for a valid one and return it.
diff --git a/perf/block.perf.js b/perf/block.perf.js
index fae621b..d1ee578 100644
--- a/perf/block.perf.js
+++ b/perf/block.perf.js
@@ -37,7 +37,7 @@ await suite('Block performance', async () => {
 		console.log(`Maximum: ${max} ms`)
 	})
 
-	await skip(`PowGl: Time to calculate proof-of-work for a block hash ${COUNT} times`, async () => {
+	await test(`PowGl: Time to calculate proof-of-work for a block hash ${COUNT} times`, async () => {
 		const times = []
 		const hashes = [
 			NANO_TEST_VECTORS.PRIVATE_0,
@@ -63,7 +63,7 @@ await suite('Block performance', async () => {
 		console.log(`Maximum: ${max} ms`)
 	})
 
-	await test(`PowGpu: Time to calculate proof-of-work for a send block ${COUNT} times`, async () => {
+	await skip(`PowGpu: Time to calculate proof-of-work for a send block ${COUNT} times`, async () => {
 		const times = []
 		const block = new SendBlock(
 			NANO_TEST_VECTORS.SEND_BLOCK.account,
diff --git a/src/lib/block.ts b/src/lib/block.ts
index 11b969f..1d36116 100644
--- a/src/lib/block.ts
+++ b/src/lib/block.ts
@@ -16,7 +16,7 @@ import { PowGl, PowGpu } from './workers.js'
  * of three derived classes: SendBlock, ReceiveBlock, ChangeBlock.
  */
 abstract class Block {
-	static #pool: Pool = new Pool(PowGpu)
+	static #pool: Pool = new Pool(PowGl)
 	account: Account
 	type: string = 'state'
 	abstract subtype: 'send' | 'receive' | 'change'
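
As a rough illustration of the loop described in the `benchmarks.md` note in this patch, the sketch below searches nonces in fixed-size batches on the CPU, so a small batch can stand in for a small dispatch on weak hardware. It is not the PowGl or PowGpu code. The names `searchNonce`, `meetsThreshold`, `batchSize`, and `startNonce` are hypothetical, the injected `hash8` function is a stand-in for BLAKE2b configured for an 8-byte output, and reading the nonce and the digest as little-endian 64-bit integers is an assumption.

```javascript
// Threshold quoted in benchmarks.md.
const THRESHOLD = 0xfffffff800000000n

// Assumption: the 8-byte digest is compared as a little-endian unsigned 64-bit integer.
function meetsThreshold (digest, threshold = THRESHOLD) {
	const view = new DataView(digest.buffer, digest.byteOffset, 8)
	return view.getBigUint64(0, true) >= threshold
}

// Random 64-bit starting point; cryptographic randomness is not needed here.
function randomNonce () {
	const hi = BigInt(Math.floor(Math.random() * 2 ** 32))
	const lo = BigInt(Math.floor(Math.random() * 2 ** 32))
	return (hi << 32n) | lo
}

// hash8 is a hypothetical stand-in: (Uint8Array of 40 bytes) => Uint8Array of 8 bytes.
// Loops until a nonce passes, so it never returns if hash8 can never meet the threshold.
function searchNonce (blockhash, hash8, batchSize, startNonce = randomNonce()) {
	if (blockhash.length !== 32) throw new RangeError('blockhash must be 32 bytes')
	const input = new Uint8Array(40) // nonce || blockhash
	input.set(blockhash, 8)
	const view = new DataView(input.buffer)
	let base = BigInt.asUintN(64, startNonce)
	for (;;) {
		// One batch plays the role of one GPU pass over a small search space.
		for (let i = 0; i < batchSize; i++) {
			const nonce = BigInt.asUintN(64, base + BigInt(i))
			view.setBigUint64(0, nonce, true) // assumption: little-endian nonce
			if (meetsThreshold(hash8(input))) return nonce
		}
		// Nothing found in this batch: advance the base and try the next one.
		base = BigInt.asUintN(64, base + BigInt(batchSize))
	}
}
```

With a real 8-byte BLAKE2b passed as `hash8`, a smaller `batchSize` only changes how many nonces are tried per pass; the outer loop keeps advancing until some nonce meets the threshold.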