From 0b6fb55e8c01bddff0d9e4e03b409623b55794e4 Mon Sep 17 00:00:00 2001
From: Chris Duncan
Date: Mon, 13 Jan 2025 12:58:52 -0800
Subject: [PATCH] Import NanoPow dynamically, preferring local source. Allow user to set test count. Remove testing alternate libs and only test best one loaded. Add benchmarks.

---
 benchmarks.md | 133 ++++++++++++++++++++++++++++++++---------
 test.html     | 160 +++++++++++++++++---------------------------------
 2 files changed, 159 insertions(+), 134 deletions(-)

diff --git a/benchmarks.md b/benchmarks.md
index 04bd409..1055b5d 100644
--- a/benchmarks.md
+++ b/benchmarks.md
@@ -3,13 +3,13 @@ SPDX-FileCopyrightText: 2025 Chris Duncan
 SPDX-License-Identifier: GPL-3.0-or-later
 -->

-PASS Original PoW module: Time to calculate proof-of-work for a send block 16 times
+nano-webgl-pow: Time to calculate proof-of-work for a send block 16 times
 Total: 89756 ms
 Average: 5609.75 ms
 Harmonic: 2092.567565254879 ms
 Geometric: 3612.112662613675 ms

-PASS Customized PoW: Time to calculate proof-of-work for a send block 16 times
+NanoPowGl: Time to calculate proof-of-work for a send block 16 times
 Total: 33240 ms
 Average: 2077.5 ms
 Harmonic: 1328.5635414262717 ms
@@ -21,7 +21,7 @@ Average: 3532 ms
 Harmonic: 764 ms
 Geometric: 1949 ms

-Another PowGl test:
+NanoPowGl:
 Total: 22831.300000041723 ms
 Average: 3805.2166666736207 ms
 Harmonic: 928.6432328540742 ms
@@ -29,21 +29,16 @@ Geometric: 2500.810238375608 ms
 Minimum: 193 ms
 Maximum: 8361 ms

-The proof-of-work equation for Nano cryptocurrency is defined as `blake2b(nonce||blockhash)>=threshold` where blake2b is the hash function configured for an 8-byte output, nonce is a random 8-byte value, || is concatenation, blockhash is a 32-byte value, and threshold is 0xfffffff800000000.
-
-My code currently finds valid nonces on a gaming GPU without issue but only because the initial search space is so large due to using a workgroup size of 256 and a dispatch of (256, 256, 256), so the probability of finding a nonce in the first pass is extremely high. However, this does not perform well on less powerful hardware like smartphones. Please alter my code to perform the nonce search in `main()` in a loop until a valid nonce is found with the idea that a smaller workgroup size and/or smaller dispatch dimensions can allow weak hardware to nonetheless iterate through nonces to search for a valid one and return it.
-
-
+NanoPowGl: Time to calculate proof-of-work for a send block 512 times
 Total: 680948 ms
 Average: 1329.9765625 ms
 Harmonic: 749.6552658409396 ms
-PASS Customized PoW: Time to calculate proof-of-work for a send block 512 times


 CHROMIUM with more accurate timings


-PowGpu: Time to calculate proof-of-work for a send block 8192 times
+NanoPowGpu: Time to calculate proof-of-work for a send block 8192 times
 Total: 2934170.3000008166 ms
 Average: 358.17508544931843 ms
 Harmonic: 218.11823673331645 ms
@@ -52,17 +47,17 @@ Maximum: 2999.9000000059605 ms



-PowGpu: Time to calculate proof-of-work for a send block 512 times
+NanoPowGpu: Time to calculate proof-of-work for a send block 512 times
 Total: 187428.40000000596 ms
 Average: 366.07109375001164 ms
 Harmonic: 220.70399520519166 ms

-PowGpu: Time to calculate proof-of-work for a send block 512 times
+NanoPowGpu: Time to calculate proof-of-work for a send block 512 times
 Total: 187827.7999998629 ms
 Average: 366.85117187473224 ms
 Harmonic: 223.9897252426498 ms

-libnemo: Time to calculate proof-of-work for a send block 512 times
+NanoPowGpu: Time to calculate proof-of-work for a send block 512 times
 (after inlining entire first G round)
 Total: 156981.3999993205 ms
 Average: 306.60429687367287 ms
@@ -70,7 +65,7 @@ Harmonic: 128.74904701127866 ms
 Minimum: 21.700000047683716 ms
 Maximum: 1981.199999988079 ms

-libnemo: Time to calculate proof-of-work for a send block 512 times
+NanoPowGpu: Time to calculate proof-of-work for a send block 512 times
 (after inlining entire first G round)
 Total: 162225.30000036955 ms
 Average: 316.8462890632218 ms
@@ -79,7 +74,7 @@ Geometric: 211.25671228925867 ms
 Minimum: 21.600000023841858 ms
 Maximum: 2267.600000023842 ms

-libnemo: Time to calculate proof-of-work for a send block 512 times
+NanoPowGpu: Time to calculate proof-of-work for a send block 512 times
 (after inlining 3 rounds of G mixing)
 Total: 155547.09999996424 ms
 Average: 303.80292968743015 ms
@@ -88,7 +83,7 @@ Geometric: 196.77234360098842 ms
 Minimum: 19.5 ms
 Maximum: 2140.2000000476837 ms

-libnemo: Time to calculate proof-of-work for a send block 512 times
+NanoPowGpu: Time to calculate proof-of-work for a send block 512 times
 (after inlining 5 rounds of G mixing)
 Total: 165145.19999998808 ms
 Average: 322.5492187499767 ms
@@ -97,7 +92,7 @@ Geometric: 205.28427810986508 ms
 Minimum: 20.099999964237213 ms
 Maximum: 1850.5 ms

-libnemo: Time to calculate proof-of-work for a send block 512 times
+NanoPowGpu: Time to calculate proof-of-work for a send block 512 times
 (after inlining 5 rounds of G mixing and replacing if with select in original G function)
 Total: 135665.40000021458 ms
 Average: 264.9714843754191 ms
@@ -106,7 +101,7 @@ Geometric: 181.19191881133972 ms
 Minimum: 19.599999964237213 ms
 Maximum: 1908.5 ms

-libnemo: Time to calculate proof-of-work for a send block 512 times
+NanoPowGpu: Time to calculate proof-of-work for a send block 512 times
 (after inlining 9 rounds of G mixing and replacing if with select in original G function)
 Total: 147481.09999907017 ms
 Average: 288.0490234356839 ms
@@ -115,7 +110,7 @@ Geometric: 192.75325397221323 ms
 Minimum: 22.19999998807907 ms
 Maximum: 1762.800000011921 ms

-libnemo: Time to calculate proof-of-work for a send block 512 times
+NanoPowGpu: Time to calculate proof-of-work for a send block 512 times
 (after inlining all rounds of G mixing)
 Total: 165041.20000058413 ms
 Average: 322.34609375114087 ms
@@ -124,7 +119,7 @@ Geometric: 202.80092012876665 ms
 Minimum: 21.69999998807907 ms
 Maximum: 2303 ms

-libnemo: Time to calculate proof-of-work for a send block 512 times
+NanoPowGpu: Time to calculate proof-of-work for a send block 512 times
 (after inlining all rounds of G mixing and all if statements replaced with select function)
 Total: 134865.20000064373 ms
 Average: 263.4085937512573 ms
@@ -133,9 +128,81 @@ Geometric: 171.8797089689105 ms
 Minimum: 20.80000001192093 ms
 Maximum: 2093.199999988079 ms

-
-
-PowGpu: Time to calculate proof-of-work for a send block 32 times
+NanoPow (WebGPU) 0xff
+{
+ "count": 512,
+ "total": 149335.80000003055,
+ "min": 9.400000000372529,
+ "max": 1503.300000000745,
+ "arithmetic": 291.67148437505966,
+ "truncated": 222.58417968753201,
+ "harmonic": 106.71381226989509,
+ "geometric": 186.92638314142255
+}
+
+NanoPow (WebGPU) 0xfff
+{
+ "count": 512,
+ "total": 164261.39999999292,
+ "min": 79.5,
+ "max": 1424.7000000011176,
+ "arithmetic": 320.8230468749862,
+ "truncated": 263.8744140625058,
+ "harmonic": 209.95457211379528,
+ "geometric": 256.8968599479061
+}
+
+NanoPow (WebGPU) 0x800
+{
+ "count": 512,
+ "total": 125924.59999999404,
+ "min": 23,
+ "max": 1799.1000000014901,
+ "arithmetic": 245.94648437498836,
+ "truncated": 198.84531250000146,
+ "harmonic": 115.44432001873471,
+ "geometric": 171.54249948295475
+}
+
+NanoPow (WebGPU) 0x400
+{
+ "count": 512,
+ "total": 132129.60000000335,
+ "min": 11.799999998882413,
+ "max": 2051.9000000003725,
+ "arithmetic": 258.06562500000655,
+ "truncated": 201.65429687500364,
+ "harmonic": 86.37881890351905,
+ "geometric": 156.54611901649818
+}
+
+NanoPow (WebGPU) 0x999
+{
+ "count": 512,
+ "total": 132693.0000000093,
+ "min": 32.30000000074506,
+ "max": 2258.800000000745,
+ "arithmetic": 259.1660156250182,
+ "truncated": 208.9763671874971,
+ "harmonic": 133.4766737582568,
+ "geometric": 185.94074203825846
+}
+
+NanoPow (WebGPU) 0x400
+{
+ "count": 512,
+ "total": 136912.30000001006,
+ "min": 8.900000000372529,
+ "max": 1369.9000000003725,
+ "arithmetic": 267.40683593751965,
+ "truncated": 196.9111328124891,
+ "harmonic": 96.43707569252571,
+ "geometric": 166.5867151432514
+}
+
+
+
+NanoPowGpu: Time to calculate proof-of-work for a send block 32 times
 Total: 8909.500000029802 ms
 Average: 278.4218750009313 ms
 Harmonic: 191.49100480215873 ms
@@ -143,7 +210,7 @@ Geometric: 232.13670548729021 ms
 Minimum: 76.69999998807907 ms
 Maximum: 641.5 ms

-PowGpu: Time to calculate proof-of-work for a send block 32 times
+NanoPowGpu: Time to calculate proof-of-work for a send block 32 times
 Total: 11805.200000077486 ms
 Average: 368.91250000242144 ms
 Harmonic: 131.36379466491744 ms
@@ -151,7 +218,7 @@ Geometric: 228.69384924435158 ms
 Minimum: 21.900000005960464 ms
 Maximum: 1479.5 ms

-libnemo: Time to calculate proof-of-work for a send block 32 times
+NanoPowGpu: Time to calculate proof-of-work for a send block 32 times
 (after inlining three G calls)
 Total: 11208.399999916553 ms
 Average: 350.2624999973923 ms
@@ -160,7 +227,7 @@ Geometric: 210.41080264689026 ms
 Minimum: 25 ms
 Maximum: 1249.199999988079 ms

-libnemo: Time to calculate proof-of-work for a send block 32 times
+NanoPowGpu: Time to calculate proof-of-work for a send block 32 times
 (after inlining entire first G round)
 Total: 9778.899999797344 ms
 Average: 305.590624993667 ms
@@ -169,7 +236,7 @@ Geometric: 193.85674573632113 ms
 Minimum: 23.69999998807907 ms
 Maximum: 1752.199999988079 ms

-libnemo: Time to calculate proof-of-work for a send block 32 times
+NanoPowGpu: Time to calculate proof-of-work for a send block 32 times
 (after inlining 3 rounds of G mixing)
 Total: 10425.399999856949 ms
 Average: 325.79374999552965 ms
@@ -178,4 +245,14 @@ Geometric: 231.43806657572657 ms
 Minimum: 31.900000035762787 ms
 Maximum: 954.9000000357628 ms

-In the following code, look at the fourth argument. Prepend it with `v&` and remove its `u` suffix. Then insert a parameter right after it with the same name except the digit is incremented by 1. For example, `G(&v, &v0, &v1, 8u, 16u, 24u, m0, m1, m2, m3);` becomes `G(&v, &v0, &v1, &v8, &v9, 16u, 24u, m0, m1, m2, m3);`. Make sure the ampersand is present and the digits are correct as described. Do not make any other modifications.
+NanoPow (WebGPU) iPhone 0xff
+{
+ "count": 32,
+ "total": 161323,
+ "min": 130,
+ "max": 22190,
+ "arithmetic": 5041.3438,
+ "truncated": 3780.2813,
+ "harmonic": 1252.8660,
+ "geometric": 2906.9620
+}
diff --git a/test.html b/test.html
index 0bbb0cc..95b77de 100644
--- a/test.html
+++ b/test.html
@@ -7,15 +7,18 @@ SPDX-License-Identifier: GPL-3.0-or-later
-
@@ -163,13 +110,14 @@ SPDX-License-Identifier: GPL-3.0-or-later

nano-pow

https://www.npmjs.com/package/nano-pow

-Speed tests comparing three different Nano proof-of-work tools.
-NanoPowGpu uses cutting edge WebGPU technology. Not all browsers are supported.
-NanoPowGl uses WebGL 2.0 and is a fallback option in the NanoPow package.
-nano-webgl-pow is the original package from which NanoPow was inspired and optimized.
+Speed test for NanoPow proof-of-work tool.
+NanoPow uses cutting edge WebGPU technology. Not all browsers are supported.
+NanoPow uses WebGL 2.0 as a fallback option if WebGPU is not detected.

Times below are in milliseconds and summarized by various averaging methods.


-TESTING IN PROGRESS

+
+
+WAITING



 	
-- 2.34.1