[output]
format = "html"
threshold_speedup = 1.10  # only show improvements >10%
Apply with:
xbestpp tune --annotated-only -- ./my_program

xbestpp profile --gpu --kernel="myKernel" -- ./cuda_app

Reports: occupancy, global load/store efficiency, bank conflicts.

5.3 Regression testing in CI

xbestpp ci --baseline=golden.json --max-regression=0.05 -- ./test_suite

The run fails if any metric worsens by more than 5% relative to the baseline.

6. Configuration File (xbestpp.toml)

Example:
[profiling]
events = ["cycles", "cache-misses", "instructions"]
duration = 10  # seconds

[optimization]
max_unroll = 8
allow_fp_contract = true
gpu_grid_size = [256, 1, 1]

[output]
format = "html"
threshold_speedup = 1
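The threshold_speedup setting in [output] appears to control which results are shown in a report. The following is a minimal C++ sketch of such a filter, not xbestpp's actual implementation. It assumes that speedup means baseline time divided by optimized time, and that a row is shown only when that ratio strictly exceeds the threshold. The Timing struct and both function names are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record for one report row; the field names are
// illustrative and not part of any documented xbestpp format.
struct Timing {
    std::string function;
    double baseline_ms;
    double optimized_ms;
};

// Speedup as the ratio baseline / optimized, so 1.0 means "no change".
double speedup(const Timing& t) {
    assert(t.optimized_ms > 0.0 && "optimized time must be positive");
    return t.baseline_ms / t.optimized_ms;
}

// Keep only rows whose speedup strictly exceeds the threshold
// (assumed semantics of threshold_speedup).
std::vector<Timing> filter_by_speedup(const std::vector<Timing>& rows,
                                      double threshold_speedup) {
    std::vector<Timing> kept;
    for (const Timing& t : rows) {
        if (speedup(t) > threshold_speedup) {
            kept.push_back(t);
        }
    }
    return kept;
}
```

Under these assumptions, a function that drops from 342.12 ms to 189.44 ms has a speedup of about 1.81 and passes a threshold of 1.10, while one that drops from 100 ms to 95 ms (about 1.05) is filtered out.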
Function          Baseline (ms)   Optimized (ms)   Speedup
matrix_multiply   342.12          189.44           1.81x

5.1 Targeted tuning via annotation

Add to your C++ code:
[[xbestpp::hot(iterations=1000000)]]
void compute() ...

Then run: