Skip to content

[LinearAlgebra] Speedup accumulation on BTDMatrix#4248

Merged
bakpaul merged 2 commits into
sofa-framework:masterfrom
alxbilger:btdadd
Oct 25, 2023
Merged

[LinearAlgebra] Speedup accumulation on BTDMatrix#4248
bakpaul merged 2 commits into
sofa-framework:masterfrom
alxbilger:btdadd

Conversation

@alxbilger

Copy link
Copy Markdown
Contributor

Validated with the following benchmark. The benchmark BM_BTDMatrix_addBlock uses the new implemented accumulation function and must be compared to BM_BTDMatrix_add.

-------------------------------------------------------------------------------
Benchmark                                     Time             CPU   Iterations
-------------------------------------------------------------------------------
BM_BTDMatrix_add<6, SReal>/64              6212 ns         3069 ns       224000
BM_BTDMatrix_add<6, SReal>/512            49857 ns        32087 ns        22400
BM_BTDMatrix_add<6, SReal>/1024          103536 ns        54408 ns        11200
BM_BTDMatrix_addBlock<6, SReal>/64          602 ns          391 ns      1600000
BM_BTDMatrix_addBlock<6, SReal>/512        5607 ns         3432 ns       186667
BM_BTDMatrix_addBlock<6, SReal>/1024      12176 ns         6417 ns       112000

alxbilger/SofaBenchmark#36


By submitting this pull request, I acknowledge that
I have read, understand, and agree SOFA Developer Certificate of Origin (DCO).


Reviewers will merge this pull-request only if

  • it builds with SUCCESS for all platforms on the CI.
  • it does not generate new warnings.
  • it does not generate new unit test failures.
  • it does not generate new scene test failures.
  • it does not break API compatibility.
  • it is more than 1 week old (or has fast-merge label).

@alxbilger alxbilger added pr: enhancement About a possible enhancement pr: status to review To notify reviewers to review this pull-request labels Oct 20, 2023
@alxbilger

Copy link
Copy Markdown
Contributor Author

[ci-build][with-all-tests]

@hugtalbot

Copy link
Copy Markdown
Contributor

Wow, @fredroy and @epernod might enjoy it!

@fredroy

fredroy commented Oct 23, 2023

Copy link
Copy Markdown
Contributor

Wow, @fredroy and @epernod might enjoy it!

I tried on the 3instrument_collis (BeamAdapter) and there is no noticeable speed-up (or speed-down for that matter) ; but maybe this is not a good use case

@alxbilger

alxbilger commented Oct 23, 2023

Copy link
Copy Markdown
Contributor Author

I tried on the 3instrument_collis (BeamAdapter) and there is no noticeable speed-up (or speed-down for that matter) ; but maybe this is not a good use case

I am not sure that this new function is called. I think #4253 must be merged first.

Anyway, I don't expect a huge speedup in a scene. This kind of speed up is significant with a high number of DoFs, and it's not the case when using beams. I expect that you save a few microseconds by time step, which is really not significant.

@fredroy

fredroy commented Oct 24, 2023

Copy link
Copy Markdown
Contributor

I tried on the 3instrument_collis (BeamAdapter) and there is no noticeable speed-up (or speed-down for that matter) ; but maybe this is not a good use case

I am not sure that this new function is called. I think #4253 must be merged first.

Indeed ! 🚀

master;
[BatchGUI] 5000 iterations done in 66.0261 s ( 75.7276 FPS)

master+ this branch + #4253 :
[BatchGUI] 5000 iterations done in 58.863 s ( 84.943 FPS)

@alxbilger

Copy link
Copy Markdown
Contributor Author

@fredroy thanks for the benchmark. I am pleased with the results.

@fredroy

fredroy commented Oct 25, 2023

Copy link
Copy Markdown
Contributor

I tried on the 3instrument_collis (BeamAdapter) and there is no noticeable speed-up (or speed-down for that matter) ; but maybe this is not a good use case

I am not sure that this new function is called. I think #4253 must be merged first.

Indeed ! 🚀

master; [BatchGUI] 5000 iterations done in 66.0261 s ( 75.7276 FPS)

master+ this branch + #4253 : [BatchGUI] 5000 iterations done in 58.863 s ( 84.943 FPS)

This was on WIndows11+MSVC2022 ; on ubuntu/gcc no speedups 😓

@hugtalbot hugtalbot added pr: status ready Approved a pull-request, ready to be squashed and removed pr: status to review To notify reviewers to review this pull-request labels Oct 25, 2023
@bakpaul bakpaul merged commit f670b11 into sofa-framework:master Oct 25, 2023
@hugtalbot hugtalbot added this to the v23.12 milestone Nov 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr: enhancement About a possible enhancement pr: status ready Approved a pull-request, ready to be squashed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants