Update ibm.py faster SWAR LUT creation #596
Open
I'm working on https://github.com/raszpl/sigrok-disk (an FM/MFM/RLL decoder for Sigrok/PulseView/DSView). Sigrok is limited to Python 3.4, which makes everything 2-4x slower than modern, already painfully slow Python, so I'm benchmarking everything trying to find cheap speedups. I decided to look at how others implement this stuff and found your code using a bytearray lookup. On my own I had tried lists, arrays, even dictionaries, but somehow not bytearrays, so I added it to a quick benchmark. Edit: uploaded the whole thing at https://github.com/raszpl/sigrok-disk/blob/main/benchmarks/decode_bench.py
A bytearray lookup is the second-fastest method in Python after a plain list, but the list LUT's size sucks, blowing up the cache. None of this matters for greaseweazle, as your software just sits there waiting for real hardware 99.9999% of the time. For my use case, decoding FM/MFM is about 30% of CPU time, and one track takes ~1-2 seconds. All the LUT methods look great in the benchmark (as great as 2.3MB/s of slow Python might look), but the LUTs might be overloading the CPU cache, because in real life the bytearray LUT is a smidge slower than "just 0.7MB/s" SWAR.
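For illustration, here is a minimal sketch of the two approaches being compared. It assumes 16-bit MSB-first MFM words with the data bits in the even bit positions; the names `swar_decode16`, `naive_decode16` and `DECODE_LUT` are hypothetical and not taken from ibm.py or sigrok-disk.

```python
def swar_decode16(x):
    """Compact the 8 data bits (even bit positions) of a 16-bit MFM word
    into one byte using SWAR-style shift-and-mask steps."""
    x &= 0x5555                    # drop clock bits, keep data bits
    x = (x | (x >> 1)) & 0x3333    # gather data bits into pairs
    x = (x | (x >> 2)) & 0x0F0F    # gather pairs into nibbles
    x = (x | (x >> 4)) & 0x00FF    # gather nibbles into one byte
    return x


def naive_decode16(x):
    """Reference implementation: pick the data bits one at a time."""
    out = 0
    for i in range(8):
        out |= ((x >> (2 * i)) & 1) << i
    return out


# Assumed layout: a 64 KiB bytearray LUT indexed by the raw 16-bit word.
# Building it with the SWAR function avoids the per-bit inner loop.
DECODE_LUT = bytearray(swar_decode16(x) for x in range(0x10000))

# The MFM A1 sync mark (raw 0x4489) decodes to data byte 0xA1 either way.
print(hex(swar_decode16(0x4489)), hex(DECODE_LUT[0x4489]))
```

Both paths return the same byte for every 16-bit input; the trade-off is a 64 KiB table lookup per word versus four shift-and-mask steps per word.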
bytearray LUT:
fdd_fm.sr 0.9494 seconds
fdd_mfm.sr 1.3841 seconds
hdd_mfm_RQDX3.sr 2.5356 seconds
hdd_mfm_RQDX3.sr 2.2046 seconds
hdd_mfm_AMS1100M4.sr 1.6169 seconds
hdd_mfm_WD1003V-MM2.sr 2.0987 seconds
hdd_mfm_WD1003V-MM2_int.sr 2.4361 seconds
hdd_mfm_EV346.sr 1.9930 seconds
hdd_rll_ST21R.sr 2.3816 seconds
hdd_rll_WD1003V-SR1.sr 2.5258 seconds
#HDD_11tracks.sr is mix of 6 RLL + 5 MFM tracks in one dump
HDD_11tracks.sr 20.3159 seconds
HDD_11tracks.sr 13.4105 seconds
SWAR:
fdd_fm.sr 0.9368 seconds
fdd_mfm.sr 1.3776 seconds
hdd_mfm_RQDX3.sr 2.2254 seconds
hdd_mfm_RQDX3.sr 2.1476 seconds
hdd_mfm_AMS1100M4.sr 1.5863 seconds
hdd_mfm_WD1003V-MM2.sr 1.9990 seconds
hdd_mfm_WD1003V-MM2_int.sr 2.1873 seconds
hdd_mfm_EV346.sr 1.9719 seconds
hdd_rll_ST21R.sr 2.3424 seconds
hdd_rll_WD1003V-SR1.sr 2.5386 seconds
#HDD_11tracks.sr is mix of 6 RLL + 5 MFM tracks in one dump
HDD_11tracks.sr 19.6183 seconds
HDD_11tracks.sr 13.3527 seconds
Of course this pull request is very silly, only saving you ~30ms of startup time :-) but it was a good pretext to make contact and show off https://github.com/raszpl/sigrok-disk :)