-
Notifications
You must be signed in to change notification settings - Fork 8
CommPat / CommPv2 hasher #30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #30 +/- ##
===========================================
+ Coverage 41.58% 63.45% +21.87%
===========================================
Files 3 4 +1
Lines 392 561 +169
===========================================
+ Hits 163 356 +193
+ Misses 225 191 -34
- Partials 4 14 +10
🚀 New features to boost your workflow:
|
|
For reviewers: https://ipfs.io/ipfs/bafkreiacuhuhzbfvplw7d3lzei5vg4uyczoebpq3ihhffplij4he3opply is very nice to have opened on the side to look at tree layouts |
|
|
||
| // nextPow2 returns the smallest power of 2 >= n | ||
| func nextPow2(n uint64) uint64 { | ||
| if n == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Micro-op:
func nextPow2(n uint64) uint64 {
if n == 0 {
return 1
}
return 1 << bits.Len(uint(n-1))
}
| } | ||
|
|
||
| // isZero checks if a byte slice is all zeros | ||
| func isZero(b []byte) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func isZero is unused (U1000)
| // Shift bits to account for the 2-bit shim | ||
| // In: {{ C[7] C[6] }} X[7] X[6] X[5] X[4] X[3] X[2] X[1] X[0] ... | ||
| // Out: X[5] X[4] X[3] X[2] X[1] X[0] C[7] C[6] ... | ||
| for i := 31; i < 63; i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Performance: (if this is hotpath)
Keep a carry var instead of 2 reads per loop. Also, process 8 bytes at a time:
var carry uint64
for ........ {
v := binary.BigEndian.Uint64(b[i:])
out := (v << 2) | carry
binary.BigEndian.PutUint64(b[i:], out)
carry = v >> 62
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably worse in out-of-order CPUs because it creates dependencies. Good compiler probably can resolve both to exactly the same bit-shuffling instructions tho.
| func payloadPosToLeaf(pos int) int { | ||
| switch { | ||
| case pos < 31: | ||
| return 0 | ||
| case pos < 62: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func payloadPosToLeaf(pos int) int { return (pos+1) >> 5 }
| for quadIdx := startQuad; quadIdx < endQuad; quadIdx++ { | ||
| startLeafInQuad, endLeafInQuad := quadFromBuf(paddedQuad[:], buf, start, quadIdx) | ||
| if startLeafInQuad < 0 { | ||
| continue // no data in this quad |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no test touches this line. Is this an interesting case?
| for _, offset := range quadBoundaries { | ||
| t.Run(fmt.Sprintf("offset_%d", offset), func(t *testing.T) { | ||
| dataSize := int64(127 * 4) // 4 quads of data | ||
| data := generateRandomData(dataSize, 12345) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you run some of these generated data through a known-good implementation and have those results compared here in some tests?
Related to filecoin-project/FIPs#1216
It does appear to work correctly with easily >1GB/s throughput. Possibly a bit more verbose than it needs to be and possibly quite a few optimizations remain (e.g. better cache-line utilisation for left-right pairs, less allocations) but it's pretty fast.
Do not merge until PoDSIv2 FRC is in; feel free to review.