Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Read all chunks when hashing a readable stream #2911

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

jpdutoit
Copy link

@jpdutoit jpdutoit commented Jun 5, 2024

createHash was only reading the first chunk when hashing a readable stream. This breaks etags for longer responses.

On bun the first chunk seems to always be 16384 bytes long, so I added a test there. Could not reproduce on node

The author should do the following, if applicable

  • Add tests
  • Run tests
  • bun run format:fix && bun run lint:fix to format the code

@jpdutoit jpdutoit changed the title Read all chunks when hashing a readable stream fix: Read all chunks when hashing a readable stream Jun 5, 2024
@yusukebe
Copy link
Member

yusukebe commented Jun 6, 2024

Hi @jpdutoit

Thank you for the PR! Can you write a unit test in src/utils/crypto.test.ts?

Copy link

codecov bot commented Jun 6, 2024

Codecov Report

Attention: Patch coverage is 91.66667% with 1 line in your changes missing coverage. Please review.

Project coverage is 94.10%. Comparing base (0d851b6) to head (2603229).
Report is 4 commits behind head on main.

Files Patch % Lines
src/utils/crypto.ts 91.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2911      +/-   ##
==========================================
- Coverage   94.10%   94.10%   -0.01%     
==========================================
  Files         136      136              
  Lines       13369    13377       +8     
  Branches     2266     2263       -3     
==========================================
+ Hits        12581    12588       +7     
- Misses        788      789       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@usualoma
Copy link
Member

usualoma commented Jun 6, 2024

Hi @jpdutoit
Thanks for creating the PR!

Your point about "etag" is correct, and the current hono "middleware/etag" behavior is incorrect. I also believe that this PR will fix the "middleware/etag" behavior.

sha1

import { sha1 } from '... /... /utils/crypto'.

I think it is an incorrect result for the name sha1 that the value changes depending on how the chunk is split, even if the same data is passed. I think the same hash value should always be returned for the same data. I think it would be good if the behaviour of the following test would be.

https://github.com/honojs/hono/pull/2918/files#diff-8dde9c6181523f43830fbad5a7e2eeda12b860c5d712ec628e883bd58ad4904aR44-R59

etag

However, since "etag" does not require a strict "sha1 value for data", your method of generating "etag" is suitable in terms of saving memory usage about "etag".

For that reason, I believe it is better to resolve the problem in "middleware/etag" regarding the generation of "etag" when split into multiple chunks.

@jpdutoit
Copy link
Author

This is all good comments, haven't had time to revisit this yet...

@jpdutoit
Copy link
Author

For fixing the sha1 part, it would be better with a solution that supports true incremental hashing instead of needing to load the whole thing into memory, not sure what portable options there are. Unfortunately subtle.crypto does not support that..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants