fix scaling issues when computing delta signatures on large files #77
+12
−4
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR drastically improves the throughput of computing delta signatures on large files.
To test it:
before:
after:
Rationale
During archive creation, dar stores the signature in memory using the
memoryfileclass.It appears that
memoryfilehas terrible scaling properties when subjected to short writes, because:storageclass stores each newly written block in a separate heap-allocatedcelluleobject and these objects are collected in an internal linked listmemoryfile::inherited_write()implies at least one call tostorage::size(), which fully traverses the linked list.As the file grows the throughput decreases asymptotically. Rough measurement: 50 Mo/s at t0, 3.3 Mo/s at t0+0h20, 1.5Mo/s at t0+1h30, 1.2Mo/s at t0+2h30. At some point the cpu spends most of its time on page faults.
This patch modifies
generic_rsync::inherited_read()so that its internal buffer is flushed only when full (rather than on every call). This reduces the size of the linked-list by two orders of magnitude. Hashing a 48Go file with a 4ko block size now takes 4 minutes (instead of 8 hours previously).A longer term solution would be to refactor
memoryfile(pehaps a naivestd::vectorwould yield better results than thestorageclass, is the optimisation for arbitrary insertions really relevant?).Also the PR fixes a possible null-pointer dereference when
generic_rsync::send_eof()is called on delta creation (send_eof()has no use when creating a delta (it is only relevant forcreating signatures) and it may dereferencex_output(which is null when creating a delta))