Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
This is me ranting (while waiting for the changelog generator to run).
The current instructions in https://github.com/apache/arrow-rs/blob/main/dev/release/README.md consume far more time than they should, in my opinion:
- I have to manually edit the script
- It takes many minutes to generate a CHANGELOG file
- I have to then manually touch up the various CHANGELOGs
- It doesn't seem to know how to deal with branches
For example, while creating a CHANGELOG for 57.3.0 (which has 4 commits on the branch) it produces this nonsense (includes commits from main).
This is even after I tried to tell it to use 57_maintenance as the base:
diff --git a/dev/release/update_change_log.sh b/dev/release/update_change_log.sh
index 7f0195bbd7b..0617079171b 100755
--- a/dev/release/update_change_log.sh
+++ b/dev/release/update_change_log.sh
@@ -29,8 +29,8 @@
set -e
-SINCE_TAG="57.1.0"
-FUTURE_RELEASE="57.2.0"
+SINCE_TAG="57.2.0"
+FUTURE_RELEASE="57.3.0"
SOURCE_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
SOURCE_TOP_DIR="$(cd "${SOURCE_DIR}/../../" && pwd)"
@@ -49,6 +49,8 @@ docker run -it --rm -e CHANGELOG_GITHUB_TOKEN="$ARROW_GITHUB_API_TOKEN" -v "$(pw
--max-issues=300 \
--exclude-tags-regex "^object_store_\d+\.\d+\.\d+$|-rc\d$" \
--since-tag ${SINCE_TAG} \
+ --since-commit 2026-02-01 00:00:00 \
+ --release-branch 57_maintenance \
--future-release ${FUTURE_RELEASE}
Changelog
57.3.0 (2026-02-02)
Implemented enhancements:
- Optimize data page statistics conversion #9306
- A more generic convenience method to create list arrays from nested iterators #9267 [arrow]
- Speed up string view comparison #9253
- Improve parquet BinaryView / StringView decoder performance #9238
- Speedup filter (up to 1.5x) filter/BitIndexIterator/iter_set_bits_rev #9230
- [Parquet] Optimize struct reading #9216
- [Parquet] Add benchmarks for reading struct arrays from parquet #9209
- Support casting negative scale decimals to numeric #9201
- Add ability to reuse DictionaryTracker when creating new IPC Stream #9195
- [regression] Sealing the Array trait broke downstream crates #9184
- perf: optimize RowGroupIndexReader for single row group reads #9180
- Support formatting ListView #9174
- Row format support for ListView #9173
- Add lossy flag in CastOptions #9172
- Consolidate parquet examples into the doc comments #9154
- Uncomment part of test_utf8_single_column_reader_test #9147
- Document / Add an example of RowFilter usage #9096 [parquet]
- Document / Add an example of preserving dictionary encoding when reading parquet #9095 [parquet]
- Reduce overhead to create an Array from ArrayData (make_array) #9061
- [arrow-avro] Add Explicit Projection API to ReaderBuilder #8923
...
This may be a function of not configuring the changelog generator correctly, but it is really quite a pain.
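For comparison, the commits that are actually on the release branch can be listed with git's two-dot range syntax, which selects commits reachable from the branch but not from the tag. The following is a minimal sketch, assuming the previous release tag is `57.2.0` and the branch is `57_maintenance` as in the diff above. The function names `list_branch_commits` and `format_changelog` are hypothetical and not part of the existing release scripts, and unlike the generator this does not look at GitHub issues or labels.

```shell
#!/usr/bin/env bash
# Sketch only: these function names are hypothetical, not part of dev/release.

# Print the subject of each non-merge commit reachable from BRANCH
# but not from SINCE_TAG, oldest first.
list_branch_commits() {
  local since_tag="$1" branch="$2"
  git log --no-merges --reverse --format='%s' "${since_tag}..${branch}"
}

# Read commit subjects from stdin (one per line) and print a minimal
# changelog section with one bullet per non-empty subject.
format_changelog() {
  local version="$1"
  printf '## %s\n\n' "${version}"
  while IFS= read -r subject; do
    if [ -n "${subject}" ]; then
      printf -- '- %s\n' "${subject}"
    fi
  done
  return 0
}

# Example (run inside a checkout where the tag and branch exist):
#   list_branch_commits 57.2.0 57_maintenance | format_changelog 57.3.0
```

Because the range is computed by git rather than by date or tag filters, commits that only exist on main never enter the list.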
Describe the solution you'd like
Something that lets me produce changelogs without spending much time on them.
Describe alternatives you've considered
@andygrove made one in DataFusion that seems to work well: https://github.com/apache/datafusion/blob/main/dev/release/generate-changelog.py
Maybe @kylebarron or @Jefffrey have some suggestions for a better solution too
Additional context