
fix: double type is not handled correctly in tsz compress on windows #34847

Merged
guanshengliang merged 2 commits into main from fix/compress-cases-failure on Mar 24, 2026

@localvar localvar commented Mar 18, 2026

Description

In the TSZ compression algorithm, the double data type is not handled correctly on Windows, which causes two test cases to fail:

  • test_compress_basic.py
  • test_compress_alter_table.py

This PR fixes the issue.
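As a rough illustration of how a double can be mishandled this way, the sketch below emulates storing a double's 64-bit pattern in a 32-bit integer, which is the width of `long` on 64-bit Windows. The function names are hypothetical and are not taken from the TSZ sources; the actual failure path in the library may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The sketch assumes an 8-byte IEEE-754 double. */
static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");

/* Copy the raw bits of a double into a 64-bit integer. */
static uint64_t double_bits(double v) {
    uint64_t bits;
    memcpy(&bits, &v, sizeof bits);
    return bits;
}

/* Rebuild a double from a 64-bit pattern. */
static double bits_to_double(uint64_t bits) {
    double v;
    memcpy(&v, &bits, sizeof v);
    return v;
}

/* Hypothetical: emulates keeping the pattern in a 32-bit `long`.
 * The upper 32 bits (sign, exponent, high mantissa) are dropped. */
static double roundtrip_through_32bit(double v) {
    uint32_t narrow = (uint32_t)double_bits(v);
    return bits_to_double((uint64_t)narrow);
}

/* Keeping the pattern in a fixed 64-bit type preserves the value. */
static double roundtrip_through_64bit(double v) {
    return bits_to_double(double_bits(v));
}
```

Passing 3.14 through `roundtrip_through_64bit` returns it unchanged, while `roundtrip_through_32bit` returns a different value because only the low half of the pattern survives.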

Issue(s)

Checklist

Please check the items in the checklist if applicable.

  • Is the user manual updated?
  • Are the test cases passed and automated?
  • Is there no significant decrease in test coverage?

Copilot AI review requested due to automatic review settings March 18, 2026 10:50
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses and resolves compression failures observed on Windows by standardizing integer types and size representations within the TSZ compression library. The changes primarily involve migrating from potentially ambiguous long and unsigned long types to explicit int64_t, uint64_t, and size_t to ensure consistent data handling across different operating systems, thereby fixing issues in test_compress_basic.py and test_compress_alter_table.py.

Highlights

  • Type Standardization for 64-bit Integers: Replaced platform-dependent long and unsigned long types with fixed-width int64_t and uint64_t across various headers and source files to ensure consistent 64-bit integer representation, particularly for big-endian byte conversions and Huffman coding.
  • Size Type Consistency: Updated function signatures and variable types from unsigned long to size_t for parameters and return values related to data lengths and sizes in lossless compression/decompression utilities, improving type safety and portability.
  • Adjusted Data Structure Size Calculation: Corrected a byte length calculation within the TightDataPointStorageD structure initialization, which was likely contributing to data corruption or incorrect sizing on specific platforms.
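The big-endian conversion mentioned in the first highlight can be sketched as follows. This is a minimal example under stated assumptions, not the ByteToolkit implementation: the helper names are hypothetical, and the only point is that the intermediate integer is `int64_t`, so all eight bytes of the double are serialized on every platform.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(double) == sizeof(int64_t), "double must be 64 bits");

/* Hypothetical helper: write a 64-bit value as 8 big-endian bytes. */
static void int64_to_bytes_be(unsigned char out[8], int64_t value) {
    uint64_t u;
    memcpy(&u, &value, sizeof u);           /* reinterpret without UB */
    for (int i = 0; i < 8; i++) {
        out[i] = (unsigned char)(u >> (56 - 8 * i));
    }
}

/* Hypothetical helper: read 8 big-endian bytes back into a 64-bit value. */
static int64_t bytes_be_to_int64(const unsigned char in[8]) {
    uint64_t u = 0;
    for (int i = 0; i < 8; i++) {
        u = (u << 8) | in[i];
    }
    int64_t s;
    memcpy(&s, &u, sizeof s);
    return s;
}

/* Serialize a double through its 64-bit pattern. */
static void double_to_bytes_be(unsigned char out[8], double v) {
    int64_t bits;
    memcpy(&bits, &v, sizeof bits);
    int64_to_bytes_be(out, bits);
}

/* Deserialize a double from 8 big-endian bytes. */
static double bytes_be_to_double(const unsigned char in[8]) {
    int64_t bits = bytes_be_to_int64(in);
    double v;
    memcpy(&v, &bits, sizeof v);
    return v;
}
```

With a plain `long` as the intermediate type, the same code would only carry four bytes on 64-bit Windows, so the round trip would not be lossless there.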



@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

The pull request primarily updates integer types from long and unsigned long to int64_t, uint64_t, and size_t across several C header and source files to improve type consistency and portability. This includes changes in ByteToolkit.h/c, CompressElement.h, Huffman.h/c, dataCompression.h/c, sz.h, utility.h/c, sz_double.c, szd_double.c, and szd_float.c. Additionally, a specific bug in TightDataPointStorageD.c was addressed by correcting a size calculation from 4 to 8, which a reviewer noted could be further improved by replacing magic numbers with sizeof or named constants for better readability, maintainability, and to prevent similar bugs.
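The reviewer's suggestion to replace magic numbers with sizeof or named constants might look like the sketch below. The layout, field names, and function are hypothetical and do not reflect the real TightDataPointStorageD format; the example only shows how deriving each field width from its type keeps a size calculation from drifting between 4 and 8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static_assert(sizeof(double) == 8, "sketch assumes an 8-byte double");

/* Hypothetical serialized layout, for illustration only:
 *   [ element count : uint32_t ][ one double field ][ residual bytes ... ] */
enum {
    SKETCH_COUNT_BYTES  = sizeof(uint32_t), /* width of the count field   */
    SKETCH_DOUBLE_BYTES = sizeof(double)    /* width of the double field  */
};

/* Total payload size, with every fixed-width term derived from its type
 * instead of a literal 4 or 8. */
static size_t sketch_payload_size(size_t residual_bytes) {
    return (size_t)SKETCH_COUNT_BYTES + (size_t)SKETCH_DOUBLE_BYTES + residual_bytes;
}
```

If the double field were ever counted with a literal 4, the computed size would be four bytes short of what is actually written, which is the kind of mismatch a named constant makes easier to spot.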

Comment thread contrib/TSZ/sz/src/TightDataPointStorageD.c Outdated

Copilot AI left a comment


Pull request overview

This PR adjusts the bundled SZ/TSZ compression code to behave consistently on Windows by replacing platform-dependent integer types (long, unsigned long) with fixed-width / pointer-width types (int64_t, uint64_t, size_t) in several serialization and compression/decompression paths.

Changes:

  • Switched lossless compress/decompress APIs to size_t for sizes and return values.
  • Replaced long/unsigned long with int64_t/uint64_t in Huffman coding and byte-tooling utilities to avoid Windows long size differences.
  • Fixed a size accounting constant in TightDataPointStorageD residual size calculation.
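To illustrate why bit-manipulation code is sensitive to the width of `long`, here is a small sketch of mask and extraction helpers built on `uint64_t`. The helper names are hypothetical and are not the Huffman or ByteToolkit functions. With a 32-bit `unsigned long`, an expression such as `1UL << 40` shifts by more than the type's width, which is undefined behavior in C, whereas `UINT64_C(1) << 40` is well defined everywhere.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the lowest `nbits` bits set; valid for nbits in 0..64. */
static uint64_t low_bits_mask(unsigned nbits) {
    if (nbits >= 64) {
        return UINT64_MAX;              /* a shift by 64 would be undefined */
    }
    return (UINT64_C(1) << nbits) - 1;  /* always a 64-bit shift */
}

/* Extract `nbits` bits starting at bit `shift` of a 64-bit codeword. */
static uint64_t extract_bits(uint64_t word, unsigned shift, unsigned nbits) {
    assert(shift < 64);                 /* caller must stay inside the word */
    return (word >> shift) & low_bits_mask(nbits);
}
```

For example, `extract_bits(0xABCD000000000000, 48, 16)` yields `0xABCD`, independent of how wide `long` is on the target platform.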

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| contrib/TSZ/sz/src/utility.c | Updates lossless compress/decompress function signatures to size_t. |
| contrib/TSZ/sz/inc/utility.h | Header updates matching utility.c signature changes. |
| contrib/TSZ/sz/src/szd_float.c | Updates sz_lossless_decompress call sites to new size_t signature. |
| contrib/TSZ/sz/src/szd_double.c | Updates sz_lossless_decompress call sites to new size_t signature. |
| contrib/TSZ/sz/src/sz_double.c | Uses size_t when computing radiusIndex. |
| contrib/TSZ/sz/src/dataCompression.c | Moves from long to int64_t for double bit-pattern serialization and updates getRealPrecision_int signature. |
| contrib/TSZ/sz/inc/dataCompression.h | Header updates matching dataCompression.c signature changes. |
| contrib/TSZ/sz/src/TightDataPointStorageD.c | Adjusts residual payload size computation to match written layout. |
| contrib/TSZ/sz/src/Huffman.c | Migrates Huffman codeword storage to uint64_t-based buffers. |
| contrib/TSZ/sz/inc/Huffman.h | Updates Huffman public types to uint64_t and adds <stdint.h>. |
| contrib/TSZ/sz/src/ByteToolkit.c | Changes long<->bytes helpers to use int64_t and updates related bitmask constants. |
| contrib/TSZ/sz/inc/ByteToolkit.h | Header updates matching ByteToolkit.c signature changes and adds <stdint.h>. |
| contrib/TSZ/sz/inc/sz.h | Uses fixed-width 64-bit types in SZ unions to avoid Windows long differences. |
| contrib/TSZ/sz/inc/CompressElement.h | Changes DoubleValueCompressElement.curValue to int64_t. |


Comment thread contrib/TSZ/sz/inc/CompressElement.h Outdated
Comment thread contrib/TSZ/sz/src/ByteToolkit.c Outdated
Comment thread contrib/TSZ/sz/inc/ByteToolkit.h Outdated
Comment thread contrib/TSZ/sz/src/Huffman.c Outdated
Comment thread contrib/TSZ/sz/src/dataCompression.c Outdated
@localvar localvar changed the title from "fix: failed compress cases on windows" to "fix: double type is not handled correctly in tsz compress" on Mar 19, 2026
@localvar localvar changed the title from "fix: double type is not handled correctly in tsz compress" to "fix: double type is not handled correctly in tsz compress on windows" on Mar 19, 2026
@guanshengliang guanshengliang merged commit ebef39f into main Mar 24, 2026
15 of 16 checks passed
@localvar localvar deleted the fix/compress-cases-failure branch March 24, 2026 01:56