io.fits.Card: minimal float formatting to prevent truncation (swev-id: astropy__astropy-14508)#31
Conversation
…; exponent normalization; tests (refs #30)
noa-lucent
left a comment
There was a problem hiding this comment.
Thanks for the thoughtful improvement and tests. Overall direction is good: favoring Python’s minimal round‑trip string to preserve comment space and normalizing exponents aligns with FITS readability. I’m requesting small changes for correctness and maintainability:
- Normalize exponents by parsing the digits and reformatting so small exponents use exactly two digits (e.g., E+03), preserving larger ones as-is (E+100). Current rjust can keep unnecessary leading zeros (E+003), which regresses prior behavior.
- Remove the dead/unreachable leftover code (value_str/str_len) to avoid confusion.
- Avoid raw substring truncation of the mantissa. When minimal/.16G output exceeds 20 chars, prefer adjusting precision with rounding to fit the 20-char field rather than slicing, which can silently change the value in a non-rounding way.
- Add tests for negative zero formatting and a large positive exponent to lock in the intended normalization across edge cases.
Everything else looks consistent with FITS header rules and HIERARCH behavior. See inline comments for details.
There was a problem hiding this comment.
[minor] _format_float contains an unreachable leftover block at the end referencing value_str and str_len (from the previous implementation). Although it won’t execute due to earlier returns, it’s confusing and risks future maintenance issues. Please remove the dead code entirely.
There was a problem hiding this comment.
[major] Exponent normalization in _normalize_float_str currently pads digits to "at least two" using rjust(2, '0'), which preserves strings like E+003. The previous implementation explicitly normalized zero-padded small exponents to two digits (e.g., E+03). To maintain backward compatibility and FITS style, please collapse unnecessary leading zeros by parsing the exponent and reformatting: digits_fmt = f"{int(digits):02d}" and then rebuild mant + "E" + sign + digits_fmt. This retains 3+ digits where necessary (e.g., E+100) while ensuring small exponents use exactly two digits.
There was a problem hiding this comment.
[minor] The truncation logic in _format_float slices the mantissa and final output (mant[:avail] / out[:20]). This can produce non-rounded mantissas and break round-trip behavior when the minimal-string or .16G formats exceed 20 chars. Consider computing an appropriate precision and using a rounding format (e.g., :.{n}g or :.{n}f depending on presence of a decimal) to fit within the 20-char field while preserving numeric intent, rather than raw substring truncation.
There was a problem hiding this comment.
[nit] The _value_field helper assumes a single space after = and grabs up to 20 chars; this is fine for standard cards but HIERARCH can alter padding semantics. Consider a brief docstring note or leveraging Card parsing helpers (if available) to avoid brittle assumptions in future refactors.
There was a problem hiding this comment.
[minor] Please add a test case for negative zero to ensure formatting preserves the sign and decimal point (e.g., Card('NEGZERO', -0.0) yields a value field containing -0.0). This guards against regressions in _normalize_float_str.
There was a problem hiding this comment.
[minor] Consider adding a test for large positive exponents (e.g., 1e+100) to confirm normalization keeps 3-digit exponents unchanged (i.e., E+100), alongside the small-exponent case. This will document the intended behavior across exponent sizes.
…ntissa rounding; remove dead code; add tests for -0.0 and large exponent; clarify test helper comments
noa-lucent
left a comment
There was a problem hiding this comment.
Great work—changes address the prior concerns:
- Exponent normalization now preserves E±100 and collapses small exponents to two digits, avoiding E+003.
- Removed stale code.
- Mantissa fitting uses precision-based rounding instead of substring truncation.
- Added tests for -0.0 and large exponents; clarified
_value_fieldhelper.
Left two minor, non-blocking suggestions to simplify exponent parsing and consider :.{N}G in the precision loop for non-exponential fits. Otherwise, this is solid and adheres to FITS header value rules and HIERARCH behavior. Approving.
There was a problem hiding this comment.
[minor] In _format_float, the precision loop uses :.{N}E which always emits scientific notation, even for values where a fixed or minimal form might fit within 20 characters (e.g., 0.1). You already attempt minimal representation first; however, when falling back, consider trying :.{N}G before :.{N}E so values near 1.0 are shown without an exponent when beneficial and still normalized with _normalize_float_str. This may further reduce comment truncation in borderline cases.
There was a problem hiding this comment.
[minor] Thanks for the updates—this looks much cleaner. One small suggestion for _normalize_float_str: building the exponent currently reparses exp with a try/except and a fallback parse. Since Python’s float.__format__('E') already produces canonical exponents (no spaces, with sign), you can simplify by parsing via int(exp) directly after ensuring a leading sign and catching only ValueError once, or rely on the int(exp) result from the earlier sci branch. Not a blocker, but consider simplifying the defensive branches for maintainability.
|
Closing per policy to keep exactly one open PR for task ID 12. Continuing work in PR #124. |
Implements Issue #30: Prefer minimal round-trip float representation (str(value)) within 20-char value field, fallback to .16G with normalization and trimming. Normalizes exponent to uppercase E with at least two digits; ensures decimal point for integer-like floats; applies to complex components. Adds tests covering HIERARCH minimal formatting preventing comment truncation and exponent/decimal rules. No CI triggered.