WIP - Intl: Add a new IntlNumberRangeFormatter class #19232

Open
wants to merge 1 commit into
base: master

BogdanUngureanu
Contributor

closes #18599

Adds support for ICU's NumberRangeFormatter, which returns a locale-aware range string for two given numbers. Instead of the C API, I've opted for the C++ one because it leaves more options open for the future (it could be combined with a new NumberFormatter ICU class), while the C version only supports skeleton strings.

Since the new NumberFormatter class hasn't been added to PHP Intl yet, the implementation uses skeletons for now.

The implementation uses a factory method that returns a new IntlNumberRangeFormatter object, while the constructor is private. I went with this approach because it would allow the class to gain a second factory method in the future for a different formatting source (e.g. NumberFormatter). Something like:

createFromSkeleton(...)
createFromNumberFormatter()

Drawbacks of the current PHP API:
The C++ version allows configuring the NumberFormatter/skeleton for $start and $end individually (using a fluent interface). The PHP API I propose applies the same skeleton to both numbers.
C++ example code:

NumberRangeFormatter::with()
    .identityFallback(UNUM_IDENTITY_FALLBACK_APPROXIMATELY_OR_SINGLE_VALUE)
    .numberFormatterFirst(NumberFormatter::with().adoptUnit(MeasureUnit::createMeter()))
    .numberFormatterSecond(NumberFormatter::with().adoptUnit(MeasureUnit::createKilometer()))
    .locale("en-GB")
    .formatFormattableRange(750, 1.2, status)
    .toString(status);
// => "750 m - 1.2 km"

And a C example:

// Setup:
UErrorCode ec = U_ZERO_ERROR;
UNumberRangeFormatter* uformatter = unumrf_openForSkeletonCollapseIdentityFallbackAndLocaleWithError(
    u"currency/USD precision-integer",
    -1,
    UNUM_RANGE_COLLAPSE_AUTO,
    UNUM_IDENTITY_FALLBACK_APPROXIMATELY,
    "en-US",
    NULL,
    &ec);
UFormattedNumberRange* uresult = unumrf_openResult(&ec);
if (U_FAILURE(ec)) { return; }

// Format a double range:
unumrf_formatDoubleRange(uformatter, 3.0, 5.0, uresult, &ec);
if (U_FAILURE(ec)) { return; }

// Get the result string:
int32_t len;
const UChar* str = ufmtval_getString(unumrf_resultAsValue(uresult, &ec), &len, &ec);
if (U_FAILURE(ec)) { return; }
// str should equal "$3 – $5"

While the code works, this is still a work in progress: I still have to implement proper error handling. In the meantime, I would like to hear your opinions about the PHP interface I'm proposing.

* @not-serializable
* @strict-properties
*/
final class IntlNumberRangeFormatter {
Member

you should have a condition around the class so that it's only exposed for libicu 68+. Similar example:

#if U_ICU_VERSION_MAJOR_NUM >= 67
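
A minimal sketch of what such a guard could look like for this class, assuming the 68 threshold mentioned in this comment. `HAVE_INTL_NUMBER_RANGE_FORMATTER` and `number_range_formatter_available()` are hypothetical names, and the fallback `#define` only stands in for ICU's version header so that the snippet compiles on its own:

```c
/* Stand-in for the value ICU's headers would normally provide.
 * Assumption: only present so this sketch builds without ICU installed. */
#ifndef U_ICU_VERSION_MAJOR_NUM
#define U_ICU_VERSION_MAJOR_NUM 68
#endif

/* Hypothetical feature macro: expose the class only on new enough ICU. */
#if U_ICU_VERSION_MAJOR_NUM >= 68
#define HAVE_INTL_NUMBER_RANGE_FORMATTER 1
#else
#define HAVE_INTL_NUMBER_RANGE_FORMATTER 0
#endif

/* Hypothetical helper: reports whether the guarded code was compiled in. */
int number_range_formatter_available(void)
{
    return HAVE_INTL_NUMBER_RANGE_FORMATTER;
}
```

The same condition would have to wrap both the stub declaration and the C++ implementation, otherwise a build against an older libicu would still reference the missing symbols.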


private function __construct() {}

public static function createFromSkeleton(string $skeleton, string $locale, int $collapse, int $identityFallback): IntlNumberRangeFormatter {}
Member

@kocsismate Jul 25, 2025

I hope I won't upset you too much, but I think these changes will need an RFC, because any API changes involve a lot of bikeshedding lately.

For example, we have started to use enums more often (see my https://wiki.php.net/rfc/url_parsing_api RFC), so I think this practice could be continued in your case too.

Member

Hm, I see that you have recently added the ListFormatter class in #18519 that also has some enum-like constants. It probably also makes sense to stay consistent with the current convention of intl (class constants), so I don't insist on my above suggestion.

Member

If we decide that we should employ nicer features in intl, we should do a thorough review of not only the constants, but other mechanisms (like error handling) as well. And then we can make one compelling holistic RFC (if we would desire to do so ;) ).

Comment on lines +52 to +53
{

Member

I think you could throw an exception if someone tries to call this function:

zend_throw_error(NULL, "Cannot directly construct %s, use document methods instead", ZSTR_VAL(Z_OBJCE_P(ZEND_THIS)->name));

Member

@kocsismate Jul 25, 2025

+ please add a test for this case

}

if (skeleton_len == 0) {
zend_argument_value_error(1, "Skeleton string cannot be empty");
Member

@kocsismate Jul 25, 2025

The error messages don't fit zend_argument_value_error. E.g. the result of this one will read as Argument #1 ($skeleton) Skeleton string cannot be empty. So in this case, you should change the 2nd argument to cannot be empty.

Member

Better yet, use zend_argument_must_not_be_empty_error(1); which will ensure standard wording
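
To illustrate why the wording matters, here is a small mock of how the final message gets assembled. `format_argument_error` is a hypothetical stand-in, not the Zend API; it assumes the engine prefixes the function name, the argument number, and the parameter name before the text supplied by the caller:

```c
#include <stdio.h>

/* Hypothetical mock of the message composition (NOT the real Zend function).
 * Assumption: the engine builds "<func>(): Argument #<n> ($<name>) <message>",
 * so the caller's message should start with the predicate, not the subject. */
int format_argument_error(char *buf, size_t size, const char *func,
                          unsigned arg_num, const char *arg_name,
                          const char *message)
{
    return snprintf(buf, size, "%s(): Argument #%u ($%s) %s",
                    func, arg_num, arg_name, message);
}
```

Called with the message `cannot be empty`, the mock produces `IntlNumberRangeFormatter::createFromSkeleton(): Argument #1 ($skeleton) cannot be empty`. Passing `Skeleton string cannot be empty` instead would name the subject twice, which is the problem described above.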

IntlNumberRangeFormatter_object* obj = Z_INTL_RANGEFORMATTER_P(ZEND_THIS);

ZEND_PARSE_PARAMETERS_START(2, 2)
Z_PARAM_DOUBLE(start)
Member

this macro implies float type, but you would want int|float which needs Z_PARAM_NUMBER
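
A rough sketch of what accepting int|float means on the C side: the argument arrives as either an integer or a double, and the callee branches on the type before handing the value on. The `num_arg` type and `num_arg_to_double()` below are hypothetical mocks so the example is self-contained; they are not the zval structure or any Zend macro:

```c
#include <stdint.h>

/* Hypothetical mock of an int|float argument (NOT the real zval layout). */
typedef enum { NUM_LONG, NUM_DOUBLE } num_kind;

typedef struct {
    num_kind kind;
    union {
        int64_t lval; /* used when kind == NUM_LONG */
        double  dval; /* used when kind == NUM_DOUBLE */
    } v;
} num_arg;

/* Widen either variant to double. Integers beyond 2^53 lose precision here,
 * which is one reason to keep the integer branch separate where possible. */
double num_arg_to_double(const num_arg *n)
{
    return n->kind == NUM_LONG ? (double) n->v.lval : n->v.dval;
}
```

Keeping the two variants distinct until the last moment would also let the implementation pass integers through unchanged if the underlying formatter accepts them.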

if (!ret) {
intl_error_set(NULL, error, "Failed to convert result to UTF-8", 0);
// RETVAL_FALSE;
return;
Member

this (and the return above) will result in a null return value which is missing from the stub, so it should either be added there, or you should throw an exception.

public static function createFromSkeleton(string $skeleton, string $locale, int $collapse, int $identityFallback): IntlNumberRangeFormatter {}

public function format(float|int $start, float|int $end): string {}
}
Member

@kocsismate Jul 25, 2025

IntlListFormatter has the getError*() methods. Why didn't you implement them for IntlNumberRangeFormatter? (maybe that's what you still need to implement according to the last sentence of the PR description?)

Successfully merging this pull request may close these issues.

[intl] Expose the ICU NumberRangeFormatter