Conversation

@triska (Contributor) commented Sep 19, 2025

Source:

https://github.com/mthom/scryer-prolog/discussions/3080#discussioncomment-14387382

Currently, I get:

?- phrase_from_file(...@, "pio.pl").
   freeze:freeze(_A,user:render_step(true,'$dropped_value',position_and_lines_read(0,0),_A)).

How to best solve this?

@UWN commented Sep 19, 2025

An explicit '$unfreeze'/1 that removes the attribute. Probably. The question is rather whether the variable can be accessed in some other way than via backtracking.
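For context on the residual goal shown above: freeze/2 attaches a delayed goal to an unbound variable as an attribute, and any attribute that survives to an answer is reported by the toplevel as a residual goal. A minimal sketch using Scryer's library(freeze) (the goal integer(X) is an arbitrary example, and the answers show the expected shape rather than verbatim toplevel output):

```prolog
:- use_module(library(freeze)).

% The goal is delayed; since X stays unbound, the attribute remains
% and is reported as a residual goal:
%   ?- freeze(X, integer(X)).
%      freeze:freeze(X,user:integer(X)).
%
% Binding X wakes the goal, which then succeeds; no attribute remains:
%   ?- freeze(X, integer(X)), X = 1.
%      X = 1.
```

This is exactly the situation with phrase_from_file/2: the frozen goal that lazily reads the file stays attached to the tail variable, and so shows up in the answer.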

triska added 2 commits October 4, 2025 15:19
@triska (Contributor, Author) commented Oct 4, 2025

I think I solved it! Please have a look!

triska marked this pull request as ready for review October 4, 2025 13:34
@UWN commented Oct 5, 2025

?- length(_,E),E>25,N is 2^E,length(L,N),time(phrase(...@,L)).
   % CPU time: 0.143s, 61 inferences
   E = 26, N = 67108864, L = [_A,_B,_C,_D,_E,_F,_G,_H,_I,_J,_K,_L,_M,_N,_O,_P,_Q,_R,_S,_T|...]
;  % CPU time: 0.289s, 112 inferences
   E = 27, N = 134217728, L = [_A,_B,_C,_D,_E,_F,_G,_H,_I,_J,_K,_L,_M,_N,_O,_P,_Q,_R,_S,_T|...]
;  % CPU time: 0.573s, 112 inferences
   E = 28, N = 268435456, L = [_A,_B,_C,_D,_E,_F,_G,_H,_I,_J,_K,_L,_M,_N,_O,_P,_Q,_R,_S,_T|...]
;  ... .
?- length(_,E),E>25,N is 2^E,length(L,N),time(phrase(...,L)).
   % CPU time: 17.831s, 335_544_379 inferences
   E = 26, N = 67108864, L = [_A,_B,_C,_D,_E,_F,_G,_H,_I,_J,_K,_L,_M,_N,_O,_P,_Q,_R,_S,_T|...]
;  % CPU time: 35.118s, 671_088_750 inferences
   E = 27, N = 134217728, L = [_A,_B,_C,_D,_E,_F,_G,_H,_I,_J,_K,_L,_M,_N,_O,_P,_Q,_R,_S,_T|...]
;  % CPU time: 71.450s, 1_342_177_390 inferences
   E = 28, N = 268435456, L = [_A,_B,_C,_D,_E,_F,_G,_H,_I,_J,_K,_L,_M,_N,_O,_P,_Q,_R,_S,_T|...]
;  ... .

So more than a factor of 100 faster for regular lists. And it seems fine with phrase_from_file/2.

@triska (Contributor, Author) commented Oct 5, 2025

I think one of the most impressive features is that this lets us automatically stop reading from a file as soon as we find what we are looking for. Is this true?

In this way, the potential improvement for very large files is in fact unbounded.
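A hypothetical illustration of that early-stopping behaviour (the file name and non-terminal are made up; it assumes ...@ //0 accepts the remainder of the list without forcing it, as discussed in this PR):

```prolog
:- use_module(library(pio)).
:- use_module(library(dcgs)).

% Succeeds as soon as "needle" occurs in the file. Because the list
% produced by phrase_from_file/2 is lazy, no characters past the match
% need to be read: the trailing ...@ //0 accepts the unread rest of the
% file without forcing it.
contains_needle --> ..., "needle", ...@ .

% ?- phrase_from_file(contains_needle, "haystack.txt").
```

So for a match near the start of a very large file, the amount of I/O saved is indeed unbounded.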

@UWN commented Oct 6, 2025

Right, I was not sure what file to take.

?- time(phrase_from_file((...@|...,[C]|...@|...),"src/machine/system_calls.rs")).
   % CPU time: 0.001s, 589 inferences
   true
;  % CPU time: 0.466s, 2_033_250 inferences
   C = '\n'
;  % CPU time: 0.002s, 166 inferences
   true
;  % CPU time: 0.310s, 1_689_233 inferences
   true
;  % CPU time: 0.001s, 76 inferences
   false.

While the frozen goal is removed for a leaf answer, on backtracking it is reestablished, so that even the last character of the file can be read.

@UWN commented Oct 6, 2025

@jjtolton, have you tried it?

@UWN commented Oct 13, 2025

@jjtolton, did you look at it?

@jjtolton commented
> @jjtolton, did you look at it?

Somehow I was missing these notifications.

:- use_module(library(pio)).
:- use_module(library(dcgs)).
:- use_module(library(lists)).
:- use_module(library(time)).

demo :-
    E = 24, N is 2^E, length(L, N),
    time(phrase(...@, L)), nl,
    time(phrase(..., L)), nl, nl.

?- once(demo).
%@    % CPU time: 0.044s, 38 inferences
%@ 
%@    % CPU time: 6.129s, 83_886_139 inferences
%@ 
%@ 
%@    true.

Wow -- a 140x speedup, and about six orders of magnitude fewer inferences??

@bakaq (Contributor) commented Oct 13, 2025

Any reason why this couldn't just be integrated into ... //0 itself? Doing the '$skip_max_list' doesn't have anything to do with files.

I guess we would need this anyway to actually close the file. Maybe there's a way to invert the control here and make the file closing not depend on the non-terminal being called, so that the stock ... //0 would already work unmodified. I can't see anything obvious though.
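For readers unfamiliar with it: '$skip_max_list'/4 is a Scryer-internal builtin that walks down a list's spine, unifying its first argument with the number of elements skipped (bounded by the second argument, or unbounded if that is a variable) and its last argument with the remaining tail. A usage sketch inferred from how the snippet later in this thread uses it, not official documentation:

```prolog
% '$skip_max_list'(Skipped, Max, List0, Tail):
% skip at most Max elements of List0 (no limit if Max is unbound),
% leaving the unskipped remainder in Tail.
%
%   ?- '$skip_max_list'(N, _, [a,b,c|T], Rest).
%   presumably: N = 3, Rest = T   % stops at the unbound tail
```

The point is that it traverses the spine in one builtin call instead of one inference per element, which is where the gains below come from.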

@triska (Contributor, Author) commented Oct 14, 2025

The file is automatically closed on deterministic success, failure or exception, via setup_call_cleanup/3.

The design question is: Should ... //0 have to bear the overhead of additional checks, even in cases where that overhead can be avoided?
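The shape of that cleanup mechanism, sketched with hypothetical predicate names (setup_call_cleanup/3 is provided by Scryer's library(iso_ext); the real code is in library(pio)):

```prolog
:- use_module(library(iso_ext)).

% The stream is closed as soon as the goal succeeds deterministically,
% fails, or throws -- exactly the cases listed above.
with_file(File, Goal) :-
    setup_call_cleanup(open(File, read, Stream),
                       call(Goal, Stream),
                       close(Stream)).
```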

@UWN commented Oct 14, 2025

> Any reason why this couldn't just be integrated into ... //0 itself?

Integrating it into ... //0 would incur a tiny overhead everywhere, just for this case. ...@ means: I do know this is the end, and if I am wrong, then I want some indication that I am wrong. Only ...@ //0 can produce such an error.

@bakaq (Contributor) commented Oct 14, 2025

> Integrating it into ... //0 would incur a tiny overhead everywhere [...]

Considering the current implementation of ... //0 isn't even optimized, I think the tradeoff is worth it.

% This could be the definition of `... //0`.
elipsis(S0, S) :-
    % The test is only done once for each call of `... //0`
    % and immediately dispatches to the old implementation.
    % This means that the overhead doesn't appear on backtracking
    % of this non-terminal.
    (   S == [] ->
        '$skip_max_list'(_,_,S0,S1),
        elipsis_(S1, S)
    ;   elipsis_(S0, S)
    ).

% Basically the current definition of `... //0` but optimized.
% (Also pure and simpler!??)
elipsis_(S, S).
elipsis_([_|S0], S) :- elipsis_(S0, S).

% The current definition of `... //0` for reference.
...(Cs0,Cs) :- % Isn't this "a tiny overhead everywhere"?
   Cs0 == [],
   !,
   Cs0 = Cs.
... --> [] | [_], ... .

Some benchmarks of the worst case scenario:

?- Ls = [_,_], time((phrase(({between(1,1000000,_)}, ..., "a"), Ls), false; true)).
   % CPU time: 2.099s, 23_000_149 inferences
   Ls = [_A,_B].
?- Ls = [_,_], time((phrase(({between(1,1000000,_)}, elipsis_, "a"), Ls), false; true)).
   % CPU time: 1.346s, 15_000_149 inferences
   Ls = [_A,_B].
?- Ls = [_,_], time((phrase(({between(1,1000000,_)}, elipsis, "a"), Ls), false; true)).
   % CPU time: 1.513s, 17_000_149 inferences
   Ls = [_A,_B].

Notice that it's actually faster than the current implementation even with the check. Here's the best case scenario:

?- length(Ls, 1000000), time((phrase(({between(1,100,_)}, ...), Ls), false; true)).
   % CPU time: 35.875s, 500_001_141 inferences
   Ls = [_A,_B,_C,_D,_E,_F,_G,_H,_I,_J,_K,_L,_M,_N,_O,_P,_Q,_R,_S,_T|...].
?- length(Ls, 1000000), time((phrase(({between(1,100,_)}, elipsis_), Ls), false; true)).
   % CPU time: 8.508s, 200_000_941 inferences
   Ls = [_A,_B,_C,_D,_E,_F,_G,_H,_I,_J,_K,_L,_M,_N,_O,_P,_Q,_R,_S,_T|...].
?- length(Ls, 1000000), time((phrase(({between(1,100,_)}, elipsis), Ls), false; true)).
   % CPU time: 0.331s, 1_141 inferences
   Ls = [_A,_B,_C,_D,_E,_F,_G,_H,_I,_J,_K,_L,_M,_N,_O,_P,_Q,_R,_S,_T|...].

Absurd gains! I believe using ... //0 at the end of a grammar rule body is common enough, and the overhead small enough, to make this check useful to have in general.

A more ambitious solution might be to do this in the expansion of phrase itself: use the elipsis//0 definition if ... //0 appears at the end of the grammar rule body, and the elipsis_//0 definition everywhere else. That way the overhead would only be present when it actually matters, specializing the implementation at compile/expansion time instead of at runtime. I can think of other non-terminals that could also benefit from this kind of compile-time specialization, so maybe there could be a generalized way to do this.
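A sketch of that expansion-time dispatch as a hypothetical helper (not actual library(dcgs) code): walk a conjunction, mapping a trailing '...' to elipsis//0 (which performs the tail check) and every inner occurrence to elipsis_//0 (which skips it):

```prolog
% specialize(+Body0, -Body): hypothetical compile-time pass over a
% translated rule body. Only the final conjunct gets the checking
% variant; inner occurrences use the plain one.
specialize((A, B), (A1, B1)) :-
    !,
    specialize_inner(A, A1),   % A is not in final position
    specialize(B, B1).
specialize('...', elipsis) :- !.         % final position: do the [] check
specialize(G, G).

specialize_inner((A, B), (A1, B1)) :-
    !,
    specialize_inner(A, A1),
    specialize_inner(B, B1).
specialize_inner('...', elipsis_) :- !.  % inner position: no check
specialize_inner(G, G).
```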

Anyway, this is just general optimization stuff which isn't that important for this PR. I think the vanilla ... //0 shouldn't be aware of attributed variables, so ...@ //0 is needed either way. Let's get this in!

@UWN commented Oct 14, 2025

The difference between ... //0 and your elipsis_//0 is really a problem of the WAM. It really should be the same speed. Optimizing this (on this level) means fossilizing the suboptimal WAM design.
