Description
The construction in LearnLib for the canonical RFSA sometimes includes a sink state. This is wrong. A row which does not accept anything, should not be consider to be prime.
I have to agree that the definitions in the Bollig et al paper "Angluin-Style Learning of NFA" (2009) are not very precise on this matter. But in the original work on RFSA by Denis and Eposito ("Residual Finite State Automata", 2002) it is clear that an empty language is not prime:
Note that a prime residual language is not empty and that the set of distinct prime residual languages of a regular language is finite. (page 6)
(This also agrees with the standard notion of join-irreducible from lattice theory: the bottom element is not join-irreducible.) Consequently, the canonical RFSA will never have sink states.
Here is the code to reproduce the bug:
// language is L = {abbb...} + {baaa...}
private static CompactNFA<Character> constructSUL() {
Alphabet<Character> alphabet = Alphabets.characters('a', 'b');
return AutomatonBuilders.newNFA(alphabet)
.withInitial("q0")
.from("q0")
.on('a').to("q1")
.on('b').to("q2")
.from("q1")
.on('b').to("q1")
.from("q2")
.on('a').to("q2")
.withAccepting("q1")
.withAccepting("q2")
.create();
}
public static void main(String[] args) throws IOException {
CompactNFA<Character> target = constructSUL();
Alphabet<Character> alphabet = target.getInputAlphabet();
// MQs
SimulatorOracle<Character, Boolean> mqOracle = new SimulatorOracle<Character, Boolean>(target);
// EQs
SampleSetEQOracle<Character, Boolean> eqOracle = new SampleSetEQOracle<>(false);
eqOracle.addAll(mqOracle, Word.fromCharSequence("a"), Word.fromCharSequence("ab"), Word.fromCharSequence("aa"), Word.fromCharSequence("bab"));
// construct Learner instance
NLStarLearner<Character> learner = new NLStarLearnerBuilder<Character>().withAlphabet(alphabet).withOracle(mqOracle).create();
learner.startLearning();
NFA<?, Character> hyp;
while (true) {
hyp = learner.getHypothesisModel();
System.out.println("States " + ": " + hyp.size());
DefaultQuery<Character, Boolean> ce = eqOracle.findCounterExample(hyp, alphabet);
if (ce == null) break;
System.out.println("CEX " + ": " + ce.getInput().toString());
boolean refined = learner.refineHypothesis(ce);
assert refined;
}
// report results
System.out.println("-------------------------------------------------------");
System.out.println("States: " + hyp.size());
System.out.println("Model: ");
GraphDOT.write(hyp, alphabet, System.out);
}
Wrong output
States : 3
CEX : «'b'␣'a'␣'b'»
States : 4
-------------------------------------------------------
States: 4
Model:
digraph g {
s0 [shape="circle" label="0"];
s1 [shape="doublecircle" label="1"];
s2 [shape="doublecircle" label="2"];
s3 [shape="circle" label="3"];
s0 -> s1 [label="a"];
s0 -> s2 [label="b"];
s1 -> s3 [label="a"];
s1 -> s1 [label="b"];
s1 -> s3 [label="b"];
s2 -> s2 [label="a"];
s2 -> s3 [label="a"];
s2 -> s3 [label="b"];
s3 -> s3 [label="a"];
s3 -> s3 [label="b"];
__start0 [label="" shape="none" width="0" height="0"];
__start0 -> s0;
}
The state s3
is a sink state, and should be removed altogether. Otherwise, the automaton is correct, and accepts the language L.