Bug in XML::SAX::PurePerl
Hello,
I've just found a bug in XML::SAX. When the “--” of “-->” falls exactly at the end of the input buffer, it is treated as ordinary comment content. In the following diff, I explicitly request at least three characters from the reader (the length of “-->”) and also handle the special case where a “-” sits at the end of the buffer: the text before the dash is consumed, the dash is left at the start of the buffer, and it is processed in the next iteration.
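To illustrate the failure mode outside the parser, here is a minimal standalone sketch. ChunkReader is a hypothetical stand-in, not the real XML::SAX::PurePerl reader: it only mimics the data and move_along calls as they are used in the diff, and it assumes that data($min) tops the buffer up to at least $min characters when more input is available. scan_comment is likewise an illustrative helper that copies the loop from Comment, with a flag to switch between the original and the patched logic.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parser's reader: it hands out the input
# in fixed-size chunks, much like a buffered stream would.
package ChunkReader;

sub new {
    my ($class, $text, $chunk_size) = @_;
    return bless { rest => $text, buf => '', chunk => $chunk_size }, $class;
}

# Return the buffer, first topping it up until it holds at least $min
# characters or the input is exhausted (assumed behaviour, see above).
sub data {
    my ($self, $min) = @_;
    $min = 1 unless defined $min;
    while (length($self->{buf}) < $min && length($self->{rest})) {
        $self->{buf} .= substr($self->{rest}, 0, $self->{chunk}, '');
    }
    return $self->{buf};
}

# Drop $n consumed characters from the front of the buffer.
sub move_along {
    my ($self, $n) = @_;
    substr($self->{buf}, 0, $n, '');
    return;
}

package main;

# Scan a comment body; the input starts just after the opening "<!--".
# $patched false: original loop. $patched true: loop from the diff.
sub scan_comment {
    my ($reader, $patched) = @_;
    my $comment = '';
    while (1) {
        my $data = $patched ? $reader->data(3) : $reader->data;
        die "End of data seen while looking for close comment marker\n"
            unless length $data;
        if ($data =~ /^(.*?)-->/s) {
            $comment .= $1;
            die "Invalid comment (dash)\n" if $comment =~ /-$/;
            $reader->move_along(length($1) + 3);
            last;
        }
        elsif ($patched && $data =~ /^(.+?)-/s) {
            # Consume only the text before the first dash, so the dash
            # stays at the front of the buffer for the next iteration.
            $comment .= $1;
            $reader->move_along(length($1));
        }
        else {
            $comment .= $data;
            $reader->move_along(length($data));
        }
    }
    return $comment;
}

# With a chunk size of 7, the first chunk is ' one --', so the "--" of
# the first "-->" sits exactly at the end of the buffer.
my $body = ' one --><b/><!-- two -->';
my $old  = scan_comment(ChunkReader->new($body, 7), 0);
my $new  = scan_comment(ChunkReader->new($body, 7), 1);
print "original loop: [$old]\n";
print "patched loop:  [$new]\n";
```

With this input the original loop misses the first close marker and returns everything up to the second one (` one --><b/><!-- two `), while the patched loop returns just ` one `. The chunk size of 7 is chosen only to force the boundary; the real reader's buffer size is different.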
The bug caused sporadic misbehavior in my program, and it was not pleasant to pinpoint the cause of the problem.
Feel free to apply my patch. Attribution is appreciated but not required.
jwo
Diff follows.
--- a	2024-08-22 07:29:01.400162161 +0200
+++ b	2024-08-22 07:29:28.013494950 +0200
@@ -590,33 +590,37 @@
     return $value;
 }
 
 sub Comment {
     my ($self, $reader) = @_;
 
     my $data = $reader->data(4);
     if ($data =~ /^<!--/) {
         $reader->move_along(4);
         my $comment_str = '';
         while (1) {
-            my $data = $reader->data;
+            my $data = $reader->data(3);
             $self->parser_error("End of data seen while looking for close comment marker", $reader)
                 unless length($data);
             if ($data =~ /^(.*?)-->/s) {
                 $comment_str .= $1;
                 $self->parser_error("Invalid comment (dash)", $reader) if $comment_str =~ /-$/;
                 $reader->move_along(length($1) + 3);
                 last;
             }
+            elsif ($data =~ /^(.+?)-/s) {
+                $comment_str .= $1;
+                $reader->move_along(length($1));
+            }
             else {
                 $comment_str .= $data;
                 $reader->move_along(length($data));
             }
         }
 
         $self->comment({ Data => $comment_str });
 
         return 1;
     }
     return 0;
 }
 