Pipeline Fix & Avoid Pointless UTF8 Checks

Michael Aaron Murphy requested to merge mmstick:pipeline_fix into master

This fixes the spacing issue with pipelines and the incorrect processing of non-ASCII characters, and it improves performance by disabling UTF8 checks where they are redundant.

Instead of iterating over characters to check for specific ASCII characters, this change iterates over strings by bytes, which allows strings to be sliced correctly when non-ASCII characters are in the mix. This is faster because iterating by bytes skips UTF8 decoding, and most text consists of ASCII characters anyway. Strings are also now sliced at the correct byte index, so they are no longer butchered.
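To illustrate the idea, here is a minimal sketch, not the code from this merge request. The function name `split_pipeline` is hypothetical. It relies on two facts: `str` slices are indexed by byte offset, and an ASCII byte such as `|` never occurs inside a multi-byte UTF8 sequence, so the offset of a matched ASCII byte is always a valid slicing boundary.

```rust
/// Hypothetical helper (not from the merge request): split a command
/// line on the ASCII `|` byte and trim the whitespace around each part.
fn split_pipeline(input: &str) -> Vec<&str> {
    let mut parts = Vec::new();
    let mut start = 0;
    // `bytes().enumerate()` yields byte offsets, which is what slicing
    // a `str` expects. `chars().enumerate()` would yield character
    // counts instead, which drift from the byte offsets as soon as a
    // non-ASCII character appears.
    for (index, byte) in input.bytes().enumerate() {
        if byte == b'|' {
            // `index` points at an ASCII byte, so both `index` and
            // `index + 1` are valid character boundaries.
            parts.push(input[start..index].trim());
            start = index + 1;
        }
    }
    parts.push(input[start..].trim());
    parts
}

fn main() {
    let parts = split_pipeline("echo héllo | grep é | wc -l");
    // prints ["echo héllo", "grep é", "wc -l"]
    println!("{:?}", parts);
}
```

Using a character index from `chars().enumerate()` to slice the same input would cut at the wrong position, or panic if the index landed inside the two-byte `é`.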

Some sections of the code use the unsafe keyword when splicing strings, to convert a vector of bytes into a String without a UTF8 check. This is sound because the bytes originate from a string that was already valid UTF8, and these sections only add or remove specific ASCII characters, which cannot break up a multi-byte sequence.
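A minimal sketch of that pattern, assuming a hypothetical helper named `collapse_spaces` (it is not taken from the merge request): the bytes are copied from a valid `&str`, only ASCII space bytes are dropped, and the result is converted back with the standard library's `String::from_utf8_unchecked`.

```rust
/// Hypothetical helper (not from the merge request): collapse each run
/// of ASCII spaces into a single space.
fn collapse_spaces(input: &str) -> String {
    let mut bytes = Vec::with_capacity(input.len());
    let mut previous_was_space = false;
    for byte in input.bytes() {
        if byte == b' ' {
            // Keep only the first space of a run.
            if !previous_was_space {
                bytes.push(byte);
            }
            previous_was_space = true;
        } else {
            bytes.push(byte);
            previous_was_space = false;
        }
    }
    // SAFETY: `bytes` is a copy of a valid UTF8 string with some ASCII
    // space bytes removed. ASCII bytes never form part of a multi-byte
    // sequence, so every remaining sequence is still complete.
    unsafe { String::from_utf8_unchecked(bytes) }
}

fn main() {
    // prints "echo héllo | wc"
    println!("{}", collapse_spaces("echo  héllo   |  wc"));
}
```

The checked alternative, `String::from_utf8(bytes)`, would scan the whole buffer again to confirm a property that already holds by construction, which is the redundant work this change avoids.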

Fixes issue #214 (closed)