[Bf-docboard] Project Eddy

Fri Jan 26 03:27:53 CET 2018

Hi Tobias,

Having tools to check consistency can be handy, but I am concerned this
collides with existing tools; e.g. we already have "make
check_spelling" as a target.

On the other hand your work may be an improvement on existing tools
where there is overlap.

I think it might get confusing if people look in our tools/ dir and
are faced with too many specialized tools which probably aren't needed
by most writers.
It may be that we need to split tools into
"tools_(maintenance/syntax/spelling/translations/reporting)" ... to
avoid too much confusion.

Could you post a patch for these scripts?

On Wed, Jan 24, 2018 at 9:49 AM, Tobias Heinke
<heinke.tobias at t-online.de> wrote:
> Hi all,
>
> I turned the regex checks I've done in the past into py-scripts.
> When these checks have a lot of false positives it's only feasible to apply
> them to new content.
> So I had the idea to apply them to the diffs instead. The goal is to
> check/filter changes of the manual.
> Both incoming and outgoing, so it can be run by a single person.
>
> The first type of check is lint. The most important lint checks prevent
> leaked markup and ensure that only the manual's subset of RST (style
> guide) is used.
> Other checks cover spelling mistakes that pass normal spell-checking, like
> 'mash' instead of 'mesh', and words that are not yet used in the manual and
> are therefore likely to be misspelled, like 'decease' vs. 'decrease'.
> They also cover code style: double spaces, spaces at the end of a line, etc.
>
> The benefits are quite obvious:
> - Errors are prevented.
> - When these errors aren't committed the versioning gets cleaner.
> - Manual cleanup doesn't have to be done as often.
> - The commit itself doesn't have to be manually checked for these kinds of
> common errors.
>
> Writing the script was easy, because luckily svn_commit.py (by anfelor) does
> almost the same. The svn script itself is quite simple (100 lines).
> It checks the svn status and makes a diff of the modified files. The tools'
> output is filtered for occurrences on lines that start with a '+'.
>
> I expanded the rst-helper to iterate over almost every rst-construct or to
> remove it (to prevent false positives).
> The tools have a common input schema and use the same output format.
>
> I think it's worth adding it to the tools folder, but it needs a utility
> (stemmer) and data.
> It has to be finished, polished, and tested anyway and that will take some
> time.
>
> Outlook:
> The interface that selects subsets of checks has to be finished.
> Preventing false positives is either computationally expensive or
> challenging to manage (almost like parsing).
> A final feature could be automatic fixes of 100%-ers and a y/n console
> interface.
> And also the tools in the [tools] folder can be adapted.
>
> Finally some numbers:
> - 7 tools (e.g. char count of headline underlines, indentation, etc.)
> - 45 reg. exp. (which as groups function as a tool e.g. to prevent leaked
> markup)
> - 5 lists (confusable words, Blender UI, British English, domains of
> external links, comma phrases)
> - 2 utility tools (stemmer, fuzzy string matching)
>
> Tobias
>
> _______________________________________________
> Bf-docboard mailing list
> Bf-docboard at blender.org
> https://lists.blender.org/mailman/listinfo/bf-docboard

-- 
- Campbell
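
For illustration only, the diff-based approach described in the quoted message (run the checks, then keep only hits on lines that start with a '+') could be sketched roughly as below. This is not the actual svn_commit.py or Tobias's scripts. The names added_lines, lint_diff, CHECKS and SAMPLE are hypothetical, and the two patterns are simplified stand-ins for the "double spaces" and "spaces at the end of line" checks mentioned above.

```python
import re

# Matches a unified-diff hunk header and captures the first new-file line number.
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")

# Hypothetical, simplified checks; the real scripts use many more patterns.
CHECKS = [
    ("double space", re.compile(r"(?<=\S)  +(?=\S)")),
    ("trailing whitespace", re.compile(r"[ \t]+$")),
]


def added_lines(diff_text):
    """Yield (new_file_line_number, text) for every line the diff adds."""
    new_lineno = None  # None while we are inside a file header
    for raw in diff_text.splitlines():
        if raw.startswith(("Index: ", "diff ")):
            new_lineno = None  # a new file section starts
            continue
        match = HUNK_RE.match(raw)
        if match:
            new_lineno = int(match.group(1))
            continue
        if new_lineno is None:
            continue  # skip "---" / "+++" header lines
        if raw.startswith("+"):
            yield new_lineno, raw[1:]
            new_lineno += 1
        elif raw.startswith(("-", "\\")):
            continue  # removed line or "\ No newline at end of file"
        else:
            new_lineno += 1  # context line


def lint_diff(diff_text):
    """Return (line_number, check_name) for each hit on an added line."""
    findings = []
    for lineno, text in added_lines(diff_text):
        for name, pattern in CHECKS:
            if pattern.search(text):
                findings.append((lineno, name))
    return findings


# A made-up svn-style diff used only to exercise the sketch.
SAMPLE = "\n".join([
    "Index: manual/modeling.rst",
    "=" * 67,
    "--- manual/modeling.rst\t(revision 1)",
    "+++ manual/modeling.rst\t(working copy)",
    "@@ -10,3 +10,4 @@",
    " Unchanged context line.",
    "-Old  line with a double space.",
    "+New line with a  double space.",
    "+Line with trailing space. ",
    " Another context line.",
])

if __name__ == "__main__":
    for lineno, name in lint_diff(SAMPLE):
        print(f"line {lineno}: {name}")
```

Because only added lines are inspected, the double space on the removed line in SAMPLE is not reported, which is the point of checking diffs rather than whole files when a check has many false positives on existing content.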