forked from lversaw/id_tn_l3
Indonesian tN issues
The correct .md file format is documented at https://www.techadvancement.com/community/train/how-to-format-translation-notes-for-publishing-v-mast
Cleanup steps:
- Initial inspection of the id_tn_l3 data. Preliminary report to team on 11/12/19.
- Conference call on 11/20/19. (Chuck, John, Craig, Max, Tabitha, Christine)
- Documented correct file format at https://www.techadvancement.com/community/train/how-to-format-translation-notes-for-publishing-v-mast.
- More extensive analysis and documentation of deviations.
- Forked the Indonesian repository to https://wacs.bibletranslationtools.org/lversaw/id_tn_l3. This is the WACS workspace for data cleanup for this project.
File modifications:
1. Removed the first line of every note file other than the intro.md files. Also removed the second line if blank.
   - Note files other than intro.md consistently had two extraneous lines at the top.
   - RISK: may have deleted some valid data.
   - 23,055 files affected
2. Removed the first two lines from intro.md files that start with some variation of "# Pendahuluan".
   - 446 files affected
3. Removed lines containing empty HTML comments, and the line following each, if blank.
   - All HTML comment tags found were empty comments.
   - 469 files affected
4. Converted instances of `&nbsp;` to a single space.
   - 11,694 files affected
5. Removed top line of file if blank. Consolidated consecutive blank lines elsewhere in file.
   - 842 files affected
6. Removed instances of `<o:p></o:p>` and `<o:p> </o:p>`.
   - They had no apparent purpose or meaning.
   - 620 files affected
7. Removed blank headers and the blank line following (if any).
   - 1,858 files affected
8. Fixed language code in tA links. Replaced rc://en/ with rc://id/
   - 20,065 files affected
9. Removed blank lines between list items.
   - 427 files affected
10. Removed the hash marks from higher-level headings (level 2 or higher) in files showing the first classic pattern of corrupted heading levels.
    - Classic pattern means: First heading at level 1. Subsequent headings alternate between a higher level and level 1. Ends with a higher-level heading. No untagged text lines anywhere.
    - 681 files affected
11. Promoted headings to level 1 in files showing the second classic pattern of corrupted heading levels.
    - Classic pattern means: First heading at level 2 or higher. Subsequent headings always at the same level. Plain text lines alternate with headings. Ends with a plain text line.
    - 1,665 files affected
12. Removed top two lines of files meeting these criteria:
    - At least 5 lines long
    - First line contains a verse reference (space followed by digits, colon, and digits)
    - Second line is blank, and third line starts with a hash mark
    - 3,394 files affected
    - RISK: may have deleted some valid data.
13. Removed "Kata-kata Terjemahan" section from files that had it.
    - Those sections are from an older tN version.
    - 4,817 files affected
14. Reapplied #11.
    - 475 files affected
15. Reapplied #10.
    - 241 files affected
16. Volunteers manually edited the files identified in issues.txt.
    - about 4,300 files affected
17. Renamed all folders and files in Psalms from 2-digit to 3-digit names.
    - 2,385 files/folders affected
18. Made a few manual, one-off edits.
    - about 15 files affected
19. Removed top line of file if blank or all spaces. Consolidated consecutive blank or space-filled lines elsewhere in file.
    - 378 files affected
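
Several of the text-level modifications above (the `&nbsp;` conversion, the `<o:p>` removal, the rc://en/ link fix, and the blank-line cleanup) can be sketched in a few lines of Python. This is only an illustration of the rules as described in this list, not the script that was actually run on the repository. The function name `clean_note_text` and the sample link are assumptions made for the example.

```python
import re


def clean_note_text(text: str) -> str:
    """Apply a few of the text-level fixes to the contents of one note file.

    Hypothetical helper: the name and the exact order of steps are assumptions.
    """
    # Convert non-breaking-space entities to a single space.
    text = text.replace("&nbsp;", " ")
    # Remove empty <o:p></o:p> and <o:p> </o:p> tags.
    text = re.sub(r"<o:p>\s?</o:p>", "", text)
    # Point tA links at the Indonesian resources instead of the English ones.
    text = text.replace("rc://en/", "rc://id/")

    # Treat lines that contain only spaces the same as blank lines.
    lines = ["" if line.strip() == "" else line for line in text.split("\n")]
    # Drop blank lines at the top of the file.
    while lines and lines[0] == "":
        lines.pop(0)
    # Collapse runs of consecutive blank lines into a single blank line.
    cleaned = []
    for line in lines:
        if line == "" and cleaned and cleaned[-1] == "":
            continue
        cleaned.append(line)
    return "\n".join(cleaned)


if __name__ == "__main__":
    # Made-up sample content, for illustration only.
    sample = "\n# Heading&nbsp;one\n\n   \n\nSee [[rc://en/ta/man/translate/figs-metaphor]]<o:p></o:p>\n"
    print(clean_note_text(sample))
```

Because each rule is a plain string or line operation, running the function a second time on its own output leaves the text unchanged, which makes it safe to reapply after later edits.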
Remaining issues are documented in issues.txt.
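
For reference, the two "classic patterns" of corrupted heading levels described under File modifications can be detected mechanically. The sketch below is one possible reading of those descriptions, written in Python. It is not the tool that was used, and the names `heading_level` and `fix_heading_levels` are hypothetical. It assumes blank lines are ignored when checking the alternation of headings and text.

```python
import re

# A heading is one or more hash marks followed by the heading text.
HEADING = re.compile(r"^(#+)\s*(.*)$")


def heading_level(line: str) -> int:
    """Return the number of leading hash marks, or 0 for a plain text line."""
    match = HEADING.match(line)
    return len(match.group(1)) if match else 0


def fix_heading_levels(text: str) -> str:
    """Repair a file that matches one of the two classic patterns; else return it unchanged."""
    lines = text.split("\n")
    # Indexes of non-blank lines (assumption: blank lines do not affect the pattern).
    idx = [i for i, line in enumerate(lines) if line.strip()]
    levels = [heading_level(lines[i]) for i in idx]
    if len(levels) < 2 or len(levels) % 2 != 0:
        return text
    firsts, seconds = levels[0::2], levels[1::2]

    # First pattern: level-1 headings alternate with higher-level headings,
    # and there are no untagged lines. Strip the marks from the higher-level ones.
    if all(lv == 1 for lv in firsts) and all(lv > 1 for lv in seconds):
        for i in idx[1::2]:
            lines[i] = HEADING.match(lines[i]).group(2)
    # Second pattern: headings all at one level (2 or higher) alternate with
    # plain text lines. Promote the headings to level 1.
    elif firsts[0] >= 2 and all(lv == firsts[0] for lv in firsts) and all(lv == 0 for lv in seconds):
        for i in idx[0::2]:
            lines[i] = "# " + HEADING.match(lines[i]).group(2)
    return "\n".join(lines)
```

A file that already has level-1 headings alternating with plain text matches neither pattern and is returned as is, so the function can be rerun after other cleanup steps without further effect.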