# Indonesian tN issues
The correct .md file format is documented at https://www.techadvancement.com/community/train/how-to-format-translation-notes-for-publishing-v-mast
## Cleanup steps:
1. Initial inspection of the id_tn_l3 data. Preliminary report to team on 11/12/19.
2. Conference call on 11/20/19. (Chuck, John, Craig, Max, Tabitha, Christine)
3. Documented correct file format at https://www.techadvancement.com/community/train/how-to-format-translation-notes-for-publishing-v-mast.
4. More extensive analysis and documentation of deviations.
5. Forked the Indonesian repository to https://wacs.bibletranslationtools.org/lversaw/id_tn_l3. This is the WACS workspace for data cleanup for this project.
## Automated cleanup steps:
1. Removed the first line of every note file other than the intro.md files. Also removed the second line if blank.
* Consistently, note files other than intro.md had two extraneous lines at the top.
* RISK: may have deleted some valid data.
* 23,055 files affected
2. Removed the first two lines from intro.md files that start with some variation of "# Pendahuluan"
* 446 files affected
3. Removed lines containing empty HTML comments, and the lines following, if blank.
* All HTML comment tags found were empty comments.
* 469 files affected
4. Converted instances of & nbsp; to a single space.
* 11,694 files affected
5. Removed top line of file if blank. Consolidated consecutive blank lines elsewhere in file.
* 842 files affected
6. Removed instances of <o:p></o:p> and <o:p> </o:p>.
* They had no apparent purpose or meaning.
* 620 files affected
7. Removed blank headers and the blank line following (if any).
* 1,858 files affected
8. Fixed language code in tA links. Replaced rc://en/ with rc://id/
* 20,065 files affected
9. Removed blank lines between list items.
* 427 files affected
10. Removed the hash marks from higher-level headings in files showing the first classic pattern of corrupted heading levels.
* Classic pattern means: the first heading is at level 1, and subsequent headings alternate between a higher level and level 1.
* The file ends with a higher-level heading and contains no untagged text lines.
* 681 files affected
11. Promoted headings to level 1 in files showing the second classic pattern of corrupted heading levels.
* Classic pattern means: the first heading is at level 2 or higher, and all subsequent headings are at that same level.
* Plain text lines alternate with headings, and the file ends with a plain text line.
* 1,665 files affected
12. Removed top two lines of files meeting these criteria:
* At least 5 lines long
* First line contains a verse reference (space followed by digits, colon, and digits)
* Second line is blank, and third line starts with hash mark
* 3,394 files affected
* RISK: may have deleted some valid data.
13. Removed "Kata-kata Terjemahan" section from files that had it.
* Those are from an older tN version.
* 4,817 files affected
14. Reapplied #11.
* 475 files affected
15. Reapplied #10.
* 241 files affected
16. (asked permission to...) Remove links specific to V-MAST resources that no longer exist.
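
For illustration, the line-level substitutions in steps 3, 4, 5, 6, and 8 could be sketched roughly as follows. This is not the script that was actually run: the function name `clean_note` and the regular expressions are assumptions inferred from the step descriptions, and the steps are combined into one pass here, whereas the real cleanup applied them one at a time.

```python
import re

# Assumed patterns, inferred from the step descriptions (not the original scripts).
EMPTY_COMMENT = re.compile(r"<!--\s*-->")    # step 3: empty HTML comment
NBSP = re.compile(r"&\s?nbsp;")              # step 4: entity, with or without a space
OFFICE_TAG = re.compile(r"<o:p>\s*</o:p>")   # step 6: empty <o:p> pairs


def clean_note(text: str) -> str:
    """Apply steps 3, 4, 6 and 8 line by line, then step 5 to the result."""
    kept = []
    drop_next_blank = False
    for line in text.split("\n"):
        # Step 3 (second half): drop a blank line right after a removed comment line.
        if drop_next_blank and not line.strip():
            drop_next_blank = False
            continue
        drop_next_blank = False
        # Step 3: drop any line containing an empty HTML comment.
        if EMPTY_COMMENT.search(line):
            drop_next_blank = True
            continue
        line = NBSP.sub(" ", line)                    # step 4
        line = OFFICE_TAG.sub("", line)               # step 6
        line = line.replace("rc://en/", "rc://id/")   # step 8
        kept.append(line)

    # Step 5: drop blank lines at the top and collapse runs of blank lines.
    result = []
    for line in kept:
        is_blank = not line.strip()
        if is_blank and (not result or not result[-1].strip()):
            continue
        result.append(line)
    return "\n".join(result)
```

Run over a repository, a function like this would be called on the text of each .md file, with the file rewritten only when the returned text differs from the original; counting those rewrites gives a "files affected" figure.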
Remaining issues are documented in *issues.txt*.
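
The two "classic patterns" from steps 10 and 11 can be expressed as checks on the sequence of heading levels in a file. The sketch below is one possible reading of those descriptions, not the code that was used: all function names are hypothetical, blank lines are assumed to be ignored when testing a pattern, and "removed high level hash tags" is interpreted as stripping the hash marks from headings deeper than level 1.

```python
import re

HEADING = re.compile(r"^(#+)\s*(.*)$")


def heading_level(line: str) -> int:
    """Return the number of leading hash marks, or 0 for a plain text line."""
    match = HEADING.match(line)
    return len(match.group(1)) if match else 0


def content_levels(lines):
    """Heading levels of the non-blank lines (assumption: blank lines are ignored)."""
    return [heading_level(ln) for ln in lines if ln.strip()]


def matches_pattern_one(lines) -> bool:
    """Step 10: level 1, higher level, level 1, ... ending on a higher level."""
    levels = content_levels(lines)
    if len(levels) < 2 or len(levels) % 2 != 0:
        return False
    return all(l == 1 for l in levels[0::2]) and all(l > 1 for l in levels[1::2])


def matches_pattern_two(lines) -> bool:
    """Step 11: heading (level 2+), plain text, heading at the same level, ..."""
    levels = content_levels(lines)
    if len(levels) < 2 or len(levels) % 2 != 0:
        return False
    first = levels[0]
    return (first >= 2
            and all(l == first for l in levels[0::2])
            and all(l == 0 for l in levels[1::2]))


def fix_pattern_one(lines):
    """Strip the hash marks from headings deeper than level 1."""
    fixed = []
    for ln in lines:
        match = HEADING.match(ln)
        if match and len(match.group(1)) > 1:
            fixed.append(match.group(2))
        else:
            fixed.append(ln)
    return fixed


def fix_pattern_two(lines):
    """Promote every heading to level 1, leaving plain text lines untouched."""
    fixed = []
    for ln in lines:
        match = HEADING.match(ln)
        fixed.append("# " + match.group(2) if match else ln)
    return fixed
```

Requiring the whole file to match before applying a fix keeps the change conservative; files that only partly fit a pattern are left alone and would fall through to manual review.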
## Remaining steps:
1. Manually edit the files identified in *issues.txt* to conform to the required markdown format.