# UHB

The resource we are using as our UHB is the Open Scriptures Hebrew Bible. This project is the Westminster Leningrad Codex with Strong's lexical data and morphological data marked up in OSIS files.
## Parsing Status

See the parsing status for the whole Old Testament, or use the book-by-book links below.
- Genesis First Pass
- Exodus First Pass
- Leviticus First Pass
- Numbers First Pass
- Deuteronomy First Pass
- Joshua First Pass
- Judges First Pass
- Ruth First Pass
- 1 Samuel First Pass
- 2 Samuel First Pass
- 1 Kings First Pass
- 2 Kings First Pass
- 1 Chronicles First Pass
- 2 Chronicles First Pass
- Ezra First Pass
- Nehemiah First Pass
- Esther First Pass
- Job First Pass
- Psalms First Pass
- Proverbs First Pass
- Ecclesiastes First Pass
- Song of Songs First Pass
- Isaiah First Pass
- Jeremiah First Pass
- Lamentations First Pass
- Ezekiel First Pass
- Daniel First Pass
- Hosea First Pass
- Joel First Pass
- Amos First Pass
- Obadiah First Pass
- Jonah First Pass
- Micah First Pass
- Nahum First Pass
- Habakkuk First Pass
- Zephaniah First Pass
- Haggai First Pass
- Zechariah First Pass
- Malachi First Pass
## Roadmap

### Initial Inclusion in tC

Get tC to support OSIS XML files like https://github.com/openscriptures/morphhb/blob/master/wlc/Ruth.xml:
- Lexical data is encoded in the `lemma` attribute, which is the word's Strong's number
- Morphological data is encoded in the `morph` attribute (key here)
We may as well read the files directly from https://github.com/openscriptures/morphhb/blob/master/wlc/ unless we want to create a process to put them into our container format.

Currently, I'm only seeing about 1% of the words in those files as having morphological data.
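The attribute layout described above can be sketched with a short parse of an OSIS-style fragment. This is a minimal illustration only: the XML sample, verse reference, and morph code below are assumptions modeled on the morphhb file layout, not project code, and the coverage count mirrors the kind of check behind the "about 1%" figure.

```python
import xml.etree.ElementTree as ET

# Minimal OSIS-style fragment: each <w> element may carry a lemma
# (Strong's number) and, when parsed, a morph (parsing code).
osis_fragment = """\
<verse xmlns="http://www.bibletechnologies.net/2003/OSIS/namespace" osisID="Ruth.1.1">
  <w lemma="1961" morph="HVqw3ms">\u05D5\u05B7\u05D9\u05B0\u05D4\u05B4\u0597\u05D9</w>
  <w lemma="3117">\u05D1\u05BC\u05B4\u05D9\u05DE\u05B5\u05D9</w>
</verse>"""

NS = "{http://www.bibletechnologies.net/2003/OSIS/namespace}"
root = ET.fromstring(osis_fragment)

# Count how many words actually carry morphological data
words = list(root.iter(NS + "w"))
parsed = [w for w in words if w.get("morph")]
print(f"{len(parsed)}/{len(words)} words carry morphological data")
```

Run against the real `wlc/` files, a loop like this would report per-book morph coverage.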
### Finishing Morphological Data

#### Stage 1

Write a comparison script that can verify our proposed parsings from http://hb.openscriptures.org/OshbParse/ against an existing dataset (such as https://shebanq.ancient-data.org/shebanq/static/docs/tools/shebanq/plain.html). If they check out, they can be marked as verified and included in the XML files.
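A rough sketch of what that script could do, assuming both datasets can be reduced to dictionaries keyed by a word reference (the key shape and morph codes here are illustrative assumptions, not the actual OshbParse or SHEBANQ formats):

```python
def verify_parsings(proposed, reference):
    """Split proposed parsings into verified (the reference agrees)
    and disputed (the reference differs or has no entry)."""
    verified, disputed = {}, {}
    for ref, morph in proposed.items():
        if reference.get(ref) == morph:
            verified[ref] = morph
        else:
            disputed[ref] = (morph, reference.get(ref))
    return verified, disputed

# Hypothetical sample data keyed by (book, chapter, verse, word index)
proposed = {("Ruth", 1, 1, 1): "HVqw3ms", ("Ruth", 1, 1, 2): "HR/Ncmpc"}
reference = {("Ruth", 1, 1, 1): "HVqw3ms"}
verified, disputed = verify_parsings(proposed, reference)
```

Only the `verified` set would be written back into the XML files; the `disputed` set goes to an editor.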
#### Stage 2

Create a process that takes verified parsings from https://github.com/openscriptures/morphhb/blob/master/wlc/ and programmatically guesses at the rest of the words in the OT (e.g. strip cantillation marks and find-and-replace the unknowns). Feed these back into the parsing system at http://hb.openscriptures.org/OshbParse/ and verify them against an existing dataset and/or editors.
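The strip-and-match idea might look something like the sketch below. The cantillation range used is the Unicode Hebrew accents run (U+0591 through U+05AF); the function names and data shapes are assumptions for illustration.

```python
def strip_cantillation(word):
    """Remove Hebrew cantillation marks (U+0591-U+05AF),
    leaving consonants and vowel points intact."""
    return "".join(ch for ch in word if not 0x0591 <= ord(ch) <= 0x05AF)

def guess_from_verified(verified, unknown_words):
    """Guess a morph code for each unknown word by matching its
    cantillation-free form against already-verified words."""
    by_form = {strip_cantillation(w): m for w, m in verified.items()}
    return {w: by_form.get(strip_cantillation(w)) for w in unknown_words}

# Hypothetical sample: the same word with and without a revia accent
verified = {"\u05D5\u05B7\u05D9\u05B0\u05D4\u05B4\u0597\u05D9": "HVqw3ms"}
unknown = ["\u05D5\u05B7\u05D9\u05B0\u05D4\u05B4\u05D9"]
guesses = guess_from_verified(verified, unknown)
```

Words whose stripped form has no verified match come back as `None` and stay in the manual queue.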
If we can make this an iterative process, we should be able to cut down the amount of manual intervention needed to complete the morphological data.
### Completion

Once the morphological data is complete, the UHB project will effectively be done. At the moment there are no further plans to mark up the text with other information.