There are 77430 words in the Quran, but it is said that the Quran has around 2000 unique (unrepeated) words. I am trying to categorize those words in a way that is easy for non-native learners of Arabic to understand. I think this is the quickest way to understand the Quran, because you need not work through it word by word. What you need is to group the basic vocabulary with simple examples.
Thursday, September 6, 2012
How To Analyze Quranic Arabic Corpus morphological data 0.4
Before analyzing Quranic Arabic Corpus morphological data 0.4, you have to learn some terms of Corpus Linguistics.
In linguistics, a morpheme is the smallest semantically meaningful unit in a language. The field of study dedicated to morphemes is called morphology. Morphemes are of two types: free and bound. A morpheme (or word element) that can stand alone as a word is called free. It is sometimes called a stem, because other non-free elements are added to it.
In morphology, a bound morpheme is a morpheme that only appears as part of a larger word. They are sometimes called affixes.
Affixes are of three main types: prefix, infix and suffix.
Affixes (prefix, suffix, infix and circumfix) are all bound morphemes.
Prefixes are bound morphemes which occur before other morphemes. Examples: un- (uncover, undo)
Infixes are bound morphemes which are inserted into other morphemes. True infixes are hardly found in English, but the internal vowel change in food > feed gives a rough idea.
Suffixes are bound morphemes which occur after other morphemes.
Examples:
-er (singer, performer)
-ist (typist, pianist)
-ly (manly, friendly)
Quranic Arabic Corpus morphological data 0.4 uses these and other related linguistic terms.
Let me explain the columns and a few rows
LOCATION is the Surah:Ayah:word:morpheme reference of the Quran. FORM is the English Transliteration of the surface Arabic Word form, which is based on Buckwalter Transliteration. See the chart:
http://corpus.quran.com/java/buckwalter.jsp
TAG is the lexical or grammatical category of the morpheme concerned. FEATURES describe the detailed linguistic features of the morpheme.
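As a minimal sketch of how one data line maps onto these four columns, the Python below splits a line on tabs and breaks LOCATION into its four numbers. It assumes the columns are tab-separated (which is what lets a spreadsheet split them on paste); the names Morpheme and parse_line are made up for this illustration and are not part of the corpus.

```python
from typing import NamedTuple


class Morpheme(NamedTuple):
    surah: int
    ayah: int
    word: int
    morpheme: int
    form: str
    tag: str
    features: str


def parse_line(line: str) -> Morpheme:
    """Split one data line into LOCATION, FORM, TAG and FEATURES."""
    # Assumption: the four columns are separated by tab characters.
    location, form, tag, features = line.rstrip("\n").split("\t")
    # LOCATION looks like (1:1:1:2): drop the brackets, then split on colons.
    surah, ayah, word, morpheme = (int(n) for n in location.strip("()").split(":"))
    return Morpheme(surah, ayah, word, morpheme, form, tag, features)


row = parse_line("(1:1:1:2)\tsomi\tN\tSTEM|POS:N|LEM:{som|ROOT:smw|M|GEN")
print(row.surah, row.ayah, row.word, row.morpheme, row.form, row.tag)
```

Keeping the four location numbers as integers makes it easy to sort rows or to pick out a single surah later.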
Description of FEATURES
In morphology and lexicography, a lemma (plural lemmas or lemmata) is the canonical form, dictionary form, or citation form of a set of words (headword). In English, for example, run, runs, ran and running are forms of the same lexeme, with run as the lemma. Lexeme, in this context, refers to the set of all the forms that have the same meaning, and lemma refers to the particular form that is chosen by convention to represent the lexeme.
Difference between stem and lemma
In computational linguistics, a stem is the part of the word that never changes even when morphologically inflected, whilst a lemma is the base form of the word. For example, from "produced", the lemma is "produce", but the stem is "produc-", because there are words such as "production". In linguistic analysis, the stem is defined more generally as the analyzed base form from which all inflected forms can be formed.
For illustrations of the other abbreviated terms, go to the page:
http://corpus.quran.com/documentation/tagset.jsp
For Verb Forms, Refer to page:
http://corpus.quran.com/documentation/verbforms.jsp
The First Word of Quran Bismi
The first word of the Quran, Bismi, consists of two morphemes: bi, which is used as a prefix, and somi, which is a noun and the stem. (Don't think that the "o" in somi is like the English "o"; it is the symbol for sukun in Buckwalter transliteration.) In the FEATURES column, POS means part of speech and N means noun. Its lemma is {som (the initial hamzah is dropped in Bismi because of its widespread use), which is derived from the triliteral ROOT smw, i.e. س م و. It is a masculine noun (M), used here in the genitive case (GEN) because it follows the preposition bi.
LOCATION FORM TAG FEATURES
(1:1:1:1) bi P PREFIX|bi+
(1:1:1:2) somi N STEM|POS:N|LEM:{som|ROOT:smw|M|GEN
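The FEATURES strings in these two rows can be taken apart mechanically. The sketch below splits on the "|" separator and treats KEY:VALUE parts (POS, LEM, ROOT) differently from bare parts such as STEM, M or GEN. The function name parse_features and the "flags" key are made up for this illustration.

```python
def parse_features(features: str) -> dict:
    """Turn a FEATURES string into a dictionary.

    KEY:VALUE parts become entries; bare parts such as M or GEN
    are collected in a list under the made-up key "flags".
    """
    parsed = {"flags": []}
    for part in features.split("|"):
        key, colon, value = part.partition(":")
        if colon:
            parsed[key] = value
        else:
            parsed["flags"].append(part)
    return parsed


somi = parse_features("STEM|POS:N|LEM:{som|ROOT:smw|M|GEN")
print(somi["POS"], somi["LEM"], somi["ROOT"], somi["flags"])

bi = parse_features("PREFIX|bi+")
print(bi["flags"])
```

With the features in a dictionary, a question such as "which morphemes share the root smw" becomes a simple comparison on the ROOT entry.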
The First Explicit Verb of the Quran
The first explicit verb of the Quran is the 2nd word of the fifth verse of the first chapter, Fatihah:
(1:5:2:1) naEobudu V STEM|POS:V|IMPF|LEM:Eabada|ROOT:Ebd|1P
This is an IMPERFECT verb (present-future tense) used in the 1st person plural.
The Second Verb
(1:5:4:1) nasotaEiynu V STEM|POS:V|IMPF|(X)|LEM:{sotaEiynu|ROOT:Ewn|1P
This is also an IMPERFECT verb, used in Form X (marked (X) in FEATURES), and the ROOT is Ewn, i.e. ع و ن
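The Buckwalter symbols in the rows above can be turned back into Arabic letters with a lookup table. The sketch below is deliberately partial: it covers only the symbols needed for bisomi and for the three roots smw, Ebd and Ewn, and the full chart is on the Buckwalter page linked earlier. The names BUCKWALTER, to_arabic and root_letters are made up for this illustration.

```python
# Partial Buckwalter table: just enough symbols for the examples in this post.
BUCKWALTER = {
    "b": "\u0628",  # ba
    "s": "\u0633",  # sin
    "m": "\u0645",  # mim
    "w": "\u0648",  # waw
    "E": "\u0639",  # ayn
    "d": "\u062F",  # dal
    "n": "\u0646",  # nun
    "i": "\u0650",  # kasra (short i)
    "o": "\u0652",  # sukun (no vowel)
}


def to_arabic(text: str) -> str:
    """Map each Buckwalter symbol to its Arabic character.

    Raises KeyError for any symbol missing from the partial table.
    """
    return "".join(BUCKWALTER[symbol] for symbol in text)


def root_letters(root: str) -> str:
    """Show a root as separate letters with spaces between them."""
    return " ".join(to_arabic(symbol) for symbol in root)


print(to_arabic("bisomi"))
for root in ("smw", "Ebd", "Ewn"):
    print(root, "=", root_letters(root))
```

Failing loudly on an unknown symbol is a deliberate choice here: with a partial table, a silent skip would produce wrong Arabic without any warning.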
How To Analyze:
Download the txt file, then copy and paste it into Excel 2007/2010. (Excel 2003 won't help: a sheet there is limited to 65,536 rows, and this file has more than 128,000.)
The rows and columns will be separated. Now the analysis depends on what you want out of the QAC.
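If Excel is not at hand, the same table can be loaded with a short script. This is only a sketch: it assumes the columns are tab-separated and that every data row starts with a location in brackets, so that header or comment lines can be skipped. The function name read_rows and the file name in the last comment are placeholders.

```python
import csv
import io


def read_rows(lines):
    """Yield [LOCATION, FORM, TAG, FEATURES] lists from an iterable of lines."""
    # QUOTE_NONE keeps quote characters in the transliteration as plain text.
    for fields in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        # Keep only data rows, which start with a location such as (1:1:1:1).
        if len(fields) == 4 and fields[0].startswith("("):
            yield fields


# A small in-memory sample standing in for the downloaded file.
sample = io.StringIO(
    "LOCATION\tFORM\tTAG\tFEATURES\n"
    "(1:1:1:1)\tbi\tP\tPREFIX|bi+\n"
    "(1:1:1:2)\tsomi\tN\tSTEM|POS:N|LEM:{som|ROOT:smw|M|GEN\n"
)
rows = list(read_rows(sample))
print(len(rows), rows[0])

# With the real download (the path is a placeholder):
# with open("morphology.txt", encoding="utf-8", newline="") as f:
#     rows = list(read_rows(f))
```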
If you want to know how many prepositions are used in the Quran, you can do so by auto-filtering the TAG column: choose Data>Filter, deselect 'Select all' in the drop-down and check P. You will get all prepositions used in the Quran. How many?
Ok, in the last blank cell of Column C, write this formula =COUNTIF(C1:C128215, "P"), press ENTER, and you will get 13006. Unfortunately, you will not get this figure from the site
http://corpus.quran.com/morphologicalsearch.jsp
There you will get only 7679, because only prepositions as stems are counted, not the prefixed and suffixed prepositions. There are 7679 stem prepositions, 5325 prefix prepositions and 2 suffix prepositions in the Quran, so the total is 7679+5325+2 = 13006.
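The same count, split into stem, prefix and suffix prepositions, can be sketched in Python. The sketch assumes each row is a (LOCATION, FORM, TAG, FEATURES) tuple and that the first part of FEATURES is one of STEM, PREFIX or SUFFIX (only STEM and PREFIX appear in the rows shown in this post, so SUFFIX is an assumption). The function name and the "min" row are made up for illustration; that row is not copied from the file.

```python
from collections import Counter


def count_prepositions(rows):
    """Count morphemes tagged P, grouped by the first part of FEATURES."""
    kinds = Counter()
    for location, form, tag, features in rows:
        if tag == "P":
            # The first part says whether the morpheme is a stem or an affix.
            kinds[features.split("|", 1)[0]] += 1
    return kinds


sample_rows = [
    ("(1:1:1:1)", "bi", "P", "PREFIX|bi+"),
    ("(1:1:1:2)", "somi", "N", "STEM|POS:N|LEM:{som|ROOT:smw|M|GEN"),
    # Illustrative stem preposition; location and features are invented.
    ("(0:0:0:0)", "min", "P", "STEM|POS:P|LEM:min"),
]
kinds = count_prepositions(sample_rows)
print(dict(kinds), "total:", sum(kinds.values()))

# The figures quoted above add up the same way:
print(7679 + 5325 + 2)
```

Run over the full file, the per-kind counts should be comparable with the 7679, 5325 and 2 quoted above, and their sum with the COUNTIF result.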
Sometimes Quranic Arabic Corpus morphological data 0.4 is very helpful for finding specific data. For example, if you want to know the past passive verbs used in the Quran, you can find them within seconds. Here is the list of past passive verbs used in the Quran. (FORM here is the passive form; go to the ayah and check it.)
LOCATION FORM TAG
(4:157:15:1) $ub~iha V
(6:118:3:1) *ukira V
(5:3:23:1) *ubiHa V
(5:13:15:1) *uk~iru V
(76:14:4:2) *ul~ilato V
(2:283:16:1) {&otumina V
(33:11:2:1) {botuliYa V
(2:173:14:1) {DoTur~a V
(14:26:6:1) {jotuv~ato V
(7:75:8:1) {sotuDoEifu V
(42:16:8:1) {sotujiyba V
(5:44:17:1) {sotuHofiZu V
(6:10:2:1) {sotuhozi}a V
(2:166:4:1) {t~ubiEu V
(11:110:5:2) {xotulifa V
(54:9:9:2) {zodujira V
(22:39:1:1) >u*ina V
(2:24:12:1) >uEid~ato V
(9:58:7:1) >uEoTu V
(10:22:28:1) >uHiyTa V
(2:187:1:1) >uHil~a V
(4:128:18:2) >uHoDirati V
(69:5:3:2) >uholiku V
(4:25:36:1) >uHoSi V
(77:12:3:1) >uj~ilato V
(7:120:1:2) >uloqiYa V
(4:60:21:1) >umiru V
(18:56:17:1) >un*iru V
(2:4:4:1) >unzila V
(72:10:5:1) >uriyda V
(4:91:13:1) >urokisu V
(7:6:3:1) >urosila V
(9:108:6:1) >us~isa V
(2:25:25:2) >utu V
(11:60:1:2) >utobiEu V
(6:19:11:2) >uwHiYa V
(7:43:33:1) >uwrivo V
(8:70:18:1) >uxi*a V
(2:246:40:1) >uxorijo V
(2:93:15:2) >u$oribu V
(6:70:34:1) >ubosilu V
(3:185:14:2) >udoxila V
(22:22:8:1) >uEiydu V
(51:9:4:1) >ufika V
(10:27:16:1) >ugo$iyato V
(71:25:3:1) >ugoriqu V
(2:173:9:1) >uhil~a V
(11:1:3:1) >uHokimato V
(2:196:6:1) >uHoSiro V
(5:109:7:1) >ujibo V
(16:106:9:1) >ukoriha V
(25:40:6:1) >umoTirato V
(77:11:3:1) >uq~itato V
(11:116:23:1) >utorifu V
(3:195:22:2) >uw*u V
(32:17:5:1) >uxofiYa V
(26:90:1:2) >uzolifati V
(2:101:14:1) >uwtu V
(27:8:5:1) buwrika V
(22:60:9:1) bugiYa V
(16:58:2:1) bu$~ira V
(82:4:3:1) buEovirato V
(2:258:36:2) buhita V
(26:91:1:2) bur~izati V
(56:5:1:2) bus~ati V
(2:282:77:1) duEu V
(2:61:37:2) Duribato V
(33:14:2:1) duxilato V
(69:14:4:2) duk~a V
(16:126:6:1) Euwqibo V
(2:178:16:1) EufiYa V
(6:91:31:2) Eul~imo V
(18:48:1:2) EuriDu V
(11:28:14:2) Eum~iyato V
(81:4:3:1) EuT~ilato V
(5:107:2:1) Euvira V
(16:71:10:1) fuD~ilu V
(34:54:7:1) fuEila V
(11:1:6:1) fuS~ilato V
(21:96:3:1) futiHato V
(16:110:9:1) futinu V
(82:3:3:1) fuj~irato V
(77:9:3:1) furijato V
(34:23:11:1) fuz~iEa V
(5:64:6:1) gul~ato V
(7:119:1:2) gulibu V
(11:44:7:2) giyDa V
(27:17:1:2) Hu$ira V
(34:54:1:2) Hiyla V
(3:101:14:1) hudiYa V
(69:14:1:2) Humilati V
(84:2:3:2) Huq~ato V
(3:50:11:1) Hur~ima V
(4:86:2:1) Huy~iy V
(22:40:18:2) hud~imato V
(76:21:6:2) Hul~u V
(20:87:7:1) Hum~ilo V
(100:10:1:2) HuS~ila V
(39:69:7:2) jiA@Y^'a V
(16:124:2:1) juEila V
(26:38:1:2) jumiEa V
(3:184:4:1) ku*~iba V
(12:110:8:1) ku*ibu V
(17:35:4:1) kilo V
(54:14:6:1) kufira V
(13:31:12:1) kul~ima V
(2:178:4:1) kutiba V
(11:55:3:2) kiydu V
(81:11:3:1) ku$iTato V
(27:90:4:2) kub~ato V
(58:5:6:1) kubitu V
(26:94:1:2) kubokibu V
(81:1:3:1) kuw~irato V
(5:64:8:2) luEinu V
(3:159:5:1) lin V
(23:35:4:1) mi V
(12:63:7:1) muniEa V
(84:3:3:1) mud~ato V
(18:18:20:3) muli}o V
(34:7:10:1) muz~iqo V
(7:43:29:2) nuwdu V
(68:49:7:2) nubi*a V
(18:99:7:2) nufixa V
(4:161:4:1) nuhu V
(12:110:11:2) nuj~iYa V
(6:37:3:1) nuz~ila V
(81:10:3:1) nu$irato V
(21:65:2:1) nukisu V
(74:8:2:1) nuqira V
(88:19:4:1) nuSibato V
(77:10:3:1) nusifato V
(59:11:23:1) quwtilo V
(2:11:2:1) qiyla V
(54:12:9:1) qudira V
(2:210:12:2) quDiYa V
(7:204:2:1) quri}a V
(13:31:8:1) quT~iEato V
(3:144:13:1) qutila V
(12:26:13:1) qud~a V
(33:61:5:2) qut~ilu V
(6:45:1:2) quTiEa V
(4:91:10:1) rud~u V
(88:18:4:1) rufiEato V
(41:50:17:1) r~ujiEo V
(2:25:14:1) ruziqu V
(56:4:2:1) ruj~ati V
(2:108:7:1) su}ila V
(11:77:5:1) siY^'a V
(40:37:15:2) Sud~a V
(47:15:40:2) suqu V
(7:47:2:1) Surifato V
(39:71:1:2) siyqa V
(13:33:30:2) Sud~u V
(81:12:3:1) suE~irato V
(11:108:3:1) suEidu V
(81:6:3:1) suj~irato V
(15:15:3:1) suk~irato V
(7:149:2:1) suqiTa V
(88:20:4:1) suTiHato V
(13:31:4:1) suy~irato V
(39:73:18:1) Tibo V
(9:87:6:2) TubiEa V
(8:2:10:1) tuliyato V
(5:27:10:2) tuqub~ila V
(77:8:3:1) Tumisato V
(3:112:6:1) vuqifu V
(83:36:2:1) vuw~iba V
(3:96:4:1) wuDiEa V
(13:35:4:1) wuEida V
(3:25:8:2) wuf~iyato V
(12:75:4:1) wujida V
(19:15:4:1) wulida V
(7:20:7:1) wu,riYa V
(32:11:6:1) wuk~ila V
(6:27:4:1) wuqifu V
(26:21:4:1) xifo V
(4:28:6:2) xuliqa V
(9:118:4:1) xul~ifu V
(16:88:7:1) zido V
(4:148:10:1) Zulima V
(2:212:1:1) zuy~ina V
(3:185:11:1) zuHoziHa V
(2:214:16:2) zulozilu V
(81:7:3:1) zuw~ijato V
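A list like the one above can also be produced by a filter on the FEATURES column. The sketch below assumes that perfect (past) verbs carry a PERF flag and passive verbs a PASS flag, by analogy with the IMPF flag seen in the verb rows earlier; check the tagset page before relying on this. The FEATURES string of the >unzila row is illustrative and not copied from the file.

```python
def is_past_passive(tag: str, features: str) -> bool:
    """True for a verb morpheme flagged both PERF and PASS (assumed flag names)."""
    parts = features.split("|")
    return tag == "V" and "PERF" in parts and "PASS" in parts


sample_rows = [
    # Imperfect verb row quoted earlier in this post: should be filtered out.
    ("(1:5:2:1)", "naEobudu", "V", "STEM|POS:V|IMPF|LEM:Eabada|ROOT:Ebd|1P"),
    # Location and form from the list above; the features are illustrative.
    ("(2:4:4:1)", ">unzila", "V", "STEM|POS:V|PERF|PASS|ROOT:nzl"),
]

past_passives = [
    (location, form)
    for location, form, tag, features in sample_rows
    if is_past_passive(tag, features)
]
print(past_passives)
```

Splitting FEATURES into parts before testing avoids false matches on substrings, for example a lemma that happens to contain the letters PASS.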