Estimating the current number of Okinawan speakers

For this essay, I take “Okinawan” to mean South-Central Okinawan, the varieties spoken on the southern and central portions of Okinawa Island (沖縄本島 Okinawa-hontō) and on surrounding smaller islands like Kume Island (久米島 Kume-jima). This is distinct from the Kunigami varieties spoken on the northern portion of Okinawa Island. Borrowing from South-Central Okinawan varieties into Kunigami varieties means that there is no discrete line between what is Okinawan and what is Kunigami. The inverse also appears to occur, though it is less clear whether this is actually borrowing from Kunigami varieties, retention of Kunigami features by otherwise Okinawan-ified Kunigami varieties, or a mix of both.

For the purposes of this essay, I take the dividing line between Kunigami and South-Central Okinawan to be the Ishikawa Isthmus, the narrow strip of land running from Ishikawa City (石川市 Ishikawa-shi) in the south to Nago City (名護市 Nago-shi) in the north. I take varieties spoken on the Ishikawa Isthmus to be part of the Kunigami dialect group, and varieties south of the isthmus to belong to the South-Central Okinawan dialect group. So Yomitan is a South-Central Okinawan variety, while Kin is a Kunigami variety.

Population counts and percentages of people aged 40 and over, 50 and over, and 65 and over can be found in Tables 1 and 2, respectively.

| Location | Total pop. | Over 40 | Over 50 | Over 65 |
|---|---|---|---|---|
| Ginowan | 91,928 | 42,824 | 29,593 | 13,428 |
| Itoman | 57,320 | 28,152 | 21,151 | 9,480 |
| Naha | 315,945 | 163,334 | 116,143 | 55,644 |
| Nakagami Dist. | 147,688 | 72,295 | 52,699 | 24,628 |
| Nanjō | 39,758 | 21,892 | 17,160 | 8,415 |
| Okinawa | 130,249 | 61,726 | 43,927 | 20,137 |
| Shimajiri Dist. | 94,783 | 47,997 | 36,156 | 16,796 |
| Tomigusuku | 57,261 | 26,024 | 18,708 | 8,241 |
| Urasoe | 110,351 | 51,811 | 35,830 | 15,846 |
| Uruma | 116,979 | 58,191 | 43,531 | 20,445 |
| Totals | 1,162,262 | 574,246 | 414,898 | 193,060 |

Table 1: Populations by select age ranges of various cities (市 shi) and districts (郡 gun) of south and central Okinawa. 

| Location | Pct. over 40 | Pct. over 50 | Pct. over 65 |
|---|---|---|---|
| Ginowan | 46.6% | 32.2% | 14.6% |
| Itoman | 49.1% | 37.0% | 16.5% |
| Naha | 51.7% | 36.8% | 17.6% |
| Nakagami Dist. | 49.0% | 35.7% | 16.7% |
| Nanjō | 55.1% | 43.2% | 21.2% |
| Okinawa | 47.4% | 33.7% | 15.5% |
| Shimajiri Dist. | 50.6% | 38.1% | 17.7% |
| Tomigusuku | 45.4% | 32.7% | 14.4% |
| Urasoe | 47.0% | 32.5% | 14.4% |
| Uruma | 49.7% | 37.2% | 17.5% |
| Totals | 49.4% | 35.7% | 16.6% |

Table 2: Populations by select age ranges of various cities (市 shi) and districts (郡 gun) of south and central Okinawa as percentages.

Of course, being over a certain age is no guarantee of being a speaker of a language. That being said, we can be reasonably sure from anecdotal reports that most people over age 65 can speak Okinawan, while almost no one under age 20 can. We could assume a contrived, simple linear regression model for language death, with 100% of people age 70 being able to speak Okinawan, and 0% of people age 20 being able to speak Okinawan. The equation of the line of best fit is thus:

(1) y = 2x – 40

Where x is a given age (in years), and y is the percentage of people of that age able to speak the language. This is, of course, an overly simplified model. We would likely be better off using a slightly more complex model, like the one found in Abrams and Strogatz 2003, which models language competition between a pair of languages, based on the proportion of speakers of each language, as well as the relative prestige of each language (Abrams and Strogatz 2003: 900). For our estimate, though, we will simply use the simple linear model.

With this simple linear model, again returning to Japanese census data for the same areas as above, we arrive at a figure of a little less than 490,552 people. As this simple linear regression model provides a poor estimate of the proportion of Okinawan speakers, we can probably conclude that this number is an overestimate of the true number of Okinawan speakers. A more aggressive decline, with no speakers under the age of 41, as defined in (2), leaves 334,457 people who can speak Okinawan.

(2) y = (10/3)x – 400/3

Again, both (1) and (2) are overly simplified models. We would be better off with a slightly more nuanced model, but this is a good first approximation. So we can say, with reservations, that around 400,000 people in southern and central Okinawa and its environs (or approximately 34.4% of people living there) can speak South-Central Okinawan.
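To make the arithmetic behind (1) and (2) concrete, here is a minimal sketch of how such a linear model could be applied to single-year age counts. The function names, the clamping of the line to the 0–100% range, and the toy population are my own assumptions for illustration; the toy numbers are not census figures.

```python
def pct_speakers(age, zero_age, full_age=70):
    """Percentage of people of a given age assumed to speak Okinawan.

    A straight line from 0% at zero_age to 100% at full_age.
    Clamping to the 0-100 range is an assumption of this sketch.
    """
    pct = 100 * (age - zero_age) / (full_age - zero_age)
    return max(0.0, min(100.0, pct))


def model_1(age):
    # Equation (1): y = 2x - 40, i.e. 0% at age 20 and 100% at age 70.
    return pct_speakers(age, zero_age=20)


def model_2(age):
    # Equation (2): y = (10/3)x - 400/3, i.e. 0% at age 40 and 100% at age 70.
    return pct_speakers(age, zero_age=40)


def estimate_speakers(pop_by_age, model):
    """Sum, over every age, the head count times the modelled share of speakers."""
    return sum(count * model(age) / 100 for age, count in pop_by_age.items())


# Hypothetical toy population (age -> head count), NOT census data.
toy_population = {10: 1000, 30: 1000, 55: 1000, 80: 1000}

print(estimate_speakers(toy_population, model_1))
print(estimate_speakers(toy_population, model_2))

# The share quoted in the text: 400,000 speakers out of 1,162,262 residents.
print(round(400_000 / 1_162_262 * 100, 1))
```

Feeding the single-year census counts into `estimate_speakers` in place of the toy dictionary would be the way to reproduce estimates of this kind.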


Abrams, Daniel M. and Steven H. Strogatz. 2003. Modelling the dynamics of language death. Nature 424. p. 900. doi:10.1038/424900a

Census data from Japanese 2010 Population Census. Okinawa Prefecture. Table 3-2. “Population (Total and Japanese Population), by Age (Single Years) and Sex, Percentage by age, Average Age and Median Age – Prefectures*, All Shi*, All Gun*, Shi*, Machi*, Mura*, and Municipalities in 2000 ” Obtained from:

The Sōgen-ji Eastern Entrance stele

It seems that most stelai inscribed during the Ryūkyū Kingdom are either in Classical Chinese, or in Middle (or literary Early Modern) Japanese. Very few are actually in the contemporary vernacular Old or Middle Okinawan. The stele erected at the eastern entrance to Sōgen-ji (崇元寺) is one of the few exceptions, with one face in Classical Chinese, and another in Middle Okinawan.

Background on Sōgen-ji
Sōgen-ji was located between Tomari (泊村 Tomari-mura) and Makishi Villages (牧志村 Makishi-mura) near where the Asato River (安里川 Asato-gawa) split into Tomari Harbor (泊港 Tomari-kō) and the Kumoji River (久茂地川 Kumoji-gawa), on the western edge of classical Naha. Originally built as a Rinzai (臨済宗 Rinzai-shū) Zen Buddhist temple, it later served as a Royal Mausoleum (until the construction of Tamaudun in 1501 CE). It was built early on during the reign of Shō Shin (尚眞, r. 1477–1526 CE), and was one of several Rinzai temples Shō Shin constructed[1]. Much of the temple was destroyed during the Battle of Okinawa in 1945 CE. Thankfully, the stele itself survived, and, along with a large number of other important stelai, was included in both facsimile and transcribed form in Tsukada 1970.

Description of the Stele
The stele itself is made of stone, with inscriptions on the front (Classical Chinese) and the back (Middle Okinawan). Unfortunately, the version in Tsukada 1970 does not describe where on the stele the date is inscribed, and the facsimile does not show where it is either.

The facsimile (and, presumably, the inscription itself) is perfectly legible. The Classical Chinese text is inscribed in regular script (楷書 kaisho), while the Middle Okinawan text is inscribed in cursive script (草書 sōsho).

The Classical Chinese text and the Middle Okinawan texts are equivalent to one another. They serve as a prohibition, warning potential visitors to the temple of its significance (as a Royal Mausoleum), and ordering them to dismount.

The inscriptions

Front (Classical Chinese)


dàn <jù> guānyuán rén děng zhì cǐ xià mǎ
only <all> anji common.people arrive here go.down horse

‘All anji and commoners arriving here, dismount.’

The character 但 dàn ‘only’ is erroneously used as a variant of 俱 jù ‘all’. Both share the radical 人, and some graphical variants partially overlap (see the Dictionary of Chinese Character Variants entries on dàn ‘only’ and jù ‘all’, respectively, for examples).

官員 guānyuán ‘(an) official’ is used here in place of the Okinawan あんし or 按司[2] anji, a class of landed nobility in Okinawa, which existed not only during the Ryūkyū Kingdom, but also before it.

Back (Middle Okinawan)


anji=to kesu=mo kuma n-ite muma=kara orer-ube-shi
anji=COM common.people=ASF here COP-COOR horse=ABL go.down-DEB-RLS

‘Anji and also commoners must dismount from [their] horses.’

Here, like in many other Middle Okinawan texts, we see etymological or pseudo-etymological spellings throughout the text. A three-way vowel height distinction is maintained in writing, while it would have likely merged into a two-way distinction in speech (with front and back mid vowels merging into the front and back high vowels, respectively).

Despite the text being very short, there is at least one diagnostic lexical item hinting that the text is intended to be Middle Okinawan as opposed to Middle Japanese. Namely, Middle Okinawan くま kuma ‘here’, rather than Middle Japanese ここ koko ‘here’. While not as decisive, the unetymological spelling of orir- as orer- is interesting, and shows that there was confusion between mid and high vowels, which merged quite early in the history of Okinawan.

However, the influence of Middle Japanese can clearly be seen in the loan of the debitive suffix -ube- (see discussion in Vovin 2009: 879–880).

It is unclear to me if むま muma ‘horse’ is a loan from Middle Japanese, or an attempt at rendering the regressive assimilation of nasality into the initial vowel. Modern Okinawan ʔNma ‘horse’ is likely from earlier *uma ‘horse’, as it has a glottal stop as its initial segment, rather than a nasal. Compare ‘horse’ with the historical outcome of ‘all’, Modern Okinawan nNna ‘all’, which is from earlier *mina ‘all’. This too has regressive nasalization of the vowel, but additionally had regressive palatalization of the initial segment *m > n.



‘25th day of the 7th lunar month, 24th of the sexagenary cycle, 6th year of [the reign of the] Ming Emperor Jiājìng’

In the Gregorian calendar, this date is 22 August 1527 CE. Note that I have not seen this part of the inscription, in facsimile or otherwise; I am relying solely on Tsukada’s (1970) transcription.
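As a quick sanity check on the year, the sexagenary position can be computed from the Common Era year. This is only a sketch under two assumptions of mine: that the Jiājìng era is counted from 1522 CE, and that 4 CE is taken as a 甲子 (first-position) year. It ignores the fact that the lunar year does not start on 1 January, which does not matter for a date in the 7th lunar month. The function names are hypothetical.

```python
STEMS = "甲乙丙丁戊己庚辛壬癸"        # the ten Heavenly Stems
BRANCHES = "子丑寅卯辰巳午未申酉戌亥"  # the twelve Earthly Branches


def sexagenary_position(year_ce):
    """Position (1-60) of a CE year in the sexagenary cycle; 4 CE is position 1."""
    return (year_ce - 4) % 60 + 1


def sexagenary_name(position):
    """Stem-branch pair for a position in the cycle (1-60)."""
    return STEMS[(position - 1) % 10] + BRANCHES[(position - 1) % 12]


JIAJING_FIRST_YEAR = 1522  # assumed start of the Jiājìng era
year = JIAJING_FIRST_YEAR + 6 - 1  # the 6th year of the era

position = sexagenary_position(year)
print(year, position, sexagenary_name(position))
```

Under those assumptions the 6th year of Jiājìng comes out as 1527 CE and as the 24th year of the cycle, which agrees with the date given in the transcription.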

Edit (13 May 2016) — Added forgotten citation to Vovin 2009.


  1. I am not as well versed as I would like in the history of Buddhism in the Ryūkyū Kingdom. However, it seems to me that the introduction of the Rinzai school of Zen Buddhism by Shō Shin was a calculated move imitating Rinzai’s intimate ties to government in Japan. The Rinzai school was closely associated with the Muromachi Shogunate (it was more or less the “official” sect) and, more widely, with the nobility in general (Matsuo 2007: 195). It also retained a strong connection with Chinese practice: Chinese monks were present in some Rinzai temples, and Rinzai monks were used in addition to diplomats as intermediaries between Japan and China (Matsuo 2007: 195–6). This seems even clearer in light of the fact that the head temples of Rinzai in Kamakura and in Shuri are both named Engaku-ji (円覚寺).
  2. This is likely ateji, that is, Chinese characters used solely (or at least primarily) for their phonetic value.

ABL – ablative case
ASF – additive-scalar focus particle
COM – comitative case
COOR – coordinating converb
COP – copula
DEB – debitive mood
RLS – realis mood


Matsuo Kenji. 2007. A History of Japanese Buddhism. Folkestone: Global Oriental.

Tsukada Seisaku. 1970. 「琉球国碑文記」 Inscriptions of the Ryūkyū Kingdom. Tōkyō: Keigaku Shuppan.

Vovin, Alexander. 2009. A Descriptive and Comparative Grammar of Western Old Japanese. Part Two: Adjectives, Verbs, Adverbs, Conjunctions, Particles, Postpositions. Folkestone: Global Oriental.

The Misery of Slavery (Ryūka Zenshū 2308)

As I mentioned in my previous post, one famous poet of ryūka was named Yoshia (ca. 1650–1688 CE).

Yoshia was sold into slavery as a prostitute as a young woman. Various forms of slavery, including enslavement as a way to pay off a debt, were common and codified in law in the various polities of East Asia. By her own admission in her writings, Yoshia’s parents sold her off as a prostitute. It was common in pre-modern Japan and, apparently, the Ryūkyū Kingdom to sell one’s daughter into slavery as a prostitute in order to cover a debt, and this was almost certainly the reason Yoshia was sold. The received wisdom is that Yoshia was sold when she was 8 years old (Shimabukuro and Onaga 1968: 480–1).

Original Text

SODAteranu OYA no / noyode WAMI NAtiyute / HANA ni osiIdiyati / yoso ni momasu

sudatir-an-u ʔúya=nu
raise-NEG-ADN parents=NOM.IANI

nuyudi wa-mi nachu-ti

fana n-i ʔush-i-ʔNjach-i
flower COP-INF push-INF-go.out-INF

yúsu=ni múmas-u
elsewhere=LOC suffer-RLS

‘Why did my parents who didn’t raise me give birth to me? They sent me out as a flower, and I suffer elsewhere.’

This poem appears to be metrical, and in the normal 8-8-8-6 style. Lines 2 and 3 are potentially hypermetric, due to the vowel length of nuyudi ‘why’ (which may be nuuyudi) and nachuti ‘having given birth to, and…’ (which may be nachooti). As we have seen elsewhere, it is likely that vowel length did not factor into authors’ calculation of meter in the composition of ryūka.

ʔúya ‘parents’ with the inanimate nominative-genitive case marker =nu is unexpected, as parents are humans, and thus animate. We would expect the animate nominative-possessive case marker =ga.

If the verb form from nas- ‘to give birth to’ in line 2 is ultimately morphologically simplex, nachuti is odd. We would expect nachiti. However, this form and similar forms do occur elsewhere. Okinawa Kogo Daijiten gives <なちやうて> and <なちょーて> (but not the form with a short vowel <産ちゆて> attested here) as attested forms of a compound of nas- and the auxiliary verb wu- for a 継続 (‘continuous’) form of nas-.

Nuyudi ‘why’, to the best of my knowledge, does not occur in modern Okinawan. The OKD lists the attested forms as <のよて> and <のよで>, but does not propose an etymology. The first element is likely identical to modern Okinawan nuu ‘what’, but I do not have a good explanation for the second element. Perhaps it is the ablative case marker より, but an isolated change of r to d is unexpected, though not unprecedented, especially in light of the fact that modern Naha Okinawan, formerly confined to those of lower socioeconomic status in and around the city of Naha, merges earlier *d and *ɾ into /d/.

Fana ‘flower’ is a euphemism for a woman sold into slavery as a prostitute, as the author Yoshia was.

1 – first person
ADN – adnominal
CONT – continuous aspect
COOR – coordinating converb
COP – copula
IANI – inanimate
INF – infinitive
LOC – locative
NEG – negative
NOM – nominative
POSS – possessive
RLS – realis
SG – singular

Shimabukuro, Seibin and Toshio Onaga. 1968. 「標音評釈琉歌全集」 The Complete Ryūka: Transcribed and Annotated. Tōkyō: Musashino Sho’in.

Ryūka Zenshū 16

The Ryūka Zenshū[1] (「琉歌全集」, ‘Complete Collection of Ryūka’) is the largest collection of the ryūka style of Okinawan poetry. It was compiled in the 1960s by Shimabukuro Seibin (島袋盛敏) and Onaga Toshio (翁長俊郎), and transcribed phonetically into the Shuri lect[2] of Okinawan by the latter.

Ryūka (琉歌 ‘Ryūkyūan[3] poems’) are the traditional poems of Okinawa. They date to at least the 1700s CE, as the term ryūka itself first appears in the Konkōkenshū (「混効験集」, compiled 1711 CE), the earliest dictionary of Okinawan. This uniquely Okinawan poetic tradition likely dates back to even before the invasion by the Satsuma domain in 1609 CE, perhaps even as far back as the 1400s CE. Ryūka as a poetic style are not confined to the capital of Shuri, with several surviving examples from throughout the Ryūkyū Archipelago. Neither are they confined to the upper classes, as several examples exist from Onna Nabe (恩納なべ), a woman who lived as a farmer in the village of Onna (恩納村) in northern Okinawa, as well as from Yoshiya (よしや), a woman sold into slavery as a prostitute in the Nakajima (仲島) red-light district of Naha (previously on the southern banks of the mouth of the Kumoji River (久茂地川 Kumoji-gawa)).

Ryūka have a set meter, most typically having three lines of 8 syllables and one line of 6 syllables, for a pattern of 8-8-8-6 (or 30 total syllables). At first blush, however, almost no written ryūka appear to resemble this meter. For example, the poem I will discuss in this post, RZ 16, is written as follows:

Original Text

kareyosi ya itumo / kareyosi do mesiyairu / tada ito no UE kara / Igiyai KItiyai

The poem appears to have 35 syllables, arranged in a pattern of 8-10-9-8. We know, however, that even very early in the history of Okinawan, sequences of vowels were reduced to individual vowels through coalescence (where CV1+V2 → CV3ː) and/or glide formation (where CV1+V2 → CGV2ː). For example, 上 ‘up; top’, which I transcribe as UE, would have likely already undergone a process of glide formation (and further changes) into something resembling modern Okinawan ッウィー [ʔʷíː] ‘up; top’. By way of these phonological changes, we can “fix”, or perhaps more accurately, reconstruct the poem as it was intended, which just happens to “fix” the hypermetrical lines, rendering all lines metrical, as below:

ka.i.ju.ɕi ja ʔí.tsi.n̩ / ka.i.ju.ɕi duɾu / tá.da ʔi.tu nu ʔʷíː ka.ɾa / ʔń̩.dʑa.i tɕi.tɕa.i
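The 8-10-9-8 count for the written form can be checked mechanically. The sketch below rests on one assumption of mine: in the strict transliteration given under “Original Text”, every vowel letter corresponds to exactly one written unit, so counting vowel letters per line gives the written “syllable” count. The function name is hypothetical.

```python
VOWELS = set("aiueo")


def written_units(line):
    """Count vowel letters in a transliterated line (one per written unit, by assumption)."""
    return sum(1 for ch in line.lower() if ch in VOWELS)


# RZ 16 as transliterated above; slashes divide the lines.
poem = "kareyosi ya itumo / kareyosi do mesiyairu / tada ito no UE kara / Igiyai KItiyai"

counts = [written_units(line) for line in poem.split("/")]
target = [8, 8, 8, 6]  # the canonical ryūka meter

print(counts, sum(counts))
# How many units each written line exceeds the meter by.
print([c - t for c, t in zip(counts, target)])
```

The excess in the last three lines is what the coalescence and glide-formation processes described above have to account for.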

It is likely the case that syllables, rather than morae, count when it comes to determining the number of units allowed in a line. There is some evidence from modern Okinawan that ッウィー [ʔʷíː] ‘up; top’ is two morae: it has a clear fall in pitch over the length of the vowel, something that cannot occur with one-mora words. It also has a long vowel, which means it cannot also have a geminate (or “heavy”, if you prefer) onset. But in this poem, the author is at least bending the rules, if not using syllable counts over mora counts when it comes to determining whether or not a line is metrical.

Onaga inconsistently has the more conservative karijusi (rather than the more innovative kaijusi) and the more innovative cicai (rather than the more conservative cicari). While it is unclear when the syllable /ɾi/ lost its consonant, the fact that there is variation between <ri> and <i> in the original text, with a more etymological spelling of かれよし <kareyosi> and a less etymological spelling of 来ちやい <KItiyai> (from earlier *ki-te ar-i), suggests that this had already taken place.

Interlinear gloss and free translation
kariyushi=ya ‘ítsiN / kariyushi=du mise-ru / táda itu=nu ‘wíi=kara / ‘Ndza-i chicha-i
good.fortune=TOP always / good.fortune=FOC give.HON-ADN / just silk=GEN top=ABL / go\PFV-RLS come\PFV-RLS

‘Good fortune! Always give [me] the good fortune [of travelling]! Only coming and going from the tops of silks.’

This is a poem asking for good fortune in the form of travel. As a courtier in the court of the Ryūkyū Kingdom, being assigned to be on a mission to Japan, China, Korea, or the like, was among the most prestigious duties that one could be granted.

  1. Abbreviated as RZ.
  2. Shuri is generally considered a dialect (or topolect) of Okinawan. While Shuri is geographically contained to the former capital of Okinawa, Shuri, it is perhaps better thought of as a sociolect, being spoken by the descendants of the Okinawan gentry. Compare this to the status of its neighboring Naha, spoken in an adjacent geographical area, but by the descendants of the common people of the capital region. I use lect here as a compromise, rather than choosing between the more traditional terminology of calling it a dialect, or the perhaps more historically accurate term sociolect.
  3. Or Okinawan, as opposed to waka (和歌 ‘Japanese poems’) and kanshi (漢詩 ‘Chinese poems’).
  4. This is intended as a strict transliteration—not a transcription—of the original text, going from the poem as it is written. No analytic devices other than slashes, to divide lines, are used.

Stars, Asterisms, and Constellations in Okinawan

Just a short post this time around, but a series I plan on continuing.

Different cultures label different parts of the night sky differently. Various Australian Aboriginal cultures labeled the “gaps” in the Milky Way (which modern astronomy understands to be particularly dense clouds of interstellar dust) as different creatures. Other cultures, like the ancient Babylonians, labeled arbitrary patterns between particularly bright stars, the ancestors of today’s modern constellations. Stars, asterisms (groupings of stars and other celestial objects which are not traditional constellations), and constellations can all vary significantly from culture to culture.

Here’s a brief look at some of these in Okinawan culture, and why they are named the way they are.

Polaris (the North Star) is niinufabushi (ニーヌファブシ) in modern Okinawan. This likely comes from older 子の方星 ne=no fau boshi |1st.earthly.branch=GEN.IANI direction GEN\star| ‘the North Star’. 子 ne, the 1st Earthly Branch, is associated with the direction north, as well as the 11th Lunar Month, and the Rat zodiac sign.

The constellation Ursa Minor (also called the Little Dipper) is called kajimayaabushi (カジマヤーブシ) in modern Okinawan. This is from the earlier 風回りや kaze mawar-i-ya |wind turn-NMLZ-thing| ‘pinwheel’. In Okinawa, pinwheels are traditionally made from the leaves of the pandanus tree (modern Okinawan adan アダン ‘pandanus tree, Pandanus odoratissimus’). This site describes how to make a traditional Okinawan pinwheel. Modern pinwheels are generally made of multicolored paper. An excellent question would be why the constellation has this name. My best guess is that Ursa Minor seems to rotate around Polaris, much like a pinwheel spins in the wind.

What’s an “Ll” between friends?

A well-meaning vandal in Wales decided that a new sign, welcoming people to the town of Llanelli, had a spelling error, the Llanelli Star reports.

The sign originally read:

Croeso i Lanelli

Welcome to Llanelli

However, the vandal changed the Welsh portion to read:

Croeso i LLanelli

In extant Celtic languages, there are a number of morphosyntactic “triggers” which influence the initial consonant of a word. These are known as initial consonant mutations. In Welsh, one of these initial consonant mutations is the soft mutation. Generally, voiceless stops become voiced, voiced stops become fricatives, <m> /m/ becomes <f> /v/, <rh> /r̥/ becomes <r> /r/, and <ll> /ɬ/ becomes <l> /l/.

However, L2 speakers find initial consonant mutations hard to master. The morphosyntactic triggers are quite diverse, so knowing when and when not to mutate an initial consonant can be tricky. Unfortunately for our well-meaning vandal, the preposition i ‘to, for’ is a trigger for soft mutation in Welsh. So the correct spelling really is as the sign originally was: Croeso i Lanelli (pronounced [ˈkrɔi̯.sɔ iː la.ˈnɛ.ɬi] in Southern Welsh).
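The soft mutation described above is regular enough to sketch as a lookup on the initial consonant. The mapping below is the standard set of soft-mutation pairs (including the loss of initial <g>); the function name, the list of digraphs left untouched, and the handling of capital letters are my own assumptions, and the sketch makes no attempt to decide when a mutation is triggered.

```python
# Soft-mutation pairs, keyed by unmutated initial (spelling, not IPA).
SOFT = {
    "ll": "l", "rh": "r",          # digraphs must be checked before single letters
    "p": "b", "t": "d", "c": "g",  # voiceless stops become voiced
    "b": "f", "d": "dd", "g": "",  # voiced stops become fricatives; <g> is dropped
    "m": "f",
}
# Initial digraphs that only look like mutable consonants and are left alone.
UNMUTATED_DIGRAPHS = ("ch", "th", "ph", "dd", "ff", "ng")


def soft_mutate(word):
    """Apply soft mutation to the first consonant of a single word."""
    lower = word.lower()
    if lower.startswith(UNMUTATED_DIGRAPHS):
        return word
    for initial, mutated_initial in SOFT.items():
        if lower.startswith(initial):
            mutated = mutated_initial + word[len(initial):]
            # Keep the capital letter of a place name.
            if word[0].isupper():
                mutated = mutated[:1].upper() + mutated[1:]
            return mutated
    return word


for town in ("Llanelli", "Caerdydd", "Bangor", "Abertawe"):
    print("Croeso i", soft_mutate(town))
```

For the sign in question this gives “Croeso i Lanelli”, the original and correct spelling.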

Zero derivation in Ainu

Ainu is supposedly a polysynthetic language, but displays a large number of characteristics that, at least when compared with the prototypical North American polysynthetic language, don’t really line up in my opinion.

I’ve long had a suspicion that Ainu has zero derivation, but didn’t have a lot of particularly good examples. One of my go-to ones up to this point has been the word omke ‘cough’, which is both a noun and a verb. The other night, I came across what I think is a much better example:

(1) a-un-kor-e-∅ ka somo ∅-∅-ki1 ruwe ∅-∅-ne wa.
INDF.A-1PL.O-have-CAUS-NMLZ even not 3.A-3.O-do FACT 3.A-3.O-COP EMPH
‘It was the case that we weren’t given it.’ (Nakagawa and Nakamoto 2004: 72)

I think there is one strong line of evidence for this being zero derivation of a verb to a noun, and one weaker one. The stronger of the two is the fact that this cannot be a relative clause, and the weaker is the non-occurrence of the complementizer hi.

Relative clauses in Ainu are modifier-head in their order. We have a verb, aunkore ‘we weren’t given any’, which is a potential relative clause but not a head, and a particle, ka ‘even’, which together with the preceding verb is also a potential relative clause but not a head. Note that the adverb somo ‘not’ negates the following verb (here, ki ‘to do’), so clearly, we have no potential heads for aunkore ka to be a relative clause that is the object of ki.

There is an alternate interpretation here, that the complementizer hi is just not expressed here for some reason. This would not be unheard of. For instance, compare the following in English:

(2) He said that I fell.

(3) He said I fell.

Both are perfectly grammatical. In Ainu, however, we do not find this variation. I was unable to locate any examples of the complementizer hi followed by even the lexical verb ki ‘to do’. So while it could be the case that we just lack a complementizer here, I think instead we are dealing with zero derivation.


1. As an aside, ki here functions as a light verb. I’m not entirely sure if in this capacity it is still transitive, but I have glossed it as such. Clearly, in its function as the lexical verb ‘to do’, we have good evidence that ki is indeed a transitive verb.

For instance, it takes transitive agreement markers (Izutsu 2003, a corpus of Asahikawa dialect Ainu, has examples of both an-∅-ki |INDF.A-3.O-do| and ci-∅-ki |1PL.A-3.O-do|, and no examples of *ki-an |do-INDF.S| or *ki-as |do-1PL.S|), and can be antipassivized (cf. i-ki-an |APASS-do-INDF.S|, also many examples in Izutsu 2003). But to the best of my knowledge, and certainly in Izutsu 2003’s corpus of Asahikawa Ainu, no distinction is made between the light verb ki and the lexical verb ki, so I’m just not sure.


1 – first person
3 – third person
A – Agent-like argument (“subject”) of a transitive verb
APASS – antipassive
CAUS – causative
COP – copula
EMPH – emphatic
FACT – factual/inferential evidential
INDF – indefinite person
NMLZ – nominalizer
O – Patient-like argument (“object”) of a transitive verb
PL – plural
S – single argument of an intransitive verb


Izutsu, Katsunobu. 2003. Ainugo Asahikawa Hōgen Kōpasu ni Motozuku Bunpōsho Hensan no Tame no Kiso Kenkyū [Basic Research for Compiling a Grammar Based on a Corpus of Asahikawa Dialect Ainu]. Asahikawa: Hokkaidō Kyōiku Daigaku Asahikawa-kō.

Two (informal) constraints on Okinawan phonology

In this post, I’d like to informally detail two phonotactic constraints in Okinawan: the bimoraic constraint and the word-initial non-syllabic onset constraint.

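| | Bilabial | Alveolar | Alveolo-Palatal | Palatal | Velar | Glottal |
|---|---|---|---|---|---|---|
| Tenuis Stop | (p) b | t d | | | k g | ʔ |
| Palatalized Stop | (bʲ) | | | | | ʔʲ |
| Labialized Stop | | | | | kʷ gʷ | ʔʷ |
| Flap | | ɾ | | | | |
| Affricate | | (ts dz) | tɕ dʑ | | | |
| Fricative | | s (z) | ɕ | | | h |
| Nasal | m | n | | | | |
| Approximant | | | | j | w | |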

Table 1: Okinawan consonants.

Table 1 details Okinawan consonants. Consonants in parentheses are marginal, occurring only in onomatopoeia (as is the case for /p/ and /bʲ/) or in the speech of people historically from the gentry, i.e. the Shuri sociolect (as is the case for /ts/, /dz/, and /z/, which have merged with /tɕ/, /dʑ/, and /dʑ/ in the speech of people who were historically commoners, i.e. the Naha sociolect).

The first constraint I’m going to discuss is the bimoraic constraint.

(1) The Bimoraic Constraint
If a phonological word is one syllable, it must be two morae in length.

So what is a mora in Okinawan? Vowels count as one mora each, syllabic nasals count as one mora, and geminate consonants count as one mora. For one syllable words, this means the following shapes are possible:

(2) Monosyllabic words in Okinawan
CVV – [jaa] ‘house’, [ʔʲaa] ‘you (nonpolite)’, [ʔaa] ‘bubble’
CVN – [ʔin̩] ‘dog’, [sun̩] ‘to do’, [mun̩] ‘thing’
QCV – [kkʷa] ‘child’, [ttɕu] ‘person’ (these may be the only two words of this shape)

The potential shapes *V, *VV, *VVC, and *VCC do not occur. *V and *VVC do not occur because they inherently violate the bimoraic constraint. *VCC does not occur as it has no potential historical antecedents. Finally, the CCV shape is limited to QCV (where QC is a geminate version of C). As we have seen, the bimoraic constraint cannot explain all of these non-occurring forms, specifically *VV. For that, we need at least one more constraint that holds for all phonological words in Okinawan: the word-initial onset constraint.

(4) The Word-Initial Onset Constraint
A phonological word must begin with an onset.
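As a rough illustration of how the two constraints combine for one-syllable words, here is a small sketch. It assumes my shorthand from (2), where a word shape is a string of C (consonant), V (vowel), N (syllabic nasal), and Q (the first half of a geminate), and it counts morae exactly as defined above. The function names are hypothetical, and the check only covers monosyllables.

```python
# Symbols that contribute one mora each: vowels, syllabic nasals, geminate halves.
MORAIC = {"V", "N", "Q"}


def mora_count(shape):
    """Number of morae in a word shape such as 'CVV' or 'QCV'."""
    return sum(1 for symbol in shape if symbol in MORAIC)


def has_onset(shape):
    """Word-initial onset constraint: the word must begin with a consonant."""
    return shape[:1] in {"C", "Q"}


def is_licit_monosyllable(shape):
    """A one-syllable word needs an onset and exactly two morae."""
    return has_onset(shape) and mora_count(shape) == 2


for shape in ("CVV", "CVN", "QCV", "CV", "V", "VV"):
    print(shape, mora_count(shape), has_onset(shape), is_licit_monosyllable(shape))
```

Note that *VV passes the mora count but fails the onset check, which is exactly why the second constraint is needed.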

Note that there are potential counter-examples to this constraint. For instance, [ʔeema] ‘interval’ contrasts with [eema] ‘Yaeyama’, with the latter apparently not having an onset, though it did historically (< *yaema < PR *ya-pe+yama |eight-layer.CLF+mountain|; I have no explanation for the missing syllable). Speakers, however, typically add a consonant (in this case the glide /j/) to repair the supposed violation of this constraint. Recordings of this word in particular from the Shuri-Naha Hōgen Onsei Dētabēsu pretty clearly reflect this, as does an acoustic look at the word:

Figure 1 – Waveform of [jeema]
Figure 2 – Spectrogram of [jeema]

It seems in most cases that speakers repair violations of this constraint, but there may be a few onomatopoeia and para-linguistic sound patterns that violate this constraint.

Both of these constraints and how I define a mora in Okinawan have significant influence on how I analyze Okinawan phonology. For instance, counting geminate consonants as a mora means that I must propose palatalized and labialized glottal stops—typologically quite rare—because otherwise a word like *[ʔjaa] (which I analyze as [ʔʲaa] ‘you (nonpolite)’) would violate the bimoraic constraint (as it would have three morae).

We also have two contrasting sets of words which have a syllabic nasal as the nucleus of their first syllable. For instance, we must analyze what is written as <ンニ> or <Nni> (where the capital <N> stands for a syllabic nasal) ‘breast, chest’ as [nn̩ni], rather than *[n̩ni], as the latter violates the word-initial onset constraint. Note that this contrasts with a glottal-stop initial [ʔn̩ni] ‘rice (plant)’. Historically, these seem to make sense to me, as the former originally had a consonantal onset (Proto-Ryūkyūan *mune ‘breast, chest’), while the latter did not (Proto-Ryūkyūan *ine ‘rice (plant)’). This constraint seems to be a Northern Ryūkyūan innovation, as Southern Ryūkyūan varieties do not face this restriction; compare Ōgami Miyako [ɑmi] ‘rain’ (a Southern Ryūkyūan variety), Yamato Amami [ʔamï] (a Northern Ryūkyūan variety), and Shuri-Naha Okinawan [ʔami] ‘rain’ (another Northern Ryūkyūan variety), all from Proto-Ryūkyūan *ame ‘rain’.

An improved analysis of the imperfective realis form of Okinawan verbs

User Hakaku on my thread about the last post on reddit’s /r/linguistics subreddit pointed out that there is perhaps a better analysis than the one I presented last time, and I am inclined to agree.

One thing I left out last time was the fact that I only dealt with consonant-stem verbs. Japonic languages also have vowel-stem verbs, and any adequate explanation must work for both. My previous solution does not. So here I present a better solution which should account for all regular forms. Here we’ll just work with the following roots: *kak- ‘to write’ (a consonant-stem verb), *kir- ‘to cut’ (another consonant-stem verb), and *ki- ‘to wear’ (a vowel-stem verb).

Note that vowel-stem verbs historically only ended in *-i or *-e, but this second group disappeared in Okinawan due to the Northern Ryūkyūan chain shift (or perhaps paradigm leveling with the former type, or both—the number of verbs ending in *-e in Japonic is quite small).

We start off much the same as last time, with the optionality of the glide at the beginning of *wor-. In all forms, this creates a non-identical vowel cluster, which is illicit.

(1a) *kak-i+wor-um |write-INF+CONT-RLS| > *kak-i+or-um
(1b) *kir-i+wor-um |cut-INF+CONT-RLS| > *kir-i+or-um
(1c) *ki+wor-um |wear\INF+CONT-RLS| > *ki+or-um

This vowel cluster is resolved with V2 elision, where the *o in *or- is elided:

(2a) *kak-i+or-um > *kak-i+r-um
(2b) *kir-i+or-um > *kir-i+r-um
(2c) *ki+or-um > *ki+r-um

Next, a special change affects just the verb ‘to cut’. Throughout Okinawan, sequences of *ri became *i. This again creates an illicit vowel cluster, which is resolved with elision.

(3a) *kak-i+r-um
(3b) *kir-i+r-um > *ki-i+r-um (*ri to *i) > *ki+r-um (V2 elision)
(3c) *ki+r-um

Next is the progressive palatalization/lenition of *r to *y.

(4a) *kak-i+r-um > *kak-i+y-um
(4b) *ki+r-um > *ki+y-um
(4c) *ki+r-um > *ki+y-um

Next is the palatalization of *ki to *t͡ɕi:

(5a) *kak-i+y-um > *kat͡ɕ-i+y-um
(5b) *ki+y-um > *t͡ɕi+y-um
(5c) *ki+y-um > *t͡ɕi+y-um

Next is a change that only affects ‘to write’, the coalescence of the *iy sequence to just palatalization on the preceding consonant:

(6a) *kat͡ɕ-i+y-um > *kat͡ɕ+um
(6b) *t͡ɕi+y-um
(6c) *t͡ɕi+y-um

This complete coalescence could not occur in (6b) and (6c), as it would obliterate the original root, which is licit only in the most frequent (in other words, irregular) verbs.

Penultimately comes the neutralization of syllable-final *m. and *n. (this change is late, but it doesn’t necessarily have to be this late):

(7a) *kat͡ɕ+um > *kat͡ɕ+un
(7b) *t͡ɕi+y-um > *t͡ɕi+y-un
(7c) *t͡ɕi+y-um > *t͡ɕi+y-un

Lastly, there is the morphological reanalysis of the auxiliary as part of the root, and the realis suffix as also having an imperfective meaning (expanding to take on an expanded version of the continuous function of the original auxiliary):

(8a) *kat͡ɕ+un > kat͡ɕ-un
(8b) *t͡ɕi+y-un > t͡ɕiy-un
(8c) *t͡ɕi+y-un > t͡ɕiy-un
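Steps (1) through (8) can be replayed mechanically as an ordered list of string rewrites. What follows is only a rough sketch in Python, not a claim about how the changes should be formalized: the rule patterns, the `derive` helper, and the choice to let the morpheme boundaries (`-`, `+`) carry the conditioning are all my own simplifications, and asterisks are left off because the strings are plain text. In particular, the *iy coalescence is written to require a boundary before the *i, which is how the sketch keeps (6b) and (6c) from losing their roots.

```python
import re

TS = "t\u0361\u0255"  # the affricate t͡ɕ, spelled with escapes to avoid encoding ambiguity

# Ordered rules; the numbers in the labels match the steps in the text.
RULES = [
    ("(1) glide loss", lambda f: f.replace("+wor-", "+or-")),
    ("(2) V2 elision of o", lambda f: f.replace("i+or-", "i+r-")),
    ("(3) ri > i", lambda f: f.replace("r-i", "-i")),
    ("(3) V2 elision of the second i", lambda f: f.replace("i-i", "i")),
    ("(4) r > y after i", lambda f: f.replace("i+r-", "i+y-")),
    ("(5) ki > t͡ɕi", lambda f: re.sub(r"k(-?)i", TS + r"\1i", f)),
    # Assumption: coalescence only applies when i is a separate morpheme,
    # so vowel-final roots like t͡ɕi- are left intact.
    ("(6) iy coalescence", lambda f: f.replace("-i+y-", "+")),
    ("(7) final m > n", lambda f: re.sub(r"m$", "n", f)),
    ("(8) reanalysis", lambda f: f.replace("+y-", "y-").replace("+", "-")),
]


def derive(form):
    """Return every stage of the derivation, starting with the input form."""
    stages = [form]
    for _label, rule in RULES:
        stages.append(rule(stages[-1]))
    return stages


for start in ("kak-i+wor-um", "kir-i+wor-um", "ki+wor-um"):
    print(start, ">", derive(start)[-1])
```

Reordering entries in `RULES` is a quick way to check which orderings actually matter, for example whether the nasal neutralization in (7) can be moved earlier without changing the outcome.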

As I mention in my reply on reddit, the one thing I don’t have a good explanation for is “Naha” (lower class/innovative) forms like t͡ɕi-in ‘to cut, to wear’. They likely do not have anything to do with pitch accent (in some cases, a low pitch accent causes a vowel to lengthen in Naha but not Shuri), but are instead likely caused by the difference in intensity between *i and *u, along with compensatory lengthening and morphological reanalysis.

(9a) kat͡ɕ-un (Naha as in Shuri)
(9b) *t͡ɕiy-un > *t͡ɕi-un (coalescence of *i and *y) > *t͡ɕi-n (V2 elision) > *t͡ɕii-n (compensatory lengthening) > t͡ɕi-in
(9c) *t͡ɕiy-un > *t͡ɕi-un (coalescence of *i and *y) > *t͡ɕi-n (V2 elision) > *t͡ɕii-n (compensatory lengthening) > t͡ɕi-in

Or something.

reddit user Hakaku proposes the widely recurring change of *yu to *i, which is probably a more straightforward analysis. We’d end up with:

(10) *t͡ɕiy-un > *t͡ɕi-in (coalescence of *yu to *i) > t͡ɕi-in (reanalysis, Naha form)
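To make the comparison between (9) and (10) concrete, here is a small Python sketch that runs both routes on the Shuri forms. The function names and the string patterns are my own shorthand for the changes named above, and each pattern is deliberately narrow so that kat͡ɕ-un passes through untouched. Both routes end up at the same Naha form; the sketch says nothing about which one is historically right.

```python
TS = "t\u0361\u0255"  # the affricate t͡ɕ


def naha_stepwise(form):
    """Route (9): coalescence, V2 elision, lengthening, reanalysis."""
    steps = [
        ("iy-u", "i-u"),   # coalescence of i and y
        ("i-u", "i-"),     # V2 elision of u
        ("i-n", "ii-n"),   # compensatory lengthening
        ("ii-n", "i-in"),  # reanalysis of the boundary
    ]
    stages = [form]
    for old, new in steps:
        stages.append(stages[-1].replace(old, new))
    return stages


def naha_yu_to_i(form):
    """Route (10): yu > i across the suffix boundary."""
    return form.replace("y-u", "-i")


for shuri in ("ka" + TS + "-un", TS + "iy-un"):
    print(shuri, naha_stepwise(shuri)[-1], naha_yu_to_i(shuri))
```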

Simple verbs in Okinawan aren’t so simple

In the last post on this topic, I glossed over the development of the Modern Okinawan simple (imperfective) realis form from an earlier continuous realis form. This will be a short post simply describing what I think the series of phonological and morphological changes involved are.

For illustrative purposes, we’ll use the verb root √kak- ‘to write’. The original form would have been something like *kak-i+wor-um |write-INF+CONT-RLS|.

First, consider the evidence we have for the realis originally being *-um. In both Western and Eastern Old Japanese, we only find -u (or -i with r-irregular verbs) as what Vovin (2009: 595) calls the final predication form (what I call the realis form). However, the situation is different in the Ryūkyūan subgroup. In Okinawan, we find a number of forms ending in a nasal, as in the simple realis form, given in (1):

(1) kach-un |write-RLS| ‘[Someone] writes.’

The most crucial piece of evidence, however, is the realis form followed by the general question clitic =i. For instance:

(2) kach-um=i |write-RLS=GQ| ‘Does [someone] write?’

It appears that at some point in the history of Okinawan the distinction between syllable-final -m and -n was neutralized and we only find syllable-final -n. But in the form with the general question enclitic, there is resyllabification, and instead of being syllable-final, the underlying -m is now syllable-initial, and it is now licit for it to appear overtly as m. I will represent this underlying form as -uM.
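The behavior of -uM can be stated as a tiny realization rule: M surfaces as m when the next segment is a vowel (so it can be a syllable onset) and as n otherwise. The Python below is only a sketch of that statement; the `realize` function, the five-vowel inventory, and the treatment of `-`, `=`, and `+` as invisible boundary symbols are assumptions made for illustration.

```python
VOWELS = set("aeiou")   # assumed vowel inventory for this sketch
BOUNDARIES = "-=+"      # boundary symbols are skipped when looking ahead


def realize(underlying):
    """Spell out underlying M as m before a vowel and as n elsewhere."""
    out = []
    for i, ch in enumerate(underlying):
        if ch != "M":
            out.append(ch)
            continue
        # Find the next real segment after M, ignoring boundary symbols.
        following = next((c for c in underlying[i + 1:] if c not in BOUNDARIES), "")
        out.append("m" if following in VOWELS else "n")
    return "".join(out)


print(realize("kach-uM"))    # kach-un
print(realize("kach-uM=i"))  # kach-um=i
```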

The next point we need to explain is the palatalization of the verb root *√kak- with the realis suffix -un. There appears to be no overt synchronic motivation, but there is a diachronic explanation: a following infinitive *-i motivated the palatalization to *kat͡ɕ-i (< *kakʲ-i). But -uM should not prompt the infinitive. The only reason we would expect the infinitive in this context is with an auxiliary verb. In Western Old Japanese, for instance, we find that nearly all auxiliaries and some converbs follow this pattern (such as the clause chaining converb -te, which always follows an infinitive), and even in Modern Japanese, we find compound verbs formed on this pattern (for instance, tob-i+kom-u |jump-INF+into-IPFV| ‘[Someone] jumps into [something]’).

Another neutralization involved the auxiliary verb wor- |CONT| (as a lexical verb ‘to exist’). This was caused by the addition of a pair of rules. The first mandated consonantal onsets in all morpheme-initial positions. For vowel-initial morphemes, this was satisfied by the addition of a glottal stop. Thus, PR *ame ‘rain’ (< PJ *Samay¹ ‘rain’) became pre-Modern Okinawan *ʔame ‘id.’. Additionally, the initial consonant of wo, especially after raising to wu, became optional and could be replaced by a hiatus. wor- is unaccented, and thus high register in all Okinawan varieties, but although short vowels tend to be higher in pitch than their long counterparts (an apparent phonetic universal), it is not clear how pitch would have interacted here (as the pitch accent pattern of a main verb + auxiliary verb compound would likely be different from that of the root verb).

The most contentious point is the V2 elision, due to the fact that V2 elision is a typologically rare phenomenon. V2 elision is a process where, in a sequence of two vowels, V1 and V2, the second vowel, V2, is deleted and the first vowel, V1, remains. I propose that in this case, the motivation is the relatively higher sonority of *i versus *o (or, if it raises in the Northern Ryūkyūan chain shift before this point, *u, which is even less sonorous). In any case, the infinitive *-i and the entirety of the auxiliary *or- must both delete, as there is no residue of either in Modern Okinawan.

As I just mentioned, the initial vowel of the auxiliary *or- |CONT| then deletes, and creates an illegal consonant cluster (all non-geminate consonant clusters are illegal in Modern Okinawan), which then simplifies by deleting the second consonant.

We are then left with just the realis form -uM, which becomes the new simple (imperfective) form of the verb, and does double duty contrasting with both the perfective aspect (an expansion from its original continuous aspect meaning), and with various other mood markers (like the adnominal² -uru and the tentative -ura).

So schematically:

(3) *kak-i+wor-um
(introduction of contrast between o with a non-glottal onset, (w)o, and with a glottal onset, ʔo; hiatus optionally replaces glide)
(neutralization of syllable-final -n/-m contrast)
kakʲ-i+or-un (palatalization of k- due to following -i)
kakʲ-i+r-un (V2 elision of o)
kakʲ+r-un (deletion of -i due to palatalization of kʲ-)
kakʲ+un (deletion of r due to illegal consonant cluster)
kakʲ-un (reanalysis of auxiliary as suffix)
(affrication of kʲ- to t͡ɕ-)
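The schematic in (3) can also be run as an ordered list of rewrites, which has the side benefit of spelling out a form for every step, including the ones listed above only as a description. This is a rough Python sketch under my own assumptions: the regular expressions are just one way of encoding each step, the `derive` helper is hypothetical, and the intermediate forms it produces for the description-only steps are what the rules as written predict rather than attested or separately argued forms.

```python
import re

TS = "t\u0361\u0255"  # the affricate t͡ɕ
PAL = "\u02b2"        # superscript j, marking palatalization (ʲ)

# (label, regex pattern, replacement), in the order given in (3).
STEPS = [
    ("hiatus replaces the glide", r"\+wor-", "+or-"),
    ("neutralization of final -m/-n", r"m$", "n"),
    ("palatalization of k before -i", r"k-i", "k" + PAL + "-i"),
    ("V2 elision of o", r"i\+or-", "i+r-"),
    ("deletion of -i after palatalized k", PAL + r"-i\+", PAL + "+"),
    ("deletion of r in the illegal cluster", PAL + r"\+r-", PAL + "+"),
    ("reanalysis of auxiliary as suffix", r"\+", "-"),
    ("affrication", "k" + PAL, TS),
]


def derive(form):
    """Apply each step once, in order, and return all stages."""
    stages = [form]
    for _label, pattern, replacement in STEPS:
        stages.append(re.sub(pattern, replacement, stages[-1]))
    return stages


for (label, _p, _r), stage in zip(STEPS, derive("kak-i+wor-um")[1:]):
    print(f"{stage}  ({label})")
```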


(1) The *S stands for some sort of fricative. In certain compounds, ‘rain’ shows up with an initial s-. For instance, WOJ pîsamë ‘hail’ (< ‘ice’ + samë ‘rain’). Vovin (2010: 109) speculates that this may be PJ *z or *h.

(2) I consider the adnominal form a mood marker due to its interaction with so-called focus particles. In addition to being used to form relative clauses, it is obligatory when the emphatic focus particle =du is used in a sentence, like how the realis mood is typically obligatory, or how the tentative mood is used when the general question particle is unraised. Compare (4a), (b), and (c):

(4a) sumuchi=∅ kach-un
book=ACC write-RLS
‘[Someone] writes a book.’

(4b) sumuchi=du kach-uru
book=EFP write-ADN
‘[Someone] writes a book.’

(4c) sumuchi=ga kach-ura
book=IQ write-TENT
‘[Someone] writes a book?’
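The pattern in (4a) through (4c) amounts to a lookup from the particle on the object to the mood suffix on the verb. Here is a minimal Python sketch of just that dependency; the table, the `clause` function, and the use of `None` for the unmarked accusative are illustrative choices of mine, and the sketch covers only the three particles shown in (4).

```python
NULL = "\u2205"  # the null morpheme symbol, ∅

# particle -> (gloss, mood suffix); None stands for the null accusative
MOOD_BY_PARTICLE = {
    None: ("RLS", "un"),
    "du": ("ADN", "uru"),
    "ga": ("TENT", "ura"),
}


def clause(obj, stem, particle=None):
    """Build the object plus verb string and return it with the mood gloss."""
    gloss, suffix = MOOD_BY_PARTICLE[particle]
    marker = particle if particle is not None else NULL
    return f"{obj}={marker} {stem}-{suffix}", gloss


for particle in (None, "du", "ga"):
    print(*clause("sumuchi", "kach", particle))
```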


Vovin, Alexander. 2009. A Descriptive and Comparative Grammar of Western Old Japanese. Part Two: Adjectives, Verbs, Adverbs, Conjunctions, Particles, Postpositions. Folkestone: Global Oriental.

Vovin, Alexander. 2010. Koreo-Japonica: A Re-Evaluation of a Common Genetic Origin. Folkestone: Global Oriental.


∅ – null
- – morpheme boundary
+ – compound boundary
ACC – accusative case
ADN – adnominal mood marker
CONT – continuous aspect marker
EFP – emphatic focus particle
GQ – general question (yes/no-question) particle
INF – infinitive
IPFV – imperfective aspect marker
IQ – information question (wh-question) particle
RLS – realis mood marker
TENT – tentative mood marker