Steven Moran, Eitan Grossman and Lilja Maria Sæbø
26 April, 2025
- Overview
- Setup
- Basics of the database contents
- Tables for the paper
- Distribution of the languages, families and cases of tonogenesis across different areas
- Number of languages in different families
- Cases of tonogenesis sorted by triggering context
- Tonogenesis conditioned by voiced and voiceless (unaspirated) obstruents
- Tonogenesis triggered by coda consonants
- Tonogenesis based on vowel length
- Tonogenesis based on vowel height
- Tonogenesis based on ATR
- Effect of voicing on tone
- Tonogenesis triggered by codas
- Onset Voicing by effect on pitch
- Effect of voicing on pitch
- Effect of voice on pitch
- Effect of coda glottal on pitch
- Effect of vowel height on pitch
- Effect of nucleus length on pitch
- Effect of nuclear +/- ATR on pitch
- Number of cases/varieties of different types for each region
- Area and tonogenesis specific tables
- Tonogenetic events by macroarea
- Examples from the database for the paper
- A table showing the number of cases/languages for each type in each region
- Multiple paths to the same result
- Patterns in level vs contour height
- New tables for revise and resubmit
Supplementary materials for “Tonogenesis: a diachronic typology” by Lilja Maria Sæbø, Eitan Grossman and Steven Moran, accepted in Diachronica.
The CLDF data are available here:
Load the libraries.
library(tidyverse)
library(knitr)
library(kableExtra)
library(xtable)
library(ggalluvial)
Load the tonodb CLDF data.
values <-
read_csv(url('https://raw.githubusercontent.com/cldf-datasets/tonodb/main/cldf/values.csv'))
languages <-
read_csv(url('https://raw.githubusercontent.com/cldf-datasets/tonodb/main/cldf/languages.csv'))
contributions <-
read_csv(url('https://raw.githubusercontent.com/cldf-datasets/tonodb/main/cldf/contributions.csv'))
parameters <-
read_csv(url('https://raw.githubusercontent.com/cldf-datasets/tonodb/main/cldf/parameters.csv'))
We have this many languages in our sample.
nrow(languages)
## [1] 97
And this many observations.
nrow(values)
## [1] 259
Let’s map our data points. Some rows are dropped from the plot because their latitude and longitude are NA; these entries are listed as dialects or language families.
ggplot(data=languages, aes(x=Longitude, y=Latitude)) +
borders("world", colour="gray50", fill="gray50") +
geom_point() +
theme_bw()
## Warning: Removed 24 rows containing missing values or values outside the scale range
## (`geom_point()`).
These are the missing data points for geographic location.
languages %>% filter(is.na(Latitude)) %>% select(ID, Name, Macroarea, Latitude, Longitude) %>% kable()
| ID | Name | Macroarea | Latitude | Longitude |
|---|---|---|---|---|
| atha1247 | Athabaskan | NA | NA | NA |
| auks1239 | Aukshtaitish | Eurasia | NA | NA |
| cant1236 | Cantonese | Eurasia | NA | NA |
| cent2346 | Central Tibetan | NA | NA | NA |
| coas1300 | Coast Tsimshian | North America | NA | NA |
| east2280 | Eastern Baltic | NA | NA | NA |
| extr1245 | Extreme Southern New Caledonian | NA | NA | NA |
| kere1287 | Keresan | NA | NA | NA |
| mang1393 | Mangbetu-Asua | NA | NA | NA |
| metn1237 | Metnyo | Papunesia | NA | NA |
| midd1319 | Middle Franconian | Eurasia | NA | NA |
| moha1257 | Mohawk-Oneida | NA | NA | NA |
| newc1243 | New Caledonian | NA | NA | NA |
| nort3160 | North Germanic | NA | NA | NA |
| podo1243 | Podoko | NA | NA | NA |
| pwoo1239 | Pwo | NA | NA | NA |
| raja1258 | Raja Ampat Maya | NA | NA | NA |
| sind1278 | Sindhi-Lahnda | NA | NA | NA |
| slav1255 | Slavic | NA | NA | NA |
| taik1256 | Tai-Kadai | NA | NA | NA |
| tere1281 | Terena | South America | NA | NA |
| utsa1239 | Lhasa Tibetan | Eurasia | NA | NA |
| yeni1252 | Yeniseian | NA | NA | NA |
| zhuo1234 | Zhuoni | Eurasia | NA | NA |
We went through these entries by hand and added approximate geocoordinates for visualization purposes, e.g., using Glottolog’s latitude and longitude for Swedish to place North Germanic.
Merge in the hand-assigned geocoordinates.
# There must be a saner way to do this!
hc <- read_csv('hand_coordinates.csv')
tmp <- left_join(languages, hc, by=c("ID"="ID", "Name"="Name"))
tmp <- tmp %>% mutate(Latitude.x = coalesce(Latitude.x, Latitude.y))
tmp <- tmp %>% mutate(Longitude.x = coalesce(Longitude.x, Longitude.y))
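# Note: only Latitude.y is dropped here; Longitude.y stays in the table
# (write -Longitude.y to drop it as well).
tmp <- tmp %>% select(-Latitude.y, Longitude.y)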
tmp <- tmp %>% rename(Latitude = Latitude.x)
tmp <- tmp %>% rename(Longitude = Longitude.x)
languages <- tmp
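The comment in the chunk above asks for a saner way to do this merge. One option, sketched here on invented toy data, is `rows_patch()` from dplyr, which fills only the `NA` cells of matching rows. The data frames `languages_demo` and `hand_demo` are hypothetical stand-ins, and the column layout assumed for `hand_coordinates.csv` (`ID`, `Latitude`, `Longitude`) is a guess rather than something confirmed by the file itself.

```r
library(dplyr)

# Toy stand-in for the CLDF languages table (all values are invented).
languages_demo <- tibble(
  ID        = c("aaaa1111", "bbbb2222", "cccc3333"),
  Latitude  = c(10.5, NA, NA),
  Longitude = c(20.5, NA, NA)
)

# Toy stand-in for hand_coordinates.csv (assumed columns: ID, Latitude, Longitude).
hand_demo <- tibble(
  ID        = c("bbbb2222", "cccc3333"),
  Latitude  = c(59.8, -22.1),
  Longitude = c(17.4, 167.0)
)

# rows_patch() overwrites only NA values in rows whose key matches,
# so existing coordinates are left untouched.
patched <- rows_patch(languages_demo, hand_demo, by = "ID")
patched
```

Because no `.x`/`.y` suffix columns are created, nothing has to be dropped or renamed afterwards. By default `rows_patch()` raises an error if the patch table contains an `ID` that is absent from the main table, which can double as a sanity check on the hand-made file.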
Redo the map.
ggplot(data=languages, aes(x=Longitude, y=Latitude)) +
borders("world", colour="gray50", fill="gray50") +
geom_point() +
theme_bw()
Here we can add some color by language family.
ggplot(data=languages, aes(x=Longitude, y=Latitude, color=family_id)) +
borders("world", colour="gray50", fill="gray50") +
geom_point() +
theme_bw() +
theme(legend.position="none")
# ggtitle("Language varieties colored for language family")
How many data points per macroarea? (Note again several NAs.)
table(languages$Macroarea, exclude=FALSE)
##
## Africa Eurasia North America Papunesia South America
## 11 40 17 7 6
## <NA>
## 16
Some Glottolog macroareas are missing, e.g., for entries that lack Glottocodes or whose codes are family-level.
languages %>% filter(is.na(Macroarea))
## # A tibble: 16 × 18
## ID Name Macroarea Latitude Longitude Glottocode ISO639P3code family_id
## <chr> <chr> <chr> <dbl> <dbl> <chr> <chr> <chr>
## 1 atha1247 Atha… <NA> 60.5 -151. atha1247 <NA> atha1245
## 2 cent2346 Cent… <NA> 28.4 90.2 cent2346 <NA> sino1245
## 3 east2280 East… <NA> 56.8 24.3 east2280 <NA> indo1319
## 4 extr1245 Extr… <NA> -22.1 167. extr1245 <NA> aust1307
## 5 kere1287 Kere… <NA> 35.5 -106. kere1287 <NA> <NA>
## 6 mang1393 Mang… <NA> 0.268 27.3 mang1393 <NA> cent2225
## 7 moha1257 Moha… <NA> 43.7 -74.7 moha1257 <NA> iroq1247
## 8 newc1243 New … <NA> -20.9 167. newc1243 <NA> aust1307
## 9 nort3160 Nort… <NA> 59.8 17.4 nort3160 <NA> indo1319
## 10 podo1243 Podo… <NA> 10.9 14.0 podo1243 <NA> afro1255
## 11 pwoo1239 Pwo <NA> 18.0 99.6 pwoo1239 <NA> sino1245
## 12 raja1258 Raja… <NA> -0.173 130. raja1258 <NA> aust1307
## 13 sind1278 Sind… <NA> 30.1 75.3 sind1278 <NA> indo1319
## 14 slav1255 Slav… <NA> 49.9 15.1 slav1255 <NA> indo1319
## 15 taik1256 Tai-… <NA> 24.1 110. taik1256 <NA> <NA>
## 16 yeni1252 Yeni… <NA> 63.8 87.5 yeni1252 <NA> <NA>
## # ℹ 10 more variables: parent_id <chr>, bookkeeping <lgl>, level <chr>,
## # description <lgl>, markup_description <lgl>, child_family_count <dbl>,
## # child_language_count <dbl>, child_dialect_count <dbl>, country_ids <chr>,
## # Longitude.y <dbl>
# tmp <- languages %>% filter(is.na(Macroarea)) %>% select(ID, Name, Macroarea)
# write_csv(tmp, 'get_macroareas.csv')
# There must be a saner way to do this!
hc <- read_csv('hand_macroareas.csv')
tmp <- left_join(languages, hc, by=c("ID"="ID", "Name"="Name"))
tmp <- tmp %>% mutate(Macroarea.x = coalesce(Macroarea.x, Macroarea.y))
tmp <- tmp %>% select(-Macroarea.y)
tmp <- tmp %>% rename(Macroarea = Macroarea.x)
languages <- tmp
table(languages$Macroarea, exclude = FALSE)
##
## Africa Eurasia North America Papunesia South America
## 13 48 20 10 6
And a quick look at our areas.
contributions %>% filter(is.na(Area))
## # A tibble: 1 × 9
## ID Contributor Citation Glottocode LanguageVariety Family Area Notes
## <dbl> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 105 Lilja Saeboe Lilja Saeboe… <NA> Montagnais Algic <NA> <NA>
## # ℹ 1 more variable: BibTex <chr>
table(contributions$Area, exclude=FALSE)
##
## Africa Asia Europe North America Papunesia
## 13 39 14 21 10
## South America <NA>
## 6 1
Create tables for the paper. First merge the tonodb tables.
tonodb <- left_join(values, languages, by=c("Language_ID"="ID"))
# Reduce the contributions table and get the TonoDB Area column
tmp <- contributions %>% select(ID, Family, Area)
tonodb <- left_join(tonodb, tmp, by=c("Inventory_ID"="ID"))
# tonodb %>% filter(is.na(family_id))
# Rename wordtype to syllable (relabelled as syllable-count for the table further down) -- TODO replace when database is updated
tonodb <- tonodb %>% mutate(Type = str_replace(Type, "wordtype", "syllable"))
# Fix the mistakes (TODO: rerun the CLDF creation script, which will fix these typos below)
tonodb$Ordering <- str_replace(tonodb$Ordering, "broad", "Broad")
tonodb$Ordering <- str_replace(tonodb$Ordering, "strict", "Strict")
tonodb %>% filter(Ordering=="broad")
## # A tibble: 0 × 54
## # ℹ 54 variables: ID <dbl>, Parameter_ID <chr>, Value <chr>, Language_ID <chr>,
## # Inventory_ID <dbl>, LanguageVariety <chr>, Ordering <chr>, Ongoing <chr>,
## # TriggeringContext <chr>, Tone <chr>, Extra <chr>, Height <chr>,
## # Contour <chr>, Phonation <chr>, ToneDescription <chr>, ChaoNumerals <chr>,
## # RestrictedEnviroment <chr>, Notes <chr>, EffectOnPitch <chr>,
## # ResultantSystem <chr>, Type <chr>, Onset <chr>, OnsetManner <chr>,
## # OnsetVoicing <chr>, OnsetAspiration <chr>, Coda <chr>, …
tonodb %>% filter(is.na(Ordering))
## # A tibble: 1 × 54
## ID Parameter_ID Value Language_ID Inventory_ID LanguageVariety Ordering
## <dbl> <chr> <chr> <chr> <dbl> <chr> <chr>
## 1 259 8D966B2253A9170… high <NA> NA <NA> <NA>
## # ℹ 47 more variables: Ongoing <chr>, TriggeringContext <chr>, Tone <chr>,
## # Extra <chr>, Height <chr>, Contour <chr>, Phonation <chr>,
## # ToneDescription <chr>, ChaoNumerals <chr>, RestrictedEnviroment <chr>,
## # Notes <chr>, EffectOnPitch <chr>, ResultantSystem <chr>, Type <chr>,
## # Onset <chr>, OnsetManner <chr>, OnsetVoicing <chr>, OnsetAspiration <chr>,
## # Coda <chr>, CodaPhonation <chr>, CodaGlottal <chr>, CodaManner <chr>,
## # Stress <chr>, SyllableCount <chr>, NucleusATR <chr>, NucleusLength <chr>, …
x <- tonodb %>% select(Area, Language_ID) %>% distinct() %>% group_by(Area) %>% summarise(Languages = n())
y <- tonodb %>% select(Area, family_id) %>% distinct() %>% group_by(Area) %>% summarize(Families = n())
z <- tonodb %>% select(Area, TriggeringContext) %>% group_by(Area) %>% summarize(`Cases of tonogenesis` = n())
tmp <- left_join(x, y)
tmp <- left_join(tmp, z)
tmp <- tmp %>% arrange(desc(`Cases of tonogenesis`))
tmp %>% kable()
| Area | Languages | Families | Cases of tonogenesis |
|---|---|---|---|
| Asia | 37 | 9 | 157 |
| North America | 20 | 10 | 33 |
| Europe | 12 | 2 | 22 |
| Africa | 13 | 5 | 21 |
| Papunesia | 10 | 1 | 16 |
| South America | 6 | 3 | 8 |
| NA | 1 | 1 | 2 |
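The three per-area counts above are built separately and then joined. As a hedged alternative, the same figures could be computed in a single `summarise()` call with `n_distinct()`. The sketch below uses an invented toy table (`tonodb_demo`), not the real merged data; like `distinct()` followed by `n()`, `n_distinct()` counts `NA` as a value by default.

```r
library(dplyr)

# Toy stand-in for the merged tonodb table (all values are invented).
tonodb_demo <- tibble(
  Area        = c("Asia", "Asia", "Asia", "Europe", "Europe"),
  Language_ID = c("l1", "l1", "l2", "l3", "l3"),
  family_id   = c("f1", "f1", "f1", "f2", "f2")
)

# One pass per area: distinct languages, distinct families, and row count.
area_summary <- tonodb_demo %>%
  group_by(Area) %>%
  summarise(
    Languages = n_distinct(Language_ID),
    Families = n_distinct(family_id),
    `Cases of tonogenesis` = n(),
    .groups = "drop"
  ) %>%
  arrange(desc(`Cases of tonogenesis`))

area_summary
```

This avoids the implicit join keys of `left_join()` without a `by` argument and keeps the three definitions side by side.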
# Still getting some NAs, let's drop them
table(tonodb$family_id, exclude = FALSE)
##
## afro1255 algi1248 araw1281 atha1245 atla1278 aust1305 aust1307 cadd1255
## 3 7 1 5 6 15 27 1
## cent2225 chim1311 gong1255 hmon1336 indo1319 iroq1247 koma1264 kore1284
## 4 1 2 10 23 8 6 2
## maya1287 mong1349 nada1235 sino1245 taik1256 tsim1258 tuca1253 ural1272
## 4 2 3 58 28 1 4 1
## utoa1244 waka1280 <NA>
## 2 2 33
tmp <- tmp %>% filter(!is.na(Area))
tmp %>% kable()
| Area | Languages | Families | Cases of tonogenesis |
|---|---|---|---|
| Asia | 37 | 9 | 157 |
| North America | 20 | 10 | 33 |
| Europe | 12 | 2 | 22 |
| Africa | 13 | 5 | 21 |
| Papunesia | 10 | 1 | 16 |
| South America | 6 | 3 | 8 |
print(xtable(tmp, type = "latex", caption="Distribution of the languages, families and cases of tonogenesis across different areas"), include.rownames=FALSE)
## % latex table generated in R 4.3.2 by xtable 1.8-4 package
## % Sat Apr 26 08:51:40 2025
## \begin{table}[ht]
## \centering
## \begin{tabular}{lrrr}
## \hline
## Area & Languages & Families & Cases of tonogenesis \\
## \hline
## Asia & 37 & 9 & 157 \\
## North America & 20 & 10 & 33 \\
## Europe & 12 & 2 & 22 \\
## Africa & 13 & 5 & 21 \\
## Papunesia & 10 & 1 & 16 \\
## South America & 6 & 3 & 8 \\
## \hline
## \end{tabular}
## \caption{Distribution of the languages, families and cases of tonogenesis across different areas}
## \end{table}
tmp <- tonodb %>% select(family_id, LanguageVariety) %>% distinct() %>% arrange(family_id, LanguageVariety) %>% group_by(family_id) %>% summarize(`Number of varieties` = n(), Languages = str_c(LanguageVariety, collapse=", "))
# We need the Glottolog family names
glottolog <- read_csv('data/languoid.csv')
families <- glottolog %>% filter(id %in% tmp$family_id) %>% select(id, name)
tmp <- left_join(tmp, families, by=c("family_id"="id"))
tmp <- tmp %>% select(name, `Number of varieties`, Languages)
tmp <- tmp %>% rename(Family = name)
tmp %>% kable()
| Family | Number of varieties | Languages |
|---|---|---|
| Afro-Asiatic | 2 | Iraqw, Podoko |
| Algic | 4 | Arapaho, Cheyenne, Kickapoo, Maliseet-Passamaquoddy |
| Arawakan | 1 | Terena |
| Athabaskan-Eyak-Tlingit | 3 | Proto-Athabaskan (tonal dialects) group one, Proto-Athabaskan (tonal dialects) group two, Sanya-Henya Tlingit |
| Atlantic-Congo | 5 | Bantu D30, Bila, Kohumono, Moba, Nupe |
| Austroasiatic | 5 | Hu, U, Vietnamese, Wester Kammu, Western Kammu |
| Austronesian | 12 | Cem, Central North New Caledonian languages, Far South New Caledonian langauges, Magey Matbat, Metnyo Ambel, Moor, Phan Rang Cham, Pre-proto-North Huon Gulf, Proto-Maˈya, Samoan, Utsat, Yerisiam |
| Caddoan | 1 | Caddo |
| Central Sudanic | 2 | Languages of the Mangbetu-Asua subgroup with three tones, Western Lugbara |
| Chimakuan | 1 | Quileute |
| Ta-Ne-Omotic | 2 | Gimira, Shinasha |
| Hmong-Mien | 1 | White Hmong |
| Indo-European | 14 | Auktaitian dialects of Lithuanian, Central Franconian, Central Scandinavian, East Baltic (Latvian and Lithuanian), East Slesvig, Late Proto-Slavic, Latvian, Limburgish, Lithuanian, Proto-Nordic, Punjabi, Scottish gaelic (Bernera), West Baltic (Prussian, Zealand Danish |
| Iroquoian | 3 | Cherokee, Mohawk, Proto-Mohawk-Oneida |
| Koman | 2 | Proto-Gwama, Proto-Opo |
| Koreanic | 1 | Korean |
| Mayan | 4 | Mocho’, San Bartolo Tzotzil, Uspanteko, Yucatec |
| Mongolic-Khitan | 1 | Mongour |
| Naduhup | 1 | Eastern Naduhup |
| Sino-Tibetan | 20 | Baima Tibetan, Brokpa, Burmese, Cantonese, Chitabu (bwe), Dzongkha, Geba, Khaling, Kurtöp, Lahu, Lhasa Tibetan, Middle Chinese, Phlong, Pwo Karen, Rikeze Tibetan, Sgaw Karen, Tokpe Gola (Tibetan), T’ientsin, Zhibo Tibetan, Zhuoni Tibetan |
| Tai-Kadai | 4 | Nakhon Si Thammarat Thai, Proto-Tai, Shan, Yung Chiang Kam |
| Tsimshian | 1 | Coast Tsimshian |
| Tucanoan | 4 | Barasana, Kubeo, Máíhɨ̃ki, Tatuyo |
| Uralic | 1 | Estonian |
| Uto-Aztecan | 1 | Hopi |
| Wakashan | 1 | Heiltsuk |
| NA | 9 | NA |
# print(xtable(tmp, type = "latex", caption="Number of languages in different language families"), include.rownames=FALSE)
z <- tonodb %>% select(Type, LanguageVariety) %>% separate_rows(Type)
x <- z %>% group_by(Type) %>% summarize(`Cases of tonogenesis` = n()) %>% arrange()
y <- z %>% select(Type, LanguageVariety) %>% distinct() %>% group_by(Type) %>% summarize(`Number of languages` = n()) %>% arrange()
tmp <- left_join(x, y)
## Joining with `by = join_by(Type)`
tmp <- tmp %>% arrange(desc(`Cases of tonogenesis`))
# Remove NAs
# tmp <- tmp %>% filter(!is.na(Type))
# tmp %>% kable()
# rename to syllable-count
tmp <- tmp %>% mutate(Type = str_replace(Type, "syllable", "syllable-count"))
tmp %>% kable()
| Type | Cases of tonogenesis | Number of languages |
|---|---|---|
| onset | 133 | 41 |
| coda | 70 | 43 |
| count | 27 | 20 |
| nucleus | 27 | 16 |
| syllable-count | 27 | 20 |
| stress | 12 | 8 |
| other | 6 | 5 |
| NA | 1 | 1 |
print(xtable(tmp, type = "latex", caption="Cases of tonogenesis by category"), include.rownames=FALSE)
## % latex table generated in R 4.3.2 by xtable 1.8-4 package
## % Sat Apr 26 08:51:41 2025
## \begin{table}[ht]
## \centering
## \begin{tabular}{lrr}
## \hline
## Type & Cases of tonogenesis & Number of languages \\
## \hline
## onset & 133 & 41 \\
## coda & 70 & 43 \\
## count & 27 & 20 \\
## nucleus & 27 & 16 \\
## syllable-count & 27 & 20 \\
## stress & 12 & 8 \\
## other & 6 & 5 \\
## & 1 & 1 \\
## \hline
## \end{tabular}
## \caption{Cases of tonogenesis by category}
## \end{table}
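The table lists both `count` and `syllable-count` with identical figures (27 cases, 20 languages). One possible explanation, offered as an assumption rather than something verified against the data, is that some `Type` values already read `syllable-count` and that `separate_rows()` splits them at the hyphen: its default separator matches any run of non-alphanumeric characters. The toy example below (invented values) shows that behaviour and a comma-only separator that would avoid it, assuming multi-valued cells are comma-separated.

```r
library(tidyr)

# Toy data: one multi-valued cell and one hyphenated value (invented).
type_demo <- data.frame(
  LanguageVariety = c("A", "B"),
  Type = c("onset, coda", "syllable-count"),
  stringsAsFactors = FALSE
)

# Default separator: splits on commas, spaces AND hyphens.
default_split <- separate_rows(type_demo, Type)
default_split$Type
# "onset" "coda" "syllable" "count"

# Explicit separator: split on commas (plus optional whitespace) only.
comma_split <- separate_rows(type_demo, Type, sep = ",\\s*")
comma_split$Type
# "onset" "coda" "syllable-count"
```

If this is what happens in the real data, the later `str_replace(Type, "syllable", "syllable-count")` would relabel the `syllable` half while the `count` half remains as a separate row.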
# tmp <- tonodb %>% select(OnsetVoicing, EffectOnPitch)
# table(tmp)
# tmp <- tonodb %>% select(OnsetVoicing, EffectOnPitch) %>% filter(OnsetVoicing != "") %>% filter(EffectOnPitch != "")
# table(tmp)
# tmp <- tonodb %>% select(OnsetVoicing, EffectOnPitch) %>%
# filter(OnsetVoicing != "") %>%
# filter(EffectOnPitch != "") %>%
# filter(OnsetVoicing %in% c("Voiced", "Voiceless"))
# table(tmp)
# tmp <- tonodb %>% select(OnsetVoicing, EffectOnPitch) %>%
# filter(OnsetVoicing != "") %>%
# filter(EffectOnPitch != "") %>%
# filter(OnsetVoicing %in% c("Voiced", "Voiceless"))
# table(tmp)
tmp <- tonodb %>% select(OnsetVoicing, EffectOnPitch) %>%
filter(OnsetVoicing != "") %>%
filter(EffectOnPitch != "") %>%
filter(OnsetVoicing %in% c("Voiced", "Voiceless"))
t <- data.frame(unclass(table(tmp$OnsetVoicing, tmp$EffectOnPitch)))
t <- t %>% select(lowering, mid, elevating, rising, falling)
print(xtable(t, type = "latex", caption="Tonogenesis conditioned by voiced and voiceless (unaspirated) obstruents"))
## % latex table generated in R 4.3.2 by xtable 1.8-4 package
## % Sat Apr 26 08:51:41 2025
## \begin{table}[ht]
## \centering
## \begin{tabular}{rrrrrr}
## \hline
## & lowering & mid & elevating & rising & falling \\
## \hline
## Voiced & 37 & 0 & 10 & 2 & 2 \\
## Voiceless & 11 & 8 & 35 & 0 & 1 \\
## \hline
## \end{tabular}
## \caption{Tonogenesis conditioned by voiced and voiceless (unaspirated) obstruents}
## \end{table}
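The select, filter and cross-tabulate pattern used for the onset-voicing table recurs for many other features in this document. A small helper could capture it in one place. The function `pitch_table()` below is hypothetical (it is not part of the original analysis), and the toy data are invented; it drops rows where either column is `NA` or empty, which mirrors what `filter(x != "")` does in dplyr.

```r
# Hypothetical helper: cross-tabulate one feature column against the pitch
# effect, keeping only rows where both values are present and non-empty.
pitch_table <- function(df, feature, effect = "EffectOnPitch") {
  x <- df[[feature]]
  y <- df[[effect]]
  keep <- !is.na(x) & x != "" & !is.na(y) & y != ""
  table(x[keep], y[keep])
}

# Toy stand-in for tonodb (all values are invented).
tonodb_demo <- data.frame(
  OnsetVoicing  = c("Voiced", "Voiced", "Voiceless", "", NA),
  EffectOnPitch = c("lowering", "lowering", "elevating", "rising", "falling"),
  stringsAsFactors = FALSE
)

onset_tab <- pitch_table(tonodb_demo, "OnsetVoicing")
onset_tab
```

The same call with `"CodaGlottal"`, `"CodaManner"` or `"NucleusATR"` as the `feature` argument would produce the corresponding tables, provided those columns exist in the data frame passed in.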
tmp <- tonodb %>% select(CodaGlottal, EffectOnPitch) %>%
filter(!is.na(CodaGlottal)) %>%
filter(CodaGlottal != "") %>%
filter(EffectOnPitch != "") %>%
filter(EffectOnPitch %in% c("level", "rising", "falling"))
table(tmp) %>% kable()
| | falling | rising |
|---|---|---|
| /h/ | 2 | 0 |
| /h/, glottal stop | 2 | 1 |
| glottal stop | 4 | 3 |
| glottalized | 1 | 2 |
| laryngeal | 6 | 0 |
| non-glottalized | 1 | 0 |
# print(xtable(table(tmp), type = "latex", caption="Tonogenesis triggered by coda consonants"))
table(tonodb$Nucleus, tonodb$EffectOnPitch) %>% kable()
| | elevating | falling | level | lowering | lowering, elevating | mid | no change | rising | rising-falling | rising, elevating | rising, lowering |
|---|---|---|---|---|---|---|---|---|---|---|---|
| -ATR | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| -ATR and non-high vowel | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| +ATR | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| +ATR and high vowel | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| high vowel | 3 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| long vowel | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 |
| low vowel | 1 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| other | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| short vowel | 3 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| short, long | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| short, long, glottalic | 2 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
tmp <- tonodb %>% select(Nucleus, EffectOnPitch) %>%
filter(Nucleus != "") %>%
filter(EffectOnPitch != "") %>%
filter(Nucleus %in% c("long vowel", "short vowel"))
# print(xtable(table(tmp), type = "latex", caption="Tonogenesis based on vowel length"))
High/low is relative.
table(tonodb$Nucleus, tonodb$EffectOnPitch) %>% kable()
| | elevating | falling | level | lowering | lowering, elevating | mid | no change | rising | rising-falling | rising, elevating | rising, lowering |
|---|---|---|---|---|---|---|---|---|---|---|---|
| -ATR | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| -ATR and non-high vowel | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| +ATR | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| +ATR and high vowel | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| high vowel | 3 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| long vowel | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 |
| low vowel | 1 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| other | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| short vowel | 3 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| short, long | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| short, long, glottalic | 2 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
tmp <- tonodb %>% select(Nucleus, EffectOnPitch) %>%
filter(Nucleus != "") %>%
filter(EffectOnPitch != "") %>%
filter(Nucleus %in% c("high vowel", "low vowel"))
# print(xtable(table(tmp), type = "latex", caption="Tonogenesis based on vowel height – high/low is relative"))
High/low is relative.
table(tonodb$Nucleus, tonodb$EffectOnPitch) %>% kable()
| | elevating | falling | level | lowering | lowering, elevating | mid | no change | rising | rising-falling | rising, elevating | rising, lowering |
|---|---|---|---|---|---|---|---|---|---|---|---|
| -ATR | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| -ATR and non-high vowel | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| +ATR | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| +ATR and high vowel | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| high vowel | 3 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| long vowel | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 |
| low vowel | 1 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| other | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| short vowel | 3 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| short, long | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| short, long, glottalic | 2 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
tmp <- tonodb %>% select(Nucleus, EffectOnPitch) %>%
filter(Nucleus != "") %>%
filter(EffectOnPitch != "") %>%
filter(Nucleus %in% c("+ATR", "-ATR"))
# print(xtable(table(tmp), type = "latex", caption="Tonogenesis based on ATR – high/low is relative"))
In the DoTE (number of languages).
tmp <- tonodb %>% filter(Onset %in% c('voiceless', 'voiced'))
table(tmp$Onset, tmp$EffectOnPitch)
##
## elevating falling lowering rising
## voiced 3 1 19 1
## voiceless 15 0 2 0
t <- data.frame(unclass(table(tmp$Onset, tmp$EffectOnPitch)))
t <- t %>% select(lowering, elevating, rising, falling)
# print(xtable(t, type = "latex", caption="The effect of voicing on tone"))
In the DoTE (number of cases of tonogenesis).
# table(tonodb$Coda, tonodb$EffectOnPitch) %>% kable()
# tmp <- tonodb %>% select(Coda, EffectOnPitch) %>% filter_at(vars(Coda, EffectOnPitch),any_vars(!is.na(.)))
# table(tmp$Coda, tmp$EffectOnPitch) %>% kable()
# tmp <- tonodb %>% select(Coda, EffectOnPitch) %>% filter_at(vars(Coda, EffectOnPitch),all_vars(!is.na(.)))
# table(tmp$Coda, tmp$EffectOnPitch) %>% kable()
tmp <- tonodb %>% select(OnsetAspiration, EffectOnPitch) %>%
filter(OnsetAspiration != "") %>%
filter(EffectOnPitch != "")
table(tmp) %>% kable()
| | elevating | falling | lowering | mid | rising |
|---|---|---|---|---|---|
| Aspirated | 5 | 0 | 7 | 4 | 0 |
| Aspirated, unaspirated | 5 | 1 | 1 | 0 | 0 |
| Breathy | 0 | 0 | 0 | 0 | 1 |
| Unaspirated | 7 | 0 | 3 | 6 | 0 |
t <- data.frame(unclass(table(tmp)))
t <- t %>% select(lowering, mid, elevating, falling, rising)
# print(xtable(t, type = "latex", caption="The effect of voicing on tone"))
tmp <- tonodb %>% select(CodaManner, EffectOnPitch) %>% separate_rows(CodaManner) %>%
filter(CodaManner != "") %>%
filter(EffectOnPitch != "")
table(tmp) %>% kable()
| | elevating | falling | level | lowering | rising |
|---|---|---|---|---|---|
| cluster | 0 | 1 | 0 | 1 | 0 |
| fricative | 1 | 3 | 0 | 1 | 0 |
| obstruent | 3 | 4 | 0 | 1 | 1 |
| open | 0 | 0 | 3 | 0 | 0 |
| sonorant | 1 | 3 | 3 | 1 | 0 |
| stop | 2 | 4 | 0 | 4 | 3 |
t <- data.frame(unclass(table(tmp)))
t <- t %>% select(lowering, level, elevating, rising, falling) %>% arrange(desc(lowering))
t %>% kable()
| | lowering | level | elevating | rising | falling |
|---|---|---|---|---|---|
| stop | 4 | 0 | 2 | 3 | 4 |
| cluster | 1 | 0 | 0 | 0 | 1 |
| fricative | 1 | 0 | 1 | 0 | 3 |
| obstruent | 1 | 0 | 3 | 1 | 4 |
| sonorant | 1 | 3 | 1 | 0 | 3 |
| open | 0 | 3 | 0 | 0 | 0 |
print(xtable(t, type = "latex", caption="The effect of voicing on tone"))
## % latex table generated in R 4.3.2 by xtable 1.8-4 package
## % Sat Apr 26 08:51:41 2025
## \begin{table}[ht]
## \centering
## \begin{tabular}{rrrrrr}
## \hline
## & lowering & level & elevating & rising & falling \\
## \hline
## stop & 4 & 0 & 2 & 3 & 4 \\
## cluster & 1 & 0 & 0 & 0 & 1 \\
## fricative & 1 & 0 & 1 & 0 & 3 \\
## obstruent & 1 & 0 & 3 & 1 & 4 \\
## sonorant & 1 & 3 & 1 & 0 & 3 \\
## open & 0 & 3 & 0 & 0 & 0 \\
## \hline
## \end{tabular}
## \caption{The effect of voicing on tone}
## \end{table}
tmp <- tonodb %>% select(CodaPhonation, EffectOnPitch) %>%
filter(CodaPhonation != "") %>%
filter(EffectOnPitch != "")
table(tmp) %>% kable()
| | falling | lowering | rising |
|---|---|---|---|
| breathy | 1 | 0 | 0 |
| creaky | 2 | 0 | 0 |
| preaspirated | 0 | 1 | 0 |
| voiced | 1 | 1 | 0 |
| voiceless | 2 | 0 | 1 |
print(xtable(table(tmp), type = "latex", caption="The effect of voice on pitch"))
## % latex table generated in R 4.3.2 by xtable 1.8-4 package
## % Sat Apr 26 08:51:41 2025
## \begin{table}[ht]
## \centering
## \begin{tabular}{rrrr}
## \hline
## & falling & lowering & rising \\
## \hline
## breathy & 1 & 0 & 0 \\
## creaky & 2 & 0 & 0 \\
## preaspirated & 0 & 1 & 0 \\
## voiced & 1 & 1 & 0 \\
## voiceless & 2 & 0 & 1 \\
## \hline
## \end{tabular}
## \caption{The effect of voice on pitch}
## \end{table}
tmp <- tonodb %>% select(CodaGlottal, EffectOnPitch) %>%
filter(CodaGlottal != "") %>%
filter(EffectOnPitch != "")
table(tmp) %>% kable()
| | elevating | falling | lowering | rising |
|---|---|---|---|---|
| /h/ | 1 | 2 | 1 | 0 |
| /h/, glottal stop | 1 | 2 | 0 | 1 |
| glottal stop | 2 | 4 | 3 | 3 |
| glottalized | 3 | 1 | 1 | 2 |
| glottalized, non-glottalized | 1 | 0 | 1 | 0 |
| laryngeal | 0 | 6 | 0 | 0 |
| non-glottalized | 0 | 1 | 0 | 0 |
t <- data.frame(unclass(table(tmp)))
t <- t %>% select(lowering, elevating, falling, rising)
# print(xtable(t, type = "latex", caption="The effect of coda glottal on pitch"))
tmp <- tonodb %>% select(Height, EffectOnPitch) %>%
filter(Height != "") %>%
filter(EffectOnPitch != "")
table(tmp) %>% kable()
| | elevating | falling | level | lowering | lowering, elevating | mid | no change | rising | rising, elevating | rising, lowering |
|---|---|---|---|---|---|---|---|---|---|---|
| high | 51 | 1 | 0 | 4 | 0 | 0 | 1 | 1 | 1 | 0 |
| low | 0 | 2 | 0 | 47 | 0 | 0 | 0 | 0 | 0 | 1 |
| mid | 8 | 1 | 1 | 5 | 1 | 2 | 0 | 1 | 0 | 0 |
# print(xtable(table(tmp), type = "latex", caption="The effect of vowel height on pitch"))
table(tmp) %>% kable()
| | elevating | falling | level | lowering | lowering, elevating | mid | no change | rising | rising, elevating | rising, lowering |
|---|---|---|---|---|---|---|---|---|---|---|
| high | 51 | 1 | 0 | 4 | 0 | 0 | 1 | 1 | 1 | 0 |
| low | 0 | 2 | 0 | 47 | 0 | 0 | 0 | 0 | 0 | 1 |
| mid | 8 | 1 | 1 | 5 | 1 | 2 | 0 | 1 | 0 | 0 |
tmp <- tonodb %>% select(NucleusLength, EffectOnPitch) %>%
filter(NucleusLength != "") %>%
filter(EffectOnPitch != "")
table(tmp) %>% kable()
| | elevating | falling | lowering | rising | rising-falling |
|---|---|---|---|---|---|
| long | 1 | 1 | 1 | 1 | 1 |
| short | 3 | 0 | 1 | 0 | 0 |
# print(xtable(table(tmp), type = "latex", caption="The effect of nucleus length on pitch"))
table(tmp) %>% kable()
| | elevating | falling | lowering | rising | rising-falling |
|---|---|---|---|---|---|
| long | 1 | 1 | 1 | 1 | 1 |
| short | 3 | 0 | 1 | 0 | 0 |
tmp <- tonodb %>% select(NucleusATR, EffectOnPitch) %>%
filter(NucleusATR != "") %>%
filter(EffectOnPitch != "")
table(tmp) %>% kable()
| | elevating | lowering |
|---|---|---|
| -ATR | 0 | 3 |
| +ATR | 3 | 0 |
# print(xtable(table(tmp), type = "latex", caption="The effect of nuclear +/- ATR on pitch"))
table(tmp) %>% kable()
| | elevating | lowering |
|---|---|---|
| -ATR | 0 | 3 |
| +ATR | 3 | 0 |
tmp <- tonodb %>% filter(Area == "Africa") %>% select(CodaGlottal, EffectOnPitch)
table(tmp) %>% kable()
| elevating | falling | lowering |
|---|---|---|
tmp <- tonodb %>% filter(Area == "Africa") %>%
select(CodaGlottal, EffectOnPitch) %>%
filter(CodaGlottal != "") %>%
filter(EffectOnPitch != "")
table(tmp) %>% kable()
# Nothing here
# print(xtable(table(tmp), type = "latex", caption="Number of cases/varieties of different tonogenesis types for Africa"))
tmp <- tonodb %>% filter(Area == "Asia") %>% select(CodaGlottal, EffectOnPitch)
table(tmp) %>% kable()
| | elevating | falling | level | lowering | lowering, elevating | mid | no change | rising |
|---|---|---|---|---|---|---|---|---|
| /h/ | 1 | 2 | 0 | 0 | 0 | 0 | 0 | 0 |
| glottal stop | 1 | 4 | 0 | 0 | 0 | 0 | 0 | 3 |
| glottalized | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| non-glottalized | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
tmp <- tonodb %>% filter(Area == "Asia") %>%
select(CodaGlottal, EffectOnPitch) %>%
filter(CodaGlottal != "") %>%
filter(EffectOnPitch != "")
table(tmp) %>% kable()
| | elevating | falling | rising |
|---|---|---|---|
| /h/ | 1 | 2 | 0 |
| glottal stop | 1 | 4 | 3 |
| glottalized | 1 | 0 | 0 |
# print(xtable(table(tmp), type = "latex", caption="Number of cases/varieties of different tonogenesis types for Asia"))
tmp <- tonodb %>% filter(Area == "Europe") %>% select(CodaGlottal, EffectOnPitch)
table(tmp) %>% kable()
| | elevating | falling | level | no change | rising | rising-falling |
|---|---|---|---|---|---|---|
| glottalized | 1 | 0 | 0 | 0 | 2 | 0 |
| non-glottalized | 0 | 1 | 0 | 0 | 0 | 0 |
tmp <- tonodb %>% filter(Area == "Europe") %>%
select(CodaGlottal, EffectOnPitch) %>%
filter(CodaGlottal != "") %>%
filter(EffectOnPitch != "")
table(tmp) %>% kable()
| | elevating | falling | rising |
|---|---|---|---|
| glottalized | 1 | 0 | 2 |
| non-glottalized | 0 | 1 | 0 |
# print(xtable(table(tmp), type = "latex", caption="Number of cases/varieties of different tonogenesis types for Europe"))
tmp <- tonodb %>% filter(Area == "North America") %>% select(CodaGlottal, EffectOnPitch)
table(tmp) %>% kable()
| | elevating | falling | lowering | rising | rising, elevating | rising, lowering |
|---|---|---|---|---|---|---|
| /h/ | 0 | 0 | 1 | 0 | 0 | 0 |
| /h/, glottal stop | 1 | 2 | 0 | 1 | 0 | 0 |
| glottalized | 1 | 1 | 1 | 0 | 0 | 0 |
| glottalized, non-glottalized | 1 | 0 | 1 | 0 | 0 | 0 |
| laryngeal | 0 | 6 | 0 | 0 | 0 | 0 |
tmp <- tonodb %>% filter(Area == "North America") %>%
select(CodaGlottal, EffectOnPitch) %>%
filter(CodaGlottal != "") %>%
filter(EffectOnPitch != "")
table(tmp) %>% kable()
| | elevating | falling | lowering | rising |
|---|---|---|---|---|
| /h/ | 0 | 0 | 1 | 0 |
| /h/, glottal stop | 1 | 2 | 0 | 1 |
| glottalized | 1 | 1 | 1 | 0 |
| glottalized, non-glottalized | 1 | 0 | 1 | 0 |
| laryngeal | 0 | 6 | 0 | 0 |
# print(xtable(table(tmp), type = "latex", caption="Number of cases/varieties of different tonogenesis types for North America"))
tmp <- tonodb %>% filter(Area == "Papunesia") %>% select(CodaGlottal, EffectOnPitch)
table(tmp) %>% kable()
| elevating | lowering | rising |
|---|---|---|
tmp <- tonodb %>% filter(Area == "Papunesia") %>%
select(CodaGlottal, EffectOnPitch) %>%
filter(CodaGlottal != "") %>%
filter(EffectOnPitch != "")
table(tmp) %>% kable()
# No results
# print(xtable(table(tmp), type = "latex", caption="Number of cases/varieties of different tonogenesis types for Papunesia"))
tmp <- tonodb %>% filter(Area == "South America") %>% select(CodaGlottal, EffectOnPitch)
table(tmp) %>% kable()
| | elevating | falling | lowering | rising |
|---|---|---|---|---|
| glottal stop | 1 | 0 | 3 | 0 |
tmp <- tonodb %>% filter(Area == "South America") %>%
select(CodaGlottal, EffectOnPitch) %>%
filter(CodaGlottal != "") %>%
filter(EffectOnPitch != "")
table(tmp) %>% kable()
| | elevating | lowering |
|---|---|---|
| glottal stop | 1 | 3 |
# print(xtable(table(tmp), type = "latex", caption="Number of cases/varieties of different tonogenesis types for South America"))
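The per-area chunks above repeat the same filter and cross-tabulation for each region. A loop over areas could replace them; the sketch below uses base R's `split()` and `lapply()` on invented toy data (`tonodb_demo` is a hypothetical stand-in with the same three column names).

```r
# Toy stand-in for tonodb (all values are invented).
tonodb_demo <- data.frame(
  Area          = c("Asia", "Asia", "Europe", "Africa"),
  CodaGlottal   = c("glottal stop", "/h/", "glottalized", NA),
  EffectOnPitch = c("falling", "falling", "rising", "lowering"),
  stringsAsFactors = FALSE
)

# Keep only rows where both columns are present and non-empty.
complete <- subset(
  tonodb_demo,
  !is.na(CodaGlottal) & CodaGlottal != "" &
    !is.na(EffectOnPitch) & EffectOnPitch != ""
)

# One cross-tabulation per area, returned as a named list.
by_area <- lapply(
  split(complete, complete$Area),
  function(d) table(d$CodaGlottal, d$EffectOnPitch)
)

names(by_area)
by_area$Asia
```

Areas with no complete rows (Africa in the toy data) simply do not appear in the resulting list, which corresponds to the empty results noted for Africa and Papunesia.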
tmp <- tonodb %>% filter(Area == "Asia") %>%
select(OnsetAspiration, EffectOnPitch) %>%
filter(OnsetAspiration != "") %>%
filter(EffectOnPitch != "")
table(tmp) %>% kable()
| | elevating | falling | lowering | mid | rising |
|---|---|---|---|---|---|
| Aspirated | 3 | 0 | 6 | 4 | 0 |
| Aspirated, unaspirated | 5 | 1 | 1 | 0 | 0 |
| Breathy | 0 | 0 | 0 | 0 | 1 |
| Unaspirated | 7 | 0 | 1 | 6 | 0 |
# print(xtable(table(tmp), type = "latex", caption="Onset aspiration in Asia"))
tmp <- tonodb %>% filter(Area == "Asia") %>%
select(CodaGlottal, EffectOnPitch) %>%
filter(CodaGlottal != "") %>%
filter(EffectOnPitch != "")
table(tmp) %>% kable()
| | elevating | falling | rising |
|---|---|---|---|
| /h/ | 1 | 2 | 0 |
| glottal stop | 1 | 4 | 3 |
| glottalized | 1 | 0 | 0 |
# print(xtable(table(tmp), type = "latex", caption="Coda glottal in Asia"))
tmp <- tonodb %>% filter(Area == "Asia") %>%
select(CodaManner, EffectOnPitch) %>%
filter(CodaManner != "") %>%
filter(EffectOnPitch != "")
table(tmp) %>% kable()
| | elevating | falling | level | lowering | rising |
|---|---|---|---|---|---|
| fricative | 1 | 3 | 0 | 0 | 0 |
| obstruent | 1 | 2 | 0 | 0 | 0 |
| open | 0 | 0 | 1 | 0 | 0 |
| sonorant | 0 | 1 | 1 | 0 | 0 |
| sonorant, open | 0 | 0 | 2 | 0 | 0 |
| stop | 1 | 4 | 0 | 1 | 3 |
# print(xtable(table(tmp), type = "latex", caption="Coda manner in Asia"))
tmp <- tonodb %>% filter(Area == "Asia") %>%
select(CodaPhonation, EffectOnPitch) %>%
filter(CodaPhonation != "") %>%
filter(EffectOnPitch != "")
table(tmp) %>% kable()
| | falling | lowering |
|---|---|---|
| breathy | 1 | 0 |
| voiced | 0 | 1 |
| voiceless | 1 | 0 |
# print(xtable(table(tmp), type = "latex", caption="Coda phonation type in Asia"))
tmp <- tonodb %>% filter(Area == "Asia") %>%
select(NucleusHeight, EffectOnPitch) %>%
filter(NucleusHeight != "") %>%
filter(EffectOnPitch != "")
table(tmp) %>% kable()
| | elevating | lowering |
|---|---|---|
| High | 1 | 0 |
| Low | 0 | 1 |
# print(xtable(table(tmp), type = "latex", caption="Nucleus height in Asia"))
tmp <- tonodb %>% filter(Area == "Asia") %>%
select(OnsetVoicing, EffectOnPitch) %>%
filter(OnsetVoicing != "") %>%
filter(EffectOnPitch != "")
table(tmp) %>% kable()
| | elevating | falling | lowering | lowering, elevating | mid | rising |
|---|---|---|---|---|---|---|
| sonorant | 1 | 0 | 0 | 0 | 0 | 0 |
| Voiced | 10 | 2 | 32 | 0 | 0 | 2 |
| Voiced, voiceless | 4 | 0 | 3 | 1 | 2 | 0 |
| Voiceless | 31 | 1 | 11 | 0 | 8 | 0 |
# print(xtable(table(tmp), type = "latex", caption="Onset voicing in Asia"))
tmp <- tonodb %>% select(LanguageVariety, Type) %>% separate_rows(Type)
cases <- tmp %>% group_by(Type) %>% summarize(`Cases of tonogenesis` = n())
varieties <- tmp %>% distinct() %>% group_by(Type) %>% summarize(`Number of languages` = n())
t <- left_join(cases, varieties)
## Joining with `by = join_by(Type)`
t <- t %>% arrange(desc(`Cases of tonogenesis`))
t %>% kable()
| Type | Cases of tonogenesis | Number of languages |
|---|---|---|
| onset | 133 | 41 |
| coda | 70 | 43 |
| count | 27 | 20 |
| nucleus | 27 | 16 |
| syllable | 27 | 20 |
| stress | 12 | 8 |
| other | 6 | 5 |
| NA | 1 | 1 |
# print(xtable(t, type = "latex", caption="Cases of tonogenesis by category"), include.rownames=FALSE)
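The `count` and `syllable` rows above have identical counts (27 cases, 20 languages). A likely explanation, assuming the `Type` column stores the label `syllable-count` as it appears in the Type by Ordering tables later in this document, is that `separate_rows()` with its default separator splits on every run of non-alphanumeric characters, including the hyphen. The sketch below uses a made-up `toy` data frame (not the tonodb data) to contrast the default with an explicit comma separator.

```r
library(tidyr)

# Toy rows (illustrative only) using Type labels that occur in this document.
toy <- data.frame(
  LanguageVariety = c("A", "B", "C"),
  Type = c("onset", "coda, syllable-count", "syllable-count"),
  stringsAsFactors = FALSE
)

# Default separator: any run of non-alphanumeric characters, so the hyphen
# in "syllable-count" acts as a delimiter just like the comma.
default_split <- separate_rows(toy, Type)

# Explicit separator: split only on a comma followed by optional whitespace.
comma_split <- separate_rows(toy, Type, sep = ",\\s*")

table(default_split$Type)
table(comma_split$Type)
```

With the default separator the toy data yields separate `syllable` and `count` rows; with the comma separator `syllable-count` stays one category.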
t(t) %>% kable()
| Type | onset | coda | count | nucleus | syllable | stress | other | NA |
|---|---|---|---|---|---|---|---|---|
| Cases of tonogenesis | 133 | 70 | 27 | 27 | 27 | 12 | 6 | 1 |
| Number of languages | 41 | 43 | 20 | 16 | 20 | 8 | 5 | 1 |
# print(xtable(t(t), type = "latex", caption="Cases of tonogenesis by category"), include.rownames=FALSE)
tmp <- tonodb %>% filter(Area == "Africa") %>% select(LanguageVariety, Type) %>% separate_rows(Type)
cases <- tmp %>% group_by(Type) %>% summarize(`Number of cases` = n())
varieties <- tmp %>% distinct() %>% group_by(Type) %>% summarize(`Number of varieties` = n())
t <- left_join(cases, varieties)
## Joining with `by = join_by(Type)`
t %>% kable()
| Type | Number of cases | Number of varieties |
|---|---|---|
| count | 6 | 4 |
| nucleus | 7 | 4 |
| onset | 7 | 5 |
| other | 1 | 1 |
| syllable | 6 | 4 |
# print(xtable(t(t), type = "latex", caption="Tonogenetic events in Africa in the DTE"))
t(t) %>% kable()
| Type | count | nucleus | onset | other | syllable |
|---|---|---|---|---|---|
| Number of cases | 6 | 7 | 7 | 1 | 6 |
| Number of varieties | 4 | 4 | 5 | 1 | 4 |
tmp <- tonodb %>% filter(Area == "Asia") %>% select(LanguageVariety, Type) %>% separate_rows(Type)
cases <- tmp %>% group_by(Type) %>% summarize(`Number of cases` = n())
varieties <- tmp %>% distinct() %>% group_by(Type) %>% summarize(`Number of varieties` = n())
t <- left_join(cases, varieties)
## Joining with `by = join_by(Type)`
t %>% kable()
| Type | Number of cases | Number of varieties |
|---|---|---|
| coda | 35 | 15 |
| count | 3 | 3 |
| nucleus | 4 | 2 |
| onset | 116 | 29 |
| other | 2 | 1 |
| stress | 2 | 1 |
| syllable | 3 | 3 |
# print(xtable(t(t), type = "latex", caption="Tonogenetic events in Asia in the DTE"))
t(t) %>% kable()
| Type | coda | count | nucleus | onset | other | stress | syllable |
|---|---|---|---|---|---|---|---|
| Number of cases | 35 | 3 | 4 | 116 | 2 | 2 | 3 |
| Number of varieties | 15 | 3 | 2 | 29 | 1 | 1 | 3 |
tmp <- tonodb %>% filter(Area == "Europe") %>% select(LanguageVariety, Type) %>% separate_rows(Type)
cases <- tmp %>% group_by(Type) %>% summarize(`Number of cases` = n())
varieties <- tmp %>% distinct() %>% group_by(Type) %>% summarize(`Number of varieties` = n())
t <- left_join(cases, varieties)
## Joining with `by = join_by(Type)`
t %>% kable()
| Type | Number of cases | Number of varieties |
|---|---|---|
| coda | 8 | 5 |
| count | 8 | 6 |
| nucleus | 1 | 1 |
| other | 1 | 1 |
| stress | 5 | 4 |
| syllable | 8 | 6 |
# print(xtable(t(t), type = "latex", caption="Tonogenetic events in Europe in the DTE"))
t(t) %>% kable()
| Type | coda | count | nucleus | other | stress | syllable |
|---|---|---|---|---|---|---|
| Number of cases | 8 | 8 | 1 | 1 | 5 | 8 |
| Number of varieties | 5 | 6 | 1 | 1 | 4 | 6 |
tmp <- tonodb %>% filter(Area == "North America") %>% select(LanguageVariety, Type) %>% separate_rows(Type)
cases <- tmp %>% group_by(Type) %>% summarize(`Number of cases` = n())
varieties <- tmp %>% distinct() %>% group_by(Type) %>% summarize(`Number of varieties` = n())
t <- left_join(cases, varieties)
## Joining with `by = join_by(Type)`
t %>% kable()
| Type | Number of cases | Number of varieties |
|---|---|---|
| coda | 19 | 16 |
| count | 5 | 3 |
| nucleus | 8 | 5 |
| onset | 1 | 1 |
| other | 2 | 2 |
| stress | 4 | 2 |
| syllable | 5 | 3 |
# print(xtable(t(t), type = "latex", caption="Tonogenetic events in North America in the DTE"), include.colnames=FALSE)
t(t) %>% kable()
| Type | coda | count | nucleus | onset | other | stress | syllable |
|---|---|---|---|---|---|---|---|
| Number of cases | 19 | 5 | 8 | 1 | 2 | 4 | 5 |
| Number of varieties | 16 | 3 | 5 | 1 | 2 | 2 | 3 |
tmp <- tonodb %>% filter(Area == "South America") %>% select(LanguageVariety, Type) %>% separate_rows(Type)
cases <- tmp %>% group_by(Type) %>% summarize(`Number of cases` = n())
varieties <- tmp %>% distinct() %>% group_by(Type) %>% summarize(`Number of varieties` = n())
t <- left_join(cases, varieties)
## Joining with `by = join_by(Type)`
t %>% kable()
| Type | Number of cases | Number of varieties |
|---|---|---|
| coda | 6 | 5 |
| count | 1 | 1 |
| nucleus | 3 | 1 |
| syllable | 1 | 1 |
# print(xtable(t(t), type = "latex", caption="Tonogenetic events in South America in the DTE"))
t(t) %>% kable()
| Type | coda | count | nucleus | syllable |
|---|---|---|---|---|
| Number of cases | 6 | 1 | 3 | 1 |
| Number of varieties | 5 | 1 | 1 | 1 |
tmp <- tonodb %>% filter(Area == "Papunesia") %>% select(LanguageVariety, Type) %>% separate_rows(Type)
cases <- tmp %>% group_by(Type) %>% summarize(`Number of cases` = n())
varieties <- tmp %>% distinct() %>% group_by(Type) %>% summarize(`Number of varieties` = n())
t <- left_join(cases, varieties)
## Joining with `by = join_by(Type)`
t %>% kable()
| Type | Number of cases | Number of varieties |
|---|---|---|
| coda | 1 | 1 |
| count | 4 | 3 |
| nucleus | 3 | 2 |
| onset | 9 | 6 |
| stress | 1 | 1 |
| syllable | 4 | 3 |
# print(xtable(t(t), type = "latex", caption="Tonogenetic events in Papunesia in the DTE"))
t(t) %>% kable()
| Type | coda | count | nucleus | onset | stress | syllable |
|---|---|---|---|---|---|---|
| Number of cases | 1 | 4 | 3 | 9 | 1 | 4 |
| Number of varieties | 1 | 3 | 2 | 6 | 1 | 3 |
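The six per-area blocks above repeat the same steps with only the area name changed. As a minimal sketch, the steps could be wrapped in a helper; `summarise_area()` is a hypothetical name that does not appear in the original script, and the `toy` tibble stands in for `tonodb`. The default `separate_rows()` call is kept so that the helper mirrors the code above.

```r
library(dplyr)
library(tidyr)

# Hypothetical helper (not part of the original script): number of cases and
# number of distinct varieties per Type for a single area.
summarise_area <- function(db, area) {
  tmp <- db %>%
    filter(Area == area) %>%
    select(LanguageVariety, Type) %>%
    separate_rows(Type)
  cases <- tmp %>% count(Type, name = "Number of cases")
  varieties <- tmp %>% distinct() %>% count(Type, name = "Number of varieties")
  left_join(cases, varieties, by = "Type")
}

# Toy data (illustrative values only).
toy <- tibble(
  Area = c("Asia", "Asia", "Asia", "Europe"),
  LanguageVariety = c("A", "A", "B", "C"),
  Type = c("onset", "onset", "coda, onset", "stress")
)

asia <- summarise_area(toy, "Asia")
asia
```

Calling `lapply(unique(toy$Area), summarise_area, db = toy)` would then return one summary per area.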
tmp <- tonodb %>% select(ID, LanguageVariety, TriggeringContext, EffectOnPitch, Type ) %>% head(n=10)
tmp %>% kable()
| ID | LanguageVariety | TriggeringContext | EffectOnPitch | Type |
|---|---|---|---|---|
| 1 | Vietnamese | Initial voiced stop + Falling tone | lowering | onset |
| 2 | Vietnamese | Initial voiceless stop + Falling tone | elevating | onset |
| 3 | Vietnamese | Final voiceless fricative | falling | coda |
| 4 | Punjabi | voiced aspirateed coda | falling | coda |
| 5 | Middle Chinese | final /h/ | falling | coda |
| 6 | Cherokee | inital glide or final glottal consonant | falling | coda, onset |
| 7 | Lhasa Tibetan | final glottal stop | falling | coda |
| 8 | Khaling | Obstruent coda OR disyllable –> monosyllable | falling | coda, syllable-count |
| 9 | Proto-Mohawk-Oneida | lengthened accented vowel followed by a glottal stop or by * / h / plus a resonant consonant | falling | coda |
| 10 | Dzongkha | loss of a second syllable OR loss of a coda /-r/ or /-l/. | falling | coda, syllable-count |
# print(xtable(tmp, type = "latex", caption="Example entries from the DTE"), include.rownames=FALSE)
tmp <- tonodb %>% select(Area, LanguageVariety, Type) %>% separate_rows(Type)
# tmp <- tonodb %>% select(LanguageVariety, Type) %>% separate_rows(Type)
cases <- tmp %>% group_by(Area, Type) %>% summarize(`Cases of tonogenesis` = n())
## `summarise()` has grouped output by 'Area'. You can override using the
## `.groups` argument.
varieties <- tmp %>% distinct() %>% group_by(Area, Type) %>% summarize(`Number of languages` = n())
## `summarise()` has grouped output by 'Area'. You can override using the
## `.groups` argument.
t <- left_join(cases, varieties)
## Joining with `by = join_by(Area, Type)`
t <- t %>% arrange(desc(`Cases of tonogenesis`))
tbl <- t %>% select(-`Number of languages`) %>% pivot_wider(names_from = Type, values_from = `Cases of tonogenesis`)
tbl
## # A tibble: 7 × 9
## # Groups: Area [7]
## Area onset coda count syllable nucleus stress other `NA`
## <chr> <int> <int> <int> <int> <int> <int> <int> <int>
## 1 Asia 116 35 3 3 4 2 2 NA
## 2 North America 1 19 5 5 8 4 2 NA
## 3 Papunesia 9 1 4 4 3 1 NA NA
## 4 Europe NA 8 8 8 1 5 1 NA
## 5 Africa 7 NA 6 6 7 NA 1 NA
## 6 South America NA 6 1 1 3 NA NA NA
## 7 <NA> NA 1 NA NA 1 NA NA 1
# print(xtable(tbl, type = "latex", caption="Tonogenesis events by area"), include.rownames=FALSE)
tbl <- t %>% select(-`Cases of tonogenesis`) %>% pivot_wider(names_from = Type, values_from = `Number of languages`)
tbl
## # A tibble: 7 × 9
## # Groups: Area [7]
## Area onset coda count syllable nucleus stress other `NA`
## <chr> <int> <int> <int> <int> <int> <int> <int> <int>
## 1 Asia 29 15 3 3 2 1 1 NA
## 2 North America 1 16 3 3 5 2 2 NA
## 3 Papunesia 6 1 3 3 2 1 NA NA
## 4 Europe NA 5 6 6 1 4 1 NA
## 5 Africa 5 NA 4 4 4 NA 1 NA
## 6 South America NA 5 1 1 1 NA NA NA
## 7 <NA> NA 1 NA NA 1 NA NA 1
# print(xtable(tbl, type = "latex", caption="Languages with tonogenesis events by area"), include.rownames=FALSE)
t$both_cases <- paste0(t$`Cases of tonogenesis`, " (", t$`Number of languages`, ")")
tbl <- t %>% select(-`Cases of tonogenesis`, -`Number of languages`) %>% pivot_wider(names_from = Type, values_from = both_cases)
tbl
## # A tibble: 7 × 9
## # Groups: Area [7]
## Area onset coda count syllable nucleus stress other `NA`
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 Asia 116 (29) 35 (15) 3 (3) 3 (3) 4 (2) 2 (1) 2 (1) <NA>
## 2 North America 1 (1) 19 (16) 5 (3) 5 (3) 8 (5) 4 (2) 2 (2) <NA>
## 3 Papunesia 9 (6) 1 (1) 4 (3) 4 (3) 3 (2) 1 (1) <NA> <NA>
## 4 Europe <NA> 8 (5) 8 (6) 8 (6) 1 (1) 5 (4) 1 (1) <NA>
## 5 Africa 7 (5) <NA> 6 (4) 6 (4) 7 (4) <NA> 1 (1) <NA>
## 6 South America <NA> 6 (5) 1 (1) 1 (1) 3 (1) <NA> <NA> <NA>
## 7 <NA> <NA> 1 (1) <NA> <NA> 1 (1) <NA> <NA> 1 (1)
# print(xtable(tbl, type = "latex", caption="Tonogenesis events (languages) by area"), include.rownames=FALSE)
m <- tonodb %>% select(Latitude, Longitude, LanguageVariety, Type) %>% distinct() %>% separate_rows(Type)
ggplot(data=m, aes(x=Longitude, y=Latitude, color=Type)) +
borders("world", colour="gray50", fill="gray50") +
geom_point() +
theme_bw()
## Warning: Removed 6 rows containing missing values or values outside the scale range
## (`geom_point()`).
Alluvial diagrams showing the relative frequencies of the types of tonogenetic events and their effects on tone height, contour and pitch.
x <- tonodb %>% select(Type, Height, Ordering) %>% filter(!is.na(Height)) %>% separate_rows(Type)
x <- x %>% group_by(Type, Height, Ordering) %>% summarize(Count = n())
## `summarise()` has grouped output by 'Type', 'Height'. You can override using
## the `.groups` argument.
x <- x %>% mutate(Freq = Count / sum(x$Count))
x <- x %>% arrange(desc(Count))
ggplot(data = x,
aes(axis1 = Type, axis2 = Height, y = Count)) +
geom_alluvium(aes(fill = Ordering)) +
geom_stratum() +
geom_text(stat = "stratum",
aes(label = after_stat(stratum))) +
scale_x_discrete(limits = c("Survey", "Response"),
expand = c(0.15, 0.05)) +
theme_void()
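The `summarise()` message above appears because the result is still grouped after summarising. A minimal sketch, assuming an ungrouped result is what is wanted, passes `.groups = "drop"`; the `toy` tibble below is invented for illustration and is not the tonodb data.

```r
library(dplyr)

# Toy stand-in for the Type/Height/Ordering columns (values are illustrative).
toy <- tibble(
  Type = c("onset", "onset", "coda", "coda", "onset"),
  Height = c("high", "high", "low", "high", "low"),
  Ordering = c("Broad", "Broad", "Strict", "Unclear", "Broad")
)

# .groups = "drop" returns an ungrouped tibble and suppresses the message,
# so sum(Count) in mutate() runs over all rows.
x_toy <- toy %>%
  group_by(Type, Height, Ordering) %>%
  summarize(Count = n(), .groups = "drop") %>%
  mutate(Freq = Count / sum(Count)) %>%
  arrange(desc(Count))

x_toy
```

Because the tibble is ungrouped, `Freq` sums to 1 across the whole table, which matches what `Count / sum(x$Count)` computes in the code above.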
x <- tonodb %>% select(Type, Height) %>% filter(!is.na(Height)) %>% separate_rows(Type)
x <- x %>% group_by(Type, Height) %>% summarize(Count = n())
## `summarise()` has grouped output by 'Type'. You can override using the
## `.groups` argument.
x <- x %>% mutate(Freq = Count / sum(x$Count))
x <- x %>% arrange(desc(Count))
ggplot(data = x,
aes(axis1 = Height, axis2 = Type, y = Count)) +
geom_alluvium(aes(fill = Type)) +
geom_stratum() +
geom_text(stat = "stratum",
aes(label = after_stat(stratum))) +
scale_x_discrete(limits = c("Survey", "Response"),
expand = c(0.15, 0.05)) +
theme_void()
x <- tonodb %>% select(Type, EffectOnPitch, Ordering) %>% filter(!is.na(EffectOnPitch)) %>% separate_rows(Type)
x <- x %>% group_by(Type, EffectOnPitch, Ordering) %>% summarize(Count = n())
## `summarise()` has grouped output by 'Type', 'EffectOnPitch'. You can override
## using the `.groups` argument.
x <- x %>% mutate(Freq = Count / sum(x$Count))
x <- x %>% arrange(desc(Count))
x <- x %>% filter(Count > 1) %>% filter(Type != "other")
x %>% kable()
| Type | EffectOnPitch | Ordering | Count | Freq |
|---|---|---|---|---|
| onset | elevating | Broad - split | 39 | 0.1529412 |
| onset | lowering | Broad - split | 39 | 0.1529412 |
| coda | falling | Unclear | 14 | 0.0549020 |
| onset | mid | Broad - split | 10 | 0.0392157 |
| onset | elevating | Possibly Strict | 8 | 0.0313725 |
| onset | lowering | Possibly Strict | 8 | 0.0313725 |
| coda | falling | Possibly Strict | 5 | 0.0196078 |
| coda | elevating | Unclear | 4 | 0.0156863 |
| coda | falling | Broad - split | 4 | 0.0156863 |
| coda | rising | Unclear | 4 | 0.0156863 |
| nucleus | elevating | Broad - split | 4 | 0.0156863 |
| nucleus | lowering | Broad - split | 4 | 0.0156863 |
| onset | elevating | Unclear | 4 | 0.0156863 |
| coda | elevating | Possibly Strict | 3 | 0.0117647 |
| coda | lowering | Broad | 3 | 0.0117647 |
| coda | lowering | Possibly Strict | 3 | 0.0117647 |
| count | falling | Unclear | 3 | 0.0117647 |
| nucleus | elevating | Possibly Strict | 3 | 0.0117647 |
| nucleus | elevating | Unclear | 3 | 0.0117647 |
| nucleus | lowering | Possibly Strict | 3 | 0.0117647 |
| onset | falling | Broad - split | 3 | 0.0117647 |
| onset | lowering | Broad | 3 | 0.0117647 |
| syllable | falling | Unclear | 3 | 0.0117647 |
| coda | falling | Strict | 2 | 0.0078431 |
| coda | level | Strict | 2 | 0.0078431 |
| coda | level | Unclear | 2 | 0.0078431 |
| coda | rising | Possibly Strict | 2 | 0.0078431 |
| coda | rising | Strict | 2 | 0.0078431 |
| count | elevating | Unclear | 2 | 0.0078431 |
| nucleus | elevating | Broad | 2 | 0.0078431 |
| nucleus | lowering | Broad | 2 | 0.0078431 |
| onset | elevating | Strict | 2 | 0.0078431 |
| onset | lowering | Strict | 2 | 0.0078431 |
| onset | lowering | Unclear | 2 | 0.0078431 |
| onset | rising | Broad - split | 2 | 0.0078431 |
| onset | rising | Unclear | 2 | 0.0078431 |
| stress | rising | Unclear | 2 | 0.0078431 |
| syllable | elevating | Unclear | 2 | 0.0078431 |
x %>% filter(!(Type %in% c("count", "stress", "syllable"))) %>%
filter(!(EffectOnPitch %in% c("level", "mid"))) %>%
ggplot(aes(axis1 = Type, axis2 = EffectOnPitch, y = Count)) +
geom_alluvium(aes(fill = Ordering)) +
geom_stratum() +
geom_text(stat = "stratum",
aes(label = after_stat(stratum))) +
scale_x_discrete(limits = c("Survey", "Response"),
expand = c(0.15, 0.05)) +
theme_void()
w <- x %>% filter(!(Type %in% c("count", "stress", "syllable"))) %>%
filter(!(EffectOnPitch %in% c("level", "mid")))
w$Type <- factor(w$Type, levels=c("onset", "nucleus", "coda"))
w$EffectOnPitch <- factor(w$EffectOnPitch, levels=c("elevating", "lowering", "rising", "falling"))
w %>%
ggplot(aes(axis1 = Type, axis2 = EffectOnPitch, y = Count)) +
geom_alluvium(aes(fill = Ordering)) +
geom_stratum() +
geom_text(stat = "stratum",
aes(label = after_stat(stratum))) +
scale_x_discrete(limits = c("Survey", "Response"),
expand = c(0.15, 0.05)) +
theme_void()
x <- tonodb %>% select(Type, Contour) %>% filter(!is.na(Contour)) %>% separate_rows(Type)
x <- x %>% group_by(Type, Contour) %>% summarize(Count = n())
## `summarise()` has grouped output by 'Type'. You can override using the
## `.groups` argument.
x <- x %>% mutate(Freq = Count / sum(x$Count))
x <- x %>% arrange(desc(Count))
x %>% kable()
| Type | Contour | Count | Freq |
|---|---|---|---|
| coda | falling | 26 | 0.2888889 |
| onset | rising | 11 | 0.1222222 |
| onset | falling | 9 | 0.1000000 |
| onset | level | 9 | 0.1000000 |
| coda | rising | 7 | 0.0777778 |
| coda | level | 5 | 0.0555556 |
| count | falling | 5 | 0.0555556 |
| syllable | falling | 5 | 0.0555556 |
| stress | rising | 3 | 0.0333333 |
| count | rising | 2 | 0.0222222 |
| syllable | rising | 2 | 0.0222222 |
| count | rising-falling | 1 | 0.0111111 |
| nucleus | falling | 1 | 0.0111111 |
| nucleus | rising | 1 | 0.0111111 |
| nucleus | rising-falling | 1 | 0.0111111 |
| other | falling | 1 | 0.0111111 |
| syllable | rising-falling | 1 | 0.0111111 |
ggplot(data = x,
aes(axis1 = Contour, axis2 = Type, y = Count)) +
geom_alluvium(aes(fill = Type)) +
geom_stratum() +
geom_text(stat = "stratum",
aes(label = after_stat(stratum))) +
scale_x_discrete(limits = c("Survey", "Response"),
expand = c(0.15, 0.05)) +
theme_void()
ggplot(data = x,
aes(axis1 = Type, axis2 = Contour, y = Count)) +
geom_alluvium(aes(fill = Type)) +
geom_stratum() +
geom_text(stat = "stratum",
aes(label = after_stat(stratum))) +
scale_x_discrete(limits = c("Survey", "Response"),
expand = c(0.15, 0.05)) +
theme_void()
x <- tonodb %>% select(Type, EffectOnPitch) %>% filter(!is.na(EffectOnPitch)) %>% separate_rows(Type)
x <- x %>% group_by(Type, EffectOnPitch) %>% summarize(Count = n())
## `summarise()` has grouped output by 'Type'. You can override using the
## `.groups` argument.
x <- x %>% mutate(Freq = Count / sum(x$Count))
x <- x %>% arrange(desc(Count))
x <- x %>% filter(Count > 1) %>% filter(Type != "other")
x %>% kable()
| Type | EffectOnPitch | Count | Freq |
|---|---|---|---|
| onset | elevating | 54 | 0.2117647 |
| onset | lowering | 54 | 0.2117647 |
| coda | falling | 26 | 0.1019608 |
| nucleus | elevating | 12 | 0.0470588 |
| nucleus | lowering | 10 | 0.0392157 |
| onset | mid | 10 | 0.0392157 |
| coda | elevating | 9 | 0.0352941 |
| coda | lowering | 9 | 0.0352941 |
| coda | rising | 8 | 0.0313725 |
| coda | level | 5 | 0.0196078 |
| count | falling | 5 | 0.0196078 |
| onset | rising | 5 | 0.0196078 |
| syllable | falling | 5 | 0.0196078 |
| count | elevating | 4 | 0.0156863 |
| onset | falling | 4 | 0.0156863 |
| syllable | elevating | 4 | 0.0156863 |
| count | lowering | 3 | 0.0117647 |
| syllable | lowering | 3 | 0.0117647 |
| stress | elevating | 2 | 0.0078431 |
| stress | lowering | 2 | 0.0078431 |
| stress | rising | 2 | 0.0078431 |
ggplot(data = x,
aes(axis1 = EffectOnPitch, axis2 = Type, y = Count)) +
geom_alluvium(aes(fill = Type)) +
geom_stratum() +
geom_text(stat = "stratum",
aes(label = after_stat(stratum))) +
scale_x_discrete(limits = c("Survey", "Response"),
expand = c(0.15, 0.05)) +
theme_void()
ggplot(data = x,
aes(axis1 = Type, axis2 = EffectOnPitch, y = Count)) +
geom_alluvium(aes(fill = Type)) +
geom_stratum() +
geom_text(stat = "stratum",
aes(label = after_stat(stratum))) +
scale_x_discrete(limits = c("Survey", "Response"),
expand = c(0.15, 0.05)) +
theme_void()
x <- tonodb %>% select(Type, Height) %>% filter(!is.na(Height)) %>% separate_rows(Type)
x <- x %>% group_by(Type, Height) %>% summarize(Count = n())
## `summarise()` has grouped output by 'Type'. You can override using the
## `.groups` argument.
x <- x %>% mutate(Freq = Count / sum(x$Count))
x <- x %>% arrange(desc(Count))
ggplot(x, aes(x=Height, y=Type, fill = Freq)) +
geom_tile() +
theme_bw() +
scale_y_discrete(limits = c("other", "wordtype", "stress", "nucleus", "coda", "onset"))
## Warning: Removed 4 rows containing missing values or values outside the scale range
## (`geom_tile()`).
ggplot(x, aes(x=Type, y=Height, fill = Freq)) +
geom_tile() +
theme_bw() +
scale_x_discrete(limits = c("onset", "coda", "nucleus", "stress", "wordtype", "other")) +
scale_y_discrete(limits = c("mid", "low", "high"))
## Warning: Removed 4 rows containing missing values or values outside the scale range
## (`geom_tile()`).
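Both heatmaps above warn that four rows were removed. A plausible cause, sketched here with the Type labels that appear in the tables of this section, is that the `limits` vectors list `wordtype`, which does not occur in the data, and omit `count` and `syllable`. The vectors below are typed in by hand for illustration; `setdiff()` makes the mismatch visible.

```r
# Type labels as they appear in the Type-by-Height table in this section.
types_in_data <- c("coda", "count", "nucleus", "onset", "other",
                   "stress", "syllable")

# Axis limits as written in the scale_x_discrete()/scale_y_discrete() calls.
limits <- c("onset", "coda", "nucleus", "stress", "wordtype", "other")

# Labels present in the data but absent from the limits: their tiles are dropped.
missing_from_limits <- setdiff(types_in_data, limits)

# Labels in the limits that never occur in the data: they yield empty rows.
not_in_data <- setdiff(limits, types_in_data)

missing_from_limits
not_in_data
```

In the data summarised above, `count` and `syllable` each occur with the heights `high` and `low`, which would account for the four removed rows.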
It is more common for onset tonogenesis to have an elevating or lowering effect, and more common for coda tonogenesis to have a rising or falling effect.
type_height <- tonodb %>% select(Type, Height) %>% separate_rows(Type)
type_countour <- tonodb %>% select(Type, Contour) %>% separate_rows(Type)
table(type_height)
## Height
## Type high low mid
## coda 14 10 0
## count 5 3 0
## nucleus 14 7 1
## onset 27 32 17
## other 3 1 0
## stress 3 2 1
## syllable 5 3 0
table(type_countour)
## Contour
## Type falling level rising rising-falling
## coda 26 5 7 0
## count 5 0 2 1
## nucleus 1 0 1 1
## onset 9 9 11 0
## other 1 0 0 0
## stress 0 0 3 0
## syllable 5 0 2 1
th <- data.frame(unclass(table(type_height$Type, type_height$Height)))
tc <- data.frame(unclass(table(type_countour$Type, type_countour$Contour)))
th <- tibble::rownames_to_column(th, "Type")
tc <- tibble::rownames_to_column(tc, "Type")
tmp <- left_join(th, tc)
## Joining with `by = join_by(Type)`
tmp <- tmp %>% arrange(desc(high))
tmp %>% kable()
| Type | high | low | mid | falling | level | rising | rising.falling |
|---|---|---|---|---|---|---|---|
| onset | 27 | 32 | 17 | 9 | 9 | 11 | 0 |
| coda | 14 | 10 | 0 | 26 | 5 | 7 | 0 |
| nucleus | 14 | 7 | 1 | 1 | 0 | 1 | 1 |
| count | 5 | 3 | 0 | 5 | 0 | 2 | 1 |
| syllable | 5 | 3 | 0 | 5 | 0 | 2 | 1 |
| other | 3 | 1 | 0 | 1 | 0 | 0 | 0 |
| stress | 3 | 2 | 1 | 0 | 0 | 3 | 0 |
# print(xtable(tmp, type = "latex", caption=""), include.rownames=FALSE)
tmp <- tmp %>% rowwise() %>% mutate(height = sum(c(high, low, mid)))
tmp <- tmp %>% rowwise() %>% mutate(contour = sum(c(falling, level, rising, rising.falling)))
tmp
## # A tibble: 7 × 10
## # Rowwise:
## Type high low mid falling level rising rising.falling height contour
## <chr> <int> <int> <int> <int> <int> <int> <int> <int> <int>
## 1 onset 27 32 17 9 9 11 0 76 29
## 2 coda 14 10 0 26 5 7 0 24 38
## 3 nucleus 14 7 1 1 0 1 1 22 3
## 4 count 5 3 0 5 0 2 1 8 8
## 5 syllable 5 3 0 5 0 2 1 8 8
## 6 other 3 1 0 1 0 0 0 4 1
## 7 stress 3 2 1 0 0 3 0 6 3
t <- tmp %>% select(Type, height, contour)
t %>% kable()
| Type | height | contour |
|---|---|---|
| onset | 76 | 29 |
| coda | 24 | 38 |
| nucleus | 22 | 3 |
| count | 8 | 8 |
| syllable | 8 | 8 |
| other | 4 | 1 |
| stress | 6 | 3 |
print(xtable(t, type = "latex", caption=""), include.rownames=FALSE)
## % latex table generated in R 4.3.2 by xtable 1.8-4 package
## % Sat Apr 26 08:51:45 2025
## \begin{table}[ht]
## \centering
## \begin{tabular}{lrr}
## \hline
## Type & height & contour \\
## \hline
## onset & 76 & 29 \\
## coda & 24 & 38 \\
## nucleus & 22 & 3 \\
## count & 8 & 8 \\
## syllable & 8 & 8 \\
## other & 4 & 1 \\
## stress & 6 & 3 \\
## \hline
## \end{tabular}
## \caption{}
## \end{table}
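To make the height versus contour comparison easier to read, the totals above can be turned into a per-type share of contour outcomes. This is only a sketch: the counts are copied by hand from the table above, and `contour_share` is a column name introduced here for illustration.

```r
# Height and contour totals per type, copied from the table above.
counts <- data.frame(
  Type = c("onset", "coda", "nucleus", "count", "syllable", "other", "stress"),
  height = c(76, 24, 22, 8, 8, 4, 6),
  contour = c(29, 38, 3, 8, 8, 1, 3)
)

# Share of contour outcomes among all coded outcomes for each type.
counts$contour_share <- round(counts$contour / (counts$height + counts$contour), 2)

counts[order(-counts$contour_share), ]
```

On these totals, roughly 28% of coded onset outcomes are contours against roughly 61% for codas, which is in line with the statement that onsets tend towards height effects and codas towards contour effects.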
Strict vs broad.
# table(tonodb$Ordering, exclude = FALSE)
table(tonodb$Ordering)
##
## Broad Broad - split Possibly Strict Strict Unclear
## 20 116 43 17 62
t <- data.frame(table(tonodb$Ordering))
t <- t %>% rename(Ordering = Var1, Count = Freq)
t
## Ordering Count
## 1 Broad 20
## 2 Broad - split 116
## 3 Possibly Strict 43
## 4 Strict 17
## 5 Unclear 62
print(xtable(t, type = "latex", caption="Strict vs broad cases of tonogenesis"), include.rownames=FALSE)
## % latex table generated in R 4.3.2 by xtable 1.8-4 package
## % Sat Apr 26 08:51:45 2025
## \begin{table}[ht]
## \centering
## \begin{tabular}{lr}
## \hline
## Ordering & Count \\
## \hline
## Broad & 20 \\
## Broad - split & 116 \\
## Possibly Strict & 43 \\
## Strict & 17 \\
## Unclear & 62 \\
## \hline
## \end{tabular}
## \caption{Strict vs broad cases of tonogenesis}
## \end{table}
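The five counts above sum to 258, while the tibble printed further down has 259 rows, so presumably one case has no `Ordering` value; `table()` drops `NA` by default. A small sketch with an invented vector shows how `useNA = "ifany"` keeps such cases visible.

```r
# Invented values for illustration; one entry is missing.
ordering <- c("Broad", "Strict", NA, "Broad", "Unclear")

# Default behaviour: NA is silently excluded from the counts.
default_counts <- table(ordering)

# useNA = "ifany" adds an NA column whenever missing values are present.
with_na <- table(ordering, useNA = "ifany")

sum(default_counts)
sum(with_na)
```

The commented-out `table(tonodb$Ordering, exclude = FALSE)` call above serves a similar purpose; `useNA` is simply the more explicit spelling.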
Plus something like this, where the numbers outside the parentheses are cases and the numbers in parentheses are languages.
tonodb %>% select(Type, Ordering)
## # A tibble: 259 × 2
## Type Ordering
## <chr> <chr>
## 1 onset Broad - split
## 2 onset Broad - split
## 3 coda Possibly Strict
## 4 coda Unclear
## 5 coda Strict
## 6 coda, onset Possibly Strict
## 7 coda Broad
## 8 coda, syllable-count Unclear
## 9 coda Possibly Strict
## 10 coda, syllable-count Unclear
## # ℹ 249 more rows
table(tonodb$Type, tonodb$Ordering)
##
## Broad Broad - split Possibly Strict Strict Unclear
## coda 6 7 10 10 24
## coda, nucleus 0 0 4 0 2
## coda, onset 0 0 1 0 1
## coda, syllable-count 0 1 0 0 3
## nucleus 3 8 2 0 5
## nucleus, coda 0 0 1 0 0
## nucleus, onset 1 0 0 0 0
## onset 4 98 14 5 7
## onset, other 0 0 2 0 0
## other 0 0 1 0 3
## stress 2 0 4 0 6
## syllable-count 4 2 3 2 11
## syllable-count, nucleus 0 0 1 0 0
t <- data.frame(unclass(table(tonodb$Type, tonodb$Ordering))) %>% rownames_to_column()
t <- t %>% rename(Type = rowname)
t
## Type Broad Broad...split Possibly.Strict Strict Unclear
## 1 coda 6 7 10 10 24
## 2 coda, nucleus 0 0 4 0 2
## 3 coda, onset 0 0 1 0 1
## 4 coda, syllable-count 0 1 0 0 3
## 5 nucleus 3 8 2 0 5
## 6 nucleus, coda 0 0 1 0 0
## 7 nucleus, onset 1 0 0 0 0
## 8 onset 4 98 14 5 7
## 9 onset, other 0 0 2 0 0
## 10 other 0 0 1 0 3
## 11 stress 2 0 4 0 6
## 12 syllable-count 4 2 3 2 11
## 13 syllable-count, nucleus 0 0 1 0 0
print(xtable(t, type = "latex", caption="Strict vs broad cases of tonogenesis by class"), include.rownames=FALSE)
## % latex table generated in R 4.3.2 by xtable 1.8-4 package
## % Sat Apr 26 08:51:45 2025
## \begin{table}[ht]
## \centering
## \begin{tabular}{lrrrrr}
## \hline
## Type & Broad & Broad...split & Possibly.Strict & Strict & Unclear \\
## \hline
## coda & 6 & 7 & 10 & 10 & 24 \\
## coda, nucleus & 0 & 0 & 4 & 0 & 2 \\
## coda, onset & 0 & 0 & 1 & 0 & 1 \\
## coda, syllable-count & 0 & 1 & 0 & 0 & 3 \\
## nucleus & 3 & 8 & 2 & 0 & 5 \\
## nucleus, coda & 0 & 0 & 1 & 0 & 0 \\
## nucleus, onset & 1 & 0 & 0 & 0 & 0 \\
## onset & 4 & 98 & 14 & 5 & 7 \\
## onset, other & 0 & 0 & 2 & 0 & 0 \\
## other & 0 & 0 & 1 & 0 & 3 \\
## stress & 2 & 0 & 4 & 0 & 6 \\
## syllable-count & 4 & 2 & 3 & 2 & 11 \\
## syllable-count, nucleus & 0 & 0 & 1 & 0 & 0 \\
## \hline
## \end{tabular}
## \caption{Strict vs broad cases of tonogenesis by class}
## \end{table}
The same table, counting distinct languages (`Language_ID`) rather than cases.
tmp <- tonodb %>% select(Type, Ordering, Language_ID) %>% distinct()
t <- data.frame(unclass(table(tmp$Type, tmp$Ordering))) %>% rownames_to_column()
t <- t %>% rename(Type = rowname)
t
## Type Broad Broad...split Possibly.Strict Strict Unclear
## 1 coda 6 3 7 3 19
## 2 coda, nucleus 0 0 1 0 1
## 3 coda, onset 0 0 1 0 1
## 4 coda, syllable-count 0 1 0 0 3
## 5 nucleus 2 4 1 0 4
## 6 nucleus, coda 0 0 1 0 0
## 7 nucleus, onset 1 0 0 0 0
## 8 onset 3 17 7 3 5
## 9 onset, other 0 0 1 0 0
## 10 other 0 0 1 0 3
## 11 stress 1 0 2 0 5
## 12 syllable-count 4 1 3 1 7
## 13 syllable-count, nucleus 0 0 1 0 0
print(xtable(t, type = "latex", caption="Strict vs broad cases of tonogenesis by class by language"), include.rownames=FALSE)
## % latex table generated in R 4.3.2 by xtable 1.8-4 package
## % Sat Apr 26 08:51:45 2025
## \begin{table}[ht]
## \centering
## \begin{tabular}{lrrrrr}
## \hline
## Type & Broad & Broad...split & Possibly.Strict & Strict & Unclear \\
## \hline
## coda & 6 & 3 & 7 & 3 & 19 \\
## coda, nucleus & 0 & 0 & 1 & 0 & 1 \\
## coda, onset & 0 & 0 & 1 & 0 & 1 \\
## coda, syllable-count & 0 & 1 & 0 & 0 & 3 \\
## nucleus & 2 & 4 & 1 & 0 & 4 \\
## nucleus, coda & 0 & 0 & 1 & 0 & 0 \\
## nucleus, onset & 1 & 0 & 0 & 0 & 0 \\
## onset & 3 & 17 & 7 & 3 & 5 \\
## onset, other & 0 & 0 & 1 & 0 & 0 \\
## other & 0 & 0 & 1 & 0 & 3 \\
## stress & 1 & 0 & 2 & 0 & 5 \\
## syllable-count & 4 & 1 & 3 & 1 & 7 \\
## syllable-count, nucleus & 0 & 0 & 1 & 0 & 0 \\
## \hline
## \end{tabular}
## \caption{Strict vs broad cases of tonogenesis by class by language}
## \end{table}
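The "cases (languages)" layout described above can be assembled from the two count tables with `paste0()`, in the same way `both_cases` is built earlier in this document. The sketch below copies only the `coda` and `onset` rows by hand from the two tables above; the object names are introduced here for illustration.

```r
classes <- c("Broad", "Broad - split", "Possibly Strict", "Strict", "Unclear")

# Cases per type and ordering class (coda and onset rows of the cases table).
cases <- matrix(c(6, 7, 10, 10, 24,
                  4, 98, 14, 5, 7),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("coda", "onset"), classes))

# Languages per type and ordering class (same rows of the per-language table).
languages <- matrix(c(6, 3, 7, 3, 19,
                      3, 17, 7, 3, 5),
                    nrow = 2, byrow = TRUE,
                    dimnames = list(c("coda", "onset"), classes))

# paste0() works element-wise; matrix() restores the shape and the labels.
both <- matrix(paste0(cases, " (", languages, ")"),
               nrow = nrow(cases), dimnames = dimnames(cases))

both
```

The resulting character matrix can be passed to `kable()` or `xtable()` like the other tables.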
Type by macroarea (all cases).
tmp <- tonodb %>% select(Type, Ordering, Macroarea)
t <- data.frame(unclass(table(tmp$Type, tmp$Macroarea))) %>% rownames_to_column()
t <- t %>% rename(Type = rowname)
t
## Type Africa Eurasia North.America Papunesia South.America
## 1 coda 0 39 14 0 4
## 2 coda, nucleus 0 0 4 0 2
## 3 coda, onset 0 0 1 1 0
## 4 coda, syllable-count 0 4 0 0 0
## 5 nucleus 7 5 3 2 1
## 6 nucleus, coda 0 0 0 0 0
## 7 nucleus, onset 0 0 0 1 0
## 8 onset 7 91 0 7 0
## 9 onset, other 0 2 0 0 0
## 10 other 1 1 2 0 0
## 11 stress 0 7 4 1 0
## 12 syllable-count 6 7 4 4 1
## 13 syllable-count, nucleus 0 0 1 0 0
print(xtable(t, type = "latex", caption="Strict vs broad cases of tonogenesis by class by macroarea"), include.rownames=FALSE)
## % latex table generated in R 4.3.2 by xtable 1.8-4 package
## % Sat Apr 26 08:51:45 2025
## \begin{table}[ht]
## \centering
## \begin{tabular}{lrrrrr}
## \hline
## Type & Africa & Eurasia & North.America & Papunesia & South.America \\
## \hline
## coda & 0 & 39 & 14 & 0 & 4 \\
## coda, nucleus & 0 & 0 & 4 & 0 & 2 \\
## coda, onset & 0 & 0 & 1 & 1 & 0 \\
## coda, syllable-count & 0 & 4 & 0 & 0 & 0 \\
## nucleus & 7 & 5 & 3 & 2 & 1 \\
## nucleus, coda & 0 & 0 & 0 & 0 & 0 \\
## nucleus, onset & 0 & 0 & 0 & 1 & 0 \\
## onset & 7 & 91 & 0 & 7 & 0 \\
## onset, other & 0 & 2 & 0 & 0 & 0 \\
## other & 1 & 1 & 2 & 0 & 0 \\
## stress & 0 & 7 & 4 & 1 & 0 \\
## syllable-count & 6 & 7 & 4 & 4 & 1 \\
## syllable-count, nucleus & 0 & 0 & 1 & 0 & 0 \\
## \hline
## \end{tabular}
## \caption{Strict vs broad cases of tonogenesis by class by macroarea}
## \end{table}
Cases of tonogenesis by type and area (unlike macroarea, area keeps Asia and Europe separate). All rows.
tmp <- tonodb %>% select(Type, Area)
t <- data.frame(unclass(table(tmp$Type, tmp$Area))) %>% rownames_to_column()
t <- t %>% rename(Type = rowname)
t
## Type Africa Asia Europe North.America Papunesia
## 1 coda 0 32 7 14 0
## 2 coda, nucleus 0 0 0 4 0
## 3 coda, onset 0 0 0 1 1
## 4 coda, syllable-count 0 3 1 0 0
## 5 nucleus 7 4 1 3 2
## 6 nucleus, coda 0 0 0 0 0
## 7 nucleus, onset 0 0 0 0 1
## 8 onset 7 114 0 0 7
## 9 onset, other 0 2 0 0 0
## 10 other 1 0 1 2 0
## 11 stress 0 2 5 4 1
## 12 syllable-count 6 0 7 4 4
## 13 syllable-count, nucleus 0 0 0 1 0
## South.America
## 1 4
## 2 2
## 3 0
## 4 0
## 5 1
## 6 0
## 7 0
## 8 0
## 9 0
## 10 0
## 11 0
## 12 1
## 13 0
print(xtable(t, type = "latex", caption="Strict vs broad cases of tonogenesis per macroarea"), include.rownames=FALSE)
## % latex table generated in R 4.3.2 by xtable 1.8-4 package
## % Sat Apr 26 08:51:45 2025
## \begin{table}[ht]
## \centering
## \begin{tabular}{lrrrrrr}
## \hline
## Type & Africa & Asia & Europe & North.America & Papunesia & South.America \\
## \hline
## coda & 0 & 32 & 7 & 14 & 0 & 4 \\
## coda, nucleus & 0 & 0 & 0 & 4 & 0 & 2 \\
## coda, onset & 0 & 0 & 0 & 1 & 1 & 0 \\
## coda, syllable-count & 0 & 3 & 1 & 0 & 0 & 0 \\
## nucleus & 7 & 4 & 1 & 3 & 2 & 1 \\
## nucleus, coda & 0 & 0 & 0 & 0 & 0 & 0 \\
## nucleus, onset & 0 & 0 & 0 & 0 & 1 & 0 \\
## onset & 7 & 114 & 0 & 0 & 7 & 0 \\
## onset, other & 0 & 2 & 0 & 0 & 0 & 0 \\
## other & 1 & 0 & 1 & 2 & 0 & 0 \\
## stress & 0 & 2 & 5 & 4 & 1 & 0 \\
## syllable-count & 6 & 0 & 7 & 4 & 4 & 1 \\
## syllable-count, nucleus & 0 & 0 & 0 & 1 & 0 & 0 \\
## \hline
## \end{tabular}
## \caption{Strict vs broad cases of tonogenesis per macroarea}
## \end{table}
Cases of tonogenesis by type and area (unlike macroarea, area keeps Asia and Europe separate). Per language variety.
tmp <- tonodb %>% select(Type, LanguageVariety, Area) %>% distinct()
t <- data.frame(unclass(table(tmp$Type, tmp$Area))) %>% rownames_to_column()
t <- t %>% rename(Type = rowname)
t
## Type Africa Asia Europe North.America Papunesia
## 1 coda 0 15 5 13 0
## 2 coda, nucleus 0 0 0 2 0
## 3 coda, onset 0 0 0 1 1
## 4 coda, syllable-count 0 3 1 0 0
## 5 nucleus 4 2 1 2 1
## 6 nucleus, coda 0 0 0 0 0
## 7 nucleus, onset 0 0 0 0 1
## 8 onset 5 28 0 0 4
## 9 onset, other 0 1 0 0 0
## 10 other 1 0 1 2 0
## 11 stress 0 1 4 2 1
## 12 syllable-count 4 0 5 3 3
## 13 syllable-count, nucleus 0 0 0 1 0
## South.America
## 1 4
## 2 1
## 3 0
## 4 0
## 5 1
## 6 0
## 7 0
## 8 0
## 9 0
## 10 0
## 11 0
## 12 1
## 13 0
print(xtable(t, type = "latex", caption="Strict vs broad cases of tonogenesis by language by macroarea"), include.rownames=FALSE)
## % latex table generated in R 4.3.2 by xtable 1.8-4 package
## % Sat Apr 26 08:51:45 2025
## \begin{table}[ht]
## \centering
## \begin{tabular}{lrrrrrr}
## \hline
## Type & Africa & Asia & Europe & North.America & Papunesia & South.America \\
## \hline
## coda & 0 & 15 & 5 & 13 & 0 & 4 \\
## coda, nucleus & 0 & 0 & 0 & 2 & 0 & 1 \\
## coda, onset & 0 & 0 & 0 & 1 & 1 & 0 \\
## coda, syllable-count & 0 & 3 & 1 & 0 & 0 & 0 \\
## nucleus & 4 & 2 & 1 & 2 & 1 & 1 \\
## nucleus, coda & 0 & 0 & 0 & 0 & 0 & 0 \\
## nucleus, onset & 0 & 0 & 0 & 0 & 1 & 0 \\
## onset & 5 & 28 & 0 & 0 & 4 & 0 \\
## onset, other & 0 & 1 & 0 & 0 & 0 & 0 \\
## other & 1 & 0 & 1 & 2 & 0 & 0 \\
## stress & 0 & 1 & 4 & 2 & 1 & 0 \\
## syllable-count & 4 & 0 & 5 & 3 & 3 & 1 \\
## syllable-count, nucleus & 0 & 0 & 0 & 1 & 0 & 0 \\
## \hline
## \end{tabular}
## \caption{Strict vs broad cases of tonogenesis by language by macroarea}
## \end{table}
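The gap between the per-case and the per-language counts comes from the `distinct()` step. As a minimal sketch on invented toy data (the rows below are hypothetical and not taken from tonodb), the same effect can be reproduced with base R's `unique()`, used here in place of `distinct()` so the example needs no packages:

```r
# Hypothetical toy data, NOT from tonodb: variety "A" contributes two onset cases.
toy <- data.frame(
  Type = c("onset", "onset", "coda", "onset"),
  LanguageVariety = c("A", "A", "B", "C"),
  Area = c("Asia", "Asia", "Asia", "Africa")
)

# Per case: every row is counted.
per_case <- table(toy$Type, toy$Area)

# Per language variety: identical (Type, LanguageVariety, Area) rows collapse first.
# unique() on a data frame plays the role of dplyr::distinct() here.
tmp <- unique(toy)
per_language <- table(tmp$Type, tmp$Area)

print(per_case)
print(per_language)
```

In this toy example the onset/Asia cell shrinks once duplicates are collapsed, which is the same kind of reduction seen between the per-case and per-language tables above.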
Cases of tonogenesis by trigger type (Type) and macroarea, counting each distinct combination of Type, Ordering and Macroarea once.
tmp <- tonodb %>% select(Type, Ordering, Macroarea) %>% distinct()
t <- data.frame(unclass(table(tmp$Type, tmp$Macroarea))) %>% rownames_to_column()
t <- t %>% rename(Type = rowname)
t
## Type Africa Eurasia North.America Papunesia South.America
## 1 coda 0 5 2 0 1
## 2 coda, nucleus 0 0 1 0 1
## 3 coda, onset 0 0 1 1 0
## 4 coda, syllable-count 0 2 0 0 0
## 5 nucleus 2 2 2 1 1
## 6 nucleus, coda 0 0 0 0 0
## 7 nucleus, onset 0 0 0 1 0
## 8 onset 2 5 0 2 0
## 9 onset, other 0 1 0 0 0
## 10 other 1 1 2 0 0
## 11 stress 0 3 2 1 0
## 12 syllable-count 3 3 3 2 1
## 13 syllable-count, nucleus 0 0 1 0 0
print(xtable(t, type = "latex", caption="Strict vs broad cases of tonogenesis by class by language by macroarea"), include.rownames=FALSE)
## % latex table generated in R 4.3.2 by xtable 1.8-4 package
## % Sat Apr 26 08:51:45 2025
## \begin{table}[ht]
## \centering
## \begin{tabular}{lrrrrr}
## \hline
## Type & Africa & Eurasia & North.America & Papunesia & South.America \\
## \hline
## coda & 0 & 5 & 2 & 0 & 1 \\
## coda, nucleus & 0 & 0 & 1 & 0 & 1 \\
## coda, onset & 0 & 0 & 1 & 1 & 0 \\
## coda, syllable-count & 0 & 2 & 0 & 0 & 0 \\
## nucleus & 2 & 2 & 2 & 1 & 1 \\
## nucleus, coda & 0 & 0 & 0 & 0 & 0 \\
## nucleus, onset & 0 & 0 & 0 & 1 & 0 \\
## onset & 2 & 5 & 0 & 2 & 0 \\
## onset, other & 0 & 1 & 0 & 0 & 0 \\
## other & 1 & 1 & 2 & 0 & 0 \\
## stress & 0 & 3 & 2 & 1 & 0 \\
## syllable-count & 3 & 3 & 3 & 2 & 1 \\
## syllable-count, nucleus & 0 & 0 & 1 & 0 & 0 \\
## \hline
## \end{tabular}
## \caption{Strict vs broad cases of tonogenesis by class by language by macroarea}
## \end{table}
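The column headers `North.America` and `South.America` in these tables are most likely a side effect of `data.frame()`, which by default rewrites column names into syntactically valid R names. The sketch below assumes the underlying labels contain a space (an assumption, since the raw values are not shown here) and uses hypothetical toy vectors to show how `check.names = FALSE` would preserve them:

```r
# Hypothetical toy labels; the space in "North America" is an assumption
# about the raw data, not something confirmed by the output above.
area <- c("North America", "Africa", "North America")
type <- c("coda", "onset", "onset")

# Same construction as in the tables above: a contingency table stripped of its class.
m <- unclass(table(type, area))

# Default behaviour: names are passed through make.names().
default_names <- names(data.frame(m))

# With check.names = FALSE the original labels are kept as they are.
kept_names <- names(data.frame(m, check.names = FALSE))

print(default_names)
print(kept_names)
```

Whether to keep the original labels is a presentation choice; the dotted names are easier to reference in code, while the original labels read better in a printed table.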
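Strict vs broad classification (Ordering) by area, counting every row of the database. The first column is labelled Type but holds the Ordering values.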
tmp <- tonodb %>% select(Ordering, Area)
t <- data.frame(unclass(table(tmp$Ordering, tmp$Area))) %>% rownames_to_column()
t <- t %>% rename(Type = rowname)
t
## Type Africa Asia Europe North.America Papunesia South.America
## 1 Broad 5 4 4 0 3 4
## 2 Broad - split 13 102 1 0 0 0
## 3 Possibly Strict 0 16 1 18 7 0
## 4 Strict 0 15 0 2 0 0
## 5 Unclear 3 20 16 13 6 4
print(xtable(t, type = "latex", caption="Type by rows by area"), include.rownames=FALSE)
## % latex table generated in R 4.3.2 by xtable 1.8-4 package
## % Sat Apr 26 08:51:45 2025
## \begin{table}[ht]
## \centering
## \begin{tabular}{lrrrrrr}
## \hline
## Type & Africa & Asia & Europe & North.America & Papunesia & South.America \\
## \hline
## Broad & 5 & 4 & 4 & 0 & 3 & 4 \\
## Broad - split & 13 & 102 & 1 & 0 & 0 & 0 \\
## Possibly Strict & 0 & 16 & 1 & 18 & 7 & 0 \\
## Strict & 0 & 15 & 0 & 2 & 0 & 0 \\
## Unclear & 3 & 20 & 16 & 13 & 6 & 4 \\
## \hline
## \end{tabular}
## \caption{Type by rows by area}
## \end{table}
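Strict vs broad classification (Ordering) by macroarea, counting each language once per Ordering value. As above, the first column is labelled Type but holds the Ordering values.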
tmp <- tonodb %>% select(Language_ID, Ordering, Macroarea) %>% distinct()
t <- data.frame(unclass(table(tmp$Ordering, tmp$Macroarea))) %>% rownames_to_column()
t <- t %>% rename(Type = rowname)
t
## Type Africa Eurasia North.America Papunesia South.America
## 1 Broad 5 6 0 2 4
## 2 Broad - split 6 17 0 0 0
## 3 Possibly Strict 0 8 11 4 0
## 4 Strict 0 6 1 0 0
## 5 Unclear 2 19 9 4 2
print(xtable(t, type = "latex", caption="Type by distinct languages"), include.rownames=FALSE)
## % latex table generated in R 4.3.2 by xtable 1.8-4 package
## % Sat Apr 26 08:51:45 2025
## \begin{table}[ht]
## \centering
## \begin{tabular}{lrrrrr}
## \hline
## Type & Africa & Eurasia & North.America & Papunesia & South.America \\
## \hline
## Broad & 5 & 6 & 0 & 2 & 4 \\
## Broad - split & 6 & 17 & 0 & 0 & 0 \\
## Possibly Strict & 0 & 8 & 11 & 4 & 0 \\
## Strict & 0 & 6 & 1 & 0 & 0 \\
## Unclear & 2 & 19 & 9 & 4 & 2 \\
## \hline
## \end{tabular}
## \caption{Type by distinct languages}
## \end{table}