mindstalk: (science)
This link goes to a longish article on the complexity of sentences and changes therein. Those who are familiar with the literature of the 18th or 19th centuries, including such documents basic to the USA as the Declaration of Independence, may have noticed a difference in the length and complexity of many sentences from those periods, compared to those of the current era. The author says that there is a real difference, across not just time but also languages: it is written languages which most reliably embed clauses in each other like Russian dolls. Even oral languages which have the tools for such behavior have likely acquired them from contact with written languages.

Does that mean purely oral languages are simpler? Nay! Though their sentences are allegedly childishly simple (examples given include "It will be possible? You will teach me. I will make bread." "He came near those boys. They were throwing spears at something then."), their complexity "erupts" elsewhere, with frighteningly complex word formation, such as in polysynthetic languages.

However, not all complexity is the same. The author claims that the word-formation form of complexity requires massive amounts of memorization, by speakers "marinating" in the language from childhood, and makes an analogy to a rise in compound words in modern English whose meaning is not derivable by pattern. (Examples given: "A house boat, for example, is a boat that functions like a house, but a housecoat is a coat you wear in a house, and a housewife fits neither pattern.") Syntactical complexity, by contrast, is generative: once learned, you can generate it, and decompose it, with equal ease and glee.

My lay grasp of linguistics is far from able to judge the accuracy of the claims. I would note, though, that the article isn't just contrasting modern Western languages with indigenous ones like Yupik; it claims that the earliest written languages also showed the pattern:

"According to linguist Guy Deutscher, the earliest clay tablets (about 2500 B.C.) of the ancient language Akkadian reveal few embedded clauses. The same is evidently true of the earliest stages of other ancient written languages such as Sumerian, Hittite, or Greek. Although these languages boasted a profusion of grammatical features suitable for expressing subtle nuances of meaning, and included a variety of fancy word-building techniques, they avoided complicated sentence recursion."

(Bold emphasis mine.)

So instead of recursive embedded clauses, you get long run-on sentences of chained clauses. Which rings a bell about something I found odd in translations of old Sumerian and Akkadian writing.

Finally, the article tries to link this to esoteric vs. exoteric communities. Small isolated communities can build up memory-taxing stores of word building patterns, which in turn keep the community isolated; large and diverse communities need something with clearer rules. The esoteric community needn't just be some small ancient tribe: modern scientific discourse is identified as an area where sentence complexity diminishes, while non-transparent compound nouns or phrases grow in use.

"Evidence shows that the most insular scientific communities have led the march away from elaborated sentences in favor of complex, compressed nouns: Science articles in specialist publications such as the Journal of Cell Biology contain fewer relative clauses and more noun compounds than articles in publications like Science, which target a more diverse community of scientists."

(That said, I recall a friend's advisor explaining scientific language differently: given a desire to appeal to many people for whom English is not their first language, the acts of keeping sentences simple and free of colorful idioms, and using unambiguous vocabulary, are virtues.)
mindstalk: (Enki)
http://www.nytimes.com/interactive/2013/12/20/sunday-review/dialect-quiz-map.html

Server load or the various plugins in my browser mean I don't see a "Share" link for my maps. My first result was near Richmond VA and a couple of other nearby cities; my second one, with some different questions and a few different answers to old ones, put me in upstate NY, Maine, and Wisconsin, with "nearest cities" of Rochester, Providence, and Springfield MA. My heat maps for individual questions are all over the place, especially the first time I took it (the second one overlapped with Chicago more often, but still didn't end up there).

I grew up a solitary and bookish child to parents from Boston and LA/Berkeley, in gifted/magnet schools, then lived in intellectual California for 10 years, Indian grad school for 8, and now in Cambridge. I've deliberately adopted "you all/y'all" as useful and thanks to knowing the originator of http://www.popvssoda.com/ I frankly have no idea what word I grew up with, though I think I got "soft drink" from my parents. I suspect I just confuse the system.
mindstalk: (kirin)
At a party tonight, people were playing a homegrown version of Pictionary, basically Difficult All Play with made-up words. A neutral player picks a word and shows it to the drawer of each team, and they race to make the guesser say the word; no limit on the abstraction of the word. We saw expertise, irrelevant, vulgar, and tact (which was going on when I left). The winners of the earlier words used "sounds like" techniques, e.g. Vulcan + car = vulgar. This was banned for the 4th round on the grounds of being too powerful. Progress by non-"sounds like" teams was, uh, amusing.

It occurred to me that "sounds like" is recapitulating the evolution of writing. First, pictures of concrete objects or verbs, then ideograms for the more suitable abstract concepts like 'up'... and then instead of arbitrary graphical symbols for the hard stuff, phonemic techniques to elicit the sounds of the arbitrary spoken word people already know.

This suggests a compromise, based on the vast majority of Chinese characters: people can use a partial 'sounds like' technique, indicating part of the sound but combining it with other symbols that suggest the meaning domain. E.g. 'vulcan' + pictures suggesting politeness or rudeness or the populace.
mindstalk: (YoukoRaku1)
A neat image, the origins of which I have no idea of. Catgirls and bunnygirls in swimwear may not be for everyone, but the setting they're in has a magical feel to me. Makes me think of a future Neo-Venezia where the Martian neo-cats have merged with the people and gondolas been replaced by water buses.

Lois Bujold has a new novel coming out real soon now -- or sooner if you go DRM-free ebook -- and the first six chapters are online.

Article and comments on more productive farming.

The Met makes many catalogs free online.

Original pronunciation Shakespeare (Youtube).

Fake lesbian kiss upstages anti-gay rights protest.

Pre-sprawl aerial photos vs. modern satellite images.

Amtrak breaking ridership records. (Keep in mind Amtrak is young compared to US railroads.) (I was contemplating going to OVFF this weekend: ride, plane, bus, train? Nope, no train to Columbus Ohio.)

Lots of bicycle stuff! Bike paths may really be dramatically safer. A defense of electric bikes for SF hills. What bicyclists really want. A bicycle for carrying two small kids. The author is down on it, but the idea comes through anyway. And the Dutch contemplate heated bike lanes.

My Little Ponies teach fonts to use and avoid.

Someone: '"What's keming" is arguably the nerdiest joke I know.'

Why restaurant websites suck.
mindstalk: (12KMap)
Which languages to learn to maximize the number of speakers is a traditional exercise, going something like English, Mandarin, Spanish, Russian, French, Arabic, Hindi, Swahili, Bengali, Portuguese, Japanese, German... But what if you wanted to learn one language per family, while maximizing speakers?

English, Mandarin, Swahili, Arabic, Indonesian, Tamil, Japanese, maybe Turkish, Vietnamese, Thai; possibly Korean; Hungarian. This can be seen as how to learn up to 12 useful languages while minimizing the possibility of any re-use to make your life easier. :)


Of course both lists can look different if weighted by one's personal probability of running into the language or speakers thereof.
mindstalk: (Enki)
On further thought, what really strikes me about that list is how many major Asian languages are in different families, on the level not of Latin and German but of Latin and Chinese.

* Mongolian: probable Altaic
* Chinese: Sino-Tibetan
* Korean and Japanese: likely isolates, possibly relatives of each other, or in Altaic
* Ainu: isolate
* Indonesian, Malay, Tagalog/Filipino, and Formosan languages: Austronesian
* Vietnamese and Cambodian: Austro-Asiatic
* Thai and Lao: Tai-Kadai
* Burmese: Sino-Tibetan
* Hindi and Bengali: Indo-European
* Tamil: Dravidian
* and if you can find Asian Muslims who actually know Arabic: Afro-Asiatic

8 language families, not counting Ainu and Arabic, and with a maximal Altaic group; 10 with a smaller one. And of course this isn't counting all the minor families and isolates.

Even when there's an ostensible or even real genetic relationship, moving from one country to the next is likely to seem completely different; Thai and Lao are close, as are Indonesian and Malay, but those aren't close to Formosan or Tagalog; Vietnamese and Khmer aren't close; no one can agree if Korean and Japanese are related to anything.

Contrast with Europe, where it's Indo-European almost everywhere you go, with older branches like Celtic seeming indigenous to later ones like Latin/Romance and Germanic, having completely overwhelmed whatever came before Celtic, with only a few survivors like Uralic (Finnish, Hungarian) and the Basque isolate. Two families, plus one isolate. Three families if you push out to Georgia and Caucasian, though at that point you might as well add Turkish:Altaic as well.

Of course, once again, we're talking about a much smaller population; Europe is basically half the population of north India. Then again, population size and language diversity don't have much to do with each other. Geography's probably more relevant, but obviously hasn't done that much in Europe. For whatever reason, Indo-Europeans were really good at invading Europe, in multiple waves, even.
mindstalk: (kirin)
I knew there were many language families, and many of their names indicate where they are, but I thought it'd be useful to associate them with famous languages (ones with a large number of speakers) to stand as representatives, as well as targets for learning if you wanted to go look at exotic languages. So I went to http://en.wikipedia.org/wiki/List_of_language_families and clicked a lot.

Takeaway: you can't do this for all language families, because there are dozens of them -- heck, dozens in each of the Americas, New Guinea, and Australia alone. Not counting isolates, of which there are many. But, going by the big groups (at least 1% of world population, which is nearly 70 million people), we have, in order of native speakers:

Family (example languages, and notes) [% of world native speakers]

* Indo-European (duh) [46%]
* Sino-Tibetan (Chinese; Burmese, Tibetan) [21%]
* Niger-Congo (Yoruba, Zulu, Swahili) [6.4%]
* Afro-Asiatic/Hamito-Semitic (Arabic, Berber, Amharic, Hausa, Egyptian, Hebrew, Akkadian)
* Austronesian (Indonesian, Hawaiian; 9 of 10 branches only on homeland of Taiwan/Formosa; very diverse)
* Dravidian (Tamil; south India) [3.7%]
* Altaic (Turkic, Mongolian, maybe Korean and Japanese; disputed)
* Austro-Asiatic (Vietnamese, Khmer (Cambodia), Munda (indigenes of India))
* Tai-Kadai/Kradai (Thai, Lao; highly tonal) [1.3%]

some others of note:

* Uralic (Finnish, Hungarian, Sami, Estonian)
* South Caucasian/Kartvelian (Georgian)
* Hmong-Mien (Hmong, which has 12 tones)
* Iroquoian (Cherokee)
* Mayan
* Uto-Aztecan (Nahuatl)
* Quechua (Andes, Inca)
* Eskimo-Aleut (Yupik, Inuit, Aleut)
* Algic (Algonquian; Blackfoot, Cree, Massachusett, Mohican; has a couple in California)
* Tupian (Brazil; Tupi, Guarani)
* Khoisan (click; Khoi, San; no longer accepted as a single family)
* Ainu
* Sumerian (isolate)

But there are many many others. E.g. at least one non-Eskimo family that's in both Siberia and NW America, not to mention other Siberian and American families separately. Seven families indigenous to Mesoamerica, with Maya and Aztec representing only two of them. New Guinea's many, Australia's many...


There's also the concept of a Sprachbund (http://en.wikipedia.org/wiki/Sprachbund): unrelated languages in an area resembling each other through mutual exchange. Examples:

* blend of Romance, Slavic, etc. in the Balkans (Albanian, Romanian, south Slavic, Greek, Romani)
* Indo-Aryan/Dravidian
* tonal and vowel sharing in SE Asia, Sino/Thai/Khmer
* possibly the whole Altaic 'family'
* clicks from Khoisan into Bantu/Nguni


http://en.wikipedia.org/wiki/Language_family has a map and other discussion
mindstalk: (CrashMouse)
One of the things I marvel at is that Spanish and Japanese get away with five vowels. Maybe a few diphthongs for Spanish. But English has at least 12 semantically distinct vowels:

bait, bat; beet, bet; bite, bit; boat, bought; boot, but; plus bout. Boit is not a word but seems like it could be, for a total of 12. There's also butte, but arguably that has an extra consonant /byoot/.

The other thing is how many syllables English has; it's a good thing we have an alphabet, and not the more commonly invented syllabary. The syllabaries I know of are in the 40-60 range. English: 12 vowels, plus 18 useful consonantal letters, plus th (thin), th (then), ch, and sh. 22*12 = 264 CV syllables. And that's not counting all of the consonantal combinations which don't deserve their own alphabetic letter: sk, br, gl, gr, -nt, -ng, kl, kr, etc., etc. It'd be a nightmare!
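
The arithmetic above can be double-checked with a quick sketch. This is only an illustration under stated assumptions: the post doesn't say which 18 letters count as "useful", so the code assumes the 21 consonant letters minus c, q, and x, and all the variable names are made up for the example.

```python
from itertools import product

# The twelve example vowels from the post, each named by its sample word.
vowel_words = ["bait", "bat", "beet", "bet", "bite", "bit",
               "boat", "bought", "boot", "but", "bout", "boit"]

# Assumption: the "18 useful consonantal letters" are the 21 consonant
# letters minus c, q, and x, whose sounds other letters already cover.
consonant_letters = list("bdfghjklmnprstvwyz")

# The four extra consonant sounds the post lists.
digraphs = ["th (thin)", "th (then)", "ch", "sh"]

# Every consonant-plus-vowel (CV) pairing a syllabary would need a sign for.
onsets = consonant_letters + digraphs
cv_syllables = list(product(onsets, vowel_words))

print(len(vowel_words), len(onsets), len(cv_syllables))  # 12 22 264
```

Even before any consonant clusters, that's 264 signs against the 40-60 of the syllabaries mentioned above.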
