IF NOT, THEN WHY? 1: THE "PROTO-LATIN THEORY" OF THE MAGYAR LANGUAGE
B. Lukács
President of the Matter
Evolution Subcommittee
of the
CRIP RMKI H-1525
Bp. 114. Pf. 49,
lukacs@rmki.kfki.hu
ABSTRACT
Sometimes a theory is impossible, so cannot
be true; and still it works. Surely something mimics Truth; it is then
interesting and worthwhile to see, what mimics what. L. Szabédi, high official
of the
0. ON THE EVOLUTION IN SCIENCES
While I believe in
Scientific Truth, I do not believe in the truth of any particular theory used just now. We know that any particular
theory used just now has replaced a previous theory proven wrong not too far
back in the past. The general (and convenient) belief is that the present theory is more true than the
previous one; and I think this is generally true in a quantitative sense: the new
one answers more questions than the old one, or gives more accurate
predictions, or is simpler to apply, &c. However, in a qualitative sense the
claim of continuous improvement is often not true.
In Physics Democritus
used an atomistic description which we
now like to call correct, and with this scheme he was practically unable to
give explanations or predictions correct either in quantitative or in
qualitative sense. A century later came Aristotle, saying that atomism is
probably incorrect and surely unimportant; he used a strongly inhomogeneous
world picture with a preferred center, and with rest at the proper place as the preferred
state of motion; physicists now generally like
to call this incorrect, but still Aristotle was able to give lots of
good predictions in a qualitative sense, even if few enough were correct
quantitatively. His system survived 1900 years and then Galileo disproved it.
Based on Galileo's rather semiquantitative observations and arguments Newton
then demolished the Aristotelian physics (except in Thermodynamics, to be
sure), and the new one showed a Democritean structure, with mass points moving
with never diminishing momenta (in sums), with forces depending on relative
distances of agents, and not on distances from preferred points of space, and
Newton's World was infinite, the same everywhere, and unevolving. Philosophers
then were eloquent in jubilating on old Aristotle overthrown.
At the latest time as
1926 Relativity & Quantum Physics demolished
As for Gravity, it
seems that Democritus did not know what is
it (I am deliberately ungrammatical, but it would be even worse to use Past
Tense for an Ultimate Truth, even if we do not know It for sure, and my tenses
are correct in Uralic languages anyway). Maybe some
interactions of the small atoms? Aristotle told that Gravity is the
relation of the piece of matter to geometry of Space.
In some areas of
Physics theories now live 5 years on average. When I was choosing a topic for my
BA dissertation (that was in 1969) the most interesting Particle Physics title
was "ρ-π-π Vertices in the Third Times Modified
Veneziano Model". I thought that this might be my last chance to choose freely,
so I wanted something more True than a three times modified model, and I
opted for Gravitational Collapse. As I write this, General Relativity
has survived 91 years in unmodified form,
but my young colleagues do not know what the Veneziano model is, however many
times modified. However, the signals are clear that even General Relativity will change,
hopefully in my lifetime.
As for Astronomy,
Pythagoras believed that Earth revolved around either Sun or the Central Fire.
Aristotle believed that Sun revolved around Earth. Aristarchus supported
Pythagoras, but in vain. Copernicus believed that Earth revolved around Sun,
Galileo supported him and Church believed that She
condemned Galileo for making the support indecently stubborn. Then after Newton
Copernicus' scheme was accepted, but in 1915 Einstein showed that the question
"who revolves around whom" has no meaning at all, not being a covariant
statement. Then in 1992 Church observed anomalies in the proceedings of Galileo's
trial and annulled the verdict retroactively. What is Truth in this
question?
And Physics &
Astronomy are our best Sciences. As for the Age of Earth, pre-Classic Greeks
guessed some thousands of years, Aristotle Infinity, Alexander of Aphrodisias
more than a million years, the Early Middle Ages 5000 years, the XVIIIth
century, mainly on physical arguments, ~10,000 years, the XIXth century, on
physical, chemical & geological arguments, 40 million years, which then
jumped to a billion, and then gradually to 4.55 billion years.
For Economics I would rather
not list the best theories telling opposite Truths. In
Scholarship, look at National Histories.
So at a given point in time we cannot
answer questions about Ultimate Truth, although on average Science goes forward. But Ultimate Questions cannot be
answered on average; they should be answered Yes or No.
Willard Van Orman
Quine tells us [1] that changes in Science follow a certain
Parsimony. We have lots of assumptions when building up a Theory. Then, if an
observation contradicts a prediction, we change some assumptions, but only those
which are the "cheapest". So the Ptolemaic World Scheme was improved
and improved by introducing more and more epicycles and eccentric circles for
1500 years; and only then did Copernicus take over.
But even this is not
so simple. Historians of Science tell us that the Aristotelian World Picture
collapsed when Galileo's telescope resolved Milky Way into individual stars.
They are obviously neglecting the original works of Aristotle. De Caelo gives
an explanation for Milky Way, and from this explanation the prediction would
have been (nobody applied Aristotle's theory to telescopes) that after some
magnification the picture will be light spots surrounded by darkness [2], as it
was. Because neither Galileo nor his opponents were fluent in Aristotle, they
agreed that there was something against Him. Now we know that there was, but
they should not have yet known.
So it seems that a
kind of "scientific fashion" is involved in each Yes/No choice. Maybe
if we could observe the many-world histories of
But we can remember
it. And this view gives a Prediction. There may be cases when somebody who is quite
a good scientist/scholar invents a theory whose foundations seem quite impossible
to us; and still he is not a madman, and his theory
can explain a lot of facts. Maybe not so much as the Best Theory, but still, it
is surprisingly good compared to Common Opinion that it is Fundamentally Wrong.
And in a century futureward...?
This study is about
the "proto-Latin Theory" of the Origin of Magyar language [4],
elaborated by L. Szabédi, Magyar poet, linguist, University high official and
finally martyr in Roumania.
1. THE BACKGROUND
I must sketch the
situation of
Szabédi was an ethnic
Magyar in
Then the Magyar
minority made jokes about the Latin origin (in small voice) and did not believe
even the Neo-Latin origin of the language (true, lots of Bulgarian Slav words
can be found in it, even after the Herculean work of Bishop Samuil
Micu-Klein, but also a comparable amount of Bulgarian Turk is present in
Magyar). So, because Magyars in
And yet, Szabédi
worked out a theory in which Magyar is a cousin of Roumanian, Roumanian being
the direct descendant of Latin, while Magyar comes from the direct ancestor of
Latin through proto-Magyar, Ugor, Finno-Ugric or so.
Of course, you can
believe in a simple explanation, as follows. The State is Roumanian. Szabédi
wants to get something from the
I heard lots of stories about lip service to
2. THE MAJORITY OPINION ABOUT ROUMANIAN, MAGYAR AND LATIN
Indo-Europeanist linguists
have a very strict opinion about the origin of the Roumanian language.
According to it, Roumanian is the direct descendant of Eastern Vulgar Latin. I
agree with this opinion even if I think Westerners ignore the heavy Carp &
Bulgarian Slav substrates/adstrates. They also ignore the Lazhen’ inscription,
but the Lazhen’ inscription influences only the exact location of the
ethnogenesis and they have to ignore that.
For the origin of
Magyar there is an official Hungarian Academic opinion. I write Academic, not
academic; this is the majority opinion of the
Latin was a member of
the Italic Subfamily of the Indo-European family; or maybe of the Italoceltic
subfamily whose sister subgroups were Italic & Celtic; it will come in due
course. Indo-European and Uralic are 2 disjoint families, albeit with early
contacts or earlier genetic connections (again later), but they were disjoint
already in 4500 BC, or even earlier. Their grammatical structures vastly differ
even if a few words (water, name, honey &c.) seem to be close enough.
I may be mistaken,
but I do not have any strong reason to challenge this view, and I will adopt this scheme. To be sure,
I always feel Altaic languages to be nearer to Uralic than linguists generally think;
but I may be overemphasizing the great grammatical
similarity. Also, Altaic is irrelevant to the present discussion.
So the majority
opinion (which I share) is that
Magyar & Latin/Roumanian were divergent for at least 6,500 years. In
contrast Szabédi stated [4] that the evolutions diverged for a mere 3,900 years
(or, alternatively, 3,600). When the common language split, proto-Italians went
to
Now some readers may
believe that 6500 years as against 3900 is not a really big difference. But
this is not so. From the Indo-European family we do have Hittite and Greek
original written documents 3500 years old, and from Hatti very good arguments
that some ethnically Hittite kings ruled in
The answer of the majority
of Indo-European linguists would be No;
only they have never read [4]. The overwhelming majority of Uralic linguists would
also say No; and a few of them know
the theory, even if not the details. My answer is also No; but I have the book.
So I ask: why?
3. ON ETYMOLOGIES
How to verify
linguistic kinship? It seems that there are two schools. Either
fundamental words are primary and
grammatical structure is secondary,
or vice versa. Since for Indo-Europeans the two criteria generally go hand in
hand (except, maybe, Albanian and Tocharian; later), they do not have to
decide, and for fundamental words the method is 200 years old, so they like it.
You have a few
languages you suspect to be genetically related. Then you tentatively postulate regular changes of sounds. These
postulated changes may be anything, but you must be strict about them. If by
the postulated changes you can bring the words of the different languages into
relation, then you are done. The suspected kin
languages indeed came from one parent language, even if nobody ever wrote anything
in that language.
Example 1: Latin and the daughter languages. Lots of written books
are extant in Latin. But first let us forget about Latin texts. We observe
several languages in
All
ones which come directly from the parent language. Words borrowed from
elsewhere may or may not behave so. And a minority of exceptional words may
behave exceptionally. But lots of cognate words can be found. Then other rules can be looked for. Intervocalic Italian
"-tt-" often seems related to French "-it" and
Roumanian "-pt-", as in latte, lait, lapte, all "milk". And
so on and so on.
And now you take the
preserved Roman books and see that "cent" and "cento" were
indeed "centum", and "latte", "lait" and
"lapte" were originally "lac" (stem "lact-"). The words having a Latin
etymology are the words that can be traced back via regular changes to a Latin
ancestor, even if that word does not occur
in any extant book; but it generally occurs.
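The procedure of Example 1 can be put into a few lines of code. What follows is only a toy sketch under stated assumptions: it encodes the single correspondence mentioned above (Latin "-ct-" against Italian "-tt-" and Roumanian "-pt-"), the names REFLEX_OF_CT, COGNATE_SETS, predicted_stem and rule_holds are hypothetical, and the second word set (Latin "octo", Italian "otto", Roumanian "opt", all "eight") is added here for illustration and is not from the examples above.

```python
# Toy check of one postulated regular sound correspondence (illustrative only).
# Postulated rule: Latin "ct" corresponds to Italian "tt" and Roumanian "pt".
REFLEX_OF_CT = {"Italian": "tt", "Roumanian": "pt"}

# (Latin stem, attested daughter words). The "milk" set is from the text;
# the "eight" set is an extra pair added for this sketch.
COGNATE_SETS = [
    ("lact", {"Italian": "latte", "Roumanian": "lapte"}),
    ("oct", {"Italian": "otto", "Roumanian": "opt"}),
]


def predicted_stem(latin_stem, language):
    """Apply the postulated rule to a Latin stem."""
    return latin_stem.replace("ct", REFLEX_OF_CT[language])


def rule_holds(latin_stem, language, attested_word):
    """True if the attested word begins with the stem the rule predicts."""
    return attested_word.startswith(predicted_stem(latin_stem, language))


if __name__ == "__main__":
    for latin_stem, daughters in COGNATE_SETS:
        for language, word in daughters.items():
            verdict = "fits" if rule_holds(latin_stem, language, word) else "does NOT fit"
            print(f"{latin_stem}- > {language} {word}: {verdict}")
```

Being "strict" means exactly this: the rule is written down once and then applied mechanically to every candidate word. A word that does not fit is either a borrowing, an exception, or evidence against the rule.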
Example 2: Kentum vs. satem. You observe that there are really a
lot of languages to show up exciting similarities in words "father",
"mother", "bring" and so. E.g. "father" is
"pater" in Latin and "pitar" in Sanskrit. But half of these
languages have an initial stop in "100" as Latin "kentum",
written as "centum", while the other half a sibilant as in Iranian
"satem", or an affricate. Then your guess is that the original stop
changed into sibilant for any reason. No problem, it turned into sibilant
before a front vowel in French "before our eyes", so it may have
done so before any vowel in Iranian. It turns out that the suspected related
languages can be grouped neatly into two halves: kentum as Italics, Celts,
Germans and Greeks, and satem, as Slavs, Balts, Iranians and Indians. XIXth
century guessed a West-East dichotomy; simple enough. Then French is
"kentum" although "cent" is now starting with an s-sound; but it comes from a Latin
"k".
In Roumanian
"100" is "suta", not even written with a "c",
although Roumanian is Neo-Latin, so "kentum". Then you can try with
easy explanations. 1) "Suta" comes from "kentum" in some
too involved way, and when Roumanian writing started (1521), they already did
not remember the "k"-sound. Or, 2) that especially "100" is
a Slavic loanword from "sto". Or, 3) that it is a loanword from
Dacian (?), or Thracian (satem) or Armenian (satem). I think 2) is the majority
opinion.
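The kentum/satem sorting of Example 2, and the Roumanian puzzle, can be mimicked by a deliberately naive sketch. It looks only at the first sound of the word for "100", in the phonetic spellings used in this text; the function name classify_by_hundred and the dictionary HUNDRED are hypothetical, introduced only for illustration.

```python
# Naive kentum/satem sorter: it inspects only the first sound of "100".
# The spellings are the phonetic ones used in the text (Latin "kentum").


def classify_by_hundred(word):
    """Return 'kentum' for an initial k-stop, 'satem' for an initial sibilant."""
    first = word.lower()[0]
    if first == "k":
        return "kentum"
    if first == "s":
        return "satem"
    return "undecided"


HUNDRED = {
    "Latin": "kentum",
    "Iranian": "satem",
    "Slavic": "sto",
    "Roumanian": "suta",
}

labels = {language: classify_by_hundred(word) for language, word in HUNDRED.items()}

if __name__ == "__main__":
    for language, label in labels.items():
        print(f"{language}: {HUNDRED[language]} -> {label}")
```

The sorter puts Roumanian among the satem languages, although by descent it belongs with Latin. A single word cannot decide the classification; if "suta" is a loanword, as explanation 2) assumes, then the test word itself is contaminated.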
Example 3: Reconstruction of Proto-Indo-European (PIE). There are
many suspected Indo-European languages, and many scholars of them. In addition
some such languages have very old records. Hittite & Greek go back to
1500 BC, and even Latin goes back to 600 BC. To be sure, the oldest records
come from non-IE languages about 3500 BC: Sumerian is an isolate (?) and
Egyptian is Afro-Asiatic, but even 1500 BC is quite old. So there is a lot of information, there are lots of reconstructors, and so PIE is
reconstructed. E.g. we can try with the very obvious choice of a sentence:
"On a hill a sheep not having wool saw horses, one pulling a heavy wagon,
one carrying a big load and one carrying a man quickly." Then modern
linguistics reconstructs the sentence as: "Gwrreei owis, quesyo wlhnaa ne
eest, ekwoons espeket, oinom ghe gwrrum woghom weghontm, oinomque megam bhorom,
oinomque ghmmenm ooku bherontm." [6].
And now I, the Uralian
speaker, tell you that for me everything seems OK. "Gwrreei" may be
"on the hill", because in Slavic
"gora" is "hill" and there the Locative ends in e/i.
"Owis" can be sheep because it is "ovis" in Latin.
"Wlhnaa" may be "wool" and "ne eest" may be the
Latin "non est". Again "ekwoons espeket" is reminiscent of
Latin "equos spectat", "oinomque" is Latin
"unumque", where we know the earlier "oinom" from original
inscriptions, in "megam" you can see the Greek "mega",
Hindi "maha" = "big", + -m for accusative as in Latin, and
so on.
While any of the words
may be an error (we cannot check it, not having inscriptions), the whole
sentence seems good for, say, 90%.
Now, the question is:
can we push Magyar in amongst these related languages (with or without
Uralics)? Can we find Indo-European, or Italo-Celtic or proto-Latin etymologies
for many Magyar words?
Szabédi found them, for
hundreds. Of course, for this he postulated his own rules. But, as you could
observe, this was his right for a brand new IE language. It could not be
otherwise: a new language needs its own specific change rules. He only has to follow his rules strictly.
Was he strict enough?
I do not know. For me his derivations seem roughly similar in strictness to
those of professional linguists. (I
am a professional physicist.) Maybe
professional linguists could tell something; but they do not do it.
4. EXAMPLES FOR ETYMOLOGIES
Szabédi's etymologies
are multitudinous. I take two groups for examples: the seemingly absurd and the
surprisingly obvious. Let us see some examples, but be careful about
orthography. I cannot use a Latin orthography for Magyar; both vowels and
consonants are more varied in Magyar. So here is some correspondence to
English, at least for consonants:
Magyar | English | Note
c | ts | -
cs | ch | -
gy | dy | Originally j
j | y | -
s | sh | -
sz | s | -
ty | ky | As
zs | zh | -
And then we can start. Scarabeus = cserebog-ár. Sure, today's scarabeus
is a Scarabaeina, and today's cserebogár = cockchafer is a Melolonthina, but
they are neighbour subfamilies of the Scarabaeidae, so the cserebogár is at
least a Scarabaeida, and even now a layman sees little enough difference. Szabédi
starts from a proto-Latin *scarabaios. Thence Latin scarabeus is easy (and
Greek skarabeios too); the Magyar comes through a csaraboj.
Now
for "oil" (the food). Szabédi starts with a
proto-Latin *olaivom. Hence he gets Greek elaion through an *elaivon (indeed
the digamma sound vanished in Attic), also the Latin "oleum". Now, on
the Uralic side he assumes an *ulojwu, but with dark l, hence an *uwojwu, and
then an Ugric woj, Magyar vaj. Indeed, Finnish "voi" and Magyar
"vaj" both are "butter", the Lapponian "vuojj" is
some "oil", and Manyshi "woi" is "fish-oil". The
different Uralic languages use the same word for different fats and oils,
according to lifestyle.
Proto-Latin
*cannabaria remained the same on the Latin side, i.e. "hemp" (remember
cannabis). But "hemp" is "kender" in Magyar, which he gets
easily from *cannabaria.
Proto-Latin *manuvali
becomes "manuale" in Latin, so a kind of "handle". Now it
is "nyél" in Magyar, which can be got via
*manivali>*mnivéli>*nyél.
Proto-Latin *gönücle
went into Latin geniculum and Magyar könyök = elbow.
Latin mersat is Magyar
merít, in older times mereht. Alibi ~ egyéb (in the
sense: other). From proto-Latin *vunda one gets the Latin "unda"
(wave), and the FU "wite", Magyar "víz" (the Uralic side is
orthodox!), "water". Latin "flamma" and Magyar
"láng" (both "flame") come from proto-Latin *phlogma. Latin "casa" and Magyar "ház" both from
proto-Latin "cotta". (I note that the Magyar "ház"
is very near to German "Haus", but cannot be a loanword, having
regular Finno-Ugric cognates.)
I give one more pair,
and stop. Nepos ~ nép. The Latin word means
"descendants", the Magyar now is "people", but originally
may have meant the wife, children and some cognate persons together, so opposed
to the head of the greater family.
There are really more
than a hundred etymologies; I was too lazy to count them. How is this possible?
5. ONE CENTURY BACK
First I retreat into
1904. You may ask, why. The answer is simple. Szabédi was born in 1907, and in
1904 a great monograph was published about comparative IE linguistics [7]. I
got my copy in a second-hand bookshop, with pen writings on margins and paper
sheets inside. A draft of a letter showed that somebody used it as University
textbook in 1923. Szabédi learnt linguistics probably from 1926; I cannot prove
that he would have used another copy of [7], but anyways his textbook must have
been similar. Also those of his critics.
Obviously at the end
of the XIXth century IE linguistics did not consider itself complete, as Physics
did (the "only two small cloudlets" of Lord Kelvin in 1900 [8]);
linguists expected new languages, but generally not new subfamilies. Lots of
extant and extinct IE languages were known, some practically only as a name
(Greek or Latin authors mentioned them), some from scanty inscriptions of a few
words, but some with literature. Let us then look back a century.
Travellers and then
linguists discovered newer and newer IE languages, but all of them belonged to
the Indo-Iranic or Aryan subfamily. Only one other subfamily belonged to
As for extinct
languages, from time to time inscriptions, codex glosses and such were known, and careful Brugmann does mention some languages
which either may have been Indo-European, or they definitely were but the texts
were too scanty to tell anything else about. His views are as follows.
Phrygian and Thracian
were IE, maybe satem, and may have been in kinship with Armenian; texts were
scanty. (Indeed a Hungarian textbook, which I will not cite
here because it is written in Magyar, so you could not read it, from 1970,
still tells that the only Thracian text is a few words inscribed on a ring.)
He knows about 3
languages at the periphery of
Brugmann says that
Lydian showed IE characteristics, but he knew too little about it to be more
definite. Finally he mentioned Lycian, which may have been IE, or may
not.
Surely individual
linguists had individual guesses about Lydian & Lycian; but there was no
expectation of new subfamilies. OK, there had been Thracians; then a branch of
Phrygians, the Bryx, still had been living in European Thrace in the mythic age
(read the stories about the Golden Fleece and Jason), so Phrygians were kin of
Thracians. Now, Lydians lived between Phrygians, Greeks and Persians, so they may
have been a more Eastern kind of Phrygians. If Thracians
& Phrygians were classified together with Armenian, why not Lydians?
And then the last, Lycian? Maybe it also belonged to
Armenian.
As I told, they neatly
dissected these languages into kentum/satem. All kentum languages were Western,
and all satem Eastern.
Of
course, with Western optics. Namely
So k-stops remained on
the West but became sibilants on the East. Why? Who knows? Maybe originally the
change was a fashion, which could not overstep the borders of Old Greece; Greeks
regarded Phrygians, Thracians, Lydians & Iranians as
barbarous. Or maybe satemization was a kind of laziness. A
stop sound is hard & disciplined. If we become lax, air comes through a
slit, and then there is a sibilant. For XIXth century Europeans it was trivial
to consider Easterners lazy & undisciplined.
And then came 2
discoveries. The bigger one will not enter into this story too strongly. In the
1860's travelers became fascinated by an old civilisation of Eastern
Anatolia, and some experts of the Bible guessed (correctly enough) that the
ruined cities had been built by the people called Hittite in the Bible. Then inscriptions
were collected, and in the 1910's Hrozny deciphered them. A strange and archaic
IE language came forward. It was kentum, in spite of being so Eastern; but it was old
too. Maybe older than satemization.
Indeed, it was old.
The original Hittite empire was broken up (probably by Phrygians & Lydians)
in 1190 BC as we know now. Classical Greek travellers may have observed the two
Hittite colonies in
It seems now that
Lycian classifies together with Hittite, in an Anatolian subfamily. But then we
are ready with this surprising new subfamily. Let us then see the second one.
My compatriot, Sir
Aurel Stein (I am serious) travelled a lot in
In any case, the
Tocharian texts were a shock. First, they were the Easternmost Indo-Europeans.
Nobody expected Indo-Europeans at Long. 95 E!
However, if we know
that they were IE, everybody can imagine a small group of Parthian horsemen
trekking Eastward. (See e.g. H. Beam Piper’s
Alternative History novel Lord Kalvan of Otherwhen, which I relegate to
Appendix A. Here it matters only that even in the novel of the Alternative
History expert [10]
the Bering-crossing Indo-Europeans are Aryans, so Indo-Iranians.
Even he, in the 60’s, had not heard about Tocharians; or did not take them seriously.)
But this language was kentum! This is easy to show: "100" is
känt/kante (there were 2 dialects).
And then came the third surprise. The extinct language had a lot of
-r- sounds in the passive verbal suffixes.
If you learnt Latin,
you see what I am telling. See a Table:
English | Latin | English | Latin
I see | video | I am seen | videor
thou seest | vides | thou art seen | videris
he sees | videt | he is seen | videtur
we see | videmus | we are seen | videmur
you see | videtis | you are seen | videmini
they see | vident | they are seen | videntur
And so on: also in Past and Future.
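The pattern of the table can be written down as a small set of ending replacements. This is only a sketch, assuming nothing beyond the six Present-tense forms listed above; the names ENDINGS and passive_of are hypothetical, and the rule set is not claimed to cover other tenses or conjugations.

```python
# Active -> passive endings, read off the six Present forms of "video".
# Longer endings come first, so "-mus" and "-tis" are tried before "-s",
# and "-nt" before "-t".
ENDINGS = [
    ("mus", "mur"),
    ("tis", "mini"),
    ("nt", "ntur"),
    ("o", "or"),
    ("s", "ris"),
    ("t", "tur"),
]


def passive_of(active_form):
    """Replace the active personal ending by the passive one."""
    for active_ending, passive_ending in ENDINGS:
        if active_form.endswith(active_ending):
            stem = active_form[: -len(active_ending)]
            return stem + passive_ending
    raise ValueError(f"no known ending in {active_form!r}")


ACTIVE = ["video", "vides", "videt", "videmus", "videtis", "vident"]
PASSIVE = [passive_of(form) for form in ACTIVE]
WITH_R = [form for form in PASSIVE if "r" in form]

if __name__ == "__main__":
    for active, passive in zip(ACTIVE, PASSIVE):
        print(f"{active} -> {passive}")
    print(f"{len(WITH_R)} of {len(PASSIVE)} passive forms contain -r-")
```

Five of the six passive endings contain an r; only the second person plural ("videmini") does not.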
It seems as if this
-r- was the original suffix of the Passive Voice in Latin. And XIXth century
linguists found this -r- in the Passives of Oscan and Umbrian (so in all Italic) + the
Celtic; but nowhere else. Then the XXth century found it in Hittite; and in Tocharian. In Tocharian A
klots = ear, klyos is "to hear", and klyostär is "is heard". Turning
to Latin,
klyostär = auditur
Is it not nice?
In the first half of
the XXth century many linguists preferred the Italoceltic
theory. Sometime in the far, far past (now we know that it was ca. 2500
BC) Indo-Europeans diverged. Some went to Northeast, some Southeast, some West
and so on. But the ancestors of later Italics and Celts remained together for a
while, so there are more common features between them than between any other
pair. To be sure, the passive -r- is not the only common feature. Some words
were almost mutually understandable between them in the time of Julius Caesar,
e.g.
English | Latin |
King | rex | rix
Now some scholars
accept the Italoceltic cohabitation, some not. For example see Kortlandt on one
hand [11], cladistic people on the other. In any case, even Warnow's cladistic
computer classification [12] clearly shows a common Italoceltic branch; and in another
computer calculation [13], the calculators take the years for the beginning and
end of the independent Italoceltic community as 3000 & 2400 BC. But computer
calculations or not, when our sources (Greek travellers and Roman traditions)
start to report, Italics & Celts are just neighbours. An earlier common
life is more or less expected. And if we indeed go back to 1904, then European
linguists had all learned Latin (and
Greek); they felt the -r- Passives in their bones, so an Italoceltic community was
trivial for them. (I, in 2006, am a physicist. I will not try to decide what the truer reasons of the similarities are, if at least the
similarities exist.)
I was unable to find
anything about Tocharians in Szabédi. Indeed, you cannot expect them there, because they were not
in Brugmann's textbook written in 1904.
6. THE INDO-EUROPEAN MIGRATIONS IN THE MINDS OF SCHOLARS AT THE BEGINNING OF XXTH CENTURY
If you have a group of
related languages, then maybe in old times the ancestors spoke the Ur-Language
in the Urheimat.
This is not necessarily
true. You can imagine a chain of ancient languages where every community
understood the two neighbours and not the farther ones. However let us not be
pedantic, and continue. Where was the Urheimat?
There were two
preferred places. But first let us state that two living languages were the
most archaic ones: Lithuanian & Sanskrit. Sanskrit is still a living language, albeit barely; Brahmins are not
celibate as Roman Catholic priests,
so you can imagine a family in India speaking Sanskrit as first language, and
such families are reported; I do not know if I should believe the reports.
Otherwise the Sanskrit is artificial and some 2000 years old.
Now, there are
arguments that very archaic idioms remain at the core territories where the
foreign influence is weakest. Then you may conclude that the Urheimat was 1)
close to the Baltic; or 2) in
OK, then put the
Urheimat into
The alternate Urheimat
in
Surely the IE and FU
Urheimats were neighbours (if the second one existed at all). While the grammars differ as much as it is
possible, some very primary vocabulary agreements can easily be found. E.g.
NAME.
PIE "nem-", PU "nime". Magyar "név", Finnish "nimi", Latin “nomen”
&c.
WATER.
Magyar "víz", Finnish "vete", German "Wasser",
Hittite "waatar", &c.
THOU.
PIE "te", PU "ti" (The Sg2N pronoun.)
Latin "tu", German "du", Magyar
"te", Mordvin "ton", Selkup "tan" &c.
Dozens of such
agreements exist (see e.g. [14]). Some of them may be simply borrowings but
maybe not all. Since originally Magyar was spoken in Westernmost Siberia, if
both Magyar and Finnish (“Suomalainen”) are included in the etymology, that
means at least PFU sharing the root.
So the Urheimats were
neighbours. Now this means that in a (strange enough) migration either a FU
group could borrow more IE words, or vice versa.
Now, it seems that Proto-Magyars did never cross the line between
Two such groups are yielding
words which are proven beyond doubt. The more famous are the Indo-Iranians.
Lots of common words between Magyar and some Indo-Iranian languages are known,
e.g.:
GOLD: Magyar
"arany", Avestan "zaranya".
COW: Magyar
"tehén", Avestan "daenush" (female animal), proto-Iranian
"dhena" &c.
MILK: Magyar
"tej", Avestan "dayah" &c.
FELT: Magyar
"nemez", Pehlevi "namat" &c.
Several dozens of such
words are known; some of them are proven. The explanation may be simple enough:
the whole Ugric group was involved in the Andronovo Culture in the 2nd
millennium BC, whose Southern component was "Sarmatian", so some
Indo-Iranian.
However most of these
words could not mislead Szabédi. Indo-Iranian forms seriously differ from
Latin, or Italic or Italoceltic. E.g. COW is Latin "vacca", or more
generally "bovis", and Magyar
"tehén" cannot correlate with that.
Now we are nearing my
suggestion for the explanation of Szabédi's error.
7. ON THE TOCHARIANS
As was told earlier,
Tocharian was the Easternmost IE language, in "historical" times
(i.e. when the written texts were made) in the
FATHER is
"pacar". For most of you this, while not un-Indo-European, may not
seem too "Western". But here "c" is palatal "t",
so "t'". This sound is absent in recent Western European languages,
however it was present in Late Western Vulgar Latin in "ti-" groups;
it is now written as "ty" in Magyar, "t'" in Slovakian,
"ci" in Polish, "c'" in Croatian, and "ky"
in Romanized Japanese. (Or maybe Tocharian c stood for the “ch” of English; the
palatal “k” and the “ch” sound are not far from each other and are rather
confused by many Westerners, but Magyars, Slovaks, Croats and Japanese can
distinguish. From the history of Late Vulgar Latin we see that after some
evolution “-tiV” and “ci-” groups coincided in a
string of dialects.) In any case,
Tocharian "c" is a "t"-sound in general sense. Now look at
Tocharian "pacar", Latin "pater". On the East it is
"pitar" in Old Indian.
MOTHER is
"macar"; it is "mater" in Latin and "matir" in
Gaulish ("mathair" in Irish). True, it is similar enough,
"matar", in Old Indian.
DOG is "cu".
It is "canis" in Latin, "ci" in Welsh (and "kyon"
in Greek &c). To be sure, it is "kutya" in Magyar,
"koira" in Finnish and "köpek" in Turkish, so it may be an
old Nostratic word, but anyway within IE
it seems Western.
HORSE is
"yuk/yakwe". In Latin "equus", the root is "epo-"
in Gaulish; see the equine goddess Epona. On the East it is "asva" in
Old Indian and "aspa" in Old Iranian; the same root but different
evolution, obviously.
The Tocharian language
seems KENTUM, "100" is "känt". All Western IE languages are
kentum, and no Eastern ones are such,
except Tocharian and Bangani.
Tocharian Passive
suffixes tend to have an -r- sound, as shown earlier. This is rather the rule in
Italic languages, it is so also in Celtic ones, and it is also detected in Hittite
and in Tocharian. Nowhere
else. There are two logical explanations: 1) the -r- Passives are
original, surviving only in Hittite, Tocharian & Italoceltic; 2) the -r-
Passive is an innovation shared by Hittite, Tocharian and Italoceltic. While
the majority now accepts the first explanation, that is theory, not fact.
In any case, the
Western origin of Tocharian, even the origin from the neighbourhood of the
Italic and Celtic, is widely accepted. See e.g. the genealogical tree of
Gamkrelidze & Ivanov [15]. The general explanation for the Easternmost historical position of Tocharian is that
they were the first to start East, maybe in 3500 BC. Then they probably did it
mounted, not on chariots; and all Magyars (and Altaians) know this is the
faster way. Now, there is no strong argument against starting on horseback to East even from the neighbourhood
of Italocelts. So you may (or may
not; as you like) imagine an Italian-Celtic-Tocharian Sprachbund, or even an
ItaloCeltoTocharian subgroup dissolved by the Drang nach Osten of the Tocharians.
Later the Tocharians
did surely meet FU people, more probably Eastern Ugric than Western Finnish.
Arguments are numerous. E.g.:
Stops are all
voiceless in Tocharian. This is also so in FU, except Ugric Magyar and Finnish
Permian, where voiced stops are secondary, maybe since 7th c. AD.
Some Tocharian cases
of nominal declension seem agglutinative rather than inflective, and at least
one is not IE at all, even in meaning.
Dvandva type compounds
occur in Tocharian. An example is "akmal" or "aakmal" =
FACE. Now, the word seems to come from "ak-malan" =
"EYE-NOSE". The Magyar word for FACE is "arc", but in the 19th
c. AD it was still "orca" (the form is now archaic, but still in
use), written then as "orcza". We know that it is a dvandva:
"orr+száj", so NOSE+MOUTH. This is the same principle as in
"akmal", if not an exact mirror translation.
Finally let us apply
the usual demonstrative trick for a wide comparison: numerals from 3 to 8.
Numbers 1 and (sometimes) 2 tend to come from elsewhere
("this", "other" &c.), and numbers 9 & 10 vary even
within IE. (E.g. "9" is "nine" in English, but
"d'evit'" in Russian.) I took the numerals from [16], but consulted a
lot of other sources too, and composed the Anatolian column from 2 languages. So:
Numeral | Gothic | Tocharian A | Oscan | Gaulish | Hittite | Lithuanian | Slovak | Albanian | Sanskrit | Greek | Armenian
Subfam. | German | Tocharian | Italic | Celtic | Anatolian | Baltic | Slavic | Albanian | IndoIranian | Greek | Armenian
3 | threis | tre | tris | treis | tri | trys | tri | tre | tri | treis | erekh
4 | fidwor | s'twar | petora | petor | meiu | keturi | shtyri | katër | catúr | tettares | chorkh
5 | fimf | päng | pompe | pempe | panku | penki | pät' | pesë | pángca | pente | hing
6 | saihs | säk | sehs | suex | ? | sheshi | shest' | gjashtë | sas | hex | vec
7 | sibun | spät | seften | sextan | shipta | septyni | sedem | shtatë | saptá | hepta | evthn
8 | ahtau | okät | uhto | oxtu | haktau | ashtuoni | osem | tetë | astá | okto | uth
I admit, my orthography is somewhat arbitrary here. In principle,
for languages written in Latin script I used the national orthographies; still, in Baltic
& Slavic "sh" and "th" stand for the English sounds. Accent is
not indicated, and length not always.
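The comparison of numerals can be made a little more mechanical. The sketch below is only an illustration and not a method of historical linguistics: it takes the numerals 3 to 8 of four columns exactly as transliterated in the Table and sums a plain edit (Levenshtein) distance over them. The names NUMERALS, levenshtein and column_distance are my own choices, and a raw edit distance on a somewhat arbitrary orthography ignores regular sound correspondences, so the result measures surface similarity and nothing more.

```python
# Numerals 3..8 as transliterated in the Table of Numerals (four columns only).
NUMERALS = {
    "Tocharian A": ["tre", "s'twar", "päng", "säk", "spät", "okät"],
    "Oscan":       ["tris", "petora", "pompe", "sehs", "seften", "uhto"],
    "Gaulish":     ["treis", "petor", "pempe", "suex", "sextan", "oxtu"],
    "Sanskrit":    ["tri", "catúr", "pángca", "sas", "saptá", "astá"],
}


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute or match
        prev = cur
    return prev[-1]


def column_distance(lang1: str, lang2: str) -> int:
    """Sum of the edit distances over the six numerals of two columns."""
    return sum(levenshtein(x, y)
               for x, y in zip(NUMERALS[lang1], NUMERALS[lang2]))


if __name__ == "__main__":
    for other in ("Oscan", "Gaulish", "Sanskrit"):
        print("Tocharian A vs", other, "->",
              column_distance("Tocharian A", other))
```

A smaller sum means more similar spellings; it says nothing by itself about whether the similarity is inherited, borrowed or accidental.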
As for "Hittite", I am combining
"cuneiform Hittite" or Kaneshian with "hieroglyphic
Hittite" or Luwian. I think the main difference is not between synchronic
Kaneshian and Luwian; rather, hieroglyphic Hittite is many centuries later
than the cuneiform. Hittite society collapsed at the center almost synchronously with the
Trojan War, while near the Syrian border small dukedoms
maintained Hittite civilization for a further half millennium, but these local
states used hieroglyphs.
Henceforth I assume that Tocharian was the closest kin of Italoceltic. I still cannot prove it. In a cladistic context the
similarities are not proofs; there they may be simply commonly preserved
archaisms. However I am sure that in linguistics the cladistic approach is only
an approximation. Languages do not split cleanly: neighbouring related languages
influence each other even after they have become distinct languages. One
example is the Romance languages, which influenced each other throughout the Middle
Ages (most strongly Italian & Castilian); another
is the Slavic ones, where in the 19th century this mutual influence was even a
political programme of the Pan-Slavs. In both cases the languages were on the verge
of mutual understanding, at least for passive reading; and mutually readable
languages diverge more slowly. Also, imagine a migration. Clearly after some time the differences
will still be smaller between original neighbours than between original
non-neighbours.
My second reason for the assumption
is definitely not linguistic. Szabédi was the last rector of the Bolyai University of
the Magyar speaking people of Roumania. In addition he was from the same tribe as I. I admit,
it is improbable that he was right about the genetic origin of our language.
But I am positively biased towards
him and his cause: so I think he was not
a madman. And I can find an easy explanation for a mistake.
8. THE RELATIVES OF LATIN
Szabédi discussed the close
relatives of Latin. True, he was
interested in the relatives of Magyar,
and he believed he had found Latin to be one. However such a statement is reciprocal:
when looking for close relatives of Latin,
he believed that Magyar was one of the nearest. But of course, proto-Latin and
Latin’s Italic relatives were even closer. Now, let us start on this inverse
way, and let us remain amongst Indo-European languages.
Surely Latin had its close relatives
in
Now, let us go one step backwards.
What was the nearest kin of the Italics? I think there is no doubt here: the Celts. Many
authors even speak about an Italoceltic stage, so first the speakers of the common
idiom of later Italics and later Celts separated themselves from the Western IE
group, and for some centuries they still spoke dialects mutually
understandable. Where? Outside of
Other authors reject the existence
of an Italoceltic stage, but still Italic was
close to Celtic. Look at the Numerals 3-7. (But I am tricky…)
Now, the next level is: what was the
closest relative of Italoceltic (or: of Italic and Celtic)?
Here the answer is equivocal; and
languages surely did die out even without trace. Venetic is a possible guess
(if it was not even Italic), Illyrian is another. But we do not know too much
about them. Restricting ourselves to the 11 well known subfamilies of the Table of Numerals,
3 non-absurd answers are possible:
1) None; this would mean that more than one (Western) group is at a
comparable distance from Italoceltic.
2) Greek. This was possible for 19th century scholars; and
sometimes the etymologies are surprising indeed. Some authors believed that
Western Greeks (“Dorians”) before their final Southeastern migration lived near
to Italics; some tried with Aeolian-Italic comparisons.
3) Tocharian. When Tocharians appear in Strabo, they are the Easternmost IE people, but the language
is surprisingly Western (kentum, r-passive, similarities in individual words
&c.). This classification was more or less a commonplace in the 20th
century, see e.g. [15]. The general idea is that Tocharian separated from a
Western group, in the general neighbourhood of Italic and Celtic, only the Tocharians
either started early or travelled fast. The means are not trivial; but let us
continue.
Now, let us discuss
very shortly the 3 viable possibilities.
None means that we cannot proceed any further. This
may be true, or may not. We can return to this opinion if we are
unsuccessful with the other two.
Greek is surely a close relative of Latin, but it may not
be a close genetic relative. (So according
to the majority of experts; Garrett has another idea, which you can find in App.
B. That view is quite non-cladistic.) After settling down, Italics & Greeks
were quite near to each other. Maybe Illyrians were between them, but maybe not
even they. So Greek could influence Italic; and it
seems that Mycenaean Greek society was richer/more complicated than the Italic
one(s) about 1500 BC. Then, just after the Trojan War, refugees may have
arrived in Sicily/Italy (this is e.g. the Aeneas story); they might have imported
Greek/Dardanian/Phrygian/&c. words. (Homer is explicit that all Greeks were on
the Greek side. I am not absolutely convinced, having read many examples of
political propaganda. Gamkrelidze & Ivanov [15] believes that Greeks
trekked through
However the third
possibility, Tocharian, is
interesting. It is not exotic/absurd in any case. In this Chapter I remain with this choice. This assumption may be
wrong. And then what? Would this be the first
linguistic idea proven wrong? I will discuss the question further; but only in
an Appendix.
For a Hungarian, the
simplest explanation for the early arrival of Tocharians at (almost)
Westerners generally
believe that wagons and chariots preceded riding. This is based mainly on Greek
traditions. Mycenaean elite warriors were charioteers, not riders. Small horses
are better for pulling chariots than for riding. No Mycenaean or Homeric source
mentions riding.
On the other hand we did find a horserider grave [17] from
between the Tiszapolgár & Bodrogkeresztúr periods, just when some uncouth
steppe warriors represented the Eastern challenge for the
OK, proto-Tocharians
could not be horse nomads. For a Hungarian
a Nomad, and definitely a horse nomad, is not a miserable fellow having nothing
except his clothes. A horse nomad is a product of long evolution, transporting
the whole economy & society on horseback & wagons, including the houses
as Yurts. That way of life was obviously nonexistent before the late stages of
Andronovo, say 1200 BC. Proto-Tocharians might ride, but even then they had to
stop frequently. So what later took the Avars a mere 10 years may have
taken the proto-Tocharians even 200 years. But we have unrecorded millennia.
Linguistic records do
not help too much. Horse is yuk/yakwe, clearly a cognate of Latin equus, a
definitely IE word. The wagon is kukäl, clearly a cognate of PIE *kwel
(wheel), so they started from
9. IF TOCHARIANS WERE THE
CLOSEST RELATIVES OF ITALOCELTS...
Then Szabédi, martyr
of Magyar higher education, was not a madman. (To be sure, this question does not
have to be settled to decide if he was a noble Magyar. To decide his status as Hungarian I should confer at least with
my Slovakian colleagues.) Namely, visualize a near kin of
Italocelts/Italics/Latin on the way to the
We do not know where
and when this happened; but surely on the steppe, earlier than the Andronovo Bronze
Age. The IE component of the Andronovo was very probably IndoIranian
("Sauromat"), and IndoIranians were newcomers in the East compared to
Tocharians.
Surely, Eastern
Uralians & Tocharians did meet, and the meeting was more than transient.
Some of the Tocharian cases of nouns are agglutinative. One case, the
Pervasive, is unheard of in the IE community (except, maybe, for Northern Lithuanian dialects
heavily influenced by Uralic Livonians). Aakmal, EYE+NOSE, is FACE, as
ARC=ORR+SZÁJ (FACE=NOSE+MOUTH) in Magyar (and if you are not convinced, 150
years ago "arc" was still "orcza"). And so on.
"And so on"
stands for research not done so far. Uralists are not too interested in
Tocharians, and Indo-Europeanists generally do not know any Uralic language. But
Tocharians could have transferred lots of words to the Ugric people. And then Szabédi
met with lots of roots which we know to mimic proto-Latin.
Be careful here. For
anybody but Szabédi, Proto-Latin was the language of Alba Longa (whose last
king/dictator, Mettius Fufetius, was defeated by Tullus Hostilius about 665 BC,
see the Horatius/Curiatius story in Livy [18]), or the pure Italic dialect of
Latium just before the arrival of Aeneas. However for a Magyar the differences
between proto-Latin & proto-Tocharian are almost undetectable. I saw and
heard my mother conversing with wives of Russian occupying officers in the
street. Russian is Eastern Slavic and Slovakian is Western Slavic. My mother
was Magyar and I do not know if the Russians detected that fact; but
communication went without serious problems.
Latin
"oculus" is Tocharian "ak/aak". Now, Szabédi derives Magyar
"agy"="brains" from "oculus". I do not know if
the meanings of "eye" and "brains" are near enough; but I
am sure the derivation is easier from Tocharian "ak" than from Latin
"oculus". Also, he tries with the derivation
"ocularia"~"agyar". Here "ocularia" is the
"tooth of eye", the canine tooth. In Magyar it is indeed the
"tooth of eye", and he cites a French construction. But again, if
"agyar" can be derived so at all, it is much easier from
"ak".
Szabédi's idea is that
Magyar "pata", "hoof", comes from proto-Latin *podos,
Latin pedes, foot. Indeed, the hoof is a part/extension of the foot,
"pedes" in Latin. But Latins can hardly have been in the East. But
would you like Tocharian "pe"="pedes"="foot"?
Szabédi claims a
connection between Latin "oriundus"="originating" and
Magyar "eredő"="originating". This etymology may be
"mirageous" (the Magyar catchword for everything scholars do not
believe to be true). But "orior" also has the meaning "comes
up" (e.g. a star). And Tocharian "orto" is "up".
The Magyar
"könyök"="elbow" has good Uralic etymologies as far West
as the Finns. Szabédi, however, tries proto-Latin "genucla".
From this one could indeed get "könyök", although
"genu"="knee"; but for amniotes the two joints are
homologous. But Latins were far from
And now we have
arrived at the obscure points of Orthodox Magyar etymology. Magyar
"piros" is "red", although rather "fire red" than
"blood red". The latter is "vörös", clearly a derivative of
"véres"="bloody". "Piros" has no convenient
Uralic etymology, while "pír"<-"pyr" (Greek; fire) would
be good but historically difficult. Szabédi suggests
"purus"~"piros", but he is "mirageous", the usual
slogan of the Hungarian intelligentsia if the theory or the person is simply not
accepted. (OK, he was a University High Official; but nominated by the
Roumanian, not the Hungarian, Ministry of Education.) OK, and what about
Tocharian "por"="fire"?
For metals we arrive
at a difficult point. It is generally accepted that the only common IE metal
was copper; maybe it was called something preserved in Latin as "aes",
meaning generally "metal", sometimes "copper" or even
"bronze". Greek "metallon" was something "searched
for".
In Uralic,
etymologically related words definitely mean different metals; surely because
the FU linguistic unity, if it existed at all (the original territory seems
rather large), did not survive the Eneolithic (if at all; maybe the last stage was
still Neolithic). Still, there are common metallic words, but their meanings are often different. Let us see
an example. Magyar "vas"="iron" is obviously related to
Finnish "vaski" (Fi -sk-~Ma -s- seems
regular), but the latter means "copper". Copper is "réz" in
Magyar, but the word may come from Slovakian "ruda" (or from some
older IE connection). Szabédi turns to Latin, finds there the word
"vasculum", "small vessel", and argues that some vessels
were made of copper or bronze, and in Uralic the word came to mean the metal.
OK,
"mirageous". Why did a "small
vessel" mean "metal"? However there is a quite classical Latin
word "vascularius", meaning an artificer "manufacturing metallic vessels". I do not know
why; but this was Latin usage. These metallic vessels of the Latins were obviously
made of copper or bronze (otherwise you could not have water in them); or
maybe, for rich people, of silver or gold. In Magyar "bronze" is an
international word, "copper" is maybe Slovakian (although both words
must have existed in 1,000 BC), "silver"="ezüst" is dubious, and "gold"="arany" is surely
from Iranian (e.g. in Avestan: "zaranya"). What are the words in the
language of our closest relatives, the Manyshi?
The words are
different; still we can learn something. "Silver" is "oln".
But "tin" is also "oln". Manyshi
does not distinguish brilliant white tin from brilliant white silver! ("Oln"
is "ón" in Magyar, so the word is related but not the metals.)
As
for the coloured metals, "copper" is "tarn'e"; no Magyar
etymology. But interestingly, "gold" is also
"tarn'e". Magyar's closest kin, Manyshi, does not distinguish red
copper from yellow gold; and white tin from white silver.
Now, wait a moment.
"Silver" in Latin is "argentum", and in Tocharian B
"white" is "árki". This might be an accident. But in
Tocharian B "wäs" is "gold". Now, is Finnish
"vaski"="copper" (recognised by "mirageous"
Szabédi) simply an accident?
And is the similarity
in animate pronouns La "quis"~Ma "ki" an
accident? It is "kus" in Tocharian! (OK, maybe the origins are common
from Nostratic...) Similarly, La "sal"~Ma "só"
(="salt") may be "mirageous", but Tocharian
"sále"~Magyar "só" is not absurd. And, at last: Tocharian
is a Kentum language. "100" is "kant". Now,
"army", "host" is "had" in Magyar. (Now rather "hadsereg", but originally it was
simply "had".) "Had-nagy" is now "lieutenant", but
this title originally meant a leader of the host; and "hadra kel" is
"goes to war".
Now "had"
has a very good Uralic etymology. Magyar "h" before back vowels came
from "k", and "d" from "nt". So Magyar
"had" should come from "*kant".
Indeed Magyar "had" is etymologically identical with Manyshi
"khant", Khanty "kant", Mordvin "kon'dä", Finnish
"kunta". In Mordvin the word now means "family", in
Finnish, "community" [19]. But everywhere, a
"coherent group".
Now, "100"
is a big number. In contemporary Hungarian military language, a
"100-ad" (század) is a body led by a Captain. (Although
he leads rather several hundreds.) Captain's equivalent in Ottoman
Turkish is "yüzbasi" and that is simply "head of 100". And
"100" is "kant" in Tocharian. Is it a mirage? Am I a
madman? Was Szabédi a madman?
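The two correspondences used above for "had" (k > h before back vowels, nt > d) can be written as ordered rewrite rules. The following is a minimal sketch under my own assumptions: the list of back vowels, the rule order and the function name to_magyar are illustrative choices, and real historical phonology needs many more rules and conditions than these two.

```python
import re

# Assumed set of back vowels for this sketch (not taken from the text).
BACK_VOWELS = "aáoóuú"

# (pattern, replacement) pairs, applied in this order.
RULES = [
    (re.compile(rf"k(?=[{BACK_VOWELS}])"), "h"),  # k > h before a back vowel
    (re.compile(r"nt"), "d"),                     # nt > d
]


def to_magyar(proto: str) -> str:
    """Apply the two correspondences to a (possibly starred) proto-form."""
    form = proto.lstrip("*")
    for pattern, replacement in RULES:
        form = pattern.sub(replacement, form)
    return form


if __name__ == "__main__":
    print("*kant ->", to_magyar("*kant"))
```

With these two rules "*kant" comes out as "had", which is all the sketch is meant to show; it does not distinguish between an inherited Uralic *kant and a borrowed one.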
He was not right. He
did not know about Tocharians because great Brugmann did not know about them. I
think even Nicoleau Ceausescu, Head of the
Here the main part
ends. What remains, goes into Appendices. Namely, I, as a physicist, have
learned that Absolute Truth does exist (should exist?), even if we now do not
necessarily know it. The Appendices formulate my Notes about this unknown
Absolute Truth. And beware: different Appendices contain mutually inconsistent
scenarios. So maybe you must not simply add up their contents.
APPENDIX A: LORD KALVAN OF
OTHERWHEN
If one has no empathy
for science-fiction, one should skip this Appendix. However
even for such persons App. A demonstrates how easy it is to forget about
Tocharians.
Alternative History,
although it is mainly science fiction, has quite strict rules. You assume that things can happen also
differently than in Our TimeLine (OTL) and may even assume that highly
developed future science/technology will be able to visit the ATL’s
(Alternative TimeLines). If somebody asks why to assume this, you may refer to
H. Beam Piper in the
40’s invented a scheme. World is five-dimensional, the fifth is time-like, and
one civilisation of the multiple humanity happens to learn Crosstime Travel.
Afterwards they can exploit the other TimeLines (mining &c.) Now, obviously
“neighbouring” TL’s are similar if “equivocal alternatives” are rare. Piper’s
map of ATL’s is of course incomplete, but interesting.
There are 5 Levels. In
this scenario Humanity came from Mars, fleeing from loss of atmosphere &
such. The exodus happened under dire conditions and on lots of TL’s it was more
or less catastrophic. On Level 5 TL’s the exodus completely failed and Earth
remained for autochthonous Neanderthals. In OTL (Level 4) one spaceship
performed the travel but could not settle down; one of her scout boats arrived
with a few humans who then devolved back to Stone Age culture. On the other
hand, Level 1 TL’s had quite a substantial human population transplanted, with
a great part of the original Martian science & culture. So the so-called Home
TimeLine is far ahead of us.
Very
similar TL’s form a Subsector.
OTL belongs to the Level 4 Europo-American, Hispano-Columbian one, but there are
others on Level 4 as well, e.g. Indo-Turanian or Sino-Hindic. (Names refer to the
dominant cultures.) On the somewhat more advanced Third Level there is e.g. an
Alexandrian-Roman Subsector.
Now, Lord Kalvan of
Otherwhen is originally Calvin Morrison, a cop from
Lord Kalvan of
Otherwhen (I mean, the novel) is posthumous, composed from "Gunpowder God" and
"Down Styphon!". So Piper did not elaborate the details as much as in other
works. However the Zarthani are light-complexioned; and they are Aryan. The
subsector is called (on Home TimeLine) Aryan
Transpacific; the TL’s where at some time in the past Aryans crossed the Pacific.
Now, in the 19th
century, and also in Hitler’s circles, Aryan and Indo-European were synonyms. But not for 20th century science or for Piper.
For us Aryan is the synonym of IndoIranian: the term Iranian comes from the
root Aryan, in translation “noble”, “free” and such.
Piper died before
making the last polish on Lord Kalvan of Otherwhen. His idea was continued by
Carr. And in the Prologue of R. Green & J. F. Carr’s Great King’s War [21]
on Home TL the greatest Aryan Transpacific expert Danthor Dras simply states:
“…Zarthani, as this group of the Sanskrit-speaking Indo-Aryan settlers called
themselves…”. Obviously, IndoIranians can be the
Easternmost IE people only if Tocharians’ Drang nach Osten has been aborted
somehow.
Even in the 80’s for
people interested in history the par excellence Eastern Indo-Europeans seemed
to be the IndoIranians, not the Tocharians. Everybody forgets Tocharians, not
only Szabédi.
APPENDIX B: WAS PROTO-GREEK
SIMPLY A NUCLEAR INDO-EUROPEAN DIALECT?
Ref. [22] sketches
a scenario diametrically opposite to Cladistics or Stammbäume. Strictly
speaking, Garrett presents evidence only for Celtic, Italic and Greek (and
Hittite), but for the others written records are simply not old enough. So, if you accept his arguments for these
Western ones, it is possible that the same was true for the whole family as well.
He was not the first
preferring such a scenario. But maybe he was the first to prove it for Proto-Greek.
We do have Mycenaean
texts from cca. 1400 BC, so we know that Mycenaean was not the common ancestor of all
1st millennium BC Greek dialects. Rather Mycenaean was one of the
dialects at 1400 BC, the only one recorded.
Let us see an example.
“They are” is “ehensi” in Mycenaean,
and in the 1st millennium the –si
can be found in Arcado-Cyprian (Mycenaean’s closest kin), Aeolic and Ionic, so
roughly in the Eastern dialects. But “they are” is “enti” in 1st millennium Western Greek, and it seems that –ti was the older. So Mycenaean, Aeolic
and Ionic form a group (“common innovations”), but Western Greek is not the
successor of Mycenaean.
So
far so good. But then the common ancestor of all Greek dialects (if it
existed…) was spoken before 1500 BC. No
problem even in
Garrett’s answer is:
“almost nothing”. To be more definite: the vowel structure of Greek is quite
conservative, the verbal system is quite archaic, for consonants only the First
Palatalisation may have happened before (or during) Proto-Greek, and for noun
inflexion the only specific Common Greek innovation is the merger
(coincidence) of Dat and Loc in the Plural in –si; the PNIE Loc ending was –su.
Garrett gives some analogical causes for the –su>-si change; I say that in
Proto-Greek times lots of autochthons learned PNIE in Greece, so some
simplification was inevitable and the prepositions could take over the task of
fine discrimination. So this difference is negligible. His conclusion is that
the only important difference between PNIE and Proto-Greek was the pre-Greek
part of the Greek vocabulary.
So before 1500 a
(Proto-)Greek individual could converse with his geographic neighbours, who were not
Greek, if he avoided words exotic for
them. So he used circumlocutions, synonyms and such.
As is well known,
Magyar is almost without dialectal differences, and very, very different from
its neighbours. Still, there is a Central Northern dialect using different endings
for the fundamental triad of directions. The
“at/from/to” triad is expressed by the “-nál/tól/hoz” suffixes in the
literary language, while in the mentioned dialect there is also another triad,
“-nott/nól/ni”. The choice between the triads is tricky. This is obviously
a bigger difference than the –su/-si one between PNIE & Proto-Greek. (I could
explain what is common in –nál and –nott. But I guess you are not really
interested…)
And now let us see
verbal suffixes from the 1st millennium. (Mycenaean texts, being mainly
inventories, are poor in verbal suffixes.) For 1Pl Praes the ending was –mes in
the West (so in the Adriatic neighbourhood), but –men in the East (in the
Aegean region). Now this might be an unimportant minor variation; but Garrett
calls attention to external relations. In the Adriatic region Italic uses
a final -s (as in Latin –mus; it is even possible that in the
other Italic languages the suffix was –mes;
“they are” is “sunt” in Latin, but
“sent” in Oscan), while in the East,
in Anatolia, Hittite used –wen. So it
seems as if before 1500 BC a continuum of IE dialects existed from the
Tyrrhenian to Hattusas,
with mutually intelligible neighbours, in first approximation no Greekness. In second approximation,
some people at the Southernmost
I am not arguing for
or against; but obviously you cannot describe a continuum with a Stammbaum.
APPENDIX C: COMPUTER
SIMULATIONS VIA CLADISTICS
Cladistics was worked
out as an evolutionary theory for genealogy in
biology. The idea is: if you want to reconstruct the descent of related
species, in first approximation
similarities should not be used. Namely, because of the common origin,
similarities are expected. Now, differences
need explanation, they originate from innovations, and the same innovation
appearing independently in more than one species is improbable. So consider a
number of species from A to K. Assume that a characteristic is x
for species A, B, D, H, I and K, y for C and G, and z
for E and F. (Anything is good, if it
is irrelevant enough, so not
produced by selection. Number of teeth, number of fingers,
form of skull &c.) Then maybe x is the ancestral state, appearing
in most species; y and z are different innovations. So then
C classifies together with G, and E with F; C and G had an extinct common
ancestor in which Character y was already present, and E and F had
another with Character z. Maybe A, B, D, H, I and K look
similar, still they may be only quite distant relatives: the similarity simply
comes from the fact that they preserved the "primitive" state. Using
many characteristics you can make alternative genealogic trees, and then the
simplest (with minimal number of innovations, e.g.) or most parsimonious has
the best chance to be true. For example, dolphins and ichthyosaurs are quite
similar to some fishes, but that comes from selection for streamlined forms in
an aqueous environment. On the other hand, coelacanths (e.g. the living
Latimeria) classify together with amphibians & land tetrapods, not with
"fishes", although Latimeria looks quite fishy. Namely, Latimeria
"fins" are real bony extremities like tetrapod legs and arms, while
piscine fins are not.
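The x/y/z example can be put into a few lines of code. This is a minimal sketch and not a real cladistic program: it handles a single character, and it simply assumes (as the "maybe" above does) that the most widespread state is the ancestral one, whereas actual practice uses outgroups and searches for the most parsimonious tree over many characters. The names STATES, ancestral_state and innovation_groups are mine.

```python
from collections import Counter, defaultdict

# One character scored for the species named in the example.
STATES = {
    "A": "x", "B": "x", "D": "x", "H": "x", "I": "x", "K": "x",
    "C": "y", "G": "y",
    "E": "z", "F": "z",
}


def ancestral_state(states):
    """Assumption of this sketch: the most common state is the ancestral one."""
    return Counter(states.values()).most_common(1)[0][0]


def innovation_groups(states):
    """Group species by a shared derived (non-ancestral) state."""
    primitive = ancestral_state(states)
    groups = defaultdict(list)
    for species, state in sorted(states.items()):
        if state != primitive:
            groups[state].append(species)
    return dict(groups)


if __name__ == "__main__":
    print("ancestral state:", ancestral_state(STATES))
    print("groups defined by innovations:", innovation_groups(STATES))
```

Note what the function does not return: the six species with state x form no group at all, because a shared primitive state is no evidence of close kinship.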
A natural idea was to
apply the technique then for linguistic
evolution as well. I will mention differences of principle for which it is not
trivial that the method is applicable there; but first let us proceed.
About IE genealogic
trees one of the most cited is the Ringe-Warnow-Taylor tree [23], [24]. The
resulting tree can be narrated even via everyday terms; of course completely
extinct and unknown branches may have existed too.
Albanian's position is
uncertain, and that of German is anomalous. We do not know enough about
Thraco-Phrygian & Illyrian. But otherwise:
There was first PIE,
the Indo-European primordial language. Anatolians separated first. The
ancestors of Tocharians did so second; then the common ancestor of
Italic and Celtic separated. The next separating branch was the common ancestor of Greek
and Armenian(!). After this branching the ancestors of
later Baltic, Slavonic, Iranian and Indian still remained together for a while,
and then the last big forking was into Baltic + Slavonic on one hand and
IndoIranian on the other.
German maybe borrowed
words & grammar heavily from a substrate or neighbour, Ertebölle, and/or
moved between central Baltic and Westernmost Celtic, and these facts made its
evolution anomalous.
As for times of
separation, the most parsimonious tree gives only lower and upper bounds.
However we can supplement those with some historical data, uncertain as
they are. Using [23] and common sense we may get the following results:
Anatolian
left the IE community (or the other way round?) about -3800. It seems that by this time
the IE word for either wagons, or chariots, or wheels already existed. At
least, there is a reconstructed IE word, *kwel (or *kwekwel),
which relates well to English "wheel" or Greek "kyklos". The Hittite is
"hurkis", but remember that the -s is not part of the word but merely the
Nominative ending (as in Lithuanian, Latvian, and almost as regularly in Latin
& Greek, but in the latter 4 languages only in the Masculine).
Of
course, the Indo-European family is big, and some groups lost the *kwekwel
"wheel word". It seems that Italics lost it; in Latin the place was
occupied by "rota", "rotare" &c., if "circum"
is not related to *kwekwel. "Hurki-" is also not too similar; but wait a moment.
The remaining languages were
together until cca. -3300; then the ancestors of Tocharians left (surely
towards the East). Unfortunately we have no mentions of Tocharians for more than 3
millennia. "Wagon" is "kukäl", and "wheel" is
"wärkänt", in any case.
And look:
"wagon", "kukäl", is a relative of the wheel word,
*kwekwel, and the dictionary entry for "wheel", "wärkänt",
is not; but "wärkänt" can be a relative
of Hittite "hurki-". Common innovation? Are
"hurki-" & "wärkänt" related
to "circum"? I do not know and I would be told "Ne sutor ultra
crepidam!"; but at least the -rk- part is common.
Anyway, Tocharian had wagons with wheels, and also a genuine IE word for
horse: yakwe~equus. So they could migrate speedily.
After 4 more
centuries, in -2900, the ancestors of Italocelts leave the community (surely,
towards the West); the Italoceltic unity remains until, say, -2400, and thenceforth
we are with separate Italic and Celtic. Germans and Albanians left the
community either a bit earlier or a bit later than the Italocelts.
Common ancestor of
Greek and Armenian leaves the community a bit later, in -2800. This branch
forks later, but we simply do not know when. But anyways, if the scheme is
correct, the Satem innovation is after -2800, because
Greek is Kentum and Armenian is Satem.
The last big forking,
Baltic+Slavonic vs. IndoIranian seems to happen in -2200.
This is a definite
story, maybe true, maybe not. We do not have IE records from -2200. Our
earliest IE text, the Proclamations (and Curse) of King Anittas of Kanesh, was
composed about -1900; the text available to us is from not much later.
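The story told above is a tree with dates on the forks, so the question "when did two branches part?" has a mechanical answer: it is the date on the last fork they share. The sketch below encodes only what was narrated; German and Albanian are left out because their positions are uncertain, the date of the Greek-Armenian fork is None because we do not know it, and the nested-tuple layout and the function names are my own choices.

```python
# Internal node: (date, [children]); leaf: branch name.
# Dates are the rough figures quoted above (negative = BC).
TREE = (-3800, [
    "Anatolian",
    (-3300, [
        "Tocharian",
        (-2900, [
            (-2400, ["Italic", "Celtic"]),
            (-2800, [
                (None, ["Greek", "Armenian"]),   # fork date unknown
                (-2200, ["Baltic+Slavonic", "IndoIranian"]),
            ]),
        ]),
    ]),
])


def path_to(node, leaf):
    """List of internal nodes from `node` down to `leaf`, or None if absent."""
    if isinstance(node, str):
        return [] if node == leaf else None
    for child in node[1]:
        below = path_to(child, leaf)
        if below is not None:
            return [node] + below
    return None


def separation_date(a, b, tree=TREE):
    """Date on the last fork shared by the paths leading to a and to b."""
    path_a, path_b = path_to(tree, a), path_to(tree, b)
    if path_a is None or path_b is None:
        raise KeyError("unknown branch")
    last_common = None
    for node_a, node_b in zip(path_a, path_b):
        if node_a is not node_b:
            break
        last_common = node_a
    return last_common[0]


if __name__ == "__main__":
    print("Tocharian / Italic:", separation_date("Tocharian", "Italic"))
    print("Italic / Celtic:", separation_date("Italic", "Celtic"))
```

In such a tree Tocharian is exactly as far from Italic as from Greek or IndoIranian, which is the point taken up in the next paragraph.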
If this picture is
true, then the kinship of Tocharian and Italoceltic is nothing meaningful in
cladistic context. Namely, the most striking 2 common characteristics are the
common KENTUM behaviour and the common -r Passive.
Now, people generally
accept that PIE was a KENTUM language. Then in cladistics Kentum languages are
not necessarily kin while Satem ones are: [k]->[s]
is a common
innovation. (Was it really? It happened independently in Romance! But let
us continue.) Before cladistics people argued that Satem would be Eastern,
Kentum Western, and so Far Eastern Kentum Tocharian must have originated in the
West. However maybe the peoples at the peripheries were simply no longer active
members of the IE community; their languages remained archaic, while
the core territory underwent the K->S change (Baltic, Slavic, IndoIranian,
Armenian, Thracian, Phrygian). If so, we can guess the
time and can draw the border of the periphery. Kentum Italic, Celtic, Germanic,
Tocharian, Greek and Anatolian were already separated in -2200, and Italoceltic
at the West, Greek at the South, Hittite in
As for the r-Passive,
it is really present in Hittite, Tocharian, Italic & Celtic, and it is not
even a real Passive in Hittite/Anatolian. There it is connected rather with a
disturbingly non-IE thing: animate vs. nonanimate, Ergative vs. Nominative
constructions. This may be the ancestor of Active vs. Mediopassive; but the
whole construct is reminiscent of non-IE Basque or Caucasian. (I do not suggest however that this Hittite
characteristic would come from Caucasian Hattic. True, no Ergative construction
is known in any non-Anatolian IE language; but this does not mean that the Ergative
was unknown in early enough times. And as for Animate vs. Inanimate: Latin
o-stem Masculine and Neuter nouns differ only in the Nominative: the Neuter Nominative
looks just like a Masculine Accusative, as
if a Nominative were impossible for
an Inanimate/Neuter. The same in Slavic.) But the
r-sound is there.
Then in pure cladistics
the r-Passive is simply a primitive character preserved in 3 (or 4) branches
and given up for various innovations by others.
If the
Tocharian r-Passive is indeed a preserved archaism, then Szabédi cannot have
been misdirected by Tocharian (because then proto-Tocharian meeting Common
Ugric in the East would not have been too similar to Italic). However the above
simple & logical scheme is far from explaining everything. And look: it
seems to me that Hittite and Tocharian share an innovation, namely hurki/wärkänt
for "wheel", instead of "kwekwel". And if
we take "circum" here too...
In the next Appendix I list some problems remaining,
including arguments against cladistics in IE linguistics. To be sure,
Cladistics carries an important amount of truth; but pure biological cladistics
cannot be applied to linguistics. You will see my points in due course.
APPENDIX D: FORKING ONLY?
Cladistics (elaborated
in eukaryote biology) contains a very strict rule in its basis. Let us speak in
the language of Zoology for simplicity. Now we have humans, chimpanzees,
gorillas, orangs, gibbons and monkeys. They are all kins to each other, but the
degrees of kinship vary. Let us measure the closeness (in immunology, amino
acid sequences of proteins, or by any other reasonable way), then it turns out
that our closest kin is the chimpanzee, then the gorilla, then the orang, then
the gibbons (two genera), and only
then the monkeys.
This is not really
surprising; anyone would guess so. But: from the viewpoint of the chimps, we
are the closest kin, the gorilla is second closest, and
so on as above. From the viewpoint of the orang, human, chimp & gorilla
are classified together. And so on.
Now, from the
differences one can find out the branching times using the correct tree. But we
never (well: hardly ever) see two
eukaryotes from different species hybridize: the elaborated mechanism of
mitosis, meiosis &c. would not permit that. "Macroscopic"
hybridising efforts are either utterly unsuccessful (say, between man and mare)
or at least the next generation is sterile (mules). So branching happens
upwards in time, and two branches cannot unite. (For plants this is not so
strict; or do we define plant species incorrectly?)
Now, Cladistics also
assumes that each forking results in exactly two new species, while the old one
ceases to exist. At first sight anybody could find counterexamples;
however, in usual zoological contexts the difference belongs to pure philosophy.
Consider a species A.
Evolution then produces a new species B via, say, gene duplication &
differentiation, or Robertsonian translocation of chromosomes, or such. E. g. we
know that our Chromosome 2 is the fusion of two chimpanzee chromosomes. Maybe
this reorganisation was the first/most important step towards mutual
infertility and so the divergence of humans and chimps.
Now, henceforth the
original and the mutated species represent two isolated gene pools. For a
limited time the original species A is still there, but indeed we
cannot see the details of geologic Past from Present very well. For any
practical reason we may tell that the original species A gave life to two
successor species B & C, and itself vanished. Good, C
is more similar to the common ancestor than B; and then what? It is
very rare to have continuous descent sequences without gaps.
So, there was an LCA
(Last Common Ancestor) of humans and chimps. We can guess the time when it
forked into human & chimp (both in the general sense); maybe it was 5.1 My ago [25]. Since neither the LCA nor the beings just after it
have been found, we cannot even discuss the pairwise
similarities. Anyway, A, B and C
are all classified into Genus Homo; the "first human" into Subgenus
(Homo), the "first chimp" into Subgenus (Pan), and the LCA into neither of
the Subgenera; if need be, a third can be defined.
Since we do not know
every species from fossils, even the most parsimonious tree will have
reconstructed species; but in any case there will be a most parsimonious tree.
Why can we tell that both B and C are new species, even if only B changed, and really C
was (for a while) identical with A? For simplicity, let us speak as
if life went in disjoint generations. LCA lived in Generation 137; in
Generation 138 the mutation appeared. However individuals of Generation 138
mate within that generation. There are two gene pools, mutually infertile. We
call one B, and the other C. Cross matings B-C
are infertile, so no hybrids appear; cross matings A-B and A-C
are neglected, but they are rare enough (and A-B ones are infertile,
while A-C ones are indistinguishable from C-C ones). From
incomplete fossils we cannot tell if C was or was not "the
same" as A, and after several forkings the details become no more real
than the number of angels on the point of a pin.
OK, in some cases the
successors may remain cross-fertile to a limited extent. Robertsonian
translocation, for example, results in limited cross-fertility [26]. And then what? The hybrid population very probably vanishes
in a few generations. The essence is: eukaryote ways of proliferation guarantee
simple forkings, always upward in time. Triple forkings are logically impossible
with the above convention: we may always say that A did not fork into B, C and D,
but first A forked into B and C,
and somewhat later C forked into D and E. A, C & E
are rather similar but not identical; B and D diverged more.
And now let us see a
linguistic case, the best documented major one: the dissolution of Latin around
the end of the Empire.
We have cca. 10 recent
successor languages (the difference between language & dialect is somewhat
arbitrary, government-dependent & such), namely
Catalan (Ca)
Dalmatian (Da)
French (Fr)
Italian (It)
Portuguese
Provencal (Pr)
Rhaetoroman (Rh)
Roumanian (Ru)
Sardinian (Sa)
Spanish (Sp)
True, last speaker of Da, Tuone Udine Burbur, died in 1898 on
a small island of the
It may be that
patriotic French do not like to call Pr
a separate language, but a dialect of Fr;
on the other hand, some Gascons would like to define their idiom as an 11th
Neo-Latin language. Also, I should have mentioned Aromun, Meglenoromun &
Istroromun. These languages (not dialects) are important in the so-called
Daco-Roumanian Argumentations; but these pseudo-scientific battles are unknown
to anybody not Roumanian or Hungarian. The above list of 10 languages is
practically enough.
Everybody can learn
that these 10 languages are not pairwise equally distant. Italian (It) is more similar to Sp
and Pr than to any other.
En | Ru | It | Sp
Lady | doamna | donna | dona
Ladies | doamne | donne | donas
while otherwise Book Italian is
near to Book Spanish; people of high schools can read the books of the other
language; while nobody native in
OK; we must collect
lots of characteristics before finding the most parsimonious tree.
For a moment let us
look for branching times. The most formal way is to look for the first written document ("oldest
fossil") of the respective language. It is easy to compose the list as we know now (all AD):
Language | First Text
Ca | 13th c.
Da | 13th c.
Fr | 842
It | 960
Portuguese | 12th c.
Pr | 11th c.
Rh | 12th c.
Ru | 1521
Sa | 11th c.
Sp | 10th c.
However in some cases
the first document is obviously from well after the birth of the separate
language. The most obvious example is Ru:
Hungarian sources know about Roumanians from the end of the 13th century,
arriving from
In several other cases
the above time is more or less the moment when laymen no longer understood
written Latin well enough, so some legal documents had to be composed
also in lingua rustica. Definitely
this is the case for It;
the first document is the Placito di Capua, a short text of an oath, made by a lady about the property of some land.
Maybe gentlemen still made the oaths
in a language they believed to be Latin.
For Fr we know some details. In the 6th century
priests wrote Latin, laymen spoke something they believed to be Latin, but no written documents
of it survive. The 7th century was the nadir of the Latinitas of Gallia; the so-called
Fredegarius Chronicle is in principle in Latin, but at points simply
unintelligible to posterity. When the Carolingian kings came to power in the
second half of the 8th century, the Latin of the legal documents &
chronicles started to improve, and the quality became quite good under
Charlemagne. And then it turned out that laymen no longer understood Correct
Latin. So the Synod of Tours declared in 813 that the priests should translate
some texts "...in Rusticam Romanam linguam, aut in Theotiscam...", that is,
into village Roman or German, so that the people might understand. Surely some texts were
written immediately after that, though none are extant. But only 29
years remain until 842…
As for Spanish, we do
know about a Southern Spanish language spoken in Andaluz, Mozarabic; it is
still used to a limited extent in two churches. The present Spanish is the
Northern variety. The Muslim occupation caused a divergent evolution. The
end-product is 4 Neo-Latin languages in the
However the formation
of a new language does not mark the
split of the original linguistic pool into two. Near kin languages remain mutually
more or less understandable for centuries. In such a way "gene flow"
exists between separate languages. Also, until 476 the Roman Empire existed and
Latin was official everywhere, so in principle no Neo-Latin language could
start before 476; but we know that Neo-Latin languages indeed come mainly from
Vulgar Latin, not from the official High Language; and Vulgar Latin had serious
geographic varieties in 476.
Surely,
OK. Ru became gradually separated from an
Eastern Vulgar Latin after 395, while Sa from a Western Vulgar Latin gradually after 238. So the
differences are caused by
A) the
separations, which, however, were gradual;
B) the
original differences between farther or otherwise isolated areas; and
C) the
different substrates/superstrates/adstrates.
Cladistics would correctly handle A) if the separation were not gradual, but not B) or C). The
decomposition of PIE very probably had all three factors. PIE was spoken about
-4000 on a rather large territory of the steppe from the Carpathians to maybe
the Caspian. The subpopulations had an effective way of transport via horses,
either ridden or with wagons/chariots (wheel, kyklos, hurki &c.), but
surely neighbours generally remained neighbours. So there were geographic
gradients of PIE, foreign to cladistics. When a group moved away to non-IE
territories, the separation of the group was similar to speciation in
cladistics (divergent evolution starts), but when a group moved from one
PIE part to another, that was an un-cladistic event (see the problems with
Germans wandering between Balts & Celts). Also, locally different
substrates could increase the differences between neighbour dialects (Ertebölle
in
Hittite even shows a
superposed effect of both substrate and superstrate (as for adstrate, who knows?).
OK, Anatolian separates in -3800. It is an interesting question if
So proto-Anatolians go
to
Were I a proud
Indo-European, I would certainly be surprised. Not being an IE, I am not too
interested. Maybe proto-Anatolians were a really small tribe somewhere in
Easternmost Anatolia, learning improved agriculture for almost 2 millennia from
Caucasians. Then, after the necessary acculturation, some chiefs occupied some
cities of the Hattic agriculturists while partially preserving martial skill,
and then Anittas, son of Pitannas, destroyed the Hattic metropolis Hattusas.
(Note that I do know that according to formal rules of English I should have
written "of Pitanna". English prepositions do not go with Nominative but
with the Common Case or Accusative: of me,
not of
Tocharians started from somewhere on the steppe. Without
original gradients and without going away on airplanes this would be a nice
cladistic situation: Tocharian keeps the linguistic stage of their departure,
apart from what they themselves innovated. However
1) their original language was most similar to that of their neighbours, independently of the
time of departure; and
2) if they were not the Easternmost in the PIE homeland, then while moving Eastward they contacted
linguistic kin on the steppe, with whom they had limited
mutual understanding. These interactions "contaminated" the
language.
For our purpose we
should know the Tocharian vocabulary about -1500 when they met proto-Ugric in
Westernmost Asia. But we have only Buddhist religious and medical texts from
+800! If we could guess the original position, we might take the languages of
synchronous neighbours. But Cladistics does not answer this question.
Cladistics suggests the answer that similarities with Italoceltic (&
Hittite) come simply from the early departure. However some fundamental
assumptions for using Cladistics may not be valid!
I cannot resolve this
question. True, Swadesh word lists and similar quantitative “measures of
similarity” seem to help. But the problem is complicated enough. Let us see a
simple and transparent approach.
One may take all the known IE words of
Tocharian with their IE roots, from e.g. Pokorny [27].
Now you can look for the same root in other IE languages. Following
e.g. [28], for simplicity let us choose 6 groups for the “others”: Celtic,
Germanic, Greek, Hittite, Italic and all Satem; then Albanian is ignored, as well
as the poorly known extinct ones such as Phrygian, Thracian &c. Then [28] tells for every known IE word in Tocharian whether it has
cognates in each of the 6 other groups. (Of course, if a Tocharian word
is accepted as IE, the root occurs in some other IE language(s) too.)
However, as you can
explicitly see from [28], Pokorny’s list contradicts Watkins’s work on
the origins of English words [29]. Namely, there are e.g. Tocharian words whose
IE ancestors had no successor in the Italic subfamily according to Pokorny,
while according to Watkins there is a successor in Modern English through Latin. Obviously some
etymologies were accepted by one author and rejected by the other. Then we may
choose between the approaches; here I accept the extra etymologies of Watkins
too, but use only the Pokorny roots. Then:
Tocharian total | 296
In Celtic | 192
In Germanic | 254
In Greek | 237
In Hittite | 69
In Italic | 224
In Satems | 251
and you might conclude that Tocharian was
nearest to Germanic, then to the Satem languages, then to Greek and so on.
Hittite is farthest.
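The naive ranking just described can be reproduced in a few lines. This is only a sketch: the counts are those tabulated above, while the variable names and the use of plain sorting are illustrative choices, not part of the original analysis.

```python
# Counts of Tocharian IE roots with cognates in each group, as tabulated above.
TOTAL_ROOTS = 296
shared = {
    "Celtic": 192,
    "Germanic": 254,
    "Greek": 237,
    "Hittite": 69,
    "Italic": 224,
    "Satem": 251,
}

# Naive "nearness": the more shared roots, the nearer the group.
ranking = sorted(shared, key=shared.get, reverse=True)

# The same counts expressed as fractions of the Tocharian total.
fractions = {group: count / TOTAL_ROOTS for group, count in shared.items()}

for group in ranking:
    print(f"{group:9s} {shared[group]:4d}  {fractions[group]:.4f}")
```

The ordering says nothing yet about how many languages, living or extinct, stand behind each group.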
However such a method
would contain a tremendous bias. The number of Satem languages is above 100,
many living, while Italics are really 3 even if the Neo-Latin divergence
produced 10 very similar successors of Latin; and Hittite means the relatively
well known but long extinct Neshian (syllabic Hittite), the hieroglyphic Luwian
of not so directly known vocabulary, and the hardly known Lycian; all extinct.
So in any case we expect many fewer hits in Hittite than in
Satems.
Still, this material
deserves further statistical analysis; and in any case we can tell that about ¾
(224 of 296) of the IE roots preserved in Tocharian were
preserved also in Italics.
Such approaches seem
diametrically opposite to Cladistics if we apply them according to an extremal
hypothesis: strong geographic variations in the Urheimat without temporal
differences (“synchronous breakup”). Appendix F will present Stetsyuk's approach,
based on taking Nostratic very seriously [30], which eliminates at least
partially the above documented bias.
APPENDIX E: SOME STATISTICS
At the end of the
previous Appendix we saw a bulk number telling that
1) There are 296 IE
roots in Tocharian.
2) Of these 296 roots other subfamilies
preserved so & so many.
Both statements can be challenged. The number 296 is rather limited,
but the extant Tocharian literature is limited too: mainly Buddhist theology
& medicine. So the meaning of this 296 is not that far-migrated Tocharian
language would have diluted amongst strangers, but simply that the majority of
the roots are not preserved in the written documents at disposal. The number
will increase.
Similarly, the
statement that, say, of these 296 roots xA are found in Anatolian
languages, depends on our present knowledge about Anatolian, on the present
status of IE art and on stricter or looser criteria.
However for the
present study let us accept the numbers at face value. In the previous
Appendix I narrated how the numbers were obtained: they are primarily from
Pokorny [27], but corrected with the Germanic, Romance and Greek etymologies of
Watkins [29] for English.
The main goal of this
Appendix is to demonstrate the application of the methods of statistics
in linguistics. According to present fashion in scholarship I should rather
refer to computer calculations, e.g. the good qualities of cladistic
software packages on the market. Instead I discuss explicit connections
between quantitative statistical data and evolutionary hypotheses.
We have N (=296) roots which, according to consensus, authorities or
such, were present in the ancestor language (IE, PIE, Indo-Hittite or any).
If the descendant of such a root cannot be found in one daughter group,
then there are 2 possibilities:
1) There is a continuous decay of the original dictionary. Words/roots are being substituted.
If the subgroup contains only a small number of languages, then this loss can
be significant. If the number of languages is high enough, complete
substitution is rare; the root remains in a few languages, and then it is
present in the subgroup. For an improved statistical analysis multinomial or
Poisson distributions would be adequate, but in first approximation it is true
that during a given time in the ith subgroup $p_i$ is the probability of losing a root, so the
number of remaining roots is
$N_i = N(1 - p_i), \quad 0 < p_i < 1$ (E.1)
2) Now, we may not
find the root even if it still is/was present in some languages. For extinct
languages the main factor in not finding the root is the limited extant
dictionary. This effect results in first approximation in an extra factor $w_i$:
$N_i = N w_i (1 - p_i), \quad 0 < p_i < 1, \quad w_i < 1$ (E.2)
However, this means that for practical purposes we can write
$w_i (1 - p_i) \to q_i$ (E.3)
The $q_i$'s can be taken from the etymologies. Also, q's with
more than one index can be calculated from the
Tocharian IE etymologies, when the root is found in more than one other IE
subgroup. Now I am giving all the data, according to
the notation
1: Celtic
2: Germanic
3: Greek
4: Hittite (or Anatolian)
5: Italic
6: Satem languages
Obviously the smallest w factor is expected to be w4; maybe
w6 is closest to 1.
The q's with 1 index
were practically given in the previous Appendix, but for a homogeneous structure
I repeat them:
$q_i = \{0.6486, 0.8581, 0.8007, 0.2331, 0.7568, 0.8480\}$
Indeed, $q_4$ is very low, and for this fact the simplest
explanation is a low $w_4$. (An alternative possibility would be $p_4$
close to 1, so drastic decay of the original dictionary, because of e.g.
massive substrate influence; and such might be behind the moderately low $q_1$
& $q_5$.)
The q's with more than
one index will be given as follows. By construction
a $q_{ik...m}$ is symmetric in all indices, but the value does not have
any meaning if two indices coincide. So I give only the components of increasing index order. These
independent components are as follows:
$q_{1k} = \{0.5980, 0.5709, 0.1791, 0.5642, 0.5777\}$, k = 2, ..., 6
$q_{2k} = \{0.6959, 0.2061, 0.6858, 0.7568\}$, k = 3, ..., 6
$q_{3k} = \{0.2128, 0.6655, 0.6858\}$, k = 4, 5, 6
$q_{4k} = \{0.2128, 0.2264\}$, k = 5, 6
$q_{5k} = \{0.6520\}$, k = 6
q123 = 0.5304
q124 = 0.1655
q125 = 0.5338
q126 = 0.5304
q134 = 0.1723
q135 = 0.5135
q136 = 0.5034
q145 = 0.1723
q146 = 0.1723
q156 = 0.4932
q234 = 0.1953
q235 = 0.6081
q236 = 0.6115
q245 = 0.1926
q246 = 0.2027
q256 = 0.6014
q345 = 0.2095
q346 = 0.2095
q356 = 0.5777
q456 = 0.2061
q1234 = 0.1622
q1235 = 0.4865
q1236 = 0.4730
q1245 = 0.1622
q1246 = 0.1622
q1256 = 0.4696
q1345 = 0.1689
q1346 = 0.1655
q1356 = 0.4493
q1456 = 0.1655
q2345 = 0.1892
q2346 = 0.1926
q2356 = 0.5338
q2456 = 0.1892
q3456 = 0.2027
q12345 = 0.1588
q12346 = 0.1588
q12356 = 0.4291
q12456 = 0.1588
q13456 = 0.1622
q23456 = 0.1858
q123456 = 0.1554
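A quick sanity check on the list: if each q is simply a count of roots divided by N=296 (my reading of how the fractions were formed, not something stated explicitly), then q times 296 should be an integer up to the rounding of four decimals. The sketch below tests this on a handful of the tabulated values; the selection and the helper name are arbitrary.

```python
N_ROOTS = 296  # total number of Tocharian IE roots quoted in the text

# A few of the tabulated fractions, keyed by their index sets.
q_sample = {
    "12": 0.5980,
    "15": 0.5642,
    "45": 0.2128,
    "123456": 0.1554,
}

def implied_count(q, total=N_ROOTS):
    """Nearest whole number of roots corresponding to the fraction q."""
    return round(q * total)

counts = {idx: implied_count(q) for idx, q in q_sample.items()}

# Largest distance of q*N from a whole number; four-decimal rounding
# alone can account for about 0.00005 * 296 = 0.015.
max_residual = max(abs(q * N_ROOTS - counts[idx]) for idx, q in q_sample.items())

print(counts)
print(max_residual)
```

On this reading q123456 corresponds to the number of Tocharian roots attested in all six other groups at once.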
In what follows I will
stop at two indices; but the analysis might go further. Let us guess first
what the connection is between $q_i$ and $q_{ik}$. For this
purpose consider two extremal cases. If the language groups i and k
were already separated from each other as different and mutually unintelligible
languages, or if the speakers were already geographically isolated from each
other at the detachment of Tocharian, and did not come into contact again until the
linguistic separation happened, then the forgetting/substituting processes in
eqs. (E.1-3) were completely independent, so
$q_{ik} = q_i q_k$ (E.4)
On the other hand, we may formally distinguish 2 languages even if they
are still completely intelligible mutually and they are in intensive contact
(and very similar and mutually intelligible ones are indeed sometimes declared
different); in this case
$q_i = q_k; \quad q_{ik}^2 = q_i q_k$ (E.5)
In the generic case then we expect
$(q_i q_k)^{1/2} > q_{ik} > q_i q_k$ (E.6)
We can also get a similar result in cladistics. Assume that Group 0,
now Tocharian, detached when the ancestor(s) of Groups i & k were still in
close connection; then, after some time $t_{ik}$, they separated. If so, the forgetting process was
partially common, i.e.
$N_i = N w_i (1 - p_{ik})(1 - p_i)$ (E.7)
but
$N_{ik} = N w_i w_k (1 - p_{ik})(1 - p_i)(1 - p_k)$ (E.8)
so then
$q_{ik}/(q_i q_k) = (1 - p_{ik})^{-1} > 1$ (E.9)
Obviously $p_{ik} \ll 1$ if the common life was short, and it
increases monotonically with longer and longer common life, contacts &c. So
in any case $q_{ik}/(q_i q_k) > 1$ shows some common
history of some kind.
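Eq. (E.9) can be checked numerically. The sketch below evaluates the expected fractions of eqs. (E.7)-(E.8) for invented parameter values (the w's and p's are purely illustrative, not estimates for any real subgroup) and confirms that the ratio depends only on the common loss probability, which can then be read back from the ratio.

```python
def expected_fractions(w_i, w_k, p_i, p_k, p_ik):
    """Expected q_i, q_k and q_ik under eqs. (E.7)-(E.8), i.e. N_i/N, N_k/N, N_ik/N."""
    q_i = w_i * (1 - p_ik) * (1 - p_i)
    q_k = w_k * (1 - p_ik) * (1 - p_k)
    q_ik = w_i * w_k * (1 - p_ik) * (1 - p_i) * (1 - p_k)
    return q_i, q_k, q_ik

def common_loss_from_ratio(ratio):
    """Invert eq. (E.9): p_ik = 1 - 1/ratio."""
    return 1.0 - 1.0 / ratio

# Invented illustrative values: 10% of the roots lost during the common period.
q_i, q_k, q_ik = expected_fractions(w_i=0.9, w_k=0.8, p_i=0.2, p_k=0.3, p_ik=0.1)
ratio = q_ik / (q_i * q_k)

print(ratio)                          # equals 1/(1 - 0.1) up to rounding
print(common_loss_from_ratio(ratio))  # recovers p_ik = 0.1 up to rounding
```

The individual w's and p's cancel in the ratio, which is why the ratio alone can carry information about a common period.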
Now I give the ratio
for all combinations:
ik = | 12 | 13 | 14 | 15 | 16 | 23 | 24 | 25 | 26 | 34 | 35 | 36 | 45 | 46 | 56
$q_{ik}/(q_i q_k)$ = | 1.074 | 1.099 | 1.185* | 1.149* | 1.050 | 1.013 | 1.030 | 1.056 | 1.040 | 1.140* | 1.098 | 1.010 | 1.206* | 1.145* | 1.016
where, somewhat arbitrarily, I
drew a border at $q_{ik}/(q_i q_k) = 1.1$, and the numbers above that
are marked with an asterisk. What do
we see?
A) Greek is not too
dependent on anyone except Hittite.
B) Hittite is
independent of Germanic, but not of the others.
C) Excepting the
dependences of Hittite, the only other strong dependence is Celtic to Italic.
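The ratios and the three observations can be recomputed directly from the one-index and two-index q's listed earlier in this Appendix. The following is a sketch only; the dictionary layout and the names are mine, and the 1.1 threshold is the admittedly arbitrary border used above.

```python
from math import sqrt

# One-index q's (1: Celtic, 2: Germanic, 3: Greek, 4: Hittite, 5: Italic, 6: Satem).
q = {1: 0.6486, 2: 0.8581, 3: 0.8007, 4: 0.2331, 5: 0.7568, 6: 0.8480}

# Two-index q's, keyed by (i, k) with i < k, as listed in the text.
q2 = {
    (1, 2): 0.5980, (1, 3): 0.5709, (1, 4): 0.1791, (1, 5): 0.5642, (1, 6): 0.5777,
    (2, 3): 0.6959, (2, 4): 0.2061, (2, 5): 0.6858, (2, 6): 0.7568,
    (3, 4): 0.2128, (3, 5): 0.6655, (3, 6): 0.6858,
    (4, 5): 0.2128, (4, 6): 0.2264,
    (5, 6): 0.6520,
}

THRESHOLD = 1.1  # the arbitrary border between weak and strong dependence

# Ratio q_ik / (q_i q_k) of eq. (E.9) for every pair.
ratios = {(i, k): q_ik / (q[i] * q[k]) for (i, k), q_ik in q2.items()}
strong = sorted(pair for pair, r in ratios.items() if r > THRESHOLD)

# Check that every pair lies between the two extremal cases of eq. (E.6).
within_bounds = all(
    q[i] * q[k] < q_ik < sqrt(q[i] * q[k]) for (i, k), q_ik in q2.items()
)

for pair, r in ratios.items():
    print(pair, f"{r:.3f}", "*" if r > THRESHOLD else "")
```

Apart from the pairs involving Hittite (index 4), the only pair above the border is (1, 5), Celtic and Italic, in line with observation C).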
Now, I do not try to
interpret A), and I do not want to interpret B), since the often suggested
scenario that Hittite was already separate from the Core IE when Tocharian was
detaching might disturb the results. However the interpretation of C) is simple
enough. At the Tocharian departure the subfamilies were already more or less
separated, except that the Italoceltic community still existed for a while (see eq.
(E.9)).
Obviously this is not a rigorous proof of the existence of an
Italoceltic community. For that a much more detailed statistical analysis would
be necessary, and at least a Hittite-based analysis as well
(according to the hypothesis that Hittite separated first); and I will not do
that here. Appendix E is only to demonstrate that it is neither necessary nor
sufficient to apply Cladistic software outside its boundaries of validity.
(As we saw, biological Cladistics has fundamental assumptions untrue in
linguistics.)
APPENDIX F: THE CASE OF PURE GEOGRAPHIC FACTORS
Take a group of N
related languages, occupying some positions $X_A$, not moving. Strengths of contacts are governed by
some "generalised distances", $D_{AB}(X_A, X_B)$. If a D is small, then the two dialects are in strong
connection, so even after a considerable time they remain similar. If the
particular D is great, contacts are weak, and the dialects/languages diverge.
The $D_{AB}$'s are generalised distances, because a
complicated geography can cause the equivalent of great distance. E.g. in the
Many years ago Jánossy
formulated the question, how to decide the dimensionality of space [31]. He was
definitely against Riemannian geometry; I am not, but let us continue for a
while. We have N points, with distances $D_{AB}$ between them, and we assume
some coordinates $X_A^i$, $A \le N$, $i \le n$. If we know the geometry, then
$D_{AB} = f(X_A^i, X_B^i)$ (F.1)
where the form of the function f
is given. In Euclidean geometry it is the Pythagorean formula, on Earth's
surface that of spherical geometry, &c.
Now, we have N(N-1)/2 independent distances and nN coordinates. So eq.
(F.1) cannot trivially be fulfilled if
$N > 2n+1$ (F.2)
Hence we get a constraint for n, the dimensionality of the space. For
Earth's surface n=2, so the distances of the 6th point check whether the points are on a
surface.
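The counting behind this argument is easy to make explicit. The sketch below compares the number of pairwise distances, N(N-1)/2, with the number of coordinates, nN, and finds the smallest N for which the distances outnumber the coordinates. It follows the simple count used here; it does not subtract the freedom of rigid motions, which a more careful treatment would do. The function names are illustrative.

```python
def n_distances(N):
    """Number of independent pairwise distances among N points."""
    return N * (N - 1) // 2

def n_coordinates(N, n):
    """Number of coordinates of N points in an n-dimensional space."""
    return n * N

def smallest_constrained_N(n):
    """Smallest N for which the distances outnumber the coordinates."""
    N = 2
    while n_distances(N) <= n_coordinates(N, n):
        N += 1
    return N

for n in (1, 2, 3):
    N = smallest_constrained_N(n)
    print(n, N, n_distances(N), n_coordinates(N, n))
```

For n=2 the first constrained configuration has 6 points (15 distances against 12 coordinates), which is the "6th point" mentioned above.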
Stetsyuk used the
number of etymologically related words as a measure of distance between 2
languages. This is surely in close connection with differences. If we have a
string of languages all going back to an original Ur-language, languages in
more contact will retain more words in common. Languages in loose contact
forget independently, so a small number of common words remain. Sure, this $D_{AB}$
is a function of the distance in
physical space, not simply equal to it, but let us follow Stetsyuk in first
approximation by ignoring the difference.
His approach is of two
steps. First he wanted to discover the Nostratic Urheimat, and second that of
the Indo-Europeans.
The Nostratic
discipline comes from
If you ask, what
remains outside, I can give a list. A lot of African languages, Native
Australian & New Guinean ones, together with Pacific ones, are definitely
non-Nostratic, together with such important Eurasian ones as (probably)
Sumerian, Elamite, Basque, some Caucasians, Ket, Chinese & Burushaski,
together with the Na-Dene languages of Northern America.
The Nostratic
superfamily may be a product of immigration of Homo sapiens (or Homo sapiens
sapiens, or Anatomically Modern Humans) into
Our gracilis ancestors
with chins and without brow ridges tried to leave
The third exodus again
went through the Sinai, and was at the site Bokher Takhtit in ca. -58,000. This
wave introduced some innovations (as e.g. blade technology) in which many
experts see the start of
If so, then the
languages of
Of course, 50,000
years is a great time depth; however PIE is reconstructed back to 6,000 years (Uralic
too), and it is generally accepted that until the Magdalenian the specialised hunting
humans were more mobile than later. So maybe
some similarity of languages has remained up to now.
If so, Nostratic may
show the latest close kins either at end-Palaeolithic or at Mesolithic. Stetsyuk
is brave enough to reconstruct the locations at the end of Nostratic unity from
linguistic similarities, and gets that the distances fit best with a Nostratic
Urheimat South from Caucasus, East from Black Sea, West from the Caspian Sea,
and to the South somewhere at the valleys of Great & Little Zab.
So Indo-Europeans,
Uralics & Altaics formed almost an equilateral triangle, Altaics around Mt.
Ararat & Lake Sevan, Indo-Europeans to the East, until the Caspian, and
Uralics around the
If you like, you may
believe this. In any case, Uralic & Altaic are on the verge of forming an
established higher family, Uralo-Altaic, with a common grammar and at least
dozens of common words, and Uralic & Indo-European, while the grammars are
not similar, have such common words as e.g. "water", "name"
or "honey/mead". As for the other 3 "peripheral" families,
they were farther away, so not in such strong connection.
Then Nostratic unity
came to an end (-15,000? -10,000? Indo-European surely remained still in some
contact with Semitic if indeed some
names of domesticated animals are common). Stetsyuk assumes migrations, and
then tries to reconstruct the secondary Urheimats of Indo-Europeans, Uralics
& Altaics. While for me the reconstructed Uralic Urheimat would be most
interesting, I can imagine that most readers would rather read about
Indo-Europeans. So let us see that; the time is that of the PIE community, so maybe -4,000.
The site is
Now comes the western half. The Westernmost IE people were Celts. Their Westernmost point
was the confluence of the Vistula & the Western Bug, so cca.
Natural borders were
somewhat less explicit in the Southwestern quadrant. Illyrian territory was
bordered by
You may or may not
believe the reconstruction; it is an attempt to explain IE
similarities/dissimilarities just before the exodus of Anatolians to
Now we can return to
the Tocharian migration to East. They started, say, in -3,300, with Western IE
word forms and crossed first
I only wanted to show
that there are reconstructions in reasonable schemes, good or bad ones.
The Tocharian language in itself is a proof of intensive connections with Uralics
in the East, very probably Ugrics. According to Uralic consensus, at the time of
contact (surely before 1,000 BC) a separate Magyar language did not yet exist;
maybe the other two extant Ugric languages should be examined for Tocharian
impact as well. But Western IE
influence on Magyar is not "mirageous", and cladistics is not the
final answer in IE history.
APPENDIX G: AN UNTIMELY DEATH IN NOVAE
Here
comes the mysteriously mentioned Lazhen grave inscription. In Late Antiquity
the site was Dacia Nova, just South of the limes, the
River Danubius, near to the non-negligible city Novae, along the tributary
river Asamus. (Now the names are respectively,
A young male, hardly beyond boyhood, died and got a nice grave inscription,
nontrivial to reproduce here, so let us proceed in a roundabout way.
The father wanted to write something like:
I went away when my dear maleness was blooming
in my four and tenth year.
The Classical Latin text would have been
something like:
Ego decedebam caro florente marito
in quartum decimumque annum...
OK, maybe the father looked for some
synonym. Lots existed: exibam, moriebar, occidebam &c. But the gravestone
gives something surprising for us:
Ipso immargebam caro florente marito
in quartum decimumque annum...
Ipso may be a local usage, or, maybe, the
death was suicide. But immargebam is "incorrect", and even with a
good orthography it would be "immergebam": "I sank in/went
down". Maybe proper for a sailor; but otherwise...
However, it was not at all strange for Daicoviciu [32], who published the text. He translated
the text into Roumanian as "Mergeam in anul 14-lea." In his vernacular
"mergeam" is simply "I
went away". Similarly, in Albanian "mërgonj"
is cca. "I move away".
In other words, further evolution of the Novae Latin probably led to Modern
Roumanian (and to the Latin words of Albanian). On other
territories not. The immergo -> exeo transition of meaning was not
a temporary state of Vulgar Latin, but a Common Innovation of Palaeo-Roumanian
and Palaeo-Albanian in
And
still Vulgar Latin was mutually understood.
REFERENCES
[1] Quine W. V. O.: Methods of Logic. Holt,
[2] Lukács B., Martinás K. & Bérczi Sz.: Symmetry and Katachi in the Works of Aristotle. Forma 15, 173 (2000)
[3]
[4] Szabédi L.: A magyar nyelv őstörténete. Kriterion,
[5]
[6] Diamond J.: The Rise and Fall of the Third Chimpanzee. Vintage,
[7] Brugmann K.: Kurze vergleichende Grammatik der indogermanischen Sprachen. Trübner, Strassburg, 1904
[8] Lord Kelvin: Nineteenth Century Clouds over the Dynamical Theory of Heat and Light. The
[9] Strabo: Geographika. In 8
[10] Piper B. H.: Lord Kalvan of Otherwhen. Ace Books,
[11] Kortlandt P.: More Evidence for Italo-Celtic. Eriu 32, 1 (1981)
[12] Warnow T.: Mathematical Approaches to Comparative Linguistics. PNAS 94, 6585 (1997)
[13] Erdem E., Lifschitz V. & Ringe D.: Temporal Phylogenetic Networks and Logic Programming. arXiv:cs.LO/0508129v1; to appear in Theory and Practice of Logic Programming.
[14] Collinder B.: Hat das Uralische Verwandte?
[15]
[16] Rosenfelder M.: Numbers in over 4000 Languages. www://zompist.com/numbers.shtml
[17] Makkay J.: The Tiszaszőlős Treasure. (See also some citations therein.)
[18] Livius T.: Ab urbe condita. Teubner,
[19] Hajdú P.: Bevezetés az uráli nyelvtudományba. Tankönyvkiadó,
[20] Toynbee A. J.: Some Problems in Greek History.
[21] Green R. & Carr J. F.: Great King’s War. Ace Books,
[22] Garrett A.: Convergence in the Formation of Indo-European Subgroups: Phylogeny and Chronology. In: Phylogenetic Methods and the Prehistory of Languages, eds. Forster P. & Renfrew C., Cambridge, McDonald Inst. for Archaeologic Research, p. 139, 2006
[23] Ringe D., Warnow T. & Taylor Ann: Indo-European and Computational Cladistics. Trans. Philol. Soc. 100, 59 (2002)
[24] Warnow T.: Mathematical Approaches to Comparative Linguistics. PNAS 94, 6585 (1997)
[25] Wildman D. E. & al.: Implications of Natural Selection in Shaping 99.4% Nonsynonymous DNA Identity between Humans and Chimpanzees: Enlarging Genus Homo. PNAS 100, 7181 (2003)
[26] Holba Ágnes & Lukács B.: How to Jump into Humanity: A Mathematical Reconstruction. In: Evolution: from Cosmogenesis to Biogenesis, eds. Lukács B. & al., KFKI-1990-50, p. 125
[27] Pokorny J.: Indogermanisches etymologisches Wörterbuch. Francke,
[28] Richter G. C.: A Maitreyasamiti-nat.aka with Addendum. http://www2.truman.edu/~grichter/translations/tochcog.pdf
[29] Watkins C.: The American Heritage Dictionary of Indo-European Roots. Houghton Mifflin,
[30] Stetsyuk V.: Introduction to the Study of Prehistoric Ethnogenic Processes in
[31] Jánossy L.: Theory of Relativity Based on Physical Reality. Akadémiai Kiadó,
[32] Daicoviciu C.: A merge. Fossatum - sat. Dacoromania V., Cluj, 1927/8, p. 477