IF NOT, THEN WHY? 1: THE "PROTO-LATIN THEORY" OF THE MAGYAR LANGUAGE

 

B. Lukács

 

President of the Matter Evolution Subcommittee

of the

Hungarian Academy of Sciences

 

CRIP RMKI H-1525 Bp. 114. Pf. 49, Budapest, Hungary

 

lukacs@rmki.kfki.hu

 

ABSTRACT

Sometimes a theory is impossible, so cannot be true; and still it works. Surely something mimics Truth; it is then interesting and worthwhile to see, what mimics what. L. Szabédi, high official of the Bolyai University at Clausenburg and a linguist, in the last months of that university (the Magyar university in Roumania) became convinced that Uralic Magyar originated from Indo-European proto-Latin. Some people believe that he simply wanted to get favours from Neo-Latin Roumanians, but I think Tocharian mimicked the nonexistent proto-Latin – proto-Ugric genetic kinship.

 

0. ON THE EVOLUTION IN SCIENCES

            While I believe in Scientific Truth, I do not believe in the truth of any particular theory used just now. We know that any particular theory used just now has replaced a previous theory proven wrong not too far back in the past. The general (and convenient) belief is that the present theory is more true than the previous one; and I think this is generally true in the quantitative sense: the new one answers more questions than the old one, or gives more accurate predictions, or is simpler to apply &c. However, in the qualitative sense the continuous improvement is often not true.

            In Physics Democritus used an atomistic description which we now like to call correct, and with this scheme he was practically unable to give explanations or predictions correct in either the quantitative or the qualitative sense. A century later came Aristotle, saying that atomism is probably incorrect and surely unimportant; he used a strongly inhomogeneous world picture with a preferred center, and with rest at the proper place as the preferred state of motion. Physicists now generally like to call this incorrect, but still Aristotle was able to give lots of good predictions in the qualitative sense, even if few enough were correct quantitatively. His system survived 1900 years and then Galileo disproved it. Based on Galileo's rather semiquantitative observations and arguments Newton then demolished Aristotelian physics (except in Thermodynamics, to be sure), and the new one showed a Democritean structure, with mass points moving with never diminishing momenta (in sums), with forces depending on relative distances of agents, and not on distances from preferred points of space; and Newton's World was infinite, the same everywhere, and unevolving. Philosophers then were eloquent in jubilating over old Aristotle overthrown.

            By 1926 at the latest, Relativity & Quantum Physics had demolished Newton's World. There are no mass points: ψ waves live & interact in at least 6N dimensional phase spaces, results of Measurements are ultimately stochastic, Space-Time is inhomogeneous and may be finite in all of its dimensions.

            As for Gravity, it seems that Democritus did not know what is it (I am deliberately ungrammatical, but it would be even worse to use Past Tense for an Ultimate Truth, even if we do not know It for sure, and my tenses are correct in Uralic languages anyway). Maybe some interactions of the small atoms? Aristotle said that Gravity is the relation of the piece of matter to the geometry of Space. Newton said that Gravity is the result of mutual pulls of mass points: Earth is composed of many masses, so if a new one is carried to the neighbourhood, the result will be a net pull towards Earth's center. But in the neighbourhood of giant Jupiter the pull points to Jupiter's center and so on; the ultimate cause is the force between mass pairs. Then came Einstein, saying that Gravity is a property of Space-Time. Space-Time has a nontrivial geometry, and from this geometry come some properties of unforced motions. Aristotle would have cherished this idea; but Einstein went, of course, further, saying that the geometry is not prescribed but is the product of the matter being there. For sure, Sun does not pull Earth. No force acts on Earth, but the geometry is such that Earth Gravitates toward Sun. For somebody in General Relativity Newton was moderately correct about forces, but he was simply misguided about Gravity and his opinion on this topic bordered on Superstition. Aristotle, in contrast, was surprisingly correct for somebody 2200 years in the past, although somewhat simplistic.

            In some areas of Physics theories now live 5 years on average. When I was choosing a topic for my BA dissertation (that was in 1969) the most interesting Particle Physics title was "ρ-π-π Vertices in the Third Times Modified Veneziano Model". I thought that maybe this was my last time to choose freely, so I wanted something more True than a three times modified model, and I opted for Gravitational Collapse. As I am writing this, General Relativity has survived 91 years in unmodified form, but my young colleagues do not know what the Veneziano model is, modified any number of times. However, signals are clear that even General Relativity will change, hopefully in my lifetime.

            As for Astronomy, Pythagoras believed that Earth revolved around either Sun or the Central Fire. Aristotle believed that Sun revolved around Earth. Aristarchus supported Pythagoras, but in vain. Copernicus believed that Earth revolved around Sun, Galileo supported him and Church believed that She condemned Galileo for making the support indecently stubborn. Then after Newton Copernicus' scheme was accepted, but in 1915 Einstein showed that the question "who revolves around whom" has no meaning at all, not being a covariant statement. Then in 1992 Church observed anomalies in the process of Galileo's trial and annulled the verdict in a backward acting way. What is Truth in this question?

            And Physics & Astronomy are our best Sciences. As for the Age of Earth, pre-Classic Greeks guessed some thousands of years, Aristotle Infinity, Alexander of Aphrodisias more than a million years, the Early Middle Ages 5000 years, the XVIIIth century, mainly on physical arguments, ~10,000 years, the XIXth century, on physical, chemical & geological arguments, 40 million years, which then jumped to a billion, and then gradually to 4.55 billion years.

            For Economy I rather would not list best theories telling opposite Truths. In Scholarship look for National Histories.

            So at a given point in time we cannot answer questions about Ultimate Truth, although on average Science goes forward. But Ultimate Questions cannot be answered on average; they should be answered Yes or No.

            Willard Van Orman Quine tells us [1] that changes in Science follow a certain Parsimony. We have lots of assumptions when building up a Theory. Then, if an observation contradicts a prediction, we change some assumptions, but only those which are the "cheapest". So the Ptolemaic World Scheme was improved and improved by introducing more and more epicycles and eccentric circles for 1500 years; and only then did Copernicus take over.

            But even this is not so simple. Historians of Science tell that the Aristotelian World Picture collapsed when Galileo's telescope resolved the Milky Way into individual stars. They are obviously neglecting the original works of Aristotle. De Caelo gives an explanation for the Milky Way, and from this explanation the prediction would have been (nobody applied Aristotle's theory to telescopes) that after some magnification the picture would be light spots surrounded by darkness [2], as it was. Because neither Galileo nor his opponents were fluent in Aristotle, they agreed that there was something against Him. Now we know that there was, but they should not yet have known it.

            So it seems that a kind of "scientific fashion" is involved in each Yes/No choice. Maybe if we could observe the many-world histories of Everett [3], we could enjoy many formally very  different theories for the same topic, which were, however, quite similar for many quantitative predictions. Alas, we cannot do this.

            But we can remember it. And this view gives a Prediction. There may be cases when somebody who is quite a good scientist/scholar invents a theory whose foundations seem quite impossible to us; and still he is not a madman, and his theory can explain a lot of facts. Maybe not as much as the Best Theory, but still, it is surprisingly good compared to the Common Opinion that it is Fundamentally Wrong. And a century into the future...?

            This study is about the "proto-Latin Theory" of the Origin of Magyar language [4], elaborated by L. Szabédi, Magyar poet, linguist, University high official and finally martyr in Roumania.

 

1. THE BACKGROUND

            I must sketch the situation of Transylvania in 1958, when the theory was finished (it was published almost 20 years later), although some Roumanians may say that the matter in this Chapter should not be overemphasized. Westerners either are uninterested or they say that they are, but without this Chapter the motivation of the theory cannot be evaluated.

            Szabédi was an ethnic Magyar in Transylvania (or rather, he was an ethnic Szekler, classified together with Magyars when opposed to third parties), who believed that this fact should be translated "ethnic Hungarian". Magyars are generally mistaken about the nature of the terms Magyar and Hungarian, because the Magyar language does not distinguish the two terms. Slovakians can distinguish them excellently, but even other interested ethnic groups are in some confusion. For example, in Roumanian official statistics you can declare yourself either Maghiar or Ungur (both mean some non-Roumanian fellow with the very same Uralic language); Magyars wait for the announcement of their leaders and then generally tick Maghiar. But: Magyar is language, and Hungarian is State. In 899 the Carpathian Basin ("Hungary") was united by conquering Magyars, but in 1000 the ruling clan of Magyars founded the Hungarian Kingdom of many languages (officially Latin). The center was generally in the Magyar-speaking region, so Magyars generally feel that Magyar is the Magyar translation of Hungarus/Hungarian. While they are mistaken, the mistake is not gross stupidity, simply a mistake. (While my first language is Magyar, I am not exactly Magyar; moreover, I could argue for not being even Hungarian in some sense, but that would be an unnecessary excursion, so let us stop here.)

            Transylvania is a geographic area which was never unilingual (and national histories produce contradictory pseudo-statistics). However, it was part of Hungary between 896/1003/1020 and 1918/1920; and for the bigger part again between 1940 and 1944/1947 (the different dates are according to different theories). Then it was again transferred to Roumania, but a Magyar university remained in the capital city Claudiopolis/Kolozsvár/Clausenburg/Cluj. That was the Bolyai University. Szabédi was one of the high officials of that university, closed down in 1959 ("united with the Roumanian Babes University"). As a protest Szabédi committed suicide, in vain, because the Roumanian State declared it a railway accident, and the Hungarian State then was not in a position to deny it. (Details are irrelevant for our present topic.)

            The Roumanian State was continuously declaring the Latin origin of the Roumanian Nation and Language; and, indeed, the language is Neo-Latin, and the nation comes from the Roman Empire. The main difference between Hungarian and Roumanian National Histories is that we do not believe in the Roumanian ethnogenesis in present Roumania, but rather in Bulgaria. (Maybe in Southern Bulgaria, near the hills Calvomuntes, if Procopius is reliable.) The arguments would need detailed discussions of some early Byzantine historians, plus the comparison of Roumanian words with (extinct) Dalmatian and (extant) Albanian ones. I will not compare them (for which, I think, most of the readers will be grateful; with Roumanian readers I am ready to discuss separately); only, for general orientation, I note that the Lazhen’ grave inscription is decisive, but maybe the readers would not at all demand that I tell about it.

            Then the Magyar minority made jokes about the Latin origin (in a low voice) and did not believe even the Neo-Latin origin of the language (true, lots of Bulgarian Slav words can be found in it, even after the Herculean work of Bishop Samuil Micu-Klein, but a comparable amount of Bulgarian Turk is also present in Magyar). So, because Magyars in Transylvania were not permitted to deride the Latin origin of Roumanian openly (the State political police Siguranza was quite active), they discussed the whole thing as little as possible.

            And yet, Szabédi worked out a theory in which Magyar is a cousin of Roumanian, Roumanian being the direct descendant of Latin, while Magyar comes from the direct ancestor of Latin through proto-Magyar, Ugor, Finno-Ugric or so.

            Of course, you can believe in a simple explanation, as follows. The State is Roumanian. Szabédi wants to get something from the Roumanian State, so he says: indeed you have originated from the great Romans but we are your inferior relatives, so give me a smaller amount... It would be quite a rational tactic of either the hare or the fox; but then Szabédi’s name would now carry a very bad memory in the Szekler nation, and it is not so. There is another explanation [5] which tells that Szabédi manufactured his theory “…to help the Magyar and Roumanian peoples to become friends”. The problem is the same: for anybody valuing the Latin ancestry highly, Magyars would have been put in second place, which was unacceptable for Szabédi and the Szekler nation. In addition, this explanation would mean that Szabédi deliberately lied. I, as a scientist, would not assume this without serious evidence.

            I have heard lots of stories about lip service to the Roumanian State, but in the Transylvanian tradition Szabédi is a good Hungarian, he committed suicide in protest against the closing down of the Magyar university, and his theory remained unpublished for 19 years afterwards, so the Roumanian State was not a buyer for it. I cannot believe anything else than that Szabédi believed his Proto-Latin theory. And then comes the central question: why did he believe/make this theory? I think I can give a tentative answer.

 

2. THE MAJORITY OPINION ABOUT ROUMANIAN, MAGYAR AND LATIN

            Indo-Europeanist linguists have a very strict opinion about the origin of the Roumanian language. According to it, Roumanian is the direct descendant of Eastern Vulgar Latin. I agree with this opinion even if I think Westerners ignore the heavy Carp & Bulgarian Slav substrates/adstrates. They ignore also the Lazhen’ inscription, but the Lazhen’ inscription influences only the exact location of the ethnogenesis and they have to ignore that.

            For the origin of Magyar there is an official Hungarian Academic opinion. I write Academic, not academic; this is the majority opinion of the Hungarian Academy of Sciences, and I have no reason to challenge it. According to that, Magyar belongs to the Uralic group. Within that, to the Finno-Ugric subgroup as opposed to Samoyedic, and within Finno-Ugric, to the Ugric, as opposed to Finnic. There are two other extant Ugric relatives in Westernmost Siberia, Khanty & Manyshi. Earlier they may have lived astride the Ural Mountains, so we may have lived either in Easternmost Europe, or in Westernmost Asia.

            Latin was a member of the Italic Subfamily of the Indo-European family; or maybe of the Italoceltic subfamily whose sister subgroups were Italic & Celtic; it will come in due course. Indo-European and Uralic are 2 disjoint families, albeit with early contacts or earlier genetic connections (again later), but they were disjoint already in 4500 BC, or even earlier. Their grammatical structures vastly differ even if a few words (water, name, honey &c.) seem to be close enough.

            I may be mistaken, but I do not have any strong reason to challenge this view, and I will take this scheme. To be sure, I always feel Altaic languages to be nearer to Uralic than linguists generally think; but I may overemphasize the great grammatical similarity. Also, Altaic is irrelevant in the present discussion.

            So the majority opinion (which I share) is that Magyar & Latin/Roumanian were divergent for at least 6,500 years. In contrast Szabédi stated [4] that the evolutions diverged for a mere 3,900 years (or, alternatively, 3,600). When the common language split, proto-Italians went to Italy, while some dialect of people remaining in Eastern Europe became the Finno-Ugric protolanguage (or the Ugric?).

            Now some readers may believe that 6500 years as against 3900 is not a really big difference. But this is not so. From the Indo-European family we do have Hittite and Greek original written documents 3500 years old, and from Hatti very good arguments that some ethnically Hittite kings ruled in Anatolia 3900 years ago. The earliest parts of the Indian Rg-Veda (composed in the direct ancestor of Sanskrit) go back 3500 years according to European and cosmopolitan Indian scholars; according to Hindu National History they go back 7000 years or more. So if Szabédi is correct, we are an integral part of the Indo-European family, and we are roughly as close to Latin as the Celtic languages (and they are really close, see king = rix in Gaulish and rex in Latin, or the common r-passive; later), or a hairbreadth farther; definitely nearer than Slavic, Lithuanian or Hindi. Can this be?

            The answer of the majority of Indo-European linguists would be No; only they have never read [4]. The overwhelming majority of Uralic linguists would also say No; and a few of them know the theory, even if not the details. My answer is also No; but I have the book. So I ask: why?

 

3. ON ETYMOLOGIES

            How to verify linguistic kinship? It seems that there are two schools. Either fundamental words are primary and grammatical structure is secondary, or vice versa. Since for Indo-Europeans the two criteria generally go hand in hand (except, maybe, Albanian and Tocharian; later), they do not have to decide, and for fundamental words the method is 200 years old, so they like it.

            You have a few languages you suspect to be in genetic kinship. Then you tentatively postulate regular changes of sounds. These postulated changes may be anything, but you must be strict about them. If by the postulated changes you can bring the words of the different languages into relation, then you are done: the suspected kin languages indeed came from one parent language, even if nobody ever wrote anything in that language.

            Example 1: Latin and the daughter languages. Lots of written books are extant in Latin. But first let us forget about Latin texts. We observe several languages in Western Europe (and South America) similar to each other. For example, the number "100" is written "cent", "cento" &c. The initial sound is sometimes a sibilant ("s-sound"; in French, Castilian, Catalan, Portuguese), sometimes an affricate ("ch-sound", in Italian and Roumanian), sometimes a stop ("k-sound", Sard). Then you observe stop-sounds also written as "c". Then you seem to see a rule. In the majority of languages written "c"-s are stops before a back vowel, but something "looser", affricate or sibilant, before a front one. There is no difference in Sard, and there it is a stop. Then the most parsimonious assumption is: in the common parent language the sound was a k-stop; later in most languages this sound changed before a front vowel, but remained a stop before a back one. You formulate the proper rule; and then all words with a "c" fulfil it.

            All those which come directly from the parent language, that is. Words borrowed from elsewhere may or may not behave so. And a minority of exceptional words may behave exceptionally. But lots of cognate words can be found. Then other rules can be looked for. Intervocal Italian "-tt-" often seems related to French "-it-" and Roumanian "-pt-" as in latte, lait, lapte, all "milk". And so on and so on.

            And now you take the preserved Roman books and see that "cent" and "cento" were indeed "centum", and "latte", "lait" and "lapte" were originally "lac" (stem "lact-"). The words having a Latin etymology are the words that can be traced back via regular changes to a Latin ancestor even if that word does not occur in any extant book; but it generally occurs.
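The "c" rule of Example 1 can be written down as a small sketch. It assumes, for simplicity, that only "e" and "i" count as front vowels and that the outcomes per language are exactly the three classes named above (sibilant, affricate, stop); the function name and the rough sound spellings are illustrative only, and spelling conventions such as Italian "ch" are not handled.

```python
# Rough sketch of the rule from Example 1: written "c" is a k-stop
# before a back vowel and something "looser" before a front vowel.
FRONT_VOWELS = set("ei")  # assumption: only e and i are treated as front

# Outcome of the parent k-stop before a front vowel, as listed in the text.
BEFORE_FRONT = {
    "French": "s", "Castilian": "s", "Catalan": "s", "Portuguese": "s",
    "Italian": "ch", "Roumanian": "ch",
    "Sard": "k",  # no change in Sard
}

def pronounce_c(word, language):
    """Return a rough sound spelling of `word`, resolving each written 'c'."""
    out = []
    for i, ch in enumerate(word):
        if ch != "c":
            out.append(ch)
            continue
        nxt = word[i + 1] if i + 1 < len(word) else ""
        # Strict application of the rule: loosen only before a front vowel.
        out.append(BEFORE_FRONT[language] if nxt in FRONT_VOWELS else "k")
    return "".join(out)

print(pronounce_c("cento", "Italian"))  # chento
print(pronounce_c("cent", "French"))    # sent
print(pronounce_c("casa", "Italian"))   # kasa
```

The point of the sketch is the strictness: the same rule is applied to every "c" without exception, and a word that refuses to fit is then a candidate loanword rather than a reason to bend the rule.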

            Example 2: Kentum vs. satem. You observe that there are really a lot of languages showing exciting similarities in the words "father", "mother", "bring" and so on. E.g. "father" is "pater" in Latin and "pitar" in Sanskrit. But half of these languages have an initial stop in "100" as in Latin "kentum", written as "centum", while the other half have a sibilant as in Iranian "satem", or an affricate. Then your guess is that the original stop changed into a sibilant for some reason. No problem: it turned into a sibilant before a front vowel in French "before our eyes", so it may have done so before any vowel in Iranian. It turns out that the suspected related languages can be grouped neatly into two halves: kentum, as Italics, Celts, Germans and Greeks, and satem, as Slavs, Balts, Iranians and Indians. The XIXth century guessed a West-East dichotomy; simple enough. Then French is "kentum" although "cent" now starts with an s-sound; but it comes from a Latin "k".

            In Roumanian "100" is "suta", not even written with a "c", although Roumanian is Neo-Latin, so "kentum". Then you can try easy explanations. 1) "Suta" comes from "kentum" in some too involved way, and when Roumanian writing started (1521), the "k"-sound was no longer remembered. Or, 2) that "100" in particular is a Slavic loanword from "sto". Or, 3) that it is a loanword from Dacian (?), or Thracian (satem) or Armenian (satem). I think 2) is the majority opinion.
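The bookkeeping of Example 2 can be sketched in a few lines. The words for "100" are the ones quoted in the text; the respelling "sent" for French "cent" and the two function names are assumptions made only for this illustration. The sketch contrasts a naive classification by today's initial sound with one that follows the language back to its known parent first.

```python
# Words for "100", spelled roughly by sound, as quoted in the text.
HUNDRED = {
    "Latin": "kentum",
    "Iranian": "satem",
    "Slavic": "sto",
    "French": "sent",     # written "cent"; respelling is an assumption
    "Roumanian": "suta",
}

# Daughter languages are judged through their documented parent.
PARENT = {"French": "Latin", "Roumanian": "Latin"}

def naive_class(language):
    """Classify by the present-day initial sound only."""
    return "kentum" if HUNDRED[language][0] == "k" else "satem"

def genetic_class(language):
    """Classify by the initial sound in the oldest known ancestor."""
    while language in PARENT:
        language = PARENT[language]
    return naive_class(language)

for lang in HUNDRED:
    print(lang, naive_class(lang), genetic_class(lang))
```

French and Roumanian come out "satem" in the naive column and "kentum" in the genetic one, which is exactly the distinction the text draws: the label belongs to the line of descent, not to the present-day sound.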

            Example 3: Reconstruction of Proto-Indo-European (PIE). There are many suspected Indo-European languages, and many scholars of them. In addition some such languages have very old records. Hittite & Greek go back to 1500 BC, and even Latin goes back to 600 BC. To be sure, the oldest records come from non-IE languages about 3500 BC: Sumerian is an isolate (?) and Egyptian is Afro-Asiatic, but even 1500 BC is quite old. So there is lots of information, there are lots of reconstructors, and so PIE is reconstructed. E.g. we can try the very obvious choice of a sentence: "On a hill a sheep not having wool saw horses, one pulling a heavy wagon, one carrying a big load and one carrying a man quickly." Then modern linguistics reconstructs the sentence as: "Gwrreei owis, quesyo wlhnaa ne eest, ekwoons espeket, oinom ghe gwrrum woghom weghontm, oinomque megam bhorom, oinomque ghmmenm ooku bherontm." [6].

            And now I, the Uralian speaker, tell you that to me everything seems OK. "Gwrreei" may be "on the hill", because in Slavic "gora" is "hill" and there the Locative ends in e/i. "Owis" can be sheep because it is "ovis" in Latin. "Wlhnaa" may be "wool" and "ne eest" may be the Latin "non est". Again "ekwoons espeket" is reminiscent of Latin "equos spectat", "oinomque" is Latin "unumque", where we know the earlier "oinom" from original inscriptions, in "megam" you can see the Greek "mega", Hindi "maha" = "big", + -m for accusative as in Latin, and so on.

            While any of the words may be an error (we cannot check it, not having inscriptions), the whole sentence seems good for, say, 90%.

            Now, the question is: can we push Magyar in amongst these related languages (with or without Uralics)? Can we find Indo-European, or Italo-Celtic or proto-Latin etymologies for many Magyar words?

            Szabédi found them, for hundreds of words. Of course, for this he postulated his own rules. But, as you could observe, this was his right for a brand new IE language. It could not be otherwise: a new language needs its own specific change rules. He only has to follow his rules strictly.

            Was he strict enough? I do not know. For me his derivations seem roughly similar in strictness to those of professional linguists. (I am a professional physicist.) Maybe professional linguists could tell something; but they do not do it.

 

4. EXAMPLES FOR ETYMOLOGIES

            Szabédi's etymologies are multitudinous. I take two groups for examples: the seemingly absurd and the surprisingly obvious. Let us see some examples, but be careful about orthography. I cannot use a Latin orthography for Magyar; both vowels and consonants are more varied in Magyar. So here is some correspondence to English, at least for consonants:

 

Magyar    English    Note
c         ts         -
cs        ch         -
gy        dy         Originally j
j         y          -
s         sh         -
sz        s          -
ty        ky         As Tokyo
zs        zh         -
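For readers who want to sound out the Magyar examples, the table can be applied mechanically. The following is only a sketch: the function name is hypothetical, vowels and all consonants not in the table are left untouched, and letter pairs are matched before single letters so that "cs", "sz" and "zs" are not read as "c", "s" or "z" plus something else.

```python
# Consonant correspondences copied from the table (Magyar -> rough English).
MAGYAR_TO_ENGLISH = {
    "cs": "ch", "gy": "dy", "sz": "s", "ty": "ky", "zs": "zh",
    "c": "ts", "j": "y", "s": "sh",
}

def approximate(word):
    """Respell the consonants of a Magyar word in rough English values."""
    word = word.lower()
    out = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if len(pair) == 2 and pair in MAGYAR_TO_ENGLISH:
            out.append(MAGYAR_TO_ENGLISH[pair])   # two-letter signs first
            i += 2
        elif word[i] in MAGYAR_TO_ENGLISH:
            out.append(MAGYAR_TO_ENGLISH[word[i]])
            i += 1
        else:
            out.append(word[i])                   # vowels etc. unchanged
            i += 1
    return "".join(out)

print(approximate("cserebogár"))  # cherebogár
print(approximate("vaj"))         # vay
print(approximate("egyéb"))       # edyéb
```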

 

And then we can start. Scarabeus = cserebog-ár. Sure, today's scarabeus is a Scarabaeina, and today's cserebogár = cockchafer is a Melolonthina, but they are neighbour subfamilies of the Scarabaeidae, so the cserebogár is at least a Scarabaeida, and even now a layman sees little enough difference. Szabédi starts from a proto-Latin *scarabaios. Thence Latin scarabeus is easy (and Greek skarabeios too); the Magyar comes through a csaraboj.

            Now for "oil" (the food). Szabédi starts with a proto-Latin *olaivom. Hence he gets Greek elaion through an *elaivon (indeed the digamma sound vanished in Attic), and also the Latin "oleum". Now, on the Uralic side he assumes an *ulojwu, but with dark l, hence an *uwojwu, and then an Ugric woj, Magyar vaj. Indeed, Finnish "voi" and Magyar "vaj" both are "butter", the Lapponian "vuojj" is some "oil", and Manyshi "woi" is "fish-oil". The different Uralic languages use the same word for different fats and oils, according to lifestyle.

            Proto-Latin *cannabaria remained the same on the Latin side, i.e. "hemp" (remember cannabis). But "hemp" is "kender" in Magyar, which he gets easily from *cannabaria.

            Proto-Latin *manuvali becomes "manuale" in Latin, so a kind of "handle". Now it is "nyél" in Magyar, which can be got via *manivali>*mnivéli>*nyél.

            Proto-Latin *gönücle went into Latin geniculum and Magyar könyök = elbow.

            Latin mersat is Magyar merít, in older times mereht. Alibi ~ egyéb (in the sense: other). From proto-Latin *vunda one gets the Latin "unda" (wave), and the FU "wite", Magyar "víz" (the Uralic side is orthodox!), "water". Latin "flamma" and Magyar "láng" (both "flame") come from proto-Latin *phlogma. Latin "casa" and Magyar "ház" both from proto-Latin "cotta". (I note that the Magyar "ház" is very near to German "Haus", but cannot be a loanword, having regular Finno-Ugric cognates.)

            I give one more pair, and stop. Nepos ~ nép. The Latin word means "descendants", the Magyar now is "people", but originally may have meant the wife, children and some cognate persons together, so opposed to the head of the greater family.

            There are really more than a hundred etymologies; I was too lazy to count them. How is this possible?

 

5. ONE CENTURY BACK

            First I retreat to 1904. You may ask why. The answer is simple. Szabédi was born in 1907, and in 1904 a great monograph was published about comparative IE linguistics [7]. I got my copy in a second-hand bookshop, with pen writings in the margins and paper sheets inside. A draft of a letter showed that somebody used it as a University textbook in 1923. Szabédi probably learnt linguistics from 1926; I cannot prove that he used another copy of [7], but anyway his textbook must have been similar. Also those of his critics.

            Obviously at the end of the XIXth century IE linguistics did not consider itself complete, as Physics did (the "only two small cloudlets" of Lord Kelvin, William Thomson, in 1900 [8]); linguists expected new languages, but generally not new subfamilies. Lots of extant and extinct IE languages were known, some practically only as a name (Greek or Latin authors mentioned them), some from scanty inscriptions of a few words, but some with literature. Let us then look back a century.

            Travellers and then linguists discovered newer and newer IE languages, but all of them belonged to the Indo-Iranic or Aryan subfamily. Only one other subfamily belonged to Asia, with a single extant language, Armenian. And there was no chance to find new living IE language in well-researched Europe.

            As for extinct languages, from time to time inscriptions, codex glosses and such were known, and careful Brugmann does mention some languages which either may have been Indo-European, or they definitely were but the texts were too scanty to tell anything else about. His views are as follows.

            Phrygian and Thracian were IE, maybe satem, and may have been in kinship with Armenian; texts were scanty. (Indeed a Hungarian textbook, which I will not cite here because it is written in Magyar, so you could not read it, from 1970, still tells that the only Thracian text is a few words inscribed on a ring.)

            He knows about 3 languages at the periphery of Italy which had died out already in antiquity. Messapian belonged to Albanian, while Illyrian and Siculian were kentum languages of unknown type. (Now Illyrian is an extinct but autonomous branch, maybe with some connections to Italic & Celtic.)

            Brugmann tells that Lydian showed IE characteristics, but he knew too little about it to be more definite. Finally he mentioned Lycian, which may have been IE, or it may not.

            Surely individual linguists had individual guesses about Lydian & Lycian; but there was no expectation of new subfamilies. OK, there had been Thracians; then a branch of Phrygians, the Bryx, had still been living in European Thrace in the mythic age (read the stories about the Golden Fleece and Iason), so Phrygians were kin of Thracians. Now, Lydians lived between Phrygians, Greeks and Persians, so they may have been a more Eastern kind of Phrygians. If Thracians & Phrygians were classified together with Armenian, why not Lydians? And then the last, Lycian? Maybe it also belonged to Armenian.

            As I told, they neatly dissected these languages into kentum/satem. All kentum languages were Western, and all satem Eastern.

            Of course, with Western optics. Namely Greece at Long. 24 E was Western kentum, while Prussian at Long. 20 E was Eastern satem. But we know this. Until the 90's Turkey as NATO (Northern + Atlantic) was Western even at Long. 44 E and we at 19 E were Eastern. Still, a group containing Indian, Iranian, Armenian, Phrygian, Slavic & Baltic is Eastern compared to the other of Italic, Celtic, Illyrian, Germanic & Greek.

            So k-stops remained in the West but became sibilants in the East. Why? Who knows? Maybe originally the change was a fashion, which could not overstep the borders of Old Greece; Greeks regarded Phrygians, Thracians, Lydians & Iranians as barbarous. Or maybe satemization was laziness. A stop sound is hard & disciplined. If we become lax, air comes through a slit, and then there is a sibilant. For XIXth century Europeans it was trivial to consider Easterners lazy & undisciplined.

            And then came 2 discoveries. The bigger one will not enter into this story too strongly. In the 1860's travellers became fascinated by an old civilisation of Eastern Anatolia, and some experts of the Bible guessed (correctly enough) that the ruined cities had been built by the people called Hittite in the Bible. Then inscriptions were collected, and in the 1910's Hrozny deciphered them. A strange and archaic IE language came forward. It was kentum, in spite of being so Eastern; but it was old too. Maybe older than satemization.

            Indeed, it was old. The original Hittite empire was broken up (probably by Phrygians & Lydians) in 1190 BC as we know now. Classical Greek travellers may have observed the two Hittite colonies in Syria, but no text about linguistic observations is extant. What is even stranger, the Hittites surely were in connection with Mycenaean Greece, and it is very probable that somehow the Hittite Empire was peripherally involved even in the Trojan War; still even now we do not know Greek texts about Hittites. The Classic Greeks knew nothing about them: so XIXth century scholarship knew nothing either.

            It seems now that Lycian classifies together with Hittite, in an Anatolian subfamily. But then we are ready with this surprising new subfamily. Let us then see the second one.

            My compatriot, Sir Aurel Stein (I am serious) travelled a lot in Western China about 1900. It is China now, but still an autonomous territory, Xinjiang-Uygur. 2000 years ago that territory was populated by Juechis, Wusuns, Huns and whatnot. Sir Aurel discovered some Buddhist cave sanctuaries walled up for more than a millennium, and carried some books to museums. Then the Prussian King (the same person as the German Emperor) also sent an expedition, and they got manuscripts as well. And gradually a new language came forward from some manuscripts. Scholars named it Tocharian, after a geographical name in Ptolemy and other geographers. E.g. Strabo [9] mentions that the Tokharioi were just passing the Iaxartes (we know this river as the Syr-Darya). Maybe they were the ancestors of the scribes who wrote the Buddhist manuscripts, maybe not.

            In any case, the Tocharian texts were a shock. First, they were the Easternmost Indo-Europeans. Nobody expected Indo-Europeans at Long. 95 E!

            However, once we know that they were IE, everybody can imagine a small group of Parthian horsemen trekking Eastward. (See e.g. H. Beam Piper's Alternative History novel Lord Kalvan of Otherwhen, which I relegate to Appendix A. Here it is enough to note that even in the novel of the Alternative History expert [10] the Bering-crossing Indo-Europeans are Aryans, so IndoIranians. Even he, in the 60's, had not heard about Tocharians; or did not take them seriously.) But this language was kentum! This is easy to show: "100" is känt/kante (there were 2 dialects).

            And then came the third surprise. The extinct language had a lot of -r- sounds in the passive verbal suffixes.

            If you have learnt Latin, you see what I mean. See the Table:

English       Latin      English          Latin
I see         video      I am seen        videor
thou seest    vides      thou art seen    videris
he sees       videt      he is seen       videtur
we see        videmus    we are seen      videmur
you see       videtis    you are seen     videmini
they see      vident     they are seen    videntur

And so on: also in Past and Future.

            It seems as if this -r- was the original suffix of the Passive Voice in Latin. And XIXth century linguists found this -r- in the Passives of Oscan and Umbrian (so in all Italic) + Celtic; but nowhere else. Then the XXth century found it in Hittite; and in Tocharian. In Tocharian A klots=ear, klyos is to hear, and klyosr is "is heard". Turning to Latin,

klyosr = auditur

Is it not nice?

            In the first half of the XXth century many linguists preferred the Italoceltic theory. Sometime in the far, far past (now we know that it was ca. 2500 BC) the Indo-Europeans diverged. Some went Northeast, some Southeast, some West and so on. But the ancestors of the later Italics and Celts remained together for a while, so there are more common features between them than between any other pair. To be sure, the passive -r- is not the only common feature. Some words were almost mutually understandable between them in the time of Julius Caesar, e.g.

English    Latin    Gaulish
King       rex      rix

            Now some scholars accept the Italoceltic cohabitation, some do not. For example see Kortlandt on one hand [11], the cladistic people on the other. In any case, even Warnow's cladistic computer classification [12] clearly shows a common Italoceltic branch; and in another computer calculation [13], the calculators take 3000 & 2400 BC as the years for the beginning and end of an independent Italoceltic community. But computer calculations or not, when our sources (Greek travellers and Roman traditions) start to report, Italics & Celts are just neighbours. An earlier common life is more or less expected. And if we indeed go back to 1904, then European linguists had all learned Latin (and Greek); they felt the -r- Passives in their bones, so an Italoceltic community was trivial. (I, in 2006, am a physicist. I will not try to decide what the truer reasons of the similarities are, provided that the similarities exist at all.)

            I was unable to find anything about Tocharians in Szabédi. Indeed, you cannot expect them there, because they were not in Brugmann's textbook written in 1904.

 

6. THE INDO-EUROPEAN MIGRATIONS IN THE MINDS OF SCHOLARS AT THE BEGINNING OF XXTH CENTURY

 

            If you have a group of related languages, then maybe in old times the ancestors spoke the Ur-Language in the Urheimat.

            This is not necessarily true. You can imagine a chain of ancient languages where every community understood the two neighbours and not the farther ones. However let us not be pedantic, and continue. Where was the Urheimat?

            There were two preferred places. But first let us state that two living languages were the most archaic ones: Lithuanian & Sanskrit. Sanskrit is still a living language, albeit barely; Brahmins, unlike Roman Catholic priests, are not celibate, so you can imagine a family in India speaking Sanskrit as first language, and such families are reported; I do not know if I should believe the reports. Otherwise Sanskrit is artificial and some 2000 years old.

            Now, there are arguments that very archaic idioms remain at the core territories where the foreign influence is weakest. Then you may conclude that the Urheimat was 1) close to the Baltic; or 2) in India. The two opinions were obviously incompatible, and speakers of Indian IE languages were far more numerous (hundreds of millions) than speakers of Baltic languages (3 million). And ex oriente lux.

            OK, then put the Urheimat into India, or, more moderately, into the Caspian neighbourhood. Then the Italocelts had come a long way to the West, and they crossed East Europe, say the Ukraine. There is no insurmountable difficulty in their overlapping the Finno-Ugric Urheimat (tentatively a strip from Poland to the Ural Mountains) during this long trek.

            The alternative Urheimat in Ukraine does not prohibit this overlap either, albeit in a more complicated way. Here I will tell only half of the story; the other half will come soon.

            Surely the IE and FU Urheimats were neighbours (if the second one existed at all). While the grammars differ as much as it is possible, some very primary vocabulary agreements can easily be found. E.g.

            NAME. PIE "nem-", PU "nime". Magyar "név", Finnish "nimi", Latin “nomen” &c.

            WATER. Magyar "víz", Finnish "vete", German "Wasser", Hittite "waatar", &c.

            THOU. PIE "te", PU "ti" (The Sg2N pronoun.) Latin "tu", German "du", Magyar "te", Mordvin "ton", Selkup "tan" &c.

            Dozens of such agreements exist (see e.g. [14]). Some of them may be simply borrowings, but maybe not all. Since Magyar was originally spoken in Westernmost Siberia, if both Magyar and Finnish (“Suomalainen”) are included in an etymology, that means the root was shared at least at the PFU level.

            So the Urheimats were neighbours. Now this means that in a (strange enough) migration either a FU group could borrow more IE words, or vice versa.

            Now, it seems that Proto-Magyars never crossed the line between Ukraine and Italy. (This is not sure, but it is the consensus reconstruction.) It seems that Magyars remained at the Easternmost side of the FU territory until ca. the 5th c. AD, and then in the Migration Period they started West. On the other hand there were IE groups crossing the Ukraine-Italy (or even the Ural-Italy) line.

            Two such groups yielded loanwords which are proven beyond doubt. The more famous are the Indo-Iranians. Lots of common words between Magyar and some Indo-Iranian languages are known, e.g.:

            GOLD: Magyar "arany", Avestan "zaranya".

            COW: Magyar "tehén", Avestan "daenush" (female animal), proto-Iranian "dhena" &c.

            MILK: Magyar "tej", Avestan "dayah" &c.

            FELT: Magyar "nemez", Pehlevi "namat" &c.

            Several dozens of such words are known; some of them are proven. The explanation may be simple enough: the whole Ugric group was involved in the Andronovo Culture in the 2nd millennium BC, whose Southern component was "Sarmatian", so some kind of Indo-Iranian.

            However most of these words could not mislead Szabédi. Indo-Iranian forms seriously differ from Latin, or Italic or Italoceltic. E.g. COW is Latin "vacca", or more generally "bovis", and Magyar "tehén" cannot correlate with that.

            Now we are nearing my suggestion for the explanation of Szabédi's error.

 

7. ON THE TOCHARIANS

            As was told earlier, Tocharian was the Easternmost IE language, located in "historical" times (i.e. when the written texts were made) in the Tarim Basin. However, the language is surprisingly "Western". E.g.:

            FATHER is "pacar". For most of you this, while not un-Indo-European, may not seem too "Western". But here "c" is palatal "t", so "t'". This sound is absent from recent Western European languages; however, it was present in Late Western Vulgar Latin in "ti-" groups; it is now written as "ty" in Magyar, "t'" in Slovakian, "ci" in Polish, "c'" in Croatian, and "ky" in Romanized Japanese. (Or maybe Tocharian c stood for the "ch" of English; the palatal "k" and the "ch" sound are not far from each other and are rather confused by many Westerners, but Magyars, Slovaks, Croats and Japanese can distinguish them. From the history of Late Vulgar Latin we see that after some evolution "-tiV" and "ci-" groups coincided in a string of dialects.) In any case, Tocharian "c" is a "t"-sound in the general sense. Now look at Tocharian "pacar", Latin "pater". In the East it is "pitar" in Old Indian.

            MOTHER is "macar"; it is "mater" in Latin and "matir" in Gaulish ("mathair" in Irish). True, it is similar enough, "matar", in Old Indian.

            DOG is "cu". It is "canis" in Latin, "ci" in Welsh (and "kyon" in Greek &c). To be sure, it is "kutya" in Magyar, "koira" in Finnish and "köpek" in Turkish, so it may be an old Nostratic word, but anyway within IE it seems Western.

            HORSE is "yuk/yakwe". In Latin "equus", the root is "epo-" in Gaulish; see the equine goddess Epona. On the East it is "asva" in Old Indian and "aspa" in Old Iranian; the same root but different evolution, obviously.

            The Tocharian language seems KENTUM, "100" is "känt". All Western IE languages are kentum, and no Eastern ones are such, except Tocharian and Bangani.

            Tocharian Passive suffixes tend to have an -r- sound, as shown earlier. This is rather the rule in Italic languages, it is so also in Celtic ones, and it is also detected in Hittite and in Tocharian. Nowhere else. There are two logical explanations: 1) the -r- Passives are original, surviving only in Hittite, Tocharian & Italoceltic; 2) the -r- Passive is an innovation shared by Hittite, Tocharian and Italoceltic. While the majority now accepts the first explanation, that is theory, not fact.

            In any case, the Western origin of Tocharian, even its origin in the neighbourhood of Italic and Celtic, is widely accepted. See e.g. the genealogical tree of Gamkrelidze & Ivanov [15]. The general explanation for the Easternmost historical position of Tocharian is that its speakers were the first to start East, maybe in 3500 BC. Then they probably did it mounted, not on chariots; and all Magyars (and Altaians) know this is the faster way. Now, there is no strong argument against starting East on horseback even from the neighbourhood of the Italocelts. So you may (or may not; as you like) imagine an Italic-Celtic-Tocharian Sprachbund, or even an ItaloCeltoTocharian subgroup dissolved by the Drang nach Osten of the Tocharians.

            Later the Tocharians did surely meet FU people, more probably Eastern Ugric than Western Finnish. Arguments are numerous. E.g.:

            Stops are all voiceless in Tocharian. This is also so in FU, except Ugric Magyar and Finnish Permian, where voiced stops are secondary, maybe since 7th c. AD.

            Some Tocharian cases of nominal declension seem agglutinative rather than inflective, and at least one is not IE at all, even in meaning.

            Dvandva type compounds occur in Tocharian. An example is "akmal" or "aakmal" = FACE. Now, the word seems to come from "ak-malan" = "EYE-NOSE". The Magyar word for FACE is "arc", but in the 19th c. AD it was still "orca" (the form is now archaic, but still in use), written then as "orcza". We know that it is a dvandva: "orr+száj", so NOSE+MOUTH. The same principle, if not an exact mirror translation of "akmal".

            Finally let us apply the usual demonstrative trick for a wide comparison: numerals from 3 to 8. Numbers 1 and (sometimes) 2 have the tendency to come from elsewhere ("this", "other" &c.), and numbers 9 & 10 vary even within IE. (E.g. "9" is "nine" in English, but "d'evit'" in Russian.) I took the numerals from [16], but consulted a lot of other sources too, and composed the Anatolian column from 2 languages. So:

 

Numeral  Gothic  Tocharian A  Oscan   Gaulish  Hittite    Lithuanian  Slovak  Albanian  Sanskrit     Greek     Armenian
Subfam.  German  Tocharian    Italic  Celtic   Anatolian  Baltic      Slavic  Albanian  IndoIranian  Greek     Armenian
3        threis  tre          tris    treis    tri        trys        tri     tre       tri          treis     erekh
4        fidwor  s'twar       petora  petor    meiu       keturi      shtyri  katër     catúr        tettares  chorkh
5        fimf    päng         pompe   pempe    panku      penki       pät'    pesë      pángca       pente     hing
6        saihs   säk          sehs    suex     ?          sheshi      shest'  gjashtë   sas          hex       vec
7        sibun   spät         seften  sextan   shipta     septyni     sedem   shtatë    saptá        hepta     evthn
8        ahtau   okät         uhto    oxtu     haktau     ashtuoni    osem    tetë      astá         okto      uth

 

            I admit, my orthography is somewhat arbitrary here. In principle, for languages written in Latin script I used the national orthographies; still, in Baltic & Slavic "sh" and "th" are the English sounds. Accent is not indicated, and length not always.

            As for "Hittite", I am combining "cuneiform Hittite" or Kaneshian with "hieroglyphic Hittite" or Luwian. I think the main difference is not between synchronic Kaneshian and Luwian; rather, hieroglyphic Hittite is many centuries later than the cuneiform. Hittite society collapsed at the center almost synchronously with the Trojan War, while near the Syrian border small dukedoms maintained Hittite civilization for a further half millennium, but these local states used hieroglyphs.
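            For readers who like to see such a table handled mechanically, here is a minimal sketch in Python. It is only an illustration under loud assumptions: it measures the plain edit distance between the spellings used in the Table of Numerals, in the arbitrary orthography admitted above, and it is in no way the comparative method of linguists. The names levenshtein, NUMERALS and mean_distance are invented for this sketch, and the unknown Hittite "6" is simply skipped.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insertions, deletions, substitutions cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution or match
        prev = cur
    return prev[-1]


# Numerals 3..8, copied from the Table of Numerals above ("?" = unknown).
NUMERALS = {
    "Gothic":      ["threis", "fidwor", "fimf", "saihs", "sibun", "ahtau"],
    "Tocharian A": ["tre", "s'twar", "päng", "säk", "spät", "okät"],
    "Oscan":       ["tris", "petora", "pompe", "sehs", "seften", "uhto"],
    "Gaulish":     ["treis", "petor", "pempe", "suex", "sextan", "oxtu"],
    "Hittite":     ["tri", "meiu", "panku", "?", "shipta", "haktau"],
    "Lithuanian":  ["trys", "keturi", "penki", "sheshi", "septyni", "ashtuoni"],
    "Slovak":      ["tri", "shtyri", "pät'", "shest'", "sedem", "osem"],
    "Albanian":    ["tre", "katër", "pesë", "gjashtë", "shtatë", "tetë"],
    "Sanskrit":    ["tri", "catúr", "pángca", "sas", "saptá", "astá"],
    "Greek":       ["treis", "tettares", "pente", "hex", "hepta", "okto"],
    "Armenian":    ["erekh", "chorkh", "hing", "vec", "evthn", "uth"],
}


def mean_distance(lang_a: str, lang_b: str) -> float:
    """Mean edit distance per numeral, normalised by the longer spelling."""
    pairs = [(x, y)
             for x, y in zip(NUMERALS[lang_a], NUMERALS[lang_b])
             if "?" not in (x, y)]  # skip numerals unknown in either language
    return sum(levenshtein(x, y) / max(len(x), len(y))
               for x, y in pairs) / len(pairs)


if __name__ == "__main__":
    # Spelling distance of every column from Tocharian A (0 = identical).
    for lang in NUMERALS:
        if lang != "Tocharian A":
            print(f"{lang:12s} {mean_distance('Tocharian A', lang):.2f}")
```

            A small number here means similar spelling in this table and nothing more; a regular sound correspondence (as Armenian "erekh" to "tri") looks like a large distance, so the figures cannot decide kinship.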

            Henceforth I assume that Tocharian was the closest kin of Italoceltic. I still cannot prove it. In a cladistic context the similarities are not proofs; there they may be simply commonly preserved archaisms. However I am sure that in linguistics the cladistic approach is only an approximation. Languages do not split cleanly: neighbouring related languages influence each other even after they have become distinct languages. One example is the Romance languages, which influenced each other during the whole Middle Ages (most strongly Italian & Castilian); another is the Slavic ones, where in the 19th century this mutual influence was even a political programme of the Pan-Slavs. In both cases the languages were on the verge of mutual understanding, at least for passive reading; and mutually readable languages diverge more slowly. Also, imagine a migration. Clearly after some time differences will still be smaller between original neighbours than between original non-neighbours.

            My second reason for the assumption is definitely not linguistic. Szabédi was the last rector of the Bolyai University of the Magyar-speaking people of Roumania. In addition he was from the same tribe as I. I admit, it is improbable that he was right about the genetic origin of our language. But I am positively indoctrinated for him and his cause: so I think he was not a madman. And I can find an easy explanation for a mistake.

           

8. THE RELATIVES OF LATIN

            Szabédi discussed the close relatives of Latin. True, he was interested in the relatives of Magyar, and he believed he had found Latin to be one. However such a statement is reciprocal: when looking for close relatives of Latin, he believed that Magyar was one of the nearest. But of course, proto-Latin and Latin's Italic relatives were even closer. Now, let us start the inverse way, and let us remain amongst Indo-European languages.

            Surely Latin had its close relatives in Italy before the Roman Empire. Two of them, Oscan & Umbrian, left behind extended texts. A third, Faliscan, may have been simply a peculiar dialect of Latin; or may have been the very closest kin. (Differences between Latin & Faliscan are not bigger than between recent neighbouring Slavic languages.) Roman authors recorded the Latin kings of Alba Longa, and if one believes the tradition, Latin lived her separate life at least since 1100 BC. We know (?) that the difference between Alba-Longan and Common Latin was the Trojan, Dardan or Greek influence on the leaders of Alba Longa ("Aeneas"), and later between Rome & Latium the strong Etruscan influence in Rome. For the "proto-Latin" phase one may play with a trial (Latin-Oscan-Umbrian) reconstruction of Italic. Italics were surely immigrants, but the exact chronology of the migration is changing even in these years.

            Now, let us go one step backwards. What was the nearest kin of Italics? I think there is no doubt here: the Celtic. Many authors even speak about an Italoceltic stage, so first the speakers of the common idiom of later Italics and later Celts separated themselves from the Western IE group, and for some centuries they still spoke mutually understandable dialects. Where? Outside of Italy, somewhere in Central Europe. (E.g. Pannonia is a guess; surely it could feed more people than the Alps.) When? Say, in the first half of the second millennium BC.

            Other authors refuse the existence of an Italoceltic, but still Italic was close to Celtic. Look at the Numerals 3-7. (But I am tricky…)

            Now, the next level is: what was the closest relative to Italoceltic (or: to Italic and Celtic)?

            Here the answer is equivocal; and languages surely did die out even without trace. Venetic is a possible guess (if it was not even Italic), Illyrian is another. But we do not know too much of them. Restricting ourselves to the 11 well known subfamilies of the Table of Numerals, 3 non-absurd answers are possible:

            1) None; this would mean that more than one (Western) group is at a comparable distance from Italoceltic.

            2) Greek. This was possible for 19th century scholars; and sometimes the etymologies are surprising indeed. Some authors believed that Western Greeks (“Dorians”) before their final Southeastern migration lived near to Italics; some tried with Aeolian-Italic comparisons.

            3) Tocharian. When Tocharians appear in Strabo, they are the Easternmost IE people, but the language is surprisingly Western (kentum, r-passive, similarities in individual words &c.). This classification was more or less a commonplace in the 20th century, see e.g. [15]. The general idea is that Tocharian separated from a Western group in the general neighbourhood of Italic and Celtic, only the Tocharians either started early or travelled fast. The means are not trivial; but let us continue.

 

            Now, let us discuss very shortly the 3 viable possibilities.

            None means that we cannot proceed any further. This may be true, or it may not. We can return to this opinion if we are unsuccessful with the other two.

            Greek is surely a close relative of Latin, but it may not be a close genetic relative. (According to the majority of experts; Garrett has another idea which you can find in App. B. That view is quite non-cladistic.) After settling down, Italics & Greeks were quite near to each other. Maybe Illyrians were between, but maybe not even they. So Greek could influence Italic; and it seems that Mycenaean Greek society was richer/more complicated than the Italic one(s) about 1500 BC. Then, just after the Trojan War, refugees may have arrived at Sicily/Italy (this is e.g. the Aeneas story); they might have imported Greek/Dardan/Phrygian/&c. words. (Homer is explicit that all Greeks were on the Greek side. I am not absolutely convinced, having read many examples of political propaganda. Gamkrelidze & Ivanov [15] believe that Greeks trekked through Western Anatolia to Greece. They may or may not be right. If they are, some Greeks may have remained on the Asian side. Or may not...) After the Trojan War and the Dorian Conquest any kind of Mycenaean emigrants may have arrived. Two candidates are "Arcadian" Mycenaeans (some of them went to Cyprus; that was no less exotic than Italy), and early Aeolians. True, we do not have written evidence about such migrations, but we have no written evidence about anything between the Dorians and 750 BC. So I think the similarities are no strong evidence for a close Italic-Greek genetic connection, in any case not as strong as that between Italic and Celtic.

            However the third possibility, Tocharian, is interesting. It is not exotic/absurd in any case. In this Chapter I remain with this choice. This assumption may be wrong. And then what? Would this be the first linguistic idea proven wrong? I will discuss the question further; but only in an Appendix.

            For a Hungarian, the early arrival of Tocharians at the (almost) Far East from Western Central Europe has a simple enough explanation. They took their horses and rode. The Avars covered exactly the same distance in 10 years, between 552 & 562 AD. (In 552 Khagan Bumin, the son of the Grey Wolf, Boz Kurt, or more correctly, Bora Kovrat, expelled them from the neighbourhood of Mt. Altai. In 562 they defeated King Gisebert of Thuringia at the Elbe.) But: were the Tocharians riding?

            Westerners generally believe that wagons and chariots preceded riding. This is based mainly on Greek traditions. Mycenaean elite warriors were charioteers, not riders. Small horses are better for pulling chariots than for riding. No Mycenaean or Homeric source mentions riding.

            On the other hand we did find a horserider grave [17] from between the Tiszapolgár & Bodrogkeresztúr periods, just when some uncouth steppe warriors represented the Eastern challenge for the Carpathian Basin. The absolute time, as we know now, is ca. 4200 BC. So for us there is no problem how proto-Tocharians could reach the trans-Iaxartes steppe.

            OK, proto-Tocharians could not have been horse nomads. For a Hungarian a Nomad, and definitely a horse nomad, is not a miserable fellow having nothing except his clothes. A horse nomad is the product of a long evolution, transporting the whole economy & society on horseback & wagons, including the houses as Yurts. That way of life was obviously nonexistent before the late stages of Andronovo, say 1200 BC. Proto-Tocharians might ride, but even then they had to stop frequently. So what later took the Avars a mere 10 years may have taken proto-Tocharians even 200 years. But we have unrecorded millennia.

            Linguistic records do not help too much. Horse is yuk/yakwe, clearly a cognate of Latin equus, a definitely IE word. The wagon is kukäl, clearly a cognate of PIE *kwel (wheel), so they started from Ukraine with horses and wagons. We still do not know the details.

 

9. IF TOCHARIANS WERE THE CLOSEST RELATIVES OF ITALOCELTS...

            Then Szabédi, martyr of Magyar higher education, was not a madman. (To be sure, this question does not have to be settled to decide if he was a noble Magyar. To decide his status as Hungarian I should confer at least with my Slovakian colleagues.) Namely, visualize a near kin of Italocelts/Italics/Latin on the way to the Tarim Basin. Somewhere and somewhen they met Uralics, most definitely Eastern Ugrics, ancestors of the Magyars.

            We do not know where and when this happened; but surely on the steppe, earlier than the Andronovo Bronze Age. The IE component of the Andronovo was very probably IndoIranian ("Sauromat"), and IndoIranians were newcomers in the East compared to Tocharians.

            Surely, Eastern Uralians & Tocharians did meet, and the meeting was more than transient. Some of the Tocharian cases of nouns are agglutinative. One case, the Pervasive, is unheard of in the IE community (except, maybe, for Northern Lithuanian dialects heavily influenced by Uralic Livonians). Aakmal, EYE+NOSE, is FACE, as ARC=ORR+SZÁJ (FACE=NOSE+MOUTH) in Magyar (and if you are not convinced, 150 years ago "arc" was still "orcza"). And so on.

            "And so on" stands for research not done so far. Uralists are not too interested in Tocharians, and Indo-Europeanists generally do not know any Uralic language. But Tocharians could have transferred lots of words to Ugric. And then Szabédi met lots of roots which, as we now know, mimic proto-Latin.

            Be careful here. For anybody but Szabédi, Proto-Latin was the language of Alba Longa (whose last king/dictator, Mettius Fufetius, was defeated by Tullus Hostilius about 665 BC, see the Horatius/Curiatius story in Livy [18]), or the pure Italic dialect of Latium just before the arrival of Aeneas. However for a Magyar the differences between proto-Latin & proto-Tocharian are almost undetectable. I saw and heard my mother conversing with wives of Russian occupying officers in the street. Russian is Eastern Slavic and Slovakian is Western Slavic. My mother was Magyar and I do not know if the Russians detected that fact; but communication went without serious problem.

            Latin "oculus" is Tocharian "ak/aak". Now, Szabédi derives Magyar "agy"="brains" from "oculus". I do not know if the meanings of "eye" and "brains" are near enough; but I am sure the derivation is easier from Tocharian "ak" than from Latin "oculus". Also, he tries with the derivation "ocularia"~"agyar". Here "ocularia" is the "tooth of eye", the canine tooth. In Magyar it is indeed the "tooth of eye", and he cites a French construction. But again, if "agyar" can be derived so at all, it is much easier from "ak".

            Szabédi's idea is that Magyar "pata", "hoof", comes from proto-Latin *podos, Latin pedes, foot. Indeed, the hoof is a part/extension of the foot, "pedes" in Latin. But Latins can hardly have been in the East. But would you like Tocharian "pe"="pedes"="foot"?

            Szabédi claims a connection between Latin "oriundus"="originating" and Magyar "eredő"="originating". This etymology may be "mirageous" (the Magyar catchword for everything scholars do not believe to be true). But "orior" also has the meaning "comes up" (e.g. a star). And Tocharian "orto" is "up".

            The Magyar "könyök"="elbow" has good Uralic etymologies as far West as the Finns. Szabédi, however, tries proto-Latin "genucla". Hence one could indeed get "könyök", although "genu"="knee"; but for amniotes the two joints are homologous. But Latins were far from Western Siberia. However Tocharian "kanwem" (it is Dual, of course) is not far.

            And now we have arrived at the obscure points of Orthodox Magyar etymology. Magyar "piros" is "red", although rather "fire red" than "blood red". The latter is "vörös", clearly a derivative of "véres"="bloody". "Piros" has no convenient Uralic etymology, while "pír"<-"pyr" (Greek; fire) would be good but historically difficult. Szabédi suggests "purus"~"piros", but he is "mirageous", the usual slogan of Hungarian intelligentsia if the theory or the person is simply not accepted. (OK, he was a University High Official; but nominated by the Roumanian, not the Hungarian, Ministry of Education.) OK, and what about Tocharian "por"="fire"?

            For metals we arrive at a difficult point. It is generally accepted that the only common IE metal was copper; maybe it was called something preserved in Latin as "aes" meaning generally "metal", sometimes "copper" or even "bronze". Greek "metallan" was something "searched for".

            In Uralic, etymologically related words definitely mean different metals; surely, because the FU linguistic unity, if it existed at all (the original territory seems rather large), did not survive the Eneolithic (if at all; maybe the last stage was still Neolithic). Still, there are common metallic words, but their meanings are often different. Let us see an example. Magyar "vas"="iron" is obviously related to Finnish "vaski" (Fi -sk-~Ma -s- seems regular), but the latter means "copper". Copper is "réz" in Magyar, but the word may come from Slovakian "ruda" (or from some older IE connection). Szabédi turns to Latin, finds there the word "vasculum", "small vessel", and argues that some vessels were made of copper or bronze, and in Uralic the word came to mean the metal.

            OK, "mirageous". Why did a "small vessel" mean "metal"? However there is a quite classic Latin word "vascularius", meaning an artificer "manufacturing metallic vessels". I do not know, why; but this was Latin usage. These metallic vessels of Latins obviously were made of copper or bronze (otherwise you could not have water in them); or maybe, for rich people, of silver or gold. In Magyar "bronze" is an international word, "copper" is maybe Slovakian (although both words must have existed in 1,000 BC), "silver"=”ezüst” is dubious, and "gold"="arany" is surely from Iranian (e.g. in Avestan: "zaranya"). What are the words in the language of our closest relatives, the Manyshi?

            The words are different; still we can learn something. "Silver" is "oln". But "tin" is also "oln". Manyshi does not distinguish brilliant white tin from brilliant white silver! ("Oln" is "ón" in Magyar, so the word is related but not the metals.)

            As for the coloured metals, "copper" is "tarn'e"; no Magyar etymology. But interestingly, "gold" is also "tarn'e". Magyar's closest kin, Manyshi, does not distinguish red copper from yellow gold; and white tin from white silver.

            Now, wait a moment. "Silver" in Latin is "argentum", and in Tocharian B "white" is "árki". This might be an accident. But in Tocharian B "wäs" is "gold". Now, is Finnish "vaski"="copper" (recognised by "mirageous" Szabédi) simply an accident?

            And is the similarity in animate pronouns La "quis"~Ma "ki" an accident? It is "kus" in Tocharian! (OK, maybe the origins are common from Nostratic...) Similarly, La "sal"~Ma "só" (="salt") may be "mirageous", but Tocharian "sále"~Magyar "só" is not absurd. And, at last: Tocharian is a Kentum language. "100" is "kant". Now, "army", "host" is "had" in Magyar. (Now rather "hadsereg", but originally it was simply "had".) "Had-nagy" is now "lieutenant", but this title originally meant a leader of the host; and "hadra kel" is "goes to war".

            Now "had" has very good Uralic etymology. Magyar "h" before back vowels came from "k", and "d" from "nt". So Magyar "had" should come from "*kant". Indeed Magyar "had" is etymologically identical with Manyshi "khant", Khanty "kant", Mordvin "kon'dä", Finnish "kunta". In Mordvin the word now means "family", in Finnish, "community" [19]. But everywhere, a "coherent group".
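            The two correspondences just named can be written down as a toy rewrite rule. The following is a minimal sketch in Python, not a model of Ugric historical phonology: the function name, the simplified set of back vowels, and the restriction of the k > h change to word-initial position are all assumptions made only for this illustration.

```python
import re

# Assumption: a simplified set of back vowels, enough for this example.
BACK_VOWELS = "aou"


def ugric_to_magyar(proto: str) -> str:
    """Apply, in order, the two correspondences named in the text."""
    # 1) k before a back vowel -> h (applied here to word-initial k only)
    form = re.sub(rf"^k(?=[{BACK_VOWELS}])", "h", proto)
    # 2) the cluster nt -> d
    form = form.replace("nt", "d")
    return form


if __name__ == "__main__":
    print(ugric_to_magyar("kant"))  # the reconstructed *kant of the text
```

            Applied to the reconstructed "*kant" the two steps give "hant" and then "had", which is the Magyar form discussed above; a form with a front vowel after the k is left with its k.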

            Now, "100" is a big number. In contemporary Hungarian military language, a "100-ad" (század) is a body led by a Captain. (Although he leads rather several hundreds.) Captain's equivalent in Ottoman Turkish is "yüzbasi" and that is simply "head of 100". And "100" is "kant" in Tocharian. Is it a mirage? Am I a madman? Was Szabédi a madman?

            He was not right. He did not know about Tocharians because great Brugmann did not know about them. I think even Nicolae Ceausescu, Head of the Roumanian State, dissolver of the Bolyai University, later Conducator, did not know about them. And then?

            Here the main part ends. What remains, goes into Appendices. Namely, I, as a physicist, have learned that Absolute Truth does exist (should exist?), even if we now do not necessarily know it. The Appendices formulate my Notes about this unknown Absolute Truth. And beware: different Appendices contain mutually inconsistent scenarios. So maybe you must not simply add up their contents.

 

APPENDIX A: LORD KALVAN OF OTHERWHEN

            If one has no empathy for science-fiction, he should jump over this Appendix. However even for such persons App. A demonstrates how easy it is to forget about Tocharians.

            Alternative History, although it is mainly science fiction, has quite strict rules. You assume that things can happen also differently than in Our TimeLine (OTL) and may even assume that highly developed future science/technology will be able to visit the ATL’s (Alternative TimeLines). If somebody asks why one should assume this, you may refer to Everett’s Quantum Theory [3], may say that it is only for fun, or may discuss the beneficial effects on OTL historiography. It is no surprise that even great Toynbee wrote a book where each chapter discussed a respective ATL [20]. (What if Philip II of Macedon had not been murdered? What if Alexander III had not drunk himself dead? And so on.) The main rule is: you may assume only one critical alternative: a different result of only one decision, and even this must be “probable”. At the moment of the different decision the specific Alternative History starts, and from this moment you must build up again the History of Maximal Probability. (E.g. if Philip II is not assassinated at the wedding of his daughter Cleopatra and King Alexander of Epirus, then Philip remains the Chief Commander of Free Greece, and according to the decisions of the Congress of Corinth the Macedonian Army on Greek money starts to set free the Greek cities on the Eastern shore of the Aegean. Surely he defeats the Persians; very probably the new border will be the Halys River; but no youthful adventures to Persepolis, Ecbatana & Memphis. Philip might have gone rather to Southern Italy, where Greek cities were in danger from barbarian Lucanians, Bruttians &c.; and to Sicily endangered by Carthaginians. If you do not believe this scenario, remember that in OTL his son-in-law Alexander of Epirus died on the battlefield in Lucania, and Pyrrhus, ally of Tarentum against Rome, was a cousin of second rank of Alexander of Macedon. And what would finally happen with Roman History?)

            The moment of the Alternative Decision is the Point of Divergence, or as an acronym PoD. The later the PoD, the more similar the histories.

            H. Beam Piper in the 40’s invented a scheme. World is five-dimensional, the fifth is time-like, and one civilisation of the multiple humanity happens to learn Crosstime Travel. Afterwards they can exploit the other TimeLines (mining &c.) Now, obviously “neighbouring” TL’s are similar if “equivocal alternatives” are rare. Piper’s map of ATL’s is of course incomplete, but interesting.

            There are 5 Levels. In this scenario Humanity came from Mars, fleeing from loss of atmosphere & such. The exodus happened under dire conditions and on lots of TL’s it was more or less catastrophic. On Level 5 TL’s the exodus completely failed and Earth remained for autochthonous Neanderthals. In OTL (Level 4) one spaceship performed the travel but could not settle down; one of her scout boats arrived with a few humans who then devolved back to Stone Age culture. On the other hand, Level 1 TL’s had quite substantial human population transplanted, with great part of the original Martian science & culture. So the so-called Home TimeLine is far before us.

            Very similar TL’s form a Subsector. OTL belongs to Level 4, Europo-American, Hispano-Columbian one, but there are others on Level 4 as well, e.g. Indo-Turanian or Sino-Hindic. (Names refer to the dominant cultures.) On the somewhat more advanced Third Level there is e.g. an Alexandrian-Roman Subsector.

            Now, Lord Kalvan of Otherwhen is originally Calvin Morrison, a cop from Pennsylvania, with Korean War experience. He is accidentally picked up for some paratime interval by a Crosstime craft and transferred to Level 4 Aryan Transpacific Subsector. That subsector is characterised by a very old decision when some Indo-Europeans started East, after some time arrived at the Bering Strait and crossed, maybe via the Aleutians. (In App. D we shall see that a very crude estimate for PoD time is 3300 BC.) There may be Indo-Europeans in Asia & Europe; the Zarthani in Eastern America do not know anything about Europe and only legends about Asia. However: what is this subsector, and who are the Indo-Europeans in Pennsylvania?

            Lord Kalvan of Otherwhen (I mean, the novel) is posthumous, composed from Gunpowder God and Down with Styphon. So Piper did not elaborate the details as much as in other works. However the Zarthani are light-complexioned; and they are Aryan. The subsector is called (on Home TimeLine) Aryan Transpacific; the TL’s where at some time in the past Aryans went beyond the Pacific.

            Now, in 19th century, and also in Hitler’s circles, Aryan and Indo-European were synonyms. But not for 20th century science or for Piper. For us Aryan is the synonym of IndoIranian: the term Iranian comes from the root Aryan, in translation “noble”, “free” and such. Iran is the Land of Aryans. So Piper’s Zarthani are IndoIranians who went to North America. Iranians are quite light-complexioned even on OTL; Indians are not so, but that is after 3 millennia in the Indian subcontinent. Remaining in the North, Proto-Zarthanis could remain light-complexioned. As for language, Zarthani must then be IndoIranian, and in closest relation with OTL Vedic Indian or Avestan; or, among “modern” languages, with artificially frozen Sanskrit.

            Piper died before making the last polish on Lord Kalvan of Otherwhen. His idea was continued by Carr. And in the Prologue of R. Green & J. F. Carr’s Great King’s War [21] on Home TL the greatest Aryan Transpacific expert Danthor Dras simply states: “…Zarthani, as this group of the Sanskrit-speaking Indo-Aryan settlers called themselves…”. Obviously, IndoIranians can be the Easternmost IE people only if Tocharians’ Drang nach Osten has been aborted somehow.

            Even in the 80’s for people interested in history the par excellence Eastern Indo-Europeans seemed to be the IndoIranians, not the Tocharians. Everybody forgets Tocharians, not only Szabédi.

 

APPENDIX B: WAS PROTO-GREEK SIMPLY A NUCLEAR INDO-EUROPEAN DIALECT?

            Ref. [22] sketches up a scenario diametrically opposite to Cladistics or Stammbäume. Strictly speaking, Garrett shows evidence only for Celtic, Italic and Greek (and Hittite), but for the others written records are simply not old enough. So, if you accept his arguments for these Western ones, it is possible that it was true for the whole family as well.

            He was not the first preferring such a scenario. But maybe he was the first to prove it for Proto-Greek.

            We do have Mycenaean texts from cca. 1400 BC, so we know that Mycenaean was not the common ancestor of all 1st millennium BC Greek dialects. Rather Mycenaean was one of the dialects at 1400 BC, the only one recorded.

            Let us see an example. “They are” is “ehensi” in Mycenaean, and in 1st millennium the –si can be found in Arcado-Cyprian (Mycenaean’s closest kin), Aeolic and Ionic, so roughly in Eastern dialects. But “they are” is “enti” in 1st millennium Western Greek, and it seems that –ti was the older. So Mycenaean, Aeolic and Ionic form a group (“common innovations”), but Western Greek is not the successor of Mycenaean.

            So far so good. But then the common ancestor of all Greek dialects (if it existed…) was spoken before 1500 BC. No problem even in Greece: first proto-Greeks arrived about 1900 BC. But what was the “Hellenism” of this Proto-Greek? What was the Common Greek innovation compared to Common Nuclear (so “without Anatolian”) IE?

            Garrett’s answer is: “almost nothing”. To be more definite: vowel structure of Greek is quite conservative, the verbal system is quite archaic, for consonants only the First Palatalisation may have happened before (or during) Proto-Greek, and for noun inflexion the only Common Greek specific innovation is the degeneration (coincidence) of Dat and Loc in Plural in –si; the PNIE Loc ending was –su. Garrett gives some analogic causes for the –su>-si change; I say that in Proto-Greek times lots of autochthones learned PNIE in Greece, so some simplification was inevitable and the prepositions could take over the task of fine discrimination. So this difference is negligible. His conclusion is that the only important difference between PNIE and Proto-Greek was the pre-Greek part of Greek vocabulary.

            So before 1500 a (Proto-) Greek individual could converse with his geographic neighbours, not Greek, if he avoided words exotic for them. So he used circumlocutions, synonyms and such.

            As it is well known, Magyar is almost without dialectal differences, and very, very different from neighbours. Still, there is a Central Northern dialect using different endings for the fundamental triad of directions. The “at/from/to” triality is expressed by the “-nál/tól/hoz” suffixes in the literary language, while in the mentioned dialect there is also another triad “-nott/nól/ni”. The choice between the triads is tricky. This is obviously bigger difference than the –su/-si between PNIE & Proto-Greek. (I could explain what is common in –nál and –nott. But I guess you are not really interested…)

            And now let us see verbal suffixes from 1st millennium. (Mycenaean texts, being mainly inventories, are poor in verbal suffixes.) For 1Pl Praes the ending was –mes on the West (so in the Adriatic neighbourhood), while –men on the East (in the Aegean region). Now this might be an unimportant minor variation; but Garrett calls attention to external relations. In the Adriatic region Italic uses an end-s (as Latin –mus; it is even possible that in the other Italic languages the suffix was –mes; “they are” is “sunt” in Latin, but “sent” in Oscan), while on the East in Anatolia Hittite used –wen. So it seems as if before 1500 BC a continuum of IE dialects existed from the Tyrrhenian to Hattusas, with mutually intelligible neighbours, in first approximation no Greekness. In second approximation, some people at the Southernmost Balkan peninsula understood the flower name hyakinthos, and, as it later turned out, they were the ancestors of speakers of Greek dialects.

            I am not arguing for or against; but obviously you cannot describe a continuum with a Stammbaum.

 

APPENDIX C: COMPUTER SIMULATIONS VIA CLADISTICS

            Cladistics was worked out as an evolutionary theory for genealogy in biology. The idea is: if you want to reconstruct the descents of related species, in first approximation similarities should not be used. Namely, because of the common origin, similarities are expected. Now, differences need explanation, they originate from innovations, and the same innovation appearing independently in more than one species is improbable. So consider a number of species from A to K. Assume that a characteristic is x for species A, B, D, H, I and K, y for C and G, and z for E and F. (Anything is good, if it is irrelevant enough, so not produced by selection. Number of teeth, number of fingers, form of skull &c.) Then maybe x is the ancestral status, appearing in most species; y and z are different innovations. So then C classifies together with G, and E with F; C and G had an extinct common ancestor in which already Character y was present, and E and F had another with Character z. Maybe A, B, D, H, I and K look similar, still they may well be only distant relatives: the similarity simply comes from the fact that they preserved the "primitive" state. Using many characteristics you can make alternative genealogic trees, and then the simplest (with minimal number of innovations, e.g.) or most parsimonious has the best chance to be true. For example, dolphins and Ichthyosaurs are quite similar to some fishes, but that comes from selection for streamlined forms in aqueous environment. On the other hand, coelacanths (e.g. the living Latimeria) classify together with amphibian & land tetrapods, not with "fishes", although Latimeria looks quite fishy. Namely, Latimeria "fins" are real bony extremities like tetrapod legs and arms, while piscine fins are not.
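
            The single-character example above can be written down as a minimal sketch. It assumes, as the text's "maybe" does, that the most frequent state is the ancestral one; real cladistic programs compare many characters and search for the most parsimonious tree, which this toy does not attempt. The function name is hypothetical.

```python
from collections import Counter, defaultdict

# Character states for the species as listed in the text
# (J is not assigned a state there, so it is left out here).
states = {
    "A": "x", "B": "x", "D": "x", "H": "x", "I": "x", "K": "x",
    "C": "y", "G": "y",
    "E": "z", "F": "z",
}

def shared_innovations(states):
    """Take the most frequent state as ancestral (an assumption) and
    group the species that carry each of the other, derived states."""
    ancestral, _ = Counter(states.values()).most_common(1)[0]
    groups = defaultdict(list)
    for species, state in sorted(states.items()):
        if state != ancestral:
            groups[state].append(species)
    return ancestral, dict(groups)

ancestral, groups = shared_innovations(states)
print(ancestral)  # the presumed primitive state
print(groups)     # species grouped by shared innovation
```

            Only the species sharing a derived state are grouped; the six species with the presumed ancestral state x are deliberately left unclassified, since a preserved primitive state says nothing about closer kinship.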

            A natural idea was to apply the technique then for linguistic evolution as well. I will mention differences of principle for which it is not trivial that the method is applicable there; but first let us proceed.

            About IE genealogic trees one of the most cited is the Ringe-Warnow-Taylor tree [23], [24]. The resulting tree can be narrated even via everyday terms; of course completely extinct and unknown branches may have existed too.

            Albanian's position is uncertain, and German's one is anomalous. We do not know enough about Thraco-Phrygian & Illyrian. But otherwise:

            There was first PIE, the Indo-European primordial language. Anatolians separated first. The ancestors of Tocharians did second; then separated the common ancestor of Italic and Celtic. The next separating branch was the common ancestor of Greek and Armenian(!). After this branching the ancestors of later Baltic, Slavonic, Iranian and Indian still remained together for a while, and then the last big forking was into Baltic + Slavonic on one hand and IndoIranian on the other.

            German maybe borrowed words & grammar heavily from a substrate or neighbour Ertebölle and/or moved between central Baltic and Westernmost Celtic, and these facts made its evolution anomalous.

            As for times of separation, the most parsimonious tree gives only lower and upper bounds. However those we can supplement with some historical data, uncertain as they are. Using [23] and common sense we may get the results as:

            Anatolian left the IE community (or backwards?) about -3800. It seems that in this time already the IE word for either wagons, or chariots, or wheels existed. At least, there is a reconstructed IE word, *kwel (or *kwekwel), which well relates to English "wheel", or Greek "kyklos". The Hittite is "hurkis", but remember that the -s is not of the word but merely the Nominative ending (as in Lithuanian, Latvian, and almost so regularly in Latin & Greek, but in the latter 4 languages only in Masculine).

            Of course, the Indo-European family is big, and some groups lost the *kwekwel "wheel word". It seems that Italics lost it; in Latin the place was occupied by "rota", "rotare" &c., if "circum" is not related to *kwekwel. "Hurki-" is also not too similar; but wait a moment.

            The remaining languages were together until cca. -3300; then the ancestors of Tocharians left (surely towards East). Unfortunately we have no mentions of Tocharians for more than 3 millennia. "Wagon" is "kukäl", and "wheel" is "wärkänt", for any case.

            And look: "wagon", "kukäl" is the relation of the wheel word, *kwekwel, and the dictionary entry for "wheel", "wärkänt" is not; but "wärkänt" can be the relation to Hittite "hurki-". Common innovation? Are "hurki-" & "wärkänt" related to "circum"? I do not know and I would be told "Ne sutor ultra crepidam!"; but at least the -rk- part is common. Anyways, Tocharian had wagons with wheels, and also a genuine IE word for horse: yakwe~equus. So they could speedily migrate.

            After 4 more centuries, in -2900, ancestors of Italocelts leave the community (surely, towards West); the Italoceltic unity remains until, say, -2400, and thenceforth we are with separate Italic and Celtic. Germans and Albanians left the community either a bit earlier or a bit later than Italocelts.

            Common ancestor of Greek and Armenian leaves the community a bit later, in -2800. This branch forks later, but we simply do not know when. But anyways, if the scheme is correct, the Satem innovation is after -2800, because Greek is Kentum and Armenian is Satem.

            The last big forking, Baltic+Slavonic vs. IndoIranian seems to happen in -2200.
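
            The chronology narrated above can be kept as a small table and queried. This is only a bookkeeping sketch: the dates are the rough estimates of this text (negative numbers mean years BC), the anomalous Germanic and Albanian branches are left out, and the function names are hypothetical.

```python
# Separation dates as narrated in the text (negative = years BC).
# These are crude estimates, not established facts.
SPLITS = [
    ("Anatolian", -3800),
    ("Tocharian", -3300),
    ("Italo-Celtic", -2900),
    ("Greek-Armenian", -2800),
    ("Balto-Slavic vs. Indo-Iranian", -2200),
]

def separated_by(year):
    """Branches that had already left the community by the given year."""
    return [name for name, date in SPLITS if date <= year]

def gaps():
    """Years elapsed between consecutive separations."""
    return [(later[0], later[1] - earlier[1])
            for earlier, later in zip(SPLITS, SPLITS[1:])]

print(separated_by(-2850))
print(gaps())
```

            Such a list makes it easy to check statements like "4 centuries after the Tocharians" against the dates actually assumed.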

            This is a definite story, maybe true, maybe not. We do not have IE records from -2200. Our earliest IE text, the Proclamations (and Curse) of King Anittas of Kanesh, was composed about -1900; the text available for us is from not much later.

            If this picture is true, then the kinship of Tocharian and Italoceltic is nothing meaningful in cladistic context. Namely, the most striking 2 common characteristics are the common KENTUM behaviour and the common -r Passive.

            Now, people generally accept that PIE was a KENTUM language. Then in cladistics Kentum languages are not necessarily kins while Satem ones are: [k]->[s] is common innovation. (Was it really? It happened independently in the Romance! But let us continue.) Before cladistics people argued that Satem would be Eastern, Kentum Western, and so Far Eastern Kentum Tocharian must have originated in the West. However maybe simply peoples at the peripheries were no longer active members of the IE community; their languages remained archaic, while the core territory underwent the K->S change (Baltic, Slavic, IndoIranian, Armenic, Thracian, Phrygian). If so, we can guess the time and can draw the border of the periphery. Kentum Italic, Celtic, Germanic, Tocharian, Greek and Anatolian were already separated in -2200, with Italoceltic in the West, Greek in the South, Hittite in Anatolia and Tocharian in the East.

            As for the r-Passive, it is really present in Hittite, Tocharian, Italic & Celtic, and it is not even a real Passive in Hittite/Anatolian. There it is connected rather with a disturbingly non-IE thing: animate vs. nonanimate, Ergative vs. Nominative constructions. This may be the ancestor of Active vs. Mediopassive; but the whole construct is reminiscent of non-IE Basque or Caucasian. (I do not suggest however that this Hittite characteristic would come from Caucasian Hattic. True, no Ergative construction is known in any non-Anatolian IE language; but this does not mean that Ergative was unknown in early enough times. And as for Animate vs. Inanimate: Latin o-stem Masculine and Neuter nouns differ only in Nominative: Neuter Nominative just looks like a Masculine Accusative, as if Nominative would be impossible for an Inanimate/Neuter. The same in Slavic.) But the r-sound is there.

            Then in pure cladistics the r-Passive is simply a primitive character preserved in 3 (or 4) branches and given up for various innovations by others.

            If the Tocharian r-Passive is indeed a preserved archaism, then Szabédi cannot have been misdirected by Tocharian (because then proto-Tocharian meeting Common Ugric on the East would not have been too similar to Italic). However the above simple & logical scheme is far from explaining everything. And look: it seems to me that Hittite and Tocharian share an innovation, namely hurki/wärkänt for "wheel", instead of "kwekwel". And if we take "circum" here too...

            In the next Appendix I list some problems remaining, including arguments against cladistics in IE linguistics. To be sure, Cladistics carries an important amount of truth; but pure biological cladistics cannot be applied to linguistics. You will see my points in due course.

 

APPENDIX D: FORKING ONLY?

            Cladistics (elaborated in eukaryote biology) contains a very strict rule in its basis. Let us speak in the language of Zoology for simplicity. Now we have humans, chimpanzees, gorillas, orangs, gibbons and monkeys. They are all kins to each other, but the degrees of kinship vary. Let us measure the closeness (in immunology, amino acid sequences of proteins, or by any other reasonable way), then it turns out that our closest kin is the chimpanzee, then the gorilla, then the orang, then the gibbons (two genera), and only then the monkeys.

            This is not really surprising; anyone would guess so. But: from the viewpoint of the chimps, we are closest kins, gorilla is second closest, and thenceforth as above. From the viewpoint of orang human, chimp & gorilla classify together. And so on.
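
            The point that the ranking of kins looks the same from any branch of a tree can be illustrated with a small sketch. The nested tuples below encode the branching order given in the text; the two gibbon genera are lumped into one leaf for brevity, and the function names are hypothetical.

```python
# Nested pairs encode the branching order from the text:
# each pair is (closer group, next branch to split off).
# The two gibbon genera are simplified to a single leaf.
TREE = ((((("human", "chimp"), "gorilla"), "orang"), "gibbon"), "monkey")

def leaves(node):
    """All species names below a node, left to right."""
    return [node] if isinstance(node, str) else [x for child in node for x in leaves(child)]

def path_to(node, target, path=()):
    """Sequence of child indices leading from the root to the target leaf."""
    if isinstance(node, str):
        return path if node == target else None
    for i, child in enumerate(node):
        found = path_to(child, target, path + (i,))
        if found is not None:
            return found
    return None

def kin_ranking(species):
    """Group the other species by the depth of their last common ancestor
    with `species`; a deeper common ancestor means a closer kin."""
    own = path_to(TREE, species)
    ranks = {}
    for other in leaves(TREE):
        if other == species:
            continue
        shared = 0
        for a, b in zip(own, path_to(TREE, other)):
            if a != b:
                break
            shared += 1
        ranks.setdefault(shared, []).append(other)
    return [ranks[depth] for depth in sorted(ranks, reverse=True)]

print(kin_ranking("human"))
print(kin_ranking("orang"))
```

            From the human leaf the ranking is chimp, gorilla, orang, gibbon, monkey; from the orang leaf human, chimp and gorilla fall into one group of equally close kins, exactly as described.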

            Now, from the differences one can find out the branching times using the correct tree. But we never (well: hardly ever) see two eukaryotes from different species hybridize: the elaborated mechanism of mitosis, meiosis &c. would not permit that. "Macroscopic" hybridising efforts are either utterly unsuccessful (say, between man and mare) or at least the next generation is sterile (mules). So branching happens upwards in time, and two branches cannot unite. (For plants this is not so strict; or do we define plant species incorrectly?)

            Now, Cladistics also assumes that each forking results in exactly two new species, while the old one ceases to exist. At first sight everybody could find counterexamples; however in usual zoologic contexts the difference belongs to pure philosophy.

            Consider a species A. Evolution then produces a new species B via, say, gene duplication & differentiation or Robertsonian translocation on chromosomes or such. E. g. we know that our Chromosome 2 is the fusion of two chimpanzee chromosomes. Maybe this reorganisation was the first/most important step towards mutual infertility and so the divergence of humans and chimps.

            Now, henceforth the original and the mutated species represent two isolated gene pools. For a limited time the original species A is still there, but indeed we cannot see the details of geologic Past from Present very well. For any practical reason we may tell that the original species A gave life to two successor species B & C, and itself vanished. Good, C is more similar to the common ancestor than B; and then what? It is very rare to have continuous descent sequences without gaps.

            So, there was an LCA (Last Common Ancestor) of humans and chimp. We can guess the time when it forked into human & chimp (both in general sense); maybe it was 5.1 My ago [25]. Since neither the LCA nor the beings just after are found, we cannot even discuss the pairwise similarities. Anyways, A, B and C all are classified into Genus Homo; the "first human" into Subgenus (Homo), the "first chimp" into Subgenus (Pan), and the LCA into neither of the Subgenera; if needs be, a third can be defined.

            Since we do not know every species from fossils, even the most parsimonious tree will have reconstructed species; but for any case there will be a most parsimonious tree.

            Why can we tell that both B and C are new species, even if only B changed, and really C was (for a while) identical with A? For simplicity, let us speak as if life went in disjoint generations. LCA lived in Generation 137; in Generation 138 the mutation appeared. However individuals of Generation 138 mate within that generation. There are two gene pools, mutually infertile. We call one B, and the other C. Cross matings B-C are infertile, so no hybrids appear; cross matings A-B and A-C are neglected, but they are rare enough (and A-B ones are infertile, while A-C ones are indistinguishable from C-C ones). From incomplete fossils we cannot tell if C was or was not "the same" as A, and after several forkings the details become no more real than the numbers of angels on the point of the pin.

            OK, in some cases the successors may remain cross-fertile to a limited extent. Robertsonian translocation, for example, results in limited cross-fertility [26]. And then what? The hybrid population very probably vanishes in a few generations. The essence is: eukaryote ways of proliferation guarantee simple forkings always upward in time. Triple forkings are logically impossible with the above convention: we always may tell that A did not fork into B, C and D, but first A forked into B and C, and somewhat later C forked into D and E. A, C & E are rather similar but not identical, B and D diverged more.
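
            The convention that every multiple fork is read as a chain of two-way forks can be sketched in a few lines. The function name is hypothetical, and the sketch assumes that the successors are already listed in their order of divergence, which real data need not supply.

```python
def binarize(children):
    """Rewrite a multi-way fork as a chain of two-way forks.

    `children` lists the successor species in their (assumed) order of
    divergence; each nested tuple stands for an unnamed intermediate species.
    """
    if len(children) <= 2:
        return tuple(children)
    first, rest = children[0], children[1:]
    return (first, binarize(rest))

# The apparent triple fork of A into B, D and E becomes:
# A forks into B and an intermediate (the text's C), which forks into D and E.
print(binarize(["B", "D", "E"]))
```

            The inner tuple plays the role of the intermediate species C of the text, which is rather similar to A but formally a new species.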

            And now let us see a linguistic case, the best documented major one: the dissolution of Latin around the end of the Empire.

            We have cca. 10 recent successor languages (the difference between language & dialect is somewhat arbitrary, government-dependent & such), namely

                        Catalan (Ca)

                        Dalmatian (Da)

                        French (Fr)

                        Italian (It)

                        Portuguese (Po)

                        Provencal (Pr)

                        Rhaetoroman (Rh)

                        Roumanian (Ru)

                        Sardinian (Sa)

                        Spanish (Sp)

            True, the last speaker of Da, Tuone Udaina "Burbur", died in 1898 on a small island of the Adriatic. However, the language is practically extant. In the last years hordes of linguists collected words, grammar &c. from the last speaker, so everything is documented about how to get water from the well, what to eat at breakfasts, suppers & dinners, &c. Also, there are the cities of Dalmatia. In the Middle Ages burghers of Spalato, Zara or Ragusa spoke Da, and they definitely did not call the cities Split, Zadar or Dubrovnik. Most medieval documents are in Latin, but there are Dalmatian texts too.

            It may be that patriotic French do not like to call Pr a separate language, but a dialect of Fr; on the other hand, some Gascons would like to define their idiom as an 11th Neo-Latin language. Also, I should have mentioned Aromun, Meglenoromun & Istroromun. These languages (not dialects) are important in the so called Daco-Roumanian Argumentations; but these pseudo-scientific battles are unknown to anybody not Roumanian or Hungarian. The above list of 10 languages is practically enough.

            Everybody can learn that these 10 languages are not equally distant pairwise. It is more similar to Sp and Pr than to any else. Po is most similar to Sp, Ca is somewhere between Pr and Sp, and so on. So maybe there was a first forking, then a second &c. Indeed, there are subgroups defined. But the classification is not unique. E.g. you can define Eastern and Western halves. Then 7 languages go surely to the Western half, Ru and Da go to the Eastern one, and It may go to any of them. E.g. It plurals are formed in rather Ru than Sp ways:

            En          Ru          It          Sp
            Lady        doamna      donna       dona
            Ladies      doamne      donne       donas

while otherwise Book Italian is near to Book Spanish; people of high schools can read the books of the other language; while nobody native in Italy or Spain can read Roumanian texts.

            OK; we must collect lots of characteristics before finding the most parsimonious tree.

            For a moment let us look for branching times. The most formal way is to look for the first written document ("oldest fossil") of the respective language. It is easy to compose the list as we know now (all AD):

            Language    First Text
            Ca          13th c.
            Da          13th c.
            Fr          842
            It          960
            Po          12th c.
            Pr          11th c.
            Rh          12th c.
            Ru          1521
            Sa          11th c.
            Sp          10th c.

           

            However in some cases the first document is obviously from well after the birth of the separate language. The most obvious example is Ru: Hungarian sources know about Roumanians from the end of the 13th century, arriving from Walachia and speaking a language which is reminiscent of Latin, but cannot be understood via Latin. However for a long time nobody tried to write this language; the Greek Orthodox priests used Church Slavonic. Then a local leader outside of Hungary, Neacsu, wrote a letter to the Mayor of Brassó (Brasov in Roumanian, Kronstadt in Saxon &c.), notifying him that Turkish troops move to the general direction of the city. Neacsu was a landowner and a fighting man, not a priest, so he was not able to write Church Slavonic.

            In several other cases the above time is more or less the moment when laics no longer understood written Latin well enough, so some legal documents had to be composed also in lingua rustica. Definitely this is the case for It; the first document is the Placito di Capua, a short text of an oath, made by a lady about the property of some land. Maybe gentlemen still made the oaths in a language they believed to be Latin.

            For Fr we know some details. In 6th century priests wrote Latin, laics spoke something they believed Latin, but no written documents remained about it. 7th century was the nadir of the Latinitas of Gallia; the so called Fredegarius Chronicle is in principle in Latin, but at points simply unintelligible for posterity. When the Carolingian Kings came to power in the second half of the 8th century, the Latin of the legal documents & chronicles started to improve, and the quality became quite good under Charlemagne. And then it turned out that laics no longer understood the Correct Latin. So the Synod of Tours declared in 813 that the priests should translate some texts "...in Rusticam Romanam linguam, aut in Theotiscam..." so into village Roman or German, so that the people understand. Surely after that immediately some texts were written, not extant. But there remain only 29 years until 842…

            As for Spanish, we do know about a Southern Spanish language spoken in Andaluz, the Mozarab; it is still used to a limited extent in two churches. The present Spanish is the Northern variety. The Moslem occupation caused a divergent evolution. The end-product is 4 Neo-Latin languages in the Iberian Peninsula: Portuguese, Catalan, Castilian=Spanish and Mozarab.

            However the formation of a new language does not mark the split of the original linguistic pool into two. Near kin languages remain mutually more or less understandable for centuries. In such a way "gene flow" exists between separate languages. Also, until 476 the Roman Empire existed and Latin was official everywhere, so in principle no Neo-Latin language could start before 476; but we know that Neo-Latin languages indeed come mainly from Vulgar Latin, not from the official High Language; and Vulgar Latin had serious geographic varieties in 476.

            Surely, Sardinia being an island, Sa became more or less isolated when "the troubles" started, maybe in 3rd century. Eastern Ru became separated in 395 when the Empire got two Emperors. So Sa and Ru are rather "archaic": "to know" is "scire" in Latin, but this root is extant only in Sa and Ru as "iskire" & "sti" (in the latter the "s" is the English "sh" and the "t" is roughly the Tocharian "c"; for both consonants some diacritics are applied which I ignore).

            OK. Ru became gradually separated from an Eastern Vulgar Latin after 395, while Sa from a Western Vulgar Latin gradually after 238. So the differences are caused by

            A) the separations, which, however, were gradual;

            B) the original differences between farther or otherwise isolated areas; and

            C) the different substrates/superstrates/adstrates.

Cladistics would correctly handle A) if the separation were not gradual, but not B) or C). The decomposition of PIE very probably had all three factors. PIE was spoken about -4000 on a rather large territory of the steppe from the Carpathians to maybe the Caspian. The subpopulations had an effective way of transport via horses, either ridden or with wagons/chariots (wheel, kyklos, hurki &c.), but surely neighbours generally remained neighbours. So there were geographic gradients of PIE, foreign to cladistics. When somebody moved away to non-IE territories, then the separation of the group was similar to speciation in cladistics (divergent evolution starts), but when a group was moving from one PIE part to another, that was an un-cladistic event (see the problems with Germans wandering between Baltics & Celts). Also, locally different substrates could increase the differences between neighbouring dialects (Ertebölle in Germany, no substrate in the Baltic).

            Hittite even shows a superposed effect of both substrate and superstrate (as for adstrate, who knows?). OK, Anatolian separates in -3800. It is an interesting question whether Anatolia was the PIE Urheimat and all non-Anatolians left Anatolia in -3800 for Ukraine, or the other way round. For Hungarians the second choice is simpler because somebody surely attacked the Carpathian Basin in -4200, and, repelled, they fled towards Ukraine on horseback [17]. OK, let us tell the story in this way.

            So proto-Anatolians go to Anatolia in -3800. That is old agricultural territory back to at least -6500 (Catal Hüyük &c.), so glorious PIE conquerors can find substrates to tax. However there is no trace of IE Anatolians until -1900! Then rules King Pitannas of Kanesh.

            Were I a proud Indo-European, I would certainly be surprised. Not being an IE, I am not too interested. Maybe proto-Anatolians were a really small tribe somewhere in Easternmost Anatolia, learning improved agriculture for almost 2 millennia from Caucasians. Then after the necessary acculturation some chiefs occupied some cities of the Hattic agriculturists while partially preserving martial skill, and then Anittas, son of Pitannas, destroyed the Hattic metropolis Hattusas. (Note that I do know that according to formal rules of English I should have written "of Pitanna". English prepositions do not go with Nominative but with the Common Case or Accusative: of me, not of I. "Pitannas" is the Nominative. However, you might then be surprised.) Proto-Anatolians probably started from the nearest part of the steppe (Daghestan?), so their local dialect/language was originally most similar to that of their neighbours. But also, they separated early, so they brought Archaic PIE to Eastern Anatolia. There they had a Hattic (Caucasian?) substrate, and may have borrowed the ergative construction from Hattic (the Caucasus is full of ergatives, but Euzkadi, the Basque Country, is full too, and Ergative may also be simply Archaic in IE). However they got a superstrate as well. Hittite texts are full of Sumerian and Akkadian ideograms, words &c., because they learned writing from Mesopotamia. As for adstrates, in Anatolia they also met non-Anatolian IE peoples, such as the probably Mycenaean Ahhiyava, proto-Phrygian and proto-Lydian. Borrowings from unknown IE languages may mimic evolution without adstrates.

            Tocharians started from somewhere on the steppe. Without original gradients and without going away on airplanes this would be a nice cladistic situation: Tocharian would keep the linguistic stage of its departure, apart from its own later innovations. However

            1) their original language was most similar to their neighbours independently of the time of departure; and

            2) if they were not the Easternmost in the PIE homeland, then they contacted linguistic kins on the steppe moving Eastward, with whom they were in limited linguistic understanding. These interactions "contaminated" the language.

            For our purpose we should know the Tocharian vocabulary about -1500 when they met proto-Ugric in Westernmost Asia. But we have only Buddhist religious and medical texts from +800! If we could guess the original position, we might take the languages of synchronous neighbours. But Cladistics does not answer this question. Cladistics suggests the answer that similarities with Italoceltic (& Hittite) come simply from the early departure. However some fundamental assumptions for using Cladistics may not be valid!

            I cannot overcome this question. True, Swadesh word lists and similar quantitative “measures of similarity” seem to help. But the problem is complicated enough. Let us see a simple and transparent approach.

One may take all the known IE words of Tocharian with the IE roots, from e.g. Pokorny [27].

Now you can look for the same root in other IE languages. Following e.g. [28], for simplicity let us choose 6 groups for the “others”: Celtic, German, Greek, Hittite, Italic and all Satem; then Albanian is ignored, as well as the not well known extinct ones as Phrygian, Thracian &c. Then [28] gives for all known IE words in Tocharian if they have cognates in the 6 other groups or not. (Of course, if a Tocharian word is accepted as IE, the root occurs in some other IE language(s) too.)

            However, as you can explicitly see from [28], Pokorny’s list contradicts Watkins’s work about the origins of English words [29]. Namely, there are e.g. Tocharian words whose IE ancestors had no successor in the Italic subfamily according to Pokorny, while according to Watkins there is a successor in Modern English through Latin. Obviously some etymologies were accepted by one author and refused by the other. Then we may choose between the approaches; here I accept the extra etymologies of Watkins too, but use only the Pokorny roots. Then:

              Subgroup             IE roots
              Tocharian total      296
              In Celtic            192
              In German            254
              In Greek             237
              In Hittite            69
              In Italic            224
              In Satems            251

and you might conclude that Tocharian was nearest to German, then to the Satem languages, then to Greek and so on. Hittite is farthest.

            However such a method would contain a tremendous bias. The number of Satem languages is above 100, many living, while the Italics are really 3, even if the Neo-Latin divergence produced 10 very similar successors of Latin; and Hittite means the relatively well known but long extinct Neshian (syllabic Hittite), the hieroglyphic Luwian of not so directly known vocabulary, and the hardly known Lycian; all extinct. So in any case we expect far fewer hits in Hittite than in Satems.

            Still, this material deserves further statistical analysis; and in any case we can tell that 3/4 of the IE roots preserved in Tocharian were preserved also in Italics.

            Such approaches seem diametrically opposite to Cladistics if we apply them according to an extremal hypothesis: strong geographic variations in the Urheimat without temporal differences (“synchronous breakup”). Appendix F will be Stetsyuk's approach, based on taking Nostratic very seriously [30], which eliminates at least partially the above documented bias.

 

APPENDIX E: SOME STATISTICS

            At the end of the previous Appendix we saw a bulk number telling that

            1) There are 296 IE roots in Tocharian.

            2) Of these 296 roots other subfamilies preserved so & so many.

Both statements can be challenged. The number 296 is rather limited, but the extant Tocharian literature is limited too: mainly Buddhist theology & medicine. So the meaning of this 296 is not that far-migrated Tocharian language would have diluted amongst strangers, but simply that the majority of the roots are not preserved in the written documents at disposal. The number will increase.

            Similarly, the statement that, say, of these 296 roots xA are found in Anatolian languages, depends on our present knowledge about Anatolian, on the present status of IE art and on stricter or looser criteria.

            However for the present study let us accept the numbers at face value. In the previous Appendix I narrated how the numbers were obtained: they are primarily from Pokorny [27], but corrected with the Germanic, Romance and Greek etymologies of Watkins [29] for English.

            The main goal of this Appendix is to demonstrate the application of the methods of statistics in linguistics. According to present fashion in scholarship I should rather refer to computer calculations, e.g. the good qualities of cladistic software packages on the market. Instead I discuss explicit connections between quantitative statistical data and evolutionary hypotheses.

            We have N(=296) roots which, according to consensus, authorities or such, were present in the ancestor language (IE, PIE, Indo-Hittite or any). Then, if the descendant of such a root cannot be found in one daughter group, then there are 2 possibilities:

            1) There is a continuous decay of the original dictionary. Words/roots are being substituted. If the subgroup contains only a small number of languages, then this loss can be significant. If the number of languages is high enough, complete substitution is rare; the root remains in a few languages, and then it is present in the subgroup. For improved statistical analysis polynomial or Poisson distributions would be adequate, but in first approximation it is true that during a given time in the i-th subgroup p_i is the probability of losing a root, so the number of remaining roots is

              N_i = N(1 - p_i),  1 > p_i > 0                                              (E.1)

            2) Now, we may not find the root even if it still is/was present in some languages. For extinct languages the main factor in not finding the root is the limited extant dictionary. This effect results in first approximation in an extra factor w_i:

              N_i = N w_i (1 - p_i),  1 > p_i > 0,  w_i < 1                               (E.2)

However this means that for practical purposes we can write

              w_i (1 - p_i) -> q_i                                                        (E.3)

The q_i's can be taken from the etymologies. Also, q's with more than one index can be calculated from the Tocharian IE etymologies, when the root is found in more than one other IE subgroup. Now I am giving all the data, according to the notations

            1: Celtic

            2: Germanic

            3: Greek

            4: Hittite (or Anatolian)

            5: Italic

            6: Satem languages

Obviously the smallest w factor is expected to be w_4; maybe w_6 is closest to 1.

            The one-index q's were practically given in the previous Appendix, but for a homogeneous structure I repeat them:

              q_i = {0.6486, 0.8581, 0.8007, 0.2331, 0.7568, 0.8480}
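The one-index values can be reproduced from the root counts tabulated at the end of the previous Appendix. The following is only a minimal sketch; the names `COUNTS`, `N_TOTAL` and `retention_fractions` are illustrative and do not come from the sources.

```python
# Root counts from the table at the end of the previous Appendix.
N_TOTAL = 296  # IE roots attested in Tocharian

COUNTS = {
    "Celtic": 192,
    "Germanic": 254,
    "Greek": 237,
    "Hittite": 69,
    "Italic": 224,
    "Satem": 251,
}


def retention_fractions(counts, total):
    """Return q_i = N_i / N for every subgroup, rounded to 4 decimals."""
    return {name: round(n / total, 4) for name, n in counts.items()}


Q1 = retention_fractions(COUNTS, N_TOTAL)

if __name__ == "__main__":
    for name, q in Q1.items():
        print(f"{name:9s} q = {q:.4f}")
```

Each q_i is simply the count for subgroup i divided by 296, so any revision of the etymologies changes the q's directly.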

Indeed, q_4 is very low, and for this fact the simplest explanation is a low w_4. (An alternative possibility would be p_4 close to 1, so drastic decay of the original dictionary, because of e.g. massive substrate influence; and such might be behind the moderately low q_1 & q_5.)

            The q's with more than one index will be given as follows. By construction a q_{ik...m} is symmetric in all indices, but the value does not have any meaning if two indices coincide. So I give only the components of increasing index order. These independent components are as follows:

              q_{1k} = {0.5980, 0.5709, 0.1791, 0.5642, 0.5777}

              q_{2k} = {0.6959, 0.2061, 0.6858, 0.7568}

              q_{3k} = {0.2128, 0.6655, 0.6858}

              q_{4k} = {0.2128, 0.2264}

              q_{5k} = {0.6520}

 


              q_{123} = 0.5304    q_{124} = 0.1655    q_{125} = 0.5338    q_{126} = 0.5304
              q_{134} = 0.1723    q_{135} = 0.5135    q_{136} = 0.5034
              q_{145} = 0.1723    q_{146} = 0.1723    q_{156} = 0.4932
              q_{234} = 0.1953    q_{235} = 0.6081    q_{236} = 0.6115
              q_{245} = 0.1926    q_{246} = 0.2027    q_{256} = 0.6014
              q_{345} = 0.2095    q_{346} = 0.2095    q_{356} = 0.5777
              q_{456} = 0.2061

              q_{1234} = 0.1622   q_{1235} = 0.4865   q_{1236} = 0.4730
              q_{1245} = 0.1622   q_{1246} = 0.1622   q_{1256} = 0.4696
              q_{1345} = 0.1689   q_{1346} = 0.1655   q_{1356} = 0.4493
              q_{1456} = 0.1655
              q_{2345} = 0.1892   q_{2346} = 0.1926   q_{2356} = 0.5338
              q_{2456} = 0.1892
              q_{3456} = 0.2027

              q_{12345} = 0.1588  q_{12346} = 0.1588  q_{12356} = 0.4291
              q_{12456} = 0.1588  q_{13456} = 0.1622  q_{23456} = 0.1858

              q_{123456} = 0.1554


 

            In what follows I will stop at two indices; but the analysis might go further. Let us guess first what the connection between q_i and q_{ik} is. For this purpose consider two extremal cases. If the language groups i and k were already separated from each other as different and mutually unintelligible languages, or if the speakers were already geographically isolated from each other at the detachment of Tocharian, and did not come into contact again until the linguistic separation happened, then the forgetting/substituting processes in eqs. (E.1-3) were completely independent, so

              q_{ik} = q_i q_k                                                            (E.4)

On the other hand, we may formally distinguish 2 languages even if they are still completely intelligible mutually and they are in intensive contact (and very similar and mutually intelligible ones are indeed sometimes declared different); in this case

              q_i = q_k;  q_{ik}^2 = q_i q_k                                              (E.5)

In the generic case then we expect

              (q_i q_k)^{1/2} > q_{ik} > q_i q_k                                          (E.6)

Also we can get a similar result in cladistics. Assume that Group 0, now Tocharian, detached when the ancestor(s) of Groups i & k were still in close connection; then, after some time t_{ik}, they separated. If so, the forgetting process was partially common, i.e.

              N_i = N w_i (1 - p_{ik})(1 - p_i)                                           (E.7)

but

              N_{ik} = N w_i w_k (1 - p_{ik})(1 - p_i)(1 - p_k)                           (E.8)

so then

              q_{ik}/(q_i q_k) = (1 - p_{ik})^{-1} > 1                                    (E.9)

Obviously p_{ik} << 1 if the common life was short, and it increases monotonically with longer and longer common life, contacts &c. So in any case, q_{ik}/(q_i q_k) > 1 shows some common history of any kind.
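The step from eqs. (E.7) and (E.8) to eq. (E.9) can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are mine and the numerical inputs are hypothetical, chosen to show that the w's and the separate losses cancel and only the common loss p_{ik} survives in the ratio.

```python
def shared_history_ratio(w_i, w_k, p_i, p_k, p_ik):
    """Return q_ik / (q_i * q_k) under eqs. (E.7)-(E.8).

    w_i, w_k : documentation factors of the two groups
    p_i, p_k : loss probabilities after the two groups separated
    p_ik     : loss probability during their common period
    """
    q_i = w_i * (1 - p_ik) * (1 - p_i)
    q_k = w_k * (1 - p_ik) * (1 - p_k)
    q_ik = w_i * w_k * (1 - p_ik) * (1 - p_i) * (1 - p_k)
    return q_ik / (q_i * q_k)


def implied_common_loss(ratio):
    """Invert eq. (E.9): p_ik = 1 - 1/ratio."""
    return 1 - 1 / ratio


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    r = shared_history_ratio(w_i=0.9, w_k=0.8, p_i=0.2, p_k=0.3, p_ik=0.13)
    print(f"ratio = {r:.4f}, implied p_ik = {implied_common_loss(r):.4f}")
```

With no common period (p_{ik} = 0) the ratio is 1, which is the independent case of eq. (E.4).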

            Now I give the ratio for all combinations:

 

              ik      q_{ik}/(q_i q_k)
              12      1.074
              13      1.099
              14      1.185 *
              15      1.149 *
              16      1.050
              23      1.013
              24      1.030
              25      1.056
              26      1.040
              34      1.140 *
              35      1.098
              36      1.010
              45      1.206 *
              46      1.145 *
              56      1.016

where, somewhat arbitrarily, I drew a border at q_{ik}/(q_i q_k) = 1.1, and the numbers above that are marked with an asterisk. What do we see?

            A) Greek is not too dependent on anyone except Hittite.

            B) Hittite is independent of Germanic, but not of the others.

            C) Excepting the dependences of Hittite, the only other strong dependence is Celtic to Italic.
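The ratios and the three observations above can be recomputed directly from the one-index and two-index q values of this Appendix. This is a minimal sketch; the variable and function names are illustrative, and the 1.1 threshold is the arbitrary border chosen in the text.

```python
from itertools import combinations
from math import sqrt

GROUPS = {1: "Celtic", 2: "Germanic", 3: "Greek",
          4: "Hittite", 5: "Italic", 6: "Satem"}

# One-index and two-index q values as listed in this Appendix.
Q1 = {1: 0.6486, 2: 0.8581, 3: 0.8007, 4: 0.2331, 5: 0.7568, 6: 0.8480}
Q2 = {
    (1, 2): 0.5980, (1, 3): 0.5709, (1, 4): 0.1791, (1, 5): 0.5642, (1, 6): 0.5777,
    (2, 3): 0.6959, (2, 4): 0.2061, (2, 5): 0.6858, (2, 6): 0.7568,
    (3, 4): 0.2128, (3, 5): 0.6655, (3, 6): 0.6858,
    (4, 5): 0.2128, (4, 6): 0.2264,
    (5, 6): 0.6520,
}
THRESHOLD = 1.1  # the border drawn, somewhat arbitrarily, in the text


def ratios(q1, q2):
    """Return q_ik / (q_i q_k) for every pair i < k."""
    return {(i, k): q2[(i, k)] / (q1[i] * q1[k])
            for i, k in combinations(sorted(q1), 2)}


def within_bounds(q1, q2):
    """Check eq. (E.6): q_i q_k < q_ik < sqrt(q_i q_k) for every pair."""
    return all(q1[i] * q1[k] < q < sqrt(q1[i] * q1[k])
               for (i, k), q in q2.items())


RATIOS = ratios(Q1, Q2)
STRONG = sorted(pair for pair, r in RATIOS.items() if r > THRESHOLD)

if __name__ == "__main__":
    for (i, k), r in RATIOS.items():
        flag = "*" if r > THRESHOLD else " "
        print(f"{GROUPS[i]:8s} - {GROUPS[k]:8s} {r:.3f} {flag}")
```

Pairs 13 and 35 fall only just below the border, so a slightly lower threshold would add Celtic-Greek and Greek-Italic to the list of strong dependences.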

            Now, I do not try to interpret A), and I do not want to interpret B), since the repeatedly suggested history, that Hittite was already separate from the Core IE when Tocharian was detaching, might disturb the results. However the interpretation of C) is simple enough. At the Tocharian departure the subfamilies were already more or less separated, except that the Italoceltic community still existed for a while (see eq. (E.9)).

            Obviously this is not a rigorous proof of the existence of an Italoceltic community. For that a much more detailed statistical analysis would be necessary, including at least a Hittite-based analysis (according to the hypothesis that Hittite separated first); and I will not do that here. Appendix E is only to demonstrate that it is neither necessary nor sufficient to apply Cladistic software outside its boundaries of validity. (As we saw, biological Cladistics has fundamental assumptions untrue in linguistics.)

 

APPENDIX F: THE CASE OF PURE GEOGRAPHIC FACTORS

            Take a group of N related languages, occupying some positions X_A, not moving. Strengths of contacts are governed by some "generalised distances", D_{AB}(X_A, X_B). If a D is small, then the two dialects are in strong connection, so even after a considerable time they remain similar. If the particular D is great, contacts are weak, and the dialects/languages diverge.

            The D_{AB}'s are generalised distances, because a complicated geography can cause the equivalent of great distance. E.g. in the Caucasus Mountains sometimes neighbouring valleys speak very different languages. And also, some differences between language pairs will be functionals of the D_{AB}'s, monotonically increasing, fulfilling some limiting relations (e.g. difference vanishes if D goes to 0), but the exact functional form is still unknown. However let us ignore all the difficulties for first approach, as Stetsyuk himself did.

            Many years ago Jánossy formulated the question of how to decide the dimensionality of space [31]. He was definitely against Riemannian geometry; I am not, but let us continue for a while. We have N points, with distances D_{AB} between them, and we assume some coordinates X_{Ai}, A ≤ N, i ≤ n. If we know the geometry, then

              D_{AB} = f(X_{Ai}, X_{Bi})                                                  (F.1)

where the form of the function f is given. In Euclidean geometry it is the Pythagorean formula, on Earth's surface that of the spherical geometry, &c.

            Now, we have N(N-1)/2 independent distances and nN coordinates. So eq. (F.1) cannot trivially be fulfilled if

              N > 2n+1                                                                    (F.2)

Hence we get a constraint for n, the dimensionality of the space. For Earth's surface n=2, so the distances of the 6th point check whether the points are on a surface.
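The counting argument can be checked numerically. The sketch below follows only the simple count used here, distances against coordinates, and makes no further geometric assumptions; the function names are illustrative.

```python
def independent_distances(n_points):
    """Number of pairwise distances among n_points points: N(N-1)/2."""
    return n_points * (n_points - 1) // 2


def coordinates(n_points, dim):
    """Number of coordinates of n_points points in dim dimensions: nN."""
    return n_points * dim


def first_constraining_point(dim):
    """Smallest N for which the distances outnumber the coordinates."""
    n_points = 2
    while independent_distances(n_points) <= coordinates(n_points, dim):
        n_points += 1
    return n_points


if __name__ == "__main__":
    for dim in (1, 2, 3):
        n = first_constraining_point(dim)
        print(f"n = {dim}: first constraint at N = {n} "
              f"({independent_distances(n)} distances, "
              f"{coordinates(n, dim)} coordinates)")
```

For n=2 the first N with more distances than coordinates is 6 (15 distances against 12 coordinates), in agreement with the remark about the 6th point.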

            Stetsyuk used the number of etymologically related words as a measure of distance between 2 languages. This is surely in close connection with differences. If we have a string of languages all going back to an original Ur-language, languages in more contact will retain more words in common. Languages in loose contact forget independently, so only a small number of common words remain. Sure, this D_{AB} is a functional of the distance in physical space, not simply equal to it, but let us follow Stetsyuk in first approximation by ignoring the difference.

            His approach is of two steps. First he wanted to discover the Nostratic Urheimat, and second that of the Indo-Europeans.

            The Nostratic discipline comes from Russia. "Nostratic" means cca. "the ours" in Latin. Russian is Indo-European, so for a Russian Nostratic languages are the kins of Russian. The idea was to look for language families related to Indo-European. The idea comes from Illich-Svitych & Dolgopolsky, and the method is developing; in any case the core of the supposed Nostratic contains Indo-European, Uralic, Altaic, Kartvelian and Dravidian. On one side, most Nostratic experts add some Palaeo-Siberian languages, e.g. Chukchi, Eskimo or Nivkh; on the other, some add Semitic to it, while others exclude Semitic but accept it as the closest non-Nostratic kin. Etruscan is probably Nostratic, and may even be a close kin of Indo-European.

            If you ask, what remains outside, I can give a list. A lot of African languages, Native Australian & New Guinean ones, together with Pacific ones, are definitely non-Nostratic, together with such important Eurasian ones as (probably) Sumerian, Elamite, Basque, some Caucasians, Ket, Chinese & Burushaski, together with the Na-Dene languages of Northern America.

            The Nostratic superfamily may be a product of the immigration of Homo sapiens (or Homo sapiens sapiens, or Anatomically Modern Humans) into Eurasia. As we know, Europe & the Near East were Neanderthal country from time immemorial, while not too long ago the Far East was inhabited by some less known large-brained erectus descendants.

            Our gracilis ancestors with chins and without brow ridges tried to leave Africa 3 times. The first exodus was repelled by Levantine Neanderthals in the Holy Land ca. -100,000. The second attempt started from Ethiopia about -71,000, crossed at the Bab-el-Mandeb and went to Southern India, then to Indonesia & Australia. Then they were isolated from the remaining African AMH's for tens of millennia, so the local languages could diverge.

            The third exodus again went through the Sinai, and was at the site Bokher Takhtit in ca. -58,000. This wave introduced some innovations (as e.g. blade technology) in which many experts see the start of the Upper Palaeolithic, and they continued to the East. Very probably the wave branched in Central Asia; one part continued to the Far East (where they met some population from the previous wave + large-brained post-erecti), the other entered Europe, maybe via Anatolia and the Bosporus. They were in the Balkans in ca. -48,000, in the Carpathian Basin in ca. -44,000, and in France before -30,000. The newcomers shared the Carpathians with Neanderthals (Szeleta) for some millennia, and also some part of France (Châtelperronian).

            If so, then the languages of Europe + Western Asia come from immigrants of limited numbers; + from the Neanderthals, but their linguistic ability is generally questioned by Homo sapiens linguists. So originally European languages should be similar.

            Of course, 50,000 years is a time depth; however PIE is reconstructed back to 6,000 years (Uralic too), and it is generally accepted that until Magdalenian specialised hunting humans were more mobile than later. So maybe some similarity of languages remained up to now.

            If so, Nostratic may show the latest close kins either at end-Palaeolithic or at Mesolithic. Stetsyuk is brave enough to reconstruct the locations at the end of Nostratic unity from linguistic similarities, and gets that the distances fit best with a Nostratic Urheimat South from Caucasus, East from Black Sea, West from the Caspian Sea, and to the South somewhere at the valleys of Great & Little Zab.

            So Indo-Europeans, Uralics & Altaics formed almost an equilateral triangle, Altaics around Mt. Ararat & Lake Sevan, Indo-Europeans to the East, until the Caspian, and Uralics around the Lake Resaye. For the 3 farther peoples, Kartvelians more or less were as now, Semitics around Lake Van, and Dravides Southernmost, at Upper Mesopotamia.

            If you like, you may believe this. In any case, Uralic & Altaic are on the verge of forming an established higher family, Uralo-Altaic, with a common grammar and at least dozens of common words, and Uralic & Indo-European, while the grammars are not similar, have such common words as e.g. "water", "name" or "honey/mead". As for the other 3 "peripheral" families, they were farther away, so not in such strong connection.

            Then Nostratic unity came to an end (-15,000? -10,000? Indo-European surely remained still in some contact with Semitic if indeed some names of domesticated animals are common). Stetsyuk assumes migrations, and then tries to reconstruct the secondary Urheimats of Indo-Europeans, Uralics & Altaics. While for me the reconstructed Uralic Urheimat would be most interesting, I can imagine that most readers would read rather about Indo-Europeans. So let us see that; the time is PIE community, so maybe -4,000.

            The site is Poland, Ukraine, Belarus, Westernmost Russia & the Baltic. The borders will come in due course. The great river Dnieper intersects the territory. East of the Dnieper, on the North, about the present border of Belarus & Russia, around River Sozh was Indian territory, and at the East of it, until the sources of River Oka, the Iranian one. South of them, just East of Dnieper (so around Chernihiv, but far to North as well) lived the Phrygians, East of them, between Rivers Desna and Seym, their (guessed) close kins, the Thracians, and to the South, in Left-Bank Ukraine from a point opposite to Kiev, in a triangle of Rivers Dnieper, Sula & Seym lived the Armenians. The site of Borispil Airport of Kiev was on Armenian territory. The Eastern border of IE territory was West of River Psel.

            Now comes the western half. The Westernmost IE people were Celts. Their Westernmost point was the confluence of Vistula & Western Bug, so cca. Warsaw. The Easternmost point was cca. Chernobyl (!), the Northern border was the River Pripet, and the Southern was River Uzh. North of Pripet, until Neman on the North and maybe until River Sluch on the East lived the Germans, and North of that, until Western Dvina, Slavs on the West, and Balts on the East. Western Dvina was the Northern boundary of the secondary IE homeland. Between Dnieper, Berezina, Sluch & Pripet was Greek territory, and between Western Dvina, Kasplya, Dnieper and Berezina lived the Tocharians.

            Natural borders were somewhat less explicit in the Southwestern quadrant. Illyrian territory was bordered by Western Bug on the West, by Pripet on the North, and the most Southeastern point was near the sources of Southern Bug. To the East, between roughly Pripet, Teteriv and maybe Stir you find the Italics, and finally between Teteriv, Dnieper and Ros, including Kiev, the Anatolians.

            You may or may not believe the reconstruction; it is an attempt to explain IE similarities/dissimilarities just before the exodus of Anatolians to Anatolia, so surely the opposite of the approach of [23]-[20]. Therefore it can answer questions which are uninterpretable in [23]-[20], although the answers can still be incorrect. In this scheme Tocharians’ neighbours were Balts & Greeks (River Dnieper is too big to cross regularly). Italics were not too far, just behind Greeks. Note that Szabédi reconstructs lots of Greek words too in Magyar; only he believes that they were first borrowed by proto-Latin. In Stetsyuk's scheme, however, Tocharian was primarily similar to Greek, and to Italics only secondarily.

            Now we can return to the Tocharian migration to the East. They started, say, in -3,300, with Western IE word forms and crossed first Indian territory. Since later they appear at Southern latitudes, maybe they took a Southeastern direction, maybe they avoided the Iranians and followed River Seym, but of course this is a sheer guess within a sheer guess of a scheme. With a further East-Southeast direction in Stetsyuk's scheme Magyar territory comes just after the Don, between Medveditsa & Hopor, but of course you do not have to accept this.

            I only wanted to show that there are reconstructions in reasonable schemes, good or bad ones. Tocharian language in itself is a proof for intensive connections with Uralics on the East, very probably Ugrics. According to Uralic consensus in the time of contact (surely before 1,000 BC) separate Magyar language did not yet exist; maybe the other two extant Ugric languages should be looked up for Tocharian impact as well. But Western IE influence on Magyar is not "mirageous", and cladistics is not the final answer in IE history.

 

APPENDIX G: AN UNTIMELY DEATH IN NOVAE

            Here comes the mysteriously mentioned Lazhen grave inscription. In Late Antiquity the site was Dacia Nova, just South of the limes, the River Danubius, near to the non-negligible city Novae, along the tributary river Asamus. (Now the names are respectively, Bulgaria, Danube, Svistov & Osam.) Originally the provincia had been Moesia Inferior, but in 271 the Empire evacuated the original Dacia (Traiana), repatriated the civitas, and organised for them a new Dacia on the other side of the Danube.

            A young male, hardly beyond boyhood, died and got a nice grave inscription, nontrivial to reproduce here, so let us proceed in a roundabout way.

            The father wanted to write something:

            I went away when my dear maleness was blooming

            in my four and tenth year.

The Classical Latin text would have been something:

            Ego decedebam caro florente marito

            in quartum decimumque annum...

OK, maybe the father looked for some synonym. Lots existed: exibam, moriebar, occidebam &c. But the gravestone gives something surprising for us:

            Ipso immargebam caro florente marito

            in quartum decimumque annum...

Ipso may be a local usage, or, maybe, the death was suicide. But immargebam is "incorrect", and even with a good orthography it would be "immergebam": "I sank in/went down". Maybe proper for a sailor; but otherwise...

            However it was not at all strange for Daicoviciu [32], who published the text. He translated the text to Roumanian as "Mergeam in anul 14-lea." In his vernacular it is "I went away", because "mergeam" is simply "I went away". Similarly, in Albanian "mërgonj" is cca. "I move away".

            In other words, further evolution of the Novae Latin probably led to Modern Roumanian (and to the Latin words of Albanian). On other territories it did not. The immergo -> exeo transition of meaning was not a temporary status of Vulgar Latin, but a Common Innovation of Palaeo-Roumanian and Palaeo-Albanian in Moesia. Even in neighbouring Dalmatia this change was absent.

            And still Vulgar Latin was mutually understood.

 

 

 

REFERENCES

 [1]       Quine W. V. O.: Methods of Logic. Holt, Rinehart & Winston, New York, 1963

 [2]       Lukács B., Martinás K. & Bérczi Sz.: Symmetry and Katachi in the Works of Aristotle. Forma 15, 173 (2000)

 [3]       Everett H. III: Relative State Formulation of Quantum Mechanics. Rev. Mod. Phys. 29, 454 (1957)

 [4]       Szabédi L.: A magyar nyelv őstörténete. Kriterion, Bucharest, 1977

 [5]       Fodor I.: Az uráli őstörténet és a régészet. Folia Uralica Debreceniensia 8, 143 (2001)

 [6]       Diamond J.: The Rise and Fall of the Third Chimpanzee. Vintage, London, 1992

 [7]       Brugmann K.: Kurze vergleichende Grammatik der indogermanischen Sprachen. Trübner, Strassburg, 1904

 [8]       Lord Kelvin: Nineteenth Century Clouds over the Dynamical Theory of Heat and Light. The London, Edinburgh & Dublin Phil. Mag. 2, 1 (1901)

 [9]       Strabo: Geographika. In 8 Vols. Harvard University Press, Cambridge Mass. 1967

[10]      Piper B. H.: Lord Kalvan of Otherwhen. Ace Books, New York, 1984

[11]      Kortlandt P.: More Evidence for Italo-Celtic. Eriu 32, 1 (1981)

[12]      Warnow T.: Mathematical Approaches to Comparative Linguistics. PNAS 94, 6585 (1997)

[13]      Erdem E., Lifschitz V. & Ringe D.: Temporal Phylogenetic Networks and Logic Programming. arXiv:cs.LO/0508129v1; to appear in Theory and Practice of Logic Programming.

[14]      Collinder B: Hat das Uralische Verwandte? Acta Univ. Uppsaliensis. 1, 109 (1965)

[15]      Gamkrelidze T. & Ivanov I.: The Early History of Indo-European Languages. Sci. Amer. March 1990, p. 110

[16]      Rosenfelder M.: Numbers in over 4000 Languages. www://zompist.com/numbers.shtml

[17]      Makkay J.: The Tiszaszőlős Treasure. (See also some citations therein.)

[18]      Livius T.: Ab urbe condita. Teubner, Leipzig, 1902-30

[19]      Hajdú P.:  Bevezetés az uráli nyelvtudományba. Tankönyvkiadó, Budapest, 1966

[20]      Toynbee A. J.: Some Problems in Greek History. Oxford University Press, London, 1969

[21]      Green R. & Carr J. F.: Great King’s War. Ace Books, New York, 1985

[22]      Garrett A.: Convergence in the Formation of Indo-European Subgroups: Phylogeny and Chronology. In: Phylogenetic Methods and the Prehistory of Languages, eds. Forster P. & Renfrew C., Cambridge, McDonald Inst. for Archaeological Research, p. 139, 2006

[23]      Ringe D., Warnow T. & Taylor Ann: Indo-European and Computational Cladistics. Trans. Philol. Soc. 100, 59 (2002)

[24]      Warnow T.: Mathematical Approaches to Comparative Linguistics. PNAS 94, 6585 (1997)

[25]      Wildman D. E. & al.: Implications of Natural Selection in Shaping 99.4% Nonsynonymous DNA Identity between Humans and Chimpanzees: Enlarging Genus Homo. PNAS 100, 7181 (2003)

[26]      Holba Ágnes & Lukács B.: How to Jump into Humanity: A Mathematical Reconstruction. In: Evolution: from Cosmogenesis to Biogenesis, eds. Lukács B. & al., KFKI-1990-50, p. 125

[27]      Pokorny J.: Indogermanisches etymologisches Wörterbuch. Francke, Bern, 1959

[28]      Richter G. C.: A Maitreyasamiti-nat.aka with Addendum. http://www2.truman.edu/~grichter/translations/tochcog.pdf

[29]      Watkins C.: The American Heritage Dictionary of Indo-European Roots. Houghton Mifflin, Boston, 1965

[30]      Stetsyuk V.: Introduction to the Study of Prehistoric Ethnogenic Processes in Eastern Europe, Part 1-4. www.geocities.com/valentyn_ua/AO2@.doc (where @ stands for 1, 2a, 2b and 3, respectively)

[31]      Jánossy L.: Theory of Relativity Based on Physical Reality. Akadémiai Kiadó, Budapest, 1971

[32]      Daicoviciu C.: A merge. Fossatum - sat. Dacoromania V., Cluj, 1927/8, p. 477

 

 
