From WikiThor

The problem of perfect language translation (or even accurate translation of lengthy sentences and longer passages) is coming into prominence in technical fields as internet access becomes cheaper and more ubiquitous, and as portable devices become more sophisticated and internet-ready. The ever-present internet connection and environment-sensitivity of these devices (in the form of microphones and accelerometers) make them a natural platform for computations that aid travel abroad, namely language translation. However, language is a distinctly human construct reaching deep into prehistory, potentially hundreds of thousands of years, well before the advent of second-order predicate logic and the like, whereas computers and computer technology are, anthropologically speaking, a nascent invention. Thus, to achieve computer-based translation of human language, there must be a bridge of some sort.

As I understand the present state of research, the main thrust of the favored techniques is "teaching" computers human language through a heuristic, neural-network-driven process that draws on the existing body of digitized, human-produced translations. I will call this the "human-centered" approach. The idea I will present shortly takes roughly the reverse approach. It aims to bring human language into a more computer-friendly domain by treating it as a mathematical construction bound by objective, universal laws. This is the "computer-centered" approach: finding a way to treat human language as an extremely specialized, and arcane, version of standard computer (a.k.a. mathematical) language.

The ideaspace is a vector space (potentially of infinite dimension) in which every possible idea lies. For now, a rigorous definition of "idea", and even of the "dimension" of this space, is beyond what I can offer. Likewise, I have no earth-shattering results to share with the world, only the beginnings of an idea that I want to preserve for future use, or for posterity should the theory fail.

So ideaspace is a vector space. Every idea that enters a person's head corresponds to a particular vector in ideaspace, or potentially to a whole subspace. When a person wants to transmit this idea to someone else, he can't simply rattle off a list of coordinates, because coordinates mean nothing without a stated basis. From here, I have two hypotheses.

The first hypothesis is that human culture builds a kind of starting point, which I call the culture's "null vector," since no attempt at communication has yet been initiated. This null vector is culture-specific and time-specific, so the null vector of last year is slightly different from the null vector of today. From here, language provides a series of projection operators that push the null vector through ideaspace until it arrives at its destination. The computational challenge then becomes fixing some basis for this vector space and using the existing body of literature in both languages to establish their null vectors. Then, for the starting language, one applies the sentence's operators to that language's null vector to obtain the idea vector in ideaspace. To get the translation in the final language, one must find a set of that language's projection operators that take its null vector to the same idea vector. Right now I'm not sure how this would be done, but I'm not putting much effort into investigating it because this hypothesis has a few problems. Chiefly, the representation is too specific. In practice, two people from the same culture (with presumably the same null vector) can hear the same sentence (with exactly the same projection operators for each person) and derive completely different meanings, which this model cannot produce. This has led to my second hypothesis.
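The objection to the first hypothesis can be made concrete with a small sketch. Everything here is an assumption made for illustration: a 3-dimensional ideaspace, the made-up vector `culture_null`, and two coordinate projections standing in for the operators a sentence might supply. It shows only that a shared null vector plus shared operators forces every listener to the same final vector.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def apply(op: Matrix, v: Vector) -> Vector:
    """Multiply the matrix op by the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in op]


def interpret(null_vector: Vector, sentence: List[Matrix]) -> Vector:
    """Apply the sentence's operators in order, starting from the null vector."""
    v = list(null_vector)
    for op in sentence:
        v = apply(op, v)
    return v


# Hypothetical culture-wide starting point in a toy 3-dimensional ideaspace.
culture_null: Vector = [1.0, 2.0, 3.0]

# Two toy projection operators (each satisfies P * P == P).
drop_z: Matrix = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
drop_x: Matrix = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
sentence = [drop_z, drop_x]

# Two listeners from the same culture share the same null vector,
# so the model gives them exactly the same meaning.
listener_a = interpret(culture_null, sentence)
listener_b = interpret(culture_null, sentence)
print(listener_a, listener_b)
```

Because nothing in the model distinguishes one listener from another, the two results cannot differ, which is the over-specificity described above.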

Instead of some arbitrary null vector as the starting point, I posit that a culture "pre-operates" on the whole of ideaspace with a set of projection operators, leaving a subspace. A specific individual's null vector could then lie anywhere in that subspace, so that when two individuals hear the same sentence, the projection operators of that sentence take them to two different final vectors, but those vectors are "close" to each other in the sense that both lie within the image of the culture's entire starting subspace under the sentence's operators. This approach would (probably) change the theory from one set in a vector space to one set in projective space (where multiplication by a nonzero scalar doesn't change the "vector"). It appears computationally more difficult but theoretically more accurate: it accounts for individual bias, yet its results are not individual-specific (though they could perhaps be made so).
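A minimal sketch of the second hypothesis, under loudly stated assumptions: the culture's subspace is taken to be spanned by two made-up vectors in `culture_basis`, a single toy projection `sentence_op` stands in for a whole sentence, and the helper `same_ray` is a hypothetical test for "same point in projective space" (it assumes both arguments are nonzero). None of these names or numbers come from the theory itself.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def apply(op: Matrix, v: Vector) -> Vector:
    """Multiply the matrix op by the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in op]


def combine(coeffs: List[float], basis: List[Vector]) -> Vector:
    """Return the linear combination sum_i coeffs[i] * basis[i]."""
    dim = len(basis[0])
    return [sum(c * b[k] for c, b in zip(coeffs, basis)) for k in range(dim)]


def same_ray(u: Vector, v: Vector, tol: float = 1e-9) -> bool:
    """True if nonzero u and v are parallel, i.e. differ only by a scalar."""
    n = len(u)
    return all(
        abs(u[i] * v[j] - u[j] * v[i]) <= tol
        for i in range(n)
        for j in range(i + 1, n)
    )


# Hypothetical subspace left over after the culture "pre-operates" on ideaspace.
culture_basis: List[Vector] = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]

# One toy projection standing in for the operators of a sentence.
sentence_op: Matrix = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]

# Each individual's null vector lies somewhere in the culture's subspace.
coeffs_a, coeffs_b = [2.0, 1.0], [1.0, 3.0]
null_a = combine(coeffs_a, culture_basis)
null_b = combine(coeffs_b, culture_basis)

# The same sentence takes the two individuals to different final vectors...
meaning_a = apply(sentence_op, null_a)
meaning_b = apply(sentence_op, null_b)

# ...but both lie in the image of the culture's subspace under the sentence.
image_basis = [apply(sentence_op, b) for b in culture_basis]
in_image_a = meaning_a == combine(coeffs_a, image_basis)
in_image_b = meaning_b == combine(coeffs_b, image_basis)

# Projective view: rescaling a null vector does not change the resulting point.
scaled_a = [10.0 * x for x in null_a]
same_point = same_ray(meaning_a, apply(sentence_op, scaled_a))
print(meaning_a, meaning_b, in_image_a, in_image_b, same_point)
```

In this toy setting, "closeness" is just membership in the same image subspace; a real treatment would presumably need a metric on that subspace, which the sketch does not attempt.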
