orion
03-31-2005, 04:02 PM
NEURO-FRACTAL COMPOSITION OF MEANING: TOWARD A COLLAGE THEOREM FOR LANGUAGE (http://www.cs.wlu.edu/~levy/pubs/bics2004.pdf)
Simon D. Levy, from Washington & Lee University, Lexington, VA
Following the interest in self-similar patterns (fractals) in language and semantics, Levy's paper discusses the well-known Collage Theorem and Iterated Function Systems (http://www.miislita.com/fractals/motifs-iterated-function-systems.html) as applied to language, semantics and the Vector Space Model. Prof. Levy writes:
"Self-similarity in language appears in the guise
of stories within stories, or sentences within sentences (“I
know what I know”), and has been represented in the form
of recursive grammar rules by Chomsky and his followers.
Having observed this common property of language and images,
we present a formal mathematical model for putting
together words and phrases, based on the iterated function
system (IFS) method used in fractal image compression.
Building (literally) on vector-space representations of word
meaning from contemporary cognitive science research, we
show how the meaning of phrases and sentences can likewise
be represented as points in a vector space of arbitrary
dimension. As in fractal image compression, the key is to
find a set of (linear or non-linear) transforms that map the
vector space into itself in a useful way. We conclude by
describing some advantages of such continuous-valued representations
of meaning, and potential implications."
No, this is not another paper on fractal images. It is fractal theory applied to the Term Vector Space Model and to IR in general. Another excellent contribution to the understanding of patterns in words, language and semantics.
Prof. Levy concludes:
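For readers who have not met an IFS before, here is a minimal Python sketch of the idea in the abstract: a set of contractive maps sends a vector space into itself, and applying one map per symbol turns a sequence into a single point. The three symbols, the 2-D space, the offsets and the contraction factor of 0.5 are all assumptions chosen for illustration. This is not the network from Levy's paper, only a toy with the same flavor.

```python
import numpy as np

# Hypothetical toy IFS: one contractive affine map per symbol.
# Every map has the form x -> 0.5 * x + offset, so distances shrink by half.
OFFSETS = {
    "a": np.array([0.0, 0.0]),
    "b": np.array([0.5, 0.0]),
    "c": np.array([0.0, 0.5]),
}

def apply_map(symbol, x):
    """Apply the contractive map associated with one symbol."""
    return 0.5 * x + OFFSETS[symbol]

def encode(sequence, start=None):
    """Encode a symbol sequence as a point by applying its maps in order."""
    x = np.zeros(2) if start is None else np.asarray(start, dtype=float)
    for symbol in sequence:
        x = apply_map(symbol, x)
    return x

if __name__ == "__main__":
    # Order matters: "ab" and "ba" land on different points.
    print(encode("ab"), encode("ba"))
```

Because every map is a contraction, the starting point matters less and less as the sequence grows, which is what lets the structure of the sequence, rather than the initial state, determine where the point ends up.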
CONCLUSIONS, IMPLICATIONS, AND FUTURE WORK
"Describing the linguistic composition of meaning with
fractals instead of grammars allows us to approach a number
of important questions in an entirely new way. For example,
it is generally agreed that the linguistic data to which children
are exposed is of insufficient quality to enable them to induce
general structural patterns without some pre-existing mechanism
for acquiring language (Chomsky 1965). The traditional approach
has been to view this mechanism as a sort of “Universal
Grammar” (more accurately, grammar schema) constraining the
sorts of languages that human beings can acquire."
"Under the approach described in this paper – where the
lexicon consists of co-occurrence vectors and the “grammar” is
encoded as a set of IFS neural network weights – this “poverty
of the stimulus” phenomenon can be viewed as follows: Essentially,
the problem is to find a set of tree vectors and network
weights such that the frontiers of the trees generated by the IFS
match the sentences (strings) to which the learner is exposed.
We would like to offer, very tentatively, that the universal
mechanism by which such a process might be constrained could
be something like Barnsley’s Collage Theorem. That is, the
“correct” set of tree-vectors and weights could be those that produce
an IFS whose attractor covers the set of lexical vectors, so
that the IFS effectively maps the lexicon onto itself. The notion
that the child explores a set of “candidate grammar” hypotheses
while learning language could then be seen as an exploration
of the (real-valued) space of network weights. Again, this view
is supported by the work of Tabor (2000), who describes how
fractal encoding of grammars allows accepting machines for
those grammars to be located in a spatial relationship to one
another."
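To make the Collage Theorem remark a bit more concrete: the theorem says that if a target set L lies within Hausdorff distance eps of its own image W(L) under the IFS, and the maps contract by a factor s < 1, then L lies within eps / (1 - s) of the attractor. The Python sketch below checks that bound on a toy point set. The four corner points standing in for "lexical vectors", the four half-scale maps and all function names are assumptions for illustration, not anything taken from the paper.

```python
import numpy as np

def hausdorff(A, B):
    """Hausdorff distance between two finite point sets (rows are points)."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return max(d.min(axis=1).max(), d.min(axis=0).max())

def hutchinson(points, maps):
    """Union of the images of `points` under every affine map (M, t)."""
    return np.vstack([points @ M.T + t for M, t in maps])

def collage_bound(points, maps):
    """Upper bound eps / (1 - s) on the distance from `points` to the attractor."""
    # s is the largest spectral norm of the linear parts.
    s = max(np.linalg.norm(M, 2) for M, _ in maps)
    if s >= 1.0:
        raise ValueError("maps must be contractive (s < 1)")
    eps = hausdorff(points, hutchinson(points, maps))
    return eps / (1.0 - s)

# Hypothetical stand-in for a lexicon: the four corners of the unit square.
CORNERS = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

# Four half-scale maps whose attractor is the filled unit square.
MAPS = [
    (0.5 * np.eye(2), np.array([0.0, 0.0])),
    (0.5 * np.eye(2), np.array([0.5, 0.0])),
    (0.5 * np.eye(2), np.array([0.0, 0.5])),
    (0.5 * np.eye(2), np.array([0.5, 0.5])),
]

if __name__ == "__main__":
    print(collage_bound(CORNERS, MAPS))
```

In this toy case the corners sit at distance sqrt(0.5) from the filled square (measured from its center), comfortably inside the bound of sqrt(2) that the function returns. Searching for weights that shrink this bound is one way to read the "exploration of the space of network weights" mentioned in the quote.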
Enjoy it.
Orion