Difference between revisions of "Expected Float Entropy Minimisation"

From Mathematical Consciousness Science Wiki
Line 1: Line 1:
 
'''Expected Float Entropy Minimisation (EFE)''' is a mathematically formulated model of consciousness that follows naturally from the intuitive idea that consciousness may be some kind of minimum entropy interpretation of system states. It is formulated with the aim of explaining (up to relationship isomorphism) how the brain defines the content of consciousness at least with respect to all relationships and associations within subjective experience and the structural content comprised of such relationships. For example, one might ask how the brain defines the perceived geometry of the field of view or the perceived relationships between different colours, or between different audible frequencies. At higher structural levels there are also perceived relationships between different objects and between objects and words for example.
 
  
Due to properties such as learning, the brain is very biased toward certain system states and therefore determines typical system states and, in theory, a probability distribution over the set of all system states. This opens up the possibility of applying [https://en.wikipedia.org/wiki/Information_theory/ information theory] type approaches and EFE has some similarities with conditional Shannon entropy except the condition involved is comprised of relationship parameters. EFE is a measure of the expected amount of information required to specify the state of a system (such as an artificial or [https://en.wikipedia.org/wiki/Neural_circuit/ biological neural network]) beyond what is already known about the system from the relationship parameters. For certain non-uniformly random systems, particular choices of the relationship parameters are isolated from other choices in the sense that they give much lower Expected Float Entropy values and, therefore, the system defines relationships. In the context of these relationships a brain state acquires meaning in the form of the relational content of the corresponding experience. The principle article (Quasi-Conscious Multivariate Systems<ref name=Mason2016>Mason, J. W. (2016), Quasi-conscious multivariate systems. Complexity, 21: 125-147. doi:10.1002/cplx.21720</ref>) on this mathematical theory was published in 2015 and was followed by the article (From Learning to Consciousness: An Example Using Expected Float Entropy Minimisation<ref name=Mason2019>Mason, J. W. (2019), From Learning to Consciousness: An Example Using Expected Float Entropy Minimisation. Entropy, 21, 60. doi:10.3390/e21010060</ref>) in 2019. EFE first appeared in a publication in 2012<ref name=Mason2012>Mason, J. W. (2013), Consciousness and the structuring property of typical data. Complexity, 18: 28-37. doi:10.1002/cplx.21431</ref>.  
+
Due to properties such as learning, the brain is very biased toward certain system states and therefore determines typical system states and, in theory, a probability distribution over the set of all system states. This opens up the possibility of applying [https://en.wikipedia.org/wiki/Information_theory information theory] type approaches and EFE has some similarities with conditional Shannon entropy except the condition involved is comprised of relationship parameters. EFE is a measure of the expected amount of information required to specify the state of a system (such as an artificial or [https://en.wikipedia.org/wiki/Neural_circuit biological neural network]) beyond what is already known about the system from the relationship parameters. For certain non-uniformly random systems, particular choices of the relationship parameters are isolated from other choices in the sense that they give much lower Expected Float Entropy values and, therefore, the system defines relationships. In the context of these relationships a brain state acquires meaning in the form of the relational content of the corresponding experience. The principle article (Quasi-Conscious Multivariate Systems<ref name=Mason2016>Mason, J. W. (2016), Quasi-conscious multivariate systems. Complexity, 21: 125-147. doi:10.1002/cplx.21720</ref>) on this mathematical theory was published in 2015 and was followed by the article (From Learning to Consciousness: An Example Using Expected Float Entropy Minimisation<ref name=Mason2019>Mason, J. W. (2019), From Learning to Consciousness: An Example Using Expected Float Entropy Minimisation. Entropy, 21, 60. doi:10.3390/e21010060</ref>) in 2019. EFE first appeared in a publication in 2012<ref name=Mason2012>Mason, J. W. (2013), Consciousness and the structuring property of typical data. Complexity, 18: 28-37. doi:10.1002/cplx.21431</ref>.  
  
 
The nomenclature “Float Entropy” comes from the notion of floating a choice of relationship parameters over a state of a system, similar to the idiom “to float an idea”.  Optimisation methods are used in order to obtain the relationship parameters that minimise Expected Float Entropy. A process that performs this minimisation is itself a type of learning method.
 
  
 
==Overview==
 
Relationships are ubiquitous among mathematical structures. In particular, weighted relations (also called weighted graphs and [https://en.wikipedia.org/wiki/Weighted_network/ weighted networks]) are very general mathematical objects and, in the finite case, are often handled as adjacency matrices. They are a generalisation of graphs and include all [https://en.wikipedia.org/wiki/Function_(mathematics)/ functions] since functions are a rather constrained type of [https://en.wikipedia.org/wiki/Graph_(discrete_mathematics)/ graph]. It is also the case that [[consciousness]] is awash with relationships; for example, red has a stronger relationship to orange than to green, relationships between points in our field of view give rise to geometry, some smells are similar whilst others are very different, and there’s an enormity of other relationships involving many senses such as between the sound of someone’s name, their visual appearance and the timbre of their voice. Expected Float Entropy includes weighted relations as parameters and, for certain non-uniformly random systems, certain choices of weighted relations are isolated from other choices in the sense that they give much lower Expected Float Entropy values. Therefore, systems such as the brain define relationships and, according to the theory, in the context of these relationships a brain state acquires meaning in the form of the relational content of the corresponding experience.
+
Relationships are ubiquitous among mathematical structures. In particular, weighted relations (also called weighted graphs and [https://en.wikipedia.org/wiki/Weighted_network weighted networks]) are very general mathematical objects and, in the finite case, are often handled as adjacency matrices. They are a generalisation of graphs and include all [https://en.wikipedia.org/wiki/Function_(mathematics) functions] since functions are a rather constrained type of [https://en.wikipedia.org/wiki/Graph_(discrete_mathematics) graph]. It is also the case that [[consciousness]] is awash with relationships; for example, red has a stronger relationship to orange than to green, relationships between points in our field of view give rise to geometry, some smells are similar whilst others are very different, and there’s an enormity of other relationships involving many senses such as between the sound of someone’s name, their visual appearance and the timbre of their voice. Expected Float Entropy includes weighted relations as parameters and, for certain non-uniformly random systems, certain choices of weighted relations are isolated from other choices in the sense that they give much lower Expected Float Entropy values. Therefore, systems such as the brain define relationships and, according to the theory, in the context of these relationships a brain state acquires meaning in the form of the relational content of the corresponding experience.
 
Expected Float Entropy minimisation is very general in scope. For example, the theory has been successfully applied in the context to image processing<ref name="Mason2016" /> but also applies to waveform recovery from audio data<ref name="Mason2012" />.
 
  
Line 17: Line 17:
 
If <math>S</math> is the set of nodes of a system, such as a neural network, then a state of the system <math>S_{i}</math> is given by the aggregate of the states of the nodes over some range <math>V:=\{v_{1},v_{2},\ldots,v_{m}\}</math> of node states. Therefore each state of the system <math>S_{i}</math> is determined by a corresponding function <math> f_{i}:S\to V</math>. The set of all possible states of the system is denoted <math>\Omega_{S,V}</math>.
 
  
Given an element <math>S_{i}\in\Omega_{S,V}</math>, the above definitions give rise to a [https://en.wikipedia.org/wiki/Canonical_map/ canonical map] from <math>\Psi_{V}</math> to <math>\Psi_{S}</math>. That is, for <math>U\in\Psi_{V}</math>, the function <math>R\{U,S_{i}\}</math> defined by
+
Given an element <math>S_{i}\in\Omega_{S,V}</math>, the above definitions give rise to a [https://en.wikipedia.org/wiki/Canonical_map canonical map] from <math>\Psi_{V}</math> to <math>\Psi_{S}</math>. That is, for <math>U\in\Psi_{V}</math>, the function <math>R\{U,S_{i}\}</math> defined by
 
:<math>R\{U,S_{i}\}(a,b):=U(f_{i}(a),f_{i}(b))</math>, for all <math>a,b\in S</math>,
 
 
is an element of <math>\Psi_{S}</math>.
 
Line 23: Line 23:
 
For <math>U\in\Psi_{V}</math> and <math>R\in\Psi_{S}</math>, the '''Float Entropy''' of a state of the system <math>S_{i}\in\Omega_{S,V}</math>, relative to <math>U</math> and <math>R</math>, is defined as  
 
 
:<math>fe(R,U,S_{i}):=\log_{2}(\#\{S_{j}\in\Omega_{S,V}\colon d(R,R\{U,S_{j}\})\leq d(R,R\{U,S_{i}\})\})</math>,
 
where <math>d</math> is a metric given by a [https://en.wikipedia.org/wiki/Matrix_norm/ matrix norm] on the elements of <math>\Psi_{S}</math> in matrix form. In the article Quasi-Conscious Multivariate Systems<ref name="Mason2016" />  the <math>L_{1}</math> norm is used. The article also includes a more general definition of Float Entropy called Multirelational Float Entropy and the nodes of the system can be larger structures than individual neurons.
+
where <math>d</math> is a metric given by a [https://en.wikipedia.org/wiki/Matrix_norm matrix norm] on the elements of <math>\Psi_{S}</math> in matrix form. In the article Quasi-Conscious Multivariate Systems<ref name="Mason2016" />  the <math>L_{1}</math> norm is used. The article also includes a more general definition of Float Entropy called Multirelational Float Entropy and the nodes of the system can be larger structures than individual neurons.
  
 
The '''Expected Float Entropy (EFE)''' of a system, relative to <math>U\in\Psi_{V}</math> and <math>R\in\Psi_{S}</math>, is defined as
 
 
:<math>efe(R,U,P):=\sum_{S_{i}\in\Omega_{S,V}}P(S_{i})fe(R,U,S_{i})</math>,
 
where <math>P</math> is the [https://en.wikipedia.org/wiki/Probability_distribution/ probability distribution] <math>P:\Omega_{S,V}\to [0,1]</math> determined by the bias of the system due to the long term effect of the system’s inherent learning paradigms in response to external stimulus.
+
where <math>P</math> is the [https://en.wikipedia.org/wiki/Probability_distribution probability distribution] <math>P:\Omega_{S,V}\to [0,1]</math> determined by the bias of the system due to the long term effect of the system’s inherent learning paradigms in response to external stimulus.
  
 
According to the theory, a system (such as the brain and its subregions) defines a particular choice of <math>U</math> and <math>R</math> (up to a certain resolution) under the requirement that the EFE is minimized. Therefore, for a given system (i.e., for a fixed <math>P</math>), solutions in <math>U</math> and <math>R</math> to the equation
 
Line 43: Line 43:
  
 
===Connection with ideas in topology===
 
In its simplest form involving only “primary relationships” (i.e. just <math>R</math> and <math>U</math> as shown above) EFE minimisation can also be considered as a generalisation of the [https://en.wikipedia.org/wiki/Initial_topology/ initial topology] (i.e. weak topology). To see this, the family of functions involved are the typical (probable) system states, the common domain of these functions is the set of system nodes (e.g. neurons, tuples of neurons or larger structures) and the common codomain is the set of node states. In the case of the initial topology a topology is already assumed on the common codomain and the initial topology is then the coarsest topology on the common domain for which the functions are continuous. In the case of EFE minimisation no structure is assumed on either the domain or codomain. Instead EFE minimisation simultaneously finds structures (for us weighted graphs, but topologies could in principle be used) on both the domain and codomain such that the functions are close (in some suitable sense) to being continuous whilst avoiding trivial solutions (such as the two element trivial topology) for which arbitrary improbable functions (system states) would also be continuous. Thus we find the primary relational structures that the system itself defines. In this context objects (visual and auditory) are present and EFE then extends to secondary relationships between such objects by involving correlation for example.
+
In its simplest form involving only “primary relationships” (i.e. just <math>R</math> and <math>U</math> as shown above) EFE minimisation can also be considered as a generalisation of the [https://en.wikipedia.org/wiki/Initial_topology initial topology] (i.e. weak topology). To see this, the family of functions involved are the typical (probable) system states, the common domain of these functions is the set of system nodes (e.g. neurons, tuples of neurons or larger structures) and the common codomain is the set of node states. In the case of the initial topology a topology is already assumed on the common codomain and the initial topology is then the coarsest topology on the common domain for which the functions are continuous. In the case of EFE minimisation no structure is assumed on either the domain or codomain. Instead EFE minimisation simultaneously finds structures (for us weighted graphs, but topologies could in principle be used) on both the domain and codomain such that the functions are close (in some suitable sense) to being continuous whilst avoiding trivial solutions (such as the two element trivial topology) for which arbitrary improbable functions (system states) would also be continuous. Thus we find the primary relational structures that the system itself defines. In this context objects (visual and auditory) are present and EFE then extends to secondary relationships between such objects by involving correlation for example.
  
 
==Connection to other mathematical theories of consciousness==
 
There are some similarities between the minimisation of Expected Float Entropy and the minimisation of surprise in [https://en.wikipedia.org/wiki/Karl_J._Friston/ Karl J. Friston]’s [https://en.wikipedia.org/wiki/Free_energy_principle/ Free energy principle]. The theory is also somewhat complementary to [https://en.wikipedia.org/wiki/Giulio_Tononi/ Giulio Tononi]’s [https://en.wikipedia.org/wiki/Integrated_information_theory/ Integrated information theory] (IIT) which was initially developed to quantify consciousness but gave little priority to how systems may define relationships.
+
There are some similarities between the minimisation of Expected Float Entropy and the minimisation of surprise in [https://en.wikipedia.org/wiki/Karl_J._Friston Karl J. Friston]’s [https://en.wikipedia.org/wiki/Free_energy_principle Free energy principle]. The theory is also somewhat complementary to [https://en.wikipedia.org/wiki/Giulio_Tononi Giulio Tononi]’s [https://en.wikipedia.org/wiki/Integrated_information_theory Integrated information theory] (IIT) which was initially developed to quantify consciousness but gave little priority to how systems may define relationships.
  
 
== See also ==
 
* [https://en.wikipedia.org/wiki/Information_theory/ Information theory]
+
* [https://en.wikipedia.org/wiki/Information_theory Information theory]
* [https://en.wikipedia.org/wiki/Integrated_information_theory/ Integrated information theory]
+
* [https://en.wikipedia.org/wiki/Integrated_information_theory Integrated information theory]
* [https://en.wikipedia.org/wiki/Free_energy_principle/ Free energy principle]
+
* [https://en.wikipedia.org/wiki/Free_energy_principle Free energy principle]
 
* [[Consciousness]]
 
* [https://en.wikipedia.org/wiki/Hard_problem_of_consciousness/ Hard problem of consciousness]
+
* [https://en.wikipedia.org/wiki/Hard_problem_of_consciousness Hard problem of consciousness]
* [https://en.wikipedia.org/wiki/Mind%E2%80%93body_problem/ Mind–body problem]
+
* [https://en.wikipedia.org/wiki/Mind%E2%80%93body_problem Mind–body problem]
* [https://en.wikipedia.org/wiki/Philosophy_of_mind/ Philosophy of mind]
+
* [https://en.wikipedia.org/wiki/Philosophy_of_mind Philosophy of mind]
  
 
== References ==
 

Revision as of 12:33, 4 May 2020

Expected Float Entropy Minimisation (EFE) is a mathematically formulated model of consciousness that follows naturally from the intuitive idea that consciousness may be some kind of minimum entropy interpretation of system states. It is formulated with the aim of explaining (up to relationship isomorphism) how the brain defines the content of consciousness at least with respect to all relationships and associations within subjective experience and the structural content comprised of such relationships. For example, one might ask how the brain defines the perceived geometry of the field of view or the perceived relationships between different colours, or between different audible frequencies. At higher structural levels there are also perceived relationships between different objects and between objects and words for example.

Due to properties such as learning, the brain is very biased toward certain system states and therefore determines typical system states and, in theory, a probability distribution over the set of all system states. This opens up the possibility of applying information theory type approaches, and EFE has some similarities with conditional Shannon entropy except that the condition involved is comprised of relationship parameters. EFE is a measure of the expected amount of information required to specify the state of a system (such as an artificial or biological neural network) beyond what is already known about the system from the relationship parameters. For certain non-uniformly random systems, particular choices of the relationship parameters are isolated from other choices in the sense that they give much lower Expected Float Entropy values and, therefore, the system defines relationships. In the context of these relationships a brain state acquires meaning in the form of the relational content of the corresponding experience. The principal article (Quasi-Conscious Multivariate Systems[1]) on this mathematical theory was published in 2015 and was followed by the article (From Learning to Consciousness: An Example Using Expected Float Entropy Minimisation[2]) in 2019. EFE first appeared in a publication in 2012[3].

The nomenclature “Float Entropy” comes from the notion of floating a choice of relationship parameters over a state of a system, similar to the idiom “to float an idea”. Optimisation methods are used in order to obtain the relationship parameters that minimise Expected Float Entropy. A process that performs this minimisation is itself a type of learning method.

Overview

Relationships are ubiquitous among mathematical structures. In particular, weighted relations (also called weighted graphs and weighted networks) are very general mathematical objects and, in the finite case, are often handled as adjacency matrices. They are a generalisation of graphs and include all functions since functions are a rather constrained type of graph. It is also the case that consciousness is awash with relationships; for example, red has a stronger relationship to orange than to green, relationships between points in our field of view give rise to geometry, some smells are similar whilst others are very different, and there is a vast number of other relationships involving many senses, such as between the sound of someone’s name, their visual appearance and the timbre of their voice. Expected Float Entropy includes weighted relations as parameters and, for certain non-uniformly random systems, certain choices of weighted relations are isolated from other choices in the sense that they give much lower Expected Float Entropy values. Therefore, systems such as the brain define relationships and, according to the theory, in the context of these relationships a brain state acquires meaning in the form of the relational content of the corresponding experience. Expected Float Entropy minimisation is very general in scope. For example, the theory has been successfully applied in the context of image processing[1] but also applies to waveform recovery from audio data[3].

Definitions and connections with some areas of mathematics

Definitions

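For a nonempty set <math>S</math>, a weighted relation on <math>S</math> is a function of the form

:<math>R:S^{2}\to [0,1]</math>.

Such a weighted relation is called reflexive if <math>R(a,a)=1</math> for all <math>a\in S</math>, and symmetric if <math>R(a,b)=R(b,a)</math> for all <math>a,b\in S</math>. The set of all reflexive, symmetric weighted relations on <math>S</math> is denoted <math>\Psi_{S}</math>.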
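
As a minimal sketch of these definitions, assume a finite set whose weighted relation is stored as a square NumPy array with weights in [0, 1], with reflexive taken to mean a diagonal of ones and symmetric taken to mean equality with the transpose. The function names below are illustrative and do not come from the cited articles.

```python
import numpy as np

def is_weighted_relation(R):
    """True if R is a square matrix with every weight in [0, 1]."""
    R = np.asarray(R, dtype=float)
    return (R.ndim == 2 and R.shape[0] == R.shape[1]
            and bool(np.all((R >= 0.0) & (R <= 1.0))))

def is_reflexive(R):
    """Assumed meaning: every element is related to itself with weight 1."""
    return bool(np.allclose(np.diag(R), 1.0))

def is_symmetric(R):
    """The weight from a to b equals the weight from b to a."""
    return bool(np.allclose(R, np.transpose(R)))

def in_psi(R):
    """True if R is a reflexive, symmetric weighted relation."""
    R = np.asarray(R, dtype=float)
    return is_weighted_relation(R) and is_reflexive(R) and is_symmetric(R)

# Hypothetical examples: the first matrix qualifies, the second is not symmetric.
R_good = np.array([[1.0, 0.8, 0.1],
                   [0.8, 1.0, 0.1],
                   [0.1, 0.1, 1.0]])
R_bad = np.array([[1.0, 0.8],
                  [0.3, 1.0]])
print(in_psi(R_good), in_psi(R_bad))
```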

If <math>S</math> is the set of nodes of a system, such as a neural network, then a state of the system <math>S_{i}</math> is given by the aggregate of the states of the nodes over some range <math>V:=\{v_{1},v_{2},\ldots,v_{m}\}</math> of node states. Therefore each state of the system <math>S_{i}</math> is determined by a corresponding function <math>f_{i}:S\to V</math>. The set of all possible states of the system is denoted <math>\Omega_{S,V}</math>.

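Given an element <math>S_{i}\in\Omega_{S,V}</math>, the above definitions give rise to a canonical map from <math>\Psi_{V}</math> to <math>\Psi_{S}</math>. That is, for <math>U\in\Psi_{V}</math>, the function <math>R\{U,S_{i}\}</math> defined by

:<math>R\{U,S_{i}\}(a,b):=U(f_{i}(a),f_{i}(b))</math>, for all <math>a,b\in S</math>,

is an element of <math>\Psi_{S}</math>.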
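
The canonical map can be sketched in a few lines, assuming the node states are indexed 0, 1, ..., m-1, a system state is given as a sequence of such indices (one per node), and the weighted relation on node states is an m-by-m NumPy array. The name `induced_relation` and the toy values are hypothetical.

```python
import numpy as np

def induced_relation(U, state):
    """Pull a relation on node states back to a relation on nodes.

    Entry (a, b) of the result is U[state[a], state[b]], that is, the
    weight U assigns to the pair of states held by nodes a and b.
    """
    idx = np.asarray(state)
    return U[np.ix_(idx, idx)]

# Hypothetical toy system: 3 nodes, 2 node states (indexed 0 and 1).
U = np.array([[1.0, 0.2],
              [0.2, 1.0]])
state = [0, 1, 1]          # node 0 holds state 0; nodes 1 and 2 hold state 1
R_induced = induced_relation(U, state)
print(R_induced)
```

Because the relation on node states is reflexive and symmetric, the induced matrix has ones on its diagonal and equals its transpose, which is why it is again a reflexive, symmetric weighted relation on the nodes.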

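For <math>U\in\Psi_{V}</math> and <math>R\in\Psi_{S}</math>, the Float Entropy of a state of the system <math>S_{i}\in\Omega_{S,V}</math>, relative to <math>U</math> and <math>R</math>, is defined as

:<math>fe(R,U,S_{i}):=\log_{2}(\#\{S_{j}\in\Omega_{S,V}\colon d(R,R\{U,S_{j}\})\leq d(R,R\{U,S_{i}\})\})</math>,

where <math>d</math> is a metric given by a matrix norm on the elements of <math>\Psi_{S}</math> in matrix form. In the article Quasi-Conscious Multivariate Systems[1] the <math>L_{1}</math> norm is used. The article also includes a more general definition of Float Entropy called Multirelational Float Entropy and the nodes of the system can be larger structures than individual neurons.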
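
A brute-force sketch of Float Entropy for a very small system follows. It assumes node states indexed 0 to m-1, takes the distance to be the entrywise sum of absolute differences (one reading of the L1 norm, chosen here as an assumption), and enumerates every system state, which is only feasible for toy sizes. All names and numerical values are hypothetical.

```python
import itertools
import numpy as np

def induced_relation(U, state):
    """Relation on nodes induced by U and a system state (tuple of state indices)."""
    idx = np.asarray(state)
    return U[np.ix_(idx, idx)]

def distance(R1, R2):
    """Entrywise L1 distance between two relations in matrix form (assumed metric)."""
    return float(np.abs(R1 - R2).sum())

def float_entropy(R, U, state, n_values):
    """log2 of the number of states whose induced relation is at least as close to R."""
    n_nodes = R.shape[0]
    d_i = distance(R, induced_relation(U, state))
    count = 0
    for other in itertools.product(range(n_values), repeat=n_nodes):
        # The small tolerance guards against floating-point rounding on ties.
        if distance(R, induced_relation(U, other)) <= d_i + 1e-12:
            count += 1
    return float(np.log2(count))

# Hypothetical toy system: nodes 0 and 1 strongly related, node 2 weakly related.
R = np.array([[1.0, 0.8, 0.1],
              [0.8, 1.0, 0.1],
              [0.1, 0.1, 1.0]])
U = np.eye(2)  # two node states, each related only to itself

fe_matching = float_entropy(R, U, (0, 0, 1), n_values=2)
fe_uniform = float_entropy(R, U, (0, 0, 0), n_values=2)
print(fe_matching, fe_uniform)
```

In this toy case the state in which nodes 0 and 1 agree and node 2 differs fits the relation best, so few states are at least as close and its Float Entropy is low, whereas the all-equal state is the worst fit and every state counts towards its Float Entropy.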

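The Expected Float Entropy (EFE) of a system, relative to <math>U\in\Psi_{V}</math> and <math>R\in\Psi_{S}</math>, is defined as

:<math>efe(R,U,P):=\sum_{S_{i}\in\Omega_{S,V}}P(S_{i})fe(R,U,S_{i})</math>,

where <math>P</math> is the probability distribution <math>P:\Omega_{S,V}\to [0,1]</math> determined by the bias of the system due to the long term effect of the system’s inherent learning paradigms in response to external stimulus.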

According to the theory, a system (such as the brain and its subregions) defines a particular choice of <math>U</math> and <math>R</math> (up to a certain resolution) under the requirement that the EFE is minimised. Therefore, for a given system (i.e., for a fixed <math>P</math>), solutions in <math>U</math> and <math>R</math> to the equation

:<math>efe(R,U,P)=\min_{R'\in\Psi_{S},\,U'\in\Psi_{V}}efe(R',U',P)</math>

are the weighted relations of interest. For example, when the theory is applied to digital photographs, <math>U</math> gives the relationships between colours and <math>R</math> gives the relationships that determine the geometry of the field of view.
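
The following sketch illustrates the minimisation on a toy system. It is not the optimisation method of the cited articles: it fixes the relation on node states, compares only three hand-picked candidate relations on the nodes, and uses an entrywise L1 distance, all of which are assumptions made for illustration. The probability distribution, the candidate relations and every name are hypothetical.

```python
import itertools
import numpy as np

def induced_relation(U, state):
    idx = np.asarray(state)
    return U[np.ix_(idx, idx)]

def distance(R1, R2):
    # Entrywise L1 distance between relations in matrix form (assumed metric).
    return float(np.abs(R1 - R2).sum())

def expected_float_entropy(R, U, P, n_values):
    """Sum over all states of P(state) * float entropy of that state."""
    n_nodes = R.shape[0]
    states = list(itertools.product(range(n_values), repeat=n_nodes))
    dists = np.array([distance(R, induced_relation(U, s)) for s in states])
    total = 0.0
    for s, d_i in zip(states, dists):
        p = P.get(s, 0.0)
        if p > 0.0:
            # Number of states at least as close to R (tolerance for rounding).
            count = np.count_nonzero(dists <= d_i + 1e-12)
            total += p * np.log2(count)
    return float(total)

def pair_relation(i, j, n_nodes=3, strong=0.8, weak=0.1):
    """Candidate relation in which only nodes i and j are strongly related."""
    R = np.full((n_nodes, n_nodes), weak)
    np.fill_diagonal(R, 1.0)
    R[i, j] = R[j, i] = strong
    return R

# Hypothetical bias: nodes 0 and 1 usually agree while node 2 differs.
all_states = list(itertools.product(range(2), repeat=3))
typical = {(0, 0, 1), (1, 1, 0)}
P = {s: (0.4 if s in typical else 0.2 / 6) for s in all_states}

U = np.eye(2)  # fixed relation on the two node states
candidates = {pair: pair_relation(*pair) for pair in [(0, 1), (0, 2), (1, 2)]}
efe_values = {pair: expected_float_entropy(R, U, P, n_values=2)
              for pair, R in candidates.items()}
best = min(efe_values, key=efe_values.get)
print(best, efe_values)
```

Among these candidates, the relation that links nodes 0 and 1 gives the lowest value because it matches the states the system is biased toward. A fuller treatment would search over both relations rather than a fixed shortlist.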

Connection with Shannon entropy

The Shannon entropy of a system is defined as

:<math>H(P):=-\sum_{S_{i}\in\Omega_{S,V}}P(S_{i})\log_{2}(P(S_{i}))</math>.

For , and the following equalities holds

                    (1).

The expression on the left is similar in form to the definition of Shannon entropy. The middle expression reveals the value to be similar to that of when the probabilities in the argument of the logarithm are comparable. Indeed, is an approximation of (1). The expression on the right of (1) shows the mathematical connection to Shannon entropy; the first term is the Shannon entropy of the system and, with consideration of the log function, the second term has a negative value between and 0.

Connection with ideas in topology

In its simplest form involving only “primary relationships” (i.e. just <math>R</math> and <math>U</math> as shown above) EFE minimisation can also be considered as a generalisation of the initial topology (i.e. weak topology). To see this, the family of functions involved are the typical (probable) system states, the common domain of these functions is the set of system nodes (e.g. neurons, tuples of neurons or larger structures) and the common codomain is the set of node states. In the case of the initial topology a topology is already assumed on the common codomain and the initial topology is then the coarsest topology on the common domain for which the functions are continuous. In the case of EFE minimisation no structure is assumed on either the domain or codomain. Instead EFE minimisation simultaneously finds structures (for us weighted graphs, but topologies could in principle be used) on both the domain and codomain such that the functions are close (in some suitable sense) to being continuous whilst avoiding trivial solutions (such as the two element trivial topology) for which arbitrary improbable functions (system states) would also be continuous. Thus we find the primary relational structures that the system itself defines. In this context objects (visual and auditory) are present and EFE then extends to secondary relationships between such objects by involving correlation for example.

Connection to other mathematical theories of consciousness

There are some similarities between the minimisation of Expected Float Entropy and the minimisation of surprise in Karl J. Friston’s Free energy principle. The theory is also somewhat complementary to Giulio Tononi’s Integrated information theory (IIT) which was initially developed to quantify consciousness but gave little priority to how systems may define relationships.

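See also

* Information theory
* Integrated information theory
* Free energy principle
* Consciousness
* Hard problem of consciousness
* Mind–body problem
* Philosophy of mind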

References

  1. Mason, J. W. (2016), Quasi-conscious multivariate systems. Complexity, 21: 125-147. doi:10.1002/cplx.21720
  2. Mason, J. W. (2019), From Learning to Consciousness: An Example Using Expected Float Entropy Minimisation. Entropy, 21, 60. doi:10.3390/e21010060
  3. Mason, J. W. (2013), Consciousness and the structuring property of typical data. Complexity, 18: 28-37. doi:10.1002/cplx.21431