diff --git a/.nojekyll b/.nojekyll new file mode 100644 index 0000000..e69de29 diff --git a/404.html b/404.html new file mode 100644 index 0000000..d04a4a6 --- /dev/null +++ b/404.html @@ -0,0 +1,33 @@ + + +
+ + + + + +What does understanding mean? Usually, the word is used as the opposite of mere memorizing. When you memorize something, you just remember it. But when you understand something, a bit more magic happens:
This is quite difficult because the context usually doesn't match the knowledge exactly. For example, if the doctor says "don't drink any water", you may think that you can drink juice. But you can't drink juice either: you can't drink anything.
This is even more difficult. The reason is two-fold:
The most important example of indexing in use is the search engine. Search engines collect the keywords in documents and index them. When you search for a keyword, the search engine returns the documents that contain it. This is the most basic form of indexing.
Search engines provide an efficient way to find a webpage from some keywords. However, you can hardly imagine discovering gravity by searching for "apple" on Google. This is because the search engine doesn't understand the relations between the keywords. It only knows that the keywords appear in the same document. This approach is already very powerful, but we obviously want more.
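The keyword indexing described here can be sketched as a tiny inverted index. This is only a minimal illustration: the three documents and the helper names `build_index` and `search` are made up for the example, and a real search engine does far more (ranking, stemming, crawling).

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase keyword to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            # Strip simple punctuation so "tree." and "tree" are the same keyword.
            index[word.strip('.,!?"')].add(doc_id)
    return index


def search(index, keyword):
    """Return the sorted ids of the documents that contain the keyword."""
    return sorted(index.get(keyword.lower(), set()))


# Hypothetical toy corpus, only for illustration.
documents = {
    "newton": "An apple falls from the tree.",
    "recipe": "Slice the apple and bake the pie.",
    "orbit": "The moon orbits the earth.",
}
index = build_index(documents)
print(search(index, "apple"))
print(search(index, "gravity"))
```

Searching "apple" finds both the Newton note and the recipe, while "gravity" finds nothing, because the index only knows which words occur where.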
One way to analyze the relations and the context of the keywords is to use a knowledge graph. A knowledge graph is a graph that contains objects and the relations between them. For example, juice and water can be two nodes linked by an edge labeled "has ingredient". In this way, the graph might help decide whether you can drink juice when the doctor says "don't drink any water".
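As a rough sketch of the juice and water example, a knowledge graph can be stored as a list of labeled edges and queried by following them. The edge list, the relation name and the function `contains` are assumptions made up for this illustration, not part of any particular knowledge graph system.

```python
# Each edge is (subject, relation, object). All entries are illustrative.
edges = [
    ("juice", "has ingredient", "water"),
    ("tea", "has ingredient", "water"),
    ("bread", "has ingredient", "flour"),
]


def contains(item, target, edges):
    """Return True if item is target or reaches it via 'has ingredient' edges."""
    if item == target:
        return True
    seen = set()
    stack = [item]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        for subject, relation, obj in edges:
            if subject == node and relation == "has ingredient":
                if obj == target:
                    return True
                stack.append(obj)
    return False


# "Don't drink any water" also rules out juice, because juice contains water.
print(contains("juice", "water", edges))
print(contains("bread", "water", edges))
```

The graph only answers this question because someone wrote the "has ingredient" edge by hand, which is exactly the limitation discussed next.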
However, none of these older approaches has an obvious chance of drawing a relation between celestial bodies and gravity. Even after decades of development, they still struggle to understand anything that is a little bit abstract. Certainly we need something new.
I will not go through the history of why LLMs work. But I believe everyone reading this article has some feeling that LLMs can understand abstract mathematical concepts. If you ask ChatGPT:
When I dress myself, I can put on my shirt and then my pants.
+I can also put on my pants and then my shirt. It won't make a difference.
+
+What mathematical concept is this?
+
ChatGPT will recognize that the order of the two actions doesn't matter, so it is related to commutativity. The answer won't change a lot if you modify the situation as long as it represents the concept of commutativity. It's quite hard to imagine how a search engine or a knowledge graph can do this. The word "commutativity" doesn't even appear in the question.
However, this good performance of LLMs comes at a cost. The most important limitation is that the size of an LLM's input is limited, and it is nowhere near the scale of knowledge graphs and traditional search engines. You have to decide which context is most important for the LLM to know before you ask the question. This, again, requires some understanding of the knowledge. LLMs help you the most when you already have some understanding of the knowledge.
The good news is that LLMs help us not only by directly giving the answer. They also help us index existing knowledge. Notice that LLMs are built with deep learning technology, in which neural networks are used to process the knowledge. In the intermediate layers of the neural network, the knowledge is represented as vectors called embeddings.
These embeddings carry all the information about the input and have already been processed by the neural network for abstract understanding. Therefore, if two inputs to the LLM have similar embeddings, they are likely to be related, even in an abstract way. This is the key idea of embedding-based indexing.
Given a few pieces of knowledge, we can use an LLM to generate their embeddings as their index. Whenever a context is given, we generate the embedding of the context and find the similar embeddings in the knowledge base. This lets the model gain essential knowledge before answering a question. Importantly, this embedding similarity-based indexing is scalable, meaning that you have the chance to index the knowledge of astronomy and gravity together!
Though there might still be a lot of steps before we let the model rediscover gravity, we have already seen the potential of LLMs in indexing. Importantly, we have found a good roadmap for solving the two problems we posed at the beginning. For the first, embedding similarity gives us a tool for finding the knowledge relevant to the context and retrieving it. "Don't drink water" will have a high similarity with "don't drink juice". For the second, with the abstract understanding ability of LLMs, we can extract the relation between two pieces of knowledge. An LLM can discover that "don't drink water" actually means "don't drink any liquid".
With embedding-based search in hand, it seems that all that is left for us is to improve its performance. However, you will find this a surprisingly tricky task that involves much philosophical effort. Let's discuss it in another article.
The tree might be the most common traditional form of indexing adopted by human beings before the advent of computers. People have practiced organizing books in nested sections for thousands of years. It has two advantages:
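The retrieval step described here can be sketched with cosine similarity. Note the assumptions: the three-dimensional vectors below are hand-written stand-ins for real LLM embeddings, and `cosine_similarity` and `top_k` are illustrative helper names, not EvoNote functions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, knowledge, k=1):
    """Return the texts of the k entries whose embeddings are most similar."""
    ranked = sorted(
        knowledge,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Hand-written toy vectors; a real system would obtain them from a model.
knowledge = [
    ("don't drink juice", [0.9, 0.1, 0.0]),
    ("planets orbit the sun", [0.0, 0.2, 0.9]),
    ("apples fall to the ground", [0.1, 0.3, 0.8]),
]
query = [1.0, 0.0, 0.1]  # stand-in embedding for "don't drink water"
print(top_k(query, knowledge, k=1))
```

In this toy setup the query lands next to "don't drink juice" even though the two texts share no distinguishing keyword, which is the behavior we hope real embeddings provide.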
This is the most obvious reason and this is why we call it indexing. However, there are more subtle reasons:
These reasons are important when the reader is not familiar with the knowledge. It makes it possible to understand the knowledge better without reading the whole book. It works by the following mechanism:
The reason why we underscore the importance of tree indexing is obviously not because we want to make a better book. Our question is whether LLM can benefit from tree indexing. The answer is obviously yes. The reason is from the following points:
Further, we can make the LLM into an agent that actively travels through the book and adds new content to it.
Usually, in human-made books, the tree structure is quite coarse. One of the reasons might be this: an explicit fine-grained tree structure is hard to make and read. Though human writers might use a lot of lists and asides to make the tree structure more fine-grained in practice, it is laborious to give a name to every small section. Human readers are also unwilling to read a book with too many small sections.
However, the situation does not hold for LLM. This is because
With the discussion above, we know that
As we discussed in the first article, indexing is closely related to understanding. Surely, we can see that tree indexing can help organize and retrieve knowledge. However, could it really help an LLM understand abstract things like science?
',20),r=[i];function s(h,l){return t(),o("div",null,r)}const c=e(n,[["render",s],["__file","2. Tree indexing.html.vue"]]);export{c as default}; diff --git a/assets/2. Tree indexing.html-c8f97e3e.js b/assets/2. Tree indexing.html-c8f97e3e.js new file mode 100644 index 0000000..2a001a5 --- /dev/null +++ b/assets/2. Tree indexing.html-c8f97e3e.js @@ -0,0 +1 @@ +const e=JSON.parse('{"key":"v-6b2d87a8","path":"/writings/2.%20Tree%20indexing.html","title":"Tree indexing","lang":"en-US","frontmatter":{},"headers":[{"level":2,"title":"What does it mean to LLM?","slug":"what-does-it-mean-to-llm","link":"#what-does-it-mean-to-llm","children":[]},{"level":2,"title":"What is the difference between tree indexing for human and for LLM?","slug":"what-is-the-difference-between-tree-indexing-for-human-and-for-llm","link":"#what-is-the-difference-between-tree-indexing-for-human-and-for-llm","children":[]}],"git":{"updatedTime":1696131659000,"contributors":[{"name":"Zijian Zhang","email":"doomspec@outlook.com","commits":1}]},"filePathRelative":"writings/2. Tree indexing.md"}');export{e as data}; diff --git a/assets/2.1 Method of Loci and sparsity.html-004732f1.js b/assets/2.1 Method of Loci and sparsity.html-004732f1.js new file mode 100644 index 0000000..d97fbcc --- /dev/null +++ b/assets/2.1 Method of Loci and sparsity.html-004732f1.js @@ -0,0 +1 @@ +import{_ as a,r as n,o as i,c as s,a as e,b as t,d as h,e as r}from"./app-75398fc7.js";const d={},c=r('Method of Loci, also known as memory palace, is a method to memorize things by associating them with a place. It is a very old method and has been used for thousands of years. It is also a very effective method.
When you try to memorize a list of things, you can just imagine a place you are familiar with and put the things in the place. When you want to recall the things, you can just imagine the place and the things will come to your mind.
Why is this method efficient? Here is the claim:
Claim
The Method of Loci is efficient because it creates a graph of knowledge in which each node has only a limited number of edges. That is, it is a sparse graph.
Here, in the graph of knowledge, the nodes are the contexts (situations) and the edges lead to a memory or to another situation. The whole point of the Method of Loci is to turn a list of things, which is densely indexed, into a sparsely connected structure.
Here is the claim:
Claim
A sparse graph performs better because it fits better in the context window of the human brain.
Thinking with a sparse graph limits the number of things you need to think about at one time. Meanwhile, because the pieces of knowledge are still interconnected, you can still reach the whole of the knowledge.
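One way to make the sparsity claim concrete is to compare the largest number of edges leaving any node. The sketch below is only my illustration of the idea: the function names, the `place_i` labels and the shopping list are invented, and the "memory palace" is simplified to a chain of places.

```python
def flat_list_graph(items):
    """A single 'list' node pointing directly at every item (a dense hub)."""
    return {"list": list(items)}


def loci_graph(items):
    """A chain of places; each place points to one item and to the next place."""
    graph = {}
    for i, item in enumerate(items):
        edges = [item]
        if i + 1 < len(items):
            edges.append(f"place_{i + 1}")
        graph[f"place_{i}"] = edges
    return graph


def max_out_degree(graph):
    """The most edges you must consider at once while standing on one node."""
    return max(len(edges) for edges in graph.values())


items = ["milk", "eggs", "bread", "salt", "tea", "rice"]
print(max_out_degree(flat_list_graph(items)))
print(max_out_degree(loci_graph(items)))
```

With the flat list, one node carries all six edges; with the chain of places, no node carries more than two, yet every item is still reachable by walking the chain.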
In the previous article, we found that indexing is deeply related to understanding. However, can we somehow give a definition of understanding? We mentioned in the first article that memorization is different from understanding. So the explicit knowledge must not be the understanding and understanding must be something other than the knowledge itself. Naturally, because the "other" thing must be related to the stand-alone knowledge, we can call it the implicit context of the knowledge. With this definition, I claim that
Claim
To understand a thing, you must know its implicit context.
and consequently,
Claim
The way to understand a thing is the way to assign an implicit context to it.
Let's illustrate this with a few examples.
Example
Modern educated people think they understand earthquakes and treat them as a result of the movement of tectonic plates. They think they understand because they can fit the phenomenon of earthquakes into their existing knowledge of geology and use it as a context.
Example
Ancient Japanese people thought they understood earthquakes and treated them as a result of the movement of a giant catfish supporting the Japanese islands. They thought they understood because they could fit the phenomenon of earthquakes into their existing knowledge of mythology and use it as a context.
These examples show that different people have different understandings of the same thing. They assign different implicit contexts to one thing, and both strongly believe in theirs.
Here, I want to emphasize that we do not care whether the context they assign is correct. We only care about the fact that they assign different contexts. Importantly, the contexts they assign might both be correct and still be different.
Example
A person who believes in the geocentric model thinks he understands the movement of the planets because it fits perfectly into his existing knowledge of astronomy, which he uses as a context, though in that context the planets move in a very complicated way.
Example
A person who believes in the heliocentric model thinks he understands the movement of the planets because it fits perfectly into his existing knowledge of astronomy, which he uses as a context. The context is different from the previous one, and in it the planets move in a very simple way.
As we introduced in the previous article, tree indexing can help assign a context to the knowledge. With tree indexing, we can find existing knowledge that is similar to the incoming knowledge. With the help of the paths of the existing knowledge, a new path, namely a new context, can be created. This is the way tree indexing helps an LLM understand the knowledge. Specifically, the understanding can be carried out in the following way:
Procedure
The generation of implicit context in tree indexing relies on the logic of paths. Therefore, we can say the logic of path generation can be used to characterize understandings. People can change the path logic to achieve a new understanding. For example: TODO.
LLMs can understand sentences. Where is the implicit context? My interpretation is that the layers of the LLM are responsible for adding these implicit contexts, including the grammar, the meaning of tokens and world knowledge. After being processed by the layers, the hidden state, which is the embedding of the sentence, contains the implicit context. Therefore, we can say that even the simplest embedding-based search provides a way to understand the knowledge, even when no tree structure is involved.
Surely, there are many criteria to classify knowledge. The important thing is how much insight we can get from the classification. In this article, I will introduce how to use the idea of continuous and discrete to classify knowledge.
Definition
Discrete knowledge is knowledge whose state is defined in a discrete space. It cannot vary infinitesimally.
For example, a coin has two states: heads and tails. The state of a coin is discrete knowledge.
More importantly, logical deductions operate on discrete knowledge. All systems that have a flavour of logic and a clear border between what is true and what is wrong, e.g., knowledge graphs and symbolic deduction, mainly operate on discrete knowledge.
Discrete knowledge is clear and easy to operate on with computers. It can ensure 100% correctness given correct assumptions. For fields that have concrete assumptions, e.g., mathematics, discrete knowledge and its deduction will suffice.
However, not all fields have concrete assumptions. In the long debate between rationalism and empiricism, people found that it is far from easy to find reliable and non-trivial assumptions to reason from (see Kant and Hume). My explanation for this failure is that the world is too complex to be described by a few pieces of discrete knowledge. Even if there is such a set of discrete knowledge, it is beyond what the human brain can handle. For example, I admit that the world might be discrete if you look at it at a very small scale. However, the number of discrete states is too large for humans to make any useful deduction, except in cosmology or particle physics. Most useful knowledge does not change its essence when you vary it a little bit.
Definition
Continuous knowledge is knowledge whose state is defined in a continuous space. It allows infinitesimal variation.
For example, the probability that a coin will land heads is continuous knowledge. The probability is a real number between 0 and 1.
More importantly, neural networks hold continuous knowledge. The state of a neural network is defined by the weights of the connections between neurons. The weights are real numbers, which is a continuous space.
It might be tricky to check whether a piece of knowledge is continuous or not. The key is to imagine whether the knowledge can have a very small variation and still remain mostly true. For example, when you try to recall a voice of someone, you can never ensure that your memory today is the same as your memory yesterday. It also works for smell, visual or kinetic memory.
Most importantly, though it also contains discrete knowledge like grammar, a large part of our knowledge about words is continuous. Your feeling about a certain word is continuous. The most obvious example is brands. You must have a certain feeling about Coca-Cola, Pepsi, Tesla and BMW; these feelings don't have a clear border of correctness, nor can you check that your feeling is stable.
The representation power of continuous knowledge is much stronger than that of discrete knowledge. It is very hard to imagine how to represent the feeling of skiing or the recollection of a picture in a discrete format.
Continuous knowledge is more natural for humans to process. Most physical theories also assume that space is continuous or that its discreteness is negligible for humans. The power of continuous knowledge is also demonstrated by the success of neural networks. There was a paradigm shift in artificial intelligence in the 1990s from discrete to continuous, followed by the triumph of neural networks in nearly every field.
Admittedly, the symbols in a language are discrete. However, they are meaningless without an interpreter. The development of natural language processing has witnessed the failure of all the discrete approaches to understanding natural language. History has shown that parsing sentences into syntax trees is hard and not as useful as using neural networks to process natural language directly.
A syntax tree can never represent the exact meaning. For example, I can pose a question: "If apple means eat in the next sentence. 'Mike apple an apple.' What did Mike intake?" This question is easy for a human to answer but will break any natural language parser.
However, the intrinsic drawbacks of continuous knowledge are still there. Even in 2023, we still cannot handle math, logic and coding satisfactorily with neural networks. This is surely because of the discrete nature of these tasks. How to bridge continuous knowledge with discrete knowledge will be the main challenge of building AI.
Insight
EvoNote is trying to add more discrete structure to the continuous knowledge.
EvoNote uses a tree structure to organize natural language at a macro scale (recall the section: Tree indexing). This assigns the continuous knowledge a discrete structure (a tree), which we believe can help build continuous-discrete hybrid knowledge and make AI more capable at discrete tasks.
',30),s=[i];function r(l,d){return t(),o("div",null,s)}const h=e(n,[["render",r],["__file","4. Continuous and discrete knowledge.html.vue"]]);export{h as default}; diff --git a/assets/4. Continuous and discrete knowledge.html-e31de42f.js b/assets/4. Continuous and discrete knowledge.html-e31de42f.js new file mode 100644 index 0000000..50f05f8 --- /dev/null +++ b/assets/4. Continuous and discrete knowledge.html-e31de42f.js @@ -0,0 +1 @@ +const e=JSON.parse('{"key":"v-43d2144f","path":"/writings/4.%20Continuous%20and%20discrete%20knowledge.html","title":"Continuous and discrete knowledge","lang":"en-US","frontmatter":{},"headers":[{"level":2,"title":"Discrete knowledge","slug":"discrete-knowledge","link":"#discrete-knowledge","children":[{"level":3,"title":"What is discrete knowledge?","slug":"what-is-discrete-knowledge","link":"#what-is-discrete-knowledge","children":[]},{"level":3,"title":"What is the property of discrete knowledge?","slug":"what-is-the-property-of-discrete-knowledge","link":"#what-is-the-property-of-discrete-knowledge","children":[]},{"level":3,"title":"Failure of discrete knowledge","slug":"failure-of-discrete-knowledge","link":"#failure-of-discrete-knowledge","children":[]}]},{"level":2,"title":"Continuous knowledge","slug":"continuous-knowledge","link":"#continuous-knowledge","children":[{"level":3,"title":"What is continuous knowledge?","slug":"what-is-continuous-knowledge","link":"#what-is-continuous-knowledge","children":[]},{"level":3,"title":"How to tell whether the knowledge is continuous?","slug":"how-to-tell-whether-the-knowledge-is-continuous","link":"#how-to-tell-whether-the-knowledge-is-continuous","children":[]},{"level":3,"title":"What is the property of continuous knowledge?","slug":"what-is-the-property-of-continuous-knowledge","link":"#what-is-the-property-of-continuous-knowledge","children":[]},{"level":3,"title":"Natural language carries continuous 
knowledge","slug":"natural-language-carries-continuous-knowledge","link":"#natural-language-carries-continuous-knowledge","children":[]},{"level":3,"title":"Failure of continuous knowledge","slug":"failure-of-continuous-knowledge","link":"#failure-of-continuous-knowledge","children":[]}]},{"level":2,"title":"How all this related to EvoNote?","slug":"how-all-this-related-to-evonote","link":"#how-all-this-related-to-evonote","children":[]}],"git":{"updatedTime":1696966382000,"contributors":[{"name":"Zijian Zhang","email":"doomspec@outlook.com","commits":4}]},"filePathRelative":"writings/4. Continuous and discrete knowledge.md"}');export{e as data}; diff --git a/assets/4.1 Interface of continuous and discrete.html-20711f93.js b/assets/4.1 Interface of continuous and discrete.html-20711f93.js new file mode 100644 index 0000000..4b399a1 --- /dev/null +++ b/assets/4.1 Interface of continuous and discrete.html-20711f93.js @@ -0,0 +1 @@ +const e=JSON.parse('{"key":"v-49ee2257","path":"/writings/4.1%20Interface%20of%20continuous%20and%20discrete.html","title":"Interface of continuous and discrete","lang":"en-US","frontmatter":{},"headers":[{"level":2,"title":"Heap attack","slug":"heap-attack","link":"#heap-attack","children":[{"level":3,"title":"Paradox of the heap","slug":"paradox-of-the-heap","link":"#paradox-of-the-heap","children":[]},{"level":3,"title":"Attack to any continuous concept","slug":"attack-to-any-continuous-concept","link":"#attack-to-any-continuous-concept","children":[]}]},{"level":2,"title":"Mixture of continuous and discrete","slug":"mixture-of-continuous-and-discrete","link":"#mixture-of-continuous-and-discrete","children":[{"level":3,"title":"Machine learning engineers","slug":"machine-learning-engineers","link":"#machine-learning-engineers","children":[]}]},{"level":2,"title":"Math that bridges continuous and 
discrete","slug":"math-that-bridges-continuous-and-discrete","link":"#math-that-bridges-continuous-and-discrete","children":[{"level":3,"title":"Group theory","slug":"group-theory","link":"#group-theory","children":[]},{"level":3,"title":"Topology","slug":"topology","link":"#topology","children":[]}]}],"git":{"updatedTime":1696966382000,"contributors":[{"name":"Zijian Zhang","email":"doomspec@outlook.com","commits":4}]},"filePathRelative":"writings/4.1 Interface of continuous and discrete.md"}');export{e as data}; diff --git a/assets/4.1 Interface of continuous and discrete.html-98bab576.js b/assets/4.1 Interface of continuous and discrete.html-98bab576.js new file mode 100644 index 0000000..975f8d4 --- /dev/null +++ b/assets/4.1 Interface of continuous and discrete.html-98bab576.js @@ -0,0 +1 @@ +import{_ as a,r as o,o as i,c as s,a as e,b as t,d as r,e as c}from"./app-75398fc7.js";const h={},d=c('Story
If you remove one grain of sand from a heap of sand, it is still a heap. If you keep removing grains of sand, eventually you will have only one grain of sand left. Is it still a heap?
The key point of this paradox lies in the continuous nature of the concept "heap". The concept "heap" is never actually well-defined when we use it in natural language. It also does not have a formal definition of any kind. Your knowledge about the heap might vary infinitesimally.
Observation
For the same object, you might think it is a heap this second and not a heap the next second. However, you do not think your knowledge about the heap is changing even when you give different answers. This shows that the concept "heap" is continuous, because infinitesimal variation does not matter.
For any continuous concept that does not have a clear definition, you can always attack it in the following way:
Protocol
For example, you can attack the concept "machine learning" in the following way:
Example
machine learning
.machine learning
.machine learning
Importantly, nearly all the big concepts are continuous, including philosophy, science, understanding, knowledge, freedom, democracy, etc. They are all vulnerable to this attack. Though they are useful concepts, people should keep in mind that they can never have a clear definition. History has shown this impossibility again and again.
There is knowledge that is neither purely continuous nor purely discrete. It is a mixture of both. For example, though I claimed that neural networks mainly carry continuous knowledge, they also carry discrete knowledge in their network structure. The network structure is discrete and the weights are continuous.
The tree of knowledge in EvoNote is the same. The tree structure is discrete and the knowledge in the nodes is continuous because it is natural language.
Machine learning engineers do a funny job. They design the discrete structure of the neural network and train the continuous weights. In this way, they mix discrete and continuous knowledge together.
In this process, one interesting point is the roles of human and machine. The discrete structure is designed by humans and the continuous weights are trained by machines. Therefore, the machine only deals with continuous knowledge and the discrete knowledge is handled by humans. This matches the performance of the models as products: they are good at continuous tasks and bad at discrete tasks.
Neural Architecture Search (NAS) is a process that tries to automate the design of the discrete structure. However, it lost badly in the competition with the transformer structure. This again suggests that machine learning is not good at discrete tasks.
Math provides some concrete bridge between continuous and discrete. This kind of bridge is hard to find from daily life knowledge. This makes math beautiful and holy.
The space where continuous knowledge lives might have a symmetry described by a certain Lie group. Group theory offers a way to analyze this continuous knowledge by analyzing its Lie group. For example, the Lie group might have a countable number of generators, which gives a discrete way to analyze the continuous knowledge. We can also analyze the representations of the Lie group: decomposing a representation into irreducible representations makes the description more discrete.
',23),l={href:"https://arxiv.org/abs/2006.10503",target:"_blank",rel:"noopener noreferrer"},u=e("h3",{id:"topology",tabindex:"-1"},[e("a",{class:"header-anchor",href:"#topology","aria-hidden":"true"},"#"),t(" Topology")],-1),p=e("p",null,"Topology is an example where continuous entities can be unambiguously represented by discrete entities.",-1),m=e("p",null,"Condense matter physics loves topology very much. Part of the reason is the topological properties of condensed matter system nearly the only way to describe them in a discrete way.",-1);function g(f,y){const n=o("ExternalLinkIcon");return i(),s("div",null,[d,e("p",null,[t("Using the discrete knowledge found by group theory have been applied to neural network design. "),e("a",l,[t("Equivariant neural network"),r(n)]),t(" is one example.")]),u,p,m])}const b=a(h,[["render",g],["__file","4.1 Interface of continuous and discrete.html.vue"]]);export{b as default}; diff --git a/assets/4.2 Tyranny of science.html-7dafcbed.js b/assets/4.2 Tyranny of science.html-7dafcbed.js new file mode 100644 index 0000000..097f811 --- /dev/null +++ b/assets/4.2 Tyranny of science.html-7dafcbed.js @@ -0,0 +1 @@ +import{_ as c,o as i,c as n,a as e,b as t}from"./app-75398fc7.js";const a={},s=e("h1",{id:"tyranny-of-science",tabindex:"-1"},[e("a",{class:"header-anchor",href:"#tyranny-of-science","aria-hidden":"true"},"#"),t(" Tyranny of science")],-1),o=e("p",null,"See Nietzsche and Foucault.",-1),r=e("h2",{id:"scientific-discourse",tabindex:"-1"},[e("a",{class:"header-anchor",href:"#scientific-discourse","aria-hidden":"true"},"#"),t(" Scientific discourse")],-1),d=e("p",null,"Science is meant to be accurate, whose cost is ignore the rich context that real-life situation might have. 
By making science the absolute premium way to think, it put a coercive discretizer to human world, which leads to a deviation from the fact and many tragedies.",-1),h=[s,o,r,d];function _(l,f){return i(),n("div",null,h)}const m=c(a,[["render",_],["__file","4.2 Tyranny of science.html.vue"]]);export{m as default}; diff --git a/assets/4.2 Tyranny of science.html-c043c0d4.js b/assets/4.2 Tyranny of science.html-c043c0d4.js new file mode 100644 index 0000000..0f1d23c --- /dev/null +++ b/assets/4.2 Tyranny of science.html-c043c0d4.js @@ -0,0 +1 @@ +const e=JSON.parse('{"key":"v-98e96e0e","path":"/writings/4.2%20Tyranny%20of%20science.html","title":"Tyranny of science","lang":"en-US","frontmatter":{},"headers":[{"level":2,"title":"Scientific discourse","slug":"scientific-discourse","link":"#scientific-discourse","children":[]}],"git":{"updatedTime":1696289115000,"contributors":[{"name":"Zijian Zhang","email":"doomspec@outlook.com","commits":1}]},"filePathRelative":"writings/4.2 Tyranny of science.md"}');export{e as data}; diff --git a/assets/404.html-3485f7d0.js b/assets/404.html-3485f7d0.js new file mode 100644 index 0000000..6a19e56 --- /dev/null +++ b/assets/404.html-3485f7d0.js @@ -0,0 +1 @@ +import{_ as e,o as c,c as t}from"./app-75398fc7.js";const _={};function o(r,n){return c(),t("div")}const a=e(_,[["render",o],["__file","404.html.vue"]]);export{a as default}; diff --git a/assets/404.html-60b35caa.js b/assets/404.html-60b35caa.js new file mode 100644 index 0000000..7a25b17 --- /dev/null +++ b/assets/404.html-60b35caa.js @@ -0,0 +1 @@ +const t=JSON.parse('{"key":"v-3706649a","path":"/404.html","title":"","lang":"en-US","frontmatter":{"layout":"NotFound"},"headers":[],"git":{},"filePathRelative":null}');export{t as data}; diff --git a/assets/Comments on previous work.html-92af1631.js b/assets/Comments on previous work.html-92af1631.js new file mode 100644 index 0000000..ec35ccd --- /dev/null +++ b/assets/Comments on previous work.html-92af1631.js @@ -0,0 +1 @@ 
+import{_ as e,o as t,c as o,e as n}from"./app-75398fc7.js";const a={},s=n('My comments are mainly made on the following paper:
What is Understanding? An Overview of Recent Debates in Epistemology and Philosophy of Science: https://philpapers.org/rec/BAUWIU
My decisive question to philosophers: Do Pavlov's dog and Russell's chicken understand the bell and the coming of the farmer?
My answer: yes! They understand to the extent that they can draw a context. We should not require the context to be always true. Even humans believe in mythology.
My observation: Most of the philosophers do not agree with me according to the reference.
4.1 Understanding and the facts
My comment: factivity, to any extent, is not related to understanding. Understanding can easily be totally wrong. Mythology is a good example.
4.2.2. Grasping
My comment: It seems that the philosophers care about knowing the causality and think this might be a standard of understanding. In my view, causality is just a piece of knowledge. The fact that people usually form an implicit context with it does not make it a necessary part of understanding.
IDE recommendation: PyCharm
Docstring style: rst
Please use the DocInPy style for adding sections in the code.
git clone git@github.com:EvoEvolver/EvoNote.git
pip install -e .
Note: The node of knowledge. It only contains the knowledge itself.
Tree: The collection of references to Note objects. It contains the relationships among pieces of knowledge, including the parents, children and path in the tree structure. It also contains the indexings of the notes.
Indexing: The class for storing the data of one indexing. It can return related notes when provided with queries. It must be interpreted by the Indexer class.
Indexer: The factory of Indexing objects. Indexers are stateless; their state is stored in the Indexing object. An Indexer should never be instantiated, and all its methods should be static.
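The split between a stateless Indexer and a data-holding Indexing can be sketched as follows. This is a minimal illustration only: `KeywordIndexer`, its `build` and `query` methods, and the field names are hypothetical and are not EvoNote's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    """A node of knowledge; it holds only the content."""
    content: str


@dataclass
class Indexing:
    """Holds the data of one indexing; it is interpreted by an indexer."""
    notes: list
    data: dict = field(default_factory=dict)


class KeywordIndexer:
    """Hypothetical stateless indexer: every method is static."""

    @staticmethod
    def build(notes):
        # All state goes into the Indexing object, none into the indexer.
        indexing = Indexing(notes=list(notes))
        for i, note in enumerate(indexing.notes):
            for word in set(note.content.lower().split()):
                indexing.data.setdefault(word, []).append(i)
        return indexing

    @staticmethod
    def query(indexing, text):
        # Return the notes that share at least one word with the query.
        hits = []
        for word in text.lower().split():
            for i in indexing.data.get(word, []):
                if i not in hits:
                    hits.append(i)
        return [indexing.notes[i] for i in hits]
```

Because the indexer holds no state, the same `KeywordIndexer` can serve any number of `Indexing` objects, which is the property the description above asks for.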
CacheManager: The class for managing the cache of expensive tasks.
cache_manager: The instance of CacheManager for the whole program. You should import it whenever you want to read or write caches.
cache_manager.read_cache(self, input: any, type: str) -> Cache: input should be hashable. Cache storage and retrieval are realized by matching both input and type. See https://github.com/EvoEvolver/EvoNote/blob/main/evonote/search/fine_searcher.py for an example of usage.
When you want to discard a certain type of cache, you can use with cache_manager.refresh_cache(cache_type: str): to wrap the code that generates the cache. This will disable the cache of the type cache_type.
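The idea of matching a cache entry on both input and type, and of temporarily disabling one type of cache, can be sketched like this. `SimpleCacheManager`, its `read` and `write` methods, and `expensive_square` are hypothetical stand-ins written for illustration; only the name `refresh_cache` is taken from the text above, and the real CacheManager may behave differently.

```python
from contextlib import contextmanager


class SimpleCacheManager:
    """Hypothetical cache keyed by (type, input)."""

    def __init__(self):
        self._store = {}
        self._refreshing = set()

    def read(self, input, type):
        # While a type is being refreshed, pretend nothing is cached for it.
        if type in self._refreshing:
            return None
        return self._store.get((type, input))

    def write(self, input, type, value):
        self._store[(type, input)] = value

    @contextmanager
    def refresh_cache(self, cache_type):
        self._refreshing.add(cache_type)
        try:
            yield
        finally:
            self._refreshing.discard(cache_type)


def expensive_square(manager, x, calls):
    cached = manager.read(x, "square")
    if cached is not None:
        return cached
    calls.append(x)  # record that the expensive path actually ran
    result = x * x
    manager.write(x, "square", result)
    return result
```

Inside `with manager.refresh_cache("square"):` the function recomputes and overwrites the stored value; outside the block, reads hit the cache again.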
In the debug file, many useful functions for revealing the intermediate results are provided. You can use them to debug the program. For example:
from evonote.debug import display_chats
with display_chats():
    some_code()

All calls to chat completion will be displayed.
See https://github.com/EvoEvolver/EvoNote/blob/main/playground/debug.py for examples of usage.
search: The folder that contains the code for searching in the knowledge base.
builder: The folder that contains the code for building the knowledge base.
DocInPy is a standard for putting documentation directly in your Python code. It is proposed as an alternative to the current standard of putting documentation in a separate file. Though mixing documentation with code is not new, in DocInPy you can also add sections to your code, attach examples to functions and classes, and mark todos.
We believe that in this way we can provide much more context to new contributors who are not familiar with the codebase. It is also important for AI-based agents that need to understand the codebase and develop it.
EvoNote uses DocInPy to document its codebase. You can get a good visualization of EvoNote's codebase by running the EvoNote visualization.
Putting sections in DocInPy is as easy as putting sections in Markdown. You just need to put a # before your section title in a comment environment starting with """. For example,
"""
# Section 1
The following is a function.
"""


def foo():
    pass


"""
# Section 2
The following is another function.
"""


def bar():
    pass
In this way, foo() and bar() will have sections Section 1 and Section 2 respectively. The sections will also contain the comments under them.
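One way a tool could recover these sections is to walk the module with Python's `ast` module and remember the most recent heading seen in a bare string. The sketch below is only an assumption about how such parsing might work, not DocInPy's actual implementation; `sections_of` is a hypothetical name, and the sketch ignores heading levels and nested classes.

```python
import ast


def sections_of(source):
    """Map each top-level function or class name to the section title above it."""
    current = None
    result = {}
    for node in ast.parse(source).body:
        is_bare_string = (
            isinstance(node, ast.Expr)
            and isinstance(node.value, ast.Constant)
            and isinstance(node.value.value, str)
        )
        if is_bare_string:
            # The last heading line in the string becomes the current section.
            for line in node.value.value.strip().splitlines():
                if line.startswith("#"):
                    current = line.lstrip("#").strip()
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            result[node.name] = current
    return result
```

Applied to the example above, this maps `foo` to `Section 1` and `bar` to `Section 2`.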
Your sections can also contain classes and have multiple levels. For example,
"""
# Top Section
## Section 1
"""


class Foo:
    """
    # Section 1
    The following is a function.
    """

    def foo(self):
        pass

    """
    # Section 2
    The following is another function.
    """

    def bar(self):
        pass


"""
## Section 2
"""


def baz():
    pass
You can add a .tree.yml file to a folder to add sections in it. For example, in the following folder
a_folder
- __init__.py
- a.py
- b.py
- c.py
- .tree.yml
You can put a and b in a section by writing the following in .tree.yml:
sections:
  your section title:
    - a
    - b
default section: your default section title

Then a and b will be in the section your section title and c will be in the section your default section title.
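The assignment rule described above can be sketched as a small function. Here the configuration is assumed to be already parsed into a plain dictionary (for example by a YAML library); `assign_sections` is a hypothetical helper written for illustration, not part of DocInPy.

```python
def assign_sections(modules, config):
    """Return the section title for each module name in a folder."""
    default = config.get("default section")
    by_module = {}
    for title, members in config.get("sections", {}).items():
        for name in members:
            by_module[name] = title
    # Modules not listed in any section fall back to the default section.
    return {module: by_module.get(module, default) for module in modules}


config = {
    "sections": {"your section title": ["a", "b"]},
    "default section": "your default section title",
}
layout = assign_sections(["a", "b", "c"], config)
```

With this configuration, `layout` places `a` and `b` under the named section and `c` under the default one.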
You can also add examples to your functions and classes. Just use the @example decorator. For example,
from docinpy.decorator import example


@example
def how_to_use_foo():
    """
    # Example
    The following is an example of using `foo()`.
    """
    foo()
In a similar way, you can also mark todos in your code. Just use the @todo decorator. For example,
from docinpy.decorator import todo

@todo
def todo_foo():
    """
    # Todo
    The following is a todo.
    """
    foo()

@todo("This function is buggy. Fix it.")
def buggy_foo():
    foo(a=b)
DocInPy can be regarded as an effort toward the idea of literate programming. We think literate programming becomes even more important in the era of AI, because it provides more context for AI-based agents to understand the codebase.
All programming languages encourage programmers to put their code in a tree structure. For example, you can put your functions in different classes and different files, and put the files in different folders. However, it is still very common to put a lot of functions in a single file, in which the code is arranged in an almost flat structure.
DocInPy helps with this by providing a zero-cost way to add sections to your functions and classes. It takes another step towards a more tree-like structure of the codebase. We believe this will help programmers understand the codebase better. See Method of Loci for more details.
What do we mean by understanding? Usually, it is used as the opposite of just memorizing. When you memorize something, you just remember it. But when you understand something, a bit more magic happens:
This is quite difficult because the context usually doesn't match the knowledge exactly. For example, if the doctor says: "don't drink any water". You may think that you can drink juice, but you can't drink juice either. You can't drink anything.
This is even more difficult. The reason is two-fold:
The most important example of the use of indexing is the search engine. Search engines collect the keywords in documents and index them. When you search for a keyword, the search engine returns the documents that contain the keyword. This is the most basic form of indexing.
Search engines provide an efficient way to find a webpage with some keywords. However, you cannot imagine discovering gravity by searching "apple" in Google. This is because the search engine doesn't understand the relation between the keywords. It only knows that the keywords appear in the same document. This approach is already very powerful, but we obviously want more.
A way to analyze the relation and the context of the keywords is to use a knowledge graph. A knowledge graph is a graph that contains objects and the relations between them. For example, juice and water can be two nodes linked by an edge has ingredient. In this way, it might help in understanding whether you can drink juice when the doctor says "don't drink any water".
However, obviously, none of these ancient approaches has a chance to draw a relation between celestial bodies and gravity. Even after decades of development, they still struggle with understanding anything that is a little bit abstract. Certainly we need something new.
I will not go through the history of why LLMs work. But I believe everyone reading this article has some feeling that LLMs can understand abstract mathematical concepts. If you ask ChatGPT:
When I dress myself, I can put on my shirt and then my pants.
I can also put on my pants and then my shirt. It won't make a difference.

What mathematical concept is this?
ChatGPT will recognize that the order of the two actions doesn't matter, so it is related to commutativity. The answer won't change a lot if you modify the situation as long as it represents the concept of commutativity. It's quite hard to imagine how a search engine or a knowledge graph can do this. The word "commutativity" doesn't even appear in the question.
However, this good performance of LLMs comes at a cost. The most important limitation is that the size of an LLM's input is limited, and it is not at all on the same scale as knowledge graphs and traditional search engines. You have to decide what is the most important context that the LLM has to know before you ask the question. This, again, requires some understanding of the knowledge. An LLM helps you the most when you already have some understanding of the knowledge.
The good news is that LLMs help us not only by directly giving the answer. They also help us index existing knowledge. Notice that LLMs are built with deep learning technology, in which neural networks are used to process the knowledge. In the intermediate layers of the neural network, the knowledge is represented as vectors called embeddings.
These embeddings carry all the information about the input and have already been processed by the neural network for abstract understanding. Therefore, if two inputs to the LLM have similar embeddings, they are likely to be related, even in an abstract way. This is the key idea of embedding-based indexing.
Given a few pieces of knowledge, we can use an LLM to generate their embeddings as their index. Whenever a context is given, we generate the embedding of the context and find similar embeddings in the knowledge base. This lets the model gain essential knowledge before answering a question. Importantly, this embedding similarity-based indexing is totally scalable, meaning that you have the chance to index the knowledge of astronomy and gravity together!
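The retrieval step can be sketched with cosine similarity over stored vectors. The three-dimensional vectors below are made up by hand purely for illustration; in practice they would come from an embedding model, and `cosine` and `top_k` are hypothetical helper names.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k(query_vec, index, k=1):
    """Return the k stored texts whose embeddings are most similar to the query."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy index: (text, embedding) pairs with invented vectors.
index = [
    ("don't drink juice", [0.9, 0.1, 0.0]),
    ("planets orbit the sun", [0.0, 0.2, 0.9]),
]
# Pretend this is the embedding of "don't drink water".
nearest = top_k([1.0, 0.0, 0.1], index)
```

The brute-force scan here is linear in the size of the knowledge base; large collections would typically rely on an approximate nearest-neighbour index instead.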
Though there might still be a lot of steps before we let the model rediscover gravity, we have already seen the potential of LLMs in indexing. Importantly, we have found a good roadmap to solve the two problems we put forward at the beginning. For the first, with embedding similarity, we have a tool for finding the knowledge relevant to the context and retrieving it. "Don't drink water" will have a high similarity with "don't drink juice". For the second, with the abstract understanding ability of LLMs, we can extract the relation between two pieces of knowledge. It can discover that "don't drink water" actually means "don't drink any liquid".
With embedding-based search in hand, it seems that all that is left for us is simply to improve its performance. However, you will find it a surprisingly tricky task that involves much philosophical effort. Let's discuss it in another article.
The tree might be the most common traditional form of indexing adopted by human beings before the advent of computers. People have practiced organizing books in nested sections for thousands of years. It has two advantages:
This is the most obvious reason and this is why we call it indexing. However, there are more subtle reasons:
These reasons are important when the reader is not familiar with the knowledge. It makes it possible to understand the knowledge better without reading the whole book. It works by the following mechanism:
The reason why we underscore the importance of tree indexing is obviously not because we want to make a better book. Our question is whether LLM can benefit from tree indexing. The answer is obviously yes. The reason is from the following points:
Further, if we make the LLM into an agent who can actively travel on the book and add new content to the book.
Usually, in human-made books, the tree structure is quite coarse. One of the reasons might be this: an explicit fine-grained tree structure is hard to make and read. Though human writers might use a lot of lists and asides to make the tree structure actually more fine-grained, it is laborious to give a name to every small section. Human readers are also not willing to read a book with too many small sections.
However, the situation does not hold for LLM. This is because
With the discussion above, we know that
As we discussed in the first article, indexing is closely related to understanding. Surely, we can see that tree indexing can help organize and retrieve knowledge. However, could it really help LLMs understand abstract things like science?
Method of Loci, also known as memory palace, is a method to memorize things by associating them with a place. It is a very old method and has been used for thousands of years. It is also a very effective method.
When you try to memorize a list of things, you can just imagine a place you are familiar with and put the things in the place. When you want to recall the things, you can just imagine the place and the things will come to your mind.
Why is this method efficient? Here is the claim:
Claim
Method of Loci is efficient because it creates a graph of knowledge in which each node has only a limited number of edges. That is, it is a sparse graph.
Here, in the graph of knowledge, the nodes are the contexts (situations) and the edges lead to a memory or another situation. The whole point of the method of loci is to turn a list of things, which is densely indexed, into a sparsely connected structure.
Here is the claim:
Claim
A sparse graph performs better because it fits better in the context window of the human brain.
Thinking with a sparse graph limits the number of things you need to think about at one time. At the same time, because the knowledge is still interconnected, you can still think about the whole body of knowledge.
An LLM also has a limited number of tokens in its context window. Current technology still struggles to make the context window large. When it seems to be large, the performance is usually not good. (See Lost in the Middle: How Language Models Use Long Contexts)
Maybe it can be improved in the future, but I strongly doubt that will happen very fast. We can use the sparsity of the graph to decrease the number of things the LLM needs to think about at one time and enhance the performance.
EvoNote uses the tree structure to index the knowledge. It has a natural advantage to make the connection at each node (note) sparse. Compared to the approaches that use a flat list (e.g., chunks) or a dense graph (e.g., knowledge graph) to index the knowledge, it is more efficient.
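The effect of sparsity can be illustrated by walking a small tree and counting how many options are visible at each step. This is only a toy sketch: the nested dictionary, the `descend` function and the `choose` callback (which stands in for whatever picks the next branch, such as an LLM) are all hypothetical and are not EvoNote's data structures.

```python
def descend(tree, choose):
    """Walk from the root to a leaf, recording how many options were shown per step."""
    path, shown = [], []
    node = tree
    while isinstance(node, dict):
        options = list(node)
        shown.append(len(options))
        key = choose(options)
        path.append(key)
        node = node[key]
    return path, node, shown


tree = {
    "physics": {"gravity": "apples fall", "optics": "light bends"},
    "cooking": {"soup": "boil water"},
}
wanted = ["physics", "gravity"]
path, leaf, shown = descend(tree, lambda options: next(o for o in options if o in wanted))
```

Here the chooser never sees more than two options at a time, although the tree holds three leaves; the gap between the two numbers grows quickly as the tree gets deeper.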
DocInPy provides a way to add sections to your Python codes to separate the functions and classes for arranging them into a tree structure. It makes it possible to make the tree sparse.
A lot of Python projects put tons of functions in one file. This has long been a barrier for both humans and LLMs trying to understand the code. DocInPy can help solve this problem.
In the previous article, we found that indexing is deeply related to understanding. However, can we somehow give a definition of understanding? We mentioned in the first article that memorization is different from understanding. So the explicit knowledge must not be the understanding and understanding must be something other than the knowledge itself. Naturally, because the "other" thing must be related to the stand-alone knowledge, we can call it the implicit context of the knowledge. With this definition, I claim that
Claim
To understand a thing, you must know its implicit context.
and consequently,
Claim
The way to understand a thing, is the way to assign the implicit context to it.
Let's illustrate this with a few examples.
Example
Modern educated humans think they understand earthquakes and treat them as a result of the movement of tectonic plates. They think they understand because they can fit the phenomenon of earthquakes into their existing knowledge of geology and use it as a context.
Example
Ancient Japanese people thought they understood earthquakes and treated them as a result of the movement of a giant catfish supporting the Japanese islands. They thought they understood because they could fit the phenomenon of earthquakes into their existing knowledge of mythology and use it as a context.
In these examples, we show that different people have different understandings of the same thing. They assign different implicit contexts to one thing, and both strongly believe in theirs.
Here, I want to emphasize that we do not care whether the context they assign is correct. We only care about the fact that they assign different contexts. Importantly, the contexts they assign might both be correct and still be different.
Example
A person who believes in the geocentric model thinks he understands the movement of the planets because it fits perfectly into his existing knowledge of astronomy, which he uses as a context, even though in this context the planets move in a very complicated way.
Example
A person who believes in the heliocentric model thinks he understands the movement of the planets because it fits perfectly into his existing knowledge of astronomy, which he uses as a context. The context is different from the previous one, and the planets move in a very simple way.
As we introduced in the previous article, tree indexing can help assign a context to the knowledge. With a tree indexing, we can find existing knowledge that is similar to the incoming ones. With the help of the paths of the existing knowledge, a new path, namely a new context, can be created. This is the way tree indexing helps LLM to understand the knowledge. Specifically, the understanding can be carried out in the following way
Procedure
The generation of implicit context in tree indexing relies on the logic of paths. Therefore, we can say the logic of path generation can be used to characterize understandings. People can change the path logic to achieve a new understanding. For example: TODO.
LLMs can understand sentences. Where is the implicit context? My interpretation is that the layers of the LLM are responsible for adding these implicit contexts, including the grammar, the meaning of tokens and world knowledge. After being processed by the layers, the hidden state, which is the embedding of the sentence, contains the implicit context. Therefore, we can say that even the simplest embedding-based search provides a way to understand the knowledge, even when no tree structure is involved.
Surely, there are many criteria to classify knowledge. The important thing is how much insight we can get from the classification. In this article, I will introduce how to use the idea of continuous and discrete to classify knowledge.
Definition
Discrete knowledge is knowledge whose state is defined in a discrete space. Variation on it cannot be infinitesimal.
For example, a coin has two states: head and tail. The state of a coin is discrete knowledge.
More importantly, logical deductions operate on discrete knowledge. All systems that have a flavour of logic and a clear border between what is true and what is false, e.g., knowledge graphs and symbolic deduction, mainly operate on discrete knowledge.
Discrete knowledge is clear and easy to operate on with computers. It can ensure 100% correctness given correct assumptions. For fields that have concrete assumptions, e.g., mathematics, discrete knowledge and its deduction will suffice.
However, not all fields have concrete assumptions. In the long debate between rationalism and empiricism, people found that it is absolutely not easy to find reliable and non-trivial assumptions to reason from (see Kant and Hume). My explanation for this failure is that the world is too complex to be described by a few pieces of discrete knowledge. Even if there is a set of such discrete knowledge, it is not affordable to the human brain. For example, I admit that the world might be discrete if you look at it on a very small scale. However, the number of discrete states is too large for humans to make any useful deduction, except in cosmology or particle physics. Most useful knowledge does not change its essence when you vary it a little bit.
Definition
Continuous knowledge is knowledge whose state is defined in a continuous space. It allows infinitesimal variation.
For example, the probability that a coin will be head is continuous knowledge. The probability is a real number between 0 and 1.
More importantly, neural networks hold continuous knowledge. The state of a neural network is defined by the weights of the connections between neurons. The weights are real numbers, which is a continuous space.
It might be tricky to check whether a piece of knowledge is continuous or not. The key is to imagine whether the knowledge can have a very small variation and still remain mostly true. For example, when you try to recall a voice of someone, you can never ensure that your memory today is the same as your memory yesterday. It also works for smell, visual or kinetic memory.
Most importantly, though it also contains discrete knowledge like grammar, a large part of our knowledge about words is continuous. Your feeling about a certain word is continuous. The most obvious example is brands. You must have a certain feeling about Coca-Cola, Pepsi, Tesla and BMW; these feelings have no clear border of correctness, nor can you check that your feeling is stable.
The representation power of continuous knowledge is much stronger than that of discrete knowledge. It is very hard to imagine how to represent the feeling of skiing or the recollection of a picture in a discrete format.
Continuous knowledge is more natural for humans to process. Most physics theories also assume that space is continuous or that its discreteness is negligible for humans. The power of continuous knowledge is also demonstrated by the success of neural networks. There was a shift in the paradigm of artificial intelligence in the 1990s from discrete to continuous, followed by the triumph of neural networks in nearly all fields.
Admittedly, symbols in a language are discrete. However, they are meaningless without an interpreter. The development of natural language processing has witnessed the failure of all the discrete approaches to understanding natural language. History has shown that parsing sentences into syntax trees is hard and not as useful as using neural networks to process natural language directly.
A syntax tree can never represent the accurate meaning. For example, I can pose a question: "If apple means eat in the next sentence. 'Mike apple an apple.' What did Mike intake?" This question is easy for a human to answer but will break any natural language parser.
However, the intrinsic drawbacks of continuous knowledge are still there. Even in 2023, we still cannot handle math, logic and coding satisfactorily with neural networks. This is surely because of the discrete nature of these tasks. How to bridge continuous knowledge with discrete knowledge will be the main challenge of building AI.
Insight
EvoNote is trying to add more discrete structure to the continuous knowledge.
EvoNote uses a tree structure to organize natural language at a macro scale (recall the section: Tree indexing). This assigns the continuous knowledge a discrete structure (a tree), which we believe can help build continuous-discrete hybrid knowledge that makes AI capable of discrete tasks.
Story
If you remove one grain of sand from a heap of sand, it is still a heap. If you keep removing grains of sand, eventually you will have only one grain of sand left. Is it still a heap?
The key point of this paradox lies in the continuous nature of the concept heap. The concept heap is actually never well-defined when we use it in natural language. It also does not have a formal definition in any way. Your knowledge about the heap might vary infinitesimally.
Observation
For the same object, you might think it is a heap this second and not a heap the next second. However, you do not think your knowledge about the heap is changing even though you give different answers. This shows that the concept heap is continuous, because infinitesimal variation does not matter.
For any continuous concept that does not have a clear definition, you can always attack it in the following way:
Protocol
For example, you can attack the concept machine learning in the following way:
Example
machine learning.
machine learning.
machine learning.
Importantly, nearly all the big concepts are continuous, including philosophy, science, understanding, knowledge, freedom, democracy, etc. They are all vulnerable to this attack. Though they are useful concepts, people should keep in mind that they can never have a clear definition. The impossibility of this has been demonstrated by history.
Some knowledge is neither purely continuous nor purely discrete; it is a mixture of both. For example, though I claimed that neural networks mainly carry continuous knowledge, they also carry discrete knowledge in their network structure. The network structure is discrete and the weights are continuous.
The tree of knowledge in EvoNote is the same. The tree structure is discrete and the knowledge in the nodes is continuous because it is natural language.
Machine learning engineers do a funny job. They design the discrete structure of the neural network and train the continuous weights. In this way, they mix discrete and continuous knowledge together.
In this process, one interesting point is the roles of human and machine. The discrete structure is designed by humans and the continuous weights are trained by machines. Therefore, the machine only deals with continuous knowledge and the discrete knowledge is handled by humans. This matches the performance of the models as products: they are good at continuous tasks and bad at discrete tasks.
Neural Architecture Search (NAS) is a process that tries to automate the design of the discrete structure. However, it has lost heavily in the competition with the transformer structure. This shows again that machine learning is not good at discrete tasks.
Math provides some concrete bridges between the continuous and the discrete. This kind of bridge is hard to find in daily-life knowledge. This makes math beautiful and holy.
The space where the continuous knowledge lives might have a symmetry described by a certain Lie group. Group theory offers a way to analyze this continuous knowledge by analyzing its Lie group. For example, the Lie group might have a countable number of generators, which gives a discrete way to analyze the continuous knowledge. We can also analyze the representations of the Lie group, which become more discrete if we can decompose them into irreducible representations.
The discrete knowledge found by group theory has been applied to neural network design. Equivariant neural networks are one example.
Topology is an example where continuous entities can be unambiguously represented by discrete entities.
Condensed matter physics loves topology very much. Part of the reason is that the topological properties of a condensed matter system are nearly the only way to describe it in a discrete way.
See Nietzsche and Foucault.
Science is meant to be accurate, at the cost of ignoring the rich context that real-life situations might have. Making science the absolute premium way to think puts a coercive discretizer on the human world, which leads to a deviation from the facts and to many tragedies.
My comments are mainly made on the following paper:
What is Understanding? An Overview of Recent Debates in Epistemology and Philosophy of Science: https://philpapers.org/rec/BAUWIU
My decisive question to philosophers: Do Pavlov's dog and Russell's chicken understand the bell and the coming of the farmer?
My answer: yes! They understand to the extent that they can draw a context. We should not require the context to be always true. Even humans believe in mythology.
My observation: Most of the philosophers do not agree with me, according to the reference.
4.1 Understanding and the facts
My comment: factivity, to any extent, is not related to understanding. Understanding can easily be totally wrong. Mythology is a good example.
4.2.2. Grasping
My comment: It seems that the philosophers care about knowing the causality and think this might be a standard of understanding. In my view, causality is just a piece of knowledge. The fact that people usually form an implicit context with it does not make it a necessary part of understanding.