[HN Gopher] Remembering Doug Lenat and his quest to capture the ...
___________________________________________________________________
 
Remembering Doug Lenat and his quest to capture the world with
logic
 
Author : andyjohnson0
Score  : 131 points
Date   : 2023-09-06 09:23 UTC (10 hours ago)
 
web link (writings.stephenwolfram.com)
w3m dump (writings.stephenwolfram.com)
 
| ChaitanyaSai wrote:
| Great read. Surprised to read Wolfram never actually got to use
| CYC. Anyone here who has and can talk about its capabilities?
 
  | nvm0n2 wrote:
  | I played with OpenCyc once. It was quite hard to use because
  | you had to learn things like CycL and I couldn't get their
  | natural language processing module to work.
  | 
  | The knowledge base was impressively huge but it also took a lot
  | of work to learn because at the lower levels it was extremely
  | abstract. A lot of the assertions in the KB were establishing
  | very low level stuff that only made sense if you were really
  | into abstract logic or philosophy.
  | 
  | They made bold claims on their website for what it could do,
  | but I could never reproduce them. There was supposedly a more
  | advanced version called ResearchCyc though, which I didn't have
  | access to.
 
    | creer wrote:
    | That was exactly my reaction to it: it seemed to require
    | sooooo much background knowledge about the entire system to do
    | anything. And because you were warned about issues with
    | consistency, it seemed you were effectively being told to just
    | fudge some things - a quick way to an application that couldn't
    | work. The learning curve seemed daunting.
 
  | gumby wrote:
  | Some of us who worked on Cyc commented in an earlier post about
  | Doug's passing.
 
  | lispm wrote:
  | Wolfram is able to write it in such a way that somehow it is
  | mostly about him. :-(
  | 
  | There is some overlap between Cyc and his Alpha. Cyc was
  | supposed to provide a lot of common sense knowledge, which
  | would be reusable. When Expert Systems were a thing, one of the
  | limiting factors was said to be the limited amount of broader
  | knowledge of the world - knowledge a human learns by experience,
  | interacting with the world. This would involve a lot of facts
  | about the world and also all kinds of exceptions (example: a
  | mother is typically older than her child, unless the child was
  | adopted and the mother is younger). Cyc knows a lot of 'facts'
  | and also many forms of logical reasoning plus many logical
  | 'reasoning rules'.
  | 
  | Wolfram Alpha has a lot of knowledge about facts, often in some
  | form of maths or somewhat structured data.
 
    | dang wrote:
    | Ok, but let's avoid doing the mirror image thing where we
    | make the thread about Wolfram doing that.
    | 
    | https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que.
    | ..
 
      | lispm wrote:
      | Well, it's a disappointing and shallow read, because the
      | topic of the usefulness of combining Cyc and Alpha would
      | have been interesting.
 
  | stakhanov wrote:
  | I briefly looked into it many moons ago when I was a Ph.D.
  | student working in the area of computational semantics in
  | 2006-10. This was already well past the heyday of CYC though.
  | 
  | The first stumbling block was that CYC wasn't openly available.
  | Their research group was very insular, and they were very
  | protective of their IP, hoping to pay for their work through
  | licensing deals and industry or academic collaborations that
  | could funnel money their way.
  | 
  | They had a subset called "OpenCYC" though, which they released
  | more publicly in the hope of drawing more attention. I tried
  | using that, but soon got frustrated with the software. The
  | representation was in a CYC-specific language called "CycL" and
  | the inference engine was CYC-specific as well and based on a
  | weird description logic specifically invented for CYC. So you
  | couldn't just hook up a first-order theorem prover or anything
  | like that. And "description logic" is a polite term for what
  | their software did. It seemed mostly designed as a workaround
  | to the fact that open-ended inferencing of the kind they spoke
  | of to motivate their work would have depended way too
  | frequently on factoids of common sense knowledge that were
  | missing from the knowledge base. I got frustrated with that
  | software very quickly and eventually gave up.
  | 
  | This was a period of AI-winter, and people doing AI were very
  | afraid to even use the term "AI" to describe what they were
  | doing. People were instead saying they were doing "pattern
  | processing with images" or "audio signal processing" or
  | "natural language processing" or "automated theorem proving" or
  | whatever. Any mention of "AI" made you look naive. But Lenat's
  | group called their stuff "AI" and stuck to their guns, even at
  | a time when that seemed a bit politically inept.
  | 
  | From what I gathered through hearsay, CYC were also doing
  | things like taking a grant from the defense department, and
  | suddenly a major proportion of the facts in the ontology were
  | about military helicopters. But they still kept beating the
  | drum about how they were codifying "common sense" knowledge,
  | and, if only they could get enough "common sense" knowledge in
  | there, they would break through a resistance level at some
  | point, where they could have the AI program itself, i.e. use
  | the existing facts to derive more facts by reading and
  | understanding plain text.
 
    | zozbot234 wrote:
    | Doesn't description logic mostly boil down to multi-modal
    | logic, which ought to be representable as a fragment of FOL
    | (w/ quantifiers ranging over "possible worlds")?
    | 
    | Description logic isn't just found in Cyc, either; Semantic
    | Web standards are based on it, for similar reasons - it's key
    | to making general inference computationally tractable.
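    | 
    | As a worked illustration (standard textbook translation, nothing
    | Cyc-specific, names are generic): the description-logic axiom
    | C \sqsubseteq \exists R.D ("every C is related by R to some D")
    | corresponds to the first-order sentence
    | 
    |     \forall x \, ( C(x) \rightarrow \exists y \, ( R(x,y) \land D(y) ) )
    | 
    | with the role R playing the part of the modal accessibility
    | relation, so \exists R.D behaves like a modal diamond applied to D.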
 
      | stakhanov wrote:
      | I'm not trying to be dismissive of description logics. (And
      | I'm not dismissive of Lenat and his work, either). A lot of
      | things can fall under that umbrella term. The history of
      | description logic may in fact be just as old as post-
      | syllogism first-order predicate calculus (the syllogism is,
      | of course, far older, dating back to Aristotle). In the
      | Principia Mathematica there's a quantifier that basically
      | means "the", which is incidentally also the most common
      | word in the English language, and that can be thought of as
      | a description logic too. But the perspective of a
      | Mathematician on this is very different from that of an AI
      | systems "practitioner", and CYC seemed to belong more to
      | the latter tradition.
 
    | MichaelZuo wrote:
    | That's fascinating to read, thanks for sharing.
    | 
    | Did it ever do something genuinely surprising? That seemed
    | beyond the state-of-the-art at the time?
 
      | stakhanov wrote:
      | One of the people from Cyc gave a talk at the research
      | group I was in once and mentioned an idea that kind of
      | stuck with me.
      | 
      | ...sorry, it takes some building-up to this: At the time, a
      | lot of work in NLP was focused on building parsers that
      | were trying to draw constituency trees from sentences, or
      | extract syntactic dependency structures, but do so in a way
      | that completely abstracted away from semantics, or looked
      | at semantics as an extension of syntax, but not venturing
      | into the territory of inference and common sense. So, a
      | sentence like "Green ideas sleep furiously" (to borrow from
      | Chomsky's example), was just as good as a research object
      | to someone doing that kind of research as a sentence that
      | actually makes sense and is comprised of words of the same
      | lexical categories, like "Absolute power corrupts
      | absolutely". -- I suspect, that line of research is still
      | going strong, so the past tense may not be quite
      | appropriate here. I'm using it, because I have been so out
      | of the loop since leaving academia.
      | 
      | The major problem these folk are facing is an exploding
      | combinatorial space of ambiguity at the grammatical level
      | ("I saw a man with a telescope" can be bracketed "I saw (a
      | man) with a telescope" or "I saw a (man with a telescope)")
      | and the semantic level ("Every man loves a woman" can mean
      | "For every man M there exists a woman W, such that M loves
      | W" or it can mean "There exists a woman W, such that for
      | every man M it is true that M loves W"). Even if you could
      | completely solve the parsing problem, the ambiguity problem
      | would remain.
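      | 
      | (In first-order notation the two readings above are, roughly,
      | \forall m \, \exists w \, loves(m, w) versus
      | \exists w \, \forall m \, loves(m, w); only the quantifier
      | order differs.)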
      | 
      | Now this guy from the Cyc group said: Forget about parsing.
      | If you give me the words that are in the sentence and
      | you're not even giving me any clue about how the words were
      | used in the sentence, I can already look into my ontology
      | and tell you how the ontology would be most likely to
      | connect the words.
      | 
      | Now, the sentence "The cat chased the dog" obviously means
      | something different from "The dog chased the cat" despite
      | using the same words. But in most text genres, you're
      | likely to only encounter sentences that are saying things
      | that are commonly held as true. So if you have an ontology
      | that tells you what's commonly held as true, that gives you
      | a statistical prior that enables you to understand
      | language. In fact, you probably can't hope to understand
      | language without it, and it's probably the key to
      | "disambiguation".
      | 
      | This thought kind of flipped my worldview upside down. I
      | had always kind of thought of it as this "pipelined
      | architecture" where you first need to parse the text,
      | before it even makes sense to think about how to solve the
      | problems of what to do with the output from that parser.
      | But that was unnecessarily limiting. You can look at the
      | problem as a joint-decoding problem, and it may very well
      | be the case that the lion's share of entropy comes from
      | elsewhere, and it may be foolish to go around trying to
      | build parsers, if you haven't yet hooked up your system to
      | the information source that provides the lion's share of
      | entropy, namely common-sense knowledge.
      | 
      | Now, I don't think that Cyc had gotten particularly close
      | to solving that problem either, and, in fact, it was a bit
      | uncharacteristic for a "Cycler" to talk about statistical
      | priors at all, as their work hadn't even gotten into the
      | territory of collecting those kinds of statistics. But, as
      | a theoretical point, I thought it was very valid.
 
| jmj wrote:
| I'm working on old fashioned A.I. for my PhD. I wrote Doug a few
| times; he was very kind and offered very good advice. I was
| hoping to work with him one day.
| 
| I'll miss you Doug.
 
  | nairboon wrote:
  | What are you working on?
 
| dekhn wrote:
| I recommend reading Norvig's thinking about the various cultures.
| 
| https://static.googleusercontent.com/media/research.google.c...
| and https://norvig.com/chomsky.html
| 
| In short, Norvig concludes there are several conceptual
| approaches to ML/AI/Stats/Scientific analysis. One is "top down":
| teach the system some high level principles that correspond to
| known general concepts, and the other is "bottom up": determine
| the structure from the data itself and use that to generate
| general concepts. He observes that while the former is attractive
| to many, the latter has continuously produced more and better
| results with less effort.
| 
| I've seen this play out over and over. I've concluded that Norvig
| is right: empirically based probabilistic models are a cheaper,
| faster way to solve important engineering and scientific
| problems, even if they are possibly less satisfying
| intellectually. Cheap approximations are often far better than
| hard-to-find analytic solutions.
 
  | golol wrote:
  | this is the same concept as the bitter lesson, am I correct? I
  | don't see a substantial difference yet.
 
    | dekhn wrote:
    | I hadn't read that before, but yes. Sutton focuses mostly on
    | "large amounts of compute" whereas I think his own employer
    | has demonstrated that it's a combination of large amount of
    | compute, large amounts of data, and really clever
    | probabilistic algorithms, in combination, which really
    | demonstrate the utility of the bitter lesson.
    | 
    | And speaking as a biologist for a moment: minds are
    | irredeemably complex, and attempting to understand them with
    | linear, first-order rules and logic is unlikely to be
    | fruitful.
 
  | jyscao wrote:
  | > One is "top down": teach the system some high level
  | principles that correspond to known general concepts, and the
  | other is "bottom up": determine the structure from the data
  | itself and use that to generate general concepts.
  | 
  | This is the same pattern explaining why bottom-up economic
  | systems, i.e. laissez-faire free markets, flawed as they are,
  | work better than top-down systems like central planning.
 
| richardjam73 wrote:
| I have that issue of Scientific American somewhere; I didn't know
| Stephen had an article in it too. I'll have a reread of it.
 
| kensai wrote:
| This is very fascinating. Is there a review somewhere of Cyc's
| abilities compared to other systems?
 
  | rjsw wrote:
  | Maybe read a bit about AM and Eurisko first; that will give an
  | idea of how Cyc was expected to be used.
 
    | cabalamat wrote:
    | My understanding of AM and Eurisko (having looked into them a
    | decade or so ago) was that their source code hadn't been
    | published, and that there was a dispute as to what their
    | capabilities actually were and how much was exaggeration by
    | Lenat.
    | 
    | I don't know if that's still the case. I do think that it
    | would be worth creating systems that mix the ANN and GOFAI
    | approaches to AI.
 
| HarHarVeryFunny wrote:
| I missed the news of Doug Lenat's passing. He died a few days ago
| on August 31st.
| 
| I'm old enough to have lived thru the hope but ultimate failure
| of Lenat's baby CYC. The CYC project was initiated in 1984, in
| the heyday of expert systems which had been successful in many
| domains. The idea of an expert system was to capture the
| knowledge and reasoning power of a subject matter expert in a
| system of declarative logic and rules.
| 
| CYC was going to be the ultimate expert system that captured
| human common sense knowledge about the world via a MASSIVE
| knowledge/rule set (initially estimated as a 1000 man-year
| project) of how everyday objects behaved. The hope was that
| through sheer scale and completeness it would be able to reason
| about the world in the same way as a human who had gained the
| same knowledge thru embodiment and interaction.
| 
| The CYC project continued for decades with a massive team of
| people encoding rules according to its own complex ontology, but
| ultimately never met its goals. In retrospect it seems the idea
| was doomed to failure from the beginning, but nonetheless it was
| an important project that needed to be tried. The problem with
| any expert system reasoning over a fixed knowledge set is that
| it's always going to be "brittle" - it may perform well for cases
| wholly within what it knows about, but then fail when asked to
| reason about things where common sense knowledge and associated
| extrapolation of behavior is required; CYC was hoping to avoid
| this via scale to be so complete that there were no important
| knowledge gaps.
| 
| I have to wonder if LLM-based "AI's" like GPT-4 aren't in some
| ways very similar to CYC in that they are ultimately also giant
| expert systems, but with the twist that they learnt their
| knowledge, rules and representations/reasoning mechanisms from a
| training set rather than it having to be laboriously hand
| entered. The end result is much the same though - an ultimately
| brittle system whose Achilles' heel is that it is based on a
| fixed set of knowledge rather than being able to learn from its
| own mistakes and interact with the domain it is attempting to
| gain knowledge over. It seems there's a similar hope to CYC of
| scaling these LLMs up to the point that they know everything and
| the brittleness disappears, but I suspect that ultimately that
| will prove a false hope and real AIs will need to learn through
| experimentation just as we do.
| 
| RIP Doug Lenat. A pioneer of the computer age and of artificial
| intelligence.
 
  | zozbot234 wrote:
  | > I missed the news of Doug Lenat's passing. He died a few days
  | ago on August 31st.
  | 
  | Discussed https://news.ycombinator.com/item?id=37354000 (172
  | comments)
 
    | HarHarVeryFunny wrote:
    | Thanks!
 
  | golol wrote:
  | Imo LLMs are absolutely the CYC dream come true. Common sense
  | rules are learned from the data instead of hand written.
 
  | detourdog wrote:
  | I understand what you are saying. I'm able to see that
  | brittleness as a feature. The brittleness must be expressed so
  | that the user of the model understands the limits and why the
  | brittleness exists.
  | 
  | My thinking is that the next generation of computing will rely
  | on the human bridging that brittleness gap.
 
    | zozbot234 wrote:
    | The thing about "expert systems" is that they're just
    | glorified database query. (And yes, you can also do
    | 'semantic' inference in a DB simply by adding some views.
    | It's not generally done because it's quite computationally
    | expensive even for very simple taxonomy structures, i.e. 'A
    | implies B which implies C and foo is A, hence foo is C'.)
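    | 
    | A minimal sketch of that kind of taxonomy inference (toy code,
    | nothing database-specific):
    | 
    |     # 'A implies B implies C, and foo is A, hence foo is C'.
    |     IMPLIES = {"A": {"B"}, "B": {"C"}}   # direct implications
    |     FACTS = {("foo", "A")}               # asserted memberships
    | 
    |     def categories_of(item):
    |         """All categories reachable from the asserted ones."""
    |         frontier = {cat for (i, cat) in FACTS if i == item}
    |         seen = set()
    |         while frontier:
    |             cat = frontier.pop()
    |             if cat not in seen:
    |                 seen.add(cat)
    |                 frontier |= IMPLIES.get(cat, set())
    |         return seen
    | 
    |     print(categories_of("foo"))   # {'A', 'B', 'C'}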
    | 
    | Database query is of course ubiquitous, but not generally
    | thought of as 'AI'.
 
  | brundolf wrote:
  | > The CYC project continued for decades with a massive team of
  | people encoding rules according to its own complex ontology,
  | but ultimately never met its goals
  | 
  | It's still going! I agree it's become clear that it probably
  | isn't the road to AGI, but it still employs people who are
  | still encoding rules and making the inference engine faster,
  | paying the bills mostly by doing contracts for companies that
  | want someone to make sense of their data warehouses.
 
    | Taikonerd wrote:
    | It is? Are there success stories of companies using Cyc?
    | 
    | I always had the impression that Cycorp was sustained by
    | government funding (especially military) -- and that,
    | frankly, it was always premised more on what such software
    | _could theoretically_ do, rather than what it actually did.
 
      | brundolf wrote:
      | They did primarily government contracts for a long time,
      | but when I was there (2016-2020) it was all private
      | contracts
      | 
      | The contracts at the time were mostly skunkworks/internal
      | to the client companies, so not usually highly publicized.
      | A couple examples are mentioned on their website:
      | https://cyc.com/
 
  | nvm0n2 wrote:
  | Cyc was ahead of its time in a couple of ways:
  | 
  | 1. Recognizing that AI was a scale problem.
  | 
  | 2. Understanding that common sense was the core problem to
  | solve.
  | 
  | Although you say Cyc couldn't do common sense reasoning, wasn't
  | that actually a major feature they liked to advertise? IIRC a
  | lot of Cyc demos were various forms of common sense reasoning.
  | 
  | I once played around with OpenCyc back when that was a thing.
  | It was interesting because they'd had to solve a lot of
  | problems that smaller more theoretical systems never did. One
  | of their core features is called microtheories. The idea of a
  | knowledge base is that it's internally consistent so that
  | formal logic can be performed on it, but real-world knowledge
  | isn't like that. Microtheories let you encode contradictory
  | knowledge about the world in such a way that it can layer on
  | top of the more consistent foundation.
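  | 
  | A toy illustration of that layering (names and structure invented;
  | this is not Cyc's actual machinery):
  | 
  |     # Each microtheory may contradict its parent without making
  |     # the knowledge base as a whole inconsistent.
  |     MTS = {
  |         "BaseMt":    {"parent": None,     "facts": {("birds", "can fly"): True}},
  |         "PenguinMt": {"parent": "BaseMt", "facts": {("birds", "can fly"): False}},
  |     }
  | 
  |     def ask(mt, subject, predicate):
  |         """Look up a fact, falling back to parent microtheories."""
  |         while mt is not None:
  |             facts = MTS[mt]["facts"]
  |             if (subject, predicate) in facts:
  |                 return facts[(subject, predicate)]
  |             mt = MTS[mt]["parent"]
  |         return None
  | 
  |     print(ask("BaseMt", "birds", "can fly"))      # True
  |     print(ask("PenguinMt", "birds", "can fly"))   # False, locally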
  | 
  | A very major and fundamental problem with the Cyc approach was
  | that the core algorithms don't scale well to large sizes.
  | Microtheories were also a way to constrain the computational
  | complexity. LLMs work partly because people found ways to make
  | them scale using GPUs. There's no equivalent for Cyc's
  | predicate logic algorithms.
 
    | HarHarVeryFunny wrote:
    | > IIRC a lot of Cyc demos were various forms of common sense
    | reasoning.
    | 
    | I never got to try it myself, but no doubt it worked fine in
    | those cases where correct inferences could be made based on
    | the knowledge/rules it had! Similarly GPT-4 is extremely
    | impressive when it's not bullshitting!
    | 
    | The brittleness in either case (CYC or LLMs) comes mainly
    | from incomplete knowledge (unknown unknowns), causing an
    | invalid inference which the system has no way to detect and
    | correct. The fix is a closed loop system where incorrect
    | outputs (predictions) are detected - prompting exploration
    | and learning.
    | 
    | I don't know if CYC tried to do it, but one potential speed
    | up for a system of that nature might be chunking, which is a
    | strategy that another GOFAI system, SOAR, used successfully.
    | A bit like using memoization (remembering results of work
    | already done) as a way to optimize dynamic programming
    | solutions.
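    | 
    | (As a loose illustration of the memoization analogy only - not of
    | how SOAR's chunking actually worked:)
    | 
    |     # Cached ancestor test: repeated queries reuse earlier work.
    |     from functools import lru_cache
    | 
    |     PARENTS = {"poodle": ("dog",), "dog": ("mammal",),
    |                "mammal": ("animal",)}
    | 
    |     @lru_cache(maxsize=None)
    |     def is_a(kind, ancestor):
    |         if kind == ancestor:
    |             return True
    |         return any(is_a(p, ancestor) for p in PARENTS.get(kind, ()))
    | 
    |     print(is_a("poodle", "animal"))   # True; sub-results cached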
 
  | TimPC wrote:
  | This failure when asked about cases not wholly within what it
  | knows about is a problem with lots of AI, not just expert
  | systems. Neural nets mostly do awfully on problems outside
  | their training data, assuming they can even generate an answer
  | at all, which isn't always possible. If you train a neural net
  | to order drinks from Starbucks and one of its orders fails
  | with the server telling it "We are out of Soy Milk", chances
  | are quite high its subsequent order will also contain Soy Milk.
 
  | wpietri wrote:
  | > The end result is much the same though - an ultimately brittle
  | system whose Achilles' heel is that it is based on a fixed set
  | of knowledge
  | 
  | I think CYC is a great cautionary tale for LLMs in terms of
  | hope vs reality, but I think it's worse than that. I don't
  | think LLMs have knowledge; they just mimic the ways we're used
  | to expressing knowledge.
 
| alexpotato wrote:
| This old Google EDU talk was the first time I heard of Doug
| Lenat.
| 
| Sad to hear:
| 
| a. of his passing
| 
| b. that CYC didn't eventually meet its goals
| 
| https://www.youtube.com/watch?v=KTy601uiMcY
 
| specialist wrote:
| Just perfect. So glad I read this. Thanks for sharing.
| 
| > _In many ways the great quest of Doug Lenat's life was an
| attempt to follow on directly from the work of Aristotle and
| Leibniz._
| 
| Such a wonderful, respectful retrospective of Lenat's ideas and
| work.
| 
| > _I think Doug viewed CYC as some kind of formalized
| idealization of how he imagined human minds work: providing a
| framework into which a large collection of (fairly
| undifferentiated) knowledge about the world could be "poured". At
| some level it was a very "pure AI" concept: set up a generic
| brain-like thing, then "it'll just do the rest". But Doug still
| felt that the thing had to operate according to logic, and that
| what was fed into it also had to consist of knowledge packaged up
| in the form of logic._
| 
| I've always wanted CYC, or something like it, to be correct. Like
| somehow it'd fulfill my need for the universe to be knowable,
| legible. If human reason & logic could be encoded, then maybe
| things could start to make sense, if only we try hard enough.
| 
| Alas.
| 
| Back when SemanticWeb was the hotness, I was a firm ontology
| partisan. After working on customers' use cases, and given enough
| time to work thru the stages of grief, I grudgingly accepted the
| folksonomy worldview is probably true.
| 
| Since then, of course, the "fuzzy" strategies have prevailed.
| (Also, most of us have accepted humans aren't rational.)
| 
| To this day, statistics based approaches make me uncomfortable,
| perhaps even anxious. My pragmatism motivated holistic worldview
| is always running up against my reductionist impulses. Paradox in
| a nutshell.
| 
| Enough about me.
| 
| > _Doug's starting points were AI and logic, mine were ...
| computation writ large._
| 
| I do appreciate Wolfram placing their respective theories in the
| pantheon. It's a nice reminder of their lineages. So great.
| 
| I agree with Wolfram that encoding heuristics was an experiment
| that had to be done. Negative results are super important. I'm
| so, so glad Lenat (and crews) tried so hard.
| 
| And I hope the future holds some kind of synthesis of these
| strategies.
 
  | cabalamat wrote:
  | > Negative results are super important.
  | 
  | I agree, and this is often overlooked. Knowing what doesn't
  | work (and why) is a massive help in searching for what does
  | work.
 
  | zozbot234 wrote:
  | > And I hope the future holds some kind of synthesis of these
  | strategies.
  | 
  | My guess is that by June 19, 2024 we'll be able to take 3596.6
  | megabytes of descriptive text about President Abraham Lincoln
  | and do something cool with it.
 
    | specialist wrote:
    | Heh.
    | 
    | I was more hoping OpenAI would incorporate inference engines
    | to cure ChatGPT's "hallucinations". Such that it'd "know" bad
    | sex isn't better than good sex, despite the logic.
    | 
    | PS- I haven't actually asked ChatGPT. I'm just repeating a
    | cliche about the limits of logic wrt the real world.
 
  | patrec wrote:
  | > I agree with Wolfram that encoding heuristics was an
  | experiment that had to be done. Negative results are super
  | important. I'm so, so glad Lenat (and crews) tried so hard.
  | 
  | The problem is that Doug Lenat trying very hard is only useful
  | as a data point if you have some faith in Doug Lenat making
  | something that _is_ reasonably workable work by trying very
  | hard.
  | 
  | Do you have a reason for thinking so? I'm genuinely curious:
  | lots of people have positive reminiscences about Lenat, who
  | seems to have been likeable and smart, but in my (admittedly
  | somewhat shallow) attempts I always keep drawing blanks when
  | looking for anything of substance he produced or some deeper
  | insight he had (even before Cyc).
 
    | creer wrote:
    | I also feel it's great and useful that Lenat and crew tried
    | so hard. There is no doubt that a ton of work went into cyc.
    | It was a serious, well funded, long term project and
    | competent people put effort in making it work. And there are
    | some descriptions of how they went about it. And opencyc was
    | released.
    | 
    | But some projects - or at least the breakthroughs they
    | produce - are extensively written up as papers, which can be
    | studied by outsiders. And that is not the case with cyc.
    | There are some reports and papers but really not many that I
    | have found. And so it's not clear how solid or generalizable
    | it is as a data point.
 
    | mhewett wrote:
    | Lenat was my assigned advisor when I started my Masters at
    | Stanford. I met with him once and he gave me some advice on
    | classes. After that he was extremely difficult to schedule a
    | meeting with (for any student, not just me). He didn't get
    | tenure and left to join MCC after that year. I don't think I
    | ever talked to him again after the first meeting.
    | 
    | He was extremely smart, charismatic, and a bit arrogant (but
    | a well-founded arrogance). From other comments it sounds like
    | he was pleasant to young people at Cycorp. I think his peers
    | found him more annoying.
    | 
    | His great accomplishments were having a multi-decade vision
    | of how to build an AI and actually keeping the vision alive
    | for so long. You have to be charismatic and convincing to do
    | that.
    | 
    | In the mid-80s I took his thesis and tried to implement AM on
    | a more modern framework, but the thesis lacked so many
    | details about how it worked that I was unable to even get
    | started implementing anything.
    | 
    | BTW, if there are any historians out there I have a copy of
    | Lenat's thesis with some extra pages including emailed
    | messages from his thesis advisors (Minsky, McCarthy, et al)
    | commenting on his work. I also have a number of AI papers
    | from the early 1980s that might not be generally available.
 
      | mietek wrote:
      | I'd be quite interested to see these materials.
      | 
      | What's your take on AM and EURISKO? Do you think they
      | actually performed as mythologized? Do you think there's
      | any hope of recovering or reimplementing them?
 
      | eschaton wrote:
      | It'd be amazing to get those papers and letters digitized.
 
  | skissane wrote:
  | > And I hope the future holds some kind of synthesis of these
  | strategies.
  | 
  | Recently I've been involved in discussions about using an LLM
  | to generate JSON according to a schema, as in OpenAI's function
  | calling or Jsonformer. LLMs do okay at generating code in
  | mainstream languages like SQL or Python, but what if you have
  | some proprietary query language? Maybe have a JSON schema for
  | the AST, have the LLM generate JSON conforming to that schema,
  | then serialise the JSON to the proprietary query language
  | syntax?
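  | 
  | A rough sketch of that flow (the schema, AST shape and target
  | syntax here are all invented; assume the JSON came back from the
  | LLM and was validated against the schema):
  | 
  |     # Serialise a tiny filter-expression AST to a made-up query
  |     # language. Illustrative only.
  |     import json
  | 
  |     def to_query(node):
  |         kind = node["type"]
  |         if kind == "and":
  |             return "(" + " AND ".join(to_query(c) for c in node["args"]) + ")"
  |         if kind == "cmp":
  |             return f'{node["field"]} {node["op"]} {json.dumps(node["value"])}'
  |         raise ValueError(f"unknown node type: {kind}")
  | 
  |     ast = {"type": "and", "args": [
  |         {"type": "cmp", "field": "age", "op": ">", "value": 30},
  |         {"type": "cmp", "field": "city", "op": "=", "value": "Oslo"},
  |     ]}
  |     print(to_query(ast))   # (age > 30 AND city = "Oslo")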
  | 
  | And it makes me think - what if one used an LLM to generate or
  | evaluate assertions in a Cyc-style ontology language? That
  | might be a bridge between the logic/ontology approach and the
  | statistical/neural approach.
 
    | jebarker wrote:
    | This is similar to what people are trying for mathematical
    | theorem proving. Using LLMs to generate theorems that can be
    | validated in Lean.
 
    | nvm0n2 wrote:
    | I had the same idea last year. But it's difficult. To encode
    | knowledge in CycL required intensive training, mostly in how
    | their KB encoded very abstract concepts and "obvious"
    | knowledge. They used to boast about how they had more
    | philosophy PhDs than anywhere else.
    | 
    | It's possible that an LLM that's been trained on enough
    | examples, and that's smart enough, could actually do this.
    | But I'm not sure how you'd review the output to know if it's
    | right. The LLM doesn't have to be much faster than you to
    | overwhelm your capacity to review the results.
 
    | theptip wrote:
    | This might work; you can view it as distilling the common
    | knowledge out of the LLM.
    | 
    | You'd need to provide enough examples of CycL for it to learn
    | the syntax.
    | 
    | But in my experience LLMs are not great at authoring code
    | with no ground truth to test against. So the LLM might
    | hallucinate some piece of common knowledge, and it could be
    | hard to detect.
    | 
    | But at the highest level, this sounds exactly like how the
    | WolframAlpha ChatGPT plug-in works; the LLM knows how to call
    | the plugin and can use this to generate graphs or compute
    | numerical functions for domains where it cannot compute the
    | result directly.
 
  | at_a_remove wrote:
  | I really do believe (believe, rather than know) that some sort
  | of synthesis is necessary - that there are some base facts and
  | common sense that would make AI, as it stands, more reliable
  | and trustworthy if it had some kind of touchstone, rather than
  | the slipshod "human hands come with thumbs and fingers" output
  | we have now. Something that can look back and say, "Typically,
  | there's just one thumb and there are four fingers. Sometimes
  | not, but that is rare."
 
| ansible wrote:
| Symbolic knowledge representation and reasoning is a quite
| interesting field. I think the design choices of projects like
| wikidata.org and CYC severely limit the application of this
| though.
| 
| For example, on the wikidata help page, they talk about the
| height of Mount Everest:
| 
| https://www.wikidata.org/wiki/Help:About_data#Structuring_da...
| 
|     Earth (Q2) (item) - highest point (P610) (property) -
|     Mount Everest (Q513) (value)
| 
| and
| 
|     Mount Everest (Q513) (item) - instance of (P31) (property) -
|     mountain (Q8502) (value)
| 
| So that's all fine, but it misses a lot of context. These facts
| might be true for the real world, right now, but they won't
| always be true. Even in the not-so-distant past, the height of
| Everest was lower, because of tectonic plate movement. And maybe
| in the future it will go even higher due to tectonics, or maybe
| it will go lower due to erosion.
| 
| Context awareness gets even more important when talking about
| facts like "the iPhone is the best selling phone", for example.
| That might be true right now, but it certainly wasn't true back
| in 2006, before the phone was released.
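| 
| One toy way to attach that kind of context to a statement
| (illustrative only, not Wikidata's actual data model; the years
| are placeholders):
| 
|     # A triple plus a validity interval and a context label.
|     from dataclasses import dataclass
|     from typing import Optional
| 
|     @dataclass
|     class Statement:
|         subject: str
|         predicate: str
|         value: str
|         valid_from: Optional[int] = None   # year
|         valid_to: Optional[int] = None
|         context: str = "real world"
| 
|     fact = Statement("iPhone", "best selling phone", "true",
|                      valid_from=2012)      # placeholder year
| 
|     def holds(f, year, context):
|         return (f.context == context
|                 and (f.valid_from is None or year >= f.valid_from)
|                 and (f.valid_to is None or year <= f.valid_to))
| 
|     print(holds(fact, 2006, "real world"))   # False: not yet true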
| 
| Context also comes in many forms, which can be necessary for
| useful reasoning. For example, consider the question: "What would
| be the highest mountain in the world, if someone blew up the peak
| of Everest with a bomb?" This question isn't about the real
| world, right here and right now, it is about a hypothetical world
| that doesn't exist.
| 
| Going a little further afield, you may want to ask a question
| like "Who is the best captain of the Enterprise?". This might be
| about the actual US Navy CVN-64 ship named "Enterprise", the
| planned CVN-80, or the older ship CV-6 Enterprise which fought in
| WW2. Or maybe a relevant context to the question was "Star Trek",
| and we're in one of several fictional worlds instead, which would
| result in a completely different set of facts.
| 
| I think some ability to deal with uncertainty (as with
| Probabilistic Graphical Models) is also necessary to deal with
| practical applications of this technology. We may be dealing with
| a mix of "objective facts" (well, let's not get into a discussion
| about the philosophy of science) and other facts that we may not
| be so certain about.
| 
| It seems to me that a successful symbolic reasoning system will be
| very, very large and complex. I'm not at all sure even how such
| knowledge should be represented, never mind the issue of trying
| to capture it all in digital form.
 
| dang wrote:
| Lenat's post about Wolfram Alpha, mentioned in the OP, was
| discussed (a bit) at the time:
| 
|  _Doug Lenat - I was positively impressed with Wolfram Alpha_ -
| https://news.ycombinator.com/item?id=510579 - March 2009 (17
| comments)
| 
| And of course, recent and related:
| 
|  _Doug Lenat has died_ -
| https://news.ycombinator.com/item?id=37354000 - Sept 2023 (170
| comments)
 
| ks2048 wrote:
| I wonder if CYC would have had more success if it was open and
| collaborative. WikiData seems like a successful cousin. I know
| the goals are quite different - wikidata doesn't really store
| "common sense" knowledge, but it seems any rule-based AI system
| would probably want to use wikidata as a database of facts.
 
  | zozbot234 wrote:
  | > wikidata doesn't really store "common sense" knowledge
  | 
  | They're actively working on this, with the goal of ultimately
  | building a language-independent representation[0] of ordinary
  | encyclopedic text. Much like a machine translation
  | interlanguage, but something that would be mostly authored by
  | humans, not auto-generated from existing natural-language text.
  | See https://meta.wikimedia.org/wiki/Abstract_Wikipedia for more
  | information.
  | 
  | [0] Of course, there are some very well-known pitfalls to this
  | general idea: what's the true, canonical language-independent
  | representation of _nimium saepe valedixit_? So this should
  | probably be understood as _mostly_ language-independent, enough
  | to be practically useful.
 
  | brundolf wrote:
  | If I recall, Cyc did exactly that (imported data from WikiData)
  | 
  | Unfortunately there was much more to it than ingesting large
  | volumes of structured entities
 
  | creer wrote:
  | I looked into it years ago and adding to, say, opencyc, really
  | did not seem simple. There was a lot of detail in the entity
  | descriptions. Even reading them seemed to require an awful lot
  | of background knowledge of the system.
  | 
  | It may have been possible to at least add lots of parallel
  | items / instances. For example more authors and books and music
  | works and performers, etc. Anyone here built a system around
  | opencyc? Or cyc?
 
___________________________________________________________________
(page generated 2023-09-06 20:00 UTC)