• 0 Posts
  • 25 Comments
Joined 1 year ago
cake
Cake day: June 12th, 2023

help-circle

  • There may be a need for additional information, there just isn’t any in these responses. Using a basic JSON schema like the Problem Details RFC provides a standard way to add that information if necessary. Error codes are also often too general to have an application specific meaning. For example, is a “400 bad request” response caused by a malformed payload, a syntactically valid but semantically invalid payload, or what? Hence you put some data in the response body.


















  • Your first two paragraphs seem to rail against a philosophical conclusion made by the authors by virtue of carrying out the Turing test. Something like “this is evidence of machine consciousness” for example. I don’t really get the impression that any such claim was made, or that more education in epistemology would have changed anything.

    In a world where GPT4 exists, the question of whether one person can be fooled by one chatbot in one conversation is long since uninteresting. The question of whether specific models can achieve statistically significant success is maybe a bit more compelling, not because it’s some kind of breakthrough but because it makes a generalized claim.

    Re: your edit, Turing explicitly puts forth the imitation game scenario as a practicable proxy for the question of machine intelligence, “can machines think?”. He directly argues that this scenario is indeed a reasonable proxy for that question. His argument, as he admits, is not a strongly held conviction or rigorous argument, but “recitations tending to produce belief,” insofar as they are hard to rebut, or their rebuttals tend to be flawed. The whole paper was to poke at the apparent differences between (a futuristic) machine intelligence and human intelligence. In this way, the Turing test is indeed a measure of intelligence. It’s not to say that a machine passing the test is somehow in possession of a human-like mind or has reached a significant milestone of intelligence.

    https://academic.oup.com/mind/article/LIX/236/433/986238


  • I don’t think the methodology is the issue with this one. 500 people can absolutely be a legitimate sample size. Under basic assumptions about the sample being representative and the effect size being sufficiently large you do not need more than a couple hundred participants to make statistically significant observations. 54% being close to 50% doesn’t mean the result is inconclusive. With an ideal sample it means people couldn’t reliably differentiate the human from the bot, which is presumably what the researchers believed is of interest.