New Google policy instructs Gemini’s fact-checkers to act outside their expertise

December 19, 2024

Summary

  • Google employs contract research agencies to evaluate Gemini response accuracy.
  • GlobalLogic contractors evaluating Gemini prompts are no longer allowed to skip individual interactions based on lack of expertise.
  • Concerns exist over Google’s reliance on fact-checkers without relevant knowledge, potentially impacting AI development goals.




Google DeepMind, the team responsible for developing and maintaining the conglomerate’s AI models, employs various techniques to evaluate and improve Gemini’s output. One such method, the recently announced FACTS Grounding benchmark for Gemini 2.0, leverages responses from other advanced LLMs to determine whether Gemini’s answers actually relate to a question, answer the question, and answer it correctly.
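The LLM-as-judge pattern behind benchmarks like FACTS Grounding can be sketched in a few lines: several judge models each vote on whether a response is grounded in the source material, and the votes are aggregated. This is a minimal illustration only, not Google's implementation; the judge below is a trivial word-overlap stub standing in for real LLM API calls, and all names are invented for the example.

```python
def stub_judge(question: str, context: str, response: str) -> bool:
    """Toy stand-in for an LLM judge: accept the response only if it
    shares at least two words with the grounding context."""
    resp_words = set(response.lower().split())
    ctx_words = set(context.lower().split())
    return len(resp_words & ctx_words) >= 2


def grounded_score(question, context, response, judges):
    """Fraction of judges that consider the response grounded."""
    votes = [judge(question, context, response) for judge in judges]
    return sum(votes) / len(votes)


if __name__ == "__main__":
    ctx = "The Eiffel Tower is in Paris and was completed in 1889."
    q = "When was the Eiffel Tower completed?"
    print(grounded_score(q, ctx, "It was completed in 1889 in Paris.", [stub_judge]))
    print(grounded_score(q, ctx, "Approximately 1920, somewhere else.", [stub_judge]))
```

In a real setup each element of `judges` would wrap a different frontier model, which is what makes the aggregate score more robust than any single evaluator.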

Another method calls on human contractors from Hitachi-owned GlobalLogic to evaluate Gemini prompt responses and rate them for correctness. Until recently, contractors could skip individual prompts that fell significantly outside their areas of expertise. Now, Google has mandated that contractors can no longer skip prompts, forcing them to judge accuracy in subjects they might know nothing about, as first reported by TechCrunch.




Hands-on LLM error-checking gone awry

Are fact-checkers in over their heads?

[Image: AI-generated graffiti on a black brick wall reading “this image is ai generated,” complete with errors. Source: Google Gemini]

Previously, GlobalLogic contractors could skip individual prompts they weren’t comfortable answering due to a lack of background knowledge, with guidelines stating, “If you do not have critical expertise (e.g. coding, math) to rate this prompt, please skip this task.” According to sources who remain anonymous due to non-disclosure agreements, the new directive handed down from Google states, “You should not skip prompts that require specialized domain knowledge.”


Accompanying the new policy is an instruction to “rate the parts of the prompt you understand” and note that the prompt falls outside the reviewer’s knowledge base. The option to skip certain prompts due to lack of relevant expertise has been eliminated; contractors may now only bypass individual interactions when the prompt or response is missing, or when they contain harmful content the contractor isn’t authorized to evaluate.


What we know about GlobalLogic AI evaluation

A considerable, fluctuating number of open positions related to AI fact-checking exist on employment platforms like Upwork and Indeed, paying anywhere from $14 per hour and up to evaluate AI performance. Various recruiters have reached out to job seekers, apparently on behalf of GlobalLogic, in search of workers to fill potential contract-to-hire positions.


Many social media users describe the company’s opaque interview process and lengthy, “stressful” onboarding, while confirming Google as the GlobalLogic client. Some users claiming to currently work on the project have corroborated these difficulties, as well as a starting pay of around $21 per hour and the uncommon, but real, possibility of direct hire.


What low-expertise fact-checking means for Gemini

Maybe nothing, and possibly nothing good

[Image: Glitchy AI-generated image of a woman wearing white clothes on a sunny beach in an unnatural pose. Source: Adobe Firefly]


Predictably, contract, workflow, and data application details remain tightly locked down. Employing real people to evaluate individual prompt responses seems a logical choice. Complex recruiting and hiring processes, unclear client needs and guidelines during onboarding, and inconsistent management techniques have always surrounded large-scale, outsourced contracting jobs. Nothing there raises unexpected red flags, and current (claimed) GlobalLogic contractors note that many of its workers hold advanced technical degrees.

The worry stems from Google’s apparent shift away from allowing admittedly uninformed evaluators to bypass questions they can’t answer. If a note indicating lack of expertise accompanies a contractor’s evaluation, Google could theoretically disregard the evaluation and return the interaction to the pool for re-inspection. We have no way of knowing at present how Google treats this data.



How does non-expert error-checking advance Google’s AI goals?

The obvious concern is that the new directive implies Google’s decreasing reliance on educated experts, or even confident, self-aware autodidacts. TechCrunch, which first received the leaked guidelines, quoted one contractor: “I thought the point of skipping was to increase accuracy by giving it to someone better.”

Perhaps Google is simply streamlining its data collection process and fully intends to discard, ignore, or clarify potentially inaccurate evaluations. Or maybe it has decided that fact-checking Gemini and developing it further for accuracy and hallucination reduction don’t necessarily require relevant background expertise when evaluating whether an LLM’s answers make any sense.


