This Week in AI: Let's not forget the humble data annotator | TechCrunch


Keeping up with an industry as fast-moving as AI is a tall order. So until an AI can do it for you, here's a handy roundup of recent stories in the world of machine learning, along with notable research and experiments we didn't cover on their own.

This week in AI, I want to draw attention to labeling and annotation startups, startups like Scale AI, which is reportedly in talks to raise new funding at a $13 billion valuation. Labeling and annotation platforms may not get the attention that flashy new generative AI models like OpenAI's Sora do. But they're essential. Without them, modern AI models arguably wouldn't exist.

Many AI models must be trained on labeled data. Why? Labels, or tags, help the models understand and interpret data during the training process. For example, labels to train an image recognition model might take the form of markings around objects ("bounding boxes") or captions referring to each person, place, or object depicted in an image.
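To make that concrete, here's a toy sketch of what one labeled training record can look like. The field names loosely mimic common annotation schemas (such as the COCO format), but the exact structure here is illustrative, not any particular platform's schema:

```python
# A toy annotation record for one image: bounding boxes plus a caption.
# Field names are hypothetical, loosely modeled on COCO-style datasets.
annotation = {
    "image_id": "img_0001",
    "labels": [
        {
            "category": "person",
            # Bounding box as [x, y, width, height] in pixels
            "bbox": [34, 50, 120, 240],
        },
        {
            "category": "dog",
            "bbox": [200, 180, 90, 60],
        },
    ],
    # A free-text caption is another common label type
    "caption": "A person walking a dog in a park.",
}

# A training pipeline consumes records like this to pair inputs with targets
categories = [label["category"] for label in annotation["labels"]]
print(categories)  # ['person', 'dog']
```

Every one of those boxes and captions was, at some point, drawn or typed by a human annotator.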

The accuracy and quality of the labels significantly affect the performance, and the reliability, of the trained models. And annotation is an enormous undertaking, requiring thousands to millions of labels for the larger and more sophisticated data sets in use.

So you'd think data annotators would be treated well, paid living wages and given the same benefits that the engineers who build the models enjoy. But often, the opposite is true; atrocious working conditions are the norm at many annotation and labeling startups.

Companies with billions in the bank, like OpenAI, have relied on annotators in developing countries paid only a few dollars per hour. Some of these annotators are exposed to highly disturbing content, like graphic imagery, yet aren't given time off (as they're usually contractors) or access to mental health resources.

An excellent piece in NY Mag pulls back the curtains on Scale AI in particular, which recruits annotators in countries as far-flung as Nairobi, Kenya. Some tasks on Scale AI take labelers multiple eight-hour workdays, no breaks, and pay as little as $10. And these workers are beholden to the whims of the platform. Annotators sometimes go long stretches without receiving work, or they're unceremoniously booted off Scale AI, as happened recently to contractors in Thailand, Vietnam, Poland, and Pakistan.

Some annotation and labeling platforms claim to offer "fair-trade" work, and have made it a core part of their branding. But as MIT Tech Review's Kate Kaye notes, there are no regulations, only weak industry standards for what constitutes ethical labeling work, and companies' own definitions vary widely.

So, what to do? Barring a massive technological breakthrough, the need to annotate and label data for AI training isn't going away. We can hope that the platforms self-regulate, but the more realistic solution seems to be policymaking. That in itself is a tricky prospect, but it's the best shot we have, I'd argue, at changing things for the better. Or at least starting to.

Here are some other noteworthy AI stories from the past few days:

    • OpenAI built a voice cloner: OpenAI is previewing a new AI-powered tool it developed, Voice Engine, which lets users clone a voice from a 15-second recording of someone speaking. But the company is choosing not to release it widely (yet), citing risks of misuse and abuse.
    • Amazon doubles down on Anthropic: Amazon has invested another $2.75 billion in growing AI power Anthropic, following through on the option it left open last September.
    • Google.org launches an accelerator: Google.org, Google's charitable arm, is launching a new $20 million, six-month program to help fund nonprofits developing tech that leverages generative AI.
    • A new model architecture: AI startup AI21 Labs has released a generative AI model, Jamba, that employs a novel, new(ish) model architecture, state space models (SSMs), to improve efficiency.
    • Databricks launches DBRX: In other model news, Databricks this week released DBRX, a generative AI model akin to OpenAI's GPT series and Google's Gemini. The company claims it achieves state-of-the-art results on a number of popular AI benchmarks, including several measuring reasoning.
    • Uber Eats and UK AI regulation: An Uber Eats courier's fight against AI bias shows, Natasha writes, that justice under the UK's AI regulations is hard won.
    • EU election security guidance: The European Union published draft election security guidelines Tuesday aimed at the roughly two dozen platforms regulated under the Digital Services Act, including guidelines pertaining to preventing content recommendation algorithms from spreading generative AI-based disinformation (aka political deepfakes).
    • Grok gets upgraded: X's Grok chatbot will soon get an upgraded underlying model, Grok-1.5, and all Premium subscribers on X will gain access to Grok. (Grok was previously exclusive to X Premium+ customers.)
    • Adobe expands Firefly: This week, Adobe unveiled Firefly Services, a set of more than 20 new generative and creative APIs, tools and services. It also launched Custom Models, which allows businesses to fine-tune Firefly models based on their assets, as a part of Adobe's new GenStudio suite.

More machine learning

How's the weather? AI is increasingly able to tell you. I noted a few efforts at hourly, weekly, and century-scale forecasting a few months ago, but like all things AI, the field is moving fast. The teams behind MetNet-3 and GraphCast have published a paper describing a new system called SEEDS, for Scalable Ensemble Envelope Diffusion Sampler.

An animation showing how more samples create a smoother distribution of weather forecasts.

SEEDS uses diffusion to generate "ensembles" of plausible weather outcomes for an area based on the input (radar readings or orbital imagery, perhaps), much faster than physics-based models can. With bigger ensemble counts, they can cover more edge cases (like an event that occurs in only 1 out of 100 possible scenarios) and be more confident about more likely situations.
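The ensemble-count intuition can be sketched in a few lines. This toy stand-in has nothing to do with SEEDS' actual diffusion machinery; it just shows why a small ensemble routinely misses a 1-in-100 event while a large one pins down its frequency:

```python
import random

random.seed(0)

def ensemble_rain_forecast(n_members: int, p_extreme: float = 0.01) -> float:
    """Toy stand-in for an ensemble forecaster: each 'member' independently
    lands on an extreme-rain scenario with true probability p_extreme.
    Returns the fraction of members showing the event."""
    hits = sum(random.random() < p_extreme for _ in range(n_members))
    return hits / n_members

# A 10-member ensemble usually misses a 1-in-100 event entirely;
# a 100,000-member ensemble estimates its frequency close to 1%.
small = ensemble_rain_forecast(10)
large = ensemble_rain_forecast(100_000)
print(small, round(large, 3))
```

The point of generating ensembles cheaply, as SEEDS aims to, is exactly this: more members means better coverage of the tails.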

Fujitsu is hoping to better understand the natural world by applying AI image-handling techniques to underwater imagery and lidar data collected by underwater autonomous vehicles. Improving the quality of the imagery lets other, less sophisticated processes (like 3D conversion) work better on the target data.

Image Credits: Fujitsu

The idea is to build a “digital twin” of waters that can help simulate and predict new developments. We're a long way from that, but you have to start somewhere.

Over in LLM land, researchers have found that these models may simulate intelligence via a simpler-than-expected method: linear functions. Frankly, the math is beyond me (vector stuff in many dimensions), but this writeup at MIT makes it pretty clear that the recall mechanism of these models is pretty... basic.

"Even though these models are really complicated, nonlinear functions that are trained on lots of data and are very hard to understand, there are sometimes really simple mechanisms working inside them. This is one instance of that," said co-lead author Evan Hernandez. If you're more technically minded, check out the paper here.
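To illustrate the idea, here's a toy sketch of "relation decoding as a linear function." Everything here is synthetic (random vectors standing in for hidden states, an invented "capital-of" relation); the point is just that if a relation inside a model really behaves affinely, a least-squares fit on a few (subject, object) embedding pairs recovers a map that reproduces it:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: pretend these are hidden-state embeddings for subject entities
d = 8
subjects = {name: rng.normal(size=d) for name in ["France", "Japan", "Peru"]}

# Suppose the model's "capital-of" relation acts (approximately) affinely:
# object_embedding ≈ W @ subject_embedding + b
W = rng.normal(size=(d, d))
b = rng.normal(size=d)
capitals = {name: W @ vec + b for name, vec in subjects.items()}

# Fit an affine map from the (subject, object) pairs with least squares,
# loosely mimicking how linear relation decoders are estimated.
X = np.stack([np.append(vec, 1.0) for vec in subjects.values()])  # bias column
Y = np.stack(list(capitals.values()))
coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
W_hat, b_hat = coef[:-1].T, coef[-1]

# The fitted map reproduces the relation on the observed pairs
pred = W_hat @ subjects["France"] + b_hat
print(np.allclose(pred, capitals["France"]))  # True
```

A simple mechanism like this is exactly the kind of thing the MIT work reports finding inside otherwise very nonlinear models.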

One way these models can fail is by not understanding context or feedback. Even a really capable LLM might not "get it" if you tell it your name is pronounced a certain way, since it doesn't really know or understand anything. In cases where that matters, like human-robot interactions, it could put people off if the robot behaves that way.

Disney Research has been looking into automated character interactions for a long time, and this name pronunciation and reuse paper just appeared a little while back. It seems obvious, but extracting the phonemes when someone introduces themselves and encoding those rather than just the written name is a smart approach.
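The gist of the approach can be sketched in a few lines. This is a made-up mini-example (the phoneme symbols and profile store are hypothetical, not Disney's pipeline): capture the pronunciation once at introduction time, then prefer it over any spelling-based guess:

```python
# Toy sketch: store how a user actually pronounced their name as a phoneme
# sequence (e.g. taken from a speech recognizer's phonetic output), instead
# of relying only on the written form. All data here is hypothetical.
profile: dict[str, list[str]] = {}

def register_name(written: str, phonemes: list[str]) -> None:
    """Capture the phonemes heard when the user introduces themselves."""
    profile[written] = phonemes

# "Siobhan" is famously not pronounced the way it is spelled
register_name("Siobhan", ["SH", "IH", "V", "AO", "N"])

def pronounce(written: str) -> list[str]:
    # Prefer the captured pronunciation; fall back to a naive
    # letter-by-letter guess for names we have not heard spoken.
    return profile.get(written, list(written.upper()))

print(pronounce("Siobhan"))  # ['SH', 'IH', 'V', 'AO', 'N']
```

Reusing the heard phonemes later (say, in speech synthesis) is what keeps the character from mangling the name on every subsequent turn.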

Image Credits: Disney Research

Finally, as AI and search overlap more and more, it's worth reassessing how these tools are being used and whether there are any new risks presented by this unholy union. Safiya Umoja Noble has been an important voice in AI and search ethics for years, and her opinion is always enlightening. She did a nice interview with the UCLA news team about how her work has evolved and why we should remain wary of bias and bad habits in search.

