NVIDIA, Microsoft Create 'Megatron' Language Processor That Can 'Naturally' Complete Sentences, Answer Questions, And 'Reason'

NVIDIA has teamed up with Microsoft to create one of the world's largest and most powerful language processors.

(Image: An Nvidia logo shown on screen during a keynote by Nvidia founder and CEO Jen-Hsun Huang at CES 2017 in Las Vegas. Ethan Miller/Getty Images)

The machine, which they ominously nicknamed "Megatron," officially goes by the name Megatron-Turing Natural Language Generation model (MT-NLG), writes The Register.

According to a blog post on the NVIDIA Developer website, Megatron is designed to "demonstrate unmatched accuracy" in these natural language generation tasks:

  • Completion prediction (aka auto-completing sentences; see the quick sketch after this list)

  • Reading comprehension

  • Common sense reasoning

  • Word sense disambiguation

  • Natural language inferences
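
To give a concrete sense of the first task on that list, here is a minimal completion-prediction sketch using the Hugging Face transformers library. MT-NLG itself is not publicly downloadable, so the small GPT-2 model stands in purely for illustration; the prompt and settings below are assumptions, not anything NVIDIA or Microsoft have published.

    # Minimal completion-prediction sketch (GPT-2 as a stand-in for MT-NLG,
    # which is not publicly available). Requires: pip install transformers torch
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")

    prompt = "The fastest way to travel between two cities is"
    result = generator(prompt, max_new_tokens=20, num_return_sequences=1)

    # The model auto-completes the sentence from the prompt.
    print(result[0]["generated_text"])

A model like MT-NLG performs the same kind of next-word prediction, just with hundreds of billions of parameters behind it instead of GPT-2's roughly 124 million.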

At 530 billion total parameters, NVIDIA and Microsoft's project is one of the largest language models ever built, dwarfed only by Google's Switch Transformer demo, which boasts 1.6 trillion parameters.

This puts the MT-NLG in second place, though the third-place model, OpenAI's GPT-3, is not even close at only 175 billion parameters.
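
To see why those parameter counts demand such extreme hardware, a quick back-of-the-envelope estimate helps. Assuming 2 bytes per parameter (half-precision weights, an assumption for illustration rather than a published figure), the weights alone of each model would occupy:

    # Rough weight-memory estimate: parameters x 2 bytes (FP16).
    # Illustrative only; training needs far more memory once optimizer
    # states, gradients, and activations are included.
    models = {
        "Switch Transformer": 1.6e12,
        "MT-NLG": 530e9,
        "GPT-3": 175e9,
    }
    BYTES_PER_PARAM = 2  # assumed FP16 storage

    for name, params in models.items():
        weight_gb = params * BYTES_PER_PARAM / 1e9
        print(f"{name}: {params / 1e9:,.0f}B params -> ~{weight_gb:,.0f} GB of weights")

By that rough math, MT-NLG's weights alone would come to roughly a terabyte, far more than any single GPU can hold, which is exactly why the hardware described below matters.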

Beastly Hardware Requirements

To achieve this technical feat in the natural language processing (NLP) field, NVIDIA and Microsoft had to power Megatron with some insane hardware.

(Image: AI brain art. Getty Images)

Merely training the model required NVIDIA's Selene supercomputer, a machine composed of 560 DGX A100 servers, each containing eight A100 GPUs with 80GB of VRAM.

That's a total of 4,480 GPUs, all connected via NVLink and NVSwitch so they can communicate with one another. The Selene supercomputer also uses an array of AMD EPYC 7742 processors.

As a result, Megatron can actually handle these NLP tasks with barely any fine-tuning. The total cost for this kind of hardware is a cool $85 million.
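
The raw totals behind that setup are simple multiplication; the short sketch below just works out the figures quoted above (the aggregate VRAM number is derived here for illustration and is not a figure NVIDIA quotes).

    # Multiply out the Selene figures quoted above.
    servers = 560           # DGX A100 servers
    gpus_per_server = 8     # A100 GPUs per server
    vram_per_gpu_gb = 80    # GB of VRAM per A100

    total_gpus = servers * gpus_per_server               # 4,480 GPUs
    total_vram_tb = total_gpus * vram_per_gpu_gb / 1000  # ~358 TB combined

    print(f"{total_gpus:,} GPUs with roughly {total_vram_tb:,.0f} TB of combined VRAM")

In other words, the cluster has on the order of 358 TB of GPU memory to spread the model and its training state across.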

NLP Challenges

Like every other type of AI tech, however, NLP still faces a lot of challenges.

The biggest one is simply how many languages there are in the world. There are an estimated 6,500 languages in total, and even some of the most widely spoken ones, such as Arabic, Spanish, Portuguese, and Hindi, are still proving troublesome for NLP, according to the NVIDIA Developer website.

It just goes to show that even with the most powerful hardware available, current AI technology still requires years of work to understand the intricacies of human languages, which is an extremely tall order on its own.

What's The Purpose Of NVIDIA-Microsoft's Megatron?

As mentioned earlier, the main goal of the MT-NLG is to perform natural language processing (NLP) tasks.

For the uninitiated, this simply means that NVIDIA and Microsoft's plan for Megatron is to help machines better understand human language and all of its intricacies, according to MonkeyLearn.


The technical implications of this project are massive. For one, enabling AI to understand every nuance of human language (perhaps even slang) would help with things such as accurate translation and spell checking.

This could mean that, in the future, services like Google Translate will be able to understand and accurately translate anybody on the fly, which would be extremely useful for travelers trying to overcome language barriers.

This article is owned by Tech Times

Written by RJ Pierce

ⓒ 2024 TECHTIMES.com All rights reserved. Do not reproduce without permission.