- June 2008


CUDA, Supercomputing for the Masses: Part 5
Dr. Dobb’s Portal
The local and global memory spaces are not cached, which means each access to global (or local) memory generates an explicit memory transaction. So what does it cost to access (read or write, for example) each of the different memory types?
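The cost gap the excerpt alludes to is why CUDA kernels commonly stage reused data into on-chip shared memory, paying the uncached global-memory price only once per element. A minimal sketch (the kernel and array names are illustrative, not from the article):

```cuda
#define TILE 256  // assumed block size for this sketch

// Each thread performs one global read into on-chip shared memory,
// then all subsequent reuse of the data is served from the fast tile.
__global__ void average_with_neighbor(const float *in, float *out, int n)
{
    __shared__ float tile[TILE];           // one element per thread in the block
    int i = blockIdx.x * blockDim.x + threadIdx.x;

    if (i < n)
        tile[threadIdx.x] = in[i];         // single uncached global read
    __syncthreads();                       // make the tile visible block-wide

    if (i < n) {
        // Reuse shared memory instead of re-reading global memory.
        float right = (threadIdx.x + 1 < TILE && i + 1 < n)
                      ? tile[threadIdx.x + 1]
                      : tile[threadIdx.x];
        out[i] = 0.5f * (tile[threadIdx.x] + right);
    }
}
```

On G80/GT200-era hardware an uncached global access cost on the order of hundreds of clock cycles, while a conflict-free shared-memory access was roughly register speed, which is the trade-off the Dr. Dobb's series goes on to quantify.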

Tesla 10 & CUDA 2.0: Technical Analysis & Performance
Beyond 3D
CUDA was announced along with G80 in November 2006, released as a public beta in February 2007, and then finally hit the Version 1.0 milestone in June 2007 along with the launch of the G80-based Tesla solutions for the HPC market. Today, we look at the next stage in the CUDA/Tesla journey: GT200-based solutions, CUDA 2.0, and the overall state of NVIDIA's HPC business.

More Details on Elemental's GPU Accelerated H.264 Encoder
Elemental's software, if it truly performs the way seen here, has the potential to be a disruptive force in both the GPU and CPU industries. On the GPU side it would give NVIDIA hardware a significant advantage over AMD's GPUs, and on the CPU side it would upset the balance between NVIDIA and Intel. Video encoding has historically been an area where Intel's CPUs have done very well, but if the fastest video encoder ends up being an NVIDIA GPU, video encoding performance would become microprocessor agnostic -- you'd just need a good NVIDIA GPU.

Stanford releases beta Nvidia folding client
The Tech Report
“At last, Stanford University has released a beta version of the GPU2 Folding@home client for Nvidia graphics cards. You can grab the client from this post on the official FAH forums, although Stanford's Adam Beberg suggests users closely read the FAQ page to familiarize themselves with the software first.”

“3D card manufacturers shouldn't take this the wrong way, but it takes a lot to make us crawl out of the communal Eurogamer bed (yes, all the Eurogamer writers share a single large bed - we do it for frugality and communality, which remain our watchwords) and go to a hardware presentation. There's a nagging fear someone may talk maths at us and we'd come home clutching the local equivalent of magic beans. And then we'll be laughed at by our fellow writers and made to sleep in the chilly end where the covers are thin and Tom left dubious stains. That's no fun at all.”

NVIDIA's CUDA: The End of the CPU?
Tom's Hardware
CUDA is not a gimmick intended for researchers who want to cajole their university into buying them a GeForce. CUDA is genuinely usable by any programmer who knows C, provided he or she is ready to make a small investment of time and effort to adapt to this new programming paradigm. That effort won’t be wasted provided your algorithms lend themselves to parallelization.
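The "small investment" the article describes is visible in how little a complete CUDA program differs from plain C: a function qualifier, thread-index arithmetic, and the launch syntax. A hypothetical minimal example in the era's CUDA style:

```cuda
#include <stdio.h>

// An ordinary C function body; __global__ and the thread-index
// arithmetic are the main CUDA-specific additions.
__global__ void add(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        c[i] = a[i] + b[i];
}

int main(void)
{
    const int n = 1024;
    size_t bytes = n * sizeof(float);
    float ha[1024], hb[1024], hc[1024];
    for (int i = 0; i < n; i++) { ha[i] = (float)i; hb[i] = 2.0f * i; }

    float *da, *db, *dc;
    cudaMalloc((void **)&da, bytes);
    cudaMalloc((void **)&db, bytes);
    cudaMalloc((void **)&dc, bytes);
    cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

    add<<<n / 256, 256>>>(da, db, dc, n);   // 4 blocks of 256 threads

    cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);
    printf("%f\n", hc[10]);                 // prints 30.000000
    cudaFree(da); cudaFree(db); cudaFree(dc);
    return 0;
}
```

The parallelization caveat in the article is the real cost: the kernel above is trivially data-parallel, and algorithms that are not take far more restructuring than this.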

OptiTex to Use NVIDIA's CUDA Technology
"OptiTex' software is an ideal fit for NVIDIA as it leverages the combined personalities of our CUDA enabled GPUs - rich graphics and data intensive computation," said Andy Keane general manager of the GPU Computing business at NVIDIA. "OptiTex' software will deliver new levels of creative freedom for designers."

NVIDIA Looking to Take Computing to the Next Level
“NVIDIA released a new set of GPUs that not only boast a crazy amount of speed, but come with the promise of helping take on a larger set of tasks by delivering a lot more usable horsepower.”

NVIDIA Releases 240-Core Graphics Processor
“The Tesla 10 series processor is Nvidia's latest offering for high-performance computing.”

Nvidia and Stanford Finalizing Folding@Home Client for GeForce GPUs
“During Nvidia Editor's Day, we learned that Nvidia and the Folding@Home research group led by Vijay Pande are making final preparations to launch the first version of the Folding@Home client for Nvidia graphics processors.”

Apple Eyeing NVIDIA's CUDA Technology?
“Apple's Worldwide Developers Conference is expected to cover the parallel tracks of Mac and iPhone software development, but the company may have another aspect of parallelism to discuss.”

CUDA, Supercomputing for the Masses: Part 4
Dr. Dobb’s Portal
“One of the most important performance challenges facing CUDA (short for "Compute Unified Device Architecture") developers is the best use of local multiprocessor memory resources such as shared memory, constant memory, and registers.”
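The three multiprocessor resources the article names can be sketched in a single kernel; the names, sizes, and filter logic below are illustrative, not from the article:

```cuda
// Constant memory: small, cached, read-only data broadcast to all threads;
// the host fills it with cudaMemcpyToSymbol before launching.
__constant__ float coeff[4];

__global__ void filter(const float *in, float *out, int n)
{
    // Shared memory: fast per-block scratch space (block size of 256 assumed).
    __shared__ float window[256];

    // Registers: per-thread scalars like i and acc live here by default.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    float acc = 0.0f;

    if (i < n)
        window[threadIdx.x] = in[i];   // one global read per element
    __syncthreads();

    if (i < n) {
        for (int k = 0; k < 4; k++) {
            int j = threadIdx.x + k;
            // Taps beyond the block's window are treated as zero in this sketch.
            acc += coeff[k] * (j < blockDim.x ? window[j] : 0.0f);
        }
        out[i] = acc;
    }
}
```

Choosing among these is the balancing act the series addresses: shared memory and registers are scarce per multiprocessor, so over-using either reduces how many thread blocks can run concurrently.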